COMP7507
Visualization & Visual Analytics
Types of Data
• One-dimensional: linear data
– e.g., age distribution
– includes sequential data such as text and program source code
• Two-dimensional: planar or map data
– e.g., geographical maps, floorplans, newspaper layouts
• Three-dimensional: real-world objects
– e.g., medical scans, architectural design
• Temporal
– e.g., timelines, weather
• Multidimensional or Multivariate
– e.g., financial data, customer behaviour
• Tree (hierarchical data)
– e.g., file structure, evolution
• Network (relational data)
– e.g., social network, air traffic
• The above categories are not mutually exclusive
– e.g., what about air traffic data?
– it is multivariate, geographical, network, and temporal
Shneiderman, B., "The eyes have it: a task by data type taxonomy for information
visualizations," Proc. IEEE Symposium on Visual Languages, 1996, pp. 336-343.
The Iris Sample Data Set
• Created by R.A. Fisher
• Possibly the best known data set in
the pattern recognition community
• 3 classes (types of iris)
• 50 objects in each class
• 5 attributes
– sepal length & width (cm)
– petal length & width (cm)
– class (Iris Setosa, Iris Versicolour, Iris Virginica)
[Figure: photographs of Iris Setosa, Iris Versicolour, and Iris Virginica] [wikipedia]
Some Basic Plots
Bar Charts / Histograms
• To show distribution of values of a single variable
• Values are divided into bins
• A bar plot is used so that the height of each bar
indicates the number of objects in each bin
• Shape of histogram depends on the number of bins
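The dependence on the number of bins can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `histogram_counts` is hypothetical, and the sample values are made up for the example (they are not the Iris measurements).

```python
def histogram_counts(values, num_bins, lo, hi):
    """Count how many values fall into each of num_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp so that v == hi lands in the last bin instead of overflowing.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts

# Made-up sample values, NOT the real Iris petal widths.
sample = [0.2, 0.2, 0.3, 0.4, 1.0, 1.3, 1.4, 1.5, 1.8, 2.1, 2.3, 2.5]

coarse = histogram_counts(sample, 5, 0.0, 2.5)   # few, wide bins
fine = histogram_counts(sample, 10, 0.0, 2.5)    # more, narrower bins
print(coarse)
print(fine)
```

Both lists count the same 12 values, but the finer binning exposes empty bins that the coarser one hides, which is why the two histograms can look quite different.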
[Figure: histograms of the Iris data (petal width) with 10 bins and with 20 bins]
Box Plots
• Conventional histograms / distribution plots might
not be space-efficient.
https://machinelearningmastery.com/data-visualization-in-r/
• An efficient way to show quantitative distribution
of 1D data
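The quantities a box plot draws can be computed as in the sketch below. It assumes the common convention that points farther than 1.5 times the interquartile range from the box are outliers; the slides do not state which rule is used, and the function name `box_plot_stats` and the sample values are made up for illustration.

```python
import statistics

def box_plot_stats(values, whisker=1.5):
    """Return the quartiles, whisker ends, and outliers used to draw a box plot."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: fences at 1.5 * IQR beyond the box.
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "q2": q2, "q3": q3,
            "whisker_low": min(inside), "whisker_high": max(inside),
            "outliers": outliers}

# Made-up sample with one obviously extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_plot_stats(sample)
print(stats)
```

Only five numbers and the outlier list are needed per variable, which is what makes the box plot compact compared with a full histogram.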
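[Figure: anatomy of a box plot along the value axis, with Q1, Q2, and Q3 marked and outliers shown individually; a second example shows a left-skewed distribution]
[Figure: box plots; axis label: Values (cm)]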
2D Bar Charts
• To show the joint distribution of the values of
two variables
[Figure: 2D bar chart of a joint distribution; vertical axis: Frequency]
Issues:
• occlusion when there are too many lines
• deciding when to use piecewise or smooth lines
Scatter Plots
• A plot of points showing the relationship
between two variables of a set of data
• Point position determined by attribute values
• Additional attributes can be marked by size, shape, or color for each item
[Figure: scatter plot of Petal Width against Petal Length]
• 2D scatter plots are commonly used, and there
are also 3D scatter plots
• Can compare only two attributes at a time. What if there are many pairwise comparisons to show?
– use an array (a matrix) of scatter plots
Scatter Plot Matrices
[Figure: scatter plot matrix of the Iris data; rows and columns are sepal length, sepal width, petal length, and petal width] [Math, NYU]
• [...] visualization, e.g., a histogram
• Matrix plot can be [...]
Contour Plots
• Commonly used in
scientific visualization
• Examples: height fields,
temperature, rainfall, etc.
[3DFieldPro]
[OriginLab]
Temporal Data
Time-Series Data
• Set of values that change over time
• Examples:
– Finance (stock prices, exchange rates)
– Science (temperatures, pollution levels, electric
potentials)
– Public policy (crime rates, public health)
• Common requirements:
– Able to compare many time series simultaneously
– Able to use different visualizations in combination
Index Charts
• Interactive line chart showing % change based
on a selected index point
• Useful for showing relative changes
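The rebasing behind an index chart is a one-line calculation, sketched below. The function name `index_chart`, the stock names, and the prices are all made up for illustration; an interactive chart would recompute this whenever the selected index point changes.

```python
def index_chart(series, index_point):
    """Percentage change of every value relative to the value at index_point."""
    base = series[index_point]
    return [(v - base) / base * 100.0 for v in series]

# Hypothetical price series for two stocks, one value per day.
prices = {
    "A": [50.0, 55.0, 60.0, 45.0],
    "B": [200.0, 180.0, 220.0, 240.0],
}

# Select day 1 as the index point (e.g., the day of purchase).
rebased = {name: index_chart(values, 1) for name, values in prices.items()}
for name, changes in rebased.items():
    print(name, [round(c, 1) for c in changes])
```

Every series passes through 0% at the index point, so stocks with very different absolute prices become directly comparable.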
[Figure: percentage change of selected stock prices relative to the day of purchase]
Stacked Graphs
[Figure: stacked graphs with a centered baseline and with a zero-based baseline]
Horizon Graphs
[ http://homes.cs.washington.edu/~jheer//files/zoo/ex/time/horizon.html ]
• Divide the area plot into horizontal bands and layer them on top of each other.
• Useful for increasing the data density (i.e., saving space) without sacrificing resolution.
• Limitation: Not intuitive and takes time to learn
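The band-splitting step can be sketched as follows. This is a minimal illustration, assuming the common variant in which negative values are mirrored and distinguished by hue; the function name `horizon_bands` and the sample values are hypothetical.

```python
def horizon_bands(values, num_bands, band_height):
    """Split each value into per-band heights in [0, band_height].

    Band k holds the part of |v| between k * band_height and
    (k + 1) * band_height. A renderer would draw the bands on top of each
    other, darker for higher k, using one hue per sign.
    """
    bands = [
        [min(max(abs(v) - k * band_height, 0.0), band_height) for v in values]
        for k in range(num_bands)
    ]
    signs = [1 if v >= 0 else -1 for v in values]
    return bands, signs

# Made-up series; with 3 bands of height 1.0 the chart needs only
# one third of the vertical space of the plain area plot.
values = [0.5, 1.5, 2.5, -1.2, 3.0]
bands, signs = horizon_bands(values, 3, 1.0)
for k, band in enumerate(bands):
    print(k, band)
```

Summing the bands at any position recovers the original magnitude, which is why no resolution is lost even though the chart is shorter.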
Spiral Graphs
• Use a spiral-shaped time axis
• Good for showing or identifying periodic structure of
data
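One way to lay time out on a spiral is an Archimedean spiral that completes one turn per period, sketched below. The function name `spiral_position`, its parameters, and the choice of 52 weeks per year are assumptions made for illustration, not the layout used in the cited figure.

```python
import math

def spiral_position(t, period, inner_radius=1.0, ring_spacing=1.0):
    """Map time t onto an Archimedean spiral that completes one turn per period."""
    turns = t / period
    angle = 2.0 * math.pi * turns
    radius = inner_radius + ring_spacing * turns
    return radius * math.cos(angle), radius * math.sin(angle)

# Three years of weekly observations, assuming 52 weeks per year.
points = [spiral_position(week, 52) for week in range(3 * 52)]
print(points[0], points[52], points[104])
```

Observations exactly one period apart share the same angle and sit on neighbouring rings, so a pattern that repeats every period lines up along a ray from the center.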
[Figure: number of influenza cases over a period of three years]
[Aigner et al., "Visual Methods for Analyzing Time-Oriented Data", IEEE TVCG, 2008.]
Multivariate Data
Chernoff Faces
• Relate data to facial features, which people find easy to differentiate
• Each feature (e.g., the mouth) encodes a data dimension through its shape, size, placement, and orientation
• Represent only trends, not actual values
• Drawback: affected by our perceived importance of a facial feature
Heat Maps
• Encode values stored in table entries
as colors
• Rows and columns can be reordered to
better expose features.
[M. Ward]
[Figure: heat map with one column per patient and one row per gene]
[Warwick, http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/r/heatmap/ ]
Parallel Coordinates
[ http://mbostock.github.io/d3/talk/20111116/iris-parallel.html ]
• Each attribute value of a data item corresponds to a point on a
coordinate axis, and the data item is represented as a polyline
connecting these points
• A distinct class of objects can sometimes be seen as a group of
lines on some axes
• Ordering of axes is important for seeing patterns
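The construction of one polyline can be sketched as below. The sketch assumes vertical, equally spaced axes and min-max scaling of each attribute to [0, 1]; the function name `polyline_points` and the data rows are made up for illustration (they are not taken from the Iris file).

```python
def polyline_points(item, mins, maxs, axis_spacing=1.0):
    """Return the (x, y) vertices of one data item's polyline.

    Axis i is the vertical line x = i * axis_spacing; y is the attribute
    value rescaled to [0, 1] using that attribute's min and max.
    """
    points = []
    for i, (v, lo, hi) in enumerate(zip(item, mins, maxs)):
        y = (v - lo) / (hi - lo) if hi > lo else 0.5
        points.append((i * axis_spacing, y))
    return points

# Made-up rows with four attributes each.
rows = [
    [5.0, 3.4, 1.5, 0.2],
    [6.4, 3.2, 4.5, 1.5],
    [6.9, 3.1, 5.4, 2.3],
]
mins = [min(col) for col in zip(*rows)]
maxs = [max(col) for col in zip(*rows)]
lines = [polyline_points(row, mins, maxs) for row in rows]
for line in lines:
    print(line)
```

Reordering the axes only permutes the columns before this step, which is why the ordering changes which pairs of attributes end up adjacent and therefore easy to compare.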
Parallel Coordinates
[Figure: parallel coordinates of X1 & X2 when proportional, and when inversely proportional] [Ward 2010]
Dimension Reduction
• Remove some of the dimensions from the display to avoid clutter
– Examples: Principal Component Analysis (PCA), Multidimensional Scaling (MDS), Self-Organizing Maps (SOM)
[http://commons.wikimedia.org/wiki/File:GaussianScatterPCA.png#mediaviewer/File:GaussianScatterPCA.png]
[http://commons.wikimedia.org/wiki/File:RecentVotes.svg#mediaviewer/File:RecentVotes.svg]
Dimension Ordering
• Crucial for the effectiveness of many visualization techniques
• Relationships among adjacent dimensions are easier to detect than those among dimensions positioned far apart, e.g., Parallel Coordinates, Heat Maps
• Also used in attribute mapping to highlight important dimensions, e.g., Chernoff faces
• An NP-complete problem equivalent to the Travelling Salesman Problem (TSP)
• Compute an ordering by approximation, or order manually (interaction needed)
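One possible approximation is a greedy nearest-neighbour heuristic, sketched below. It assumes that "similar" dimensions are those with a high absolute Pearson correlation, which is only one of several reasonable similarity measures; the function names and the data columns are made up, and the result is not guaranteed to be optimal.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def greedy_order(columns, start=0):
    """Repeatedly append the unused dimension most correlated
    (in absolute value) with the last dimension placed."""
    order = [start]
    remaining = set(range(len(columns))) - {start}
    while remaining:
        last = columns[order[-1]]
        nxt = max(sorted(remaining),
                  key=lambda j: abs(pearson(last, columns[j])))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Made-up data: one list per dimension.
columns = [
    [1, 2, 3, 4, 5],
    [3, 1, 4, 1, 5],
    [2, 4, 6, 8, 11],
    [3, 1, 4, 1, 6],
]
order = greedy_order(columns)
print(order)
```

The heuristic needs only a quadratic number of similarity evaluations, whereas trying every ordering grows factorially with the number of dimensions.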
Geospatial Data
• Data that refers to specific locations in the world.
– e.g., population, health data, traffic, etc.
• Visualization techniques are used intensively in geographic information systems (GIS) and cartography.
• Issues:
– Geographical aggregation
• Recall the London Cholera Case
– Map projection
Map Projections
• A mapping from a position on Earth (spherical surface) to a position on screen (a flat plane)
• From a longitude-latitude pair (λ, φ) to screen coordinates (x, y)
All map projections must have distortions!
https://vvvv.org/blog/polar-spherical-and-geographic-coordinates
[Wikipedia]
Equirectangular Projection
[Wikipedia]
Mapping: x = λ, y = φ
• Cylindrical projection
• Meridians are mapped to equally spaced vertical straight lines
• Circles of latitude are mapped to equally spaced horizontal straight lines
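In code, the equirectangular mapping is a linear rescaling of longitude and latitude to the screen rectangle. The sketch below assumes angles in degrees and a screen origin at the top-left corner with y growing downward; the function name `equirectangular` and the screen size are chosen for illustration.

```python
def equirectangular(lon_deg, lat_deg, width, height):
    """Map longitude/latitude in degrees to pixel coordinates (x right, y down)."""
    x = (lon_deg + 180.0) / 360.0 * width
    y = (90.0 - lat_deg) / 180.0 * height
    return x, y

# Hypothetical 720 x 360 pixel map: 2 pixels per degree in both directions.
width, height = 720, 360
center = equirectangular(0.0, 0.0, width, height)
meridians_x = [equirectangular(lon, 0.0, width, height)[0]
               for lon in range(-180, 181, 30)]
print(center)
print(meridians_x)
```

Because x depends only on longitude and y only on latitude, both linearly, meridians and circles of latitude come out as equally spaced straight lines.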
Choropleth Maps
• For showing data collected or aggregated by
geographical areas
• From Greek: choros = area, plethos = multitude (value)
• Use color to encode values for a region
[Figure: choropleth map of obesity in the US, 2002]
Circular Cartograms
a.k.a. Dorling cartograms [Heer et al., 2010]
Exploiting 3D
Microsoft GeoFlow
https://dylanbabbs.com/writing/map-data-viz-design
Book: Visual Analytics for Data Scientists
Coordinate Transformations
• Select a spatial reference frame which best
facilitates data comparison, pattern finding, etc.
– Spherical coordinates? Cartesian coordinates?
• Bivariate
– a surface v = f(x, y)
[Figure: a bivariate function shown as a 2D surface and as contours (isolines)]
Volume Data
Volume data is essentially a trivariate scalar function
A scalar value is defined at every (x, y, z) in the volume domain: v = f(x, y, z)
A discrete sampling of the 3D domain gives a voxel (volume element) representation.
https://minecraft.net
https://en.wikipedia.org/wiki/Volume_rendering
Isosurface Rendering
https://en.wikipedia.org/wiki/Voxel-Man
https://youtu.be/dPPjUtiAGYs
Slicing
https://www.uni-muenster.de/... http://www.asawicki.info/...
Isosurface Rendering
Extract an isosurface from the volume data and visualize it with standard surface rendering techniques
The Marching Cubes Algorithm
Lorensen, W. E.; Cline, Harvey E. (1987). "Marching cubes: A high resolution 3D surface construction algorithm". ACM Computer Graphics. 21 (4): 163–169.
The 15 cube configurations (symmetry considered)
https://en.wikipedia.org/wiki/Marching_cubes
Marching Cubes Algorithm (in 2D)
https://en.wikipedia.org/wiki/Marching_squares
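The core of the 2D version (marching squares) is to classify each grid cell by which of its four corners lie at or above the isovalue, and then to place contour vertices on the cell edges by linear interpolation. The sketch below shows only these two steps; the bit order of the case index is an arbitrary convention chosen here, and the function names and the grid are made up for illustration.

```python
def cell_case(grid, row, col, iso):
    """4-bit case index of the cell whose top-left sample is grid[row][col].

    Bit order (a convention chosen for this sketch):
    8 = top-left, 4 = top-right, 2 = bottom-right, 1 = bottom-left.
    """
    tl = grid[row][col] >= iso
    tr = grid[row][col + 1] >= iso
    br = grid[row + 1][col + 1] >= iso
    bl = grid[row + 1][col] >= iso
    return (tl << 3) | (tr << 2) | (br << 1) | int(bl)

def interpolate(p0, p1, v0, v1, iso):
    """Point on edge p0-p1 where the linearly interpolated value equals iso."""
    t = (iso - v0) / (v1 - v0)
    return (p0[0] + t * (p1[0] - p0[0]), p0[1] + t * (p1[1] - p0[1]))

# Made-up 3 x 3 scalar field with a single peak in the middle.
grid = [
    [0, 0, 0],
    [0, 2, 0],
    [0, 0, 0],
]
cases = [[cell_case(grid, r, c, 1) for c in range(2)] for r in range(2)]
crossing = interpolate((0, 1), (1, 1), grid[1][0], grid[1][1], 1)
print(cases)
print(crossing)
```

A full implementation would look up each of the 16 case indices in a table of line segments; the 3D algorithm does the same with 8 corners per cube and 256 cases, which symmetry reduces to the 15 configurations mentioned above.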
Isosurface Rendering
https://www.eriksmistad.no/...
Direct Volume Rendering
Assign color and transparency based on voxel value via a transfer function
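A transfer function is a lookup from voxel value to color and opacity. The sketch below pairs a made-up, piecewise-constant transfer function with front-to-back alpha compositing along a single ray; the compositing scheme is a common choice assumed here rather than something the slides specify, and all names and thresholds are hypothetical.

```python
def transfer_function(v):
    """Hypothetical transfer function: voxel value in [0, 1] -> (r, g, b, alpha)."""
    if v < 0.3:
        return (0.0, 0.0, 0.0, 0.0)   # treated as empty space
    if v < 0.7:
        return (1.0, 0.6, 0.4, 0.2)   # semi-transparent
    return (1.0, 1.0, 1.0, 0.9)       # nearly opaque

def composite_ray(samples):
    """Front-to-back alpha compositing of the voxel values met along one ray."""
    color = [0.0, 0.0, 0.0]
    alpha = 0.0
    for v in samples:
        r, g, b, a = transfer_function(v)
        weight = (1.0 - alpha) * a    # contribution left after earlier samples
        color[0] += weight * r
        color[1] += weight * g
        color[2] += weight * b
        alpha += weight
        if alpha >= 0.99:             # early ray termination
            break
    return tuple(color), alpha

# Made-up voxel values sampled along one viewing ray, front to back.
pixel_color, pixel_alpha = composite_ray([0.1, 0.5, 0.9, 0.9])
print(pixel_color, pixel_alpha)
```

Changing only the transfer function changes which value ranges are visible, without touching the volume data itself.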
http://cg.inf.h-bonn-rhein-sieg.de/?page_id=2700
https://youtu.be/gq8oqtnKFH4
Visualization Gallery
• Take a look at:
– Tableau Public
(https://public.tableau.com/s/gallery)
– D3.js
(http://d3js.org/)
– Google Charts
(https://developers.google.com/chart/interactive/docs/gallery)
• Try visualizing the Iris data set with the different techniques taught in this class using the above tools.
• What can and cannot be done with these tools?
Reference
• Jeffrey Heer, Michael Bostock, and Vadim
Ogievetsky. 2010. A tour through the visualization
zoo. Commun. ACM 53, 6 (June 2010), 59-67.
(http://hci.stanford.edu/jheer/files/zoo/ )
• Matthew Ward, Georges Grinstein, and Daniel Keim. 2010. Interactive Data Visualization: Foundations, Techniques, and Applications. [Chapters 6 & 7]