
Temporal, Geospatial & Multivariate Data

COMP7507
Visualization & Visual Analytics
Types of Data
• One-dimensional — linear data
– e.g., age distribution
– includes sequential data such as text and program source code
• Two-dimensional: planar or map data
– e.g., geographical maps, floorplans, newspaper layouts
• Three-dimensional: real-world objects
– e.g., medical scans, architectural designs
• Temporal
– e.g. timelines, weather

Types of Data
• Multidimensional or Multivariate
– e.g., financial data, customer behaviour
• Tree (hierarchical data)
– e.g., file structure, evolution
• Network (relational data)
– e.g., social network, air traffic
• The above classification is not mutually exclusive
– E.g., how about air traffic data?
– multivariate, geographical, network, and temporal
Shneiderman, B., "The eyes have it: a task by data type taxonomy for information
visualizations," Proc. IEEE Symposium on Visual Languages, 1996, pp. 336-343.
The Iris Sample Data Set
• Created by R.A. Fisher
• Possibly the best known data set in the pattern recognition community
• 3 classes (types of iris)
• 50 objects in each class
• 5 attributes
– sepal length & width (cm)
– petal length & width (cm)
– class (Iris Setosa, Iris Versicolour, Iris Virginica)
[Figure: photos of Iris Setosa, Iris Versicolour and Iris Virginica; source: Wikipedia]
Some Basic Plots
Bar Charts / Histograms
• To show distribution of values of a single variable
• Values are divided into bins
• A bar plot is used so that the height of each bar
indicates the number of objects in each bin
• Shape of histogram depends on the number of bins
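To illustrate how the bin count changes the shape, the sketch below bins the same values with 10 and with 20 equal-width bins. The sample values are made up for the example (they are not the Iris measurements), and `bin_counts` is a hypothetical helper, not part of any library.

```python
def bin_counts(values, num_bins, lo, hi):
    """Count how many values fall into each of num_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp so that v == hi lands in the last bin instead of overflowing.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts

# Made-up sample values in [0, 2.5]; NOT the real Iris petal widths.
sample = [0.2, 0.2, 0.3, 0.4, 1.0, 1.3, 1.3, 1.5, 1.8, 2.0, 2.3, 2.5]

coarse = bin_counts(sample, 10, 0.0, 2.5)
fine = bin_counts(sample, 20, 0.0, 2.5)
print(coarse)
print(fine)
```

Both lists describe the same data; only the bin width differs, which is why the two histograms can look quite different.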
[Figure: histograms of the Iris petal width data, drawn with 10 bins and with 20 bins]
Box Plots
• Conventional histograms / distribution plots might
not be space-efficient.

https://machinelearningmastery.com/data-visualization-in-r/

Box Plots
• An efficient way to show quantitative distribution
of 1D data
[Figure: annotated box plot along the value axis]
– outliers (shown individually)
– upper whisker: Q3 + 1.5 * IQR (or the highest datum smaller than this), where IQR = Q3 – Q1
– Q3: 75th percentile
– Q2: 50th percentile
– Q1: 25th percentile (25% of the data samples have a value below Q1, marked x in the figure)
– lower whisker: Q1 – 1.5 * IQR (or the lowest datum greater than this)
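The quantities drawn in a box plot can be computed as in the following minimal sketch. It assumes the 1.5 * IQR rule with IQR = Q3 – Q1; the input values are made up, `box_plot_summary` is a hypothetical helper, and the quartile method ("inclusive") is one of several conventions, so other tools may report slightly different quartiles.

```python
import statistics

def box_plot_summary(values):
    """Return quartiles, whisker ends and outliers using the 1.5 * IQR rule."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three cut points Q1, Q2, Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1, "q2": q2, "q3": q3,
        # Whiskers stop at the most extreme data points still inside the fences.
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Made-up values with one obvious outlier.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```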
Box Plots
[Figure: box plots with Q1, Q2 and Q3 marked for a right-skewed, a left-skewed and a normal (symmetric) distribution]
https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51
Box Plots
With a box plot, outliers can be highlighted and further examined.
[Figure: box plots of sepal length, sepal width, petal length and petal width; vertical axis: values (cm)]
2D Bar Charts
• To show the joint distribution of the values of
two variables

[Figure: 3D bar chart of frequency against petal length and petal width]
• The 3D effect is not good at showing exact values, but the correlation can be seen clearly
Line Graphs
• Points connected by lines to show how
something changes in value (usually over time)
[Figure: line graph of GDP growth (annual %)]
Issues:
• occlusion when there are too many lines
• deciding when to use piecewise or smooth lines
Scatter Plots
• A plot of points showing the relationship
between two variables of a set of data
• Point position determined by attribute values
• Additional attributes can be marked by size, shape, or color for each item
[Figure: scatter plot of petal width against petal length, with separate markers for Iris Setosa, Iris Versicolour and Iris Virginica]
Scatter Plots
[Figure: scatter plot of petal width against petal length]
Scatter Plots
• 2D scatter plots are commonly used, and there
are also 3D scatter plots
• Can compare two attributes at a time only. What
if we have a lot of pairwise comparisons to show?
– have an array (a matrix) of scatter plots

Scatter Plot Matrices
[Figure: scatter plot matrix of sepal length, sepal width, petal length and petal width]
• Diagonal plots show the distribution of the 1D data
• Can also make use of the diagonal space for other kinds of 1D visualization, e.g., a histogram
• Matrix plots can be used not only for scatter plots, but for anything that can be shown as a bivariate plot
Contour Plots
• To show continuous attributes measured on a
spatial grid
• Partition the space into regions of similar values;
boundaries of regions are contour lines called
iso-value lines, or isolines.

[Math, NYU]
Contour Plots
• Commonly used in
scientific visualization
• Examples: height fields,
temperature, rainfall, etc.

[3DFieldPro]

[OriginLab]
Temporal Data
Time-Series Data
• Set of values that change over time
• Examples:
– Finance (stock prices, exchange rates)
– Science (temperatures, pollution levels, electric
potentials)
– Public policy (crime rates, public health)
• Common requirements:
– Able to compare many time series simultaneously
– Able to use different visualizations in combination

Index Charts
• Interactive line chart showing % change based
on a selected index point
• Useful for showing relative changes
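The re-basing behind an index chart can be sketched as follows. The prices are made up, and `index_series` is a hypothetical helper; an interactive chart would call something like it again whenever the user picks a new index point.

```python
def index_series(prices, index_point):
    """Percentage change of every value relative to the value at index_point."""
    base = prices[index_point]
    return [(p - base) / base * 100.0 for p in prices]

prices = [50.0, 55.0, 60.0, 45.0]  # made-up prices, one per day

print(index_series(prices, 0))  # changes relative to day 0
print(index_series(prices, 2))  # same data, re-indexed to day 2
```

The value at the index point is always 0%, so several series with very different absolute prices can share one axis.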

[Figure: percentage change of selected stock prices according to the day of purchase] [Heer et al., 2010]
Stacked Graphs
[Figure: two stacked graphs, one with a centered baseline and one with a zero-based baseline]
Total counts of unemployed US workers per industry, 2000-2010. [Heer et al., 2010]
Stacked Graphs
• Stack area charts on top of each other
• Useful for showing summation of time-series
values (aggregation)
• Limitation:
– negative numbers not supported
– difficult to interpret trends accurately
– meaningless for some kinds of data
(e.g., temperatures)
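A stacked graph is built from running sums: each layer sits on top of the previous one. The sketch below computes the lower and upper boundary of every layer for a zero baseline and, as an assumption about how a centered layout can be obtained, for a baseline placed at minus half of the total. The data and the helper name `stack_layers` are hypothetical.

```python
def stack_layers(series, centered=False):
    """Return a (lower, upper) pair of boundary lists for each stacked layer."""
    num_points = len(series[0])
    totals = [sum(s[t] for s in series) for t in range(num_points)]
    # Zero baseline starts at 0; the centered variant starts at -total/2.
    lower = [-tot / 2.0 if centered else 0.0 for tot in totals]
    layers = []
    for s in series:
        upper = [lo + v for lo, v in zip(lower, s)]
        layers.append((lower, upper))
        lower = upper  # the next layer starts where this one ends
    return layers

data = [[1, 2], [3, 4]]  # two made-up series with two time steps each
zero_based = stack_layers(data)
centered = stack_layers(data, centered=True)
print(zero_based[-1][1])  # top outline equals the per-step totals
print(centered[0][0], centered[-1][1])
```

Because every layer is added on top of the others, a negative value would push a layer below the one underneath it, which is why negative numbers are not supported.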

Horizon Graphs

US unemployment rate, 2000-2010.


[Heer et al., 2010]
Positive values: above average unemployment
Negative values: below average unemployment

[ http://homes.cs.washington.edu/~jheer//files/zoo/ex/time/horizon.html ]
Horizon Graphs
• Divide the area plot into horizontal bands and
layer them over each other.
• Useful for increasing the data density (i.e., saving
space) without sacrificing resolution.
• Limitation: not intuitive and takes time to learn
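The banding step can be sketched as below. It assumes values are already expressed relative to a reference (e.g., the average), that negative values are mirrored and told apart by sign (typically drawn in another color), and that `band_height` and `num_bands` are chosen by the designer. `horizon_bands` is a hypothetical helper.

```python
def horizon_bands(values, band_height, num_bands):
    """Split each value into per-band heights, each in [0, band_height]."""
    bands = []
    for i in range(num_bands):
        floor = i * band_height
        # Portion of |v| that falls inside band i.
        bands.append([min(max(abs(v) - floor, 0.0), band_height) for v in values])
    # Sign tells the renderer which color family to use for the value.
    signs = [1 if v >= 0 else -1 for v in values]
    return bands, signs

bands, signs = horizon_bands([0.5, 2.5, -1.5], band_height=1.0, num_bands=3)
for band in bands:
    print(band)
print(signs)
```

All bands are then drawn in the same strip of height `band_height`, with darker shades for higher bands, so the chart needs only a fraction of the original height.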

Spiral Graphs
• Use a spirally shaped time axis
• Good for showing or identifying periodic structure of
data
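One possible way to place a sample on a spiral time axis is sketched below. It assumes one full turn per cycle and a radius that grows linearly with time; `spiral_position`, `start_radius` and `ring_gap` are hypothetical names chosen for the example.

```python
import math

def spiral_position(t, period, start_radius=1.0, ring_gap=1.0):
    """Map time t to (x, y) on a spiral that completes one turn every `period`."""
    turns = t / period
    angle = 2.0 * math.pi * turns
    radius = start_radius + ring_gap * turns
    return radius * math.cos(angle), radius * math.sin(angle)

# Samples exactly one period apart share the same angle, one ring further out.
print(spiral_position(0, 28))
print(spiral_position(28, 28))
```

When `period` matches the true cycle length of the data, recurring peaks line up along the same direction; with a slightly wrong period they drift around the spiral instead.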
[Figure: number of influenza cases over a period of three years, shown as a time series and as spiral graphs with 27 days per cycle and 28 days per cycle]
[Aigner et al., "Visual Methods for Analyzing Time-Oriented Data", IEEE TVCG, 2008.]
Multivariate Data
Chernoff Faces
• Relate data to facial features, which we find easy to differentiate
• Each feature, e.g., the mouth, encodes a data dimension by its
shape, size, placement and orientation
[Figure: Chernoff faces with 10 facial features, each corresponding to a parameter in [0,1]; examples with all parameters 0, all 0.5, all 1, and random parameters]
[ http://kspark.kaist.ac.kr/Human%20Engineering.files/Chernoff/Chernoff%20Faces.htm ]

Chernoff Faces
• Represent only trends but not actual values
• Drawback: Affected by our perceived importance of a facial feature
Heat Maps
• Encode values stored in table entries
as colors
• Rows and columns can be reordered to
better expose features.

[M. Ward]

[Figure: heat map from DNA microarray data showing genes expressed differently for two types of leukemia; column: patient, row: gene]
[Warwick, http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/r/heatmap/ ]

Interactive examples: http://amp.pharm.mssm.edu/clustergrammer/


Parallel Coordinates
• How to present all n axes of the n dimensions on
a 2D plane?
• Use parallel axes instead of orthogonal axes

[ http://mbostock.github.io/d3/talk/20111116/iris-parallel.html ]
• Each attribute value of a data item corresponds to a point on a
coordinate axis, and the data item is represented as a polyline
connecting these points
• A distinct class of objects can sometimes be seen as a group of
lines on some axes
• Ordering of axes is important for seeing patterns
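The mapping from data items to polylines can be sketched as below. It assumes each axis is scaled independently to [0, 1] by its own minimum and maximum, and that axes are evenly spaced; `polyline_vertices` and the sample rows are hypothetical.

```python
def polyline_vertices(rows, axis_spacing=1.0):
    """Map each data row to a polyline with one (x, y) vertex per attribute axis."""
    num_axes = len(rows[0])
    lows = [min(r[a] for r in rows) for a in range(num_axes)]
    highs = [max(r[a] for r in rows) for a in range(num_axes)]
    polylines = []
    for r in rows:
        line = []
        for a in range(num_axes):
            span = highs[a] - lows[a]
            # Normalize to [0, 1]; a constant attribute is drawn mid-axis.
            y = (r[a] - lows[a]) / span if span else 0.5
            line.append((a * axis_spacing, y))
        polylines.append(line)
    return polylines

rows = [[1, 10], [3, 20]]  # two made-up items with two attributes each
lines = polyline_vertices(rows)
print(lines)
```

Changing the order of the columns in `rows` changes which axes are adjacent, and therefore which relationships are easy to see.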
Parallel Coordinates

[Figure: parallel coordinate plot showing the distribution (i.e., centers and extents) of clusters] [Ward 2010]
Parallel Coordinates
• Parallel correlation
[Figure: Cartesian point plots and the corresponding PC plots for X1 & X2 proportional and for X1 & X2 inversely proportional]
[Wong and Bergeron, “30 Years of Multidimensional Multivariate Visualization,”
Scientific Visualization: Overviews, Methodologies & Techniques, 1997.]
Parallel Coordinates
• Parallel correlation

[Figure: parallel coordinate plot of six-dimensional data illustrating correlations of r (correlation coefficient) = 1, .8, .2, 0, -.2, -.8 and -1.]
[Wegman, “Hyperdimensional Data Analysis Using Parallel Coordinates”,
Journal of the American Statistical Association, 1990.]
Dimension Reduction
• To remove some of the dimensions from the display to
avoid clutter
– Examples: Principal Component Analysis (PCA), Multidimensional Scaling
(MDS), Self-Organizing Maps (SOM)

• Issue: the resulting dimensions are not the original ones, so they are not
intuitive to users
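To make the "not the original dimensions" issue concrete, the sketch below finds the first principal component of a small made-up data set. Power iteration is used here only to keep the example free of external libraries; it is an assumption of this sketch, not necessarily how a PCA package computes components, and `first_principal_component` is a hypothetical helper.

```python
def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    direction = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        product = [sum(cov[i][j] * direction[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in product) ** 0.5
        if norm == 0.0:
            break  # degenerate case: no variance along the current direction
        direction = [x / norm for x in product]
    # Each score is a weighted mix of ALL original attributes.
    scores = [sum(c[j] * direction[j] for j in range(d)) for c in centered]
    return direction, scores

# Made-up 2D data lying exactly on the line y = 2x.
direction, scores = first_principal_component([[1, 2], [2, 4], [3, 6], [4, 8]])
print(direction)
print(scores)
```

The new axis is a weighted combination of both original attributes, so a position along it has no direct reading in the original units.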
[Figure: PCA example, http://commons.wikimedia.org/wiki/File:GaussianScatterPCA.png#mediaviewer/File:GaussianScatterPCA.png]
[Figure: MDS example, http://commons.wikimedia.org/wiki/File:RecentVotes.svg#mediaviewer/File:RecentVotes.svg]
Dimension Ordering
• Crucial for the effectiveness of many visualization
techniques
• Relationships among adjacent dimensions are easier to
detect than relationships among those positioned far
apart, e.g., Parallel Coordinates, Heat Maps
• Used for attribute mapping to highlight important
dimensions, e.g., Chernoff faces
• An NP-complete problem equivalent to the Travelling
Salesman Problem (TSP)
• Compute the ordering with an approximation algorithm, or order
manually (interaction needed)
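One simple approximation is a greedy nearest-neighbour walk, sketched below. It assumes a precomputed pairwise similarity matrix between dimensions (for instance absolute correlation); the matrix values and the helper `greedy_dimension_order` are made up for illustration, and the result is a heuristic ordering, not an optimal one.

```python
def greedy_dimension_order(similarity, start=0):
    """Order dimensions so each next one is the most similar unvisited one."""
    remaining = set(range(len(similarity)))
    remaining.remove(start)
    order = [start]
    while remaining:
        current = order[-1]
        # sorted() makes tie-breaking deterministic (lowest index wins).
        nxt = max(sorted(remaining), key=lambda d: similarity[current][d])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical symmetric similarity matrix for four dimensions.
similarity = [
    [1.0, 0.1, 0.9, 0.2],
    [0.1, 1.0, 0.3, 0.8],
    [0.9, 0.3, 1.0, 0.4],
    [0.2, 0.8, 0.4, 1.0],
]
order = greedy_dimension_order(similarity)
print(order)
```

The greedy walk runs in quadratic time, whereas trying every ordering grows factorially with the number of dimensions.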

Geospatial Data
Geospatial Data
• Data that refers to a specific location in the world.
– e.g., population, health data, traffic, etc.
• Visualization techniques are used intensively in
geographic information systems (GIS) and
cartography.
• Issues:
– Geographical aggregation
• Recall the London Cholera Case
– Map projection

Map Projections
• A mapping from a position on Earth (spherical
surface) to a position on screen (a flat plane)
• From longitude+latitude pair (λ, φ) to screen
coordinates (x, y)

All map projections must have distortions!
[Figure: meridians and circles of latitude; https://vvvv.org/blog/polar-spherical-and-geographic-coordinates]
[An Album of Map Projections, U.S. Geological Survey Professional Paper 1453]
Map Projections
• Projection methods differ by spatial properties
that they preserve
– Conformal (preserves local angle and thus shape; not
area-preserving)
– Equal area (preserves area; shape can change)
– Equidistant (preserves distance from a specific point
or line)
– Others: Gnomonic (great circles as straight lines),
Azimuthal/Retroazimuthal (preserves direction
from/to a point)
Map projections: a video lecture
https://www.youtube.com/watch?v=v5fSBQRbPR0
Cylindrical Projection
• Each point on the sphere surface is projected
outward on a cylinder that is put around the sphere.
• No distortion around the equator where the cylinder
touches the globe, but severely distorted at the
poles
• Two common cylindrical
map projections:
– Equirectangular projection
– Lambert cylindrical projection

[Wikipedia]
Equirectangular Projection

[Wikipedia]

Mapping: x = λ, y = φ
• Cylindrical Projection
• Meridians are mapped to equally spaced vertical straight lines
• Circles of latitude are mapped to equally spaced horizontal
straight lines
Equirectangular Projection

[Wikipedia]

Mapping: x = λ, y = φ

• Neither conformal nor equal area, i.e., much distortion
• Often used in thematic mapping, e.g., choropleth maps
Lambert Cylindrical Projection
• Cylindrical Projection
• Area preserving
• Undistorted along equator, but highly distorted near
the poles.
Mapping: x = λ, y = sin φ

[Wikipedia]
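The two cylindrical mappings can be written down directly, as in the sketch below. It assumes longitude λ and latitude φ are given in degrees and converted to radians, and that the result is in unit-sphere coordinates that still need scaling to screen pixels; the function names are hypothetical.

```python
import math

def equirectangular(lon_deg, lat_deg):
    """Equirectangular projection: x = lambda, y = phi (radians)."""
    return math.radians(lon_deg), math.radians(lat_deg)

def lambert_cylindrical(lon_deg, lat_deg):
    """Lambert cylindrical projection: x = lambda, y = sin(phi)."""
    return math.radians(lon_deg), math.sin(math.radians(lat_deg))

for lat in (0, 30, 60, 90):
    print(lat, equirectangular(0, lat)[1], lambert_cylindrical(0, lat)[1])
```

On a unit sphere, the area of the band between two latitudes is proportional to the difference of their sines, which is exactly the height that band gets under y = sin φ. This is why the Lambert mapping preserves area while compressing the map heavily near the poles.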
Choropleth Maps
• For showing data collected or aggregated by
geographical areas
• In Greek: choro = area, pleth = value
• Use color to encode values for a region
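A minimal sketch of the color encoding is shown below, assuming a linear ramp between two RGB colors. The default colors, the value range and the helper name `value_to_color` are all hypothetical choices for the example; real maps often use classed (binned) color schemes instead.

```python
def value_to_color(value, vmin, vmax, low_rgb=(255, 255, 229), high_rgb=(0, 69, 41)):
    """Linearly interpolate an RGB color for value within [vmin, vmax]."""
    t = (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)  # clamp values outside the range
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low_rgb, high_rgb))

# Made-up values per region.
regions = {"A": 0.0, "B": 5.0, "C": 10.0}
colors = {name: value_to_color(v, 0.0, 10.0) for name, v in regions.items()}
print(colors)
```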

Choropleth Maps
Obesity in the US, 2002

[Heer et al., 2010]

• Problem: tends to highlight patterns in large
areas, while highly populated but small areas
might be of more interest
Cartograms
• Regions are resized so that the area directly
encodes a data variable
• Cartograms differ by the properties of:
– Shape preservation
– Exact area correspondence
– Topology preservation (i.e., region connectivity)
• An optimization problem to find a good
compromise between the above conflicting
criteria

Circular Cartograms
a.k.a. Dorling cartograms [Heer et al., 2010]
[Figure: Obesity in the US, 2002. Color: % of obese people. Circle size: absolute number of obese people]
Data is represented faithfully by area, but shape & topology are not retained
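For the circle area to encode the data value, the radius has to grow with the square root of the value. A minimal sketch, assuming the largest value is drawn with a chosen maximum radius (`circle_radius`, `max_value` and `max_radius` are hypothetical names):

```python
import math

def circle_radius(value, max_value, max_radius):
    """Radius whose circle AREA is proportional to value."""
    return max_radius * math.sqrt(value / max_value)

print(circle_radius(100, 100, 10.0))  # largest value gets the full radius
print(circle_radius(25, 100, 10.0))   # a quarter of the value gives half the radius
```

Scaling the radius linearly with the value instead would exaggerate large values, because area grows with the square of the radius.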
Noncontinuous Cartograms

Use shaded subregion to represent data

[Ward et al., 2010]


Example: http://bl.ocks.org/mbostock/4055908
• Exact area, preserves shape
• Not preserving topology, map perception still ok
• Size limited by the maximum scaling w.r.t. map region
Noncontiguous Cartograms

[Ward et al., 2010]

• Exact area, preserves shape as much as possible
• Not preserving topology
• Map perception is more difficult
Continuous Cartograms

[Ward et al., 2010]

• Preserves topology
• Preserves area & shape only as much as possible
• Takes a long time to compute the visualization, so interactive data
change is not possible
Example: http://prag.ma/code/d3-cartogram/
Graduated Symbol Maps
• Data is shown by placing symbols over a map
• More dimensions can be visualized by encoding with
the attributes of the symbol

[Heer et al., 2010]

Obesity in the US, 2002

Exploiting 3D

Microsoft GeoFlow

https://dylanbabbs.com/writing/map-data-viz-design

Book: Visual Analytics for Data Scientists
Coordinate Transformations
• Select a spatial reference frame which best
facilitates data comparison, pattern finding, etc.
– Spherical coordinates? Cartesian coordinates?

Example from: Visual Analytics for Data Scientists
[Figure: the same data shown w.r.t. the football field and w.r.t. the team center]
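A change of reference frame like the one in the example can be as simple as subtracting a moving origin. The sketch below assumes that "team center" means the mean position of the players at one moment; that interpretation, the positions and the helper `relative_to_center` are assumptions made for illustration.

```python
def relative_to_center(positions):
    """Re-express (x, y) positions relative to their mean position."""
    cx = sum(x for x, _ in positions) / len(positions)
    cy = sum(y for _, y in positions) / len(positions)
    return [(x - cx, y - cy) for x, y in positions]

# Made-up player positions in field coordinates at one time step.
field_positions = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
print(relative_to_center(field_positions))
```

In the new frame the overall movement of the team across the field cancels out, so only the formation remains visible.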


Volume Data
3D Data as Point Set
• Each data sample contains 3 variables
• E.g., we want to show the relationship between
petal length, petal width & sepal length for the
Iris data
Each data point (x, y, z) represents a sample in the data set
A 3D scatter plot will do
[Figure: 3D scatter plot with axes petal length (x), sepal length (y) and petal width (z)]
How about a volumetric data set?
Scalar Function Visualization
• Univariate
– a plot v = f(x)

• Bivariate
– a surface v = f(x,y)

[Figure: a 2D surface and its contours (isolines)]
Volume Data
Volume data is essentially a trivariate scalar function
A scalar value is defined at every (x, y, z) in the volume
domain: v = f(x, y, z)
If we have a discrete sampling of the 3D domain, we
obtain a voxel (volume element) representation.
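A voxel representation can be sketched as a nested list filled by sampling a scalar function on a regular grid. The function used here (distance from the grid center) and the helper names are hypothetical; any f(x, y, z) could be substituted.

```python
def sample_volume(f, size):
    """Sample f at integer grid points; volume[z][y][x] holds one voxel value."""
    return [[[f(x, y, z) for x in range(size)]
             for y in range(size)]
            for z in range(size)]

def distance_from_center(x, y, z, c=2):
    """Example scalar function: Euclidean distance to the point (c, c, c)."""
    return ((x - c) ** 2 + (y - c) ** 2 + (z - c) ** 2) ** 0.5

volume = sample_volume(distance_from_center, 5)
print(volume[2][2][2])  # value at the center voxel
print(volume[2][2][4])  # value two voxels away along x
```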

https://minecraft.net
https://en.wikipedia.org/wiki/Volume_rendering
Isosurface Rendering

Visible Human Project

https://en.wikipedia.org/wiki/Voxel-Man
https://youtu.be/dPPjUtiAGYs

Slicing

Direct Volume Rendering


https://youtu.be/ojCNUoVfzh4
Slicing with Cut Planes
• Allow probing the 3D volume to see a subset (2D)
of data
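Axis-aligned slicing amounts to fixing one index of the voxel array, as in the sketch below. It assumes the volume is stored as nested lists indexed `volume[z][y][x]`; that layout and the helper `slice_volume` are assumptions of the example. An arbitrary cut plane would additionally need interpolation between voxels, which is not shown.

```python
def slice_volume(volume, axis, index):
    """Return the 2D slice of volume[z][y][x] at `index` along axis 'x', 'y' or 'z'."""
    if axis == "z":
        return [row[:] for row in volume[index]]
    if axis == "y":
        return [plane[index][:] for plane in volume]
    if axis == "x":
        return [[row[index] for row in plane] for plane in volume]
    raise ValueError("axis must be 'x', 'y' or 'z'")

# Tiny made-up volume whose values encode their own coordinates.
volume = [[[x + 10 * y + 100 * z for x in range(3)]
           for y in range(3)]
          for z in range(3)]
print(slice_volume(volume, "z", 1))
print(slice_volume(volume, "x", 2))
```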
Axis-aligned slicing Arbitrary cut plane

https://www.uni-muenster.de/... http://www.asawicki.info/...

Isosurface Rendering
Extract an isosurface from the volume data and use
standard surface rendering techniques to visualize it
The Marching Cubes Algorithm
Lorensen, W. E.; Cline, Harvey E. (1987). "Marching cubes: A high resolution 3d surface construction
algorithm". ACM Computer Graphics. 21 (4): 163–169.

Basic idea: identify whether an isosurface passes through a voxel
The 15 cube configurations (symmetry considered)
https://en.wikipedia.org/wiki/Marching_cubes
Marching Cubes Algorithm (in 2D)

https://en.wikipedia.org/wiki/Marching_squares
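The classification step of the 2D version can be sketched as follows: every grid cell gets a 4-bit case index from its corners, and the index tells whether and how the isoline crosses the cell. The bit order used here is an assumption (conventions differ between implementations), `marching_squares_cases` is a hypothetical helper, and the line-segment lookup and interpolation steps are left out.

```python
def marching_squares_cases(grid, iso):
    """Return a 4-bit case index (0..15) for every cell of a 2D scalar grid."""
    rows, cols = len(grid), len(grid[0])
    cases = []
    for r in range(rows - 1):
        row_cases = []
        for c in range(cols - 1):
            # One bit per corner, set when the corner value is at or above iso.
            top_left = grid[r][c] >= iso
            top_right = grid[r][c + 1] >= iso
            bottom_right = grid[r + 1][c + 1] >= iso
            bottom_left = grid[r + 1][c] >= iso
            row_cases.append(8 * top_left + 4 * top_right
                             + 2 * bottom_right + 1 * bottom_left)
        cases.append(row_cases)
    return cases

# Made-up 3x3 grid with a single peak in the middle.
grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
cases = marching_squares_cases(grid, 0.5)
print(cases)
```

Cases 0 and 15 mean all four corners are on the same side of the isovalue, so the isoline does not pass through that cell; every other case contributes at least one line segment.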
Isosurface Rendering

https://www.eriksmistad.no/...

Marching Cubes in Action

https://youtu.be/LfttaAepYJ8
Direct Volume Rendering
Assign colors & transparencies based on voxel values
via a transfer function
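A minimal sketch of the idea follows. The transfer function below is entirely made up (a blue-to-red ramp with opacity growing with the value). The slide only mentions the transfer function; the front-to-back alpha compositing along a viewing ray is a standard technique added here as an assumption to show how the assigned colors and opacities could be combined into one pixel.

```python
def transfer_function(value):
    """Hypothetical transfer function: voxel value in [0, 1] -> (r, g, b, alpha)."""
    v = min(max(value, 0.0), 1.0)
    return (v, 0.2, 1.0 - v, 0.5 * v)

def composite_ray(samples):
    """Front-to-back alpha compositing of the voxel values along one ray."""
    color = [0.0, 0.0, 0.0]
    alpha = 0.0
    for value in samples:
        r, g, b, a = transfer_function(value)
        weight = (1.0 - alpha) * a  # contribution not yet hidden by earlier samples
        color = [c + weight * s for c, s in zip(color, (r, g, b))]
        alpha += weight
        if alpha >= 0.99:
            break  # early termination: the ray is effectively opaque
    return color, alpha

print(composite_ray([0.0, 0.0]))  # fully transparent voxels contribute nothing
print(composite_ray([1.0]))
print(composite_ray([0.2, 0.6, 1.0]))
```

Editing the transfer function changes which value ranges are visible without touching the volume data itself.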

http://cg.inf.h-bonn-rhein-sieg.de/?page_id=2700

https://youtu.be/gq8oqtnKFH4
Visualization Gallery
• Take a look at:
– Tableau Public
(https://public.tableau.com/s/gallery)
– D3.js
(http://d3js.org/)
– Google Charts
(https://developers.google.com/chart/interactive/docs/gallery)

• Try visualizing the Iris data set with the different techniques
taught in this class using the above tools.
• What can/cannot be done by these tools?
Reference
• Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky. 2010. A tour through the visualization
zoo. Commun. ACM 53, 6 (June 2010), 59-67.
(http://hci.stanford.edu/jheer/files/zoo/)
• Matthew Ward, Georges Grinstein, and Daniel Keim. 2010. Interactive Data Visualization:
Foundations, Techniques, and Applications. [Chapters 6 & 7]
