Professional Documents
Culture Documents
DATA GRAPHICS 2 –
PART B:
PRINCIPLES OF GOOD
DATA GRAPHICS
Davood Shojaei
ADMIN STUFF
GEOM90007
ADMIN STUFF
GEOM90007
QUESTIONS
GEOM90007
FEEDBACK/COMMNETS/ISSUES
shojaeid@unimelb.edu.au
GEOM90007
VISUALISATION EXAMPLES
GEOM90007
REVIEW: THREE SKILLS
https://maps.philipmallis.com/melbourne-tram-map/
Yarra Trams, 2014
GEOM90007
REVISION
Word Frequency
data 4009
one 1125
variables 1105
each 1103
graphics 1086
two 1067
visualisation 1039
plot 976
statistical 945
using 933
Document Total 576,541
GEOM90007
WORD CLOUD
Example: Word Clouds Which visual variables from the dataset are
typically visualised?
1. Size
What else could be visualised?
2. Position
3. Brightness
4. Colour (hue, saturation)
5. Orientation
6. Pattern (arrangement)
…
GEOM90007
REVIEW (LECTURE 1)
(Tufte, 2001)
GEOM90007
REVIEW (LECTURE 1)
(Tufte, 2001)
GEOM90007
OVERVIEW
A. Data-ink ratio
GEOM90007
A. Data-ink ratio
“Data graphics should draw the viewer’s attention to the sense and substance of the
data, not to something else” (Tufte, 2001)
Therefore the majority of ink (or pixels) used in a graphic should present data-
information.
Image: http://www.ecglibrary.com/ecghome.html
All ink (or pixels) represent data direct from the ECG.
Nothing can be erased without losing information!
GEOM90007
A. Data-ink example
GEOM90007
A. Data-ink example
Improvement
GEOM90007
A. Data-ink example
Nothing
Ratio = 0
GEOM90007
Maximise the data-ink ratio
Every bit of ink (or pixel) in a data graphic must be there for a reason.
Therefore maximise the data-ink ratio (within reason)
Erase non-data-ink
Such data can clutter the graphic (e.g., dense grid lines)
GEOM90007
Example of redundancy with numerical data
Look for redundant data-ink, which are representations that are repeated and
unnecessary
29.3
Any 5 of the following can be erased and the one remaining will still indicate height:
(1) Height of the left line, (2) Height of the right line, (3) Height of the shading,
(4) Position of the top line, (5) Position (not content) of the number, (6) the number itself
GEOM90007
Redesign – Bar chart
GEOM90007
Redesign – Bar chart
GEOM90007
Redesign – Box Plot
GEOM90007
A. Data-ink summary
GEOM90007
B. FOUR PRINCIPLES OF DATA GRAPHICS
1. Data density
2. Data integrity
3. Data correspondence
4. Data aesthetics
GEOM90007
B. Example dataset: Table
▪ A data graphic that requires a description of 2000 words is typically more dense than
one that requires 100 words (Duckham, 2014)
GEOM90007
B.1 Data density – Low example
The two-dimensional
distribution of galaxies Seldner et al. (1977)
Map area:
▪ 2D surface, 61 inches (393 cm2)
Data
▪ 1024 x 2222 rectangles (totalling 2,275,328), representing
three numbers:
o Two coordinates ~ position (x,y)
o One galaxy count ~ brightness (grey)
▪ Data density
o ~ 17,369 numbers per cm2
GEOM90007
Chart
Junk
GEOM90007
B.2 DATA INTEGRITY
1. Data scrubbing
Deliberately removing data or deriving data (smoothing/resampling) to create a false message
2. Unbalanced scaling
Ratios (transformations) between raw data and visualisation
3. Range distortion
Move axis (at least one dimension) and not making this explicit
4. Abusing dimensionality
Errors resulting from dimensionality (rises with the power of the dimensions displayed: lines,
areas, volumes)
GEOM90007
B.2 DATA INTEGRITY
▪ Avoid distortion
size of effect shown in graphic
• Lie factor = size of effect in data
▪ Size encoding
▪ Additional principles
• Clear, detailed labels to defeat distortion and ambiguity
• Show data variation, not design variation
• Account for inflation (time & money)
GEOM90007
GEOM90007
B.3 DATA CORRESPONDENCE
GEOM90007
GEOM90007
B.4 Aesthetics
▪ Aesthetic Science is
examined across three broad fields:
o Philosophy
o Psychology
o Neuroscience
Image: Palmer (2009), continued Shimamura and Palmer (2013)
GEOM90007
GEOM90007
B.4 UNDERSTANDING AESTHETICS
https://www.getty.edu/education/teachers/building_lessons/principles_design.pdf
GEOM90007
B.4 Making complexity accessible
Friendly Unfriendly
Words are spelled out, jargon is avoided Multiple abbreviations, requiring viewer to re-
read text to decode
Words run left to right (usual direction for
reading English) Words run vertically, e.g., Y Axis, or in many
directions
Elaborate shading and symbols are avoided
Obscure codings requiring the viewer to
Labels are placed on the graphic itself, no constantly revert to the legend
legend required
Chart-junk
Colour deficiencies are considered
Type (font and style) is unreadable
Type (font and style) is clear and precise
GEOM90007
B.4 Making complexity accessible
Wider plots make it easier for the eye to follow from left to right
(Tukey, 1977)
GEOM90007
CHART JUNK
▪ “Decoration… to make the graphic appear more scientific and precise… to give the
designer an opportunity to exercise artistic skills”
(Tufte, 2001)
▪ Innovative design can promote intrigue and curiosity – chart junk does not
GEOM90007
Chart junk
1. Vibrations
Some spatial frequencies can cause vibration and
movement that may distract the viewer’s experience…
GEOM90007
Chart junk
Line properties (e.g., width and Decorations that take over from
frequency) that compete with data quantitative data (i.e. form disastrously
overtakes from function)
Color gradient
GEOM90007
GEOM90007
DISPLAYING DATA GRAPHICS
GEOM90007
SUPPORTING INFORMATION
1. Caption
Mappings used for visual variables, data source
2. Title
Clear and concise
3. Axis
Always labelled with units, balanced distribution of marks
GEOM90007
LAYOUT
Whitespace
Symmetry
No crowding
Important elements
towards top Whitespace
GEOM90007
ALLIGNMENT
GEOM90007
EXAMPLE
Title:
Population Distribution, 2016
Subtitle:
Number of people per local government area
Explanatory note:
The area of each circle is proportioned to the number of people in the local
government area. The legend presents example symbol sizes shown on the
map.
GEOM90007
Brewer, 2016 GEOM90007
IMPORTANT TERMINOLOGY
GEOM90007
TAXONOMY
GEOM90007
TAXONOMY
2. Classification of information visualisation techniques
GEOM90007
SALIENCE
▪ Design task:
Choice of visual variables must consider trade-offs
(e.g., accuracy)
GEOM90007
BRUSHING
Scatterplot matrix
GEOM90007
SOFTWARE SUPPORTING VISUALISATION
▪ Proprietary
o Tableau
o Power BI
o Qlik
o Alteryx
o Echarts3
▪ Open Source
o D3
o Processing V3
o Cinder
GEOM90007
THE DESIGN PROCESS (Lecture 1)
1. Concept
What’s the idea?
2. Form
What does it look like?
3. Content
What information does it
contain?
4. Analysis
What is the problem?
Munzner (2014) 5. Design
What’s the solution?
6. Implementation
Constructing the solution
7. Test
Evaluating the solution
1. Analysis
2. Design
3. Implement
4. Test
5. Evaluate
GEOM90007
NEXT LECTURE
▪ Cartography 1
READING
Agrawala, M., Li, W. and Berthouzoz, F. (2011) Design principles for visual
communication. Communications of the ACM, 54(4), pages 60-69
Access: http://dl.acm.org.ezp.lib.unimelb.edu.au/citation.cfm?id=1924439
GEOM90007
Thank you!