
Visual Vocabulary


There are so many ways to visualise data - how do we know which one to pick? Use the categories below to decide which data relationship is most important in your story, then look at the different types of charts within the category to form some initial ideas about what might work best. This list is not meant to be exhaustive, nor a wizard, but is a useful starting point for making informative and meaningful data visualisations.

Deviation
Emphasise variations (+/-) from a fixed reference point. Typically the reference point is zero but it can also be a target or a long-term average. Can also be used to show sentiment (positive/neutral/negative).

Correlation
Show the relationship between two or more variables. Be mindful that, unless you tell them otherwise, many readers will assume the relationships you show them to be causal (i.e., one causes the other).

Ranking
Use where an item’s position in an ordered list is more important than its absolute or relative value. Don’t be afraid to highlight the points of interest.

Distribution
Show values in a dataset and how often they occur. The shape (or ‘skew’) of a distribution can be a memorable way of highlighting the lack of uniformity or equality in the data.

Change over Time
Give emphasis to changing trends. These can be short (intra-day) movements or extended series traversing decades or centuries: Choosing the correct time period is important to provide suitable context for the reader.

Part-to-Whole
Show how a single entity can be broken down into its component elements. If the reader’s interest is solely in the size of the components, consider a magnitude-type chart instead.

Magnitude
Show size comparisons. These can be relative (just being able to see larger/bigger) or absolute (need to see fine differences). Usually these show a ‘counted’ number (for example, barrels, dollars or people) rather than a calculated rate or per cent.

Spatial
Used only when precise locations or geographical patterns in data are more important to the reader than anything else.

Flow
Show the reader volumes or intensity of movement between two or more states or conditions. These might be logical sequences or geographical locations.

CREDITS & TUTORIALS

CREATED BY
(c) Andy Kriebel | @VizWizBI | 2018 | All rights reserved | Permission to republish with proper credit

INSPIRED BY
FT Graphics: Alan Smith; Chris Campbell; Ian Bott; Liz Faunce; Graham Parrish; Billy Ehrenberg; Paul McCallum; Martin Stabe
Visual Vocabulary Poster: ft.com/vocabulary

TUTORIALS
Diverging Stacked Bar | Steve Wexler | Data Revelations
Arc Chart | Ken Fl.. | KenFlaerlage.com
Chord Diagram | Noah .. | DataBlick
Sunburst Chart | Leonid Golub | Super Data Science
Radar Chart | Adam .. | Dueling Data
Sankey Diagram | Leoni.. | Super Data Science
Surplus/Deficit Filled Line | Jeffrey Shaffer | Data +Science
Scaled Cartogram | Ken Fl.. | KenFlaerlage.com
Violin Plot | Ben Moss | YouTube
Venn Diagram | Leoni.. | Super Data Science

Deviation
Emphasise variations (+/-) from a fixed reference point. Typically the reference point is zero but it can also be a target or a long-term average. Can also be used to show sentiment
(positive/neutral/negative).
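
As a minimal sketch of what "variation from a fixed reference point" means in practice, the snippet below subtracts a reference from each value. The function name and the sample figures are hypothetical and invented for illustration; when no reference is supplied, the mean of the values stands in for a long-term average, which is an assumption of this sketch.

```python
from statistics import mean


def deviations_from_reference(values, reference=None):
    """Return each value minus a fixed reference point.

    If no reference is given, the mean of the values is used as a
    stand-in for a long-term average (an assumption of this sketch).
    """
    if reference is None:
        reference = mean(values)
    return [v - reference for v in values]


# Hypothetical monthly figures, invented for illustration only.
monthly_sales = [120, 95, 130, 80, 100]

# Deviation from a target of 100: positive values beat the target.
vs_target = deviations_from_reference(monthly_sales, reference=100)

# Deviation from the average of the series itself.
vs_average = deviations_from_reference(monthly_sales)

print(vs_target)
print(vs_average)
```

The signed results are what a diverging bar would plot on either side of the reference line.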

Diverging Bar
A simple standard bar chart that can handle both negative and positive magnitude values.
[Example: Profit Ratio by product (Labels, Copiers, Paper and others), axis from -20% to 40%.]

Diverging Stacked Bar
Perfect for presenting survey results which involve sentiment (e.g., disagree/neutral/agree).
[Example: Score for survey statements such as "Has grace under pressure" and "Makes good coffee", axis from -100% to 100%.]

Spine Chart
Splits a single value into 2 contrasting components (e.g., Male/Female).
[Example: Response Rate by nationality.]

Surplus/Deficit Filled Line
The shaded area of these charts allows a balance to be shown – either against a baseline or between two series.
[Example: values from -60K to 60K, 2015 to 2019.]


Correlation
Show the relationship between two or more variables. Be mindful that, unless you tell them otherwise, many readers will assume the relationships you show them to be causal (i.e.,
one causes the other).
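
One common way to quantify the relationship between two variables before charting it is the Pearson correlation coefficient. The sketch below is an illustrative, standard-library-only implementation; the function name and sample data are hypothetical. A value near +1 or -1 indicates a strong linear relationship, and, as noted above, says nothing about whether one variable causes the other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two deviation norms.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


# Hypothetical paired observations, invented for illustration only.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
print(round(pearson_r(hours, score), 3))
```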

Scatterplot
The standard way to show the relationship between two continuous variables, each of which has its own axis.
[Example: % of obese people (x) against % of people with a BA degree or higher (y) by US state, with reference lines at the US obesity average (27.0%) and the US average for people with a BA (27.2%).]

Line + Column
A good way of showing the relationship between an amount (columns) and a rate (line).
[Example: Sales and Profit by Quarter of Order Date, 2015 Q3 to 2018 Q3.]

Connected Scatterplot
Usually used to show how the relationship between 2 variables has changed over time.
[Example: Top 0.01% against Bottom 90%, 1917 to 2012.]

Bubble
Like a scatterplot, but adds additional detail by sizing the circles according to a third variable.
[Example: Sales against Profit.]

XY Heatmap
A good way of showing the patterns between 2 categories of data, less good at showing fine differences in amounts.
[Example: savings account responses by Age Range.]

Age Range | I don't have a savings account | $0 | Just the minimum balance requirement | Less than $1,000 | $1,000-$4,999 | $5,000-$9,999 | $10,000 or more
Overall | 21.0% | 28.0% | 9.0% | 13.0% | 10.0% | 5.0% | 14.0%
18-24 | 22.4% | 21.8% | 9.7% | 19.1% | 14.7% | 4.7% | 7.5%
25-34 | 18.0% | 26.3% | 10.6% | 15.2% | 12.5% | 5.4% | 12.1%
35-44 | 18.9% | 31.6% | 6.6% | 11.6% | 9.8% | 5.6% | 16.0%
45-54 | 21.6% | 30.8% | 7.7% | 10.9% | 7.5% | 5.2% | 16.2%
55-64 | 22.8% | 28.4% | 8.4% | 10.7% | 8.0% | 4.8% | 16.8%
65+ | 21.6% | 27.6% | 10.7% | 8.2% | 7.2% | 4.7% | 20.0%


Ranking
Use where an item’s position in an ordered list is more important than its absolute or relative value. Don’t be afraid to highlight the points of interest.
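
A small sketch of the preparation step behind any ranked chart: sort the items by value and attach their position in the ordered list. The function name and the figures are hypothetical; ties are given consecutive ranks in their original order, which is a simplification made for this example.

```python
def rank_descending(values_by_name):
    """Return (rank, name, value) tuples, largest value first.

    Ties keep their original relative order and receive consecutive
    ranks (a simplification for this sketch).
    """
    ordered = sorted(values_by_name.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, value) for rank, (name, value) in enumerate(ordered, start=1)]


# Hypothetical totals by region, invented for illustration only.
sales = {"Central": 501, "East": 679, "South": 392, "West": 725}

for rank, name, value in rank_descending(sales):
    print(rank, name, value)
```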

Ordered Bar
Standard bar charts display the ranks of values much more easily when sorted into order.
[Example: West, East, Central and South, ordered by value.]

Ordered Column
Standard bar charts display the ranks of values much more easily when sorted into order.
[Example: West, East, Central and South, ordered by value.]

Ordered Proportional Symbol
Use when there are big variations between values and/or seeing fine differences between data is not so important.
[Example: Burglary Rate against Murder Rate.]

Dot Strip Plot
Dots placed in order on a strip are a space-efficient method of laying out ranks across multiple categories.
[Example: one strip per month, January to June.]

Slope
Perfect for showing how ranks have changed over time or vary between categories.
[Example: % w/ BA or Higher against % Obese.]

Lollipop Chart
Lollipops draw more attention to the data value than standard bar/column and can also show rank and value effectively.
[Example: West, East, Central and South.]

Distribution
Show values in a dataset and how often they occur. The shape (or ‘skew’) of a distribution can be a memorable way of highlighting the lack of uniformity or equality in the data.
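
To make "values and how often they occur" concrete, the sketch below counts values into equal-width bins, which is the tally a histogram draws. The function name, bin width and sample data are hypothetical choices made for illustration.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count values per equal-width bin, keyed by each bin's lower edge."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Floor division maps every value to the lower edge of its bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical observations, invented for illustration only.
observations = [32, 35, 41, 44, 44, 47, 52, 58, 66]

for lower_edge, count in histogram_counts(observations, bin_width=10).items():
    print(f"{lower_edge}-{lower_edge + 9}: {'#' * count}")
```

Uneven counts across bins are what give a distribution its visible shape or skew.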

Histogram
The standard way to show a statistical distribution - keep the gaps between columns small to highlight the ‘shape’ of the data.

Boxplot
Summarise multiple distributions by showing the median (centre) and range of the data.

Violin Plot
Similar to a box plot but more effective with complex distributions (data that cannot be summarised with a simple average).

Population Pyramid
A standard way for showing the age and sex breakdown of a population distribution; effectively, back to back histograms.
[Example: Female and Male, 1960, 1980 and 2000.]

Dot Strip Plot
Dots placed in order on a strip are a space-efficient method of laying out ranks across multiple categories.

Dot Plot
A simple way of showing the change or range (min/max) of data across multiple categories.

Barcode Plot
Like dot strip plots, good for displaying all the data in a table, they work best when highlighting individual values.

Cumulative Curve
A good way of showing how unequal a distribution is: y axis is always cumulative frequency, x axis is always a measure.
[Example: Cumulative Sales against Months since First Purchase.]

Change over Time
Give emphasis to changing trends. These can be short (intra-day) movements or extended series traversing decades or centuries: Choosing the correct time period is important to
provide suitable context for the reader.

Line
The standard way to show a changing time series. If data are irregular, consider markers to represent data points.

Column
Columns work well for showing change over time - but usually best with only one series of data at a time.

Line + Column
Columns work well for showing change over time - but usually best with only one series of data at a time.
[Example: Sales and Profit, 2015 to 2018.]

Stock Price
Usually focused on day-to-day activity, these charts show opening/closing and hi/low points of each day.
[Example: Close.]

Slope
Good for showing changing data as long as the data can be simplified into 2 or 3 points without missing a key part of the story.

Area Chart
Use with care – these are good at showing changes to total, but seeing change in components can be very difficult.

Fan Chart
Use to show the uncertainty in future projections - usually this grows the further forward the projection goes.

Connected Scatterplot
A good way of showing changing data for two variables whenever there is a relatively clear pattern of progression.
[Example: Travel Agents against Online Hotel Revenue.]

Calendar Heatmap
A great way of showing temporal patterns (daily, weekly, monthly) – at the expense of showing precision in quantity.
[Example: months Jan to Dec against years 2010 to 2018.]

Priestley Timeline
Great when date and duration are key elements of the story in the data.
[Example: Haydn, Mozart, Beethoven, Schubert, Berlioz, Schumann, Brahms and Elgar, 1750 to 1900.]

Circle Timeline
Good for showing discrete values of varying size across multiple categories (e.g., sales by quarter).

Seismogram
Another alternative to the circle timeline for showing series where there are big variations in the data.

Part-to-Whole
Show how a single entity can be broken down into its component elements. If the reader’s interest is solely in the size of the components, consider a magnitude-type chart instead.
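
The arithmetic behind every part-to-whole chart is the same: divide each component by the total. The sketch below shows that step; the function name and the figures are hypothetical and invented for illustration. It assumes non-negative components with a positive total, since shares are not meaningful otherwise.

```python
def shares_of_total(parts):
    """Return each component's share of the total as a fraction between 0 and 1."""
    if any(value < 0 for value in parts.values()):
        raise ValueError("components must be non-negative")
    total = sum(parts.values())
    if total == 0:
        raise ValueError("total must be positive")
    return {name: value / total for name, value in parts.items()}


# Hypothetical totals by region, invented for illustration only.
sales = {"West": 400, "East": 300, "Central": 200, "South": 100}

for name, share in shares_of_total(sales).items():
    print(f"{name}: {share:.0%}")
```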

Stacked Column
A simple way of showing part-to-whole relationships but can be difficult to read with more than a few components.

Proportional Stacked Bar
A good way of showing the size and proportion of data at the same time – as long as the data are not too complicated.

Pie Chart
A common way of showing part-to-whole data – but be aware that it’s difficult to accurately compare the size of the segments.

Donut Chart
Similar to a pie chart – but the centre can be a good way of making space to include more information about the data (e.g., total).

Treemap
Use for hierarchical part-to-whole relationships; can be difficult to read when there are many small segments.

Sunburst
Another way of visualising hierarchical part-to-whole relationships. Use sparingly (if at all) for obvious reasons.

Arc
Another way of visualising hierarchical part-to-whole relationships. Use sparingly (if at all) for obvious reasons.

Gridplot
Good for showing % information, they work best when used on whole numbers and work well in multiple layout form.

Venn
Generally only used for schematic representation.

Waterfall
Can be useful for showing part-to-whole relationships where some of the components are negative.
[Example: Gross Sales, Cost of Goods Sold, Total Expenses and Net Profit.]

Magnitude
Show size comparisons. These can be relative (just being able to see larger/bigger) or absolute (need to see fine differences). Usually these show a ‘counted’ number (for example,
barrels, dollars or people) rather than a calculated rate or per cent.

Column
The standard way to compare the size of things. Must always start at 0 on the axis.

Bar
The standard way to compare the size of things. Must always start at 0 on the axis. Good when the data are not time series and labels have long category names.

Paired Column
As per standard column, but allows for multiple series. Can become tricky to read with more than 2 series.

Paired Bar
As per standard bar, but allows for multiple series. Can become tricky to read with more than 2 series.
[Example: 2015 and 2018 for West, East, Central and South.]

Proportional Stacked Bar
A good way of showing the size and proportion of data at the same time – as long as the data are not too complicated.

Proportional Symbol
Use when there are big variations between values and/or seeing fine differences between data is not so important.
[Example: Burglary Rate against Murder Rate.]

Isotype (pictogram)
Excellent solution in some instances – use only with whole numbers (do not slice off an arm to represent a decimal).

Lollipop Chart
Lollipops draw more attention to the data value than standard bar/column - does not HAVE to start at zero (but preferable).

Radar Chart
A space-efficient way of showing value of multiple variables – but make sure they are organised in a way that makes sense to the reader.

Parallel Coordinates
An alternative to radar charts – again, the arrangement of the variables is important. Usually benefits from highlighting values.
[Example: Burglary Rate, Larceny Theft, Motor Vehicle Theft and Murder Rate.]

Spatial
Used only when precise locations or geographical patterns in data are more important to the reader than anything else.

Basic Choropleth (rate/ratio)
The standard approach for putting data on a map – should always be rates rather than totals and use a sensible base geography.

Proportional Symbol (count/magnitude)
Use for totals rather than rates – be wary that small differences in data will be hard to see.

Flow Map
For showing unambiguous movement across a map.

Contour Map
For showing areas of equal value on a map. Can use deviation colour schemes for showing +/- values.

Equalized Cartogram
Converting each unit on a map to a regular and equally-sized shape – good for representing voting regions with equal value.

Scaled Cartogram
Stretching and shrinking a map so that each area is sized according to a particular value.

Dot Density
Used to show the location of individual events/locations – make sure to annotate any patterns the reader should see.

Heat Map
Grid-based data values mapped with an intensity colour scale. As choropleth map – but not snapped to an admin/political unit.


Flow
Show the reader volumes or intensity of movement between two or more states or conditions. These might be logical sequences or geographical locations.

Sankey
Shows changes in flows from one condition to at least one other; good for tracing the eventual outcome of a complex process.

Waterfall
Designed to show the sequencing of data through a flow process, typically budgets. Can include +/- components.
[Example: steps from Total Revenue to Net Profit, axis from 0M to 60M.]

Chord
A complex but powerful diagram which can illustrate 2-way flows (and net winner) in a matrix.

Network
Used for showing the strength and inter-connectedness of relationships of varying types.
