Data 100: Principles and Techniques of Data Science
Data Visualization

Sandrine Dudoit
Department of Statistics and Division of Biostatistics, UC Berkeley
Spring 2019

Outline

1 Motivation
2 Principles of Data Visualization
    2.1 Do We Really Need a Graph?
    2.2 General Considerations
    2.3 Graphical Perception
    2.4 Bad Graphs
3 Survey of Data Visualization Techniques
    3.1 One Quantitative Variable
    3.2 Multiple Quantitative Variables
    3.3 One Qualitative Variable
    3.4 Multiple Qualitative Variables
    3.5 Conditional Plots

Version: 05/02/2019, 17:17

Data Visualization

“One picture worth ten thousand words.”
Frederick R. Barnard, Printer’s Ink, March 10th, 1927.

An Oldie But Goodie

Figure 1: Minard’s representation of Napoleon’s 1812 Russian Campaign. This graph, made in 1869 by Charles Joseph Minard (1781–1870), is commonly regarded as one of the finest ever. It represents, in only two dimensions, the size of the troops, their location, their direction of movement, dates, and temperatures. https://en.wikipedia.org/wiki/Charles_Joseph_Minard.

New But ...

Figure 2: Bitcoin wealth distribution. http://viz.wtf/image/166329900475.

Data Visualization

One picture worth ten thousand words.
• Only if it is a good picture.
• We tend to be more demanding with text than with graphics.
• How long does it take to write/read one thousand words? At least the same effort should be put into making/viewing a graph.

Learning Objectives

• Become a wise and effective “creator”/“maker” as well as “reader”/“viewer” of data visualization.
• Master general principles for data visualization and apply these when making your own graphs as well as when viewing others’.
• Produce the right graph for the matter at hand.
• Become aware of the variety of graphical techniques available for different types of data and purposes and understand their pros and cons. Go beyond histograms and pie charts!
• Think more carefully about each plot you create, consider the pros and cons of different choices, and try several different plots for a given dataset.

Learning Objectives

• Familiarize yourself with software for data visualization. Most of the examples in these slides are based on Python’s matplotlib and seaborn libraries. However, as discussed in the first lecture, other languages such as R may be better suited for certain tasks.
• Focus on what type of plot to make rather than how to make it, i.e., compose the plot conceptually before thinking of its software implementation details. Concepts are general and long-lasting, while syntax is highly specific and ephemeral.
• Avoid bad graphs!

Data Visualization

• Data visualization is a fundamental aspect of Data Science.
• It is essential to “look at data” throughout the workflow, from exploratory data analysis (EDA) to model diagnostics and reporting the results of the inquiry.
• Visualization is valuable for detecting the main features (good or bad) of a dataset, revealing patterns, and suggesting theories or further questions.
• Visualization is also useful for quality assurance/quality control (QA/QC) and detecting problems with the data.
• An effective plot can be good enough to answer the question on its own. In some cases, it may even be the only appropriate type of answer.

Data Visualization

• An effective plot can also be sufficient to convince stakeholders of the findings from a full-blown statistical inference procedure.

Data Visualization

• Although data visualization is ubiquitous and heavily relied upon, in research as well as in the media, typically not much thought is put into creating or reading plots.
  ▸ Creators often rely on very limited subsets of plots, without proper consideration of their limitations.
  ▸ Readers often passively absorb a message imposed on them by the graph, rather than reason and think critically about it.
• Very few Statistics, Computer Science (CS), or domain curricula offer courses in data visualization.
• Proper data visualization is nontrivial. Entire courses could and should be devoted to data visualization, including discussions of vision and perception to guide the design of effective graphs.

Do We Really Need a Graph?

• When the data only comprise a handful of values, a table or a simple mention in text may be a more effective, i.e., accurate and simple, display.
• E.g., percentage of popular vote for Trump and Clinton in the 2016 presidential election:

  Trump    46.1%
  Clinton  48.2%

Do We Really Need a Graph?

[Pie charts. Left: Trump 48.9%, Clinton 51.1%. Right: Trump 46.1%, Clinton 48.2%, Others 5.7%.]

Figure 3: US Election Results 2016. Left: Pie chart of percentage of popular vote for Trump and Clinton. Right: Pie chart of percentage of popular vote for Trump, Clinton, and other candidates. Why the different percentages on left and right?

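One plausible reading of the two pie charts is that the left chart rescales the Trump and Clinton shares so that they sum to 100% once the other candidates are dropped. The sketch below checks that arithmetic using the percentages shown in the figure; the function name `renormalize` is introduced here for illustration only.

```python
# Popular-vote percentages read off the right-hand pie chart.
shares_all = {"Trump": 46.1, "Clinton": 48.2, "Others": 5.7}


def renormalize(shares, keep):
    """Rescale the kept categories so that they sum to 100%."""
    total = sum(shares[name] for name in keep)
    return {name: 100 * shares[name] / total for name in keep}


# Drop "Others" and rescale, as the left-hand chart appears to do.
two_way = renormalize(shares_all, ["Trump", "Clinton"])
for name, pct in two_way.items():
    print(f"{name}: {pct:.1f}%")
```

Under this assumption the rescaled shares round to the 48.9% and 51.1% shown on the left, so the same data yield different numbers depending on which categories the chart includes.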
From Tables to Graphs

• When a table represents two or more variables, with more than a handful of values each, a graph may be more effective.
• Tables leave the interpretation to the viewer.
• Graphs provide a summary of the data and are more amenable to comparisons.
• Gelman et al. (2002). Let’s Practice What We Preach: Turning Tables into Graphs. http://www.stat.columbia.edu/~gelman/research/published/dodhia.pdf.

From Tables to Graphs

Figure 4: Turning tables into graphs (Gelman et al., 2002, Figure 2). Counts and rates of citations of various professions from the New York Times database. Graph: Log-log scale allows comparison across several orders of magnitude. Any 45° line indicates constant relative frequency. The relative positions of the different professions are clearer.

More Oldies But Goodies

Figure 5: Album de Statistique Graphique (1881). https://www.davidrumsey.com/.

More Oldies But Goodies: Maps

Figure 6: Album de Statistique Graphique (1881). Train load (scaled by length of line) is represented by thickness of bands. How would you represent this data without a graph?

More Oldies But Goodies: Graphical Timetables

Figure 7: Marey (1885). Train schedule Paris–Lyon, 1880s. https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003zP. How would you represent this data without a graph?

More Oldies But Goodies: Graphical Timetables

Figure 8: Marey (1885). Train schedule Paris–Lyon with TGV, 1980s vs. 1880s. The red line indicates the 1981 itinerary of the TGV, a new express train that cut the trip from Paris to Lyon to under three hours (vs. nine hours in the 1880s).

More Oldies But Goodies: Graphical Timetables

Figure 9: Train schedule SF–Gilroy, now. https://i.stack.imgur.com/qJ1hH.

More Oldies But Goodies: Graphical Timetables

• In Marey’s (1885) Paris–Lyon graphical train schedule from the 1880s, time is represented on the x-axis and the stations and distances between stations are represented on the y-axis (Tufte, 2001).
• A train’s itinerary is represented by a line.
• The slope of the line reflects the speed of the train: The more nearly vertical the line, the faster the train.
• The length of a stop at a station is indicated by the length of the horizontal line.
• The intersection of two lines locates the time and place at which trains going in opposite directions pass each other.
• This type of graph, known as a parallel coordinates plot, is still used today and has many other applications.

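The reading rules above (slope is speed, a horizontal segment is a stop) can be sketched numerically. The itinerary below is invented purely for illustration and is not taken from Marey's timetable; `leg_speeds` is a hypothetical helper name.

```python
# Hypothetical itinerary: (hours since departure, km from origin).
# These numbers are invented for illustration; they are not Marey's data.
itinerary = [(0.0, 0), (1.5, 120), (1.75, 120), (4.0, 300)]


def leg_speeds(points):
    """Slope of each segment in (time, distance) space, i.e. speed in km/h."""
    speeds = []
    for (t0, d0), (t1, d1) in zip(points, points[1:]):
        speeds.append((d1 - d0) / (t1 - t0))
    return speeds


speeds = leg_speeds(itinerary)
for speed in speeds:
    # A zero slope is a horizontal segment on the timetable: a station stop.
    label = "stop" if speed == 0 else f"{speed:.0f} km/h"
    print(label)
```

A steeper segment corresponds to a larger value in `speeds`, and the 15-minute horizontal segment shows up as a slope of zero.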
Caveats

• Graphs should attempt to summarize data in a simple, intuitive, and efficient manner, without distorting or losing important information.
• However, not all good graphs are simple. As with text, plots conveying a lot of information (e.g., displaying multiple variables) require both a skillful creator and an educated reader.
  E.g., Minard’s graph for Napoleon’s Russian campaign, old graphical train schedules.
• There is no “one-size-fits-all” graph, i.e., different types of graphs should be used for different
  ▸ types of data, e.g., quantitative, qualitative variables;
  ▸ purposes, e.g., debugging code, EDA, reporting results;
  ▸ media, e.g., print journal, projector.

Caveats

• Graphs typically reduce the information contained in the data.
  E.g., histograms map n data points into B < n bins; boxplots map n data points into 5 summary statistics (+ possibly outliers).
• By focusing on certain aspects of the data or even imposing structure on data, graphs can also be subjective.
  E.g., choosing which variables to plot, decisions regarding axes and scales, dendrogram representation of clusters¹.
• As with text, the creator of the plot makes editorial decisions as to which data to display and which aspects of these data to show or emphasize.

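To make the information reduction concrete, the following sketch maps a small sample into B histogram bin counts and into a five-number summary. The data values are invented, the equal-width binning rule is one simple choice among many, and the quartiles use the default method of Python's `statistics.quantiles`; plotting libraries may use different conventions.

```python
import statistics

# Hypothetical sample of n = 12 values (invented for illustration).
data = [2.1, 3.4, 3.9, 4.0, 4.4, 5.2, 5.8, 6.1, 6.3, 7.7, 8.0, 9.5]


def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


def five_number_summary(values):
    """Minimum, quartiles, and maximum, as used for a boxplot."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)


counts = histogram_counts(data, n_bins=4)  # 12 points reduced to 4 counts
summary = five_number_summary(data)        # 12 points reduced to 5 numbers
print(counts)
print(summary)
```

Neither summary lets a reader recover the original 12 values, which is the sense in which the graph reduces information.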
Caveats

• The reader should assess the relevance and reliability of the data being displayed, as well as the appropriateness of the graph.
• Software implicitly makes many decisions for the creator of a plot, e.g., axes, scales, plotting symbols, color, ordering of data. Experiment with different settings.
• Graphs are rarely presented on their own. They should be interpreted in context of the text which they support. The reader should examine the graph-text interface and, in particular, whether the conclusions in the text are supported by the graph.

¹ A dendrogram is a graphical representation of hierarchical clustering results; for a given clustering of n objects, there are $2^{n-1}$ possible dendrograms. The various choices made in hierarchical clustering as well as the dendrogram representation impose (vs. reveal) structure on the data.

Statistical Inference

• Graphs are by definition functions of the data, i.e., statistics.
• Although not typically viewed this way, visualization can therefore be used as part of statistical inference.
• One can produce the same types of plots for a sample and for a population; in that sense, the plot for the sample can be viewed as an estimator of the plot for the population, i.e., the parameter.
• A pattern that we detect from plotting data for a sample can be used to infer properties of the population from which the sample was drawn. A formalized special case of such an approach is given by linear regression.

General Considerations

In the process of creating a plot, you should consider the following issues.
• Determine the purpose of the plot.
  E.g., EDA, debugging code, comparing distributions, model diagnostics, summarizing results, reporting results.
• Formulate the message.
• Identify the audience.
• Identify the display mode/medium (e.g., journal, projector).
• Think about the best type of graph for the purpose, message, audience, and display mode.
• Aim for efficient perception: Speed, accuracy, and minimum cognitive load for understanding the message.

General Considerations

• Apply visual perception principles.
  E.g., angles and areas are harder to perceive/compare than lengths.
• Do not use more dimensions to represent the data than are in the data. This rules out pie charts and barplots.
• An important consideration when selecting a graphical technique is how easily it can be extended (e.g., to multiple variables) and how amenable it is to comparing distributions.
• Choose graphical parameters carefully: Aspect ratio, plotting symbols, line types, texture, axes, etc.

General Considerations

• Choose color palette carefully. E.g., be mindful of color blindness, use different color schemes for different types of data and messages (e.g., sequential, qualitative, and diverging).
• Provide sufficient information so that the plot can be interpreted properly.
  E.g., title, axis parameters (i.e., label, tick marks), annotation, legend, caption, etc.
  In a document, number the figures and tables.
• Do not include irrelevant information, i.e., avoid “chart junk”.
• Principle of “least surprise”: If you defy expectations, people may get confused. Only defy expectations if it is very important.

General Considerations

• Experiment, i.e., consider different types of plots and update the plots iteratively.
• Of course, always think about the quality of the data you plot.

General Considerations

• Sample size.
  ▸ For small sample sizes, plot all of the data. Why lose information?
  ▸ For larger sample sizes, plot relevant summaries of the data that do not distort or lose important information in the data.
• Variables to display/emphasize. Depends on the purpose and message of the plot.
• Type of variables. Quantitative and qualitative variables call for different types of graphical summaries.
• Pre-processing. E.g., transformation (e.g., log), dimensionality reduction, imputation.

Graphical Perception

• Cleveland and McGill (1985): “Graphical perception is the visual decoding of the quantitative and qualitative information encoded on graphs. Recent investigations have uncovered basic principles of human graphical perception that have important implications for the display of data.”
• When we create a graph, we encode the data as graphical attributes.
• Possible graphical attributes are: Angles, areas, lengths, position on common aligned/unaligned scale, slopes, color properties.
• Effective graphs are those for which attributes are most easily decoded.

Graphical Perception

• There are empirical laws for perception that can be used to rank different types of graphical encodings.
• In general, such laws relate the perceived (change in) intensity of a physical stimulus to the actual (change in) intensity. This concerns stimuli to all senses, i.e., vision, hearing, taste, touch, and smell.

Graphical Perception: Weber’s Law

• Weber’s Law is an empirical relationship in psychophysics between the initial intensity of a stimulus ($I$) and the smallest perceivable difference (a.k.a. just noticeable difference) in the stimulus intensity ($\Delta I$):

\[
\frac{\Delta I}{I} = k, \tag{1}
\]

  where $k$ is a proportionality constant for a given type of stimulus².
• In terms of length, this means we detect a 1 cm change in a 1 m length as easily as we detect a 10 m change in a 1 km length.
• Weber’s Law appears to hold for many different graphical encodings.

² Law formulated and published by Gustav Theodor Fechner (1801–1887), a student of Ernst Heinrich Weber (1795–1878).

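The length example for Weber's Law can be checked with a few lines of arithmetic. This is only a sketch of Equation (1); the function names are introduced here for illustration, and the value of k used in the last line is an arbitrary assumption, not a measured constant.

```python
def weber_fraction(delta_intensity, intensity):
    """Relative change (delta I) / I from Equation (1)."""
    return delta_intensity / intensity


def just_noticeable_difference(intensity, k):
    """Smallest detectable change, delta I = k * I, for a Weber fraction k."""
    return k * intensity


# Both changes expressed in metres.
small = weber_fraction(0.01, 1.0)     # 1 cm change in a 1 m length
large = weber_fraction(10.0, 1000.0)  # 10 m change in a 1 km length
print(small, large)

# With an assumed k = 0.01, a 50 m length needs a change of about 0.5 m.
print(just_noticeable_difference(50.0, k=0.01))
```

Both relative changes are 1%, which is why the two changes are described as equally easy to detect.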
Graphical Perception: Stevens’ Law

• Stevens’ (1957) Law is an empirical relationship in psychophysics between the intensity of a stimulus and the perceived magnitude of the sensation created by the stimulus:

\[
\psi(I) = C I^{\beta}, \tag{2}
\]

  where $I$ is the intensity or strength of the stimulus in physical units (energy, weight, pressure, mixture proportions, etc.), $\psi(I)$ is the magnitude of the sensation, $\beta$ is an exponent that depends on the type of stimulation or sensory modality, and $C$ is a proportionality constant that depends on the units used.
• Examples of values for the exponent $\beta$:

  Length:  0.9 – 1.1
  Area:    0.6 – 0.9
  Volume:  0.5 – 0.8

Graphical Perception: Stevens’ Law

• For lengths, the relationship is almost linear, thus our perception is about right.
• However, according to this power law, our perception of areas and volumes is conservative, i.e., when values are represented as areas or volumes, we underestimate the large values relative to the small ones and overestimate the small ones relative to the large ones.
• E.g., areas, with $\beta = 0.7$.
  Consider two areas of size 1 and 2, respectively.

\[
\frac{\psi(2)}{\psi(1)} = \frac{2^{0.7}}{1^{0.7}} \approx 1.62.
\]

  Thus, we don’t see the bigger area as twice as large.

Graphical Perception: Stevens’ Law

  Now consider two areas of size 1/2 and 1, respectively.

\[
\frac{\psi(1/2)}{\psi(1)} = \frac{0.5^{0.7}}{1^{0.7}} \approx 0.62.
\]

  Thus, we don’t see the smaller area as half as large.

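The two area ratios above follow directly from the power law in Equation (2). A minimal sketch that reproduces them is given below; the function names are introduced here for illustration, and C is set to 1 because it cancels in any ratio.

```python
def perceived_magnitude(intensity, beta, c=1.0):
    """Stevens' power law: psi(I) = C * I**beta."""
    return c * intensity ** beta


def perceived_ratio(a, b, beta):
    """Perceived ratio psi(a) / psi(b); the constant C cancels."""
    return perceived_magnitude(a, beta) / perceived_magnitude(b, beta)


print(round(perceived_ratio(2, 1, beta=0.7), 2))    # area twice as large
print(round(perceived_ratio(0.5, 1, beta=0.7), 2))  # area half as large
print(round(perceived_ratio(2, 1, beta=1.0), 2))    # length, for comparison
```

With an exponent of 1, as for length, the perceived ratio equals the true ratio, whereas with 0.7 both area ratios are pulled toward 1.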
Graphical Perception: Stevens’ Law

[Plot of perceived intensity (vertical axis, 0 to 5) against actual intensity (horizontal axis, 0 to 5) for power-law exponents β: electric shock 3.5, saturation 1.7, length 1, area 0.7, depth 0.67, brightness 0.5.]

Figure 10: Graphical perception: Stevens’ Law. Stevens (1957) perceived sensory magnitude power law.

Graphical Perception: Combining Weber’s and Stevens’ Laws

• Consider comparing the values $x$ and $x + w$, using length ($\beta = 1$) and area ($\beta = 0.7$) encodings.
• For length, we perceive the relative value

\[
\frac{x + w}{x} = 1 + \frac{w}{x}.
\]

• For area, we perceive the relative value (to first order, for small $w/x$)

\[
\frac{(x + w)^{0.7}}{x^{0.7}} = \left(1 + \frac{w}{x}\right)^{0.7} \approx 1 + \frac{0.7\,w}{x}.
\]

• Thus, we are more likely to detect small differences using length encoding.

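A quick numerical check can show how close the first-order approximation is and how much smaller the perceived difference becomes under area encoding. The values of x and w below are arbitrary choices for illustration, and `perceived_relative_value` is a hypothetical helper name.

```python
def perceived_relative_value(x, w, beta):
    """Perceived ratio of x + w to x under a power law with exponent beta."""
    return ((x + w) / x) ** beta


x = 100.0
for w in (1.0, 5.0, 10.0):
    length = perceived_relative_value(x, w, beta=1.0)
    area = perceived_relative_value(x, w, beta=0.7)
    linear = 1 + 0.7 * w / x  # first-order approximation for area
    print(f"w/x={w / x:.2f}  length={length:.4f}  area={area:.4f}  approx={linear:.4f}")
```

For each w, the perceived excess over 1 under area encoding is roughly 70% of the excess under length encoding, so a given difference sits closer to the detection threshold.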
Graphical Perception

• Cleveland and McGill (1985) carried out an extensive study of graphical encodings to obtain a best-to-worst ranking.
• The encodings they examined include: position on a common aligned scale, position on a common unaligned scale, length, slope, angle, area, volume, color hue, brightness, and purity.
• One of their experiments consisted of
  ▸ 7 graphical encodings,
  ▸ 3 judgments per encoding,
  ▸ 10 replications per subject,
  ▸ 127 experimental subjects.
  Assessment criterion: error = |perceived p − true p|, where p denotes the ratio (in percentages) of the smaller to the larger magnitude.

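As a rough sketch of how such an assessment criterion could be turned into a ranking, the code below averages the absolute error per encoding and sorts the encodings from smallest to largest mean error. The judgments are invented for illustration and are not Cleveland and McGill's data; the aggregation by a simple mean is also an assumption.

```python
def judgment_error(perceived_p, true_p):
    """Absolute error between judged and true percentages."""
    return abs(perceived_p - true_p)


# Hypothetical (perceived %, true %) judgments for two encodings; invented data.
judgments = {
    "position": [(48, 50), (27, 25), (70, 72)],
    "angle": [(40, 50), (33, 25), (60, 72)],
}

# Mean absolute error per encoding.
mean_error = {
    encoding: sum(judgment_error(p, t) for p, t in pairs) / len(pairs)
    for encoding, pairs in judgments.items()
}

# Encodings ordered from most to least accurately judged.
ranking = sorted(mean_error, key=mean_error.get)
print(mean_error)
print(ranking)
```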
Graphical Perception

Figure 11: Graphical perception. Based on Table 1 in Cleveland and McGill (1985). http://paldhous.github.io/ucb/2016/dataviz/week2.html.

Bad Graphs

• The literature is full of “bad graphs” that, for instance, distort the data and are misleading, are too complicated, or are missing essential information.
• Karl Broman’s Top Ten Worst Graphs (including one of his own!): https://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/.
• Ross Ihaka’s Good and Bad Graphs: https://www.stat.auckland.ac.nz/~ihaka/120/Lectures/lecture03.pdf.
• Edward Tufte: https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00040Z.
• Junk Charts: https://junkcharts.typepad.com/junk_charts/.
• WTF Visualization: http://viz.wtf.

Bad Graphs: Pie Charts

Figure 12: Top 10 Google salaries by job category: Pie chart. https://junkcharts.typepad.com/junk_charts/2011/10/the-massive-burden-of-pie-charts.html. What’s the message? What do the angles represent? What’s a better graph?

Bad Graphs: Pie Charts

Figure 13: Top 10 Google salaries by job category: Interval chart. https://junkcharts.typepad.com/junk_charts/2011/10/the-massive-burden-of-pie-charts.html.

Bad Graphs: Pie Charts

[Figure: interval chart. x-axis: Salary, thousands of dollars (140 to 240). Categories in plotted order: Lead Software Engineer Contractor, Product Management Director, Directors, Human Resources Director, Engineering Director, Senior Partner Technology Manager, Staff User Experience Designer, Marketing Director, Senior Managers*, Group Product Manager.]

Figure 14: Top 10 Google salaries by job category: Interval chart. Sorted by midpoint of salary range.

Bad Graphs: Pie Charts

[Figure: interval chart. x-axis: Salary, thousands of dollars (140 to 240). Categories in plotted order: Directors, Senior Managers*, Staff User Experience Designer, Lead Software Engineer Contractor, Human Resources Director, Product Management Director, Senior Partner Technology Manager, Group Product Manager, Engineering Director, Marketing Director.]

Figure 15: Top 10 Google salaries by job category: Interval chart. Sorted by salary range.

Bad Graphs: Pie Charts

Figure 16: Google Home query categories: Pie chart. http://viz.wtf/image/171134950336. Unreadable. Can’t match numbers to categories. What’s a better graph?

Bad Graphs: Pie Charts

Figure 17: Bitcoin wealth distribution: Pie chart. http://viz.wtf/image/166329900475. What’s the message? How to compare shapes and areas? Without text, pie uninformative. What’s a better graph?

Bad Graphs: Pie Charts

[Figure: two scatterplots. y-axis: % BTC owned (0 to 100). x-axis: % of top addresses (left panel), % of bottom addresses (right panel), both 0 to 100.]

Figure 18: Bitcoin wealth distribution: Scatterplot.

Bad Graphs: Multilevel Donut Charts

Figure 19: Goldman Sachs job listings: Multilevel donut chart. https://s3.amazonaws.com/cbi-research-portal-uploads/2017/09/18173935/GSteardownjobs. What’s the message? Unreadable. What’s a better graph?

Bad Graphs: Wordclouds

[Figure: wordcloud of words from the speeches.]

Figure 20: State of the Union speeches 2010 and 2011: Wordcloud. Frequency of words with at least 15 occurrences. What’s the message? How to compare frequencies of words? What’s a better graph?

Bad Graphs: Wordclouds

[Figure: dotplot with one row per word. x-axis: Frequency (20 to 120).]

Figure 21: State of the Union speeches 2010 and 2011: Dotplot. Frequency of words with at least 15 occurrences.

Bad Graphs: Wordclouds

Figure 22: Gilets jaunes: Wordcloud. Frequency of expressions and hashtags on Twitter for first four days of gilets jaunes movement. How to compare frequencies between days? https://www.lexpress.fr/actualite/societe/gilets-jaunes-ce-qu-en-disent-les-francais_2055542.html.

Bad Graphs: Wordclouds

Figure 23: Names: Wordcloud. https://www.wordclouds.com/?cloud=names.

Bad Graphs: Wordclouds

Figure 24: Business words: Wordcloud. https://www.wordclouds.com/?cloud=business

Bad Graphs

• Chart junk. The previous graphs exemplify “chart junk”, i.e., they contain superfluous elements that are not necessary to convey the information contained in the data, but instead distract the viewer from this information or even mask or distort important information.
• Pie charts.
  - Frequency represented by angle/area.
  - Angles and areas are hard to perceive and compare.
  - Pie charts quickly become unreadable for more than a handful of values.
  - Listing the values is often better – they are actually often added to a pie chart anyway!
  - How to select order of categories?
  - Not amenable to comparing distributions; side-by-side comparisons not effective.

Bad Graphs

  - Hard to extend to multiple variables.
  - A lot of junk often added to pie charts, e.g., thickness, slice explosion.
• Wordclouds/tag clouds.
  - Frequency represented by font size.
  - Neither area nor height corresponds to frequency of words.
  - How do longer words compare with shorter words?
  - How are capital letters handled?
  - How to calculate relative difference in frequency between two words?
  - How are the words ordered within the cloud (alphabetical, frequency)?
  - Not amenable to comparing distributions; side-by-side comparisons not effective.
  - How to extend to multiple variables?
  - A lot of junk often added to word clouds.

Bad Graphs

• Barcharts/barplots. Better.
  - Based on length and position on common aligned scale.
  - Add an irrelevant dimension (thickness of bar).
  - How to select order of categories?
  - Not readily amenable to comparisons.
  - Extension to multiple variables problematic.
• Dotcharts/dotplots. (And interval charts.) Even better.
  - Based on length and position on common aligned scale.
  - Display only the relevant information.
  - How to select order of categories?
  - More amenable to comparisons and extensions to multiple variables.

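As a rough illustration of why a dotplot tends to be easier to read than a wordcloud, the sketch below counts words and draws a plain-text dot chart, sorted by frequency, with every dot placed on one common scale. The function names and the toy text are hypothetical and chosen only for this example; this is not the code used to produce the figures in these slides.

```python
from collections import Counter


def word_frequencies(text, min_count=1):
    """Count lower-cased words, keeping those seen at least min_count times."""
    counts = Counter(text.lower().split())
    return {w: c for w, c in counts.items() if c >= min_count}


def text_dotplot(freqs, width=40):
    """Return the lines of a text dot chart, sorted by decreasing frequency.

    Each dot sits at a position proportional to its value on one common
    scale, so values are compared by position rather than by font size,
    angle, or area.
    """
    top = max(freqs.values())
    label_width = max(len(w) for w in freqs)
    lines = []
    # Sort by frequency (descending), then alphabetically to break ties.
    for word, count in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
        pos = round(count / top * width)
        lines.append(f"{word:>{label_width}} |{'.' * pos}o {count}")
    return lines


# Hypothetical toy text, for illustration only.
toy_text = "jobs jobs jobs jobs energy energy tax tax tax people"
freqs = word_frequencies(toy_text)
chart = text_dotplot(freqs, width=20)
print("\n".join(chart))
```

Sorting by frequency is one answer to the "how to select order of categories?" question; an alphabetical order would make lookup easier but comparison harder.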
Gapminder

Gapminder. (https://www.gapminder.org)
• We will use data from Gapminder to reason through the process of data visualization, e.g., population, population density, life expectancy, income for each country.
• Note that in this case we have a census, i.e., there is no sampling involved.³
• Gapminder is a Swedish foundation co-created in 2005 by Hans Rosling (Professor of International Health at Karolinska Institute) and family members.
• “Gapminder is a fact tank, not a think tank.”
  “Gapminder measures ignorance about the world.”
  “Gapminder makes global data easy to use and understand.”
  “Gapminder promotes Factfulness, a new way of thinking.”

Gapminder

• Gapminder developed Trendalyzer (acquired by Google in 2007), data visualization software providing dynamic and interactive graphics of data compiled by organizations such as the United Nations and the World Bank.

³ Some of the data could be estimates, but we won’t concern ourselves with this at this point.

Gapminder
[Figure: Gapminder World 2015 poster, "Health & Income of Nations in 2015". This graph compares Life Expectancy & GDP per capita for all 182 nations recognized by the UN. x-axis: GDP per capita ($ adjusted for price differences, PPP 2011), log scale from $1 000 to $128 000, grouped into income levels 1 to 4; y-axis: Life expectancy (years), 50 to 85; color by region; size by population. www.gapminder.org, version 15.]

DATA SOURCES. INCOME: World Bank’s GDP per capita, PPP (2011 international $). Income of Syria & Cuba are Gapminder estimates. X-axis uses log-scale to make a doubling income show same distance on all levels. POPULATION: Data from UN Population Division. LIFE EXPECTANCY: IHME GBD-2015, as of Oct 2016.
ANIMATING GRAPH: Go to www.gapminder.org/tools to see how this graph changed historically and compare 500 other indicators. LICENSE: Our charts are freely available under Creative Commons Attribution License. Please copy, share, modify, integrate and even sell them, as long as you mention: ”Based on a free chart from www.gapminder.org”.

Figure 25: Gapminder: World Poster 2015. “How Does Income Relate to Life Expectancy? Short answer - Rich people live longer.” Bubble chart with four variables displayed in 2D.

Software

• Most of the plots below are produced with Python’s seaborn library, using default arguments.
• Default settings typically do not correspond to the most basic version of the plot, but rather impose many decisions on the plot, e.g., color, legend, ordering. Experiment with different settings to make sure you get the plot you want.
• Seaborn tutorial: https://seaborn.pydata.org/tutorial.html. Each function has many arguments to customize the plots. As usual, consult documentation.
• Datasets available at: https://github.com/mwaskom/seaborn-data. E.g., Titanic survival dataset, Fisher’s iris dataset.

One Quantitative Variable

How would you visualize life expectancy in 2018 over all countries?

count   182.000000
mean     72.726374
std       7.237996
min      51.100000
25%      67.150000
50%      74.100000
75%      78.075000
max      84.200000

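A numerical summary like the one above can be reproduced with the Python standard library. The sketch below is only an illustration: the `describe` helper and the toy values are hypothetical (they are not the Gapminder data), and the quartile convention (linear interpolation between order statistics) is an assumption about how the table was computed.

```python
import statistics


def describe(values):
    """Return count, mean, std, min, quartiles, and max of a list of numbers."""
    data = sorted(values)
    # method="inclusive" interpolates linearly between order statistics.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "std": statistics.stdev(data),  # sample standard deviation (n - 1)
        "min": data[0],
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": data[-1],
    }


# Hypothetical life expectancies (years), for illustration only.
toy = [51.1, 60.0, 67.2, 72.0, 74.1, 78.0, 80.5, 82.1, 84.2]
summary = describe(toy)
for name, value in summary.items():
    print(f"{name:>5} {value:10.3f}")
```

Eight numbers say little about the shape of the distribution (skewness, gaps, modes), which is what the plots on the following slides are meant to reveal.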
Stem-And-Leaf Plots

Key: aggr|stem|leaf
182 | 84 | 0 = 84.0 x 1.0 = 84.0

  aggr  stem  leaf
   182    84  02
   180    83  25
   178    82  1244446669
   168    81  11223333458889
   154    80  0125778
   147    79  13446
   142    78  00122367
   134    77  0223466677899
   121    76  0125678899
   111    75  1223355578999
    98    74  011238899
    89    73  123448
    83    72  0002334456
    73    71  115569
    67    70  3555679
    60    69  138
    57    68  0002378
    50    67  113389
    44    66  114689
    38    65  0245788
    31    64  356
    28    63  14569
    23    62  24599
    18    61  01112269
    10    60  025
     7    59  57
     5    58  067
     2    57
     2    56
     2    55
     2    54
     2    53
     2    52
     2    51  16

Figure 26: Life expectancy, 2018.

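The construction of such a display can be sketched in a few lines. The helper below is hypothetical and assumes positive values recorded to one decimal place: the stem is the integer part, the leaf is the first decimal digit, and the first column is the number of values at or below the current stem, mirroring the `aggr|stem|leaf` key above.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return rows (cumulative count, stem, leaves), largest stem first.

    Values are assumed to be positive with one decimal place: the stem is
    the integer part and the leaf is the first decimal digit.
    """
    leaves = defaultdict(list)
    for v in values:
        tenths = round(v * 10)            # e.g. 84.2 -> 842
        leaves[tenths // 10].append(tenths % 10)
    rows = []
    remaining = len(values)               # values at or below current stem
    lo, hi = min(leaves), max(leaves)
    for stem in range(hi, lo - 1, -1):    # include empty stems
        digits = "".join(str(d) for d in sorted(leaves.get(stem, [])))
        rows.append((remaining, stem, digits))
        remaining -= len(digits)
    return rows


# Hypothetical values, for illustration only.
toy = [84.0, 84.2, 83.2, 83.5, 81.1, 80.0]
rows = stem_and_leaf(toy)
for count, stem, digits in rows:
    print(f"{count:>4} {stem:>3}  {digits}")
```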
Stripplots

[Figure: two stripplots, side by side. y-axis: life expectancy (50 to 85).]

Figure 27: Life expectancy, 2018. Right: Jittering, i.e., adding random noise, to avoid overplotting.

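Jittering can be sketched as follows: each point keeps its value on the measurement axis and receives a small random offset on the other axis, so tied values no longer sit on top of each other. The `jitter` helper, the noise width, and the toy values are assumptions made for this example only.

```python
import random


def jitter(n, width=0.1, seed=0):
    """Return n horizontal offsets drawn uniformly from [-width, width]."""
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    return [rng.uniform(-width, width) for _ in range(n)]


# Hypothetical life expectancies with ties, which would overplot
# if all points were drawn at the same horizontal position.
values = [72.0, 72.0, 72.0, 74.1, 74.1, 80.5]
offsets = jitter(len(values), width=0.2)
points = list(zip(offsets, values))  # (x, y) pairs; y is left untouched
print(points)
```

The noise is purely cosmetic: only the axis that carries no information is perturbed, so the values read off the plot are unchanged.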
Histograms

[Figure: histogram. x-axis: life expectancy (50 to 85); y-axis ticks: 0 to 35.]

Figure 28: Life expectancy, 2018.

Histograms

[Figure: overlaid histograms. x-axis: life expectancy (50 to 85); y-axis ticks: 0 to 140. Legend: bins=default, 2, 20.]

Figure 29: Life expectancy, 2018. Different numbers of bins.

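The effect of the number of bins can be checked by hand. The sketch below counts values in equal-width bins for two choices of bin number; `histogram_counts` and the toy values are hypothetical, and the binning rule (half-open bins, with the maximum placed in the last bin) is one common convention, not necessarily the one used for the figure.

```python
def histogram_counts(values, bins):
    """Count values in `bins` equal-width bins spanning [min, max].

    Bins are half-open [left, right), except the last, which also
    includes its right edge. Assumes max(values) > min(values).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        i = int((v - lo) / width)
        counts[min(i, bins - 1)] += 1   # put the maximum in the last bin
    return counts


# Hypothetical values, for illustration only.
toy = [51, 55, 60, 61, 62, 70, 71, 72, 73, 74, 80, 84]
coarse = histogram_counts(toy, bins=2)
fine = histogram_counts(toy, bins=11)
print(coarse)
print(fine)
```

With two bins the display hides almost everything; with many bins it shows gaps that may be noise. Neither choice is wrong in itself, which is why the bin width deserves explicit attention.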
Density Plots

[Figure: density plot. x-axis: life expectancy (50 to 90); y-axis ticks: 0.00 to 0.05.]

Figure 30: Life expectancy, 2018.

Density Plots

[Figure: overlaid density plots. x-axis ticks: 0 to 120; y-axis ticks: 0.00 to 0.07. Legend: bw=default, 0.5, 2, 16.]

Figure 31: Life expectancy, 2018. Different bandwidths.

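A kernel density estimate averages one smooth bump per observation, and the bandwidth sets the width of each bump. The sketch below uses Gaussian kernels with the bandwidth taken as the kernel standard deviation; this is an assumption for illustration and may not match how seaborn scales its own bandwidth argument. The helper name and the toy values are hypothetical.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function x -> kernel density estimate with Gaussian kernels."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data
        )

    return density


# Hypothetical life expectancies, for illustration only.
toy = [55.0, 60.0, 62.0, 70.0, 72.0, 73.0, 74.0, 80.0, 81.0]
narrow = gaussian_kde(toy, bandwidth=0.5)   # spiky: one bump per point
wide = gaussian_kde(toy, bandwidth=16.0)    # oversmoothed: one broad hump

# Riemann sums on a wide grid: both estimates should integrate to about 1.
step = 0.1
grid = [i * step for i in range(-1000, 2500)]
area_narrow = sum(narrow(x) for x in grid) * step
area_wide = sum(wide(x) for x in grid) * step
print(area_narrow, area_wide)
```

Because the area is fixed at 1, a larger bandwidth necessarily lowers and widens the curve, and it also pushes mass outside the range of the data, which is consistent with the wider x-axis in the figure above.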
Boxplots

[Figure: boxplot. y-axis: life expectancy (50 to 85).]

Figure 32: Life expectancy, 2018.

One Quantitative Variable and One Qualitative
Variable
How would you visually compare life expectancy between regions?

Stripplots

[Figure: side-by-side stripplots. y-axis: life expectancy (50 to 85); x-axis: six_regions (south_asia, europe_central_asia, middle_east_north_africa, sub_saharan_africa, america, east_asia_pacific).]

Figure 33: Life expectancy by region, 2018.

Histograms

[Figure: overlaid histograms, one per region. x-axis: life expectancy (50 to 85); y-axis ticks: 0.0 to 20.0. Legend: middle_east_north_africa, america, east_asia_pacific, sub_saharan_africa, europe_central_asia, south_asia.]

Figure 34: Life expectancy by region, 2018.

Density Plots

[Figure: overlaid density plots, one per region. x-axis ticks: 50 to 90; y-axis ticks: 0.00 to 0.10. Legend: middle_east_north_africa, america, east_asia_pacific, sub_saharan_africa, europe_central_asia, south_asia.]

Figure 35: Life expectancy by region, 2018.

Boxplots

[Figure: side-by-side boxplots. y-axis: life expectancy (50 to 85); x-axis: six_regions.]

Figure 36: Life expectancy by region, 2018.

Violin Plots

[Figure: side-by-side violin plots. y-axis: life expectancy (50 to 90); x-axis: six_regions.]

Figure 37: Life expectancy by region, 2018.

Log-Transformation

[Figure: two panels of income by region. x-axis: six_regions. y-axis: income, linear scale from 0 to 120000 (left) and log scale from 10^3 to 10^5 (right).]

Figure 38: Income, 2018. Left: Income (GDP/capita, inflation-adjusted $). Right: Log-transformed income.

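A small numerical check shows what the log-transformation does to a skewed variable such as income. The values below are the x-axis tick marks of the Gapminder poster ($1 000 to $128 000, each double the previous one); the use of base 10 is an assumption that matches the powers of 10 on the right-hand axis.

```python
import math

# Income levels that double at each step (tick marks of the Gapminder poster).
incomes = [1_000, 2_000, 4_000, 8_000, 16_000, 32_000, 64_000, 128_000]
log_incomes = [math.log10(x) for x in incomes]

# Gaps between consecutive values on each scale.
raw_steps = [b - a for a, b in zip(incomes, incomes[1:])]
log_steps = [b - a for a, b in zip(log_incomes, log_incomes[1:])]

print(raw_steps)                          # gaps double each time
print([round(s, 4) for s in log_steps])   # every gap equals log10(2)
```

On the raw scale most countries are squeezed near the bottom of the axis; on the log scale equal ratios occupy equal distances, so differences among lower-income countries become visible.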
Time Series

How did life expectancy vary between 1800 and 2018?

Time Series

[Figure: line plot, one line per country. x-axis: year (1800 to 2000); y-axis: life expectancy. Legend: United States, Russia, China, Syria, Cambodia.]

Figure 39: Life expectancy over time for five countries.

Time Series

[Figure: time series plot. x-axis: year (1800 to 2000); y-axis: life expectancy (20 to 80).]

Figure 40: Life expectancy over time.

One Quantitative Variable: Summary

Displaying and comparing marginal distributions for quantitative data.
• Stem-and-leaf plots.
  - Simple pen-and-paper method for visualizing the full distribution of a handful of values.
  - Not amenable to comparisons between distributions.
  - No reason to use these days.
• Stripcharts/Stripplots. (Sometimes referred to as dotcharts/dotplots, related to rug plots.)
  - Effective for visualizing the full distribution of a moderate number of values.
  - Can use side-by-side stripplots to compare multiple distributions.
• Histograms.
  - Classical method for displaying a single distribution.

One Quantitative Variable: Summary

  - Sensitive to bin width and bin boundaries.
  - Cannot easily display and compare multiple distributions.
• Density plots.
  - Based on kernel density estimation (cf. smoothing).
  - Sensitive to bandwidth, but methods are available to select the bandwidth.
  - Effective for displaying and comparing multiple distributions.
• Boxplots. (A.k.a. box-and-whiskers plots.)
  - Summarize distribution by only 5 numbers (+ outliers): median, upper and lower quartiles, and whiskers extending to the most extreme observations within 1.5 times the inter-quartile range (IQR) above the upper quartile and below the lower quartile, respectively.
  - Possible loss of information, e.g., multimodality.
  - Effective for displaying and comparing multiple distributions, especially with notches.

One Quantitative Variable: Summary

• Violin plots.
  - Trendy hybrids of boxplots and density plots.
  - Redundant (twice the density plot!), unless different densities are plotted on each side.
  - Same limitations and issues as with boxplots and density plots.
  - Cannot compare densities as readily as with standard density plots.

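The numbers behind a boxplot can be computed directly, which makes clear how little of the distribution the display retains. The sketch below is a hedged illustration: `boxplot_stats` and the toy data are hypothetical, the quartiles use linear interpolation (one of several conventions), and each whisker is drawn to the most extreme observation within 1.5 IQR of the box, with anything beyond flagged as an outlier.

```python
import statistics


def boxplot_stats(values, k=1.5):
    """Return quartiles, whisker ends, and outliers for a boxplot."""
    data = sorted(values)
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < lo_fence or v > hi_fence],
    }


# Hypothetical data with one extreme value, for illustration only.
toy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = boxplot_stats(toy)
print(stats)
```

Two very different samples, e.g., one unimodal and one bimodal, can share these same numbers, which is the loss of information mentioned in the summary.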
Multiple Quantitative Variables

How would you visually examine the relationship between life expectancy and income in 2018 over all countries?

Scatterplots

[Figure: two scatterplots. y-axis: life expectancy (50 to 85). x-axis: income, linear scale from 0 to 120000 (left) and log scale from 10^3 to 10^5 (right).]

Figure 41: Life expectancy vs. income, 2018. Left: Income (GDP/capita, inflation-adjusted $). Right: Log-transformed income.

Scatterplots

[Figure: scatterplot. y-axis: life expectancy (50 to 85); x-axis: income, log scale from 10^3 to 10^5. Legend (six_regions): south_asia, europe_central_asia, middle_east_north_africa, sub_saharan_africa, america, east_asia_pacific.]

Figure 42: Life expectancy vs. income, colored by region, 2018.

Bubble Charts

[Figure: bubble chart. y-axis: life expectancy (50 to 85); x-axis: income, log scale from 10^3 to 10^5. Color legend (four_regions): asia, europe, africa, americas. Size legend (population): 0, 500000000, 1000000000, 1500000000.]

Figure 43: Life expectancy vs. income, colored by region and with area of bubbles representing population, 2018.

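When the area of a bubble encodes a variable, the radius must scale with the square root of that variable; scaling the radius linearly would exaggerate large values. The sketch below illustrates this with a hypothetical helper, an arbitrary maximum radius, and two population values chosen only for the arithmetic.

```python
import math


def bubble_radius(population, max_population, max_radius=20.0):
    """Radius such that bubble area is proportional to population."""
    return max_radius * math.sqrt(population / max_population)


# Hypothetical populations: the second is four times the first.
small, large = 375_000_000, 1_500_000_000
r_small = bubble_radius(small, max_population=large)
r_large = bubble_radius(large, max_population=large)

area_ratio = (math.pi * r_small ** 2) / (math.pi * r_large ** 2)
print(r_small, r_large, area_ratio)
```

A quarter of the population gives half the radius and a quarter of the area. Even with correct scaling, areas are harder to compare than positions, so bubble size is best reserved for a secondary variable.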
Mean-Difference Plots

[Figure: two panels. Left: scatterplot of life expectancy in 2018 (y-axis) vs. 1998 (x-axis), both 40 to 90. Right: mean-difference plot; x-axis: Means (50 to 80); y-axis: Difference (0 to 15); horizontal lines at mean diff: 5.87, +1.96 SD: 12.95, and -1.96 SD: -1.21.]

Figure 44: Life expectancy, 2018 vs. 1998.

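The quantities drawn in a mean-difference plot are simple to compute: for each pair, the mean goes on the x-axis and the difference on the y-axis, with reference lines at the mean difference and at 1.96 standard deviations on either side. The sketch below is an illustration only: the helper name and toy values are hypothetical, and both the direction of the difference (later minus earlier) and the use of the sample standard deviation are assumptions.

```python
import statistics


def mean_difference(x, y):
    """Return per-pair means and differences (y - x), plus reference lines."""
    means = [(a + b) / 2 for a, b in zip(x, y)]
    diffs = [b - a for a, b in zip(x, y)]
    center = statistics.mean(diffs)
    sd = statistics.stdev(diffs)          # sample standard deviation
    limits = (center - 1.96 * sd, center + 1.96 * sd)
    return means, diffs, center, limits


# Hypothetical life expectancies for four countries, for illustration only.
le_1998 = [50.0, 60.0, 70.0, 78.0]
le_2018 = [60.0, 66.0, 74.0, 80.0]
means, diffs, center, limits = mean_difference(le_1998, le_2018)
print(means, diffs, center, limits)
```

Compared with plotting one year against the other, this rotation puts the quantity of interest (the change) on its own axis, so a trend in the differences, such as larger gains where the mean is lower, is easier to see.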
Scatterplot Matrices

[Figure: scatterplot matrix of four variables: life expectancy, log pop, log pop dens, log inc. Legend (six_regions): south_asia, europe_central_asia, middle_east_north_africa, sub_saharan_africa, america, east_asia_pacific.]

Figure 45: Life expectancy, population, population density, and income, by region, 2018.

Scatterplot Matrices

Figure 46: RANDU RNG. Triples of successive numbers.
[Scatterplot matrix of x, y, and z, each on the range 0 to 1.]
3D Scatterplots

Figure 47: RANDU RNG. Triples of successive numbers.
[3D scatterplot with axes x, y, and z.]
RANDU RNG

RANDU random number generator. (R. Ihaka, https://www.stat.auckland.ac.nz/~ihaka/120/Lectures/lecture27.pdf.)

• The dataset consists of 400 triples of successive numbers produced by the RANDU random number generator (RNG).
• The consecutive triples produced by RANDU are constrained to lie on a series of parallel planes which cut through the unit cube.
• The planes are not aligned with the sides of the unit cube and so do not show up in any of the panels of a scatterplot matrix.
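The plane structure can be checked numerically. The sketch below assumes the standard RANDU recurrence x_{k+1} = 65539 x_k mod 2^31 (not stated on the slide) and an arbitrary odd seed; it regenerates 400 triples rather than loading the slide's dataset. Because 65539 = 2^16 + 3, successive values satisfy x_{k+2} = 6 x_{k+1} - 9 x_k (mod 2^31), so 9x - 6y + z is an exact multiple of 2^31 for every triple (x, y, z).

```python
M = 2 ** 31


def randu(seed, n):
    """Return n successive RANDU integers: x_{k+1} = 65539 * x_k mod 2^31."""
    x = seed
    out = []
    for _ in range(n):
        x = (65539 * x) % M
        out.append(x)
    return out


xs = randu(seed=1, n=1200)  # the seed is an assumption; it should be odd
# 400 non-overlapping triples of successive numbers.
triples = [(xs[i], xs[i + 1], xs[i + 2]) for i in range(0, 1200, 3)]

# Every triple lies on a plane 9x - 6y + z = k * 2^31 for some integer k.
on_planes = all((9 * x - 6 * y + z) % M == 0 for x, y, z in triples)
plane_index = sorted({(9 * x - 6 * y + z) // M for x, y, z in triples})

print("all triples on planes:", on_planes)
print("number of distinct planes hit:", len(plane_index))
```

After rescaling to the unit cube (dividing by 2^31), the planes are 9u - 6v + w = k. Since each coordinate lies strictly between 0 and 1, k can only take the values -5, ..., 9, i.e., at most 15 planes. Their normal direction (9, -6, 1) is oblique to all three axes, which is consistent with the pairwise panels looking unstructured.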
Overplotting

Figure 48: Simulated data, n = 60,000: Scatterplot.
[Scatterplot of y vs. x.]
Overplotting: Hexagonal Binning

Figure 49: Simulated data, n = 60,000: Hexagonal binning.
[Hexagonal binning plot of y vs. x.]
Overplotting: Scatterplot Smoothing

Figure 50: Simulated data, n = 60,000: Scatterplot smoothing.
[Smoothed scatterplot of y vs. x.]
Multiple Quantitative Variables: Summary

Displaying joint distributions for quantitative data.

• While density plots and boxplots are useful for comparing two or more marginal distributions (e.g., in terms of location and scale), they do not provide any information about joint distributions and, in particular, associations between two variables.
• Scatterplots and scatterplot matrices.
  ▸ Useful for examining linear association between two variables.
  ▸ Can extend beyond two variables by using color and plotting symbol area, as in bubble charts.
  ▸ However, can miss important higher-dimensional patterns (cf. RANDU example).
• Mean-difference plots.
  ▸ Rotated and scaled version of scatterplot.
  ▸ Better for looking at differences vs. associations.
• Bubble charts. A bubble chart is a type of scatterplot that displays one or two extra dimensions using area and color.
• Parallel coordinates plots.
  ▸ Natural for visualizing time series data, i.e., the same variable measured across time. Cf. train schedules.
  ▸ Can also be used for visualizing the relationship between multiple variables, but trickier: each line corresponds to an observation and each axis to a variable.
  ▸ Three important considerations that can affect the interpretation of the plot: the order, the rotation, and the scaling of the axes.
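The "rotated and scaled" relationship between a scatterplot and a mean-difference plot can be made concrete: each pair (x, y) is mapped to its mean (x + y)/2 and its difference y - x. The sketch below uses hypothetical paired values. The reference lines at the mean difference plus and minus 1.96 standard deviations follow the labels in Figure 44; using the sample standard deviation is an assumption.

```python
import statistics


def mean_difference(x, y):
    """Map paired data (x, y) to (mean, difference) coordinates."""
    means = [(a + b) / 2 for a, b in zip(x, y)]
    diffs = [b - a for a, b in zip(x, y)]
    return means, diffs


# Hypothetical paired measurements (e.g., the same units at two time points).
before = [50.0, 60.0, 70.0, 80.0]
after = [58.0, 66.0, 74.0, 82.0]

means, diffs = mean_difference(before, after)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample SD (assumption)
limits = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)

print("means:", means)
print("differences:", diffs)
print(f"mean difference: {mean_diff:.2f}")
print(f"reference lines: {limits[0]:.2f} and {limits[1]:.2f}")
```

Plotting `diffs` against `means` puts the line y = x of the original scatterplot on the horizontal axis, so departures from equality are read as vertical positions rather than as distances from a diagonal.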
Overplotting

Overplotting issues can be reduced by the following approaches.

• Changing plotting symbol.
• Jittering, i.e., adding random noise.
• Smoothing.
• Hexagonal binning.
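As an illustration of the last approach, the sketch below counts points per hexagonal cell using only the standard library. It assigns each point to the nearer of two staggered rectangular lattices of cell centres, which yields regular hexagonal cells. The function name `hexbin_counts`, the cell size, and the simulated Gaussian data are assumptions for illustration; this is not the code or data behind the figures.

```python
import math
import random
from collections import Counter


def hexbin_counts(xs, ys, size=1.0):
    """Count points per hexagonal cell, keyed by the cell centre (cx, cy)."""
    sx = size                  # spacing between centres within a row
    sy = size * math.sqrt(3)   # spacing between rows of the same lattice
    counts = Counter()
    for x, y in zip(xs, ys):
        u, v = x / sx, y / sy
        # Nearest centre on lattice 1 (integer grid in scaled coordinates).
        i1, j1 = round(u), round(v)
        # Nearest centre on lattice 2 (grid shifted by half a step each way).
        i2, j2 = math.floor(u), math.floor(v)
        # Squared distances in original units, up to the common factor size^2.
        d1 = (u - i1) ** 2 + 3 * (v - j1) ** 2
        d2 = (u - i2 - 0.5) ** 2 + 3 * (v - j2 - 0.5) ** 2
        if d1 <= d2:
            centre = (i1 * sx, j1 * sy)
        else:
            centre = ((i2 + 0.5) * sx, (j2 + 0.5) * sy)
        counts[centre] += 1
    return counts


random.seed(0)
n = 60_000
xs = [random.gauss(12, 2) for _ in range(n)]
ys = [random.gauss(12, 2) for _ in range(n)]

counts = hexbin_counts(xs, ys, size=0.5)
print(len(counts), "non-empty hexagons summarize", n, "points")
```

Each hexagon is then drawn once, with color (or size) encoding its count, so the number of plotted marks depends on the grid resolution rather than on n.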
Qualitative Variables

How would you visualize the 2017 UK election results? Number of seats for each of 13 parties.

     Party     MPs
 0   CON       318
 1   LAB       261
 2   SNP        35
 3   LIB DEM    12
 4   DUP        10
 5   SF          7
 6   PC          4
 7   GREEN       1
 8   IND         1
 9   OTHER       1
10   UKIP        0
11   SDLP        0
12   UUP         0
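Before choosing a display, it can help to turn the counts into shares of the total. The short sketch below uses the seat counts from the table; the variable names are illustrative.

```python
# Seat counts copied from the table above.
seats = {
    "CON": 318, "LAB": 261, "SNP": 35, "LIB DEM": 12, "DUP": 10,
    "SF": 7, "PC": 4, "GREEN": 1, "IND": 1, "OTHER": 1,
    "UKIP": 0, "SDLP": 0, "UUP": 0,
}

total = sum(seats.values())
shares = {party: 100 * n / total for party, n in seats.items()}

print(f"{'Party':<8}{'MPs':>4}{'Share':>8}")
for party, pct in shares.items():
    print(f"{party:<8}{seats[party]:>4}{pct:>7.1f}%")
print(f"{'Total':<8}{total:>4}")
```

Two parties account for about 89% of the seats, while six parties hold one seat or none, which is the kind of skew that makes angle- and area-based displays hard to read.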
Pie Charts

Figure 51: UK Election Results 2017. Number of seats for each of 13 parties.
[Pie chart of seat shares by party, with percentage labels ranging from 48.9% (CON) and 40.2% (LAB) down to 0.0%; the labels of the smallest slices overlap.]
Barplots

Figure 52: UK Election Results 2017. Number of seats for each of 13 parties.
[Barplot: MPs (y-axis, 0 to 300) by Party (x-axis).]
Dotplots

Figure 53: UK Election Results 2017. Number of seats for each of 13 parties.
[Dotplot: Party (y-axis) vs. MPs (x-axis, 0 to 300).]
Lollipop Plots

Figure 54: UK Election Results 2017. Number of seats for each of 13 parties.
[Lollipop plot: number of seats (y-axis, 0 to 300) by party (x-axis).]
One Qualitative Variable: Summary

• Pie charts.
  ▸ Frequency represented by angle/area.
  ▸ Angles and areas are hard to perceive and compare.
  ▸ Pie charts quickly become unreadable for more than a handful of values.
  ▸ Listing the values is often better; they are in fact often added to a pie chart anyway!
  ▸ How to select order of categories?
  ▸ Not amenable to comparing distributions; side-by-side comparisons not effective.
  ▸ Hard to extend to multiple variables.
  ▸ A lot of junk often added to pie charts, e.g., thickness, slice explosion.
• Wordclouds/tag clouds.
  ▸ Frequency represented by font size.
  ▸ Neither area nor height corresponds to frequency of words.
  ▸ How do longer words compare with shorter words?
  ▸ How are capital letters handled?
  ▸ How to calculate relative difference in frequency between two words?
  ▸ How are the words ordered within the cloud (alphabetical, frequency)?
  ▸ Not amenable to comparing distributions; side-by-side comparisons not effective.
  ▸ How to extend to multiple variables?
  ▸ A lot of junk often added to word clouds.
• Barcharts/barplots.
  ▸ Based on length and position on common aligned scale.
  ▸ Add an irrelevant dimension (thickness of bar).
  ▸ How to select order of categories?
  ▸ Not readily amenable to comparisons.
  ▸ Extension to multiple variables problematic.
• Dotcharts/dotplots. (And interval charts.)
  ▸ Based on length and position on common aligned scale.
  ▸ Display only the relevant information.
  ▸ How to select order of categories?
  ▸ More amenable to comparisons and extensions to multiple variables.
• Lollipop plots.
  ▸ Similar to dotcharts/dotplots (with added stem) and barcharts/barplots.
  ▸ Stem is redundant.
  ▸ How to select order of categories?
  ▸ Not readily amenable to comparisons.
  ▸ Extension to multiple variables problematic.
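One common answer to the recurring question of category order is to sort by frequency, so that position along the common scale also conveys rank. The sketch below builds a plain-text dotplot that way; the function name `text_dotplot`, the `width` parameter, and the choice of ordering are illustrative assumptions, and the counts are a subset of the 2017 UK seat counts shown earlier.

```python
def text_dotplot(counts, width=40):
    """Return dotplot lines, one per category, sorted by decreasing count."""
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top = max(counts.values())
    label_width = max(len(name) for name in counts)
    lines = []
    for name, n in ordered:
        # Position of the dot along a common scale from 0 to the largest count.
        pos = round(width * n / top) if top else 0
        lines.append(f"{name:>{label_width}} |{' ' * pos}o  {n}")
    return lines


# Deliberately unsorted input.
seats = {"SNP": 35, "CON": 318, "LIB DEM": 12, "LAB": 261, "DUP": 10}
lines = text_dotplot(seats)
print("\n".join(lines))
```

Each category contributes a single mark whose position is read against one shared axis, with no bar thickness or stem to draw.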
Multiple Qualitative Variables

How would you display survival data on the Titanic according to class, gender, and age?
Barplots

Figure 55: Titanic: Survival by class.
[Two barplots of counts by class (First, Second, Third), with bars distinguished by survival (No, Yes).]
Dotplots

Figure 56: Titanic: Survival by class.
[Dotplot of survival frequency per class (x-axis) for First, Second, and Third class.]
Dotplots

Figure 57: Titanic: Survival by gender/age.
[Dotplot of survival frequency per gender/age (x-axis) for child, man, and woman.]
Dotplots

Figure 58: Titanic: Survival by class and gender/age.
[Dotplot of survival frequency per class and gender/age (x-axis), showing First, Second, and Third class within each of child, man, and woman.]
Mosaic Plots

Figure 59: Titanic: Survival and class.
[Mosaic plot of class (First, Second, Third) by alive (no, yes).]
Mosaic Plots

Figure 60: Titanic: Survival, class, and gender/age.
[Mosaic plot of class (First, Second, Third) by alive (no, yes) and gender/age (child, man, woman).]
Multiple Qualitative Variables: Summary

The following types of plots are used to represent conditional distributions for multiple categorical variables or counts for hierarchical categories.

• Multilevel donut/pie/sunburst plots.
  ▸ Same or worse perception issues as with univariate pie charts.
  ▸ Which variable to choose for “outer” layer?
• Barcharts/barplots.
  ▸ For two categorical variables, a barchart/barplot displays the counts (or percentages) for each category of the second variable within each category of the first variable, i.e., the conditional distribution of the second variable given the first.
  ▸ Which variable to choose as “first”?
  ▸ In a side-by-side barplot, the frequencies for the second variable are displayed as juxtaposed bars.
  ▸ In a stacked/segmented barplot, the bars for the second variable are stacked, so that their total height is the total count for the category of the first variable or 100 percent.
  ▸ Hard to compare frequencies between categories of first variable with both types of barplots.
  ▸ Hard to compare frequencies of second variable within categories of first variable with stacked barplot.
  ▸ Circular barcharts/barplots: Eye-catching, but even harder to compare frequencies.
• Treemap. The hierarchical or conditional frequencies are represented using nested figures, usually rectangles.
• Mosaic plots.
  ▸ A mosaic plot is a graphical display of the counts in a contingency table (a.k.a. cross-tabulation or crosstab), where each cell is represented by a tile (i.e., rectangle) whose area is proportional to the cell frequency.
  ▸ Color and shading of the tiles can be used to represent unusually large or small counts, or the sign and magnitude of residuals (deviations) for particular models (e.g., independence).
  ▸ For two categorical variables, the width of each tile is proportional to the marginal frequency of the category for the first variable and the height of the tile to the conditional frequency of the category for the second variable given the first.
  ▸ Can be hard to read mosaic plots for more than two variables.
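The width and height rule for a two-variable mosaic plot can be sketched directly from a contingency table. In the code below, the function name `mosaic_tiles` and the counts are made up for illustration (they are not the Titanic counts behind the figures). The product of width and height recovers the cell's share of the grand total, which is why tile area is proportional to cell frequency.

```python
def mosaic_tiles(table):
    """Return {(first, second): (width, height)} for a two-way table of counts.

    width  = marginal frequency of the first-variable category
    height = conditional frequency of the second-variable category given the first
    """
    grand_total = sum(sum(cells.values()) for cells in table.values())
    tiles = {}
    for first, cells in table.items():
        first_total = sum(cells.values())
        width = first_total / grand_total
        for second, n in cells.items():
            height = n / first_total if first_total else 0.0
            tiles[(first, second)] = (width, height)
    return tiles


# Made-up counts: first variable = class, second variable = survival.
table = {
    "First": {"no": 40, "yes": 60},
    "Second": {"no": 50, "yes": 50},
    "Third": {"no": 150, "yes": 50},
}

tiles = mosaic_tiles(table)
for (first, second), (w, h) in tiles.items():
    print(f"{first:<7}{second:<4} width={w:.3f} height={h:.3f} area={w * h:.3f}")
```

The same heights, drawn with equal widths, give a stacked barplot scaled to 100 percent; the mosaic plot adds the marginal distribution of the first variable through the widths.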
Conditional Plots

Conditional plots/coplots/faceting/panels/small multiples.

• Collection of plots, where each plot represents the conditional distribution of one or more variables given a conditioning variable.
• Each plot corresponds to a value or set of values for the conditioning variable. For a quantitative conditioning variable, the ranges are typically chosen so that there are equal numbers of observations in each panel.
• The scales on the axes have to be the same for all panels.
• The colors (and legends) also have to be the same for all panels.
• E.g., scatterplots of life expectancy vs. income for each of the six world regions.
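Choosing ranges with equal numbers of observations amounts to cutting the sorted values of the conditioning variable by rank. The sketch below assumes non-overlapping ranges (some implementations use overlapping ones); the function name `equal_count_panels` and the income values are hypothetical.

```python
def equal_count_panels(values, n_panels):
    """Split values into n_panels groups of (nearly) equal size by rank.

    Returns a list of (low, high, members) tuples, one per panel.
    """
    if n_panels < 1 or n_panels > len(values):
        raise ValueError("n_panels must be between 1 and len(values)")
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n_panels)
    panels, start = [], 0
    for k in range(n_panels):
        # The first `extra` panels take one additional observation.
        size = base + (1 if k < extra else 0)
        members = ordered[start:start + size]
        panels.append((members[0], members[-1], members))
        start += size
    return panels


# Hypothetical incomes for ten countries.
incomes = [700, 1200, 1500, 3000, 4200, 8000, 9500, 15000, 40000, 90000]
panels = equal_count_panels(incomes, 3)
for low, high, members in panels:
    print(f"income in [{low}, {high}]: {len(members)} observations")
```

For a skewed variable such as income, equal-count ranges have very unequal widths, which keeps each panel populated; equal-width ranges would leave most observations in the first panel.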
Conditional Plots

Figure 61: Life expectancy by region, 2018.
[Plot of life.expectancy (y-axis, 50 to 85) for each of the six regions: america, east_asia_pacific, europe_central_asia, middle_east_north_africa, south_asia, sub_saharan_africa.]
Conditional Plots

Figure 62: Life expectancy by region conditioning on income, 2018.
[Conditioning plot of life.expectancy (y-axis, 50 to 85) by six_regions (x-axis), with one panel per range of the conditioning variable, income (0 to 120000).]
References

• Peter Aldhous. Data visualization: basic principles. http://paldhous.github.io/ucb/2016/dataviz/week2.html.
• Ross Ihaka. Statistics 120 – Information Visualisation. https://www.stat.auckland.ac.nz/~ihaka/120/.
• Duncan Temple Lang. Data Visualization Workshops. http://dsi.ucdavis.edu/tag/data-visualization.html.

W. S. Cleveland and R. McGill. Graphical perception and graphical methods for analyzing scientific data. Science, 229(4716):828–833, 1985.

A. Gelman, C. Pasarica, and R. Dodhia. Let's practice what we preach: Turning tables into graphs. The American Statistician, 56(2):121–130, 2002.

E. J. Marey. La Méthode Graphique. Librairie de l’Académie de Médecine, 1885.

S. S. Stevens. On the psychophysical law. Psychological Review, 64(3):153–181, 1957.

E. R. Tufte. The Visual Display of Quantitative Information. Graphics Press, 2nd edition, 2001.
