Better Data Visualizations: A Guide for Scholars, Researchers, and Wonks
Jonathan Schwabish
BETTER DATA
VISUALIZATIONS
COLUMBIA UNIVERSITY PRESS, NEW YORK
Columbia University Press
Publishers Since 1893
New York Chichester, West Sussex
cup.columbia.edu
Copyright © 2021 Columbia University Press
All rights reserved
Chapter 11, “Tables,” based on Jonathan A. Schwabish, “Ten Guidelines for Better Tables,”
Journal of Benefit-Cost Analysis 11, no. 2 (2020): 151–178. Reprinted with permission.
CONTENTS
INTRODUCTION
5. TIME
Line Chart
Circular Line Chart
Slope Chart
Sparklines
Bump Chart
Cycle Chart
Area Chart
Stacked Area Chart
Streamgraph
Horizon Chart
Gantt Chart
Flow Charts and Timelines
Connected Scatterplot
Conclusion
6. DISTRIBUTION
Histogram
7. GEOSPATIAL
Choropleth Map
Cartogram
Proportional Symbol and Dot Density Maps
Flow Map
Conclusion
8. RELATIONSHIP
Scatterplot
Parallel Coordinates Plot
Radar Charts
Chord Diagram
Arc Chart
Correlation Matrix
Network Diagrams
Tree Diagrams
Conclusion
9. PART-TO-WHOLE
Pie Charts
Treemap
Sunburst Diagram
Nightingale Chart
Voronoi Diagram
Conclusion
CONCLUSION
Acknowledgments
References
Index
INTRODUCTION
Raise your hand if your approach to creating a graph goes something like this: You analyze
some data. Write up the results. Make a graph and drop it into the report, surrounded
by text. Label it something benign like “Figure 1. Average Earnings, 1990–2020.”
Save it as a PDF. Post it to the world.
It might have taken you months or even years to compile and analyze the data and write
the report. For many, it takes far less time to design the graphs that showcase that data. You
might open a program like Microsoft Excel, paste in the data, click through the drop-down
menu, select one you’ve used dozens or hundreds of times, accept the default formatting,
and paste it into the report.
But at any point in this sequence did you pause to consider what’s most important about
communicating the work? It’s the audience. People will read your report. People will listen to
you discuss your work. And yet many of us spend far too little time thinking about how we can
best present our findings. Instead we use whatever default approach is quickest and easiest.
Why is this? Maybe you don’t believe you have the technical skills or design know-how
to create complex, attractive graphs. Or you worry it’s not worth the effort, because your
managers or tenure committee or whoever else won’t see it as time well spent. Many people
simply think that their reader will just “get it,” as if everyone has seen the content a hundred
times before. But many readers, especially those who can make change or implement policy,
may have never seen this content before. In these cases—which are probably most of them—
thinking carefully about how data is presented is just as important as the data itself.
This book is about how to create better, more effective visualizations of your data. It aims
to expand your graphic literacy and put more graphs in your toolbox. The next time you open
Excel, Tableau, R, or whatever software tool you prefer, you won’t be bound by the graphs
in the drop-down menus or the tutorial manual. This book will guide you to choose the graph
that is the best fit for your data and will most effectively communicate your message.
People often tell me they can’t create some of these different, nonstandard graphs because
their colleague or manager or audience won’t understand them. We are not born knowing
instinctively how to read a bar chart or line chart or pie chart. As Scott Klein, deputy managing
editor at ProPublica, once wrote, “There is no such thing as an innately intuitive graphic.
None of us are born literate in reading visualizations.”
As data visualization creators, we must understand our audience and know when a differ-
ent graph can engage readers—and help them expand their own graphic literacy.
This book has three main parts. Part 1 covers general guidelines to creating effective visu-
alizations. We’ll learn the importance of our audience and how to consider what category
of graph will best meet their needs. No data visualization book will contain every lesson to
create effective graphs, but there are some best practices that can guide your work. As you go
forward creating more visuals and seeing their effect on your audience, you’ll develop your
own aesthetic and learn when to bend or break these guidelines.
Each of these six charts visualizes the same data: the share of people earning minimum
wage or less in each state.
Part 2 is the meat of the book. We will define and discuss more than eighty graphs, cat-
egorized into eight broad categories: Comparisons, Time, Distribution, Geospatial, Rela-
tionship, Part-to-Whole, Qualitative, and Tables. We will see how each graph works and the
advantages and disadvantages of each.
Graphs overlap between these categories—a bar chart, for example, can be used to show
changes over time or comparisons between groups. The categorizations here are based on
a graph’s primary purpose. But even that’s not an objective truth, and your perspective and
situation may differ. I do not discuss every single possible graph—there are many specialized
graphs in fields like architecture, biology, and engineering that are excluded here. Instead,
these chapters cover the most common and flexible graphs that can showcase the sorts of
data most people will need to display.
I tie these chapters together in part 3 with a chapter on building a data visualization style guide
and a chapter on how to pull the different lessons together in a series of graph redesigns. If you’ve
ever written a research paper, or even a book report, you are probably aware of the array of writing
style guides, from the Chicago Manual of Style to the Modern Language Association. These guides
break down writing into component parts and prescribe their proper use. A data visualization style
guide does the same for graphs—defines their parts and how to style and use them. In the final
chapter, we apply the lessons to redesign a series of graphs to improve how they communicate data.
This book will guide you as you explore your data and how it might be visualized. Now
more than ever, content must be visual if it is to travel far. Your clients and colleagues, and
your audiences of policymakers, decisionmakers, and interested readers are inundated with
a flow of information. Visuals cut through that.
Anyone can improve the way they visualize and communicate their data—and you don’t
need a graduate degree in marketing or design or advertising. Take it from me: I started my
career as an economist in the federal government.
I moved to DC in 2005 to join the Congressional Budget Office (CBO). My job was to
help work on the long-term microsimulation model that is used to examine the Social Secu-
rity system and forecast the long-term finances of the federal budget. The spring of 2005 was
an exciting time to work on Social Security—President George W. Bush had made Social
Security a central component of his second term. In his 2005 State of the Union address, he
said, “We must pass reforms that solve the financial problems of Social Security once and for
all.” Reform would stall later that year, but in the course of my first few months on the job,
my group at CBO estimated and analyzed the effects of dozens of policy proposals.
Five years later, I had expanded my work to include issues around policies that affected
disabled workers, immigration, and food stamps (now called the Supplemental Nutrition
Assistance Program or SNAP). In 2010, three of my colleagues were drafting a special report
on policy options for Social Security. In it, they would show the impact of thirty different
options for reform. One of the central figures in the report would show changes in taxes
received by the system, benefits paid out from the system, the balance between the two, and
other measures of fiscal solvency for these thirty options. It looked something like this:
Author’s rendering of early draft of exhibit from the Congressional Budget Office.
You don’t need to be a government economist to know that members of Congress are
unlikely to read something that looks like a spreadsheet. There are too many rows, too many
columns, too many numbers—too much information. It was right then that I first started
thinking about better ways to present this information.
This was the result. We replaced some numbers with small area charts, which give the
reader an immediate visual impression of each option—which ones increased the solvency
of the program and which ones did not.
Final version of that main exhibit in the Congressional Budget Office report on Social
Security. Notice that there is less data and more graphs.
Source: Congressional Budget Office.
The report worked. We received good feedback from colleagues at CBO and other agen-
cies, as well as readers on Capitol Hill and elsewhere, noting how easy it was to read and
digest the graphs. It was maybe the first time I (and perhaps the agency) thought carefully
and strategically about our data visuals. From there, I started reading books on data visual-
ization, design, color theory, and typography.
Working with our editorial department and designers, we began to improve the
graphs in our basic reports and started creating new report and graph types. We made
infographics—what was then a buzzword referring (sometimes derisively) to longer
graphics that combine data, text, images, and more into a single visual. In 2012, we cre-
ated this infographic to accompany and summarize The Long-Term Budget Outlook, a
109-page report.
One-page infographic about the 2012 Long-Term Budget Outlook from the Congressional
Budget Office.
Source: Congressional Budget Office.
That June, CBO’s director sat in front of the U.S. House Budget Committee to relay the
results of our analysis. As the hearing played on a TV out in the hallway, I suddenly heard
yells of, “Jon! Jon! Come out! Your infographic is on TV!”
And, sure enough, Congressman Chris Van Hollen was holding up the infographic on
C-SPAN, covered with scribbles and notes. The visualization had captured and engaged the
attention of one of the busiest people in America, and someone who could do something
about the pressures facing the federal budget. That was the moment I knew that how we
presented our data could matter as much as the data itself.
In 2014, I moved to the Urban Institute, a nonprofit research institution in Washington,
DC, to spend half of my time conducting research and half of my time in the Communica-
tions department, helping colleagues present and visualize their data.
Since that time, I have conducted hundreds of workshops, delivered lectures around the
globe, and published two books on data communication. The world, it seemed, had seen
what I saw—better visual content and better presentations were the currency of research and
policy adoption. The advance of computing power, social media platforms, and the expanding
media landscape made visual content more important, perhaps even necessary.
Maryland Congressman Chris Van Hollen holding up that Long-Term Budget Outlook infographic
in a House Budget Committee hearing.
Source: C-SPAN2.
Today, I work with people in nonprofits, government agencies, private sector companies,
and everything in between to improve how they create their graphs and communicate their
content. I’ve worked with junior economists and analysts dealing with enormous data sets;
health care workers trying to communicate results to patients, families, and hospital admin-
istrators; human resource representatives working with databases of job-seekers; advertisers
and marketing executives selling products to clients; and many more.
I’ve seen hundreds of different kinds of data visualization challenges. The skills to meet
them, unfortunately, are not yet regularly taught in schools or professional development
programs. But these skills can be learned. We can learn how to read chart types we’ve never
seen before, even if they are complex. And we can learn how to communicate our work in
better and more effective ways.
Eventually, I discovered that one of the most important things I can show people is the
incredibly wide array of graphs available to them. And that is precisely the content of this
book, a survey of more than eighty types of data visualizations, from the familiar to the
nonstandard.
But before we get to the library of graph types, we’ll consider some of the science behind
how we process visual information and some best practices and approaches to visualizing data.
PART ONE
PRINCIPLES OF DATA VISUALIZATION
1. VISUAL PROCESSING AND
PERCEPTUAL RANKINGS
Before we start creating our charts and graphs, we need to cover some basic theory of
how the brain perceives visual stimuli. This will guide you as you decide what chart type
is most appropriate to visualize your data.
When we consider how to visualize our data, we must ask ourselves how accurately the
reader can perceive the data values. Are some graphs better equipped to guide the reader to
the specific difference between, say, 2 percent and 2.3 percent? If so, how should we think
about those differences as we create our visualizations?
There’s a thread of research in the data visualization field that explores this very question.
Based on original research over the past thirty years or so, the perceptual ranking diagram
below shows a spectrum of graphs (or more generally, types of data encodings like dots,
lines, and bars) arrayed by how easily readers can estimate their value. The encodings that
readers can most accurately estimate are arranged at the top, and those that enable more
general estimates are at the bottom.
The rankings are unsurprising. It is easier to compare the data in line charts, bar charts,
and area charts that have the same axis or baseline. When the data are positioned
on unaligned axes (think of a pair of bars that are offset from one another on different
axes), the values are slightly harder for us to discern accurately.
Farther down the vertical axis are encodings based on angle, area, volume, and color.
You intuitively know this: it’s much easier to discern the exact data values and differences
between values when reading a bar chart than when reading a map where countries are
shaded with different colors.
Enable accurate estimates
Position along common scales
Position along identical, nonaligned scales
Length
Area
Volume
Shading and saturation
Color hue
May enable general estimates
Perceptual ranking diagram. What kind of data visualization you choose to create will
depend on your goals and your audience’s needs, experiences, and expertise. This image is
based on Alberto Cairo (2016) from research by Cleveland and McGill (1984), Heer,
Bostock, and Ogievetsky (2010), and others.
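One way to keep the ranking in mind while choosing a chart is to write it down as an ordered list and compare candidate encodings against it. The short Python sketch below is illustrative only and is not from the book: the names PERCEPTUAL_RANKING, rank, and more_accurate are hypothetical, the position of "angle" between length and area is an assumption based on the order in which the text mentions it, and a list position is only an ordering, not a measured error rate.

```python
# Encodings ordered from most to least accurately estimated, following the
# perceptual ranking described in this chapter. Placing "angle" between
# "length" and "area" is an assumption taken from the order used in the text.
PERCEPTUAL_RANKING = [
    "position along common scales",
    "position along identical, nonaligned scales",
    "length",
    "angle",
    "area",
    "volume",
    "shading and saturation",
    "color hue",
]


def rank(encoding: str) -> int:
    """Return the 0-based position of an encoding (0 = most accurate)."""
    return PERCEPTUAL_RANKING.index(encoding.lower())


def more_accurate(a: str, b: str) -> str:
    """Return whichever of two encodings sits higher in the ranking."""
    return a if rank(a) <= rank(b) else b


if __name__ == "__main__":
    # A bar chart encodes values as length; a shaded map encodes them as color.
    print(more_accurate("color hue", "length"))
```

A higher position only means readers can usually estimate values more precisely. As the caption notes, the right choice still depends on your goals and your audience.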