Better Data Visualizations: A Guide For

Scholars, Researchers, And Wonks


Jonathan Schwabish
BETTER DATA
VISUALIZATIONS

COLUMBIA UNIVERSITY PRESS, NEW YORK
Columbia University Press
Publishers Since 1893
New York Chichester, West Sussex
cup.columbia.edu
Copyright © 2021 Columbia University Press
All rights reserved

Chapter 11, “Tables,” based on Jonathan A. Schwabish, “Ten Guidelines for Better Tables,”
Journal of Benefit-Cost Analysis 11, no. 2 (2020): 151–178. Reprinted with permission.

Library of Congress Cataloging-in-Publication Data


Names: Schwabish, Jonathan A., author.
Title: Better data visualizations : a guide for scholars, researchers, and wonks /
Jonathan Schwabish.
Description: New York : Columbia University Press, [2021] | Includes bibliographical
references and index.
Identifiers: LCCN 2020017814 (print) | LCCN 2020017815 (ebook) | ISBN 9780231193108
(hardback) | ISBN 9780231193115 (trade paperback) | ISBN 9780231550154 (ebook)
Subjects: LCSH: Information visualization. | Visual analytics.
Classification: LCC QA76.9.I52 S393 2021 (print) | LCC QA76.9.I52 (ebook) |
DDC 001.4/226—dc23
LC record available at https://lccn.loc.gov/2020017814
LC ebook record available at https://lccn.loc.gov/2020017815

Columbia University Press books are printed on permanent and
durable acid-free paper.

Printed in the United States of America


For Aunt Vivi. Our Mendales. With love and Diet Coke.

CONTENTS

INTRODUCTION 1

PART ONE: PRINCIPLES OF DATA VISUALIZATION


1. VISUAL PROCESSING AND PERCEPTUAL RANKINGS 13
Anscombe’s Quartet 20
Gestalt Principles of Visual Perception 22
Preattentive Processing 25

2. FIVE GUIDELINES FOR BETTER DATA VISUALIZATIONS 29


Guideline 1: Show the Data 29
Guideline 2: Reduce the Clutter 31
Guideline 3: Integrate the Graphics and Text 33
Guideline 4: Avoid the Spaghetti Chart 41
Guideline 5: Start with Gray 43

3. FORM AND FUNCTION: LET YOUR AUDIENCE’S NEEDS DRIVE YOUR DATA VISUALIZATION CHOICES 53
Changing How We Interact with Data 61
Let’s Get Started 62

PART TWO: CHART TYPES


4. COMPARING CATEGORIES 67
Bar Charts 68
Paired Bar 84
Stacked Bar 87
Diverging Bar 92
Dot Plot 97
Marimekko and Mosaic Charts 102
Unit, Isotype, and Waffle Charts 106
Heatmap 112
Gauge and Bullet Charts 118
Bubble Comparison and Nested Bubbles 121
Sankey Diagram 126
Waterfall Chart 129
Conclusion 130

5. TIME 133
Line Chart 133
Circular Line Chart 149
Slope Chart 150
Sparklines 152
Bump Chart 153
Cycle Chart 155
Area Chart 157
Stacked Area Chart 159
Streamgraph 162
Horizon Chart 164
Gantt Chart 166
Flow Charts and Timelines 170
Connected Scatterplot 175
Conclusion 177

6. DISTRIBUTION 179
Histogram 179
Pyramid Chart 185


Visualizing Statistical Uncertainty with Charts 187
Box-and-Whisker Plot 196
Candlestick Chart 199
Violin Chart 200
Ridgeline Plot 201
Visualizing Uncertainty by Showing the Data 204
Stem-and-Leaf Plot 214
Conclusion 215

7. GEOSPATIAL 217
Choropleth Map 220
Cartogram 233
Proportional Symbol and Dot Density Maps 243
Flow Map 245
Conclusion 248

8. RELATIONSHIP 249
Scatterplot 249
Parallel Coordinates Plot 263
Radar Charts 267
Chord Diagram 269
Arc Chart 272
Correlation Matrix 275
Network Diagrams 277
Tree Diagrams 284
Conclusion 287

9. PART-TO-WHOLE 289
Pie Charts 289
Treemap 297
Sunburst Diagram 299
Nightingale Chart 300
Voronoi Diagram 304
Conclusion 309

10. QUALITATIVE 311


Icons 311
Word Clouds and Specific Words 312
Word Trees 316
Specific Words 318
Quotes 319
Coloring Phrases 321
Matrices and Lists 324
Conclusion 325

11. TABLES 327


The Ten Guidelines of Better Tables 329
Demonstration: A Basic Data Table Redesign 338
Demonstration: A Regression Table Redesign 341
Conclusion 344

PART THREE: DESIGNING AND REDESIGNING YOUR VISUAL

12. DEVELOPING A DATA VISUALIZATION STYLE GUIDE 349
The Anatomy of a Graph 352
Color Palettes 358
Defining Fonts for the Style Guide 362
Guidance for Specific Graph Types 364
Exporting Images 365
Accessibility, Diversity, and Inclusion 366
Putting it All Together 368

13. REDESIGNS 369


Paired Bar Chart: Acreage for Major Field Crops 369
Stacked Bar Chart: Service Delivery 372
Line Chart: The Social Security Trustees 374
Choropleth Map: Alabama Slavery and Senate Elections 378
Dot Plot: The National School Lunch Program 380
Dot Plot: GDP Growth in the United States 382
Line Chart: Net Government Borrowing 385


Table: Firm Engagement 387
Conclusion 389

CONCLUSION 391

APPENDIX 1: DATA VISUALIZATION TOOLS 397

APPENDIX 2: FURTHER READING AND RESOURCES 403


General Data Visualization Books 403
Historical Data Visualization Books 405
Books on Data Visualization Tools 405
Data Visualization Libraries 406
Where to Practice 407

Acknowledgments 409
References 413
Index 431
INTRODUCTION 
Raise your hand if your approach to creating a graph goes something like this: You analyze
some data. Write up the results. Make a graph and drop it into the report, surrounded
by text. Label it something benign like “Figure 1. Average Earnings, 1990–2020.”
Save it as a PDF. Post it to the world.
It might have taken you months or even years to compile and analyze the data and write
the report. For many, it takes far less time to design the graphs that showcase that data. You
might open a program like Microsoft Excel, paste in the data, click through the drop-down
menu, select a chart type you’ve used dozens or hundreds of times, accept the default formatting,
and paste it into the report.
But at any point in this sequence did you pause to consider what’s most important about
communicating the work? It’s the audience. People will read your report. People will listen to
you discuss your work. And yet many of us spend far too little time thinking about how we can
best present our findings. Instead we use whatever default approach is quickest and easiest.
Why is this? Maybe you don’t believe you have the technical skills or design know-how
to create complex, attractive graphs. Or you worry it’s not worth the effort, because your
managers or tenure committee or whoever else won’t see it as time well spent. Many people
simply think that their reader will just “get it,” as if everyone has seen the content a hundred
times before. But many readers, especially those who can make change or implement policy,
may have never seen this content before. In these cases—which are probably most of them—
thinking carefully about how data is presented is just as important as the data itself.
This book is about how to create better, more effective visualizations of your data. It aims
to expand your graphic literacy and put more graphs in your toolbox. The next time you open
Excel, Tableau, R, or whatever your software tool of choice, you won’t be bound by the graphs
in the drop-down menus or the tutorial manual. This book will guide you to choose the graph
that is the best fit for your data and will most effectively communicate your message.
People often tell me they can’t create some of these different, nonstandard graphs because
their colleague or manager or audience won’t understand them. We are not born knowing
instinctively how to read a bar chart or line chart or pie chart. As Scott Klein, deputy managing
editor at ProPublica, once wrote, “There is no such thing as an innately intuitive graphic.
None of us are born literate in reading visualizations.”
As data visualization creators, we must understand our audience and know when a differ-
ent graph can engage readers—and help them expand their own graphic literacy.

    
This book has three main parts. Part 1 covers general guidelines for creating effective visualizations.
We’ll learn the importance of our audience and how to consider what category
of graph will best meet their needs. No data visualization book will contain every lesson needed to
create effective graphs, but there are some best practices that can guide your work. As you go
forward creating more visuals and seeing their effect on your audience, you’ll develop your
own aesthetic and learn when to bend or break these guidelines.

Each of these six charts visualizes the same data: the share of people earning minimum
wage or less in each state.

Part 2 is the meat of the book. We will define and discuss more than eighty graphs, grouped
into eight broad categories: Comparisons, Time, Distribution, Geospatial, Relationship,
Part-to-Whole, Qualitative, and Tables. We will see how each graph works and the
advantages and disadvantages of each.
Graphs overlap between these categories—a bar chart, for example, can be used to show
changes over time or comparisons between groups. The categorizations here are based on
a graph’s primary purpose. But even that’s not an objective truth, and your perspective and
situation may differ. I do not discuss every single possible graph—there are many specialized
graphs in fields like architecture, biology, and engineering that are excluded here. Instead,
these chapters cover the most common and flexible graphs that can showcase the sorts of
data most people will need to display.
I tie these chapters together in Part 3 with a chapter on building a data visualization style guide
and a chapter on how to pull the different lessons together in a series of graph redesigns. If you’ve
ever written a research paper, or even a book report, you are probably aware of the array of writing
style guides, from the Chicago Manual of Style to the Modern Language Association. These guides
break down writing into component parts and prescribe their proper use. A data visualization style
guide does the same for graphs—defines their parts and how to style and use them. In the final
chapter, we apply the lessons to redesign a series of graphs to improve how they communicate data.
This book will guide you as you explore your data and how it might be visualized. Now
more than ever, content must be visual if it is to travel far. Your clients and colleagues, and
your audiences of policymakers, decisionmakers, and interested readers are inundated with
a flow of information. Visuals cut through that.
Anyone can improve the way they visualize and communicate their data—and you don’t
need a graduate degree in marketing or design or advertising. Take it from me: I started my
career as an economist in the federal government.

HOW I LEARNED TO VISUALIZE MY DATA

Once I settled on declaring my economics major at the University of Wisconsin at Madison
(there was an ill-fated attempt to also be a math major, but I hit a wall at Markov chains),
I knew I wanted to end up in Washington, DC. I wanted to be near the center of public
policy and politics. I wanted to explore the real problems of the day and help craft solutions.
I moved to DC in 2005 to join the Congressional Budget Office (CBO). My job was to
help work on the long-term microsimulation model that is used to examine the Social Secu-
rity system and forecast the long-term finances of the federal budget. The spring of 2005 was
an exciting time to work on Social Security—President George W. Bush had made Social
Security a central component of his second term. In his 2005 State of the Union address, he
said, “We must pass reforms that solve the financial problems of Social Security once and for
all.” Reform would stall later that year, but in the course of my first few months on the job,
my group at CBO estimated and analyzed the effects of dozens of policy proposals.
Five years later, I had expanded my work to include issues around policies that affected
disabled workers, immigration, and food stamps (now called the Supplemental Nutrition
Assistance Program or SNAP). In 2010, three of my colleagues were drafting a special report
on policy options for Social Security. In it, they would show the impact of thirty different
options for reform. One of the central figures in the report would show changes in taxes
received by the system, benefits paid out from the system, the balance between the two, and
other measures of fiscal solvency for these thirty options. It looked something like this:

Author’s rendering of early draft of exhibit from the Congressional Budget Office.

You don’t need to be a government economist to know that members of Congress are
unlikely to read something that looks like a spreadsheet. There are too many rows, too many
columns, too many numbers—too much information. It was right then that I first started
thinking about better ways to present this information.
This was the result. We replaced some numbers with small area charts, which give the
reader an immediate visual impression of each option—which ones increased the solvency
of the program and which ones did not.

Final version of that main exhibit in the Congressional Budget Office report on Social
Security. Notice that there is less data and more graphs.
Source: Congressional Budget Office.

The report worked. We received good feedback from colleagues at CBO and other agen-
cies, as well as readers on Capitol Hill and elsewhere, noting how easy it was to read and
digest the graphs. It was maybe the first time I (and perhaps the agency) thought carefully
and strategically about our data visuals. From there, I started reading books on data visual-
ization, design, color theory, and typography.
Working with our editorial department and designers, we began to improve the
graphs in our basic reports and started creating new report and graph types. We made
infographics—what was then a buzzword referring (sometimes derisively) to longer
graphics that combine data, text, images, and more into a single visual. In 2012, we cre-
ated this infographic to accompany and summarize The Long-Term Budget Outlook, a
109-page report.

One-page infographic about the 2012 Long-Term Budget Outlook from the Congressional
Budget Office.
Source: Congressional Budget Office.

That June, CBO’s director sat in front of the U.S. House Budget Committee to relay the
results of our analysis. As the hearing played on a TV out in the hallway, I suddenly heard
yells of, “Jon! Jon! Come out! Your infographic is on TV!”
And, sure enough, Congressman Chris Van Hollen was holding up the infographic on
C-SPAN, covered with scribbles and notes. The visualization had captured and engaged the
attention of one of the busiest people in America, and someone who could do something
about the pressures facing the federal budget. That was the moment I knew that how we
presented our data could matter as much as the data itself.
In 2014, I moved to the Urban Institute, a nonprofit research institution in Washington,
DC, to spend half of my time conducting research and half of my time in the Communica-
tions department, helping colleagues present and visualize their data.
Since that time, I have conducted hundreds of workshops, delivered lectures around the
globe, and published two books on data communication. The world, it seemed, had seen
what I saw: better visual content and better presentations were the currency of research and
policy adoption. The advance of computing power, social media platforms, and the expanding
media landscape made visual content more important, perhaps even necessary.

Maryland Congressman Chris Van Hollen holding up that Long-Term Budget Outlook infographic
in a House Budget Committee hearing.
Source: C-SPAN2.

Today, I work with people in nonprofits, government agencies, private sector companies,
and everything in between to improve how they create their graphs and communicate their
content. I’ve worked with junior economists and analysts dealing with enormous data sets;
health care workers trying to communicate results to patients, families, and hospital admin-
istrators; human resource representatives working with databases of job-seekers; advertisers
and marketing executives selling products to clients; and many more.
I’ve seen hundreds of different kinds of data visualization challenges. The skills to meet
them, unfortunately, are not yet regularly taught in schools or professional development
programs. But these skills can be learned. We can learn how to read chart types we’ve never
seen before, even if they are complex. And we can learn how to communicate our work in
better and more effective ways.
Eventually, I discovered that one of the most important things I can show people is the
incredibly wide array of graphs available to them. And that is precisely the content of this
book, a survey of more than eighty types of data visualizations, from the familiar to the
nonstandard.
But before we get to the library of graph types, we’ll consider some of the science behind
how we process visual information and some best practices and approaches to visualizing data.
PART ONE
PRINCIPLES OF DATA VISUALIZATION

1. VISUAL PROCESSING AND PERCEPTUAL RANKINGS

Before we start creating our charts and graphs, we need to cover some basic theory of
how the brain perceives visual stimuli. This will guide you as you decide what chart type
is most appropriate to visualize your data.
When we consider how to visualize our data, we must ask ourselves how accurately the
reader can perceive the data values. Are some graphs better equipped to guide the reader to
the specific difference between, say, 2 percent and 2.3 percent? If so, how should we think
about those differences as we create our visualizations?
There’s a thread of research in the data visualization field that explores this very ques-
tion. Based on original research over the past thirty years or so, the image on the next
page shows a spectrum of graphs—or more generally, types of data encodings like dots,
lines, and bars—arrayed by how easily readers can estimate their value. The encodings that
readers can most accurately estimate are arranged at the top, and those that enable more
general estimates are at the bottom.
The rankings are unsurprising. It is easier to compare the data in line charts, bar charts,
and area charts that have the same axis or baseline. Graphs on which the data are positioned
on unaligned axes (think of a pair of bars that are offset from one another on different
axes) make it slightly harder for us to accurately discern the values.
Farther down the vertical axis are encodings based on angle, area, volume, and color.
You intuitively know this: it’s much easier to discern the exact data values and differences
between values when reading a bar chart than when reading a map where countries are
shaded with different colors.
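[Figure: data encodings ranked from those that enable accurate estimates (top) to those that
may enable only general estimates (bottom): position along common scales; position along
identical, nonaligned scales; length; direction/slope, angle, and parts of a whole; area;
volume; shading and saturation; color hue.]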

Perceptual ranking diagram. What kind of data visualization you choose to create will
depend on your goals and your audience’s needs, experiences, and expertise. This image is
based on Alberto Cairo (2016) from research by Cleveland and McGill (1984), Heer,
Bostock, and Ogievetsky (2010), and others.
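
For readers who work in code, the ranking above can be sketched as an ordered list. This is only an illustration, not part of the original research or of any charting library: the names PERCEPTUAL_RANKING, tier_of, more_accurate, and CHART_ENCODING are hypothetical, the tiers follow the order shown in the figure, and the mapping from chart types to encodings is an assumption based on the bar chart and shaded map comparison in the text.

```python
# Encodings ordered from most to least accurately perceived, following the
# order in the perceptual ranking figure. Each inner list is one tier;
# encodings in the same tier are treated as equally accurate (an assumption).
PERCEPTUAL_RANKING = [
    ["position along common scales"],
    ["position along identical, nonaligned scales"],
    ["length"],
    ["direction/slope", "angle", "parts of a whole"],
    ["area"],
    ["volume"],
    ["shading and saturation"],
    ["color hue"],
]


def tier_of(encoding: str) -> int:
    """Return the 0-based tier of an encoding (0 = most accurate estimates)."""
    for tier, encodings in enumerate(PERCEPTUAL_RANKING):
        if encoding in encodings:
            return tier
    raise ValueError(f"unknown encoding: {encoding!r}")


def more_accurate(a: str, b: str) -> str:
    """Return whichever encoding sits higher in the ranking (a on ties)."""
    return a if tier_of(a) <= tier_of(b) else b


# Hypothetical mapping from chart types to their primary encoding,
# used only to illustrate the bar chart versus shaded map comparison.
CHART_ENCODING = {
    "bar chart": "position along common scales",
    "shaded map": "shading and saturation",
}

if __name__ == "__main__":
    bar = CHART_ENCODING["bar chart"]
    shaded = CHART_ENCODING["shaded map"]
    print(more_accurate(bar, shaded))
```

A lookup like this cannot choose a chart for you, since the right choice also depends on your goals and your audience, but it makes the trade-off explicit when two candidate encodings are on the table.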