
This PDF is available at http://nap.nationalacademies.org/24755

Data Visualization Methods for Transportation Agencies (2017)

DETAILS
75 pages | 8.5 x 11 | PAPERBACK
ISBN 978-0-309-45848-1 | DOI 10.17226/24755

CONTRIBUTORS
Nathan Higgins, Ronald Basile, Samuel Van Hecke, Joseph Zissman, and Scott
Gilkeson; National Cooperative Highway Research Program; Transportation
Research Board; National Academies of Sciences, Engineering, and Medicine

SUGGESTED CITATION


National Academies of Sciences, Engineering, and Medicine. 2017. Data
Visualization Methods for Transportation Agencies. Washington, DC: The
National Academies Press. https://doi.org/10.17226/24755.


All downloadable National Academies titles are free to be used for personal and/or non-commercial
academic use. Users may also freely post links to our titles on this website; non-commercial academic
users are encouraged to link to the version on this website rather than distribute a downloaded PDF
to ensure that all users are accessing the latest authoritative version of the work. All other uses require
written permission.

This PDF is protected by copyright and owned by the National Academy of Sciences; unless otherwise
indicated, the National Academy of Sciences retains copyright to all materials in this PDF with all rights
reserved.
NCHRP Web-Only Document 226:
Data Visualization Methods for Transportation Agencies

Nathan Higgins
Ronald Basile
Samuel Van Hecke
Joseph Zissman
Cambridge Systematics, Inc. - Cambridge, MA

Scott Gilkeson
Takoma Park, MD

Copyright National Academy of Sciences. All rights reserved.

NCHRP Project 08-36, Task 128
Submitted August 2016

ACKNOWLEDGMENT

This work was sponsored by the American Association of State Highway and Transportation
Officials (AASHTO), in cooperation with the Federal Highway Administration, and was
conducted in the National Cooperative Highway Research Program (NCHRP), which is
administered by the Transportation Research Board (TRB) of the National Academies of
Sciences, Engineering, and Medicine.

COPYRIGHT INFORMATION

Authors herein are responsible for the authenticity of their materials and for obtaining
written permissions from publishers or persons who own the copyright to any previously
published or copyrighted material used herein.

Cooperative Research Programs (CRP) grants permission to reproduce material in this
publication for classroom and not-for-profit purposes. Permission is given with the
understanding that none of the material will be used to imply TRB, AASHTO, FAA, FHWA,
FMCSA, FRA, FTA, Office of the Assistant Secretary for Research and Technology,
PHMSA, or TDC endorsement of a particular product, method, or practice. It is expected
that those reproducing the material in this document for educational and not-for-profit
uses will give appropriate acknowledgment of the source of any reprinted or reproduced
material. For other uses of the material, request permission from CRP.

DISCLAIMER

The opinions and conclusions expressed or implied in this report are those of the
researchers who performed the research. They are not necessarily those of the
Transportation Research Board; the National Academies of Sciences, Engineering, and
Medicine; or the program sponsors.

The information contained in this document was taken directly from the submission of the
author(s). This material has not been edited by TRB.

 

Table of Contents
Chapter 1 · Introduction and Background
  1.1 · Visualization in Transportation
  1.2 · Background
  1.3 · Audience for the Guide
  1.4 · Definitions
  1.5 · Outline
Chapter 2 · How to Illustrate Data
  2.1 · Common Chart Types
  2.2 · Other Recommended Chart Types
  2.3 · Common Techniques
Chapter 3 · Developing Effective Visualizations
  3.1 · Data Wrangling
  3.2 · Intent and Audience
  3.3 · Analysis
  3.4 · Choosing a Strategy
  3.5 · Tools and Implementation
  3.6 · Putting It All Together
Chapter 4 · Style Guide
  4.1 · Basic Design Principles
  4.2 · Font
  4.3 · Color
  4.4 · Federal Requirements for Style
Chapter 5 · Conclusion
Appendix A: Best Practice Examples
  Opening Notes
  Asset Management
  Connectivity, Accessibility, and Livability
  Environmental
  Transit
  Highway Mobility
  Performance Based Planning
  Freight
  Safety
  Socioeconomic
  Pedestrian and Bicycle
Appendix B: Matrix of Tools and Chart Types


Figure 1: 3D Model of the Proposed Mobile River Bridge (Alabama DOT)


This Guide is intended to help transportation planners create modern data
visualizations. It is built for planners who want to learn the basics and peek around
the corner at what is next once they have them mastered. It includes advice and
best practices for developing visualization skills, enhancing transportation
analysis, and improving public engagement. It considers advances in technology
and communication, such as online tools, software, and technical support
acquisition. The Guide is available in a website format at
http://vizguide.camsys.com; it focuses on key takeaways and examples.

1.1 · Visualization in Transportation

Why is it important?
A visualization is any illustration that conveys information. It can be static,
animated, or interactive, colorful or monochrome. It can represent a conceptual
framework (e.g., an organizational structure), a dataset (e.g., census
demographics), or a simple idea (e.g., the budget is really big). As society’s
ability to collect, store, and consume data has grown over time, so has the need
to illustrate and explain that information, and visualizations often can do so more
succinctly, engagingly, and quickly than can spreadsheets and narrative alone.

At the turn of the 21st century, the term “visualization” in the transportation
industry almost always referred to static or animated renderings of improvement
projects, often three-dimensional (as illustrated in Figure 1). As the last decade
ended, the proliferation of free public data from transit agencies, among others,
changed the popular conception. Visualizations became more interactive and
migrated online and into smartphone apps. Simultaneously, business-oriented
visualization tools have provided practitioners with more sophisticated options
for illustrating and communicating information.

Organizations such as state transportation agencies (DOTs) and Metropolitan
Planning Organizations (MPOs) have an opportunity to place more data than
ever in front of stakeholders and the public. For these audiences, visualizations
can inform, can spur an action, and can improve decision-making.

In some cases, these organizations can engage the time and skills of app
developers, hobbyists, and academic researchers to display complex open
datasets. For example, after the Massachusetts Bay Transportation Authority
(MBTA) released time-bound subway location data, graduate students Michael
Barry and Brian Card produced “Visualizing MBTA Data,” shown in Figure 2.

Figure 2: Screen Capture from “Visualizing MBTA Data” (by Michael Barry and Brian
Card) http://mbtaviz.github.io/
Text intentionally left small to focus the reader on the overall image.

In addition to these interactive solutions, many visualizations are static – a pie
chart, bar graph, or pictogram integrated into printed text, for example. While
there is no limit to the types of transportation information that can be conveyed
visually, uses might include:

• Bridge deterioration over time;
• Ridership or occupancy over time for a transit service;
• Volume over time for a roadway or intersection;
• Benefits of a project compared with costs;


• Flows of funding from revenue sources to expenditures;
• Maps of trip generation, rider origin and destination, transportation equity
  (service level vs. income level), and roadway congestion, among many
  others; and
• Comments and feedback collected at public meetings – see Figure 3.

Figure 3: Graphic Recording (Champaign County Regional Planning Commission)
Text intentionally left small to focus the reader on the overall image.

1.2 · Background
What’s the best stuff out there?
This Guide is informed by and founded upon a large-scale literature review and
interviews. The literature review focused on best practices drawn from our
professional experience and prior work, from online media (e.g., newspapers,
blogs, and magazines), and from academia. While we paid special attention to
work that touched the transportation industry, we also considered the literature
review to be an opportunity to introduce best practices from other fields to
transportation practitioners.

We conducted interviews with John Allen (New Jersey DOT and the CATT Lab),
Dan Howard (San Francisco Municipal Transportation Authority), and Ben
Shneiderman (University of Maryland). We adapted interviews conducted by
others of Mike Bostock (creator of D3.js), David McCandless (author of
Information is Beautiful), Tamara Munzner (University of British Columbia), and
Fernanda Viegas and Martin Wattenberg (creators of ManyEyes for IBM).
Wisdom from these subject matter experts appears throughout the Guide.

Key findings from our literature review and interviews include:

• There are excellent visualization guides from private subject-matter experts
  and enthusiasts, presented as books and as blogs. For example:
    • Storytelling with Data – Cole Nussbaumer Knaflic
      http://www.storytellingwithdata.com/; and
    • Evergreen Data – Stephanie Evergreen http://stephanieevergreen.com/.
• The best visualizations communicate information with intent to a specific
  audience. Charts are sometimes, but not always, presented with labels. An
  example of a chart without labels, from the British air traffic controller NATS,
  is provided in Figure 4.

Figure 4: European Air Traffic over 24 Hours – GPS Locations of Aircraft (NATS)

• With that said, one of the subject matter experts noted the growth of
  “visualizing text” as a visualization task. Several survey respondents provided
  best-practice examples of using color, size, typeface, and geometric
  arrangement, among other strategies, to help key numbers and words
  “pop,” as in Figure 3;


• There are many tools and types of tools for building best-practice
  visualizations, each with a different learning curve. Common programs like
  Microsoft Excel can create effective charts. A new class of Business
  Intelligence (BI) tools including Tableau, Qlik, and Microsoft Power BI
  brings sophisticated visualization power to experts and casual users alike.
  Beyond these user-friendly tools, users with software programming
  capabilities can obtain several free and open-source tools such as D3.js to
  create interactive, web-based, data-driven visualizations; and
• All subject-matter experts and best practices embrace simplicity as a driving
  principle. One interviewee, Dan Howard, noted that:
  “If it’s too complex for you to explain [to the lay reader], that’s a signal
  that you’re in trouble.”

1.3 · Audience for the Guide

How do agencies visualize today?
To ensure that this Guide would be as helpful as possible for practitioners, we
conducted a survey of prospective users. Our objectives included:

• Understanding who within transportation organizations possesses
  visualization and data management skills and who builds visualizations;
• Exploring the level of visualization expertise in transportation agencies;
• Assessing the degree to which transportation organizations deploy
  visualization methods and tools; and
• Collecting a sample set of self-identified best-practice visualizations from the
  transportation industry and identifying trends and common approaches.

We asked respondents to identify the two “best” visualizations published by their
organizations. We asked them to reflect on the development, objectives, and
outcomes of each. Thirty respondents completed the survey, from organizations
shown in Figure 5.

Figure 5: Organizations that Responded to the Visualization Survey for
Transportation Practitioners (not pictured: Hawaii Office of Planning)

Based on their responses, we conclude that our typical respondents:

• Are mid-level managers or technical specialists for State agencies, MPOs,
  or universities in the fields of Transportation Planning, Highway Operations,
  and City Planning;
• Present infographics or interactive online visualizations as their best
  visualization examples;
• Most often use bar/column charts, colored maps, stacked area graphs, and
  pictograms;


• Construct visualizations using ArcGIS, Microsoft Office, Adobe InDesign,
  and JavaScript;
• Design visualizations primarily for non-technical audiences (lay people,
  executives, and legislators);
• Use several common datasets (e.g., Highway Performance Monitoring
  System (HPMS), National Bridge Inventory (NBI), and American Community
  Survey (ACS));
• Work for MPOs that produce visualizations for unique audiences, including
  municipal planners, officials, and technical advisory panels;
• Are confident in their in-house visualization skills, but they consider hiring
  consulting help for some visualization tasks; and
• Have strong data management skills but lack institutional resources,
  software/data experience, and high-quality data.

This Guide is intended to be useful both to organizations that frequently practice
visualization and want to advance toward best practice and to those that are just
learning how to produce modern visualizations.

1.4 · Definitions

In addition to defining a visualization as an illustration that conveys information,
we use the following terminology throughout this Guide:

• A chart or chart type is the data-driven arrangement of information on the
  page. Charts need not include axes, lines, or bars, though these are all
  elements of some types of charts. A word cloud, for example, will be treated
  as a chart type though it does not encode information through position;
• A dimension is an “attribute” of data (e.g., a column in a table) shown as a
  variation in the appearance of data points. Torsten Moller and Tamara
  Munzner refer to these as “channels” (Visualization Analysis and Design,
  2014); and
• A tool is a resource or software package used to build and publish
  visualizations.

1.5 · Outline
We designed this Guide so that practitioners can reference the portions they need
when they need them. Practitioners also can find targeted summaries of this
material at vizguide.camsys.com.

We have divided this Guide into five chapters; the first is this introduction. The
other four include:

• Chapter 2 · How to Illustrate Data: This chapter describes the most common
  chart types for transportation professionals, including types of data, chart
  variations, tips, and tools for production. The chapter also describes some
  less-common but useful alternatives. It draws on available comprehensive
  taxonomies of charts (e.g., Emery’s Essentials).
• Chapter 3 · Developing Effective Visualizations: This chapter describes a
  five-step process for translating data into an effective visualization:
    • Data wrangling;
    • Intent and audience;
    • Analysis;
    • Choosing a strategy;
    • Tools and implementation (i.e., selecting the right tool and learning to
      use it).
  The chapter will describe each of these steps and best practices for
  addressing it.
• Chapter 4 · Style Guide: This chapter provides basic design best practices
  that you can apply to your visualizations.
• Chapter 5 · Conclusion: This chapter summarizes the Guide and offers
  advice and inspiration from visualization experts.


Choosing a chart type often means deciding among a set of familiar favorites.
Popular choices include map, bar/column, line/area, donut/pie, flow, treemap,
heat map, scatterplot, pictograph, and node-link diagrams. While it is important
to maintain fluency in your favorite charts in order to produce high-quality
visualizations, it also is important to know when a certain chart is not the right
choice to tell your story.

2.1 · Common Chart Types

People often associate visualization with novel and complex graphics, but the
most effective visualizations often are familiar. Jean-Daniel Fekete has studied
ways to measure and improve a concept he calls “visualization literacy.” His
research (Towards Visualization Literacy, 2014) shows that people can more
quickly and accurately answer questions about visualizations when there is a
direct relationship between a data-oriented term and a perceptual term (which
he labels “congruence”). For example, when charting unemployment on a bar
graph, the highest bar represents the highest unemployment. Common chart
types typically have the highest congruence and are the easiest to understand.

See the Charts section of the web version of the Guide or Appendix A for a
selection of useful examples.

Geographic Maps

Maps are effective for communicating conditions in or
differences among specific geographic areas. Geographic
boundaries (e.g., states or counties) provide the foundation
for many maps.

Types of Data

Geospatial data must have a dimension that associates it with a geographic area
or location, such as:

• Places and Political Entities – States, counties, ZIP codes or other
  well-defined political or administrative areas;
• Latitude/Longitude – Specific point locations;
• Roads and Routes – Highways, waterways, railroads, multiuse paths, etc.;
  and
• Areas – Wetlands, watersheds, flood plains, etc.

Variations

• Choropleth – Map areas are colored by value.
• Bubble – Bubbles are sized relative to values and located on a map.
• Pie – Bubbles are sized by value, divided by qualitative categories, and
  located on a map.
• Dot Density – Dots vary in density by value and are located on a map.
• Route – Lines representing transportation networks or paths between
  locations are sized by value and located on a map.
• Flow – Arcs showing flow from one location or node to another are sized
  by value and located on a map.
• Area Cartogram – Map areas are distorted by value while keeping the
  geography recognizable.


Tips

Tell the same story as your data

• Be aware of the disproportionate effect of sparsely populated areas when
  using choropleth maps. These colored maps are eye-catching, familiar,
  and generally well-understood, but can be misleading because
  tightly-packed, densely populated areas like inner cities may not even
  appear at the scale being displayed, while large rural areas will dominate
  the visual field, even though they may represent few people. Interactive
  zooming can make the small areas visible, but the larger areas will always
  dominate the perceived coloration. Consider using a bar graph, which gives
  equal weight to each area, or a treemap, which can size areas by population
  rather than geographic size (albeit at the expense of easy spatial
  recognition). Another option is an area cartogram, which distorts the size of
  the area on the map while maintaining adjacency or relative position, to
  make the size proportional to the item of interest.

Improve memory and comprehension

• Limit the information to just what your audience needs to understand your
  point. This is true for any visualization but especially so for maps. It can be
  tempting to include interstate shields, urbanized area boundaries, north
  arrows, and scales even when they don’t add any information to the visual.
• If interactive, make layers of information available via a layer menu because
  only parts of your audience will want to see them.
• Limit yourself to five or fewer classes, particularly for choropleth maps. There
  are various ways to determine the break points for these sub-ranges
  (sometimes called ‘class breaks’):
    • Quantiles – An equal number of observations are assigned into each
      sub-range. Each color will appear the same number of times on the map,
      but the span of values covered by each sub-range may be large or small.
    • Equal Interval – Observations are assigned to some number of
      equal-sized sub-ranges. Some colors may not appear at all while others
      may appear frequently. Outliers can have a strong influence.
    • Natural Breaks – Observations are assigned to some number of
      sub-ranges based on how they cluster. This generally results in an
      attractive map.
    • Standard Deviation – Observations are assigned to some number of
      sub-ranges based on numbers of standard deviations from the mean.
      This tends to emphasize outliers.
    • Custom Whole Numbers – Observations are assigned to custom
      sub-ranges bounded by whole numbers. All the other class break
      choices tend to give odd-looking ranges like “12.3 to 15.7,” while
      viewers may be more comfortable with “10 to 15.” If the map is updated
      with new data, you will need to re-evaluate the suitability of the chosen
      breaks.
• Associate the scale of light and dark colors with a scale in your data.
• Use different color scales for positive and negative values (e.g., red for
  positive and blue for negative). Chapter 4 addresses the use of color, and
  ColorBrewer.org is a useful resource for selecting color scales.

Tools

• All Types of Maps
    • Map Tools: Esri ArcGIS/ArcMap, QGIS; and
    • General Tools: Google Fusion Tables.
• Basic Maps (Choropleth and Bubble)
    • Visualization Environments: Tableau, Qlik, Microsoft Power BI; and
    • For Developers: D3.js, R, Google Maps application program interface
      (API), Leaflet.
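
The class-break methods described in the map tips can be compared on a small dataset. The following is a minimal sketch, not a method taken from the Guide: the function names are illustrative, the ten values are hypothetical, and only the Python standard library is assumed. Natural Breaks is omitted because it requires a clustering algorithm that GIS tools normally supply.

```python
import bisect
import statistics


def quantile_breaks(values, k):
    """Interior cut points that place about the same number of observations in each of k classes."""
    return statistics.quantiles(values, n=k, method="inclusive")


def equal_interval_breaks(values, k):
    """Interior cut points that split the data range into k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def std_dev_breaks(values, steps=(-1, 0, 1)):
    """Cut points placed at whole numbers of standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [mean + s * sd for s in steps]


def class_counts(values, breaks):
    """Number of observations that fall into each class defined by the cut points."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect.bisect_left(breaks, v)] += 1
    return counts


# Hypothetical values for ten map areas, with one outlier (illustrative only)
values = [10, 12, 15, 18, 20, 22, 25, 30, 40, 200]

print(class_counts(values, quantile_breaks(values, 5)))        # [2, 2, 2, 2, 2]
print(class_counts(values, equal_interval_breaks(values, 5)))  # [9, 0, 0, 0, 1]
print(class_counts(values, [15, 20, 25, 50]))                  # custom whole-number breaks
```

With these made-up values, quantiles use every color equally, while equal intervals leave three classes empty because the single outlier stretches the range.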


Bar Charts

Bar charts are useful for comparing quantities across one
or more dimensions. The length of the bars represents the
relative magnitude of attributes (e.g., vehicle miles traveled
by mode).

Types of Data

Bar charts require at least one quantitative dimension, which corresponds to the
length of the bars, and one qualitative dimension, which is represented by
different bars. Some variations can represent multiple quantitative and qualitative
dimensions.

Quantitative – Bar charts are suitable for comparing nearly any quantitative data.

Qualitative – Bars must represent discrete categories.

Variations

• Horizontal/Vertical – Bars extend horizontally or vertically (vertical bar charts
  are sometimes called column charts).
• Clustered – Bars are grouped to show differences among categories of the
  data, with color representing different dimensions.
• Stacked – Multiple bars are stacked on top of one another. The bars can be
  normalized so that bars represent percentages of the whole.
• Diverging – Bars extend in positive and negative directions from the
  baseline to represent positive and negative values.
• Bullet – Bars are overlaid on a background to compare quantitative
  measures (e.g., condition) against qualitative ranges (e.g., poor,
  satisfactory, and good). Markers are placed on the bar to indicate targets.
• Histogram – Bars show the number of elements in each category. If the
  measure is continuous, observations are grouped into ranges.
• Pyramid – Bars diverge to the left and right from a centerline showing the
  number of elements in each category. Often called a population pyramid,
  this chart often is used to show population by age group with bars for
  males extending to the left and females extending to the right.
• Radial – Bars are arranged in a circle or spiral, rather than extending in
  parallel from a baseline.

Tips

Tell the same story as your data

• Start from zero – Starting anywhere else distorts variation. If your data does
  not show much difference at full scale, that may be the story. If starting from
  zero is not an option, you might consider using a logarithmic scale or
  providing a zoom feature to show differences more clearly.
• Use stacked bars with care – Stacked bars are not good for estimating
  percentages or comparing components because only the first bar in each
  stack starts from the common baseline.
• Use radial charts sparingly – Radial charts are visually appealing but make it
  difficult to compare values. They rarely are the best choice.

Improve memory and comprehension

• Use five to eight bars – More bars make it hard to compare specific values.
  If you find yourself needing more bars, consider using a line chart instead.
• Sort the bars – Sorting makes it easier to compare bars that are similar in
  height. However, be aware that sorting implies a ranking.
• Horizontal layouts accommodate longer category names – Labeling charts
  is important and horizontal bar charts give you more labeling real estate.
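
Several of the bar chart points above come down to simple arithmetic done before any chart is drawn. The sketch below is illustrative only: the function names and all numbers are hypothetical, and it prepares data rather than drawing a chart. It shows normalizing a stacked bar to percentages, grouping a continuous measure into histogram ranges, sorting bars, and measuring how a non-zero baseline exaggerates the difference between two bars.

```python
from collections import Counter


def normalize_stack(components):
    """Convert the components of one stacked bar to percentages of the whole."""
    total = sum(components.values())
    return {name: 100 * value / total for name, value in components.items()}


def histogram(observations, bin_width):
    """Group a continuous measure into ranges; keys are the lower bound of each range."""
    counts = Counter((obs // bin_width) * bin_width for obs in observations)
    return dict(sorted(counts.items()))


def sort_bars(totals):
    """Order bars from largest to smallest (note that sorting implies a ranking)."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def apparent_ratio(a, b, baseline=0):
    """Ratio of drawn bar lengths for values a and b when the axis starts at baseline."""
    return (a - baseline) / (b - baseline)


# Hypothetical trips (thousands) by mode on one corridor
print(normalize_stack({"Car": 600, "Bus": 300, "Bike": 100}))
# {'Car': 60.0, 'Bus': 30.0, 'Bike': 10.0}

# Hypothetical commute times in minutes, grouped into 15-minute ranges
print(histogram([12, 18, 25, 27, 33, 41, 44, 58], 15))
# {0: 1, 15: 3, 30: 3, 45: 1}

# Values of 52 and 50 differ by 4 percent, but with an axis starting at 48
# the first bar is drawn twice as long as the second.
print(apparent_ratio(52, 50))               # 1.04
print(apparent_ratio(52, 50, baseline=48))  # 2.0
```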


Tools
 Basic Bars
 General Tools: Microsoft Excel, Google Sheets, Google Fusion Tables;
 Visualization Environments: Tableau, Qlik, Microsoft Power BI; and
 For Developers: D3.js, R, Google Chart API.
 Other Bar Types (Bullet, Histogram, Pyramid, Radial)
 General Tools: Microsoft Excel (histogram starting with 2016);
 Visualization Environments: Tableau, Microsoft Power BI (with custom visuals); and
 For Developers: D3.js, R, Google Chart API.

Line Graphs

Line graphs are useful for showing trends and comparing them among variables. Line graphs show changes in a quantitative variable across some other ordered variable, usually time (e.g., increase/decrease in revenue over time).

Types of Data
Values (y-axis) – Line graphs are suitable for plotting one or more quantitative variables on the same chart; and
Period or Span (x-axis) – Line graphs typically show change over time. If the variable is not time, it should have a logical order so that moving from left to right has some meaning.

Variations
 Segmented Line – Straight lines are drawn to connect data points.
 Smoothed Line – An algorithm calculates a curved path to connect (or nearly connect) data points.
 Regression Line – An algorithm calculates a best fit line that passes through the data points. A regression line most likely will not connect any of the data points, but will show the overall trend.
 Area Graph – Straight lines are drawn to connect data points and the area under the line is colored.
 Stacked Area Graph – Multiple area graphs are stacked on top of one another. The areas can be normalized so that they represent percentages of the whole.
 Streamgraph – A stacked area graph centered on an axis to create a flowing shape.

Tips
Tell the same story as your data
 Start scale at zero to put your data into proper context. Focusing on a narrow range can make changes appear more dramatic than they really are. If starting from zero is not an option, use a logarithmic scale or provide a zoom feature to see differences.
 Use two y-axes to plot dimensions with different scales at the same time – When necessary, plot variables with different ranges together to compare trends. Use two y-axes (usually placed on opposite sides of the graph).
 Use stacked areas with care – Stacked areas are not good for estimating percentages or comparing areas because only the first components line up.
Improve memory and comprehension
 Label lines directly – Label the lines directly on the graph to make them easier to read.
 Highlight key events – Use arrows and callouts or shade background to annotate significant events.
 If the chart is interactive, show points – In an interactive visualization, points add a visual cue for users to mouse over them to see more information.


 If the chart is static, avoid showing points – A line without points looks sleek and uncluttered.
 Use line thickness to make a statement – Thicker lines make a bold statement and are easy to see. Do not use thick lines if they obscure each other.
 Show grid lines but make them nearly invisible – Show grid lines or reference lines to help people estimate the values but make them as much a part of the background as possible.
 Use smoothed or regression lines to reduce the visual impact of variable data and maintain the overall shape of the line.
 Watch for obscured lines – The maximum number of comprehensible lines may depend on how close together they are and how they look. If there is little difference among them, one line may obscure another. Use style to help differentiate among the lines.

Tools
 Basic Line and Area (* can also produce streamgraphs)
 General Tools: Microsoft Excel, Google Sheets;
 Visualization Environments: Tableau, Qlik, Microsoft Power BI (using custom visuals*); and
 For Developers: D3.js*, R*, Google Chart API.

Pie Charts

Pie charts give a general impression of the relative contributions of each part to a whole (e.g., the percent of congestion caused by different things). They show each portion as a slice of a circular pie.

Note: Some data visualization experts discount pie charts because humans are not good at recognizing slight differences in angles. This means that humans have a hard time comparing slices accurately (bar charts often are better at this).

A search of the web turns up many articles discouraging the use of pie charts. One example, from Business Insider blogger Walter Hickey: “The pie chart is easily the worst way to convey information ever developed in the history of data visualization.” (http://www.businessinsider.com/pie-charts-are-the-worst-2013-6 – accessed 2016). Such opinions generally cite Edward Tufte (The Visual Display of Quantitative Information, 2001) and/or Stephen Few (http://www.perceptualedge.com/articles/visual_business_intelligence/save_the_pies_for_dessert.pdf), two well-known data visualization writers.

Still, pie charts are effective when used correctly. At a glance, the audience can see divisions of a whole and discern high-level proportion (e.g., a quarter or half). They also can sum adjacent slices (e.g., the two largest segments account for more than half of all cases).

Types of Data
Parts of a Whole/Percentages – Use when there are fewer than five to eight slices and when the sum of all slices is exactly 100 percent.

Variations
 Pie – A circle is divided into segments (arcs) by value with each segment representing one portion of the whole.
 Donut – The center is removed from a standard pie chart, leaving a ring of arcs. The blank center area can be used for labeling or other purposes.
 Multi-Tier (Sunburst) – Multi-tier pie or donut charts add one or more rings around the original chart, with each segment of the outer ring further subdivided to show hierarchies in data. Major segments in the outer rings must align with their inner counterparts, although some outer segments may be missing.

Tips
Tell the same story as your data
 Use only when data represent a whole – This also goes for normalized stacked area or bar charts, but is particularly important for pie charts.
 Consider a bar chart – If the intent of the chart is to compare values, bar charts are better suited.


Improve memory and comprehension
 Avoid too many slices – Use no more than five to eight slices to ensure that readers can see differences among the slices.
 Avoid one very small slice – Small slivers are hard to distinguish and hard to label. If it makes sense for your data, combine a number of the smallest segments to create an ‘other’ category.
 Avoid 3D pies – In general, it is a bad idea to use three-dimensional effects for data visualization, but special emphasis is due for pie charts. In order to add a 3D perspective, the pie has to be ‘tilted’ with respect to the viewer, and this distorts the relative areas of near and far segments.

Tools
 Pie and Donut charts (asterisk = can also produce sunburst charts)
 General Tools: Microsoft Excel (*starting with 2016), Google Sheets;
 Visualization Environments: Tableau*, Qlik, Microsoft Power BI*; and
 For Developers: D3.js*, R*, Google Chart API.

Flow Charts

Flow charts show how a quantity flows among containers. If these containers are geographic places (e.g., migration of population from one state to another), the flow chart can be displayed on a map (i.e., a “Flow Map”). A Sankey diagram illustrates non-geographic flows (e.g., revenue sources to spending programs).

Types of Data
Quantitative values associated with starting and ending points or states – To plot a flow, you need starting points, each with one or more ending points, and a measure of flow between them, such as volume. You can display categorical data as well, typically by coloring the point or the path. For example, you can plot the flow of freight tons from origin to destination seaports, distinguishing type of freight.

Variations
 Flowchart – Shapes connected by arrows show steps and decision points that represent a workflow or process.
 Sankey Diagram – Shapes are connected by arrows sized relative to value. Typically, Sankey diagrams show the flow of energy or money through a process.

Tips
Tell the same story as your data
 Ensure logical flow – If you are mapping the flow of cash, for example, make sure that the flow of money moves from source to expenditure.
 Make sure that your flows match – A Sankey diagram requires that you balance the outgo and the income for each node.
Improve memory and comprehension
 Minimize overlapping flows to make it easier for the audience to understand your intent. Complex flows may mean lots of lines crossing; experiment with various layouts to minimize that. Interactivity can help by highlighting one pathway on mouseover.

Tools
 Flowcharts and Sankey Diagrams
 General Tools: Microsoft PowerPoint, Visio;
 Visualization Environments: ArcGIS, Tableau (with effort), Microsoft Power BI (with custom visuals), SankeyMatic; and
 For Developers: D3.js, R, Google Chart API.
 EventFlow/LifeFlow is part of an ongoing research program at the University of Maryland, and is available for commercial licensing as well as non-commercial use.
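
The Sankey tip about balancing the outgo and the income for each node can be checked before any chart is drawn. The following is a minimal sketch, assuming flows are stored as (source, target, value) tuples; the function name, the tuple layout, and the funding figures are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def unbalanced_nodes(flows, tolerance=1e-9):
    """Return {node: (inflow, outflow)} for intermediate nodes that do not balance.

    `flows` is a list of (source, target, value) tuples (an assumed layout).
    Pure sources and pure sinks are skipped because they have only one side.
    """
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, value in flows:
        outflow[source] += value
        inflow[target] += value

    problems = {}
    for node in set(inflow) & set(outflow):  # nodes with both income and outgo
        if abs(inflow[node] - outflow[node]) > tolerance:
            problems[node] = (inflow[node], outflow[node])
    return problems


# Hypothetical revenue-to-program flows, in millions of dollars.
flows = [
    ("Fuel tax", "Highway fund", 60.0),
    ("Registration fees", "Highway fund", 40.0),
    ("Highway fund", "Maintenance", 70.0),
    ("Highway fund", "New construction", 25.0),
]
print(unbalanced_nodes(flows))  # {'Highway fund': (100.0, 95.0)}
```

Here the fund receives 100 but only 95 is assigned to programs, so the missing 5 would need to be accounted for (for example, as a reserve) before the diagram tells the same story as the data.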


Heat Maps

Heat maps are useful for highlighting areas of interest. Clusters of colors show areas of concentration of a value or magnitude. The data can be location-specific and overlaid on a diagram of the area of interest (e.g., a web page or a geographic map) or, if not location-specific, overlaid on a matrix or calendar.

Note: Choropleth maps and treemaps are sometimes called heat maps.

Types of Data
Quantitative values in an ordered field – Heat maps depend on assigning colors to a range of values, and the values must have a logical order such that adjacency is meaningful. For example, heat maps can show the concentration of crashes by the days on a calendar, of where a user’s eyes look on a webpage, or of where jobs are located.

Variations
 Cluster Heat Map – A shaded/colored matrix, with rows and columns arranged to highlight a relationship (e.g., number of trips by origin and destination of travel).
 Calendar Chart – The intensity of color on each day represents some activity on that day.
 Smoothed Area Heat Map – A mapping of areas of progressively higher concentration or value to color, in such a way that the colors form continuous, enclosing shapes.

Tips
Improve memory and comprehension
 Use red and yellow for hot and green and blue for cool – Red and yellow are perceived as “hot” colors making them good for showing positive attributes or other hot spots. Green and blue are perceived as “cool,” making them good for showing negative values or other cool spots. Chapter 4 describes general use of color.

Tools
 Matrix/Calendar
 General Tools: Microsoft Excel, Google Sheets;
 Visualization Environments: Tableau, Qlik, Microsoft Power BI (with custom visuals); and
 For Developers: D3.js, R, Google Chart API.
 Smoothed
 Map Tools: Esri ArcGIS/ArcMap, QGIS; and
 For Developers: D3.js, R, Google Chart API, Leaflet.

Scatterplots

Scatterplots are effective at showing how two variables relate to each other. A scatterplot displays values for two variables on a grid. The data are displayed as points, positioned according to the value of one variable on the x-axis and the other on the y-axis. Unlike line charts, the x-axis does not require any logical order.

Types of Data
At least two quantitative values – The basic scatterplot compares two attributes for the same item. On two axes, each pair of attributes defines a point. Since the two measures have different axes, they can have different units and ranges. By making the points different sizes, you can show a third quantitative dimension, and you can add a categorical dimension by color-coding the points. Additionally, you can display a time dimension using animation.
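
When point size carries a third quantitative dimension in a scatterplot, this guide's bubble chart tip is to tie the value to bubble area rather than diameter. The sketch below shows the arithmetic under stated assumptions: the function name and the 40-unit maximum radius are hypothetical, and the result would still need to be passed to whatever drawing tool is in use.

```python
import math


def bubble_radius(value, max_value, max_radius=40.0):
    """Radius for a bubble whose AREA is proportional to `value`.

    Area grows with the square of the radius, so the radius must grow with
    the square root of the value. `max_radius` (an assumed drawing unit) is
    the radius given to the largest value.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


# A value one quarter of the maximum gets half the radius,
# which is one quarter of the area.
print(bubble_radius(100, 100))  # 40.0
print(bubble_radius(25, 100))   # 20.0
```

Scaling the radius directly with the value would instead draw the smaller bubble at one sixteenth of the area, understating it relative to the data.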


Variations
 Scatterplot – Points are placed on a graph based on two values. An algorithm can be used to fit a line that passes through the points. A regression line will show the overall trend.
 Bubble Chart – Points are placed on a graph based on two values and sized based on a third.
 Motion Chart – Points are placed on a graph based on two values, sized based on a third, and put into motion based on a fourth (typically time).

Tips
Tell the same story as your data
 Size bubbles based on area, not diameter – When given the option, correlate bubble area (not diameter) to your values.
Improve memory and comprehension
 Consider adding reference lines – Draw attention to a certain category of values by using lines to highlight the median, average, or target value on each axis. This is useful when assessing risk or project priorities.
 Use transparency to help with overlapping data points – Overlapping dots get darker, suggesting clusters of data.
 The individual dot is not as important as the general shape – The many points on a scatterplot can be close enough to appear as a mass or line.

Tools
 Scatter and Bubble Plots
 General Tools: Microsoft Excel, Google Sheets;
 Visualization Environments: Tableau, Qlik, Microsoft Power BI; and
 For Developers: D3.js, R, Google Chart API.
 Motion Chart
 General Tools: Gapminder, Google Charts; and
 For Developers: D3.js, R, Google Chart API.

Pictographs

Pictographs are useful for making simple data more approachable and memorable. They use graphic symbols (e.g., a figure of a person to represent people) to depict data.

Types of Data
Single Quantitative Measure – Pictographs show how many there are of something.

Variations
 Dot Matrix Diagram or Icon Array – Graphic symbols are laid out in a grid and colored to denote the group to which they belong. The entire grid represents a denominator and the colored group a numerator (for example, seven out of a hundred people). A symbol can be partially colored to represent a fractional part.
 Symbol Bar Chart – Icons are arranged in a row or column to resemble a bar chart. Like a bar chart, length (determined by number of icons) represents magnitude.

Tips
Tell the same story as your data
 Use natural frequencies – Viewers understand ‘x of 100’ (or 10) better than numeric percentages.


 Beware of volume distortion – If using icon size to show value, correlate to volume rather than height or width; use only one shape.
Improve memory and comprehension
 Choose meaningful icons – Using emotional rather than abstract imagery (e.g., outlines of humans vs. circles) can increase interest and attract viewers.
 Place the icons next to each other for greater impact. Don’t distribute the numerator icons over the entire array unless the point you are making is the randomness of these occurrences. When showing two icon arrays, use the same denominator, to enable effective comparison.

Tools
 All Types
 General Tools: Microsoft Excel (with effort), PowerPoint, or Visio; Adobe Illustrator or Photoshop;
 Visualization Environments: Tableau, Qlik, Microsoft Power BI (all with effort); and
 For Developers: D3.js, R, Google Chart API.

Treemaps

Treemaps are useful for showing relative quantities within hierarchies. Traditional treemaps divide a large rectangle into many smaller rectangles, the size of each rectangle representing a quantity, and the color representing another quantity or a categorical quality. Rectangles can be further subdivided to show hierarchy.

Types of Data
Hierarchical quantitative data – Treemaps represent quantitative data by dividing a space into areas relative to the quantity. They show hierarchy by nesting areas within larger areas. An additional quantitative or categorical dimension can be represented by coloring the areas.

Variations
 Treemap – Rectangles are sized relative to the value and organized in an alternating vertical and horizontal pattern or by category and packed into larger rectangles.
 Circle Packing – Bubbles are sized relative to the value and organized by category and packed into larger circles.
 Note: Multi-tier pie charts (also called sunburst charts) display hierarchical data as well. They are discussed under Pie Charts.

Tips
Tell the same story as your data
 Consider whether your nodes belong in a hierarchy – If you don’t have hierarchical data, consider using a bar chart.
Improve memory and comprehension
 Clearly show and label hierarchy – It is important to label the hierarchies.
 If interactive, provide drill-down capability – Allow viewers to examine one item in the hierarchy more closely and to get details about leaf nodes.

Tools
 Treemap (* can also produce packed circles)
 General Tools: Google Charts, Excel (treemaps starting with 2016);
 Visualization Environments: Tableau, Qlik, Microsoft Power BI; and
 For Developers: D3.js*, R*, Google Chart API.
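
The treemap variation described above, with rectangles sized relative to value and organized in an alternating vertical and horizontal pattern, is often called a slice-and-dice layout. The following is a minimal sketch of that idea, assuming the hierarchy is held in nested dictionaries with hypothetical `name`, `value`, and `children` keys; the budget figures are invented, and the tools listed above implement more refined layouts.

```python
def total(node):
    """Sum of leaf values under a node (nested-dict layout is an assumption)."""
    if "children" in node:
        return sum(total(child) for child in node["children"])
    return node["value"]


def slice_and_dice(node, x, y, width, height, depth=0, rects=None):
    """Return a list of (name, x, y, width, height) tuples for the leaf nodes.

    Even depths split the rectangle side by side (along the width); odd
    depths split it top to bottom (along the height). Each child's share of
    the parent rectangle equals its share of the parent's total.
    """
    rects = [] if rects is None else rects
    if "children" not in node:
        rects.append((node["name"], x, y, width, height))
        return rects

    node_total = total(node)
    if node_total <= 0:
        return rects  # nothing to draw for an empty branch

    offset = 0.0
    for child in node["children"]:
        share = total(child) / node_total
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * width, y,
                           share * width, height, depth + 1, rects)
        else:
            slice_and_dice(child, x, y + offset * height,
                           width, share * height, depth + 1, rects)
        offset += share
    return rects


# Hypothetical program budget, in millions of dollars.
budget = {"name": "Budget", "children": [
    {"name": "Highways", "children": [
        {"name": "Pavement", "value": 30},
        {"name": "Bridges", "value": 20},
    ]},
    {"name": "Transit", "value": 50},
]}

for rect in slice_and_dice(budget, 0, 0, 100, 100):
    print(rect)
```

On a 100 by 100 canvas, each leaf's area is the same fraction of the canvas as its value is of the total, which is the property that lets readers compare quantities within the hierarchy.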


Node-Link Diagrams

Node-Link Diagrams, also known as network graphs, show entities and their relationships. Generally, entities are expressed as nodes (dots), and relationships (or edges) as links (lines).

Types of Data
A finite set of nodes, each representing an entity – Nodes may have a quantitative value, which can be expressed by the size of the node. Nodes can have characteristics represented by color and size. An edge connects nodes to other nodes. Edges also can have size and color characteristics.

Variations
 Arc Diagram – Nodes are placed along an axis with arcs connecting them. The arc’s lines can be colored or made thicker relative to the frequency of the connection.
 Tree Diagram – Boxes or nodes are connected to show hierarchy and relationships, as in the classic organizational chart. They can start with a node at the top or bottom.
 Chord Diagram – Nodes are placed along a circle with arcs connecting them. The arc’s lines can be colored or made thicker relative to the frequency of the connection.
 Force-Directed Graph – Nodes are placed such that connecting edges are about the same length and have as few crossings as possible.

Tips
Tell the same story as your data
 Consider whether your nodes belong in a hierarchy – Nodes in a network diagram may fit into a hierarchy. If they don’t, use arc and chord diagrams.
Improve memory and comprehension
 Watch for overlapping arcs – Too many overlapping arcs can make them difficult to understand. Play with the layout to highlight your intent.
 Minimize the number of nodes – Arc and chord diagrams can be difficult to grasp, but you can help viewers by using them only for data with a limited number of nodes and providing a way to highlight specific chords, arcs, or edges.

Tools
 All node-link diagrams
 General Tools: Microsoft Excel (with NodeXL add-on, Node-link only);
 Visualization Environments: Tableau (with effort), Qlik (with D3.js extension), Google Fusion Tables, Microsoft Power BI (with custom visuals); and
 For Developers: D3.js and R.

2.2 · Other Recommended Chart Types
These additional chart types are fairly common and may be a good choice for particular visualizations. This list is not comprehensive, however, as many unique chart types exist and analysts are constantly developing more.

Tables – One of the most common ways of presenting numbers is in a table, where rows and columns represent some meaningful concept. Tables can be the best way to present a lot of numbers if the important take-away is the number itself. Visual elements, such as color or small graphs (called Sparklines in Excel) can be added to a table for emphasis or to facilitate comparison.


Slope Charts –

Parallel Coordinates – Parallel coordinates charts are a way to visualize the relationship of data along different dimensions. A line connects points for a single item on each dimension (e.g., cars sold in 1978, with less than 100 horsepower and greater than 30 miles per gallon). The axes are independent and have different scales. These can be useful for visualizing survey responses or other complicated datasets.

Spider Charts – A spider chart (also known as a radar chart) is a radial plot of points on some number of dimensions, with a line connecting the points. The overall shape, when compared to other spider charts for different items, can indicate how the items differ along particular dimensions. These can be useful for showing how projects score against different performance measures.

Word Clouds – A word cloud is a collection of words or phrases extracted from a text and arranged in a compact space, with the size of each word/phrase determined by the number of uses within the text. The layout of words can be horizontal only, horizontal and vertical, or at various angles. These can be useful for understanding themes in stakeholder feedback.

Gauges – Gauges graphically resemble mechanical gauges in the real world, such as speedometers, and thus are familiar to most people. They show a single attribute at a glance. These often are used to report performance measures.

2.3 · Common Techniques
Additional techniques can be applied to make the visualization more interesting or more interactive, or to add additional dimensions of data, including:

 Combinations – Many of the different chart types can be combined to create hybrid chart types. For example, a bubble map can use pie charts instead of bubbles, combining the part-of-the-whole visualization of the pie with the location and magnitude visualization of the bubble map;
 Infographic – An infographic is a static display of data visualization charts and words to tell a story. Usually it includes multiple charts, which may be the same or different types;
 Dashboard – A dashboard is a compact display of multiple data visualizations that represent the current state of a process or project. Dashboards are presented on a computer screen and provide “real-time” information (the frequency with which the displays are updated is a function of the data). Typically, the dashboard itself presents simple views of just the key bits of data, but provides a drill-down capability to get more detail;
 Small Multiples – Small multiples show thumbnails of multiple charts (e.g., line chart) in a grid to allow the viewer to find gross differences; and
 Animation – Animation is often used in interactive visualization. Sometimes it helps viewers track changes between different views of data. It can also add an additional dimension (usually time) to a visualization. For example, Hans Rosling uses animated bubble charts to trace the history of health and economics around the world, as shown in this TED talk: https://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen
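
The word cloud entry above sizes each word by its number of uses in the text. A minimal sketch of that counting and scaling step follows, assuming a simple linear mapping from count to font size; the function name, the size range, and the sample comments are hypothetical, and a real tool would typically also drop common words such as "the" and handle layout.

```python
import re
from collections import Counter


def word_sizes(text, min_size=10, max_size=48, top_n=5):
    """Map the `top_n` most frequent words to font sizes between min and max.

    Sizes scale linearly with the number of uses (an assumed mapping).
    No stop-word filtering is applied in this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    if highest == lowest:  # every word used equally often
        return {word: max_size for word, _ in counts}
    scale = (max_size - min_size) / (highest - lowest)
    return {word: min_size + (count - lowest) * scale for word, count in counts}


# Hypothetical stakeholder feedback reduced to key terms.
feedback = "bus bus bus delay delay safety"
print(word_sizes(feedback))  # {'bus': 48.0, 'delay': 29.0, 'safety': 10.0}
```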


Consider each step in the process of developing an effective visualization in order to imbue the finished product with focus and meaning. First, you must acquire and refine a dataset – a process called “data wrangling” in the visualization community – analyze the data, and identify patterns and findings that you can call out visually. To hone your message, identify your intent and audience – who needs to know about your data, what do you want them to think and do about it? For example, do you want them to change their daily behavior or try to change a law?

With data in hand and clear intent, identify and execute a strategy, using appropriate charts and communicating with clarity. Finally, use the best tools to implement and share the project effectively within your organization’s practice.

3.1 · Data Wrangling
Find your data and make it yours
Before you begin visualizing data, you must find, acquire, and prepare it. Analysis and visualization require accurate data that are well-structured for your task. The process of transitioning raw data inputs into presentable data sets has come to be called “data wrangling.”

Martin Wattenberg and Fernanda Viegas – cofounders of the IBM ManyEyes project – note that it is important to work with real rather than mocked-up data, since manufactured data will rarely contain the nuances of the real thing. Wattenberg compares working with real data to getting feedback from real people. Information is Beautiful creator David McCandless observes that sometimes the data may seem boring, and in these cases the practitioner may be able to find additional data to normalize, compare, or merge, or the boredom might be a cue to ask deeper questions.

Jeffrey Heer, co-creator of the Trifacta data-wrangling tool, has cited survey results showing that between 50 and 80 percent of productive time spent by industry data analysts is for formatting and integration. His team uses a process for data wrangling that includes the following tasks:

 Discover the content and patterns in your data. “Sketching” your data can provide a positive feedback loop. Illustrating a dataset can make outliers and patterns in the data obvious where a spreadsheet might hide them, and it simplifies a key thought process – are the outliers mistakes or do they point to a real phenomenon?;
 Structure the data to have only the needed attributes, named and formatted in a way that maximizes comprehension;
 Clean the data to eliminate meaningless or undesirable outliers (e.g., null values reported as 0 or 99);
 Enrich the data with relevant additions that illuminate trends or provide necessary context; and
 Validate the prior steps by, at a minimum, assessing whether each attribute is formatted properly and falls within logical constraints (e.g., percentage sums to 100).

Volume of Data
A transportation practitioner is as likely to need to display a single data point as to portray millions. For example, a report to the residents of a town might wish to convey the 0-9 NBI rating of a local bridge.

This one data point can be placed in context (e.g., a bar chart scaled from 0-9), translated (e.g.,  / ), or illustrated (e.g., diagrams or photographs showing the damage that drives the rating). On the other hand, the same agency may wish to convey a dozen condition metrics on thousands of bridges through a single visualization. Methods that would make little sense for the single data point, such as geographic search functionality, mouse-over information windows, and animation, become sensible for larger datasets.

Volume of data is closely tied to enrichment – you may need to add additional data to provide context and visual interest when you have a small dataset. For


example, West Virginia DOT visualized four alternatives for replacing the Dick Henderson Memorial Bridge, as shown in Figure 6.

Figure 6: Build Alternatives Comparison for the Dick Henderson Bridge (WVDOT)
Text intentionally left small to focus the reader on the overall image.

With the larger data set, the alternatives (i.e., the data points) provided a full context for the data and fulfilled the designer’s intent – to compare the cost, closure time, and maximum grade of each design while also demonstrating the aesthetics of each.

By contrast, once Alabama DOT selected an alternative for the Mobile River Bridge, its intent became selling the project to neighbors by demonstrating the visual impact of the structure. To do this, the Visualization Team enriched the bridge model with four square miles of Downtown Mobile, to allow residents to “see” the bridge from their doorstep. Figure 7 portrays an overview.

Figure 7: 3D Model for the Mobile River Bridge (ALDOT)
https://informedinfrastructure.com/18532/building-a-blockbuster-bridge/

Acquiring Data
Data may be available in-house, but rarely are they already clean and in the ideal format. If data are acquired from a vendor, the format may be negotiable, but adapting them for the chosen visualization platform may still take some effort. Our survey respondents often visualized in-house data, but also often augmented them with free data from several common sources, including:

 Highway Performance Monitoring System (HPMS) – HPMS is an information system maintained by the Federal Highway Administration (FHWA) that is built from required annual submissions by DOTs. Statistics including mileage, pavement condition, traffic volume, and functional classification can be found at http://www.fhwa.dot.gov/policyinformation/statistics.cfm. It is important to note that the website can be challenging to navigate and does not function with all browsers – we recommend Microsoft Internet Explorer;
 National Bridge Inventory (NBI) – Like HPMS, NBI is compiled from annual DOT submissions to FHWA. For each bridge of sufficient size, states are required to provide physical characteristics (e.g., type, length, height), as


well as the results of an annual inspection and condition assessment. The data can be downloaded at https://www.fhwa.dot.gov/bridge/nbi/ascii.cfm. A delimited format is available behind the link for each year, for easy upload into Microsoft Excel or any other data wrangling and analysis tool;
 US Census and the American Community Survey (ACS) – The US Census takes place every 10 years and participation is compulsory for all US residents. To fill in the intervening years, the Census Bureau completes the ACS annually using a sample of households in each census tract (approx. 3.5 million individuals per year). Generally, the ACS will be combined over a 3- or 5-year period. The demographic and employment data (including vehicle ownership and commute mode choice) from the Census and ACS are available through American FactFinder at http://factfinder.census.gov/; and
 GIS Sources – Where an agency standard basemap does not exist, Esri provides options to customers of its ArcGIS package, while OpenStreetMap relies on a worldwide network of contributors to provide a basemap for free with attribution. The primary means of accessing all of these alternatives is through a GIS tool. Beyond basemaps, most states maintain free GIS datasets and formatted layers for public use that are easily found through an online search, as do Federal Agencies such as the US Geological Survey (USGS) and the US Census Bureau Topologically Integrated Geographic Encoding and Referencing (TIGER).

Data may also be acquired through a “data scraper,” a procedural routine – typically based online – that extracts data from websites and documents to convert it into a tabular format (e.g., www.import.io). The practitioner can use these tools to collect and store a live feed over an extended period of time, either for retroactive analysis or to develop a visualization of live data.

Data Wrangling in the Real World

Brian Card and Mike Barry: “Visualizing MBTA Data”
Brian Card and Mike Barry, creators of the prominent visualization project “Visualizing MBTA Data,” described their development process in a lecture at Simmons College in Boston (January 15, 2015). The video of their presentation is available from the WGBH Forum Network at http://forum-network.org/lectures/data-visualization-how-do-it-and-do-it-well/.
(The video is a production of WGBH Educational Foundation © 2015)

Key points of interest, with time stamps of their locations in the video, include:

 Brief Overview of the Product: 1:52 – Card demonstrated an animated map of train positions, a static line chart (with time and space on the axes) with annotations for important events, a heatmap of station entries and exits against time, and a scatterplot of overall transit times (including on-train travel and wait times) for each pair of stations over the course of the day;
 Research: 4:10 – The team discussed which elements of the MBTA would be interesting for users and the public to experience visually. Card emphasized the importance of identifying objectives in advance, because “once you have a dataset, you start thinking in terms of what’s easy to do with that data, instead of what’s important.” Barry and Card chose to focus on locating congestion and delay, illustrating the impact of large events and snowstorms, and giving each user a takeaway about his own commute;
 Brainstorming: 8:00 – Barry and Card brainstormed illustrations iteratively by sketching them on paper and uploading them to Google Docs for comment;
 Data Acquisition: 9:10 – A snapshot of train positions is publicly available from the MBTA. Barry and Card periodically downloaded the snapshots to form a month-long dataset. Each member of the team added a redundant set of records each minute. Merging the datasets resolved missing records, as shown in Figure 8;

Figure 8: Use of Redundant Datasets in “Visualizing MBTA Data” (Brian Card)

 Interpreting Data Elements: 12:00 – Card noted a key record in the train location file: predictions of time-to-station. While the train might not actually take that amount of time (measured in seconds) to reach the station, the value hitting zero indicates that it has arrived. Barry and Card interpreted the data slightly differently from its intent, but used that interpretation to calculate the actual value;


• Data Wrangling Tools: 14:40 – Barry and Card used Node.js (a JavaScript runtime) for processing their files in the JSON format. The visualizations themselves are built in D3.js, and the code is stored in BitBucket because GitHub (a more commonly-used competitor) makes all draft code public; and
• Iteration: 16:10 – “Not all of the ideas that look good on paper look good with real data… we had 6,000 JSON files and no idea what our dataset looked like. The only way that we could look at it was by building visualizations.” Barry and Card built many draft visualizations and tested them against their objectives. If an attempt did not tell their intended story, Barry and Card not only tried again, they attempted to identify elements of the failed attempts that were interesting and could inform future attempts.

Barry takes over the presentation at 17:00 and describes the team’s chart type and stylistic choices. We will pick up the description in Section 3.4.

3.2 · Intent and Audience
What’s your story, and who needs to hear it?

Conceptualizing and planning a visualization project is about telling a story, so you can frame it around your intent and audience:

• Intent is the “nugget of truth” that a visualization must make obvious. This visualization may be the only thing your audience knows about this topic. What do you want that to be, and what do you want them to do as a result?
• Your audience should be comfortable with your tone and level of technical language, so align it to your audience’s role and experience. Make comparisons, allusions, and references that tell your audience “I get where you’re coming from, and I’m meeting you there.”

You should keep your desired outcome – an element of intent – in mind throughout the process. Is your intent simply to inform your audience about a topic, or do you wish for them to take action? If so, what type of action? Do you need to highlight certain elements of a dataset not only because they are interesting, but also because they relate to an important proposal or initiative for which you want to gain support?

Reviewing your data before you start may lead you to an insight to explore through visualization. Amanda Cox, editor of “The Upshot” at The New York Times, recommends that you “learn to sketch with data,” by which she means creating rapid, low-fidelity sketches of various visualizations to identify patterns and findings that will interest your intended audience. This way of designing allows you to put tangible products in front of people for discussion.

Intent

“Intent” is the question you want to answer or the outcome you want to encourage. Generally speaking, a visualization will convey a fact or an argument about a topic. For transportation practitioners, frequent topics include proposed projects, assets (e.g., bridges, roadways, bike lanes), the traveling public, and budgets. In many cases, the transportation practitioner must assume that the audience’s entire understanding of a topic will be driven by a particular illustration.

Being firm in your objectives can help you build a focused visual. In a blog post entitled “visualizing opportunity,” visualization author Cole Nussbaumer-Knaflic demonstrates how a focus on communication leads from a formatted table to a more intuitive view of key characteristics and elements within the dataset. We summarize her process here, beginning with Figure 9.

Figure 9: Initial Formatted Table, “visualizing opportunity” (Cole Nussbaumer-Knaflic)
http://www.storytellingwithdata.com/blog/2015/9/16/visualizing-opportunity

Nussbaumer-Knaflic makes the following immediate refinements:

• The blue background represents a meaningless variation in color, so it is removed;
• The sample size does not lead a reader to any interesting conclusions (i.e., it is not part of her intent), so she moves it to a footnote;
• For a focused visualization, she applies a heatmap to the more easily-understood metric: average score.


After these refinements, the table appears as shown in Figure 10.

Figure 10: Intermediate Formatted Table, “visualizing opportunity” (Cole Nussbaumer-Knaflic)

She then notes that her objective is to show opportunity: how much better could we be doing in each category? So she revises her chart type to a stacked bar with a transparent gap between reported and benchmark performance, yielding her final product as shown in Figure 11.

Figure 11: Final Formatted Table, “visualizing opportunity” (Cole Nussbaumer-Knaflic)

Through this process, Nussbaumer-Knaflic has clarified the context of her data, focused the audience on the most important metric, and communicated additional information about that metric (the opportunity for improvement) by visualizing the data rather than stating it.

Audience

A beautiful and informative visualization does no good if its target audience cannot understand it. An overly technical illustration will not effectively reach an audience of laypeople. A designer can positively impact audience response by playing to its known interests through:

• Visual Cues – Section 3.4 will discuss the use of human-recognizable objects. Beyond using familiar imagery, you may wish to tie the cues directly to your audience. Pictograms of local landmarks and icons as well as color schemes taken from an agency, state, or local university or sports team, can communicate your desire to connect with them;
• Tone – Beyond avoiding technical jargon, your tone should be intentional. If your audience is expecting something casual, formal language will fail to resonate, and vice versa; and
• References – With almost any data project for a local, regional, or State agency, information should be compared locally unless the intent is to place local data in a national or international context.

An example of how these concepts can be applied: Chris Hedden, Dan Krechmer, and Ron Basile of Cambridge Systematics produced a cartoon-based slide presentation (Figure 12) to inform Transportation Planners about connected and self-driving cars.

Figure 12: “The Top Five Things Planners Need to Know About Self-Driving Vehicles”
https://www.camsys.com/insights/top-5-things-planners-need-know-about-self-driving-vehicles

Despite the technical audience, they chose a casual approach to convey the inevitable ubiquity of the technology and the high-level approach of the slides, and to capture an audience that might avoid the topic because it was widely perceived as too complex to address. The audience became open to taking in the technical details because they were presented in an accessible manner. The document achieved record views and inquiries, suggesting that it motivated people to delve into the topic further.


3.3 · Analysis
Are you and your data telling the same story?

Your analytical and aesthetic decisions should reflect the nature of your dataset. Explore how much data you have, how many ways it can vary, and your need to illustrate uncertainty. Selecting a chart type or homing in on a “look” without considering the data may make your visualization difficult to comprehend.

Analysis is part of a feedback loop with Data Wrangling and Intent: if you realize that your data don’t tell the story you wanted, do you clean, manipulate, or add data, or do you want to re-evaluate the argument you are making? Do your outliers signal error, or do they have meaning that you need to consider? Are your data in general trustworthy: do you need to show uncertainty?

Visualizing for an Audience of You

As with the other elements in the feedback loop, one way to make analysis easier is to visualize early and often. It will help you understand the data and, as a result, use it more appropriately. You are creating a visualization because it will illuminate patterns and increase clarity for your audience – take advantage!

In a March 2010 interview with acmqueue, Fernanda Viegas notes the importance of identifying patterns through iterative visualizations:

“[We] spent the whole summer trying to figure out a good way to visualize [Wikipedia] editors, but we kept getting these not-very-useful results. At one point we tried just to get a sense of the shape of the data using bar charts, line graphs, and stack graphs, but that wouldn’t tell us anything either.

Eventually, we decided to try out a very weird technique, which was mapping streams of text to colors. This makes you lose a lot of information because text is really rich and you can only use so many colors. All of a sudden we saw patterns. Someone was going around all of Wikipedia correcting typos; another person was working on images; another was working on stub sorting…

Looking back, we feel that the very first experiments we did with the data were on too high of a level. They were abstracting too much away from the data and not giving you this sort of messiness that Wikipedia has, which is everybody’s there, every day making minute changes… that add up to patterns.

This notion of how close to the data you want to be and what is your question – what is the story you want to tell? – seems to be really important.”

Data Literacy

To be data literate, you must understand what your data both can and cannot be made to communicate, and identify where relevant uncertainty can be shown visually. A lack of absolute certainty is not an impediment to effective visualization, and not all uncertainty is necessary to illustrate. Furthermore, data literacy can aid in the analysis-intent feedback loop – a logical problem often offers an opportunity to improve your message.

Critiques of data literacy and appeals to critical thinking can be found in many forms and from many commentators. In the Data Journalism Handbook, Nicolas Kayser-Bril outlines some of the pitfalls of drawing unsupported conclusions:

“When writing about an average, always think ‘an average of what?’ Is the reference population homogenous? Uneven distribution patterns explain why most people drive better than average, for instance. Many people have zero or just one accident over their lifetime. A few reckless drivers have a great many, pushing the average number of accidents way higher than most people experience.”

(http://datajournalismhandbook.org/1.0/en/understanding_data_0.html)

Applying this principle to a transportation context, it may be the case that the majority of intersections experience below-average accident rates, or the majority of bridges have above-average maintenance records. When visualizing these datasets, you should be prepared both to respond to an audience that points out these “logical flaws” and to reflect them in your intent. Do you want to visualize the difference from the average, or can you reduce your sample set by focusing only on the problem locations?

“Articles about the benefits of drinking tea are commonplace… although the effects of tea are seriously studied by some, many pieces of research fail to take into account lifestyle factors, such as diet, occupation, or sports. In most countries, tea is a beverage for the health-conscious upper classes. If researchers don’t control for lifestyle factors in tea studies, they tell us nothing more than ‘rich people are healthier, and they drink more tea.’”


Once again applying the principle to transportation, a map of mode choice across a region may show lower-income areas commuting by transit more often than by single-occupancy vehicle, except in areas near centers of service and manufacturing employment (which have shifts outside of transit operating hours). It would be insufficient to simply draw conclusions about mode choice in these neighborhoods without accounting for these demographic trends; adding them presents the opportunity to provide your audience with useful insight and illuminate new parts of your data.

Beyond simply showing the audience that the data do not present certain conclusions, you also can develop and visualize scenarios based on varying assumptions, as demonstrated by the Victoria Transport Policy Institute in Figure 13.

Figure 13: “Autonomous Vehicle Sales, Fleet and Travel Projections” (VTPI)
http://www.vtpi.org/avip.pdf

Figure 14: 3D Worldwide Air Pollution Map, where Color Indicates Confidence (Kai Pöthkow, Britta Weber, and Hans-Christian Hege, “Probabilistic Marching Cubes.” Computer Graphics Forum, 30(3):931-940, 2011.)

Figure 15: “Cat’s Eye” Approach to Visualizing Statistical Error (Geoff Cumming)
http://www.psychologicalscience.org/index.php/publications/observer/2014/march-14/theres-life-beyond-05.html

General approaches for visualizing uncertainty include:

• Using a visualization strategy that clearly communicates that the data are not meant to be exact (e.g., shapes instead of columns on a column chart);
• Fading edges, increasing transparency, or in some other manner altering the appearance of conventional data points (as shown in Figure 14); and
• Including error bars (or an alternative approach, the Cat’s Eye, shown in Figure 15).
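
As a minimal sketch of the error-bar approach listed above, the following Python computes a mean and an approximate 95 percent confidence half-width for each group, which are the two numbers a charting tool needs to draw an error bar. The corridor names and travel times are hypothetical, and the normal-approximation multiplier (1.96) is an assumption that suits larger samples; small samples would call for a t-distribution multiplier instead.

```python
import math
import statistics


def mean_with_error_bar(samples, z=1.96):
    """Return (mean, half_width) for an approximate 95% confidence interval.

    Uses the normal approximation z * s / sqrt(n), where s is the sample
    standard deviation. This is an illustrative choice, not the only one.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate spread")
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return mean, z * std_err


# Hypothetical travel times (minutes) observed on two corridors
corridor_times = {
    "Corridor A": [22.0, 24.0, 23.0, 25.0, 21.0, 23.0],
    "Corridor B": [30.0, 18.0, 41.0, 25.0, 36.0, 20.0],
}

error_bars = {
    name: mean_with_error_bar(times) for name, times in corridor_times.items()
}

for name, (mean, half_width) in error_bars.items():
    print(f"{name}: {mean:.1f} +/- {half_width:.1f} minutes")
```

A wider half-width signals less certainty about the plotted value, so the second corridor in this invented dataset would carry a visibly longer error bar than the first.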


Using Visualization to Drive Analysis

Beyond the need to perform analysis to drive your visualization, it is important to recognize your visualization’s potential for informing and facilitating analysis done by others.

For example, the Delaware Valley Regional Planning Commission (DVRPC) developed the Ridescore metric for bicycle accessibility at Philadelphia-area commuter rail stations. Not only does the metric combine many measures of accessibility into one easily-consumed number, it also allows for the data to be presented in a single map. The screenshot in Figure 16 shows this map, which leaves the immediate impression that bicycle accessibility improves the closer one gets to the city center, as well as identifying outliers – suburban stations with superior access for cyclists. The same interface displays the constituent scores when a user clicks on a station.

Figure 16: Ridescore (Delaware Valley Regional Planning Commission)
http://www.dvrpc.org/webmaps/ridescore/

Text intentionally left small to focus the reader on the overall image.

Virginia DOT (VDOT) provides another example in Figure 17. DOTs are adopting dashboards to illustrate system performance, either in a static form (i.e., to report performance to the public) or in an interactive form (i.e., to allow planners and budget-makers to project the consequences of their decisions). Dashboards can greatly facilitate performance-based planning and budgeting, a key mandate of recent federal legislation.

Figure 17: The VDOT Dashboard (Virginia DOT)
http://dashboard.virginiadot.org/default.aspx

Text intentionally left small to focus the reader on the overall image.


3.4 · Choosing a Strategy
Bringing your story to life on the page

Your strategy for visualizing your data represents not only the chart type or types that you include, but also how you customize your charts and illustrations to reflect your intent, your audience, and the elements of your data. Overall, your tasks when choosing a strategy include:

• Selecting a chart type or types;
• Selecting a medium;
• Differentiating your data points; and
• Ensuring that your visualization is useful, clear, and memorable for your audience.

Chapter 2 addressed chart types and their use cases in detail. This section will focus on the other three tasks.

Selecting a Medium

Your medium has a profound impact on your design. Zooming and filtering of data are impossible if the medium is static. If your visualization is intended for a large-scale poster or presentation board, then you can either expand the dimensions of a single visualization or make a greater number of simpler charts.

The form and dimensions of the page or screen can and should drive the arrangement and even the inclusion of information – if it is placed where the audience will have to scroll down, flip a page, or turn around to see it, they may not see it. If the visualization is to be delivered in a printed book, information on some pairs of consecutive pages (i.e., facing pages, which form spreads) is far easier to consume at once than on other pairs, where the pages are on reverse sides of the same sheet.

The possibility of publishing content in web-based documents opens new opportunities for your audience to tour through information and for presenting interactive visualizations naturally in the course of a document. The Washington Post produced a classic best practice for this approach in its 2014 feature “Reimagining Union Station.” The story uses the full width of the user’s screen, with content appearing on multiple panels and at multiple widths to ease mobile viewing. Visualizations include photographs, static illustrations, maps, charts showing demographic and economic data, and interactive renderings.

(http://www.washingtonpost.com/wp-srv/special/business/reimagining-union-station/)

A similar visual production would not have been possible in a printed newspaper, but the level of technical detail (including the budget and funding approach for the project) and reporting would not have been possible in a purely digital medium without text (such as presentation boards or a slide show).

Differentiation

Every visual distinction should communicate useful information to the audience. Elements of your data’s appearance should each reflect an attribute that (a) varies and (b) is important to show varying. We refer to these attributes as “dimensions.” In designing your visualization, you will need to decide how many dimensions to depict. Taking NBI bridge data for a state as an example:

• With zero dimensions, the visualization shows how many bridges there are. This could be accomplished with a stylized number, with a collection of small bridge icons, or with a proportionally-sized box (in reference to some outside point of comparison);
• One dimension could be location (e.g., a map of bridges), NBI condition (e.g., a bar chart), type (e.g., a pie chart or treemap), and so forth;
• Two dimensions could be any pair of the above. For instance, location and condition could be visualized at the same time using a choropleth, with regions colored by average condition; and
• Three dimensions could add another variable. For instance, if time were added to the above, the choropleth map could be animated to show changes in average condition in each region over time.

Tamara Munzner and Torsten Möller discuss dimensions in the language of “marks and channels.” To them, a mark is a “basic graphical element or geometric primitive” – a point, line, area, or volume. A channel is a means of controlling appearance. Möller’s slide presentation on the topic lists position, size, shape, orientation, and hue/saturation/lightness as channels.


Recognizing that orientation is fundamentally an element of shape, and accounting for the possibility of data points appearing or disappearing in an animation or a series of images, you can change five things about the appearance of your data points:

1. Position – You can change where a data point is located on the page on three axes;
2. Color – As noted by Möller, elements of color include hue, saturation, and lightness. Some image editing programs also will allow you to change transparency and add patterns in place of solid colors;
3. Shape – Shapes are not only simple geometry, but human-recognizable objects as well. Shape also includes rotation and orientation;
4. Size – Elements can be proportionally-sized in terms of length, width or area; and
5. Existence – Assuming an animation or a series of static images, data points can appear and disappear between frames.

Because only these five visual characteristics of a data point can change, a maximum of five dimensions can be represented in a visualization. For example, if your data includes 30 dimensions, you will need to iterate through data wrangling, intent and audience, and analysis to identify the five (at most) that tell the best story.

It is possible to make visual choices that have little useful meaning and detract from comprehension. Many visualization tools, for instance, will default to showing each record in a different color based on ID number or name. Referring back to Section 3.2, Cole Nussbaumer-Knaflic noted that the blue background on her table’s header row constituted meaningless color, so she removed it. Making visual choices without clarity implies that you lack clarity about your data and intent.

Memorability and Comprehension

The academic community has produced innovative and important guidance for visualization.

The MassVIS team at MIT (http://massvis.mit.edu/) provides an additional set of recommendations for maximizing recognition and recall. After conducting online experiments that tested subjects’ attention to and retention of visualized information, the researchers concluded that:

• Memorable visualizations have memorable content. While sparse designs with significant white space may be more attractive, something needs to jump out and stick with people. This can be relevant background imagery, bright colors, a unique typeface, etc.;
• Titles and text are key elements. According to the MIT team’s research, the most memorable part of a visualization is the title. Their results also support labels next to the data (as opposed to below the axis) and limited, effective captions;
• Human-recognizable objects (e.g., pictograms) can add to effectiveness. Instead of text-based labels, designers should consider using visual cues or pictures. This extends to bars, columns, and lines, as well – making them resemble a related object improves retention; and
• Redundancy improves comprehension. Repeat elements such as titles, captions, labels and pictograms as much as possible and appropriate among related visualizations.

Figure 18 shows an excerpt from the Florida Transportation Plan that provides memorable imagery and colors, emphasizes important text, and uses human-recognizable objects.
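
The five-characteristic limit described in this section (position, color, shape, size, existence) can be treated as a simple budgeting exercise. The Python below is a rough sketch of that idea, not a method from the report: the function name `assign_channels`, the priority ordering, and the bridge attribute names are all hypothetical. It pairs each data dimension with one visual characteristic and refuses to continue once the five are used up.

```python
# The five changeable characteristics named in this section
CHANNELS = ("position", "color", "shape", "size", "existence")


def assign_channels(dimensions):
    """Pair each data dimension with one visual channel, in priority order.

    `dimensions` should be listed most important first. Raises ValueError
    when more dimensions are requested than there are channels.
    """
    if len(dimensions) > len(CHANNELS):
        extra = dimensions[len(CHANNELS):]
        raise ValueError(f"too many dimensions; drop or combine: {extra}")
    return dict(zip(dimensions, CHANNELS))


# Hypothetical bridge-inventory dimensions, most important first
bridge_encoding = assign_channels(
    ["location", "condition", "bridge_type", "daily_traffic"]
)
print(bridge_encoding)
```

In practice the pairing is a design decision rather than a fixed ordering (location almost always belongs on position, for instance), so a sketch like this is mainly useful as a check that no more dimensions are being encoded than the audience can read.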


Figure 18: Excerpt from the Florida Transportation Plan (Florida DOT)
http://floridatransportationplan.com/

Another key concept from our review of academic research is “congruence” – the idea that visual design decisions should convey a meaning similar to the one conveyed by the data. For example, this would exhibit poor congruence: a chart of hybrid car ownership using green to depict regions with the fewest vehicles and brown to depict regions with the most vehicles. To reference the discussion in Section 3.2, congruence may be audience-dependent. For example, if you are presenting data to a DOT that places its state in a national context, you may choose to represent its state with a color or icon familiar to the audience – such as the main color from the state flag.

Chapter 4 provides more detail on when and how to tailor your style to your audience.

Visualization Strategy in the Real World
Brian Card and Mike Barry: “Visualizing MBTA Data”

The first 17 minutes of Barry and Card’s seminar at Simmons College are discussed in Section 3.1. Moving on from data wrangling, they discussed their strategy and process for visualizing the data.

• Organizing the Information: 17:40 – Barry recounts that “each of the different views of the data answered a different question better than the other views did.” It wasn’t possible to have a single overview. Barry and Card noted that their favorite visualizations were tall webpages that navigate using scrolling (as opposed to links) and chose that approach;
• Innovating through Development: 22:30 – Barry and Card recognized that not only do some ideas not pan out in implementation, but some ideas they did not consider promising look great. Their mantra was “when that happens, just use it everywhere.” Barry gives the example of the line-based system map, which was originally to appear in only one location but was so successful that they added it to multiple other views. In another example, the team experimented with changing the appearance of visualizations and highlighting information in response to the reader hovering over parts of the text. Again, it was effective enough to implement widely;
• Seeking Feedback: 23:40 – “Ways that people use your visualization incorrectly give you really useful feedback. The trick is that they’re correct and your visualization is wrong.” Barry and Card connected with a data visualization professional and sought his insight before completing their project;
• Accounting for Screen Size: 27:42 – Barry and Card developed their visualizations on a MacBook. The test users viewed the project on larger and smaller screens and accordingly recommended that they either “use more of the real estate” or shrink their content to prevent scrolling. The team resolved this with Bootstrap, a web coding library that allows a developer to automatically adapt content to fit screen size. They tested the project with all modern browsers;
• Accounting for Screen Size: 29:30 – Barry and Card added one more visualization at the end of the project. Shown in Figure 19, it allows the user to select any two stops and observe the range of transit and wait times (and


from there the travel time). They had felt that a core question, “How long will my commute take?”, had gone unanswered. Barry refers to their model as a “martini glass” – you start out with wide-reaching overviews, narrow in on specific attributes and data points, and finish by widening back out and allowing for exploration and personalization; and

Figure 19: Travel Time Scatterplot from “Visualizing MBTA Data” (Mike Barry and Brian Card) http://mbtaviz.github.io/

• Implementation: 31:00 – Barry and Card hosted their work on GitHub Pages due to its simplicity, lack of cost, and unlimited traffic accommodation. They added a date and header, and used AddThis to include sharing buttons (partially to grant the site credibility for people stumbling across it). They implemented Google Analytics to track unique visits and visitors. Finally, they added tags to tell social media networks how to render an image, description, and title when the page is shared.

3.5 · Tools and Implementation
Maximizing your visualization toolbox

There are a growing number of tools for creating data visualizations. You can draw simple graphics by hand or create them in a straightforward image editor such as Microsoft Paint or PowerPoint. You can build data-driven visuals in basic tools like Microsoft Excel and advanced tools like Tableau, or build interactive online visualizations using Tableau or a coding library like D3.js. You can also use multiple tools in the process of creating a single visualization.

Choosing the right tool depends on your strategy and your level of expertise. This section describes many of the most useful visualization tools covering a range of strategies and skill levels. We use our professional judgment to define the ease-of-use of each of the tools.

Common Tools and How to Use Them

Map Tools

Creating sophisticated maps has become relatively easy with modern GIS tools. Esri has long been the major player in GIS, but recently open source projects have brought powerful mapping tools within reach of everyone’s desktop.

Esri’s ArcGIS is the gold standard in GIS software. It is a full-fledged professional tool, but even novice users can create simple maps. Developers can create custom interactive web pages and apps using ArcGIS servers, APIs, and software development kits (SDKs).

• Platforms: Windows (desktop and server) | Online via web | API for developing apps and web pages.
• Cost (as of April 30, 2016): Desktop – $1,500 and up | Online – $2,500 for five users and up | Server – $5,000 and up for perpetual license | $100 for personal use | Discounts for non-government organizations, non-profits, and schools.


• Support: Esri provides online documentation and self-service and paid support | Esri Developers Network | Esri-related conferences and user groups | Extensive community of users | Books | Commercial support.
• Publishing online: Via Esri cloud (requires service credits) or your own ArcGIS server.

QGIS is a powerful free and open source GIS. Its capabilities are constantly evolving and can be extended through various free plugins. You can publish your maps on the web if you have access to the necessary equipment and expertise.

• Platforms: Windows | Mac OS X | Linux | Android.
• Cost: Free, open source (GNU General Public License).
• Support: Online community | Online documentation and tutorials | Books | Commercial support.
• Publishing online: QGIS Server and Web Client | Export to Leaflet or other servers.

General Tools

The multi-purpose office tools allow users to build many of the most basic data visualizations and, with practice, they can make elegant visualizations.

Microsoft Office comprises components that include Excel, PowerPoint, Visio, and Power BI. Excel is often a first stop for exploratory data analysis and data wrangling, and can produce a number of data visualizations. PowerPoint can be a good way to combine various visualizations with text to create infographics and visual presentations. Visio is useful for creating drawings. Power BI is a general-purpose visualization environment with a free version that can be published online via a subscription service.

• Platforms: Windows (desktop and cloud) | Mac OS X | Windows (Power BI).
• Cost (as of April 30, 2016): Free (Power BI desktop and service) | $150 and up (Office) or $70 per year and up (Office 365 – cloud) | $300 and up (Visio) or $13 per user per month (Visio for Office 365).
• Support: Microsoft provides online documentation and tutorials | Active user community.
• Publishing Online: Power BI can publish to the Power BI service.

Adobe Creative Suite

Adobe’s Illustrator, Photoshop, and InDesign are often used to polish and enhance visualizations created with other products. You also can use Illustrator to produce some basic visualizations.

• Platforms: Windows | Mac OS X.
• Cost (as of April 30, 2016): Part of Creative Cloud, starting at $9.99 per month for a single application.
• Support: Adobe provides online documentation and tutorials | Active user community.
• Publishing Online: Not available.

General visualization tools allow you to upload data from a variety of sources (e.g., Microsoft Excel, comma delimited, R). Once the data is in place, the application can illustrate it in dozens of ways with limited customization. Finished visualizations can be exported for use in reports and presentations. Some tools facilitate hosting for interactive projects.

Tableau is a general-purpose visualization environment with powerful tools for creating interactive data visualizations. You can combine them into dashboards and stories. The free version, Tableau Public, allows you to publish and reference your visualizations on the Tableau Public site (as long as you can let viewers download your data).

• Platforms: Windows (desktop and server) | Mac OS X | Online via web.


 Cost (as of April 30, 2016): $999 (personal desktop) | $1,999 (professional desktop) | $10,000+ (server) | $500 per user per year (online) | Free (Tableau Public) | Discounts for non-profits and educational use.
 Support: Tableau provides online documentation and self-service support as well as paid support | Tableau-related conferences and user groups | Extensive community of users | Examples readily available (visualizations on Tableau Public can be downloaded and reverse-engineered).
 Publishing Online: Tableau Public | Tableau Online or Server | Hosted visualizations can be embedded in other web pages.

Qlik is a general-purpose visualization environment with powerful and easy-to-use tools for creating interactive data visualizations. With a paid version or cloud hosting, you can embed visualizations or share them on the web. Qlik provides an API that enables you to mash up and extend visualizations in sophisticated web applications.

 Platforms: Windows (desktop) | Online via Web | API for developing apps and web pages.
 Cost (as of April 30, 2016): Desktop – free for personal or internal business use | $20 per user per month for Qlik Sense Cloud | $1,500 per token (one user or ten logins per month) | QlikView Enterprise (server) priced on hybrid server and client access model.
 Support: Qlik provides online forums, consulting, training, and conferences | Active user community.
 Publishing Online: Qlik Sense Cloud (share with up to five others, 250 MB free).

For Developers

Custom, interactive visualizations like those seen in The New York Times generally are developed in JavaScript (an internet browser coding language). To build visualizations using these libraries, you will need software programming skills and comfort with web publishing.

Data Driven Documents, or D3.js, is an open-source JavaScript library that provides powerful visualization components. If you have strong web-development skills, you can find an example visualization that fits your strategy, copy the code, and build your own.

 Platforms: JavaScript | Runs in all recent web browsers.
 Cost (as of April 30, 2016): Free, open source.
 Support: D3.js provides online documentation and lots of examples | Active user community | Vast gallery of examples, many with source code shown.
 Publishing Online: JavaScript scripts in a webpage, any web server.

You can add charts and graphs to Google Sheets, and you can access those same visualizations and data through various APIs. Google Maps is accessible via API, enabling various map-based visualizations. Fusion Tables is an application to gather, explore, and share data tables. It helps you find public data, visualize it, and host it online.

 Platforms: JavaScript | Runs in all recent web browsers.
 Cost (as of April 30, 2016): Free, under terms of Google APIs Terms of Service (https://developers.google.com/terms/).
 Support: Google provides online documentation and forums | Active user community.
 Publishing online: JavaScript scripts in a webpage, any web server.


Data Wrangling

Trifacta enables analysts of all skill levels to work with and manipulate complex data. As much as 80 percent of effort in a visualization project can be absorbed by cleaning and formatting your data, and Trifacta automates parts of that task. Whether you are accessing complex big data or a simple spreadsheet, Trifacta can help you prepare it for a visualization tool like Tableau.

 Platforms: Windows | Mac OS X.
 Cost (as of April 30, 2016): Free (except Wrangler Enterprise – data wrangling for Hadoop).
 Support: Trifacta provides online training, videos, and basic documentation | Active user community.
 Publishing Online: Not applicable.

Tips for Implementing Advanced Visualization

Advanced data visualization can be engaging, beautiful, and informative. It can form the basis for how people think about an entire topic. This type of visualization requires building a toolkit of web development, statistical analysis, software programming, and graphic design. It is enticing to imagine taking an online course, learning a JavaScript coding library, and building a fancy visualization. That will not be possible without first understanding the basics of each skill. This level of comprehension will help you to develop the capabilities in house, bring in the right kind of employee, or hire the right vendor to accomplish the work on your behalf.

To get you started, we provide a job description for an online visualization professional on our website: vizguide.camsys.com/.

R is a power tool for data wrangling and statistical computing that also creates
data visualizations. It is like a software development environment – the basic
package includes a command-line editor and interpreter. RStudio provides a
graphical development environment but still requires you to write scripts.

Several graphics packages make creating plots and charts fairly easy, and Shiny
(also from RStudio) produces interactive web pages.

 Platforms: Linux/Unix | Windows | Mac OS X.


 Cost (as of April 30, 2016): Free, open source (GNU General Public
License).
 Support: R provides online documentation | Active user community.
 Publishing online: Through packages like Shiny by RStudio (which has both
free and supported versions).


3.6 · Putting It All Together

One practitioner’s example

Members of our team worked with the New Hampshire DOT to develop a Sankey Diagram for the department’s Transportation Asset Management Plan.

The chart shows the flow of funds from revenue sources on the left – through funds and programs in the center – to uses on the right, all proportionally sized and colored by revenue source.

Figure 20 shows the chart, and the bullets below walk through how we considered the elements of this Guide to produce it.

Figure 20: New Hampshire Funding Flows – Typical Year (New Hampshire DOT, 2015)

 Data Wrangling – We held a workshop to explain the types of data we needed and what we planned to do with it. We collected written documents (e.g., Citizen’s Guide to the Transportation System and annual reports for the Turnpike and DOT) and spreadsheets (e.g., a comprehensive budget book) describing cash flows.
We had a sense of who the audience would be and the story we wanted to tell, so we refined the data so it had common revenue, program, and expenditure categories and names. This took some effort.
 Intent and Audience – The audience for this chart includes the public, FHWA, internal staff, and legislature. The intent was to explain to this audience how money is spent on different asset management programs, by asset (i.e., how much did you spend on maintenance and how did you pay for it?).
We wanted to highlight connections among revenue, programs, and investment categories. As we sketched with stacked bar charts, we could see how revenue tied to programs but not how it related to expenditures. We needed something that had more connections.
 Analysis – The Sankey requires that every flow balances. The DOT does not
manage their income and investments like this, so we needed to make some
assumptions to tie them together. We went back and modified the data,
creating a hypothetical fiscal year that explicitly ties the flows together
through the whole process. We checked with the fiscal folks to make sure
that these assumptions were appropriate.
 Choosing a strategy - The Sankey Diagram was effective at communicating
our intent to our audience. We wanted to make clear how the revenue
sources flowed through the diagram, so we kept them in the same color
scheme (e.g., all toll revenues are in blue). We added text throughout to
help the reader understand the chart. We also experimented with the
organization of the flows to ensure readability.
 Tools and implementation - We used Excel to wrangle the data. We
generated the diagram using SankeyMatic, a free online tool built in
JavaScript, but easy to learn for those without coding experience. The final
graphic was built in Adobe Illustrator by tracing a screenshot of the raw
diagram; this allowed us much more control over the look and feel of the
chart.
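
The balancing requirement described under Analysis can be checked before any diagram is drawn. The following is a minimal sketch in Python, not the workflow the team used (they worked in Excel and SankeyMatic). The flow names and amounts are hypothetical placeholders rather than NHDOT data, and the `Source [amount] Target` line format is assumed from SankeyMatic's documented input syntax, so verify it against the tool before relying on it.

```python
from collections import defaultdict

# Hypothetical flows (source, target, amount in $ millions); NOT NHDOT data.
FLOWS = [
    ("Toll Revenue", "Turnpike Fund", 120.0),
    ("Gas Tax", "Highway Fund", 80.0),
    ("Turnpike Fund", "Maintenance", 70.0),
    ("Turnpike Fund", "Construction", 50.0),
    ("Highway Fund", "Maintenance", 30.0),
    ("Highway Fund", "Construction", 50.0),
]


def unbalanced_nodes(flows, tolerance=0.01):
    """Return {node: inflow - outflow} for middle nodes that do not balance."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for source, target, amount in flows:
        outflow[source] += amount
        inflow[target] += amount
    # Only nodes with both inflows and outflows (funds and programs in the
    # center of the diagram) must balance; pure revenue sources and pure
    # uses are the two ends of the diagram.
    middle = set(inflow) & set(outflow)
    return {
        node: inflow[node] - outflow[node]
        for node in sorted(middle)
        if abs(inflow[node] - outflow[node]) > tolerance
    }


def to_sankeymatic(flows):
    """Format flows as 'Source [amount] Target' lines (assumed input syntax)."""
    return "\n".join(f"{s} [{a:g}] {t}" for s, t, a in flows)


if __name__ == "__main__":
    print(unbalanced_nodes(FLOWS))   # an empty dict means every fund balances
    print(to_sankeymatic(FLOWS))
```

Any node reported by `unbalanced_nodes` marks a place where an assumption is still needed to tie revenue to expenditures.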


Establishing and adhering to style standards is essential to maintaining a consistent brand image. This section covers recommendations for key elements such as color, font, and responsiveness. We include guidelines on a number of Federal requirements to address whenever you are producing visuals for print or web.

4.1 · Basic Design Principles

There are some basic concepts to keep in mind for all designs:

 Keep it simple: Visualizations should convey only the essential elements of the concept, keep text to a minimum, and be easily understood;
 Make it clear: To help guide the eye, establish anchors in the visual. Choose fonts that are easy to read. Choose a readable font size and increase it for key statements to make them stand out. Use overlays to continue building on a visual and create emphasis through differentiation of format (e.g., position, color, shape, size, and existence as described in 3.4) and font; and
 Be consistent: Once you decide on a style (color scheme, fonts, etc.) stick with it. The audience will know what to expect and will not be distracted by changes in the look of your visualizations.

4.2 · Font

When making font choices, focus on readability and suitability. To be readable, align your text for comfortable spacing between words and choose a font that is appropriate for its intended purpose. To be suitable, consider the design intent of the typeface (font). If it was intended for a sign, do not use it to annotate paragraphs. Be aware of the suitability of serif and sans serif fonts. (https://www.fonts.com/content/learning/fontology/level-4/fine-typography/legibility)

Best Practices

 Capitalization: A mixture of upper and lower case letters is easier to read quickly and accurately, compared with text in all upper case, and takes up less valuable space on the visualization. Be mindful that using more than seven consecutive upper case words will force the audience to reread that section. When all words have equal weight, your audience will have difficulty prioritizing their importance.
 Size: When your font is too small, it is difficult to read. When your font is too large, it limits the number of words you can fit in the same amount of space. According to BootstrapBay (https://bootstrapbay.com/blog/web-typography-best-practices/), 38 pixels is the average headline size, but 20-to-32-point fonts are most frequent for headlines for web typography. For web body copy, size 14-to-16-point fonts are most common.
 Responsiveness: When producing web copy, consider that people access it from many screen types. A font appears differently on a smartphone screen compared with a tablet, which is different from your laptop, or your desktop. Choose a responsive typography so it scales to fit the size of each user’s browser.

4.3 · Color

“Color creates emotion, triggers memory, and gives sensation” – Gael Towey, Creative Director, Martha Stewart Living

Vocabulary

Choosing the appropriate color that looks attractive and differentiates dimensions of your visualization is critical. Color Wheel Artist describes the color vocabulary (http://color-wheel-artist.com/hue.html)


 Color/Hue: The twelve purest and brightest colors, including the three primary colors (red, blue, yellow), three secondary colors (violet, green, orange), and six tertiary colors (blue-violet, red-violet, yellow-green, blue-green, yellow-orange, red-orange);
 Tint: The lightened version of any color, also known as pastel, created by adding white. Tints can range from slightly lighter than the hue to almost white;
 Shade: The darkened version of any color, created by adding black. Shades can range from barely darker than the hue to almost black; and
 Tone: The grayed version of any color, created by adding both white and black. Tones typically are considered more appealing color combinations than simple tints or shades.

Palette
When creating visualizations, you need to determine a color palette that can be used consistently throughout the design. The palette can be monochromatic (using only one color), black and white, full color, or neutral. Color palettes for visualizations typically comprise a primary palette and secondary palette. The primary palette includes the colors that are used most frequently, while the secondary palette provides additional complementary colors that can be used as needed throughout the design. The secondary palette colors often are bright, as they are intended to be accents.

Google has created a helpful online video demonstrating that within the selected primary palette alone, there are various options regarding tints and shades of the hue, before incorporating secondary palette accents:

https://design.google.com/videos/palette-perfect/

Scheme
Color schemes are informed by color palette. Depending on the selected color palette, the color scheme will include the tints, shades, and tones of the primary colors and the accent colors used in all designs. When selecting a color scheme, it is important to consider color insensitivity and color blindness. When using reds and greens together, choose highly saturated, darker shades rather than light tints, and use thicker lines. Color blindness is fairly common and falls across a wide spectrum; it can be helpful to identify a few people in your workplace with color blindness who can help to test your materials. Also consider how easily distinguishable the various tints, shades, and tones are from one another within the same hue family, as strong visuals use color to clearly demarcate separate pieces.

There are many resources on color theory that define good color combinations to choose for your scheme. If you aren’t sure where to get started, see Pantone for excellent advice:

https://www.pantone.com/pages/MYP_myPantone/mypantone.aspx

When choosing a color scheme to represent continuous data (e.g., in a heat map), it is best to avoid using the colors of the rainbow:

 There is no “greater than” and “less than” order to colors the way there is with light to dark.
 It is difficult to spot differences: Human eyes are not good at detecting color differences. This makes it difficult to spot differences among dimensions.

Contrast

When choosing colors for text and background, contrast is key. If you plan to stray from black text on a white background, you should consider the transparency/opacity of your text. Transparency indicates how easy it is to see through the color; opacity indicates how difficult it is to see through the color. Light text on a dark background typically requires a higher level of opacity than dark text on a light background. Brightness is another factor to consider. Our eyes find it easiest to read text that is different in terms of color and in brightness from the selected background. Choosing contrasting colors, such as colors on the opposite side of the color wheel, helps to ensure legibility. For example, dark violet text does not work well against a blue background, but it reads well against yellow (particularly light yellow).

There are online color contrast checkers that can help you verify whether you have chosen colors with ample contrast ratios, e.g., WebAIM:

http://webaim.org/resources/contrastchecker/


4.4 · Federal Requirements for Style

Under Section 508, the federal government outlines a number of standards to guarantee equal access to information conveyed electronically for those with or without disabilities. All visualizations developed, procured, or maintained by federal departments and agencies must comply with the standards. To familiarize yourself with these standards, use these resources:

https://www.fcc.gov/general/section-508-information

http://www.fhwa.dot.gov/publications/research/general/03074/index.cfm

Technical Standards

 Provide a text equivalent for every non-text element (e.g., via “alt”, “longdesc,” or element content);
 Synchronize equivalent alternatives for any multimedia presentation;
 Design webpages so all information conveyed in color also is available without color, as in context or markup;
 Organize all documents to be readable without associated style sheets;
 Provide redundant links for each active region of a server-side image map;
 Provide client-side image maps rather than server-side image maps, except where regions cannot be defined with an available geometric shape;
 Identify both row and column headers in data tables;
 Use markup to associate data cells and header cells for data tables that have two or more logical levels of row or column headers;
 Title frames with text that facilitates frame identification and navigation;
 In the instance that compliance cannot be accomplished in another way, provide a text-only page, with equivalent information/functionality, to ensure your website complies with stated requirements. Update the text-only page each time the primary page changes;
 For pages using scripting languages to display content or create interface elements, identify all information provided by the script with functional text that assistive technology can read;
 When a webpage requires an applet, plug-in, or other application to interpret page content, include a link to a plug-in or applet that complies with §1194.21 (a) through (l);
 Create electronic forms completed online to allow people using assistive technology to access the information, field elements, and functionality required for completion and submission of the form, including all directions and cues; and
 Provide a method to allow users to skip repetitive navigation links.

Formatting Documents
 Figure captions must describe the chart in the caption title, as in Figure 21.
Figure 21: Labor Division by Income Level (Cambridge Systematics)
[Figure: column chart titled “Labor Division by Income”; categories Clerk, Designer, Artist, CEO; series Low Income, Medium Income, High Income; vertical axis from 0 to 6]


 Equations and formulas should be presented as images (not as text or in a text box) and inserted into the document. Not all computers, printers, and operating systems can interpret special math and scientific symbols and fonts. Number and caption each equation as a figure, as shown in Figure 22:
Figure 22: Example of a Formatted Equation (Cambridge Systematics)

 Alt text and table summaries for figures, tables, and equations must be clean. Where needed, break them into long descriptions (the HTML Validator [https://validator.w3.org/] suggests a maximum of 75 characters; others suggest 100). When cleaning, make no references to color, remove special characters if they’re not necessary (though they are allowed), and spell out acronyms in summaries and alt text.

Image Requirements
The Section 508 requirements specifically for images include:

 Image weights are less than 30 K (when possible without making illegible);
 Image widths are less than 420 pixels (when possible without making
illegible);
 Save all files as fig[section#][figure# in section] (e.g., “fig41.jpg”), using all lowercase letters; and
 File names should never exceed 20 characters or contain dashes, special
characters, or spaces (only underscores).
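
The image rules above lend themselves to an automated pre-publication check. The sketch below, in Python, is one way to encode them; it is illustrative rather than an official validator. It assumes that “30 K” means 30 kilobytes, that the 20-character limit counts the full file name including its extension, and that the caller supplies the width and file size (the function name `check_image` and its parameters are hypothetical).

```python
import re

MAX_NAME_LENGTH = 20            # assumed to include the extension
MAX_WIDTH_PX = 420              # widths should be less than this
MAX_WEIGHT_BYTES = 30 * 1024    # assumes "30 K" means 30 kilobytes

# Lowercase letters, digits, and underscores, then a dot and an extension.
NAME_PATTERN = re.compile(r"^[a-z0-9_]+\.[a-z0-9]+$")


def check_image(file_name, width_px, size_bytes):
    """Return a list of problems; an empty list means the image passes."""
    problems = []
    if len(file_name) > MAX_NAME_LENGTH:
        problems.append("file name exceeds 20 characters")
    if not NAME_PATTERN.match(file_name):
        problems.append(
            "file name must use only lowercase letters, digits, and underscores"
        )
    if width_px >= MAX_WIDTH_PX:
        problems.append("image width is not less than 420 pixels")
    if size_bytes >= MAX_WEIGHT_BYTES:
        problems.append("image weight is not less than 30 K")
    return problems


if __name__ == "__main__":
    print(check_image("fig41.jpg", 400, 20_000))
    print(check_image("Fig-41.jpg", 500, 40_000))
```

Because the width and weight limits apply only “when possible without making illegible,” treat those two findings as warnings to review rather than hard failures.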


To envision information … is to work at the intersection of image, word, number, art.
Edward Tufte

Transportation planning is a field and an industry built for visualization. Information of relevance to planners can be readily illustrated, be it the design alternatives for a project, traffic flow in the peak hour, bicycle mode share, or color-of-money. Transportation professionals must also frequently communicate plans, objectives, and justifications to lay stakeholders and a public in which “everyone who drives thinks they’re a traffic engineer.”

Visualization is… taking advantage of the fact that we are so programmed to understand the world around us in terms of what we see.
Fernanda Viegas

When visualizing information, you should expect that many in your audience will likely “just look at the pictures.” Not only should a visualization tell a story, but it should tell a complete story, with a subject, a function, and a desired outcome.

Visualizations need to succeed in two areas: be engaging, and be easily understandable.
Jean-Daniel Fekete

Because your visualization is designed around your audience, you should use imagery that speaks to them. Use colors that correspond to a client’s logo, or to a local sports team. Use human-recognizable objects to create pictograms in place of bar or bubble charts. Use logos in place of text to take advantage of their brand equity and immediate recognition. Above all, provide information clearly to send the message that you are both a reputable and innovative source for that information.

The first sign that a visualization is good is that it shows you a problem in your data.
Martin Wattenberg

For even simple datasets, visualizing can provide insight that leads to better data and, in turn, to better visualizations. This positive feedback loop is at the core of complex and interactive data visualization, and the refinement of both end products increases with each iteration.

Text intentionally left small to focus the reader on the overall image.


This appendix lists examples of best practice visualization, selected by the project team. They are also provided on
the “Examples” tab of the Vizguide website (vizguide.camsys.com).

Opening Notes
Best practice examples of transportation visualizations were collected from several sources, including:

 Best practices discovered during background research and project development;
 Best practices encountered by the project team in the course of their professional lives; and
 Best practices submitted as part of our Visualization Survey of Transportation Practitioners.

For presentation both in this appendix and on the website, the visualizations were grouped into 10 subjects:

 Asset Management
 Connectivity, Accessibility, and Livability
 Environmental
 Transit
 Highway Mobility
 Performance Based Planning
 Freight
 Safety
 Socioeconomic
 Pedestrian and Bicycle


Asset Management
NHDOT Funding Flows – Typical Year
New Hampshire Department of Transportation, 2015

New Hampshire DOT (NHDOT) included this Sankey Diagram in its inaugural Transportation Asset Management Plan (TAMP). It shows the flow of funds from revenue sources on the left – through funds and programs in the center – to uses on the right, all proportionally sized and colored by revenue source.

The diagram required a complex set of amalgamations and assumptions to align NHDOT’s revenue and expense datasets, which do not perfectly balance and had never previously been illustrated together. This was a task of both Data Wrangling and Analysis – cleaning the data tables so that they were speaking the same language and then modifying them to cleanly distribute the revenue of a hypothetical fiscal year.

The diagram itself was generated using SankeyMatic, a free online tool built in JavaScript, but easy to learn for those without coding experience. The final graphic was built in Adobe Illustrator by tracing a screenshot of the raw diagram.


Arlington Visual Budget
Town of Arlington, Massachusetts, 2016

The Town of Arlington, Massachusetts developed “Visual Budget” to clearly communicate its investment decisions to residents and taxpayers. What makes the tool unique is that in addition to the pictured dashboard, visitors to the website land on a “tour” function that highlights sections of the page with key facts and interpretation. In addition, taxpayers can input their property tax assessment and convert some mouseover figures to show their own contribution.

The website is fully open-source, with JavaScript code and data easily accessible on the website. Not only is the underlying data provided in JSON format (for the site itself), it is also provided in a CSV file that can be easily imported into Microsoft Excel.

An advanced user could modify the open-source code to create a similar tool based on any entity’s budget. However, the CSV file allows laypeople to create the included charts in Excel, Tableau, Microsoft Power BI, or similar programs. The site primarily uses treemap and stacked area charts in addition to formatted text.
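
To illustrate the “own contribution” idea with a budget CSV, the following Python sketch allocates one taxpayer's bill across budget lines in proportion to each line's share of the total. This proportional rule is our assumption about how such a feature could work, not a description of the Visual Budget code; the column names, departments, and dollar amounts are hypothetical, and the sketch takes a tax bill rather than an assessed value to avoid assuming a tax rate.

```python
import csv
import io

# Hypothetical budget lines; NOT the Town of Arlington's actual data.
BUDGET_CSV = """department,amount
Education,60000000
Public Works,25000000
Public Safety,15000000
"""


def load_budget(csv_text):
    """Parse CSV text with 'department' and 'amount' columns into a dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["department"]: float(row["amount"]) for row in reader}


def personal_contribution(budget, tax_bill):
    """Split one tax bill across departments in proportion to budget share."""
    total = sum(budget.values())
    return {dept: tax_bill * amount / total for dept, amount in budget.items()}


if __name__ == "__main__":
    budget = load_budget(BUDGET_CSV)
    for dept, dollars in personal_contribution(budget, 8000.0).items():
        print(f"{dept}: ${dollars:,.2f}")
```

The same dictionary could feed a treemap or stacked area chart in whichever charting tool you prefer.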


Connectivity, Accessibility, and Livability
Local Accessibility and Mobility Analysis
Champaign County Regional Planning Commission, 2014

The “Local Accessibility and Mobility Analysis” (LAMA) is an appendix to CCRPC’s long-term transportation plan. It explores regional variation in land use and transportation and relates that variation to travel patterns. The charts summarize the public comment, display the results of mobility and accessibility analysis, and compare these results to estimated travel behavior in the neighborhood.

LAMA was developed in R using data not only from CCRPC but also from the Illinois Secretary of State, the US Census Bureau, and Esri. The primary audience was local planners and decision-makers, who might be familiar with the community and with planning principles but who do not have analytical expertise in transportation.

A novice could replicate the column charts in Microsoft Excel and the map in ArcMap or QGIS. In CCRPC’s case, the positive feedback on the static version of the charts has led to funding for an online version, which can highlight subsets of the information for each neighborhood and focus the attention of the reader.


RideScore
Delaware Valley Regional Planning Commission, 2015

RideScore is a metric and online tool that describes bicycle accessibility where first- and last-mile issues are most important: commuter rail stations, trolley, and subway termini outside of Center City Philadelphia. The overall RideScore for each location is the sum of ten 0-5 components (hence the total score is 0-50). Each component score represents a categorization of a metric (e.g., transit vehicles/day, number of cultural resources, population density) within a given radius.

DVRPC incorporated data from SEPTA, the National Establishment Time Series (NETS), the National Center for Education Statistics (NCES), the City of Philadelphia, and the US Census Bureau. The tool features a point-and-line map built in JavaScript, which requires expert-level knowledge of web development.

However, it can be replicated by less-experienced designers in GIS software including ArcMap, QGIS, and Google Earth. The bar charts can be replicated in Microsoft Excel, Tableau, and many other tools.
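
The scoring scheme described above (categorize each metric into a 0-5 component, then sum the components) can be sketched in a few lines of Python. The cut points below are hypothetical placeholders, since DVRPC's actual thresholds are not given here, and only two of the ten components are shown for brevity.

```python
from bisect import bisect_right

# Hypothetical cut points: five ascending thresholds split each metric into
# scores 0-5. These are NOT DVRPC's actual thresholds.
CUT_POINTS = {
    "transit_vehicles_per_day": [10, 50, 100, 200, 400],
    "population_density": [500, 1000, 2500, 5000, 10000],
}


def component_score(value, cut_points):
    """Return 0-5: the number of thresholds the value meets or exceeds."""
    return bisect_right(cut_points, value)


def ride_score(metrics, cut_points=CUT_POINTS):
    """Sum the component scores for every metric that has cut points defined."""
    return sum(
        component_score(value, cut_points[name])
        for name, value in metrics.items()
        if name in cut_points
    )


if __name__ == "__main__":
    station = {"transit_vehicles_per_day": 120, "population_density": 1000}
    print(ride_score(station))
```

With all ten components defined, the total ranges from 0 to 50, as in the RideScore description.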


Environmental
Massachusetts Clean Energy and Climate Plan
Massachusetts Executive Office of Energy and Environmental Affairs, 2015

The Massachusetts EEA provides an example of a line graph that resembles many others in public sector planning documents. Points of interest in this instance include:

 EEA has directly compared sustainability metrics by converting them to consistent units (CO2 equivalent).
 EEA has placed general residential, commercial, and industrial uses on the same graph as the specific uses of transportation and electricity consumption. This both illustrates the magnitude of emissions from these specific sources and highlights the sharp decline in emissions from electricity generation relative to the other uses.
 EEA has used dotted lines to distinguish projections.

This plot can be reproduced in Microsoft Excel by someone who can generate line graphs or scatterplots with multiple series.


Global Carbon Budget


Future Earth and the Global Carbon Project, 2015

The “Global Carbon Budget”


infographic uses primarily line charts as
well as a choropleth map to relate
trends in national carbon dioxide
emissions to global warming rates. The
most powerful design decision was tying
the color blue to declines and using it
throughout the graphic, repeatedly
emphasizing the style not only in the
charts but also each time a decline is
mentioned in the text.

This decision allows the reader to easily


pick out that the USA, EU, Russia,
Japan, and Canada (intuitively the
nations that developed the earliest)
have begun to decline in their
emissions, while China and India have
experienced steep increases. These
charts are also shaded below the line to
create Area Graphs, subtly hinting at
the large total emissions resulting from
those generation rates.

Line and area graphs such as these can
be reproduced by a novice user of
Microsoft Excel, though some skill is
required to recognize and introduce
stylistic cues such as the colored and
dashed lines and the subtle inclusion of
varying forecasts for temperature
increase.

EV Charger Use by Type
Oregon Department of Transportation, 2016

ODOT mapped and plotted Electric
Vehicle (EV) charging station data from
vendors to identify trends for the
general public. It becomes immediately
clear from viewing the page that EV
charger use is highest in the Willamette
Valley and on Fridays, and that use of
DC Fast Chargers has increased
steadily over the past 3 years.

These visualizations were created in
Tableau Public, allowing for updates
and mouse-over text that provides
details on individual locations, days,
and dates. The line graph and stacked
column chart can be replicated in static
form in Microsoft Excel by someone with
relatively rudimentary skills. The pie
chart/bubble map combination can be
replicated with some effort in Excel and
Microsoft PowerPoint (by copying each
pie onto a map in PowerPoint), but is
much easier to execute in Tableau.

Transit
Transit and Density
Alain Bertaud and Harry W. Richardson, 2004

Bertaud and Richardson have
compelling data analysis to support
their argument that greater density
leads to greater transit use, but this
illustration might have been all they
needed. The darkened silhouettes of
Atlanta and Barcelona strikingly convey
the contrasting density of the cities and
the potential relationship of that density
to the prevalence of transit lines.

Only novice skills in ArcGIS or QGIS
are required to replicate this graphic.
Either tool will allow a designer to
isolate and darken developed parcels
(or census tracts, etc.) and overlay
transit lines, all using free public
shapefiles.

Visualizing MBTA Data
Mike Barry and Brian Card, 2014

Mike Barry and Brian Card created
“Visualizing MBTA Data” as a class
project at Worcester Polytechnic
Institute. Drawing on the MBTA’s live
data feed of subway train locations,
they computed and illustrated the
progress of trains over the course of a
day using animation and position-time
line graphs. In addition, they made their
data personally relevant to the viewer by
including averages and distributions of
wait, transit, and travel time (the last
being the sum of the first two).
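
The relationship described above, with travel time as the sum of wait time and transit time, can be sketched as follows. The trip records are invented, and the choice of mean and median as summary statistics is an assumption rather than a description of Barry and Card's code.

```python
from statistics import mean, median

# Hypothetical trip records: (wait_minutes, transit_minutes).
trips = [(3.0, 12.0), (6.5, 11.0), (1.0, 14.5), (4.5, 12.5)]

# Travel time is the sum of wait time and in-vehicle (transit) time.
travel_times = [wait + transit for wait, transit in trips]

summary = {
    "mean_wait": mean(w for w, _ in trips),
    "mean_transit": mean(t for _, t in trips),
    "mean_travel": mean(travel_times),
    "median_travel": median(travel_times),
}
```

The full `travel_times` list, rather than only its average, is what a distribution chart such as a histogram would be drawn from.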

Barry and Card built their project using
the D3 JavaScript library. The interactive
and animated elements of the website
cannot be replicated without at least
intermediate coding experience. An
intermediate user of Microsoft Excel
could replicate some, but not all, of the
visualizations in static form.

Remix
Remix is a transit planning start-up with
a visually compelling browser-based
visualization and analysis tool. It allows
the user to draw transit service on a
map, place stations, and forecast
operating costs, performance, and
economic impact based on many
integrated datasets.

In this series of screenshots, Remix uses
isochrones (areas reachable within a given
travel time, shaded like a heat map) to
illustrate the areas that a transit user in
San Francisco can reach using not only
the City’s Muni network but also
all of the Bay Area’s independent and
uncoordinated transit providers.

A user can of course replicate this
visualization by using Remix itself. In
addition, a novice in ArcMap or QGIS
could produce a colored map of transit
services provided that they acquired the
shapefiles. ArcMap has the capacity for
network analysis (to produce the
isochrones) based on the underlying
dataset, but this functionality is
purchased separately and requires
expert-level skill with the software.
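
As a rough illustration of the network analysis behind isochrones, the sketch below runs Dijkstra's algorithm over a small travel-time graph and groups the reachable nodes into time bands. The network, node names, and band breaks are hypothetical; this is not how Remix or ArcMap implement the feature.

```python
import heapq


def travel_times(graph, origin):
    """Dijkstra's algorithm: shortest travel time from origin to each reachable node."""
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter time was already found
        for neighbor, minutes in graph.get(node, []):
            candidate = elapsed + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


def isochrone_bands(times, breaks):
    """Assign each node to the first band (in minutes) whose limit it meets."""
    bands = {limit: [] for limit in breaks}
    for node, minutes in sorted(times.items()):
        for limit in breaks:
            if minutes <= limit:
                bands[limit].append(node)
                break
    return bands


# Hypothetical network: node -> [(neighbor, travel minutes)].
network = {
    "A": [("B", 5), ("C", 12)],
    "B": [("C", 4), ("D", 20)],
    "C": [("D", 10)],
    "D": [],
}

times = travel_times(network, "A")
bands = isochrone_bands(times, breaks=[10, 20, 30])
```

On a map, each band would be drawn as a shaded area around the stops it contains.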

Highway Mobility
Transportation Outlook
Mid-America Regional Council, 2014

MARC has inserted iconography and
formatted text into this table of
transportation system performance data
and trends. Each panel represents one
of the agency’s guiding objectives and
relates that larger concept to
performance metrics and goals. The
table uses color to overcome the
challenge that the agency desires some
metrics to decline and others to
increase: green always represents a
“good” trend and red a “bad” one.

This table was likely generated in
Microsoft Excel and can easily be
replicated by a user with a working
knowledge of cell formats and time to
place and recolor the arrows.
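
The coloring rule described above (green for a favorable trend, red for an unfavorable one, regardless of direction) can be sketched as a small function. The metric names and values are hypothetical, and the gray color for "no change" is an added assumption, not something shown in MARC's table.

```python
def trend_color(change, desired_direction):
    """Return 'green' for a favorable change and 'red' for an unfavorable one.

    change: numeric change in the metric (positive means it increased)
    desired_direction: 'up' if increases are good, 'down' if decreases are good
    """
    if desired_direction not in ("up", "down"):
        raise ValueError("desired_direction must be 'up' or 'down'")
    if change == 0:
        return "gray"  # assumption: a flat trend is shown as neutral
    improving = (change > 0) == (desired_direction == "up")
    return "green" if improving else "red"


# Hypothetical metrics: (name, change, desired direction).
metrics = [
    ("transit ridership", 4.2, "up"),
    ("traffic fatalities", -3.0, "down"),
    ("vehicle miles traveled per capita", 1.1, "down"),
]

colors = {name: trend_color(change, want) for name, change, want in metrics}
```

In Excel, the same logic could be expressed as a conditional formatting rule applied to the arrow cells.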

Top US Interstates
Federal Highway Administration, 2015

This FHWA diagram illustrates the
length of Interstates and compares it to
annual miles traveled, but it’s
memorable mostly because it’s so
compelling visually. The color scheme,
iconography, and formatted text all
draw the eye into the nested donut
charts. Below, the infographic uses both
upward and downward-facing column
charts to compare the top 15 Interstates
by mileage and volume.

While the diagram was likely created in
Adobe Illustrator or a similar advanced
graphic editing tool, each element can
be created in Microsoft Excel and
PowerPoint with the exception of the
curving text in the donut charts. A
partial donut chart can be mimicked by
creating a blank placeholder category
and individually coloring it gray, while
the downward-oriented columns can be
replicated either by entering the data as
negative numbers or by reversing the
value axis in the axis options.
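
The two workarounds above amount to simple data preparation before charting. The sketch below shows one way to compute the placeholder slice and the negated column values; the function names and numbers are hypothetical.

```python
def partial_donut(value, full_scale):
    """Return (filled, placeholder) slice sizes for a partial donut chart.

    The placeholder slice is the one that would be colored gray.
    """
    if not 0 <= value <= full_scale:
        raise ValueError("value must lie between 0 and full_scale")
    return value, full_scale - value


def downward_columns(values):
    """Negate values so that a column chart draws them below the axis."""
    return [-v for v in values]


# Hypothetical inputs.
filled, placeholder = partial_donut(30, 100)
mirrored = downward_columns([120, 95, 80])
```

When negative values are used, the axis labels usually need a custom number format so that they display without the minus sign.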

Timeline
CATT Lab, University of Maryland, 2012

“Timeline” is a software package that
places events in the course of highway
incident response on a common time
axis. These include the arrival and
tenure of different emergency
responders on-site, the closure of lanes,
and average speed on the road
segment. Together, these elements are
shown as a timeline graph that can
also be read as a heatmap.

The tool was designed for an audience
of real-time traffic operators,
emergency managers, and participants
in after-action reviews for incidents. The
same methodology can be applied in
other time-sensitive response situations
(e.g., hospitals) where decisions can
lead to positive and negative outcomes.

The tool requires training to use and an
expert level of software development
skill to replicate. As a static
representation of a single incident,
however, the series of timelines showing
emergency personnel response and
lane closures can be replicated in
Microsoft PowerPoint, Adobe Illustrator,
and other diagramming tools. More
advanced statistical analysis software is
required to create the blended
heatmap.

Performance Based Planning
Project Performance Assessment
Plan Bay Area, 2013

Plan Bay Area (PBA) is a long-term
planning effort for the San
Francisco/Oakland/San Jose metropolitan
area led by the region’s metropolitan
planning organization. To visualize the
tradeoffs between different regional
projects, initiatives, and investments, PBA
placed them on axes tied to benefits, costs,
and impacts. In this bubble chart, the size
of the bubble represents benefits, while the
vertical and horizontal axes represent B/C
ratio and support for the project’s regional
development targets.

Because the plot is limited to transit
projects (and PBA is a transit-friendly effort),
all projects fall on the right-hand side of
the diagram. Note that projects are not
sized by cost; expensive projects such as
BART extensions appear small if their
quantified benefits are small. While the
credibility of the diagram largely rests on
how well PBA can explain and defend its
data wrangling and analysis steps, the
diagram can support a decision to, for
instance, fund only the projects that appear
“big,” or those in the upper-rightmost
region of the plot.

Microsoft Excel can be used to create
bubble charts without a steep learning
curve.
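
The data behind a bubble chart of this kind can be prepared as in the sketch below. The project names and figures are invented, and scaling the radius by the square root of benefits (so that bubble area, not radius, is proportional to benefits) is an assumed design choice rather than PBA's documented method.

```python
import math


def bubble_points(projects, max_radius=40.0):
    """Return (name, x, y, radius) tuples for a bubble chart.

    x is the number of targets supported, y is the benefit/cost ratio, and
    the radius is scaled so that bubble AREA is proportional to benefits.
    """
    largest = max(p["benefit"] for p in projects)
    points = []
    for p in projects:
        bc_ratio = p["benefit"] / p["cost"]
        radius = max_radius * math.sqrt(p["benefit"] / largest)
        points.append((p["name"], p["targets_supported"], bc_ratio, radius))
    return points


# Hypothetical projects (benefit and cost in the same monetary units).
projects = [
    {"name": "Project A", "benefit": 400.0, "cost": 100.0, "targets_supported": 6},
    {"name": "Project B", "benefit": 100.0, "cost": 200.0, "targets_supported": 3},
]

points = bubble_points(projects)
```

Here Project B has one quarter of Project A's benefits, so its bubble has half the radius and one quarter of the area.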

Planning for Performance
Massachusetts Department of Transportation, 2016

MassDOT uses the Planning for
Performance cross-asset resource
allocation tool to view the performance
consequences of different investment
portfolios, as well as to generate
beneficial scenarios that reflect user
preferences. The tool’s output page
uses formatted text, bullet charts, and
stacked bar charts to highlight key data,
performance as compared to targets,
and budget, respectively.

While the tool is built in Microsoft Excel,
different visualization elements require
different levels of skill to implement.
Formatted text and conditional
formatting are fairly basic skills, while
the bullet charts are much more
complicated to understand and to
implement (though tutorials exist
online).

The VDOT Dashboard
Virginia Department of Transportation, 2015

The VDOT Dashboard provides the
public with up-to-date metrics on
performance, safety, condition,
projects, citizen survey results, and
finances. The metrics are presented on
the topic-specific pages using a variety
of visualization techniques, including
bar and column charts, iconography,
and dials. The landing page
summarizes the metrics using dials and
a pie chart.

While VDOT utilized advanced web
development and coding skills to build
The VDOT Dashboard, Microsoft
Power BI, Tableau, and other
visualization software packages can
create dial charts and can incorporate
them with pie and bar charts into a
dashboard or report format that can be
shared, automatically updated, and
even published to the web. These are
intermediate-level tools that require
some familiarity and have a learning
curve, but they are accessible through a
user interface and do not require
coding.

Freight
Incentives for Truck Use of SH-130
Texas Transportation Institute, Texas A&M University, 2015

Texas A&M University used this stacked
area chart to illustrate the limited
potential of SH-130 – a circumferential
toll road in Austin – as a detour for
through truck traffic currently using
Interstate 35. Placing volume on the
horizontal axis and (roughly north-
south) location on the vertical, the plot
clearly demonstrates the limited volume
of through trucks in the narrowness of
the yellow and lavender bands.

The clear implication is that if all the
yellow and lavender (and even light
green) volume switched to the SH-130
side of the diagram, the overall volume
(and thus congestion) on I-35 would not
appreciably diminish.

A graph such as this one would be
time-consuming to assemble in common
software tools, but because the volume
is constant between exits, the same
effect can be achieved using stacked
bar charts in Microsoft Excel. The
spaces between the bars can be
eliminated to appear very similar to
Texas A&M’s product. Any publication
tool could then be used to assemble the
infographic. The attached map can be
produced in ArcGIS or QGIS by a
novice user.
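
The stacked-bar approach described above can be prepared by computing, for each stretch of road between exits, where each category of traffic starts and how long its bar segment is. The exit names, categories, and volumes below are hypothetical.

```python
def stacked_segments(volumes_by_segment, categories):
    """For each road segment, return (category, start, length) triples that
    place the categories end to end along the volume axis."""
    stacks = {}
    for segment, volumes in volumes_by_segment.items():
        start = 0
        layers = []
        for category in categories:
            length = volumes.get(category, 0)
            layers.append((category, start, length))
            start += length
        stacks[segment] = layers
    return stacks


# Hypothetical daily truck volumes between consecutive exits.
categories = ["local", "regional", "through"]
volumes = {
    "Exit 1 to Exit 2": {"local": 500, "regional": 300, "through": 100},
    "Exit 2 to Exit 3": {"local": 700, "regional": 250, "through": 100},
}

stacks = stacked_segments(volumes, categories)
```

Because volume is constant within each segment, drawing these bars with no gap between them approximates the stepped outline of the stacked area chart.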

Crude-by-Rail Movements 2014
US Energy Information Administration, 2015

The US EIA has used a combination
bubble map/flow map to show the
transportation of crude oil by rail
between regions of the US. In addition
to communicating the overall amount
of originating crude and the magnitude
of each flow, the visualization color-
codes the flows to match the regions,
allowing a quick glance at one region
to identify the size of flows from each
other region.

Tableau, Microsoft Power BI, and
Microsoft Excel 2016 can generate
bubble and flow maps. This graphic can
also be reproduced using ArcMap or
QGIS to generate the background
image and adding the bubbles and
arrows in Microsoft PowerPoint or a
dedicated graphics program such as
Adobe Illustrator.

GEDVIZ
Bertelsmann-Stiftung, 2016

Bertelsmann-Stiftung uses a chord
diagram to illustrate trade between
user-selected countries. Mouseover text
for the chords establishes the exact size
of the flows. The column chart used to
select the data year represents the total
value of trade flows from the selected
countries in each year.

Developing an interactive online tool
requires expertise in web development,
but Tableau, Microsoft Power BI, and
similar visualization packages can
create chord diagrams with mouseover
capability and publish them online.

Safety
Five Years of Traffic Fatalities
John Nelson and IDV Solutions, 2010

John Nelson uses heat maps with a
calendar view and with a view of the US
to show when fatalities occurred over
five years and to describe their factors
(alcohol, pedestrian, and weather). He
brings them together along with text to
create an infographic that builds a story
of fatalities by day of week, month of
year, and by location. The heat map
and small multiples help the audience
to quickly identify patterns.

The charts for this diagram can be
created in Microsoft Excel using the
conditional formatting function and
some elbow grease: cleaning
and parsing the data into the right
format, especially for the map version,
and selecting a pleasing color scheme. Once
the charts are built, the infographic can be
assembled in Adobe Illustrator.
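
The data preparation mentioned above can be sketched as a pivot of event dates into a weekday-by-month grid of counts, which is the shape that conditional formatting needs. The dates are invented and are not crash records.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_month_grid(dates):
    """Count events in a 7 x 12 grid: rows are weekdays (Mon to Sun),
    columns are months (Jan to Dec)."""
    counts = Counter((d.weekday(), d.month) for d in dates)
    return [[counts[(wd, m)] for m in range(1, 13)] for wd in range(7)]


# Hypothetical event dates.
events = [date(2009, 1, 3), date(2009, 1, 10), date(2009, 7, 4), date(2009, 7, 5)]

grid = weekday_month_grid(events)
```

Pasting `grid` into a worksheet, with `WEEKDAYS` as row labels, and applying a color scale produces the calendar-style heat map.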

MassDOT Top Crash Clusters
Massachusetts Department of Transportation, 2015

The Massachusetts Top Crash Clusters
map uses polygon shapes to highlight
where clusters of crashes occur
throughout the state. The DOT
processed the data to identify which
crashes were related to others spatially.
The intent was to identify areas that
need to be addressed more as a system
than as single intersections or crossings.

The DOT handled data wrangling using
ArcGIS scripts and mapped the
resulting clusters using ArcGIS Online.

Fatality Analysis Reporting System
National Highway Traffic Safety Administration, 2010

Boost Labs developed this infographic
that compiles line charts to show
fatalities versus population, a sunburst
chart to show how fatalities in the US
are distributed (for example, 33,808
fatalities in the US in 2009, 4,872 of
which were nonmotorists, 630 of which
were pedalcyclists), and bar charts in
small multiples formation to compare
state by state funding and fatalities.

Elements of this visualization can be
recreated using Microsoft Excel and
compiled using Adobe Illustrator. The
sunburst chart can be produced using
an add-in to Microsoft Power BI.

http://www.boostlabs.com/portfolio-item/national-highway-traffic-safety-administration-fatality-analysis-reporting-system-fars/

Socioeconomic
Mesa County Employment
Mesa County, Colorado, 2014

Mesa County developed these funnel
charts to show which industry groups
had the highest levels of employment
(on the left) and which were growing
more quickly (on the right). The charts,
when combined, show that retail, while
one of the industries with the largest
employment, has not grown over the last
several years. They also show that
Leisure and Hospitality employment is
growing faster than all other industries
and is becoming an important
economic driver in the region.

These charts were created using
Microsoft Excel and combined using
Microsoft PowerPoint.

Unemployment Rate Difference from Average
Joe Mako, 2012

Joe Mako developed this set of stream
charts combined into a state by state
view using the small multiples
approach. It shows how the
unemployment rate in each state relates
to the national average. He used
orange lines to indicate states with
above-average unemployment and blue
lines to indicate states with below-average
unemployment. The stream charts are
sorted from highest to lowest
unemployment rates based on the most
recent year of data. Showing these
values over time reveals, for example, that West
Virginia had high levels of
unemployment in the 1980s but has had
lower levels of unemployment recently.

Joe Mako used Tableau Public to create
this chart. In Tableau, this chart type is
called a horizon chart.
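
The quantity plotted in each panel, a state's unemployment rate minus the national average, can be computed as in the sketch below. The rates are invented rather than actual statistics, and coloring an exact tie blue is an assumption.

```python
def difference_from_average(state_rates, national_rates):
    """Return (year, difference, color) tuples for one state.

    Orange marks years above the national average; blue marks years at or
    below it (the handling of exact ties is an assumption).
    """
    result = []
    for year in sorted(state_rates):
        diff = round(state_rates[year] - national_rates[year], 2)
        color = "orange" if diff > 0 else "blue"
        result.append((year, diff, color))
    return result


# Hypothetical unemployment rates, in percent.
national = {1985: 7.0, 2000: 4.0, 2010: 9.0}
state = {1985: 12.5, 2000: 5.5, 2010: 8.0}

series = difference_from_average(state, national)
```

Repeating this calculation for every state and sorting by the most recent difference gives the ordering used in the small multiples.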

People Movin’
Carlo Zapponi

Carlo Zapponi produced an interactive
Sankey diagram to show migration
flows across the world in 2010. He
wrangled the data by applying weights
based on bilateral migrant stocks (from
the population censuses of individual
countries) to the UN Population
Division’s estimates of total migrant
stocks. Clicking on a country on the left
side of the chart calls up a table
describing the current population of
that country, the emigrants from that
country, and the migrant destinations. It
also highlights the flows from the origin
country to the migrant destination in
proportionally sized lines. Similarly,
clicking on a country on the right side
of the chart will show immigrants and
their native countries in tabular and
flow forms.

He produced this chart using an HTML5
toolkit for the creation of flow charts
called datamovin.

Pedestrian and Bicycle
Commute to Work Trends, 2000-2012
Mesa County, Colorado, 2014

Mesa County created this combination
chart including stacked bar charts and
a line chart to show trends in bicycling
and walking, both in total terms
(stacked bar charts showing the
numbers of biking and walking
commuters) and in relative terms
(percent of all commuters who bike and
walk to work). Showing these two
perspectives on the same data together
in one combination chart highlights that
pedestrian and bicycle commuting
peaked as a percent of total commuting
in 2010, even though the overall number
of biking and walking commuters that
year was relatively low.

This chart was produced using
Microsoft Excel and Microsoft
PowerPoint.
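
The two series in a combination chart of this kind, absolute counts for the stacked bars and a percentage share for the line, can be derived from the same table as sketched below. The counts are invented and are not Mesa County's data.

```python
def commute_summary(counts_by_year):
    """For each year return (bike, walk, active_total, active_share_percent)."""
    summary = {}
    for year, c in sorted(counts_by_year.items()):
        active = c["bike"] + c["walk"]
        share = 100.0 * active / c["all_commuters"]
        summary[year] = (c["bike"], c["walk"], active, round(share, 1))
    return summary


# Hypothetical commuter counts.
counts = {
    2010: {"bike": 900, "walk": 2100, "all_commuters": 50000},
    2012: {"bike": 1100, "walk": 2200, "all_commuters": 60000},
}

summary = commute_summary(counts)
```

In this invented example the later year has more biking and walking commuters but a smaller share, which is the kind of contrast that plotting both series together makes visible.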

Fremont Bridge Bicycle Counter
Seattle Department of Transportation, 2016

The City of Seattle, Washington created
this dashboard using interactive charts
to show hourly bicycle counts on the
Fremont Bridge. The bar charts on the
left show the annual average bicycle
counts and the monthly totals.

Using the same color scheme as the bar
charts, the line chart in the top right
shows the peak travel day of the month,
for each day of the week and each
month.

The line chart in the bottom right shows
bicycle counts over time on those peak
days. Combining these into one
interactive dashboard allows users to
view high level information at a glance
and to dive deep to answer specific
questions.

The dashboard was created using
Tableau.

Network Analysis of Hubway
Ta Virot Chiraphadhanakul, 2013

Virot Chiraphadhanakul developed an
interactive combination chart using a
sunburst chart, network diagram, and
heat map to depict commuting patterns
from a selected Hubway station to more
than 8,000 MBTA bus/rail stops in the
Boston area.

The author wrangled data from
Hubway’s origin-destination pairs and the
MBTA’s GTFS feed, and used network
optimization techniques to find the
shortest path among nodes.

Each slice of the sunburst represents an
access/transfer stop; the area of each
slice represents the number of
destinations that can be reached by that
stop; and the color of the slice
represents the travel time from the
origin using the MBTA.

The author used a network diagram to
show the paths from the origin to all
potential destinations. The design packs
many dimensions of information into a
relatively small space.

The author describes his methodology
in “Large Scale Analytics and
Optimization in Urban Transportation:
Improving Public Transit and Its
Integration with Vehicle-Sharing
Services” (MIT, 2013) – Chapter 3.

Key
• Y = Fully functional
• E = Effort required
• P = Partially functional
• X = Plugin required

Tools compared (table columns): Microsoft Office (Excel, PowerPoint, Visio); Adobe (Illustrator, Photoshop); Google (Sheets, Charts API, Maps API, Fusion Tables); Tableau; Power BI; Qlikview; ArcGIS; QGIS; Leaflet; d3.js; R
Map
Choropleth Y Y Y Y Y Y Y Y Y

Bubble Y Y Y Y Y Y Y Y Y

Route Y Y Y

Pie Map Y Y Y

Dot Density Y Y Y

Flow Y Y Y

Area Cartogram Y Y Y

Bar
Horizontal/Vertical Y Y Y Y Y Y Y Y Y Y
Clustered Y Y Y Y Y Y Y Y Y Y

Stacked Y Y Y Y Y Y Y Y Y Y

Diverging Y Y Y Y Y Y Y Y Y Y

Bullet Y Y Y Y Y

Histogram Y P P Y Y

Pyramid Y Y Y Y Y

Radial Y Y Y Y Y

Line/Area
Segmented Y Y Y Y Y Y Y Y Y

Smoothed Y Y Y Y Y Y Y Y Y

Regression P P Y P P Y

Area Y Y Y Y Y Y Y Y Y

Stacked Area Y Y Y Y Y Y Y Y Y

Streamgraph X Y Y Y

Pie
Pie Y Y Y Y Y Y Y Y

Donut Y Y Y Y Y Y Y Y

Sunburst Y Y Y Y Y

Flow
Flowchart E Y X Y Y Y Y

Sankey E Y X Y Y Y Y

Heat
Matrix Y Y Y Y X Y Y Y Y

Calendar Y Y Y Y X Y Y Y Y

Smoothed Area Y Y Y Y Y Y

Scatterplot
Scatterplot Y Y Y Y Y Y Y Y

Bubble Y Y Y Y Y Y Y Y

Motion Y Y

Pictogram
Dot Matrix E E E Y E Y Y Y Y Y Y

Symbol Bar E E E Y E Y Y Y Y Y Y

Treemap
Treemap Y Y Y Y Y Y Y Y

Circle Packing Y Y

Node-Link
Node-Link E X X X Y Y Y

Arc X X X Y Y Y

Tree E X X Y Y Y

Chord E X X Y Y Y

Force-Directed E X X Y Y Y

Other
Table Y Y Y Y

Parallel Coordinates Y Y

Spider Y Y Y

Word Cloud Y Y Y

Gauge E Y Y Y Y
