You are on page 1of 14

Data Visualization

Data Visualization: The scienceand artof discovering patter ns and trends in data
In January 1997, The New York Times published a two-part article by Pulitzer Prize-winning journalist Nicholas Kristof on diseases in the developing world. The article would later be credited for inspiring Bill and Melinda Gates to dedicate, through their foundation, a significant portion of their wealth to the fight against global health challenges. But according to Kristof, it wasnt actually the article that ignited the Gates passion. Rather, it was a simple graphic that accompanied the article that caught the Gates eyes: a two-column list of health problems in the developing world and the number of people those diseases kill. In a note to colleagues on the power of art, Kristof related this story, writing, No graphic in human history has saved so many lives in Africa and Asia.1 The graphic accomplished this by showing Times readers, including the Microsoft founder and his wife, the problem of diseases in developing countries in a quick, easy-to-comprehend format. Generations of young writing students have had the show, dont tell principle of composition drilled into their psyche. And generations of young math students have plotted graph points, created Venn diagrams and turned data into pie charts in order to illustrate information. Today, these two disciplines are being combined to make connections and show stories in ways that raw data, words and traditional graphic forms could never communicate. Data visualization all those graphs, charts and maps on steroidsis the new way to harness massive amounts of data, reveal patterns, discover trends and inspire action. Simply put, data visualization is the practice of communicating information through graphical means. Combined with emerging software capabilities and design principles, data visualization is capable of presenting large amounts of information in ways that reveal previously unknown insights and invite more indepth analysis than was possible with traditional spreadsheets, bar graphs and pie charts. (In fact, many data visualization experts loathe the pie chartwell get to that later.)

1 Talk to the Newsroom: Graphics Director Steve Duenes. The New York Times. The New York Times Co., 25 Feb. 2008. Web. 12 Feb. 2013. < html?pagewanted=1&_r=2>.
2013 4imprint, Inc. All rights reserved

Data visualization has the potential to be a powerful storyteller. Heres why: For the human brain, visual perception is fast and efficient. Cognitive function is slower and less efficient.2 As data visualization expert Stephen Few explains, Traditional data sense-making and presentation methods require conscious thinking for almost all of the work. Data visualization shifts the balance toward greater use of visual perception, taking advantage of our powerful eyes whenever possible.3 This Blue Paper will demystify some of the media buzzwords related to data visualization, delve into a variety of forms of data visualization, examine best practices and offer a few warnings. If one of the purposes of data visualization is to reveal previously unknown information, then lets start peeling back the layers.

The buzz
Data is the new oil, became one of the most oftrepeated clichs in 2012. But the analogy has been around for a few years. In 2006, Michael Palmer, executive vice president for the Association of National Advertisers, compared data to crude, saying, Its valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value. The issue is how do we marketers deal with the massive amounts of data that are available to us? How can we change this crude into a valuable commoditythe insight we need to make actionable decisions?4 One of the answers to Palmers question is data visualization. But first, lets consider some related buzzwords: big data, data mining and infographics. Big data: Information technology research company Gartner describes big data as high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.5

2 Few, Stephen. Data Visualization for Human Perception. The Encyclopedia of Human-Computer Interaction, 2nd Ed. Soegaard, Mads and Dam, Rikke Friis (eds.). Aarhus, Denmark: The Interaction Design Foundation, 2013. 3 Ibid. 4 Palmer, Michael. Data Is the New Oil. Web log post. ANA Marketing Maestros. Association of National Advertisers, 3 Nov. 2006. Web. 12 Feb. 2013. <>. 5 Matzer, Michael. What Exactly Is Big Data? Business Innovation from SAP. SAP AG, 14 Dec. 2012. Web. 13 Feb. 2013. <>.
2013 4imprint, Inc. All rights reserved

Where does this data come from? According to IBM, 90 percent of the data in the world today has been created in the last two years, from sources including social media sites, digital pictures and videos, purchase transaction records and cell phone GPS signals.6 And its going to keep coming. Networking technology firm Cisco predicts that global mobile data traffic will grow 13-fold from 2012 to 2017, reaching 11.2 exabytes per month7 (one Exabyte is equal to one trillion gigabytes). IT journalist Michael Matzer writes that while big data certainly presents a problem in terms of storage and analysis, the actual problem, according to Gartner, lies in spotting meaningful patterns within the data that can help companies make better decisions.8 Data mining: One approach to harness big data is data mining, the process of searching through large amounts of data to uncover hidden patterns or trends. Software development firm Albion Research Ltd. explains that computer-assisted data mining can help companies find profitable relationships that were not even suspected to exist, or that would take too long to find by manual means.9 According to Albion, data mining has successfully contributed to:  Identifying unexpected shopping patterns in supermarkets.  Optimizing website profitability by making appropriate offers to each visitor.  Predicting customer response rates in marketing campaigns.  Defining new customer groups for marketing purposes.  Predicting customer defections.  Distinguishing between profitable and unprofitable customers.  Improving yields in complex production processes by finding unexpected relationships between process parameters and defect rates.  Identifying suspicious behavior as part of a fraud detection process.10 Data mining and data visualization are separate disciplines, but data visualization is sometimes referred to as visual data mining.

6 What Is Big Data? IBM. IBM, n.d. Web. 13 Feb. 2013. <>. 7 Visual Networking Index (VNI). Cisco. Cisco, n.d. Web. 13 Feb. 2013. < ns827/networking_solutions_sub_solution.html>. 8 Matzer, Michael. What Exactly Is Big Data? Business Innovation from SAP. SAP AG, 14 Dec. 2012. Web. 13 Feb. 2013. <>. 9 Why Should I Be Considering Data Mining? Albion Research Ltd. Albion Research Ltd., n.d. Web. 14 Feb. 2013. <>. 10 Ibid.
2013 4imprint, Inc. All rights reserved

Infographic: The terms data visualization and infographic are often banded together, but there are distinctions between these two types of visual tools. Below is a quick snapshot example of each: Data Visualization Infographic

Lapierre, Audree. Data Visualization for Marketing. Macguire, Eoghan. Food Waste: From Farm to Fork Audre Lapierres Blog. N.p., 30 Jan. 2011. Web. 19 and Landfill. CNN. Cable News Network, 21 Dec. Feb. 2013. 2012. Web. 19 Feb. 2013.

In general, an infographic illustrates a small amount of information or a single idea and the conclusions of an infographic are known to its creators before the infographic is designed. Basically, it is a graphic that communicates known information more quickly than words alone (as we all know, a picture is worth a thousand words). In contrast, data visualization usually encompasses a larger set of information and the patterns that a visualization reveals arent necessarily known to the creators of a visualization before a graphic is designed. The visualization itself reveals information that would have been extremely timeconsuming or impossible to see by studying numbers on a spreadsheet. Heres an example that illustrates the difference between an infographic and data visualization and demonstrates how both can help government decision makers, businesses and consumers to make better-informed decisions: In February 2013, the Minneapolis Star Tribune ran an article about the danger of radon, a cancer-causing gas that occurs naturally in soil and tends to seep into homes, especially in the U.S. Midwest. The article was accompanied by two illustrations: an infographic that showed how radon enters homes and a data visualization of more than 100,000 radon tests performed statewide.11 The infographic showed a cutaway illustration of a three-level house and used arrows and captions to help readers quickly understand where radon comes from
11  Schrade, Brad. Minnesota Is Hotbed for Radioactive Gas Radon. StarTribune, 10 Feb. 2013. Web. 14 Feb. 2013. <>.
2013 4imprint, Inc. All rights reserved

and how it enters homes. All of the information illustrated on the infographic was known to the articles writers, editors and graphic artists before the graphic was produced. The infographic served as a means to show quickly, rather than tell slowly, how high radon levels occur in homes.12 For the data visualization, a map of the state of Minnesota was broken down by zip code. Zip code areas were color-coded according to low, medium or high potential for unsafe radon levels based on the results of more than 100,000 in-home tests conducted over a 13-year period and collected by the states department of health. The articles producers may not have known what, if any, pattern might emerge from mapping the data. But by applying the data to the map and color-coding the results, a pattern became immediately clear. The data visualization showed that some pockets in central and northern Minnesota have a low potential for unsafe radon levels. Most of central and northern Minnesota showed a medium potential for unsafe radon levels. A vast swath of southern and western Minnesota showed a high potential for unsafe radon levels. Viewers of the electronic visualization could enter their zip code to have the map zoom in to their specific geographic area.13 With this pattern revealed through the visualization, stakeholders now have powerful information upon which they can act. For example, the state public health department may choose to concentrate a public awareness campaign in the regions where the danger is highest, conserving taxpayer dollars by targeting the areas where the need is greatest rather than blanketing the entire state. Homeowners in the areas where the danger is high may be motivated to have their homes tested. Hardware and home stores in those same areas may use the data to help push sales of radon testing kits and contractors may be motivated to add radon mitigation to their services. On the other hand, homeowners in areas where the potential danger is low may use the information to help protect themselves against unscrupulous contractors pushing mitigation systems onto people who dont need them.

12  Grumney, Raymond. How Invisible Radon Gas Can Enter a House. StarTribune, 10 Feb. 2013. Web. 14 Feb. 2013. <>. 13  Schaul, Kevin. Minnesotas Radon Zones. StarTribune, 10 Feb. 2013. Web. 14 Feb. 2013. <>.
2013 4imprint, Inc. All rights reserved

Forms of data visualization

Whether you have a massive amount of data or a reasonable amount that you want to use to help reveal unknown patterns or trends, a key to good data visualization is choosing the right form of chart, graph or map. The geographic map of Minnesota in the Star Tribune example is one form of a data visualization chart. Here are other examples: Basic charts: The traditional bar graphs, pie charts, line graphs and Venn diagrams that we all learned in elementary school.

Scatter graph or scatter plot: A diagram that plots values with two variables, with one variable represented on a horizontal axis and the other represented on a vertical axis.

2013 4imprint, Inc. All rights reserved

Bubble chart: A chart that applies numeric value to circles whose areas are proportional to the values. A common variant of this is the bubble map, which charts proportionally sized circles on a geographic map.

Sparkline chart: A small line chart that displays a general trend in measurement. Often lacking a defined x- and y-axis, these charts are commonly seen in stock market visualizations.

Tree map: A visualization of hierarchical data.

2013 4imprint, Inc. All rights reserved

Heat map: A graphical representation of data values in a color-coded rectangular array.

Pareto chart: A chart that combines bars and a line graph.

These are only a few of the most common forms. Ralph Lengler and Martin J. Eppler of the e-learning course identified 100 visualization methods, grouping them into six general categories: 1. Data visualization (Examples include pie charts, bar charts and line charts.) 2.  Information visualization (Examples include timelines, flow charts and Venn diagrams.) 3. Concept visualization (Examples include mind maps and synergy maps.) 4. Strategy visualization (Examples include stakeholder maps, s-cycle diagrams and value chains.) 5.  Metaphor visualization (Examples include story templates, icebergs and funnels.) 6.  Compound visualization (Examples include cartoons and knowledge maps.)14
14  Lengler, Ralph, and Martin J. Eppler. A Periodic Table of Visualization Methods., n.d. Web. 14 Feb. 2013. <>.
2013 4imprint, Inc. All rights reserved

In addition to the charts themselves, electronic data visualization often includes interactive elements that add more information and depth to the visualization if the viewer needs or desires it. For example, Lengler and Eppler illustrated their 100 identified methods as a periodic table of visualization methods, and in the electronic version of this table, an example of each of the 100 methods pops up as the viewer scrolls over the table. In the Star Tribune example, the opportunity to enter a zip code to have the map zoom into a specific region of the state is another form of interactive data visualization. Robert Korsara, assistant professor of the Department of Computer Science with the University of North Carolina at Charlotte, explains that interaction in visualization enables the fast exploration and discovery of data patterns that the user may not even have expected. It is also possible to reduce the amount of data shown at the same time, providing clearer visualizations, while still giving the user the option to get that information on demand at any time, he says.15 Tooltips are graphical interface elements that appear when a curser hovers over a particular spot on a visualization. Relegating detailed information into tooltips helps provide additional information while keeping a chart clean. The experts at FusionCharts provide this example of hotel analyzing data related to reservation trends: The hotel experiences a dip in reservations in September because a new hotel opened nearby and then reservations pick up again in October when the hotel offers a discount. Putting all of this information on a line graph of reservations by month would make it very cluttered. Putting the detailed information in tooltips is more helpful for users, in this case, management. When a user is interested in getting more information about a data set, he or she can use their cursor to hover over it to access additional information. The chart is clean, yet it provides all of the necessary information.16 Other common interactive tools include brushing and linking. Brushing lets the user select data points to be highlighted in one or more views of the same data, Korsara explains. When several views are involved, the fact that all of them highlight the same data points is commonly referred to as linking (and the views are called coordinated multiple views).17

15  Robert Kosara (2010). Commentary on: Few, Stephen. Data Visualization for Human Perception. The Encyclopedia of Human-Computer Interaction, 2nd Ed. Soegaard, Mads and Dam, Rikke Friis (eds.). Aarhus, Denmark: The Interaction Design Foundation, 2013. visualization_for_human_perception.html 16  6 Tips to Increase the Usability of Your Charts. FusionCharts. FusionCharts, n.d. Web. 14 Feb. 2013. <http://>. 17  Robert Kosara (2010). Commentary on: Few, Stephen. Data Visualization for Human Perception. The Encyclopedia of Human-Computer Interaction, 2nd Ed. Soegaard, Mads and Dam, Rikke Friis (eds.). Aarhus, Denmark: The Interaction Design Foundation, 2013. visualization_for_human_perception.html
2013 4imprint, Inc. All rights reserved

Best practices
Data visualization may be an emerging discipline, but solid visualizations rely on well-established design principles. The Gestalt principles, theories of visual perception developed by German psychologists in the 1920s, help to explain how and why humans see patterns and trends in visualized data. These principles include similarity (when objects look similar, humans perceive them to be a group or pattern), continuation (the human eye tends to follow a line or curve from one object to another) and proximity (when objects are placed close together, humans perceive them as a group). Data visualization consultant Jorge Camoes argues that good data visualization functions with an understanding of how human perception works. He writes that:  Humans are constantly grouping objects based on color, shape, direction, proximity and closure/enclosure.  Humans like simple, close, smooth, symmetrical, easy-toprocess shapes.  If we are aware of these laws, we can take advantage of them to design better charts or dashboards.  We must also be aware of their negative effect. We shouldnt force the reader to see groups that arent really there; a simple shape is not necessarily the right shape.18 Good data visualization gets both the science and the art right. The following steps are adapted from an approach to data visualization by user interface developer Ryan Bell of EffectiveUI: Step 1: Understand the problem domain. Ask questions to understand what specific problem needs to be explored. Step 2: Get sound data. This includes understanding the meaning of the data, ensuring that it is as accurate as possible and having a sense of how it was gathered and what errors or inadequacies may exist. Step 3: Show the data and show comparisons. Choose the visualization method that will best show relationships within the data. Step 4: Incorporate visual design principles. Sound design elements (line, form, shape, value, color) and principles will make trends easy to see.

18  Camoes, Jorge. Perception: Gestalt Laws. Web log post. VizWise, n.d. Web. 14 Feb. 2013. <>.
2013 4imprint, Inc. All rights reserved

Step 5: Bring in more dimensions with one or more of the following techniques:  Add small multiples. Thumbnail charts of related data help users make quick side-by-side comparisons. Add layers. Adding extra levels of information, while preserving the high-level summary data, can make a graphic more flexible and useful.  Add axes or coding patterns. Adding an additional variable or coding of information can help reveal new information. However, be careful not to add too much clutter. Combine metaphors. Add other forms of data visualization to show additional connections.19 Finally, all good data visualization is accompanied by clear text that helps viewers understand what they are looking at quickly. Use plain English to summarize the data, include units of representation and the time period and avoid using articles (a, an and the) and adjectives.20

Lies, damned lies, statistics and bad data visualization

Theres nothing new about number manipulation. It was Mark Twain, among others, who warned of lies, damned lies and statistics. Whether intentionally misleading or just poorly executed, data visualization has as much potential for misunderstanding and misuse as earlier iterations of data analysis. In the hands of a statistician with no understanding of how design principles affect the way humans perceive graphic renderings, data visualization may be accurate but fail to communicate. In the hands of a designer who wants to show off skills but doesnt understand the information, data visualization may look pretty but mislead or fail in its goal to reveal connections and trends. Vitaly Friedman, editor of Smashing Magazine, an online magazine for designers and developers, writes about the primary goal of data visualization: [T]o communicate information clearly and effectively through graphical means. It doesnt mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex data set by communicating its key aspects in a more intuitive way. Yet designers
19  Bell, Ryan. Eight Principles of Data Visualization. Information Management. SourceMedia, 17 Aug. 2012. Web. 14 Feb. 2013. <>. 20  5 Tips for Writing Great Chart Captions. FusionCharts. FusionCharts, n.d. Web. 14 Feb. 2013. <http://www.>.
2013 4imprint, Inc. All rights reserved

often fail to achieve a balance between form and function, creating gorgeous data visualizations which fail to serve their main purpose to communicate information.21 Researcher and data educator Dana Griffin offers the following data visualization donts: 1. Dont assume viewers will automatically get whatever graphic you create. Understanding doesnt happen in a vacuumit is the researchers responsibility to distill data in a way that makes clean, intuitive sense to viewers and to provide visual tools that guide the audience to insights or conclusions. 2. Dont overstate or oversimplify the data.The conclusions that viewers reach are highly dependent on how specific the data set is. 3. Dont make viewers look too long and/or think too hard in order to get the point.Time and attention are limited.Poorly designed visuals risk confusing, irritating or alienating viewers.22 And while we are on the topic of bad data visualization, lets return to that ubiquitous pie chart. In an Oracle blog exploring why designers would choose a pie chart as a visualization metaphor, Tony Wolfram explains that pie charts dont lend themselves well to human perception capabilities. He writes: When comparing data, which is what a pie chart is for, people have a hard time judging the angles and areas of the multiple pie slices in order to calculate how much bigger one slice is than the others. Controlled studies show that people will overestimate the percentage that a pie slice area represents. This is because we have trouble calculating the area based on the space between the two angles that define the slice. Wolframs preference for data typically applied to a pie chart is a sorted bar chart.23 With that in mind, save your pie for eating, and choose a visualization method that reveals relationships and communicates patterns and trends in a clear and
21  Friedman, Vitaly. Data Visualization and Infographics. Smashing Magazine. Smashing Media, 14 Jan. 2008. Web. 14 Feb. 2013. <>. 22  Griffin, Dana. Data Visualization Lesson 6: The Ultimate List of Dos and Donts. Research Access. Research Access, 17 Oct. 2012. Web. 14 Feb. 2013. <>. 23  Wolfram, Tony. Pie Charts Just Dont Work When Comparing Data - Number 10 of Top 10 Reasons to Never Ever Use a Pie Chart (The Designer Experience). Web log post. The Designer Experience. Oracle, 10 May 2010. Web. 14 Feb. 2013. < data_-_number_10_of_top_10_reasons_to_never_ever_use_a_pie>.
2013 4imprint, Inc. All rights reserved

accurate manner. Use basic design principles based on visual perception theories and make it interactive to keep charts simple while providing the maximum amount of information. Whether you need to uncover sales patterns, reveal trends in purchase categories or customize offers to particular shoppers, good data visualization will help to provide the insights you need and your customer wants. You never know what actions it may inspire

4imprint serves more than 100,000 businesses with innovative promotional items throughout the United States, Canada, United Kingdom and Ireland. Its product offerings include giveaways, business gifts, personalized gifts, embroidered apparel, promotional pens, travel mugs, tote bags, water bottles, Post-it Notes, custom calendars, and many other promotional items. For additional information, log on to

2013 4imprint, Inc. All rights reserved