
Infographics Lie: Here's How

1. We live in an age of Big: Big Computers, Big Data, and Big Lies.
2. Faced with an unprecedented torrent of information, data scientists have turned to the visual
arts to make sense of big data. The results of this unlikely marriage, often called "data
visualizations" or "infographics," have repeatedly provided us with new and insightful
perspectives on the world around us.
3. However, time and time again we have seen that data visualizations can easily be manipulated
to lie. By misrepresenting, distorting, or faking the data they visualize, data scientists can twist
public opinion to their benefit and even profit at our expense. We have a natural tendency to
trust images more than text. As a result, we're easily fooled by data visualizations. Fortunately,
there are three easy steps we can follow to save ourselves from getting duped in the data
deluge.
Check the data presentation
4. The subtlest way a data visualization can fool you is by using visual cues to make data stand out
that normally wouldn't. Be on the lookout for these visual tricks.
5. 1. Color cues: Color is one popular tool for making certain data more prominent than the rest.
When considering the map below, Kentucky and Utah (the darkest and the lightest) will most
likely stand out to us first.
6. If the map in Figure 1 were showing percentage of the population that smokes (where dark
colors indicate more smokers and light colors fewer smokers), we might quickly conclude that
Kentucky has a serious smoking problem. But what if we looked at the raw numbers and saw
that 27% of Kentuckians and 23% of Utahans smoke? Now the difference doesn't look so big
after all. Make sure to look at what the colors actually represent before drawing a conclusion
from the visualization.
7. 2. Structural cues: Structure is another popular tool for making data immediately stand out. In
the bar charts in Figure 2, we're looking at the same data, but with different scales on the y-
axes. Notice how such a simple structural change can make differences in the data look much
more significant. Is an increase of 15 fraudulent visualizations from last year really
"skyrocketing"? Don't let the structure of the visualization decide that for you. Always check the
numbers that the visualization is representing.
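To see how much a truncated axis can exaggerate a change, here is a minimal Python sketch. The counts (1000 and 1015) and the axis starting point (990) are hypothetical values chosen only for illustration, since the article gives just the increase of 15; `drawn_height_ratio` is likewise an illustrative helper, not part of any charting library.

```python
def drawn_height_ratio(earlier, later, axis_min=0.0):
    """Return how many times taller the later bar is drawn than the earlier one.

    A bar's drawn height is its value minus the point where the y-axis starts.
    """
    if axis_min >= min(earlier, later):
        raise ValueError("axis_min must be below both values")
    return (later - axis_min) / (earlier - axis_min)


# Hypothetical counts: only the difference of 15 comes from the article.
last_year, this_year = 1000, 1015

honest = drawn_height_ratio(last_year, this_year, axis_min=0)
truncated = drawn_height_ratio(last_year, this_year, axis_min=990)

print(f"y-axis starts at 0:   later bar drawn {honest:.3f}x as tall")
print(f"y-axis starts at 990: later bar drawn {truncated:.3f}x as tall")
```

Under these assumed numbers the data changes by 1.5%, yet the truncated chart draws the second bar two and a half times as tall as the first.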
Check the data source
8. Make sure the data source is reliable. Data collected by an amateur is more error-prone than
data collected by a professional scientist. Do a quick Web search to see if the people who
collected and organized the data have a good track record of collecting and distributing data.
9. You should also make sure the data source isn't biased. A drug company may be inclined to
present fake data showing that their latest drug is more effective than it really is, or a political
campaign may manipulate data to discredit their political opponents. Think twice when
considering data provided by biased groups.
10. Generally, we can trust data provided by government organizations, university research centers,
and non-partisan organizations. However, we should look more closely at data provided by for-
profit companies, political organizations, and advocacy groups. If the data source isn't listed,
take the data visualization with many grains of salt.
Check the data alterations
11. Many data sets require a little bit of housecleaning before they can be visualized, but excessive
editing can be a sign of misrepresented data. Every good data visualization will come with
explanations describing how the data was manipulated from its raw form into the visualization
you see. Read the explanations, and watch out for the following data alterations.
12. 1. Excluded data: Ensure that the explanations for excluding that data are reasonable.
Sometimes the "explanation" may be that the data inconveniently contrasted with the story the
author wanted to tell.
13. 2. Transformed data: Data transformation, the process of converting data from one format to
another format, can complicate the relationships between data. It's difficult to interpret a
finding such as "The log transform of a city's productivity is related to the log transform of the
city's population." See how that doesn't make any sense to us in practical terms? While a
transformation can make complicated mathematics accessible, it can also potentially be
misleading. Be wary if several transformations have been applied to the data.
14. 3. Statistics: Statistics are an often-abused tool in data science. "Fatal shark attacks have risen
100% this year" sounds like an alarming statistic until you realize that only one person was
fatally attacked by a shark last year. Check the raw numbers when data visualizations present
only the statistics.
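The arithmetic behind the shark example can be sketched in a few lines of Python. The count of one attack last year comes from the text; two attacks this year is what a 100% rise implies. The 500 and 1,000 figures are hypothetical, added only for contrast, and `percent_change` is an illustrative helper.

```python
def percent_change(before, after):
    """Percent change from `before` to `after`."""
    if before == 0:
        raise ValueError("percent change is undefined when the starting count is 0")
    return (after - before) / before * 100.0


# (label, last year's count, this year's count)
cases = [
    ("shark example (1 attack, then 2)", 1, 2),
    ("hypothetical contrast (500, then 1000)", 500, 1000),
]

for label, before, after in cases:
    change = percent_change(before, after)
    print(f"{label}: {change:.0f}% rise, {after - before} additional cases")
```

Both cases report the same percentage, but one describes a single additional case and the other describes 500, which is why the raw numbers matter.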
15. Comparing statistics is even trickier. If a survey shows that 50% of Latinos and only 30% of
Caucasians enjoy watching baseball, those results could easily have been purely due to
chance if the survey interviewed only 20 people of each ethnicity (Figure 3). If the visualization
doesn't indicate the researchers' confidence in the comparison (called statistical significance),
then we shouldn't be confident in their comparisons.
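One common way to check a comparison like this is a two-proportion z-test. The Python sketch below applies it to the survey numbers above (10 of 20 versus 6 of 20). The choice of test is our assumption, not something the article or the survey specifies, and the normal approximation it relies on is rough for samples this small. The 200-person case is hypothetical, included only to show the effect of sample size.

```python
import math


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test (normal approximation)."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("test is undefined when everyone or no one is a hit")
    z = (p_a - p_b) / std_err
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Survey from the text: 50% vs 30%, with 20 people in each group.
small = two_proportion_p_value(10, 20, 6, 20)

# Hypothetical: the same percentages with 200 people in each group.
large = two_proportion_p_value(100, 200, 60, 200)

print(f"20 per group:  p = {small:.3f}")
print(f"200 per group: p = {large:.6f}")
```

With 20 people per group the p-value comes out at about 0.2, well above the conventional 0.05 threshold, so the gap could plausibly be chance. The same percentages from 200 people per group would give a p-value far below 0.05.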
16. If the details on the data alterations aren't provided with the visualization, always keep in mind
how easy it is to make data lie when it's visualized.
17. Remember: To save yourself from getting tricked by deceitful data, check the presentation, data
source, and alterations.
