Deception With Graphs: Omitting A Scale: Sometimes Data Are
Graphs are among the most powerful tools for representing data visually, and they can be understood by almost anyone. They play an important role in the preliminary analysis of data and are the gateway to extracting complex patterns from a dataset. Graphs began to appear around 1770 and became common only around 1820. Good graphs are extremely helpful for extracting the basic features of complex data; they help turn the realms of information available today into knowledge. But, as with most statistical tools, a graphical representation can be deceptive when it is presented improperly. Some misleading representations of graphs are discussed here, and some simple metrics for measuring the extent of distortion in a graph are given.

Some graphs deceive or mislead. The practices producing distortion in graphs under this category are:

Omitting a scale: Sometimes data are represented by graphs in which the scale is missing from one of the axes. Such graphs destroy perspective. The number of customers served by ABC Company between 1990 and 2013 is represented in Fig. 1 with no vertical scale. Without a scale on the Y-axis, it is not known whether this graph represents a growth in demand of 10 percent, 100 percent or 1,000 percent. Graphs like these should be avoided.

[Fig. 1: chart of No. of customers served, drawn without a vertical scale]

Manipulating vertical axis: This is the malpractice of changing the scale of the vertical axis so that it does not start at zero. Such graphs are often misinterpreted because of the visual illusion they create.
62 percent of Democratic respondents agreed,
compared to 54 percent of Republicans, and 54
percent of Independents. The correct
representation is given in Fig. 3.
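The effect of a vertical axis that does not start at zero can be checked with a little arithmetic. The sketch below is only an illustration: it assumes that the misleading chart (Fig. 2) draws its bars from a baseline of 50, as its tick labels (50 to 65) suggest, and the function names `drawn_height` and `height_ratio` are hypothetical.

```python
# Poll percentages quoted in the text.
agree = {"Democrats": 62, "Republicans": 54, "Independents": 54}


def drawn_height(value, baseline):
    """Height of a bar as drawn when the vertical axis starts at `baseline`."""
    return value - baseline


def height_ratio(a, b, baseline):
    """How many times taller bar `a` looks than bar `b` for a given baseline."""
    return drawn_height(a, baseline) / drawn_height(b, baseline)


# Assumed baseline of 50 (Fig. 2) versus a zero baseline (Fig. 3).
truncated = height_ratio(agree["Democrats"], agree["Republicans"], baseline=50)
honest = height_ratio(agree["Democrats"], agree["Republicans"], baseline=0)

print(f"Axis from 50: Democrats bar looks {truncated:.2f}x as tall")
print(f"Axis from 0:  Democrats bar looks {honest:.2f}x as tall")
```

Under that assumption the Democrats bar is drawn three times as tall as the Republicans bar, although the underlying percentages differ by a factor of only about 1.15.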
Fig. 2: Results of CNN/USA Today Gallup Poll in 2005 (bar chart titled "CNN/USA Today Gallup Poll: 2005, Results by Party"; y-axis: Percentage who agree, 50 to 65; x-axis: Party, with Democrats, Republicans, Independents)

Fig. 3: Correct representation of results of CNN/USA Today Gallup Poll in 2005 (same chart; y-axis: Percentage who agree, 0 to 100)

Sometimes the vertical scale starts at 0 but distortion occurs because of a scale break. In Fig. 5 the bar graph shows the average life span of different animals with a scale break in the y-axis. It looks as if the horse lives 4 times as long as the camel, but that is not true, as the original graph (Fig. 4) shows.

Fig. 5: Average life span of different animals (with scale break)

Misleading trends
Manipulation of the vertical axis also produces a misleading trend in line diagrams of time series data. Fig. 6 represents the number of annual dowry deaths in West Bengal between 2001 and 2012. It shows an upward trend. This graph can be made misleading in several ways:

Changing the vertical scale: A narrower vertical scale than the original one will exaggerate the trend (Fig. 7).

[Chart, apparently Fig. 6: y-axis: No. of dowry deaths, 0 to 1500; x-axis: Year, 2001 to 2012]
Fig. 7: Number of annual deaths from dowry in West Bengal from 2001 to 2012 (origin shifted to 700; y-axis: No. of dowry deaths, 700 to 1300; x-axis: Year)

[Further charts, captions not recovered: No. of dowry deaths against Year, 2001 to 2012, with vertical axes running to 1000, 2000 and 5000]

Changing ratio of graph dimensions: The trend is also affected by the ratio of the graph's dimensions. A horizontal axis that is narrow relative to the vertical axis overstates the trend (Fig. 9.1), and a horizontal axis that is extended relative to the original graph understates it (Fig. 9.2).
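The influence of graph dimensions on the apparent trend can be illustrated by computing the angle at which the same line is drawn in plotting areas of different shapes. This is a minimal sketch with made-up dimensions; `apparent_angle_deg` is a hypothetical helper, and the trend is assumed to span the whole horizontal axis.

```python
import math


def apparent_angle_deg(rise_fraction, width, height):
    """Angle (degrees) at which a line is drawn on the page.

    The line is assumed to span the full horizontal axis and to climb
    `rise_fraction` of the vertical axis. `width` and `height` are the
    physical dimensions of the plotting area, in the same unit.
    """
    return math.degrees(math.atan2(rise_fraction * height, width))


# Hypothetical trend that climbs half of the vertical axis.
rise = 0.5
square = apparent_angle_deg(rise, width=4, height=4)
narrow = apparent_angle_deg(rise, width=2, height=4)  # squeezed horizontal axis
wide = apparent_angle_deg(rise, width=8, height=4)    # stretched horizontal axis

print(f"square: {square:.1f} deg, narrow: {narrow:.1f} deg, wide: {wide:.1f} deg")
```

With these assumed dimensions the same data are drawn at roughly 26.6, 45.0 and 14.0 degrees respectively, so the squeezed graph looks steeper and the stretched graph flatter, even though nothing in the data has changed.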
Fig. 12: Number of annual deaths from dowry in West Bengal and Andhra Pradesh from 2001 to 2012 (origin of vertical axis changed to 600; series: ANDHRA PRADESH, WEST BENGAL; y-axis: Deaths, 600 to 1600; x-axis: Year)

... crimes against women in India is shown in Fig. 21. It shows a strong linear relationship between steel production and the number of crimes against women. If one concludes from this diagram that "crimes against women increase as steel production increases", the statement is implausible and ridiculous. The fact is that a scatter diagram does not establish any cause-effect relationship, and if it shows a relationship, this may be due to the effect of a third factor on both variables. This phenomenon is called "spurious correlation". Here the third factor is time itself: both variables, steel production and number of crimes, have upward trends over the given period 2001 to 2012 (Fig. 22).
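The role of time as a third factor can be demonstrated numerically. The sketch below uses hypothetical figures, not the actual steel production or crime statistics: two series are built as upward trends in time plus fluctuations constructed to be unrelated to each other. Their raw correlation is very high, but it disappears once the straight-line time trend is removed from each series. The helpers `pearson` and `detrend` are illustrative names.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def detrend(t, y):
    """Residuals of y after removing its least-squares straight line in t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sty = sum((a - mt) * (b - my) for a, b in zip(t, y))
    stt = sum((a - mt) ** 2 for a in t)
    slope = sty / stt
    return [b - (my + slope * (a - mt)) for a, b in zip(t, y)]


years = list(range(2001, 2013))
# Hypothetical data: each series is a straight upward trend in time plus a
# small fluctuation. The two fluctuation patterns are constructed to be
# unrelated to each other and to time.
wiggle_a = [1, -1, 1, -1, 0, 0, 0, 0, -1, 1, -1, 1]
wiggle_b = [1, 1, -1, -1, 0, 0, 0, 0, -1, -1, 1, 1]
steel = [40000 + 4000 * i + 1500 * w for i, w in enumerate(wiggle_a)]
crimes = [150000 + 30000 * i + 10000 * w for i, w in enumerate(wiggle_b)]

raw_r = pearson(steel, crimes)
detrended_r = pearson(detrend(years, steel), detrend(years, crimes))
print(f"raw correlation: {raw_r:.3f}")
print(f"after removing the time trend: {detrended_r:.3f}")
```

In this constructed example the raw correlation is about 0.99, while the detrended series are uncorrelated, which is the pattern one would expect when the shared upward trend is the only link between two variables.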
Fig. 11: Number of annual deaths from dowry in West Bengal and Andhra Pradesh from 2001 to 2012 (maximum of vertical axis shifted to 6000; series: ANDHRA PRADESH, WEST BENGAL; y-axis: Deaths; x-axis: Year)

Misleading 3D graphs

Fig. 14: 2D bar diagram showing the number of sunglasses sold from 2010 – 2013 in an optical shop

A 3D pie diagram may also be deceptive. The blood group distribution of several Arts students of a college is represented by the 3D pie diagram in Fig. 15. From the figure one would confidently conclude that O is the most common blood group among the students. The numbers of students with blood groups A and B are more or less the same and are the lowest among the four groups. But in the 2D pie diagram the picture is totally different (Fig. 16): it is clear that blood groups O and AB have the same share.

Fig. 15: 3D pie diagram showing the distribution of different blood groups among the Arts students of a college (legend: A, B, AB, O)

Fig. 16: 2D pie diagram showing the distribution of different blood groups among the Arts students of a college (legend: A, B, AB, O; slice labels: 11%, 9%, 40%, 40%)

Manipulating bar graph and pictogram
A simple bar diagram may be misleading when the bars are not of equal width. In Fig. 17, a bar diagram represents the average weekly food expenditure of families in Texas between 1986 and 1992. There is a slight upward trend in food expenditure. In Fig. 18 the same data are represented by a bar diagram in which the width of each bar is proportional to its height. This graph exaggerates expenditure growth by widening the bars as they become higher. This could create the impression that expenditures actually rose more sharply over the period.

Fig. 17: Average weekly food expenditure by different families of Texas

Fig. 18: Average weekly food expenditure by different families of Texas (width of the bars is proportional to their height)
A pictogram also creates a visual illusion if the pictures used in the diagram are not of the same size. The numbers of people owning different pets in a certain city are presented through the pictogram in Fig. 19. The pictures are of different sizes, and it appears that more people own a horse than any other animal. An improvement would be to redraw the pictogram with each of the animals the same size and aligned with one another (Fig. 20).

Fig. 19: Misleading pictogram showing no. of different pets owned by people

Fig. 20: Pictogram showing no. of different pets owned by people

Omitting data
Graphs created with omitted data remove information, which makes it hard to draw a proper conclusion. Fig. 23a and Fig. 23b show two scatter diagrams, with some categories (years) missing in Fig. 23b. In Fig. 23b the growth appears to be more linear, with less variation, than in Fig. 23a.

[Fig. 23a, caption not recovered: scatter plot; y-axis: Data, 0 to 100; x-axis: Year, 2001 to 2013]

Fig. 23b: Scatter plot with missing categories (y-axis: Data, 0 to 100; x-axis: Year, 2001 to 2013)
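How omitted data can make growth look more linear can be shown with a small numerical sketch. The series below is hypothetical, not the data behind Fig. 23a and Fig. 23b: it grows steadily but swings up and down from year to year. Keeping only alternate years hides the swings, and the fit of a straight line (measured here by R squared, computed by the illustrative helper `r_squared`) becomes perfect.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


years = list(range(2001, 2014))
# Hypothetical series: steady growth plus a year-to-year swing of +/-12.
data = [10 + 6 * i + (12 if i % 2 == 0 else -12) for i in range(len(years))]

# "Omit" every second year, keeping 2001, 2003, ..., 2013.
kept_years = years[::2]
kept_data = data[::2]

r2_all = r_squared(years, data)
r2_kept = r_squared(kept_years, kept_data)
print(f"all years: R^2 = {r2_all:.2f}; alternate years only: R^2 = {r2_kept:.2f}")
```

With all thirteen years the straight line explains roughly 78% of the variation, whereas the thinned series lies exactly on a line. The omitted points carried all of the variation.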
Fig. 21: Scatter diagram of Steel production vs. Number of crimes against women in India (x-axis: Steel production (in thousand tonnes))

Fig. 22: Upward trend for both Steel production (2001 – 2012) and No. of crime against women (2001 – 2012) (panels: No. of crime against women by Year; Steel production (in '000 tonnes) by Year)

Measuring distortion
Several methods have been developed to determine whether graphs are distorted and to quantify this distortion.

Lie factor:

$$\text{Lie factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect shown in data}}$$

where,

$$\text{size of effect} = \left| \frac{\text{second value} - \text{first value}}{\text{first value}} \right|$$

A perfectly accurate graph would exhibit a lie factor of 1. A graph with a high lie factor (>1) would exaggerate change in the data it represents, while one with a small lie factor between 0 and 1 would obscure change in the data.

Graph discrepancy index (GDI)

$$\text{GDI} = \left( \frac{a}{b} - 1 \right) \times 100\%$$

where
a = percentage change depicted in graph
b = percentage change in data

The graph discrepancy index, also known as the graph distortion index, was originally proposed by Paul John Steinbart in 1998. GDI ranges from -100% to ∞, with 0% indicating that the graph has been properly constructed; anything outside the ±5% margin is considered to be distorted.

Data-ink ratio

$$\text{Data-ink ratio} = \frac{\text{"Ink" used to display the data}}{\text{Total "ink" used to display the graphic}}$$

The data-ink ratio should be relatively high; otherwise the chart may have unnecessary graphics.

Data density

... into a 2D graph. Otherwise one should use a 2D graph.
A pictogram should be drawn using pictures of equal size.
Conclusions drawn from a scatter diagram should be carefully reviewed before they are finalised, for example for the plausibility of the relationship between the two variables and the effect of a third factor.
The graph should be presented in such a way that the conclusion is not affected by missing data.
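The lie factor and the graph discrepancy index defined under "Measuring distortion" can be computed directly from a pair of data values and the corresponding drawn sizes. The sketch below is an illustration only. It reuses the poll percentages quoted earlier (54 and 62) and assumes, hypothetically, that the bars are drawn from a baseline of 50, so their heights are 4 and 12. All function names are illustrative, and the example covers only the case where both changes are increases.

```python
def size_of_effect(first, second):
    """Relative change from `first` to `second`, as an absolute value."""
    return abs((second - first) / first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic over size of effect in the data."""
    shown = size_of_effect(graphic_first, graphic_second)
    actual = size_of_effect(data_first, data_second)
    return shown / actual


def gdi(depicted_change_pct, data_change_pct):
    """Graph discrepancy index in percent: (a / b - 1) * 100."""
    return (depicted_change_pct / data_change_pct - 1) * 100


def is_distorted(gdi_pct, margin=5.0):
    """Apply the +/-5% margin quoted in the text."""
    return abs(gdi_pct) > margin


# Assumed example: the data rise from 54 to 62, but the bars are drawn
# from a baseline of 50, so their heights are 4 and 12.
lf = lie_factor(4, 12, 54, 62)
a = size_of_effect(4, 12) * 100    # percentage change depicted in graph
b = size_of_effect(54, 62) * 100   # percentage change in data
index = gdi(a, b)
print(f"lie factor = {lf:.1f}, GDI = {index:.0f}%, distorted: {is_distorted(index)}")
```

Under these assumptions the lie factor is 13.5 and the GDI is 1250%, far outside the ±5% margin, whereas the same data drawn from a zero baseline would give a lie factor of 1 and a GDI of 0%.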
References