6. Exploring data – using graphs
Graphs are useful for several reasons. They can help us to visualise the data and decide
which statistical test is the best. We may spot patterns in the data and gain a better
understanding of what we are dealing with. Graphs are also useful for summarising our final
results, especially when we present our findings to other people.
We can think of graphs as being useful for two purposes: firstly to help us decide how to
tackle the data, and secondly to present results. We will look at details of graphs and how
to produce them in Excel and R in Section 12.4 where we examine ways to present
our findings. We will also mention graphs throughout the text as we look at the various
analytical methods to examine our data. Indeed we have already seen some examples in
Chapter 4. In this short chapter we will summarise the graphs we might use to help us
explore our data.
6.1 Exploratory graphs
One of the most common analyses of a sample of data is to determine if they are normally
distributed or not. This affects the kind of statistical analysis we are able to perform on the
data. There are several ways we can illustrate the distribution of a data sample. We may
use a simple tally plot or a stem–leaf plot; we can even do this right from our notebook in
the field. The following example shows a stem–leaf plot.
1 | 679
2 | 112334
2 | 5666678899
3 | 01124
3 | 6
In this example, the data are sorted in numerical order in each row but we can still gain
insights into the data distribution if the numbers are not sorted.
1 | 967
2 | 143123
2 | 9568667869
3 | 40121
3 | 6
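The two stem–leaf plots above can also be produced programmatically. The following is a minimal sketch in Python, not a method taken from this text: the function name `stem_leaf` and its `sort_leaves` option are illustrative assumptions, and the values are simply those read back from the unsorted plot above. It assumes whole-number data and, as in the example, splits each stem into a lower row (leaves 0 to 4) and an upper row (leaves 5 to 9).

```python
from collections import defaultdict

# Values read back from the unsorted stem-leaf plot above, in the order shown.
data = [19, 16, 17, 21, 24, 23, 21, 22, 23,
        29, 25, 26, 28, 26, 26, 27, 28, 26, 29,
        34, 30, 31, 32, 31, 36]


def stem_leaf(values, sort_leaves=True):
    """Return the rows of a split stem-leaf plot for whole numbers."""
    rows = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)   # tens part and units digit
        half = 0 if leaf < 5 else 1      # lower row: 0-4, upper row: 5-9
        rows[(stem, half)].append(leaf)
    lines = []
    for stem, half in sorted(rows):      # rows with no leaves are omitted
        leaves = rows[(stem, half)]
        if sort_leaves:
            leaves = sorted(leaves)
        lines.append(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
    return lines


for line in stem_leaf(data):
    print(line)
```

Calling `stem_leaf(data, sort_leaves=False)` keeps the leaves in the order they were recorded, which corresponds to the unsorted version of the plot.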
A simpler version of a stem–leaf plot is the tally plot, and in this case we enter the data as
a simple tally mark. In Table 28, we see a tally plot of the same data as our stem–leaf plot.
96 | Statistics for Ecologists Using R and Excel
Table 28. A tally plot to show data distribution
Tally Bin
x 16
x 18
x 20
xxx 22
xxx 24
xxxxx 26
xxx 28
xxx 30
xxx 32
x 34
x 36
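A tally plot of this kind can be sketched in a few lines of Python. This is an illustration only: the function name `tally_plot` is hypothetical, and the code assumes that each bin label is the upper limit of a class two units wide (so the bin labelled 18 holds values above 16 and up to 18). Under that reading the counts agree with Table 28.

```python
import math

# The same 25 values as in the stem-leaf plot.
data = [16, 17, 19, 21, 21, 22, 23, 23, 24, 25, 26, 26, 26, 26, 27,
        28, 28, 29, 29, 30, 31, 31, 32, 34, 36]


def tally_plot(values, width=2):
    """Return (tally marks, bin label) pairs; each label is the bin's upper limit."""
    counts = {}
    for value in values:
        upper = math.ceil(value / width) * width   # round up to the bin's upper limit
        counts[upper] = counts.get(upper, 0) + 1
    first, last = min(counts), max(counts)
    # Include any empty bins between the first and last so gaps stay visible.
    return [("x" * counts.get(b, 0), b) for b in range(first, last + width, width)]


for marks, label in tally_plot(data):
    print(f"{marks:<6}{label}")
```

The counts per bin are the same frequencies that a histogram would display as bar heights.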
These are simple plots but nevertheless can be extremely helpful. When we return from
the field we may decide to use a more formal histogram to illustrate the distribution
(Figure 78).
Figure 78. A histogram to illustrate the distribution of a data sample
The size of the bars in our histogram shows us the number of items (the frequency) of our
dataset that lie within each size class, represented on the x-axis. We may decide to use a
line instead of bars and the result is a density plot (Figure 79).
Figure 79. A density plot to illustrate the distribution of a data sample
Some types of graph are useful because they show a lot of information in a compact manner,
such as the box–whisker plot. A box–whisker plot shows us five pieces of information:
median, maximum, minimum and both quartiles (Figure 80).
Figure 80. A box–whisker plot can be used to illustrate data distribution as well as
providing other information, e.g. median, inter-quartiles and max/min
In Figure 80, we can see that the data appear normally distributed as the box–whiskers are
symmetrical about the median stripe. We can use the box–whisker plot to look at several
samples and illustrate not only differences between samples but their distribution as well
(Figure 82).
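The five values behind a box–whisker plot, and the symmetry we look for by eye, can be checked numerically. The sketch below is an illustration rather than the method used to draw Figure 80: the function name `five_number_summary` is hypothetical, the data are the stem–leaf values from earlier (not the data behind the figure), and the quartiles use the "inclusive" rule of Python's `statistics.quantiles`. Other programs use slightly different quartile rules and may draw whiskers that exclude outliers.

```python
import statistics

# The 25 values from the stem-leaf plot earlier in this chapter.
data = [16, 17, 19, 21, 21, 22, 23, 23, 24, 25, 26, 26, 26, 26, 27,
        28, 28, 29, 29, 30, 31, 31, 32, 34, 36]


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


low, q1, median, q3, high = five_number_summary(data)
# Similar distances either side of the median suggest a symmetrical sample.
print("box halves:", median - q1, q3 - median)
print("whiskers:  ", q1 - low, high - q3)
```

Roughly equal box halves and whiskers are only a visual hint of a normal distribution; they do not replace a formal test.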
Another way we can visualise our data is by using a line graph to show the running average
(mean or median). We met this earlier in Section 4.7 where we used the idea to help
determine if we had collected enough data. In Figure 81, we see an example of a running mean.
Figure 81. A line graph illustrating the running mean
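A running average is simply the average of all the observations collected so far, recalculated each time a new one is added. The following Python sketch shows the idea; the function names are hypothetical and the observations are the first few stem–leaf values, treated purely for illustration as if they had been collected in that order.

```python
import statistics
from itertools import accumulate


def running_mean(values):
    """Mean of the first 1, 2, 3, ... observations."""
    return [total / n for n, total in enumerate(accumulate(values), start=1)]


def running_median(values):
    """Median of the first 1, 2, 3, ... observations."""
    return [statistics.median(values[:n]) for n in range(1, len(values) + 1)]


# Assumed collection order, for illustration only.
observations = [19, 16, 17, 21, 24, 23, 21, 22, 23]

for n, (mean, median) in enumerate(
        zip(running_mean(observations), running_median(observations)), start=1):
    print(f"n={n}: running mean {mean:.2f}, running median {median}")
```

Plotting either list against the number of observations gives the kind of line graph described here; when the line levels off, further observations are changing the average very little.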
This is another example of a graph we can sketch whilst out in the field. We do not have to
be quite so exact when we are out in the field; the graph is simply a tool to help us make a
6.2 Graphs to illustrate differences
When we have a project that is centred on looking at differences between samples we can
illustrate the situation using bar charts or box–whisker plots. We met the box–whisker plot
previously (Figure 80) when we used it to view a sample and check its distribution. In
Figure 82 we look at three samples.
Figure 82. A box–whisker plot illustrating differences between three samples
We can see the differences between the three samples fairly easily and in addition we can
gain some insight into the distribution. A common alternative to the box–whisker plot is
the bar chart. This is useful to show differences between items in different categories and
is therefore suitable to illustrate differences in samples. In Figure 83 we see the same data
as in Figure 82 but here we use a bar chart with standard error bars to show the variability
within each sample.
Figure 83. A bar chart illustrating differences between three samples
We can see from Figure 82 that there are differences between the three samples, and we also
gain some insight into their distribution. In Figure 83 the differences are still visible
but we cannot tell anything about the distribution.
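The bar heights and standard error bars in a chart of this kind come from two simple calculations per sample: the mean, and the standard deviation divided by the square root of the sample size. The sketch below illustrates this in Python; the function name `mean_and_se` is hypothetical and the three samples are invented for illustration (they are not the data behind Figures 82 and 83).

```python
import math
import statistics


def mean_and_se(values):
    """Return the mean and its standard error (sample sd / sqrt(n))."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical samples, invented for illustration only.
samples = {
    "Sample A": [12, 15, 14, 11, 13, 16],
    "Sample B": [18, 21, 19, 22, 20, 17],
    "Sample C": [14, 25, 9, 30, 12, 21],
}

for name, values in samples.items():
    mean, se = mean_and_se(values)
    print(f"{name}: bar height {mean:.2f}, error bar +/- {se:.2f}")
```

Two numbers per sample are all the bar chart carries, which is why it cannot show the shape of the distribution in the way a box–whisker plot can.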
6.3 Graphs to illustrate links
When we think of ways to link data together there are two main approaches. In one
approach, we have two sets of values; both are numeric and one represents a dependent
variable and the other an independent variable. We are looking for a correlation. In the
other kind of approach, we have categories of items and we are looking to associate one
set of categories with the other.
6.3.1 Graphs to illustrate correlations
When we are looking for correlations, we can best illustrate the situation using a scatter
plot; this allows us to see how one variable is related to the other. In Figure 84 we see a
scatter plot showing how the abundance of a freshwater invertebrate is related to the
speed of the water in which it lives.
Figure 84. A scatter plot illustrating a correlation
In this case, it appears that as the water speed increases so does the abundance of the
invertebrate. We do not know if this relationship is statistically significant but it gives us
an impression. When we have several independent variables we can plot several scatter
plots; this may help us decide which is the most important factor to consider (Figure 85).
Figure 85. Multiple scatter plots showing one dependent variable plotted against several
independent variables
In Figure 85 we can see that two of the independent variables show a more definite trend
than the others; one shows a positive correlation and the other a negative one (although at
this point we do not know if either is statistically significant).
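The direction and strength of the trend seen in a scatter plot can be summarised by a correlation coefficient. The sketch below computes Pearson's product-moment coefficient in Python. This is one possible choice and is an assumption here, since the text at this point does not name a particular coefficient. The function name `pearson_r` and all of the data are hypothetical, invented only to show one positive and one negative relationship.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
abundance = [4, 7, 9, 12, 15, 19, 22, 24]          # dependent variable
independents = {
    "water speed": [2, 3, 5, 6, 8, 9, 11, 12],     # rises with abundance
    "shade":       [9, 8, 8, 6, 5, 3, 3, 1],       # falls as abundance rises
}

for name, values in independents.items():
    print(f"{name}: r = {pearson_r(values, abundance):.2f}")
```

A coefficient near +1 or -1 matches a clear upward or downward trend on the plot, but on its own it says nothing about statistical significance.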
6.3.2 Graphs to illustrate associations
When we have categorical variables, we have various choices. We can display the data for
each row or column category as a pie chart (e.g. Figure 86); this will usually require several
pie charts to be produced (one for each row or column category, depending on how we
want to look at the data). The pie chart shows the data proportionally; each slice of pie
shows the contribution as a proportion of the total.
Figure 86. A pie chart illustrating categorical data. The proportions of common bird species
in a garden habitat
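The size of each slice in a pie chart is the count for that category divided by the total for the chart. A minimal Python sketch of that calculation follows; the function name `proportions` is hypothetical and the bird counts are invented for illustration (they are not the data behind Figure 86).

```python
def proportions(counts):
    """Convert category counts to proportions of the total."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Hypothetical counts for one habitat, invented for illustration only.
garden = {"Blackbird": 47, "Chaffinch": 19, "Great tit": 50,
          "House sparrow": 46, "Robin": 9}

for species, share in proportions(garden).items():
    print(f"{species}: {share:.1%} of the pie")
```

With several habitats, the same calculation is repeated for each one, giving one pie chart per habitat.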
When we have this kind of data we can always represent it in the form of a bar chart
instead. The advantage of the bar chart is that we can show several categories at one time
(Figure 87).
Figure 87. A bar chart illustrating categorical data. The number of common garden birds in
various habitats
In Figure 87 we can see various bird species and various habitats; in this case we have also
included a legend on the graph so the reader can identify the various bars more easily.
6.4 Graphs – a summary
There are quite a few different sorts of graph that we can utilise to help visualise our data
and make important decisions about the analytical approach (Table 29). We should also
use graphs to illustrate our data, which can make them more comprehensible to readers.
When we present graphs we should ensure they are fully labelled and as clear as possible.
Even when we use graphs for our own use it is good practice to label and title them fully.
Label axes and include the units.
Do not include too many different elements on a single graph – avoid clutter and if
necessary produce two graphs rather than one.
Give a main title explaining what the graph shows. Usually this is done as a caption in a
word processor. The caption should enable a reader to understand what the graph shows
without having to read the main text. If your graph is in your field notebook then make
sure you describe the graph so that someone else can understand it.
Table 29. Summary of graph types to use for different purposes
Purpose                                   Types of graph
Illustrating distribution                 Stem–leaf plot, tally plot, histogram, density chart, box–whisker plot
Illustrating differences between samples  Bar chart, box–whisker plot
Illustrating correlations                 Scatter plot
Illustrating associations                 Pie charts, bar charts
Illustrating sample sizes                 Line plot of running average (mean or median)
We will examine graphs in more detail in Chapter 12, which will also cover the presentation
of results. Sections 12.4.1 and 12.5 will deal with producing graphs in R and Section 12.4.3
will cover producing graphs in Excel. We will also make some references to graphs in each
of the sections dealing with the details of the various analytical methods. It is important to
remember that our graphical analysis should go alongside the mathematical one.