"Which statistical test should we use?" is a common question from Biology students. This Factsheet provides simple guidelines on when
each type of statistical test should be used.

The choice of the correct statistical test is all-important: use the wrong test and the conclusions will be invalidated. Marks are only awarded for an appropriate (i.e. correct) use of statistics. The flowchart below can be used to identify the appropriate test. Table 1 gives examples of investigations and appropriate tests.

Flowchart: choosing a statistical test

Are you:

- Comparing frequencies (numbers of things) in various categories? (e.g. are the numbers of seeds germinating in various trays significantly different?)
  Do you want to test:
    - Whether frequencies are the same in each of two or more categories? (e.g. are seed germination rates the same in different pHs?) -> Chi-squared Goodness-of-Fit
    - Whether observed frequencies are the same as those theoretically expected? (e.g. are the predictions of genetics correct?) -> Chi-squared Goodness-of-Fit
    - Whether two factors are related? (e.g. does pollution affect the number of sites at which clinging mayfly are found?) -> Contingency table

- Finding if there is a difference between two averages? (e.g. is there, on average, a higher species diversity in unpolluted water rather than polluted water?)
  Is the data:
    - Calculated (such as a diversity index) or Counted (such as the number of organisms)? -> Mann-Whitney U test
    - Measured (such as length, width, height, velocity)? Do the data occur in natural pairs? (e.g. the same organism reacting to two different stimuli.)
        - Yes -> paired t-test
        - No -> unpaired t-test

- Investigating the relationship between two variables? (e.g. is there a relationship between pollution level and distance from road?)
  Do you want to:
    - Find whether the variables are correlated? (i.e. does increasing one cause the other to increase or decrease?) -> Spearman's rank correlation coefficient
    - Use one variable to predict the value of the other? (e.g. predict the pollutant levels at 10 m from the road) -> Regression
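The routes through the flowchart can also be written as a small lookup, which may make it easier to check which answers lead to which test. This is only a sketch: the function name `choose_test` and the short labels used as keys are assumptions made for the illustration and do not come from the Factsheet.

```python
# Hypothetical helper: the flowchart written out as a lookup table.
# The key labels are invented for this sketch; the test names follow the flowchart.
FLOWCHART = {
    ("frequencies", "same in each category"): "Chi-squared goodness-of-fit",
    ("frequencies", "match theoretical expectation"): "Chi-squared goodness-of-fit",
    ("frequencies", "two factors related"): "Chi-squared contingency table",
    ("averages", "calculated"): "Mann-Whitney U test",
    ("averages", "counted"): "Mann-Whitney U test",
    ("averages", "measured, natural pairs"): "Paired t-test",
    ("averages", "measured, no natural pairs"): "Unpaired t-test",
    ("relationship", "correlated"): "Spearman's rank correlation coefficient",
    ("relationship", "predict"): "Regression",
}


def choose_test(aim, answer):
    """Return the test the flowchart leads to for an aim and a follow-up answer."""
    try:
        return FLOWCHART[(aim, answer)]
    except KeyError:
        # Any combination not drawn on the flowchart is rejected rather than guessed.
        raise ValueError(f"no route in the flowchart for {aim!r} / {answer!r}") from None


print(choose_test("averages", "measured, natural pairs"))  # prints "Paired t-test"
```

A lookup table is used rather than nested `if` statements so that each route of the flowchart appears as exactly one line.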


Table 1. Examples of investigations and appropriate tests

| Investigation | Data collected | Null hypothesis (Ho) | Test | Reason for choice |
|---|---|---|---|---|
| 1. Effect of pH on seed germination | Number of seeds (out of 20, say) germinating in each of several trays which contain different pH solutions | Ho: Number of seeds germinating is not dependent on pH | Chi-squared | We are comparing observed frequencies with expected frequencies (i.e. that the same number germinate in each tray) |
| 2. Effect of differing environmental conditions (e.g. two soil types) on the yield of a crop plant | The weight of usable crop produced from a given area at a minimum of 4 sites for each soil type | Ho: Mean yield the same for both soil types | t-test (unpaired) | The variable measured is continuous, and would be expected to follow a normal (or bell-shaped) distribution, and we are interested in comparing mean (or average) values. Most "natural" measurements (length, weight etc.) follow a normal distribution |
| 3. [...] on vegetation height | [...] from at least 4 polluted and 4 unpolluted sites | Ho: [...] unaffected by pollution | t-test (unpaired) | As for investigation 2 |
| 4. Comparison of leaf length for the same tree species in two different sites | Measurements of length of at least 20 leaves from each site | Ho: Mean leaf lengths the same for both sites | t-test (unpaired) | As for investigation 2 |
| 5. Comparison of plant growth on two sides of a hedgerow | Measurement of vegetation height at matched sites, i.e. equivalent points at opposite sides of hedge | Ho: Mean vegetation height the same on both sides | t-test (paired) | The paired test is used when the above applies and we have a natural matching between sites |
| 6. Comparison of species diversity in mown and unmown turf | Simpson's Diversity Index at a minimum of 5 mown and 5 unmown sites | Ho: Species diversity does not differ significantly between mown and unmown sites | Mann-Whitney U-test | This is used when we want to compare averages, but cannot assume our figures come from a normal distribution. This will certainly be the case when we are comparing something we have calculated or counted at different sites. It can be used instead of a t-test as well, but is less powerful and less likely to give significant results |
| 7. Lichen distribution related to direction faced (North or South) | The number of quadrats in which lichen occurs at each of a minimum of 5 sites facing in each direction | Ho: Lichen distribution does not differ significantly between North and South facing areas | Mann-Whitney U-test | As for investigation 6 |
| 8. Comparison of wildlife in coppiced and uncoppiced woods | Either: the incidence of specified species at each of at least 5 sites in each type of woodland. Or: Simpson's Diversity Index at each of at least 5 sites in each type of woodland | Ho: Incidence of specified species / Species diversity does not differ significantly between coppiced and uncoppiced woodland | Mann-Whitney U-test | As for investigation 6 |
| 9. Relationship between species diversity and concentration of zinc in a stream | Simpson's Diversity Index and zinc concentration at a minimum of 5 sites (but preferably more) | Ho: There is no correlation between species diversity and zinc concentration in a stream | Spearman's Rank | We are looking for a relationship where species diversity decreases with zinc concentration |
| 10. Effect of soil type on incidence of a particular plant species | Number of quadrats in which species is present or absent for at least 20 samples taken for each soil type (there must be enough samples to guarantee that there will be at least 5 cases where it is present and 5 where it is absent for each soil type) | Ho: Incidence of the species is not dependent on soil type | Chi-squared (Contingency Table) | We are trying to test whether two factors (i.e. type of soil and absence or presence of a plant species) are independent |
| 11. Establishment of the exact relationship between width and height of limpets | Measurements of base width and height of at least 10 limpets | N/A | Regression | Since we are trying to find a relationship between the two variables, we want to be able to use one to predict the other |
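As an illustration of the two chi-squared investigations (1 and 10), the sketch below computes the statistic, the sum of (observed - expected)^2 / expected, for a goodness-of-fit case and for a contingency table. All counts are invented for the example, the function names are hypothetical, and no continuity correction is applied. The 5% critical values quoted are the standard tabulated ones for 1 to 3 degrees of freedom.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table):
    """Expected count for each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Standard 5% critical values of chi-squared, keyed by degrees of freedom.
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815}

# Goodness-of-fit: invented numbers of seeds germinating in four trays.
# Under Ho the same number is expected to germinate in each tray.
germinated = [12, 16, 18, 10]
expected = [sum(germinated) / len(germinated)] * len(germinated)
gof = chi_squared(germinated, expected)
gof_df = len(germinated) - 1

# Contingency table: invented quadrat counts, rows = soil type,
# columns = species present / species absent.
soil = [[14, 6], [7, 13]]
exp = expected_counts(soil)
cont = chi_squared([o for row in soil for o in row],
                   [e for row in exp for e in row])
cont_df = (len(soil) - 1) * (len(soil[0]) - 1)

for name, stat, df in [("goodness-of-fit", gof, gof_df),
                       ("contingency", cont, cont_df)]:
    verdict = "reject Ho" if stat > CRITICAL_5_PERCENT[df] else "do not reject Ho"
    print(f"{name}: chi-squared = {stat:.3f}, df = {df}, {verdict} at 5%")
```

With these invented counts the goodness-of-fit statistic falls below its critical value while the contingency statistic exceeds it. Ho is rejected only when the calculated value is larger than the tabulated one.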

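To show why the paired t-test is preferred when there is a natural matching between sites (investigation 5), the following sketch calculates both the paired and the unpaired t statistic for the same set of invented hedgerow measurements. The function names are hypothetical, and the unpaired version assumes the pooled-variance form of the test, which the Factsheet does not specify.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(a, b):
    """t for matched samples: mean difference / standard error of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / sqrt(variance(diffs) / len(diffs))


def unpaired_t(a, b):
    """t for independent samples, using a pooled estimate of the variance."""
    n_a, n_b = len(a), len(b)
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


# Invented vegetation heights (cm) at five matched points either side of a hedge.
side_a = [30, 42, 35, 50, 38]
side_b = [26, 40, 30, 44, 35]

t_paired = paired_t(side_a, side_b)      # degrees of freedom = n - 1 = 4
t_unpaired = unpaired_t(side_a, side_b)  # degrees of freedom = n_a + n_b - 2 = 8

print(f"paired t = {t_paired:.3f}, unpaired t = {t_unpaired:.3f}")
```

For these invented figures the paired statistic is well above the tabulated 5% critical value for 4 degrees of freedom (2.776), whereas the unpaired statistic is below the value for 8 degrees of freedom (2.306). The large variation between points along the hedge hides the consistent difference between the two sides unless the matching is used.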

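The two rank-based tests in Table 1 (Mann-Whitney U and Spearman's rank) can be sketched in a few lines once the values have been ranked. The data below are invented, the function names are hypothetical, and the Spearman formula used, 1 - 6 x (sum of d^2) / (n(n^2 - 1)), is the usual hand-calculation form, which is exact only when there are no tied ranks.

```python
def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def mann_whitney_u(a, b):
    """Smaller of the two U values for samples a and b."""
    combined = ranks(list(a) + list(b))
    rank_sum_a = sum(combined[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient (no-ties formula)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Invented Simpson's Diversity Index values at 5 mown and 5 unmown sites.
mown = [0.42, 0.51, 0.38, 0.47, 0.55]
unmown = [0.68, 0.73, 0.59, 0.81, 0.52]
u = mann_whitney_u(mown, unmown)

# Invented zinc concentrations and diversity indices at 6 stream sites.
zinc = [5, 12, 20, 33, 41, 58]
diversity = [0.82, 0.75, 0.79, 0.60, 0.48, 0.51]
rs = spearman_rs(zinc, diversity)

print(f"U = {u}, rs = {rs:.3f}")
```

Unlike chi-squared and t, the calculated U must be less than or equal to the tabulated critical value for Ho to be rejected. For two samples of 5 the two-tailed 5% critical value is 2, so the U obtained here would be significant. The negative value of rs reflects diversity falling as zinc concentration rises.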