## Which Stats test should I use?

"Which statistical test should we use?" is a common question from Biology students. This Factsheet provides simple guidelines on when
each type of statistical test should be used.
The choice of statistical test is all-important: use the wrong test and the conclusions will be invalid. Marks are only awarded for an appropriate (i.e. correct) use of statistics. The flowchart below can be used to identify the appropriate test. Table 1 gives examples of investigations and appropriate tests.

Are you:

- Comparing frequencies (numbers of things) in various categories? (eg. are the numbers of seeds germinating in various trays significantly different?) Do you want to test:
  - Whether the frequencies are the same in each of two or more categories? (eg. are seed germination rates the same in different pHs?) Use Chi-squared Goodness-of-Fit.
  - Whether observed frequencies are the same as those theoretically expected? (eg. are the predictions of genetics correct?) Use Chi-squared Goodness-of-Fit.
  - Whether two factors are related? (eg. does pollution affect the number of sites at which clinging mayfly are found?) Use a Chi-squared Contingency table.
- Finding if there is a difference between two averages? (eg. is there, on average, a higher species diversity in unpolluted water rather than polluted water?) Is the data:
  - Calculated (such as a diversity index) or counted (such as the number of organisms)? Use the Mann-Whitney U test.
  - Measured (such as length, width, height, velocity)? Do the data occur in natural pairs (eg. the same organism reacting to two different stimuli)?
    - Yes: use the paired t-test.
    - No: use the unpaired t-test.
- Investigating the relationship between two variables? (eg. is there a relationship between pollution level ...) Do you want to:
  - Find whether the two variables are correlated? (i.e. does increasing one cause the other to increase or decrease?) Use Spearman's rank correlation coefficient.
  - Use one variable to predict the value of the other? (eg. predict the pollutant levels at ...) Use Regression (as in investigation 11 of Table 1).
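
The flowchart can also be read as a small decision function. The following is a minimal sketch in Python, not part of the Factsheet: the function name `choose_test` and its argument labels are hypothetical. It assumes that both questions about frequencies in categories lead to the goodness-of-fit test, and it takes Regression for the prediction branch from investigation 11 of Table 1.

```python
def choose_test(aim, *, question=None, data_kind=None, paired=None):
    """Return the test the flowchart suggests for a described investigation.

    All labels below are illustrative names chosen for this sketch.
    """
    if aim == "frequencies":
        if question in ("same_in_each_category", "match_expected"):
            return "Chi-squared goodness-of-fit"
        if question == "factors_related":
            return "Chi-squared contingency table"
    elif aim == "difference_between_averages":
        if data_kind in ("calculated", "counted"):
            return "Mann-Whitney U test"
        if data_kind == "measured":
            if paired is None:
                raise ValueError("say whether the data occur in natural pairs")
            return "Paired t-test" if paired else "Unpaired t-test"
    elif aim == "relationship":
        if question == "correlated":
            return "Spearman's rank correlation coefficient"
        if question == "predict":
            return "Regression"
    raise ValueError("investigation not covered by the flowchart")


# Example: vegetation height measured at matched points either side of a hedge.
hedgerow_test = choose_test(
    "difference_between_averages", data_kind="measured", paired=True
)
```

Calling the function with a description that the flowchart does not cover raises an error instead of guessing a test.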

Table 1. Examples of investigations and appropriate tests

| Investigation | What is measured? | Null hypothesis | Statistical test | Explanation |
|---|---|---|---|---|
| 1. Effect of pH on seed germination | Number of seeds (out of 20, say) germinating in each of several trays which contain different pH solutions | H₀: Number of seeds germinating is not dependent on pH | Chi-squared | We are comparing observed frequencies with expected frequencies (i.e. that the same number germinate in each tray). |
| 2. Effect of differing environmental conditions (eg. two soil types) on the yield of a crop plant | The weight of usable crop produced from a given area at a minimum of 4 sites for each soil type | H₀: Mean yield is the same for both soil types | t-test (unpaired) | The variable measured is continuous, and would be expected to follow a normal (or bell-shaped) distribution, and we are interested in comparing mean (or average) values. Most "natural" measurements (length, weight etc.) follow a normal distribution. |
| 3. Effect of pollution on vegetation | Measurements of vegetation height from at least 4 polluted and 4 unpolluted sites | H₀: Mean vegetation height is unaffected by pollution | t-test (unpaired) | As for investigation 2. |
| 4. Comparison of leaf length for the same tree species in two different sites | Measurements of length of at least 20 leaves from each site | H₀: Mean leaf lengths are the same for both sites | t-test (unpaired) | As for investigation 2. |
| 5. Comparison of plant growth on two sides of a hedgerow | Measurement of vegetation height at matched sites, i.e. equivalent points at opposite sides of the hedge | H₀: Mean vegetation height is the same on both sides | t-test (paired) | The paired test is used when the above applies and we have a natural matching between sites. |
| 6. Comparison of species diversity in mown and unmown turf | Simpson's Diversity Index at a minimum of 5 mown and 5 unmown sites | H₀: Species diversity does not differ significantly between mown and unmown sites | Mann-Whitney U-test | This is used when we want to compare averages, but cannot assume our figures come from a normal distribution. This will certainly be the case when we are comparing something we have calculated or counted at different sites. It can be used instead of a t-test as well, but is less powerful and less likely to give significant results. |
| 7. Lichen distribution related to direction faced (North or South) | The number of quadrats in which lichen occurs at each of a minimum of 5 sites facing in each direction | H₀: Lichen distribution does not differ significantly between North and South facing areas | Mann-Whitney U-test | As for investigation 6. |
| 8. Comparison of wildlife in coppiced and uncoppiced woods | Either: the incidence of specified species at each of at least 5 sites in each type of woodland. Or: Simpson's Diversity Index at each of at least 5 sites in each type of woodland | H₀: Incidence of specified species / species diversity does not differ significantly between coppiced and uncoppiced woodland | Mann-Whitney U-test | As for investigation 6. |
| 9. Relationship between species diversity and concentration of zinc in a stream | Simpson's Diversity Index and zinc concentration at a minimum of 5 sites (but preferably more) | H₀: There is no correlation between species diversity and zinc concentration in a stream | Spearman's Rank | We are looking for a relationship where species diversity decreases with zinc concentration. |
| 10. Effect of soil type on incidence of a particular plant species | Number of quadrats in which the species is present or absent for at least 20 samples taken for each soil type (there must be enough samples to guarantee that there will be at least 5 cases where it is present and 5 where it is absent for each soil type) | H₀: Incidence of the species is not dependent on soil type | Chi-squared (Contingency Table) | We are trying to test whether two factors (i.e. type of soil and absence or presence of a plant species) are independent. |
| 11. Establishment of the exact relationship between width and height of limpets | Measurements of base width and height of at least 10 limpets | N/A | Regression | Since we are trying to find a relationship between the two variables, we want to be able to use one to predict the other. |
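
As an illustration of investigation 1, the chi-squared statistic can be calculated by hand as the sum of (observed - expected)² / expected over all trays. The sketch below is one way to do this in Python. The germination counts are hypothetical, invented only to show the arithmetic, and the function name `chi_squared_gof` is an assumption of this example. The critical value 7.815 is the standard tabulated value for 3 degrees of freedom at the 5% level.

```python
def chi_squared_gof(observed, expected=None):
    """Chi-squared goodness-of-fit statistic.

    If no expected frequencies are given, assume the null hypothesis that
    every category has the same frequency (the mean of the observed counts).
    """
    if expected is None:
        mean = sum(observed) / len(observed)
        expected = [mean] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL data: seeds germinating (out of 20) in trays at four pHs.
germinated = [4, 15, 17, 8]

statistic = chi_squared_gof(germinated)
degrees_of_freedom = len(germinated) - 1  # number of categories minus 1

# Tabulated critical value for 3 degrees of freedom at the 5% level.
CRITICAL_5_PERCENT_DF3 = 7.815
reject_null = statistic > CRITICAL_5_PERCENT_DF3
```

With these made-up counts the statistic is about 10.0, which exceeds 7.815, so the null hypothesis that germination does not depend on pH would be rejected at the 5% level.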

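
For investigation 6, the Mann-Whitney U statistic is found by ranking all the values together and comparing the rank sums of the two groups. The following is a minimal sketch: the diversity index values are hypothetical, and `mann_whitney_u` is a name chosen for this example. The critical value of 2 is the usual tabulated value for two samples of 5 at the 5% level (two-tailed); check it against the table you have been given.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the smaller of the two U values, giving tied values their mean rank."""
    combined = sorted(sample_a + sample_b)

    # Map each distinct value to the mean of the rank positions it occupies.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[value] for value in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# HYPOTHETICAL Simpson's Diversity Index values at 5 mown and 5 unmown sites.
mown = [0.42, 0.51, 0.38, 0.47, 0.55]
unmown = [0.68, 0.74, 0.59, 0.81, 0.63]

u_statistic = mann_whitney_u(mown, unmown)

# For this test the result is significant when U is at or BELOW the critical value.
CRITICAL_U_5_PERCENT_N5_N5 = 2
significant = u_statistic <= CRITICAL_U_5_PERCENT_N5_N5
```

Here every mown value is lower than every unmown value, so U is 0 and the difference would be judged significant.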

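
For investigation 9, Spearman's rank correlation coefficient can be calculated from the differences d between the ranks of each pair of values, using r_s = 1 - 6Σd² / (n(n² - 1)). The sketch below assumes there are no tied values. The zinc and diversity figures are hypothetical and the function names are chosen for this example only. The critical value 0.786 is the usual tabulated value for 7 pairs at the 5% level (two-tailed).

```python
def ranks_without_ties(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(value) + 1 for value in values]


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient for untied paired data."""
    n = len(x)
    sum_d_squared = sum(
        (rank_x - rank_y) ** 2
        for rank_x, rank_y in zip(ranks_without_ties(x), ranks_without_ties(y))
    )
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# HYPOTHETICAL data from 7 stream sites.
zinc_concentration = [0.2, 0.5, 0.9, 1.4, 2.0, 2.6, 3.1]
diversity_index = [0.82, 0.79, 0.71, 0.74, 0.60, 0.52, 0.45]

r_s = spearman_rs(zinc_concentration, diversity_index)

# Compare the size of r_s (ignoring its sign) with the tabulated value.
CRITICAL_RS_5_PERCENT_N7 = 0.786
significant = abs(r_s) > CRITICAL_RS_5_PERCENT_N7
```

With these invented figures r_s is about -0.96, a strong negative correlation: diversity falls as zinc concentration rises, and the null hypothesis of no correlation would be rejected.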