Professional Documents
Culture Documents
CHAPTER 4: HOW
MUCH EVIDENCE IS
ENOUGH?
Our artist painting a landscape does not use a canvas bigger than
will fit in a room. He also does not try to tell the story in 10 different ways
on the same canvas. If he wants to communicate, and not die of hunger
before he completes the work, he must decide how much will be enough.
Deciding how much is adequate, but not wasteful, is the art of the painter.
It is also the art of the scientist. We will start with a consideration of a
simple question and a simple graph.
Our scientist studying crayfish decides to test whether places with
no fish have more crayfish than places with predatory fish. She therefore
counts the number of crayfish in sections of stream with predatory fish
and without predatory fish, but she is unsure how many sections to sample.
Obviously, one of each type of section is not going to tell her much (Fig.
20a). Therefore she increases the number to three of each type of section,
but is still in doubt (Fig.20b). With 5 sections of stream in each category
she does not have much doubt (Fig. 20c).
This process, which Dytham (1999) called “collecting dummy data,”
appears much too simple, but it is one of the most powerful means of
planning research. There are some very elegant mathematical formulae
for deciding how many observations are needed to detect an effect of a
given magnitude (e.g. Krebs, 1989). However, all require preliminary
samples, and most only apply to fairly simple situations. Generally, asking
an experienced researcher to make hypothetical graphs that show the
expected variability in the data is about as good as any of the computer
Figure 20
10 10 10
A B C
8 8 8
CRAY FISH
CRAY FISH
CRAY FISH
6 6 6
4 4 4
2 2 2
0 0 0
FISH NO FISH FISH NO FISH FISH NO FISH
31
methods. Just keep adding points until the graphs look convincing. If you
cannot extract useful information from the team members in relation to
questions at the scale that is the focus of the project, there are a few
seat-of-the-pants rules that you can use. Ecological data usually show
the sort of variability in Fig. 20. Therefore, for comparisons between
categories, you should ensure that there will be at least 4 observations
per category, and preferably more. However, there is usually no point in
having more than 10 observations per category unless data are very cheap
to collect, or you think it important to detect extremely small differences
between categories. We will add other seat-of-the-pants rules in the
following chapters, but four observations per category will generally stand
us in good stead.
Figure 21
10 10
A B
8 8
CRAYFISH
CRAYFISH
6 6
4 4
2 2
0 0
FISH NO FISH FISH NO FISH
TABLE 2
COPEPODS ALGAE
JUVENILES 3217 18
ADULTS 23 2936
contrasts the number of copepods and algae found in juveniles and adults.
The biologist knows that it will be impossible to publish the results
without a statistical test and so applies a contingency-table analysis which
results in the following hieroglyphics c2 =6030, P<<0.001. Do not worry if
1
you do not know what they mean. Most readers will revise what they
think that they mean after reading a few more chapters. The important
thing is that most practicing ecologists would interpret the result as
indicating that it is very unlikely that juveniles and adults have the same
diet.
However, the impression changes when we look at the biologist’s
note book (Table 3). Now nobody thinks that there is a convincing general
difference between the diets of adults and juveniles. By chance, an adult
swam into a school of copepods, and one juvenile had gorged on plankton.
Why did the test give us a false answer? Because the analysis assumed
that each register of a copepod or alga was independent of the others.
We could have carried out a valid test had we randomly selected one
item from each gut and discarded the rest of the sample. This would
have required 6194 individual fish, each fish taken from a different school.
Obviously, that would have been a very inefficient way of attacking the
question. It is the fact that contingency table analyses require independent
TABELE 3
ADULT #1 0 6
ADULT #2 3211 7
ADULT #3 6 5
JUVENILE #1 8 2906
JUVENILE #2 8 1
JUVENILE #3 7 29
35
Figure 22
12
A AMAZONIA CENTRAL BRAZIL B
8
SPECIES NUMBER
0
SPECIES
36
She studied many individuals, but when she mapped the savannas of interest
(Fig. 22A is a conceptual representation of that map), she realized that her
sampling universe was not the same as her universe of interest and she changed
the question. Note that this simple conceptual map, with ovals representing
areas of savannas, a straight line representing the boundary between biomes,
and three black dots representing sampling areas, is quite adequate to show
that the universe of sampling does not correspond to the universe of interest in
her original question.
Figure 22A shows the universe of interest of a student studying
migratory birds. He wanted to make inferences about long distance migration in
wading birds. Even though he had a large number of birds, he had sampled
only one of the species that showed that form of migration (the black dot). The
simple diagram, with species of interest represented by a line of circles, shows
how restricted his universe of sampling was in relation to the universe of interest.
He therefore realized that before making any strong inferences he had to restrict
the question to one species or increase the number of species sampled.
Conceptual maps will generally always give you a good idea of whether you
should treating the data as a hint (no real replication), as a strong inference
(lots of independent evidence), or something in between. What appears to be
mapping the sampling units is really the process of mapping out the question.
In most of the following chapters we will discuss how most of the
common statistical analyses can be regarded as methods of reducing
complex problems to two dimensions so that the results can be presented
in as simple dispersion graphs. If team members cannot represent the
expected results of their analyses in simple graphical form, they do not
understand the analyses and should not be using them. However, when
confronted with their inability to produce two-dimensional graphical
representations of their results, team members will claim that the role of
scientists is not to produce simple graphs that can be understood by
everyone. They will say that the objective of a scientific study is to produce
probability statements. Therefore, in the next chapter we will consider
the unusual definition of probability that is used by most scientists.