
SPATIAL PATTERNS 157

number of points per cell will be roughly equal to the variance of the number of
points per cell.
If there is a large amount of variability in the number of points from cell to
cell (some cells have many points, some have none, etc.), this implies a
tendency toward clustering. If there is very little variability in the number of
points from cell to cell, this implies a tendency toward a systematic pattern
(where the number of points per cell would be the same). The statistical test
makes use of a chi-square statistic involving the variance-mean ratio:

$$\chi^2 = \frac{(m-1)\,s^2}{\bar{x}} \qquad (8.1)$$

where m is the number of quadrats, and $\bar{x}$ and $s^2$ are the mean and variance of
the number of points per quadrat, respectively. This value is then compared
with a critical value from a chi-square table with $m - 1$ degrees of freedom.
Quadrat analysis is easy to employ, and it has been a mainstay in the spatial
analyst's toolkit of pattern detectors over several decades. One important issue
is the size of the quadrat; if the cell size is too small, there will be many empty
cells, and if clustering exists on all but the smallest spatial scales it will be
missed. If the cell size is too large, one may miss patterns that occur within
cells. One may find patterns on some spatial scales and not at others, and thus
the choice of quadrat size can seriously influence the results. Curtiss and
McIntosh (1950) suggest an "optimal" quadrat size of two points per quadrat.
Bailey and Gatrell (1995) suggest that the mean number of points per quadrat
should be about 1.6.

Summary of the quadrat method

(1) Divide a study region into m cells of equal size.


(2) Find the mean number of points per cell ($\bar{x}$). This is equal to the total
number of points divided by the number of cells (m).
(3) Find the variance of the number of points per cell, $s^2$, as follows:

$$s^2 = \frac{\sum_{i=1}^{m} (x_i - \bar{x})^2}{m-1} \qquad (8.2)$$

where $x_i$ is the number of points in cell i.
(4) Calculate the variance-mean ratio (VMR):

$$\mathrm{VMR} = \frac{s^2}{\bar{x}} \qquad (8.3)$$

(5) Interpret the results as follows.

If $s^2/\bar{x} < 1$, the variance of the number of points is less than the mean. In
the extreme case where the ratio approaches zero, there is very little variation in
158 STATISTICAL METHODS FOR GEOGRAPHY

the number of points from cell to cell. This characterizes situations where the
distribution of points is spread out, or uniform, across the study area.
If $s^2/\bar{x} > 1$, there is a good deal of variation in the number of points per
cell: some cells have substantially more points than expected (i.e., $x_i > \bar{x}$ for
some cells i), and some cells have substantially fewer than expected (i.e.,
$x_i < \bar{x}$). This characterizes situations where the point pattern is more clustered
than random. A value of $s^2/\bar{x}$ near one indicates that the points are close to
being randomly distributed across the study area.
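
Steps (1) through (4) can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original method description: the function names `quadrat_counts` and `variance_mean_ratio`, the rectangular study region with its origin at (0, 0), and the six sample points are all assumptions made for the example.

```python
from statistics import mean, variance


def quadrat_counts(points, width, height, nx, ny):
    """Step 1: count the points falling in each cell of an nx-by-ny grid.

    Assumes (hypothetically) a rectangular study region [0, width] x [0, height].
    """
    counts = [0] * (nx * ny)
    for x, y in points:
        # Clamp so that points lying exactly on the upper edge land in the last cell.
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts


def variance_mean_ratio(counts):
    """Steps 2-4: mean, sample variance (divisor m - 1), and their ratio."""
    x_bar = mean(counts)
    s2 = variance(counts)
    return s2 / x_bar


# Made-up points in a 2 x 2 region, split into 2 x 2 = 4 quadrats.
points = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.2), (0.4, 0.4), (1.5, 0.5), (0.5, 1.5)]
counts = quadrat_counts(points, 2.0, 2.0, 2, 2)
vmr = variance_mean_ratio(counts)
print(counts, vmr)
```

For these six made-up points the cell counts are 4, 1, 1 and 0, giving a mean of 1.5, a variance of 3, and a VMR of 2.0, which by the interpretation in step (5) points toward clustering.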

Hypothesis Testing. How can we be more precise in testing the null hypothesis
that there is no spatial pattern? Suppose we were to simulate the null
hypothesis by placing points at random in a study area, and that we then
carried out the procedure described above for finding the variance-mean
ratio. Furthermore, suppose we were to repeat this many times (say 1000),
and then draw a histogram of the results. We would find that the mean of
our 1000 VMR values would be near one, and that the histogram would be
asymmetric, displaying a positive skew (see Figure 8.3). Values of VMR in the
tails of the histogram (also known as the sampling distribution of VMR)
are relatively rare when the underlying null hypothesis of
no pattern is true.
For an actual set of observed data, we decide to accept the null hypothesis
that the points are randomly distributed in space if the VMR for the observed
data does not differ too much from one; otherwise, we reject the null
hypothesis. More specifically, if the VMR for an observed pattern is greater than
$\mathrm{VMR}_H$ (shown in Figure 8.3), the null hypothesis is rejected, and the pattern
is taken to be more clustered than random. Similarly, if the observed VMR is
less than $\mathrm{VMR}_L$, the null hypothesis is rejected, and the pattern is taken to be
more uniform than random.

Figure 8.3 Sampling distribution of VMR when $H_0$ is true



If we were to actually observe an extreme value of VMR in our data (either
greater than $\mathrm{VMR}_H$ or less than $\mathrm{VMR}_L$), we would reject the null hypothesis that the
pattern is random. In this case, either (a) the null hypothesis is actually true (in
which case we have incorrectly rejected it, and committed a Type I error), or
(b) the null hypothesis is not true, and we have made a correct decision. To
establish the critical cutoff values, $\mathrm{VMR}_L$ and $\mathrm{VMR}_H$, we first have to decide
how great a likelihood of a Type I error we are willing to tolerate.
If we use $\alpha = 0.05$, then the 50 most extreme values out of the total of 1000 in
our experiment are used to obtain the critical values (since 50/1000 = 0.05). If
we rank the 1000 VMR values from lowest to highest, the 25th VMR on our
list would be chosen as $\mathrm{VMR}_L$; 25 out of 1000 times we will observe a lower
VMR than this when $H_0$ is true. Similarly, the 975th VMR on our ordered list
would be chosen as $\mathrm{VMR}_H$; 25 out of 1000 times we can expect to observe a
VMR higher than this when $H_0$ is true. Thus 50 out of 1000, or 5% of the time,
we will incorrectly reject a true hypothesis when we use these critical values. In
those 50 instances, we would make a Type I error, since we would reject $H_0$
when in fact it was true, and we had simply observed an unusual value of VMR
in the tail of the sampling distribution.

Example. We wish to know whether the pattern observed in Figure 8.4 is
consistent with the null hypothesis that the points were placed at random. We
first calculate the VMR. There are 100 points on the 10 × 10 grid, implying a
mean of one point per cell. There are 6 cells with 3 points, 20 cells with 2 points,
42 cells with one point, and 32 cells with no points. The variance is

$$\frac{6(3-1)^2 + 20(2-1)^2 + 42(1-1)^2 + 32(0-1)^2}{99} = \frac{76}{99} = 0.77 \qquad (8.4)$$

Figure 8.4 A spatial point pattern



and, since the mean is equal to one, this is also our observed VMR. Since
VMR < 1, there is a tendency toward a uniform pattern. How unlikely is a
value of 0.77 if the null hypothesis is true? Is it unlikely enough that we should
reject the null hypothesis?
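
The arithmetic behind the observed VMR can be checked directly from the cell frequencies quoted in the example. The snippet below is only a sketch; the dictionary `freq` and the other variable names are chosen here for illustration and do not come from the text.

```python
# Frequency table from the example: points per cell -> number of such cells.
freq = {3: 6, 2: 20, 1: 42, 0: 32}

m = sum(freq.values())                                    # number of cells
n_points = sum(k * c for k, c in freq.items())            # total number of points
x_bar = n_points / m                                      # mean points per cell
ss = sum(c * (k - x_bar) ** 2 for k, c in freq.items())   # sum of squared deviations
s2 = ss / (m - 1)                                         # sample variance, eq. (8.2)
vmr = s2 / x_bar                                          # variance-mean ratio, eq. (8.3)

print(m, n_points, ss, round(vmr, 2))
```

The frequencies account for all 100 cells and all 100 points, the sum of squared deviations is 76, and the VMR rounds to 0.77, in agreement with equation (8.4).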
To assess this, we need to find the sampling distribution of VMR when $H_0$ is
true. One hundred points were assigned to cells at random in a 10 × 10 grid.
The VMR was calculated using $\bar{x} = 1$, since there is an average of one point per
cell. This was repeated 1000 times to establish the form of the sampling
distribution when the null hypothesis of a random point pattern is true. The
resulting 1000 VMRs were ranked from lowest to highest. The 25th lowest
value was $\mathrm{VMR}_L = 0.747$ and the 975th value on the list was $\mathrm{VMR}_H = 1.313$.
These critical values can then be used to decide whether the actual pattern of
interest exhibits significant deviations from randomness. Since our observed
value of VMR = 0.77 is not less than the lower critical value of 0.747 (nor
greater than the upper critical value of 1.313), we
accept the null hypothesis: a VMR of 0.77 is not particularly unusual when $H_0$
is true.
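
The simulation just described can be sketched as follows. This is an illustrative reconstruction using only the Python standard library, not the code used to produce the values quoted above; the function names, the fixed seed, and the choice of `random.Random` are assumptions. Because the draws are random, the critical values obtained will generally differ somewhat from 0.747 and 1.313.

```python
import random
from statistics import mean, variance


def simulate_vmr(n_points, n_cells, rng):
    """Drop n_points into n_cells uniformly at random and return the VMR."""
    counts = [0] * n_cells
    for _ in range(n_points):
        counts[rng.randrange(n_cells)] += 1
    return variance(counts) / mean(counts)


def monte_carlo_critical_values(n_points=100, n_cells=100, reps=1000,
                                alpha=0.05, seed=0):
    """Rank simulated VMRs and read off the lower and upper critical values."""
    rng = random.Random(seed)  # seed fixed only so the sketch is repeatable
    vmrs = sorted(simulate_vmr(n_points, n_cells, rng) for _ in range(reps))
    k = int(reps * alpha / 2)  # 25 when reps = 1000 and alpha = 0.05
    # 25th value from the bottom and 975th value on the ranked list (1-based).
    return vmrs[k - 1], vmrs[reps - k - 1]


vmr_low, vmr_high = monte_carlo_critical_values()
observed = 76 / 99
reject = observed < vmr_low or observed > vmr_high
print(vmr_low, vmr_high, reject)
```

Changing `seed` or `reps` shows how much the critical values move from one run of the simulation to the next.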
The process of deriving a sampling distribution via simulation of the null
hypothesis in the manner we have just described is known as the Monte Carlo
method. It has the merit of making the underlying ideas associated with
hypothesis testing easy to convey and, hopefully, easy to understand. But if
we had ten people each using the Monte Carlo method to find critical values of
VMR in this example, we would get ten different sets of critical values. The
larger the number of repetitions, the closer together the sets of critical values
would be. Another feature of the Monte Carlo method in this example is that
the critical values we found are only appropriate when there are 100 points in
100 cells. If we were to look at another example with a different number of
points or cells, we
would need to repeat a large number of simulations under the null hypothesis
to establish critical values.
There is an easy way to avoid having to actually simulate the sampling
distribution under the assumption that $H_0$ is true. The critical values can be
determined by using the fact that the quantity $\chi^2 = (m-1)\,\mathrm{VMR}$ has,
approximately, a chi-square distribution with $m - 1$ degrees of freedom when $H_0$ is true. This fact
allows us to obtain critical values, $\chi^2_L$ and $\chi^2_H$, from a chi-square table. In
particular, we will reject $H_0$ if either $\chi^2 < \chi^2_L$ or $\chi^2 > \chi^2_H$.
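
As a sketch of this table-based test applied to the example above, the critical values can be looked up numerically instead of from a printed table. The use of SciPy's `scipy.stats.chi2.ppf` is an assumption about tooling (the text itself relies on a chi-square table), and the variable names are illustrative.

```python
from scipy.stats import chi2

m = 100          # number of quadrats in the example
vmr = 76 / 99    # observed variance-mean ratio from equation (8.4)
alpha = 0.05

df = m - 1
chi_sq = df * vmr                       # test statistic (m - 1) * VMR
chi_lo = chi2.ppf(alpha / 2, df)        # lower critical value
chi_hi = chi2.ppf(1 - alpha / 2, df)    # upper critical value

reject = chi_sq < chi_lo or chi_sq > chi_hi
print(round(chi_sq, 2), round(chi_lo, 2), round(chi_hi, 2), reject)
```

Here the statistic equals 76, which lies between the two critical values for 99 degrees of freedom, so the null hypothesis is not rejected, matching the conclusion reached by simulation.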
When the number of degrees of freedom (df) is large, the sampling distribution
of $\chi^2 = (m-1)\,\mathrm{VMR}$ begins to approach the shape of a normal distribution.
In particular, when df > 30 or so, $(m-1)\,\mathrm{VMR}$ will, when $H_0$ is true,
have an approximately normal distribution with mean $m - 1$ and variance equal to $2(m-1)$.
This means that we can treat the quantity

$$z = \frac{(m-1)\,\mathrm{VMR} - (m-1)}{\sqrt{2(m-1)}} = \sqrt{(m-1)/2}\;(\mathrm{VMR} - 1) \qquad (8.5)$$

as a normal random variable with mean 0 and variance 1. With $\alpha = 0.05$, the
critical values are $z_L = -1.96$ and $z_H = +1.96$. The null hypothesis of no
