number of points per cell will be roughly equal to the variance of the number of
points per cell.
If there is a large amount of variability in the number of points from cell to
cell (some cells have many points, some have none, etc.), this implies a
tendency toward clustering. If there is very little variability in the number of
points from cell to cell, this implies a tendency toward a systematic pattern
(where the number of points per cell would be the same). The statistical test
makes use of a chi-square statistic involving the variance-mean ratio:
$$\chi^2 = \frac{(m-1)\,s^2}{\bar{x}} \qquad (8.1)$$
where $m$ is the number of quadrats, and $\bar{x}$ and $s^2$ are the mean and variance of
the number of points per quadrat, respectively. This value is then compared
with a critical value from a chi-square table with $m - 1$ degrees of freedom.
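As a minimal sketch of how the statistic in Equation 8.1 might be computed, the following Python function takes a list of quadrat counts and returns the mean, the variance, the chi-square statistic, and its degrees of freedom. The function name `quadrat_chi_square` and the counts are made up for illustration, and the sketch assumes that $s^2$ is the usual sample variance with divisor $m - 1$.

```python
from statistics import mean, variance


def quadrat_chi_square(counts):
    """Return (mean, variance, chi-square statistic, degrees of freedom)."""
    m = len(counts)                 # number of quadrats
    x_bar = mean(counts)            # mean number of points per quadrat
    s2 = variance(counts)           # sample variance (divisor m - 1), an assumption
    chi2 = (m - 1) * s2 / x_bar     # Equation 8.1
    return x_bar, s2, chi2, m - 1


# Hypothetical counts for a 3 x 3 grid of quadrats (made-up data).
counts = [0, 2, 1, 3, 0, 1, 0, 1, 1]
x_bar, s2, chi2, df = quadrat_chi_square(counts)
print(f"mean={x_bar:.3f} variance={s2:.3f} chi2={chi2:.3f} df={df}")
```

The resulting statistic would then be compared with the tabled chi-square value for the reported degrees of freedom.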
Quadrat analysis is easy to employ, and it has been a mainstay in the spatial
analyst's toolkit of pattern detectors over several decades. One important issue
is the size of the quadrat; if the cell size is too small, there will be many empty
cells, and if clustering exists on all but the smallest spatial scales it will be
missed. If the cell size is too large, one may miss patterns that occur within
cells. One may find patterns on some spatial scales and not at others, and thus
the choice of quadrat size can seriously influence the results. Curtiss and
McIntosh (1950) suggest an "optimal" quadrat size of two points per quadrat.
Bailey and Gatrell (1995) suggest that the mean number of points per quadrat
should be about 1.6.
$$\mathrm{VMR} = \frac{s^2}{\bar{x}} \qquad (8.3)$$
(5) Interpret the results as follows.
If $s^2/\bar{x} < 1$, the variance of the number of points is less than the mean. In
the extreme case where the ratio approaches zero, there is very little variation in
158 STATISTICAL METHODS FOR GEOGRAPHY
the number of points from cell to cell. This characterizes situations where the
distribution of points is spread out, or uniform, across the study area.
If $s^2/\bar{x} > 1$, there is a good deal of variation in the number of points per
cell: some cells have substantially more points than expected (i.e., $x_i > \bar{x}$ for
some cells $i$), and some cells have substantially fewer than expected (i.e.,
$x_i < \bar{x}$). This characterizes situations where the point pattern is more clustered
than random. A value of $s^2/\bar{x}$ near one indicates that the points are close to
being randomly distributed across the study area.
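To make these three cases concrete, the short sketch below computes the VMR for three made-up sets of quadrat counts, each with nine points in nine cells. The helper name `vmr` and the data are illustrative assumptions only; the variance is again taken to be the sample variance with divisor $m - 1$.

```python
from statistics import mean, variance


def vmr(counts):
    """Variance-mean ratio of the quadrat counts (Equation 8.3)."""
    return variance(counts) / mean(counts)


# Three hypothetical patterns, each with 9 points in 9 quadrats (made-up data).
patterns = {
    "uniform":   [1, 1, 1, 1, 1, 1, 1, 1, 1],   # one point in every cell
    "mixed":     [0, 2, 1, 3, 0, 1, 0, 1, 1],   # moderate cell-to-cell variation
    "clustered": [9, 0, 0, 0, 0, 0, 0, 0, 0],   # all points in a single cell
}

for name, counts in patterns.items():
    print(f"{name:10s} VMR = {vmr(counts):.2f}")
```

The perfectly even pattern gives a ratio of zero, and the pattern with all points in one cell gives a ratio far above one, which matches the interpretation given above.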
Hypothesis Testing. How can we be more precise in testing the null hypothesis
that there is no spatial pattern? Suppose we were to simulate the null
hypothesis by placing points at random in a study area, and that we then
carried out the procedure described above for finding the variance-mean
ratio. Furthermore, suppose we were to repeat this many times (say 1000),
and then draw a histogram of the results. We would find that the mean of
our 1000 VMR values would be near one, and that the histogram would be
asymmetric, displaying a positive skew (see Figure 8.3). Values of VMR in the
tails of the histogram (also known as the sampling distribution of VMR)
are values that are relatively rare when the underlying null hypothesis of
no pattern is true.
For an actual set of observed data, we decide to accept the null hypothesis
that the points are randomly distributed in space if the VMR for the observed
data does not differ too much from one; otherwise, we reject the null
hypothesis. More specifically, if the VMR for an observed pattern is greater than
$\mathrm{VMR}_H$ (shown in Figure 8.3), the null hypothesis is rejected, and the pattern
is taken to be more clustered than random. Similarly, if the observed VMR is
less than $\mathrm{VMR}_L$, the null hypothesis is rejected, and the pattern is taken to be
more uniform than random.
and, since the mean is equal to one, this is also our observed VMR. Since
VMR < 1, there is a tendency toward a uniform pattern. How unlikely is a
value of 0.77 if the null hypothesis is true? Is it unlikely enough that we should
reject the null hypothesis?
To assess this, we need to find the sampling distribution of VMR when $H_0$ is
true. One hundred points were assigned to cells at random in a 10 × 10 grid.
The VMR was calculated using $\bar{x} = 1$, since there is an average of one point per
cell. This was repeated 1000 times to establish the form of the sampling
distribution when the null hypothesis of a random point pattern is true. The
resulting 1000 VMRs were ranked from lowest to highest. The 25th lowest
value was $\mathrm{VMR}_L = 0.747$ and the 975th value on the list was $\mathrm{VMR}_H = 1.313$.
These critical values can then be used to decide whether the actual pattern of
interest exhibits significant deviations from randomness. Since our observed
value of VMR = 0.77 is not less than the lower critical value of 0.7475, we
accept the null hypothesis: a VMR of 0.77 is not particularly unusual when $H_0$
is true.
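The simulation just described can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used to produce the values above: the function names, the `seed` argument, and the use of Python's built-in random number generator are all assumptions, and the critical values obtained will vary somewhat from run to run and from seed to seed.

```python
import random
from statistics import mean, variance


def simulate_vmr(rng, n_points=100, n_cells=100):
    """Drop n_points into n_cells uniformly at random and return the VMR."""
    counts = [0] * n_cells
    for _ in range(n_points):
        counts[rng.randrange(n_cells)] += 1   # each cell is equally likely
    return variance(counts) / mean(counts)


def monte_carlo_critical_values(reps=1000, alpha=0.05, seed=None):
    """Return (VMR_L, VMR_H) from the ranked list of simulated VMRs."""
    rng = random.Random(seed)
    vmrs = sorted(simulate_vmr(rng) for _ in range(reps))
    k = round(reps * alpha / 2)               # 25 when reps=1000 and alpha=0.05
    # k-th lowest value and (reps - k)-th value, using 1-based ranks
    return vmrs[k - 1], vmrs[reps - k - 1]


vmr_l, vmr_h = monte_carlo_critical_values(seed=1)
observed = 0.77
print(f"VMR_L={vmr_l:.3f} VMR_H={vmr_h:.3f}")
print("reject H0" if observed < vmr_l or observed > vmr_h else "do not reject H0")
```

With 1000 repetitions, the 25th and 975th ranked values cut off 2.5 percent of the simulated VMRs in each tail, which corresponds to a two-sided test at the 0.05 level.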
The process of deriving a sampling distribution via simulation of the null
hypothesis in the manner we have just described is known as the Monte Carlo
method. It has the merit of making the underlying ideas associated with
hypothesis testing easy to convey and, hopefully, easy to understand. But if
we had ten people each using the Monte Carlo method to find critical values of
VMR in this example, we would get ten different sets of critical values. The
larger the number of repetitions, the closer together the sets of critical values
would be. Another feature of the Monte Carlo method in this example is that
the critical values we found are only appropriate when there are 100 points in
100 cells. If we were to look at another example with other than 100 cells, we
would need to repeat a large number of simulations under the null hypothesis
to establish critical values.
There is an easy way to avoid having to actually simulate the sampling
distribution under the assumption that $H_0$ is true. The critical values can be
determined by using the fact that the quantity $\chi^2 = (m-1)\,\mathrm{VMR}$ has a
chi-square distribution, with $m - 1$ degrees of freedom, when $H_0$ is true. This fact
allows us to obtain critical values, $\chi^2_L$ and $\chi^2_H$, from a chi-square table. In
particular, we will reject $H_0$ if either $\chi^2 < \chi^2_L$ or $\chi^2 > \chi^2_H$.
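The chi-square version of the test can also be sketched in code. Because the Python standard library has no chi-square table, the sketch below substitutes the Wilson-Hilferty approximation for the table lookup; this is a swapped-in technique, not the method described in the text, and its results will differ slightly from tabled values. The function names are hypothetical. If SciPy is available, `scipy.stats.chi2.ppf` could be used instead to obtain the quantiles.

```python
from statistics import NormalDist


def chi2_quantile_wh(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty), standing in for a table."""
    z = NormalDist().inv_cdf(p)          # standard normal quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def vmr_chi_square_test(vmr, m, alpha=0.05):
    """Two-sided test of H0 (random pattern) based on chi2 = (m - 1) * VMR."""
    df = m - 1
    chi2 = df * vmr
    chi2_l = chi2_quantile_wh(alpha / 2, df)        # lower critical value
    chi2_h = chi2_quantile_wh(1 - alpha / 2, df)    # upper critical value
    reject = chi2 < chi2_l or chi2 > chi2_h
    return chi2, chi2_l, chi2_h, reject


# The example from the text: observed VMR of 0.77 with m = 100 cells.
chi2, chi2_l, chi2_h, reject = vmr_chi_square_test(0.77, 100)
print(f"chi2={chi2:.2f} lower={chi2_l:.2f} upper={chi2_h:.2f} reject={reject}")
```

Dividing the two critical values by $m - 1$ expresses them on the VMR scale, which allows a rough comparison with critical values found by simulation.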
When the number of degrees of freedom (df) is large, the sampling
distribution of $\chi^2 = (m-1)\,\mathrm{VMR}$ begins to approach the shape of a normal
distribution. In particular, when df > 30 or so, $(m-1)\,\mathrm{VMR}$ will, when $H_0$ is true,
have an approximately normal distribution with mean $m - 1$ and variance equal to $2(m-1)$.
This means that we can treat the quantity
$$z = \frac{(m-1)\,\mathrm{VMR} - (m-1)}{\sqrt{2(m-1)}} = \sqrt{\frac{m-1}{2}}\,(\mathrm{VMR} - 1) \qquad (8.5)$$
as a normal random variable with mean 0 and variance 1. With $\alpha = 0.05$, the
critical values are $z_L = -1.96$ and $z_H = +1.96$. The null hypothesis of no