
3.2.3 Geary's c
Geary's c is based on paired comparisons of the values of a geo-referenced variable X in neighbouring regions to measure spatial autocorrelation.

Geary's c with unstandardized weights w*ij:

$$
C = \frac{(n-1)\sum_{i=1}^{n}\sum_{j=1}^{n} w^{*}_{ij}\,(x_i - x_j)^2}{2\,S_0\sum_{i=1}^{n}(x_i-\bar{x})^2}
\quad\text{with}\quad S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w^{*}_{ij}
\tag{3.12}
$$

Geary's c with standardized weights wij:

$$
C = \frac{(n-1)\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - x_j)^2}{2\,n\sum_{i=1}^{n}(x_i-\bar{x})^2}
\tag{3.13}
$$
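Formula (3.13) can be read as a special case of (3.12). The short derivation below assumes that "standardized" means row-standardized, i.e. every row of the weight matrix sums to one (as in the example matrix used in this section), and that every region has at least one neighbour:

$$
w_{ij} = \frac{w^{*}_{ij}}{\sum_{k=1}^{n} w^{*}_{ik}}
\;\Rightarrow\;
\sum_{j=1}^{n} w_{ij} = 1 \ \text{for all } i
\;\Rightarrow\;
S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij} = n
$$

Substituting S0 = n into the denominator of (3.12) gives the factor 2·n that appears in (3.13).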

Range of Geary's c: [0; 2]. Expectation of C under independence (spatial randomness): E(C) = 1

Positive spatial autocorrelation: 0 ≤ C < 1
Negative spatial autocorrelation: 1 < C ≤ 2
Example:

We show the calculation of Geary's c in the five-region example with the standardized weight matrix.

Standardized weight matrix and observation vector:

$$
W = \begin{pmatrix}
0 & 1/2 & 1/2 & 0 & 0 \\
1/3 & 0 & 1/3 & 1/3 & 0 \\
1/3 & 1/3 & 0 & 1/3 & 0 \\
0 & 1/3 & 1/3 & 0 & 1/3 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix},
\qquad
x = \begin{pmatrix} 8 \\ 6 \\ 6 \\ 3 \\ 2 \end{pmatrix}
$$

Table 1: Weighted squared differences wij(xi-xj)^2

Region  1                  2                  3                  4                  5
1       0                  (1/2)(8-6)^2=2     (1/2)(8-6)^2=2     0                  0
2       (1/3)(6-8)^2=4/3   0                  (1/3)(6-6)^2=0     (1/3)(6-3)^2=3     0
3       (1/3)(6-8)^2=4/3   (1/3)(6-6)^2=0     0                  (1/3)(6-3)^2=3     0
4       0                  (1/3)(3-6)^2=3     (1/3)(3-6)^2=3     0                  (1/3)(3-2)^2=1/3
5       0                  0                  0                  1(2-3)^2=1         0

Sum of weighted squared differences (over all i and j): 20
The sum of squared deviations from the mean has already been calculated with Moran's I (section 3.2.1):

$$
\sum_{i=1}^{5} (x_i - \bar{x})^2 = 24
$$

Geary's c with standardized weights wij (n=5):

$$
C = \frac{(5-1)\cdot 20}{2\cdot 5\cdot 24} = \frac{80}{240} = 0.3333
$$

0 ≤ C = 0.3333 < 1: positive spatial autocorrelation
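The calculation can be retraced with a few lines of Python. This is a minimal sketch using only the standard library; the function name geary_c is chosen for illustration and is not taken from any package. Exact fractions are used so that the result can be compared with 80/240 directly.

```python
from fractions import Fraction as F

# Row-standardized weight matrix and observation vector of the five-region example
W = [
    [0, F(1, 2), F(1, 2), 0, 0],
    [F(1, 3), 0, F(1, 3), F(1, 3), 0],
    [F(1, 3), F(1, 3), 0, F(1, 3), 0],
    [0, F(1, 3), F(1, 3), 0, F(1, 3)],
    [0, 0, 0, 1, 0],
]
x = [8, 6, 6, 3, 2]


def geary_c(weights, values):
    """Geary's c in the general form (3.12); S0 equals n for row-standardized weights."""
    n = len(values)
    mean = F(sum(values), n)
    s0 = sum(sum(row) for row in weights)
    # Numerator: weighted squared differences over all pairs (i, j)
    num = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    # Sum of squared deviations from the mean
    ss = sum((v - mean) ** 2 for v in values)
    return (n - 1) * num / (2 * s0 * ss)


c = geary_c(W, x)
print(c, round(float(c), 4))
```

Because the function implements (3.12), it can also be applied to an unstandardized weight matrix without modification.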

Comparison between Moran's I and Geary's c:

Evaluating spatial autocorrelation with Moran's I and Geary's c leads to similar but not identical results.

Griffith (1987) notes that simulation experiments suggest that the inverse relationship between Moran's I and Geary's c is basically linear in nature. Departures from linearity are ascribed to differences in what each of the two indices measures: Geary's c deals with paired comparisons and Moran's I with covariations.

The relation between Moran's I and Geary's c can be explored by randomization experiments.

Figure: Relation between Moran's I and Geary's C for 20000 statistics
generated using rook contiguity
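A randomization experiment of this kind can be sketched as follows. The setup is an assumption made for illustration only: a hypothetical 6 x 6 lattice with binary rook weights, the arbitrary values 1 to 36, and 2000 random reallocations (not the 20000 statistics of the figure). Moran's I is computed in its usual form, n·ΣΣ wij·zi·zj / (S0·Σ zi²) with zi = xi − x̄, and Geary's c as in (3.12).

```python
import random


def rook_neighbours(rows, cols):
    """Map each cell index to the indices of its rook (edge-sharing) neighbours."""
    nb = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nb[r * cols + c] = [
                rr * cols + cc
                for rr, cc in candidates
                if 0 <= rr < rows and 0 <= cc < cols
            ]
    return nb


def moran_and_geary(values, nb):
    """Moran's I and Geary's c with binary weights (w_ij = 1 for rook neighbours)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    ss = sum(d * d for d in z)
    s0 = sum(len(js) for js in nb.values())
    cross = sum(z[i] * z[j] for i, js in nb.items() for j in js)
    sqdiff = sum((values[i] - values[j]) ** 2 for i, js in nb.items() for j in js)
    return n * cross / (s0 * ss), (n - 1) * sqdiff / (2 * s0 * ss)


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5


rng = random.Random(42)      # fixed seed so that the run is repeatable
nb = rook_neighbours(6, 6)   # hypothetical 6 x 6 lattice (assumption)
values = list(range(1, 37))  # arbitrary attribute values (assumption)
pairs = []
for _ in range(2000):
    rng.shuffle(values)      # random reallocation of the values to the cells
    pairs.append(moran_and_geary(values, nb))

morans, gearys = zip(*pairs)
r = pearson(morans, gearys)
mean_c = sum(gearys) / len(gearys)
print(round(r, 3), round(mean_c, 3))
```

The correlation r should come out clearly negative, reflecting the inverse relationship, and the mean of the simulated c values should lie close to the expectation E(C) = 1.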

3.2.4 Getis-Ord G statistic
Getis and Ord (1992) have suggested a somewhat different approach to measuring spatial association, using a distance-based contiguity matrix. Neighbourhoods are defined by a critical distance d: all regions within the critical distance d of a spatial unit i are neighbours of that unit.

The Getis-Ord G statistic is designed to assess overall spatial concentration. Its application is restricted to geo-referenced variables with positive values and a natural origin.

G statistic:

$$
G(d) = \frac{\sum_{i=1}^{n}\sum_{j\neq i} w_{ij}(d)\,x_i\,x_j}{\sum_{i=1}^{n}\sum_{j\neq i} x_i\,x_j}
\tag{3.14}
$$

Range: 0 ≤ G ≤ 1
The G statistic measures the sum of the products of each xi with the xj values within a distance d of i as a proportion of the total sum of all products xi·xj, j≠i. G(d) provides evidence of global spatial clustering of high values ("hot spots"). A low value of G(d) occurs when low values cluster, but it may also emerge in the case of negative spatial autocorrelation.

high G(d): overall concentration of high attribute values
low G(d): lack of overall concentration of high attribute values
Weights of the binary matrix W(d):

$$
w_{ij}(d) = \begin{cases} 1, & \text{if } d_{ij} \le d \\ 0, & \text{otherwise} \end{cases}
$$

For a uniform usage of the global and local Getis-Ord statistics (section 3.3.2), we set the elements of the main diagonal wii equal to 1. Note that the G statistic is not affected by this definition, because its sums run over j≠i only.

Example:

Distances between regions are measured by the distances between their centres. In our five-region example,

[Figure: map of the five regions]

we impute the following distances between centres (in km):

Region 1 2 3 4 5
1 0 6 5 11 14
2 6 0 4 5 8
3 5 4 0 7 10
4 11 5 7 0 3
5 14 8 10 3 0

The above table contains the entries of the distance matrix D:

$$
D = \begin{pmatrix}
0 & 6 & 5 & 11 & 14 \\
6 & 0 & 4 & 5 & 8 \\
5 & 4 & 0 & 7 & 10 \\
11 & 5 & 7 & 0 & 3 \\
14 & 8 & 10 & 3 & 0
\end{pmatrix}
$$

We set the critical distance d equal to 7.5 kilometres. The spatial weight matrix W(d) corresponding to d=7.5 reads:

$$
W(d=7.5) = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
$$

Because of the particular choice of the critical distance, W(d=7.5) is, apart from the main diagonal elements, identical to the "ordinary" first-order contiguity matrix W*.
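Deriving W(d) from the distance matrix is a simple thresholding step. The sketch below assumes that a region counts as a neighbour when its distance is less than or equal to d; since no distance in the example equals 7.5 km exactly, the choice between "<" and "≤" does not matter here. The function name distance_band_weights is hypothetical.

```python
# Distances between region centres in km, as in the distance matrix D
D = [
    [0, 6, 5, 11, 14],
    [6, 0, 4, 5, 8],
    [5, 4, 0, 7, 10],
    [11, 5, 7, 0, 3],
    [14, 8, 10, 3, 0],
]


def distance_band_weights(dist, d):
    """Binary weights: 1 if d_ij <= d, else 0 (the diagonal becomes 1 because d_ii = 0)."""
    return [[1 if dij <= d else 0 for dij in row] for row in dist]


W_d = distance_band_weights(D, 7.5)
for row in W_d:
    print(row)
```

With a zero distance of each region to itself, the main diagonal is set to 1 automatically, which matches the convention adopted for the Getis-Ord statistics.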
Observation vector x: x = (8 6 6 3 2)'
Calculation of the denominator of (3.14):

Region  1        2        3        4        5
1       -        8·6=48   8·6=48   8·3=24   8·2=16
2       6·8=48   -        6·6=36   6·3=18   6·2=12
3       6·8=48   6·6=36   -        6·3=18   6·2=12
4       3·8=24   3·6=18   3·6=18   -        3·2=6
5       2·8=16   2·6=12   2·6=12   2·3=6    -

Sum of products xi·xj, j≠i: 476

Calculation of the numerator of (3.14):

Region  1        2        3        4        5
1       -        8·6=48   8·6=48   0        0
2       6·8=48   -        6·6=36   6·3=18   0
3       6·8=48   6·6=36   -        6·3=18   0
4       0        3·6=18   3·6=18   -        3·2=6
5       0        0        0        2·3=6    -

Sum of weighted products wij(d)·xi·xj, j≠i: 348

G statistic:

$$
G = \frac{348}{476} = 0.7311
$$
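The two tables can be checked with a short Python sketch. The function name global_g is chosen for illustration; the weight matrix is the one for d = 7.5 km. The last lines also verify that setting the main diagonal to 0 instead of 1 leaves G unchanged, since all sums exclude j = i.

```python
from fractions import Fraction

# Binary weight matrix W(d = 7.5) with ones on the main diagonal
W_d = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
x = [8, 6, 6, 3, 2]


def global_g(weights, values):
    """Global G(d): weighted cross-products divided by all cross-products, j != i."""
    n = len(values)
    pairs = [(i, j) for i in range(n) for j in range(n) if j != i]
    num = sum(weights[i][j] * values[i] * values[j] for i, j in pairs)
    den = sum(values[i] * values[j] for i, j in pairs)
    return Fraction(num, den)


g = global_g(W_d, x)
print(g, round(float(g), 4))

# Same matrix with a zero diagonal: G must not change
W_zero_diag = [
    [0 if i == j else w for j, w in enumerate(row)] for i, row in enumerate(W_d)
]
print(global_g(W_zero_diag, x) == g)
```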
Test for global spatial clustering

Null hypothesis H0: Lack of spatial concentration of attribute values
Alternative hypothesis H1: Spatial concentration of high attribute values

Test statistic (asymptotically standard normal):

$$
Z(G) = \frac{G - E(G)}{\sqrt{\mathrm{Var}(G)}} \overset{a}{\sim} N(0,1)
\tag{3.15}
$$

Expected value of G(d):

$$
E[G(d)] = \frac{W}{n(n-1)}
\quad\text{with}\quad W = \sum_{i=1}^{n}\sum_{j\neq i} w_{ij}(d)
\tag{3.16}
$$

Variance of G(d):

$$
\mathrm{Var}(G) = E(G^2) - [E(G)]^2
$$

$$
E(G^2) = \frac{B_0 m_2^2 + B_1 m_4 + B_2 m_1^2 m_2 + B_3 m_1 m_3 + B_4 m_1^4}{(m_1^2 - m_2)^2 \cdot n(n-1)(n-2)(n-3)}
\tag{3.17}
$$

with

$$
m_r = \sum_{i=1}^{n} x_i^r, \quad r = 1,2,3,4
$$

(rth non-centred moment of X multiplied by n)

$$
\begin{aligned}
B_0 &= (n^2 - 3n + 3)S_1 - nS_2 + 3W^2 \\
B_1 &= -[(n^2 - n)S_1 - 2nS_2 + 3W^2] \\
B_2 &= -[2nS_1 - (n+3)S_2 + 6W^2] \\
B_3 &= 4(n-1)S_1 - 2(n+1)S_2 + 8W^2 \\
B_4 &= S_1 - S_2 + W^2
\end{aligned}
$$

$$
S_1 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j\neq i} [w_{ij}(d) + w_{ji}(d)]^2
$$

$$
S_2 = \sum_{i=1}^{n}\Big(\sum_{j\neq i} w_{ij}(d) + \sum_{j\neq i} w_{ji}(d)\Big)^2 = \sum_{i=1}^{n} (w_{i\cdot} + w_{\cdot i})^2
$$

$$
\text{with}\quad w_{i\cdot} = \sum_{j\neq i} w_{ij}(d) \quad\text{and}\quad w_{\cdot i} = \sum_{j\neq i} w_{ji}(d)
$$
Example:

In order to test for global spatial clustering on the basis of the G statistic, we have to compute its expected value and variance.

Expected value of G(d):

Calculation of W:

Region  1   2   3   4   5
1       -   1   1   0   0
2       1   -   1   1   0
3       1   1   -   1   0
4       0   1   1   -   1
5       0   0   0   1   -

Sum of wij, j≠i: 12

$$
E[G(d)] = \frac{W}{n(n-1)} = \frac{12}{5\cdot(5-1)} = \frac{12}{20} = 0.6
$$
$$
S_1 = \frac{1}{2}\sum_{i=1}^{5}\sum_{j\neq i} [w_{ij}(d) + w_{ji}(d)]^2 = \frac{1}{2}\cdot 12\cdot 2^2 = 24
$$

Row sums of W(d) (j≠i):

$$
w_{1\cdot} = 2, \quad w_{2\cdot} = 3, \quad w_{3\cdot} = 3, \quad w_{4\cdot} = 3, \quad w_{5\cdot} = 1
$$

Column sums of W(d) (j≠i):

$$
w_{\cdot 1} = 2, \quad w_{\cdot 2} = 3, \quad w_{\cdot 3} = 3, \quad w_{\cdot 4} = 3, \quad w_{\cdot 5} = 1
$$

$$
\begin{aligned}
S_2 = \sum_{i=1}^{5} (w_{i\cdot} + w_{\cdot i})^2 &= (2+2)^2 + (3+3)^2 + (3+3)^2 + (3+3)^2 + (1+1)^2 \\
&= 4^2 + 6^2 + 6^2 + 6^2 + 2^2 = 16 + 36 + 36 + 36 + 4 = 128
\end{aligned}
$$
Variance of G(d):

$$
\begin{aligned}
B_0 &= (n^2 - 3n + 3)S_1 - nS_2 + 3W^2 = (5^2 - 3\cdot 5 + 3)\cdot 24 - 5\cdot 128 + 3\cdot 12^2 \\
    &= 312 - 640 + 432 = 104 \\
B_1 &= -[(n^2 - n)S_1 - 2nS_2 + 3W^2] = -[(5^2 - 5)\cdot 24 - 2\cdot 5\cdot 128 + 3\cdot 12^2] \\
    &= -(480 - 1280 + 432) = 368 \\
B_2 &= -[2nS_1 - (n+3)S_2 + 6W^2] = -(2\cdot 5\cdot 24 - (5+3)\cdot 128 + 6\cdot 12^2) \\
    &= -(240 - 1024 + 864) = -80 \\
B_3 &= 4(n-1)S_1 - 2(n+1)S_2 + 8W^2 = 4\cdot(5-1)\cdot 24 - 2\cdot(5+1)\cdot 128 + 8\cdot 12^2 \\
    &= 384 - 1536 + 1152 = 0 \\
B_4 &= S_1 - S_2 + W^2 = 24 - 128 + 12^2 = 40
\end{aligned}
$$

Observation vector of the attribute variable X: x' = (8 6 6 3 2)

Moments of X multiplied by n:

$$
\begin{aligned}
m_1 &= \sum_{i=1}^{n} x_i = 8 + 6 + 6 + 3 + 2 = 25 \\
m_2 &= \sum_{i=1}^{n} x_i^2 = 8^2 + 6^2 + 6^2 + 3^2 + 2^2 = 64 + 36 + 36 + 9 + 4 = 149 \\
m_3 &= \sum_{i=1}^{n} x_i^3 = 8^3 + 6^3 + 6^3 + 3^3 + 2^3 = 512 + 216 + 216 + 27 + 8 = 979 \\
m_4 &= \sum_{i=1}^{n} x_i^4 = 8^4 + 6^4 + 6^4 + 3^4 + 2^4 = 4096 + 1296 + 1296 + 81 + 16 = 6785
\end{aligned}
$$

$$
\begin{aligned}
E(G^2) &= \frac{B_0 m_2^2 + B_1 m_4 + B_2 m_1^2 m_2 + B_3 m_1 m_3 + B_4 m_1^4}{(m_1^2 - m_2)^2 \cdot n(n-1)(n-2)(n-3)} \\
&= \frac{104\cdot 149^2 + 368\cdot 6785 + (-80)\cdot 25^2\cdot 149 + 0\cdot 25\cdot 979 + 40\cdot 25^4}{(25^2 - 149)^2\cdot 5\cdot(5-1)\cdot(5-2)\cdot(5-3)} \\
&= \frac{2308904 + 2496880 - 7450000 + 0 + 15625000}{27189120} \\
&= \frac{12980784}{27189120} = 0.477426
\end{aligned}
$$

$$
\mathrm{Var}(G) = E(G^2) - [E(G)]^2 = 0.477426 - 0.6^2 = 0.117426
$$

Test statistic:

$$
z(G) = \frac{G - E(G)}{\sqrt{\mathrm{Var}(G)}} = \frac{0.7311 - 0.6}{\sqrt{0.117426}} = \frac{0.1311}{0.342675} = 0.3826
$$

Critical value (α=0.05, one-sided test):

z1-α = 1.6449

Test decision:

z(G) = 0.3826 < z0.95 = 1.6449 => H0 cannot be rejected

Interpretation:
No global evidence for substantive spatial clustering of high unemployment regions

Hint:
As the normal approximation requires a large sample size, the test on the Getis-Ord G statistic has only been performed here for illustrative purposes.
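The whole test can be retraced in one pass. The sketch below follows formulas (3.14) to (3.17) with exact fractions for the intermediate quantities; variable names mirror the symbols of the text. The one-sided p-value is an addition that the text does not report, and it inherits the small-sample caveat of the normal approximation stated in the hint.

```python
from fractions import Fraction
from math import sqrt
from statistics import NormalDist

W_d = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
x = [8, 6, 6, 3, 2]
n = len(x)
off = [(i, j) for i in range(n) for j in range(n) if j != i]  # all pairs with j != i

# Weight sums (main diagonal excluded)
W = sum(W_d[i][j] for i, j in off)
S1 = Fraction(sum((W_d[i][j] + W_d[j][i]) ** 2 for i, j in off), 2)
row = [sum(W_d[i][j] for j in range(n) if j != i) for i in range(n)]
col = [sum(W_d[j][i] for j in range(n) if j != i) for i in range(n)]
S2 = sum((r + c) ** 2 for r, c in zip(row, col))

# Moments m_r = sum of x_i^r
m1, m2, m3, m4 = (sum(v ** r for v in x) for r in (1, 2, 3, 4))

B0 = (n * n - 3 * n + 3) * S1 - n * S2 + 3 * W ** 2
B1 = -((n * n - n) * S1 - 2 * n * S2 + 3 * W ** 2)
B2 = -(2 * n * S1 - (n + 3) * S2 + 6 * W ** 2)
B3 = 4 * (n - 1) * S1 - 2 * (n + 1) * S2 + 8 * W ** 2
B4 = S1 - S2 + W ** 2

G = Fraction(
    sum(W_d[i][j] * x[i] * x[j] for i, j in off),
    sum(x[i] * x[j] for i, j in off),
)
EG = Fraction(W, n * (n - 1))
EG2 = (B0 * m2 ** 2 + B1 * m4 + B2 * m1 ** 2 * m2 + B3 * m1 * m3 + B4 * m1 ** 4) / (
    (m1 ** 2 - m2) ** 2 * n * (n - 1) * (n - 2) * (n - 3)
)
var_G = EG2 - EG ** 2

z = float(G - EG) / sqrt(float(var_G))
z_crit = NormalDist().inv_cdf(0.95)   # one-sided critical value for alpha = 0.05
p_value = 1 - NormalDist().cdf(z)     # one-sided p-value (not reported in the text)
print(round(z, 4), round(z_crit, 4), z > z_crit)
```

Working with the unrounded G and Var(G) reproduces the hand calculation up to rounding in the last digit.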

3.3 Local indicators of spatial association (LISA)
While global spatial autocorrelation analysis aims at summarizing the strength of spatial dependencies in a single statistic, local spatial autocorrelation analysis focuses on the heterogeneity of spatial association over space. Instead of a single global statistic, location-specific statistics are provided.

Local indicators of spatial association (LISA) provide detailed information on spatial clustering (Anselin, 1995). The LISA for each observation gives an indication of the extent of substantial spatial clustering of similar values around that observation. Some LISA also have the property that their sum or average is proportional to the global counterpart.

LISA aim at identifying local clusters and spatial outliers. Local clusters are characterized by a concentration of high or low values of an attribute variable X. A spatial clustering of contiguous high-value regions is called a "hot spot", whereas a concentration of low-value regions defines a "cold spot". Both cases are associated with positive local autocorrelation. Spatial outliers are regions whose local orientation is the reverse of the predominant global one. When positive global spatial autocorrelation has been established, regions with negative local autocorrelation coefficients represent spatial outliers.
We deal with three well-known local indicators of spatial association,

- the local Moran statistic (Anselin, 1995),

- the Getis-Ord Gi statistic (Getis and Ord, 1992),

- the Getis-Ord Gi* statistic (Getis and Ord, 1992),

which complement one another with regard to the identification of spatial clusters and spatial outliers. The local Moran coefficient is suited to identifying spatial outliers and general, but not specific, clustering formations. For the latter purpose the Getis-Ord Gi and Gi* statistics have to be applied. They can distinguish between "hot spots" and "cold spots", both of which are characterized by high positive spatial autocorrelation.

3.3.1 Local Moran statistic
The local Moran statistic Ii detects local spatial autocorrelation. The Ii's are indicators of local instability. They decompose Moran's I into contributions for each location.
Owing to this property, local Moran statistics can be used for two purposes:
- Indicators of local spatial clusters,
- Diagnostics for outliers in global spatial patterns.
Local Moran statistic:

$$
I_i = \frac{(x_i - \bar{x})\sum_{j=1}^{n} w_{ij}\,(x_j - \bar{x})}{\sum_{j=1}^{n} (x_j - \bar{x})^2 / n}
\tag{3.15}
$$

Numerator
Determines the sign of Ii:
+, if the ith region and its neighbouring regions both have above-average or both have below-average values of the geo-referenced variable X
-, if the ith region has an above-average (below-average) value and the neighbouring regions have below-average (above-average) values of X

Denominator
Standardization of the cross-product by the variance sx² of the geo-referenced variable X
Expected value (under independence):

$$
E(I_i) = -\frac{w_{i\cdot}}{n-1} = -\frac{1}{n-1}
\quad\text{with}\quad w_{i\cdot} = \sum_{j=1}^{n} w_{ij} = 1
\tag{3.16}
$$

Relationship between global and local Moran statistics:

The average of the Ii's coincides with Moran's I:

$$
I = \frac{1}{n}\sum_{i=1}^{n} I_i
$$
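The averaging property follows directly from the definition of Ii. The derivation below is a sketch that writes zi = xi − x̄ and assumes the usual form of Moran's I, I = (n/S0)·ΣΣ wij·zi·zj / Σ zj², together with row-standardized weights, for which S0 = n:

$$
\frac{1}{n}\sum_{i=1}^{n} I_i
= \frac{1}{n}\sum_{i=1}^{n} \frac{z_i \sum_{j=1}^{n} w_{ij} z_j}{\sum_{j=1}^{n} z_j^2 / n}
= \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\, z_i z_j}{\sum_{j=1}^{n} z_j^2}
= \frac{n}{S_0}\cdot\frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\, z_i z_j}{\sum_{j=1}^{n} z_j^2}
= I
$$

With unstandardized weights the factor n/S0 differs from one, so the average of the Ii's is then only proportional to I.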

Note: Random permutation tests on local Moran's I statistics are available in programs like GeoDa and R (package spdep). Because of the high computational expense, the testing approach is introduced in the computer exercise. In the following example local Moran's I statistics are interpreted descriptively.
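To give an idea of how such a permutation test works, the sketch below enumerates all arrangements for the five-region example: the value of region i is held fixed, the remaining values are reshuffled over the other regions, and the share of arrangements with a local Moran value at least as large as the observed one is reported. This conditional scheme and the function names are assumptions made for illustration; they are not a description of how GeoDa or spdep implement their tests, which draw random permutations instead of enumerating all of them.

```python
from fractions import Fraction as F
from itertools import permutations

# Row-standardized weight matrix and observation vector of the five-region example
W = [
    [0, F(1, 2), F(1, 2), 0, 0],
    [F(1, 3), 0, F(1, 3), F(1, 3), 0],
    [F(1, 3), F(1, 3), 0, F(1, 3), 0],
    [0, F(1, 3), F(1, 3), 0, F(1, 3)],
    [0, 0, 0, 1, 0],
]
x = [8, 6, 6, 3, 2]


def local_moran(weights, values, i):
    """Local Moran statistic I_i as in (3.15), with the variance using divisor n."""
    n = len(values)
    mean = F(sum(values), n)
    z = [v - mean for v in values]
    s2 = sum(d * d for d in z) / n
    return z[i] * sum(weights[i][j] * z[j] for j in range(n)) / s2


def conditional_permutation_p(weights, values, i):
    """Share of arrangements (x_i fixed, other values permuted) with I_i >= observed."""
    observed = local_moran(weights, values, i)
    others = values[:i] + values[i + 1:]
    sims = []
    for perm in permutations(others):
        # Re-insert the fixed value of region i at its original position
        arranged = list(perm[:i]) + [values[i]] + list(perm[i:])
        sims.append(local_moran(weights, arranged, i))
    return F(sum(s >= observed for s in sims), len(sims))


for region in (1, 5):
    print(region, conditional_permutation_p(W, x, region - 1))
```

With only 4! = 24 arrangements per region the attainable p-values are very coarse, which illustrates why the example that follows is interpreted descriptively.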

Example:

We calculate local Moran statistics with the standardized weights wij.

Expected value:

$$
E(I_i) = -\frac{w_{i\cdot}}{n-1} = -\frac{1}{n-1} = -\frac{1}{5-1} = -0.25
$$

Standardized weight matrix and observation vector (x̄ = 5):

$$
W = \begin{pmatrix}
0 & 1/2 & 1/2 & 0 & 0 \\
1/3 & 0 & 1/3 & 1/3 & 0 \\
1/3 & 1/3 & 0 & 1/3 & 0 \\
0 & 1/3 & 1/3 & 0 & 1/3 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix},
\qquad
x = \begin{pmatrix} 8 \\ 6 \\ 6 \\ 3 \\ 2 \end{pmatrix}
$$

The sum of squared deviations from the mean has already been calculated with Moran's I (section 3.2.1), which gives the variance:

$$
s_x^2 = \frac{1}{n}\sum_{j=1}^{5} (x_j - \bar{x})^2 = \frac{1}{5}\cdot 24 = 4.8
$$

● Region 1:

Weighted sum of deviations from the mean:

$$
\sum_{j=1}^{5} w_{1j}\,(x_j - \bar{x}) = \tfrac{1}{2}(6-5) + \tfrac{1}{2}(6-5) = 1
$$

Local Moran statistic:

$$
I_1 = \frac{(8-5)\cdot 1}{4.8} = \frac{3}{4.8} = 0.6250
$$

● Region 2:

Weighted sum of deviations from the mean:

$$
\sum_{j=1}^{5} w_{2j}\,(x_j - \bar{x}) = \tfrac{1}{3}(8-5) + \tfrac{1}{3}(6-5) + \tfrac{1}{3}(3-5) = \tfrac{2}{3}
$$

Local Moran statistic:

$$
I_2 = \frac{(6-5)\cdot(2/3)}{4.8} = \frac{2}{14.4} = 0.1389
$$

● Region 3:

Weighted sum of deviations from the mean:

$$
\sum_{j=1}^{5} w_{3j}\,(x_j - \bar{x}) = \tfrac{1}{3}(8-5) + \tfrac{1}{3}(6-5) + \tfrac{1}{3}(3-5) = \tfrac{2}{3}
$$

Local Moran statistic:

$$
I_3 = \frac{(6-5)\cdot(2/3)}{4.8} = \frac{2}{14.4} = 0.1389
$$

● Region 4:

Weighted sum of deviations from the mean:

$$
\sum_{j=1}^{5} w_{4j}\,(x_j - \bar{x}) = \tfrac{1}{3}(6-5) + \tfrac{1}{3}(6-5) + \tfrac{1}{3}(2-5) = -\tfrac{1}{3}
$$

Local Moran statistic:

$$
I_4 = \frac{(3-5)\cdot(-1/3)}{4.8} = \frac{2}{14.4} = 0.1389
$$

● Region 5:

Weighted sum of deviations from the mean:

$$
\sum_{j=1}^{5} w_{5j}\,(x_j - \bar{x}) = 1\cdot(3-5) = -2
$$

Local Moran statistic:

$$
I_5 = \frac{(2-5)\cdot(-2)}{4.8} = \frac{6}{4.8} = 1.2500
$$

Moran's I = Average of local Moran statistics:

$$
I = \frac{1}{5}\sum_{i=1}^{5} I_i = (0.6250 + 0.1389 + 0.1389 + 0.1389 + 1.2500)/5 = 2.2917/5 = 0.4583
$$

[Section 3.2.1: I = 0.4583 (with standardized weights)]

Interpretation:
- Local spatial clustering is identified around region 5 and, to a somewhat lesser extent, around region 1, as both I5 and I1 noticeably exceed the global Moran's I value.
- Since all Ii values exceed their expected value of -0.25, no region is identified as outlying with respect to orientation.
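The five local statistics and their average can be reproduced with the following sketch, which uses exact fractions and standard-library Python only; the function name local_morans is chosen for illustration.

```python
from fractions import Fraction as F

# Row-standardized weight matrix and observation vector of the five-region example
W = [
    [0, F(1, 2), F(1, 2), 0, 0],
    [F(1, 3), 0, F(1, 3), F(1, 3), 0],
    [F(1, 3), F(1, 3), 0, F(1, 3), 0],
    [0, F(1, 3), F(1, 3), 0, F(1, 3)],
    [0, 0, 0, 1, 0],
]
x = [8, 6, 6, 3, 2]


def local_morans(weights, values):
    """Local Moran statistics I_1, ..., I_n as in (3.15)."""
    n = len(values)
    mean = F(sum(values), n)
    z = [v - mean for v in values]
    s2 = sum(d * d for d in z) / n  # variance with divisor n
    return [
        z[i] * sum(weights[i][j] * z[j] for j in range(n)) / s2 for i in range(n)
    ]


I_local = local_morans(W, x)
I_global = sum(I_local) / len(I_local)  # average of the local statistics
expected = F(-1, len(x) - 1)            # E(I_i) for row-standardized weights

for region, value in enumerate(I_local, start=1):
    # The flag marks whether I_i lies above its expected value
    print(region, round(float(value), 4), value > expected)
print(round(float(I_global), 4))
```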
3.3.2 Getis-Ord G statistics
The Getis-Ord Gi and Gi* statistics are local measures of spatial concentration. They indicate the extent of spatial clustering of high values ("hot spots") or low values ("cold spots") of an attribute variable X around region i.
As with the global G statistic, contiguity is defined by distance bands.

The Gi and Gi* statistics differ in whether observation i is excluded from or included in the summation. While observation i is excluded in Gi, it is included in the computation of Gi* (Getis and Ord, 1992).

Gi statistic:

$$
G_i = \frac{\sum_{j\neq i} w_{ij}(d)\,x_j}{\sum_{j\neq i} x_j}
\tag{3.16}
$$

Gi* statistic:

$$
G_i^{*} = \frac{\sum_{j=1}^{n} w_{ij}(d)\,x_j}{\sum_{j=1}^{n} x_j}
\tag{3.17}
$$
Expected values of Gi and Gi*:

$$
E(G_i) = \frac{W_i}{n-1} \tag{3.18}
$$

$$
\text{with}\quad W_i = \sum_{j\neq i} w_{ij}(d) \tag{3.19}
$$

$$
E(G_i^{*}) = \frac{W_i^{*}}{n} \tag{3.20}
$$

$$
\text{with}\quad W_i^{*} = \sum_{j=1}^{n} w_{ij}(d) \tag{3.21}
$$

Local spatial concentration of high values ("hot spots"):
Values of Gi and Gi* above their expected values

Local spatial concentration of low values ("cold spots"):
Values of Gi and Gi* below their expected values

Note: Getis and Ord (1995) also provide standardized Gi and Gi* statistics that are asymptotically normally distributed. The normal test is even valid for sample sizes as low as eight when the underlying distribution is not too skewed. For small samples, however, the random permutation test is preferable. The testing approaches are available in GeoDa and R (package spdep). In the following example the Gi and Gi* statistics are interpreted descriptively.

Example:

We calculate the local Getis-Ord statistics Gi and Gi* for the five regions using the spatial weights matrix

$$
W(d=7.5) = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
$$

which is defined in section 3.2.4 (global G statistic) for a distance band of 7.5 kilometres.

As the denominator of (3.17) does not vary across regions, it has to be calculated only once using the entries of the observation vector x = (8 6 6 3 2)':

Denominator of (3.17):

$$
\sum_{j=1}^{5} x_j = 8 + 6 + 6 + 3 + 2 = 25
$$

Region 1:
Gi statistic:

$$
G_1 = \frac{\sum_{j\neq 1} w_{1j}(d=7.5)\,x_j}{\sum_{j\neq 1} x_j} = \frac{1\cdot 6 + 1\cdot 6}{6+6+3+2} = \frac{12}{17} = 0.7059
$$

$$
E(G_1) = \frac{\sum_{j\neq 1} w_{1j}(d=7.5)}{n-1} = \frac{1+1+0+0}{5-1} = \frac{2}{4} = 0.5
$$

G1 > E(G1) => Tendency of spatial concentration of high values around region 1 (hot spot)

Gi* statistic:

$$
G_1^{*} = \frac{\sum_{j=1}^{n} w_{1j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 6}{8+6+6+3+2} = \frac{20}{25} = 0.8
$$

$$
E(G_1^{*}) = \frac{\sum_{j=1}^{n} w_{1j}(d=7.5)}{n} = \frac{1+1+1+0+0}{5} = \frac{3}{5} = 0.6
$$

G1* > E(G1*) => Tendency of spatial concentration of high values (hot spot): region 1 and its surroundings
Region 2:
Gi statistic:

$$
G_2 = \frac{\sum_{j\neq 2} w_{2j}(d=7.5)\,x_j}{\sum_{j\neq 2} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 3}{8+6+3+2} = \frac{17}{19} = 0.8947
$$

$$
E(G_2) = \frac{\sum_{j\neq 2} w_{2j}(d=7.5)}{n-1} = \frac{1+1+1+0}{5-1} = \frac{3}{4} = 0.75
$$

G2 > E(G2) => Tendency of spatial concentration of high values around region 2 (hot spot)

Gi* statistic:

$$
G_2^{*} = \frac{\sum_{j=1}^{n} w_{2j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 6 + 1\cdot 3}{8+6+6+3+2} = \frac{23}{25} = 0.92
$$

$$
E(G_2^{*}) = \frac{\sum_{j=1}^{n} w_{2j}(d=7.5)}{n} = \frac{1+1+1+1+0}{5} = \frac{4}{5} = 0.8
$$

G2* > E(G2*) => Tendency of spatial concentration of high values (hot spot): region 2 and its surroundings
Region 3:
Gi statistic:

$$
G_3 = \frac{\sum_{j\neq 3} w_{3j}(d=7.5)\,x_j}{\sum_{j\neq 3} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 3}{8+6+3+2} = \frac{17}{19} = 0.8947
$$

$$
E(G_3) = \frac{\sum_{j\neq 3} w_{3j}(d=7.5)}{n-1} = \frac{1+1+1+0}{5-1} = \frac{3}{4} = 0.75
$$

G3 > E(G3) => Tendency of spatial concentration of high values around region 3 (hot spot)

Gi* statistic:

$$
G_3^{*} = \frac{\sum_{j=1}^{n} w_{3j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 8 + 1\cdot 6 + 1\cdot 6 + 1\cdot 3}{8+6+6+3+2} = \frac{23}{25} = 0.92
$$

$$
E(G_3^{*}) = \frac{\sum_{j=1}^{n} w_{3j}(d=7.5)}{n} = \frac{1+1+1+1+0}{5} = \frac{4}{5} = 0.8
$$

G3* > E(G3*) => Tendency of spatial concentration of high values (hot spot): region 3 and its surroundings
Region 4:
Gi statistic:

$$
G_4 = \frac{\sum_{j\neq 4} w_{4j}(d=7.5)\,x_j}{\sum_{j\neq 4} x_j} = \frac{1\cdot 6 + 1\cdot 6 + 1\cdot 2}{8+6+6+2} = \frac{14}{22} = 0.6364
$$

$$
E(G_4) = \frac{\sum_{j\neq 4} w_{4j}(d=7.5)}{n-1} = \frac{0+1+1+1}{5-1} = \frac{3}{4} = 0.75
$$

G4 < E(G4) => Tendency of spatial concentration of low values around region 4 (cold spot)

Gi* statistic:

$$
G_4^{*} = \frac{\sum_{j=1}^{n} w_{4j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 6 + 1\cdot 6 + 1\cdot 3 + 1\cdot 2}{8+6+6+3+2} = \frac{17}{25} = 0.68
$$

$$
E(G_4^{*}) = \frac{\sum_{j=1}^{n} w_{4j}(d=7.5)}{n} = \frac{0+1+1+1+1}{5} = \frac{4}{5} = 0.8
$$

G4* < E(G4*) => Tendency of spatial concentration of low values (cold spot): region 4 and its surroundings
Region 5:
Gi statistic:

$$
G_5 = \frac{\sum_{j\neq 5} w_{5j}(d=7.5)\,x_j}{\sum_{j\neq 5} x_j} = \frac{1\cdot 3}{8+6+6+3} = \frac{3}{23} = 0.1304
$$

$$
E(G_5) = \frac{\sum_{j\neq 5} w_{5j}(d=7.5)}{n-1} = \frac{0+0+0+1}{5-1} = \frac{1}{4} = 0.25
$$

G5 < E(G5) => Tendency of spatial concentration of low values around region 5 (cold spot)

Gi* statistic:

$$
G_5^{*} = \frac{\sum_{j=1}^{n} w_{5j}(d=7.5)\,x_j}{\sum_{j=1}^{n} x_j} = \frac{1\cdot 3 + 1\cdot 2}{8+6+6+3+2} = \frac{5}{25} = 0.2
$$

$$
E(G_5^{*}) = \frac{\sum_{j=1}^{n} w_{5j}(d=7.5)}{n} = \frac{0+0+0+1+1}{5} = \frac{2}{5} = 0.4
$$

G5* < E(G5*) => Tendency of spatial concentration of low values (cold spot): region 5 and its surroundings
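The region-by-region calculations can be condensed into one small function. This is a descriptive sketch only: the function name local_g and the flag include_self are chosen for illustration, and the comparison with the expected value mirrors the interpretation used in the example instead of a significance test.

```python
from fractions import Fraction as F

# Binary weight matrix W(d = 7.5) with ones on the main diagonal
W_d = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
x = [8, 6, 6, 3, 2]


def local_g(weights, values, i, include_self):
    """Return (statistic, expected value): G_i if include_self is False, else G_i*."""
    n = len(values)
    idx = [j for j in range(n) if include_self or j != i]
    stat = F(sum(weights[i][j] * values[j] for j in idx), sum(values[j] for j in idx))
    # Expected value: W_i / (n - 1) for G_i, W_i* / n for G_i*
    expected = F(sum(weights[i][j] for j in idx), len(idx))
    return stat, expected


for i in range(len(x)):
    g, eg = local_g(W_d, x, i, include_self=False)
    gs, egs = local_g(W_d, x, i, include_self=True)
    tendency = "high values" if g > eg else "low values"
    print(i + 1, round(float(g), 4), float(eg), round(float(gs), 4), float(egs), tendency)
```

In this example Gi and Gi* point in the same direction for every region: regions 1 to 3 lie above their expected values, regions 4 and 5 below.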
