Spatial Analysis
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by Walter A. Shewhart and Samuel S. Wilks
John T. Kent
University of Leeds, UK
Kanti V. Mardia
University of Leeds, UK
University of Oxford, UK
This edition first published 2022
© 2022 John Wiley and Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or
otherwise, except as permitted by law. Advice on how to obtain permission to reuse material
from this title is available at http://www.wiley.com/go/permissions.
The right of John T. Kent and Kanti V. Mardia to be identified as the authors of this work has
been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley
products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some
content that appears in standard print versions of this book may not be available in other
formats.
To my wife Sue for all her patience during the many years it has taken to
complete the book
(John Kent)
Contents
1 Introduction 1
1.1 Spatial Analysis 1
1.2 Presentation of the Data 2
1.3 Objectives 9
1.4 The Covariance Function and Semivariogram 11
1.4.1 General Properties 11
1.4.2 Regularly Spaced Data 13
1.4.3 Irregularly Spaced Data 14
1.5 Behavior of the Sample Semivariogram 16
1.6 Some Special Features of Spatial Analysis 22
Exercises 27
7 Kriging 231
7.1 Introduction 231
7.2 The Prediction Problem 233
7.3 Simple Kriging 236
7.4 Ordinary Kriging 238
7.5 Universal Kriging 240
7.6 Further Details for the Universal Kriging Predictor 241
7.6.1 Transfer Matrices 241
7.6.2 Projection Representation of the Transfer Matrices 242
7.6.3 Second Derivation of the Universal Kriging Predictor 244
7.6.4 A Bordered Matrix Equation for the Transfer Matrices 245
7.6.5 The Augmented Matrix Representation of the Universal Kriging
Predictor 245
7.6.6 Summary 247
7.7 Stationary Examples 248
7.8 Intrinsic Random Fields 253
7.8.1 Formulas for the Kriging Predictor and Kriging Variance 253
7.8.2 Conditionally Positive Definite Matrices 254
7.9 Intrinsic Examples 256
7.10 Square Example 258
7.11 Kriging with Derivative Information 259
7.12 Bayesian Kriging 262
7.12.1 Overview 262
7.12.2 Details for Simple Bayesian Kriging 264
7.12.3 Details for Bayesian Kriging with Drift 264
7.13 Kriging and Machine Learning 266
7.14 The Link Between Kriging and Splines 269
7.14.1 Nonparametric Regression 269
Appendix B A Brief History of the Spatial Linear Model and the Gaussian
Process Approach 347
B.1 Introduction 347
B.2 Matheron and Watson 348
B.3 Geostatistics at Leeds 1977–1987 349
B.3.1 Courses, Publications, Early Dissemination 349
B.3.2 Numerical Problems with Maximum Likelihood 351
B.4 Frequentist vs. Bayesian Inference 352
List of Figures
Figure 1.12 Directional semivariograms for (a) the Landsat data and (b) the
synthetic Landsat data. 19
Figure 1.13 Gravimetric data: (a) bubble plot and (b) directional
semivariograms. 21
Figure 1.14 Soil data: (a) bubble plot and (b) directional semivariograms. 22
Figure 1.15 Mercer–Hall wheat data: log–log plot of variance vs.
block size. 24
Figure 2.1 Matérn covariance functions for varying index parameters.
The range and scale parameters have been chosen so that the
covariance functions match at lags r = |h| = 0 and r = 1. 44
Figure 3.1 Examples of radial semivariograms: the power schemes 𝛾#(r) ∝ r^{2𝛼} for 𝛼 = 1∕4, 1∕2 and the exponential scheme 𝛾#(r) = 1 − exp(−r). All the semivariograms have been scaled to take the same value for r = 2. 83
Figure 3.2 A linear semivariogram with a nugget effect: lim_{r→0} 𝛾#(r) = 0.3 > 0. 83
Figure 4.1 Panels (a) and (b) illustrate the first-order basic and full
neighborhoods of the origin in the plane. Panel (c) illustrates the
second-order basic neighborhood. 123
Figure 4.2 Three notions of “past” of the origin in ℤ2 : (a) quadrant past (−),
(b) lexicographic past (−), and (c) weak past (−). In each plot,
○ denotes the origin, × denotes a site in the past, and • denotes a
site in the future. 136
Figure 4.3 (a) First-order basic neighborhood (nbhd) of the origin ○ in d = 2
dimensions. Neighbors of the origin are indicated by ×. (b) Two
types of clique in addition to singleton cliques: horizontal and
vertical edges. 145
Figure 4.4 (a) First-order full neighborhood (nbhd) of the origin ○ in d = 2
dimensions. Neighbors of the origin are indicated by ×. (b) Seven
types of clique in addition to singleton cliques: horizontal and
vertical edges, four shapes of triangle and a square. 146
Figure 5.1 Bauxite data: Bubble plot and directional semivariograms. 167
Figure 5.2 Elevation data: Bubble plot and directional semivariograms. 168
Figure 5.3 Bauxite data: Profile log-likelihoods together with 95% confidence
intervals. Exponential model, no nugget effect. 176
Figure 5.4 Bauxite data: sample isotropic semivariogram values and fitted
Matérn semivariograms with a nugget effect, for 𝜈 = 0.5 (solid),
𝜈 = 1.5 (dashed), and 𝜈 = 2.5 (dotted). 177
Figure 7.2 Kriging predictor for n = 5 data points assumed to come from a
stationary random field with a squared exponential covariance
function (7.55), plus a nugget effect, with mean 0. The size of the
relative nugget effect in Panels (a)–(c) is given by
𝜓 = 𝜏²∕(𝜎² + 𝜏²) = 0.01, 0.1, 0.5, respectively. Each panel shows
the true unknown shifted sine function (solid), together with the
fitted kriging curve (dashed), plus/minus twice the kriging
standard errors (dotted). 250
Figure 7.3 Panel (a) shows the interpolated kriging surface for the elevation
data, as a contour map. Panel (b) shows a contour map of the
corresponding kriging standard errors. This figure is also included
in Figure 1.3. 251
Figure 7.4 Panel (a) shows a contour plot for the kriged surface fitted to the
bauxite data assuming a constant mean and an exponential
covariance function for the error terms. Panel (b) shows the same
plot assuming a quadratic trend and independent errors. Panels (c)
and (d) show the kriging standard errors for the models in (a) and
(b), respectively. 252
Figure 7.5 Kriging predictor and kriging standard errors for n = 3 data points
assumed to come from an intrinsic random field, 𝜎_I(h) = −½𝜎²|h|,
no nugget effect. The intrinsic drift is constant. Panel (a): no
extrinsic drift; Panel (b): linear extrinsic drift. Each panel shows
the fitted kriging curve (solid), plus/minus twice the kriging
standard errors (dashed). 257
Figure 7.6 Panel (a) shows the interpolated kriging surface for the gravimetric
data, as a contour map. Panel (b) shows a contour map of the
corresponding kriging standard errors. 257
Figure 7.7 Kriging predictors for Example 7.6. For Panel (a), the kriging
predictor is based on value constraints at sites 1,2,3. For Panel (b),
the kriging predictor is additionally based on derivative constraints
at the same sites. 262
Figure 7.8 Deformation of a square (a) into a kite (b) using a thin-plate spline.
The effect of the deformation on ℝ2 can also be visualized: it maps
a grid of parallel lines to a bi-orthogonal grid. 276
Figure B.1 Creators of Kriging: Danie Krige and Georges Matheron. 348
Figure B.2 Letter from Matheron to Mardia, dated 1990. 350
Figure B.3 Translation of the letter from Matheron to Mardia, dated
1990. 351
List of Tables
Table 5.1 Parameter estimates (and standard errors) for the bauxite data
using the Matérn model with a constant mean and with various
choices for the index 𝜈. 175
Table 5.2 Parameter estimates (and standard errors) for the elevation data
using the Matérn model with a constant mean and with various
choices for the index 𝜈. 177
Table 5.3 Parameter estimates (with standard errors in parentheses) for
Vecchia’s composite likelihood in Example 5.7 for the synthetic
Landsat data, using different sizes of neighborhood k. 185
Table 7.1 Notation used for kriging at the data sites t₁, . . . , tₙ, and at the prediction and data sites t₀, . . . , tₙ. 235
Table 7.2 Various methods of determining the kriging predictor Û(t₀) = 𝜸ᵀX, where 𝜸 can be defined in terms of transfer matrices by 𝜸 = Af₀ + B𝝈₀ or in terms of 𝜹 by 𝜹 = [1, −𝜸ᵀ]ᵀ. 247
Table 7.3 Comparison between the terminology and notation of this book for
simple kriging and Rasmussen and Williams (2006, pp. 13–17) for
simple Bayesian kriging. The posterior mean and covariance
function take the same form in both formulations, given by (7.61)
and (7.62). 267
Table A.1 Types of domain. 304
Table A.2 Domains for Fourier transforms and inverse Fourier transforms in
various settings. 314
Table A.3 Three types of boundary condition for a one-dimensional set of
data, x1 , . . . , xn . 327
Table A.4 Some examples of Toeplitz, circulant, and folded circulant matrices
U (n,r) , V (n,r) , W (n,r) , respectively, for n = 5 and r = 1, 2, 3, 4. 328
Table A.5 First and second derivatives of Σ and Σ⁻¹ with respect to 𝜃₁ = 𝜎𝜀² and 𝜃₂ = 𝜆. 344
Preface
technical mathematical tools have been collected in Appendix A for ease of refer-
ence. Appendix B contains a short historical review of the spatial linear model.
The development of statistical methodology for spatial data arose somewhat sep-
arately in several academic disciplines over the past century.
(a) Agricultural field trials. An area of land is divided into long, thin plots, and a different crop is grown on each plot. Spatial correlation in the soil fertility can cause spatial correlation in the crop yields (Webster and Oliver, 2001).
(b) Geostatistics. In mining applications, the concentration of a mineral of interest
will often show spatial continuity in a body of ore. Two giants in the field of
spatial analysis came out of this field. Krige (1951) set out the methodology
for spatial prediction (now known as kriging) and Matheron (1963) devel-
oped a comprehensive theory for stationary and intrinsic random fields; see
Appendix B.
(c) Social and medical science. Spatial continuity is an important property when
describing characteristics that vary across a region of space. One application is
in geography and environmetrics and key names include Cliff and Ord (1981),
Anselin (1988), Upton and Fingleton (1985, 1989), Wilson (2000), Lawson
and Denison (2002), Kanevski and Maignan (2004), and Schabenberger and
Gotway (2005). Another application is in public health and epidemiology,
see, e.g., Diggle and Giorgi (2019).
(d) Splines. A very different approach to spatial continuity has been pursued in the
field of nonparametric statistics. Spatial continuity of an underlying smooth
function is ensured by imposing a roughness penalty when fitting the function
to data by least squares. It turns out that the fitted spline is identical to the kriging
predictor under suitable assumptions on the underlying covariance function.
Key names here include Wahba (1990) and Watson (1984). A modern treat-
ment is given in Berlinet and Thomas-Agnan (2004).
(e) Mainstream statistics. From at least the 1950s, mainstream statisticians have
been closely involved in the development of suitable spatial models and suit-
able fitting procedures. Highlights include the work by Whittle (1954), Matérn
(1960, 1986), Besag (1974), Cressie (1993), and Diggle and Ribeiro (2007).
(f) Probability theory and fractals. For the most part, statisticians interested in
asymptotics have focused on “outfill” asymptotics – the data sites cover an
increasing domain as the sample size increases. The other extreme is “infill
asymptotics” in which the interest is on the local smoothness of realizations
from the spatial process. This infill topic has long been of interest to proba-
bilists (e.g. Adler, 1981). The smoothness properties of spatial processes under-
lie much of the theory of fractals (Mandelbrot, 1982).
(g) Machine learning. Gaussian processes and splines have become a fundamental
tool in machine learning. Key texts include Rasmussen and Williams (2006)
and Hastie et al. (2009).
The book is designed to be used in teaching. The statistical models and methods
are carefully explained, and there is an extensive set of exercises. At the same time
the book is a research monograph, pulling together and unifying a wide variety of
different ideas.
A key strength of the book is a careful description of the foundations of the sub-
ject for stationary and related random fields. Our view is that a clear understanding
of the basics of the subject is needed before the methods can be used in more com-
plicated situations. Subtleties are sometimes skimmed over in more applied texts
(e.g. how to interpret the “covariance function” for an intrinsic process, especially
of higher order, or a generalized process, and how to specify their spectral repre-
sentations). The unity of the subject, ranging from continuously indexed to lattice
processes, has been emphasized. The important special case of self-similar intrin-
sic covariance functions is carefully explained. There are now a wide variety of
estimation methods, mainly variants and approximations to maximum likelihood,
and these are explored in detail.
There is a careful treatment of kriging, especially for intrinsic covariance func-
tions where the importance of drift terms is emphasized. The link to splines is
explained in detail. Examples based on real data, especially from geostatistics, are
used to illustrate the key ideas.
The book aims at a balance between theory and illustrative applications, while
remaining accessible to a wide audience. Although there is now a wide variety of books available on the subject of spatial analysis, none of them has quite the same perspective; here we just highlight a few. Ripley (1981) was one of the first monographs in the
mainstream Statistics literature. Some key books that complement the material in
this book, especially for applications, include Cressie (1993), Diggle and Ribeiro
(2007), Diggle and Giorgi (2019), Gelfand et al. (2010), Chilès and Delfiner (2012),
Banerjee et al. (2015), van Lieshout (2019), and Rasmussen and Williams (2006).
What background does a reader need? The book assumes a knowledge of the
ideas covered by intermediate courses in mathematical statistics and linear alge-
bra. In addition, some familiarity with multivariate statistics will be helpful. Oth-
erwise, the book is largely self-contained. In particular, no prior knowledge of
stochastic processes is assumed. All the necessary matrix algebra is included in
Appendix A. Some knowledge of time series is not necessary, but will help to set
some of the ideas into context.
There is now a wide selection of software packages to carry out spatial analysis, especially in R, and it is not the purpose of this book to compare them. We have
largely used the package geoR (Ribeiro Jr and Diggle, 2001) and the program of
Pardo-Igúzquiza et al. (2008), with additional routines written where necessary.
The data sets are available from a public repository at https://github.com/jtkent1/
spatial-analysis-datasets.
Several themes receive little or no coverage in the book. These include point pro-
cesses, discretely valued processes (e.g. binary processes), and spatial–temporal
processes. There is little emphasis on a full Bayesian analysis when the covariance parameters need to be estimated. The main focus is on methods related to
maximum likelihood.
The book has had a long gestation period. When we started writing the book in the 1980s, the literature was much sparser. As the writing of the book progressed,
the subject has evolved at an increasing rate, and more sections and chapters have
been added. As a result the coverage of the subject feels more complete. At last,
this first edition is finished (though the subject continues to advance).
A series of workshops at Leeds University (the Leeds Annual Statistics Research
[LASR] workshops), starting from 1979, helped to develop the cross-disciplinary
fertilization of ideas between Statistics and other disciplines. Some leading
researchers who presented their work at these meetings include Julian Besag,
Fred Bookstein, David Cox, Xavier Guyon, John Haslett, Chris Jennison, Hans
Künsch, Alain Marechal, Richard Martin, Brian Ripley, and Tata Subba-Rao.
We are extremely grateful to Wiley for their patience and help during the writ-
ing of the book, especially Helen Ramsey, Sharon Clutton, Rob Calver, Richard
Davies, Kathryn Sharples, Liz Wingett, Kelvin Matthews, Alison Oliver, Vikto-
ria Hartl-Vida, Ashley Alliano, Kimberly Monroe-Hill, and Paul Sayer. Secretarial
help at Leeds during the initial development was given by Margaret Richardson,
Christine Rutherford, and Catherine Dobson.
We have had helpful discussions with many participants at the LASR work-
shops and with colleagues and students about the material in the book. These
include Robert Adler, Francisco Alonso, Jose Angulo, Robert Aykroyd, Andrew
Baczkowski, Noel Cressie, Sourish Das, Pierre Delfiner, Peter Diggle, Peter
Dowd, Ian Dryden, Alan Gelfand, Christine Gill, Chris Glasbey, Arnaldo Goitía,
Colin Goodall, Peter Green, Ulf Grenander, Luigi Ippoliti, Anil Jain, Giovanna
Jona Lasinio, André Journel, Freddie Kalaitzis, David Kendall, Danie Krige,
Neil Lawrence, Toby Lewis, John Little, Roger Marshall, Georges Matheron,
Lutz Mattner, Charles Meyer, Michael Miller, Mohsen Mohammadzadeh,
Debashis Mondal, Richard Morris, Ali Mosammam, Nitis Mukhopadhyay, Keith
Ord, E Pardo-Igúzquiza, Anna Persson, Sophia Rabe, Ed Redfern, Allen Royale,
Sujit Sahu, Paul Sampson, Bernard Silverman, Nozer Singpurwalla, Paul Switzer,
Charles Taylor, D. Vere-Jones, Alan Watkins, Geof Watson, Chris Wikle, Alan
Wilson, and Jim Zidek.
John is grateful to his wife Sue for her support in the writing of this book, espe-
cially with the challenges of the Covid pandemic. Kanti would like to thank the
Leverhulme Trust for an Emeritus Fellowship and Anna Grundy of the Trust for
simplifying the administration process. Finally, he would like to express his sincere
gratitude to his wife and his family for continuous love, support and compassion
during his research writings such as this monograph.
We would be pleased to hear about any typographical or other errors in the text.
List of Notation and Terminology
Here is a list of some of the key notations and terminology used in the book.
● ℝ and ℤ denote the real numbers and integers.
● If s and t are sites, then sᵀt = ∑_{𝓁=1}^{d} s[𝓁]t[𝓁] is the inner product and |t|² = tᵀt is the squared Euclidean norm.
● Modulo notation mod (for numbers) and Mod (for vectors) (Section A.1)
● Check and convolution notation. If 𝜑(u) is a function of u ∈ ℝᵈ, let 𝜑̌(u) = 𝜑(−u). Then
(𝜑 ∗ 𝜓)(h) = ∫ 𝜑(u)𝜓(h − u) du,   (𝜑 ∗ 𝜑̌)(h) = ∫ 𝜑(u)𝜑(u − h) du,
and the latter is symmetric in h.
● The Kronecker delta and Dirac delta functions are denoted 𝛿_h, h ∈ ℤᵈ, and 𝛿(h), h ∈ ℝᵈ, respectively.
● 𝒩. A finite symmetric neighborhood of the origin in ℤ². The augmented neighborhood 𝒩₀ = 𝒩 ∪ {0} includes the origin. Half of the neighborhood is denoted 𝒩† (Section 4.4).
● A half-space in ℤᵈ, especially the lexicographic half-space (Section 4.8). Related ideas are the weak past and quadrant past (Section 4.8), and the partial past (Section 5.9).
● Kriging is essentially prediction for random fields. It comes in various forms
including simple kriging, ordinary kriging, universal kriging, and Bayesian krig-
ing. In each case, there is a kriging predictor at every site, which depends on the
data through a kriging vector. Combining the kriging predictor for all sites yields
a kriging surface. The kriging variance describes the accuracy of the predictor
at each site. Tables 7.1 and 7.2 set out the notation for kriging and Table 7.3
provides a comparison with some related notation used in machine learning.
● The transfer covariance matrix and transfer drift matrix are used to construct the
kriging predictor (Section 7.6).
● Bordered covariance matrix. This is an (n + 1) × (n + 1) matrix, Section 7.6.4,
also used to construct the kriging predictor.
● Autoregression (AR) and related spatial models come in various forms in
Chapter 4 including:
– MA: Moving average (Section 4.3)
– SAR: Simultaneous autoregression (Section 4.5)
– CAR: Conditional autoregression (Section 4.6)
– ICAR: Intrinsic CAR (Section 4.6.3)
– QICAR: Quasi-intrinsic CAR (Section 4.6.3)
– UAR: Unilateral autoregression (Section 4.8.2)
– QAR: Quadrant unilateral autoregression (Section 4.8.3)
● Types of matrix
– Tensor product matrices (Section A.3.9)
– Toeplitz, circulant, folded circulant matrices (Sections A.3.8 and A.10).
– All n × n circulant matrices in d = 1 dimension have the same eigenvectors. These can be represented in complex coordinates by the unitary matrix G_n^(DFT,com) or in real coordinates by the orthogonal matrix G_n^(DFT,rea) (Section A.7.2).
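To make the kriging terminology above concrete, the following sketch implements simple kriging (known mean zero, stationary covariance) in the spirit of Section 7.3. The exponential covariance function, the sites, and the data values are hypothetical choices made up for illustration, not taken from the book's examples.

```python
import numpy as np

# Simple kriging sketch: known mean 0, stationary exponential covariance
# sigma(h) = sigma2 * exp(-|h| / phi). All numbers below are hypothetical,
# purely to illustrate the kriging vector, predictor, and variance.
sigma2, phi = 1.0, 2.0
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
x = np.array([1.2, 0.8, 0.5, -0.3])     # observed values X at the data sites
t0 = np.array([0.5, 0.5])               # prediction site

def cov(a, b):
    """Stationary, isotropic exponential covariance between two sites."""
    return sigma2 * np.exp(-np.linalg.norm(a - b) / phi)

Sigma = np.array([[cov(s, t) for t in sites] for s in sites])  # data covariance matrix
sigma0 = np.array([cov(s, t0) for s in sites])                 # covariances with t0

gamma = np.linalg.solve(Sigma, sigma0)   # kriging vector
pred = gamma @ x                         # kriging predictor at t0
kvar = sigma2 - gamma @ sigma0           # kriging variance at t0

print(pred, kvar)
```

With no nugget effect the predictor interpolates exactly: taking t0 equal to a data site returns the observed value there with zero kriging variance.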
Introduction
Spatial analysis involves the analysis of data collected in a spatial region. A key
aspect of such data is that observations at nearby sites tend to be highly correlated
with one another. Any adequate statistical analysis should take these correlations
into account.
The region in which the data lie is a subset of d-dimensional space, ℝᵈ, for some d ≥ 1. The important cases in practice are d = 1, 2, 3, corresponding to data on the line, in the plane, or in 3-space, respectively.
The one-dimensional case, d = 1, is already well known from the analysis of
time series. Therefore, it will come as no surprise that many of the techniques
introduced in this book represent generalizations of standard methodology from
time-series analysis. However, just as multivariate analysis contains techniques
with no counterpart in univariate statistical analysis, spatial analysis includes
techniques with no counterpart in time-series analysis.
Spatial data arise in many applications. In mining we may have measurements
of ore grade at a set of boreholes. If all the observations along each borehole
are averaged together, we obtain data in d = 2 dimensions, whereas if we retain
the depth information at which each observation in the borehole is made, we
obtain three-dimensional data. In agriculture, experiments are usually performed
on experimental plots, which are regularly spaced in a field. For environmental
monitoring, data are collected at an array of monitoring sites, possibly irregularly
located. There may also be a temporal component to this monitoring application
as data are collected through time.
Digital images can also be viewed as spatial data sets. Examples include Landsat
satellite images of areas of the earth’s surface, medical images of the interior of the
human body, and fingerprint images.
A digital image can be regarded as a spatial data set on a large regular grid;
typically, d = 2 and n = 256 × 256 or 512 × 512. In this context, the sites are known
as pixels (picture elements).
Table 1.1 data (n = 12), shown as a raw 3 × 4 array, indexed by row t[1] = 1, 2, 3, and in graphical coordinates:
 6  7  3 10
13  2  4  3
22  9  2  5
t[1] = 1:  6  7  3 10
t[1] = 2: 13  2  4  3
t[1] = 3: 22  9  2  5
(c) Graphical coordinates (used for all spatial data sets in this book). The asterisks are explained in Example 1.7.
t[2] = 3:  *6   *7   *3   10
t[2] = 2: *13   *2   *4    3
t[2] = 1: *22   *9   *2    5
          t[1] = 1    2    3    4
Figure 1.1 Fingerprint of R A Fisher, taken from Mardia’s personal collection. A blowup
of the marked rectangular section is given in Figure 1.9.
from an arbitrary origin located in the southwest corner; t[1] is the East–West coordinate and t[2] is the North–South coordinate. Figure 1.2 gives two plots of the data.
The raw plot in Panel (a) shows the elevation values printed at each site. The bub-
ble plot in Panel (b) shows a circle plotted at each site, where the size of the circle
encodes graphically the elevation information; larger elevations are indicated by
bigger circles. Patterns in the data are often easier to pick out using the bubble plot.
Note that the elevations are high near the edges of the region with a basin in the
middle. There are extra features associated with the data such as river locations,
but we will limit ourselves here to just the elevation information for illustrative
purposes.
One of the objectives for this sort of data is to predict the elevation through-
out the region and to represent the result graphically. Using a statistical method
called kriging (see Chapter 7 for details), the elevation was predicted or smoothed
Figure 1.2 Elevation data: (a) raw plot giving the elevation at each site and (b) bubble
plot where larger elevations are indicated by bigger circles.
on a fine grid of points throughout the region. The result for this data set can be
summarized visually in different ways including:
● A contour map (Figure 1.3a)
● A perspective plot (Figure 1.3b), viewed from the top of the region
● A digital image using gray level (or color) to indicate ore grade, where white
denotes the lower values and black denotes higher values (Figure 1.3c)
These images all show that the data have a valley in the top middle of the image and a peak in the bottom middle.
In addition, the contour plot in Figure 1.3d shows the standard error of the pre-
dictor. Notice that the predictor is perfect with zero standard error at the data sites,
and it has a larger standard error in places where the data sites are sparse. See
Example 7.2 for more details. ◽
Figure 1.3 Panels (a), (b), and (c) show interpolated plots for the elevation data, as a
contour map, a perspective plot (viewed from the top of the region), and an image plot,
respectively. Panel (d) shows a contour map of the corresponding standard errors.
Figure 1.4 Bauxite data: (a) raw plot giving the ore grade at each site and (b) bubble
plot where larger ore grades are indicated by bigger circles.
Table 1.3 Bauxite data: percentage ore grade for bauxite at n = 33 locations.
Landsat data: image plot (axes t[1] and t[2]).
1.3 Objectives
where the process is evolving through time), and the term “random field” when
the sites lie in higher dimensions.
(a) Spatial correlation. A key aspect in most applications is that observations at
nearby sites will tend to be correlated. Hence, in a data set that can be regarded
as a stationary random field, one of the first objectives is to describe and quan-
tify the extent of this spatial correlation.
(b) Prediction. Using a sample of observations one might want to predict the
value of the random field at a new site. Using the spatial correlation structure
will improve the accuracy of the predictor. A typical example arises in mining
where measurements are made available at a set of boreholes, and one wants
to predict the ore content at a new borehole or in a block of rock. In time-series
analysis, “prediction” usually means predicting the future given the past.
However, in spatial analysis, even for the one-dimensional case, there is
generally no concept of “past” or “future.” Instead, prediction involves either
interpolation within or extrapolation beyond the set of data sites.
(c) The spatial linear model. A more realistic model than a stationary random
field might include trend terms (such as linear or quadratic drift) or treat-
ment effects (such as in a designed agricultural experiment). If the spatial
correlation structure is known, then estimation of these parameters in the
1.4 The Covariance Function and Semivariogram
The covariance function is well known from its use in time-series analysis. How-
ever, the semivariogram is valid in more settings in which 𝛾(h) → ∞. Since both
constructions are popular, we shall work with both of them through the book.
An important simplification occurs when the covariance function and semivari-
ogram depend only on the radial component r = |h|. In this case, the random field
is said to be isotropic and the notation
Figure 1.7 Typical semivariogram, showing the range, nugget variance, and sill.
(c) Sill. The fitted semivariogram can be extrapolated toward lag |h| = ∞. In
Figure 1.7, the semivariogram increases to a finite sill with a value of 0.7.
More generally, for stationary processes the sill is always finite. However, in
the wider setting of intrinsic processes, the sill may be infinite; see Chapter 3.
(d) Range. In Figure 1.7, the sill is attained by the semivariogram when |h| = 6.
This value of the lag is known as the “range.” Observations at sites separated
by a distance greater than the range are uncorrelated. In many examples, the
range is not finite; the semivariogram approaches the sill only asymptotically
as |h| → ∞. But even in this situation, it is helpful to define an “approximate
range” such that when the lag has reached this value, the semivariogram has
nearly reached the sill. See Section 5.2 for further discussion.
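These features can be made explicit for a specific model. The sketch below evaluates an exponential semivariogram with a nugget effect; the parameter values, and the convention of taking the approximate range as r = 3𝜙 (where the correlation exp(−r∕𝜙) has fallen to about 5%), are illustrative assumptions rather than quantities from the text.

```python
import math

# Exponential semivariogram with a nugget effect (illustrative parameters):
# gamma(r) = tau2 + sigma2 * (1 - exp(-r / phi)) for r > 0, gamma(0) = 0.
tau2, sigma2, phi = 0.3, 0.7, 1.5   # nugget variance, partial sill, scale

def gamma(r):
    """Semivariogram at lag r >= 0."""
    if r == 0:
        return 0.0
    return tau2 + sigma2 * (1.0 - math.exp(-r / phi))

sill = tau2 + sigma2          # limiting value of gamma(r) as r -> infinity
# The exponential scheme approaches its sill only asymptotically, so the
# range is not finite; a common convention takes r = 3 * phi as the
# approximate range, by which point gamma has nearly reached the sill.
approx_range = 3.0 * phi

print(sill, round(gamma(approx_range), 3))
```

Just above r = 0 the semivariogram jumps to the nugget variance 0.3, and at the approximate range it has covered about 95% of the remaining rise to the sill 1.0.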
D_h = {t ∈ ℤᵈ : t ∈ D and t + h ∈ D}   (1.2)
denote those sites t for which both t and t + h lie in D. Provided |h[𝓁]| ≤ n[𝓁] for 𝓁 = 1, . . . , d, the set D_h contains |D_h| = (n[1] − |h[1]|) × · · · × (n[d] − |h[d]|) sites.
The sample mean is defined by
x̄ = (1∕|D|) ∑_{t∈D} x_t.
g_h ≠ s_0 − s_h.   (1.3)
Example 1.7 Consider the illustrative data of Table 1.1 where n = 12. To give a stronger feeling for the intuition behind the formulas of this section, we illustrate some of the calculations by hand.
The mean and variance are given by
x̄ = (22 + 9 + · · · + 3 + 10)∕12 = 7.17,
s_(0,0) = [(22 − 7.17)² + · · · + (10 − 7.17)²]∕12 = 30.81.
For lag h = (1, 0) the subset of data D_(1,0) in (1.2) has been marked with asterisks (*) in Table 1.1c; there are 9 such values. Thus, the calculations for the sample covariance function and sample semivariogram take the form
s_(1,0) = {(9 − 7.17)(22 − 7.17) + · · · + (10 − 7.17)(3 − 7.17)}∕9 = 1.94,
g_(1,0) = {(9 − 22)² + (2 − 13)² + (7 − 6)² + (2 − 9)² + (4 − 2)² + (3 − 7)² + (5 − 2)² + (3 − 4)² + (10 − 3)²}∕(2 × 9) = 23.28.
Note that
23.28 = g_(1,0) ≠ s_(0,0) − s_(1,0) = 28.87,
in accordance with (1.3). ◽
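The hand calculations in Example 1.7 are straightforward to reproduce programmatically. The following sketch lays the Table 1.1 grid out row by row (in graphical coordinates) and recomputes the sample mean, variance, and the lag-(1, 0) covariance and semivariogram.

```python
# Reproducing the hand calculations of Example 1.7 for the 3 x 4
# illustrative grid of Table 1.1 (rows listed from t[2] = 3 down to 1).
grid = [
    [6, 7, 3, 10],
    [13, 2, 4, 3],
    [22, 9, 2, 5],
]
values = [v for row in grid for v in row]
n = len(values)                                  # |D| = 12
xbar = sum(values) / n                           # sample mean

s00 = sum((v - xbar) ** 2 for v in values) / n   # s_(0,0), divisor |D|

# For h = (1, 0), D_h consists of the 9 starred sites: each pairs a value
# with its neighbour one step along the t[1] axis.
pairs = [(row[j], row[j + 1]) for row in grid for j in range(len(row) - 1)]

s10 = sum((a - xbar) * (b - xbar) for a, b in pairs) / len(pairs)
g10 = sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

print(round(xbar, 2), round(s00, 2), round(s10, 2), round(g10, 2))
# reproduces 7.17, 30.81, 1.94, and 23.28 from the text
```

As in (1.3), g_(1,0) = 23.28 differs from s_(0,0) − s_(1,0) = 28.87, since the averaging set D_h changes with the lag h.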