Lemay 1995 PDF
by
in partial fulfillment
Master of Science
Applied Mathematics
1998
This thesis for the Master of Science
degree by
by
Weldon Lodwick
Date
LeMay, Norman E. (M.S., Applied Mathematics)
ABSTRACT
considered for the region D, where $\mu(x)$ represents the global variability and
$\delta(x)$ is a zero mean, stationary random process which represents small scale
the $\lambda$'s depend on the spatial covariance function. Kriging assumes a known
from which the covariance, under certain conditions, can be derived. The
main emphasis of this thesis is estimating the variogram used in Kriging. The
graphs are defined and discussed in this chapter. Chapter three discusses es-
An analysis of this data set will show trend removal, simulations to aid in as-
sessing anisotropy, and the final kriging surface along with the standard error
surface.
James R. Koehler
ACKNOWLEDGEMENTS
out their loving support, kindness and encouragement, I would not have been
able to complete this goal. A great debt of gratitude is owed to my long suffer-
ing advisor, Dr. James R. Koehler. His encouragement, advising and patience
have gone above and beyond the call of duty as mentor and friend. Thanks Dr.
CONTENTS
Chapter
1. Introduction
1.1 Motivation
2. Terminology
2.4.3 Isotropic Transition Models
2.5 Anisotropy
2.5.1 Geometric
3.2 Estimators
4.3 Simulation
4.4 Completion of Original Data
4.5 Kriging
References
FIGURES
Figure
3.4 Omnidirectional variogram of elevation data. The parameters
adjustment
1. Introduction
1.1 Motivation
done previously and this led me into spatial statistics and Geographical Infor-
tics, after which I transferred into the statistics program. I found the problems
for $i = 1, \ldots, n$. An unsampled location is denoted $x_0$. We want to measure
some feature in the design region such as carbon monoxide, elevation, depths to
A spatial random process, $Z(x)$, is modeled as $Z(x) = \mu(x) + \delta(x)$, where
$\mu(x)$ models the large scale variability (trend) of the spatial random process
$Z(x)$ and $\delta(x)$ is a zero mean stationary stochastic process with covariance
function $C(h)$. It is obvious that the random variables $Z(x_i)$ and $Z(x_j)$ are
Three models for kriging are: simple, ordinary and universal. In simple
kriging, the mean $\mu(x)$ is known and constant, $\mu(x) = \mu$, and the covariance
function $C(h)$ is known. In ordinary kriging, the mean is unknown but as-
sumed constant, and the covariance function $C(h)$ is known. Ordinary kriging
is presented in Section 2.4. Finally, in universal kriging, the mean $\mu(x)$ is un-
known.
from the data. I will show in section 2.4 that the kriging weights, $\lambda$, are expressed in
terms of the covariance between all sampled locations, $C(Z(x + h), Z(x))$, and
the covariance between the sampled locations and the unsampled locations,
$C(Z(x_0), Z(x_i))$.
This derivation and definition is found in chapter two. In fact, kriging can
be done in terms of the variogram function, $\gamma(h)$, rather than the covariance
function, C(h).
1.2 Outline
are defined in chapter two. Chapter three presents methods to estimate the
variogram from data. For example, the classical estimator due to Matheron [5]
and a robust estimator due to Cressie [5] are introduced. How to assess the
design region are also discussed in this chapter. Finally in chapter four, an
chapters two and three are implemented. I also use simulation to aid in the
analysis of the variogram modeling process. Finally, ordinary kriging is used to
obtain the surface of the topographical data and is shown along with the standard
error surface.
2. Terminology
let $Z(x_i)$ denote a random variable, of which $z(x_i)$ is a realization. Within the
that are close together tend to be more similar in value than measurements fur-
ther apart. Statistically, random variables closer together are more correlated
than random variables further apart. I will use the above notation through-
out this thesis to denote spatial locations, $x_i$, sample measurements, $z(x_i)$, and
isotropy and anisotropy. All these terms are defined in this chapter along with
2.1 Random Function
by the set of all its n-variate cdfs for any number n and any choice of the n
(2.2)
a distance vector h has both magnitude and direction. The two locations will
$\|h\|$. Realizations of the random variable are denoted $z(x)$. Lastly, we have
only one realization of the random variables (i.e. we do not have replication).
barring any measurement errors, we will have the same results. This leads
Although in practice we do not know the underlying random function, we will
use the data to find two properties of the random function, namely the first
2.2.1 First Moments Recall from classical statistics, the first moment
$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = \mu$$
E[Z(x)] = m(x)
when it exists. As seen, the mean of spatial data is a function of the locations, $x$.
2.2.2 Second Moments Second moments of spatial data, when they ex-
ist, have three forms. First is the a priori Variance, when it exists and is
defined,
Definition 1 (Variance)
If $Var[Z(x_i)]$ and $Var[Z(x_j)]$ exist, then the Covariance exists and is defined
as,
Definition 2 (Covariance)
Lastly, the variance of the difference of two random variables known as the
Definition 3 (Variogram)
(2.5)
Cressie [5] notes that some people have defined the variogram as $2\gamma(x_1, x_2) =
E[(Z(x + h) - Z(x))^2]$. This works fine when $m(x)$ is constant $\forall x \in D$. Note
2.3 Stationarity
Recall, there is only one realization of Z(x). Since there are no replications,
to view z(x) and z(x + h) as two different realizations of the same random
variable.
Definition 4 (Strict Stationarity) A random function $\{Z(x), x \in D\}$ is
said to be Strictly Stationary within the field D if its multivariate cdf is invari-
mg
• For Z(x), Z(x + h), the covariance exists and depends only on the sepa-
ration distance h.
we can find a relationship between the covariance function and the variogram.
$$\begin{aligned}
Var[Z(x)] &= E[(Z(x) - m)^2] \\
&= E[Z(x)^2 - 2Z(x)m + m^2] \\
&= E[Z(x)^2] - 2mE[Z(x)] + m^2 \\
&= E[Z(x)^2] - m^2
\end{aligned}$$
This states the variance of a spatial random variable, under the covariance
A relationship exists between the variogram and the covariogram (see equa-
2.3.3 Intrinsic Stationarity An even weaker form of stationarity is that
of Intrinsic Stationarity.
• $E[Z(x)] = m \quad \forall x$.
• $Z(x + h) - Z(x)$ has a finite variance that does not depend on $x$.
allows for infinite variance, whereas the covariance hypothesis allows only for
finite variance. For the rest of this thesis, I will be using the definition in
$$Z(x) = \mu + \delta(x), \quad x \in D \tag{2.11}$$
We will assume the model in equation 2.11 along with the model assump-
tions that $\delta(\cdot)$ is a zero mean stochastic process with variance modeled by a
known covariance function. Further assume the first order component, $\mu$, is
constant in the region, but unknown. Using the data, we would like to predict
$$\hat{Z}(x_0) = \lambda' Z(x)$$
We would like our estimator to be unbiased and have minimum variance. For
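$$\begin{aligned}
E[Z(x_0)] = \mu &= E[\hat{Z}(x_0)] \\
&= E\left[\sum_{i=1}^{n} \lambda_i Z(x_i)\right] \\
&= \sum_{i=1}^{n} \lambda_i E[Z(x_i)] \\
&= \mu \sum_{i=1}^{n} \lambda_i
\end{aligned}$$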
evaluating each term,
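$$E[\hat{Z}(x_0)^2] = E[(\lambda' Z)^2] = E[\lambda' Z Z' \lambda] = \lambda' E[ZZ'] \lambda = \lambda'(C + \mu^2 \mathbf{1}\mathbf{1}')\lambda$$
where $C$ is an $n \times n$ covariance matrix for all sample locations. The next term,
$$E[\hat{Z}(x_0) Z(x_0)] = E[\lambda' Z\, Z(x_0)] = \lambda' E[Z\, Z(x_0)] = \lambda'(c + \mu^2 \mathbf{1}) \tag{2.13}$$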
(2.14)
We have a constraint on the $\lambda$'s, which must sum to one. Using Lagrange
The Lagrange multipliers will be denoted by $\nu$.
Taking the derivative and setting it equal to zero,
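$$C\lambda - c + \nu \mathbf{1} = 0$$
$$C\lambda + \nu \mathbf{1} = c$$
where:
$$C_+ = \begin{bmatrix} C & \mathbf{1} \\ \mathbf{1}' & 0 \end{bmatrix}, \quad
\lambda_+ = \begin{bmatrix} \lambda_1 \\ \vdots \\ \lambda_n \\ \nu \end{bmatrix}, \quad
c_+ = \begin{bmatrix} c_{01} \\ \vdots \\ c_{0n} \\ 1 \end{bmatrix}$$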
hence if $C_+$ is invertible,
$$\lambda_+ = C_+^{-1} c_+ \tag{2.16}$$
Cressie [5] does the above analysis in terms of the variogram. Also, the solution
to the kriging system yields a best linear unbiased predictor (BLUP) of $Z(x_0)$.
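The kriging system above can be sketched in a few lines of Python. This is a minimal illustration, not the routine used in this thesis: the exponential covariance function `exp_cov`, its sill and range values, the four sample locations and their values, and the helper names `solve` and `ordinary_kriging` are all assumptions made for the example. It builds the augmented matrix $C_+$ and vector $c_+$, solves for $\lambda_+$, and forms the prediction as the weighted sum of the data.

```python
import math


def exp_cov(h, sill=1.0, rng=2.0):
    """Assumed isotropic exponential covariance C(h) = sill * exp(-3h / rng)."""
    return sill * math.exp(-3.0 * h / rng)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(points, values, x0, cov=exp_cov):
    """Return (prediction, weights, nu) from the augmented system C+ lambda+ = c+."""
    n = len(points)
    # C+ : covariance matrix bordered by a column and a row of ones, 0 in the corner.
    c_plus_mat = [[cov(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
                  for i in range(n)]
    c_plus_mat.append([1.0] * n + [0.0])
    # c+ : covariances between x0 and each sample, followed by 1.
    c_plus_vec = [cov(math.dist(x0, p)) for p in points] + [1.0]
    sol = solve(c_plus_mat, c_plus_vec)
    weights, nu = sol[:n], sol[n]
    pred = sum(w * z for w, z in zip(weights, values))
    return pred, weights, nu


# Made-up data: four samples on the corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [10.0, 12.0, 11.0, 13.0]
pred, weights, nu = ordinary_kriging(points, values, (0.5, 0.5))
print(pred, weights, nu)
```

Because the prediction location in this example is equidistant from the four samples, the weights come out equal and sum to one, which is the unbiasedness constraint. Predicting at a sampled location returns the observed value, since no nugget effect is included in the assumed covariance.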
linear combination of $Z(x_i)$'s and let $Z(x_i)$ have expectation $m$ and covariance
$Y$ must be positive or 0.
$$Y = \sum_{i=1}^{n} \lambda_i Z(x_i) \tag{2.17}$$
Now, take into account equation 2.8 and the above relation, and rewrite
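$$Var(Y) = C(0) \sum_i \lambda_i \sum_j \lambda_j - \sum_i \sum_j \lambda_i \lambda_j \gamma(x_i - x_j) \tag{2.19}$$
In the case where the variance $C(0)$ does not exist, only intrinsic stationarity
is assumed, and the variance of $Y$ is defined on the condition that the sum of
the $\lambda$'s is 0, and the term $C(0)$ is eliminated, leaving only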
Thus, the variogram function must have the condition of conditional positive
(1) If $2\gamma(\cdot)$ is continuous at the origin, then $Z(\cdot)$ is $L^2$ continuous. That is,
$E[(Z(x + h) - Z(x))^2] \to 0$ if and only if $2\gamma(h) \to 0$ as $\|h\| \to 0$.
(2) If $2\gamma(h)$ does not approach 0 as $h$ approaches the origin, then $Z(\cdot)$ is
(3) If $2\gamma(h)$ is a positive constant, then the $Z(x_i)$'s are uncorrelated for all $x$
• $\gamma(0) = 0$
• $\gamma(h) = \gamma(-h)$
2.4.2 Functions and Graphs There are a number of known functions
that are useful as models for the variogram and assure negative definiteness.
mally, this says that two random variables separated by a large distance are
intrinsic random function reaches a sill, then the random function is second or-
der stationary. Then we can have the following relationship from equation 2.8
direction, then the variogram is said to be isotropic (this is also true for the
$$\gamma(h) = \begin{cases} 0, & h = 0 \\ \;\cdots \\ c_0 + \sigma, & \forall h > a \end{cases} \tag{2.21}$$
where $c_0 + \sigma$ is the sill, $c_0$ is a nugget effect and $a$ is the range. Further-
more, $\sigma$, $c_0$ and $a$ are all positive. The following plot is standardized
$h = 0$
(2.22)
$h \neq 0$
is the sill, $c_0$ is the nugget effect and $a$ is the range, all positive real
numbers. For this model, the range is the value where the variogram
reaches 95% of the sill, since this model reaches the sill asymptotically.
[Figure: variogram model plotted against distance]
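$$\gamma(h) = \begin{cases} 0, & h = 0 \\ c_0 + \sigma \left\{ 1 - \exp\left( -\dfrac{(3h)^2}{a^2} \right) \right\}, & 0 < h < \infty \end{cases} \tag{2.23}$$
where $c_0$, $\sigma$ and $a$ have the same interpretation as above, where the
range is 3.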
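The transition models of this section can be written as small Python functions. The sketch below is hedged: the spherical and exponential forms are the standard textbook expressions and are assumed here because the corresponding equations are incomplete in this copy, while the Gaussian form follows equation 2.23 as printed. The argument names `nugget`, `partial_sill` and `a`, and the parameter values in the loop, are illustrative only.

```python
import math


def spherical(h, nugget, partial_sill, a):
    """Spherical model (standard form, assumed): reaches the sill exactly at h = a."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + partial_sill
    r = h / a
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, partial_sill, a):
    """Exponential model (standard form, assumed): reaches about 95% of the sill at h = a."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / a))


def gaussian(h, nugget, partial_sill, a):
    """Gaussian model written as in equation 2.23."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-((3.0 * h) ** 2) / a ** 2))


# Illustrative parameters: nugget 0.1, partial sill 0.9, range 3.
for h in (0.0, 1.0, 2.0, 3.0, 6.0):
    print(h,
          round(spherical(h, 0.1, 0.9, 3.0), 4),
          round(exponential(h, 0.1, 0.9, 3.0), 4),
          round(gaussian(h, 0.1, 0.9, 3.0), 4))
```

All three functions return 0 at the origin and jump to the nugget for any positive distance, which is how the nugget effect appears in these models.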
These models allow for infinite variance. Frequently used models are:
With this model, $\lim_{h \to 0} \gamma(h) = -\infty$. Journel and Huijbregts point out this model
2.4.5 Hole Effect Models A third type of models are known as Hole-
Effect Models. These models are used when growth is not monotonic and can
$$\gamma(h) = 1 - \frac{\sin(h)}{h} \tag{2.26}$$
2.5 Anisotropy
the same in every direction. Using Figure 2.7 below, variability is dependent
[Figure: variogram plotted against distance]
when a contour of the variogram surface will yield elliptical shapes at various
distances. Along the major axis of the ellipse, variability increases slowly, and
along the minor axis of the ellipse, variability rises rapidly. These will be
the time of the genesis of the studied phenomenon [15]. Examples of anisotropy
are:
Figure 2.7. Variogram contour surface with anisotropy
of the rectangular coordinates $(h_u, h_v, h_w)$ of the vector $h$ in the case of geomet-
In the classical book by Isaaks and Srivastava [13] along with Journel and Hui-
that have approximately the same sill but at different ranges. Geometrical
anisotropy cannot [15]. Isaaks and Srivastava [13] describe zonal anisotropy
as one in which the sill value changes with direction while the range remains
constant. I agree with Zimmerman [27] that the above definitions of zonal
anisotropy are inadequate. This nomenclature does not describe the type of
anisotropy that characterizes the region of study. Zimmerman suggests aban-
doning the term zonal anisotropy in favor of more descriptive terms, such as
range anisotropy, sill anisotropy, and nugget anisotropy. The classical term
Huijbregts [15]. Basically, variograms are computed and plotted in various di-
rections, and in each direction, the sill value will be the same, but the range
[Figure: variograms plotted against distance]
(1) Rotation of coordinate axis
and 3 dimensions, respectively
$$R = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \tag{2.27}$$
where $\theta$ is the rotational angle in the xy-direction and $\phi$ is the rotation in the
zx-direction.
For simplicity, the figure below shows the rotated axis aligned along the
[Figure: original axes (x, y) and rotated axes (X', Y')]
Now that we have addressed rotation of coordinates, we next address trans-
the range. That is, we can reduce the range at which the variogram reaches
its sill value to 1. Refer to Figure 2.8, suppose that each of these variograms
x direction, the sill is reached at the range value of 3, and the y direction, the
sill value is reached at the range of 6. We can divide each of the x coordinates
by the range value of 3 and the y coordinates by the range value of 6. Now,
in each direction, the sill value will be reached at a range of 1.
respectively.
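$$T = \begin{bmatrix} \frac{1}{a_x} & 0 \\ 0 & \frac{1}{a_y} \end{bmatrix} \tag{2.29}$$
$$T = \begin{bmatrix} \frac{1}{a_x} & 0 & 0 \\ 0 & \frac{1}{a_y} & 0 \\ 0 & 0 & \frac{1}{a_z} \end{bmatrix} \tag{2.30}$$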
Putting the two transformations together, we have the new coordinates given
by $h'$,
Note that $T$ and $R$ do not commute; the transformations must be applied in this
order.
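The correction for geometric anisotropy can be sketched in Python for the two-dimensional case. The code assumes the combined transformation is $h' = TRh$ (rotation first, then rescaling by the directional ranges); the function names and the 90 degree test angle are illustrative, while the ranges 3 and 6 are the ones used in the example above.

```python
import math


def rotate(h, theta_deg):
    """Apply the rotation matrix R of equation 2.27 to the 2-D vector h."""
    t = math.radians(theta_deg)
    return (math.cos(t) * h[0] + math.sin(t) * h[1],
            -math.sin(t) * h[0] + math.cos(t) * h[1])


def scale(h, a_x, a_y):
    """Apply the scaling matrix T: divide each coordinate by its range."""
    return (h[0] / a_x, h[1] / a_y)


def reduced_distance(h, theta_deg, a_x, a_y):
    """Length of h' = T R h (assumed order: rotate first, then scale)."""
    return math.hypot(*scale(rotate(h, theta_deg), a_x, a_y))


# Ranges of 3 in the x direction and 6 in the y direction, axes already aligned:
# both directional ranges map to a reduced distance of 1.
d_x = reduced_distance((3.0, 0.0), 0.0, 3.0, 6.0)
d_y = reduced_distance((0.0, 6.0), 0.0, 3.0, 6.0)

# The order of the two transformations matters.
h = (3.0, 0.0)
rotate_then_scale = scale(rotate(h, 90.0), 3.0, 6.0)
scale_then_rotate = rotate(scale(h, 3.0, 6.0), 90.0)
print(d_x, d_y, rotate_then_scale, scale_then_rotate)
```

After the transformation, a single isotropic variogram with range 1 can be evaluated at the reduced distance, whatever the direction of the original vector.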
2.5.2 Sill Anisotropy Figure 2.10 below is an example of anisotropy in
the sill. Here we see the sill values are different in two directions.
[Figure 2.10: directional variograms with different sills, plotted against distance]
Since the variogram attains a sill, this means the region has second order
which are correlated or have unequal means. If one were computing an ex-
perimental variogram that has unequal sills, then exploratory analysis of the
data may yield a trend in the data. This may explain the first reason for
sill anisotropy. Zimmerman states experience shows this to be most likely the
case. A trend in the data is a violation of the stationarity hypothesis, and may
cause the non vanishing spatial correlation. The analyst should abandon the
stationarity hypothesis, estimate the trend, remove it and proceed with ana-
lyzing the residuals as a new data set. This should yield isotropic directional
experimental variograms.
was another reason for sill anisotropy. Zimmerman uses an example from a drill
hole data set. The samples can be analyzed on different occasions, even per-
that measurements from different drill holes are uncorrelated, but measure-
ments from the same drill hole are equally and positively correlated. This
being the case, the only correction the analyst can use is a nested model.
$$\gamma_Z(h) = \sum_{i=1}^{m} \gamma_i(\|A_i h\|) \tag{2.32}$$
where
and
$\gamma_1(\cdot), \ldots, \gamma_m(\cdot)$
are isotropic variograms.
correction was discussed. Referring to Figure 2.8, the variograms in two differ-
ent directions reach the same sill, but at different ranges. Range anisotropy is
we are attributing the range anisotropy solely to measurement error and not
range and sills to ensure the fitted sills are equal. In the example he uses,
NW-SE direction $\gamma_Z(h)$
variability in the design region. Directional variograms not only have a nugget
effect, but they are different in each direction. Nugget anisotropy is attributed
[Figure: directional variograms plotted against distance]
3. Data Analysis
• Assessment of stationarity
assess stationarity assumptions. Recall from section 2.3 that the mean of the data is
constant and does not depend on location. A trend in the data implies a non-
the trend can be removed, and the residuals used as a new data set. This chap-
ter begins with identifying different sample designs. Second, the terminology
used to compute the experimental variogram is discussed. Third, the meth-
ods used to compute the experimental variogram and finally, methods used to
The sample design for spatial data may or may not be on a grid. In chapter
3, Ripley [21] shows four ways to sample spatial data. Uniform Sampling
samples are taken on an equally spaced grid in the region. Finally, Non-Aligned
Sampling is where the sample locations are off the nodes of a grid. Of these,
result in samples that are not on a grid. Centric Sampling is on a grid. For
the rest of this chapter, I will differentiate gridded sampling and non-gridded
sampling.
are on a grid. This example is from Cressie [5], chap. 2. These data are
[Figure: sample locations on a grid; horizontal axis: east]
sampled on equidistant transects of a design region. The numbers of partitions
along the x and y axes are denoted nx and ny, respectively. The separations
between adjacent partitions along the x and y directions are denoted xsiz and
ysiz, respectively. A lag is the distance between two locations and should not
exceed L/2, where L is the maximum distance between any two locations in the
design region.
3.1.2 Non-Gridded Data The figure below shows a sample area that
[Figure: sample locations that are not on a grid]
Figure 3.3 should help clarify some of the following terminology needed to
compute the sample variogram with non-gridded data. Using Figure 3.3, the
origin will represent a location in the design region. The azimuth is the mea-
sured angle from the y-axis, clockwise. We can partition the direction vector
into nlag, the number of lags to compute the sample variogram. In Figure 3.3,
[Figure 3.3: azimuth measured clockwise from the Y axis (North); X axis (East)]
there are 5 lags. The maximum distance used to compute the sample variogram
is $L/2$, where $L$ is the maximum distance between two locations in the sample
region. The lag separation, $h_{lag}$, is determined by dividing the maximum dis-
tance, $L/2$, by the number of lags. Two important tolerance intervals are the
lag tolerance and azimuth tolerance. The lag tolerance is some amount $\delta$ from
the current lag, such that any location within the interval $h_{lag} \pm \delta$ is considered to
be at the current lag. Next is the azimuth tolerance: this is an amount $\epsilon$ such
that any point within the interval azimuth $\pm \epsilon$ is considered in the current
azimuth.
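The terminology above translates into a simple pair-selection routine. The following Python sketch is an illustration under stated assumptions, not the GSLIB or S-PLUS implementation: the function names are hypothetical, opposite directions (azimuth and azimuth + 180 degrees) are treated as the same axis, and the lag tolerance defaults to half the lag separation.

```python
import math


def azimuth_deg(p, q):
    """Azimuth of the vector from p to q, measured clockwise from the y axis (North)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def angular_diff(a, b):
    """Smallest angle between two directions, treating a and a + 180 as the same axis."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def pairs_by_lag(points, nlag, azimuth, az_tol, lag_tol=None):
    """Group point pairs by lag number for one direction.

    Returns (h_lag, bins), where bins[k] lists the index pairs whose separation
    is within k * h_lag +/- lag_tol and whose direction is within azimuth +/- az_tol.
    """
    n = len(points)
    # L is the maximum distance between any two locations; only L/2 is used.
    big_l = max(math.dist(points[i], points[j])
                for i in range(n) for j in range(i + 1, n))
    h_lag = (big_l / 2.0) / nlag
    if lag_tol is None:
        lag_tol = h_lag / 2.0  # assumed default: half the lag separation
    bins = {k: [] for k in range(1, nlag + 1)}
    for i in range(n):
        for j in range(i + 1, n):
            if angular_diff(azimuth_deg(points[i], points[j]), azimuth) > az_tol:
                continue
            d = math.dist(points[i], points[j])
            for k in range(1, nlag + 1):
                if abs(d - k * h_lag) <= lag_tol:
                    bins[k].append((i, j))
    return h_lag, bins


# Made-up transect of five locations running due North, one unit apart.
points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 3.0), (0.0, 4.0)]
h_lag, north = pairs_by_lag(points, nlag=2, azimuth=0.0, az_tol=10.0)
_, east = pairs_by_lag(points, nlag=2, azimuth=90.0, az_tol=10.0)
print(h_lag, {k: len(v) for k, v in north.items()}, {k: len(v) for k, v in east.items()})
```

Counting the pairs in each bin is the quickest way to see whether the chosen lag and azimuth tolerances leave enough pairs to estimate each point of the sample variogram.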
3.2 Estimators
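is magnitude, $\|h\|$.
$$2\hat{\gamma}(h) = \frac{1}{N_h} \sum_{i=1}^{N_h} (Z(x_i + h) - Z(x_i))^2 \tag{3.1}$$
It is recommended to use this estimator when there are many outliers in the
sample data.
$$2\bar{\gamma}(h) = \frac{\left( \frac{1}{N_h} \sum_{i=1}^{N_h} |Z(x_i + h) - Z(x_i)|^{1/2} \right)^4}{0.457 + 0.494/N_h}$$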
Deutsch and Journel [8] give nine other methods to measure spatial variability.
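The two estimators can be compared on a small made-up example. In the Python sketch below, `diffs` stands for the differences $Z(x_i + h) - Z(x_i)$ over the $N_h$ pairs at one lag; the function names and the numbers are illustrative assumptions, and the robust form is the Cressie and Hawkins [3] estimator with its bias correction $0.457 + 0.494/N_h$.

```python
def matheron(diffs):
    """Classical estimator of 2*gamma(h): the mean of the squared differences."""
    n = len(diffs)
    return sum(d * d for d in diffs) / n


def cressie_hawkins(diffs):
    """Robust estimator of 2*gamma(h): fourth power of the mean square-root
    absolute difference, divided by the bias correction 0.457 + 0.494 / N_h."""
    n = len(diffs)
    mean_root = sum(abs(d) ** 0.5 for d in diffs) / n
    return mean_root ** 4 / (0.457 + 0.494 / n)


# Made-up differences at one lag; the last value plays the role of an outlier.
diffs = [1.0, -1.2, 0.8, -0.9, 1.1, 15.0]
classical = matheron(diffs)
robust = cressie_hawkins(diffs)
print(classical, robust)
```

With the single outlying difference included, the classical estimate is dominated by the squared outlier, while the robust estimate moves far less. Both functions estimate $2\gamma(h)$; halve the result to obtain the semivariogram.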
the various directional variograms [13]. The omnidirectional variogram serves
as a useful starting point for establishing some of the parameters required for
sample variogram calculations. It has been shown to be useful when the data
are not on a grid to establish the lag tolerance. Once the lag tolerance is es-
tablished, then directional sample variograms are computed, and the azimuth
tolerance can be established. This does not necessarily need to be done when
the locations are on a grid since the number of pairs to compute the sample
Figure 3.4 shows an omnidirectional variogram before adjusting the lag
tolerances.
The number of lags in the computed omnidirectional variogram has been
[Figure 3.4: omnidirectional semivariogram; horizontal axis: distance]
reduced to produce the sample variogram in Figure 3.5. By reducing the num-
ber of lags used to compute the sample variogram, we increase the lag spacing,
which in turn includes more pairs in estimating each point in the sample vari-
[Figure 3.5: sample variogram computed with fewer lags; horizontal axis: distance]
if they are different, the analyst must determine the reason. There may be a
number of reasons for the apparent anisotropy such as outliers in the data or
This section will deal with ways of handling the apparent anisotropy.
Graphical tools help in identifying trends in the data, particularly spin plots
and scatter pair plots. GSLIB [8] has a routine called locmap which color scales
the location of the data. Another useful plot is a text plot, which plots the
value of the data at the location instead of a dot. One technique of removing
and removed. The residuals are obtained and used as a new data set. Hence, an
can justify the removal of the data, they can be removed; otherwise more robust
ish [5]. When the data are on a regular grid, they may be treated as rows and
columns of data. Assuming there are no row and column interactions, median
polish gives a two-way fit, and the residuals become the new data set. In either
method, anisotropy may still be an issue. In some cases, the removal of the
trend may yield an isotropic data set, but in others it may not. The analyst
must employ methods of dealing with anisotropy in the new data set by the
Cressie [5] mentions that a grid can be imposed over a region of study, and
the locations are referenced to the nodes of the imposed grid. Median pol-
ish with missing values is again employed, and residuals obtained. Another
method uses linear smoothers. If there is no interaction in the northern and
ized Additive Models [11] are nice to use. If a plot of the data shows a linear
trend in the north direction and east direction, the trend is removed by the
deterministic part and a smooth of the residuals in each direction is used to
assess the removal of the trend. An example of using this method is presented
Once the sample variogram has been computed and anisotropy has been
detected, and plotted, the next major task is to fit a parametric variogram func-
ple variogram has been done by ad-hoc techniques, such as "eye-balling" the
sample variogram and guessing parameters. One software package, GeoEas [1],
info-map (Bailey and Gatrell [2]). A sample variogram is computed, the user
enters a variogram function, then interactively tweaks the parameters until the
curve looks like it fits the sample variogram. These two packages are limited
to the exponential, spherical and gaussian models. SPLUS [12] also has this
capability, but has the power and flexibility to execute the methods mentioned
below.
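As a hedged illustration of fitting a variogram function to a sample variogram by weighted least squares, the Python sketch below minimizes Cressie's weighted criterion $\sum_j N(h_j) \left( \hat{\gamma}(h_j) / \gamma(h_j; \theta) - 1 \right)^2$ over a coarse parameter grid. The exponential model, the grid search (a stand-in for a proper nonlinear optimizer), the function names, and all numbers are assumptions made for the example; they are not the fitting procedure or the data of this thesis.

```python
import math


def exponential(h, nugget, partial_sill, a):
    """Assumed exponential variogram model for h > 0."""
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / a))


def wls_criterion(params, lags, gamma_hat, npairs, model=exponential):
    """Weighted criterion: sum over lags of N(h) * (gamma_hat / gamma_model - 1)^2."""
    return sum(n * (g / model(h, *params) - 1.0) ** 2
               for h, g, n in zip(lags, gamma_hat, npairs))


def grid_search(lags, gamma_hat, npairs, nuggets, sills, ranges):
    """Return (criterion, (nugget, partial_sill, range)) with the smallest criterion."""
    best = None
    for c0 in nuggets:
        for s in sills:
            for a in ranges:
                crit = wls_criterion((c0, s, a), lags, gamma_hat, npairs)
                if best is None or crit < best[0]:
                    best = (crit, (c0, s, a))
    return best


# Made-up sample variogram, generated from known parameters so the fit can be checked.
lags = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 2.0]
npairs = [40, 45, 52, 60, 58, 49, 41]
gamma_hat = [exponential(h, 20.0, 900.0, 1.5) for h in lags]

best = grid_search(lags, gamma_hat, npairs,
                   nuggets=[0.0, 10.0, 20.0, 30.0],
                   sills=[800.0, 900.0, 1000.0],
                   ranges=[1.0, 1.5, 2.0])
print(best)
```

The weights give more influence to lags estimated from many pairs and to lags near the origin, where the model values are small; this is the usual argument for preferring the weighted criterion to ordinary least squares.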
squares using Matheron's estimator, weighted least squares using Cressie's ro-
Is using ad-hoc techniques such a bad idea? Cressie and Zimmer-
man [4] discuss the stability of variogram parameters. Grondona and Cressie [10]
have shown the parameters of the variogram to be more stable than estimat-
ing the covariogram. I have found that this area has great potential for large
4. Analysis of Topographical Data Set
estimate the variogram from the data and to use Kriging to reconstruct the
surface. The data set is taken from Davis [7], and the measurements are eleva-
tion data taken on a 310 x 310 area in $\mathbb{R}^2$, where the units are in yards. The
locations in the design region were chosen randomly. Furthermore, the region
The data are displayed using various plots in Figure 4.1. Figure 4.1 (a)
is a contour plot of the data, along with the locations of the data. Two hills
appear in the lower right and lower left corners, with a valley separating
them and continuing North. The values are higher on the left and right side
of the picture, decreasing towards the center. This observation may suggest a
moving from the North towards the South, indicating a linear structure in
the North/South direction. Figure 4.1(b) is a text plot of the values of the
data. The following two graphs are perspective plots viewing South and East,
taken using these graphs as they have been preprocessed using an interpolation
algorithm. A brush and spin plot verifies the above assessment. Because of
these trends, I have decided to remove them before any further analysis. I used
a Generalized Additive Model [11] to model the trend since the data are not on
directions based on the above plot. The model consists of a second degree
for any bias. This component consists of a loess smooth of the residuals on
both the North/South and East/West. The output of the GAM fit is shown in
Figure 4.2.
A regression line of the quadratic fit is added in Figure 4.2(a). The linear com-
ponent in the North/South is captured by the fit and added to Figure 4.2(b).
The residual plots in Figure 4.2(c) and Figure 4.2(d) show no structure in
the East/West and North/South directions. The loess smooth is added to the
residual plots to assess the fit of the model. The smooth shows no structure.
[Figure 4.1: (a) Contour and point plot of data. (b) Text plot of data. (c) Perspective plot of data looking South. (d) Perspective plot looking East.]
Therefore, I have modeled the trend successfully. Figure 4.3 displays some plots
to visualize how well the GAM model worked. Now that the trend has been
identified and removed, I will use the residuals as a new data set to compute a
variogram.
als as a new data set. This iterative procedure will help obtain the settings for
the number of lags and the angular tolerance. After three iterations, I found
that using 10 lags produced a "nice" omnidirectional variogram. The next step
tional variogram of the residuals is shown in Figure 4.4 using an angular toler-
ance of 45°. The directions computed are 0°, 30°, 45°, 90°, 120°, 135°, and 150°.
It appears, in the 90 through 150 degree directions, the sill of the variogram
[Figure: (c) Perspective plot of residuals looking South. (d) Perspective plot of the residuals looking West. Axes: East Direction, North Direction.]
[Figure: directional variograms plotted against distance; variogram surface plotted against E/W distance]
From Figure 4.5, one may assess geometrical anisotropy due to the ellipti-
cal form of the variogram surface. Recall from section 2.5.1 that geometrical
ure 4.6. From this graph, it appears to have a range of 2 and sill of 1200.
using the coordinates from the topographical data set, I simulated values using
an exponential covariance function with range of two and a sill of 1200. The
The directional variogram using the simulated data is shown in Figure 4.7.
We can see in the 30, 120, 135 and 150 degree azimuths there is some clear
structure: the sample variogram appears to reach a sill. Knowing the simulated
values are stationary and isotropic, the variograms should be the same in each
The number of pairs used to compute each point on the sample directional
variogram is less than 30. A rule of thumb is that each point in the sample
variogram should be estimated from at least 30 to 50 pairs.
I chose a minimum of 40. To increase the number of pairs, one can increase
the lag tolerance, which by default is half the lag separation. I increased the
lag tolerance for a number of instances, but the results were not favorable: the
number of pairs did not increase. The next method is to increase the angular
the 70° angular tolerance I achieved favorable results. The number of pairs
to estimate each point was 40 and the sample directional variogram looked
favorable.
The results of increasing the angular tolerance show that in each direction,
the sample variograms are similar, and the variogram surface is fairly circular.
It was interesting to note the angular tolerance used to produce the results
shown in Figure 4.8 is 70 degrees. I performed this simulation 5 times and each
time came to the same conclusion. The conclusion drawn from this analysis is
that I have an anomalous sample data set, in which these high tolerances are
needed to produce the known isotropic results. I need to increase the angular
[Figure: (a) directional variograms plotted against distance. (b) Variogram Surface.]
4.4 Completion of Original Data
Going back to the original data set, I used the omnidirectional variogram
in Figure 4.6. I believe, based on the simulations, that the original data set does not
have any anisotropy, and the results are unique due to the sampling and the
to the residuals. The fitting procedure returned the values: range = 1.522,
sill = 943.724 and nugget = 18.07. The fit looks good and is shown in
Figure 4.9.
4.5 Kriging
Now that the variogram parameters are estimated, we can Krige the surface.
The SPlus [12] algorithm chooses a 30 x 30 grid in the design region along with
the data locations to perform kriging. After kriging the residuals, I will add
in the estimated trend from section 4.1. This will give me the predictions at
unsampled locations. A contour of the surface along with the contour of the
The small standard errors show the kriging routine produced reasonable
results. The surface along the North/South direction and the East/West di-
[Figure: (a) Kriging surface. (b) Standard errors of kriging surface.]
the results starting with the estimation of the variogram and the surface results
niques used in practice. The data were plotted and trends were identified. The
next step was to remove the trend. My research has shown that analysts are
looking for robust methods to remove the trend. Robust methods produce
to remove the trend. The basic reason for choosing this method is that
I wanted to verify the trend had been correctly identified and removed. A
nonparametric tool for this assessment was to use a loess smooth [11] on the
to further my knowledge of this tool. After many steps of computing the om-
ogram was fitted to the sample variogram. This completed the variogram study
of the region, which is the longest part of any geostatistical analysis. Kriging
the region was a simple task to complete. After completion of the Kriging
routine, questions remain as to the fit of the surface. We have not accounted
for variability from the variogram fit and the trend fit when we reproduced the
REFERENCES
[1] U.S. Environmental Protection Agency. Geo-Eas 1.2.1 Users Guide. 1991.
[2] Trevor Bailey and Anthony C. Gatrell. Interactive Spatial Data Analysis.
Longman Group Limited England, 1995.
[3] Noel Cressie and Douglas Hawkins. Robust estimation of the variogram.
Mathematical Geology, 12(2), 1980.
[4] Noel Cressie and Dale L. Zimmerman. On the stability of the geostatistical
method. Mathematical Geology, 1992.
[5] Noel A.C. Cressie. Statistics For Spatial Data. John Wiley & Sons USA,
1993.
[7] John C. Davis. Statistics and Data Analysis in Geology. John Wiley and
Sons, New York, 1986.
[8] Clayton Deutsch and Andre Journel. GSLIB: Geostatistical Software and
User's Guide. Oxford New York, 1992.
[10] Martin O. Grondona and Noel Cressie. Residuals based estimators of the
covariogram. Statistics, 1995.
[11] T.J. Hastie and R.J. Tibshirani. Generalized Additive Models. Chapman
and Hall, 1990.
[14] E. H. Isaaks and Mohan Srivastava. Spatial continuity measures for proba-
bilistic and deterministic geostatistics. Mathematical Geology, 20(4):313-
341, 1988.
[17] K.V. Mardia and R.J. Marshall. Maximum likelihood estimation of models
for residual covariance in spatial regression. Biometrika, 71:135-146, 1984.
[18] K.V. Mardia and A.J. Watkins. On multimodality of the likelihood in the
spatial linear model. Biometrika, 76(2):289-295, 1989.
[20] G. Matheron. The intrinsic random functions and their applications. Ad-
vances in Applied Probability, 5:439-468, 1972.
[22] Michael Stein and Mark Handcock. Some asymptotic properties of krig-
ing when the covariance function is misspecified. Mathematical Geology,
21(2):171-190, 1989.
[23] J.J. Warnes and B.D. Ripley. Problems with likelihood estimation of
covariance functions of spatial gaussian processes. Biometrika, 73(3):640-
642, 1987.
[28] Dale L. Zimmerman and M. Bridget Zimmerman. A comparison of spatial
semivariogram estimators and corresponding ordinary kriging predictors.
Technometrics, Feb 1991.