
Stoch Environ Res Risk Assess (2006) 20: 307–318

DOI 10.1007/s00477-005-0026-1

TEACHING AID

Ricardo A. Olea

A six-step practical approach to semivariogram modeling

Published online: 9 May 2006


© Springer-Verlag 2006

Abstract Geostatistical prediction and simulation are being increasingly used in the earth sciences and engineering to address the imperfect knowledge of attributes that fluctuate over large areas or volumes, such as pollutant concentration, electromagnetic fields, porosity, or thickness of a geological formation. Central to the application of such techniques is the need to know the spatial continuity, knowledge that is commonly condensed in the form of covariance or semivariogram models. Their preparation is subdivided here into the following steps: (1) Data editing, (2) Exploratory data analysis, (3) Semivariogram estimation, (4) Directional investigation, (5) Simple modeling, (6) Nested modeling. I illustrate these stages practically with a real data set from a geophysical survey from Elk County, Kansas, USA. The applicability of the approach is not limited by the physical nature of the attribute of interest.

Keywords Geostatistics · Continuity · Uncertainty · Semivariogram · Model fitting

R. A. Olea
Institut für Ostseeforschung Warnemünde,
18119 Rostock, Germany
E-mail: ricardo@oleageostats.com

1 Introduction

Shortage of information is common in the modeling and evaluation of natural resources. Geostatistics has been developed in recent decades to assist in the process and to quantify the uncertainty associated with such imperfect knowledge (e.g. Goovaerts 1997; Chilès and Delfiner 1999; Olea 1999).

A fundamental difference between geostatistics and classical statistics is the assumption by geostatistics of the existence of spatial autocorrelation, which matches the commonly held notion that in the vicinity of small values there are other small values, while large values tend to be close to other large values. The assessment of such autocorrelation is a prerequisite in most applications of geostatistics, which is generally done by modeling the semivariogram based on the information provided by a sampling of the attribute of interest. Equivalent results can be obtained employing covariances instead, which can be readily derived from semivariograms.

Despite numerous publications on the subject, modeling a semivariogram remains to the uninitiated the most difficult and intriguing aspect in the application of geostatistics. What follows is a hands-on approach intended to teach by example, by breaking the task of modeling into six sequential steps. Proper execution of the task requires computer software for calculations and graphical display, for which I provide references. The use of a two-dimensional gravimetric survey here is merely pedagogical. The basic steps are the same, regardless of the physical nature of the attribute and the dimensionality of the sampling. The focus is on handling typical peculiarities of spatial continuity as revealed by a proper sampling, rather than on complications derived from insufficient data, blunders in their preparation, or a combination of the two.

2 Step 1: Data editing

To remain as focused as possible, I will start by assuming that there is a sampling already available. This will avoid going into sampling design, a major step worthy of a separate paper.

Typically, a geostatistical sampling comprises one record per measurement. Each record gives the observed value and its spatiotemporal location, which may be in one, two, or three spatial dimensions. The most frequent case, and the one to which I will devote my attention, is the atemporal, two-dimensional case, in which location comprises a pair of geographic coordinates. In cases such as locations given as legal description of location or latitude and longitude, it is necessary to convert them to Cartesian coordinates using programs such as the one prepared by Collins (1999).
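As a rough illustration of the conversion step, the sketch below projects latitude and longitude onto local east and north distances in kilometers with an equirectangular approximation. This is not the LEO program of Collins (1999): the function name `to_local_km`, the spherical Earth radius, and the station coordinates are assumptions made for the example, and the approximation is only reasonable over areas of roughly the size of a county.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def to_local_km(lat_deg, lon_deg, lat0_deg, lon0_deg):
    """Approximate easting and northing (km) relative to a reference point."""
    lat0 = math.radians(lat0_deg)
    # East-west distances shrink with the cosine of the reference latitude.
    easting = EARTH_RADIUS_KM * math.radians(lon_deg - lon0_deg) * math.cos(lat0)
    northing = EARTH_RADIUS_KM * math.radians(lat_deg - lat0_deg)
    return easting, northing


# Hypothetical station one hundredth of a degree north-east of the reference.
print(to_local_km(37.51, -96.19, 37.50, -96.20))
```

Any projection that preserves distances well over the study area would serve the same purpose; the point is only that every record ends up with coordinates in a common length unit before pairs of measurements are formed.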

Where every observation is some type of average over a line, area, or volume (porosity, chemical concentration, ore grade, crop yield), all measurements must have the same type of underlying line, area, or volume in terms of shape, size and orientation. Observations having significant variations in the underlying line, area, or volume should form different sub-samplings that must be treated separately.

The first task is to review every record to eliminate any possible reading, recording, or processing errors, either in the locations or in the measurement of the attribute. Human errors are more common than most people imagine. Final results will be particularly sensitive to anomalous fluctuations; one needs to scrutinize them as much as possible. The sampling should retain only those observations denoting real, natural fluctuations, within instrumental precision, of course. Failure to eliminate sampling blunders can sometimes be detected later in the modeling. Then it is necessary to repeat the work after taking corrective action. Undetected blunders usually degrade the representativity of the results.

A posting of values or preliminary mapping, such as the one in Fig. 1, is helpful to gain familiarity with the data and detect dubious sampling sites. The sampling in Fig. 1 is a medium-size survey with 668 measurements and will be used extensively to illustrate the methodology. The sampling is part of a larger geophysical study comprising more than 27,000 stations that cover all of the state of Kansas (Lam 1987; Xia et al. 1992). The survey was done to detect Bouguer gravity anomalies in the gravitational force relative to the standard geoid, which in this case mainly indicate depth to bedrock. Values of Bouguer anomaly are gravity averages over the same vertically elongated volumes, with cross-sections that are regarded as points relative to the size of the sampling area. The coordinates in this case are those of the instrument on the surface of the earth.

The reader is encouraged to download and work with the sampling available at URL http://www.kgs.ku.edu/Mathgeo/Books/Elk/index.html. As rendered in Fig. 1, the survey is free of errors, ready for reliable use, within a combined measurement and processing precision of 0.1 mgal.

3 Step 2: Exploratory data analysis

Before calculating the semivariogram, the user needs to examine both the spatial distribution of sampling sites and the cumulative distribution of the measurements to assess any need to modify the original data.

First, for the proper modeling of a semivariogram, the sampling should not have preferential areas, as is the case when some measurements concentrate in clusters with a much greater sampling density than the rest of the sampling area. If that is the situation, the sampling needs preprocessing to eliminate the influence of clusters. One way to do this is by assigning weights (Isaaks and Srivastava 1989, pp. 241–247), which can be done by the program declus (Deutsch and Journel 1998, pp. 213–214). The Elk County gravimetric survey was designed to take one measurement at every intersection of the almost perfectly regular network of roads every mile in the east–west and north–south directions. Hence the demonstration sampling is as free of clustering as any sampling can be.

The second decision relates to the requirement or convenience of transforming the data to increase its univariate normality, namely the ability of the measurements to approximate a normal distribution. The most common practice is to convert the data to normal scores. The transformed data will have a normal distribution with a mean of zero and a variance of one, a transformation that is also known as a Gaussian anamorphosis. Often the transformation is optional and makes sense solely in the case of clear deviation from normality. Some applications, however, such as sequential Gaussian simulation, require normal scores, and so the transformation is mandatory. For more details, see Verly (1986). If the reader needs to make a normal score transformation, then see for example Deutsch and Journel (1998, pp. 223–226).

Figure 2 shows the cumulative univariate distribution for the Elk data on a normal probability scale. In this instance, the maximum deviation from normality occurs for 56 mgal and is 5.7 percentage points. As a practical rule, there are no clear advantages to working with normal scores unless the deviation from normality is above 10 percentage points. For cases such as the Bouguer values from Elk County, one should avoid a normal score transformation. Most statistical libraries have programs to calculate the maximum deviation between cumulative distributions, such as the one in Press et al. (1992, pp. 617–619). Given the nonlinearity of the probability scale, it is safer to run this simple calculation than to try to read the maximum discrepancy from a graphical comparison of the distributions like the one in Fig. 2.

4 Step 3: Semivariogram estimation

At this stage it seems opportune to define what the semivariogram is. Given two sites $\mathbf{h}$ units apart and the difference for a variable of interest at those sites, the semivariogram, $\gamma(\mathbf{h})$, is half the variance of this difference. The semivariogram has the property of measuring the degree of dissimilarity between pairs of measurements in terms of how far apart they are and the orientation of the line between those two sampling sites.

Statistics and geostatistics are sciences of the unknown. Therefore, it follows that the true semivariogram is never known and, as in statistics generally, all that is customarily known is an estimate of the semivariogram. Although there are several semivariogram estimators, the predominant practice is to use the unbiased estimator given in Eq. 1.

Fig. 1 Location of Elk County in the state of Kansas, United States (star), posting of observation stations (+), and contour map of Bouguer anomaly in Elk County. Lines close to the margins are the Elk County boundary lines. Distances along the axes are in kilometers and contour interval is 1 mgal

Fig. 2 Cumulative distribution for the Bouguer gravity anomaly, Elk County, Kansas, denoted by the dots, and a normal distribution with the same mean and variance, given by the straight line. The sample mean is 61.3 mgal, the standard deviation 4.8 mgal, and the maximum discrepancy with a normal distribution with these same parameters is 5.7 percentage points
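The maximum discrepancy quoted for Fig. 2 and the normal score transformation discussed in Step 2 can both be sketched with the Python standard library. This is neither the routine of Press et al. (1992) nor a GSLIB program: the function names are hypothetical, the reference normal distribution is assumed to use the sample mean and sample standard deviation, and the normal scores assume the plotting position (rank + 0.5)/n with no special treatment of ties.

```python
from statistics import NormalDist, mean, stdev


def max_deviation_from_normal(values):
    """Largest gap between the empirical CDF and a fitted normal CDF (0 to 1)."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    worst = 0.0
    for i, v in enumerate(data):
        f = fitted.cdf(v)
        # The empirical CDF jumps from i/n to (i + 1)/n at v; check both sides.
        worst = max(worst, abs(f - i / n), abs((i + 1) / n - f))
    return worst


def normal_scores(values):
    """Rank-based transform of the data to standard normal quantiles."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    standard = NormalDist()
    for rank, idx in enumerate(order):
        scores[idx] = standard.inv_cdf((rank + 0.5) / n)
    return scores


# Hypothetical measurements; multiply the deviation by 100 for percentage points.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(100.0 * max_deviation_from_normal(sample))
print(normal_scores(sample))
```

Under the practical rule given in Step 2, the transformation would be considered only if the first printed number exceeded 10.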

\[
\hat{\gamma}(\mathbf{h}) = \frac{1}{2\,n(\mathbf{h})} \sum_{i=1}^{n(\mathbf{h})} \left[ z(\mathbf{x}_i + \mathbf{h}) - z(\mathbf{x}_i) \right]^2, \qquad (1)
\]

where $z(\mathbf{x}_i)$ is a measurement taken at location $\mathbf{x}_i$ and $n(\mathbf{h})$ is the number of pairs $\mathbf{h}$ units apart in the direction of the vector. In geostatistical jargon, $\hat{\gamma}(\mathbf{h})$ is known as the experimental semivariogram and $\mathbf{h}$ is the lag. An important assumption for the validity of this estimator is the absence of any systematic variations, that is to say, there should be no trend. Note that the estimator is conveniently independent of the individual sites $\mathbf{x}_i$. The lag is written in bold to denote that the argument simultaneously has a magnitude defined by a distance and an orientation, which in two dimensions is any of the points of the compass. Location is also in bold to denote that it is a vector of coordinates, as many as the dimensions in the sampling space. In two dimensions, they are commonly denoted as easting and northing.

Given an orientation, the estimator in (1) is strictly applicable to a sampling at regular intervals, $d$. $\hat{\gamma}(0)$ is always zero, as the difference of any measurement with itself is zero. $\hat{\gamma}(d)$ is half the mean square of all differences of a measurement with its immediate neighbor, $\hat{\gamma}(2d)$ is half the mean square of all differences of a measurement with the neighbor that results from skipping one measurement, and so on. For a numerical example, see, e.g., Olea (1999, p. 74).

In irregular sampling schemes, for the purpose of establishing pairs, $z(\mathbf{x}_i + \mathbf{h})$ is regarded as a centroid of a distance class. Figure 3 illustrates the situation of a two-dimensional irregular sampling, in which any measurement inside the shaded area is considered in the calculation of $\hat{\gamma}(\mathbf{h})$, although it is not exactly a distance $\mathbf{h}$ from $\mathbf{x}_i$. The lag spacing is typically taken close to the value of the average sampling distance; the lag tolerance, $t_h$, is set to half this spacing; the lateral tolerance, $t_b$, 0.5 to 2 times the lag spacing; and the angular tolerance, $\delta$, 0.25–0.5 times the increment in the azimuth. The increment in the azimuth is customarily between 22.5° and 45°. Clearly there is a trade-off between the resolution of small distance classes with few observations per class, and the reliability and smoothing of large distance classes. Options become more numerous as the sampling size increases. In practice, the pairing of measurements and calculation of half-square averages is better done with the assistance of computer programs, such as VARIOWIN (Pannatier 1996) or gamv (Deutsch and Journel 1998, pp. 53–55).

Fig. 3 Definition of the distance class in the estimation of a semivariogram. The shaded area is defined by an angular tolerance $\delta$, lag tolerance $t_h$, and a lateral tolerance $t_b$, relative to the point here appearing in the upper left corner. Any measurement inside the shaded area can be used in the calculation of $\hat{\gamma}(\mathbf{h})$ despite not being exactly $\mathbf{h}$ units apart from $\mathbf{x}_i$

Figure 4 is an example of a typical experimental semivariogram, which in this case shows the results of processing the Elk data with gamv using an incremental lag of 1.6 km (1 mi), an angular tolerance of 10° and a lateral tolerance of 1 km. The degree of dissimilarity provided by the semivariogram often shows a bounded increase. The distance at which the semivariogram reaches the limiting value is called the range and the bound is denoted as the sill. In the case of experimental semivariograms, the range and the sill are hard to define accurately because of irregularity in the fluctuations.

Fig. 4 Experimental semivariogram for demonstration sampling along N63E showing a range of about 21.3 km and a sill of about 1.07 mgal²

Let me conclude this section with some remarks about the significance of the point estimates comprising the experimental semivariogram. The autocorrelation assumption at the core of geostatistics is both a blessing and a problem. It is an advantage in that it allows for better characterizations than would be possible without spatial continuity. Autocorrelation, however, introduces enough theoretical complications that it has been impossible to develop any kind of test of significance in the style of classical statistics. So, for example, the question of the minimum number of pairs in $\hat{\gamma}(\mathbf{h})$ required to provide a reliable estimation can be addressed solely by the following couple of recommendations:

1. The minimum number of pairs in semivariogram estimation must be 30, according to Journel and Huijbregts (1978, p. 194), and 50 if one is going to follow the advice of Chilès and Delfiner (1999, p. 38). Please see Webster and Oliver (1992) for a discussion.
2. If it is necessary to estimate the semivariogram for a large lag close to the diameter of the sampling area, then the pairs of observations that are that far apart are only those located at opposite extremes of the sampling area, thus excluding the central points from the analysis. Hence the justification for a second practical rule: the lag of the experimental semivariogram should be limited to half the extreme distance in the sampling domain for the direction of interest (Journel and Huijbregts 1978, p. 194).

Indirectly, these two rules collectively make it difficult to properly estimate a semivariogram with fewer than 50 measurements at different locations.

5 Step 4: Directional investigation

In more than one dimension, the semivariogram generally has directional properties. Hence the user should not stop at estimating a semivariogram for a single direction, such as the one in Fig. 4. In general, the data permitting, the more directions are investigated, the better. In two dimensions, the bare minimum is the investigation of three azimuths (Goovaerts 1997, p. 98). In three dimensions, there is the additional need to run a sensitivity analysis in the declination.

One might observe at least three basic types of behavior in a directional semivariogram survey. In the simplest situation, there is no significant difference among experimental semivariograms for the different directions tested. In this circumstance, one speaks of isotropy and it is acceptable to average all experimental semivariograms regardless of orientation. The result is an omnidirectional semivariogram, which is usually smoother than the individual directional semivariograms.

If the sampling is trend free and the average size of the anomalies is smaller than the maximum length of the sampling area, then the typical situation is that one obtains semivariograms with different rates of increase for short lags that level off at a common sill that approximates the sampling variance (Barnes 1991), such as in Fig. 5. This is the second basic type of behavior, a semivariogram with what is called a geometric anisotropy, which is a true function of distance and direction.

Fig. 5 Schematic example of geometrically anisotropic semivariogram (horizontal axis: lag; vertical axis: semivariogram; curves for Directions 1 to 4 leveling off at Var(Z))

Figure 6 shows the third type of basic behavior through eight semivariograms for the Elk data set at increments of 22.5°, starting from the east–west direction. Now, instead of observing a sill for every semivariogram, one can see an exponential increase without bound at different incremental rates. Many of the estimated values surpass the value of the variance, which in this instance is 23 mgal². What we have in such situations is a collection of artifacts, not genuine experimental semivariograms. Even a casual inspection of Fig. 1 reveals that there is a systematic increase in the Bouguer gravity from northwest to southeast, which violates the assumption of no trend for the proper use of the estimator in Eq. 1. Hence, the curves in Fig. 6 are neither semivariogram estimates, nor the kind of semivariograms that will be required in a geostatistical characterization of an attribute with a trend. In the presence of trend, the semivariogram required for modeling is that computed on the residuals obtained after removing the trend. Yet to remove the trend, it is necessary to have the semivariogram of the residual. The simplest, yet most effective way out of this conundrum is to find a trend-free direction, namely a direction that on average has a constant mean, and use the semivariogram in that direction as the semivariogram for the residual. The justification for this is based on the fact that the experimental semivariogram depends on differences $z(\mathbf{x}_i + \mathbf{h}) - z(\mathbf{x}_i)$ in which the addition or subtraction of a constant to each term does not change the increment. The trend-free direction is perpendicular to the direction of maximum dip and coincides with the direction of minimal increase in the pseudo-semivariograms of a directional survey. In the case of the Elk County data, this direction is about N67E.

A second directional analysis around the approximately trend-free direction, such as the one in Fig. 7, helps to narrow the solution, which in this case is approximately N63E. Other more sophisticated approaches not considered here include iterative modeling of the trend and the semivariogram (Chilès and Delfiner 1999, pp. 115–128) and removal of the trend by filtering through the calculation of increments (Chilès and Delfiner 1999, chap. 4).
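The estimator in Eq. 1, restricted to one direction with the tolerances of Fig. 3, can be sketched in a few lines. This is not gamv or VARIOWIN: the function name, the argument names, the choice of assigning each pair to the nearest multiple of the lag spacing (a lag tolerance of half the spacing), and the convention of azimuths measured clockwise from north are assumptions of this example.

```python
import math


def directional_semivariogram(points, azimuth_deg, lag, n_lags,
                              ang_tol_deg=22.5, band_width=None):
    """Experimental semivariogram (Eq. 1) along one azimuth.

    points: iterable of (easting, northing, value).
    azimuth_deg: clockwise from north, so N63E is 63.
    Returns a list of (mean distance, semivariogram, number of pairs).
    """
    az = math.radians(azimuth_deg)
    ux, uy = math.sin(az), math.cos(az)            # unit vector of the direction
    cos_tol = 0.0 if ang_tol_deg >= 90.0 else math.cos(math.radians(ang_tol_deg))
    sums = [0.0] * (n_lags + 1)
    dists = [0.0] * (n_lags + 1)
    counts = [0] * (n_lags + 1)
    pts = list(points)
    for i in range(len(pts) - 1):
        xi, yi, zi = pts[i]
        for xj, yj, zj in pts[i + 1:]:
            dx, dy = xj - xi, yj - yi
            d = math.hypot(dx, dy)
            if d == 0.0:
                continue
            along = abs(dx * ux + dy * uy)          # axial direction: fold 180 degrees
            if along < d * cos_tol:
                continue                            # outside the angular tolerance
            across = math.sqrt(max(d * d - along * along, 0.0))
            if band_width is not None and across > band_width:
                continue                            # outside the lateral tolerance
            k = int(d / lag + 0.5)                  # nearest distance class
            if 1 <= k <= n_lags:
                sums[k] += (zj - zi) ** 2
                dists[k] += d
                counts[k] += 1
    return [(dists[k] / counts[k], sums[k] / (2 * counts[k]), counts[k])
            for k in range(1, n_lags + 1) if counts[k] > 0]


# Hypothetical 6 x 6 grid whose values increase only toward the east.
grid = [(float(x), float(y), float(x)) for x in range(6) for y in range(6)]
for azimuth in (0.0, 90.0):
    print(azimuth, directional_semivariogram(grid, azimuth, 1.0, 3, ang_tol_deg=10.0))
```

On this hypothetical grid the north direction is trend free and returns zero semivariogram values, while the east direction returns values that keep growing with the lag, which is the behavior described above for Fig. 6. Running the function over a set of azimuths and comparing the rates of increase is one way to look for the trend-free direction.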

Fig. 6 Directional semivariogram investigation for the Elk County gravity data

Fig. 7 Directional semivariogram investigation around the trend-free direction

6 Step 5: Simple modeling

If all that the user wants to obtain are some conclusions from the inspection of the experimental semivariogram, step 4 is the end of the process. Yet, more often than not, the semivariogram is required for kriging estimation or for some form of stochastic simulation involving kriging. In these situations, modeling of the semivariogram becomes mandatory.

Kriging is the solution to the quadratic minimization problem of finding weights that minimize the estimation error in a mean square sense. Any quadratic minimization problem has a unique, positive solution, provided that the coefficient matrix is not singular, which in the case of the kriging minimization problem introduces the requirement that the semivariogram be negative definite. By a positive solution, it is meant that the objective function, the estimation error, be positive, which is essential to avoid imaginary standard errors. A semivariogram model is any negative definite analytical expression of a shape likely to capture and emulate the style of variation of some experimental semivariogram. By replacing the experimental semivariogram by a negative definite model, the user avoids singular kriging matrices no matter what the combination of arguments.
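A small numerical check may help to see what this requirement buys. For a model with a sill, negative definiteness of the semivariogram corresponds to positive definiteness of the covariance obtained by subtracting the semivariogram from the sill, and a symmetric matrix is positive definite exactly when its Cholesky factorization exists. The sketch below is an illustration only: the site layout, the exponential model parameters, and the helper names are assumptions, and the hand-written Cholesky routine stands in for what a linear algebra library would normally provide.

```python
import math


def exponential_covariance(h, sill=1.0, a=5.0):
    """Covariance sill - Ex(h) for the exponential semivariogram model."""
    return sill * math.exp(-3.0 * h / a)


def is_positive_definite(matrix):
    """True if the Cholesky factorization of a symmetric matrix exists."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False        # a non-positive pivot: not positive definite
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


# Hypothetical sampling sites on a 5 x 5 grid with a spacing of 2 km.
sites = [(2.0 * i, 2.0 * j) for i in range(5) for j in range(5)]
cov = [[exponential_covariance(math.dist(p, q)) for q in sites] for p in sites]
print(is_positive_definite(cov))
```

A table of experimental values interpolated directly carries no such guarantee, which is why each coefficient matrix would have to be tested individually.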
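Table 1 Most commonly used simple semivariogram models ($0 < a$, $0 < C$)

Power model:
\[ P(h) = a\,h^{b}, \qquad 0 < b < 2 \]

Exponential model:
\[ Ex(h) = C \left( 1 - e^{-3h/a} \right) \]

Gaussian model:
\[ G(h) = C \left( 1 - e^{-3 (h/a)^{2}} \right) \]

Spherical model:
\[ Sp(h) = \begin{cases} C \left[ \dfrac{3}{2} \dfrac{h}{a} - \dfrac{1}{2} \left( \dfrac{h}{a} \right)^{3} \right], & 0 \le |h| < |a| \\ C, & |a| \le |h| \end{cases} \]

Pentaspherical model:
\[ Pe(h) = \begin{cases} C \left[ \dfrac{15}{8} \dfrac{h}{a} - \dfrac{5}{4} \left( \dfrac{h}{a} \right)^{3} + \dfrac{3}{8} \left( \dfrac{h}{a} \right)^{5} \right], & 0 \le |h| < |a| \\ C, & |a| \le |h| \end{cases} \]

Cubic model:
\[ Cu(h) = \begin{cases} C \left[ 7 \left( \dfrac{h}{a} \right)^{2} - \dfrac{35}{4} \left( \dfrac{h}{a} \right)^{3} + \dfrac{7}{2} \left( \dfrac{h}{a} \right)^{5} - \dfrac{3}{4} \left( \dfrac{h}{a} \right)^{7} \right], & 0 \le |h| < |a| \\ C, & |a| \le |h| \end{cases} \]

Sine hole effect model:
\[ S(h) = C \left[ 1 - \dfrac{\sin(\pi h / a)}{\pi h / a} \right] \]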
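The models of Table 1 translate directly into code. The sketch below is a plain transcription for a non-negative scalar lag; the function names and the argument order are assumptions of this example, and the value at lag zero of the sine hole effect model is set to its limit of zero to avoid a division by zero.

```python
import math


def power(h, a, b):
    return a * h ** b


def exponential(h, C, a):
    return C * (1.0 - math.exp(-3.0 * h / a))


def gaussian(h, C, a):
    return C * (1.0 - math.exp(-3.0 * (h / a) ** 2))


def spherical(h, C, a):
    r = h / a
    return C * (1.5 * r - 0.5 * r ** 3) if h < a else C


def pentaspherical(h, C, a):
    r = h / a
    return C * (1.875 * r - 1.25 * r ** 3 + 0.375 * r ** 5) if h < a else C


def cubic(h, C, a):
    r = h / a
    return C * (7.0 * r ** 2 - 8.75 * r ** 3 + 3.5 * r ** 5 - 0.75 * r ** 7) if h < a else C


def sine_hole_effect(h, C, a):
    if h == 0.0:
        return 0.0                      # limit of the expression as h goes to zero
    x = math.pi * h / a
    return C * (1.0 - math.sin(x) / x)


# Values at half the range for unit parameters.
for model in (exponential, gaussian, spherical, pentaspherical, cubic, sine_hole_effect):
    print(model.__name__, model(0.5, 1.0, 1.0))
```

The three bounded polynomial models reach the sill exactly at the range, whereas the exponential and Gaussian models only reach about 95% of the sill there, which is the role of the factor 3 in their exponents.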

This explanation is given to dispel any notion that the use of semivariogram models is an unnecessary complication introduced solely to make life more miserable. The use of negative definite models is far more efficient than the alternative of testing the non-singularity of every kriging coefficient matrix for each particular set of values derived from direct interpolation of the table of experimental semivariogram values, even though this would be a valid approach.

Although there is an infinite number of negative definite functions, the basic shape of the semivariogram rising from zero to reach a limiting value restricts to a few the negative definite functions that are of interest. Those most commonly employed are defined in Table 1 and displayed in Fig. 8. Parameters $C$ and $a$ conveniently relate directly to the sill and the range. A special case of the negative definite model is the pure nugget model, $N$, which can be considered a limiting case of some of the other models when the range is infinitesimally small.

\[ N = C_0 \left( 1 - H(0) \right), \]

where $H(0)$ is the Heaviside function, which is 1 at lag 0 and 0 otherwise. In this particular case, the constant $C_0$ is not called the sill but the nugget effect, a term derived from the modeling of semivariograms of gold deposits. The sum of a simple model and a pure nugget effect model is also negative definite. The higher the nugget effect relative to the sill minus the nugget, the poorer the spatial continuity. In the limit, just a pure nugget effect semivariogram indicates complete absence of spatial continuity, making the use of geostatistics meaningless as it produces the same results as classical statistics. Such lack of continuity can be real, the result of a sampling spacing that is too large, or the consequence of numerous blunders in the data. For example, an attribute with a true range in its semivariogram of 100 m will have a pure nugget effect semivariogram when the attribute is sampled at intervals of 1 km. In my experience, this is commonly the case of geochemical data.

Fig. 8 Most common negative definite semivariogram models when the two parameters are equal to 1 (horizontal axis: lag, from 0 to 2; vertical axis: semivariogram, from 0 to 2; curves for the power, exponential, Gaussian, spherical, pentaspherical, cubic, and sine hole effect models)

Modeling of a semivariogram is the process of replacing the collection of estimated values by the closest negative definite model, namely, the selection of the most adequate model type and the determination of its parameters. This can be done by:

(a) trial and error (Goovaerts 1997, pp. 97–104);
(b) maximum likelihood (Kitanidis 1997, chap. 4); and
(c) weighted least squares (Jian et al. 1996).

Considering that this paper presents a practical approach, I refer the reader to the references for theoretical discussions. For years, I have had the most satisfactory results employing weighted least squares, particularly when the experimental semivariogram follows a typical behavior devoid of anomalous fluctuations, such as the case of the semivariogram in Fig. 4.

Regardless of the method used to find the model parameters, it is particularly hard for the novice to pick the best type of model by simple inspection. Hence, the safest approach is to fit all models and then select the model with the best goodness of fit, which is a trivial and instantaneous undertaking when the modeling is not done by trial and error. Figure 9 shows the results for the Elk data using a program described in Jian et al. (1996) that is available from the Internet at http://www.iamg.org/CGEditor/index.htm. Table 2 contains the optimal parameters and the sum of the squares of the weighted differences, $R_m$.

Fig. 9 Best fits for experimental semivariogram along N63E employing each one of the models in Table 1 plus a pure nugget model when necessary

Table 2 Results of fitting simple models by weighted least squares

Model              R_m (mgal⁴)   C_0 (mgal²)   a or C (mgal²)   b or a (km)
Gaussian           0.081         0.038         0.990            14.885
Cubic              0.085         0.033         0.991            20.525
Spherical          0.202         0.000         1.083            25.449
Pentaspherical     0.217         0.000         1.100            31.958
Exponential        0.349         0.000         1.663            72.665
Sine hole effect   0.410         0.038         0.909            15.904
Power              0.565         0.000         0.069            0.857

In a weighted least squares sense, the best model is the Gaussian one, closely followed by the cubic model.

In the absence of a trend, if there is anisotropy, one has to model as many directions as dimensions in the sampling space. For two dimensions, one has to model one semivariogram for the direction of maximum range and another one for the direction of the minimum range. Programs making use of anisotropic models generally will expect that:

(a) the two directions are perpendicular;
(b) the type of model is the same; and
(c) both models have the same sill, which in most cases forces some approximations. Those programs automatically model the range for intermediate directions as the radius of an ellipse in which the axes are the minimum and maximum ranges.

For cases with trend and modeled only in the trend-free direction, such as the demonstration data along N63E, if the user wants to investigate the possibility of anisotropy, the investigation may be done indirectly making use of crossvalidation. Crossvalidation is a verification process in which each observation is removed with replacement to produce an estimate at the same site of the removal. Each estimate is then used to calculate a difference with the corresponding censored measurement, thus generating a set of errors that one can use to investigate their sensitivity to the selection of the estimation method and some of its parameters (Olea 1999, Chap. 7). If one employs an estimator involving a semivariogram model, such as the most adequate form of kriging, which in the case of the Elk data may be universal kriging because of the presence of a trend, one can use crossvalidation to study the sensitivity of the errors to changes in the semivariogram. Conclusions derived from the analysis of the errors, however, must be taken cautiously because the errors are not independent.

Table 3 shows the sensitivity of crossvalidation errors to two sets of anisotropic models, one with a maximum range 10% larger than the range in the best simple model and a minimum range 10% less than the range in the best simple model (16.38, 13.40) and another set with a discrepancy of 20% (17.86, 11.91). Although there is a systematic improvement that suggests a minimum mean square error for a largest range oriented approximately in a north–south direction, for this example the improvement is not large enough to be considered significant or to justify the complications of an anisotropic model.

Table 3 Sensitivity of Elk County Bouguer gravity to anisotropic semivariogram models

Maximum range (km)   Minimum range (km)   Orientation of largest range   Mean square error (mgal²)
14.89                14.89                                               0.231
16.38                13.40                N63E                           0.233
                                          N83E                           0.235
                                          N77W                           0.235
                                          N57W                           0.234
                                          N37W                           0.232
                                          N17W                           0.230
                                          N3E                            0.229
                                          N23E                           0.229
                                          N43E                           0.231
17.86                11.91                N63E                           0.236
                                          N83E                           0.240
                                          N77W                           0.241
                                          N57W                           0.238
                                          N37W                           0.234
                                          N17W                           0.230
                                          N3E                            0.227
                                          N23E                           0.228
                                          N43E                           0.231

7 Step 6: Nested modeling

A sum of negative definite semivariograms is also negative definite. The sum of a simple model plus a pure nugget effect model is just a special case. This property of negative definite semivariograms opens infinite possibilities of semivariogram mixing, which in geostatistical jargon is called semivariogram nesting. In practice, the goodness of fit rapidly reaches a saturation point, explaining why one rarely sees nested models involving more than a pure nugget effect model plus two simple models, not necessarily of the same type.
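Both the simple fits of Table 2 and the nested fits of this step rest on minimizing a weighted sum of squared differences, $R_m$, and on comparing models of different size with the Akaike information criterion, $\mathrm{AIC} = n \ln(R_m / n) + 2p$, where $n$ is the number of fitted points and $p$ the number of parameters. The sketch below fits a pure nugget plus one Gaussian model. It is not the program of Jian et al. (1996): the weights are left to the caller, the range is found by scanning a list of candidate values, and the function names are hypothetical. For each candidate range the model is linear in $C_0$ and $C$, so both follow from a 2 by 2 system of weighted normal equations.

```python
import math


def gaussian_shape(h, a):
    """Gaussian model of Table 1 with unit sill."""
    return 1.0 - math.exp(-3.0 * (h / a) ** 2)


def fit_nugget_plus_gaussian(lags, gammas, weights, ranges):
    """Weighted least squares fit of C0 + C * gaussian_shape(h, a).

    Returns (R_m, C0, C, a) for the candidate range with the smallest R_m.
    """
    best = None
    for a in ranges:
        g = [gaussian_shape(h, a) for h in lags]
        sw = sum(weights)
        sg = sum(w * x for w, x in zip(weights, g))
        sgg = sum(w * x * x for w, x in zip(weights, g))
        sy = sum(w * y for w, y in zip(weights, gammas))
        sgy = sum(w * x * y for w, x, y in zip(weights, g, gammas))
        det = sw * sgg - sg * sg
        if det <= 1e-12 * sw * sgg:
            continue                     # shape is flat over the lags: skip this range
        c0 = (sy * sgg - sg * sgy) / det
        c = (sw * sgy - sg * sy) / det
        if c0 < 0.0:                     # keep the nugget effect non-negative
            c0, c = 0.0, sgy / sgg
        rm = sum(w * (y - c0 - c * x) ** 2 for w, x, y in zip(weights, g, gammas))
        if best is None or rm < best[0]:
            best = (rm, c0, c, a)
    return best


def aic(rm, n, p):
    """Akaike information criterion for a fit with n points and p parameters."""
    return n * math.log(rm / n) + 2 * p


# Hypothetical experimental values generated from a known model, equal weights.
lags = [1.6 * k for k in range(1, 21)]
gammas = [0.04 + 0.99 * gaussian_shape(h, 15.0) for h in lags]
candidates = [10.0 + 0.5 * k for k in range(21)]
print(fit_nugget_plus_gaussian(lags, gammas, [1.0] * len(lags), candidates))
```

A nested fit adds a second shape term and therefore two more parameters; the criterion then tells whether the reduction in $R_m$ is worth the extra parameters.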

Table 4 Best simple and double nested models

Model             R_m (mgal⁴)   AIC      C_0 (mgal²)   C (mgal²)      a (km)
Simple Gaussian   0.081         −104.2   0.038         0.990          14.885
Nested Gaussian   0.025         −123.4   0.010         0.690, 0.380   22.311, 8.042

To keep an eye on the parsimony of nested modeling, one can use the Akaike information criterion (AIC) from time series analysis. The AIC is a measure of goodness of fit involving not only the weighted errors, but the number of points used for the fitting, $n$, and the number of parameters, $p$, as well (Tong 1983, p. 135):

\[ \mathrm{AIC} = n \ln\!\left( \frac{R_m}{n} \right) + 2p \]

The smaller the Akaike information criterion, the better is the fit. Given an experimental semivariogram, when $n$ and $p$ remain constant, such as in Table 2, the ranking by $R_m$ or AIC is the same.

By looking at the best simple model in Fig. 10a, one can see that between a lag of 8 and 20 km there are 8 experimental points below the curve, two of them clearly below, which justify trying a more complex model to aim for a better fit. Table 4 and Fig. 10b provide the answer: a double nested Gaussian model.

Fig. 10 Best semivariogram models. (a) Simple Gaussian model. (b) Best double nested Gaussian model.

The AIC for the nested model is indeed smaller than that for the simple one. Hence, the improvement is worth increasing the number of parameters from 3 to 5. Considering that the crossvalidation does not show any significant evidence of anisotropy for this nested model either, the isotropic model:

\[ \gamma(h) = N(0.01) + G(h; 0.69, 22.3) + G(h; 0.38, 8.0) \]

is the best model for the Bouguer gravity anomaly data from Elk County.

If there is a sill, then the covariance is easily obtained by subtracting the semivariogram from the total sill, which in this case would be:

\[ \mathrm{Cov}(h) = 1.08 - N(0.01) - G(h; 0.69, 22.3) - G(h; 0.38, 8.0) \]

The main advantage of this indirect way to model the covariance is that it does not require knowledge of the mean.

8 Concluding remarks

I hope the novice reader, for whom this paper is intended, is now less intimidated and more confident of being able to model a semivariogram or a covariance. Sophistications and variants abound. The six steps described here are by no means the absolute way to go. The ultimate test of understanding for the reader will be to feel confident enough to try her or his own version of these basic steps.

Acknowledgements I am grateful to John H. Doveton and two anonymous reviewers for critical reading of the manuscript that resulted in suggestions that improved the presentation.

References

Barnes RJ (1991) The variogram sill and the sample variance. Math Geol 23:673–678
Chilès J-P, Delfiner P (1999) Geostatistics: modeling spatial uncertainty. Wiley, New York, 695 p
Collins DR (1999) User's guide for the LEO system, version 3.9. Kansas Geological Survey Open-File Report 99–48, 13 p
Deutsch CV, Journel AG (1998) GSLIB: geostatistical software library and user's guide. Oxford University Press, New York, 369 p and 1 compact disk
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York, 483 p
Isaaks EH, Srivastava RM (1989) Introduction to applied geostatistics. Oxford University Press, New York, 561 p
Jian X, Olea RA, Yu Y-S (1996) Semivariogram modeling by weighted least squares. Comput Geosci 22:387–397
Journel AG, Huijbregts CJ (1978) Mining geostatistics. Academic, London, 600 p
Kitanidis PK (1997) Introduction to geostatistics: applications to hydrology. Cambridge University Press, New York, 249 p
Lam C-K (1987) Interpretation of statewide gravity survey of Kansas. Kansas Geological Survey Open-File Report 87–1, 213 p and 6 plates
Olea RA (1999) Geostatistics for engineers and earth scientists. Kluwer, Boston, 303 p
Pannatier Y (1996) VARIOWIN: software for spatial data analysis in 2D. Springer, New York, 91 p
Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1992) Numerical recipes in Fortran, 2nd edn. Cambridge University Press, New York, 963 p
Tong H (1983) Threshold models in non-linear time series analysis. Springer, New York, 323 p
Verly G (1986) Multigaussian kriging: a complete case study. In: Ramani RV (ed) Proceedings of the 19th APCOM international symposium, Society of Mining Engineers, Littleton, Colorado, pp 283–298
Webster R, Oliver MA (1992) Sample adequately to estimate variograms of soil properties. J Soil Sci 43:177–192
Xia J, Yarger H, Lam C-K, Steeples D, Miller R (1992) Bouguer gravity anomaly map of Kansas. Kansas Geological Survey Map M-31
