
Geostatistics for
Natural Resources
Evaluation

PIERRE GOOVAERTS
Department of Civil and Environmental Engineering
The University of Michigan, Ann Arbor

[Cover illustration panels: Cd concentrations (ppm), Geology, Kriging estimates, Classification, Risk]

New York Oxford

Oxford University Press
1997
To Nathalie, Maxime, and Xavier
Copyright 1997 by Oxford University Press, Inc.

Published by Oxford University Press, Inc.
198 Madison Avenue, New York, New York 10016

Oxford is a registered trademark of Oxford University Press.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press.

Foreword

As a last remark, beware that uncertainty . . . arises from our imperfect knowledge of that phenomenon, it is data-dependent and most importantly model-dependent, that model specifying our prior concept (decisions) about the phenomenon. No model, hence no uncertainty measure, can ever be objective.

Pierre Goovaerts closes an authoritative work with a comment that many masters could envy, showing his commanding distance from a subject he otherwise meticulously articulated over 400 pages. This is the signature of the very best. This (still) young man is neither French nor South African. He is neither mining engineer nor petroleum engineer. He is an agronomist with a trademark no-nonsense practicality that, from the beginning, made geostatistics different and hence, I dare say, successful. An author must have rare insight to present such a compendium of methods and tools without letting them take command of the book. Pierre built his book around the study goals, giving to data analysis and data integration the lead they deserve. Never mistake the tools for the goals: this is the recipe for a good geostatistical study, and Pierre has demonstrated that it also works for a good book. His illustration of every concept, every kriging system, by an example from the same data set (heavy metal contamination in the Swiss Jura) is an enchantment and a necessary breather in some tough reading.
Characteristically, exploratory data analysis (chapter 2) precedes the introduction of the random function (chapter 3). The chapter on inference and modeling of a multivariate model (chapter 4) is, in my opinion, the best ever written on the subject. Pierre could have been bolder by giving precedence to uncertainty assessment over estimation, but he chose to present first the estimation tools (chapters 5 and 6). This is the most complete yet cohesive exposé of all the various flavors of kriging and cokriging. The emphasis is not on the illusory kriging variance but rather on building estimators that can account for the large diversity of information types characteristic of earth sciences. The very reason for geostatistics and the future of the discipline lie in the modeling of uncertainty, at each node through conditional distributions (chapter 7) and globally through stochastic images (conditional simulations, chapter 8). In modern geostatistics, which is driven by conditional simulations, kriging is an engine, and not the only one, to build models of conditional probability distributions. Kriging estimates and kriging variances have lost their original luster: the former because of their uneven smoothing and the latter because of their data independence.

The practice of geostatistics has always been ahead of academic publications. This book finally may have caught up with the use of random function models in the earth sciences, but geostatistics has already freed itself from the frame of such models. It befits that the door of an era be closed by a man of the future.

A.G. Journel

Acknowledgments

Most of this book was written in 1994 during my second year as a postdoc at the Department of Geological and Environmental Sciences (Stanford University). The idea of a geostatistical book was launched by André Journel on a flight back from the forum "Geostatistics for the Next Century," held in Montreal in June 1993. In addition to initiating the project, André was an enthusiastic adviser and a tireless reviewer, tracking any sentence that failed to follow his golden rules of clarity and practicality. Without his ceaseless support, this book probably would never have appeared on the shelves.
The 20 months of writing were followed by a 6-month peer review. I wish to thank the following persons for their time and patience in reading all or part of the 550-page draft manuscript: Clayton Deutsch, Jennifer Dungan, Guy Girard, Jaime Gómez-Hernández, Ricardo Olea, Philippe Sonnet, Mohan Srivastava, Hans Wackernagel, and Richard Webster. Special thanks to Mohan Srivastava for his thorough review of the book and his 106 comments, sometimes irritating but always pertinent. The draft manuscript also was used as a textbook for prequalifying exams of Stanford University Ph.D. students and benefited from the careful reading of Phaedon Kyriakidis, Srinivas Rao, and Tingting Yao.

Most of my knowledge about indicator geostatistics and stochastic simulation was gained during my 2 years at Stanford, and I am grateful to Gilles Bourgault and many graduate students for their stimulating discussions and seminars. This time at Stanford would not have been possible without the financial support of Stanford University, the Belgian National Fund for Scientific Research, the Belgian American Educational Foundation, NATO, and a Fulbright-Hays grant. I am [...] indebted to the Belgian National Fund for Scientific Research for [...] my research for 7 years.
Geostatistics cannot exist without data. Throughout the book, case studies have proved valuable complements to the theoretical introduction of concepts and algorithms. I thank Jean-Pascal Dubois of the Swiss Federal Institute of Technology at Lausanne for providing me with a priceless environmental data set and for authorizing the publication of these data.

Louvain-la-Neuve, Belgium                P.G.
December 1996

Contents

1 Introduction

2 Exploratory data analysis
2.1 Univariate description
2.1.1 Categorical variables
2.1.2 Continuous variables
2.2 Bivariate description
2.2.1 The scattergram
2.2.2 Measures of bivariate relation
2.3 Univariate spatial description
2.3.1 Location maps
2.3.2 The h-scattergram
2.3.3 Measures of spatial continuity and variability
2.3.4 Application to indicator transforms
2.3.5 Spatial continuity of metal concentrations
2.4 Bivariate spatial description
2.4.1 The cross h-scattergram
2.4.2 Measures of spatial cross continuity/variability
2.4.3 The scattergram of h-increments
2.4.4 Measures of joint variability
2.4.5 Application to indicator transforms
2.4.6 Spatial relations between metal concentrations
2.5 Main features of the Jura data

[...]

8.7 Simulated annealing
8.7.1 Simulated annealing paradigm
8.7.2 Implementation tips
8.8 Simulation of categorical variables
8.9 Miscellaneous aspects of simulation
8.9.1 Reproduction of model statistics
8.9.2 Visualization of spatial uncertainty
8.9.3 Choosing a simulation algorithm

9 Summary

Appendixes
A Fitting an LMC
B List of acronyms and notation
B.1 Acronyms
B.2 Common notation
C The Jura data
C.1 Prediction and validation sets
C.2 Transect data set

Bibliography

Index

Chapter 1

Introduction

Earth sciences data are typically distributed in space and/or in time. Knowledge of an attribute value, say, a mineral grade or a pollutant concentration, is thus of little interest unless location and/or time of measurement are known and accounted for in the data analysis. Geostatistics provides a set of statistical tools for incorporating the spatial and temporal coordinates of observations in data processing.

The development of geostatistics in the 1960s resulted from the need for a methodology to evaluate the recoverable reserves in mining deposits. Priority was given to practicality, a current trademark of geostatistics that explains its success and application in such diverse fields as mining, petroleum, soil science, oceanography, hydrogeology, remote sensing, and environmental sciences. Until the late 1980s, geostatistics was essentially viewed as a means to describe spatial patterns and interpolate the value of the attribute of interest at unsampled locations. Geostatistics is now increasingly used to model the uncertainty about unknown values through the generation of alternative images (realizations) that all honor the data and reproduce aspects of the patterns of spatial dependence or other statistics deemed consequential for the problem at hand. A given scenario or transfer function (remediation process, flow simulator) can be applied to the set of realizations, allowing the uncertainty of the response (remediation efficiency, flow properties) to be assessed. Stochastic imaging is one of the most vibrant and promising areas of research in geostatistics.
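
As a rough illustration of how a transfer function turns a set of realizations into a distribution of responses, the following Python sketch may help. Everything in it is an assumption made for illustration: the "realizations" are plain random draws rather than conditional simulations, and `remediation_cost` is a hypothetical transfer function, not one used in this book.

```python
import random
import statistics


def remediation_cost(realization, threshold, unit_cost=1.0):
    # Hypothetical transfer function: pay unit_cost for every cell whose
    # simulated concentration exceeds the threshold.
    return unit_cost * sum(1 for value in realization if value > threshold)


def response_distribution(realizations, transfer, **kwargs):
    # Apply the same transfer function to each realization in turn.
    return [transfer(r, **kwargs) for r in realizations]


# Stand-in realizations: plain random draws, NOT conditional simulations.
rng = random.Random(0)
realizations = [[rng.lognormvariate(0.0, 0.5) for _ in range(100)]
                for _ in range(50)]

costs = response_distribution(realizations, remediation_cost, threshold=1.5)
print("mean response:", statistics.mean(costs))
print("spread of response:", statistics.pstdev(costs))
```

The spread of `costs` across realizations is what stands in for the uncertainty of the response; with real conditional simulations, each realization would also honor the data.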
Often there are only a few measurements of the attribute of interest; the resultant predicted maps thus provide poor resolution, and the corresponding uncertainty may be very large. In such situations, it is critical to account for information that is more densely sampled. For example, insufficient pollutant data can be supplemented with indirect, yet exhaustive, information provided by the calibration of a soil or land use map. Integration of secondary data in prediction and simulation algorithms is another active avenue of research in geostatistics.
Theoretical developments and applications of geostatistical tools are being published in an ever-increasing variety of journals and congress proceedings. Yet, as of 1997, there was no textbook to provide students and practitioners with a comprehensive as well as practical overview of the ever-growing palette of geostatistical concepts and algorithms. Ten years after David's (1977) and Journel and Huijbregts' (1978) books on mining geostatistics, Isaaks and Srivastava (1989) wrote a remarkable introduction to applied geostatistics at the undergraduate level. Their focus was on exploratory data analysis and spatial prediction, with little discussion of more advanced topics, such as assessment of uncertainty and stochastic imaging. The guidebook associated with the geostatistical software library GSLIB (Deutsch and Journel, 1992a) gave a complete presentation of recent geostatistical developments, particularly in the area of conditional simulation, but it was not intended to be a theoretical reference textbook. This book aims at bridging the gap between Isaaks and Srivastava's introductory book and GSLIB's more complete user guide.
The main text of the book is divided into seven chapters, covering the most important areas of geostatistical methodology. The presentation follows the typical steps of a geostatistical analysis, introducing tools for description, quantitative modeling of spatial continuity, spatial prediction, and uncertainty assessment. To facilitate reading and as an attempt at standardization, this book uses the notation of the GSLIB guidebook.
The various tools are illustrated using a multivariate soil data set related to heavy metal contamination of a 14.5 km² region in the Swiss Jura. The geostatistical analysis was carried out using the GSLIB software. Although this data set gives the book a definite environmental flavor, presentation of the algorithms is general and intended for students and practitioners desiring to gain an understanding of the methodology. Mathematical developments underlying most interpolation algorithms are given; therefore, the reader should have some prior notions of linear algebra, in addition to an undergraduate knowledge of statistics. These theoretical developments may be skipped, however, on first reading, without altering comprehension of the case studies.

Figure 1.1: Map showing the split of the 359 data locations into a test set (closed circles) and a prediction set (open circles).

1.1 The Jura Data Set

The data used throughout this book were collected by the Swiss Federal Institute of Technology at Lausanne. A detailed description of the sampling, field, and laboratory procedures is given in Atteia et al. (1994) and Webster et al. (1994). Data were recorded at 359 locations scattered in space; see Figure 1.1. Concentrations of seven heavy metals (cadmium, cobalt, chromium, copper, nickel, lead, and zinc) in the topsoil were measured at each location. Geologic and land use maps provide exhaustive, albeit soft (indirect), categorical information related to metal content. This data set shares three features common to most earth science data sets: (1) data are auto- and cross-correlated in space; (2) several attributes are involved jointly; and (3) a few precise analytical measurements are supplemented by more numerous categorical data (soft information).

The large sample size allows the data to be divided into a validation set (100 test locations) and a prediction set (259 locations). The prediction set includes geology and land use, nickel and zinc concentrations at all 359 locations, and the concentrations of other heavy metals at 259 locations. The prediction set is considered to be the only information available for characterizing the entire study area. The validation set is used to check results provided by the various interpolation and simulation algorithms proposed.

A typical geostatistical analysis is conducted on the Jura data with the following objectives:

1. Describe the patterns of spatial dependence of heavy metals, and relate them to the distribution of potential sources, such as rock types and human activities (land use).

[...]

3. Estimate the metal concentrations at test locations.

4. Model the probability distributions of metal concentrations at test locations, and assess the risk of exceeding critical thresholds.

[...]

6. Model joint spatial uncertainty of metal concentrations through a set of alternative numerical models (stochastic imaging), and assess the risk involved in declaring the study area safe.

Prediction and classification of test locations are checked against the true values from the validation set.

1.2 Plan of the Book

The book starts with an exploratory analysis of the Jura data set in Chapter 2. The univariate distribution of categorical and continuous attributes and their relationships are first described ignoring data locations. Data locations are then considered for modeling spatial continuity and cross dependence between attributes.

Chapter 3 introduces the random function concept, which allows a probabilistic presentation of the various tools introduced in the previous chapter. Chapter 4 addresses the problem of inferring statistics that are representative of the study area and not only of the available sample. Theoretical and practical issues related to modeling experimental (cross) semivariograms are discussed.

The subsequent two chapters introduce the problem of interpolation and the multiple versions of the kriging (interpolation) paradigm. Chapter 5 presents kriging techniques that utilize only values of the attribute under study. Algorithms for incorporating secondary information are discussed in Chapter 6.

Chapter 7 introduces the Gaussian and indicator algorithms for modeling local probability distributions of either continuous or categorical attributes. The use of these models to assess the uncertainty about unknown values and determination of optimum estimates is discussed.

Chapter 8 presents algorithms to generate multiple realizations distributed in space of either continuous or categorical attributes. The simulation techniques presented include sequential Gaussian and indicator algorithms, the LU decomposition algorithm, probability field simulation, and simulated annealing. Different ways of summarizing and visualizing the spatial uncertainty model provided by the series of alternative realizations are reviewed.

1.3 Terminology

Although all statistical concepts and notations are defined in the text, and are summarized in Appendix B, a few terms used extensively throughout the book are now introduced.

• Attribute. Physical properties are called "attributes" and denoted by lowercase letters, such as z or s. Continuous attributes such as metal concentrations are measured on a continuous quantitative scale, whereas categorical attributes take only a limited number of states, usually non-ordered, e.g., rock types or land uses.

• Variable. The variable Z or S, denoted by capital letters, is defined as the set of possible values or states that the attribute z or s can take over the study area or at a location with coordinates vector u. In the latter case, the variable is denoted Z(u) or S(u).

• Individual. The attribute value is measured on a physical sample, such as a piece of rock or a core of soil taken from the field. In sections 2.1 and 2.2 where no account is taken of data locations, that physical sample is referred to as an individual. In subsequent chapters, each physical sample is associated to a precise location $\mathbf{u}_\alpha$ in the study area.

• Population. The population is defined as the set of all measurements of the attribute of interest that could be made over the study area. The finite collection of measurements available is referred to as a sample or sample set.

• Parameter. Parameters are constant (not random) quantities of a model, for example, the range parameter of a semivariogram model or the mean parameter of a lognormal probability distribution function modeling a histogram.

• Statistics. Statistics are quantities summarizing a distribution, which may involve several attributes and/or several locations in space. Univariate, bivariate, and multivariate statistics relate, respectively, to one, two, and multiple attributes. The terminology one-point, two-point, and multiple-point statistics is used when the variables relate to the same attribute at one, two, and multiple locations. For example, the correlation coefficient is a bivariate statistic, whereas the semivariogram is a two-point statistic. The cross semivariogram is a bivariate two-point statistic because it involves two different attributes at two different locations.
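
The distinction between a bivariate statistic and a two-point statistic can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the data are invented values on a regularly spaced one-dimensional transect, and the semivariogram is computed with its usual definition (half the average squared difference between values separated by a given lag), which is not spelled out in this chapter.

```python
import math


def correlation(x, y):
    # Bivariate statistic: two attributes measured on the same individuals.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


def semivariogram_1d(z, lag):
    # Two-point statistic: one attribute at pairs of locations `lag` steps
    # apart on a regular transect (lag must be smaller than len(z)).
    pairs = [(z[i], z[i + lag]) for i in range(len(z) - lag)]
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


# Invented concentrations along a transect (illustration only).
zn = [60.0, 64.0, 71.0, 69.0, 80.0, 85.0]
ni = [15.0, 17.0, 20.0, 19.0, 24.0, 25.0]

print("correlation Zn-Ni:", correlation(zn, ni))
print("semivariogram of Zn at lag 1:", semivariogram_1d(zn, 1))
```

The first function never looks at where the individuals are located; the second uses a single attribute but depends entirely on the separation between locations.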
Chapter 2

Exploratory Data Analysis

The objective of this chapter is to introduce the Jura prediction data set through its most salient features. This purely descriptive part is a preliminary step toward building a numerical and probabilistic model for uncertainty in spatial prediction. Although the ultimate goal of the study is characterization of the whole area, the set of measurements at all 359 sites is, at this time, considered to be an exhaustive population. Thus, all statistics computed from these data are considered to be exhaustive statistics or sample population parameters; hence the traditional ^ superscript (for estimation) is intentionally not used. In Chapter 4, the population is expanded to the entire study area, and the issue of inferring unknown population parameters from sample statistics is addressed.

The univariate (one attribute at a time) distributions of categorical and continuous variables are described in section 2.1. Section 2.2 looks at the joint relations between pairs of colocated metal concentrations. In section 2.3, the patterns of variation of metal concentrations are described and related to those of potential sources, such as rock types and land uses. Spatial relations between concentrations of different metals are analyzed in section 2.4. The main features of the Jura data set are summarized in section 2.5.

2.1 Univariate Description


This section begins with a straightforward description of the two categorical variables, land use and rock type. Subsequently, continuous variables (metal concentrations) are described, pooled over the entire data set, and then split according to rock type and land use.

2.1.1 Categorical variables

Let $\{s(\alpha), \alpha = 1, \ldots, n\}$ be the set of observations of the categorical attribute s measured on n individuals $\alpha$. The set of K possible states $s_k$ that any value $s(\alpha)$ can take is denoted by $\{s_1, \ldots, s_K\}$. For example, $s(\alpha) = s_k$ if the state $s_k$ is observed on the $\alpha$th individual. The K states are exhaustive and mutually exclusive in the sense that each individual belongs to one and only one state $s_k$. The distribution (histogram) of categorical data is completely described by a frequency table, which lists the K states and their frequency of occurrence. The frequency of occurrence of state $s_k$, denoted $f(s_k)$ or simply $p_k$, can be expressed as the arithmetic average of n indicator data:

$$f(s_k) = \frac{1}{n} \sum_{\alpha=1}^{n} i(\alpha; s_k) \tag{2.1}$$

where the indicator datum $i(\alpha; s_k)$ associated with the $\alpha$th individual is set to 1 if the state $s_k$ is observed and zero otherwise, that is,

$$i(\alpha; s_k) = \begin{cases} 1 & \text{if } s(\alpha) = s_k \\ 0 & \text{otherwise} \end{cases} \tag{2.2}$$

Table 2.1 gives the list of land uses and rock types in the study area and the corresponding sample proportions. Most of the sampled locations are under permanent grass, which is either grazed (64.1%) or cut for hay twice a year (21.2%); only 1.9% of these locations are cultivated and the remaining 12.8% is forest, mainly spruce (Picea abies). Apart from the Portlandian formation, which represents only 1.2% of the sites, the four other geologic formations are in fairly equal proportions.

Table 2.1: Frequency of occurrence of different land uses and rock types (n = 259). [Table body not recovered; surviving row labels: Forest, Pasture, Meadow, Tillage.]

In addition to the frequency of each state, it is worth knowing how often two states $s_k$ and $s'_{k'}$, corresponding to two different categorical attributes, jointly occur. For example, what is the proportion of sampled locations that are simultaneously under forest and in the Quaternary formation? This information is provided by the joint frequency of occurrence $f(s_k, s'_{k'})$, which can be expressed as the average of an indicator product:

$$f(s_k, s'_{k'}) = \frac{1}{n} \sum_{\alpha=1}^{n} i(\alpha; s_k) \cdot i(\alpha; s'_{k'}) \tag{2.3}$$

where $i(\alpha; s'_{k'})$ is defined as in equation (2.2).

Table 2.2 gives the joint frequencies of observations for all possible pairings of rock type and land use. Forests are preferentially located on Kimmeridgian rocks, whereas most pastures are on Kimmeridgian and Sequanian rocks. Meadow and tillage are equally represented on each geologic formation except Portlandian. Note that 13 out of 20 bivariate categories contain less than 10 individuals (frequency < 4%).

Table 2.2: Joint frequencies of occurrence of land uses and rock types. [Table body not recovered.]

2.1.2 Continuous variables

Frequency distribution

Let $\{z(\alpha), \alpha = 1, \ldots, n\}$ be the set of measurements of the continuous attribute z on the n individuals $\alpha$. Once again, the actual location of these data is ignored for now. The distribution of continuous values is typically depicted by a histogram with the range of data values discretized into a specific number of classes of equal width and the relative proportion of data within each class expressed by the height of bars. These relative proportions define the class frequencies, hence the histogram depicts the frequency distribution of z-values for a given definition of classes.

Figure 2.1 shows the seven histograms of metal concentrations expressed in parts per million (ppm; SI units = mg kg⁻¹). The long upper tails of the histograms of Cd, Cu, Pb, and Zn values indicate the presence of a few large concentrations. The other histograms are fairly symmetric, with the distribution of cobalt values being somewhat bimodal.
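
The frequency computations of sections 2.1.1 and 2.1.2 can be sketched in a few lines of Python. This is a minimal illustration, not the book's software: the function names and the tiny data set are hypothetical, and the equal-width class rule (maximum value assigned to the last class) is just one reasonable convention.

```python
def indicator(value, state):
    # Indicator datum: 1 if the state is observed on the individual, else 0.
    return 1 if value == state else 0


def frequency(observations, state):
    # Frequency of occurrence: arithmetic average of the n indicator data.
    return sum(indicator(s, state) for s in observations) / len(observations)


def joint_frequency(obs_a, obs_b, state_a, state_b):
    # Joint frequency: average of the product of two indicators.
    products = [indicator(a, state_a) * indicator(b, state_b)
                for a, b in zip(obs_a, obs_b)]
    return sum(products) / len(products)


def class_frequencies(values, n_classes):
    # Relative proportion of data within each of n_classes equal-width classes.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] + [0.0] * (n_classes - 1)
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    for v in values:
        k = min(int((v - lo) / width), n_classes - 1)  # maximum goes in last class
        counts[k] += 1
    return [c / len(values) for c in counts]


# Hypothetical observations on eight individuals (illustration only).
land_use = ["Forest", "Meadow", "Meadow", "Pasture",
            "Meadow", "Tillage", "Forest", "Meadow"]
rock = ["Kimmeridgian", "Argovian", "Sequanian", "Kimmeridgian",
        "Quaternary", "Argovian", "Kimmeridgian", "Sequanian"]

print(frequency(land_use, "Meadow"))
print(joint_frequency(land_use, rock, "Forest", "Kimmeridgian"))
print(class_frequencies([1.0, 2.0, 3.0, 4.0, 10.0], 3))
```

Because the states are exhaustive and mutually exclusive, the frequencies returned by `frequency` over all states sum to one, as do the class frequencies of the histogram.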
[...] These proportions can be computed for a series of threshold values $z_k$, resulting in the cumulative frequency distribution function $F(z_k)$ and its graphical representation, the cumulative histogram. Most often, the range of data values is not discretized, but the n values are ordered from smallest to largest, and each value $z(\alpha)$ is plotted versus the proportion of data that are less than it, $F(z(\alpha))$.

Figure 2.2 shows the cumulative distributions of metal concentrations. The vertical dashed line indicates, for each metal, the tolerable maximum for healthy soils, as defined by the Swiss Federal Office of Environment, Forests and Landscape (FOEFL, 1987); see Table 2.3 (page 15) for exact values. The percentage of data exceeding these critical thresholds is given at the top of each graph. These proportions are large for Cd (65.3%) and Pb (42.1%), and they are smaller for Cu (8.5%), Ni (0.3%), and Zn (0.6%). The concentrations of the other metals are below their tolerable maxima. The extent of the pollution is most important for those variables with an asymmetric distribution of values.
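
The cumulative frequency and the proportion of data exceeding a critical threshold can be sketched as follows. The data values are hypothetical; only the 50 ppm threshold for lead is taken from the text (Table 2.3). The sketch assumes the convention that $F(z_k)$ counts data no greater than $z_k$.

```python
def cumulative_frequency(values, threshold):
    # F(z_k): proportion of data that do not exceed the threshold z_k.
    return sum(1 for z in values if z <= threshold) / len(values)


def proportion_exceeding(values, threshold):
    # Complement of the cumulative frequency.
    return 1.0 - cumulative_frequency(values, threshold)


def empirical_cdf(values):
    # Order the n values and pair each with the proportion of data
    # that do not exceed it (no class discretization needed).
    ordered = sorted(values)
    n = len(ordered)
    return [(z, (rank + 1) / n) for rank, z in enumerate(ordered)]


# Hypothetical Pb concentrations in ppm (illustration only).
pb = [18.9, 30.0, 46.4, 50.0, 53.9, 229.6]

print(proportion_exceeding(pb, 50.0))
print(empirical_cdf(pb))
```

Plotting the pairs returned by `empirical_cdf` gives the cumulative distribution without any choice of classes, which is why it is usually preferred to the cumulative histogram.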

Figure 2.1: Histograms of metal concentrations.

Summary statistics

Important features of a distribution are its central value and measures of its spread and symmetry.

The central value of a distribution is usually taken as the arithmetic mean, defined as

$$m = \frac{1}{n} \sum_{\alpha=1}^{n} z(\alpha) \tag{2.6}$$

For highly asymmetric distributions, a more appropriate central value is the median, M, which is the value corresponding to a cumulative frequency of 0.5, i.e., the value that splits the distribution into two halves: lower-valued data and higher-valued data.

The p-quantile value of the distribution, noted $q_p$, is the value that a proportion p of the data does not exceed; i.e., $q_p$ is such that $F(q_p) = p$. The median is the 0.5 quantile of the distribution, $M = q_{0.5}$. Other quantiles of widespread use are the nine deciles that correspond to cumulative frequencies 0.1, 0.2, 0.3, ..., 0.9. The minimum and maximum values of the distribution define the range of variation of the variable.

² Throughout the book, these individuals or sites will be referred to as "contaminated" or "polluted," although the critical threshold may be exceeded as a result of rocks that are naturally rich in that metal, in the absence of any man-made pollution.

A measure of spread around the mean is the variance, defined as

$$\sigma^2 = \frac{1}{n} \sum_{\alpha=1}^{n} [z(\alpha) - m]^2 \tag{2.7}$$

The square root of the variance, $\sigma$, is called the standard deviation, and its ratio to the mean, $\sigma/m$, for non-negative variables, is the unit-free coefficient of variation.

A distribution is said to be symmetric if

$$M - q_p = q_{1-p} - M \qquad \forall\, p \in [0, 0.5] \tag{2.8}$$

Relation (2.8) entails that the mean and median of the distribution are equal, m = M. A measure of asymmetry is the coefficient of skewness, defined as

$$\varphi = \frac{1}{\sigma^3} \cdot \frac{1}{n} \sum_{\alpha=1}^{n} [z(\alpha) - m]^3 \tag{2.9}$$

For symmetric distributions, $\varphi$ is zero. If a distribution has a long tail of large values, then $\varphi$ is positive and the distribution is said to be positively skewed. On the other hand, a distribution with a long tail of small values has a negative skewness. For most practical purposes, only the skewness sign is of interest. A simpler measure of skewness would then be the difference between the mean and median of the distribution, $\varphi' = m - M$.
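
The summary statistics of this subsection can be sketched in Python as below. Several points are assumptions rather than the book's prescriptions: the data are hypothetical, the quantile uses one simple convention (the smallest datum whose cumulative frequency reaches p), the variance divides by n because the data are treated as an exhaustive population, and the skewness is the usual moment-based coefficient (third central moment divided by the cubed standard deviation).

```python
import math


def mean(values):
    return sum(values) / len(values)


def quantile(values, p):
    # Smallest datum q with F(q) >= p; other interpolation conventions exist.
    ordered = sorted(values)
    k = max(math.ceil(p * len(ordered) - 1e-12) - 1, 0)
    return ordered[k]


def variance(values):
    # Population variance: average squared deviation from the mean (divide by n).
    m = mean(values)
    return sum((z - m) ** 2 for z in values) / len(values)


def coefficient_of_variation(values):
    # Standard deviation over mean; meaningful for non-negative variables.
    return math.sqrt(variance(values)) / mean(values)


def skewness(values):
    # Third central moment divided by the cubed standard deviation.
    m = mean(values)
    sigma = math.sqrt(variance(values))
    return sum((z - m) ** 3 for z in values) / len(values) / sigma ** 3


sample = [18.9, 30.0, 46.4, 53.9, 229.6]  # hypothetical values, in ppm
median = quantile(sample, 0.5)
print("mean, median, mean - median:", mean(sample), median, mean(sample) - median)
print("variance:", variance(sample))
print("coef. of variation:", coefficient_of_variation(sample))
print("skewness:", skewness(sample))
```

For this invented sample the single large value pulls the mean well above the median, so both the skewness coefficient and the simpler difference between mean and median are positive.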
Table 2.3 lists, for each metal, the values of the main statistics. Note the departure between mean and median for the distributions of Cd, Cu, and Pb values, which are strongly positively skewed.

Table 2.3: Summary statistics for the prediction data set (units = ppm). [Only the Pb column is recoverable.]

                      Pb
n                    259
Mean                53.9
Median              46.4
Minimum             18.9
Maximum            229.6
Std. deviation      29.7
Coef. of var.       0.55
Skewness            2.85
Tolerable max.      50.0

Figure 2.2: Cumulative distributions of metal concentrations and proportions of data that exceed the tolerable maxima represented by the vertical dashed lines. For cobalt and chromium, the critical threshold is larger than the maximum observed.

Table 2.4: Average concentrations of seven heavy metals for each land use and rock type (ppm). [Only the Cd and Pb columns are recoverable.]

                  Cd      Pb
Land use
Forest          1.40    50.6
Pasture         2.0     61.8
Meadow          1.06    52.0
Tillage         0.96    52.6

Rock type
Argovian        1.14    41.3
Kimmeridgian    1.35    56.4
Sequanian       1.51    62.9
Portlandian     1.85    48.0
Quaternary      1.16    52.3

Table 2.5: Percentage of individuals that exceed tolerable maxima (Cd, Cu, or Pb) within each land use and rock type. The results for tillage and Portlandian rocks relate to less than 10 individuals (see Table 2.1). [Table body not recovered.]

[...] which the forest does not receive. The average concentrations of cobalt and nickel in the soil on Argovian formations are half those measured at the other formations.

Conditional cumulative frequencies

In addition to the average metal concentration within a category $s_{k'}$, it is worth knowing the corresponding proportion of data that are above or below the critical threshold. The proportion of data no greater than $z_k$ is given by the conditional cumulative frequency $F(z_k | s_{k'})$, which is computed as the average of an indicator product:

$$F(z_k | s_{k'}) = \frac{1}{n_{k'}} \sum_{\alpha=1}^{n} i(\alpha; z_k) \cdot i(\alpha; s_{k'})$$

where $n_{k'} = \sum_{\alpha=1}^{n} i(\alpha; s_{k'})$ is the number of individuals belonging to category $s_{k'}$.
Table 2.5 gives tlto conditional i,tvr.i:ntagcs of dat.;~l,l~;rl.eaccctl tolor;~l~ln
~ n a x i n ~fora the tlrree metals (Cd, Co, l'b) with widespread contaminalior~. I l ~ e11cx1.step i l l 1.11~: cxploratio~rof the J l ~ r adata c o ~ ~ s i sof
t s looking al. I.llc
There is no pollntion by copper under forest and tillage, and it is of snrall relation l~etwetmpairs of lnetal concentrations measoreil a t tlrc samr locn-
lions. 1"igure 2.2 (page 14) shows that a Iargf. proport,io~~ of d;rl.a excrrd 1.11~
extent on Argoviitn and Portlandiarr rocks. For cadmium and lead, the elfrct
tolerable m a x i ~ u afor Cd, ( h , or 1'11, so (,he hivariatc descriptiolr focustrs O B I
of land use is not as clear. 'I'hc proportiorr of Cd anrl PI] d a t a exceeding the
thc rclalimis i)clwecn thos<: t,l~rrrcn~et,alsanrl tlre two nlore dr~t~scly san~plcil
critical thresllold is snraller for t,l,e soils on Argovizn~rocks.
~ncl.als,nichcl ;rnd zinc.
Subdivision o/ Lhe dnla sel
T h e previous analysis of co~lditionaldistribuI.iol~sreveals t h a t s1,atisLics or
certain categories differ ill terms of average n ~ c t a conce~~tr;it,ions
l ; t t ~ r lpropor-
tions above critical tl~resholds.This raises quesI.ions about the prior dccisio~r
of pooling all the dala together and raises lhc possibility of splitting the d;tl.;t
one another. Figure 2.3 shows the scattergrams of nickel and zinc values versus the concentrations of Cd, Cu, and Pb. The latter three metals are related to zinc: larger concentrations of each metal tend to be associated with larger concentrations of zinc. Cadmium and nickel concentrations are also positively correlated.

Figure 2.3: Scattergrams of the two exhaustively sampled metals (Ni, Zn) versus the three metals (Cd, Cu, Pb) with widespread contamination.

2.2.2 Measures of bivariate relation

As for the univariate case, one is interested in statistics that summarize the main features of the bivariate relation. The most frequently used statistics are the covariance and its standardized form, the linear correlation coefficient.

The covariance σ_ij is a measure of the joint variation of Z_i and Z_j around their means. It is computed as

\[
\sigma_{ij} \;=\; \frac{1}{n} \sum_{\alpha=1}^{n} \left[ z_{i\alpha} - m_i \right] \cdot \left[ z_{j\alpha} - m_j \right]
\]

where m_i and m_j are the arithmetic means of variables Z_i and Z_j, respectively. The covariance becomes the variance if i = j, see relation (2.7).

The correlation coefficient ρ_ij is readily deduced as

\[
\rho_{ij} \;=\; \frac{\sigma_{ij}}{\sigma_i \, \sigma_j} \;\in\; [-1, +1]
\]

where σ_i and σ_j are the standard deviations of Z_i and Z_j, respectively. The unit-free correlation coefficient is easier to interpret than the covariance, which depends on the measurement scales of the two variables.

The following are guidelines for using the linear correlation coefficient to quantify dependence between two variables:

1. The quantity ρ_ij provides a measure only of linear relation between two variables. It can be misleading if interpreted otherwise (see discussion in section 9.2.2). Two variables may be highly dependent and yet have zero linear correlation: a classic example is Z_i = [Z_j]², where Z_j has a distribution symmetric about zero.

2. Like the variance, the correlation coefficient is strongly affected by extreme values. A more robust measure is the rank correlation coefficient, which considers the ranks of the data, r_iα and r_jα, rather than the original values:

\[
\rho^{R}_{ij} \;=\; \frac{1}{n} \sum_{\alpha=1}^{n} \frac{\left[ r_{i\alpha} - m_{R_i} \right] \cdot \left[ r_{j\alpha} - m_{R_j} \right]}{\sigma_{R_i} \, \sigma_{R_j}} \qquad (2.13)
\]

where m_Ri and σ_Ri are the mean and standard deviation of the ranks of variable Z_i.

3. When dealing with N_v variables, N_v(N_v - 1)/2 scattergrams can be drawn. The user may be tempted to bypass the cumbersome plotting and description of too many scattergrams and focus on the linear correlation coefficients alone. However, remember that the correlation coefficient extracts only a small part of the information provided by the scattergram.
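The guidelines above can be illustrated with a minimal sketch. The function names and the small data arrays are assumptions made for illustration only (they are not Jura data); the covariance uses the 1/n convention of the text. The first example reproduces the case Z_i = [Z_j]² with zero linear correlation, and the second shows how a single extreme value depresses the linear coefficient much more than the rank coefficient.

```python
import numpy as np

def linear_correlation(zi, zj):
    """Linear correlation: covariance (1/n convention) over the product of std. deviations."""
    zi, zj = np.asarray(zi, float), np.asarray(zj, float)
    cov = np.mean((zi - zi.mean()) * (zj - zj.mean()))
    return cov / (zi.std() * zj.std())

def ranks(z):
    """Ranks 1..n of the data; tied values share their average rank."""
    z = np.asarray(z, float)
    order = np.argsort(z, kind="mergesort")
    r = np.empty(len(z))
    r[order] = np.arange(1, len(z) + 1)
    for v in np.unique(z):
        tied = z == v
        r[tied] = r[tied].mean()
    return r

def rank_correlation(zi, zj):
    """Rank correlation: linear correlation applied to the ranks."""
    return linear_correlation(ranks(zi), ranks(zj))

# Guideline 1: perfectly dependent variables with zero linear correlation.
zj = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # symmetric about zero (hypothetical values)
zi = zj ** 2
rho_square = linear_correlation(zi, zj)

# Guideline 2: one extreme value (100) affects the linear coefficient far more.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])    # hypothetical values
y = np.array([1.0, 2.0, 3.0, 5.0, 4.0])
rho_lin = linear_correlation(x, y)
rho_rank = rank_correlation(x, y)
```

Computing the rank coefficient as the linear coefficient of the ranks keeps the two measures directly comparable, which is the comparison made between them for the heavy metal pairs.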
Table 2.6: Matrix of linear correlation coefficients.

Table 2.6 gives the linear correlation coefficients computed among the seven heavy metals from the same set of 259 individuals. The strongest correlations (ρ > 0.70) are for the pairs Cu-Pb and Co-Ni. Figure 2.4 is a scattergram of the linear versus the rank correlation coefficient for the 21 pairs of heavy metals. Both measures are similar, which indicates that extreme values do not greatly affect the linear correlation coefficients.

Figure 2.4: Scattergram of the rank correlation coefficient versus the linear correlation coefficient for the 21 pairs of heavy metals.

2.3 Univariate Spatial Description

To account for data locations, measurements of continuous and categorical attributes are denoted z(u_α) and s(u_α), where u_α is the vector of spatial coordinates of the αth individual. For example, s(u_α) = s_k if category s_k is observed at location u_α. The objective is to describe and quantify the relation between measurements of the same attribute at any two data locations u_α and u_β.

2.3.1 Location maps

Any spatial analysis should start with a posting of data values. Figures 2.5 and 2.6 show the location maps of the sample data, with darker shading indicating larger metal concentrations. All metal concentrations were measured at 259 locations, with Ni and Zn concentrations being recorded at 100 additional locations (Figure 2.6, bottom maps).

The data configuration resulted from a combination of regular and nested sampling schemes (Webster et al., 1994). The basic grid is a square mesh with 107 grid nodes at intervals of 250 m. Out of these 107 grid nodes, 38 were randomly selected so that the proportion of the 38 nodes belonging to rock type s_k corresponds to the proportion of the study area covered by that rock type. Starting from each of these 38 nodes, the surveyors chose a first location 100 m away. From that location they chose a second location 40 m away; from that second location, a third location was chosen 16 m away; and finally from that third location, a fourth location was chosen 6 m away. The distances were fixed but the directions were random. The objective of this nested sampling was to cover a wide range of spatial scales between 0 and 250 m.

In section 2.1.2, many sites were found to exceed the tolerable maximum. The indicator maps in Figure 2.7, which show the contaminated locations in black, complete the description. The density of black dots reflects the greater extent of contamination by cadmium and lead. The pattern of spatial distribution of contaminated sites is also informative. The black dots in Figure 2.7 are not randomly distributed: they tend to cluster. Such clustering could reflect potential sources, such as originating from the rock or man-made pollution. In general, observations that are close to each other on the ground are also alike in metal concentrations.

[Figures 2.5 to 2.7: maps with panels for Cd, Cu, Pb, Co, Cr, Ni, and Zn data.]

2.3.2 The h-scattergram

Spatial relations between data can be displayed using an h-scattergram. Just as the scattergram is a plot of all pairs of values related to two different attributes measured at the same location, the h-scattergram is a plot of all pairs of measurements (z(u_α), z(u_α + h)) on the same attribute z at locations separated by a given distance |h| in a particular direction θ. The vector notation h accounts for both distance and direction. By convention, the value at the start of the vector h, z(u_α), is called the tail value, whereas the value at the end, z(u_α + h), is the head value.
Data pairs are typically grouped into classes of distances (lags) and angles, [h ± Δh] and [θ ± Δθ], so that each h-scattergram is built on a sufficient number of pairs. For example, consider the relation between Cd values in the east-west (E-W) direction. Figure 2.8 shows the h-scattergrams of Cd values in the easterly direction (Δθ = 22.5°) for six different classes of distances (Δh = 100 m). The abscissa corresponds to tail values and the ordinate to head values. Because the two axes relate to the same variable, a perfect correlation would entail that all points lie on the first bisector (45° dashed line). The spread of the cloud of points around this 45° line reflects the variability between data values. The increasing inflation of the cloud with increasing separation distance |h| reflects the increasing dissimilarity between measurements farther apart.
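As a rough illustration of how the (tail, head) pairs of an h-scattergram might be collected, the sketch below scans all ordered pairs of data and keeps those whose separation falls inside the distance and angle classes. The function name, its parameters, and the convention that the azimuth is measured clockwise from north are assumptions made here, not definitions taken from the text; the three collinear points are hypothetical.

```python
import numpy as np

def h_scattergram_pairs(coords, z, lag, lag_tol, azimuth_deg, ang_tol_deg):
    """Return (tail, head) values for pairs separated by lag +/- lag_tol
    in direction azimuth_deg +/- ang_tol_deg (azimuth clockwise from north, assumed)."""
    coords = np.asarray(coords, float)
    z = np.asarray(z, float)
    tails, heads = [], []
    n = len(z)
    for a in range(n):            # tail location u_alpha
        for b in range(n):        # candidate head location u_alpha + h
            if a == b:
                continue
            dx, dy = coords[b] - coords[a]
            if abs(np.hypot(dx, dy) - lag) > lag_tol:
                continue
            az = np.degrees(np.arctan2(dx, dy)) % 360.0
            # smallest angular deviation between the pair azimuth and the target azimuth
            dev = abs((az - azimuth_deg + 180.0) % 360.0 - 180.0)
            if dev <= ang_tol_deg:
                tails.append(z[a])
                heads.append(z[b])
    return np.array(tails), np.array(heads)

# Three hypothetical locations on an east-west line, 100 m apart.
coords = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0)]
values = [1.0, 2.0, 4.0]
tail, head = h_scattergram_pairs(coords, values, lag=100.0, lag_tol=50.0,
                                 azimuth_deg=90.0, ang_tol_deg=22.5)
```

Because the pairs are ordered, the same call with azimuth 270° returns the westerly pairs, with tail and head values swapped. The double loop is quadratic in the number of data, which is acceptable for a few hundred locations.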
The h-scattergram also draws attention to possible outliers or misrecorded values. Isolated pairs on the h-scattergram should be identified, located on the map, and then analyzed more carefully. For small separation distances, outlier pairs involving the same location, say, u_α, indicate that the datum z(u_α) is very unlike its neighbors. For example, in Figure 2.8, a few pairs fall far from the 45° line on the h-scattergram for |h| = 214 m. Three of these pairs involve the same location with a Cd concentration of 4.50 ppm; the second measurements of these pairs are 0.40, 0.77, and 1.43 ppm. If there is a sound physical reason for considering the extreme value (4.50 ppm) as erroneous, then it should be removed from the data set. Such a decision should not, however, be based on a single lag h, since the same datum may not yield outlier pairs for other lags h or in other directions.

Distinct clouds of points on an h-scattergram may indicate the presence of separate populations with different spatial continuity. The data could be split into more homogeneous subsets if sufficient data are available within each subset.

[Figure 2.8: h-scattergrams of Cd values; surviving panel labels |h| = 388 m, 616 m, 787 m, and 1024 m; abscissa: tail value (ppm).]
2.3.3 Measures of spatial continuity and variability

Each h-scattergram displays the relation between pairs of z-values for a given class of distance and direction. The similarity or dissimilarity between data separated by a vector h can be quantified by several measures (Deutsch and Journel, 1992a, p. 40).

Covariance function

The covariance and correlation coefficient introduced in section 2.2.2 can be extended to measure similarity between non-colocated data. The covariance between data values separated by a vector h is computed as follows:

\[
C(\mathbf{h}) \;=\; \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z(\mathbf{u}_\alpha) \cdot z(\mathbf{u}_\alpha + \mathbf{h}) \;-\; m_{-\mathbf{h}} \cdot m_{+\mathbf{h}} \qquad (2.14)
\]

with

\[
m_{-\mathbf{h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z(\mathbf{u}_\alpha), \qquad
m_{+\mathbf{h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z(\mathbf{u}_\alpha + \mathbf{h})
\]

where N(h) is the number of data pairs within the class of distance and direction, and m_{-h} and m_{+h} are the means of the corresponding tail and head values (lag means). The covariance can be computed for different lags h_1, h_2, ... and the ordered set of covariances C(h_1), C(h_2), ... is called the experimental autocovariance function or, simply, the experimental covariance function.

Correlogram

The correlogram is the standardized form of the covariance function:

\[
\rho(\mathbf{h}) \;=\; \frac{C(\mathbf{h})}{\sqrt{\sigma^2_{-\mathbf{h}} \cdot \sigma^2_{+\mathbf{h}}}} \;\in\; [-1, +1] \qquad (2.15)
\]

with

\[
\sigma^2_{-\mathbf{h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha) - m_{-\mathbf{h}} \right]^2, \qquad
\sigma^2_{+\mathbf{h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha + \mathbf{h}) - m_{+\mathbf{h}} \right]^2
\]

where σ²_{-h} and σ²_{+h} are the variances of the tail and head values (lag variances). The ordered set of correlation coefficients ρ(h_1), ρ(h_2), ... is called the experimental autocorrelation function or correlogram.

Semivariogram

Unlike the covariance and correlation functions, which are measures of similarity, the experimental semivariogram γ(h) measures the average dissimilarity between data separated by a vector h. It is computed as half the average squared difference between the components of every data pair:

\[
\gamma(\mathbf{h}) \;=\; \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h}) \right]^2 \qquad (2.16)
\]

The semivariogram value at a given lag h, sometimes called semivariance, can be interpreted as the moment of inertia of the h-scattergram about its first bisector (Figure 2.9). Indeed, the orthogonal distance of any point (z(u_α), z(u_α + h)) to the first bisector is d_α = |z(u_α) - z(u_α + h)| · cos 45°. The moment of inertia about the 45° line is the average of all such squared distances:

\[
\frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} d_\alpha^2 \;=\; \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h}) \right]^2 \;=\; \gamma(\mathbf{h}) \qquad (2.17)
\]

Hence, the semivariogram value increases as the points spread out farther from the first bisector of the h-scattergram.

Figure 2.9: Interpretation of the semivariogram value γ(h) as the moment of inertia of the h-scattergram around the first bisector. γ(h) is the average of all squared orthogonal distances d_α to that bisector.

In Figure 2.10, the three top graphs show, respectively, the experimental covariance function, correlogram, and semivariogram of cadmium computed in the easterly direction using an angular tolerance of 22.5° and a distance tolerance of 100 m. The decreasing behavior of the covariance function and correlogram reflects the decreasing similarity of data values as the separation distance |h| increases.
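The three measures of this section, and the moment-of-inertia reading of the semivariogram, can be checked numerically on one lag class. The following is a minimal sketch: the function name is an assumption, and the four tail/head pairs are made-up numbers rather than Jura measurements.

```python
import numpy as np

def lag_statistics(tail, head):
    """Covariance, correlogram, and semivariogram values for one lag class,
    given the tail values z(u) and head values z(u + h) of the data pairs."""
    tail = np.asarray(tail, float)
    head = np.asarray(head, float)
    m_tail, m_head = tail.mean(), head.mean()           # lag means
    cov = np.mean(tail * head) - m_tail * m_head         # covariance C(h)
    rho = cov / np.sqrt(tail.var() * head.var())         # correlogram value
    gamma = 0.5 * np.mean((tail - head) ** 2)            # semivariogram value
    # Moment of inertia about the 45-degree line: mean squared orthogonal distance.
    d = np.abs(tail - head) * np.cos(np.radians(45.0))
    inertia = np.mean(d ** 2)
    return cov, rho, gamma, inertia

tail = np.array([0.4, 1.2, 0.8, 2.0])   # hypothetical tail values (ppm)
head = np.array([0.6, 1.0, 1.1, 1.5])   # hypothetical head values (ppm)
cov, rho, gamma, inertia = lag_statistics(tail, head)
```

Since cos² 45° = 1/2, the inertia and the semivariogram value agree up to rounding. Repeating the call for each lag class yields the experimental covariance function, correlogram, and semivariogram.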

Figure 2.10: Experimental covariance functions (left column), correlograms (center column), and semivariograms (right column) of cadmium in the easterly direction. The three top graphs are computed from the original values, whereas the other graphs relate to the indicator data displayed in Figure 2.11. The indicator semivariogram values are standardized by the variance of the indicator data.

Remarks

1. The covariance and semivariogram values computed in opposite directions are identical, i.e., C(h) = C(-h) and γ(h) = γ(-h), since the corresponding h-scattergram and -h-scattergram are symmetric with respect to the first bisector.

2. Setting the angular tolerance Δθ to 90° amounts to pooling the data pairs in all directions. The resulting covariance function or semivariogram is called omnidirectional, whereas the term directional is used whenever Δθ < 90°.

3. Like other variance-type statistics, the values of the covariance or semivariogram are sensitive to extreme data values. The following are ways to handle the problem of robustness:

(a) Transform the data (see section 2.1.2) to reduce the skewness of their histograms.

(b) Use other summary statistics of the h-scattergram that are less sensitive to extreme values. The sensitivity of the semivariogram to extreme z-values comes from the squaring of h-increments, see equation (2.16). A more general measure of the spatial variability is the variogram of order ω, defined as the mean absolute deviation to the power ω:

\[
\gamma_\omega(\mathbf{h}) \;=\; \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left| z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h}) \right|^{\omega} \quad \text{with } \omega \in [0, 2] \qquad (2.18)
\]

For ω = 2, one retrieves the traditional semivariogram γ(h). The smaller ω, the lesser the influence of extreme values on the measure γ_ω(h). Two commonly used measures are the madogram (ω = 1) and rodogram (ω = 1/2) (Deutsch and Journel, 1992a, p. 41). These measures are not substitutes for the traditional semivariogram; rather, they should be used to infer features, such as range and anisotropy (see related discussion in section 4.2.4).
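To make the effect of the power ω concrete, the sketch below evaluates the variogram of order ω on a lag class with and without three outlier pairs. The outlier pairs reuse the Cd values quoted for the h-scattergram at |h| = 214 m (4.50 ppm against 0.40, 0.77, and 1.43 ppm); the four "ordinary" pairs and the function name are assumptions added for illustration.

```python
import numpy as np

def variogram_of_order(tail, head, omega=2.0):
    """Half the mean absolute h-increment raised to the power omega.
    omega = 2: semivariogram, omega = 1: madogram, omega = 0.5: rodogram."""
    inc = np.abs(np.asarray(tail, float) - np.asarray(head, float))
    return 0.5 * np.mean(inc ** omega)

# Hypothetical ordinary pairs, plus three pairs sharing the extreme datum 4.50 ppm.
tail_ord = np.array([0.5, 0.9, 1.1, 1.6])
head_ord = np.array([0.7, 0.8, 1.4, 1.3])
tail_all = np.concatenate([tail_ord, [4.50, 4.50, 4.50]])
head_all = np.concatenate([head_ord, [0.40, 0.77, 1.43]])

# Inflation of each measure caused by the outlier pairs.
inflation = {
    w: variogram_of_order(tail_all, head_all, w) / variogram_of_order(tail_ord, head_ord, w)
    for w in (2.0, 1.0, 0.5)
}
```

With these numbers the semivariogram is inflated by roughly two orders of magnitude, the madogram by less than ten, and the rodogram by less than three, which matches the statement that a smaller ω reduces the influence of extreme values.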
Indicator covariance function

The experimental indicator covariance at a given lag h is computed as

\[
C_I(\mathbf{h}; z_k) \;=\; \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} i(\mathbf{u}_\alpha; z_k) \cdot i(\mathbf{u}_\alpha + \mathbf{h}; z_k) \;-\; F_{-\mathbf{h}}(z_k) \cdot F_{+\mathbf{h}}(z_k)
\]

where F_{-h}(z_k) and F_{+h}(z_k) are the proportions of tail and head values not exceeding the threshold value z_k.

The indicator covariance C_I(h; z_k) appears as the "centering" of the two-point cumulative frequency F(h; z_k), which is the first term of the expression above. That frequency measures how often two values of the same attribute z separated by a vector h are jointly no greater than the threshold value z_k. For a small threshold value z_k, F(h; z_k) measures the connectivity between small values separated by a vector h: the larger F(h; z_k), the better connected in space are the small z-values. The spatial connectivity of large values (e.g., values greater than a large threshold z_k') can be measured by F(h; z_k'), where the indicator transform is now j(u_α; z_k') = 1 - i(u_α; z_k'). Note that C_J(h; z_k) = C_I(h; z_k), ∀ z_k.

Indicator correlogram

The indicator correlogram is the standardized form of the previous indicator covariance function:

\[
\rho_I(\mathbf{h}; z_k) \;=\; \frac{C_I(\mathbf{h}; z_k)}{\sqrt{\sigma^2_{-\mathbf{h}}(z_k) \cdot \sigma^2_{+\mathbf{h}}(z_k)}}
\]

where σ²_{-h}(z_k) = F_{-h}(z_k)[1 - F_{-h}(z_k)] is the variance of the tail indicator values and σ²_{+h}(z_k) = F_{+h}(z_k)[1 - F_{+h}(z_k)] is the variance of the head indicator values.

Indicator semivariogram

\[
\gamma_I(\mathbf{h}; z_k) \;=\; \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ i(\mathbf{u}_\alpha; z_k) - i(\mathbf{u}_\alpha + \mathbf{h}; z_k) \right]^2
\]

The indicator variogram value 2γ_I(h; z_k) measures how often two z-values separated by a vector h are on opposite sides of the threshold value z_k. In other words, 2γ_I(h; z_k) measures the transition frequency between two classes of z-values as a function of h. Unlike the indicator covariance function, the greater γ_I(h; z_k) or γ_I(h; z_k'), the less connected in space are the small (z(u_α) ≤ z_k) or large values (z(u_α) > z_k'). Similar to the indicator covariance, γ_J(h; z_k) = γ_I(h; z_k), for j(u_α; z_k) = 1 - i(u_α; z_k).

Graphical interpretation

Figure 2.12: Graphical interpretation of the indicator spatial statistics. The non-centered indicator covariance F(h; z_k) and the indicator variogram value 2γ_I(h; z_k) are the proportions of points falling in the horizontal and vertical hatched areas of the h-scattergram, respectively.
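The indicator statistics described in this section can be sketched as follows. The function names, the threshold of 0.58 ppm, and the six tail/head pairs are illustrative assumptions. The sketch also checks numerically that the complementary coding j = 1 - i yields the same indicator covariance and semivariogram values.

```python
import numpy as np

def indicator(z, zk):
    """Indicator coding: 1 if the value does not exceed the threshold zk, 0 otherwise."""
    return (np.asarray(z, float) <= zk).astype(float)

def indicator_lag_statistics(i_tail, i_head):
    """Two-point cumulative frequency, indicator covariance, and indicator
    semivariogram for one lag class, from tail and head indicator values."""
    f_joint = np.mean(i_tail * i_head)                  # non-centered covariance F(h; zk)
    cov = f_joint - i_tail.mean() * i_head.mean()       # centered by the lag proportions
    gamma = 0.5 * np.mean((i_tail - i_head) ** 2)       # indicator semivariogram
    return f_joint, cov, gamma

zk = 0.58                                               # illustrative threshold (ppm)
tail = np.array([0.3, 0.5, 0.9, 1.4, 0.2, 2.0])         # hypothetical tail values
head = np.array([0.4, 0.7, 0.5, 1.8, 0.9, 1.1])         # hypothetical head values

i_tail, i_head = indicator(tail, zk), indicator(head, zk)
f_joint, cov_i, gamma_i = indicator_lag_statistics(i_tail, i_head)

# Complementary coding j = 1 - i (values exceeding the threshold).
_, cov_j, gamma_j = indicator_lag_statistics(1.0 - i_tail, 1.0 - i_head)

# Proportion of pairs whose two values lie on opposite sides of the threshold.
transition_freq = np.mean(i_tail != i_head)
```

Here three of the six pairs straddle the threshold, so the indicator variogram value 2γ_I equals the transition frequency of 0.5.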
2. Typically, experimental indicator semivariograms at extreme threshold values tend to be more erratic than those at the median threshold value. Indeed, for such extreme thresholds, the indicator semivariogram value depends on the spatial distribution of the few data pairs where the two z-values are on opposite sides of the threshold z_k.

3. Unlike the z-covariance or z-semivariogram, the indicator statistics are not affected by extreme values, since only the position of the data with respect to the threshold value z_k is considered. The indicator semivariogram at the median threshold value, in particular, may be used to detect patterns of spatial continuity whenever extreme-valued data render the traditional semivariogram γ(h) erratic.

4. The sill of the experimental indicator semivariogram γ_I(h; z_k) is roughly equal to the indicator variance, F(z_k)[1 - F(z_k)], where F(z_k) is the mean of the indicator data i(u_α; z_k). When comparing indicator semivariograms at different threshold values, it is good practice to standardize their sills to one by dividing the semivariogram values by the indicator variance.

2.3.5 Spatial continuity of metal concentrations

The spatial analysis is now extended to all directions and metals, and the corresponding patterns of spatial variation are interpreted in relation to geology and land use. The following discussion focuses on the semivariogram, which is most frequently used.

Figure 2.13: Experimental semivariograms for the seven heavy metals in four directions ( : 22.5°, : 67.5°, - - - : 112.5°, . . . : 157.5°; Δθ = 22.5°).

available for each lag. Cobalt concentrations, however, appear to vary more continuously (smaller semivariogram values) in the 67.5° direction (SW-NE).

Sensitivity to extreme values

Figure 2.14 (solid line) shows the experimental omnidirectional semivariograms of metal concentrations. For the four positively skewed variables (Cd, Cu, Pb, Zn), the semivariogram is also computed on the logarithm transform and plotted on the same graph after rescaling to the z-variance (dashed line). The semivariograms of the logarithms appear slightly less erratic with a smaller nugget effect. The differences are, however, small and do not justify the log-transformation.

Interpreting patterns of spatial variation

The semivariograms of all attribute values have a small nugget effect, from 10 to 30% of the total variance (Figure 2.14, solid line). Two scales of spatial variation can be distinguished from inflexions of the experimental curves: a local scale (range ≈ 200 m) and a regional scale (range ≈ 1 km). The shape of Ni and Co semivariograms is dominated by the long-range structure, whereas the short-range structure is the major component for the other metals. The semivariogram of Zn combines the two structures, with variances in approximately equal proportions.

The long-range structure of the semivariograms of Ni and Co concentrations is probably related to the control exerted by rock type, more precisely Argovian rocks (recall discussion related to Table 2.4). The concentrations of the other metals that are more influenced by land use show mainly the short-range variation. This suggests that the long-range structure relates to regional changes in geology (rock type) and the short-range structure to the distribution of sources of man-made contaminants.

Figure 2.15 shows the positions of the 359 sampling sites superimposed on the land use and geologic maps. These maps have been created by allocating unsampled locations to the land use or rock type of the nearest datum. The scatter of farmland and pastures contrasts with the greater continuity of geologic formations preferentially oriented SW-NE. Along that direction, note the grouping of forest soils in the eastern part of the study area.

[Figure 2.14: panels for Cd, Cu, Pb, Co, and Cr data; horizontal axis: distance (km).]

Indicator semivariograms for rock types and land uses

Whenever the average value of an attribute z is very different from its average within a particular category s_k, the geometric layout of that category controls the shape and anisotropy of the z-semivariogram. The pattern of continuity (variability) of a category s_k can be characterized by semivariograms computed on an indicator coding of the presence/absence of that category. Define

\[
i(\mathbf{u}_\alpha; s_k) \;=\;
\begin{cases}
1 & \text{if } s(\mathbf{u}_\alpha) = s_k \\
0 & \text{otherwise}
\end{cases}
\]
[Figure 2.15: maps of land use (tillage, meadow, pasture, forest) and geology (Quaternary, Portlandian, Sequanian, Kimmeridgian, Argovian).]

Figure 2.16: Indicator maps showing, in black, the locations belonging to a particular rock type (panels: Argovian, Kimmeridgian, Sequanian, Quaternary).

The four indicator maps of Figure 2.16 show, in black, the locations where a particular rock type prevails, i.e., locations where i(u_α; s_k) = 1.

The indicator semivariogram for category s_k is then computed as

\[
\gamma_I(\mathbf{h}; s_k) \;=\; \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ i(\mathbf{u}_\alpha; s_k) - i(\mathbf{u}_\alpha + \mathbf{h}; s_k) \right]^2
\]

The indicator variogram value 2γ_I(h; s_k) measures how often two locations a vector h apart belong to different categories s_k' ≠ s_k. The smaller 2γ_I(h; s_k), the better the spatial connectivity of category s_k. The ranges and shapes of the directional indicator semivariograms reflect the geometric patterns of category s_k.

Figure 2.17 shows the indicator semivariograms of the most common land uses and rock types computed in four directions with an angular tolerance of 22.5°, with the following results:

• For all rock types, the indicator semivariogram value equals zero at the first lag, which means that any two data locations less than 100 m apart belong to the same formation.

• For Argovian and Sequanian rocks, the longer SW-NE range (larger dashed line) reflects the corresponding preferential orientation of these two lithologic formations (Figure 2.16).

• The semivariograms for forest soils and Kimmeridgian rocks also show a better SW-NE continuity, though less pronounced.

• The indicator semivariograms for pastures and meadow are similar in all directions (isotropy) and have a shorter range than for forests.
The indicator structural analysis confirms the influence of Argovian rocks on the variability of cobalt concentrations. The indicator semivariogram of Argovian rocks and the semivariogram of cobalt concentrations share the same long-range structure and a better spatial continuity in the SW-NE direction (compare Figures 2.13 and 2.17). Such long-range anisotropy is not apparent for the Ni semivariogram, although, like cobalt, the average concentrations in nickel on Argovian rocks are half those measured on the other rocks. The direction-independent and mostly short-range variability of other metal concentrations matches that of meadow and pastures.

Figure 2.17: Experimental indicator semivariograms of the most common land uses and rock types in four directions (22.5°, 67.5°, 112.5°, ...). Surviving panel titles: Forest, Meadow, Kimmeridgian, Sequanian, Quaternary.

Semivariograms of residuals

When the pattern of variation of attribute z results from large differences in average z-values between categories s_k, filtering such differences should affect the shape of the z-semivariogram. Thus, one approach for interpreting the pattern of spatial variation of attribute z consists of

1. subtracting from each datum value z(u_α) belonging to category s_k = s(u_α) the average z-value within s_k, that is, the conditional mean m|s_k,

2. computing the semivariogram γ_R(h) of the residuals r(u_α) = z(u_α) - m|s_k, and

3. comparing the sill and shape of semivariograms for original attribute values and for residuals.

Figure 2.18 shows the semivariograms of Ni and Co concentrations before and after filtering the conditional means for land uses and rock types given in Table 2.4 (page 18). For both metals, the conditional means are fairly constant from one land use to another. Hence, subtracting the conditional means does not affect Co and Ni semivariograms: the semivariogram of residuals (large dashed line) is close to the original semivariogram (solid line).
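Step 1 of the residual approach described in this section, the subtraction of conditional means, can be sketched as below. The function name and the four data values with their category labels are hypothetical; the semivariogram of the residuals (step 2) would then be computed on the returned array exactly as for the original values.

```python
import numpy as np

def residuals_by_category(z, s):
    """Subtract from each datum the mean of the data sharing its category
    (the conditional mean), returning the residuals r = z - m|s_k."""
    z = np.asarray(z, float)
    s = np.asarray(s)
    r = np.empty_like(z)
    for category in np.unique(s):
        mask = s == category
        r[mask] = z[mask] - z[mask].mean()
    return r

# Hypothetical concentrations with two categories of very different means.
z = np.array([10.0, 12.0, 20.0, 22.0])
s = np.array(["Argovian", "Argovian", "Sequanian", "Sequanian"])
r = residuals_by_category(z, s)

# Share of the total variance removed by filtering the conditional means.
variance_removed = 1.0 - r.var() / z.var()
```

In this toy case most of the variance comes from the difference between the two category means, so filtering removes nearly all of it. When the conditional means are close to each other, as reported for land use, the residuals keep almost the whole variance and their semivariogram stays close to the original one.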
2.4 Bivariate Spatial Description

The next step in the spatial exploration of data consists of looking at the cross dependence between measurements of different attributes.

2.4.1 The cross h-scattergram

A cross h-scattergram is an h-scattergram where the tail and head values relate to two different attributes z_i and z_j. Figure 2.20 shows the cross h-scattergram between Ni and Cd concentrations in the E-W direction for four different classes of distances. The lag tolerance is Δh = 100 m.

At |h| = 0, the cross h-scattergram is the traditional scattergram of colocated values (Figure 2.3, page 20, top left graph). For each lag |h| ≠ 0, two cross h-scattergrams can be drawn, depending on what the head and tail attributes are: (1) cadmium and nickel or (2) nickel and cadmium. The left cross h-scattergrams show the similarity between a Ni value and the Cd value east of it, whereas the right cross h-scattergrams show the similarity between a Ni value and the Cd value west of it.

As in the univariate case, the increasing inflation of the cloud with increasing lag h indicates that the relation between Ni and Cd concentrations weakens as the separation distance increases. However, because the two variables are different, the inflation cannot be measured around the 45° line.

2.4.2 Measures of spatial cross continuity/variability

The covariance and correlogram measures are readily extended to the case where head and tail values relate to two different attributes.

Cross covariance function

\[
C_{ij}(\mathbf{h}) \;=\; \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z_i(\mathbf{u}_\alpha) \cdot z_j(\mathbf{u}_\alpha + \mathbf{h}) \;-\; m_{i_{-\mathbf{h}}} \cdot m_{j_{+\mathbf{h}}}
\]

with

\[
m_{i_{-\mathbf{h}}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z_i(\mathbf{u}_\alpha), \qquad
m_{j_{+\mathbf{h}}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z_j(\mathbf{u}_\alpha + \mathbf{h})
\]

The ordered set of cross covariances C_ij(h_1), C_ij(h_2), ... is called the experimental cross covariance function. In general, C_ij(h) ≠ C_ij(-h), although C_ij(h) = C_ji(-h).

Figure 2.20: Cross h-scattergrams of Ni and Cd concentrations in the westerly (left column) and easterly (right column) directions (angle tolerance = 22.5°) for four classes of distances.
A substantial difference between $C_{ij}(\mathbf{h})$ and $C_{ij}(-\mathbf{h})$ would mean that
one variable is lagging behind the other, an effect referred to as a lag effect
(Journel and Huijbregts, 1978, p. 41). For example, this effect is observed in
geochemistry, where different rates of precipitation may cause enrichment in
some minerals to lag behind that of others along the direction of hydrothermal
flow. A lag effect entails that the cross covariance function or the cross
correlogram reaches its maximum at $|\mathbf{h}| \neq 0$.
A lag effect not backed by any physical interpretation is better ignored.
Most often, deviations between opposite directions reflect experimental
fluctuations that result from the small number of data pairs available. In what
follows, all cross covariances $C_{ij}(\mathbf{h})$ refer to the average of
$C_{ij}(-\mathbf{h})$ and $C_{ij}(\mathbf{h})$.

The cross correlogram $\rho_{ij}(\mathbf{h})$ is given by

$$\rho_{ij}(\mathbf{h}) = \frac{C_{ij}(\mathbf{h})}{\sqrt{\sigma^2_{i,-\mathbf{h}}\; \sigma^2_{j,+\mathbf{h}}}} \in [-1, +1]$$

where $\sigma^2_{i,-\mathbf{h}}$ and $\sigma^2_{j,+\mathbf{h}}$ are the variances of the tail $z_i$-values and
head $z_j$-values.

Figure 2.21: Four measures of spatial relationship between Ni and Cd
concentrations. The solid and dashed lines refer to the westerly and
easterly direction, respectively. The experimental cross semivariogram and
codispersion function are identical in both directions.

Pseudo cross semivariogram

The pseudo cross semivariogram is computed as

$$\gamma^{p}_{ij}(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z_i(\mathbf{u}_\alpha) - z_j(\mathbf{u}_\alpha + \mathbf{h}) \right]^2 \tag{2.27}$$

where the difference $[z_i(\mathbf{u}_\alpha) - z_j(\mathbf{u}_\alpha + \mathbf{h})]$ is called a cross h-increment.
Unlike in the h-scattergram case, the first bisector of a cross h-scattergram
does not represent perfect correlation, hence the spread of the cloud around
the 45° line, as measured by the pseudo cross semivariogram (2.27), is
meaningless. Another shortcoming of the pseudo cross semivariogram is that its
value tends to be overinfluenced by the variable with the largest values.
Typically, the data should be transformed, for example, standardized to zero
mean and unit variance, before computing the measure $\gamma^{p}_{ij}(\mathbf{h})$ (Myers, 1991).
The statistic $\gamma^{p}_{ij}(\mathbf{h})$ may be used when the two variables $Z_i$ and $Z_j$ relate
to the same attribute measured at two different times (e.g., see Papritz and
Flühler, 1994). In this particular case, the data need not be transformed
since the two attributes are measured in the same unit and their variations
are likely of the same magnitude.
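The cross statistics discussed in this section can be sketched in a few lines of code. The example below is only an illustration under simplifying assumptions: the data lie on a regularly spaced one-dimensional transect, the lag is an integer number of grid steps, and the function name `cross_stats` and the synthetic data are hypothetical, not part of the Jura case study.

```python
import numpy as np

def cross_stats(zi, zj, lag):
    """Experimental cross statistics on a regularly spaced 1-D transect.

    zi, zj : values of two attributes at the same locations.
    lag    : integer number of grid steps (may be negative).
    Tail values are z_i(u), head values are z_j(u + h).
    """
    zi, zj = np.asarray(zi, float), np.asarray(zj, float)
    n = len(zi)
    if lag >= 0:
        tail, head = zi[: n - lag], zj[lag:]
    else:
        tail, head = zi[-lag:], zj[: n + lag]
    # Cross covariance: mean product minus product of tail and head means.
    cov = np.mean(tail * head) - tail.mean() * head.mean()
    # Cross correlogram: standardize by tail and head standard deviations.
    rho = cov / np.sqrt(tail.var() * head.var())
    # Pseudo cross semivariogram: half the mean squared cross h-increment.
    pseudo = 0.5 * np.mean((tail - head) ** 2)
    return cov, rho, pseudo

# Synthetic lag effect: z_j reproduces z_i two grid steps further along.
rng = np.random.default_rng(0)
zi = rng.normal(size=50)
zj = np.roll(zi, 2)
for lag in (-2, 0, 2):
    print(lag, cross_stats(zi, zj, lag))
```

With these synthetic data the cross correlogram peaks at a lag of two grid steps rather than at zero, which is the signature of a lag effect, and swapping both the variables and the sign of the lag returns the same cross covariance.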

2.4.3 The scattergram of h-increments

The cross covariance measures how the value of an attribute $z_i$ at one location
is related to the value of another attribute $z_j$ a vector $\mathbf{h}$ apart. Rather than [...]

2.4.5 Application to indicator transforms

The indicator cross covariance is computed as

$$C^{I}_{ij}(\mathbf{h}; z_{ik}, z_{jk'}) = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} i(\mathbf{u}_\alpha; z_{ik})\, i(\mathbf{u}_\alpha + \mathbf{h}; z_{jk'}) - F_{-\mathbf{h}}(z_{ik})\, F_{+\mathbf{h}}(z_{jk'}) \tag{2.30}$$

where $F_{-\mathbf{h}}(z_{ik})$ and $F_{+\mathbf{h}}(z_{jk'})$ are the proportions of tail $z_i$-values and head
$z_j$-values not exceeding the threshold values $z_{ik}$ and $z_{jk'}$.
The cross covariance (2.30) is the centered version of the two-point joint
cumulative frequency $F_{ij}(\mathbf{h}; z_{ik}, z_{jk'})$. The latter measures how often a
$z_i$-value and the $z_j$-value a vector $\mathbf{h}$ apart are jointly no greater than their
respective threshold values $(z_{ik}, z_{jk'})$, for example, how often two data
locations separated by a vector $\mathbf{h}$ exceed the critical thresholds for two
different metals.

Indicator cross correlogram

The indicator cross correlogram is computed as

$$\rho^{I}_{ij}(\mathbf{h}; z_{ik}, z_{jk'}) = \frac{C^{I}_{ij}(\mathbf{h}; z_{ik}, z_{jk'})}{\sqrt{\sigma^2_{-\mathbf{h}}(z_{ik})\; \sigma^2_{+\mathbf{h}}(z_{jk'})}} \in [-1, +1]$$

where the variance $\sigma^2_{-\mathbf{h}}(z_{ik})$ of tail indicator values $i(\mathbf{u}_\alpha; z_{ik})$ is equal to
$F_{-\mathbf{h}}(z_{ik})\,[1 - F_{-\mathbf{h}}(z_{ik})]$.

Indicator cross semivariogram

The indicator cross semivariogram is computed as

$$\gamma^{I}_{ij}(\mathbf{h}; z_{ik}, z_{jk'}) = \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ i(\mathbf{u}_\alpha; z_{ik}) - i(\mathbf{u}_\alpha + \mathbf{h}; z_{ik}) \right] \cdot \left[ i(\mathbf{u}_\alpha; z_{jk'}) - i(\mathbf{u}_\alpha + \mathbf{h}; z_{jk'}) \right]$$

The only data pairs that have non-zero contributions to the indicator cross
semivariogram are those where the values of both attributes $z_i$ and $z_j$ are on
opposite sides of their threshold values $(z_{ik}, z_{jk'})$. The contribution of a data
pair to $\gamma^{I}_{ij}(\mathbf{h}; z_{ik}, z_{jk'})$ can be positive (+1) or negative (-1), depending on
whether the $z_i$- and $z_j$-values jointly decrease (increase) from $\mathbf{u}_\alpha$ to $\mathbf{u}_\alpha + \mathbf{h}$,
or vary in opposite ways. Therefore, the indicator cross semivariogram value
cannot be interpreted as a joint frequency of transition, that is, it does not
measure how often values of both attributes $z_i$ and $z_j$ are on opposite sides
of threshold values.

Figure 2.23 shows the experimental cross correlogram and cross semivariograms
between Ni and Cd indicator data in the E-W direction; the threshold
values are the second (left column), fifth (middle column), and eighth (right
column) decile of the cumulative distribution. The cross correlation for the
first lag drops from 0.78 for the second decile to 0.18 for the eighth decile
threshold. Similarly, the relative nugget effect on the cross semivariogram
increases with threshold value. Such strong dependence between Cd and Ni
indicator data at small threshold values relates to the greater spatial
continuity of small Cd and Ni concentrations observed in Figure 2.19.
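The remark that the indicator cross semivariogram is not a joint frequency of transition can be checked on a toy example. The sketch below assumes a regularly spaced one-dimensional transect, a positive integer lag, and indicators coded as 1 when a value does not exceed its threshold; the function names and the four-point data set are hypothetical.

```python
import numpy as np

def indicator(z, threshold):
    """Indicator transform: 1 if the value does not exceed the threshold."""
    return (np.asarray(z, float) <= threshold).astype(int)

def indicator_cross_semivariogram(zi, zj, zik, zjk, lag):
    """Indicator cross semivariogram for a positive integer lag.

    Returns the semivariogram value and the per-pair contributions,
    each of which is -1, 0 or +1.
    """
    ii, ij = indicator(zi, zik), indicator(zj, zjk)
    n = len(ii)
    di = ii[: n - lag] - ii[lag:]   # indicator increments of attribute i
    dj = ij[: n - lag] - ij[lag:]   # indicator increments of attribute j
    contrib = di * dj
    return contrib.sum() / (2.0 * len(contrib)), contrib

# Toy transect: both attributes cross their threshold (2.0) for two of the
# three pairs, once in the same direction and once in opposite directions.
zi = [1.0, 3.0, 1.0, 3.0]
zj = [1.0, 3.0, 3.0, 1.0]
gamma, contrib = indicator_cross_semivariogram(zi, zj, 2.0, 2.0, lag=1)
print(gamma, contrib)
```

In this toy case the positive and negative contributions cancel, so the semivariogram value says nothing about how many pairs actually straddle both thresholds.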

Figure 2.25: Experimental omnidirectional indicator cross correlograms between
the two exhaustively sampled metals (Ni, Zn) and Cd: second- (solid line),
fifth- (dotted line), and eighth-decile (dashed line) thresholds.

2.5 Main Features of the Jura Data

The salient features of this exploratory data analysis are summarized as
follows:

4. The metals with widespread contamination are positively related to the
   better sampled zinc. There is a positive relation between nickel and
   cadmium concentrations.

5. A small nugget effect, a short scale (range ≈ 200 m), and a regional
   scale (range ≈ 1 km) of spatial variability are observed on the
   semivariograms of metal concentrations. The short-range structure is the
   major component for the three metals with widespread contamination (Cd,
   Cu, and Pb) and Cr. The long-range structure dominates the semivariograms
   of Ni and Co concentrations. The Zn semivariogram combines the two
   structures in approximately equal proportions.

6. The short-range structure relates to the spatial distribution of rock
   types and land uses in the study area. The long-range structure reflects
   the influence of Argovian and Kimmeridgian rock types on metal
   concentrations.

7. Nickel concentrations vary more continuously in the SW-NE direction,
   which corresponds to the preferential orientation of the underlying
   geologic formations. The patterns of variation of other metals are fairly
   similar in all directions (isotropy).

8. Small concentrations in Cd, Ni, and Zn are better connected in space
   than larger concentrations. This suggests the existence of homogeneous
   areas of small concentrations and larger zones where high and median
   concentrations are intermingled.

9. The metals with widespread contamination (Cd, Cu, and Pb) show a
   short-range cross dependence with the better sampled Zn. There is a
   long-range cross dependence between Cd and Ni concentrations.
Chapter 3

The Random Function Model

Data description is rarely, if ever, the ultimate goal of a statistical study.
Typically, one wants to go beyond the data to characterize the population
from which the sample has been drawn. One essential step in this process
is the quantitative modeling of the spatial statistics of that population from
the data available over the study area. All subsequent applications, such as
prediction or risk analysis, rely on the model (representation) chosen.
The deterministic and probabilistic approaches to modeling are compared
in section 3.1. Section 3.2 introduces the random function model on which
most geostatistical algorithms are built.

3.1 Deterministic and Probabilistic Models


Let $S_n = \{z(\mathbf{u}_\alpha),\ \alpha = 1, \ldots, n\}$ be the set of $n$ measurements of attribute
$z$ over the study area $\mathcal{A}$. In Chapter 2 the population was identified as the
sample set $S_n$. The population is now defined as the set of all measurements
that could be made over $\mathcal{A}$, $\{z(\mathbf{u}),\ \forall\, \mathbf{u} \in \mathcal{A}\}$. The need for modeling the
spatial distribution of $z$ over $\mathcal{A}$ comes from the fact that the information
available, $S_n$, is no longer exhaustive.
A model is but a representation of the (unknown) reality. Although that
reality is unique, it has many possible representations, depending on the
information available and the goal of the study. When building a model, one
should use the following guidelines:

1. The model must incorporate all the relevant information. The pattern
   of spatial continuity of large values (e.g., pollutant concentrations,
   permeability, metal grades) is critical information for many applications
   in environmental studies, reservoir characterization, or selective mining,
   whereas the behavior of median values may be more relevant to [...]

2. The model must be tractable. One should strike a balance between
   a congenial but possibly unrealistic model and a more representative
   model with too many parameters that are difficult to infer from sparse
   data.

3. The model must be tuned to the goal at hand. There is no need to
   model small-scale structures if the objective is to delineate target areas
   on a large scale or if the model is to be used, e.g., for flow simulation,
   a process insensitive to small-scale features. Conversely, a low-pass
   filter-type model would be inappropriate when the process under study
   depends on high-frequency (small-scale) variations.
Deterministic model

A deterministic model associates to any unsampled location $\mathbf{u}$ a single
estimated value, say, $z^*(\mathbf{u})$ for the unknown value $z(\mathbf{u})$, without documenting
the potential error $z^*(\mathbf{u}) - z(\mathbf{u})$. For all subsequent utilizations, that unique
estimated value is taken as the true value; that is, the error is assumed to be
nil or negligible. Such implicit disregard for the potential error is justifiable
if the estimate $z^*(\mathbf{u})$ is based on either many data or some knowledge of the
physics governing the spatial distribution of the attribute $z$. Unfortunately, in
earth sciences such knowledge is limited, and the usually sparse information
available does not allow one to ignore the error associated to any estimate
$z^*(\mathbf{u})$, no matter how the variable is estimated.
Consider, for example, the problem of modeling the spatial distribution
of Cd concentrations along the NE-SW transect shown in Figure 3.1 (top
graph). A deterministic representation would attribute to the unsampled
location $\mathbf{u}$ a single estimated value, say, $z^*(\mathbf{u}) = 0.7$ ppm. Such an estimate is
likely to be in error because few data are available and our knowledge of the
physical processes that control the spatial distribution of Cd values over the
area is imperfect. In decision making, it is critical to assess the potential error
$z^*(\mathbf{u}) - z(\mathbf{u})$ or, better, to know the range of values that the unknown $z(\mathbf{u})$
may take and the probability of occurrence of each outcome. In the example
considered, it is critical to evaluate the probability that the Cd concentration
at location $\mathbf{u}$ exceeds the tolerable maximum 0.8 ppm.

Figure 3.1: The problem of modeling the spatial distribution of a continuous
attribute (Cd concentration) along a NE-SW transect. The deterministic model
associates to the unsampled location $\mathbf{u}$ a single estimated value, say, a
concentration of 0.7 ppm, whereas the probabilistic approach provides a
probability distribution for the possible values at that location.

Probabilistic model

Instead of a single estimated value for the unknown $z(\mathbf{u})$, the probabilistic
approach provides a set of possible values with the corresponding
probabilities of occurrence. Such representation reflects our imperfect knowledge of
the unsampled value $z(\mathbf{u})$ and, more generally, of the distribution of $z$ within
the area. For example, Figure 3.1 (bottom graph) shows the distribution of
probability for Cd concentrations at location $\mathbf{u}$. The unknown concentration
has a 0.2 probability of exceeding the critical threshold 0.8 ppm. The
location $\mathbf{u}$ would be classified as safe on the basis of the single estimated value
$z^*(\mathbf{u}) = 0.7$ ppm, yet the model of uncertainty indicates that there is a
significant risk for the Cd concentration to exceed the tolerable maximum at
$\mathbf{u}$. The decision of declaring the location $\mathbf{u}$ safe or contaminated would then
depend on whether one chooses to ignore that risk.
Whereas deterministic models usually rely on the physics of the
phenomenon, most of the information used in a probabilistic model comes from
the data. The spatial distribution of Cd values may be modeled without any
knowledge of the underlying physical processes; one would then capitalize on
the spatial dependence between any two Cd values as inferred through the
sample variogram in section 2.3. Of course, considering a probabilistic model
does not prevent using ancillary physical information, for example, the fact
that particular rock types contain more cadmium than others. Also, geologic
information, such as the orientation of rock types over the study area, may
help in modeling the semivariogram of Cd concentrations.
Like continuous attributes, the spatial distribution of any categorical
attribute $s$, say, rock type, can be modeled using either deterministic or
probabilistic approaches. Consider, for example, the NE-SW transect in Figure 3.2
(top graph). Whereas the deterministic representation associates to the
unsampled location $\mathbf{u}$ a single category, say, $s_1$, the probabilistic approach gives
the probability for any of the five rock types to prevail at that location
(Figure 3.2, bottom graph). The category $s_1$ has a large probability of prevailing
at $\mathbf{u}$, yet the probability that $\mathbf{u}$ actually belongs to any one of the four other
categories is not zero.

Figure 3.2: The problem of modeling the spatial distribution of a categorical
attribute (rock type) along a NE-SW transect. The deterministic model associates
to the unsampled location $\mathbf{u}$ a single value, say, a rock type $s_1$, whereas the
probabilistic approach provides the probability of occurrence for all possible
rock types. (Legend: $s_1$: Argovian, $s_2$: Kimmeridgian, $s_3$: Sequanian,
$s_4$: Portlandian.)

Remarks

Good discussions about the philosophy and practice of modeling spatial data
can be found in Journel (1986a, 1994a), Isaaks and Srivastava (1989, p. 196-236),
and Matheron (1989).

1. Deterministic and probabilistic models may be used together. For
   example, a deterministic representation of better known large-scale
   structures can be complemented with a probabilistic modeling of small-scale
   variability.

2. A model is not cast forever. It should be updated whenever the goal of
   the study changes, additional data become available, or the physics of
   the phenomenon becomes better known.

3. When drawing a conclusion from a model, one should always question
   how much of that conclusion actually originates from the data and how
   much comes from the model itself. A model based on assumptions
   that are not supported by the data may generate artificial features; for
   example, see the discussion on the maximum entropy property of the
   multiGaussian RF model in section 8.4.

3.2 The Random Function Model

Geostatistics is largely based on the concept of random function, whereby
the set of unknown values is regarded as a set of spatially dependent random
variables. The local uncertainty about the attribute value at any particular
location $\mathbf{u}$ is modeled through the set of possible realizations of the random
variable at that location. The random function concept allows us to account
for structures in the spatial variation of the attribute. The set of realizations
of the random function models the uncertainty about the spatial distribution
of the attribute over the entire study area.

3.2.1 Random variable

If the number of possible outcomes of the RV is finite without any ordering,
e.g., $K$ mutually exclusive soil types or geologic facies, the RV is said to be
discrete or categorical. Let $S(\mathbf{u})$ denote such a discrete random variable at
$\mathbf{u}$, [...]
The quantity $p(\mathbf{u}; s_k)$ represents the probability for the category $s_k$ to
prevail at location $\mathbf{u}$:

$$p(\mathbf{u}; s_k) \in [0, 1] \qquad k = 1, \ldots, K \tag{3.2}$$

$$\sum_{k=1}^{K} p(\mathbf{u}; s_k) = 1 \tag{3.3}$$

For any specific ordering of the $K$ outcomes $s_k$, the cumulative probability
distribution $F(\mathbf{u}; s_k)$ is defined as

$$F(\mathbf{u}; s_k) = \sum_{k'=1}^{k} p(\mathbf{u}; s_{k'}) \tag{3.4}$$

The quantity $F(\mathbf{u}; s_k)$ is the probability for any one of the categories $s_{k'}$,
ordered lesser or equal to $s_k$, to prevail at $\mathbf{u}$. Accounting for conditions (3.2)
and (3.3), any cumulative probability $F(\mathbf{u}; s_k)$ must lie within [0, 1], and the
cumulative probability distribution of type (3.4) must be a non-decreasing
function of the threshold $s_k$:

$$F(\mathbf{u}; s_k) \in [0, 1], \qquad F(\mathbf{u}; s_k) \leq F(\mathbf{u}; s_{k'}) \quad \forall\, k' > k$$

Continuous random variable

A continuous RV $Z(\mathbf{u})$ is characterized by its cumulative distribution
function (cdf), which gives the probability that the variable is no greater than
any given threshold $z$:

$$F(\mathbf{u}; z) = \mathrm{Prob}\{Z(\mathbf{u}) \leq z\} \tag{3.7}$$

The derivative of the cdf, when it exists, is the probability density function
(pdf) $f(\mathbf{u}; z) = F'(\mathbf{u}; z)$. By analogy with the case of discrete random
variables, the following order relations are satisfied:

$$F(\mathbf{u}; z) \in [0, 1], \qquad F(\mathbf{u}; z) \leq F(\mathbf{u}; z') \quad \forall\, z' > z$$

Any cumulative probability $F(\mathbf{u}; z)$ must lie within [0, 1], and the cdf must
be a non-decreasing function of the threshold $z$.

Figure 3.3 shows cdf and pdf representations for both categorical (top
graphs) and continuous RVs (bottom graphs) at location $\mathbf{u}$. The cdf
representation is typically used for continuous variables, whereas the pdf
representation is more appropriate for categorical attributes with non-ordered
states.

[Figure 3.3 panels: Categorical RV S(u) (top); Continuous RV Z(u) (bottom).]

Indicator random variable

An indicator RV is a discrete binary RV with only two possible outcomes: 0
and 1. The probability (3.1) that a category $s_k$ prevails at $\mathbf{u}$ can be expressed
as the expected value of the indicator RV $I(\mathbf{u}; s_k)$ at that location:

$$p(\mathbf{u}; s_k) = E\{I(\mathbf{u}; s_k)\} = 1 \cdot \mathrm{Prob}\{I(\mathbf{u}; s_k) = 1\} + 0 \cdot \mathrm{Prob}\{I(\mathbf{u}; s_k) = 0\} \tag{3.10}$$

with $I(\mathbf{u}; s_k) = 1$ if $S(\mathbf{u}) = s_k$, and 0 otherwise.
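The link between category probabilities, their cumulative distribution, and indicator expectations can be illustrated numerically. The snippet below is a minimal sketch: the ten equiprobable realizations of $S(\mathbf{u})$ are made up for illustration, and the ordering of the five categories is arbitrary.

```python
import numpy as np

categories = ["s1", "s2", "s3", "s4", "s5"]
# Hypothetical, equally likely realizations of S(u) at a single location.
realizations = np.array(
    ["s1", "s1", "s2", "s1", "s3", "s1", "s2", "s4", "s1", "s5"]
)

# p(u; s_k) obtained as the mean of the indicator I(u; s_k),
# which equals 1 when the realization is s_k and 0 otherwise.
p = np.array([(realizations == s).mean() for s in categories])

# Cumulative distribution for the chosen ordering of the categories.
F = np.cumsum(p)

print(dict(zip(categories, p)))
print(dict(zip(categories, F)))
```

The probabilities lie in [0, 1] and sum to one, and the cumulative values are non-decreasing along the chosen ordering, as required by conditions (3.2) and (3.3).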
Similarly, the probability (3.7) that the variable $Z(\mathbf{u})$ is no greater than
any given threshold $z$ can be expressed as

$$F(\mathbf{u}; z) = \mathrm{Prob}\{Z(\mathbf{u}) \leq z\} = E\{I(\mathbf{u}; z)\} \tag{3.12}$$

where the RV $I(\mathbf{u}; z)$ is defined as

$$I(\mathbf{u}; z) = \begin{cases} 1 & \text{if } Z(\mathbf{u}) \leq z \\ 0 & \text{otherwise} \end{cases}$$

Indicator geostatistics capitalizes on relations (3.10) and (3.12) to evaluate
the probability for any category $s_k$ to prevail at unsampled locations $\mathbf{u}$, or
the probability for the variable $Z$ to be no greater than any threshold $z$.

Two extreme probability distributions

Each datum $s(\mathbf{u}_\alpha)$ or $z(\mathbf{u}_\alpha)$ is viewed as a particular realization of the random
variable $S(\mathbf{u}_\alpha)$ or $Z(\mathbf{u}_\alpha)$. Provided measurements are precise, there is no
uncertainty about the sample value $s(\mathbf{u}_\alpha)$ or $z(\mathbf{u}_\alpha)$. Thus, the probability for
the outcome value $s_k = s(\mathbf{u}_\alpha)$ to prevail at $\mathbf{u}_\alpha$ is one; the probability is zero
for the remaining $(K - 1)$ categories:

$$p(\mathbf{u}_\alpha; s_k) = \begin{cases} 1 & \text{if } s(\mathbf{u}_\alpha) = s_k \\ 0 & \text{otherwise} \end{cases}$$

Similarly, the probability for the random variable $Z$ at $\mathbf{u}_\alpha$ to be no greater
than a threshold $z$ is one for any threshold greater than or equal to the datum
$z(\mathbf{u}_\alpha)$; the probability is zero for other thresholds $z < z(\mathbf{u}_\alpha)$:

$$F(\mathbf{u}_\alpha; z) = \begin{cases} 1 & \text{if } z \geq z(\mathbf{u}_\alpha) \\ 0 & \text{otherwise} \end{cases}$$

These probability distributions with zero variance (no uncertainty) are
depicted at the top of Figure 3.4.
Where no information whatsoever is available, all outcomes have the same
probability of occurrence (maximum uncertainty). The $K$ categories $s_k$ then
have the same probability $1/K$ to prevail at location $\mathbf{u}$:

$$p(\mathbf{u}; s_k) = \frac{1}{K} \qquad \forall\, k \tag{3.16}$$

Similarly, the probability for the random variable $Z$ at $\mathbf{u}$ to be no greater
than the threshold $z$ increases linearly with that threshold:

$$F(\mathbf{u}; z) = \begin{cases} 0 & \text{if } z \leq z_{\min} \\ \dfrac{z - z_{\min}}{z_{\max} - z_{\min}} & \text{if } z \in (z_{\min}, z_{\max}] \\ 1 & \text{if } z > z_{\max} \end{cases} \qquad \forall\, z \tag{3.17}$$

where $z_{\min}$ and $z_{\max}$ are the minimum and maximum values of the
z-distribution, and the notation $z \in (z_{\min}, z_{\max}]$ is equivalent to $z_{\min} < z \leq z_{\max}$.
The corresponding uniform probability distributions are represented at the
bottom of Figure 3.4.

[Figure 3.4 panels: No uncertainty (top); Maximum uncertainty (bottom);
horizontal axis: s-values.]

The basic paradigm of the probabilistic approach is to model any unknown
value $s(\mathbf{u})$ or $z(\mathbf{u})$ as a random variable $S(\mathbf{u})$ or $Z(\mathbf{u})$. The problem of
assessing the uncertainty about an attribute value at $\mathbf{u}$ thus reduces to that
of modeling the probability distribution of the random variable $S$ or $Z$ at
that location. When no information is available, the uncertainty is maximum,
with the same probability of occurrence for all outcome values. The idea is
to use neighboring data values $s(\mathbf{u}_\alpha)$ or $z(\mathbf{u}_\alpha)$ to reduce the uncertainty at
location $\mathbf{u}$, which amounts to updating a model of type (3.16) or (3.17) into
a model that is conditional on (accounts for) the information available around
$\mathbf{u}$. By accounting for the dependence between RVs at different locations, the
random function model allows such updating. In the next section, the focus
is on continuous random functions. Similar developments can be made for
discrete random functions (e.g., see Isaaks and Srivastava, 1989, p. 218-236).
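The two extreme distributions for a continuous attribute can be written as small functions. This is a sketch only: the function names are hypothetical, and the bounds used in the usage lines (0 and 5 ppm) and the datum (0.7 ppm) are illustrative values, not estimates from the Jura data.

```python
import numpy as np

def cdf_datum(z, datum):
    """cdf at a sampled location: jumps from 0 to 1 at the datum value."""
    return np.where(np.asarray(z, float) >= datum, 1.0, 0.0)

def cdf_uniform(z, zmin, zmax):
    """cdf of type (3.17): 0 up to zmin, linear increase, 1 beyond zmax."""
    z = np.asarray(z, float)
    return np.clip((z - zmin) / (zmax - zmin), 0.0, 1.0)

thresholds = np.array([0.5, 0.7, 0.8, 5.0])
# No uncertainty: a precise measurement of 0.7 ppm at the location.
print(cdf_datum(thresholds, datum=0.7))
# Maximum uncertainty: only assumed bounds of the z-distribution are known.
print(cdf_uniform(thresholds, zmin=0.0, zmax=5.0))
# Probability of exceeding a threshold under either model: 1 - F(u; z).
print(1.0 - cdf_uniform(0.8, zmin=0.0, zmax=5.0))
```

Any conditional model built from neighboring data should fall between these two extremes: sharper than the uniform prior, but not a single step unless the location itself is sampled.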
A random function (RF) is defined as a set of usually dependent random
variables $Z(\mathbf{u})$, one for each location $\mathbf{u}$ in the study area $\mathcal{A}$, $\{Z(\mathbf{u}),\ \forall\, \mathbf{u} \in \mathcal{A}\}$.
To any set of $N$ locations $\mathbf{u}_k$, $k = 1, \ldots, N$ corresponds a vector of $N$ random
variables $\{Z(\mathbf{u}_1), \ldots, Z(\mathbf{u}_N)\}$ that is characterized by the N-variate or
N-point cdf:

$$F(\mathbf{u}_1, \ldots, \mathbf{u}_N; z_1, \ldots, z_N) = \mathrm{Prob}\{Z(\mathbf{u}_1) \leq z_1, \ldots, Z(\mathbf{u}_N) \leq z_N\} \tag{3.18}$$

[...]

The one-point and two-point terminology reminds us that the two random
variables relate to the same attribute $z$ at two different points or locations
rather than to two different attributes.

The conditional cumulative distribution function (ccdf) $F(\mathbf{u}; z \,|\, Z(\mathbf{u}') \leq z')$ is
the cdf of $Z(\mathbf{u})$ given knowledge about $Z(\mathbf{u}')$, specifically that $Z(\mathbf{u}') \leq z'$.
According to the conditional probability definition, the ccdf is deduced from
the previous one- and two-point cdfs by the relation

$$F(\mathbf{u}; z \,|\, Z(\mathbf{u}') \leq z') = \mathrm{Prob}\{Z(\mathbf{u}) \leq z \,|\, Z(\mathbf{u}') \leq z'\} = \frac{\mathrm{Prob}\{Z(\mathbf{u}) \leq z,\ Z(\mathbf{u}') \leq z'\}}{\mathrm{Prob}\{Z(\mathbf{u}') \leq z'\}} \tag{3.25}$$

Two RVs $Z(\mathbf{u})$ and $Z(\mathbf{u}')$ are said to be independent if the probability
distribution of either one is not affected by any knowledge about the other
one, that is, if and only if

$$F(\mathbf{u}; z \,|\, Z(\mathbf{u}') \leq z') = F(\mathbf{u}; z) \qquad \forall\, z, z'$$

By substituting the first expression into relation (3.25), the independence
condition becomes

$$F(\mathbf{u}, \mathbf{u}'; z, z') = F(\mathbf{u}; z) \cdot F(\mathbf{u}'; z') \qquad \forall\, z, z' \tag{3.26}$$

Thus, the set of all indicator cross covariances $C_I(\mathbf{u}, \mathbf{u}'; z, z')$ for all
thresholds $z, z'$ provides a measure of dependence between the two RVs $Z(\mathbf{u})$ and
$Z(\mathbf{u}')$. In contrast, the Z-covariance $C(\mathbf{u}, \mathbf{u}')$ is a measure only of linear
correlation between the two RVs, i.e., a measure of the ability of a straight line to
describe the relation between these two variables. The Z-covariance $C(\mathbf{u}, \mathbf{u}')$
and all indicator cross covariances $C_I(\mathbf{u}, \mathbf{u}'; z, z')$ vanish when the two RVs
are independent. However, the condition $C(\mathbf{u}, \mathbf{u}') = 0$ does not necessarily
imply relation (3.26): two linearly uncorrelated RVs may still be dependent,
in which case the dependence relation is non-linear.
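A toy example helps to see why a zero Z-covariance does not imply independence. In the sketch below, which is purely illustrative, one RV takes the values -1, 0, 1 with equal probability and the second RV is its square; the indicator cross covariance is taken as the joint cumulative probability minus the product of the one-point cdf values, and all names are hypothetical.

```python
import numpy as np

z = np.array([-1.0, 0.0, 1.0])   # equiprobable outcomes of the first RV
zp = z ** 2                      # second RV: a non-linear function of the first
prob = np.full(3, 1.0 / 3.0)

# Z-covariance: E{Z Z'} - E{Z} E{Z'}
cov = np.sum(prob * z * zp) - np.sum(prob * z) * np.sum(prob * zp)

def indicator_cross_cov(t, tp):
    """Joint probability of (Z <= t, Z' <= tp) minus the product of marginals."""
    joint = np.sum(prob * ((z <= t) & (zp <= tp)))
    return joint - np.sum(prob * (z <= t)) * np.sum(prob * (zp <= tp))

print("Z-covariance:", cov)
print("indicator cross covariance at (-1, 0):", indicator_cross_cov(-1.0, 0.0))
```

The Z-covariance vanishes although the second variable is entirely determined by the first, whereas the indicator cross covariance at thresholds (-1, 0) is non-zero and so reveals the dependence.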

The decision of stationarity

The one- and two-point cdfs of the RF and their moments as defined by
relations (3.19)-(3.24) are location-dependent. Their inference thus requires
repetitive realizations at each location $\mathbf{u}$. For example, the inference of the
Z-covariance $C(\mathbf{u}, \mathbf{u}')$ between the two RVs $Z(\mathbf{u})$ and $Z(\mathbf{u}')$ separated by a
vector $\mathbf{h} = \mathbf{u}' - \mathbf{u}$ calls for a set of repetitive measurements
$\{z^{(l)}(\mathbf{u}), z^{(l)}(\mathbf{u}');\ l = 1, \ldots, L\}$, which is never² available in practice. The idea
is to use all pairs of measurements a vector $\mathbf{h}$ apart within the study area $\mathcal{A}$,
$\{z(\mathbf{u}_\alpha), z(\mathbf{u}_\alpha + \mathbf{h});\ \alpha = 1, \ldots, n\}$, as a set of repetitions. The implicit
assumption is that the corresponding pairs of RVs
$\{Z(\mathbf{u}_\alpha), Z(\mathbf{u}_\alpha + \mathbf{h});\ \alpha = 1, \ldots, n\}$ originate from the same two-point
distribution. Such pooling of data pairs regardless of their locations calls for
the phenomenon under study to be spatially "homogeneous" within $\mathcal{A}$. In
probabilistic terms, the RF model $Z(\mathbf{u})$ must be chosen to be stationary
within $\mathcal{A}$.

² When the set of repetitions consists of measurements recorded at the same
location but at different times, the RF is actually defined in space-time and
should be denoted $Z(\mathbf{u}, t)$.

The RF $Z(\mathbf{u})$ is said to be stationary within $\mathcal{A}$ if its multivariate cdf (3.18)
is invariant under translation. This means that any two vectors of RVs
$\{Z(\mathbf{u}_1), \ldots, Z(\mathbf{u}_N)\}$ and $\{Z(\mathbf{u}_1 + \mathbf{h}), \ldots, Z(\mathbf{u}_N + \mathbf{h})\}$ have the same
multivariate cdf whatever the translation vector $\mathbf{h}$:

$$F(\mathbf{u}_1, \ldots, \mathbf{u}_N; z_1, \ldots, z_N) = F(\mathbf{u}_1 + \mathbf{h}, \ldots, \mathbf{u}_N + \mathbf{h}; z_1, \ldots, z_N) \qquad \forall\, \mathbf{u}_1, \ldots, \mathbf{u}_N \text{ and } \mathbf{h}$$

[...]

2. The availability of enough data to infer the parameters (mean,
   covariance function) of each separate RF.

An RF model is said to be stationary of order two when (1) the expected
value $E\{Z(\mathbf{u})\}$ exists and is invariant within $\mathcal{A}$, and (2) the two-point
covariance $C(\mathbf{h})$ exists and depends only on the separation vector $\mathbf{h}$. The
covariance function, correlogram, and semivariogram of a stationary RF are
related by

$$\gamma(\mathbf{h}) = C(0) - C(\mathbf{h}), \qquad \rho(\mathbf{h}) = \frac{C(\mathbf{h})}{C(0)} = 1 - \frac{\gamma(\mathbf{h})}{C(0)} \tag{3.30}$$

As the separation distance $|\mathbf{h}|$ increases, the correlation between any two RVs
$Z(\mathbf{u})$ and $Z(\mathbf{u} + \mathbf{h})$ generally tends to zero:

$$C(\mathbf{h}) \to 0 \quad \text{for } |\mathbf{h}| \to \infty$$

Accounting for relation (3.30), the sill value of a bounded semivariogram
tends toward the a priori variance $C(0)$:

$$\gamma(\mathbf{h}) \to C(0) \quad \text{for } |\mathbf{h}| \to \infty \tag{3.32}$$

1. The definition of the semivariogram $\gamma(\mathbf{h})$ does not require the existence
   of a constant mean and finite variance for the RV $Z(\mathbf{u})$; a sufficient
   condition is that the RF increments $[Z(\mathbf{u}) - Z(\mathbf{u} + \mathbf{h})]$ are stationary
   of order two, a condition referred to as the intrinsic hypothesis (Journel
   and Huijbregts, 1978, p. 33). Second-order stationarity implies the
   intrinsic hypothesis but the reverse is not true: an intrinsic RF need not
   be stationary of order two (see discussion on unbounded semivariogram
   models in section 4.2.1).

2. Stationarity is a property of the RF model, a property needed for
   inference. It is not a characteristic of the phenomenon under study.
   Stationarity is a decision made by the user, not a hypothesis that can be
   proven or refuted from data.

Recall that the vector notation $\mathbf{h}$ accounts for both distance $|\mathbf{h}|$ and direction.
The functions $C(\mathbf{h})$ and $2\gamma(\mathbf{h})$ are said to be anisotropic if they depend on
both distance and direction. They are said to be isotropic if they depend
only on the modulus of $\mathbf{h}$.
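The relations between covariance function, correlogram, and semivariogram under second-order stationarity are easy to verify numerically. The sketch below assumes an isotropic exponential covariance model with unit sill and unit practical range; that model and its parameters are illustrative choices, not something prescribed by the text above.

```python
import numpy as np

def exp_covariance(h, sill=1.0, practical_range=1.0):
    """Hypothetical isotropic exponential covariance model C(h)."""
    return sill * np.exp(-3.0 * np.abs(h) / practical_range)

h = np.array([0.0, 0.2, 1.0, 10.0])   # separation distances
C = exp_covariance(h)
C0 = exp_covariance(0.0)              # a priori variance C(0)

gamma = C0 - C                        # semivariogram from the covariance
rho = C / C0                          # correlogram

for dist, c, g, r in zip(h, C, gamma, rho):
    print(f"|h|={dist:5.1f}  C={c:.4f}  gamma={g:.4f}  rho={r:.4f}")
```

At large separation distances the covariance decays toward zero while the semivariogram approaches the sill $C(0)$, which is the behavior described for a bounded semivariogram.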
[...] where $F_{ij}(\mathbf{h}; z_i, z_j)$ is not necessarily equal to $F_{ji}(\mathbf{h}; z_j, z_i)$.

• The expected values:

$$m_i = E\{Z_i(\mathbf{u})\} \qquad i = 1, \ldots, N_v \tag{3.35}$$

• The (cross) covariance functions:

$$C_{ij}(\mathbf{h}) = E\{[Z_i(\mathbf{u}) - m_i] \cdot [Z_j(\mathbf{u} + \mathbf{h}) - m_j]\} \qquad \forall\, i, j \tag{3.36}$$

  where $C_{ij}(\mathbf{h})$ is not necessarily equal to $C_{ji}(\mathbf{h})$.

• The (cross) variograms:

$$2\gamma_{ij}(\mathbf{h}) = \mathrm{Cov}\{[Z_i(\mathbf{u}) - Z_i(\mathbf{u} + \mathbf{h})],\ [Z_j(\mathbf{u}) - Z_j(\mathbf{u} + \mathbf{h})]\} = E\{[Z_i(\mathbf{u}) - Z_i(\mathbf{u} + \mathbf{h})] \cdot [Z_j(\mathbf{u}) - Z_j(\mathbf{u} + \mathbf{h})]\} \qquad \forall\, i, j \tag{3.37}$$

The terms joint or cross apply when $i \neq j$, whereas the terms direct or auto
apply when $i = j$.

Lag effect

The cross covariance function and the cross semivariogram are related through
the expression

$$\gamma_{ij}(\mathbf{h}) = C_{ij}(0) - \frac{1}{2}\left[ C_{ij}(\mathbf{h}) + C_{ij}(-\mathbf{h}) \right] \tag{3.38}$$

The cross covariance function $C_{ij}(\mathbf{h})$ can be written as the sum of an even
function of $\mathbf{h}$, $\frac{1}{2}[C_{ij}(\mathbf{h}) + C_{ij}(-\mathbf{h})]$, and an odd function of $\mathbf{h}$,
$\frac{1}{2}[C_{ij}(\mathbf{h}) - C_{ij}(-\mathbf{h})]$:

$$C_{ij}(\mathbf{h}) = \frac{1}{2}\left[ C_{ij}(\mathbf{h}) + C_{ij}(-\mathbf{h}) \right] + \frac{1}{2}\left[ C_{ij}(\mathbf{h}) - C_{ij}(-\mathbf{h}) \right]$$

The cross semivariogram $\gamma_{ij}(\mathbf{h})$ incorporates only the even term of the cross
covariance function, see relation (3.38), hence it is symmetric in $(\mathbf{h}, -\mathbf{h})$.
In practice, asymmetry of the cross covariance function (recall section 2.4.2)
is most often ignored for these reasons:

[...]

2. Lack of data typically prevents asserting the physical reality of a lag
   effect.

3. Modeling asymmetric cross covariances, although possible, is difficult;
   for example, see Journel and Huijbregts (1978, p. 179), Grzebyk (1993).

As in the single-variable case, the correlation between any two RVs $Z_i(\mathbf{u})$
and $Z_j(\mathbf{u} + \mathbf{h})$ is assumed to tend to zero as the separation distance $|\mathbf{h}|$
increases:

$$C_{ij}(\mathbf{h}) \to 0 \quad \text{as } |\mathbf{h}| \to \infty$$

Accounting for relation (3.38), the sill value of a bounded cross semivariogram
tends toward the covariance value $C_{ij}(0)$:

$$\gamma_{ij}(\mathbf{h}) \to C_{ij}(0) \quad \text{as } |\mathbf{h}| \to \infty$$

The cross correlogram

A unit-free measure of linear correlation between any two RVs $Z_i(\mathbf{u})$ and
$Z_j(\mathbf{u} + \mathbf{h})$ is the cross correlogram, defined as

$$\rho_{ij}(\mathbf{h}) = \frac{C_{ij}(\mathbf{h})}{\sqrt{C_{ii}(0) \cdot C_{jj}(0)}} \in [-1, +1] \tag{3.40}$$

At $|\mathbf{h}| = 0$, expression (3.40) is the linear correlation coefficient between the
two variables $Z_i$ and $Z_j$. Note that $\rho_{ij}(0) = 0$ does not necessarily imply that
$\rho_{ij}(\mathbf{h}) = 0$.
a n ~ c o r r r l n t i o ~ rf u n c t i o n l t r a t r i x
Under Lhc d c c i s i o ~i~il joinl. sccond-order s(.at,ionarit,y for all N,, IWs, OM

defines:

+ T h e mean-value vector:

Chapter 4
T h e covariance function matrix:

Inference and Modeling


T h e senrivariogra~~l
matrix:

Once a random function ~notlell ~ a sl ~ e r rchosen, ~ the next step consists of


7' 7'
, . . . , IN,,] ;
with z ( u ) = [ ~ I ( I I )Z, ~ ( I I )., . . , ZN~,(U)], 111 = [ m ~7112, inferring its parameters front t,he available infornr;rtion. T h e focus of I.bis
chapter is on inference of the lwo first moments (mean, covariance) of t l ~ r
the superscript 3' drrmtes lnatrix t r a n s p o s i l i o ~ ~ .
multivariate 1ZF Z(n), which are required by the int,crpolation (kriging) a1-
T h e covariance function matrix is arr N,, x N , matrix t l ~ s t co111.ai11s
. t,hc goriI.l~msinl.roduced in Chapters 5 and 6.
autocovariarrce functions along its major diagou;~land t . l c cross c o v ; r ~ . ~ ; ~ c e Sect,iou 4.1 addresses the problem of d c t w m i ~ ~ st;tt.istics
i~~g ropres~:~~t,i~-
functions off tIraI, diagonal: live of l . 1 sl.udy
~ area m d nol, only of 1.11~samplf, ;iv;ril;~hlc.'I'l~twret~icnl;III<I
pr:tc"lical issms of n ~ o d c l i ~ rtlrc
g sl~al.i;rlv;~ri;rlrilil,yof c o ~ ~ l . i n w tal.I,ril~!~l.cs
rs
are discussed in seclion 4.2. T h e d i s c ~ ~ s s iiso exbendcd
~~ t,o the morleling of
cross correlations in secl,ioo 4 . 3 .

At lhl = 0, the matrix C(0) is the t r a d i t i o ~ ~variance


al covr~riat~cc
tnatrix.
,Ilie
,
semivariogram rnalrix r ( h ) is art N , x N, symnlctrii: nral.rix that 4.1 Statistical Inference
contains tlie direct sernivariogranis aloug i1.s major diagon;il and tlrc cross
, \

se~nivariograniso(f that diagonal: I hc i ~ ~ f c r e n cprocess


c aims at. est,iinating the parrrriielers of tlic It1,' ~ i i r A : l
from tlrf sample i~~fornri~tiotr availahle ovcr the s l . ~ ~ d;~rtr;r.
y 111 cmt,rasL 1,o
Clraptcr 2 , here the s;tn~plesti~tisticsare I I O longer populatioti puarnct,rrs
sincc 1.he sample is not cor~sidered exl~nnst,iveany more. 'I'l~cdistinctiorl
betwccu populalion and sarriple st,atistics is iri;ule clear hy adding the SII-
perscript to t,lie latter.
,,
T h e covariance furrclion u ~ a t r i xand serniv;rriogrer~rn ~ a l r i xarc related by I he use of snrnplestatislics as eslim:ttes o f l w p ~ ~ l a t . i paran~!l.ers
or~ reqtiircs
lhat the sample be represel~lativeof the nnrlerlyi~~g area or popnlatio~r.Such
reprrsentativi1,y call he achieved by carefnlly designing the sampling scllrme;
Ll~einterested reader sl~onldrefer t,o .Ripl.cy (1081, p. 151 2 7 ) or Wrhst.er .ar1(1.
~ . . (1990,
Oliver ~ 1). 272 200) for a prcsentat.io~~ of rll;riu si~111p1ing ~RI(!I~K!S.111
this book, onc considers l l ~ esiLu;ttio~~ where d a t , ; lravc
~ alreztdy 1,cc11collected,
possibly with no stal.istical l,reatri~c!nt in m i n d 'I'lrc represcnlativily of blit:
salnple sl~or~lri ba q~~estioncrl wlrencvcr t,lre data ;ire not, sprc;rd c v c ~ ~ ovcr ly
t,lle area, U ~ I I ~ C I Iis ~ ~ ~ r f i ~ r t . ~0f1,cti
~ ~ ~tire
a l <cixs(:
, l y in carI.I~scitinccs i~ppliciil.io~~s.

4.1.1 Preferential sampling

The sampling is said to be preferential whenever data locations are neither regularly nor randomly distributed over the study area. Several factors may cause specific subareas to be preferentially sampled:

1. Conditions of accessibility; fields bordering roads or farmlands are easier to sample than rugged terrain or dense forests.

3. Sampling strategy; clustered locations may have been sampled to characterize short-range variability.

Preferential sampling should always be clearly documented by surveyors, since it may skew the results of any exploratory data analysis. For the Jura data set, 38 clusters¹ of 5 locations each were sampled to characterize the small-scale variability (see section 2.3.1). Furthermore, the sampling design was such that the clusters are random stratified, which explains their somewhat uniform distribution over the area (Figure 4.1).

Even if high- or low-valued areas were not purposely targeted, any preferential sampling is likely to impact sample statistics. For example, lead concentrations in soils bordering roads may be larger than in open fields because of road traffic pollution. Similarly, oversampling of accessible farmlands relative to forest soils, which have low levels of contamination, would lead to overestimation of the average concentrations of most metals.

Preferential sampling of specific classes of values can be detected by comparing the quantiles of the distributions of clustered versus non-clustered (gridded) data. Recall that the p-quantile value of a distribution, $q_p$, is the value below which a proportion $p$ of the data falls; that is, $q_p$ is such that $F(q_p) = p$. Let $q_p^{(g)}$ and $q_p^{(c)}$ denote the p-quantiles of the distributions of gridded and clustered data, respectively. Similar distributions should have similar quantiles; thus the graph of $q_p^{(c)}$ versus $q_p^{(g)}$, called a Q-Q plot, should appear as the straight line $q_p^{(c)} = q_p^{(g)} \ \forall\, p$.

The Jura sample set was split into $n_c = 152$ clustered data (38 clusters of 4 locations each) and $n_g = 107$ gridded data ($n_g = 207$ for nickel and zinc concentrations). Figure 4.2 (left graphs) shows the Q-Q plots for the three metals with widespread contamination (Cd, Cu, and Pb) and for chromium. All metals show discrepancies of the upper tails of the two distributions: the large quantile values of the clustered data are larger than the corresponding quantile values of the gridded data (dots are above the 45° line). Such Q-Q plots indicate that the larger metal concentrations are measured preferentially at clustered locations. This difference is caused by preferential location of the clusters in farmland or grassland with higher metal concentrations (see conditional means in Table 2.4, page 18). Indeed, only 8.3% of the clustered data have been collected under forest, whereas that proportion is 18.4% for gridded data (global proportion of forest is 12.8%). This difference, however, is not pronounced enough to significantly affect statistics such as mean concentration or percentage of contaminated locations (Table 4.1). Note that the standard deviation of the clustered data is also larger than that of the gridded data.
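The quantile comparison behind a Q-Q plot can be sketched in a few lines. This is only an illustration: the function names `quantile` and `qq_pairs`, the linear-interpolation rule for quantiles, and the concentration values are assumptions, not taken from the Jura data set.

```python
def quantile(values, p):
    """p-quantile by linear interpolation between order statistics (one of several conventions)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def qq_pairs(gridded, clustered, probabilities):
    """Pairs (gridded quantile, clustered quantile) to be plotted against the 45-degree line."""
    return [(quantile(gridded, p), quantile(clustered, p)) for p in probabilities]


# Hypothetical concentrations (ppm), for illustration only.
gridded = [0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1]
clustered = [0.4, 0.7, 1.0, 1.6, 2.2, 3.0, 4.1]

pairs = qq_pairs(gridded, clustered, [0.1, 0.25, 0.5, 0.75, 0.9])
# A pair with clustered quantile > gridded quantile plots above the 45-degree line.
above_line = [q_c > q_g for q_g, q_c in pairs]
```

With these made-up values every pair falls above the 45° line, which is the signature of larger values being measured preferentially at clustered locations.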

Assume that continuous attribute $z$ has been preferentially sampled in the high- or low-valued areas. Thus, the equal-weighted linear average $\hat{m}$ of the $n$ data $z(\mathbf{u}_\alpha)$ is a biased estimate of the average value of $z$ over the area $A$, with

$$\hat{m} = \frac{1}{n} \sum_{\alpha=1}^{n} z(\mathbf{u}_\alpha) \qquad (4.1)$$

More generally, the sample marginal distribution $\hat{F}(z)$ is not representative of the distribution of z-values over $A$, with

$$\hat{F}(z) = \frac{1}{n} \sum_{\alpha=1}^{n} i(\mathbf{u}_\alpha; z) \qquad (4.2)$$

where the indicator datum $i(\mathbf{u}_\alpha; z)$ is defined as in equation (2.19).

¹ Although the actual sampling was nested, non-gridded locations are hereafter considered as clustered for illustration.
[Table 4.1: Statistics (mean, standard deviation, % contamination) of clustered and gridded data for the seven metals.]

[Figure 4.2: Q-Q plots of the clustered data (left graphs) and of the declustered distribution (right graphs) versus the gridded data (ppm).]

One procedure to correct for preferential sampling consists of retaining only the regularly spaced data. This approach is appropriate for data sets that include enough gridded data for reliable inference, for example, the Jura data set with more than 100 gridded data.

When data sparsity does not allow one to ignore the clustered values, the equal weights $1/n$ in expressions (4.1) and (4.2) should be replaced by weights that account for data clustering. Intuitively, data in densely sampled areas should receive less weight than those in sparsely sampled areas. Such weighting amounts to "declustering" the data. Two commonly used declustering techniques are the polygonal method (Isaaks and Srivastava, 1989, p. 238-239) and the cell-declustering method (Journel, 1983; Deutsch, 1989).

The polygonal method first delineates the polygon of influence of each datum location $\mathbf{u}_\alpha$, that is, the area constituted by all locations $\mathbf{u} \in A$ closer to $\mathbf{u}_\alpha$ than to any other datum location. The area $w_\alpha$ of the polygon centered at location $\mathbf{u}_\alpha$ is then used as a declustering weight for datum value $z(\mathbf{u}_\alpha)$:

$$\hat{m} = \frac{1}{|A|} \sum_{\alpha=1}^{n} w_\alpha\, z(\mathbf{u}_\alpha), \qquad \hat{F}(z) = \frac{1}{|A|} \sum_{\alpha=1}^{n} w_\alpha\, i(\mathbf{u}_\alpha; z), \qquad \sum_{\alpha=1}^{n} w_\alpha = |A|$$

Clustered locations with small polygons of influence receive less weight than isolated locations with large polygons of influence.

The cell-declustering approach calls for dividing the study area $A$ into rectangular cells, and counting the number $B$ of cells that contain at least one datum and the number $n_b$ of data falling within each cell $b$. Each datum location $\mathbf{u}_\alpha$ then receives a weight $\lambda_\alpha = 1/(B \cdot n_b)$, which gives more importance to isolated locations:

$$\hat{m} = \sum_{\alpha=1}^{n} \lambda_\alpha\, z(\mathbf{u}_\alpha) = \frac{1}{B} \sum_{b=1}^{B} \hat{m}_b, \qquad \hat{F}(z) = \sum_{\alpha=1}^{n} \lambda_\alpha\, i(\mathbf{u}_\alpha; z) = \frac{1}{B} \sum_{b=1}^{B} \hat{F}_b(z)$$

where $\hat{m}_b$ and $\hat{F}_b(z)$ are the equal-weighted mean (4.1) and equal-weighted cumulative distribution (4.2) of z-values within cell $b$. Figure 4.3 (bottom graph) shows the seven data locations and four declustering cells ($B = 4$). Each of the four clustered data values receives a weight $\lambda_\alpha = 1/16$ since they share the same cell ($n_b = 4$). The three other cells contain only one datum ($n_b = 1$), hence that isolated datum receives a weight of 1/4. The total weight sums to 1 as it should.

Two key parameters of the cell-declustering technique are the cell size and the location (origin, orientation) of the grid. Clusters are frequently added on an underlying pseudo-regular grid. A natural cell size would then be the spacing of the grid, $\sqrt{|A|/n_g}$, where $n_g$ is the number of gridded data. Provided data are regularly spaced, the center of the cells should correspond to grid nodes.

When the sampling pattern does not suggest a natural cell size, several cell sizes and origins must be tried. The combination that yields the smallest or largest declustered mean is retained according to whether the high- or low-valued areas were preferentially sampled. To avoid erratic results caused by extreme values falling into specific cells, it is useful to average results for several different grid origins for each cell size.

The Jura data were declustered using square cells of 250 m. The sample grid was rotated to be parallel to the N-S direction, then each cell was centered on a grid node. The declustered distribution is compared with the distribution of the 107 gridded data using a Q-Q plot (Figure 4.2, right graphs). Using only the gridded data or applying the cell-declustering technique provides similar results: both distributions plot close to the 45° line with similar statistics.

Figure 4.3: Two techniques that correct for the clustering of data locations (black dots): the polygonal method (top graph) and the cell-declustering technique (bottom graph). The declustering weights are proportional to the area of the polygon of influence of each datum in the first case; they are inversely proportional to the number $n_b$, $b = 1, \ldots, 4$, of data that fall within the same cell in the second case.
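The cell-declustering weights $\lambda_\alpha = 1/(B \cdot n_b)$ can be sketched as follows. The function name `cell_declustering_weights`, the axis-aligned square cells, and the coordinates are assumptions made for illustration; the layout mimics the seven-location example of Figure 4.3 (four clustered data and three isolated data in four cells).

```python
from collections import Counter


def cell_declustering_weights(locations, cell_size, origin=(0.0, 0.0)):
    """Weight 1 / (B * n_b) for each datum, using square cells aligned with the axes."""
    def cell_of(location):
        return (int((location[0] - origin[0]) // cell_size),
                int((location[1] - origin[1]) // cell_size))

    cells = [cell_of(location) for location in locations]
    data_per_cell = Counter(cells)          # n_b for each occupied cell b
    n_occupied = len(data_per_cell)         # B, number of cells with at least one datum
    return [1.0 / (n_occupied * data_per_cell[cell]) for cell in cells]


# Hypothetical coordinates: four clustered locations in one cell, three isolated ones.
locations = [(0.2, 0.2), (0.4, 0.3), (0.3, 0.6), (0.6, 0.7),
             (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
weights = cell_declustering_weights(locations, cell_size=1.0)
```

The four clustered data each receive 1/16 and the three isolated data each receive 1/4, so the weights sum to 1. Trying several values of `cell_size` and `origin`, as recommended above, only requires calling the function repeatedly.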
type (4.1) or (4.2) if the cell size is very small (each cell contains at most one datum) or very large (all data fall into the same cell).

2. By analogy with the mean, the declustered variance is computed as

$$\hat{\sigma}^2 = \sum_{\alpha=1}^{n} \lambda_\alpha\, [z(\mathbf{u}_\alpha) - \hat{m}]^2$$

where $\hat{m}$ is the declustered mean, and $\lambda_\alpha$ are the corresponding declustering weights.
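A minimal sketch of the declustered mean and variance follows, assuming weights that sum to one. The function names and the data values are hypothetical; the weights reuse the 1/16 and 1/4 pattern of the Figure 4.3 example.

```python
def declustered_mean(values, weights):
    """Weighted mean; the declustering weights are assumed to sum to one."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("declustering weights must sum to one")
    return sum(w * z for w, z in zip(weights, values))


def declustered_variance(values, weights):
    """Weighted variance about the declustered mean."""
    m = declustered_mean(values, weights)
    return sum(w * (z - m) ** 2 for w, z in zip(weights, values))


# Hypothetical data: four clustered high values, three isolated low values.
values = [8.0, 9.0, 10.0, 9.0, 2.0, 3.0, 1.0]
weights = [1.0 / 16.0] * 4 + [1.0 / 4.0] * 3

equal_weighted_mean = sum(values) / len(values)   # 6.0
m_declustered = declustered_mean(values, weights)  # 3.75
var_declustered = declustered_variance(values, weights)
```

Because the clustered values are the large ones in this made-up example, the declustered mean (3.75) is lower than the equal-weighted mean (6.0).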

4.1.3 Semivariogram inference

The sample semivariogram $\hat{\gamma}(\mathbf{h})$ is computed as

$$\hat{\gamma}(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} [z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h})]^2$$

where $N(\mathbf{h})$ is the number of pairs of data locations a vector $\mathbf{h}$ apart. In section 2.3.3, several techniques (data transformation, h-scattergram "cleaning") for reducing the influence of extreme data on sample semivariogram values were discussed. Like other summary statistics, $\hat{\gamma}(\mathbf{h})$ values are also sensitive to clustering of data values, particularly when such clustering is combined with a proportional effect.
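The sample semivariogram can be sketched as below. This is a simplified, omnidirectional version: pairs are grouped by separation distance within a lag tolerance rather than by the full vector $\mathbf{h}$, and the function name, the tolerance argument, and the transect data are assumptions for illustration.

```python
import math


def sample_semivariogram(locations, values, lag, tolerance):
    """Half the mean squared difference over pairs separated by lag +/- tolerance.

    Returns (semivariogram value, number of pairs); the value is None if no pair is found.
    """
    total, n_pairs = 0.0, 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            distance = math.dist(locations[i], locations[j])
            if abs(distance - lag) <= tolerance:
                total += (values[i] - values[j]) ** 2
                n_pairs += 1
    if n_pairs == 0:
        return None, 0
    return total / (2.0 * n_pairs), n_pairs


# Hypothetical data along a transect with unit spacing.
locations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 3.0, 2.0, 5.0]

gamma_1, pairs_1 = sample_semivariogram(locations, values, lag=1.0, tolerance=0.1)
gamma_2, pairs_2 = sample_semivariogram(locations, values, lag=2.0, tolerance=0.1)
```

Returning the number of pairs alongside each value makes it easy to spot lags supported by too few pairs.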

Proportional effect

Most often, the local variability of data changes across the study area, a feature known as heteroscedasticity. The proportional effect (Journel and Huijbregts, 1978, p. 186-189) is a particular form of heteroscedasticity where the local variance of data is related to their local mean. For positively skewed distributions, the local variance increases with the local mean (direct proportional effect). Conversely, if the distribution is negatively skewed, larger variances generally correspond to smaller means (inverse proportional effect).

A proportional effect can be detected from a scattergram of local means versus local variances as calculated from moving window statistics. The area is divided into windows of equal size, and the mean and variance are computed within each window. Each window should include enough data for reliable inference, yet there should be enough windows to detect any spatial trend. For small data sets, the windows may have to overlap.

Fourteen non-overlapping 1 km × 1 km windows, each including between 10 and 28 data values, were defined over the study area. The positively skewed variables (Cd, Cu, Pb, Zn) show a clear direct relation between local average concentrations in metal and local variance (Figure 4.4). There is no significant proportional effect for the other metals.

[Figure 4.4: Scattergrams of local variance versus local mean computed within the moving windows.]
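The moving-window statistics used to detect a proportional effect can be sketched as follows, assuming non-overlapping square windows. The function name `window_statistics`, the `min_data` threshold, and the data are hypothetical.

```python
from statistics import mean, pvariance


def window_statistics(locations, values, window_size, min_data=2):
    """Local mean and variance within non-overlapping square windows.

    Windows holding fewer than min_data values are discarded as unreliable.
    """
    windows = {}
    for (x, y), z in zip(locations, values):
        key = (int(x // window_size), int(y // window_size))
        windows.setdefault(key, []).append(z)
    return {key: (mean(zs), pvariance(zs))
            for key, zs in windows.items() if len(zs) >= min_data}


# Hypothetical data: a low-valued window and a high-valued, more variable window.
locations = [(0.1, 0.2), (0.5, 0.5), (0.8, 0.3), (1.2, 0.4), (1.5, 0.6), (1.9, 0.1)]
values = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

stats = window_statistics(locations, values, window_size=1.0)
local_mean_low, local_var_low = stats[(0, 0)]
local_mean_high, local_var_high = stats[(1, 0)]
```

Plotting the resulting (local mean, local variance) pairs gives the scattergram described above; in this made-up case the window with the larger mean also has the larger variance, as in a direct proportional effect.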

Srivastava and Parker (1989) have shown that the non-ergodic correlogram and general relative semivariogram are very resistant to a combination of proportional effect and preferential sampling of high values. Similarly, because they account for lag means and lag variances, the cross covariance and cross correlogram (2.26) provide more robust cross correlation measures than the traditional cross semivariogram (2.28).

2. Relative semivariograms are not substitutes for the traditional semivariogram $\hat{\gamma}(\mathbf{h})$ in the sense that kriging requires a model for the traditional semivariogram. However, these robust measures may provide a clearer description of the spatial continuity, revealing ranges and anisotropy whenever overestimation of the relative nugget effect renders the traditional semivariogram erratic. Such information supplements that provided by the rodogram or madogram on large-scale features (range, anisotropy).

3. All robust measures, such as relative semivariograms, madogram, or rodogram, consider only one attribute at a time. The sensitivity of cross semivariograms to proportional effects and clustering of high values has rarely been investigated. Intuitively, this sensitivity would depend on the sign of the correlation between attributes. When two variables are positively correlated, they are likely to show similar proportional effects and impacts of data clustering. Their cross semivariogram would then combine the adverse effects shown on the two direct semivariograms. Conversely, cross semivariograms of negatively correlated variables would be less sensitive to preferential clustering since small lag means of one variable might be balanced by large lag means of the other.

4.1.4 Covariance inference

4.2 Modeling a Regionalization

Semivariogram or covariance inference provides a set of experimental values $\hat{\gamma}(\mathbf{h}_k)$ or $\hat{C}(\mathbf{h}_k)$ for a finite number of lags, $\mathbf{h}_k$, $k = 1, \ldots, K$, and directions. Continuous functions must be fitted to these experimental values so as to deduce semivariogram or covariance values for any possible lag $\mathbf{h}$ required by interpolation algorithms, and also to smooth out sample fluctuations.

In this section, the conditions that any semivariogram or covariance model must satisfy are first established, and the linear model of regionalization is introduced. Practical issues of modeling are addressed.

4.2.1 Permissible models

The positive definite condition

Let $\{Z(\mathbf{u}), \mathbf{u} \in A\}$ be a stationary RF with a covariance function $C(\mathbf{h})$. The variance of any finite linear combination $Y$ of random variables $Z(\mathbf{u}_\alpha)$, $\mathbf{u}_\alpha \in A$, is expressed as a linear combination of the covariance values and must be non-negative:

$$\text{Var}\{Y\} = \text{Var}\Big\{\sum_{\alpha=1}^{n} \lambda_\alpha Z(\mathbf{u}_\alpha)\Big\} = \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} \lambda_\alpha \lambda_\beta\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) \geq 0 \qquad (4.13)$$

Some semivariogram models, such as the power model (Figure 4.6, bottom graph), have no sill, hence no covariance counterpart. For such semivariogram models, the variance (4.13) can still be expressed in terms of the semivariogram model under the condition that the weights $\lambda_\alpha$ sum to zero:

$$\sum_{\alpha=1}^{n} \lambda_\alpha = 0 \qquad (4.14)$$

This condition on the weights filters the variance term $C(0)$ from expression (4.13), which then becomes

$$\text{Var}\{Y\} = -\sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} \lambda_\alpha \lambda_\beta\, \gamma(\mathbf{u}_\alpha - \mathbf{u}_\beta) \geq 0 \qquad (4.15)$$

Relations (4.14) and (4.15) show that, to ensure the non-negativity of the variance of $Y$, the semivariogram model $\gamma(\mathbf{h})$ must be conditionally negative definite, the condition being that the sum of the weights $\lambda_\alpha$ is zero.
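The role of the conditional negative definite condition can be illustrated numerically. The sketch below evaluates the variance of a linear combination from a semivariogram function for weights summing to zero, in one dimension. The function name, the three locations, and the weights are assumptions; the function $h^3$ is used only as an example of a non-permissible model, since power models are permissible only for $0 < \omega < 2$.

```python
def combination_variance(points, weights, gamma):
    """Variance of sum(w * Z(u)) written with the semivariogram; weights must sum to zero."""
    if abs(sum(weights)) > 1e-12:
        raise ValueError("weights must sum to zero")
    return -sum(w_a * w_b * gamma(abs(p_a - p_b))
                for p_a, w_a in zip(points, weights)
                for p_b, w_b in zip(points, weights))


# Hypothetical configuration: three aligned locations and weights that sum to zero.
points = [0.0, 1.0, 2.0]
weights = [1.0, -2.0, 1.0]

variance_linear = combination_variance(points, weights, lambda h: h)      # permissible model
variance_cubic = combination_variance(points, weights, lambda h: h ** 3)  # not permissible
```

The linear model yields a variance of 4, whereas $h^3$ yields −8 for the same weights: a single configuration with a negative "variance" is enough to show that a function is not a permissible semivariogram model. A non-negative result for one configuration, however, proves nothing by itself.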
To avoid having to test a posteriori the permissibility of a semivariogram model (Christakos, 1984), a common practice consists of using only linear combinations of basic models that are known to be permissible. The following are the five most frequently used basic models, written here with $h = |\mathbf{h}|$:

1. Nugget effect model

$$g(h) = \begin{cases} 0 & \text{if } h = 0 \\ 1 & \text{otherwise} \end{cases} \qquad (4.16)$$

2. Spherical model with range $a$

$$g(h) = \text{Sph}\Big(\frac{h}{a}\Big) = \begin{cases} 1.5\,\dfrac{h}{a} - 0.5\,\Big(\dfrac{h}{a}\Big)^3 & \text{if } h \leq a \\ 1 & \text{otherwise} \end{cases}$$

3. Exponential model with practical range $a$

$$g(h) = 1 - \exp\Big(\frac{-3h}{a}\Big)$$

4. Gaussian model with practical range $a$

$$g(h) = 1 - \exp\Big(\frac{-3h^2}{a^2}\Big)$$

5. Power model

$$g(h) = h^\omega \quad \text{with } 0 < \omega < 2$$

Figure 4.6: Bounded semivariogram models with the same practical range (top graph) and power models for different values of the parameter $\omega$ (bottom graph).

• The spherical model reaches its sill at distance $a$ (actual range).

• The exponential and Gaussian models reach their sill asymptotically. A practical range $a$ is defined as the distance at which the model value is at 95% of the sill.

Bounded models are also referred to as transition models, and their covariance counterpart is $c(h) = 1 - g(h)$. In contrast, the power model has no sill, hence no covariance counterpart.

Three types of behavior near the origin are distinguished:

1. Parabolic behavior, e.g., a Gaussian model. Such behavior is characteristic of highly regular phenomena such as topographic elevation of gently undulating hills.

2. Linear behavior, e.g., spherical or exponential model. Note that for the same practical range, the exponential model starts increasing faster than the spherical model (Figure 4.6, top graph).

3. Discontinuous behavior, e.g., nugget effect model.

The behavior near the origin of the power model changes with the value of the parameter $\omega$. It is linear for $\omega = 1$ (linear model) and approaches parabolic behavior as $\omega$ increases toward 2 (Figure 4.6, bottom graph).
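The five basic models can be written as small functions of the distance $h$. This sketch assumes the usual unit-sill forms, with the factor 3 in the exponential and Gaussian models so that the value at the practical range is about 95% of the sill; the function names are illustrative.

```python
import math


def nugget(h):
    """Nugget effect model: 0 at the origin, 1 elsewhere."""
    return 0.0 if h == 0 else 1.0


def spherical(h, a):
    """Spherical model with unit sill and (actual) range a."""
    if h >= a:
        return 1.0
    ratio = h / a
    return 1.5 * ratio - 0.5 * ratio ** 3


def exponential(h, a):
    """Exponential model with unit sill and practical range a."""
    return 1.0 - math.exp(-3.0 * h / a)


def gaussian(h, a):
    """Gaussian model with unit sill and practical range a."""
    return 1.0 - math.exp(-3.0 * (h / a) ** 2)


def power(h, omega):
    """Power model; permissible only for 0 < omega < 2."""
    if not 0.0 < omega < 2.0:
        raise ValueError("omega must lie strictly between 0 and 2")
    return h ** omega


a = 1.0
at_range = (spherical(a, a), exponential(a, a), gaussian(a, a))
near_origin = (spherical(0.1, a), exponential(0.1, a), gaussian(0.1, a))
```

At $h = a$ the spherical model is exactly at its sill while the other two are at about 0.95. Near the origin the exponential model rises fastest and the Gaussian model slowest, in line with the linear versus parabolic behaviors listed above.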

4.2.2 Anisotropic models


A phenomenon is said to be anisotropic when its pattern of spatial variability changes with direction. For example, nickel concentrations were found to vary more continuously in the SW-NE direction corresponding to the elongation of the geologic outcrops. Modeling anisotropy calls for functions that depend on the vector $\mathbf{h}$ rather than on the distance $h = |\mathbf{h}|$ only. The following presentation is limited to two-dimensional anisotropic models with vector $\mathbf{h} = (h_x, h_y)^T$. Three-dimensional anisotropic models are discussed in Isaaks and Srivastava (1989), Chapter 16.

An anisotropy is said to be geometric when:

2. the rose diagram of ranges², that is, the plot of range values versus the azimuth $\theta$ of the direction, is an ellipse.

Consider, for example, the elliptical rose diagram of ranges in Figure 4.7 (top right graph). The major axis of the ellipse corresponding to the direction of maximum continuity forms an angle $\theta$ with the coordinate axis $y$ (north direction). By convention, the azimuth angle $\theta$ is measured in degrees clockwise from the y-axis. The minor direction of anisotropy is perpendicular to the major axis of the ellipse and has an azimuth $\phi = \theta + 90°$.

The two semivariograms computed in the directions of azimuth $\theta$ and $\phi$ are fitted by spherical models with the same unit sill but different range values; see Figure 4.7 (top left graph). The major and minor ranges of anisotropy, $a_\theta$ and $a_\phi$, are plotted as the major and minor radii of the ellipse. The anisotropy factor $\lambda$ is defined as the ratio of the minor range to the major range, $\lambda = a_\phi / a_\theta < 1$.

The anisotropy correction consists of transforming the vector of original coordinates $\mathbf{h} = (h_x, h_y)^T$ into a new vector $\mathbf{h}' = (h'_x, h'_y)^T$, so that the value of the anisotropic semivariogram model $g(\mathbf{h})$ identifies that of an isotropic model $g'(|\mathbf{h}'|)$ in the new system of coordinates:

$$g(\mathbf{h}) = g'(|\mathbf{h}'|)$$

where $g'(\cdot)$ is an isotropic model with a range equal to the minor range $a_\phi$ of anisotropy.

The coordinate transformation calls for two key parameters: the azimuth angle $\theta$ of the direction of maximum continuity and the anisotropy factor $\lambda$. The transformation proceeds in two steps:

2. The ellipse is then rescaled to a circle of radius equal to the minor range $a_\phi$ (Figure 4.7, right bottom graph). The rescaling of the new coordinates $(h_\phi, h_\theta)^T$ is written as

$$\mathbf{h}' = \mathbf{D}_\lambda\, [h_\phi, h_\theta]^T$$

where $\mathbf{D}_\lambda$ is the diagonal matrix of affinity

$$\mathbf{D}_\lambda = \begin{bmatrix} 1 & 0 \\ 0 & \lambda \end{bmatrix}$$

The resulting model behaves as follows in specific directions:

2. In the minor direction of azimuth $\phi$, the distance $h_\theta$ is equal to zero, hence $|\mathbf{h}'| = h_\phi$. The model $g'(|\mathbf{h}'|)$ reverts to a spherical model of range $a_\phi$:

$$g'(|\mathbf{h}'|) = \text{Sph}\Big(\frac{h_\phi}{a_\phi}\Big)$$

3. For any other direction such that $h_\phi \cdot h_\theta \neq 0$, say, the direction of azimuth $\zeta$, the model $g'(|\mathbf{h}'|)$ yields a spherical model with a range value $a_\zeta$ that would plot on the rose diagram of Figure 4.7 (top right graph), with $a_\phi < a_\zeta < a_\theta$.

Any isotropic model can be interpreted as a particular case of the geometric anisotropic model (4.21) where the anisotropy factor $\lambda$ is equal to one ($a_\phi = a_\theta$). If $\lambda = 1$, it does not matter what $\theta$ is.

² For linear semivariogram models $\gamma(\mathbf{h}) = b \cdot h$, it is the slope $b$ that changes with direction, and the rose diagram of the inverse of slope $1/b$ is considered.
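The two-step correction for geometric anisotropy (rotation to the anisotropy axes, then rescaling of the coordinate along the direction of maximum continuity) can be sketched as follows. The sketch assumes a spherical structure and an azimuth measured clockwise from the y-axis; the function names and parameter values are hypothetical.

```python
import math


def spherical(h, a):
    """Spherical model with unit sill and range a."""
    if h >= a:
        return 1.0
    ratio = h / a
    return 1.5 * ratio - 0.5 * ratio ** 3


def anisotropic_distance(hx, hy, azimuth_deg, anisotropy_factor):
    """Length of the transformed vector h' for a major direction of given azimuth."""
    theta = math.radians(azimuth_deg)
    # Rotation: components along the major (azimuth theta) and minor (theta + 90) directions.
    h_theta = hx * math.sin(theta) + hy * math.cos(theta)
    h_phi = hx * math.cos(theta) - hy * math.sin(theta)
    # Rescaling: shrink the major component by the factor minor range / major range.
    return math.hypot(h_phi, anisotropy_factor * h_theta)


def geometric_anisotropic_spherical(hx, hy, azimuth_deg, major_range, minor_range):
    """Anisotropic spherical model, evaluated as an isotropic model of range minor_range."""
    factor = minor_range / major_range
    return spherical(anisotropic_distance(hx, hy, azimuth_deg, factor), minor_range)


# Hypothetical anisotropy: azimuth 45 degrees, major range 2, minor range 1.
sin45, cos45 = math.sin(math.radians(45.0)), math.cos(math.radians(45.0))
along_major = geometric_anisotropic_spherical(sin45, cos45, 45.0, 2.0, 1.0)
along_minor = geometric_anisotropic_spherical(cos45, -sin45, 45.0, 2.0, 1.0)
```

For a unit-length lag, the model is still below the sill along the major direction (0.6875, the value of a spherical model of range 2 at distance 1) but has already reached the sill along the minor direction.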

An anisotropy that involves sill values varying with direction is said to be zonal (Figure 4.8, right graph). The semivariogram in the direction of azimuth $\phi$ has a longer range $a_\phi$ and also a larger sill than in other directions. Such anisotropy can be modeled as the sum of an isotropic transition model $g_1(|\mathbf{h}|)$ and a "zonal" model $g_2(h_\phi)$, which depends only on the distance $h_\phi$ in the direction of greater variance:

$$g(\mathbf{h}) = g_1(|\mathbf{h}|) + g_2(h_\phi) \qquad (4.23)$$

where the model $g_2(\cdot)$ has a range $a_\phi$.

The component $g_2(h_\phi)$ can be seen as an extreme case of the geometric anisotropic model (4.21); its modeling proceeds in two steps:

2. The new axes are then rescaled so that the zonal model does not contribute to the direction of maximum continuity (azimuth $\theta$). Such rescaling amounts to setting the range $a_\theta$ in that direction to infinity, hence the anisotropy factor $\lambda = a_\phi / a_\theta$ to zero:

$$\mathbf{h}' = \mathbf{D}_0\, [h_\phi, h_\theta]^T = [h_\phi, 0]^T$$

Figure 4.8: Example of zonal anisotropy in the direction of azimuth $\phi$ (greater variance). The anisotropic model consists of an isotropic model $g_1(|\mathbf{h}|)$ and a zonal component $g_2(h_\phi)$, which depends only on the distance $h_\phi$.

Consider, for example, the zonal component $g_2(h_\phi)$ in Figure 4.8 (right graph). This model can be seen as an anisotropic model $g_2(\mathbf{h})$ that identifies a spherical model of range $a_\phi$ in the direction of greater variance but does not contribute in the perpendicular direction of azimuth $\theta$. With the new system of coordinates $(h'_x, h'_y)^T$, the anisotropic model $g_2(\mathbf{h})$ can be expressed as an isotropic spherical model $g'_2(|\mathbf{h}'|)$ with a range $a_\phi$:

$$g_2(\mathbf{h}) = g'_2(|\mathbf{h}'|) = \text{Sph}\Big(\frac{|\mathbf{h}'|}{a_\phi}\Big) \quad \text{with} \quad |\mathbf{h}'| = |h_\phi|$$

3. In other directions, where both distances $h_\phi$ and $h_\theta$ are different from zero, the model $g'_2(|\mathbf{h}'|)$ yields values that are intermediate between those obtained along the minor and major directions of anisotropy; for example, in the direction of azimuth $\zeta$, the directional semivariogram $\gamma(h_\zeta)$ would plot between the two models of Figure 4.8.

4.2.3 The linear model of regionalization

In most situations, two or more basic models $g(\mathbf{h})$ or $c(\mathbf{h})$ must be combined (nested) to fit the shape of the experimental semivariogram or covariance function. However, not all combinations of permissible semivariogram or covariance models result in a permissible semivariogram or covariance function. The easiest way to build a permissible model consists in first building a random function. The covariance function or semivariogram of that random function is then, by definition, permissible.

The linear model of regionalization builds the random function $Z(\mathbf{u})$ as a linear combination of $(L + 1)$ independent random functions $Y^l(\mathbf{u})$, each with zero mean and basic covariance function $c_l(\mathbf{h})$:

$$Z(\mathbf{u}) = \sum_{l=0}^{L} a^l\, Y^l(\mathbf{u}) + m$$

with

$$E\{Y^l(\mathbf{u})\} = 0, \qquad \text{Cov}\{Y^l(\mathbf{u}), Y^{l'}(\mathbf{u}+\mathbf{h})\} = \begin{cases} c_l(\mathbf{h}) & \text{if } l = l' \\ 0 & \text{otherwise} \end{cases}$$
willr
[Figure 4.9: Basic models (left graph) and the resulting linear model of regionalization (right graph), plotted against distance $h$.]

A similar development can be done in terms of semivariograms. Let $g_l(\mathbf{h})$ denote the semivariogram of the RF $Y^l(\mathbf{u})$, with the cross semivariogram between any two different RFs $Y^l(\mathbf{u})$ and $Y^{l'}(\mathbf{u})$ equal to zero. The semivariogram of $Z(\mathbf{u})$ is then

$$\gamma(\mathbf{h}) = \sum_{l=0}^{L} b^l\, g_l(\mathbf{h}) \qquad (4.28)$$

where the positive coefficient $b^l$ is the variance contribution of the corresponding basic semivariogram model $g_l(\mathbf{h})$.

Consider the random function $Z(\mathbf{u})$ built as a linear combination of three independent RFs:

$$Z(\mathbf{u}) = 3 \cdot Y^0(\mathbf{u}) - 4 \cdot Y^1(\mathbf{u}) + 2 \cdot Y^2(\mathbf{u}) + 0,$$

with the three corresponding semivariogram models: a nugget effect, a spherical model of range $a = 1$, and a power model with $\omega = 1$. The three basic semivariogram models are isotropic and are shown in Figure 4.9 (left graph). According to relation (4.28), with each coefficient $b^l$ equal to the square of the corresponding multiplier of $Y^l(\mathbf{u})$, the semivariogram model is

$$\gamma(\mathbf{h}) = 9\, g_0(h) + 16\, \text{Sph}\Big(\frac{h}{1}\Big) + 4\, h$$

where $g_0(h)$ is the nugget effect model (4.16). The model $\gamma(\mathbf{h})$ is, by construction, permissible and is depicted in Figure 4.9 (right graph).

Remarks

1. Each basic model of a linear model can be isotropic or can display either type of anisotropy. For example, the model in Figure 4.8 is a combination of an isotropic model and a zonal model. Accounting for the concept of linear rescaling of coordinates introduced in section 4.2.2, the general linear model is defined as

$$\gamma(\mathbf{h}) = \sum_{l=0}^{L} b^l\, g_l(|\mathbf{h}'_l|)$$

where $\mathbf{h}'_l$ is the vector of coordinates rescaled for the $l$th basic structure.

2. Under second-order stationarity, the coefficients $b^l$ in both expressions (4.27) and (4.28) are identical, and their sum equals the a priori variance $C(0)$. Indeed, at $|\mathbf{h}| = 0$, the value of each basic model $c_l(\mathbf{h})$ is 1; then

$$C(0) = \sum_{l=0}^{L} b^l\, c_l(0) = \sum_{l=0}^{L} b^l$$

4.2.4 The practice of modeling

Three key ingredients of the modeling process are:

1. experimental semivariogram or covariance values,

2. permissible semivariogram or covariance models, see section 4.2.1, and

3. knowledge of the area and phenomenon under study (e.g., directions of anisotropy, importance of measurement errors), together with robust measures, such as the madogram (for evaluation of anisotropy directions, ratio, and range) or relative semivariograms.
directions, ratio, and range) or relative scmiv:triograrns
The art of modeling consists of capitalizing on these different sources of information to build a permissible model that captures the major spatial features of the attribute under study. The following discussion focuses on the semivariogram, which is the most frequently used structural tool. A similar approach can be adopted for covariance functions.
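As a concrete instance of a nested model, the three-structure example of section 4.2.3 (nugget effect, spherical model of range 1, and power model with $\omega = 1$, with multipliers 3, −4, and 2) can be evaluated as below. The sketch assumes that each variance contribution $b^l$ is the square of the corresponding multiplier, which follows from the independence of the component random functions; the function names are illustrative.

```python
def nugget(h):
    """Nugget effect model: 0 at the origin, 1 elsewhere."""
    return 0.0 if h == 0 else 1.0


def spherical(h, a):
    """Spherical model with unit sill and range a."""
    if h >= a:
        return 1.0
    ratio = h / a
    return 1.5 * ratio - 0.5 * ratio ** 3


def nested_semivariogram(h, multipliers=(3.0, -4.0, 2.0)):
    """Nugget + spherical (range 1) + linear structures, weighted by squared multipliers."""
    b_nugget, b_spherical, b_linear = (a * a for a in multipliers)  # 9, 16, 4
    return b_nugget * nugget(h) + b_spherical * spherical(h, 1.0) + b_linear * h


model_values = [nested_semivariogram(h) for h in (0.0, 0.5, 1.0, 2.0)]
```

The values are 0 at the origin, then 22, 29, and 33 at distances 0.5, 1, and 2: the model jumps by the nugget contribution 9, and beyond the spherical range it keeps increasing linearly because of the power structure, so it has no sill.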

Automatic or visual modeling?

Too often, the modeling process is viewed as a mere exercise in fitting a curve to experimental values. The semivariogram is then modeled using statistical fitting procedures, which can be roughly classified into two categories:

• Full black-box procedures, which involve an automatic choice and fitting of the model

• Semi-automatic procedures limited to the estimation of the parameters (sill, range) of models chosen by the user

Black-box procedures must be avoided because they cannot take into account ancillary information that is critical when sparse or preferential sampling makes the experimental semivariogram unreliable. Semi-automatic procedures may facilitate the determination of model parameters when the form of the experimental semivariogram is clear. However, such procedures contribute little to the actual modeling, since the most important decisions regarding number, type, and anisotropy of basic structures must be taken by the user. With the help of a good interactive graphical program (e.g., Chu, 1993), the user would do better than sophisticated fully automatic fitting procedures.

The modeling process relies critically on a series of user's decisions, which must be backed by experimental data or ancillary information, as follows:

1. whether to fit an isotropic or anisotropic model,

3. which parameters (sill, anisotropy, range or slope) for each basic semivariogram model to use.

Isotropic or anisotropic models?

The conventional approach for detecting anisotropy is to compare experimental semivariograms computed in several directions (Figure 4.10, top graphs). To decide whether geometric anisotropy is present, at least three directions must be considered. Directions of major and minor spatial continuity are often suggested by banding of large or small values on a location map, or by ancillary information, such as orientation of lithologic formations or prevailing [...] contour lines whose major axis indicates the direction of maximum continuity (Figure 4.10, bottom right graph). Note that cobalt concentrations show a combination of short-range isotropy (small circles) and long-range geometric anisotropy (large ellipses) (Figure 4.10, right graphs).

The computation of a semivariogram map requires considering many directions and lags. Thus, such a representation is best suited for large gridded data sets. When data are sparse and irregularly spaced, large angular and lag tolerances are needed to fill in the semivariogram map, hence the spatial resolution of the map may be drastically reduced. Beware that the semivariogram map provides information only about the directions of major and minor spatial continuity. Anisotropic models still must be fitted to directional semivariograms.

An anisotropy that is not clearly apparent on experimental semivariograms nor backed by any qualitative information is better ignored. Conversely, strong prior qualitative information may lead one to adopt an anisotropic model even if data sparsity prevents seeing anisotropy from the experimental semivariograms.

One should avoid overfitting experimental semivariograms. The objective is to capture the major spatial features of the attribute, not to model any (possibly spurious) details of the sample semivariograms. When different nested models provide similar fits, one should select the simplest one. Where a spherical model suffices, there is no need to sum two exponential models (Figure 4.11). The more complicated model usually does not lead to more accurate estimates.

In earth sciences, there is rarely physical ground for choosing any particular basic model $g_l(\mathbf{h})$. For interpolation, a critical feature is the behavior of the semivariogram model at the origin. Models with a parabolic behavior at the origin should be used only for phenomena that are known to be highly continuous, for example, the surface of a water table. In most other cases, any model with a linear behavior at the origin would be appropriate.

The last step in the modeling process consists of determining the parameters (sill, range, anisotropy ratio) of the selected models. It bears repeating that a good interactive graphical program is more useful than any sophisticated statistical fitting procedure. The following tips should guide the user in the inference of the parameters of the semivariogram model.

1. Nugget effect

The nugget effect is usually determined by extrapolating the behavior of the first few semivariogram values to the vertical axis. The following should be taken into consideration:

1. The relative nugget effect on experimental semivariograms tends to increase with the lag tolerance (Figure 4.12) and with data sparsity. Typically, the relative nugget effect decreases as more and better data become available.

2. Where data are clustered and there is a proportional effect, the relative nugget effect is better inferred from relative semivariograms (see discussion in section 4.1.3).

Figure 4.12: Effect of lag tolerance on the inference of the nugget effect (horizontal axes: distance in km). A large lag tolerance (Δh = 200 m, solid line) leads to overestimation of the relative nugget component; compare with the dashed line (Δh = 50 m).
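The parameters discussed above (nugget effect, sill, range, behavior at the origin) can be made concrete with a minimal sketch of a nested semivariogram model. The function names, the practical-range convention used for the exponential model (95% of the sill reached at distance a), and the numerical sills and ranges are assumptions chosen for illustration only; they are not fitted values.

```python
import numpy as np

def nugget(h, sill=1.0):
    """Pure nugget effect: 0 at h = 0, equal to the sill for any h > 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0.0, sill, 0.0)

def spherical(h, a, sill=1.0):
    """Spherical model with range a: linear at the origin, sill reached at h = a."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r**3)

def exponential(h, a, sill=1.0):
    """Exponential model, assuming a is the practical range (95% of the sill at h = a)."""
    h = np.asarray(h, dtype=float)
    return sill * (1.0 - np.exp(-3.0 * h / a))

def nested(h, structures):
    """Sum of basic structures; each item is (model_function, keyword_arguments)."""
    return sum(f(h, **kw) for f, kw in structures)

# Hypothetical nested model: nugget + short-range + long-range spherical structures.
structures = [
    (nugget, {"sill": 0.3}),
    (spherical, {"a": 0.2, "sill": 0.3}),
    (spherical, {"a": 1.3, "sill": 0.4}),
]
total_sill = float(nested(10.0, structures))   # value far beyond the largest range
relative_nugget = 0.3 / total_sill             # share of the sill due to the nugget
```

Keeping each basic structure as a separate function makes it easy to compare a simple model with a more complicated one before deciding whether the extra structure is worth retaining.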

Anisotropy parameters

Modeling a geometric anisotropy proceeds in two steps:

1. The minor and major ranges (or slopes) are first inferred from a series of experimental directional semivariograms (Figure 4.13, top graphs).

2. Model values are calculated in other directions and are checked against the experimental semivariograms computed in the same directions (Figure 4.13, bottom graphs).

Anisotropy ratios tend to be underestimated whenever directional semivariograms with large angular tolerances are computed. Indeed, pooling data pairs of different directions reduces the discrepancy between the most continuous and less continuous directions. In Figure 4.13, the angular tolerance was reduced to ±10° to better capture the anisotropy. If data sparsity prohibits using small angular tolerances, one may decide arbitrarily to increase the anisotropy ratio inferred from the experimental directional semivariograms.

[Figure 4.13 panels: Dir N67.5E and Dir N157.5E (top), Dir N22.5E and Dir N112.5E (bottom); horizontal axes: distance in km.]

How good is the model?

There is always uncertainty attached to the parameters of the semivariogram model: many models can match equally well the sample information; see, for example, Figure 4.11 (page 100). There is a tendency to justify the choice of a particular model using statistical criteria.

Weighted least-squares criteria

A criterion commonly used in conjunction with automatic fitting is the weighted sum of squares (WSS) of differences between experimental $\hat{\gamma}(\mathbf{h}_k)$ and model $\gamma(\mathbf{h}_k)$ semivariogram values:

$$\text{WSS} = \sum_{k=1}^{K} w(\mathbf{h}_k)\,\left[\hat{\gamma}(\mathbf{h}_k) - \gamma(\mathbf{h}_k)\right]^2 \qquad (4.29)$$

The weight $w(\mathbf{h}_k)$ given to each lag $\mathbf{h}_k$ is usually taken proportional to the number $N(\mathbf{h}_k)$ of data pairs that contribute to the estimate $\hat{\gamma}(\mathbf{h}_k)$. The implicit assumption is that the reliability of an experimental semivariogram value increases with statistical mass. An alternative that gives more weight to the first lags consists of dividing the number of data pairs by the squared model value: $N(\mathbf{h}_k)/[\gamma(\mathbf{h}_k)]^2$ (Cressie, 1985).

The WSS criterion is but a measure of the goodness of the fit; other measures can also be used. Recall that the objective is to capture the major spatial features of the attribute, not to build a semivariogram model that is the closest possible to experimental values. For example, a model of spatial continuity that accounts for reliable ancillary information should be preferred to a nugget-like model that closely fits a noisy experimental semivariogram.

The value of the WSS criterion depends on the number of lags considered and on the weights selected by the user. The model that yields the smallest WSS value need not be the same for various choices of parameters $K$ or $w(\mathbf{h}_k)$. Therefore, the ranking of alternative models, though based on statistical criteria, still depends on prior user's decisions that are necessarily subjective.
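The dependence of the ranking on the weights can be illustrated with a short sketch of the WSS criterion. The function name `wss`, the experimental values, the pair counts, and the two candidate models are all hypothetical; they were chosen so that the two weighting schemes described above disagree on which model fits better.

```python
import numpy as np

def wss(gamma_exp, gamma_model, n_pairs, cressie=False):
    """Weighted sum of squared differences between experimental and model values.

    Weights are the numbers of data pairs N(h_k) or, if cressie is True,
    N(h_k) divided by the squared model value.
    """
    gamma_exp = np.asarray(gamma_exp, dtype=float)
    gamma_model = np.asarray(gamma_model, dtype=float)
    weights = np.asarray(n_pairs, dtype=float)
    if cressie:
        weights = weights / gamma_model**2
    return float(np.sum(weights * (gamma_exp - gamma_model) ** 2))

# Hypothetical experimental semivariogram at three lags and its pair counts.
gamma_exp = [0.40, 0.80, 1.00]
n_pairs = [100, 200, 300]

# Two hypothetical candidate models evaluated at the same lags.
model_a = [0.30, 0.80, 1.00]   # misses the first lag
model_b = [0.40, 0.80, 1.15]   # misses the last lag

wss_pairs = {"A": wss(gamma_exp, model_a, n_pairs),
             "B": wss(gamma_exp, model_b, n_pairs)}
wss_cressie = {"A": wss(gamma_exp, model_a, n_pairs, cressie=True),
               "B": wss(gamma_exp, model_b, n_pairs, cressie=True)}
```

With weights proportional to the number of pairs, model A has the smaller WSS; with the weights that emphasize the first lags, model B does. Neither ranking is more objective than the other.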
Figure 4.13: Experimental directional Co semivariograms and the geometric anisotropy model fitted. The major and minor directions of anisotropy are N67.5E and N157.5E.

Cross validation

Semivariogram modeling is rarely a goal per se. The ultimate objective is, for example, to estimate metal concentrations at unsampled locations. Cross validation allows one to compare the impact of different models on interpolation results; see Davis (1987), Journel (1987), Isaaks and Srivastava (1989, p. 351-368). The idea consists of removing one datum at a time from the data set and re-estimating this value from remaining data using the different semivariogram models. Interpolated and actual values are compared, and the model that yields the most accurate predictions is retained.

The use of cross validation to select semivariogram models suffers, however, from these severe restrictions:
• A rescaling of the semivariogram model does not influence kriging weights (see section 5.8.1). Thus, total sill values cannot be cross validated from re-estimation scores.

• Values of the semivariogram model for lags smaller than the shortest sampling interval do not intervene in interpolation algorithms. Hence, critical model parameters, such as relative nugget effect and the semivariogram behavior at the origin, cannot be cross validated.

• Sample data, particularly when they are scarce and preferentially located, may not be representative of the study area. Therefore, the model that produces the best cross-validated results may not yield the best predictions at unsampled locations.

• The re-estimation scores depend on the semivariogram model but also critically on implementation parameters related to the search strategy and the specific interpolation algorithm used. If the re-estimation scores of all models are deemed unsatisfactory, it is not clear what is faulty: the decision of stationarity, the semivariogram models, or the implementation of the algorithm. If the model is inadequate, then it is unclear which parameter should be changed.

Figure 4.14: Experimental omnidirectional semivariograms for metal concentrations and the isotropic models fitted.
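The leave-one-out procedure and the first restriction listed above can be sketched in a few lines. The sketch assumes ordinary kriging with all remaining data (no search neighborhood) and a spherical model with a nugget effect; the function names, coordinates, data values, and model parameters are hypothetical.

```python
import numpy as np

def spherical(h, a, sill=1.0, nugget=0.0):
    """Spherical semivariogram with range a; zero at h = 0, nugget + sill beyond a."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    return np.where(h > 0.0, nugget + sill * (1.5 * r - 0.5 * r**3), 0.0)

def ok_weights(coords, target, gamma):
    """Ordinary kriging weights from the semivariogram form of the kriging system."""
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = gamma(dist)
    lhs[n, n] = 0.0                      # unbiasedness constraint (Lagrange parameter)
    rhs = np.ones(n + 1)
    rhs[:n] = gamma(np.linalg.norm(coords - target, axis=1))
    return np.linalg.solve(lhs, rhs)[:n]

def loo_errors(coords, values, gamma):
    """Re-estimate each datum from all the others; return estimate minus true value."""
    errors = np.empty(len(values))
    for i in range(len(values)):
        keep = np.arange(len(values)) != i
        w = ok_weights(coords[keep], coords[i], gamma)
        errors[i] = w @ values[keep] - values[i]
    return errors

# Hypothetical data: five locations (km) and concentrations.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.4, 0.3]])
values = np.array([1.2, 0.8, 1.5, 1.0, 1.1])

model = lambda h: spherical(h, a=2.0, sill=1.0, nugget=0.2)
rescaled = lambda h: 5.0 * model(h)      # same shape, total sill multiplied by 5

errors_model = loo_errors(coords, values, model)
errors_rescaled = loo_errors(coords, values, rescaled)
```

The two sets of re-estimation errors coincide because multiplying the semivariogram by a constant changes only the Lagrange parameter, not the weights; this is why re-estimation scores carry no information about the total sill.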

Remarks

The "goodness" character of a model is elusive and cannot be measured by rigorous tests; there is no best semivariogram model. Rather than relying on an elusive objectivity provided by questionable statistical criteria, the user should decide on a model according to the following:

1. The user's experience and the information available; particular features of the experimental semivariogram may be deemed spurious and not worth modeling, whereas ancillary information may lead to model features that are not apparent on the experimental curves.

2. The objective of the study; modeling short-range features may not be relevant if one seeks a smooth representation of the long-range features.

A subjective decision that is clearly documented is preferable to a blind decision based on elusive tests. One must keep in mind that the choice of a semivariogram model is posterior and secondary to the fundamental choice of the RF model for modeling uncertainty and to the critical decision of stationarity. Such prior decisions are far more consequential for interpolation results than the use of an exponential model instead of a spherical model, or than setting the range value to 10 rather than 12.

Semivariogram models for metals

Figure 4.14 shows the experimental omnidirectional semivariograms and the models fitted. Two structures suffice to capture the major features of the Cr semivariogram. A third structure is needed to reproduce both short-range and long-range components of Ni and Zn semivariograms.

The semivariograms of the three other metals (Cd, Cu, and Pb) are more erratic and could be modeled either as a combination of a nugget effect and a long-range exponential model or as a combination of a nugget effect and two spherical models with short and long ranges. The latter model was chosen because the short-range structure could be related to the short-range distribution of human contaminant sources as featured by indicator semivariograms of land uses in Figure 2.17 (page 43).

4.3 Modeling a Coregionalization

Multivariate covariance or semivariogram inference provides a set of matrices $\hat{\mathbf{C}}(\mathbf{h}_k) = [\hat{C}_{ij}(\mathbf{h}_k)]$ and $\hat{\mathbf{\Gamma}}(\mathbf{h}_k) = [\hat{\gamma}_{ij}(\mathbf{h}_k)]$ for a finite number of lags, $\mathbf{h}_k$, $k = 1, \ldots, K$, and directions. As in the univariate case, modeling provides covariance or semivariogram values for any lag $\mathbf{h}$ as required by interpolation algorithms.

Modeling a coregionalization calls for inferring $N_v(N_v + 1)/2$ direct and cross semivariogram (covariance) models ignoring any lag effect. The difficulty does not lie in the number of models to infer, but in the fact that these models cannot be built independently from one another. In this section, the conditions that any matrix of semivariogram or covariance models must satisfy are first established. The linear model of coregionalization is then introduced, and practical issues of determining its parameters are discussed.
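4.3.1 Permissible models

Let $\{Z_i(\mathbf{u}),\ i = 1, \ldots, N_v\}$ be a set of $N_v$ intercorrelated random functions, and $\{\mathbf{u}_\alpha,\ \alpha = 1, \ldots, n\}$ be a set of $n$ data locations. The variance of any finite linear combination $Y$ of the RVs $Z_i(\mathbf{u}_\alpha)$, $\mathbf{u}_\alpha \in A$, $i = 1, \ldots, N_v$, can be expressed as a linear combination of cross covariance values:

$$\text{Var}\{Y\} = \text{Var}\left\{\sum_{i=1}^{N_v}\sum_{\alpha=1}^{n} \lambda_{\alpha i}\, Z_i(\mathbf{u}_\alpha)\right\} = \sum_{i=1}^{N_v}\sum_{j=1}^{N_v}\sum_{\alpha=1}^{n}\sum_{\beta=1}^{n} \lambda_{\alpha i}\,\lambda_{\beta j}\, C_{ij}(\mathbf{u}_\alpha - \mathbf{u}_\beta) \ \ge\ 0$$

for any choice of $n$ locations $\mathbf{u}_\alpha \in A$ and any weights $\lambda_{\alpha i}$. Using matrix notation, the variance of $Y$ is written as

$$\text{Var}\{Y\} = \sum_{\alpha=1}^{n}\sum_{\beta=1}^{n} \boldsymbol{\lambda}_\alpha^T\, \mathbf{C}(\mathbf{u}_\alpha - \mathbf{u}_\beta)\, \boldsymbol{\lambda}_\beta \ \ge\ 0 \qquad (4.30)$$

where $\boldsymbol{\lambda}_\alpha = [\lambda_{\alpha 1}, \ldots, \lambda_{\alpha N_v}]^T$ is an $N_v \times 1$ vector of weights $\lambda_{\alpha i}$, and $\mathbf{C}(\mathbf{u}_\alpha - \mathbf{u}_\beta) = [C_{ij}(\mathbf{u}_\alpha - \mathbf{u}_\beta)]$ is the $N_v \times N_v$ matrix of stationary cross covariances between any two RVs $Z_i(\mathbf{u}_\alpha)$ and $Z_j(\mathbf{u}_\beta)$. To ensure that the variance of $Y$ is non-negative, the matrix of auto and cross covariance models $\mathbf{C}(\mathbf{h})$ must be positive semi-definite.

Accounting for the relation $\mathbf{C}(\mathbf{h}) = \mathbf{C}(0) - \mathbf{\Gamma}(\mathbf{h})$, the variance (4.30) is rewritten in terms of the matrix of semivariogram models $\mathbf{\Gamma}(\mathbf{h})$:

$$\text{Var}\{Y\} = \sum_{\alpha=1}^{n}\sum_{\beta=1}^{n} \boldsymbol{\lambda}_\alpha^T\, \mathbf{C}(0)\, \boldsymbol{\lambda}_\beta \ -\ \sum_{\alpha=1}^{n}\sum_{\beta=1}^{n} \boldsymbol{\lambda}_\alpha^T\, \mathbf{\Gamma}(\mathbf{u}_\alpha - \mathbf{u}_\beta)\, \boldsymbol{\lambda}_\beta \ \ge\ 0 \qquad (4.31)$$

where $\mathbf{C}(0)$ is the variance-covariance matrix. As in the univariate case, there are semivariogram models that have no covariance counterpart. For such models, the variance of $Y$ is defined on the condition that the vectors of weights $\boldsymbol{\lambda}_\alpha$ sum to the null vector, which allows the filtering of the term $\mathbf{C}(0)$ from expression (4.31):

$$\text{Var}\{Y\} = -\sum_{\alpha=1}^{n}\sum_{\beta=1}^{n} \boldsymbol{\lambda}_\alpha^T\, \mathbf{\Gamma}(\mathbf{u}_\alpha - \mathbf{u}_\beta)\, \boldsymbol{\lambda}_\beta \ \ge\ 0 \qquad (4.32)$$

Expression (4.32) shows that to ensure non-negativity of the variance of $Y$, the matrix of semivariogram models must be conditionally negative semi-definite, the condition being that the sum of the vectors $\boldsymbol{\lambda}_\alpha$ is the null vector.

4.3.2 The linear model of coregionalization

By analogy with the univariate case, the easiest way to build a permissible model of coregionalization consists in building a set of intercorrelated random functions $Z_i(\mathbf{u})$. The corresponding covariance function matrix $\mathbf{C}(\mathbf{h})$ or semivariogram matrix $\mathbf{\Gamma}(\mathbf{h})$ is then, by definition, permissible.

The linear model of coregionalization hereafter developed derives from one particular set of intercorrelated RFs $Z_i(\mathbf{u})$. Other sets have been built, custom-made for specific needs, resulting in different models of coregionalization; for example, see Zhu and Journel (1993) or the Markov model introduced subsequently in section 6.2.6.

The linear model of coregionalization builds each random function $Z_i(\mathbf{u})$ as a linear combination of independent random functions $Y^l_k(\mathbf{u})$, each with zero mean and basic covariance function $c_l(\mathbf{h})$ (Journel and Huijbregts, 1978, p. 172):

$$Z_i(\mathbf{u}) = \sum_{l=0}^{L}\sum_{k=1}^{n_l} a^l_{ik}\, Y^l_k(\mathbf{u}) + m_i \qquad \forall\ i \qquad (4.33)$$

with

$$E\{Z_i(\mathbf{u})\} = m_i, \qquad E\{Y^l_k(\mathbf{u})\} = 0 \quad \forall\ k, l$$

$$\text{Cov}\{Y^l_k(\mathbf{u}),\ Y^{l'}_{k'}(\mathbf{u}+\mathbf{h})\} = \begin{cases} c_l(\mathbf{h}) & \text{if } k = k' \text{ and } l = l' \\ 0 & \text{otherwise} \end{cases}$$

The latter condition expresses the mutual (two by two) independence of the random functions $Y^l_k(\mathbf{u})$. Some of the RFs $Y^l_k(\mathbf{u})$, say, $n_l$ ($n_l \le N_v$), may share the same covariance function $c_l(\mathbf{h})$, yet they remain independent of one another. The total number of independent RFs $Y^l_k(\mathbf{u})$ is thus $\sum_{l=0}^{L} n_l \le (L+1)\cdot N_v$. The univariate decomposition (4.26) is but a particular case of the model (4.33) for $N_v = n_l = 1$.

The cross covariance between any two RVs $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u}+\mathbf{h})$ can be expressed as a linear combination of cross covariances between any two RVs $Y^l_k(\mathbf{u})$ and $Y^{l'}_{k'}(\mathbf{u}+\mathbf{h})$:

$$C_{ij}(\mathbf{h}) = \text{Cov}\{Z_i(\mathbf{u}),\ Z_j(\mathbf{u}+\mathbf{h})\} = \sum_{l=0}^{L}\sum_{l'=0}^{L}\sum_{k=1}^{n_l}\sum_{k'=1}^{n_{l'}} a^l_{ik}\, a^{l'}_{jk'}\ \text{Cov}\{Y^l_k(\mathbf{u}),\ Y^{l'}_{k'}(\mathbf{u}+\mathbf{h})\} \qquad (4.34)$$

Because the RFs $Y^l_k(\mathbf{u})$ are mutually independent, expression (4.34) reduces to a linear combination of $(L+1)$ basic covariance models $c_l(\mathbf{h})$:

$$C_{ij}(\mathbf{h}) = \sum_{l=0}^{L}\sum_{k=1}^{n_l} a^l_{ik}\, a^l_{jk}\ c_l(\mathbf{h})$$

The linear model of coregionalization is the set of $N_v \times N_v$ auto and cross covariance models $C_{ij}(\mathbf{h})$ defined as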

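$$C_{ij}(\mathbf{h}) = \sum_{l=0}^{L} b^l_{ij}\ c_l(\mathbf{h}) \qquad \forall\ i, j \qquad (4.35)$$

where the sill $b^l_{ij}$ of the basic covariance model $c_l(\mathbf{h})$ is

$$b^l_{ij} = \sum_{k=1}^{n_l} a^l_{ik}\, a^l_{jk} \qquad (4.36)$$

By construction, the coefficients $b^l_{ij}$ and $b^l_{ji}$ are identical, hence the two cross covariance models $C_{ij}(\mathbf{h})$ and $C_{ji}(\mathbf{h})$ are the same. Furthermore, relation (4.36) is the general definition of a positive semi-definite $N_v \times N_v$ matrix $\mathbf{B}_l = [b^l_{ij}]$, called a coregionalization matrix.

The conditions sufficient for the matrix of functions $C_{ij}(\mathbf{h})$ defined in (4.35) to be a permissible model of coregionalization are:

1. the functions $c_l(\mathbf{h})$ are permissible covariance models, and

2. the $(L+1)$ coregionalization matrices $\mathbf{B}_l$ are all positive semi-definite.

The linear model of regionalization (4.27) is but a particular case of the linear model of coregionalization (4.35) for $N_v = 1$. As for the univariate case, each basic covariance model $c_l(\mathbf{h})$ may have its own pattern of anisotropy.

A similar development can be done in terms of semivariograms. Let $g_l(\mathbf{h})$ denote the semivariogram of the $n_l$ RFs $Y^l_k(\mathbf{u})$, with the cross semivariogram between any two RFs $Y^l_k(\mathbf{u})$ and $Y^{l'}_{k'}(\mathbf{u})$ equal to zero:

$$\frac{1}{2}\, E\left\{[Y^l_k(\mathbf{u}) - Y^l_k(\mathbf{u}+\mathbf{h})]\cdot[Y^{l'}_{k'}(\mathbf{u}) - Y^{l'}_{k'}(\mathbf{u}+\mathbf{h})]\right\} = \begin{cases} g_l(\mathbf{h}) & \text{if } k = k' \text{ and } l = l' \\ 0 & \text{otherwise} \end{cases}$$

The linear model of coregionalization is then defined as the set of $N_v \times N_v$ direct and cross semivariogram models $\gamma_{ij}(\mathbf{h})$, such that

$$\gamma_{ij}(\mathbf{h}) = \sum_{l=0}^{L} b^l_{ij}\ g_l(\mathbf{h}) \qquad \forall\ i, j \qquad (4.37)$$

where each function $g_l(\mathbf{h})$ is a permissible semivariogram model, and the $(L+1)$ matrices of coefficients $b^l_{ij}$, corresponding to the sill or slope of the model $g_l(\mathbf{h})$, are all positive semi-definite.

Example

Consider the two RFs $Z_1(\mathbf{u})$ and $Z_2(\mathbf{u})$ that are built as linear combinations of the same set of four independent RFs $\{Y^l_k(\mathbf{u}),\ k = 1, 2,\ l = 0, 1\}$ and the two stationary means $m_1$ and $m_2$:

$$Z_1(\mathbf{u}) = a^0_{11}\, Y^0_1(\mathbf{u}) + a^0_{12}\, Y^0_2(\mathbf{u}) + a^1_{11}\, Y^1_1(\mathbf{u}) + a^1_{12}\, Y^1_2(\mathbf{u}) + m_1$$
$$Z_2(\mathbf{u}) = a^0_{21}\, Y^0_1(\mathbf{u}) + a^0_{22}\, Y^0_2(\mathbf{u}) + a^1_{21}\, Y^1_1(\mathbf{u}) + a^1_{22}\, Y^1_2(\mathbf{u}) + m_2$$

where $Y^0_1(\mathbf{u})$ and $Y^0_2(\mathbf{u})$ have the same semivariogram $g_0(\mathbf{h})$, and the two other independent RFs $Y^1_1(\mathbf{u})$ and $Y^1_2(\mathbf{u})$ share the same semivariogram $g_1(\mathbf{h})$. According to expression (4.37), the two direct semivariograms $\gamma_{11}(\mathbf{h})$, $\gamma_{22}(\mathbf{h})$ and the cross semivariogram $\gamma_{12}(\mathbf{h})$ are defined as weighted sums of the two basic semivariogram models $g_0(\mathbf{h})$ and $g_1(\mathbf{h})$:

$$\gamma_{11}(\mathbf{h}) = b^0_{11}\, g_0(\mathbf{h}) + b^1_{11}\, g_1(\mathbf{h})$$
$$\gamma_{22}(\mathbf{h}) = b^0_{22}\, g_0(\mathbf{h}) + b^1_{22}\, g_1(\mathbf{h})$$
$$\gamma_{12}(\mathbf{h}) = b^0_{12}\, g_0(\mathbf{h}) + b^1_{12}\, g_1(\mathbf{h})$$

Consider the following numerical example:

$$Z_1(\mathbf{u}) = 5\cdot Y^0_1(\mathbf{u}) + 0\cdot Y^0_2(\mathbf{u}) + 2\cdot Y^1_1(\mathbf{u}) + 4\cdot Y^1_2(\mathbf{u}) + 0$$
$$Z_2(\mathbf{u}) = 0\cdot Y^0_1(\mathbf{u}) + 3\cdot Y^0_2(\mathbf{u}) + 5\cdot Y^1_1(\mathbf{u}) + 2\cdot Y^1_2(\mathbf{u}) + 0$$

with $g_0(\mathbf{h})$ = nugget effect model and $g_1(\mathbf{h})$ = spherical model.

Substituting the numerical values of coefficients $a^l_{ik}$ into equation (4.36), one deduces the sills (contributions) of the basic semivariogram models:

$$\gamma_{11}(\mathbf{h}) = 25\, g_0(\mathbf{h}) + 20\, g_1(\mathbf{h})$$
$$\gamma_{22}(\mathbf{h}) = 9\, g_0(\mathbf{h}) + 29\, g_1(\mathbf{h})$$
$$\gamma_{12}(\mathbf{h}) = 0\, g_0(\mathbf{h}) + 18\, g_1(\mathbf{h})$$

This linear model of coregionalization is, by construction, permissible and is displayed in Figure 4.15 (top graphs). Unlike the two direct semivariogram models, the cross semivariogram model does not include a nugget effect. As discussed in section 4.2.4, the nugget effect on a cross semivariogram is due only to micro-scale variability common to the two variables. In this example, the nugget components of $Z_1(\mathbf{u})$ and $Z_2(\mathbf{u})$ are built as $5\,Y^0_1(\mathbf{u})$ and $3\,Y^0_2(\mathbf{u})$, respectively. Because the two RFs $Y^0_1(\mathbf{u})$ and $Y^0_2(\mathbf{u})$ are independent, the two nugget components of $Z_1(\mathbf{u})$ and $Z_2(\mathbf{u})$ are independent, hence their cross covariance does not contribute to the nugget effect of the cross semivariogram.

Matrix notation

Using matrix notation, the RF vector $[Z_1(\mathbf{u}), Z_2(\mathbf{u})]^T$ is expressed as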

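$$\begin{bmatrix} Z_1(\mathbf{u}) \\ Z_2(\mathbf{u}) \end{bmatrix} = \mathbf{A}_0\, \mathbf{Y}_0(\mathbf{u}) + \mathbf{A}_1\, \mathbf{Y}_1(\mathbf{u}) + \begin{bmatrix} m_1 \\ m_2 \end{bmatrix}$$

where $\mathbf{A}_l$ is the 2 × 2 matrix of coefficients $a^l_{ik}$, and $\mathbf{Y}_l(\mathbf{u})$ is the RF vector $[Y^l_1(\mathbf{u}), Y^l_2(\mathbf{u})]^T$. The corresponding linear model of coregionalization is

$$\mathbf{\Gamma}(\mathbf{h}) = \mathbf{B}_0\, g_0(\mathbf{h}) + \mathbf{B}_1\, g_1(\mathbf{h})$$

where $\mathbf{B}_0$ and $\mathbf{B}_1$ are the 2 × 2 coregionalization matrices that contain the coefficients of the basic semivariogram models $g_0(\mathbf{h})$ and $g_1(\mathbf{h})$.

Figure 4.15: Examples of a linear model of coregionalization (top graphs) and an intrinsic coregionalization model (bottom graphs) in the bivariate case ($N_v = 2$). The intrinsic coregionalization model requires that each basic model (nugget effect, spherical model) contributes equally to the two direct semivariograms and to the cross semivariogram.

The matrix formulation of the general decomposition (4.33) is written

$$\mathbf{Z}(\mathbf{u}) = \sum_{l=0}^{L} \mathbf{A}_l\, \mathbf{Y}_l(\mathbf{u}) + \mathbf{m}$$

with

$$E\{\mathbf{Y}_l(\mathbf{u})\} = \mathbf{0}, \qquad \text{Cov}\{\mathbf{Y}_l(\mathbf{u}),\ \mathbf{Y}_{l'}(\mathbf{u}+\mathbf{h})\} = \mathbf{C}^Y_{ll'}(\mathbf{h}) = \delta_{ll'}\ c_l(\mathbf{h})\ \mathbf{I}_{n_l}$$

where $\mathbf{Y}_l(\mathbf{u}) = [Y^l_1(\mathbf{u}), \ldots, Y^l_{n_l}(\mathbf{u})]^T$, $\delta_{ll'}$ is the Kronecker delta ($\delta_{ll'} = 1$ if $l = l'$ and $\delta_{ll'} = 0$ otherwise), and $\mathbf{I}_{n_l}$ is the $n_l \times n_l$ identity matrix. The latter condition expresses the mutual independence of the RFs $Y^l_k(\mathbf{u})$. The matrix $\mathbf{C}^Y_{ll'}(\mathbf{h})$ is the $n_l \times n_{l'}$ matrix of cross covariances $\text{Cov}\{Y^l_k(\mathbf{u}),\ Y^{l'}_{k'}(\mathbf{u}+\mathbf{h})\}$. All the elements of $\mathbf{C}^Y_{ll'}(\mathbf{h})$ are zero whenever $l \ne l'$. For $l = l'$, the matrix $\mathbf{C}^Y_{ll}(\mathbf{h})$ is diagonal because the cross covariances between any two RFs $Y^l_k(\mathbf{u})$ and $Y^l_{k'}(\mathbf{u}+\mathbf{h})$ (off-diagonal elements) are zero.

The covariance function matrix of the multivariate random function $\mathbf{Z}(\mathbf{u})$ is then

$$\mathbf{C}(\mathbf{h}) = \sum_{l=0}^{L} \mathbf{A}_l\, \mathbf{C}^Y_{ll}(\mathbf{h})\, \mathbf{A}_l^T = \sum_{l=0}^{L} \mathbf{B}_l\ c_l(\mathbf{h}), \qquad \text{with}\ \ \mathbf{B}_l = \mathbf{A}_l\, \mathbf{A}_l^T$$

Similarly, the semivariogram matrix $\mathbf{\Gamma}(\mathbf{h})$ is expressed as

$$\mathbf{\Gamma}(\mathbf{h}) = \sum_{l=0}^{L} \mathbf{B}_l\ g_l(\mathbf{h})$$

Under second-order stationarity, the sum of the $(L+1)$ coregionalization matrices is equal to the variance-covariance matrix $\mathbf{C}(0)$. Indeed, at $|\mathbf{h}| = 0$, the value of each basic model $c_l(\mathbf{h})$ is 1, hence

$$\mathbf{C}(0) = \sum_{l=0}^{L} \mathbf{B}_l$$

The linear model of coregionalization is very convenient because the conditions of permissibility are readily verified. Instead of checking the positive semi-definiteness of $\mathbf{C}(\mathbf{h})$ or the conditionally negative semi-definiteness of $\mathbf{\Gamma}(\mathbf{h})$ for all lags $\mathbf{h}$, we have only to verify that the $(L+1)$ coregionalization matrices $\mathbf{B}_l$ are positive semi-definite.

A symmetric matrix is positive semi-definite if its determinant and all its principal minor determinants are non-negative. Consider, for example, the linear model of coregionalization for $N_v = 3$:

$$\begin{bmatrix} \gamma_{11}(\mathbf{h}) & \gamma_{12}(\mathbf{h}) & \gamma_{13}(\mathbf{h}) \\ \gamma_{21}(\mathbf{h}) & \gamma_{22}(\mathbf{h}) & \gamma_{23}(\mathbf{h}) \\ \gamma_{31}(\mathbf{h}) & \gamma_{32}(\mathbf{h}) & \gamma_{33}(\mathbf{h}) \end{bmatrix} = \sum_{l=0}^{L} \begin{bmatrix} b^l_{11} & b^l_{12} & b^l_{13} \\ b^l_{21} & b^l_{22} & b^l_{23} \\ b^l_{31} & b^l_{32} & b^l_{33} \end{bmatrix} g_l(\mathbf{h})$$

Each coregionalization matrix $\mathbf{B}_l$ is positive semi-definite if the following seven inequalities are satisfied:

• All diagonal elements are non-negative: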

$$b^l_{11} \ge 0, \qquad b^l_{22} \ge 0, \qquad b^l_{33} \ge 0$$

• All principal minor determinants of order 2 are non-negative:

$$\begin{vmatrix} b^l_{11} & b^l_{12} \\ b^l_{21} & b^l_{22} \end{vmatrix} = b^l_{11}\, b^l_{22} - [b^l_{12}]^2 \ \ge\ 0$$
$$\begin{vmatrix} b^l_{11} & b^l_{13} \\ b^l_{31} & b^l_{33} \end{vmatrix} = b^l_{11}\, b^l_{33} - [b^l_{13}]^2 \ \ge\ 0$$
$$\begin{vmatrix} b^l_{22} & b^l_{23} \\ b^l_{32} & b^l_{33} \end{vmatrix} = b^l_{22}\, b^l_{33} - [b^l_{23}]^2 \ \ge\ 0$$

• The determinant of order 3 is non-negative:

$$|\mathbf{B}_l| = b^l_{11}\left(b^l_{22}\, b^l_{33} - [b^l_{23}]^2\right) - b^l_{12}\left(b^l_{12}\, b^l_{33} - b^l_{13}\, b^l_{23}\right) + b^l_{13}\left(b^l_{12}\, b^l_{23} - b^l_{22}\, b^l_{13}\right) \ \ge\ 0$$

These inequalities provide four practical rules for choosing basic structures $g_l(\mathbf{h})$ in the linear model of coregionalization:

Rule 1: Every basic structure appearing on a cross semivariogram $\gamma_{ij}(\mathbf{h})$ must be present in both direct semivariogram models $\gamma_{ii}(\mathbf{h})$ and $\gamma_{jj}(\mathbf{h})$:

$$b^l_{ij} \ne 0 \ \Rightarrow\ b^l_{ii} \ne 0 \ \text{and}\ b^l_{jj} \ne 0$$

Rule 2: If a basic structure $g_l(\mathbf{h})$ is absent on a direct semivariogram, it must be absent on all cross semivariograms involving this variable:

$$b^l_{ii} = 0 \ \Rightarrow\ b^l_{ij} = 0 \quad \forall\ j$$

Rule 3: Each direct or cross semivariogram model $\gamma_{ij}(\mathbf{h})$ need not include all $(L+1)$ basic structures; that is, some coefficients $b^l_{ij}$ may be zero.

Rule 4: There is no need for the structure $g_l(\mathbf{h})$ appearing on both direct semivariograms $\gamma_{ii}(\mathbf{h})$ and $\gamma_{jj}(\mathbf{h})$ to be present on the cross semivariogram $\gamma_{ij}(\mathbf{h})$:

$$b^l_{ii} \ne 0 \ \text{and}\ b^l_{jj} \ne 0 \ \Rightarrow\ b^l_{ij} = 0 \ \text{or}\ b^l_{ij} \ne 0$$

Similar rules apply for covariance models.

The intrinsic coregionalization model

The intrinsic coregionalization model is but a particular linear model (4.37) in which all the $N_v(N_v+1)/2$ coefficients $b^l_{ij}$ of any basic semivariogram model $g_l(\mathbf{h})$ are proportional to each other, that is, $b^l_{ij} = \varphi_{ij} \cdot b^l\ \ \forall\ i, j, l$. All direct and cross semivariogram models $\gamma_{ij}(\mathbf{h})$ are then obtained by simple rescaling of the same standardized linear model of regionalization $\gamma_0(\mathbf{h})$:

$$\gamma_{ij}(\mathbf{h}) = \varphi_{ij} \sum_{l=0}^{L} b^l\ g_l(\mathbf{h}) = \varphi_{ij}\ \gamma_0(\mathbf{h}) \qquad \forall\ i, j \qquad (4.41)$$

For example, Figure 4.15 (page 112, bottom graphs) shows an intrinsic coregionalization model where all direct and cross semivariogram models are proportional to the same standardized model $\gamma_0(\mathbf{h}) = 0.5\, g_0(\mathbf{h}) + 0.5\, \text{Sph}(|\mathbf{h}|/5)$. The nugget effect thus represents 50% of the sill value of all three models. Note that any linear model of coregionalization that consists of a single basic structure is an intrinsic model.

The intrinsic coregionalization model is much more restrictive than the linear model of coregionalization: the $N_v(N_v+1)/2$ direct and cross semivariogram models must include all $(L+1)$ structures in the same proportions $b^l$. This is not the case for the linear model (see Figure 4.15, top graphs). In particular, note the difference in the relative nugget effect between the three direct and cross semivariogram models.

The matrix formulation of the intrinsic coregionalization model (4.41) is

$$\mathbf{\Gamma}(\mathbf{h}) = \mathbf{\Phi}\ \gamma_0(\mathbf{h})$$

where $\mathbf{\Phi} = [\varphi_{ij}]$ is the matrix of coefficients $\varphi_{ij}$. The intrinsic coregionalization model is similarly expressed in terms of basic covariance models $c_l(\mathbf{h})$ as

$$\mathbf{C}(\mathbf{h}) = \mathbf{\Phi} \sum_{l=0}^{L} b^l\ c_l(\mathbf{h}) = \mathbf{\Phi}\ C_0(\mathbf{h})$$

where $C_0(\mathbf{h})$ is a standardized covariance model. The matrix $\mathbf{\Phi}$ is expressed as the ratio $\mathbf{C}(\mathbf{h})/C_0(\mathbf{h})$, where $\mathbf{C}(\mathbf{h})$ is a positive semi-definite matrix and $C_0(\mathbf{h})$ is a positive definite function. Thus, the ratio that is the matrix of coefficients $\varphi_{ij}$ is positive semi-definite.

Under second-order stationarity, the matrix $\mathbf{\Phi}$ is equal to the variance-covariance matrix $\mathbf{C}(0)$. Indeed, at $|\mathbf{h}| = 0$, the value of each basic model $c_l(\mathbf{h})$ is 1, hence

$$\mathbf{C}(0) = \mathbf{\Phi} \sum_{l=0}^{L} b^l = \mathbf{\Phi}$$

since the proportions $b^l$ of the standardized model sum to 1. The proportionality coefficient $\varphi_{ij}$ is equal to the variance $C_{ii}(0)$ of the RF $Z_i(\mathbf{u})$ for $i = j$ and is equal to the cross covariance $C_{ij}(0)$ for $i \ne j$.
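The permissibility checks of section 4.3.2 lend themselves to a small numerical sketch: building a coregionalization matrix from the coefficients of relation (4.36), and testing positive semi-definiteness through all principal minor determinants. The function names and every numerical coefficient below are hypothetical, chosen only for illustration; the last matrix shows that, for three variables, non-negative minors of orders 1 and 2 do not guarantee a non-negative determinant.

```python
import itertools
import numpy as np

def coregionalization_matrix(a):
    """Sills b_ij = sum_k a_ik * a_jk for one basic structure, i.e. B = A A^T."""
    a = np.asarray(a, dtype=float)
    return a @ a.T

def principal_minors(b):
    """Determinants of all principal submatrices, keyed by the retained indices."""
    b = np.asarray(b, dtype=float)
    n = b.shape[0]
    return {idx: float(np.linalg.det(b[np.ix_(idx, idx)]))
            for k in range(1, n + 1)
            for idx in itertools.combinations(range(n), k)}

def is_positive_semidefinite(b, tol=1e-10):
    """True if every principal minor is non-negative (up to a numerical tolerance)."""
    return all(m >= -tol for m in principal_minors(b).values())

# Hypothetical coefficients a_ik: two variables, two independent RFs per structure.
a_nugget = [[2.0, 0.0], [0.0, 1.0]]
a_spherical = [[1.0, 2.0], [3.0, 1.0]]
b_nugget = coregionalization_matrix(a_nugget)        # [[4, 0], [0, 1]]
b_spherical = coregionalization_matrix(a_spherical)  # [[5, 5], [5, 10]]

# Hypothetical 3 x 3 matrix: all order-2 minors are non-negative, the determinant is not.
b_bad = np.array([[1.0, 0.9, -0.9],
                  [0.9, 1.0, 0.9],
                  [-0.9, 0.9, 1.0]])
minors_bad = principal_minors(b_bad)     # seven minors for N_v = 3
```

A matrix built as A Aᵀ passes the check by construction, which is why a model derived from the decomposition (4.33) is automatically permissible, whereas coefficients fitted one semivariogram at a time need to be verified.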
4.3.3 The practice of modeling

Choosing the number and characteristics (type, range, anisotropy) of basic structures and determining their contribution (sill, slope) to each model is much more difficult than in the univariate case. Difficulties lie in the number of experimental covariance functions (semivariograms) and the constraints that the coefficients $b^l_{ij}$ must satisfy. As in section 4.2.4, the following discussion focuses on semivariograms, which are most frequently used in practice. A similar approach could be adopted for covariance functions.

Selection of variables

Any multivariate modeling starts with a selection of the $N_v$ variables to include in the coregionalization analysis. In most situations, there is no point in retaining all measured attributes for these reasons:

1. The number of variables that can be handled jointly by interpolation algorithms is limited.

2. Some variables are likely to be redundant and need not be considered, whereas independent variables are better handled separately.

3. Increasing the number of variables makes the fitting of all direct and cross semivariograms much more tedious.

It is good practice to define a hierarchy of variables according to the objective pursued. Consider, for example, the objective of estimating concentrations of Cd, which is called the main or primary variable. One should preferentially retain as secondary variables the one or two other metals better correlated with that primary variable and better sampled, say, Ni and Zn.

Different models of coregionalization could be built depending on which variable is selected as primary. A single model of coregionalization that poorly fits the direct and cross semivariograms of all attributes is better replaced by several models that provide better fits to fewer variables deemed important for the objective.

Selection of the model

Once the variables have been selected, the next decision concerns the type of model of coregionalization: intrinsic or linear.³

³ As mentioned previously, other models of coregionalization, possibly nonlinear, might be considered. In the following, only the widely used linear model of coregionalization is retained.

Intrinsic coregionalization model

The intrinsic coregionalization model is easier to fit than the linear model in the sense that only a single coregionalization matrix $\mathbf{\Phi}$ need be inferred (see section 4.3.2). However, such a model is much more restrictive because it requires that all experimental direct and cross semivariograms be proportional to each other. The proportionality condition can be visually checked from the shape of experimental curves. An alternative consists of verifying that the ratio between any two semivariograms is a constant independent of $\mathbf{h}$:

$$\frac{\gamma_{ij}(\mathbf{h})}{\gamma_{i'j'}(\mathbf{h})} = \frac{\varphi_{ij}}{\varphi_{i'j'}} \qquad \forall\ \mathbf{h} \qquad (4.44)$$

Similarly, all auto and cross correlograms are about equal.

Linear model of coregionalization

The intrinsic model rarely fits experimental coregionalizations. One reason is that the filtering of uncorrelated micro-structures by the cross semivariogram (see section 4.2.4) usually leads to a smaller relative nugget effect on the cross semivariogram than on direct semivariograms. The more flexible linear model of coregionalization would then be preferred. Note the following:

• Direct or cross semivariograms need not include all basic structures; see rule 3 (page 114).

• Variables that are well cross-correlated are more likely to show similar patterns of spatial variability.

• Slight differences in the shape of experimental direct and cross semivariograms may be disregarded, in particular when data sparsity affects the reliability of the estimates.

For the three metals (Cd, Ni, and Zn), a quick inspection of the direct semivariograms suffices to invalidate the proportionality condition (4.44) of the intrinsic model (Figure 4.14, page 107). Indeed, the shape of the Ni semivariogram is dominated by the long-range structure, whereas the short-range structure is the major component for the other metals.

Fitting a linear model of coregionalization

To fit a linear model of coregionalization, proceed as follows:

1. Select the smallest set of basic structures $g_l(\mathbf{h})$ that captures the major features of all $N_v$ omnidirectional direct semivariograms. There is no need to look at the experimental cross semivariograms in this step since their models can include only⁴ the structures that are apparent on the direct semivariograms; see rule 2 (page 114). For Cd, Ni, and Zn, three structures are retained: the nugget effect, a spherical model with a range of 200 m, and a spherical model with a range of 1.3 km.

⁴ This represents a definite limitation of the linear model of coregionalization.

2. For each structure $g_l(\mathbf{h})$, consider anisotropy only if that anisotropy is clearly evident on all directional semivariograms. As for the univariate case, ancillary information may help to determine the directions of

anisotropy. The pattern of variation of the three selected metals, Cd, Ni, and Zn, does not vary with the direction; see section 2.3.5.

3. Estimate the contributions (sill, slope) $b^l_{ij}$ of the basic structures $g_l(\mathbf{h})$ building up each model $\gamma_{ij}(\mathbf{h})$ under the constraint of positive semi-definiteness of the coregionalization matrices $\mathbf{B}_l = [b^l_{ij}]$.

4. Appreciate visually the goodness-of-fit for all direct and cross semivariograms. One may then decide to modify a range or to change the type of basic semivariogram model, say, try an exponential rather than a spherical model, to improve the overall quality of the fit. The fit of any particular one of the $N_v(N_v+1)/2$ experimental direct and cross semivariograms can always be improved, but it is generally done at the expense of poorer fits for other semivariograms. Whenever a compromise is needed, one should give priority to the direct semivariograms, particularly that of the primary variable.

Estimation of the coregionalization matrices

Bivariate case ($N_v = 2$)

In the bivariate case, the coregionalization matrix $\mathbf{B}_l$ is positive semi-definite if the following three inequalities are satisfied:

$$b^l_{11} \ge 0, \qquad b^l_{22} \ge 0, \qquad |b^l_{12}| \le \sqrt{b^l_{11}\, b^l_{22}} \qquad (4.46)$$

Thus, the linear model of coregionalization is fitted in two steps:

1. Both direct semivariograms are first modeled as linear combinations of selected basic structures $g_l(\mathbf{h})$.

2. The same basic structures are then fitted to the cross semivariogram under the constraint (4.46).

A good interactive graphical program (e.g., Chu, 1993) and a pocket calculator suffice to model the coregionalization between two variables.

Figure 4.16 shows the linear model of coregionalization fitted to the pair Cd-Ni. The model is as follows:

$$\mathbf{\Gamma}(\mathbf{h}) = \mathbf{B}_0\ g_0(\mathbf{h}) + \mathbf{B}_1\ \text{Sph}\left(\frac{|\mathbf{h}|}{200\ \text{m}}\right) + \mathbf{B}_2\ \text{Sph}\left(\frac{|\mathbf{h}|}{1.3\ \text{km}}\right) \qquad (4.47)$$

where $g_0(\mathbf{h})$ is the nugget effect model (4.10) and $\mathbf{B}_0$, $\mathbf{B}_1$, $\mathbf{B}_2$ are the 2 × 2 coregionalization matrices of the three structures. Only the semivariogram model for Cd includes the short-range structure (200 m). The diagonal elements and the determinants of the three coregionalization matrices are all positive or zero; hence, the linear model of coregionalization (4.47) is positive semi-definite.

Figure 4.16: Experimental omnidirectional direct and cross semivariograms for Cd and Ni, and the linear model of coregionalization fitted. The three experimental semivariograms were computed using the same set of 259 concentrations.

Multivariate case ($N_v \ge 3$)

The bivariate approach is generalized to $N_v$ variables as follows:

1. The $N_v$ direct semivariograms $\gamma_{ii}(\mathbf{h})$ are first modeled as linear combinations of selected basic structures $g_l(\mathbf{h})$.

2. The same structures are then fitted to each of the $N_v(N_v-1)/2$ cross semivariograms $\gamma_{ij}(\mathbf{h})$ under the constraint $|b^l_{ij}| \le \sqrt{b^l_{ii}\, b^l_{jj}}$.

The latter condition is, however, insufficient to ensure positive semi-definiteness of matrices $\mathbf{B}_l = [b^l_{ij}]$ for $N_v > 2$. Indeed, all principal minor determinants, not only those of order 2, must be non-negative (recall page 114).

Consider, for example, the linear model of coregionalization in Figure 4.17. The direct and cross semivariograms of Cd, Ni, and Zn were modeled using the two-step approach, yielding the following model:

$$\mathbf{\Gamma}(\mathbf{h}) = \mathbf{B}_0\ g_0(\mathbf{h}) + \mathbf{B}_1\ \text{Sph}\left(\frac{|\mathbf{h}|}{200\ \text{m}}\right) + \mathbf{B}_2\ \text{Sph}\left(\frac{|\mathbf{h}|}{1.3\ \text{km}}\right) \qquad (4.48)$$

where $\mathbf{B}_0$, $\mathbf{B}_1$, $\mathbf{B}_2$ are now the 3 × 3 coregionalization matrices.
T h e burdc~iof cl~eckiuga poskriori each ~rratrixB, tlien applying empirical
corrections is alleviated by an iterative procedure that fits l l ~ elinear model of
coregiona1iz;ttiou dircclly rnidc~rtllc consl.rai~itof posiLive serni-defi~~iteness of
all n ~ a t r i c s sI31 (Goulard, 1989; Goul;trd and volt.^, 1902). As with the u n -
variate criterio~t(1.29), tlie algoritlirn atterrlpts lo nrii~i~nize a wcigl~tedsum
of differences betweell experirrrental ?jj(ht) and model y;j(hk) semivariogram
values (see Appendix A).
'filis iterative procedure is used to model the corcgionaliaation of the three
rnclals Cd, Ni, and Zn (Figure 4.18). Altlrongh it fits the experi~ncntalcurves
as well as the nrodcl shown in Figure 4.17, the rnodel of coregionalization is
now positive s e ~ r ~ i - d c f i ~ ~ i t , ~ :

1"igure 1.17: Experinrental oti~~~idin:ctioad dircct rlud cross semivari~gramsfor Cd,


Ni, and ZII ( ~ = 2 5 9 ) . Tlie linear n~otlelof coregionalizntion fitted using a two-step
approach is not p~rmissible.

Tlre slmrt-rauge strncturc now contributes to the Ni semivariogram model.


~ ~ = 3.0) allows niodeling the short-range strncture
T h e snrall c o u t r i l ~ u t i o(/I,&;
seen on the cross semivariogra~nNi-Zn. T h e determinants of order 3 of tlie coregionalization rnatriccs are 78.8, 50.1,
By co~rstructionof the two-step approacl~,the three following conditions and 276.3, respectively.
are mtisfied for each basic sernivariograrn nrotlel yr(h):

T h e model (4.48) is, liowever, not pernrissi1)le sir~cethe coregiorralization


matrix associated with the spherical model of range 200 m is not positive
semi-definite; indeed, t.he determinar~tof the matrix is negative:

A current practice consists of modifying empirically the coefficients b:j


until t h e positive senii-definite condition is met. Such an approach has two
drawbacks:

I t is tedious because it is unclear wliiclr coeficient slrould be modified


Figure 4.18: Experimental omnidirectional direct mid cross semivariograms for
first and wl1ic11correctio~~ (increase, decrease) slrould he applicd.
Cd, Ni, and ZII (n=259). The l i ~ ~ e aruodel
r of coregionalization is fitted using an
It becomes nnpractical as the number of variables, hence tire nurnher iterative procerlnre that ensures that the model is permissible.
of coefficients hij, incrrases.
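The permissibility check discussed in this section can be sketched in a few lines of Python. The sketch is illustrative only: the helper names (`det`, `principal_minors`, `is_positive_semidefinite`) and the 3 x 3 matrix `b` are hypothetical and are not the coregionalization matrices of model (4.48). It shows a symmetric matrix whose order-2 principal minors are all non-negative while its order-3 determinant is negative, so the matrix is not positive semi-definite.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion (adequate for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_minors(b):
    """Map each index subset to the determinant of the matching sub-matrix."""
    n = len(b)
    minors = {}
    for order in range(1, n + 1):
        for idx in combinations(range(n), order):
            sub = [[b[i][j] for j in idx] for i in idx]
            minors[idx] = det(sub)
    return minors


def is_positive_semidefinite(b, tol=1e-12):
    """A symmetric matrix is positive semi-definite if and only if
    all of its principal minors are non-negative."""
    return all(value >= -tol for value in principal_minors(b).values())


# Hypothetical coregionalization matrix (assumption, for illustration only).
b = [[1.0, 0.9, -0.9],
     [0.9, 1.0, 0.9],
     [-0.9, 0.9, 1.0]]

if __name__ == "__main__":
    for idx, value in principal_minors(b).items():
        print(idx, round(value, 3))
    print("permissible:", is_positive_semidefinite(b))
```

Every pairwise condition $|b_{ij}| \le \sqrt{b_{ii} b_{jj}}$ holds for this matrix, yet the full check fails, which is the situation described for the two-step fit.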
122 CHAPTER 4. INFERENCE AND MODELING 4.3. MODELING A COREGIONALIZATION 123

Coregionalization of other metals

A similar approach is applied to the two other metals with widespread contamination (Cu, Pb). The linear model of coregionalization now includes four variables: the two better sampled metals (Ni, Zn), and copper and lead, which are closely related ($\rho$ = 0.78). Figure 4.19 shows the ten experimental direct and cross semivariograms. The coregionalization model is fitted using the iterative procedure. Despite the number of semivariograms, the overall fit is satisfactory.

Figure 4.19: Experimental omnidirectional direct and cross semivariograms for Cu, Pb, Ni, and Zn (n=259). The linear model of coregionalization is fitted using the iterative procedure, hence is permissible. [Panels include Pb-Ni and Pb-Zn; horizontal axis: Distance (km).]

Warning

The reason for the wide utilization of the linear model of coregionalization is that its permissibility can be easily checked by verifying the positive semi-definiteness of coregionalization matrices. The price to pay is the requirement that all direct and cross semivariograms share the same set of basic structures $g_l(\mathbf{h})$, which may represent a limitation.

Alternative linear models that do not require using a common set of basic structures have been proposed (e.g., Myers, 1982). Their greater flexibility is balanced by the need for checking the positive semi-definiteness condition of $\mathbf{C}(\mathbf{h})$ for all lags to be used (Goovaerts, 1994c). Beware that, even in the bivariate case, there is no easy way to perform such a check, especially in the presence of anisotropy. Therefore, it is better to stay with the well proven and yet reasonably flexible linear model of coregionalization.
Chapter 5

Local Estimation:
Accounting for a Single
Attribute
The existence of a model of spatial dependence allows one to tackle the problem of estimating attribute values at unsampled locations. This chapter presents least-squares linear regression (kriging) algorithms that account for data related solely to the continuous attribute being estimated. Algorithms for incorporating secondary information are introduced in Chapter 6.

Sections 5.1-5.4 introduce the linear regression paradigm and three of its most important variants: simple kriging, ordinary kriging, and kriging with a trend model. Estimation of average values over larger areas (block kriging) is addressed in section 5.5.

Section 5.6 presents factorial kriging, a method for filtering low- or high-frequency components from the spatial variation of the attribute. Filtering properties of kriging algorithms are further discussed in section 5.7 in the framework of the dual formalism of kriging.

Section 5.8 recalls some important properties of the kriging weights and kriging variance. The performance of ordinary kriging in estimating metal concentrations at test locations is discussed.

5.1 The Kriging Paradigm


Consider the problem of estimating the value of a continuous attribute $z$ at any unsampled location $\mathbf{u}$ using only $z$-data available over the study area $A$, say, the $n$ data $\{z(\mathbf{u}_\alpha),\ \alpha = 1, \ldots, n\}$. Kriging is a generic name adopted by geostatisticians for a family of generalized least-squares regression algorithms in recognition of the pioneering work of Danie Krige (1951). All kriging estimators are but variants of the basic linear regression estimator $Z^*(\mathbf{u})$
126 CHAPTER 5. ACCOUNTING FOR A SINGLE ATTRIBUTE 5.2. SIMPLE KRIGING 127

defined as

$$Z^*(\mathbf{u}) - m(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha(\mathbf{u})\,[Z(\mathbf{u}_\alpha) - m(\mathbf{u}_\alpha)] \tag{5.1}$$

where $\lambda_\alpha(\mathbf{u})$ is the weight assigned to datum $z(\mathbf{u}_\alpha)$ interpreted as a realization of the RV $Z(\mathbf{u}_\alpha)$. The quantities $m(\mathbf{u})$ and $m(\mathbf{u}_\alpha)$ are the expected values of the RVs $Z(\mathbf{u})$ and $Z(\mathbf{u}_\alpha)$. The number of data involved in the estimation as well as their weights may change from one location to another. In practice, only the $n(\mathbf{u})$ data closest to the location $\mathbf{u}$ being estimated are retained, i.e., the data within a given neighborhood or window $W(\mathbf{u})$ centered on $\mathbf{u}$.

The interpretation of the unknown value $z(\mathbf{u})$ and data values $z(\mathbf{u}_\alpha)$ as realizations of RVs $Z(\mathbf{u})$ and $Z(\mathbf{u}_\alpha)$ allows one to define the estimation error as a random variable $Z^*(\mathbf{u}) - Z(\mathbf{u})$. All flavors of kriging share the same objective of minimizing the estimation or error variance $\sigma_E^2(\mathbf{u})$ under the constraint of unbiasedness of the estimator; that is,

$$\sigma_E^2(\mathbf{u}) = \mathrm{Var}\{Z^*(\mathbf{u}) - Z(\mathbf{u})\}$$

is minimized under the constraint that

$$E\{Z^*(\mathbf{u}) - Z(\mathbf{u})\} = 0 \tag{5.2}$$

The kriging estimator varies depending on the model adopted for the random function $Z(\mathbf{u})$ itself. The RF $Z(\mathbf{u})$ is usually decomposed into a residual component $R(\mathbf{u})$ and a trend component $m(\mathbf{u})$:

$$Z(\mathbf{u}) = R(\mathbf{u}) + m(\mathbf{u})$$

The residual component is modeled as a stationary RF with zero mean and covariance $C_R(\mathbf{h})$:

$$E\{R(\mathbf{u})\} = 0, \qquad \mathrm{Cov}\{R(\mathbf{u}), R(\mathbf{u}+\mathbf{h})\} = E\{R(\mathbf{u})\, R(\mathbf{u}+\mathbf{h})\} = C_R(\mathbf{h})$$

The expected value of the RV $Z$ at location $\mathbf{u}$ is thus the value of the trend component at that location:

$$E\{Z(\mathbf{u})\} = m(\mathbf{u})$$

Three kriging variants can be distinguished according to the model considered for the trend $m(\mathbf{u})$:

1. Simple kriging (SK) considers the mean $m(\mathbf{u})$ to be known and constant throughout the study area $A$:

$$m(\mathbf{u}) = m, \text{ known} \quad \forall\, \mathbf{u} \in A$$

2. Ordinary kriging (OK) accounts for local fluctuations of the mean by limiting the domain of stationarity of the mean to the local neighborhood $W(\mathbf{u})$:

$$m(\mathbf{u}') = \text{constant but unknown} \quad \forall\, \mathbf{u}' \in W(\mathbf{u}) \tag{5.5}$$

Unlike simple kriging, here the mean is deemed unknown.

3. Kriging with a trend model¹ (KT) considers that the unknown local mean $m(\mathbf{u}')$ smoothly varies within each local neighborhood $W(\mathbf{u})$, hence over the entire study area $A$. The trend component is modeled as a linear combination of functions $f_k(\mathbf{u})$ of the coordinates:

$$m(\mathbf{u}') = \sum_{k=0}^{K} a_k(\mathbf{u}')\, f_k(\mathbf{u}') \tag{5.6}$$

with $a_k(\mathbf{u}') \approx a_k$ constant but unknown $\forall\, \mathbf{u}' \in W(\mathbf{u})$.

The coefficients $a_k(\mathbf{u}')$ are unknown and deemed constant within each local neighborhood $W(\mathbf{u})$. By convention, $f_0(\mathbf{u}') = 1$, hence the case $K = 0$ is equivalent to ordinary kriging (constant but unknown mean $a_0$).

¹Though the algorithm is also known as universal kriging (Journel and Huijbregts, 1978, p. 313), the more appropriate terminology kriging with a trend model, proposed by Journel and Rossi (1989), will be used throughout this book.

5.2 Simple Kriging

The modeling of the trend component $m(\mathbf{u})$ as a known stationary mean $m$ allows one to write the linear estimator (5.1) as a linear combination of $(n(\mathbf{u}) + 1)$ pieces of information: the $n(\mathbf{u})$ RVs $Z(\mathbf{u}_\alpha)$ and the mean value $m$:

$$Z^*_{SK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\,[Z(\mathbf{u}_\alpha) - m] + m = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) + \left[1 - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\right] m \tag{5.7}$$

The $n(\mathbf{u})$ weights $\lambda^{SK}_\alpha(\mathbf{u})$ are then determined such as to minimize the error variance $\sigma_E^2(\mathbf{u}) = \mathrm{Var}\{Z^*_{SK}(\mathbf{u}) - Z(\mathbf{u})\}$ under the unbiasedness constraint (5.2).

The simple kriging (SK) estimator (5.7) is already unbiased since the error mean is equal to zero. Indeed, utilizing the first expression (5.7), it comes:
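$$Z^*_{SK}(\mathbf{u}) - Z(\mathbf{u}) = [Z^*_{SK}(\mathbf{u}) - m] - [Z(\mathbf{u}) - m] = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, R(\mathbf{u}_\alpha) - R(\mathbf{u}) = R^*_{SK}(\mathbf{u}) - R(\mathbf{u})$$

where $R(\mathbf{u}_\alpha) = Z(\mathbf{u}_\alpha) - m$ and $R(\mathbf{u}) = Z(\mathbf{u}) - m$. The error variance can thus be expressed as a double linear combination of residual covariance values:

$$\sigma_E^2(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, \lambda^{SK}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + C_R(0) - 2 \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}) = Q(\lambda^{SK}_\alpha(\mathbf{u}),\ \alpha = 1, \ldots, n(\mathbf{u})) \tag{5.8}$$

The error variance (5.8) appears as a quadratic form $Q(\cdot)$ in the $n(\mathbf{u})$ weights $\lambda^{SK}_\alpha(\mathbf{u})$. The optimal weights, i.e., those that minimize the error variance, are obtained by setting to zero each of the $n(\mathbf{u})$ partial first derivatives:

$$\frac{1}{2} \frac{\partial Q}{\partial \lambda^{SK}_\alpha(\mathbf{u})} = \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{SK}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) - C_R(\mathbf{u}_\alpha - \mathbf{u}) = 0 \qquad \alpha = 1, \ldots, n(\mathbf{u}) \tag{5.9}$$

The system (5.9) of $n(\mathbf{u})$ linear equations is known as the system of normal equations (Luenberger, 1969, p. 56) or the simple kriging system.

Stationarity of the mean entails that the residual covariance function $C_R(\mathbf{h})$ is equal to the stationary covariance function $C(\mathbf{h})$ of the RF $Z(\mathbf{u})$:

$$C_R(\mathbf{h}) = E\{R(\mathbf{u})\, R(\mathbf{u}+\mathbf{h})\} = E\{[Z(\mathbf{u}) - m]\,[Z(\mathbf{u}+\mathbf{h}) - m]\} = C(\mathbf{h})$$

Thus, the simple kriging system (5.9) can be written in terms of $z$-covariances as

$$\sum_{\beta=1}^{n(\mathbf{u})} \lambda^{SK}_\beta(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) = C(\mathbf{u}_\alpha - \mathbf{u}) \qquad \alpha = 1, \ldots, n(\mathbf{u}) \tag{5.10}$$

The minimum error variance, also called the SK variance, is deduced by substituting expression (5.10) into the definition (5.8) of the error variance:

$$\sigma^2_{SK}(\mathbf{u}) = C(0) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}) \tag{5.11}$$

Correlogram notation

Dividing each equation of system (5.10) by the variance $C(0)$ expresses the system in terms of the correlogram $\rho(\mathbf{h}) = C(\mathbf{h})/C(0)$:

$$\sum_{\beta=1}^{n(\mathbf{u})} \lambda^{SK}_\beta(\mathbf{u})\, \rho(\mathbf{u}_\alpha - \mathbf{u}_\beta) = \rho(\mathbf{u}_\alpha - \mathbf{u}) \qquad \alpha = 1, \ldots, n(\mathbf{u}) \tag{5.12}$$

The SK variance is then given by

$$\sigma^2_{SK}(\mathbf{u}) = C(0) \left[1 - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, \rho(\mathbf{u}_\alpha - \mathbf{u})\right]$$

Systems (5.10) and (5.12) yield the same kriging weights.

Matrix notation

Using matrix notation, the simple kriging system (5.10) is written as

$$\mathbf{K}_{SK}\, \boldsymbol{\lambda}_{SK}(\mathbf{u}) = \mathbf{k}_{SK} \tag{5.13}$$

where $\mathbf{K}_{SK}$ is the $n(\mathbf{u}) \times n(\mathbf{u})$ matrix of data covariances:

$$\mathbf{K}_{SK} = \begin{bmatrix} C(\mathbf{u}_1 - \mathbf{u}_1) & \cdots & C(\mathbf{u}_1 - \mathbf{u}_{n(\mathbf{u})}) \\ \vdots & & \vdots \\ C(\mathbf{u}_{n(\mathbf{u})} - \mathbf{u}_1) & \cdots & C(\mathbf{u}_{n(\mathbf{u})} - \mathbf{u}_{n(\mathbf{u})}) \end{bmatrix}$$

$\boldsymbol{\lambda}_{SK}(\mathbf{u})$ is the vector of SK weights, and $\mathbf{k}_{SK}$ is the vector of data-to-unknown covariances:

$$\boldsymbol{\lambda}_{SK}(\mathbf{u}) = \begin{bmatrix} \lambda^{SK}_1(\mathbf{u}) \\ \vdots \\ \lambda^{SK}_{n(\mathbf{u})}(\mathbf{u}) \end{bmatrix} \qquad \mathbf{k}_{SK} = \begin{bmatrix} C(\mathbf{u}_1 - \mathbf{u}) \\ \vdots \\ C(\mathbf{u}_{n(\mathbf{u})} - \mathbf{u}) \end{bmatrix}$$

The kriging weights required by the SK estimator (5.7) are obtained by multiplying the inverse of the data covariance matrix by the vector of data-to-unknown covariances:

$$\boldsymbol{\lambda}_{SK}(\mathbf{u}) = \mathbf{K}_{SK}^{-1}\, \mathbf{k}_{SK}$$

The matrix formulation of the simple kriging variance (5.11) is correspondingly

$$\sigma^2_{SK}(\mathbf{u}) = C(0) - \boldsymbol{\lambda}_{SK}^{T}(\mathbf{u})\, \mathbf{k}_{SK} = C(0) - \mathbf{k}_{SK}^{T}\, \mathbf{K}_{SK}^{-1}\, \mathbf{k}_{SK}$$

The simple kriging system (5.10) has a unique solution and the resulting kriging variance is positive if the covariance matrix $\mathbf{K}_{SK} = [C(\mathbf{u}_\alpha - \mathbf{u}_\beta)]$ is positive definite, in practice if:

• no two data are colocated: $\mathbf{u}_\alpha \ne \mathbf{u}_\beta$ for $\alpha \ne \beta$

• the covariance model $C(\mathbf{h})$ is permissible (see section 4.2.1)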
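The matrix form of the simple kriging system lends itself to a short numerical sketch. The Python code below is a minimal illustration under stated assumptions: the exponential covariance `cov`, the helper names `solve` and `simple_kriging`, and the three-point one-dimensional data set used in the usage lines are all hypothetical and are not the Cd transect of this chapter. It solves system (5.13) by Gaussian elimination, then evaluates the estimator (5.7) and the variance (5.11).

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a list of rows)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def cov(h, sill=1.0, a=1.0):
    """Exponential covariance model (an assumed, permissible model)."""
    return sill * math.exp(-abs(h) / a)


def simple_kriging(x_data, z_data, x0, mean):
    """Return (estimate, variance, weights) at location x0."""
    n = len(x_data)
    # Data-to-data covariance matrix K and data-to-unknown vector k.
    big_k = [[cov(x_data[i] - x_data[j]) for j in range(n)] for i in range(n)]
    small_k = [cov(x_data[i] - x0) for i in range(n)]
    lam = solve(big_k, small_k)                      # system (5.13)
    est = mean + sum(l * (z - mean) for l, z in zip(lam, z_data))  # (5.7)
    var = cov(0.0) - sum(l * k for l, k in zip(lam, small_k))      # (5.11)
    return est, var, lam


if __name__ == "__main__":
    xs, zs, m = [1.0, 2.0, 3.5], [0.8, 1.5, 3.0], 1.49
    for x0 in (2.0, 2.6, 100.0):
        print(x0, simple_kriging(xs, zs, x0, m))
```

At a datum location the weights collapse onto that datum and the variance vanishes; far beyond the correlation range the weights vanish and the estimate falls back on the mean, with a variance equal to the sill.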

Exactitude property

The SK estimator is an exact interpolator in that it honors data values $z(\mathbf{u}_\alpha)$ at their locations:

$$z^*_{SK}(\mathbf{u}) = z(\mathbf{u}_\alpha) \quad \forall\, \mathbf{u} = \mathbf{u}_\alpha, \ \alpha = 1, \ldots, n$$

Indeed, when the location $\mathbf{u}$ being estimated coincides with a datum location, say, $\mathbf{u}_{\alpha'}$, the SK system (5.10) becomes

$$\sum_{\beta=1}^{n(\mathbf{u})} \lambda^{SK}_\beta(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) = C(\mathbf{u}_\alpha - \mathbf{u}_{\alpha'}) \qquad \alpha = 1, \ldots, n(\mathbf{u})$$

The unique solution is then $\lambda^{SK}_{\alpha'}(\mathbf{u}) = 1$ and $\lambda^{SK}_\beta(\mathbf{u}) = 0 \ \forall\, \mathbf{u}_\beta \ne \mathbf{u}_{\alpha'}$.

Weight of the mean

The simple kriging estimator (5.7) can be rewritten as

$$Z^*_{SK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) + \lambda^{SK}_m(\mathbf{u})\, m \tag{5.14}$$

where the weight $\lambda^{SK}_m(\mathbf{u})$ assigned to the stationary mean $m$ for estimation at location $\mathbf{u}$ is equal to 1 minus the sum of the $n(\mathbf{u})$ data weights $\lambda^{SK}_\alpha(\mathbf{u})$:

$$\lambda^{SK}_m(\mathbf{u}) = 1 - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})$$

As the location $\mathbf{u}$ being estimated gets farther away from data locations,

• the data-to-unknown covariances $C(\mathbf{u}_\alpha - \mathbf{u})$, which are elements of vector $\mathbf{k}_{SK}$ in system (5.13), decrease.

• the data-to-data covariances $C(\mathbf{u}_\alpha - \mathbf{u}_\beta)$ remain unchanged.

Consequently, the SK weights $[\lambda^{SK}_\alpha(\mathbf{u})] = \boldsymbol{\lambda}_{SK}(\mathbf{u}) = \mathbf{K}_{SK}^{-1}\, \mathbf{k}_{SK}$ tend to decrease; hence the weight of the mean increases, and the estimate $z^*_{SK}(\mathbf{u})$ gets closer to the stationary mean $m$. The global information carried by the stationary mean becomes preponderant as remote neighboring data bring less information about the unknown value at $\mathbf{u}$.

The limiting case corresponds to a location $\mathbf{u}$ beyond the correlation range of any data location $\mathbf{u}_\alpha$, i.e., beyond the distance at which the covariance $C(\mathbf{u}_\alpha - \mathbf{u})$ vanishes. In such a case, the vector $\mathbf{k}_{SK}$ of data-to-unknown covariances $C(\mathbf{u}_\alpha - \mathbf{u})$ is a null vector and so is the vector $\boldsymbol{\lambda}_{SK}(\mathbf{u})$ of kriging weights: $\boldsymbol{\lambda}_{SK}(\mathbf{u}) = \mathbf{K}_{SK}^{-1}\, \mathbf{0} = \mathbf{0}$. The unique solution $\lambda^{SK}_\alpha(\mathbf{u}) = 0 \ \forall\, \alpha = 1, \ldots, n(\mathbf{u})$ entails that the weight of the mean is equal to 1, hence the SK estimator (5.14) is the stationary mean $m$.

Example

Throughout Chapter 5, kriging algorithms are illustrated from the one-dimensional data set shown in Figure 5.1 (left graph). The information available for the task of estimating Cd concentrations along the transect consists of:

• 10 Cd values (black dots), which show a general increase along the transect, and

• the Cd semivariogram model inferred in section 4.2.4 from all Cd data available over the study area (Figure 5.1, right graph):

[...]

The estimation is performed every 50 m using at each location $\mathbf{u}$ the five closest data, $n(\mathbf{u}) = 5 \ \forall\, \mathbf{u}$.

Figure 5.2 shows the SK estimates of Cd concentrations (top graph) and the weight of the stationary mean (bottom graph) taken as the arithmetic mean of the 10 Cd values, $m$ = 1.49 ppm. The SK estimates (solid line):
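Figure 5.2: SK estimates of Cd concentrations (top graph) and weight of the stationary mean (bottom graph). [Horizontal axis: Distance (km).]

The exactitude property of the SK estimator creates artifact discontinuities (peaks) at data locations (Figure 5.2, top graph). Discontinuities are important here because Cd concentrations vary over short distances, as shown by the semivariogram of Figure 5.1 (right graph). Indeed, such short-range variability entails that the data weights $\lambda^{SK}_\alpha(\mathbf{u})$ rapidly decrease as the location $\mathbf{u}$ being estimated gets farther away from data locations and the estimate gets closer to the stationary mean $m$ = 1.49 ppm. Large discontinuities occur next to extreme data values, those that depart most from the global mean, for example, near the concentration of 4 ppm measured at 3.25 km. Alternatives for removing such discontinuities are discussed in the next sections.

5.3 Ordinary Kriging

As is apparent on the graphs of Figure 4.4 (page 83), the local mean may vary significantly over the study area. For example, the Cd local mean computed from 1 km x 1 km moving windows varies from 0.5 to 2.4 ppm, depending on the window location; recall that the overall Cd mean is 1.3 ppm. Ordinary kriging (OK) allows one to account for such local variation of the mean by limiting the domain of stationarity of the mean to the local neighborhood $W(\mathbf{u})$ centered on the location $\mathbf{u}$ being estimated. The linear estimator (5.1) is then a linear combination of the $n(\mathbf{u})$ RVs $Z(\mathbf{u}_\alpha)$ plus the constant local mean value $m(\mathbf{u})$:

$$Z^*(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) + \left[1 - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha(\mathbf{u})\right] m(\mathbf{u})$$

The unknown local mean $m(\mathbf{u})$ is filtered from the linear estimator by forcing the kriging weights to sum to 1. The ordinary kriging estimator $Z^*_{OK}(\mathbf{u})$ is thus written as a linear combination only of the $n(\mathbf{u})$ RVs $Z(\mathbf{u}_\alpha)$:

$$Z^*_{OK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) \quad \text{with} \quad \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_\alpha(\mathbf{u}) = 1 \tag{5.16}$$

Again, the $n(\mathbf{u})$ weights $\lambda^{OK}_\alpha(\mathbf{u})$ are determined such as to minimize the error variance while honoring the unbiasedness constraint (5.2).

The OK estimator (5.16) is unbiased since the error mean is equal to zero:

$$E\{Z^*_{OK}(\mathbf{u}) - Z(\mathbf{u})\} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_\alpha(\mathbf{u})\, m(\mathbf{u}) - m(\mathbf{u}) = m(\mathbf{u}) - m(\mathbf{u}) = 0$$

The minimization of the error variance (5.8) under the non-bias condition $\sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_\alpha(\mathbf{u}) = 1$ calls for the definition of a Lagrangian $L(\mathbf{u})$, which is a function of the data weights $\lambda^{OK}_\alpha(\mathbf{u})$ and a Lagrange parameter $2\mu_{OK}(\mathbf{u})$ (e.g., see Edwards and Penney, 1982):

$$L(\lambda^{OK}_\alpha(\mathbf{u}),\ \alpha = 1, \ldots, n(\mathbf{u});\ 2\mu_{OK}(\mathbf{u})) = \sigma_E^2(\mathbf{u}) + 2\mu_{OK}(\mathbf{u}) \left[\sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_\alpha(\mathbf{u}) - 1\right]$$

The optimal weights $\lambda^{OK}_\alpha(\mathbf{u})$ are obtained by setting to zero each of the $(n(\mathbf{u}) + 1)$ partial first derivatives:

$$\frac{1}{2} \frac{\partial L(\mathbf{u})}{\partial \lambda^{OK}_\alpha(\mathbf{u})} = \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) - C_R(\mathbf{u}_\alpha - \mathbf{u}) + \mu_{OK}(\mathbf{u}) = 0 \qquad \alpha = 1, \ldots, n(\mathbf{u})$$

$$\frac{1}{2} \frac{\partial L(\mathbf{u})}{\partial \mu_{OK}(\mathbf{u})} = \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}) - 1 = 0$$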
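The ordinary kriging system includes $(n(\mathbf{u}) + 1)$ linear equations with $(n(\mathbf{u}) + 1)$ unknowns: the $n(\mathbf{u})$ weights $\lambda^{OK}_\alpha(\mathbf{u})$ and the Lagrange parameter $\mu_{OK}(\mathbf{u})$ that accounts for the constraint on the weights:

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu_{OK}(\mathbf{u}) = C_R(\mathbf{u}_\alpha - \mathbf{u}) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}) = 1 \end{cases} \tag{5.17}$$

Although the mean $m(\mathbf{u})$ is assumed stationary only within the local neighborhood $W(\mathbf{u})$, in the practice of ordinary kriging the residual covariance is assimilated to the global $z$-covariance inferred from all data available (Journel and Huijbregts, 1978, p. 33-34), leading to the following system:

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu_{OK}(\mathbf{u}) = C(\mathbf{u}_\alpha - \mathbf{u}) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}) = 1 \end{cases} \tag{5.18}$$

The resulting minimum error variance, called OK variance, is obtained by substituting the first $n(\mathbf{u})$ equations of the ordinary kriging system (5.18) into the error variance of type (5.8):

$$\sigma^2_{OK}(\mathbf{u}) = C(0) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_\alpha(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}) - \mu_{OK}(\mathbf{u}) \tag{5.19}$$

Accounting for the relation $C(\mathbf{h}) = C(0) - \gamma(\mathbf{h})$, the ordinary kriging system (5.18) is expressed in terms of semivariograms as

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u})\, [C(0) - \gamma(\mathbf{u}_\alpha - \mathbf{u}_\beta)] + \mu_{OK}(\mathbf{u}) = C(0) - \gamma(\mathbf{u}_\alpha - \mathbf{u}) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}) = 1 \end{cases}$$

Thanks to the non-bias condition $\sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}) = 1$, the variance term $C(0)$ cancels out from the first $n(\mathbf{u})$ equations, yielding this system:

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u})\, \gamma(\mathbf{u}_\alpha - \mathbf{u}_\beta) - \mu_{OK}(\mathbf{u}) = \gamma(\mathbf{u}_\alpha - \mathbf{u}) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}) = 1 \end{cases}$$

In contrast, the SK system (5.10) can be expressed in terms of only covariances since there is no similar constraint on the simple kriging weights.

The common practice consists of inferring and modeling the semivariogram rather than the covariance function. Indeed, the semivariogram $\gamma(\mathbf{h})$ allows one to filter the unknown local mean $m(\mathbf{u})$ that is deemed constant but unknown over the local neighborhood $W(\mathbf{u})$:

$$2\gamma(\mathbf{h}) = E\{[Z(\mathbf{u}') - Z(\mathbf{u}'+\mathbf{h})]^2\} = E\{[R(\mathbf{u}') - R(\mathbf{u}'+\mathbf{h})]^2\}$$

since $m(\mathbf{u}') = m(\mathbf{u}'+\mathbf{h}), \ \forall\, \mathbf{u}', \mathbf{u}'+\mathbf{h} \in W(\mathbf{u})$.

However, for reasons of computational efficiency, kriging systems are usually solved in terms of covariances. As discussed in section 4.2.1, there are semivariogram models, such as the power model, that have no covariance counterpart. For unbounded semivariogram models, a "pseudo covariance" $C(\mathbf{h})$ is defined by subtracting the semivariogram model $\gamma(\mathbf{h})$ from any positive value $A$, such that $A - \gamma(\mathbf{h}) \ge 0, \ \forall\, \mathbf{h}$. Again, the non-bias condition allows the constant $A$ to cancel out from the ordinary kriging system, which is then written in terms of pseudo covariances. In summary, the common practice consists of (1) inferring and modeling the semivariogram, and (2) solving all ordinary kriging systems in terms of (pseudo) covariances.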
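The claim that the constant $A$ cancels out of the ordinary kriging system can be checked numerically. The Python sketch below is illustrative only: the power semivariogram with exponent 1.5, the helper names (`solve`, `power_semivariogram`, `pseudo_cov`, `ordinary_kriging`), and the small one-dimensional data set are assumptions made for the example. It builds the OK system (5.18) from a pseudo covariance $A - \gamma(\mathbf{h})$ and compares the solutions obtained for two different values of $A$.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def power_semivariogram(h, omega=1.5):
    """Unbounded power model (assumed): no covariance counterpart."""
    return abs(h) ** omega


def pseudo_cov(a_const):
    """Pseudo covariance A - gamma(h) for a chosen positive constant A."""
    return lambda h: a_const - power_semivariogram(h)


def ordinary_kriging(x_data, z_data, x0, cov):
    """Return (estimate, variance, weights, Lagrange parameter) at x0."""
    n = len(x_data)
    # First n rows: covariances plus a column of 1's for the Lagrange
    # parameter; last row: the constraint that the weights sum to 1.
    a = [[cov(xi - xj) for xj in x_data] + [1.0] for xi in x_data]
    a.append([1.0] * n + [0.0])
    b = [cov(xi - x0) for xi in x_data] + [1.0]
    sol = solve(a, b)
    lam, mu = sol[:n], sol[n]
    est = sum(l * z for l, z in zip(lam, z_data))                 # (5.16)
    var = cov(0.0) - sum(l * c for l, c in zip(lam, b[:n])) - mu  # (5.19)
    return est, var, lam, mu


if __name__ == "__main__":
    xs, zs = [1.0, 2.0, 3.5], [0.8, 1.5, 3.0]
    for a_const in (10.0, 100.0):
        print(a_const, ordinary_kriging(xs, zs, 2.5, pseudo_cov(a_const)))
```

Both runs print the same weights, Lagrange parameter, estimate, and variance up to rounding, which is the practical meaning of the constant cancelling out.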

Kriging the local mean

Instead of an estimate of the attribute $z$, one may be interested in estimating and mapping the local mean of the attribute. Such a map of trend estimates allows one to evaluate local departures from the overall mean and provides a smooth picture of global trends. Like the OK estimator (5.16) of attribute values, the estimator $m^*_{OK}(\mathbf{u})$ of the local mean is written as a linear combination of $n(\mathbf{u})$ random variables:

$$m^*_{OK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_{\alpha m}(\mathbf{u})\, Z(\mathbf{u}_\alpha) \tag{5.20}$$

where $\lambda^{OK}_{\alpha m}(\mathbf{u})$ is the weight associated with the datum $z(\mathbf{u}_\alpha)$ in the OK estimation of the local mean at location $\mathbf{u}$.

The unbiasedness of the estimator (5.20) is ensured by forcing the kriging weights $\lambda^{OK}_{\alpha m}(\mathbf{u})$ to sum to 1:

$$E\{m^*_{OK}(\mathbf{u}) - m(\mathbf{u})\} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_{\alpha m}(\mathbf{u})\, m(\mathbf{u}) - m(\mathbf{u}) = 0 \quad \text{if} \quad \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_{\alpha m}(\mathbf{u}) = 1$$

The error variance $\sigma_E^2(\mathbf{u}) = \mathrm{Var}\{m^*_{OK}(\mathbf{u}) - m(\mathbf{u})\}$ is expressed as a double linear combination of covariance values:

$$\sigma_E^2(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_{\alpha m}(\mathbf{u})\, \lambda^{OK}_{\beta m}(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mathrm{Var}\{m(\mathbf{u})\} - 2 \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_{\alpha m}(\mathbf{u})\, \mathrm{Cov}\{Z(\mathbf{u}_\alpha), m(\mathbf{u})\} \tag{5.21}$$

The last two terms of the error variance are zero since the trend $m(\mathbf{u})$ is viewed as a deterministic component.

The minimization of the error variance (5.21) under the non-bias condition yields the following system of $(n(\mathbf{u}) + 1)$ linear equations:

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_{\beta m}(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu^{OK}_m(\mathbf{u}) = 0 & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_{\beta m}(\mathbf{u}) = 1 \end{cases} \tag{5.22}$$

System (5.22) is identical to the OK system (5.18) except for the right-hand side covariances $C(\mathbf{u}_\alpha - \mathbf{u})$ being set to zero.

Because all data-to-unknown covariance terms $C(\mathbf{u}_\alpha - \mathbf{u})$ are zero, the location $\mathbf{u}$ being estimated does not appear in the ordinary kriging system (5.22). Provided the same set of data is used to estimate the local mean at two different locations $\mathbf{u}$ and $\mathbf{u}'$, the system (5.22) remains unchanged. Thus, the two sets of kriging weights and the two trend estimates are identical: $\lambda^{OK}_{\alpha m}(\mathbf{u}) = \lambda^{OK}_{\alpha m}(\mathbf{u}') \ \forall\, \alpha$ and $m^*_{OK}(\mathbf{u}) = m^*_{OK}(\mathbf{u}')$.

Simple kriging versus ordinary kriging

Ordinary kriging is usually preferred to simple kriging because it requires neither knowledge nor stationarity of the mean over the entire area $A$. Several authors (Matheron, 1970, p. 129; Journel and Rossi, 1989) showed that ordinary kriging with local search neighborhoods amounts to:

1. estimating the local mean $m^*_{OK}(\mathbf{u})$ at each location $\mathbf{u}$ using ordinary kriging with data specific to the neighborhood of $\mathbf{u}$, then

2. applying the simple kriging estimator (5.7) using that estimate of the mean rather than the stationary mean $m$, that is,

$$Z^*_{OK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, [Z(\mathbf{u}_\alpha) - m^*_{OK}(\mathbf{u})] + m^*_{OK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) + \lambda^{SK}_m(\mathbf{u})\, m^*_{OK}(\mathbf{u})$$

where $\lambda^{SK}_m(\mathbf{u}) = 1 - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})$. Accounting for the definition (5.14) of the simple kriging estimator, one deduces the following relation between the SK and OK estimators:

$$Z^*_{OK}(\mathbf{u}) = Z^*_{SK}(\mathbf{u}) + \lambda^{SK}_m(\mathbf{u})\, [m^*_{OK}(\mathbf{u}) - m]$$

The difference between the SK and OK estimates of $z$ at $\mathbf{u}$ is caused by a departure of the local mean $m^*_{OK}(\mathbf{u})$ from the global mean $m$. More precisely, since $\lambda^{SK}_m(\mathbf{u})$ is usually positive, the OK estimate is smaller than the SK estimate in low-valued areas where the local data mean is smaller than the global mean. Conversely, the OK estimate is larger than the SK estimate in high-valued areas where the local mean is larger than the global mean. In this sense, the commonly used OK algorithm with local search neighborhoods already accounts for trends (varying mean) in the $z$-data values (see discussion in section 5.4).

The discrepancy between the two estimates $z^*_{OK}(\mathbf{u})$ and $z^*_{SK}(\mathbf{u})$ increases as the weight $\lambda^{SK}_m(\mathbf{u})$ of the mean increases, i.e., as the location $\mathbf{u}$ being estimated gets farther away from data locations (section 5.2).

Example

Figure 5.3 (top graph) shows ten Cd concentrations at locations $\mathbf{u}_1$ to $\mathbf{u}_{10}$ along the NE-SW transect. The local mean is estimated every 50 m along that transect using the five closest data values; the resulting estimate is depicted by a solid line on the middle graph of Figure 5.3. The vertical dashed lines delineate the OK trend estimates that are based on the same set of five
neighboring data. For example, Cd concentrations at locations $\mathbf{u}_1$ to $\mathbf{u}_5$ are used to estimate the local mean at all locations within the segment 1-2 km. Similarly, the next segment, 2.1-2.5 km, includes all trend values that are estimated from Cd concentrations at locations $\mathbf{u}_2$ to $\mathbf{u}_6$.

As mentioned previously, the ordinary kriging system (5.22) is identical at all locations where the same neighboring data are involved in the estimation. Consequently, the OK trend estimate $m^*_{OK}(\mathbf{u})$ is constant within each segment and changes from one segment to another depending on the neighboring data retained. This procedure yields a trend estimate that follows the general increase in Cd concentrations along the transect. In contrast, the mean of 10 data values, $m$ = 1.49 ppm (horizontal dashed line in the middle graph), overestimates the local mean in the low-valued (left) part of the transect and underestimates the local mean in the high-valued (right) part of the transect.

[Figure 5.3: Cd data (top graph), trend estimates (middle graph), and Cd estimates (bottom graph); horizontal axis: Distance (km).]

Figure 5.3 (bottom graph) shows both OK (solid line) and SK (dashed line) estimates of Cd concentrations along the NE-SW transect. Note that:

• Both estimators are exact.

• OK estimates are smaller than SK estimates in the left part of the transect where the local mean $m^*_{OK}(\mathbf{u})$ is smaller than the overall mean $m$ = 1.49.

• OK estimates are larger than SK estimates in the right part of the transect where the local mean $m^*_{OK}(\mathbf{u})$ is larger than the overall mean $m$ = 1.49.

• The departure between the two estimates is most important beyond the extreme right datum $\mathbf{u}_{10}$. Indeed, away from the data, the weight $\lambda^{SK}_m(\mathbf{u})$ of the mean increases (Figure 5.2, bottom graph); hence the SK estimate is closer to the overall mean $m$ = 1.49. In contrast, the OK estimate is closer to the local mean $m^*_{OK}(\mathbf{u})$, which is estimated from the last and large Cd concentrations.

In summary, the use of a stationary mean yields SK estimates that are close to that mean value (1.49 ppm) away from the data. In contrast, local estimation of the mean within search neighborhoods yields OK estimates that better follow the data fluctuations: small values in the left part of the transect, and large values in the right part.
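The relation between the SK and OK estimators can be verified numerically. The Python sketch below is a minimal illustration under stated assumptions: the exponential covariance `cov`, the helper names (`solve`, `sk_weights`, `ok_weights`, `local_mean_weights`), and the four-point one-dimensional data set are hypothetical, not the Cd transect of Figure 5.3. It computes the OK estimate directly, then recomputes it as a simple kriging estimate in which the stationary mean is replaced by the OK estimate of the local mean.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def cov(h):
    """Exponential covariance with unit sill (assumed model)."""
    return math.exp(-abs(h))


def sk_weights(x, x0):
    """Simple kriging weights, system (5.10)."""
    big_k = [[cov(xi - xj) for xj in x] for xi in x]
    return solve(big_k, [cov(xi - x0) for xi in x])


def _ok_matrix(x):
    n = len(x)
    return [[cov(xi - xj) for xj in x] + [1.0] for xi in x] + [[1.0] * n + [0.0]]


def ok_weights(x, x0):
    """Ordinary kriging weights, system (5.18)."""
    rhs = [cov(xi - x0) for xi in x] + [1.0]
    return solve(_ok_matrix(x), rhs)[:len(x)]


def local_mean_weights(x):
    """Weights of the local mean, system (5.22): the right-hand side
    covariances are zero, so the target location does not appear."""
    rhs = [0.0] * len(x) + [1.0]
    return solve(_ok_matrix(x), rhs)[:len(x)]


def dot(w, z):
    return sum(wi * zi for wi, zi in zip(w, z))


if __name__ == "__main__":
    xs, zs, x0 = [1.0, 2.0, 3.5, 4.0], [0.8, 1.5, 3.0, 4.1], 5.0
    lam_sk = sk_weights(xs, x0)
    lam_m = 1.0 - sum(lam_sk)                 # weight of the mean
    m_ok = dot(local_mean_weights(xs), zs)    # OK estimate of local mean
    print("OK estimate      :", dot(ok_weights(xs, x0), zs))
    print("SK with local mean:", dot(lam_sk, zs) + lam_m * m_ok)
```

The two printed values agree up to rounding. Because `local_mean_weights` takes no target location, the trend estimate is the same wherever the same neighboring data are used, as noted for the segments of Figure 5.3.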
5.4 Kriging with a Trend Model
The local estimation of the mean in ordinary kriging allows one to account for any "global" trend in the data over the study area $A$. Thus, the OK algorithm implicitly considers a non-stationary random function model, where stationarity is limited within each search neighborhood $W(\mathbf{u})$. In some situations, it may be inappropriate to consider the local mean $m(\mathbf{u})$ as constant even within small search neighborhoods. Kriging with a trend (KT) consists
[...] because of the $(K + 1)$ constraints (5.24).

By convention, the first trend function $f_0(\mathbf{u})$ is the unit constant, that is, $f_0(\mathbf{u}) = 1$. Hence the first condition is similar to the OK constraint on the weights:

$$\sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{KT}_\alpha(\mathbf{u}) = 1$$

The constraints (5.24) allow one to express the KT estimator as a linear combination of only the $n(\mathbf{u})$ RVs $Z(\mathbf{u}_\alpha)$:

$$Z^*_{KT}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{KT}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) \tag{5.25}$$

with

$$\sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{KT}_\alpha(\mathbf{u})\, f_k(\mathbf{u}_\alpha) = f_k(\mathbf{u}) \qquad k = 0, \ldots, K$$

The kriging with trend estimator (5.25) is unbiased since the error mean is equal to zero:

$$E\{Z^*_{KT}(\mathbf{u}) - Z(\mathbf{u})\} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{KT}_\alpha(\mathbf{u})\, m(\mathbf{u}_\alpha) - m(\mathbf{u}) = \sum_{k=0}^{K} a_k \left[\sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{KT}_\alpha(\mathbf{u})\, f_k(\mathbf{u}_\alpha) - f_k(\mathbf{u})\right] = 0$$

The minimization of the corresponding error variance, of type (5.8), under the $(K + 1)$ non-bias conditions (5.24) calls for the definition of a Lagrangian $L(\mathbf{u})$. The procedure is similar to that for ordinary kriging except that there are now $(K + 1)$ Lagrange parameters $\mu^{KT}_k(\mathbf{u})$ accounting for the $(K + 1)$ constraints on the weights. Setting the $(n(\mathbf{u}) + K + 1)$ partial first derivatives to zero yields the following system of $(n(\mathbf{u}) + K + 1)$ linear equations:

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{KT}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \sum_{k=0}^{K} \mu^{KT}_k(\mathbf{u})\, f_k(\mathbf{u}_\alpha) = C_R(\mathbf{u}_\alpha - \mathbf{u}) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{KT}_\beta(\mathbf{u})\, f_k(\mathbf{u}_\beta) = f_k(\mathbf{u}) & k = 0, \ldots, K \end{cases} \tag{5.26}$$

Accounting for the first $n(\mathbf{u})$ equations in system (5.26), the minimized error variance becomes

$$\sigma^2_{KT}(\mathbf{u}) = C_R(0) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{KT}_\alpha(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}) - \sum_{k=0}^{K} \mu^{KT}_k(\mathbf{u})\, f_k(\mathbf{u}) \tag{5.27}$$

Note that for $K = 0$, system (5.26) reverts to the ordinary kriging system (5.17). Thus, the KT estimator (5.25) and kriging variance (5.27) are equal to the OK estimator (5.16) and kriging variance (5.19).

Kriging with a trend model requires a prior determination of (1) the $K$ trend functions $f_k(\mathbf{u})$, and (2) the covariance of the residual component $R(\mathbf{u})$, $C_R(\mathbf{h})$.

The type of functions $f_k(\mathbf{u})$ may be directly suggested by the physics of the problem. For example, a series of sine and cosine functions can be used to model a periodic trend of an attribute; for example, see Séguret and Huchon (1990). In most spatial situations, the earth scientist has no physical ground for choosing a particular type of analytical trend function. Because the concept of trend is usually associated with a smoothly varying component of the $z$-variability, low-order ($\le 2$) polynomials are typically used to model the trend, for example,

a linear trend in $R^2$ ($K = 2$):

Another approach (Delfiner, 1976) consists of defining linear con~binations


of d a t a that filter the trend rn(u). For example, a trend of order 1 such
as m ( u ) = no +
nl . u would he filtered by "differences of order 2" snc11 as
where ( 2 ,y) are tlie coordinates of the location 11. T h e trcnd may be limited [ z ( i ~- + + +
) 2z(u 11) z ( n 211)]. Indeed, each colnponent of that difference of
t o a particdar direction, say, the direction of prevailing wind for airbone order 2 can be expressed tire sirlti of a residual and a trcnd co~nponent:
pollutiol~or of steepest slope for soil properties. For example, a linear t.rcnd
in the 45' direction would be defined as

I n f e r r i n g the r e s i d u a l covnriauct:
011evcrifics that tlre previous linear combination ofz-valr~esreverts to a l i ~ m n
In practice, the residual semivariogram $\gamma_R(\mathbf{h})$ is first inferred, then a residual (pseudo) covariance $C_R(\mathbf{h})$ is deduced as $A - \gamma_R(\mathbf{h})$. As in ordinary kriging, the first unbiasedness condition $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta^{KT}(\mathbf{u}) = 1$ filters out the arbitrary constant $A$ from the first $n(\mathbf{u})$ equations in the KT system (5.26).

The computation of the residual semivariogram $\gamma_R(\mathbf{h})$ is not straightforward because available data are z-values, not residual values. The experimentally available z-semivariogram $\gamma(\mathbf{h})$ is related to $\gamma_R(\mathbf{h})$:

    $\gamma(\mathbf{h}) = \gamma_R(\mathbf{h}) + \frac{1}{2}\,[m(\mathbf{u}) - m(\mathbf{u}+\mathbf{h})]^2$    (5.27)

where the trend values $m(\mathbf{u})$ and $m(\mathbf{u}+\mathbf{h})$ are unknown.

A first solution to the problem of inferring $\gamma_R(\mathbf{h})$ consists of selecting data pairs that are unaffected or slightly affected by the trend, i.e., data pairs such that

    $m(\mathbf{u}_\alpha) \simeq m(\mathbf{u}_\alpha + \mathbf{h})$    (5.28)

The residual semivariogram can then be inferred directly from the z-semivariogram computed from such pairs. The following guidelines are useful:

• Relation (5.28) is generally satisfied for small separation distances $|\mathbf{h}|$. Thus, the residual semivariogram for the first lags may be identified with the corresponding z-semivariogram $\gamma(\mathbf{h})$.

• For larger distances, the residual semivariogram would be inferred from pairs of z-values taken in subareas or along directions (e.g., perpendicular to the trend) where the influence of the trend can be ignored. In the latter case, the inaccessible residual semivariogram in the trend direction is deemed similar to that computed in the perpendicular direction; such a decision amounts to modeling the pattern of variation of the residuals as isotropic. Any anisotropy in the z-data is accounted for in the trend model, not in the residual covariance.

... combination of residual values, thereby filtering the trend component.

More generally, a difference of order $(k+1)$ filters any polynomial trend of order $k$. The variance of such differences is referred to as the generalized covariance of order $k$ and is used as residual covariance in system (5.26). Beware that:

• Such high-order differences are not readily available when data are not gridded.

• The automatic modeling of generalized covariances from experimental high-order differences or related statistics typically results in very large (artifact) relative nugget effects.

An unreliable practice would consist of:

1. estimating the trend at the data locations and subtracting it from the data to obtain residual estimates $\hat{r}(\mathbf{u}_\alpha)$, then

2. applying the KT algorithm using the residual semivariogram $\gamma_{\hat{R}}(\mathbf{h})$ inferred from the estimated values $\hat{r}(\mathbf{u}_\alpha)$.

Indeed, the semivariogram of estimated residuals, $\gamma_{\hat{R}}(\mathbf{h})$, strongly depends on the algorithm used to estimate the trend component and may depart from the residual semivariogram $\gamma_R(\mathbf{h})$. A better alternative would be to interpolate the residual values $\hat{r}(\mathbf{u}_\alpha)$ using SK and their semivariogram $\gamma_{\hat{R}}(\mathbf{h})$, then add the trend estimates $\hat{m}(\mathbf{u})$ back to the interpolated residuals to get the estimates $z^*(\mathbf{u})$. A similar approach is presented in Chapter 6, where the trend estimates $\hat{m}(\mathbf{u})$ are deduced from exhaustively sampled secondary information rather than being modeled as a specific function of the spatial coordinates $\mathbf{u}$.
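The statement that a difference of order $(k+1)$ filters any polynomial trend of order $k$ can be checked numerically on gridded data. The sketch below is only an illustration: the quadratic trend, the noise level, and the helper name `difference` are assumptions, not values from the text.

```python
import numpy as np

def difference(values, order):
    """Apply `order` successive first differences to regularly gridded values."""
    return np.diff(values, n=order)

# Hypothetical gridded transect: quadratic trend (order k = 2) plus a residual.
x = np.arange(20, dtype=float)
trend = 1.5 - 0.4 * x + 0.03 * x**2
rng = np.random.default_rng(0)
residual = rng.normal(scale=0.1, size=x.size)
z = trend + residual

k = 2
# A difference of order k + 1 annihilates the polynomial trend of order k ...
trend_filtered = difference(trend, k + 1)
# ... so differencing the z-data gives the same linear combination of residuals.
z_diff = difference(z, k + 1)
r_diff = difference(residual, k + 1)
```

The variance of `z_diff` then depends only on the residual component, which is the idea behind generalized covariances. The sketch relies on a regular grid, in line with the warning that such differences are not readily available for non-gridded data.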

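As an example, consider the following linear trend model in two dimensions:

    $m(\mathbf{u}) = m(x, y) = a_0 + a_1 x + a_2 y$

where $(x, y)$ are the coordinates of the location $\mathbf{u}$ being estimated. The KT system (5.26) includes $(n(\mathbf{u}) + 3)$ linear equations:

    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta^{KT}(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu_0^{KT}(\mathbf{u}) + \mu_1^{KT}(\mathbf{u})\, x_\alpha + \mu_2^{KT}(\mathbf{u})\, y_\alpha = C_R(\mathbf{u}_\alpha - \mathbf{u}), \quad \alpha = 1, \ldots, n(\mathbf{u})$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta^{KT}(\mathbf{u}) = 1$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta^{KT}(\mathbf{u})\, x_\beta = x$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta^{KT}(\mathbf{u})\, y_\beta = y$    (5.29)

where $(x_\alpha, y_\alpha)$ are the coordinates of any datum location $\mathbf{u}_\alpha$. Using matrix notation, the KT system (5.29) is written as

    $\mathbf{K}_{KT}\, \boldsymbol{\lambda}_{KT}(\mathbf{u}) = \mathbf{k}_{KT}$    (5.30)

where $\mathbf{K}_{KT}$ is the $(n(\mathbf{u}) + 3) \times (n(\mathbf{u}) + 3)$ matrix:

    $\mathbf{K}_{KT} = \begin{bmatrix} C_R(\mathbf{u}_1 - \mathbf{u}_1) & \cdots & C_R(\mathbf{u}_1 - \mathbf{u}_{n(\mathbf{u})}) & 1 & x_1 & y_1 \\ \vdots & & \vdots & \vdots & \vdots & \vdots \\ C_R(\mathbf{u}_{n(\mathbf{u})} - \mathbf{u}_1) & \cdots & C_R(\mathbf{u}_{n(\mathbf{u})} - \mathbf{u}_{n(\mathbf{u})}) & 1 & x_{n(\mathbf{u})} & y_{n(\mathbf{u})} \\ 1 & \cdots & 1 & 0 & 0 & 0 \\ x_1 & \cdots & x_{n(\mathbf{u})} & 0 & 0 & 0 \\ y_1 & \cdots & y_{n(\mathbf{u})} & 0 & 0 & 0 \end{bmatrix}$

$\boldsymbol{\lambda}_{KT}(\mathbf{u})$ is the vector of KT weights and Lagrange parameters, and $\mathbf{k}_{KT}$ is the vector of data-to-unknown covariances and trend functions:

    $\boldsymbol{\lambda}_{KT}(\mathbf{u}) = [\lambda_1^{KT}(\mathbf{u}), \ldots, \lambda_{n(\mathbf{u})}^{KT}(\mathbf{u}), \mu_0^{KT}(\mathbf{u}), \mu_1^{KT}(\mathbf{u}), \mu_2^{KT}(\mathbf{u})]^T$
    $\mathbf{k}_{KT} = [C_R(\mathbf{u}_1 - \mathbf{u}), \ldots, C_R(\mathbf{u}_{n(\mathbf{u})} - \mathbf{u}), 1, x, y]^T$

The matrix $\mathbf{K}_{KT}$ in system (5.30) is obtained by adding three rows and three columns to the covariance matrix $\mathbf{K}_{SK} = [C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta)]$ in the matrix notation of the simple kriging system (5.13). Adding only the first row and the first column of 1's to the data covariance matrix $\mathbf{K}_{SK}$ yields the OK system (5.18). Thus, a continuum exists between unconstrained simple kriging and kriging with a trend model. The number of constraints on the weights increases with the complexity of the trend model, i.e., as the number $(K + 1)$ of trend functions $f_k(\mathbf{u})$ retained increases. Such constraints on the weights amount to selecting only those linear combinations of data that can filter out the unknown trend $m(\mathbf{u})$.

With more concise notation, the matrix formulation (5.30) of the KT system for any number $(K + 1)$ of trend functions is written

    $\begin{bmatrix} \mathbf{K}_{SK} & \mathbf{F} \\ \mathbf{F}^T & \mathbf{0} \end{bmatrix} \begin{bmatrix} \boldsymbol{\lambda} \\ \boldsymbol{\mu} \end{bmatrix} = \begin{bmatrix} \mathbf{k}_{SK} \\ \mathbf{f} \end{bmatrix}$    (5.31)

where $\mathbf{F} = [f_k(\mathbf{u}_\alpha)]$ is the $n(\mathbf{u}) \times (K+1)$ matrix of trend function values at the data locations and $\mathbf{f} = [f_k(\mathbf{u})]$ is the vector of trend function values at $\mathbf{u}$. The weights and Lagrange parameters are obtained as $\boldsymbol{\lambda}_{KT}(\mathbf{u}) = \mathbf{K}_{KT}^{-1}\, \mathbf{k}_{KT}$.

The kriging system (5.31) has a unique solution if these two conditions are met:

1. The covariance matrix $[C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta)]$ is positive definite, in practice if:
   • no two data are colocated: $\mathbf{u}_\alpha \neq \mathbf{u}_\beta$ for $\alpha \neq \beta$;
   • the residual covariance model $C_R(\mathbf{h})$ is permissible.

2. The $(K + 1)$ functions $f_k(\mathbf{u})$ are linearly independent on the set of $n(\mathbf{u})$ data; that is, the relations $\sum_{k=0}^{K} c_k f_k(\mathbf{u}_\alpha) = 0 \;\; \forall\, \alpha = 1, \ldots, n(\mathbf{u})$ would require that $c_k = 0 \;\; \forall\, k = 0, \ldots, K$. Such a condition means that a drift along a particular direction cannot be estimated if all $n(\mathbf{u})$ data are aligned perpendicular to that direction (Journel and Huijbregts, 1978, p. 319). Moreover, the number $(K + 1)$ of trend functions cannot exceed the number $n(\mathbf{u})$ of data; for example, the modeling of a quadratic trend in two dimensions requires at least six data values.

Kriging the trend

The trend component $m(\mathbf{u})$ can be estimated explicitly using an approach that is similar to kriging the local mean in relation (5.20). The KT estimator of the trend is expressed as a linear combination of $n(\mathbf{u})$ random variables:

    $m^*_{KT}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha KT}^{m}(\mathbf{u})\, Z(\mathbf{u}_\alpha)$    (5.32)

where $\lambda_{\alpha KT}^{m}(\mathbf{u})$ is the weight associated with the datum $z(\mathbf{u}_\alpha)$. The kriging weights are obtained by solving a kriging system identical to the KT system (5.26) except that the right-hand-side covariances $C_R(\mathbf{u}_\alpha - \mathbf{u})$ are set to zero: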
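The matrix form (5.30) for a linear trend in two dimensions can be assembled and solved directly. The following is a minimal sketch under stated assumptions: the exponential residual covariance, the data coordinates, and the names `exp_cov` and `kt_system` are hypothetical and are not taken from the text.

```python
import numpy as np

def exp_cov(h, sill=1.0, a=1.0):
    """Assumed residual covariance model: C_R(h) = sill * exp(-3|h|/a)."""
    return sill * np.exp(-3.0 * np.asarray(h) / a)

def kt_system(coords, target, cov=exp_cov):
    """Build K_KT and k_KT for the trend m(x, y) = a0 + a1*x + a2*y."""
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Trend functions f_0 = 1, f_1 = x, f_2 = y evaluated at the data locations.
    F = np.column_stack([np.ones(n), coords[:, 0], coords[:, 1]])
    K = np.zeros((n + 3, n + 3))
    K[:n, :n] = cov(d)          # data-to-data residual covariances
    K[:n, n:] = F               # three added columns
    K[n:, :n] = F.T             # three added rows
    k = np.concatenate([cov(np.linalg.norm(coords - target, axis=1)),
                        [1.0, target[0], target[1]]])
    return K, k

# Hypothetical, non-aligned data locations and a target location.
coords = np.array([[0.0, 0.0], [1.0, 0.2], [0.3, 1.1], [1.2, 1.0], [0.6, 0.5]])
target = np.array([0.5, 0.7])

K, k = kt_system(coords, target)
solution = np.linalg.solve(K, k)
weights, lagrange = solution[:5], solution[5:]
```

With these constraints the weights reproduce any planar surface exactly, which is one way to see that the linear trend is filtered. Setting the first `n` entries of `k` to zero would give the weights for kriging the trend itself.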

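    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta KT}^{m}(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \sum_{k=0}^{K} \mu_k^{m}(\mathbf{u})\, f_k(\mathbf{u}_\alpha) = 0, \quad \alpha = 1, \ldots, n(\mathbf{u})$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta KT}^{m}(\mathbf{u})\, f_k(\mathbf{u}_\beta) = f_k(\mathbf{u}), \quad k = 0, \ldots, K$    (5.33)

where $\mu_k^{m}(\mathbf{u})$ is the Lagrange parameter that accounts for the $(k+1)$th KT constraint on the kriging weights. For $K = 0$, system (5.33) reverts to the ordinary kriging system (5.22) for estimating the local mean.

Estimation of the trend component $m(\mathbf{u})$ amounts to first estimating the $(K+1)$ trend coefficients $a_k(\mathbf{u})$, then computing the trend estimate as a linear combination of the known trend functions $f_k(\mathbf{u})$:

    $m^*_{KT}(\mathbf{u}) = \sum_{k=0}^{K} a_k^*(\mathbf{u})\, f_k(\mathbf{u})$

Like the trend component, the unknown trend coefficients $a_k(\mathbf{u})$ can be estimated as linear combinations of z-values. For example, the linear estimator of the coefficient $a_{k'}$ at location $\mathbf{u}$ is

    $A_{k'}^*(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha KT}^{k'}(\mathbf{u})\, Z(\mathbf{u}_\alpha)$    (5.34)

where $\lambda_{\alpha KT}^{k'}(\mathbf{u})$ is the weight associated with the z-datum at location $\mathbf{u}_\alpha$ for the KT estimation of the trend coefficient at $\mathbf{u}$. The estimator $A_{k'}^*(\mathbf{u})$ is a RV, being a linear combination of the RVs $Z(\mathbf{u}_\alpha)$, whereas $a_{k'}(\mathbf{u})$ is a deterministic, though unknown, value.

The estimator (5.34) is unbiased if the error mean is zero, that is, if

    $\sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha KT}^{k'}(\mathbf{u})\, f_k(\mathbf{u}_\alpha) = \delta_{kk'}, \quad k = 0, \ldots, K$    (5.35)

with $\delta_{kk'} = 1$ if $k = k'$ and 0 otherwise.

The minimization of the error variance $\text{Var}\{A_{k'}^*(\mathbf{u}) - a_{k'}(\mathbf{u})\}$ under the $(K+1)$ constraints (5.35) yields a kriging system identical to the KT system (5.33) except for the $(K+1)$ non-bias conditions:

    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta KT}^{k'}(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \sum_{k=0}^{K} \mu_k^{k'}(\mathbf{u})\, f_k(\mathbf{u}_\alpha) = 0, \quad \alpha = 1, \ldots, n(\mathbf{u})$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta KT}^{k'}(\mathbf{u})\, f_k(\mathbf{u}_\beta) = \delta_{kk'}, \quad k = 0, \ldots, K$    (5.36)

• Both kriging systems (5.33) and (5.36) are identical for $K = 0$. Indeed, the constant trend component $m(\mathbf{u})$ is then the single trend coefficient $a_0(\mathbf{u})$.

• The location $\mathbf{u}$ being estimated does not appear in the kriging system (5.36). Provided the same set of data is used to estimate any trend coefficient $a_k$ at two different locations $\mathbf{u}$ and $\mathbf{u}'$, their estimates are identical: $a_k^*(\mathbf{u}) = a_k^*(\mathbf{u}') \;\; \forall\, k = 0, \ldots, K$. For $K = 0$, the trend estimates at $\mathbf{u}$ and $\mathbf{u}'$ are then identical (section 5.3). For polynomials of higher order, say, a linear trend in $R^1$, the trend estimates at locations $\mathbf{u} = (x, 0)$ and $\mathbf{u}' = (x', 0)$ appear as the same linear functions of coordinates $x$ and $x'$: $m^*_{KT}(\mathbf{u}) = a_0^* + a_1^* \cdot x$ and $m^*_{KT}(\mathbf{u}') = a_0^* + a_1^* \cdot x'$.

Ordinary kriging versus kriging with a trend

As mentioned in section 5.3, ordinary kriging amounts to estimating, within each search neighborhood $W(\mathbf{u})$, the local constant mean $m^*_{OK}(\mathbf{u})$, then performing SK on the corresponding residuals:

    $z^*_{OK}(\mathbf{u}) = m^*_{OK}(\mathbf{u}) + \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\, [z(\mathbf{u}_\alpha) - m^*_{OK}(\mathbf{u})]$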
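The kriging of trend coefficients can be sketched for a linear trend along a transect. The residual covariance, the data coordinates, and the function name `trend_coefficient_weights` are assumptions made for illustration; the data are placed exactly on a hypothetical line so that the unbiasedness constraints can be checked.

```python
import numpy as np

def trend_coefficient_weights(x, cov, k_prime):
    """Weights for kriging coefficient a_{k'} of the trend m(x) = a0 + a1*x.

    The location being estimated is not an argument: it does not enter the system.
    """
    n = len(x)
    F = np.column_stack([np.ones(n), x])           # f_0 = 1, f_1 = x
    K = np.zeros((n + 2, n + 2))
    K[:n, :n] = cov(np.abs(x[:, None] - x[None, :]))
    K[:n, n:] = F
    K[n:, :n] = F.T
    rhs = np.zeros(n + 2)                           # zero right-hand-side covariances
    rhs[n + k_prime] = 1.0                          # Kronecker delta on the constraints
    return np.linalg.solve(K, rhs)[:n]

cov = lambda h: np.exp(-3.0 * h / 0.5)             # assumed residual covariance
x = np.array([1.0, 1.3, 1.7, 2.0, 2.4])            # hypothetical data coordinates (km)
z = 0.49 + 0.10 * x                                 # data lying exactly on a line

a0_hat = trend_coefficient_weights(x, cov, 0) @ z
a1_hat = trend_coefficient_weights(x, cov, 1) @ z
trend_at = lambda xq: a0_hat + a1_hat * xq          # same linear function at any location
```

Because the weights depend only on the data configuration, every location served by the same data set receives the same coefficient estimates, which is why the KT trend appears as linear segments.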
Similarly, kriging with a trend amounts to estimating, within each search neighborhood $W(\mathbf{u})$, the trend components $m^*_{KT}(\mathbf{u})$ and $m^*_{KT}(\mathbf{u}_\alpha)$, then performing SK on the corresponding residuals:

    $z^*_{KT}(\mathbf{u}) = m^*_{KT}(\mathbf{u}) + \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\, [z(\mathbf{u}_\alpha) - m^*_{KT}(\mathbf{u}_\alpha)]$

Unlike in ordinary kriging, the trend component in KT is not constant within the search neighborhood. Rather, it depends on the coordinates of the location being estimated and of the data locations.

Both OK and KT estimates $z^*_{OK}(\mathbf{u})$ and $z^*_{KT}(\mathbf{u})$ can be expressed as the sum of two terms: the same linear combination of $n(\mathbf{u})$ data $z(\mathbf{u}_\alpha)$ and a function of the trend estimates:

    $z^*_{OK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\, z(\mathbf{u}_\alpha) + \Big[1 - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\Big]\, m^*_{OK}(\mathbf{u})$
    $z^*_{KT}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\, z(\mathbf{u}_\alpha) + \Big[m^*_{KT}(\mathbf{u}) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\, m^*_{KT}(\mathbf{u}_\alpha)\Big]$

The difference between the two estimates is thus

    $z^*_{KT}(\mathbf{u}) - z^*_{OK}(\mathbf{u}) = [m^*_{KT}(\mathbf{u}) - m^*_{OK}(\mathbf{u})] - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\, [m^*_{KT}(\mathbf{u}_\alpha) - m^*_{OK}(\mathbf{u})]$    (5.39)

Any difference between the OK and KT estimates originates from a difference between the two trend estimates, hence from the usually arbitrary decision of modeling the local trend $m(\mathbf{u})$ within a neighborhood $W(\mathbf{u})$ as a constant or a particular polynomial of order $K$.

Numerical example

Consider, for example, the modeling of the trend component within two search neighborhoods centered at $\mathbf{u}_0 = (2.0, 0)$ and $\mathbf{u}_0' = (2.3, 0)$ along the NE-SW transect of Figure 5.4. The trend is arbitrarily modeled as a linear function of the coordinate $x$ along that transect: $m(\mathbf{u}) = m(x, 0) = a_0(\mathbf{u}) + a_1(\mathbf{u}) \cdot x$. The semivariogram model of Figure 5.1 is used for both OK and KT.

The first search neighborhood (Figure 5.4, left graph) includes Cd data at locations $\mathbf{u}_1$ to $\mathbf{u}_5$. The solid line depicts the local constant mean estimated by ordinary kriging: $m^*_{OK}(\mathbf{u}) = 0.68 \;\; \forall\, \mathbf{u} \in W(\mathbf{u}_0)$. For the same search neighborhood, kriging with the aforementioned trend model yields the following trend coefficients: $a_0^*(\mathbf{u}) = 0.49$, $a_1^*(\mathbf{u}) = 0.10$. The corresponding linear trend estimate, depicted by the dashed line, is then $m^*_{KT}(\mathbf{u}) = 0.49 + 0.1 \cdot x$. The small positive slope of that model reflects the slight increase in Cd concentrations from locations $\mathbf{u}_1$ to $\mathbf{u}_5$. At the central location $\mathbf{u}_0$, the KT estimate of the trend is $m^*_{KT}(2.0, 0) = 0.69$, which is close to the OK estimate of the local mean at that same location, $m^*_{OK}(2.0, 0) = 0.68$. The difference between the OK and KT estimates of the trend, i.e., between the solid and dashed lines, increases away from $\mathbf{u}_0$. That difference, however, is small because of the flatness (small slope) of the estimated linear trend model.

Conversely, the difference between OK and KT trend models is much larger for the second search neighborhood centered on $\mathbf{u}_0'$ and including locations $\mathbf{u}_2$ to $\mathbf{u}_6$ (Figure 5.4, right graph). The large Cd concentration at $\mathbf{u}_6$ greatly influences the trend fitted by both algorithms. The local constant mean estimated by ordinary kriging is twice the estimate for the first neighborhood: $m^*_{OK}(\mathbf{u}) = 1.40 \;\; \forall\, \mathbf{u} \in W(\mathbf{u}_0')$. The estimated linear trend model is also much steeper than for the first neighborhood: $m^*_{KT}(\mathbf{u}) = -3.22 + 2.01 \cdot x$. Again, at the central location $\mathbf{u}_0'$, both OK and KT estimates of the trend are similar: $m^*_{OK}(2.3, 0) = 1.40$ and $m^*_{KT}(2.3, 0) = 1.40$. However, the two trend models differ considerably away from $\mathbf{u}_0'$.

Figure 5.4: OK and KT modeling of the local trend within two successive search neighborhoods centered on locations $\mathbf{u}_0$ (left graph) and $\mathbf{u}_0'$ (right graph). The OK trend is a constant mean (solid line), whereas the KT trend is a linear function of the x-coordinate (dashed line). [Both panels plot trend estimates against distance (km).]
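The figures quoted in the numerical example can be re-evaluated with a few lines of arithmetic. The sketch below only plugs the reported local means and trend coefficients into the two trend models; the dictionary layout and function names are illustrative choices, not part of the text.

```python
def kt_trend(x, a0, a1):
    """Linear KT trend model m(x) = a0 + a1 * x along the transect."""
    return a0 + a1 * x

# Values reported in the text for the two search neighborhoods.
first = {"mean": 0.68, "a0": 0.49, "a1": 0.10, "center": 2.0}
second = {"mean": 1.40, "a0": -3.22, "a1": 2.01, "center": 2.3}

def trend_gap(neighborhood, x):
    """Difference between the KT linear trend and the OK constant mean at x."""
    return kt_trend(x, neighborhood["a0"], neighborhood["a1"]) - neighborhood["mean"]

kt_at_first_center = kt_trend(first["center"], first["a0"], first["a1"])
kt_at_second_center = kt_trend(second["center"], second["a0"], second["a1"])
```

Evaluating `trend_gap` away from each center shows the gap growing much faster in the second neighborhood, where the slope is steeper.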
Consider now the estimation of the trend component at all locations $\mathbf{u}$ along the NE-SW transect. Figure 5.5 (top graph) shows the staircase OK estimate of the local mean. The KT estimate of the trend is depicted by the solid line in Figure 5.5 (second graph). As in Figure 5.3, the vertical dashed lines delineate the trend estimates that are based on the same five closest Cd values. Within each segment, the estimates of the trend coefficients $a_0^*(\mathbf{u})$ and $a_1^*(\mathbf{u})$ are identical (see previous discussion and Figure 5.5, two bottom graphs). The KT trend estimate thus appears as a series of linear segments with different slopes. For the first segment, 1-2.1 km, the small positive slope, $a_1^*(\mathbf{u}) = 0.10$, reflects the slight increase in Cd concentrations from $\mathbf{u}_1$ to $\mathbf{u}_5$. This flat trend contrasts with the steep positive gradient of the next segment, 2.1-2.5 km, whose large slope, $a_1^*(\mathbf{u}) = 2.3$, reflects the large increase in Cd values from $\mathbf{u}_2$ to $\mathbf{u}_6$. The linear downward trend within the extreme right segment is fitted from data at locations $\mathbf{u}_6$ to $\mathbf{u}_{10}$. The large Cd concentration at $\mathbf{u}_6$ now yields a decreasing trend estimate with a substantial negative slope, $a_1^*(\mathbf{u}) = -1.0$. Together with the slope estimate $a_1^*(\mathbf{u})$, the trend coefficient estimate $a_0^*(\mathbf{u})$ also changes along the NE-SW transect (Figure 5.5, two bottom graphs).

Figure 5.6 shows both OK (dashed line) and KT (solid line) estimates of the trend (top graph) and of Cd concentrations (bottom graph). When comparing the performances of OK and KT estimators, it is important to distinguish between interpolation and extrapolation.

Interpolation

Interpolation corresponds to cases where the location $\mathbf{u}$ being estimated is surrounded by data and is within the correlation range of these data, e.g., the locations belonging to the segment 1-5 km of the NE-SW transect. Under such conditions, OK and KT yield similar estimates for both the trend component $m(\mathbf{u})$ and the attribute value $z(\mathbf{u})$. These results confirm that in interpolation kriging results are not influenced by the choice of a particular representation for the trend (Journel and Rossi, 1989).

Extrapolation

Extrapolation corresponds to situations where the location $\mathbf{u}$ being estimated is outside the geographic range of data, e.g., all locations that are beyond the extreme right datum in Figure 5.6 (top graph). In this case, the parameters $a_k(\mathbf{u})$ of the trend model ($a_0(\mathbf{u})$ for OK, $a_0(\mathbf{u})$ and $a_1(\mathbf{u})$ for KT) are estimated from the closest data and extrapolated toward the location $\mathbf{u}$ being estimated. For example, beyond the right extreme datum, the OK estimator extrapolates the constant local mean that is evaluated from the last five data values. In contrast, the KT estimator extrapolates the linear trend (decrease) fitted to the last five data values. Unlike in interpolation, the choice of a trend model is here critical.

In summary, no matter what trend is present in data, ordinary kriging with local search neighborhoods is preferred in interpolation situations because it provides results similar to those of KT, but it is easier to implement. In extrapolation conditions, the KT estimator should be used whenever the physics of the phenomenon suggests a particular functional form for extrapolating a trend fitted from within the sampled area. In most earth science applications, however, there is usually no such physical basis for choosing a particular trend model, and the user should be aware that the estimated values depend heavily on the arbitrary trend being extrapolated.

Though the choice of a particular trend model cannot be validated when no data are available (extrapolation situation), mapping the trend estimate may draw attention to aberrant extrapolation results. For example, Figure 5.6 (bottom graph) indicates that the linear trend model is inappropriate much beyond the extreme right datum since its extrapolation would yield negative concentration estimates around 7 km.

Figure 5.5: OK and KT estimates of the trend (two top graphs). The KT trend is a local linear function of the x-coordinate, $m(\mathbf{u}) = a_0(\mathbf{u}) + a_1(\mathbf{u}) \cdot x$. The two bottom graphs show the KT trend coefficients $a_0(\mathbf{u})$ and $a_1(\mathbf{u})$ estimated every ... [Panels: OK trend estimates, KT trend estimates, coefficient estimates $a_0^*(\mathbf{u})$, coefficient estimates $a_1^*(\mathbf{u})$, each against distance (km).]

Figure 5.6: OK and KT estimates of the trend (top graph) and of Cd concentrations (bottom graph). Note the similarity between OK and KT estimates in interpolation situation 1-5 km, and how extrapolation results depend critically on the prior choice of a trend model: constant for OK or linear for KT. [Panels: trend estimates and Cd estimates, each against distance (km).]

5.5 Block Kriging

Any measurement $z(\mathbf{u}_\alpha)$ relates to a non-zero, finite sample volume, such as a piece of rock or a core of soil. Often the size or "support" of the datum may be assimilated to a point of coordinate $\mathbf{u}_\alpha$. Similarly, the support related to the attribute $z$ to be estimated is usually assimilated to a point of coordinate $\mathbf{u}$. In several applications, however, the target quantity is the average value of attribute $z$ over a block of specific dimensions, for example, the average Cd concentration over a 1-hectare field if remedial measures are applied to 1-hectare areas. Block kriging is a generic name for estimation of average z-values over a segment, a surface, or a volume of any size or shape. The term point kriging refers to estimation on point support.

Consider the problem of estimating the average value of attribute $z$ over a block $V$ centered at $\mathbf{u}$. Provided the averaging process is linear, the block value $z_V(\mathbf{u})$ is defined as

    $z_V(\mathbf{u}) = \frac{1}{|V|} \int_{V(\mathbf{u})} z(\mathbf{u}')\, d\mathbf{u}' \simeq \frac{1}{N} \sum_{i=1}^{N} z(\mathbf{u}_i')$

where $|V|$ is the measure (length, area, volume) of block $V$. The integral is, in practice, approximated by a discrete sum of z-values defined at $N$ points $\mathbf{u}_i'$ discretizing the block $V(\mathbf{u})$. For example, the block $V(\mathbf{u})$ in Figure 5.7 is discretized by the four points $\mathbf{u}_1'$ to $\mathbf{u}_4'$.

Figure 5.7: A two-dimensional block $V$ centered on location $\mathbf{u}$ and its discretization by four points $\mathbf{u}_1'$ to $\mathbf{u}_4'$.

Note: to simplify notation, the superscript OK is removed from all notations in section 5.5.

The block value $z_V(\mathbf{u})$ could be estimated as the linear average of the $N$ point estimates, say, OK estimates $z^*_{OK}(\mathbf{u}_i')$:

    $z_V^*(\mathbf{u}) = \frac{1}{N} \sum_{i=1}^{N} z^*_{OK}(\mathbf{u}_i') = \frac{1}{N} \sum_{i=1}^{N} \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha(\mathbf{u}_i')\, z(\mathbf{u}_\alpha)$

where the same $n(\mathbf{u})$ data are used for all $N$ point estimates, and $\lambda_\alpha(\mathbf{u}_i')$ is the weight associated with the datum $z(\mathbf{u}_\alpha)$ for the OK estimation of
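attribute $z$ at location $\mathbf{u}_i'$. Such an approach requires solving, at each of the $N$ locations $\mathbf{u}_i'$, an ordinary kriging system of dimension $(n(\mathbf{u}) + 1)$:

    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta(\mathbf{u}_i')\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu(\mathbf{u}_i') = C(\mathbf{u}_\alpha - \mathbf{u}_i'), \quad \alpha = 1, \ldots, n(\mathbf{u})$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta(\mathbf{u}_i') = 1$

The approach becomes computationally expensive as the number of blocks and discretizing points increases, hence it is preferable to estimate the block value directly from the data values $z(\mathbf{u}_\alpha)$ using an estimator of type

    $Z_V^*(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha V}(\mathbf{u})\, Z(\mathbf{u}_\alpha)$

where $\lambda_{\alpha V}(\mathbf{u})$ is the block kriging weight assigned to the datum $z(\mathbf{u}_\alpha)$. Like the point estimator (5.16), the block estimator $Z_V^*(\mathbf{u})$ must be unbiased and such as to minimize the error variance $\sigma_E^2(\mathbf{u}) = \text{Var}\{Z_V^*(\mathbf{u}) - Z_V(\mathbf{u})\}$.

The block ordinary kriging system is written as follows:

    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta V}(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu_V(\mathbf{u}) = \bar{C}(\mathbf{u}_\alpha, V(\mathbf{u})), \quad \alpha = 1, \ldots, n(\mathbf{u})$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta V}(\mathbf{u}) = 1$    (5.44)

This "block" kriging system is identical to the "point" OK system (5.18) except for the right-hand-side term where the point-to-point covariance $C(\mathbf{u}_\alpha - \mathbf{u})$ is replaced by the point-to-block covariance $\bar{C}(\mathbf{u}_\alpha, V(\mathbf{u}))$, that is, the average covariance between the RV $Z(\mathbf{u}_\alpha)$ and the random variables $Z(\mathbf{u}')$ at all the points within the block $V$ (Journel and Huijbregts, 1978, p. 54):

    $\bar{C}(\mathbf{u}_\alpha, V(\mathbf{u})) = \frac{1}{|V|} \int_{V(\mathbf{u})} C(\mathbf{u}_\alpha - \mathbf{u}')\, d\mathbf{u}'$

In practice, this covariance $\bar{C}(\mathbf{u}_\alpha, V(\mathbf{u}))$ is approximated by the arithmetic average of the point support covariances $C(\mathbf{u}_\alpha - \mathbf{u}_i')$ defined between location $\mathbf{u}_\alpha$ and the $N$ points $\mathbf{u}_i'$ discretizing the block $V(\mathbf{u})$:

    $\bar{C}(\mathbf{u}_\alpha, V(\mathbf{u})) \simeq \frac{1}{N} \sum_{i=1}^{N} C(\mathbf{u}_\alpha - \mathbf{u}_i')$

For example, the covariance between location $\mathbf{u}_\alpha$ and the two-dimensional block in Figure 5.8 is approximated by the arithmetic average of point covariances between $\mathbf{u}_\alpha$ and the four discretizing points $\mathbf{u}_1'$ to $\mathbf{u}_4'$:

    $\bar{C}(\mathbf{u}_\alpha, V(\mathbf{u})) \simeq \frac{1}{4} \sum_{i=1}^{4} C(\mathbf{u}_\alpha - \mathbf{u}_i')$

Figure 5.8: Approximation of the point-to-block covariance $\text{Cov}\{Z(\mathbf{u}_\alpha), Z_V(\mathbf{u})\}$ by the average of point-to-point covariances $\text{Cov}\{Z(\mathbf{u}_\alpha), Z(\mathbf{u}_i')\}$ between datum location $\mathbf{u}_\alpha$ and each of the four discretizing points $\mathbf{u}_1'$ to $\mathbf{u}_4'$.

The block kriging variance is

    $\sigma_V^2(\mathbf{u}) = \bar{C}(V(\mathbf{u}), V(\mathbf{u})) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha V}(\mathbf{u})\, \bar{C}(\mathbf{u}_\alpha, V(\mathbf{u})) - \mu_V(\mathbf{u})$

where the block-to-block covariance $\bar{C}(V(\mathbf{u}), V(\mathbf{u}))$ is approximated by the arithmetic average of the covariances $C(\mathbf{u}_i' - \mathbf{u}_j')$ defined between any two discretizing points $\mathbf{u}_i'$ and $\mathbf{u}_j'$:

    $\bar{C}(V(\mathbf{u}), V(\mathbf{u})) \simeq \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} C(\mathbf{u}_i' - \mathbf{u}_j')$    (5.46)

Provided the same $n(\mathbf{u})$ data are used for all $N$ point kriging systems (5.12) and for the block kriging system (5.44), each block kriging weight $\lambda_{\alpha V}(\mathbf{u})$ can be shown to be the average of the $N$ point kriging weights $\lambda_\alpha(\mathbf{u}_i')$ (Journel and Huijbregts, 1978, p. 322):

    $\lambda_{\alpha V}(\mathbf{u}) = \frac{1}{N} \sum_{i=1}^{N} \lambda_\alpha(\mathbf{u}_i')$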
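The equivalence between direct block kriging and the average of point kriging estimates can be verified numerically. This is a minimal sketch: the exponential covariance, the data values, the block discretization, and the helper names `cov`, `dist`, and `ok_solve` are all assumptions chosen for illustration.

```python
import numpy as np

def cov(h, sill=1.0, a=2.0):
    """Assumed point-support covariance model (exponential)."""
    return sill * np.exp(-3.0 * h / a)

def dist(p, q):
    """Matrix of Euclidean distances between two sets of 2D points."""
    return np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)

def ok_solve(data, rhs_cov):
    """Solve an ordinary kriging system; returns weights followed by the Lagrange parameter."""
    n = len(data)
    K = np.ones((n + 1, n + 1))
    K[:n, :n] = cov(dist(data, data))
    K[n, n] = 0.0
    return np.linalg.solve(K, np.append(rhs_cov, 1.0))

# Hypothetical data and a block discretized by four points.
data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.5, 1.5], [0.2, 1.8]])
z = np.array([0.8, 1.1, 0.6, 1.9, 1.2])
block_pts = np.array([[0.75, 0.75], [1.25, 0.75], [0.75, 1.25], [1.25, 1.25]])

point_to_points = cov(dist(data, block_pts))           # n x N point covariances
point_to_block = point_to_points.mean(axis=1)          # averaged over the block

block_sol = ok_solve(data, point_to_block)             # direct block kriging
point_sols = np.array([ok_solve(data, point_to_points[:, i]) for i in range(4)])

block_weights = block_sol[:-1]
block_estimate = block_weights @ z
mean_point_estimate = np.mean(point_sols[:, :-1] @ z)

block_to_block = cov(dist(block_pts, block_pts)).mean()
block_variance = block_to_block - block_weights @ point_to_block - block_sol[-1]
```

Direct block kriging solves one system instead of `N`, which is the computational argument made in the text; the result is the same because the kriging matrix is shared and the right-hand side is linear in the covariances.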
Thus, the block kriging system yields an estimate identical to that obtained by averaging the $N$ point estimates $z^*_{OK}(\mathbf{u}_i')$:

    $z_V^*(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha V}(\mathbf{u})\, z(\mathbf{u}_\alpha) = \frac{1}{N} \sum_{i=1}^{N} z^*_{OK}(\mathbf{u}_i')$

1. Direct block kriging through system (5.44) yields an estimate of the linear average of $z$ over the block $V$. When applied to an attribute such as pH that does not average linearly in space, the block kriging estimate is the average pH over the block $V$ (statistical average), not the logarithm of the average concentration in H+ over that block (physical average). Let $z(\mathbf{u})$ be the concentration in H+ at location $\mathbf{u}$ and $y(\mathbf{u}) = -\log_{10}[z(\mathbf{u})]$ be the corresponding pH value. Block kriging performed on pH data $y(\mathbf{u}_\alpha)$ yields an estimate of the type

    $y_V^*(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha V}(\mathbf{u})\, y(\mathbf{u}_\alpha) = -\sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha V}(\mathbf{u})\, \log_{10}[z(\mathbf{u}_\alpha)]$

that is different from the pH of the block concentration estimate $z_V^*(\mathbf{u})$:

    $-\log_{10}[z_V^*(\mathbf{u})] = -\log_{10}\Big[\sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha V}(\mathbf{u})\, z(\mathbf{u}_\alpha)\Big]$

The physical average of pH values over the block would be obtained by performing block kriging on the concentration $z(\mathbf{u})$ of H+, not on the pH value $y(\mathbf{u})$.

2. One should strike a balance between too few discretizing points that provide a rough approximation of the point-to-block and block-to-block covariances and too many discretizing points that are computationally expensive. A rule of thumb is to select $(4)^n$ discretizing points, where $n$ is the number of dimensions, 1, 2, or 3, of the block; see Journel and Huijbregts (1978, p. 95-108) for further discussions. A good practice consists of computing covariance estimates for an increasing number of discretizing points: at some stage increasing the density of points will not significantly modify the results of the approximation.

3. The block kriging system (5.44) can be generalized to the case where the data themselves are defined on a support, say, $v(\mathbf{u}_\alpha)$, which cannot be considered as negligible with regard to the point support of the covariance model. The point-to-point $C(\mathbf{u}_\alpha - \mathbf{u}_\beta)$ and point-to-block $\bar{C}(\mathbf{u}_\alpha, V(\mathbf{u}))$ covariance terms in system (5.44) are then replaced by the corresponding block-to-block covariances, e.g.,

    $\bar{C}(v(\mathbf{u}_\alpha), v(\mathbf{u}_\beta)) = \frac{1}{|v_\alpha| \cdot |v_\beta|} \int_{v(\mathbf{u}_\alpha)} \int_{v(\mathbf{u}_\beta)} C(\mathbf{u} - \mathbf{u}')\, d\mathbf{u}\, d\mathbf{u}'$

where $|v_\alpha|$ and $|v_\beta|$ are the measures of the data support at locations $\mathbf{u}_\alpha$ and $\mathbf{u}_\beta$.

The size of the block $V(\mathbf{u})$ could be increased until the block equals the entire study area $A$. The block estimate $z_V^*(\mathbf{u})$ would then be an estimate of the linear average of $z$ over $A$. Theoretically, such a global mean $z_A$ could be estimated directly from all data $z(\mathbf{u}_\alpha)$ in $A$ by solving a block kriging system of the type in equation (5.44). In practice, several reasons prevent such direct kriging:

1. The covariance function needed can rarely be considered as stationary over the span of the entire study area.

2. Sample covariance values for large distances are usually unreliable because of the few data locations separated by such large distances.

3. Retaining all data makes the kriging system very large, lengthens the computation time, and often leads to instability of the kriging matrix.

For all these reasons, kriging is primarily used as a local estimation algorithm.

A global kriging estimate $z_A^*$ can, however, be obtained using a two-step procedure. First, the study area is discretized into small blocks and the average value of $z$ is estimated within each such block. Second, the global estimate is computed as a linear combination of block estimates $z_V^*(\mathbf{u})$, with each estimate receiving a weight proportional to the area $|V|$ of that block. A more straightforward alternative consists of computing a declustered mean of the data using the polygonal method or cell-declustering technique introduced in section 4.1.2; see also Isaaks and Srivastava (1989), Chapter 10.

Example

Figure 5.9 shows the point (small dashed line) and block OK estimates of Cd concentrations along the NE-SW transect. The blocks are defined on segments of length 50 m (large dashed line) and 250 m (solid line). In both cases, four discretizing points were used to compute the point-to-block covariances. Note the following:

• The block estimates do not match point-data values (black dots) because the supports are different.
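The pH remark can be illustrated with a small numerical comparison of the two kinds of average. The weights and H+ concentrations below are hypothetical values, chosen only so that the weights are positive and sum to one.

```python
import numpy as np

# Hypothetical block kriging weights (positive, summing to 1) and H+ concentrations.
weights = np.array([0.4, 0.3, 0.2, 0.1])
h_concentration = np.array([1e-5, 1e-6, 1e-7, 1e-6])
ph = -np.log10(h_concentration)                 # pH values 5, 6, 7, 6

# Block kriging applied to pH data: a statistical average of pH values.
statistical_average = weights @ ph

# pH of the block concentration estimate: the physical average.
physical_average = -np.log10(weights @ h_concentration)
```

The two numbers differ because the logarithm is not a linear transform; the weighted average concentration is dominated by the most acidic datum, so the physical average comes out lower than the average pH for these values.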
[Figure 5.9: point and block (50 m, 250 m) OK estimates of Cd concentrations along the NE-SW transect.]

• The block estimates vary more smoothly in space than the point estimates; that smoothing increases with increasing size of the block. The within-block averaging of block kriging smooths out the short-range variation of concentration, erasing the artifact discontinuities near data locations. If the objective is to map large-scale features of attribute $z$, block kriging is preferred to point kriging. An alternative consists of filtering the short-range variability component from the covariance model (see section 5.6).

5.6 Factorial Kriging

The kriging algorithms introduced so far are designed to estimate the unknown value of a continuous attribute $z$, say, metal concentration, on a point or a block support. In this section, the objective is no longer to estimate $z$ (metal concentrations), but rather to understand the origins of that value. For example, trace metals in soils can originate naturally from rocks or they can result from human activities, such as mining, industrial waste, or farming. Whereas little can be done to correct for large natural metal concentrations, measures can be taken to prevent man-made pollutions from getting worse. Therefore, an early understanding of natural as distinct from human origins of contamination is critical.

Whereas large metal concentrations are often related to the nearby presence of a farm or a factory, it is generally much more difficult to discover the origin of medium concentrations, say, concentrations that just exceed the tolerable maximum. Such concentrations may result from rocks that are naturally rich in that metal. Medium concentrations may also originate from human pollution, the impact of which is temporarily balanced by small natural ... more difficult to detect on Argovian rocks whose natural Co and Ni concentrations are half those measured on other rocks (Table 2.4, page 18).

If the scales at which the different factors (human, geologic) operate are very different from one another, then they should be apparent in the semivariograms of metal concentrations. The structural analysis performed in section 2.3.5 revealed the existence of three scales of spatial variation: micro-scale (range smaller than the first semivariogram lag of 50 m), local scale (short range $\simeq$ 200 m), and regional scale (range $\simeq$ 1 km). The short-range structure was related to the spatial distribution of rock types and land uses in the study area. The long-range structure was interpreted as the regional influence of geology, particularly Argovian and Kimmeridgian rocks, on metal concentrations. Such interpretation led us to model the experimental semivariograms of metal concentrations as linear combinations of three basic structures $g_l(\mathbf{h})$:

    $\gamma(\mathbf{h}) = \sum_{l=0}^{2} b^l\, g_l(\mathbf{h})$    (5.47)

where $g_0(\mathbf{h})$ is a nugget effect model, and $g_1(\mathbf{h})$ and $g_2(\mathbf{h})$ are spherical models with short and long ranges, $a_1$ and $a_2$. For example, Figure 5.1 (page 131, right graph) shows the experimental Cd semivariogram and the model fitted:

Under the linear model of regionalization (4.28), the RV $Z(\mathbf{u})$ with the nested semivariogram model (5.47) can be interpreted as a linear combination of three independent RVs $Y^l(\mathbf{u})$, each with zero mean and semivariogram $g_l(\mathbf{h})$:

    $Z(\mathbf{u}) = \sum_{l=0}^{2} a^l\, Y^l(\mathbf{u}) + m(\mathbf{u})$
In previous sections, the focus was either on the attribute estimate $z^*(\mathbf{u})$ or on the trend estimate $m^*(\mathbf{u})$. Factorial kriging amounts to splitting the residual component $R(\mathbf{u}) = Z(\mathbf{u}) - m(\mathbf{u})$ into several independent spatial components (or factors) on the basis of the semivariogram model $\gamma(\mathbf{h})$. Profiles or maps of estimates of these spatial components allow us to separate local and regional features of the phenomenon under study.

Consider the problem of estimating the spatial component $Z^l(\mathbf{u})$ of decomposition (5.50). The OK estimator of that spatial component is

    $Z_{OK}^{l*}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha l}^{OK}(\mathbf{u})\, Z(\mathbf{u}_\alpha)$

where $\lambda_{\alpha l}^{OK}(\mathbf{u})$ is the weight assigned to datum $z(\mathbf{u}_\alpha)$ for the estimation of the $l$th component. The only data available are the z-values $z(\mathbf{u}_\alpha)$, which include the contributions of all $(L + 1)$ components.

Each spatial component $Z^l(\mathbf{u})$ is defined as a RF with zero mean, hence the expected value of its estimation error is

    $E\{Z_{OK}^{l*}(\mathbf{u}) - Z^l(\mathbf{u})\} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_{\alpha l}^{OK}(\mathbf{u})\, m(\mathbf{u})$

with $m(\mathbf{u})$ constant but unknown within the search neighborhood. The non-bias condition is then satisfied by forcing the kriging weights $\lambda_{\alpha l}^{OK}(\mathbf{u})$ to sum to zero.

Because the $(L + 1)$ spatial components $Z^l(\mathbf{u})$ are defined as mutually independent, the cross covariance between the RV $Z(\mathbf{u}_\alpha)$ and the spatial component $Z^l(\mathbf{u})$ reverts to the basic covariance $C_l(\mathbf{u}_\alpha - \mathbf{u})$:

    $\text{Cov}\{Z(\mathbf{u}_\alpha), Z^l(\mathbf{u})\} = \sum_{l'=0}^{L} \text{Cov}\{Z^{l'}(\mathbf{u}_\alpha), Z^l(\mathbf{u})\} = C_l(\mathbf{u}_\alpha - \mathbf{u})$

The linear model (5.50) is convenient because it allows us to infer the cross covariance between z-data and unsampled spatial components from the sole Z-covariance model.

Minimizing the estimation variance (5.52) under the constraint of unbiasedness leads to the following system of $(n(\mathbf{u}) + 1)$ equations:

    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta l}^{OK}(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu_l^{OK}(\mathbf{u}) = C_l(\mathbf{u}_\alpha - \mathbf{u}), \quad \alpha = 1, \ldots, n(\mathbf{u})$
    $\sum_{\beta=1}^{n(\mathbf{u})} \lambda_{\beta l}^{OK}(\mathbf{u}) = 0$    (5.53)

where $\mu_l^{OK}(\mathbf{u})$ is the Lagrange parameter that accounts for the non-bias constraint.

Matrix notation

In matrix notation, the ordinary kriging system (5.53) is written as

    $\mathbf{K}_{OK}\, \boldsymbol{\lambda}_{OK}^{l}(\mathbf{u}) = \mathbf{k}_{OK}^{l}$    (5.54)

The kriging weights are obtained by multiplying the inverse of the covariance matrix $\mathbf{K}_{OK}$ by the vector $\mathbf{k}_{OK}^{l}$: $\boldsymbol{\lambda}_{OK}^{l}(\mathbf{u}) = \mathbf{K}_{OK}^{-1}\, \mathbf{k}_{OK}^{l}$.

When all separation distances $|\mathbf{u}_\alpha - \mathbf{u}|$ between the location $\mathbf{u}$ being estimated and data locations are larger than the correlation range $a_l$ of the component $Z^l(\mathbf{u})$, all right-hand-side covariance terms $C_l(\mathbf{u}_\alpha - \mathbf{u})$ in system (5.53) vanish; the vector $\mathbf{k}_{OK}^{l}$ is then a null vector, and so is the vector $\boldsymbol{\lambda}_{OK}^{l}(\mathbf{u})$ of kriging weights: $\boldsymbol{\lambda}_{OK}^{l}(\mathbf{u}) = \mathbf{K}_{OK}^{-1} \cdot \mathbf{0} = \mathbf{0}$. Thus, the kriging estimate of a spatial component $Z^l(\mathbf{u})$ at a remote location $\mathbf{u}$ is equal to its zero mean.

Using the matrix formulation (5.54), one can see that the decomposition (5.50) of the RV $Z(\mathbf{u})$ into spatial and trend components holds true in terms of kriging estimates, say, OK estimates:

    $z^*_{OK}(\mathbf{u}) = \sum_{l=0}^{L} z_{OK}^{l*}(\mathbf{u}) + m^*_{OK}(\mathbf{u})$    (5.55)

where all estimates are based on the same $n(\mathbf{u})$ data $z(\mathbf{u}_\alpha)$. Indeed, the right-hand-side vector of the OK system (5.18) is but the sum of the right-hand-side
162 CIIAI'TEI1 5. ACCOUNTING FOR A SlNGLlC A7'1'1111jU1'1.:

vectors of OK systems (5.22) i~tld(5.53) for spatial and I.rei10 cor~ll~oncnt,~:

I OK Cd estimates
wlrich entails tile following equality between leR-lra~td-sidetertns:

~ n a t r i xKc,, is U I ~ I I I I I O I Ito all ordi11;iry krigiilg systcms,


Since tllc covari;r~~ce
one deduces the following relation betweer~ kriging weighls and Lagrange 0.0 05 1.0 1.5
parameters: Distance h Cm)

Mnltiplyit~gboth sides of eqrtality (5.56) by the vcctor of &itit, olle obtains


relation (5.55) hctween kriging estimates.

Example
Figure 5.10 shows the decomposition of the OK estimates of Cd concentrations along the NE-SW transect (right top graph) into four components: a nugget component, short- and long-range components, and a trend component. Note the following:
• The short-range and long-range components are zero whenever the closest datum location $\mathbf{u}_\alpha$ is more than 200 m or 1.3 km away, respectively.
• The nugget component, which corresponds to a zero range, is zero at any unsampled location.
• The decomposition allows one to distinguish the trend increase of Cd concentrations along the transect from variations in either short- or long-range components.
Factorial kriging depends wholly on the somewhat arbitrary choice of the nested semivariogram model (5.47), hence remedial measures should not be decided solely on the basis of the maps of spatial components. Rather, these maps may draw attention to subareas that depart substantially from their regional background. For example, although similar Cd concentrations were measured at $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ along the transect shown in Figure 5.10 (right top graph), the first measurement is in a low-valued subarea, whereas the two other concentrations belong to a high-valued subarea. This departure from the regional background is reflected in the profile of the short-range component (Figure 5.10, right middle graph).
System (5.58) is identical to the OK system (5.18) except for the right-hand-side covariance terms, which are computed by subtracting from the $Z$-covariance $C(\mathbf{u}_\alpha - \mathbf{u})$ the covariances $C_l(\mathbf{u}_\alpha - \mathbf{u})$ of the $l_0$ spatial components being filtered. Two particular cases of the estimator (5.57) are:

1. No spatial component is filtered ($l_0 = 0$).
   The right-hand-side covariance terms are equal to the $Z$-covariances $C(\mathbf{u}_\alpha - \mathbf{u})$, hence the estimator $W^*_{OK}(\mathbf{u})$ reverts to the OK estimator of the z-attribute value, $Z^*_{OK}(\mathbf{u})$.

2. All spatial components are filtered ($l_0 = L + 1$).
   The right-hand-side covariance terms are all equal to zero, the system (5.58) reverts to the OK system (5.22), and the estimator $W^*_{OK}(\mathbf{u})$ is equal to the OK estimator of the local mean, $m^*_{OK}(\mathbf{u})$.

As shown in Figure 5.10, the estimate of the nugget component is zero wherever the location $\mathbf{u}$ does not coincide with a datum location $\mathbf{u}_\alpha$. In such cases, both estimates $z^*_{OK}(\mathbf{u})$ and $w^*_{OK}(\mathbf{u})$ are identical.

Filtering properties of the kriging estimator are further discussed in section 5.7, where the dual kriging formalism is introduced.

Because it relies on the linear decomposition (5.50), the noise filtering performed by factorial kriging regards that noise as an independent component added to the underlying signal. Another assumption is that the noise is "homoscedastic", that is, the noise variance is deemed constant over the whole range of variation of signal values. Whenever noise relates to measurement errors, as in contamination of sampling procedures, it may not be independent of the signal value (heteroscedastic noise). In environmental applications, small concentrations are typically overestimated, whereas large concentrations are underestimated. Furthermore, the error variance generally increases with the measured value.

When the sampling is exhaustive, for example, in image processing, Bourgault (1994) showed that factorial kriging succeeds in filtering the noise even if it is correlated with the signal and is heteroscedastic. Where data are sparse, a signal contaminated with heteroscedastic noise may be difficult to extract, especially in the presence of noise-to-signal correlation.

Example
Metal concentrations are decomposed into spatial components on the basis of the linear models fitted to the experimental semivariograms in Figures 4.13 and 4.14. For each metal, the spatial component associated with the spherical model of long range (1.3 km) is added to the trend component that may also be viewed as a long-range (regional) component. The maps of the resulting regional components are shown in Figures 5.12 and 5.13. Note the following:
• Regional components of copper and lead are fairly similar.
• Nickel and cobalt show common regional features linked to the spatial distribution of rock types (Figure 5.13, top graph). These two maps emphasize the impact of geology on the regional variation of metal concentrations, in particular the occurrence of the smallest concentrations on Argovian rocks.

Figure 5.11: OK estimates of Cd concentrations before and after filtering the nugget component.

Figure 5.12: Maps of regional components for copper and lead.
• Like the pair nickel-cobalt, small values of Cd and Zn regional components are confined to Argovian rocks. On the other hand, the spatial pattern of large Zn values is close to that displayed by the pair copper-lead. As already revealed by the structural analysis in section 2.3.5, the spatial distribution of zinc values shares the spatial characteristics of the pair copper-lead, as well as those of the group of other heavy metals.

Figure 5.13: Spatial distribution of rock types over the study area and maps of regional components of four metals.

5.7 Dual Kriging

Dual kriging is just another presentation of kriging whereby the estimates are expressed as linear combinations of covariance values instead of data values (Dubrule, 1983; Journel, 1989). The dual kriging formalism provides insight into the filtering properties of kriging and reduces the computational cost of kriging when used with a global search neighborhood.

Dual simple kriging
Consider the simple kriging estimate of z at location $\mathbf{u}$:

$$z^*_{SK}(\mathbf{u}) - m = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{SK}(\mathbf{u})\,[z(\mathbf{u}_\alpha) - m] \qquad (5.59)$$

The SK weights $\lambda_\alpha^{SK}(\mathbf{u})$ are obtained by solving the system of $n(\mathbf{u})$ linear equations derived in section 5.2 and recalled here:

$$\sum_{\beta=1}^{n(\mathbf{u})} \lambda_\beta^{SK}(\mathbf{u})\,C(\mathbf{u}_\alpha - \mathbf{u}_\beta) = C(\mathbf{u}_\alpha - \mathbf{u}), \qquad \alpha = 1, \ldots, n(\mathbf{u}) \qquad (5.60)$$

or in matrix form $\boldsymbol{\lambda}_{SK}(\mathbf{u}) = \mathbf{K}_{SK}^{-1}\,\mathbf{k}_{SK}$. The kriging weights appear as linear functions of the right-hand-side covariance values $C(\mathbf{u}_\alpha - \mathbf{u})$. Consequently, the SK estimate (5.59) can be expressed as a linear combination of these covariance values plus the stationary mean $m$:

$$z^*_{SK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} d_\alpha^{SK}(\mathbf{u})\,C(\mathbf{u}_\alpha - \mathbf{u}) + m \qquad (5.61)$$

where $d_\alpha^{SK}(\mathbf{u})$ is the dual weight associated with the covariance $C(\mathbf{u}_\alpha - \mathbf{u})$. Expression (5.61) is called the dual form of the traditional or primal simple kriging estimate (5.59). The dual weights $d_\alpha^{SK}(\mathbf{u})$ are obtained by identification of the data values by the kriging expression (5.61):

$$\sum_{\beta=1}^{n(\mathbf{u})} d_\beta^{SK}(\mathbf{u})\,C(\mathbf{u}_\alpha - \mathbf{u}_\beta) + m = z(\mathbf{u}_\alpha), \qquad \alpha = 1, \ldots, n(\mathbf{u}) \qquad (5.62)$$

In contrast with the primal SK system (5.60), the dual system (5.62) does not result from minimizing an error variance; rather, it is established from the exactitude property of the kriging estimator.

Like the notation $\lambda_\alpha^{SK}(\mathbf{u})$ used for the primal weights in system (5.60), the dependence on $\mathbf{u}$ of the dual weights $d_\alpha^{SK}(\mathbf{u})$ refers to the fact that the $n(\mathbf{u})$ data retained may vary from one location $\mathbf{u}$ to another (see related discussion in section 5.1). For a given data configuration, the dual weights $d_\alpha^{SK}(\mathbf{u})$ and $d_\alpha^{SK}(\mathbf{u}')$ are the same for all $\alpha$.
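The equivalence between the primal and dual forms of simple kriging can be illustrated with a short sketch. Everything specific here is an assumption made for the illustration: a one-dimensional transect, a unit-sill spherical covariance with a 1 km range, made-up data, a known mean, and a global neighborhood (all data retained everywhere). The function names `sk_dual` and `sk_primal` are hypothetical.

```python
import numpy as np

def sph_cov(h, a=1.0, sill=1.0):
    """Spherical covariance with range a and the given sill."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.0 - 1.5 * r + 0.5 * r**3)

x = np.array([0.0, 0.4, 0.9, 1.5])     # data locations (km), made up
z = np.array([2.0, 3.1, 1.4, 2.6])     # data values, made up
m = 2.0                                 # stationary mean, assumed known

K = sph_cov(x[:, None] - x[None, :])    # data-to-data covariances

# Dual weights: a single solve for the whole (global) neighborhood, K d = z - m.
d = np.linalg.solve(K, z - m)

def sk_dual(u):
    """Dual form: linear combination of covariance values plus the mean."""
    return d @ sph_cov(x - u) + m

def sk_primal(u):
    """Primal form: linear combination of residual data values plus the mean."""
    lam = np.linalg.solve(K, sph_cov(x - u))
    return lam @ (z - m) + m

targets = np.linspace(-0.5, 2.0, 11)
dual = np.array([sk_dual(u) for u in targets])
primal = np.array([sk_primal(u) for u in targets])
```

With a global neighborhood the dual weights are computed once, whereas the primal form requires one solve per location; both give the same estimates. The dual estimate honors the data at data locations and reverts to the mean `m` once all covariance terms vanish.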
Accounting for the dual form (5.68), one can rewrite relation (5.55) between OK estimates as

$$z^*_{OK}(\mathbf{u}) = \sum_{l=0}^{L} \left[ \sum_{\alpha=1}^{n(\mathbf{u})} d_\alpha^{OK}(\mathbf{u})\,C_l(\mathbf{u}_\alpha - \mathbf{u}) \right] + m^*_{OK}(\mathbf{u}) \qquad (5.69)$$

Expression (5.69) allows a straightforward filtering of any spatial component $z^{l*}_{OK}(\mathbf{u})$ from the kriging estimate $z^*_{OK}(\mathbf{u})$ by setting the corresponding covariance terms $C_l(\mathbf{u}_\alpha - \mathbf{u})$ to zero. For example, the nugget component $z^{0*}_{OK}(\mathbf{u})$ is filtered at all data and unsampled locations by subtracting the nugget effect $C_0(\mathbf{u}_\alpha - \mathbf{u})$ from all covariance values $C(\mathbf{u}_\alpha - \mathbf{u})$. As shown for the nugget component in Figure 5.11, spatial components are implicitly filtered in the usual ordinary kriging of z. Such filtering is, however, non-systematic: the component that is actually filtered depends on the local data configuration. For example, the nugget component $z^{0*}_{OK}(\mathbf{u})$ is filtered only at unsampled locations. The key factor is the separation distance, say, $d_{min}$, between the location $\mathbf{u}$ being estimated and the closest data location. Whenever the distance $d_{min}$ is larger than the correlation range of a spatial component, say, $z^l(\mathbf{u})$, the corresponding covariance terms $C_l(\mathbf{u}_\alpha - \mathbf{u})$ vanish in the dual estimate (5.69); hence the spatial component $z^{l*}_{OK}(\mathbf{u})$ is filtered.

Consider, for example, the ordinary kriging estimation of Cd concentrations along the transect in Figure 5.14. The dual OK estimate is written

$$z^*_{OK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} d_\alpha^{OK}(\mathbf{u})\,[C_0(\mathbf{u}_\alpha - \mathbf{u}) + C_1(\mathbf{u}_\alpha - \mathbf{u}) + C_2(\mathbf{u}_\alpha - \mathbf{u})] + m^*_{OK}(\mathbf{u}) \qquad (5.70)$$

where $C_0(\mathbf{h})$ is a nugget effect model, and $C_1(\mathbf{h})$ and $C_2(\mathbf{h})$ are spherical covariance models with ranges of 200 m and 1.3 km, respectively.

Figure 5.14: Impact of data configuration on filtering properties of kriging. As the location being estimated gets farther away from data locations depicted by black dots, high-frequency (short-range) components are progressively filtered, making the apparent spatial variability progressively smaller.

Four different situations can be distinguished in Figure 5.14, depending on the position of the location $\mathbf{u}$ being estimated relative to data locations:

1. Location $\mathbf{u}_1$: $d_{min} = 0$
   No component is filtered, and the datum is exactly honored: $z^*_{OK}(\mathbf{u}_1) = z(\mathbf{u}_1)$.

2. Location $\mathbf{u}_2$: $0 < d_{min} \le 200$ m
   All covariance terms $C_0(\mathbf{u}_\alpha - \mathbf{u})$ are zero, so only the nugget component is filtered from the estimate (5.70):
   $z^*_{OK}(\mathbf{u}_2) = \sum_{\alpha} d_\alpha^{OK}(\mathbf{u}_2)\,[C_1(\mathbf{u}_\alpha - \mathbf{u}_2) + C_2(\mathbf{u}_\alpha - \mathbf{u}_2)] + m^*_{OK}(\mathbf{u}_2)$

3. Location $\mathbf{u}_3$: 200 m $< d_{min} <$ 1.3 km
   All covariance terms $C_0(\mathbf{u}_\alpha - \mathbf{u})$ and $C_1(\mathbf{u}_\alpha - \mathbf{u})$ are zero, hence both nugget and short-range components are filtered from the estimate (5.70):
   $z^*_{OK}(\mathbf{u}_3) = \sum_{\alpha} d_\alpha^{OK}(\mathbf{u}_3)\,C_2(\mathbf{u}_\alpha - \mathbf{u}_3) + m^*_{OK}(\mathbf{u}_3)$

4. Location $\mathbf{u}_4$: $d_{min} > 1.3$ km
   All covariance terms $C_l(\mathbf{u}_\alpha - \mathbf{u})$, $l = 0, 1, 2$, are zero, hence all spatial components are filtered and the estimate (5.70) includes only the trend component:
   $z^*_{OK}(\mathbf{u}_4) = m^*_{OK}(\mathbf{u}_4)$

Thus the kriging estimator is a variable low-pass filter: the high-frequency (nugget, short-range) components are progressively filtered as the location $\mathbf{u}$ being estimated gets farther away from data locations. Such variable filtering creates artifact discontinuities near data locations. These discontinuities are particularly noticeable when most of the spatial variation occurs over short distances. For example, Figure 5.15 shows OK estimates of Cd concentrations using three arbitrary semivariogram models of type:

$$\gamma(\mathbf{h}) = b^0\,g_0(\mathbf{h}) + (1 - b^0)\,\mathrm{Sph}(|\mathbf{h}| / 2\ \mathrm{km})$$

where $b^0$ is the proportion of the nugget effect model $g_0(\mathbf{h})$. For a zero relative nugget effect ($b^0 = 0$), the interpolation curve (solid line) varies smoothly without apparent discontinuity. Discontinuities appear as soon as a small nugget effect, say, $b^0 = 0.2$, is present (large dashed line), and they become increasingly important as the relative nugget effect increases.
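The variable low-pass behavior described above can be reproduced on a toy transect. The following is a minimal sketch, not the Cd case study: the data locations and values are invented, the covariance is a unit-sill model with a relative nugget `b0` and a spherical structure of range 2 km (as in Figure 5.15), and the dual OK system is written under the usual assumption that the dual weights sum to zero, so that the constant term `d0` is the OK estimate of the mean.

```python
import numpy as np

def sph(h, a):
    """Spherical correlation with range a (unit sill)."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.0 - 1.5 * r + 0.5 * r**3

def make_cov(b0, a=2.0):
    """Unit-sill covariance: proportion b0 of nugget, rest spherical with range a (km)."""
    return lambda h: b0 * (np.abs(h) == 0) + (1.0 - b0) * sph(h, a)

def dual_ok(x, z, cov):
    """Return the dual weights d and the constant d0 (OK estimate of the mean)."""
    n = len(x)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(x[:, None] - x[None, :])
    A[n, n] = 0.0
    sol = np.linalg.solve(A, np.append(z, 0.0))
    return sol[:n], sol[n]

x = np.array([1.0, 1.8, 2.5, 4.0])     # data locations (km), made up
z = np.array([0.6, 2.3, 1.1, 1.5])     # Cd-like values (ppm), made up

def estimate(u, b0):
    """Dual OK estimate at u using all data (global neighborhood)."""
    cov = make_cov(b0)
    d, d0 = dual_ok(x, z, cov)
    return d @ cov(x - u) + d0

at_datum = estimate(1.8, b0=0.5)           # d_min = 0: datum honored
just_off = estimate(1.8 + 1e-6, b0=0.5)    # nugget component filtered
no_nugget = estimate(1.8 + 1e-6, b0=0.0)   # no nugget: no discontinuity
far_away = estimate(10.0, b0=0.5)          # beyond the range: trend only
_, mean_ok = dual_ok(x, z, make_cov(0.5))
```

With a 50% relative nugget the estimate jumps as soon as the location leaves the datum (`just_off` differs markedly from `at_datum`), whereas with a zero nugget the profile stays continuous. Far from all data the estimate reduces to the estimated mean `mean_ok`.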
Figure 5.15: OK estimates of Cd concentrations using three spherical semivariogram models (range = 2 km) with increasing relative nugget effect.

Because of its variable (location-dependent) filtering property, the appearance (smoothness) of kriging profiles or maps depends on the local data configuration. For irregularly spaced data, the interpolated profile (map) is more variable where sampling is dense than where it is sparse. Such an effect may create structures that are pure artifacts of the data configuration. One solution consists of utilizing simulation algorithms, introduced in Chapter 8, which, as opposed to kriging algorithms, reproduce the full covariance everywhere.

5.8 Miscellaneous Aspects of Kriging

5.8.1 Kriging weights

Using the matrix formulation of the ordinary kriging system (5.18), the vector of ordinary kriging weights is computed as

$$\boldsymbol{\lambda}_{OK}(\mathbf{u}) = \mathbf{K}_{OK}^{-1}\,\mathbf{k}_{OK}$$

The kriging weighting system accounts for:

1. proximity of data to the location $\mathbf{u}$ being estimated through the covariance terms $C(\mathbf{u}_\alpha - \mathbf{u})$, and
2. data redundancy through the data covariance matrix $[C(\mathbf{u}_\alpha - \mathbf{u}_\beta)]$.

Instead of the Euclidean distance $|\mathbf{u}_\alpha - \mathbf{u}|$ common to all variables, the distance used in kriging is the semivariogram distance $\gamma(\mathbf{u}_\alpha - \mathbf{u})$, as modeled from the data and specific to the variable under study. The kriging weights depend only on the shape (relative nugget effect, anisotropy, correlation range) of the semivariogram, not on its global sill or any factor multiplying the semivariogram or covariance model.

Consider, for example, the ordinary kriging estimation of z at $\mathbf{u}_0$, using data at the five locations shown in Figure 5.16 (left graph). The ordinary kriging system (5.18) is

$$\begin{bmatrix} C_{11} & \cdots & C_{15} & 1 \\ \vdots & & \vdots & \vdots \\ C_{51} & \cdots & C_{55} & 1 \\ 1 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} \lambda_1 \\ \vdots \\ \lambda_5 \\ \mu \end{bmatrix} = \begin{bmatrix} C_{10} \\ \vdots \\ C_{50} \\ 1 \end{bmatrix} \qquad (5.71)$$

where $C_{\alpha\beta}$ denotes the data-to-data covariance $C(\mathbf{u}_\alpha - \mathbf{u}_\beta)$, and $C_{\alpha 0}$ denotes the data-to-unknown covariance $C(\mathbf{u}_\alpha - \mathbf{u}_0)$. Let the semivariogram model be an isotropic spherical model with zero nugget effect, unit sill, and a 1 km range. The corresponding covariance model is

$$C(\mathbf{h}) = \begin{cases} 1 - 1.5\,|\mathbf{h}| + 0.5\,|\mathbf{h}|^3 & \text{if } |\mathbf{h}| \le 1 \text{ km} \\ 0 & \text{otherwise} \end{cases}$$

Given that covariance model and the data configuration shown in Figure 5.16 (left graph), the kriging system (5.71) becomes the numerical system (5.72). The large off-diagonal covariance value $C(\mathbf{u}_2 - \mathbf{u}_3) = 0.85$ informs the system on the redundancy of the two data at clustered locations $\mathbf{u}_2$ and $\mathbf{u}_3$. On the other hand, the zero covariance values in the right-hand-side vector inform the system that the point $\mathbf{u}_0$ to be estimated is beyond the correlation range of the two data at locations $\mathbf{u}_4$ and $\mathbf{u}_5$.

The ordinary kriging weights are obtained by solving system (5.72).
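The role of redundancy and the insensitivity of the weights to the sill can be illustrated with a five-datum layout in the spirit of Figure 5.16 (left graph). The coordinates below are hypothetical, chosen only so that three data lie 0.5 km from $\mathbf{u}_0$ (two of them clustered) and two data lie beyond the 1 km range; they are not the coordinates of the figure, and the function `ok_weights` is an illustrative helper.

```python
import numpy as np

def sph_cov(h, a=1.0, sill=1.0):
    """Isotropic spherical covariance (zero nugget) with range a (km)."""
    r = np.minimum(h / a, 1.0)
    return sill * (1.0 - 1.5 * r + 0.5 * r**3)

def ok_weights(data_xy, u0, cov):
    """Solve the ordinary kriging system; return (weights, Lagrange parameter)."""
    n = len(data_xy)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(np.linalg.norm(data_xy[:, None] - data_xy[None, :], axis=-1))
    A[n, n] = 0.0
    b = np.append(cov(np.linalg.norm(data_xy - u0, axis=-1)), 1.0)
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]

# Hypothetical coordinates (km): u1, u2, u3 are all 0.5 km from u0, with u2 and
# u3 clustered; u4 and u5 lie beyond the 1 km correlation range.
u0 = np.array([0.0, 0.0])
data_xy = np.array([[0.5, 0.0], [-0.4, 0.3], [-0.3, 0.4], [0.0, 1.6], [0.0, -1.6]])

w, mu = ok_weights(data_xy, u0, sph_cov)
w_scaled, _ = ok_weights(data_xy, u0, lambda h: sph_cov(h, sill=7.5))
```

In this layout the datum at `u2` receives less weight than the equally distant datum at `u1` because of its redundancy with `u3`, the remote data still receive a non-zero weight, and multiplying the covariance model by an arbitrary factor (7.5 here) leaves the weights unchanged.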
Figure 5.16: Two two-dimensional data configurations with different positions of $\mathbf{u}_3$. In both cases, the attribute value at location $\mathbf{u}_0$ is estimated using ordinary kriging and five data at locations $\mathbf{u}_1$ to $\mathbf{u}_5$. The radius of the dashed circle centered on $\mathbf{u}_0$ corresponds to the correlation range (1 km).

As expected, the kriging weight decreases as the datum location gets farther from $\mathbf{u}_0$. Note the following:

1. Both data at locations $\mathbf{u}_1$ and $\mathbf{u}_2$ are the same distance from $\mathbf{u}_0$ (same right-hand-side covariance terms), yet the latter receives less weight because of its redundancy with the third datum at $\mathbf{u}_3$.
2. Data at locations $\mathbf{u}_4$ and $\mathbf{u}_5$ get a non-zero weight, although they are beyond the correlation range of data. This non-zero weight is due to their contribution to the estimation of the trend component at location $\mathbf{u}_0$. This trend estimation is implicit to ordinary kriging (see related discussion in section 5.3).

Consider a second data configuration where the location $\mathbf{u}_3$ falls between the locations $\mathbf{u}_0$ and $\mathbf{u}_2$ (Figure 5.16, right graph). With the new ordinary kriging weights, note the small negative weight given to the datum at $\mathbf{u}_2$. Negative weights typically occur when the influence of a specific datum is screened by that of a closer one. In Figure 5.16 (right graph), the location $\mathbf{u}_3$ screens the influence of $\mathbf{u}_2$ for estimation at $\mathbf{u}_0$. Negative weights allow the kriging estimate to take values outside the range of the data. Although this non-convexity of the estimator is generally a desirable property, it may yield unacceptable results, such as negative concentrations or estimated proportions larger than 1. There are four ways to deal with non-convexity problems:

1. Force all kriging weights to be positive; see Barnes and Johnson (1984), Xu (1991).

2. Add to all the weights a constant equal to the modulus of the largest negative weight, then reset the weights to sum to 1; see Journel and Rao (1996).

3. Reset any faulty estimate to the nearest bound, say, 0, if negative values are not acceptable, or 1 for excessive proportions; see Mallet (1980).

4. Impose constraints on the kriging estimates rather than on the kriging weights through the use of indicator constraint intervals; see section 7.1.2 and Journel (1986b).

For a particular data configuration, kriging weights may drastically change, depending on the semivariogram model. For the two data configurations of Figure 5.16, Figure 5.17 shows the evolution of the kriging weights as the relative nugget effect increases. A larger nugget effect reduces the impact of distance of data locations to $\mathbf{u}_0$ (left graph); it also reduces the screening effect of location $\mathbf{u}_3$ on $\mathbf{u}_2$ (right graph). In the presence of a pure nugget effect, all weights are equal to $1/n(\mathbf{u}) = 0.2$; the kriging estimate then reverts to the arithmetic average of the data retained. The impact of semivariogram parameters (shape, relative nugget effect, anisotropy) on kriging weights is cogently discussed in Isaaks and Srivastava (1989, p. 296-313).

Figure 5.17: Impact of the relative nugget effect on the OK weights for the two two-dimensional data configurations of Figure 5.16. All weights get close to each other as the relative nugget effect increases.
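The screening effect, the pure nugget limit, and the second correction listed above (shift all weights, then rescale) can be sketched as follows. The layout is hypothetical: the collinear coordinates are chosen so that $\mathbf{u}_3$ sits between $\mathbf{u}_0$ and $\mathbf{u}_2$, and they are not those of Figure 5.16. The function `shift_and_rescale` is one plain reading of that correction, not necessarily the exact procedure of the cited reference.

```python
import numpy as np

def cov_model(h, b0=0.0, a=1.0):
    """Unit-sill covariance: relative nugget b0 plus a spherical structure of range a."""
    r = np.minimum(h / a, 1.0)
    return b0 * (h == 0) + (1.0 - b0) * (1.0 - 1.5 * r + 0.5 * r**3)

def ok_weights(data_xy, u0, b0=0.0):
    """Ordinary kriging weights for the given relative nugget effect."""
    n = len(data_xy)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov_model(np.linalg.norm(data_xy[:, None] - data_xy[None, :], axis=-1), b0)
    A[n, n] = 0.0
    b = np.append(cov_model(np.linalg.norm(data_xy - u0, axis=-1), b0), 1.0)
    return np.linalg.solve(A, b)[:n]

def shift_and_rescale(w):
    """Add the modulus of the largest negative weight to all weights, then renormalize."""
    shift = max(0.0, -w.min())
    w = w + shift
    return w / w.sum()

u0 = np.array([0.0, 0.0])
# Hypothetical layout (km), rows are u1..u5: u3 lies between u0 and u2.
data_xy = np.array([[0.4, 0.0], [-0.5, 0.0], [-0.4, 0.0], [0.0, 1.5], [0.0, -1.5]])

w = ok_weights(data_xy, u0)                  # zero nugget: u2 is screened by u3
w_nugget = ok_weights(data_xy, u0, b0=1.0)   # pure nugget effect
w_fixed = shift_and_rescale(w)               # non-negative weights summing to 1
```

In this layout the screened datum at `u2` (second row) gets a negative weight, a pure nugget effect yields equal weights of 0.2, and the corrected weights are non-negative while still summing to 1. The correction changes every weight, so the resulting estimate is no longer the minimum-variance one.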
5.8.2 Search neighborhood

As mentioned in the introduction to the linear estimator (5.1), common practice consists of using only the data closest to the location $\mathbf{u}$ being estimated, i.e., the data within a given neighborhood $W(\mathbf{u})$ centered on the location $\mathbf{u}$ to be estimated. There are several reasons for such a restriction:
• The covariance values for large distances are usually unreliable because of the few data pairs available for inference at such large distances.
• Using local search neighborhoods with ordinary kriging allows one to account for local departures from stationarity over the area A.
• The closest data tend to screen the influence of those farther away; see previous discussion.
• The size of the kriging system, hence the computation time, drastically increases with the number of data retained, approximately in proportion to $[n(\mathbf{u})]^3$.

The search neighborhood is typically taken as a circle centered on the location being estimated. When the variation is anisotropic, it is better to consider an ellipse with its major axis oriented along the direction of maximum continuity: this provides more relevant data. To reduce the influence of data clusters (if any), it is good practice to split the search neighborhood into equal angle sectors, say, quadrants in two dimensions or octants in three dimensions, and retain within each sector a specified number of nearest data. When data are gridded and the estimation grid is aligned with the sampling grid, quadrant and octant searches may cause artifact discontinuities in the estimated maps because of sudden changes in the data retained for estimation from one location to the next.

One should avoid limiting a priori the maximum search distance to the correlation range of data. As featured in Figure 5.16, data beyond the correlation range contribute to the estimation of the local mean within the search neighborhood. As the relative nugget increases, the screening effect of closer data decreases, hence remote data get more weight (Figure 5.17). In subareas where sampling is sparse, the search distance should be increased to retain enough data. An alternative is to spiral away from the location $\mathbf{u}$ being estimated and to retain the $n(\mathbf{u})$ closest data whatever their distance to $\mathbf{u}$. For data selection, a useful practice consists of using the semivariogram distance $\gamma(\mathbf{u} - \mathbf{u}_\alpha)$ rather than the Euclidean distance $|\mathbf{u} - \mathbf{u}_\alpha|$, so that data are preferentially selected along the direction of maximum continuity.

In addition to size and orientation of the search neighborhood, one must specify the minimum and maximum number of data to use for estimation. The minimum must be equal at least to the number $(K + 1)$ of constraints on kriging weights. For example, the modeling of a quadratic trend in two dimensions requires at least six data values. In practice, 10 data values would be a reasonable minimum. That number should be larger where data are clustered so that one or two of the nearest data do not screen all others. The maximum number $n(\mathbf{u})$ of data values to retain depends on the objective pursued. When one aims at depicting local features of the attribute, that number should be limited, whereas more data and data farther away should be retained to depict long-range structures.

Before running any kriging over an entire area, it is good practice to try several search strategies on a test subarea. Cross validation techniques can then be used to evaluate the impact of different search parameters on the interpolation results. As mentioned in section 4.2.4, beware that the search strategy that produces the best cross-validated results may not yield the best predictions at unsampled locations.

Further discussion on the practice of kriging is available in Journel (1987), and Deutsch and Journel (1992a, section IV.6).

5.8.3 Kriging variance

Kriging provides not only a least-squares estimate of the attribute z but also the attached error variance; for example, for ordinary kriging:

$$\sigma^2_{OK}(\mathbf{u}) = C(0) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha^{OK}(\mathbf{u})\,C(\mathbf{u}_\alpha - \mathbf{u}) - \mu_{OK}(\mathbf{u}) \qquad (5.73)$$

That error variance is:

1. dependent on the covariance model.
   This is an excellent feature: the estimation precision should indeed depend on the complexity of the spatial variability of z as modeled by the covariance.

2. dependent on the data configuration.
   The terms $C(\mathbf{u}_\alpha - \mathbf{u})$ account for the relative geometry of data locations $\mathbf{u}_\alpha$ and their distances to the location $\mathbf{u}$ being estimated. This is also an excellent feature.

3. independent of data values.
   For a given covariance model, two identical data configurations would yield the same kriging variance no matter what the data were.

The latter two features are illustrated in Figure 5.18 (bottom graph), which shows the OK variance along the NE-SW transect. The pattern of variation of the kriging variance is fully controlled by the data configuration. The error variance is zero at data locations, increases away from the data, and reaches a maximum value beyond the extreme right datum (extrapolation situation). Indeed, as the location $\mathbf{u}$ being estimated gets farther away from data locations $\mathbf{u}_\alpha$, both the covariance term $C(\mathbf{u}_\alpha - \mathbf{u})$ and the kriging weight $\lambda_\alpha^{OK}(\mathbf{u})$ decrease, hence the kriging variance (5.73) increases.⁴

⁴ Only the common case of positive weights and positive covariance values is considered here.
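The three properties of the kriging variance listed above lend themselves to a quick numerical check. The sketch assumes a one-dimensional transect, a unit-sill spherical covariance with a 1 km range, and two invented data sets sharing the same locations; the error variance is computed as $C(0) - \sum_\alpha \lambda_\alpha C(\mathbf{u}_\alpha - \mathbf{u}) - \mu$, with the Lagrange parameter entering the OK system with a plus sign.

```python
import numpy as np

def sph_cov(h, a=1.0):
    """Unit-sill spherical covariance with range a (km)."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.0 - 1.5 * r + 0.5 * r**3

def ok_estimate_and_variance(x, z, u, cov=sph_cov):
    """Ordinary kriging at u along a transect: returns (estimate, error variance)."""
    n = len(x)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(x[:, None] - x[None, :])
    A[n, n] = 0.0
    k = cov(x - u)
    sol = np.linalg.solve(A, np.append(k, 1.0))
    lam, mu = sol[:n], sol[n]
    variance = cov(0.0) - lam @ k - mu
    return lam @ z, variance

x = np.array([0.0, 0.5, 1.0, 1.5])            # data locations (km), made up
z_smooth = np.array([1.0, 1.1, 0.9, 1.0])     # consistently small values
z_erratic = np.array([0.2, 4.8, 0.1, 5.3])    # very different neighboring values

est_a, var_a = ok_estimate_and_variance(x, z_smooth, 0.75)
est_b, var_b = ok_estimate_and_variance(x, z_erratic, 0.75)
_, var_at_datum = ok_estimate_and_variance(x, z_smooth, 0.5)
_, var_extrap = ok_estimate_and_variance(x, z_smooth, 2.2)
```

The two data sets give different estimates but exactly the same error variance at the same location, the variance vanishes at a datum location, and it is larger in the extrapolation situation than between data.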
Figure 5.18: OK estimates of Cd concentrations (top graph) and the corresponding error variances (bottom graph). The kriging variance is independent of data values, hence it does not reflect the greater uncertainty that is expected at location $\mathbf{u}_2$, which is surrounded by a very large and a small Cd value, compared to location $\mathbf{u}_1$, which is surrounded by two consistently small Cd values.

Kriging yields similar error variances at any two or more locations with similar local data configurations. Intuitively, however, the potential for error is expected to be greater at a location $\mathbf{u}_2$ surrounded by data that are very different than at $\mathbf{u}_1$ surrounded by similarly valued data (Figure 5.18, top graph).

The problem with the kriging variance is that it is not conditioned to the data values used. The proper measure of local precision would be the conditional estimation variance specific to the data values and defined as

$$\mathrm{Var}\{Z^*(\mathbf{u}) - Z(\mathbf{u}) \mid Z(\mathbf{u}_\alpha) = z(\mathbf{u}_\alpha),\ \alpha = 1, \ldots, n(\mathbf{u})\}$$

where the $z(\mathbf{u}_\alpha)$ are the data values. The kriging variance is but the average over all possible realizations of the $n(\mathbf{u})$ data RVs $Z(\mathbf{u}_\alpha)$. If the configurations of the $n(\mathbf{u}^{(l)})$ data retained at many different places $\mathbf{u}^{(l)}$, $l = 1, \ldots, L$ were the same, then the kriging variance (5.73) would be an estimate of the variance of the $L$ errors $z^*(\mathbf{u}^{(l)}) - z(\mathbf{u}^{(l)})$. Because of the averaging over the $L$ realizations of data values, the kriging variance retains, from the data used, only its geometry, not the data values. In this sense, the kriging variance is only a ranking index of data geometry (and size), not a measure of the local spread of errors; see Journel (1986a), Deutsch and Journel (1992a, p. 15), and Chapter 7.

5.8.4 Re-estimation scores

Figure 5.19 shows the maps of OK estimates for the three metals that most exceed the Swiss guide values (Cd, Cu, and Pb). Note the following:
• Most kriging estimates exceed the critical threshold of 0.8 ppm for cadmium. The two low-valued zones correspond to clusters of small concentrations on Argovian rocks.
• Large estimates of copper concentrations are confined to the southern part of the region.
• Large estimated concentrations in Pb are in the central and south part of the region.

Figure 5.19: Maps of OK estimates. The tolerable maxima are 0.8 ppm for Cd and 50 ppm for Cu and Pb.
At the 100 test locations $\mathbf{u}_t$, both estimated values $z^*_{OK}(\mathbf{u}_t)$ and actual values $z(\mathbf{u}_t)$ are available. Statistics for both sets are given in Table 5.2. Besides the similarity of the means (global unbiasedness), note that the estimated values appear much less variable than actual values. This "smoothing effect" is due to the filtering of high-frequency (short-range) components; smoothing increases with increasing relative nugget effect and with decreasing sampling density. For example, smoothing is less pronounced for cobalt concentrations, the semivariogram of which has a larger range and a smaller relative nugget effect than those of the three other metals; compare the semivariograms of Figures 4.13 and 4.14.

The scattergrams of true $z(\mathbf{u}_t)$ versus estimated values $z^*_{OK}(\mathbf{u}_t)$ carry much more information than the summary statistics of Table 5.2 (Figure 5.20, left graphs). They all reveal an underestimation of large concentrations (values are below the 45° line) and an overestimation of small concentrations (values are above the 45° line). The balancing of these two effects results in the global unbiasedness seen in Table 5.2. Such a bias is called conditional because it depends on the class of values considered. The conditional bias drastically influences the evaluation of the extent of contamination (Table 5.2). The underestimation of large copper concentrations may lead one to declare safe all test locations, whereas 8% of them actually exceed the threshold value. In contrast, the overestimation of small cadmium concentrations may lead one to classify the majority of test locations as contaminated. The same effect, though less pronounced, is observed for lead.

Table 5.2: Statistics for true and estimated concentrations of heavy metals at 100 test locations. Rows: mean, standard deviation, and % contamination for true values and OK estimates; % misclassification for OK estimates.

Figure 5.20: Scattergrams of true values $z(\mathbf{u}_t)$ versus OK estimates $z^*_{OK}(\mathbf{u}_t)$ at 100 test locations $\mathbf{u}_t$ (left graphs). For the same locations, note the lack of relation between the kriging error standard deviation $\sigma_{OK}(\mathbf{u}_t)$ and the absolute estimation error $|z^*_{OK}(\mathbf{u}_t) - z(\mathbf{u}_t)|$ (right graphs).
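Re-estimation scores of the kind summarized in Table 5.2 can be computed with a few lines of code. The sketch below uses invented numbers, not the values of the case study: eight hypothetical "true" concentrations and a smoothed set of "estimates" with the same mean, classified against the 0.8 ppm threshold quoted for Cd. The helper names `rank` and `reestimation_scores` are hypothetical, and the rank computation assumes no tied values.

```python
import numpy as np

def rank(v):
    """Ranks 0..n-1 of the values in v (assumes no ties)."""
    r = np.empty(len(v))
    r[np.argsort(v)] = np.arange(len(v))
    return r

def reestimation_scores(true, est, threshold):
    """Summary statistics comparing estimates with true values at test locations."""
    true, est = np.asarray(true, float), np.asarray(est, float)
    over_true, over_est = true > threshold, est > threshold
    return {
        "mean_true": true.mean(),
        "mean_est": est.mean(),
        "std_true": true.std(),
        "std_est": est.std(),
        "pct_contaminated_true": 100.0 * over_true.mean(),
        "pct_contaminated_est": 100.0 * over_est.mean(),
        "pct_misclassified": 100.0 * (over_true != over_est).mean(),
        "rank_correlation": np.corrcoef(rank(true), rank(est))[0, 1],
    }

# Made-up values (ppm) mimicking a smoothed set of estimates; not the book's data.
true = [0.2, 0.5, 0.7, 0.9, 1.3, 1.6, 2.4, 3.1]
est = [1.0, 1.1, 0.95, 1.2, 1.35, 1.3, 1.7, 2.1]
scores = reestimation_scores(true, est, threshold=0.8)
```

In this toy set the means agree, yet the estimates are far less variable than the true values: small values are overestimated and large ones underestimated. As a result every location is declared contaminated, whereas only 62.5% truly exceed the threshold, giving 37.5% misclassification.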
Conditional bias is a serious problem that relates to the smaller range of variation (smoothing effect) of estimated values relative to true values. As shown in section 5.7, interpolation algorithms are usually low-pass filters: they tend to smooth local spatial variation. The resulting overestimation of small values and underestimation of large ones is unfortunate since the focus of the study is typically on extreme values. Unlike kriging algorithms, simulation algorithms presented later in Chapter 8 allow one to reproduce the full covariance modeling the variability of data. Simulated maps would then correct for the conditional bias of estimated maps.

The rank correlation between actual and estimated values is weak for cadmium, copper, and lead. These poor re-estimation scores result from the short-range variation of the metals, which reduces the information carried by neighboring data. The rank correlation is stronger for cobalt, which varies more continuously in space (Figure 5.20, left bottom graph). In the presence of short-range variation, it becomes critical to account for any secondary information sampled with higher resolution and well correlated with the primary variable, for example, geology, land use, zinc or nickel concentrations. Kriging algorithms integrating such secondary information are presented in Chapter 6.

The error variance provided by kriging is unfortunately often misused as a measure of reliability of the kriging estimate. Figure 5.20 (right column) shows the scattergrams of the absolute error $|z^*_{OK}(\mathbf{u}_t) - z(\mathbf{u}_t)|$ versus the kriging standard deviation $\sigma_{OK}(\mathbf{u}_t)$, with the corresponding rank correlation coefficients. The weakness of the rank correlation confirms that the kriging standard deviation cannot be used as a direct measure of estimation precision.

Chapter 6

Local Estimation: Accounting for Secondary Information

Direct measurements of the primary attribute of interest are often supplemented by secondary information originating from other related categorical or continuous attributes. The estimation generally improves when this additional and usually denser information is taken into consideration, particularly when the primary data are sparse or poorly correlated in space.

Section 6.1 presents three kriging algorithms to incorporate exhaustively sampled secondary data: kriging within strata, simple kriging with varying local means, and kriging with an external drift. Section 6.2 introduces the cokriging algorithms that allow for non-exhaustive secondary information. All algorithms are used to interpolate metal concentrations at the test locations, and re-estimation scores are compared with ordinary kriging results.

6.1 Exhaustive Secondary Information


Consider the situation where primary data $\{z(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$ are supplemented by secondary information exhaustively sampled. "Exhaustively sampled" refers to secondary information that is available at all primary data locations $\mathbf{u}_\alpha$ and at all locations $\mathbf{u}$ being estimated (Figure 6.1). Such information may relate to two types of attributes:

• A categorical attribute $s$ with $K$ mutually exclusive states $s_k$ (e.g., rock types)
• A smoothly varying continuous attribute $y$ (e.g., nickel concentration)

If the secondary information is densely available but not exhaustive, it is a reasonable approximation to complete it by interpolation or, better yet, by simulation; see Almeida and Journel (1994). In the example of Figure 6.1, Ni concentrations were estimated in non-overlapping segments (blocks) of length 50 m so as to mimic a smoothly varying and exhaustively sampled variable.

In this section, three variants of the kriging paradigm are developed to incorporate exhaustive secondary data in the estimation of the primary attribute $z$:

• Kriging within strata amounts to first stratifying the study area based on the secondary information, then estimating the primary attribute within each specific stratum using the corresponding primary data and covariance model.
• Simple kriging with varying local means and kriging with an external drift use secondary data to characterize the spatial trend of the primary attribute.

6.1.1 Kriging within strata


Exploratory data analysis may have revealed significant differences in the average value and in the pattern of spatial continuity of the primary attribute $z$ across the study area. In such a case, the original decision of stationarity should be reconsidered and the study area divided into more homogeneous regions (strata). The primary data should then be treated within each stratum as a separate population. That stratification is largely conditioned by the availability, within each stratum, of enough primary data to infer the necessary covariance. Moreover, one should be able to allocate any location $\mathbf{u}$ being estimated to a specific population. Where major scales of spatial variation are related to changes in land use, soil type, or lithology, secondary categorical information, such as land use, soil, or geologic maps, can be used to stratify the study area (e.g., see Stein et al., 1988; Voltz and Webster, 1990; Van Meirvenne et al., 1994).

Figure 6.1: Ten primary Cd data and the exhaustively sampled secondary information: rock types and block OK estimates of Ni concentrations. (Rock types: $s_1$ Argovian, $s_2$ Kimmeridgian, $s_3$ Sequanian, $s_4$ Portlandian, $s_5$ Quaternary; horizontal axis: distance in km.)

Let the secondary information take the form of a categorical map of attribute $s$ over the study area $A$. Kriging within strata (KWS) proceeds in three steps:

1. The study area is first stratified according to the boundaries of the categorical map. The $k$th stratum $A_k$ is defined as the set of locations $\mathbf{u}$ that belong to the category $s_k$, $A_k = \{\mathbf{u} \in A, \text{ s.t. } s(\mathbf{u}) = s_k\}$.

2. The experimental semivariogram of $z$, $\hat{\gamma}(\mathbf{h}; s_k)$, is computed within each stratum as

$$\hat{\gamma}(\mathbf{h}; s_k) = \frac{1}{2N(\mathbf{h}; s_k)} \sum_{\alpha=1}^{N(\mathbf{h}; s_k)} [z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h})]^2$$

where $N(\mathbf{h}; s_k)$ is the number of pairs of primary data locations $\mathbf{u}_\alpha$, a vector $\mathbf{h}$ apart, that jointly belong to the $k$th stratum.

3. The value of $z$ at each location $\mathbf{u}$ in the $k$th stratum is estimated using the semivariogram model $\gamma(\mathbf{h}; s_k)$ and the closest primary data $z(\mathbf{u}_\alpha)$ within that stratum.

Most often, data sparsity prevents computing reliable estimates $\hat{\gamma}(\mathbf{h}; s_k)$ within each geologic or soil category. One solution consists of combining the $K$ experimental semivariograms $\hat{\gamma}(\mathbf{h}; s_k)$ into a single pooled within-stratum semivariogram $\hat{\gamma}_{ws}(\mathbf{h})$:

$$\hat{\gamma}_{ws}(\mathbf{h}) = \frac{\sum_{k=1}^{K} N(\mathbf{h}; s_k)\, \hat{\gamma}(\mathbf{h}; s_k)}{\sum_{k=1}^{K} N(\mathbf{h}; s_k)}$$

where each semivariogram value $\hat{\gamma}(\mathbf{h}; s_k)$ is weighted by the number of data pairs $N(\mathbf{h}; s_k)$ used for its estimation.

If the variance of $z$-values changes significantly from one stratum to another, each semivariogram $\hat{\gamma}(\mathbf{h}; s_k)$ could be scaled by the variance of the $z$-values that were used to build it:

$$\hat{\gamma}_{ws}(\mathbf{h}) = \frac{\sum_{k=1}^{K} N(\mathbf{h}; s_k)\, \hat{\gamma}(\mathbf{h}; s_k) / \hat{\sigma}_k^2}{\sum_{k=1}^{K} N(\mathbf{h}; s_k)}$$

where $\hat{\sigma}_k^2$ is the variance of $z$-values within the $k$th stratum. This pooling of standardized semivariograms corresponds to a "proportional effect" correction (Journel and Huijbregts, 1978, p. 187).

As mentioned in section 5.8, the kriging estimate depends on the shape of the semivariogram model, not on its sill. Thus, the pooled semivariogram model $\gamma_{ws}(\mathbf{h})$ can be used for estimation within each stratum. However, the kriging variance specific to stratum $k$ requires multiplication by the proper variance $\hat{\sigma}_k^2$.
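The pooling of within-stratum semivariograms described above can be sketched in a few lines of Python. This is a minimal illustration on a hypothetical 1D transect, not the book's implementation: the function names (`stratum_semivariograms`, `pooled_semivariogram`), the lag-class rule, and the toy data are all assumptions introduced here.

```python
from statistics import pvariance

def stratum_semivariograms(x, z, s, lag, n_lags):
    """Experimental semivariogram of z within each stratum (1D transect).

    Only pairs whose two locations carry the same stratum label are used.
    Returns {label: [(n_pairs, gamma_hat or None), ...]}, one entry per lag class.
    """
    acc = {lab: [[0, 0.0] for _ in range(n_lags)] for lab in set(s)}
    for a in range(len(x)):
        for b in range(a + 1, len(x)):
            if s[a] != s[b]:
                continue  # the pair straddles a stratum boundary
            k = int(round(abs(x[a] - x[b]) / lag)) - 1  # assumed lag-class rule
            if 0 <= k < n_lags:
                acc[s[a]][k][0] += 1
                acc[s[a]][k][1] += (z[a] - z[b]) ** 2
    return {lab: [(n, sq / (2 * n) if n else None) for n, sq in rows]
            for lab, rows in acc.items()}

def pooled_semivariogram(per_stratum, n_lags, variances=None):
    """Pair-weighted average of the stratum semivariograms.

    If `variances` is given, each stratum semivariogram is first divided by
    the variance of that stratum (proportional-effect correction).
    """
    pooled = []
    for k in range(n_lags):
        num, den = 0.0, 0
        for lab, rows in per_stratum.items():
            n, g = rows[k]
            if n:
                num += n * g / (variances[lab] if variances else 1.0)
                den += n
        pooled.append(num / den if den else None)
    return pooled

# Toy data (assumed): two strata with different means and variability.
x = [0, 1, 2, 3, 4, 5, 6, 7]
z = [1.0, 1.2, 0.9, 1.1, 3.0, 4.0, 2.5, 3.5]
s = ["A", "A", "A", "A", "B", "B", "B", "B"]

per = stratum_semivariograms(x, z, s, lag=1.0, n_lags=2)
pooled = pooled_semivariogram(per, n_lags=2)
var_k = {lab: pvariance([zi for zi, si in zip(z, s) if si == lab]) for lab in set(s)}
pooled_std = pooled_semivariogram(per, n_lags=2, variances=var_k)
print(pooled, pooled_std)
```

Pairs that straddle the boundary between strata never enter the computation, which is why the pooled curve reflects only within-stratum variability. A model fitted to the standardized version has a sill near one, so a kriging variance computed from it would still need rescaling by the variance of the stratum concerned.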
Figure 6.2 (top graph) shows a stratification of the NE-SW transect based on geology. Data sparsity prevents considering more than two strata. The first stratum, denoted $A_1$, includes all locations on Argovian and Quaternary rocks. The three other rock types, with larger Cd concentrations and larger proportions of contaminated fields (Tables 2.4 and 2.5, pages 18-19), are regrouped into the second stratum $A_2$, which consists of two disconnected segments.

Figure 6.2 (middle graph) shows the semivariogram of Cd concentrations of each stratum. Concentrations appear to vary more continuously within the stratum $A_1$ (smaller relative nugget effect and larger range of the semivariogram). Recall that the smallest concentrations were measured on Argovian and Quaternary rocks, which form the first stratum. Thus, the better continuity of Cd concentrations within that stratum relates to the better connectivity of small Cd concentrations revealed by the analysis of indicator semivariograms (Figure 2.19, top graph, page 45).

Figure 6.2: Kriging within strata. The transect is first split into two strata $A_1$ and $A_2$, according to geology (top graph). Within each stratum, the Cd semivariogram is inferred and modeled (middle graph), and Cd concentrations are estimated using ordinary kriging and stratum-specific data (bottom graph, solid line). Vertical arrows depict discontinuities at the strata boundaries. The dashed line represents the OK estimate without stratification.

Cadmium concentrations are estimated using ordinary kriging, stratum-specific primary data, and the semivariogram models shown in Figure 6.2 (middle graph). The resulting estimates are depicted by the solid line in the bottom graph. The dashed line corresponds to the OK estimates computed without stratification in section 5.3. Note the following:

• Kriging within strata yields estimates that suddenly change at the strata boundaries depicted by vertical arrows. Indeed, estimates on different sides of a stratum boundary are based on two different sets of
data usually with very different values.

• Away from strata boundaries, the stratification has little influence on the search strategy because primary data in the search neighborhood generally belong to the same stratum. Discrepancies between ordinary kriging estimates with and without stratification now result only from differences in the semivariogram models.

6.1.2 Simple kriging with varying local means

Recall the simple kriging (SK) estimator (5.7):

$$Z^*_{SK}(\mathbf{u}) - m = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, [Z(\mathbf{u}_\alpha) - m]$$

Under the decision of stationarity, the mean $m$ does not depend on location $\mathbf{u}$ but represents global information common to all unsampled locations. To account for the secondary information available at each location $\mathbf{u}$, the known stationary mean $m$ may be replaced by known varying means $m^*_{SK}(\mathbf{u})$, leading to the simple kriging with varying local means (SKlm) estimator:

$$Z^*_{SKlm}(\mathbf{u}) - m^*_{SK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, [Z(\mathbf{u}_\alpha) - m^*_{SK}(\mathbf{u}_\alpha)] \qquad (6.2)$$

Different estimates of the primary local mean $m(\mathbf{u})$ can be used, depending on the secondary information available:

1. If the secondary information relates to a categorical attribute $s$ with $K$ non-overlapping states $s_k$, the primary local mean can be identified with the mean of $z$-values within the category $s_k$ prevailing at $\mathbf{u}$:

$$m^*_{SK}(\mathbf{u}) = m_{|s_k} = \frac{1}{n_k} \sum_{\alpha=1}^{n} i(\mathbf{u}_\alpha; s_k)\, z(\mathbf{u}_\alpha) \quad \text{with } s(\mathbf{u}) = s_k$$

where $n_k = \sum_{\alpha=1}^{n} i(\mathbf{u}_\alpha; s_k)$ is the number of primary data locations within category $s_k$, and $i(\mathbf{u}_\alpha; s_k)$ is the indicator (2.23) of category $s_k$.

2. In the case of a secondary continuous attribute $y$, the primary local mean can be a function (linear or not) of the secondary attribute value at $\mathbf{u}$:

$$m^*_{SK}(\mathbf{u}) = f(y(\mathbf{u}))$$

An alternative to using regression to determine the function $f(\cdot)$ consists of discretizing the range of variation of the secondary attribute into $K$ classes $(y_k, y_{k+1}]$. The primary local mean $m(\mathbf{u})$ is then identified with the mean of $z$-values with colocated $y$-values falling into class $(y_k, y_{k+1}]$:

$$m^*_{SK}(\mathbf{u}) = m_{|k} \quad \text{with } y(\mathbf{u}) \in (y_k, y_{k+1}]$$

The conditional mean $m_{|k}$ is computed as

$$m_{|k} = \frac{1}{n_k} \sum_{\alpha=1}^{n} i(\mathbf{u}_\alpha; k)\, z(\mathbf{u}_\alpha)$$

The number of primary data $z(\mathbf{u}_\alpha)$ such that $y(\mathbf{u}_\alpha) \in (y_k, y_{k+1}]$ is $n_k$, and the $y$-indicator variable $i(\mathbf{u}_\alpha; k)$ is defined as

$$i(\mathbf{u}_\alpha; k) = \begin{cases} 1 & \text{if } y(\mathbf{u}_\alpha) \in (y_k, y_{k+1}] \\ 0 & \text{otherwise} \end{cases}$$

The kriging weights $\lambda^{SK}_\alpha(\mathbf{u})$ in expression (6.2) are obtained by solving a simple kriging system of type (5.9):

$$\sum_{\beta=1}^{n(\mathbf{u})} \lambda^{SK}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) = C_R(\mathbf{u}_\alpha - \mathbf{u}) \qquad \alpha = 1, \ldots, n(\mathbf{u})$$

where $C_R(\mathbf{h})$ is the covariance function of the residual RF $R(\mathbf{u}) = Z(\mathbf{u}) - m(\mathbf{u})$, not that of $Z(\mathbf{u})$ itself.

Example

Consider the estimation of Cd concentrations along the NE-SW transect using as secondary information either geology (categorical attribute) or the block OK estimates of Ni concentration (continuous attribute), as defined in Figure 6.1. Figures 6.3 and 6.4 show the different steps of the corresponding kriging approaches:

1. The calibration between primary and secondary data amounts to determining either the average Cd concentration for each rock type (conditional means) or the regression function of Cd concentrations on Ni block estimates (top left graphs of Figures 6.3 and 6.4, respectively). This calibration results either in a "staircase" trend estimate (Figure 6.3, second row) or a more smoothly varying trend when derived from the continuous Ni attribute (Figure 6.4, second row).

2. At each primary datum location $\mathbf{u}_\alpha$, the residual value $r(\mathbf{u}_\alpha)$ is computed by subtracting the trend estimate $m^*_{SK}(\mathbf{u}_\alpha)$ from the primary datum $z(\mathbf{u}_\alpha)$: $r(\mathbf{u}_\alpha) = z(\mathbf{u}_\alpha) - m^*_{SK}(\mathbf{u}_\alpha)$. The semivariogram of residuals is then computed and modeled. Figures 6.3 and 6.4 (top right graph) show the semivariogram computed from 259 residual data and the model fitted.
3. The residual values are then estimated along the transect using simple kriging and the five closest residual data $r(\mathbf{u}_\alpha)$ (third row). The final estimate of Cd concentration, $z^*_{SKlm}(\mathbf{u})$, is obtained by adding the trend estimate $m^*_{SK}(\mathbf{u})$ to the SK estimate of the residual $r^*_{SK}(\mathbf{u})$ (bottom graphs).

Figure 6.3: Simple kriging with varying local means. The trend component at location $\mathbf{u}$ is estimated by the mean of Cd concentrations for the rock type prevailing at that location. (Panels include the semivariogram of residuals versus distance $h$ in km, the trend component, and the SK estimates of residuals versus distance in km.)

Figure 6.4: Simple kriging with varying local means. The trend component at location $\mathbf{u}$ is estimated by the regression of Cd concentration on the Ni block value at that location. (Panels include the regression on the Ni block value in ppm, with $\rho = 0.86$, the semivariogram of residuals, and the SK estimates of residuals versus distance in km.)
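The three steps above (calibration, residual computation, simple kriging of residuals followed by adding the trend back) can be sketched as follows for the categorical case. This is an illustrative sketch only: the exponential residual covariance `cov_res`, its parameters, the helper names, and the toy transect are assumptions and are not taken from the Jura data set.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - tail) / M[i][i]
    return sol

def cov_res(h, sill=0.3, a=1.0):
    """Assumed exponential covariance of the residuals (practical range a)."""
    return sill * math.exp(-3.0 * abs(h) / a)

def category_means(z, s):
    """Step 1 (calibration): mean of z within each category."""
    tot, cnt = {}, {}
    for zi, si in zip(z, s):
        tot[si] = tot.get(si, 0.0) + zi
        cnt[si] = cnt.get(si, 0) + 1
    return {k: tot[k] / cnt[k] for k in tot}

def sklm_estimate(u, s_u, x, z, s, n_closest=5):
    """SKlm estimate at location u, where category s_u prevails."""
    means = category_means(z, s)
    resid = [zi - means[si] for zi, si in zip(z, s)]        # step 2: residuals
    near = sorted(range(len(x)), key=lambda i: abs(x[i] - u))[:n_closest]
    K = [[cov_res(x[i] - x[j]) for j in near] for i in near]
    k = [cov_res(x[i] - u) for i in near]
    lam = solve(K, k)                                        # step 3: SK weights
    r_hat = sum(l * resid[i] for l, i in zip(lam, near))
    return means[s_u] + r_hat                                # add the trend back

# Toy transect (assumed values): category "A" has low z, category "B" high z.
x = [0.5, 1.0, 1.8, 2.6, 3.1, 4.0]
z = [0.8, 1.1, 0.9, 2.4, 2.9, 2.2]
s = ["A", "A", "A", "B", "B", "B"]

est_mid = sklm_estimate(2.2, "A", x, z, s)     # between data, category A
est_datum = sklm_estimate(1.8, "A", x, z, s)   # at a datum location
est_far = sklm_estimate(50.0, "B", x, z, s)    # far beyond the last datum
print(est_mid, est_datum, est_far)
```

Two properties of the estimator are visible in this sketch: the estimate honors the datum at a data location, and far from all data the residual estimate vanishes, so the estimate reverts to the local mean of the prevailing category.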
Figures 6.3 and 6.4 show the following:

• The estimated values in Figure 6.3 (bottom graph) show discontinuities at the geologic boundaries. For example, the sharp decrease around 4.8 km (extreme right datum) reflects the decrease in the mean of Cd concentrations from Portlandian to Kimmeridgian rocks. Such discontinuities are, however, less important than when kriging within strata (recall Figure 6.2, bottom graph) because data across geologic boundaries are now used.

• Because Cd data and Ni block estimates are strongly correlated across the transect ($\rho = 0.86$), the trend component in Figure 6.4 already accounts for most of the variance in the Cd data. Thus, most residual data are close to zero and the residual estimates contribute little to the estimates of $z$, which appear as a mere rescaling of the smoothly varying secondary variable shown in Figure 6.1 (bottom graph).

6.1.3 Kriging with an external drift

Kriging with an external drift (KED) is but a variant of kriging with a trend model (KT), as presented in relation (5.25). The trend $m(\mathbf{u})$ is modeled as a linear function of a smoothly varying secondary (external) variable $y(\mathbf{u})$ instead of as a function of the spatial coordinates:

$$m(\mathbf{u}) = a_0(\mathbf{u}) + a_1(\mathbf{u})\, y(\mathbf{u}) \qquad (6.3)$$

as opposed to $m(\mathbf{u}) = a_0(\mathbf{u}) + \sum_{k=1}^{K} a_k(\mathbf{u})\, f_k(\mathbf{u})$ for kriging with a trend model.

As with the KT approach, the two unknown trend coefficients $a_0(\mathbf{u})$ and $a_1(\mathbf{u})$ are deemed constant within the search neighborhood $W(\mathbf{u})$ and are implicitly estimated through the kriging system. Unlike the SK approach of the previous section, the mean $m(\mathbf{u})$ is not estimated through a calibration or regression process prior to the kriging of $z$.

The KED estimator is

$$Z^*_{KED}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{KED}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) \qquad (6.4)$$

The KED system (6.5) is a particular case of the KT system (5.26) where $K = 1$ and the trend component $f_1(\mathbf{u})$ at any location $\mathbf{u}$ is identified with the value $y(\mathbf{u})$ of the secondary attribute there.

As with kriging with a trend (KT), two major issues are the choice of the trend function and the inference of the residual semivariogram $\gamma_R(\mathbf{h})$:

• The secondary variable should vary smoothly in space to avoid instability of the KED system. In an early application of kriging with an external drift, however, Maréchal (1984) considered the example of local trends of mineral grades controlled by fault blocks. The external variable was then an indicator variable that changed value only across the faults.

• As discussed in section 5.4, the residual semivariogram $\gamma_R(\mathbf{h})$ should be inferred from pairs of $z$-values that are unaffected or slightly affected by the trend, i.e., from data pairs such that $y(\mathbf{u}_\alpha) \approx y(\mathbf{u}_\alpha + \mathbf{h})$.
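To make the structure of a KED system concrete, the sketch below builds and solves the usual textbook form of the system: the residual covariances are bordered by two constraints, one forcing the weights to sum to one and one forcing them to reproduce the secondary value $y(\mathbf{u})$ at the location being estimated. The code is a hedged illustration, not the book's system (6.5) verbatim: the covariance model `cov_res`, the function names, and the data values are assumptions.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - tail) / M[i][i]
    return sol

def cov_res(h, sill=0.3, a=1.0):
    """Assumed exponential residual covariance (practical range a)."""
    return sill * math.exp(-3.0 * abs(h) / a)

def ked_weights(u, y_u, x, y):
    """Return (data weights, Lagrange parameters) of the KED system at u.

    x: data locations, y: secondary values at the data locations,
    y_u: secondary value at the location u being estimated.
    """
    n = len(x)
    A = [[cov_res(x[i] - x[j]) for j in range(n)] + [1.0, y[i]] for i in range(n)]
    A.append([1.0] * n + [0.0, 0.0])   # constraint: weights sum to one
    A.append(list(y) + [0.0, 0.0])     # constraint: weights reproduce y(u)
    b = [cov_res(x[i] - u) for i in range(n)] + [1.0, y_u]
    sol = solve(A, b)
    return sol[:n], sol[n:]

# Toy neighborhood of five data (assumed values).
x = [0.5, 1.0, 1.8, 2.6, 3.1]
y = [10.0, 14.0, 12.0, 25.0, 30.0]   # secondary (external drift) variable
z = [0.8, 1.1, 0.9, 2.4, 2.9]        # primary variable

lam, mu = ked_weights(2.2, 18.0, x, y)
z_ked = sum(l * zi for l, zi in zip(lam, z))

# If z were an exact linear function of y, KED would return that function at u.
z_lin = [2.0 + 0.5 * yi for yi in y]
z_lin_ked = sum(l * zi for l, zi in zip(lam, z_lin))
print(z_ked, z_lin_ked)
```

The last check shows why the trend coefficients never need to be computed explicitly: the two constraints alone guarantee that any linear function of $y$ is reproduced, whatever the residual covariance model.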
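Kriging the trend

It is sometimes useful to compute and map the trend component $m(\mathbf{u})$ that is implicitly used in the expression of the KED estimator. The trend estimator is written

$$m^*_{KED}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{m}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha) \qquad (6.6)$$

where $\lambda^{m}_\alpha(\mathbf{u})$ is the weight assigned to datum $z(\mathbf{u}_\alpha)$. The kriging weights are obtained by solving a kriging system identical to the KT system (5.33) except that the trend function $f_1(\mathbf{u})$ is now set equal to the secondary variable $y(\mathbf{u})$:

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{m}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu^{m}_0(\mathbf{u}) + \mu^{m}_1(\mathbf{u})\, y(\mathbf{u}_\alpha) = 0 & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{m}_\beta(\mathbf{u}) = 1 \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{m}_\beta(\mathbf{u})\, y(\mathbf{u}_\beta) = y(\mathbf{u}) \end{cases} \qquad (6.7)$$

Estimation of the trend $m(\mathbf{u})$ through system (6.7) amounts to estimating by least-squares regression the coefficients $a_0(\mathbf{u})$ and $a_1(\mathbf{u})$, defined in relation (6.3), within each search neighborhood $W(\mathbf{u})$:

$$m^*_{KED}(\mathbf{u}) = a^*_0(\mathbf{u}) + a^*_1(\mathbf{u})\, y(\mathbf{u})$$

A zero slope $a^*_1(\mathbf{u})$ means that the secondary datum $y(\mathbf{u})$ does not influence the primary trend estimate at $\mathbf{u}$. As that slope increases, the influence of the secondary value becomes preponderant. A map of the scaling factor $a^*_1(\mathbf{u})$ allows one to depict the local influence of the secondary variable in the estimation of the primary trend component.

The estimator of the trend coefficient $a_1(\mathbf{u})$ is written

$$a^*_1(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{a_1}_\alpha(\mathbf{u})\, Z(\mathbf{u}_\alpha)$$

where the weights are obtained by solving a particular case of the KT system (5.36) with $K = k' = 1$ and $f_1(\mathbf{u}) = y(\mathbf{u})$:

$$\begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{a_1}_\beta(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu^{a_1}_0(\mathbf{u}) + \mu^{a_1}_1(\mathbf{u})\, y(\mathbf{u}_\alpha) = 0 & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{a_1}_\beta(\mathbf{u}) = 0 \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{a_1}_\beta(\mathbf{u})\, y(\mathbf{u}_\beta) = 1 \end{cases} \qquad (6.8)$$

The KED estimator (6.9) is thus similar to an SK estimator with varying local means derived from a linear rescaling of the colocated $y$-datum:

$$Z^*_{SKlm}(\mathbf{u}) - m^*_{SK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, [Z(\mathbf{u}_\alpha) - m^*_{SK}(\mathbf{u}_\alpha)]$$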
The two estimators differ by the definition of the trend component. Trend coefficients $a^*_0$ and $a^*_1$ are derived once and independently of the kriging system in the SKlm approach, whereas in the KED approach the regression coefficients $a^*_0(\mathbf{u})$ and $a^*_1(\mathbf{u})$ are implicitly estimated through the kriging system within each search neighborhood $W(\mathbf{u})$.

Example

Figure 6.5 shows the SKlm and KED estimates of the trend and of Cd concentrations along the NE-SW transect. As in Figure 5.3, the vertical dashed lines split the transect into six segments within which the same five primary data are used for estimation: for example, all trend and Cd estimates within the segment 1-2.1 km are based on Cd data at locations $\mathbf{u}_1$ to $\mathbf{u}_5$. As discussed previously, the kriging system (6.8) is identical at all locations where the same neighboring data are involved in the estimation. Therefore, the two KED-estimated trend coefficients $a^*_0(\mathbf{u})$ and $a^*_1(\mathbf{u})$ are constant within each segment and change from one segment to the next, depending on the five neighboring data retained; see the slope estimates (Figure 6.5, bottom graph, solid line). The slope of the trend model is multiplied by two from the first segment to the next because of the large Cd concentration at $\mathbf{u}_6$; it then gets closer to the constant slope of the global trend model that is used in simple kriging with varying local means (bottom graph, horizontal dashed line).

Both KED and SKlm trend and Cd estimates are fairly similar. The largest differences occur for the first segment, 1-2.1 km, where the KED and SKlm slope estimates differ more. As with kriging with a trend model, beyond the extreme right datum at coordinate 4.8 km, kriging with an external drift extrapolates the linear trend model fitted to the last five Cd values. Such extrapolation would be dangerous if the last five values showed an atypical relation between Cd data and Ni block estimates, say, a negative correlation caused by an outlier. Simple kriging with varying local means would be more robust in that it extrapolates a relation that is fitted to all data along the transect.

Figure 6.5: SKlm and KED estimates of the trend and of Cd concentrations. The vertical dashed lines delineate the segments that are estimated using the same five Cd concentrations. The estimated slope $a^*_1(\mathbf{u})$ is constant within each such segment. (Panels: trend estimates; Cd estimates; slope estimates $a^*_1(\mathbf{u})$; horizontal axis: distance in km.)

6.1.4 Performance comparison

Figures 6.6 and 6.7 show different estimates of Cd concentrations accounting for the information provided by either geologic and land use maps or by the map of Zn block estimates. Three algorithms for integration of the secondary information are considered:

1. Kriging within strata. The study area is divided into two strata based on the geologic map in Figure 6.6 (left top graph). The first stratum includes Argovian rocks with the smallest proportion of contaminated locations; the four other rocks form the second stratum (Figure 6.6, left bottom graph). Ordinary kriging is performed separately within each stratum using a pooled within-stratum semivariogram; see right bottom graph of Figure 6.6.

2. Simple kriging with varying local means. The local means are determined as a linear function of the Zn block estimates shown in Figure 6.7 (left top graph).

3. Kriging with an external drift. The external drift consists of the map of Zn block estimates.
Figure 6.6 (panels): geology (Argovian, Kimmeridgian, Sequanian, Portlandian, Quaternary); OK Cd estimates; stratification (stratum 1, stratum 2); KWS Cd estimates.

Figure 6.7: Accounting for an exhaustively sampled continuous variable (Zn block estimates) in the estimation of Cd concentrations: simple kriging with varying local means (SKlm) and kriging with an external drift (KED). The two bottom graphs depict the trend coefficients implicitly estimated locally in the KED approach. (Panels: Zn block estimates; OK Cd estimates; SKlm Cd estimates; KED Cd estimates; 1st trend coefficient; 2nd trend coefficient (slope).)

Accounting for secondary information yields more detailed maps than the map of the reference OK estimates shown in Figures 6.6 and 6.7 (right top graphs). Kriging within strata (KWS) enhances the contrast between low-valued Argovian rocks and other rocks (Figure 6.6, right bottom graph). Wherever the two strata are intermingled, as in the SW part of the study area, discontinuities in Cd estimates occur.

Because they share the same secondary information, maps of SKlm and KED estimates in Figure 6.7 (middle graphs) show similar long-range features. However, kriging with an external drift yields more local detail. Such short-range variation results from the local re-evaluation of the linear regression of Cd concentrations on Zn block estimates. For example, the larger variation of KED estimates in the central part of the study area relates to larger slopes of the trend model (Figure 6.7, right bottom graph). The combination of a steep trend model and small Zn block estimates yields negative KED estimates of Cd concentrations; see white pixels in Figure 6.7 (right middle graph). Such unacceptable estimates are not produced by the other algorithms.

The same three algorithms for integrating secondary information are used to estimate metals with widespread contamination (Cd, Cu, Pb) and cobalt at the 100 test locations. Results in Table 6.1 show that accounting for secondary information
• corrects partly for the smoothing effect of ordinary kriging: this is most significant for the metals (Cd, Cu, Pb) with short-range variation,
• increases significantly the rank correlation between actual and estimated values, and
• decreases the percentage of test locations misclassified, that is, wrongly declared safe or contaminated.

Table 6.1: Statistics for true and estimated concentrations of heavy metals at 100 test locations; algorithms are the reference ordinary kriging (OK), kriging within strata (KWS), simple kriging with local means computed as a linear function of Zn block estimates (SKlm), and kriging with an external drift (KED).

Mean
    True values
    OK estimates
    KWS estimates
    SKlm estimates
    KED estimates
Std deviation
    True values
    OK estimates
    KWS estimates
    SKlm estimates
    KED estimates
% misclassification
    OK estimates
    KWS estimates
    SKlm estimates
    KED estimates

6.2 The Cokriging Approach

The main limitation of the algorithms previously proposed for integration of secondary information is that secondary data must be available at all locations being estimated and in addition, for kriging with an external drift, at all primary data locations. Non-exhaustive secondary information can be incorporated using the cokriging approach that explicitly accounts for the spatial cross correlation between primary and secondary variables.

6.2.1 The cokriging paradigm

Consider the situation where primary data $\{z_1(\mathbf{u}_{\alpha_1}), \alpha_1 = 1, \ldots, n_1\}$ are supplemented by secondary data related to $(N_v - 1)$ continuous attributes $z_i$, $\{z_i(\mathbf{u}_{\alpha_i}), \alpha_i = 1, \ldots, n_i, i = 2, \ldots, N_v\}$, at any, possibly different, locations. The linear estimator (5.1) is readily extended to incorporate that additional information:

$$Z^*_1(\mathbf{u}) - m_1(\mathbf{u}) = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda_{\alpha_1}(\mathbf{u})\, [Z_1(\mathbf{u}_{\alpha_1}) - m_1(\mathbf{u}_{\alpha_1})] + \sum_{i=2}^{N_v} \sum_{\alpha_i=1}^{n_i(\mathbf{u})} \lambda_{\alpha_i}(\mathbf{u})\, [Z_i(\mathbf{u}_{\alpha_i}) - m_i(\mathbf{u}_{\alpha_i})] \qquad (6.10)$$

where $\lambda_{\alpha_1}(\mathbf{u})$ is the weight assigned to the primary datum $z_1(\mathbf{u}_{\alpha_1})$ and $\lambda_{\alpha_i}(\mathbf{u})$, $i > 1$, is the weight assigned to the secondary datum $z_i(\mathbf{u}_{\alpha_i})$. The expected values of the RVs $Z_1(\mathbf{u})$ and $Z_i(\mathbf{u}_{\alpha_i})$ are denoted $m_1(\mathbf{u})$ and $m_i(\mathbf{u}_{\alpha_i})$, respectively. Typically, only the primary and secondary data closest to the location $\mathbf{u}$ being estimated are retained, i.e., $n_i(\mathbf{u})$ is usually smaller than $n_i$. The amount of data retained and the size of the search neighborhood $W(\mathbf{u})$ need not be the same for all attributes.

All cokriging estimators are but variants of expression (6.10). They are all required to be unbiased and to minimize the error variance $\sigma^2_E(\mathbf{u})$, that is,

$$\sigma^2_E(\mathbf{u}) = \text{Var}\{Z^*_1(\mathbf{u}) - Z_1(\mathbf{u})\} \text{ is minimized under the constraint that } E\{Z^*_1(\mathbf{u}) - Z_1(\mathbf{u})\} = 0$$

The various cokriging estimators differ in the random function model $Z_i(\mathbf{u})$ adopted for the various variables. Typically, each RF $Z_i(\mathbf{u})$ is decomposed into a residual component $R_i(\mathbf{u})$ and a trend component $m_i(\mathbf{u})$:

$$Z_i(\mathbf{u}) = R_i(\mathbf{u}) + m_i(\mathbf{u}) \qquad i = 1, \ldots, N_v$$
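The cross covariance between any two residual RVs $R_i(\mathbf{u})$ and $R_j(\mathbf{u} + \mathbf{h})$ is

$$\text{Cov}\{R_i(\mathbf{u}), R_j(\mathbf{u} + \mathbf{h})\} = C^R_{ij}(\mathbf{h})$$

Three cokriging variants can be distinguished according to the model considered for the trend components $m_i(\mathbf{u})$:

1. Simple cokriging (SCK) considers each local mean $m_i(\mathbf{u})$ known and constant within the study area $A$:

$$m_i(\mathbf{u}) = m_i, \text{ known } \forall\, \mathbf{u} \in A \qquad i = 1, \ldots, N_v$$

2. Ordinary cokriging (OCK) accounts for local variations of the means by limiting the domain of stationarity of both primary and secondary means (both unknown) to the local neighborhood $W(\mathbf{u})$.

3. Cokriging with trend models (CKT) consists of modeling the trend components as linear combinations of known functions $f_{ki}(\mathbf{u})$ of the spatial coordinates $\mathbf{u}$.

With a single secondary attribute $z_2$ and known constant means $m_1$ and $m_2$, estimator (6.10) becomes

$$Z^{(1)*}_{SCK}(\mathbf{u}) - m_1 = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, [Z_1(\mathbf{u}_{\alpha_1}) - m_1] + \sum_{\alpha_2=1}^{n_2(\mathbf{u})} \lambda^{SCK}_{\alpha_2}(\mathbf{u})\, [Z_2(\mathbf{u}_{\alpha_2}) - m_2] \qquad (6.11)$$

where $Z^{(1)*}_{SCK}(\mathbf{u})$ is the simple cokriging estimator of the primary attribute $z_1$ at location $\mathbf{u}$. The $(n_1(\mathbf{u}) + n_2(\mathbf{u}))$ cokriging weights are determined so as to minimize the error variance while keeping the estimator unbiased.

Unbiasedness is guaranteed by expression (6.11). Indeed,

$$E\{Z^{(1)*}_{SCK}(\mathbf{u}) - Z_1(\mathbf{u})\} = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, [m_1 - m_1] + \sum_{\alpha_2=1}^{n_2(\mathbf{u})} \lambda^{SCK}_{\alpha_2}(\mathbf{u})\, [m_2 - m_2] + m_1 - m_1 = 0$$

whatever the cokriging weights $\lambda^{SCK}_{\alpha_1}(\mathbf{u})$ and $\lambda^{SCK}_{\alpha_2}(\mathbf{u})$.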


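Let $R_i(\mathbf{u}_{\alpha_i}) = Z_i(\mathbf{u}_{\alpha_i}) - m_i$ be a primary ($i = 1$) or secondary ($i = 2$) residual RV at $\mathbf{u}_{\alpha_i}$. The estimation error $Z^{(1)*}_{SCK}(\mathbf{u}) - Z_1(\mathbf{u})$ at location $\mathbf{u}$ can be written as a linear combination of the residual variables:

$$Z^{(1)*}_{SCK}(\mathbf{u}) - Z_1(\mathbf{u}) = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, R_1(\mathbf{u}_{\alpha_1}) + \sum_{\alpha_2=1}^{n_2(\mathbf{u})} \lambda^{SCK}_{\alpha_2}(\mathbf{u})\, R_2(\mathbf{u}_{\alpha_2}) - R_1(\mathbf{u})$$

The error variance $\sigma^2_E(\mathbf{u})$ is then expressed as a linear combination of residual auto and cross covariance values:

$$\begin{aligned} \sigma^2_E(\mathbf{u}) = {} & C^R_{11}(0) + \sum_{\alpha_1} \sum_{\beta_1} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, \lambda^{SCK}_{\beta_1}(\mathbf{u})\, C^R_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_1}) + \sum_{\alpha_2} \sum_{\beta_2} \lambda^{SCK}_{\alpha_2}(\mathbf{u})\, \lambda^{SCK}_{\beta_2}(\mathbf{u})\, C^R_{22}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_2}) \\ & + 2 \sum_{\alpha_1} \sum_{\alpha_2} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, \lambda^{SCK}_{\alpha_2}(\mathbf{u})\, C^R_{12}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\alpha_2}) - 2 \sum_{\alpha_1} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, C^R_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}) - 2 \sum_{\alpha_2} \lambda^{SCK}_{\alpha_2}(\mathbf{u})\, C^R_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}) \end{aligned} \qquad (6.12)$$

The cokriging weights that minimize the error variance are obtained by setting to zero the $(n_1(\mathbf{u}) + n_2(\mathbf{u}))$ partial first derivatives of the quadratic form (6.12):

$$\begin{cases} \dfrac{1}{2} \dfrac{\partial \sigma^2_E(\mathbf{u})}{\partial \lambda^{SCK}_{\alpha_1}(\mathbf{u})} = \sum_{\beta_1} \lambda^{SCK}_{\beta_1}(\mathbf{u})\, C^R_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_1}) + \sum_{\beta_2} \lambda^{SCK}_{\beta_2}(\mathbf{u})\, C^R_{12}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_2}) - C^R_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}) = 0 & \alpha_1 = 1, \ldots, n_1(\mathbf{u}) \\ \dfrac{1}{2} \dfrac{\partial \sigma^2_E(\mathbf{u})}{\partial \lambda^{SCK}_{\alpha_2}(\mathbf{u})} = \sum_{\beta_1} \lambda^{SCK}_{\beta_1}(\mathbf{u})\, C^R_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_1}) + \sum_{\beta_2} \lambda^{SCK}_{\beta_2}(\mathbf{u})\, C^R_{22}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_2}) - C^R_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}) = 0 & \alpha_2 = 1, \ldots, n_2(\mathbf{u}) \end{cases}$$

Because the primary and secondary trend components are assumed stationary (constant) and known, the two residual autocovariance functions, $C^R_{11}(\mathbf{h})$ and $C^R_{22}(\mathbf{h})$, and the residual cross covariance function $C^R_{12}(\mathbf{h})$ are equal to the auto and cross covariance functions of RFs $Z_1(\mathbf{u})$ and $Z_2(\mathbf{u})$:

$$C^R_{ij}(\mathbf{h}) = C_{ij}(\mathbf{h}) \qquad i, j = 1, 2$$

Finally, the simple cokriging system is written as

$$\begin{cases} \sum_{\beta_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\beta_1}(\mathbf{u})\, C_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_1}) + \sum_{\beta_2=1}^{n_2(\mathbf{u})} \lambda^{SCK}_{\beta_2}(\mathbf{u})\, C_{12}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_2}) = C_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}) & \alpha_1 = 1, \ldots, n_1(\mathbf{u}) \\ \sum_{\beta_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\beta_1}(\mathbf{u})\, C_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_1}) + \sum_{\beta_2=1}^{n_2(\mathbf{u})} \lambda^{SCK}_{\beta_2}(\mathbf{u})\, C_{22}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_2}) = C_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}) & \alpha_2 = 1, \ldots, n_2(\mathbf{u}) \end{cases} \qquad (6.13)$$

The minimum error variance, called the simple cokriging (SCK) variance, is computed by substituting equations (6.13) into the expression of the error variance (6.12):

$$\sigma^2_{SCK}(\mathbf{u}) = C_{11}(0) - \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, C_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}) - \sum_{\alpha_2=1}^{n_2(\mathbf{u})} \lambda^{SCK}_{\alpha_2}(\mathbf{u})\, C_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}) \qquad (6.14)$$

Several secondary variables

The simple cokriging estimator (6.11) is readily extended to more than one secondary variable; for example, for $(N_v - 1)$ variables,

$$Z^{(1)*}_{SCK}(\mathbf{u}) - m_1 = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, [Z_1(\mathbf{u}_{\alpha_1}) - m_1] + \sum_{i=2}^{N_v} \sum_{\alpha_i=1}^{n_i(\mathbf{u})} \lambda^{SCK}_{\alpha_i}(\mathbf{u})\, [Z_i(\mathbf{u}_{\alpha_i}) - m_i] \qquad (6.15)$$

Using an approach similar to that for two attributes, one derives the following cokriging system of $\left(\sum_{i=1}^{N_v} n_i(\mathbf{u})\right)$ linear equations:

$$\sum_{j=1}^{N_v} \sum_{\beta_j=1}^{n_j(\mathbf{u})} \lambda^{SCK}_{\beta_j}(\mathbf{u})\, C_{ij}(\mathbf{u}_{\alpha_i} - \mathbf{u}_{\beta_j}) = C_{i1}(\mathbf{u}_{\alpha_i} - \mathbf{u}) \qquad \alpha_i = 1, \ldots, n_i(\mathbf{u}), \quad i = 1, \ldots, N_v \qquad (6.16)$$

Matrix notation

Using matrix notation, the simple cokriging system (6.13) is written

$$\mathbf{K}_{SCK}\, \boldsymbol{\lambda}_{SCK}(\mathbf{u}) = \mathbf{k}_{SCK} \qquad (6.17)$$

with

$$\mathbf{K}_{SCK} = \begin{bmatrix} [C_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_1})] & [C_{12}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_2})] \\ [C_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_1})] & [C_{22}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_2})] \end{bmatrix} \qquad \boldsymbol{\lambda}_{SCK}(\mathbf{u}) = \begin{bmatrix} [\lambda^{SCK}_{\beta_1}(\mathbf{u})]^T \\ [\lambda^{SCK}_{\beta_2}(\mathbf{u})]^T \end{bmatrix} \qquad \mathbf{k}_{SCK} = \begin{bmatrix} [C_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u})]^T \\ [C_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u})]^T \end{bmatrix}$$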
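where $[C_{ij}(\mathbf{u}_{\alpha_i} - \mathbf{u}_{\beta_j})]$ is the $n_i(\mathbf{u}) \times n_j(\mathbf{u})$ matrix of data auto and cross covariances, $[\lambda^{SCK}_{\beta_i}(\mathbf{u})]^T$ is an $n_i(\mathbf{u}) \times 1$ vector of cokriging weights, and $[C_{i1}(\mathbf{u}_{\alpha_i} - \mathbf{u})]^T$ is an $n_i(\mathbf{u}) \times 1$ vector of data-to-unknown auto and cross covariances. The cokriging weights are obtained by multiplying the inverse of matrix $\mathbf{K}_{SCK}$ by the vector $\mathbf{k}_{SCK}$:

$$\boldsymbol{\lambda}_{SCK}(\mathbf{u}) = \mathbf{K}^{-1}_{SCK}\, \mathbf{k}_{SCK}$$

The simple cokriging variance (6.14) is then computed as

$$\sigma^2_{SCK}(\mathbf{u}) = C_{11}(0) - \boldsymbol{\lambda}^T_{SCK}(\mathbf{u})\, \mathbf{k}_{SCK}$$

With $N_v$ variables, the simple cokriging estimator (6.15) is written

$$Z^{(1)*}_{SCK}(\mathbf{u}) = \sum_{i=1}^{N_v} [\mathbf{Z}_i(\mathbf{u}) - \mathbf{m}_i]^T\, \boldsymbol{\lambda}^{SCK}_i(\mathbf{u}) + m_1 \qquad (6.18)$$

where $\mathbf{Z}_i(\mathbf{u}) = [Z_i(\mathbf{u}_1), \ldots, Z_i(\mathbf{u}_{n_i(\mathbf{u})})]^T$, $\boldsymbol{\lambda}^{SCK}_i(\mathbf{u})$ is the $n_i(\mathbf{u}) \times 1$ vector of cokriging weights assigned to the $z_i$-data at locations $\mathbf{u}_{\alpha_i}$, $\alpha_i = 1, \ldots, n_i(\mathbf{u})$, and $\mathbf{m}_i$ is the $n_i(\mathbf{u}) \times 1$ vector of stationary (constant) means $m_i$. The $N_v$ vectors of cokriging weights are solutions of the system

$$\sum_{j=1}^{N_v} [C_{ij}(\mathbf{u}_{\alpha_i} - \mathbf{u}_{\beta_j})]\, \boldsymbol{\lambda}^{SCK}_j(\mathbf{u}) = [C_{i1}(\mathbf{u}_{\alpha_i} - \mathbf{u})]^T \qquad i = 1, \ldots, N_v$$

1. The simple cokriging system (6.17) has a unique solution with positive cokriging variance if and only if the covariance matrix $\mathbf{K}_{SCK}$ is positive definite. That condition is satisfied by using permissible coregionalization models as introduced in section 4.3.2, provided that no two data values related to the same variable are colocated.

3. The simple cokriging estimator (6.15) is an exact interpolator in that it honors the primary data at their locations:

$$Z^{(1)*}_{SCK}(\mathbf{u}) = z_1(\mathbf{u}_{\alpha_1}) \qquad \forall\, \mathbf{u} = \mathbf{u}_{\alpha_1}, \quad \alpha_1 = 1, \ldots, n_1$$

Because of differences in units of measurement, the variances of primary and secondary variables may differ by several orders of magnitude, leading to large differences between rows of the cokriging matrix $\mathbf{K}_{SCK}$. This may cause numerical instability when solving cokriging system (6.13) or (6.16). In such cases, it is good practice to rescale the auto and cross covariance values; for example, solve the cokriging system (6.13) in terms of correlograms:

$$\begin{cases} \sum_{\beta_1=1}^{n_1(\mathbf{u})} \nu^{SCK}_{\beta_1}(\mathbf{u})\, \rho_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_1}) + \sum_{\beta_2=1}^{n_2(\mathbf{u})} \nu^{SCK}_{\beta_2}(\mathbf{u})\, \rho_{12}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_2}) = \rho_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}) & \alpha_1 = 1, \ldots, n_1(\mathbf{u}) \\ \sum_{\beta_1=1}^{n_1(\mathbf{u})} \nu^{SCK}_{\beta_1}(\mathbf{u})\, \rho_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_1}) + \sum_{\beta_2=1}^{n_2(\mathbf{u})} \nu^{SCK}_{\beta_2}(\mathbf{u})\, \rho_{22}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_2}) = \rho_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}) & \alpha_2 = 1, \ldots, n_2(\mathbf{u}) \end{cases} \qquad (6.20)$$

Recall that the cross correlogram $\rho_{ij}(\mathbf{h})$ is defined as the ratio $C_{ij}(\mathbf{h}) / (\sigma_i \cdot \sigma_j)$, where $\sigma^2_i$ is the stationary variance of the RF $Z_i(\mathbf{u})$. Cokriging systems written in terms of correlograms or covariances yield different weights, yet the cokriging estimator remains the same.

Indeed, accounting for the definition of the cross correlogram, system (6.20) becomes

$$\begin{cases} \sum_{\beta_1} \nu^{SCK}_{\beta_1}(\mathbf{u})\, \dfrac{C_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_1})}{\sigma_1^2} + \sum_{\beta_2} \nu^{SCK}_{\beta_2}(\mathbf{u})\, \dfrac{C_{12}(\mathbf{u}_{\alpha_1} - \mathbf{u}_{\beta_2})}{\sigma_1 \sigma_2} = \dfrac{C_{11}(\mathbf{u}_{\alpha_1} - \mathbf{u})}{\sigma_1^2} & \alpha_1 = 1, \ldots, n_1(\mathbf{u}) \\ \sum_{\beta_1} \nu^{SCK}_{\beta_1}(\mathbf{u})\, \dfrac{C_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_1})}{\sigma_1 \sigma_2} + \sum_{\beta_2} \nu^{SCK}_{\beta_2}(\mathbf{u})\, \dfrac{C_{22}(\mathbf{u}_{\alpha_2} - \mathbf{u}_{\beta_2})}{\sigma_2^2} = \dfrac{C_{21}(\mathbf{u}_{\alpha_2} - \mathbf{u})}{\sigma_1 \sigma_2} & \alpha_2 = 1, \ldots, n_2(\mathbf{u}) \end{cases}$$

Multiplying the first $n_1(\mathbf{u})$ equations by $\sigma_1^2$ and the next $n_2(\mathbf{u})$ equations by $\sigma_1 \sigma_2$, one deduces the following relation between the two sets of cokriging weights:

$$\lambda^{SCK}_{\alpha_1}(\mathbf{u}) = \nu^{SCK}_{\alpha_1}(\mathbf{u}) \qquad \lambda^{SCK}_{\alpha_2}(\mathbf{u}) = \frac{\sigma_1}{\sigma_2}\, \nu^{SCK}_{\alpha_2}(\mathbf{u})$$

Whereas the primary data weights are the same for both systems, the weights of the secondary data are rescaled by a ratio of standard deviations. The cokriging estimator (6.11) is then written

$$Z^{(1)*}_{SCK}(\mathbf{u}) - m_1 = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \nu^{SCK}_{\alpha_1}(\mathbf{u})\, [Z_1(\mathbf{u}_{\alpha_1}) - m_1] + \frac{\sigma_1}{\sigma_2} \sum_{\alpha_2=1}^{n_2(\mathbf{u})} \nu^{SCK}_{\alpha_2}(\mathbf{u})\, [Z_2(\mathbf{u}_{\alpha_2}) - m_2]$$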
210 CHAPTER 6. ACCOUNTING FOR SECONDARY INE'ORMATfON

vcrlor k,,,., is ;t n u l l vrxtor, and


syst,cnt ((i.17) vmislr. 'I'l~e rigl~t,-l~;ru&side
so is the vector X,,,(n) of cokrigirtg wights: A,,-,<(II) = K G ! , 0 = 0. '1'11~
SCK est.imator tlrcn reverts to the primary nkcati I ? I ~ ,as it s l ~ o t ~ l d

The standardized form of the SCK estimator is obtained by dividing both
terms of the expression by $\sigma_1$:

The cokriging variants are now illustrated using the one-dimensional data
set with a primary variable (cadmium concentration) and a single secondary
variable (nickel concentration). The information available for estimating Cd
concentrations along the NE-SW transect consists of the following:

1. Ten Cd concentrations and 16 Ni concentrations; see Figure 6.8 (top
graphs). Closed squares depict locations where both primary and secondary
variables are known. Open circles correspond to six locations
where only the secondary variable is available.
Thus the estimator $Z^*_{SCK}(\mathbf{u})$ in the standardized form (6.21) with weights
provided by cokriging system (6.20) identifies the estimator (6.11), as it
should.

2. The linear model of coregionalization (4.47) shown in Figure 4.16:

Weights of the means
The simple cokriging estimator (6.11) can be rewritten as

3. Primary and secondary stationary means identified with the sample
means of Cd and Ni data along the transect: $m_1 = 1.49$ ppm, $m_2 = 19.0$ ppm.

Similarly, the weight $\lambda_{m_2}^{SCK}(\mathbf{u})$ given to the secondary mean $m_2$ is

The estimation is performed every 60 m using at each location the five closest
primary data and five closest secondary data: $n_1(\mathbf{u}) = n_2(\mathbf{u}) = 5 \;\; \forall\, \mathbf{u}$.
Figure 6.8 (bottom graphs) shows the SCK estimates of Cd concentrations
and the weights$^3$ given to the primary and secondary means. Note that the
SCK estimates (solid line, third row):

- identify the primary data (exactitude property)

- identify the primary mean $m_1$ (horizontal dashed line) beyond the range of
influence of the extreme right datum. The primary weight $\lambda_{m_1}^{SCK}(\mathbf{u})$ is
then equal to one, whereas the weight of the secondary mean is zero.
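The rewriting of the estimator in terms of weights of the means can be illustrated with a short sketch. It assumes the usual relations, which are not legible in this copy: the primary mean receives one minus the sum of the primary data weights, and the secondary mean receives minus the sum of the secondary data weights. The weights, data values, and means below are hypothetical, and `mean_weights` is a name introduced here.

```python
import numpy as np

def mean_weights(lam1, lam2):
    """Weights of the stationary means implied by SCK data weights (assumed relation)."""
    return 1.0 - np.sum(lam1), -np.sum(lam2)

# Hypothetical data weights and values, for illustration only.
lam1 = np.array([0.45, 0.30, 0.05])   # primary data weights
lam2 = np.array([0.10, -0.04])        # secondary data weights
z1 = np.array([1.2, 0.9, 2.1])        # primary data
z2 = np.array([18.0, 25.0])           # secondary data
m1, m2 = 1.5, 20.0                    # stationary means

# Residual form: weights applied to deviations from the means.
est_residual = lam1 @ (z1 - m1) + lam2 @ (z2 - m2) + m1

# Rewritten form: data weights on raw data, plus explicit weights on the means.
w_m1, w_m2 = mean_weights(lam1, lam2)
est_means = lam1 @ z1 + lam2 @ z2 + w_m1 * m1 + w_m2 * m2

# Beyond the range of influence of the data, all data weights vanish.
far_m1, far_m2 = mean_weights(np.zeros(3), np.zeros(2))
```

With all data weights equal to zero, the sketch returns a weight of one for the primary mean and zero for the secondary mean, in line with the second remark above.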

3 Secondary data weights are rescaled by the ratio of standard deviations of secondary
to primary variables, $\lambda_{\alpha_2}^{SCK}(\mathbf{u}) \cdot \sigma_2 / \sigma_1$, so their magnitude is comparable to that of primary
data weights. Such rescaling amounts to solving the simple cokriging system in terms of
correlograms; see relation (6.20).
In the general form of cokriging, expression (6.15), primary and secondary
data locations need not coincide. One particular case is when all variables are
equally sampled, and the same $n(\mathbf{u})$ primary and secondary data locations
$\mathbf{u}_\alpha$ are retained in the estimation:
The simple cokriging estimator (6.15) can then be written

In such an isotopic case, one generally aims at estimating all $N_v$ variables
$Z_i(\mathbf{u})$ at each unsampled location $\mathbf{u}$. Such estimation calls for solving $N_v$
cokriging systems of type (6.16), with each variable $Z_i(\mathbf{u})$ being considered
as primary in turn. A more straightforward alternative consists of estimating
directly the vector of $N_v$ variables (Myers, 1982). In matrix notation,

where $\mathbf{Z}^*_{SCK}(\mathbf{u}) = [Z^*_{1,SCK}(\mathbf{u}), \ldots, Z^*_{N_v,SCK}(\mathbf{u})]^T$ is the vector of simple cokriging
estimators, $\mathbf{Z}(\mathbf{u}_\alpha) = [Z_1(\mathbf{u}_\alpha), \ldots, Z_{N_v}(\mathbf{u}_\alpha)]^T$ is the vector of data at $\mathbf{u}_\alpha$,
and $\mathbf{m} = [m_1, \ldots, m_{N_v}]^T$ is the vector of means (known). Each $N_v \times N_v$ matrix
$\boldsymbol{\Lambda}^{SCK}_\alpha(\mathbf{u})$ contains the cokriging weights, $[\lambda^{SCK}_{\alpha ij}(\mathbf{u})]$, assigned to $z_i$-data
at location $\mathbf{u}_\alpha$ for the estimation of attributes $z_j$ at $\mathbf{u}$:

Figure 6.8: Ten primary Cd concentrations and 16 secondary Ni concentrations
along the NE-SW transect. Open circles depict the six locations where only Ni
concentrations are known. The two bottom graphs show the simple cokriging (SCK)
estimates of Cd concentration and the weights of the primary and secondary means.

The matrices of cokriging weights are obtained by solving the single cokriging
system:

where the matrix of data-to-data auto and cross covariances is

Singularity can be avoided by estimating one variable at a time, using a
cokriging system of type (6.16) and discarding any variable that is weakly
correlated with the variable being estimated. To satisfy the constant sum
constraint, each estimate $z^*_i(\mathbf{u})$ is divided by the sum $\sum_{i=1}^{N_v} z^*_i(\mathbf{u})$ of
estimated values at the same location $\mathbf{u}$, then it is multiplied by the constant $p$.
An alternative consists of estimating all variables but one, say, $Z_{i_0}(\mathbf{u})$,
using a vector cokriging system of type (6.24). The $z_{i_0}$-value at $\mathbf{u}$ is then
computed as the constraint value minus the sum of remaining attribute
estimates at that location:

$$z^*_{i_0}(\mathbf{u}) = p - \sum_{i \ne i_0} z^*_i(\mathbf{u})$$
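Both ways of honoring a constant-sum constraint can be sketched in a few lines. The estimates below are hypothetical, and the function names `rescale_to_constant` and `deduce_last` are introduced here for illustration only.

```python
import numpy as np

def rescale_to_constant(estimates, p=1.0):
    """Divide each estimate by their sum at the location, then multiply by p."""
    estimates = np.asarray(estimates, dtype=float)
    return p * estimates / estimates.sum()

def deduce_last(estimates_but_one, p=1.0):
    """Constraint value minus the sum of the remaining attribute estimates."""
    return p - np.sum(estimates_but_one)

raw = [0.52, 0.31, 0.21]            # hypothetical estimates at one location
closed = rescale_to_constant(raw)   # rescaled estimates, summing to p = 1
last = deduce_last(raw[:2])         # third value deduced from the first two
```

The first option changes every estimate slightly; the second leaves the retained estimates untouched and puts the whole adjustment on the variable that was left out.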

The matrix of cokriging weights and the matrix of data-to-unknown auto and
cross covariances are

where $\mathbf{C}(\mathbf{u}_\alpha - \mathbf{u}_\beta) = [C_{ij}(\mathbf{u}_\alpha - \mathbf{u}_\beta)]$ is the $N_v \times N_v$ matrix of auto and
cross covariances between any two variables $Z_i$ and $Z_j$ at $\mathbf{u}_\alpha$ and $\mathbf{u}_\beta$, and
$\mathbf{C}(\mathbf{u}_\alpha - \mathbf{u})$ is the $N_v \times N_v$ matrix of auto and cross covariances between any
two variables $Z_i$ and $Z_j$ at $\mathbf{u}_\alpha$ and $\mathbf{u}$. The simple cokriging system (6.24) has
the same form as the simple kriging system (5.10) except that the entries are
matrices instead of scalars.
The error variances associated with the $N_v$ simple cokriging estimators
are the diagonal elements of the matrix:

Simple kriging versus simple cokriging

A large cokriging system must be solved.
It is then worth looking at the benefit one may gain from the additional
modeling and computational effort necessitated by cokriging.

In some situations, the variables involved in cokriging are linearly dependent.
For example, the $z_i$-values may sum to a constant $p$; e.g., relative concentrations
of geochemical elements sum to one. Knowledge of any $(N_v - 1)$
attribute values at $\mathbf{u}$ allows one to deduce exactly the value of the remaining
attribute $z_{i_0}(\mathbf{u}) = p - \sum_{i \ne i_0} z_i(\mathbf{u})$. In such a case, beware that:

Theoretical advantage of cokriging
The cokriging estimator is theoretically better since its error variance is
always smaller than or equal to the error variance of kriging which ignores
all secondary information:

In the isotopic case, another advantage of cokriging is that the cokriging
estimator of a sum of $K$ variables, say, $Y(\mathbf{u}) = \sum_{k=1}^{K} Z_k(\mathbf{u})$, is equal to the sum
of the cokriging estimators of the $K$ components $Z_k(\mathbf{u})$ (Matheron, 1979):

provided the $(K + 1)$ cokriging estimators $Y^*_{CK}(\mathbf{u})$ and $Z^*_{k,CK}(\mathbf{u})$ are built
from the same set of $z_i$-data. For example, the variable $Y(\mathbf{u})$ may be the
dots) and four secondary data values (open circles) all located on a circle of
unit radius away from $\mathbf{u}$. Simple cokriging weights are computed using the
following coregionalization model:

Both direct semivariogram models are bounded and of unit sill, hence the
sill of the cross semivariogram model is the correlation coefficient between
primary and secondary variables; see relation (3.39).

Figure 6.9: Relative contribution of secondary data (open circles) versus primary
data (black dots) for the simple cokriging estimation of the primary attribute at
location $\mathbf{u}$. That contribution, measured by the ratio $\psi(\mathbf{u})$ of secondary to primary
cokriging weights (absolute values), increases as the two variables are better
correlated and as the relative nugget effect on the primary semivariogram model
increases.

(4) Sampling density
Figure 6.11 (top graph) shows an isotopic data configuration where both
primary and secondary variables are measured at four locations one unit
distance away from the location $\mathbf{u}$ to be estimated. Two heterotopic data
configurations are also considered:

[Figure 6.10 panel: screening variable: secondary; horizontal axis: azimuth angle $\theta$.]

For a zero azimuth angle, the ratio $\psi(\mathbf{u})$ now exceeds four (Figure 6.10, right
bottom graph). This large ratio value indicates a screening of the primary
data by the colocated secondary information. A small shift ($\theta = 6°$) of the
secondary data drastically reduces that screening effect.

Figure 6.12 shows the simple kriging (dashed line) and cokriging (solid line)
estimates of Cd concentrations along the NE-SW transect for three sampling
densities:

- Ten pairs of colocated Cd and Ni data plus six additional Ni data
(undersampled case 1)

- Six pairs of colocated Cd and Ni data plus ten additional Ni data
(undersampled case 2)

The kriging and cokriging estimates are essentially the same in the isotopic
case. Differences between estimates increase as secondary data become more
numerous. When only six Cd values are available (undersampled case 2),
the simple kriging estimates are close to the stationary mean. Accounting
for secondary nickel data allows a better resolution of the cokriging estimates.

[Figure 6.11: isotopic case and heterotopic configurations 1 and 2; horizontal axis: relative nugget effect $b_0$.]

[Figure 6.12 panels: Cd estimates for the equally sampled case (Cd = 10, Ni = 10), undersampled case 1 (Cd = 10, Ni = 16), and undersampled case 2 (Cd = 6, Ni = 16); SCK and SK estimates versus distance (km).]

Figure 6.12: The difference between simple cokriging (solid line) and simple kriging
(dashed line) estimates of Cd concentrations increases as the primary variable is
more undersampled. Nickel data locations increase from 10 (upper graph) to 16
(cases 1 and 2). Cd data locations decrease from 10 (two upper graphs) to 6 (lower
graph).
6.2.3 Ordinary cokriging

where the two Lagrange parameters $\mu_1^{OCK}(\mathbf{u})$ and $\mu_2^{OCK}(\mathbf{u})$ account for the
two unbiasedness constraints. The Lagrangian $L(\mathbf{u})$ is minimized by setting
to zero its partial first derivatives with respect to the $(n_1(\mathbf{u}) + n_2(\mathbf{u}) + 2)$
data weights and Lagrange parameters:

where primary and secondary means are deemed constant within each search
neighborhood $W(\mathbf{u})$. The weights $\lambda_{m_1}(\mathbf{u})$ and $\lambda_{m_2}(\mathbf{u})$ are

If the local means $m_1(\mathbf{u})$ and $m_2(\mathbf{u})$ are actually unknown, they can be
filtered from the linear estimator by setting their respective weights to zero,
i.e., by constraining the primary data weights $\lambda_{\alpha_1}(\mathbf{u})$ to sum to one and
the secondary data weights $\lambda_{\alpha_2}(\mathbf{u})$ to sum to zero. The ordinary cokriging
estimator is then written

$$Z^*_{OCK}(\mathbf{u}) = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{OCK}_{\alpha_1}(\mathbf{u})\, Z_1(\mathbf{u}_{\alpha_1}) + \sum_{\alpha_2=1}^{n_2(\mathbf{u})} \lambda^{OCK}_{\alpha_2}(\mathbf{u})\, Z_2(\mathbf{u}_{\alpha_2}) \qquad (6.27)$$

The ordinary cokriging estimator (6.27) is unbiased:
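The effect of the two constraints on the weights can be checked with a small numerical sketch. It is illustrative only: the exponential covariance model, the sill matrix `B`, and the data locations are hypothetical, the sign convention of the Lagrange terms is an assumption, and `cov` and `ock_weights` are names introduced here.

```python
import numpy as np

def cov(h, sill, a=1.0):
    """Exponential covariance sill * exp(-3|h|/a) (illustrative choice)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def ock_weights(x1, x2, u, B):
    """Ordinary cokriging weights under the two constraints:
    primary weights sum to one, secondary weights sum to zero."""
    n1, n2 = len(x1), len(x2)
    x = np.concatenate([x1, x2])
    var = np.array([0] * n1 + [1] * n2)          # variable index of each datum
    K = cov(x[:, None] - x[None, :], B[var[:, None], var[None, :]])
    k = cov(x - u, B[var, 0])                    # covariances with Z_1(u)
    F = np.zeros((n1 + n2, 2))
    F[np.arange(n1 + n2), var] = 1.0             # one indicator column per variable
    A = np.block([[K, F], [F.T, np.zeros((2, 2))]])
    b = np.concatenate([k, [1.0, 0.0]])          # constraint values: 1 and 0
    sol = np.linalg.solve(A, b)
    return sol[:n1], sol[n1:n1 + n2], sol[n1 + n2:]   # weights, Lagrange terms

x1 = np.array([0.0, 0.4, 1.1])            # hypothetical primary data locations
x2 = np.array([0.2, 0.7, 0.9, 1.5])       # hypothetical secondary data locations
B = np.array([[0.8, 1.5], [1.5, 9.0]])    # hypothetical sills, positive definite
lam1, lam2, mu = ock_weights(x1, x2, 0.6, B)
exact1, exact2, _ = ock_weights(x1, x2, x1[1], B)   # estimate at a primary datum
```

Because the secondary weights sum to zero, some of them are necessarily negative or all are zero; at a primary data location the whole weight goes to that datum.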

Minimization of the error variance of type (6.12) under the two constraints
(6.28) calls for the definition of a Lagrangian $L(\mathbf{u})$ similar to that
introduced for ordinary kriging in section 5.3:

Similarly, for the case of two or more secondary variables, the ordinary cokriging
estimator is written

with the $N_v$ unbiasedness constraints:

Minimizing the estimation variance under these non-bias constraints leads to
the following system of $\left(\sum_{i=1}^{N_v} n_i(\mathbf{u}) + N_v\right)$ linear equations:

where $\mathbf{I}$ is the $N_v \times N_v$ identity matrix, and $\mathbf{M}(\mathbf{u})$ is the $N_v \times N_v$ matrix of
Lagrange parameters.

with $\delta_{i1} = 1$ for $i = 1$ and $\delta_{i1} = 0$ otherwise. The cokriging variance is

Matrix notation
Using the matrix formulation introduced in section 5.2.2, the ordinary cokriging
system (6.29) is written as

where $\mathbf{1}_i$ and $\mathbf{0}_i$ are $n_i(\mathbf{u}) \times 1$ unit and null vectors, respectively. The
generalization to the case $N_v \ge 3$ is straightforward.
In the equally sampled case, the vector of ordinary cokriging estimators
is written

$$\mathbf{Z}^*_{OCK}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \left[\boldsymbol{\Lambda}^{OCK}_\alpha(\mathbf{u})\right]^T \mathbf{Z}(\mathbf{u}_\alpha)$$

where $\mathbf{Z}^*_{OCK}(\mathbf{u}) = [Z^*_{1,OCK}(\mathbf{u}), \ldots, Z^*_{N_v,OCK}(\mathbf{u})]^T$ is the vector of cokriging
estimators. As for simple cokriging, the $N_v \times N_v$ matrices of cokriging weights,
$\boldsymbol{\Lambda}^{OCK}_\alpha(\mathbf{u}) = [\lambda^{OCK}_{\alpha ij}(\mathbf{u})]$, are obtained by solving a single cokriging system

Semivariogram notation
The cokriging system (6.32) can be expressed in terms of direct and cross
semivariograms provided the cross covariance models between primary and
secondary variables are symmetric, i.e., $C_{ij}(\mathbf{h}) = C_{ji}(\mathbf{h}) \;\; \forall\, i \ne j$ (recall
discussion about the lag effect in section 3.2.3). Accounting for the relations
$C_{ij}(\mathbf{h}) = C_{ij}(0) - \gamma_{ij}(\mathbf{h})$, $i, j = 1, 2$, the ordinary cokriging system is written

Unlike ordinary cokriging, a simple cokriging system can be expressed
only in terms of auto and cross covariances.
It is good practice to rescale covariance values such that elements of the
covariance matrix in system (6.33) are of the same order of magnitude. The
standardized form of the ordinary cokriging estimator (6.27) is

The corresponding cokriging weights $\nu^{OCK}_{\alpha_i}(\mathbf{u})$ are obtained by solving the
ordinary cokriging system expressed in terms of correlograms.
Like simple cokriging weights, OCK weights of standardized and original
variables are different, although the cokriging estimates are the same:

A single unbiasedness constraint
In some cases, primary and secondary data relate to the same attribute.
For example, a few precise laboratory measurements of pH or clay content
are supplemented by more numerous field data collected using cheaper
measurement devices. Measurement errors are likely to be larger for field data,
and their semivariogram is likely to have a larger relative nugget effect. To
account for such a difference in the patterns of spatial continuity, precise
laboratory measurements and less precise field data are weighted differently
through cokriging.
Provided both measurement processes are unbiased and the two primary
and secondary means within the search neighborhood $W(\mathbf{u})$ are deemed
equal, the linear estimator (6.10) is written

The unknown mean $m(\mathbf{u})$ is filtered from that estimator by forcing all primary
and secondary data weights to sum to one (Journel and Huijbregts, 1978,
p. 325). The corresponding expression of the ordinary cokriging estimator is

Like ordinary kriging, ordinary cokriging amounts to:

1. estimating the local primary and secondary means, say, $m^*_{1,OCK}(\mathbf{u})$ and
$m^*_{2,OCK}(\mathbf{u})$ for the case $N_v = 2$, at each location $\mathbf{u}$ using both primary
and secondary data specific to that neighborhood, then

2. applying the simple cokriging estimator (6.11) using these estimates of
the means rather than the stationary means $m_1$ and $m_2$:

Accounting for expression (6.22) of the simple cokriging estimator, one
can deduce the following relation between the two estimators:
As in the single attribute case, discrepancies between simple and ordinary
cokriging estimators are related to the estimated primary and secondary local
means being different from the stationary values $m_1$ and $m_2$.
The local means $m^*_{1,OCK}(\mathbf{u})$ and $m^*_{2,OCK}(\mathbf{u})$ that are implicitly used in
estimator (6.38) can be explicitly estimated as linear combinations of primary
and secondary data. For a single secondary attribute, the OCK estimator of
the primary local mean $m_1(\mathbf{u})$ is

where $\lambda^{m}_{\alpha_i}(\mathbf{u})$ is the weight assigned to the datum $z_i(\mathbf{u}_{\alpha_i})$. To ensure
unbiasedness, the primary and secondary data weights must meet the two
constraints (6.28).
The cokriging weights for the primary mean are then obtained by solving
a system similar to the ordinary cokriging system (6.29) for attribute values,
except for the right-hand-side covariance terms $C_{i1}(\mathbf{u}_{\alpha_i} - \mathbf{u})$ being set to zero.
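The estimation of a local mean can be sketched by reusing the layout of an ordinary cokriging system and zeroing its right-hand-side covariance terms. This is a minimal sketch under stated assumptions: the exponential covariance model, the sills `B`, and the data are hypothetical; the `target` argument, which swaps the constraint values to estimate the secondary local mean instead, is an assumption added here and is not taken from the text.

```python
import numpy as np

def cov(h, sill, a=1.0):
    """Exponential covariance sill * exp(-3|h|/a) (illustrative choice)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def local_mean_weights(x1, x2, B, target=0):
    """Weights for the local mean of variable `target` (0 primary, 1 secondary).
    Same left-hand side as ordinary cokriging, covariance right-hand side set to zero."""
    n1, n2 = len(x1), len(x2)
    x = np.concatenate([x1, x2])
    var = np.array([0] * n1 + [1] * n2)
    K = cov(x[:, None] - x[None, :], B[var[:, None], var[None, :]])
    F = np.zeros((n1 + n2, 2))
    F[np.arange(n1 + n2), var] = 1.0
    A = np.block([[K, F], [F.T, np.zeros((2, 2))]])
    b = np.zeros(n1 + n2 + 2)
    b[n1 + n2 + target] = 1.0        # weights of the target variable sum to one
    sol = np.linalg.solve(A, b)
    return sol[:n1], sol[n1:n1 + n2]

x1 = np.array([0.0, 0.4, 1.1])           # hypothetical primary data locations
x2 = np.array([0.2, 0.7, 0.9, 1.5])      # hypothetical secondary data locations
B = np.array([[0.8, 1.5], [1.5, 9.0]])
w1, w2 = local_mean_weights(x1, x2, B, target=0)
v1, v2 = local_mean_weights(x1, x2, B, target=1)

# With constant data, the primary mean estimate returns the primary constant.
m1_hat = w1 @ np.full(3, 2.0) + w2 @ np.full(4, 20.0)
```

The weights do not depend on the location being estimated, only on the data retained in the neighborhood, so the estimate stays constant as long as the same data are used.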


Figure 6.13 (two top graphs) shows the ordinary cokriging estimates of Cd
and Ni local means along the NE-SW transect. Like the single attribute case
in Figure 5.3 (middle graph), both trend estimates have a staircase shape,
each step corresponding to estimates based on the same neighboring primary
and secondary data. The number of steps is, however, larger since 16 primary
data locations are here considered instead of only 10 primary data locations
for Figure 5.3. Both trend estimates follow the general increase in Cd and Ni
concentrations along the transect. In contrast, the overall means (horizontal
dashed line) overestimate the local mean in the low-valued (left) part of the
transect and underestimate the local mean in the high-valued (right) part of
the transect. This underestimation is less pronounced for Ni concentrations.
Figure 6.13 (bottom graph) shows the simple (dashed line) and ordinary
(solid line) cokriging estimates of Cd concentrations. Note the following:

- Both interpolators are exact.

- OCK estimates are smaller than SCK estimates in the left part of the
transect where Cd local means are smaller than the overall mean.
Figure 6.13: Estimates of Cd and Ni local means, and Cd concentrations using
ordinary cokriging (solid line) and simple cokriging (dashed line).
estimation of the common local mean of primary and rescaled secondary
variables within each search neighborhood $W(\mathbf{u})$. Thus, local departures from
the overall means are still accounted for as they are in traditional ordinary
cokriging.
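A cokriging with a single constraint can be sketched as follows. The sketch assumes that the secondary data are rescaled by shifting them to the primary mean (secondary value minus secondary mean plus primary mean) and that all weights together sum to one; the covariance model, sills, data, and means are hypothetical, and `rescaled_ock` is a name introduced here.

```python
import numpy as np

def cov(h, sill, a=1.0):
    """Exponential covariance sill * exp(-3|h|/a) (illustrative choice)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def rescaled_ock(x1, z1, x2, z2, u, B, m1, m2):
    """Cokriging with one constraint: all data weights sum to one."""
    n1, n2 = len(x1), len(x2)
    x = np.concatenate([x1, x2])
    var = np.array([0] * n1 + [1] * n2)
    K = cov(x[:, None] - x[None, :], B[var[:, None], var[None, :]])
    k = cov(x - u, B[var, 0])
    ones = np.ones((n1 + n2, 1))
    A = np.block([[K, ones], [ones.T, np.zeros((1, 1))]])
    b = np.append(k, 1.0)
    w = np.linalg.solve(A, b)[:n1 + n2]
    z2_rescaled = z2 - m2 + m1            # secondary data shifted to the primary mean
    return w, w[:n1] @ z1 + w[n1:] @ z2_rescaled

x1 = np.array([0.0, 0.4, 1.1])
z1 = np.array([1.2, 0.9, 2.1])
x2 = np.array([0.2, 0.7, 0.9, 1.5])
z2 = np.array([18.0, 22.0, 25.0, 15.0])
B = np.array([[0.8, 1.5], [1.5, 9.0]])
w, est = rescaled_ock(x1, z1, x2, z2, 0.6, B, m1=1.5, m2=20.0)
_, est_at_datum = rescaled_ock(x1, z1, x2, z2, x1[1], B, m1=1.5, m2=20.0)
```

Since the secondary weights are no longer forced to sum to zero, they are free to be all positive, which is the motivation given for the single constraint.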
1. Some of the secondary data weights are negative, thereby increasing
the risk of getting unacceptable estimates; see related discussion in
section 5.8.

To reduce the occurrence of negative weights and avoid artificially limiting
the impact of secondary data, Isaaks and Srivastava (1989, p. 416) proposed
using a single constraint of type (6.36); for example, for a single secondary
attribute:

Like simple or ordinary cokriging, it is good practice to solve the cokriging
system (6.37) in terms of correlograms when the variances of primary
and secondary variables differ by several orders of magnitude. The cokriging
weights that are provided by the cokriging systems written in terms of
covariances or correlograms are not linearly related, hence the resulting cokriging
estimators are slightly different (Goovaerts, 1997).

6.2.5 Principal component kriging

where $\mathbf{Q}$ is the orthogonal matrix of eigenvectors of the matrix $\mathbf{R}$,
$\boldsymbol{\Lambda} = [\lambda_i]$ is the diagonal matrix of eigenvalues, and $\mathbf{I}$ is the $N_v \times N_v$
identity matrix.
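The decomposition can be illustrated with a short sketch, assuming that $\mathbf{R}$ is the correlation matrix of the standardized variables at lag zero and that the relation in question is the usual spectral decomposition of a symmetric matrix. The synthetic data below are hypothetical and serve only to produce three correlated attributes.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical colocated data for three correlated attributes.
base = rng.normal(size=(200, 1))
Z = np.hstack([base + 0.5 * rng.normal(size=(200, 1)) for _ in range(3)])

Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)   # standardized variables
R = np.corrcoef(Z, rowvar=False)                    # correlation matrix at h = 0
eigvals, Q = np.linalg.eigh(R)                      # R = Q diag(eigvals) Q^T
Y = Zs @ Q                                          # principal components
```

The principal components are uncorrelated at lag zero, with variances equal to the eigenvalues; nothing in this sketch guarantees that they remain uncorrelated at other lags.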
Colocated simple cokriging
The colocated simple cokriging estimator of the primary attribute $z_1$ at
location $\mathbf{u}$ is

$$Z^*_{SCK}(\mathbf{u}) = \sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{SCK}_{\alpha_1}(\mathbf{u})\, [Z_1(\mathbf{u}_{\alpha_1}) - m_1] + \lambda^{SCK}_{2}(\mathbf{u})\, [Z_2(\mathbf{u}) - m_2] + m_1 \qquad (6.43)$$

$$\sum_{\alpha_1=1}^{n_1(\mathbf{u})} \lambda^{OCK}_{\alpha_1}(\mathbf{u}) + \lambda^{OCK}_{2}(\mathbf{u}) = 1 \qquad (6.46)$$
Both primary and secondary variables should be standardized to a zero
mean and unit variance when their variances are significantly different. The
cokriging weights would then be solutions of system (6.46) expressed in terms
of correlograms.
Unlike the simple cokriging system (6.15), the colocated simple cokriging
system does not call for the covariance between secondary data for $|\mathbf{h}| > 0$,
which alleviates the inference and modeling effort (see subsequent discussion
related to the Markov model).
Unlike full cokriging, colocated cokriging calls for the inference and modeling
only of the primary covariance function $C_{11}(\mathbf{h})$ and the cross covariance
function $C_{12}(\mathbf{h})$. The modeling effort can be further alleviated by the following
approximation:

$$C_{12}(\mathbf{h}) \simeq \frac{C_{12}(0)}{C_{11}(0)}\, C_{11}(\mathbf{h})$$

or, in terms of correlograms,

This correlogram model corresponds to the linear regression model:

where the residual $R(\mathbf{u})$ is assumed orthogonal to $Z_1(\mathbf{u} + \mathbf{h})$, $\forall\, \mathbf{h}$. The linear
regression (6.48) is not a requisite for model (6.47); rather equation (6.48)
should be read as the definition of the residual $R(\mathbf{u})$.
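The Markov-type approximation and the resulting colocated system can be sketched as follows. This is an illustration under stated assumptions, not the book's code: the primary covariance `c11` is a hypothetical exponential model, the values passed for the colocated cross covariance and the secondary variance are made up, and the function names are introduced here.

```python
import numpy as np

def c11(h, sill=0.8, a=1.0):
    """Primary covariance model (illustrative exponential model)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def markov_c12(h, c12_0):
    """Cross covariance implied by the Markov-type approximation."""
    return c12_0 / c11(0.0) * c11(h)

def colocated_sck_weights(x1, u, c12_0, c22_0):
    """Weights of the n primary data and of the single colocated secondary datum."""
    n = len(x1)
    K = np.empty((n + 1, n + 1))
    K[:n, :n] = c11(x1[:, None] - x1[None, :])
    K[:n, n] = K[n, :n] = markov_c12(x1 - u, c12_0)   # primary data versus Z_2(u)
    K[n, n] = c22_0                                   # only C_22(0) is needed
    k = np.append(c11(x1 - u), c12_0)                 # covariances with Z_1(u)
    return np.linalg.solve(K, k)

x1 = np.array([0.0, 0.4, 1.1])                        # hypothetical data locations
w = colocated_sck_weights(x1, 0.6, c12_0=1.5, c22_0=9.0)
w_uncorr = colocated_sck_weights(x1, 0.6, c12_0=0.0, c22_0=9.0)
```

No secondary covariance at non-zero lag enters the system, which is the saving in inference the text refers to; with a zero colocated cross covariance the secondary datum receives no weight.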
A sufficient but not necessary condition for model (6.47) to hold is the
independence relation:

In words, dependence of the secondary variable on the primary is limited to
the colocated primary datum, a Markov-type assumption.

- It does not lead necessarily to a linear model of coregionalization if only
because it does not specify the autocovariance function of the secondary
variable.
Proof
For conciseness of notation, assume that the two RFs $Z_1(\mathbf{u})$ and $Z_2(\mathbf{u})$ are
standardized and that there is no lag effect (see section 3.2.3). The cross
covariance is

Figure 6.15 shows the experimental Cd semivariogram and the experimental
cross semivariograms for the pairs cadmium-zinc and cadmium-nickel. The
two cross semivariogram models are computed from the Cd semivariogram
model (5.15) using relation (6.50). Whereas the fit is satisfactory for the pair
Cd-Zn, the Markov approximation underestimates the long-range structure
for the pair Cd-Ni. In the latter case, a linear model of coregionalization is
preferred (see Figure 4.18, page 121).

$$C_{12}(\mathbf{h}) = E\{Z_1(\mathbf{u})\, Z_2(\mathbf{u}+\mathbf{h})\} = \iint z\, z'\, f_{12}(\mathbf{h}; z, z')\, dz\, dz'$$

where $f_{12}(\mathbf{h}; z, z')$ is the bivariate pdf of $Z_1(\mathbf{u})$ and $Z_2(\mathbf{u} + \mathbf{h})$,

$$= E\{Z_1(\mathbf{u})\, E[Z_2(\mathbf{u}+\mathbf{h}) \mid Z_1(\mathbf{u}+\mathbf{h})]\}$$

if relation (6.49) holds true,

$$= \rho_{12}(0)\, E\{Z_1(\mathbf{u})\, Z_1(\mathbf{u}+\mathbf{h})\} = \rho_{12}(0)\, \rho_{11}(\mathbf{h})$$

if the regression of $Z_2(\mathbf{u})$ on $Z_1(\mathbf{u})$ is linear, as in relation (6.48).

Colocated cokriging versus full cokriging
In the presence of densely sampled secondary information, colocated cokriging
is a valuable alternative to full cokriging for these reasons:

1. Colocated cokriging avoids instability caused by highly redundant
secondary data.

2. It is fast, since it calls for a smaller cokriging system.

4. It does not require modeling the cross covariance function $C_{12}(\mathbf{h})$ as
long as the Markov approximation is reasonable.
and is readily extended to several secondary variables. This model is very
congenial in that only the semivariogram $\gamma_{11}(\mathbf{h})$ need be modeled.
The Markov-type model has the following characteristics:

Figure 6.15: Markov model for the pairs cadmium-zinc and cadmium-nickel; the
two cross semivariogram models (solid line) are rescalings of the Cd semivariogram
model using the Markov model (6.50). The model fits poorly for the pair cadmium-
nickel.
The trade-off costs are as follows:

2. The rescaling of variables calls for knowledge of the stationary means
of primary and secondary variables.

3. The information provided by secondary data beyond the colocated
secondary datum $z_2(\mathbf{u})$ is ignored.

- The inference of the residual covariance required by kriging with an
external drift is not straightforward. Modeling direct and cross semivariograms
is straightforward though demanding. It is even less demanding
if the Markov approximation (6.47) is appropriate.

6.2.7 Accounting for soft information

1. Constraint intervals $(z_k, z_{k+1}]$. These indicate that the primary
attribute is valued between $z_k$ and $z_{k+1}$. Interval-type data are typically
provided by inexpensive measurement devices, such as colorimetric paper
for measuring concentrations.

2. Indicators of occurrence of a particular facies or rock category. A
calibration such as that in section 2.1.2 then yields, for each category, a
specific conditional distribution of the primary attribute.

Both types of data are referred to as soft rather than secondary because they
relate directly to the primary attribute value. Precise z-measurements are
called hard data. Soft data, though imprecise, are usually more numerous
than hard data and hence are worth considering. Both hard and soft data
can be combined using the cokriging formalism.

Consider the situation where interval-type data are collected at $n'$ locations
$\mathbf{u}'_\alpha$. More elaborate measurements are conducted at $n \ll n'$ locations,

Figure 6.16: Ordinary cokriging estimates of Cd concentrations using the five
closest Ni block estimates (full cokriging, dashed line) or the single colocated Ni
block estimate (colocated cokriging, solid line).

6 The primary attribute is denoted z in this section since no other continuous attribute
is considered.
the estimated OCK value at 4.75 km (undersampled case 1) exceeds
the critical threshold (horizontal dashed line), although the colocated
soft datum indicates that particular location as safe. At that location,
the large neighboring hard data prevail on the soft (possibly inaccurate)
colocated datum. Indicator kriging algorithms introduced in section 7.4.2
will allow both hard data and constraint interval data to be
honored provided they are consistent.

- Accounting for indicator data yields smaller Cd estimates at locations
that are deemed safe than ordinary kriging using only hard data would.
Discrepancies between kriging and cokriging estimates increase as hard
data become sparser (undersampled case 2).

[Figure 6.17 panels: Cd indicator data along the transect (threshold $z_c$ = 0.8 ppm; + location contaminated, - location safe); semivariograms of Cd data, indicator data, and Cd-indicator data; horizontal axis: distance (km).]

Consider a situation where primary data $\{z(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$ are supplemented
by exhaustively sampled secondary categorical information $\{s(\mathbf{u}),
\forall\, \mathbf{u} \in A\}$. The categorical attribute $s$ can take $K$ different states, say, $K$
soil types or facies. For each state $s_k$, the proportion of z-values not exceeding
a particular threshold $z_c$ can be computed from those locations $\mathbf{u}_\alpha$
where both primary and secondary variables are known; see relation (2.10).
Such calibration of soft data allows one to associate with each location $\mathbf{u}$ a
conditional probability of type

$$y(\mathbf{u}; z_c) = \text{Prob}\{Z(\mathbf{u}) \le z_c \mid S(\mathbf{u}) = s_k\} = F(z_c \mid s_k) \in [0, 1] \qquad (6.54)$$

Where a hard datum $z(\mathbf{u}_\alpha)$ is available, the soft datum $y(\mathbf{u}_\alpha; z_c)$ is either zero
or one, depending on whether the z-measurement exceeds the threshold $z_c$.
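The calibration step can be sketched in a few lines. The hard data, category codes, and threshold below are hypothetical, and `calibrate` is a name introduced here; the sketch assumes every category contains at least one hard datum.

```python
import numpy as np

def calibrate(z_hard, s_hard, z_c, n_states):
    """Proportion of hard z-values not exceeding z_c within each category s_k."""
    z_hard = np.asarray(z_hard, dtype=float)
    s_hard = np.asarray(s_hard)
    return np.array([np.mean(z_hard[s_hard == k] <= z_c) for k in range(n_states)])

# Hypothetical hard data with their category codes (0, 1, 2).
z_hard = [0.5, 0.9, 0.7, 1.2, 0.3, 2.0]
s_hard = [0, 0, 1, 1, 1, 2]
F = calibrate(z_hard, s_hard, z_c=0.8, n_states=3)

# Soft data at any location follow from the category observed there.
s_everywhere = np.array([0, 1, 2, 1, 0])
y_soft = F[s_everywhere]

# Where a hard datum exists, the soft datum reduces to a zero/one indicator.
y_at_hard = (np.asarray(z_hard) <= 0.8).astype(float)
```

All locations falling in the same category receive the same soft value, which is why such soft data tend to vary more smoothly in space than the hard data.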
As with interval-type data, conditional probabilities of type (6.54) can
be accounted for using the cokriging formalism. When the soft information
relates to categorical variables, such as rock types or soil associations, the
corresponding soft data are likely to be more continuous than hard data because
all y-data in the same class $s_k$ are constant. To avoid numerical problems
caused by such highly redundant secondary information, the colocated
standardized ordinary cokriging estimator is preferred:

$$\frac{Z^*(\mathbf{u}) - m_Z}{\sigma_Z} = \sum_{\alpha=1}^{n(\mathbf{u})} \nu_\alpha(\mathbf{u}; z_c) \left[\frac{Z(\mathbf{u}_\alpha) - m_Z}{\sigma_Z}\right] + \nu_Y(\mathbf{u}; z_c) \left[\frac{Y(\mathbf{u}; z_c) - m_Y(z_c)}{\sigma_Y(z_c)}\right] \qquad (6.55)$$
where the mean $m_Y(z_c)$ of variable $Y(\mathbf{u}; z_c)$ is the marginal probability that
attribute $z$ does not exceed the critical threshold $z_c$ calculated across all $K$
categories $s_k$. The variance $\sigma_Y^2(z_c)$ of the indicator variable is then
$m_Y(z_c)[1 - m_Y(z_c)]$. Note that both the estimator and the cokriging weights
depend on the threshold $z_c$.
The estimator (6.55) is unbiased provided all cokriging weights sum to
one:

The cokriging weights are obtained by solving the following system of $(n(\mathbf{u}) + 2)$
linear equations:

Figure 6.17: Accounting for soft information in the estimation of a continuous
attribute. Soft data consist of 16 indicators of whether the tolerable maximum
0.8 ppm for Cd concentrations has been exceeded. Both hard Cd data and soft

[Figure 6.18, top panel: probabilities of Cd not exceeding 0.8 ppm.]
(6.56)
where $\rho_{ZY}(\mathbf{h}; z_c)$ is the cross correlogram between hard and soft data at
threshold $z_c$.

Example
Using the conditional frequencies given in Table 2.5 (page 19), the geologic
profile in Figure 6.1 (middle graph) is converted into a profile of
probabilities of not exceeding the critical threshold for Cd concentrations
(Figure 6.18, top graph). The largest probability is observed on Argovian
rocks (1.25-2.75 km), whereas there is a zero probability of not exceeding
the critical threshold on Portlandian rocks (4.2-4.8 km). Next, the 259 geologic
data $s(\mathbf{u}_\alpha)$ available across the study area are converted into soft data
(conditional probabilities) $y(\mathbf{u}_\alpha; z_c)$. Figure 6.18 (second row) shows the
standardized semivariograms of soft data and Cd concentrations. The correlation
between hard and soft data is much stronger along the NE-SW transect (10
locations) than over the study area (259 locations). Rather than modeling
the experimental cross semivariogram of 259 data values depicting a negligible
hard-soft correlation, for the sole purpose of this example a synthetic
cross semivariogram model is built as a spherical model of range 1.2 km with
a sill corresponding to a correlation coefficient $\rho_{ZY}(0; z_c)$ of 0.4. The linear
model of coregionalization between hard and soft data is depicted by the solid
line in Figure 6.18 (second row).
Figure 6.18 (two bottom graphs) shows the standardized cokriging (OCK)
estimates (solid line) of Cd concentrations using the five closest Cd data and
the colocated soft y-datum. The dashed line depicts ordinary kriging
estimates using only the Cd data.

[Figure 6.18 panels: semivariograms of Cd data, soft data, and Cd-soft data; Cd estimates for undersampled cases 1 and 2 (OCK and OK) versus distance (km).]
Cokriging algorithms are used to estimate metals with widespread contamination (Cd, Cu, Pb) and cobalt at 100 test locations. Table 6.2 gives, for each metal, the set of secondary metals retained in cokriging. Secondary information is available at 259 primary data locations (isotopic case) or at 259 primary data locations plus 100 test locations (heterotopic case). For both sampling densities, the 16 closest data locations of each primary/secondary variable are retained: $n_i(\mathbf{u}) = 16 \;\forall\, i$. In the heterotopic case, the situation where only the colocated secondary data are retained is also considered: $n_1(\mathbf{u}) = 16$ and $n_i(\mathbf{u}) = 1 \;\forall\, i > 1$. For the two sampling densities and the colocated case, two cokriging estimators, ordinary cokriging and standardized ordinary cokriging, are compared to the reference ordinary kriging estimator. The direct and cross semivariograms are modeled using the iterative procedure described in Appendix A. For example, Figures 4.18 and 4.10 show the linear model of coregionalization for cadmium, copper, and lead.
Figure 6.19 (left column) shows the rank correlation coefficient between true values and reference ordinary kriging (OK) estimates, then cokriging (CK) estimates for the isotopic (CKiso), heterotopic (CKhet), and colocated case (CKcol). The corresponding mean absolute errors are displayed in the right column. Note the following:

• Kriging and cokriging scores are similar in the isotopic case. As discussed in section 6.2.2, primary data tend to screen the influence of colocated secondary data. Hence secondary information contributes little to the cokriging estimate when all metals are equally sampled.
Table 6.2: Secondary variables used to estimate primary metals at 100 test locations.

Primary variable    Secondary variables

Figure 6.19: Rank correlation coefficients between true metal concentrations and cokriging estimates at 100 test locations (left column), and mean absolute errors (right column). Two cokriging estimators (ordinary cokriging, standardized ordinary cokriging) and three data configurations (isotopic, heterotopic, and colocated) are considered.
• Accounting for better sampled secondary metals (heterotopic case) significantly increases the rank correlation between true values and estimates, and reduces the mean absolute error. Traditional and standardized ordinary cokriging estimators perform equally.

• Retaining only the colocated secondary data (colocated cokriging) causes only a slight reduction of the cokriging performances.

Each test location is classified as contaminated or safe, depending on whether the cokriging estimate exceeds or not the critical threshold. Figure 6.20 shows the percentages of locations that are wrongly declared safe or contaminated using different estimators and data configurations. As with the rank correlation coefficients and the mean absolute errors, cokriging improves over kriging only when secondary metals are better sampled than the primary metal. Misclassification for Cd and Cu is further reduced by using a single unbiasedness constraint (dashed line) rather than the traditional constraints that call for the secondary data weights to sum to zero (solid line). Retaining only the colocated secondary data slightly reduces cokriging performances for cadmium in the sense that it increases the percentage of locations misclassified.

Figure 6.20: Proportion of test locations that are wrongly declared safe or contaminated according to cokriging estimates; as in Figure 6.19, two cokriging estimators and three data configurations are considered.

6.2.9 Multivariate factorial kriging

As mentioned in section 5.6, many physical processes related to geology or human activities control the spatial distribution of metal concentrations over the study area. Observed relations between concentrations of different metals might thus be connected with the occurrence of common sources of soil contamination. For example, Figure 5.13 (middle graphs) showed that nickel and cobalt concentrations have common regional features linked to the spatial distribution of rock types. The strong relation is also depicted by the scattergram of Ni and Co regional components in Figure 6.21 (right top graph). The two bottom scattergrams of Figure 6.21 show weaker relations for local and micro-scale spatial components of both metals, which suggests that destructuring processes operate over shorter distances. Micro-scale variations arise partly from measurement errors that could be independent from one metal to another. The original nickel and cobalt concentrations result from a combination of all three different scales of variation, and their scattergram shown in Figure 6.21 (left top graph) depicts a relation intermediate between those observed at local and regional scales.

[Figure 6.21: scattergrams of nickel versus cobalt for the original concentrations (left top), the regional components (right top), and the local and micro-scale components (bottom).]

The relation between cobalt and nickel is said to be scale-dependent because it changes as a function of the spatial scale considered. Such attention to scale-dependence may enhance a relation between variables that is otherwise blurred in an approach where all different sources of variation are mixed, leading to a better understanding of the physical underlying mechanisms controlling spatial patterns (Goovaerts and Webster, 1994). Multivariate factorial kriging, also called factorial kriging analysis (Matheron, 1982; Wackernagel, 1988, 1995, p. 160-165), allows one to analyze relations between variables at the spatial scales detected and modeled from experimental semivariograms. The technique has been applied in various areas, such as geochemistry (Sandjivy, 1984; Bourgault and Marcotte, 1991; Wackernagel and Sanguinetti, 1993), soil science (Wackernagel et al., 1988; Goovaerts, 1992, 1994d; Goulard and Voltz, 1992), hydrogeology (Rouhani and Wackernagel, 1990; Goovaerts et al., 1993), mining (Sousa, 1989), and image processing (Daly et al., 1989).

Like factorial kriging, which is based on the linear model of regionalization (4.27), multivariate factorial kriging is based on the specific linear model of coregionalization (4.35) fitted to the experimental auto and cross covariance functions:

$$C_{ij}(\mathbf{h}) = \sum_{l=0}^{L} b^l_{ij}\, c_l(\mathbf{h}) \qquad \forall\, i, j$$

Under that particular model, each RF $Z_i(\mathbf{u})$ can be interpreted as a linear combination of independent RFs $Y^l_k(\mathbf{u})$, each with zero mean and basic covariance function $c_l(\mathbf{h})$:

Because the RFs $Y^l_k(\mathbf{u})$ are, by construction, mutually orthogonal, that cross covariance reduces to

At $|\mathbf{h}| = 0$, the value of each basic covariance model $c_l(\mathbf{h})$ is 1. Thus, the coefficient $b^l_{ij}$ is either the covariance at $|\mathbf{h}| = 0$ between spatial components (case $i \neq j$) or the variance of the spatial component $Z^l_i(\mathbf{u})$ (case $j = i$). For each covariance model $c_l(\mathbf{h})$, the coefficients $b^l_{ij}$ can be arranged into an $N_v \times N_v$ coregionalization matrix $\mathbf{B}_l$:

$$\mathbf{B}_l = \left[ b^l_{ij} \right] = \begin{bmatrix} b^l_{11} & \cdots & b^l_{1 N_v} \\ \vdots & \ddots & \vdots \\ b^l_{N_v 1} & \cdots & b^l_{N_v N_v} \end{bmatrix}$$

Accounting for relation (6.58), the matrix $\mathbf{B}_l$ is the variance-covariance matrix of the $N_v$ spatial components $Z^l_i(\mathbf{u})$. The linear correlation between any two spatial components $Z^l_i(\mathbf{u})$ and $Z^l_j(\mathbf{u})$ is then measured by the structural correlation coefficient defined as

$$\rho^l_{ij} = \frac{b^l_{ij}}{\sqrt{b^l_{ii}\, b^l_{jj}}}$$

The matrix of structural correlation coefficients is denoted $\mathbf{R}_l = [\rho^l_{ij}]$.

In section 4.3.2, under second-order stationarity, the $(L + 1)$ coregionalization matrices were shown to sum to the covariance function matrix at lag zero:

$$\mathbf{C}(0) = \sum_{l=0}^{L} \mathbf{B}_l \qquad (6.59)$$

Multivariate analysis (Anderson, 1958) is traditionally conducted on the variance-covariance matrix $\mathbf{C}(0)$ or its standardized form, the correlation matrix $\mathbf{R}(0)$, thus ignoring the data coordinates. Factorial kriging accounts for the regionalized nature of variables by analyzing each coregionalization matrix $\mathbf{B}_l$ or matrix of structural correlation coefficients $\mathbf{R}_l$ separately. By so doing, each correlation structure is distinguished by filtering the structures belonging to other scales of spatial variation.
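The relation between coregionalization matrices, structural correlation coefficients, and the lag-zero covariance matrix can be illustrated numerically. The sketch below is only an illustration: the two-variable matrices `B` are hypothetical values chosen so that each is positive semidefinite and the variables have unit variance, and the helper name `structural_correlation` is an assumption, not notation from the text.

```python
import numpy as np

# Hypothetical coregionalization matrices B_l for two standardized variables
# and three structures (l = 0, 1, 2). Values are illustrative only.
B = [
    np.array([[0.30, 0.05], [0.05, 0.20]]),  # l = 0, e.g. nugget effect
    np.array([[0.40, 0.10], [0.10, 0.50]]),  # l = 1, e.g. short-range structure
    np.array([[0.30, 0.25], [0.25, 0.30]]),  # l = 2, e.g. long-range structure
]


def structural_correlation(b):
    """Rescale a matrix B_l into R_l with entries b_ij / sqrt(b_ii * b_jj)."""
    sd = np.sqrt(np.diag(b))
    return b / np.outer(sd, sd)


# One matrix of structural correlation coefficients per spatial scale.
R = [structural_correlation(b) for b in B]

# The B_l sum to the covariance matrix at lag zero, C(0); its standardized
# form is the correlation matrix R(0) used in a classical analysis.
C0 = sum(B)
R0 = structural_correlation(C0)

for l, r in enumerate(R):
    print(f"scale {l}: structural correlation = {r[0, 1]:.3f}")
print(f"all scales mixed: correlation = {R0[0, 1]:.3f}")
```

With these made-up numbers the correlation at the third scale is much stronger than the global correlation, which is the kind of scale-dependent relation that an analysis of $\mathbf{C}(0)$ alone would blur.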
Multivariate factorial kriging proceeds in three steps:

1. The coregionalization matrices $\mathbf{B}_l$ are first estimated using the iterative procedure described in Appendix A. Recall that the decomposition (6.60) and hence any subsequent interpretation of the coregionalization matrices depend on the somewhat arbitrary decision of whether to include a particular component in the linear model of coregionalization. As in univariate factorial kriging, when modeling the direct and cross semivariograms, it is critical to account for any physical knowledge about the phenomenon and the study area; see related discussion in Goulard and Voltz (1992), Goovaerts (1992).

2. Multivariate methods, such as principal component analysis or discriminant analysis, are then applied to each matrix $\mathbf{B}_l$ or $\mathbf{R}_l$ (Wackernagel et al., 1989). To avoid results that are overinfluenced by the variables with the largest values, the $N_v$ variables should be standardized to zero mean and unit variance at the beginning. Elements of each coregionalization matrix $\mathbf{B}_l$ then represent relative contributions of the basic model $g_l(\mathbf{h})$ to direct and cross semivariograms.

Estimating regionalized factors

The ordinary cokriging estimator of the kth regionalized factor at the lth spatial scale is

$$Y^{l*}_k(\mathbf{u}) = \sum_{i=1}^{N_v} \sum_{\alpha_i=1}^{n_i(\mathbf{u})} \lambda^l_{\alpha_i k}(\mathbf{u})\, Z_i(\mathbf{u}_{\alpha_i}) \qquad (6.60)$$

where $\lambda^l_{\alpha_i k}(\mathbf{u})$ is the weight assigned to the datum $z_i(\mathbf{u}_{\alpha_i})$ interpreted as a realization of the RV $Z_i(\mathbf{u}_{\alpha_i})$. Since regionalized factors are built with zero mean, the estimator (6.60) is unbiased provided the $N_v$ sets of cokriging weights sum to zero:

$$\sum_{\alpha_i=1}^{n_i(\mathbf{u})} \lambda^l_{\alpha_i k}(\mathbf{u}) = 0 \qquad i = 1, \ldots, N_v \qquad (6.61)$$

The variance $\sigma^2_E(\mathbf{u}) = \text{Var}\{Y^{l*}_k(\mathbf{u}) - Y^l_k(\mathbf{u})\}$ can be expressed as a linear combination of cross covariance values:

where $\mathbf{Q}_l = [q^l_{ik}]$ is the orthogonal matrix of eigenvectors, and $\boldsymbol{\Lambda}_l = [\lambda^l_k]$ is the diagonal matrix of eigenvalues.

Minimizing the estimation variance $\sigma^2_E(\mathbf{u})$ under the $N_v$ non-bias constraints (6.61) yields the following system of $\left(\sum_{i=1}^{N_v} n_i(\mathbf{u}) + N_v\right)$ linear equations:

Apart from the right-hand-side terms and the unbiasedness constraints, that system is identical to the ordinary cokriging system (6.32) for attribute values.
to zero mean and unit variance. The 28 experimental direct and cross semivariograms are modeled as linear combinations of three basic structures: a nugget effect and two spherical models with ranges of 200 m and 1.3 km.

Principal component analysis is performed on the correlation matrix $\mathbf{R}(0)$ and the three coregionalization matrices $\mathbf{B}_l$ corresponding to the three basic semivariogram structures. One capitalizes on the property that the first few principal components account for most of the variance to display possible interrelations between variables. Start with the global interrelations between variables as described by the correlation matrix given in Table 2.6 (page 22); see Figure 6.22 (left top graph). The position of each variable (metal $Z_i$) in the plane of a pair of components $Y_k$ and $Y_{k'}$ is given by the two correlation coefficients $(\rho_{ki}, \rho_{k'i})$ defined as

$$\rho_{ki} = \text{Corr}\{Y_k, Z_i\} = q_{ik} \sqrt{\frac{\lambda_k}{\sigma_i^2}} \qquad (6.64)$$

Equation (6.64) is a mere rescaling of the loading $q_{ik}$ by the variance $\lambda_k$ of the kth principal component, and the variance $\sigma_i^2$ of the ith original variable. A vector can be drawn from the origin in the plane (0, 0) to each plotted point $(\rho_{ki}, \rho_{k'i})$. The orientation of that vector with respect to the two axes reflects the correlation between the variable $Z_i$ and the two principal components $Y_k$ and $Y_{k'}$. The length of the vector, $\sqrt{\rho_{ki}^2 + \rho_{k'i}^2}$, measures the percentage of the variance $\sigma_i^2$ explained by the two components. If the two components completely account for the variance of $Z_i$, the point would lie on a circle of unit radius as shown in Figure 6.22, hence the name circle of correlations. The closer the point is to the center, the smaller the proportion of variance accounted for by $Y_k$ and $Y_{k'}$. Therefore, one should avoid interpreting relations between variables that plot near the center.

[Figure 6.22: circles of correlations for the correlation matrix (left top graph) and for the micro-scale, local scale, and regional scale coregionalization matrices.]

Figure 6.22 (left top graph) shows the circle of correlations defined by the first two components $(Y_1, Y_2)$. The circle allows one to distinguish copper and lead, which are strongly correlated ($\rho = 0.78$), from a group of four metals (Cd, Cr, Co, and Ni). The isolated position of zinc expresses its equal correlation with variables of both groups (see Table 2.6, page 22).

Similarly, circles of correlations can be used to display spatial interrelations between variables as described by each coregionalization matrix $\mathbf{B}_l$. The position of the ith variable in the circle defined by the pair $(Y^l_k, Y^l_{k'})$ is given by the pair of correlation coefficients between the spatial component $Z^l_i$ and the two regionalized factors. By analogy with expression (6.64), the correlation coefficient between the spatial component $Z^l_i$ and the regionalized factor $Y^l_k$ is computed as

$$\rho^l_{ki} = q^l_{ik} \sqrt{\frac{\lambda^l_k}{b^l_{ii}}}$$

where $b^l_{ii}$ is the contribution of the basic structure $g_l(\mathbf{h})$ in the direct semivariogram model of variable $Z_i$.

The comparison of circles of correlations shown in Figure 6.22 indicates that interrelations between metals change as a function of the spatial scale. In particular, note:

• the stronger relation between copper and lead at the local scale, which probably reflects common sources of man-made pollution over short distances.

• the strong correlation between Cd, Co, and Ni at the regional scale that matches the scale of the stratigraphy, suggesting that the source of these metals is geochemical.
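The coordinates plotted in a circle of correlations can be sketched with a plain eigendecomposition. The code below is a minimal illustration under stated assumptions: the 3 x 3 matrix is hypothetical (only the 0.78 entry echoes the copper-lead correlation quoted in the text, the other entries are invented), and the function name `component_correlations` is introduced here for convenience. The same function applies to a correlation matrix or to a coregionalization matrix, since the loading of variable i on component k is rescaled by the square root of the eigenvalue divided by the ith diagonal entry.

```python
import numpy as np


def component_correlations(b):
    """Correlation between each variable (rows) and each component (columns).

    b is a symmetric positive semidefinite matrix: a correlation matrix or a
    coregionalization matrix. Entry (i, k) is q_ik * sqrt(lambda_k / b_ii).
    """
    eigval, eigvec = np.linalg.eigh(b)        # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]          # largest eigenvalue first
    eigval = np.clip(eigval[order], 0.0, None)
    eigvec = eigvec[:, order]
    sd = np.sqrt(np.diag(b))
    return eigvec * np.sqrt(eigval) / sd[:, None]


# Hypothetical correlation matrix for three variables.
R0 = np.array([
    [1.00, 0.78, 0.20],
    [0.78, 1.00, 0.25],
    [0.20, 0.25, 1.00],
])

rho = component_correlations(R0)
coords = rho[:, :2]                             # position in the (Y1, Y2) plane
lengths = np.sqrt((coords ** 2).sum(axis=1))    # distance from the center

for i, (xy, r) in enumerate(zip(coords, lengths)):
    print(f"variable {i}: ({xy[0]:+.2f}, {xy[1]:+.2f}), length {r:.2f}")
```

The sign of each eigenvector is arbitrary, so a plot may appear mirrored from one run or library to another; the lengths and the relative positions of the variables are what carry the interpretation.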
Figure 6.23 shows the maps of the first regionalized factors at the local and regional scales. The map of the regional factor emphasizes the impact of the geology shown in Figure 5.13 (page 168, top graph) on the main regional features of metal concentrations.

Figure 6.23: Maps of the first regionalized factor at local and regional scales.

Chapter 7

Assessment of Local Uncertainty

Unlike in Chapters 5 and 6, which focus on deriving an optimal estimate and the associated error variance, in this chapter priority is given to modeling the uncertainty about the unknown. Such a model of local uncertainty allows one to evaluate the risk involved in any decision-making process, such as delineation of contaminated areas where remedial measures should be taken. From the model of uncertainty, one can also derive different estimates optimal for different criteria, each customized to the specific problem at hand, instead of retaining the somewhat arbitrary least-squares error estimate.

Section 7.1 proposes modeling the local uncertainty through conditional probability distributions instead of confidence intervals. MultiGaussian- and indicator-based algorithms for determining such conditional distributions are introduced in sections 7.2 and 7.3. Section 7.4 presents tools for accounting for local uncertainty in risk analysis and decision-making processes. These tools are used in section 7.5 to classify test locations as safe or contaminated with respect to Cd, Cu, and Pb concentrations.

7.1 Two Models of Local Uncertainty

confidence intervals often assumed arbitrarily Gaussian (Figure 7.1, middle graphs)

7.1.1 Local confidence interval

The traditional approach for modeling local uncertainty at an unsampled location $\mathbf{u}$ consists of computing a minimum error variance (kriging) estimate, $z^*(\mathbf{u})$, of the unknown value $z(\mathbf{u})$ and the associated error variance $\sigma^2_E(\mathbf{u}) = \text{Var}\{Z^*(\mathbf{u}) - Z(\mathbf{u})\}$. The estimate and error variance are then typically combined to derive a Gaussian-type confidence interval centered on the estimated value. For example, the 95% confidence interval is taken as

$$\text{Prob}\left\{ Z(\mathbf{u}) \in \left[ z^*(\mathbf{u}) - 2\sigma_E(\mathbf{u}),\; z^*(\mathbf{u}) + 2\sigma_E(\mathbf{u}) \right] \right\} = 0.95 \qquad (7.1)$$

where $\sigma^2_E(\mathbf{u})$ is the error (kriging) variance.

The ordinary kriging estimate and error standard deviation at locations $\mathbf{u}'_1$ and $\mathbf{u}'_2$ shown in Figure 7.1 are, respectively:

$z^*(\mathbf{u}'_1) = 0.5$ ppm, $\sigma_E(\mathbf{u}'_1) = 0.7$ ppm
$z^*(\mathbf{u}'_2) = 2.3$ ppm, $\sigma_E(\mathbf{u}'_2) = 0.7$ ppm

Using relation (7.1), one deduces that the unknown Cd concentration has a 95% probability of lying in the interval [-0.9, 1.9] ppm at $\mathbf{u}'_1$ and in the interval [0.9, 3.7] ppm at $\mathbf{u}'_2$.
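The arithmetic behind these two intervals is easy to reproduce. The following sketch simply applies relation (7.1) to the estimates and error standard deviations quoted above; the function name `gaussian_interval` and the `half_width` parameter are illustrative choices, not notation from the text.

```python
def gaussian_interval(estimate, error_sd, half_width=2.0):
    """Gaussian-type interval: estimate plus or minus half_width * error_sd."""
    return estimate - half_width * error_sd, estimate + half_width * error_sd


# Kriging estimate and error standard deviation (ppm) at the two locations.
locations = {"u'1": (0.5, 0.7), "u'2": (2.3, 0.7)}

for name, (z_star, sigma_e) in locations.items():
    low, high = gaussian_interval(z_star, sigma_e)
    print(f"{name}: [{low:.1f}, {high:.1f}] ppm")
```

Both intervals have the same width because the two error standard deviations are equal, and the lower bound of the first interval is negative although a concentration cannot be.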
The derivation of a confidence interval of type (7.1) is straightforward in that only a measure of the correlation between z-values (semivariogram or covariance function) is required. However, the error model (7.1) calls for two stringent hypotheses (e.g., see Isaaks and Srivastava, 1989, p. 517-519):

1. The estimation error $z^*(\mathbf{u}) - z(\mathbf{u})$ is modeled as a realization of a Gaussian RV error.

2. The error variance $\sigma^2_E(\mathbf{u})$ is independent of the data values.

The first assumption implies the symmetry of the "local" distribution of errors. In practice, true values tend to be overestimated in low-valued areas and underestimated in high-valued areas (recall the discussion about the smoothing effect in section 5.8). Thus, local distributions of errors are generally positively or negatively skewed, the sign of the skewness depending on the data values retained at each location. Such asymmetry is more likely to occur when the sample histogram of the original data is also asymmetric. The "global" distribution of estimation errors, that is, the distribution of all local estimation errors pooled over the entire study area, may be symmetric, with the local overestimations balancing the local underestimations.

The second condition calls for the variance of the errors to be independent of the actual data values and to depend only on the data configuration, a situation referred to as homoscedasticity and rarely met in practice. In the example of Figure 7.1, ordinary kriging yields similar estimation variances $\sigma^2_E(\mathbf{u}'_1) \approx \sigma^2_E(\mathbf{u}'_2)$ since the data configurations at locations $\mathbf{u}'_1$ and $\mathbf{u}'_2$ are similar: in both cases, the two closest data are approximately 250 m away. However, the potential for error is expected to be greater at $\mathbf{u}'_2$, which is surrounded by one very large and one small Cd concentration, than at $\mathbf{u}'_1$, which is surrounded by two consistently small values. Thus confidence intervals, such as that in equation (7.1), based on a mere estimation variance are generally not a satisfactory solution to the critical problem of assessing local uncertainty.

Figure 7.1: Modeling uncertainty about Cd concentrations at locations $\mathbf{u}'_1$ and $\mathbf{u}'_2$ (top graph). The traditional approach amounts to computing a minimum error variance estimate $z^*(\mathbf{u})$ and adopting Gaussian-type confidence intervals. A more rigorous approach calls for modeling the two local distributions of probability (bottom graphs).
7.1.2 Local probability distributions

A more rigorous approach to estimation is to assess first the uncertainty about the unknown, then deduce an estimate optimal in some appropriate sense (see Srivastava, 1987a; Journel, 1989, Lesson 4). This is significantly different from the traditional approach of first deriving the estimate then attaching to it a confidence interval. Let $Z(\mathbf{u})$ be the RV modeling the uncertainty about $z(\mathbf{u})$. The distribution function $F(\mathbf{u}; z|(n)) = \text{Prob}\{Z(\mathbf{u}) \le z\,|\,(n)\}$ made conditional to the information available $(n)$ fully models that uncertainty in the sense that probability intervals can be derived, such as

$$\text{Prob}\{Z(\mathbf{u}) \in (a, b]\,|\,(n)\} = F(\mathbf{u}; b|(n)) - F(\mathbf{u}; a|(n))$$

Note that these probability intervals are independent of any particular estimate $z^*(\mathbf{u})$ of the unknown value $z(\mathbf{u})$. Indeed, uncertainty depends on the information available $(n)$, not on the particular optimality criterion retained to define an estimate. A lucid discussion of this important methodological priority is given in Srivastava (1987a).

Each conditional probability distribution function $F(\mathbf{u}; z|(n))$ provides a measure of local uncertainty in that it relates to a specific location $\mathbf{u}$. A series of single-point ccdfs do not provide any measure of multiple-point or spatial uncertainty, such as the probability that a string of locations jointly exceed a given threshold value. In Chapter 8, the concept of conditional simulation will allow the assessment of such spatial uncertainty from several realizations of the distribution in space of the attribute z.

[Figure panels, each plotted against Cd concentration (ppm) at the two locations: physical constraint interval; sample cumulative histogram (Prob contamin.: 0.70 at both locations); local probability distribution (Prob contamin.: 0.21 and 0.95).]

The minimal information available about the z-value at any location $\mathbf{u}$ usually consists of a physical constraint interval $[z_{min}, z_{max}]$. For Cd values, the lower bound $z_{min}$ of that interval would be zero since concentrations are necessarily non-negative. In addition, one may know that the Cd concentration cannot exceed 6 ppm. Thus, in the absence of any location-specific information, the unknown values $z(\mathbf{u}'_1)$ and $z(\mathbf{u}'_2)$ at any two locations $\mathbf{u}'_1$ and $\mathbf{u}'_2$ would be valued anywhere with equal probability in the interval [0, 6 ppm]. In other words, the cumulative distribution function (cdf) for both RVs $Z(\mathbf{u}'_1)$ and $Z(\mathbf{u}'_2)$ would be the uniform expression

$$F(\mathbf{u}'_1; z) = F(\mathbf{u}'_2; z) = \begin{cases} 0 & \text{if } z \le 0 \\ z/6 & \text{if } z \in (0, 6] \\ 1 & \text{if } z > 6 \end{cases} \qquad (7.2)$$
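To make the use of such a distribution function concrete, the sketch below codes the uniform model (7.2) and derives a probability interval and an exceedance probability from it. The bounds 0 and 6 ppm come from the text; the threshold of 3 ppm and the function names are hypothetical choices made only for the illustration.

```python
def uniform_cdf(z, z_min=0.0, z_max=6.0):
    """Uniform cdf on the physical constraint interval [z_min, z_max]."""
    if z <= z_min:
        return 0.0
    if z > z_max:
        return 1.0
    return (z - z_min) / (z_max - z_min)


def prob_interval(cdf, a, b):
    """Probability that the unknown value falls in the interval (a, b]."""
    return cdf(b) - cdf(a)


def prob_exceed(cdf, threshold):
    """Probability that the unknown value exceeds a threshold."""
    return 1.0 - cdf(threshold)


# Hypothetical threshold of 3 ppm, used only to show the calculation.
print(prob_interval(uniform_cdf, 1.5, 4.5))
print(prob_exceed(uniform_cdf, 3.0))
```

Because `prob_interval` and `prob_exceed` take the cdf as an argument, the same two helpers would work unchanged with a location-specific conditional cdf, which is the point of modeling the distribution before choosing any estimate.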
6. All conditional distributions of any subset of the RF $Y(\mathbf{u})$, given realizations of any other subsets of it, are (multiple-point) normal. In particular, the conditional distribution of the single variable $Y(\mathbf{u})$ given the $n(\mathbf{u})$ data $y(\mathbf{u}_\alpha)$ is normal and fully characterized by its two parameters, mean and variance, which are the conditional mean and conditional variance of the RV $Y(\mathbf{u})$ given the information $(n)$:

7.2.2 Normal score transform

The multiGaussian (MG) approach is very convenient: the inference of the conditional cdf reduces to solving a simple kriging system at location $\mathbf{u}$. The trade-off cost is the assumption that data follow a multiGaussian distribution, which implies first that the one-point distribution of data (sample histogram) is normal.

1. The original z-data are first transformed into y-values with a standard normal histogram. Such a transform is referred to as a normal score transform, and the y-values $y(\mathbf{u}_\alpha) = \phi(z(\mathbf{u}_\alpha))$ are called normal scores.

The normal score transform function $\phi(\cdot)$ can be derived through a graphical correspondence between the cumulative one-point distributions of the original and standard normal variables; see Figure 7.3 and Journel and Huijbregts (1978), p. 478. Let $F(z)$ and $G(y)$ be the stationary one-point cdfs of the original RF $Z(\mathbf{u})$ and the standard normal RF $Y(\mathbf{u})$:

$$F(z) = \text{Prob}\{Z(\mathbf{u}) \le z\} \qquad G(y) = \text{Prob}\{Y(\mathbf{u}) \le y\}$$

The transform that allows one to go from a RF $Z(\mathbf{u})$ with any cdf $F(z)$ to a RF $Y(\mathbf{u})$ with standard Gaussian cdf $G(y)$ is depicted by the arrows in Figure 7.3 and is written as

$$Y(\mathbf{u}) = \phi(Z(\mathbf{u})) = G^{-1}\left[ F(Z(\mathbf{u})) \right]$$

where $G^{-1}(\cdot)$ is the inverse Gaussian cdf or quantile function of the RF $Y(\mathbf{u})$. The normal score transform can be seen as a correspondence table between equal p-quantiles $z_p$ and $y_p$ of the two distributions. In other words, $z_p$ and $y_p$ correspond to the same cumulative probability p:

$$F(z_p) = G(y_p) = p \qquad \forall\, p \in [0, 1]$$
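The correspondence between equal p-quantiles can be sketched as a rank-based transform. This is a minimal illustration, not the procedure of any particular software: the plotting position $(\text{rank} - 0.5)/n$ used to estimate the sample cumulative frequency is an assumption, ties are not given any special treatment, the five data values are invented, and the function name `normal_scores` is hypothetical.

```python
from statistics import NormalDist


def normal_scores(z):
    """Map each datum to the standard normal quantile of its sample frequency."""
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])   # indices by increasing z
    standard_normal = NormalDist()                 # mean 0, standard deviation 1
    y = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = (rank - 0.5) / n                       # assumed plotting position
        y[i] = standard_normal.inv_cdf(p)          # y_p such that G(y_p) = p
    return y


# Invented, positively skewed sample used only for illustration.
z_data = [0.3, 2.1, 0.9, 5.7, 1.2]
y_data = normal_scores(z_data)

for z_value, y_value in zip(z_data, y_data):
    print(f"z = {z_value:4.1f}  ->  normal score = {y_value:+.3f}")
```

The transform preserves the ranks of the data and yields scores that are symmetric about zero whatever the skewness of the original histogram. A back-transform would read the same table in the other direction, with some interpolation rule between tabulated quantiles.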
A common approach for "normalizing" a positively skewed sample histogram is to take the logarithms of the original data, i.e., transform $y(\mathbf{u}_\alpha) = \ln z(\mathbf{u}_\alpha)$. Figure 7.4 (bottom graphs) shows the histogram and normal probability plot of 259 Cd concentrations after such a transformation. The cumulative frequencies plot in a fairly straight line except at the two extremes of the distribution. Although of small magnitude, these deviations should not be disregarded since tail behaviors are generally critical. Thus, the normal score transform that ensures perfect reproduction of the normal distribution is the preferred transformation. Another shortcoming of the log transform is that it applies only to strictly positive variables (without zero-valued data).

7.2.3 Checking the multiGaussian assumption

Ideally one should define a transform similar to the one-point normal score transform, which would ensure that all two-point cdfs of the transformed RF $Y(\mathbf{u})$ are Gaussian. Unfortunately, such a transform is very difficult to determine in practice since there are as many two-point cdfs as there are different lag vectors $\mathbf{h}$ separating the two RVs $Y(\mathbf{u})$ and $Y(\mathbf{u} + \mathbf{h})$. Hence one should check whether the previously obtained data $y(\mathbf{u}_\alpha)$ are also reasonably bivariate Gaussian. If they are, then the multivariate Gaussian model can be adopted (it has been checked to the two-point level); if they are not, another model should be considered.
Figure 7.4: Histogram and normal probability plot of 259 Cd concentrations before (first row) and after normal score transform (second row) or logarithmic transform (third row).

There are several ways to check that the two-point distribution of data $\{y(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$ is normal. One method consists of verifying that the experimental two-point cdf values of any set of data pairs separated by the same vector $\mathbf{h}$, $\{(y(\mathbf{u}_\alpha), y(\mathbf{u}_\alpha + \mathbf{h})), \alpha = 1, \ldots, N(\mathbf{h})\}$, match the theoretical model given by the analytical expression (7.5). In practice, only the case $y_p = y_{p'}$ is considered, leading to the following expression for the two-point Gaussian cdf:

$$G(\mathbf{h}; y_p) = \text{Prob}\{Y(\mathbf{u}) \le y_p,\; Y(\mathbf{u} + \mathbf{h}) \le y_p\} = p^2 + \frac{1}{2\pi} \int_0^{\arcsin C_Y(\mathbf{h})} \exp\left[ \frac{-y_p^2}{1 + \sin\theta} \right] d\theta \qquad (7.10)$$

The check proceeds as follows:

1. The semivariogram $\gamma_Y(\mathbf{h})$ of the normal score data $y(\mathbf{u}_\alpha)$ is computed and modeled. The corresponding covariance model $C_Y(\mathbf{h})$ is then obtained as $1 - \gamma_Y(\mathbf{h})$.
3 . For tltc s;une pquanI.ilc v;thras, t.he ~xpcrinicnt;~l
indicator sen~ivari-
ogranls of r m r ~ ~score
~ a l data arc ron~put.crlas 0.0 0.5 1 .0 1.5
Distance h

'1. tkpcri~lient,al:iod G : ~ o s s i n ~ r~~ o d c I - i ~ ~ d uindical.or


ce<l se~nivariogran~s
arc co~l~part:d gralrl~ic:dly. ijnscd or] t . 1 1 ~quality of tlw fit,, tlle mer
decides w l 4 w to reject the assunlpf~iwof two-poirrt normality. A
goohcss-of-fit critcriou would consist of con~paringthe ~ r ~ a g ~ ~ iof tnde
11ovi;tlions hcl.wt:cn cxpcrinlrr~laland ~rmdelindicator serr~ivariograrns
with t,lmsr ol,servcd fkr s i ~ r ~ & l e dv;tlucs that, arr known to he 11iGans-
sim

• The Gaussian model does not allow for any significant spatial correlation of extremely large or small values, a property known as the destructuration effect and associated with the maximum entropy property of the Gaussian RF model. Indeed, when the cumulative frequency $p$ tends toward zero or one, $y_p \to \pm\infty$, and the two-point cdf (7.10) tends toward the product $p^2$ of the two marginal probabilities (independence case). Hence, the indicator correlogram $\rho_I(\mathbf{h}; y_p)$ tends toward zero, and the indicator semivariogram $\gamma_I(\mathbf{h}; y_p)$ tends toward its sill $p(1 - p)$:

\[ \gamma_I(\mathbf{h}; y_p) \;\to\; p(1 - p) \quad \text{as } p \to 0 \text{ or } 1 \]

for any vector $\mathbf{h}$. Figure 7.5 shows an example of the destructuration effect for a Gaussian RF $Y(\mathbf{u})$ with a spherical semivariogram model $\gamma(\mathbf{h})$ with a unit range: the standardized indicator semivariograms reach their unit sill faster for extreme threshold values.

In the earth sciences, low entropy patterns, such as connected strings or patches of extreme values, often correspond to hazardous features that are worth specific attention. For example, strings of large permeability values can represent leakage conduits hazardous for a nuclear repository. Similarly, connected clusters of large metal concentrations above the tolerable maximum are critical for assessing the risk of soil pollution. Hence, and notwithstanding its analytical simplicity, the multiGaussian RF model may be inappropriate whenever the structural analysis or qualitative information indicates that extreme values are spatially correlated. Even in the absence of information about the connectivity of extreme values, the Gaussian model is not conservative in the sense that its maximum entropy character leads one to understate the potential for hazard (Gómez-Hernández and Wen, 1994). The more flexible but also more demanding indicator approach introduced in section 7.3 allows one to account for spatial correlation of extreme values.
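The destructuration effect described above can be checked numerically. The sketch below is only an illustration under a stated assumption: it takes the biGaussian indicator semivariogram to have the classical form $\gamma_I(\mathbf{h}; y_p) = p(1-p) - \frac{1}{2\pi}\int_0^{\arcsin \rho(\mathbf{h})} \exp[-y_p^2/(1+\sin\theta)]\, d\theta$, which is presumably what expression (7.11) states but is not reproduced in this section. The function names and the Simpson integration are choices made for this example.

```python
import math
from statistics import NormalDist


def spherical_correlogram(h, a=1.0):
    """Correlogram rho(h) = 1 - gamma(h) of a unit-sill spherical model with range a (h >= 0)."""
    if h >= a:
        return 0.0
    r = h / a
    return 1.0 - (1.5 * r - 0.5 * r ** 3)


def indicator_semivariogram(rho, p, n_steps=2000):
    """BiGaussian indicator semivariogram at the p-quantile threshold y_p.

    Assumed form (not quoted from the text):
    gamma_I = p(1-p) - (1/2pi) * integral_0^arcsin(rho) exp(-y_p^2 / (1 + sin t)) dt
    The integral is evaluated with the composite Simpson rule (n_steps must be even).
    """
    y_p = NormalDist().inv_cdf(p)

    def integrand(theta):
        return math.exp(-y_p ** 2 / (1.0 + math.sin(theta)))

    upper = math.asin(rho)
    step = upper / n_steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * step)
    integral = total * step / 3.0
    return p * (1.0 - p) - integral / (2.0 * math.pi)


def standardized_indicator_semivariogram(h, p, a=1.0):
    """Indicator semivariogram divided by its sill p(1-p)."""
    rho = spherical_correlogram(h, a)
    return indicator_semivariogram(rho, p) / (p * (1.0 - p))


if __name__ == "__main__":
    # At half the range, extreme thresholds are already closer to the unit sill.
    for p in (0.1, 0.5, 0.9):
        print(p, round(standardized_indicator_semivariogram(0.5, p), 3))
```

Under that assumption, the standardized value at a given lag is smallest for the median threshold and increases toward 1 as $p$ moves toward 0 or 1, which is the behavior attributed to Figure 7.5.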
Example

Figure 7.6 (top graph) shows the experimental semivariogram of Cd normal scores with the model fitted. Given that model, Gaussian-based indicator semivariogram models are deduced using expression (7.11) for the nine deciles of the sample distribution. The experimental indicator semivariograms (black dots) are seen to deviate from the Gaussian-induced models (solid line) for small threshold values. As already noticed in Figure 2.19 (page 45), small Cd concentrations are more connected in space than large concentrations. Thus,

[Figure 7.6 panels: Cd normal scores; standardized indicator semivariograms for the nine deciles; horizontal axis: Distance (km)]

Figure 7.6: Experimental semivariogram of Cd normal scores (top graph) and experimental standardized indicator semivariograms for the nine deciles of the sample cumulative distribution. The solid lines depict the Gaussian-induced indicator semivariogram models.

7.2.4 Estimating the Gaussian ccdf parameters

Once the multiGaussian model is adopted, inference of the normal ccdf $G(\mathbf{u}; y|(n))$ reduces to estimating its two parameters (mean and variance) at any unsampled location $\mathbf{u}$. Consider the decomposition of the multiGaussian RF $Y(\mathbf{u})$ into a residual component $R(\mathbf{u})$ and a trend component $m_Y(\mathbf{u})$:

\[ Y(\mathbf{u}) = R(\mathbf{u}) + m_Y(\mathbf{u}) \]
2. A linear rescaling of a smoothly varying secondary variable $d(\mathbf{u})$:

\[ m_Y(\mathbf{u}) = a_0(\mathbf{u}) + a_1(\mathbf{u})\, d(\mathbf{u}) \]

where d-values are normally distributed with zero mean and unit variance (normal scores). The mean of the Gaussian ccdf is then provided by a KED estimator of type (6.4).

Beware that whatever the kriging algorithm (OK, KT, KED) considered, the variance of the ccdf must be identified with the theoretically correct simple kriging variance (...). An alternative to non-stationary kriging of the normal score y-data consists of detrending the original z-data prior to a normal score transformation of the corresponding z-residual data.

[Figure panel: SK error variances]

Figure 7.7 (top graphs) shows the simple kriging estimate $y^*_{SK}(\mathbf{u})$ and variance $\sigma^2_{SK}(\mathbf{u})$ computed using the ten Cd normal scores given in Table 7.1 (page 269, third column). According to relation (7.7), the Gaussian conditional cdfs at the estimation locations are modeled as
by the SK variance, is larger than at the datum location but smaller than at the location that lies beyond the correlation range:

[Figure panels: horizontal axes Cd concentration (ppm) and Normal score]

7.2.5 Increasing the resolution of the sample cdf


The local uncertainty model is needed in the space of the original variable $Z$, not in the space of the normal score variable $Y$. Because the normal score transform $\phi(\cdot)$ is monotonic increasing, the z-ccdf value at any threshold $z$ can be retrieved from the Gaussian ccdf in two steps:
[Figure panel: Gaussian ccdf $[G(\mathbf{u}; y|(n))]^*$ plotted against the normal score]
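The two-step retrieval of a z-ccdf value mentioned above can be sketched as follows. This is an illustration under assumptions, since the two steps themselves are not legible in this section: step 1 is taken to be the normal score transform of the threshold, $y = \phi(z)$, and step 2 the evaluation of a Gaussian cdf with the simple kriging mean and variance at $y$. The plotting position $(i - 0.5)/n$, the linear interpolation of $\phi$, and all function names are choices made for the example, and the data are hypothetical.

```python
from statistics import NormalDist


def normal_score_table(z_data):
    """Pair each sorted z-datum with a normal score y = G^{-1}(p).

    The plotting position p = (i + 0.5) / n (0-based i) is an assumption of this sketch.
    """
    z_sorted = sorted(z_data)
    n = len(z_sorted)
    std_normal = NormalDist()
    return [(z, std_normal.inv_cdf((i + 0.5) / n)) for i, z in enumerate(z_sorted)]


def normal_score(z, table):
    """Step 1: transform a threshold z into y = phi(z) by linear interpolation in the table."""
    if z <= table[0][0]:
        return table[0][1]
    if z >= table[-1][0]:
        return table[-1][1]
    for (z0, y0), (z1, y1) in zip(table, table[1:]):
        if z0 <= z <= z1:
            if z1 == z0:
                return y1
            return y0 + (y1 - y0) * (z - z0) / (z1 - z0)
    raise ValueError("threshold could not be bracketed")


def z_ccdf(z, table, sk_mean, sk_variance):
    """Step 2: evaluate the Gaussian ccdf at y = phi(z); requires sk_variance > 0."""
    y = normal_score(z, table)
    return NormalDist(mu=sk_mean, sigma=sk_variance ** 0.5).cdf(y)


if __name__ == "__main__":
    cd = [0.5, 1.0, 1.5, 2.0, 3.0]  # hypothetical concentrations (ppm)
    table = normal_score_table(cd)
    print(z_ccdf(1.5, table, sk_mean=0.0, sk_variance=1.0))   # 0.5
    print(z_ccdf(1.5, table, sk_mean=1.0, sk_variance=0.25))  # lower probability
```

With a zero mean and unit variance (no conditioning), the function simply returns the sample plotting position of the threshold. A larger kriging mean in normal score space lowers the probability of not exceeding a given threshold.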

Linear cdf model

A linear model is generally adopted for interpolation within classes of threshold values $(z_{k-1}, z_k]$. Such a linear model amounts to assuming a uniform distribution within that class, that is,

\[ F(z) = F^*(z_{k-1}) + \frac{z - z_{k-1}}{z_k - z_{k-1}} \left[ F^*(z_k) - F^*(z_{k-1}) \right] \quad \forall z \in (z_{k-1}, z_k] \]

Power cdf model

The power model generalizes the linear interpolation:

\[ F(z) = F^*(z_{k-1}) + \left[ \frac{z - z_{k-1}}{z_k - z_{k-1}} \right]^{\omega} \left[ F^*(z_k) - F^*(z_{k-1}) \right] \quad \forall z \in (z_{k-1}, z_k] \]

where the parameter (the power) $\omega$ is strictly positive, $\omega > 0$. Different distributions are obtained by varying $\omega$ (Figure 7.9): the positive (negative) skewness of the distribution increases as the parameter $\omega$ decreases (increases).

The power model can also be used to model the lower and upper tails of the distribution. A negatively skewed lower tail could be modeled as

\[ F(z) = \left[ \frac{z - z_{\min}}{z_1 - z_{\min}} \right]^{\omega} F^*(z_1) \quad \forall z \in (z_{\min}, z_1] \]

where $z_{\min}$ is the minimum z-value fixed by the user. Conversely, a power model for a positively skewed upper tail could be

\[ F(z) = F^*(z_K) + \left[ \frac{z - z_K}{z_{\max} - z_K} \right]^{\omega} \left[ 1 - F^*(z_K) \right] \quad \forall z \in (z_K, z_{\max}] \]

where $z_K$ is the largest z-data value and $z_{\max}$ is the maximum z-value fixed by the user.

Hyperbolic cdf model

The power model for an upper tail calls for the sometimes arbitrary choice of a maximum z-value. The hyperbolic model allows one to extrapolate the upper tail of a positively skewed distribution toward an infinite upper bound. The hyperbolic model is

\[ F(z) = 1 - \frac{\lambda}{z^{\omega}} \quad \forall z > z_K \qquad (7.16) \]

with the parameter $\omega \ge 1$ controlling how fast the cdf model reaches its limiting value 1: the smaller $\omega$ is, the longer the tail of the distribution (Figure 7.10). The parameter $\lambda$ is such that the hyperbolic model (7.16) identifies the sample cumulative frequency $F^*(z_K)$:

\[ \lambda = z_K^{\omega} \left[ 1 - F^*(z_K) \right] \]

Figure 7.9: Power models for interpolating within-class cdf values. The shape of the distribution is controlled by the parameter $\omega$: positive skewness for $\omega$ less than 1.0, uniform distribution for $\omega = 1$ (linear model), and negative skewness for $\omega$ greater than 1.0.

Figure 7.10: Hyperbolic models for extrapolating cdf values beyond the largest threshold value $z_K$ (upper tail). The parameter $\omega$ controls how fast the positively skewed cdf model reaches the upper limit value 1: the smaller $\omega$ is, the longer the tail of the distribution.
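The interpolation and extrapolation models captioned above can be sketched in a few lines. The forms used here are assumptions of this example: a power interpolation between two known cdf values (linear when $\omega = 1$), and a hyperbolic upper tail $F(z) = 1 - \lambda / z^{\omega}$ with $\lambda$ set so that the model honors the cumulative frequency at the largest threshold. Function and argument names are hypothetical.

```python
def power_interpolation(z, z_low, z_high, f_low, f_high, omega=1.0):
    """cdf value at z inside [z_low, z_high], given the cdf values at both bounds.

    omega = 1 is the linear (uniform within-class) model; omega < 1 puts more
    probability near z_low (positive skewness), omega > 1 near z_high.
    """
    if omega <= 0.0:
        raise ValueError("omega must be strictly positive")
    if not z_low <= z <= z_high:
        raise ValueError("z must lie within the class bounds")
    fraction = (z - z_low) / (z_high - z_low)
    return f_low + fraction ** omega * (f_high - f_low)


def hyperbolic_upper_tail(z, z_last, f_last, omega=1.5):
    """cdf value beyond the largest threshold z_last (> 0), where the cdf equals f_last.

    lambda is chosen so that the model returns f_last at z = z_last.
    """
    if z < z_last:
        raise ValueError("the hyperbolic model applies to the upper tail only")
    lam = z_last ** omega * (1.0 - f_last)
    return 1.0 - lam / z ** omega


if __name__ == "__main__":
    # Within-class interpolation between F(1.0) = 0.2 and F(2.0) = 0.6.
    for omega in (0.5, 1.0, 2.0):
        print(omega, round(power_interpolation(1.5, 1.0, 2.0, 0.2, 0.6, omega), 4))
    # Upper tail beyond z = 2.26 where the cdf is 0.9: a smaller omega gives a longer tail.
    for omega in (1.0, 2.0):
        print(omega, round(hyperbolic_upper_tail(4.52, 2.26, 0.9, omega), 4))
```

The lower and upper tail power models are the same call with bounds $(z_{\min}, z_1]$ and cdf values $(0, F^*(z_1))$, or bounds $(z_K, z_{\max}]$ and cdf values $(F^*(z_K), 1)$.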
A weight $\omega$ allows a balance between smoothness and closeness to the original distribution: as $\omega$ decreases, the cdf model becomes increasingly smoother but deviates more from the original distribution. This particular smoothing algorithm requires prior determination of the minimum and maximum z-values.

7.3 The Indicator Approach

2. Under the multiGaussian model, extremely large and small values are spatially uncorrelated, an assumption often invalidated in practice or a non-conservative model for applications where connected extreme values are dangerous features.

3. The variance of the conditional cdf in the normal space depends only on the data configuration, not on the data themselves (homoscedasticity).

Like the cdf values in section 7.2.5, the $K$ ccdf values are then interpolated within each class $(z_k, z_{k+1}]$ and extrapolated beyond the two extreme threshold values $z_1$ and $z_K$.

The indicator approach is based on the interpretation of the conditional probability (7.17) as the conditional expectation of an indicator RV $I(\mathbf{u}; z_k)$ given the information $(n)$:

\[ F(\mathbf{u}; z_k|(n)) = E\{ I(\mathbf{u}; z_k) \,|\, (n) \} \]

with $I(\mathbf{u}; z_k) = 1$ if $Z(\mathbf{u}) \le z_k$, and zero otherwise.

According to the projection theorem (Luenberger, 1969, p. 49), the least-squares (kriging) estimate of the indicator $i(\mathbf{u}; z_k)$ is also the least-squares estimate of its conditional expectation. Thus, the ccdf value $F(\mathbf{u}; z_k|(n))$ can be obtained by (co)kriging the unknown indicator $i(\mathbf{u}; z_k)$ using indicator transforms of the neighboring information.

7.3.1 Indicator coding of information

The indicator approach starts with a selection of the number of thresholds and their values. To alleviate computation and inference efforts and reduce the occurrence of order relation problems (see discussion in section 7.3.4), the number of thresholds should rarely exceed 15. On the other hand, the number of thresholds should not be smaller than five to provide a reasonable discretization of the local distribution. The set of $K$ threshold values is typically chosen such that the range of z-values is split into $(K + 1)$ classes of approximately equal frequency, e.g., the nine deciles of the sample cumulative distribution. Note the following guidelines:

• More threshold values should be chosen within the part of the distribution that is of greater interest than the rest, e.g., the lower or upper tail.

Once the $K$ threshold values have been chosen, each piece of information (e.g., metal concentration, soil, or rock type) is coded into a vector of $K$ cumulative probabilities of the type

\[ \text{Prob}\{ Z(\mathbf{u}) \le z_k \,|\, \text{local information at } \mathbf{u} \} \qquad k = 1, \ldots, K \qquad (7.19) \]

The discrete cdf (7.19) represents the local information about the z-value at $\mathbf{u}$ prior to any correction or updating based on neighboring data, hence the term local prior. Different types of local prior cdfs can be distinguished, depending on the nature of the local information available.

A hard datum $z(\mathbf{u}_\alpha)$ is a precise measurement of the attribute of interest. There is no uncertainty at the datum location $\mathbf{u}_\alpha$, hence the local prior probabilities are binary (hard) indicator data defined as

\[ i(\mathbf{u}_\alpha; z_k) = \begin{cases} 1 & \text{if } z(\mathbf{u}_\alpha) \le z_k \\ 0 & \text{otherwise} \end{cases} \qquad k = 1, \ldots, K \qquad (7.20) \]
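The threshold selection and the hard indicator coding described above can be sketched as follows. This is a minimal illustration: the choice of the "inclusive" quantile method and the function names are assumptions of the example, not prescriptions from the text.

```python
from statistics import quantiles


def decile_thresholds(z_data):
    """Nine thresholds splitting the data into ten classes of roughly equal frequency."""
    return quantiles(z_data, n=10, method="inclusive")


def hard_indicator_vector(z, thresholds):
    """Hard indicator coding of a precise measurement: 1 if z <= z_k, 0 otherwise."""
    return [1 if z <= z_k else 0 for z_k in thresholds]


if __name__ == "__main__":
    data = list(range(1, 101))  # hypothetical data values
    thresholds = decile_thresholds(data)
    print(len(thresholds))      # 9
    # Coding a measurement of 1.31 ppm against four thresholds listed in ascending order.
    print(hard_indicator_vector(1.31, [0.80, 1.38, 1.88, 2.26]))  # [0, 1, 1, 1]
```

Thresholds are kept in ascending order here, so the vector reads from the smallest to the largest threshold.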
Consider the indicator coding of the ten Cd concentrations shown in Figure 7.13 (top graph). The horizontal dashed lines depict the four threshold values retained, $z_k$ = 0.80, 1.38, 1.88, and 2.26 ppm, corresponding to the tolerable maximum and three upper quantiles ($p$ = 0.6, 0.8, and 0.9) of the distribution of 259 Cd concentrations. Indicator coding yields a vector of four indicator values at each datum location (Figure 7.13, bottom graph). For example, the vector of hard indicator data related to the measurement $z(\mathbf{u}_5)$ = 1.31 ppm is (fifth column of bottom Figure 7.13):

\[ \begin{bmatrix} i(\mathbf{u}_5; 2.26) \\ i(\mathbf{u}_5; 1.88) \\ i(\mathbf{u}_5; 1.38) \\ i(\mathbf{u}_5; 0.80) \end{bmatrix} = \begin{bmatrix} \text{Prob}\{ Z(\mathbf{u}_5) \le 2.26 \,|\, z(\mathbf{u}_5) = 1.31 \} \\ \text{Prob}\{ Z(\mathbf{u}_5) \le 1.88 \,|\, z(\mathbf{u}_5) = 1.31 \} \\ \text{Prob}\{ Z(\mathbf{u}_5) \le 1.38 \,|\, z(\mathbf{u}_5) = 1.31 \} \\ \text{Prob}\{ Z(\mathbf{u}_5) \le 0.80 \,|\, z(\mathbf{u}_5) = 1.31 \} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix} \qquad (7.21) \]

The prior probability that the Cd concentration at $\mathbf{u}_5$ is no greater than 0.8 ppm is zero because the measured concentration is 1.31 ppm. That probability is 1 for the three larger threshold values 1.38, 1.88, and 2.26 ppm.

Constraint intervals

There is always some uncertainty attached to any measurement $z(\mathbf{u}_\alpha)$ because of measurement errors. The coding (7.20) into hard indicator data (0 or 1 only) amounts to treating that error as negligible. When imprecision of measurement at location $\mathbf{u}_\alpha$ cannot be ignored, the information can be modeled as an interval of possible values for $z$, $z(\mathbf{u}_\alpha) \in (a_\alpha, b_\alpha]$, equivalent to $a_\alpha < z(\mathbf{u}_\alpha) \le b_\alpha$. Such interval-type data are typically provided by inexpensive measurement devices, such as colorimetric papers to evaluate pollution levels. The prior cdf is then modeled as an incomplete vector of hard indicator data:

\[ i(\mathbf{u}_\alpha; z_k) = \begin{cases} 0 & \text{if } z_k \le a_\alpha \\ \text{undefined (missing)} & \text{if } z_k \in (a_\alpha, b_\alpha) \\ 1 & \text{if } z_k \ge b_\alpha \end{cases} \qquad (7.22) \]
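The coding of a constraint interval into an incomplete indicator vector can be sketched as below. Missing indicator values are represented by `None`, which is a convention of this example; the function name is hypothetical.

```python
def interval_indicator_vector(a, b, thresholds):
    """Incomplete hard indicator coding of the interval datum z in (a, b].

    Returns 0 where the threshold is at or below a (z certainly exceeds it),
    1 where the threshold is at or above b (z certainly does not exceed it),
    and None where the threshold falls strictly inside the interval.
    """
    if a >= b:
        raise ValueError("interval bounds must satisfy a < b")
    vector = []
    for z_k in thresholds:
        if z_k <= a:
            vector.append(0)
        elif z_k >= b:
            vector.append(1)
        else:
            vector.append(None)
    return vector


if __name__ == "__main__":
    # Interval (1.0, 2.0] ppm coded against four thresholds in ascending order.
    print(interval_indicator_vector(1.0, 2.0, [0.80, 1.38, 1.88, 2.26]))
    # [0, None, None, 1]
```

The two `None` entries are the indicator values that a later updating step would have to fill in.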

\[ \begin{bmatrix} i(\mathbf{u}_0; 2.26) \\ i(\mathbf{u}_0; 1.88) \\ i(\mathbf{u}_0; 1.38) \\ i(\mathbf{u}_0; 0.80) \end{bmatrix} = \begin{bmatrix} \text{Prob}\{ Z(\mathbf{u}_0) \le 2.26 \,|\, z(\mathbf{u}_0) \in (1.0, 2.0] \} \\ \text{Prob}\{ Z(\mathbf{u}_0) \le 1.88 \,|\, z(\mathbf{u}_0) \in (1.0, 2.0] \} \\ \text{Prob}\{ Z(\mathbf{u}_0) \le 1.38 \,|\, z(\mathbf{u}_0) \in (1.0, 2.0] \} \\ \text{Prob}\{ Z(\mathbf{u}_0) \le 0.80 \,|\, z(\mathbf{u}_0) \in (1.0, 2.0] \} \end{bmatrix} = \begin{bmatrix} 1 \\ ? \\ ? \\ 0 \end{bmatrix} \qquad (7.23) \]

The two missing indicator data reflect the uncertainty about whether the Cd concentration exceeds 1.38 and 1.88 ppm, respectively.
The local information need not relate directly to the attribute of interest. For example, one may know that a particular state $s_l$ of the categorical attribute $s$, say, a particular facies or soil type, prevails at $\mathbf{u}_\alpha$. From the $n'$ locations where both $z$ and $s$ are known, the proportions of z-data belonging to that state $s_l$ while not exceeding any particular threshold value $z_k$ can be computed as

[Figure 7.13, bottom graph: indicator vectors at the ten data locations; horizontal axis: Distance (km)]

Figure 7.13: Coding of ten Cd data values into ten vectors of four indicators of non-exceedence of threshold values depicted by the four horizontal dashed lines.
[Figure 7.14 panels: Rock types ($s_2$: Other rocks); cumulative distributions of Cd concentration (ppm); Local prior probabilities for the thresholds $z_1$ = 0.80, $z_2$ = 1.38, $z_3$ = 1.88, and $z_4$ = 2.26 ppm]

Figure 7.14: Coding of the profile of two rock categories into four profiles of local prior probabilities of not exceeding Cd threshold values. The prior probabilities at a location $\mathbf{u}$ belonging to rock category $s_l$ are inferred from the cumulative distribution of Cd data belonging to that category (middle graph). The four threshold values are depicted by the vertical dashed lines.
where $n'$ is the number of locations $\mathbf{u}'_\alpha$ where both primary and secondary attributes are known. The indicator variable $i(\mathbf{u}'_\alpha; v_l)$ is defined as

\[ i(\mathbf{u}'_\alpha; v_l) = \begin{cases} 1 & \text{if } v(\mathbf{u}'_\alpha) \in (v_{l-1}, v_l] \\ 0 & \text{otherwise} \end{cases} \]

As mentioned for soft categorical information, the conditional frequencies $F^*(z_k|v_l)$ can be smoothed prior to being used or they can be borrowed from another better sampled field.

3. The set of $K$ local prior probabilities at $\mathbf{u}'_\alpha$ are then identified with the $K$ sample conditional frequencies $F^*(z_k|v_l)$ corresponding to the prevailing class of v-values:

\[ y(\mathbf{u}'_\alpha; z_k) = F^*(z_k|v_l) \quad \text{if } v(\mathbf{u}'_\alpha) \in (v_{l-1}, v_l] \qquad k = 1, \ldots, K \]

[Figure 7.15 panels: Ni data with class bounds $v_1$ = 15 ppm, $v_2$ = 25 ppm, and $v_3$ = 50 ppm; Calibration scattergram; horizontal axis: Ni value (ppm)]

Consider the indicator coding of five Ni concentrations along the NE-SW transect (Figure 7.15, top graph). The middle graph shows the calibration scattergram of Ni-values versus Cd-values ($n$ = 259). Three classes of Ni concentrations depicted by the vertical dashed lines are considered: (0, 15] ppm, (15, 25] ppm, and (25, 50] ppm. For each class, the cumulative distribution of Cd data is plotted, and the proportions of data not exceeding the threshold values $z_k$ = 0.80, 1.38, 1.88, or 2.26 ppm are computed. These calibration results are then used to code the prior information provided by each Ni datum (Figure 7.15, bottom graph). For example, the vector of soft indicator data at location $\mathbf{u}'_2$ where the Ni concentration is valued within the interval (0, 15] ppm is (second column at bottom Figure 7.15)

[Figure 7.15 panels: $F^*(z|v_1)$, $F^*(z|v_2)$, and $F^*(z|v_3)$; horizontal axis: Cd concentration (ppm)]
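The calibration described above, from a scattergram of colocated secondary and primary values to soft indicator data, can be sketched as follows. The data pairs are hypothetical and only illustrate the mechanics; the function names and the use of `None` for an empty class are choices made for this example.

```python
from bisect import bisect_left


def conditional_frequencies(pairs, class_upper_bounds, thresholds):
    """Table of proportions of z-values not exceeding each threshold, per class of v.

    pairs: colocated (v, z) values with v > 0; class l is (bound[l-1], bound[l]].
    Returns one row per class (None for an empty class), one column per threshold.
    """
    classes = [[] for _ in class_upper_bounds]
    for v, z in pairs:
        l = bisect_left(class_upper_bounds, v)
        if l < len(classes):  # values above the last bound are ignored
            classes[l].append(z)
    table = []
    for z_values in classes:
        if not z_values:
            table.append(None)
        else:
            n = len(z_values)
            table.append([sum(z <= z_k for z in z_values) / n for z_k in thresholds])
    return table


def soft_indicator_vector(v, class_upper_bounds, table):
    """Local prior probabilities at a location where only the secondary value v is known."""
    return table[bisect_left(class_upper_bounds, v)]


if __name__ == "__main__":
    # Hypothetical (Ni, Cd) calibration pairs in ppm.
    pairs = [(10, 0.5), (12, 0.9), (14, 1.5), (20, 1.0),
             (22, 2.0), (30, 2.5), (40, 1.9), (45, 3.0)]
    bounds = [15, 25, 50]
    thresholds = [0.80, 1.38, 1.88, 2.26]
    table = conditional_frequencies(pairs, bounds, thresholds)
    print(soft_indicator_vector(12.0, bounds, table))
```

Each row of the table is a discrete conditional cdf, so its entries are non-decreasing across ascending thresholds by construction.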

[Figure 7.15, bottom graph: Local prior probabilities]

Colocated sources of information

When a soft datum location $\mathbf{u}'_\alpha$ coincides with a hard datum location $\mathbf{u}_\alpha$, the set of hard indicator data $i(\mathbf{u}_\alpha; z_k)$ prevails over the colocated soft indicator data $y(\mathbf{u}'_\alpha; z_k)$. For example, the hard datum $z(\mathbf{u}_5)$ = 1.31 ppm prevails over ...
datum is a constraint interval $(a_\alpha, b_\alpha]$, the corresponding missing indicator data are replaced by soft indicator data at the corresponding thresholds, i.e., the local prior cdf (7.22) becomes:

\[ y(\mathbf{u}_\alpha; z_k) = \begin{cases} 1 & \text{if } b_\alpha \le z_k \\ F^*(z_k|s_l) & \text{if } z_k \in (a_\alpha, b_\alpha) \text{ and } s(\mathbf{u}_\alpha) = s_l \\ 0 & \text{if } z_k \le a_\alpha \end{cases} \qquad k = 1, \ldots, K \qquad (7.26) \]

For example, at location $\mathbf{u}_0$ belonging to the rock category $s_2$ (Figure 7.14), the vector of local prior probabilities (7.23) is completed as

\[ \begin{bmatrix} y(\mathbf{u}_0; 2.26) \\ y(\mathbf{u}_0; 1.88) \\ y(\mathbf{u}_0; 1.38) \\ y(\mathbf{u}_0; 0.80) \end{bmatrix} = \begin{bmatrix} 1 \\ F^*(1.88|s_2) \\ F^*(1.38|s_2) \\ 0 \end{bmatrix} \]

When several soft data are available at the same location $\mathbf{u}'_\alpha$, the local prior information consists of several prior cdfs of type (7.24) or (7.25). For example, one may know that the concentration of the secondary variable Ni is valued within (0, 15] ppm at location $\mathbf{u}'_2$ belonging to rock category $s_1$. Thus, there are two colocated vectors of local prior probabilities at $\mathbf{u}'_2$. The first vector contains the probabilities for the Cd concentration to be smaller than each of the four threshold values given that the Ni concentration is valued within (0, 15] ppm:

\[ \begin{bmatrix} y_1(\mathbf{u}'_2; 2.26) \\ y_1(\mathbf{u}'_2; 1.88) \\ y_1(\mathbf{u}'_2; 1.38) \\ y_1(\mathbf{u}'_2; 0.80) \end{bmatrix} = \begin{bmatrix} F^*(2.26|v_1) \\ F^*(1.88|v_1) \\ F^*(1.38|v_1) \\ F^*(0.80|v_1) \end{bmatrix} \]

The second vector consists of local prior probabilities derived from the calibration of the rock category $s_1$:

\[ \begin{bmatrix} y_2(\mathbf{u}'_2; 2.26) \\ y_2(\mathbf{u}'_2; 1.88) \\ y_2(\mathbf{u}'_2; 1.38) \\ y_2(\mathbf{u}'_2; 0.80) \end{bmatrix} = \begin{bmatrix} F^*(2.26|s_1) \\ F^*(1.88|s_1) \\ F^*(1.38|s_1) \\ F^*(0.80|s_1) \end{bmatrix} \]

Different sources of soft information could be ranked according to their ability to predict hard data; see the discussion related to Figure 7.23 (page 316). The vector of local prior probabilities corresponding to the most accurate soft information would then prevail over the other vectors.

7.3.2 Updating into ccdf values

Indicator coding allows different types of information (hard and soft data) to be processed together, regardless of their origins. The objective is to evaluate at any location $\mathbf{u}$ the set of $K$ ccdf values or posterior probabilities:

\[ F(\mathbf{u}; z_k|(n)) = \text{Prob}\{ Z(\mathbf{u}) \le z_k \,|\, (n) \} \qquad k = 1, \ldots, K \qquad (7.27) \]

where the conditioning information $(n)$ consists of hard and soft data retained in a search neighborhood $W(\mathbf{u})$ centered at $\mathbf{u}$. As mentioned previously, the least-squares (kriging) estimate of the indicator $i(\mathbf{u}; z_k)$ can be used as a model for the ccdf value of $z(\mathbf{u})$ at a particular threshold value $z_k$:

\[ [F(\mathbf{u}; z_k|(n))]^* = [i(\mathbf{u}; z_k)]^* \]

Each ccdf value can thus be derived as a linear combination of neighboring hard and soft indicator data, $i(\mathbf{u}_\alpha; z_k)$ and $y(\mathbf{u}'_\alpha; z_k)$, using kriging algorithms similar to those introduced in Chapters 5 and 6. Note that $K$ (co)kriging systems, one for each threshold value $z_k$, must be solved at any location $\mathbf{u}$.

This section presents kriging algorithms that account for hard indicator transforms of the sole attribute of interest. Algorithms for incorporating soft information are introduced in section 7.3.3.

First, consider the problem of estimating the indicator value $i(\mathbf{u}; z_k)$ at any location $\mathbf{u}$ using only hard indicator data defined at the same threshold value $z_k$. The linear estimator (5.1) is expressed in terms of indicator RVs as

\[ [I(\mathbf{u}; z_k)]^* - E\{ I(\mathbf{u}; z_k) \} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha(\mathbf{u}; z_k) \left[ I(\mathbf{u}_\alpha; z_k) - E\{ I(\mathbf{u}_\alpha; z_k) \} \right] \qquad (7.28) \]

where $\lambda_\alpha(\mathbf{u}; z_k)$ is the weight assigned to the indicator datum $i(\mathbf{u}_\alpha; z_k)$ interpreted as a realization of the indicator RV $I(\mathbf{u}_\alpha; z_k)$. As with kriging estimators of the z-attribute values, two indicator kriging (IK) variants are distinguished, depending on whether the indicator mean is considered constant within the study area $\mathcal{A}$.

Simple indicator kriging

The simple indicator kriging (sIK) estimator considers the indicator mean known and constant throughout $\mathcal{A}$:

\[ E\{ I(\mathbf{u}; z_k) \} = F(z_k), \quad \text{known } \forall\, \mathbf{u} \in \mathcal{A} \]

The linear estimator (7.28) is thus written as a linear combination of $(n(\mathbf{u}) + 1)$ pieces of information, the $n(\mathbf{u})$ indicator RVs $I(\mathbf{u}_\alpha; z_k)$ and the cdf value $F(z_k)$:

\[ [F(\mathbf{u}; z_k|(n))]^*_{sIK} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u}; z_k)\, I(\mathbf{u}_\alpha; z_k) + \lambda^{SK}_m(\mathbf{u}; z_k)\, F(z_k) \qquad (7.29) \]
where the weight of the mean is defined as

\[ \lambda^{SK}_m(\mathbf{u}; z_k) = 1 - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u}; z_k) \]

The indicator mean could be estimated by the sample cumulative frequency $F^*(z_k)$ after possible correction for preferential sampling (see section 4.1.2). The simple IK weights are provided by a simple kriging system of type (5.10):

\[ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{SK}_\beta(\mathbf{u}; z_k)\, C_I(\mathbf{u}_\alpha - \mathbf{u}_\beta; z_k) = C_I(\mathbf{u}_\alpha - \mathbf{u}; z_k) \qquad \alpha = 1, \ldots, n(\mathbf{u}) \]

Ordinary indicator kriging

Ordinary indicator kriging (oIK) allows one to account for local fluctuations of the indicator mean by limiting the domain of stationarity of that mean to the local neighborhood $W(\mathbf{u})$:

\[ [F(\mathbf{u}; z_k|(n))]^*_{oIK} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{OK}_\alpha(\mathbf{u}; z_k)\, I(\mathbf{u}_\alpha; z_k) \qquad (7.31) \]

where the weights are given by an ordinary kriging system of type (5.18):

\[ \begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}; z_k)\, C_I(\mathbf{u}_\alpha - \mathbf{u}_\beta; z_k) + \mu_{OK}(\mathbf{u}; z_k) = C_I(\mathbf{u}_\alpha - \mathbf{u}; z_k) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{OK}_\beta(\mathbf{u}; z_k) = 1 \end{cases} \]

As with ordinary kriging of a continuous variable, ordinary IK with local search neighborhood amounts to:

1. estimating the local indicator mean within each search neighborhood $W(\mathbf{u})$, and

2. applying the simple IK estimator (7.29) using that estimate of the mean rather than the sample cumulative frequency $F^*(z_k)$.

Exactitude property

Both simple and ordinary IK estimators are exact interpolators because they honor hard indicator data at their locations:

\[ [F(\mathbf{u}_\alpha; z_k|(n))]^* = i(\mathbf{u}_\alpha; z_k) \qquad \forall\, \alpha, k \]

Where the z-value is known exactly, the posterior cdf is the unit-step prior cdf (no updating). Where the prior information is a constraint interval $(a_\alpha, b_\alpha]$, only the missing indicator values are updated by posterior probabilities:

\[ [F(\mathbf{u}_\alpha; z_k|(n))]^* \begin{cases} = 0 & \text{if } z_k \le a_\alpha \\ \in [0, 1] & \text{if } z_k \in (a_\alpha, b_\alpha) \\ = 1 & \text{if } z_k \ge b_\alpha \end{cases} \]

Example

Simple and ordinary indicator kriging are used along the NE-SW transect to determine ccdf values at the four threshold values $z_k$ = 0.80, 1.38, 1.88, and 2.26 ppm. The information available consists of ten Cd concentrations at locations $\mathbf{u}_1$ to $\mathbf{u}_{10}$ (black dots) plus the constraint interval (1.0, 2.0 ppm] at location $\mathbf{u}_0$ (Figure 7.16, top graph). For each threshold value, the indicator semivariogram is inferred from all data available over the study area (Figure 7.16, middle graphs). Estimation is performed every 50 m using at each location the five closest indicator data related only to the threshold $z_k$ being considered. The stationary indicator means $F(z_k)$ required by simple IK are estimated from the cumulative histogram of the ten Cd values.
[Figure 7.16 panels: Cd data; indicator semivariograms for the 1st threshold (0.8 ppm), 2nd threshold (1.38 ppm), 3rd threshold (1.88 ppm), and 4th threshold (2.26 ppm); horizontal axis: Distance (km)]

Figure 7.16 (bottom graph) shows both oIK (solid line) and sIK (small dashed line) estimates of the probability of exceeding the critical threshold $z_1$ = 0.8 ppm computed as $1 - [F(\mathbf{u}; z_1|(n))]^*$. The horizontal dashed line depicts the marginal probability $[1 - F(z_1)]$ of exceeding the threshold value $z_1$, estimated by the proportion of Cd values greater than 0.8 ppm: 7/10 = 0.7. Note the following:

• Both IK estimators are exact. The posterior probability of exceeding the critical threshold is zero at data locations $\mathbf{u}_2$ to $\mathbf{u}_4$, where the Cd concentration is smaller than 0.8 ppm, and that probability is 1 at other data locations known to be contaminated.

• The local estimation of the indicator mean within each search neighborhood yields ordinary IK estimates that follow the data trends better than the simple IK estimates. The posterior probabilities of contamination are smaller in the low-valued (left) part of the transect and are larger in the high-valued (right) part.

• Beyond location $\mathbf{u}_6$, ordinary IK yields a unit probability of exceeding the critical threshold, which corresponds to a zero cumulative probability $[F(\mathbf{u}; z_1|(n))]^*$. In this part of the transect, all data locations within the search neighborhood $W(\mathbf{u})$ are contaminated, which entails that all indicator data $i(\mathbf{u}_\alpha; z_1)$ are zero and so is their estimated mean (7.31). In contrast, away from the data (extrapolation situation), the sIK estimate approaches the marginal probability 0.7.

At locations $\mathbf{u}_5$ and $\mathbf{u}_0$, ordinary IK yields the following vectors of posterior probabilities: $\mathbf{u}_5$: [0 1 1 1], $\mathbf{u}_0$: [0.0 0.16 0.54 1.0]. The exactitude property of the IK estimator entails that the ccdf at the datum location $\mathbf{u}_5$ is identical to the prior cdf (7.21) (no updating). Similarly, the ccdf at location $\mathbf{u}_0$ honors the constraint interval information (7.23); only the two missing indicator values at thresholds $z_2$ and $z_3$ are updated into the posterior probabilities 0.16 and 0.51, respectively.
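The simple and ordinary indicator kriging estimators discussed above can be sketched for a one-dimensional transect as follows. This is a minimal illustration under assumptions, not the procedure used for Figure 7.16: all data are used instead of a search neighborhood, the indicator covariance is a spherical model whose sill and range are hypothetical, the data are made up, and a small Gaussian elimination routine stands in for a linear algebra library.

```python
def spherical_covariance(h, sill, a):
    """Covariance C(h) = sill - gamma(h) for a spherical model with range a."""
    h = abs(h)
    if h >= a:
        return 0.0
    r = h / a
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def indicator_kriging(u, locations, z_values, z_k, cov, mean=None):
    """Estimate Prob{Z(u) <= z_k} from hard data at distinct locations.

    mean given  -> simple IK (weight of the mean = 1 - sum of data weights)
    mean = None -> ordinary IK (data weights constrained to sum to 1)
    """
    indicators = [1.0 if z <= z_k else 0.0 for z in z_values]
    n = len(locations)
    c = [[cov(locations[i] - locations[j]) for j in range(n)] for i in range(n)]
    c0 = [cov(locations[i] - u) for i in range(n)]
    if mean is not None:
        weights = solve(c, c0)
        estimate = sum(w * i for w, i in zip(weights, indicators))
        return estimate + (1.0 - sum(weights)) * mean
    # Ordinary kriging system: covariances bordered by the unbiasedness constraint.
    system = [c[i] + [1.0] for i in range(n)] + [[1.0] * n + [0.0]]
    weights = solve(system, c0 + [1.0])[:n]
    return sum(w * i for w, i in zip(weights, indicators))


if __name__ == "__main__":
    locations = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical positions (km)
    cd = [0.5, 0.7, 1.5, 2.0, 2.5]          # hypothetical concentrations (ppm)

    def cov(h):
        return spherical_covariance(h, 0.24, 1.5)

    for u in (1.0, 1.5, 3.0, 100.0):
        p_sik = indicator_kriging(u, locations, cd, 0.8, cov, mean=0.4)
        p_oik = indicator_kriging(u, locations, cd, 0.8, cov)
        # Probability of exceeding the threshold is one minus the ccdf value.
        print(u, round(1.0 - p_sik, 3), round(1.0 - p_oik, 3))
```

In this sketch both estimators return the hard indicator value at a datum location, and the simple IK estimate falls back to the supplied mean once the estimation location is beyond the range of every datum.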

[Figure 7.16, bottom panel: Posterior probability of exceeding 0.8 ppm; horizontal axis: Distance (km)]

Figure 7.16: Simple and ordinary indicator kriging estimates of the probability of exceeding the critical threshold value 0.8 ppm. The information available consists of ten Cd concentrations and the constraint interval (1.0, 2.0 ppm] at $\mathbf{u}_0$, plus the indicator semivariogram modeled from 259 indicator data.

Indicator cokriging

Neither of the IK estimators (7.29) and (7.31) makes full use of the information available in that they ignore indicator data at thresholds different from that being estimated. Information from all $K$ thresholds can be accounted for using the cokriging formalism introduced in section 6.2. For example, the ordinary indicator cokriging (oICK) estimator of the ccdf value at threshold $z_{k_0}$ would be written

\[ [F(\mathbf{u}; z_{k_0}|(n))]^*_{oICK} = \sum_{k=1}^{K} \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{oICK}_{\alpha k}(\mathbf{u}; z_{k_0})\, I(\mathbf{u}_\alpha; z_k) \qquad (7.33) \]

where $\lambda^{oICK}_{\alpha k}(\mathbf{u}; z_{k_0})$ is the weight assigned to indicator datum $i(\mathbf{u}_\alpha; z_k)$ at location $\mathbf{u}_\alpha$. These cokriging weights are solutions of an ordinary cokriging
system of type (6.32):

\[ \begin{cases} \sum_{k'=1}^{K} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{oICK}_{\beta k'}(\mathbf{u}; z_{k_0})\, C_I(\mathbf{u}_\alpha - \mathbf{u}_\beta; z_k, z_{k'}) + \mu_k(\mathbf{u}; z_{k_0}) = C_I(\mathbf{u}_\alpha - \mathbf{u}; z_k, z_{k_0}) & \alpha = 1, \ldots, n(\mathbf{u}), \; k = 1, \ldots, K \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{oICK}_{\beta k}(\mathbf{u}; z_{k_0}) = \delta_{k k_0} & k = 1, \ldots, K \end{cases} \qquad (7.34) \]

where $C_I(\mathbf{h}; z_k, z_{k'})$ is the cross covariance between any two indicator RVs $I(\mathbf{u}; z_k)$ and $I(\mathbf{u} + \mathbf{h}; z_{k'})$, and $\delta_{k k_0} = 1$ for $k = k_0$ and $\delta_{k k_0} = 0$ otherwise.

As mentioned in section 6.2.4, ordinary cokriging constraints that call for the secondary data weights to sum to zero tend to limit artificially the influence of these secondary data. These constraints also increase the occurrence of negative weights, hence the risk of getting unacceptable estimates such as negative posterior probabilities (Goovaerts, 1994a). Therefore, one might consider rescaling each secondary indicator variable $I(\mathbf{u}; z_k)$ to the mean of the primary indicator variable, then solving the ordinary indicator cokriging system (7.34) with the single unbiasedness constraint that calls for all primary and secondary data weights to sum to 1.

Indicator kriging versus indicator cokriging

Indicator cokriging is much more demanding than indicator kriging for two reasons:

1. $K(K + 1)/2$ direct and cross indicator semivariograms must be inferred and jointly modeled (e.g., 45 semivariograms for 9 threshold values).

Note that the modeling and computational effort necessitated by indicator cokriging may be alleviated by kriging principal indicator components; see section 6.2.5 and Suro-Pérez and Journel (1991).

In theory, the indicator cokriging estimator is better (in a least-squares sense) than the IK estimator because it accounts for additional information available across all thresholds. Practice has shown, however, that indicator cokriging improves little over indicator kriging (Goovaerts, 1994a) for the following reasons:

• ... screens the influence of colocated indicator values at other thresholds $z_k \neq z_{k_0}$.

[Figure 7.17 panels: direct indicator semivariograms and cross semivariograms between the 1st-2nd, 1st-3rd, 2nd-3rd, 1st-4th, 2nd-4th, and 3rd-4th thresholds; horizontal axis: Distance (km)]

Figure 7.17: Experimental standardized indicator direct and cross semivariograms corresponding to four different Cd threshold values, with the linear model of coregionalization fitted.

Example

Figure 7.17 shows the ten omnidirectional standardized indicator direct and cross semivariograms for the four threshold Cd values $z_k$ = 0.80, 1.38, 1.88, and 2.26 ppm. The iterative procedure described in Appendix A is used to fit a linear model of coregionalization, including three basic structures:
a nugget effect and two spherical models with ranges of 500 m and 1.0 km, respectively.

Figure 7.18 shows the four profiles of posterior probabilities estimated using ordinary indicator kriging (dashed line) and ordinary indicator cokriging with a single unbiasedness constraint of the type described in section 6.2.4 (solid line). Note the following:

[Figure 7.18 panels: 1st threshold 0.8 ppm; 2nd threshold 1.38 ppm; 3rd threshold 1.88 ppm; 4th threshold 2.26 ppm]

Figure 7.18: Ordinary kriging and cokriging estimates of the probability of exceeding four different threshold values. Arrows indicate inconsistent probabilities outside the interval [0, 1] produced by indicator cokriging.

A shortcut to indicator cokriging consists of using the original data values $z(\mathbf{u}_\alpha)$ instead of indicator transforms at thresholds different from that being estimated. The resulting cokriging estimator is less cumbersome because it accounts for a single secondary variable:

\[ [F(\mathbf{u}; z_k|(n))]^* = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda_\alpha(\mathbf{u}; z_k)\, I(\mathbf{u}_\alpha; z_k) + \sum_{\alpha=1}^{n(\mathbf{u})} \nu_\alpha(\mathbf{u}; z_k)\, Z(\mathbf{u}_\alpha) \]

Large differences between the units of measurement of $z$ and its indicator transforms may cause instability problems when solving the cokriging system. One solution consists of replacing the z-values by their standardized ranks $x(\mathbf{u}_\alpha) = r(\mathbf{u}_\alpha)/n$, where $r(\mathbf{u}_\alpha) \in [1, n]$ is the rank of the datum $z(\mathbf{u}_\alpha)$ in the sample cumulative distribution (see section 2.1.2). The indicator data $i(\mathbf{u}_\alpha; z_k)$ are valued either 0 or 1, whereas the rank-order transforms $x(\mathbf{u}_\alpha)$ are uniformly distributed in [0, 1].

The cokriging of the indicator transform $i(\mathbf{u}; z_k)$ using the rank-order transform $x(\mathbf{u})$ as secondary variable is referred to as probability kriging (Isaaks, 1984; Journel, 1984b; Sullivan, 1984, 1985). The probability kriging (PK) estimator is

\[ [F(\mathbf{u}; z_k|(n))]^*_{PK} = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{PK}_\alpha(\mathbf{u}; z_k)\, I(\mathbf{u}_\alpha; z_k) + \sum_{\alpha=1}^{n(\mathbf{u})} \nu^{PK}_\alpha(\mathbf{u}; z_k)\, X(\mathbf{u}_\alpha) \qquad (7.35) \]

The PK weights $\lambda^{PK}_\alpha(\mathbf{u}; z_k)$ and $\nu^{PK}_\alpha(\mathbf{u}; z_k)$ are obtained by solving the following ordinary cokriging system of $(2n(\mathbf{u}) + 2)$ equations:

\[ \begin{cases} \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{PK}_\beta(\mathbf{u}; z_k)\, C_I(\mathbf{u}_\alpha - \mathbf{u}_\beta; z_k) + \sum_{\beta=1}^{n(\mathbf{u})} \nu^{PK}_\beta(\mathbf{u}; z_k)\, C_{IX}(\mathbf{u}_\alpha - \mathbf{u}_\beta; z_k) + \mu_I(\mathbf{u}; z_k) = C_I(\mathbf{u}_\alpha - \mathbf{u}; z_k) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{PK}_\beta(\mathbf{u}; z_k)\, C_{XI}(\mathbf{u}_\alpha - \mathbf{u}_\beta; z_k) + \sum_{\beta=1}^{n(\mathbf{u})} \nu^{PK}_\beta(\mathbf{u}; z_k)\, C_X(\mathbf{u}_\alpha - \mathbf{u}_\beta) + \mu_X(\mathbf{u}; z_k) = C_{XI}(\mathbf{u}_\alpha - \mathbf{u}; z_k) & \alpha = 1, \ldots, n(\mathbf{u}) \\ \sum_{\beta=1}^{n(\mathbf{u})} \lambda^{PK}_\beta(\mathbf{u}; z_k) = 1 \\ \sum_{\beta=1}^{n(\mathbf{u})} \nu^{PK}_\beta(\mathbf{u}; z_k) = 0 \end{cases} \qquad (7.36) \]

where $C_X(\mathbf{h})$ is the covariance function of the uniform RF $X(\mathbf{u})$, and $C_{XI}(\mathbf{h}; z_k)$ is the cross covariance function between $X(\mathbf{u})$ and the indicator RF $I(\mathbf{u}; z_k)$.
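The rank-order (uniform) transform used as secondary variable in probability kriging can be sketched as follows. The data values are hypothetical, ties are broken by order of appearance (a choice made for this example), and the function names are illustrative.

```python
def standardized_ranks(z_data):
    """Uniform transform x = r / n, where r in [1, n] is the rank of each datum."""
    n = len(z_data)
    order = sorted(range(n), key=lambda i: z_data[i])
    x = [0.0] * n
    for rank, i in enumerate(order, start=1):
        x[i] = rank / n
    return x


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n


if __name__ == "__main__":
    cd = [1.2, 0.5, 0.3, 0.7, 1.31, 2.0, 2.5, 3.0, 1.6, 1.9]  # hypothetical (ppm)
    x = standardized_ranks(cd)
    i = [1.0 if z <= 0.8 else 0.0 for z in cd]  # indicator of non-exceedence at 0.8 ppm
    print(x)
    print(covariance(x, i))  # negative: small values have low ranks and unit indicators
```

With ten data, the smallest value receives the uniform score 0.1 and the largest 1.0, and the indicator and uniform transforms are negatively correlated by construction.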
302 I A 1 T 7. ASSIWMEN?' 01,' LOCAL UNCEICIXINTP 73 '17115 INDICATOR AI'I'ROACII 303

Tlie PI< estimator (7.35) uses ruore infor~uatiouthan the ordinary I K I Indicator and uniform data
estimator (7.31) hecause the rank of lhe datum ~(11,) in the sample crlf is
taken into a c c o r ~ ~ill i t additiol~to it.$ indicator of e x r c c d i ~ ~llle
g tlrrcsliold
value zn. T h e 1,r;zrle-offcost for this brtter condilioui~tgof l l ~ cposl.nrior ctlf

. .. .. . .
is the inference of tlie semivariogram y,y(h) and the I< cross se~rtivariogra~~rs 0
y x I ( h ; z t ) , and the n i o d e l i ~ ~ofg the coregiorraliaation between indicator a ~ r d O u 3 o Indicator
,-
unifornr transforms a t each thresl~olil.
l o reduce the proporlio~lof uegative cokrigiilg wciglits and i:r~l~ancet11f.
0
o Uniform

inflwr~ceof rank-order transforms in the PK estim;it,or, the two unbiased-


ness constraints ill the cokrigitig system (7.36) can be replaced by the single
constraint that all tlie cokriging wt:iglit,s n u s t s u n to 1:

03 Indicator
1
a=1 a=1

T h e PK cst,inlator is ~ I I I I Srewrittell

1 Posterior probability of exceeding 0.8 ppm

since the stationary meal^ of t11e 1111ifor11i


tranvforri~S ( u ) is 0.5

Example
Figure 7.19 (top graph) shows the uniform transforms (open circles) and indicator transforms for the threshold value 0.8 ppm (closed circles) of ten Cd concentrations. The rank-order transform is based on only the ten Cd data values, hence the two extreme uniform data values are 0.1 for the smallest Cd concentration (at u3) and 1.0 for the largest concentration. Both data sets are combined using the PK estimator (7.37) with a single unbiasedness constraint. The semivariograms of indicator and uniform transforms and their cross semivariogram are inferred from all data available over the study area (Figure 7.19, middle graph). By construction, indicator and uniform data are negatively correlated since a zero indicator datum corresponds to a large Cd concentration and, therefore, to a high rank in the sample cumulative distribution.

Figure 7.19 (bottom graph) shows both PK (solid line) and oIK (dashed line) estimates of the probability that Cd concentration exceeds 0.8 ppm.

• Like the cokriging in Figure 7.18, probability kriging yields an interpolation profile more variable in space, particularly in the right part of the transect. Indeed, the data ranks valued in [0, 1] allow one to discriminate Cd concentrations with similar indicator transforms 0 or 1. In this way, probability kriging corrects for the loss of resolution caused by the use of a single threshold in indicator kriging.

• Unlike ordinary cokriging, probability kriging produces only two probabilities outside the interval [0, 1]. The use of a single secondary variable in the PK estimator lessens the screening effect between variables, thereby reducing the risk of getting negative cokriging weights and inconsistent probabilities.

Figure 7.19: Probability kriging estimates of the probability of exceeding 0.8 ppm. Indicator and uniform transforms are combined using ordinary cokriging with a single unbiasedness constraint and the semivariogram models shown in the second row. The dashed line depicts the ordinary kriging estimate using only indicator data. The arrow indicates two inconsistent probabilities produced by probability kriging.
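The two data sets combined by probability kriging can be sketched in a few lines. This is only an illustration: the ten Cd values below are hypothetical (they are not the transect data), the function names are invented for the sketch, and ties in the ranking are not handled.

```python
import numpy as np

def indicator_transform(z, zk):
    """Indicator datum: 1 if z does not exceed the threshold zk, else 0."""
    return (np.asarray(z) <= zk).astype(float)

def uniform_transform(z):
    """Rank-order (uniform) transform: rank / n, valued in (0, 1]."""
    z = np.asarray(z)
    ranks = np.argsort(np.argsort(z)) + 1   # rank 1 = smallest value
    return ranks / z.size

# Hypothetical Cd concentrations (ppm) at ten locations
cd = np.array([0.55, 0.70, 0.25, 0.95, 1.10, 1.30, 1.65, 2.10, 3.80, 1.90])

i08 = indicator_transform(cd, 0.8)   # indicator data at 0.8 ppm
x = uniform_transform(cd)            # uniform data
corr = np.corrcoef(i08, x)[0, 1]     # negative by construction
```

With ten data, the smallest value receives the uniform datum 0.1 and the largest receives 1.0, and the correlation between the two transforms is negative because small concentrations carry an indicator of 1 and a low rank.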
Though less demanding than indicator cokriging or probability kriging, indicator kriging still calls for estimating and modeling K indicator semivariograms and solving K kriging systems at each location u. The modeling and computational effort can be substantially alleviated if the two following conditions are jointly met:

1. The K indicator RFs I(u; z_k) are intrinsically correlated; recall expression (4.41). All the K(K+1)/2 indicator direct and cross semivariogram models are then proportional to a common semivariogram model γ_mI(h); this is the coregionalization model (7.38).

2. All vectors of hard indicator data retained in the estimation are complete (equally sampled case); there are no missing indicator values such as implied by constraint intervals of type (7.22).

As shown in section 6.2.2, kriging and cokriging estimators are then identical; for the ordinary (co)kriging case, this yields the estimator (7.39).

The estimator (7.39) is called median indicator kriging (mIK) since the common model γ_mI(h) is usually inferred from the indicator semivariogram at the median threshold value z_M = F^{-1}(0.5) (Journel, 1984b). Indeed, median indicator data i(u_α; z_M) are evenly distributed as 0 and 1 values, which usually renders the experimental indicator semivariogram at z_M better defined than at other threshold values.

Since the indicator data configuration is the same for all threshold values, the kriging weights λ_α^OK(u) do not depend on the threshold being considered; therefore, only one IK system needs to be solved at each location u. In that system, the indicator covariance function C_mI(h) is deduced as 0.25 − γ_mI(h) if the model γ_mI(h) relates to the median threshold value z_M.

Median IK is very fast because it requires only one indicator semivariogram (median) to be modeled and a single IK system to be solved at each location u. However, such an approach calls for the particular coregionalization model (7.38) that does not allow different shapes or anisotropy patterns for the K indicator semivariograms. When sample indicator semivariograms appear not to be proportional to each other, as in the case of the Cd semivariograms in Figure 7.17, the more flexible and still reasonably fast indicator kriging should be used.

Block versus composite ccdfs
Let z_V(u) be the linear average value of an attribute z over a block V of any specific dimensions:

$$z_V(u) = \frac{1}{|V|} \int_{V(u)} z(u')\, du' \approx \frac{1}{N} \sum_{i=1}^{N} z(u'_i) \tag{7.41}$$

where |V| is the measure (length, area, volume) of block V. The integral is, in practice, approximated by a discrete sum of z-values defined at N points u'_i discretizing the block V(u). For the level of information (n), the uncertainty about the block value z_V(u) is modeled by the "block" posterior cdf (7.42), with the block indicator i_V defined as i_V(u; z) = 1 if z_V(u) ≤ z and equal to zero otherwise.

If block data z_V(u_α) were available, the posterior cdf (7.42) at any threshold z_k could be obtained by indicator kriging from block indicator data defined as

$$i_V(u_\alpha; z_k) = \begin{cases} 1 & \text{if } z_V(u_\alpha) \le z_k \\ 0 & \text{otherwise} \end{cases}$$

Unfortunately, block data z_V(u_α) do not usually exist, hence the block ccdf must be modeled from the point data z(u_α) alone.

Because the indicator variable i(u; z) is a non-linear transform of the original variable z(u), the block indicator i_V(u; z) is not a linear average of point indicators i(u; z):

$$i_V(u; z) \ne \frac{1}{|V|} \int_{V(u)} i(u'; z)\, du'$$

Therefore, the block ccdf F_V(u; z|(n)) cannot be derived as a linear average of point ccdfs:

$$F_V(u; z|(n)) \ne \frac{1}{|V|} \int_{V(u)} F(u'; z|(n))\, du'$$
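The difference between the indicator of a block average and the average of point indicators can be checked numerically. The sketch below uses four hypothetical point values discretizing a block; the function names are illustrative only.

```python
import numpy as np

def block_indicator(z_points, zk):
    """Indicator of the block average: 1 if mean(z) <= zk, else 0."""
    return float(np.mean(z_points) <= zk)

def composite_proportion(z_points, zk):
    """Average of point indicators: proportion of point values <= zk."""
    return float(np.mean(np.asarray(z_points) <= zk))

# Hypothetical z-values at N = 4 points discretizing a block
z = np.array([0.2, 0.3, 0.4, 2.5])
zk = 0.8

i_block = block_indicator(z, zk)        # block average is 0.85, above zk
p_points = composite_proportion(z, zk)  # three of the four points are below zk
```

Here the single large value pulls the block average above the threshold, so the block indicator is 0, while three quarters of the point values do not exceed the threshold. The two quantities answer different questions, which is why the linear average of point indicators cannot stand in for the block indicator.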

The "composite" ccdf [F_V(u; z|(n))]* is an estimate of the proportion of point values within V(u) that do not exceed the threshold value z, whereas the block ccdf gives the probability that the average z-value is no greater than z. A simulation-based approach for deriving block ccdfs from point data is introduced in section 8.5.

In many environmental applications, risks relate to the occurrence of large concentrations over small volumes. Such risks are underestimated by linear averaging (7.41) over the block V(u) because extreme values are smoothed out. Hence, it is often the composite ccdf that is of interest, not the block ccdf. Block kriging formalism allows one to estimate the composite probability at threshold z_k as a linear combination of point indicator data i(u_α; z_k). For example, the block oIK estimator is written as a weighted sum of the point indicator data, where the weights are given by a block OK system of type (5.44).

The average indicator covariance C̄_I(u_α, V(u); z_k) is approximated by the arithmetic average of the point-support indicator covariances C_I(u_α − u'_i; z_k) defined between u_α and any of the N points u'_i discretizing the block V(u):

$$\bar{C}_I(u_\alpha, V(u); z_k) \approx \frac{1}{N} \sum_{i=1}^{N} C_I(u_\alpha - u'_i; z_k)$$

7.3.3 Accounting for secondary information

The major advantage of the indicator approach is its ability to incorporate soft information of various types in addition to direct measurements on the attribute of interest. Once soft data have been coded into local prior probabilities of type (7.24) or (7.25), they can be processed with hard data using kriging algorithms introduced in Chapter 6. This section presents two other indicator algorithms for incorporating soft data:

1. The most straightforward method consists of a simple IK of the hard indicator data using the soft prior probabilities as local indicator means.

2. The second method is a form of indicator cokriging of hard indicator data using the soft prior probabilities as secondary data.

Consider the situation where the primary data {z(u_α), α = 1, ..., n} are supplemented by exhaustively sampled secondary information that may relate to either a categorical attribute s or a continuous attribute v. Indicator coding yields, for each threshold value z_k, a set of hard indicator data {i(u_α; z_k), α = 1, ..., n} and an exhaustive set of soft indicator data {y(u; z_k), u ∈ A}.

In simple kriging, the marginal probability F(z_k) does not depend on the location u and represents the global prior information common to all unsampled locations under the decision of stationarity. To account for the soft datum available at each location, the marginal probability is replaced by the soft prior probability y(u; z_k) (Goovaerts and Journel, 1995). The simple IK estimator is then rewritten:

$$[F(u; z_k|(n))]^*_{sIK} = y(u; z_k) + \sum_{\alpha=1}^{n(u)} \lambda^{SK}_\alpha(u; z_k)\, [i(u_\alpha; z_k) - y(u_\alpha; z_k)] \tag{7.43}$$

where y(u_α; z_k) is identified with the sample conditional frequency F*(z_k|s_l) or F*(z_k|v_l), depending on whether the soft information relates to a categorical or a continuous attribute (recall section 7.3.1). The kriging weights are obtained by solving a simple IK system:

$$\sum_{\beta=1}^{n(u)} \lambda^{SK}_\beta(u; z_k)\, C_R(u_\alpha - u_\beta; z_k) = C_R(u_\alpha - u; z_k), \quad \alpha = 1, \ldots, n(u)$$

where C_R(h; z_k) is the covariance function of the residual RF R(u; z_k) = I(u; z_k) − y(u; z_k) at threshold value z_k.

If the secondary data do not allow a significant differentiation of z-values, the L prior probabilities F*(z_k|s_l) or F*(z_k|v_l) would be similar and close to the sample marginal probability F*(z_k). The estimate would then revert to the simple IK estimate with constant indicator mean. The estimator (7.43) is exact because it honors hard indicator data i(u_α; z_k) at their locations.
1. The proportion of Cd data not exceeding z_k is first computed for each rock category s_l (Figure 7.20, left, top graph).

2. The local prior probability y(u; z_k) at each location along the transect is identified with the proportion computed for the rock category prevailing there (second row).

3. At each hard datum location u_α, the residual value r(u_α; z_k) is computed by subtracting the soft indicator datum y(u_α; z_k) from the colocated hard indicator datum i(u_α; z_k). The semivariogram of residuals is then computed and modeled. Figure 7.20 (right top graph) shows the experimental residual semivariogram inferred from the residual data set (259 data), with the model fitted.

4. The residual values are estimated along the transect using simple kriging and the five closest residual data r(u_α; z_k) (third row). The posterior probability [F(u; z_k|(n))]*_sIK is obtained by adding the soft indicator datum y(u; z_k) to the SK estimate of the residual at u (Figure 7.20, bottom graph).
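The steps above can be sketched on a one-dimensional transect. Everything specific here is an assumption made for illustration: the data are hypothetical, the residual covariance is taken as an exponential model with invented sill and range, and all data are used in the kriging system rather than the five closest.

```python
import numpy as np

def exp_cov(h, sill, a):
    """Exponential covariance of the residuals (assumed model, practical range a)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def sik_local_means(x, ind, cat, x0, cat0, sill=0.1, a=1.5):
    """Simple IK with varying local means at location x0 of a 1-D transect."""
    # Step 1: proportion of data not exceeding the threshold, per category
    prior = {int(c): float(ind[cat == c].mean()) for c in np.unique(cat)}
    # Step 2: soft datum = prior probability of the category prevailing there
    y = np.array([prior[int(c)] for c in cat])
    # Step 3: residuals between hard and soft indicator data
    r = ind - y
    # Step 4: simple kriging of the residual, then add the local mean back
    C = exp_cov(x[:, None] - x[None, :], sill, a)
    c0 = exp_cov(x - x0, sill, a)
    w = np.linalg.solve(C, c0)
    return prior[int(cat0)] + float(w @ r)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])     # hypothetical locations (km)
ind = np.array([1.0, 1.0, 0.0, 1.0, 0.0, 0.0])   # hypothetical hard indicators
cat = np.array([1, 1, 1, 2, 2, 2])               # hypothetical rock categories

p_at_datum = sik_local_means(x, ind, cat, x0=2.0, cat0=1)
p_far_away = sik_local_means(x, ind, cat, x0=500.0, cat0=2)
```

At a datum location the estimate reproduces the hard indicator datum, and far from all data it reverts to the local prior probability of the prevailing category. The raw estimate is not forced into [0, 1] in this sketch.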
The profile of simple IK estimates is converted into a profile of probabilities of exceeding the critical threshold 0.8 ppm (Figure 7.21, solid line). As discussed in previous sections and shown in Figure 7.21 (dashed line), ordinary indicator kriging yields a unit probability of exceeding 0.8 ppm in the right part of the transect where all data locations are contaminated. Calibration of geologic information indicates, however, that Cd concentration has a smaller probability (76.5%) of exceeding that threshold on the rock category s_2 prevailing in this part of the transect. Accounting for this local prior information reduces the probability of contamination. Similarly, accounting for the non-zero (33.9%) prior probability of contamination on Argovian rocks increases the probability of contamination in the low-valued part of the transect.

Figure 7.20: Simple kriging with varying local means. The trend component at location u is identified with the probability of not exceeding the critical threshold of 0.8 ppm for the rock type prevailing there.

[Figure 7.21: posterior probability of exceeding 0.8 ppm.]

Soft cokriging
Rather than using the soft indicator data y(u; z_k) as local indicator means, these data can be interpreted as a realization of a RF Y(u; z_k) correlated with the indicator RF I(u; z_k). Hard and soft indicator data are then combined using indicator cokriging where I(u; z_k) and Y(u; z_k) are the primary and secondary variables, respectively. For example, the ordinary indicator cokriging estimator at threshold z_k is

$$[F(u; z_k|(n))]^*_{oICK} = \sum_{\alpha_1=1}^{n_1(u)} \lambda^{oICK}_{\alpha_1}(u; z_k)\, i(u_{\alpha_1}; z_k) + \sum_{\alpha_2=1}^{n_2(u)} \lambda^{oICK}_{\alpha_2}(u; z_k)\, y(u'_{\alpha_2}; z_k) \tag{7.44}$$

where the two sets of λ^oICK are the cokriging weights of hard and soft indicator data at locations u_α1 and u'_α2.

Unlike the simple IK estimator (7.43), the soft information need no longer be exhaustive. Note that the "soft" cokriging estimator (7.44) retains only hard and soft indicator data at the threshold z_k being considered. As discussed in section 7.3.2, it is generally not worth accounting for information at other thresholds when indicator vectors are complete (equally sampled case).

The cokriging weights are obtained by solving the ordinary cokriging system (7.45) of (n_1(u) + n_2(u) + 1) equations. In that system, C_I(h; z_k) and C_Y(h; z_k) are the autocovariance functions of the hard and soft indicator RFs, and C_IY(h; z_k) is their cross covariance function. Note the following:

• Only one unbiasedness condition is needed if one can assume that the primary and secondary indicator variables have the same mean within each search neighborhood W(u). Unbiasedness is then ensured by constraining all the weights to sum to 1, see the last equation of system (7.45). If this assumption cannot be made, the two traditional unbiasedness constraints of type (6.28) must be used: the hard data weights sum to 1 and the soft data weights sum to 0.

• The cokriging estimator (7.44) can be readily extended to incorporate several different soft indicator variables.

Example
Consider that the ten Cd concentrations along the NE-SW transect are supplemented by the five Ni concentrations shown at the top of Figure 7.15. According to the calibration performed in section 7.3.1, hard and soft information are coded into local prior probabilities of not exceeding the critical threshold 0.8 ppm (Figure 7.22, top graph). The direct and cross indicator semivariograms required by the cokriging system (7.45) are inferred from all data available over the study area. The model of coregionalization between hard and soft indicator data for threshold 0.8 ppm is depicted by the solid line in Figure 7.22 (middle graph).

Figure 7.22 (bottom graph) shows the soft cokriging estimate (solid line) of the probability that Cd concentration exceeds 0.8 ppm. The dashed line depicts the ordinary IK estimate using only the ten hard indicator data. As for the example of Figure 7.21, accounting for soft information reduces the probability of contamination in the high-valued (right) part of the transect.

Figure 7.22: Soft cokriging estimate of the probability of exceeding 0.8 ppm. The soft information consists of five prior probabilities of not exceeding the critical threshold as derived from the calibration of Ni concentrations in Figure 7.15. Hard and soft data are combined using ordinary cokriging and the semivariogram models shown in the second row. The dashed line depicts the ordinary kriging estimate using only hard indicator data.

Colocated indicator cokriging
One approach to handling highly redundant secondary information consists of retaining only the secondary datum closest to the location u being estimated, e.g., the colocated soft indicator datum y(u; z_k). The indicator cokriging estimator (7.44) is then rewritten as the colocated estimator (7.46), where the cokriging weights are solutions of the system (7.47) of (n_1(u) + 2) equations.

If hard and soft indicator means are different, the soft indicator variable Y(u; z_k) must be rescaled so that its mean equals the hard indicator mean F(z_k). In the cokriging estimator (7.46), the colocated soft datum y(u; z_k) is then replaced by y(u; z_k) − m_Y(z_k) + F(z_k), where m_Y(z_k) = E{Y(u; z_k)}.

Colocated indicator cokriging is faster than the full indicator cokriging and avoids instability problems caused by densely sampled soft information. Moreover, the colocated cokriging system (7.47) does not require the soft autocovariance model, except at |h| = 0 where it is m_Y(z_k)[1 − m_Y(z_k)].

The cokriging system (7.45) calls for the joint modeling of two autocovariance functions and one cross covariance function at each threshold z_k. This modeling of the hard-soft coregionalization may be alleviated by a Markov-type hypothesis stating that "a hard indicator datum at u, i(u; z_k), screens the influence of any colocated soft indicator datum y(u; z_k) on the estimation of
the primary variable at any other location u'", that is,

$$\text{Prob}\{Z(u') \le z_k \mid i(u; z_k),\, y(u; z_k)\} = \text{Prob}\{Z(u') \le z_k \mid i(u; z_k)\} \quad \forall\, u, u', z_k \tag{7.48}$$

Building on this relation, Zhu and Journel (1993) established the relations (7.49) and (7.50) between indicator auto and cross covariance functions at any threshold z_k. In these relations, each coefficient B(z_k) is defined as the difference between two conditional expectations:

$$B(z_k) = m^{(1)}(z_k) - m^{(0)}(z_k) \in [-1, +1], \quad k = 1, \ldots, K$$

with

$$m^{(1)}(z_k) = E\{Y(u; z_k) \mid I(u; z_k) = 1\}, \qquad m^{(0)}(z_k) = E\{Y(u; z_k) \mid I(u; z_k) = 0\}$$

Under the Markov hypothesis (7.48), the autocovariance model C_Y(h; z_k) at |h| > 0 and the cross covariance model C_IY(h; z_k) are deduced simply by linear rescaling of the hard autocovariance model C_I(h; z_k). Therefore, modeling the hard-soft coregionalization requires the modeling of only a single covariance function per threshold z_k. The Bayesian updating of local prior probabilities by indicator cokriging under the Markov-type relations (7.49) and (7.50) is referred to as the Markov-Bayes algorithm.

Remark:
The Markov statement "hard data screen the influence of any colocated soft data" sounds misleadingly trivial. Indeed, if the soft datum refers to a volume larger than the colocated hard datum, it may carry valuable additional information. The resulting covariance models (7.49) and (7.50) must be checked against experimental covariance functions (see hereafter and Figure 7.24).

Estimating the coefficients B(z_k)
The determination of coefficients B(z_k) requires the estimation of the two conditional expectations m^(1)(z_k) and m^(0)(z_k) at each threshold z_k. The quantity m^(1)(z_k) is estimated by the arithmetic average of the soft indicator data y(u_α; z_k) at locations where i(u_α; z_k) = 1:

$$\hat{m}^{(1)}(z_k) = \frac{\sum_{\alpha=1}^{n} y(u_\alpha; z_k)\, i(u_\alpha; z_k)}{\sum_{\alpha=1}^{n} i(u_\alpha; z_k)}$$

where n is the number of locations where both hard and soft data are known. The best situation is when the estimate of m^(1)(z_k) equals 1, which means that the soft information y exactly predicts that the value z(u_α), with z(u_α) ≤ z_k, is no greater than the threshold value z_k.

Conversely, m^(0)(z_k) is estimated by the arithmetic average of the soft indicator data at locations where i(u_α; z_k) = 0:

$$\hat{m}^{(0)}(z_k) = \frac{\sum_{\alpha=1}^{n} y(u_\alpha; z_k)\, [1 - i(u_\alpha; z_k)]}{\sum_{\alpha=1}^{n} [1 - i(u_\alpha; z_k)]}$$

Here, the best situation is when the estimate of m^(0)(z_k) equals 0, which indicates that the soft information exactly predicts that the value z(u_α), with z(u_α) > z_k, exceeds the threshold value z_k.

The difference B(z_k) = m^(1)(z_k) − m^(0)(z_k), computed from these two estimates, measures the ability of the soft information y to separate the two cases i(u; z_k) = 1 and i(u; z_k) = 0. In other words, B(z_k) is an accuracy index for the soft information:

1. If B(z_k) = 1, the autocovariance models of hard and soft indicator RFs, as well as their cross covariance model given by relations (7.49) and (7.50), are identical:

$$C_I(h; z_k) = C_Y(h; z_k) = C_{IY}(h; z_k) \quad \forall\, h$$

2. If B(z_k) = 0, relation (7.50) yields a zero cross covariance model between hard and soft indicator RFs:

$$C_{IY}(h; z_k) = 0 \quad \forall\, h$$

Example
Consider the estimation of B at the threshold Cd value z_k = 0.8 ppm. The soft information consists of either five different rock types s_l or nine classes of Ni concentrations (v_{l-1}, v_l]. A calibration like the one performed in section 7.3.1 yields two sets of local prior probabilities: y_1(u_α; 0.8) = F*(0.8|s_l) and y_2(u_α; 0.8) = F*(0.8|v_l). Figure 7.23 (top graphs) shows the scattergrams of 259 Cd concentrations versus colocated local prior probabilities derived from rock types (left graph) or Ni concentrations (right graph). In
3 Soft data weights are exactly equal to zero if the two traditional unbiasedness constraints of type (6.28) are used in the cokriging system.
both cases, the 259 prior probabilities are split into two groups, depending on whether the colocated Cd concentration exceeds the critical threshold 0.8 ppm depicted by the vertical dashed line. The histograms of each subset of local prior probabilities at the bottom of Figure 7.23 show the following:

• For Ni data (right column), the two histograms have very different shapes. The prior probabilities of not exceeding 0.8 ppm are large when the actual Cd concentration is no greater than that threshold. Conversely, these prior probabilities are small when the actual Cd concentration exceeds 0.8 ppm. The reasonably large B-value (0.42) obtained as the difference between the means of the two distributions reflects the ability of Ni concentrations to predict whether the Cd concentration exceeds 0.8 ppm.

• For rock types (left column), the contrast between the two histograms of prior probabilities is less apparent. The small B-value (0.12) indicates that geology provides little information on whether the Cd concentration exceeds 0.8 ppm.

[Figure 7.23. Top graphs: scattergrams of prior probability versus Cd concentration (ppm) for rock soft data (left) and Ni soft data (right). Bottom graphs: histograms of prior probabilities for Cd ≤ 0.8 ppm and Cd > 0.8 ppm. Values annotated on the graphs:]

                    Rock soft data    Ni soft data
m^(1)(0.8 ppm)      0.42              0.62
m^(0)(0.8 ppm)      0.30              0.20
B(0.8 ppm)          0.42 - 0.30       0.62 - 0.20
                    = 0.12            = 0.42

Table 7.2: Measures of the ability of rock types or Ni concentrations to predict whether the Cd concentration exceeds a given threshold value; prediction is best if
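The calibration of B at one threshold amounts to two conditional averages. A minimal sketch, with hypothetical colocated hard values z and soft prior probabilities y (the function name is illustrative):

```python
import numpy as np

def markov_bayes_b(z, y, zk):
    """Return (m1, m0, B) at threshold zk from colocated hard and soft data.

    m1: mean soft datum where the hard value does not exceed zk (i = 1)
    m0: mean soft datum where the hard value exceeds zk (i = 0)
    """
    not_exceeding = np.asarray(z) <= zk
    y = np.asarray(y, dtype=float)
    m1 = y[not_exceeding].mean()
    m0 = y[~not_exceeding].mean()
    return m1, m0, m1 - m0

# Hypothetical colocated data: Cd values (ppm) and prior probabilities
z = np.array([0.3, 0.5, 0.7, 1.2, 2.0])
y = np.array([0.9, 0.6, 0.6, 0.3, 0.1])
m1, m0, B = markov_bayes_b(z, y, zk=0.8)
```

For these made-up numbers the two averages are 0.7 and 0.2, giving B = 0.5. The same arithmetic gives the B-values quoted for Figure 7.23: 0.42 − 0.30 = 0.12 for rock types and 0.62 − 0.20 = 0.42 for Ni concentrations.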

Checking the Markov hypothesis

The Markov hypothesis (7.48) and resulting expressions (7.49) and (7.50) are very congenial in that only the autocovariance function or semivariogram of hard indicator data must be modeled. However, that hypothesis should be checked, particularly relation (7.50), which is the most critical in the ensuing cokriging process. If that relation is invalidated, the three auto and cross covariance functions must be jointly modeled using the linear model of coregionalization (4.37) as in the example of Figure 7.22.

Checking the Markov hypothesis at threshold z_k involves the following three steps:

1. Compute and model the autocovariance function C_I(h; z_k) of hard indicator data.

2. Use that model and the B-value B(z_k) to deduce the auto and cross covariance models C_Y(h; z_k), C_IY(h; z_k) through relations (7.49) and (7.50).

3. Compare the Markov-derived models with the experimental auto and cross covariance functions of the soft and hard-soft data.
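Deducing the soft and cross covariance models from the hard model can be sketched as follows. The rescaling used here is an assumption: it is the form in which the Markov-Bayes relations of Zhu and Journel (1993) are commonly quoted (cross covariance equal to B times the hard covariance; soft autocovariance equal to B² times the hard covariance for |h| > 0 and |B| times it at |h| = 0), and it should be checked against relations (7.49) and (7.50). The hard covariance model and its parameters are hypothetical.

```python
import numpy as np

def hard_cov(h, c0=0.05, c1=0.20, a=1.0):
    """Hypothetical hard indicator covariance: nugget c0 plus exponential c1."""
    h = np.abs(np.asarray(h, dtype=float))
    return np.where(h == 0, c0 + c1, c1 * np.exp(-3.0 * h / a))

def markov_models(h, B, cov=hard_cov):
    """Soft autocovariance and hard-soft cross covariance (assumed relations)."""
    h = np.asarray(h, dtype=float)
    c_i = cov(h)
    c_iy = B * c_i                                    # cross covariance
    c_y = np.where(h == 0, abs(B) * c_i, B**2 * c_i)  # soft autocovariance
    return c_y, c_iy

def relative_nugget(B):
    """Relative discontinuity at the origin of the soft model (B must be non-zero)."""
    c_y, _ = markov_models(np.array([0.0, 1e-9]), B)
    return 1.0 - c_y[1] / c_y[0]

h = np.array([0.0, 0.5, 1.0])
c_y_1, c_iy_1 = markov_models(h, 1.0)   # B = 1: all three models coincide
_, c_iy_0 = markov_models(h, 0.0)       # B = 0: zero cross covariance
```

Under these assumed relations the discontinuity at the origin of the soft model grows as B decreases, so it is larger for B = 0.12 than for B = 0.42. The third step, the comparison with experimental covariances, is left to a plot.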
Figure 7.24 (top graph) shows the experimental omnidirectional standardized indicator covariance function of Cd at threshold z_k = 0.8 ppm with the model fitted. Using that model and the B-values computed in Figure 7.23, the soft autocovariance model and the hard-soft cross covariance model are derived in both cases where the soft information originates from rock types or Ni concentrations. The resulting models are shown as the continuous curves in Figure 7.24 (bottom graph).

From expression (7.49), the Markov-derived autocovariance model C_Y(h; z_k) shows a nugget effect the size of which increases with decreasing B-value. Therefore, the discontinuity at the origin of the soft autocovariance model is larger for rock type data than for Ni data. In both cases, the Markov-related model severely overestimates the nugget effect of soft indicator data. Such a poor match of the covariance model C_Y(h; z_k) is of no consequence as long as only the colocated secondary datum y(u; z_k) is used in the cokriging estimator (7.44). Indeed, the corresponding cokriging system (7.45) requires only the soft autocovariance value at |h| = 0.

Better fits are obtained for the cross covariance functions between hard and soft data. The Markov-derived model still underestimates the short-range continuity for the pair Cd-Ni. In this case, the linear model of coregionalization shown in Figure 7.22 is preferred.

Figure 7.24: Markov-derived models for the indicator auto and cross correlograms of the pairs Cd-rock type and Cd-Ni at the threshold value 0.8 ppm. Models (continuous line) are deduced from the autocovariance model of hard indicator data (top graph) using the Markov-type approximations (7.49) and (7.50) and the calibration parameters obtained in Figure 7.23.

The soft data are here accounted for using the colocated cokriging estimator (7.46) and the Markov-related models of Figure 7.24.

Figure 7.25 shows both colocated ordinary indicator cokriging (solid line) and ordinary indicator kriging (dashed line) estimates of the probability of exceeding 0.8 ppm. Results are similar to those obtained using simple kriging with varying local means in Figure 7.21.

7.3.4 Correcting for order relation deviations

At any location u, each estimated posterior probability [F(u; z_k|(n))]* must lie in the interval [0, 1] and the series of such K estimates must be a non-decreasing function of the threshold value z_k:

$$[F(u; z_k|(n))]^* \in [0, 1] \tag{7.51}$$

$$[F(u; z_k|(n))]^* \le [F(u; z_{k'}|(n))]^* \quad \forall\, z_{k'} > z_k \tag{7.52}$$
Figure 7.25: Soft cokriging estimate of the probability of exceeding 0.8 ppm. The soft information is the profile of prior probabilities shown in Figure 7.20 (second row). Hard and soft data are combined using ordinary cokriging and the Markov-related models of Figure 7.24. The dashed line depicts the ordinary kriging estimate using only hard indicator data.

The first order relation may not be satisfied because the (co)kriging estimate is a non-convex linear combination of the conditioning data, i.e., the (co)kriging weights can be negative (see section 5.8.1). Because the K probabilities are not estimated jointly, the second condition may not be met either. The posterior cdf in Figure 7.26 shows both types of order relation deviations.

All order relation deviations in this example have been exaggerated for better illustration. In practice, both types of deviations are generally small, around 0.01-0.03 (Goovaerts, 1994a).

Figure 7.26: Examples of order relation problems shown by ccdf values (black dots) estimated by an indicator approach. The magnitude of order relation deviations is, in practice, much smaller than in this fictitious example.

Order relation deviations have two main sources:

1. The occurrence of negative (co)kriging weights.
Practice has shown that ordinary indicator kriging and cokriging algorithms produce many more and larger order relation deviations than simple indicator kriging and cokriging (Goovaerts, 1994a). Indeed, the constraints on the weights, particularly the constraint that the secondary data weights must sum to zero, increase the possibility of getting negative weights with a resulting increase in inconsistent estimated probabilities.

2. The lack of z-data in some classes of threshold values.
Suppose, for example, that the class (z_6, z_7] contains no z-data. The two IK estimates at thresholds z_6 and z_7 are then based on the same indicator data set, since

$$i(u_\alpha; z_6) = i(u_\alpha; z_7) \quad \forall\, \alpha$$

The difference between the two IK estimates is thus a linear combination of differences between IK weights at the two thresholds z_6 and z_7:

$$[F(u; z_7|(n))]^*_{IK} - [F(u; z_6|(n))]^*_{IK} = \sum_{\alpha=1}^{n(u)} [\lambda_\alpha(u; z_7) - \lambda_\alpha(u; z_6)]\, i(u_\alpha; z_6) \tag{7.53}$$

A negative value for this difference entails violation of the order relation (7.52). Were the indicator semivariogram models γ_I(h; z_6) and γ_I(h; z_7) the same, the two sets of IK weights would be identical since the same data locations are retained at both thresholds:

$$\lambda_\alpha(u; z_6) = \lambda_\alpha(u; z_7) \quad \forall\, \alpha$$

The difference (7.53) is thus zero, hence there is no order relation deviation of type (7.52). In contrast, a sudden change in two consecutive indicator semivariogram models, say, from z_6 to z_7, leads to different IK weights with an increasing risk of order relation problems.

Implementation tips
Inconsistent probabilities could be avoided by imposing the order relations (7.51) and (7.52) as constraints in the kriging algorithm. Such a solution is, however, computationally expensive. Instead, the common practice is to correct a posteriori for order relation deviations, see subsequent discussion. The proportion and magnitude of these deviations and the required corrections can be reduced using the following implementation tips (see also Deutsch and Journel, 1992a, p. 79-80):

1. Apply oICK/PK algorithms with a single unbiasedness constraint for all cokriging weights so as to reduce the occurrence of negative cokriging weights.

2. Avoid sudden changes in indicator semivariogram parameters from one threshold to the next. One solution consists of modeling all indicator semivariograms using different linear combinations of the same set of basic structures, e.g., a nugget effect and an exponential model:

$$\gamma_I(h; z_k) = b_0(z_k) + b_1(z_k)\, \text{Exp}(|h|; a(z_k)), \quad k = 1, \ldots, K \tag{7.54}$$

The indicator semivariogram parameters (sill, range, anisotropy direction, and anisotropy ratio) should vary smoothly from one threshold to the next so that:

• There is a continuum of the spatial variability with increasing (or decreasing) thresholds.

• Semivariogram parameters are easily interpolated or extrapolated beyond the initial thresholds z_k. This allows one to retain more thresholds without increasing the inference and modeling effort.

For example, the nine Cd indicator semivariograms in Figure 7.27 are all modeled as the sum of a nugget effect and an exponential model. Note the continuum in the relative nugget effects and effective range values with increasing threshold value (Figure 7.27, bottom graphs).

[Figure 7.27. Top graphs: indicator semivariograms at the 1st to 9th deciles versus distance (km). Bottom graphs: range value and relative nugget effect versus Cd threshold (ppm).]

Relation (7.53) shows that all order relation problems of type (7.52) caused by the lack of data in some classes would be eliminated if the same indicator semivariogram model is used at all thresholds z_k (median indicator kriging). The trade-off cost is the lack of flexibility to model changes in the pattern of spatial continuity from one threshold to another.
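A nugget-plus-exponential model whose parameters change smoothly with the threshold can be sketched as below. The parameter values are hypothetical, linear interpolation between modeled thresholds is only one possible choice, and the exponential structure is written with a practical-range convention (factor 3), which is an assumption of this sketch.

```python
import numpy as np

def indicator_semivariogram(h, b0, b1, a):
    """Nugget b0 plus exponential structure b1 with practical range a; 0 at h = 0."""
    h = np.abs(np.asarray(h, dtype=float))
    return np.where(h > 0, b0 + b1 * (1.0 - np.exp(-3.0 * h / a)), 0.0)

# Hypothetical parameters modeled at three thresholds (ppm)
z_mod = np.array([0.5, 1.0, 2.0])
b0_mod = np.array([0.04, 0.06, 0.05])   # nugget
b1_mod = np.array([0.12, 0.19, 0.10])   # exponential sill contribution
a_mod = np.array([1.2, 0.9, 0.6])       # range (km)

def params_at(zk):
    """Interpolate (b0, b1, a) linearly at an intermediate threshold zk."""
    return tuple(float(np.interp(zk, z_mod, p)) for p in (b0_mod, b1_mod, a_mod))

b0, b1, a = params_at(0.75)
gamma = indicator_semivariogram(np.array([0.0, 0.5, 50.0]), b0, b1, a)
```

Because every threshold shares the same two basic structures, a semivariogram at an extra threshold is obtained from three interpolated numbers instead of a new inference and modeling exercise.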
3. Select thresholds z_k so that within each search neighborhood W(u) there is at least one datum from each class (z_{k-1}, z_k]. Sampling sparsity may dramatically reduce the number of such thresholds. Therefore, rather than using the same set of thresholds z_k over the study area, the thresholds can be made dependent on the local information available within each neighborhood W(u):
• Thresholds z_k that are upper bounds of classes (z_{k-1}, z_k] with no z-data are ignored. In the previous example of Figure 7.26 where the class (z_6, z_7] was assumed empty, the ccdf value at threshold z_7 would not be inferred.
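
The parameterization (7.51) can be illustrated with a short Python sketch that evaluates a nugget-plus-exponential indicator semivariogram whose coefficients change smoothly with the threshold. The exponential structure is assumed here to be 1 - exp(-3|h|/a), with a taken as the practical range. That convention, the function name, and all parameter values are illustrative assumptions, not quantities fitted in the text.

```python
import math


def indicator_semivariogram(h, b0, b1, a):
    """Nugget effect b0 plus an exponential structure of sill b1 and range a.

    Assumption: Exp(|h|; a) = 1 - exp(-3|h|/a), i.e. a is a practical range.
    """
    h = abs(h)
    if h == 0.0:
        return 0.0  # a semivariogram is zero at the origin
    return b0 + b1 * (1.0 - math.exp(-3.0 * h / a))


# Hypothetical parameters that vary smoothly from one threshold to the next:
# threshold (ppm) -> (b0, b1, a in km). Illustrative numbers only.
params = {
    0.8: (0.05, 0.15, 1.0),
    1.4: (0.07, 0.18, 0.9),
    2.3: (0.08, 0.12, 0.8),
}

for z_k, (b0, b1, a) in params.items():
    values = [round(indicator_semivariogram(h, b0, b1, a), 3)
              for h in (0.0, 0.4, 0.8, 1.6)]
    print(z_k, values)
```

Because every threshold shares the same two basic structures, only three numbers per threshold need to be interpolated when additional thresholds are introduced.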
... is minimized under the (K + 1) linear constraints:

[Figure 7.28 legend: ccdf values before correction and after correction; x-axis: z-variable]

1. An upward correction is first performed, resulting in a set of K ccdf values [F(u; z_k|(n))]*_U (Table 7.3, third column):
   • Reset all ccdf values that are not within [0, 1] to the closest bound, 0 or 1:

         [F(u; z_k|(n))]^*_U = 0 \quad \text{if } [F(u; z_k|(n))]^* < 0
         [F(u; z_k|(n))]^*_U = 1 \quad \text{if } [F(u; z_k|(n))]^* > 1

   • Loop upward through all thresholds applying the correction:

2. The same approach is used to perform a downward correction, resulting in a set of K ccdf values [F(u; z_k|(n))]*_D (Table 7.3, fourth column):
   • Reset all ccdf values that are not within [0, 1] to the closest bound, 0 or 1:

3. The final set of ccdf values (Table 7.3, fifth column) is the average of the two sets of corrected ccdf values:

         \frac{1}{2} \left\{ [F(u; z_k|(n))]^*_U + [F(u; z_k|(n))]^*_D \right\}, \qquad k = 1, \ldots, K

To alleviate notations, the corrected ccdf values are hereafter denoted [F(u; z_k|(n))]*.

Table 7.3: Inconsistent ccdf values shown in Figure 7.28 and their correction using the average of an upward and downward correction.

                              Corrected ccdf values
Threshold   Faulty prob   Upward   Downward   Average
0.05   0.05   0.05
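
The three-step correction above can be sketched in Python. The exact correction applied while looping through the thresholds is not legible in this copy, so the sketch assumes a running maximum for the upward pass and a running minimum for the downward pass. The function name and the faulty values are hypothetical.

```python
def correct_order_relations(ccdf):
    """Average of an upward and a downward correction of estimated ccdf values.

    ccdf lists the estimated values for increasing thresholds z_1 < ... < z_K.
    Assumption: the upward pass keeps a running maximum, the downward pass a
    running minimum, after resetting values outside [0, 1] to the closest bound.
    """
    clipped = [min(1.0, max(0.0, p)) for p in ccdf]

    upward = []
    running = 0.0
    for p in clipped:  # loop from the smallest to the largest threshold
        running = max(running, p)
        upward.append(running)

    downward = []
    running = 1.0
    for p in reversed(clipped):  # loop from the largest to the smallest threshold
        running = min(running, p)
        downward.append(running)
    downward.reverse()

    return [(u + d) / 2.0 for u, d in zip(upward, downward)]


# Hypothetical faulty values: one decrease (0.30 -> 0.25) and one value above 1.
faulty = [0.05, 0.30, 0.25, 0.60, 1.04]
corrected = correct_order_relations(faulty)
print(corrected)
```

Averaging the two passes spreads the adjustment over both offending values instead of favoring the lower or the upper threshold.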
1. The sampling density.
   Increasing the number of thresholds enhances the risk of occurrence of empty classes (z_{k-1}, z_k] and the resulting order relation problems, particularly in sparsely sampled areas.

3. Inference and modeling resources.
   Except for median IK, the inference and modeling effort increases dramatically as more thresholds are considered; for example, for K thresholds:

As mentioned previously, the inference and modeling effort could be alleviated by interpolating semivariogram parameters from the parameters of fewer indicator semivariograms.

7.3.5 Interpolating/extrapolating ccdf values

The usually poor resolution of the posterior cdf renders critical the interpolation of ccdf values within each class of threshold values (z_{k-1}, z_k] and, more importantly, their extrapolation beyond the smallest threshold z_1 (lower tail) and the largest threshold z_K (upper tail). The three interpolation/extrapolation cdf models (linear, power, hyperbolic) introduced in section 7.2.5 can be applied to ccdf values. For example, Figure 7.29 (top graph) shows the model fitted to the four ccdf values provided by probability kriging (PK) at location u'_2 shown at the top of Figure 7.7 (page 277):

1. The lower tail is extrapolated toward a zero minimum Cd concentration using a negatively skewed power model with ω = 2.5.

3. The upper tail is extrapolated toward an infinite upper bound using a hyperbolic model with ω = 1.5.

[Figure 7.29 panels: ccdf model at u'_2 (top); cdf model and improved ccdf model at u'_2 (bottom); x-axis: Cd concentration (ppm)]

Because of the limited number of thresholds z_k, the posterior cdf model is much less detailed than the sample cdf model, which is based on all n data z(u_α) available, say, the ten Cd concentrations along the NE-SW transect (Figure 7.29, left bottom graph). The idea is to capitalize on the higher level of discretization of the cdf to improve the within-class resolution of the ccdf (Figure 7.29, right bottom graph). Interpolation within any class of threshold values (z_{k-1}, z_k] would then proceed as follows:

1. The class (z_{k-1}, z_k] is first split into L^{(k)} subclasses (z_k^{(l-1)}, z_k^{(l)}]; for example, for L^{(k)} = 3, the three subclasses are (z_k^{(0)} = z_{k-1}, z_k^{(1)}], (z_k^{(1)}, z_k^{(2)}], and (z_k^{(2)}, z_k^{(3)} = z_k]. The bound values z_k^{(l)} of the subclasses can be identified with the sample data values falling within the class (z_{k-1}, z_k]. Consider, for example, the ccdf class (0.8, 1.38] depicted by the vertical dashed lines in Figure 7.29 (right bottom graph). Four subclasses are defined using as bound values the three Cd data values falling within that class of the sample cumulative distribution (Figure 7.29, left bottom graph).
3. The series of L^{(k)} cdf models are rescaled linearly, such that the ccdf values at thresholds z_{k-1} and z_k are honored:

   with the ratio

         \frac{[F(u; z_k|(n))]^* - [F(u; z_{k-1}|(n))]^*}{F^*(z_k) - F^*(z_{k-1})}

Such an interpolation amounts to using the same intraclass distribution model at all locations u; in other words, the intraclass distribution is non-conditional. A similar approach allows one to increase the resolution of the lower and upper tail classes (z_min, z_1] and (z_K, z_max].

A good alternative to the piecewise interpolation/extrapolation of the sample cdf consists of smoothing and extrapolating the cdf using the algorithms introduced in section 7.2.5.

7.3.6 Modeling uncertainty for categorical attributes

Consider the problem of modeling the uncertainty about the state s_k of the categorical attribute s at the unsampled location u. For the level of information (n), that uncertainty is modeled by the conditional probability distribution function (cpdf) of the discrete RV S(u):

         p(u; s_k|(n)) = \mathrm{Prob}\{S(u) = s_k \mid (n)\}, \qquad k = 1, \ldots, K

The indicator approach requires a preliminary coding of each piece of information into a vector of K local prior probabilities corresponding to the K states s_k:

         \mathrm{Prob}\{S(u) = s_k \mid \text{local information at } u\}, \qquad k = 1, \ldots, K

Different types of local prior pdfs can be distinguished, depending on the nature of the local information available:
• A hard datum s(u_α) is a precise measurement of the state s_k at location u_α (no uncertainty). The local prior probabilities are then binary (hard) indicator data defined as

         i(u_\alpha; s_k) = 1 \text{ if } s(u_\alpha) = s_k, \quad 0 \text{ otherwise}

• The local information may consist of a zero probability of occurrence of one or more states s_k; for example, a particular land use is known to be absent in a given geologic environment. The local prior pdf is then an incomplete vector of hard indicator data:
• Ancillary information (e.g., calibration of continuous data) may provide prior probabilities of occurrence for the K states s_k at location u_α. The set of local soft indicator data is then defined as

The indicator algorithms introduced in sections 7.3.2 and 7.3.3 can be used to estimate each of the K conditional probability values p(u; s_k|(n)) as a linear combination of neighboring hard and soft indicator data.
If a single category s_k prevails at each location u (mutually exclusive categories), the K hard indicator data i(u_α; s_k) sum to 1 at any location u_α. Thus, the K class-indicator RVs I(u; s_k) are linearly related, leading to linear dependence in the rows and columns of the experimental matrix of indicator covariance functions and a risk of numerical instabilities if all K categories are accounted for in a cokriging system (recall discussion in section 6.2.2). As with the cokriging of linearly related continuous variables, two solutions are:
1. Estimate each posterior probability one at a time, using a cokriging system of type (7.34) and discarding any category that is weakly correlated with the category being estimated.

2. Estimate all posterior probabilities but one, say, the probability p(u; s_{k0}|(n)) of the category s_{k0} with the largest global proportion p_{k0}. The posterior probability of that category at u is then computed as the complement:

         [p(u; s_{k_0}|(n))]^* = 1 - \sum_{k \neq k_0} [p(u; s_k|(n))]^*

One drawback of this approach is that the posterior probability [p(u; s_{k0}|(n))]* accumulates all errors affecting the (K - 1) other estimates [p(u; s_k|(n))]*. Therefore, the category s_{k0} should be the one of least interest or the one with a large global proportion p_{k0}, or both.

For categorical attributes, the common indicator experimental semivariogram γ_mIK(h) required by median indicator kriging can be computed as the mean of the K rescaled experimental indicator semivariograms:

         \gamma_{mIK}(h) = \frac{1}{K} \sum_{k=1}^{K} \frac{\gamma_I(h; s_k)}{p_k (1 - p_k)}

where γ_I(h; s_k) is the indicator semivariogram (2.24) of category s_k with global proportion p_k. Each indicator semivariogram is rescaled by the indicator variance p_k(1 - p_k). Posterior probabilities are then obtained using an indicator kriging system of type (7.40).

Correcting for order relation deviations

At each location u, the K estimated probabilities [p(u; s_k|(n))]* must be valued within [0, 1] and must sum to 1:

         [p(u; s_k|(n))]^* \in [0, 1], \qquad k = 1, \ldots, K
         \sum_{k=1}^{K} [p(u; s_k|(n))]^* = 1

The non-convexity of the kriging estimator entails that an estimated probability may be negative or greater than 1. Again, the ordinary cokriging constraints on the weights of secondary data increase the risk of getting negative weights, with a resulting increase in faulty probabilities. The second condition is rarely satisfied when the K probabilities are estimated separately. However, practice has shown that the magnitude of both types of order relation deviation is usually small, around 0.01-0.03 (Goovaerts, 1994b).

This correction procedure is more straightforward than for conditional cdfs of continuous variables and typically proceeds in two steps:

1. Any posterior probability outside the interval [0, 1] is first reset to the closest bound, 0 or 1.

Consider the problem of modeling the uncertainty about the prevailing land use along the NE-SW transect. Figure 7.30 (top graphs) shows the ten data available and their coding into indicators of presence/absence of each land use. Standardized indicator semivariograms are inferred from all data available over the study area (Figure 7.30, bottom graphs). All semivariograms have a small nugget effect and, except for forests, reach a sill at about 300 m. The probability of occurrence of each land use is determined every 50 m using ordinary indicator kriging and the five closest indicator data. The four profiles of corrected probabilities are shown in Figure 7.31.

7.4 Using Local Uncertainty Models

Let {F(u; z|(n)), u ∈ A} be the set of conditional distributions (ccdfs) defined over the study area A. Each function F(u; z|(n)) fully models the uncertainty at location u in that it gives, for a continuous variable, the probability that the unknown is no greater than any given threshold z:

         F(u; z|(n)) = \mathrm{Prob}\{Z(u) \le z \mid (n)\}
[Figure 7.30 panels: land uses along the transect (s_1: Forest, s_2: Pasture, s_3: Meadow, s_4: Tillage) and indicator vectors versus distance (km); standardized indicator semivariograms for Forest, Pasture, Meadow, and Tillage versus distance (km)]

Figure 7.30: Coding of ten categorical data into indicators of presence/absence of each land use. Standardized indicator semivariograms are inferred from all data in the study area.

The model of local uncertainty is typically post-processed to retrieve single values such as

2. an estimate of the unknown value z(u), which is optimal for a given criterion which need not be least-squares, and

Maps of these different quantities are then used in decision-making processes, such as the delineation of candidate areas for remediation or additional sampling.

7.4.1 Measures of local uncertainty

Knowledge of the conditional cdf model at location u allows a straightforward assessment of the uncertainty about the unknown z(u) before and independently of the choice of a particular estimate for that unknown.
Probability interval

The probability that the unknown is valued within an interval (a, b], called a probability interval, is computed as the difference between ccdf values for thresholds b and a:

         \mathrm{Prob}\{Z(u) \in (a, b] \mid (n)\} = F(u; b|(n)) - F(u; a|(n))

A 50% probability interval means that the unknown has equal probability to lie inside or outside the interval (a, b].

Setting the upper bound b of the interval to +∞ provides the probability of exceeding the threshold a:

         \mathrm{Prob}\{Z(u) \in (a, +\infty] \mid (n)\} = \mathrm{Prob}\{Z(u) > a \mid (n)\} = 1 - F(u; a|(n)) \qquad (7.60)

This latter probability is of particular importance for environmental applications, where the focus is on the risk of exceeding regulatory limits.

Consider the two ccdf models F(u'_1; z|(n)) and F(u'_2; z|(n)) in Figure 7.32:
• The probability that the Cd concentration is valued between 1.5 and 4 ppm, depicted between the two vertical dashed lines, is negligible at u'_1 in the low-valued part of the transect.
• The uncertainty is much larger at u'_2, with the unknown Cd concentration having similar probabilities to be valued inside or outside the interval (1.5, 4 ppm].

[Figure 7.32 panels: ccdf model at u'_1; ccdf model at u'_2; x-axis: Cd concentration (ppm); y-axis: cumulative frequency]

Figure 7.32: Conditional cdf models provided by probability kriging at locations u'_1 and u'_2. In both cases, a linear interpolation is performed between tabulated bounds identified with the 259 Cd concentrations over the study area. The dashed lines depict the probability interval corresponding to the interval (1.5, 4.0] ppm.

Local entropy

A measure of local uncertainty, not specific to any particular interval (a, b], is provided by the entropy of the local probability density function (Shannon, 1948; Christakos, 1990; Journel and Deutsch, 1993):

         H(u) = - \int [\ln f(u; z|(n))] \, f(u; z|(n)) \, dz

where f(u; z|(n)) = ∂F(u; z|(n))/∂z is the conditional pdf, and all zero pdf values are excluded from the integral.

The bounded pdf with maximum entropy is the uniform distribution (3.17): all outcomes have the same probability of occurrence. The entropy, and hence uncertainty, decreases as the probability distribution focuses more toward a single value.

In practice, the range of variation of z-values is discretized into K non-overlapping classes (z_{k-1}, z_k], and the corresponding K probability intervals are computed as

         p_k(u) = F(u; z_k|(n)) - F(u; z_{k-1}|(n)), \qquad k = 1, \ldots, K

There are typically many more discretization values z_k than original threshold values used for indicator coding.

The entropy of the conditional pdf at location u is then computed as

         H(u) = - \sum_{k=1}^{K} [\ln p_k(u)] \, p_k(u) \qquad (7.63)

The entropy is:
• zero if p_{k'}(u) = 1 and p_k(u) = 0 ∀ k ≠ k'.
  The unknown is certainly valued in the interval (z_{k'-1}, z_{k'}] (minimum uncertainty).
• ln K if p_k(u) = 1/K ∀ k.
  Each interval (z_{k-1}, z_k] is equally likely to include the unknown (maximum uncertainty).

[Figure panels: cpdf model at u'_1; cpdf model at u'_2]
The entropy of each distribution is then computed using relation (7.63). The smaller entropy at u'_1 reflects the greater certainty associated with that location. Such a result agrees with the intuitive feeling that the uncertainty should be smaller at u'_1, which is surrounded by two similar data values, than at u'_2, which is surrounded by two extreme data values. Recall that the kriging variance, which is data-independent, would not differentiate the uncertainty prevailing at these two locations (Figure 7.1, page 260).

Conditional variance

Other statistics can be used to measure the spread of the conditional pdf at location u. For example, the conditional variance σ²(u) measures the spread of the conditional probability distribution around its mean z*_E(u):

         \sigma^2(u) = \int [z - z_E^*(u)]^2 \, f(u; z|(n)) \, dz

As with the entropy measure, this integral is, in practice, approximated by the discrete sum:

         \sigma^2(u) \approx \sum_{k=1}^{K+1} [\bar{z}_k - z_E^*(u)]^2 \, [F(u; z_k|(n)) - F(u; z_{k-1}|(n))] \qquad (7.64)

where z_k, k = 1, ..., K, are K threshold values discretizing the range of variation of z-values (see note 1). By convention, F(u; z_0|(n)) = 0 and F(u; z_{K+1}|(n)) = 1.

1. These thresholds z_k could be identified with p-quantiles corresponding to regularly spaced ccdf increments, z_k = F^{-1}(u; k/[K + 1]|(n)).

Unlike the entropy measure (7.63), the variance is defined around a specific central value, the mean of the conditional distribution, and it depends on the K within-class means z̄_k. Beware that both the ccdf mean and its upper tail mean can be very sensitive to the choice of the extrapolation model, as would the conditional variance (see subsequent example).

For highly asymmetric distributions, a more robust measure of spread is the interquartile range, defined as the difference between the upper and lower quartiles of the distribution:

         q_{0.75}(u) - q_{0.25}(u) = F^{-1}(u; 0.75|(n)) - F^{-1}(u; 0.25|(n))

Because it does not use means of extreme classes, the interquartile range is less affected by the choice of a particular extrapolation model for the upper tail.

Consider the following three different extrapolation models for the upper tail of the two ccdfs at locations u'_1 and u'_2 (Figure 7.34, top graphs):

1. Linear interpolation between tabulated bounds provided by the sample cdf (259 Cd data),
2. Hyperbolic model with a short tail (ω = 5), and
3. Hyperbolic model with a long tail (ω = 1.5).

The three models provide similar fits at location u'_1 because of the small contribution of the upper tail to the ccdf F(u'_1; z|(n)). The impact of the upper tail model is much greater at u'_2 in the high-valued part of the transect. In this example, the sample-interpolated distribution model (solid line) is intermediate between the two hyperbolic models depicted by dashed lines. The short-tail model (small dashed line) yields larger probabilities for intermediate Cd concentrations (2.5-3 ppm), whereas the long-tail model (large dashed line) increases the probability of occurrence of large Cd concentrations (> 5.4 ppm) (Figure 7.34, bottom graphs). Note the following:
• The choice of the upper tail extrapolation model greatly influences the ccdf mean and variance at location u'_2.
• Differences between ccdf models at u'_1, though of small magnitude, lead to means z*_E(u'_1) that fluctuate above or below the critical threshold 0.8 ppm. Decision-making based on that threshold would then dramatically depend on the upper tail model.
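
The discrete variance (7.64) and the interquartile range discussed above can be compared on a small Python sketch. The class means, ccdf values, and tabulated bounds below are hypothetical. The last class mean stands for the upper tail class, whose value depends on the extrapolation model chosen.

```python
def class_probabilities(ccdf_values):
    """Class probabilities with the conventions F(z_0) = 0 and F(z_{K+1}) = 1."""
    padded = [0.0] + list(ccdf_values) + [1.0]
    return [padded[k] - padded[k - 1] for k in range(1, len(padded))]


def ccdf_mean_variance(class_means, ccdf_values):
    """Mean and variance approximated by sums over the K + 1 classes."""
    probs = class_probabilities(ccdf_values)
    mean = sum(m * p for m, p in zip(class_means, probs))
    variance = sum((m - mean) ** 2 * p for m, p in zip(class_means, probs))
    return mean, variance


def quantile(p, bounds, probs):
    """Invert a tabulated ccdf by linear interpolation between bounds."""
    for j in range(1, len(bounds)):
        p0, p1 = probs[j - 1], probs[j]
        if p0 <= p <= p1 and p1 > p0:
            z0, z1 = bounds[j - 1], bounds[j]
            return z0 + (z1 - z0) * (p - p0) / (p1 - p0)
    raise ValueError("p is outside the tabulated ccdf range")


def interquartile_range(bounds, probs):
    return quantile(0.75, bounds, probs) - quantile(0.25, bounds, probs)


# Hypothetical ccdf values at K = 3 thresholds, hence K + 1 = 4 classes.
ccdf_values = [0.3, 0.6, 0.9]
short_tail = ccdf_mean_variance([0.5, 1.1, 1.8, 3.5], ccdf_values)
long_tail = ccdf_mean_variance([0.5, 1.1, 1.8, 5.0], ccdf_values)
print(short_tail, long_tail)

# Hypothetical tabulated ccdf for the interquartile range.
bounds = [0.0, 1.0, 2.0, 4.0, 6.0]
probs = [0.0, 0.2, 0.5, 0.9, 1.0]
print(interquartile_range(bounds, probs))
```

Only the upper tail class mean changes between the two calls, yet both the mean and the variance move. The interquartile range is untouched as long as the quartiles fall inside the tabulated bounds.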
[Figure 7.34 panels: ccdf models at u'_1 and u'_2 (top) and the corresponding cpdf models for three upper tail options: tabulated bounds, hyperbolic (ω = 5.0), and hyperbolic (ω = 1.5); x-axis: Cd concentration (ppm). Legible annotations for the tabulated-bounds option: z*_E = 0.74, σ² = 0.45, H_R = 0.55, IQR = 0.37 at u'_1; z*_E = 1.95, σ² = 1.46, H_R = 0.82 at u'_2.]

Figure 7.35 shows the ordinary kriging variance and three statistics (variance, entropy, and interquartile range) of the ccdf models obtained along the NE-SW transect using probability kriging. The upper tail is linearly interpolated between tabulated bounds (first option). At all locations with the same data configuration, the ordinary kriging variance is the same whatever the surrounding Cd concentrations. The three other measures, which account for data values, indicate that the uncertainty about Cd concentration is greater where the range of data is larger, that is, in the central and right parts of the transect (Figure 7.35). The contrast between the two parts of the transect is most apparent when using the interquartile range.

[Figure 7.35 panels: Cd data, OK variance, conditional variance, interquartile range, and local entropy versus distance (km)]

Figure 7.35: Four measures of local uncertainty along the NE-SW transect. As opposed to the kriging variance, ccdf statistics (conditional variance, entropy, and interquartile range) account for data values and indicate a larger uncertainty in the right part of the transect where the range of Cd concentrations is larger.

7.4.2 Optimal estimates

Probability or local entropy maps allow locations to be ranked according to their level of uncertainty about the unknown value z(u). Such a ranking could suffice for decision-making such as delineation of areas where remedial measures should be taken. Areas with the largest probabilities of exceeding the tolerable maximum would be cleaned first, followed by others in decreasing order of their probabilities of exceedence. However, measures of local uncertainty must typically be supplemented by an estimate z*(u) of the unknown value because decision makers rarely think only in terms of probability.

The selection of a unique estimate within the range of possible z-values requires an optimality criterion. One common criterion is the minimization of the impact attached to the estimation error e(u) = z*(u) - z(u) that is likely to occur. Consider, for example, the estimation of a toxic concentration. Underestimation of that concentration (negative estimation error) may cause ill health and lead to insurance claims and lawsuits. Conversely, overestimation of the toxic concentration (positive estimation error) may cause costly and unnecessary cleaning.

Evaluate the impact or loss associated with any error as a function L(.) of that error, e.g., L(e(u)) = [e(u)]². Given that particular loss function, the estimate z*(u) should be chosen so as to minimize the resulting loss.

Determination of the actual loss L(z*(u) - z(u)) requires the actual value z(u), which is unknown in practice. However, the uncertainty about z(u) is modeled by the ccdf F(u; z|(n)), which is available. The idea is to use this model of uncertainty to determine the expected loss:

         E\{L(z^*(u) - Z(u)) \mid (n)\} = \int L(z^*(u) - z) \, dF(u; z|(n)) \approx \sum_{k=1}^{K+1} L(z^*(u) - \bar{z}_k) \, [F(u; z_k|(n)) - F(u; z_{k-1}|(n))]

where z_k and z̄_k are defined as in equation (7.64). The "L-optimal" estimate for the loss function L(.) is then the z-value that minimizes the expected loss:

         z_L^*(u) = \arg\min_{z^*} E\{L(z^* - Z(u)) \mid (n)\}

Unlike interpolation algorithms introduced in Chapters 5 and 6, here the determination of an optimal estimate proceeds in two steps:

Consider the quadratic loss function

         L(e(u)) = [e(u)]^2 \qquad (7.68)

see Figure 7.36 (left top graph). The optimal estimate is shown to be the expected value of the ccdf at location u, also called the E-type estimate, recall relation (7.65):

         z_L^*(u) = z_E^*(u) \approx \sum_{k=1}^{K+1} \bar{z}_k \cdot [F(u; z_k|(n)) - F(u; z_{k-1}|(n))]

For example, the E-type estimate at u'_2, z*_E(u'_2), depicted by the vertical dashed line in Figure 7.36 (right top graph), is 1.95 ppm.

E-type and (co)kriging estimates are usually different, although both are optimal for the least-squares criterion. They differ in that the ccdf from which the E-type estimate derives depends on data values. One exception occurs when the original z-values are normally distributed and the ccdf is modeled using a multiGaussian approach. In all other situations, the advantage of the E-type estimate lies in the availability of a model of uncertainty much richer than a mere kriging variance (recall sections 7.1.1 and 7.4.1).

Because of the squaring of the estimation error in expression (7.68), extreme error values tend to have a great impact on the expected loss. Thus, the E-type estimate may overly depend on the modeling of the upper and lower tails of the distribution F(u; z|(n)). For the positively skewed ccdf at location u'_2, the E-type estimate varies from 1.70 to 2.02 ppm, depending on whether the ω-parameter of the hyperbolic tail model is set to 5 (short tail) or 1.5 (long tail) (Table 7.4).
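
As a rough numerical check of the statement that the E-type estimate minimizes the expected quadratic loss, the following Python sketch evaluates the discrete expected loss on a grid of candidate estimates. The class means, class probabilities, and grid spacing are hypothetical, and the ccdf is treated as a set of probability masses located at the class means.

```python
def e_type_estimate(class_means, class_probs):
    """Discrete approximation of the ccdf mean (E-type estimate)."""
    return sum(m * p for m, p in zip(class_means, class_probs))


def expected_loss(estimate, class_means, class_probs, loss):
    """Expected loss of an estimate, summed over the classes of the ccdf."""
    return sum(loss(estimate - m) * p for m, p in zip(class_means, class_probs))


def quadratic_loss(error):
    return error * error


# Hypothetical within-class means (ppm) and class probabilities.
class_means = [0.5, 1.1, 1.8, 3.5]
class_probs = [0.3, 0.3, 0.3, 0.1]

z_e = e_type_estimate(class_means, class_probs)

# Candidate estimates from 0 to 6 ppm in steps of 0.01 ppm.
candidates = [i / 100.0 for i in range(601)]
best = min(candidates,
           key=lambda z: expected_loss(z, class_means, class_probs, quadratic_loss))

print(z_e, best)
```

The grid search returns the candidate closest to the E-type estimate, as expected for a quadratic criterion. Replacing the last class mean by a larger value shifts both numbers, which is the tail sensitivity noted above.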
Table 7.4: Impact of the upper tail extrapolation model on various optimal estimates at locations u'_1 and u'_2.

                              Extrapolation options
Optimal estimates      Tabulated bounds   Hyperbolic (ω = 5)   Hyperbolic (ω = 1.5)
Location u'_1
  Mean (E-type)
  Median
  0.1-quantile
  0.9-quantile
Location u'_2
  Mean (E-type)
  Median
  0.1-quantile
  0.9-quantile

Using a linear function of the estimation error rather than the quadratic function (7.68) allows one to reduce the impact of tail models on the expected loss. For example, the loss can be modeled as proportional to the absolute estimation error:

         L(e(u)) = |e(u)|

see Figure 7.36 (second row, left graph). The optimal estimate for this mean absolute deviation criterion is the median of the ccdf:

         z_L^*(u) = q_{0.5}(u) = F^{-1}(u; 0.5|(n))

[Figure 7.36 panels: four loss functions versus estimation error (ppm), with underestimation and overestimation sides marked (left column); cpdf model at u'_2 versus Cd concentration (ppm) with the optimal estimate marked (right column). Legible annotation: z*_L = 0.9-quantile = 3.86 ppm.]

Figure 7.36: Four loss functions and the corresponding optimal estimates z*_L(u'_2) depicted by the vertical dashed line on the conditional probability distribution.
... (cost of unnecessary cleaning). Conversely, when predicting deficiencies in soil nutrients, the impact of underestimation (undue application of corrective treatments) is likely to be smaller than that of overestimation (risk of retarded growth or premature death). Thus, it is often critical to distinguish underestimation from overestimation when modeling the impact of estimation errors.

         L(e(u)) = \begin{cases} w_1 \cdot e(u) & \text{for } e(u) \ge 0 \text{ (overestimation)} \\ w_2 \cdot |e(u)| & \text{for } e(u) < 0 \text{ (underestimation)} \end{cases} \qquad (7.70)

where the non-negative parameters w_1 and w_2 are the relative impacts attached to overestimation and underestimation, respectively. The optimal estimate is shown to be the p-quantile of the ccdf (Journel, 1984a):

         z_L^*(u) = q_p(u) = F^{-1}(u; p|(n)) \qquad \text{with } p = \frac{w_2}{w_1 + w_2}

1. w_1 = w_2
   The linear loss function (7.70) is then symmetric, and the optimal estimate is the 0.5-quantile, i.e., the median previously discussed.

2. w_1 > w_2
   The impact of overestimation is larger than that of underestimation of the same magnitude. Thus, p < 0.5, and the optimal estimate is smaller than the median (a conservative choice for detecting deficiencies). Consider the asymmetric loss function in Figure 7.36 (third row, left graph), where w_1 = 0.9 and w_2 = 0.1. The optimal estimate is the 0.1-quantile of the ccdf depicted by the vertical dashed line in Figure 7.36 (third row, right graph). That quantile estimate can be interpreted as the threshold value that has a 90% probability of being exceeded by the unknown value:

         \mathrm{Prob}\{Z(u) > q_{0.1}(u) \mid (n)\} = 0.90

   Thus a large estimate q_{0.1}(u) indicates that the unknown value is certainly large at location u.

3. w_1 < w_2
   The impact of overestimation is smaller than that of underestimation of the same magnitude. Thus, p > 0.5, and the optimal estimate is larger than the median (a conservative choice for detecting pollution). Consider the asymmetric loss function in Figure 7.36 (left bottom graph), where w_1 = 0.1 and w_2 = 0.9. The optimal estimate is the 0.9-quantile of the ccdf depicted by the vertical dashed line in Figure 7.36 (right bottom graph). That quantile estimate can be interpreted as the threshold value that has only a 10% probability of being exceeded by the unknown value:

         \mathrm{Prob}\{Z(u) > q_{0.9}(u) \mid (n)\} = 0.10

Increasing the contrast between the relative impacts w_1 and w_2 yields quantile estimates corresponding to very small or very large cumulative probabilities p. Beware that any quantile value outside the range of threshold values, i.e., q_p(u) < z_1 or q_p(u) > z_K, would depend strongly on the model used to extrapolate the lower or the upper tail. For example, the 0.9-quantile estimate at location u'_2, q_{0.9}(u'_2) = 3.86 ppm, is larger than the maximum threshold z_4 = 2.26 ppm. Such an estimate is thus very sensitive to the upper tail model (Table 7.4, last row).

The three types of loss functions shown in Figure 7.36 (left column) allow a straightforward (analytical) determination of the optimal estimate. In absence of such an analytical solution, the expected loss can be computed for a series of z-values, and the one yielding the smallest expected loss is retained as the optimal estimate. Consider, for example, the following asymmetric loss function, which severely penalizes underestimation:

         L(e(u)) = e(u) \quad \text{for } e(u) \ge 0 \text{ (overestimation)} \qquad (7.71)

see Figure 7.37 (left top graph). Knowledge of the ccdf model F(u'_2; z|(n)) allows one to compute the expected loss for a series of Cd concentrations ranging from 0 to 6 ppm (Figure 7.37, bottom graph). The optimal estimate is 2.4 ppm, which corresponds to the minimum expected loss.

1. The impact of the estimation error may vary from one area to another; for example, the underestimation of a toxic concentration is likely to be more prejudicial in residential areas than in industrial yards. One should then define a loss function L(e(u); u) that changes locally, depending on whether u is in an industrial or residential area.

2. In practice, the losses associated with both types of estimation error may be difficult to evaluate precisely. However, the user generally knows which type of error would be most prejudicial in the situation at hand. Simple asymmetric loss functions, such as expressions (7.70) or (7.71),

[Figure 7.37: asymmetric loss function (underestimation versus overestimation), ccdf model at the location, and expected loss versus Cd estimate (ppm).]
7.4.3 Decision making in the face of uncertainty

Many investigations lead to important decisions, such as cleaning hazardous areas or correcting for soil deficiencies. Decisions are most often made in the face of uncertainty because concentrations in toxic or nutrient elements are rarely known with certainty. Given a conditional cdf model, there are several ways to account for such uncertainty in the decision-making process.

Exceeding a probability threshold

Consider the decision of cleaning locations contaminated by cadmium. A straightforward approach consists of declaring contaminated all locations where the probability of exceeding the tolerable maximum 0.8 ppm is larger than a given probability threshold. There is no doubt that location $\mathbf{u}'_2$, with a 95% probability of exceeding 0.8 ppm, is a prime candidate for remediation. Decision making is much more difficult at location $\mathbf{u}'_1$, which has a 34% probability of exceeding the tolerable maximum. In such a case, the choice of a probability threshold is subjective and depends on political and social decisions; for example, a 34% probability of contamination may be unacceptable for residential areas, yet bearable for industrial yards.
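The probability-threshold rule can be sketched in a few lines of Python. This is only an illustration: the tabulated ccdf values, the linear interpolation between thresholds, and the function names are assumptions made for the example, not the book's implementation.

```python
from bisect import bisect_right

def ccdf_value(thresholds, cdf_values, z):
    """Linearly interpolate F(u; z|(n)) between tabulated thresholds.

    Values outside the tabulated range are clamped (no tail model).
    """
    if z <= thresholds[0]:
        return cdf_values[0]
    if z >= thresholds[-1]:
        return cdf_values[-1]
    i = bisect_right(thresholds, z)
    z0, z1 = thresholds[i - 1], thresholds[i]
    f0, f1 = cdf_values[i - 1], cdf_values[i]
    return f0 + (f1 - f0) * (z - z0) / (z1 - z0)

def declare_contaminated(prob_exceed, p_threshold):
    """Flag a location when its exceedance probability is above the threshold."""
    return prob_exceed > p_threshold

# Hypothetical ccdf values at three thresholds (ppm).
thresholds = [0.5, 1.0, 2.0]
cdf_values = [0.2, 0.5, 0.9]
prob_exceed = 1.0 - ccdf_value(thresholds, cdf_values, 0.8)

# The two probabilities quoted in the text, against a hypothetical 50% threshold.
print(declare_contaminated(0.95, 0.5), declare_contaminated(0.34, 0.5))
```

The choice of `p_threshold` remains the subjective part of the rule; the code only makes the comparison explicit.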
In practice, remedial measures are applied to an area or block $V(\mathbf{u})$, not to a single location $\mathbf{u}$. Decision making can then proceed in many different ways, depending on whether the regulatory threshold applies to a block, say, $z_V$ for a truckload-sized volume, or to a single location, say, $z_c$ for an auger-sampled size volume. Examples of different approaches follow:

• Model the ccdf at $N$ locations $\mathbf{u}'_j$ discretizing the block $V(\mathbf{u})$, and retrieve the $N$ corresponding probabilities of exceeding the critical threshold, $1 - F(\mathbf{u}'_j; z_c|(n))$. Then, decide to clean the block if the probability threshold $p_c$ is exceeded by a given proportion of locations.

3. Whatever the optimality criterion retained, the resulting estimate $z^*_L(\mathbf{u})$ honors the soft information available at location $\mathbf{u}$. Consider, for example, that the soft information consists of a constraint interval $(a, b]$ for the unknown value $z(\mathbf{u})$. The posterior cdf at $\mathbf{u}$ then takes the form

The first approach does not call for averaging probabilities or z-estimated values. Thus, a block could be declared contaminated if the probability threshold is exceeded at a few locations within that block. In contrast, the use of a block ccdf may lead to overly optimistic decisions because the average z-value smooths out extreme values.
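A minimal sketch of the first block-level approach follows. The function name, the input probabilities, and the two tuning values (probability threshold and required proportion of locations) are hypothetical and would in practice be set by the decision maker.

```python
def clean_block(exceed_probs, p_threshold, min_proportion):
    """Decide to clean a block from point-support exceedance probabilities.

    exceed_probs   : probabilities 1 - F(u'_j; z_c|(n)) at the N locations
                     discretizing the block
    p_threshold    : probability threshold applied at each location
    min_proportion : proportion of flagged locations that triggers cleaning
    """
    flagged = sum(1 for p in exceed_probs if p > p_threshold)
    return flagged / len(exceed_probs) >= min_proportion

# Hypothetical block discretized by four locations.
probs = [0.95, 0.34, 0.60, 0.10]
print(clean_block(probs, p_threshold=0.5, min_proportion=0.5))
```

Because no averaging is involved, a low `min_proportion` lets a few strongly contaminated locations decide the fate of the whole block, which is the behavior described above.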


Another approach consists of declaring contaminated all locations where the estimated Cd concentration exceeds the tolerable maximum $z_c = 0.8$ ppm. This approach requires the prior determination of an estimate for the unknown concentration. As shown for location $\mathbf{u}'_1$ in Table 7.4 (first row, page 313), different optimality criteria and interpolation ccdf models yield different estimates, which may or may not exceed the critical threshold 0.8 ppm. Therefore, there is a risk of declaring contaminated a safe location. Conversely, one might declare safe a contaminated location. These two misclassification risks can be assessed from the conditional cdf model $F(\mathbf{u}; z|(n))$:
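1. The risk $\alpha(\mathbf{u})$ of wrongly classifying a location $\mathbf{u}$ as hazardous (false positive) is

$$\alpha(\mathbf{u}) = \text{Prob}\{Z(\mathbf{u}) \le z_c \mid (n)\} = F(\mathbf{u}; z_c|(n))$$

for all locations $\mathbf{u}$ such that the estimate $z^*_L(\mathbf{u}) > z_c$.

2. The risk $\beta(\mathbf{u})$ of wrongly classifying a location $\mathbf{u}$ as safe (false negative) is

$$\beta(\mathbf{u}) = \text{Prob}\{Z(\mathbf{u}) > z_c \mid (n)\} = 1 - F(\mathbf{u}; z_c|(n))$$

for all locations $\mathbf{u}$ such that the estimate $z^*_L(\mathbf{u}) \le z_c$.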
Note that the risk $\beta(\mathbf{u})$ is not defined where the risk $\alpha(\mathbf{u})$ is, and conversely. Again, one faces the difficult problem of choosing a probability threshold for each misclassification risk.

Figure 7.38: Median and 0.9-quantile estimates deduced from the probability kriging ccdf models, and the corresponding misclassification risks: $\alpha$ is the risk of wrongly declaring that a location is hazardous on the basis that the estimate $z^*_L(\mathbf{u})$ exceeds 0.8 ppm, whereas $\beta$ is the risk of wrongly declaring that a location is safe on the basis that the estimate $z^*_L(\mathbf{u})$ is smaller than 0.8 ppm.

Figure 7.38 (top graph) shows the median and 0.9-quantile estimates retrieved from the ccdf models obtained using probability kriging. In both cases, a location $\mathbf{u}$ is declared hazardous with a risk $\alpha(\mathbf{u})$ of false positive if the estimate exceeds the critical threshold depicted by the horizontal dashed line (Figure 7.38, middle graphs). Conversely, if the estimate at $\mathbf{u}$ is smaller than 0.8 ppm, that location is declared safe with a risk $\beta(\mathbf{u})$ of false negative (Figure 7.38, bottom graphs). Note the following:

• At any particular location $\mathbf{u}$, the magnitude of the misclassification risk depends on the ccdf model, not on the particular estimate (median or 0.9-quantile) retained.

• The type of misclassification risk, $\alpha$ or $\beta$, depends, however, on the estimate $z^*_L(\mathbf{u})$, that is, on the optimality criterion. As the loss function penalizes underestimation more severely, estimates tend to be larger, thus limiting the areas with false negative risk $\beta(\mathbf{u})$ at the expense of potential large risks $\alpha(\mathbf{u})$. For example, location $\mathbf{u}'_1$ is classified as safe with a risk $\beta(\mathbf{u}'_1) = 0.24$ on the basis of the median estimate (Figure 7.38, bottom left graph). The same location $\mathbf{u}'_1$ is classified as hazardous with a risk $\alpha(\mathbf{u}'_1) = 0.76$ on the basis of the 0.9-quantile estimate (Figure 7.38, middle right graph). In the latter case, all unsampled locations are declared contaminated with a large misclassification risk $\alpha$ in the low-valued (left) part of the transect.

• Misclassification risks can be used to rank locations candidates for additional sampling: locations with the highest risks $\beta(\mathbf{u})$ are preferentially sampled. Such a criterion would lead one to locate additional samples in the low-valued part of the transect.
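The remarks above can be illustrated with a short Python sketch. It assumes that the risk of a false positive is the ccdf value at the critical threshold and that the risk of a false negative is its complement, which is consistent with the 0.24 and 0.76 values quoted in the text; the function name and the two estimate values are hypothetical.

```python
def misclassification_risk(estimate, cdf_at_zc, z_c=0.8):
    """Return the type and magnitude of the misclassification risk.

    estimate  : optimal estimate retained at the location (ppm)
    cdf_at_zc : ccdf value F(u; z_c|(n)) at the critical threshold
    """
    if estimate > z_c:
        # Location declared hazardous: risk that it is actually safe.
        return "alpha", cdf_at_zc
    # Location declared safe: risk that it is actually contaminated.
    return "beta", 1.0 - cdf_at_zc

# Same ccdf value, two hypothetical estimates on either side of 0.8 ppm.
print(misclassification_risk(estimate=0.6, cdf_at_zc=0.76))  # median-like estimate
print(misclassification_risk(estimate=1.9, cdf_at_zc=0.76))  # 0.9-quantile-like estimate
```

The magnitude comes from the ccdf alone, whereas the estimate only decides which of the two risks applies.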
Beware that a classification based on an estimate is no more objective than a classification based on a probability of exceeding a critical threshold. Indeed, an optimality criterion must be chosen, and the resulting estimate may depend on somewhat subjective decisions about models for extrapolating ccdf tails.

Minimization of the expected loss

A third approach consists of evaluating the economic impact of the two possible decisions using the concept of loss functions, introduced in section 7.4.2. Each location is classified as safe or contaminated so as to minimize the resulting expected loss (Journel, 1987). Unlike previous approaches based on probabilities of exceeding critical thresholds or misclassification risks, here the decision is based on financial costs.

As in the determination of optimal estimates, the key step is the specification of economic functions that measure the impact of the two types of misclassification:

1. The loss associated with classifying a location $\mathbf{u}$ as safe could be modeled as

$$L_1(z(\mathbf{u})) = \begin{cases} 0 & \text{if } z(\mathbf{u}) \le z_c \\ w_1 \cdot [z(\mathbf{u}) - z_c] & \text{otherwise} \end{cases} \qquad (7.72)$$

where $w_1$ is the relative cost of underestimating the toxic concentration, e.g., potential ill health ($w_1$ is in units of money/concentration, say, dollar/ppm), and $z_c$ is the critical threshold. If the location $\mathbf{u}$ is actually safe, $z(\mathbf{u}) \le z_c$, then the classification is correct and there is no loss. If the location is actually contaminated, $z(\mathbf{u}) > z_c$, the misclassification cost is modeled as proportional to the actual contamination $[z(\mathbf{u}) - z_c]$.

2. The loss associated with classifying a location $\mathbf{u}$ as contaminated could be modeled as

$$L_2(z(\mathbf{u})) = \begin{cases} 0 & \text{if } z(\mathbf{u}) > z_c \\ w_2 & \text{otherwise} \end{cases} \qquad (7.73)$$

The remediation cost is here modeled as a constant value $w_2$. For example, the cleaning procedure amounts to removing the upper layer of soil, hence the cost is independent of the actual concentration and $w_2$ is in units of money only. An alternative consists of modeling the remediation cost as proportional to the estimated contamination $[z^*_L(\mathbf{u}) - z_c]$, which calls for a prior estimate of the unknown concentration at $\mathbf{u}$.

The conditional cdf model $F(\mathbf{u}; z|(n))$ allows one to determine the expected loss attached to the two types of classification as

$$\varphi_1(\mathbf{u}) = E\{L_1(Z(\mathbf{u})) \mid (n)\} = \int_{-\infty}^{+\infty} L_1(z)\, dF(\mathbf{u}; z|(n))$$

$$\varphi_2(\mathbf{u}) = E\{L_2(Z(\mathbf{u})) \mid (n)\} = \int_{-\infty}^{+\infty} L_2(z)\, dF(\mathbf{u}; z|(n))$$

which are, in practice, approximated as

$$\varphi_1(\mathbf{u}) \simeq \sum_{k=1}^{K+1} L_1(\bar{z}_k) \cdot [F(\mathbf{u}; z_k|(n)) - F(\mathbf{u}; z_{k-1}|(n))] \qquad (7.74)$$

$$\varphi_2(\mathbf{u}) \simeq \sum_{k=1}^{K+1} L_2(\bar{z}_k) \cdot [F(\mathbf{u}; z_k|(n)) - F(\mathbf{u}; z_{k-1}|(n))] \qquad (7.75)$$

where $\bar{z}_k$ and $K$ are defined as in equation (7.64). The location $\mathbf{u}$ is then declared safe or contaminated so as to minimize the resulting expected loss: it is declared safe if $\varphi_1(\mathbf{u}) < \varphi_2(\mathbf{u})$, and contaminated otherwise.

Consider the problem of whether to classify locations along the NE-SW transect as contaminated by cadmium. The loss attached to the two types of classification is modeled by functions (7.72) and (7.73) with two sets of w-costs: (1) $w_1 = 1$, $w_2 = 0.5$ and (2) $w_1 = 1$, $w_2 = 2.5$ (Figure 7.39, top graphs). Figure 7.39 (middle graphs) shows the expected losses computed using the ccdf models provided by probability kriging. The expected loss associated with declaring a location safe (solid line) is larger in the high-valued (right) part of the transect, where the unknown Cd concentration more likely exceeds the critical threshold. Conversely, the expected loss associated with declaring a location contaminated (dashed line) is larger in the low-valued (left) part of the transect, where there is a small probability of exceeding the critical threshold. The minimization of the expected loss yields the classifications shown at the bottom of Figure 7.39. As the remediation cost $w_2$ increases, the impact of a false positive (unnecessary cleaning) increases, hence more locations are declared safe (Figure 7.39, right bottom graph).

7.4.4 Simulation

Let $F(\mathbf{u}; z|(n))$ be the conditional distribution (ccdf) modeling the uncertainty about the unknown $z(\mathbf{u})$. Rather than deriving a single estimated value $z^*(\mathbf{u})$ from that ccdf, one may draw from it a series of $L$ simulated values $z^{(l)}(\mathbf{u})$, $l = 1, \ldots, L$. Each value $z^{(l)}(\mathbf{u})$ represents a possible outcome or realization of the RV $Z(\mathbf{u})$ modeling the uncertainty at location $\mathbf{u}$.

The "Monte-Carlo" simulation proceeds in two steps:

1. A series of $L$ independent random numbers $p^{(l)}$, $l = 1, \ldots, L$, uniformly distributed in $[0, 1]$, is drawn.

2. The $l$th simulated value $z^{(l)}(\mathbf{u})$ is identified with the $p^{(l)}$-quantile of the ccdf (Figure 7.40):

$$z^{(l)}(\mathbf{u}) = F^{-1}(\mathbf{u}; p^{(l)}|(n)) \qquad l = 1, \ldots, L \qquad (7.76)$$
[Figure 7.39 panels, for the cases $w_2 = 0.5$ and $w_2 = 2.5$: health cost and remediation cost versus Cd concentration (ppm); expected losses of the safe and contaminated decisions versus distance (km); resulting classification.]

Figure 7.39: Two pairs of loss functions model the loss associated with an incorrect classification of a location as safe (ill health) or contaminated (unnecessary cleaning). As the remediation cost $w_2$ increases, the expected loss attached to undue cleaning increases relatively to the expected health costs, leading to more locations being classified as safe.

The $L$ simulated values $z^{(l)}(\mathbf{u})$ are distributed according to the conditional cdf. Indeed,

$$\text{Prob}\{Z^{(l)}(\mathbf{u}) \le z\} = \text{Prob}\{F^{-1}(\mathbf{u}; p^{(l)}|(n)) \le z\}$$ from the definition (7.76),

$$= \text{Prob}\{p^{(l)} \le F(\mathbf{u}; z|(n))\}$$ since $F(\mathbf{u}; z|(n))$ is monotonic nondecreasing,

$$= F(\mathbf{u}; z|(n))$$ since the random numbers $p^{(l)}$ are uniformly distributed in $[0, 1]$.

This property of ccdf reproduction allows one to approximate any moment or quantile of the conditional distribution by the corresponding moment or quantile of the histogram of many realizations $z^{(l)}(\mathbf{u})$. Thus, Monte-Carlo simulation provides an alternative to the approximations of type (7.64) or (7.65) for computing the conditional variance or E-type estimate. Note though that determination of the quantile value (7.76) still requires interpolation and extrapolation from calculated ccdf values.

Stochastic simulation can be extended to the modeling of uncertainty about the output value of any complex transfer function at any location $\mathbf{u}$. Consider, for example, the problem of modeling the uncertainty about the loss associated with classifying the location $\mathbf{u}'_2$ as safe. Figure 7.41 (left graph) shows the histogram of 1,000 simulated Cd values drawn from the ccdf model depicted in Figure 7.32 (page 334, right graph). The simulated histogram is close to the conditional pdf model shown in Figure 7.33 (right graph), as it should. The 1,000 realizations $z^{(l)}(\mathbf{u}'_2)$ can be fed into the loss function (7.72) to yield a simulated distribution of costs at location $\mathbf{u}'_2$, $y^{(l)}(\mathbf{u}'_2) = L_1[z^{(l)}(\mathbf{u}'_2)]$, $l = 1, \ldots, L$ (Figure 7.41, right graph).

Figure 7.41: Histogram of 1,000 Cd values simulated at location $\mathbf{u}'_2$ using the quantile algorithm (left graph). These 1,000 realizations are fed into the loss function (7.72), yielding the output distribution of costs on the right.

In contrast, the approximation (7.74) provides only an estimate of the mean of the distribution (expected cost) without any information on the spread of that distribution.
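The quantile algorithm and the propagation of simulated values through a loss function can be sketched as follows. The tabulated ccdf, the linear interpolation between thresholds, the clamping of the tails, and all names are assumptions made for the illustration; the loss follows the form described for expression (7.72), with a hypothetical cost $w_1 = 1$.

```python
import random
from bisect import bisect_left
from statistics import mean, pstdev

# Hypothetical ccdf values F(u; z_k|(n)), strictly increasing from 0 to 1.
THRESHOLDS = [0.0, 0.5, 1.0, 2.0, 4.0, 6.0]   # ppm
CCDF = [0.0, 0.15, 0.40, 0.75, 0.95, 1.0]

def ccdf_quantile(p, thresholds=THRESHOLDS, ccdf=CCDF):
    """Return the p-quantile by linear interpolation of the tabulated ccdf."""
    if p <= ccdf[0]:
        return thresholds[0]
    if p >= ccdf[-1]:
        return thresholds[-1]
    k = bisect_left(ccdf, p)          # ccdf[k-1] < p <= ccdf[k]
    frac = (p - ccdf[k - 1]) / (ccdf[k] - ccdf[k - 1])
    return thresholds[k - 1] + frac * (thresholds[k] - thresholds[k - 1])

def loss_declared_safe(z, z_c=0.8, w1=1.0):
    """Loss of declaring the location safe: zero if z <= z_c, else w1*(z - z_c)."""
    return w1 * (z - z_c) if z > z_c else 0.0

def simulate(n_real, seed=0):
    """Step 1: draw uniform numbers; step 2: read the matching ccdf quantiles."""
    rng = random.Random(seed)
    return [ccdf_quantile(rng.random()) for _ in range(n_real)]

values = simulate(1000)
costs = [loss_declared_safe(z) for z in values]
# Unlike a single expected loss, the simulated costs also give a spread.
print(round(mean(costs), 3), round(pstdev(costs), 3))
```

The printed mean plays the role of the expected cost, while the standard deviation is the extra information that an approximation of type (7.74) does not provide.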
7.4.5 Classification of categorical attributes

The previous analysis of local uncertainty can be extended to a categorical attribute $s$ with $K$ mutually exclusive categories $s_k$. Consider, for example, the land use data at the top of Figure 7.42. The uncertainty about which land use $s_k$ prevails at any unsampled location $\mathbf{u}$ is modeled by the vector of $K$ conditional probabilities $[p(\mathbf{u}; s_k|(n))]^*$. Figure 7.42 (middle graphs) shows the probability distributions of four land uses at locations $\mathbf{u}'_a$ and $\mathbf{u}'_b$. Intuitively, the uncertainty is smaller at $\mathbf{u}'_a$ where the category $s_3$ (meadow) has a much larger probability of occurrence than any other category. In contrast, three categories have non-negligible probabilities to prevail at $\mathbf{u}'_b$. The post-processing of conditional probability distributions allows one to quantify such local uncertainty about the prevailing category $s(\mathbf{u})$ and derive an optimal estimate for that category.

[Figure 7.42 panels: land use data (tillage, meadow, pasture, forest); conditional probability distributions at the two locations; maps of (1 - max. cpdf value) and of local entropy.]

The uncertainty attached to a particular location $\mathbf{u}$ could be measured as 1 minus the largest conditional probability at this location:

$$1 - \max_k \, [p(\mathbf{u}; s_k|(n))]^* \qquad (7.77)$$

As with the measure (7.77), the local entropy equals zero at any datum location (no uncertainty). At other locations, the local entropy is valued in $(0, \ln K]$. The upper bound ($\ln K$) is the maximum entropy associated with the uniform distribution, $[p(\mathbf{u}; s_k|(n))]^* = 1/K$, $\forall k$. Thus, a standardized measure of the local entropy is

$$H_R(\mathbf{u}) = \frac{H(\mathbf{u})}{\ln K} \in [0, 1]$$

where $H(\mathbf{u})$ denotes the local entropy. The local entropy at locations $\mathbf{u}'_a$ and $\mathbf{u}'_b$ is 0.28 and 0.87, respectively. Again, the uncertainty is larger at $\mathbf{u}'_b$.
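Both uncertainty measures are easy to compute from a vector of conditional probabilities. The sketch below assumes the usual Shannon form of the entropy with natural logarithms, which is consistent with the upper bound $\ln K$ stated above; the function names and the two probability vectors are hypothetical.

```python
import math

def max_prob_uncertainty(probs):
    """Uncertainty measured as 1 minus the largest conditional probability."""
    return 1.0 - max(probs)

def local_entropy(probs):
    """Shannon entropy (natural log); zero-probability categories contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def standardized_entropy(probs):
    """Local entropy rescaled by ln K so that it lies in [0, 1]."""
    return local_entropy(probs) / math.log(len(probs))

# Hypothetical probabilities for four land uses at two locations.
dominated = [0.05, 0.05, 0.85, 0.05]   # one category clearly prevails
mixed = [0.35, 0.30, 0.30, 0.05]       # three plausible categories

for probs in (dominated, mixed):
    print(round(max_prob_uncertainty(probs), 2),
          round(standardized_entropy(probs), 2))
```

The entropy uses the whole probability vector, so it can separate two locations that share the same largest probability but differ in how the remaining probability is spread.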
7.5 Performance Comparison

The risk of the Cd, Co, or Pb concentrations exceeding the tolerable maximum within the study area is modeled using different multiGaussian- and indicator-based algorithms. Next, regions where remedial measures should be taken are delineated by applying classification criteria introduced in section 7.4 to the local ccdf models. Classification results are compared with actual metal concentrations known at test locations.

[Figure 7.44 panels: MG estimates, oIK estimates, oICK estimates, PK estimates, local prior probabilities, SKlm estimates.]

1. The first algorithm requires the multivariate Gaussian RF model. Under that model, the mean and variance of each Gaussian ccdf $G(\mathbf{u}; y|(n))$ are estimated from Cd normal scores using simple kriging. The probability of exceeding the critical threshold $z_c = 0.8$ ppm is then retrieved as the probability $[1 - G(\mathbf{u}; y_c|(n))]$ of exceeding its normal score $y_c = \phi(z_c)$ (Figure 7.44, left top graph).

2. The four other algorithms require a prior indicator transform of the Cd data. Whereas ordinary indicator kriging (oIK) uses only indicator data defined at the threshold being estimated, 0.8 ppm, ordinary indicator cokriging (oICK) accounts for three additional thresholds: 1.38, 1.88, and 2.26 ppm. Instead of indicator transforms at thresholds different from 0.8 ppm, probability kriging (PK) uses uniform transforms of Cd data as secondary information.

3. The last indicator algorithm incorporates soft information related to geology in addition to the indicator data at threshold 0.8 ppm. A calibration of rock types allows one to convert the geologic map of Figure 6.6 (left top graph) into a map of local prior probabilities of exceeding 0.8 ppm, shown in Figure 7.44 (left bottom graph). These probabilities are then used as local means in the simple IK estimator (7.43); see the resulting probability map in Figure 7.44 (right bottom graph).
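For the multiGaussian algorithm of item 1, the exceedance probability follows directly from the Gaussian ccdf once the simple kriging mean and variance of the normal scores are available. The sketch below uses Python's standard `statistics.NormalDist`; the function name and the numerical inputs are hypothetical, and the normal score transform of the threshold is assumed to have been computed beforehand.

```python
from statistics import NormalDist

def prob_exceed_gaussian(sk_mean, sk_variance, y_c):
    """Probability 1 - G(u; y_c|(n)) that the normal score exceeds y_c.

    sk_mean, sk_variance : simple kriging estimate and variance of the
                           normal score at location u (variance must be > 0)
    y_c                  : normal score transform of the critical threshold
    """
    local_ccdf = NormalDist(mu=sk_mean, sigma=sk_variance ** 0.5)
    return 1.0 - local_ccdf.cdf(y_c)

# Hypothetical kriging results at two locations, threshold normal score 0.2.
print(round(prob_exceed_gaussian(1.0, 0.4, 0.2), 3))
print(round(prob_exceed_gaussian(-0.5, 0.9, 0.2), 3))
```

At a datum location the kriging variance is zero and the Gaussian ccdf degenerates into a step function, a case that this sketch does not handle.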

Optimal estimates and measures of local uncertainty

Figure 7.45 shows the maps of ordinary kriging Cd estimates (top graph) and four estimates retrieved from the oIK-related ccdfs. Although both OK and E-type estimates are optimal for the least-squares criterion, they are not identical. As mentioned in section 7.4.2, the two estimates differ in that the ccdf from which the E-type estimate derives is dependent on the data values. That difference is further illustrated in Figure 7.46, which shows the maps of the OK variance and three measures of spread of the conditional distributions (variance, entropy, interquartile range). The map of OK variance indicates greater uncertainty in the extreme west corner of the study area where data are sparse, whereas the uncertainty is smallest near data locations. Elsewhere the kriging variance is about the same whatever the surrounding data values. In contrast, measures of spread deduced from the ccdf models all indicate that uncertainty is larger in the high Cd-valued parts of the study area where the data variance is also the largest (recall the proportional effect shown in Figure 4.4). The uncertainty is smaller on Argovian rocks where Cd concentrations are consistently small.

Figure 7.45: Maps of ordinary kriging Cd estimates and various statistics of the local ccdfs: mean, median, 0.1-quantile, and 0.9-quantile.

Figure 7.46: Four measures of local uncertainty over the study area: ordinary kriging variance and three ccdf statistics (conditional variance, entropy, and interquartile range).

In addition to the means of the conditional cdfs, three p-quantiles of these distributions are mapped in Figure 7.45: the median, the 0.1-quantile, and the 0.9-quantile. These three conditional quantiles map the threshold values that have, respectively, a 50%, 90%, and 10% probability of being exceeded by the unknown Cd values. High-valued parts (dark zones) of the 0.1-quantile map indicate areas where the unknown Cd concentrations are certainly large, whereas low-valued parts (light grey zones) of the 0.9-quantile map correspond to areas where the concentrations are certainly small. As with the marginal distribution (histogram), the local distributions of Cd data tend to be positively skewed, hence the median estimate is generally smaller than the E-type estimate at the same location.
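The statistics mapped in Figures 7.45 and 7.46 can be illustrated on a single hypothetical ccdf. In the sketch below, the tabulated values, the linear interpolation within classes, and the use of class midpoints as class means for the E-type estimate are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical, positively skewed ccdf tabulated at six thresholds (ppm).
THRESHOLDS = [0.0, 0.5, 1.0, 2.0, 4.0, 6.0]
CCDF = [0.0, 0.15, 0.40, 0.75, 0.95, 1.0]

def ccdf_quantile(p):
    """p-quantile by linear interpolation between tabulated ccdf values."""
    if p <= CCDF[0]:
        return THRESHOLDS[0]
    if p >= CCDF[-1]:
        return THRESHOLDS[-1]
    k = bisect_left(CCDF, p)
    frac = (p - CCDF[k - 1]) / (CCDF[k] - CCDF[k - 1])
    return THRESHOLDS[k - 1] + frac * (THRESHOLDS[k] - THRESHOLDS[k - 1])

def e_type():
    """Mean of the ccdf: class probabilities weighted by class midpoints."""
    total = 0.0
    for k in range(1, len(THRESHOLDS)):
        midpoint = 0.5 * (THRESHOLDS[k - 1] + THRESHOLDS[k])
        total += midpoint * (CCDF[k] - CCDF[k - 1])
    return total

median = ccdf_quantile(0.5)
q10, q90 = ccdf_quantile(0.1), ccdf_quantile(0.9)
iqr = ccdf_quantile(0.75) - ccdf_quantile(0.25)
# The 0.1-quantile is exceeded with probability 0.9, the 0.9-quantile with 0.1.
print(round(e_type(), 2), round(median, 2), round(q10, 2), round(q90, 2), round(iqr, 2))
```

With this skewed distribution the median falls below the E-type estimate, the pattern described above for the Cd data.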

Delimitation of contaminated areas

Figures 7.47-7.49 show three subdivisions of the study area into safe regions and regions where remedial measures should be taken. In all three cases, the conditional cdfs were established using ordinary indicator kriging. The three criteria for classifying a location as contaminated follow:

1. The probability of exceeding the tolerable maximum 0.8 ppm is larger than the marginal probability of contamination estimated in section 2.1.2 (Figure 7.47).

2. The E-type estimate exceeds the critical threshold (Figure 7.48, top graphs). The bottom maps depict the risks attached to each classification. For example, the risk $\alpha$ of wrongly declaring that a location is hazardous (false positive) is larger around low-valued areas (Figure 7.48, left bottom graph).

3. The expected loss associated with wrongly declaring a location contaminated (undue cleaning of a safe location) is smaller than the cost associated with wrongly classifying a location as safe (ill health) (Figure 7.49). These two losses were modeled using functions (7.72) and (7.73) with, for w-costs, $w_1 = 1$, $w_2 = 2.5$.

These different criteria lead to classifications that may greatly differ. In particular, the E-type estimate, which is optimal for a least-squares criterion, tends to overestimate the Cd concentration, leading to most locations being classified as contaminated (Figure 7.48, right top graph).

Figure 7.47: Classification of locations as contaminated by cadmium on the basis that the probability of exceeding the critical threshold 0.8 ppm is larger than the marginal probability of contamination (0.653).

2. The tolerable maximum 0.8 ppm is exceeded by a p-quantile estimate retrieved from the local ccdf.

3. The expected loss associated with wrongly declaring a location contaminated (undue cleaning of a safe location) is smaller than the cost associated with wrongly declaring it safe (ill health).

A range of values for the probability $p$ and the w-costs was considered to investigate the sensitivity of classification results to the probability threshold $p$, the p-quantile estimate, or the loss functions retained.

Figure 7.50 (left column) shows the proportion of locations declared contaminated versus the probability $p$ or the cost ratio $w_1/w_2$, with $w_2$ set to 1.
[Figure 7.49 panels: health costs, remediation costs, classification.]

[Figure 7.50 panels: first criterion (probability threshold p), second criterion (p-quantile estimate), third criterion (expected cost); horizontal axes: probability p and cost ratio $w_1/w_2$.]
2. oICK: ccdf values are determined using ordinary indicator cokriging at the same five thresholds used for oIK. The secondary (soft) information consists of the colocated local prior probability of type (7.25) derived from a calibration of the relation between Cd and Ni values.

Whatever the algorithm used to determine the ccdf values, the proportion of locations declared contaminated decreases as the probability threshold $p$ increases (Figure 7.50, left top graph). Conversely, as $p$ increases, the corresponding p-quantile estimate increases, hence more locations are declared contaminated (Figure 7.50, left middle graph). As the relative cost $w_1$ of declaring a location safe increases, the proportion of locations classified as contaminated increases (Figure 7.50, left bottom graph).
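The opposite trends of the first two criteria can be reproduced with a toy example. Purely for illustration, the local distributions below are Gaussian (in the text they come from indicator-based algorithms), and all names and parameter values are hypothetical.

```python
from statistics import NormalDist

Z_C = 0.8  # tolerable maximum (ppm)

# Hypothetical local distributions (mean, standard deviation) at five locations.
LOCAL_CCDFS = [
    NormalDist(0.5, 0.2),
    NormalDist(0.7, 0.4),
    NormalDist(0.9, 0.3),
    NormalDist(1.4, 0.5),
    NormalDist(2.0, 0.6),
]

def proportion_by_probability(p):
    """First criterion: exceedance probability larger than the threshold p."""
    flagged = sum(1 for d in LOCAL_CCDFS if 1.0 - d.cdf(Z_C) > p)
    return flagged / len(LOCAL_CCDFS)

def proportion_by_quantile(p):
    """Second criterion: p-quantile estimate larger than the tolerable maximum."""
    flagged = sum(1 for d in LOCAL_CCDFS if d.inv_cdf(p) > Z_C)
    return flagged / len(LOCAL_CCDFS)

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, proportion_by_probability(p), proportion_by_quantile(p))
```

The second criterion at probability $p$ flags the same locations as the first criterion at $1 - p$, because a p-quantile exceeds the threshold exactly when the ccdf value at the threshold is smaller than $p$.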
Assessment of Spatial Uncertainty

The multiGaussian and indicator-based algorithms introduced in Chapter 7 provide only models of local uncertainty in that each conditional cdf is specific to one single location. Most applications require a measure of the joint uncertainty about attribute values at several locations taken together, for example, the probability of occurrence of a string of large or small values. Such spatial uncertainty is modeled by generating multiple realizations of the joint distribution of attribute values in space, a process known as stochastic simulation. Then, transfer functions, such as flow simulators, can be applied to the set of alternative representations, yielding a distribution of response values, such as the time for a fluid to travel from one location to another, used in subsequent risk analysis.

The conceptual difference between stochastic simulation and estimation is discussed in section 8.1. Sections 8.2-8.7 present various algorithms for simulating one or several interdependent continuous variables: sequential Gaussian and indicator simulation techniques, the LU decomposition algorithm, the p-field simulation algorithm, and simulated annealing. Algorithms for simulating categorical variables are introduced in section 8.8. Important topics such as reproduction of model statistics, visualization of spatial uncertainty, and accuracy and precision of response distributions are addressed in section 8.9.

8.1 Estimation versus Simulation

Let $\{z^*(\mathbf{u}), \mathbf{u} \in A\}$ be the set of kriging estimates of attribute $z$ over the study area $A$. Each estimate $z^*(\mathbf{u})$ taken separately, i.e., independently of neighboring estimates $z^*(\mathbf{u}')$, is "best" in the least-squares sense because the local error variance $\text{Var}\{Z^*(\mathbf{u}) - Z(\mathbf{u})\}$ is minimum. The map of such best local estimates, however, may not be best as a whole. As shown in Figure 5.19 and Table 5.2 (page 181 and following), interpolation algorithms tend to smooth out local details of the spatial variation of the attribute. Typically, small values are overestimated, whereas large values are underestimated. Such conditional bias is a serious shortcoming when trying to detect patterns of extreme attribute values, such as zones of high permeability values or zones rich in a metal. Another drawback of estimation is that the smoothing is not uniform. Rather, it depends on the local data configuration: smoothing is minimal close to the data locations and increases as the location being estimated gets farther away from data locations. A map of kriging estimates appears more variable in densely sampled areas than in sparsely sampled areas. Thus the kriged map may display artifact structures.

Smooth interpolated maps should not be used for applications sensitive to the presence of extreme values and their patterns of continuity. Consider, for example, the problem of assessing groundwater travel times from a nuclear repository to the surface. A smooth map of estimated transmissivities would fail to reproduce critical features, such as strings of large or small values that form flow paths or barriers. The processing of a kriged transmissivity map through a flow simulator may yield inaccurate travel times. Similarly, the risk of soil pollution by heavy metals would be underestimated by a kriged map of metal concentrations that fails to reproduce clusters of large concentrations above the tolerable maximum.

[Figure 8.1 panels: OK Cd estimates (mean 1.35, variance 0.16) and simulated Cd values (mean 1.41, variance 0.86); histograms of Cd concentration (ppm).]

Reproducing model statistics

Instead of a map of local best estimates, stochastic simulation generates a map or a realization of z-values, say, $\{z^{(l)}(\mathbf{u}), \mathbf{u} \in A\}$ with $l$ denoting the $l$th realization, which reproduces statistics deemed most consequential for the problem in hand. Typical requisites for such a simulated map are as follows:

1. Data values are honored at their locations:

$$z^{(l)}(\mathbf{u}) = z(\mathbf{u}_\alpha) \qquad \forall\, \mathbf{u} = \mathbf{u}_\alpha, \quad \alpha = 1, \ldots, n$$

The realization is then said to be conditional (to the data values).

2. The histogram of simulated values reproduces closely the declustered sample histogram.

3. The covariance model $C(\mathbf{h})$ or, better, the set of indicator covariance models $C_I(\mathbf{h}; z_k)$ for various thresholds $z_k$ are reproduced.

As shown subsequently, more complex features, such as spatial correlation with a secondary attribute or multiple-point statistics, may also be reproduced.

Figure 8.1 (top graphs) shows the maps of estimated and simulated Cd values over the study area. In both cases, the 259 Cd data values are honored at their locations. Unlike simulation, ordinary kriging yields a smooth map of estimated values:
[Figure 8.3 panels: output distribution of costs (mean 4370) with the cost of the kriged map marked; realization # 41 (min. cost); realization # 10 (max. cost).]

Utilizing expression (7.72), one can model the cost associated with declaring safe a location $\mathbf{u}'_j$ as

$$L_1(z^{(l)}(\mathbf{u}'_j)) = \begin{cases} 0 & \text{if } z^{(l)}(\mathbf{u}'_j) \le z_c \\ w_1(\mathbf{u}'_j) \cdot [z^{(l)}(\mathbf{u}'_j) - z_c] & \text{otherwise} \end{cases}$$

where $z_c$ is the critical threshold and $w_1(\mathbf{u}'_j)$ is the relative cost of underestimating the toxic concentration at $\mathbf{u}'_j$, e.g., the cost associated with ill health (it is in \$/ppm). The cost $w_1(\mathbf{u}'_j)$ could be modeled as a function of the land use prevailing at $\mathbf{u}'_j$, say, $w_1 = 1$ for forest, $w_1 = 5$ for meadow and pasture, and $w_1 = 10$ for tillage. If the simulated value $z^{(l)}(\mathbf{u}'_j)$ does not exceed $z_c$, the classification is correct and there is no cost. Conversely, if the simulated location is contaminated, $z^{(l)}(\mathbf{u}'_j) > z_c$, the misclassification cost is proportional to the contamination $[z^{(l)}(\mathbf{u}'_j) - z_c]$.
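The cost model above can be sketched in Python and applied to a set of realizations. The land-use weights are those quoted in the text; summing the location costs into one cost per realization is an assumption made for this illustration, and the tiny realizations and all names are hypothetical.

```python
# Relative cost w1 by land use, as quoted in the text.
LAND_USE_COST = {"forest": 1.0, "meadow": 5.0, "pasture": 5.0, "tillage": 10.0}

def location_cost(z_sim, land_use, z_c=0.8):
    """Cost of declaring a location safe given its simulated value (ppm)."""
    if z_sim <= z_c:
        return 0.0                      # classification is correct
    return LAND_USE_COST[land_use] * (z_sim - z_c)

def realization_cost(sim_values, land_uses, z_c=0.8):
    """Total cost of one realization (assumed here: sum over all locations)."""
    return sum(location_cost(z, lu, z_c) for z, lu in zip(sim_values, land_uses))

# Three hypothetical realizations over three locations.
land_uses = ["forest", "tillage", "meadow"]
realizations = [
    [0.5, 1.8, 1.0],
    [0.9, 0.6, 0.7],
    [1.2, 2.4, 1.5],
]
costs = [realization_cost(r, land_uses) for r in realizations]
# The spread between the smallest and largest cost measures response uncertainty.
print(min(costs), max(costs))
```

Applying `realization_cost` to a single smooth map would return only one number, with no indication of how much the cost could vary across equally plausible maps.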
,,
That operation can be repeated for many, say, $L = 100$, realizations that all honor Cd data and reproduce the sample histogram and semivariogram model. The histogram of the 100 costs $C^{(l)}$, $l = 1, \ldots, L$, provides an assessment of the risk involved with declaring the area safe (Figure 8.3, top graph). The worst scenario corresponds to realization #10, which shows an important contamination of agricultural land that leads to the maximum cost $C_{max} = 5187$. The minimum cost associated with this simulation model is obtained for realization #41, $C_{min} = 3517$. These two extreme realizations are shown at the bottom of Figure 8.3.

The same cost function was applied to the kriged map shown at the top of Figure 8.1 (left graph), yielding a cost $C = 3854$ denoted by a vertical arrow on the histogram of Figure 8.3. In addition to providing no measure of response uncertainty, the use of the smooth kriged map would most likely have led to an underestimation of the cost associated with declaring the study area safe.

Figure 8.3: The distribution of costs resulting from a wrong decision to declare the study area safe with respect to Cd (top graph). This distribution is obtained by post-processing 100 realizations of the spatial distribution of Cd values; the two realizations yielding the smallest and the largest costs are shown at the bottom of the figure.

In section 7.4.4, the quantile algorithm for generating a set of $L$ realizations $z^{(l)}(\mathbf{u})$, $l = 1, \ldots, L$, at any specific location $\mathbf{u}$ was introduced. This was done by sampling the one-point ccdf that models uncertainty at that location:

$$
F(\mathbf{u}; z|(n)) = \text{Prob}\{Z(\mathbf{u}) \le z \,|\, (n)\}
$$

Similarly, a set of simulated maps $\{z^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l = 1, \ldots, L$, can be generated by sampling the $N$-variate or $N$-point ccdf that models the joint uncertainty at the $N$ locations $\mathbf{u}'_j$:

$$
F(\mathbf{u}'_1, \ldots, \mathbf{u}'_N; z_1, \ldots, z_N|(n)) = \text{Prob}\{Z(\mathbf{u}'_1) \le z_1, \ldots, Z(\mathbf{u}'_N) \le z_N \,|\, (n)\} \qquad (8.3)
$$

Inference of the conditional cdf (8.3) requires knowledge or stringent hypotheses about the spatial law of the RF $Z(\mathbf{u})$. The multivariate Gaussian RF model is one model whose spatial law is fully determined by the sole z-covariance function; it underlies several simulation algorithms, such as the LU decomposition algorithm introduced in section 8.5. In the following presentation, the focus is on three classes of simulation algorithms that are not limited to the Gaussian formalism:
CHAPTER 8. ASSESSMENT OF SPATIAL UNCERTAINTY
8.2 THE SEQUENTIAL SIMULATION PARADIGM

1. The sequential simulation algorithm requires the determination of a conditional cdf at each location being simulated. Two major classes of sequential simulation algorithms can be distinguished, depending on whether the series of conditional cdfs are determined using the multiGaussian or the indicator formalisms introduced in the previous chapter.

2. Sequential simulation ensures that data are honored at their locations (conditional realizations). Indeed, at any datum location $\mathbf{u}_\alpha$, the simulated value is drawn from a zero-variance, unit-step conditional cdf with mean equal to the z-datum $z(\mathbf{u}_\alpha)$ itself. If large measurement errors render questionable the exact matching of data values, one should allow the simulated values to deviate somewhat from data at their locations. If the errors are normally distributed, the simulated value could be drawn from a Gaussian ccdf centered on the datum value and with a variance equal to the error variance.

3. The sequential principle can be extended to simulate several continuous or categorical attributes (see subsequent sections).

As in many applied fields, successful application of a basic principle relies on experience and a few critical implementation tips; see Deutsch and Journel, 1992a, p. 30-34 and 124-125; Gómez-Hernández and Cassiraga (1994).

Search strategies
The sequential simulation algorithm requires the determination of $N$ successive ccdfs $F(\mathbf{u}'_1; z|(n)), \ldots, F(\mathbf{u}'_N; z|(n + N - 1))$, with an increasing level of conditioning information. Correspondingly, the size of the kriging system(s) to be solved to determine these ccdfs increases and quickly becomes prohibitive as the simulation progresses. As shown in section 5.8.2, the data closest to the location being estimated tend to screen the influence of more distant data. Thus, in the practice of sequential simulation, only the original data and those previously simulated values closest to the location $\mathbf{u}'$ being simulated are retained. Good practice consists of using the semivariogram distance $\gamma(\mathbf{u}' - \mathbf{u}_\alpha)$ rather than the Euclidean distance $|\mathbf{u}' - \mathbf{u}_\alpha|$ so that conditioning data are preferentially selected along the direction of maximum continuity.

As the simulation progresses, the original data tend to be overwhelmed by the large number of previously simulated values, particularly when the simulation grid is dense. A balance between the two types of conditioning information can be preserved by separately searching the original data and the previously simulated values (two-part search): at each location $\mathbf{u}'$, a fixed number $n(\mathbf{u}')$ of closest original data are retained no matter how many previously simulated values are in the neighborhood of $\mathbf{u}'$.

Visiting sequence
In theory, the $N$ nodes can be simulated in any sequence as long as all data and all previously simulated values are used in the determination of ccdfs. However, because only neighboring data are retained, artificial continuity may be generated along a deterministic path visiting the $N$ nodes. Hence a random sequence or path is recommended (Isaaks, 1990). An exception to this practice is the multiple-grid simulation procedure described subsequently.

When generating several realizations, the computational time can be reduced considerably by keeping the same random path for all realizations. Indeed, the $N$ kriging systems, one for each node $\mathbf{u}'_j$, need be solved only once since the $N$ conditioning data configurations remain the same from one realization to another. The trade-off cost is the risk of generating realizations that are too similar. Therefore, it is better to use a different random path for each realization.

Multiple-grid simulation
The use of a search neighborhood limits reproduction of the input covariance model to the radius of that neighborhood. Another obstacle to reproduction of long-range structures is the screening of distant data by too many data closer to the location being simulated. The multiple-grid concept (Gómez-Hernández, 1991; Tran, 1994) allows one to reproduce long-range correlation structures without having to consider large search neighborhoods with too many conditioning data. For example, a two-step simulation of a square grid 500 x 500 could proceed as follows:

1. The attribute values are first simulated on a coarse grid (e.g., 25 x 25) using a large search neighborhood so as to reproduce long-range correlation structures. Because the grid is coarse, each neighborhood contains few data, which reduces the screening effect.

2. Once the coarse grid has been completed, the simulation continues on the finer grid 500 x 500 using a smaller search neighborhood so as to reproduce short-range correlation structures. The previously simulated values on the coarse grid are used as data for the simulation on the fine grid.

A random path is followed within each grid.
The procedure can be generalized to any number of intermediate grids; this number depends on the number of structures with different ranges to be reproduced and the final grid spacing.
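As an illustration of the visiting sequence for a two-step multiple-grid simulation, the sketch below builds a random path over the coarse nodes followed by a random path over the remaining fine nodes. The function name `multigrid_path`, its arguments, and the regular coarse-grid spacing are assumptions chosen for the example; the search neighborhoods and the kriging itself are not shown.

```python
import random

def multigrid_path(n, coarse_step, seed=None):
    """Visiting order for a two-step multiple-grid simulation of an n x n grid.

    Returns (coarse, fine): the coarse-grid nodes in random order, then the
    remaining fine-grid nodes in random order. Nodes are (row, column) pairs.
    """
    rng = random.Random(seed)
    # Coarse grid: every coarse_step-th node in each direction.
    coarse = [(i, j)
              for i in range(0, n, coarse_step)
              for j in range(0, n, coarse_step)]
    coarse_set = set(coarse)
    # Fine grid: all nodes not already simulated on the coarse grid.
    fine = [(i, j)
            for i in range(n)
            for j in range(n)
            if (i, j) not in coarse_set]
    rng.shuffle(coarse)  # random path within the coarse grid
    rng.shuffle(fine)    # random path within the fine grid
    return coarse, fine

# A 500 x 500 grid with a 25 x 25 coarse grid corresponds to coarse_step = 20.
coarse, fine = multigrid_path(20, 5, seed=1)
```

Passing a different `seed` for each realization gives a different random path per realization, in line with the recommendation above.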

8.3 Sequential Gaussian Simulation


Implementation of the sequential principle under the multiGaussian RF model is referred to as sequential Gaussian simulation (sGs). Algorithms for simulating a single attribute using only values of that attribute, then accounting for secondary information, are first introduced. The joint simulation of several correlated attributes is then addressed.

8.3.1 Accounting for a single attribute

Consider the simulation of the continuous attribute $z$ at $N$ nodes $\mathbf{u}'_j$ of a grid (not necessarily regular) conditional to the data set $\{z(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$. Sequential Gaussian simulation proceeds as follows:

1. The first step is to check the appropriateness of the multiGaussian RF model, which calls for a prior transform of z-data into y-data with a standard normal cdf using the normal score transform (7.8). Normality of the two-point distribution of the resulting normal score variable $Y(\mathbf{u}) = \phi(Z(\mathbf{u}))$ is then checked (recall section 7.2.3). If the biGaussian assumption is invalidated, other procedures for determination of the local ccdfs must be considered, for example, the indicator-based sequential simulation algorithms presented subsequently.

2. Sequential simulation is then performed on the y-data:
   • Define a random path visiting each node of the grid only once.
   • At each node $\mathbf{u}'$, determine the parameters (mean and variance) of the Gaussian ccdf $G(\mathbf{u}'; y|(n))$ using SK with the normal score semivariogram model $\gamma_Y(\mathbf{h})$. The conditioning information $(n)$ consists of a specified number $n(\mathbf{u}')$ of both normal score data $y(\mathbf{u}_\alpha)$ and values $y^{(l)}(\mathbf{u}'_j)$ simulated at previously visited grid nodes.
   • Draw a simulated value $y^{(l)}(\mathbf{u}')$ from that ccdf, and add it to the data set.
   • Proceed to the next node along the random path, and repeat the two previous steps.
   • Loop until all $N$ nodes are simulated.

3. The final step consists of back-transforming the simulated normal scores $\{y^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$ into simulated values for the original variable, which amounts to applying the inverse of the normal score transform (7.8) to the simulated y-values:

$$
z^{(l)}(\mathbf{u}'_j) = \phi^{-1}(y^{(l)}(\mathbf{u}'_j)), \quad j = 1, \ldots, N \qquad (8.7)
$$

with $\phi^{-1}(\cdot) = F^{-1}(G(\cdot))$, where $F^{-1}(\cdot)$ is the inverse cdf or quantile function of the variable $Z$, and $G(\cdot)$ is the standard Gaussian cdf. That back-transform allows one to identify the original z-histogram $F(z)$. Indeed, from the definition (8.7) of the back-transform,

$$
\text{Prob}\{Z^{(l)}(\mathbf{u}) \le z\} = \text{Prob}\{\phi^{-1}(Y^{(l)}(\mathbf{u})) \le z\}
$$

and, since the transform function $\phi(\cdot)$ is monotonic increasing,

$$
\text{Prob}\{Z^{(l)}(\mathbf{u}) \le z\} = \text{Prob}\{Y^{(l)}(\mathbf{u}) \le \phi(z)\} = G(\phi(z)) = F(z)
$$

Other realizations $\{z^{(l')}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l' \ne l$, are obtained by repeating steps 2 and 3 with a different random path.

As mentioned in section 7.2.4, non-stationary behaviors could be accounted for using algorithms other than simple kriging to estimate the mean of the Gaussian ccdf: ordinary kriging or kriging with a trend model. However, Gaussian theory requires that the simple (co)kriging variance of normal scores be used for the variance of the Gaussian ccdf (see Journel, 1980). Consider, for example, that the mean and variance of the Gaussian ccdf at $\mathbf{u}$ are estimated using kriging with a trend model. The simulation model is

$$
Y^{(l)}(\mathbf{u}) = Y^*_{KT}(\mathbf{u}) + R(\mathbf{u}) \qquad (8.8)
$$

where the error component $R(\mathbf{u})$ is independent of $Y^*_{KT}(\mathbf{u})$, $E\{R(\mathbf{u})\} = 0$ and $\text{Var}\{R(\mathbf{u})\} = \sigma^2_{KT}(\mathbf{u})$. Recall the KT model $Y(\mathbf{u}) = m(\mathbf{u}) + R(\mathbf{u})$, where the trend component $m(\mathbf{u})$ is modeled as a linear combination of $(K + 1)$ functions of the coordinates. Accounting for the KT system (5.26), one can show that the simulation model (8.8) does not reproduce the residual covariance $C_R(\mathbf{h})$ unless $K = 0$ (simple kriging case):

[...]

Consider now the situation where the unknown trend $m(\mathbf{u})$ is identified with its KT estimate $m^*_{KT}(\mathbf{u})$. Since $m^*_{KT}(\mathbf{u})$ is known everywhere, simulation of the normal score variable $Y$ amounts to simulating the residual $R(\mathbf{u})$ using simple kriging and the residual covariance $C_R(\mathbf{h})$. The simulation model (8.8) becomes

[...]
• $E\{R(\mathbf{u})\} = 0$, thus $E\{Y^{(l)}(\mathbf{u})\} = Y^*_{KT}(\mathbf{u})$
• $\text{Var}\{R(\mathbf{u})\} = \sigma^2_{SK}(\mathbf{u}) = C_R(0) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{SK}_\alpha(\mathbf{u})\, C_R(\mathbf{u}_\alpha - \mathbf{u})$

where the SK weights $\lambda^{SK}_\alpha(\mathbf{u})$ are provided by an SK system of type (5.10). Accounting for the relation $m^*_{KT}(\mathbf{u}) + R^*_{SK}(\mathbf{u}) = Y^*_{KT}(\mathbf{u})$, one can write the simulation model (8.8) as

[...]

as in the SK system (5.9).

Determining a kriging estimate and a different kriging variance would require solving two systems at each location $\mathbf{u}$, hence it is not practical. The solution adopted by program sgsim in GSLIB 2.0 is to allow the user to input a constant variance correction factor multiplying all kriging variances (of normal scores). This factor is to be determined by trial and error to ensure that the variance of all simulated normal scores for any particular realization is indeed 1 as required by the theory.
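The three sGs steps of this section (normal score transform, sequential simulation with simple kriging along a random path, back-transform) can be sketched for a one-dimensional transect as follows. This is an illustrative sketch under several assumptions that are not in the text: a unit-sill exponential covariance with practical range `a` stands in for the fitted normal score model, neighbors are selected by Euclidean distance, the back-transform interpolates linearly in the data correspondence table and clamps values beyond the data extremes, and all function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def normal_scores(z):
    """Rank-based normal score transform of the data (ties broken by order)."""
    z = np.asarray(z, dtype=float)
    ranks = np.argsort(np.argsort(z))        # ranks 0 .. n-1
    p = (ranks + 0.5) / len(z)               # cumulative probabilities in (0, 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(pi) for pi in p])

def back_transform(y, z_data, y_data):
    """Quantile matching through the (y, z) correspondence table of the data."""
    order = np.argsort(y_data)
    return np.interp(y, np.asarray(y_data)[order],
                     np.asarray(z_data, dtype=float)[order])

def exp_cov(h, a):
    """Assumed unit-sill exponential covariance with practical range a."""
    return np.exp(-3.0 * np.abs(h) / a)

def sgs_1d(x_data, z_data, x_grid, a=1.0, n_near=5, seed=None):
    rng = np.random.default_rng(seed)
    x_data = np.asarray(x_data, dtype=float)
    x_grid = np.asarray(x_grid, dtype=float)
    y_data = normal_scores(z_data)           # step 1: normal score transform
    xs, ys = [], []                          # previously simulated nodes
    y_sim = np.empty(len(x_grid))
    for j in rng.permutation(len(x_grid)):   # step 2: random path
        u = x_grid[j]
        hit = np.flatnonzero(np.isclose(x_data, u))
        if hit.size:                         # datum location: honor the datum
            y_sim[j] = y_data[hit[0]]
            continue
        # Two-part search: closest data, then closest simulated values.
        k = np.argsort(np.abs(x_data - u))[:n_near]
        xc, yc = x_data[k], y_data[k]
        if xs:
            xa, ya = np.array(xs), np.array(ys)
            k2 = np.argsort(np.abs(xa - u))[:n_near]
            xc = np.concatenate([xc, xa[k2]])
            yc = np.concatenate([yc, ya[k2]])
        C = exp_cov(xc[:, None] - xc[None, :], a)   # data-to-data covariances
        c0 = exp_cov(xc - u, a)                     # data-to-node covariances
        lam = np.linalg.solve(C, c0)                # SK weights
        mean = lam @ yc                             # SK estimate (zero mean)
        var = max(1.0 - lam @ c0, 0.0)              # SK variance (unit sill)
        y_sim[j] = rng.normal(mean, np.sqrt(var))   # draw from Gaussian ccdf
        xs.append(u)
        ys.append(y_sim[j])
    return back_transform(y_sim, z_data, y_data)    # step 3: back-transform
```

Calling `sgs_1d` again with a different `seed` follows a different random path and yields another realization. Simulated normal scores beyond the range of the data scores are clamped to the smallest or largest datum here; a tail extrapolation model would be needed to go beyond the data extremes.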

Note
Actually, reproduction of the covariance model $C_Y(\mathbf{h})$ does not require the successive ccdf models to be Gaussian; they can be of any type as long as their means and variances are determined by simple kriging (Journel, 1994a). This result leads to an important theoretical extension of the sequential simulation paradigm whereby original z-attribute values are simulated directly without any prior normal score transform. The algorithm is then called direct sequential simulation (dssim), see Xu and Journel (1994), Xu (1995b). In the absence of a normal score transform and back-transform, there is, however, no control on the histogram of simulated values. Reproduction of a target histogram can be achieved by post-processing the dssim realization using the algorithms introduced in section 8.9.1.

Consider the conditional simulation of Cd concentrations along the NE-SW transect. Figure 8.4 shows the main steps of the sGs approach:

1. The ten Cd data are first transformed into ten normal score data using the transformation Table 7.1 on page 269 (Figure 8.4, top graphs).

2. Sequential simulation is then performed in the normal space using the semivariogram model of Cd normal scores shown at the top of Figure 7.6 (page 274). At each simulated node, the five closest Cd normal score data and the five closest previously simulated values are retained in the SK system (two-part search). Figure 8.4 (third row) shows the profile of simulated normal scores.

3. Last, the simulated normal scores are back-transformed into simulated Cd values using expression (8.7) (Figure 8.4, bottom graph).

Figure 8.4: Sequential Gaussian simulation. The ten original Cd data are first transformed into ten normal score data, then conditional simulation is performed in the normal space (third row). Finally, the simulated normal scores are back-transformed into simulated Cd values.
[Figure 8.4 panels: Cd data; normal score data; simulated normal scores; simulated Cd values. Horizontal axis: distance (km).]

[...] as a correspondence table between equal p-quantiles of the standard Gaussian cdf $G(\cdot)$ and the sample z-cdf $F^*(\cdot)$. More precisely, a simulated value $y^{(l)}(\mathbf{u}'_j)$ and its back-transform $z^{(l)}(\mathbf{u}'_j) = \phi^{-1}(y^{(l)}(\mathbf{u}'_j))$ correspond to the same cumulative probability $p_j$:

$$
G[y^{(l)}(\mathbf{u}'_j)] = F^*[z^{(l)}(\mathbf{u}'_j)] = p_j
$$

which amounts to identifying $z^{(l)}(\mathbf{u}'_j)$ with the $p_j$-quantile of the sample z-cdf:

$$
z^{(l)}(\mathbf{u}'_j) = F^{*-1}(p_j) \quad \text{with } p_j \in [0, 1]
$$

Using the transformation Table 7.1, which is displayed at the top of Figure 8.5, the ten normal score data $y(\mathbf{u}_\alpha)$ depicted by the large black dots are readily back-transformed into the ten original values $z(\mathbf{u}_\alpha)$. For example, the back-transform of the normal score at datum location $\mathbf{u}_{10}$ (4th largest Cd datum) yields the original Cd concentration $z(\mathbf{u}_{10}) = 1.49$ ppm (Figure 8.5, left bottom graph).
The back-transform of simulated normal scores $y^{(l)}(\mathbf{u}'_j) \ne y(\mathbf{u}_\alpha)$, say, the largest normal score $y_{max} = 2.58$ depicted by an open circle in Figure 8.5 (right bottom graph), is written

[...]

Figure 8.5: Graphical transform of ten Cd data into ten normal scores (top graph). Using this correspondence in reverse, the 106 simulated normal scores shown in Figure 8.4 (third row) are back-transformed into simulated Cd values. The ten original Cd data depicted by the large black dots are retrieved exactly.
[Figure 8.5 axes: Cd concentration (ppm); normal score.]

8.3.2 Accounting for secondary information
[...] where $F_i(\cdot)$ is the marginal cdf of the variable $Z_i$.

The next step is to check whether the auto and joint two-point distributions of the normal score RFs $\{Y_1(\mathbf{u}), \ldots, Y_{N_v}(\mathbf{u})\}$ are reasonably normal. In practice, the biGaussian assumption is checked only for each variable separately using the procedure described in section 7.2.3.

If the two-point distribution of each normal score variable appears reasonably normal, a multivariate multiple-point Gaussian RF model is then adopted. Under that model, the ccdf parameters (mean and variance) of the normal score variable $Y_1(\mathbf{u})$ are equal to the simple cokriging estimate $y^{(1)*}_{SCK}(\mathbf{u})$ and corresponding simple cokriging variance $[\sigma^{(1)}_{SCK}(\mathbf{u})]^2$ obtained from all normal score data $y_i(\mathbf{u}_\alpha)$. The ccdf is then modeled as

[...]

where the simple cokriging (SCK) weights are provided by an SCK system of type (6.16).

The sequential Gaussian simulation of the primary variable $Z_1$ proceeds as follows:

1. Define a random path visiting each node of the grid only once.

2. At each node $\mathbf{u}'$, determine the parameters (mean and variance) of the Gaussian ccdf (8.9) using simple cokriging with the direct and cross semivariogram models of normal score variables. The conditioning information consists of neighboring primary and secondary normal score data and previously simulated values of $Y_1$.

3. Draw a simulated value $y_1^{(l)}(\mathbf{u}')$ from that ccdf, and add it to the data set.

4. Proceed to the next node along the random path, and repeat the two previous steps.

5. Loop until all $N$ nodes are simulated.

6. Back-transform the simulated normal scores $\{y_1^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$ into simulated values for the primary variable $\{z_1^{(l)}(\mathbf{u}'_j) = \phi_1^{-1}(y_1^{(l)}(\mathbf{u}'_j)), j = 1, \ldots, N\}$.

Other realizations $\{z_1^{(l')}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l' \ne l$, are obtained by repeating the entire process with a different random path for each realization.

Colocated cokriging and the Markov model
The multivariate sGs algorithm is demanding in the sense that a full cokriging system must be solved at each grid node being simulated. Implementation of the algorithm is alleviated by using colocated cokriging and the Markov-type model introduced in section 6.2.6.

To reduce the computational time and avoid instability problems caused by possibly highly redundant secondary information, only the secondary data colocated with the node being simulated could be retained in the simple cokriging system. The trade-off costs of such an approach are the following:

• There is no control on reproduction of the correlation between primary and secondary variables at lags $|\mathbf{h}| \ne 0$.
• The secondary variables must be available at all simulated grid nodes. If this is not the case and the secondary information is dense, nodal values of the secondary variables could be interpolated or, better, simulated prior to simulation of the primary variable.

Unlike full cokriging, colocated cokriging requires only the inference and modeling of the primary covariance function $C_{Y_1Y_1}(\mathbf{h})$ and the cross covariance functions $C_{Y_1Y_i}(\mathbf{h})$, $i = 2, \ldots, N_v$, defined as

$$
C_{Y_1Y_1}(\mathbf{h}) = E\{Y_1(\mathbf{u}) \cdot Y_1(\mathbf{u} + \mathbf{h})\} = \rho_{Y_1Y_1}(\mathbf{h})
$$
$$
C_{Y_1Y_i}(\mathbf{h}) = E\{Y_1(\mathbf{u}) \cdot Y_i(\mathbf{u} + \mathbf{h})\} = \rho_{Y_1Y_i}(\mathbf{h}), \quad i = 2, \ldots, N_v
$$

since $E\{Y_i(\mathbf{u})\} = 0$ and $\text{Var}\{Y_i(\mathbf{u})\} = 1$, $\forall i$, by definition of the normal score transform. The secondary autocovariance functions $C_{Y_iY_i}(\mathbf{h})$, $i \ge 2$, are required only at lag zero since a single datum of each secondary variable is retained in the cokriging system. The modeling effort implied by colocated cokriging can be further alleviated by using the Markov-type model (6.47):

$$
\rho_{Y_1Y_i}(\mathbf{h}) = \rho_{Y_1Y_i}(0) \cdot \rho_{Y_1Y_1}(\mathbf{h})
$$

or, in terms of semivariogram models,

$$
\gamma_{Y_1Y_i}(\mathbf{h}) = \rho_{Y_1Y_i}(0) \cdot \gamma_{Y_1Y_1}(\mathbf{h}) \qquad (8.10)
$$

Each cross semivariogram model $\gamma_{Y_1Y_i}(\mathbf{h})$ is then derived as a mere linear rescaling of the primary semivariogram model $\gamma_{Y_1Y_1}(\mathbf{h})$ by the correlation coefficient $\rho_{Y_1Y_i}(0)$. One should check whether the proportionality relation (8.10) actually applies to the experimental cross semivariograms.
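To make the colocated cokriging system concrete, the sketch below solves it for one node in normal score space, with a single secondary variable and the Markov-type rescaling of the cross covariance. The function name `colocated_sck` and its argument layout are assumptions made for this example; the primary correlations are supplied by the caller from whatever primary model has been fitted.

```python
import numpy as np

def colocated_sck(C11, c10, rho0):
    """Colocated simple cokriging weights under the Markov-type model.

    C11  : (n, n) primary correlations between the n primary conditioning values
    c10  : (n,)   primary correlations between those values and the node
    rho0 : colocated primary-secondary correlation coefficient at lag zero
    Returns (primary weights, secondary weight, cokriging variance).
    """
    C11 = np.asarray(C11, dtype=float)
    c10 = np.asarray(c10, dtype=float)
    n = len(c10)
    A = np.empty((n + 1, n + 1))
    A[:n, :n] = C11
    # Markov-type model: cross covariance = rho0 * primary covariance.
    A[:n, n] = rho0 * c10
    A[n, :n] = rho0 * c10
    A[n, n] = 1.0                       # unit variance of the secondary score
    b = np.append(c10, rho0)            # covariances with the primary at the node
    w = np.linalg.solve(A, b)
    variance = 1.0 - w @ b              # unit variance of the primary score
    return w[:n], w[n], variance

# One uncorrelated primary datum and a colocated secondary datum.
lam, nu, var = colocated_sck([[1.0]], [0.0], rho0=0.6)
```

The mean of the Gaussian ccdf at the node is then the weighted sum of the primary conditioning values (weights `lam`) plus `nu` times the colocated secondary normal score. With `rho0 = 0` the secondary weight vanishes and the system reduces to simple kriging of the primary variable.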
Example
Consider the conditional simulation of Cd concentrations along the NE-SW transect using as exhaustively sampled secondary information the Ni block estimates shown at the bottom of Figure 6.1. The multivariate sGs algorithm proceeds as follows:

1. The ten primary Cd data and the profile of secondary Ni data are first transformed into normal score data (Figure 8.6, top graph).

2. Sequential simulation is then performed in the normal space. At each simulated node, the conditioning information consists of the five closest normal score Cd data, the colocated normal score Ni datum, and the five closest previously simulated normal scores. The colocated simple cokriging system is solved using the direct and cross semivariogram models inferred from all 259 normal scores available over the study area (Figure 8.6, second row). In this example, the Markov-type approximation (8.10) was not used because the experimental cross semivariogram between Cd and Ni normal scores is not proportional to the semivariogram of Cd normal scores.

3. Finally, the simulated normal scores are back-transformed into simulated Cd values (Figure 8.6, bottom graphs).

[Figure 8.6 panels: normal score data (primary and secondary); Cd normal scores, Ni normal scores, and Cd-Ni normal scores semivariograms; simulated normal scores; simulated Cd values. Horizontal axes: distance (km).]

Kriging with an external drift (KED)
An alternative to colocated cokriging for incorporating exhaustively sampled secondary information is provided by kriging with an external drift. The key assumption here is that, in the normal space, the trend of the primary variable $Y_1$ is a linear function of the secondary variable $Y_2$, which therefore must vary smoothly in space:

[...]

As mentioned in section 6.1.3, it is critical to validate this assumption either from calibration data or by using some physical rationale.
Under the trend model (8.11), the mean of the Gaussian ccdf at any grid node $\mathbf{u}$ is identified with the KED estimate:

[...]

where the weights of normal score data and previously simulated values, $\lambda^{KED}_\alpha(\mathbf{u})$ and $\lambda^{KED}_j(\mathbf{u})$, are solutions of a KED system of type (6.5). As in kriging with a trend, the conditional variance must be identified with the simple kriging variance, not the KED variance.

8.3.3 Joint simulation of multiple variables

[...]

• Add the vector of simulated values to the conditioning data set.
• Proceed to the next node along the random path, and repeat the four previous steps.
• Loop until all $N$ nodes are simulated.
• Back-transform the realizations $\{y_i^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $i = 1, \ldots, N_v$, into simulated values for the original variables $Z_i$.

Repeat the procedure with another random path to generate another set of joint realizations.

[...]

1. Define a hierarchy of variables, starting with the most important or better autocorrelated variable $Z_1$.

2. Transform all variables $Z_i$ into their normal scores $Y_i$.

[...]

   • Use simple kriging to determine the parameters of the Gaussian ccdf of the first variable $Y_1(\mathbf{u}')$. The conditioning information consists of neighboring normal score data $y_1(\mathbf{u}_\alpha)$ and previously simulated values $y_1^{(l)}(\mathbf{u}'_j)$ of the first variable. Then, draw a simulated value $y_1^{(l)}(\mathbf{u}')$ from that ccdf, and add it to the conditioning data set.

   • [...] previously simulated values $y_{N_v}^{(l)}(\mathbf{u}'_j)$ of that variable, plus all previously simulated colocated values $y_1^{(l)}(\mathbf{u}'), \ldots, y_{N_v-1}^{(l)}(\mathbf{u}')$. Then, draw a simulated value $y_{N_v}^{(l)}(\mathbf{u}')$ from that ccdf, and add it to the conditioning data set.

5. Loop until all $N$ nodes are simulated.
6. Back-transform the $N_v$ realizations into simulated values for the original variables.

Unlike the vectorial simulation, this procedure does not require modeling the full matrix of cross covariance functions $\mathbf{C}(\mathbf{h}) = [C_{Y_iY_j}(\mathbf{h})]$. The trade-off cost is that the cross covariance functions may be poorly reproduced at lags $|\mathbf{h}| \ne 0$. As already mentioned in section 8.3.2, a further approximation consists of adopting a Markov-type model to alleviate the modeling of the matrix of cross covariance functions.

An alternative to the direct simulation of the vector $[Z_1(\mathbf{u}), \ldots, Z_{N_v}(\mathbf{u})]$ is to simulate separately a set of independent factors $[X_1(\mathbf{u}), \ldots, X_{N_v}(\mathbf{u})]$ from which the original Z-variables can be reconstituted (Luster, 1985). The simulation proceeds in three steps:

1. The $N_v$ original variables $Z_i(\mathbf{u})$ are first decomposed into $N_v$ orthogonal factors $X_i(\mathbf{u})$, e.g., the $N_v$ principal components of the Z-correlation matrix at $|\mathbf{h}| = 0$, $\mathbf{R}(0)$ (recall section 6.2.5 on principal component kriging). The critical assumption is that the orthogonality of the principal components at lag zero extends to all other separation vectors.

2. [...]

3. At each grid node $\mathbf{u}'_j$, the simulated value of the variable $Z_i$ is then retrieved as a linear combination of the $N_v$ independently simulated x-values at that location:

[...]

The method is fast because no cokriging system must be solved. Thus there is no need to infer and jointly model the $N_v(N_v - 1)/2$ cross semivariograms.
The method has the following drawbacks:

• Only those data locations where all $N_v$ attributes $z_i$ are simultaneously measured can be considered.
• There is no control on reproduction of the spatial cross correlations between variables at lag $|\mathbf{h}| \ne 0$.

8.4 Sequential Indicator Simulation

Sequential Gaussian simulation is very fast and straightforward because the modeling of the Gaussian ccdf at each location $\mathbf{u}'$ requires the solution of only a single (co)kriging system at that location. An implicit assumption is that the spatial variability of the attribute values can be fully characterized by a single covariance function. In particular, this precludes modeling patterns of spatial continuity specific to different classes of values. Possibly more critical, the maximum entropy property of the multiGaussian RF model (recall section 7.2.3) does not allow for any significant correlation of extreme values (Journel and Alabert, 1988; Journel and Deutsch, 1993) and, for a given covariance, maximizes their scattering in space (destructuration effect). For example, the sGs realization shown at the top of Figure 8.7 (left graph) reproduces the Cd semivariogram fairly well, but the spatial continuity of small Cd values is underestimated: the experimental indicator semivariogram at the second decile has a nugget effect that is too large; see dots on Figure 8.7 (second row).

In the earth sciences, connected strings of large or small values are common and are critical for many applications. Consider the problem of assessing ground-water travel times from a nuclear repository to the surface. Sequential Gaussian simulation generates realizations that minimize connectivity of high permeability values, leading to a possible underestimation of the risks of leakage. More conservative results, that is, shorter travel times, are provided by non-Gaussian simulation algorithms that allow for continuity of large extreme values yet reproduce the same across-all-classes semivariogram (Gómez-Hernández and Wen, 1994). Similarly, the spatial distribution of ore grades in mineral deposits often displays rich zones oriented preferentially within a more homogeneous isotropic background. In such cases, modeling the specific continuity of high grades using a non-Gaussian technique would yield a better prediction of the recovered grade (Bourgault and Journel, 1994).

The multiGaussian RF model is inappropriate whenever the structural analysis or qualitative information indicates that extreme values could be better correlated in space than medium values. Even in the absence of information about connectivity of extreme values, the user must be aware that the analytical simplicity of sequential Gaussian simulation is balanced by the risk of understating the potential for critical features, such as strings of small or large values.

Sequential indicator simulation (sis) is the most widely used non-Gaussian simulation technique. The indicator formalism introduced in section 7.3 is used to model the sequence of conditional cdfs from which simulated values are drawn (Alabert, 1987b; Journel and Alabert, 1988; Gómez-Hernández and Srivastava, 1990). Unlike sequential Gaussian simulation, the indicator approach allows one to account for class-specific patterns of spatial continuity through different indicator semivariogram models. For example, the realization in Figure 8.7 (third row) shows a better continuity of small Cd values relative to median and large values, as modeled by their standardized indicator semivariograms (bottom graphs). Another advantage of indicator-based simulation techniques is their flexibility in incorporating soft information coded under the format of local prior probabilities.
8.4.1 Accounting for a single attribute
Consider first the simulation of a single continuous attribute $z$ at $N$ grid nodes $\mathbf{u}'_j$ conditional only to the $z$-data $\{z(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$. Sequential indicator simulation proceeds as follows:

• Discretize the range of variation of $z$ into $(K + 1)$ classes using $K$ threshold values $z_k$. Then, transform each datum $z(\mathbf{u}_\alpha)$ into a vector of hard indicator data, defined as

$$i(\mathbf{u}_\alpha; z_k) = \begin{cases} 1 & \text{if } z(\mathbf{u}_\alpha) \le z_k \\ 0 & \text{otherwise} \end{cases} \qquad k = 1, \ldots, K$$

• Define a random path visiting each node of the grid only once.

• At each node $\mathbf{u}'$:

1. Determine the $K$ ccdf values $[F(\mathbf{u}'; z_k|(n))]^*$ using any one of the indicator kriging algorithms introduced in section 7.3.2: simple or ordinary indicator kriging, median indicator kriging, indicator cokriging, or probability kriging. The conditioning information consists of indicator transforms (and uniform transforms for probability kriging) of neighboring original data and previously simulated $z$-values.

2. Correct for any order relation deviations (recall section 7.3.4). Then, build a complete ccdf model $F(\mathbf{u}'; z|(n))$, $\forall z$, using the interpolation/extrapolation algorithms introduced in section 7.3.5.

3. Draw a simulated value $z^{(l)}(\mathbf{u}')$ from that ccdf.

4. Add the simulated value to the conditioning data set.

5. Proceed to the next node along the random path, and repeat steps 1 to 4.

Repeat the entire procedure with a different random path to generate another realization $\{z^{(l')}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l' \neq l$.

Figure 8.7: Gaussian versus indicator sequential simulation algorithms. Indicator-based algorithms allow one to account for class-specific models of spatial continuity (indicator semivariograms); hence the sis realization (third row) better reproduces the spatial continuity of small Cd values as modeled by the indicator semivariogram at the second decile.
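
The indicator coding of the data and the order relation correction can be sketched in a few lines. This is a minimal illustration, not the book's implementation: the function names are hypothetical, the thresholds are the three values used for the Cd transect, and the correction (averaging an upward and a downward monotone pass) is one common choice assumed here, not necessarily the algorithm of section 7.3.4.

```python
import numpy as np

def indicator_transform(z_data, thresholds):
    """Code each datum into K hard indicators: 1 if z <= z_k, else 0."""
    z = np.asarray(z_data, dtype=float)[:, None]
    zk = np.asarray(thresholds, dtype=float)[None, :]
    return (z <= zk).astype(int)

def correct_order_relations(ccdf_values):
    """Force estimated ccdf values into [0, 1] and make them non-decreasing.

    Assumed correction: average of an upward (running maximum) and a
    downward (running minimum from the right) monotone pass.
    """
    f = np.clip(np.asarray(ccdf_values, dtype=float), 0.0, 1.0)
    upward = np.maximum.accumulate(f)
    downward = np.minimum.accumulate(f[::-1])[::-1]
    return 0.5 * (upward + downward)

thresholds = [0.8, 1.38, 2.26]                      # ppm
indicators = indicator_transform([0.5, 1.0, 3.0], thresholds)
corrected = correct_order_relations([0.30, 0.25, 1.05])
```

The kriged values 0.30, 0.25, 1.05 are made-up numbers chosen to show both kinds of deviation: a decreasing pair and a value outside [0, 1].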
396 CHAPTER 8. ASSESSMENT OF SPATIAL UNCERTAINTY

Example

Consider the conditional simulation of Cd concentrations along the NE-SW transect. Figure 8.8 depicts the main steps of the sequential indicator approach, as follows:

1. The ten Cd data are first transformed into ten vectors of indicator data using the three threshold values $z_k$ = 0.8, 1.38, and 2.26 ppm (Figure 8.8, top graphs).

2. For each threshold value, the indicator semivariogram is inferred from all data over the study area (Figure 8.8, third row).

3. Ccdfs are determined using ordinary indicator kriging. At each simulated location, the conditioning information consists of the indicator transforms of the five closest Cd data and of the five closest previously simulated Cd values (two-part search). The resolution of the discrete ccdf model is increased by linear interpolation between tabulated bounds obtained from the ten Cd data. Lower and upper tails are modeled using, respectively, a power model ($\omega$ = 2.5) and a hyperbolic model ($\omega$ = 5).


Reproduction of model statistics

At each grid node, the indicator-based simulation can be viewed as a two-step procedure:

1. A simulated class-value is first assigned to the grid node $\mathbf{u}'$, say, $\mathbf{u}' \in (z_{k-1}, z_k]$.

2. A simulated $z$-value is then drawn from that class $(z_{k-1}, z_k]$ using some within-class distribution model (e.g., a uniform distribution).

Consequently, indicator-based algorithms guarantee (approximate) reproduction of only the $K$ class proportions and corresponding indicator semivariograms $\gamma_I(\mathbf{h}; z_k)$, not reproduction of the cdf and semivariogram of the continuous $z$-values.

The actual approximation of one-point and two-point $z$-statistics by a sequential indicator realization thus depends on several factors, such as the discretization level (number of thresholds), the information accounted for when performing indicator kriging, and the interpolation/extrapolation models used for increasing the resolution of the discrete ccdf.

[Figure 8.8 panels: indicator vectors along the transect; indicator semivariograms for the 1st (0.8 ppm), 2nd (1.38 ppm), and 3rd (2.26 ppm) thresholds; simulated Cd values; x-axis: distance (km)]
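
The two-step view above translates directly into code. The sketch below is illustrative only: the function name, the bounds `z_min` and `z_max` closing the lowest and highest classes, and the ccdf values are assumptions, and a uniform within-class model stands in for the tail models of section 7.3.5.

```python
import numpy as np

def draw_from_discrete_ccdf(ccdf, thresholds, z_min, z_max, rng):
    """Two-step draw: pick a class from the ccdf, then a value inside it.

    ccdf[k] is the probability of not exceeding thresholds[k]; z_min and
    z_max are assumed bounds for the lowest and highest classes.
    """
    bounds = np.concatenate(([z_min], thresholds, [z_max]))
    cum = np.concatenate(([0.0], ccdf, [1.0]))
    class_probs = np.diff(cum)                          # K + 1 class probabilities
    k = rng.choice(len(class_probs), p=class_probs)     # step 1: simulated class
    z = rng.uniform(bounds[k], bounds[k + 1])           # step 2: value within class
    return k, z

rng = np.random.default_rng(0)
thresholds = [0.8, 1.38, 2.26]                          # ppm
ccdf = [0.2, 0.5, 0.9]                                  # hypothetical kriged ccdf
draws = [draw_from_discrete_ccdf(ccdf, thresholds, 0.0, 5.0, rng)
         for _ in range(5000)]
```

Over many draws the class frequencies approach the class probabilities (0.2, 0.3, 0.4, 0.1), but the values inside each class follow the assumed within-class model only, which is why the z-histogram is not guaranteed to be reproduced.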

Figure 8.8: Sequential indicator simulation. After coding the ten original Cd data into indicator vectors, conditional simulation is performed using ordinary indicator kriging and the three indicator semivariograms shown in the third row. The bottom [...]

The normal score back-transform, which is an integral part of the sGs algorithm, generally leads to a better reproduction of the $z$-cdf than do indicator-based techniques. If that reproduction is deemed necessary, the realizations can be post-processed using the algorithms introduced later in section 8.9.1.

Concer~lralion(ppm)

Cd simulated values (option 1) Cd simulated values (option 2) I i n p l e ~ ~ ~ o t f n l lrps


ion
A bet.tcr reproduction of ~nnrgit~al ktrgat, stat.istics call On idiieved by in-
crcasiog t,lte I I I I I I ~ ~ I ? I;,
~ , of t,l~rosltolds, lo thcory, t,he r-seniivariugratn is
reprrrdoccd cnactly if i n d i c i ~ t ~cokrigil~g
r with :LII i ~ ~ f i n i~f c~ i t t n h eofr 1I1r~sIt-
olds is used; if only i ~ d i c a t o rkriging is used, ;~g;rinwith all infinitc rnmrher of
tl~rcstiolrls,it is llie z-rriadogr:m (2.18) that is ~eproduced(Alabert,, 1987b).
As discussed ill s e c t i o ~7.3.5,
~ several factors limit, in practice, thc: numl)er
of tlrrnsliolds used i r ~indicalor kriging. Too many tlrrcsl~oldsdrmlically in-
crease coruputal,iolral titrle, infcreucc and lrtodeling elfort, and t.he risk of order
r c l a t h ~(lcvi;ttiotts. h r i o u s itr~plcu~cntations re(1uc.t: ordcx rd:rf.io~~ problems
Concenlration (ppm) Concentralion (ppm) while providitlg a rr:rson;ll,le discre~iaal,iottof the local ccdk (scc h ~ t ~ s and ch
J o n r ~ ~ e1I)!)Z;r,
l, p. 77 8 0 ; (:lro, 1996).
O r ~ c; ~ p p r o ; ~ronsist~s
d~ of witig LIIC sitme sen~l?'ariogram nrodel a t all
t l 1 r t 4 a l d s (~tttdiiiuiudicator kriging). No ratter how nmtly tltresl~oldsare
r c t a i d , o~tlyrm. ( ~ ~ i n d i aindicator
n) scmivariogr;tnr is rclait~cd;ltence only
a siugle 1K system need be solvtxl a t each grid node. Co~lsequently,all or-
der rclatiou dcvial.ions caused hy lack of data in some classes are eliminated
the study area wing only f,l~ree tlireshold values: zx= 0.6, 1.4, and 2.0 ppn1. (see section 73.1). Orre loses, l~owcvcr,ilre llexihility to model class-specific
l-'igrxe 8.9 sliorvs t,lre t.argct, s;nnplr 11istogm~1~ (Lop graph) and the Iristogranis pat,tcrns of spatial coulinuity.
of sirnulatrd v;ilnes rrisulting from using two rlilfermnt t,ypr,s of interpolation Another possihilil,y is to interpolate the 11;iramcters of many indicator
and < ~ x t , r i ~ ~ ~ < > l ; i t , i o ~ ~ : serniv;triogratns frorn the parameters of a few explicit.ely modeled ones (recall
sccliorl 7.3.4 and Figure 7.27, boltonr graphs). More Lliresl~oliiscould then he
1. 'l'l~cfirst niorlel considers a scries of linear int,erpolatiu~rsrvil.lriu each retait~rxlfor indicator kriging wiflrout ir~crcasi~rg t,lr~ir~f<:rencrand rnodclit~g
of t.he four classes (zi;_],z t ] . ?'he rrstdting d i s t r i b u t i o ~of
~ sinn~lated effort.
values is approxi~rratclyuniform withi11 each class, which is dclirreated Instead of modeling t.lrc wl~olcccdf before s a r n p l i ~ ~it, g another approach
by t.he vcrl.ical dasl~cdlines on the lristogra~nof Figure 8.9 (left hottxm consists of n~odelingonlythat part of the ccdf that is sarrtpled by the random
graph). probability value p (Deutsch and Journel, 1992a, p. 186; Cliu, 1996). At each
grid node i d , the simulat.ion would then proceed as follows:
2. I n Llrc s~rcoodutodcl, a linear interpol;ttion is performed I~etwcertt,abu-
latctl L)ou~~ds ide~~tificdwit11 the 259 (:(I dalki values. 'I'lte resoll.ing c a l f 1. 1)raw L l ~ r : random i~nnrbcrp E [O, I], a cumulative 1)robability value
n ~ o d e llras a lligl~erlevel of discretization, lcadi~rgto a more detailed 2. Determine iteratively the two hounding ccdf values hetween which the
Irist,ograrn of simulated values that is closer l o the target hisbgram probability p lirs, say,
(Figure 8.9, right bol.t.om graph).

This limited modeling of the ccdf accelerates the simulation: the maximum number of kriging systems to be solved at each grid node is $\log_2(K + 1)$, the worst score of the bisection algorithm (Press et al., 1986, p. 117) used for determining the bounding ccdf values, instead of $K$ systems for a "full" ccdf modeling.
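
The bisection search can be sketched as follows. The code is an illustration under stated assumptions: `ccdf_at` is a hypothetical callback standing for the solution of one indicator kriging system at a given threshold, and the ccdf values used in the example are made up.

```python
import math

def bounding_class(p, n_thresholds, ccdf_at):
    """Find k such that ccdf(k-1) < p <= ccdf(k) by bisection.

    ccdf_at(k) returns the ccdf value at threshold k (0-based) and stands
    for one kriging system; k = n_thresholds means that p lies above the
    last threshold. Returns (k, number of ccdf evaluations).
    """
    lo, hi = 0, n_thresholds            # candidate class indices lo..hi
    n_calls = 0
    while lo < hi:
        mid = (lo + hi) // 2
        n_calls += 1
        if p <= ccdf_at(mid):
            hi = mid                    # p falls in class mid or below
        else:
            lo = mid + 1                # p falls above threshold mid
    return lo, n_calls

ccdf = [k / 10 for k in range(1, 10)]   # hypothetical ccdf at K = 9 thresholds
results = {p: bounding_class(p, len(ccdf), lambda k: ccdf[k])
           for p in (0.05, 0.1, 0.33, 0.95)}
max_calls = math.ceil(math.log2(len(ccdf) + 1))
```

With K = 9 thresholds, no search needs more than 4 evaluations, against 9 for a full ccdf model. The sketch assumes the ccdf values are already non-decreasing; order relation deviations between lazily evaluated thresholds would need separate handling.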

8.4.2 Accounting for secondary information

[...]

[Figure 8.10 panels: rock types along the transect (including s2: other rocks); local prior probabilities of exceeding $z_k$ for $z_1$ = 0.80 ppm, $z_2$ = 1.38 ppm, and $z_3$ = 2.26 ppm; simulated Cd values; x-axis: distance (km)]

Example

Consider the simulation of Cd values along the NE-SW transect conditional to the ten Cd data and three profiles of local prior probabilities derived from calibration of two rock categories (Figure 8.10, top two rows). At each grid node, the conditioning information consists of the five closest Cd data, the five closest previously simulated Cd values, and the colocated soft datum. Three thresholds $z_k$ = 0.8, 1.38, and 2.26 ppm are considered. The corresponding hard-soft indicator cross semivariogram models are deduced from the three hard indicator semivariogram models of Figure 7.27 using the Markov-type relation (7.50) and the calibration parameters of Table 7.2 (page 317, third column).

Figure 8.10 (bottom graph) shows the profile of simulated Cd values obtained using the Markov-Bayes algorithm (solid line) and ordinary indicator kriging not accounting for soft information (dashed line). The same random path and series of random numbers are used in both cases, so differences between the two simulated profiles originate only from the additional soft information. Accounting for geology yields smaller simulated values on Argovian rocks, which have smaller probabilities of exceeding the different thresholds.

8.4.3 Joint simulation of multiple variables


The Gaussian-based cosimulation paradigm introduced in section 8.3.3 can be extended to indicators.

The first approach amounts to simulating the $N_v$ different RFs $Z_i(\mathbf{u})$ simultaneously using sequential indicator cosimulation. Practical implementation of that algorithm suffers from two drawbacks:

• The tedious inference and joint modeling of a matrix of $M(M + 1)/2$ direct and cross indicator semivariograms

• The computational cost of solving at each simulated grid node $M$ large and often unstable indicator cokriging systems with $(M \times N)$ equations

[...]

• Transform each primary datum $z_1(\mathbf{u}_\alpha)$ into a vector of hard indicator data of type (7.20), $[i_1(\mathbf{u}_\alpha; z_{11}), \ldots, i_1(\mathbf{u}_\alpha; z_{1K_1})]$. Code the $z_2$-data similarly.

• Define a random path visiting each node of the grid only once.

• At each node $\mathbf{u}'$:

[...]

2. Code the simulated value $z_1^{(l)}(\mathbf{u}')$ into a vector of local prior probabilities of type (7.25) for $Z_2$, $[y_2(\mathbf{u}'; z_{21}), \ldots, y_2(\mathbf{u}'; z_{2K_2})]$.

[...]

5. Proceed to the next node along the random path, and repeat steps 1 to 4.

Only indicator data at the threshold $z_k$ being considered are retained in the kriging system. As discussed in section 7.3.2, it is generally not worth accounting for information at other thresholds when indicator vectors are complete (equally sampled case).

8.5 The LU Decomposition Algorithm

As with sequential Gaussian simulation, simulation through LU decomposition of the covariance matrix capitalizes on the congenial properties of the multiGaussian RF model. LU simulation should be the preferred Gaussian-based algorithm when many small realizations (few nodes) sparsely conditioned are to be generated (Alabert, 1987a; Davis, 1987; Deutsch and Journel, 1992a, p. 143).

Consider the simulation of the continuous attribute $z$ at $N$ grid nodes $\mathbf{u}'_j$ conditional to the data set $\{z(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$ with both $N$ and $n$ small. Like any Gaussian-based technique, the LU decomposition algorithm starts with transforming the $z$-data into normal score $y$-data with a standard normal cdf. Next, the normality of the two-point distribution of these normal score data is checked. Under the assumption that the RF $Y(\mathbf{u})$ is multiGaussian, the simulation proceeds as follows:

1. Build the $(n + N) \times (n + N)$ covariance matrix

$$\mathbf{C} = \begin{bmatrix} \mathbf{C}_{11} & \mathbf{C}_{12} \\ \mathbf{C}_{21} & \mathbf{C}_{22} \end{bmatrix} = \begin{bmatrix} [C_Y(\mathbf{u}_\alpha - \mathbf{u}_\beta)] & [C_Y(\mathbf{u}_\alpha - \mathbf{u}'_j)] \\ [C_Y(\mathbf{u}'_i - \mathbf{u}_\beta)] & [C_Y(\mathbf{u}'_i - \mathbf{u}'_j)] \end{bmatrix}$$

where $C_Y(\mathbf{h})$ is the covariance function of the standard normal RF $Y(\mathbf{u})$, $\mathbf{C}_{11}$ is the $n \times n$ data-to-data covariance matrix, $\mathbf{C}_{22}$ is the $N \times N$ node-to-node covariance matrix, and $\mathbf{C}_{12} = \mathbf{C}_{21}^T$ is the data-to-node covariance matrix.

2. Decompose the matrix $\mathbf{C}$ into the product of a lower and an upper triangular matrix:

$$\mathbf{C} = \mathbf{L} \cdot \mathbf{U} = \begin{bmatrix} \mathbf{L}_{11} & \mathbf{0} \\ \mathbf{L}_{21} & \mathbf{L}_{22} \end{bmatrix} \cdot \begin{bmatrix} \mathbf{U}_{11} & \mathbf{U}_{12} \\ \mathbf{0} & \mathbf{U}_{22} \end{bmatrix}$$

Clearly this step requires the dimension $(n + N)$ to be small (lesser than $10^{[...]}$).

3. Generate a conditional realization $\{y^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$ as the linear combination:

$$\mathbf{y}^{(l)} = \mathbf{L}_{21} \cdot \mathbf{L}_{11}^{-1} \cdot \mathbf{y}_\alpha + \mathbf{L}_{22} \cdot \mathbf{w}^{(l)} \qquad (8.12)$$

where $\mathbf{y}_\alpha$ is the vector of the $n$ conditioning normal score data and $\mathbf{w}^{(l)}$ is a vector of $N$ independent standard normal deviates.

4. Back-transform the simulated normal scores $\{y^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$ into simulated values of the original variable $\{z^{(l)}(\mathbf{u}'_j) = \phi^{-1}(y^{(l)}(\mathbf{u}'_j)), j = 1, \ldots, N\}$.

Other realizations, $\{z^{(l')}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l' \neq l$, are readily obtained by multiplying the matrix $\mathbf{L}_{22}$ in expression (8.12) by other vectors of independent standard normal deviates, $\mathbf{w}^{(l')}$, $l' \neq l$.

The vector $\mathbf{y}^{(l)}$ of simulated values can be viewed as the sum of a first component ($\mathbf{L}_{21} \cdot \mathbf{L}_{11}^{-1} \cdot \mathbf{y}_\alpha$) that accounts for the conditioning data and a residual component ($\mathbf{L}_{22} \cdot \mathbf{w}^{(l)}$) needed for reproducing the covariance model $C_Y(\mathbf{h})$. The first component is but the vector of SK estimates of the variable $Y$ at the $N$ simulated grid nodes.

For each realization, sequential Gaussian simulation calls for solving a series of $N$ small kriging systems that account for only neighboring conditioning data and previously simulated values. In contrast, however many realizations are to be generated, the LU decomposition algorithm requires decomposing only once a single large covariance matrix that accounts for all simulated nodes and data locations simultaneously. In practice, the LU algorithm cannot handle more than a few hundred simulated grid nodes and conditioning data, but it allows one to generate many additional realizations at little additional computational cost.

Application: Determination of a block ccdf

Consider the problem of evaluating the block posterior ccdf $F_V(\mathbf{u}; z|(n))$ that models the uncertainty about an average $z$-value over the block $V(\mathbf{u})$:

$$F_V(\mathbf{u}; z|(n)) = \text{Prob}\{Z_V(\mathbf{u}) \le z \,|\, (n)\} \qquad (8.13)$$

Because of the non-linearity of the indicator transform, the block ccdf cannot be derived simply as a linear combination of point ccdfs:

$$F_V(\mathbf{u}; z|(n)) \neq \frac{1}{J} \sum_{j=1}^{J} F(\mathbf{u}'_j; z|(n))$$

with the point-ccdf $F(\mathbf{u}'_j; z|(n))$ being defined at $J$ points $\mathbf{u}'_j$ discretizing the block $V(\mathbf{u})$ (recall section 7.3.2).

In the absence of block data $z_V(\mathbf{u}_\alpha)$ and corresponding block statistics and block indicator data, the block ccdf (8.13) can be numerically approximated by the cumulative distribution of many simulated block values $z_V^{(l)}(\mathbf{u})$ (Isaaks, 1990; Gómez-Hernández, 1991; Deutsch and Journel, 1992a, p. 90; Glacken, 1996):

$$F_V(\mathbf{u}; z|(n)) \approx \frac{1}{L} \sum_{l=1}^{L} i_V^{(l)}(\mathbf{u}; z)$$

with $i_V^{(l)}(\mathbf{u}; z) = 1$ if $z_V^{(l)}(\mathbf{u}) \le z$, and zero otherwise.

[...]

8.6 The p-field Simulation Algorithm

[...]

1. At each location $\mathbf{u}'_j$ being simulated, build the ccdf model $F(\mathbf{u}'_j; z|(n))$ using any appropriate algorithm, e.g., any one of the multiGaussian or indicator algorithms introduced in Chapter 7.

2. Generate a set of autocorrelated $p$-values, $\{p^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$, called probability field or p-field, that is a realization of the RF $P(\mathbf{u})$ [...]

3. At each location $\mathbf{u}'_j$, use the $p$-value to sample the ccdf: $z^{(l)}(\mathbf{u}'_j) = F^{-1}(\mathbf{u}'_j; p^{(l)}(\mathbf{u}'_j)|(n))$.

Steps 2 and 3 are repeated to generate a different realization $\{z^{(l')}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l' \neq l$.

Reproduction of the $z$-data values is ensured through the ccdfs $F(\mathbf{u}; z|(n))$. Indeed, at any datum location $\mathbf{u}_\alpha$, the ccdf is a unit-step function identifying the datum value $z(\mathbf{u}_\alpha)$. Thus, whatever the simulated p-field value $p^{(l)}(\mathbf{u}_\alpha)$ at that location,

$$z^{(l)}(\mathbf{u}_\alpha) = F^{-1}(\mathbf{u}_\alpha; p^{(l)}(\mathbf{u}_\alpha)|(n)) = z(\mathbf{u}_\alpha)$$

The one-point and two-point statistics of the simulated $z$-values are controlled by the statistics of the p-field values. Journel (1995) proved that, under conditions of ergodicity and on average over a large number $L$ of realizations, the $z$-histogram and the covariance of the $z$-uniform scores are reproduced; that is,

[...]

with $i^{(l)}(\mathbf{u}; z) = 1$ if $z^{(l)}(\mathbf{u}) \le z$, and zero otherwise; $A$ is the simulation area, and $A \cap A_{-\mathbf{h}}$ is the intersection of area $A$ with its translation by vector $-\mathbf{h}$. In words, the one-point and two-point ccdfs of the simulated values $z^{(l)}(\mathbf{u})$ averaged over all possible locations $\mathbf{u} \in A$ approximate the stationary one-point and two-point $z$-cdfs.

[...]

2. A non-conditional p-field realization is then generated in two steps: (1) $y$-values, $\{y^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$, are generated using sequential Gaussian simulation and the semivariogram of uniform transforms shown in Figure 7.19 (page 303, middle graph), (2) the simulated $y$-values are then transformed into uniform $p$-values, $p^{(l)}(\mathbf{u}'_j) = G(y^{(l)}(\mathbf{u}'_j))$, $j = 1, \ldots, N$, where $G(\cdot)$ is the Gaussian cdf. The p-field realization is shown in Figure 8.11 (second row).

3. The simulated $p$-values are used to sample the conditional cdfs. For example, the simulated $z$-values at the two locations shown are, respectively, 0.46 and 1.79 ppm, corresponding to simulated $p$-values 0.20 and 0.55 (Figure 8.11, third row). The resulting simulated profile is shown at the bottom of Figure 8.11.

[Figure 8.11 panels: ccdf values at thresholds $z_1$ = 0.80 ppm and $z_2$ = 1.38 ppm; p-field values; ccdf models at two locations (x-axis: Cd concentration, ppm); simulated Cd values]

Remark

The sequential Gaussian algorithm used in the previous example to generate the p-field does not capitalize on the fact that the p-field is non-conditional. Much faster simulation algorithms exist for the reproduction of any covariance model as long as there is no data-conditioning, for example, simulations using the spectral decomposition theorem (Borgman et al., 1984; Chu and Journel, 1994), or the non-conditional version of the turning bands algorithm (Journel and Huijbregts, 1978, p. 498), or simulations using moving averages (Journel and Huijbregts, 1978, p. 505; Luster, 1985). This book does not cover the vast field of simulation algorithms that cannot be made directly conditional to local data.
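
As an illustration of the p-field idea, the sketch below generates a non-conditional p-field and uses it to sample local ccdfs. Everything specific in it is an assumption made for the example: the p-field is generated by Cholesky (LU) factorization of an exponential covariance matrix with unit sill rather than by sequential Gaussian simulation, the local ccdfs are identical made-up values at every node, and linear interpolation between assumed bounds `z_min` and `z_max` replaces the interpolation and tail models of section 7.3.5.

```python
import numpy as np
from math import erf, sqrt

def gaussian_cdf(y):
    """Standard normal cdf G(y), applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(erf)(y / sqrt(2.0)))

def simulate_p_field(coords, corr_range, rng):
    """Non-conditional p-field on 1D coordinates via Cholesky factorization."""
    h = np.abs(coords[:, None] - coords[None, :])
    cov = np.exp(-3.0 * h / corr_range)              # assumed covariance model
    lower = np.linalg.cholesky(cov + 1e-10 * np.eye(len(coords)))
    y = lower @ rng.standard_normal(len(coords))     # autocorrelated y-values
    return gaussian_cdf(y)                           # uniform p-values, p = G(y)

def sample_ccdfs(p_field, thresholds, ccdfs, z_min, z_max):
    """Read the p-quantile of each local ccdf by linear interpolation."""
    bounds = np.concatenate(([z_min], thresholds, [z_max]))
    z = np.empty(len(p_field))
    for j, p in enumerate(p_field):
        cum = np.concatenate(([0.0], ccdfs[j], [1.0]))
        z[j] = np.interp(p, cum, bounds)
    return z

rng = np.random.default_rng(1)
coords = np.linspace(0.0, 6.0, 61)                   # transect positions (km)
thresholds = [0.8, 1.38, 2.26]                       # ppm
ccdfs = np.tile([0.3, 0.6, 0.9], (len(coords), 1))   # hypothetical local ccdfs
p = simulate_p_field(coords, 2.0, rng)
z = sample_ccdfs(p, thresholds, ccdfs, 0.0, 5.0)
```

Because the ccdfs are built once, a new realization only requires a new vector of standard normal deviates and a new pass through `sample_ccdfs`; the Cholesky factor is reused.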

8.7 Simulated Annealing


8.7.1 Simulated annealing paradigm

[...]

where $\gamma(\mathbf{h}_s)$ is the value of the target $z$-semivariogram model at lag $\mathbf{h}_s$, and $\hat{\gamma}^{(l)}(\mathbf{h}_s)$ is the corresponding experimental $z$-semivariogram value of the realization $\{z^{(l)}(\mathbf{u}'_j), j = 1, \ldots, N\}$.

1. Create an initial realization, $\{z^{(l)}_{(0)}(\mathbf{u}'_j), j = 1, \ldots, N\}$, that honors data values at their locations and may already approximate some of the target statistics, such as the variance or sill of the target $z$-semivariogram.

[...]

3. Perturb the realization by some simple mechanism, such as swapping a pair of $z$-values: $z^{(l)}_{(0)}(\mathbf{u}'_j)$ becomes $z^{(l)}_{(0)}(\mathbf{u}'_k)$ and vice versa.

4. Assess the impact of the perturbation on the reproduction of target statistics by recomputing the objective function, $O_{new}(0)$, accounting for the modification of the initial image.

[...]

Other realizations $\{z^{(l')}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l' \neq l$, are generated by repeating the entire process starting from different initial images. Typically, the number of nodes $N$ is so large and the semivariogram is so little constraining that there exist many solutions to the optimization problem. The realizations drawn are samples from that set of approximate solutions.

Simulated annealing is conceptually simple and offers great flexibility to account for various constraints built into the objective function. The optimization process, however, relies on brute force, CPU-intensive, trial and error to gradually achieve reproduction of the target statistics. Implementation tips thus play an essential role for the successful application of simulated annealing; see the next section and Deutsch and Journel (1992b), Deutsch and Cockerham (1994).

Figure 8.12: The simulated annealing algorithm. An initial random image (top graph) is gradually modified by swapping pairs of values so as to achieve reproduction of the Cd semivariogram model (bottom graph). [Panels: initial random image; after 50,000 swaps; final realization after 398,000 swaps; each with its realization semivariogram, x-axis: distance (km)]

8.7.2 Implementation tips

There are many possible implementations of the general simulated annealing paradigm. Variants differ in the way the initial image is generated and then perturbed, in the components that enter the objective function, and in the type of decision rule and convergence criterion that are adopted.

The initial image

Simulated annealing requires the prior determination of an initial image, $\{z_{(0)}(\mathbf{u}'_j), j = 1, \ldots, N\}$. That prior determination should be such that:

• The initial image is easily generated.

• The image already matches simple target constraints (e.g., honoring of data values, reproduction of target histogram) so as to accelerate the subsequent optimization process.

• All initial images are "equally probable" in that each image is equally likely to have been drawn. Beware that using the same image as a starting point for several different runs may lead to artificial similarity between final realizations and hence an understatement of uncertainty.

Typically, the initial image is generated by freezing data values at their locations and assigning to each unsampled grid node a $z$-value drawn at random from the target cdf $F(z)$. This approach is fast and yields a set of initial images that already honor the conditioning data and match the target histogram.

The initial image could also be a realization generated by any one of the previous simulation algorithms, for example, p-field or sequential simulation algorithms. Simulated annealing is then used as a post-processor with the objective of either a better reproduction of the target statistics or the imposition of additional constraints that cannot be readily incorporated by other simulation algorithms.

The perturbation mechanism

The two mechanisms most commonly used for sequentially modifying the initial image are as follows:

1. Swap the $z$-values at any two unsampled locations $\mathbf{u}'_j$ and $\mathbf{u}'_k$ chosen at random:

$$z_{(i)}(\mathbf{u}'_j) = z_{(i-1)}(\mathbf{u}'_k), \qquad z_{(i)}(\mathbf{u}'_k) = z_{(i-1)}(\mathbf{u}'_j)$$

Such a perturbation mechanism allows one to keep unchanged the histogram of the initial image. Thus, there is no need to include reproduction of the histogram in the objective function as long as the initial image already matches the target histogram.

2. Select randomly a single location $\mathbf{u}'_j$ and modify the corresponding $z$-value $z_{(i-1)}(\mathbf{u}'_j)$ according to some mechanism; for example, the target histogram $F(z)$ is sampled anew for a different value:

$$z_{(i)}(\mathbf{u}'_j) = F^{-1}(p^{(i)})$$

with $p^{(i)}$ a random number in $[0, 1]$.

In both cases, conditioning data are never perturbed, thereby ensuring that the final realization honors data values.

[...]

$$O = \sum_{c} w_c \, O_c$$

where the weight $w_c$ controls the relative importance of the $c$th component in the objective function.

Defining the components $O_c$

Different components can be incorporated in the objective function, depending on the target statistics to be reproduced. Six examples follow:

[...]

(2) Semivariogram model

Reproduction of the $z$-semivariogram model $\gamma(\mathbf{h})$ is generally limited to a specified number $S$ of lags. The measure of the deviation between target and current semivariogram values is

$$O_2 = \sum_{s=1}^{S} \frac{[\gamma^{(i)}(\mathbf{h}_s) - \gamma(\mathbf{h}_s)]^2}{[\gamma(\mathbf{h}_s)]^2}$$

where $\gamma^{(i)}(\mathbf{h}_s)$ is the semivariogram value at lag $\mathbf{h}_s$ of the realization at the $i$th perturbation. The division by the square of the semivariogram model at each lag $\mathbf{h}_s$ gives more weight to reproduction of the $z$-semivariogram model near the origin.

[...]

where $\gamma_I^{(i)}(\mathbf{h}_s; z_k)$ is the indicator semivariogram value at lag $\mathbf{h}_s$ and threshold $z_k$ for the realization at the $i$th perturbation.

(4) Multiple-point statistics

All previous simulation techniques are limited to reproduction of two-point statistics in that (indicator) semivariograms or covariance functions involve only two locations at a time. Two-point statistics, however, are often not enough to characterize complex features, such as curvilinear structures, cross-bedding, and meandering channels (Guardiano and Srivastava, 1993). Reproduction of such complex spatial features calls for considering more than two locations at a time, say, three or four locations as illustrated by the examples of Figure 8.13.

Figure 8.13: Examples of two-point, three-point, and four-point data configurations.

Consider a $J$-point configuration defined by $J$ separation vectors $\mathbf{h}_1, \ldots, \mathbf{h}_J$, with $\mathbf{h}_1 = \mathbf{0}$ by convention. The probability that the $J$ values $z(\mathbf{u} + \mathbf{h}_1), \ldots, z(\mathbf{u} + \mathbf{h}_J)$ are jointly no greater than the $J$ threshold values $z_1, \ldots, z_J$ is defined as

$$\phi(\mathbf{h}_1, \ldots, \mathbf{h}_J; z_1, \ldots, z_J) = \text{Prob}\{Z(\mathbf{u} + \mathbf{h}_1) \le z_1, \ldots, Z(\mathbf{u} + \mathbf{h}_J) \le z_J\} \qquad (8.18)$$

The quantity (8.18) can be read as a multiple-point non-centered indicator covariance. For $J = 2$, one retrieves the usual two-point indicator covariance as $C_I(\mathbf{h}; z_1, z_2) = \phi(\mathbf{h}_1, \mathbf{h}; z_1, z_2) - F(z_1) \cdot F(z_2)$. If all $J$ threshold values are equal to the same value $z$, and all $J$ separation vectors are multiples of the same vector $\mathbf{h}$, the multiple-point statistic (8.18) is the connectivity function introduced by Journel and Alabert (1988):

$$\phi(J; z) = \text{Prob}\{Z(\mathbf{u} + (j - 1)\mathbf{h}) \le z, \; j = 1, \ldots, J\} \qquad (8.19)$$

The quantity (8.19) measures the probability that a string of $J$ values oriented along the direction of $\mathbf{h}$ are jointly no greater than a given threshold value $z$.

[...]

where the experimental frequency $\phi^{(i)}(\mathbf{h}_1, \ldots, \mathbf{h}_J; z_1, \ldots, z_J)$ is calculated from the realization at the $i$th perturbation.

The difficulty resides in the inference of the target multiple-point statistics $\phi(\mathbf{h}_1, \ldots, \mathbf{h}_J; z_1, \ldots, z_J)$. For example, the inference of a three-point statistic $\phi(\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3; z_1, z_2, z_3)$ requires the availability of a series of triplet values with the same three-point configuration, say, the same geometric configuration as in Figure 8.13 (left bottom graph). Non-regular gridding and data sparsity generally preclude the computation of such statistics from sample data. In most applications, multiple-point statistics are derived from an array of values referred to as a training image or control pattern (Farmer, 1992; Deutsch

Table 8.1: Parameters of annealing schedules used in the example of Figure 8.14. The initial temperature $t_0$ is reduced by a factor $\lambda$ whenever enough perturbations have been attempted ($K_{attempt} \cdot N$) or have been accepted ($K_{accept} \cdot N$). The simulation is stopped when either the target low value $O_{min}$ is reached or the maximum number of attempted perturbations at the same temperature has been reached $S$ times.

Schedule     t_0    λ      K_attempt   K_accept   O_min   S
Default      1.0    0.10   100         10         0.001   3
Fast         1.0    0.05   50          5          0.001   3
Very fast    0.5    0.01   10          2          0.001   3
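
The schedule parameters of Table 8.1 can be read as the controls of a loop such as the one sketched below. This is a minimal illustration under several assumptions: a 1D regularly spaced profile without conditioning data, a swap perturbation, an objective made of squared semivariogram deviations scaled by the squared model value, and a Metropolis-type acceptance rule. The function names, the lognormal initial image, and the exponential target model are hypothetical.

```python
import numpy as np

def semivariogram(z, lags):
    """Experimental semivariogram of a regularly spaced 1D profile."""
    return np.array([0.5 * np.mean((z[h:] - z[:-h]) ** 2) for h in lags])

def objective(z, lags, target):
    """Squared semivariogram deviations, scaled by the squared model value."""
    return np.sum((semivariogram(z, lags) - target) ** 2 / target ** 2)

def anneal(z0, lags, target, rng, t0=1.0, lam=0.1, k_attempt=100,
           k_accept=10, o_min=0.001, s_max=3):
    """Swap-based annealing driven by the schedule parameters of Table 8.1."""
    z, n = z0.copy(), len(z0)
    o_cur, temp, n_limit_hits = objective(z, lags, target), t0, 0
    while o_cur > o_min and n_limit_hits < s_max:
        attempted = accepted = 0
        while attempted < k_attempt * n and accepted < k_accept * n:
            j, k = rng.choice(n, size=2, replace=False)
            z[j], z[k] = z[k], z[j]                   # swap two values
            o_new = objective(z, lags, target)
            attempted += 1
            # Assumed decision rule: always accept improvements; accept a
            # degradation with probability exp((o_cur - o_new) / temp).
            if o_new <= o_cur or rng.random() < np.exp((o_cur - o_new) / temp):
                o_cur, accepted = o_new, accepted + 1
                if o_cur <= o_min:
                    break
            else:
                z[j], z[k] = z[k], z[j]               # reject: undo the swap
        if attempted >= k_attempt * n:
            n_limit_hits += 1                         # attempt limit reached
        temp *= lam                                   # lower the temperature
    return z, o_cur

rng = np.random.default_rng(0)
z0 = rng.lognormal(size=60)                           # initial random image
lags = np.arange(1, 6)
target = np.var(z0) * (1.0 - np.exp(-3.0 * lags / 10.0))
z_final, o_final = anneal(z0, lags, target, rng, t0=0.5, lam=0.01,
                          k_attempt=10, k_accept=2)   # "very fast" schedule
```

Recomputing the full semivariogram after every swap keeps the sketch short; a practical implementation would update only the pairs affected by the two swapped nodes.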

• The value of the objective function decreases sharply when the temperature is lowered. The magnitude of the drop is greatest for the very fast annealing schedule with a small factor $\lambda$.

• The very fast annealing schedule is stopped after only [?]4,000 swaps, whereas the default schedule is stopped after 398,000 swaps. In this example, the three schedules yield a similar, excellent reproduction of the target semivariogram model (Figure 8.14, second row). In other situations, there might be a risk of getting trapped in unacceptable sub-optima when using an annealing schedule that is too fast.

• The three realizations shown at the bottom of Figure 8.14 are quite different, yet they all honor the 259 Cd data at their locations and reproduce closely the target semivariogram model. Such differences between realizations reflect the existence of many approximate solutions to the optimization problem, and they can be used to model spatial uncertainty. In this example, uncertainty may have been understated because the same initial image has been used.

Figure 8.14: Impact of the annealing schedule on reduction of the objective function value versus number of swaps (top graphs) and reproduction of the target Cd semivariogram model (second row). The corresponding final realizations are shown at the bottom of the figure.

8.8 Simulation of Categorical Variables

Variables, such as the concentration of a metal in the soil, may appear to change abruptly in space at terrace bluffs and across the boundaries between the outcrops of contrasting rocks. In such cases, the phenomenon should be modeled as a mixture of populations each with possibly different patterns of spatial continuity. The simulation would proceed in two steps: (1) the relative geometry of the different populations, say, different rocks, is modeled, and (2) the spatial distribution of the continuous attribute specific to each population is simulated using any of the previously mentioned algorithms. This two-step approach allows for better reproduction of long-range heterogeneities as generated by categorical boundaries and, simultaneously, the short-range heterogeneities within each category (Alabert and Massonnat, 1990; Damsleth et al., 1990).

Stratification

Consider, for example, a stratification of the study area based on the geologic map (Figure 8.15, left top graph). Stratum 1 includes Argovian rocks with the smallest proportion of contaminated locations; the four other geologic formations constitute stratum 2. Figure 8.15 (right top graph) shows the semivariograms of the Cd normal score data within each stratum. Concentrations appear to vary more continuously within the first stratum (smaller relative nugget effect), which relates to the better connectivity of small Cd concentrations revealed by the indicator semivariograms shown in Chapter 2 (page 45). The spatial distribution of Cd concentrations is simulated within each stratum using sequential Gaussian simulation conditioned by data and semivariogram models specific to that stratum. The resulting realization shows a better contrast between low-valued Argovian rocks and the other rocks than if a single population and semivariogram model is used; compare the two sGs realizations at the bottom of Figure 8.15.

[Figure 8.15, bottom panels: sGs with stratification; sGs without stratification.]

In the example of Figure 8.15, the geometry of the two strata is retrieved directly from the geologic map. In many applications, there is no such exhaustive categorical map, and the relative geometry of the different populations should be simulated first. Many algorithms can be used to simulate categorical variables: Boolean and object-oriented algorithms (Ripley, 1987; Haldorsen et al., 1988; Suro-Pérez, 1991); single or multiple truncations of a Gaussian field (Journel and Isaaks, 1984; Matheron et al., 1987; Xu and Journel, 1993); indicator-based algorithms (Journel, 1989; Alabert and Massonnat, 1990; Deutsch and Journel, 1992a); simulated annealing (Deutsch and Journel, 1992b; Farmer, 1992; Goovaerts and Journel, 1996); p-field simulation (Xu, 1995a). The following presentation is limited to the class of indicator simulation algorithms.

Sequential indicator simulation

Consider the simulation of the spatial distribution of $K$ mutually exclusive categories $s_k$ conditional to the data set $\{s(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$; see subsequent definition. Sequential indicator simulation of categorical variables follows a procedure similar to that described in section 8.4 for continuous variables:

• Transform each categorical datum $s(\mathbf{u}_\alpha)$ into a vector of $K$ hard indicator data, defined as $i(\mathbf{u}_\alpha; s_k) = 1$ if $s(\mathbf{u}_\alpha) = s_k$, and $0$ otherwise.

• Define a random path visiting each node of the grid only once.

1. Determine the conditional probability of occurrence of each category $s_k$, $[p(\mathbf{u}'; s_k|(n))]^*$, using simple or ordinary indicator (co)kriging. The conditioning information $(n)$ consists of neighboring original indicator data and previously simulated indicator values.
2. Correct these probabilities for order relation deviations.
3. Define any ordering of the $K$ categories and build a cdf-type function by adding the corresponding probabilities of occurrence.
4. Draw a random number $p$ uniformly distributed in $[0, 1]$. The simulated category at location $\mathbf{u}'$ is the one that corresponds to the probability interval that includes $p$.
5. Add that simulated value $s^{(l)}(\mathbf{u}')$ to the conditioning data set.
6. Proceed to the next node along the random path, and repeat steps 1-5.

Repeat the entire sequential procedure with a different random path to generate another realization $\{s^{(l')}(\mathbf{u}'_j), j = 1, \ldots, N\}$, $l' \neq l$.
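Steps 2 to 4 of the procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function names are hypothetical, the conditional probabilities are assumed to have been estimated already by indicator kriging (not shown), and the order relation correction used here (clipping to [0, 1], then rescaling so that the probabilities sum to 1) is only one simple assumption among several possible corrections.

```python
def indicator_code(category, n_categories):
    """Hard indicator vector: 1 for the observed category, 0 elsewhere."""
    return [1 if k == category else 0 for k in range(n_categories)]


def correct_order_relations(probabilities):
    """Assumed correction: clip each probability to [0, 1], then rescale to sum 1."""
    clipped = [min(max(p, 0.0), 1.0) for p in probabilities]
    total = sum(clipped)
    if total == 0.0:
        raise ValueError("all estimated probabilities are zero")
    return [p / total for p in clipped]


def draw_category(probabilities, p):
    """Return the index of the category whose cumulative interval includes p."""
    cumulative = 0.0
    for k, probability in enumerate(probabilities):
        cumulative += probability  # cdf-type function built by adding probabilities
        if p <= cumulative:
            return k
    return len(probabilities) - 1  # guard against floating-point round-off
```

In a full simulation, `p` would be drawn with `random.random()` at each node of the random path, and the drawn category would be indicator-coded and added to the conditioning data before moving to the next node.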
8.9. MISCELLANEOUS ASPECTS OF SIMULATION 425

1. Because the random value $p$ is uniformly distributed in $[0, 1]$, the arbitrary ordering of the $K$ categories affects neither which category is drawn nor their spatial distribution (Alabert and Massonnat, 1990).

2. Because the $K$ indicator RVs $I(\mathbf{u}; s_k)$ are linearly related (they add to 1), there is a risk of numerical instabilities if all $K$ categories are considered together in a single cokriging system (see related discussion in section 7.3.6).

3. Actual data about the prevailing category $s_k$ may be supplemented by soft information, such as the knowledge of absence of one or more categories or a set of prior probabilities of occurrence provided by ancillary data. The indicator algorithms introduced in section 7.3.3 allow one to account for both hard and soft indicator data in the determination of the conditional probabilities at each simulated node.

Sequential indicator simulation (sis) is used to simulate the spatial distribution of the four land uses over the study area conditional to the 50 data shown at the top of Figure 7.42 (page 355). The conditional probabilities are determined using ordinary indicator kriging and the indicator semivariogram models displayed at the bottom of Figure 7.30. At each grid node, the conditioning information consists of the 12 closest land use data and the 12 closest previously simulated values.

Figure 8.16 shows one realization (top graph) and the corresponding direct and cross indicator semivariograms. Note the following:

• The realization departs significantly from the target sample proportions (4, 58, 18, 20%). For correction of such departure, see section 8.9.1.

• The indicator cross semivariograms, which measure the transition frequencies between two different categories, are poorly reproduced since they were not accounted for in the indicator algorithm, in this case ordinary indicator kriging.

[Figure 8.16 panels: sis realization (legend: Tillage (5%), Meadow (53%), Pasture (25%)); direct indicator semivariograms (Pasture, Meadow, Tillage); cross indicator semivariograms (Forest x Pasture, Forest x Meadow, Forest x Tillage, Pasture x Meadow, Pasture x Tillage, Meadow x Tillage).]
8.9 Miscellaneous Aspects of Simulation


Algorithms for simulating continuous and categorical variables were introduced in sections 8.2-8.8. First, this section shows how a post-processing of realizations allows one to improve reproduction of model statistics. Next, different ways of summarizing the spatial uncertainty represented by the series of realizations are presented.

8.9.1 Reproduction of model statistics

Stochastic realizations rarely match model statistics exactly, nor should they. Consider, for example, the three sGs realizations shown in Figure 8.2 (page 373). The three corresponding realization semivariograms $\gamma^{(l)}(\mathbf{h})$, $l = 1, 2,$ and $3$, fluctuate around the model $\gamma(\mathbf{h})$ depicted by the solid line in Figure 8.17 (left graph). Similar fluctuations are observed for the marginal cdfs (Figure 8.17, right graph). Such discrepancies between realization and model statistics are referred to as ergodic fluctuations.

Several factors control the importance of ergodic fluctuations displayed by a realization (Deutsch and Journel, 1992a, p. 127-129):

1. The algorithm used to generate the set of realizations.
Unlike simulated annealing, Gaussian- and indicator-based simulation algorithms reproduce the semivariogram model(s) only in expected value, that is, on average over many realizations. Consequently, larger fluctuations of realization semivariograms are expected when using sequential simulation algorithms.

2. The density of conditioning data.
As more data are used to condition the realizations, the realization statistics become increasingly similar and closer to the target statistics, if these were modeled from the same data.

[Figure 8.17 panels: Realization semivariograms (axis: Distance (km)); Realization cdfs (axis: Cd concentration (ppm)).]

Figure 8.17: Fluctuations in the reproduction of the target semivariogram model and target cdf (solid line) by the three realizations of Figure 8.2.

When departures from model statistics are deemed too important, the realization can be discarded and another realization generated. Computation time may prevent one from creating many realizations and selecting only those with desirable statistics. An alternative is to post-process the few realizations available so as to better reproduce or even identify the target statistics.

Beware that model statistics are inferred from sample information, which necessarily departs from the population parameters, particularly when data are sparse or in the presence of preferential sampling. Ergodic fluctuations allow one to account indirectly for the uncertainty about sample statistics. Reduction or removal of such fluctuations may lead to a false sense of certainty about the simulated features; the selected realizations are artificially made to look alike through too stringent constraints of reproduction of target statistics. A more rigorous approach (Journel, 1994b) consists of a formal randomizing of the semivariogram models and then accounting for such variation in the simulation algorithms.

Data that are subject to measurement error should not be exactly honored by the realization. Moreover, the part of the nugget effect arising from these errors should not be reproduced by the realization semivariogram. To filter the noise due to measurement error from the realization, Marcotte (1995) proposed to post-process it using a modified version of kriging analysis introduced in section 5.6. The filtered image does not honor noisy data at their locations, and the measurement error variance is removed from the realization semivariogram.

A simulated value can be corrected toward the target cdf $F$ through the quantile transform

$$z_c^{(l)}(\mathbf{u}) = F^{-1}\left[F^{(l)}\left(z^{(l)}(\mathbf{u})\right)\right] \qquad (8.20)$$

where $z_c^{(l)}(\mathbf{u})$ is the value corrected from the original simulated value $z^{(l)}(\mathbf{u})$. The set of corrected values $\{z_c^{(l)}(\mathbf{u}), \mathbf{u} \in A\}$ identifies (in expected value) the target cdf, since the uniform score $F^{(l)}(Z^{(l)}(\mathbf{u}))$ is, by definition, uniformly distributed.
428 CHAPTER 8. ASSESSMENT OF SPATIAL UNCERTAINTY

Relation (8.20) amounts to identification of the $p$-quantiles of the two distributions of $z_c^{(l)}(\mathbf{u})$ and $z^{(l)}(\mathbf{u})$. Such quantile identification preserves by definition the ranks of the original values $z^{(l)}(\mathbf{u})$, and hence the structures seen on the original realization are unchanged.

The correction (8.20), however, also affects data locations, hence the corrected realization no longer honors the data exactly. The exactitude property of the original realization could be preserved by correcting only the unsampled locations, but such selective correction may create artifact discontinuities next to the data locations. The solution consists of applying the correction (8.20) progressively, through a relative correction factor $\lambda(\mathbf{u})$, as the location $\mathbf{u}$ gets farther away from the data locations. The factor $\lambda(\mathbf{u})$ depends on a correction parameter $\omega > 0$, on the kriging variance $\sigma_K^2(\mathbf{u})$ calculated using only the $n$ data $z(\mathbf{u}_\alpha)$, and on the maximum kriging variance $\sigma_{\max}^2$ observed over the study area $A$. The following are noteworthy:

• At a datum location $\mathbf{u}_\alpha$: $\sigma_K^2(\mathbf{u}_\alpha) = 0$ and $\lambda(\mathbf{u}_\alpha) = 0$, $\forall \omega$, hence the exactitude property of the original realization is preserved.

• As the location $\mathbf{u}$ gets farther away from data locations, the kriging variance $\sigma_K^2(\mathbf{u})$, and hence the intensity of the correction controlled by the factor $\lambda(\mathbf{u})$, increases.

• The ratio $\sigma_K^2(\mathbf{u})/\sigma_{\max}^2$ is smaller than 1, so the correction factor $\lambda(\mathbf{u})$ increases as $\omega$ decreases. The parameter $\omega$ allows a balance between the two extreme cases of correction at all non-data locations (case $\omega = 0$) and no correction (case $\omega = \infty$).

A similar algorithm has been developed for post-processing realizations of categorical variables. The objective is to reproduce the target proportion of each category while honoring conditioning data without significant modification of class-indicator semivariograms.

Figure 8.18 (left top graph) shows a realization generated by sequential indicator simulation using only five threshold values. The Q-Q plot shown in ...

[Figure 8.18 panels: SIS realization; After correction ($\omega = 2.0$); Q-Q plots (axes: Cd data (ppm)); Realization semivariograms (axis: Distance (km)).]

Figure 8.18: Sequential indicator realization that departs from the target cdf before (left column) and after correction (right column) using $\omega = 2.0$. The correction improves the reproduction of the target cdf while preserving the shape of the semivariogram.
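The progressive correction described above can be sketched as follows. This is a hedged illustration, not the book's algorithm as printed: the displayed formulas did not survive in this text, so the sketch assumes a correction factor of the form (kriging variance / maximum kriging variance) raised to the power omega, forced to 0 at data locations, and a linear blend between the original value and its fully corrected quantile. Both assumptions are consistent with the limiting cases listed in the text (full correction for omega = 0, no correction as omega grows). All function names and the step-function quantile convention are hypothetical.

```python
import math
from bisect import bisect_right


def uniform_score(sorted_values, z):
    """Empirical cdf value of z: proportion of values less than or equal to z."""
    return bisect_right(sorted_values, z) / len(sorted_values)


def target_quantile(sorted_target, p):
    """p-quantile of the target distribution (simple step-function convention)."""
    n = len(sorted_target)
    index = min(max(math.ceil(p * n) - 1, 0), n - 1)
    return sorted_target[index]


def correction_factor(kriging_var, max_kriging_var, omega):
    """Assumed form: (variance ratio) ** omega, set to 0 where the variance is 0."""
    if kriging_var <= 0.0 or max_kriging_var <= 0.0:
        return 0.0
    return (kriging_var / max_kriging_var) ** omega


def correct_realization(simulated, kriging_vars, target_values, omega):
    """Blend each simulated value with its quantile-transformed counterpart."""
    sorted_sim = sorted(simulated)
    sorted_target = sorted(target_values)
    max_var = max(kriging_vars)
    corrected = []
    for z, var in zip(simulated, kriging_vars):
        z_full = target_quantile(sorted_target, uniform_score(sorted_sim, z))
        lam = correction_factor(var, max_var, omega)
        corrected.append(z + lam * (z_full - z))
    return corrected
```

A node with zero kriging variance (a datum location) keeps its original value whatever the value of `omega`, which mirrors the exactitude property discussed in the text.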
Better reproduction of model statistics, not limited to the marginal distribution, can also be achieved by post-processing the realization with unacceptable fluctuations using simulated annealing introduced in section 8.7. Actually, simulated annealing can be used not only to improve reproduction of current statistics but also to impart new properties to the realization, such as multiple-point statistics (Deutsch and Journel, 1992b; Murray, 1992; Goovaerts, 1996).

Consider, for example, post-processing of the land use simulated map shown at the top of Figure 8.16 (page 425) with two objectives: (1) to improve reproduction of the $K$ target class proportions $p_k$ and direct semivariograms $\gamma_I(\mathbf{h}; s_k)$, and (2) to improve reproduction of the cross semivariograms $\gamma_I(\mathbf{h}; s_k, s_{k'})$ not accounted for by the original sequential indicator algorithm. The corresponding two objective function components measure the deviations between these target statistics and those calculated from the realization at the $i$th perturbation, for example, the cross semivariogram between categories $s_k$ and $s_{k'}$ at lag $\mathbf{h}$.

Because the realization shown in Figure 8.16 already reproduces fairly well global proportions and indicator semivariograms, it should not be completely randomized by accepting too many unfavorable perturbations at the beginning of the post-processing. Thus, the realization was post-processed using the following MAP algorithm that retains the perturbation that diminishes most the value of the objective function:

1. Compute the value of the objective function for the initial realization.

2. For a specified number of iterations, follow these steps:

• Define a random path that visits all non-conditioned grid nodes.
• At each node $\mathbf{u}$, consider all $K$ possible categories and compute the values of the corresponding $K$ objective functions.
• Allocate the node $\mathbf{u}$ to the category $s_k$ associated with the smallest value of the objective function.
• Proceed to the next node along the random path.

The post-processing was stopped after 10 iterations when the proportion of changes per iteration was found to be less than 1% of the total number of grid nodes.

Figure 8.19 shows the post-processed realization and the corresponding direct and cross indicator semivariograms. Simulated annealing improves the reproduction of the target global proportions (4, 58, 18, 20%) and indicator semivariogram models. Significant discrepancies still remain between model and realization cross semivariograms, for example, between pasture and meadow. Such departures may be due to an inconsistent target coregionalization model. Recall that the direct and cross semivariograms were modeled independently without checking for the positive semi-definiteness of the full matrix of cross covariance functions.

8.9.2 Visualization of spatial uncertainty

The set of alternative realizations generated by stochastic simulation provides a measure of uncertainty about the spatial distribution of attribute values. To depict visually that uncertainty, several authors (Srivastava, 1994; Wang, 1994) have developed algorithms that show the realizations one at a time in rapid succession, like the frames of an animated cartoon, say, eight realizations per second. Like an animated cartoon, successive realizations must be similar enough to allow the eye to catch gradual changes. Such a similarity can be achieved by ranking the realizations appropriately (Wang, 1994) or by using a simulation algorithm that generates realizations that are incrementally different (Srivastava, 1994). The animated display of realizations allows one to distinguish areas that remain stable over all realizations (low uncertainty) from those where large fluctuations occur between realizations (high uncertainty).

The series of $L$ realizations can also be post-processed and the uncertainty information summarized using different types of displays, as follows:

1. Probability maps.
At each simulated grid node $\mathbf{u}'_j$, the probability of exceeding a given threshold $z_k$ is evaluated as the proportion of the $L$ simulated values $z^{(l)}(\mathbf{u}'_j)$ that exceed that threshold. The map of such probabilities is referred to as a probability map. For example, Figure 8.20 (top graph) shows the probability map for exceeding the critical Cd concentration 0.8 ppm. That map was determined from 100 sGs realizations. The two low-valued zones correspond to Argovian rocks where smaller Cd concentrations were measured.

2. Quantile maps.
Rather than displaying the probability of exceeding a particular threshold ...

[Figure 8.19 panels: post-processed sis realization (legend: Tillage (5%), Meadow (58%), Pasture (19%), Forest (18%)); direct indicator semivariograms (Forest, Meadow, Tillage); cross indicator semivariograms (Forest x Pasture, Forest x Meadow, Forest x Tillage, Pasture x Meadow, Pasture x Tillage, Meadow x Tillage).]

Figure 8.19: Post-processing of the sequential indicator realization of Figure 8.16 using simulated annealing. Note the better reproduction of target semivariograms.

[Figure 8.20 panels: Probability map (0.8 ppm); 0.1-quantile map; 0.9-quantile map; Local entropy map; Interquartile range map.]

[Figure 8.21 labels: True value; Response values.]

3. Maps of spread.
Local differences between realizations can be depicted by mapping some measure of the spread of the distribution of the $L$ simulated values at each simulated grid node; for example, the local entropy, the variance, or the interquartile range of the local ccdf, as defined in section 7.1.1. For example, Figure 8.20 (bottom maps) shows maps of local entropy and interquartile range determined from the 100 sGs realizations. Both maps indicate greater certainty (lighter grey) on Argovian rocks where Cd concentrations are consistently small. The uncertainty is greater where large and medium Cd concentrations are intermingled or in the west part of the study area where data are sparse (see Figure 1.1, page 5).
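The three kinds of displays listed above all reduce to node-by-node summaries of the set of realizations. The following Python sketch shows one way to compute them; it is an illustration under stated assumptions, not code from the book. The function names are hypothetical, each realization is assumed to be a flat list with one simulated value per grid node, and quantiles use a simple step-function convention. Local entropy is left out because it requires the class definition of section 7.1.1.

```python
import math


def quantile(values, p):
    """p-quantile of a list of values (simple step-function convention)."""
    ordered = sorted(values)
    index = min(max(math.ceil(p * len(ordered)) - 1, 0), len(ordered) - 1)
    return ordered[index]


def exceedance_probability(values, threshold):
    """Proportion of simulated values that exceed the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


def summarize_node(values, threshold):
    """Probability, quantile, and spread summaries for one grid node."""
    return {
        "probability": exceedance_probability(values, threshold),
        "q10": quantile(values, 0.1),
        "q90": quantile(values, 0.9),
        "iqr": quantile(values, 0.75) - quantile(values, 0.25),
    }


def summarize_realizations(realizations, threshold):
    """One summary per grid node, computed across the L realizations."""
    return [summarize_node(list(node_values), threshold)
            for node_values in zip(*realizations)]
```

Mapping the `probability` entries gives a probability map, the `q10` and `q90` entries give quantile maps, and the `iqr` entries give one possible map of spread.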

8.9.3 Choosing a simulation algorithm

The practitioner may get confused in the face of an ever-growing palette of simulation algorithms available. There is no simulation algorithm that is best for all cases but rather a toolbox of alternative algorithms from which to choose or to build the algorithm best suited for the problem at hand. Building from Deutsch (1994a), four criteria can be used to select an appropriate simulation algorithm:

2. The human and CPU time required for generating a given set of realizations

3. The amount of relevant information accounted for (conditioning data)

4. The precision and accuracy of probabilistic prediction, that is, of the distribution of outcomes resulting from the application of a given transfer function (flow simulator, remediation process) to the set of realizations

As shown in previous sections, the p-field and the various sequential Gaussian simulation algorithms are less demanding than either sequential indicator simulation or simulated annealing in terms of inference and computational effort. The rapid growth of computational capabilities, however, tends to attenuate that difference.

The greater CPU cost of indicator-based and simulated annealing algorithms is balanced by their greater flexibility in incorporating various types of information, such as class-specific patterns of spatial continuity, soft data, and multiple-point statistics.

The fourth criterion relates to the distribution of response values obtained by processing the set of realizations, for example, the distribution of costs shown at the top of Figure 8.3 (page 375). Recall that this distribution provides an assessment of the risk associated with declaring the study area safe with respect to Cd. A response distribution is accurate if some fixed probability interval (e.g., the symmetric 50% probability interval or interquartile range) contains the true response, in this case the actual cost if no remedial measure is taken. The precision of the response distribution is measured by its narrowness. Note that accuracy can be evaluated only if the actual true value is known, for example, during calibration exercises. A good simulation algorithm should generate an output distribution that is both accurate and precise (Figure 8.21, top graph). The worst situation is an output distribution that is precise but not accurate, because it gives a false sense of confidence to a prediction that is actually wrong (Figure 8.21, left bottom graph). Accuracy can be achieved at the expense of precision through an output distribution with a very large spread (Figure 8.21, right bottom graph).
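The notions of accuracy and precision of a response distribution, as described above, can be made concrete with a short Python sketch. It is only an illustration: the function names are hypothetical, the quantile convention is a simple step function, and the symmetric 50% probability interval is taken as the default fixed interval. The check requires a known true response, which in practice is available only in calibration exercises.

```python
import math


def quantile(values, p):
    """p-quantile of a list of values (simple step-function convention)."""
    ordered = sorted(values)
    index = min(max(math.ceil(p * len(ordered)) - 1, 0), len(ordered) - 1)
    return ordered[index]


def assess_response_distribution(responses, true_value, p_low=0.25, p_high=0.75):
    """Accuracy: does the probability interval contain the true response?
    Precision: how narrow is that interval (smaller width = more precise)?"""
    lower = quantile(responses, p_low)
    upper = quantile(responses, p_high)
    return {
        "accurate": lower <= true_value <= upper,
        "width": upper - lower,
    }
```

With this sketch, a wide distribution may be accurate but imprecise, while a narrow distribution that misses the true value is precise but inaccurate, which the text identifies as the worst situation.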
[Figure 8.22 panels: Output distribution (sGs); Output distribution (sis); horizontal axes: cost, 3000 to 6000; surviving statistics: Mean: 4221, Std. dev.: 392.]

Figure 8.22: Two distributions of costs associated with a wrong decision to declare the study area safe with respect to Cd. These distributions are obtained by post-processing 100 realizations of the spatial distribution of Cd values generated using either the sequential Gaussian or the sequential indicator simulation algorithm.

The relative precision resulting from each simulation algorithm is readily assessed by measuring the spread of the output distribution, which is often referred to as the space of uncertainty. The extent of this space depends on the particular algorithm used to generate the realizations. For example, Figure 8.22 shows the probability distributions of the cost (8.2) computed from two sets of 100 realizations generated by two different algorithms: sequential Gaussian and sequential indicator simulation. Precision also depends on the transfer function being applied to the realizations. For the same data set, depending on the transfer function and the specific response value retained, different algorithms can be best. Note that conditioning to more information does not necessarily reduce the space of uncertainty. Incorporating additional information that conflicts with current data may increase the uncertainty in the response variable.

Chapter 9
Summary

Chapters 2-8 covered the sequence typical of a geostatistical study, beginning with exploratory data analysis, then quantitative modeling of spatial continuity, prediction of attribute values at unvisited locations, and, last, assessment of local and spatial uncertainty about unsampled values. This final chapter provides a synopsis of previous chapters and points out topics that deserve further attention.

Characterization of any region requires a prior collection of data at specific locations within the region. In this book, one considered the situation where data have already been collected and the main challenge is to infer statistics representative of the study area from data that might have been preferentially sampled. The problem of designing sampling schemes has not been addressed. Information about how geostatistics can be used in sampling design is given in Webster and Oliver (1990, p. 272 ff.).
The number and spatial configuration of data are controlled by several factors, such as the available technology and resources, the accessibility of some areas, the need for sampling more densely areas deemed critical, the combination of different measurement devices used, etc. The sampling strategy should always be clearly documented because it may guide or skew the results of the exploratory data analysis.

When dealing with attributes that vary over time, such as pollutant concentrations, spatial data should be collected within a short time to avoid mixing spatial and temporal fluctuations. If the study pursues different objectives that call for different sampling strategies, the sampling scheme should be designed to simultaneously achieve these objectives. For example, the Jura data were collected using a combination of nested and regular sampling schemes, which allowed characterization of short-range variability of metal ...
E x p l o r a t o r y data :tnalysis under study. Suhscquent applicatiorrs, srtcl~as predictio~~ or assessn~entof 1111-
,, certainty, rely on t l ~ i sparticular mudel of spatial variability.
Ilrc sccond step c o ~ ~ s i sof
t s gett,ing familiar wit11 tlre data using descriptive As emplrmierd in Clraptrr 4, sen~ivariogra~r~ rrrodelitig is not simply an
f.ools, such as ii lociitiot~map, I~istograni,scattergram, or 11-scattergral~ras exercise in fitting curves to experimental values. Irt~portantdecisions rcgard-
introduced in Chapter 2. Exploratory data ar~alysismay indicate the exis- ing the ntnnber, type, and anisotropy of basic semivariogrir~rrmodels inust be
t.encr of several populations with sigr~ificatrtlydifferetrt features. 111 s u c l ~a nrade by the user rather tlral~left to some automatic-fittir~g algorithm. T h c
case, one slrould consider splitting the data. into more l~omogent~orls sul,sets user's expertise is particularly critical whenever sparse data and nreawremerrl
prior to statistical atmlysis. Such subdivision ]night not be possible becintse errors lead t,o noisy s;mrple seriiivariogranrs.
of t.he lack of data or the inability lo delineate the different popr~latior~s in Modeling a c o r e g i o ~ ~ a l i z a tisi omark
~ ~ difficrllt because the different direct
the field. and cross senrivariogratrls cantlot be ~rrodelerlindependel~llyof one another.
I'rior i~tfbrtuat'ionabout the range of attrilmte values, tlicir cross r e l a t i o ~ ~ s T h e corurnonly usetl linear model of coregionaliaatior~is reasorrably flexible
imd distrilmtions in space slrould be usctl rvitll t , l ~ rdata to build solnnlitry aud allows all easy check of the positive sern-definiteness condition. 'fire
st;tlislics, suclr as tnmtrs, varimcvs, correlat,io~~ coefficie~~ls,and srmivari- consistency constraints become very curnbersorrrc w l ~ e ~many i attribntes are
ogriuus. fk!w:~rc,Illat s;ul!plc sL;rlist,ics are ;~n):cl.edI)y spatial c l ~ ~ s t e arid
rs consi~lcreil;tllogcllrcr. For more than two ilttrihotes, interactive grapllical
c x t r c n ~ evalncs. If there is no stroug pliysical reason for discarrlitrg cxt.rerne progrmls it~clndingsen~i-ant.o~n,ztir fitting irlgoritlrms should bc considered
values or possibility to treal thcnr separately, their irrfluet~ceco~tldbe redr~ced (see Appendix A).
by all ;~ppropri;at.f!t~rans~ortn:tliorr of data or by using more rol)usl st;itistics.
When specific subareas arc preferentially santpled, the global reprcsetrtativ-
iby of sample statistics slronld be qoestiorred, and the declust,ering tecl~l~iqrrrs
inl.rodnced i l l scctio~r4.1.1 sltould he consi<lerc<l.
Locnl e s t i m a t i o n

' l i ~go l>eyol~d t.lw dat,;t ;rnd i4111:tt.ealtril)oI.c valucs at, r~nvisitcdloc;~tio~rs, :i In (3apt.cr 5, thret: v;triants of the li~mrrregression algorit11111(simple krig-
r ~ ~ o &o lf variability i l l s l m a I I I I I ~ Lbo csl:thlisl~ed.Our in~pcrfc:ctkr~orvledgeof i ~ t gordit~iiry
, krigitrg, ;tnd kriging wit,11a t r m d ) arc i~~t,rociuced for predicting
l ~ o wphysical processes oper;itc alrd interact over the st.udy area i~sn;illyyre- valtres of a sit~glcattribnle a t ut~sa~nplerl locatio~rs.Explicit ~nodeliugof local
cludcs any dctcrrnit~ist~ic i ~ r o d e l i ~wl~crcby
~g a siuglc and dcctr~rdexact, value trends is typically necded o~llywhen tlrc locatiorr lxilig estilnaled is ontside
is associated with each unsampled location. As an alternative until deterministic models become available, the random function (RF) model is introduced in Chapter 3. The random function allows modeling of spatial variability in that it yields at each unsampled location a probability distribution of possible values rather than a single value.

One important aspect of the RF approach is the concept of stationarity, which allows inference of statistics, such as the mean, covariance, or semivariogram, by pooling data over areas deemed homogeneous. It is worth recalling that stationarity is not a characteristic of the physical phenomenon under study, rather it is a modeling decision needed to specify the parameters of the RF model used. Because such stationarity cannot be proven or refuted from the data alone, however, the decision of stationarity, like any model decision, can be deemed inappropriate if its consequences do not allow one to reach the goals of the study.

Structural analysis

One major step in any geostatistical analysis is to build a licit semivariogram (covariance) model that captures the relevant spatial features of the attribute

the geographic range of data (extrapolation situation). In interpolation situations, ordinary kriging implicitly rescales the mean value within each local search neighborhood.

Block kriging allows for estimation of attribute values linearly averaged over supports (segment, area, volume) much larger than data supports. Beware that arithmetic averages of data are not the same as physical averages if the attribute does not average linearly in space, for example, permeability or pH.

All interpolation algorithms tend to smooth out local details of the spatial variability of the attribute, leading to overestimation of small values and underestimation of large ones. Such smoothing depends on the local data configuration: high-frequency components are progressively filtered as the location being estimated gets farther away from data locations. Maps of kriging estimates appear artificially more variable in densely sampled areas than they do where data are sparse. Building on that limitation, kriging analysis allows one to filter one or more spatial components of the semivariogram or covariance model, resulting in maps of low- or high-frequency components of spatial variability. Such maps can be used as exploratory tools to detect areas that depart substantially from a regional background.
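The warning about attributes that do not average linearly can be illustrated with pH. The sketch below is only a toy calculation: it assumes ideal, unbuffered mixing of equal volumes, so that hydrogen-ion concentrations (not pH values) average linearly, and the function name `physical_mean_ph` and the sample values are hypothetical.

```python
import math

def physical_mean_ph(ph_values):
    """pH of an equal-volume mixture under an idealized assumption:
    hydrogen-ion concentrations average linearly, pH values do not."""
    concentrations = [10.0 ** (-ph) for ph in ph_values]
    mean_concentration = sum(concentrations) / len(concentrations)
    return -math.log10(mean_concentration)

samples = [4.0, 6.0, 8.0]                      # hypothetical point-support pH data
arithmetic = sum(samples) / len(samples)       # arithmetic average of the data
physical = physical_mean_ph(samples)           # average after back-transforming

print(f"arithmetic average: {arithmetic:.2f}")  # 6.00
print(f"physical average:   {physical:.2f}")    # about 4.47
```

Under this assumption the most acidic sample dominates the block value, so a block estimate built as a linear combination of pH data would be biased toward neutral values.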

Accounting for secondary information

When secondary information is available at all locations, the study area can be stratified according to secondary data. Then the primary attribute is estimated within each stratum using primary data and covariance model specific to that stratum. An alternative is to use secondary data to inform on the spatial trend of the primary attribute.

The cokriging algorithm, as introduced in Chapter 6, allows one to account for non-exhaustive secondary information, capitalizing on the spatial cross correlations between primary and secondary variables. Recurrent practice has shown that cokriging yields better re-estimation scores than kriging only when the primary variable is undersampled with respect to the secondary variables. In other situations, the theoretical increase in precision is not worth the additional CPU cost and modeling effort required by cokriging relative to kriging.

The computational cost of cokriging can be alleviated by retaining only the secondary data colocated with or nearest to the location being estimated. Provided it is not invalidated, a Markov-type model allows a further reduction of inference and modeling effort. Analysis of re-estimation scores at test locations has shown that colocated cokriging with cross correlograms instead of cross covariances reduces the mean absolute prediction errors and the proportion of misclassified locations.

Cokriging can also be used to incorporate soft information about the primary attribute, such as constraint interval or prior probability distributions as derived from secondary information.

Assessment of local uncertainty

Chapter 7 is devoted to modeling uncertainty at unsampled locations, then using these uncertainty models for risk analysis and decision making.

Analysis of re-estimation scores at test locations has shown that the error variance provided by kriging algorithms is poorly correlated with actual estimation error, hence in general the kriging variance cannot be used alone as a measure of local uncertainty. The major shortcoming of the kriging variance is that it does not depend on data values. A better measure of local uncertainty is provided by a model of the probability distribution of the unknown made conditional to neighboring data values. This approach is more rigorous because local uncertainty is modeled independently of the derivation of an optimal estimate. The easiest approach for modeling conditional distributions (ccdfs) consists of transforming all original data into normal scores, then adopting a multiGaussian RF model for these normal scores. Under this particular model, all conditional distributions at any location are Gaussian with means and variances identifying the corresponding simple kriging estimates and simple kriging variances. One limitation of the multiGaussian model is that it does not allow for any significant and specific spatial correlation between extreme values, whether large or small.

Class-specific patterns of spatial correlation can be accounted for using an indicator approach that does not assume any particular shape or analytical expression for the conditional distributions. Ccdfs are modeled through a series of threshold values discretizing the range of variation of the attribute. The greater computational and inference cost of indicator algorithms, compared to multiGaussian algorithms, is balanced by their greater flexibility in incorporating various types of information, such as constraint intervals or indirect (soft) categorical or continuous data. The key step is to code each piece of information (hard or soft data) into local prior probabilities. These prior probabilities are then processed together using kriging algorithms, resulting in posterior or updated distributions.

Two major issues of the indicator approach are the correction of order relation deviations and the choice of interpolation/extrapolation models to increase the resolution of the discrete ccdf models. Models for extrapolating lower and upper tails of the ccdf are critical to the resulting statistics. Note that the multiGaussian approach also requires increasing the resolution of the sample cdf.

The uncertainty at an unsampled location can be assessed through spread measures derived from the corresponding ccdf model, for example, the local entropy, the conditional variance, or the interquartile range. Different estimates of the unknown attribute value can be retrieved from the ccdf, depending on the optimality criterion retained. Instead of adopting the arbitrary least-squares criterion that equally penalizes overestimation and underestimation, criteria should be customized to the specific problem at hand.

There are many ways to account for local uncertainty in a decision-making process such as that involving cleaning of hazardous areas. In particular, ccdfs modeled by the geostatistician can be combined with relevant economic functions to come up with expected costs for the different options considered.

Chapter 8 addresses the assessment of spatial or joint uncertainty rather than uncertainty at each single unsampled location. Stochastic simulation provides multiple realizations of the spatial distribution of the attribute values, each reproducing statistics deemed consequential for the problem at hand. Typically, simulated realizations do not show the smoothing effect characteristic of interpolated maps. Among the ever-growing repertory of simulation algorithms, this book focuses on four classes of algorithms including but not limited to the Gaussian model: sequential simulation, LU decomposition, p-field approach, and simulated annealing.

Sequential indicator simulation and simulated annealing offer greater flexibility in incorporating diverse types of information (class-specific patterns of spatial continuity, soft data, and multiple-point statistics) at the cost of larger CPU and inference requirements. Different algorithms are often used in con[...] types) can be modeled using sequential indicator simulation, then the spatial distribution of the continuous attribute (e.g., metal concentration) specific to each category can be simulated using a Gaussian-based algorithm or p-field approach. Last, simulated annealing can be used to post-process a simulated map to ensure better reproduction of the target statistics and/or to impose additional constraints that cannot be readily incorporated by other simulation algorithms.

Unlike the ccdf model that provides only a location-specific measure of uncertainty, the set of alternative realizations generated by stochastic simulation provides a measure of uncertainty about the spatial distribution of attribute values. Such uncertainty can be visualized through an animated display of realizations. Spatial uncertainty can also be displayed using probability maps, quantile maps, or conditional variance maps.

Simulated maps often serve as input to complex transfer functions, such as flow simulators in reservoir engineering or remediation processes in pollution control. The final objective is to model the uncertainty about some response value, such as travel time or remediation efficiency.

As a last remark, beware that uncertainty is not intrinsic to the phenomenon under study: rather it arises from our imperfect knowledge of that phenomenon, it is data-dependent and most importantly model-dependent, that model specifying our prior concept (decisions) about the phenomenon. No model, hence no uncertainty measure, can ever be objective: the point is to accept that limitation and document clearly all aspects of the model.

Appendix A

Fitting an LMC

Goulard (1989) proposed an iterative procedure to fit a linear model of coregionalization (LMC) under the constraint of positive semi-definiteness of the coregionalization matrices $\mathbf{B}_l$. The algorithm aims at minimizing a weighted sum of squares of differences between the experimental and model (cross) semivariogram values:

$$\mathrm{WSS} = \sum_{k=1}^{K} \sum_{i=1}^{N_v} \sum_{j=1}^{N_v} w(\mathbf{h}_k) \left[ \frac{\hat{\gamma}_{ij}(\mathbf{h}_k) - \gamma_{ij}(\mathbf{h}_k)}{\hat{\sigma}_i \cdot \hat{\sigma}_j} \right]^2 \qquad \text{(A.1)}$$

with the weight $w(\mathbf{h}_k)$ proportional to the number $N(\mathbf{h}_k)$ of pairs used in the estimate.

To prevent the variable with the largest variance from dominating the criterion WSS, each residual $[\hat{\gamma}_{ij}(\mathbf{h}_k) - \gamma_{ij}(\mathbf{h}_k)]$ is standardized by the product of standard deviations $\hat{\sigma}_i$ and $\hat{\sigma}_j$.

Using matrix notation, the criterion (A.1) is rewritten:

where the trace Tr is the sum of the diagonal elements of the matrix, and $\mathbf{V}$ is the diagonal matrix of inverse standard deviations. The matrices $\hat{\boldsymbol{\Gamma}}(\mathbf{h}_k) = [\hat{\gamma}_{ij}(\mathbf{h}_k)]$ and $\boldsymbol{\Gamma}(\mathbf{h}_k) = [\gamma_{ij}(\mathbf{h}_k)]$ are the experimental and model
semivariogram matrices. The linear model of coregionalization $\boldsymbol{\Gamma}(\mathbf{h}_k)$ is defined as

$$\boldsymbol{\Gamma}(\mathbf{h}_k) = \sum_{l=0}^{L} \mathbf{B}_l \, g_l(\mathbf{h}_k) \qquad \text{(A.3)}$$

where the number and type of basic semivariogram models $g_l(\cdot)$ are specified by the user.

Algorithm

The least-squares fit of a linear model of coregionalization is more demanding than in the univariate case. The difficulty lies in the constraint that the matrices of estimated coefficients $\mathbf{B}_l = [b^l_{ij}]$ must be positive semi-definite. The idea is to start with a set of arbitrary coregionalization matrices $\mathbf{B}_l$ and to modify one matrix at a time iteratively so as to minimize the criterion WSS under the constraint of positive semi-definiteness of that matrix. The algorithm proceeds as follows:

1. Choose initial values for the $L + 1$ coregionalization matrices $\mathbf{B}_l$.

2. Remove any one of the $L + 1$ basic semivariogram models, say, $g_{l_0}(\mathbf{h})$, and compute the difference between each experimental semivariogram matrix $\hat{\boldsymbol{\Gamma}}(\mathbf{h}_k)$ and the linear model of coregionalization including the $L$ remaining basic structures:

$$\Delta\hat{\boldsymbol{\Gamma}}_{l_0}(\mathbf{h}_k) = \hat{\boldsymbol{\Gamma}}(\mathbf{h}_k) - \sum_{l \neq l_0} \mathbf{B}_l \, g_l(\mathbf{h}_k) \qquad \text{(A.4)}$$

3. Compute the symmetric matrix $\mathbf{G}_{l_0}$:

4. Perform the spectral decomposition¹ of the matrix $\mathbf{G}_{l_0}$:

$$\mathbf{G}_{l_0} = \mathbf{Q}_{l_0} \boldsymbol{\Lambda}_{l_0} \mathbf{Q}_{l_0}^T \qquad \text{(A.6)}$$

where $\mathbf{Q}_{l_0}$ is the $N_v \times N_v$ orthogonal matrix of eigenvectors, and $\boldsymbol{\Lambda}_{l_0}$ is the diagonal matrix of eigenvalues. The matrix $\mathbf{G}_{l_0}$ is not necessarily positive semi-definite in that some eigenvalues may be negative. The positive semi-definite matrix closest (in the least-squares sense) to the matrix $\mathbf{G}_{l_0}$ is

$$\mathbf{G}^{+}_{l_0} = \mathbf{Q}_{l_0} \boldsymbol{\Lambda}^{+}_{l_0} \mathbf{Q}_{l_0}^T \qquad \text{(A.7)}$$

where $\boldsymbol{\Lambda}^{+}_{l_0}$ is the matrix $\boldsymbol{\Lambda}_{l_0}$ where all negative eigenvalues are reset to zero.

5. The coregionalization matrix $\mathbf{B}_{l_0}$ that minimizes the criterion WSS under the constraint of positive semi-definiteness is

6. Define the new index number $l_0 = l_0 + 1$ ($l_0$ is reset to zero if $l_0 > L$), and repeat steps 2 to 5 until the criterion WSS is smaller than a threshold value specified by the user.

Remarks

1. Theoretically, the procedure does not necessarily converge nor, if it does, is a unique solution assured. Experience shows, however, that the algorithm almost always converges and leads to similar results whatever the initial values of the coregionalization matrices.

2. The number of variables and basic structures are not limited. An example of application to the joint modeling of 12 variables and 3 structures is given in Goulard and Voltz (1992).

3. Beware that the number of parameters $b^l_{ij}$ to be estimated rapidly increases with the number of variables and structures. For example, this number would be 165 for ten variables and three structures. Too many parameters may cause instability in the estimation results and hence in the overall quality of the fit. One would then retain only the variables that contribute most to the objective pursued (see related discussion in section 4.3.3).
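The iterative procedure described in this appendix can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: the closed-form update used for steps 3 and 5 is derived here from the least-squares criterion (each residual divided by the product of the two standard deviations, squared, and weighted by $w(\mathbf{h}_k)$), because the corresponding displayed equations are not legible in this copy. All function names, the synthetic two-variable example, and the pair counts are hypothetical.

```python
import numpy as np

def spherical(h, a):
    """Spherical semivariogram with unit sill and range a."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.5 * r - 0.5 * r**3

def psd_part(G):
    """Closest positive semi-definite matrix: negative eigenvalues reset to zero."""
    lam, Q = np.linalg.eigh(G)
    return (Q * np.clip(lam, 0.0, None)) @ Q.T

def lmc(B, g):
    """Model matrices sum_l B_l g_l(h_k); B is (n_struct, n, n), g is (K, n_struct)."""
    return np.einsum("kl,lij->kij", g, B)

def wss(gam_hat, B, g, w, S):
    """Weighted sum of squared residuals, each scaled by 1/(sigma_i * sigma_j)."""
    R = S @ (gam_hat - lmc(B, g)) @ S
    return float(np.sum(w[:, None, None] * R**2))

def fit_lmc(gam_hat, g, w, sigma, n_sweeps=500, tol=1e-12):
    n, n_struct = gam_hat.shape[1], g.shape[1]
    S = np.diag(1.0 / sigma)          # diagonal matrix of inverse standard deviations
    S_inv = np.diag(sigma)
    B = np.stack([np.eye(n) for _ in range(n_struct)])   # step 1: arbitrary start
    for _ in range(n_sweeps):
        for l0 in range(n_struct):                       # step 6: cycle over structures
            keep = [l for l in range(n_struct) if l != l0]
            delta = gam_hat - lmc(B[keep], g[:, keep])   # step 2: remove structure l0
            # Step 3 (assumed form): weighted, rescaled sum of the differences.
            G = np.einsum("k,kij->ij", w * g[:, l0], S @ delta @ S)
            # Steps 4-5 (assumed form): clip eigenvalues, undo the scaling, normalize.
            B[l0] = S_inv @ psd_part(G) @ S_inv / np.sum(w * g[:, l0] ** 2)
        if wss(gam_hat, B, g, w, S) < tol:
            break
    return B

# Synthetic check: noise-free "experimental" values built from a known LMC.
h = 0.1 * np.arange(1, 16)
g = np.column_stack([np.ones_like(h), spherical(h, 1.0)])   # nugget + spherical
B_true = np.array([[[1.0, 0.3], [0.3, 0.5]], [[2.0, -0.5], [-0.5, 1.0]]])
gam_hat = lmc(B_true, g)
w = np.full(h.size, 100.0)                 # hypothetical pair counts N(h_k)
sigma = np.sqrt(np.array([3.0, 1.5]))      # standard deviations of the two variables
B_fit = fit_lmc(gam_hat, g, w, sigma)
print(np.round(B_fit, 4))
```

With noise-free input the fitted matrices should approach `B_true`; with real experimental semivariograms the fit is only approximate, and remark 1 above still applies. The count quoted in remark 3 follows from the symmetry of each matrix: ten variables give 10 x 11 / 2 = 55 distinct coefficients per structure, hence 165 for three structures.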
¹For the reader not familiar with matrix algebra, a brief account of the determination of

it should not be the sole criterion for selecting the final linear model of

Appendix B

List of Acronyms and Notation

B.1 Acronyms

ccdf: conditional cumulative distribution function
cdf: cumulative distribution function
CKT: cokriging with trend model, also known as "universal" cokriging
E-type: conditional expectation estimate
FKA: factorial kriging analysis
IK: indicator kriging
IRF-k: intrinsic random functions of order k
KED: kriging with an external drift
KT: kriging with a trend model, also known as "universal" kriging
KWS: kriging within strata
LMC: linear model of coregionalization
LS: least squares
MAP: maximum a posteriori model
MG: multiGaussian (algorithm or model)
mIK: median indicator kriging
M-type: conditional median estimate
OCK: ordinary cokriging
oICK: ordinary indicator cokriging
oIK: ordinary indicator kriging
OK: ordinary kriging
PCK: principal component kriging
PK: probability kriging
pdf: probability density function
P-P plot: probability-probability plot
Q-Q plot: quantile-quantile plot
RF: random function
RV: random variable
SCK: simple cokriging
sGs: sequential Gaussian simulation
sIK: simple indicator kriging
sis: sequential indicator simulation
SK: simple kriging
SKlm: simple kriging with varying local means

B.2 Common notation

$\forall$: whatever

$A$: study area

$a$: range parameter

$a_k$: coefficient of the $k$th component of the trend model $m(\mathbf{u})$

$B(z)$: Markov-Bayes calibration parameter

$b^l$: coefficient of the basic covariance model $c_l(\mathbf{h})$ or semivariogram model $g_l(\mathbf{h})$ in the linear model of regionalization for variable $Z(\mathbf{u})$

$b^l_{ij}$: coefficient of the basic covariance model $c_l(\mathbf{h})$ or semivariogram model $g_l(\mathbf{h})$ in the linear model of coregionalization between variables $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u})$

$\mathbf{B}_l$: coregionalization matrix including the coefficients $b^l_{ij}$ of the basic covariance model $c_l(\mathbf{h})$ in the linear model of coregionalization $\mathbf{C}(\mathbf{h}) = \sum_{l=0}^{L} \mathbf{B}_l \, c_l(\mathbf{h})$

$C(0)$: covariance value at separation distance $|\mathbf{h}| = 0$. It is also the stationary variance of the RV $Z(\mathbf{u})$

$C(\mathbf{u}, \mathbf{u}')$: non-stationary covariance between the RVs $Z(\mathbf{u})$ and $Z(\mathbf{u}')$

$C(\mathbf{h})$: stationary covariance of the RF $Z(\mathbf{u})$ for lag vector $\mathbf{h}$

$\mathbf{C}(\mathbf{h})$: covariance function matrix (dimension $N_v \times N_v$)

$\mathbf{C} = \mathbf{C}(0)$: variance-covariance matrix

$C(\mathbf{h}) = \sum_{l=0}^{L} b^l c_l(\mathbf{h})$: linear model of regionalization

$C_{ij}(\mathbf{h})$: stationary cross covariance between the two RFs $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u})$ for lag vector $\mathbf{h}$

$C_{ij}(\mathbf{h}) = \sum_{l=0}^{L} b^l_{ij} c_l(\mathbf{h})$: linear model of coregionalization

$C_I(\mathbf{h}; z_k)$: stationary indicator covariance for lag vector $\mathbf{h}$ and threshold $z_k$ of the RF $Z(\mathbf{u})$; it is the covariance of the binary indicator RF $I(\mathbf{u}; z_k)$.

$C_I(\mathbf{h}; z_k, z_{k'})$: stationary indicator cross covariance for lag vector $\mathbf{h}$ and thresholds $z_k$ and $z_{k'}$ of the RF $Z(\mathbf{u})$; it is the cross covariance between the two indicator RFs $I(\mathbf{u}; z_k)$ and $I(\mathbf{u}; z_{k'})$.
$F(\mathbf{u}; z)$: non-stationary cumulative distribution function of the RV $Z(\mathbf{u})$

$F(\mathbf{u}; z|(n))$: non-stationary conditional cumulative distribution function of the continuous RV $Z(\mathbf{u})$ given neighboring information, such as realizations of $n$ other RVs (called data)

$F(\mathbf{u}; s_k|(n))$: non-stationary conditional cumulative distribution function of the categorical RV $S(\mathbf{u})$ given the realizations of $n$ other neighboring RVs (called data)

$F(\mathbf{u}_1, \ldots, \mathbf{u}_N; z_1, \ldots, z_N)$: $N$-variate cumulative distribution function of the $N$ RVs $Z(\mathbf{u}_1), \ldots, Z(\mathbf{u}_N)$

$F(\mathbf{h}; z, z')$: stationary "two-point" cumulative distribution function of the RF $Z(\mathbf{u})$

$F(z)$: cumulative distribution function of a RV $Z$, or stationary cumulative distribution function of a RF $Z(\mathbf{u})$

$F^{-1}(p)$: inverse cumulative distribution function or quantile function for the probability value $p \in [0, 1]$

$F_{ij}(\mathbf{h}; z_i, z_j)$: stationary "two-point" joint cumulative distribution function of RFs $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u})$

$F_{ij}(z_i, z_j)$: joint cumulative distribution function of the two RVs $Z_i$ and $Z_j$, or stationary joint cumulative distribution function of the two RFs $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u})$

$F_V(\mathbf{u}; z|(n))$: non-stationary conditional cumulative distribution function of the continuous RV $Z_V(\mathbf{u})$ defined over the block $V(\mathbf{u})$

$G(\mathbf{u}; y|(n))$: non-stationary conditional cumulative distribution function of the standard normal RV $Y(\mathbf{u})$ given the information $(n)$ available, for example, realizations of $n$ other neighboring RVs $Y(\mathbf{u}_\alpha)$

$G(\mathbf{h}; y, y')$: stationary "two-point" cdf of the multivariate Gaussian RF $Y(\mathbf{u})$

$\boldsymbol{\Gamma}(\mathbf{h})$: semivariogram matrix (dimension $N_v \times N_v$)

$\gamma(\mathbf{u}, \mathbf{u}')$: non-stationary semivariogram between the two RVs $Z(\mathbf{u})$ and $Z(\mathbf{u}')$

$\gamma_{ij}(\mathbf{h})$: stationary cross semivariogram between the two RFs $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u})$ for lag vector $\mathbf{h}$

$\gamma_I(\mathbf{h}; z_k)$: stationary indicator semivariogram for lag vector $\mathbf{h}$ and threshold $z_k$ of the RF $Z(\mathbf{u})$; it is the semivariogram of the binary indicator RF $I(\mathbf{u}; z_k)$.

$I(\mathbf{u}; z)$: binary indicator RF at location $\mathbf{u}$ and for threshold $z$
$\lambda_{\alpha_i}(\mathbf{u})$: cokriging weight associated to $z_i$-datum at location $\mathbf{u}_{\alpha_i}$ for estimation of the attribute $z$ at location $\mathbf{u}$

$\lambda^{SK}_{\alpha}(\mathbf{u})$: simple kriging weight associated to $z$-datum at location $\mathbf{u}_\alpha$ for estimation of the attribute $z$ at location $\mathbf{u}$. The same type of notation applies to other algorithms, for example, OK, KT.

$\lambda^{SCK}_{\alpha_i}(\mathbf{u})$: simple cokriging weight associated to $z_i$-datum at location $\mathbf{u}_{\alpha_i}$ for estimation of the primary attribute $z_1$ at location $\mathbf{u}$. The same type of notation applies to other algorithms, for example, OCK.

$\lambda^{OK}_{\alpha m}(\mathbf{u})$: OK weight associated to $z$-datum at location $\mathbf{u}_\alpha$ for estimation of the trend component $m_{OK}(\mathbf{u})$. The same type of notation applies to KT and KED.

$m$: stationary mean of the RF $Z(\mathbf{u})$

$\mathbf{m}$: vector of stationary means (dimension $N_v$)

$m^*_{OK}(\mathbf{u})$: OK estimate of the trend component at location $\mathbf{u}$. The same type of notation applies to KT and KED.

$\mu_{OK}(\mathbf{u})$: Lagrange parameter for ordinary kriging at location $\mathbf{u}$. The same type of notation applies to other algorithms, for example, KT and KED.

$N(\mathbf{h})$: number of pairs of data values available at lag vector $\mathbf{h}$

$N_v$: number of variables $Z_i$

$n$: number of data values $s(\mathbf{u}_\alpha)$ or $z(\mathbf{u}_\alpha)$ available over the area $A$

$n(\mathbf{u})$: number of data values $z(\mathbf{u}_\alpha)$ used for estimation of the attribute $z$ at location $\mathbf{u}$

$n_i(\mathbf{u})$: number of data values $z_i(\mathbf{u}_{\alpha_i})$ used for estimation of the attribute $z$ at location $\mathbf{u}$

$\nu_{ij}(\mathbf{h})$: stationary codispersion coefficient between variables $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u} + \mathbf{h})$ separated by lag vector $\mathbf{h}$

$p(\mathbf{u}; s_k)$: probability for the category $s_k$ to prevail at location $\mathbf{u}$

$p(\mathbf{u}; s_k|(n))$: conditional probability for the category $s_k$ to prevail at location $\mathbf{u}$ given the neighboring information $(n)$

$p_k$: global proportion of category $s_k$ within the area $A$

$q_p$: $p$-quantile of the cumulative distribution function $F(z)$, $q_p = F^{-1}(p)$

$R(\mathbf{u}) = \sum_{l=0}^{L} Z^l(\mathbf{u})$: decomposition of the residual component model into $(L + 1)$ spatial components $Z^l(\mathbf{u})$ corresponding to the decomposition of the residual covariance $C_R(\mathbf{h}) = \sum_{l=0}^{L} b^l c_l(\mathbf{h})$

$\rho_I(\mathbf{h}; z_k)$: stationary indicator correlogram for lag vector $\mathbf{h}$ and threshold $z_k$ of the RF $Z(\mathbf{u})$; it is the correlogram of the binary indicator RF $I(\mathbf{u}; z_k)$.

$\rho_I(\mathbf{h}; z_k, z_{k'})$: stationary indicator cross correlogram for lag vector $\mathbf{h}$ and thresholds $z_k$ and $z_{k'}$ of the RF $Z(\mathbf{u})$; it is the cross correlogram between the two indicator RFs $I(\mathbf{u}; z_k)$ and $I(\mathbf{u}; z_{k'})$.

$\rho^I_{ij}(\mathbf{h}; z_{ik}, z_{jk'})$: stationary indicator cross correlogram for lag vector $\mathbf{h}$ and thresholds $z_{ik}$ and $z_{jk'}$ of the RFs $Z_i(\mathbf{u})$ and $Z_j(\mathbf{u})$; it is the cross correlogram between the two indicator RFs $I(\mathbf{u}; z_{ik})$ and $I(\mathbf{u}; z_{jk'})$.

$S(\mathbf{u})$: generic categorical RV at location $\mathbf{u}$, or a generic categorical RF at location $\mathbf{u}$.

$\mathrm{Sph}(\mathbf{h}/a)$: spherical semivariogram model of range $a$, a function of the separation vector $\mathbf{h}$
$s(\mathbf{u}_\alpha)$: $s$-datum value at location $\mathbf{u}_\alpha$

$\sigma^2_{SK}(\mathbf{u})$: simple kriging variance associated with the simple kriging estimate $z^*_{SK}(\mathbf{u})$ at location $\mathbf{u}$. The same type of notation applies to other algorithms, for example, OK and KT.

$z_{ik}$: $k$th threshold value for the continuous attribute $z_i$

$z_V(\mathbf{u})$: average value of attribute $z$ over a block $V$ centered at $\mathbf{u}$

$z^*(\mathbf{u})$: an estimate of value $z(\mathbf{u})$

$z^*_{SK}(\mathbf{u})$: simple kriging estimate of value $z(\mathbf{u})$. The same type of notation applies to other algorithms, for example, OK and KT.

$z^{(1)*}_{SCK}(\mathbf{u})$: simple cokriging estimate of the primary attribute $z_1$ at location $\mathbf{u}$. The same type of notation applies to other multivariate algorithms, for example, OCK.

$z^*_E(\mathbf{u})$: E-type estimate of value $z(\mathbf{u})$, obtained as an arithmetic average of multiple simulated realizations $z^{(l)}(\mathbf{u})$ of the RF $Z(\mathbf{u})$

$Z = \varphi^{-1}(Y)$: inverse transform function $\varphi(\cdot)$ relating two RVs $Z$ and $Y$

$Z(\mathbf{u})$: generic continuous RV at location $\mathbf{u}$, or a generic continuous RF at location $\mathbf{u}$

$Z^*_{SK}(\mathbf{u})$: simple kriging estimator of $Z(\mathbf{u})$. The same type of notation applies to other algorithms, for example, OK and KT.

$\{Z(\mathbf{u}), \mathbf{u} \in A\}$: set of random variables $Z(\mathbf{u})$ defined at each location $\mathbf{u}$ of the area $A$

$z_i(\mathbf{u}_{\alpha_i})$: $z_i$-datum value at location $\mathbf{u}_{\alpha_i}$

Appendix C

The Jura data


Data set provided by J.-P. Dubois, IATE-Pédologie, Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland. Used with permission.

C.1 Prediction and validation sets

Spatial coordinates and values of categorical and continuous attributes at the 359 sampled sites. The 100 test locations are denoted with a star.
Rock types: 1: Argovian, 2: Kimmeridgian, 3: Sequanian, 4: Portlandian, 5: Quaternary. Land uses: 1: forest, 2: pasture, 3: meadow, 4: tillage.

X      Y      Rock  Land  Cd     Cu     Pb     Co     Cr     Ni     Zn
km     km     type  use   ppm    ppm    ppm    ppm    ppm    ppm    ppm
2.386  3.077  3     3     1.740  25.72  77.36  9.32   38.32  21.32  92.56
2 544 1.972 2 1335 24.76 778s i n oo 40 ?o zn 72 7 3 56
2.807 9 317 3 1 510 RHH 30 80 III R i l 4 7 011 2 1 40 6 4 80
C.2 Transect data set
Spatial coordinates and values of primary and secondary attributes measured along the NE-SW transect. Rock types: 1: Argovian, 2: Kimmeridgian, 3: Sequanian, 4: Portlandian, 5: Quaternary. A dash denotes missing values.

X      Cd     Ni     Rock  Block Ni
1.00   0.910  18.00  3     18.814

Bibliography

Abramowitz, M. and I. A. Stegun, editors. 1972. Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables, 9th (revised) printing. Dover, New York. 1046 p.

Alabert, F. G. 1987a. The practice of fast conditional simulations through the LU decomposition of the covariance matrix. Mathematical Geology, 19(5):369-386.

Alabert, F. G. 1987b. Stochastic Imaging of Spatial Distributions Using Hard and Soft Information. Master's thesis, Stanford University, Stanford, CA.

Alabert, F. G. and G. J. Massonnat. 1990. Heterogeneity in a complex turbiditic reservoir: Stochastic modelling of facies and petrophysical variability. In 65th Annual Technical Conference and Exhibition, number 20604, pages 775-790. Society of Petroleum Engineers.

Almeida, A. S. 1993. Joint Simulation of Multiple Variables with a Markov-type Coregionalization Model. Doctoral dissertation, Stanford University, Stanford, CA.

Almeida, A. S. and A. G. Journel. 1994. Joint simulation of multiple variables with a Markov-type coregionalization model. Mathematical Geology, 26(5):565-588.

Anderson, T. 1958. An Introduction to Multivariate Statistical Analysis. John Wiley & Sons, New York.

Atteia, O., J.-P. Dubois, and R. Webster. 1994. Geostatistical analysis of soil contamination in the Swiss Jura. Environmental Pollution, 86:315-327.

Barnes, R. J. 1991. The variogram sill and the sample variance. Mathematical Geology, 23:673-678.

Barnes, R. J. and T. Johnson. 1984. Positive kriging. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, volume 1, pages 231-244. Reidel, Dordrecht.

Borgman, L., M. Taheri, and R. Hagan. 1984. Three-dimensional frequency-domain simulations of geological variables. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, volume 1, pages 517-541. Reidel, Dordrecht.

Bourgault, G. 1994. Robustness of noise filtering by kriging analysis. Mathematical Geology, 26(6):733-752.

Bourgault, G. and A. G. Journel. 1994. Gaussian or indicator-based simulation? Which variogram is more relevant? In International Association for Mathematical Geology Annual Conference, pages 32-37.

Bourgault, G., A. G. Journel, S. M. Lesch, J. D. Rhoades, and D. L. Corwin. 1995. Geostatistical analysis of a soil salinity data set. In Application of GIS to the Modeling of Non-point Source Pollutants in the Vadose Zone, pages 53-114, ASA-CSSA-SSSA Bouyoucos Conference, Mission Inn, Riverside, CA.

Bourgault, G. and D. Marcotte. 1991. Multivariable variogram and its application to the linear model of coregionalization. Mathematical Geology, 23(7):899-928.

Chilès, J. and A. Guillen. 1984. Variogrammes et krigeages pour la gravimétrie et le magnétisme. In J.-J. Royer, editor, Computers in Earth Sciences for Natural Resources Characterization, volume 20, pages 455-468. Sciences de la Terre, Nancy, France.

Christakos, G. 1984. On the problem of permissible covariance and variogram models. Water Resources Research, 20(2):251-265.

Christakos, G. 1990. A Bayesian/maximum-entropy view to the spatial estimation problem. Mathematical Geology, 22(7):763-777.

Chu, J. 1993. Xgam: A 3-D interactive graphic software for modeling variograms and crossvariograms under conditions of positive definiteness. In Report 6, Stanford Center for Reservoir Forecasting, Stanford, CA.

Chu, J. 1996. Fast sequential indicator simulation: Beyond reproduction of indicator variograms. Mathematical Geology, 28(7):923-936.

Chu, J. and A. G. Journel. 1994. Conditional fBm simulation with dual kriging. In R. Dimitrakopoulos, editor, Geostatistics for the Next Century, pages 407-421. Kluwer, Dordrecht.

Chu, J., W. Xu, H. Zhu, and A. G. Journel. 1991. The Amoco case study. In Report 4, Stanford Center for Reservoir Forecasting, Stanford, CA.

Clark, I., K. L. Basinger, and W. V. Harper. 1989. MUCK: A novel approach to cokriging. In B. E. Buxton, editor, Proceedings of the Conference on Geostatistical, Sensitivity, and Uncertainty Methods for Ground-water Flow and Radionuclide Transport Modeling, pages 473-493. Battelle Press.

Cressie, N. 1985. Fitting variogram models by weighted least squares. Mathematical Geology, 17(5):563-580.

Daly, C., C. Lajaunie, and D. Jeulin. 1989. Application of multivariate kriging to the processing of noisy images. In M. Armstrong, editor, Geostatistics, volume 2, pages 749-760. Kluwer, Dordrecht.

Damsleth, E., C. B. Tjølsen, K. H. Omre, and H. H. Haldorsen. 1990. A two-stage stochastic model applied to a North Sea reservoir. In 65th Annual Technical Conference and Exhibition, pages 791-802. Society of Petroleum Engineers.

David, M. 1977. Geostatistical Ore Reserve Estimation. Elsevier, Amsterdam, 364 p.

Davis, B. M. 1987. Uses and abuses of cross validation in geostatistics. Mathematical Geology, 19(3):241-248.

Davis, B. M. and K. A. Greenes. 1983. Estimating using spatially distributed multivariate data: An example with coal quality. Mathematical Geology, 15(2):287-300.

Davis, J. C. 1986. Statistics and Data Analysis in Geology, 2nd edition. John Wiley & Sons, New York, 646 p.

Davis, M. 1987. Production of conditional simulations via the LU decomposition of the covariance matrix. Mathematical Geology, 19(2):91-98.

Delfiner, P. 1976. Linear estimation of non-stationary spatial phenomena. In M. Guarascio, M. David, and C. J. Huijbregts, editors, Advanced Geostatistics in the Mining Industry, pages 49-68. Reidel, Dordrecht.

Deutsch, C. V. 1989. DECLUS: A Fortran 77 program for determining optimum spatial declustering weights. Computers & Geosciences, 15(3):325-332.

Deutsch, C. V. 1994a. Algorithmically-defined random function models. In R. Dimitrakopoulos, editor, Geostatistics for the Next Century, pages 422-435. Kluwer, Dordrecht.

Deutsch, C. V. 1994b. Constrained modeling of histograms and cross plots with simulated annealing. In Report 7, Stanford Center for Reservoir Forecasting, Stanford, CA.

Deutsch, C. V. and P. W. Cockerham. 1994. Practical considerations in the application of simulated annealing to stochastic simulation. Mathematical Geology, 26(1):67-82.

Deutsch, C. V. and A. G. Journel. 1992a. GSLIB: Geostatistical Software Library and User's Guide. Oxford University Press, New York, 340 p.

Deutsch, C. V. and A. G. Journel. 1992b. Annealing techniques applied to the integration of geological and engineering data. In Report 5, Stanford Center for Reservoir Forecasting, Stanford, CA.

Doyen, P. M., T. M. Guidish, and M. de Buyl. 1989. Seismic discrimination of lithology in sand/shale reservoirs: A Bayesian approach. In SEG 59th Annual Meeting, Dallas, TX.

Cressie, N. 1985 and the entries above are listed alphabetically by first author.
BIBLIOGRAPHY

Dubrule, O. 1983. Two methods with different objectives: Splines and kriging. Mathematical Geology, 15(2):245-257.
Edwards, C. and D. Penney. 1982. Calculus and Analytical Geometry. Prentice-Hall, Englewood Cliffs, N.J.
Englund, E. and A. Sparks. 1991. Geo-EAS 1.2.1 Geostatistical Environmental Assessment Software, User's Guide, EPA Report # 600/8-91/008. U.S. EPA, EMS Lab, Las Vegas, NV.
Farmer, C. 1988. The generation of stochastic fields of reservoir parameters with specified geostatistical distributions. In S. Edwards and P. R. King, editors, Mathematics in Oil Production, pages 235-252. Clarendon Press, Oxford.
Farmer, C. 1992. Numerical rocks. In P. R. King, editor, The Mathematical Generation of Reservoir Geology. Clarendon Press, Oxford.
FOEFL (Swiss Federal Office of Environment and Landscape). 1987. Commentary on the Ordinance Relating to Pollutants in Soil (VSBo of June 9, 1986). FOEFL, Bern.
Froidevaux, R. 1990. Geostatistical Toolbox Primer, version 1.50. FSS International, Troinex, Switzerland.
Froidevaux, R. 1993. Probability field simulation. In A. Soares, editor, Geostatistics Tróia '92, volume 1, pages 73-84. Kluwer Academic Publishers, Dordrecht.
Galli, A., F. Gerdil-Neuillet, and C. Dadou. 1984. Factorial kriging analysis: A substitute to spectral analysis of magnetic data. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, volume 2, pages 543-557. Reidel, Dordrecht.
Galli, A. and G. Meunier. 1987. Study of a gas reservoir using the external drift method. In G. Matheron and M. Armstrong, editors, Geostatistical Case Studies, pages 105-119. Reidel, Dordrecht.
Geman, S. and D. Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6(6):721-741.
Glacken, I. 1996. Change of support by direct conditional block simulation. Master's thesis, Stanford University, Stanford, CA.
Gómez-Hernández, J. 1991. A Stochastic Approach to the Simulation of Block Conductivity Fields Conditioned upon Data Measured at a Smaller Scale. Doctoral dissertation, Stanford University, Stanford, CA.
Gómez-Hernández, J. and E. Cassiraga. 1994. Theory and practice of sequential simulation. In M. Armstrong and P. A. Dowd, editors, Geostatistical Simulations, pages 111-124. Kluwer Academic Publishers, Dordrecht.
Gómez-Hernández, J. and A. G. Journel. 1993. Joint sequential simulation of multiGaussian fields. In A. Soares, editor, Geostatistics Tróia '92, volume 1, pages 85-94. Kluwer Academic Publishers, Dordrecht.
Gómez-Hernández, J. and R. M. Srivastava. 1990. ISIM3D: An ANSI-C three-dimensional multiple indicator conditional simulation program. Computers & Geosciences, 16(4):395-440.
Gómez-Hernández, J. and X. Wen. 1994. To be or not to be multiGaussian? That is the question. In Report 7, Stanford Center for Reservoir Forecasting, Stanford, CA.
Goovaerts, P. 1992. Factorial kriging analysis: A useful tool for exploring the structure of multivariate spatial soil information. Journal of Soil Science, 43(4):597-619.
Goovaerts, P. 1993. Spatial orthogonality of the principal components computed from coregionalized variables. Mathematical Geology, 25(3):281-302.
Goovaerts, P. 1994a. Comparative performance of indicator algorithms for modeling conditional probability distribution functions. Mathematical Geology, 26(3):389-411.
Goovaerts, P. 1994b. Comparison of CoIK, IK and mIK performances for modeling conditional probabilities of categorical variables. In R. Dimitrakopoulos, editor, Geostatistics for the Next Century, pages 18-29. Kluwer, Dordrecht.
Goovaerts, P. 1994c. On a controversial method for modeling a coregionalization. Mathematical Geology, 26(2):197-204.
Goovaerts, P. 1994d. Study of spatial relationships between two sets of variables using multivariate geostatistics. Geoderma, 62:93-107.
Goovaerts, P. 1996. Stochastic simulation of categorical variables using a classification algorithm and simulated annealing. Mathematical Geology, 28(7):909-921.
Goovaerts, P. 1997. Ordinary cokriging revisited. Mathematical Geology, 29, in press.
Goovaerts, P. and C. Chiang. 1993. Temporal persistence of spatial patterns for mineralizable nitrogen and selected soil properties. Soil Science Society of America Journal, 57(2):372-381.
Goovaerts, P. and A. G. Journel. 1995. Integrating soil map information in modelling the spatial variation of continuous soil properties. European Journal of Soil Science, 46(3):397-414.
Goovaerts, P. and A. G. Journel. 1996. Accounting for local probabilities in stochastic modeling of facies data. SPE Journal, 1(1):21-29.
Goovaerts, P. and Ph. Sonnet. 1993. Study of spatial and temporal variations of hydrogeochemical variables using factorial kriging analysis. In A. Soares, editor, Geostatistics Tróia '92, volume 2, pages 745-756. Kluwer Academic Publishers, Dordrecht.
Goovaerts, P., Ph. Sonnet, and A. Navarre. 1993. Factorial kriging analysis of springwater contents in the Dyle river basin, Belgium. Water Resources Research, 29(7):2115-2125.
Goovaerts, P. and R. Webster. 1994. Scale-dependent correlation between topsoil copper and cobalt concentrations in Scotland. European Journal of Soil Science, 45(1):79-95.
Goulard, M. 1989. Inference in a coregionalization model. In M. Armstrong, editor, Geostatistics, volume 1, pages 397-408. Kluwer, Dordrecht.
Goulard, M. and M. Voltz. 1992. Linear coregionalization model: Tools for estimation and choice of cross-variogram matrix. Mathematical Geology, 24(3):269-286.
Grzebyk, M. 1993. Ajustement d'une Corégionalisation Stationnaire. Doctoral dissertation, Ecole Nationale Supérieure des Mines de Paris.
Guardiano, F. and R. M. Srivastava. 1993. Multivariate geostatistics: Beyond bivariate moments. In A. Soares, editor, Geostatistics Tróia '92, volume 1, pages 133-144. Kluwer Academic Publishers, Dordrecht.
Haldorsen, H. H., P. J. Brand, and C. J. Macdonald. 1988. Review of the stochastic nature of reservoirs. In S. Edwards and P. R. King, editors, Mathematics in Oil Production, pages 109-209. Clarendon Press, Oxford.
Haslett, J., R. Bradley, P. S. Craig, G. Wills, and A. R. Unwin. 1991. Dynamic graphics for exploring spatial data, with application to locating global and local anomalies. American Statistician, 45:234-242.
Hudson, G. 1993. Kriging temperature in Scotland using the external drift method. In A. Soares, editor, Geostatistics Tróia '92, volume 2, pages 577-588. Kluwer Academic Publishers, Dordrecht.
Isaaks, E. H. 1984. Risk Qualified Mappings for Hazardous Waste Sites: A Case Study in Distribution-free Geostatistics. Master's thesis, Stanford University, Stanford, CA.
Isaaks, E. H. 1990. The Application of Monte Carlo Methods to the Analysis of Spatially Correlated Data. Doctoral dissertation, Stanford University, Stanford, CA.
Isaaks, E. H. and R. M. Srivastava. 1989. An Introduction to Applied Geostatistics. Oxford University Press, New York, 561 p.
Journel, A. G. 1980. The lognormal approach to predicting local distributions of selective mining unit grades. Mathematical Geology, 12(4):285-303.
Journel, A. G. 1983. Non-parametric estimation of spatial distributions. Mathematical Geology, 15(3):445-468.
Journel, A. G. 1984a. mAD and conditional quantile estimators. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, volume 2, pages 261-270. Reidel, Dordrecht.
Journel, A. G. 1984b. The place of non-parametric geostatistics. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, volume 1, pages 307-355. Reidel, Dordrecht.
Journel, A. G. 1986a. Geostatistics: Models and tools for the earth sciences. Mathematical Geology, 18(1):119-140.
Journel, A. G. 1986b. Constrained interpolation and qualitative information. Mathematical Geology, 18(3):269-286.
Journel, A. G. 1987. Geostatistics for the environmental sciences, EPA project no. cr 811893. Technical report, U.S. EPA, EMS Lab, Las Vegas, NV.
Journel, A. G. 1989. Fundamentals of Geostatistics in Five Lessons. Volume 8 Short Course in Geology. American Geophysical Union, Washington, D.C.
Journel, A. G. 1994a. Modeling uncertainty: Some conceptual thoughts. In R. Dimitrakopoulos, editor, Geostatistics for the Next Century, pages 30-43. Kluwer, Dordrecht.
Journel, A. G. 1994b. Resampling from stochastic simulations. Environmental and Ecological Statistics, 1:63-84.
Journel, A. G. 1995. Probability fields: Another look and a proof. In Report 8, Stanford Center for Reservoir Forecasting, Stanford, CA.
Journel, A. G. and F. G. Alabert. 1988. Focusing on spatial connectivity of extreme valued attributes: Stochastic indicator models of reservoir heterogeneities. SPE paper # 18324.
Journel, A. G. and C. V. Deutsch. 1993. Entropy and spatial disorder. Mathematical Geology, 25(3):329-355.
Journel, A. G. and C. J. Huijbregts. 1978. Mining Geostatistics. Academic Press, New York, 600 p.
Journel, A. G. and E. H. Isaaks. 1984. Conditional indicator simulation: Application to a Saskatchewan uranium deposit. Mathematical Geology, 16(7):685-718.
Journel, A. G. and D. Posa. 1990. Characteristic behavior and order relations for indicator variograms. Mathematical Geology, 22(8):1011-1025.
Journel, A. G. and S. E. Rao. 1996. Deriving conditional distributions from ordinary kriging. In Report 9, Stanford Center for Reservoir Forecasting, Stanford, CA.

Journel, A. G. and M. E. Rossi. 1989. When do we need a trend model in kriging? Mathematical Geology, 21(7):715-739.
Journel, A. G. and W. Xu. 1994. Posterior identification of histograms conditional to local data. Mathematical Geology, 26(3):323-359.
Krige, D. G. 1951. A Statistical Approach to Some Mine Valuations and Allied Problems at the Witwatersrand. Master's thesis, University of Witwatersrand.
Luenberger, D. G. 1969. Optimization by Vector Space Methods. John Wiley & Sons, New York.
Luster, G. R. 1985. Raw Materials for Portland Cement: Applications of Conditional Simulation of Coregionalization. Doctoral dissertation, Stanford University, Stanford, CA.
Ma, Y. Z. and J. J. Royer. 1988. Local geostatistical filtering: Application to remote sensing. Sciences de la Terre, Série Informatique, 27:17-36.
Mallet, J. L. 1980. Régression sous contraintes linéaires: Application au codage des variables aléatoires. Revue de Statistique Appliquée, 28(1):57-68.
Marcotte, D. 1995. Conditional simulation with data subject to measurement error: Post-simulation filtering with modified factorial kriging. Mathematical Geology, 27(6):749-762.
Maréchal, A. 1984. Kriging seismic data in presence of faults. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, pages 271-294. Reidel, Dordrecht.
Matheron, G. 1970. La Théorie des Variables Régionalisées et ses Applications. Fascicule 5, Les Cahiers du Centre de Morphologie Mathématique, Ecole des Mines de Paris, Fontainebleau, 212 p.
Matheron, G. 1979. Recherche de simplification dans un problème de cokrigeage. Internal note N-628, Centre de Géostatistique, Fontainebleau.
Matheron, G. 1982. Pour une analyse krigeante de données régionalisées. Internal note N-732, Centre de Géostatistique, Fontainebleau.
Matheron, G. 1989. Estimating and Choosing. Springer-Verlag, Berlin, 141 p.
Matheron, G., H. Beucher, C. de Fouquet, A. Galli, D. Guerillot, and C. Ravenne. 1987. Conditional simulation of the geometry of fluvio-deltaic reservoirs. SPE paper # 16753.
Murray, C. 1992. Geostatistical Applications in Petroleum Geology and Sedimentary Geology. Doctoral dissertation, Stanford University, Stanford, CA.
Myers, D. E. 1982. Matrix formulation of co-kriging. Mathematical Geology, 14(3):249-257.
Myers, D. E. 1991. Pseudo-cross variograms, positive-definiteness, and cokriging. Mathematical Geology, 23(6):805-816.
Papritz, A. and H. Flühler. 1994. Temporal change of spatially autocorrelated soil properties: Optimal estimation by cokriging. Geoderma, 62:29-43.
Papritz, A., H. R. Künsch, and R. Webster. 1993. On the pseudo cross-variogram. Mathematical Geology, 25(8):1015-1026.
Posa, D. 1989. Conditioning of the stationary kriging matrices for some well-known covariance models. Mathematical Geology, 21(7):755-765.
Press, W. H., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. 1986. Numerical Recipes. Cambridge University Press, New York.
Ripley, B. D. 1981. Spatial Statistics. John Wiley & Sons, New York, 252 p.
Ripley, B. D. 1987. Stochastic Simulation. John Wiley & Sons, New York, 237 p.
Rouhani, S. and H. Wackernagel. 1990. Multivariate geostatistical approach to space-time data analysis. Water Resources Research, 26(4):585-591.
Sandjivy, L. 1984. The factorial kriging analysis of regionalized data. Its application to geochemical prospecting. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, pages 559-571. Reidel, Dordrecht.
Séguret, S. and P. Huchon. 1990. Trigonometric kriging: A new method for removing the diurnal variation from geomagnetic data. Journal of Geophysical Research, 95(13):21,383-21,397.
Shannon, C. E. 1948. A mathematical theory of communication. Bell System Technical Journal, 27:379-623.
Silverman, B. W. 1986. Density Estimation for Statistics and Data Analysis. Chapman and Hall, New York.
Soares, A. 1992. Geostatistical estimation of multi-phase structures. Mathematical Geology, 24(2):149-160.
Sousa, A. J. 1989. Geostatistical data analysis: An application to ore typology. In M. Armstrong, editor, Geostatistics, volume 2, pages 851-860. Kluwer, Dordrecht.
Srivastava, R. M. 1987a. Minimum variance or maximum profitability? CIM Bulletin, 80(901):63-68.
Srivastava, R. M. 1987b. A Non-ergodic Framework for Variogram and Covariance Functions. Master's thesis, Stanford University, Stanford, CA.
Srivastava, R. M. 1992. Reservoir characterization with probability field simulation. In SPE Annual Conference and Exhibition, Washington, D.C., number 24753, pages 927-938, Washington, D.C. Society of Petroleum Engineers.
Srivastava, R. M. 1994. The visualization of spatial uncertainty. In J. M. Yarus and R. L. Chambers, editors, Stochastic Modeling and Geostatistics: Principles, Methods, and Case Studies, volume 3 of AAPG Computer Applications in Geology, pages 339-345.
Srivastava, R. M. and H. M. Parker. 1989. Robust measures of spatial continuity. In M. Armstrong, editor, Geostatistics, pages 295-308. Kluwer, Dordrecht.
Stein, A., M. Hoogerwerf, and J. Bouma. 1988. Use of soil-map delineations to improve (co)kriging of point data on moisture deficits. Geoderma, 43:163-177.
Sullivan, J. 1984. Conditional recovery estimation through probability kriging: Theory and practice. In G. Verly, M. David, A. G. Journel, and A. Maréchal, editors, Geostatistics for Natural Resources Characterization, pages 365-384. Reidel, Dordrecht.
Sullivan, J. 1985. Non-parametric Estimation of Spatial Distributions. Doctoral dissertation, Stanford University, Stanford, CA.
Suro-Pérez, V. 1991. Generation of a turbiditic reservoir: The Boolean alternative. In Report 4, Stanford Center for Reservoir Forecasting, Stanford, CA.
Suro-Pérez, V. 1993. Indicator principal component kriging: The multivariate case. In A. Soares, editor, Geostatistics Tróia '92, volume 1, pages 441-454. Kluwer Academic Publishers, Dordrecht.
Suro-Pérez, V. and A. G. Journel. 1991. Indicator principal component kriging. Mathematical Geology, 23(5):759-788.
Tran, T. 1994. Improving variogram reproduction on dense simulation grids. Computers & Geosciences, 20(7):1161-1168.
Van Meirvenne, M., K. Scheldeman, G. Baert, and G. Hofman. 1994. Quantification of soil textural fractions of Bas-Zaïre using soil map polygons and/or point observations. Geoderma, 62:69-82.
Verly, G. 1986. MultiGaussian kriging - A complete case study. In R. V. Ramani, editor, Proceedings of the 19th International APCOM Symposium, pages 283-298, Littleton, CO. Society of Mining Engineers.
Verly, G. 1993. Sequential Gaussian co-simulation: A simulation method integrating several types of information. In A. Soares, editor, Geostatistics Tróia '92, volume 1, pages 543-554. Kluwer Academic Publishers, Dordrecht.
Voltz, M. and R. Webster. 1990. A comparison of kriging, cubic splines and classification for predicting soil properties from sample information. Journal of Soil Science, 41(3):473-490.
Wackernagel, H. 1988. Geostatistical techniques for interpreting multivariate spatial information. In C. F. Chung, A. G. Fabbri, and R. Sinding-Larsen, editors, Quantitative Analysis of Mineral and Energy Resources, pages 393-409. Reidel, Dordrecht.
Wackernagel, H. 1994. Cokriging versus kriging in regionalized multivariate data analysis. Geoderma, 62:83-92.
Wackernagel, H. 1995. Multivariate Geostatistics: An introduction with applications. Springer-Verlag, Berlin, 256 p.
Wackernagel, H., P. Petitgas, and Y. Touffait. 1989. Overview of methods for coregionalization analysis. In M. Armstrong, editor, Geostatistics, volume 1, pages 409-420. Kluwer, Dordrecht.
Wackernagel, H. and H. Sanguinetti. 1993. Gold prospecting with factorial cokriging in the Limousin, France. In J. C. Davis and U. C. Herzfeld, editors, Computers in Geology: 25 years of progress. Studies in Mathematical Geology, volume 5, pages 33-43. Kluwer, Dordrecht.
Wackernagel, H., R. Webster, and M. A. Oliver. 1988. A geostatistical method for segmenting multivariate sequences of soil data. In H. H. Bock, editor, Classification and Related Methods of Data Analysis, pages 641-650. Elsevier North Holland, Amsterdam.
Wang, L. 1994. Animated visualization of multiple simulated realizations and assessment of uncertainty. In Report 7, Stanford Center for Reservoir Forecasting, Stanford, CA.
Webster, R., O. Atteia, and J.-P. Dubois. 1994. Coregionalization of trace metals in the soil in the Swiss Jura. European Journal of Soil Science, 45:205-218.
Webster, R. and M. A. Oliver. 1990. Statistical Methods in Soil and Land Resource Survey. Oxford University Press, New York, 316 p.
Xiao, H. 1985. A Description of the Behavior of Indicator Variograms for a Bivariate Normal Distribution. Master's thesis, Stanford University, Stanford, CA.
Xu, W. 1994. Convex kriging. In Report 7, Stanford Center for Reservoir Forecasting, Stanford, CA.
Xu, W. 1995a. Stochastic modeling of lithofacies: Alternatives. In Report 8, Stanford Center for Reservoir Forecasting, Stanford, CA.
Xu, W. 1995b. Stochastic Modeling of Reservoir Lithofacies and Petrophysical Properties. Doctoral dissertation, Stanford University, Stanford, CA.
Xu, W. and A. G. Journel. 1993. Gtsim: Gaussian truncated simulation of lithofacies. In Report 6, Stanford Center for Reservoir Forecasting, Stanford, CA.
Xu, W. and A. G. Journel. 1994. Dssim: A general sequential simulation algorithm. In Report 7, Stanford Center for Reservoir Forecasting, Stanford, CA.

Xu, W. and A. G. Journel. 1995. Histogram and scattergram smoothing using convex quadratic programming. Mathematical Geology, 27(1):83-103.
Xu, W., T. Tran, R. M. Srivastava, and A. G. Journel. 1992. Integrating seismic data in reservoir modeling: The collocated cokriging alternative. SPE paper # 24742.
Zhu, H. 1992. Dual kriging. Geostat Newsletter, 4:4-5.
Zhu, H. and A. G. Journel. 1989. Indicator conditioned estimator. Transactions, Society for Mining, Metallurgy and Exploration, Inc., 286:1880-1886.
Zhu, H. and A. G. Journel. 1993. Formatting and integrating soft data: Stochastic imaging via the Markov-Bayes algorithm. In A. Soares, editor, Geostatistics Tróia '92, volume 1, pages 1-12. Kluwer Academic Publishers, Dordrecht.

Index

a priori variance, 71, 97
accuracy, 315, 435
angular tolerance, 26, 31, 100, 104
anisotropy, 36, 54, 70, 86, 90, 98, 117, 178, 322
    geometric, 90-93, 104
    nugget, 102
    zonal, 93-95, 97
asymmetric distribution, 13, 16, 261, 267, 337
attribute, 7
azimuth angle, 90, 98

back-transform, 17, 380, 382-385
bias, 77. See also unbiasedness conditions
binary variable, 65, 285
bivariate description, 19-22, 46-56
bivariate Gaussian model, 271
block average, 156
block kriging
    discretization, 156
    indicator, 306
    system of equations, 154
bounded semivariogram, 71, 88

calibration, 245, 288, 315-317
categorical variable, 9, 63, 245, 287, 328, 354, 420, 428
ccdf. See conditional distribution function
cdf. See cumulative distribution function
cell-declustering technique, 81, 157
classification
    of categorical attributes, 356-357
    of test locations, 182, 202, 250, 363-366
clustering, 79, 84, 86, 178, 438
circle of correlations, 256
codispersion coefficient, 50
coefficient of skewness, 15
coefficient of variation, 15
cokriging
    colocated, 235, 245, 248, 311, 387, 391, 402
    comparison to kriging, 215-221, 248, 298, 304
    indicator. See indicator
    linearly dependent variables, 214
    ordinary. See ordinary cokriging
    paradigm, 203
    regionalized factors, 254
    simple. See simple cokriging
    soft data, 241, 308, 401
    standardized. See standardized ordinary cokriging
    variance, 207-208, 225, 266
    vector of variables, 213-214
    weights, 217-221, 232, 298
conditional bias, 182, 261, 370
conditional distribution function, 69, 262, 266, 284, 331, 372, 376
    block, 305, 347, 404
    composite, 305, 347
conditional expectation, 284, 328
conditional simulation, 370. See also simulation
conditional variance, 180, 266, 336, 361, 442
confidence interval, 260-262, 334
constraint interval, 177, 241-245, 287, 346
continuous variable, 11, 64, 288,
dccilc, 15, 285,
~lccisimti~akitrg,347, 373, 440
~leclr~stcri~rg,
tering
77F82. S r r also rlus I general relativc sen~ivariograln,85
gerlrmlizerl
'
.
I
covariance,
.
,
glohd rsrirn;tt~orr,1 5 i
143
.i,..
. ., ; ; : j3.~,l.,~,! ,
cokriging. See cokriging
dual. S(r dual kriging
,,..^.,./ . .:!,: : ,:>,~t‘ria!
i ,

global. S'rr global esti~natiou


k?(ciRc

3G 9 depelidence, 69, 214, 329 indicator. See indicator


coregionalisalio~~ dcsthctnration effect, 272, 393 1 liard dab, 241, 285 ordiuary. See ordinary krigirrg
intrinsic, 114, 116, 21G, 234, deterministic rnodcls, 60, 63 I~t?terosccdast,icity,82 paradigm, 125
304 distriliutioi~. See culnulativc dis- Iirlerotopy, 213 proldiility. See prolrability krig-
l i t I 108, 117, 239, tribution filttctio~r llierarcl~yof variables, 116, 391,402 ing
'L r'
J L , 318, 439, 443 drift, see 'lkcnd 11-incrmm!l, 28. 48, 71 sirnple. See simple kriging
~ n a t r i s 110,
, 113, 253 dual kriging Ilistogram, 11, 77 with soft data, 241, 308, 401
c o r r e l a l i o ~coellicient
~ fac.toria1, 171 hornosc.cdasticil.y, 167, 261 within slrata, 187, 200
linear, 21, 69, 73, 387 ordinary, 170 11-scattcrgrarn, 25 spatial components, 160
rank, 16, 21 s i ~ r ~ p l1G9
e, Irypcrbolic rnodcl, 281 variance, 179, 184, 261, 266,
sl.ruct,urnl, 253 339,361, 428
correlograln ' cntropy, 335, 354 intlepeudcncc, GO, 164, 392 weights, 174-177,321. See also
ergodic fluct~rations,426 indicator cokriging weights
:r~il.r~, 28, 68, 71, 86
cross, 48, 53, 73 eslitnation, 125. See d s o kriging coding, 285~-212,329 kriging with at, external drift
indicalor, 34, 53 esli~~l;rl.ioo
variance, 126 cokriging, 297, 308 313, 321, cornparison t,o colocated cok-
relation with scn~ivariograrn, IS-type estimate, 341, 360 329, 358, 399 riging, 240
exactit,nde propcrt~y,130, 132, 169, covariance, 34, 52, 69 comparison to simple kriging
71
cosimulatiou. See joint simulatiom 295,428 krigi~rg,293-2'37, 358 with local prior means, 197.-
covariance, 21, 73 expectcd value, 65, 68, 12G, 337 random variable, 65 199,201
covariance functiou csporwnt~inlscmivariogratn n ~ o d c l , sc~nivariogranl,31,41, 53 Gaussian varinlJe, 276, 388
auto, 26, 68, 71, 135 88 s i ~ n p l ekriging will] local prior kriging the trend, 196
extrapolation, 150 -152, 179, 199 means, 307~-308,359, 400 sysleru of equalions, 195
cross, 46, 72
extrapolation lieyond known (c)cdl simulatiori algorithms, 393-403, kriging with a trend model
indicator, 34, 52, 69
valr~es,279-282, 326-328 423 comparison to ordinary krig-
inference, 31, 86
extreme values, 16, 31, 273, 370 inference, 75 i r g , 147-152
~ n a l r i x 74,
interpolatiot~in space, 150 Gaussian variable, 275, 381
model, 87, 108, 237 factorial kriging interpolation witl~irrn (c)cdf, 279, kriging the trend, 145
synin~etry.See lag cfFect hid f o r ~ n 171 , 326 niatrix notation, 144
cross 11-scatl,orgram, 46 ~ r ~ a t r ~i x~ o t a t i o rI61
r, intercpartile range, 337, 361, 434 syste~riof equations, 141
cross valid;~tion,105, I79 r~~oltivariate, 251 intrinsic liypol.ltesis, 71
cwttnlative distril~ntiolrf~nrclion false positive/neg;rt~ive,348 isol.opy, 213, 217, 235 1%
onc-point, 64, 68, 77, 278, 327, filtering, 165, 172, 182, 253, 427 isolropy, 70. See also anisotropy effect, 48, 72, 227
396, 413,427 frcql~mcytable, 10 ilcrat.ive nrodrlingof coregiot~aliaa- means, 28, 84-86
two-uoint.. 68.. 72.. 265. 271 lion, 121, 443 tolerance, 26, 100, 101
curnnlative frequency, 11, 18, 34, Garlssian variances, 28, 84-86
52. 287 distribution, 265 266, 271 joint distribution, 72 Lagrange parameter, 133, 141, 225
ral~dornfunction, 265 joint simulat,ion, 390-393,400--403 least squares, 125, 341, 369
data sernivariograni model, 88, 102 J u r a dat,a, 4, 457-464 linear interpolation within cdf, 279
rrdu~idancy,174, 210, 235 sitnulation algoritlin~s,3 8 0 3 9 3 , linear rnodcl of coregionalization.
removal, 16 103-405 k e r ~ ~fonction,
el 2R3 See coregionalization
splitlit~g,16, 18, 71, 77, 187 traosforrnation. See normal score kriging linear model of regionalization, 95,
transforrnation, 16 17,266-271 transform block. See block kriging 159

linear regressiot~.See kriging, cok- ~ ~ e g a t i vkriging


c weights, 176, 232, cotnparison to kriging with a pseudo covarinncc, 135, 227
riging 303, 321, 330 trend model, 147 pseudo cross semivariogram, 48
linear semivariograrn model, 90 nested sernivariogram nrodel, 95, comparison to simple kriging,
local (prior) distribution, 285 159 137 Q-Q plot, 77
logaritlimic trartsforniation, 16, 38, non-ergodic correlogram. See cor- dnal form, 170 qualitative infortnation. See soft
156, 271 relogram Gar~ssianvariable, 275,-381 data
loss (function), 340 346, 350, 373 non-ergodic covarial~cc.See covari- indicator, 204, 296, 333, 397 q ~ ~ a ~ ~13 tile,
low-pass filter, 173, 184 ance Cnnction kriging tlte local mean, 135 q ~ ~ a n t iestimate,
le 343
lower tail extrapolation, 280, 385 nowparametric, 284 matrix notation, 145 quantile rnap, 361, 431
LU decomposition, 390, 403 non-stationary, 139, 275, 381. See systcrn of equat~ions,134 qmrtile, 337
nlso treud outlicr resistmt. 103
madogram, 31, 103, 399 rin~d~ f~niclion,
n~ 68~~72
normal
MAP criterion, 418, 430 random path, 379, 423
dislribntion, 265 -266, 271 pairwise relative seniivariogr;m~,85
marginal d i s t r i b o t i o ~72,
~ , 77, 264, random variable, 63-67
equations, 128. See nlso krig- periodic trend, 141
396 range, 31, 88, 103
ing permissible semivariogram model,
Markov--Dayes algorillrrn, 313, 400 rauk correlatior~coelficient, 16, 21
probability plot, 209 87, 108, 123
Markov-type assumption, 237, 313.- rank t r a r ~ s f o r ~ 17,
n , :101
score transform, 266 -271, 27'3, plield simolalion, 405-409, 423
318, 387 realization, 66, 351, 370. See elso
380, 382 polygoml metl~od
matrix instability, 157, 195, 209, sil~nllation
nugget effect declustering, 7'3, 157
214, 235, 329 regrnssion. See kriging, cokriging
effect on kriging, 174,177,182, esl,irnation, 356
matrix notat,ion rel;~tivese~~iivariogrntrr, 85, 101, 103
218,222 popnlal,ion, 7, 59, 71
cokriging, 207, 213, 220 reprodttction of lnoclel sl~al.islics,470,
f i l t e r i ~ ~165
g , 167, 17'2, 253, 427 positive delinitent:ss, 87, 113, 145,
frictorial kriging, 161 382, 3 M , 406, 413, 421;
inference, 101 102 208
kriging, 129, 144 residual
interpretation, 31, 102 posterior distribution, 264, 328
linear model of coregionaliza- covaria~~ce, 126, 134, 142, 381
relalive, 31 post-processing realization, 375,427
tion, 11 1 kriging, 143, 190, 270, 307
median estimate, 343 s e ~ n i v a r i o g r i ~t~lodcl,
t ~ ~ 88 431
risk ; ~ s s t ~ s s ~ n 273,
e t ~ t ,:IOli, 348, 374
median indicator krigirrg, 304, 322, pnwer rlistril~~~tion rnorlel, 280
rt~dogram,31, 86
330, 399 objective f ~ ~ n c t i o413
n , 418, 430 power selnivariogratii model, 88
rose diagran~of rmges, 90
middleclass interpolation, 279,326, optin~alestimate, 340--346, 360 practical range, 89
398 order rclatiou, 64 precision, 180, 184, 435 sanlple, 7, 59
misclassification, 348 order relation deviation prcf?re~~tial sanlplil~g,76, 100, 349, sampling design, 23, 75, 437
Monte-Carlo simulation, 351 causes, 321 , 330 427. See also clnslering scale-dcp<:ndrnt corrclaiion, 252
moving window correction, 3 2 4 3 2 5 , 331 prittcilx~lc o ~ n p o n ~ : ~ ~ l , scaltergrat~~ 19,
size. See search neighborhood i~~lplemcntatioti tips, 321-323 aoirlysis, 255 scnltergranr of lt-il~crw~re~il,s, 49
statistics, 82 ordinary cokrigiug kriging, 233, 392, 402 screelt efkct, 176 177, 219 221, 299,
mnltiGa~~ssian model, 265. See also color:aled, 236 prior dist~ribution,285 313
Gaussi;ru cornp;irison to sirnple cokrig- pro1)id~ilisticnrodel, 60, 63 se;rrcl~ncighborllood, 178, 322, 378
multiple-grid sim~tliition,379 ing, '229-231 prol);rbility 379
multiple-point st,nt,istics, 414 il~dicator,297 clt!~isil.yfunctio~r,64 se~r~ivariograni
mr~ltivariate matrix notat.ion, 220 I i s r i l ~ i t i n . See cuninlative cross, 50, 72
distril)tltion, 68, 70, 265 wit,l~a singlc ~~nbiasedness con- ~listrihudior~ f~~nctiol~ rlirecl., 28, 68, 72
factorial kriging, 2 5 1 2 5 8 straint, 228 i~it.(!rvaI,334 d i r d i o ~ ~ a31
l,
random function rnodel, 72 systerrl of equat,ions, 225 krighg, 301-303, 358 graphical i ~ ~ l ~ f ! r ~ ~ r29,
~ ~35
l~:il~i~~t~,
vector of vari;rl,les, 226 niibp, 358, 431 indicator, 34, 41, 53
nn"*+:.mrl',C..:tr ...'.,... "0 11," --.I: ,.-:..: nmn,%rt;nll..lnn;llll i l l Us1 101 'I,:+ 0.)
11;1p, 99 rot~lparison to ordinary krig- s]x'd.r;d ~ l ~ , i r ~ ~ ~ t l x233,
~ s i255
tio~~, krigiug, 133, 140, I60
rnntrix, 74 ing, 137 spl~ericalsr!~~livxiogram r ~ ~ o d c88l, cokriging, 221, 228, 232, 251,
relation with covariance, 71 cornparison to simple cokrig- standard dsvialion, 15 254
relative, 85, 101 103 ing, 215, 2'23 st.and;rrdiaed ordiunry cokrigiug, 232, unbounded a:n~iv;triogratn,89, 133,
of residuals, 42 dual form, 169 236, 248 227
sc~nivariogra~u model Gaussian variable, 266, 277, sl;rt.io~r;irity,70 71, 106, 187, 264. uncerl,ai111.y
behavior near the origin, 90 381 Sce also i ~ o ~ ~ - s t a t i o r ~ a r y local, 262, 333, 354, 361
effect on kriging weights, 174, indicator, 293, 296, 395 stochastic simulation. See simula-
spatial, 372, 431
177, 182, 218, 222 matrix notation, 129 tion uniform distribution, 67, 262, 279,
exponential, 88 system of equations, 128 structural correlation coefficient, 16, 351
Gaussian, 88, 102, with varying local means, 190, 21 uniform transform, 17, 303, 406
linear, $10 201, 307-308, 359, 400 summary statistics, 13, 16, 433 univariate description, 9-18, 22-45
nugget effect, 88 weight of the mean, 1311, 132 support, 152, N 5 , 347, 104 universal cokriging, 204
overfitting, 100 simulated annealing universal kriging. See kriging with
permissibility, 87, 108, 123 annealing schedule, 418, 420 a trend model
power, 88 convergence criterion, I I!) updating distributions, 67, 264, 293,
range, 31, 88, 103 initial image, 412 329
semi-automatic fitting, 98, 121, objective function, 413-418, 430 upper tail extrapolation, 281, 337,
443 perturbation mechanism, 412 343, 385
sill, 88, 103 post-processor, 430
spherical, 88 general, 427 variable, 7
simulation
sequential Gaussian simulation indicator, 285-292, 329 variance
annealing. See simulated an-
accounting for secondary in- logarithmic, 16, 38, 156, 271 cokriging, 207-208, 225, 266
nealing
formation, 385 rank, 17, 301 experimental, 15
categorical variables, 1120b.125,
joint simulation, 390 uniform, 17, 303, 406 kriging, 179, 184, 261, 266, 339,
430
transition frequency, 35, 356, 424 361, 428
non-stationarity, 381 choosing a simulation algorithm,
transition semivariogram model, 8'3 variance-covariance matrix, 74
univariate, 380 434-436 trend vector random function, 72, 390,
sequential indicator simulation comparison to estimation, 369
detection, 82 400
accounting for secondary in- continuous variables, 376-420
kriging, 135, 145, 1'36, 230,
formation, 400 Gaussian, 380 393, 403 405
model, 126, 139, 141, 192-193, weighted average, 77, 188
categorical variables, 423 indicator, 393-403, 4X%
194, 275, 381, 388 weighted least-squares, 105, 324,
joint simulation, 400 joint, 390-394, 400-403
univariate, 395
two-part search, 378 443
LU decomposition, 403 two-point statistics, 7, 414
simple cokriging p-field, 405-409, 423 zonal anisotropy. See anisotropy
colocated, 236, 391 skewness, 31, 82, 261, 267, 271,
comparison to ordinary cok- 280-281, 343. See also co-
riging, 229-231 efficient of skewness
comparison to simple kriging, smoothing effect, 182, 202, 370
215, 223 smoothing distributions, 283, 326
Gaussian variable, 266, 386 soft cokriging, 308, 320
matrix notation, 207 soft data, 241, 287-290
system of equations, 200 space of uncertainty, 436
vector of variables, 213 spatial components, 159, 165, 251,
weights of the means, 210, 212 256
simple kriging spatial description, 22, 46
