
Spatial statistics

Exploratory Spatial Data Analysis (ESDA)

by
Dr.Eng. Suryantini
Course syllabus

• Introduction to geostatistics
• Non-spatial statistics
• Spatial statistics (today's lecture)
• Variogram
• Estimation
• Simulation
Spatial statistics
CONTENT
• Spatial description
• Spatial continuity model
–from non-spatial to spatial statistics
–Spatial statistics - Univariate
–Spatial statistics - Bivariate
Spatial description
• Data postings
• Contour maps
• Symbol maps
• Indicator maps
• Moving windows statistics
• Proportional effect
Spatial description
[Figures: data posting; symbol map or post plot; contour map]
Spatial description
Indicator map
Thresholds from 15 to 135 ppm
• The series of indicator maps records the transition from low to high values
• Low values tend to be aligned in a north-south direction
• High values tend to be grouped in the south-east corner
Spatial description
Moving window statistics

Calculate the average and standard deviation (SD) of each moving window
[Figure: maps of the moving-window average and SD]
MAP RESULTS
• Average value and variability (SD) change locally across the area
• High SD in the south-east reflects low and high values lying close together
• Low SD in the western area reflects the very uniform values in that region
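The moving-window calculation can be sketched as below. This is a minimal illustration, not the method used to build the maps on the slide: the function name `moving_window_stats`, the 2 x 2 window, the use of the population SD and the values in `demo_grid` are all assumptions made for the example.

```python
from statistics import mean, pstdev

def moving_window_stats(grid, size=2, step=1):
    """Return (row, col, mean, sd) for each size x size window of a 2D grid."""
    results = []
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(0, n_rows - size + 1, step):
        for c in range(0, n_cols - size + 1, step):
            window = [grid[r + i][c + j] for i in range(size) for j in range(size)]
            # Population SD is an assumption; the slide does not say which SD is used.
            results.append((r, c, mean(window), pstdev(window)))
    return results

# Hypothetical values: uniform in the west, mixed low and high values in the east.
demo_grid = [
    [10, 10, 10, 50],
    [10, 10, 90, 10],
    [10, 10, 10, 80],
]
stats = moving_window_stats(demo_grid, size=2)
for row, col, avg, sd in stats:
    print(f"window at ({row}, {col}): mean = {avg:.1f}, SD = {sd:.1f}")
```

Windows over the uniform western columns return an SD of zero, while windows that mix low and high values return a large SD, which is the pattern described in the map results.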
Spatial description
Draw cross-sections A-B, C-D, E-F, G-H
[Figure: map with section line A-B, and profile of mean and variance along the section (vertical axis 85 to 105)]
Spatial description
Proportional effect in spatial domain

CASE 1
Average and variability are constant

CASE 2
Data values fluctuate about the local average, but there is no obvious change in variability

For estimation, these two cases are the most favorable: easy to model
[Figure legend: ++ data values, ____ local average]
Spatial description
Proportional effect in spatial domain

CASE 3
Local average constant but variability changes

CASE 4
Local average and variability change together
This is the common case for earth science data

For estimation, case 4 is still somewhat predictable (local variability is related to the local average): it can still be modelled
[Figure legend: ++ data values, ____ local average]
Spatial description
Proportional effect in spatial domain
Plot local mean vs standard deviation (STDev)

Local means vs local standard deviations
• Correlation coefficient = 0.27: no apparent relationship between mean and SD
• Correlation coefficient = 0.921: strong relationship between mean and SD
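A check for a proportional effect can be sketched as the correlation between local means and local SDs. The windows below are hypothetical values chosen so that the spread grows with the local level; `pearson` is an illustrative helper, not a function from the lecture.

```python
from math import sqrt
from statistics import mean, pstdev

def pearson(xs, ys):
    """Correlation coefficient between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical moving windows: the spread is proportional to the local average.
windows = [[9, 10, 11], [18, 20, 22], [36, 40, 44], [72, 80, 88]]
local_means = [mean(w) for w in windows]
local_sds = [pstdev(w) for w in windows]
rho = pearson(local_means, local_sds)
print(f"correlation between local mean and local SD: {rho:.3f}")
```

A value close to 1, as here, corresponds to the strong mean-SD relationship of case 4; a value near 0 corresponds to no apparent relationship.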
De-Clustering
• Statistics relies on samples being random
and un-biased
• In mining, we are more interested in ore
than waste
• Usually in mining, more drillholes are
located in high-grade areas
• So sampling is inherently biased - more
samples in high grade areas
De-Clustering
• Overcome this bias with de-clustering
• Put (3D) grid over data
• Within each cell take sample closest to
centre (or average of all samples)
[Figure: one cell uses the average of its 4 samples; another cell uses its single sample]
• Continue using this single value per cell
De-Clustering
• Any geological knowledge that allows the data to be split into separate domains must be used
• Only decluster if no geological separation is possible
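The cell-based de-clustering steps above (grid over the data, one value per cell) can be sketched as follows. This is a 2D illustration using the cell-average option; the function name `cell_decluster`, the 100 m cell size and the drillhole values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

def cell_decluster(samples, cell_size):
    """samples: list of (x, y, value). Returns one averaged value per occupied cell."""
    cells = defaultdict(list)
    for x, y, value in samples:
        key = (int(x // cell_size), int(y // cell_size))  # index of the grid cell
        cells[key].append(value)
    return {key: mean(values) for key, values in cells.items()}

# Hypothetical drillholes: four clustered in a high-grade cell, one in a low-grade cell.
samples = [(5, 5, 60), (10, 20, 62), (30, 15, 58), (40, 45, 64), (150, 20, 20)]
declustered = cell_decluster(samples, cell_size=100)
naive_mean = mean(v for _, _, v in samples)
declustered_mean = mean(declustered.values())
print(f"naive mean = {naive_mean}, declustered mean = {declustered_mean}")
```

The naive mean is pulled towards the high-grade cluster, while the declustered mean gives each cell equal weight.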
Spatial continuity model

From non-spatial to spatial statistics


Non-spatial statistics
Univariate vs Bivariate
• Sample variance in one-variable statistics:

  $$\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

  Measure of the difference between each sample value and the mean

• Sample covariance in two-variable statistics:

  $$\mathrm{Cov}_{XY} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

  Summary of the relationship between two variables, x and y
Non-spatial statistics - Bivariate
• Sample correlation coefficient in two-variable non-spatial statistics: the covariance normalized by the sample standard deviations

  $$\rho = \frac{\mathrm{Cov}_{XY}}{\sigma_x\,\sigma_y} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

  $x_i, y_i$ = sample values at a particular location
  $\bar{x}, \bar{y}$ = sample means
  Measure of how close the observed values come to falling on a straight line
Non-spatial statistics - Bivariate
Scatterplot
[Figure: scatterplot of y against x with regression line, ρ = 0.7]
Non-spatial statistics - Bivariate
different values of correlation coefficient

(Picture taken from Dubrule, 2003)
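The non-spatial sample variance, covariance and correlation coefficient can be sketched in a few lines. The function names and the five (x, y) values are illustrative assumptions, not data from the lecture.

```python
from math import sqrt

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_covariance(xs, ys):
    """Sum of products of paired deviations, divided by n - 1."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance normalized by the two sample standard deviations."""
    return sample_covariance(xs, ys) / sqrt(sample_variance(xs) * sample_variance(ys))

# Hypothetical paired measurements of two variables at the same locations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(sample_variance(x), sample_covariance(x, y), round(correlation(x, y), 3))
```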


Non-spatial vs spatial statistics
• Non-spatial statistics: relationship between two samples of different variables at the same location

  $(x_i - \bar{x})(y_i - \bar{y})$  (same location)

• Spatial statistics: relationship between two samples of the same variable, separated by a distance (h)

  $(x_i - \bar{x})(x_{i+h} - \bar{x})$  (different locations)
Spatial statistics - Univariate
• Sample correlation coefficient in one-variable spatial statistics: the covariance normalized by the sample standard deviations, computed from pairs at different locations

  $$\rho(h) = \frac{\mathrm{Cov}(x_i, x_{i+h})}{\sigma_{x_i}\,\sigma_{x_{i+h}}} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(x_{i+h} - \bar{x})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(x_{i+h} - \bar{x})^2}}$$

  $x_i$ = sample value at a particular location
  $x_{i+h}$ = sample value at distance h from $x_i$
  Measure of the similarity of observed values as a function of separation distance (h)
Spatial statistics - Univariate
h-Scatterplot
h = (0,1)
Sample data pairs of the same variable separated by 1 m in the north-south direction
[Figure: h-scatterplot of the value at t + h ($x_{i+h}$) against the value at t ($x_i$); ρ for h = (0,1) is 0.742]
Spatial statistics - Univariate
Different separation distances result in different ρ
[Figure: h-scatterplots of $x_{i+h}$ against $x_i$ for h = (0,1), ρ = 0.742; h = (0,2), ρ = 0.590; h = (0,3), ρ = 0.560; h = (0,4), ρ = 0.478]
• If the data values at locations separated by h are very similar, the pairs plot close to the line $x_i = x_{i+h}$ (the 45-degree line passing through the origin)
• As the data values become less similar, or the separation distance becomes larger, the cloud of points on the h-scatterplot becomes fatter and more diffuse and drifts away from the 45-degree line
Spatial statistics - Univariate
Plots against separation distance (h)
Various indices for spatial continuity

h vs ρ: correlation function
The correlation coefficient decreases with increasing distance in the north direction

h vs Cov: covariance function
The covariance also steadily decreases, in a manner very similar to the correlation coefficient
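How the covariance and correlation at a given lag are obtained from pairs (x_i, x_(i+h)) can be sketched for a regularly spaced 1D series. Two assumptions are made here: the mean is taken over all samples, and the series is a short made-up transect. The function name `lag_statistics` is illustrative.

```python
from math import sqrt

def lag_statistics(values, h):
    """Return (covariance, correlation) of the pairs (x_i, x_(i+h)) in a 1D series."""
    x_bar = sum(values) / len(values)          # assumption: mean of all samples
    tails = [v - x_bar for v in values[:-h]]   # x_i - mean
    heads = [v - x_bar for v in values[h:]]    # x_(i+h) - mean
    n = len(tails)                             # number of pairs at this lag
    cross = sum(t * hd for t, hd in zip(tails, heads))
    cov = cross / (n - 1)
    rho = cross / sqrt(sum(t * t for t in tails) * sum(hd * hd for hd in heads))
    return cov, rho

# Hypothetical regularly spaced samples along one line.
series = [5, 6, 5, 7, 8, 8, 6, 9, 9]
for lag in (1, 2):
    cov, rho = lag_statistics(series, lag)
    print(f"h = {lag}: covariance = {cov:.3f}, correlation = {rho:.3f}")
```

For this series both indices drop from lag 1 to lag 2, the behaviour described for the correlation and covariance functions.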
Spatial statistics - Univariate
Plots against separation distance (h)

  $$\gamma(h) = \frac{1}{2n}\sum_{i=1}^{n}(x_i - x_{i+h})^2$$

γ = gamma, also known as the semivariogram or variogram

h vs γ: spatial continuity index
Unlike the previous two indices of spatial continuity, the semivariogram increases as the cloud gets fatter or the separation distance becomes larger
Semivariogram
• So, semivariogram or gamma gives an index for
spatial continuity
• Semivariogram value increases when separation
distance between samples increases
• Semivariogram value increases if correlation
decreases

[Figure: h vs γ; horizontal axis: lag or separation distance]


Key Concepts

• Spatial dependence: the value of a variable at a point in space is related to its value at nearby points
• Spatial structure: the nature of the spatial relation, as depicted in the SEMIVARIOGRAM
Relationship between variogram,
covariance and correlation
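One common way to write this relationship is given below. It assumes second-order stationarity (an assumption not stated on the slide), with C(h) the covariance at lag h and σ² = C(0) the variance of the variable.

```latex
\gamma(h) = C(0) - C(h) = \sigma^2\,\bigl[1 - \rho(h)\bigr],
\qquad
\rho(h) = \frac{C(h)}{C(0)}
```

Under this assumption the semivariogram rises towards σ² as the covariance and correlation fall towards zero.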
Variogram parameters

[Figure: semivariogram against lag or separation distance, showing the data points and the fitted variogram model (model form = EXPONENTIAL), with the sill, the range and the nugget (may be zero) marked]


Physical meaning of semivariogram
[Figure: semivariogram against lag or separation distance, with sill, nugget and range marked]

• Sill: maximum semi-variance; represents variability in the absence of spatial dependence
• Range: separation between point-pairs at which the sill is reached; distance beyond which there is no evidence of spatial dependence
• Nugget: semi-variance as the separation approaches zero; represents variability at a point that cannot be explained by spatial structure
How to calculate semivariogram?
Data: 5 6 5 7 8 8 6 9 9

  $$\gamma(h) = \frac{1}{2n}\sum_{i=1}^{n}(x_i - x_{i+h})^2$$

Lag = 1 (n = 8 pairs)

  $$\gamma(h=1) = \frac{1}{2 \times 8}\left[(5-6)^2 + (6-5)^2 + (5-7)^2 + (7-8)^2 + (8-8)^2 + (8-6)^2 + (6-9)^2 + (9-9)^2\right]$$
  $$= \frac{1}{16}(1 + 1 + 4 + 1 + 0 + 4 + 9 + 0) = \frac{20}{16} = 1.25$$

Lag = 2 (n = 7 pairs)

  $$\gamma(h=2) = \frac{1}{2 \times 7}\left[(5-5)^2 + (5-8)^2 + (8-6)^2 + (6-9)^2 + (6-7)^2 + (7-8)^2 + (8-9)^2\right]$$
  $$= \frac{1}{14}(0 + 9 + 4 + 9 + 1 + 1 + 1) = \frac{25}{14} \approx 1.79$$

etc.
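The hand calculation can be checked with a short sketch for a regularly spaced 1D series. The function name `semivariogram_1d` is illustrative. At lag 2 the seven squared differences are 0, 1, 9, 1, 4, 1 and 9, which sum to 25, so the function returns 25/14.

```python
def semivariogram_1d(values, lag):
    """Half the average squared difference between values `lag` positions apart."""
    pairs = list(zip(values[:-lag], values[lag:]))
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

data = [5, 6, 5, 7, 8, 8, 6, 9, 9]
gamma_1 = semivariogram_1d(data, 1)   # 20 / 16
gamma_2 = semivariogram_1d(data, 2)   # 25 / 14
print(f"gamma(1) = {gamma_1:.2f}, gamma(2) = {gamma_2:.2f}")
```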
Variability and Separation Distance
• Classical Statistics ignores spatial location of
data
• GeoStatistics uses spatial location and
variability inherent in dataset (e.g.
mineralisation)
• As the distance between samples increases
the variability between samples increases
Variability and Separation Distance
• Plot Variability in Sample Values vs
Distance between Samples

[Figure: variability plotted against distance between samples, starting from 0]
Semi-Variogram Calculation

  $$\gamma(h) = \frac{1}{2N}\sum_{i=1}^{N}(g_i - g_{i+h})^2$$

Where:
h = distance vector between a pair of samples
$g_i$ = grade of first sample (at location i)
$g_{i+h}$ = grade of second sample (at location i+h)
N = number of pairs
Semi-Variogram / Variogram
• The semi-variogram value for sample pairs a distance h apart is:
– half of the average squared difference between values at this distance

  $$\gamma(h) = \frac{1}{2N}\sum_{i=1}^{N}(g_i - g_{i+h})^2$$

• Hence the term semi-variogram
– However, the terminology is loose and this equation is also referred to as the "variogram"
Semi-Variogram Calculation
• The semi-variogram value for sample pairs a given distance apart is a variance
• The separation is not just a distance but also a direction: a vector
• This vector is also called the lag
Example Variogram Calculation
Fe deposit drilled at 100m spacing
[Figure: grid of Fe grades at the drillhole locations]
NA = Not Available (no data)


Example Variogram Calculation
Pairs 100m apart in West-East direction
Example Variogram Calculation
Sum the square of their differences
Example Variogram Calculation
Plot 1.46 at 100m
[Figure: semi-variogram (axis 0 to 20) against distance (axis 0 to 500 m)]
Example Variogram Calculation
Pairs 200m apart in West-East direction
Example Variogram Calculation
Sum the square of their differences
Example Variogram Calculation
Plot 3.30 at 200m
[Figure: semi-variogram (axis 0 to 20) against distance (axis 0 to 500 m)]
Example Variogram Calculation
Plot 4.31 at 300m, 6.70 at 400m
[Figure: semi-variogram (axis 0 to 20) against distance (axis 0 to 500 m)]
Variogram Calculation
• Calculate the South-North semi-variogram value at 100m, 200m and 300m
Grid spacing is 100m in both directions:

44 NA 40 42 40 39 37 36 NA
42 NA 43 42 39 39 41 40 38
37 37 37 35 38 37 37 33 34
35 38 NA 35 37 36 36 35 NA
36 35 36 35 34 33 32 29 28
38 37 35 NA 30 NA 29 30 32
Example Variogram Calculation
Not enough pairs at 400m South-North

Semi-variogram values
Distance (m)   East-West   North-South
100            1.46        5.35
200            3.30        9.87
300            4.31        18.88
400            6.70        (not enough pairs)
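The directional calculation on the Fe grid can be sketched as below. The first grid row is read as 44, NA, 40, 42, 40, 39, 37, 36, NA, an assumption about the slide layout; with that reading the sketch returns the tabulated East-West and North-South values. The names `GRID` and `grid_semivariogram` are illustrative.

```python
NA = None  # Not Available (no data)

GRID = [
    [44, NA, 40, 42, 40, 39, 37, 36, NA],
    [42, NA, 43, 42, 39, 39, 41, 40, 38],
    [37, 37, 37, 35, 38, 37, 37, 33, 34],
    [35, 38, NA, 35, 37, 36, 36, 35, NA],
    [36, 35, 36, 35, 34, 33, 32, 29, 28],
    [38, 37, 35, NA, 30, NA, 29, 30, 32],
]

def grid_semivariogram(grid, d_row, d_col):
    """Semi-variogram for pairs offset by (d_row, d_col) grid steps; NA cells are skipped.

    Returns (gamma, number of pairs), or (None, 0) if no pair exists.
    """
    total, n_pairs = 0, 0
    for r, row in enumerate(grid):
        for c, tail in enumerate(row):
            r2, c2 = r + d_row, c + d_col
            if tail is None or not (0 <= r2 < len(grid) and 0 <= c2 < len(row)):
                continue
            head = grid[r2][c2]
            if head is None:
                continue
            total += (tail - head) ** 2
            n_pairs += 1
    return (total / (2 * n_pairs), n_pairs) if n_pairs else (None, 0)

# One grid step is 100 m; columns run West-East and rows run North-South.
east_west = {100 * k: grid_semivariogram(GRID, 0, k) for k in (1, 2, 3, 4)}
south_north = {100 * k: grid_semivariogram(GRID, k, 0) for k in (1, 2, 3)}
for distance, (gamma, n) in east_west.items():
    print(f"E-W {distance} m: gamma = {gamma:.2f} from {n} pairs")
for distance, (gamma, n) in south_north.items():
    print(f"S-N {distance} m: gamma = {gamma:.2f} from {n} pairs")
```

Printing the number of pairs next to each value shows how quickly the pair count drops at longer lags.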
Example Variogram Calculation
Plot South-North results as well
[Figure: semi-variogram (axis 0 to 20) against distance (axis 0 to 500 m), with North-South and East-West curves]
Calculating Variograms
Distance and direction tolerance
• Samples will not be exactly the required distance/direction apart
– Tolerance on both direction and distance
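Pair selection with tolerances can be sketched for irregularly spaced 2D samples as follows. The function name, the azimuth convention (degrees clockwise from north, with opposite directions treated as the same) and the sample coordinates are assumptions made for this illustration.

```python
from math import atan2, degrees, hypot

def pairs_within_tolerance(points, lag, lag_tol, azimuth, az_tol):
    """points: list of (x, y, value). Returns the value pairs accepted for this lag."""
    selected = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            x1, y1, v1 = points[i]
            x2, y2, v2 = points[j]
            dx, dy = x2 - x1, y2 - y1
            if abs(hypot(dx, dy) - lag) > lag_tol:      # distance tolerance
                continue
            bearing = degrees(atan2(dx, dy)) % 180.0    # axial: 0 and 180 are the same
            diff = abs(bearing - azimuth % 180.0)
            diff = min(diff, 180.0 - diff)
            if diff <= az_tol:                          # direction tolerance
                selected.append((v1, v2))
    return selected

# Hypothetical samples that are roughly, but not exactly, 100 m apart.
points = [(0, 0, 10), (3, 98, 12), (100, 5, 20), (0, 210, 15)]
north_pairs = pairs_within_tolerance(points, lag=100, lag_tol=10, azimuth=0, az_tol=15)
east_pairs = pairs_within_tolerance(points, lag=100, lag_tol=10, azimuth=90, az_tol=15)
print(north_pairs, east_pairs)
```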
Calculating Variograms
Lag spacing
• Lag = Distance apart of sample pair
• Lag spacing
– incremental distance for calculation
• At least equal to sample spacing
• Note number of pairs used per lag
• Erratic variograms
– wrong lag?
– sometimes due to erratic sample support
Variogram Characteristics

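[Figure: γ(h) against h. The curve starts at the nugget effect at h = 0 and levels off at the sill (levelling off = sill = overall population variance) at the range of influence. Samples closer together than the range are spatially correlated; samples further apart are not spatially correlated]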

Geometric Anisotropy
Same Sill, Different Range
[Figure: variogram in the East-West direction (along strike) and variogram in the North-South direction (across strike), each plotted against distance (0, 25m, 50m) with its range of influence marked]
Zonal Anisotropy
Same Range, Different Sill
[Figure: semi-variogram against distance for the N-S and E-W directions and their average]
• East-West direction has lower variance of
mineralisation
– Impossible?
– E-W really has much longer range?
Anisotropy
• Geometric anisotropy
– Same sill
– Different range
– Figure directions: across strike, down dip, along strike

• Zonal anisotropy
– Different sill
– Same range
– Figure directions: along channel, across channel
Variograms
Models
[Figure: γ(h) against h, with the nugget effect, range and sill marked]

• Variogram shape is a function of
– nugget effect, range and sill
Variograms
Spherical Model
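[Figure: spherical model γ(h) against h, with the nugget effect, range and sill marked]

  $$\gamma(h) = \text{Nugget} + (\text{Sill} - \text{Nugget}) \times \left[1.5\,\frac{h}{\text{Range}} - 0.5\left(\frac{h}{\text{Range}}\right)^3\right] \quad (\text{if } h < \text{Range})$$
  $$\gamma(h) = \text{Sill} \quad (\text{if } h \ge \text{Range})$$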
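The spherical model can be evaluated with a direct transcription of the formula. The function name and the parameter values (nugget 1, sill 5, range 100) are illustrative assumptions. The sketch follows the formula as written, so it returns the nugget at h = 0.

```python
def spherical_model(h, nugget, sill, range_):
    """Spherical variogram model: rises from the nugget and reaches the sill at the range."""
    if h >= range_:
        return sill
    ratio = h / range_
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)

# Hypothetical parameters: nugget = 1, sill = 5, range = 100 m.
for h in (0, 25, 50, 100, 150):
    print(h, round(spherical_model(h, nugget=1, sill=5, range_=100), 3))
```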
Variograms
Model Fitting
• Model fitting is subjective
– No accepted “goodness of fit” parameter
• Other variogram models exist
– Exponential, Gaussian, Linear
– Rarely used in practice
– Can use nested spherical instead
• Model choice depends on the geological phenomenon: use the accepted or generally used model in that area of study
Variograms
Nested Models
• Nested Spherical Model
– Nested Variogram =
Nugget Effect + Short Range Spherical Model
+ Long Range Spherical Model

– Rarely more than 2 nested models
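A nested model is the sum of a nugget effect and the individual structures, which can be sketched as below. The helper names and the parameter values (nugget 1, a short-range structure contributing 2 with range 50, a long-range structure contributing 3 with range 200) are assumptions made for the example.

```python
def spherical_structure(h, contribution, range_):
    """Spherical structure with zero nugget and a sill equal to `contribution`."""
    if h >= range_:
        return contribution
    ratio = h / range_
    return contribution * (1.5 * ratio - 0.5 * ratio ** 3)

def nested_spherical(h, nugget, structures):
    """structures: list of (contribution, range) pairs added on top of the nugget."""
    return nugget + sum(spherical_structure(h, c, a) for c, a in structures)

# Hypothetical nested model: nugget + short-range + long-range spherical structures.
structures = [(2, 50), (3, 200)]
for h in (10, 50, 100, 200):
    print(h, round(nested_spherical(h, nugget=1, structures=structures), 3))
```

The total sill is the nugget plus all contributions, here 1 + 2 + 3 = 6, reached at the longest range.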


Variograms
Nested Models
[Figure: nugget effect + short-range spherical model + long-range spherical model = nested variogram]
Modelling Variograms
Nugget effect
• Incorporates
– inherent variability
– human error
– short scale variability
• Calculate from closest spaced data
– usually downhole variogram
• Duplicate data analysis
– inherent nugget effect
Calculating Variograms
Lag spacing

[Figure: the same data calculated with a lag spacing that is too short, and with a longer, more appropriate lag]
Calculating Variograms
Number of lags
• Rule of thumb
– maximum lag distance is half the diagonal of the areal extent
– subject to shape of domain
• long narrow orebody
• more lags along strike length
Variogram Cloud
• Semi-variogram is

  $$\gamma(h) = \frac{1}{2N}\sum_{i=1}^{N}(g_i - g_{i+h})^2$$

– ½ average of the squared differences at a specific lag, +/- distance and direction tolerances
– Can plot all data points:
• Each squared difference vs its distance
• Called a variogram cloud
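The points of a variogram cloud can be generated with a sketch like the one below, which returns one (distance, squared difference) entry per sample pair. The function name and the three sample points are hypothetical.

```python
from math import hypot

def variogram_cloud(points):
    """points: list of (x, y, value). Returns (distance, squared difference) per pair."""
    cloud = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            x1, y1, v1 = points[i]
            x2, y2, v2 = points[j]
            cloud.append((hypot(x2 - x1, y2 - y1), (v1 - v2) ** 2))
    return cloud

# Hypothetical samples; two of the three pairs are 5 units apart.
points = [(0, 0, 5), (3, 4, 7), (6, 8, 4)]
cloud = sorted(variogram_cloud(points))
print(cloud)
```

Averaging the squared differences that fall within one lag tolerance and halving the result gives the semi-variogram value for that lag.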
Variogram Contours
Contoured variogram values showing a north-east strike
[Figure: contour map of variogram values with compass directions (north, south, east, west)]
3D Variography
• Contours of horizontal variograms are used to interpret the strike of mineralisation
[Figure: contoured horizontal variogram with compass directions and the strike direction marked]
3D Variography
• Direction 1 (major direction): maximum continuity on dip plane
• Direction 2 (intermediate direction): minimum continuity on dip plane
• Direction 3 (minor direction): minimum continuity, perpendicular to dip plane
[Figure: 3D axes (Up, South, East) showing the three directions]
Other Variograms
• Log Variogram
– calculate using Log of grades
• Pairwise Relative Variogram
– the squared difference of each pair in each lag is divided by the squared average of the pair values
• Indicator Variogram
– Use 0 or 1 (depending on whether below/above
indicator value).
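Two of these variants can be sketched for a regularly spaced 1D series. The function names are illustrative, and coding values above the threshold as 1 is an assumption, since the slide does not say which side receives the 1. The pairwise relative form shown here requires positive values.

```python
def indicator_transform(values, threshold):
    """Code each value as 1 if it is above the threshold, otherwise 0 (assumed convention)."""
    return [1 if v > threshold else 0 for v in values]

def semivariogram_1d(values, lag):
    """Half the average squared difference between values `lag` positions apart."""
    pairs = list(zip(values[:-lag], values[lag:]))
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

def pairwise_relative_variogram_1d(values, lag):
    """Each squared difference is divided by the squared average of its pair."""
    pairs = list(zip(values[:-lag], values[lag:]))
    terms = [(a - b) ** 2 / ((a + b) / 2) ** 2 for a, b in pairs]
    return sum(terms) / (2 * len(pairs))

grades = [5, 6, 5, 7, 8, 8, 6, 9, 9]
indicators = indicator_transform(grades, threshold=6)
indicator_gamma = semivariogram_1d(indicators, 1)      # indicator variogram at lag 1
relative_gamma = pairwise_relative_variogram_1d(grades, 1)
print(indicators, indicator_gamma, round(relative_gamma, 4))
```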
Variograms
What do they give us?
• Range of Influence of values
– Used to select search volume

• Anisotropy
– Understand directions of values
(of mineralisation or deposition)
Quiz, 10 minutes, 31-1-2019
1. Write the formulas for the arithmetic mean, variance and standard deviation
2. Geostatistics is most effective when the data used satisfy certain conditions. State those conditions!
3. What is the purpose of performing EDA before using geostatistics?
