
TA3201 Geostatistics for Resources Modeling

Review on Basic Statistics


Review on Basic Statistical Analysis

1. Univariate Statistics
Analysis of a single variable without considering sample locations. The
data are assumed to be realizations of a random variable.
2. Bivariate Statistics
Analysis of two different variables measured at the same location.
3. Spatial Statistics
Analysis of a variable that takes the spatial aspect of the data into
account. It can be applied to natural phenomena by assuming that the
data are realizations of a random function.

Univariate Analysis
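Parameters for Measures of Central Tendency

1. Mean : arithmetic average of the data values

2. Median : central data value

3. Mode : highest frequency value

4. Skewness : degree of asymmetry of the distribution

5. Kurtosis : degree of peakedness (tail weight) of the distribution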
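The parameters above can be sketched in a few lines of Python. This is a minimal illustration, not the formulas from the slide (which did not survive extraction): it assumes the moment-based definitions of skewness and kurtosis with a divide-by-n convention, and the `grades` values and the function name `describe_center_and_shape` are hypothetical.

```python
import statistics


def describe_center_and_shape(values):
    """Return mean, median, mode, skewness and kurtosis of a sample.

    Assumption: skewness and kurtosis use central moments divided by n;
    other textbooks apply small-sample corrections.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # 2nd central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # 3rd central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # 4th central moment
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "skewness": m3 / m2 ** 1.5,   # > 0 means a long tail of high values
        "kurtosis": m4 / m2 ** 2,     # 3 for a normal distribution
    }


grades = [1, 2, 2, 3, 2, 4, 9]  # hypothetical grade values
summary = describe_center_and_shape(grades)
print(summary)
```

With one high value in the sample, the mean is pulled above the median and the skewness is positive, which is the pattern of histogram (c).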

(a) (b) (c)

Skewness of some histograms: (a) symmetric (normal distribution); (b) negative skewness; and (c) positive skewness (lognormal distribution)

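Parameters for Measures of Dispersion

1. Range : range = Xmax - Xmin

2. Variance : average squared deviation of the data values from the mean

3. Standard Deviation : square root of the variance

4. Coefficient of Variation : CV = standard deviation / mean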
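As a rough sketch, the four dispersion parameters can be computed as follows. The slide's own variance formula is not available, so the n - 1 (sample) divisor used here is an assumption; the function name `dispersion` and the input values are hypothetical.

```python
import math


def dispersion(values):
    """Return range, variance, standard deviation and CV of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: sample variance with the n - 1 divisor.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    std_dev = math.sqrt(variance)
    return {
        "range": max(values) - min(values),
        "variance": variance,
        "std_dev": std_dev,
        "cv": std_dev / mean,  # dimensionless, so deposits can be compared
    }


stats = dispersion([1, 2, 3, 4, 5])  # hypothetical values
print(stats)
```

Because the CV is dimensionless, it allows the variability of different commodities and units to be compared directly.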

Coefficient of variation (CV) of grade values for selected mineral deposits worldwide

Type of mineral deposit                                    CV

Gold: California, USA; placer Tertiary                     5.10
Tin: Pemali, Bangka, Indonesia; Primary                    2.89
Gold: Loraine, South Africa; Black Bar                     2.81
Gold: Norseman, Australia; Princess Royal Reef             2.22
Gold: Grasberg, Papua, Indonesia                           2.01
Lead: Grasberg, Papua, Indonesia                           1.57
Tungsten: Alaska                                           1.56
Gold: Shamva, Rhodesia                                     1.55
Uranium: Yeelirrie, Australia                              1.19
Gold: Vaal Reefs, South Africa                             1.02
Zinc: Grasberg, Papua, Indonesia                           0.87
Zinc: Frisco, Mexico                                       0.80
Nickel: Kambalda, Australia                                0.70
Manganese                                                  0.58
Lead: Frisco, Mexico                                       0.57
Sulphur in Coal: Lati Mine, Berau, Indonesia               0.48
Lateritic Nickel: Gee Island, East Halmahera, Indonesia    0.44
Iron ore                                                   0.27
Bauxite                                                    0.22
Bivariate Analysis

Scatter plot: used to show the correlation between two different
variables (i.e. x and y variables) located in the same position

Covariance (Cxy): used to measure the joint dispersion of two different
variables (i.e. x and y variables) located in the same position

Coefficient of correlation: used to measure the strength of the linear
correlation between two different variables (i.e. x and y variables)
located in the same position

Linear regression of two variables:

y = a x + b

where: a = slope
b = Y-intercept
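A minimal sketch of these three quantities, assuming the usual definitions (covariance as the mean product of deviations, correlation as covariance divided by the product of the standard deviations, and an ordinary least-squares fit of y on x). The divide-by-n convention, the function name `bivariate_summary` and the paired values are assumptions for illustration only.

```python
import math


def bivariate_summary(x, y):
    """Covariance, correlation and least-squares line y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Assumption: population (divide-by-n) covariance and variances.
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n
    correlation = cov_xy / math.sqrt(var_x * var_y)
    a = cov_xy / var_x          # slope
    b = mean_y - a * mean_x     # Y-intercept
    return {"cov": cov_xy, "corr": correlation, "a": a, "b": b}


# Hypothetical paired values that lie exactly on y = 2x + 1.
result = bivariate_summary([1, 2, 3, 4], [3, 5, 7, 9])
print(result)
```

The correlation coefficient is independent of the divisor chosen (n or n - 1), as long as the same divisor is used for the covariance and both variances.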
About Outliers…

• The data distribution and the spatial structure used in estimation
can be very sensitive to outliers (i.e. outliers tend to produce
overestimates).
• There is no strict rule for deciding how to handle outliers, or even
for deciding what an outlier is: any solution is based on judgment and
common sense.
• The presence of outliers may require a special robust estimator of
the mean, e.g. “Sichel’s t-estimator” (Sichel, 1966).
• In this topic we only discuss the practical problem of correcting
individual values by cutting/capping high values.
• Another option: data values more than two standard deviations above
or below the mean can be considered outliers (anomalies).
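Cutting/capping can be sketched as follows. This is only an illustration: the grade values and the function name `cap_high_values` are hypothetical, and the top-cut of 3.26 is used purely as an example threshold.

```python
def cap_high_values(values, top_cut):
    """Replace every value above the top-cut with the top-cut itself."""
    return [min(v, top_cut) for v in values]


grades = [0.5, 0.8, 1.1, 0.9, 1.2, 0.7, 1.0, 12.0]  # hypothetical, one outlier
capped = cap_high_values(grades, top_cut=3.26)

raw_mean = sum(grades) / len(grades)
capped_mean = sum(capped) / len(capped)
print(f"mean before capping: {raw_mean:.4f}")
print(f"mean after capping:  {capped_mean:.4f}")
```

Capping keeps the sample in the data set (unlike deleting it) while limiting its influence on the estimated mean.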
Cumulative frequency curve of uranium grades and suggested correction
for outliers (David, 1988).
Probability plot of: (a) Pb and (b) Zn grades for each rock type. The parts
marked by dotted circles show some of the lowest and highest values, which
are considered outliers, while the horizontal dotted line in the graph is
the cut-off between lower and higher grades in the intrusive group
(Heriawan et al., 2008).
Probability plot of tin (Sn) grade, Pemali mine (TDH, up to -100)
Horizontal axis: Sn grade, TDH (kg/m3); vertical axis: cumulative percent
Top-cut for Sn grade = 3.26 kg/m3
Distribution of metal grades in each rock type for Cu-Au porphyritic deposit

Six probability plots of grade (ppm) versus cumulative percent, one each for
Au, Cu, Ag, Pb, Zn and Mo, with separate curves for four rock types:
Acidic-Andesitic Volcanics, Breccia, Porphyritic Diorite and Tuff.
One method to differentiate the background and anomaly data

Histogram (values 0.6 to 1.7, frequency 0 to 6) divided by standard deviation (SD) from the mean (M):
M - 2 SD (0.7), M - 1 SD (0.9), Mean (1.1), M + 1 SD (1.3), M + 2 SD (1.6)
Zones from low to high values: Anomalous, Slightly Anomalous, Background, Slightly Anomalous, Anomalous
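The zoning in this figure can be sketched as a simple classification rule. This sketch assumes that "Background" means within one standard deviation of the mean, "Slightly Anomalous" means between one and two standard deviations, and "Anomalous" means beyond two, which is how the figure labels appear to be arranged. The mean of 1.1 and SD of 0.2 are approximate readings from the figure labels, and the function name `classify` is hypothetical.

```python
def classify(value, mean, sd):
    """Classify a value by its distance from the mean in units of SD."""
    deviation = abs(value - mean)
    if deviation <= sd:
        return "background"
    if deviation <= 2 * sd:
        return "slightly anomalous"
    return "anomalous"


MEAN, SD = 1.1, 0.2  # assumed, read approximately from the figure labels

for value in (1.0, 1.45, 1.8, 0.6):
    print(value, classify(value, MEAN, SD))
```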

Recognizing different populations…

Data-point locations of the two populations of sodium content, plotted with
different symbols: low content (smaller than 1 %) and high content (larger than 1 %)

Fe vs. Ni grades in Laterite Nickel Deposit

Limonite zone

Saprolite zone

The perspective views of Pb-Zn grades in intrusive group for: (a) Pb and (b) Zn,
with blue and grey colors showing the high grade and low grade respectively

Statistics of Pb-Zn grades for each cut-off in intrusive group

Statistic       Pb >0.005%   Pb <0.005%   Zn >0.01%   Zn <0.01%
Nb of values    979          10671        2666        9107
Min             0.0050       0.0003       0.0100      0.0004
Max             0.8537       0.0049       3.2509      0.0099
Mean            0.0354       0.0014       0.0653      0.0050
Median          0.0086       0.0012       0.0150      0.0047
Std error       0.0032       0.000010     0.0047      0.000022
Variance        0.0103       0.000001     0.0579      0.000004
Coef. of var    2.8744       0.7010       3.6828      0.4128
Distribution of Spatial Data

Isotropic
Different Population

Trend (plane)
An example of spatial correlation
of data: the maps show good
correlation between Cu and Au
grades.
1 1 1 1 2 2 2 2 2 1 1 1
1 1 2 2 2 3 2 3 3 2 2 1
1 2 2 2 2 4 3 3 4 3 2 1
1 2 2 4 4 5 5 5 3 3 3 2
2 2 3 7 8 6 7 6 4 2 2 2
2 2 4 7 9 7 6 5 6 4 2 2
2 2 4 5 8 6 5 7 5 4 2 1
1 2 3 3 2 4 5 3 1 2 2 1
1 1 2 2 2 2 3 2 1 1 1 1
1 1 2 2 2 2 2 2 1 1 1 1

Example of data distribution in blocks

• Within the blocks (population), selecting a specific area gives a
different histogram (i.e. a different distribution).
• Selecting all blocks produces histogram C, selecting the light grey
blocks produces histogram A, and selecting the dark grey blocks
produces histogram B.

Histogram of data distribution according to the block selection
(a) Example of blocks distribution in four different mine sites
(b) Histogram of data distribution from the blocks in four different mine sites

• If the cut-off grade is known to be ≥ 2%, then the 50 × 50 m blocks
with grade ≥ 2% are distributed as shown in figure (a) for the four
different mine sites.
• The selected blocks in the four different mine sites (by chance)
have the same histogram, as shown in figure (b).
• If, for technical reasons, mineable blocks must have a minimum area
of 100 × 100 m (four adjacent blocks), then not all selected blocks
are mineable.
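The difference between "selected" and "mineable" blocks can be sketched as below. This is one possible reading of the constraint, stated as an assumption: a block counts as mineable only if it belongs to at least one 2 × 2 group of blocks (100 × 100 m) in which all four blocks meet the cut-off. The small grid of grades and the function names are hypothetical.

```python
def above_cutoff(grid, cutoff):
    """Boolean mask of blocks whose grade meets the cut-off."""
    return [[value >= cutoff for value in row] for row in grid]


def mineable(mask):
    """Mark blocks that lie in at least one fully selected 2 x 2 window."""
    rows, cols = len(mask), len(mask[0])
    result = [[False] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            window = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)]
            if all(mask[i][j] for i, j in window):
                for i, j in window:
                    result[i][j] = True
    return result


# Hypothetical block grades (%), each block 50 x 50 m.
grades = [
    [1, 2, 3, 1],
    [1, 2, 2, 1],
    [3, 1, 1, 1],
]
selected = above_cutoff(grades, cutoff=2)
mine = mineable(selected)
n_selected = sum(map(sum, selected))
n_mineable = sum(map(sum, mine))
print(n_selected, "blocks selected,", n_mineable, "blocks mineable")
```

In this toy grid the isolated high-grade block in the lower-left corner passes the cut-off but is not mineable, so two selections with the same histogram can still differ in recoverable tonnage depending on how the blocks are arranged.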
Pattern-1
Block size = 50 × 50 m
Histogram of Pattern-1 (horizontal axis 1.0 to 10.0; vertical axis 0 to 40)

Pattern-2
Block size = 50 × 50 m
Histogram of Pattern-2 (horizontal axis 2.0 to 10.0; vertical axis 0 to 15)

Pattern-3
Block size = 50 × 50 m
Histogram of Pattern-3 (horizontal axis 2.0 to 10.0; vertical axis 0 to 15),
showing a population for low grade and a population for high grade
Classical Statistics vs. Spatial Statistics

• Two methods are available for statistical analysis of mineral
deposit characteristics: classical statistics and spatial statistics.
• Classical statistics is used to define the properties of sample
values under the assumption that they are realizations of random
variables.
• In this case, the sample composition/support is largely ignored, and
all sample values are assumed to have the same probability of
being picked.
• The presence of trends and ore shoots in mineralization zones is
ignored.
• In the earth sciences, however, two samples taken close together
tend to have more similar values than samples taken farther apart.
Centre de Géostatistique, École des Mines de Paris,
Fontainebleau

• Spatial statistics, on the other hand, assumes that the sample values
are realizations of a random function.
• Under this hypothesis, sample values are a function of their locations
in the deposit, so their relative positions are considered in the analysis.
• The similarity of sample values as a function of the distance between
samples is the basic theory of spatial statistics.
• To define how closely points in the deposit are spatially correlated,
we must know the structural function, which is represented by the
variogram (semi-variogram).
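As a rough illustration of the structural function, the sketch below computes an experimental semi-variogram for regularly spaced samples along a line. It assumes the commonly used estimator, half the mean squared difference between values separated by a lag h; the slides do not give the formula at this point, so treat it as an assumption. The transect values are taken from one row of the block example earlier in the deck, and the function name `semivariogram` is hypothetical.

```python
def semivariogram(values, max_lag):
    """Experimental semi-variogram for equally spaced samples.

    gamma(h) = sum of (z[i] - z[i + h])**2 over all pairs, divided by 2 * N(h),
    where N(h) is the number of pairs separated by h sample intervals.
    """
    gammas = {}
    for h in range(1, max_lag + 1):
        pairs = [(values[i], values[i + h]) for i in range(len(values) - h)]
        gammas[h] = sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))
    return gammas


transect = [2, 2, 3, 7, 8, 6, 7, 6, 4, 2, 2, 2]  # one row of the block example
for lag, gamma in semivariogram(transect, max_lag=4).items():
    print(f"lag {lag}: gamma = {gamma:.3f}")
```

Low values of gamma at short lags mean that nearby samples are similar; the way gamma grows with distance describes the spatial structure of the deposit.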
Why spatial analysis?

• Statistical description does not take the data location into
account.
• Statistical description does not take the data density into
account.
• Statistical description produces the same result even if the
data locations are changed randomly.
• Spatial analysis can start by plotting the data distribution
on a map.
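The third point can be checked with a small experiment: shuffling the locations of the values leaves the mean, variance and histogram unchanged, but it changes how similar neighbouring values are. The transect values, the fixed random seed and the helper `mean_abs_neighbor_diff` are hypothetical choices for this sketch.

```python
import random
import statistics


def mean_abs_neighbor_diff(values):
    """Average absolute difference between consecutive samples."""
    return statistics.mean(abs(a - b) for a, b in zip(values, values[1:]))


transect = [1, 1, 2, 2, 3, 4, 5, 7, 8, 9]  # hypothetical values with a trend
shuffled = transect[:]
random.Random(0).shuffle(shuffled)          # same values, random locations

for name, data in (("original", transect), ("shuffled", shuffled)):
    print(
        name,
        "mean =", statistics.mean(data),
        "variance =", statistics.pvariance(data),
        "neighbor diff =", round(mean_abs_neighbor_diff(data), 3),
    )
```

The classical statistics are identical for both arrangements; only a measure that uses the sample positions, such as the neighbour difference here, can tell them apart.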

Fundamentals of Geostatistics
Four point patterns (point colors: Orange, Red, Blue, Green):
【Random Data Distribution】 【Anisotropic Distribution】
【Biased Distribution】 【Distribution with Trend】

The same average and variance, the same histogram,
BUT largely different spatial distribution.

Importance of considering data location
