
EDA: Exploratory Data Analysis

Content

• Geology model
- Not EDA, but a quick word.
• Purpose of EDA
• Usual Graphs
- Piechart
- Histogram, probability plot
- Boxplot
- Scattergram, regression
- QQ plot
- Relative difference plot
• EDA Envelope
- Important concept
• EDA - Applications
- Understanding/Checking Data
  - Maps: location, trends, ...
  - Sample statistics: spacing, orientation, ...
  - Checking pairs of values
  - Checking twin holes, grade profiles
- Important for Estimation/Simulation
  - Compositing
  - Declustering
  - Trimming, cutting outliers
  - Checking geological boundaries
• Reference to Geological Process
- GP 4.0 “Create Geology Model”
- GP 5.3 “Choose Estimation Method and Parameters”

1 2

3 4
Geology Model

• A good geology model is most important.
• Only the geology that tells something about the mineralization is useful.
• A geology model consists of several geology domains. Each domain has its own statistical characteristics that do not depend on location (homogeneity ←→ stationarity).
• Boundaries between geological domains can be hard or soft.
• Statistics have to be computed per geology domain and within the EDA envelope.
• Useful statistics to check differences between domains:
- Multiple boxplots, histograms
- Multiple cumulative distribution
- Variograms (different directions)
• Geological Process 4.0 “Create Geology Model”

GEOLOGY MODEL: Geological Process

Geological Process 4.0: Create Geological Model
- 4.1 Confirm Governance Framework
- 4.2 Determine Criteria to be Modeled (Geological, Mining, Metallurgical)
- 4.3 Validate Data
- 4.4 Interpret Geological Criteria
- 4.5 Produce 3D (Block) Geology Model
- 4.6 Confirm Geology Model
- 4.7 Freeze Geology Model

5 6
[Figure lascris_012: PLACER DOME INC., Las Cristinas CO & CM, Section 9200 N, geology + DDH samples. Geology legend: Sul. Saprolite, Ox. Saprolite, Crb Stbl Brk, Overburden, Crb Lch Brk, Saprock, Wst, Dyke. DDH Au legend: 0.0, 0.5, 2.0, 4 g/t. Horizontal axis 21400 to 22400; vertical axis 200 to -400.]

7 8
EDA Purpose

• Data familiarization
• Detecting possible errors
• Identifying/confirming different mineralizations
• Answering questions such as:
- Ordinary or indicator kriging ?
- What trimming values ?
- Mean and variance ?
• Providing information for:
- Model validation
- Reconciliation

EDA: Geological Process

• Geological Process 5.3 “Choose Estimation Method and Parameters”
- 5.3.2 “Sample Compositing”
- 5.3.3 “Exploratory Data Analysis”
- 5.3.4 “Choice of Estimation Method”
- 5.3.5 “Cutting and/or Indicator Classes”
- 5.3.7 “Assess Boundary Conditions”

9 10

EDA: Geological Process

Geological Process 5.3: Choose Estimation Method and Parameters
- 5.3.1 Select SMU
- 5.3.2 Sample Compositing
- 5.3.3 Exploratory Data Analysis
- 5.3.4 Choice of Estimation Method (OK, IK, ID, ...)
- 5.3.5 Cutting and/or Indicator Classes
- 5.3.6 Variography
- 5.3.7 Assess Boundary Conditions
- 5.3.8 Develop Search Criteria
- 5.3.9 Validate Estimation Parameters

11 12
EDA GRAPHS Piecharts

[Figure f. 64: AU piecharts of domains DOM-03 to DOM-07, in three panels: by number of samples (total = 86721 samples), by sample weight, and by sample weight x grade.]

• Useful to show the relative importance of the geological domains.

EDA GRAPHS Histograms

[Figure f. 62: histograms of BH Au (g/t), arithmetic scale (0.0 to 20.0) and logarithmic scale (.1 to 100.), frequency on the vertical axis. Number of data 99688; number trimmed 2295; mean 2.170; std. dev. 4.949; coef. of var. 2.281; maximum 330.000; minimum 0.100.]

13 14

EDA GRAPHS Probability Plots

[Figure f. 65: two panels. Probability plot on an arithmetic grade scale (distribution is not normal) and log probability plot (distribution is approx. lognormal). Horizontal axis: probability of exceeding grade (99.99 to 0.01); vertical axis: grade.]

EDA GRAPHS Multiple Probability Plots

Different Mineralizations

[Figure pdi_12: probability plot, Domains 04 to 07, Au Dwt. Horizontal axis: probability of exceeding grade (99.99 to 0.01); vertical axis: grade (log scale, 0.01 to 900).]

15 16
EDA GRAPHS Boxplot

• A graph summarizing the essential statistics of a distribution.

[Figure f. 61: boxplot of AU grades (scale 0 to 16), annotated from top to bottom: maximum, outliers, upper quartile, mean, median, lower quartile, minimum.]

EDA GRAPHS Boxplots

[Figure f. 63: AU boxplots for DOM-03 to DOM-07, Au (g/t) on a log scale from 0.1 to 1000.]

                  DOM-03    DOM-04    DOM-05    DOM-06    DOM-07
Number of data      4859     28050     20902     13793     19117
Mean              4.7667    1.2245    0.7509    6.1199   10.6783
Std. Dev.        21.9575    6.3134    3.7238   36.3533   45.2865
Coef. of Var.     4.6065    5.1559    4.9593    5.9402     4.241
Maximum              680       370       398       972      1000
Upper quartile       2.9       0.9       0.5       4.5       6.9
Median                 1      0.23      0.11       1.2      2.25
Lower quartile       0.3       0.1       0.1       0.2       0.5
Minimum             0.01      0.01      0.01      0.01      0.01

• Multiple boxplots can be displayed on the same page
• Great display!

17 18

EDA GRAPHS Scattergram

• Scattergram + smoothed regression (Y over X).

[Figure f.108: LAS CRISTINAS - TRENCHES, original Au versus reject Au (log scales, 1. to 100.). Nb. of data 477; X var: mean 5.814, std. dev. 6.110; Y var: mean 6.021, std. dev. 7.264; correlation 0.935.]

EDA GRAPHS QQ Plot

• Useful to compare 2 populations, say A and B.
• The quantiles of A and B, $(a_1, b_1), (a_2, b_2), \dots, (a_{100}, b_{100})$, are plotted on an X/Y graph.

[Figure: four schematic QQ plots, A on the X axis and B on the Y axis (0 to 10):
- Same distribution
- Same shape, A more variable than B
- Same shape, A less variable than B
- Different shape]
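The QQ plot construction can be sketched in a few lines of Python. This is a minimal illustration, not the course software: the function names, the plotting positions (k − 0.5)/100 and the sample grades are all assumptions made for the example.

```python
def quantile(sorted_values, p):
    """Linearly interpolated quantile of pre-sorted data, with 0 <= p <= 1."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def qq_pairs(a, b, n_quantiles=100):
    """Return the pairs (a_k, b_k) to plot; A and B may have different sizes."""
    sa, sb = sorted(a), sorted(b)
    # Assumed plotting positions: mid-points of n_quantiles equal classes.
    probs = [(k - 0.5) / n_quantiles for k in range(1, n_quantiles + 1)]
    return [(quantile(sa, p), quantile(sb, p)) for p in probs]


# Hypothetical grades: B has the same shape as A but is twice as variable.
grades_a = [0.2, 0.5, 0.9, 1.4, 2.0, 3.1, 4.8, 7.5]
grades_b = [2.0 * g for g in grades_a]
pairs = qq_pairs(grades_a, grades_b)
```

With these inputs every point falls on the straight line b = 2a, which is the "same shape, different variability" case: a straight line that departs from the 45° line.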

19 20
EDA GRAPHS Relative Difference Plot

• Useful to investigate conditional bias between two populations, say A and B
- X axis: mean of the pair of values, $(A+B)/2$
- Y axis: relative difference between the values, $\dfrac{A-B}{(A+B)/2} \times 100\%$

[Figure f.109: relative difference plot. X axis: $(A+B)/2$, from 0 to 9; Y axis: $100\% \times (A-B) / ((A+B)/2)$, from −30 to +30.]

• Notes on graph:
- At low values, A < B
- At high values, A > B
- A few outliers (analytical errors?).
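As a small worked example, the two plotted coordinates can be computed as follows. The function name and the assay pairs are hypothetical, and pairs with a zero mean are simply skipped as an assumed convention.

```python
def relative_differences(pairs):
    """Return (mean, relative difference in %) for each (A, B) pair."""
    points = []
    for a, b in pairs:
        mean = (a + b) / 2.0
        if mean == 0:
            continue  # relative difference is undefined when A + B = 0
        points.append((mean, 100.0 * (a - b) / mean))
    return points


# Hypothetical duplicate assays (A, B) in g/t.
assay_pairs = [(0.8, 1.2), (2.0, 2.0), (12.0, 8.0)]
points = relative_differences(assay_pairs)
```

Here the low-grade pair plots at about −40%, the identical pair at 0%, and the high-grade pair at +40%, the kind of pattern that would suggest a conditional bias.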

21 22

23 24
EDA Envelope (1/4)

• An EDA envelope is a 3D envelope within which statistics are computed.
• Which statistics?
- Declustered mean, variance.
- Choice of trimming values.
- Choice of indicator cut-off grades.
- Variogram.
- Resource block model validation.
• Why an EDA envelope?
- To restrict statistics to where “it matters”.
  - Fairly tight around sample locations
  - No extensive “waste” areas
  - Well covered with samples
- To reduce the impact of “fringes” when declustering and computing statistics.
- To make sure that comparisons made during validation (e.g. sample vs. kriged average grades) correspond to the same material (i.e. material within the EDA envelope for both samples and kriged estimates).

EDA Envelope (2/4)

[Figure f.185: schematic of the project area showing the EDA envelope, very low grades and "other" grades.]

• How to define an EDA envelope?
- Don’t be too precise.
- Fairly tight around reasonably well sampled zone.
- Around material that matters. Significant waste zone well below cut-off can be ignored.
- Digitize on a series of benches, then wireframe and create a 0/1 indicator grid.
- Generally, geology can be ignored when defining the EDA envelope.

25 26

EDA Envelope (3/4)

[Figure pdi-10: Wallaby, 10m composite Au values (g/t; classes 0.2, 1.0, 2.0) with the EDA envelope outline. 6) RL = 302.5 +/- 2.5 -- 5 ft Bench Toe = 300 -- (22/10/99). Easting 431600 to 432800; Northing 807800 to 809000.]

EDA Envelope (4/4)

• Some remarks:
- The EDA envelope is used to compute statistics.
- All data within and outside the envelope can be used at the estimation step.
- There can be estimates outside the EDA envelope.
- A geology model has to be considered in addition to the envelope.
• Geological Process 5.3.3 “Exploratory Data Analysis”

27 28
EDA MAPS: Location Maps

[Figure f. 175: locations & values of composites, displayed as symbols, contours and colour scale.]

EDA MAPS: Checking Trends

[Figure f197a: 2.5ft composites, declustered, all domains. Grade profiles by coordinate axes (bh_met1_tr in g/t against Easting, Northing and Elevation in m) and number of data by coordinate axes.]

EDA MAPS: Locating anomalies

[Figure: assay location, Au g/t. 1.5m comps, Domain 02 in EDA, all data >= 0.5 g/t. Sections viewed on East-West, looking North (78784 N, 78853 N, 78923 N, 78992 N). Number of data: 1808.]

EDA MAPS: Proportional Effect

[Figure f195: check for proportional effect. Moving window statistics on 2m comps, cell size 200x70x70: maps of mean grade (g/t), variance and coefficient of variation (Easting X, Northing Y, Elevation Z), and a plot of mean grade (g/t) vs. standard deviation.]
EDA: Sample Spacings, Grade Stats

• Spacing between closest samples from 2 different holes

Domain    Ave     Min     Max     Med    %Data
RT2      15.7    0.36    85.9    11.3     100
RT3      11.3    0.07    64.6     9.1     100
RT5      10.3    0.01    73.9     8.6     100
...       ...     ...     ...     ...     ...

• Number of assays above cut-off grades

1.5m Au Composites - In EDA - Domain 02
Number of Assay Intervals Above Grade Cut-Offs (f194)

            Cut-Offs                 Max.     Avg. Dist.
Hole ID     0.5   1.0   2.5   5.0    (g/t)    (m)*
1003          2     1     0     0     2.24      3.0
1004          2     2     0     0     1.98     31.9
10203        12    11     9     4    10.12      4.2
10204         7     2     0     0     1.56      8.2
10205         6     3     1     1     8.16      3.1
10452         9     2     0     0     2.4      34.5
10466         8     6     2     0     4.11     20.2
1067          3     2     1     0     4.11      9.7
12109        15     9     2     0     3.0       5.9
12110        11     8     3     0     3.7       6.3
12111         9     4     0     0     2.06      5.8
12112        11     8     2     0     4.64      7.3
12113        12     5     0     0     2.42      5.5
1222          2     1     1     0     3.99     10.3
1271          0     0     0     0     0.03      8.3
1279          0     0     0     0     0.03      8.3

*Avg. Dist.: Average distance between hole and closest one.

EDA: Orientation Plot

[Figure fig192: orientations of consecutive pairs in same hole. Polar plot with azimuth from N (0˚) through E, S and W in 10˚ steps and dip from -10˚ to -80˚; symbols classed by number of pairs: < 10, 10 to 100, 100 to 1,000, 1,000 to 10,000, > 10,000.]

33 34

35 36
EDA: Checking Pairs of Values

ASSAY A: NORMAL ORIGINAL
ASSAY B: NORMAL REJECT

[Figure pdi_0014.eps: side-by-side boxplot of AU (OZ/T), then scatterplot, Q-Q plot and reldiff plot ([A-B]/AVG in % against average [A+B]/2 in OZ/T) of ASSAY A vs. ASSAY B, each on arithmetic and log scaling.]

               ASSAY A    ASSAY B
NUMBER             189        189
MEAN             0.143      0.146
STDEV            0.136      0.162
MAXIMUM          1.207      1.285
75TH %-ILE       0.177      0.175
MEDIAN           0.094      0.091
25TH %-ILE       0.064      0.055
MINIMUM          0.001      0.001
LINEAR CORRELATION: 0.699
RANK CORRELATION: 0.799

• Geological Process 3.7.8 “Complete QA/QC on Analysis”

EDA: Checking Twin Holes

[Figure: comparison of drill holes D-1 and D-2, Au (in g/t, log scale 0.01 to 100.0) against depth (0 to 100); panels labelled RS2 Au and DDH161 Au.]

Number of pairs: 66
D-1 mean: 1.779
D-2 mean: 1.854
D-1 std. dev.: 1.487
D-2 std. dev.: 0.927
Linear correlation: 0.553
Rank correlation: 0.591
37 38

EDA: Grade profiles

[Figure: comparison of Au, Ag, Cu, Zn, C and S for DDH161. Three down-hole profiles against depth (0 to 220): gold + silver (in g/t), copper + zinc (in %), carbon + sulphur (in %), each on a log scale from 0.01 to 100.]

39 40
COMPOSITING 1/3

• Support size (point, 2m sample, block, etc.) is important.
- Different support sizes result in different variabilities.
- Blocks are less variable than samples.
• In theory, samples must be representative of the population. 5m samples are not representative of a 1m sample population.
• (Most) estimation algorithms do not account for sample size, e.g. do not distinguish between a 10 m and a 1 m sample.
• Solution: composite samples so that resulting “composite lengths” are ± identical.

[Figure f. 178: original samples vs. regular composites along a hole crossing a "hard" geological boundary between Rock A and Rock B.]

COMPOSITING 2/3

• Compositing may be required if:
- Sample lengths are much different: average length of 1.5m, many 50cm long samples centered on high grade veins.
• Before compositing:
- Histogram of sample lengths.
- Histograms of sample grades per interval of lengths.
- Trim or cut very high grade (outliers) to avoid smearing them over much longer lengths (More about outliers in Section “Bivariate Statistics”).
• Composite length should be such that:
- Enough variability is retained when estimating.
- No geological boundary crossing.
- Do not exceed block size:
  - 5m benches: 2/3m composites OK; 5m is maximum.
• If possible, composite only what is needed, i.e. leave samples untouched if their lengths are within specified Min/Max limits.
41 42

COMPOSITING 3/3

• Impact of compositing:
- Lose original samples;
- Grade variability reduced;
- Number of samples reduced;
- Geological contacts can be smeared out.
• If an original sample length is very long, compositing
will split it in many regular smaller lengths.
- OK if the original grade is very low.
- Problem if original grade is very high, because the
location of the high grade is unknown.
• Useful check: display drill holes with the composited grade histogram on one side and the original grade histogram on the other side.
• Geological Process 5.3.2 “Sample Compositing”
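A minimal sketch of regular down-hole compositing is given below, assuming contiguous intervals given as (from, to, grade) and a length-weighted average within each composite. The function name and the data are hypothetical; to respect hard geological boundaries, the function would be applied to the intervals of one domain at a time.

```python
def composite(intervals, comp_len):
    """Length-weighted regular composites from (from, to, grade) intervals."""
    start, end = intervals[0][0], intervals[-1][1]
    composites = []
    c_from = start
    while c_from < end:
        c_to = min(c_from + comp_len, end)
        metal = 0.0    # sum of length x grade inside the composite
        covered = 0.0  # sampled length inside the composite
        for s_from, s_to, grade in intervals:
            overlap = min(s_to, c_to) - max(s_from, c_from)
            if overlap > 0:
                metal += overlap * grade
                covered += overlap
        if covered > 0:
            composites.append((c_from, c_to, metal / covered))
        c_from = c_to
    return composites


# Two short samples are merged, one long sample is kept as one composite.
short_then_long = composite([(0.0, 1.0, 2.0), (1.0, 2.0, 4.0), (2.0, 4.0, 1.0)], 2.0)
# A single long sample is split into regular lengths carrying the same grade.
long_sample = composite([(0.0, 4.0, 5.0)], 2.0)
```

The second call illustrates the caveat about very long samples: both 2 m composites inherit the 5.0 grade, although the true location of the high grade within the original 4 m is unknown.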

43 44
DECLUSTERING: Introduction

• Clusters of samples are common in the mining industry.

[Figure f.116: sample location map with clusters highlighted.]

• Potential problem:
- Clusters are often located within high grade zones. Their impact can be a serious overestimation of the average grade and variability if not accounted for.
• Solution:
- Declustering.
- Objective is to reduce the “weight” of each clustered sample roughly in proportion to the cluster sampling density.

DECLUSTERING: Methods

• Cell Declustering
- Superimpose a grid of cells on the data;
- Cell size roughly the average sample spacing, ignoring clusters. There is on average 1 sample per cell, except where clustered. The cell can be rectangular.
- The declustered weight of a given sample is $1/N_c$, where $N_c$ is the number of samples located in the corresponding cell.

[Figure f.184b: six samples (1, 2, 8, 6, 1, 2 g/t Au) on a grid of cells. 1 per cell ==> w = 1.0 per sample; 2 per cell ==> w = 0.50 per sample.]

Average

Naive: $\dfrac{1 + 2 + 8 + 6 + 1 + 2}{6} = 3.33$ g/t Au

Declustered: $\dfrac{1 + 2 + \frac{8+6}{2} + 1 + 2}{5} = 2.60$ g/t Au
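The cell declustering rule (weight 1/Nc, then rescaling) can be sketched as below. The sample coordinates, the 10-unit cell size and the function name are assumptions chosen so that the 8 and 6 g/t samples share one cell, as in the six-sample example; the source gives only the grades.

```python
from collections import Counter


def cell_declustering_weights(coords, cell_size, origin=(0.0, 0.0)):
    """Weights proportional to 1/Nc, rescaled to sum to 1."""
    cells = [
        tuple(int((c - o) // cell_size) for c, o in zip(point, origin))
        for point in coords
    ]
    counts = Counter(cells)  # Nc: number of samples per occupied cell
    raw = [1.0 / counts[cell] for cell in cells]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical locations; the 8 and 6 g/t samples fall in the same cell.
coords = [(5, 5), (15, 5), (22, 25), (27, 22), (35, 15), (5, 25)]
grades = [1.0, 2.0, 8.0, 6.0, 1.0, 2.0]

weights = cell_declustering_weights(coords, cell_size=10.0)
naive_mean = sum(grades) / len(grades)
declustered_mean = sum(w * g for w, g in zip(weights, grades))
```

In practice the result depends on the cell size and on the grid origin, so it is worth repeating the calculation with a few shifted origins before settling on a set of weights.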

45 46

DECLUSTERING: Methods

• Polygonal declustering
- In 2D, the declustered weights are proportional to the areas of the polygons of influence of the corresponding data.
- In 3D, the same principle is applied on a bench basis.

[Figure f.184C: polygons of influence of six samples. Areas 37, 28, 17, 21, 47, 44 for grades 1, 2, 8, 6, 2, 1 g/t Au respectively.]

Average

Naive: $\dfrac{1 + 2 + 8 + 6 + 1 + 2}{6} = 3.33$ g/t Au

Declustered: $\dfrac{37 \times 1 + 28 \times 2 + 17 \times 8 + 21 \times 6 + 47 \times 2 + 44 \times 1}{37 + 28 + 17 + 21 + 47 + 44} = 2.54$ g/t Au

DECLUSTERING: Methods

• Kriging
- Kriging is an excellent declustering tool (see later).
- A regular grid of cells is superimposed on the data. The size of the cell does not matter too much.
- The cells are “kriged” using the samples. The sample kriging weights are kept in memory.
- The declustered weight of a given sample is the sum of the corresponding kriging weights kept in memory.
• Nearest Neighbour Model
- Data is used to estimate a regular cell/block model. Closest data is used to estimate each block.
- Resulting distribution is the distribution of estimated blocks.
- The shape of the resulting distribution is very similar to the shape of the polygonally declustered distribution.
- Advantage: can be done with Vulcan and/or Datamine.
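The sketch below first checks the area-weighted average of the polygonal example, then illustrates the nearest neighbour idea: each cell of a regular grid is assigned to its closest sample, so the share of cells per sample approximates the polygon-of-influence area. The three-sample layout, the 10 x 10 domain and the function name are hypothetical.

```python
# Area-weighted average from the polygonal example (areas and grades as given).
areas = [37, 28, 17, 21, 47, 44]
grades = [1, 2, 8, 6, 2, 1]
polygonal_mean = sum(a * g for a, g in zip(areas, grades)) / sum(areas)


def nearest_neighbour_weights(samples, x_range, y_range, cell):
    """Weight of a sample = share of grid cells whose centre is closest to it."""
    counts = [0] * len(samples)
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    for ix in range(nx):
        for iy in range(ny):
            cx = x_range[0] + (ix + 0.5) * cell
            cy = y_range[0] + (iy + 0.5) * cell
            d2 = [(sx - cx) ** 2 + (sy - cy) ** 2 for sx, sy in samples]
            counts[d2.index(min(d2))] += 1  # ties go to the first sample
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical layout: two close high-grade samples and one isolated low grade.
samples = [(1.0, 5.0), (3.0, 5.0), (9.0, 5.0)]
sample_grades = [8.0, 6.0, 1.0]
nn_weights = nearest_neighbour_weights(samples, (0.0, 10.0), (0.0, 10.0), 1.0)
nn_mean = sum(w * g for w, g in zip(nn_weights, sample_grades))
naive_mean = sum(sample_grades) / len(sample_grades)
```

The unbounded fringe polygons mentioned in the notes on large weights show up here as well: the weights depend on where the grid limits are drawn, which is one reason to work within an EDA envelope.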

47 48
DECLUSTERING: Methods

• Automatic cell declustering
- Based on the assumption that clusters are always in high grade zones. The naive average is therefore overestimated.
- Several cell sizes are automatically used for declustering.
- The selected cell size is the one that gives the lowest average.

[Figure f.177: declustered mean vs. cell size, from tiny to huge, with the optimum cell size at the minimum of the curve. Note: tiny or huge cell size <=> no declustering.]

- Nice in theory, but often inconclusive in practice. ⇒ Not recommended.

DECLUSTERING: Methods

• Note 1:
- Some declustered weights can be very large due to:
  - huge polygonal area (on the fringe)
  - special sample location (start/end of hole)
- The solution consists in:
  - setting a maximum value when declustering
  - declustering within an EDA envelope
• Note 2:
- Polygonal declustering incomplete due to declustering radius smaller than small polygons.
- Solution consists in:
  - Checking maps of declustered areas (op99/polydeclus)
  - Increasing declustering radius and using EDA envelope to control the fringes
  - Checking declustering weights and trimming them if necessary.
• Useful displays
- Histogram of weights
- Maps of weight values
- Maps of declustered areas
• Geological Process 5.3.3 “Exploratory Data Analysis”
49 50

DECLUSTERING: Statistics

• Let $z(x_i),\ i = 1, \dots, N$ be $N$ AU values and $w_i,\ i = 1, \dots, N$ the rescaled declustered weights such that:

$$\sum_{i=1}^{N} w_i = 1$$

• Mean:

$$m_Z = \sum_{i=1}^{N} w_i \, z(x_i)$$

• Variance:

$$s_Z^2 = \sum_{i=1}^{N} w_i \left[ z(x_i) - m_Z \right]^2 = \sum_{i=1}^{N} w_i \, z^2(x_i) - m_Z^2$$

• Standard deviation: $s_Z = \sqrt{s_Z^2}$.
• Median: the value $z_{50}$ such that the sum of the declustered weights of the values less than $z_{50}$ is 0.5.
• Note: if pairs of values are available, the covariance (see bivariate statistics) can also be declustered.
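These weighted statistics translate directly into code. The sketch below is an illustration under assumptions: the function name is hypothetical, the weights are rescaled inside the function, and the median is taken as the first sorted value at which the cumulative weight reaches 0.5 (a discrete stand-in for the definition above). The data reuse the six-sample cell declustering example.

```python
import math


def declustered_stats(values, weights):
    """Return (mean, variance, standard deviation, median) using weights."""
    total = sum(weights)
    w = [wi / total for wi in weights]  # rescale so that the weights sum to 1
    mean = sum(wi * zi for wi, zi in zip(w, values))
    variance = sum(wi * (zi - mean) ** 2 for wi, zi in zip(w, values))
    cumulative = 0.0
    median = None
    for zi, wi in sorted(zip(values, w)):
        cumulative += wi
        median = zi
        if cumulative >= 0.5:
            break
    return mean, variance, math.sqrt(variance), median


grades = [1.0, 2.0, 8.0, 6.0, 1.0, 2.0]
raw_weights = [1.0, 1.0, 0.5, 0.5, 1.0, 1.0]  # 1/Nc from cell declustering
mean, variance, std_dev, median = declustered_stats(grades, raw_weights)

# Second form of the variance: sum of w * z^2 minus the squared mean.
w_rescaled = [wi / sum(raw_weights) for wi in raw_weights]
variance_alt = sum(wi * zi ** 2 for wi, zi in zip(w_rescaled, grades)) - mean ** 2
```

Both forms of the variance agree (5.24 here, for a declustered mean of 2.60 g/t), which is a quick check that the weights have been rescaled correctly.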
51 52
DECLUSTERING: Example (1/3)

[Figure: AU naive histogram, Au (g/t) from 0.0 to 10.0. Nb. of data 4296; mean 2.059; std. dev. 2.179; coef. var 1.058; maximum 16.000; minimum 0.000.]

[Figure: AU declustered histogram, Au (g/t) from 0.0 to 10.0. Nb. of data 4296; mean 1.763; std. dev. 1.984; coef. var 1.125; maximum 16.000; minimum 0.000.]

• -14% change in grade

DECLUSTERING: Example (2/3)

[Figure: histogram of declustered weights, decl. weight on a log scale from .01 to 100. Nb. of data 4296; mean 3.109; std. dev. 4.164; coef. var 1.339; maximum 100.000; minimum 0.037.]

• Very few excessive weights. Keep as is, or trim to 40.


53 54

DECLUSTERING: Example (3/3)

[Figure: declustering polygons and geology, Elev: 202.5. Easting 75600 to 76400; Northing 94400 to 95400; polygon scale from 1. to 5200.]

• Incomplete declustering in the Northern portion of the map and in the elongated domain in the SW.

DECLUSTERING: Exercise 8

• Consider the following sampling situation:

[Figure f.118: sample location map with north arrow and 100m scale bar.]

• What would be a reasonable cell declustering size?
55 56
TRIMMING / CUTTING OUTLIERS (1)

• When computing statistics within a geological domain, we make the assumption that there is only one population and that all samples belong to that population.
• Outliers or extreme values are often observed. Their impact can be a serious overestimation of:
- the average grade and its variability;
- the mean, variance, variogram, block model estimates, etc.
• Note that the outliers impact all estimates, even the traditional ones such as polygonal and $1/d^2$.
• Various solutions:
- Outliers are erroneous: delete or correct them;
- Outliers are from different “population”:
  - define new geology domain;
  - trim them down prior to computing statistics;
  - restrict their influence during estimation ($1/d^2$ or kriging);
  - use indicator kriging.
• Geological Process 5.3.5 “Cutting and/or Indicator Classes”.

TRIMMING / CUTTING OUTLIERS (2)

• The main questions are:
- Is trimming/cutting warranted?
- If yes, which value(s) to choose?
• The answers are subjective.
• The following graphs might be useful.
- Decile analysis.
- Actual versus smoothed grade profile along holes.
- Histogram and cumulative probability plots.
- Indicator correlation plot.
- Coefficient of variation plot.
- Quantity of metal plot.
• Also useful:
- Number of trimmed/cut data

57 58

TRIMMING / CUTTING OUTLIERS (3)

• Grade profiles along holes

[Figure f.171: grade profile along hole DHxxxx (distance 0 to 125, grades 1 to 100 on a log scale) showing geological contacts, sample grades, smoothed sample grades and outliers.]

- Outliers stand out with respect to smoothed grade profile.
• Similar techniques can be applied in 2 and 3D:
- 2D: sample values & contours; map of residuals.
- 3D: sample values & 3D estimates; list of residuals.
• Advantage: detect local outliers.

TRIMMING / CUTTING OUTLIERS (4)

• Histogram

[Figure f114_a: histogram of Au on a log scale (.1 to 1000.). Number of data 455; nb cut-out 93; cut value (min) 0.110; mean 10.528; std. dev 18.818; coef. of var 1.787; maximum 201.000; minimum 0.110.]

- A possible trimming/cutting value is where the histogram classes start to be isolated on the horizontal axis.
- Possible trimming value from graph: 80 g/t.

59 60
TRIMMING / CUTTING OUTLIERS (5)

• Cumulative (log)probability plots.

[Figure f114_b: cumulative distribution of the variable on log-probability axes (variable from 0.100 to 1000.; cumulative probability from 0.01 to 99.99).]

- A single population would be represented on the plot by a gradually increasing line.
- If the population is (log)normal, then the curve is a straight line on (log)probability paper. This, however, is of secondary importance.
- A “kink” or a “break” in the curve might indicate two populations or the presence of outliers.
- A possible trimming/cutting value is around the kink/break where the second population (outliers) gets predominant.
- Possible trimming value from graph: 70 g/t.

TRIMMING / CUTTING OUTLIERS (6)

Decile Analysis

Decile    # of Samples   Average (g/t)   Minimum (g/t)   Maximum (g/t)   Contained Metal (g)   % of Contained Metal
0-10           27             1.35            0.8             1.9                36                    1.0
10-20          27             2.72            2.0             3.4                73                    1.9
20-30          27             4.22            3.4             5.25              114                    3.0
30-40          27             5.91            5.28            6.65              159                    4.2
40-50          27             7.44            6.69            8.4               200                    5.3
50-60          27             9.66            8.5            11.0               260                    6.9
60-70          27            13.04           11.1            15.0               352                    9.3
70-80          27            17.24           15.0            21.0               465                   12.3
80-90          27            25.26           21.0            31.5               681                   18.0
90-100         23            62.54           32.0           201.0              1438                   38.0
Total         266            14.22            0.8           201.0              3783

Percentile (of last decile)
90-91           2            32.35           32.0            32.7                64                    1.7
91-92           3            36.67           35.0            39.0               110                    2.9
92-93           2            40.35           39.7            41.0                80                    2.1
93-94           2            41.5            41.0            42.0                83                    2.2
94-95           3            45.5            42.0            47.5               136                    3.6
95-96           2            49.25           49.0            49.5                98                    2.6
96-97           2            51.5            50.0            53.0               103                    2.7
97-98           2            55.5            54.0            57.0               111                    2.9
98-99           3            93.33           61.0           110.0               280                    7.4
99-100          2           185.5           170.0           201.0               371                    9.8

Suggestion: cutting may be warranted

• Decile Analysis
- Introduced by I.S. Parrish, Min. Eng., Apr. ’97.
- See next page for 40/10 rule of thumb
- 40/10 rule to be reduced if last decile / percentile do not contain a full “complement” of samples.
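The decile analysis and its 40/10 rule of thumb can be sketched as follows. The rule applied is the one stated for this method: split the top decile into percentiles if it holds more than 40% of the metal or more than twice the metal of the previous decile; trimming is warranted if the top percentile holds more than 10% of the metal, and the suggested value is the highest value of the previous percentile. The function names and data are hypothetical, and metal is assumed proportional to grade (equal sample lengths and densities).

```python
def metal_shares(sorted_grades, parts):
    """Share of total metal in each of `parts` equal-count classes."""
    n = len(sorted_grades)
    total = sum(sorted_grades)
    bounds = [(n * k) // parts for k in range(parts + 1)]
    shares = [
        sum(sorted_grades[bounds[k]:bounds[k + 1]]) / total for k in range(parts)
    ]
    return shares, bounds


def suggest_trim_value(grades):
    """Return a suggested trimming value, or None if the rule is not triggered."""
    g = sorted(grades)
    deciles, _ = metal_shares(g, 10)
    if not (deciles[-1] > 0.40 or deciles[-1] > 2.0 * deciles[-2]):
        return None
    percentiles, bounds = metal_shares(g, 100)
    if percentiles[-1] <= 0.10:
        return None
    first_of_top = bounds[-2]  # index of the first sample of the top percentile
    if first_of_top == 0:
        return None
    return g[first_of_top - 1]  # highest value of the previous percentile


# Hypothetical data: 90 low grades, 9 moderate grades and 1 extreme value.
skewed = [1.0] * 90 + [5.0] * 9 + [100.0]
uniform = [1.0] * 100
trim_skewed = suggest_trim_value(skewed)
trim_uniform = suggest_trim_value(uniform)
```

With fewer than 100 samples some percentile classes are empty, which corresponds to the remark that the rule should be relaxed when the last decile or percentile does not hold a full complement of samples.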
61 62

TRIMMING / CUTTING OUTLIERS (7)

• Decile Analysis (Contd.)
- If top decile contains:
  - More than 40% of metal, or
  - More than twice the metal of previous decile
  Split it in 10 percentiles
- If top percentile contains:
  - More than 10% of metal
  Trimming is warranted
- Suggested trimming value is then:
  - Highest value of previous percentile
- Generally conservative?
• Possible trimming value from graph
- Note that last decile / percentile not “full”.
- Trimming may be warranted.
- Previous percentiles 3 values: 61, 109, 110 g/t
- Possible trimming value: 100 g/t

TRIMMING / CUTTING OUTLIERS (8)

• Indicator correlation plot.

[Figure f114_c: indicator correlation for lag 1 (0 to 1) vs. indicator threshold (g/t Au, log scale from 0.1 to 1000).]

- This plot shows the correlation coefficient of two adjacent down-hole sample indicators for increasing cut-offs.
- Indicator:

$$i_c(x) = \begin{cases} 1, & \text{if the grade } z(x) \ge z_c \\ 0, & \text{otherwise} \end{cases}$$

  where $z_c$ is the cut-off (indicator threshold).
- As the cut-off $z_c$ increases, the correlation decreases.
- A possible trimming/cutting value is when the correlation is or is getting close to 0.
- Possible trimming value from graph: 60 g/t.
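One point of the indicator correlation plot can be computed as sketched below: grades along a hole are coded as 0/1 indicators at a cut-off, and the correlation coefficient between each sample and the next one down the hole (lag 1) is returned. The function name and the grade sequence are hypothetical, and a single hole is assumed; with several holes the pairs would be pooled without pairing samples across holes.

```python
def indicator_lag1_correlation(grades, cutoff):
    """Correlation between adjacent down-hole indicators at a given cut-off."""
    ind = [1.0 if g >= cutoff else 0.0 for g in grades]
    head, tail = ind[:-1], ind[1:]  # each sample paired with the next one
    n = len(head)
    if n == 0:
        return None
    mean_h = sum(head) / n
    mean_t = sum(tail) / n
    cov = sum(h * t for h, t in zip(head, tail)) / n - mean_h * mean_t
    var_h = sum(h * h for h in head) / n - mean_h ** 2
    var_t = sum(t * t for t in tail) / n - mean_t ** 2
    if var_h <= 0 or var_t <= 0:
        return None  # all indicators identical: correlation undefined
    return cov / (var_h * var_t) ** 0.5


# Hypothetical down-hole grades (g/t), listed from collar to end of hole.
hole = [0.5, 0.6, 5.0, 6.0, 7.0, 0.4, 0.3, 8.0, 9.0, 0.2]
corr_low_cutoff = indicator_lag1_correlation(hole, 1.0)
corr_high_cutoff = indicator_lag1_correlation(hole, 8.5)
```

Evaluating the function over a range of increasing cut-offs gives the curve; the cut-off at which the correlation approaches 0 marks grades that no longer show continuity with their neighbours.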
63 64
TRIMMING / CUTTING OUTLIERS (9)

• Coefficient of variation plot.

[Figure f114_d: coefficient of variation (0 to 3) vs. cutting limit (g/t Au, log scale from 0.1 to 1000).]

- This plot shows the coefficient of variation of the grades below cut-off (cutting limit).
- As cut-off increases, coefficient of variation increases.
- A possible trimming/cutting value is when the coefficient of variation curve gets out of control.
- Possible trimming value from graph: 100 g/t.

TRIMMING / CUTTING OUTLIERS (10)

• Quantity of metal plot.

[Figure f114_e: % of contained metal in samples (0 to 100) vs. trimming value (g/t Au, log scale from 0.1 to 1000).]

- This plot shows the relative quantity of metal contained within the trimmed-down samples for various trimming values.
- Useful to know the quantity of metal “discarded” by trimming.
- 93% of metal corresponding to 80 g/t trimming value. ⇒ 7% of metal “loss” if samples larger than 80 g/t Au are trimmed down to 80 g/t.
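Both curves can be generated from the sample grades alone, as in the sketch below. It rests on assumptions: the coefficient of variation is computed on the samples at or below the cutting limit (one reading of "grades below cut-off"), the population standard deviation is used, metal is taken as proportional to grade, and the function names and data are hypothetical.

```python
import statistics


def cv_below_limit(grades, limit):
    """Coefficient of variation of the grades at or below the cutting limit."""
    kept = [g for g in grades if g <= limit]
    if len(kept) < 2:
        return None
    mean = sum(kept) / len(kept)
    if mean == 0:
        return None
    return statistics.pstdev(kept) / mean


def metal_retained_pct(grades, trim_value):
    """% of metal left after trimming grades above trim_value down to it."""
    total = sum(grades)
    trimmed = sum(min(g, trim_value) for g in grades)
    return 100.0 * trimmed / total


# Hypothetical grades (g/t) with one extreme value.
grades = [1.0, 2.0, 3.0, 4.0, 90.0]
cv_10 = cv_below_limit(grades, 10.0)
cv_all = cv_below_limit(grades, 1000.0)
retained_10 = metal_retained_pct(grades, 10.0)
metal_loss_10 = 100.0 - retained_10
```

Looping over a list of candidate limits produces the two plots; the jump in coefficient of variation once the 90 g/t sample is included is the "out of control" behaviour described above.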

65 66

TRIMMING / CUTTING OUTLIERS (11)

[Figure fig114_f (Placer Dome Inc - Geostatistics, Mon Oct 6 14:26:11 PDT): D6: HW Shear -- AU Decl, Trim 1500 g/t, Env=2, Inside Trust. Variable: au (weighted by wtpoly) from 0.1 to 9999. Combined panels relating cutting value, trimming limit and indicator threshold (g/t Au, 0.1 to 1000) to % of contained metal in samples (0 to 100), coefficient of variation (0 to 3) and indicator correlation for lag 1 (0 to 1).]

TRIMMING / CUTTING OUTLIERS (12)

• Trimming/Cutting Summary Table

Method                        Cutting Value (g/t Au)
Histogram                     80 g/t
Probability Plot              70 g/t
Decile Analysis               100 g/t
Indicator Correlation         60 g/t
Coefficient of Variation      100 g/t
Final Choice                  80 g/t

Cutting Statistics
Number of Data Trimmed        4 of 455
Metal “Loss”                  7%
HARD / SOFT GEOLOGICAL BOUNDARIES

[Figure f.110: MUSSELWHITE, comparison of consecutive down hole AU assays. Three scattergrams on log scales (0.01 to 100), with a schematic of the pair positions relative to the waste/ore contact.]

                         Waste / Waste       Waste / Ore      Ore / Ore
X axis                   Au in Waste         Au in Waste      Au in Ore
Y axis                   Au in Adj. Waste    Au in Ore        Au in Adj. Ore
Number of pairs               89                129              102
X Mean                      1.517              1.852            8.882
Y Mean                      1.558              7.363            7.142
X Std.Dev.                  2.238              2.521            7.264
Y Std.Dev.                  2.239              6.328            6.296
Correlation (on logs)       0.214              0.102            0.573

HARD / SOFT GEOLOGICAL BOUNDARIES

[Figure f. 144: Pamour Feasibility Study. Mean grade (0.0 to 5.0) vs. distance from contact (m, from -8 to 8) for DOM_In_Stope (N = 2146, mean = 2.72) and DOM_Out_Stope (N = 16721, mean = 0.79), with the number of samples annotated for each distance class.]

• Note: sometimes, mineralization occurs at contact.
• Geological Process 5.3.7 “Assess Boundary Conditions”
69 70

71 72
PRACTICE EDA: Summary (1)

• Geology model
• EDA envelope
- Important!
• Usual graphs:
- Piechart, boxplot
- Histogram, probability plot
- Scattergram, QQ plot
- Relative difference plot
• EDA applications
- Maps: location, trends, proportional effect
- Sample statistics: spacing, orientation plot
- Checking pairs of values
- Checking twin holes, grade profiles

PRACTICE EDA: Summary (2)

• Clusters:
- Overestimation possible
- Various declustering techniques: cell, polygonal, kriging.
- Declustered mean, variance, and covariance.
• Compositing
• Checking geological boundaries: hard/soft?
• Trimming/capping outliers
- Somewhat subjective
- Histogram, probability plot
- Indicator correlation plot, decile analysis
• Available programs
- Refer to Section 1.

73 74

 FACT SHEET 

No new formulas

75
