
Geostatistics

Descriptive Statistics &


Confidence limits

Dr. Charlie Wu
September 11, 2019
1
Content
1. Statistical data analysis of geoscientific data
2. Descriptive statistics
central tendency
dispersion, skewness and kurtosis
SD & SE example
3. Theoretical probability distribution
4. Z table
5. Confidence limits for population data
6. Confidence limits for sample mean
7. t table
8. Class example
9. Homework 2
2
Statistical analysis of geoscientific data
Descriptive statistics: data summary and presentation
data presentation
Central tendency (mean, …), dispersion (SD, …)
Shape

Inferential statistics: draw “reproducible” conclusions

Hypothesis testing: is sample A correlatable with sample B?
t and F test statistics and significance

Correlation and predictions:
regression, interpolation, extrapolation, mapping, modeling/simulation
3
Descriptive Statistics
Central tendency: Average, mode, median & mean

Dispersion/variability:
Standard deviation (SD) = spread of the data (square root of the variance)

Variance = average of the squared differences from the mean

Standard error (SE) of the mean = SD of the means of repeated samples

4
Descriptive Statistics of REE sample

5
Sample mean, sample standard deviation and standard error

Sample variable x: 31.40, 0.00, 0.99, 0.00, 0.02 (sample size n = 5)

Sample mean:
$\bar{x} = (31.4 + 0 + 0.99 + 0 + 0.02) / 5 = 6.48$

Sample standard deviation:
$s = \sqrt{\frac{(24.92)^2+(-6.48)^2+(-5.49)^2+(-6.48)^2+(-6.46)^2}{5-1}} = 13.94$

Median = 0.02
Mode = 0.00

Standard error of the mean:
$SE = s/\sqrt{n} = 13.94/\sqrt{5} = 6.23$

The “standard error of the mean”, also called the “standard deviation of the mean”, is a method used to estimate the standard deviation of a sampling distribution.
6
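The worked example above can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the lecture material.

```python
import math
import statistics

# The five sample values from the worked example.
x = [31.40, 0.00, 0.99, 0.00, 0.02]
n = len(x)

mean = statistics.mean(x)      # sum of values / n
sd = statistics.stdev(x)       # sample SD: divides by n - 1
se = sd / math.sqrt(n)         # standard error of the mean
median = statistics.median(x)  # middle value of the sorted data
mode = statistics.mode(x)      # most frequent value

print(f"mean={mean:.2f}  s={sd:.2f}  SE={se:.2f}  median={median}  mode={mode}")
```

Note that `statistics.stdev` uses the n - 1 denominator (sample SD), whereas `statistics.pstdev` would divide by n (population SD).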
SD & SE example
Sample size N=15, repeated sampling 5 times (samples A~E; further samples F, G, H, I, J, ... could be drawn the same way)

Sample  Data (15 values each)
A       27  20  12   0  11   0   0   6   0   5  21  19   0   0   7
B        0   0   1  11   8 150   3   9   0   0  16  17  12  20   5
C       32  20  10   6   6   0  37  14  17  17 141   9   0   5   8
D       56  25  43   0   0   0   2   0   0  37  12  91  13  11   2
E        0  17   0   0   4  39   0   2   0   9   9  60  30   0   6

Sampling Stat (SE = SD of mean estimated from each sample)
Sample  N=   mean   SD     SE
A       15   8.5    9.3    2.4
B       15   16.8   37.5   9.7
C       15   21.5   34.7   9.0
D       15   19.5   26.7   6.9
E       15   11.7   17.9   4.6

Repeated sampling 5 times Stat (SE estimated from 5 samples)
Sample  N=   mean   SD of mean
A~E     5    15.6   5.4
7
Sampling Stat
Sample  N=   mean   SD     SE
A       15   8.5    9.3    2.4
B       15   16.8   37.5   9.7
C       15   21.5   34.7   9.0
D       15   19.5   26.7   6.9
E       15   11.7   17.9   4.6

Repeat sampling 5 times Stat
Sample  N=   mean   SD of mean (Standard Error of Mean)
A~E     5    15.6   5.4

[Figure: positions of the sample means A, E, B, D and C]
8
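The comparison between the SE estimated from a single sample and the SD of the means of repeated samples can be reproduced as follows. This is a sketch that assumes the 15 values per sample are grouped as listed in the SD & SE example; the dictionary and variable names are illustrative.

```python
import math
import statistics

# Five repeated samples of N = 15 from the SD & SE example.
samples = {
    "A": [27, 20, 12, 0, 11, 0, 0, 6, 0, 5, 21, 19, 0, 0, 7],
    "B": [0, 0, 1, 11, 8, 150, 3, 9, 0, 0, 16, 17, 12, 20, 5],
    "C": [32, 20, 10, 6, 6, 0, 37, 14, 17, 17, 141, 9, 0, 5, 8],
    "D": [56, 25, 43, 0, 0, 0, 2, 0, 0, 37, 12, 91, 13, 11, 2],
    "E": [0, 17, 0, 0, 4, 39, 0, 2, 0, 9, 9, 60, 30, 0, 6],
}

# Per-sample statistics: SE is estimated from one sample as SD / sqrt(N).
stats = {}
for name, values in samples.items():
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    se = sd / math.sqrt(len(values))
    stats[name] = (mean, sd, se)
    print(f"{name}: N={len(values)} mean={mean:.1f} SD={sd:.1f} SE={se:.1f}")

# Across samples: the SD of the five sample means is a direct
# (if noisy) estimate of the standard error of the mean.
means = [m for m, _, _ in stats.values()]
grand_mean = statistics.mean(means)
sd_of_means = statistics.stdev(means)
print(f"A~E: N={len(means)} mean={grand_mean:.1f} SD of mean={sd_of_means:.1f}")
```

With only five samples the SD of the means is itself uncertain, which is why it does not match any single-sample SE exactly.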
MEASURES OF CENTRAL TENDENCY

• Mode

• Median

• Arithmetic mean

• Quartiles and percentiles

9
http://en.wikipedia.org/wiki/Mean_(statistics)

1 Arithmetic mean

2 Geometric mean

3 Harmonic mean

4 Power mean

5 Weighted mean

6 Interquartile mean
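
As a small illustration of how some of these means differ, the sketch below uses Python's standard `statistics` module. The values and weights are made up for illustration and are not from the lecture.

```python
import statistics

values = [1.0, 4.0, 16.0]   # illustrative values only
weights = [1, 1, 2]         # hypothetical weights for the weighted mean

arithmetic = statistics.mean(values)
geometric = statistics.geometric_mean(values)
harmonic = statistics.harmonic_mean(values)
weighted = sum(w * v for w, v in zip(weights, values)) / sum(weights)

print(f"arithmetic={arithmetic:.3f} geometric={geometric:.3f} "
      f"harmonic={harmonic:.3f} weighted={weighted:.3f}")
```

For positive data the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean.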
10
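MEASURES OF DISPERSION
• Range: $x_{max} - x_{min}$

• Standard deviation
Population: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2}$
Sample: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$

• Standard Error of the mean: $SE = s/\sqrt{n}$

• Variance
Population: $\sigma^2$
Sample: $s^2$

• Coefficient of variation: $s/\bar{x}$

• Skewness (symmetry)
Population: $\frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i-\mu}{\sigma}\right)^3$
Sample: $\frac{n}{(n-1)(n-2)}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s}\right)^3$

• Kurtosis (peakedness)
Sample: $\frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s}\right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$
11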
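The sample skewness and kurtosis formulas above translate directly into code. The following is a minimal sketch; the function names are illustrative, and the test data are a simple symmetric series rather than lecture data.

```python
import math

def sample_sd(x):
    """Sample standard deviation (n - 1 denominator)."""
    n = len(x)
    mean = sum(x) / n
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))

def sample_skewness(x):
    """Sample skewness: n / ((n-1)(n-2)) * sum(((x_i - mean) / s)^3)."""
    n = len(x)
    mean = sum(x) / n
    s = sample_sd(x)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in x)

def sample_kurtosis(x):
    """Sample excess kurtosis, using the bias-corrected formula above."""
    n = len(x)
    mean = sum(x) / n
    s = sample_sd(x)
    fourth = sum(((v - mean) / s) ** 4 for v in x)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * fourth - correction

data = [1, 2, 3, 4, 5]  # symmetric, so skewness should be zero
print(sample_skewness(data), sample_kurtosis(data))
```

The skewness needs at least 3 values and the kurtosis at least 4, otherwise the denominators vanish.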
Normal/Gaussian/Z Distribution

[Figure: normal curve with the area in each band. Within ±1 SD: 68.3% (34.13% on each side of the mean). Within ±2 SD: 95.5% (a further 13.59% on each side). Within ±3 SD: 99.7% (a further 2.15% on each side). Beyond ±3 SD: 0.13% in each tail.]

One standard deviation ("one sigma", red area) on either side of the mean accounts for about 68 percent of the data points.
Two standard deviations (the red and green areas) account for roughly 95 percent of the data points.
Three standard deviations (the red, green and blue areas) account for about 99.7 percent of the data points.

12
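The 68/95/99.7 percentages follow from the normal cumulative distribution and can be recomputed with the error function. A minimal sketch (the function name is illustrative):

```python
import math

def central_coverage(k):
    """Probability that a normal value lies within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {100 * central_coverage(k):.2f}%")
```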
Theoretical probability distribution
Probability density function (pdf).
Has 2 parameters: µ and σ.
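
For reference, the normal pdf with these two parameters is usually written in the standard textbook form below (this form is not reproduced from the slide itself), together with the cumulative distribution obtained by integrating it:

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
F(x) = \int_{-\infty}^{x} f(u)\,du
$$

Standardizing with $Z = (x-\mu)/\sigma$ gives the standard normal distribution with mean 0 and SD 1.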

[Figure: normal pdf with the central 50%, 90%, 95% and 99% probability intervals marked]
13
Theoretical distributions
Pdf: probability density function
Cdf: cumulative distribution function

[Figure: pdf and cdf curves, with -SD and +SD marked]
14
Z table (standard normal distribution), for population data

Z = standard deviation from mean; P = cumulative probability

    Z       P           Z       P
-3.00   0.0014       0.00   0.5000
-2.90   0.0019       0.10   0.5398
-2.80   0.0026       0.20   0.5793
-2.70   0.0035       0.30   0.6179
-2.60   0.0047       0.40   0.6554
-2.50   0.0062       0.50   0.6915
-2.40   0.0082       0.60   0.7258
-2.30   0.0107       0.70   0.7580
-2.20   0.0139       0.80   0.7882
-2.10   0.0179       0.90   0.8159
-2.00   0.0228       1.00   0.8413
-1.90   0.0287       1.10   0.8643
-1.80   0.0359       1.20   0.8849
-1.70   0.0446       1.30   0.9032
-1.60   0.0548       1.40   0.9192
-1.50   0.0668       1.50   0.9332
-1.40   0.0808       1.60   0.9452
-1.30   0.0968       1.70   0.9554
-1.20   0.1151       1.80   0.9641
-1.10   0.1357       1.90   0.9713
-1.00   0.1587       2.00   0.9772
-0.90   0.1841       2.10   0.9821
-0.80   0.2119       2.20   0.9861
-0.70   0.2420       2.30   0.9893
-0.60   0.2743       2.40   0.9918
-0.50   0.3085       2.50   0.9938
-0.40   0.3446       2.60   0.9953
-0.30   0.3821       2.70   0.9965
-0.20   0.4207       2.80   0.9974
-0.10   0.4602       2.90   0.9981
 0.00   0.5000       3.00   0.9987

More precisely, P(Z ≤ -3.00) = 0.00135 and P(Z ≤ 3.00) = 0.99865.
15

Z values for central probabilities

Central probability   Z
50%                   ±0.67
68.26%                ±1.00
90%                   ±1.64
95%                   ±1.96
95.45%                ±2.00
99%                   ±2.576
99.73%                ±3.00

[Figure: normal curve over the Z table with band areas 34.13% (0 to ±1 SD), 13.59% (±1 to ±2 SD), 2.15% (±2 to ±3 SD) and 0.13% (beyond ±3 SD)]
16
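A Z table like the one above can be regenerated rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `z_table` is illustrative and is not the “distcalc” program mentioned in class.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def z_table(start=-3.0, stop=3.0, step=0.1):
    """Return (Z, cumulative probability) pairs from start to stop."""
    rows = []
    n_steps = round((stop - start) / step)
    for i in range(n_steps + 1):
        z = round(start + i * step, 2)  # round away floating-point drift
        rows.append((z, std_normal.cdf(z)))
    return rows

table = z_table()
# Print the first and last few rows as a spot check.
for z, p in table[:3] + table[-3:]:
    print(f"{z:+.2f}  {p:.4f}")
```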
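Standard Normal Distribution
(Use “distcalc” program provided in the class)

                            2 sided   1 sided   1 sided
Level of significance (α)   0.05      0.05      0.025
1 side (α)                  0.025     0.05      0.025
Probability                 0.95      0.95      0.975
Z value                     1.96      1.645     1.96

[Figure: three standard normal curves. 2 sided, α = 0.05: 0.025 in each tail beyond ±1.96, 0.95 in the centre. 1 sided, α = 0.05: 0.95 below Z = 1.645, 0.05 above. 1 sided, α = 0.025: 0.975 below Z = 1.96, 0.025 above.]
17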
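The Z values in this table are inverse lookups of the cumulative probability. A minimal sketch using the standard library follows; `z_critical` is a hypothetical helper name, not part of the class software.

```python
from statistics import NormalDist

def z_critical(alpha, two_sided=True):
    """Z value that leaves alpha in the tail(s) of the standard normal."""
    tail = alpha / 2 if two_sided else alpha  # area in one tail
    return NormalDist().inv_cdf(1 - tail)

print(f"2 sided, alpha=0.05 : {z_critical(0.05, two_sided=True):.3f}")
print(f"1 sided, alpha=0.05 : {z_critical(0.05, two_sided=False):.3f}")
print(f"1 sided, alpha=0.025: {z_critical(0.025, two_sided=False):.3f}")
```

A two-sided test at α = 0.05 and a one-sided test at α = 0.025 share the same Z value because both leave 0.025 in the upper tail.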
Confidence limits for population data
(If your sample is truly representative)
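Confidence limits = $\bar{x} \pm Z \cdot s$, where $\bar{x}$ is the mean and $s$ is the standard deviation

• For example, for 95% and 99% confidence intervals, the lower and upper confidence limits are:
95%: $\bar{x} - 1.96\,s$ and $\bar{x} + 1.96\,s$
99%: $\bar{x} - 2.576\,s$ and $\bar{x} + 2.576\,s$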

18
Confidence limits for population data
Q: In a normal distribution with mean 4 and variance 25, what are the
upper and lower limit scores for the middle 50% of the data?

Solution:
From the Z table, the middle 50% corresponds to Z = 0.67
$s_x = \sqrt{25} = 5$, $\bar{x} = 4$

So, lower limit:
$X_{low} = 4 - 0.67 \times 5 = 0.65$

Upper limit:
$X_{up} = 4 + 0.67 \times 5 = 7.35$
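
The same limits can be checked numerically. The sketch below first repeats the table-based calculation with Z = 0.67, then asks the normal distribution directly for its 25th and 75th percentiles. The variable names are illustrative.

```python
from statistics import NormalDist

mean, variance = 4.0, 25.0
sd = variance ** 0.5

# Table-based limits, using the rounded Z value from the Z table.
z_table_value = 0.67
low_table = mean - z_table_value * sd
up_table = mean + z_table_value * sd

# Limits from the distribution itself: the middle 50% lies
# between the 25th and 75th percentiles.
dist = NormalDist(mu=mean, sigma=sd)
low_exact = dist.inv_cdf(0.25)
up_exact = dist.inv_cdf(0.75)

print(f"table: {low_table:.2f} to {up_table:.2f}")
print(f"exact: {low_exact:.2f} to {up_exact:.2f}")
```

The small difference between the two results comes from rounding Z to two decimals (the unrounded value is about 0.6745).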

19
t distribution (for N<30)
The t-distribution is a family of distributions, a slightly different distribution
for each sample size (degrees of freedom)
It is flatter and more spread out than the normal z-distribution
As sample size increases, the t-distribution approaches a normal distribution

20
Confidence limits for Sample Mean ($\mu$ or $\bar{x}$)
For sample size N ≥ 30 & representative sampling

Confidence limits = $\bar{x} \pm Z \cdot \frac{s}{\sqrt{N}}$, where $\bar{x}$ is the mean, $s$ the standard deviation and $N$ the sample size

Use the Z table for the standard normal (Z) distribution

For sample size N < 30 or unknown representativeness

Use the t distribution table with df=N-1


21
Confidence limits for Sample Mean (N≥30)
& representative sampling
Poura Gold Mine in Burkina Faso, West Africa

Upper limit gold grade (g/t)?

Lower limit gold grade (g/t)?

Upper limit gold grade:
15.1+1.96*16.822/√3232=15.1+0.58=15.7
Lower limit gold grade:
15.1-1.96*16.822/√3232=15.1-0.58=14.5
For the 95% confidence interval, the gold grade is 14.5 to 15.7 g/t
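The Poura calculation can be sketched in Python as below. The mean, standard deviation and sample size are taken from the expressions above; the function name `z_confidence_limits` is illustrative.

```python
import math
from statistics import NormalDist

def z_confidence_limits(mean, sd, n, confidence=0.95):
    """Confidence limits of the mean for large N, using the Z distribution."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # 1.96 for 95%
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Gold grade summary statistics (g/t) from the Poura example.
low, up = z_confidence_limits(mean=15.1, sd=16.822, n=3232)
print(f"95% confidence limits: {low:.1f} to {up:.1f} g/t")
```

Even though the standard deviation is larger than the mean, the interval is narrow because the large sample size divides the SD by about 57.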
22
Confidence limits for Sample mean (N<30)
(distribution of means)
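Limited observations (N<30, real life, t distribution)

Confidence limits = $\bar{x} \pm t \cdot \frac{s}{\sqrt{N}}$ with df = N-1, where $\bar{x}$ is the mean and $s$ is the standard deviation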

23
[Figure: t distribution for df=3 with two-tailed critical values ±3.182 (95%) and ±5.840 (99%)]
24
t Table (1 sided/tail)
1 tail=0.025, 2 tails=0.05, probability=1-0.05=95%
df\p 0.40 0.25 0.10 0.05 0.025 0.01 0.005 0.0005
1 0.324 1.000 3.077 6.313 12.70 31.82 63.65 636.6
2 0.288 0.816 1.885 2.919 4.302 6.964 9.924 31.59
3 0.276 0.764 1.637 2.353 3.182 4.540 5.840 12.92
4 0.270 0.740 1.533 2.131 2.776 3.746 4.604 8.610
5 0.267 0.726 1.475 2.015 2.570 3.364 4.032 6.868
6 0.264 0.717 1.439 1.943 2.446 3.142 3.707 5.958
7 0.263 0.711 1.414 1.894 2.364 2.997 3.499 5.407
8 0.261 0.706 1.397 1.859 2.306 2.896 3.355 5.041
9 0.260 0.702 1.383 1.833 2.262 2.821 3.249 4.780
10 0.260 0.699 1.372 1.812 2.228 2.763 3.169 4.586
11 0.259 0.697 1.363 1.795 2.200 2.718 3.105 4.437
12 0.259 0.695 1.356 1.782 2.178 2.681 3.054 4.317
13 0.258 0.693 1.350 1.770 2.160 2.650 3.012 4.220
14 0.258 0.692 1.345 1.761 2.144 2.624 2.976 4.140
15 0.257 0.691 1.340 1.753 2.131 2.602 2.946 4.072
25
t Table (1 sided)
df\p 0.40 0.25 0.10 0.05 0.025 0.01 0.005 0.0005
16 0.257599 0.690132 1.336757 1.745884 2.11991 2.58349 2.92078 4.0150
17 0.257347 0.689195 1.333379 1.739607 2.10982 2.56693 2.89823 3.9651
18 0.257123 0.688364 1.330391 1.734064 2.10092 2.55238 2.87844 3.9216
19 0.256923 0.687621 1.327728 1.729133 2.09302 2.53948 2.86093 3.8834
20 0.256743 0.686954 1.325341 1.724718 2.08596 2.52798 2.84534 3.8495
21 0.256580 0.686352 1.323188 1.720743 2.07961 2.51765 2.83136 3.8193
22 0.256432 0.685805 1.321237 1.717144 2.07387 2.50832 2.81876 3.7921
23 0.256297 0.685306 1.319460 1.713872 2.06866 2.49987 2.80734 3.7676
24 0.256173 0.684850 1.317836 1.710882 2.06390 2.49216 2.79694 3.7454
25 0.256060 0.684430 1.316345 1.708141 2.05954 2.48511 2.78744 3.7251
26 0.255955 0.684043 1.314972 1.705618 2.05553 2.47863 2.77871 3.7066
27 0.255858 0.683685 1.313703 1.703288 2.05183 2.47266 2.77068 3.6896
28 0.255768 0.683353 1.312527 1.701131 2.04841 2.46714 2.76326 3.6739
29 0.255684 0.683044 1.311434 1.699127 2.04523 2.46202 2.75639 3.6594
30 0.255605 0.682756 1.310415 1.697261 2.04227 2.45726 2.75000 3.6460
inf 0.253347 0.674490 1.281552 1.644854 1.95996 2.32635 2.57583 3.2905
26
Class example (Sample data)

Depth     Perm   Porosity
1541.7    --     29.8
1544.1    --     28.9
1619.0    2329   29.2
1622.0    312    26.3

Mean      1321   28.6
Std Dev.  1426   1.5

Upper limit porosity, perm?
Lower limit porosity, perm?

t distribution
N=4
95% confidence intervals
2 tailed
27
Depth Perm Porosity
1541.7 -- 29.8
1544.1 -- 28.9
1619.0 2329 29.2
1622.0 312 26.3

Mean 1321 28.6


Std Dev. 1426 1.5

α=1-0.95=0.05, 2 tailed, ½ α=0.05/2=0.025

d.f.=4-1=3; at the 0.025 (one-tail) level of significance, $t_{0.975}(3) = 3.18$
Upper limit porosity:
28.6+3.18*1.5/√4=28.6+2.4=31.0
Lower limit porosity:
28.6-3.18*1.5/√4=28.6-2.4=26.2

The 95% confidence interval of porosity is 26.2% to 31.0%
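
The porosity calculation can be sketched as follows. Python's standard library has no t distribution, so this sketch hard-codes a few values from the 0.025 column of the t table; the dictionary `T_0975` and the function name are illustrative, and the inputs are the rounded summary statistics from the slide.

```python
import math

# One-tail 0.025 column of the t table (two-tailed 95%), df = 1 to 5.
T_0975 = {1: 12.70, 2: 4.302, 3: 3.182, 4: 2.776, 5: 2.570}

def t_confidence_limits(mean, sd, n):
    """95% confidence limits of the mean for a small sample (df = n - 1)."""
    t = T_0975[n - 1]
    half_width = t * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Porosity summary statistics from the class example (N = 4).
low, up = t_confidence_limits(mean=28.6, sd=1.5, n=4)
print(f"95% confidence limits of porosity: {low:.1f}% to {up:.1f}%")
```

For permeability only two values are available (df = 1), so the same function would use t = 12.70 and give a far wider interval.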


28
      Au       Zr       REE
      Gold     Zircon   Rare Earth
      (ppm, g/ton)
 1    0.005    7.30     2.10
 2    0.116    18.40    1.80
 3    0.038    22.50    16.10
 4    0.005    20.70    16.20
 5    0.005    99.00    82.60
 6    0.005    42.10    45.10
 7    0.005    32.70    51.10
 8    0.005    19.50    47.50
 9    0.005    31.80    70.60
10    0.005    54.30    28.60
11    0.005    48.80    42.10
12    0.005    9.90     10.60
13    0.008    8.20     11.70
14    0.007    10.50    13.00
15    0.005    37.90    16.70
16    0.005    28.50    22.40
17    0.017    23.10    12.20
18    0.005    100.00   80.70
19    0.005    15.10    9.20
20    0.077    15.50    17.40
21    0.148    70.10    73.70
22    0.006    14.40    27.60
23    0.005    13.30    18.20
24    0.014    17.70    22.70
25    0.007    8.50     8.50
26    0.005    12.30    21.60
27    0.005    12.60    11.30
28    0.005    11.70    10.80
29    0.121    9.80     9.60
30    0.007    8.00     8.80
31    0.013    12.30    22.10
32    0.009    14.40    27.20
33

Homework 2: Descriptive statistics

Summarized above are grades (ppm, g/ton) of gold (Au), Zircon (Zr) and total REE (rare earth element) of 32 data points/specimens from Central Kalimantan area. Please select Au or Zr and perform the following tasks:

1. List 12 key descriptive sample statistics (see page 5)

2. Estimate the 95% confidence limits of the mean using formula: $\bar{x} \pm t \cdot \frac{s}{\sqrt{N}}$ (use t=2.040, see page 27)

3. Is your predicted value at #33 (homework 1) within the 95% confidence limits?
29

Homework 2 Report example

REE, Rare Earth (ppm, g/ton)
 1    2.10
 2    1.80
 3    16.10
 4    16.20
 5    82.60
 6    45.10
 7    51.10
 8    47.50
 9    70.60
10    28.60
11    42.10
12    10.60
13    11.70
14    13.00
15    16.70
16    22.40
17    12.20
18    80.70
19    9.20
20    17.40
21    73.70
22    27.60
23    18.20
24    22.70
25    8.50
26    21.60
27    11.30
28    10.80
29    9.60
30    8.80
31    22.10
32    27.20

1. Descriptive statistics (REE)
Mean                 26.87 ppm
Standard Error       4.04 ppm
Median               17.80 ppm
Mode                 #N/A
Standard Deviation   22.86 ppm
Sample Variance      522.42 ppm²
Kurtosis             0.81
Skewness             1.35
Range                80.80 ppm
Minimum              1.80 ppm
Maximum              82.60 ppm
Sum                  859.80 ppm

2. Confidence limits
N= 32
t= 2.04
Sd= 22.86 ppm
Confidence interval (t·Sd/√N) = 8.24 ppm
CL upper 35.11 ppm
CL lower 18.63 ppm

3. The predicted value ( ? ) at sample #33 in Homework 1 (is, is not) within the (90%, 95%, 99%) confidence limits.
30
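
The numbers in the report example can be reproduced with the short script below. It is a sketch only: the t value of 2.04 is the one quoted in the report, and the variable names are illustrative. The same steps apply to the Au or Zr column.

```python
import math
import statistics

# REE grades (ppm) for the 32 specimens in the report example.
ree = [2.10, 1.80, 16.10, 16.20, 82.60, 45.10, 51.10, 47.50,
       70.60, 28.60, 42.10, 10.60, 11.70, 13.00, 16.70, 22.40,
       12.20, 80.70, 9.20, 17.40, 73.70, 27.60, 18.20, 22.70,
       8.50, 21.60, 11.30, 10.80, 9.60, 8.80, 22.10, 27.20]

t = 2.04  # t value used in the report example

n = len(ree)
mean = statistics.mean(ree)
sd = statistics.stdev(ree)       # sample SD (n - 1 denominator)
se = sd / math.sqrt(n)           # standard error of the mean
half_width = t * se              # "confidence interval" in the report
cl_lower, cl_upper = mean - half_width, mean + half_width

print(f"N={n} mean={mean:.2f} SD={sd:.2f} SE={se:.2f}")
print(f"median={statistics.median(ree):.2f} range={max(ree) - min(ree):.2f}")
print(f"95% CL: {cl_lower:.2f} to {cl_upper:.2f} ppm")
```

Task 3 then reduces to checking whether the predicted value for specimen #33 falls between `cl_lower` and `cl_upper`.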
