
Citation Program in Applied Geostatistics

Data Integrity and Related Issues

• Some Comments on Data
• Compositing
• Bootstrap
• Sampling
Data Integrity
• This course is not principally concerned with data collection, sampling
theory, or database integrity, but these issues must be mentioned
• In general, geostatistical tools have no ability to detect problem data:
– Errors appear like short scale geological microstructure
– Biases can be detected between data sources, but the truth cannot be
discerned from geostatistical analysis
– Fraudulent data can appear even better than real data
• Special geostatistical analysis is required for “non-isotopic” data, that is,
data that are not sampled at the same locations
• Standard best practices should be followed in all aspects of data
collection, preparation and assaying
Compositing
• The term “compositing” refers to the procedure of combining adjacent values
into longer down-hole intervals
• The grade of each new interval is calculated on the basis of the weighted
average of the original sample grades
• These are usually weighted by length and possibly by specific gravity and core
recovery
• Compositing leads to one of the following results:
– ore body intersections
– lithological or metallurgical composites
– regular length composites
– bench composites or section composites
– high grade composites
– minimum length & grade composites
• Each of these types of composite is produced for a different purpose and in
different situations
• Regular length or bench composites are common in geostatistical analysis
– Geostatistical software assumes the data represent the same volume
– The length should be small enough to permit resolution of the final simulated grid
spacing
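The length-weighted average described above can be sketched as a short function. This is a minimal illustration, not software from the course; the function name and the grade/length values are made up:

```python
def composite_grade(grades, lengths, weights=None):
    """Combine adjacent down-hole intervals into one composite grade.

    Each sample is weighted by its length, optionally multiplied by an
    extra weight such as specific gravity or core recovery.
    """
    if weights is None:
        weights = [1.0] * len(grades)
    w = [l * x for l, x in zip(lengths, weights)]
    return sum(g * wi for g, wi in zip(grades, w)) / sum(w)

# Two 1 m intervals at 2.0 g/t and one 2 m interval at 5.0 g/t
# composite to (1*2.0 + 1*2.0 + 2*5.0) / 4 = 3.5 g/t
print(composite_grade([2.0, 2.0, 5.0], [1.0, 1.0, 2.0]))
```

Passing specific gravities (or core recoveries) through `weights` gives the length-and-density weighting mentioned above.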
The Bootstrap
• The bootstrap is a name generically applied to statistical resampling schemes
that allow uncertainty in the data to be assessed from the data themselves, in
other words, “pulling yourself up by your bootstraps”.
• Given n observations zi, i=1,…,n, and a calculated statistic S, e.g. the
mean, what is the uncertainty in S?
• The procedure:
– Draw n values z’i, i=1,…,n, from the original data with replacement
– Calculate the statistic S’ from the “bootstrapped” sample
– Repeat L times to build up a distribution of uncertainty in S
• Assumes that the data are independent and representative
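The three-step procedure above can be sketched in a few lines of Python. The permeability values here are illustrative, not the 17 data shown on the next slide:

```python
import random
import statistics

def bootstrap(data, stat, n_boot=1000, seed=0):
    """Resample the data with replacement, compute the statistic each
    time, and return the distribution of uncertainty in the statistic."""
    rng = random.Random(seed)
    n = len(data)
    return [stat([rng.choice(data) for _ in range(n)])
            for _ in range(n_boot)]

# Illustrative permeability data, md
data = [485.0, 758.8, 1145.5, 1305.1, 3360.9]
dist = bootstrap(data, statistics.mean, n_boot=2000)
# The spread of `dist` measures the uncertainty in the mean
```

Any statistic can be passed in place of `statistics.mean` (variance, a quantile, a calculated property), which is the generality the slides rely on later.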
Simple Examples
• Bootstrap uncertainty in the mean:
[Figure: histogram of the original permeability data (17 data: mean 1225.28 md,
std. dev. 672.14, coef. of var. 0.55, minimum 485.00, lower quartile 758.81,
median 1145.50, upper quartile 1305.14, maximum 3360.87) beside the bootstrap
distribution of the average permeability (1000 replicates: mean 1223.46 md,
std. dev. 156.98, coef. of var. 0.13, minimum 836.53, lower quartile 1111.62,
median 1214.98, upper quartile 1320.91, maximum 1814.71)]

• Bootstrap uncertainty in the correlation coefficient:
[Figure: cross plot of primary/hard data versus secondary/soft data with
correlation ρ = 0.54, beside the bootstrap distribution of the correlation
coefficient, spanning roughly 0.25 to 0.75]
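For the correlation coefficient the (x, y) pairs must be resampled together so each bootstrap sample keeps its pairing. A sketch with made-up primary/secondary pairs (not the data in the figure):

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def bootstrap_correlation(pairs, n_boot=1000, seed=0):
    """Resample (x, y) pairs with replacement and recompute the
    correlation each time to build its distribution of uncertainty."""
    rng = random.Random(seed)
    n = len(pairs)
    dist = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in range(n)]
        xs, ys = zip(*sample)
        dist.append(pearson(xs, ys))
    return dist

# Made-up secondary/primary pairs, for illustration only
pairs = [(1.0, 3.2), (2.0, 4.1), (3.0, 6.0), (4.0, 5.5),
         (5.0, 8.1), (6.0, 7.4), (7.0, 10.2), (8.0, 9.6)]
dist = bootstrap_correlation(pairs, n_boot=500)
```

Resampling rows rather than individual values is what distinguishes this case from the bootstrap of the mean.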


Bootstrap of Entire Histogram
• Reference histogram (all of the porosity data in rock type one) shown on top
• Four bootstrapped alternatives shown below
[Figure: reference porosity histogram above four bootstrap histograms, porosity
axis from 0.10 to 0.40]
Uncertainty in Petroleum Resource

• Statistical procedure that could be used without resorting to a full
geostatistical treatment of uncertainty
• Pore volume is the product of:
– gross rock volume
– net-to-gross ratio
– net porosity
• Procedure to assess uncertainty in the pore volume is to determine the
uncertainty in the three controlling variables and then do a Monte Carlo
sampling of those distributions.
– uncertainty in gross rock volume is determined by modeling the surfaces and the oil /
gas contact stochastically
– uncertainty in net-to-gross ratio determined by the bootstrap
– uncertainty in net porosity determined by the bootstrap
• A (real, but scaled) example from a reservoir with 8 wells
• Contact at –12300
• No uncertainty in the surface depths at the well locations
[Figure: two structural cross sections through the reservoir surfaces (depths
–12000 to –13500), one looking NW through wells W-1, W-3, W-4 and W-2, and one
looking NE through wells W-7, W-6, W-5, W-4 and W-8]
Uncertainty in Gross Rock Volume
• Model the top reservoir surface stochastically
• Calculate the gross rock volume from each realization
• Note the scale of the histogram (does not start at zero)
[Figure: section looking NW showing stochastic top-surface realizations
departing from the reference grid, and the resulting histogram of gross rock
volume (mean 446.9, std. dev. 30.6 million cubic feet)]
Uncertainty in Net-to-Gross Ratio
• Create a histogram model (and cumulative histogram) for the well average
net-to-gross and then resample
[Figure: smoothed distribution of net-to-gross from the wells (mean 0.81,
std. dev. 0.13) and the resampled distribution of the field-average
net-to-gross (mean 0.81, std. dev. 0.044)]

Uncertainty in Net Porosity
• Create a histogram model (and cumulative histogram) for the well average
net porosity and then resample
[Figure: smoothed distribution of average net porosity from the wells (mean
0.21, std. dev. 0.011) and the resampled distribution of the field-average net
porosity (mean 0.21, std. dev. 0.0056)]


Uncertainty in Pore Volume
• The procedure to assess uncertainty in the pore volume:
– draw a gross rock volume (see histogram)
– draw a net-to-gross ratio (see histogram)
– draw a net porosity (see histogram)
– calculate a pore volume (the product of the three numbers)
– repeat many times
[Figure: resulting distribution of pore volume from the average GRV, NG and
porosity (mean 76.3, std. dev. 7.0 million cubic feet)]
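The Monte Carlo loop can be sketched as follows. For brevity the three uncertainty distributions are approximated here by normal distributions with the means and standard deviations shown on the preceding histograms; the slides instead resample modelled histograms, so this is an assumption, not the course's procedure:

```python
import random
import statistics

def pore_volume_mc(n_draws=5000, seed=0):
    """Draw one value of each controlling variable and multiply;
    repeat many times to build the pore volume distribution."""
    rng = random.Random(seed)
    pv = []
    for _ in range(n_draws):
        grv = rng.gauss(446.9, 30.6)    # gross rock volume, million ft^3
        ntg = rng.gauss(0.81, 0.044)    # field-average net-to-gross
        phi = rng.gauss(0.21, 0.0056)   # field-average net porosity
        pv.append(grv * ntg * phi)
    return pv

pv = pore_volume_mc()
# mean and std. dev. come out near the slide's 76.3 and 7.0
print(statistics.mean(pv), statistics.stdev(pv))
```

Fixing two of the three draws at their means reproduces the one-factor-at-a-time sensitivity shown on the following slides.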


Contribution of Each Factor
[Figure: distributions of pore volume obtained by varying one factor at a time:
changing gross rock volume only (mean 75.9, std. dev. 5.2), changing
net-to-gross only (mean 75.9, std. dev. 4.0), and changing porosity only
(mean 76.5, std. dev. 2.0)]
Review of Main Points
• All geostatistical analysis assumes the data are correct
• The raw assays are typically composited to some nominal constant length for
geostatistical analysis
• Bootstrapping is useful for assessing uncertainty in a sample statistic,
e.g. the mean, variance, calculated properties, …
• Let’s look at sampling for a while:
– Definitions
– Heterogeneity and Estimation
– Compositing
Definitions
• Fragment Size, d(cm): size of the fragment or average size of the
fragments
• Nominal Fragment Size, d (cm): the maximum fragment size in a lot, defined
as the square mesh size that retains no more than 5% oversize material
• Lot, L: amount of material from which increments and samples are
selected
• Increment, I: a group of fragments extracted from a lot in a single
operation of the sampling device
• Sample: the union of several increments
• Specimen: part of the lot obtained without respecting the rules of
delimitation and extraction correctness
• Component: constituent of the lot that can be quantified by analysis
• Critical Content, a: proportion of a critical component that is to be
estimated

Critical content: a_L = (weight of the critical component in the lot L) /
(weight of all components in the lot L)
More Definitions

• Probabilistic Selection: a selection founded on the notion of selection
probability that includes the intervention of some random element. A
probabilistic selection is correct when the selection probability is uniformly
distributed among all units making up the lot
• Heterogeneity: where not all elements are identical
• Constitution Heterogeneity (CH): differences between the
composition of each fragment within the lot
• Distribution Heterogeneity (DH): differences from one group
to another within the lot
• Sampling Protocol: an agreed upon set of stages for sample
taking and preparation meant to minimize errors and to provide
a sample that is within certain standard controls.
Sampling Errors

• Accidental errors that occur during sampling or preparation cannot be
analysed using statistical methods as they are typically non-random events
• The variance of independent errors is additive:

σ²(TE) = σ²(FE) + σ²(DE) + σ²(EE) + …

where TE is the total error, FE the fundamental error, DE the delimitation
error and EE the extraction error
• Error variance compounds and does not cancel out
• When the mean of the sampling error, E{SE}, approaches zero
the sample can be called accurate
• A sampling selection is said to be precise when the variance of the
sampling error, σ²(SE), is small, that is, less than the standard required
for a given purpose
Heterogeneity
• The constitution heterogeneity:

CH_L = N_F · Σ_i [ (a_i − a_L)² / a_L² ] · [ M_i² / M_L² ]

• Where N_F is the number of fragments, a_i is the content of a particle, a_L
is the content of the lot and the M’s are masses
• This is divided by the number of fragments and multiplied by the mass of the
lot (i.e. multiplied by the average fragment mass). The result is the
intrinsic heterogeneity:

IH_L = Σ_i [ (a_i − a_L)² / a_L² ] · [ M_i² / M_L ]

• We often separate the lot into fractions based on size (index α, fragment
volume v_α) and density (index β, density λ_β), with fragment mass M_F = v·λ
and critical content a_αβ:

IH_L = Σ_α Σ_β v_α · λ_β · [ (a_αβ − a_L)² / a_L² ] · [ M_Lαβ / M_L ]
Estimation of IH
• We simplify by (1) assuming that the critical content varies much more from
one density fraction to the next than from one size fraction to the next
(a_αβ ≈ a_β), and (2) replacing the ratio M_Lαβ/M_Lβ with an average value
M_Lα/M_L:

IH_L = [ Σ_α v_α · M_Lα / M_L ] · [ Σ_β λ_β · (a_β − a_L)² · M_Lβ / (a_L² · M_L) ]
     = X · Y

• The first term, X, is estimated with X = f·g·d³, where f is a shape factor
(≈0.5), g is a granulometric factor (0.25–0.55) and d is the nominal fragment
size
• The second term, Y, is estimated as Y = c·ℓ, where c is a mineralogical
factor (λ_M/a_L when a_L < 0.1 and (1 − a_L)·λ_g when a_L > 0.9) and ℓ is a
liberation factor (about 0.2)
• A reference like Pitard’s book can be reviewed for more details on these
factors
• Combining the terms:

IH_L = c·f·g·ℓ·d³
Fundamental Sampling Error
• The fundamental sampling error, FE, is defined as the error that occurs
when the selection of the increments composing the sample is correct.
• This error is generated entirely by the constitution heterogeneity.
• Gy has demonstrated that the mean, m(FE), of the fundamental error is
negligible and that the variance, σ²(FE), can be expressed as:

σ²(FE) = ( 1/M_S − 1/M_L ) · IH_L

• When the lot is large relative to the sample (1/M_L → 0):

σ²(FE) ≈ IH_L / M_S

• Sampling protocols and errors are designed with the FE


Sampling Nomograph
• A nomograph is a base 10 log-log plot with the sample variance on the
ordinate axis and the sample mass on the abscissa axis
• When the sample is split the nominal size stays the same but the mass
decreases (A to B). During comminution sample mass stays the same
and the nominal fragment size decreases (B to C)
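The fundamental error variance σ²(FE) = (1/M_S − 1/M_L)·IH_L plotted on the nomograph is easy to compute. A small helper (the function name is illustrative), with the large-lot approximation when no lot mass is given:

```python
def fe_variance(IH_L, M_S, M_L=None):
    """Variance of the fundamental sampling error (Gy):
    sigma^2(FE) = (1/M_S - 1/M_L) * IH_L.
    If the lot mass M_L is omitted, 1/M_L -> 0 (large lot)."""
    if M_L is None:
        return IH_L / M_S
    return (1.0 / M_S - 1.0 / M_L) * IH_L

# A 100 kg sample from a material with IH_L = 110 g
print(fe_variance(110.0, 1.0e5))  # 0.0011
```

On a log-log plot of this variance against M_S, lines of constant IH_L (i.e. constant fragment size) have slope −1, which is the geometry the nomograph exploits.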
An Example (1/4)
• Characteristics of the material being sampled are:
– Molybdenum deposit occurring as MoS2
– Expected mineral content: a_L = 0.100%
– Mineral density: λ_M = 4.7
– Gangue density: λ_g = 2.7
– Liberation size: d_ℓ = 500 μm
– 40 kg core sample
– Half of the core will be assayed
– Reduce the sample through comminution and splitting to 1 g for assaying
• To calculate the intrinsic heterogeneity:

f = 0.5 (standard value for most materials)
g = 0.25 (unclassified material)
λ_M = 4.7 (rock property)
λ_g = 2.2 (rock property)
a_L = (0.100 / 100) × (160 / 96) = 0.00167
  (160 is the molecular weight of MoS2 and 96 the atomic weight of Mo;
  molybdenum metal content converted to mineral content)
ℓ = (0.05 / d)^b
  (the liberation factor is calculated in the table on the next slide;
  a conservative value of 1 was chosen for b)
c = λ_M / a_L = 4.7 / 0.00167 = 2814
  (since a_L is less than 0.1 an approximation was used)
Example (2)
• IH_L can now be calculated and plotted for the possible fragment sizes. To
determine the coordinates of the point being plotted, a mass was chosen for
each of the nominal sizes.
• An additional constraint is that no single operation should introduce more
error than FE = 5%. To plot this threshold on the nomograph, the error limit
has to be converted to a variance: σ²(FE) = 0.05² = 0.0025

d (cm)    ℓ        IH_L (g)   mass (g)   σ²(FE)
2.5000    0.020    110        1.00E+05   1.10E-03
0.9500    0.053    15.9       1.00E+04   1.59E-03
0.4750    0.105    3.98       1.00E+04   3.98E-04
0.2360    0.212    0.982      1.00E+03   9.82E-04
0.1700    0.294    0.509      1.00E+03   5.09E-04
0.1000    0.500    0.176      1.00E+02   1.76E-03
0.0710    0.704    0.0888     1.00E+02   8.88E-04
0.0425    1.000    0.0271     1.00E+01   2.71E-03
0.0250    1.000    0.00551    1.00E+00   5.51E-03
0.0150    1.000    0.00119    1.00E+00   1.19E-03
0.0106    1.000    0.000420   1.00E+00   4.20E-04
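The table can be reproduced (to within its own rounding) from IH_L = c·f·g·ℓ·d³ with the example's factors and ℓ = (0.05/d)^1 capped at 1. A sketch; the masses paired with each size are assumptions read off the table:

```python
def intrinsic_heterogeneity(d, c=2814.0, f=0.5, g=0.25, d_lib=0.05, b=1.0):
    """IH_L = c*f*g*l*d^3 with liberation factor l = (d_lib/d)^b,
    capped at 1 once fragments are finer than the liberation size."""
    lib = min(1.0, (d_lib / d) ** b)
    return c * f * g * lib * d ** 3

# First rows of the table: nominal size (cm) and chosen sample mass (g)
for d, mass in [(2.5, 1.0e5), (0.95, 1.0e4), (0.475, 1.0e4)]:
    IH = intrinsic_heterogeneity(d)
    print(f"d={d:<6} IH_L={IH:.3g} g  var(FE)={IH / mass:.2e}")
```

Each printed pair of mass and variance is one point on the nomograph; changing the mass at fixed d moves along a slope −1 size line.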


Example (3)
• The variance design line is shown as the dashed blue line
• The black lines (slope −1) are constant particle size lines
Example (4)
• Splitting the initial 40 kg core sample gives 20 kg
• Locate the 20 000 g mass on the horizontal axis and choose an intersection
with a size line that is below the variance design line
• Choose 0.475 cm as the nominal size of the first comminution; place a point
on the 0.475 cm size line at a sample mass of 20 kg
• The fundamental error variance for the sample can now be read off the
vertical axis
• The next step is to split the sample in preparation for the next comminution
• Follow the size line to the left and upwards until it is just below the
variance design line
• Decide to split the sample from 20 kg down to 2 kg
• Continue with comminution and splitting until the final sample mass is 1 g
Review of Main Points
• Experience has shown that the fundamental error, measurable using IH_L and
the nomograph, constitutes only half of the total error found in the sample;
doubling the variance is a more accurate estimate of the sample error
• Some accepted standards are:
– 15% for internal process control and exploration
– 5% for contract sales or compliance
– 0.5% for commercial sales
• This presentation assumes that common sense has ensured no delimitation or
extraction error
• One unfortunate, but intriguing, facet of sampling is that local conditions
tend to dominate compared to the expected norm. Due to this
variability it is difficult to provide a step-by-step procedure that will be
valid for all cases
• This is a Bluffer’s Guide to Sampling – it is a large area of study
