

The sampling theory of Pierre Gy: Comparisons, implementation and


applications for environmental sampling

Chapter · January 1996


4 authors, including:

John W. Kern Richard Anderson-Sprecher


Kern Statistical Services, Inc., University of Wyoming, Montana State University University of Wyoming

All content following this page was uploaded by John W. Kern on 21 May 2014.



Leon E. Borgman, John W. Kern,


Richard Anderson-Sprecher and George T. Flatman

ABSTRACT

The sampling theory developed and described by Pierre Gy [1]


is compared to design-based classical finite sampling methods for
estimation of a ratio of random variables. For samples of
materials which can be completely enumerated the methods are
asymptotically equivalent. Gy extends the finite sampling
methods to situations where complete enumeration of samples is
not feasible. Gy's methods involve a set of sampling constants
related to the heterogeneity of the material sampled; methods to
estimate these constants from grouped data are given. Computer
programs for the estimation of these constants are described, and
environmental applications are discussed.

INTRODUCTION

Classical finite sampling theory, sometimes called design-

based sampling, is most often associated with the design and

analysis of sample surveys. The identical theory is also used,

however, to guide the selection and analysis of samples in other

applications. In particular, environmental studies are most

often planned and interpreted from this standpoint (see, for

example, [2]). A thorough description of classical sampling may

be found in Cochran [3], and Thompson [4] describes applications

of the classical theory to selected nonstandard problems.

The major advantage of classical random sampling is that it

is fundamentally objective; assumptions made about the underlying



population are, for practical purposes, nonexistent. An

underlying characteristic of the classical paradigm is that it

considers the real world (the population of interest) to be fixed

and deterministic, and randomness is present only because of the

sample selection process. When estimating a fixed population

parameter, variation within the population is thus a hurdle to be

surmounted, and probabilistic description of variation is neither

necessary nor even meaningful. Whatever patterns of variability

are present within the population are effectively removed or,

more accurately, nullified (on average) by randomization of the

sample.

The alternative to classical sampling is generally

understood to be model-based sampling. Model-based sampling

(like design-based sampling) actually consists of a variety of

methods, the most practical and important variant for

environmental samplers being geostatistical sampling (see [5] for

a thorough, general treatment and [6] for a discussion of

geostatistical applications to environmental problems). The main

distinction between model-based and design-based theories rests

on the use or non-use of a model to account for patterns of

variability within the population. Geostatistical models

describe the spatial covariance structure of variables of

interest. Because these models are stochastic, they interject a

degree of randomness into one's perception of the world itself.

In other words, the world as observed through a window in space-



time is no longer seen as a fixed fact, but is rather viewed as a

single realization of a random process.

Because the model-based approach views randomness as part of

the population itself, random sampling is no longer necessary for

a model-based sample design. In fact, it is generally not even

desirable, because regularly spaced observations usually provide

the best information about the random process one assumes to be

lurking behind the realized population. The price paid for

allowing non-random sampling is that the patterns of variation in

space and/or time--that is, the character of the underlying

process--must be adequately (although not perfectly) understood

if estimates are to be reliable. The payoff is that model-based

sampling is usually more efficient because it makes more complete

use of information about the population.

Model-based sampling is now sufficiently well established

amongst sampling theorists for opposing camps of design- vs.

model-based statisticians to have formed, feuded, and (sometimes)

made truces. More and more practitioners now recognize that both

perspectives have value, depending on the actual problem at hand.

Relevant articles of interest include Borgman and Quimby [7] and

Brus and De Gruijter [8].

Are design-based and model-based sampling the only choices

available? There exists, perhaps, a third way, as well. Pierre

Gy, a French mining engineer, developed a sampling theory in

relative isolation from the theoretical statisticians, and his



theory is now being advanced as another alternative to the

classical sampling theory [1], [9], [10]. Like classical

sampling, Gy's theory assumes a particular fixed state of nature

which the sampler wishes to describe with calculable confidence.

Like model-based sampling (in particular, geostatistical

sampling), Gy's theory attempts to address patterns in the

variability of the population, body, or area to be sampled.

Although Gy developed his theory in the context of a particular

problem -- the estimation of the grade (percentage mineral

content) in a sample of ore -- proponents view Gy's contributions

as a general theory of sampling which offers improvements over

standard methods in other contexts as well. In particular,

whenever a sampled medium may be viewed as particulate (including

fluids), Gy's theory may be applied. Environmental samplers who

work with soils and waters are of course interested in any

contributions that Gy's theory may offer.

The following pages describe certain important aspects of

Gy's theory and compare it with classical sampling, thereby

clarifying the theory itself and indicating what practitioners

can hope to gain from this perspective. Lyman [11] compares Gy’s

variance estimator to that of Ingamells, including extensive data

analysis. Much of the mathematics in Gy's theory is equivalent

to that in classical design-based sampling theory, and the most

important connections will be examined. Gy's work is most easily



understood in the context in which it was developed, and the

explanation below makes reference to a population of mineral

particles containing varying amounts of ore. Examples are

given for ways that this model can be used in situations that

parallel problems of interest to environmental samplers.

GY'S SAMPLING THEORY

One of the primary contributions of Gy's theory is its

systematic identification of different aspects of population

variability before sampling begins. Gy treats the world as

deterministic, just as do classical samplers, but the choice of

sample for Gy may depend on one's assessment of population

variability. (This assessment may be made in part by studying

the variogram of the variable being studied, so Gy's theory has

connections with geostatistical sampling.) In particular, the

method seeks to improve upon classical sampling by directly

addressing the various sources of error rather than by simply

relying on randomization to account for all potential variation.

Pragmatically, this separation of error sources is often

essential, because some errors may represent actual biases, not

just simple variation. After carrying out a presampling analysis,

scientists may adopt whatever additional assumptions they believe

are appropriate to organize the sampling process and to interpret

results.

The essence of Gy's theory is to "divide and conquer". Once

sources of error in sampling are identified, an attempt is made

to minimize each type of error separately. Pitard [10] describes

this part of Gy's theory in detail and the following brief

summary draws heavily from his exposition. Some error

minimization is just good technician work. In other cases errors

are intrinsic to the population being sampled. Many of Gy's

procedures for accounting for variability are expressly motivated

by properties of particulate sampling. Most notably, true simple

random samples are not feasible in particulate populations. Also,

units can occasionally be altered and rearranged by physical

crushing and mixing. (Note that environmental populations can

rarely be homogenized in this manner.) Finally, technicalities

arise because the parameter of interest, the percent of a mineral

in a body of ore, is a ratio instead of a simple additive

measure.

The most basic errors identified by Gy result from the

intrinsic heterogeneity of the world--not all particles are the

same, and unlike particles are unevenly distributed in space. To

assess the heterogeneity in a population Gy asks two questions.

First, how much do sampling units differ from each other (What is

the "constitution heterogeneity"); and second, how are different

types of units spread about or clustered within the population

(What is the "distribution heterogeneity")? Both types of

heterogeneity affect the reliability of a sample.



Because the constitution heterogeneity is impossible to

alter, the sampling error that is associated with it is termed

the fundamental error (FE). This error can never be eliminated

and is a major focus of much of Gy's theory. Estimation of the

variance of the fundamental error is the portion of the theory

which is outlined in most detail below.

A variety of errors are related to the interaction between

the distribution heterogeneity and the sampling method used

(always some form of cluster sampling). An important error of

this type results from the interplay between uneven clumps of

units in the population and sampling devices that grab clumps of

units. This error is called the grouping and segregation error

(GE). In brief, the grouping and segregation error is smallest

when clustering is absent in both the population and the sampling

procedure. Environmental samplers rarely have the luxury of

being able to homogenize (mix) their populations, so this error

will be present in most problems of interest. As is always true

in cluster sampling problems, the desire to select small sampling

clusters must be balanced against the practicality of sampling

larger clusters.

When the population is itself a nearly linear flow of

particles, then additional errors are associated with

distribution heterogeneity. In this case the distribution

heterogeneity may be addressed by using the variogram to describe

variation in the population (not in a random process as in



geostatistics). Stratified sampling can then be used as a

remedial measure with strata selected according to the errors

associated with trends (TrE) and cycles (CyE) that are identified

in such a stream. Patterns in two or three dimensions are

substantially more difficult to characterize and Gy does not

attempt to address such patterns.

A perfectly executed sampling plan would contain precisely

the errors described above. For a single lot the ideal sampling

error (ISE) would then be

ISE ' FE % GE (1)

and for a moving stream of particles the error would be

ISE ' FE % GE % TrE % CyE. (2)

Gy also carefully delineates and probes errors that can

enter a sample from sources other than those ideally described

thus far. Survey statisticians have long recognized problems

such as processing errors, nonresponse bias, and the effects of

improper sampling. Analogous errors may arise in particulate

sampling, and Gy has gone to great pains to describe, measure,

and minimize errors of this sort. Among major error sources

delineated by Gy are: errors that arise from edge effects of

sampling equipment and from similar mechanical problems

(delimitation and extraction, or, jointly, mechanical errors);



errors in preparing samples for laboratory analysis (preparation

errors) and actual errors from the laboratory (analytic errors).

Those familiar with quality control may note a similarity in

spirit between the above identification of error sources and

similar exercises in the quality literature.

Three comments about these additional types of errors are

relevant at this point. First, these errors are potentially

important to the practitioner because they may bias observations,

as mentioned above. If they are recognized in time, most of

these errors can be minimized, or even eliminated, by proper

physical collection and handling. Second, because sampling is

often done in stages, most of the above-mentioned errors can

enter the problem many times, and the stage with the greatest

error present will form a lower bound on the total sampling

error. Third, the theory itself can only point to the existence

of such variability; it cannot itself remedy errors at this

level.

The error which cannot be removed by even the most careful

technicians and the best instrumentation is the error intrinsic

to the population variability, that is the fundamental error.

The focus below is upon the fundamental error because it is

always present and it is the only error that can be assessed

independently of the sampling method. The fundamental error as

defined by Gy is the relative error in estimating the grade

(proportion of desired mineral), and is thus a measure of the



variation intrinsic in the population of available mineral

particles. Its variance is the square of the coefficient of

variation of the grade. If the grade is expressed as the ratio

of two random variables on a set of sampling units, then the

fundamental error may be estimated using methods given by Cochran

[3]. It can be shown that Gy's methods are equivalent to those

given by Cochran, for a population for which complete enumeration

is possible. However, Gy has developed methods for particulate

sampling where complete enumeration of samples is not feasible or

cost effective.

COMPARISON OF GY'S THEORY AND CLASSICAL SAMPLING

The description below uses the notation of classical

statistics and the physical context of particulate ore sampling.

Most environmental samplers will be accustomed to the notation

used, but they will probably wish to translate physical variables

into those used in their own areas of interest. For example,

grade of ore may be analogous to the percentage of some chemical

present in a particular medium.

Let L represent an ore body, where X is the total mass of

the body and Y is the mass of the mineral of interest. The

parameter to be estimated is R=Y/X, the grade of the ore body.

In the language of classical finite sampling, this is the problem

of estimating the ratio of two random variables. In finite



sampling theory, it is assumed that a population is composed of N

sampling units ($U_i$, $i = 1, 2, 3, \ldots, N$) and that certain attributes of

these units may be enumerated or measured. For the estimation of

ore grade, the finite population consists of a set of fragments

of ore. The mass of the mineral of interest contained in

fragment $i$ is denoted by $y_i$ and the total mass of the fragment is

given by $x_i$ for $i = 1, 2, 3, \ldots, N$. The notational conventions used for

totals, averages and ratios are given in Table 1.



Table 1. Notation for two random variables measured on a sample of
size n from a population of size N.

         Population of size N                      Sample of size n
Total    $Y = \sum_{i=1}^{N} y_i$                  $y = \sum_{i=1}^{n} y_i$
Mean     $\bar{Y} = Y/N$, $\bar{X} = X/N$          $\bar{y} = y/n$, $\bar{x} = x/n$
Ratio    $R = Y/X = \bar{Y}/\bar{X}$               $\hat{R} = y/x = \bar{y}/\bar{x}$

An estimator of the ratio R, and approximations of the first

two moments of that estimate are given both by Cochran [3] and by

Gy [1]. Each uses a slightly different method of derivation to

arrive at results, but the moments they obtain are equivalent up

to the order of approximation. A summary of the derivation of

these results is given.

Estimation of the Grade $\hat{R}$

In a finite population the statistical expectation operator

is defined by averaging over all possible combinations ($C_n^N$) of

samples, where

\[
C_n^N = \frac{N!}{n!\,(N-n)!}. \tag{3}
\]

Cochran [3] shows that the expectation of $\bar{y}$ is given by

\[
E[\bar{y}] = \frac{\sum \bar{y}}{C_n^N} = \bar{Y}, \tag{4}
\]

where the sum is over all possible samples. Using this result,

it is clear that $Ny/n$ is an unbiased estimator of $Y$. The natural

estimator of $R$ is based on the ratio of totals

\[
\hat{R} = \frac{y}{x}. \tag{5}
\]
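The statement that averaging over all $C_n^N$ samples makes the sample mean unbiased can be checked by brute force on a very small population. The sketch below is illustrative only: the six mineral masses and the helper name `expected_sample_mean` are assumptions made for this example, not data or code from the chapter.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def expected_sample_mean(values, n):
    """Average the sample mean over every sample of size n drawn
    without replacement, i.e. the finite-population expectation."""
    total = Fraction(0)
    for sample in combinations(values, n):
        total += Fraction(sum(sample), n)
    return total / comb(len(values), n)


# Hypothetical mineral masses y_i for N = 6 fragments (illustrative only).
y_values = [2, 5, 1, 7, 3, 6]
N, n = len(y_values), 3

population_mean = Fraction(sum(y_values), N)
mean_of_means = expected_sample_mean(y_values, n)

# Scaling the expected sample total by N/n recovers the population total Y.
expected_total = N * mean_of_means

print(mean_of_means == population_mean, expected_total == sum(y_values))
```

Exact rational arithmetic is used so that the equality is verified without floating-point tolerance; enumeration is only practical for very small N.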

Moments of the Estimated Grade $\hat{R}$

The moments of a ratio estimator are not obvious. Both Gy

and Cochran find means and approximations of variances of $\hat{R}$.

Brief derivations follow.

Define

\[
\mu_x = E(x) = \frac{nX}{N}, \qquad \mu_y = E(y) = \frac{nY}{N}. \tag{6}
\]

One may express $\hat{R}$ in terms of the relative variables $u$ and $v$

defined by

\[
x = \mu_x (1+u) \quad\text{and}\quad y = \mu_y (1+v). \tag{7}
\]

Then in terms of $u$ and $v$, $\hat{R}$ is given by

\[
\hat{R} = R\,\frac{1+v}{1+u}. \tag{8}
\]

Changing the denominator into a multiplicative factor and using

Taylor's theorem gives

\[
\hat{R} = R\,(1+v)(1 - u + u^2 - u^3 + \cdots). \tag{9}
\]

Because the expectations of both $u$ and $v$ are 0,

\[
E(\hat{R}) = R\,\bigl(1 - \mu_{11}(u,v) + \mu_{20}(u,v) + \mu_{21}(u,v) + \cdots\bigr), \tag{10}
\]

where

\[
\mu_{ij}(u,v) = E\bigl[(u-\mu_u)^i (v-\mu_v)^j\bigr]. \tag{11}
\]

This may be written in the form

\[
E(\hat{R}) = R\,(1+S), \tag{12}
\]

where $S$ is given by

\[
S = -\mu_{11}(u,v) + \mu_{20}(u,v) + \mu_{21}(u,v) + \cdots. \tag{13}
\]

$S$ is equivalent to what Gy calls the fundamental bias,

\[
S = \operatorname{Bias}(\hat{R}) \equiv \frac{E(\hat{R}) - R}{R}, \tag{14}
\]

which is the relative bias in the estimate $\hat{R}$. Gy [1] and

Cochran [3] both independently provided approximations of $S$ using

the first two terms in the series and writing the result in terms

of the correlation between $x$ and $y$. Matheron [12], in an

examination of Gy's work, used Laplace transforms to derive a

general expression for the expectation of the ratio of two random

variables raised to a power. This general result was used to

derive Gy's formula. Further, Cochran [3] presents the exact

results due to Hartley and Ross [13],

\[
E(\hat{R}) = R\left(1 - \frac{\mu_{11}(\hat{R},x)}{\mu_x R}\right) \tag{15}
\]

and

\[
S = -\frac{\mu_{11}(\hat{R},x)}{E(x)\,R}. \tag{16}
\]
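For a population small enough to enumerate, the relative bias $S$ of equation (14) can be computed exactly and compared with the two-term approximation $\mu_{20}(u,v) - \mu_{11}(u,v)$ taken from equation (13). The following is a minimal sketch under that assumption; the fragment masses and the function name `relative_bias` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def relative_bias(x_vals, y_vals, n):
    """Return (exact S, two-term approximation) for samples of size n.

    The exact value averages (R_hat - R) / R over every sample drawn
    without replacement; the approximation is mu20(u, v) - mu11(u, v).
    """
    N = len(x_vals)
    X, Y = sum(x_vals), sum(y_vals)
    R = Y / X
    mu_x, mu_y = n * X / N, n * Y / N  # equation (6)
    ratio_sum = mu20 = mu11 = 0.0
    count = 0
    for idx in combinations(range(N), n):
        x = sum(x_vals[i] for i in idx)
        y = sum(y_vals[i] for i in idx)
        u, v = x / mu_x - 1.0, y / mu_y - 1.0  # equation (7)
        ratio_sum += y / x
        mu20 += u * u
        mu11 += u * v
        count += 1
    exact = (ratio_sum / count - R) / R
    return exact, (mu20 - mu11) / count


# Hypothetical fragment masses (x) and mineral masses (y), illustrative only.
x_demo = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0]
y_demo = [1.0, 1.2, 0.8, 1.1, 1.3, 0.7]
exact_S, approx_S = relative_bias(x_demo, y_demo, n=3)
print(f"exact S = {exact_S:.6f}, two-term approximation = {approx_S:.6f}")
```

When every fragment has the same grade, each sample ratio equals $R$ and both quantities vanish, which gives a quick sanity check on the function.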

Only approximate formulas for the variance of $\hat{R}$ are

available. Using equation (8) we compute the expectation

\[
E(\hat{R}^2) = R^2 (1 + S'), \tag{17}
\]

where

\[
S' = \mu_{02}(u,v) - 4\mu_{11}(u,v) + 3\mu_{20}(u,v) + \cdots \tag{18}
\]

is obtained from the Taylor expansion of $1/(1+u)^2$. Combining

equations (12) and (17) gives

\[
\operatorname{Var}(\hat{R}) = E(\hat{R}^2) - \{E(\hat{R})\}^2 = R^2 (S' - 2S). \tag{19}
\]

Using the definition of $\mu_{ij}(u,v)$ given in (11) it can be shown

[3] that, up to the second order moments of $u$ and $v$,

\[
\operatorname{Var}(\hat{R}) \approx \frac{N-n}{nN}\,\frac{1}{\bar{X}^2} \sum_{i=1}^{N} \frac{(y_i - R x_i)^2}{N-1}. \tag{20}
\]
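The quality of the second-order approximation in equation (20) can be examined on a small, fully enumerable population. The sketch below assumes the classical form of the approximation, with $\bar{X}^2$ in the denominator, and compares it with the variance obtained by enumerating every sample; the data and function names are hypothetical.

```python
from itertools import combinations


def ratio_variance_approx(x_vals, y_vals, n):
    """Second-order approximation of Var(R_hat), as in equation (20)."""
    N = len(x_vals)
    X, Y = sum(x_vals), sum(y_vals)
    R = Y / X
    x_bar = X / N
    resid_ss = sum((y - R * x) ** 2 for x, y in zip(x_vals, y_vals))
    return (N - n) / (n * N) * resid_ss / ((N - 1) * x_bar ** 2)


def ratio_variance_exact(x_vals, y_vals, n):
    """Variance of R_hat = y / x over all samples without replacement."""
    ratios = []
    for idx in combinations(range(len(x_vals)), n):
        x = sum(x_vals[i] for i in idx)
        y = sum(y_vals[i] for i in idx)
        ratios.append(y / x)
    mean = sum(ratios) / len(ratios)
    return sum((r - mean) ** 2 for r in ratios) / len(ratios)


# Hypothetical fragment masses (x) and mineral masses (y), illustrative only.
x_demo = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0]
y_demo = [1.0, 1.2, 0.8, 1.1, 1.3, 0.7]
approx_var = ratio_variance_approx(x_demo, y_demo, n=3)
exact_var = ratio_variance_exact(x_demo, y_demo, n=3)
print(f"approximate = {approx_var:.3e}, exact = {exact_var:.3e}")
```

The factor $(N-n)/(nN)$ is the finite population correction, so the approximation is zero when the whole population is sampled.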

The Fundamental Error

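Gy [1] defines the fundamental error of estimation and the

relative variance of the fundamental error as

\[
FE = \frac{\hat{R} - R}{R} \quad\text{and}\quad s^2(FE) = \frac{\operatorname{Var}(\hat{R})}{[E(\hat{R})]^2}, \tag{21}
\]

respectively. This is a slightly nonstandard convention in that

the usual variance of the fundamental error is given by

\[
\operatorname{Var}(FE) = \frac{\operatorname{Var}(\hat{R})}{R^2} = S' - 2S. \tag{22}
\]

In practice, this convention does not pose any difficulty since,

up to second order moments in $u$ and $v$,

\[
s^2(FE) = \operatorname{Var}(FE). \tag{23}
\]

From these forms, Gy derives the approximate form

\[
\frac{\operatorname{Var}(\hat{R})}{(E(\hat{R}))^2} \approx \left(\frac{N}{n} - 1\right) \sum_{i=1}^{N} \left(\frac{x_i}{X}\right)^2 \left(\frac{R_i - R}{R}\right)^2, \tag{24}
\]

where $R_i = y_i/x_i$ is the grade of fragment $i$; this form is used in applications below.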
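The step from equation (20) to equation (24) is not spelled out in the text. One way to see it, writing the grade of fragment $i$ as $R_i = y_i/x_i$ (an assumption about the notation) and taking $E(\hat{R}) \approx R$ and $N/(N-1) \approx 1$, is sketched below.

```latex
\begin{aligned}
\frac{\operatorname{Var}(\hat{R})}{R^2}
  &\approx \frac{N-n}{nN}\,\frac{1}{\bar{X}^2 R^2}
     \sum_{i=1}^{N} \frac{(y_i - R x_i)^2}{N-1}
     && \text{from (20)} \\
  &= \frac{N-n}{nN}\,\frac{N^2}{X^2}\,\frac{1}{N-1}
     \sum_{i=1}^{N} x_i^2 \left(\frac{R_i - R}{R}\right)^2
     && \text{since } y_i = R_i x_i,\ \bar{X} = X/N \\
  &= \left(\frac{N}{n} - 1\right)\frac{N}{N-1}
     \sum_{i=1}^{N} \left(\frac{x_i}{X}\right)^2
     \left(\frac{R_i - R}{R}\right)^2 .
\end{aligned}
```

Dropping the factor $N/(N-1)$, which is close to 1 for the large fragment counts typical of particulate lots, leaves the form in equation (24).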

In summary, Gy's fundamental error is the relative error in the

estimate $\hat{R}$ of the ratio of random variables, and its expected value,

the fundamental bias $S$, is exactly the relative bias in $\hat{R}$. This

is equivalent to the bias given by Cochran. Further, the

variance of the fundamental error as defined by Gy, $s^2(FE)$, is

asymptotically equivalent to the usual variance of the

fundamental error, $\operatorname{Var}(FE)$.

In applications where complete enumeration of the sample is

possible, the methods given by Gy are equivalent to those of

classical random sampling. Differences between Gy's methods and

those of finite sampling lie in the methods developed for

sampling of particulate materials after grouping into categories.

These methods are used to reduce the cost of estimation when

complete enumeration of the sample is not feasible. In these

methods, estimators are developed which are similar to those

applied to estimate the mean and variance of grouped data.

APPLICATION TO PARTICULATE MATERIALS



To estimate $R$ and the variance of $\hat{R}$ requires complete

enumeration of the $n$ units in the sample and measurement of $x_i$

and $y_i$ on each sampled fragment. In the case of particulate

materials, this is impractical. To overcome this problem, Gy

derived a method of estimating $\hat{R}$ and its variance which does

not require fragment-by-fragment enumeration. Details follow.

Let $L$ represent the population of particulate material with

$N$ fragments denoted by $\{U_i,\ i = 1, 2, 3, \ldots, N\}$. These fragments may be

divided into classes $L_{\alpha\beta}$ with average volume $V_\alpha$ and average

density $\rho_\beta$. If each fragment in the class $L_{\alpha\beta}$ is identified with

an average fragment $F_{\alpha\beta}$, then an estimate of $\hat{R}$ and $\operatorname{Var}(\hat{R})$ based

on the midpoints of the size and density classes may be used.

This treatment is essentially the same as the computation of the

mean and variance from a grouped frequency distribution. The

necessary notation is listed in Table 2.

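Table 2. Definitions of notation used for estimation of $\hat{R}$ and
$\operatorname{Var}(\hat{R})$ based on size and density classes. Each fragment is
represented by an average fragment denoted $F_{\alpha\beta}$.

                        Population of size N                              Sample of size n
Average volume          $V_\alpha$                                        $v_\alpha$
Average density         $\rho_\beta$                                      $d_\beta$
Average mass            $\bar{X}_{\alpha\beta} = V_\alpha \rho_\beta$     $\bar{x}_{\alpha\beta} = v_\alpha d_\beta$
Average ratio (grade)   $R_{\alpha\beta}$                                 $\hat{R}_{\alpha\beta}$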

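Consider equation (24) for the variance of the fundamental

error. Summation on the index $i$ is replaced with double sums on

$\alpha$ and $\beta$ as

\[
\sum_{i=1}^{N} (R_i - R)^2 x_i^2 \approx \sum_{\alpha=1}^{r} \sum_{\beta=1}^{s} N_{\alpha\beta} (R_{\alpha\beta} - R)^2 \bar{X}_{\alpha\beta}^2. \tag{25}
\]

Making this substitution in equation (24) and using the fact that

$\bar{X}_{\alpha\beta} = V_\alpha \rho_\beta$ gives

\[
\begin{aligned}
\frac{\operatorname{Var}(\hat{R})}{(E(\hat{R}))^2}
  &= \left(\frac{N}{n} - 1\right) \frac{1}{X} \sum_{\alpha=1}^{r} \sum_{\beta=1}^{s} \frac{N_{\alpha\beta} \bar{X}_{\alpha\beta}}{X} \left(\frac{R_{\alpha\beta} - R}{R}\right)^2 V_\alpha \rho_\beta \\
  &= \left(\frac{N}{n} - 1\right) \frac{1}{X}\, H.
\end{aligned} \tag{26}
\]

$H$ is defined to be a constant of constitution heterogeneity. The

number of fragments $N_{\alpha\beta}$ in class $(\alpha,\beta)$ times the average particle

mass gives the total mass in that size-density class,

$X_{\alpha\beta} = N_{\alpha\beta} \bar{X}_{\alpha\beta}$. Using this relationship,

\[
H \approx \sum_{\alpha=1}^{r} \sum_{\beta=1}^{s} \left(\frac{R_{\alpha\beta} - R}{R}\right)^2 \frac{X_{\alpha\beta}}{X}\, V_\alpha \rho_\beta. \tag{27}
\]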

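Now an estimate of $H$ based on the sample of $n$ fragments is

needed. One may estimate $V_\alpha$ and $\rho_\beta$ with their sample equivalents

$v_\alpha$ and $d_\beta$ respectively. This gives the estimate

\[
x \approx \sum_{\alpha} \sum_{\beta} x_{\alpha\beta}, \quad\text{where}\quad \bar{x}_{\alpha\beta} \approx v_\alpha d_\beta, \quad x_{\alpha\beta} \approx n_{\alpha\beta} \bar{x}_{\alpha\beta}, \tag{28}
\]

where $n_{\alpha\beta}$ may be known, or estimated based on average volume and

mass. Depending on the degree of precision desired and the

available resources, the number of grains in each volume-density

class may be counted or estimated based on the average size and

density.

Defining $d_m$ as the density of the constituent of interest,

$d_w$ as the density of the waste and

\[
\hat{R}_{\alpha\beta} \approx \frac{\dfrac{1}{d_\alpha} - \dfrac{1}{d_w}}{\dfrac{1}{d_m} - \dfrac{1}{d_w}}, \tag{29}
\]

an estimate of the ratio $R$ is given by the weighted average

\[
\hat{R}_2 = \frac{\sum_{\alpha} \sum_{\beta} x_{\alpha\beta} \hat{R}_{\alpha\beta}}{\sum_{\alpha} \sum_{\beta} x_{\alpha\beta}}. \tag{30}
\]

Substituting equation (28) into (27) and using equations (29) and

(30) gives the estimate

\[
\hat{H} \approx \sum_{\alpha} \sum_{\beta} \left(\frac{\hat{R}_{\alpha\beta} - \hat{R}_2}{\hat{R}_2}\right)^2 \frac{x_{\alpha\beta}}{x}\, v_\alpha d_\beta. \tag{31}
\]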

As with $n_{\alpha\beta}$, the average value of $\hat{R}_{\alpha\beta}$ could be estimated through

assay if budget constraints allowed. In most environmental

sampling scenarios, direct assay would be used. It should be

noted that when estimated by equation (29), $R_{\alpha\beta}$ appears to depend

only on $\alpha$. However, the density of waste and mineral may vary

with volume class depending on the degree of separation between

ore and waste (percent liberation). As the fragment size

decreases, the percent liberation of the constituent of interest

generally increases. This variation will be captured if $R_{\alpha\beta}$

is estimated by assay rather than by equation (29). Finally,

since $X \gg x$, an estimate of the variance of the fundamental error

is

\[
\begin{aligned}
\operatorname{var}(FE) &\approx \left(\frac{N}{n} - 1\right) \frac{1}{X}\, \hat{H} = \left(\frac{X \bar{x}}{\bar{X} x} - 1\right) \frac{1}{X}\, \hat{H} \\
  &\approx \left(\frac{1}{x} - \frac{1}{X}\right) \hat{H} \approx \frac{1}{x}\, \hat{H}.
\end{aligned} \tag{32}
\]
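The grouped estimates in equations (30) to (32) amount to a weighted sum over the cells of the size-by-density table. The sketch below is one possible reading of those formulas, with the relative grade deviation squared as in equation (27); the function name, argument layout and the two-cell example are assumptions made for illustration and are not the subroutines referred to in the chapter.

```python
def grouped_fundamental_error(mass, grade, volume, density):
    """Grouped-data estimates in the spirit of equations (30)-(32).

    mass[a][b]  : sample mass in size class a and density class b
    grade[a][b] : estimated grade of that class
    volume[a]   : average fragment volume of size class a
    density[b]  : average fragment density of density class b
    Returns (R_hat_2, H_hat, var_FE) with var_FE = H_hat / x.
    """
    x_total = sum(sum(row) for row in mass)
    weighted = sum(m * r
                   for m_row, r_row in zip(mass, grade)
                   for m, r in zip(m_row, r_row))
    r_hat_2 = weighted / x_total  # mass-weighted grade, equation (30)
    h_hat = 0.0
    for a, v_a in enumerate(volume):
        for b, d_b in enumerate(density):
            rel_dev = (grade[a][b] - r_hat_2) / r_hat_2
            # One cell of the double sum in equation (31).
            h_hat += rel_dev ** 2 * (mass[a][b] / x_total) * v_a * d_b
    return r_hat_2, h_hat, h_hat / x_total  # last entry: equation (32)


# Hypothetical fully liberated sample: one size class, two density classes
# (masses in g, volume in cm^3, densities in g/cm^3).
r2, h, var_fe = grouped_fundamental_error(
    mass=[[10.0, 90.0]], grade=[[1.0, 0.0]], volume=[2.0], density=[5.0, 2.5])
print(r2, h, var_fe)
```

With consistent units the heterogeneity estimate carries units of mass, so dividing by the sample mass gives a dimensionless relative variance.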

Estimation of Physical Constants

Gy [9] developed a set of physical constants which can be

used to estimate the variance of the fundamental error for

mineralogical data. Following is a method to estimate those

physical constants from sample data. To facilitate the

computation of these constants we introduce the usual dot



notation for row and column sums typically associated with the

analysis of variance.

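\[
x_{\alpha\cdot} = \sum_{\beta=1}^{s} x_{\alpha\beta}, \qquad x_{\cdot\beta} = \sum_{\alpha=1}^{r} x_{\alpha\beta}, \tag{33}
\]

\[
x_{\cdot\cdot} = \sum_{\alpha=1}^{r} \sum_{\beta=1}^{s} x_{\alpha\beta} = \sum_{\alpha=1}^{r} x_{\alpha\cdot} = \sum_{\beta=1}^{s} x_{\cdot\beta}. \tag{34}
\]

Define the constitution heterogeneity for a given size class,

obtained by summing over the density classes, as

\[
H_\alpha = \sum_{\beta} \frac{x_{\alpha\beta}}{x_{\alpha\cdot}} \left(\frac{\hat{R}_{\alpha\beta} - \hat{R}_2}{\hat{R}_2}\right)^2 v_\alpha d_\beta. \tag{35}
\]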

Two limiting cases can be identified for $H_\alpha$, given by complete

homogeneity of the material sampled or complete heterogeneity.

These limiting cases help to explain the method being used, and

they also occur in certain applications.

a) Completely homogeneous ($\hat{R}_{\alpha\beta} \equiv \hat{R}_2$) for all $\alpha$.

In this case $H_\alpha = 0$ (no constitution heterogeneity).

b) Completely heterogeneous (completely liberated).

In this case all of the material in class $\alpha$ can be

separated into two density classes:

$\beta = 1$, for pure mineral, with grade 1.0;

$\beta = 2$, for pure waste, with grade 0.0.



In the completely liberated case, let $x_{\alpha 1}$ be the mass of the

mineral in class $\alpha$ and $x_{\alpha 2}$ be the mass of the waste in class $\alpha$.

Then, in this limiting case, let $c_\alpha = H_\alpha$, given by

\[
c_\alpha = \left(\frac{1 - \hat{R}_2}{\hat{R}_2}\right)^2 R_\alpha d_m + (1 - R_\alpha)\, d_w, \tag{36}
\]

where

\[
R_\alpha = \frac{x_{\alpha 1}}{x_{\alpha 1} + x_{\alpha 2}} \tag{37}
\]

is the ratio of the mass of the constituent of interest to the

total mass in volume class $\alpha$. In this case the liberation ratio

$\ell_\alpha$ is defined to be $H_\alpha / c_\alpha$, so that for the completely

homogeneous case $\ell_\alpha = 0$ and for the completely liberated case

$\ell_\alpha = 1.0$.

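Define

\[
H^* = \frac{\sum_{\alpha} x_{\alpha\cdot}\, v_\alpha c_\alpha \ell_\alpha}{\sum_{\alpha} x_{\alpha\cdot}\, v_\alpha}. \tag{38}
\]

Then the mineralogical factor $c$ is given by

\[
c = \left(\frac{1 - \hat{R}_2}{\hat{R}_2}\right)^2 \hat{R}_2\, d_m + (1 - \hat{R}_2)\, d_w \tag{39}
\]

and the liberation factor $\ell$ is the ratio of $H^*$ and $c$,

\[
\ell = \frac{H^*}{c}. \tag{40}
\]

Finally the constitution heterogeneity $H$ can be approximated by

\[
\hat{H} \approx \sum_{\alpha} \frac{x_{\alpha\cdot}}{x_{\cdot\cdot}}\, v_\alpha\, c\, \ell. \tag{41}
\]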

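Letting $v_{95}$ be the 95th percentile of the volumes, define the

granulometric constant

\[
g = \sum_{\alpha} \frac{x_{\alpha\cdot}}{x_{\cdot\cdot}}\, \frac{v_\alpha}{v_{95}}. \tag{42}
\]

Then the variance of the fundamental error is estimated by

\[
\begin{aligned}
s^2_{FE} = \frac{\operatorname{var}(\hat{R})}{(E(\hat{R}))^2}
  &\approx \left(\frac{N}{n} - 1\right) \frac{1}{X}\, c\,\ell\,g\,v_{95} \\
  &\approx \left(\frac{1}{x} - \frac{1}{X}\right) c\,\ell\,g\,v_{95} \\
  &\approx \frac{c\,\ell\,g\,v_{95}}{x}.
\end{aligned} \tag{43}
\]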

The final relation follows because $X \gg x$. In the context of

environmental sampling, these constants $H$, $\ell$, $c$ and $g$ must be

reinterpreted and estimated. François-Bongarçon [14] noted that

Gy’s method, although potentially powerful, has failed to be

widely applied even in mining applications due to the difficulty

in adequately estimating the geological constants. Further

research should be directed toward determination of appropriate



physical constants in the environmental setting. Sinclair [15]

emphasized the importance of characterization of heterogeneity

(denoted as geologic and value continuity) for ore reserve

estimation. Similar characterization is of equal importance in

the environmental sampling context.
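The mineralogical factor of equation (39) and the granulometric constant of equation (42) are simple to compute once class masses, class volumes and densities are available. The sketch below assumes the reading of equation (39) that agrees with the numerical value of c quoted in Example 2; the function names and the size-class values in the usage lines are hypothetical.

```python
def mineralogical_factor(grade, d_mineral, d_waste):
    """Mineralogical (composition) factor c, in the form of equation (39)."""
    rel = (1.0 - grade) / grade
    return rel ** 2 * grade * d_mineral + (1.0 - grade) * d_waste


def granulometric_constant(class_mass, class_volume, v95):
    """Granulometric constant g: mass-weighted mean of v_a / v95."""
    total = sum(class_mass)
    return sum(m / total * v / v95 for m, v in zip(class_mass, class_volume))


# Grade and densities quoted in Example 2 (densities in g/cm^3).
c = mineralogical_factor(grade=0.099, d_mineral=5.0, d_waste=2.6)

# Hypothetical size classes: masses in g, average volumes in cm^3.
g = granulometric_constant(class_mass=[3.0, 1.0],
                           class_volume=[1.0, 4.0], v95=4.0)
print(f"c = {c:.2f}, g = {g:.4f}")
```

Because every class volume is at most about $v_{95}$, the constant g computed this way lies between 0 and roughly 1, and it equals 1 when all of the mass sits in fragments of volume $v_{95}$.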

Summary of Gy's Basic Formula

Let $K = c\,\ell\,g$, where $c$ is the composition

(mineralogical) constant, $\ell$ is the liberation factor, and $g$ is

the granulometric constant. Note that $c$ has units of mass/volume

and the other constants are unitless. Then the basic formula

advanced by Gy is

\[
s^2_{FE} = \frac{\operatorname{var}(\hat{R})}{(E(\hat{R}))^2} \approx \frac{K\, v_{95}}{x}. \tag{44}
\]

Here, $v_{95}$ is the 95th percentile of the fragment volumes and $x$

is the sample weight. The symbol $s^2_{FE}$ in equation (44) represents

the square of the coefficient of variation of the fundamental

error. This is somewhat different from usual statistical

notation where $s^2$ is reserved for variances, but it is consistent

with Gy's use of the term. Therefore, if $K$ is known, one can

estimate the square of the coefficient of variation of $\hat{R}$ as a

function of the physical constants, $x$, and $v_{95}$. Alternatively,

the weight of the sample, $x$, needed to achieve a specified

coefficient of variation can be computed if $v_{95}$ is known, or the

size to which the material must be ground (that is, the

required $v_{95}$) can be calculated for a fixed weight of sample.
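The three uses of equation (44) described above are rearrangements of one relation. A minimal sketch follows, assuming K is expressed in g/cm^3, $v_{95}$ in cm^3 and the sample mass in g; the function names are illustrative and do not come from the chapter.

```python
def fe_relative_variance(K, v95, mass):
    """Squared coefficient of variation of the fundamental error."""
    return K * v95 / mass


def required_mass(K, v95, cv):
    """Sample mass needed to reach coefficient of variation cv."""
    return K * v95 / cv ** 2


def required_v95(K, mass, cv):
    """Largest 95th-percentile fragment volume that achieves cv."""
    return cv ** 2 * mass / K


# Round trip with K = 0.54 and v95 = 4 cm^3 (values used in Example 2).
mass = required_mass(K=0.54, v95=4.0, cv=0.05)
print(mass, fe_relative_variance(0.54, 4.0, mass), required_v95(0.54, mass, 0.05))
```

Halving the target coefficient of variation multiplies the required mass by four, or equivalently divides the permissible $v_{95}$ by four at fixed mass.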

OTHER APPLICATIONS

Gy's methods were developed specifically for application to

particulate sampling. However, these methods may also be applied

directly to other continuous materials, such as liquids.

Although the physical constants developed empirically for

minerals do not apply to fluids, the histogram methods can be

applied directly. Environmental sampling for contaminants in

liquid media is thus a natural area of application. In

particular, Gy's methods suggest application to composite

sampling. For example, monitoring a river for contaminant

concentrations could be aided by Gy's methods, in that

appropriate sizes of experimental units could be derived through

a size analysis similar to that applied to particulates. Further

research should include empirical experimentation to develop a

set of physical constants for sampling of other than heavy

metals.

Computer subroutines have been developed at the University

of Wyoming to compute estimates of the ratio of random variables

using the methods given by Cochran [3] and Gy [1] and [9]. These

subroutines also provide estimates of the constants $c$, $\ell$, and $g$.



Some examples of the application of Gy's results follow.

EXAMPLES

Example 1

To compare Gy's approximate method to the classical finite

sampling methods with complete enumeration, we simulated data

representing 1000 soil fragments. The simulated population ratio

was assumed to have a lognormal distribution with expected value

0.05. The fragment masses were assumed to be exponentially

distributed giving many small fragments with a few larger

fragments. The simulated data were analyzed using the computer

subroutines referenced above. Results are included in Appendix

A.

Using the finite sampling methods where individual fragment

by fragment enumeration was required, the estimate of the ratio

was found to be $\hat{R} = 0.05206$ with an estimated relative variance

$\operatorname{var}(\hat{R})/(\hat{R})^2 = 0.003327$. Using Gy's methods on the same data

after data were cross-classified into size and density classes

resulted in the estimated ratio $\hat{R} = 0.05249$ with an estimated

variance of the fundamental error given by $s^2(FE) = 0.002601$. We

consider this to be relatively good agreement of the two methods,

although results are conditioned on the particular realization of



the simulated population. The physical constants derived by Gy

were also calculated and are given in Appendix A, along with the

other estimates and the cross-classified data.

Example 2

One application of the use of Gy's formulas is the

determination of the sample size (mass) required to attain a

desired relative precision in the estimation of the grade of a

mineral of interest. This is a standard example due to Ottley

[16] giving the way in which Gy’s formula is typically used in

mining applications. Other more recent examples can be found in

François-Bongarçon [17]. It is anticipated that similar use can

be made in environmental settings. Suppose it is anticipated

that an ore of zinc contains 6.6% Zn as ZnS. If the ore can be

crushed to a maximum size of 2 cm, what mass of sample is

required to ensure that a 95% confidence interval gives an

estimate of the grade with relative error ±10%?

An approximate 95% confidence interval for R is given by

R̂ ± 2× SE(R̂) (45)

The specified precision can be expressed by

2 SE(R̂)
' 0.10 (46)

29

or equivalently as

2
0.10 (cRg)v.95
s2
FE
' ' . (47)
2 x

Gy substitutes for $v_{.95}$ using the 95th percentile of the diameters,
$d_{.95}$, and a shape factor $f$. Empirical studies have shown that in
most mineralogical applications $v_{.95} = f\,(d_{.95})^3$. In the present
example, $f = 0.5$ and $d_{.95} = 2$ cm, giving $v_{.95} = 4$ cm$^3$. Gy
recommends the granulometric constant $g = 0.25$ and the liberation factor
$\ell = 0.05$. The mineralogical factor recommended by Gy is given by

$$c = \frac{1-R}{R}\left[(1-R)\,d_m + R\,d_w\right] \qquad (48)$$

where some suitable constants are $d_m = 5.0$ g/cm$^3$ for the density of
mineral, $d_w = 2.6$ g/cm$^3$ for the density of waste, and
$R = 0.066 \times 1.5 = 0.099$, the factor 1.5 converting the Zn grade to
ZnS content. This gives $c = 43.34$ and $K = c\,\ell\,g = 0.54$.
Substituting $K$ and $v_{.95}$ into equation (47) gives a sample mass
$x = 864$ g.
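
The arithmetic of this example can be sketched in a few lines of Python.
The function and variable names are ours and purely illustrative; the
numerical inputs are those of the example. Note that the quoted 864 g
results from rounding K to 0.54 before substitution; carrying K at full
precision gives a slightly larger mass.

```python
def mineralogical_factor(grade, d_mineral, d_waste):
    """Equation (48): c = ((1 - R) / R) * ((1 - R) * d_m + R * d_w)."""
    return (1.0 - grade) / grade * ((1.0 - grade) * d_mineral + grade * d_waste)


def required_mass(rel_error, k, v95):
    """Equation (47) solved for the mass x, with s_FE = rel_error / 2."""
    s_fe = rel_error / 2.0
    return k * v95 / s_fe**2


grade = 0.066 * 1.5          # R: ZnS content implied by 6.6% Zn
c = mineralogical_factor(grade, d_mineral=5.0, d_waste=2.6)
k = c * 0.05 * 0.25          # K = c * l * g, with l = 0.05 and g = 0.25
v95 = 0.5 * 2.0**3           # f * d95**3, in cm^3

# K rounded to two decimals (0.54), as in the text.
mass_rounded = required_mass(0.10, round(k, 2), v95)
# K carried at full precision.
mass_unrounded = required_mass(0.10, k, v95)

print(f"c = {c:.2f}, K = {k:.4f}")
print(f"mass with K = 0.54: {mass_rounded:.0f} g")
print(f"mass with unrounded K: {mass_unrounded:.0f} g")
```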

To improve the precision of estimates, the sample could be crushed
further. To what diameter should the sample be crushed to give a relative
error of 0.05 (5%), given the sample mass of 864 g? Again solve equation
(47), now with $x = 864$ g substituted and $s_{FE} = 0.05/2 = 0.025$. This
gives $v_{.95} = 1$ cm$^3$ or $d_{.95} = 1.26$ cm, so the sample should be
crushed to a diameter less than 1.26 cm.
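
The same relation can be inverted for the crushing size. The sketch below
takes the target relative error as 0.05 (that is, 5%), which is the value
consistent with the quoted result of 1 cm^3, and reuses K = 0.54 and the
shape factor f = 0.5 from the example; the function name is hypothetical.

```python
def crush_diameter(rel_error, k, mass, shape_factor):
    """Solve equation (47) for v95, then v95 = f * d95**3 for d95."""
    s_fe = rel_error / 2.0
    v95 = s_fe**2 * mass / k
    d95 = (v95 / shape_factor) ** (1.0 / 3.0)
    return v95, d95


v95, d95 = crush_diameter(rel_error=0.05, k=0.54, mass=864.0, shape_factor=0.5)
print(f"v95 = {v95:.2f} cm^3, d95 = {d95:.2f} cm")
```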

The first example provides evidence of the similarity between estimates
obtained through the use of classical sampling methods and Gy's methods.
The second example gives an indication of the utility of Gy's
specification of physical constants for sample size and handling
determination. Gy has developed a method for converting the classical
sample size determination problem into one of sample mass and sample
handling procedures appropriate to achieve a specified precision. Pitard
[10] provides many further details and examples in a modern context for
these procedures.

Example 3

A third application is found in the sampling of liquids. Suppose an
estimate of the concentration of an organic contaminant such as
polychlorinated biphenyl (PCB) flowing past a cross section of river is
desired. Water sub-samples of volume $v$ are to be taken at random
locations in the cross section and combined to some total volume $v_t$ to
estimate the concentration. What volume of sub-sample unit should be used
and what total volume is required to give a specified coefficient of
variation ($s_{FE}$) for the estimate?

To answer this question, one may use Gy's methods, where each sub-sample
unit (i.e., an increment of water and suspended particulate) is treated
analogously to a fragment of solid material. Assume that the contaminant
is found in solution and as a surfactant on suspended particulate
material. If several sizes of sub-sampling units are used, then the set
of sub-sample observations can be classified into a two-way table by
volume and density. If there is little suspended particulate, then there
will be just one density class. It is anticipated that the percentage of
suspended particulate may vary with the volume and density of sample
units. Then using Gy's basic equation,

$$s^2_{FE} = \frac{(c\,g\,\ell)\,v}{v_t} \qquad (49)$$

with a selected value of $s_{FE}$, the volume of an individual sub-sample
unit may be determined for a given total volume of sample, or
alternatively, a total volume of sample may be determined given a
sub-sample unit volume. However, to apply equation (49), the constants
$c$, $g$ and $\ell$ (the liberation factor) must be determined.
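
Once the three constants are in hand, equation (49) can be rearranged for
either volume. The following is a minimal sketch, assuming the product of
the constants has already been estimated; the function names and all
numerical values are hypothetical and chosen only to illustrate the
algebra, not taken from any data set.

```python
def total_volume(k, unit_vol, s_fe):
    """Equation (49) solved for the total volume vt."""
    return k * unit_vol / s_fe**2


def unit_volume(k, total_vol, s_fe):
    """Equation (49) solved for the sub-sample unit volume v."""
    return s_fe**2 * total_vol / k


# Hypothetical inputs, for illustration only.
k = 0.2        # product of the three constants, assumed known
s_fe = 0.10    # target coefficient of variation
v = 0.5        # sub-sample unit volume, in litres

vt = total_volume(k, v, s_fe)
print(f"total volume needed: {vt:.1f} litres")
print(f"unit volume recovered from vt: {unit_volume(k, vt, s_fe):.2f} litres")
```

Because the two volumes enter equation (49) only through their ratio,
any consistent volume unit can be used.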

A basic field exercise may be used to determine these constants. Suppose
a set of $r$ samples of size $n$ is taken, where the volume of each
sub-sample unit is intentionally varied so that $v_1, v_2, \ldots, v_r$
are the sub-sample volumes. Each sample unit is kept separate, and its
volume and density are recorded. The set of $n \times r$ sub-sample units
collected is then cross-classified based on volume and density. Sample
units are combined within volume and density class and assayed for PCB
content. If individual sample units are sufficiently large, and PCB
concentrations are high, then individual sample units could be assayed.
For volume-density class $(\alpha, \beta)$, the number of sample units in
the class $n_{\alpha\beta}$, the average mass $x_{\alpha\beta}$, and the
average PCB concentration are available. Based on this table, the
estimates of $c$, $g$ and $\ell$ can be obtained from the formulas
previously given in this paper. These constants may then be used to
determine the relationship among total sample volume $v_t$, sub-sample
unit volume $v$, and $s^2_{FE}$.

CONCLUSIONS

The methods of Gy and Cochran for ratio estimation have been shown to be
asymptotically equivalent for samples which can be completely enumerated.
Both are based on finite sample theory. Gy extends the procedure to treat
data grouped into a two-way table of fragment volume and fragment
density, and provides a simple procedure for estimating the sample volume
and fragment sizes needed to attain a specified relative error. A
computer program, available from the authors, has been developed at the
University of Wyoming to estimate the constants from a table of grouped
data.

This paper has shown certain equivalences between finite sampling theory
and Gy's work for cases where samples may be completely enumerated. In
environmental settings and for the estimation of certain ores such as
precious metals, further empirical study is required to improve the value
of Gy's method for sampling materials which are not completely
enumerable. Note that methods outlined in Example 3 show how Gy's method
can be implemented for the important problem of sampling liquid media.
Future work will determine the ultimate value of Gy's method in
applications other than ore reserve estimation.


REFERENCES

[1] Gy, P.M. (1967). Memoires du Bureau de Recherches Geologiques
    Minieres, no. 56 (Chapitre 4, Theorie de l'echantillonnage
    equiprobable, pp. 42-51), Paris.

[2] Gilbert, R.O. (1987). Statistical Methods for Environmental
    Pollution Monitoring. Van Nostrand Reinhold, New York.

[3] Cochran, W.G. (1977). Sampling Techniques. John Wiley & Sons, Inc.,
    New York.

[4] Thompson, S.K. (1992). Sampling. John Wiley & Sons, Inc., New York.

[5] Cressie, N.A.C. (1991). Statistics for Spatial Data. John Wiley &
    Sons, Inc., New York.

[6] Flatman, G.T., Englund, E.J. and Yfantis, A.A. (1988).
    Geostatistical Approaches to Design of Sampling Regimes. In
    Principles of Environmental Sampling, L.H. Keith, Ed., American
    Chemical Society, Washington, D.C., pp. 73-84.

[7] Borgman, L.E. and Quimby, W.F. (1988). Sampling for Tests of
    Hypotheses When Data are Correlated in Space and Time. In
    Principles of Environmental Sampling, L.H. Keith, Ed., American
    Chemical Society, Washington, D.C., pp. 25-44.

[8] Brus, D.J. and de Gruijter, J.J. (1993). Environmetrics, (4),
    pp. 123-152.

[9] Gy, P.M. (1982). Sampling of Particulate Materials, Theory and
    Practice. Elsevier Scientific Publishing Company, New York.

[10] Pitard, F.F. (1989). Pierre Gy's Sampling Theory and Sampling
    Practice, Volume I, Heterogeneity and Sampling. CRC Press Inc.,
    Boca Raton, Florida.

[11] Lyman, G.J. (1993). Geochimica et Cosmochimica Acta, (57),
    pp. 3825-3833.

[12] Matheron, G. (1966). Revue de l'Industrie Minerale, Aug.,
    pp. 609-621.

[13] Hartley, H.O. and Ross, A. (1954). Nature, 174, pp. 270-271.

[14] François-Bongarçon, D. (1992). The theory of sampling broken ores,
    revisited: An effective geostatistical approach for the
    determination of sample variances and minimum sample masses. In
    Proceedings of the XVth World Mining Congress, Madrid, Spain.

[15] Sinclair, A.J. (1994). Explor. Mining Geol., (3) 2, pp. 95-108.

[16] Ottley, D.J. (1966). World Mining, (19) 9, pp. 40-44.

[17] François-Bongarçon, D. (1991). CIM Bulletin, (84) 970, pp. 75-81.

APPENDIX A.

CLASSICAL VARIANCE ESTIMATES, COCHRAN (1977)

TOTAL OF X: 48021.5
TOTAL OF Y: 2500.18
ESTIMATED RATIO: 0.520638E-01
ESTIMATED VARIANCE: 0.901763E-05
ESTIMATED SQUARED CV: 0.332676E-02

DATA GROUPED BY SIZE AND DENSITY CLASSES

Each cell lists, from top to bottom: AVERAGE FRAGMENT MASS, AVERAGE
GRADE, CELL FREQUENCY.

                                      DENSITY CLASS
VOLUME                  2.6378    2.8382    3.0070    3.1647    3.3960

8.9512    MASS         23.6117   25.4051   26.9161   28.3278   30.3988
          GRADE         0.0286    0.1777    0.2800    0.3722    0.4883
          FREQUENCY        632        50        17        12         4

31.9481   MASS         84.2732   90.6742   96.0669  101.1057  108.4972
          GRADE         0.0308    0.1698    0.2917    0.3616    0.0000
          FREQUENCY        187        18         4         1         0

56.7869   MASS        149.7934  161.1710  170.7565  179.7128  192.8510
          GRADE         0.0361    0.1571    0.0000    0.0000    0.0000
          FREQUENCY         46         4         0         0         0

77.9711   MASS        205.6735  221.2953  234.4567  246.7541  264.7936
          GRADE         0.0283    0.1630    0.0000    0.0000    0.0000
          FREQUENCY         19         2         0         0         0

105.5720  MASS        278.4796  299.6315  317.4518  334.1024  358.5276
          GRADE         0.0317    0.0000    0.2655    0.0000    0.0000
          FREQUENCY          3         0         1         0         0

VARIANCE ESTIMATES USING VOLUME & DENSITY CLASSES, GY (1982)

APPROXIMATE GRADE: 0.524973E-01
TOTAL MASS: 48027.0
CONSTANT OF CONSTITUTION HETEROGENEITY: 124.920
MINERALOGICAL FACTOR: C= 87.9689
GRANULOMETRIC FACTOR: G= 0.601487
LIBERATION FACTOR: L= 0.422788E-01
95th VOLUME PERCENTILE: 55.8410
ESTIMATED VARIANCE OF THE FUNDAMENTAL ERROR: 0.260103E-02

CONTRIBUTORS

Anderson-Sprecher, Richard
Statistics Department
University of Wyoming
Laramie, WY 82071

Borgman, Leon E.
Statistics Department
University of Wyoming
Laramie, WY 82071

Flatman, George T.
Exposure Assessment Research Division
U.S. Environmental Protection Agency
Las Vegas, NV 89114-5027

Kern, John W.
Western Ecosystems Technology Inc.
1402 S. Greeley Hwy.
Cheyenne, WY 82007
