All content following this page was uploaded by Clifford Stanley on 02 June 2017.
Abstract — The measurement of error in assays collected as part of a mineral exploration program
or mining operation historically has been undertaken in a variety of ways. Different parameters
have been used to describe the magnitude of relative error, and each of these parameters is related
to the standard measure of relative error, the coefficient of variation. Calculation of the coefficient
of variation can be undertaken in a variety of ways; however, only one produces unbiased estimates
of measurement error: the root mean square coefficient of variation calculated from the individual
coefficients of variation.
Thompson and Howarth’s error analysis approach has also been used to describe measurement error. However, because this approach utilizes a regression line to describe error, it provides a substantially different measure of error than the root mean square coefficient of variation. Furthermore, because regression is used, Thompson and Howarth’s results should only be used for estimating error in individual samples, and not for describing the average error in a data set. As a result, Thompson and Howarth’s results should not be used to determine the magnitudes of component errors introduced during geochemical sampling, preparation, and analysis.
Finally, the standard error on the coefficient of variation is derived, and it is shown that very poor estimates of relative error are obtained from duplicate data. As a result, geoscientists seeking to determine the average relative error in a data set should use a very large number of duplicate samples to make this estimate, particularly if the average relative error is large.
© 2007 Canadian Institute of Mining, Metallurgy and Petroleum. All rights reserved.
Key Words: Relative error, root mean square, coefficient of variation, quality control, Thompson-Howarth, error analysis, geochemistry.
1 Department of Earth and Environmental Science, Acadia University, Wolfville, Nova Scotia, B4P 2R6; e-mail: cliff.stanley@acadiau.ca
2 ioGlobal, Level 3, IBM Building, 1060 Hay Street, West Perth, Western Australia, 6005, Australia; e-mail: dave.lawie@ioglobal.net
266 Exploration and Mining Geology, Vol. 16, Nos. 3–4, p. 265–274, 2007
Introduction

Assessments of measurement error in geochemical applications, including errors introduced during sampling, sample preparation, or geochemical analysis/assaying, have historically been undertaken using two approaches, both of which have employed relatively large sets of n duplicate samples to estimate the magnitude of measurement error. Thompson-Howarth error analysis involves regression of a proxy for the duplicate standard deviations against the duplicate means to obtain a function that estimates the magnitude of measurement error across a range of concentrations (Thompson, 1973, 1982; Thompson and Howarth, 1973, 1976, 1978; Howarth and Thompson, 1976; Fletcher, 1981; Stanley and Sinclair, 1986; Garrett and Grunsky, 2003; Stanley, 2003a,b). Although this form of analysis can involve plotting both absolute and relative measurement errors (actually, it plots a proxy for the standard deviation or relative deviation, the absolute or relative difference of pairs; Stanley, 2006a) against concentration, the regression specifically considers absolute measurement errors (the absolute differences). Thus, this approach strives to assess the absolute measurement error in a set of geochemical determinations. Numerous papers, including those listed above, have described the calculation and use of Thompson-Howarth error analysis in geochemical applications, so this technique is not considered further herein.

The alternative technique for assessing measurement error is not as simple. It involves calculating the average relative error directly from a set of n duplicate samples, and thus differs from Thompson-Howarth error analysis in that it assesses relative measurement error instead of absolute measurement error. Unfortunately, there are many measures that one can use to describe the average relative measurement error (Shaw, 1997; Long, 1998; Stanley, 1999). Below, we describe these different measures, demonstrate how they are related to each other, and discuss the advantages and disadvantages of each.

Average Relative Error

Over the years, several methods have been developed to calculate the average relative error described by a set of n duplicate samples. Each of these methods involves the calculation of slightly different statistics (Shaw, 1997; Long, 1998; Stanley, 1999). Because these measurement error estimates are calculated using different formulae, significant confusion exists amongst geoscientists, both in mining circles and the academic/research community, regarding what each of these individual measures of average relative error describes, and how they should be used.

Table 1 presents a list of average relative error measures commonly used in geochemical applications. These measures include: (i) the coefficient of variation, CV; (ii) the relative precision, RP; (iii) the relative variance, RV; (iv) the absolute relative difference, ARD; and (v) the half absolute relative difference, HARD (sometimes also known as the percent absolute relative difference, PARD; Shaw, 1997; Long, 1998; Stanley, 1999). Table 1 identifies the common name of the statistic, the formula used to calculate the statistic for a single duplicate pair and for the average of n duplicate pairs, and how this statistic is related to the coefficient of variation. Note that relative variances, and not relative standard deviations, are additive, so calculation of the average relative error from n estimates of this error using the conventional formula for the mean:

$$\mu_{CV} = \sum_{i=1}^{n} \frac{CV_i}{n}, \qquad (1)$$

where the $CV_i$ are the relative error estimates from each duplicate pair, will significantly underestimate the true average relative error.

The coefficient of variation represents the preferred measure of relative error in geochemical applications, and there are a number of reasons for this. Firstly, the coefficient of variation is the de jure universal measure of relative error used by statisticians in virtually every scientific endeavor, precisely because it is equal to the standard deviation (σ) divided by the mean (µ), the two most common statistics used to summarize the characteristics of frequency distributions.

The conventional standard deviation formula:

$$\sigma = \sqrt{\frac{1}{p-1} \sum_{i=1}^{p} \left(x_i - \bar{x}\right)^2}, \qquad (2)$$

when applied to duplicate pairs (p = 2), simplifies to a formula that is a function of the absolute difference between pairs ($\sigma = |x_1 - x_2| / \sqrt{2}$; Stanley, 2006a). Note that this equation forms the basis for use of the absolute difference as a proxy for the duplicate standard deviation in Thompson-Howarth error analysis (Stanley, 2006a). As a result, the other statistics (RP, RV, ARD, and HARD) presented in Table 1, with the exception of the relative variance (RV, which is merely the square of the coefficient of variation), are directly proportional to the coefficient of variation. Consequently, these other statistics offer no more information than the coefficient of variation, and their use thus provides no additional advantage (and in fact has resulted in significant confusion).

Development of some of these measures of relative error dates back to the middle of the 20th century (Shaw, 1997; Long, 1998; Stanley, 1999), when computational power was limited to primitive calculators that lacked square-root functions, finicky mainframe computers, slide rules, and “pencil and paper” calculations. Thus, the original use of these simple formulae, instead of the conventional standard deviation formula used in the calculation of the coefficient of variation (Equation 2), was both convenient and time saving. However, with the ready availability of today’s computational power, speed and convenience are not a concern, and standard deviation functions are implemented in most statistical software packages. Therefore, calculation of the actual standard deviation using
Average Relative Error in Geochemical Determinations • C.R. Stanley and D. Lawie 267
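Table 1. List of Common Measures of Relative Error Calculated from Duplicate Pairs of Measurements and Used in a Variety of Geological Applications

| Measurement | Conceptual Formula | Single Duplicate Pair Formula | Average Formula for Several Duplicate Pairs | Relationship with CV |
| Coefficient of Variation (CV) | $CV = \frac{\sigma}{\mu}$ | $CV = \frac{2\lvert x_1 - x_2\rvert}{\sqrt{2}\,(x_1 + x_2)}$ | $CV = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{2\lvert x_{1i} - x_{2i}\rvert}{\sqrt{2}\,(x_{1i} + x_{2i})}\right)^2}$ | $CV$ |
| Relative Precision (RP) | $RP = \frac{2\sigma}{\mu}$ | $RP = \frac{4\lvert x_1 - x_2\rvert}{\sqrt{2}\,(x_1 + x_2)}$ | $RP = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{4\lvert x_{1i} - x_{2i}\rvert}{\sqrt{2}\,(x_{1i} + x_{2i})}\right)^2}$ | $2 \times CV$ |
| Relative Variance (RV) | $RV = \frac{\sigma^2}{\mu^2}$ | $RV = \frac{2\,(x_1 - x_2)^2}{(x_1 + x_2)^2}$ | $RV = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{2\,(x_{1i} - x_{2i})^2}{(x_{1i} + x_{2i})^2}\right)$ | $CV^2$ |
| Absolute Relative Difference (ARD) | $ARD = \frac{\lvert x_1 - x_2\rvert}{\mu}$ | $ARD = \frac{2\lvert x_1 - x_2\rvert}{x_1 + x_2}$ | $ARD = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{2\lvert x_{1i} - x_{2i}\rvert}{x_{1i} + x_{2i}}\right)^2}$ | $\sqrt{2} \times CV$ |
| Half Absolute Relative Difference (HARD) | $HARD = \frac{1}{2}\,\frac{\lvert x_1 - x_2\rvert}{\mu}$ | $HARD = \frac{\lvert x_1 - x_2\rvert}{x_1 + x_2}$ | $HARD = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{\lvert x_{1i} - x_{2i}\rvert}{x_{1i} + x_{2i}}\right)^2}$ | $\frac{\sqrt{2}}{2} \times CV$ |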
Notes
x1, x2 = duplicate pair results; µ = duplicate mean; σ = duplicate standard deviation; CV = coefficient of variation (σ/µ; standard deviation/mean); n = number of pairs of duplicate samples (referenced with index i).
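The relationships listed in Table 1 can be checked numerically. The following Python sketch is illustrative only: the function names and the four duplicate pairs are hypothetical and are not taken from any data set discussed in this paper. It computes the coefficient of variation for each pair, compares the arithmetic mean (Equation 1) with the root mean square average, and derives the other Table 1 statistics from the RMS coefficient of variation.

```python
import math
import statistics


def replicate_cv(values):
    """CV of one replicate set: sample standard deviation (Equation 2) over the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def pair_cv(x1, x2):
    """Duplicate shortcut: sigma = |x1 - x2| / sqrt(2) and mu = (x1 + x2) / 2."""
    return math.sqrt(2.0) * abs(x1 - x2) / (x1 + x2)


def mean_cv(cvs):
    """Arithmetic mean of the individual CVs (Equation 1)."""
    return sum(cvs) / len(cvs)


def rms_cv(cvs):
    """Root mean square of the individual CVs."""
    return math.sqrt(sum(cv * cv for cv in cvs) / len(cvs))


def table1_from_cv(cv):
    """Other Table 1 statistics expressed through their stated relationship with CV."""
    return {
        "CV": cv,
        "RP": 2.0 * cv,
        "RV": cv ** 2,
        "ARD": math.sqrt(2.0) * cv,
        "HARD": math.sqrt(2.0) / 2.0 * cv,
    }


# Hypothetical duplicate assays (made-up values, for illustration only).
pairs = [(10.0, 12.0), (5.0, 5.5), (20.0, 18.0), (1.0, 1.4)]
cvs = [pair_cv(a, b) for a, b in pairs]

print(f"mean CV = {mean_cv(cvs):.4f}")
print(f"RMS CV  = {rms_cv(cvs):.4f}")
for name, value in table1_from_cv(rms_cv(cvs)).items():
    print(f"{name:>4} = {value:.4f}")
```

Because `replicate_cv` relies only on the conventional standard deviation, it accepts triplicates or larger replicate sets as well as duplicates, whereas `pair_cv` applies to duplicates only. The arithmetic mean can never exceed the root mean square, which is why Equation 1 yields the lower of the two averages.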
the conventional formula (Equation 2) does not present a computational impediment to geoscientists. Furthermore, although some geochemists believe that the complexity associated with calculation of the standard deviation, at least relative to some of these simpler but unconventional measures of relative error, may lead to errors in calculation of the average relative measurement error, we believe that the concept and calculation of a standard deviation is not beyond the intellectual capacity of geoscientists. Thus, because the coefficient of variation is the standard measure of relative error, it should be used as such in all geochemical applications.

Recent insights (Stanley, 2006a) regarding the limitations of duplicates in representing the full range of possible relative errors in samples indicate that in some cases, duplicate samples will underestimate relative error, as calculated using any and all of the formulae presented in Table 1. Stanley (2006a) illustrates that, when large average relative errors exist in measurements (such as >100%, as observed in some gold deposits; e.g., Francois-Bongarcon, 2005; Stanley and Smee, 2005a,b, ????), the very large relative error magnitudes that exist in some pairs may be truncated to the maximum attainable relative error for duplicates ($\sqrt{2}$, ≈ 141%). Hence, to estimate the magnitude of large relative errors without bias, more than two replicate samples (e.g., triplicates, quadruplicates, quintuplicates, etc.) must be analyzed. If such replicates are employed to estimate relative error, none of the absolute difference formulae presented in Table 1 can be used because all involve only duplicate pairs of measurements. However, the conventional standard deviation formula (Equation 2) can still be employed, and the coefficient of variation statistic (as well as the relative precision and relative variance statistics) can still be calculated. Thus, a second very important reason, that of the universal applicability to scenarios involving replicate measurements, exists for using the coefficient of variation as a standard measure of relative error.

The coefficient of variation is further preferred over the relative precision and relative variance statistics, which can also be calculated for replicate samples. Although the relative variance is additive (Francois-Bongarcon, 2005; Stanley and Smee, 2005a,b, ????), and thus exhibits certain advantages over the coefficient of variation, it describes measurement error in units of squared concentrations. Thus, it cannot be multiplied directly by a concentration to determine the absolute measurement error at that concentration. This makes the relative variance more complicated than necessary.

Furthermore, the relative precision is defined simply as twice the coefficient of variation. This factor of two is derived from inferential statistics applied to the normal distribution and an arbitrary Type 1 error (α) of 5% (corresponding to confidence limits equal to the mean ± two standard deviations; i.e., being correct 95% of the time, or 19 times out of 20). Thus, the choice of a factor of two is essentially arbitrary, and is a function of the confidence a geoscientist wishes to have in any conclusions to be drawn. Unfortunately, different levels of confidence are conventionally employed for different geochemical applications. For example, the determination of an analytical detection limit is commonly arrived at using a calculation involving three standard deviations (and thus where α = 1%), but
exhibit an average coefficient of variation of 10%, then the coefficient of variation describing the magnitude of error introduced between initial sampling and subsampling after crushing (i.e., sampling error) is 24% ($= \sqrt{0.26^2 - 0.10^2}$). Use of relative errors in this application is appropriate because only descriptive measures of relative error are required for the decomposition of errors introduced during the sampling/preparation/analysis protocol.

Relative errors can also be appropriately used to assess whether the data meet a required level of analytical quality; the RMS coefficient of variation can be used for this purpose because it is a descriptive measure of analytical quality. For example, if 10% relative error is necessary to ensure the data collected are “fit for purpose” (Bettenay and Stanley, 2001), and a replicate data set exhibits 12% relative error, then one can conclude that this data set does not meet the level of measurement precision required for its intended purpose.

Although the slope and intercept derived from a Thompson-Howarth error analysis describe a functional relationship between concentration and measurement error, and thus can be used to predict the expected error in samples of known concentration, these parameters can also be used in the variance decomposition or “fit for purpose” applications described above. This is because, even though the parameters derived from a Thompson-Howarth error analysis are fundamentally and philosophically different from the structural coefficient of variation, they are derived from a regression, and thus can also be used to establish structural relationships in addition to functional relationships (Kendall and Stuart, 1966). Consequently, the parameters of a Thompson-Howarth error model can be used in both structural and functional relationship contexts, for prediction and description of the magnitude of measurement error.

The Importance of Large n

The quality control duplicate data presented in Figures 1 and 2 exhibit significant scatter, and thus are similar to most duplicate quality control data observed in mineral exploration and mine assay data sets. The large scatter is a consequence of the very poor estimates of mean and standard deviation obtained from duplicate data. The magnitudes of these estimation errors are defined by the standard errors on the duplicate means and standard deviations, and these are inverse functions of the square root of p, the number of replicates under consideration (in this case, p = 2; the relevant standard error formulae are presented in the Appendix). Because of the large scatter that occurs in duplicate quality control data sets, it is worth examining how this scatter affects the estimate of the average relative error.

A derivation of the standard error on the average relative error (RMS coefficient of variation) of n replicate sets (effectively the estimation error of the average relative error; Stanley, 2006a; Stanley and Lawie, ????) is presented in the Appendix. The standard error on the relative error (presented in relative terms) derived from Equation A7 in the Appendix is plotted as a function of the average relative error for different numbers of replicate determinations (p) in Figure 3. This illustrates that the standard error on the relative error is large for small numbers of replicate determinations. For example, for a single estimate of relative error of 15% determined from duplicates, the standard error on this estimate is 11% (a relative error of 73%; from Equation A7; Fig. 3).

Analogously, Equation A11 can be used to determine the standard error on the average relative error of n replicate sets. For the duplicate data set presented in Figures 1 and 2, this standard error is 3.65% (on the estimate of 12.59%), and thus corresponds to a relative error of 29%. This relative error is large, in this case because n is relatively small (100). Thus, in order to obtain reasonably stable estimates of average relative errors, larger numbers of replicate sets (n) are commonly required. The poor precision exhibited by average relative errors is a consequence of the large scatter observed on plots of replicate mean versus replicate standard deviation (e.g., Fig. 1, 2), a feature resulting from the fact that duplicates provide poor estimates of both means and standard deviations.

The functional form of Equation A11 also indicates that larger average relative errors will exhibit larger standard errors. Thus, a larger number of replicate determinations (p) will be required to obtain relatively accurate estimates of the average relative error, and the opposite will be true for small average relative errors. Consequently, the average relative error in a set of base metal assays, which typically exhibit relatively small average relative errors (such as 5%), can typically be determined reliably using smaller numbers of replicate sets (500 < n < 1000) than the average relative error in a set of gold assays, which typically exhibit relatively large average relative errors (such as 25%; Francois-Bongarcon, 2005; Stanley and Smee, 2005a,b, ????), which will require larger numbers of replicates (such as n > 2500).

Fig. 3. Scatter plot of the standard error on the relative error (CV %; Equation A7) plotted against the relative error for different numbers of replicates (p).
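The standard error quoted above for a 15% relative error estimated from duplicates can be reproduced from Equation A7. The following Python sketch is a minimal illustration, assuming Equation A7 has the form $\sigma_{CV} = CV\sqrt{CV^2/p + 1/(2(p-1))}$; the function name `cv_standard_error` and the choice of p values are hypothetical.

```python
import math


def cv_standard_error(cv, p):
    """Standard error on a CV estimated from p replicates (assumed form of Equation A7)."""
    if p < 2:
        raise ValueError("at least two replicates are needed to estimate a CV")
    return cv * math.sqrt(cv ** 2 / p + 1.0 / (2.0 * (p - 1)))


cv = 0.15  # the 15% relative error used as the example in the text
for p in (2, 3, 4, 10):
    se = cv_standard_error(cv, p)
    # Report the standard error in absolute terms and relative to the CV itself.
    print(f"p = {p:2d}: standard error = {se:.4f} ({se / cv:.0%} of the estimate)")
```

For p = 2 the result is about 0.107, which rounds to the 11% quoted in the text. Increasing p shrinks the standard error, mainly through the 1/(2(p - 1)) term.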
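grams and Abstracts, p. 77–78.
Stanley, C.R., and Smee, B.W., 2005a, Sample preparation of ‘nuggety samples’: Dispelling some myths about sample size and sampling errors: EXPLORE, Newsletter
Thompson, M., 1982, Regression methods and the comparison of accuracy: The Analyst, v. 107, p. 1169–1180.
Thompson, M., and Howarth, R.J., 1973, The rapid estimation and control of precision by duplicate determinations: The Analyst, v. 98, p. 153–160.
Thompson, M., and Howarth, R.J., 1976, Duplicate analysis in practice: Part 1. Theoretical approach and estimation of analytical reproducibility: The Analyst, v. 101, p. 690–698.
Thompson, M., and Howarth, R.J., 1978, A new approach to the estimation of analytical precision: Journal of Geochemical Exploration, v. 9, p. 23–30.

Appendix

The following derivations determine the formulae of the standard error on the average coefficient of variation calculated from one set of p replicates, and from n sets of p replicates.

The standard errors on the mean and standard deviation, for p replicates, are:

$$\sigma_\mu = \frac{\sigma}{\sqrt{p}} \qquad (A1)$$

and

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2(p-1)}}. \qquad (A2)$$

Errors in the variables $x_i$ propagate to a function y of those variables according to:

$$\sigma_y^2 = \sum_{i=1}^{p} \left(\frac{\partial y}{\partial x_i}\right)^2 \sigma_{x_i}^2 + 2 \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} \left(\frac{\partial y}{\partial x_i}\right)\left(\frac{\partial y}{\partial x_j}\right) \sigma_{x_i x_j}, \qquad (A3)$$

derived using a Taylor series expansion about the mean (Meyer, 1975; Stanley, 1990). In this application, Equation A3 becomes:

$$\sigma_{CV}^2 = \left(\frac{\partial CV}{\partial \mu}\right)^2 \sigma_\mu^2 + \left(\frac{\partial CV}{\partial \sigma}\right)^2 \sigma_\sigma^2 + 2 \left(\frac{\partial CV}{\partial \mu}\right)\left(\frac{\partial CV}{\partial \sigma}\right) \sigma_{\mu\sigma}. \qquad (A4)$$

Substituting the partial derivatives of $CV = \sigma/\mu$ and the standard errors from Equations A1 and A2 gives:

$$\sigma_{CV}^2 = \left(\frac{-\sigma}{\mu^2}\right)^2 \left(\frac{\sigma^2}{p}\right) + \left(\frac{1}{\mu}\right)^2 \left(\frac{\sigma^2}{2(p-1)}\right) + 2 \left(\frac{-\sigma}{\mu^2}\right)\left(\frac{1}{\mu}\right) \sigma_{\mu\sigma}, \qquad (A5)$$

which, with the covariance between the mean and standard deviation taken as zero, simplifies to:

$$\sigma_{CV}^2 = CV^2 \left(\frac{CV^2}{p} + \frac{1}{2(p-1)}\right), \qquad (A6)$$

or:

$$\sigma_{CV} = CV \sqrt{\frac{CV^2}{p} + \frac{1}{2(p-1)}}. \qquad (A7)$$

Now, if we have several (i = 1 … n) estimates of the relative error from several sets of replicates, we can propagate the standard errors on these relative errors through the calculation of the average relative error to obtain an error on that estimate. Because errors are additive as variances, the average relative error must be calculated using a root mean square approach:

$$\mu_{CV} = \sqrt{\frac{\sum_{i=1}^{n} CV_i^2}{n}}. \qquad (A8)$$

The partial derivative of Equation A8 with respect to each $CV_i$ is:

$$\frac{\partial \mu_{CV}}{\partial CV_i} = \frac{CV_i}{n} \left(\frac{1}{\mu_{CV}}\right).$$

Thus, the standard error on the average relative error is: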