
Average Relative Error in Geochemical Determinations: Clarification, Calculation, and a Plea for Consistency

Article  in  Exploration and Mining Geology · July 2007


DOI: 10.2113/gsemg.16.3-4.267



Exploration and Mining Geology, Vol. 16, Nos. 3–4, p. 265–274, 2007
© 2007 Canadian Institute of Mining, Metallurgy and Petroleum.
All rights reserved. Printed in Canada.
0964-1823/00 $17.00 + .00


C.R. Stanley1 and D. Lawie2


(Received July 14, 2006; accepted May 17, 2007)

Abstract — The measurement of error in assays collected as part of a mineral exploration program or mining operation historically has been undertaken in a variety of ways. Different parameters have been used to describe the magnitude of relative error, and each of these parameters is related to the standard measure of relative error, the coefficient of variation. Calculation of the coefficient of variation can be undertaken in a variety of ways; however, only one produces unbiased estimates of measurement error: the root mean square coefficient of variation calculated from the individual coefficients of variation.

Thompson and Howarth’s error analysis approach has also been used to describe measurement error. However, because this approach utilizes a regression line to describe error, it provides a substantially different measure of error than the root mean square coefficient of variation. Furthermore, because regression is used, Thompson and Howarth’s results should only be used for estimating error in individual samples, and not for describing the average error in a data set. As a result, Thompson and Howarth’s results should not be used to determine the magnitudes of component errors introduced during geochemical sampling, preparation, and analysis.

Finally, the standard error on the coefficient of variation is derived, and it is shown that very poor estimates of relative error are obtained from duplicate data. As a result, geoscientists seeking to determine the average relative error in a data set should use a very large number of duplicate samples to make this estimate, particularly if the average relative error is large.

Key Words: Relative error, root mean square, coefficient of variation, quality control, Thompson-Howarth, error analysis, geochemistry.

Summary — The measurement error of analyses from a mineral exploration program or a mining operation has historically been determined in several ways. Various parameters have been used to describe the magnitude of relative error, and each of these parameters reflects the standard measure of relative error, the coefficient of variation. Several methods exist for calculating the coefficient of variation, but only one gives an unbiased estimate of measurement error: the root mean square coefficient of variation calculated from the individual coefficients of variation.

Thompson and Howarth’s error analysis approach has also been used to describe measurement errors. However, because this approach relies on a regression line to describe error, it provides a significantly different measure of error than the root mean square coefficient of variation. Moreover, because of the regression, Thompson and Howarth’s results should be used only to estimate the error of individual samples, and not to describe the ‘average’ error of a data set. For these reasons, Thompson and Howarth’s results should not be used to determine the magnitude of the error components introduced during geochemical sampling, preparation, and analysis.

Finally, the standard error on the coefficient of variation is derived, and it is shown that duplicate data provide a very poor estimate of relative error. Consequently, geoscientists who wish to determine the average relative error of a data set should use a very large number of duplicates to make this estimate, particularly if the average relative error is large.

1 Department of Earth and Environmental Science, Acadia University, Wolfville, Nova Scotia, B4P 2R6; e-mail: cliff.stanley@acadiau.ca
2 ioGlobal, Level 3, IBM Building, 1060 Hay Street, West Perth, Western Australia, 6005, Australia; e-mail: dave.lawie@ioglobal.net

Introduction

Assessments of measurement error in geochemical applications, including errors introduced during sampling, sample preparation, or geochemical analysis/assaying, have historically been undertaken using two approaches, both of which have employed relatively large sets of n duplicate samples to estimate the magnitude of measurement error. Thompson-Howarth error analysis involves regression of a proxy for the duplicate standard deviations against the duplicate means to obtain a function that estimates the magnitude of measurement error across a range of concentrations (Thompson, 1973, 1982; Thompson and Howarth, 1973, 1976, 1978; Howarth and Thompson, 1976; Fletcher, 1981; Stanley and Sinclair, 1986; Garrett and Grunsky, 2003; Stanley, 2003a,b). Although this form of analysis can involve plotting both absolute and relative measurement errors (actually, it plots a proxy for the standard deviation or relative deviation, the absolute or relative difference of pairs; Stanley, 2006a) against concentration, the regression specifically considers absolute measurement errors (the absolute differences). Thus, this approach strives to assess the absolute measurement error in a set of geochemical determinations. Numerous papers, including those listed above, have described the calculation and use of Thompson-Howarth error analysis in geochemical applications, so this technique is not considered further herein.

The alternative technique for assessing measurement error is not as simple. It involves calculating the average relative error directly from a set of n duplicate samples, and thus differs from Thompson-Howarth error analysis in that it assesses relative measurement error instead of absolute measurement error. Unfortunately, there are many measures that one can use to describe the average relative measurement error (Shaw, 1997; Long, 1998; Stanley, 1999). Below, we describe these different measures, demonstrate how they are related to each other, and discuss the advantages and disadvantages of each.

Average Relative Error

Over the years, several methods have been developed to calculate the average relative error described by a set of n duplicate samples. Each of these methods involves the calculation of slightly different statistics (Shaw, 1997; Long, 1998; Stanley, 1999). Because these measurement error estimates are calculated using different formulae, significant confusion exists amongst geoscientists, both in mining circles and the academic/research community, regarding what each of these individual measures of average relative error describes, and how they should be used.

Table 1 presents a list of average relative error measures commonly used in geochemical applications. These measures include: (i) the coefficient of variation, CV; (ii) the relative precision, RP; (iii) the relative variance, RV; (iv) the absolute relative difference, ARD; and (v) the half absolute relative difference, HARD (sometimes also known as the percent absolute relative difference, PARD; Shaw, 1997; Long, 1998; Stanley, 1999). Table 1 identifies the common name of the statistic, the formula used to calculate the statistic for a single duplicate pair and for the average of n duplicate pairs, and how this statistic is related to the coefficient of variation. The average formulae in Table 1 are root mean squares rather than arithmetic means. This is because relative variances, and not relative standard deviations, are additive, and calculation of the average relative error from n estimates of this error using the conventional formula for the mean:

$$\mu_{CV} = \sum_{i=1}^{n} \frac{CV_i}{n}, \qquad (1)$$

where the $CV_i$ are the relative error estimates from each duplicate pair, will significantly underestimate the true average relative error.

Table 1. List of common measures of relative error calculated from duplicate pairs of measurements and used in a variety of geological applications.

| Measurement | Conceptual formula | Single duplicate pair formula | Average formula for several duplicate pairs | Relationship with CV |
|---|---|---|---|---|
| Coefficient of variation (CV) | $CV = \frac{\sigma}{\mu}$ | $CV = \frac{2 \lvert x_1 - x_2 \rvert}{\sqrt{2}\,(x_1 + x_2)}$ | $CV = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{2 \lvert x_{1i} - x_{2i} \rvert}{\sqrt{2}\,(x_{1i} + x_{2i})} \right)^2}$ | $CV$ |
| Relative precision (RP) | $RP = \frac{2\sigma}{\mu}$ | $RP = \frac{4 \lvert x_1 - x_2 \rvert}{\sqrt{2}\,(x_1 + x_2)}$ | $RP = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{4 \lvert x_{1i} - x_{2i} \rvert}{\sqrt{2}\,(x_{1i} + x_{2i})} \right)^2}$ | $2 \times CV$ |
| Relative variance (RV) | $RV = \frac{\sigma^2}{\mu^2}$ | $RV = \frac{2 (x_1 - x_2)^2}{(x_1 + x_2)^2}$ | $RV = \frac{1}{n} \sum_{i=1}^{n} \frac{2 (x_{1i} - x_{2i})^2}{(x_{1i} + x_{2i})^2}$ | $CV^2$ |
| Absolute relative difference (ARD) | $ARD = \frac{\lvert x_1 - x_2 \rvert}{\mu}$ | $ARD = \frac{2 \lvert x_1 - x_2 \rvert}{x_1 + x_2}$ | $ARD = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{2 \lvert x_{1i} - x_{2i} \rvert}{x_{1i} + x_{2i}} \right)^2}$ | $\sqrt{2} \times CV$ |
| Half absolute relative difference (HARD) | $HARD = \frac{1}{2} \frac{\lvert x_1 - x_2 \rvert}{\mu}$ | $HARD = \frac{\lvert x_1 - x_2 \rvert}{x_1 + x_2}$ | $HARD = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{\lvert x_{1i} - x_{2i} \rvert}{x_{1i} + x_{2i}} \right)^2}$ | $\frac{\sqrt{2}}{2} \times CV$ |

Notes: x1, x2 = duplicate pair results; µ = duplicate mean; σ = duplicate standard deviation; CV = coefficient of variation (σ/µ; standard deviation/mean); n = number of pairs of duplicate samples (referenced with index i).

The coefficient of variation represents the preferred measure of relative error in geochemical applications, and there are a number of reasons for this. Firstly, the coefficient of variation is the de jure universal measure of relative error used by statisticians in virtually every scientific endeavor, precisely because it is equal to the standard deviation (σ) divided by the mean (µ), the two most common statistics used to summarize the characteristics of frequency distributions.

The conventional standard deviation formula:

$$\sigma = \sqrt{\frac{1}{p-1} \sum_{i=1}^{p} (x_i - \bar{x})^2}, \qquad (2)$$

when applied to duplicate pairs (p = 2), simplifies to a formula that is a function of the absolute difference between pairs ($\sigma = \lvert x_1 - x_2 \rvert / \sqrt{2}$; Stanley, 2006a). Note that this equation forms the basis for use of the absolute difference as a proxy for the duplicate standard deviation in Thompson-Howarth error analysis (Stanley, 2006a). As a result, the other statistics (RP, RV, ARD, and HARD) presented in Table 1, with the exception of the relative variance (RV, which is merely the square of the coefficient of variation), are directly proportional to the coefficient of variation. Consequently, these other statistics offer no more information than the coefficient of variation, and their use thus provides no additional advantage (and in fact has resulted in significant confusion).

Development of some of these measures of relative error dates back to the middle of the 20th century (Shaw, 1997; Long, 1998; Stanley, 1999), when computational power was limited to primitive calculators that lacked square-root functions, finicky mainframe computers, slide rules, and “pencil and paper” calculations. Thus, the original use of these simple formulae, instead of the conventional standard deviation formula used in the calculation of the coefficient of variation (Equation 2), was both convenient and time saving. However, with the ready availability of today’s computational power, speed and convenience are not a concern, and standard deviation functions are implemented in most statistical software packages. Therefore, calculation of the actual standard deviation using the conventional formula (Equation 2) does not present a computational impediment to geoscientists. Furthermore, although some geochemists believe that the complexity associated with calculation of the standard deviation, at least relative to some of these simpler but unconventional measures of relative error, may lead to errors in calculation of the average relative measurement error, we believe that the concept and calculation of a standard deviation is not beyond the intellectual capacity of geoscientists. Thus, because the coefficient of variation is the standard measure of relative error, it should be used as such in all geochemical applications.

Recent insights (Stanley, 2006a) regarding the limitations of duplicates in representing the full range of possible relative errors in samples indicate that in some cases, duplicate samples will underestimate relative error, as calculated using any and all of the formulae presented in Table 1. Stanley (2006a) illustrates that, when large average relative errors exist in measurements (such as >100%, as observed in some gold deposits; e.g., Francois-Bongarcon, 2005; Stanley and Smee, 2005a,b, ????), the very large relative error magnitudes that exist in some pairs may be truncated to the maximum attainable relative error for duplicates (√2, ≈ 141%). Hence, to estimate the magnitude of large relative errors without bias, more than two replicate samples (e.g., triplicates, quadruplicates, quintuplicates, etc.) must be analyzed. If such replicates are employed to estimate relative error, none of the absolute difference formulae presented in Table 1 can be used because all involve only duplicate pairs of measurements. However, the conventional standard deviation formula (Equation 2) can still be employed, and the coefficient of variation statistic (as well as the relative precision and relative variance statistics) can still be calculated. Thus, a second very important reason, that of the universal applicability to scenarios involving replicate measurements, exists for using the coefficient of variation as a standard measure of relative error.

The coefficient of variation is further preferred over the relative precision and relative variance statistics, which can also be calculated for replicate samples. Although the relative variance is additive (Francois-Bongarcon, 2005; Stanley and Smee, 2005a,b, ????), and thus exhibits certain advantages over the coefficient of variation, it describes measurement error in units of squared concentrations. Thus, it cannot be multiplied directly by a concentration to determine the absolute measurement error at that concentration. This makes the relative variance more complicated than necessary.

Furthermore, the relative precision is defined simply as twice the coefficient of variation. This factor of two is derived from inferential statistics applied to the normal distribution and an arbitrary Type 1 error (α) of 5% (corresponding to confidence limits equal to the mean ± two standard deviations; i.e., being correct 95% of the time, or 19 times out of 20). Thus, the choice of a factor of two is essentially arbitrary, and is a function of the confidence a geoscientist wishes to have in any conclusions to be drawn. Unfortunately, different levels of confidence are conventionally employed for different geochemical applications. For example, the determination of an analytical detection limit is commonly arrived at using a calculation involving three standard deviations (and thus where α = 1%), but other inferential decisions typically employ confidence limits associated with α = 5%. Because inference analysis of relative error magnitudes is not necessarily a goal for which the assessment of relative error is undertaken, use of the simpler coefficient of variation statistic is preferred.

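
As a rough illustration of the calculations discussed above, the following Python sketch computes the coefficient of variation of each replicate set with the conventional standard deviation formula (Equation 2), forms the RMS average, and derives the proportional statistics of Table 1 from it. The function names, the use of NumPy, and the example values are assumptions made here for illustration only; they are not part of the original paper, and the sketch assumes non-negative concentrations.

```python
import numpy as np

def replicate_cv(replicates):
    """CV of each replicate set (one set per row), using Equation 2 (p - 1 divisor)."""
    r = np.asarray(replicates, dtype=float)
    return r.std(axis=1, ddof=1) / r.mean(axis=1)

def duplicate_cv(x1, x2):
    """Table 1 single-pair shortcut: sqrt(2) * |x1 - x2| / (x1 + x2)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.sqrt(2.0) * np.abs(x1 - x2) / (x1 + x2)

def average_statistics(cv):
    """Average Table 1 statistics derived from the per-set CVs."""
    cv = np.asarray(cv, dtype=float)
    rms_cv = float(np.sqrt(np.mean(cv ** 2)))
    return {
        "CV": rms_cv,                           # RMS coefficient of variation
        "RP": 2.0 * rms_cv,                     # relative precision
        "RV": rms_cv ** 2,                      # relative variance (mean of squares)
        "ARD": float(np.sqrt(2.0) * rms_cv),    # absolute relative difference
        "HARD": float(rms_cv / np.sqrt(2.0)),   # half absolute relative difference
        "arithmetic_mean_CV": float(np.mean(cv)),  # Equation 1, for comparison only
    }

# Hypothetical duplicate assays (made-up values, not data from the paper).
pairs = np.array([[10.2, 9.6], [55.0, 61.3], [31.4, 30.8], [8.9, 11.2]])
cv_pairs = replicate_cv(pairs)
stats = average_statistics(cv_pairs)

# A duplicate with one zero result sits at the sqrt(2) ceiling for pairs,
# whereas a triplicate with the same kind of extreme result can exceed it.
ceiling_pair = replicate_cv([[0.0, 4.0]])[0]
beyond_ceiling = replicate_cv([[0.0, 0.0, 6.0]])[0]
```

For duplicates the general formula and the Table 1 shortcut agree, and the arithmetic mean of the per-pair values can never exceed their root mean square, which is the direction of the bias described for Equation 1.
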
Finally, the concept of precision is
unfortunately counterintuitive. For ex-
ample, if data exhibit higher precision
(a desirable trait), the measure of pre-
cision is lower. This apparently con-
tradictory feature of precision can lead
to significant confusion. In contrast,
although the concept of relative error
is a negative trait (more error is not
desirable), if data exhibit higher rela- Fig. 1. Scatter plot of standard deviation plotted against mean concentration of Fe2O3 (wt.%) in
tive error, the measure of relative error duplicate analyses from a banded iron formation Fe ore deposit, illustrating the different absolute
and relative error results obtained by the various calculations described in the text. The incorrectly
(the coefficient of variation) is higher determined average standard deviation (0.97 wt.%) is approximately 2/3 of the correctly deter-
(i.e., more nondesirable). Thus, unlike mined RMS standard deviation (1.49 wt.%). Sloping lines show the average duplicate coefficient
precision, the concept of relative error of variation (dashed line), the RMS duplicate coefficient of variation (solid line), the average
is consistent with its measurement, and duplicate standard deviation divided by the average duplicate mean (dotted/dashed line), and the
RMS duplicate standard deviation divided by the average duplicate mean (dotted line). The aver-
thus is an intuitively easier concept to age CV (6.84%) is approximately half the RMS CV (12.59%), and the average standard devia-
understand and apply in error analysis. tion/average mean (4.61%) is approximately 2/3 of the RMS standard deviation/average mean
We, therefore, believe that the coeffi- (7.11%). m = slope.
cient of variation should be used as a
universal measurement of relative error in all geochemical ore deposit. This example demonstrates the important dif-
applications. ference between the average and RMS standard deviations,
Calculation of Relative Error and illustrates how use of the average standard deviation
underestimates (biases, in this case, by 35%) the error
With the exception of the relative variance, the formu- estimate (Stanley, 2006a; Stanley and Lawie, ????). The
lae in Table 1 do not use the conventional formula for the four sloping lines that pass through the origin of Figure 1
mean to calculate the average relative error measure. This correspond to the relative errors determined by the four
is because most of these measures use the standard devia- possible calculation strategies: (i) average duplicate co-
tion (or a proportional proxy for the standard deviation), efficient of variation (6.84% relative error; dashed line);
which is not additive (Stanley, 2006a). As a result, a root (ii) RMS duplicate coefficient of variation (12.59% rela-
mean square (RMS) calculation is undertaken to determine tive error; solid line); (iii) the average duplicate standard
each of these average statistics (Stanley, 2006a; Stanley deviation divided by the average duplicate mean (4.61%
and Lawie, ????). This calculation determines the average relative error; dotted/dashed line); and (iv) the RMS dupli-
square and then takes the square root to determine the true cate standard deviation divided by the average duplicate
average relative error measure. mean (7.11% relative error; dotted line). Note that these
Unfortunately, two calculation approaches exist that last two relative error estimates pass through the intersec-
produce different average relative error measures: (i) the tion points between the average replicate mean and aver-
coefficient of variation for each replicate set can be calcu- age and RMS replicate standard deviations. Based on the
lated and then the average (RMS) coefficient of variation above arguments, the RMS duplicate coefficient of varia-
determined using the formula in Table 1; or (ii) the average tion (12.59%) represents the correct and unbiased estimate
mean of the replicate sets can be divided into the average of relative error (Stanley, 2006a; Stanley and Lawie, ????).
(RMS) standard deviation of the replicate sets. The former All other estimates are biased low, in some cases by as
strategy produces the correct estimate of relative error. The much as 63%.
latter is not really an average relative error; rather, it is a Comparing this correctly determined average coefficient
ratio of the average replicate error divided by the average of variation with the results of a corresponding Thompson-
replicate mean, which is not the same thing. Examples of Howarth error analysis of these same data reveals several
these two calculation strategies are presented in Figure 1. important points. Figure 2 presents a Thompson-Howarth
In Figure 1, the vertical line defines the average dupli- error analysis scatter plot of the duplicate data with a re-
cate pair mean, and the two horizontal lines identify the gression model forced through the origin (and thus only in-
average and RMS standard deviations for a quality control volving a relative error term; i.e., a slope), using groups of
data set of 100 Fe2O3 (wt.%) determinations of rotary drill 5 duplicate pair means and RMS standard deviations. The
chip duplicate samples from a banded iron formation Fe slope derived from this analysis (3.62%) can be thought of
Average Relative Error in Geochemical Determinations • C.R. Stanley and D. Lawie 269

ical survey is 15%, then the expected


measurement error in a sample with a
Cu concentration of 220 ppm from that
survey (but not necessarily one of the
duplicates used to estimate this relative
error) should not be estimated through
simple multiplication of the concentra-
tion by the average relative error (i.e.,
220 ppm × 0.15 = 33 ppm). This is be-
cause the relative error is an average
of duplicate relative errors that exhibit
a positively skewed distribution (Stan-
ley, 2006a), and was calculated in a
manner that is inconsistent with this
predictive goal, using a method that
does not minimize estimation error
(i.e., a method such as least squares re-
Fig. 2. Thompson-Howarth error analysis scatter plot of the 100 duplicate Fe2O3 (wt.%) analyses gression). Thus, it will not provide the
in Fig. 1. Diamonds represent original duplicate means and standard deviations; squares represent best possible estimate of error.
groups of 5 average means and RMS standard deviations (Stanley and Lawie, ????). Appropriate statistics that could be
used to estimate the expected meas-
as comparable with the RMS coefficient of variation for urement error in such a sample are the slope and intercept
these data (12.59%). Unfortunately, the two estimates of of the regression line derived from a Thompson-Howarth
relative error are significantly different (the Thompson- error analysis (Thompson, 1973, 1982; Thompson and
Howarth result is 71% lower than the RMS coefficient of Howarth, 1973, 1976, 1978; Howarth and Thompson,
variation). 1976; Fletcher, 1981; Stanley and Sinclair, 1986; Garrett
The observed discrepancies are an expected con- and Grunsky, 2003; Stanley, 2003a,b). This is because
sequence of these two error analysis approaches. The the regression used in Thompson-Howarth error analysis
Thompson-Howarth error analysis result has been derived seeks to identify the expected value of the measurement
from data plotted in a Cartesian coordinate system, because error at any concentration. It identifies a linear model that
it estimates the absolute measurement error (the duplicate describes a “functional relationship” between concentra-
standard deviation; the ordinate coordinate) at a given con- tion and error. Thus, if the slope and intercept derived from
centration (the abscissa coordinate). In contrast, the RMS a Thompson-Howarth error analysis are 5% and 8 ppm,
coefficient of variation effectively operates in a radial co- respectively, then the best estimate of the standard devia-
ordinate system. The concentration of duplicate pairs is ir- tion of error in the sample with a concentration of 220 ppm
relevant because the units of concentration (in this case, would be (220 ppm × 0.05) + 8 ppm = 19 ppm. This esti-
wt.%) cancel out in the formation of the coefficient of mate would involve the smallest estimation error because
variation ratio. As a result, the individual coefficients of of the method used in its determination.
variation describe slopes of lines from the origin to the Because the coefficient of variation is a descriptive
duplicate pair points defined by their means and standard (structural relationship-type) measure of relative error,
deviations. These slopes are functionally related to the ra- and because the squares of relative errors (the relative
dial angles of a radial coordinate system. Thus, it should variances), like the squares of absolute errors (the vari-
come as no surprise that the RMS coefficient of variation ances), are additive (Francois-Bongarcon, 2005; Stanley
and a Thompson-Howarth error analysis regression slope and Smee, 2005a,b, ????; Stanley, 2006b), relative errors
produce significantly different results, precisely because of replicate measurements collected at various stages of
the data are considered in completely different coordinate sample treatment can be used to determine the amount of
systems. error introduced during each stage of sampling, prepara-
tion, and analysis of geochemical samples (Francois-Bon-
Appropriate Use of the Average Relative Error garcon, 2005; Stanley and Smee, 2005a,b, ????). For ex-
ample, because samples are collected in the field, crushed,
The coefficient of variation is a descriptive estimate of subsampled, pulverized, subsampled again, and then ana-
relative error for a specific set of replicate measurements lyzed, sample replicates collected in the field will docu-
(the set used to make the estimate of relative error). Con- ment the total error introduced to the samples during this
sequently, it describes a structural relationship between entire sampling/preparation/analysis protocol. In contrast,
element concentration and measurement error. Thus, use sample replicates collected after crushing will document
of the coefficient of variation should be restricted to those the amount of error introduced to the samples from that
applications that are consistent with this characteristic. point onward in the sampling/preparation/analysis proto-
For example, if the average relative error calculated (as col. If the sample replicates exhibit an average coefficient
CV) from a set of duplicate determinations in a geochem- of variation of 26%, whereas the post-crushing replicates
270 Exploration and Mining Geology, Vol. 16, Nos. 3–4, p. 265–274, 2007

exhibit an average coefficient of variation of 10%, then the average relative error; Stanley, 2006a; Stanley and Lawie,
coefficient of variation describing the magnitude of error ????) is presented in the Appendix. The standard error on
introduced between initial sampling and subsampling after the relative error (presented in relative terms) derived from
crushing (i.e., sampling error) is 24% (=  0.262 - 0.102 ). Use Equation A7 in the Appendix is plotted as a function of
of relative errors in this application is appropriate because the average relative error for different numbers of replicate
only descriptive measures of relative error are required for determinations (p) in Figure 3. This illustrates that the stan-
the decomposition of errors introduced during the sam- dard error on the relative error is large for small numbers
pling/preparation/analysis protocol. of replicate determinations. For example, for a single esti-
Relative errors can also be appropriately used to assess mate of relative error of 15% determined from duplicates,
whether the data meet a required level of analytical quality, the standard error on this estimate is 11% (a relative error
then the RMS coefficient of variation can be used for this of 73%; from Equation A7; Fig. 3).
purpose because it is a descriptive measure of analytical Analogously, Equation A11 can be used to determine the
quality. For example, if 10% relative error is necessary to standard error on the average relative error of n replicate
ensure the data collected are “fit for purpose” (Bettenay sets. For the duplicate data set presented in Figures 1 and
and Stanley, 2001), and a replicate data set exhibits 12% 2, this standard error is 3.65% (on the estimate of 12.59%),
relative error, then one can conclude that this data set does and thus corresponds to a relative error of 29%. This rela-
not meet the required level of measurement precision to be tive error is large, in this case because n is relatively small
used for the purpose it is intended. (100). Thus, in order to obtain reasonably stable estimates
Although the slope and intercept derived from a Thomp- of average relative errors, larger numbers of replicate sets
son-Howarth error analysis describe a functional relation- (n) are commonly required. The poor precision exhibited
ship between concentration and measurement error, and by average relative errors is a consequence of the large
thus can be used to predict the expected error in samples scatter observed on plots of replicate mean versus repli-
of known concentration, these parameters can also be used cate standard deviation (e.g., Fig. 1, 2), a feature resulting
in the variance decomposition or “fit for purpose” applications described above. This is because, even though the parameters derived from a Thompson-Howarth error analysis are fundamentally and philosophically different from the structural coefficient of variation, they are derived from a regression, and thus can also be used to establish structural relationships in addition to functional relationships (Kendall and Stuart, 1966). Consequently, the parameters of a Thompson-Howarth error model can be used in both structural and functional relationship contexts, for prediction and description of the magnitude of measurement error.

The Importance of Large n

The quality control duplicate data presented in Figures 1 and 2 exhibit significant scatter, and thus are similar to most duplicate quality control data observed in mineral exploration and mine assay data sets. The large scatter is a consequence of the very poor estimates of mean and standard deviation obtained from duplicate data. The magnitude of these estimation errors is defined by the standard errors on the duplicate means and standard deviations, and these are inverse functions of the square root of p, the number of replicates under consideration (in this case, p = 2; the relevant standard error formulae are presented in the Appendix). Because of the large scatter that occurs in duplicate quality control data sets, it is worth examining how this scatter affects the estimate of the average relative error.

A derivation of the standard error on the average relative error (RMS coefficient of variation) of n replicate sets (effectively the estimation error of the [...] from the fact that duplicates provide poor estimates of both means and standard deviations.

The functional form of Equation A11 also indicates that larger average relative errors will exhibit larger standard errors. Thus, a larger number of replicate determinations (p) will be required to obtain relatively accurate estimates of the average relative error, and the opposite will be true for small average relative errors. Consequently, the average relative error in a set of base metal assays, which typically exhibit relatively small average relative errors (such as 5%), can be determined reliably using smaller numbers of replicate sets (500 < n < 1000) than the average relative error in a set of gold assays, which typically exhibit relatively large average relative errors (such as 25%; Francois-Bongarcon, 2005; Stanley and Smee, 2005a,b, ????), and which will require larger numbers of replicates (such as n > 2500).

Fig. 3. Scatter plot of the standard error on the relative error (CV %; Equation A7) plotted against the relative error for different numbers of replicates (p).
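
As a rough illustration of why large n matters, consider the simplifying assumption (not made in the text) that every duplicate pair shares the same relative error CV. Under that assumption the average relative error equals CV, and Equation A11 collapses to the standard error of a single pair (Equation A7 with p = 2) divided by the square root of n:

$$
s_{\mu_{CV}} = \frac{s_{CV}}{\sqrt{n}}, \qquad s_{CV} = CV\sqrt{\frac{CV^2}{2} + \frac{1}{2}} .
$$

For CV = 5% this gives s_CV of about 3.5%, so n = 500 pairs yield a standard error of about 0.16% (absolute) on the average relative error. For CV = 25%, s_CV is about 18.2%, so n = 500 pairs yield about 0.81%, and n = 2500 pairs yield about 0.36%. These figures are a sketch under the equal-CV assumption only; they are not offered as the basis for the sample sizes quoted above.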

Conclusions

Calculating relative errors in geochemical data sets is a task that has caused much confusion for geoscientists over the years because there are many ways to undertake such calculations. Fortunately, most of these approaches are numerically related to each other, and thus can be compared if appropriate steps are taken to convert these measures to a common standard. The coefficient of variation (one standard deviation divided by the mean) serves as this standard for a variety of reasons involving convention, theory, simplicity, generality, practicality, and philosophy. Use of the standard deviation is warranted over other statistics, precisely because it represents the standard measure of variation.

Calculation of relative error should be undertaken using a root mean square approach because variances, and not standard deviations, are additive. Furthermore, to ensure an unbiased result, calculation of relative error should involve calculating the mean (RMS) of the individual estimates of relative error determined by each replicate set, rather than dividing the mean error by the mean concentration.

Use of the coefficient of variation to describe relative error in a data set, calculated in the above manner, should be restricted to applications that require a bulk estimate of relative error in the data set under consideration. The relative error estimate should not be used to estimate the expected measurement error in individual samples. In contrast, results from Thompson and Howarth’s error analysis approach are suitable for the estimation of measurement error in individual samples, but not as a bulk description of relative error in the data set.

Lastly, because of the large uncertainties associated with individual estimates of the coefficient of variation derived from duplicate samples, if an accurate average relative error estimate is desired, a very large number of duplicate samples must be used in the calculation (e.g., >500). Furthermore, if the expected relative error is large, an even larger duplicate data set (e.g., >2500) may be necessary.

Acknowledgments

This research is motivated by the authors’ experiences witnessing the confusion brought about by the use of different measures of relative error by geologists and engineers at a number of mine sites. It was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant, and a financial stipend and logistical support from CRC-LEME (Perth, Western Australia) to the first author.

References

Bettenay, L., and Stanley, C.R., 2001, Geochemical data quality: The “fit-for-purpose” approach: Explore, Newsletter of the Association of Exploration Geochemists, v. 111, p. 12, 21–22.
Fletcher, W.K., 1981, Analytical methods in geochemical prospecting—Handbook of exploration geochemistry, 1: Amsterdam, Elsevier, 255 p.
Francois-Bongarcon, D., 2005, Comment regarding “sample preparation of ‘nuggety’ samples: Dispelling some myths about sample size and sampling errors”: Explore, Newsletter of the Association of Applied Geochemists, v. 127, p. 17–18.
Garrett, R.G., and Grunsky, E.C., 2003, S and R functions for the display of Thompson-Howarth plots: Computers and Geosciences, v. 29, p. 239–242.
Howarth, R.J., and Thompson, M., 1976, Duplicate analysis in practice—Part 2: Examination of proposed methods and examples of its use: The Analyst, v. 101, p. 699–709.
Kendall, M.G., and Stuart, A., 1966, The Advanced theory of statistics, 3: London, Griffin, 780 p.
Long, S., 1998, Practical quality control procedures in mineral inventory estimation: Exploration and Mining Geology, v. 7, p. 117–127.
Meyer, S.L., 1975, Data analysis for scientists and engineers: New York, Wiley, 513 p.
Shaw, W.J., 1997, Validation of sampling and assaying quality for bankable feasibility studies [abs.]: Australian Institute of Mining and Metallurgy, The resource database towards 2000, Annual Meeting, May, Melbourne, Extended Abstracts, p. 41–49.
Stanley, C.R., 1999, Treatment of geochemical data: Some pitfalls in graphical analysis: Association of Exploration Geochemists, 19th International Exploration Symposium, Vancouver, April, Quality Control in Mineral Exploration Short Course, p. 1–34.
Stanley, C.R., 2003a, THPLOT.M: A MATLAB function to implement generalized Thompson-Howarth error analysis using replicate data: Computers and Geosciences, v. 29, p. 225–237.
Stanley, C.R., 2003b, Corrigenda to “THPLOT.M: A MATLAB function to implement generalized Thompson-Howarth error analysis using replicate data.” [Computers and Geosciences, v. 29, p. 225–237]: Computers and Geosciences, v. 29, p. 1069.
Stanley, C.R., 2006a, On the special application of Thompson-Howarth error analysis to geochemical variables exhibiting a nugget effect: Geochemistry: Exploration, Environment, Analysis, v. 6, p. 357–368.
Stanley, C.R., 2006b, Numerical transformation of geochemical data, I: Maximizing geochemical contrast to facilitate information extraction and improve data presentation: Geochemistry: Exploration, Environment, Analysis, v. 6, p. 69–78.
Stanley, C.R., and Lawie, D., ????, Thompson-Howarth error analysis: Unbiased alternatives to the large sample method for assessing non-normally distributed measurement error in geochemical samples: Geochemistry: Exploration, Environment, Analysis, v. ??, p. ???–???.
Stanley, C.R., and Sinclair, A.J., 1986, Relative error analysis of replicate geochemical data: Advantages and applications [abs.]: GeoExpo, 1986: Exploration in the North American Cordillera, Association of Exploration Geochemists, Regional Symposium, Vancouver, Programs and Abstracts, p. 77–78.
Stanley, C.R., and Smee, B.W., 2005a, Sample preparation of ‘nuggety samples’: Dispelling some myths about sample size and sampling errors: EXPLORE, Newsletter of the Association of Applied Geochemists, p. 19–22.
Stanley, C.R., and Smee, B.W., 2005b, Reply to comment by Francois-Bongarcon, D., regarding “sample preparation of ‘nuggety’ samples: Dispelling some myths about sample size and sampling errors”: EXPLORE, Newsletter of the Association of Applied Geochemists, p. 21–27.
Stanley, C.R., and Smee, B.W., ????, Strategies for reducing sampling errors in exploration and resource definition drilling programs for gold deposits: Geochemistry: Exploration, Environment, Analysis, v. ??, p. ???–???.
Thompson, M., 1973, DUPAN 3, A subroutine for the interpretation of duplicated data in geochemical analysis: Computers and Geosciences, v. 4, p. 333–340.
Thompson, M., 1982, Regression methods and the comparison of accuracy: The Analyst, v. 107, p. 1169–1180.
Thompson, M., and Howarth, R.J., 1973, The rapid estimation and control of precision by duplicate determinations: The Analyst, v. 98, p. 153–160.
Thompson, M., and Howarth, R.J., 1976, Duplicate analysis in practice—Part 1. Theoretical approach and estimation of analytical reproducibility: The Analyst, v. 101, p. 690–698.
Thompson, M., and Howarth, R.J., 1978, A new approach to the estimation of analytical precision: Journal of Geochemical Exploration, v. 9, p. 23–30.

Appendix

The following derivations determine the formulae of the standard error on the average coefficient of variation calculated from one set of p replicates, and from n sets of p replicates.

The standard errors on the mean and standard deviation, for p replicates, are:

$$ s_\mu = \frac{s}{\sqrt{p}}, \quad \text{and} \quad s_s = \frac{s}{\sqrt{2(p-1)}}, \tag{A1} $$

respectively (Spiegel, 1975). Given that the relative error (or coefficient of variation) is:

$$ CV = \frac{s}{\mu}, \tag{A2} $$

we can propagate standard errors on the mean and standard deviation through Equation A2 to determine the standard error on the relative error. This can be done using the error propagation equation:

$$ s_y^2 = \sum_{i=1}^{p} \left( \frac{\partial y}{\partial x_i} \right)^2 s_{x_i}^2 + 2 \sum_{i=1}^{p-1} \sum_{j=i+1}^{p} \left( \frac{\partial y}{\partial x_i} \right) \left( \frac{\partial y}{\partial x_j} \right) s_{x_i x_j}, \tag{A3} $$

derived using a Taylor series expansion about the mean (Meyer, 1975; Stanley, 1990). In this application, Equation A3 becomes:

$$ s_{CV}^2 = \left( \frac{\partial CV}{\partial \mu} \right)^2 s_\mu^2 + \left( \frac{\partial CV}{\partial s} \right)^2 s_s^2 + 2 \left( \frac{\partial CV}{\partial \mu} \right) \left( \frac{\partial CV}{\partial s} \right) s_{\mu s}. \tag{A4} $$

The required partial derivatives are:

$$ \frac{\partial CV}{\partial \mu} = \frac{-s}{\mu^2}, \quad \text{and} \quad \frac{\partial CV}{\partial s} = \frac{1}{\mu}. \tag{A5} $$

Making the appropriate substitutions, and assuming that the standard errors on the mean and standard deviation are independent, we obtain:

$$ s_{CV}^2 = \left( \frac{-s}{\mu^2} \right)^2 \left( \frac{s}{\sqrt{p}} \right)^2 + \left( \frac{1}{\mu} \right)^2 \left( \frac{s}{\sqrt{2(p-1)}} \right)^2 + 2 \left( -\frac{s}{\mu^2} \right) \left( \frac{1}{\mu} \right) (0) = CV^2 \left( \frac{CV^2}{p} + \frac{1}{2(p-1)} \right), \tag{A6} $$

or:

$$ s_{CV} = CV \sqrt{ \frac{CV^2}{p} + \frac{1}{2(p-1)} }. \tag{A7} $$

Now, if we have several (i = 1 … n) estimates of the relative error from several sets of replicates, we can propagate the standard errors on these relative errors through the calculation of the average relative error to obtain an error on that estimate. Because errors are additive as variances, the average relative error must be calculated using a root mean square approach:

$$ \mu_{CV} = \sqrt{ \frac{ \sum_{i=1}^{n} CV_i^2 }{ n } }. \tag{A8} $$

The error propagation formula for the variance on this average relative error is:

$$ s_{\mu_{CV}}^2 = \sum_{i=1}^{n} \left( \left( \frac{\partial \mu_{CV}}{\partial CV_i} \right)^2 s_{CV_i}^2 \right), \tag{A9} $$

and the appropriate partial derivatives are:

$$ \frac{\partial \mu_{CV}}{\partial CV_i} = \frac{1}{2} \, \frac{1}{ \sqrt{ \dfrac{ \sum_{i=1}^{n} CV_i^2 }{ n } } } \left( \frac{2 CV_i}{n} \right) = \frac{CV_i}{n} \left( \frac{1}{\mu_{CV}} \right). \tag{A10} $$

Thus, the standard error on the average relative error is:

$$ s_{\mu_{CV}} = \frac{1}{n} \sqrt{ \sum_{i=1}^{n} \left( \left( \frac{CV_i}{\mu_{CV}} \right)^2 s_{CV_i}^2 \right) }. \tag{A11} $$

The functional form of this equation dictates that as the number of relative error estimates (replicate sets, n) increases, the standard error on the mean relative error will decrease. Additionally, because Equation A11 is a function of CVi, larger standard errors on the relative error will occur with larger relative errors. Curves defined by the standard error on the relative error for one replicate set with different numbers of replicate determinations (p = 2 through 10) are presented in Figure 3.