
Accred Qual Assur (2005) 10: 338–343

DOI 10.1007/s00769-005-0008-5 GENERAL PAPER

Stephen L. R. Ellison
Including correlation effects in an
improved spreadsheet calculation
of combined standard uncertainties

Abstract A spreadsheet method allowing rapid calculation of combined standard uncertainties is described. The model used allows explicitly for correlation effects, and requires a user to enter only the parameters, the calculation used to obtain the final result (including relevant influence factors), the individual standard uncertainties for the parameters, and estimates of correlation coefficients where necessary. The estimation of correlation coefficients in common cases is discussed, and it is shown that correlation is likely to be practically significant only when the correlated contribution to individual standard uncertainties is significantly over about 30% of the relevant standard uncertainty, leading to correlation coefficients |r| greater than 0.1. The implementation includes a more robust differentiation algorithm than previously reported for spreadsheet use, and initial preparation of the spreadsheets has been automated. The principle is illustrated with a simple example.

Keywords Measurement uncertainty · Correlation · Spreadsheet

Received: 25 October 2004
Accepted: 13 June 2005
Published online: 11 October 2005
© LGC Limited 2005

Electronic Supplementary Material Supplementary material is available for this article at http://dx.10.1007/s00769-005-0008-5

S. L. R. Ellison
LGC Limited, Queens Road, Teddington, Middlesex, TW11 0LY England
e-mail: s.ellison@lgc.co.uk
Tel.: +440-20-8943-7325
Fax: +440-20-8943-2767

Introduction

The ISO “Guide to the Expression of Uncertainties in Measurement” [1] and related guidance for analytical chemistry [2] base estimation of combined (that is, overall) uncertainties on the combination of individual contributing uncertainties expressed as standard uncertainties, that is, as standard deviations. The method of combination is based on the general expression for combining standard uncertainties; the standard uncertainty $u(y)$ of a value $y$ is calculated from the estimated standard uncertainties $u(x_i)$ of the independent parameters $x_1, x_2 \ldots$ on which it depends as

$$u(y) = \sqrt{\sum_{i=1,n} \left( \frac{\partial y}{\partial x_i}\, u(x_i) \right)^2} \tag{1}$$

where $\partial y/\partial x_i$ is the partial differential of $y$ with respect to $x_i$. Where variables are not independent, the relationship is extended to include a correlation term:

$$u(y(x_i, x_j \ldots)) = \sqrt{\sum_{i=1,n} \left( \frac{\partial y}{\partial x_i}\, u(x_i) \right)^2 + \sum_{i,j=1,n\,(i \neq j)} \frac{\partial y}{\partial x_i} \frac{\partial y}{\partial x_j}\, u(x_i, x_j)} \tag{2}$$

where $u(x_i, x_j)$ is the estimated covariance between $x_i$ and $x_j$. The covariance may be related to the correlation coefficient $r_{ij}$ using

$$r_{ij} = \frac{u(x_i, x_j)}{u(x_i)\, u(x_j)} \tag{3}$$

Though these expressions reduce to relatively simple forms for cases involving summation or multiplicative expressions, it is often time consuming to reduce complex expressions to simple forms. Further, the process is prone to both algebraic errors and errors in interpretation. It is, of course, possible to obtain the necessary derivatives in symbolic algebra manipulation software, but that is rarely available in an analytical laboratory.
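As a brief illustration of how Eqs. (2) and (3) reduce for simple forms, consider two correlated parameters $p$ and $q$ with correlation coefficient $r_{pq}$ (these symbols and the two model equations are chosen here purely for illustration). The sum over $i \neq j$ in Eq. (2) contains the pair $(p, q)$ twice, which gives the factor of 2:

```latex
\begin{align*}
% Difference: dy/dp = 1, dy/dq = -1
y = p - q: \quad & u(y)^2 = u(p)^2 + u(q)^2 - 2\, r_{pq}\, u(p)\, u(q) \\
% Product: dy/dp = q, dy/dq = p, then divide through by y^2 = p^2 q^2
y = p\,q: \quad & \left( \frac{u(y)}{y} \right)^2 = \left( \frac{u(p)}{p} \right)^2
  + \left( \frac{u(q)}{q} \right)^2 + 2\, r_{pq}\, \frac{u(p)}{p}\, \frac{u(q)}{q}
\end{align*}
```

With $r_{pq} = 0$ both expressions fall back to the familiar root-sum-of-squares rules for absolute and relative standard uncertainties respectively.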

A viable alternative is the use of numerical methods implemented using widely available spreadsheet software. Kragten [3] has demonstrated that approximate numerical methods employing relatively simple spreadsheets can provide sufficiently reliable estimates of combined standard uncertainties for many practical purposes, including the estimation of measurement uncertainty. The process depends on direct calculation of the terms $(\partial y/\partial x_i)\,u(x_i)$, which may then be combined as in Eq. (1). However, the method used is adapted to include correlations only by explicitly adding terms of the form $(\partial y/\partial x_i)\,u(x_i)\,(\partial y/\partial x_j)\,u(x_j)\,r_{ij}$ to the otherwise simple summation expression used to calculate $u(y)$. Though this is practicable for many purposes, as most correlated effects are small, it nonetheless reduces the generality of the process and makes it harder to provide a completely automated solution to the problem.

This paper describes a number of measures aimed at improving the generality of the technique. I describe an adaptation of the spreadsheet method which takes advantage of the form of Eq. (2) to provide a more general solution, requiring a user to enter only the parameters, their individual standard uncertainties, the calculation of the final value and any estimates of correlation coefficients, without modification to the calculations of the combined standard uncertainties. To facilitate estimation and use of correlation coefficients in uncertainty estimation, the common case of parameters affected by the same influence factor is considered and a general expression and some relevant simplifications for estimating correlation coefficients are derived. These expressions lead to some useful observations on the practical significance of correlation. Finally, a further improvement in the generality of the spreadsheet, through use of an alternative differentiation approximation, is described, and substantial automation of spreadsheet preparation is reported.

Discussion

The basic principle outlined by Kragten [3] is illustrated in the schematic spreadsheet in Fig. 1. Note that row 6 is completed by the user, after entering the initial calculation at cell A6, by simply copying the calculation from A6 across the row. This is the only change normally required once the spreadsheet is set up. It is feasible to set up the sheet to allow for a large number of parameters, and simply leave unused values as zero; the calculation of uncertainty will not be affected. In this schematic, three parameters are shown and all three are used.

    | A          | B                       | C                 | D                 | E
  1 |            |                         |                   |                   |
  2 | a          | u(a)                    | a'=a+u(a)         | a                 | a
  3 | b          | u(b)                    | b                 | b'=b+u(b)         | b
  4 | c          | u(c)                    | c                 | c                 | c'=c+u(c)
  5 |            |                         |                   |                   |
  6 | y=f(a,b,c) | u(y)=SQRT(SUM(C8:E8))   | y(a')=f(a',b,c)   | y(b')=f(a,b',c)   | y(c')=f(a,b,c')
  7 |            |                         | u(y,a)=y(a')-y    | u(y,b)=y(b')-y    | u(y,c)=y(c')-y
  8 |            |                         | C7*C7             | D7*D7             | E7*E7

Fig. 1 Schematic representation of spreadsheet calculations (After Kragten [3]). a, b, c represent parameters involved in the calculation of y; the calculation itself is at cell A6. u(a) is the standard uncertainty associated with a. u(y, a) is the contribution of the standard uncertainty in a to the standard uncertainty associated with y, and is the calculated approximation to the term $(\partial y/\partial x_i)\,u(x_i)$ (see text). The shaded region in row 6 holds the recalculated results with one value in turn incremented (columns C, D) and decremented (E, F) by its uncertainty; row 7 contains the calculated uncertainty contributions using Eq. (14)

The method implemented by Kragten relies on the approximation

$$\frac{\partial y}{\partial x_i}\, u(x_i) \approx y(x_i + u(x_i)) - y(x_i) \tag{4}$$

which is a simplified form of the usual implementation of a numerical differential [5], to provide direct calculations of the terms $(\partial y/\partial x_i)\,u(x_i)$ (for derivation, see Kragten [3]). These appear at row 7 in Fig. 1. (An improved approximation in use in this laboratory is described below, but the implementation at (4) is shown here for simplicity of presentation). For independent variables $x_i$, these terms are subsequently combined according to Eq. (1), in this example by placing the squared results from row 7 in row 8 and by setting cell B6 to the square root of the sum of values in row 8. Correlation terms (Eq. (2)) then need to be added directly to the expression in B6.

Including correlation

The issue of correlation in uncertainty estimates has recently been revisited, highlighting the need for routine methods of treating correlation [4]. This can be accommodated in a general spreadsheet implementation as follows. Taking Eq. (2) and substituting from Eq. (3):

$$u(y(x_i, x_j \ldots)) = \sqrt{\sum_{i=1,n} \left( \frac{\partial y}{\partial x_i}\, u(x_i) \right)^2 + \sum_{i,j=1,n\,(i \neq j)} \frac{\partial y}{\partial x_i} \frac{\partial y}{\partial x_j}\, u(x_i)\, u(x_j)\, r_{ij}} = \sqrt{\sum_{i,j=1,n} \left( \frac{\partial y}{\partial x_i}\, u(x_i) \right) \left( \frac{\partial y}{\partial x_j}\, u(x_j) \right) r_{ij}} \tag{5}$$

where $r_{ij} = 1$ for $i = j$.

The summation term in Eq. (5) is across all products of pairs of terms $(\partial y/\partial x_i)\,u(x_i)$, multiplied by the appropriate value of r. This observation suggests a simple implementation based on two diagonally symmetric matrices R and B, whose elements $R_{ij}$ and $B_{ij}$ respectively are given by

$$R_{ij} = \begin{cases} 1 & \text{where } i = j \\ r_{ij} & \text{where } i \neq j \end{cases}$$

$$B_{ij} = \left( \frac{\partial y}{\partial x_i}\, u(x_i) \right) \left( \frac{\partial y}{\partial x_j}\, u(x_j) \right)$$
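As a rough illustration of Eqs. (4) and (5) outside a spreadsheet, the sketch below computes the Kragten contributions by incrementing each parameter in turn and then sums the products of pairs of contributions weighted by the elements of R. The function names, the model y = a − b and the numerical values are assumptions introduced here for illustration; they are not taken from the spreadsheet described in the paper.

```python
import math

def kragten_contributions(f, x, u):
    """Approximate (dy/dx_i)*u(x_i) by y(x_i + u(x_i)) - y(x_i), as in Eq. (4)."""
    y0 = f(*x)
    contributions = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += u[i]  # increment one parameter by its standard uncertainty
        contributions.append(f(*shifted) - y0)
    return contributions

def combined_uncertainty(contributions, R):
    """Square root of the sum over i, j of R[i][j] * B[i][j] (Eq. (5)),
    where B[i][j] is the product of contributions i and j."""
    n = len(contributions)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += R[i][j] * contributions[i] * contributions[j]
    return math.sqrt(total)

def difference(a, b):
    """Hypothetical measurement model y = a - b."""
    return a - b

# Hypothetical inputs: a = 5.0, b = 3.0, each with standard uncertainty 0.01.
c = kragten_contributions(difference, [5.0, 3.0], [0.01, 0.01])

R_independent = [[1.0, 0.0], [0.0, 1.0]]  # no correlation
R_correlated = [[1.0, 0.5], [0.5, 1.0]]   # assumed r(a, b) = 0.5

u_independent = combined_uncertainty(c, R_independent)
u_correlated = combined_uncertainty(c, R_correlated)
print(u_independent, u_correlated)
```

Because the contributions keep their sign, a positive correlation lowers the combined uncertainty for a difference; here it falls from about 0.014 to about 0.010.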

The standard uncertainty $u(y)$ can then be calculated as

$$u(y) = \sqrt{\sum_{i,j=1,n} R_{ij}\, B_{ij}} \tag{6}$$

which can be readily implemented as a simple calculation using basic spreadsheet functions.

In practice, it is preferable to forego the intermediate matrix B in favour of C where $C_{ij} = B_{ij} R_{ij}$, leading to the implementation depicted in Fig. 2. In this schematic, cell references are used in places to show cross-references more clearly. Thus, the matrix C (C15 to E17) is constructed by reference to the values in row 7 and the correlation coefficients entered in R (C10 to E12), while the combined standard uncertainty $u(y)$ is simply the square root of the summation over C. In the schematic, advantage is taken of the diagonal symmetry in the matrices R and C by entering values into, and calculating, only the upper right part of the matrices; the lower left portion is simply a copy of the values at upper right. It is noteworthy that a simple inspection of the values held here will immediately show the relative contribution of each parameter to the overall uncertainty, allowing a quick check for significant components.

     | A          | B                         | C                 | D                 | E
   1 |            |                           |                   |                   |
   2 | a          | u(a)                      | a+u(a)            | a                 | a
   3 | b          | u(b)                      | b                 | b+u(b)            | b
   4 | c          | u(c)                      | c                 | c                 | c+u(c)
   5 |            |                           |                   |                   |
   6 | y=f(a,b,c) | s(y)=SQRT(SUM(C15:E17))   | y(a')=f(a',b,c)   | y(b')=f(a,b',c)   | y(c')=f(a,b,c')
   7 |            |                           | u(y,a)=y(a')-y    | u(y,b)=y(b')-y    | u(y,c)=y(c')-y
   8 |            |                           |                   |                   |
   9 |            |                           | a                 | b                 | c
  10 |            | a                         | 1                 | r(a,b)            | r(a,c)
  11 |            | b                         | [D10]             | 1                 | r(b,c)
  12 |            | c                         | [E10]             | [E11]             | 1
  13 |            |                           |                   |                   |
  14 |            |                           | a                 | b                 | c
  15 |            | a                         | [C7]*[C7]*[C10]   | [C7]*[D7]*[D10]   | [C7]*[E7]*[E10]
  16 |            | b                         | [D15]             | [D7]*[D7]*[D11]   | [D7]*[E7]*[E11]
  17 |            | c                         | [E15]             | [E16]             | [E7]*[E7]*[E12]

Fig. 2 Schematic representation of spreadsheet with correlations. Shaded regions here indicate that the cell contents are labels only. Cross-references are indicated by grid reference (e.g. C7 is the cell at row 7, column C; C15:E17 includes all nine cells in the range). References in brackets [] indicate that the reference is to the value in the cell. Note the summation at B7. The region C10 to E12 constitutes the matrix R, while C15 to E17 constitutes C (see text for explanation)

Estimation of correlation coefficients

In practice, correlations between parameters are often found in uncertainty calculations where one or more known systematic effects, each with its own uncertainty, affect two different quantities in the measurement equation. For example, in two weighings in densitometry, uncertainty in any air buoyancy correction (or arising from neglect of the correction) will appear as a component of uncertainty in both weights, but by virtue of the two readings being affected in the same way, introduces a correlated effect on the measured weight readings and therefore affects the uncertainty of the derived quantity, relative density. Where such a derived quantity (which may be a ratio, difference or more complex form) can be characterised directly, one way around this difficulty is to replace the separate terms in the measurement equation with the derived quantity itself and estimate the uncertainty in the derived quantity directly [2]. However, it is not always possible to eliminate correlations in this way. Sometimes the derived quantity may not be directly observable, or it may simply be uneconomic to characterise the behaviour of the derived quantity when the separate input quantities can be conveniently studied. Given a simple and automatic calculation, it may then be preferable to address the problem directly by forming an estimate of the correlation coefficient r.

Consider the case of a value $y = y(p, q)$ for which the standard uncertainty $u(y)$ is to be calculated using Eq. (2). The two parameters p and q are characterised by combined standard uncertainties $u_c(p)$, $u_c(q)$ respectively, in turn calculated from a standard uncertainty $u(c)$, representing the contribution introduced by a common effect c, and independent standard uncertainties $u_{p'}$, $u_{q'}$ respectively. The separation of the uncertainty contributions in this way is exactly equivalent to factoring p and q into independent terms $p'$ and c, $q'$ and c respectively, that is

$$p = p(p', c), \quad r_{p'c} = 0; \qquad q = q(q', c), \quad r_{q'c} = 0; \qquad \text{and } r_{p'q'} = 0 \tag{7}$$

where $p(p', c)$ simply denotes that p is a function of $p'$ and c etc., and c represents the influence of the common effect on the result.¹

¹ Note that for many purposes it is convenient to choose c such that $p(p', c) = p'$ and $q(q', c) = q'$. This will normally follow where the common factor c arises from a nominally null correction, such as in the case of an assumption about which there is some uncertainty, or a calibration step with associated uncertainty.

From the general expression for covariance in cases involving common influence factors [1], we can write

$$u(p, q) = \frac{\partial p}{\partial c} \frac{\partial q}{\partial c}\, u^2(c) \tag{8}$$

and hence, from Eq. (3)

$$r_{pq} = \frac{\dfrac{\partial p}{\partial c} \dfrac{\partial q}{\partial c}\, u^2(c)}{u_c(p)\, u_c(q)}$$

(Note: Eq. (8) is essentially identical to the covariance estimate given as Eq. (11) of Ref. [4], though expressed in terms of the partial derivatives of c instead of uncertainties in p and q arising from uncertainty in c). This gives simple expressions for $r_{pq}$ in the two common cases where the parameter c is combined with p and q additively and

multiplicatively:

Additive: $p = p' + c$, $q = q' + c$:

$$r_{pq} = \frac{u^2(c)}{u_c(p)\, u_c(q)} \tag{9}$$

Multiplicative: $p = p'c$, $q = q'c$:

$$r_{pq} = \frac{\left( u(c)/c \right)^2}{\left( u_c(p)/p \right) \left( u_c(q)/q \right)} \tag{10}$$

Note that Eqs. (9) and (10) are closely similar in form; Eq. (10) differs only in using relative standard uncertainties where Eq. (9) uses absolute standard uncertainties.

A further simplification applies when the two parameters have very similar standard uncertainties, as might often be the case when considering weights by difference or common volumetric operations when two parameters are determined using identical equipment. Setting $u(p) = u(q) = u$ or (for Eq. (10)) $u(p)/p = u(q)/q = \mathrm{RSD}$ gives, for additive cases:

$$r_{pq} = \frac{u^2(c)}{u^2} \tag{11}$$

and for multiplicative cases:

$$r_{pq} = \frac{\left( u(c)/c \right)^2}{\mathrm{RSD}^2} \tag{12}$$

Equations (9) and (10) show that, for the common cases detailed here, values for r can be calculated relatively easily on the basis of the standard uncertainties associated with each variable together with an estimate of the standard uncertainty of the common contribution to each. Equally importantly, these simple cases give considerable insight into the behaviour of the overall uncertainty with changes in the degree of correlation between parameters. Figure 3 shows the relative change in combined standard uncertainty as a function of the correlation term. The relative change in the combined uncertainty is shown as $(u_r - u_0)/u_0$, where $u_r$ is the combined uncertainty calculated for a correlation coefficient r and $u_0$ that for zero correlation. The correlation term is represented by the dimensionless expression rK, where

$$K = \frac{\partial y/\partial p \cdot \partial y/\partial q}{\left| \partial y/\partial p \cdot \partial y/\partial q \right|} \tag{13}$$

where K=1 when $\partial y/\partial p \cdot \partial y/\partial q = 0$, and r is calculated from the ratio of standard uncertainties $u(c)/u$ in Eq. (11) (or relative standard uncertainties in Eq. (12)). Since K is constrained to either +1 or −1, the complete expression rK is negative for cases where the correlation results in reduced uncertainty (e.g., for y=p−q with positive correlation) and positive where correlation increases uncertainty (e.g., for y=p+q with positive correlation). The most striking feature of Fig. 3 is that a high correlation (|r| ≈ 1) with ‘cancelling’ effects can result in a reduction to zero combined uncertainty, making it most important to consider correlations where cancellation of errors is likely. The effect of a high positive correlation is less marked, but still results in an increase of up to about 41%. However, it is clear from Fig. 3 that where r is close to zero the gradient is not steep, and for values of $u(c)/u$ below about 0.3 (i.e. r<0.1), including the correlation term will make only a moderate difference, suggesting that for many practical purposes involving comparable input standard uncertainties, suspected low correlations can safely be neglected.

Fig. 3 Relative change in combined uncertainty with correlation. See text for explanation. [Plot of $(u_r - u_0)/u_0$ (vertical axis) against rK (horizontal axis, −1.00 to 1.00)]

Improved accuracy in calculated differentials

Kragten notes correctly that high numerical accuracy is not essential in uncertainty calculations, the limitation normally being accuracy of estimation of individual components [3]. The approximation at Eq. (4), exact for linear models, is accordingly often adequate in non-linear cases, especially since its applicability can be verified by comparison with results obtained on replacing $y(x_i + u(x_i))$ with $y(x_i - u(x_i))$ and substituting analytically derived differentials if necessary. However, non-linearity is not uncommon in chemistry, notably in the case of simple division, and where uncertainties are substantial with respect to the relevant parameter the approximation can result in appreciable error in the estimated uncertainty in y. To reduce dependence on the assumption of linearity, the implementation in this laboratory uses the quadratic approximation

$$\frac{\partial y}{\partial x_i}\, s_i \approx \frac{y(x_i + u(x_i)) - y(x_i - u(x_i))}{2} \tag{14}$$

Equation (14) is exact for quadratic functions, and represents a substantial improvement for many other non-linear cases. For example, in the case of y=1/x with x=2 and u(x)=0.1, analytical differentiation and calculation gives

u(y)=0.025, Eq. (4) gives u(y)=0.0238 (−4.8% relative error) and Eq. (14) gives u(y)=0.0251 (0.25% relative error). As u(x) increases, the absolute difference increases; with u(x)=0.2, exact calculation gives u(y)=0.05, Eq. (4) gives u(y)=0.0455 (−9.1% relative error) and Eq. (14) gives u(y)=0.0505 (1.0% relative error). The larger of these latter errors is approaching practical significance and would probably lead to manual intervention in practice, while the error arising from Eq. (14) remains well within acceptable limits.

The practical implementation of Eq. (14) is shown in Fig. 4, which shows a schematic implementation based on a two parameter problem. Comparison with Fig. 1 shows the addition of columns (E and F here) which include the values and calculations for $y(x_i - u(x_i))$ (denoted y(p″,q) etc.). The calculations of the resulting contributions $(\partial y/\partial x_i)\,u(x_i)$ are at row 7 as before, showing the altered subtraction. These values are used directly in the combined standard uncertainty calculation after combination with relevant correlation coefficients as above.

    | A        | B    | C                        | D                        | E                 | F
  1 |          |      |                          |                          |                   |
  2 | a        | s(a) | a'=a+s(a)                | a                        | a''=a-s(a)        | a
  3 | b        | s(b) | b                        | b'=b+s(b)                | b                 | b''=b-s(b)
  5 |          |      |                          |                          |                   |
  6 | y=f(a,b) |      | y(a')=f(a',b)            | y(b')=f(a,b')            | y(a'')=f(a'',b)   | y(b'')=f(a,b'')
  7 |          |      | s(y,a)=(y(a')-y(a''))/2  | s(y,b)=(y(b')-y(b''))/2  |                   |
  8 |          |      |                          |                          |                   |

Fig. 4 Schematic representation of improved differentiation. a, b represent parameters involved in the calculation of y; the calculation itself is at cell A6. s(a) is the standard uncertainty associated with a. s(y, a) is the contribution of the standard uncertainty in a to the standard uncertainty associated with y and is the calculated approximation to the term $(\partial y/\partial x_i)\,s_i$ (see text)

Note that under some circumstances, accurate differentials may be less useful. In particular, situations near maxima may result in near-zero differentials. In these circumstances, $\max(y(x_i) - y(x_i - u(x_i)),\; y(x_i) - y(x_i + u(x_i)))$ may provide a more sensible uncertainty estimate. This is generally safe, but has the disadvantage of being conservative, that is, larger than the exact differential would predict; for routine analysis, this is likely to be acceptable.

Degrees of freedom

The ISO Guide does not comment explicitly on the calculation of degrees of freedom in the presence of correlated effects. Instead, it is assumed that all degrees of freedom are associated with the individual uncertainties, and the effect of any correlation on effective degrees of freedom appears in the Welch-Satterthwaite calculation of effective degrees of freedom only because the combined uncertainty changes in the presence of correlation. Conformance with the ISO guide does not, therefore, require explicit treatment of degrees of freedom in the correlation term. We accordingly assume that degrees of freedom are sufficiently addressed by application of the Welch-Satterthwaite equation (which is described in Ref. [1]).

The effect of this assumption in practice is likely to be modest, as the chief case in which a problem arises is the calculation of a correlation coefficient from a common effect, as above. In practice, this is rarely necessary for effects with small degrees of freedom, because a common effect on two or more contributions is usually well characterised (as in a zero correction for a balance, for example). If the problem is suspected to be significant, however, the most direct approach is to extend the measurement equation (that is, the model equation for the measurand, $y = f(x_1, x_2 \ldots)$) to include the common term explicitly and to identify the independent contributions of the remaining terms. This of course makes the calculation of correlation unnecessary, but at the expense of a more complex measurement equation.

Automation

The combination of allowance for correlation and implementation of Eq. (14) adds substantially to the complexity and preparation time for a given size of problem. The increased difficulty is in part offset by the possibility, noted above, of preparing standard spreadsheets of various sizes and allowing unused parameters. Despite this, large correlation matrices (over 5×5, say) involve a substantial amount of input, increasing as the square of the number of parameters, and the risks of mistakes in input become substantial. In this laboratory the process has accordingly been fully automated using spreadsheet ‘macro’ functions. The result is reliable and essentially instant preparation of an appropriately sized spreadsheet given only the required number of parameters. We have additionally automated the ‘calculation copy’, that is, the process of replicating the calculation in cell A6 of Fig. 4 to cells B6-E6, simply to reduce mistakes at this final stage.

Calculation example: Moisture content by weight

The principle of estimating and including correlation is illustrated by the following brief example, which is based on a basic moisture determination and will allow readers to check their implementation of the procedure. For a simple moisture determination, a mean dry mass $m_d$ is subtracted from a mean original mass $m_o$ to obtain a moisture content of $(m_o - m_d)$, each value being determined n times by weighing. Typically, uncertainties will be derived for the individual averages $\bar{m}_0$ and $\bar{m}_d$, and each will include a component for random variability $u_r$ and one for calibration uncertainty $u_c$. For this example, simulated masses are used and a two-figure balance is assumed for simplicity of development. Using typical two-figure balance uncertainties from the first edition of Ref. [2] (Appendix A3 of that edition), the calibration uncertainty $u_c$ is taken as 0.008 g, while the standard uncertainty $u_r$ associated with random variability is estimated at 0.03 g. Assuming three replicate weighings, $u_r(\bar{m}_i) = 0.03/\sqrt{3} = 0.017$ g, leading to a combined uncertainty $u(\bar{m}_i) = \sqrt{0.008^2 + 0.017^2} = 0.019$ g, where $\bar{m}_i$ denotes either mean mass. Without allowing for the

correlated effect of calibration, this leads to a combined uncertainty on moisture content of $\sqrt{0.019^2 + 0.019^2} = 0.027$ g. However, from Eq. (11) we can calculate

$$r(\bar{m}_0, \bar{m}_d) = \frac{0.008^2}{0.019^2} = 0.177,$$

leading to the figures shown in Fig. 5 (The values for weights are arbitrary, and do not affect the uncertainty calculation in this instance). Note the reduction in overall uncertainty to 0.024 g (cell A6); as expected from the relatively small value of r, the change in calculated uncertainty is not large. This quick verification makes it easy to decide whether to include, or neglect, the effect of correlation.

                  | A     | B     | C            | D
   1              |       |       |              |
   2  m0          | 5.5   | 0.019 | 5.519        | 5.5
   3  md          | 4.8   | 0.019 | 4.8          | 4.819
   4              |       |       |              |
   5  moisture    | 0.7   |       | 0.719        | 0.681
   6  uncertainty | 0.024 |       | 0.019        | -0.019
   7              |       |       |              |
   8              |       |       | m0           | md
   9              |       | m0    | 1            | 0.17
  10              |       | md    | 0.17         | 1
  11              |       |       |              |
  12              |       |       | m0           | md
  13              |       | m0    | 3.6 x 10^-4  | -6.4 x 10^-5
  14              |       | md    | -6.4 x 10^-5 | 3.6 x 10^-4
  15              |       |       |              |

Fig. 5 Example: moisture content calculation. The region C9 to D10 constitutes the matrix of correlation coefficients R, while C13 to D14 constitutes the contribution matrix C (see text for explanation). The unlabelled column to the left of column A is inserted to show input parameter names while retaining some cell labelling consistency with previous figures. Quantities are in grams

Summary and conclusions

It is practical and straightforward to include allowance for correlation in calculating combined standard uncertainties in spreadsheet models, making estimates of uncertainty involving correlated effects substantially simpler and more general. Still greater generality is achieved through implementation of a quadratic approximation for differentiation, leading to reduced error in non-linear models with substantial relative standard uncertainties. Use of spreadsheet automation functions markedly improves ease of implementation in practice.

The general expression for estimation of appropriate correlation coefficients leads to very simple expressions for most common cases. From the simple cases considered here, the correlated contribution to individual standard uncertainties will normally need to exceed about 30% of the standard uncertainty associated with relevant parameters to be practically significant.

Acknowledgements This work was supported under contract with the UK Department of Trade and Industry as part of the Valid Analytical Measurement (VAM) programme. The author thanks Mr A Williams for valuable discussion.
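Readers checking their own implementation may find it convenient to reproduce the two numerical examples in the text, the y = 1/x comparison of Eqs. (4) and (14) and the moisture determination, outside a spreadsheet. The following is a minimal sketch under the assumptions stated in the comments; all function and variable names are introduced here for illustration only. Intermediate values are not rounded, so r comes out as about 0.176 rather than the 0.177 obtained in the text from the rounded value 0.019 g.

```python
import math

def forward_difference(f, x, u):
    """Magnitude of y(x + u) - y(x), the approximation of Eq. (4)."""
    return abs(f(x + u) - f(x))

def central_difference(f, x, u):
    """Magnitude of (y(x + u) - y(x - u)) / 2, the approximation of Eq. (14)."""
    return abs(f(x + u) - f(x - u)) / 2

def reciprocal(x):
    """Model y = 1/x used in the text."""
    return 1.0 / x

# y = 1/x at x = 2 with u(x) = 0.1; the analytical result is u(x) / x**2.
exact = 0.1 / 2.0 ** 2
forward = forward_difference(reciprocal, 2.0, 0.1)
central = central_difference(reciprocal, 2.0, 0.1)

# Moisture example: y = m_0 - m_d, both means sharing a calibration effect.
u_cal = 0.008                        # calibration standard uncertainty, g
u_rand = 0.03 / math.sqrt(3)         # random component for three replicates, g
u_mass = math.hypot(u_cal, u_rand)   # combined uncertainty of either mean, g
r = u_cal ** 2 / u_mass ** 2         # Eq. (11), additive common effect

contrib = [u_mass, -u_mass]          # contributions of m_0 and m_d to y
R = [[1.0, r], [r, 1.0]]             # correlation matrix

u_uncorrelated = math.sqrt(sum(c * c for c in contrib))
u_correlated = math.sqrt(
    sum(R[i][j] * contrib[i] * contrib[j] for i in range(2) for j in range(2))
)

print(round(forward, 4), round(central, 4))
print(round(r, 3), round(u_uncorrelated, 3), round(u_correlated, 3))
```

Because the model is a simple difference, the contributions are written down directly as ±u rather than computed by finite differences; for this linear case both approaches give the same values.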

References
1. ISO (1993) “Guide to the expression of uncertainty in measurement”. Geneva, Switzerland. (ISBN 92-67-10188-9)
2. Eurachem/CITAC Guide (2000) “Quantifying uncertainty in analytical measurement”, 2nd edition. Williams A, Ellison SLR, Roesslein M (eds). Available from the Eurachem secretariat and website (http://www.eurachem.com/) and (hard copy) LGC Ltd, London (ISBN 0-948926-15-5)
3. Kragten J (1994) Analyst 119:2161–2166
4. Haesselbarth W, Bremser W (2004) Accred Qual Assur 9:597–600
5. Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1988) Numerical recipes in C. Cambridge University Press, Cambridge NY, p 272.
