Algorithms for Computing the Sample Variance: Analysis and Recommendations
Author(s): Tony F. Chan, Gene H. Golub, Randall J. LeVeque
Source: The American Statistician, Vol. 37, No. 3 (Aug., 1983), pp. 242-247
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2683386
Accessed: 13/03/2011 09:56
Algorithms for Computing the Sample Variance: Analysis and Recommendations
TONY F. CHAN, GENE H. GOLUB, and RANDALL J. LEVEQUE*
The problem of computing the variance of a sample of N data points {x_i} may be difficult for certain data sets, particularly when N is large and the variance is small. We present a survey of possible algorithms and their round-off error bounds, including some new analysis for computations with shifted data. Experimental results confirm these bounds and illustrate the dangers of some of these algorithms. Specific recommendations are made as to which algorithm should be used in various contexts.

KEY WORDS: Variance; Standard deviation; Shifted data; Rounding errors; Computer algorithms.

1. INTRODUCTION

The problem of computing the variance of a sample of N data points {x_i} is one that seems, at first glance, to be almost trivial but can in fact be quite difficult, particularly when N is large and the variance is small.

The fundamental calculation consists of computing the sum of squares of the deviations from the mean,

    S = Σ_{i=1}^{N} (x_i − x̄)²,    (1.1a)

where

    x̄ = (1/N) Σ_{i=1}^{N} x_i.    (1.1b)

The sample variance is then S/N or S/(N − 1) depending on the application. The formulas (1.1) define a straightforward algorithm for computing S. This will be called the standard two-pass algorithm, since it requires passing through the data twice: once to compute x̄ and then again to compute S. This may be undesirable in many applications, for example when the data sample is too large to be stored in main memory or when the variance is to be calculated dynamically as the data is collected.

To avoid the two-pass nature of (1.1), it is standard practice to manipulate the definition of S into the form

    S = Σ_{i=1}^{N} x_i² − (1/N) ( Σ_{i=1}^{N} x_i )².    (1.2)

This form is frequently suggested in statistical textbooks and will be called the textbook one-pass algorithm. Unfortunately, although (1.2) is mathematically equivalent to (1.1), numerically it can be disastrous. The quantities Σ x_i² and (1/N)(Σ x_i)² may be very large in practice and will generally be computed with some rounding error. If the variance is small, these numbers should cancel out almost completely in the subtraction in (1.2). Many (or all) of the correctly computed digits will cancel, leaving a computed S with a possibly unacceptable relative error. The computed S can even be negative, a blessing in disguise since this at least alerts the programmer that disastrous cancellation has occurred.

To avoid these difficulties, several alternative one-pass algorithms have been introduced. These include the updating algorithms of Youngs and Cramer (1971), Welford (1962), West (1979), Hanson (1975), and Cotton (1975), and the pairwise algorithm of the present authors (Chan, Golub, and LeVeque 1979). In describing these algorithms we use the notation T_{ij} and M_{ij} to denote the sum and the mean of the data points x_i through x_j, respectively,

    T_{ij} = Σ_{k=i}^{j} x_k,    M_{ij} = T_{ij} / (j − i + 1),

and S_{ij} to denote the sum of squares

    S_{ij} = Σ_{k=i}^{j} (x_k − M_{ij})².

*Tony F. Chan is Assistant Professor, Department of Computer Science, Yale University, New Haven, CT 06520. Gene H. Golub is Professor and Chairman, Department of Computer Science, Stanford University, Stanford, CA 94305. Randall J. LeVeque is Research Fellow, Courant Institute of Mathematical Sciences, New York University, New York, NY 10012. This work was supported in part by Army Contract DAAG29-78-G-0179, DOE Contract DE-AC02-81ER10996, and by National Science Foundation and Hertz Foundation graduate fellowships. The article was produced using TEX, a computer typesetting system created by Donald Knuth at Stanford.
For computing an unweighted sum of squares, as we consider here, the algorithms of Welford, West, and Hanson are virtually identical and are based on the updating formulas

    M_{1,j} = M_{1,j−1} + (1/j)(x_j − M_{1,j−1}),    (1.3a)
    S_{1,j} = S_{1,j−1} + ((j − 1)/j)(x_j − M_{1,j−1})²,    (1.3b)
with M_{1,1} = x_1 and S_{1,1} = 0. The desired value of S is ultimately obtained as S_{1,N}.
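As a concrete illustration, the recurrence (1.3) might be implemented as follows (our sketch, using the same 1-based index j as in the text):

```python
def updating_variance(data):
    """Welford-style updating algorithm based on (1.3a)-(1.3b).

    Returns S = S_{1,N}, the sum of squared deviations from the mean;
    divide by N or N - 1 to obtain the variance.
    """
    M, S = data[0], 0.0                        # M_{1,1} = x_1, S_{1,1} = 0
    for j, xj in enumerate(data[1:], start=2):
        delta = xj - M                         # x_j - M_{1,j-1}
        M = M + delta / j                      # (1.3a)
        S = S + (j - 1) / j * delta * delta    # (1.3b)
    return S                                   # S_{1,N}
```

For example, `updating_variance([1.0, 2.0, 3.0, 4.0])` returns 5.0, the exact sum of squared deviations about the mean 2.5.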
The updating formulas of Youngs and Cramer are similar:

    T_{1,j} = T_{1,j−1} + x_j,    (1.4a)
    S_{1,j} = S_{1,j−1} + (1/(j(j − 1)))(j x_j − T_{1,j})²,    (1.4b)

with T_{1,1} = x_1 and S_{1,1} = 0. These two algorithms have similar numerical behavior and both are more stable than the textbook algorithm. Note, in particular, that with both of these algorithms S = S_{1,N} is computed as the sum of nonnegative quantities. Cotton's update is no more stable than the textbook algorithm and should not be used (see Chan and Lewis 1979).
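A sketch of the Youngs and Cramer recurrence (1.4) follows the same pattern (again our code, not the paper's); note that T must be updated before the square term is formed, since (1.4b) uses T_{1,j}:

```python
def youngs_cramer_variance(data):
    """Youngs-Cramer updating algorithm based on (1.4a)-(1.4b)."""
    T, S = data[0], 0.0                              # T_{1,1} = x_1, S_{1,1} = 0
    for j, xj in enumerate(data[1:], start=2):
        T = T + xj                                   # (1.4a)
        S = S + (j * xj - T) ** 2 / (j * (j - 1))    # (1.4b)
    return S                                         # S_{1,N}
```

On [1.0, 2.0, 3.0, 4.0] this again returns S = 5.0, matching the Welford-style recurrence and the two-pass definition (1.1).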
The updating formulas (1.4) can be generalized to allow us to combine two samples of arbitrary size. Suppose we have two samples {x_i}_{i=1}^{m} and {x_i}_{i=m+1}^{m+n} and we know

    T_{1,m} = Σ_{i=1}^{m} x_i,    T_{m+1,m+n} = Σ_{i=m+1}^{m+n} x_i,
    S_{1,m} = Σ_{i=1}^{m} (x_i − (1/m) T_{1,m})²,
    S_{m+1,m+n} = Σ_{i=m+1}^{m+n} (x_i − (1/n) T_{m+1,m+n})².

Then, if we combine all of the data into a sample of size m + n, we have

    T_{1,m+n} = T_{1,m} + T_{m+1,m+n},    (1.5a)
    S_{1,m+n} = S_{1,m} + S_{m+1,m+n} + (m/(n(m + n))) ((n/m) T_{1,m} − T_{m+1,m+n})².    (1.5b)

When m = n this reduces to

    S_{1,2m} = S_{1,m} + S_{m+1,2m} + (1/(2m)) (T_{1,m} − T_{m+1,2m})².    (1.6)

This formula forms the basis of the pairwise algorithm. The pairwise summation algorithm for computing the sum of N numbers is well known and can be described recursively by stating that T_{1,2m} shall be computed as

    T_{1,2m} = T_{1,m} + T_{m+1,2m},

with each of the sums on the right side computed in a similar manner. Formula (1.6) defines the analogous pairwise algorithm for computing the variance. This can be implemented in a one-pass manner using only O(log N) internal storage locations, as discussed in Chan, Golub, and LeVeque (1979) and also by Nash (1981). (All logarithms in this article are base 2.) It can easily be shown that the use of the pairwise summation algorithm reduces relative errors in T_{1,N} from O(N) to O(log N) as N → ∞. The pairwise variance algorithm can be expected to have the same advantage, as is confirmed numerically. Incidentally, pairwise summation can be used in implementing (1.1) (both in computing x̄ and in forming S) or (1.2) with similar benefits.
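The recursion just described might be sketched as follows (our code; it splits each slice at its midpoint, so general N is handled through the full combination rule (1.5b) rather than the power-of-2 special case (1.6)):

```python
def pairwise_variance(x):
    """Return (T, S) for the sample x via the pairwise combination (1.5)."""
    N = len(x)
    if N == 1:
        return x[0], 0.0
    m = N // 2
    n = N - m                              # size of the second half
    T1, S1 = pairwise_variance(x[:m])      # T_{1,m}, S_{1,m}
    T2, S2 = pairwise_variance(x[m:])      # T_{m+1,m+n}, S_{m+1,m+n}
    # combination rule (1.5b) with m + n = N
    S = S1 + S2 + m / (n * (m + n)) * (n / m * T1 - T2) ** 2
    return T1 + T2, S
```

For instance, `pairwise_variance([1.0, 2.0, 3.0, 4.0])` returns (10.0, 5.0), agreeing with the direct computation of T_{1,4} and S_{1,4}.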
Other devices can also be used to increase the accuracy of the computed S. For data with a large mean value x̄, experience has shown that substantial gains in accuracy can be achieved by shifting all of the data by some approximation to x̄ before attempting to compute S. Even a crude estimate of x̄ can yield dramatic improvements in accuracy, so we need not resort to a two-pass algorithm in order to first estimate x̄. This is discussed in detail in Section 3. However, when the shift is the computed mean and the textbook algorithm (1.2) is then applied to the shifted data, one obtains the corrected two-pass algorithm

    S = Σ_{i=1}^{N} (x_i − x̄)² − (1/N) ( Σ_{i=1}^{N} (x_i − x̄) )².    (1.7)

Here the first term is simply the two-pass algorithm (1.1a). The second term would be zero in exact computation but in practice is a good approximation to the error in the first term. Note that in this case use of the textbook algorithm does not lead to catastrophic cancellation, since the correction is generally much smaller than the first term. This algorithm was first pointed out to the authors by Professor A. Björck (1979), who suggested this correction term based solely on the error analysis of the two-pass algorithm (Chan, Golub, and LeVeque 1979). An alternative (and improved) error analysis is given in Section 3.

Initially, algorithms for computing the variance were judged solely on the basis of empirical studies (Hanson 1975, West 1979, and Youngs and Cramer 1971). More recently, rigorous error bounds have been obtained for many algorithms (Chan, Golub, and LeVeque 1979; Chan and Lewis 1978, 1979). Our aim here is to present a unified survey of error analyses for the previously mentioned algorithms and techniques. Some of this material is believed to be new, particularly the investigation into the effects of shifting the data. Based on this survey, specific recommendations will be made as to which algorithm should be used in various contexts.

2. CONDITION NUMBERS AND ERROR ANALYSIS

Chan and Lewis (1978) first derived the condition number, κ, of a sample {x_i} (with respect to computing the variance). This condition number measures the sensitivity of S for the given data set. If relative errors of size γ are introduced into the x_i, then the relative change in S is bounded by κγ. Chan and Lewis showed this to be true up to O(γ²); in fact it is strictly true, as noted by van Nes (1979). Physical data almost always has some uncertainty in it, and this uncertainty will be magnified by the factor κ in S. If nothing else, errors are introduced in representing the data on the computer,
and so a value of S computed on a computer with machine accuracy u may have relative errors as large as κu regardless of what algorithm is used. This value κu can be used as a yardstick by which to judge the accuracy of the various algorithms, especially since error bounds that are functions solely of κ, u, and N can often be derived. If we define the 2-norm of the data by

    ||x||₂ = ( Σ_{i=1}^{N} x_i² )^{1/2},

then the condition number for this problem is given by

    κ = ||x||₂ / √S = √(1 + x̄² N / S).    (2.1)

When S is small and x̄ is not close to zero we obtain the useful approximation

    κ ≈ x̄ √(N/S)    (for S small, x̄ nonzero),    (2.2)

which is the mean divided by the standard deviation. We always have κ ≥ 1, and in many situations κ is very large.

Table 1 shows the asymptotic error bounds for the algorithms discussed. These are bounds on the relative error |(S̃ − S)/S| in the computed value S̃. Small constant multipliers have been dropped for clarity. Higher-order terms have also been dropped, but the terms shown dominate the error bounds whenever the relative error is less than 1. The bounds for the textbook algorithm and West's updating algorithm are derived by Chan and Lewis (1978). The two-pass error bound, including the N²κ²u² term (which can dominate in practice), is derived in Chan, Golub, and LeVeque (1979). Bounds for these algorithms using pairwise summation can be found similarly. The pairwise variance algorithm bound is a conjecture based on the form of the error bound for Youngs and Cramer updating and on experimental results. The error analysis for the corrected two-pass algorithm is given in Section 3.

Table 1. Error Bounds for the Relative Error |(S̃ − S)/S| in the Computed Value S̃. Only the Dominant Terms are Shown, and Small Constant Factors Have Been Suppressed for Clarity.

    Algorithm                                        Error Bound
    1. textbook                                      N κ² u
    2. textbook with pairwise summation              κ² u log N
    3. two-pass                                      N u + N² κ² u²
    4. two-pass with pairwise summation              u log N + (κ u log N)²
    5. corrected two-pass                            N u + N³ κ² u³
    6. corrected two-pass with pairwise summation    u log N + κ² u³ log³ N
    7. updating                                      N κ u
    8. pairwise                                      κ u log N (conjectured)

Graphs of these bounds are shown in Figures 1 through 8 along with some experimental results. Each plot has κ on the abscissa and the relative error in S on the ordinate. The lower curve in each figure shows the error bound for N = 64, the upper curve for N = 4096.

The numerical experiments were performed on an IBM 3081 computer at the Stanford Linear Accelerator Center. The data used were provided by a normal random number generator with mean 1 and a variety of different variances σ², 10⁻¹³ ≤ σ² ≤ 1. For this choice of the mean, κ ≈ 1/σ (see (2.2)). In each case the results have been averaged over 20 runs. Single precision was used in all of the tests, with machine accuracy u ≈ 5 × 10⁻⁷. The "correct" answer for use in computing the error was calculated in double precision. The resulting errors are denoted in the figures by the symbols + (for N = 64) and × (for N = 4096).

The experimental results confirm the general form of the error bounds given in Table 1. In particular, the graphs for the two-pass algorithms show how the higher-order terms (such as N²κ²u²) begin to dominate the error at fairly modest values of κ.

3. COMPUTATIONS WITH SHIFTED DATA

If we replace the original data {x_i} by shifted data {x̃_i} defined by

    x̃_i = x_i − d    (3.1)

for some fixed shift d, then the new data has mean x̄ − d and S remains unchanged (assuming the x̃_i are computed exactly). In practice, data with a nonzero mean is frequently shifted by some a priori estimate of the mean before attempting to compute S. This will generally increase the accuracy of the computed S. We analyze this improvement by investigating the dependence of the condition number on the shift. Bounds on κ̃, the condition number of the shifted data, are derived for various choices of the shift d. These can then be inserted in place of κ in the bounds of Table 1 to obtain error bounds for each of the algorithms with shifted data.

From the definition of the condition number we have

    κ̃² = 1 + (N/S)(x̄ − d)².    (3.2)

Comparing this with (2.1) we see that κ̃ < κ whenever |d − x̄| < |x̄|, that is, whenever d lies between 0 and 2x̄. Taking d = x̄ gives perfectly conditioned data, κ̃ = 1. In practice we cannot compute x̄ exactly and usually will not even attempt to compute it (except when using a two-pass algorithm). Instead, we use some rough estimate that is easily computed without a separate pass through all of the data.

Frequently a shift d is obtained by simply "eyeballing" the data. Such a technique might be expected to yield an approximation d that is within a few standard deviations of the mean. This is sufficient to give completely satisfactory bounds on κ̃. Recall that the standard deviation is (S/N)^{1/2} and suppose that |x̄ − d| ≤ ρ (S/N)^{1/2} for some small ρ. Then (3.2) gives

    κ̃² ≤ 1 + ρ².    (3.3)
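To make the role of κ̃ concrete, here is a small sketch (ours, not the paper's) that evaluates (3.2) directly; even a crude shift collapses the condition number from the thousands down to order 1.

```python
import math

def cond_number(x, d=0.0):
    """kappa-tilde from (3.2) for data shifted by d (d = 0 gives kappa)."""
    N = len(x)
    xbar = sum(x) / N
    S = sum((xi - xbar) ** 2 for xi in x)
    return math.sqrt(1.0 + N * (xbar - d) ** 2 / S)

x = [999.0, 1000.0, 1001.0]           # mean 1000, S = 2
kappa_raw = cond_number(x)            # ~1225: badly conditioned
kappa_crude = cond_number(x, 999.0)   # crude shift: sqrt(1 + 3/2) ~ 1.58
kappa_mean = cond_number(x, 1000.0)   # exact-mean shift: exactly 1
```

Shifting by the smallest data point (a perfectly legitimate choice, per the bound (3.4) below) is already enough to make every algorithm in Table 1 accurate on this sample.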
[Figures 1-8 appeared here: log-log plots of the relative error in S (ordinate) against the condition number κ (abscissa) for each algorithm of Table 1, with experimental errors marked + (N = 64) and × (N = 4096). Figure 1: Textbook Algorithm. Figure 2: Textbook Algorithm With Pairwise Summation. Figure 3: Two-Pass Algorithm. Figure 4: Two-Pass Algorithm With Pairwise Summation. Figure 5: Corrected Two-Pass Algorithm. Figure 6: Corrected Two-Pass Algorithm With Pairwise Summation. Figure 7: Updating Algorithm. Figure 8: Pairwise Algorithm.]
For example, if d is within one standard deviation of the mean, then κ̃ ≤ √2. This result is completely independent of S and N.

It is not always possible to obtain an approximation in this manner, nor is it always valid to make such an assumption on its accuracy. Another bound on κ̃ can be easily obtained by assuming only that min x_i ≤ d ≤ max x_i. This is easily guaranteed, for example by choosing one of the data points as the shift. When min x_i ≤ d ≤ max x_i, we have (x̄ − d)² ≤ Σ_i (x̄ − x_i)² = S, and so from (3.2),

    κ̃² ≤ 1 + N.    (3.4)

This bound is not as satisfactory as (3.3), but for moderate values of N it may be sufficient to guarantee acceptable errors in S.

For the case in which we shift by a single data point, d = x_j for some j, we can obtain some interesting probabilistic refinements of (3.4). Equality in (3.4) is unattainable, and approximate equality holds only when x_j lies considerably farther from x̄ than do any of the other x_i. If x_j is picked at random from the sample {x_i}, then the expected value of κ̃² will be much smaller than 1 + N. In fact, since E[(x̄ − x_j)²] = S/N (the definition of the sample variance), we have from (3.2) that

    E[κ̃²] = 2,    (3.5)

independent of N and S. Note that this is also independent of the underlying distribution of the {x_i}. We assumed only that x_j was chosen from {x_i} with a uniform distribution. Alternatively we could choose the data value with a fixed index, say x_1, and assume that the data is ordered randomly. This may not be a valid assumption if, for example, initial transients are present in the data.

Improved upper bounds of the form (3.4) that hold with probability close to 1 can also be obtained probabilistically. For fixed k, 1 ≤ k ≤ N, the inequality

    (x̄ − x_i)² ≥ k S / N

can hold for at most N/k values of i. Otherwise we would have Σ (x̄ − x_i)² > (N/k)(kS/N) = S. Thus if x_j is chosen at random, there is a probability of at least (N − N/k)/N = 1 − 1/k that (x̄ − x_j)² < kS/N. It follows that

    κ̃² < 1 + k    with probability at least 1 − 1/k, 1 ≤ k ≤ N.    (3.6)

If N ≥ 100 we have, for example, κ̃² < 101 with probability .99. This is again independent of N and S when the shift x_j is chosen at random from the sample.

We can generalize this choice of d by using the average of some p data points, p < N. This average will be denoted by x̄_p, the sum being over the chosen p data points. We assume that p is sufficiently small that rounding errors in computing x̄_p can be ignored. The condition number corresponding to this shift is bounded by using Cauchy's inequality,

    κ̃_p² = 1 + (N/S)(x̄ − x̄_p)² ≤ 1 + N/p.    (3.7)

For p = 1 this reduces to (3.4). We note that the resulting algorithm can be very easily implemented on a scientific pocket calculator, with great potential for accuracy improvement.

We now consider the case in which the computed mean is used as the shift. In general we cannot ignore rounding errors in computing x̄. Instead we compute some approximate floating point value fl(x̄), given by

    fl(x̄) = (1/N) Σ_{i=1}^{N} x_i (1 + ξ_i),    (3.8)

where the ξ_i are bounded by

    |ξ_i| ≤ N u    (3.9)

when the usual (forward) summation is used. If pairwise summation is used, the N can be replaced by log N. Now we can bound κ̃² by

    κ̃² = 1 + (N/S)(x̄ − fl(x̄))² ≤ 1 + κ² η²,    (3.10)

where η = max_i |ξ_i|. Here we have used (2.1) and Cauchy's inequality. Using (3.9) we can rewrite (3.10) as

    κ̃² ≤ 1 + N² κ² u².    (3.11)

Note that owing to the dependence on κ, the bound (3.11) may be worse than the bounds obtained for more primitive estimates of d. This reflects situations that can actually occur in practice. One can easily construct examples where the computed mean does not even lie between min x_i and max x_i, and hence (x̄ − fl(x̄))² is larger than max_i (x̄ − x_i)². In this case one is better off shifting by any single data point than by the computed mean.

Of course, shifting by the computed mean may also be an undesirable choice from the standpoint of efficiency, since it requires a separate pass through the data to compute fl(x̄). Nonetheless, when a two-pass algorithm is acceptable and N²κ²u² is small (≪ 1, say), this shift followed by a one-pass algorithm provides a very dependable method for computing S. The corrected two-pass algorithm (1.7) is of this form; it consists of the textbook algorithm applied to data shifted by fl(x̄). Its error bound Nu(1 + N²κ²u²) is easily derived from (3.11) and the textbook algorithm bounds of Table 1. Other one-pass algorithms could also be used in conjunction with a shift by the computed mean. However, if a good shift has been chosen so that κ̃ ≈ 1, all one-pass algorithms are essentially equivalent, with a bound Nu (or u log N for algorithms using pairwise summation). Since the textbook algorithm is the most efficient one-pass algorithm (requiring only N multiplications and 2N additions, as opposed to 4N multiplications and 3N additions for the updating algorithms, for example), it is the method of choice except in rare instances.

4. RECOMMENDATIONS

The results of the previous sections provide a basis for making an intelligent choice of algorithm for accurately computing the sample variance. First we note that if a parallel processor is available, the data can be split up into smaller samples and the sum of squares computed for each sample individually. These can then be combined, and the global sum of squares computed, by using the updating formulas (1.5). In that case the considerations that follow apply for each processor.

There is one situation in which the textbook algorithm (1.2) can be recommended as it stands. If the data consist only of integers, small enough that no overflows occur, then (1.2) should be used with the sums computed in integer arithmetic. In this case no roundoff errors occur until the final step of combining the two sums, in which a division by N occurs.

For nonintegral data we must first decide whether to use a one-pass or a two-pass algorithm. If all of the data fit in high-speed memory and we are not interested in dynamically updating the variance as new data are collected, then a two-pass algorithm is probably acceptable and the corrected two-pass algorithm (1.7) is recommended. If N is large and high accuracy is needed, it may be worthwhile to use pairwise summation in implementing this algorithm.

If a one-pass algorithm is to be used, the first step is to shift the data as well as possible, perhaps by some estimate x̄_p as discussed in Section 3. (The probabilistic bounds may be subsequently verified using much tighter a posteriori bounds provided as a byproduct of the computation.) Now an appropriate one-pass algorithm must be chosen. We should first estimate κ̃, the condition number of the shifted data, perhaps by one of the bounds of Section 3. If Nκ̃²u, the error bound for the textbook algorithm, is at least as small as the desired relative accuracy, then the textbook algorithm can be used on the shifted data. If this bound is too large, we should resort to a less efficient algorithm for safety. The dependence on N can be reduced by the use of pairwise summation. The dependence on κ̃ can be reduced by using an updating algorithm. The use of the pairwise algorithm should reduce both of these factors. When N is a power of 2, the pairwise algorithm is fairly easy to implement and requires only 2N multiplications and 4N additions, which is better than the updating algorithms. For general N, slightly more work (particularly human work) is required, making it less attractive.
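The recommendations above can be summarized as a small decision routine. This is our own sketch, not code from the paper: the function name and arguments are ours, and the thresholds simply compare the dominant error bounds of Table 1 (with κ taken as an estimate for the shifted data) against the desired relative accuracy.

```python
import math

def choose_algorithm(integral, two_pass_ok, N, kappa, u, tol):
    """Sketch of the Section 4 decision procedure.

    kappa: estimated condition number of the (shifted) data,
    u: machine accuracy, tol: desired relative accuracy in S.
    """
    if integral:
        return "textbook (integer arithmetic, unshifted)"
    if two_pass_ok:
        return "corrected two-pass (with pairwise summation if N is large)"
    # one-pass route: the data are assumed already shifted as well as possible
    if N * kappa**2 * u <= tol:           # textbook bound, Table 1
        return "textbook"
    if kappa**2 * u * math.log2(N) <= tol:  # textbook + pairwise summation
        return "textbook with pairwise summation"
    if N * kappa * u <= tol:              # updating bound
        return "updating algorithm"
    return "pairwise algorithm"
```

For example, with N = 1000, κ̃ ≈ 1 after a good shift, u = 10⁻⁷, and a tolerance of 10⁻³, the routine settles on the plain textbook algorithm, exactly as the text recommends for well-shifted data.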
[Figure 9. Decision Procedure for Choosing an Algorithm to Compute the Variance. The flowchart asks, in order: integral data? (yes: textbook, unshifted); two-pass acceptable? (yes: corrected two-pass, perhaps with pairwise summation); otherwise shift as well as possible and estimate κ̃, then use the textbook algorithm if Nκ̃²u is sufficiently small, the textbook algorithm with pairwise summation if κ̃²u log N is sufficiently small, and otherwise an updating algorithm or the pairwise algorithm. For details see the Recommendations section.]
The decision procedure just described is shown graphically in Figure 9.
[Received April 1982. Revised June 1982.]
REFERENCES

BJÖRCK, A. (1979), personal communication.
CHAN, T.F., GOLUB, G.H., and LeVEQUE, R.J. (1979), "Updating Formulae and a Pairwise Algorithm for Computing Sample Variances," Compstat 1982, Proceedings of the 5th Symposium held at Toulouse, eds. H. Caussinus et al., 30-41.
CHAN, T.F.C., and LEWIS, J.G. (1978), "Rounding Error Analysis of Algorithms for Computing Means and Standard Deviations," Technical Report No. 284, The Johns Hopkins University, Department of Mathematical Sciences.
——— (1979), "Computing Standard Deviations: Accuracy," Communications of the Association for Computing Machinery, 22, 526-531.
COTTON, E.W. (1975), Remark on "Stably Updating Mean and Standard Deviation of Data," Communications of the Association for Computing Machinery, 18, 458.
HANSON, R.J. (1975), "Stably Updating Mean and Standard Deviation of Data," Communications of the Association for Computing Machinery, 18, 57-58.
NASH, J.C. (1981), "Fundamental Statistical Calculations," Interface Age, September, 40-42.
VAN NES, F. (1979), personal communication.
WELFORD, B.P. (1962), "Note on a Method for Calculating Corrected Sums of Squares and Products," Technometrics, 4, 419-420.
WEST, D.H.D. (1979), "Updating Mean and Variance Estimates: An Improved Method," Communications of the Association for Computing Machinery, 22, 532-535.
YOUNGS, E.A., and CRAMER, E.M. (1971), "Some Results Relevant to Choice of Sum and Sum-of-Product Algorithms," Technometrics, 13, 657-665.