
An Approximate Distribution of Estimates of Variance Components

Author(s): F. E. Satterthwaite
Source: Biometrics Bulletin, Vol. 2, No. 6 (Dec., 1946), pp. 110-114
Published by: International Biometric Society
Stable URL: https://www.jstor.org/stable/3002019
AN APPROXIMATE DISTRIBUTION OF ESTIMATES
OF VARIANCE COMPONENTS

F. E. Satterthwaite
General Electric Company, Ft. Wayne, Indiana

1. Introduction

In many problems, only simple mean square statistics are required to estimate whatever variances are involved. If the underlying populations are normal, these mean squares are distributed as is chi-square and may therefore be used in the standard chi-square, Student's t and Fisher's z tests. Frequently, however, the variances must be estimated by linear combinations of mean squares. Crump (1) has recently discussed a problem of this type, based on the following data:
Analysis of Variance of Total Egg Production of 12 Females (D. melanogaster) from 25 Races in 4 Experiments

The variance of the mean of the i-th race is shown in his paper to be estimated by

(1)    $\hat{V}(\bar{x}_{.i.}) = \frac{1}{e}\left(\hat{\sigma}_e^2 + \hat{\sigma}_{er}^2 + \frac{\hat{\sigma}^2}{n}\right)$

       $= \frac{1}{e}\left[\frac{MS_e - MS_{er}}{300} + \frac{MS_{er} - MS_f}{12}\right] + \frac{1}{en}(MS_f)$

(2)    $= \frac{1}{e}\left[\frac{MS_e + 24\,MS_{er}}{300}\right] + \frac{MS_f}{e}\left(\frac{1}{n} - \frac{1}{12}\right),$

where e is the number of experiments and n is the number of females tested from each race in each experiment. Variance estimates such as (2) have been called complex estimates (2). Thus a complex estimate of variance is a linear function of independent mean squares.
It is stated in (1) that "increasing the number of females indefinitely still leaves us with

(3)    $\hat{V}(\bar{x}_{.i.})_{n\to\infty} = \frac{MS_e + 24\,MS_{er} - 25\,MS_f}{300\,e} = \frac{173}{e}.$"

Conclusions are then reached without any consideration of the errors involved. Now the standard deviation of this estimate is approximately
(4)    $\sigma_{\hat{V}(\bar{x}_{.i.})} = \frac{1}{300\,e}\left\{2\left[\frac{(MS_e)^2}{5} + \frac{(24\,MS_{er})^2}{74} + \frac{(25\,MS_f)^2}{1102}\right]\right\}^{1/2} = 0.57\,\hat{V}(\bar{x}_{.i.}),$
and further analysis leading to confidence limits for V(x.i.) should be
helpful in choosing a course of action.
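The 0.57 factor in (4) can be checked numerically. The short Python sketch below is mine, not part of the original paper. It uses the rule suggested by the denominators in (4), namely that the sampling variance of a mean square MS on r degrees of freedom is estimated by 2(MS)^2/(r + 2); the products 24 MS_er and 25 MS_f are backed out from the rounded terms 37 and 19 that appear in (8) of Section 3, so they are approximations of mine rather than figures quoted from Crump.

    from math import sqrt

    # Numerators of the three terms in (3)/(4); the estimate divides their sum by 300e.
    ms_e = 46_659.0          # experiments mean square, 3 d.f. (from (8) below)
    ms_er_x24 = 37.0 * 300   # about 11,100; backed out from the term 37 in (8)
    ms_f_x25 = 19.0 * 300    # about  5,700; backed out from the term 19 in (8)

    # Estimated sampling variance of each term: 2 * (term)^2 / (r + 2).
    var_sum = (2 * ms_e ** 2 / 5            # r = 3
               + 2 * ms_er_x24 ** 2 / 74    # r = 72
               + 2 * ms_f_x25 ** 2 / 1102)  # r = 1,100

    sd = sqrt(var_sum) / 300     # standard deviation of the estimate, in units of 1/e
    print(round(sd / 173.0, 2))  # ratio to the estimate 173/e: prints 0.57

The printed ratio reproduces the 0.57 of (4) to two decimals.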
The writer has studied the distribution of complex estimates of
variance in a paper (2) in Psychometrika. Since this paper may not
be readily available to biometricians, the principal results are outlined
below and a few applications are given.

2. The Distribution of Complex Estimates of Variance

The exact distribution of a complex estimate of variance is too involved for everyday use. It is therefore proposed to use, as an approximation to the exact distribution, a chi-square distribution in which the number of degrees of freedom is chosen so as to provide good agreement between the two. This is accomplished by arranging that the approximating chi-square have a variance equal to that of the exact distribution. If $MS_1, MS_2, \ldots$ are independent mean squares with $r_1, r_2, \ldots$ degrees of freedom and

(5)    $V_s = a_1(MS_1) + a_2(MS_2) + \cdots$

is a complex estimate of variance based on them, the number of degrees of freedom of the approximating chi-square is found to be given by

(6)    $r_s = \frac{[a_1 E(MS_1) + a_2 E(MS_2) + \cdots]^2}{\dfrac{[a_1 E(MS_1)]^2}{r_1} + \dfrac{[a_2 E(MS_2)]^2}{r_2} + \cdots},$

where $E(\ )$ denotes the expected value of the quantity in parentheses.
In practice, the expected values of the mean squares will not be known. The observed mean squares are therefore substituted in (6), giving, as an estimate of $r_s$,

(7)    $\hat{r}_s = \frac{[a_1(MS_1) + a_2(MS_2) + \cdots]^2}{\dfrac{[a_1(MS_1)]^2}{r_1} + \dfrac{[a_2(MS_2)]^2}{r_2} + \cdots}.$

An approximation of this type was first suggested by H. Fairfield Smith (3) for the case in which only two mean squares are involved. His results support the use of (7) [his formula 3].
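As a computational aside (not part of the original paper), formula (7) is easy to apply mechanically. The Python sketch below takes the coefficients a_i, the observed mean squares, and their degrees of freedom, and returns both the complex estimate of (5) and the approximate degrees of freedom of (7); the function name and calling convention are my own.

    def complex_estimate(coeffs, mean_squares, dfs):
        """Return (V_s, r_s) for V_s = sum of a_i * MS_i, using formulas (5) and (7).

        coeffs       -- the constants a_1, a_2, ...
        mean_squares -- the observed mean squares MS_1, MS_2, ...
        dfs          -- their degrees of freedom r_1, r_2, ...
        """
        terms = [a * ms for a, ms in zip(coeffs, mean_squares)]
        v_s = sum(terms)
        # (7): square of the total divided by the sum of (a_i * MS_i)^2 / r_i.
        r_s = v_s ** 2 / sum(t ** 2 / r for t, r in zip(terms, dfs))
        return v_s, r_s

With coefficients 1, 1, -1 applied to the terms 155, 37 and 19 of Crump's example in Section 3 (3, 72 and 1,100 degrees of freedom), it returns 173 and about 3.7, in agreement with (8) and (9).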
The writer has checked the accuracy of the suggested approxi-
mations by calculating the exact distribution for a number of special
cases. Typical results are as follows:

[Table: exact versus approximate chi-square percentage points for several special cases.]
The above discrepancies between the exact and the approximate chi-squares, even for the extreme 99.9 percent case, are very small compared with their sampling errors. Thus it appears that the approximation may be used with confidence. Furthermore, we know from general reasoning that if $r_s$ is large, both the approximate and the exact distributions approach the same normal distribution; if $r_s$ is small, the sampling errors in the chi-squares are large and refinement is superfluous.
Some care must be taken in the cases where one or more of the a's in (5) are negative. If it is possible for $V_s$ to be negative with a fairly large probability, the approximate distribution will become rather poor, since it cannot allow negative estimates. However, here again the sampling errors in $V_s$ will be quite large compared with its expected value, so that only the sketchiest of conclusions can be drawn in any case.
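Both of these points — the close agreement in well-behaved cases and the deterioration when a negative coefficient makes $V_s$ negative with appreciable probability — can be examined by simulation. The Python sketch below is mine, not the author's; it assumes numpy and scipy are available, draws each mean square as a scaled chi-square with hypothetical expected values, and compares simulated percentage points of $r_s V_s / E(V_s)$ with those of the approximating chi-square, along with the chance that $V_s$ falls below zero.

    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(0)

    def simulate_v_s(coeffs, ems, dfs, n_draws=200_000):
        """Draw V_s = sum of a_i * MS_i, where r_i * MS_i / E(MS_i) is chi-square with r_i d.f."""
        return sum(a * m * rng.chisquare(r, n_draws) / r
                   for a, m, r in zip(coeffs, ems, dfs))

    # A difference of mean squares, V_s = MS_1 - MS_2, with hypothetical expected values.
    coeffs, ems, dfs = [1.0, -1.0], [150.0, 100.0], [10, 40]
    v_s = simulate_v_s(coeffs, ems, dfs)

    # Degrees of freedom from (6), using the expected mean squares.
    terms = [a * m for a, m in zip(coeffs, ems)]
    r_s = sum(terms) ** 2 / sum(t ** 2 / r for t, r in zip(terms, dfs))

    print("r_s =", round(r_s, 2), "  P(V_s < 0) =", round(float(np.mean(v_s < 0)), 2))
    scaled = r_s * v_s / sum(terms)          # approximately chi-square with r_s d.f.
    for p in (0.05, 0.5, 0.95):
        print(p, round(float(np.quantile(scaled, p)), 2), round(float(chi2.ppf(p, r_s)), 2))

With these hypothetical expected mean squares, $V_s$ is negative a sizable fraction of the time and the lower tail of the approximation is visibly poor, which is exactly the situation described in the paragraph above.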

3. Further Analysis of Crump's Example

The distribution of Crump's estimate of the residual variance of the race means,

(3)    $\hat{V}(\bar{x}_{.i.})_{n\to\infty} = \frac{(MS_e) + 24(MS_{er}) - 25(MS_f)}{300\,e},$

can now be approximated. Numerically,

(8)    $\hat{V}(\bar{x}_{.i.})_{n\to\infty} = \frac{1}{300\,e}\left[46{,}659 + 24(MS_{er}) - 25(MS_f)\right] = \frac{1}{e}\left[155 + 37 - 19\right] = \frac{173}{e}.$
From (7),

(9)    $\hat{r}_s = \frac{[155 + 37 - 19]^2}{\dfrac{(155)^2}{3} + \dfrac{(37)^2}{72} + \dfrac{(19)^2}{1{,}100}} = \frac{29{,}929}{8{,}008 + 19 + 0.3} = 3.7.$
From chi-square tables interpolated for 3.7 d.f. at the 5 percent and
95 percent points we find that, with a high degree of probability,
(10)    $0.60 < \frac{(3.7)\,\hat{V}(\bar{x}_{.i.})}{V(\bar{x}_{.i.})} < 9.0$

or

(11)    $\frac{(3.7)(173)}{(9.0)\,e} < V(\bar{x}_{.i.}) < \frac{(3.7)(173)}{(0.60)\,e}.$

Thus if it is important, say, to be reasonably sure that $V(\bar{x}_{.i.})$ does not exceed 9, we should run about

(12)    $e = \frac{(3.7)(173)}{(0.60)(9)} \approx 119$

experiments in the first series for the confidence limits to be sufficiently reduced. On the other hand, if the experiments are costly and time is not important, we might run

(13)    $e = \frac{(3.7)(173)}{(5.6)(9)} \approx 13$

experiments and then get a more accurate estimate of V to determine how many additional experiments should be run (5.6 being obtained as the 20 percent point for chi-square, 3.7 d.f.).
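The arithmetic of (8) through (13) can be reproduced with the sketch from Section 2 and a chi-square quantile routine; the Python fragment below (mine, assuming scipy is available) does so. Because scipy evaluates the 3.7-d.f. percentage points directly rather than by interpolation in a printed table, its results differ slightly from the 0.60, 9.0 and 5.6 used above.

    from scipy.stats import chi2

    # Terms of Crump's estimate (8): (MS_e + 24 MS_er - 25 MS_f)/300 = 155 + 37 - 19 = 173,
    # to be divided by e, the number of experiments.
    terms = [155.0, 37.0, -19.0]
    dfs = [3, 72, 1100]                       # d.f. of MS_e, MS_er and MS_f

    total = sum(terms)                        # 173
    r_s = total ** 2 / sum(t ** 2 / r for t, r in zip(terms, dfs))   # (9): about 3.7

    # (10)-(11): 90 percent confidence limits for V, in units of 1/e.
    lower = r_s * total / chi2.ppf(0.95, r_s)     # the paper's (3.7)(173)/9.0
    upper = r_s * total / chi2.ppf(0.05, r_s)     # the paper's (3.7)(173)/0.60
    print(round(r_s, 1), round(lower), round(upper))

    # (12)-(13): experiments needed to bring the upper limit for V down to 9,
    # using the 5 percent point and the upper 20 percent point (about 5.6).
    print(round(r_s * total / (chi2.ppf(0.05, r_s) * 9)))    # paper gives about 119
    print(round(r_s * total / (chi2.ppf(0.80, r_s) * 9)))    # paper gives about 13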

4. Difference of Means

The usual estimate of variance used in Student t tests for the difference of two means is

(14)    $V_t = \left[\frac{r_1(MS_1) + r_2(MS_2)}{r_1 + r_2}\right]\left[\frac{1}{r_1 + 1} + \frac{1}{r_2 + 1}\right],$

with

(15)    $r_t = r_1 + r_2$
degrees of freedom. This assumes that both populations have the same variance. Seldom do we have positive evidence that this is so; more often we have evidence that the variances are different. For example, $F = (MS_1)/(MS_2)$ may be significant. Note that a non-significant F is not evidence that the variances are equal, especially if one of the MS's has a small number of degrees of freedom.
The assumption of equal variances can be avoided by use of the complex estimate of variance,

(16)    $V_s = \frac{MS_1}{r_1 + 1} + \frac{MS_2}{r_2 + 1}$
with

(17)    $r_s = \frac{\left[\dfrac{MS_1}{r_1 + 1} + \dfrac{MS_2}{r_2 + 1}\right]^2}{\dfrac{[MS_1/(r_1 + 1)]^2}{r_1} + \dfrac{[MS_2/(r_2 + 1)]^2}{r_2}}$

degrees of freedom.
For example, consider
$MS_1 = 100,\ r_1 = 99;$
$MS_2 = 90,\ r_2 = 9.$
By the standard analysis one would obtain

(18)    $V_t = \left[\frac{99(100) + 9(90)}{99 + 9}\right]\left[\frac{1}{100} + \frac{1}{10}\right] = 10.9,$
        $r_t = 99 + 9 = 108.$
The complex estimate gives

(19)    $V_s = \frac{100}{100} + \frac{90}{10} = 10,\qquad r_s = \frac{(10)^2}{\dfrac{(1)^2}{99} + \dfrac{(9)^2}{9}} \approx 11.$

One will sometimes reach different conclusions with 108 degrees of freedom from those he will reach with 11 degrees of freedom.
If from general reasoning or other a priori considerations it is be-
lieved that both $MS_1$ and $MS_2$ are independent estimates of the same
variance, then the use of 108 degrees of freedom is justified. On the
other hand, if the given data are the entire admissible knowledge, then
the use of more than 11 degrees of freedom is not valid.
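As a numerical check on (14) through (19), the following Python sketch (mine; it assumes scipy for the t quantiles) computes both the pooled and the complex estimates for the example above, together with the two-sided 5 percent t points, which show how the choice of 108 versus 11 degrees of freedom can change a borderline verdict.

    from scipy.stats import t

    ms1, r1 = 100.0, 99     # sample 1: variance estimate and its d.f. (100 observations)
    ms2, r2 = 90.0, 9       # sample 2: 10 observations

    # (14)-(15): pooled estimate, assuming equal population variances.
    v_t = ((r1 * ms1 + r2 * ms2) / (r1 + r2)) * (1 / (r1 + 1) + 1 / (r2 + 1))
    r_t = r1 + r2                                          # 108

    # (16)-(17): complex estimate, with no equal-variance assumption.
    term1, term2 = ms1 / (r1 + 1), ms2 / (r2 + 1)
    v_s = term1 + term2                                    # 10
    r_s = v_s ** 2 / (term1 ** 2 / r1 + term2 ** 2 / r2)   # about 11

    print(round(v_t, 1), r_t, round(v_s, 1), round(r_s, 1))
    # Two-sided 5 percent t points: about 1.98 with 108 d.f. versus about 2.20 with 11 d.f.
    print(round(float(t.ppf(0.975, r_t)), 2), round(float(t.ppf(0.975, r_s)), 2))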

5. Conclusion

In many practical problems the most efficient estimate of variance available is a linear function of two or more independent mean squares. Usually the exact distribution of such estimates is too complicated for practical use. A satisfactory approximation can be based on the chi-square distribution with the number of degrees of freedom determined by (7).
Many problems, such as the difference of means, can be more conservatively analyzed by use of complex estimates of variance. Assumptions regarding homogeneity of variance can then be avoided.
References

(1) Crump, S. Lee. The estimation of variance components in analysis of variance. Biometrics Bulletin 2:1:7-11. February 1946.
(2) Satterthwaite, Franklin E. Synthesis of variance. Psychometrika 6:309-316. October 1941.
(3) Smith, H. Fairfield. The problem of comparing the results of two experiments with unequal errors. Journal of the Council of Scientific and Industrial Research 9:211-212. August 1936.

