Journal of Adolescent Research 2001 16: 205
DOI: 10.1177/0743558401162006
The Relationship of
ANOVA Models With Random Effects and
Repeated Measurement Designs
Christof Schuster
University of Notre Dame
Alexander von Eye
Michigan State University
Understanding ANOVA models is often difficult because of the large number of different experimental designs presented in applied textbooks. This article shows how different experimental designs arise from varying three basic distinctions: block versus treatment factors, fixed versus random factors, and crossed versus nested factors. Once it is understood how each distinction influences the statistical analysis, the number of experimental designs can be reduced considerably, because seemingly different experimental designs are sometimes essentially equivalent. This is shown by an example comparing a two-way repeated measurement analysis of variance model to a three-factor partially nested design. Furthermore, understanding how each distinction influences the statistical analysis of an experimental design can simplify the computational effort of the analysis, because virtually every basic ANOVA procedure implemented in common statistical software packages can be used to fit more complex ANOVA models that are usually analyzed with special computer modules.
In all textbooks, these principles are explained, and examples are given to motivate and illustrate them. However, the extent to which textbooks on the analysis of variance refer to these principles when discussing experimental designs differs considerably. Usually, textbooks first give an outline of the statistical theory of analysis of variance, considering only treatment factors, and then classify different experimental designs according to whether a factor is a block or a treatment factor. This is good practice because the models for analyzing experimental designs are generally obtained by combining a block structure with a treatment structure (see, for example, Box, Hunter, & Hunter, 1978; Milliken & Johnson, 1992).
However, the second and third of these distinctions are not as strongly emphasized as the first. Therefore, it is often hard to understand how these distinctions are related to a specific experimental design. In addi-
tion, chapters in textbooks are often organized from an applied perspective.
For instance, in the social sciences, subjects are frequently observed repeat-
edly over time and/or under each of several treatment conditions. It then
appears to be natural to group designs with this characteristic together in one
chapter. This has led to the practice in which substantive researchers often
use applied textbooks as “cookbooks.”
Although this is not in itself problematic, the large number of textbooks on analysis of variance and experimental designs that have been written over the years has resulted in considerably different presentations. For instance, in the book by Winer (1971), repeated measurement designs are discussed at length, whereas in Kirk (1995), there is no chapter, and not even a section, on this topic. The reason is that these designs are treated by Kirk under the label of split-plot designs.
Textbooks on the analysis of variance are often so voluminous because many experimental designs are discussed twice under different labels. Usually, the notation differs as well, so that the similarities are not obvious. This practice is likely to add to the confusion surrounding different ANOVA models. We will demonstrate this point in detail by comparing two equivalent models that appear under different labels in the book by Neter, Kutner, Nachtsheim, and Wasserman (1996).
The reason for this state of affairs is, in our opinion, that the distinctions between fixed and random factors and between crossed and nested factors, although mentioned in all textbooks, are not closely related to the derivation of the calculations necessary for specific designs. In the following section, we briefly outline how the calculations in the analysis of an experimental design proceed in general. The third section illus-
trates the difference between fixed and random effects. The fourth section
discusses the relationship between random effects models and repeated mea-
surement models. In the fifth section, an example shows how seemingly very
different experimental designs sometimes are essentially the same. Using a
numeric example, we demonstrate in the sixth section how repeated mea-
sures ANOVA models can be fitted using standard computer routines for
ANOVA models with no need for special capabilities for handling repeated
observations.
The first step in the analysis is to set up a model formula. For a two-way cross-classification, for example, the standard model formula is

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},$$

where $\mu$ is an overall constant, $\alpha_i$ and $\beta_j$ denote the main effects of the two factors, $(\alpha\beta)_{ij}$ represents the interaction effects between the two factors, and $\varepsilon_{ijk}$ is the residual term. Additional constraints are introduced to define the parameters uniquely. The most common constraints on the parameters are $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_i (\alpha\beta)_{ij} = \sum_j (\alpha\beta)_{ij} = 0$, and so forth. The standard hypothesis of interest is that a treatment is without any effect; that is, different levels of Factor A produce no shift in mean response; formally,
$$H_0\colon \alpha_1 = \alpha_2 = \cdots = \alpha_a, \qquad (1)$$
where a denotes the number of levels of Factor A. The hypotheses for the other main effects and for the interaction effects have the obvious analogous form.
After the model formula has been set up, the total sum of squares (SSTO) is split into parts, one for each group of effects; that is, we obtain SSA, which refers to the $\alpha_i$ effects of Factor A, SSB, which refers to the $\beta_j$ effects of Factor B, and so forth. These sums of squares have the characteristic that they add up to the total sum of squares; that is,

$$SSTO = SSA + SSB + SSAB + SSE.$$
These two steps are related to the principles presented earlier. Specifically,
the distinctions between block and treatment factors as well as between
crossed and nested factors determine the model formula used, which in turn determines how the total sum of squares is split into parts. The distinction between fixed and random factors determines how F ratios are built for testing
hypotheses.
If it is understood how the three principles are related to the most impor-
tant steps in the analysis of an experimental design, it becomes obvious how
seemingly different experimental designs are essentially identical.
We will not go into the details of how the three principles relate to the
actual calculations performed in analysis of variance, but we will show by an
example how repeated measurement designs can simply be considered ran-
dom effects or mixed models. This is done by showing that a three-factor par-
tially nested design with two fixed and one random factor is equivalent to a
two-factor repeated measurement design having repeated measures on one
factor. Before doing this, we will explain how repeated measurement designs
are generally related to random effects models.
The distinction between fixed and random effects is usually made at some point, meaning that in fixed-effects models, the specific factor levels for each factor were selected for good reasons by the experimenter, whereas in random-effects models, the actual factor levels applied to the experimental units are the result of random selection. Mixed models contain at least one random and one fixed factor. According to Eisenhart (1947),
the fixed-effects model is often called Model 1, whereas the random-effects
model is denoted as Model 2. It then appears natural to label the mixed model
Model 3. In random-effects models, the focus is usually not on whether there are differences between the treatment levels of a factor but on the extent to which this factor influences the variance of the responses relative to the total variability of the data. A simple example will illustrate this point.
Suppose we are interested in whether an experimenter has an impact on the responses of participants in psychological experiments. If he or she does, we certainly would like to estimate the magnitude of this influence. We could choose particular experimenters and let each experimenter direct a psychological experiment several times, using participants who were randomly assigned to each of the experimenters. To see whether the experimenters elicit, on average, different responses from the participants, we could perform a simple one-way ANOVA with experimenters as factor levels. But this procedure is not very useful for determining whether there exists, in general, an experimenter effect on the outcome of the experiment. The results of such an analysis merely indicate whether there are differences in the way the selected experimenters performed the experiment. Because it was assumed that the factor levels were selected intentionally, the description just given corresponds to a fixed-effects ANOVA model.
However, because interest does not always lie in differences between single individuals, it would be a better idea to select experimenters at random from the population of experimenters usually engaged in psychological experiments, or from a more limited population. A one-way analysis of variance can then be used to estimate a variance component, denoted $\sigma_\tau^2$, for the experimenter. This variance component estimates the variability of the response that is attributable to the experimenter. The ratio of this variance component to the total variance of the data, denoted $\mathrm{Var}(y_{ij}) = \sigma_y^2$, is an estimate of the relative amount of the influence of experimenters on the results of psychological experiments. This ratio is called the intraclass correlation coefficient $\rho$ (Neter et al., 1996),
$$\rho = \frac{\sigma_\tau^2}{\sigma_y^2}.$$

The underlying one-way random-effects model can be written as

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij},$$

where $\tau_i \sim N(0, \sigma_\tau^2)$, $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$, and $\tau_i$ and $\varepsilon_{ij}$ are independent random variables. It follows that
$$\mathrm{Var}(y_{ij}) = \sigma_\tau^2 + \sigma_\varepsilon^2,$$
$$\mathrm{Cov}(y_{ij}, y_{ij'}) = \sigma_\tau^2 \quad (j \neq j'),$$
$$\mathrm{Cov}(y_{ij}, y_{i'j'}) = 0 \quad (i \neq i').$$
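As a numeric illustration (with values chosen purely for illustration): if $\hat\sigma_\tau^2 = 4$ and $\hat\sigma_\varepsilon^2 = 12$, then $\hat\sigma_y^2 = 4 + 12 = 16$ and $\hat\rho = 4/16 = .25$; that is, an estimated 25% of the total response variance would be attributable to differences among experimenters.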
Collecting all observations in a single response vector, the covariance matrix of the data therefore has the block-diagonal form

$$V = \begin{pmatrix} \Sigma & & O \\ & \ddots & \\ O & & \Sigma \end{pmatrix}.$$

There are as many blocks, $\Sigma$, as there are experimenters, each block having the form

$$\Sigma = \begin{pmatrix}
\sigma_\tau^2 + \sigma_\varepsilon^2 & \sigma_\tau^2 & \cdots & \sigma_\tau^2 \\
\sigma_\tau^2 & \sigma_\tau^2 + \sigma_\varepsilon^2 & \cdots & \sigma_\tau^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_\tau^2 & \sigma_\tau^2 & \cdots & \sigma_\tau^2 + \sigma_\varepsilon^2
\end{pmatrix}.$$
The matrix Σ is called the equicovariance matrix for obvious reasons. Al-
though derived from the one-factor random effects model, this matrix also
represents the covariance structure that is assumed to be valid in repeated
measurement designs. Often, this is unrealistic because observations that are
closer in time are often more strongly correlated than observations farther
apart in time. Given these similarities between the two models, it appears natural to ask what else the two models have in common. Indeed, the two models are the same. To see this, only a slight change in perspective is necessary. The random-effects one-way ANOVA experiment can also be seen as a
repeated measures ANOVA. What might be confusing is that in the experi-
ment previously described, there are two kinds of individuals involved, ex-
perimenters and participants, but it is legitimate to view the responses of the
participants as repeated measures of the same experimenter.
Instead of looking at a one-way random effects model from a repeated
measures ANOVA perspective, the reverse change in perspective is also
interesting. Suppose we have a one-way repeated measures ANOVA; that is,
we have observed, say, n randomly selected participants. Looking at each
participant as the level of the random factor participants, we have just a
one-way ANOVA with a random factor.1
Once this is understood, it is not difficult to see the correspondence
between more complicated repeated measurement designs and the equivalent
random effects or mixed models. Because the partitioning of the total sum of squares into components relating to factors and their interactions depends only on the model formula and in no way on whether Model 1, 2, or 3 is under study, this viewpoint stresses the similarities among all ANOVA
models. The differences in calculations among the three different types of
models lie exclusively in the way the F ratios are built. How these F ratios
should be calculated depends on the expected values of the mean squares
resulting from the decomposition of the total sum of squares and can be found
for many different designs in standard textbooks on the analysis of variance.
Alternatively, these expected mean squares can be derived by using a set of
rules (see for example, Kirk, 1995; Neter et al., 1996; Winer, 1971).
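As an illustration, consider the expected mean squares for the design discussed below (see Table 1): both $E(MS_A) = \sigma_\varepsilon^2 + bn\sigma_\gamma^2 + bcn\sum_i \alpha_i^2/(a-1)$ and $E(MS_{C(A)}) = \sigma_\varepsilon^2 + bn\sigma_\gamma^2$ reduce to the same quantity under $H_0\colon \alpha_1 = \cdots = \alpha_a = 0$, so the appropriate test statistic for Factor A is $F_A = MS_A / MS_{C(A)}$ rather than $MS_A / MS_E$.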
Therefore, every computer program that allows the user to specify the
model formula of an ANOVA model can be applied to analyze complex
repeated measurement ANOVA models without entering special modules.
With this approach, F ratios are best calculated by hand because computer
programs routinely divide every mean square by the mean square for the
error. However, this is only correct if a Model 1–type experiment is analyzed.
Some computer programs allow each factor to be specified as fixed or random; see the numeric example in the following section. In this case, no
additional hand calculations are needed.
EXAMPLE
Neter et al. (1996) discuss the three-factor partially nested design. In this design, there are two crossed factors, say, A and B, and a third factor, C, is nested within Factor A. Nesting one factor within another will be denoted by C(A), meaning C is nested within A. Assuming a balanced design having n replications within each factor combination, the appropriate model formula for this design is (see Neter et al., 1996)

$$y_{ijkm} = \mu + \alpha_i + \beta_j + \gamma_{k(i)} + (\alpha\beta)_{ij} + (\beta\gamma)_{jk(i)} + \varepsilon_{ijkm}, \qquad (4)$$

with $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, c$, and $m = 1, \ldots, n$, where
• $\mu$ is an overall constant,
• $\alpha_i$ are fixed effects with $\sum_i \alpha_i = 0$,
• $\beta_j$ are fixed effects with $\sum_j \beta_j = 0$,
• $\gamma_{k(i)}$ are random effects with $\gamma_{k(i)} \sim N(0, \sigma_\gamma^2)$,
• $(\alpha\beta)_{ij}$ are fixed interaction effects with $\sum_i (\alpha\beta)_{ij} = 0$ for all $j$ and $\sum_j (\alpha\beta)_{ij} = 0$ for all $i$,
• $(\beta\gamma)_{jk(i)}$ are random effects with $\sum_j (\beta\gamma)_{jk(i)} = 0$ for all $k(i)$ and $(\beta\gamma)_{jk(i)} \sim N(0, \sigma_{\beta\gamma}^2)$, and
• $\varepsilon_{ijkm}$ are random error terms with $\varepsilon_{ijkm} \sim N(0, \sigma_\varepsilon^2)$.
Note that because $\gamma_{k(i)}$ is random, the interaction $(\beta\gamma)_{jk(i)}$ is considered random as well. Furthermore, it is assumed that all random terms in the model, that is, $\gamma_{k(i)}$, $(\beta\gamma)_{jk(i)}$, and $\varepsilon_{ijkm}$, are mutually independent. Table 1 lists the degrees of freedom and the expected mean squares for this design.

TABLE 1: Degrees of Freedom and Expected Mean Squares for the Three-Factor Partially Nested Design

Source    df                  Expected Mean Square
A         $a - 1$             $\sigma_\varepsilon^2 + bn\sigma_\gamma^2 + bcn \sum_i \alpha_i^2 / (a - 1)$
B         $b - 1$             $\sigma_\varepsilon^2 + n\sigma_{\beta\gamma}^2 + acn \sum_j \beta_j^2 / (b - 1)$
C(A)      $a(c - 1)$          $\sigma_\varepsilon^2 + bn\sigma_\gamma^2$
AB        $(a - 1)(b - 1)$    $\sigma_\varepsilon^2 + n\sigma_{\beta\gamma}^2 + cn \sum_i \sum_j (\alpha\beta)_{ij}^2 / [(a - 1)(b - 1)]$
BC(A)     $a(b - 1)(c - 1)$   $\sigma_\varepsilon^2 + n\sigma_{\beta\gamma}^2$
Error     $abc(n - 1)$        $\sigma_\varepsilon^2$

Now, we consider
the two-factor repeated measurement design with Factors A and B, where the
repeated measures are only on Factor B, as outlined in Neter et al. (1996). The
model formula given there is

$$y_{ijk} = \mu + \rho_{i(j)} + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \varepsilon_{ijk}, \qquad (5)$$

where $\rho_{i(j)}$ is the random effect of person $i$, nested within level $j$ of Factor A, $\alpha_j$ and $\beta_k$ are the fixed main effects of Factors A and B, respectively, and $\varepsilon_{ijk}$ is the residual term.
Before considering the usual ANOVA assumptions for this model, we just
relabel the subscripts i to k, j to i, and finally k to j. In addition, we change the
letter ρ to γ and arrange the Greek letters in alphabetical order. The model formula obtained in this way is

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{k(i)} + (\alpha\beta)_{ij} + \varepsilon_{ijk}. \qquad (6)$$
It should be noted that this model formula is now very similar to Equa-
tion 4. Specifically, except for the term (βγ)jk(i), the two model formulas are
already identical. Using only standard assumptions for the models, it is not
surprising that the assumptions given by Neter et al. (1996) for the repeated
measurement model are all identical to the assumptions given above, with the
only exception that assumptions involving (βγ)jk(i) are omitted. From the
model formula, it is apparent that each person is considered a level of a ran-
dom factor that enters in Equation 5 as ρi(j) and has been relabeled in Equation
6 to γk(i).
To resolve the remaining difference between the two models, it is worth-
while to look at how the total degrees of freedom in the partially nested mixed
model are partitioned. Table 1 lists the degrees of freedom that correspond to
each effect in the model as well as the corresponding expected mean squares.
TABLE 2: Degrees of Freedom and Expected Mean Squares for the Partially Nested Design With n = 1 and $\sigma_{\beta\gamma}^2 = 0$

Source    df                  Expected Mean Square
A         $a - 1$             $\sigma_\varepsilon^2 + b\sigma_\gamma^2 + bc \sum_i \alpha_i^2 / (a - 1)$
B         $b - 1$             $\sigma_\varepsilon^2 + ac \sum_j \beta_j^2 / (b - 1)$
C(A)      $a(c - 1)$          $\sigma_\varepsilon^2 + b\sigma_\gamma^2$
AB        $(a - 1)(b - 1)$    $\sigma_\varepsilon^2 + c \sum_i \sum_j (\alpha\beta)_{ij}^2 / [(a - 1)(b - 1)]$
Error     $a(b - 1)(c - 1)$   $\sigma_\varepsilon^2$
Recall that n denotes replications for each combination of the three treat-
ment factors. If the persons from the repeated measurement design are con-
sidered to be the levels of the random factor C(A), the number of replications
is one; that is, n = 1. It can now be seen from Table 1 that there are no degrees of freedom available for estimating the error variance $\sigma_\varepsilon^2$ from the within-cell residual variation, because $df_\varepsilon = abc(n - 1) = 0$. This situation is
also encountered in a simple two-factor cross-classification with a single
observation within each cell. In this situation, it is common to consider the
interaction mean square as an appropriate estimate of the error mean square,
assuming the interaction effect to be negligible, that is (αβ)ij = 0 for all i and j;
therefore, one uses this mean square in the denominator for the calculation of
F values.
The situation here is completely analogous. Now, we consider the mean square of the interaction effect BC(A) as an appropriate estimate of the error mean square; that is, we assume $\sigma_{\beta\gamma}^2 = 0$. From the expected mean square of BC(A) in Table 1, it can be seen that under this assumption the mean square for BC(A) is an appropriate estimate of the error variance $\sigma_\varepsilon^2$.
We now present Table 1 again under the assumption $\sigma_{\beta\gamma}^2 = 0$; that is, we simply omit $\sigma_{\beta\gamma}^2$. In addition, because the mean square for BC(A) can now be considered an estimate of the error variance, we relabel this line from BC(A) to Error. Finally, we drop n from the formulas because n = 1. These modifications yield Table 2, shown above.
Table 2 is now exactly the one presented by Neter et al. (1996) for the
repeated measurement design, except for some minor notational differences.
Again, only relabeling is necessary to obtain exact equivalence. We simply
back substitute the letter ρ for γ and replace the number of levels c of the third factor, that is, the number of persons in the repeated measurement design, with n. These substitutions yield Table 3.
TABLE 3: ANOVA Table for the Two-Way Repeated Measurement Design in the Notation of Neter, Kutner, Nachtsheim, and Wasserman (1996)

Source        df                  Expected Mean Square
A             $a - 1$             $\sigma_\varepsilon^2 + b\sigma_\rho^2 + bn \sum_i \alpha_i^2 / (a - 1)$
B             $b - 1$             $\sigma_\varepsilon^2 + an \sum_j \beta_j^2 / (b - 1)$
Subjects(A)   $a(n - 1)$          $\sigma_\varepsilon^2 + b\sigma_\rho^2$
AB            $(a - 1)(b - 1)$    $\sigma_\varepsilon^2 + n \sum_i \sum_j (\alpha\beta)_{ij}^2 / [(a - 1)(b - 1)]$
Error         $a(b - 1)(n - 1)$   $\sigma_\varepsilon^2$
NUMERIC EXAMPLE
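The statements described next might take the following form; this is a minimal sketch in which the DATA LIST step, the PRINT subcommand, and the exact arrangement of the data lines are our assumptions (only the first two and the last wide-format rows are shown, rearranged from the long-format listing given later in this section):

* Wide format: one row per participant, responses Y1 to Y3 under the three levels of B.
DATA LIST FREE / A Y1 Y2 Y3.
BEGIN DATA
1  958 1047  933
1 1005 1122  986
 :    :    :    :
2  375  436  351
END DATA.
MANOVA Y1 Y2 Y3 BY A(1,2)
 /WSFACTOR = B(3)
 /PRINT = SIGNIF(UNIV)
 /DESIGN.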
Note that the three dependent variables Y1, Y2, and Y3 correspond to the
three levels of Factor B. To emphasize this, we have labeled the factor in the
WSFACTOR statement B. These statements yield the following output.
We have edited the output by deleting all lines that refer to the multivariate tests that SPSS reports by default. Note also that we have omitted the column for the mean squares because it is redundant given the sums of squares and the degrees of freedom for each effect.
However, the same results can be obtained by introducing participants as a separate factor, denoted S in the following, that is nested within Factor A, and by using the mean squares of the resulting ANOVA table to calculate the appropriate F ratios by hand. Instead of a multivariate response, we now have only a univariate response variable Y in addition to the Factors A, B, and S. The statements for processing the data in SPSS are:
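The opening statements are sketched below; the free-format DATA LIST line is our assumption, with the variable order A, B, S, Y matching the data lines that follow:

DATA LIST FREE / A B S Y.
BEGIN DATA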
1 1 1 958
1 2 1 1047
1 3 1 933
1 1 2 1005
1 2 2 1122
1 3 2 986
1 1 3 351
: : : :
2 3 4 599
2 1 5 375
2 2 5 436
2 3 5 351
END DATA.
We can split up the total sum of squares by using an SPSS routine that allows the user to specify a model formula. We can do this by again using the MANOVA procedure together with the DESIGN statement, simply listing all the effects that are present in the model formula: A and B for the main effects, A BY B for the interaction, and S WITHIN A for the nested effect. The following statement will calculate all the necessary sums of squares:
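A plausible form of this statement (the level ranges follow from the data above; this is our reconstruction rather than verbatim syntax) is:

MANOVA Y BY A(1,2) B(1,3) S(1,5)
 /DESIGN = A, B, A BY B, S WITHIN A.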
Note that although we again use the MANOVA statement, the syntax used
corresponds to the syntax used with other ANOVA modules in SPSS, for in-
stance, the ANOVA procedure. The output of the MANOVA procedure is
(again omitting mean squares):
Source of Variation        SS            df         F     Sig of F
Residual                5727.47          16
Constant            13248136.53           1  37009.41         .000
A                     168150.53           1    469.74         .000
B                      67073.07           2     93.69         .000
A by B                   391.47           2       .55         .589
S within A           1833680.93           8    640.31         .000
Not all of these F ratios are appropriate, however, as can be seen from the expected values of the mean squares in Table 3. To test for main effects of Factor B and for interaction effects between Factors A and B, one calculates $F_B = MS_B / MS_E$ and $F_{AB} = MS_{AB} / MS_E$, respectively. Therefore, the F values for the B and AB effects are the same as the F values of the MANOVA analysis
using the WSFACTOR statement from above. However, to test for main ef-
fects of Factor A, the appropriate F ratio is $F_A = MS_A / MS_{S(A)} = (SS_A / SS_{S(A)})(df_{S(A)} / df_A)$. Performing the calculation by hand yields $F_A = (168{,}150.53 / 1{,}833{,}680.93)(8 / 1) = 0.7336$, which corresponds to the value 0.73 (rounded
by SPSS to two decimal places) in the output for the analysis employing the
WSFACTOR statement. It becomes obvious that for the two-way repeated
measures design, the only adjustment to the ANOVA table that has to be
made is the calculation of the F ratio for testing the main effect of Factor A.
Sometimes, computer procedures allow users to specify the denominator that should be used when calculating F values. In SPSS, the DESIGN statement of the MANOVA procedure is capable of doing this. Here, we give only the appropriate design statement (see the manuals for details):
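One way to write this, using the classic MANOVA convention of assigning an error-term number to an effect and then testing against it (our reconstruction; consult the SPSS manuals for the exact syntax), is:

MANOVA Y BY A(1,2) B(1,3) S(1,5)
 /DESIGN = B, A BY B, S WITHIN A = 1, A VS 1.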
This yields the following output now containing the correct F value for
testing the A main effect:
Source of Variation        SS            df         F     Sig of F
Residual                5727.47          16
Constant            13248136.53           1  37009.41         .000
B                      67073.07           2     93.69         .000
A by B                   391.47           2       .55         .589
S within A (error 1) 1833680.93           8    640.31         .000
Error 1              1833680.93           8
A                     168150.53           1       .73         .417
DISCUSSION
In explaining how the structure of an experimental design determines the model formula, we referred to the way in which ANOVA models are usually presented and explained in textbooks. For instance, if Factor B is nested
within Factor A, the interaction between these two factors is not included in
the model formula. Likewise, the model formula for a randomized complete
block design does not contain effects for the interaction between blocks and
treatments if there are no replications for each block-treatment combination,
as is usually the case.
Therefore, the treatment structure together with the block structure of an
experimental design readily leads to a standard model formula, which is used
to partition the total sums of squares. For this reason, the presentation of
model formulas in experimental designs is sometimes completely omitted in
textbooks and the discussion focuses only on building F ratios for testing
standard hypotheses such as the one given in Equation 1.
This practice has two serious disadvantages. First, seeing ANOVA merely
from a hypothesis-testing perspective can lead to a superficial understanding
of the statistical technique. It becomes difficult to see how designs with dif-
ferent labels are sometimes related to each other. Second, when dealing with
more complex situations—for instance, designs having many factors and/or
unbalanced designs—the modeling perspective on ANOVA is, in our opin-
ion, more sensible because it encourages formulating a model for the data.
Substantive considerations can be taken into account. For instance, if factors
do not have equal status in an experiment or an observational study, it might
be worthwhile fitting a hierarchical analysis of variance for unbalanced data.
If many factors are involved, it might be known, perhaps from earlier research
or pilot studies, that certain interactions between factors are negligible. In
these cases, it would be unreasonable to rely on a standard model including
all interactions.
In addition, going back to basic principles when studying ANOVA models
naturally leads to thinking about designed experiments when prior knowl-
edge concerning the effects is available. For instance, knowing that a specific
interaction effect is negligible in a factorial experiment, it is possible to
design the experiment such that this interaction is confounded with other
effects, typically the error. This in turn leads to a considerable reduction in the
number of experimental units needed for testing effects of interest without
losing power (see for example, Box et al., 1978). Therefore, going back to
basic principles in the analysis of experimental designs instead of using
standard models in a cookbook-like fashion can be expected to improve the
analyses performed as well as deepen one's understanding of this statistical technique.
NOTE
1. Considering participants as levels of a random factor has also been proposed by, for in-
stance, Morrison (1990) and Hertzog and Rovine (1985).
REFERENCES
Box, G. E. P., Hunter, W. G., & Hunter, J. S. (1978). Statistics for experimenters. New York: John Wiley.
Eisenhart, C. (1947). The assumptions underlying the analysis of variance. Biometrics, 3, 1-21.
Hertzog, C., & Rovine, M. (1985). Repeated-measures analysis of variance in developmental
research: Selected issues. Child Development, 56, 787-809.
Kirk, R. E. (1995). Experimental design: Procedures for the behavioral sciences (3rd ed.).
Pacific Grove, CA: Brooks/Cole.
Milliken, G. A., & Johnson, D. E. (1992). Analysis of messy data: Vol. 1. Designed experiments.
London: Chapman & Hall.
Morrison, D. F. (1990). Multivariate statistical methods. New York: McGraw-Hill.
Neter, J., Kutner, M. H., Nachtsheim, C. J., & Wasserman, W. (1996). Applied linear statistical
models (4th ed.). Burr Ridge, IL: Irwin.
Winer, B. J. (1971). Statistical principles in experimental design (2nd ed.). New York:
McGraw-Hill.