Analyzing Linear Models With Proc MIXED
http://www.uky.edu/ComputingCenter/SSTARS/mixed1.htm 1/20/2005

Introduction to PROC MIXED


Table of Contents

1. Short description of methods of estimation used in PROC MIXED
2. Description of the syntax of PROC MIXED
3. References
4. Examples and comparisons of results from MIXED and GLM
   - balanced data: fixed effect model and mixed effect model
   - unbalanced data: mixed effect model

1. Short description of methods of estimation used in PROC MIXED.

The SAS procedures GLM and MIXED can both be used to fit linear models. PROC GLM was designed to
fit fixed effect models and was later amended to fit some random effect models through the RANDOM
statement with the TEST option. The REPEATED statement in PROC GLM makes it possible to estimate and
test repeated measures models with an arbitrary correlation structure for the repeated observations.

PROC MIXED was designed specifically to fit mixed effect models. It can model random and mixed effect
data, repeated measures, spatial data, data with heterogeneous variances and autocorrelated
observations. The MIXED procedure is more general than GLM in the sense that it gives the user more
flexibility in specifying correlation structures, which is particularly useful in repeated measures and
random effect models. It has to be emphasized, however, that PROC MIXED is not an extended, more
general version of GLM. The two procedures are based on different statistical principles and use
different estimation methods.

GLM uses ordinary least squares (OLS) estimation; that is, the parameter estimates are the values of
the model parameters that minimize the sum of squared differences between the observed and predicted
values of the dependent variable. That approach leads to the familiar analysis of variance table, in
which the variability in the dependent variable (the total sum of squares) is divided into
variabilities due to different sources (sums of squares for the effects in the model). PROC MIXED does
not produce an analysis of variance table, because it uses estimation methods based on different
principles.

PROC MIXED has three options for the method of estimation: ML (maximum likelihood), REML (restricted
or residual maximum likelihood, which is the default method) and MIVQUE0 (minimum variance quadratic
unbiased estimation). ML and REML are based on the maximum likelihood approach. They require the
assumption that the distribution of the dependent variable (the error term and the random effects) is
normal. ML is the regular maximum likelihood method; that is, the parameter estimates it produces are
the values of the model parameters that maximize the likelihood function. REML is a variant of maximum
likelihood estimation: REML estimators are obtained by maximizing not the whole likelihood function,
but only the part that is invariant to the fixed effects part of the linear model. In other words, if
y = Xb + Zu + e, where Xb is the fixed effects part, Zu is the random effects part and e is the error
term, then the REML estimates are obtained by maximizing the likelihood function of K'y, where K is a
full rank matrix with columns orthogonal to the columns of the X matrix, that is, K'X = 0. This leads
to the REML estimator of the variance-covariance matrix of y, say V, which does not depend on the
choice of the matrix K. Then the generalized least squares equations, known also from the weighted
least squares approach and the GLM procedure,

X'(inverse of V)Xb=X'(inverse of V)y,

where V is replaced with its estimator, are solved to obtain the estimates of fixed effects parameters b.

It is assumed that the random effects u and the error vector e are normally distributed, uncorrelated and
have expectations 0. Under the assumption that u and e are not correlated, V, the variance-covariance
matrix of y, is equal to ZGZ’ + R, where G and R are the variance matrices of u and e, respectively.
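
The pieces described above can be collected in one place. The following is only a compact restatement in the notation of the text (y, X, b, Z, u, e, G, R, V, K); the hats, which mark estimated quantities, are an added convention and do not appear in the original.

```latex
% Symbols follow the text; hats (estimates) are an added notational assumption.
\begin{align*}
y &= Xb + Zu + e, \qquad u \sim N(0, G), \quad e \sim N(0, R), \quad \operatorname{Cov}(u, e) = 0,\\
V &= \operatorname{Var}(y) = ZGZ' + R,\\
\text{REML:}\quad & \text{maximize the likelihood of } K'y \text{ with } K'X = 0
   \;\Rightarrow\; \hat{G},\ \hat{R},\ \hat{V} = Z\hat{G}Z' + \hat{R},\\
\text{GLS:}\quad & X'\hat{V}^{-1}X\,\hat{b} = X'\hat{V}^{-1}y .
\end{align*}
```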

Estimators of V, the variance-covariance matrix of y, can also be obtained in PROC MIXED by the
MIVQUE0 method. For a short description of the method see reference (3), p. 506. This method has two
advantages over ML and REML: it does not require the normality assumption (for computing the
estimators), and it does not involve iterations. However, simulation studies by
Swallow and Monahan (1984) present evidence favoring ML and REML over MIVQUE0. PROC
MIXED uses the MIVQUE0 estimates as starting values for the ML and REML procedures.

For balanced data the REML method of PROC MIXED provides estimators and hypothesis test results
that are identical to ANOVA (the OLS method of GLM), provided that the ANOVA estimators of the variance
components are not negative. The estimators, as in GLM, are unbiased and have minimum variance
properties. The ML estimators are biased in that case. In the general case of unbalanced data neither
the ML nor the REML estimators are unbiased, and they do not have to be equal to those obtained from
PROC GLM. There are many models involving forms of the variance-covariance structure of observations
that cannot be analyzed using PROC GLM with the RANDOM statement and TEST option or with the REPEATED
statement. PROC MIXED can handle such cases. It also has to be mentioned that PROC GLM was designed
for the analysis of fixed effects models, and all of its computations are done under the assumption
that there is only one variance component in the model, the error term. The RANDOM statement with the
TEST option can be used to get the right tests when random effects are present in the model, but some
printed results, such as variances and standard errors, will still be incorrect.

2. Description of the syntax of PROC MIXED

The PROC MIXED syntax is similar to the syntax of PROC GLM. There are, however, a few
important differences:

- The RANDOM and REPEATED statements are used differently.
- Random effects are not listed in the MODEL statement.
- GLM has MEANS and LSMEANS statements, whereas MIXED has only the LSMEANS statement.
- GLM offers Type I, II, III and IV tests for fixed effects, while MIXED offers Type I and Type III.

The following is the general form of the PROC MIXED statements:

PROC MIXED options;
CLASS variable-list;
MODEL dependent = fixed effects / options;
RANDOM random effects / options;
REPEATED repeated effects / options;
CONTRAST 'label' fixed-effect values | random-effect values / options;
ESTIMATE 'label' fixed-effect values | random-effect values / options;
LSMEANS fixed-effects / options;
MAKE 'table' OUT= SAS-data-set < options >;
RUN;

The CONTRAST, ESTIMATE, LSMEANS, MAKE and RANDOM statements can appear multiple
times; all other statements can appear only once.

The PROC MIXED and MODEL statements are required. The MODEL statement must appear after the
CLASS statement if a CLASS statement is used. The CONTRAST, ESTIMATE, LSMEANS, RANDOM
and REPEATED statements must follow the MODEL statement. The CONTRAST and ESTIMATE
statements must follow the RANDOM statement if a RANDOM statement is used.

A detailed description of all functions and options of each PROC MIXED statement is given in
SAS/STAT Software Changes and Enhancements through Release 6.11 and SAS/STAT Software Changes
and Enhancements for Release 6.12, SAS Institute Inc. (1996). The following is a short summary of
selected, most often used, MIXED procedure statements.

PROC MIXED <options>;

Selected options:

DATA= SAS data set

Names the SAS data set to be used by PROC MIXED. The default is the most recently created data set.

METHOD=REML
METHOD=ML
METHOD=MIVQUE0

Specifies the estimation method. See Section 1 for a brief description of the methods and references.
REML is the default method.
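
As an illustration of the METHOD= option, the same model can be fitted under each of the three estimation methods and the resulting covariance parameter estimates compared. This is only a sketch: the data set MYDATA and the variables y, group and subject are hypothetical placeholders, not part of the examples in this document.

```sas
/* Hypothetical data set MYDATA: response y, fixed factor group,
   random factor subject nested within group. */
proc mixed data=mydata method=reml covtest;    /* REML, the default */
  class group subject;
  model y = group;
  random subject(group);
run;

proc mixed data=mydata method=ml covtest;      /* ordinary maximum likelihood */
  class group subject;
  model y = group;
  random subject(group);
run;

proc mixed data=mydata method=mivque0;         /* non-iterative MIVQUE0 estimates */
  class group subject;
  model y = group;
  random subject(group);
run;
```

For balanced data one would expect the ML estimate of the subject variance to be somewhat smaller than the REML estimate, in line with the bias noted in Section 1.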

COVTEST

Prints asymptotic standard errors and Wald Z-tests for the variance-covariance structure parameter
estimates. For example, if a random effect A is included in the model, then the estimate of the variance
of A is printed together with the Wald test of the hypothesis that the variance of A is 0.
The COVTEST option is specified in the PROC MIXED statement, before the semicolon. For example,

Proc mixed data=mydata method=reml covtest;

CLASS variables;

Lists classification variables (categorical independent variables in the model). For example:

proc mixed data=mydata covtest;
Class group gender agecat;

MODEL dependent = fixed effects </options>;

The MODEL statement names a single dependent variable and the fixed effects, that is, independent
variables that are not random. An intercept is included in the model by default. The NOINT option can
be used to remove the intercept.

NOTE: Even though PROC MIXED allows only for one dependent variable in the model statement, it is
possible to use it to model, for example, multivariate repeated measures. In such case, the data set has to
be properly prepared and should contain a variable indicating the measurement type. The correlation
between observations on the same unit has to be modeled properly with the REPEATED statement. For
example, suppose your observed data consist of heights and weights of children measured over several
successive years. Your input data set should then contain variables similar to the following:

Y, all of the heights and weights, with a separate observation (line in the data file) for each measurement
VAR, indicating whether the measurement is a height or a weight
YEAR, indicating the year of measurement
CHILD, indicating the child on which the measurement was taken.
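
The data layout described above can be sketched as follows. The input data set WIDE (one row per child and year, with variables child, year, height and weight) is hypothetical, and the TYPE=UN@AR(1) Kronecker-product structure is assumed to be available in the release of PROC MIXED being used; it is shown only as one possible way of modeling the correlation between observations on the same child.

```sas
/* Hypothetical input: data set WIDE with variables child, year, height, weight. */
data long;
  set wide;
  length var $ 6;
  var = 'height'; y = height; output;   /* one observation per measurement */
  var = 'weight'; y = weight; output;
  keep child year var y;
run;

proc sort data=long;
  by child var year;
run;

/* Unstructured covariance across VAR, AR(1) across YEAR, within each child.
   Assumes a release that supports TYPE=UN@AR(1). */
proc mixed data=long;
  class var year child;
  model y = var year var*year;
  repeated var year / type=un@ar(1) subject=child;
run;
```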

Selected Options of the model statement:

CHISQ, requests that χ2 tests (Wald tests) be performed for all fixed effects in addition to the F-tests.

DDFM=RESIDUAL
DDFM=CONTAIN
DDFM=BETWITHIN
DDFM=SATTERTH

The DDFM= option specifies the method for computing the denominator degrees of freedom for the
tests of fixed effects. DDFM=SATTERTH requests the Satterthwaite approximation for the
denominator degrees of freedom. For balanced designs with random effects it produces the same test
results as the RANDOM … / TEST option in PROC GLM (if the default METHOD=REML is used in PROC
MIXED).

P, requests that the predicted values be printed.

RANDOM random effects </options>;

The RANDOM statement defines the random effects in the model. It can be used to specify traditional
variance components (independent random effects with different variances) or to list correlated random
effects and specify a correlation structure for them with the TYPE=covariance-structure option. A
variety of structures are available (see references 5 and 6). The most often used are TYPE=VC, a
variance components structure, and TYPE=UN, an unstructured (that is, arbitrary) covariance
matrix. TYPE=VC is the default structure. In the following example, the effect of subject is random.

Proc mixed data=one method=reml covtest;
Class gender treat subject;
Model y=gender treat gender*treat /ddfm=satterth;
Random subject(gender);
Run;

In the next example two correlated random effects are specified (besides the error term). The intercept
and the slope coefficient in the regression equation each have a fixed part and a random part, and the
random parts are assumed to be correlated. The model is:
yij = a0 + aj + b0*time + bj*time + eij, where yij is observation i for person j.
The random effects aj and bj and the error eij are assumed to have normal distributions with mean zero
and different variances, and aj and bj are assumed to be correlated.

Proc mixed data=one method=reml covtest;
Class person;
Model y=time /solution;
Random intercept time /type=un subject=person;
Run;

REPEATED repeated effects / options;

The repeated statement is used in PROC MIXED to specify the covariance structure of the error term.
The repeated effect has to be categorical and has to appear in the class statement and the data has to be
sorted accordingly. For example, suppose that for each subject a measurement was taken at five equally
spaced time points. The time is the repeated effect and the data has to be sorted by subject and time
within each subject. If time is also used as a continuous independent variable in the model then a new
variable, say t, identical to time has to be defined and t should be used in the class and repeated
statements. For example:

Data one;
Set one;
T=time;
Run;
Proc sort data=one;
By group id t;
Run;
Proc mixed data=one covtest;
Class t group id;
Model y=group time group*time;
Repeated t /type=ar(1) subject=id;
Run;

The option TYPE in the REPEATED statement specifies the type of the error correlation structure. The
one specified in the above example is the first-order autoregressive correlation. The subject option is
needed to identify observations that are correlated. Observations within the same subject are correlated
with the type of correlation specified in TYPE, observations from different subjects are independent.
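
To make the first-order autoregressive structure concrete, the error covariance matrix it implies for one subject with five equally spaced time points can be written out. The symbols R_i (covariance matrix of the errors of subject i), σ² (error variance) and ρ (correlation between adjacent time points) are notation introduced here for illustration.

```latex
% R_i: covariance matrix of the five errors of one subject under TYPE=AR(1);
% \sigma^2 and \rho are the two covariance parameters (notation assumed here).
\[
R_i = \sigma^2
\begin{pmatrix}
1 & \rho & \rho^2 & \rho^3 & \rho^4\\
\rho & 1 & \rho & \rho^2 & \rho^3\\
\rho^2 & \rho & 1 & \rho & \rho^2\\
\rho^3 & \rho^2 & \rho & 1 & \rho\\
\rho^4 & \rho^3 & \rho^2 & \rho & 1
\end{pmatrix}
\]
```

The full matrix R for all observations is block diagonal, with one such block per subject, which is how observations from different subjects remain independent.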

The TYPE option allows for many types of correlation structures. The most commonly used are
autoregressive, compound symmetry, Huynh-Feldt, Toeplitz, variance components, unstructured and
spatial. For the complete list and examples, see references (7) and (8).

CONTRAST 'label' fixed-effect values | random-effect values / options;

ESTIMATE 'label' fixed-effect values | random-effect values / options;

The CONTRAST statement is used when custom hypothesis tests are needed, and the ESTIMATE
statement when custom estimates are needed. Although they were extended in PROC MIXED to
include random effects, their use is very similar to that of the CONTRAST and ESTIMATE statements
in PROC GLM.

LABEL is required for every contrast or estimate statement. It identifies the contrast or estimated
parameter on the output. It can not be longer than 20 characters.

FIXED-EFFECT is the name of an effect appearing in the MODEL statement.

RANDOM-EFFECT is the name of an effect appearing in the RANDOM statement.

VALUES are the coefficients of the contrast to be tested or the parameter to be estimated.

For example, suppose that we want to test if there is a significant effect of treat in group 2, where treat
has three levels and group four levels. We also want to estimate the mean for treat 1 in group 2, the
mean for treat 2 in group 2 and the difference between these two means. We will need the following
CONTRAST and ESTIMATE statements to obtain these results.

Proc mixed data=one method=reml covtest;
Class group treat subject;
Model y=group treat group*treat /ddfm=satterth;
Random subject(group);
Contrast 'treat in group 2'
Treat 1 -1 0 group*treat 0 0 0 1 -1 0 0 0 0 0 0 0,
Treat 0 1 -1 group*treat 0 0 0 0 1 -1 0 0 0 0 0 0;
Estimate 'treat1 group2 mean' intercept 1 group 0 1 0 0 treat 1 0 0
group*treat 0 0 0 1 0 0 0 0 0 0 0 0;
Estimate 'treat2 group2 mean' intercept 1 group 0 1 0 0 treat 0 1 0
Group*treat 0 0 0 0 1 0 0 0 0 0 0 0;
Estimate 'mean diff t1g2-t2g2' Treat 1 -1 0 group*treat 0 0 0 1 -1 0 0 0 0 0 0 0;
Run;

LSMEANS fixed-effects / options;

LSMEANS computes the least squares means of fixed effects. The ADJUST option requests a multiple
comparison adjustment to the p-values for pair-wise comparisons of means. The following adjustments
are available: BON (Bonferroni), DUNNETT, SCHEFFE, SIDAK, SIMULATE, SMM|GT2 and TUKEY.
The ADJUST option results in all possible pair-wise comparisons. If only comparisons with a control
level are needed, then PDIFF=CONTROL should be used in addition to the ADJUST option. The SLICE option
makes it possible to test the significance of one effect at each level of another effect.

For example, suppose that we want to compute the least squares means for group*treat and do pair-wise
comparisons with the control being group 1 and treat 1. We also want to test for the significance of the
treat effect within each group level using the SLICE option.

Proc mixed data=one method=reml covtest;
Class group treat subject;
Model y=group treat group*treat /ddfm=satterth;
Random subject(group);
lsmeans group*treat /adjust=bon pdiff=control('1' '1') slice=group;
Run;

MAKE 'table' OUT= SAS-data-set < options >;

The MAKE statement converts any table produced by PROC MIXED into a SAS data set. The NOPRINT
option can be used to prevent printing of the requested table. Only requested or default output can be
converted into a SAS data set. Hence, in particular, the P option has to be used in the MODEL statement
to produce a data set with predicted values, and the LSMEANS statement has to be included to output
least squares means. For example,

Proc mixed data=one method=reml covtest;
Class group treat subject;
Model y=group treat group*treat /ddfm=satterth p;
Random subject(group);
lsmeans group*treat /adjust=bon pdiff=control('1' '1') slice=group;
make 'LSMeans' out=gtmeans;
make 'predicted' out=pred noprint;
Run;
Proc print data=gtmeans;
Run;
Proc print data=pred;
Run;
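
In later SAS releases that provide the Output Delivery System (ODS), the same two data sets can be obtained without the MAKE statement. The following is a sketch under that assumption, not part of the Release 6.12 syntax described in this document; it uses the ODS table name LSMeans and the OUTP= option of the MODEL statement, and keeps the illustrative data set names gtmeans and pred.

```sas
/* Assumes a SAS release with ODS; MAKE is not used here. */
ods output LSMeans=gtmeans;              /* capture the Least Squares Means table */
proc mixed data=one method=reml covtest;
  class group treat subject;
  model y = group treat group*treat / ddfm=satterth outp=pred;  /* predicted values go to PRED */
  random subject(group);
  lsmeans group*treat / adjust=bon pdiff=control('1' '1') slice=group;
run;

proc print data=gtmeans;
run;
proc print data=pred;
run;
```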

3. References

Statistics Books:

1. Searle, Shayle R. (1987). Linear Models For Unbalanced Data, John Wiley & Sons.

2. Searle, Shayle R. (1971). Linear Models, John Wiley & Sons.

3. Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components, John Wiley & Sons.

4. Verbeke, G., Molenberghs, G. (Editors) (1997). Linear Mixed Models in Practice: A SAS-Oriented
Approach, Springer-Verlag.

SAS Institute Books:

5. Littell, Ramon C., Milliken, George A., Stroup, Walter W., Wolfinger, Russell D. (1996). SAS
System For Mixed Models, SAS Institute Inc.

6. SAS Institute Course Notes (1996). Advanced General Linear Models with an Emphasis on Mixed
Models, SAS Institute Inc.

7. SAS/STAT Software Changes and Enhancements through Release 6.11, SAS Institute Inc. 1996.

8. SAS/STAT Software Changes and Enhancements for Release 6.12, SAS Institute Inc. 1996.

4. Examples and comparisons of the results from PROC MIXED and PROC GLM.

Example 1. Fixed effect model, balanced data.

In this example, 36 subjects are randomly assigned to 12 group – treatment combinations, 3 to each
combination. There are three treatments and four groups. In the following program, factor treat with 3
levels is the effect of the treatment and factor group with 4 levels is the effect of the group.

As you can see below, the results from both procedures are identical.

Program:

options ls=76;
data one;
input y group treat subject;
cards;
22 1 1 1
23 1 1 2
25 1 1 3
17 1 2 4
18 1 2 5
23 1 2 6
12 1 3 7
16 1 3 8
14 1 3 9
8 2 1 10
9 2 1 11
10 2 1 12
16 2 2 13
17 2 2 14
20 2 2 15
29 2 3 16
30 2 3 17
36 2 3 18
3 3 1 19
7 3 1 20
5 3 1 21
1 3 2 22
2 3 2 23
1 3 2 24
4 3 3 25
7 3 3 26
8 3 3 27
11 4 1 28
15 4 1 29
8 4 1 30
34 4 2 31
37 4 2 32
33 4 2 33
27 4 3 34
28 4 3 35
24 4 3 36
;
run;
Proc mixed data=one method=reml;
Class group treat;
Model y=group treat group*treat;
lsmeans group*treat /adjust=bon pdiff=control('1' '1') slice=group;
Contrast 'treat in group 2'
Treat 1 -1 0 group*treat 0 0 0 1 -1 0 0 0 0 0 0 0,
Treat 0 1 -1 group*treat 0 0 0 0 1 -1 0 0 0 0 0 0;
Estimate 'treat1 group2 mean' intercept 1 group 0 1 0 0 treat 1 0 0
group*treat 0 0 0 1 0 0 0 0 0 0 0 0;
Estimate 'treat2 group2 mean' intercept 1 group 0 1 0 0 treat 0 1 0
Group*treat 0 0 0 0 1 0 0 0 0 0 0 0;
Estimate 'mean diff t1g2-t2g2' Treat 1 -1 0 group*treat 0 0 0 1 -1 0 0 0 0 0 0 0;
Run;

proc GLM data=one;
class group treat;
Model y=group treat group*treat;
lsmeans group*treat /adjust=bon pdiff=control('1' '1') slice=group;
Contrast 'treat in group 2'
Treat 1 -1 0 group*treat 0 0 0 1 -1 0 0 0 0 0 0 0,
Treat 0 1 -1 group*treat 0 0 0 0 1 -1 0 0 0 0 0 0;
Estimate 'treat1 group2 mean' intercept 1 group 0 1 0 0 treat 1 0 0
group*treat 0 0 0 1 0 0 0 0 0 0 0 0;
Estimate 'treat2 group2 mean' intercept 1 group 0 1 0 0 treat 0 1 0
Group*treat 0 0 0 0 1 0 0 0 0 0 0 0;
Estimate 'mean diff t1g2-t2g2' Treat 1 -1 0 group*treat 0 0 0 1 -1 0 0 0 0 0 0 0;
Run;

Results:

The MIXED Procedure

GROUP 4 1 2 3 4
TREAT 3 1 2 3

Tests of Fixed Effects

Source         NDF   DDF   Type III F   Pr > F

GROUP            3    24       121.60   0.0001
TREAT            2    24        34.11   0.0001
GROUP*TREAT      6    24        43.04   0.0001

ESTIMATE Statement Results

Parameter                 Estimate    Std Error   DF       t   Pr > |t|

treat1 group2 mean      9.00000000   1.35400640   24    6.65     0.0001
treat2 group2 mean     17.66666667   1.35400640   24   13.05     0.0001
mean diff t1g2-t2g2    -8.66666667   1.91485422   24   -4.53     0.0001

CONTRAST Statement Results

Source NDF DDF F Pr > F

treat in group 2 2 24 71.35 0.0001

Least Squares Means

Effect        GROUP  TREAT        LSMEAN    Std Error

GROUP*TREAT   1      1       23.33333333   1.35400640
GROUP*TREAT   1      2       19.33333333   1.35400640
GROUP*TREAT   1      3       14.00000000   1.35400640
GROUP*TREAT   2      1        9.00000000   1.35400640
GROUP*TREAT   2      2       17.66666667   1.35400640
GROUP*TREAT   2      3       31.66666667   1.35400640
GROUP*TREAT   3      1        5.00000000   1.35400640
GROUP*TREAT   3      2        1.33333333   1.35400640
GROUP*TREAT   3      3        6.33333333   1.35400640
GROUP*TREAT   4      1       11.33333333   1.35400640
GROUP*TREAT   4      2       34.66666667   1.35400640
GROUP*TREAT   4      3       26.33333333   1.35400640

Differences of Least Squares Means

Effect        GROUP  TREAT  _GROUP  _TREAT     Difference    Std Error   DF

GROUP*TREAT   1      2      1       1         -4.00000000   1.91485422   24
GROUP*TREAT   1      3      1       1         -9.33333333   1.91485422   24
GROUP*TREAT   2      1      1       1        -14.33333333   1.91485422   24
GROUP*TREAT   2      2      1       1         -5.66666667   1.91485422   24
GROUP*TREAT   2      3      1       1          8.33333333   1.91485422   24
GROUP*TREAT   3      1      1       1        -18.33333333   1.91485422   24
GROUP*TREAT   3      2      1       1        -22.00000000   1.91485422   24
GROUP*TREAT   3      3      1       1        -17.00000000   1.91485422   24
GROUP*TREAT   4      1      1       1        -12.00000000   1.91485422   24
GROUP*TREAT   4      2      1       1         11.33333333   1.91485422   24
GROUP*TREAT   4      3      1       1          3.00000000   1.91485422   24

Differences of Least Squares Means

      t   Pr > |t|   Adjustment    Adj P

  -2.09     0.0475   Bonferroni   0.5224
  -4.87     0.0001   Bonferroni   0.0006
  -7.49     0.0001   Bonferroni   0.0000
  -2.96     0.0068   Bonferroni   0.0752
   4.35     0.0002   Bonferroni   0.0024
  -9.57     0.0001   Bonferroni   0.0000
 -11.49     0.0001   Bonferroni   0.0000
  -8.88     0.0001   Bonferroni   0.0000
  -6.27     0.0001   Bonferroni   0.0000
   5.92     0.0001   Bonferroni   0.0000
   1.57     0.1303   Bonferroni   1.0000

Tests of Effect Slices

Effect        GROUP   NDF   DDF       F   Pr > F

GROUP*TREAT   1         2    24   11.96   0.0002
GROUP*TREAT   2         2    24   71.35   0.0001
GROUP*TREAT   3         2    24    3.66   0.0411
GROUP*TREAT   4         2    24   76.26   0.0001

General Linear Models Procedure

Class Level Information

GROUP 4 1 2 3 4
TREAT 3 1 2 3

General Linear Models Procedure

Dependent Variable: Y
                              Sum of          Mean
Source             DF        Squares        Square   F Value   Pr > F

Model              11     3802.00000     345.63636     62.84   0.0001
Error              24      132.00000       5.50000
Corrected Total    35     3934.00000

R-Square        C.V.    Root MSE    Y Mean

0.966446    14.07125     2.34521   16.6667

Source         DF   Type III SS   Mean Square   F Value   Pr > F

GROUP           3    2006.44444     668.81481    121.60   0.0001
TREAT           2     375.16667     187.58333     34.11   0.0001
GROUP*TREAT     6    1420.38889     236.73148     43.04   0.0001

General Linear Models Procedure

Least Squares Means
Adjustment for multiple comparisons: Bonferroni

GROUP  TREAT            Y      Pr > |T| H0:
                   LSMEAN   LSMEAN=CONTROL

1      1       23.3333333
1      2       19.3333333           0.5224
1      3       14.0000000           0.0006
2      1        9.0000000           0.0001
2      2       17.6666667           0.0752
2      3       31.6666667           0.0024
3      1        5.0000000           0.0001
3      2        1.3333333           0.0001
3      3        6.3333333           0.0001
4      1       11.3333333           0.0001
4      2       34.6666667           0.0001
4      3       26.3333333           1.0000

GROUP*TREAT Effect Sliced by GROUP for Y

                  Sum of          Mean
GROUP   DF       Squares        Square   F Value   Pr > F

1        2    131.555556     65.777778   11.9596   0.0002
2        2    784.888889    392.444444   71.3535   0.0001
3        2     40.222222     20.111111    3.6566   0.0411
4        2    838.888889    419.444444   76.2626   0.0001

Dependent Variable: Y

Contrast DF Contrast SS Mean Square F Value Pr > F

treat in group 2 2 784.888889 392.444444 71.35 0.0001

                                      T for H0:               Std Error of
Parameter                Estimate   Parameter=0   Pr > |T|       Estimate

treat1 group2 mean      9.0000000          6.65     0.0001     1.35400640
treat2 group2 mean     17.6666667         13.05     0.0001     1.35400640
mean diff t1g2-t2g2    -8.6666667         -4.53     0.0001     1.91485422

Example 2. Mixed effect model, balanced data.

In this example, 12 subjects are randomly assigned to 4 groups, 3 to each group. There are three
observations for each subject corresponding to measurements taken at time 1, 2 and 3. In the following
program, factor time with 3 levels is the effect of the time and factor group with 4 levels is the effect of
the group.

A mixed effect model with fixed effects of group and time and a random effect of subject will be used to
analyze the data. It is assumed that the effect of the subject has a normal distribution with mean 0 and
variance sigmaS squared (it measures between-subject variability). It is also assumed that the error term
has a normal distribution with mean 0 and variance sigmaE squared (it measures within-subject error)
and that the error and subject effects are not correlated.

As you can see below, the results of MIXED and GLM are not identical. The F and p-values for the tests
of fixed effects are the same, but the values from PROC MIXED have to be compared with the Tests of
Hypotheses for Mixed Model Analysis of Variance table from PROC GLM, not with the main ANOVA table of
the General Linear Models Procedure. The values in the main ANOVA table in PROC GLM are incorrect for
this example; they are computed under the assumption that subject is a fixed effect. The standard
errors of the lsmeans and of the requested estimates, however, are not the same for PROC MIXED and
PROC GLM. The ones printed by PROC MIXED are correct. Again, PROC GLM computed the standard errors
assuming that the subject effect is fixed. Note that the standard error for the third estimate, the
mean difference between time 1 and time 2 in group 2, is the same for both. This is because when you
compute that difference, the effect of the subject cancels out.
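
The standard errors discussed above can be checked by hand from the two variance estimates printed by PROC MIXED (3.5139 for subject, 1.9861 for the residual). The sketch below assumes the usual variance formulas for a balanced design with n = 3 subjects per group; the symbols for the group-by-time cell means are notation introduced here.

```latex
% \sigma_S^2: subject variance, \sigma_E^2: error variance, n = 3 subjects per group,
% \bar{y}_{gt}: mean of group g at time t (notation assumed for this sketch).
\begin{align*}
\text{PROC MIXED (subject random):}\quad
  \operatorname{Var}(\bar{y}_{gt}) &= \frac{\sigma_S^2 + \sigma_E^2}{n},
  & \sqrt{\frac{3.5139 + 1.9861}{3}} &\approx 1.3540,\\
\text{PROC GLM (subject treated as fixed):}\quad
  \operatorname{Var}(\bar{y}_{gt}) &= \frac{\sigma_E^2}{n},
  & \sqrt{\frac{1.9861}{3}} &\approx 0.8137,\\
\text{difference of two times within a group:}\quad
  \operatorname{Var}(\bar{y}_{g1} - \bar{y}_{g2}) &= \frac{2\sigma_E^2}{n},
  & \sqrt{\frac{2 \times 1.9861}{3}} &\approx 1.1507 .
\end{align*}
```

In the third line the same three subjects contribute to both means, so the subject effects drop out of the difference and only the error variance remains.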

Also note that the PROC GLM results printed in the Tests of Hypotheses table include the F-test for the
significance of the subject effect. That test is not printed by PROC MIXED; the corresponding table
includes only the fixed effects. The estimates of the variance components, in this case sigmaS squared
(variance of the subject effect) and sigmaE squared (variance of the error term), are printed in the
table named Covariance Parameter Estimates. The test of significance printed there is the Wald test.
The estimates are consistent with the PROC GLM results. The residual variance in PROC MIXED is the
same as MSS (mean sum of squares) for the error in PROC GLM. The subject variance can be computed
from the GLM Type III Expected Mean Square table.

Type III Expected Mean Square

GROUP Var(Error) + 3 Var(SUBJECT(GROUP)) + Q(GROUP,GROUP*TIME)

SUBJECT(GROUP) Var(Error) + 3 Var(SUBJECT(GROUP))

TIME Var(Error) + Q(TIME,GROUP*TIME)

GROUP*TIME Var(Error) + Q(GROUP*TIME)

According to that table, the expected value of MSS(subject) is var(error) + 3*var(subject). Hence
var(subject) = (E[MSS(subject)] - var(error))/3. Since the expected value of MSS(error) is var(error),
we can use MSS(error) as the estimate of var(error), and MSS(subject) as the estimate of its own
expected value, in the above formula. Thus,

Var(subject) = (12.5278 - 1.9861)/3 = 3.5139,

which is the same as the value printed in the proc MIXED Covariance Parameter Estimates table for the
subject.
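
The same calculation can be done in a short DATA step. This is only an illustrative check; the two mean squares are copied from the PROC GLM Type III table, and the variable names are made up for the example.

```sas
/* Method-of-moments check of the subject variance component. */
data _null_;
  mss_subject = 12.52778;   /* MSS for SUBJECT(GROUP), from PROC GLM */
  mss_error   = 1.98611;    /* MSS for Error, from PROC GLM */
  n_times     = 3;          /* coefficient of Var(SUBJECT(GROUP)) in the expected mean square */
  var_subject = (mss_subject - mss_error) / n_times;
  put var_subject= 8.4;     /* writes the estimate to the SAS log */
run;
```

The value written to the log should agree, up to rounding, with the 3.5139 obtained above.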

Program:

options ls=76;
data one;
input y group time subject;
cards;
22 1 1 1
23 1 1 2
25 1 1 3
17 1 2 1
18 1 2 2
23 1 2 3
12 1 3 1
16 1 3 2
14 1 3 3
8 2 1 4
9 2 1 5
10 2 1 6
16 2 2 4
17 2 2 5
20 2 2 6
29 2 3 4
30 2 3 5
36 2 3 6
3 3 1 7
7 3 1 8
5 3 1 9
1 3 2 7
2 3 2 8
1 3 2 9
4 3 3 7
7 3 3 8
8 3 3 9
11 4 1 10
15 4 1 11
8 4 1 12
34 4 2 10
37 4 2 11
33 4 2 12
27 4 3 10
28 4 3 11
24 4 3 12
;
run;
proc sort data=one;
by group subject time;
run;
Proc mixed data=one method=reml covtest;
Class group time subject;
Model y=group time group*time / DDFM=SATTERTH;
RANDOM SUBJECT(group);
lsmeans group*time /adjust=bon pdiff=control('1' '1') slice=group;
Contrast 'time in group 2'
time 1 -1 0 group*time 0 0 0 1 -1 0 0 0 0 0 0 0,
time 0 1 -1 group*time 0 0 0 0 1 -1 0 0 0 0 0 0;
Estimate 'time1 group2 mean' intercept 1 group 0 1 0 0 time 1 0 0
group*time 0 0 0 1 0 0 0 0 0 0 0 0;
Estimate 'time2 group2 mean' intercept 1 group 0 1 0 0 time 0 1 0
Group*time 0 0 0 0 1 0 0 0 0 0 0 0;
Estimate 'mean diff t1g2-t2g2' time 1 -1 0 group*time 0 0 0 1 -1 0 0 0 0 0 0 0;
Run;
proc GLM data=one;
class group time subject;
Model y=group subject(group) time group*time;
RANDOM SUBJECT(GROUP) /TEST;
lsmeans group*time /stderr;
lsmeans group*time /adjust=bon pdiff=control('1' '1') slice=group;
Contrast 'time in group 2'
time 1 -1 0 group*time 0 0 0 1 -1 0 0 0 0 0 0 0,
time 0 1 -1 group*time 0 0 0 0 1 -1 0 0 0 0 0 0;
Estimate 'time1 group2 mean' intercept 1 group 0 1 0 0 time 1 0 0
group*time 0 0 0 1 0 0 0 0 0 0 0 0;
Estimate 'time2 group2 mean' intercept 1 group 0 1 0 0 time 0 1 0
Group*time 0 0 0 0 1 0 0 0 0 0 0 0;
Estimate 'mean diff t1g2-t2g2' time 1 -1 0 group*time 0 0 0 1 -1 0 0 0 0 0 0 0;
Run;

Results:

The MIXED Procedure

GROUP 4 1 2 3 4
TIME 3 1 2 3
SUBJECT 12 1 2 3 4 5 6 7 8 9 10 11 12

Covariance Parameter Estimates (REML)

Cov Parm            Estimate    Std Error      Z   Pr > |Z|

SUBJECT(GROUP)    3.51388889   2.10104164   1.67     0.0944
Residual          1.98611111   0.70219632   2.83     0.0047

Tests of Fixed Effects

Source        NDF   DDF   Type III F   Pr > F

GROUP           3     8        53.39   0.0001
TIME            2    16        94.45   0.0001
GROUP*TIME      6    16       119.19   0.0001

ESTIMATE Statement Results

Parameter                 Estimate    Std Error     DF       t   Pr > |t|

time1 group2 mean       9.00000000   1.35400640   13.2    6.65     0.0001
time2 group2 mean      17.66666667   1.35400640   13.2   13.05     0.0001
mean diff t1g2-t2g2    -8.66666667   1.15068418     16   -7.53     0.0001

CONTRAST Statement Results

Source             NDF   DDF        F   Pr > F

time in group 2      2    16   197.59   0.0001

Least Squares Means

Effect       GROUP  TIME        LSMEAN    Std Error     DF       t   Pr > |t|

GROUP*TIME   1      1      23.33333333   1.35400640   13.2   17.23     0.0001
GROUP*TIME   1      2      19.33333333   1.35400640   13.2   14.28     0.0001
GROUP*TIME   1      3      14.00000000   1.35400640   13.2   10.34     0.0001
GROUP*TIME   2      1       9.00000000   1.35400640   13.2    6.65     0.0001
GROUP*TIME   2      2      17.66666667   1.35400640   13.2   13.05     0.0001
GROUP*TIME   2      3      31.66666667   1.35400640   13.2   23.39     0.0001
GROUP*TIME   3      1       5.00000000   1.35400640   13.2    3.69     0.0026
GROUP*TIME   3      2       1.33333333   1.35400640   13.2    0.98     0.3424
GROUP*TIME   3      3       6.33333333   1.35400640   13.2    4.68     0.0004
GROUP*TIME   4      1      11.33333333   1.35400640   13.2    8.37     0.0001
GROUP*TIME   4      2      34.66666667   1.35400640   13.2   25.60     0.0001
GROUP*TIME   4      3      26.33333333   1.35400640   13.2   19.45     0.0001

Tests of Effect Slices

Effect       GROUP   NDF   DDF        F   Pr > F

GROUP*TIME   1         2    16    33.12   0.0001
GROUP*TIME   2         2    16   197.59   0.0001
GROUP*TIME   3         2    16    10.13   0.0014
GROUP*TIME   4         2    16   211.19   0.0001

General Linear Models Procedure

GROUP 4 1 2 3 4
TIME 3 1 2 3
SUBJECT 12 1 2 3 4 5 6 7 8 9 10 11 12

General Linear Models Procedure

Dependent Variable: Y
                              Sum of          Mean
Source             DF        Squares        Square   F Value   Pr > F

Model              19     3902.22222     205.38012    103.41   0.0001
Error              16       31.77778       1.98611
Corrected Total    35     3934.00000

R-Square        C.V.    Root MSE    Y Mean

0.991922    8.455767     1.40929   16.6667

Source            DF   Type III SS   Mean Square   F Value   Pr > F

GROUP              3    2006.44444     668.81481    336.75   0.0001
SUBJECT(GROUP)     8     100.22222      12.52778      6.31   0.0009
TIME               2     375.16667     187.58333     94.45   0.0001
GROUP*TIME         6    1420.38889     236.73148    119.19   0.0001

Source Type III Expected Mean Square

GROUP Var(Error) + 3 Var(SUBJECT(GROUP)) + Q(GROUP,GROUP*TIME)

SUBJECT(GROUP) Var(Error) + 3 Var(SUBJECT(GROUP))

TIME Var(Error) + Q(TIME,GROUP*TIME)

GROUP*TIME Var(Error) + Q(GROUP*TIME)
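
For balanced data, the variance components can also be recovered by hand from the expected mean squares above, by equating each observed mean square to its expectation. The following DATA step is only a sketch of that arithmetic: the mean squares are copied from the GLM Type III table, and the variable names (ms_subject, ms_error, k, var_subject) are made up for the illustration.

```sas
/* Illustrative check only: values are copied from the PROC GLM output,
   and all variable names are hypothetical. */
data _null_;
  ms_subject = 12.52778;   /* Mean Square for SUBJECT(GROUP), 8 df   */
  ms_error   = 1.98611;    /* Mean Square for Error, 16 df           */
  k          = 3;          /* EMS coefficient of Var(SUBJECT(GROUP)) */

  /* Solve  E[MS_subject] = Var(Error) + k*Var(SUBJECT(GROUP))
     for the subject variance, with MS_error estimating Var(Error). */
  var_subject = (ms_subject - ms_error) / k;

  put 'Var(SUBJECT(GROUP)) = ' var_subject 10.5;
  put 'Var(Error)          = ' ms_error 10.5;
run;
```

The two values written to the log should be close to 3.5139 and 1.9861, that is, to the REML estimates of SUBJECT(GROUP) and Residual reported by PROC MIXED for the same data.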

General Linear Models Procedure


Tests of Hypotheses for Mixed Model Analysis of Variance

Dependent Variable: Y

Source: GROUP *
Error: MS(SUBJECT(GROUP))
                         Denominator     Denominator
  DF      Type III MS             DF              MS    F Value   Pr > F
   3     668.81481481              8    12.527777778    53.3865   0.0001
* - This test assumes one or more other fixed effects are zero.

Source: SUBJECT(GROUP)
Error: MS(Error)
                         Denominator     Denominator
  DF      Type III MS             DF              MS    F Value   Pr > F
   8     12.527777778             16    1.9861111111     6.3077   0.0009

Source: TIME *
Error: MS(Error)
                         Denominator     Denominator
  DF      Type III MS             DF              MS    F Value   Pr > F
   2     187.58333333             16    1.9861111111    94.4476   0.0001
* - This test assumes one or more other fixed effects are zero.

Source: GROUP*TIME
Error: MS(Error)
                         Denominator     Denominator
  DF      Type III MS             DF              MS    F Value   Pr > F
   6     236.73148148             16    1.9861111111   119.1935   0.0001

Least Squares Means

GROUP   TIME             Y      Std Err       Pr > |T|
                    LSMEAN       LSMEAN    H0:LSMEAN=0

1       1       23.3333333    0.8136566         0.0001
1       2       19.3333333    0.8136566         0.0001
1       3       14.0000000    0.8136566         0.0001
2       1        9.0000000    0.8136566         0.0001
2       2       17.6666667    0.8136566         0.0001
2       3       31.6666667    0.8136566         0.0001
3       1        5.0000000    0.8136566         0.0001
3       2        1.3333333    0.8136566         0.1208
3       3        6.3333333    0.8136566         0.0001
4       1       11.3333333    0.8136566         0.0001
4       2       34.6666667    0.8136566         0.0001
4       3       26.3333333    0.8136566         0.0001

GROUP*TIME Effect Sliced by GROUP for Y

                    Sum of          Mean
GROUP   DF         Squares        Square    F Value   Pr > F

1        2      131.555556     65.777778    33.1189   0.0001
2        2      784.888889    392.444444   197.6000   0.0001
3        2       40.222222     20.111111    10.1259   0.0014
4        2      838.888889    419.444444   211.2000   0.0001

Contrast DF Contrast SS Mean Square F Value Pr > F

time in group 2 2 784.888889 392.444444 197.59 0.0001

                                        T for H0:               Std Error of
Parameter                 Estimate    Parameter=0   Pr > |T|        Estimate

time1 group2 mean        9.0000000          11.06     0.0001      0.81365658
time2 group2 mean       17.6666667          21.71     0.0001      0.81365658
mean diff t1g2-t2g2     -8.6666667          -7.53     0.0001      1.15068418
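
The two procedures report the same cell means but different standard errors for them (1.354 in MIXED, 0.814 in GLM), while the standard error of the within-group difference (1.151) is the same in both. The DATA step below is a rough sketch of where those numbers come from, assuming 3 subjects per group and 3 time points per subject as in the balanced example above. The variance estimates are copied from the MIXED output and every variable name is made up for the illustration.

```sas
/* Illustrative check only: inputs are copied from the output above,
   and all variable names are hypothetical. */
data _null_;
  var_subject = 3.51388889;   /* REML estimate, SUBJECT(GROUP) */
  var_error   = 1.98611111;   /* REML estimate, Residual       */
  n_subj      = 3;            /* subjects per group            */
  n_time      = 3;            /* time points per subject       */

  /* A cell mean averages n_subj subjects, so both variance
     components contribute to its sampling variance. */
  se_cell_mixed = sqrt((var_subject + var_error) / n_subj);

  /* The GLM LSMEANS standard error uses the residual variance only. */
  se_cell_glm = sqrt(var_error / n_subj);

  /* For two times within one group the subject effects cancel. */
  se_diff = sqrt(2 * var_error / n_subj);

  /* Satterthwaite df for the cell mean, whose variance is
     (MS_subject + (n_time-1)*MS_error) / (n_time*n_subj). */
  a = var_error + n_time * var_subject;   /* MS_subject, 8 df      */
  b = (n_time - 1) * var_error;           /* multiple of MS_error  */
  df_satt = (a + b)**2 / (a**2 / 8 + b**2 / 16);

  put se_cell_mixed= 10.6 se_cell_glm= 10.6 se_diff= 10.6 df_satt= 6.1;
run;
```

The smaller GLM standard error reflects the fact that the LSMEANS statement in GLM treats all effects as fixed, which is why the t tests for the cell means differ between the two procedures even though the data are balanced.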

Example 3. Mixed effect model, unbalanced data.

In this example, there are 2 subjects in group 1, 3 in group 2, 4 in group 3 and 3 in group 4. Each
subject in groups 1 and 3 has three observations, one taken under each of conditions 1, 2 and 3. Each
subject in groups 2 and 4 has two observations, one taken under each of conditions 4 and 5. In the
following program, the factor cond with 5 levels is the effect of the condition and the factor group
with 4 levels is the effect of the group.

A mixed effect model with fixed effects of group and cond(group) and a random effect of subject will be
used to analyze the data. It is assumed that the effect of the subject has a normal distribution with mean
0 and variance sigmaS squared (it measures between-subject variability). It is also assumed that the error
term has a normal distribution with mean 0 and variance sigmaE squared (it measures within-subject
variability), and that the error and subject effects are uncorrelated.

Note the use of the option E3 in the MODEL statement. It makes PROC MIXED print the coefficients of the
Type III contrasts used to test the hypotheses about the model effects.

As can be seen below, the results of PROC MIXED and PROC GLM differ in this case: the F tests for the
fixed effects and the standard errors of the estimates no longer agree.

Program:

options ls=76;
data one;
input y group cond subject;
cards;
22 1 1 1
23 1 1 2
17 1 2 1
18 1 2 2
12 1 3 1
16 1 3 2
8 2 4 3
9 2 4 4
10 2 4 5
16 2 5 3
17 2 5 4
20 2 5 5

13 3 1 6
17 3 1 7
15 3 1 8
18 3 1 9
11 3 2 6
12 3 2 7
11 3 2 8
14 3 2 9
17 3 3 6
18 3 3 7
19 3 3 8
14 3 3 9
11 4 4 10
15 4 4 11
8 4 4 12
34 4 5 10
37 4 5 11
33 4 5 12
;
run;
proc sort data=one;
by group subject cond;
run;
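/* Mixed model: fixed group and cond(group), random subject(group) */
/* REML estimation, Satterthwaite denominator df, Type III coefficients (e3) */
Proc mixed data=one method=reml covtest;
Class group cond subject;
Model y=group cond(group) / DDFM=SATTERTH e3;
RANDOM SUBJECT(group);
lsmeans cond(group) /adjust=bon pdiff=control('1' '1') slice=group;
/* cond(group) has 10 coefficients: 3 for group 1, 2 for group 2, 3 for group 3, 2 for group 4 */
Contrast 'cond 1 vs 2 in group 1'
cond(group) 1 -1 0 0 0 0 0 0 0 0;
contrast 'cond 1 vs 2 in group 3'
cond(group) 0 0 0 0 0 1 -1 0 0 0;
Estimate 'diff c1g1-c1g3' group 1 0 -1 0
cond(group) 1 0 0 0 0 -1 0 0 0 0;
Run;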
/* Same model in GLM: subject(group) is listed in MODEL and declared in RANDOM */
/* The TEST option requests F tests built from the expected mean squares */
proc GLM data=one;
class group cond subject;
Model y=group subject(group) cond(group);
RANDOM SUBJECT(GROUP) /TEST;
lsmeans cond(group) /stderr;
lsmeans cond(group) /adjust=bon pdiff=control('1' '1') slice=group;
Contrast 'cond 1 vs 2 in group 1'
cond(group) 1 -1 0 0 0 0 0 0 0 0;
contrast 'cond 1 vs 2 in group 3'
cond(group) 0 0 0 0 0 1 -1 0 0 0;
Estimate 'diff c1g1-c1g3' group 1 0 -1 0
cond(group) 1 0 0 0 0 -1 0 0 0 0;
Run;

Results:

The MIXED Procedure

GROUP 4 1 2 3 4
COND 5 1 2 3 4 5
SUBJECT 12 1 2 3 4 5 6 7 8 9 10 11 12

Covariance Parameter Estimates (REML)

Cov Parm            Estimate     Std Error       Z   Pr > |Z|

SUBJECT(GROUP)    1.50219942    1.58123118    0.95     0.3421
Residual          2.98807617    1.27017905    2.35     0.0186

Type III Coefficients for COND(GROUP)

Effect        GROUP  COND   Row 1   Row 2   Row 3   Row 4   Row 5   Row 6

INTERCEPT                       0       0       0       0       0       0
GROUP         1                 0       0       0       0       0       0
GROUP         2                 0       0       0       0       0       0
GROUP         3                 0       0       0       0       0       0
GROUP         4                 0       0       0       0       0       0
COND(GROUP)   1      1          1       0       0       0       0       0
COND(GROUP)   1      2          0       1       0       0       0       0
COND(GROUP)   1      3         -1      -1       0       0       0       0
COND(GROUP)   2      4          0       0       1       0       0       0
COND(GROUP)   2      5          0       0      -1       0       0       0
COND(GROUP)   3      1          0       0       0       1       0       0
COND(GROUP)   3      2          0       0       0       0       1       0
COND(GROUP)   3      3          0       0       0      -1      -1       0
COND(GROUP)   4      4          0       0       0       0       0       1
COND(GROUP)   4      5          0       0       0       0       0      -1

Tests of Fixed Effects

Source         NDF    DDF   Type III F   Pr > F

GROUP            3    7.1        19.08   0.0009
COND(GROUP)      6   11.1        58.93   0.0001

ESTIMATE Statement Results

Parameter            Estimate     Std Error     DF      t   Pr > |t|

diff c1g1-c1g3     6.75000000    1.83513125   16.5   3.68     0.0020

CONTRAST Statement Results

Source                  NDF    DDF      F   Pr > F

cond 1 vs 2 in group      1   11.1   8.37   0.0146
cond 1 vs 2 in group      1   11.1   9.41   0.0106

Least Squares Means

Effect        GROUP  COND        LSMEAN     Std Error     DF       t   Pr > |t|

COND(GROUP)   1      1      22.50000000    1.49837839   16.5   15.02     0.0001
COND(GROUP)   1      2      17.50000000    1.49837839   16.5   11.68     0.0001
COND(GROUP)   1      3      14.00000000    1.49837839   16.5    9.34     0.0001
COND(GROUP)   2      4       9.00000000    1.22342083   16.5    7.36     0.0001
COND(GROUP)   2      5      17.66666667    1.22342083   16.5   14.44     0.0001
COND(GROUP)   3      1      15.75000000    1.05951352   16.5   14.87     0.0001
COND(GROUP)   3      2      12.00000000    1.05951352   16.5   11.33     0.0001
COND(GROUP)   3      3      17.00000000    1.05951352   16.5   16.05     0.0001
COND(GROUP)   4      4      11.33333333    1.22342083   16.5    9.26     0.0001
COND(GROUP)   4      5      34.66666667    1.22342083   16.5   28.34     0.0001

Tests of Effect Slices

Effect        GROUP   NDF    DDF        F   Pr > F

COND(GROUP)   1         2   11.1    12.22   0.0016
COND(GROUP)   2         1   11.1    37.71   0.0001
COND(GROUP)   3         2   11.1     9.06   0.0047
COND(GROUP)   4         1   11.1   273.31   0.0001
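
In the MIXED least squares means table above, the standard error depends only on the group, because the groups contain different numbers of subjects (2, 3, 4 and 3). A minimal sketch of that relationship follows, assuming each cell mean is an average over the subjects of one group. The variance estimates are copied from the Covariance Parameter Estimates table and the variable names are made up for the illustration.

```sas
/* Illustrative check only: inputs are copied from the MIXED output,
   and all variable names are hypothetical. */
data _null_;
  var_subject = 1.50219942;   /* REML estimate, SUBJECT(GROUP) */
  var_error   = 2.98807617;   /* REML estimate, Residual       */

  /* Group sizes in this example: 2 (group 1), 3 (groups 2 and 4),
     4 (group 3). */
  do n_subj = 2, 3, 4;
    se_cell = sqrt((var_subject + var_error) / n_subj);
    put n_subj= se_cell= 10.6;
  end;
run;
```

The three values should agree with the Std Error column of the table (about 1.498, 1.223 and 1.060).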

General Linear Models Procedure

GROUP 4 1 2 3 4
COND 5 1 2 3 4 5
SUBJECT 12 1 2 3 4 5 6 7 8 9 10 11 12

General Linear Models Procedure

Dependent Variable: Y
                              Sum of         Mean
Source             DF        Squares       Square   F Value   Pr > F

Model              17     1463.66667     86.09804     29.95   0.0001
Error              12       34.50000      2.87500
Corrected Total    29     1498.16667

R-Square C.V. Root MSE Y Mean

0.976972 10.07277 1.69558 16.8333

Source            DF   Type III SS   Mean Square   F Value   Pr > F

GROUP              3     353.91667     117.97222     41.03   0.0001
SUBJECT(GROUP)     8      53.25000       6.65625      2.32   0.0919
COND(GROUP)        6    1056.50000     176.08333     61.25   0.0001

General Linear Models Procedure

Source Type III Expected Mean Square

GROUP Var(Error) + 2.4667 Var(SUBJECT(GROUP)) + Q(GROUP,COND(GROUP))

SUBJECT(GROUP) Var(Error) + 2.5 Var(SUBJECT(GROUP))

COND(GROUP) Var(Error) + Q(COND(GROUP))
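
With unbalanced data, equating mean squares to the expected mean squares above no longer reproduces the REML estimates. The following DATA step sketches the method-of-moments calculation so that the two sets of estimates can be compared. The mean squares are copied from the GLM Type III table and the variable names are made up for the illustration.

```sas
/* Illustrative check only: inputs are copied from the PROC GLM output,
   and all variable names are hypothetical. */
data _null_;
  ms_subject = 6.65625;   /* Mean Square for SUBJECT(GROUP), 8 df   */
  ms_error   = 2.875;     /* Mean Square for Error, 12 df           */
  k          = 2.5;       /* EMS coefficient of Var(SUBJECT(GROUP)) */

  /* Method-of-moments (ANOVA) estimate of the subject variance. */
  var_subject_anova = (ms_subject - ms_error) / k;

  put var_subject_anova= 10.6 ms_error= 10.6;
run;
```

The result is about 1.5125 for the subject variance and 2.875 for the error variance, compared with the REML estimates 1.50219942 and 2.98807617 reported by PROC MIXED.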

General Linear Models Procedure


Tests of Hypotheses for Mixed Model Analysis of Variance

Source: GROUP *
Error: 0.9867*MS(SUBJECT(GROUP)) + 0.0133*MS(Error)

                         Denominator     Denominator
  DF      Type III MS             DF              MS    F Value   Pr > F
   3     117.97222222           8.09    6.6058333333    17.8588   0.0006
* - This test assumes one or more other fixed effects are zero.
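
The error term for GROUP is a linear combination of two mean squares, chosen so that its expectation matches the random part of the expected mean square for GROUP. The DATA step below is a sketch of that construction and of the Satterthwaite approximation for its degrees of freedom. The inputs are the rounded values printed in the GLM output, so the results agree with the table only approximately, and the variable names are made up for the illustration.

```sas
/* Illustrative check only: inputs are rounded values copied from the
   PROC GLM output, and all variable names are hypothetical. */
data _null_;
  ms_group   = 117.97222222;  /* Type III MS for GROUP              */
  ms_subject = 6.65625;       /* MS for SUBJECT(GROUP), 8 df        */
  ms_error   = 2.875;         /* MS for Error, 12 df                */
  k_group    = 2.4667;        /* coefficient in the EMS of GROUP    */
  k_subject  = 2.5;           /* coefficient in the EMS of SUBJECT  */

  /* Weight w is chosen so that w*MS_subject + (1-w)*MS_error has
     expectation Var(Error) + k_group*Var(SUBJECT(GROUP)). */
  w     = k_group / k_subject;
  denom = w * ms_subject + (1 - w) * ms_error;
  f     = ms_group / denom;

  /* Satterthwaite approximation to the df of the combination. */
  df = denom**2 / ((w * ms_subject)**2 / 8
                   + ((1 - w) * ms_error)**2 / 12);

  put w= 8.4 denom= 10.5 f= 10.4 df= 6.2;
run;
```

The weight, denominator mean square, F value and denominator degrees of freedom should be close to the 0.9867, 6.6058, 17.8588 and 8.09 shown in the output.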

Source: SUBJECT(GROUP)
Error: MS(Error)
                         Denominator     Denominator
  DF      Type III MS             DF              MS    F Value   Pr > F
   8          6.65625             12           2.875     2.3152   0.0919

Source: COND(GROUP)
Error: MS(Error)
                         Denominator     Denominator
  DF      Type III MS             DF              MS    F Value   Pr > F
   6     176.08333333             12           2.875    61.2464   0.0001

Least Squares Means

COND   GROUP             Y      Std Err       Pr > |T|
                    LSMEAN       LSMEAN    H0:LSMEAN=0

1      1        22.5000000    1.1989579         0.0001
2      1        17.5000000    1.1989579         0.0001
3      1        14.0000000    1.1989579         0.0001
4      2         9.0000000    0.9789450         0.0001
5      2        17.6666667    0.9789450         0.0001
1      3        15.7500000    0.8477912         0.0001
2      3        12.0000000    0.8477912         0.0001
3      3        17.0000000    0.8477912         0.0001
4      4        11.3333333    0.9789450         0.0001
5      4        34.6666667    0.9789450         0.0001

Least Squares Means

COND(GROUP) Effect Sliced by GROUP for Y

                    Sum of          Mean
GROUP   DF         Squares        Square    F Value   Pr > F

1        2       73.000000     36.500000    12.6957   0.0011
2        1      112.666667    112.666667    39.1884   0.0001
3        2       54.166667     27.083333     9.4203   0.0035
4        1      816.666667    816.666667   284.1000   0.0001

Dependent Variable: Y

Contrast                DF   Contrast SS   Mean Square   F Value   Pr > F

cond 1 vs 2 in group     1    25.0000000    25.0000000      8.70   0.0122
cond 1 vs 2 in group     1    28.1250000    28.1250000      9.78   0.0087

                                   T for H0:               Std Error of
Parameter            Estimate    Parameter=0   Pr > |T|        Estimate

diff c1g1-c1g3     6.75000000           4.60     0.0006      1.46841752
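
Both procedures estimate the difference c1g1-c1g3 as 6.75, but with different standard errors (1.835 in MIXED, 1.468 in GLM). The DATA step below is a sketch of the two calculations, assuming the estimate compares a mean over the 2 subjects of group 1 with a mean over the 4 subjects of group 3. The inputs are copied from the output above and the variable names are made up for the illustration.

```sas
/* Illustrative check only: inputs are copied from the output above,
   and all variable names are hypothetical. */
data _null_;
  var_subject = 1.50219942;   /* MIXED REML estimate, SUBJECT(GROUP) */
  var_error   = 2.98807617;   /* MIXED REML estimate, Residual       */
  ms_error    = 2.875;        /* GLM Mean Square for Error           */
  n1 = 2;                     /* subjects in group 1                 */
  n3 = 4;                     /* subjects in group 3                 */

  /* The two means come from different subjects, so the subject
     variance does not cancel in the MIXED standard error. */
  se_mixed = sqrt((var_subject + var_error) * (1/n1 + 1/n3));

  /* The GLM ESTIMATE statement uses the error mean square only. */
  se_glm = sqrt(ms_error * (1/n1 + 1/n3));

  put se_mixed= 10.6 se_glm= 10.6;
run;
```

Because the comparison is between subjects, the GLM standard error, which leaves out the between-subject variance, is the smaller of the two and gives the larger t value (4.60 against 3.68).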
