
The Canadian Journal of Statistics / La revue canadienne de statistique
Vol. 38, No. 3, 2010, Pages 352–368

Variability explained by covariates in linear mixed-effect models for longitudinal data

Bo HU1*, Jun SHAO2 and Mari PALTA3

1 Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, OH, USA
2 Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
3 Department of Population Health Sciences, University of Wisconsin-Madison, Madison, WI, USA

Key words: Compound symmetry projection; explained variance; R2 statistics; random intercept;
random slope.
MSC 2000: Primary 62H20; Secondary 62H10.

Abstract: Variability explained by covariates, or explained variance, is a well-known concept in assessing the importance of covariates for dependent outcomes. In this paper we study R² statistics of explained variance pertinent to longitudinal data under linear mixed-effect models, where the R² statistics are computed at two different levels to measure, respectively, the within- and between-subject variability explained by the covariates. By deriving the limits of the R² statistics, we find that the interpretation of explained variance for the existing R² statistics is clear only in the case where the covariance matrix of the outcome vector is compound symmetric. Two new R² statistics are proposed to address the effect of time-dependent covariate means. In the general case where the outcome covariance matrix is not compound symmetric, we introduce the concept of compound symmetry projection and use it to define level-one and level-two R² statistics. Numerical results are provided to support the theoretical findings and demonstrate the performance of the R² statistics. The Canadian Journal of Statistics 38: 352–368; 2010 © 2010 Statistical Society of Canada

1. INTRODUCTION
In medical research studies, clustered or longitudinal data (i.e., repeated measurements from each
subject in the study) are often encountered. Linear mixed-effect models (Laird & Ware, 1982)
are particularly useful in applications as they allow assessment of within- and between-subject
variabilities. Measuring the proportion of variability in the outcomes explained by the covariates
in a linear mixed-effect model is of great interest to applied statisticians. Kent (1983) and Korn
& Simon (1991) discussed the general definition of measures of explained variance. A common

* Author to whom correspondence may be addressed. E-mail: hub@ccf.org

© 2010 Statistical Society of Canada / Société statistique du Canada



type of such measures is the R² statistic. The concept of R² statistics of explained variance is well known in linear regression (Helland, 1987; Draper & Smith, 1998). Mittlböck & Schemper (1996) and Hu, Palta & Shao (2006) studied R² statistics for logistic regression models. Schemper & Henderson (2000) studied R² statistics for proportional hazards models.
For linear mixed-effect models, measuring the variability explained by covariates is more complicated, since there are multiple variance components. Zheng (2000) and Xu (2003) proposed R² statistics to assess how the covariates explain the within-subject variance, whereas Snijders & Bosker (1999) proposed an R² to assess the covariate effect on the total variance of the outcomes. In view of the particular variance-components structure in linear mixed-effect models, it is useful to consider different R² statistics for different variance components. In the special case where there is only a random intercept but no random slopes, Raudenbush & Bryk (2002) suggested constructing one R² for assessing the covariate effect on the within-subject variance and another R² for the between-subject variance. The R² statistic for the within-subject variance is referred to as a level-one R², while the R² statistic for the between-subject variance is referred to as a level-two R². Singer & Willett (2003) studied R² statistics for general linear mixed-effect models that have both random intercepts and slopes. Although these R² statistics have become popular in practice, their statistical properties have not been fully studied. It is not clear, for example, why some R² statistics take negative values when a statistic measuring a proportion should be between 0 and 1. The purposes of the present paper are to explore what the existing R² statistics measure in general, to construct some new R² statistics, and to study the statistical properties of the existing and proposed R² statistics.
To develop the idea, we first consider in Section 2 the simple case of random intercept models, where the covariance matrix of the outcomes of each subject is compound symmetric (see Section 2 for the definition of compound symmetry). By deriving the limits of the existing R² statistics, we show what they measure and how to interpret them. We also show that the limit of the existing level-two R² can be negative when the means of the covariates vary with time. We propose two new R² statistics to address this problem. Furthermore, we derive approximate sampling distributions of these R² statistics and construct confidence intervals of interest.
Section 3 considers general linear mixed-effect models. Although the R² statistics are defined as for the random intercept models, their interpretation is not straightforward in the general case since the covariance matrix of the outcome vector, unconditional on the covariates, is not compound symmetric. We define a geometric projection of the covariance matrix onto the subspace corresponding to compound symmetric matrices. We then use this compound symmetry projection to describe what the proposed R² statistics measure.
In Section 4, we carry out a simulation study to examine the performance of the R² statistics. Section 5 applies the R² statistics to data from a randomized clinical trial. Section 6 concludes the paper with a brief summary and discussion.

2. RANDOM INTERCEPT MODEL


We start with a random intercept model for balanced data. The results for more general models and unbalanced data are deferred to Section 3. Let (yi, Xi), i = 1, ..., n, be independent and identically distributed (i.i.d.) observations, where yi = (yi1, ..., yik)T is a k-vector of longitudinal outcomes from subject i, the number of repeated measurements k is a fixed integer, Xi = (xi1, ..., xik)T is a p × k matrix of covariates associated with subject i, and AT denotes the transpose of the vector or matrix A. The random intercept model is

yit = α + βT xit + bi + eit, (1)

where (α, βT)T is the parameter vector, bi ∼ N(0, σb²) and eit ∼ N(0, σe²) are, respectively, the random intercept and error, and the bi's, eit's, and Xi's are independent.

DOI: 10.1002/cjs The Canadian Journal of Statistics / La revue canadienne de statistique



A covariance matrix is said to be compound symmetric if it is of the form c1 I + c2 11T for some constants c1 and c2, where I is the identity matrix and 1 is a vector of ones. Thus, model (1) describes a compound symmetric covariance structure of yi conditional on Xi. The variance components σb² and σe² are the between- and within-subject variances, respectively.

2.1. Existing R2 Statistics and Their Properties


Two R² statistics are usually considered: a level-one R² measuring the within-subject variability explained by the covariates and a level-two R² measuring the between-subject variability explained by the covariates. A common approach to constructing R² statistics is to compare estimated variance components in the conditional model (1) with those in a null model without any covariates. Raudenbush & Bryk (2002) suggested the following null model:

yit = α00 + b0i + e0it , (2)

where α00 is a constant. In fitting this model by the maximum likelihood procedure, it is assumed that b0i ∼ N(0, σb0²), e0it ∼ N(0, σe0²), and the b0i's and e0it's are independent. These assumptions hold in the special situation when the covariates follow

xit = µx + bxi + exit , (3)

where bxi ∼ N(0, Σx), exit ∼ N(0, Σex), and the bxi's and exit's are independent. Under models (1) and (3), Var(Xi β) = (βT Σex β)I + (βT Σx β)11T and

Var(yi) = (σe² + βT Σex β)I + (σb² + βT Σx β)11T, (4)

that is, both covariance matrices Var(Xi β) and Var(yi) are compound symmetric.
Let σ̂e² and σ̂b² be the maximum likelihood estimates (see Harville, 1977; Verbeke & Molenberghs, 2000) of the variance components σe² and σb² in model (1), respectively, and let σ̂e0² and σ̂b0² be the maximum likelihood estimates of σe0² and σb0² in the null model (2), respectively. These estimated variance components also depend on the maximum likelihood estimates of the regression coefficients in (1) and (2). Raudenbush & Bryk (2002) defined the level-one and level-two R² statistics as

R1² = 1 − σ̂e²/σ̂e0²  and  R2² = 1 − σ̂b²/σ̂b0². (5)

The level-one R1² was also proposed by Xu (2003). Some variants of these R² statistics exist in the literature. For example, Zheng (2000) and Xu (2003) proposed a different level-one R² that uses the means of squared residuals to estimate σe² and σe0².
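For balanced data, the variance components in models (1) and (2) can also be estimated in closed form by ANOVA-type moment estimators, which are asymptotically equivalent to the maximum likelihood estimates. The following sketch is our own illustration, not the authors' code: all function names are ours, and pooled least squares plus moment estimators stand in for full maximum likelihood.

```python
import numpy as np

def variance_components(resid_matrix):
    """ANOVA-type estimates (sigma_e^2, sigma_b^2) from an n x k matrix of
    residuals of a balanced random intercept model."""
    n, k = resid_matrix.shape
    row_means = resid_matrix.mean(axis=1)
    # within-subject mean square estimates sigma_e^2
    msw = ((resid_matrix - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    # between-subject mean square; (msb - msw)/k estimates sigma_b^2
    msb = k * ((row_means - resid_matrix.mean()) ** 2).sum() / (n - 1)
    return msw, max((msb - msw) / k, 0.0)

rng = np.random.default_rng(0)
n, k, beta = 2000, 4, 1.0
b_x = rng.normal(0, 0.8, size=(n, 1))        # between-subject covariate part, Sigma_x = 0.64
x = b_x + rng.normal(0, 0.6, size=(n, k))    # within-subject part, Sigma_ex = 0.36
b = rng.normal(0, 0.4, size=(n, 1))          # random intercept, sigma_b^2 = 0.16
y = 1.0 + beta * x + b + rng.normal(0, 1.0, size=(n, k))  # sigma_e^2 = 1

# Conditional model (1): estimate beta by pooled OLS, then decompose residuals.
xc, yc = x - x.mean(), y - y.mean()
beta_hat = (xc * yc).sum() / (xc ** 2).sum()
s2e, s2b = variance_components(y - beta_hat * x)

# Null model (2): decompose the raw (centered) outcomes.
s2e0, s2b0 = variance_components(y - y.mean())

R2_1 = 1 - s2e / s2e0   # level-one statistic, as in (5)
R2_2 = 1 - s2b / s2b0   # level-two statistic, as in (5)
```

With these (hypothetical) parameter values the limits in (6) are 1 − 1/(1 + 0.36) ≈ 0.26 for the level-one statistic and 1 − 0.16/(0.16 + 0.64) = 0.8 for the level-two statistic, and the computed values land near these limits for large n.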
It follows directly from the likelihood theory and Equation (4) that under the above assumptions, as the number of subjects n → ∞, the two R² statistics converge in probability to

λ1 = 1 − σe²/(σe² + βT Σex β)  and  λ2 = 1 − σb²/(σb² + βT Σx β), (6)

respectively. Note that σe² is the coefficient of the I term of the covariance matrix of yi in model (1), while σe² + βT Σex β is the corresponding coefficient of Var(yi) in the null model (2) without covariates. In other words, σe² and σe² + βT Σex β are, respectively, the conditional and unconditional within-subject variances. Thus, R1² measures the within-subject variability explained by the covariates. Similarly, λ2 is related to the ratio of the coefficients of the 11T term in the two models, and R2² measures the between-subject variability explained by the covariates.


In many studies, however, the covariates may have time-dependent means and follow a more general model than model (3):

xit = µxt + bxi + exit, (7)

where the mean vector µxt depends on t, and bxi and exit are the same as those in (3). Let µ̄x = Σ_{t=1}^k µxt/k and D = Σ_{t=1}^k [βT(µxt − µ̄x)]²/(k − 1). Since E(yit) = βT µxt, D represents the variability of the outcome means caused by the time-dependent covariate means. It is shown in the Appendix that as n → ∞, R1² and R2² defined in (5) converge in probability to

λ1^D = 1 − σe²/(σe² + βT Σex β + D)  and  λ2^D = 1 − σb²/(σb² + βT Σx β − D/k), (8)

respectively. The formulas in (8) lead to the following issues.


(1) Since µxt is deterministic, the covariance matrix Var(yi) is still given by (4). Thus, the new limits λ1^D and λ2^D in (8) are not simply functions of variance components. They also involve D, the effect of variation in covariate means on the outcome means. Should the variability explained by covariates include the deterministic effect D?
(2) Even if we want to include D in R² statistics, D intuitively is a within-subject characteristic and it is not reasonable to include D in a level-two R².
(3) There is a negative sign in front of D in λ2^D given by (8), which is counter-intuitive because the variation in covariate means should not decrease the level-two R2² while it increases the level-one R1². Moreover, if D > kβT Σx β, then λ2^D < 0 and R2² is likely to be negative for large n. Examples of negative R2² can be found in applications (Snijders & Bosker, 1999). Finally, it is also disturbing that λ2^D in (8) depends on k, the number of repeated measurements.
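The behaviour described in these issues can be checked numerically from the limits in (8). The snippet below is our own illustration with a scalar covariate and hypothetical parameter values; it shows the level-two limit turning negative once D exceeds kβ²Σx (while its denominator remains positive):

```python
# Limits (8) for a scalar covariate; sig_x and sig_ex are the scalar
# between- and within-subject covariate variances (Sigma_x, Sigma_ex).
def limits_D(s2e, s2b, beta, sig_x, sig_ex, D, k):
    lam1 = 1 - s2e / (s2e + beta**2 * sig_ex + D)
    lam2 = 1 - s2b / (s2b + beta**2 * sig_x - D / k)
    return lam1, lam2

# Hypothetical values: sigma_e^2 = 1, sigma_b^2 = 0.16, beta = 1,
# Sigma_x = 0.64, Sigma_ex = 0.36, k = 4.
s2e, s2b, beta, sig_x, sig_ex, k = 1.0, 0.16, 1.0, 0.64, 0.36, 4

lam1_a, lam2_a = limits_D(s2e, s2b, beta, sig_x, sig_ex, D=0.0, k=k)  # reduces to (6)
lam1_b, lam2_b = limits_D(s2e, s2b, beta, sig_x, sig_ex, D=3.0, k=k)  # D > k*beta^2*Sigma_x = 2.56
# lam1_b > lam1_a: D inflates the level-one limit.
# lam2_b < 0: the level-two limit is negative, matching issue (3).
```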

2.2. New R² Statistics


To address issues 1–3, we propose two new R² statistics. Combining (1) and (7), we obtain the following random intercept model:

yit = α + βT µxt + βT x̃it + bi + eit,

where x̃it = xit − µxt has mean zero. The covariate mean µxt is absorbed into the outcome mean α + βT µxt, which is deterministic and is not involved in Var(yi). We then consider the following null model:

yit = α0t + b0i + e0it, (9)

where α0t is an unknown parameter. This null model differs from the null model (2) in that it models time-dependent means for the outcomes even without any covariates. Model (9) is nested in model (1) under assumption (7).
Let σ̃e0² and σ̃b0² be the maximum likelihood estimates of σe0² and σb0², respectively, under model (9). We propose two new R² statistics:

R̃1² = 1 − σ̂e²/σ̃e0²  and  R̃2² = 1 − σ̂b²/σ̃b0², (10)

where σ̂e² and σ̂b² are the same variance estimates used in the definition of the R² statistics (5). As n → ∞, R̃l² converges in probability to λl given in (6), l = 1, 2. Hence, the new statistics


R̃1² and R̃2² measure variability explained by covariates regardless of whether the covariates have time-dependent means, and their limits are always between 0 and 1.
The new R̃1² in (10) is more appropriate from the perspective of studying how much the covariates reduce the within-subject variance of the outcomes. If one wants a level-one R² that measures not only the within-subject variance explained by the covariates but also how well the outcome means are explained by the covariate means, R1² in (5) is preferred. To measure the covariate effect on the between-subject variance, R̃2² in (10) is always more appropriate than R2² in (5), because R2² has the problems described in issue 3 of Section 2.1.
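For balanced data, the null model (9) simply adds time-specific intercepts, so its variance components can be obtained by applying the same moment decomposition after subtracting the per-time means of the outcomes. The sketch below is our own illustration (moment estimators stand in for full maximum likelihood, and the true β is used for brevity); it shows the old level-two statistic being distorted by time-dependent covariate means while the new one stays near its limit:

```python
import numpy as np

def variance_components(r):
    """(sigma_e^2, sigma_b^2) moment estimates from an n x k residual matrix."""
    n, k = r.shape
    rm = r.mean(axis=1)
    msw = ((r - rm[:, None]) ** 2).sum() / (n * (k - 1))
    msb = k * ((rm - r.mean()) ** 2).sum() / (n - 1)
    return msw, (msb - msw) / k

rng = np.random.default_rng(1)
n, k, beta = 4000, 4, 1.0
mu_t = np.array([0.0, 1.0, 2.0, 3.0])   # time-dependent covariate means (D > 0)
x = mu_t + rng.normal(0, 0.8, (n, 1)) + rng.normal(0, 0.6, (n, k))
y = 1 + beta * x + rng.normal(0, 0.4, (n, 1)) + rng.normal(0, 1.0, (n, k))

s2e, s2b = variance_components(y - beta * x)           # conditional components
s2e0, s2b0 = variance_components(y - y.mean())         # null model (2): grand mean only
s2e0t, s2b0t = variance_components(y - y.mean(axis=0)) # null model (9): per-time means removed

R2_2_old = 1 - s2b / s2b0    # includes -D/k in its limit, as in (8)
R2_2_new = 1 - s2b / s2b0t   # limit 1 - 0.16/(0.16 + 0.64) = 0.8, as in (6)
```

Here D = β² × Var([0, 1, 2, 3]) (with divisor k − 1) = 5/3, so the old level-two limit is 1 − 0.16/(0.8 − D/4) ≈ 0.58, visibly below the new statistic's limit of 0.8.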

2.3. Variance Estimation and Confidence Intervals


It is always desirable to provide error assessment or confidence intervals for an R² as an estimator of its limit. It is shown in the Appendix that, as the number of subjects n goes to ∞,

√n(Rl² − λl^D) →d N(0, σl²)  and  √n(R̃l² − λl) →d N(0, σ̃l²) (11)

for l = 1, 2, where →d denotes convergence in distribution. The variance σl² can be estimated by

σ̂l² = ∇fl(T̄)T Σ̂T ∇fl(T̄),

and analogously σ̃l² is estimated by ∇f̃l(T̄)T Σ̂T ∇f̃l(T̄), where ∇ denotes the gradient operator; the functions fl and f̃l are defined in (19) and (20) in the Appendix; T̄ and Σ̂T are the sample mean and covariance of the vectors T̂i = (r̂i1², ..., r̂ik², r̄̂i.², yi1, ..., yik, ȳi., yi1², ..., yik²)T, i = 1, ..., n; the r̂it's are the residuals of model (1); ȳi. = Σ_{t=1}^k yit/k; and r̄̂i. = Σ_{t=1}^k r̂it/k.
For n not very large, a better confidence interval for λl or λl^D can be obtained by using the transformation log(1 + Rl²).

3. GENERAL LINEAR MIXED-EFFECT MODEL


The early discussion focuses on a special case where the model contains only a random intercept and data are balanced. In this special case, the covariance matrix of the outcome yi is compound symmetric in models (1), (2), and (9). Thus, within- and between-subject variances are clearly defined in the conditional model and the null model. The R² statistics are shown to be related to the ratios of the conditional and unconditional variance components, and they can be interpreted as how much the covariates explain each variance component. In the general case, a linear mixed-effect model has the form

yit = α + βT xit + bi + ciT zit + eit, (12)

i = 1, ..., n, t = 1, ..., ki ≤ k, where zit is a q-vector covariate with mean µzt, bi ∼ N(0, σb²), eit ∼ N(0, σe²), ci ∼ N(0, Σc) is a q-vector random slope, (bi, ci) is jointly normal with Cov(bi, ci) = Σbc, and the (bi, ci)'s and eit's are independent of each other.
The R² statistics are still defined by (5) or (10), depending upon the choice of the null model. In the following we discuss the asymptotic properties of the R² statistics under model (12).


3.1. Balanced Data


When data are balanced, that is, ki = k for each subject i, it is shown in the Appendix that as n → ∞, R̃1² and R̃2² in (10) converge in probability to

λ1 = 1 − σe²/(σe² + Ew)  and  λ2 = 1 − σb²/(σb² + Eb), (13)

respectively, while R1² and R2² in (5) converge in probability to

λ1^D = 1 − σe²/(σe² + Ew + D)  and  λ2^D = 1 − σb²/(σb² + Eb − D/k), (14)

respectively, where

Ew = [Σ_{t=1}^k Var(xitT β) − k Var(x̄i.T β)]/(k − 1) + [Σ_{t=1}^k E(zitT Σc zit) − k E(z̄i.T Σc z̄i.)]/(k − 1),

Eb = 2 Σ_{t<l} [βT Cov(xit, xil)β + E(zitT Σc zil)] / [k(k − 1)] + 2µ̄zT Σbc,

µ̄z = Σ_{t=1}^k µzt/k, and D is the same as that in Section 2.1. Although Ew is always nonnegative, the value of Eb may be negative in some situations. For longitudinal data, the covariate vectors xit and xil (and zit and zil) are usually positively correlated and, thus, Eb is typically nonnegative.
The interpretation of these limits is not trivial, since the meanings of Ew and Eb are not clear. The complication arises from the fact that the covariance matrix Var(yi) in the null models may not be compound symmetric. Hence, the within- and between-subject variances are not well-defined and it is difficult to interpret σe² + Ew and σb² + Eb. For example, even in the special case where the covariate follows model (7), the covariance matrix of yi is

Var(yi) = (σe² + βT Σex β + tr(Σc Σez))I + (σb² + βT Σx β + tr(Σc Σz))11T + Mz Σc MzT + Mz Σbc 1T + 1 ΣbcT MzT,

where Mz = (µz1, ..., µzk)T, Var(zit) = Σez + Σz and Cov(zit, zil) = Σz for t ≠ l. Var(yi) is not compound symmetric unless (i) Σc = 0 and Σbc = 0, so that the model reduces to a random intercept model, or (ii) µzt = µz for all t, so that Mz = 1µzT.
To interpret what the R² statistics in (5) and (10) measure under the general linear mixed-effect model (12), we introduce the concept of a compound symmetry projection. For any random k-vector y, the covariance matrix Var(y) = Σ with (j, l) element σjl can be expressed as a vector

V = (σ11, ..., σkk, σ12, ..., σ(k−1)k, σ13, ..., σ(k−2)k, ..., σ1(k−1), σ2k, σ1k)T.

Let A1, ..., Ak(k+1)/2 be a set of basis vectors for the k(k + 1)/2-dimensional Euclidean space R^{k(k+1)/2}, with A1 = (1kT, 0k(k−1)/2T)T (corresponding to the identity matrix Ik) and A2 = 1k(k+1)/2 (corresponding to 1k1kT), where 1d is a d-vector of ones and 0d is a d-vector of zeros. The remaining basis vectors may be any vectors linearly independent of A1 and A2, for example,

A3 = (−1, 1, 0, ..., 0, 0k(k−1)/2T)T, ..., Ak+1 = (−1, 0, ..., 0, 1, 0k(k−1)/2T)T,
Ak+2 = (0kT, −1, 1, 0, ..., 0)T, ..., Ak(k+1)/2 = (0kT, −1, 0, ..., 0, 1)T.


Then V can be written as V = Σ_{j=1}^{k(k+1)/2} cj Aj, where the cj's are coefficients depending on Σ. In particular,

c1 = [Σ_{j=1}^k Var(yij) − k Var(ȳ)]/(k − 1) = Σ_{j=1}^k σjj/k − c2, (15)
c2 = Var(ȳ) − c1/k = 2 Σ_{1≤j<l≤k} σjl / [k(k − 1)].

The vector c1A1 + c2A2, which corresponds to the compound symmetric matrix c1Ik + c21k1kT, is the projection of V onto the space spanned by A1 and A2. Therefore, we call c1Ik + c21k1kT the compound symmetry projection of Σ. The matrix Σ is compound symmetric if and only if it is equal to its compound symmetry projection.
For a compound symmetric covariance matrix, the off-diagonal element is the between-subject variance and the diagonal element is the sum of the within- and between-subject variances. For a general covariance matrix Σ, between- and within-subject variances are not defined in the literature. In view of the fact that c2 in the compound symmetry projection of Σ is the average of the off-diagonal elements of Σ, we define c2 to be the between-subject variance. Similarly, since c1 in the compound symmetry projection of Σ is the difference between the averages of the diagonal and off-diagonal elements of Σ, we define c1 to be the within-subject variance. These definitions coincide with the conventional ones when Σ is compound symmetric.
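Because c1 and c2 in (15) depend only on the averages of the diagonal and off-diagonal elements, the compound symmetry projection is straightforward to compute. A minimal sketch (our own illustration):

```python
import numpy as np

def cs_projection(S):
    """Compound symmetry projection c1*I + c2*11' of a k x k covariance
    matrix S: c2 is the average off-diagonal element and c1 is the average
    diagonal minus the average off-diagonal element, as in (15)."""
    k = S.shape[0]
    diag_mean = np.trace(S) / k
    off_mean = (S.sum() - np.trace(S)) / (k * (k - 1))
    c1 = diag_mean - off_mean
    c2 = off_mean
    return c1, c2, c1 * np.eye(k) + c2 * np.ones((k, k))

# A compound symmetric matrix equals its own projection:
S_cs = 1.0 * np.eye(3) + 0.5 * np.ones((3, 3))
c1, c2, P = cs_projection(S_cs)

# An AR(1)-type matrix does not:
S_ar = np.array([[1.0, 0.5, 0.25],
                 [0.5, 1.0, 0.5],
                 [0.25, 0.5, 1.0]])
c1a, c2a, Pa = cs_projection(S_ar)
```

The equivalent expression c2 = Var(ȳ) − c1/k in (15) can be verified numerically, since Var(ȳ) = 1ᵀΣ1/k² is just the average of all elements of Σ.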
The unconditional covariance matrix of yi under the general model (12) is

Var(yi) = σe² I + σb² 11T + Var(Xi β) + E(Zi Σc ZiT) + Mz Σbc 1T + 1 ΣbcT MzT, (16)

whose compound symmetry projection is exactly (σe² + Ew)I + (σb² + Eb)11T. Note that σe² + Ew appears in the limit λ1 given by (13), representing the within-subject variance of the outcomes in the null model without covariates. Thus, R̃1² measures how the covariates explain the within-subject variance. The old R1², however, mixes the covariate effects on the within-subject variance and on the means of the outcomes. Similarly, R̃2² measures how the covariates explain the between-subject variance. R2², however, is still not an appropriate measure unless the covariate mean is constant over time, for the reasons discussed in Section 2.

3.2. Unbalanced Data


For unbalanced data, the maximum likelihood estimates of the regression coefficients and variance components in model (12) do not have explicit forms. The R² statistics are still defined by (5) or (10). For the maximum likelihood estimate σ̃e0² of σe0² in the null model (9), we prove in the Appendix that its limit is equivalent to

σe0² = σe² + Σ_{i=1}^n [Σ_{t=1}^{ki} Var(xitT β) − ki Var(x̄i.T β) + Σ_{t=1}^{ki} E(zitT Σc zit) − ki E(z̄i.T Σc z̄i.)] / (N − n),

where N = Σ_{i=1}^n ki. Let (σe² + Ewi)I_{ki} + (σb² + Ebi)1_{ki}1_{ki}T be the compound symmetry projection of the covariance matrix Var(yi), where Ewi and Ebi are the same as those in (13) with k replaced by ki. One can verify that

Σ_{t=1}^{ki} Var(xitT β) − ki Var(x̄i.T β) + Σ_{t=1}^{ki} E(zitT Σc zit) − ki E(z̄i.T Σc z̄i.) = (ki − 1)Ewi.

Hence, as n → ∞, R̃1² − λ1 converges to 0 in probability, where

λ1 = 1 − σe² / (σe² + Σ_{i=1}^n [(ki − 1)/(N − n)] Ewi). (17)

Following the discussion in Section 3.1, σe² + Ewi is the within-subject variance of yi unconditional on the covariates, and the weighted average Σ_{i=1}^n [(ki − 1)/(N − n)](σe² + Ewi) can be considered an overall within-subject variance of the outcomes in the null model. The interpretation of R̃1² is thus the same as that for balanced data.
For the between-subject variance, the limit of the maximum likelihood estimate σ̃b0² of σb0² in model (9) satisfies the equation

σb0² = Σ_{i=1}^n wi [Var(ȳi.) − σe0²/ki],  wi = [ki²/(σe0² + ki σb0²)²] / Σ_{j=1}^n [kj²/(σe0² + kj σb0²)²].

This equation is not an explicit solution for σb0², since wi depends on both σe0² and σb0² unless ki = k for all i. Since σe0² is the unconditional within-subject variance, Var(ȳi.) − σe0²/ki can be considered an approximation to the unconditional between-subject variance of yi, following the definition of c2 in the compound symmetry projection (15). The weighted average Σ_{i=1}^n wi [Var(ȳi.) − σe0²/ki] is then an approximation to the between-subject variance of the outcomes in the null model without covariates. As n → ∞, R̃2² − λ2 converges to 0 in probability, where

λ2 = 1 − σb² / Σ_{i=1}^n wi [Var(ȳi.) − σe0²/ki]. (18)

Hence, the interpretation of the level-two statistic R̃2² is also the same as that for balanced data.
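Since the weights wi depend on (σe0², σb0²), the equation for σ̃b0² above is only implicit, but it can be solved by simple fixed-point iteration. The sketch below is our own illustration under assumed inputs (the Var(ȳi.), ki, and σe0² are taken as known), not the authors' algorithm:

```python
import numpy as np

def solve_s2b0(var_ybar, ks, s2e0, n_iter=200):
    """Fixed-point iteration for sigma_b0^2 in the implicit equation of
    Section 3.2: update the weights w_i, then the weighted average."""
    s2b0 = 0.1  # arbitrary starting value
    for _ in range(n_iter):
        w = ks**2 / (s2e0 + ks * s2b0) ** 2
        w = w / w.sum()
        s2b0 = np.sum(w * (var_ybar - s2e0 / ks))
    return s2b0

# Hypothetical unbalanced design with cluster sizes 2, 3 and 4.
ks = np.array([2.0, 3.0, 4.0])
s2e0, s2b_true = 1.0, 0.5
# Under a compound symmetric model, Var(ybar_i) = sigma_b^2 + sigma_e0^2/k_i,
# so the iteration should recover sigma_b^2 exactly.
var_ybar = s2b_true + s2e0 / ks
s2b0 = solve_s2b0(var_ybar, ks, s2e0)
```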
On the other hand, the R² statistics defined by (5) still include the effect of the time-dependent covariate means. As n → ∞, R1² − λ1^D converges to 0 in probability, where

λ1^D = 1 − σe² / (σe² + Σ_{i=1}^n [(ki − 1)/(N − n)] Ewi + D),  D = Σ_{i=1}^n Σ_{t=1}^{ki} [βT(µxit − µ̄xi.)]² / (N − n).

Note that D quantifies the effect of the covariate means on the outcome means. Similarly, R2² also involves D and can be negative in some cases.
Results (17) and (18) reduce to the earlier results (13) for balanced data, and further reduce to (6) for the random intercept model with balanced data.
Under the general model (12), it is difficult to derive the asymptotic distributions of the R² statistics explicitly, although the R² statistics are still asymptotically normal since the maximum likelihood estimators are asymptotically normal. The bootstrap technique can be applied to calculate confidence intervals. Ukoumunne et al. (2003) proposed a nonparametric bootstrap procedure for confidence intervals for the intraclass correlation coefficient that resamples subjects. Their method can be easily applied to the R² statistics. A transformation such as log(1 + R²) may be applied to improve the performance of the confidence intervals when n is not very large.
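A subject-level bootstrap in the spirit of Ukoumunne et al. (2003) resamples whole subjects with replacement, recomputes the R² statistic, and forms a percentile interval on the log(1 + R²) scale. The sketch below is our own illustration for a balanced random intercept model, with moment-based variance components standing in for maximum likelihood:

```python
import numpy as np

def variance_components(r):
    """(sigma_e^2, sigma_b^2) moment estimates from an n x k residual matrix."""
    n, k = r.shape
    rm = r.mean(axis=1)
    msw = ((r - rm[:, None]) ** 2).sum() / (n * (k - 1))
    msb = k * ((rm - r.mean()) ** 2).sum() / (n - 1)
    return msw, (msb - msw) / k

def level_one_r2(x, y):
    """Level-one R^2 as in (5), with pooled OLS for beta."""
    xc, yc = x - x.mean(), y - y.mean()
    beta = (xc * yc).sum() / (xc**2).sum()
    s2e, _ = variance_components(y - beta * x)
    s2e0, _ = variance_components(y - y.mean())
    return 1 - s2e / s2e0

rng = np.random.default_rng(2)
n, k = 200, 4
x = rng.normal(0, 0.8, (n, 1)) + rng.normal(0, 0.6, (n, k))
y = 1 + x + rng.normal(0, 0.4, (n, 1)) + rng.normal(0, 1.0, (n, k))

boot = []
for _ in range(500):
    idx = rng.integers(0, n, n)  # resample subjects, not individual observations
    boot.append(np.log1p(level_one_r2(x[idx], y[idx])))
# Percentile interval on the log(1 + R^2) scale, then back-transform.
lo, hi = np.exp(np.percentile(boot, [2.5, 97.5])) - 1
```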


Table 1: Simulation results of R² statistics for random intercept models with balanced data.

                 Within-Subject Variability             Between-Subject Variability
D       β     R1² (λ1^D)   CP    R̃1² (λ1)    CP      R2² (λ2^D)   CP    R̃2² (λ2)    CP

(n, k) = (100, 4)
0       0.5   0.36 (0.36)  95.0  0.36 (0.36)  94.3   0.13 (0.14)  93.3  0.13 (0.14)  93.4
        1.0   0.69 (0.69)  94.5  0.69 (0.69)  95.1   0.38 (0.39)  93.8  0.38 (0.39)  94.6
1       0.5   0.68 (0.68)  94.5  0.35 (0.36)  94.0   0.08 (0.09)  93.7  0.13 (0.14)  94.2
        1.0   0.89 (0.89)  94.0  0.69 (0.69)  95.2   0.26 (0.28)  94.5  0.38 (0.39)  93.1
kβ²Σx   0.5   0.87 (0.87)  95.1  0.36 (0.36)  94.8   0.01 (0.00)  94.7  0.14 (0.14)  94.6
        1.0   0.96 (0.97)  95.3  0.69 (0.69)  95.2  −0.06 (0.00)  93.6  0.38 (0.39)  93.1
(n, k) = (50, 8)
0       0.5   0.36 (0.36)  94.9  0.35 (0.36)  94.8   0.13 (0.14)  92.7  0.13 (0.14)  92.8
        1.0   0.69 (0.69)  93.5  0.69 (0.69)  94.6   0.37 (0.39)  93.8  0.37 (0.39)  94.1
1       0.5   0.68 (0.68)  93.6  0.35 (0.36)  93.8   0.11 (0.11)  93.6  0.13 (0.14)  93.2
        1.0   0.90 (0.89)  95.8  0.69 (0.69)  94.6   0.32 (0.34)  93.2  0.38 (0.39)  93.1
kβ²Σx   0.5   0.90 (0.90)  94.4  0.35 (0.36)  93.8  −0.02 (0.00)  94.2  0.13 (0.14)  93.0
        1.0   0.97 (0.97)  94.1  0.69 (0.69)  93.8  −0.06 (0.00)  93.5  0.38 (0.39)  92.9

CP is the simulation coverage probability of the confidence interval for λl or λl^D; 0.00 denotes a value smaller than 0.005.

4. A SIMULATION STUDY
In this section we demonstrate by simulation the finite sample performance of the R2 statistics.
We considered linear mixed-effect models with a single covariate, which was assumed to follow
decomposition (7), that is, xit = µxt + bxi + exit with bxi ∼ N(0, 0.64) and exit ∼ N(0, 0.36).
The covariate means were chosen to give different values of D.
We first considered balanced data. Two different sample sizes were considered: n = 100
subjects with k = 4 observations for each subject; and n = 50 subjects with k = 8 observations
for each subject. For the random intercept model (1), the data were generated from

yit = 1 + βxit + bi + eit ,

where the variances of bi and eit were 0.16 and 1, respectively. The coefficient β was 0.5 or 1. For
each combination of sample size and model parameter, we simulated the R2 statistics 1000 times.
Table 1 shows the simulation results. The R² statistics shown are the averages over 1000 simulations. All the R² statistics are very close to their limits. The values of the R² statistics increase with the coefficient β. For each fixed β, the R² statistics behave differently for different values of D. In the first case, where D = 0, R1² and R̃1² are about the same, since λ1 is identical to λ1^D when the covariate mean is constant over time. Similar results are obtained for R2² and R̃2². In the second case, where D = 1, R1² becomes larger than in the first case while the value of R2² decreases. These results suggest that time-dependent covariate means affect R1² and R2², as discussed in the previous sections: their existence inflates R1² but lowers R2². By contrast, the values of the proposed R̃1² and R̃2² tend to be the same as those in the first case, which indicates that they do not depend on the covariate means. In the third case, where D = kβ²Σx, R1² is very close to its maximum, since D is much larger than σe² and

Table 2: Simulation results of R² statistics for random intercept and slope models.

                 Within-Subject Variability             Between-Subject Variability
D       β     R1² (λ1^D)   CP    R̃1² (λ1)    CP      R2² (λ2^D)   CP    R̃2² (λ2)    CP

(n, k) = (100, 4)
0       0.5   0.53 (0.53)  93.4  0.52 (0.53)  93.7   0.37 (0.36)  93.3  0.37 (0.36)  93.4
        1.0   0.74 (0.74)  93.8  0.74 (0.74)  93.2   0.52 (0.51)  94.3  0.52 (0.51)  94.3
kβ²Σx   0.5   0.90 (0.90)  92.5  0.83 (0.84)  91.8   0.32 (0.33)  94.6  0.39 (0.40)  93.6
        1.0   0.96 (0.96)  95.2  0.87 (0.87)  94.1   0.34 (0.33)  94.1  0.54 (0.53)  94.4
(n, k) = (50, 8)
0       0.5   0.53 (0.53)  92.0  0.52 (0.53)  91.5   0.37 (0.36)  92.4  0.37 (0.36)  93.5
        1.0   0.74 (0.74)  93.8  0.74 (0.74)  93.3   0.51 (0.51)  93.5  0.51 (0.51)  93.4
kβ²Σx   0.5   0.95 (0.95)  92.9  0.90 (0.90)  93.7   0.38 (0.38)  94.4  0.44 (0.44)  94.1
        1.0   0.98 (0.98)  93.6  0.91 (0.92)  93.6   0.39 (0.38)  95.2  0.56 (0.56)  93.9

CP is the simulation coverage probability of the confidence interval for λl or λl^D.

β²Σex. A disturbing result is observed for the level-two R2² in this case: R2² is nearly zero because λ2^D = (β²Σx − D/k)/(σb² + β²Σx − D/k) = 0 when D = kβ²Σx. On the other hand, the proposed R̃1² and R̃2² have the same limits as those in the first two cases. They appear to be more appropriate in the presence of strong variation in the covariate means. The confidence intervals obtained from the asymptotic distributions (11) perform well, as the percentages of covering the true limits are close to the nominal level of 95%.
Table 2 shows the simulation results for the general linear mixed-effect model

yit = 1 + βxit + bi + cixit + eit,

where the random slope ci has variance Σc = 0.25 and is independent of the random intercept (Σbc = 0). The comparison of Rl² and R̃l² is very similar to that in Table 1. In the case where D = 0, R1² is very close to R̃1² and R2² is close to R̃2². In the case where D = kβ²Σx, R1² increases while R2² decreases. Unlike the random intercept model, R2² does not degenerate to zero, since the random slope also contributes to the explained variance. The bootstrap method was applied to construct confidence intervals for the limits of the R² statistics with the transformation log(1 + R²). The bootstrap Monte Carlo size was 200. The coverage percentages are generally close to the nominal level.
The second part of the simulation was for unbalanced data with n = 100 subjects. For each subject, the number of observations ki was randomly chosen from 2, 3, and 4. The data were generated from the random intercept model described earlier in this section. We considered two cases for the covariates: constant means (D = 0) and time-dependent means (D ≠ 0). Figure 1 plots the means and the 2.5th and 97.5th percentiles of 1000 simulated R² statistics for different values of β in each case. All R² statistics increase as β increases. For each fixed β, the new and old R² statistics are almost identical when D = 0. When D ≠ 0, the new R² statistics exclude the effect of variation in covariate means while the old R² statistics include it. In particular, the level-two R2² is around zero in some cases.

DOI: 10.1002/cjs The Canadian Journal of Statistics / La revue canadienne de statistique


362 HU, SHAO AND PALTA Vol. 38, No. 3

[Figure 1 appears here: four panels plotting R2 against β ∈ {1, 1.25, 1.5, 2, 2.5, 3}. Rows: level-one R2 (top) and level-two R2 (bottom); columns: constant covariate means (left) and time-dependent covariate means (right). Each panel overlays the old and new measures.]

Figure 1: Means, 2.5th and 97.5th percentiles of the simulated R2 statistics for unbalanced data.

5. AN EXAMPLE
In this section we illustrate our results with data from the African American Study of Kidney
Disease (AASKD). The AASKD study was a randomized clinical trial of end-stage renal
disease in African Americans. During a 5-year study period, a total of 1,094 patients were enrolled
and each patient was randomized into one of two treatment groups. We study the biomarker
albumin, which was measured annually for each patient (k = 5). There were 264 patients who
reached a terminal event of death or end-stage renal disease. These patients were excluded from our
analysis because their albumin measurements were not available after the occurrence of terminal
events. We used a linear mixed-effect model to relate albumin to six covariates. Four covariates,
gender, baseline age, baseline albumin, and treatment, are time-independent. Urinary protein
was measured annually and is thus a time-dependent covariate. The time at which albumin was
measured is also a time-dependent continuous covariate, with both between- and within-patient
variation. Figure 2 shows a boxplot of the covariate "time" at the five annual visits, which indicates
a time trend in the mean of the covariate "time."
We first consider the random intercept model (1). Table 3 shows the values of R2 statistics,
estimated fixed covariate effects, and their 95% confidence intervals under model (1). The level-
one R21 is much larger than the proposed R̃21 . This is expected from our discussion in Section 2
(i.e., the effect of D), since the time covariate has a strong trend in its mean (Figure 2) and the
time effect on the outcome is significant (P-value < 0.001). On the other hand, R̃22 is only slightly
larger than R22. This can be explained by the fact that the effect of D is divided by k = 5 in 𝒟2
given by (8). In this example, the level-two R2 statistics are much larger than the level-one R2
statistics, which suggests that the covariates contribute more to the between-subject variability
than to the variability within each subject. In theory the four time-independent covariates do not
contribute to the within-subject variability.


[Figure 2 appears here: boxplots of the covariate "time" (in months, ranging from 0 to 48) at each of the five annual visits.]

Figure 2: Boxplot of the covariate time in the AASKD trial.

We next consider model (12) with zit being the time-dependent covariates time and urinary
protein. The values of R2 statistics, estimated fixed covariate effects, and their 95% confidence
intervals under model (12) are also given in Table 3. Although the estimates of the fixed covariate
effects under the two models are nearly equal and the relationship between R2l and R̃2l remains the
same, including random slopes substantially increases the R2 values. Since model (1) is more
restrictive than model (12), the results under model (12) are more reliable.

Table 3: R2 statistics and estimated fixed effects for the albumin data example.

                               Model (1)                    Model (12)

R2 statistics (95% CI)
  R21                          0.16 (0.12, 0.20)            0.23 (0.18, 0.29)
  R̃21                          0.04 (0.02, 0.05)            0.12 (0.08, 0.19)
  R22                          0.54 (0.46, 0.63)            0.74 (0.51, 0.86)
  R̃22                          0.58 (0.51, 0.67)            0.76 (0.55, 0.87)

Estimated fixed effects (95% CI)
  Gender                       −0.027 (−0.058, 0.005)       −0.024 (−0.055, 0.007)
  Age                          −0.002 (−0.003, 0)           −0.002 (−0.003, 0)
  Treatment                    −0.011 (−0.04, 0.02)         −0.009 (−0.04, 0.02)
  Baseline albumin             0.44 (0.39, 0.49)            0.48 (0.43, 0.52)
  Time                         −0.004 (−0.005, −0.004)      −0.004 (−0.005, −0.004)
  Urinary protein              −0.077 (−0.094, −0.060)      −0.074 (−0.091, −0.058)


6. SUMMARY AND DISCUSSION


We derive limits of various R2 statistics for linear mixed-effect models, which reveal what the
R2 statistics measure. Confidence intervals for limits of the R2 statistics can be computed based
on the asymptotic distributions derived for random intercept models or can be computed by
using the bootstrap method in the general case. We point out that the existing level-one and level-
two R2 statistics in (5) include the effect of covariate means on the outcome means. New R2
statistics are proposed in (10) to exclude such an effect. While the existing level-one R2 statistic
may be useful if one would like to include the effect of covariate means as part of the explained
within-subject variation, the existing level-two R2 is not appropriate, since it can degenerate to
near zero when the covariate means vary strongly over time.
In the general case where the covariance matrix of the outcome vector is not compound
symmetric, we use a compound symmetry projection to interpret what the R2 statistics measure.
The compound symmetry projection approach may be useful in other problems where a covariance
matrix is not compound symmetric and a study of within- and between-subject variabilities is
desired. The proposed R2 statistics measure variance components corresponding to the coefficients
of the compound symmetry projection. Other variance components are not captured by these R2
statistics. It may be of interest to derive more than two R2 statistics to better measure variabilities
explained by covariates when the covariance matrix of the outcome vector is not compound
symmetric. This deserves future research.
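The paper defines its projection through the coefficients of the level-one and level-two decomposition; as a rough illustration of the idea only, the sketch below computes the least-squares (Frobenius) projection of a covariance matrix onto the compound-symmetric family aI + bJ. The function name and the Frobenius criterion are my assumptions, not the paper's construction:

```python
def cs_project(S):
    """Least-squares (Frobenius) projection of a k x k covariance matrix S
    onto the compound-symmetric family a*I + b*J (J = matrix of ones)."""
    k = len(S)
    tr = sum(S[i][i] for i in range(k))
    total = sum(sum(row) for row in S)
    b = (total - tr) / (k * (k - 1))   # average off-diagonal entry
    a = tr / k - b                     # diagonal excess over the off-diagonal
    return [[a + b if i == j else b for j in range(k)] for i in range(k)], (a, b)

# A compound-symmetric matrix is its own projection:
S = [[2.5, 0.5, 0.5],
     [0.5, 2.5, 0.5],
     [0.5, 0.5, 2.5]]   # 2*I + 0.5*J
P, (a, b) = cs_project(S)
```

Because {I, J} spans a two-dimensional linear subspace, the projection reduces to averaging the diagonal and off-diagonal entries, which is why a compound-symmetric input is returned unchanged.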
The R2 statistics discussed can all be expressed as 1 minus a ratio of two estimated variance
components. Thus, their limits are completely determined by the limits of the estimated variance
components. Verbeke & Lesaffre (1997) showed that the maximum likelihood estimators for the
variance components, obtained under the normality assumption, are consistent and asymptotically
normal even when the random effects are not normal. Therefore, our results on the limits of the
R2 statistics still hold if the normality assumption is removed. Without the normality assumption,
the R2 statistics are still asymptotically normal, but their asymptotic covariance matrices are
different from those given in (11). However, the bootstrap method described in Section 3 can
always be used to derive confidence intervals, as long as the R2 statistics are asymptotically
normal.

APPENDIX
We assume that the covariates are random and independent of the random effects and error.
Moreover, for any fixed t, x1t , · · · , xnt are i.i.d. and z1t , · · · , znt are i.i.d.

(i) Proof of the limits (6), (8), (13), and (14) of the R2 statistics. Since (6) and (8) are special
cases of (13) and (14), respectively, we only prove (13) and (14). When the data are balanced, the
maximum likelihood estimates of the variance components in the null models (2) and (9) have
the following explicit forms:

$$
\hat\sigma_{e0}^2 = \frac{\sum_{i=1}^n \sum_{t=1}^k (y_{it} - \bar y_{i.})^2}{n(k-1)}, \qquad
\hat\sigma_{b0}^2 = \frac{\sum_{i=1}^n (\bar y_{i.} - \bar y_{..})^2}{n} - \frac{\hat\sigma_{e0}^2}{k};
$$

$$
\tilde\sigma_{e0}^2 = \frac{\sum_{i=1}^n \left[ \sum_{t=1}^k (y_{it} - \bar y_{.t})^2 - k(\bar y_{i.} - \bar y_{..})^2 \right]}{n(k-1)}, \qquad
\tilde\sigma_{b0}^2 = \frac{\sum_{i=1}^n (\bar y_{i.} - \bar y_{..})^2}{n} - \frac{\tilde\sigma_{e0}^2}{k},
$$

which only involve sample moments of the outcome variables. Because $(y_{1t}, \ldots, y_{nt})$ are i.i.d. and $(\bar y_{1.}, \ldots, \bar y_{n.})$ are also i.i.d., it follows from the Law of Large Numbers that


$$
\hat\sigma_{e0}^2 \to_p \sigma_{e1}^2 = \frac{\sum_{t=1}^k E(y_{it}^2) - k E(\bar y_{i.}^2)}{k-1}, \qquad
\tilde\sigma_{e0}^2 \to_p \sigma_{e2}^2 = \frac{\sum_{t=1}^k \operatorname{Var}(y_{it}) - k \operatorname{Var}(\bar y_{i.})}{k-1},
$$

$$
\hat\sigma_{b0}^2 \to_p \operatorname{Var}(\bar y_{i.}) - \frac{\sigma_{e1}^2}{k}, \qquad
\tilde\sigma_{b0}^2 \to_p \operatorname{Var}(\bar y_{i.}) - \frac{\sigma_{e2}^2}{k},
$$

where $\to_p$ denotes convergence in probability. Under the general model (12), we have

$$
\operatorname{Var}(y_{it}) = \sigma_e^2 + \sigma_b^2 + \beta^T \operatorname{Var}(x_{it}) \beta + E(z_{it}^T \Sigma_c z_{it}) + 2 \mu_{z_t}^T \sigma_{bc},
$$

$$
\operatorname{Var}(\bar y_{i.}) = \frac{\sigma_e^2}{k} + \sigma_b^2 + \beta^T \operatorname{Var}(\bar x_{i.}) \beta + E(\bar z_{i.}^T \Sigma_c \bar z_{i.}) + 2 \bar\mu_z^T \sigma_{bc}.
$$
Therefore,

$$
\sigma_{e2}^2 = \sigma_e^2 + \beta^T \frac{\sum_{t=1}^k \operatorname{Var}(x_{it}) - k \operatorname{Var}(\bar x_{i.})}{k-1} \beta + \frac{\sum_{t=1}^k E(z_{it}^T \Sigma_c z_{it}) - k E(\bar z_{i.}^T \Sigma_c \bar z_{i.})}{k-1} = \sigma_e^2 + E_w.
$$

Similarly,

$$
\operatorname{Var}(\bar y_{i.}) - \frac{\sigma_{e2}^2}{k} = \sigma_b^2 + \frac{2 \sum_{t<l} \left[ \beta^T \operatorname{Cov}(x_{it}, x_{il}) \beta + E(z_{it}^T \Sigma_c z_{il}) \right]}{k(k-1)} + 2 \bar\mu_z^T \sigma_{bc} = \sigma_b^2 + E_b.
$$

Hence, result (13) holds, since $\hat\sigma_e^2$ and $\hat\sigma_b^2$ from the full model (12) are consistent estimates of $\sigma_e^2$ and $\sigma_b^2$, respectively. The limits (14) for the old R2 statistics follow from the fact that

$$
\sigma_{e1}^2 - \sigma_{e2}^2 = \frac{\sum_{t=1}^k E^2(y_{it}) - k E^2(\bar y_{i.})}{k-1} \equiv D
$$

and $\sigma_{b1}^2 - \sigma_{b2}^2 = -D/k$. ∎
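These moment estimators are easy to compute directly, and the difference σ̂²e0 − σ̃²e0 has an exact finite-sample analogue of D, namely Σt(ȳ.t − ȳ..)²/(k − 1), with σ̂²b0 − σ̃²b0 equal to −1/k times that, mirroring the limits above. A sketch checking this on simulated balanced data (the simulation settings here are my own, chosen only to make D > 0):

```python
import random

random.seed(0)
n, k = 50, 4
mu = [0.0, 1.0, 2.0, 3.0]  # time-dependent means, so the sample analogue of D > 0

y = []
for i in range(n):
    b = random.gauss(0, 1)  # subject-specific random intercept
    y.append([mu[t] + b + random.gauss(0, 1) for t in range(k)])

ybar_i = [sum(row) / k for row in y]                       # subject means
ybar_t = [sum(y[i][t] for i in range(n)) / n for t in range(k)]  # time means
grand = sum(ybar_i) / n

se0_hat = sum((y[i][t] - ybar_i[i]) ** 2
              for i in range(n) for t in range(k)) / (n * (k - 1))
se0_til = sum(sum((y[i][t] - ybar_t[t]) ** 2 for t in range(k))
              - k * (ybar_i[i] - grand) ** 2
              for i in range(n)) / (n * (k - 1))
sb0_hat = sum((m - grand) ** 2 for m in ybar_i) / n - se0_hat / k
sb0_til = sum((m - grand) ** 2 for m in ybar_i) / n - se0_til / k

# Sample analogue of D: variation of the time means around the grand mean.
D_hat = sum((m - grand) ** 2 for m in ybar_t) / (k - 1)
```

Here `se0_hat - se0_til` equals `D_hat` exactly (up to floating point), and `sb0_hat - sb0_til` equals `-D_hat / k`, the finite-sample counterpart of σ²e1 − σ²e2 = D and σ²b1 − σ²b2 = −D/k.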

(ii) Proof of (11) for random intercept models. Let $r_{it} = y_{it} - \alpha - \beta^T x_{it}$ and $\hat r_{it} = y_{it} - \hat\alpha - \hat\beta^T x_{it}$, where $\hat\theta = (\hat\alpha, \hat\beta^T)^T$ denotes the MLE of $(\alpha, \beta)$ in model (1). When the data are balanced, we have

$$
\hat\sigma_e^2 = \frac{\sum_{i=1}^n \sum_{t=1}^k \hat r_{it}^2 - k \sum_{i=1}^n \bar{\hat r}_{i.}^2}{n(k-1)}, \qquad
\hat\sigma_b^2 = \frac{k^2 \sum_{i=1}^n \bar{\hat r}_{i.}^2 - \sum_{i=1}^n \sum_{t=1}^k \hat r_{it}^2}{nk(k-1)}.
$$

Let $T_i = (r_{i1}^2, \ldots, r_{ik}^2, \bar r_{i.}^2, y_{i1}^2, \ldots, y_{ik}^2, \bar y_{i.}^2, y_i^T)^T$ be a random $(3k+2)$-vector with mean $\mu_T$ and covariance matrix $\Sigma_T$. Define $f_1(\cdot)$ and $f_2(\cdot)$ as two functions from the Euclidean space $\mathbb{R}^{3k+2}$ to $\mathbb{R}$: for a vector $a = (a_1, \ldots, a_{3k+2})^T$,

$$
f_1(a) = 1 - \frac{\sum_{i=1}^k a_i - k a_{k+1}}{\sum_{i=k+2}^{2k+1} a_i - k a_{2k+2}},
\qquad
f_2(a) = 1 - \frac{k^2 a_{k+1} - \sum_{i=1}^k a_i}{k^2 a_{2k+2} - \sum_{i=k+2}^{2k+1} a_i - \frac{k-1}{k} \left( \sum_{i=2k+3}^{3k+2} a_i \right)^2}. \tag{19}
$$

Consider R21 as a function of the maximum likelihood estimate θ̂, and write R21 = R21 (θ̂). Then
f1 (T̄ ) = R21 (θ̂), which has the same sampling distribution as R21 . By the Central Limit Theorem,


$\sqrt{n}\,(\bar T - \mu_T) \to_d N(0, \Sigma_T)$, and thus

$$
\sqrt{n}\left( f_1(\bar T) - f_1(\mu_T) \right) \to_d N(0, \sigma_1^2)
$$

with $\sigma_1^2 = \nabla f_1(\mu_T)^T \Sigma_T \nabla f_1(\mu_T)$ by the delta method. Note that $f_1(\mu_T)$ is identical to $\mathcal{D}_1$. Hence $\sqrt{n}\,(R_1^2(\hat\theta) - \mathcal{D}_1) \to_d N(0, \sigma_1^2)$. The result for $R_2^2$ can be established in the same way. Similarly, for $l = 1, 2$, $\sqrt{n}\,(\tilde R_l^2 - \tilde{\mathcal{D}}_l) \to_d N(0, \tilde\sigma_l^2)$, with $\sigma_l^2 = \nabla f_l(\mu_T)^T \Sigma_T \nabla f_l(\mu_T)$ and $\tilde\sigma_l^2 = \nabla \tilde f_l(\mu_T)^T \Sigma_T \nabla \tilde f_l(\mu_T)$, where

$$
\tilde f_1(a) = 1 - \frac{\sum_{i=1}^k a_i - k a_{k+1}}{\sum_{i=k+2}^{2k+1} a_i - k a_{2k+2} - \sum_{i=2k+3}^{3k+2} a_i^2 + \frac{1}{k} \left( \sum_{i=2k+3}^{3k+2} a_i \right)^2},
$$

$$
\tilde f_2(a) = 1 - \frac{k^2 a_{k+1} - \sum_{i=1}^k a_i}{k^2 a_{2k+2} - \sum_{i=k+2}^{2k+1} a_i + \sum_{i=2k+3}^{3k+2} a_i^2 - \left( \sum_{i=2k+3}^{3k+2} a_i \right)^2}. \tag{20}
$$
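The identity $f_1(\bar T) = R_1^2(\hat\theta)$ behind the delta-method argument can be verified numerically (a sketch on simulated balanced data; for simplicity the residuals use the true $(\alpha, \beta)$ rather than the MLE, which does not affect the algebraic identity):

```python
import random

random.seed(0)
n, k, beta = 200, 4, 1.5

# Balanced random intercept model: y_it = beta * x_it + b_i + e_it.
x = [[random.gauss(0, 1) for _ in range(k)] for _ in range(n)]
y, r = [], []
for i in range(n):
    b = random.gauss(0, 1)  # random intercept
    row = [beta * x[i][t] + b + random.gauss(0, 1) for t in range(k)]
    y.append(row)
    r.append([row[t] - beta * x[i][t] for t in range(k)])  # residuals b_i + e_it

def T(i):
    # T_i = (r_i1^2..r_ik^2, rbar_i.^2, y_i1^2..y_ik^2, ybar_i.^2, y_i1..y_ik)
    rbar = sum(r[i]) / k
    ybar = sum(y[i]) / k
    return ([v * v for v in r[i]] + [rbar * rbar]
            + [v * v for v in y[i]] + [ybar * ybar] + list(y[i]))

Tbar = [sum(T(i)[j] for i in range(n)) / n for j in range(3 * k + 2)]

def f1(a):
    # Level-one R^2 as the smooth function of moments in (19) (0-based indices).
    num = sum(a[:k]) - k * a[k]
    den = sum(a[k + 1:2 * k + 1]) - k * a[2 * k + 1]
    return 1 - num / den

# Direct computation: R^2_1 = 1 - sigma_e^2 / sigma_e0^2.
sse = sum((r[i][t] - sum(r[i]) / k) ** 2 for i in range(n) for t in range(k))
sse0 = sum((y[i][t] - sum(y[i]) / k) ** 2 for i in range(n) for t in range(k))
direct = 1 - sse / sse0
```

Both routes give the same number, because the numerator and denominator of $f_1(\bar T)$ are exactly the within-subject sums of squares of the residuals and of the outcomes, each divided by $n$.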

(iii) Proof of (17) for unbalanced data. Let $\hat\mu_{it}^0$ be the estimate of the mean of $y_{it}$ in the null model (9). The score equations for the variance components are

$$
\frac{N-n}{\sigma_{e0}^2} - \frac{1}{\sigma_{e0}^4} \sum_{i=1}^n \sum_{t=1}^{k_i} (r_{it} - \bar r_{i.})^2 = \sum_{i=1}^n \frac{k_i \bar r_{i.}^2}{\lambda_i^2} - \sum_{i=1}^n \frac{1}{\lambda_i},
$$

$$
\sum_{i=1}^n \frac{k_i}{\lambda_i} = \sum_{i=1}^n \frac{k_i^2 \bar r_{i.}^2}{\lambda_i^2},
$$

where $r_{it} = y_{it} - \hat\mu_{it}^0$, $\lambda_i = \sigma_{e0}^2 + k_i \sigma_{b0}^2$, and $N = \sum_{i=1}^n k_i$.
First, the second score equation can be rewritten as

$$
\sigma_{b0}^2 = \sum_{i=1}^n w_i \left( \bar r_{i.}^2 - \frac{\sigma_{e0}^2}{k_i} \right)
\quad \text{with} \quad
w_i = \frac{k_i^2 / \lambda_i^2}{\sum_{i=1}^n k_i^2 / \lambda_i^2}.
$$

It is easy to verify that $\hat\mu_{it}^0$ is a consistent estimate of the mean $E(y_{it})$, since the null model (9) correctly specifies the mean of $y_{it}$ (though the covariance structure could be misspecified); in fact, $\hat\mu_{it}^0$ is a weighted average of the responses measured at time point $t$. The first score equation can be written as
$$
\hat\sigma_{e0}^2 = \frac{\sum_{i=1}^n \left( \sum_{t=1}^{k_i} r_{it}^2 - k_i \bar r_{i.}^2 \right)}{N-n} + \frac{\hat\sigma_{e0}^4}{N-n} \sum_{i=1}^n \frac{k_i \bar r_{i.}^2 - \lambda_i}{\lambda_i^2}.
$$


The second term on the right-hand side is of order $o_p(1)$ because of the second score equation and the fact that $k_i$ is bounded. Thus, we approximate $\hat\sigma_{e0}^2$ by

$$
\frac{\sum_{i=1}^n \left( \sum_{t=1}^{k_i} r_{it}^2 - k_i \bar r_{i.}^2 \right)}{N-n},
$$

which is essentially the ANOVA estimate. The limit (17) then holds, since this quantity is asymptotically equivalent to

$$
\frac{\sum_{i=1}^n \left[ \sum_{t=1}^{k_i} \operatorname{Var}(y_{it}) - k_i \operatorname{Var}(\bar y_{i.}) \right]}{N-n}.
$$
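The ANOVA-type estimate is straightforward to compute for unbalanced data; note that the numerator term Σt r²it − ki r̄²i. is just the within-subject sum of squares Σt(rit − r̄i.)². A sketch (the residual values below are made up purely for illustration):

```python
def anova_sigma_e2(residuals):
    """ANOVA-type estimate of sigma_e^2 for unbalanced data:
    sum_i (sum_t r_it^2 - k_i * rbar_i^2) / (N - n),
    where residuals[i] is the list of residuals for subject i."""
    n = len(residuals)
    N = sum(len(r) for r in residuals)
    ss = 0.0
    for r in residuals:
        k_i = len(r)
        rbar = sum(r) / k_i
        # Equals sum_t (r_it - rbar_i)^2, so each contribution is nonnegative.
        ss += sum(v * v for v in r) - k_i * rbar * rbar
    return ss / (N - n)

# Hypothetical residuals for three subjects with k_i = 2, 3, 4 (N - n = 6).
res = [[0.3, -0.1], [1.2, 0.8, 1.0], [-0.5, 0.1, -0.2, 0.0]]
est = anova_sigma_e2(res)
```

Because the numerator is a sum of within-subject sums of squares, the estimate is always nonnegative, unlike the raw solution of the score equations.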

ACKNOWLEDGMENTS
The authors thank the editor, the associate editor and two referees for their helpful comments and
suggestions.

BIBLIOGRAPHY
N. R. Draper & H. Smith (1998). “Applied Regression Analysis.” New York: John Wiley & Sons, Inc.
D. A. Harville (1977). Maximum likelihood approaches to variance component estimation and to related
problems. Journal of the American Statistical Association, 72, 320–338.
I. S. Helland (1987). On the interpretation and use of R2 in regression analysis. Biometrics, 43, 61–69.
B. Hu, M. Palta & J. Shao (2006). Properties of R2 statistics for logistic regression. Statistics in Medicine,
25, 1383–1395.
J. T. Kent (1983). Information gain and a general measure of correlation. Biometrika, 70, 163–173.
E. L. Korn & R. Simon (1991). Explained residual variation, explained risk, and goodness of fit. American
Statistician, 45, 201–206.
N. M. Laird & J. H. Ware (1982). Random effects models for longitudinal data. Biometrics, 38, 963–974.
M. Mittlböck & M. Schemper (1996). Explained variation for logistic regression. Statistics in Medicine, 15,
1987–1997.
S. W. Raudenbush & A. S. Bryk (2002). "Hierarchical Linear Models: Applications and Data Analysis
Methods." Newbury Park, CA: Sage Publications.
M. Schemper & R. Henderson (2000). Predictive accuracy and explained variation in Cox regression. Bio-
metrics, 56, 249–255.
J. D. Singer & J. B. Willett (2003). "Applied Longitudinal Data Analysis: Modeling Change and Event
Occurrence." New York: Oxford University Press.
T. A. Snijders & R. J. Bosker (1999). "Multilevel Analysis: An Introduction to Basic and Advanced
Multilevel Modeling." London: Sage Publications.
O. C. Ukoumunne, A. C. Davison, M. C. Gulliford & S. Chinn (2003). Non-parametric bootstrap confidence
intervals for the intraclass correlation coefficient. Statistics in Medicine, 22, 3805–3821.
G. Verbeke & G. Molenberghs (2000). “Linear Mixed Models for Longitudinal Data.” New York: Springer.
G. Verbeke & E. Lesaffre (1997). The effect of misspecifying the random-effects distribution in linear mixed
models for longitudinal data. Computational Statistics & Data Analysis, 23, 541–556.


R. Xu (2003). Measuring explained variation in linear mixed effects models. Statistics in Medicine, 22,
3527–3541.
B. Y. Zheng (2000). Summarizing the goodness of fit of generalized linear models for longitudinal data.
Statistics in Medicine, 19, 1265–1275.

Received 6 April 2009


Accepted 24 March 2010
