Random Effects Models

Yanez, Spring 2004, Lecture Notes XI


Linear Random Effects Models

Ordinary (linear) regression assumes subjects are randomly sampled from some infinitely large population. Let
• Yi = outcome variable for subject i,
• xi = covariate value for subject i. We assume
◦ E[Yi | xi] = β0 + β1 xi
◦ Var[Yi | xi] = σ² (constant over the x's)
• The model can be written as
Yi = β0 + β1 xi + εi,
where β0 + β1 xi is the fixed part and εi is the error term.

◦ Errors are independent with constant variance


• Independence is a consequence of the sampling design
Random effects models are often necessary when the observations
(i.e., elementary units) are not obtained by simple random sampling
but come from a cluster or multi-level sampling design.
• The design may induce additional sources of variation that need to be taken into account by the model (see the sketch below)
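A minimal simulation sketch of this point (all numbers are hypothetical; rnormal() requires Stata 10 or later): data drawn through a cluster design carry an extra, cluster-level source of variation on top of the subject-level error.

. * sketch: clusters (e.g., clinics) induce an extra variance component
. clear
. set seed 1234
. set obs 15                        // 15 clusters
. gen clinic = _n
. gen b = rnormal(0, 2)             // cluster-level random effect
. expand 10                         // 10 subjects per cluster
. gen y = 100 + b + rnormal(0, 5)   // outcome = mean + cluster effect + error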



Examples

• I. Insulin/Clinics: A treatment (new insulin injection) is being tested using 15 different clinics in New York State
◦ Clinics are a random sample from all clinics in NY State.
◦ ni patients within clinic i (i = 1, 2, . . . , 15) are randomized equally to receive either the new treatment or the existing therapy. Each subject is measured 1 hour after receiving treatment.
◦ Let Y represent some continuous response (e.g., a measure of blood sugar). Write out a plausible model for this study design:
Y =

◦ What sources of variation can be identified


. if the 15 clinics represented all clinics across the state?
. if the patients were randomly sampled from NY state records?
. if patients were measured repeatedly for one week post-treatment administration?
• In addition to the sampling design, characteristics that are measured/recorded may also be useful for identifying sources of variation in the data.



Examples

• II: Worker Productivity (Scheffé, 1958):
◦ Random sample of N workers selected from a (large) population of workers in a company,
◦ record daily productivity (output) for the ith worker on day j,
◦ assume no temporal trends; days are considered a random sample within each worker (nesting)
◦ Model:
Yij = µ + bi + εij,
. µ overall (population) average productivity,
. bi (random) worker effect, bi ∼ N(0, σb²)
. εij (random) error, εij ∼ N(0, σε²)
◦ Assume constant within-cluster variance, Var[Yij | bi] = σε²
. may not be reasonable if variances depend on means
◦ Independence of errors:
εij = Yij − E[Yij | bi] = Yij − (µ + bi)

• Define the effect of worker i (or clinic) as
E[Yij | bi] − E[Yij] = µ + bi − µ = bi.
• This is a random variable that depends on the cluster i



Intraclass Correlation

• The sampling design implies independence of εij and bi. Why?


• The variance of Yij decomposes into
σ² = Var[Yij] = Var[bi] + Var[εij] = σb² + σε²
and we refer to σb² and σε² as variance components.
• The intraclass correlation is defined as the ordinary correlation between two different observations (j ≠ j′) in the same cluster (e.g., worker or clinic), i.e., with the same i:
ρ = E[(Yij − µ)(Yij′ − µ)] / σ²
  = E[(bi + εij)(bi + εij′)] / σ²
  = E(bi²) / σ²
  = σb² / (σb² + σε²)

• The covariance structure is given by
Cov(Yij, Yi′j′) =
  0,           i ≠ i′
  σb²,         i = i′, j ≠ j′
  σb² + σε²,   i = i′, j = j′

• The correlation structure is exchangeable within cluster.
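A quick numeric illustration (hypothetical variance components, not the worker data that follow): with σb² = 3 and σε² = 1, any two observations on the same cluster have correlation

. * hypothetical values, for illustration only
. display 3/(3 + 1)
.75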



Example II

• N = 18 workers, 2 ≤ ni ≤ 4 observations (days) per worker, with a total of n1 + · · · + n18 = 57 observations on productivity output.

. use machine
. table id, c(n output mean output sd output)
----------------------------------------------------
id | N(output) mean(output) sd(output)
----------+-----------------------------------------
1 | 2 9.15 1.06066
2 | 4 9.475 .801561
3 | 3 8.266666 .9291573
4 | 4 8.200001 1.055146
5 | 3 15.03333 .8144526
6 | 2 11.55 1.06066
7 | 2 11.45 1.909188
8 | 4 11.525 1.021029
9 | 3 11.26667 .7234181
10 | 3 10.13333 1.289703
11 | 3 11.13333 .8082907
13 | 3 16.1 1.5
14 | 3 18.96667 2.150193
15 | 4 15.35 2.330236
16 | 3 16.6 1.83303
17 | 4 15.3 1.783255
18 | 4 14.35 2.176389
19 | 3 10.43333 .6506408
----------------------------------------------------

• Substantial variability between workers compared to the within-worker standard deviations
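As a quick cross-check before fitting a model, Stata's loneway command gives the one-way ANOVA estimates of the variance components and the intraclass correlation for these data (its output is not reproduced here):

. loneway output id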



Example II – xtreg Command

. gen cons=1
. xtreg output cons, i(id)
Random-effects GLS regression Number of obs = 57
Group variable (i) : id Number of groups = 18
R-sq: within = . Obs per group: min = 2
between = . avg = 3.2
overall = 0.0000 max = 4
Random effects u_i ~ Gaussian Wald chi2(0) = 279.17
corr(u_i, X) = 0 (assumed) Prob > chi2 = .
------------------------------------------------------------------
output | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------+----------------------------------------------------------
cons | 12.4697 .7463195 16.71 0.000 11.00694 13.93246
_cons | (dropped)
-------------+----------------------------------------------------
sigma_u | 3.0496204
sigma_e | 1.4708855
rho | .81127331 (fraction of variance due to u_i)
------------------------------------------------------------------

• Estimates:
◦ σ̂b = sigma_u ≈ 3.05 (so σ̂b² ≈ 9.30)
◦ σ̂ε = sigma_e ≈ 1.47 (so σ̂ε² ≈ 2.16)
◦ ρ̂ = rho ≈ 0.81.
• The intra-class correlation is over-stated because the model fails to take into account that workers use different machines.
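The reported rho can be reproduced directly from sigma_u and sigma_e, confirming that Stata reports them as standard deviations:

. display 3.0496204^2/(3.0496204^2 + 1.4708855^2)   // matches rho = .81127331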



Example II – True Model Structure

• Note: Workers are nested within machines.


. table machine worker, c(mean output n output) col f(%8.2f)

brands of | worker nested in machine
machine | 1 2 3 4 Total
----------+----------------------------------
1 | 9.15 9.48 8.27 8.20 8.75
| 2 4 3 4 13
|
2 | 15.03 11.55 11.45 11.52 12.47
| 3 2 2 4 11
|
3 | 11.27 10.13 11.13 10.84
| 3 3 3 9
|
4 | 16.10 18.97 15.35 16.60 16.65
| 3 3 4 3 13
|
5 | 15.30 14.35 10.43 13.63
| 4 4 3 11
---------------------------------------------

• Think of workers at each machine as a random sample from an infinite population of workers on that machine
• M = 5 machines, m = 1, . . . , M, with population means µm
• Output Ymij measured on the jth day for the ith worker on machine m
• Model: Ymij = µm + bmi + εmij
• Random effects bmi represent deviations of the long-term average output for worker (m, i) from the population mean µm
• Errors εmij are deviations of the output measured on day j from the long-term worker average µm + bmi



Example II (cont.)

. xi: xtreg output i.machine, i(id)


i.machine _Imachine_1-5 (naturally coded; _Imachine_1 omitted)

Random-effects GLS regression Number of obs = 57


Group variable (i) : id Number of groups = 18
R-sq: within = 0.0000 Obs per group: min = 2
between = 0.8124 avg = 3.2
overall = 0.7045 max = 4
Random effects u_i ~ Gaussian Wald chi2(4) = 56.86
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

----------------------------------------------------------------------
output | Coef. Std. Err. z P>|z| [95% Conf. Interval]
------------+---------------------------------------------------------
_Imachine_2 | 3.667751 1.120799 3.27 0.001 1.471026 5.864476
_Imachine_3 | 2.080964 1.197118 1.74 0.082 -.2653444 4.427272
_Imachine_4 | 7.963425 1.102605 7.22 0.000 5.802359 10.12449
_Imachine_5 | 4.671042 1.179784 3.96 0.000 2.358708 6.983377
_cons | 8.763481 .7820011 11.21 0.000 7.230787 10.29617
------------+---------------------------------------------------------
sigma_u | 1.3174599
sigma_e | 1.4708855
rho | .44514222 (fraction of variance due to u_i)
----------------------------------------------------------------------

• Same σ̂ε as in the "constant mean" model
• Much smaller estimates for σ̂b and ρ̂, reflecting systematic (fixed) effects of machines; variation previously attributed to workers is now explained by differences between machines
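Again rho follows directly from the new variance components:

. display 1.3174599^2/(1.3174599^2 + 1.4708855^2)   // matches rho = .44514222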



Residuals at Two Levels

• Stata output includes "predictions" or "estimates" of the two levels of model error terms:
1. bmi, the error associated with worker (m, i)
2. εmij, the error of measurement on day j, worker (m, i)
• qnorm plots examine whether these errors appear to have a normal
distribution
◦ Departure from linearity ⇒ non-normality
• First look at level two residuals: worker random effects
. predict ALPHA, u
. qnorm ALPHA
[Figure: normal Q-Q plot (qnorm) of ALPHA (U) against Inverse Normal quantiles; values range from −2.12025 to 1.8383]

• Continue with the unit-level residuals


. predict EPSILON, e
. qnorm EPSILON



Residuals at Two Levels

[Figure: normal Q-Q plot (qnorm) of EPSILON against Inverse Normal quantiles; values range from −2.71672 to 3.12287]

• Can also check the assumed independence of residuals at the two levels:
. gr ALPHA EPSILON

[Figure: scatterplot of ALPHA (−2.12025 to 1.8383) against EPSILON (−2.0325 to 3.12287)]
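A numeric companion to the scatterplot (correlate is a standard Stata command; note this only checks linear association, not full independence):

. correlate ALPHA EPSILON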



Best Linear Unbiased Predictors (BLUPs)

• Calculate the mean output Ȳmi for each worker:
. sort id
. by id: egen Y_AVE = mean(output)
• According to the model,
Ȳmi = µm + bmi + ε̄mi·
where ε̄mi· is the average of the εmij over days j.

• The fixed-effects estimate of bmi is just the deviation of the worker average from the estimated mean for his machine:
b̂mi = Ȳmi − µ̂m
◦ Fails to account for the knowledge that the b's are drawn from a (normal) distribution with mean 0
◦ Will be too large if, by chance, ε̄mi· >> 0
◦ Will be too small if, by chance, ε̄mi· << 0
• A better estimate b̃mi shrinks b̂mi towards 0, with the degree of shrinkage depending on the intraclass correlation ρ and the cluster sizes nmi
◦ Small ρ ⇒ more shrinkage
◦ Large ρ ⇒ less shrinkage
• Why?



BLUPs are Empirical Bayes Estimators

• Random effects are random variables
• Data: {Ymij, m = 1, . . . , M; i = 1, . . . , Nm; j = 1, . . . , nmi} have a joint probability distribution with the random effects bmi
• The BLUP of a random effect is defined as
b̃mi = E(bmi | data; µ̂m, σ̂b², σ̂ε²)

• When the bmi and εmij are assumed to have independent normal distributions (as they are here), then
b̃mi = [ σ̂b² / (σ̂b² + σ̂ε²/nmi) ] b̂mi
    = [ ρ̂ / (ρ̂ + (1 − ρ̂)/nmi) ] (Ȳmi − µ̂m)
◦ "Empirical" Bayes comes from the fact that estimates are substituted for the variance components (instead of treating them as random also)
◦ As nmi → ∞, the BLUP → the fixed-effects estimate b̂mi
• The BLUP of output for worker (m, i) is
µ̃mi = µ̂m + b̃mi
• A better predictor than Ȳmi of future output of worker (m, i) in the aggregate, i.e., when the errors of prediction for all workers are considered together
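A hand check of the shrinkage formula for one worker, using only numbers already shown above: worker id 1 is on machine 1 (nmi = 2 days, Ȳ = 9.15), µ̂1 = _cons = 8.763481, and sigma_u, sigma_e come from the machine-adjusted xtreg fit. This is a sketch of the arithmetic, not official xtreg postestimation output:

. * variance components from the machine-adjusted fit
. scalar sb2 = 1.3174599^2              // sigma_b-hat squared
. scalar se2 = 1.4708855^2              // sigma_e-hat squared
. * shrinkage factor for a worker observed on n = 2 days
. scalar k = sb2/(sb2 + se2/2)          // approximately 0.62
. * shrunken (BLUP-style) estimate of worker 1's random effect
. display k*(9.15 - 8.763481)           // approximately 0.24, vs b-hat = 0.39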



BLUPs, Worker Averages and Marginal Means

• We have
µ̃mi = µ̂m + b̃mi,
where µ̂m is the marginal mean and b̃mi is the random effect.

. predict MEAN, xb
. predict BLUP, xbu
. gr Y_AVE MEAN BLUP id, s(ToS) c(.J.) xlabel(0(4)20)

[Figure: Y_AVE, the linear prediction (MEAN), and the BLUP plotted against worker id; output ranges from 8.2 to 18.9667]
