Random Effects Models

Yanez, Spring 2004, Lecture Notes XI


Linear Random Effects Models

Ordinary (linear) regression assumes subjects are randomly sampled from some infinitely large population. Let
• Yi = outcome variable for subject i,
• xi = covariate value for subject i. We assume
◦ E[Yi | xi] = β0 + β1 xi
◦ Var[Yi | xi] = σ² (constant over the x's)
• The model can be written as
Yi = β0 + β1 xi + εi,
where β0 + β1 xi is the fixed part and εi is the error term.

◦ Errors are independent with constant variance


• Independence is a consequence of the sampling design
Random effects models are often necessary when the observations
(i.e., elementary units) are not obtained by simple random sampling
but come from a cluster or multi-level sampling design.
• The design may induce additional sources of variation that need to be taken into account by the model (see the sketch below)
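A minimal simulation sketch of this point (all numbers are hypothetical; rnormal() requires Stata 10 or later): data drawn through a cluster design carry an extra, cluster-level source of variation on top of the subject-level error.

. * sketch: clusters (e.g., clinics) induce an extra variance component
. clear
. set seed 1234
. set obs 15                        // 15 clusters
. gen clinic = _n
. gen b = rnormal(0, 2)             // cluster-level random effect
. expand 10                         // 10 subjects per cluster
. gen y = 100 + b + rnormal(0, 5)   // outcome = mean + cluster effect + error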



Examples

• I. Insulin/Clinics: A treatment (new insulin injection) is being tested using 15 different clinics in New York State
◦ Clinics are a random sample from all clinics in NY State.
◦ ni patients within clinic i (i = 1, 2, . . . , 15) are randomized equally to receive either the new treatment or the existing therapy. Each subject is measured 1 hour after receiving treatment.
◦ Let Y represent some continuous response (e.g., a measure of blood sugar). Write out a plausible model for this study design:
Y =

◦ What sources of variation can be identified


. if the 15 clinics represented all clinics across the state?
. if the patients were randomly sampled from NY state records?
. if patients were measured repeatedly for one week post-treatment administration?
• In addition to the sampling design, characteristics that are measured/recorded may also be useful for identifying sources of variation in the data.



Examples

• II: Worker Productivity (Scheffé, 1958):
◦ Random sample of N workers selected from a (large) population of workers in a company,
◦ record daily productivity (output) for the ith worker on day j,
◦ assume no temporal trends; days are considered a random sample within each worker (nesting)
◦ Model:
Yij = µ + bi + εij,
. µ overall (population) average productivity,
. bi (random) worker effect, bi ∼ N(0, σb²)
. εij (random) error, εij ∼ N(0, σε²)
◦ Assume constant within-cluster variance, Var[Yij | bi] = σε²
. may not be reasonable if variances depend on means
◦ Independence of errors:
εij = Yij − E[Yij | bi] = Yij − (µ + bi)

• Define the effect of worker i (or clinic) as
E[Yij | bi] − E[Yij] = µ + bi − µ = bi.
• This is a random variable that depends on the cluster i



Intraclass Correlation

• The sampling design implies independence of εij and bi. Why?


• The variance of Yij decomposes into
σ² = Var[Yij] = Var[bi] + Var[εij] = σb² + σε²
and we refer to σb² and σε² as variance components.
• The intraclass correlation is defined as the ordinary correlation between two different observations (j ≠ j′) in the same cluster (e.g., worker or clinic), i.e., with the same i:
ρ = E[(Yij − µ)(Yij′ − µ)] / σ²
  = E[(bi + εij)(bi + εij′)] / σ²
  = E(bi²) / σ²
  = σb² / (σb² + σε²)

• The covariance structure is given by
Cov(Yij, Yi′j′) =
  0,           i ≠ i′
  σb²,         i = i′, j ≠ j′
  σb² + σε²,   i = i′, j = j′

• The correlation structure is exchangeable within cluster.
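A quick numeric illustration (hypothetical variance components, not the worker data that follow): with σb² = 3 and σε² = 1, any two observations on the same cluster have correlation

. * hypothetical values, for illustration only
. display 3/(3 + 1)
.75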



Example II

• N = 18 workers, 2 ≤ ni ≤ 4 observations (days) per worker, with a total of n1 + · · · + n18 = 57 observations on productivity output.

. use machine
. table id, c(n output mean output sd output)
----------------------------------------------------
id | N(output) mean(output) sd(output)
----------+-----------------------------------------
1 | 2 9.15 1.06066
2 | 4 9.475 .801561
3 | 3 8.266666 .9291573
4 | 4 8.200001 1.055146
5 | 3 15.03333 .8144526
6 | 2 11.55 1.06066
7 | 2 11.45 1.909188
8 | 4 11.525 1.021029
9 | 3 11.26667 .7234181
10 | 3 10.13333 1.289703
11 | 3 11.13333 .8082907
13 | 3 16.1 1.5
14 | 3 18.96667 2.150193
15 | 4 15.35 2.330236
16 | 3 16.6 1.83303
17 | 4 15.3 1.783255
18 | 4 14.35 2.176389
19 | 3 10.43333 .6506408
----------------------------------------------------

• Substantial variability between workers compared to the within-worker standard deviations
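As a quick cross-check before fitting a model, Stata's loneway command gives the one-way ANOVA estimates of the variance components and the intraclass correlation for these data (its output is not reproduced here):

. loneway output id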



Example II – xtreg Command

. gen cons=1
. xtreg output cons, i(id)
Random-effects GLS regression Number of obs = 57
Group variable (i) : id Number of groups = 18
R-sq: within = . Obs per group: min = 2
between = . avg = 3.2
overall = 0.0000 max = 4
Random effects u_i ~ Gaussian Wald chi2(0) = 279.17
corr(u_i, X) = 0 (assumed) Prob > chi2 = .
------------------------------------------------------------------
output | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------+----------------------------------------------------------
cons | 12.4697 .7463195 16.71 0.000 11.00694 13.93246
_cons | (dropped)
-------------+----------------------------------------------------
sigma_u | 3.0496204
sigma_e | 1.4708855
rho | .81127331 (fraction of variance due to u_i)
------------------------------------------------------------------

• Estimates:
◦ σ̂b = sigma_u ≈ 3.05 (so σ̂b² ≈ 9.30)
◦ σ̂ε = sigma_e ≈ 1.47 (so σ̂ε² ≈ 2.16)
◦ ρ̂ = rho ≈ 0.81.
• The intra-class correlation is over-stated because the model fails to take into account that workers use different machines.
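The reported rho can be reproduced directly from sigma_u and sigma_e, confirming that Stata reports them as standard deviations:

. display 3.0496204^2/(3.0496204^2 + 1.4708855^2)   // matches rho = .81127331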



Example II – True Model Structure

• Note: Workers are nested within machines.


. table machine worker, c(mean output n output) col f(%8.2f)

brands of | worker nested in machine
machine | 1 2 3 4 Total
----------+----------------------------------
1 | 9.15 9.48 8.27 8.20 8.75
| 2 4 3 4 13
|
2 | 15.03 11.55 11.45 11.52 12.47
| 3 2 2 4 11
|
3 | 11.27 10.13 11.13 10.84
| 3 3 3 9
|
4 | 16.10 18.97 15.35 16.60 16.65
| 3 3 4 3 13
|
5 | 15.30 14.35 10.43 13.63
| 4 4 3 11
---------------------------------------------

• Think of workers at each machine as a random sample from an infinite population of workers on that machine
• M = 5 machines, m = 1, . . . , M, with population means µm
• Output Ymij measured on the jth day for the ith worker on machine m
• Model: Ymij = µm + bmi + εmij
• Random effects bmi represent deviations of the long-term average output for worker (m, i) from the population mean µm
• Errors εmij are deviations of the output measured on day j from the long-term worker average µm + bmi



Example II (cont.)

. xi: xtreg output i.machine, i(id)


i.machine _Imachine_1-5 (naturally coded; _Imachine_1 omitted)

Random-effects GLS regression Number of obs = 57


Group variable (i) : id Number of groups = 18
R-sq: within = 0.0000 Obs per group: min = 2
between = 0.8124 avg = 3.2
overall = 0.7045 max = 4
Random effects u_i ~ Gaussian Wald chi2(4) = 56.86
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

----------------------------------------------------------------------
output | Coef. Std. Err. z P>|z| [95% Conf. Interval]
------------+---------------------------------------------------------
_Imachine_2 | 3.667751 1.120799 3.27 0.001 1.471026 5.864476
_Imachine_3 | 2.080964 1.197118 1.74 0.082 -.2653444 4.427272
_Imachine_4 | 7.963425 1.102605 7.22 0.000 5.802359 10.12449
_Imachine_5 | 4.671042 1.179784 3.96 0.000 2.358708 6.983377
_cons | 8.763481 .7820011 11.21 0.000 7.230787 10.29617
------------+---------------------------------------------------------
sigma_u | 1.3174599
sigma_e | 1.4708855
rho | .44514222 (fraction of variance due to u_i)
----------------------------------------------------------------------

• Same σ̂ε as in the "constant mean" model
• Much smaller estimates for σ̂b and ρ̂, reflecting systematic (fixed) effects of machines; variation previously attributed to workers is now explained by differences between machines
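Again rho follows directly from the new variance components:

. display 1.3174599^2/(1.3174599^2 + 1.4708855^2)   // matches rho = .44514222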



Residuals at Two Levels

• Stata output includes "predictions" or "estimates" of the two levels of model error terms:
1. bmi, the error associated with worker (m, i)
2. εmij, the error of measurement on day j, worker (m, i)
• qnorm plots examine whether these errors appear to have a normal
distribution
◦ Departure from linearity ⇒ non-normality
• First look at level two residuals: worker random effects
. predict ALPHA, u
. qnorm ALPHA
[Figure: normal Q-Q plot (qnorm) of ALPHA (U) against Inverse Normal quantiles; values range from −2.12025 to 1.8383]

• Continue with the unit-level residuals


. predict EPSILON, e
. qnorm EPSILON



Residuals at Two Levels

[Figure: normal Q-Q plot (qnorm) of EPSILON against Inverse Normal quantiles; values range from −2.71672 to 3.12287]

• Can also check the assumed independence of residuals at the two levels:
. gr ALPHA EPSILON

[Figure: scatterplot of ALPHA (−2.12025 to 1.8383) against EPSILON (−2.0325 to 3.12287)]
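A numeric companion to the scatterplot (correlate is a standard Stata command; note this only checks linear association, not full independence):

. correlate ALPHA EPSILON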



Best Linear Unbiased Predictors (BLUPs)

• Calculate the mean output Ȳmi for each worker:
. sort id
. by id: egen Y_AVE = mean(output)
• According to the model,
Ȳmi = µm + bmi + ε̄mi·
where ε̄mi· is the average of the εmij over days j.

• The fixed-effects estimate of bmi is just the deviation of the worker average from the estimated mean for his machine:
b̂mi = Ȳmi − µ̂m
◦ Fails to account for the knowledge that the b's are drawn from a (normal) distribution with mean 0
◦ Will be too large if, by chance, ε̄mi· >> 0
◦ Will be too small if, by chance, ε̄mi· << 0
• A better estimate b̃mi shrinks b̂mi towards 0, with the degree of shrinkage depending on the intraclass correlation ρ and the cluster sizes nmi
◦ Small ρ ⇒ more shrinkage
◦ Large ρ ⇒ less shrinkage
• Why?



BLUPs are Empirical Bayes Estimators

• Random effects are random variables
• Data: {Ymij, m = 1, . . . , M; i = 1, . . . , Nm; j = 1, . . . , nmi} have a joint probability distribution with the random effects bmi
• The BLUP of a random effect is defined as
b̃mi = E(bmi | data; µ̂m, σ̂b², σ̂ε²)

• When the bmi and εmij are assumed to have independent normal distributions (as they are here), then
b̃mi = [ σ̂b² / (σ̂b² + σ̂ε²/nmi) ] b̂mi
    = [ ρ̂ / (ρ̂ + (1 − ρ̂)/nmi) ] (Ȳmi − µ̂m)
◦ "Empirical" Bayes comes from the fact that estimates are substituted for the variance components (instead of treating them as random also)
◦ As nmi → ∞, the BLUP → the fixed-effects estimate b̂mi
• The BLUP of output for worker (m, i) is
µ̃mi = µ̂m + b̃mi
• A better predictor than Ȳmi of future output of worker (m, i) in the aggregate, i.e., when the errors of prediction for all workers are considered together
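A hand check of the shrinkage formula for one worker, using only numbers already shown above: worker id 1 is on machine 1 (nmi = 2 days, Ȳ = 9.15), µ̂1 = _cons = 8.763481, and sigma_u, sigma_e come from the machine-adjusted xtreg fit. This is a sketch of the arithmetic, not official xtreg postestimation output:

. * variance components from the machine-adjusted fit
. scalar sb2 = 1.3174599^2              // sigma_b-hat squared
. scalar se2 = 1.4708855^2              // sigma_e-hat squared
. * shrinkage factor for a worker observed on n = 2 days
. scalar k = sb2/(sb2 + se2/2)          // approximately 0.62
. * shrunken (BLUP-style) estimate of worker 1's random effect
. display k*(9.15 - 8.763481)           // approximately 0.24, vs b-hat = 0.39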



BLUPs, Worker Averages and Marginal Means

• We have
µ̃mi = µ̂m + b̃mi,
where µ̂m is the marginal mean and b̃mi is the random effect.

. predict MEAN, xb
. predict BLUP, xbu
. gr Y_AVE MEAN BLUP id, s(ToS) c(.J.) xlabel(0(4)20)

[Figure: Y_AVE, the linear prediction (MEAN), and the BLUP plotted against worker id; output ranges from 8.2 to 18.9667]
