Giorgio Di Gessa
g.di-gessa@ucl.ac.uk
Learning objectives
Regression models: the basic idea
Graphical Representation
[Figure: scatter plot relating blood pressure to weight (kg).]
Simple linear regression equation
$y = a + bx$
a = intercept
b = slope
y = dependent variable
x = independent variable
$Y_i = \alpha_0 + \beta_1 X_i + \varepsilon_i$
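[Figure: fitted line $\hat{y} = a + bx$ with intercept $a$ and slope $b$. For a given $x_i$, the residual $e_i = y_i - \hat{y}_i$ is the vertical distance between the observed value $y_i$ and the predicted value $\hat{y}_i$.]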
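As a minimal sketch of how the intercept a and slope b of the fitted line can be obtained, the Python snippet below applies the usual least-squares formulas to a small made-up data set and then computes the residuals. The data values and the function name `fit_line` are illustrative assumptions, not part of the course material.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares estimates (a, b) for the line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_line(x, y)
predicted = [a + b * xi for xi in x]
# Residual = observed value minus predicted value
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]

print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

With an intercept in the model, the least-squares residuals sum to zero, which is a quick sanity check on the fit.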
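$R^2 = \frac{SS_{reg}}{SS_{tot}} = 1 - \frac{SS_{err}}{SS_{tot}}$
where
$SS_{err} = \sum_i (y_i - \hat{y}_i)^2$
$SS_{tot} = \sum_i (y_i - \bar{y})^2$
$SS_{reg} = \sum_i (\hat{y}_i - \bar{y})^2$
[Figure: for an observation at $x_i$, the distances between the observed value $y_i$, the fitted value $\hat{y}_i$ and the mean $\bar{y}$ that enter each sum of squares.]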
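The two expressions for R-squared can be checked numerically. The sketch below uses made-up data together with the least-squares line for those data (a = 2.2, b = 0.6); all values and variable names are illustrative assumptions.

```python
from statistics import mean

# Made-up data and the least-squares line fitted to it (a = 2.2, b = 0.6)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = 2.2, 0.6

y_bar = mean(y)
y_hat = [a + b * xi for xi in x]

# Sums of squares: error (residual), total, and regression (explained)
ss_err = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
ss_tot = sum((yi - y_bar) ** 2 for yi in y)
ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)

r2_from_reg = ss_reg / ss_tot
r2_from_err = 1 - ss_err / ss_tot

print(f"SS_err = {ss_err:.2f}, SS_tot = {ss_tot:.2f}, SS_reg = {ss_reg:.2f}")
print(f"R^2 = {r2_from_reg:.2f} (from SS_reg), {r2_from_err:.2f} (from SS_err)")
```

The two versions agree because, for a least-squares line with an intercept, the total sum of squares splits exactly into the regression and error parts.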
Hierarchical structure (time)
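[Figure: measurement occasions (level 1) nested within individuals (level 2), shown for Individual 1 and Individual 2.]
[Figure: measurements at Occasion 1 and Occasion 2 plotted by subject id (subjects 1 to 17).]
Source: Rabe-Hesketh and Skrondal (2010): pp.75-76.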
[Figure: scatter plot of a continuous outcome against exposure $x$.]
Linear regression: no apparent relationship
[Figure: scatter plot of a continuous outcome against exposure.]
Exercise: Draw a line!
[Figure: scatter plot of the outcome against exposure.]
[Figure: outcome against exposure with cluster-dependent intercepts; labels mark positive (+ve) and negative (-ve) residuals (variance $\sigma_e^2$) and shrinkage.]
Between- and within-group variation
There are two sources of variance within longitudinal/hierarchical data:
– Level 2 (j): between groups. Differences between persons in a longitudinal study (inter-individual)
– Level 1 (i): within the same group. Change in outcomes within the same person over time in a longitudinal study (intra-individual)
“Inter-individual differences are differences that are observed between people, whereas intra-individual
differences are differences that are observed within the same person when assessed at different times.”
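To make the two sources of variation concrete, the sketch below splits the total sum of squares of a tiny made-up repeated-measures data set into a between-person and a within-person part. This is a purely descriptive decomposition under assumed data, not the model-based estimation a mixed model performs; the names and values are hypothetical.

```python
from statistics import mean

# Made-up repeated measures: three persons, three occasions each
scores = {
    "person_1": [10.0, 12.0, 11.0],
    "person_2": [20.0, 19.0, 21.0],
    "person_3": [15.0, 14.0, 16.0],
}

all_values = [v for values in scores.values() for v in values]
grand_mean = mean(all_values)
person_means = {pid: mean(values) for pid, values in scores.items()}

# Level 2 (between persons): person means around the grand mean,
# weighted by the number of occasions per person
ss_between = sum(
    len(values) * (person_means[pid] - grand_mean) ** 2
    for pid, values in scores.items()
)

# Level 1 (within persons): each measurement around its own person mean
ss_within = sum(
    (v - person_means[pid]) ** 2
    for pid, values in scores.items()
    for v in values
)

ss_total = sum((v - grand_mean) ** 2 for v in all_values)

print(f"between = {ss_between:.1f}, within = {ss_within:.1f}, total = {ss_total:.1f}")
```

In this toy example most of the variation lies between persons, which is the situation in which observations from the same person are strongly correlated.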
$y_{ij} = \beta_0 + u_j + e_{ij}$
$\rho = \sigma_u^2 / (\sigma_u^2 + \sigma_e^2)$
Variance Partition Coefficient /2
summary(randint)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: QoLscore ~ 1 + (1 | idauniq)
## Data: ELSA_long
##
## AIC BIC logLik deviance df.resid
## 264345.3 264371.1 -132169.7 264339.3 40434
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -6.5733 -0.4958 0.0544 0.5340 4.3855
##
## Random effects:
## Groups Name Variance Std.Dev.
## idauniq (Intercept) 55.70 7.463
## Residual 23.36 4.833
## Number of obs: 40437, groups: idauniq, 10173
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 41.11007 0.07971 515.8
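The variance partition coefficient can be worked out by hand from the "Random effects" table in the output above, taking the between-person variance (idauniq intercept, 55.70) over the sum of the between-person and residual variances. A short Python sketch of that arithmetic (the variable names are illustrative):

```python
# Variance components copied from the "Random effects" table above
var_between = 55.70  # idauniq (Intercept): between-person variance
var_within = 23.36   # Residual: within-person variance

# Share of the total variance that lies between persons
vpc = var_between / (var_between + var_within)
print(f"VPC = {vpc:.2f}")
```

On these figures, roughly 70% of the variance in QoLscore lies between persons and about 30% within persons over time.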
Add covariates: Random-Intercept model
$y_{ij} = \underbrace{\beta_0 + \beta_1 x_{ij}}_{\text{Fixed}} + \underbrace{u_j + e_{ij}}_{\text{Random}}$
$\beta_0$ = intercept
$\beta_1$ = slope (mean line)
Fixed part: the slope ($\beta_1$) does not vary across groups (parallel lines)
Random part: the intercept varies across clusters (overall mean
[$\beta_0$] + cluster-specific deviation [$u_j$]). This recognises that clusters
are heterogeneous: some clusters will have outcome values
above ($u_j > 0$) or below ($u_j < 0$) the overall mean (intercept $\beta_0$)
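The "parallel lines" property can be illustrated with a few lines of Python. This is a sketch with hypothetical parameter values (none come from the ELSA analysis): each person gets their own intercept, while the slope is shared.

```python
def person_line(beta0, beta1, u_j, x):
    """Predicted outcome for person j at exposure x in a random-intercept model."""
    return (beta0 + u_j) + beta1 * x

# Hypothetical values, chosen for illustration only
beta0, beta1 = 40.0, -0.5
u = {"person_A": 3.0, "person_B": -2.0}  # person-specific deviations u_j

xs = [0, 1, 2, 3]
lines = {
    pid: [person_line(beta0, beta1, u_j, x) for x in xs]
    for pid, u_j in u.items()
}

# The vertical gap between the two lines is the same at every x,
# i.e. the lines are parallel
gaps = [ya - yb for ya, yb in zip(lines["person_A"], lines["person_B"])]
print(gaps)
```

The constant gap equals the difference between the two persons' deviations, here 3.0 - (-2.0) = 5.0.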
Random-Intercept and Random slope model
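$y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij}$
Where $\beta_{0j} = \beta_0 + u_{0j}$; $\beta_{1j} = \beta_1 + u_{1j}$
hence:
$y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j}) x_{ij} + e_{ij}$
$= (\beta_0 + \beta_1 x_{ij}) + (u_{0j} + u_{1j} x_{ij}) + e_{ij}$
with fixed part $(\beta_0 + \beta_1 x_{ij})$ and random part $(u_{0j} + u_{1j} x_{ij})$
Person-specific trajectories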
Random part:
(1) group-specific variation around the mean
intercept ($u_{0j}$); and
(2) group-specific variation around the mean slope
of the X-Y association ($u_{1j}$)
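As a rough illustration of person-specific trajectories, the Python sketch below gives each person both an intercept deviation and a slope deviation. All parameter values are hypothetical and chosen only to show that the lines are no longer parallel.

```python
def person_trajectory(beta0, beta1, u0_j, u1_j, x):
    """Predicted outcome for person j at x with a random intercept and a random slope."""
    return (beta0 + u0_j) + (beta1 + u1_j) * x

# Hypothetical values, for illustration only
beta0, beta1 = 40.0, -0.5
random_effects = {
    "person_A": (3.0, 0.25),    # (u0_j, u1_j)
    "person_B": (-2.0, -0.25),
}

xs = [0, 1, 2, 3]
trajectories = {
    pid: [person_trajectory(beta0, beta1, u0, u1, x) for x in xs]
    for pid, (u0, u1) in random_effects.items()
}

# Person-specific slopes: mean slope plus the person's own deviation
slopes = {pid: beta1 + u1 for pid, (_, u1) in random_effects.items()}

# The gap between the two trajectories now changes with x
gaps = [ya - yb for ya, yb in zip(trajectories["person_A"], trajectories["person_B"])]
print(slopes)
print(gaps)
```

Here person A declines more slowly than the average and person B faster, so the gap between them widens as x increases.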
Model specification
• “Shall I add variable x to my random effects?”
• “Do you want x to be in the fixed part of the model
or the random part?”…
– What is your research question?
• Questions about means (variables)
• Questions about variability (levels: multilevel structure)
– Account for statistical correlation in longitudinal data to
ensure correct standard errors (partition the error into
the level 1 and level 2 components).
http://www.bristol.ac.uk/cmm/learning/videos/random-intercepts.html
Model specification /2
About variables (means)
• What is the relationship between age and cognitive function (CF)?
• This is a question about means: what happens to the mean value of CF for a 1-unit change in age?
• This is answered using the fixed part of the model:
– Fixed effect of age
• Using mixed models, we can allow for the clustering in the data (e.g. obtain correct SEs) by specifying random intercepts. $\sigma_u^2$ is treated as a “nuisance parameter”*.
About variability
• How much variation in the slope of the age-CF association is at the person level (Level 2)?
• This is a question about variances
• This can be answered using the random part of the model: we allow a random slope for age, in addition to the random intercepts (and the fixed effect of age)
*http://www.bristol.ac.uk/cmm/learning/videos/random-intercepts.html
Random Intercept model
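$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}$
Interpretation of fixed effects as in ordinary linear regression:
On average, the X-Y association is represented by a straight line with
intercept = 43.1 and slope = -0.75
[Figure labels: $\beta_0$ (intercept), $\beta_1$ (slope), between-group variance $\sigma_u^2$, within-group variance $\sigma_e^2$.]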
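A short Python sketch of how the fixed part is read, using the intercept and slope quoted above. The x values are arbitrary, and the function name `mean_outcome` is an assumption made for illustration.

```python
# Fixed-effect estimates quoted above
beta0, beta1 = 43.1, -0.75

def mean_outcome(x):
    """Population-average predicted outcome at exposure x (fixed part only)."""
    return beta0 + beta1 * x

# Arbitrary x values, for illustration only
for x in (0, 1, 2):
    print(f"x = {x}: predicted mean = {mean_outcome(x):.2f}")

# A 1-unit increase in x changes the predicted mean by the slope
change_per_unit = mean_outcome(1) - mean_outcome(0)
```

The random part does not alter this reading: person-specific deviations shift individual lines up or down around the mean line.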
Which is the better model? Do we need an extra
random intercept or random slope?
• The context of the research and your research
question(s) should guide the selection of the model
• You can also use Likelihood Ratio Tests [LRT] to
inform model choice and assess whether the extra parameters
in a larger, more complex model are needed.
– Fit the model with the random intercept/random slope
– Fit the model without the random intercept/random slope
– Compare the two models using an LR test
– The null hypothesis is equivalent to the hypothesis that
(a) the variance of the random intercept ($\sigma_u^2$) = 0; (b) the
variance of the random slope = 0
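The comparison step can be sketched in Python for the simplest case, where the larger model adds a single variance parameter (for example a random intercept). The log-likelihood values below are hypothetical, and the function name is an assumption. Because a variance cannot be negative, the null value of 0 lies on the boundary of the parameter space, and a common adjustment is to halve the p-value from the chi-square distribution with 1 degree of freedom; treat that as a rule of thumb rather than the course's prescribed method.

```python
import math

def lr_test_one_variance(loglik_without, loglik_with):
    """LR statistic and p-values when the larger model adds one variance parameter."""
    lr = max(2.0 * (loglik_with - loglik_without), 0.0)
    # Upper tail of a chi-square with 1 df: P(X > lr) = erfc(sqrt(lr / 2))
    p_naive = math.erfc(math.sqrt(lr / 2.0))
    # Boundary adjustment (assumption: 50:50 mixture of chi-square 0 and 1 df)
    p_halved = p_naive / 2.0
    return lr, p_naive, p_halved

# Hypothetical log-likelihoods, for illustration only
lr, p_naive, p_halved = lr_test_one_variance(loglik_without=-1050.0, loglik_with=-1040.0)
print(f"LR = {lr:.1f}, naive p = {p_naive:.2g}, halved p = {p_halved:.2g}")
```

Adding a random slope usually adds more than one parameter (the slope variance and its covariance with the intercept), so the reference distribution differs from the one used in this sketch.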
Summary
Special techniques are needed for longitudinal data to account for
correlation between observations within the same person
Random effects/mixed models: a combination of fixed and random parts:
Fixed part: describes the population-average response and how it
changes over the values of the covariates. Interpretation as
in ordinary regression.
Random part: involves decomposing the error variance:
– Between-groups (level 2 residuals)
– Within-groups (level 1 residuals)
– Two different types of random effects (level 2 residuals):
random intercepts
random slopes
References
• Rabe-Hesketh & Skrondal (2012). Multilevel and Longitudinal Modeling Using Stata. Stata Press.
• Fitzmaurice & Ravichandran (2008). A Primer in Longitudinal Data Analysis. Circulation: 118(19).
• Gibbons et al. (2010). Advances in Analysis of Longitudinal Data. Annu Rev Clin Psychol: 6.
• Douglas Bates (2010). Mixed modelling with R (http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf)
• LEMMA: https://www.cmm.bris.ac.uk/lemma/ A comprehensive and free (!) online course in multilevel modelling, including computing exercises in R & Stata.