Lda Lecture 1 Rem SM Lda

Longitudinal Data Analysis
Random Effects Models for continuous outcomes
Giorgio Di Gessa
g.di-gessa@ucl.ac.uk
Learning objectives
• Linear regression for continuous outcomes
• Longitudinal data: ‘between-individual’ or ‘between-cluster’ variability
• Statistical models for longitudinal data
• Interpret results from longitudinal models
Regression models: the basic idea
When two variables are analysed, we might be interested in summarising their relationship and in explaining or predicting one of the variables on the basis of information on the other.
• Dependent (outcome, y) variable: the variable whose variation we wish to explain or predict
• Independent (explanatory, x) variable: the variable used to explain or predict changes in the outcome
Simple linear regression

The simplest linear regression procedure is bivariate (simple) linear regression. This model explores the linear relationship between two variables, where the outcome is continuous.
Regression is a statistical model that captures the randomness of real-life processes; it is not a deterministic model. Variables are stochastically related: you can only tell what the chances are that the outcome will take a particular value, given the value of the other variable(s) and their distributions.
Graphical Representation
[Figure: scatter plot; axis labels: blood pressure and weight (kg).]
Simple linear regression equation
Mathematically, a straight line is written as:
$y = a + bx$
where y = dependent variable, x = independent variable, a = intercept, b = slope.
In regression notation, for observation i:
$Y_i = \alpha_0 + \beta_1 X_i + \varepsilon_i$
Regression equation - Graphical Representation
[Figure: fitted line $\hat{y} = a + bx$ with intercept a and slope b. For each $x_i$ the plot marks the observed value $y_i$, the predicted value $\hat{y}_i$, and the residual $e_i$ between them.]
Assessing the model: R²
The coefficient of determination ($R^2$) is the proportion of the variation in y that is explained by x (i.e. by the regression model).
$R^2 = \frac{SS_{reg}}{SS_{tot}} = 1 - \frac{SS_{err}}{SS_{tot}}$
SSreg = "Regression sum of squares"
SSerr = "Error sum of squares"
SStot = "Total sum of squares"
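[Figure: decomposition of the variation in y around the fitted line, with the mean $\bar{y}$ drawn as a horizontal line.]
$SS_{err} = \sum_i (y_i - \hat{y}_i)^2$
$SS_{tot} = \sum_i (y_i - \bar{y})^2$
$SS_{reg} = \sum_i (\hat{y}_i - \bar{y})^2$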
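The two routes to R² (SSreg/SStot and 1 − SSerr/SStot) can be checked numerically. The sketch below is plain Python rather than the R used elsewhere in these slides, and the x and y values are made up for illustration only; the function names are not from the lecture.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sums_of_squares(x, y):
    """Return (SSreg, SSerr, SStot) for the fitted simple regression."""
    a, b = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [a + b * xi for xi in x]
    ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)
    ss_err = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return ss_reg, ss_err, ss_tot


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ss_reg, ss_err, ss_tot = sums_of_squares(x, y)
r2_from_reg = ss_reg / ss_tot        # SSreg / SStot
r2_from_err = 1 - ss_err / ss_tot    # 1 - SSerr / SStot
```

Both expressions agree because SStot = SSreg + SSerr for a least-squares line with an intercept.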
Hierarchical / Clustered / Longitudinal data
• Longitudinal data:
– Data grouped in time, e.g. the same measures of cognitive function gathered from the same persons every year
• Observations from hierarchical data structures are correlated. Standard regression techniques do not take this intra-subject correlation of response measurements into account, which leads to invalid inferences.
• Random effects / mixed models can estimate the associations between variables whilst taking into account the correlated nature of observations within the same group.
Hierarchical structure (time)
[Figure: two-level hierarchy. Individual 1 and Individual 2 each have Occasion 1 and Occasion 2: serial measurement occasions (level 1: i) clustered within individuals (level 2: j).]
Correlated nature of repeated measurements
[Figure: lung function (y-axis, ticks from 200 to 700) at Occasion 1 and Occasion 2, plotted for each subject id (x-axis, 1 to 17).]
Source: Rabe-Hesketh and Skrondal (2010): pp.75-76.
ith measurement in jth cluster
[Figure: continuous outcome (y-axis) against exposure x (x-axis).]
Linear regression: no apparent relationship
[Figure: scatter plot of continuous outcome against exposure.]
What if multiple observations come from the same person?
[Figure: scatter plot of continuous outcome against exposure.]
Exercise: Draw a line!
Mixed model regression: clear relationship
[Figure: continuous outcome against exposure, with the following annotations:]
• Fixed effects relationship = population average relationship
• Cluster-dependent intercepts, +ve or -ve (from a population with variance $\sigma_u^2$), with shrinkage
• Correlated residuals, +ve or -ve (variance $\sigma_e^2$)
Between- and within-group variation
There are two sources of variance within longitudinal / hierarchical data:
– Level 2 (j): between groups: differences between persons in a longitudinal study (inter-individual)
– Level 1 (i): within the same group: change in outcomes within the same person over time in a longitudinal study (intra-individual)
“Inter-individual differences are differences that are observed between people, whereas intra-individual differences are differences that are observed within the same person when assessed at different times.”
Null model for hierarchical data
Fixed and Random parts
$y_{ij} = \beta_0 + u_j + e_{ij}$
Outcome: $y_{ij}$ is the response at occasion i for person j
Fixed / systematic part: $\beta_0$ is the overall mean
Random parts: $u_j$, the group-level residual, ~ N(0, $\sigma_u^2$)
$e_{ij}$, the lowest-level residual, ~ N(0, $\sigma_e^2$)
$\sigma_u^2$ and $\sigma_e^2$ are the variance components (group level j and lowest level i respectively), which we estimate.
Number of level 2 units: >10, >20, to ensure good estimation of $\sigma_u^2$
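To make the two variance components concrete, here is a minimal Python sketch that splits a tiny balanced data set into between-person and within-person parts using the one-way ANOVA (method-of-moments) estimator. This is an assumption made for illustration: it is not the likelihood-based estimation that mixed-model software performs, it only works for balanced data, and the three "persons" and their values are made up.

```python
def variance_components(groups):
    """One-way ANOVA estimates for the null model y_ij = beta0 + u_j + e_ij.

    groups: dict mapping a person id to a list of repeated measurements.
    Every person must have the same number of measurements (balanced data).
    Returns (beta0_hat, sigma_u2_hat, sigma_e2_hat).
    """
    values = list(groups.values())
    n_groups = len(values)
    n_per_group = len(values[0])
    if any(len(v) != n_per_group for v in values):
        raise ValueError("this sketch assumes balanced data")

    grand_mean = sum(sum(v) for v in values) / (n_groups * n_per_group)
    group_means = [sum(v) / n_per_group for v in values]

    # Within-person variation: occasions around each person's own mean
    ss_within = sum((y - m) ** 2 for v, m in zip(values, group_means) for y in v)
    # Between-person variation: person means around the grand mean
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in group_means)

    ms_within = ss_within / (n_groups * (n_per_group - 1))
    ms_between = ss_between / (n_groups - 1)

    sigma_e2 = ms_within
    # Truncate at zero: a variance cannot be negative
    sigma_u2 = max(0.0, (ms_between - ms_within) / n_per_group)
    return grand_mean, sigma_u2, sigma_e2


# Made-up data: 3 persons, 2 occasions each
toy = {1: [10, 12], 2: [20, 22], 3: [30, 32]}
beta0_hat, sigma_u2_hat, sigma_e2_hat = variance_components(toy)
```

In this toy example nearly all the variation is between persons, which is the situation in which ignoring the clustering would be most misleading.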
Intra-cluster/class Correlation Coefficient (ICC)
Variance Partition Coefficient (VPC)
$\rho = \sigma_u^2 / (\sigma_u^2 + \sigma_e^2)$
$\sigma_u^2 + \sigma_e^2$ is the total variation.
$\sigma_u^2$ is the variation at level 2, $\sigma_e^2$ the variation at level 1.
Possible values for $\rho$: $0 \le \rho \le 1$
Two different ways of looking at the same thing:
• Proportion of variance: “How much of the total variance is at level 2 (the group / cluster level)?”
• Correlation coefficient: With longitudinal data, $\rho$ is the correlation among observations within the same cluster
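Why the same quantity is both a variance proportion and a correlation can be seen from a short derivation under the null model $y_{ij} = \beta_0 + u_j + e_{ij}$. It assumes, as is standard for this model, that $u_j$ and $e_{ij}$ are independent of each other and that the $e_{ij}$ are independent across occasions; $i$ and $i'$ denote two different occasions for the same person $j$.

```latex
\begin{aligned}
\operatorname{Var}(y_{ij}) &= \operatorname{Var}(u_j) + \operatorname{Var}(e_{ij})
  = \sigma_u^2 + \sigma_e^2 \\
\operatorname{Cov}(y_{ij},\, y_{i'j}) &= \operatorname{Cov}(u_j + e_{ij},\; u_j + e_{i'j})
  = \operatorname{Var}(u_j) = \sigma_u^2 \qquad (i \neq i') \\
\operatorname{Corr}(y_{ij},\, y_{i'j}) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} = \rho
\end{aligned}
```

Two measurements on the same person share $u_j$, and that shared term is the only source of their correlation.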
Variance Partition Coefficient /2
VPC ~ how important group level differences are (what proportion of the variance is at the group level?)
• VPC = 0 if there is no group effect ($\sigma_u^2 = 0$)
• VPC = 1 if there are no within-group differences ($\sigma_e^2 = 0$)
library(lme4)
# Fitting call reconstructed from the formula, data and estimation method reported in the summary below
randint <- lmer(QoLscore ~ 1 + (1 | idauniq), data = ELSA_long, REML = FALSE)
summary(randint)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: QoLscore ~ 1 + (1 | idauniq)
## Data: ELSA_long
##
##      AIC      BIC    logLik deviance df.resid
## 264345.3 264371.1 -132169.7 264339.3    40434
##
## Scaled residuals:
##     Min      1Q  Median      3Q     Max
## -6.5733 -0.4958  0.0544  0.5340  4.3855
##
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  idauniq  (Intercept) 55.70    7.463
##  Residual             23.36    4.833
## Number of obs: 40437, groups:  idauniq, 10173
##
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 41.11007    0.07971   515.8
# ICC = between-person variance / (between-person variance + residual variance)
ICC <- (55.70)/(55.70 + 23.36)
ICC
## [1] 0.7045282
Add covariates: Random-Intercept model
$y_{ij} = (\beta_0 + \beta_1 x_{ij}) + (u_j + e_{ij})$
(first bracket: fixed part; second bracket: random part)
Fixed part: $\beta_0$, $\beta_1$
Random part: $u_j$, $e_{ij}$
Covariates:
• Time-invariant (e.g. sex)
• Time-varying, including time to estimate rate of change (Session 2)
• Mean response ~ combination of characteristics shared by all persons (fixed effects) and subject-specific effects that are unique to a particular individual (random effects).
[Figure: one fitted line per individual around the mean line. Source: University of Bristol, Centre for Multilevel Modelling.]
β0 = intercept
β1 = slope (mean line)
For a given sample, there are N lines, one per individual. The variance $\sigma_u^2$ represents the spread of these lines (Gibbons et al., 2010).
Fixed part: the slope (β1) does not vary across groups (parallel lines).
Random part: the intercept varies across clusters (overall mean [β0] + cluster-specific deviation [uj]). This recognises that observations are heterogeneous: some will have outcome values above (uj > 0) or below (uj < 0) the overall mean (intercept β0).
Random-Intercept only model: Interpretation
• Fixed part (population-average relationship): mean intercept ($\beta_0$); average slope for the X-Y association ($\beta_1$)
– Mean population X-Y regression line: set the level 1 and level 2 random effects to zero to obtain the population-average trajectory.
• Random effects part: group-specific variation around the mean intercept (estimated by $\sigma_u^2$; we can assign values to uj). This resolves non-independence by allowing each level-2 unit to have a different intercept / initial level.
Random-Intercept and Random slope model
We can extend the model by allowing the gradient / slope as well as the intercept to vary with cluster.
Random-Intercept: heterogeneity ~ intercepts above / below average (β0)
Random-Slope: heterogeneity ~ slopes above / below average (β1)
Random-Intercept and Random slope model /2
$y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij}$
where $\beta_{0j} = \beta_0 + u_{0j}$ and $\beta_{1j} = \beta_1 + u_{1j}$
hence:
$y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j}) x_{ij} + e_{ij}$
$= (\beta_0 + \beta_1 x_{ij}) + (u_{0j} + u_{1j} x_{ij}) + e_{ij}$
(first bracket: fixed; second bracket: random)
Both intercept and slope are now group dependent.
Intercept and slope = average ($\beta_0$, $\beta_1$) + group-specific deviation from the average ($u_{0j}$, $u_{1j}$)
$y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j}) x_{ij} + e_{ij}$
$\beta_0$ is the mean response when X = 0
$\beta_1$ is the average slope of the X-Y association
$\beta_0$ and $\beta_1$ are the fixed effects: they describe the population average response of Y and how it changes over the values of the covariates.
• Each random effect $u_{0j}$ is the difference between the population-averaged intercept ($\beta_0$) and the intercept for group j
• Each random effect $u_{1j}$ is the difference between the population-averaged slope ($\beta_1$) and the slope for group j
Person-specific trajectories
[Figure: population averaged response, with person-specific trajectories for j = 1 and j = 2.]
• j = 1: Intercept: Y is higher (when X = 0) than the population average (β0) and therefore $u_{01}$ is positive. Slope: steeper ($\beta_1 + u_{11}$) than the population average (β1) and therefore $u_{11}$ is positive.
• j = 2: negative $u_{02}$ and $u_{12}$
• Level 1 residuals $e_{ij}$ allow responses Y on any occasion to vary randomly above / below the group-specific trajectories
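The two trajectories described above can be sketched in a few lines of Python. All numbers here are hypothetical (they are not estimates from the lecture data): a population line with β0 = 40 and β1 = 2, a person 1 with positive random effects and a person 2 with negative ones.

```python
def person_line(beta0, beta1, u0j, u1j):
    """Intercept and slope of the trajectory for one person j."""
    return beta0 + u0j, beta1 + u1j


def predict(beta0, beta1, u0j, u1j, x):
    """Person-specific expected response at x (level 1 residual set to zero)."""
    intercept, slope = person_line(beta0, beta1, u0j, u1j)
    return intercept + slope * x


# Hypothetical fixed effects (population average line)
BETA0, BETA1 = 40.0, 2.0

# Hypothetical random effects: (u0j, u1j) per person
random_effects = {
    1: (3.0, 0.5),    # above-average intercept, steeper slope
    2: (-2.0, -1.0),  # below-average intercept, flatter slope
}

x = 2.0
population_average = predict(BETA0, BETA1, 0.0, 0.0, x)
person_1 = predict(BETA0, BETA1, *random_effects[1], x)
person_2 = predict(BETA0, BETA1, *random_effects[2], x)
```

Setting both random effects to zero recovers the population averaged response; an observed value would add a level 1 residual on top of the person-specific line.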
Random-Intercept and Random-Slopes: Interpretation
Fixed part: mean intercept ($\beta_0$); average slope for the X-Y association ($\beta_1$)
Random part:
(1) group-specific variation around the mean intercept ($u_{0j}$); and
(2) group-specific variation around the mean slope of the X-Y association ($u_{1j}$)
Two different random effects / mixed models
Random-Intercept only:
• Random intercept unique to group ($\beta_0 + u_j$)
• Fixed slope: population average over time for the full sample; the effect of X is the same for all individuals ($\beta_1$)
Random-Intercept + Random-slope:
• Random intercept unique to group ($\beta_0 + u_{0j}$)
• Random slope: the effect of X differs between individuals; the slope is partitioned into (1) the population average ($\beta_1$), and (2) the individual-specific slope ($\beta_1 + u_{1j}$)
Model specification
• “Shall I add variable x to my random effects?”
• “Do you want x to be in the fixed part of the model
or the random part?”…
– What is your research question?
• Questions about means (variables)
• Questions about variability (levels: multilevel structure)
– Account for statistical correlation in longitudinal data to ensure correct standard errors (partition the error into the level 1 and level 2 components).
http://www.bristol.ac.uk/cmm/learning/videos/random-intercepts.html
Model specification /2
About variables (means):
• What is the relationship between age and cognitive function (CF)?
• This is a question about means: what happens to the mean value of CF for a 1-unit change in age?
• This is answered using the fixed part of the model:
– Fixed effect of age
• Using mixed models, we can allow for the clustering in the data (e.g. obtain correct SEs) by specifying random intercepts. $\sigma_u^2$ as a “nuisance parameter”*.
About variability:
• How much variation in the slope of the age - CF association is at the person level (Level 2)?
• This is a question about variances
• This can be answered using the random part of the model: we allow a random slope for age, in addition to the random intercepts (and the fixed effect of age)
*http://www.bristol.ac.uk/cmm/learning/videos/random-intercepts.html
Random Intercept model
$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}$
[Figure: model output with the estimates of $\beta_0$, $\beta_1$, $\sigma_u^2$ and $\sigma_e^2$ highlighted.]
Interpretation of fixed effects as in ordinary linear regression: on average, the X-Y association is represented by a straight line with intercept = 43.1 and slope = -0.75
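A small Python sketch of what these fixed effects imply. The intercept (43.1) and slope (-0.75) are the values quoted on the slide; the person-level deviation u_j = 5 is a hypothetical value chosen only to show that, in a random-intercept model, each person's line is parallel to the population line.

```python
BETA0 = 43.1    # intercept quoted on the slide
BETA1 = -0.75   # slope quoted on the slide


def population_mean(x):
    """Population-average outcome at x (all random effects set to zero)."""
    return BETA0 + BETA1 * x


def person_mean(x, u_j):
    """Expected outcome at x for a person whose intercept deviates by u_j."""
    return BETA0 + u_j + BETA1 * x


U_J = 5.0  # hypothetical person-level deviation, not an estimate from the data

# A 1-unit increase in x changes the mean outcome by BETA1, for everyone
change_per_unit = population_mean(1) - population_mean(0)

# The gap between this person and the population line is the same at any x
gap_at_0 = person_mean(0, U_J) - population_mean(0)
gap_at_10 = person_mean(10, U_J) - population_mean(10)
```

The constant gap is what "parallel lines" means: the random intercept shifts a person's whole line up or down without changing its slope.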
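Random Slope model
[Figure: model output with the following estimates highlighted:]
• group-level intercept variance ($\sigma_{u0}^2$)
• group-level slope variance ($\sigma_{u1}^2$)
• group-level intercept & slope covariance ($\sigma_{u01}$)
• level 1 (within-person) residual variance ($\sigma_e^2$)
Model where the association between X (CF) and Y (QoL) is allowed to vary across groups (level j)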
Covariance between intercept and slope
[Figure: two panels of group-specific lines, one showing a pattern of fanning out and one a pattern of fanning in.]
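The link between the covariance and the fanning pattern follows from the variance of the outcome under the random-slope model, Var(y | x) = σu0² + 2·x·σu01 + x²·σu1² + σe². The Python sketch below evaluates this at a few x values; the variance and covariance values are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def total_variance(x, var_u0, var_u1, cov_u01, var_e):
    """Var(y | x) implied by a random-intercept and random-slope model."""
    return var_u0 + 2.0 * x * cov_u01 + x ** 2 * var_u1 + var_e


# Hypothetical variance components (not estimates from the lecture data)
VAR_U0, VAR_U1, VAR_E = 4.0, 1.0, 1.0
xs = [0, 1, 2]

# Positive intercept-slope covariance: spread grows with x (fanning out)
fan_out = [total_variance(x, VAR_U0, VAR_U1, 1.0, VAR_E) for x in xs]

# Negative covariance: spread first shrinks as x increases (fanning in)
fan_in = [total_variance(x, VAR_U0, VAR_U1, -1.0, VAR_E) for x in xs]
```

With a negative covariance the lines converge only up to x = -σu01/σu1² (here x = 1) and then start to spread again, so the pattern seen depends on the range of x and on where x = 0 is placed.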
Which is the better model? Do we need an extra random intercept or random slope?
• The context of the research and your research question(s) should guide the selection of the model
• You can also use Likelihood Ratio Tests (LRT) to inform model choice and assess whether the extra parameters in a larger, more complex model are needed.
– Fit the model with the random intercept / random slope
– Fit the model without the random intercept / random slope
– Compare the 2 models using the LR test
– The null hypothesis is equivalent to the hypothesis that (a) the variance of the random intercept ($\sigma_u^2$) = 0; or (b) the variance of the random slope = 0
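The comparison step can be sketched as follows. The test statistic is 2 × (log-likelihood of the larger model − log-likelihood of the smaller model), referred to a chi-square distribution with degrees of freedom equal to the number of extra parameters. The log-likelihoods below are hypothetical, and the chi-square tail probability is coded only for 1 or 2 degrees of freedom, where it has a closed form; statistical software would normally do this for you.

```python
import math


def lr_statistic(loglik_small, loglik_large):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_large - loglik_small)


def chi2_upper_tail(stat, df):
    """P(chi-square with df degrees of freedom > stat), for df = 1 or 2 only."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only covers df = 1 or df = 2")


# Hypothetical log-likelihoods (made up for illustration)
LOGLIK_RANDOM_INTERCEPT = -1005.0
LOGLIK_RANDOM_SLOPE = -1000.0

# Adding a random slope adds 2 parameters: the slope variance and the
# intercept-slope covariance
stat = lr_statistic(LOGLIK_RANDOM_INTERCEPT, LOGLIK_RANDOM_SLOPE)
p_value = chi2_upper_tail(stat, df=2)
```

Because a variance cannot be negative, the null hypothesis lies on the boundary of the parameter space, so this naive p-value tends to be conservative; treat it as a guide alongside the research question rather than a strict rule.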
Summary
Special techniques are needed for longitudinal data to account for correlation between observations within the same person.
Random effects / mixed models combine fixed and random parts:
• Fixed part: describes the population average response and how it changes over the values of the covariates. Interpretation as in ordinary regression.
• Random part: involves decomposing the error variance:
– Between-groups (level 2 residuals)
– Within-groups (level 1 residuals)
– Two different types of random effects (level 2 residuals): random intercepts and random slopes
References
• Rabe-Hesketh & Skrondal (2012). Multilevel and Longitudinal Modeling Using Stata. Stata Press.
• Fitzmaurice & Ravichandran (2008). A primer in longitudinal data analysis. Circulation: 118(19).
• Gibbons et al. (2010). Advances in analysis of longitudinal data. Annu Rev Clin Psychol: 6.
• Douglas Bates (2010). Mixed modelling with R (http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf)
• LEMMA: https://www.cmm.bris.ac.uk/lemma/ A comprehensive and free (!) online course in multilevel modelling, including computing exercises in R & Stata.
