Bossuyt - E&E 2021 - 3. Conditioning - Handout

3. Conditioning & Standardization
Patrick M Bossuyt
Outline
1. Single confounder
2. Population versus Conditional Effects
3. Multiple confounders
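Hypothetical example: breastfeeding
[Diagram: SES, breastfeeding, IQ at 7.5; SES is the confounder]
[Figure: IQ at 7.5 by breastfeeding status and SES]
                 Low SES                   High SES
Breastfed        98 99 100 100 101 102     110 113 115 115 117 120
Not breastfed    93 94 94 96 96 97         103 107
Values marked in the figure: 112.5 (breastfed), 100 (not breastfed)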
Ideally, observe all individual effects
• Thought experiment…
• 1. Everybody exposed: calculate average

• 2. Nobody exposed: calculate average
• Effect of exposure: difference, i.e. gain in average

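[Figure: observed and unobserved (?) outcomes, IQ at 7.5]
                      Children who were breastfed                      Children who were not breastfed
IQ if breastfed       98 99 100 100 101 102 110 113 115 115 117 120    ? ? ? ? ? ? ? ?
IQ if not breastfed   ? ? ? ? ? ? ? ? ? ? ? ?                          93 94 94 96 96 97 103 107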
Single confounder
Possible strategies
Single confounder
Conditional effect - restriction
Restriction: Low SES only
[Figure: IQ at 7.5, restricted to the low SES stratum]
Breastfed, low SES        98 99 100 100 101 102     mean 100
Not breastfed, low SES    93 94 94 96 96 97         mean 95
Difference: 5
Single confounder
Matching
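Matched pair analysis
[Figure, built up in steps: each breastfed child is paired with a non-breastfed child of the same SES, IQ at 7.5]
Step 1, first pair (low SES):       98 versus 93
Step 2, all low SES pairs:          98 99 100 100 101 102 versus 93 94 94 96 96 97
Step 3, first high SES pairs added: 110 113 versus 103 107
Step 4, all breastfed children matched (103 and 107 are each used three times):
  Breastfed                 98 99 100 100 101 102 110 113 115 115 117 120     mean 107.5
  Matched, not breastfed    93 94 94 96 96 97 103 107 103 107 103 107         mean 100
  Difference: 7.5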
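The matched comparison above, in which the two non-breastfed high SES children (103 and 107) are each used three times, can be reproduced with a short sketch. The function name and the rule of reusing unexposed children in turn are assumptions made for illustration; the handout does not specify a matching algorithm.

```python
from itertools import cycle
from statistics import mean

# IQ at 7.5 from the handout's hypothetical example, grouped by SES stratum.
breastfed = {"low": [98, 99, 100, 100, 101, 102],
             "high": [110, 113, 115, 115, 117, 120]}
not_breastfed = {"low": [93, 94, 94, 96, 96, 97],
                 "high": [103, 107]}

def match_within_strata(exposed, unexposed):
    """Pair each exposed child with an unexposed child of the same SES.

    Unexposed children are reused in turn when a stratum holds fewer
    unexposed than exposed children (an assumed, simplistic rule).
    """
    pairs = []
    for stratum, iqs in exposed.items():
        controls = cycle(unexposed[stratum])
        pairs.extend((iq, next(controls)) for iq in iqs)
    return pairs

pairs = match_within_strata(breastfed, not_breastfed)
mean_exposed = mean(e for e, _ in pairs)
mean_matched = mean(u for _, u in pairs)
matched_difference = mean_exposed - mean_matched
print(mean_exposed, mean_matched, matched_difference)
```

The matched difference is 7.5, which equals (5 + 10) / 2 because the breastfed group is half low SES and half high SES.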
Single confounder
Conditional effect - stratification
Stratification
[Figure: IQ at 7.5 by SES stratum]
                 Low SES                               High SES
Breastfed        98 99 100 100 101 102   (mean 100)    110 113 115 115 117 120   (mean 115)
Not breastfed    93 94 94 96 96 97       (mean 95)     103 107                   (mean 105)
Difference       5                                     10
Conditional Effect versus Population Effect
Collapsibility
A conditional effect is the average effect, at the subgroup level, of moving a subgroup from untreated to treated.
A marginal effect is the average effect, at the population level, of moving an entire population from untreated to treated.
Stratification: is 7.5 the population effect?
[Figure: IQ at 7.5 by SES stratum, as on the stratification slide. Differences: 5 (low SES) and 10 (high SES); their simple average is 7.5]
Conditional Effect equals Population Effect?
• Not necessarily
• Population effect also not necessarily average of conditional effects
• Depends also on type of effect measure
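A small numerical sketch may help with the last point. The risks below are hypothetical and not taken from the handout: two equally large strata, each half exposed, so the stratum variable is not a confounder. The risk difference is the same within strata and in the population, whereas the odds ratio is not.

```python
# Hypothetical risks of a binary outcome (assumed values, not from the handout).
# Two equally large strata; half of each stratum is exposed.
risk = {
    "stratum 1": {"exposed": 0.5, "unexposed": 0.2},
    "stratum 2": {"exposed": 0.8, "unexposed": 0.5},
}

def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing two risks."""
    return (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))

# Conditional (stratum-specific) effects.
conditional_rd = {s: r["exposed"] - r["unexposed"] for s, r in risk.items()}
conditional_or = {s: odds_ratio(r["exposed"], r["unexposed"]) for s, r in risk.items()}

# Equal stratum sizes, so population risks are simple averages over strata.
pop_exposed = sum(r["exposed"] for r in risk.values()) / len(risk)
pop_unexposed = sum(r["unexposed"] for r in risk.values()) / len(risk)
population_rd = pop_exposed - pop_unexposed
population_or = odds_ratio(pop_exposed, pop_unexposed)

print({s: round(v, 2) for s, v in conditional_rd.items()}, round(population_rd, 2))
print({s: round(v, 2) for s, v in conditional_or.items()}, round(population_or, 2))
```

In this sketch the risk difference is 0.3 everywhere, while the odds ratio is 4 in each stratum and about 3.45 in the population.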

Single confounder
Estimate population effect through standardization
Population effect - standardization
[Figure: IQ at 7.5 by SES stratum, with the share of each stratum in the study group]
                        Low SES        High SES
Mean, breastfed         100            115
Mean, not breastfed     95             105
Difference              5              10
Share of study group    12/20 = .6     8/20 = .4
Weighted average: .6 × 5 + .4 × 10 = 7
• Estimate the conditional effect in each stratum
• Estimate the population effect
by computing the weighted average across strata
(weights express the relative prevalence of each stratum)
• Equivalent alternative:
estimate the outcome for each participant with exposure, then average
estimate the outcome for each participant without exposure, then average
the difference between these averages is the population effect
[Figure: IQ at 7.5; stratum differences 5 (low SES, 12 participants) and 10 (high SES, 8 participants)]
(12 × 5 + 8 × 10) / 20 = 7
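The weighted average of stratum-specific differences can be written out as a short sketch using the handout's data. The data layout and function name are assumptions for illustration.

```python
from statistics import mean

# IQ at 7.5 from the handout's hypothetical example.
iq = {
    "low": {"breastfed": [98, 99, 100, 100, 101, 102],
            "not_breastfed": [93, 94, 94, 96, 96, 97]},
    "high": {"breastfed": [110, 113, 115, 115, 117, 120],
             "not_breastfed": [103, 107]},
}

def standardized_effect(data):
    """Weight each stratum's difference by the stratum's share of the study group."""
    total = sum(len(g) for groups in data.values() for g in groups.values())
    effect = 0.0
    for groups in data.values():
        stratum_size = len(groups["breastfed"]) + len(groups["not_breastfed"])
        stratum_effect = mean(groups["breastfed"]) - mean(groups["not_breastfed"])
        effect += stratum_size / total * stratum_effect
    return effect

population_effect = standardized_effect(iq)
print(round(population_effect, 2))
```

The weights here are 12/20 and 8/20, the shares of low and high SES in the whole study group, not in either exposure group.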
Single confounder
Population effect - Inverse Probability Weighting (IPW)
Population effect - IPW
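[Figure: IQ at 7.5, with the probability of the exposure actually received, by SES]
                 Low SES                                           High SES
Breastfed        98 99 100 100 101 102   (Pr = .5, weight 1/.5)    110 113 115 115 117 120   (Pr = .75, weight 1/.75)
Not breastfed    93 94 94 96 96 97       (Pr = .5, weight 1/.5)    103 107                   (Pr = .25, weight 1/.25)
Weighted average, breastfed:
(1/.5 × 98 + 1/.5 × 99 + … + 1/.75 × 110 + … + 1/.75 × 120) / 20 = 106
Weighted average, not breastfed:
(1/.5 × 93 + 1/.5 × 94 + … + 1/.25 × 103 + 1/.25 × 107) / 20 = 99
106 - 99 = 7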
[Figure, built up in steps: the two weighted pseudo-populations, IQ at 7.5]
Breastfed, weights 1/.5 (low SES) and 1/.75 (high SES):
  98 99 100 100 101 102 110 113 115 115 117 120
  98 99 100 100 101 102 115 115
Not breastfed, weights 1/.5 (low SES) and 1/.25 (high SES):
  93 94 94 96 96 97 103 107 103 107 103 107
  93 94 94 96 96 97 103 107
• Assign a weight to each participant:
the inverse of the probability of being exposed (for those exposed)
the inverse of the probability of being unexposed (for those unexposed)
• Estimate the population effect as the difference between
the weighted average in the exposed and
the weighted average in the unexposed
• This is equivalent to creating two “pseudo-populations”
(one exposed, one unexposed)
that are balanced with respect to the confounder
and reflect the prevalence of the confounder
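The weighting described above can be sketched in a few lines. The probabilities of breastfeeding per SES stratum (.5 and .75) are taken from the handout's figure; the variable and function names are assumptions. As on the slides, each weighted sum is divided by the size of the whole study group (20).

```python
# Probability of being breastfed given SES, as shown in the handout's figure.
pr_breastfed = {"low": 0.5, "high": 0.75}

# (IQ at 7.5, SES, breastfed?) for the 20 children of the hypothetical example.
children = (
    [(iq, "low", True) for iq in (98, 99, 100, 100, 101, 102)]
    + [(iq, "high", True) for iq in (110, 113, 115, 115, 117, 120)]
    + [(iq, "low", False) for iq in (93, 94, 94, 96, 96, 97)]
    + [(iq, "high", False) for iq in (103, 107)]
)

def ipw_average(children, exposed):
    """Weighted average outcome in one exposure group, divided by the study size."""
    total = 0.0
    for iq, ses, breastfed in children:
        if breastfed != exposed:
            continue
        # Probability of the exposure this child actually received.
        p = pr_breastfed[ses] if exposed else 1 - pr_breastfed[ses]
        total += iq / p
    return total / len(children)

avg_breastfed = ipw_average(children, True)
avg_not_breastfed = ipw_average(children, False)
ipw_effect = avg_breastfed - avg_not_breastfed
print(round(avg_breastfed, 2), round(avg_not_breastfed, 2), round(ipw_effect, 2))
```

The result, 106 - 99 = 7, agrees with the standardized estimate.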
Break: Causal Claims
Example 1: Causal claim?
• “Table 1 summarizes the characteristics of 400 women aged 60–80
years who were admitted to our hospital after a first transient
ischemic attack.”
Example 2: Causal claim?
• “Table 4 shows our estimate of the effect of statin treatment on the
risk of a stroke in the next 12 months for women aged 60-80 with a
transient ischemic attack, without history of cardiovascular disease.
Statins lead to a statistically significant reduction in the stroke risk.”
Example 3: Causal claim?
• “Table 3 shows the estimated probability of having a stroke in the
next 12 months for women aged 60-80 with a transient ischemic
attack, without history of cardiovascular disease.”
Synonyms?
• Risk factor
• Risk indicator
• Risk marker
• Determinant
Multiple confounders
Conditional effect - restriction
Matching
Matching
• Can be challenging with many confounders
• Produces a strange conditional effect:
the effect in matched pairs…
Conditional effect - stratification
Stratification
• Becomes more difficult with increasing number of variables
Conditional effect – multivariable modeling
Multivariable modeling
E(Y) = g(B0.X + B1.Z1 + B2.Z2 + B3.Z3 + … + Bn.Zn)
Y is the outcome; X is the determinant; Z1…Zn are the confounders
Examples:
- Logistic regression
- Cox proportional-hazards regression
- Linear regression
- Poisson regression
Logistic regression
ln(p / (1 − p)) = α + β1 x1 + β2 x2 + … + βn xn
Conditional effect: Modeling
• Beware:
• Adding covariates changes the estimand (the effect that is estimated):
a different conditional odds ratio
• Beware:
• Model misspecification
• E.g. “adjusted for age”?
Modeling: prediction versus causal modeling
• Prediction:
• “Anything goes”
• Causal modeling:
• Only essential set of confounders
Propensity scores
Randomized Clinical Trial
[Diagram: Population, Study Group, Randomize, then Active → Outcome and Control → Outcome]
Randomisation: no association between intervention and confounders
Propensity scores: remove associations between intervention and confounders

Conditional effect: Propensity scores
• Model probability of getting intervention
as a function of confounders
• Probability (or linear combination) = propensity score
• Exchangeability for participants with same propensity score
• Can be used to “create” conditional randomization
• Only holds for measured variables in propensity score!
Propensity score calculation
• Model the probability of receiving the intervention (exposure) as a
function of the measured confounders
• Pr(E | Z1, Z2, …)
• Typically done through logistic regression

• How can the score be used?
• (Restriction)
• Matching
• Stratification (score as a covariate)
• Multivariable modeling (score as a covariate in model of outcome)
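As a minimal sketch of the calculation, the propensity score in the breastfeeding example can be obtained without fitting software: with a single binary confounder (SES), a logistic regression with SES as the only covariate reproduces the observed proportion exposed in each stratum. The function name is an assumption; with several or continuous confounders a fitted model would be needed instead.

```python
from collections import defaultdict

# (SES, breastfed?) for the 20 children of the hypothetical example.
records = (
    [("low", True)] * 6 + [("low", False)] * 6
    + [("high", True)] * 6 + [("high", False)] * 2
)

def estimate_propensity(records):
    """Pr(E | Z) for one categorical confounder: proportion exposed per level."""
    exposed = defaultdict(int)
    total = defaultdict(int)
    for z, e in records:
        total[z] += 1
        exposed[z] += int(e)
    return {z: exposed[z] / total[z] for z in total}

propensity = estimate_propensity(records)
print(propensity)
```

These are the probabilities (.5 for low SES, .75 for high SES) used in the IPW slides.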
Propensity score in model
• First: propensity score
• E(X) = g(B0 + B1.Z1 + B2.Z2 + B3.Z3 + … + Bn.Zn)
• S = B0 + B1.Z1 + B2.Z2 + B3.Z3 + … + Bn.Zn
• Second: model of outcome
• E(Y) = g(C0 + C1.X + C2.S)
• X: exposure; Y: outcome; S: propensity score; Z1…Zn: confounders
Balance diagnostics
For continuous covariates and for proportions:
percentage standardized difference
= difference in means of a covariate between exposure groups,
relative to the pooled standard deviation
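A sketch of this diagnostic follows. The handout does not say how the standard deviation is pooled; the code assumes the commonly used form, the square root of the average of the two group variances. Function names are hypothetical. The example checks the balance of SES between breastfed and non-breastfed children before and after weighting.

```python
from math import sqrt
from statistics import mean, variance

def std_diff_continuous(x_exposed, x_unexposed):
    """Percentage standardized difference for a continuous covariate.

    Assumes pooled SD = sqrt((var_exposed + var_unexposed) / 2).
    """
    pooled_sd = sqrt((variance(x_exposed) + variance(x_unexposed)) / 2)
    return 100 * (mean(x_exposed) - mean(x_unexposed)) / pooled_sd

def std_diff_proportion(p_exposed, p_unexposed):
    """Percentage standardized difference for a binary covariate (proportions)."""
    pooled_sd = sqrt((p_exposed * (1 - p_exposed) + p_unexposed * (1 - p_unexposed)) / 2)
    return 100 * (p_exposed - p_unexposed) / pooled_sd

# Share of high SES children: 6/12 among breastfed, 2/8 among not breastfed.
before_weighting = std_diff_proportion(6 / 12, 2 / 8)
# In both IPW pseudo-populations the share of high SES is 8/20.
after_weighting = std_diff_proportion(8 / 20, 8 / 20)
print(round(before_weighting, 1), round(after_weighting, 1))
```

Before weighting the difference is about 53%; after weighting it is 0, since both pseudo-populations have the same SES distribution.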
Propensity scores versus outcome regression
• Advantages:
• Modeling is blinded to outcome status, which minimizes bias
• With propensity as covariate: closer to population effect
• More robust against model specification
• But variance may be larger than fully specified model
• Both may suffer from unmeasured confounders
• A positivity check is always needed
(evaluate overlap between groups)
Propensity Scores
• Must include confounders!
• Otherwise: confounding still exists

• How can the score be used?
• (Restriction)
• Matching
• Stratification (score as a covariate)
• Multivariable modeling (score as a covariate in model of outcome)
• Inverse Probability Weighting
Inverse probability weighting
IP weighting
• Assign a weight to each individual outcome:
the inverse of the conditional probability
of receiving the intervention that she received.
• (equals propensity or 1-propensity)
• Creates two pseudo-populations, without confounding
• Now estimate population effect (difference in outcomes)
• Careful when testing hypothesis or when calculating CI
(bootstrapping needed)
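[Figure: the two pseudo-populations after IP weighting, IQ at 7.5; propensity 0.5 in low SES, 0.75 in high SES]
Breastfed, weights 1/0.5 (low SES) and 1/0.75 (high SES):
  98 99 100 100 101 102 110 113 115 115 117 120
  98 99 100 100 101 102 115 115
Not breastfed, weights 1/0.5 (low SES) and 1/0.25 (high SES):
  93 94 94 96 96 97 103 107 103 107 103 107
  93 94 94 96 96 97 103 107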
IPW
• creates two pseudo-populations
that represent the study groups we would have observed if:
(i) everyone had been exposed and
(ii) no-one had been exposed
• Each has the same distribution of potential confounders
• Hence: no association of confounders with treatment,
so no longer any confounding
Another example
Standardization
Marginal effect: Standardization
• We can use a “trick”:
• 1. Build a multivariable model of the outcome,
based on all observed data; include treatment in model
• 2. Predict for everyone in the group the outcome under treatment
• 3. Calculate mean
• 4. Predict for everyone in the group the outcome under no treatment
• 5. Calculate mean
• 6. Compute difference between 3 and 5: Population effect
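The six steps can be sketched with the breastfeeding data. The outcome model used here is an assumption made to keep the example self-contained: a saturated model, in which the predicted IQ is simply the observed mean for each combination of SES and exposure. In practice step 1 would be a fitted regression model.

```python
from statistics import mean

# (IQ at 7.5, SES, breastfed?) for the 20 children of the hypothetical example.
observed = (
    [(iq, "low", True) for iq in (98, 99, 100, 100, 101, 102)]
    + [(iq, "high", True) for iq in (110, 113, 115, 115, 117, 120)]
    + [(iq, "low", False) for iq in (93, 94, 94, 96, 96, 97)]
    + [(iq, "high", False) for iq in (103, 107)]
)

def fit_outcome_model(data):
    """Step 1 (assumed saturated model): mean IQ per (SES, exposure) cell."""
    cells = {}
    for iq, ses, breastfed in data:
        cells.setdefault((ses, breastfed), []).append(iq)
    return {cell: mean(values) for cell, values in cells.items()}

model = fit_outcome_model(observed)

# Steps 2-3: predict everyone's outcome under treatment, then average.
mean_treated = mean(model[(ses, True)] for _, ses, _ in observed)
# Steps 4-5: predict everyone's outcome under no treatment, then average.
mean_untreated = mean(model[(ses, False)] for _, ses, _ in observed)
# Step 6: the difference is the estimated population effect.
population_effect = mean_treated - mean_untreated
print(mean_treated, mean_untreated, population_effect)
```

The predictions are made for all 20 children, whatever exposure they actually received, which is why the result (106 - 99 = 7) reflects the SES distribution of the whole group.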
Stabilized weights
Stabilized weights
• Normal weights
create pseudo-populations that together are twice as large as the study group
• Expected mean: 2
• We can create different weights,
so that the pseudo-population does not increase in size
• E.g. multiply each weight by the probability of being treated,
not considering confounders, if treated
(and by the probability of being untreated if untreated)
• Expected mean: 1
• Better statistical properties
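A brief sketch with the breastfeeding example shows the two kinds of weights. The propensities per SES stratum (0.5 and 0.75) come from the handout; the overall probability of being breastfed (12/20) is computed from the data, and all names are assumptions.

```python
# Probability of being breastfed given SES, from the handout's example.
pr_given_ses = {"low": 0.5, "high": 0.75}

# (SES, breastfed?) for the 20 children.
children = (
    [("low", True)] * 6 + [("low", False)] * 6
    + [("high", True)] * 6 + [("high", False)] * 2
)

# Probability of being breastfed, not considering the confounder: 12/20.
pr_marginal = sum(breastfed for _, breastfed in children) / len(children)

def weights(ses, breastfed):
    """Return (normal weight, stabilized weight) for one child."""
    p_cond = pr_given_ses[ses] if breastfed else 1 - pr_given_ses[ses]
    p_marg = pr_marginal if breastfed else 1 - pr_marginal
    return 1 / p_cond, p_marg / p_cond

normal = [weights(ses, b)[0] for ses, b in children]
stabilized = [weights(ses, b)[1] for ses, b in children]

mean_normal = sum(normal) / len(children)
mean_stabilized = sum(stabilized) / len(children)
print(round(mean_normal, 2), round(mean_stabilized, 2))
```

The normal weights add up to 40 (mean 2), the stabilized weights to 20 (mean 1), so the stabilized pseudo-population keeps the size of the study group.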
