
Measures of Association

Armand B. Hisona Jr, RMT, MSPHue


Measures of association
–  Risk difference
–  Risk ratio
–  Odds ratio

Calculation & interpretation of the confidence interval for each measure of association
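2×2 table - Measures of association
Outcome - binary

Measure of effect    Formula
Risk difference      p1 - p0
Risk ratio           p1 / p0
Odds ratio           (d1/h1) / (d0/h0)

(d = number with the outcome, h = number without the outcome, n = d + h, p = d/n; subscript 1 = exposure group 1, subscript 0 = exposure group 0)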


Differences in measures of association

•  When there is no association between exposure and outcome:
–  risk difference = 0
–  risk ratio (RR) = 1
–  odds ratio (OR) = 1

•  Risk difference can be negative or positive

•  RR & OR are always positive

•  For rare outcomes, OR ≈ RR

•  OR is always further from 1 than the corresponding RR
–  If RR > 1 then OR > RR
–  If RR < 1 then OR < RR
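The properties listed above can be checked numerically. The sketch below is illustrative only: the function name `measures` and the risks passed to it are hypothetical values chosen for the demonstration, not data from these slides.

```python
def measures(p1, p0):
    """Return (risk difference, risk ratio, odds ratio) for risks p1 and p0."""
    rd = p1 - p0
    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return rd, rr, odds_ratio

# Hypothetical risks, chosen only to illustrate the bullets above.
rare = measures(0.002, 0.001)        # rare outcome: OR is close to RR
common = measures(0.6, 0.3)          # common outcome, RR > 1: OR > RR
protective = measures(0.3, 0.6)      # common outcome, RR < 1: OR < RR
no_effect = measures(0.25, 0.25)     # no association: RD = 0, RR = OR = 1

for label, (rd, rr, odds_ratio) in [
    ("rare", rare), ("common", common),
    ("protective", protective), ("no effect", no_effect),
]:
    print(f"{label:>10}: RD = {rd:+.3f}, RR = {rr:.3f}, OR = {odds_ratio:.3f}")
```

In each case with RR different from 1, the odds ratio lies on the same side of 1 as the risk ratio but further away, and the gap shrinks as the outcome becomes rarer.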
Interpretation of measures of association

•  RR & OR < 1: exposure is associated with a reduced risk / odds (may be protective)
–  RR = 0.8 (risk reduced by 20%)

•  RR & OR > 1: exposure is associated with an increased risk / odds
–  RR = 1.2 (risk increased by 20%)

•  RR & OR: the further the ratio is from 1, the stronger the association between exposure and outcome (e.g. RR = 2 versus RR = 3).
Comparing the outcome measure of two exposure groups
(groups 1 & 0)

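Outcome variable data type: categorical

Population parameter: population risk ratio
Estimate of population parameter from sample: RR = p1/p0
Standard error of loge(RR):
$$\mathrm{s.e.}(\log_e RR) = \sqrt{\frac{1}{d_1} - \frac{1}{n_1} + \frac{1}{d_0} - \frac{1}{n_0}}$$
95% confidence interval of loge(population risk ratio):
$$\log_e RR \pm 1.96 \times \mathrm{s.e.}(\log_e RR)$$

Population parameter: population odds ratio
Estimate of population parameter from sample: OR = (d1/h1) / (d0/h0)
Standard error of loge(OR):
$$\mathrm{s.e.}(\log_e OR) = \sqrt{\frac{1}{d_1} + \frac{1}{h_1} + \frac{1}{d_0} + \frac{1}{h_0}}$$
95% confidence interval of loge(population odds ratio):
$$\log_e OR \pm 1.96 \times \mathrm{s.e.}(\log_e OR)$$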
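As a worked illustration of these formulas, the sketch below applies them to the TBM trial counts given later in these slides (87/274 deaths on dexamethasone, 112/271 on placebo). The helper name `ratio_with_ci` is an assumption made for this example. The limits are computed on the log scale and then exponentiated to return to the ratio scale.

```python
import math

def ratio_with_ci(estimate, se_log, z=1.96):
    """Return (estimate, lower, upper); the CI is built on the log scale."""
    log_est = math.log(estimate)
    return (estimate,
            math.exp(log_est - z * se_log),
            math.exp(log_est + z * se_log))

# TBM trial counts: group 1 = dexamethasone, group 0 = placebo.
d1, h1 = 87, 187     # deaths, survivors in group 1
d0, h0 = 112, 159    # deaths, survivors in group 0
n1, n0 = d1 + h1, d0 + h0

rr = (d1 / n1) / (d0 / n0)
se_log_rr = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)

odds_ratio = (d1 / h1) / (d0 / h0)
se_log_or = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)

rr_ci = ratio_with_ci(rr, se_log_rr)
or_ci = ratio_with_ci(odds_ratio, se_log_or)

print("RR = %.2f (95%% CI %.2f to %.2f)" % rr_ci)
print("OR = %.2f (95%% CI %.2f to %.2f)" % or_ci)
```

The results should agree with the TBM summary table to within rounding in the second decimal place.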
Calculation of p-values for comparing two groups

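Outcome variable data type: categorical

Population parameter: π1 - π0
Population parameter under null hypothesis: π1 - π0 = 0
Test statistic:
$$z = \frac{p_1 - p_0}{\mathrm{s.e.}(p_1 - p_0)}$$

Population parameter: population risk ratio
Population parameter under null hypothesis: population risk ratio = 1
Test statistic:
$$z = \frac{\log_e(RR)}{\mathrm{s.e.}(\log_e(RR))}$$

Population parameter: population odds ratio
Population parameter under null hypothesis: population odds ratio = 1
Test statistic:
$$z = \frac{\log_e(OR)}{\mathrm{s.e.}(\log_e(OR))}$$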
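A minimal sketch of these three z-tests, using the TBM trial counts that appear later in these slides. Two points are assumptions of this example rather than statements from the slides: the standard error of the risk difference is taken as the unpooled form sqrt(p1(1-p1)/n1 + p0(1-p0)/n0), and the two-sided p-value is read from the standard normal distribution. P-values computed this way need not match a published table exactly if that table used a different test.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# TBM trial counts: group 1 = dexamethasone, group 0 = placebo.
d1, h1 = 87, 187
d0, h0 = 112, 159
n1, n0 = d1 + h1, d0 + h0
p1, p0 = d1 / n1, d0 / n0

# Risk difference (assumed unpooled standard error).
se_rd = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
z_rd = (p1 - p0) / se_rd

# Risk ratio, tested on the log scale.
se_log_rr = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
z_rr = math.log(p1 / p0) / se_log_rr

# Odds ratio, tested on the log scale.
se_log_or = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
z_or = math.log((d1 / h1) / (d0 / h0)) / se_log_or

p_rd, p_rr, p_or = two_sided_p(z_rd), two_sided_p(z_rr), two_sided_p(z_or)
print(f"risk difference: z = {z_rd:.2f}, p = {p_rd:.3f}")
print(f"risk ratio:      z = {z_rr:.2f}, p = {p_rr:.3f}")
print(f"odds ratio:      z = {z_or:.2f}, p = {p_or:.3f}")
```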
Comparing the outcome measure of two exposure groups
(TBM trial: dexamethasone versus placebo)

Outcome variable data type: categorical

Population parameter          Under null    Estimate of population         95% confidence interval     Two-sided
                              hypothesis    parameter from sample          for population parameter    p-value
Population risk difference    = 0           p1-p0 = -0.095                 -0.175, -0.015              0.020
Population risk ratio         = 1           p1/p0 = 0.77                   0.62, 0.96                  0.016
Population odds ratio         = 1           (d1/h1) / (d0/h0) = 0.66       0.46, 0.93                  0.021
2×2 table – TBM trial example
                           Death during 9 months post start of treatment
Treatment group            Yes          No           Total
Dexamethasone (group 1)    87 (d1)      187 (h1)     274 (n1)
Placebo (group 0)          112 (d0)     159 (h0)     271 (n0)
Total                      199          346          545

Odds ratio for death = (d1/h1) / (d0/h0) = 0.465 / 0.704 = 0.66

Odds ratio for exposure to dexamethasone = (d1/d0) / (h1/h0) = 0.777 / 1.176 = 0.66

Odds ratio for not dying = (h1/d1) / (h0/d0) = 2.149 / 1.420 = 1.51 = (1/0.66)

Odds ratio for exposure to placebo = (d0/d1) / (h0/h1) = 1.287 / 0.850 = 1.51 = (1/0.66)
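The four odds ratios above can be reproduced in a few lines. This is only a sketch with variable names chosen for the example; it shows that the odds ratio for the outcome equals the odds ratio for the exposure, and that switching the outcome or the reference group gives the reciprocal.

```python
import math

# TBM trial counts: group 1 = dexamethasone, group 0 = placebo.
d1, h1 = 87, 187    # died, survived in group 1
d0, h0 = 112, 159   # died, survived in group 0

or_death = (d1 / h1) / (d0 / h0)      # odds of death, dexamethasone vs placebo
or_dex = (d1 / d0) / (h1 / h0)        # odds of dexamethasone exposure, died vs survived
or_survival = (h1 / d1) / (h0 / d0)   # odds of not dying, dexamethasone vs placebo
or_placebo = (d0 / d1) / (h0 / h1)    # odds of placebo exposure, died vs survived

# Both pairs are algebraically identical: each reduces to d1*h0 / (d0*h1) or its inverse.
same_first_pair = math.isclose(or_death, or_dex)
same_second_pair = math.isclose(or_survival, or_placebo)
reciprocal = math.isclose(or_survival, 1 / or_death)

print(round(or_death, 2), round(or_dex, 2))
print(round(or_survival, 2), round(or_placebo, 2))
```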
Measure of association

Study design                   Risk difference    Risk ratio    Odds ratio
Randomised controlled trial    √                  √             √
Cohort study                   √                  √             √
Case-control study             ×                  ×             √
Controlling for confounding:
stratification and regression

•  A description of confounding

•  How to control for confounding in statistical analysis by
–  Stratification
–  Regression modelling

•  A brief description of the role of multiple linear or logistic regression in adjusting for confounding
Outcome and exposure variables
(RECAP)
•  Outcomes are variables of interest (population health relevance) whose patterns and determinants we wish to learn about from data

•  Exposures are the variables we think might explain observed variation in the outcomes

•  Statistical analysis can be used to quantify the association between outcomes and exposures
What is confounding?
A confounding variable
1) is associated with the outcome variable;
2) is associated with the exposure variable;
3) does not lie on the causal pathway between exposure and outcome.

[Diagram: exposure variable, outcome variable, confounding variable]

Failing to control for confounding may result in a biased estimate of the magnitude of the association between exposure and outcome
Example of confounding

Exposure variable: alcohol intake
Outcome variable: heart disease
Confounding variable: cigarette smoking
Control of confounding

Design of Study

•  Randomisation
(randomised controlled trial: e.g. TBM trial)

•  Restriction
(only include those with one value of the confounder)

•  Matching
Control of confounding

Statistical analysis

•  Stratification

•  Regression modelling
Hypothetical example of a case-control study
Association between energy intake and heart disease
                     Heart disease
Energy intake        Yes          No           Total
High (group 1)       730 (d1)     600 (h1)     1330 (n1)
Low (group 0)        700 (d0)     540 (h0)     1240 (n0)
Total                1430         1140         2570

Odds of heart disease in high energy intake group = 730/600 = 1.22

Odds of heart disease in low energy intake group = 700/540 = 1.30

Odds ratio = 1.22 / 1.30 = 0.94

95% confidence interval: 0.80 to 1.10
Is this association confounded by physical activity?

Exposure variable: energy intake
Outcome variable: heart disease
Confounding variable: physical activity
Stratify by physical activity

                     High physical activity     Low physical activity
                     Heart disease              Heart disease
Energy intake        Yes        No              Yes        No
High (group 1)       500        510             230        90
Low (group 0)        100        150             600        390

Calculate the stratum-specific odds ratios:

For high physical activity group: OR (95% CI) = 1.47 (1.11, 1.95)
For low physical activity group: OR (95% CI) = 1.66 (1.26, 2.19)
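The stratum-specific odds ratios can be recomputed from the counts above and set beside the crude odds ratio from the unstratified table. The sketch assumes the same log-scale confidence interval used for odds ratios earlier in these slides; the function name `odds_ratio_ci` is chosen for this example.

```python
import math

def odds_ratio_ci(d1, h1, d0, h0, z=1.96):
    """Odds ratio with a 95% CI built on the log scale."""
    estimate = (d1 / h1) / (d0 / h0)
    se_log = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
    return estimate, estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log)

# Arguments: cases and non-cases with high energy intake, then with low energy intake.
high_pa = odds_ratio_ci(500, 510, 100, 150)   # high physical activity stratum
low_pa = odds_ratio_ci(230, 90, 600, 390)     # low physical activity stratum
crude = odds_ratio_ci(730, 600, 700, 540)     # strata combined

for label, result in [("high activity", high_pa), ("low activity", low_pa), ("crude", crude)]:
    print("%-14s OR = %.2f (95%% CI %.2f, %.2f)" % ((label,) + result))
```

Both stratum-specific odds ratios are above 1 while the crude odds ratio is below 1, which is the pattern that prompts the check for confounding.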
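Is this association confounded by physical activity?

Exposure variable: energy intake
Outcome variable: heart disease
Confounding variable: physical activity

Energy intake and heart disease: ???
Physical activity and energy intake: ???
Physical activity and heart disease: ???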
Confounding – condition 1
Association between physical activity and heart disease

                      High energy intake       Low energy intake
                      Heart disease            Heart disease
Physical activity     Yes        No            Yes        No
High (group 1)        500        510           100        150
Low (group 0)         230        90            600        390

For high energy intake group: OR (95% CI) = 0.38 (0.29, 0.50)
For low energy intake group: OR (95% CI) = 0.43 (0.33, 0.58)

** Look particularly in those who are not exposed to the factor of interest **
Confounding – condition 2

Association between energy intake and physical activity

•  In a case-control study: examine the association in the controls
•  In a cohort study: use the whole cohort
Confounding – condition 2
Association between energy intake and physical activity for those
without heart disease (n=1140)

                     Physical activity
Energy intake        High       Low        Total
High (group 1)       510        90         600
Low (group 0)        150        390        540

Proportion in high energy intake group who report high physical activity = 510/600 = 0.85 (85%)

Proportion in low energy intake group who report high physical activity = 150/540 = 0.28 (28%)

Odds ratio = (510/90) / (150/390) = 14.7; 95% CI: 11.0 to 19.7
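A short sketch of the condition 2 check, restricted to those without heart disease as in the table above. The confidence interval is built on the log scale in the same way as for the other odds ratios in these slides; the variable names are chosen for this example.

```python
import math

# Controls only (no heart disease): counts by energy intake and physical activity.
high_energy = {"high_pa": 510, "low_pa": 90}
low_energy = {"high_pa": 150, "low_pa": 390}

# Proportion reporting high physical activity in each energy intake group.
prop_high = high_energy["high_pa"] / sum(high_energy.values())
prop_low = low_energy["high_pa"] / sum(low_energy.values())

# Odds ratio for high physical activity, high vs low energy intake.
or_cond2 = ((high_energy["high_pa"] / high_energy["low_pa"])
            / (low_energy["high_pa"] / low_energy["low_pa"]))
se_log = math.sqrt(sum(1 / count for group in (high_energy, low_energy)
                       for count in group.values()))
ci_low = or_cond2 * math.exp(-1.96 * se_log)
ci_high = or_cond2 * math.exp(1.96 * se_log)

print(f"high energy intake: {prop_high:.0%} report high physical activity")
print(f"low energy intake:  {prop_low:.0%} report high physical activity")
print(f"OR = {or_cond2:.1f} (95% CI {ci_low:.1f} to {ci_high:.1f})")
```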


Is this association confounded by physical activity?

Exposure variable: energy intake
Outcome variable: heart disease
Confounding variable: physical activity

Energy intake and heart disease: ???
Physical activity and energy intake: high energy intake is associated with high physical activity
Physical activity and heart disease:
–  High energy intake: OR = 0.38 (95% CI: 0.29, 0.50)
–  Low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)

So physical activity is a potential confounder
Control for confounding - Stratified analyses
1) Start with stratum-specific estimates of odds ratios, risk ratios, risk differences, rate ratios

2) Calculate a weighted average of the stratum-specific estimates: the ‘pooled’ estimate

Usual method is the Mantel-Haenszel method
–  Weights assigned according to the amount of information in each stratum
Calculate a pooled OR

                     High physical activity (n=1260)     Low physical activity (n=1310)
                     Heart disease                       Heart disease
Energy intake        Yes           No                    Yes           No
High (group 1)       500 (d1)      510 (h1)              230 (d1)      90 (h1)
Low (group 0)        100 (d0)      150 (h0)              600 (d0)      390 (h0)

For high physical activity: OR = 1.47; w = (d0×h1)/n = (100×510)/1260 = 40.5
For low physical activity: OR = 1.66; w = (d0×h1)/n = (600×90)/1310 = 41.2

Mantel-Haenszel estimate of pooled odds ratio (summing over strata i):

$$OR_{MH} = \frac{\sum_i (w_i \times OR_i)}{\sum_i w_i}$$

$$OR_{MH} = \frac{(40.5 \times 1.47) + (41.2 \times 1.66)}{40.5 + 41.2} = 1.57$$

95% CI: 1.29 up to 1.91

Recall that the crude OR was 0.94 (95% CI 0.80-1.10)

Is there a difference between crude and adjusted measures of effect?
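The pooled estimate can be reproduced directly from the stratum counts. The sketch below follows the weighted-average form shown above, with w = (d0×h1)/n in each stratum; the function name `mantel_haenszel_or` is chosen for this example. It does not compute the confidence interval, which needs a separate variance formula not given in these slides.

```python
def mantel_haenszel_or(tables):
    """Pooled OR as a weighted average of stratum ORs, with weights w = d0*h1/n."""
    weighted_sum = 0.0
    weight_total = 0.0
    for d1, h1, d0, h0 in tables:
        n = d1 + h1 + d0 + h0
        weight = d0 * h1 / n
        stratum_or = (d1 * h0) / (d0 * h1)
        weighted_sum += weight * stratum_or
        weight_total += weight
    return weighted_sum / weight_total

# (d1, h1, d0, h0) per stratum: high energy cases/non-cases, low energy cases/non-cases.
strata = [
    (500, 510, 100, 150),   # high physical activity
    (230, 90, 600, 390),    # low physical activity
]

or_mh = mantel_haenszel_or(strata)
crude_or = (730 / 600) / (700 / 540)
print(f"crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {or_mh:.2f}")
```

Because w × OR simplifies to (d1×h0)/n, the same estimate can be written as the sum of (d1×h0)/n divided by the sum of (d0×h1)/n.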
Association between energy intake & heart disease adjusting for physical activity

Exposure variable: energy intake
Outcome variable: heart disease
Confounding variable: physical activity

Energy intake and heart disease, adjusted for physical activity: ORMH = 1.57 (95% CI: 1.29, 1.91)
Physical activity and energy intake: high energy intake is associated with high physical activity
Physical activity and heart disease:
–  High energy intake: OR = 0.38 (95% CI: 0.29, 0.50)
–  Low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)
Multiple logistic regression
Outcome variable (y-variable) – binary
e.g. dead or alive; treatment failure or success; disease or no disease

Measure of association – odds ratio

Multiple logistic regression model –

loge(odds of outcome) = β0 + β1X1 + β2X2 + β3X3 + … + βkXk

β1, …, βk – loge(odds ratios)

X1, …, Xk – k different exposure variables (do not need to be binary; can be categorical with more than 2 categories, or numerical)

Useful when there are many confounding variables…
Logistic regression
Example – Association between energy intake and heart disease
Outcome variable (y-variable) – heart disease (coded as yes = 1 & no = 0)

Logistic regression model –

loge(odds of outcome) = β0 + β1X1

β1 – loge(odds ratio)
X1 – energy intake (high versus low)

Exposure                        Odds ratio (exp β1)    95% confidence interval
Energy intake (high vs low)     0.94                   0.80, 1.10
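With a single binary exposure, the model has as many parameters as there are exposure groups, so the maximum-likelihood estimates can be written down without iterative fitting: β0 is the log odds in the low intake group and β1 is the log of the crude odds ratio. The sketch below illustrates this with the crude 2×2 table from earlier in these slides; it is not a general fitting routine.

```python
import math

# Crude table: heart disease (yes, no) by energy intake.
d1, h1 = 730, 600   # high energy intake (X1 = 1)
d0, h0 = 700, 540   # low energy intake (X1 = 0)

beta0 = math.log(d0 / h0)                  # log odds when X1 = 0
beta1 = math.log((d1 / h1) / (d0 / h0))    # log odds ratio, high vs low

# The model reproduces the observed odds in each group.
odds_low = math.exp(beta0)
odds_high = math.exp(beta0 + beta1)

print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.3f}")
print(f"odds ratio = exp(beta1) = {math.exp(beta1):.2f}")
```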
Multiple logistic regression
Example – Association between energy intake and heart disease
Outcome variable (y-variable) – heart disease (coded as yes = 1 & no = 0)

Multiple logistic regression model –

loge(odds of outcome) = β0 + β1X1 + β2X2

β1, β2 – loge(odds ratios)

X1 – energy intake (high versus low)
X2 – physical activity (high versus low)

Exposure                           Odds ratio (exp βi)    95% confidence interval
Energy intake (high vs low)        1.57                   1.29, 1.91
Physical activity (high vs low)    0.41                   0.33, 0.49
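To show where adjusted odds ratios of this kind come from, the sketch below fits the two-variable model to the grouped counts from the stratified tables using Newton-Raphson on the binomial log-likelihood. It is a dependency-free illustration under stated assumptions, not the software used for these slides: the helper names `solve3` and `fit_logistic` are invented for this example, and a statistics package would normally do this fitting and also report confidence intervals.

```python
import math

# Grouped data: (energy intake high?, physical activity high?, cases, non-cases)
GROUPS = [
    (1, 1, 500, 510),
    (0, 1, 100, 150),
    (1, 0, 230, 90),
    (0, 0, 600, 390),
]

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_logistic(groups, iterations=25):
    """Newton-Raphson for log(odds) = b0 + b1*X1 + b2*X2 on grouped binary data."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        score = [0.0, 0.0, 0.0]
        info = [[0.0] * 3 for _ in range(3)]
        for x1, x2, cases, non_cases in groups:
            x = (1.0, float(x1), float(x2))
            total = cases + non_cases
            eta = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(3):
                score[i] += x[i] * (cases - total * p)
                for j in range(3):
                    info[i][j] += x[i] * x[j] * total * p * (1.0 - p)
        step = solve3(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

b0, b1, b2 = fit_logistic(GROUPS)
or_energy = math.exp(b1)      # energy intake, adjusted for physical activity
or_activity = math.exp(b2)    # physical activity, adjusted for energy intake
print(f"energy intake OR = {or_energy:.2f}, physical activity OR = {or_activity:.2f}")
```

The adjusted odds ratio for energy intake falls between the two stratum-specific odds ratios (1.47 and 1.66), close to the Mantel-Haenszel estimate.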
Multiple linear regression
Outcome variable (y-variable) – numerical
e.g. blood pressure, forced expiratory volume in 1 sec (FEV1)

Linear regression model –

y = β0 + β1X1 + β2X2 + β3X3 + … + βkXk

y – numerical outcome variable

β1, …, βk – increase in y for every unit increase in the corresponding X

X1, …, Xk – k different exposure variables (can be numerical or categorical with 2+ categories)

Useful when there are many confounding variables…



End

Thank You For Listening!
