In research we have an outcome variable of interest (disease, disease marker, death… etc)
Often investigate the association between some exposure and outcome
Exposure variable may be a demographic or lifestyle factor (eg gender, smoking status, whether live near power lines… etc) or treatment/experimental group
Assess association by comparing the amount of disease between exposure groups

A study of UK adults to investigate whether C-reactive protein levels differ between ethnic groups
Outcome: C-reactive protein level (continuous)
Exposure: Ethnic group (categorical)

A randomised controlled trial to see whether people on a new cancer treatment are more likely to recover from disease than those on standard treatment
Outcome: Recovery (binary)
Exposure: Treatment (binary)
STEP 1: IS THE OUTCOME VARIABLE ROUGHLY NORMALLY DISTRIBUTED?
Draw histogram
[Histogram: FEV1 in litres in 2000; Mean = 2.4359, Std. Dev. = 0.7773, N = 177]

STEP 2: COMPUTE SUMMARY MEASURES IN EACH OF THE TWO GROUPS
If Normal give the mean and sd for each group
LOOK AT THE 95% CIS AROUND MEANS
Men: 2.64 (95% CI 2.50 to 2.79)
Women: 2.00 (95% CI 1.85 to 2.14)

STEP 3: COMPUTE MEASURE OF EFFECT
If Normal:
Difference in means = mean in exposed – mean in unexposed
So a value of 0 means no association between exposure and outcome

Difference in means (men – women) or mean difference
On average, FEV1 is 0.64 litres greater in men than in women

Mean difference (men – women)
Estimate of the true difference is 0.64 L, but 95% sure it is between 0.45 and 0.85 L higher in men than women
Suppose 95% CI = -0.23 to 1.51
How is this interpreted?
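One way to approach the interpretation question is to check where the interval sits relative to 0, the value that means no association. The sketch below applies that check to the two intervals quoted above; the function name and the wording of the returned messages are illustrative assumptions, not part of the original notes.

```python
def interpret_difference_ci(lower, upper):
    """Describe a 95% CI for a difference in means (exposed - unexposed)."""
    if lower > 0:
        return "whole interval above 0: outcome higher in exposed"
    if upper < 0:
        return "whole interval below 0: outcome lower in exposed"
    # 0 lies inside the interval, so "no difference" cannot be ruled out
    return "interval includes 0: data compatible with no difference"


print(interpret_difference_ci(0.45, 0.85))   # FEV1, men - women
print(interpret_difference_ci(-0.23, 1.51))  # the "suppose" example
```

An interval that includes 0 does not show that there is no difference; it only says the data are compatible with none.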
DIFFERENCE IN MEDIANS
If outcome not Normal, compute:
Difference in medians = median in exposed – median in unexposed

USING SPSS
Same example: Does lung function (FEV1) differ between men and women?
Outcome: FEV1
Exposure: Gender (male or female)
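As a minimal sketch of the difference-in-medians calculation outside SPSS, the Python below uses the standard library only. The FEV1 values are invented purely for illustration and are not the study data; the function name is also an assumption.

```python
from statistics import median


def difference_in_medians(exposed, unexposed):
    """Median in exposed minus median in unexposed."""
    return median(exposed) - median(unexposed)


# Hypothetical FEV1 values in litres (NOT the study data).
fev1_men = [1.9, 2.4, 2.7, 3.1, 3.6]
fev1_women = [1.4, 1.8, 2.0, 2.3, 2.9]

diff = difference_in_medians(fev1_men, fev1_women)
print(f"Difference in medians (men - women) = {diff:.2f} L")
```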
[Histogram: Frequency against FEV1 in litres in 2000; Mean = 2.4359, Std. Dev. = 0.7773, N = 177]
Descriptives
STEP 3: COMPUTE MEASURE OF EFFECT
FEV1 in litres in 2000
Sex       N      Mean      Std. Deviation   Std. Error Mean
male      120    2.6446    .78246           .07143
female    57     1.9967    .55515           .07353
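The mean difference and its 95% CI can be worked out by hand from this summary table. The sketch below assumes a Normal multiplier of 1.96 and a standard error that does not pool the two variances; SPSS may use a t multiplier or a pooled variance, so small discrepancies in the last digit are possible.

```python
import math


def mean_difference_ci(mean1, se1, mean2, se2, z=1.96):
    """Difference in means (group 1 - group 2) with an approximate 95% CI.

    Assumes independent groups and combines the two standard errors
    without pooling the variances.
    """
    diff = mean1 - mean2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    return diff, diff - z * se_diff, diff + z * se_diff


# Means and standard errors of the mean from the table above
diff, lower, upper = mean_difference_ci(2.6446, 0.07143, 1.9967, 0.07353)
print(f"Mean difference = {diff:.2f} L (95% CI {lower:.2f} to {upper:.2f})")
```

Under these assumptions the interval agrees with the 0.45 to 0.85 L quoted earlier. The point estimate comes out at 0.65 rather than 0.64 because the slides subtract the already rounded means (2.64 – 2.00).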
STEP 1: COMPUTE PROPORTION WITH OUTCOME IN TWO GROUPS
              Disease recurrence
BMI >30    Yes          No            Total
Yes        34 (35%)     63 (65%)      97 (100%)
No         23 (10%)     203 (90%)     226 (100%)
Total      57 (18%)     266 (82%)     323 (100%)

We have seen that disease seems to be more common amongst smokers than non-smokers…
….. but how much more?
Three measures of effect for binary outcomes:
o Risk difference
o Risk ratio
o Odds ratio
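To make the first two measures concrete, the sketch below computes them from the disease recurrence by BMI >30 table above (34 of 97 with BMI >30, 23 of 226 without). The function names are illustrative assumptions.

```python
def risk(cases, total):
    """Proportion of a group with the outcome."""
    return cases / total


def risk_difference(risk_exposed, risk_unexposed):
    return risk_exposed - risk_unexposed


def risk_ratio(risk_exposed, risk_unexposed):
    return risk_exposed / risk_unexposed


risk_bmi_over_30 = risk(34, 97)    # exposed: BMI >30
risk_bmi_under_30 = risk(23, 226)  # unexposed

rd = risk_difference(risk_bmi_over_30, risk_bmi_under_30)
rr = risk_ratio(risk_bmi_over_30, risk_bmi_under_30)
print(f"Risk difference = {rd:.1%}")
print(f"Risk ratio = {rr:.2f}")
```

Both measures start from the same two proportions; the risk difference subtracts them and the risk ratio divides them.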
RISK DIFFERENCE: 95% CI

RISK RATIO
How many times more likely the exposed are to have disease than the unexposed
Tells us about the strength of association between exposure and disease
RR=3: 3 times more likely
If close to 1 or <1, easier to understand if use % more/less likely:
RR=1.5: 1.5 – 1 = 0.5 ….. 50% more likely
RR=0.4: 0.4 – 1 = -0.6 ….. 60% less likely
RR=0.9: 0.9 – 1 = -0.1 ….. 10% less likely

Ratio measures are always greater than 0, ie can't take a negative value
RR<1 (between 0 and 1) means exposure is 'protective'
RR=4.7, 95% CI = 1.8 to 12.4
What does this mean?
95% confident that smokers are between about 2 and 12 times more likely to have disease than non-smokers
Suppose 95% CI = 0.9 to 16.1
What does this mean?
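A minimal sketch of both calculations follows: converting a risk ratio to "% more/less likely", and computing a risk ratio with its 95% CI. The counts (15 of 78 smokers and 5 of 122 non-smokers with disease) are taken from the smoking table used elsewhere in these notes. The CI uses the usual log-scale standard error, which is an assumption about how the quoted interval was obtained.

```python
import math


def percent_change(rr):
    """Express a risk ratio as % more (positive) or less (negative) likely."""
    return (rr - 1) * 100


def risk_ratio_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


for value in (1.5, 0.4, 0.9):
    print(f"RR={value}: {percent_change(value):+.0f}%")

rr, lower, upper = risk_ratio_ci(15, 78, 5, 122)
print(f"RR = {rr:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around the estimate and can never go below 0.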
ODDS RATIO
Odds Ratio = (Odds of disease in exposed) / (Odds of disease in unexposed)

              Smoker
Disease    Yes    No     Total
Y          15     5      20
N          63     117    180
Tot        78     122    200

OR = (15/63) / (5/117) = 0.238 / 0.043 = 5.57

RISK VERSUS ODDS
Interpretation:
• Meaning less intuitive than risk ratio but if rare disease (<10% prevalence):
  OR similar to risk ratio
  Can interpret as if risk ratio
• If not rare disease
  OR will be an over-estimate of the risk ratio (further from 1)
• ORs used more than RRs because the multivariable methods commonly used to control for confounding (eg logistic regression) give ORs
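To see how far the odds ratio sits from the risk ratio in this example, the sketch below computes both from the same smoking table. The function names are illustrative assumptions.

```python
def odds_ratio(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Odds of disease in exposed divided by odds of disease in unexposed."""
    return (cases_exp / noncases_exp) / (cases_unexp / noncases_unexp)


def risk_ratio_from_counts(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Risk of disease in exposed divided by risk of disease in unexposed."""
    risk_exp = cases_exp / (cases_exp + noncases_exp)
    risk_unexp = cases_unexp / (cases_unexp + noncases_unexp)
    return risk_exp / risk_unexp


# Smokers: 15 with disease, 63 without; non-smokers: 5 with, 117 without
or_smoking = odds_ratio(15, 63, 5, 117)
rr_smoking = risk_ratio_from_counts(15, 63, 5, 117)
print(f"OR = {or_smoking:.2f}, RR = {rr_smoking:.2f}")
```

Overall prevalence here is 10% (20 of 200), right at the "rare disease" threshold, and the odds ratio is already noticeably larger than the risk ratio.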
EXERCISE
Same example: Does proportion with disease differ between smokers and non-smokers?
Analyse, descriptive statistics, crosstabs
                                  SMOKE
                                  yes       no        Total
DISEASE   yes   Count             15        5         20
                % within SMOKE    19.2%     4.1%      10.0%
          no    Count             63        117       180
                % within SMOKE    80.8%     95.9%     90.0%
Total           Count             78        122       200
                % within SMOKE    100.0%    100.0%    100.0%
Tick ‘Risk’ to get odds ratio

Risk Estimate
                                                  95% Confidence Interval
                                         Value    Lower     Upper
Odds Ratio for DISEASE (1.00 / 2.00)     5.571    1.935     16.040
For cohort SMOKE = 1.00                  2.143    1.553     2.957
For cohort SMOKE = 2.00                  .385     .179      .828
N of Valid Cases                         200
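The odds ratio row of the Risk Estimate output can be checked by hand. The sketch below assumes the standard log-scale standard error for an odds ratio, sqrt(1/a + 1/b + 1/c + 1/d), and a 1.96 multiplier; it is a sketch under those assumptions, not a statement of how SPSS computes the interval internally.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a/b) / (c/d) with an approximate 95% CI on the log scale.

    a, b: exposed with / without disease
    c, d: unexposed with / without disease
    """
    odds_ratio = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Smokers: 15 diseased, 63 not; non-smokers: 5 diseased, 117 not
or_est, or_lower, or_upper = odds_ratio_ci(15, 63, 5, 117)
print(f"OR = {or_est:.2f} (95% CI {or_lower:.2f} to {or_upper:.2f})")
```

Under these assumptions the result matches the tabulated 5.571 (1.935 to 16.040) to two decimal places.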
EXAMPLE FROM A PAPER

COMPARING MORE THAN TWO GROUPS
           <20     60+
Y          5       10
N          17      140
Total      22      150

OR = (odds of disease in <20's) / (odds of disease in 60+)
   = (5/17) / (10/140) = 4.12
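Picking two categories out of a larger table and computing their odds ratio can be sketched as below. Only the two age groups shown above are included, since the counts for any other age bands are not given here. The dictionary layout and function name are illustrative assumptions.

```python
def odds_ratio_for_pair(counts, group_a, group_b):
    """Odds ratio of disease for group_a relative to group_b.

    counts maps a category label to (number with disease, number without).
    """
    a_yes, a_no = counts[group_a]
    b_yes, b_no = counts[group_b]
    return (a_yes / a_no) / (b_yes / b_no)


# Only the two categories shown in the table; other age bands are not given.
age_counts = {"<20": (5, 17), "60+": (10, 140)}

or_young_vs_old = odds_ratio_for_pair(age_counts, "<20", "60+")
print(f"OR (<20 vs 60+) = {or_young_vs_old:.2f}")
```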
Need to use a significance test to assess whether there is an overall association, rather than looking at individual CIs
SPSS does not display ORs and 95% CIs for tables bigger than 2x2.
Need to select out the 2 exposure categories of interest (data, select cases) and then run the cross-tab.

Ratio measures tell you about whether exposure is likely to be causal or not (aetiological strength)
Risk difference tells you about implications for individuals
Risk difference can also tell you about public health implications, but may need to consider frequency of exposure also (can compute population attributable risk, PAR; see appendix in notes)
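As a rough illustration of how frequency of exposure enters, the sketch below uses one common definition of population attributable risk: risk in the whole population minus risk in the unexposed. This definition is an assumption and may differ from the one in the appendix. The counts come from the smoking table earlier in these notes, treated here as if they were a representative sample of the population, which is also an assumption.

```python
def population_attributable_risk(cases_exp, n_exp, cases_unexp, n_unexp):
    """PAR = overall risk - risk in unexposed (one common definition).

    Also returns PAR as a fraction of the overall risk.
    """
    risk_total = (cases_exp + cases_unexp) / (n_exp + n_unexp)
    risk_unexp = cases_unexp / n_unexp
    par = risk_total - risk_unexp
    return par, par / risk_total


# 15 of 78 smokers and 5 of 122 non-smokers have the disease
par, par_fraction = population_attributable_risk(15, 78, 5, 122)
print(f"PAR = {par:.3f} ({par_fraction:.0%} of the overall risk)")
```

The same risk difference gives a smaller PAR when fewer people are exposed, which is why exposure frequency matters for public health implications.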
CROSS-TABULATE
SUMMARY