Statistical concepts to be discussed: RR vs. OR, Cross table, expected values, chi-square
statistic
Question 1
What measure of association is suitable to express the strength of the relation between one
binary variable and one continuous variable (see e.g.
https://www.youtube.com/watch?v=6pIG4W8wPzE)?
a. Pearson correlation -> the variables can be binary or continuous
b. Relative risk
c. Odds ratio -> both variables must be binary here
d. Chi-square
Question 2
What measure of association is suitable to express the strength of the relation between two
continuous variables?
a. Pearson correlation
b. Relative risk
c. Odds ratio
d. Chi-square
Question 3
What measure of association is most suitable to express the strength of the relation
between two binary variables?
a. Pearson correlation
b. R-square of a regression analysis
c. Chi-square
d. Odds ratio -> used to express the strength of associations
Question 4
For which design is the relative risk an inappropriate measure of association between two
binary variables?
a. A cohort study
b. A case-control study -> incidence and prevalence cannot be calculated, because the
researcher fixes the numbers of cases and controls
c. An experiment with two conditions and a binary outcome
Question 5
For which design(s) is the odds ratio an appropriate measure of association between two
binary variables?
a. A cohort study -> can also be used, but the RR is easier to interpret here
b. A case-control study -> the relative risk cannot be calculated directly because the
study population is not followed over time
c. An experiment with two conditions and a binary outcome -> also usable here,
but
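The answers to Questions 4 and 5 can be illustrated with a small sketch. The counts below are made up for illustration only (they do not come from any exercise): a hypothetical cohort, and a hypothetical case-control sample drawn from it by keeping all cases and 10% of the non-cases. The function names are likewise just illustrative.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: exposed (a cases, b non-cases), unexposed (c cases, d non-cases)."""
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """Ratio of the proportion of cases among exposed to that among unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical full cohort (made-up numbers).
cohort = (100, 900, 50, 950)
# Hypothetical case-control sample: all cases, 10% of the non-cases.
case_control = (100, 90, 50, 95)

or_cohort = odds_ratio(*cohort)
or_case_control = odds_ratio(*case_control)
rr_cohort = risk_ratio(*cohort)
rr_case_control = risk_ratio(*case_control)

print("OR cohort:", round(or_cohort, 3), "OR case-control:", round(or_case_control, 3))
print("RR cohort:", round(rr_cohort, 3), "'RR' case-control:", round(rr_case_control, 3))
```

In this sketch the odds ratio is the same in both tables, while the ratio of proportions changes once the non-cases are subsampled, which is why the OR is the measure reported for case-control designs.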
Question 6
What frequencies are compared with the observed frequencies when the chi-square test
statistic is obtained?
a. The frequencies that are expected if there is no relation between 2 variables ->
the chi-square test tests for independence between two variables (X/Y) that are
both categorical (factors).
b. The frequencies that are expected if there is a relation between 2 variables
c. The frequencies that are expected if there is a linear relation between 2 variables
d. The frequencies that are expected if there is a non-linear relation between 2
variables
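For reference, the comparison in Question 6 can be written out as a formula. The notation is an assumption chosen here, not taken from the course slides: O_ij is the observed count in row i and column j, R_i and C_j are the row and column totals, and N is the grand total.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i}\sum_{j} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}}
```

The expected counts E_ij are the frequencies under independence (answer a), and the statistic grows as the observed counts move away from them.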
Question 7
The chi-square test statistic for a contingency table can be calculated to examine the
relation between
a. two continuous variables
b. one continuous and one categorical variable
c. two categorical variables
d. one continuous and one binary variable
Question 8
In a study rats were randomized into one of two conditions, one condition in which the rats
were put on a restricted diet, and one in which there were no restrictions in this respect
(“ad libitum”). For each rat it was registered whether the lifespan was less than 2 years
(yes or no). The data are as follows:
                   Lifespan less than 2 years?
                   Yes    No
Restricted diet    12     89
Ad libitum diet    54     37
a. Calculate the odds of a lifespan shorter than 2 years for rats on a restricted diet
Odds:
12/89 = 0.135 -> for every rat that lives longer than two years, there are 0.135 rats that
live shorter than 2 years
b. Calculate the odds of a lifespan shorter than 2 years for rats on a “free-eating” (“ad
libitum”) diet
54/37 = 1.459 -> for every rat that lives longer than two years, there are 1.459 rats that
live shorter than 2 years
c. Calculate the odds ratio for a restricted diet versus a “free-eating” diet. Give an
interpretation of this odds ratio.
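0.135/1.459 = 0.092 -> the OR is smaller than 1 = negative association (slides block 3):
the odds of a lifespan shorter than 2 years on the restricted diet are about 0.09 times
the odds on the ad libitum diet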
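The odds and odds ratio from parts a to c can be checked with a few lines of Python. This is only a sketch of the hand calculation; the variable and function names are chosen here for illustration.

```python
# Counts from the table in Question 8 (lifespan less than 2 years: yes / no).
restricted_yes, restricted_no = 12, 89
ad_libitum_yes, ad_libitum_no = 54, 37


def odds(events, non_events):
    """Odds = number with the event divided by number without the event."""
    return events / non_events


odds_restricted = odds(restricted_yes, restricted_no)
odds_ad_libitum = odds(ad_libitum_yes, ad_libitum_no)
odds_ratio = odds_restricted / odds_ad_libitum

print(round(odds_restricted, 3), round(odds_ad_libitum, 3), round(odds_ratio, 3))
# prints: 0.135 1.459 0.092
```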
d. Calculate the risk of a lifespan shorter than 2 years for rats on a restricted diet
12/(12+89) = 0.119
e. Calculate the risk of a lifespan shorter than 2 years for rats on a “free-eating” (“ad
libitum”) diet
54/(54+37) = 0.593
f. Calculate the relative risk for a restricted diet versus a “free-eating diet”. Give an
interpretation of this relative risk.
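(12/101)/(54/91) = 0.200 -> the risk of a lifespan shorter than 2 years on the restricted
diet is about one fifth of the risk on the ad libitum diet (the further a ratio is from 1,
the stronger the association, so the OR of 0.092 suggests a stronger association than the RR)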
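Parts d to f can be verified the same way. The sketch below works from the unrounded risks, which gives a relative risk of about 0.200; dividing the already rounded risks (0.119 and 0.593) gives a slightly different third decimal. Names are illustrative only.

```python
# Counts from the table in Question 8 (lifespan less than 2 years: yes / no).
restricted_yes, restricted_no = 12, 89
ad_libitum_yes, ad_libitum_no = 54, 37


def risk(events, non_events):
    """Risk = number with the event divided by the group total."""
    return events / (events + non_events)


risk_restricted = risk(restricted_yes, restricted_no)
risk_ad_libitum = risk(ad_libitum_yes, ad_libitum_no)
relative_risk = risk_restricted / risk_ad_libitum

print(round(risk_restricted, 3), round(risk_ad_libitum, 3), round(relative_risk, 3))
```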
g. Assume that there is no relation between the rat’s life span and the type of diet.
Calculate the expected number of rats for each of the cells in the table as given
above.
Restricted diet and over 2 yrs: (89+12) × (89+37)/192 = 66.281
Restricted diet and less than 2 yrs: (89+12) × (12+54)/192 = 34.719
Ad libitum diet and less than 2 yrs: (37+54) × (12+54)/192 = 31.281
Ad libitum diet and over 2 yrs: (37+54) × (89+37)/192 = 59.719
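The four expected counts follow one rule, row total times column total divided by the grand total, so they can be generated in one pass. A minimal sketch, with the table layout (rows = diet, columns = yes/no) as an assumption of this example:

```python
# Observed counts: rows = restricted, ad libitum; columns = lifespan < 2 years yes, no.
observed = [[12, 89], [54, 37]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence for each cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 3) for value in row])
```

The expected counts add up to the same row and column totals as the observed counts, which is a quick check on the arithmetic.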
b. Which measure(s) of association is (are) meaningful here: odds ratio, relative risk or
both?
Case-control study -> OR
(152/315) / (163/315)
b. Which measure(s) of association can be calculated here: odds ratio, relative risk or
both? Both can be calculated, but the RR is preferable
c. Calculate the appropriate measure(s) of association.
124/4000 = 0.031
65/6000 = 0.011
RR = (124/4000)/(65/6000) = 2.862
d. Give an interpretation of the calculated measure(s) of association.
The risk of lung cancer for someone who smokes is 2.862 times the risk for someone who
does not smoke
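A short check of this relative risk, as a sketch. It assumes, as the interpretation suggests, that 124 of 4000 smokers and 65 of 6000 non-smokers developed lung cancer; the question stem itself is not shown here. Working from the unrounded risks gives about 2.86, whereas rounding the second risk to 0.011 first gives a somewhat lower value.

```python
# Assumed reading of the counts: cases / group size for smokers and non-smokers.
cases_smokers, n_smokers = 124, 4000
cases_non_smokers, n_non_smokers = 65, 6000

risk_smokers = cases_smokers / n_smokers
risk_non_smokers = cases_non_smokers / n_non_smokers
relative_risk = risk_smokers / risk_non_smokers

print(round(risk_smokers, 4), round(risk_non_smokers, 4), round(relative_risk, 3))
```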
Question 11
In a cross-sectional study, researchers were interested to investigate the association
between smoking status (no vs yes) and coffee consumption (drinker vs non-drinker) in
healthy adults. The data are as follows:
                       Smoking status
                       No      Yes
Coffee drinkers        5134    9189
Non-coffee drinkers    1052    821
a. Calculate the probability of smoking for both coffee and non-coffee drinkers.
Coffee drinkers: 9189/14323 = 0.642 -> 64.2% of the coffee drinkers smoke
Non-coffee drinkers: 821/1873 = 0.438 -> 43.8% of the non-coffee drinkers smoke
b. Calculate the relative risk of smoking for coffee drinkers versus non-coffee drinkers.
RR = risk 1 / risk 2 = (9189/14323)/(821/1873) = 1.464
c. Give an interpretation of the calculated relative risk.
A coffee drinker is 1.464 times as likely to smoke as someone who does not drink
coffee
Also: smoking and coffee drinking may both be influenced by a third factor
-> an association, but not necessarily a causal relation
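The proportions and the relative risk for Question 11 can be reproduced as follows. This is a sketch with illustrative names; it uses the unrounded proportions, which give a relative risk of about 1.464 (dividing the rounded values 0.642 and 0.438 gives a slightly higher third decimal).

```python
# Counts from the table in Question 11 (smoking status: no / yes).
coffee_no, coffee_yes = 5134, 9189
non_coffee_no, non_coffee_yes = 1052, 821

p_smoke_coffee = coffee_yes / (coffee_no + coffee_yes)
p_smoke_non_coffee = non_coffee_yes / (non_coffee_no + non_coffee_yes)
relative_risk = p_smoke_coffee / p_smoke_non_coffee

print(round(p_smoke_coffee, 3), round(p_smoke_non_coffee, 3), round(relative_risk, 3))
```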
d. Assuming no relation between drinking coffee and smoking status, calculate the
expected number of persons in each of the cells of the above contingency table.
Row total × column total / grand total
Grand total N = 16196
Coffee drinkers (yes + no) = 14323
Non-coffee drinkers (yes + no) = 1873
Drinks coffee, does not smoke: (a+b)(a+c)/N = (5134 + 9189) × (5134 + 1052) / 16196 =
5470.615
Drinks coffee and smokes: (a+b)(b+d)/N = (5134 + 9189) × (9189 + 821) / 16196 =
8852.385
Does not drink coffee, does not smoke: (c+d)(a+c)/N = (1052 + 821) × (5134 + 1052) /
16196 = 715.385
Does not drink coffee, smokes: (c+d)(b+d)/N = (1052 + 821) × (9189 + 821) / 16196 = 1157.615
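The expected counts are also the ingredients of the Pearson chi-square statistic, so both can be computed together. The sketch below uses no continuity correction and cross-checks the result against the 2x2 shortcut N(ad − bc)² / (row totals × column totals); the cell labels a, b, c, d follow the notation used in the calculation above.

```python
# Observed counts: rows = coffee drinkers, non-coffee drinkers; columns = smoking no, yes.
observed = [[5134, 9189], [1052, 821]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# Cross-check with the shortcut formula for a 2x2 table.
(a, b), (c, d) = observed
chi_square_shortcut = n * (a * d - b * c) ** 2 / (
    row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1]
)

print([[round(e, 3) for e in row] for row in expected])
print(round(chi_square, 2), round(chi_square_shortcut, 2))
```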
b. Also examine the chi-square statistic and check whether it is the same as the one
calculated in problem 11e.
c. In the SPSS output, can you also find the relative risk, which you calculated in
problem 11b?
SPSS instructions Problem 12
See also Andy Field ch.19.5 about chi-square in SPSS.
First create a .sav file in SPSS (e.g., smoke.sav)
Create 3 columns in your dataset:
• A column named coffee, which indicates whether an individual is a coffee drinker
(code 1) or not (code 2)
• A column named smoke, which indicates whether an individual is a smoker (code 1) or not
(code 2)
• A column named freq containing how many subjects belong to each of the cells of
the cross table.
Next choose: Data → Weight cases, select Weight cases by, then select the variable
freq as the Frequency Variable and press the OK button.
Save your data set before computing any statistics.
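To see what the three-column layout and the Weight cases step amount to, here is a plain-Python illustration (not SPSS syntax). It assumes the coding described above, with code 1 = yes and code 2 = no for both coffee and smoke, and the counts of Question 11; each data row stands for freq individuals.

```python
from collections import defaultdict

# One row per cell of the cross table: (coffee, smoke, freq), codes 1 = yes, 2 = no.
rows = [
    (1, 1, 9189),  # coffee drinker, smoker
    (1, 2, 5134),  # coffee drinker, non-smoker
    (2, 1, 821),   # non-coffee drinker, smoker
    (2, 2, 1052),  # non-coffee drinker, non-smoker
]

# Weighting by freq: every row counts freq times instead of once.
crosstab = defaultdict(int)
coffee_totals = defaultdict(int)
for coffee, smoke, freq in rows:
    crosstab[(coffee, smoke)] += freq
    coffee_totals[coffee] += freq

n = sum(coffee_totals.values())

# Row percentages: share of smokers within each coffee group.
for coffee in (1, 2):
    share = crosstab[(coffee, 1)] / coffee_totals[coffee]
    print("coffee code", coffee, "-> % smokers:", round(100 * share, 1))
```

Without the weighting step the data set would be treated as four individuals rather than 16196, so the cross table would contain a count of 1 in every cell.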
Perform the required analyses: Analyze → Descriptive Statistics → Crosstabs
• For the Row(s) box select the variable coffee
• Press the Exact button and select Exact, then press continue
• Press the Statistics button and select Chi-square and Risk, then press continue
• Press the Cells button and select Expected (next to Observed), then press continue
• Press the Statistics button and select Chi-square, then press continue
• Press the Cells button and select Counts Expected (next to Observed) and
Percentages Row, then press continue