
M4.

U2
Bias, Confounding and Modification Effect

Overview
Epidemiological studies aim to identify exposures that may increase or decrease the risk of developing a
certain disease (or outcome). Unfortunately, errors in the design, conduct and analysis can distort the
results of any epidemiological study. Even if errors do not seem to be an obvious explanation for an
observed association between an exposure and an outcome, it may or may not be causal. In this session
we will discuss the potential pitfalls in epidemiological studies.

Learning objectives
a. Discuss the different types of bias that can distort the results of epidemiological studies
b. Discuss how a confounding variable provides an alternative explanation for an observed association
between an exposure and an outcome
c. Discuss interaction (effect modification) in an association

A. Bias
In previous chapters we have mentioned some of the ways in which bias can occur in study design and
measurement of an association. Here, we give an overview of bias in the study designs you have met.
Bias can be categorized into two classes:
● selection bias
● information bias (or measurement bias)

Selection bias
Selection bias occurs when systematic errors are introduced by the selection of study participants or
allocation of individuals to different study groups. These errors can compromise the (internal) validity of
results of a study. This can occur if the participants selected for the study are not representative of the
general population to which the study will ultimately apply, or if the comparison groups are not
comparable (case–control or intervention studies).
For example, if subjects are allowed to choose between a new drug that is being tested and an
established drug, the more adventurous or health conscious individuals might like to try the new drug,
whereas the less adventurous or less well-informed individuals may opt for the established drug.
Differences in the effects of the two drugs observed in such a study design may be partly or entirely due
to the differences in the underlying characteristics of the study participants rather than in the effects of
the drugs. For these reasons it is preferable to randomly assign participants to the study drug or control
in intervention studies. In case–control studies, selection bias can occur in the selection of cases if they
are not representative of all cases within the population, or in the selection of controls if they are not
representative of the population that produced the cases.
In cohort studies, selection bias may occur if the exposed and unexposed groups are not truly
comparable. This might happen if the unexposed group is not correctly selected, and differs from the
exposed groups in other, unrelated, factors in addition to the exposure of interest. An example of this
would be comparing an occupational cohort with the general population. Any association between the
exposure and disease might be lost due to the healthy worker effect. Bias may also occur if there are
differences in follow-up between the comparison groups.

1|Ver. St-2020/APH

Information bias
Information (or measurement) bias occurs if an inaccurate measurement or classification of an outcome
or exposure is made. This could mean that individuals are assigned to the wrong exposure or outcome
category, and will then result in an incorrect estimation of the association between exposure and
outcome. Errors in measurement are also known as misclassifications, and might be introduced by the
observer (observer bias), by the study participants (recall bias), or by the measurement tools such as
weighing scales or questionnaires. The size and direction of the distortion of an observed association
between an exposure and an outcome depend on the type of misclassification, of which there
are two:
● differential misclassification
● non-differential (random) misclassification
Only differential misclassification leads to information bias, although we will discuss both types of
misclassification here for completeness.

Differential misclassification
Differential misclassification occurs when one group of participants is more likely to be misclassified
than the other. In a cohort study differential misclassification can occur if exposure makes the
individuals more or less likely to be classified as having the disease. In a case–control study, differential
misclassification can occur if cases are more or less likely to be classified as being exposed than controls.
Differential misclassification can therefore lead to an over- or underestimation of an association
between exposure and outcome.

Activity 1
In a study of traumatic brain injury (TBI) conducted in city A, we would like to know the association
between smartphone use while driving or riding and TBI. Cases are people who have been
diagnosed with TBI and controls are people without TBI. The investigator asked them retrospectively
about their smartphone use while driving. The investigator assumed that people with TBI were more
likely to recall their smartphone use while driving.

1. Do you think differential misclassification is likely to occur?


Yes, it could well occur.
2. If so, how do you think it would affect the observed risk of TBI?
Some people who have had a TBI have difficulty remembering certain events, which can give rise to
information bias (misclassification bias).

Suppose that the cases recalled the smartphone use accurately, but the controls did not (our case
study). This could lead to the results shown in Table 1.

Table 1 Odds of exposure to smartphone use in cases and controls
Exposure                 Cases (n=200)   Controls (n=200)
Using smartphone         70              20
Not using smartphone     130             180

Table 2 shows the “real” odds of exposure to smartphone use in cases and controls in the study.

Table 2 Actual odds of exposure to smartphone use in cases and controls (scenario 1)
Exposure                 Cases (n=200)   Controls (n=200)
Using smartphone         70              50
Not using smartphone     130             150

3. What is the observed odds ratio of exposure to smartphone use in cases compared to controls in
table 1 (our case study)?
$$\mathrm{OR} = \frac{70 \times 180}{20 \times 130} = \frac{12{,}600}{2{,}600} = 4.85$$
4. What is the actual odds ratio of exposure to smartphone use in cases compared to controls (table
2)?
$$\mathrm{OR} = \frac{70 \times 150}{50 \times 130} = \frac{10{,}500}{6{,}500} = 1.62$$
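
The two odds ratios above can be checked with a short Python sketch. It assumes the counts in Tables 1 and 2; the function and variable names are illustrative and not part of the worksheet.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio of exposure in cases versus controls (cross-product ratio)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Table 1: controls under-report smartphone use (differential misclassification)
observed_or = odds_ratio(70, 130, 20, 180)

# Table 2: exposure as it actually was
actual_or = odds_ratio(70, 130, 50, 150)

print(f"Observed OR (Table 1): {observed_or:.2f}")
print(f"Actual OR (Table 2):   {actual_or:.2f}")
print(f"Observed / actual:     {observed_or / actual_or:.1f}")
```

Because only the controls are misclassified, the observed odds ratio is inflated relative to the actual one.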

Non-differential misclassification
Non-differential misclassification occurs when both groups (cases or controls, exposed or unexposed)
are equally likely to be misclassified. This form of misclassification is therefore independent of exposure
status or outcome status. Non-differential misclassification usually leads to underestimation of an
association between exposure and outcome, and will therefore reduce the observed strength of the
association. Suppose that, in the case–control study discussed above, the exposure to alcohol use was
determined from the police records. It is likely that the records of some patients might not be traceable.
However, the loss of records would probably be distributed equally among the cases and the controls,
since police record-keeping is independent of the risk of TBI. If the investigators decided to
classify all patients who did not have a record of alcohol use in the police database as sober drivers, then
the odds of exposure to alcohol would be underestimated in both cases and controls. Although the odds
of exposure to alcohol would be underestimated equally among cases and controls, it would lead to
underestimation of the effect of alcohol use on TBI.

Activity 2
A researcher wants to conduct a case–control study of drunk driving and TBI among drivers in country T,
using hospital medical records and a police database. He found that 20% of the data on alcohol level
were missing in both cases and controls, and he classified these records as sober drivers. The observed
odds of exposure in scenario 2 are shown in Tables 3 and 4. Table 3 shows the real data with the missing
records, and Table 4 shows the data when the missing records are classified as sober driving.

What is the observed odds ratio of drunk driving in cases compared to controls in Activity 2, from
Tables 3 and 4?
Table 3 Observed odds of exposure in scenario 2
Exposure          Cases (n=500)   Controls (n=1000)
Drunk driving     200             80
Sober driving     200             720
Missing data      100             200
Total             500             1000

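$$\mathrm{OR}_{\text{Table 3}} = \frac{200 \times 720}{200 \times 80} = \frac{144{,}000}{16{,}000} = 9$$

$$\mathrm{OR}_{\text{Table 4}} = \frac{200 \times 920}{300 \times 80} = \frac{184{,}000}{24{,}000} = 7.67$$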

Table 4 Observed odds of exposure in scenario 2 when missing data are classified as sober
Exposure          Cases (n=500)   Controls (n=1000)
Drunk driving     200             80
Sober driving     300             920
Total             500             1000

Are the results from Tables 3 and 4 different? Why?

The results differ. In Table 3 the 20% of records with missing data are kept separate, whereas in
Table 4 they are counted as sober drivers. Counting them as sober misclassifies some drunk drivers
as sober in both cases and controls, which lowers the OR relative to Table 3.
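
The two calculations in Activity 2 can be checked with a short Python sketch. It uses the counts from Table 3; the function and variable names are illustrative and not part of the worksheet. The first estimate drops the records with missing alcohol data, and the second counts them as sober drivers, as in Table 4.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# Counts from Table 3
drunk_cases, sober_cases, missing_cases = 200, 200, 100
drunk_controls, sober_controls, missing_controls = 80, 720, 200

# Complete-case estimate: records with missing alcohol data are left out
or_complete = odds_ratio(drunk_cases, sober_cases, drunk_controls, sober_controls)

# Table 4 estimate: every missing record is treated as a sober driver
or_missing_as_sober = odds_ratio(
    drunk_cases,
    sober_cases + missing_cases,
    drunk_controls,
    sober_controls + missing_controls,
)

print(f"OR, missing records excluded:   {or_complete:.2f}")
print(f"OR, missing classified as sober: {or_missing_as_sober:.2f}")
```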

Activity 3
Please determine what kind of bias or misclassification occurred in each of the following statements:
1. You want to study the effect of social media bullying on perceptions of suicide. Your team
decides to randomly select high schools from one city and to use a paper-based questionnaire distributed
through teachers.
Information bias: if the effect of social media bullying on perceptions of suicide is assessed only
through a questionnaire distributed by teachers, students may not answer the questions openly,
which introduces information bias.

2. You are conducting a study on the association between demographic risk factors and knowledge,
attitudes and practices regarding dengue infection among college students. You select college students
from the health cluster, such as the schools of medicine, pharmacy and dentistry. Data were collected by
interview using a standardized questionnaire. No training was conducted for the
interviewers.
Selection bias, because the study subjects were chosen from the health cluster, who already
have good knowledge of dengue.

B. Confounding
Confounding provides an alternative explanation for an association between an exposure and an
outcome. It occurs when an observed association between an exposure and an outcome is distorted
because the exposure of interest is correlated with another risk factor. This additional risk factor is
also associated with the outcome, but independently of the exposure of interest. An unequal
distribution of this additional risk factor between those who are exposed and unexposed will result
in confounding. This situation is illustrated in Figure 9.1. Here, association 1 is an example of
confounding where smoking is the confounding variable in a study to assess the relationship
between occupation and lung cancer. In association 2, the variable blood cholesterol is on the causal
pathway between diet and heart disease, is not associated with the disease independently of diet,
and is therefore not a confounder. In association 3, alcohol consumption is not a confounder
because it is not associated with lung cancer at all.

A potential confounder is any factor that can have an effect on the risk of disease under study. This
includes factors that have direct causal links with the disease, and factors that are proxy measures for
other unknown causes (e.g. age and social class). In the next activity, you will look at the effect that
confounding can have on the estimates of association calculated in a study.

Activity 4
In an outbreak investigation, we use a case–control study to measure the association of gudeg and rice
with diarrhea cases. We found that both foods have a high odds ratio. Table 1 shows the data on gudeg
and diarrhea cases, and Table 2 shows the data on rice and diarrhea cases.

Table 1 Odds of exposure to gudeg among all cases and controls
Exposure             Cases   Controls   Total
Eating gudeg         80      30         110
Not eating gudeg     20      70         90
Total                100     100        200

1. What is the odds ratio of exposure to gudeg in cases compared to controls?


$$\mathrm{OR} = \frac{80 \times 70}{20 \times 30} = \frac{5{,}600}{600} = 9.3$$

Table 2 Odds of exposure to rice among all cases and controls
Exposure             Cases   Controls   Total
Eating rice          65      40         105
Not eating rice      35      60         95
Total                100     100        200

2. What is the odds ratio of exposure to rice in cases compared to controls?


$$\mathrm{OR} = \frac{65 \times 60}{35 \times 40} = \frac{3{,}900}{1{,}400} = 2.79$$

Because both food items are associated with diarrhea, the researcher wants to know whether one of them
is acting as a confounding variable. We will use stratified analysis to estimate the real effect of each
food item. Table 3 shows the stratified analysis of gudeg and diarrhea among people who ate rice and
people who did not. Tables 4 and 5 show the stratified analysis of rice and diarrhea among people who
ate gudeg and people who did not.
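
For reference, the pooled (Mantel-Haenszel) odds ratio used in a stratified analysis can be written as below. The symbols are notation introduced here, not taken from the worksheet: in stratum $i$, $a_i$ is the number of exposed cases, $b_i$ unexposed cases, $c_i$ exposed controls, $d_i$ unexposed controls, and $n_i = a_i + b_i + c_i + d_i$ is the stratum total.

```latex
\[
\mathrm{OR}_{\text{stratum } i} = \frac{a_i \, d_i}{b_i \, c_i},
\qquad
\mathrm{OR}_{\text{Mantel-Haenszel}}
  = \frac{\sum_i a_i \, d_i / n_i}{\sum_i b_i \, c_i / n_i}
\]
```

If the stratum-specific odds ratios are similar to each other but the pooled value differs from the crude odds ratio, this points to confounding by the stratifying variable. If the stratum-specific odds ratios differ markedly from each other, this points to effect modification.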

Table 3 Odds of exposure to gudeg in cases and controls stratified by exposure to rice
                     Eating rice         Not eating rice
Exposure             Cases   Controls    Cases   Controls    Total
Eating gudeg         60      20          20      10          110
Not eating gudeg     10      30          10      40          90
Total                70      50          30      50          200

3. What is the odds ratio of exposure to gudeg in cases compared to controls among people eating rice
and not eating rice?
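$$\mathrm{OR}_{\text{eating rice}} = \frac{60 \times 30}{20 \times 10} = \frac{1{,}800}{200} = 9$$

$$\mathrm{OR}_{\text{not eating rice}} = \frac{20 \times 40}{10 \times 10} = \frac{800}{100} = 8$$

$$\mathrm{OR}_{\text{Mantel-Haenszel}} = \frac{\dfrac{60 \times 30}{120} + \dfrac{20 \times 40}{80}}{\dfrac{10 \times 20}{120} + \dfrac{10 \times 10}{80}} = \frac{240{,}000}{28{,}000} = 8.57 \approx 8.6$$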

Table 4 Odds of exposure to rice in cases and controls stratified by exposure to gudeg
                     Eating gudeg        Not eating gudeg
Exposure             Cases   Controls    Cases   Controls    Total
Eating rice          35      17          2       4           58
Not eating rice      5       3           8       26          42
Total                40      20          10      30          100

Table 5 Odds of exposure to rice in cases and controls stratified by exposure to gudeg
                     Eating gudeg        Not eating gudeg
Exposure             Cases   Controls    Cases   Controls    Total
Eating rice          55      15          10      25          105
Not eating rice      20      10          15      50          95
Total                75      25          25      75          200

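4. What is the odds ratio of exposure to rice in cases compared to controls among people eating gudeg
and not eating gudeg?
Table 4:

$$\mathrm{OR}_{\text{eating gudeg}} = \frac{35 \times 3}{5 \times 17} = \frac{105}{85} = 1.24$$

$$\mathrm{OR}_{\text{not eating gudeg}} = \frac{2 \times 26}{8 \times 4} = \frac{52}{32} = 1.63$$

$$\mathrm{OR}_{\text{Mantel-Haenszel}} = \frac{\dfrac{35 \times 3}{60} + \dfrac{2 \times 26}{40}}{\dfrac{17 \times 5}{60} + \dfrac{4 \times 8}{40}} = \frac{366}{266} = 1.38 \approx 1.4$$

Table 5:

$$\mathrm{OR}_{\text{eating gudeg}} = \frac{55 \times 10}{20 \times 15} = \frac{550}{300} = 1.83$$

$$\mathrm{OR}_{\text{not eating gudeg}} = \frac{10 \times 50}{15 \times 25} = \frac{500}{375} = 1.33$$

$$\mathrm{OR}_{\text{Mantel-Haenszel}} = \frac{\dfrac{55 \times 10}{100} + \dfrac{10 \times 50}{100}}{\dfrac{15 \times 20}{100} + \dfrac{25 \times 15}{100}} = \frac{1{,}050}{675} = 1.56 \approx 1.6$$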

5. What is your conclusion regarding the association of eating gudeg and eating rice with diarrhea
cases?
Eating gudeg and eating rice are both associated with diarrhea in the crude analysis, but gudeg has
the much larger effect: its odds ratio stays close to 9 after stratifying by rice, whereas the odds ratio
for rice falls from about 2.8 to about 1.4 to 1.6 after stratifying by gudeg. The crude association
with rice is therefore largely explained by confounding by gudeg.
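
The crude and stratified results of Activity 4 can be compared side by side with a short Python sketch. It assumes the counts in Tables 1, 2, 3 and 5; the function names and the tuple layout are illustrative choices, not part of the worksheet.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio over strata, each given as an (a, b, c, d) tuple."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Crude tables (Tables 1 and 2)
crude_gudeg = odds_ratio(80, 20, 30, 70)
crude_rice = odds_ratio(65, 35, 40, 60)

# Gudeg stratified by rice (Table 3): eating rice, not eating rice
gudeg_strata = [(60, 10, 20, 30), (20, 10, 10, 40)]
# Rice stratified by gudeg (Table 5): eating gudeg, not eating gudeg
rice_strata = [(55, 20, 15, 10), (10, 15, 25, 50)]

for name, crude, strata in [
    ("gudeg", crude_gudeg, gudeg_strata),
    ("rice", crude_rice, rice_strata),
]:
    per_stratum = [round(odds_ratio(*s), 2) for s in strata]
    pooled = mantel_haenszel_or(strata)
    print(f"{name}: crude OR {crude:.2f}, stratum ORs {per_stratum}, pooled OR {pooled:.2f}")
```

A pooled odds ratio close to the crude one suggests little confounding by the stratifying food, while a clear drop suggests that the crude association was partly due to the other food.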

Activity 5
A population-based cross-sectional study was conducted to examine the association of protective
device use and alcohol with fatal injury. A total of 64,200 people aged 18 years and older were recruited
into the study. The investigators recorded information on socioeconomic status, injury status and crash
characteristics. The socioeconomic variables were sex, age, educational level, marital status, ethnicity,
family size, occupation and duration of residence. The crash characteristics were type of crash, use of
a protective device, alcohol use and weather conditions during the crash. The outcome variable was fatal
injury.

Not using a protective device and drunk driving were both significantly associated with fatal injury.
People who did not use a protective device such as a helmet or seat belt had 3.1 times the odds of
people who used one (OR: 3.1; 95% CI: 2.90 – 3.40), and drunk drivers had 15 times the odds of
sober drivers (OR: 15; 95% CI: 14.1 – 16.1). Both odds ratios fell after adjustment for the other
variables: to 2.1 (95% CI: 1.90 – 2.22) for protective device use and to 9.47 (95% CI: 8.75 – 10.25)
for drunk driving. The effect size of both associations with fatal injury was therefore smaller after
adjustment.
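
The size of the reduction can be made explicit with a short Python sketch that computes the relative change from the crude to the adjusted odds ratio. The function name is illustrative. The idea that a change of more than roughly 10% suggests confounding is a commonly used rule of thumb and is not stated in the worksheet.

```python
def percent_change(crude_or, adjusted_or):
    """Relative change from the crude to the adjusted odds ratio, in percent."""
    return (adjusted_or - crude_or) / crude_or * 100


# Odds ratios reported in the study description
no_protective_device = percent_change(3.1, 2.1)
drunk_driving = percent_change(15, 9.47)

print(f"No protective device: {no_protective_device:.1f}% change after adjustment")
print(f"Drunk driving:        {drunk_driving:.1f}% change after adjustment")
```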

Please answer these questions based on the information above.


1. Which variables did the researchers suspect of having a confounding effect in this study?
The socioeconomic variables.
2. How did they deal with the confounding variables?
They adjusted for the study variables using multivariable analysis and
stratification.
