
Bias, Confounding and

Fallacies in Epidemiology

1
LEARNING OBJECTIVES

AT THE END, LEARNER SHOULD BE ABLE TO:

1. Define bias

2. Know the types of bias

3. Identify and control various biases


BIAS
Definition
Types
Examples
Remedies

CONFOUNDING
Definition
Examples
Remedies

FALLACIES
Definition

(Effect Modification)

3
What is Bias?

Bias is one of the three major threats to internal validity:

Bias

Confounding

Random error / chance

4
What is Bias?

Systematic error, or deviation of results or inferences from the truth, may arise at any point in the course of a study.

A process at any stage of inference tending to produce results that depart systematically from the true values (Fletcher et al, 1988)

Any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth (Last, 2001)

5
Bias is systematic error
Errors can be differential (systematic) or non-differential (random)

Random error: use of an invalid outcome measure that misclassifies cases and controls equally

Differential error: use of an invalid measure that misclassifies cases in one direction and controls in another

The term 'bias' should be reserved for differential or systematic error

6
Random Error
[Figure: per cent of subjects (y-axis, 0 to 14) by size of induration in mm (x-axis, 0 to 35)]


7
Systematic Error
[Figure: per cent of subjects (y-axis, 0 to 14) by size of induration in mm (x-axis, 0 to 30)]


8
Chance vs Bias

Chance is caused by random error

Bias is caused by systematic error

Errors from chance will cancel each other out in the long run (large sample size)
Errors from bias will not cancel each other out whatever the sample size

Chance leads to less precise results

Bias leads to inaccurate results
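
The contrast between chance and bias can be illustrated with a small simulation. This is only a sketch under assumed numbers: the true induration size (15 mm), the noise level (5 mm) and the size of the bias (+3 mm) are hypothetical and do not come from the slides. It shows the mean of noisy readings approaching the true value as the sample grows, while a constant measurement bias stays in place at every sample size.

```python
import random
import statistics

def mean_reading(true_value, n, bias=0.0, noise_sd=5.0, seed=0):
    """Average of n simulated readings: truth + constant bias + random noise."""
    rng = random.Random(seed)
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)

TRUE_MM = 15.0  # hypothetical true induration size (assumption)

for n in (10, 1_000, 100_000):
    unbiased = mean_reading(TRUE_MM, n)           # random error only
    biased = mean_reading(TRUE_MM, n, bias=3.0)   # random error plus +3 mm bias
    print(f"n={n:>6}: random error only {unbiased:6.2f} mm, "
          f"with bias {biased:6.2f} mm")
```

With a large sample the first column settles near 15 mm (better precision), while the second settles near 18 mm: more data does not remove the systematic error.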

9
TYPES OF BIAS
• Selection bias
• Information bias
• Analytic
• Assessment
• Conflict of interest
• Lead time
• Non-response, loss to follow up
• Over diagnosis
• Publication
• Recall
• Surveillance
• Wish
• Too small, non-representative sample
• Population definition error
• Question construction error

10
Confounding

• A CONFUSION OF EFFECT
– The apparent effect of the exposure is distorted by the effect of an extraneous factor which is mistaken for the actual exposure.
– The distortion can lead to overestimation or underestimation of the effect. It can even change the apparent direction of the effect.

11
Confounding

• A third factor which is related to both exposure and outcome, and which accounts for some/all of the observed relationship between the two

• Confounder is not a result of the exposure

12
Confounding
To be a confounding factor, two conditions must be met:

Exposure Outcome

Third variable

Be associated with exposure
- without being the consequence of exposure

Be associated with outcome
- independently of exposure (not an intermediary)

13
Confounding

Coffee CHD

Smoking

Smoking is correlated with coffee drinking and is a risk factor even for those who do not drink coffee

14
Confounding ?

Smoking CHD

Coffee

Coffee drinking may be correlated with smoking but is not a risk factor in non-smokers

15
Confounding

Birth Order Down Syndrome

Maternal Age

Maternal age is correlated with birth order and is a risk factor even if birth order is low

16
Confounding ?

Maternal Age Down Syndrome

Birth Order

Birth order is correlated with maternal age but is not a risk factor in younger mothers

17
Confounding

Alcohol Lung Cancer

Smoking

Smoking is correlated with alcohol consumption and is a risk factor even for those who do not drink alcohol

18
Confounding ?

Smoking CHD

Yellow fingers

Not related to the outcome


Not an independent risk factor

19
Confounding ?

Diet CHD

Cholesterol

On the causal pathway

20
Confounding
Imagine you tried to repeat, in another sample, a positive finding of an association of birth order with Down syndrome or of coffee drinking with CHD. Would you be able to replicate it? If not, why?

You would not necessarily be able to replicate the original finding because it was a spurious association due to confounding.
In another sample where all mothers are below 30 yr, there would be no association with birth order.
In another sample in which there are few smokers, the coffee association with CHD would not be replicated.

21
Confounding
Imagine you have included only non-smokers in a study and examined the association of alcohol with lung cancer. Would you find an association?

No, because the first study was confounded. The association with alcohol was actually due to smoking. By restricting the study to non-smokers, we have removed the confounding by smoking.
Restriction is one way of preventing confounding at the time of study design.

22
Confounding
Imagine you have stratified your dataset by smoking status in the alcohol - lung cancer association study. Would the odds ratios differ in the two strata?

The alcohol association would yield a similar odds ratio in both strata, and it would be close to unity.
In confounding, the stratum-specific odds ratios should be similar to each other and differ from the crude odds ratio by at least 15%.
Stratification is one way of identifying confounding at the time of analysis.

If the stratum-specific odds ratios are different, then this is not confounding but effect modification.

23
Confounding
Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association?

If smoking is included in the statistical model, the alcohol association would lose its statistical significance.
Adjustment by multivariable modelling is another method to identify confounders at the time of data analysis.

24
Confounding

For confounding to occur, the confounders should be differentially represented in the comparison groups.

Randomisation is an attempt to evenly distribute potential (unknown) confounders in study groups. It does not guarantee control of confounding.

Matching is another way of achieving the same. It ensures equal representation of subjects with known confounders in study groups. It has to be coupled with matched analysis.

Restriction for potential confounders in design also prevents confounding but causes loss of statistical power (stratified analysis may be tried instead).
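
The point that randomisation balances confounders on average but does not guarantee balance can be sketched with a short simulation. All numbers are assumptions for illustration: a hypothetical confounder (smoking) with 30% prevalence, and two equally sized study arms filled by random allocation. The function reports how far apart the smoker proportions in the two arms end up.

```python
import random

def smoker_imbalance(n_subjects, smoker_rate=0.3, seed=1):
    """Absolute difference in smoker proportion between two randomised arms."""
    rng = random.Random(seed)
    # Hypothetical population: True marks a smoker (the potential confounder).
    smokers = [rng.random() < smoker_rate for _ in range(n_subjects)]
    rng.shuffle(smokers)                    # random allocation order
    half = n_subjects // 2
    arm_a, arm_b = smokers[:half], smokers[half:]
    return abs(sum(arm_a) / len(arm_a) - sum(arm_b) / len(arm_b))

for n in (20, 200, 20_000):
    print(f"{n:>6} subjects: difference in smoker proportion "
          f"= {smoker_imbalance(n):.3f}")
```

Small trials can still end up with noticeably unequal arms by chance, which is why the slide says randomisation is only an attempt to distribute confounders evenly.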

25
Confounding

Randomisation, matching and restriction can be tried at the time of designing a study to reduce the risk of confounding.

At the time of analysis:
Stratification and multivariable (adjusted) analysis can achieve the same.

It is preferable to try something at the time of designing the study.

26
Effect of randomisation on outcome of
trials in acute pain

27
Confounding

Obesity Mastitis

Age

In cows, older animals are heavier and older age increases the risk of mastitis. The age effect may therefore appear as an association with obesity.

28
Confounding on Age
Crude
          Mas +ve   Mas -ve   Total   Risk   Odds ratio
Obese          50       150     200   0.25   1.8
Normal         30       170     200   0.15

Young cows
          Mas +ve   Mas -ve   Total   Risk   Odds ratio
Obese           5        45      50   0.10   1.0
Normal         15       135     150   0.10

Old cows
          Mas +ve   Mas -ve   Total   Risk   Odds ratio
Obese          45       105     150   0.30   1.0
Normal         15        35      50   0.30

Crude vs adjusted RR = (1/1.8)*100 = 56%, or the change is 44%

29
No Confounding on type of Breed

Crude
          Mas +ve   Mas -ve   Total   Risk    Odds ratio
Obese         240      1760    2000   0.12    2.6
Normal         92      1908    2000   0.046

Breed 1
          Mas +ve   Mas -ve   Total   Risk    Odds ratio
Obese         211       789    1000   0.21    2.6
Normal         82       918    1000   0.08

Breed 2
          Mas +ve   Mas -ve   Total   Risk    Odds ratio
Obese          29       971    1000   0.029   2.9
Normal         10       990    1000   0.01

Crude vs adjusted RR = (2.6/2.9)*100 = about 90%, or the change is about 10%

30
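
The two cow tables above can be checked with a few lines of arithmetic. The sketch below recomputes the crude tables by summing the strata, then compares the crude odds ratio with a Mantel-Haenszel pooled odds ratio. The Mantel-Haenszel estimator and the `relative_change` helper are choices made here for illustration; the slides do not say how their adjusted figure was obtained, and exact results can differ somewhat from the rounded values printed on the slides. Note that the crude "Normal, mastitis negative" count for the breed example is taken as 1908, the sum of the two breed strata (918 + 990).

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed with/without disease; c, d = unexposed with/without disease."""
    return (a * d) / (b * c)

def risk_ratio(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def mantel_haenszel_or(strata):
    """Pooled odds ratio across 2x2 strata (Mantel-Haenszel estimator)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def crude_table(strata):
    """Collapse the strata into one 2x2 table by summing each cell."""
    return tuple(sum(cells) for cells in zip(*strata))

def relative_change(crude, adjusted):
    """Change in the estimate after adjustment, as a fraction of the crude value."""
    return abs(crude - adjusted) / crude

# Cell order: (obese mastitis+, obese mastitis-, normal mastitis+, normal mastitis-)
age_strata = [(5, 45, 15, 135), (45, 105, 15, 35)]        # young cows, old cows
breed_strata = [(211, 789, 82, 918), (29, 971, 10, 990)]  # breed 1, breed 2

for label, strata in (("age", age_strata), ("breed", breed_strata)):
    crude = odds_ratio(*crude_table(strata))
    adjusted = mantel_haenszel_or(strata)
    change = relative_change(crude, adjusted)
    verdict = "confounding likely" if change >= 0.15 else "little confounding"
    print(f"{label}: crude OR {crude:.2f}, adjusted OR {adjusted:.2f}, "
          f"change {change:.0%} -> {verdict}")
```

Using the 15% rule of thumb from the slides, adjusting for age changes the estimate by well over 15%, while adjusting for breed changes it by much less. `risk_ratio` is included because the slides label their comparison as "RR"; it can be applied to the same tuples.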
Selection Bias

31
Selection Bias

Selective differences between comparison groups that impact on the relationship between exposure and outcome

Usually results from comparison groups not coming from the same study base and not being representative of the populations they come from

32
Selection Bias Examples

• Self selection bias = publicity bias

– People referring themselves to the investigator following publicity about the study
– Self-referral of subjects is a threat to validity, as the reasons for self-referral may be associated with the outcome of the study.
– Example: troops present at the Smoky atomic test in Nevada. 82% of subjects were traced by the investigators, but 18% contacted the investigators after publicity about the study, and leukemia may have been overrepresented in these people
33
Selection Bias Examples

• Self selection bias = healthy worker effect

– Can occur before the study subjects are identified
– Relatively healthy people become or remain workers, while those who retire, remain unemployed or are absent from duty are, as a group, less healthy
– People in employment may be less likely to volunteer for the study

34
Selection Bias Examples

• Diagnostic bias
– Can occur before the study subjects are identified
– Example: a case-control study of the relationship between oral contraceptive (OC) use and DVT. The clinicians knew about the relationship being investigated, so women with suggestive symptoms and known OC use were more likely to be referred to hospital as DVT.
– This leads to overestimation of the effect of OC on DVT

35
Selection Bias Examples

• LOSS TO FOLLOW UP = WITHDRAWAL BIAS

– Occurs when there is differential loss to follow up that is related to exposure. It should be controlled by following all groups vigorously and with the same zeal.
– Single or double blinding should be implemented to ensure the same follow-up for all groups

36
Selection Bias Examples

• PREVALENCE-INCIDENCE BIAS
– This occurs when prevalent cases are investigated to study the exposure-disease relationship.
• Once a person is diagnosed with the disease, they may change the habit which contributed to the disease.
• Prevalent cases are survivors of the disease and, as survivors, may be atypical with respect to the exposure, so they may not represent all cases of the disease.

Selective survival (Neyman's) bias

37


Selection Bias Examples

Case-control study:

Controls have less potential for exposure than cases

Outcome = brain tumour;

exposure = overhead high voltage power lines

Cases chosen from province wide cancer registry

Controls chosen from rural areas

Systematic differences between cases and controls

38
Case-Control Studies:
Potential Bias

39
Selection Bias Examples

Cohort study:
Differential loss to follow-up

Especially problematic in cohort studies

Subjects in a follow-up study of multiple sclerosis may differentially drop out due to disease severity

Differential attrition → selection bias

40
Selection Bias Examples
Self-selection bias:
- You want to determine the prevalence of HIV infection
- You ask for volunteers for testing
- You find no HIV
- Is it correct to conclude that there is no HIV in this
location?

41
Selection Bias Examples
Healthy worker effect:
Another form of self-selection bias
“self-screening” process – people who are unhealthy
“screen” themselves out of active worker population

Example:
- Course of recovery from low back injuries in 25-45 year
olds
- Data captured on worker’s compensation records
- But prior to identifying subjects for study, self-selection
has already taken place

42
Selection Bias Examples
Diagnostic or workup bias (over-diagnosis bias)
Investigators become overenthusiastic and tend to over-read.
Diagnoses (case selection) may be influenced by the physician’s knowledge of exposure
Example:
- Case-control study
- outcome is pulmonary disease
- exposure is smoking
A radiologist aware of the patient’s smoking status when reading the x-ray may look more carefully for abnormalities and differentially select cases
• False rate of over-detection
• Abnormal group diluted with “free of disease” persons
• False survival
Legitimate for clinical decisions, inconvenient for research

43
Information / Measurement /
Misclassification Bias

Method of gathering information is inappropriate and yields systematic errors in measurement of exposures or outcomes

If misclassification of exposure (or disease) is unrelated to disease (or exposure) then the misclassification is non-differential

If misclassification of exposure (or disease) is related to disease (or exposure) then the misclassification is differential

Distorts the true strength of association

44
Information bias
1. Differential misclassification
• Cases → Controls or Controls → Cases
• Exposed → Non-exposed or Non-exposed → Exposed

• Can produce an apparent association when there is none, or hide an association that exists

• A woman who had a malformed baby may be more likely to remember a mild infection than mothers of normal babies.

45
Information bias
2. Non-differential (both ways)
• Problem in data collection methods
• Relative risk or odds ratio is diluted.
• Less likely to detect an association even if it exists in reality
• E.g. by mistake we include some diseased persons among controls and some non-diseased persons among cases
• Now controls will no longer have a low prevalence of exposure and cases will no longer have a high prevalence of exposure
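
The dilution described above can be worked through numerically. The following is a minimal sketch with assumed values: exposure prevalence of 60% among true cases and 30% among true controls, and 20% of each study group wrongly classified. None of these figures come from the slides. Mixing the groups pulls the two exposure prevalences towards each other, so the observed odds ratio moves towards 1.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)

def observed_or(p_cases, p_controls, wrong_in_cases, wrong_in_controls):
    """Odds ratio seen when each study group is contaminated by the other.

    wrong_in_cases: fraction of the 'case' group that is actually non-diseased.
    wrong_in_controls: fraction of the 'control' group that is actually diseased.
    """
    mixed_cases = (1 - wrong_in_cases) * p_cases + wrong_in_cases * p_controls
    mixed_controls = ((1 - wrong_in_controls) * p_controls
                      + wrong_in_controls * p_cases)
    return odds(mixed_cases) / odds(mixed_controls)

# Hypothetical exposure prevalences (assumptions for illustration)
true_or = observed_or(0.6, 0.3, 0.0, 0.0)      # no misclassification
diluted_or = observed_or(0.6, 0.3, 0.2, 0.2)   # 20% misclassified both ways

print(f"true OR = {true_or:.2f}, observed OR = {diluted_or:.2f}")
```

The observed odds ratio stays above 1 but is clearly smaller than the true one, which is the sense in which non-differential misclassification makes a real association harder to detect.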

46
Information / Measurement /
Misclassification Bias

Sources of information bias:

Subject variation
Observer variation
Deficiency of tools
Technical errors in measurement

47
Information / Measurement /
Misclassification Bias
Recall bias:
Those exposed have a greater sensitivity for recalling exposure (reduced specificity)

- especially important in case-control studies, when exposure history is obtained retrospectively
- cases may more closely scrutinize their past history, looking for ways to explain their illness
- controls, not feeling a burden of disease, may less closely examine their past history

Those who develop a cold are more likely to identify the exposure than those who do not – differential misclassification
- Case: Yes, I was sneezed on
- Control: No, can’t remember any sneezing
48
Some More Types of Bias
• Analytical bias: if the epidemiologist or biostatistician has a strong preconception, it can interfere with the analysis and interpretation of results (control by masking)
• Assessment bias: the person who decides whether disease has occurred knows the hypothesis or the exposure status (control by masking)
• Conflict of interest: funding interest of the epidemiologist in industry or government projects
• e.g. the employer may be affected politically, economically, or legally.
49
Lead-time bias
• Comparing survival in screened and unscreened populations
• Benefits we seek are delay or prevention of death

[Timeline figure]
Without screening: onset of disease → usual diagnosis → death
With screening: onset of disease → early diagnosis due to screening → death
Lead time = interval between early diagnosis and usual diagnosis

Survival seems to be longer

50
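
The timeline can be turned into a small worked example. The ages below are hypothetical and chosen only to make the arithmetic easy: the same disease course is recorded once with the usual diagnosis and once with an earlier, screen-detected diagnosis, while the age at death is left unchanged.

```python
from dataclasses import dataclass

@dataclass
class Course:
    """Ages (in years) at key points of one patient's disease course."""
    onset: float
    diagnosis: float
    death: float

    def survival_from_diagnosis(self) -> float:
        return self.death - self.diagnosis

# Hypothetical patient: screening moves diagnosis earlier, death is unchanged.
usual = Course(onset=50.0, diagnosis=55.0, death=58.0)
screened = Course(onset=50.0, diagnosis=52.0, death=58.0)

lead_time = usual.diagnosis - screened.diagnosis

print("survival after usual diagnosis:", usual.survival_from_diagnosis())
print("survival after screening diagnosis:", screened.survival_from_diagnosis())
print("lead time:", lead_time)
```

Measured survival doubles in this example, yet the whole gain equals the lead time and the patient dies at the same age, so no real benefit has been shown.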


Information / Measurement /
Misclassification Bias

Reporting bias:
o Individuals with severe disease tend to have more complete records, and therefore more complete information about exposures, so a stronger association is found

o Individuals who are aware of being participants in a study behave differently. They may want to please the principal investigator and give false statements. (Hawthorne effect)

51
Reporting bias
o The subjects may be reluctant to report an exposure
– Due to beliefs, perceptions or attitudes
– Patients with HIV may be reluctant to identify the cause of disease

Publication bias
o Journals may select studies with a positive association for ‘readers’ interest’, omitting studies with no association

52
SOME MORE TYPES OF BIAS
• Surveillance bias
• Disease ascertainment is better in a monitored population
• E.g. oral contraceptives → thrombophlebitis
– Physicians monitored women using O.C. more closely than other patients
– More chances of identifying cases of thrombophlebitis in women using O.C.
• Thus an association may be observed even if NO true association exists
53
SOME MORE TYPES OF BIAS

• Wish bias
– Term used by Wynder: patients who develop disease are in a state of denial, “Why me?”
– “The disease is not their fault”
– Thus they may deny certain exposures related to “life style”, e.g. smoking, drinking

Efforts should be made to eliminate or reduce bias, or at least recognize it

54
Controlling for Information Bias
- Blinding
prevents investigators and interviewers from
knowing case/control or exposed/non-exposed
status of a given participant

- Form of survey
mail may impose less “white coat tension” than a
phone or face-to-face interview

- Questionnaire
use multiple questions that ask same information
acts as a built in double-check

- Accuracy
multiple checks in medical records
gathering diagnosis data from multiple sources

55
Types of Bias

** Confounding bias **
Distortion of exposure - disease relation by some
other factor

56
** (Effect Modification) **

FALLACIES
Definition

57
Confounding or Effect Modification

Birth Weight Leukaemia

Sex

Can sex be responsible for the birth weight association in leukaemia?
- Is it correlated with birth weight?
- Is it correlated with leukaemia independently of
birth weight?
- Is it on the causal pathway?
- Can it be associated with leukaemia even if birth
weight is low?
- Is sex distribution uneven in comparison groups?
58
Confounding or Effect Modification

Birth Weight Leukaemia OR = 1.5

Sex

Does birth weight association differ in strength according to sex?

BOYS Birth Weight Leukaemia OR = 1.8

GIRLS Birth Weight // Leukaemia OR = 0.9

59
Effect Modification

In an association study, if the strength of the
association varies over different categories of a third
variable, this is called effect modification.
The third variable is changing the effect of the
exposure.
The effect modifier may be sex, age, an environmental
exposure or a genetic effect.
Effect modification is similar to interaction in statistics.
There is no adjustment for effect modification. Once it
is detected, stratified analysis can be used to obtain
stratum-specific odds ratios.
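
The distinction between confounding and effect modification can be sketched as a simple decision rule on the crude and stratum-specific odds ratios. This is a rough heuristic, not a formal method: the 15% threshold for the crude-versus-adjusted change is the rule of thumb quoted earlier in these slides, while reusing 15% as the cut-off for "similar" strata and using a plain average of the stratum odds ratios as the adjusted value are assumptions made here for illustration. In practice a formal test of homogeneity and a proper pooled estimate would be used.

```python
def classify(crude_or, stratum_ors, tolerance=0.15):
    """Label a stratified result using a simple change-in-estimate heuristic."""
    lowest, highest = min(stratum_ors), max(stratum_ors)
    # Strata that disagree with each other point to effect modification.
    if (highest - lowest) / lowest > tolerance:
        return "effect modification"
    # Stand-in for an adjusted estimate (assumption: simple average of strata).
    pooled = sum(stratum_ors) / len(stratum_ors)
    if abs(crude_or - pooled) / crude_or >= tolerance:
        return "confounding"
    return "neither"

# Birth weight and leukaemia by sex (figures from the slide above)
print(classify(1.5, [1.8, 0.9]))
# Obesity and mastitis by age, using exact odds ratios from the cow example
print(classify(1.89, [1.0, 1.0]))
# Obesity and mastitis by breed, using exact odds ratios from the cow example
print(classify(2.83, [2.99, 2.96]))
```

The order of the checks matters: strata are compared with each other first, because once they differ a single adjusted estimate is no longer meaningful and stratum-specific results should be reported instead.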

60
Effect modifier
Belongs to nature
Different effects in different strata
Simple
Useful
Increases knowledge of biological mechanism
Allows targeting of public health action

Confounding factor
Belongs to study
Adjusted OR/RR different from crude OR/RR
Distortion of effect
Creates confusion in data
Prevent it in (design)
Control it by (analysis)

61
** FALLACIES **
Definition

62
Fallacies

HISTORICAL FALLACY

ECOLOGICAL FALLACY
(Cross-Level Bias)

BERKSON'S FALLACY
(Selection Bias in Hospital-Based CC Studies)

HAWTHORNE EFFECT
(Participant Bias)

REGRESSION TO THE MEAN (Davis, 1976)
(Information Bias)

63
HOW TO CONTROL FOR
CONFOUNDERS?

• IN STUDY DESIGN…
– RESTRICTION of subjects according to potential
confounders (i.e. simply don’t include subjects with the
confounder in the study)
– RANDOM ALLOCATION of subjects to study groups to
attempt to even out unknown confounders
– MATCHING subjects on potential confounder thus
assuring even distribution among study groups

64
HOW TO CONTROL FOR
CONFOUNDERS?
• IN DATA ANALYSIS…
– STRATIFIED ANALYSIS using the Mantel Haenszel
method to adjust for confounders
– IMPLEMENT A MATCHED-DESIGN after you have
collected data (frequency or group)
– RESTRICTION is still possible at the analysis stage but
it means throwing away data
– MODEL FITTING using regression techniques

65
Effect of blinding on outcome of trials
of acupuncture for chronic back pain

Bandolier Bias Guide

66
WILL ROGERS' PHENOMENON

Assume that you are tabulating survival for patients with a certain type of tumour. You
separately track survival of patients whose cancer has metastasized and survival of
patients whose cancer remains localized. As you would expect, average survival is longer
for the patients without metastases. Now a fancier scanner becomes available, making it
possible to detect metastases earlier. What happens to the survival of patients in the
two groups?
The group of patients without metastases is now smaller. The patients who are removed
from the group are those with small metastases that could not have been detected
without the new technology. These patients tend to die sooner than the patients without
detectable metastases. By taking away these patients, the average survival of the
patients remaining in the "no metastases" group will improve.
What about the other group? The group of patients with metastases is now larger. The
additional patients, however, are those with small metastases. These patients tend to
live longer than patients with larger metastases. Thus the average survival of all
patients in the "with-metastases" group will improve.

Changing the diagnostic method paradoxically increased the average survival of both groups! This paradox is called the Will Rogers' phenomenon after a quote from the humorist Will Rogers ("When the Okies left Oklahoma and moved to California, they raised the average intelligence in both states").
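
A tiny numerical example makes the paradox concrete. The survival times below (in months) are invented for illustration. Two patients in the "no metastases" group actually have small metastases that only the new scanner can detect; moving them to the other group raises the average of both groups, although no individual patient lives any longer.

```python
from statistics import fmean

# Hypothetical survival times in months (assumptions for illustration)
localized = [60, 58, 55, 30, 28]   # classified as "no metastases" by the old scanner
metastatic = [12, 10, 8]           # metastases visible on the old scanner
reclassified = [30, 28]            # small metastases found only by the new scanner

new_localized = [m for m in localized if m not in reclassified]
new_metastatic = metastatic + reclassified

print(f"no metastases:   {fmean(localized):.1f} -> {fmean(new_localized):.1f}")
print(f"with metastases: {fmean(metastatic):.1f} -> {fmean(new_metastatic):.1f}")
print(f"all patients:    {fmean(localized + metastatic):.1f} -> "
      f"{fmean(new_localized + new_metastatic):.1f}")
```

Both group averages go up while the average over all patients is unchanged, which shows the improvement is produced by reclassification alone.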

67
Cause-and-Effect Relationship
What to look for in observational studies?

Is selection bias present?
In a cohort study, are the participants in the exposed and unexposed groups similar in all respects except for the exposure?
In a case-control study, are the participants in the 2 groups similar in all respects except for the disease status?
Is information bias present?
In a cohort study, was the information about disease status collected in the same manner from both the exposed and unexposed groups?
In a case-control study, was the information about exposure collected in the same manner from both the disease and control groups?
Is confounding present?
Could the result be accounted for by the presence of a factor – age, smoking, sexual behaviour, diet – associated with both the exposure and outcome, but not directly involved in the causal pathway?
If the results cannot be explained by these 3 biases, could they be the result of chance?
What are the relative risks and odds ratios and their confidence intervals?
Is the difference statistically significant, and does the study have adequate power to find a clinically important difference?
If the results still cannot be explained, then and only then might the findings be real and worthy of note.

68
