
Selection and Information Bias in Epidemiological Studies

Dr. Nadira Sultana Kakoly


Every epidemiological study should be
viewed as a measurement exercise

K Rothman
Definition of bias

Any systematic error in an epidemiological
study that results in an incorrect estimate
of the association between exposure and
risk of disease
Three types of bias

• Confounding (Population characteristics)


• Selection bias
• Information bias
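- Confounding may lead to errors in the conclusion of
a study, but when the confounding variables are
known and measured, their effect can be adjusted for
in the analysis
- Selection and information bias are systematic errors
in how a study is designed or conducted and generally
cannot be fixed in the analysis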
Selection bias
• Errors in the process of identifying the study
population
• Preferential selection of subjects related to
their
- case/control status
- exposure status
Direction of Bias

• A bias that overestimates an association is
called bias away from the null
• A bias that underestimates an association is
called bias toward the null

• What is null?
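
The null is the value a measure of association takes when exposure and disease are unrelated; for a ratio measure such as the RR or OR that value is 1. The sketch below is one possible way to read off the direction of a bias by comparing an observed estimate with the true one on the log scale. The function name `bias_direction` and the "across the null" label are illustrative assumptions, not part of the original slides.

```python
import math


def bias_direction(true_ratio, observed_ratio):
    """Classify the direction of bias for a ratio measure (RR or OR).

    Ratios are compared on the log scale, where the null value 1 maps to 0.
    """
    t = math.log(true_ratio)
    o = math.log(observed_ratio)
    if math.isclose(t, o):
        return "no bias"
    if t == 0:
        # True value is the null, so any deviation moves away from it.
        return "away from the null"
    if t * o < 0:
        # Observed estimate lies on the opposite side of 1.
        return "across the null"
    return "toward the null" if abs(o) < abs(t) else "away from the null"


# Values taken from examples elsewhere in this deck.
print(bias_direction(5.0, 1.7))   # misclassified drinkers: toward the null
print(bias_direction(9.0, 13.4))  # non-responding smokers: away from the null
```

Working on the log scale means a protective association (ratio below 1) is handled the same way as a harmful one.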
Selection bias

• Sampling bias (Purposive sampling)


• Ascertainment bias
- Surveillance (asymptomatic cases are missed)
- Referral, admission (Only referred and admitted cases
are included)
- Diagnostic (Exposed cases are more likely to be
diagnosed)
• Participation bias
- Self-selection (volunteerism)
- Non-response, refusal
- Healthy worker effect, survival
Selection bias in
case-control studies
Selection bias

                        Cases             Controls A
                        liver cirrhosis   trauma ward

Heavy alcohol use       80                40
Light/no alcohol use    20                60

OR = 6

How representative are hospitalised trauma patients of the population
which gave rise to the cases?
Selection bias

                        Cases             Controls A     Controls B
                        liver cirrhosis   trauma ward    non-trauma

Heavy alcohol use       80                40             10
Light/no alcohol use    20                60             90

                                          OR = 6         OR = 36
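
As a quick check of the two odds ratios, the cross-product formula OR = ad / bc can be applied to each control group in turn. This is only a minimal sketch; the function name `odds_ratio` and its argument order are assumptions made for illustration.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    return (a * d) / (b * c)


# Cases: 80 heavy drinkers, 20 light/non-drinkers.
or_controls_a = odds_ratio(80, 40, 20, 60)  # trauma-ward controls
or_controls_b = odds_ratio(80, 10, 20, 90)  # non-trauma controls

print(or_controls_a)  # 6.0
print(or_controls_b)  # 36.0
```

The cases are identical in both comparisons; only the choice of control group changes, which is what moves the estimate from 6 to 36.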
Diagnostic bias
Diagnostic approach related to knowing exposure status

                                    Cases            Controls
                                    uterine cancer

Takes oral contraceptives           a                b
Does not take oral contraceptives   c                d

• OC use → breakthrough bleeding → increased chance of
detecting uterine cancer
• Overestimation of “a” → overestimation of OR
Non-response bias
Papanicolaou test     Cases of           Controls
                      cervical cancer

Did not have test     a                  b
Had test              c                  d

Total                 1000               1060

• Controls chosen among women at their homes: 13000 homes
contacted → 1060 controls
• Controls mainly housewives with lower chance of having
test than women gainfully employed
• Underestimation of “b” → overestimation of OR
Selection bias in
cohort studies
Non-response bias

                    lung cancer
                    yes    no     total

Smoker              9      91     100
Non-smoker          1      99     100

RR = (9/100) / (1/100) = 9
                    lung cancer
                    yes    no     total

Sportive smoker*    0      7      7
Unhealthy smoker    9      51     60
Non-smoker          1      99     100

*33 sportive smokers do not participate in the study as they are too
embarrassed about admitting to their smoking

RR = (9/67) / (1/100) = 13.4
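
The two risk ratios can be reproduced with a few lines of arithmetic. This is a sketch only; the function name `risk_ratio` is an assumption for illustration, and the ratio is computed in cross-multiplied form simply to avoid floating-point noise.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk ratio (cases_exp / n_exp) / (cases_unexp / n_unexp)."""
    return (cases_exp * n_unexp) / (cases_unexp * n_exp)


# Full cohort: all 100 smokers take part.
rr_full = risk_ratio(9, 100, 1, 100)

# 33 sportive smokers (none of whom developed lung cancer) do not take part,
# so the smoker denominator shrinks from 100 to 67 while the 9 cases remain.
rr_nonresponse = risk_ratio(9, 67, 1, 100)

print(rr_full)                   # 9.0
print(round(rr_nonresponse, 1))  # 13.4
```

Because the non-responders are all exposed and all disease-free, their absence inflates the risk among the smokers who remain, a bias away from the null.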
Loss to follow-up

• Bias due to differences in completeness of
follow-up between comparison groups
• Example
- Study of disease risk in migrants
- Migrants more likely to return to place of origin when
having disease
→ lost to follow-up
→ lower disease rate among exposed (= migrants)
Minimising selection bias
• Clear definition of study population
• Explicit case and control definitions
• Cases and controls from same population
• Selection of exposed and non-exposed without
knowing disease status (retrospective cohort)
Information bias

• Systematic error in the measurement of
information on exposure or outcome
• Differences in accuracy
- of exposure data between cases and controls
- of outcome data between different exposure groups
Misclassification
Measurement error leads to assigning wrong
exposure or outcome category

Non-differential                  Differential
• Random error                    • Systematic error
• Unrelated to exposure or        • Related to exposure or
  outcome status                    outcome status
• Not a bias                      • Bias
• Weakens measure of              • Measure of association
  association                       distorted in any direction
Two main types of information bias

• Reporting bias
- Recall bias

• Observer bias
- Interviewer bias
- Biased follow-up
Recall bias
Cases remember exposure differently than controls
                               Mothers of
                               children with    Controls
                               malformation

Took tobacco, alcohol, drugs   a                b
Did not take                   c                d

• Mothers of children with malformations will remember past
exposures better than mothers with healthy children
• Overestimation of “a” → overestimation of OR
Interviewer bias

Investigator asks cases and controls differently about exposure

                           Cases of       Controls
                           listeriosis

Eats soft cheese           a              b
Does not eat soft cheese   c              d

• Investigator may probe listeriosis cases about consumption
of soft cheese
• Overestimation of “a” → overestimation of OR
Biased follow-up

Disease is less likely to be diagnosed in the unexposed
than in the exposed

• Example
- Cohort study to investigate risk factors for
mesothelioma
- Difficult histological diagnosis
- Histologist more likely to diagnose specimen as
mesothelioma if asbestos exposure known
Non-differential misclassification

• Misclassification does not depend on values of
other variables
- Exposure classification unrelated to disease status,
or
- Disease classification unrelated to exposure status
• Consequence
- Weakening of measure of association
(“bias towards the null”)
Nondifferential misclassification
• Cohort study: Alcohol → laryngeal cancer

                              Incidence      RR
                              per million

No misclassification
  1,000,000 drinkers          50             5.0
  500,000 nondrinkers         10

50% of drinkers misclassified
  500,000 drinkers            50             1.7
  1,000,000 “nondrinkers”     30
Nondifferential misclassification
• Cohort study: Alcohol → laryngeal cancer

                              Incidence      RR
                              per million

No misclassification
  1,000,000 drinkers          50             5.0
  500,000 nondrinkers         10

50% of drinkers & 33% of
nondrinkers misclassified
  666,667 “drinkers”          40             1.2
  833,333 “nondrinkers”       34
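
The figures in the two misclassification tables can be reproduced by mixing the true groups in the stated proportions and averaging their incidence rates. The sketch below assumes the misclassification is unrelated to disease status (non-differential); the function and parameter names are illustrative only.

```python
def observed_rates(n_exp, n_unexp, rate_exp, rate_unexp,
                   frac_exp_mislabelled=0.0, frac_unexp_mislabelled=0.0):
    """Incidence in the *labelled* exposed and unexposed groups, plus their RR.

    Rates are returned in the unit passed in (here: cases per million).
    Each labelled group is a mixture of correctly and wrongly labelled people,
    so its rate is the size-weighted average of the two true rates.
    """
    exp_moved = n_exp * frac_exp_mislabelled        # drinkers labelled nondrinkers
    unexp_moved = n_unexp * frac_unexp_mislabelled  # nondrinkers labelled drinkers
    exp_kept = n_exp - exp_moved
    unexp_kept = n_unexp - unexp_moved

    rate_lab_exp = ((exp_kept * rate_exp + unexp_moved * rate_unexp)
                    / (exp_kept + unexp_moved))
    rate_lab_unexp = ((unexp_kept * rate_unexp + exp_moved * rate_exp)
                      / (unexp_kept + exp_moved))
    return rate_lab_exp, rate_lab_unexp, rate_lab_exp / rate_lab_unexp


baseline = observed_rates(1_000_000, 500_000, 50, 10)
half_drinkers = observed_rates(1_000_000, 500_000, 50, 10, 0.5)
both_groups = observed_rates(1_000_000, 500_000, 50, 10, 0.5, 1 / 3)

for label, (r1, r0, rr) in [("none", baseline),
                            ("50% drinkers", half_drinkers),
                            ("50% drinkers, 33% nondrinkers", both_groups)]:
    print(label, round(r1), round(r0), round(rr, 1))
```

In both scenarios the observed RR (1.7, then 1.2) lies between the true RR of 5.0 and the null value of 1, which is the "bias towards the null" described above.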
Minimising information bias

• Standardise measurement instruments


• Administer instruments equally to cases and controls
(exposed/unexposed)
• Use multiple sources of information
- Questionnaires
- Direct measurements
- Registries
- Case records
Bias in randomised controlled trials

• Gold-standard: randomised, placebo-controlled,
double-blinded study
• Least biased
- Exposure randomly allocated to subjects -
minimises selection bias
- Masking of exposure status in subjects and study
staff - minimises information bias
Bias in prospective cohort studies

• Loss to follow up
- The major source of bias in cohort studies
- Assume that all do / do not develop outcome?
• Ascertainment and interviewer bias
- Some concern: Knowing exposure may influence how
outcome determined
• Non-response, refusals
- Little concern: Bias arises only if related to both exposure
and outcome
• Recall bias
- No problem: Exposure determined at time of enrolment
Bias in retrospective cohort &
case-control studies
• Ascertainment bias, participation bias,
interviewer bias
- Exposure and disease have already occurred →
differential selection / interviewing of compared
groups possible
• Recall bias
- Cases (or ill) may remember exposures differently
than controls (or healthy)
Control of bias

• Careful study design: some adjustments can
be made during the analysis, but a well-designed
study minimizes the potential for bias at every
level, from general design considerations to
specific features of the data collection process
Choice of study population

• For a case-control study it is always better to
take controls from the population from which the
cases come. This will decrease the likelihood of
non-response, selection and recall bias.
Choice of study population

• For cohort studies the population should be well
defined with respect to employment, occupation
and residence so that bias due to loss to
follow-up can be minimized. The same applies
to clinical trials.
Data collection

Data obtained should be informative and
interpretable.
There are two major aspects of the data collection
procedure that can minimize bias:
1. The construction of specific instruments to
obtain information e.g. questionnaires,
interviews, physical examination
2. Administration of those instruments by study
personnel
Blinding

• The investigator should be blind to the
exposure or the outcome, depending on the
study design
• The same applies to the subjects
Training

• It is important that the investigators are
properly trained
• They should follow specific procedures that
are identical for all subjects (in completing
forms, examining subjects etc.)
Source of exposure and disease
information

• Pre-existing records are the least biased
source of data
• The problem is that they may not contain adequate
information on all factors of interest
• But it is possible to gather information from
multiple sources
Conclusion
• Bias, like confounding and effect modification,
should be considered as a possible alternative
explanation of any statistical observation
• Bias should be dealt with during the design phase,
unlike confounding and effect modification,
which can be dealt with during the analysis
• It is difficult to deal with bias in the analysis
• But if bias is present, estimate its direction
• Misclassification can be corrected using a small
validation study
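
One common way a validation study is used is to estimate the sensitivity and specificity of the exposure measurement and then back-calculate the true counts. The sketch below illustrates that idea under stated assumptions: the observed counts and the sensitivity and specificity values are hypothetical, chosen so that the corrected table matches the alcohol and cirrhosis example (Controls A) earlier in the deck, and the function names are illustrative.

```python
def corrected_exposed(observed_exposed, total, sensitivity, specificity):
    """Back-calculate the true number exposed from the observed number.

    Observed = Se * true + (1 - Sp) * (total - true), solved for `true`.
    Only meaningful when sensitivity + specificity > 1.
    """
    return ((observed_exposed - (1 - specificity) * total)
            / (sensitivity + specificity - 1))


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio: a, b exposed cases/controls; c, d unexposed."""
    return (a * d) / (b * c)


# Hypothetical values, assumed to come from a validation study.
SENSITIVITY, SPECIFICITY = 0.8, 0.9

# Hypothetical observed (misclassified) counts: 100 cases, 100 controls.
cases_obs_exposed, n_cases = 66, 100
controls_obs_exposed, n_controls = 38, 100

or_observed = odds_ratio(cases_obs_exposed, controls_obs_exposed,
                         n_cases - cases_obs_exposed,
                         n_controls - controls_obs_exposed)

a = corrected_exposed(cases_obs_exposed, n_cases, SENSITIVITY, SPECIFICITY)
b = corrected_exposed(controls_obs_exposed, n_controls, SENSITIVITY, SPECIFICITY)
or_corrected = odds_ratio(a, b, n_cases - a, n_controls - b)

print(round(or_observed, 2))   # 3.17
print(round(or_corrected, 2))  # 6.0
```

Here the same sensitivity and specificity are applied to cases and controls, i.e. the misclassification is assumed non-differential; with differential error each group would need its own values.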
Thanks
