
Variability & Bias

Huan Zeng, Ph.D., A/Professor


zenghuan586@aliyun.com
Chongqing Medical University, China
Contents
 Variability, validity
 Bias
 Selection bias
 Information bias
 Confounding bias
Research
results
Variability

True value or reality


Variability

Two reasons :

random error

systematic error
 B Vs A : random error
 D Vs C: systematic error

Were the sights of the gun bent?

The shots in both A and B are centered around the middle, but
in A the shots are less scattered and have less variability.
In C and D, the scatter is similar, but in target D the cluster of
shots is off center.
 It is important to consider the
accuracy and precision of any
measurements made in the
medical setting.
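As a rough illustration of the target analogy, the sketch below summarises repeated measurements by their bias (mean minus true value, standing in for accuracy) and their standard deviation (standing in for precision). The readings and the true value of 120 are hypothetical and are not data from these slides.

```python
from statistics import mean, stdev

TRUE_VALUE = 120.0  # hypothetical true value (e.g., systolic pressure in mmHg)

# Hypothetical repeated readings; the keys mirror targets A-D.
readings = {
    "A": [119, 120, 121, 120, 120],  # centred, little scatter
    "B": [112, 126, 118, 124, 120],  # centred, more scatter (random error)
    "C": [119, 121, 120, 122, 118],  # centred
    "D": [129, 131, 130, 132, 128],  # same scatter as C, shifted (systematic error)
}

def summarise(values, true_value=TRUE_VALUE):
    """Return (bias, spread): mean offset from the true value and sample SD."""
    return mean(values) - true_value, stdev(values)

summary = {name: summarise(vals) for name, vals in readings.items()}
for name, (bias, spread) in summary.items():
    print(f"{name}: bias={bias:+.1f}, SD={spread:.1f}")
```

In this made-up data, B differs from A only in spread, while D differs from C only in bias.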
Table: levels of variability
Levels        Features
Individual    Individual variability
              Measurement variability
Population    Genetic variability between individuals
              Environmental variability
              Measurement variability
Sample        Manner of sampling
              Size of sample
              Measurement variability
Variability within individual
Biological changes:
Occur on a minute-to-minute basis.
e.g., heart rate
Follow a regular diurnal pattern.
e.g., body temperature
Progress with normal development
e.g., height or weight
Variability within individual

Measurement:

Poor calibration of the instrument


Inherent lack of precision of the instrument
Misreading or misrecording of information from the instrument by the technician
Variability within population
 Cumulative variability of individuals
Different genetic constitution
Environmental influence
Populations exhibit more variation than
individuals
e.g., Blood pressure:
Systolic Blood Pressure(SBP): 90~139mmHg
Diastolic Blood Pressure(DBP): 60~89mmHg
Variability in research studies

 We usually cannot study the entire population
 We can study subsets or samples of the population

sampling variability
Examples:

Repeated samples from


the population will give
different estimates of
the true population
values.

Five of them have total serum cholesterol values above 240 mg/dL.
Conclusion:
A larger sample size would result in less variability and would be more likely to represent the source population.
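One way to see why larger samples vary less is the standard error of a sample proportion, sqrt(p(1 - p)/n), under simple random sampling. The sketch below assumes a hypothetical true proportion of 20% with cholesterol above 240 mg/dL; that figure is an assumption for illustration only.

```python
from math import sqrt

def standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return sqrt(p * (1 - p) / n)

p = 0.20  # hypothetical true proportion, not taken from the slides
for n in (25, 100, 400):
    print(f"n={n}: SE={standard_error(p, n):.3f}")
```

Quadrupling the sample size halves the standard error.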
Validity

 concerns the degree to which a


measurement or study reaches a
correct conclusion.
 A measurement or study may lead to an
incorrect conclusion because of the
effects of bias.

 The amount of bias can be determined by


the degree to which the shots are off
target in D.
 Internal validity: is the extent to
which the results of an
investigation accurately reflect the
true situation of the study
population.
 External validity: is the extent to
which the results of a study are
applicable to other populations.
Bias
• Random error
• Systematic error (bias)

Definition: any systematic error in the design, conduct or analysis of a study that results in a mistaken estimate of an exposure's effect on the risk of disease.

Bias leads to a distortion of the results.

It can occur in any research, particularly in observational studies (which lack randomization).
Categories of bias
Selection bias
Information bias
Confounding bias

Based on how bias enters the study


Except for confounding bias, the magnitude of bias cannot be quantified, but its influence on the study results can be inferred.

It is important to discern whether the suspected bias is likely to make an association appear stronger or weaker than it really is.

Figure: bias can push the study results above the true value (overestimate, stronger association) or below it (underestimate, weaker association).
 Selection bias leads to an apparent association.
 The apparent association does not exist in fact.
 The cause: the way we select the cases and the controls, or the exposed and nonexposed individuals, is not correct.
Selection bias
 We cannot recruit the entire target population; we can only select a sample.
 The sample may fail to represent the target population when there are significant differences between the subjects chosen for the study and those not chosen.
 Selection bias may exist:
 when we use volunteers as the sample;
 when we use a convenience sample;
 when there is non-response;
 when there is loss to follow-up.
Selection bias in descriptive studies:

If we do not use random sampling, external validity cannot be ensured.
The key: the sample does not represent the target population.
How to control: use random sampling.
Selection bias in analytic studies (case-control study, cohort study):

The way subjects are selected (inclusion and exclusion, participation, non-response, loss to follow-up) is associated with the exposure.
This association may weaken or strengthen the link between the exposure and the disease, causing bias.
 A study: selection bias resulting from nonresponse of potential study subjects

• Title: Non-responders to a postal questionnaire on respiratory symptoms and diseases
• Authors: Eva Rönmark, Ann Lundqvist, Bo Lundbäck and Lennarth Nyström
• Journal: European Journal of Epidemiology
• Issue Date: 1999
 The aim of this study was to examine the characteristics of the non-responders.
 The prevalence rates of wheezing, long-standing cough, sputum production, attacks of breathlessness, asthma and use of asthma medicines were significantly higher among the non-responders than among the responders, according to both univariate and multivariate logistic regression analyses, in which the influences of age, sex, smoking habits, socio-economic group and area of domicile were taken into account.
 The prevalence of respiratory symptoms and diseases was slightly underestimated in the postal survey.
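The mechanism can be sketched with a small calculation: if non-responders have a higher prevalence than responders, the responder-only estimate falls below the prevalence in the full sample. The counts and prevalences below are hypothetical and are not figures from the cited study.

```python
def prevalence_estimates(n_resp, prev_resp, n_nonresp, prev_nonresp):
    """Return (responder-only prevalence, prevalence in the full sample)."""
    total_cases = n_resp * prev_resp + n_nonresp * prev_nonresp
    return prev_resp, total_cases / (n_resp + n_nonresp)

# Hypothetical survey: 850 responders (10% prevalence), 150 non-responders (16%).
observed, full_sample = prevalence_estimates(850, 0.10, 150, 0.16)
print(f"responders only: {observed:.3f}, full sample: {full_sample:.3f}")
```

With an 85% response rate the underestimate is small, which is consistent with a "slight" underestimation when non-response is limited.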
Types of selection bias
1. Admission rate bias (Berkson's bias)
 Happens in hospital-based case-control studies.
 Only hospital patients are recruited. The following patients are missed: seriously ill patients who die during hospitalization, patients who live far from the hospital, patients who do not go to hospital, and mild cases.
 Thus, a hospital-based case-control study might find a link between two diseases, or between an exposure and a disease, when there is no association between them in the general population.

 For example, if exposed cases are more likely to go to hospital than exposed controls, the OR is overestimated.
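A minimal sketch of how unequal admission rates distort the odds ratio: if each cell of the 2 x 2 table reaches the hospital sample with its own probability, the observed OR equals the true OR multiplied by a ratio of those probabilities. The admission probabilities below are hypothetical.

```python
def observed_odds_ratio(true_or, s_case_exp, s_case_unexp, s_ctrl_exp, s_ctrl_unexp):
    """Observed OR when each 2x2 cell is sampled with its own selection probability.

    OR = (a*d)/(b*c) with a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return true_or * (s_case_exp * s_ctrl_unexp) / (s_ctrl_exp * s_case_unexp)

# Hypothetical: no true association, but exposed cases are admitted more often.
biased = observed_odds_ratio(1.0, s_case_exp=0.5, s_case_unexp=0.3,
                             s_ctrl_exp=0.1, s_ctrl_unexp=0.1)
unbiased = observed_odds_ratio(1.0, 0.2, 0.2, 0.2, 0.2)
print(round(biased, 2), round(unbiased, 2))
```

When all four selection probabilities are equal, the ratio cancels and the observed OR equals the true OR.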
2. Prevalence-incidence bias (Neyman bias)
Example
 A study of cardiovascular disease in Framingham

                            Cohort study                     Case-control study
                            Patients  Nonpatients  Total     Patients  Nonpatients  Total
Cholesterol above cut-off   85        462          547       38        34           72
Cholesterol below cut-off   116       1511         1627      113       117          230
Total                       201       1973         2174      151       151          302

RR = 2.2    OR = 1.16

Exposure: cholesterol
Disease: coronary disease
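The two measures can be recomputed from the cell counts in the table. The sketch below uses the usual definitions of the risk ratio and the odds ratio; the function names are illustrative.

```python
def risk_ratio(a, b, c, d):
    """a, b: exposed with/without disease; c, d: unexposed with/without disease."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Same cell layout as risk_ratio; OR = (a*d)/(b*c)."""
    return (a * d) / (b * c)

rr_cohort = risk_ratio(85, 462, 116, 1511)       # cohort study counts
or_case_control = odds_ratio(38, 34, 113, 117)   # case-control study counts
print(f"RR = {rr_cohort:.1f}, OR = {or_case_control:.2f}")
```

The cohort data show a clear association, while the case-control data based on existing cases show almost none.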
2. Prevalence-incidence bias (Neyman bias)
 Prevalent (existing) cases: have been diagnosed for some time, or have been ill for months or years; used in case-control or cross-sectional studies.
 Newly diagnosed cases: have just been diagnosed; used in cohort studies.

Prevalent patients may have changed their habits or hobbies, which biases the result.

Thus, in this case-control study, the OR is underestimated.

Pay attention!
Consider whether to use existing cases or to recruit only newly diagnosed cases.
If the risk factor of interest is also a prognostic factor, prevalent cases can lead to a biased conclusion.
Control the selection bias
① Researchers should think through possible sources of selection bias across the whole study.
② The selected subjects should represent the target population.
③ Use suitable inclusion and exclusion criteria.
e.g., smoking and lung cancer
(exclude patients with chronic respiratory diseases, such as chronic bronchitis or emphysema)
④ Take measures to gain cooperation from the subjects and achieve a higher response rate.
 Information bias can occur when the means of obtaining information about the subjects in the study are inadequate, so that some of the information gathered regarding exposure and/or disease outcome is incorrect.

 Exists in all types of epidemiological studies.
 Can result from the subjects, the researchers, the instruments and the methods.
Types of information bias

1. Recall bias
Common in case-control studies.
 Cases may try their best to recall past exposures because they take the investigation seriously.
 Controls, however, may take the investigation less seriously and may not remember their past exposures.
Example
 In a study of risk factors for infant leukaemia, Stewart found that a higher percentage of patients' mothers than of mothers in the control group had received X-rays during or before pregnancy.

Patients' mothers: deeply affected by the harm to their children and themselves, so they take the questions seriously and recall carefully.

Nonpatients' mothers: no such deep impression, so they recall carelessly.

The exposure rate in the control group was lower than the true rate.
 This may magnify the real association between infant leukaemia and X-ray exposure, which causes the bias.

 Experts compared hospital records with the mothers' recall. Only 73% of them matched (only 73% of the women recalled accurately).
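The effect of unequal recall can be sketched numerically. Assume, purely for illustration, that cases and controls were truly exposed at the same rate, that cases report every exposure, and that controls report only 70% of theirs. These counts and the 70% figure are hypothetical and are not the study's data.

```python
def odds_ratio(a, b, c, d):
    """a, c: exposed/unexposed cases; b, d: exposed/unexposed controls."""
    return (a * d) / (b * c)

def reported_counts(n_exposed, n_unexposed, recall_sensitivity):
    """Counts after some truly exposed subjects fail to report their exposure."""
    reported_exposed = n_exposed * recall_sensitivity
    return reported_exposed, n_unexposed + (n_exposed - reported_exposed)

# Hypothetical truth: 40 of 100 cases and 40 of 100 controls were exposed.
case_exp, case_unexp = reported_counts(40, 60, recall_sensitivity=1.0)
ctrl_exp, ctrl_unexp = reported_counts(40, 60, recall_sensitivity=0.7)

true_or = odds_ratio(40, 40, 60, 60)
observed_or = odds_ratio(case_exp, ctrl_exp, case_unexp, ctrl_unexp)
print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

Even with no true association, under-reporting among controls alone pushes the observed OR above 1.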
2. Reporting bias

 A subject may be reluctant to report an exposure he or she is aware of because of attitudes, beliefs, and perceptions. If such underreporting is more frequent among either the cases or the controls, bias may result.
Example:
In health examinations, some workers may hide diseases or health problems to protect their jobs.

When occupational hazards are investigated, workers may exaggerate their exposure.

When data are collected on sex, finances, adolescent smoking, or other private matters, subjects are inclined to lie.
3. Exposure suspicion bias
 Caused by the investigators in case-control studies.
 The investigator knows the subjects' disease status and believes that the risk factor is associated with the disease.
 The investigator is very careful and thorough when asking the case group about exposure history. When asking the control group, the investigator's attitude is different (less thorough than with the case group).
 This may lead to bias and magnify the OR.
4. Measurement bias
 Occurs while collecting data on disease outcome in cohort studies.

 An inaccurate testing or measuring method, a faulty instrument, or poor technique by the person operating the instrument can cause measurement bias.

 Investigators' attitudes and methods can also lead to false results.
 For example, some years ago, a great deal of interest centered on the possible relationship of oral contraceptive use to thrombophlebitis.
 It was suggested that physicians monitored patients who had been prescribed oral contraceptives much more closely than they monitored their other patients. As a result, they were more apt to identify cases of thrombophlebitis in patients taking oral contraceptives than in other patients, who were not as well monitored. Thus, just through better ascertainment of thrombophlebitis in women receiving oral contraceptives, an apparent association of thrombophlebitis with oral contraceptive use may be observed, even if no true association exists.
Control information bias

① Researchers should design detailed data collection methods and strict quality control methods before the investigation.
② Use machines and instruments of good quality.
③ Use standard methods, instruments and techniques.
④ Train the investigators before the investigation.
⑤ Blind the investigators as far as possible when collecting data (the investigator does not know whether the subject is a case or control, exposed or unexposed).
⑥ Choose objective indicators to measure, such as biological markers.
III Confounding bias
 A problem posed in many epidemiologic studies is that we observe a true association and are tempted to derive a causal inference when, in fact, the relationship may not be causal.
 This brings us to the subject of confounding, one of the most important problems in observational epidemiologic studies.
 What do we mean by confounding? Definition:
In a study of whether factor A is a cause of disease B, we say that a third factor, factor X, is a confounder if the following are true:

a. Factor X is a known risk factor for disease B.
b. Factor X is associated with factor A, but is not a result of factor A.
Figure. The association between coffee drinking and pancreatic cancer.
A. Causal: the observed association reflects an effect of coffee drinking on the risk of pancreatic cancer.
B. Due to confounding: coffee drinking is associated with smoking, and smoking affects the risk of pancreatic cancer, which produces the observed association.


The relationship between coffee and cancer of the
pancreas
Smoking was a confounder because, although we were interested in a possible relationship between coffee consumption (factor A) and pancreatic cancer (disease B), the following are true of smoking (factor X):

a. It is a known risk factor for pancreatic cancer.
b. It is associated with coffee drinking but is not a result of coffee drinking.
Control Confounding

 In the design and conduct of the study
 Individual matching
 Group matching
 Randomization
 In the analysis
 Stratification
 Multivariable regression
 Adjustment
Control confounding in the design
 Matching -- select study subjects so that the
potential confounders are distributed
equally among the exposed and unexposed
groups (cohort study) or among the cases
and controls (case control study)
1:1 Pair Matching in Case-control study
           Pair 1      Pair 2      Pair 3      Pair 4
Case:      1: M, 50    2: F, 60    3: F, 52    4: M, 60 ......
Control:   1': M, 52   2': F, 58   3': F, 52   4': M, 61 ......

• Matching: same gender, age gap <= 2 years
• Bias from gender and age is then controlled.
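The matching rule can be written as a small check. The sketch below applies it to the four pairs listed above; the dictionary layout and function name are illustrative choices, not part of the slides.

```python
def is_valid_match(case, control, max_age_gap=2):
    """True if the control has the same gender and is within max_age_gap years."""
    return (case["sex"] == control["sex"]
            and abs(case["age"] - control["age"]) <= max_age_gap)

pairs = [
    ({"sex": "M", "age": 50}, {"sex": "M", "age": 52}),  # pair 1
    ({"sex": "F", "age": 60}, {"sex": "F", "age": 58}),  # pair 2
    ({"sex": "F", "age": 52}, {"sex": "F", "age": 52}),  # pair 3
    ({"sex": "M", "age": 60}, {"sex": "M", "age": 61}),  # pair 4
]

all_valid = all(is_valid_match(case, control) for case, control in pairs)
print(all_valid)
```

A candidate control of a different gender, or more than two years older or younger, would be rejected by the same check.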
Control confounding in the design

 Randomization: with sufficient sample size, randomization is likely to control both known and unknown confounders.
Control confounding in the analysis:
stratification
 Example: Case control study of oral
contraceptive use and risk of heart attack.
Age is a confounder.

E (oral contraceptive) D(heart attack)

F (age)
Control confounding in the analysis:
stratification

TOTAL DATA (One 2 x 2 Table)

              Case   Control
OC Use  Yes   39     24
        No    114    154

Crude OR = 2.2
Stratified Data (Two 2 x 2 Tables)

              Age < 40           Age 40 and over
              Case   Control     Case   Control
OC Use  Yes   21     17          18     7
        No    26     59          88     95

Stratum-specific OR:  2.8 (age < 40)    2.8 (age 40 and over)
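The crude and stratum-specific odds ratios can be checked directly from the cell counts. The sketch below uses the cross-product formula; the variable names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """a, c: exposed/unexposed cases; b, d: exposed/unexposed controls."""
    return (a * d) / (b * c)

crude = odds_ratio(39, 24, 114, 154)       # all ages combined
under_40 = odds_ratio(21, 17, 26, 59)      # age < 40
age_40_plus = odds_ratio(18, 7, 88, 95)    # age 40 and over
print(f"crude={crude:.1f}, <40: {under_40:.1f}, 40+: {age_40_plus:.1f}")
```

Both strata give about the same OR, and both are higher than the crude value.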
 After removing the effect of the suspected confounding factor f, the measure of association is aRR(f) or aOR(f), the adjusted RR or OR.
 Using the Mantel-Haenszel method:

The aOR is 2.8

cRR (cOR): crude RR (OR)    aRR (aOR): adjusted RR (OR)
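For reference, the Mantel-Haenszel pooled odds ratio is the sum of a*d/n over the strata divided by the sum of b*c/n, where n is the stratum total. A minimal sketch applied to the oral contraceptive example (cell counts taken from the stratified tables in this example):

```python
def mantel_haenszel_or(strata):
    """Pooled OR over strata of (a, b, c, d) cell counts.

    a, c: exposed/unexposed cases; b, d: exposed/unexposed controls.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

strata = [
    (21, 17, 26, 59),  # age < 40
    (18, 7, 88, 95),   # age 40 and over
]
adjusted_or = mantel_haenszel_or(strata)
print(f"aOR = {adjusted_or:.1f}")
```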
 1. If cOR = aOR(f), then f is not a confounding factor.
 2. If cOR ≠ aOR(f), then f is a confounding factor, and cOR is a biased value.
 3. If cOR > aOR(f), then because of f, cOR overestimates the association between the study factor and the disease.
 4. If cOR < aOR(f), then because of f, cOR underestimates the association between the study factor and the disease.
The same rules apply to cRR and aRR(f).
Confounding bias: (cOR - aOR) / aOR = (2.2 - 2.8) / 2.8 = -21%

This means age really was a confounding factor between oral contraceptive use and MI, and it caused the crude OR to underestimate the association.
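The same arithmetic as a short sketch, with the relative difference between the crude and adjusted estimates expressed as a percentage (the function name is illustrative):

```python
def confounding_bias(crude, adjusted):
    """Relative difference between the crude and adjusted estimates."""
    return (crude - adjusted) / adjusted

bias_pct = confounding_bias(2.2, 2.8) * 100
print(f"{bias_pct:.0f}%")
```

A negative value means the crude estimate is below the adjusted one, so the crude OR understates the association.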
Control confounding in the analysis:
multivariate analysis

 The limitations of stratification: it is difficult to control many variables simultaneously, because a large number of strata will be generated relative to the number of study subjects.
Solution: multivariate analysis
– This is an analysis technique that adjusts for several variables simultaneously.
– Statistical methods (analysis software such as SAS, SPSS, etc.)

Examples: multiple linear regression for continuous variables, logistic regression for case-control data, Cox proportional hazards model for cohort data.
Is an unbiased study ever possible?

The researcher should be critical and careful throughout the study.
Good quality control is needed in design, conduct and data analysis.
Note
Confounding is not an error in the study, but is
rather a true phenomenon that is identified in the
study and must be understood.
Bias is a result of an error in the way the study
has been carried out, but confounder is a valid
finding that describes the nature of the
relationship between several factors and risk of
disease.
However, failure to take confounding into
account in interpreting the results of a study is
indeed an error in the conduct of the study and
can bias the conclusions of the study.
Summary
 Variability
 three levels of variability
 from random error to systematic error
 Bias
 Selection bias and how to control it
 Information bias and how to control it
 Confounding bias and how to control it
Test your knowledge
Study Question:
1. In a case-control study of the relationship between a
cholesterol lowering drug and the risk of developing breast
cancer, control subjects are sampled from participants in a
health screening fair.
A. Selection bias B. information bias C. confounding bias
2. In a cohort study of hormone replacement therapy and the
risk of developing atherosclerotic coronary artery disease, high
socioeconomic status is associated with both use of hormone
replacement therapy and the risk of developing coronary
artery disease.
A. Selection bias B. information bias C. confounding bias
3. In a case-control study of the relationship of stressful life
events and the occurrence of coronary artery disease, the
cases are more likely than the controls to over-report
stressful events.
A. Selection bias B. information bias C. confounding bias
4. In a case-control study of estrogen as a risk factor for
developing breast cancer, women with breast cancer
tend to give more false-positive reports of using estrogen
than do women without breast cancer.
A. Overestimation B. Underestimation C. No effect
