Professional Documents
Culture Documents
Epidemiologists devote much attention to minimizing errors and assessing the impact
of errors that cannot be eliminated. Sources of error can be random or systematic.
A. RANDOM ERROR
Random error is when a value of the sample measurement diverges due to chance
alone from that of the true population value. Random error causes inaccurate measures
of association.
Random error can never be completely eliminated since we can study only a sample of
the population. Sampling error is usually caused by the fact that a small sample is not
representative of all the population’s variables. The best way to reduce sampling error
is to increase the size of the study. Individual variation always occurs and no
measurement is perfectly accurate. Measurement error can be reduced by stringent
protocols, and by making individual measurements as precise as possible.
Investigators need to understand the measurement methods being used in the study,
and the errors that these methods can cause. Ideally, laboratories should be able to
document the accuracy and precision of their measurements by systematic quality
control procedures.
Sample Size
The sample size must be large enough for the study to have sufficient statistical power
to detect the differences deemed important. Sample size calculations can be done with
standard formulae. The following information is needed before the calculation can be
done:
The precision of a study can also be improved by ensuring that the groups are of
appropriate relative size. This is often an issue of concern in case-control studies when
a decision is required on the number of controls to be chosen for each case.
B. SYSTEMATIC ERROR
Systematic error (or bias) occurs in epidemiology when results differ in a systematic
manner from the true values. A study with a small systematic error is said to have a
high accuracy. Accuracy is not affected by sample size. The possible sources of
systematic error in epidemiology are many and varied; over 30 specific types of bias
have been identified. The principal biases are:
Selection Bias
Measurement (or classification) bias.
Selection Bias
Selection bias occurs when there is a systematic difference between the characteristics
of the people selected for a study and the characteristics of those who are not. An
obvious source of selection bias occurs when participants select themselves for a
study, either because they are unwell or because they are particularly worried about an
exposure.
It is well known, for example, that people who respond to an invitation to participate
in a study on the effects of smoking differ in their smoking habits from non-
responders; the latter are usually heavier smokers.
Workers have to be healthy enough to perform their duties; the severely ill and
disabled are usually excluded from employment. Similarly, if a study is based on
examinations done in a health centre and there is no follow-up of participants who do
not turn up, biased results may be produced: unwell patients may be in bed either at
home or in hospital.
Measurement Bias
Recall bias can either exaggerate the degree of effect associated with the exposure as
with people affected by heart disease being more likely to admit to a past lack of
exercise or underestimate it if cases are more likely than controls to deny past
exposure.
If measurement bias occurs equally in the groups being compared, it almost always
results in an underestimate of the true strength of the relationship.
Such non-differential bias may account for apparent discrepancies in the results of
different epidemiological studies. If the investigator, laboratory technician or the
participant knows the exposure status, this knowledge can influence measurements
and cause observer bias .
A double-blind study means that neither the investigators, nor the participants, know
how the latter are classified.
CONFOUNDING
A problem arises if this extraneous factor itself a determinant or risk factor for the
health outcome is unequally distributed between the exposure subgroups.
Confounding occurs when the effects of two exposures (risk factors) have not been
separated and the analysis concludes that the effect is due to one variable rather than
the other.
Randomization
Restriction
Matching.
Stratification
Statistical modelling.
Randomization
In experimental studies, randomization is the ideal method for ensuring that potential
confounding variables are equally distributed among the groups being compared. The
sample sizes have to be sufficiently large to avoid random mal-distribution of such
variables. Randomization avoids the association between potentially confounding
variables and the exposure that is being considered.
Restriction
One way to control confounding is to limit the study to people who have particular
characteristics. For example, in a study on the effects of coffee on coronary heart
disease, participation in the study could be restricted to non-smokers, thus removing
any potential effect of confounding by cigarette smoking.
Matching
Stratification
In large studies it is usually preferable to control for confounding in the analytical
phase rather than in the design phase. Confounding can then be controlled by
stratification, which involves the measurement of the strength of associations in well-
defined and homogeneous categories (strata) of the confounding variable. If age is a
confounder, the association may be measured in, say, 10-year age groups; if sex or
ethnicity is a confounder, the association is measured separately in men and women or
in the different ethnic groups.
VALIDITY
Internal validity
Internal validity is the degree to which the results of an observation are correct for the
particular group of people being studied. For example, measurements of blood
haemoglobin must distinguish accurately participants with anaemia as defined in the
study. Analysis of the blood in a different laboratory may produce different results
because of systematic error, but the evaluation of associations with anaemia, as
measured by one laboratory, may still be internally valid. For a study to be of any use
it must be internally valid, although even a study that is perfectly valid internally may
be of no consequence because the results cannot be compared with other studies.
Internal validity can be threatened by all sources of systematic error but can be
improved by good design and attention to detail.
External validity
External validity or generalizability is the extent to which the results of a study apply
to people not in it (or, for example, to laboratories not involved in it). Internal validity
is necessary for, but does not guarantee, external validity, and is easier to achieve.
External validity requires external quality control of the measurements and
judgements about the degree to which the results of a study can be extrapolated. This
does not require that the study sample be representative of a reference population. For
example, evidence that the effect of lowering blood cholesterol in men is also relevant
to women requires a judgment about the external validity of studies in men. External
validity is assisted by study designs that examine clearly-stated hypotheses in well-
defined populations. The external validity of a study is supported if similar results are
found in studies in other populations.
CONSIDERING CAUSATION
1. Temporal Relationship
The temporal relationship is crucial; the cause must precede the effect. This is usually
self-evident, although difficulties may arise in case-control and cross-sectional studies
when measurements of the possible cause and effect are made at the same time. In
cases where the cause is an exposure that can be at different levels, it is essential that a
high enough level be reached before the disease occurs for the correct temporal
relationship to exist. Repeated measurement of the exposure at more than one point in
time and in different locations may strengthen the evidence.
2. Plausibility
An association is plausible, and thus more likely to be causal, if consistent with other
knowledge. For instance, laboratory experiments may have shown how exposure to
the particular factor could lead to changes associated with the effect measured.
However, biological plausibility is a relative concept, and seemingly implausible
associations may eventually be shown to be causal. For example, the predominant
view on the cause of cholera in the 1830s involved “miasma” rather than contagion.
Contagion was not supported by evidence until Snow’s work was published; much
later, Pasteur and his colleagues identified the causative agent. Lack of plausibility
may simply reflect lack of scientific knowledge. Doubts about the therapeutic effects
of acupuncture and homeopathy may be partly attributable to the absence of
information about a plausible biological mechanism.
3. Consistency
Techniques are available for pooling the results of several studies that have examined
the same issue, particularly randomized controlled trials. This technique is called
meta-analysis, and is used to combine the results of several trials, each of which may
deal with a relatively small sample, to obtain a better overall estimate of effect.
Systematic review uses standardized methods to select and review all relevant studies
on a particular topic with a view to eliminating bias in critical appraisal and synthesis.
One important reason for the apparent inconsistency of the results is that several of the
early studies were based on small samples.
4. Strength
A strong association between possible cause and effect, as measured by the size of the
risk ratio (relative risk), is more likely to be causal than is a weak association, which
could be influenced by confounding or bias. Relative risks greater than 2 can be
considered strong.
For example, cigarette smokers have a twofold increase in the risk of acute myocardial
infarction compared with non-smokers. The risk of lung cancer in smokers, compared
with non-smokers, has been shown in various studies to be increased between fourfold
and twentyfold. However, associations of such magnitude are rare in epidemiology.
The fact that an association is weak does not preclude it from being causal; the
strength of an association depends on the relative prevalence of other possible causes.
For example, weak associations have been found between diet and risk of coronary
heart disease in observational studies; and although experimental studies on selected
populations have been done, no conclusive results have been published. Despite this
lack of evidence, diet is generally thought to be a major causative factor in the high
rate of coronary heart disease in many industrialized countries. The probable reason
for the difficulty in identifying diet as a risk factor for coronary heart disease is that
diets in populations are rather homogeneous and variation over time for one individual
is greater than that between people. If everyone has more or less the same diet, it is
not possible to identify diet as a risk factor. Consequently, ecological evidence gains
importance. This situation has been characterized as one of sick individuals and sick
population, meaning that in many high-income countries, whole populations are at
risk from an adverse factor.
5. Dose–Response Relationship
A dose–response relationship occurs when changes in the level of a possible cause are
associated with changes in the prevalence or incidence of the effect.
6. Reversibility
When the removal of a possible cause results in a reduced disease risk, there is a
greater likelihood that the association is causal. For example, the cessation of cigarette
smoking is associated with a reduction in the risk of lung cancer relative to that in
people who continue to smoke. This finding strengthens the likelihood that cigarette
smoking causes lung cancer. If the cause leads to rapid irreversible changes
that subsequently produce disease whether or not there is continued exposure, then
reversibility cannot be a condition for causality.
7. Study design
Cohort studies. Cohort studies are the next best design because, when well
conducted, bias is minimized.
Again, they are not always available. Although case-control studies are subject
to several forms of bias, the results from large, well-designed investigations of
this kind provide good evidence for the causal nature of an association;
judgements often have to be made in the absence of data from other sources.
However, the time sequence can often be inferred from the way exposure and
effect data is collected. For instance, if it is clear that the health effect is recent
and the exposure to the potential causes is recorded in a questionnaire,
questions about the past may clearly identify exposures before the effect
occurred.
However, there are rare occasions when ecological studies provide good
evidence for establishing causation.
Evidence is often conflicting and due weight must be given to the different types when
decisions are made. In judging the different aspects of causation referred to above, the
correct temporal relationship is essential; once that has been established; the greatest
weight may be given to plausibility, consistency and the dose–response relationship.
The likelihood of a causal association is heightened when many different types of
evidence lead to the same conclusion. Evidence from well-designed studies is
particularly important, especially if they are conducted in a variety of locations.
The most important use of information about causation of diseases and injuries may
be in the area of prevention, which we will discuss in the following chapters. When
the causal pathways are established on the basis of quantitative information from
epidemiological studies, the decisions about prevention may be uncontroversial.
In situations where the causation is not so well established, but the impacts have great
potential public health importance, the “precautionary principle” may be applied to
take preventive action as a safety measure; this is called “precautionary prevention”.
Below are the lists of the main criteria for establishing a screening programme. These
relate to the characteristics of the disorder or disease, its treatment and the screening
test.
Disorder…Well-defined
Prevalence… Known
Natural history …Long period between first signs and overt disease; medically
important disorder for which there is an effective remedy Test choice Simple and
safe
Test performance….Distributions of test values in affected and unaffected
individuals known
Financial…Cost-effective
Facilities…Available or easily provided
Acceptability…..Procedures following a positive result are generally agreed upon
and acceptable to both the screening authorities and to those screened.
Equity…Equity of access to screening services; effective, acceptable and safe
treatment available
Above all, the disease should be one that would prove serious if not diagnosed early;
inborn metabolic defects such as phenylketonuria meet this criterion, as do some
cancers, such as cancer of the cervix.
In addition, several issues need to be addressed before establishing a screening
programme.
Lead time-The disease must have a reasonably long lead time; that is, the interval
between the time when the disease can be first diagnosed by screening and when it is
usually diagnosed in patients presenting with symptoms. Noise-induced hearing loss
has a very long lead time; pancreatic cancer usually has a short one. A short lead time
implies a rapidly progressing disease, and treatment initiated after screening is
unlikely to be more effective than that begun after the more usual diagnostic
procedures.
Screening test-The screening test itself must be cheap, easy to apply, acceptable to
the public, reliable and valid. A test is reliable if it provides consistent results, and
valid if it correctly categorizes people into groups with and without disease, as
measured by its sensitivity and specificity.
1. Sensitivity
2. Specificity
3. Predictive Value
Disease Status
Present Absent Total
Positive a b a+b
Screening test Negative c d c+d
Total a+c b+d (a+c)+(b+d)
NOTES
Decisions on the appropriate criteria for a screening test depend on the consequences
of identifying false negatives and false positives. For a serious condition in newborn
children, it might be preferable to have high sensitivity and to accept the increased
cost of a high number of false positives (reduced specificity). Further follow-up would
then be required to identify the true positives and true negatives.
This relative reduction in mortality from breast cancer of 23%–29% looks less
impressive when considered in absolute terms (the absolute mortality reduction was
0.05% of women screened). A much greater benefit in life-years gained would be
achieved if screening mammography delayed death from breast cancer in younger
women, but unfortunately this is not the case. Finally, the best preventive strategy
does not necessarily include screening.30 Where an important risk factor (such as
tobacco use, raised blood pressure or physical inactivity) can be reduced without
selecting a high-risk group for preventive action, it is better to concentrate on
available resources and use public policy and environmental measures to establish
mass approaches to prevention.