
 PUBLIC HEALTH MATTERS 

Causation and Causal Inference in Epidemiology


Kenneth J. Rothman, DrPH, Sander Greenland, MA, MS, DrPH, C Stat

Concepts of cause and causal inference are largely self-taught from early learning experiences. A model of causation that describes causes in terms of sufficient causes and their component causes illuminates important principles such as multicausality, the dependence of the strength of component causes on the prevalence of complementary component causes, and interaction between component causes.

Philosophers agree that causal propositions cannot be proved, and find flaws or practical limitations in all philosophies of causal inference. Hence, the role of logic, belief, and observation in evaluating causal propositions is not settled. Causal inference in epidemiology is better viewed as an exercise in measurement of an effect rather than as a criterion-guided process for deciding whether an effect is present or not. (Am J Public Health. 2005;95:S144–S150. doi:10.2105/AJPH.2004.059204)

What do we mean by causation? Even among those who study causation as the object of their work, the concept is largely self-taught, cobbled together from early experiences. As a youngster, each person develops and tests an inventory of causal explanations that brings meaning to perceived events and that ultimately leads to more control of those events.

Because our first appreciation of the concept of causation is based on our own direct observations, the resulting concept is limited by the scope of those observations. We typically observe causes with effects that are immediately apparent. For example, when one turns a light switch to the “on” position, one normally sees the instant effect of the light going on. Nevertheless, the causal mechanism for getting a light to shine involves more than turning a light switch to “on.” Suppose a storm has downed the electric lines to the building, or the wiring is faulty, or the bulb is burned out; in any of these cases, turning the switch on will have no effect. One cause of the light going on is having the switch in the proper position, but along with it we must have a supply of power to the circuit, good wiring, and a working bulb. When all other factors are in place, turning the switch will cause the light to go on, but if one or more of the other factors is lacking, the light will not go on.

Despite the tendency to consider a switch as the unique cause of turning on a light, the complete causal mechanism is more intricate, and the switch is only one component of several. The tendency to identify the switch as the unique cause stems from its usual role as the final factor that acts in the causal mechanism. The wiring can be considered part of the causal mechanism, but once it is put in place, it seldom warrants further attention. The switch, however, is often the only part of the mechanism that needs to be activated to obtain the effect of turning on the light. The effect usually occurs immediately after turning on the switch, and as a result we slip into the frame of thinking in which we identify the switch as a unique cause. The inadequacy of this assumption is emphasized when the bulb goes bad and needs to be replaced. These concepts of causation that are established empirically early in life are too rudimentary to serve well as the basis for scientific theories. To enlarge upon them, we need a more general conceptual model that can serve as a common starting point in discussions of causal theories.

SUFFICIENT AND COMPONENT CAUSES

The concept and definition of causation engender continuing debate among philosophers. Nevertheless, researchers interested in causal phenomena must adopt a working definition. We can define a cause of a specific disease event as an antecedent event, condition, or characteristic that was necessary for the occurrence of the disease at the moment it occurred, given that other conditions are fixed. In other words, a cause of a disease event is an event, condition, or characteristic that preceded the disease event and without which the disease event either would not have occurred at all or would not have occurred until some later time. Under this definition it may be that no specific event, condition, or characteristic is sufficient by itself to produce disease. This is not a definition, then, of a complete causal mechanism, but only a component of it. A “sufficient cause,” which means a complete causal mechanism, can be defined as a set of minimal conditions and events that inevitably produce disease; “minimal” implies that all of the conditions or events are necessary to that occurrence. In disease etiology, the completion of a sufficient cause may be considered equivalent to the onset of disease. (Onset here refers to the onset of the earliest stage of the disease process, rather than the onset of signs or symptoms.) For biological effects, most and sometimes all of the components of a sufficient cause are unknown.1

For example, tobacco smoking is a cause of lung cancer, but by itself it is not a sufficient cause. First, the term smoking is too imprecise to be used in a causal description. One must specify the type of smoke (e.g., cigarette, cigar, pipe), whether it is filtered or unfiltered, the manner and frequency of inhalation, and the onset and duration of smoking. More importantly, smoking, even defined explicitly, will not cause cancer in everyone. Apparently, there are some people who, by virtue of their genetic makeup or previous experience, are susceptible to the effects of smoking, and others who are not. These susceptibility factors are other components in the various causal mechanisms through which smoking causes lung cancer.

Figure 1 provides a schematic diagram of sufficient causes in a hypothetical individual. Each constellation of component causes represented in Figure 1 is minimally sufficient to produce the disease; that is, there is no redundant or extraneous component cause. Each one is a necessary part of that specific causal
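As a rough illustration of the sufficient-cause model, the Python sketch below represents each sufficient cause as a set of component causes; disease occurs once every component of at least one set is present. The component labels A through J, their grouping into three sets, and the function names are assumptions made for this example, not data from the article.

```python
# Hypothetical component causes, grouped into three sufficient causes.
# The labels and groupings are illustrative assumptions only.
SUFFICIENT_CAUSES = [
    frozenset({"A", "B", "C", "D", "E"}),
    frozenset({"A", "B", "F", "G", "H"}),
    frozenset({"A", "C", "F", "I", "J"}),
]


def disease_occurs(present, sufficient_causes=SUFFICIENT_CAUSES):
    """Return True if every component of at least one sufficient cause is present."""
    present = set(present)
    return any(cause <= present for cause in sufficient_causes)


def is_necessary(component, sufficient_causes=SUFFICIENT_CAUSES):
    """A component cause is necessary if it belongs to every sufficient cause."""
    return all(component in cause for cause in sufficient_causes)


# Every component that appears in any of the assumed sufficient causes.
everything = set().union(*SUFFICIENT_CAUSES)

# Blocking B leaves the third sufficient cause intact, so disease can still occur.
still_possible_without_b = disease_occurs(everything - {"B"})

# Blocking A, which sits in all three sets, prevents every case.
still_possible_without_a = disease_occurs(everything - {"A"})
```

Under these assumed sets, A is a necessary cause but is not sufficient on its own, while B is neither necessary nor sufficient, yet blocking it would still prevent the cases that arise through the first two mechanisms.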

S144 | Public Health Matters | Peer Reviewed | Rothman and Greenland American Journal of Public Health | Supplement 1, 2005, Vol 95, No. S1

mechanism. A specific component cause may play a role in one, two, or all three of the causal mechanisms pictured.

FIGURE 1: Three sufficient causes of disease.

MULTICAUSALITY

The model of causation implied by Figure 1 illuminates several important principles regarding causes. Perhaps the most important of these principles is self-evident from the model: A given disease can be caused by more than one causal mechanism, and every causal mechanism involves the joint action of a multitude of component causes. Consider as an example the cause of a broken hip. Suppose that someone experiences a traumatic injury to the head that leads to a permanent disturbance in equilibrium. Many years later, the faulty equilibrium plays a causal role in a fall that occurs while the person is walking on an icy path. The fall results in a broken hip. Other factors playing a causal role for the broken hip could include the type of shoe the person was wearing, the lack of a handrail along the path, a strong wind, or the body weight of the person, among others. The complete causal mechanism involves a multitude of factors. Some factors, such as the person’s weight and the earlier injury that resulted in the equilibrium disturbance, reflect earlier events that have had a lingering effect. Some causal components are genetic and would affect the person’s weight, gait, behavior, recovery from the earlier trauma, and so forth. Other factors, such as the force of the wind, are environmental. It is a reasonably safe assertion that there are nearly always some genetic and some environmental component causes in every causal mechanism. Thus, even an event such as a fall on an icy path leading to a broken hip is part of a complicated causal mechanism that involves many component causes.

The importance of multicausality is that most identified causes are neither necessary nor sufficient to produce disease. Nevertheless, a cause need not be either necessary or sufficient for its removal to result in disease prevention. If a component cause that is neither necessary nor sufficient is blocked, a substantial amount of disease may be prevented. That the cause is not necessary implies that some disease may still occur after the cause is blocked, but a component cause will nevertheless be a necessary cause for some of the cases that occur. That the component cause is not sufficient implies that other component causes must interact with it to produce the disease, and that blocking any of them would result in prevention of some cases of disease. Thus, one need not identify every component cause to prevent some cases of disease. In the law, a distinction is sometimes made among component causes to identify those that may be considered a “proximate” cause, implying a more direct connection or responsibility for the outcome.2

STRENGTH OF A CAUSE

In epidemiology, the strength of a factor’s effect is usually measured by the change in disease frequency produced by introducing the factor into a population. This change may be measured in absolute or relative terms. In either case, the strength of an effect may have tremendous public health significance, but it may have little biological significance. The reason is that given a specific causal mechanism, any of the component causes can have strong or weak effects. The actual identity of the constituent components of the causal mechanism amounts to the biology of causation. In contrast, the strength of a factor’s effect depends on the time-specific distribution of its causal complements in the population. Over a span of time, the strength of the effect of a given factor on the occurrence of a given disease may change, because the prevalence of its causal complements in various causal mechanisms may also change. The causal mechanisms in which the factor and its complements act could remain unchanged, however.

INTERACTION AMONG CAUSES

The causal pie model posits that several causal components act in concert to produce an effect. “Acting in concert” does not necessarily imply that factors must act at the same time. Consider the example above of the person who sustained trauma to the head that resulted in an equilibrium disturbance, which led, years later, to a fall on an icy path. The earlier head trauma played a causal role in the later hip fracture; so did the weather conditions on the day of the fracture. If both of these factors played a causal role in the hip fracture, then they interacted with one another to cause the fracture, despite the fact that their time of action is many years apart. We would say that any and all of the factors in the same causal mechanism for disease interact with one another to cause disease. Thus, the head trauma interacted with the weather conditions, as well as with other component causes such as the type of footwear, the absence of a handhold, and any other conditions that were necessary to the causal mechanism of the fall and the broken hip that resulted. One can view each causal pie as a set of interacting causal components. This model provides a biological basis for a concept of
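The dependence of a factor's strength on the prevalence of its causal complements can be sketched numerically. The Python example below assumes one sufficient cause that needs both a factor X and a complement C, that C is present independently of X with probability `p_complement`, and that all other mechanisms contribute a fixed background risk. The function name and every number are hypothetical.

```python
def risks(p_complement, background=0.01):
    """Return (risk with X, risk without X) under the assumptions stated above.

    Without X, only the background mechanisms can act. With X, disease also
    occurs whenever the complement C is present to complete the sufficient cause.
    """
    risk_unexposed = background
    risk_exposed = 1 - (1 - background) * (1 - p_complement)
    return risk_exposed, risk_unexposed


# The same mechanism looks "weak" where C is rare and "strong" where C is common.
for p in (0.01, 0.5):
    exposed, unexposed = risks(p)
    print(
        f"P(C)={p}: risk difference={exposed - unexposed:.4f}, "
        f"risk ratio={exposed / unexposed:.1f}"
    )
```

The biology in this sketch never changes; only the prevalence of the complement does, and with it both the absolute and the relative measure of the factor's effect.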


interaction distinct from the usual statistical view of interaction.3

SUM OF ATTRIBUTABLE FRACTIONS

Consider the data on rates of head and neck cancer according to whether people have been cigarette smokers, alcohol drinkers, or both (Table 1). Suppose that the differences in the rates all reflect causal effects. Among those people who are smokers and also alcohol drinkers, what proportion of the cases is attributable to the effect of smoking? We know that the rate for these people is 12 cases per 100 000 person-years. If these same people were not smokers, we can infer that their rate of head and neck cancer would be 3 cases per 100 000 person-years. If this difference reflects the causal role of smoking, then we might infer that 9 of every 12 cases, or 75%, are attributable to smoking among those who both smoke and drink alcohol. If we turn the question around and ask what proportion of disease among these same people is attributable to alcohol drinking, we would be able to attribute 8 of every 12 cases, or 67%, to alcohol drinking.

TABLE 1: Hypothetical Rates of Head and Neck Cancer (Cases per 100 000 Person-Years) According to Smoking Status and Alcohol Drinking

                     Alcohol Drinking
Smoking Status       No        Yes
Nonsmoker            1         3
Smoker               4         12

How can we attribute 75% of the cases to smoking and 67% to alcohol drinking among those who are exposed to both? We can because some cases are counted more than once. Smoking and alcohol interact in some cases of head and neck cancer, and these cases are attributable both to smoking and to alcohol drinking. One consequence of interaction is that we should not expect that the proportions of disease attributable to various component causes will sum to 100%.

A widely discussed (though unpublished) paper from the 1970s, written by scientists at the National Institutes of Health, proposed that as much as 40% of cancer is attributable to occupational exposures. Many scientists thought that this fraction was an overestimate, and argued against this claim.4,5 One of the arguments used in rebuttal was as follows: x percent of cancer is caused by smoking, y percent by diet, z percent by alcohol, and so on; when all these percentages are added up, only a small percentage, much less than 40%, is left for occupational causes. But this rebuttal is fallacious, because it is based on the naive view that every case of disease has a single cause, and that two causes cannot both contribute to the same case of cancer. In fact, since diet, smoking, asbestos, and various occupational exposures, along with other factors, interact with one another and with genetic factors to cause cancer, each case of cancer could be attributed repeatedly to many separate component causes. The sum of disease attributable to various component causes thus has no upper limit.

A single cause or category of causes that is present in every sufficient cause of disease will have an attributable fraction of 100%. Much publicity attended the pronouncement in 1960 that as much as 90% of cancer is caused by environmental factors.6 Since “environment” can be thought of as an all-embracing category that represents nongenetic causes, which must be present to some extent in every sufficient cause, it is clear on a priori grounds that 100% of any disease is environmentally caused. Thus, Higginson’s estimate of 90% was an underestimate.

Similarly, one can show that 100% of any disease is inherited. MacMahon7 cited the example of yellow shanks,8 a trait occurring in certain strains of fowl fed yellow corn. Both the right set of genes and the yellow-corn diet are necessary to produce yellow shanks. A farmer with several strains of fowl, feeding them all only yellow corn, would consider yellow shanks to be a genetic condition, since only one strain would get yellow shanks, despite all strains getting the same diet. A different farmer, who owned only the strain liable to get yellow shanks, but who fed some of the birds yellow corn and others white corn, would consider yellow shanks to be an environmentally determined condition because it depends on diet. In reality, yellow shanks is determined by both genes and environment; there is no reasonable way to allocate a portion of the causation to either genes or environment. Similarly, every case of every disease has some environmental and some genetic component causes, and therefore every case can be attributed both to genes and to environment. No paradox exists as long as it is understood that the fractions of disease attributable to genes and to environment overlap.

Many researchers have spent considerable effort in developing heritability indices, which are supposed to measure the fraction of disease that is inherited. Unfortunately, these indices only assess the relative role of environmental and genetic causes of disease in a particular setting. For example, some genetic causes may be necessary components of every causal mechanism. If everyone in a population has an identical set of the genes that cause disease, however, their effect is not included in heritability indices, despite the fact that having these genes is a cause of the disease. The two farmers in the example above would offer very different values for the heritability of yellow shanks, despite the fact that the condition is always 100% dependent on having certain genes.

If all genetic factors that determine disease are taken into account, whether or not they vary within populations, then 100% of disease can be said to be inherited. Analogously, 100% of any disease is environmentally caused, even those diseases that we often consider purely genetic. Phenylketonuria, for example, is considered by many to be purely genetic. Nonetheless, the mental retardation that it may cause can be prevented by appropriate dietary intervention.

The treatment for phenylketonuria illustrates the interaction of genes and environment to cause a disease commonly thought to be purely genetic. What about an apparently purely environmental cause of death such as death from an automobile accident? It is easy to conceive of genetic traits that lead to psychiatric problems such as alcoholism, which in turn lead to drunk driving and consequent fatality. Consider another more extreme environmental example, being killed by lightning. Partially heritable psychiatric conditions can influence whether someone will take shelter during a lightning storm; genetic traits such as
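The arithmetic behind the 75% and 67% figures can be reproduced directly from the rates in Table 1. The Python sketch below assumes, as the text does, that the differences in rates are wholly causal; the dictionary layout and the function name are illustrative choices rather than anything prescribed by the article.

```python
# Rates per 100 000 person-years from Table 1, keyed by (smoker, drinker).
RATES = {
    (False, False): 1,
    (False, True): 3,
    (True, False): 4,
    (True, True): 12,
}


def attributable_fraction(rate_exposed, rate_reference):
    """Share of the exposed rate that would vanish if the exposure were removed."""
    return (rate_exposed - rate_reference) / rate_exposed


both = RATES[(True, True)]

# Remove smoking from smoker-drinkers: compare with nonsmoking drinkers (12 vs 3).
af_smoking = attributable_fraction(both, RATES[(False, True)])

# Remove alcohol from smoker-drinkers: compare with nondrinking smokers (12 vs 4).
af_alcohol = attributable_fraction(both, RATES[(True, False)])

print(f"attributable to smoking: {af_smoking:.0%}")
print(f"attributable to alcohol: {af_alcohol:.0%}")
print(f"sum of the two fractions: {af_smoking + af_alcohol:.0%}")
```

The two fractions add up to more than 100% because cases that require both exposures are counted once under smoking and once under alcohol.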


athletic ability may influence the likelihood of being outside when a lightning storm strikes; and having an outdoor occupation or pastime that is more frequent among men (or women), and in that sense genetic, would also influence the probability of getting killed by lightning. The argument may seem stretched on this example, but the point that every case of disease has both genetic and environmental causes is defensible and has important implications for research.

MAKING CAUSAL INFERENCES

Causal inference may be viewed as a special case of the more general process of scientific reasoning, about which there is substantial scholarly debate among scientists and philosophers.

Impossibility of Proof

Vigorous debate is a characteristic of modern scientific philosophy, no less in epidemiology than in other areas. Perhaps the most important common thread that emerges from the debated philosophies stems from 18th-century empiricist David Hume’s observation that proof is impossible in empirical science. This simple fact is especially important to epidemiologists, who often face the criticism that proof is impossible in epidemiology, with the implication that it is possible in other scientific disciplines. Such criticism may stem from a view that experiments are the definitive source of scientific knowledge. Such a view is mistaken on at least two counts. First, the nonexperimental nature of a science does not preclude impressive scientific discoveries; the myriad examples include plate tectonics, the evolution of species, planets orbiting other stars, and the effects of cigarette smoking on human health. Even when they are possible, experiments (including randomized trials) do not provide anything approaching proof, and in fact may be controversial, contradictory, or irreproducible. The cold-fusion debacle demonstrates well that neither physical nor experimental science is immune to such problems.

Some experimental scientists hold that epidemiologic relations are only suggestive, and believe that detailed laboratory study of mechanisms within single individuals can reveal cause–effect relations with certainty. This view overlooks the fact that all relations are suggestive in exactly the manner discussed by Hume: even the most careful and detailed mechanistic dissection of individual events cannot provide more than associations, albeit at a finer level. Laboratory studies often involve a degree of observer control that cannot be approached in epidemiology; it is only this control, not the level of observation, that can strengthen the inferences from laboratory studies. Furthermore, such control is no guarantee against error. All of the fruits of scientific work, in epidemiology or other disciplines, are at best only tentative formulations of a description of nature, even when the work itself is carried out without mistakes.

Testing Competing Epidemiologic Theories

Biological knowledge about epidemiologic hypotheses is often scant, making the hypotheses themselves at times little more than vague statements of causal association between exposure and disease, such as “smoking causes cardiovascular disease.” These vague hypotheses have only vague consequences that can be difficult to test. To cope with this vagueness, epidemiologists usually focus on testing the negation of the causal hypothesis, that is, the null hypothesis that the exposure does not have a causal relation to disease. Then, any observed association can potentially refute the hypothesis, subject to the assumption (auxiliary hypothesis) that biases are absent.

If the causal mechanism is stated specifically enough, epidemiologic observations under some circumstances might provide crucial tests of competing non-null causal hypotheses. On the other hand, many epidemiologic studies are not designed to test a causal hypothesis. For example, epidemiologic data related to the finding that women who took replacement estrogen therapy were at a considerably higher risk for endometrial cancer was examined by Horwitz and Feinstein, who conjectured a competing theory to explain the association: they proposed that women taking estrogen experienced symptoms such as bleeding that induced them to consult a physician.9 The resulting diagnostic workup led to the detection of endometrial cancer at an earlier stage in these women, as compared with women not taking estrogens. Many epidemiologic observations could have been and were used to evaluate these competing hypotheses. The causal theory predicted that the risk of endometrial cancer would tend to increase with increasing use (dose, frequency, and duration) of estrogens, as for other carcinogenic exposures. The detection bias theory, on the other hand, predicted that women who had used estrogens only for a short while would have the greatest risk, since the symptoms related to estrogen use that led to the medical consultation tend to appear soon after use begins. Because the association of recent estrogen use and endometrial cancer was the same in both long-term and short-term estrogen users, the detection bias theory was refuted as an explanation for all but a small fraction of endometrial cancer cases occurring after estrogen use.

The endometrial cancer example illustrates a critical point in understanding the process of causal inference in epidemiologic studies: many of the hypotheses being evaluated in the interpretation of epidemiologic studies are noncausal hypotheses, in the sense of involving no causal connection between the study exposure and the disease. For example, hypotheses that amount to explanations of how specific types of bias could have led to an association between exposure and disease are the usual alternatives to the primary study hypothesis that the epidemiologist needs to consider in drawing inferences. Much of the interpretation of epidemiologic studies amounts to the testing of such noncausal explanations.

THE DUBIOUS VALUE OF CAUSAL CRITERIA

In practice, how do epidemiologists separate out the causal from the noncausal explanations? Despite philosophic criticisms of inductive inference, inductively oriented causal criteria have commonly been used to make such inferences. If a set of necessary and sufficient causal criteria could be used to distinguish causal from noncausal relations in epidemiologic studies, the job of the scientist would be eased considerably. With such


criteria, all the concerns about the logic or lack thereof in causal inference could be forgotten: it would only be necessary to consult the checklist of criteria to see if a relation were causal. We know from philosophy that a set of sufficient criteria does not exist. Nevertheless, lists of causal criteria have become popular, possibly because they seem to provide a road map through complicated territory.

Hill’s Criteria

A commonly used set of criteria was proposed by Hill;10 it was an expansion of a set of criteria offered previously in the landmark surgeon general’s report on smoking and health,11 which in turn were anticipated by the inductive canons of John Stuart Mill12 and the rules given by Hume.13

Hill suggested that the following aspects of an association be considered in attempting to distinguish causal from noncausal associations: (1) strength, (2) consistency, (3) specificity, (4) temporality, (5) biological gradient, (6) plausibility, (7) coherence, (8) experimental evidence, and (9) analogy. These criteria suffer from their inductivist origin, but their popularity demands a more specific discussion of their utility.

1. Strength. Hill’s argument is essentially that strong associations are more likely to be causal than weak associations because, if they could be explained by some other factor, the effect of that factor would have to be even stronger than the observed association and therefore would have become evident. Weak associations, on the other hand, are more likely to be explained by undetected biases. To some extent this is a reasonable argument but, as Hill himself acknowledged, the fact that an association is weak does not rule out a causal connection. A commonly cited counterexample is the relation between cigarette smoking and cardiovascular disease: one explanation for this relation being weak is that cardiovascular disease is common, making any ratio measure of effect comparatively small compared with ratio measures for diseases that are less common.14 Nevertheless, cigarette smoking is not seriously doubted as a cause of cardiovascular disease. Another example would be passive smoking and lung cancer, a weak association that few consider to be noncausal.

Counterexamples of strong but noncausal associations are also not hard to find; any study with strong confounding illustrates the phenomenon. For example, consider the strong but noncausal relation between Down syndrome and birth rank, which is confounded by the relation between Down syndrome and maternal age. Of course, once the confounding factor is identified, the association is diminished by adjustment for the factor. These examples remind us that a strong association is neither necessary nor sufficient for causality, nor is weakness necessary or sufficient for absence of causality. Furthermore, neither relative risk nor any other measure of association is a biologically consistent feature of an association; as described above, such measures of association are characteristics of a given population that depend on the relative prevalence of other causes in that population. A strong association serves only to rule out hypotheses that the association is entirely due to one weak unmeasured confounder or other source of modest bias.

2. Consistency. Consistency refers to the repeated observation of an association in different populations under different circumstances. Lack of consistency, however, does not rule out a causal association, because some effects are produced by their causes only under unusual circumstances. More precisely, the effect of a causal agent cannot occur unless the complementary component causes act, or have already acted, to complete a sufficient cause. These conditions will not always be met. Thus, transfusions can cause HIV infection but they do not always do so: the virus must also be present. Tampon use can cause toxic shock syndrome, but only rarely when certain other, perhaps unknown, conditions are met. Consistency is apparent only after all the relevant details of a causal mechanism are understood, which is to say very seldom. Furthermore, even studies of exactly the same phenomena can be expected to yield different results simply because they differ in their methods and random errors. Consistency serves only to rule out hypotheses that the association is attributable to some factor that varies across studies.

One mistake in implementing the consistency criterion is so common that it deserves special mention. It is sometimes claimed that a literature or set of results is inconsistent simply because some results are “statistically significant” and some are not. This sort of evaluation is completely fallacious even if one accepts the use of significance testing methods: The results (effect estimates) from the studies could all be identical even if many were significant and many were not, the difference in significance arising solely because of differences in the standard errors or sizes of the studies. Furthermore, this fallacy is not eliminated by “standardizing” estimates.

3. Specificity. The criterion of specificity requires that a cause leads to a single effect, not multiple effects. This argument has often been advanced to refute causal interpretations of exposures that appear to relate to myriad effects (for example, by those seeking to exonerate smoking as a cause of lung cancer). Unfortunately, the criterion is invalid as a general rule. Causes of a given effect cannot be expected to lack all other effects. In fact, everyday experience teaches us repeatedly that single events or conditions may have many effects. Smoking is an excellent example; it leads to many effects in the smoker, in part because smoking involves exposure to a wide range of agents.15,16 The existence of one effect of an exposure does not detract from the possibility that another effect exists.

On the other hand, Weiss16 convincingly argued that specificity can be used to distinguish some causal hypotheses from noncausal hypotheses, when the causal hypothesis predicts a relation with one outcome but no relation with another outcome. Thus, specificity can come into play when it can be logically deduced from the causal hypothesis in question.

4. Temporality. Temporality refers to the necessity for a cause to precede an effect in time. This criterion is inarguable, insofar as any claimed observation of causation must involve the putative cause C preceding the putative effect D. It does not, however, follow that a reverse time order is evidence against the hypothesis that C can cause D. Rather, observations in which C followed D merely show that C could not have caused D in these instances; they provide no evidence for or against the hypothesis that C can cause D in those instances in which it precedes D.

5. Biological gradient. Biological gradient refers to the presence of a unidirectional dose–response curve. We often expect such a


monotonic relation to exist. For example, more smoking means more carcinogen exposure and more tissue damage, hence more opportunity for carcinogenesis. Some causal associations, however, show a single jump (threshold) rather than a monotonic trend; an example is the association between DES and adenocarcinoma of the vagina. A possible explanation is that the doses of DES that were administered were all sufficiently great to produce the maximum effect from DES. Under this hypothesis, for all those exposed to DES, the development of disease would depend entirely on other component causes.

Alcohol consumption and mortality is another example. Death rates are higher among nondrinkers than among moderate drinkers, but ascend to the highest levels for heavy drinkers. There is considerable debate about which parts of the J-shaped dose–response curve are causally related to alcohol consumption and which parts are noncausal artifacts stemming from confounding or other biases. Some studies appear to find only an increasing relation between alcohol consumption and mortality, possibly because the categories of alcohol consumption are too broad to distinguish different rates among moderate drinkers and nondrinkers.

Associations that do show a monotonic trend in disease frequency with increasing levels of exposure are not necessarily causal; confounding can result in a monotonic relation between a noncausal risk factor and disease if the confounding factor itself demonstrates a biological gradient in its relation with disease. The noncausal relation between birth rank and Down syndrome mentioned in part 1 above shows a biological gradient that merely reflects the progressive relation between maternal age and Down syndrome occurrence.

These examples imply that the existence of a monotonic association is neither necessary nor sufficient for a causal relation. A nonmonotonic relation only refutes those causal hypotheses specific enough to predict a monotonic dose–response curve.

6. Plausibility. Plausibility refers to the biological plausibility of the hypothesis, an important concern but one that is far from objective or absolute. Sartwell, emphasizing this point, cited the 1861 comments of Cheever on the etiology of typhus before its mode of transmission (via body lice) was known: “It could be no more ridiculous for the stranger who passed the night in the steerage of an emigrant ship to ascribe the typhus, which he there contracted, to the vermin with which bodies of the sick might be infested. An adequate cause, one reasonable in itself, must correct the coincidences of simple experience.”17 What was to Cheever an implausible explanation turned out to be the correct explanation, since it was indeed the vermin that caused the typhus infection. Such is the problem with plausibility: it is too often not based on logic or data, but only on prior beliefs. This is not to say that biological knowledge should be discounted when evaluating a new hypothesis, but only to point out the difficulty in applying that knowledge.

The Bayesian approach to inference attempts to deal with this problem by requiring that one quantify, on a probability (0 to 1) scale, the certainty that one has in prior beliefs, as well as in new hypotheses. This quantification displays the dogmatism or open-mindedness of the analyst in a public fashion, with certainty values near 1 or 0 betraying a strong commitment of the analyst for or against a hypothesis. It can also provide a means of testing those quantified beliefs against new evidence.12 Nevertheless, the Bayesian approach cannot transform plausibility into an objective causal criterion.

7. Coherence. Taken from the surgeon general’s report on smoking and health,11 the term coherence implies that a cause-and-effect interpretation for an association does not conflict with what is known of the natural history and biology of the disease. The examples Hill gave for coherence, such as the histopathologic effect of smoking on bronchial epithelium (in reference to the association between smoking and lung cancer) or the difference in lung cancer incidence by gender, could reasonably be considered examples of plausibility as well as coherence; the distinction appears to be a fine one. Hill emphasized that the absence of coherent information, as distinguished, apparently, from the presence of conflicting information, should not be taken as evidence against an association being considered causal. On the other hand, presence of conflicting information may indeed refute a hypothesis, but one must always remember that the conflicting information may be mistaken or misinterpreted.18

8. Experimental evidence. It is not clear what Hill meant by experimental evidence. It might have referred to evidence from laboratory experiments on animals, or to evidence from human experiments. Evidence from human experiments, however, is seldom available for most epidemiologic research questions, and animal evidence relates to different species and usually to levels of exposure very different from those humans experience. From Hill’s examples, it seems that what he had in mind for experimental evidence was the result of removal of some harmful exposure in an intervention or prevention program, rather than the results of laboratory experiments. The lack of availability of such evidence would at least be a pragmatic difficulty in making this a criterion for inference. Logically, however, experimental evidence is not a criterion but a test of the causal hypothesis, a test that is simply unavailable in most circumstances. Although experimental tests can be much stronger than other tests, they are often not as decisive as thought, because of difficulties in interpretation. For example, one can attempt to test the hypothesis that malaria is caused by swamp gas by draining swamps in some areas and not in others to see if the malaria rates among residents are affected by the draining. As predicted by the hypothesis, the rates will drop in the areas where the swamps are drained. As Popper emphasized, however, there are always many alternative explanations for the outcome of every experiment. In this example, one alternative, which happens to be correct, is that mosquitoes are responsible for malaria transmission.

9. Analogy. Whatever insight might be derived from analogy is handicapped by the inventive imagination of scientists who can find analogies everywhere. At best, analogy provides a source of more elaborate hypotheses about the associations under study; absence of such analogies only reflects lack of imagination or experience, not falsity of the hypothesis.

Is There Any Use for Causal Criteria?

As is evident, the standards of epidemiologic evidence offered by Hill are saddled with reservations and exceptions. Hill himself was ambivalent about the utility of these “viewpoints” (he did not use the word criteria in the paper). On the one hand, he asked, “In what circumstances can we pass from this observed
association to a verdict of causation?” Yet despite speaking of verdicts on causation, he disagreed that any “hard-and-fast rules of evidence” existed by which to judge causation: This conclusion accords with the views of Hume, Popper, and others that causal inferences cannot attain the certainty of logical deductions. Although some scientists continue to promulgate causal criteria as aids to inference, others argue that it is actually detrimental to cloud the inferential process by considering checklist criteria.19 An intermediate, refutationist approach seeks to transform the criteria into deductive tests of causal hypotheses.20,21 Such an approach avoids the temptation to use causal criteria simply to buttress pet theories at hand, and instead allows epidemiologists to focus on evaluating competing causal theories using crucial observations.

CRITERIA TO JUDGE WHETHER SCIENTIFIC EVIDENCE IS VALID

Just as causal criteria cannot be used to establish the validity of an inference, there are no criteria that can be used to establish the validity of data or evidence. There are methods by which validity can be assessed, but this assessment would not resemble anything like the application of rigid criteria. Some of the difficulty can be understood by taking the view that scientific evidence can usually be viewed as a form of measurement. If an epidemiologic study sets out to assess the relation between exposure to tobacco smoke and lung cancer risk, the results can and should be framed as a measure of causal effect, such as the ratio of the risk of lung cancer among smokers to the risk among nonsmokers. Like any measurement, the measurement of a causal effect is subject to measurement error. For a scientific study, measurement error encompasses more than the error that we might have in mind when we attempt to measure the length of a piece of carpet. In addition to statistical error, the measurement error subsumes problems that relate to study design, including subject selection and retention, information acquisition, and uncontrolled confounding and other sources of bias. There are many individual sources of possible error. It is not sufficient to characterize a study as having or not having any of these sources of error, since nearly every study will have nearly every type of error. The real issue is to quantify the errors. As there is no precise cutoff with respect to how much error can be tolerated before a study must be considered invalid, there is no alternative to the quantification of study errors to the extent possible.

Although there are no absolute criteria for assessing the validity of scientific evidence, it is still possible to assess the validity of a study. What is required is much more than the application of a list of criteria. Instead, one must apply thorough criticism, with the goal of obtaining a quantified evaluation of the total error that afflicts the study. This type of assessment is not one that can be done easily by someone who lacks the skills and training of a scientist familiar with the subject matter and the scientific methods that were employed. Neither can it be applied readily by judges in court, nor by scientists who either lack the requisite knowledge or who do not take the time to penetrate the work.

About the Authors
Kenneth J. Rothman is with the Boston University Medical Center, Boston, Mass. Sander Greenland is with the University of California, Los Angeles.
Requests for reprints should be sent to Kenneth J. Rothman, DrPH, Boston University School of Public Health, Department of Epidemiology, 715 Albany St., Boston, MA 02118 (e-mail: krothman@bu.edu).
This article was accepted November 18, 2004.

Contributors
Kenneth J. Rothman and Sander Greenland participated equally in the planning and writing of this article.

Acknowledgments
This work is largely abridged from chapter 2 of Modern Epidemiology, 2nd ed., by K. J. Rothman and S. Greenland, Lippincott, Williams & Wilkins, 1998, and chapter 2 of Epidemiology: An Introduction by K. J. Rothman, Oxford University Press, 2002.

References
1. Rothman KJ. Causes. Am J Epidemiol. 1976;104:587–592.
2. Honoré A. Causation in the Law. In: Zalta EN, ed. Stanford Encyclopedia of Philosophy. Winter 2001 ed. Stanford, Calif: Stanford University; 2001. Available at: http://plato.stanford.edu/archives/win2001/entries/causation-law.
3. Rothman KJ, Greenland S. Modern Epidemiology. Philadelphia, Pa: Lippincott; 1998: chap 18.
4. Higginson J. Proportion of cancer due to occupation. Prev Med. 1980;9:180–188.
5. Ephron E. Apocalyptics: Cancer and the Big Lie: How Environmental Politics Controls What We Know about Cancer. New York, NY: Simon and Schuster; 1984.
6. Higginson J. Population studies in cancer. Acta Unio Internat Contra Cancrum. 1960;16:1667–1670.
7. MacMahon B. Gene-environment interaction in human disease. J Psychiatr Res. 1968;6:393–402.
8. Hogben L. Nature and Nurture. London, England: Williams and Norgate; 1933.
9. Horwitz RI, Feinstein AR. Alternative analytic methods for case-control studies of estrogens and endometrial cancer. N Engl J Med. 1978;299:1089–1094.
10. Hill AB. The environment and disease: association or causation? Proc R Soc Med. 1965;58:295–300.
11. Smoking and Health: Report of the Advisory Committee to the Surgeon General of the Public Health Service. Washington, DC: US Department of Health, Education, and Welfare; 1964. Public Health Service Publication No. 1103.
12. Mill JS. A System of Logic, Ratiocinative and Inductive. 5th ed. London, England: Parker, Son and Bowin, 1862. Cited in Clark DW, MacMahon B, eds. Preventive and Community Medicine. 2nd ed. Boston, Mass: Little, Brown; 1981:chap 2.
13. Hume D. A Treatise of Human Nature. (Originally published in 1739.) Oxford University Press edition, with an Analytical Index by L. A. Selby-Bigge, published 1888. Second edition with text revised and notes by P. H. Nidditch, 1978.
14. Rothman KJ, Poole C. A strengthening programme for weak associations. Int J Epidemiol. 1988;17(Suppl):955–959.
15. Smith GD. Specificity as a criterion for causation: a premature burial? Int J Epidemiol. 2002;31:710–713.
16. Weiss NS. Can the specificity of an association be rehabilitated as a basis for supporting a causal hypothesis? Epidemiology. 2002;13:6–8.
17. Sartwell P. On the methodology of investigations of etiologic factors in chronic diseases: further comments. J Chron Dis. 1960;11:61–63.
18. Popper KR. The Logic of Scientific Discovery. New York, NY: Harper & Row; 1959 (first published in German in 1934).
19. Lanes SF, Poole C. “Truth in packaging?” The unwrapping of epidemiologic research. J Occup Med. 1984;26:571–574.
20. Maclure M. Popperian refutation in epidemiology. Am J Epidemiol. 1985;121:343–350.
21. Weed D. On the logic of causal inference. Am J Epidemiol. 1986;123:965–979.
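
Worked illustration. The closing section argues that study results should be framed as a measurement of effect, such as the ratio of risk among smokers to risk among nonsmokers, and that the error in that measurement should be quantified. The sketch below is one possible way to do the first step, not a method given in the article: it computes a risk ratio with a conventional Wald interval on the log scale. The function name and the counts are hypothetical, invented for illustration. The interval reflects statistical (random) error only; selection bias, information bias, and confounding, which the article stresses, are not captured by it.

```python
import math


def risk_ratio_with_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Return (risk ratio, lower limit, upper limit).

    Uses a Wald interval on the log scale for cumulative-incidence data.
    The interval covers random error only, not bias or confounding.
    """
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, not taken from any study.
rr, lower, upper = risk_ratio_with_ci(30, 1000, 10, 1000)
print(f"risk ratio = {rr:.2f}, 95% interval {lower:.2f} to {upper:.2f}")
```

Reporting the estimate together with its interval, rather than a yes-or-no verdict, is in the spirit of treating causal inference as measurement; a fuller assessment would also attempt to quantify the systematic errors.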
