Sander Greenland: Epidemiology, Vol. 9, No. 3. (May, 1998), Pp. 322-332

Probability Logic and Probabilistic Induction
Sander Greenland
Epidemiology, Vol. 9, No. 3. (May, 1998), pp. 322-332.
Stable URL:
http://links.jstor.org/sici?sici=1044-3983%28199805%299%3A3%3C322%3APLAPI%3E2.0.CO%3B2-Y
Epidemiology is currently published by Lippincott Williams & Wilkins.
This article reviews some philosophical aspects of probability and describes how probability logic can give precise meanings to the concepts of inductive support, corroboration, refutation, and related notions, as well as provide a foundation for logically sound statistical inference. Probability logic also provides a basis for recognizing prior distributions as an integral component of statistical analysis, rather than the current misleading practice of pretending that statistics applied to observational data are objective. This basis is important, because the use of realistic priors in a statistical analysis can yield more stringent tests of hypotheses and more accurate estimates than conventional procedures. (Epidemiology 1998;9:322-332)
Keywords: Bayesian analysis, induction, inference, logic, probability, statistics.
From the Department of Epidemiology, UCLA School of Public Health, Los Angeles, CA 90095-1772 (address for correspondence).
Submitted February 18, 1997; final version accepted November 17, 1997.
Editors' note: See related editorial on page 233 of this issue.
© 1998 by Epidemiology Resources Inc.

Over the past two decades, there has been a dramatic resurgence of Bayesian philosophy and methodology in statistics, as reflected in recent textbooks1 as well as journal articles. This resurgence has as yet had little impact in epidemiology, which instead has experienced lively arguments between Popperian and non-Popperian (but not necessarily Bayesian) philosophical positions (see, for example, the debates in the volumes edited by Greenland4 and Rothman5). This gap between statistics and epidemiology is in part due to differing attitudes toward mathematics and computing, which often seem to be the pride and joy of statisticians but poorly connected to the underlying epidemiologic reality.

I here review some basic elements from the philosophy of probability which may be useful for bridging the gap. Central among these elements is probability logic, which provides an extension of deductive logic to reasoning under uncertainty and which forms the basis of certain arguments given for inductive inference in Bayesian philosophy and certain arguments against inductive inference in Popperian philosophy. Because intuitive reasoning under uncertainty is poor11,12 and because epidemiologic inference involves so many uncertainties (for example, about uncontrolled biases), one could argue that probability logic should be a centerpiece of epidemiologic training. Instead, the topic is absent from most epidemiology and statistics texts, even those that purport to address foundational issues,13,14 while classes on probability and statistics usually cover only mathematical models for probability and the statistical methods derived from those models.

Of necessity, the present review must be limited to just those elements essential for understanding the basic issues. It skips many concepts and viewpoints entirely; it is also ahistorical, even though the history of the ideas is illuminating. For more thorough coverage, one may consult any of a number of philosophic treatises. Hacking18 provides a superb history of the early origins of probabilistic concepts and controversies, while Lad3 provides many interesting details of subjective Bayesian history.

Probability Logic

In a companion paper, I have tried to document that there are ambiguous and contradictory definitions for the word "induction." Notions of probabilistic reasoning suffer from at least as many problems, as witnessed by the controversies surrounding the foundations of probability and statistics (controversies long kept hidden from students, lest standard analysis methodologies be called into question). Nonetheless, beyond these controversies one may discern a logical foundation for deriving uncertain conclusions from uncertain premises when certainties are measured by probabilities. The present section outlines that foundation.

There are two major classes of probability definitions, "objective" and "subjective." Within these classes there are many variants; this is especially true of "objective probability," which subsumes frequency, propensity, fiducial, and necessarist or logical probability (confusingly, "logical probability" is only a special case of probability logic). I will represent the dichotomy among definitions prevalent today by the two most common definitions in statistics, the frequency and the subjective Bayesian definitions.

The frequency or frequentist definition asserts that probabilities are limits of sequences of relative frequencies (proportions) of events. Because relative frequencies are observable, limits of such sequences are purported to be physical properties of systems or mechanisms that generate sequences of events; hence, frequency probabilities are sometimes called physical probabilities. In its most pure form, frequentist theory denies any meaning to probabilities of individual events, such as the outcome of a given coin toss or patient.20 This limitation of the theory has led to the development of theories of physical probability that allow individual probabilities, such as propensity theory.16,21

Individual probabilities are also allowed under the subjective Bayesian or personalist definition, which treats probabilities as constructs of an observer's mind. These constructs are supposed to correspond to the observer's "rational certainty" about a statement, where rational certainty means only that the certainties are constrained to follow the axioms of probability. Because such probabilities vary from person to person, they are sometimes called personal certainties, credibilities, personal probabilities, or degrees of belief.

The terms "objective" and "subjective" confer some misleading connotations that tend to bias naive readers away from the subjective view.8 For example, the word "subjective" suggests elements of arbitrariness or irrationality, whereas knowledgeable critics of subjective Bayesian probability often complain that it is too stringent in its demands for rational probability assignment (see, for example, the discussion in Ref 22). In contrast, the word "objective" suggests direct observability (like the height of Mount Everest); nonetheless, limits of sequences of relative frequencies are defined in terms of infinite sequences, which are not directly observable. This metaphysical property of frequentist probabilities is usually overlooked; instead, such probabilities are commonly described as referring to "the long run," which is rarely given a precise definition. Other theories of objective probability share this metaphysical character, including propensity theory.

It should be noted that the two definitions of probability just described are not mutually exclusive: Some authors believe that physical probabilities exist and can be estimated, but also use subjective probabilities to measure both personal degrees of uncertainty and physical probabilities. Problems arise only because measures of physical probabilities (such as traditional P-values and confidence limits) are routinely misinterpreted as measures of uncertainty about hypotheses.13,14

Despite the compatibility of objective and subjective theories of probability, the metaphysical character of objective theories has led some Bayesians to deny the very existence of "objective" probabilities. One such argument is roughly as follows: "Limits of sequences of relative frequencies" are really only mental constructs that are built and modified to follow observed relative frequencies; that is, so-called "objective probabilities" are only subjective probabilities constructed to mimic the magnitudes of observed proportions. A related objectivist view is that physical probabilities are never more than theories about how certain relative frequencies will unfold.21 As such, they cannot be validly deduced from observation of event frequencies (for the same reason that no general theory can be validly deduced from observations alone). Consequently, "objective probabilities" can never be established as facts; they are instead hypothesized laws governing physical behavior, or hypothetical properties (propensities) of certain types of objects. To complicate matters further, there are forms of Bayesian statistics and probability logic that are based on "objective" probability theory20; hence, there is a need to distinguish "objective" from "subjective" Bayesianism. The latter has become so influential today, however, that most modern discussions of Bayesianism and probability logic (including the present one) focus on the subjective theory.

Subjective probabilities can apply to statements about events, and so are often confused with physical probabilities. As an example, suppose you read a report of a randomized trial. The investigators might be 95% certain of the truth of the statement: "The lower and upper 95% confidence limits for the risk difference contain the true effect." If derived from a well conducted randomized trial, that may be a perfectly reasonable subjective probability to have if there are few other data available. Nonetheless, following standard frequentist theory, the physical probability that those limits contain the true value is either one (if they do contain the true value) or zero (if they do not). There is no conflict here; the sense of conflict arises in part because, in ordinary language, the event in question (true effect between the 95% limits) must be described by a statement that the event occurred. This statement is not usually set off by quotation marks, as done here. Thus, ordinary language and ordinary thinking do not distinguish the event (to which the physical probability refers) and the statement describing the event (to which the subjective probability refers). This distinction is important, however, for avoiding misuse of frequentist statistics such as P-values.

AXIOMS OF PROBABILITY

Despite confusion and conflict, virtually all writers agree that probabilities should follow a few simple axioms and definitions. From those rudiments, one can validly deduce a vast body of logical consequences, known as probability theory. This axiomatic and definitional agreement results in many parallel structures in objective and subjective probabilistic systems, despite the fact that the objective system refers to the physical world and the subjective system refers to mental worlds. The key difference is that physical probabilities can apply only to physical events or states, whereas subjective probabilities can apply to any precise declarative statement, whether it concerns physical events or states or a hypothesis that expresses a general law of nature.

The first axiom requires probabilities to be nonnegative. The second axiom requires that every logically inevitable event (in objective terms) or tautology (in subjective terms) have a probability of one. For example,
the statement "The confidence interval either will or will not contain the true value" is a tautology (that is, it is logically inevitably true, regardless of any facts), and the event it describes is logically inevitable; the statement and event cover all possibilities. Therefore, the objective probability of the event (if it exists) must be one; analogously, according to the subjective theory, we should set our subjective probabilities for the statement to one.

The third axiom requires that if A and B are mutually exclusive, the probability of "A or B" must equal the sum of the probability of A and the probability of B. Two events are mutually exclusive if no more than one of them can happen; two statements are mutually exclusive (or mutually inconsistent) if no more than one of them can be true. For example, the following two statements and the events they describe are mutually exclusive: "The observed risk difference will be greater than the true effect" and "the observed risk difference will equal the true effect." Axiom 3 asserts that the sum of the objective probabilities of these two events must equal the objective probability of the event described by "the observed risk difference will be greater than or equal to the true effect" (if these probabilities exist). In parallel, Axiom 3 asserts that we should set our subjective probabilities so that the sum of probabilities for the first two statements equals the probability of the last statement.

To summarize, let Pr stand for probability, and let A and B stand for any two events (in the objective theory) or statements (in the subjective theory). The above three axioms then assert that objective probabilities do satisfy and subjective probabilities should satisfy

A1) Pr(A) ≥ 0
A2) Pr(A) = 1 if A is a tautology
A3) Pr(A or B) = Pr(A) + Pr(B) if A and B are mutually exclusive.

(Percentages can be used in place of proportions; Axiom 2 then asserts that Pr(A) = 100% if A is logically inevitable.)

JUSTIFICATION OF THE AXIOMS

Axioms 1 and 2 have almost no content given Axiom 3. In essence, they assert only that we should measure all probabilities (whether frequencies or certainties) on a proportion scale ranging from 0 = never or impossible to 1 = always or inevitable. We can always do so: For example, if we measure our certainties about statements using odds, we need only divide our odds by one plus the odds to transform our certainties to a 0-to-1 scale.

For frequentist theory, the above three axioms are assertions about how physical probabilities behave. When expressed as proportions or percentages of a total, common physical quantities (for example, counts, areas, weights) obey the above axioms: Proportions are nonnegative (Axiom 1), the total is 100% of itself (Axiom 2), and the proportion or percentage contributed by two nonoverlapping (exclusive) parts of the total equals the sum of the proportions or percentages contributed by each part separately (Axiom 3). For example, if 30% of the marbles in a bag are red and 20% are blue, then 20% + 30% = 50% of the marbles are red or blue. Because physical probabilities are limits of sequences of relative frequencies and the latter are proportions, such probabilities must have the same bounds and additive behavior as proportions.

For the subjective theory, the axioms are normative rules about how we ought to constrain our personal probabilities. Several justifications have been offered for these constraints. An argument paralleling the frequentist justification is the following: Suppose we believe that a given event has a physical probability, and we know that probability or have a generally accepted estimate of it, as with certain games of chance and with quantum events. It has been proposed that we should then set our subjective probability for the statement of the event to the known physical probability. This rule or axiom has been called the Principal Principle of subjective probability. Because physical probabilities (if they exist) obey the above three probability axioms, we should make sure that our subjective probabilities do so as well in order to ensure that our predictions are as well calibrated as possible, that is, to ensure that our predictions of future event frequencies (for example, incidence rates) are as close as possible to the event frequencies that actually come to pass.

THE DUTCH BOOK ARGUMENT

The axiom justifications given above will not do for those who deny that physical probabilities exist. There is, however, yet another rationale for Axioms 1-3, called the "Dutch Book argument." The premise of this argument is that you are willing to "put your money where your mouth is," in the following sense. Let us say you would bet on your probability assignments, up to a total stake of s dollars per assignment, if for each assignment Pr(A) you would accept either of the following bets, at my choice:

1. You would wager $sPr(A) on A true against my $s(1 - Pr(A)) on A false;
2. You would wager $s(1 - Pr(A)) on A false against my $sPr(A) on A true.

In other words, you offer betting odds of Pr(A)/(1 - Pr(A)) on A true, and will bet either way at those odds as long as the total money at stake does not exceed a certain amount. For example, suppose A is the statement that a given study will exhibit a negative association, you assign Pr(A) = 0.60, and you are willing to bet up to a total stake of a dollar per assignment. You would then be willing to wager 60 cents on a negative association against my 40 cents on a nonnegative association, and equally willing to trade sides by wagering 40 cents on a nonnegative association against my 60 cents on a negative association.
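
The two wagers and the odds transformation described above can be sketched in a few lines of Python. This is an illustration only: the function names (`betting_odds`, `odds_to_probability`, `wagers`, `net_gain_bet1`) and the choice of cents as the unit are assumptions introduced here, not part of the original argument. The sketch computes what each side puts up for a given Pr(A) and total stake s, the implied odds Pr(A)/(1 - Pr(A)), and the inverse rule of dividing the odds by one plus the odds.

```python
from fractions import Fraction


def betting_odds(p):
    """Odds on 'A true' implied by the assignment p = Pr(A)."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("odds are finite only for 0 <= Pr(A) < 1")
    return p / (1 - p)


def odds_to_probability(odds):
    """Map odds back to the 0-to-1 scale: odds divided by one plus the odds."""
    odds = Fraction(odds)
    return odds / (1 + odds)


def wagers(p, stake):
    """Return the two bets as (your wager, my wager) pairs.

    Bet 1: you back 'A true'; bet 2: you back 'A false'.
    """
    p, stake = Fraction(p), Fraction(stake)
    on_true = stake * p          # s * Pr(A)
    on_false = stake * (1 - p)   # s * (1 - Pr(A))
    return (on_true, on_false), (on_false, on_true)


def net_gain_bet1(p, stake, a_is_true):
    """Your net gain from bet 1: win my wager if A is true, else lose yours."""
    (yours, mine), _ = wagers(p, stake)
    return mine if a_is_true else -yours


# Example from the text: Pr(A) = 0.60 with a total stake of 100 cents.
p = Fraction(60, 100)
bet1, bet2 = wagers(p, 100)
print(bet1)              # your 60 cents against my 40 cents
print(bet2)              # your 40 cents against my 60 cents
print(betting_odds(p))   # odds of 3/2 on A true
```

Exact fractions are used rather than floats so that the 60/40 split comes out without rounding error. Under your own assignment, bet 1 has an expected gain of Pr(A) x s(1 - Pr(A)) - (1 - Pr(A)) x sPr(A) = 0, which is one way to see why you would accept either side.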
Ramsey and DeFinetti26 made the following discovery: If you are willing to bet on your assignments and your assignments violate any of the three probability axioms, it will be possible for other people to set up a system of bets against you, based on your probabilities, such that they will be guaranteed to win money from you no matter what the truth of the statements in the bets; in other words, you can be forced into sure loss (Appendix 1 gives a brief proof of this result). Conversely, if your assignments obey the probability axioms, no one will be able to force your loss with a system of bets based on your probabilities. (A system of bets that forces loss on a bettor is called a "Dutch Book.")

Some writers find the Dutch Book argument so compelling that they define a system of personal probability assignments to be coherent if and only if it satisfies Axioms 1-3.3,6,7 Others dismiss the argument, commenting that it may be compelling when gambling, but science is about testing hypotheses, not gambling on them (see, for example, the discussion in Ref 22). Nonetheless, these critics tend to be believers in physical probabilities; for them, the good frequentist properties of Bayesian procedures can supply another rationale for the use of Bayesian statistics in random sample surveys and randomized trials.22 It can also be argued that applications of science involve gambles on hypotheses; for example, in banning the asthma drug fenoterol, authorities would be gambling in favor of the hypothesis that the death rate would be lower with the ban than without.

In the objective theory, Axiom 3 is extended to include infinite sequences of events. Such a leap to the infinite has been resisted by some but not all7 subjective Bayesians. Fortunately, this divergence has no consequence for the present discussion and so will not be considered further.

CONDITIONAL PROBABILITY AND INDEPENDENCE

In the objective theory, we condition on an event C by examining limits of relative frequencies among events of the form "A and C," where A is another event. This leads to the following axiom for the conditional probability of A given C, denoted Pr(A | C):

A4) If Pr(C) > 0, Pr(A | C) = Pr(A and C)/Pr(C).

In the objective theory, this axiom is usually referred to as a definition, but its justification is analogous to those for the other three axioms: Proportions and percentages of ordinary physical quantities will follow Eq 4. For example, if 60% of the marbles in a bag are red and 30% of the marbles are red and small, the percentage that are small among those that are red is 30/60 = 50%.

One interpretation of Eq 4 is as an axiom that shows how to modify or update our certainty about statement A if we learn that C is correct. For example, suppose A is the hypothesis that "first-trimester retinol supplements can increase risk of limb reduction defects (LRD) in humans" and C is the hypothesis that "retinol supplements can increase LRD risk in rats." If you were 50% certain that rats can be affected (C) and 40% certain that the LRD risks of both humans and rats can be affected (A and C), Axiom 4 instructs you to become 0.40/0.50 = 80% certain that humans can be affected (A) if you are given that rats can, indeed, be affected (C). Like the other three probability axioms, Axiom 4 can be justified by a Dutch Book argument.

An event or statement A is independent of another event or statement B if conditioning on B does not change the probability of A: Pr(A | B) = Pr(A). If A is independent of B, then B is independent of A, so the independence relation is symmetric. That is, if Pr(A | B) = Pr(A), then

Pr(A and B) = [Pr(A and B)/Pr(B)] Pr(B) = Pr(A | B) Pr(B) = Pr(A) Pr(B),

and so

Pr(B | A) = Pr(A and B)/Pr(A) = Pr(A) Pr(B)/Pr(A) = Pr(B).

In the subjective theory, independence of A and B is a property of your probability assignment; it means that learning B is true will not alter your probability for A, and that learning A is true will not alter your probability for B.

Finally, A is conditionally independent of B given another event or statement C if further conditioning on B does not change the probability of A after conditioning on C; that is, if Pr(A | B and C) = Pr(A | C). This relation is also symmetric, in that it implies Pr(A and B | C) = Pr(A | C) Pr(B | C) and Pr(B | A and C) = Pr(B | C). In the subjective theory, it means that learning B is true will not alter your probability for A and learning A is true will not alter your probability for B, once you are given that C is true.

Relations between Hypotheses and Observations in Probability Logic

For the remainder of the paper, let H stand for a hypothesis and let B stand for the outcome of an observation process, with both H and B of uncertain status a priori; that is, 0 < Pr(B) < 1 and 0 < Pr(H) < 1. In typical problems, H is an assertion of a causal relation, whereas B is a description of a study and the data it obtained. The following definitions qualitatively characterize the dependence of H on B within a system of probability assignments Pr( ):

1. B proves H means that B would render H certain: Pr(H | B) = 1.
2. B supports H means that B would raise the probability of H: Pr(H | B) > Pr(H).
3. B is neutral with respect to H means that B would not change the probability of H: Pr(H | B) = Pr(H); that is, H is independent of B.
4. B undermines or countersupports H means that B would reduce the probability of H: Pr(H | B) < Pr(H).
5. B refutes H means that B would render H certainly wrong: Pr(H | B) = 0.

It is common to see "confirms" used as a synonym for "supports" and "disconfirms" used as a synonym for "undermines," but I feel these terms have connotations too suggestive of "proves" and "refutes."

In ordinary English, one could also use "corroborates" as a synonym for "supports," but Popper21 established another meaning for qualitative corroboration in the philosophy of science:

6. B corroborates H means that finding B false would refute H: Pr(H | not B) = 0.

Appendix 2 shows that corroboration defined in this manner implies but is not implied by support, and neither implies nor is implied by proof; thus, corroboration is a strong form of support, though not as strong a form as proof. Corroboration is closely related to traditional notions of prediction, defined by

7. H predicts B means that H certainly implies B: Pr(H implies B) = 1.

Because H certainly implies B if and only if B is certain given H, we could equivalently define "H predicts B" to mean that B is certain given H: Pr(B | H) = 1.

A number of authors have attempted to define measures of support or corroboration. Although no measure has prevailed in the literature, it has been recognized that an appropriate measure would have to involve comparison of probabilities. Nonetheless, some proposed measures (such as relative likelihood) are based on comparison of observation probabilities, rather than hypothesis probabilities.

Popperian writings often emphasize the roles of refutation and corroboration in scientific research, while criticizing notions of proof of hypotheses. From the perspective of subjective probability, refutation and corroboration are as criticizable as proof because they demand an absolute certainty (probabilities of 0 or 1). A strength of the subjective theory is that it provides precise concepts of support and countersupport without invoking absolute certainty.

PROBABILISTIC INDUCTION AND BAYES' THEOREM

The general idea of probabilistic induction is that observations may somehow induce observers to make probability assignments, or at least induce observers to change their assignments. Such ideas can be traced back to the 17th century16,18; since then, a number of concepts in probability logic have been interpreted as probabilistic induction.

One interpretation is that probabilistic induction is the process of setting our probabilities for an unobserved outcome to the frequency of that outcome in an appropriate set of previous observations. This concept has also been called statistical induction, although it was explicated by Hume long before statistics was established as a topic distinct from probability.27 The concept long remained imprecise because of the vagueness surrounding the concept of "an appropriate set of previous observations." In modern subjective theory, however, "appropriate" is defined in terms of exchangeability, a concept that will be discussed below.

A simpler interpretation is that probabilistic induction corresponds to use of any probability theorem to update (change) one's probability assignments in light of new observations; that is, to compute new assignments conditioned on the new observations. In particular, in modern subjective theory, probabilistic induction usually refers to the deductive process of updating probabilities using Bayes' Theorem and its consequences. The theorem, also known as Bayes' rule or Bayes' formula, is a simple formula which shows how to move from an initial or prior probability Pr(H) for a hypothesis H to an updated or posterior probability Pr(H | B) based on a new observation B.6,7 There are several versions of the theorem; the original form derived by Rev. Thomas Bayes, a contemporary of Hume's, goes as follows:

8. A new observation B should change the probability of H via multiplication by the factor Pr(B | H)/Pr(B); that is,

Pr(H | B) = Pr(H) Pr(B | H)/Pr(B).

Proof:

Pr(H | B) = Pr(H and B)/Pr(B) = [Pr(H and B)/Pr(H)] / [Pr(B)/Pr(H)] = Pr(H) Pr(B | H)/Pr(B).

An immediate corollary is that B alters the probability of H by the same proportion as H alters the probability of B:

9. Pr(H | B)/Pr(H) = Pr(B | H)/Pr(B).

Many other relations between observations and hypotheses can be derived from Bayes' Theorem. Here are some simple but nonetheless statistically useful examples:

10. B corroborates H if and only if H predicts B; that is, Pr(H | not B) = 0 if and only if Pr(B | H) = 1.
11. B supports H if and only if B is more probable under H; that is, Pr(H | B) > Pr(H) if and only if Pr(B | H) > Pr(B).
12. B undermines H if and only if B is less probable under H; that is, Pr(H | B) < Pr(H) if and only if Pr(B | H) < Pr(B).
13. B refutes H if and only if H predicts not-B; that is, Pr(H | B) = 0 if and only if Pr(not B | H) = 1.

Bayes' Theorem and its consequences provide basic logical connections between probabilities of data given hypotheses, which are the outputs of standard statistical methods, and probabilities of hypotheses given data,
which are routinely requested by scientists and the general public. In particular, proposition 11 provides one way of making precise the commonsense notion that successful predictions support a hypothesis.

There are theorems more elaborate than Bayes' that may also be interpreted as forms of probabilistic induction; see, for example, sec. VI, Ch. 15, of Good.

In a letter to Nature that inspired a small literature in philosophy (Refs 28-32 provide some examples), Popper and Miller9 claimed to prove that "probabilistic induction" was impossible. The definition of "probabilistic induction" that they used was not any of those given above, however. In fact, Popper and Miller noted that corroboration as defined above does imply probabilistic support; they simply argued that probabilistic support is not the same as probabilistic induction (using their definition of induction).10 As several authors have pointed out, the proof depends on the idiosyncratic definitions used by Popper and Miller7,28-32 (for example, Howson and Urbach7 describe those definitions as "eccentric" and "strange").

What Popper and Miller actually showed was that, if Pr(H | B) < 1 and Pr(B) < 1 (that is, if the hypothesis H is not certain given the observation B, and the observation B is not certain), then Pr(B implies H | B) < Pr(B implies H). Popper and Miller implicitly defined probabilistic induction to be the reverse inequality, Pr(B implies H | B) > Pr(B implies H); hence, under their definition, "probabilistic induction" is impossible. This result is simply irrelevant to the definitions of probabilistic induction given earlier; at best, it is another warning that controversies often arise from semantic divergences.19

Popper and Miller did, however, make one concluding statement that is agreed upon by all discussants: "There is such a thing as probabilistic support," by which they meant that proposition 11 given above is a valid relation between predictions and hypotheses. For those who define probabilistic induction as the process of updating probabilities in light of new observations, Popper and Miller's statement is a startling concession of the possibility and existence of such inductive processes.28

Bayesian Statistical Analysis

There is nothing controversial about Bayes' Theorem and its consequences as mathematical formulas. What is controversial is the use of subjective probabilities in the theorem, especially probabilities of hypotheses. Even if one accepts the Bayesian philosophy, however, there are two practical obstacles to its application: Computation of the unconditional probability Pr(B) of the observation B, and specification of the prior probability Pr(H) of the hypothesis H. In typical applications, evaluation of Pr(B) requires difficult integration, although modern computing developments have greatly diminished the importance of this obstacle. In contrast, proposed solu-

specification, and computational difficulties raised by the theorem, which may explain why he refrained from publishing his essay.33

Suppose now we wish to compare the posterior probabilities of two hypotheses H1 and H0. One way to do so is to take their ratio, which is called their posterior odds, Pr(H1 | B)/Pr(H0 | B). Applying Bayes' Theorem to both the numerator and denominator of this ratio, we obtain

Pr(H1 | B)/Pr(H0 | B) = [Pr(B | H1)/Pr(B | H0)] × [Pr(H1)/Pr(H0)].

The ratio Pr(B | H1)/Pr(B | H0) in this equation is often called the Bayes factor comparing the two hypotheses, while the ratio Pr(H1)/Pr(H0) of the prior probabilities is called their prior odds. With these definitions, the preceding equation yields

Bayes factor = Pr(B | H1)/Pr(B | H0) = [Pr(H1 | B)/Pr(H0 | B)] / [Pr(H1)/Pr(H0)] = Posterior Odds / Prior Odds,

so that the Bayes factor measures the change in the odds of H1 vs H0 produced by observation B. These equations bring some computational and conceptual simplifications to Bayesian analyses; for example, when H1 and H0 represent one-point statistical hypotheses (such as "OR = 2" and "OR = 1" in a two-by-two table), the Bayes factor is the same as the likelihood ratio of ordinary statistics, which in many problems is easily derived from standard statistical outputs. Nonetheless, the equations do not supply all the statistics one might want from an analysis, such as posterior (Bayesian) interval estimates.

There are three major approaches to the specification problem. The first and oldest, dating back to Laplace in the 18th century,33 is not really part of the modern subjective theory. It attempts to identify and specify "noninformative," "ignorance," or "reference" prior distributions. For epidemiologic analyses, this approach usually yields numerical results close or equal to standard frequentist procedures; for example, the posterior probability intervals ("Bayesian confidence intervals") obtained in this manner are usually close or equal to standard confidence intervals. Their interpretations are entirely different, however; for example, 95% posterior probability limits of 1 to 3 for a risk ratio RR are a pair of numbers such that Pr(1 < RR < 3) = 0.95, where Pr( ) refers to the analyst's probability assignment. In
tions to the specification problem remain controversial. contrast, frequentkt 95% confidence limits of 1 to 3 for
Interestingly, Bayes was well aware of the philosophical, RR have no analogous physical probability interpreta-
tion; although commonly misinterpreted as posterior study, so that every member of the intended audience
~ r o b a b i l i limits,
t~ they are simply a pair of numbers that would accept the specification used, or (b) the results are
are either known to be generated by a random mecha- reasonably insensitive to the prior specification, in
nism (if the study involved randomization) or not (if the which case the effort to make the latter precise was
study was purely observational). unnecessary.'
Methods that employ reference priors are sometimes A third approach, which may be viewed as somewhere
called "objective," "logical," or "necessarist" Bayesian between the extremes of noninformative and detailed
methods, although here "objective" means only "agreed specifications, is to focus on incorporating accepted
upon by convention." Such methods are arguably less qualitative information into the prior specification, leav-
logical and less scientific than subjective Bayesian meth- ing quantitative details as unknown parameters to be
o d ~ . ~Consider
* ~ , ~ a~simple epidemiologic example, that estimated in a higher-stage model. Such hierarchical-
of coffee drinking and its association with myocardial Bayes approaches (also known as multilevel, empirical-
infarction. No one on any side of the controversy has Bayes, or random-coefficient modeling) have expanded
ever argued that drinking 'one cup a day would elkvate in parallel with recent algorithmic and computing ad-
rates by more than 10% (RR = 1.1), if at all. Yet the vances'J4 and are well suited to many epidemiologic
standard "reference" ~ r i o rfor the coefficient of coffee problem^.^^-^' For example, in studies of diet, nutrition,
cups per day (for example, in a proportional-hazard and health, nutrient measurements are constructed from
model) assigns the same prior probability density to diet measurements in a linear fashion using nutrient
ln(RR) = l n ( l . l ) , ln(RR) = 10-loo, and every other tables, and this qualitative information can be used in
numerical possibility, such as ln(RR) = If ln(RR) modeling without precisely specifying prior probabilities
= 10'0°, consuming a cup of coffee would usually lead to for any effect^.^'
an immediate massive coronary. No one would give As with conventional (frequentist) analysis methods,
ln(RR) = 10'0° any credence if they understood its a thorough Bayesian analysis must consider many issues,
substantive meaning. Nonetheless, some statisticians including insensitivity and robustness. A result is insensi-
continue to use and promote so-called "noninformative" tive if it does not change much under reasonable
priors, which correspond to precisely these kinds of changes in the analysis assumptions (of which the prior
scientific absurdities. Their use is sometimes rationalized specification is but one), whereas a method is robust if the
on the grounds that the resulting procedures are robust results it produces remain valid under reasonable depar-
(see below), but robustness against absurd possibilities is tures from its assumptions. Insensitivity and robustness
unnecessary and often costly in terms of overall accuracy are related but do not imply one another: a nonrobust
and credibilitv of results.24 method may yield an insensitive result, and a robust
Another faulty rationale for noninformative prior dis- method may yield a sensitive result. Furthermore, a
tributions is that "they allow the data to speak for robust method can be much less accurate than a nonro-
themselves." In reality, data never speak for themselves: bust method that is well tailored to the topic at hand-
Every analysis has to filter data through some set of which is another reason why the robustness of certain
simplifying assumptions, such as assumptions that the "objective" Bayes and frequentist methods is not a com-
data were generated by a conventional probability pelling argument in their fav0r.~4
m e c h a n i ~ m . ~ <This ~ ~ * problem
~ ~ - ~ ' is recognized in Pop-
per's theme that all observations are theory-ladenS2'In
non-Bayesian analyses, all assumptions are incorporated Central to any serious attempt at probability specifica-
into and often hidden by models for data probabilities Although
tion is the concept of e~changeability.'"~~~~~-~~
[for example, models for Pr(BIHO)and Pr(BlH,), where this conceDt is defined in both obiective and subiective
Ho and HI specify parameter values in a logistic model]. theories, I will here discuss onlv the subiective version.
Bayesian methods allow one to shift the form and bur- In subjective terms, you regard two unknown quantities,
den of some of the assumptions to models for prior X and Y, as exchangeable (or permutable or symmetric) if,
probabilities [for example, models for Pr(HO) and for any statement involving one or both of them, your
Pr(H,)I; such assumptions can and should be checked probability assignment will not change if X and Y were
against data, just as one should check models for data interchanged. For example, for two individuals of un-
~robabilities.~~~~~~~ known HIV status but identical values for known pre-
A t the other extreme from noninformative s~ecifica- dictors of HIV status (age, gender, intravenous drug use,
tion, there have been attempts to elicit detailed quan- ethnicity, and sexual activities), I would regard their
titative specifications of prior probabilities from scien- HIV status indicators X and Y (1 = positive, 0 =
tific expert^.^'^^ These are laudable efforts to negative) as exchangeable. Thus, I would have Pr(X <
operationalize the spirit of Bayesian philosophy. There Y) = Pr(Y < X), Pr(X = 0) = Pr(Y = 0), Pr(X = 1) =
are many practical drawbacks to this approach, however. Pr(Y = I), and so on; with respect to my probability
First, it can require extraordinary effort on the part of assignments, X and Y would be indistinguishable. More
both the experts and the statisticians, more than prac- generally, you would regard the unknowi quantities in a
tical for routine use. Second, it will not ~ r o d u c ea collection as exchangeable if they were indistinguishable
convincing analysis unless either (a) there'is a high or interchangeable with respect to your probability as-
degree of consensus among all experts in the field under signments.
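The definition of exchangeability can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the two HIV status indicators X and Y are independent given a shared but unknown prevalence p, and that the analyst's prior puts equal weight on p = 0.1 and p = 0.3; the prior, the variable names, and the helper `joint` are hypothetical and do not come from the article. Under these assumptions the joint assignment is symmetric in X and Y (exchangeable), yet X and Y are not independent.

```python
from itertools import product

# Hypothetical two-point prior over the shared prevalence p (illustrative only).
prior = {0.1: 0.5, 0.3: 0.5}

def joint(x, y):
    """Pr(X = x, Y = y): X and Y are Bernoulli(p) given p, averaged over the prior."""
    return sum(
        w * (p if x else 1 - p) * (p if y else 1 - p)
        for p, w in prior.items()
    )

# Full joint probability table for the two indicators.
table = {(x, y): joint(x, y) for x, y in product((0, 1), repeat=2)}

# Exchangeability: swapping X and Y leaves every joint probability unchanged,
# so in particular Pr(X < Y) = Pr(Y < X) and Pr(X = 1) = Pr(Y = 1).
exchangeable = all(abs(table[(x, y)] - table[(y, x)]) < 1e-12 for x, y in table)

pr_x1 = table[(1, 0)] + table[(1, 1)]
pr_y1 = table[(0, 1)] + table[(1, 1)]

# Independence would additionally require Pr(X = 1, Y = 1) = Pr(X = 1) Pr(Y = 1).
independent = abs(table[(1, 1)] - pr_x1 * pr_y1) < 1e-12

print("exchangeable:", exchangeable)
print("independent:", independent)
```

Learning that X = 1 shifts belief toward the higher prevalence and so raises the probability that Y = 1. That is why symmetric assignments of this kind need not be independent.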
Exchangeability is implied by but does not imply the common statistical assumption that the quantities under study are independent and identically distributed. It is the exchangeability of characteristics of sampled persons with those of unsampled persons that justifies statistical inferences from a survey sample to the population. Similarly, in comparative trials, it is the exchangeability of group outcomes under homogeneous treatment allocation that justifies inferences about causal effects. Random sampling in surveys and randomization in comparative trials are perhaps the most dependable but not the only methods for inducing observers to make exchangeable probability assignments; for example, certain forms of systematic sampling or treatment allocation may suffice.

In nonexperimental comparisons, exchangeability arises in a much more conditional, partial fashion, because such comparisons require control of a "sufficient" set of selection and risk predictors. Here "sufficient" means that, within levels of controlled predictors, the outcomes of the different exposure groups would be exchangeable if every subject received the reference exposure. Note that matching in nonexperimental studies does not necessarily induce such exchangeability; it only ensures that, within each matching stratum, there will be both exposed and unexposed subjects (in cohort studies) or diseased and nondiseased subjects (in case-control studies).13 More generally, popular strategies for control of confounding such as regression adjustment do not induce exchangeable assignments; rather, they must assume that the set of controlled predictors is sufficient in the sense just described, which is often not true.

Key structures in subjective theory can be traced out in parallel with structures in objective theory. For example, in objective epidemiologic theory, "exchangeability" of exposure groups is synonymous with "comparability" of the groups, or absence of confounding. I have focused instead on subjective exchangeability because no physical probabilities can be identified or even shown to exist in observational epidemiologic comparisons. Nonetheless, one should not lose sight of the parallel because, if physical probabilities do exist in a given situation, then (generalizing from the Principal Principle) we should want our subjective probability assignments to be as closely calibrated to the physical probabilities as possible.

To clarify the origin of the notion of calibration, consider meteorology, an applied physical science whose difficulties with complex observational studies and public esteem may rival those of epidemiology. I should be quite satisfied if, in my town, it rained on 70% of those days for which the forecast gave a 70% chance of rain, and so on for other percentages. I know, however, that such good calibration is well beyond current meteorology, and that I should not confuse the forecast (a subjective probability produced by the weather service) with any physical probability. Similarly, I would be quite satisfied if, in my county, there was a decline in AIDS case reports in 70% of those subgroups for which our projections gave a 70% chance of decline in the coming year, and so on. Such excellent calibration is beyond current epidemiology, and I know I should not confuse our projection (a subjective probability) with any physical probability.

In the objective theory, the "70% chance of decline" just described is referred to as an "estimate of the probability of decline," and so presumes that the physical probability exists. Whether or not the latter exists, subjective probabilities can be constructed and updated using the actual outcomes. What updating method will ensure that our subjective probabilities will approach the physical probabilities when the latter exist and data accumulate indefinitely? As with probability estimates from the objective theory, such convergence depends on whether any assumptions we have used (for example, logistic dependence of probabilities on factors) are at least approximately correct. Thus, to repeat, a Bayesian analysis does not free us from the need to check our assumptions; as in Popperian philosophy, in the Bayesian philosophy espoused here the ultimate test of our hypotheses and assumptions is how well our predictions are borne out by observations.

Discussion

Unlike forecasting, risk factor epidemiology provides few opportunities to validate predictions in the manner described above. That is, risk factor epidemiology suffers from a lack of calibration opportunities, rather than a lack of theoretical testability. No philosophy, whether frequentist, Bayesian, or Popperian, can do more than point out this weakness in our science. Whether this weakness can be remedied by anything other than more randomized trials (such as the recent trials of beta-carotene) remains to be seen.

Although I have argued that subjective probability logic has value as an approach to statistical inference, like all approaches (including the Popperian approach) it should be regarded as a partial and conditional account of scientific reasoning24; it is not as comprehensive as some authors seem to maintain. In particular, as with all statistical procedures, conclusions derived using Bayesian methods are conditional on any probability models used in the course of analysis, as well as on prior-probability assignments. Criticism of these models and assignments plays an essential role in data analysis that complements and cannot be replaced by Bayesian methods.24 In Popperian terms, we may view a Bayesian analysis as a method to incorporate data information into a given model; it cannot substitute for or be replaced by the model-criticism step, in which the model is exposed to possible criticism based on conflict with observations.

It is a serious challenge for both objective and subjective theories to address how one should specify and justify use of probability models when there are no
known physical probabilities or symmetries on which to base them.36 By "justify," I mean deduce the probability assignments from a set of assumptions that are given high prior probabilities by everyone in the scientific community. In this sense, the standard likelihood functions used to estimate causal effects from observational epidemiologic studies appear difficult to justify in subjective terms and impossible to justify in objective terms, for their deduction assumes that the exposure was randomized in a "natural experiment," which would be a fanciful assumption in (say) a study of alcohol use and breast cancer. The Principal Principle does not help here; it asks us to model our subjective probabilities on the basis of physical probabilities that do not exist or at best are unknown. Likewise, the Dutch Book argument is of no help if we have no precise values to give to our subjective probabilities, although in such a case an argument can be made for use of interval probability assignments.

It is sometimes suggested that conventional statistics should be regarded as providing the minimal uncertainty that one should assign to a parameter.36 This suggestion seems too easily ignored in typical epidemiologic discussions, however, and it does not address the fact that, without justifiable probabilities, probabilistic induction and the whole body of inferential statistics (objective and subjective) are without foundation. For one must ask: Of what relevance is a statistic based on the assumption that "the data are from a perfect randomized trial" when randomization (let alone perfection) is a complete fantasy? The answer is a subjective one, in that persons with much faith in the validity of the study (typically, the study investigators and persons who are pleased with the results) will think those statistics are highly relevant, whereas persons with little faith in the validity of the study (typically, persons displeased with the results) will think those statistics are deceptive.

Skepticism about claims for objectivity should be applied to common statistical methods, such as traditional P-values, confidence intervals, and regression analyses, as well as to more modern approaches based on bootstrapping, Gibbs sampling, and other Monte-Carlo methods. I am aware of only two justifiable responses to this skepticism. One response is to limit analyses of observational data to pure data descriptors: graphs and tables, perhaps some means and differences and ratios, but no P-values or confidence intervals or standard errors or regressions. As for summary measures, only standardization could be justified; maximum-likelihood and Mantel-Haenszel estimates could not. Such a limited analysis would probably never get published and would be devoid of measures of uncertainty.

An alternative response is to expend the extra effort to propose plausible (if tentative) subjective probability assignments and use them in a Bayesian analysis; frequentist methods would sometimes be justified as approximations to Bayesian methods. Such an analysis would be "subjective" but not arbitrary, because it would be constrained by past observations and by the norms of the subjective theory. It would have several advantages over conventional "objective" analyses. First, it would yield logically derived conditional certainty statements; for example, it would allow derivation of statements of the form "Given the assumptions of this analysis, we would be 95% certain that the parameter under study lies between RR(lower) and RR(upper)." In constructing such statements, the Bayesian analysis can make use of valuable information that is ignored by conventional analyses, such as probable limits on the magnitude of typical effects.13,24,39-41

Another Bayesian advantage is a consequence of the use of magnitude information and may be the most relevant for a Popperian epidemiologist. If the magnitude of an effect estimate is symmetrically constrained by a prior distribution or hierarchy, it will be less probable that the estimate will appear substantively or "statistically" significant than in a conventional analysis. In this sense, Bayesian analysis can provide a more stringent test of the causal hypothesis that the effect is non-negligible. ("Non-negligible" often means nonzero, but it may also mean "above a certain action threshold.") From a conventional statistical perspective, this means that the power (sensitivity) of the Bayesian analysis at a given alpha level (specificity) will be lower than that of the conventional analysis. This power loss is modest, however, compared with the dramatic reduction in Type I (false-positive) error afforded by the use of the prior information. In hypothesis-screening terms, this means that the Bayesian analysis allows one to trade a modest decrease in sensitivity for a very large increase in specificity, so that the ROC curve for Bayesian hypothesis screening lies above the curves for conventional hypothesis-screening procedures.

The preceding advantage of Bayesian analysis has been phrased in terms of hypothesis testing. Such terms, although used by Popper, have come to be abhorred by some epidemiologists because of the widespread abuse of statistical hypothesis testing.13,14 The advantage may be rephrased in estimation terms, however: Bayesian methods facilitate the use of prior information to construct estimates of much greater accuracy (that is, with better calibration) than conventional estimates. From my perspective, this is a decisive advantage, and one that has been verified repeatedly in theory, simulation, and real applications.

One remaining question is whether epidemiologists can be trained to employ Bayesian methods in an intelligent fashion. A cynic could point to evidence that the task is not possible, especially in the aforementioned abuse of statistical hypothesis testing. I would argue, however, that potential for abuse of Bayesian methods is insufficient grounds for denial of their benefits to competent users. Of course, widespread use of these methods will be delayed by their conceptual unfamiliarity, by lack of software, and by innate resistance to change. Fortunately, there is a growing movement in the statistics community to introduce Bayesian methods into basic statistics education; this movement, along with coverage of probability logic in epidemiologic training, will
hasten the day when unfamiliarity is no longer an important obstacle.

Acknowledgments

I am deeply indebted to Malcolm Maclure and Charles Poole for their extensive criticisms of this manuscript, subsequent discussions, and valuable suggestions for references and revisions.

References

1. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. New York: Chapman and Hall, 1995.
2. Berry DA, Stangl DK. Bayesian Biostatistics. New York: Marcel Dekker, 1996.
3. Lad F. Operational Subjective Statistical Methods. New York: Wiley, 1996.
4. Greenland S, ed. Evolution of Epidemiologic Ideas. Chestnut Hill, MA: Epidemiology Resources Inc., 1987.
5. Rothman KJ, ed. Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988.
6. DeFinetti B. The Theory of Probability. vols. 1 and 2. New York: Wiley, 1974.
7. Howson C, Urbach P. Scientific Reasoning: The Bayesian Approach. 2nd ed. LaSalle, IL: Open Court, 1993.
8. Good IJ. Good Thinking. Minneapolis: University of Minnesota Press, 1983.
9. Popper KR, Miller DW. A proof of the impossibility of inductive probability. Nature 1983;302:687-688.
10. Popper KR, Miller DW. Why probabilistic support is not inductive. Phil Trans R Soc Lond 1987;A321:564-591.
11. Kahneman D, Slovic P, Tversky A. Judgement under Uncertainty: Heuristics and Biases. New York: Cambridge University Press, 1982.
12. Piattelli-Palmarini M. Inevitable Illusions. New York: Wiley, 1995.
13. Rothman KJ, Greenland S. Modern Epidemiology. 2nd ed. Philadelphia: Lippincott, 1997.
14. Oakes M. Statistical Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1990.
15. Kyburg HE. Probability and Inductive Logic. London: MacMillan, 1970.
16. Cohen LJ. An Introduction to the Philosophy of Induction and Probability. Oxford: Clarendon, 1989.
17. Skyrms B. Choice and Chance: An Introduction to Inductive Logic. 3rd ed. Belmont, MA: Wadsworth, 1986.
18. Hacking I. The Emergence of Probability. New York: Cambridge University Press, 1975.
19. Greenland S. Induction versus Popper: substance versus semantics. Int J Epidemiol 1998;27 (in press).
20. Von Mises R. Probability, Statistics and Truth. 2nd ed. New York: Dover, 1957.
21. Popper KR. The Logic of Scientific Discovery. New York: Harper and Row, 1968.
22. Freedman DA. Some issues in the foundations of statistics (with discussion). Found Sci 1995;1:19-83.
23. Lewis D. A subjectivist's guide to objective chance. In: Harper WL, Stalnaker R, Pearce G, eds. Ifs: Conditionals, Belief, Decision, Chance, and Time. Boston: D. Reidel, 1981;267-297.
24. Box GEP. An apology for ecumenism in statistics. In: Box GEP, Leonard T, Wu CF, eds. Scientific Inference, Data Analysis, and Robustness. New York: Academic Press, 1983.
25. Ramsey FP. Truth and probability. In: Ramsey FP. The Foundations of Mathematics and Other Logical Essays. New York: Harcourt, Brace, 1931.
26. DeFinetti B. Foresight: its logical laws, its subjective sources (original published in French). Reprinted in: Kyburg HE, Smokler HE, eds. Studies in Subjective Probability. New York: Wiley, 1937;1964.
27. Hume D. An enquiry concerning human understanding. LaSalle, IL: Open Court, 1988, Section 6 (first published 1748).
28. Levi I, Jeffrey RC, Good IJ, Popper K, Miller D. Matters arising: the impossibility of inductive probability (discussion forum). Nature 1984;310:433-434.
29. Redhead M. On the impossibility of inductive probability. Br J Phil Sci 1985;36:185-191.
30. Dunn JM, Hellman G. Dualling: a critique of an argument of Popper and Miller. Br J Phil Sci 1986;37:220-223.
31. Eells E. On the alleged impossibility of inductive probability. Br J Phil Sci 1988;39:111-116.
32. Rodriguez AR. On Popper-Miller's proof of the impossibility of inductive probability. Erkenntnis 1987;27:353-357.
33. Stigler SM. The History of Statistics. Cambridge, MA: Belknap, 1986.
34. Carlin BP, Louis TA. Bayes and Empirical Bayes Methods for Data Analysis. New York: Chapman and Hall, 1996.
35. Robins JM, Greenland S. The role of model selection in causal inference from nonexperimental data. Am J Epidemiol 1986;123:392-402.
36. Greenland S. Randomization, statistics, and causal inference. Epidemiology 1990;1:421-429.
37. Greenland S. Summarization, smoothing, and inference. Scand J Soc Med 1993;21:227-232.
38. Thomas DC. The problem of multiple inference in identifying point-source environmental hazards. Environ Health Perspect 1985;62:407-414.
39. Greenland S. A semi-Bayes approach to the analysis of correlated multiple associations, with an application to an occupational cancer-mortality study. Stat Med 1992;11:219-230.
40. Greenland S. Hierarchical regression for epidemiologic analyses of multiple exposures. Environ Health Perspect 1994;102(suppl 8):33-39.
41. Witte JS, Greenland S, Haile RW, Bird CL. Hierarchical regression analysis applied to a study of multiple dietary exposures and breast cancer. Epidemiology 1994;5:612-621.
42. Leamer EE. Specification Searches. New York: Wiley, 1978.
43. Draper D, Hodges JS, Mallows CL, Pregibon D. Exchangeability and data analysis. J R Stat Soc A 1993;156:9-37.
44. Greenland S, Draper D. Exchangeability. In: The Encyclopedia of Biostatistics. New York: Wiley, 1998.
45. Greenland S. Confounding. In: The Encyclopedia of Biostatistics. New York: Wiley, 1998.
46. Berry DA, Albert J, Moore DS. Teacher's corner (with discussion). Am Stat 1997;51:241-274.

Appendix 1

The Necessity of the Probability Axioms to Prevent Sure Loss

Suppose you are willing to bet on your probability assignments in the following sense: You are willing to gamble up to s monetary units in a wager against me, and, if you assign Pr(A) to a statement A (such as 0.01 to "HIV will be eradicated by 2020"), then you would be willing to bet sPr(A) on A being true or s(1 - Pr(A)) on A being false, at my choice. That is, Pr(A)/(1 - Pr(A)) equals the betting odds for A that would render you indifferent to betting for or against A. Then, if your probability assignments do not obey the axioms of probability theory, I can choose my bets so that you are sure to lose money:

AXIOM 1

Suppose you violate this axiom by assigning Pr(A) < 0. Then I will bet on A. If A turns out to be false, you "win" the negative amount sPr(A); this means you owe me the positive amount -sPr(A). If A turns out to be true, you owe me s(1 - Pr(A)). Thus, only by assigning Pr(A) ≥ 0 (Axiom 1) can you avoid sure loss.

AXIOM 2

Suppose A is logically inevitable, but you violate this axiom by assigning Pr(A) > 1. Then I will bet against A, and you "win" the negative amount s(1 - Pr(A)); this means you owe me s(Pr(A) - 1). If you assign Pr(A) < 1, then I will bet on A and win s(1 - Pr(A)). Only by assigning Pr(A) = 1 (Axiom 2) can you avoid a sure loss.

AXIOM 3

Suppose your assignments yield Pr(A or B) > Pr(A) + Pr(B) for some mutually exclusive A and B. Then I will place bets against "A or B," for A, and for B. If A turns out to be true, I win

\[
s\bigl(1 - \Pr(A)\bigr) - s\Pr(B) - s\bigl(1 - \Pr(A \text{ or } B)\bigr) = s\Pr(A \text{ or } B) - s\bigl(\Pr(A) + \Pr(B)\bigr),
\]

which is positive. Reversing A and B shows that I net the same amount if B turns out to be true. If neither occurs, I also win sPr(A or B) - s(Pr(A) + Pr(B)). (They cannot both be true.) Parallel algebra shows I can guarantee you suffer a net loss if your assignments yield Pr(A or B) < Pr(A) + Pr(B) for some mutually exclusive A and B by betting for "A or B," against A, and against B. Only by assigning Pr(A or B) = Pr(A) + Pr(B) when A and B are mutually exclusive (Axiom 3) can you avoid a sure loss.

Appendix 2

Corroboration in the Subjective Theory

The following argument and counterexample show that corroboration implies but is not implied by probabilistic support. Suppose that finding B false was possible and would have refuted H; that is, Pr(not B) > 0 and Pr(H|not B) = 0. Then

\[
\Pr(H) = \Pr(H \mid B)\Pr(B) + \Pr(H \mid \text{not } B)\Pr(\text{not } B) = \Pr(H \mid B)\Pr(B) < \Pr(H \mid B),
\]

because Pr(B) = 1 - Pr(not B) < 1.

Now suppose that Pr(H and B) = Pr(not H and not B) = 0.3 and Pr(H and not B) = Pr(not H and B) = 0.2. Then

\[
\Pr(H \mid B) = 0.3/(0.3 + 0.2) = 0.6 > \Pr(H) = \Pr(H \text{ and } B) + \Pr(H \text{ and not } B) = 0.3 + 0.2 = 0.5,
\]

so B supports H, but Pr(H|not B) = 0.2/(0.3 + 0.2) = 0.4, so B does not corroborate H.

The following counterexamples show that proof and corroboration do not imply one another. First, suppose that Pr(H) = 0.6 and Pr(B) = Pr(H and B) = 0.2. Then Pr(H|B) = 0.2/0.2 = 1, so B proves H, but Pr(H|not B) = (0.6 - 0.2)/(1 - 0.2) = 0.5, so B does not corroborate H. Second, suppose that Pr(B) = 0.4 and Pr(H) = Pr(H and B) = 0.2. Then Pr(H|not B) = (0.2 - 0.2)/(1 - 0.4) = 0, so B corroborates H, but Pr(H|B) = 0.2/0.4 = 0.5, so B does not prove H.
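The arithmetic in the two appendices can be checked mechanically. The sketch below is only an illustration: the function names, the stake s = 100, and the incoherent assignments Pr(A) = 0.2, Pr(B) = 0.3, Pr(A or B) = 0.6 are hypothetical choices, not values from the article. It encodes the betting convention of Appendix 1 (a bet on a statement priced at p stakes s·p to win s·(1 - p), and a bet against it stakes s·(1 - p) to win s·p) and recomputes the conditional probabilities used in the three numerical examples of Appendix 2.

```python
import math

def bet_for(s, p, occurs):
    """My net gain from betting on a statement you price at p."""
    return s * (1 - p) if occurs else -s * p

def bet_against(s, p, occurs):
    """My net gain from betting against a statement you price at p."""
    return -s * (1 - p) if occurs else s * p

def axiom3_net_gains(s, pr_a, pr_b, pr_a_or_b):
    """My net gain per outcome when betting for A, for B, and against 'A or B'."""
    # A and B are mutually exclusive, so only three outcomes are possible.
    outcomes = {"A": (True, False), "B": (False, True), "neither": (False, False)}
    return {
        name: bet_for(s, pr_a, a) + bet_for(s, pr_b, b)
        + bet_against(s, pr_a_or_b, a or b)
        for name, (a, b) in outcomes.items()
    }

def conditionals(h_b, h_notb, noth_b, noth_notb):
    """Return Pr(H), Pr(H|B), Pr(H|not B) from the four joint probabilities."""
    pr_b = h_b + noth_b
    return h_b + h_notb, h_b / pr_b, h_notb / (1 - pr_b)

# Appendix 1, Axiom 3: assignments with Pr(A or B) > Pr(A) + Pr(B) (hypothetical values).
gains = axiom3_net_gains(s=100, pr_a=0.2, pr_b=0.3, pr_a_or_b=0.6)
print(gains)  # every outcome nets about s * (0.6 - 0.2 - 0.3) = 10

# Appendix 2: joint distributions taken from the three examples in the text.
support_only = conditionals(0.3, 0.2, 0.2, 0.3)        # supports, does not corroborate
proof_only = conditionals(0.2, 0.4, 0.0, 0.4)          # proves, does not corroborate
corroboration_only = conditionals(0.2, 0.0, 0.2, 0.6)  # corroborates, does not prove

for label, (pr_h, pr_h_b, pr_h_notb) in [
    ("support only", support_only),
    ("proof only", proof_only),
    ("corroboration only", corroboration_only),
]:
    print(f"{label}: Pr(H)={pr_h:.2f} Pr(H|B)={pr_h_b:.2f} Pr(H|not B)={pr_h_notb:.2f}")
```

In each row, B supports H when Pr(H|B) > Pr(H), proves H when Pr(H|B) = 1, and corroborates H when Pr(H|not B) = 0, matching the conditions used in Appendix 2.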
