
Bayes' Theorem in the 21st Century

Bradley Efron
Science 340, 1177 (2013);
DOI: 10.1126/science.1236536


Downloaded from www.sciencemag.org on July 19, 2013



The following resources related to this article are available online at www.sciencemag.org (this information is current as of July 19, 2013):

Updated information and services, including high-resolution figures, can be found in the online
version of this article at:
http://www.sciencemag.org/content/340/6137/1177.full.html
This article cites 5 articles, 1 of which can be accessed free:
http://www.sciencemag.org/content/340/6137/1177.full.html#ref-list-1
This article appears in the following subject collections:
Computers, Mathematics
http://www.sciencemag.org/cgi/collection/comp_math

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the
American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright
2013 by the American Association for the Advancement of Science; all rights reserved. The title Science is a
registered trademark of AAAS.
PERSPECTIVES


MATHEMATICS

Bayes' Theorem in the 21st Century

Bayes' theorem plays an increasingly prominent role in statistical applications but remains controversial among statisticians.

Bradley Efron

Department of Statistics, Stanford University, Stanford, CA 94305, USA. E-mail: brad@stat.stanford.edu

Science, Vol. 340, 7 June 2013, p. 1177. Published by AAAS.

The term "controversial theorem" sounds like an oxymoron, but Bayes' theorem has played this part for two-and-a-half centuries. Twice it has soared to scientific celebrity, twice it has crashed, and it is currently enjoying another boom. The theorem itself is a landmark of logical reasoning and the first serious triumph of statistical inference, yet is still treated with suspicion by most statisticians. There are reasons to believe in the staying power of its current popularity, but also some signs of trouble ahead.

Here is a simple but genuine example of Bayes' rule in action (see sidebar) (1). A physicist couple I know learned, from sonograms, that they were due to be parents of twin boys. They wondered what the probability was that their twins would be identical rather than fraternal. There are two pieces of relevant evidence. One-third of twins are identical; on the other hand, identical twins are twice as likely to yield twin boy sonograms, because they are always same-sex, whereas the likelihood of fraternal twins being same-sex is 50:50. Putting this together, Bayes' rule correctly concludes that the two pieces balance out, and that the odds of the twins being identical are even. (The twins were fraternal.)

Bayes' theorem is thus an algorithm for combining prior experience (one-third of twins are identicals) with current evidence (the sonogram). Followers of Nate Silver's FiveThirtyEight Web blog got to see the rule in spectacular form during the 2012 U.S. presidential campaign: The algorithm updated prior poll results with new data on a daily basis, correctly predicting the actual vote in all 50 states. "Statisticians beat pundits" was the verdict in the press (2).

Bayes' 1763 paper was an impeccable exercise in probability theory. The trouble and the subsequent busts came from overenthusiastic application of the theorem in the absence of genuine prior information, with Pierre-Simon Laplace as a prime violator. Suppose that in the twins example we lacked the prior knowledge that one-third of twins are identical. Laplace would have assumed a uniform distribution between zero and one for the unknown prior probability of identical twins, yielding 2/3 rather than 1/2 as the answer to the physicists' question. In modern parlance, Laplace would be trying to assign an "uninformative prior" or "objective prior" (2), one having only neutral effects on the output of Bayes' rule (3). Whether or not this
can be done legitimately has fueled the 250-year controversy.

Frequentism, the dominant statistical paradigm over the past hundred years, rejects the use of uninformative priors, and in fact does away with prior distributions entirely (1). In place of past experience, frequentism considers future behavior. An optimal estimator is one that performs best in hypothetical repetitions of the current experiment. The resulting gain in scientific objectivity has carried the day, though at a price in the coherent integration of evidence from different sources, as in the FiveThirtyEight example.

The Bayesian-frequentist argument, unlike most philosophical disputes, has immediate practical consequences. Consider that after a 7-year trial on human subjects, a research team announces that drug A has proved better than drug B at the 0.05 significance level. Asked why the trial took so long, the team leader replies, "That was the first time the results reached the 0.05 level." Food and Drug Administration (FDA) regulators reject the team's submission, on the frequentist grounds that interim tests of the data, by taking repeated 0.05 chances, could raise the false alarm rate to (say) 15% from the claimed 5%.

A Bayesian FDA regulator would be more forgiving. Starting from a given prior distribution, the Bayesian posterior probability of drug A's superiority depends only on its final evaluation, not whether there might have been earlier decisions. This is a corollary of Bayes' theorem, convenient but potentially dangerous in practice, especially when using prior distributions not firmly grounded in past experience.

I recently completed my term as editor of an applied statistics journal. Maybe a quarter of the papers used Bayes' theorem. Almost all of these were based on uninformative priors, reflecting the fact that most cutting-edge science does not enjoy FiveThirtyEight-level background information. Are we in for another Bayesian bust?

Arguing against this is a change in our statistical environment. Modern scientific equipment pumps out results in fire hose quantities, producing enormous data sets bearing on complicated webs of interrelated questions. In this new scientific era, the ability of Bayesian statistics to connect disparate inferences counts heavily in its favor.

An example will help here. In a microarray prostate cancer study (4), 102 men (52 patients and 50 healthy controls) each had their genetic activity measured for 6033 genes. The investigators were hoping to find genes expressed differently in the patients than in the controls. To this end, they calculated a test statistic z for each gene, with a standard normal ("bell-shaped") distribution in the null case of no patient/control difference, but with bigger values for genes expressed more intensely in patients.

The histogram of the 6033 z values (see the figure) does not look much different than the bell-shaped curve that would apply if all genes were null. However, there is a suggestion of interesting non-null genes in the heavy right tail of the distribution. We have to be careful, though. With 6033 genes to consider at once, a few of the z's are bound to look big even under the null hypothesis, an example of selection bias or regression to the mean. These would be false discoveries.

False discovery rates (FDRs) (5) are a recent development that takes multiple testing into account (6). Here, it implies that the 28 genes with z values above 3.40 (red dashes in the figure) are indeed interesting, with the expected proportion of false discoveries among them being less than 10%. This is a frequentist 10%: how many mistakes we would average using the algorithm in future studies. We expect only 2.8 of the z values exceeding 3.40 to be null, that is, only 10% of the actual number observed. Larger choices of the cutoff would yield smaller FDRs.

This brings us back to Bayes. Another interpretation of the FDR algorithm is that the Bayesian probability of nullness given a z value exceeding 3.40 is 10%. What prior evidence are we using? None, as it turns out! With 6033 parallel situations at hand, we can effectively estimate the relevant prior from the data itself. "Empirical Bayes" is the name for this sort of statistical jujitsu, suggesting a fusion of frequentist and Bayesian reasoning (7). Empirical Bayes is an exciting new statistical idea, well-suited to modern scientific technology, saying that experiments involving large numbers of parallel situations carry within them their own prior distribution. The idea was coined in the 1950s (8), but real developmental interest awaited the vast data sets of the 21st century.

I wish I could report that this resolves the 250-year controversy and that it is now safe to always employ Bayes' theorem. Sorry. My own practice is to use Bayesian analysis in the presence of genuine prior information; to use empirical Bayes methods in the parallel cases situation; and otherwise to be cautious when invoking uninformative priors. In the last case, Bayesian calculations cannot be uncritically accepted and should be checked by other methods, which usually means frequentistically.

Sidebar

If P(A) is the probability of A and P(B) is the probability of B, then the conditional probability of A given B is P(A|B) and the conditional probability of B given A is P(B|A). Bayes' theorem says that

P(A|B) = P(B|A) P(A) / P(B)

In the twins example, A is "twins identical" and B is "sonogram shows twin boys." The doctor's prior says P(A) = 1/3; genetics implies P(B|A) = 1/2 and P(B|not A) = 1/4, so P(B) = (1/2)(1/3) + (1/4)(2/3) = 1/3. Bayes' theorem then gives

P(A|B) = (1/2)(1/3) / (1/3) = 1/2

The two pieces of evidence thus balance out, and the likelihood of the boys being fraternal is equal to that of the boys being identical.

[Figure: histogram of the z values (x-axis: z values, from -4 to 4; y-axis: Frequency, from 0 to 400) with the bell-shaped curve overlaid; an inset marks the genes with z > 3.40 in the right tail.]

True and false discoveries. Test statistic z for 6033 genes in a microarray study of prostate cancer. The 28 genes having z ≥ 3.40 are likely to be true discoveries, that is, genes that are more active in prostate cancer patients than in controls. These results are based on Bayes' rule, but with prior information obtained from the current data, an example of empirical Bayes methodology.

References and Notes
1. B. Efron, Bull. Am. Math. Soc. 50, 129 (2013).
2. S. Wang, B. Campbell, Science 339, 758 (2013).
3. J. Berger, Bayesian Anal. 1, 385 (2006).
4. D. Singh et al., Cancer Cell 1, 203 (2002).
5. Y. Benjamini, Y. Hochberg, J. R. Stat. Soc. B 57, 289 (1995).
6. See chapter 4 of (7) for a careful exposition of false discovery rate theory.
7. B. Efron, Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction (Institute of Mathematical Statistics Monographs, Cambridge Univ. Press, Cambridge, UK, 2010).
8. H. Robbins, in Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954-1955, Vol. I (Univ. of California Press, Berkeley/Los Angeles, 1956), pp. 157-163.

10.1126/science.1236536
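
The twins calculation in the sidebar, and Laplace's variant with a uniform prior, can be checked in a few lines. What follows is a minimal Python sketch and not part of the original article: the function name `posterior_identical` is hypothetical, and the likelihoods 1/2 and 1/4 are the sidebar's values. For the Laplace case the sketch assumes that a uniform prior on the unknown proportion of identical twins can be replaced by its mean, 1/2. That holds here because both the numerator and the denominator of Bayes' rule are linear in that proportion.

```python
from fractions import Fraction


def posterior_identical(prior_identical):
    """P(twins identical | sonogram shows twin boys), by Bayes' theorem."""
    like_identical = Fraction(1, 2)  # P(B|A): identical twins are same-sex, boys half the time
    like_fraternal = Fraction(1, 4)  # P(B|not A): fraternal twins are both boys a quarter of the time
    # P(B) by the law of total probability
    evidence = (like_identical * prior_identical
                + like_fraternal * (1 - prior_identical))
    return like_identical * prior_identical / evidence


# Genuine prior information: one-third of twins are identical.
print(posterior_identical(Fraction(1, 3)))  # 1/2

# Laplace's uniform prior on the unknown proportion, summarized by its mean 1/2.
print(posterior_identical(Fraction(1, 2)))  # 2/3
```

Exact fractions are used so that the two answers quoted in the article, 1/2 and 2/3, appear without rounding.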

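
The frequentist objection in the drug trial example, that repeated interim tests at the 0.05 level inflate the false alarm rate, can be illustrated by simulation. This is a hedged sketch, not the article's calculation. It assumes five equally spaced interim looks, normally distributed outcome differences with known unit variance, and a two-sided z test; the function name `false_alarm_rate` and all its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def false_alarm_rate(n_looks=5, n_per_look=20, n_trials=20000, alpha=0.05, seed=0):
    """Fraction of null trials declared significant at any interim look."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff, about 1.96
    false_alarms = 0
    for _ in range(n_trials):
        total, n = 0.0, 0
        for _ in range(n_looks):
            # Sum of n_per_look new N(0, 1) differences (drug A minus drug B);
            # the true mean difference is 0, so any "discovery" is a false alarm.
            total += rng.gauss(0.0, n_per_look ** 0.5)
            n += n_per_look
            z = total / n ** 0.5  # z statistic for known unit variance
            if abs(z) > z_crit:
                false_alarms += 1
                break  # the team stops at the first "significant" look
    return false_alarms / n_trials


print(f"one look:   {false_alarm_rate(n_looks=1):.3f}")
print(f"five looks: {false_alarm_rate(n_looks=5):.3f}")
```

With a single look the rate should sit near the nominal 5%. With five looks under these assumptions it should come out well above that, around 14%, which is in line with the article's "(say) 15%".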

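
The false discovery rate reasoning in the prostate cancer example can be sketched from the article's summary numbers alone (6033 genes, 28 of them with z above 3.40). The code below is an illustrative tail-area estimate: the expected number of null z values beyond the cutoff divided by the number actually observed there. It is not the Benjamini-Hochberg procedure of reference (5), and it makes assumptions the article does not state: a one-sided test, an exact N(0, 1) null distribution, and a conservative null fraction of 1. The function name `tail_area_fdr` is hypothetical, and the result need not match the article's 10% bound exactly.

```python
from statistics import NormalDist


def tail_area_fdr(n_tests, n_exceeding, cutoff, null_fraction=1.0):
    """Estimated false discovery rate among test statistics beyond `cutoff`.

    Ratio of the expected number of null z values beyond the cutoff to the
    number of z values actually observed there.
    """
    null_tail = 1.0 - NormalDist().cdf(cutoff)  # P(z >= cutoff) under N(0, 1)
    expected_null = null_fraction * n_tests * null_tail
    return expected_null / n_exceeding


# Summary figures taken from the article; the per-gene z values are not available here.
fdr = tail_area_fdr(n_tests=6033, n_exceeding=28, cutoff=3.40)
print(f"estimated FDR among the 28 genes: {fdr:.3f}")
```

Under these assumptions the estimate is roughly 0.07, consistent with the article's statement that the expected proportion of false discoveries is less than 10%. The count of 28 in the denominator comes from the data themselves, which is the empirical Bayes element: no outside prior is supplied.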