18 views

Uploaded by Srijit Sanyal

asfas

- Big Data Methods and Psychological Science
- Multiple camera fusion based on DSmT for tracking objects on ground plane
- SIP Report Guidelines and Certificates-2
- 12841 [Robert J. Marzano] Classroom Assessment Grading(B-ok.cc)
- 95d05ae01d64528be731f0d637281a3940c3.pdf
- Business Statistics : Theory and Applications, By Jani, P.N.
- Bayesian Statistics
- IRJET-Deep Learning Model to Predict Hardware Performance
- TristanMaryHuard_2012
- Business Statistics Assignment
- praktikum ekomett
- 101038
- main
- BayseianFramwork
- Process Traicing
- Bike Couting
- A New Approach for Handling Null Values in Web Log Using KNN and Tabu Search KNN
- 0948-0956 Kudzys Kliukas
- Survey paper on methodologies employed in MINERAL exploration
- PPT_Nielsen_Artificial Neural Networks 2 (2012!11!04 21-01-52 UTC)

You are on page 1of 4

Larry Wasserman

Introduction

Brad Carlin invited me to comment on Andrew Gelmans article because Brad considers

me an ex-Bayesian. Its true that my research moved away from Bayesian inference

long ago. But I am reminded of a lesson I learned from Art Dempster over 20 years ago

which I shall paraphrase:

A person cannot be Bayesian or frequentist. Rather, a particular analysis

can be Bayesian or frequentist.

My research is very frequentist but I would not hesitate to use Bayesian methods for

a problem if I thought it was appropriate. So perhaps it is unwise to classify people

as Bayesians, anti-Bayesians, frequentists or whatever. With the caveat, I will proceed

with a frequentist tirade.

Coverage

I began to write this just a few minutes after meeting with some particle physicists.

They had questions about constructing confidence intervals for a particular physical

parameter. The measurements are very subtle and the statistical model is quite complex.

They were concerned with constructing intervals with guaranteed frequentist coverage.

Their desire for frequentist coverage seems well justified. They are making precision measurements on well defined physical quantities. The stakes are high. Our

understanding of fundamental physics depends on knowing such quantities with great

accuracy. The particle physicists have left a trail of such confidence intervals in their

wake. Many of these parameters will eventually be known (that is, measured to great

precision). Someday we can count how many of their intervals trapped the true parameter values and assess the coverage. The 95 percent frequentist intervals will live up

to their advertised coverage claims. A trail of Bayesian intervals will, in general, not

have this property. Being internally coherent will be of little consolation to the physics

community if most of their intervals miss the mark.

Frequentist methods have coverage guarantees; Bayesian methods dont. In science,

coverage matters.

Department of Statistics and Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA, http://www.stat.cmu.edu/~larry

DOI:10.1214/08-BA318D

464

of F is the empirical distribution

n

1X

Fbn (x) =

I(,x] (Xi )

n i=1

has two virtues. First, it is simple. Second, it comes with a frequency guarantee, namely,

2

(1)

all distribution functions F and then we find the posterior. A common example is

the Dirichlet process prior. The Bayes estimator (under squared error loss) is F n =

E(F |X1 , . . . , Xn ).

Does F n have a guarantee like (1)? If yes, thats nice, but then we might as well just

use Fbn . If not, then why use F n ? Isnt it better to use something with a strong guarantee

like (1)? This is the basic quandry in Bayesian inference. If the Bayes estimator has

good frequency behavior then we might as well use the frequentist method. If it has

bad frequency behavior then we shouldnt use it.

We observe training data (X1 , Y1 ), . . . , (Xn , Yn ) and we want to predict a new Y from

a new X. The catch is that X has dimension p much larger than n. For example, we

might have n = 100 and p = 10, 000. A popular predictor is `(X) = bT X where b is

the lasso estimator (Tibshirani 1996) which minimizes

b =

n

X

i=1

(Yi XiT ) +

p

X

|j |

j=1

and is chosen by cross-validation. Not only can b be computed very quickly but it has

excellent theoretical properties. Specifically, the prediction risk of `(X) = bT X is close

to optimal among all sparse linear predictions L; see Greenshtein and Ritov (2004) for

details.

It is important to understand that nowhere does this result assume that the true

regression function m(x) = E(Y |X = x) is linear. How does Bayesian inference proceed

here? Constructing a model for the 10,000 dimensional regression function would be

absurd. Again, we can seek a formal Bayesian method with good frequency behavior

but then, why not just use the lasso? A pure Bayesian approach lacking any frequency

guarantee would be dangerous.

Larry Wasserman

465

Adding Randomness

Some of the greatest contributions of statistics to science involve adding additional randomness and leveraging that randomness. Examples are randomized experiments, permutation tests, cross-validation and data-splitting. These are unabashedly frequentist

ideas and, while one can strain to fit them into a Bayesian framework, they dont really

have a place in Bayesian inference. The fact that Bayesian methods do not naturally

accommodate such a powerful set of statistical ideas seems like a serious deficiency.

Ive taken a strident and extreme tone to make my point. I reiterate that I would not

hesitate to use Bayesian methods for a problem if I thought it was appropriate. There

is room for both frequentist and Bayesian methods. Excepting a few special cases,

frequency guarantees are essential even for Bayesian methods.

References

Greenshtein, E. and Ritov, Y. (2004). Persistence in high-dimensional linear predictor

selection and the virtue of overparametrization. Bernoulli, 10: 971988.

Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of

the Royal Statistical Society. Series B, 58: 267288.

466

- Big Data Methods and Psychological ScienceUploaded byXóchitl Padilla
- Multiple camera fusion based on DSmT for tracking objects on ground planeUploaded byAnonymous 0U9j6BLllB
- SIP Report Guidelines and Certificates-2Uploaded bysidhartha jain
- 12841 [Robert J. Marzano] Classroom Assessment Grading(B-ok.cc)Uploaded byPutri Widia
- 95d05ae01d64528be731f0d637281a3940c3.pdfUploaded byThoth Dwight
- Business Statistics : Theory and Applications, By Jani, P.N.Uploaded byPHI Learning Pvt. Ltd.
- Bayesian StatisticsUploaded byShweta Singh
- IRJET-Deep Learning Model to Predict Hardware PerformanceUploaded byIRJET Journal
- TristanMaryHuard_2012Uploaded byPutria Febriana
- Business Statistics AssignmentUploaded byEa Fatihah
- praktikum ekomettUploaded byRetno Jati Sahari
- 101038Uploaded byIJCNSVol2NO10
- mainUploaded byAgung Ora Dw
- BayseianFramworkUploaded bySwadheen Jain
- Process TraicingUploaded byLuis Jimenez
- Bike CoutingUploaded byAdrian Marian Rosu
- A New Approach for Handling Null Values in Web Log Using KNN and Tabu Search KNNUploaded byLewis Torres
- 0948-0956 Kudzys KliukasUploaded byRiyaz Jazz
- Survey paper on methodologies employed in MINERAL explorationUploaded byInternational Journal for Scientific Research and Development - IJSRD
- PPT_Nielsen_Artificial Neural Networks 2 (2012!11!04 21-01-52 UTC)Uploaded byEncik Badrul
- PROPOSED SYSTEM - Google Docs 1.pdfUploaded byJayadev H
- Sample SizeUploaded byAhmed Kadem Arab
- Rudy RegularizationUploaded bydexxt0r
- Sample ProjectUploaded byArunaML
- A support vector machine classifier with rough set-based feature selection for breast cancer diagnosisUploaded byAbdul Rahman
- Information of the Topics Covered in Given Lectures With Respect to the Topics in HandoutUploaded byvisava789
- Activity 2 - CopyUploaded byArshad Ullah
- 025PUploaded by'Kesowo Hari Murti
- Kuliah 3 - Uji T Dan Uji FUploaded byMuhammad Syah
- 10814286 Introduction to Qualitative ResearchUploaded byCynthia Joffrion

- 1224-7958-1-PBUploaded bySrijit Sanyal
- UCB104Uploaded bySrijit Sanyal
- BERCI-2.2Uploaded bySrijit Sanyal
- 0701092Uploaded bySrijit Sanyal
- Lecture 8Uploaded bySrijit Sanyal
- panpsychismUploaded bySrijit Sanyal
- Kolmogorov ComplexityUploaded bySrijit Sanyal
- Js SecretsUploaded byPeriyakaruppan Nagarajan
- diffalg1Uploaded bySrijit Sanyal
- fea-magidUploaded bySrijit Sanyal
- Diff FormsUploaded bySrijit Sanyal
- Ehrenfest's TheoremUploaded bySrijit Sanyal
- Waist-hip ratio and cognitive ability: is gluteofemoral fat a privileged store of neurodevelopmental resources?Uploaded byCromiron
- Pistone ExpUploaded bySrijit Sanyal
- Entropy Entanglement Figured 01Uploaded bySrijit Sanyal
- Berker Norm Insignif NeuroUploaded bySrijit Sanyal
- Calculus for Mathematicians - DJ Bernstein (12 pgs).pdfUploaded byKim Fierens
- Leo Breiman - Statistical Modeling - Two CulturesUploaded bymurilove
- Mathematica Programming - Advanced IntroUploaded byTraderCat Solaris
- Re WritingUploaded bySrijit Sanyal
- 0901.0952Uploaded bySrijit Sanyal
- Iraq the VoteUploaded bySrijit Sanyal
- Rosen LichtUploaded bySrijit Sanyal
- Overload SmallUploaded bySrijit Sanyal
- 1111.3846Uploaded bySrijit Sanyal
- How to Study EconomicsUploaded bySrijit Sanyal
- recounting the rationalsUploaded bySrijit Sanyal
- hl13nflasdgasdgUploaded bySrijit Sanyal
- comprehensive world historyUploaded bySuman Hazarika

- folien_woche_1-3_4x4Uploaded byRolandinho
- Abstract 107Uploaded byCésar Cabanillas
- YMS Ch4: More on Two-Variable Data AP Statistics at LSHS Mr. MoleskyUploaded byInTerp0ol
- Datamodel Etim5 0 EnUploaded byget_2gether
- gisc9315d4zamanmUploaded byapi-358717204
- Data Base Management SystemUploaded byArpit Chaurasia
- Evermotion Archinteriors Vol 25 PDFUploaded byChristy
- 2D AnimationUploaded byAshraf Zaman
- Lect 7Uploaded bySoumitra Chakraborty
- Image Processing Project Report NewUploaded byAbhishekSingh
- OJT Report.docxUploaded byz3r002
- Data Flow DiagramUploaded byJeric P Paatan
- imreadUploaded byAhmad Bas
- Cephas MDA Fast GuideUploaded bycomsi1
- Alexandria ACM SC | Rigorous Computational Simulation of Dynamical SystemsUploaded byAlex ACM SC Library
- An Approach to Scene Detection in Animation Movies and Its ApplicationsUploaded byAnonymous ntIcRwM55R
- DB ConceptsUploaded bydost_999
- SQL Tutorial [Schema Statements]Uploaded bysauru7
- SQL - A Practical IntroductionUploaded byTrisha0082
- Computer GraphicsUploaded bypraveennallavelly
- Introduction to Probabilistic Simulations in ExcelUploaded byOlajide Olanrewaju Adamolekun
- Carrier Aggregation in LTE-Advanced using ETOM AnalysisUploaded bybiancahasiandra
- Fits LibUploaded bySébastien Remy
- CS490 All final Exams True False MCQ- Updated.docUploaded byMerry Sinha
- project report by er luo and kehui maUploaded byapi-249410167
- Acc400 Chapter 4 Quiz Note FileUploaded bymad_hatter_
- Blei Review VariationalInferenceUploaded byRacool Rafool
- communication Sheet solved problemsUploaded byMohamed Dakheel
- 01 Iec t1s1 Oops Session 01Uploaded bycaininme
- Homework7 (1)Uploaded byNicole Willis