
I'll try to make this introduction to Bayesian statistics clear and short. First we'll look at a specific example, then the general setting, then Bayesian statistics for the Bernoulli process, for the Poisson process, and for normal distributions.

A simple example

Suppose we have two identical urns: urn A with 5 red balls and 10 green balls, and urn B with 10 red balls and 5 green balls. We'll select randomly one of the two urns, then sample with replacement from that urn to help determine whether we chose A or B. Before sampling we have prior probabilities of 1/2, that is, P(A) = 1/2 and P(B) = 1/2.

Let's suppose our sample X = (X_1, X_2, ..., X_n) is of size n, and k of the outcomes X_1, X_2, ..., X_n are red. We want to determine the posterior probabilities P(A|X) and P(B|X), and we'll do that using Bayes' formula. We can easily compute the reverse probabilities

    P(X|A) = (1/3)^k (2/3)^(n-k)
    P(X|B) = (1/3)^(n-k) (2/3)^k

so by Bayes' formula we derive the posterior probabilities

    P(A|X) = P(X|A) P(A) / [P(X|A) P(A) + P(X|B) P(B)]
           = (1/3)^k (2/3)^(n-k) (1/2) / [(1/3)^k (2/3)^(n-k) (1/2) + (1/3)^(n-k) (2/3)^k (1/2)]
           = 2^(n-k) / (2^(n-k) + 2^k)

and

    P(B|X) = 1 - P(A|X) = 2^k / (2^(n-k) + 2^k).

For example, if in n = 10 trials we get k = 4 red balls, then the posterior probabilities are P(A|X) = 4/5 and P(B|X) = 1/5.

The general setting

The setting for Bayesian statistics is a family of distributions parametrized by one or more parameters, along with a prior distribution for those parameters. In the example above we had a Bernoulli process parametrized by one parameter p, the probability of success. In the example the prior distribution for p was discrete and had only two values, 1/3 and 2/3, each with probability 1/2. Typically, a sample X is taken from the distribution, and a posterior distribution for the parameters is computed.

Let's clarify the situation and introduce terminology and notation in the general case where X is a discrete random variable and there is only one discrete parameter θ. In statistics, we don't know what the value of θ is; our job is to make inferences about θ. The way to find out about θ is to perform many trials and see what happens, that is, to select a random sample from the distribution, X = (X_1, X_2, ..., X_n), where each random variable X_i has the given distribution. The actual outcomes that are observed I'll denote x = (x_1, x_2, ..., x_n). Now, different values of θ lead to different probabilities of the outcome x, that is, P(X=x | θ) varies with θ.

In so-called classical statistics, this probability is called the likelihood of θ given the outcome x, denoted L(θ | x). The reason the word "likelihood" is used is the suggestion that the real value of θ is likely to be one with a higher probability P(X=x | θ). But this likelihood L(θ | x) is not a probability about θ. (Note that classical statistics is much younger than Bayesian statistics and probably should have some other name.)

What Bayesian statistics does is replace this concept of likelihood by a real probability. In order to do that, we'll treat the parameter as a random variable rather than an unknown constant. Since it's a random variable, I'll use an uppercase Θ. This random variable itself has a probability distribution, which I'll denote f(θ) = P(Θ=θ). This f is called the prior distribution on Θ. The symbol P(X=x | θ) really is a conditional probability now, and it should properly be written P(X=x | Θ=θ), but I'll abbreviate it simply as P(x | θ) and leave out the references to the random variables when the context is clear. Using Bayes' law we can invert this conditional probability. In full, it says

    P(Θ=θ | X=x) = P(X=x | Θ=θ) P(Θ=θ) / P(X=x).

This conditional probability P(θ | x) is called the posterior distribution on Θ. Note that the denominator P(x) is a constant, so the last equation says that the posterior distribution P(θ | x) is proportional to P(x | θ) P(θ). I'll write proportions with the traditional symbol ∝, so that the last statement can be written as

    P(θ | x) ∝ P(x | θ) P(θ).

When we discuss the three settings (Bernoulli, Poisson, and normal) the random variable X will be either discrete or continuous, but our parameters will all be continuous, not discrete (unlike the simple example above, where our parameter p was discrete and only took the two values 1/3 and 2/3). That means we'll be working with probability densities instead of probabilities. In the continuous case there are analogous statements. In particular, analogous to the last statement, we have

    f(θ | x) ∝ f(x | θ) f(θ)

where f(θ) is the prior density function on the parameter Θ, f(θ | x) is the posterior density function on Θ, and f(x | θ) is a conditional probability or a conditional density depending on whether X is a discrete or continuous random variable.

The Bernoulli process

A single trial X for a Bernoulli process, called a Bernoulli trial, ends with one of two outcomes: success, where X = 1, and failure, where X = 0. Success occurs with probability p, while failure occurs with probability q = 1 - p. The term Bernoulli process is just another name for a random sample from a Bernoulli population. Thus, it consists of repeated independent Bernoulli trials X = (X_1, X_2, ..., X_n) with the same parameter p.

The problem for statistics is determining the value of this parameter p. All we know is that it lies between 0 and 1. We also expect the ratio k/n of the number of successes k to the number of trials n to approach p as n approaches ∞, but that's a theoretical result that doesn't say much about what p is when n is small. Let's see what the Bayesian approach says here.

We start with a prior density function f(p) on p, and take a random sample x = (x_1, x_2, ..., x_n). Then the posterior density function is proportional to a conditional probability times the prior density function

    f(p | x) ∝ P(X=x | p) f(p).

Suppose now that k successes occur among the n trials x. With our convention that X_i = 1 means the trial X_i ended in success, that means that k = x_1 + x_2 + ... + x_n. Then P(X=x | p) = p^k (1-p)^(n-k). Therefore,

    f(p | x) ∝ p^k (1-p)^(n-k) f(p).

Thus, we have a formula for determining the posterior density function f(p | x) from the prior density function f(p). (In order to know a density function, it's enough to know what it's proportional to, because we also know the integral of a density function is 1.)

But what should the prior distribution be? That depends on your state of knowledge. You may already have some knowledge about what p might be. But if you don't, maybe the best thing to do is assume that all values of p are equally probable. Let's do that and see what happens.

So, assume now that the prior density function f(p) is uniform on the interval [0, 1]: f(p) = 1 on the interval, 0 off it. Then we can determine the posterior density function. On the interval [0, 1],

    f(p | x) ∝ p^k (1-p)^(n-k) f(p) = p^k (1-p)^(n-k).

That's enough to tell us this is the beta distribution Beta(k+1, n+1-k), because the probability density function for a beta distribution Beta(α, β) is

    f(x) = (1 / B(α, β)) x^(α-1) (1-x)^(β-1)

for 0 ≤ x ≤ 1, where B(α, β) is a constant, namely, the beta function B evaluated at the arguments α and β. Note that the prior distribution f(p) we chose was uniform on [0, 1], and that's actually the beta distribution Beta(1, 1).
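The urn computation is easy to check mechanically: the posterior over a discrete parameter is just prior times likelihood, renormalized. Here is a small Python sketch of that calculation (the function name and use of exact rational arithmetic are my own choices, not part of the notes):

```python
from fractions import Fraction

def discrete_posterior(priors, likelihoods):
    """Posterior over a discrete parameter: normalize prior * likelihood."""
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Urn example: urn A draws red with probability 1/3, urn B with 2/3.
# In n = 10 draws with replacement we observed k = 4 red balls.
n, k = 10, 4
p_red = [Fraction(1, 3), Fraction(2, 3)]           # P(red | A), P(red | B)
likelihoods = [p ** k * (1 - p) ** (n - k) for p in p_red]
posterior = discrete_posterior([Fraction(1, 2), Fraction(1, 2)], likelihoods)
print(posterior)  # [Fraction(4, 5), Fraction(1, 5)]
```

Using `Fraction` keeps everything exact, so the 4/5 and 1/5 from the text come out on the nose rather than as floating-point approximations.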
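The Beta(k+1, n+1-k) claim can also be sanity-checked numerically. The sketch below (my own illustration, assuming the same n = 10, k = 4 as the urn example) builds the beta density from the gamma function, since B(α, β) = Γ(α)Γ(β)/Γ(α+β), then verifies on a grid that the posterior integrates to 1 and has mean (k+1)/(n+2), the known mean of Beta(k+1, n-k+1):

```python
from math import gamma

def beta_pdf(x, a, b):
    """Density of Beta(a, b), using B(a, b) = gamma(a)*gamma(b)/gamma(a+b)."""
    beta_const = gamma(a) * gamma(b) / gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_const

# With a uniform prior on p and k successes in n Bernoulli trials,
# the posterior f(p | x) ∝ p^k (1-p)^(n-k) is Beta(k+1, n-k+1).
n, k = 10, 4
posterior = lambda p: beta_pdf(p, k + 1, n - k + 1)

# Midpoint-rule check on a fine grid over [0, 1].
m = 100_000
grid = [(i + 0.5) / m for i in range(m)]
integral = sum(posterior(p) for p in grid) / m   # should be ≈ 1
mean = sum(p * posterior(p) for p in grid) / m   # should be ≈ (k+1)/(n+2) = 5/12
print(integral, mean)
```

This also illustrates the parenthetical remark in the text: knowing the posterior only up to proportionality is enough, because the normalizing constant is pinned down by the requirement that the density integrate to 1.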
