
Introduction to Bayesian Statistics

Jenný Brynjarsdóttir and Yifang Li


Statistical and Applied Mathematical Sciences Institute (SAMSI)
and North Carolina State University

SAMSI Undergraduate workshop


May 14-18, 2012


Outline

1. Probability
   Conditional Probability
   Example: Special coin
2. Bayesian statistics
   Posterior distribution
   Coin example in Bayesian framework
3. Bayes for the Normal distribution
   Example: Jeremy's IQ
   The Normal-Normal Bayesian analysis
   The interplay of prior information and data
4. Activity


Don't you love probability?

Agatha Christie, The Mirror Crack'd. Toronto: Bantam Books, 1962:

"I think you're begging the question," said Haydock, "and I can see looming ahead one of those terrible exercises in probability where six men have white hats and six men have black hats and you have to work it out by mathematics how likely it is that the hats will get mixed up and in what proportion. If you start thinking about things like that, you would go round the bend. Let me assure you of that!"


Probability is the language of Statistics

There are two schools of thought within Statistics: Frequentist and Bayesian.

Frequentist probability
  The long-run relative frequency of an outcome or event.
  Think: rolling dice, gambling odds, etc.

Subjective probability
  We can put a measure on a degree of belief.
  If we are uncertain about something, we express the uncertainty with a probability distribution.

Bayesians work with both kinds of probabilities.


Conditional probability

Given events A and B in some sample space S,

    P(A | B) = P(A and B) / P(B)

Bayes' Rule:

    P(A | B) = P(B | A) P(A) / P(B)


Inverse probability problems

Turning the conditional probability around:

  We know that if a die is fair we have a 1/6 chance of getting a six.
  We know that if a coin is fair we have a 1/2 chance of getting a head.
  Turn this around: how can we know whether a coin is fair?

[Image of a coin: "Am I fair?"]

  To investigate, we toss the coin! How many times? When are we convinced whether the coin is fair or not?
  As we toss it we get more and more evidence and update our belief about what the probability of getting a head is.
  The tool we use for this updating is Bayes' rule.

Coin example

You have 4 visually identical coins in your pocket: 3 are standard quarters and the 4th is a special coin. The special coin appears identical to the 3 quarters, but has a 70% chance of landing heads up.

You reach into your pocket, randomly select a coin, and toss it. Suppose it lands heads up. What is the probability that the coin is the special coin?

[Image of four coins: "Which one is special?"]


Coin example continued

Start by defining a convenient notation:
  Q = standard quarter,
  S = special (weighted) quarter,
  H = the coin lands with the head facing up

We know the following:
  P(Q) = 0.75 and P(S) = 0.25
  P(H | Q) = 0.50 and P(H | S) = 0.70

What is the probability that the coin is the special quarter S? Using Bayes' Rule:

    P(S | H) = P(H | S) P(S) / P(H) = 0.70 × 0.25 / P(H)

But we also need P(H).



Coin example continued

Law of total probability:

    P(B) = P(B and A) + P(B and Aᶜ)
         = P(B | A) P(A) + P(B | Aᶜ) P(Aᶜ)

Recall that P(Q) = 0.75, P(S) = 0.25, P(H | Q) = 0.50 and P(H | S) = 0.70, so we get

    P(H) = P(H | S) P(S) + P(H | Q) P(Q)
         = 0.70 × 0.25 + 0.50 × 0.75 = 0.55

So we get

    P(S | H) = 0.70 × 0.25 / 0.55 = 0.318
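As a quick check, here is a small Python sketch of this calculation (ours, not from the original slides):

    # Bayes' rule for the special-coin example (illustrative sketch)
    p_S, p_Q = 0.25, 0.75              # priors: special coin vs. standard quarter
    p_H_S, p_H_Q = 0.70, 0.50          # P(heads | special), P(heads | quarter)

    p_H = p_H_S * p_S + p_H_Q * p_Q    # law of total probability: 0.55
    p_S_H = p_H_S * p_S / p_H          # Bayes' rule: ~0.318
    print(p_H, round(p_S_H, 3))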


Coin example continued

What is the probability that the coin is the regular quarter Q?

    P(Q | H) = P(H | Q) P(Q) / [ P(H | S) P(S) + P(H | Q) P(Q) ]
             = 0.50 × 0.75 / (0.70 × 0.25 + 0.50 × 0.75) = 0.682

More general form of Bayes' Rule:

    P(A | B) = P(B | A) P(A) / [ P(B | A) P(A) + P(B | Aᶜ) P(Aᶜ) ]


Bayesian Statistics

Thomas Bayes (1701-1761) introduced Bayes' formula; it was published posthumously in 1763 by his friend Richard Price. A more general formula was presented in 1773 by Pierre-Simon Laplace.

Posterior distribution

Bayes' Rule:

    P(A | B) = P(B | A) P(A) / [ P(B | A) P(A) + P(B | Aᶜ) P(Aᶜ) ]

Posterior distribution:

    π(θ | x) = f(x | θ) π(θ) / ∫ f(x | θ) π(θ) dθ

Prior distribution π(θ): describes our current (prior) knowledge about θ (or A); can be subjective.
Likelihood f(x | θ): the distribution of the data, viewed as a function of the parameter.
Posterior distribution π(θ | x): our updated knowledge about θ (or A) after seeing the data.
Marginal distribution: the denominator ∫ f(x | θ) π(θ) dθ, which normalizes the posterior.
Frequentist inference is based only on the likelihood.

Coin example in Bayesian framework

Suppose we take that same coin and toss it 3 more times, and each toss results in a head. Now what is the probability that the coin is the special quarter?

Previous results: P(Q) = 0.75, P(S) = 0.25, P(H | Q) = 0.50, P(H | S) = 0.70, and

    P(Q | H) = 0.682, P(S | H) = 0.318

We will use Bayes' rule again, but what is our prior, i.e. our current view of P(Q) and P(S)?

    P(Q) = 0.75 or 0.682?
    P(S) = 0.25 or 0.318?

Coin example in Bayesian framework, continued

Likelihood:

    P(HHH | Q) = (0.5)³
    P(HHH | S) = (0.7)³

Posterior:

    P(Q | HHH) = (0.5)³ × 0.682 / [ (0.5)³ × 0.682 + (0.7)³ × 0.318 ] = 0.439
    P(S | HHH) = (0.7)³ × 0.318 / [ (0.7)³ × 0.318 + (0.5)³ × 0.682 ] = 0.561


Coin example in Bayesian framework, continued

In the first round we had a 1/4 chance of having the special coin.
Then we got more information through our first experiment (H), and the chance of having the special coin increased to 31.8%.
After obtaining more data (HHH) we again updated our knowledge, and the chance that we have the special coin is now 56.1%.
Notice how we used the result of the first step, i.e. the posterior probability, as our prior in the next step.
This is an example of a Sequential Bayesian analysis:
"Today's posterior distribution is tomorrow's prior." A small sketch of this updating loop follows below.
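Here is an illustrative Python sketch of the sequential updating (ours, not from the slides): one head at a time, each posterior becomes the next prior.

    # Sequential Bayesian updating for the coin example (illustrative sketch)
    def update(prior_S, p_H_S=0.70, p_H_Q=0.50):
        """Posterior probability of the special coin after observing one head."""
        prior_Q = 1.0 - prior_S
        evidence = p_H_S * prior_S + p_H_Q * prior_Q   # law of total probability
        return p_H_S * prior_S / evidence              # Bayes' rule

    p_S = 0.25                     # initial prior
    for toss in range(4):          # the first head, then three more heads
        p_S = update(p_S)          # today's posterior is tomorrow's prior
        print(f"after head {toss + 1}: P(S | data) = {p_S:.4f}")
    # prints 0.3182, 0.3952, 0.4777, 0.5615 -- the slides' 0.561 differs slightly
    # because they round the intermediate prior to 0.318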


Normal Distribution

Normal density: N(μ, σ²)

    f(x) = 1/√(2πσ²) · exp( −(x − μ)² / (2σ²) )

[Figure: normal density curves for several values of the mean μ and standard deviation σ, illustrating how μ shifts the curve and σ controls its spread]
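As a small illustration (ours, not from the slides), the density can be evaluated directly:

    import math

    # Evaluate the N(mu, sigma^2) density at a point (illustrative sketch)
    def normal_pdf(x, mu, sigma):
        return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

    print(normal_pdf(12.0, 12.0, 2.0))   # peak height 1/sqrt(2*pi*4) ~ 0.1995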

Bayesian analysis: Normal - Normal

Example: Jeremy's IQ

Jeremy, an enthusiastic Georgia Tech student, poses a statistical model for his scores on standard IQ tests. He thinks that, in general, his scores (Y) are normally distributed with unknown mean μ and variance 80. Prior (and expert) opinion is that the IQ of Georgia Tech students, μ, is a normal random variable with mean 110 and variance 120. Jeremy took two IQ tests and scored 98 on the first and 104 on the second. What is the Bayesian estimate of Jeremy's IQ?

The frequentist estimator of μ would be ȳ = (98 + 104)/2 = 101.
To find the Bayesian estimate we want to find the mean of the posterior distribution of Jeremy's IQ.

Bayesian analysis: Normal - Normal

A general setup

Suppose we have independent data Yᵢ sampled from a normal distribution:

    Yᵢ | μ ~ N(μ, σ²),  i = 1, ..., n

Then the sample average X = Ȳ has the following normal distribution:

    Likelihood: X | μ ~ N(μ, σ²/n)

Suppose our prior information about μ can be described with the following normal distribution:

    Prior: μ ~ N(μ₀, σ₀²)

Then what is the posterior distribution for μ?

Bayesian analysis: Normal - Normal

Posterior density:

    p(μ | X) = p(X | μ) p(μ) / ∫ p(X | μ) p(μ) dμ
             ∝ p(X | μ) p(μ)
             ∝ exp{ −(X − μ)² / (2σ²/n) } · exp{ −(μ − μ₀)² / (2σ₀²) }
             ∝ exp{ −(1/2) [ μ² ( 1/(σ²/n) + 1/σ₀² ) − 2μ ( X/(σ²/n) + μ₀/σ₀² ) ] }

(dropping multiplicative terms that do not involve μ).


Bayesian analysis: Normal - Normal

A normal distribution for μ:

    f(μ) ∝ exp{ −(μ − μ̃)² / (2σ̃²) } ∝ exp{ −(1/2) [ μ²/σ̃² − 2μ μ̃/σ̃² ] }

From the last slide:

    p(μ | X) ∝ exp{ −(1/2) [ μ² ( 1/(σ²/n) + 1/σ₀² ) − 2μ ( X/(σ²/n) + μ₀/σ₀² ) ] }

So we can see that the posterior has to be a normal distribution. We simply have to determine μ̃ and σ̃². Set

    σ̃² = ( 1/(σ²/n) + 1/σ₀² )⁻¹ = σ₀² (σ²/n) / (σ₀² + σ²/n)

Then we have

    p(μ | X) ∝ exp{ −(1/(2σ̃²)) [ μ² − 2μ σ̃² ( X/(σ²/n) + μ₀/σ₀² ) ] }

Bayesian analysis: Normal - Normal

From

    p(μ | X) ∝ exp{ −(1/(2σ̃²)) [ μ² − 2μ σ̃² ( X/(σ²/n) + μ₀/σ₀² ) ] }

we see that

    μ̃ = σ̃² ( X/(σ²/n) + μ₀/σ₀² )
       = [ σ₀² (σ²/n) / (σ₀² + σ²/n) ] ( X/(σ²/n) + μ₀/σ₀² )
       = σ₀²/(σ₀² + σ²/n) · X + (σ²/n)/(σ₀² + σ²/n) · μ₀

Bayesian analysis: Normal - Normal

Summary

For

    Likelihood: Ȳ | μ ~ N(μ, σ²/n), where Ȳ is the average of Y₁, ..., Yₙ, and
    Prior: μ ~ N(μ₀, σ₀²)

we have

    Posterior: μ | Ȳ ~ N(μ̃, σ̃²)

where

    μ̃ = σ₀²/(σ₀² + σ²/n) · Ȳ + (σ²/n)/(σ₀² + σ²/n) · μ₀
    σ̃² = σ₀² (σ²/n) / (σ₀² + σ²/n)
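This update is easy to code; here is a small Python sketch of the formulas above (ours, not from the slides):

    # Normal-Normal conjugate update (illustrative sketch of the summary above)
    def normal_normal_posterior(ybar, n, sigma2, mu0, sigma0_2):
        """Posterior mean and variance of mu for Ybar | mu ~ N(mu, sigma2/n)
        and prior mu ~ N(mu0, sigma0_2)."""
        like_var = sigma2 / n
        w = sigma0_2 / (sigma0_2 + like_var)            # weight on the data mean
        post_mean = w * ybar + (1 - w) * mu0
        post_var = sigma0_2 * like_var / (sigma0_2 + like_var)
        return post_mean, post_var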

Bayesian analysis: Normal - Normal

Posterior distribution for Jeremy's IQ

Recall:

    Likelihood: Ȳ | μ ~ N(μ, σ²/n = 80/2)
    Prior: μ ~ N(μ₀ = 110, σ₀² = 120)

The posterior distribution of Jeremy's IQ is normal with mean and variance

    μ̃ = σ₀²/(σ₀² + σ²/n) · Ȳ + (σ²/n)/(σ₀² + σ²/n) · μ₀
       = 120/(120 + 80/2) × 101 + (80/2)/(120 + 80/2) × 110 = 103.25

    σ̃² = σ₀² (σ²/n) / (σ₀² + σ²/n) = 120 × (80/2) / (120 + 80/2) = 30
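Plugging Jeremy's numbers into the update reproduces these values (a self-contained sketch, ours):

    # Jeremy's numbers in the Normal-Normal update (illustrative)
    ybar, n, sigma2, mu0, sigma0_2 = 101.0, 2, 80.0, 110.0, 120.0
    like_var = sigma2 / n                               # 40
    w = sigma0_2 / (sigma0_2 + like_var)                # 0.75, weight on the data
    print(w * ybar + (1 - w) * mu0)                     # 103.25
    print(sigma0_2 * like_var / (sigma0_2 + like_var))  # 30.0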


Bayesian analysis: Normal - Normal

Credible interval for Jeremy's IQ

In Bayesian analysis the result is the posterior distribution. But sometimes we want an estimate of our parameter with uncertainty bounds.

A common Bayesian estimator is the posterior mean, μ̃. Uncertainty bounds are found from percentiles of the posterior distribution. A 95% credible interval for μ is

    μ̃ ± 1.96 σ̃ = 103.25 ± 1.96 √30 = (92.5, 114.0)

Interpretation: there is a 95% chance that Jeremy's IQ (μ) is between 92.5 and 114.
That is NOT the interpretation of the frequentist confidence interval!
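The arithmetic, as a quick sketch (ours):

    import math
    half_width = 1.96 * math.sqrt(30)
    print(103.25 - half_width, 103.25 + half_width)   # ~ (92.5, 114.0)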

Interplay of Prior information and data

The posterior mean is a combination of the prior mean and the data (the sample mean):

    μ̃ = σ₀²/(σ₀² + σ²/n) · Ȳ + (σ²/n)/(σ₀² + σ²/n) · μ₀

If we are very certain of our prior beliefs (σ₀ is small), we go with the prior mean.
If our data are good (σ is small or n is large), we go with the data mean.

Interactive demonstration ... (a small numeric sketch of these weights follows below)
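To see the weights move, here is a tiny sketch (ours, with illustrative numbers) of the weight on the data mean, w = σ₀²/(σ₀² + σ²/n):

    # Weight on the data mean as prior certainty and sample size vary (illustrative)
    sigma2 = 4.0                       # assumed data variance
    for n in (2, 10, 100):
        for sigma0_2 in (0.25, 4.0):   # confident prior vs. vague prior
            w = sigma0_2 / (sigma0_2 + sigma2 / n)
            print(f"n={n:3d}, prior var={sigma0_2}: weight on data = {w:.3f}")

With a confident prior (variance 0.25) and little data (n = 2), the data get weight 0.111; with n = 100 the data dominate regardless of the prior.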


[Four figures: likelihood, prior, and posterior densities under different settings:
 (a) N = 5, prior mean = 5, prior sd = 0.5;
 (b) N = 5, prior mean = 5, prior sd = 2;
 (c) N = 30, prior mean = 5, prior sd = 0.5;
 (d) N = 5, prior mean = 12, prior sd = 0.5]

Connection to Kalman Filter

The posterior variance can alternatively be written as

    σ̃² = σ₀² (σ²/n) / (σ₀² + σ²/n) = ( 1 − σ₀²/(σ₀² + σ²/n) ) σ₀²

The posterior mean can alternatively be written as

    μ̃ = σ₀²/(σ₀² + σ²/n) · Ȳ + (σ²/n)/(σ₀² + σ²/n) · μ₀
       = μ₀ + σ₀²/(σ₀² + σ²/n) · (Ȳ − μ₀)

These alternative forms are what you will see in Nate's lecture on the Kalman Filter - but for the multivariate case.
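A quick numeric check (ours, using Jeremy's numbers) that the two forms agree; the factor σ₀²/(σ₀² + σ²/n) plays the role of the Kalman gain:

    # The "gain" form matches the weighted-average form (illustrative check)
    mu0, sigma0_2, ybar, like_var = 110.0, 120.0, 101.0, 80.0 / 2
    gain = sigma0_2 / (sigma0_2 + like_var)
    print(mu0 + gain * (ybar - mu0))        # 103.25
    print(gain * ybar + (1 - gain) * mu0)   # 103.25, same answer
    print((1 - gain) * sigma0_2)            # 30.0, the posterior variance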

More about Bayesian statistics

The Kalman Filter can be viewed as Bayesian updating, i.e. when the distributions are normal, the variances are known and the model is linear - see Nate's lecture.

When we are not operating with the normal distribution with a known σ, obtaining the posterior distribution can be difficult or not analytically possible. The reason is the integral in

    π(θ | x) = f(x | θ) π(θ) / ∫ f(x | θ) π(θ) dθ

Instead we often try to approximate the posterior density or obtain (approximate) samples from the posterior distribution, e.g. Monte Carlo methods - see the lecture by Alex, Chia and Jessi.
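As a flavor of such approximations, here is a simple grid-approximation sketch (ours; it reuses the Normal-Normal example, where the exact posterior mean 103.25 is known, purely to illustrate replacing the integral with a sum):

    import math

    # Grid approximation of a posterior mean (illustrative sketch)
    def likelihood(mu, ybar=101.0, like_var=40.0):   # Ybar | mu ~ N(mu, 80/2)
        return math.exp(-(ybar - mu) ** 2 / (2 * like_var))

    def prior(mu, mu0=110.0, sigma0_2=120.0):        # mu ~ N(110, 120)
        return math.exp(-(mu - mu0) ** 2 / (2 * sigma0_2))

    grid = [60 + 0.01 * i for i in range(8001)]           # mu values from 60 to 140
    weights = [likelihood(mu) * prior(mu) for mu in grid] # unnormalized posterior
    total = sum(weights)                                  # stands in for the integral
    post_mean = sum(mu * w for mu, w in zip(grid, weights)) / total
    print(post_mean)                                      # ~103.25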



Activity
What is the average height of NBA players?


Hint for determining the prior

[Figure: normal curve with the axis marked from μ − 4σ to μ + 4σ, with 99.7% of the area between μ − 3σ and μ + 3σ]

Almost all (99.7%) samples from a normal distribution fall within 3 standard deviations of the mean.


The end

[Photo: Eyjafjallajökull three weeks before it erupted in 2010]


