
Econometrics

University of Queensland

July, 2008

Slides for Lecture on

An Overview of Bayesian Econometrics

Gary Koop, University of Strathclyde

1 Bayesian Theory

Reading: Chapter 1 of textbook and Appendix B, section B.1.

We will begin with broad outlines of general concepts in Bayesian theory before getting to practical models such as the regression model. If you know these general concepts you will never get lost.

What does an econometrician do? i) Estimate parameters in a model (e.g. regression coefficients), ii) Compare different models (e.g. hypothesis testing), iii) Prediction.

Bayesian econometrics does all these things based on a few simple rules of probability.

Let A and B be two events; p(B|A) is the conditional probability of B given A. It summarizes what is known about B given A.

Bayesians use this rule with A = something known or assumed (e.g. the data) and B = something unknown (e.g. coefficients in a model).

Let y be the data, y* be unobserved data (i.e. to be forecast), and M_j for j = 1, ..., m be the set of models under consideration, each of which depends on some parameters, θ_j.

Learning about parameters in a given model is based on the posterior density: p(θ_j|y, M_j).

Model comparison is based on the posterior model probability: p(M_j|y).

Prediction is based on the predictive density p(y*|y).

1.1 Bayes Theorem

I expect you know the basics of probability theory from your previous studies; see Appendix B of my textbook if you do not.

Definition: Conditional Probability

The conditional probability of A given B, denoted by Pr(A|B), is the probability of event A occurring given that event B has occurred.

Theorem: Rules of Conditional Probability, including Bayes Theorem

Let A and B denote two events, then

Pr(A|B) = Pr(A, B) / Pr(B)

and

Pr(B|A) = Pr(A, B) / Pr(A).

These two rules can be combined to yield Bayes Theorem:

Pr(A|B) = Pr(B|A) Pr(A) / Pr(B).

Note: the above is expressed in terms of two events, A and B. However, the rules also hold for two random variables, A and B, with probability or probability density functions replacing the Pr(·)s in the previous formulae.
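As a quick numerical sketch of the theorem (the numbers below are invented for illustration, not from the lecture), Bayes Theorem can be applied directly once Pr(B) is obtained from the law of total probability:

```python
# Hypothetical numbers: Pr(A) = 0.01, Pr(B|A) = 0.9, Pr(B|not A) = 0.05
p_a = 0.01
p_b_given_a = 0.9
p_b_given_not_a = 0.05

# Law of total probability gives Pr(B)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes Theorem: Pr(A|B) = Pr(B|A) Pr(A) / Pr(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))   # 0.154
```

Even though Pr(B|A) is large, Pr(A|B) is small here because the prior probability Pr(A) is small.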

1.2 Learning About Parameters in a Given Model (Estimation)

Assume we are working with a single model (e.g. a regression model with a particular set of explanatory variables) which depends on parameters θ.

So we want to figure out properties of the posterior p(θ|y).

It is convenient to use Bayes rule to write the posterior in a different way. Bayes rule lies at the heart of Bayesian econometrics:

p(B|A) = p(A|B) p(B) / p(A).

Replace B by θ and A by y to obtain:

p(θ|y) = p(y|θ) p(θ) / p(y).

Bayesians treat p(θ|y) as being of fundamental interest. That is, it directly addresses the question "Given the data, what do we know about θ?".

The treatment of θ as a random variable is controversial among some econometricians. The chief competitor to Bayesian econometrics, called frequentist econometrics, says that θ is not a random variable.

For estimation we can ignore the term p(y) since it does not involve θ:

p(θ|y) ∝ p(y|θ) p(θ).

p(θ|y) is referred to as the posterior density (i.e. after, or posterior to, seeing the data), p(y|θ) is the likelihood function, and p(θ) is the prior density.

Posterior is proportional to likelihood times prior.

p(θ) does not depend on the data. It contains any non-data information available about θ.

Prior information is a controversial aspect of Bayesian econometrics since it sounds unscientific.

Bayesian answers (to be elaborated on later):

i) Often we do have prior information and, if so, we should include it (more information is good)

ii) Can work with noninformative priors

iii) Can use empirical Bayes methods which estimate the prior from the data

iv) Training sample priors

v) Bayesian estimators often have better frequentist properties than frequentist estimators (e.g. results due to Stein show the MLE is inadmissible but Bayes estimators are admissible)

vi) Prior sensitivity analysis
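The rule "posterior is proportional to likelihood times prior" can be seen directly in a minimal numerical sketch (the prior and data below are invented for illustration): with a Beta(2, 2) prior on an unknown probability θ and 7 successes in 10 binomial trials, the posterior is built on a grid by multiplying likelihood and prior, then normalizing:

```python
import numpy as np

# Grid of candidate values for an unknown probability theta
theta = np.linspace(0.001, 0.999, 999)
dtheta = theta[1] - theta[0]

# Prior: Beta(2, 2) kernel theta*(1 - theta), mildly informative around 0.5
prior = theta * (1 - theta)

# Data: 7 successes in 10 trials; binomial likelihood kernel
likelihood = theta**7 * (1 - theta)**3

# Posterior is proportional to likelihood times prior;
# normalize so it integrates (numerically) to one
unnorm = likelihood * prior
posterior = unnorm / (unnorm.sum() * dtheta)

# Posterior mean; the exact answer is the Beta(9, 5) mean, 9/14 ≈ 0.643
post_mean = (theta * posterior).sum() * dtheta
print(round(post_mean, 3))
```

Note that p(y) never needs to be computed: normalizing the product of likelihood and prior recovers it implicitly.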

1.3 Prediction in a Single Model

Prediction is based on the predictive density p(y*|y).

Since a marginal density can be obtained from a joint density through integration (see Appendix B) we can write:

p(y*|y) = ∫ p(y*, θ|y) dθ.

The term inside the integral can be rewritten using another rule of probability as:

p(y*|y) = ∫ p(y*|y, θ) p(θ|y) dθ.

Prediction involves the posterior and p(y*|y, θ) (more description provided later).
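The second integral suggests a simple simulation recipe: draw θ from the posterior, then draw y* from p(y*|y, θ). A minimal sketch, assuming (purely for illustration, not from the lecture) a N(1.0, 0.1^2) posterior for a normal mean with known data variance 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed posterior for a normal mean theta (data variance known to be 1):
# N(1.0, 0.1^2) -- purely illustrative numbers
S = 100_000
theta_draws = rng.normal(1.0, 0.1, size=S)   # theta^(s) drawn from p(theta|y)
ystar_draws = rng.normal(theta_draws, 1.0)   # y*^(s) drawn from p(y*|y, theta^(s))

# ystar_draws is now a sample from the predictive density p(y*|y)
print(ystar_draws.mean(), ystar_draws.var())
```

The predictive variance (about 1.01 here) exceeds the data variance because it mixes parameter uncertainty with sampling uncertainty.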

1.4 Model Comparison (Hypothesis Testing)

Models are denoted by M_j for j = 1, ..., m. M_j depends on parameters θ_j.

The posterior model probability is p(M_j|y).

Using Bayes rule with B = M_j and A = y we obtain:

p(M_j|y) = p(y|M_j) p(M_j) / p(y).

p(M_j) is referred to as the prior model probability.

p(y|M_j) is called the marginal likelihood.

How is the marginal likelihood calculated? The posterior can be written as:

p(θ_j|y, M_j) = p(y|θ_j, M_j) p(θ_j|M_j) / p(y|M_j).

Integrate both sides of the previous equation with respect to θ_j, use the fact that ∫ p(θ_j|y, M_j) dθ_j = 1 (since probability density functions integrate to one) and rearrange to obtain:

p(y|M_j) = ∫ p(y|θ_j, M_j) p(θ_j|M_j) dθ_j.
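Since this last integral is just the likelihood averaged over the prior, it can be approximated by Monte Carlo with draws from the prior. A minimal sketch with invented data (7 successes in 10 binomial trials) and a uniform Beta(1, 1) prior; note that this simple prior-sampling estimator is often very inefficient in practice:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)

n, k = 10, 7                             # data: 7 successes in 10 trials
S = 200_000
theta = rng.uniform(0.0, 1.0, size=S)    # draws from the Beta(1,1) prior

# Average the likelihood over prior draws to estimate the marginal
# likelihood p(y|M) = integral of p(y|theta, M) p(theta|M) dtheta
lik = comb(n, k) * theta**k * (1 - theta)**(n - k)
marg_lik = lik.mean()
print(round(marg_lik, 4))   # exact answer under a uniform prior: 1/11 ≈ 0.0909
```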

Note that the marginal likelihood depends only on the prior and likelihood.

Bayesians often present the posterior odds ratio to compare two models:

PO_jh = p(M_j|y) / p(M_h|y) = [p(y|M_j) p(M_j)] / [p(y|M_h) p(M_h)].

Note that, since p(y) is common to both models, we do not need to work it out. We can use the fact that p(M_1|y) + p(M_2|y) + ... + p(M_m|y) = 1, together with the posterior odds ratios, to calculate the posterior model probabilities.

For instance, if we have m = 2 models then we can use the two equations:

p(M_1|y) + p(M_2|y) = 1

and

PO_12 = p(M_1|y) / p(M_2|y)

to work out

p(M_1|y) = PO_12 / (1 + PO_12)

and

p(M_2|y) = 1 - p(M_1|y).
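A small numerical sketch of these two equations, with invented marginal likelihoods and equal prior model probabilities:

```python
# Hypothetical marginal likelihoods for two models, with equal prior
# model probabilities p(M_1) = p(M_2) = 0.5
ml_1, ml_2 = 2.0e-5, 0.5e-5
prior_1, prior_2 = 0.5, 0.5

po_12 = (ml_1 * prior_1) / (ml_2 * prior_2)   # posterior odds ratio PO_12
p_m1 = po_12 / (1 + po_12)                    # p(M_1|y)
p_m2 = 1 - p_m1                               # p(M_2|y)
print(po_12, p_m1, round(p_m2, 3))            # 4.0 0.8 0.2
```

With equal prior probabilities, the posterior odds ratio reduces to the ratio of marginal likelihoods, i.e. the Bayes Factor defined next.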

The Bayes Factor is defined as:

BF_jh = p(y|M_j) / p(y|M_h).

1.5 Summary

On one level, this course could end right here. These few pages have outlined all the basic theoretical concepts required for the Bayesian to learn about parameters, compare models and predict. We stress what an enormous advantage this is. Once you accept that unknown things (i.e. θ, M_j and y*) are random variables, the rest of the Bayesian approach is non-controversial.

What are we going to do in the rest of this course?

See how these concepts work in commonly-used models (e.g. the regression model).

Bayesian computation.

2 Bayesian Computation

How do you present results from a Bayesian empirical analysis?

p(θ|y) is a p.d.f. Especially if θ is a vector of many parameters, we cannot present a graph of it.

We want features analogous to frequentist point estimates and confidence intervals.

A common point estimate is the mean of the posterior density (or posterior mean).

Let θ be a vector with k elements, θ = (θ_1, ..., θ_k)'.

The posterior mean of any element of θ is:

E(θ_j|y) = ∫ θ_j p(θ|y) dθ.

Aside (Definition B.8): Expected Value

Let g(·) be a function; then the expected value of g(X), denoted E[g(X)], is defined by:

E[g(X)] = Σ_{i=1}^N g(x_i) p(x_i)

if X is a discrete random variable with sample space {x_1, x_2, x_3, ..., x_N}, and

E[g(X)] = ∫_{-∞}^{∞} g(x) p(x) dx

if X is a continuous random variable (provided E[g(X)] < ∞).

The most common measure of dispersion is the posterior standard deviation, which is the square root of the posterior variance. The latter is calculated as:

var(θ_j|y) = E(θ_j^2|y) - [E(θ_j|y)]^2,

which requires evaluating the posterior mean as well as:

E(θ_j^2|y) = ∫ θ_j^2 p(θ|y) dθ.

There are many other possible features of interest, e.g. what is the probability that a coefficient is positive?

p(θ_j > 0|y) = ∫_0^∞ p(θ|y) dθ

All of the posterior features which the Bayesian may wish to calculate have the form:

E[g(θ)|y] = ∫ g(θ) p(θ|y) dθ,

where g(θ) is a function of interest.

All of these features have integrals in them. The marginal likelihood and predictive density also involve integrals.

Apart from a few simple cases, it is not possible to evaluate these integrals analytically, and we must turn to the computer.

2.1 Posterior Simulation

The integrals involved in Bayesian analysis are usually evaluated using simulation methods. We will develop many of these later. Here we just describe the basic idea to give you some intuition.

From your study of frequentist econometrics you should know some asymptotic theory and, in particular, the idea of a Law of Large Numbers (LLN) and a Central Limit Theorem (CLT).

A typical LLN would say: consider a random sample, Y_1, ..., Y_N; as N goes to infinity, the average converges to its expectation (e.g. Ȳ → μ).

Bayesians use a LLN as: consider a random sample from the posterior, θ^(1), ..., θ^(S); as S goes to infinity, the average of these converges to E[θ|y].

Note: Bayesians use asymptotic theory, but asymptotic in S (under the control of the researcher), not N.

Example: Monte Carlo integration.

Let θ^(s) for s = 1, ..., S be a random sample from p(θ|y) and define

ĝ_S = (1/S) Σ_{s=1}^S g(θ^(s)),

then ĝ_S converges to E[g(θ)|y] as S goes to infinity.

Monte Carlo integration can be used to approximate E[g(θ)|y], but only if S were infinite would the approximation error go to zero.

We can choose any value for S (although larger values of S will increase the computational burden).

To gauge the size of the approximation error, we can use a CLT to obtain a numerical standard error.
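A minimal sketch of Monte Carlo integration with a numerical standard error, using (purely for illustration, not from the lecture) N(2, 1) "posterior" draws and the function of interest g(θ) = θ^2:

```python
import numpy as np

rng = np.random.default_rng(42)

# Pretend posterior draws theta^(s) from p(theta|y); here N(2, 1) is
# used purely for illustration, so E[g(theta)|y] = 1 + 2^2 = 5
S = 50_000
draws = rng.normal(2.0, 1.0, size=S)

g = draws**2                       # function of interest g(theta) = theta^2
g_hat = g.mean()                   # Monte Carlo estimate of E[g(theta)|y]
nse = g.std(ddof=1) / np.sqrt(S)   # numerical standard error via the CLT

print(g_hat, nse)
```

The numerical standard error shrinks at rate 1/√S, so the researcher can make the approximation error as small as desired by raising S.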

In practice, most Bayesians write their own programs (e.g. using Gauss, Matlab or R) to do posterior simulation.

BUGS (Bayesian Analysis Using Gibbs Sampling) and BACC (Bayesian Analysis, Computation and Communication) are popular Bayesian packages, but they only have a limited set of models (or require substantial programming to adapt to other models).

Bayesian work cannot (easily) be done in standard econometric packages like Microfit, Eviews or Stata.
