An Introduction to Modern Bayesian Econometrics

Tony Lancaster May 26, 2003

Contents

Introduction
1 The Bayesian Algorithm
2 Prediction and Model Criticism
3 Linear Regression Models
4 Bayesian Calculations
5 Nonlinear Regression Models
6 Randomized, Controlled and Observational Data
7 Models for Panel Data
8 Instrumental Variables
9 Some Time Series Models
Appendix 1: A Conversion Manual
Appendix 2: Programming
Appendix 3: BUGS Code

Introduction

This book is an introduction to the Bayesian approach to econometrics. It is written for students and researchers in applied economics. The book has developed out of teaching econometrics at Brown University where the typical member of the class is a graduate student, in his second year or higher. If he is an economics student he has taken in his first year a semester course on probability and random variables followed by a semester dealing with the elements of inference about linear models from a classical point of view. The mathematics used in the book rarely extends beyond introductory calculus and the rudiments of matrix algebra, and I have tried to limit even this to situations where mathematical analysis clearly seems to give additional insight into a problem. It could, therefore, be studied by upper level undergraduates, particularly in Europe and other countries with European style undergraduate programs.

It is desirable that the reader is familiar with the laws of probability, the ideas of scalar and vector random variables, the notions of marginal, joint and conditional probability distributions, and the simpler limit theorems. Some facility with computer software for doing statistical calculations would be an advantage because the book contains many examples and exercises that ask the reader to simulate data and calculate and plot the probability distributions that are at the heart of Bayesian inference. For simple cases these sums can be done in, for example, Matlab or one of the several variants of the S language. I supply code written in S for many of the examples. More complicated calculations rely on purpose built Bayesian software, specifically a package with the unlikely name of BUGS, and to make full use of this book it is necessary to obtain and learn to use this package.

Whether it is useful to have previous knowledge of econometrics is debatable. On the one hand it is helpful to have some understanding of the method of least squares and of regression, and of fundamental econometric notions such as endogeneity and structure. On the other hand this book deals exclusively with Bayesian econometrics and this is a radically different approach to our subject than that used in all[1] existing introductory texts. This means that someone whose training has been confined to the conventional approach may find this immersion to be a barrier to understanding the Bayesian method. Because Bayesian inference is different from what is customary it is, in my experience, extraordinarily difficult for ordinary mortals to change their way of thinking from the traditional way to the Bayesian way or vice versa. At least it is for me, and I notice that most of my students face the same problem.

[1] 947 as of January 2003.

This book is about the Bayesian approach to inference; it is not a book about comparative methods and it contains little about traditional approaches, which are covered in many textbooks.

My aim has been to answer two rather simple questions. The first is “What is Bayesian Econometrics?” and the second is “How do I do it?” In the first chapter I explain that Bayesian Econometrics is nothing more than the systematic application of a single theorem, Bayes’ theorem. I also provide a brief answer to the second question, namely that to apply this theorem in an econometric investigation the best method, in general, is to use our new computer power to sample from the probability distributions that the theorem requires us to calculate. This is the meaning of the word “Modern” in the book’s title. In 1989 the methods described here were scarcely known; in 1995 they would have been difficult for a beginner to apply; in 2003 application of these computer intensive methods is little, if any, more difficult than application of the methods traditionally used in applied econometrics.

The remainder of the book essentially provides applications of Bayes’ theorem and illustrations of the method of calculation using mostly the simplest models. These illustrations are not comprehensive; indeed, for an (imaginary) reader who gets the point of the opening chapters, they are unnecessary! Bayesian analysis of important economic models has been going on since the 1960s and significant progress has been made with a number of applications. I do not even deal with all those cases in which the method has been applied, but rather confine my examples to cases that I feel comfortable explaining. My hope is that just a few examples will be sufficient to enable the reader to tackle his own problem using what I shall later call The Bayesian Algorithm; extensions to more complex structures will in many cases be fairly obvious.

One way to read the book is to get the gist of the Bayesian method from chapters one and two, without necessarily going into the more detailed discussion in these chapters, then to read chapter three to get a broad understanding of Markov chain Monte Carlo methods. The reader could then choose among the remaining chapters, which are illustrations of the use of Bayesian methods in particular areas of application, according to his or her interests.

The book could be used as the basis for a one semester course at graduate or advanced undergraduate level. I have used it as such on several occasions with a teaching style that emphasizes calculation and the practicality of Bayesian methods, demonstrates sampling algorithms, including Markov chain Monte Carlo procedures, in class, and requires students to solve problems numerically.
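As a minimal sketch of what "sampling from the probability distributions that the theorem requires us to calculate" can look like, consider Bayes' theorem in the form p(θ|y) ∝ p(y|θ) p(θ), applied to a success probability θ observed through a handful of yes/no outcomes. The book's own examples are written in S and BUGS; Python is used here purely for illustration, and the data, the uniform prior and the function name `sample_posterior` are all assumptions made for this sketch rather than anything taken from the text.

```python
import random


def sample_posterior(successes, trials, draws=20000, seed=0):
    """Draw from the posterior of a Bernoulli success probability theta.

    Assumes a uniform Beta(1, 1) prior (an illustrative choice). With a
    binomial likelihood, Bayes' theorem then gives a
    Beta(1 + successes, 1 + failures) posterior, which can be sampled directly.
    """
    rng = random.Random(seed)
    a = 1 + successes
    b = 1 + (trials - successes)
    return [rng.betavariate(a, b) for _ in range(draws)]


# Hypothetical data: 7 successes in 10 trials.
draws = sample_posterior(successes=7, trials=10)

# Posterior summaries are computed from the sample rather than by calculus.
post_mean = sum(draws) / len(draws)
prob_above_half = sum(d > 0.5 for d in draws) / len(draws)

print(f"posterior mean of theta (exact value is 8/12): {post_mean:.3f}")
print(f"estimated P(theta > 0.5 | data): {prob_above_half:.3f}")
```

In this toy case the posterior is available in closed form, so the sample mean can be checked against the exact Beta mean of 8/12. The same pattern of drawing from the posterior and then summarizing the draws carries over to models with no closed form, where the draws would have to come from a method such as Markov chain Monte Carlo instead.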