HANDS-ON INTERMEDIATE

ECONOMETRICS USING R
Templates for Extending Dozens
of Practical Examples

Hrishikesh D Vinod
Fordham University in New York, USA

World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

HANDS-ON INTERMEDIATE ECONOMETRICS USING R


Templates for Extending Dozens of Practical Examples
(With CD-ROM)
Copyright © 2008 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.

ISBN-13 978-981-281-885-0
ISBN-10 981-281-885-5

Typeset by Stallion Press


Email: enquiries@stallionpress.com

Printed in Singapore.



September 15, 2008 14:20 9in x 6in B-645 fm FA

To the memory of

My father Maharshi Nyaya Ratna Vinod,
who had been an Indian freedom fighter, Marathi poet
and was elected as the World Peace Ambassador,
http://www.maharshivinod.org/

My mother Maitreyi Dhundiraj Vinod,
who was a pioneer in women’s education in India
and an inspiration to millions of women through
her own outstanding academic achievement

and

My mother-in-law Dr. Kamal Krishna Joshi,
who provided free medical care to countless poor and needy
women in Nagpur, India for sixty long years, twice a week.


Preface

My teacher, Wassily Leontief, a Nobel laureate, had a great influence on
me. In fact, my Harvard dissertation was on “Non-Linearization of Leontief
Input–Output Analysis.” Leontief’s (1982) piece in Science criticized the
typical models used by academic economists for assuming stationary
equilibria and lamented “the splendid isolation” of academic economists
from the real world. Morgan (1988) examines Leontief’s points by surveying
the content of economics journals such as the American Economic Review
(AER) and argues that there is “market failure” in academic economics with
excess emphasis on “fine points of logic to win approval within the guild.”
Fortunately, things are changing in the new century. This book includes
realistic econometric tools which model nonstationary equilibria. The book
is inspired by Ernst Berndt’s (1991) hands-on book, which was suited to an
earlier era before the Internet.
Similar to Berndt, the book should help the reader in learning to practice
econometrics with numerical estimation work involving computer software.
Econometrics has been around as a subject of scientific inquiry since about
1933, when the journal Econometrica was founded by Ragnar Frisch and
others. It grew a great deal in the 20th century and matured. The book
uses econometric examples involving several policy issues such as divestiture
of the Bell System, global warming, etc. Hence it can be of interest to all
social and biological scientists, engineers and legal professionals, in
addition to those who use the R software. I assume that the reader has only
a limited (not zero) exposure to the basics of Economics, Statistics,
Finance, Computer Science, and Mathematics. Every attempt is made to derive
results from first principles, to use self-explanatory notation and minimal
cross-referencing, and to fully explain the mathematics used. Some R
snippets allow the readers to ‘see’ what various matrix operations (e.g.,
Kronecker products) do and numerically verify the algebraic identities
(e.g., singular value decomposition) with simple examples.
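As a foretaste of that style, the following minimal sketch (not one of the snippets from the book's CD; the matrices A, B and X are made up purely for illustration) shows a Kronecker product and then checks numerically that a matrix is recovered from its singular value decomposition.

```r
# Two small matrices chosen only for illustration
A = matrix(1:4, nrow = 2)    # 2 x 2 matrix, filled column by column
B = diag(2)                  # 2 x 2 identity matrix

# Kronecker product: each element of A multiplies the whole matrix B
K = kronecker(A, B)          # the operator form A %x% B gives the same result
print(K)                     # a 4 x 4 matrix

# Singular value decomposition X = U D V'
X = matrix(c(2, 0, 1, 1, 3, 1), nrow = 3)   # 3 x 2 matrix
s = svd(X)                                  # s$u, s$d and s$v hold U, diag(D), V
Xhat = s$u %*% diag(s$d) %*% t(s$v)         # rebuild X from the three pieces

# The identity holds up to rounding error
print(max(abs(X - Xhat)))
print(all.equal(X, Xhat))
```

Changing the entries of A, B or X and rerunning the lines is a quick way to see what each operation does.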


Thus, the book is structured for five types of potential uses.

1. As a textbook for graduate or advanced undergraduate econometrics
courses for students with mixed backgrounds, focused on applications
rather than proofs.
2. As a supplemental textbook for the usual econometrics/applied statistics
courses in social sciences, engineering, law and finance.
3. As a tool for any student or researcher wanting hands-on learning of some
of the basic regression and applied statistics methods.
4. As supplemental material for computer science courses teaching
object-oriented languages including R.
5. As a reference for data sources and research ideas.

In the new century, fast linking of all scientists through the Internet
is revolutionizing the exchange of scientific ideas including data and
software tools. This book is intended to highlight and welcome these
exciting changes. Since open source free software is particularly powerful
for exchange of software tools, the book embraces R (distributed under
Free Software Foundation’s GNU GENERAL PUBLIC LICENSE Version
2, June 1991).

Why R?
All empirical scientists should learn more than one programming language,
not some point-and-click software restricted by the imagination of the
programmer. R is based on an earlier language called S developed at Bell Labs
around 1979 (by John Chambers and others with whom I often used to
have lunch). It is an object-oriented Unix-type language, where all inputs,
data and outputs from R are ‘objects’ inside the computer. Also, R is
an “interpreted” programming language, not a “compiled” language like
FORTRAN or GAUSS (reviewed in Vinod, J2000c) or many older languages.
Hence, all R commands (or command sets enclosed in curly braces)
are executed as they are typed (or ‘sourced’).
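The interpreted nature of R can be seen in a minimal sketch such as the following. The object names and the temporary file are hypothetical, chosen only to keep the example self-contained; they are not taken from the book's snippets.

```r
# A single command is evaluated as soon as it is entered
x = c(2, 4, 6)
print(mean(x))

# Commands enclosed in curly braces are evaluated together as one unit;
# the value of the block is the value of its last expression
y = {
  a = sum(x)
  b = length(x)
  a / b
}
print(y)

# 'Sourcing' runs the commands stored in a text file. A temporary file
# stands in here for a file of saved commands.
f = tempfile(fileext = ".R")
writeLines(c("z = x^2", "total = sum(z)"), f)
source(f)
print(total)
```

Each line can be pasted into the R console one at a time to watch the objects x, y and total being created.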
An advantage of Unix-type languages is that numerical results are
identical across platforms (platform-independent), and R is available for a
wide variety of UNIX platforms, Windows and Mac systems. This provides one
kind of flexibility in using R. S-Plus is the commercial version of S, and R
is the free version. Since S-Plus programs work in R, this allows a second
kind of flexibility for R. A third kind of flexibility of R arises from the
fact that it is open source, meaning that every line of the code is
available for any researcher anywhere to see, modify, criticize, etc. A
black box of hidden code is unavoidable for any proprietary commercial
software. Not so for R. Anyone with enough patience to learn R can modify
available software to suit a particular application. The fourth kind of
flexibility of R is that it offers a package called ‘Rcmdr’ for the
convenience of casual users who want the point-and-click mode and do not
wish to go beyond standard techniques. A fifth kind of flexibility is the
ability to choose from a wide range of packages without having to figure
out whether they are worth the money. New packages and modules of
proprietary software are often expensive, and their vendors expect users to
keep paying for ever newer versions. The buyer has to figure out if the
latest version is really new, bug-free and worth buying. By contrast, the
latest versions of all R packages are freely and readily available, making
life easier for users.
Vinod (J1999b, J2000c, J2003c & d, J2004d & e) discusses numerical
accuracy issues. R is believed to be numerically one of the most accurate
languages available. The accuracy of its nonlinear algorithms and random
number generators is reasonable. Although not perfect, the R language is
progressing fast. I am convinced that R is headed to become the lingua franca
of all applied statistics, including econometrics.
The book provides numerous snippets of R code, with detailed comments
explaining what the code does, in the hope of shortening the learning
curve. Since the idea of learning a programming language like R may seem
daunting to uninterested students, this book does some spoon feeding. It
motivates the reader by making the learning a bit more hands-on and fun.
The reader can first do any one of the dozens of computing tasks in our
snippets (e.g., run projection pursuit regression), look at the results,
and choose to learn only the relevant features of interesting tasks at
leisure. My students find it fun to discover hidden jewels inside R by
direct searches on Google’s website and by running the illustrative
examples inside the contributed packages.
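To give the flavor of such a task, here is a minimal sketch of a projection pursuit regression. It is an assumption-laden illustration, not the money demand example of Chapter 11: it uses the function ppr() from the ‘stats’ package and the ‘rock’ data set that ships with R, following the pattern of the example on the ppr help page.

```r
# Projection pursuit regression with ppr() from the 'stats' package,
# applied to the 'rock' data set that comes with R
data(rock)

# Rescale two regressors so that all inputs have comparable magnitudes
rock1 = transform(rock, area1 = area / 10000, peri1 = peri / 10000)

# Fit log(permeability) on three regressors using two ridge terms
fit = ppr(log(perm) ~ area1 + peri1 + shape, data = rock1,
          nterms = 2, max.terms = 3)

# Look at the results first, then read help(ppr) for the details
summary(fit)
```

One can run it, look at the printed projection directions, and only then decide which features of ppr() are worth studying further.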

Advantages of R for Replication of Published Research


There is a new emphasis on replicable empirical work, for the benefit of all
students and researchers, anywhere in the globalized world. Under the able
editorship of Ben Bernanke (current Chairman of the Federal Reserve, the US
central bank), reproducible empirical work is being emphasized at the
American Economic Review (AER), the main journal of the American Economic
Association. Bernanke (2004) cites McCullough and Vinod (J2003d) in his
“editorial statement” requiring all AER authors to submit both data and
software code. Indeed, it would be beneficial for the profession if any
researcher anywhere were able to replicate published quantitative results
from all Economics journals. Econometrica, Journal of Political Economy and
many other top journals are following the lead of AER and require authors
to submit both data and software. I hope that this book facilitates the
movement toward replicable econometrics by making R easier and more fun.
Replication with free R will be much simpler and faster for international
students, eliminating possible delays in paying for software in different
currencies.

Obtaining R and its documentation files


Go to http://www.r-project.org. On the left side, under “Download,” click
on the CRAN (Comprehensive R Archive Network) link. Pick a mirror
closest to your location (e.g., Pittsburgh). Assuming that you have a
Windows PC, under “Download and install R” click on Windows. Now,
under subdirectories, click on ‘base.’ Next, click on the fourth item under “in
this directory:” Do not be confused by options. Click on R-2.7.0-win32.exe
(or a similar latest setup program, size over 25 megabytes). Depending on
your Internet connection, the download will take some time. The ‘setup’
program creates an icon for R-Gui (graphical user interface). The point
is that anyone with an Internet connection sitting almost anywhere in the
world can get R on any number of computers completely free of charge. This
may seem too good to be true. Many books, some free manuals and other
documentation are also available by clicking on a link in the left column
under “Documentation” at the R homepage. The Wiki for R
(http://wiki.r-project.org) is particularly useful for newcomers.

Packages within R
R comes with the “base” package; additional contributed (“contrib”)
packages have to be explicitly requested from the R website. The
contributed packages are written by statisticians, computer scientists and
econometricians from around the world and contain several functions for
doing operations plus many illustrative data sets. They are freely available
on demand for noncommercial purposes. All R packages are required to
follow a nice readable format describing all their functions. Each function
is described with details of inputs and outputs and references to the
literature, and generally comes with examples. If one is curious to know
exactly how a function was implemented, usually all one has to do is type
the name of the function and the entire program can be seen. The user then
has the option to modify the program, except that beginners will find that
many programs are hard to understand, since their expert authors use their
ingenuity in writing efficient (fast, not necessarily easy to read) code. By
contrast, the code in the snippets in this book should be easy to read.
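A minimal sketch of this kind of inspection follows, using the standard deviation function sd() from the ‘stats’ package as an arbitrary example; the choice of function is mine and any other function name could be substituted.

```r
# Typing a function's name without parentheses prints its R code
print(sd)

# The same code can be captured as lines of text for closer study
src = deparse(sd)
print(length(src) > 0)

# The names of the inputs (arguments) the function accepts
arg.names = names(formals(sd))
print(arg.names)

# The documentation describing inputs, outputs and references is shown by
#   help(sd)
# and the examples at the end of that help page are run by
#   example(sd)
```

Some functions print only a short stub (for instance a call to compiled code), so what one sees varies from function to function.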
In addition to user manuals, some packages also have “vignettes,”
which are fully worked out examples explaining the usage and
interpretation of results (type ‘vignette()’ to see what is available). I
always download and read both the user manuals and vignettes from the
r-project website for contributed packages to fully understand what each
package does and its full potential. Downloading the latest version of any
package into R takes only a couple of clicks, and automatic updates are
similarly available from the R-Gui on the user’s desktop. The Journal of
Statistical Software of the American Statistical Association has the URL
(http://www.jstatsoft.org/) for free download with a number of articles
dealing with R. A recent volume 27 (July 2008) edited by Achim Zeileis
and Roger Koenker deals with ‘Econometrics in R.’ It has excellent articles
describing the following packages: plm, forecast, vars, np, Redux,
sampleSelection, pscl.
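The same tasks can be done from the R command line instead of the menus. The sketch below assumes only base R; the commented lines need an Internet connection or open a document viewer, and the package name ‘plm’ is used simply because it is one of the packages listed above.

```r
# List the vignettes available in the packages installed on this computer
v = vignette()
print(head(v$results[, c("Package", "Item", "Title")]))

# Open one particular vignette (this launches a document viewer):
#   vignette("grid", package = "grid")

# Install a contributed package from CRAN and then load it:
#   install.packages("plm")
#   library(plm)

# Bring all installed packages up to their latest versions:
#   update.packages()
```

The list printed by the first command grows as more contributed packages are installed.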

Organization of the Book


The book begins with an introductory chapter using production functions
to illustrate the relevance of nonlinear functions of regression
coefficients. We include preliminary data analysis with recent tools
(Cook’s distance), multiple regression methods, singular value
decomposition, the collinearity problem and ridge regression. Chapters 2, 3
and 5 deal with time series analysis with the standard topics and some
sophisticated ones including: autoregressive distributed lags (ARDL), chaos
theory, mean reversion, long memory, spectrum analysis, ergodicity,
stationarity, business cycles with imaginary roots of AR(2), impulse
response, vector autoregression (VAR) models, stochastic diffusions,
cointegration, Granger causality testing and multivariate techniques
including canonical correlations.
Chapter 4 discusses expected and non-expected utility theory and
implications for Finance and other areas. It has new tools for measuring up
to fourth-order stochastic dominance, which are used to study ‘prudence.’
Chapter 6 has detailed derivations of simultaneous equations theory
including k-class estimators, limited information maximum likelihood (LIML)
and the ‘identification’ problem, with hands-on examples. The novelty of
Chapter 7 on ‘limited dependent variables’ is that besides economists’
favorite Tobit and Heckman estimators, it explains the less familiar
generalized linear model (GLM) viewpoint used by biostatisticians with
their ‘link functions’ implying superiority of logit over probit. We also
discuss survival models for oil company CEOs.
Chapter 8 has sophisticated consumer theory including Wiener-Hopf
dynamic optimization and kernel estimation. Chapter 9 explains the
traditional bootstrap, its limitations and several newer tools including
double and maximum entropy bootstraps (with easy to use R code). Chapter
10 has generalized least squares (GLS), generalized method of moments
(GMM), vector autoregressive moving average (VARMA) models, ‘estimating
functions’ and related pivot functions with examples. Chapter 11
deals with nonlinear models and explains how to use ‘projection pursuit
regression’ for money demand equations.
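To make the ‘link function’ idea mentioned for Chapter 7 concrete, here is a minimal sketch using the glm() function from the ‘stats’ package. The ‘mtcars’ data set and the choice of variables are my own assumptions made only for illustration; they are not an example from Chapter 7, and the output is not meant to settle the comparison of logit and probit.

```r
# Binary dependent variable: am (1 = manual transmission) explained by wt
data(mtcars)

# Same model, two different link functions
logit  = glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
probit = glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))

# Coefficients are on different scales but have the same signs
print(cbind(logit = coef(logit), probit = coef(probit)))

# Akaike information criterion for an informal comparison of fit
print(AIC(logit, probit))
```

Only the ‘link’ argument changes between the two fits, which is the sense in which logit and probit are members of one family of models.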
The topics covered are influenced by a need to illustrate them with
examples within the size limitations of a book and my own familiarity with
the topic. Admittedly somewhat vainly, I have separated the long list of my
own and joint papers from the list of other authors. My list uses the prefix
J for journal articles, P for published proceedings, B for books or chapters
in books and U for unpublished but widely circulated pieces. Since it is not
possible to include additional papers of mine, an Appendix lists a thematic
classification of my papers. It shows that the book could not accommodate
some themes and state space (Kalman filter) modeling from Vinod (B1983,
1990, J1995c) using the R package ‘sspir.’
An important selling point of the book is that it has numerous program
snippets in R. My hope is that the snippets have useful practical information
about implementing various theoretical results. The reader is encouraged to
read them and treat the snippets as templates. I hope readers will modify
the snippets to apply to different and more interesting data and models.
This is the sense in which we call this book “hands-on.” The reader can
readily copy and paste each snippet into R while reading the discussion and
see R work out the results first hand. For brevity, very few of the outputs
appear in the printed book.
I am grateful to the following former/current students for detailed
comments: Erik Dellith on Chapter 1, Brian Belen on Chapter 5, Caleb Roepe
on Chapter 7 and Diana Rudean on Chapter 9.
Many snippets in the book have been tried by graduate students at
Fordham University. The students have found it fun to try them on some
data sets of their own choice. It is important not to treat the snippets as
black boxes to blindly perform some tasks. Professional programmers try
to write the most efficient code in the sense that it works fast and has as
few lines as possible. The snippets in this book are not at all professional.
Instead, they should be viewed as tools for learning to use R in applied
work. Hence instead of merely copying and pasting them, my students were
encouraged to slightly modify each snippet to suit a distinct problem. Such a
hands-on approach makes it fun. A purchaser of this book need not type the
snippets, since a CD in a sleeve inside the back cover contains all snippets
as text files.
I welcome suggestions for improvements to the content of the book
and/or the snippets. Although to professionals improved code means faster
running with fewer lines, I prefer readable code with some intuition for the
steps used and lots of comments. Of course, I will give proper credit to the
person(s) suggesting any improvements. On that note, Professor Peter C.
B. Phillips, the eminent econometrician from Yale University who has seen
a preprint of this book, suggested that I should clarify that my snippets
often use the assignment symbol “=” from GAUSS and FORTRAN, instead
of the “<-” symbol preferred by R and S-Plus professionals. My reasons for
using the “=” symbol are that it works in R, requires less typing by me and
takes up less space in the printed book. Thus our snippets are not portable
to S-Plus until the user replaces “=” by the symbol “<-” in the body of
the snippet, except for occasional ‘lists’ associated with new R functions.
I hope that the reader will appreciate the reduced drudgery made possible
by our R snippets and treat Econometrics as a fun subject.
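The two assignment symbols can be compared in a minimal sketch such as the following; the object names are arbitrary and chosen only for illustration.

```r
# At the top level the two assignment symbols do the same thing in R
x = 5
y <- 5
print(identical(x, y))

# Inside a function call, '=' names an argument instead of creating an
# object: here 'x' refers to the first argument of mean(), and the
# object x defined above is left unchanged
m1 = mean(x = c(1, 2, 3))
print(m1)
print(x)
```

This is why a blanket replacement of every “=” by “<-” is not appropriate for arguments inside function calls.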

Foreword

COMMENTS CUM ENDORSEMENTS BY
DISTINGUISHED ECONOMETRICIANS

The following are in alphabetical order by the last name of distinguished
professors, who are all Fellows of the Journal of Econometrics.

Professor William A. Barnett, Oswald Distinguished Professor of
Macroeconomics, University of Kansas, USA
This book provides a unified and broadly accessible presentation of modern
econometrics, with emphasis on application and computing.

Professor Jean-Marie Dufour, William Dow Professor of
Economics, McGill University, Canada
This book is a beautiful and highly accessible introduction to modern
econometric methods tailored to the needs of those who wish to apply them.
The main theoretical concepts are clearly explained in a non-technical way
and the link with economic theory is underscored. Hrishikesh Vinod also
undertook to show how the methods can be programmed with the powerful
(and free) R software, which has become the reference language of
statisticians for both statistical computing and graphics.
Students of this book can thus quickly apply and modify the methods
explained in the book, and they will also be able to draw from a wide stock
of freely available code to pursue their own projects.
This book is a major addition to applied econometrics, which should
prove useful to students and researchers all over the world. It should also
contribute to bridge the gap between econometricians and statisticians. I
strongly recommend this book to students and applied researchers, and will
certainly do so for my own students.


Professor Subal C. Kumbhakar, University (distinguished)
Professor, State University of New York, Binghamton
The learning of econometrics is never complete without some ‘hands-on’
experience. One way to accomplish this is to work on replicating previously
published results. This is much easier today than it was 50 years ago thanks
to the recent trend in transparency of applied econometric research. Many
reputed journals now require that data and software be available for others
to replicate results. This gives researchers, young and old, an opportunity
to test their modeling and estimation skills.
Recognizing the importance of this hands-on work, many textbook
writers provide data sets from published papers on a CD and add problem
sets to verify and extend results. Since all of these involve the use of some
software, some books today come bundled with a student or full version of
it. These ‘tailored’ versions of the software most often have limited capacity
in terms of what they can do.
Vinod has done a great service by writing a book utilizing R software
that can supplement the standard econometrics textbook in an advanced
undergraduate and applied graduate econometrics course. Although the
book does not cover all the standard topics in an econometrics textbook,
it gives a much deeper understanding in terms of economic theory and
econometric models and explains in detail how to use the R software on the
topics covered. R is open source software that is as powerful as commercial
software. Vinod utilizes the software to teach econometrics by providing
interesting examples and actual data applied to important policy issues.
This helps the reader to choose the best method from a wide array of
tools and packages available. The data used in the examples along with
R program snippets illustrate economic theory and sophisticated statistical
methods that go beyond the usual regression.
The R program snippets are not merely given as black boxes, but
include detailed comments which help the reader better understand the
software steps and use them as templates for possible extension and
modification. Readers of this book, be they students of econometrics or
applied economists, will benefit from the hands-on experience either by
using the R snippets in replicating published results or by customizing
them for their own research.

Professor Peter C. B. Phillips, Sterling Professor of Economics &
Professor of Statistics, Yale University, USA
Modern approaches to econometrics education acknowledge the importance
of practical implementation even in introductory courses. The R software
package provides an open source statistical and graphics engine that
facilitates this learning process, empowering students and researchers to
integrate theory and practice. Rick Vinod’s book is an outstanding tool in
this educational process, gently nursing its readers through simple
examples on a vast range of topics that illustrate the practical side of
econometric work, forcing economic ideas to face the reality of observation.

Professor Jean-François Richard, University (distinguished)
Professor, University of Pittsburgh, USA
This book embraces R as a powerful tool to promote hands-on learning of
econometric methods. It covers a wide range of commonly used techniques
with emphasis on problems faced by practitioners. It does so in a way
that allows readers to initially reproduce by themselves several substantive
illustrations presented in the book and to subsequently develop their own
applications. It also emphasizes often overlooked careful data analysis prior
to model specification. All together, this book presents a very convincing
case and I definitely intend to recommend it as a supplemental textbook to
my graduate students, especially those with empirical interests.

Professor Aman Ullah, Chair, Economics Department, University
of California, Riverside, USA
Econometrics applies statistical methods to study economic and financial
data. The data used can be in the form of a cross-section (micro), or a
time-series (macro), or panel with discrete, continuous, bounded, truncated,
or censored variables related simultaneously or dynamically. Econometric
models include linear and non-linear regression models, ARMA time-series
models, vector autoregressive models, limited dependent variable models
among others. Most econometrics texts concentrate on discussing model
selection, statistical estimation and hypothesis testing for all such models.
Vinod’s book is a superb work and it is unique in various respects. First,
it provides a solid introduction to important topics in both micro/macro
Economics and Finance in a simple way. Second, it integrates econometrics
methods with economic models in producer and consumer theories, labor
economics, and financial econometrics. Third, each chapter is
accompanied with empirical illustrations in the software R. Fourth, the
hands-on approach provides the implementation of all the results in the
book easily by anyone and anywhere. It is eminently appealing to the new
generation of economists and econometricians.
I am sure this book will be greatly used outside the econometrics field by
many readers from other applied sciences. For example, readers in applied
statistics, engineering, sociology, and psychology would enjoy learning some
clever modeling tricks and graphics in R.
Rick Vinod is a distinguished econometrician with great width, depth
and originality in his published work and what impressed me greatly was
the ease with which this has been communicated in the book as hands-
on examples. It is a very user-friendly book and I forecast that this new
kind of book will be extensively used by students, faculty and applied
researchers.

Contents

Preface
Foreword

1. Production Function and Regression Methods Using R
1.1. R and Microeconometric Preliminaries
1.1.1. Data on Metals Production Available in R
1.1.2. Descriptive Statistics Using R
1.1.3. Writing Skewness and Kurtosis Functions in R
1.1.4. Units of Measurement and Numerical Reliability of Regressions
1.1.5. Basic Graphics in R
1.1.6. The Isoquant
1.1.7. Total Productivity of an Input
1.1.8. The Marginal Productivity (MP) of an Input
1.1.9. Slope of the Isoquant and MRTS
1.1.10. Scale Elasticity as the Returns to Scale Parameter
1.1.11. Elasticity of Substitution
1.1.12. Typical Steps in Empirical Work
1.2. Preliminary Regression Theory: Results Using R
1.2.1. Regression as an Object ‘reg1’ in R
1.2.2. Accessing Objects Within an R Object by Using the Dollar Symbol
1.3. Deeper Regression Theory: Diagonals of the Hat Matrix
1.4. Discussion of Four Diagnostic Plots by R
1.5. Testing Constant Returns and 3D Scatter Plots
1.6. Homothetic Production and Cost Functions
1.6.1. Euler Theorem and Duality Theorem
1.6.2. Profit Maximizing Solutions
1.6.3. Elasticity of Total Cost w.r.t. Output
1.7. Miscellaneous Microeconomic Topics
1.7.1. Analytic Input Demand Function for the Cobb–Douglas Form
1.7.2. Separability in the Presence of Three or More Inputs
1.7.3. Two or More Outputs as Joint Outputs
1.7.4. Economies of Scope
1.8. Nonhomogeneous Production Functions
1.8.1. Three-Input Production Function for Widgets
1.8.2. Isoquant Plotting for a Bell System Production Function
1.9. Collinearity Problem, Singular Value Decomposition (SVD), and Ridge Regression
1.9.1. What is Collinearity?
1.9.2. Consequences of Near Collinearity
1.9.3. Regression Theory Using the Singular Value Decomposition
1.10. Near Collinearity Solutions by Coefficient Shrinkage
1.10.1. Ridge Regression
1.10.2. Principal Components Regression
1.11. Bell System Production Function in Anti-Trust Trial
1.11.1. Collinearity Diagnostics for Bell Data Trans-Log
1.11.2. Shrinkage Solution and Ridge Regression for Bell Data
1.11.3. Ridge Regression from Existing R Packages
1.12. Comments on Wrong Signs, Collinearity, and Ridge Scaling
1.12.1. Concluding Comments on the 1982 Bell System Breakup
1.13. Data Appendix

2. Univariate Time Series Analysis with R
2.1. Econometric Univariate Time Series are Ubiquitous
2.2. Stochastic Difference Equations
2.3. Second-Order Stochastic Difference Equation and Business Cycles
2.3.1. Complex Number Solution of the Stochastic AR(2) Difference Equation
2.3.2. General Solution to ARMA (p, p − 1) Stochastic Difference Equations
2.4. Properties of ARIMA Models
2.4.1. Identification of the Lag Order
2.4.2. ARIMA Estimation
2.4.3. ARIMA Diagnostic Checking
2.5. Stochastic Process and Stationarity
2.5.1. Stochastic Process and Underlying Probability Space
2.5.2. Autocovariance of a Stochastic Process and Ergodicity
2.5.3. Stationary Process
2.5.4. Detrending and Differencing to Achieve Stationarity
2.6. Mean Reversion
2.7. Autocovariance Generating Functions (AGF) and the Power Spectrum
2.7.1. How to Get the Power Spectrum from the AGF?
2.8. Explicit Modeling of Variance (ARCH, GARCH Models)
2.9. Tests of Independence, Neglected Nonlinearity, Turning Points
2.10. Long Memory Models and Fractional Differencing
2.11. Forecasting
2.12. Concluding Remarks and Examples

3. Bivariate Time Series Analysis Including Stochastic Diffusion and Cointegration 153

3.1. Autoregressive Distributed Lag (ARDL) Models . . . . . 153
3.2. Economic Interpretations of ARDL(1,1) Model . . . . . . 161
3.2.1. Description of M1 to M11 Model Specifications . . 162
3.2.2. ARDL(0,q) as M12 Model, Impact and Long-Run
Multipliers . . . . . . . . . . . . . . . . . . . . . . 166
3.2.3. Adaptive Expectations Model to Test Rational
Expectations Hypothesis . . . . . . . . . . . . . . 167
3.2.4. Statistical Inference and Estimation with Lagged-Dependent
Variables . . . . . . . . . . . . . . . . . . . . . 168
3.2.5. Identification Problems Involving Expectational
Variables (I. Fisher Example) . . . . . . . . . . . 168
3.2.6. Impulse Response, Mean Lag and Insights from
a Polynomial in L . . . . . . . . . . . . . . . . . 169
3.2.7. Choice Between M1 to M11 Models Using R . . . 170
3.3. Stochastic Diffusion Models for Asset Prices . . . . . . . . 176
3.4. Spurious Regression (R² > Durbin–Watson)
and Cointegration . . . . . . . . . . . . . . . . . . . . . . 183
3.4.1. Definition of a Process Integrated of Order d, I(d) 183
3.4.2. Cointegration Definition and Discussion . . . . . . 184
3.4.3. Error Correction Models of Cointegration . . . . . 185
3.4.4. Economic Equilibria and Error Reductions
through Learning . . . . . . . . . . . . . . . . . . 186
3.4.5. Signs and Significance of Coefficients on Past
Errors while Agents Learn . . . . . . . . . . . . . 187
3.5. Granger Causality Testing . . . . . . . . . . . . . . . . . . 189

4. Utility Theory and Empirical Implications 191


4.1. Utility Theory . . . . . . . . . . . . . . . . . . . . . . . . 191
4.1.1. Expected Utility Theory (EUT) . . . . . . . . . . 192
4.1.2. Arrow–Pratt Coefficient of Absolute Risk
Aversion (CARA) . . . . . . . . . . . . . . . . . . 197
4.1.3. Risk Premium Needed to Encourage
Risky Investments . . . . . . . . . . . . . . . . . . 199
4.1.4. Taylor Series Links EUT, Moments of f(x)
and Derivatives of U(x) . . . . . . . . . . . . . . . 200
4.2. Non-Expected Utility Theory . . . . . . . . . . . . . . . . 202
4.2.1. Lorenz Curve Scaling over the Unit Square . . . . 203
4.2.2. Mapping From EUT to Non-EUT within the Unit
Square to Get Decision Weights . . . . . . . . . . 206
4.3. Incorporating Utility Theory into Risk
Measurement and Stochastic Dominance . . . . . . . . . . 210
4.3.1. Class D1 of Utility Functions and Investors . . . . 210
4.3.2. Class D2 of Utility Functions and Investors . . . . 210
4.3.3. Explicit Utility Functions and Arrow–Pratt
Measures of Risk Aversion . . . . . . . . . . . . . 211
4.3.4. Class D3 of Utility Functions and Investors . . . . 212
4.3.5. Class D4 of Utility Functions and Investors . . . . 212
4.3.6. First-Order Stochastic Dominance (1SD) . . . . . 214
4.3.7. Second-Order Stochastic Dominance (2SD) . . . . 216
4.3.8. Third-Order Stochastic Dominance (3SD) . . . . . 217
4.3.9. Fourth-Order Stochastic Dominance (4SD) . . . . 218
4.3.10. Empirical Checking of Stochastic Dominance
Using Matrix Multiplications and Incorporation
of 4DPs of Non-EUT . . . . . . . . . . . . . . . . 218

5. Vector Models for Multivariate Problems 227


5.1. Introduction and VAR Models . . . . . . . . . . . . . . . 227
5.1.1. Some R Packages for Vector Modeling . . . . . . . 228
5.1.2. Vector Autoregression or VAR Models . . . . . . 228
5.1.3. Data Collection Tips Using R . . . . . . . . . . . 229
5.1.4. VAR Estimation of Sims’ Model . . . . . . . . . . 237
5.1.5. Granger-Causality Analysis in VAR Models . . . . 240
5.1.6. Forecasting Out-of-Sample in VAR Models . . . . 242
5.1.7. Impulse Response Analysis in VAR Models . . . . 243
5.2. Multivariate Regressions: Canonical Correlations . . . . . 248
5.2.1. Why Canonical Correlation is Not Popular So Far 251
5.3. VAR Estimation and Cointegration Testing
Using Canonical Correlations . . . . . . . . . . . . . . . . 257
5.4. Final Remarks: Multivariate Statistics Using R . . . . . . 259

6. Simultaneous Equation Models 261


6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 261
6.1.1. Simultaneous Equation Notation System with
Stars and Subscripts . . . . . . . . . . . . . . . . 263
6.1.2. Simultaneous Equations Bias and the
Reduced Form . . . . . . . . . . . . . . . . . . . . 266
6.1.3. Successively Weaker Assumptions Regarding
the Nature of the Z_j Matrix of Regressors . . . . 269
6.1.4. Reduced Form Estimation and Other Alternatives
to OLS . . . . . . . . . . . . . . . . . . . . . . . . 269
6.1.5. Assumptions of Simultaneous Equations Models . 271
6.2. Instrumental Variables and Generalized Least Squares . . 272
6.2.1. The Instrumental Variables (IV) and Generalized
IV (GIV) Estimator . . . . . . . . . . . . . . . . . 273
6.2.2. Choice Between OLS and IV by Using
Wu–Hausman Specification Test . . . . . . . . . . 275
6.3. Limited Information and Two-Stage Least Squares . . . . 277
6.3.1. Two-Stage Least Squares . . . . . . . . . . . . . . 277
6.3.2. The k-class Estimator . . . . . . . . . . . . . . . . 278
6.3.3. Limited Information Maximum Likelihood
(LIML) Estimator . . . . . . . . . . . . . . . . . . 280
6.4. Identification of Simultaneous Equation Models . . . . . . 282
6.4.1. Identification is Uniquely Going from the
Reduced Form to the Structure . . . . . . . . . . 285
6.5. Full Information and Three-Stage Least Squares (3SLS) . 288
6.5.1. Full Information Maximum Likelihood . . . . . . 293
6.6. Potential of Simultaneous Equations
Beyond Econometrics . . . . . . . . . . . . . . . . . . . . 294

7. Limited Dependent Variable (GLM) Models 295


7.1. Problems with Dummy Dependent Variables . . . . . . . 295
7.1.1. Proof of the Claim that Var(ε_i) = P_i(1 − P_i) . . . 300
7.1.2. The General Linear Model from Biostatistics . . . 304
7.1.3. Marginal Effects (Partial Derivatives) in
Logit-Type GLM Models . . . . . . . . . . . . . . 308
7.1.4. Further Generalizations of Logit and
Probit Models . . . . . . . . . . . . . . . . . . . . 309
7.1.5. Ordered Response . . . . . . . . . . . . . . . . . . 312
7.2. Quasi-Likelihood Function for Binary Choice Models . . . 314
7.2.1. The ML Estimator in Binary Choice Models . . . 315
7.2.2. Tobit Model for Censored Dependent Variables . . 317
7.3. Heckman Two-Step Estimator for Self-Selection Bias . . . 322
7.4. Time Duration Length (Survival) Models . . . . . . . . . 326
7.4.1. Probability Distributions and Implied Hazard
Functions . . . . . . . . . . . . . . . . . . . . . . . 330
7.4.2. Parametric Survival (Hazard) Models . . . . . . . 331
7.4.3. Semiparametric Including Cox Proportional
Hazard Models . . . . . . . . . . . . . . . . . . . . 333

8. Dynamic Optimization and Empirical Analysis of Consumer Behavior 343

8.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 343
8.2. Dynamic Optimization . . . . . . . . . . . . . . . . . . . . 344
8.3. Hall’s Random Walk Model . . . . . . . . . . . . . . . . . 346
8.3.1. Data from the Internet and an Implementation . . 349
8.3.2. OLS Estimation of the Random Walk Model . . . 350
8.3.3. Direct Estimation of Hall’s NLHS Specification . . 352
8.3.4. Strong Assumptions and Granger-Causality Tests 356
8.4. Nonparametric Kernel Estimation . . . . . . . . . . . . . 358
8.4.1. Kernel Estimation of Amorphous Partials . . . . . 360
8.5. Wiener–Hopf–Whittle Model if Consumption
Precedes Income . . . . . . . . . . . . . . . . . . . . . . . 364
8.5.1. Determination of Target Consumption . . . . . . 365
8.5.2. Implications for Various Puzzles
of Consumer Theory . . . . . . . . . . . . . . . . 368
8.6. Final Remarks on Consumer Theory . . . . . . . . . . . . 369
8.7. Appendix: Additional R Code . . . . . . . . . . . . . . . . 370

9. Single, Double and Maximum Entropy Bootstrap and Inference 377


9.1. The Motivation and Background Behind Bootstrapping . 377
9.1.1. Pivotal Quantity and p-Value . . . . . . . . . . . 378
9.1.2. Uncertainty Regarding Proper Density for
Regression Errors Illustrated . . . . . . . . . . . . 380
9.1.3. The Delta Method for Standard Error
of Functions . . . . . . . . . . . . . . . . . . . . . 382
9.2. Description of Parametric iid Bootstrap . . . . . . . . . . 383
9.2.1. Simulated Sampling Distribution for Statistical
Inference Using OLS Residuals . . . . . . . . . . . 383
9.2.2. Steps in a Parametric Approximation . . . . . . . 386
9.2.3. Percentile Confidence Intervals . . . . . . . . . . . 387
9.2.4. Reflected Percentile Confidence Interval
for Bias Correction . . . . . . . . . . . . . . . . . 388
9.2.5. Significance Tests as Duals to Confidence
Intervals . . . . . . . . . . . . . . . . . . . . . . . 388
9.3. Description of Nonparametric iid Bootstrap . . . . . . . . 391
9.3.1. Map Data from Time-Domain
to (Numerical Magnitudes) Values-Domain . . . . 391
9.4. Double Bootstrap Illustrated with a Nonlinear Model . . 398
9.4.1. A Digression on the Size of Resamples . . . . . . 399
9.4.2. Double Bootstrap Theory Involving Roots
and Uniform Density . . . . . . . . . . . . . . . . 399
9.4.3. GNR Implementation of Nonlinear Regression
for Metals Data . . . . . . . . . . . . . . . . . . . 401
9.5. Maximum Entropy Density Bootstrap
for Time-Series Data . . . . . . . . . . . . . . . . . . . . . 407
9.5.1. Wiener, Kolmogorov, Khintchine (WKK)
Ensemble of Time Series . . . . . . . . . . . . . . 408
9.5.2. Avoiding Unrealistic Properties of iid Bootstrap . 409
9.5.3. Maximum Entropy Density is Uniform When
Limits are Known . . . . . . . . . . . . . . . . . . 410
9.5.4. Quantiles of the Patchwork of the ME Density . . 412
9.5.5. Numerical Illustration of “Meboot” Package in R 413
9.5.6. Simple and Size-Corrected Confidence Bounds . . 418

10. Generalized Least Squares, VARMA, and Estimating Functions 419

10.1. Feasible Generalized Least Squares (GLS) to Adjust for
Autocorrelated Errors and/or Heteroscedasticity . . . . . 419
10.1.1. Consequences of Ignoring Nonspherical
Errors Ω ≠ I_T . . . . . . . . . . . . . . . . . . . 419
10.1.2. Derivation of the GLS and Efficiency Comparison 420
10.1.3. Computation of the GLS and Feasible GLS . . . . 422
10.1.4. Improved OLS Inference for Nonspherical Errors . 424
10.1.5. Efficient Estimation of β Coefficients . . . . . . . 425
10.1.6. An Illustration Using Fisher’s
Model for Interest Rates . . . . . . . . . . . . . . 426
10.2. Vector ARMA Estimation for Rational
Expectations Models . . . . . . . . . . . . . . . . . . . . . 429
10.2.1. Greater Realism of VARMA(p, q) Models . . . . . 431
10.2.2. Expectational Variables from Conditional
Forecasts in a General Model . . . . . . . . . . . . 432
10.2.3. A Rational Expectation Model Using VARMA . . 433
10.2.4. Further Forecasts, Transfer Function Gains,
and Response Analysis . . . . . . . . . . . . . . . 438
10.3. Optimal Estimating Function (OptEF)
and Generalized Method of Moments (GMM) . . . . . . . 443
10.3.1. Derivation of Optimal Estimating Functions
for Regressions . . . . . . . . . . . . . . . . . . . . 443
10.3.2. Finite Sample Optimality of OptEF . . . . . . . . 445
10.3.3. Introduction to the GMM . . . . . . . . . . . . . 445
10.3.4. Cases Where OptEF Viewpoint Dominates GMM 447
10.3.5. Advantages and Disadvantages of
GMM and OptEF . . . . . . . . . . . . . . . . . . 449
10.4. Godambe Pivot Functions (GPFs) and Statistical
Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
10.4.1. Application of the Frisch–Waugh Theorem
to Constructing CI95 . . . . . . . . . . . . . . . . 452
10.4.2. Steps in Application of GPF to Feasible
GLS Estimation . . . . . . . . . . . . . . . . . . . 453

11. Box–Cox, Loess and Projection Pursuit Regression 459


11.1. Further R Tools for Studying Nonlinear Relations . . . . . 459
11.2. Box–Cox Transformation . . . . . . . . . . . . . . . . . . 459
11.2.1. Logarithmic and Square Root Transformations . . 459
11.3. Scatterplot Smoothing and Loess Regressions . . . . . . . 463
11.3.1. Improved Fit (Forecasts) by Loess Smoothing . . 465
11.4. Projection Pursuit Methods . . . . . . . . . . . . . . . . . 466
11.5. Remarks on Nonlinear Econometrics . . . . . . . . . . . . 477

Appendix 479
References 485
Index 505
