
Original Title: Econometrics Chapter 14, 15 & 16 PPT slides

Uploaded by IsabelleDwight


Time Series Regression

© All Rights Reserved


Time Series Regression

Time series data are data collected on the same observational unit at multiple time periods:

Yt = B0 + B1X1t + B2X2t + ut

Examples:

- Aggregate consumption and GDP for a country (for example, 20 years of quarterly observations = 80 observations)
- Yen/$, pound/$ and Euro/$ exchange rates (daily data for 1 year = 365 observations)
- Cigarette consumption per capita in California, by year (annual data)

Copyright 2015 Pearson Education, Inc. All rights reserved.


Forecasting (SW Ch. 14): separate class, Econ 373

Estimation of dynamic causal effects (SW Ch. 15)

- If the Fed increases the Federal Funds rate now, what will be the effect on the rates of inflation and unemployment in 3 months? In 12 months?
- What is the effect over time on cigarette consumption of a hike in the cigarette tax?

Modeling risks, which is used in financial markets (one aspect of this, modeling changing variances and "volatility clustering," is discussed in SW Ch. 16)

- Time lags
- Correlation over time (serial correlation, a.k.a. autocorrelation, which we also encounter in panel data)
- Calculation of standard errors when the errors are serially correlated

A good way to learn about time series data is to investigate it yourself! A great source for U.S. macro time series data, and some international data, is the Federal Reserve Bank of St. Louis's FRED database.

A. Notation

B. Lags, first differences, and growth rates

C. Autocorrelation (serial correlation)

D. Stationarity.

A. Notation

Yt = value of Y in period t.

Data set: {Y1, ..., YT} are T observations on the time series variable Y.

We consider only consecutive, evenly spaced observations (for example, monthly, 1960 to 1999, no missing months). Missing and unevenly spaced data introduce technical complications.


AUTOCORRELATION

(Serial Correlation):

follows the laws of multiple

regressors heteroskedasticity

The correlation of a series Yt with its own lagged values is called autocorrelation or serial correlation.

The first autocovariance of Yt is $\mathrm{cov}(Y_t, Y_{t-1})$

The first autocorrelation of Yt is $\mathrm{corr}(Y_t, Y_{t-1})$

Thus

$\mathrm{corr}(Y_t, Y_{t-1}) = \dfrac{\mathrm{cov}(Y_t, Y_{t-1})}{\sqrt{\mathrm{var}(Y_t)\,\mathrm{var}(Y_{t-1})}} = \rho_1$

These are population correlations: they describe the population joint distribution of $(Y_t, Y_{t-1})$
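As a rough illustration of these definitions, the sketch below computes the sample counterpart of the first autocorrelation in plain Python. The function names and the two toy series are made up for this example, and dividing both the autocovariance and the variance by T is just one common convention.

```python
def autocovariance(y, lag=1):
    """Sample autocovariance of y at the given lag (divides by T)."""
    T = len(y)
    mean = sum(y) / T
    return sum((y[t] - mean) * (y[t - lag] - mean) for t in range(lag, T)) / T


def autocorrelation(y, lag=1):
    """Sample autocorrelation: autocovariance at `lag` over the variance."""
    return autocovariance(y, lag) / autocovariance(y, 0)


# Hypothetical series, invented for illustration only.
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(autocorrelation(trending))     # positive: neighbouring values move together
print(autocorrelation(alternating))  # negative: neighbouring values flip sign
```

Passing `lag=2`, `lag=3`, and so on gives the higher-order sample autocorrelations in the same way.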

Pure serial correlation occurs when the assumption of uncorrelated observations of the error term is violated (in a correctly specified equation!)

The most commonly assumed kind of serial correlation is first-order serial correlation, in which the current value of the error term is a function of the previous value of the error term:

$\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$    (9.1)

where: $\varepsilon$ = the error term of the equation in question
       $\rho$ = the first-order autocorrelation coefficient
       $u$ = a classical (not serially correlated) error term

© 2011 Pearson Addison-Wesley. All rights reserved.

$\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$

The magnitude of $\rho$ indicates the strength of the serial correlation:

- If $\rho$ is zero, there is no serial correlation
- As $\rho$ approaches one in absolute value, the previous observation of the error term becomes more important in determining the current value of $\varepsilon_t$ and a high degree of serial correlation exists
- For $\rho$ to exceed one in absolute value is unreasonable, since the error term effectively would "explode"

$-1 < \rho < +1$    (9.2)

The sign of $\rho$ indicates the nature of the serial correlation in an equation:

Positive:

- implies that the error term tends to have the same sign from one time period to the next
- this is called positive serial correlation

Negative:

- implies that the error term has a tendency to switch signs from negative to positive and back again in consecutive observations
- this is called negative serial correlation

Figures 9.1-9.3 illustrate several different scenarios

[Figure 9.1b: Positive Serial Correlation. Figure 9.2: No Serial Correlation. Two further figures pose the question: Positive or Negative Serial Correlation? Images not reproduced.]

Impure serial correlation is serial correlation that is caused by a specification error such as:

- an omitted variable and/or
- an incorrect functional form

How does this happen? Just as with heteroskedasticity in cross-sectional data.

As an example, suppose that the true equation is:

$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \varepsilon_t$    (9.3)

where $\varepsilon_t$ is a classical error term. As learned, if X2 is accidentally omitted from the equation (or if data for X2 are unavailable), then:

$Y_t = \beta_0 + \beta_1 X_{1t} + \varepsilon_t^*$, where $\varepsilon_t^* = \beta_2 X_{2t} + \varepsilon_t$    (9.4)

The new error term is therefore also a function of one of the explanatory variables, X2

As a result, the new error term, $\varepsilon^*$, can be serially correlated even if the true error term, $\varepsilon$, is not

In particular, the new error term will tend to be serially correlated when:

1. X2 itself is serially correlated (this is quite likely in a time series) and
2. the size of $\varepsilon$ is small compared to the size of $\beta_2 \bar{X}_2$

Figure 9.4 illustrates 1., for the case of U.S. disposable income

[Figure 9.4: U.S. Disposable Income as a Function of Time]

(Incorrect Functional Form, IFF)

Turn now to the case of impure serial correlation caused by an incorrect functional form

Suppose that the true equation is polynomial in nature:

$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{1t}^2 + \varepsilon_t$    (9.7)

but that instead a linear regression is run:

$Y_t = \alpha_0 + \alpha_1 X_{1t} + \varepsilon_t^*$    (9.8)

The new error term $\varepsilon^*$ is now a function of the true error term $\varepsilon$ and of the differences between the linear and the polynomial functional forms

Figure 9.5 illustrates how these differences often follow fairly autoregressive patterns

[Figure 9.5: Incorrect Functional Form as a Source of Impure Serial Correlation]

The Consequences of Serial Correlation

The existence of serial correlation in the error term of an equation estimated with OLS has at least three consequences:

1. Pure serial correlation does not cause bias in the coefficient estimates
2. Serial correlation causes OLS to no longer be the minimum variance estimator (of all the linear unbiased estimators). So what doesn't it minimize anymore? R2
3. Serial correlation causes the OLS estimates of the standard errors to be biased, leading to unreliable hypothesis testing. Typically the bias in the SE estimate is negative, meaning that OLS underestimates the standard errors of the coefficients (and thus overestimates the t-scores). How does this compare to heteroskedasticity?

Formal: testing for serial correlation using the Durbin-Watson d test

We will now go through the second of these in detail. The test is only applicable if the following three assumptions are met:

1. The model includes an intercept term: Yt = B1X1t + B2X2t is NOT ok
2. The serial correlation is first-order in nature: $\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$, where $\rho$ is the autocorrelation coefficient and u is a classical (normally distributed) error term
3. The model does not include a lagged dependent variable as an independent variable: Yt = B0 + B1X1t + B2X2t + B3Yt-1 + ut is NOT ok

The Durbin-Watson d Test (cont.)

The equation for the Durbin-Watson d statistic for T observations is:

$d = \dfrac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$    (9.10)

where the $e_t$s are the OLS residuals

There are three main cases:

1. Extreme positive serial correlation: d = 0
2. Extreme negative serial correlation: d ≈ 4
3. No serial correlation: d ≈ 2
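A minimal sketch of the d statistic, assuming the OLS residuals are already available as a list. The function name and the two residual patterns are invented for illustration; they are not residuals from any real regression.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared changes in e_t over the sum of squared e_t."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residual patterns that mimic the extreme cases.
never_changing = [1.0, 1.0, 1.0, 1.0]   # e_t always equals e_{t-1}
sign_flipping = [1.0, -1.0] * 50        # e_t always equals -e_{t-1}

print(durbin_watson(never_changing))  # extreme positive serial correlation
print(durbin_watson(sign_flipping))   # close to 4: extreme negative serial correlation
```

With the sign-flipping pattern the statistic approaches 4 only as T grows, which is why the extreme negative case is stated as an approximation.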

The Durbin-Watson d Test (cont.)

To test for positive (note that we rarely, if ever, test for negative!) serial correlation, the following steps are required:

1. Obtain the OLS residuals from the equation to be tested and calculate the d statistic by using Equation 9.10
2. Determine the sample size and the number of explanatory variables and then consult a Statistical Table to find the upper critical d value, dU, and the lower critical d value, dL, respectively

The Durbin-Watson d Test (cont.)

3. Set up the test hypotheses and decision rule:

H0: $\rho \le 0$ (no positive serial correlation)
HA: $\rho > 0$ (positive serial correlation)

if d < dL           Reject H0
if d > dU           Do not reject H0
if dL ≤ d ≤ dU      Inconclusive

In some circumstances a two-sided d test might be appropriate

In such a case, steps 1 and 2 are still used, but step 3 is now:

The Durbin-Watson d Test (cont.)

3. Set up the test hypotheses and decision rule:

H0: $\rho = 0$ (no serial correlation)
HA: $\rho \ne 0$ (serial correlation)

if d < dL               Reject H0
if d > 4 − dL           Reject H0
if dU < d < 4 − dU      Do not reject H0
Otherwise               Inconclusive
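The one-sided and two-sided decision rules can be sketched as two small functions. The function names are invented here, and the critical values used in the example calls (dL = 1.0, dU = 1.5) are placeholders rather than values from a Durbin-Watson table. The two-sided version also includes the standard "do not reject" region dU < d < 4 − dU.

```python
def dw_one_sided(d, d_lower, d_upper):
    """One-sided test of H0: rho <= 0 against HA: rho > 0."""
    if d < d_lower:
        return "reject H0"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"


def dw_two_sided(d, d_lower, d_upper):
    """Two-sided test of H0: rho = 0 against HA: rho != 0."""
    if d < d_lower or d > 4 - d_lower:
        return "reject H0"
    if d_upper < d < 4 - d_upper:
        return "do not reject H0"
    return "inconclusive"


# Placeholder critical values; real ones depend on the sample size and
# the number of explanatory variables and come from a table.
D_LOWER, D_UPPER = 1.0, 1.5

print(dw_one_sided(0.8, D_LOWER, D_UPPER))
print(dw_two_sided(3.2, D_LOWER, D_UPPER))
```

Keeping the critical values as arguments makes it explicit that they must be looked up separately for each regression.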

Figure 9.6 gives an example of a one-sided Durbin-Watson d test

Durbin-Watson tables: https://www3.nd.edu/~wevans1/econ30331/Durbin_Watson_tables.pdf

A farmers' association hires you to predict inches of growth for corn as a function of rain on a monthly basis (they provide you with the data they have been collecting for the past 14 months). You estimate the model:

InGrwtht = B0 + B1 InRaint + B2 Tempt + ut

1. What sign do you expect each coefficient to have?
2. Your results are:

InGrwtht = 1.2 + .07 InRaint + .03 Tempt,   R2 = .48
          (.07)  (.003)        (.02)

3. Interpret in words the findings for your employer.
4. How would you test whether your model suffers from serial correlation?
5. You run the DW test and find: d = 2.8. Do you have serial correlation?

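Remedy 1: Generalized Least Squares

Start with an equation that has first-order serial correlation:

$Y_t = \beta_0 + \beta_1 X_{1t} + \varepsilon_t$    (9.15)

Which, if $\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$ (due to pure serial correlation), also equals:

$Y_t = \beta_0 + \beta_1 X_{1t} + \rho\,\varepsilon_{t-1} + u_t$    (9.16)

Multiply Equation 9.15 by $\rho$ and then lag the new equation by one period, obtaining:

$\rho Y_{t-1} = \rho\beta_0 + \rho\beta_1 X_{1,t-1} + \rho\,\varepsilon_{t-1}$    (9.17)

Next, subtract Equation 9.17 from Equation 9.16, obtaining:

$Y_t - \rho Y_{t-1} = \beta_0 (1 - \rho) + \beta_1 (X_{1t} - \rho X_{1,t-1}) + u_t$    (9.18)

which can be rewritten as:

$Y_t^* = \beta_0^* + \beta_1 X_{1t}^* + u_t$    (9.19)

where:

$Y_t^* = Y_t - \rho Y_{t-1}$,   $X_{1t}^* = X_{1t} - \rho X_{1,t-1}$,   $\beta_0^* = \beta_0 - \rho\beta_0$    (9.20)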

Equation 9.19 is called a Generalized Least Squares (or "quasi-differenced") version of Equation 9.16.

Notice that:

1. The error term is not serially correlated
   a. As a result, OLS estimation of Equation 9.19 will be minimum variance
   b. This is true if we know $\rho$ or if we accurately estimate $\rho$
2. The slope coefficient $\beta_1$ is the same as the slope coefficient of the original serially correlated equation, Equation 9.16. Thus coefficients estimated with GLS have the same meaning as those estimated with OLS.
3. The dependent variable has changed compared to that in Equation 9.16. This means that the GLS $\bar{R}^2$ is not directly comparable to the OLS $\bar{R}^2$.
4. To forecast with GLS, adjustments discussed later are required

Unfortunately, we cannot use OLS to estimate a GLS model because GLS equations are inherently nonlinear in the coefficients

Fortunately, there are at least two other methods available:

The Cochrane-Orcutt Method

This is a two-step iterative technique that first produces an estimate of $\rho$ and then estimates the GLS equation using that estimate.

The two steps are:

1. Estimate $\rho$ by running a regression based on the residuals of the equation suspected of having serial correlation:

   $e_t = \rho\,e_{t-1} + u_t$    (9.21)

   where the $e_t$s are the OLS residuals from the equation suspected of having pure serial correlation and $u_t$ is a classical error term

2. Use this $\hat{\rho}$ to estimate the GLS equation by substituting $\hat{\rho}$ into Equation 9.18 and using OLS to estimate Equation 9.18 with the adjusted data

These two steps are repeated (iterated) until further iteration results in little change in $\hat{\rho}$

Once $\hat{\rho}$ has converged (usually in just a few iterations), the last estimate of step 2 is used as a final estimate of Equation 9.18
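The two steps can be sketched in plain Python for the one-regressor case. This is an illustrative sketch under stated assumptions, not the textbook's code: the function names, the convergence tolerance, and the simulated data (true intercept 1, slope 2, rho 0.6) are all invented for the example.

```python
import random


def ols(x, y):
    """Simple OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def cochrane_orcutt(x, y, tol=1e-6, max_iter=50):
    """Alternate between estimating rho and re-fitting the quasi-differenced equation."""
    b0, b1 = ols(x, y)
    rho = 0.0
    for _ in range(max_iter):
        # Residuals of the original equation at the current coefficient estimates.
        e = [yt - b0 - b1 * xt for xt, yt in zip(x, y)]
        # Step 1: regress e_t on e_{t-1} (no intercept) to estimate rho.
        new_rho = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(
            v ** 2 for v in e[:-1]
        )
        # Step 2: OLS on Y*_t = Y_t - rho*Y_{t-1} and X*_t = X_t - rho*X_{t-1}.
        y_star = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        x_star = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        b0_star, b1 = ols(x_star, y_star)
        b0 = b0_star / (1 - new_rho)  # undo beta0* = beta0 * (1 - rho)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return b0, b1, rho


# Simulated data (hypothetical): Y_t = 1 + 2*X_t + eps_t, eps_t = 0.6*eps_{t-1} + u_t.
random.seed(0)
x, y, eps = [], [], 0.0
for _ in range(2000):
    eps = 0.6 * eps + random.gauss(0, 1)
    xt = random.uniform(0, 10)
    x.append(xt)
    y.append(1 + 2 * xt + eps)

b0_hat, b1_hat, rho_hat = cochrane_orcutt(x, y)
print(round(b0_hat, 2), round(b1_hat, 2), round(rho_hat, 2))
```

The slope is read directly from the transformed regression, while the intercept has to be rescaled by 1 − rho because the transformed intercept is β0(1 − ρ).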

The AR(1) method estimates a GLS equation like Equation 9.18 by estimating $\beta_0$, $\beta_1$ and $\rho$ simultaneously with iterative nonlinear regression techniques (that are well beyond the scope of this class!)

The AR(1) method tends to produce the same coefficient estimates as Cochrane-Orcutt

However, the estimated standard errors are smaller

This is why the AR(1) approach is recommended as long as your software can support such nonlinear regression

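Remedies for Serial Correlation

The place to start in correcting a serial correlation problem is to look carefully at the specification of the equation for possible errors that might be causing impure serial correlation. Only after the specification has been reviewed should an adjustment for pure serial correlation be considered

There are two main remedies for pure serial correlation:

1. Generalized Least Squares: we just learned it
2. Newey-West standard errors: what is this? And when would we use this instead of GLS? Next!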

Remedy 2: Newey-West Standard Errors

Not all corrections for pure serial correlation involve Generalized Least Squares (GLS does not do well in small samples)

Newey-West standard errors take account of serial correlation by correcting the standard errors without changing the estimated coefficients

The logic behind Newey-West standard errors is powerful:

- If serial correlation does not cause bias in the estimated coefficients but does impact the standard errors, then it makes sense to adjust the estimated equation in a way that changes the standard errors but not the coefficients

Remedy 2: Newey-West Standard Errors (cont.)

The Newey-West SEs are biased but generally more accurate than uncorrected standard errors for large samples in the face of serial correlation

As a result, Newey-West standard errors can be used for t-tests and other hypothesis tests in most samples without the errors of inference potentially caused by serial correlation

Typically, Newey-West SEs are larger than OLS SEs, thus producing lower t-scores
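To make the idea concrete, here is a hedged NumPy sketch of one common form of the Newey-West estimator (a "sandwich" variance with Bartlett weights and no small-sample correction). The function name, the choice `max_lag=10`, and the simulated data are assumptions for illustration; statistical packages may apply slightly different defaults.

```python
import numpy as np


def ols_with_newey_west(X, y, max_lag):
    """OLS coefficients with conventional and Newey-West standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y          # coefficients are plain OLS, left unchanged
    e = y - X @ beta

    # Conventional OLS standard errors (valid only without serial correlation).
    se_ols = np.sqrt(np.diag(XtX_inv) * (e @ e) / (n - k))

    # Middle of the sandwich: lag-0 term plus down-weighted autocovariance terms.
    Xe = X * e[:, None]
    S = Xe.T @ Xe
    for lag in range(1, max_lag + 1):
        weight = 1 - lag / (max_lag + 1)
        gamma = Xe[lag:].T @ Xe[:-lag]
        S += weight * (gamma + gamma.T)
    se_nw = np.sqrt(np.diag(XtX_inv @ S @ XtX_inv))
    return beta, se_ols, se_nw


# Simulated data (hypothetical): both X and the error follow AR(1) with rho = 0.8.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
eps = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.normal()
    eps[t] = 0.8 * eps[t - 1] + rng.normal()
y = 1 + 2 * x + eps
X = np.column_stack([np.ones(n), x])

beta, se_ols, se_nw = ols_with_newey_west(X, y, max_lag=10)
print(beta, se_ols, se_nw)
```

In this simulated setting the Newey-West standard error on the slope comes out larger than the conventional one, in line with the typical pattern described above.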

DYNAMIC MODELS

An (ad hoc) distributed lag model explains the current value of Y as a function of current and past values of X, thus "distributing" the impact of X over a number of time periods

For example, we might be interested in the impact of a change in the money supply (X) on GDP (Y) and model this as:

$Y_t = \alpha_0 + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + \beta_p X_{t-p} + \varepsilon_t$    (12.2)


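$Y_t = \alpha_0 + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + \beta_p X_{t-p} + \varepsilon_t$    (12.2)

where:

$\beta_1 = \lambda\beta_0$
$\beta_2 = \lambda^2\beta_0$
$\beta_3 = \lambda^3\beta_0$
...
$\beta_p = \lambda^p\beta_0$    (12.8)

As long as $\lambda$ is between 0 and 1, these coefficients will indeed smoothly decline, as shown in Figure 12.1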
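A short sketch of the geometric pattern in Equation 12.8, where each lag coefficient is the impact coefficient multiplied by a further power of λ. The function name and the example values (β0 = 1, λ = 0.5) are invented for illustration.

```python
def geometric_lag_weights(beta0, lam, p):
    """Return [beta_0, beta_1, ..., beta_p] with beta_k = lam**k * beta0."""
    return [beta0 * lam ** k for k in range(p + 1)]


# Hypothetical values: impact coefficient 1.0, lambda 0.5, three lags.
print(geometric_lag_weights(1.0, 0.5, 3))  # [1.0, 0.5, 0.25, 0.125]
```

Any λ strictly between 0 and 1 gives a declining sequence; values closer to 1 make the effect of X die out more slowly.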

[Figure 12.1: weighting schemes for various dynamic models]

Problems with estimating Equation 12.2 with OLS:

$Y_t = \alpha_0 + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + \beta_p X_{t-p} + \varepsilon_t$    (12.2)

1. The various lagged values of X are likely to be severely multicollinear, making coefficient estimates imprecise
2. In large part because of this multicollinearity, there is no guarantee that the estimated coefficients will follow the smoothly declining pattern that economic theory would suggest. Instead, it's quite typical to get something like: [estimated equation not reproduced]
3. The degrees of freedom tend to decrease, sometimes substantially, since we have to:
   a. estimate a coefficient for each lagged X, thus increasing K and lowering the degrees of freedom (N − K − 1)
   b. decrease the sample size by one for each lagged X, thus lowering the number of observations, N, and therefore the degrees of freedom (unless data for lagged Xs outside the sample are available)

$Y_t = \alpha_0 + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + \beta_p X_{t-p} + \varepsilon_t$

$GDP_t = \alpha_0 + \beta_0 MS_t + \beta_1 MS_{t-1} + \beta_2 MS_{t-2} + \cdots + \beta_p MS_{t-p} + \varepsilon_t$

If distributed lag models have all these problems, how can we still correctly estimate, say, the impact of a change in the money supply on GDP?

Because of the aforementioned problems with an ad hoc distributed lag model:

$Y_t = \alpha_0 + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + \beta_p X_{t-p} + \varepsilon_t$

we always want to rewrite it as

$Y_t = \alpha_0 + \beta_0 X_t + \lambda Y_{t-1} + u_t$    (12.3)

Note that Y is on the left-hand side as $Y_t$, and on the right-hand side as $Y_{t-1}$

It's this difference in time period that makes the equation dynamic

The simplest dynamic model is:

$Y_t = \alpha_0 + \beta_0 X_t + \lambda Y_{t-1} + u_t$    (12.3)

Dynamic models: ... produced by OLS

$Y_t = \alpha_0 + \beta_0 X_t + \lambda Y_{t-1} + u_t$

$GDP_t = \alpha_0 + \beta_0 MS_t + \lambda\,GDP_{t-1} + u_t$

Why or why not?

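$Y_t = \alpha_0 + \beta_0 X_t + \lambda Y_{t-1} + u_t$

Using the Lagrange Multiplier (LM) test to test for serial correlation for a typical dynamic model involves three steps:

1. Obtain the residuals of the estimated equation:

   $e_t = Y_t - \hat{Y}_t = Y_t - \hat{\alpha}_0 - \hat{\beta}_0 X_t - \hat{\lambda} Y_{t-1}$

2. Use these residuals as the dependent variable in an auxiliary regression that includes as independent variables all those on the right-hand side of the original equation as well as the lagged residuals:

   $e_t = a_0 + a_1 X_t + a_2 Y_{t-1} + a_3 e_{t-1} + u_t$    (12.18)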

3. Estimate Eq 12.18 using OLS and then test the null hypothesis that $a_3 = 0$ with the following test statistic:

   LM = N*R2    (12.19)

   where: N = the sample size
          R2 = the unadjusted coefficient of determination

For large samples, LM has a chi-square distribution with degrees of freedom equal to the number of restrictions in the null hypothesis (in this case, one).

If LM is greater than the critical chi-square value from the corresponding Statistical Table, then we reject the null hypothesis that $a_3 = 0$ and conclude that there is indeed serial correlation in the original equation
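The three LM steps can be sketched with NumPy as below. This is an illustrative sketch under stated assumptions: the function names and the simulated data are invented, N is taken to be the number of observations used in the auxiliary regression, and 3.84 is the usual 5% chi-square critical value with one degree of freedom.

```python
import numpy as np


def ols_residuals(X, y):
    """Residuals from an OLS regression of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def lm_serial_correlation(x, y):
    """LM = N * R^2 from regressing e_t on a constant, X_t, Y_{t-1} and e_{t-1}."""
    # Step 1: residuals of the dynamic model Y_t = a0 + b0*X_t + lam*Y_{t-1} + u_t.
    X_dyn = np.column_stack([np.ones(len(y) - 1), x[1:], y[:-1]])
    e = ols_residuals(X_dyn, y[1:])
    # Step 2: auxiliary regression adds e_{t-1}, which costs one more observation.
    X_aux = np.column_stack([X_dyn[1:], e[:-1]])
    e_dep = e[1:]
    u = ols_residuals(X_aux, e_dep)
    # Step 3: unadjusted R^2 of the auxiliary regression times the sample size.
    centered = e_dep - e_dep.mean()
    r2 = 1 - (u @ u) / (centered @ centered)
    return len(e_dep) * r2


# Simulated dynamic model (hypothetical) whose error follows AR(1) with rho = 0.7.
rng = np.random.default_rng(1)
T = 2000
x = rng.normal(size=T)
v = rng.normal(size=T)
eps = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    eps[t] = 0.7 * eps[t - 1] + v[t]
    y[t] = 1 + 0.5 * x[t] + 0.3 * y[t - 1] + eps[t]

CHI2_CRITICAL_5PCT_1DF = 3.84
lm = lm_serial_correlation(x, y)
print(lm, lm > CHI2_CRITICAL_5PCT_1DF)
```

Because the auxiliary regression includes the original right-hand-side variables, the test remains usable when a lagged dependent variable is present, which is the case where the Durbin-Watson d test is not applicable.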

A farmers' association hires you to predict inches of growth for corn as a function of rain on a monthly basis (they provide you with the data they have been collecting for the past 14 months). You estimate the model:

InGrwtht = B0 + B1 InRaint + B2 Tempt + B3 InGrwtht-1 + ut

1. What sign do you expect each coefficient to have?
2. Your results are:

InGrwtht = 1.3 + .11 InRaint + .19 Tempt − .01 InGrwtht-1,   R2 = .48
          (.07)  (.003)        (.02)       (.003)

3. Was introducing the lag of the dependent variable a good idea or should you remove it?
5. How would you test whether your model suffers from serial correlation?
6. You run the LM test and find: LM = _____
7. Do you have serial correlation? (Use the chi-square table.)

Correcting for Serial Correlation in Dynamic Models

There are essentially three strategies for attempting to rid a dynamic model of serial correlation:

- improving the specification:
  Only relevant if the serial correlation is impure
- instrumental variables:
  substituting an "instrument" (a variable that is highly correlated with $Y_{t-1}$ but is uncorrelated with $u_t$) for $Y_{t-1}$ in the original equation effectively eliminates the correlation between $Y_{t-1}$ and $u_t$
  Problem: good instruments are hard to come by (more in Ch 12)
- modified GLS:
  Technique similar to the GLS procedure we learned
  Potential issues: sample must be large and the standard errors of the estimated coefficients potentially need to be adjusted

$Y_t = \alpha_0 + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \cdots + \beta_p X_{t-p} + \varepsilon_t$

$GDP_t = \alpha_0 + \beta_0 MS_t + \beta_1 MS_{t-1} + \beta_2 MS_{t-2} + \cdots + \beta_p MS_{t-p} + \varepsilon_t$

... information to a researcher?

... consistently and predictably changes before another one.

Granger Causality

Granger causality, or precedence, is a circumstance in which one time series variable consistently and predictably changes before another variable

A word of caution: even if one variable precedes ("Granger causes") another, this does not mean that the first variable "causes" the other to change

There are several tests for Granger causality. They all involve distributed lag models in one form or another, however

We'll discuss an expanded version of a test originally developed by Granger

Granger suggested that to see if A Granger-caused Y, we should run:

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \cdots + \beta_p Y_{t-p} + \alpha_1 A_{t-1} + \cdots + \alpha_p A_{t-p} + \varepsilon_t$    (12.20)

and test the null hypothesis that the coefficients of the lagged As (the $\alpha$s) jointly equal zero

If we can reject this null hypothesis using the F-test, then we have evidence that A Granger-causes Y

Note that if p = 1, Equation 12.20 is similar to the dynamic model, Equation 12.3: $Y_t = \alpha_0 + \beta_0 X_t + \lambda Y_{t-1} + u_t$

Applications of this test involve running two Granger tests, one in each direction

That is, run Equation 12.20:

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \cdots + \beta_p Y_{t-p} + \alpha_1 A_{t-1} + \cdots + \alpha_p A_{t-p} + \varepsilon_t$    (12.20)

and also run:

$A_t = \beta_0 + \beta_1 A_{t-1} + \cdots + \beta_p A_{t-p} + \alpha_1 Y_{t-1} + \cdots + \alpha_p Y_{t-p} + \varepsilon_t$    (12.21)

testing the null hypothesis that the coefficients of the lagged Ys (again, the $\alpha$s) jointly equal zero

If the F-test is significant for Equation 12.20 but not for Equation 12.21, then we can conclude that A Granger-causes Y
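The two-direction test can be sketched with NumPy by comparing a restricted regression (own lags only) with an unrestricted one (own lags plus lags of the other variable). The function names and the simulated series are assumptions for illustration; the F-statistic would still have to be compared with a critical value from an F table with p and T − 3p − 1 degrees of freedom, where T is the full series length (the regression itself uses T − p observations).

```python
import numpy as np


def ssr(X, y):
    """Sum of squared residuals from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid


def granger_f(y, a, p):
    """F-statistic for H0: the p lags of A add nothing beyond the p lags of Y."""
    T = len(y)
    target = y[p:]
    own_lags = [y[p - j:T - j] for j in range(1, p + 1)]
    other_lags = [a[p - j:T - j] for j in range(1, p + 1)]
    const = np.ones(T - p)
    X_restricted = np.column_stack([const] + own_lags)
    X_unrestricted = np.column_stack([const] + own_lags + other_lags)
    ssr_r, ssr_u = ssr(X_restricted, target), ssr(X_unrestricted, target)
    df = (T - p) - X_unrestricted.shape[1]
    return ((ssr_r - ssr_u) / p) / (ssr_u / df)


# Simulated data (hypothetical): A leads Y by one period, but Y does not lead A.
rng = np.random.default_rng(0)
T = 500
a = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * a[t - 1] + rng.normal()

f_a_to_y = granger_f(y, a, p=2)  # Equation 12.20: do lags of A help predict Y?
f_y_to_a = granger_f(a, y, p=2)  # Equation 12.21: do lags of Y help predict A?
print(f_a_to_y, f_y_to_a)
```

Swapping the argument order is all it takes to run the test in the opposite direction, mirroring the pair of Equations 12.20 and 12.21.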


STATIONARITY

Independent variables can appear to be more significant than they actually are if they have the same underlying trend as the dependent variable

Example: In a country with rampant inflation almost any nominal variable will appear to be highly correlated with all other nominal variables

Why? Nominal variables are unadjusted for inflation, so every nominal variable will have a powerful inflationary component

Such a problem is an example of spurious correlation: a strong relationship between two or more variables that is not caused by a real underlying causal relationship

If you run a regression in which the dependent variable and one or more independent variables are spuriously correlated, the result is a spurious regression, and the t-scores and overall fit of such spurious regressions are likely to be overstated and untrustworthy

What causes spurious correlation? NONSTATIONARY TIME SERIES

Let's see what that means and how we can correct for it

A time-series variable, Xt, is stationary if:

1. the mean of Xt is constant over time,
2. the variance of Xt is constant over time, and
3. the simple correlation coefficient between $X_t$ and $X_{t-k}$ depends on the length of the lag (k) but on no other variable (for all k)

If one or more of these properties is not met, then Xt is nonstationary

If a series is nonstationary, that problem is often referred to as nonstationarity

Thinking about the three properties above:

What is real per capita output?

What is the growth rate for real per capita output?

Consider the case where Yt is generated by an equation that includes only past values of itself (an autoregressive equation):

$Y_t = \gamma Y_{t-1} + v_t$    (12.22)

$GDP_t = \gamma\,GDP_{t-1} + v_t$

Can you see that if $|\gamma| < 1$, then the expected value of Yt will eventually approach 0 (and therefore be stationary) as the sample size gets bigger and bigger? (Remember, since vt is a classical error term, its expected value = 0)

If instead $|\gamma| > 1$, Yt keeps growing in absolute value and is nonstationary, which can still cause spurious regression results

Most importantly, what about if $|\gamma| = 1$? In this case:

$Y_t = Y_{t-1} + v_t$    (12.23)

$GDP_t = GDP_{t-1} + v_t$

This is a random walk: the expected value of Yt does not converge on any value, meaning that it is nonstationary

This circumstance, where $\gamma = 1$ in Equation 12.23 (or similar equations), is called a unit root

If a variable has a unit root, then Equation 12.23 holds, and the variable follows a random walk and is nonstationary
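One way to see why a random walk fails the stationarity conditions is to write it as a running sum of its shocks. The short derivation below assumes a fixed starting value $Y_0$ and shocks $v_s$ that are uncorrelated with constant variance $\sigma_v^2$; this notation is introduced here for illustration and does not come from the slides.

```latex
\begin{aligned}
Y_t &= Y_{t-1} + v_t = Y_0 + \sum_{s=1}^{t} v_s, \\
\operatorname{Var}(Y_t \mid Y_0) &= \sum_{s=1}^{t} \operatorname{Var}(v_s) = t\,\sigma_v^2 .
\end{aligned}
```

Under these assumptions the variance grows in proportion to t, so the requirement of a variance that is constant over time cannot hold.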

From the previous discussion of stationarity and unit roots, it makes sense to estimate Equation 12.22:

$Y_t = \gamma Y_{t-1} + v_t$    (12.22)

$GDP_t = \gamma\,GDP_{t-1} + v_t$

This is almost exactly how the Dickey-Fuller test works:

1. Subtract $Y_{t-1}$ from both sides of Equation 12.22, yielding:

   $(Y_t - Y_{t-1}) = (\gamma - 1) Y_{t-1} + v_t$    (12.26)

$(Y_t - Y_{t-1}) = (\gamma - 1) Y_{t-1} + v_t$

$GDP_t - GDP_{t-1} = (\gamma - 1)\,GDP_{t-1} + v_t$

If we define $\Delta Y_t = Y_t - Y_{t-1}$ then we have the simplest form of the Dickey-Fuller test:

$\Delta Y_t = \beta_1 Y_{t-1} + v_t$    (12.27)

where $\beta_1 = \gamma - 1$

Note: alternative Dickey-Fuller tests additionally include a constant and/or a constant and a trend term

2. Set up the test hypotheses:

   H0: $\beta_1 = 0$ (unit root)
   HA: $\beta_1 < 0$ (stationary)

3. Set up the decision rule:

   If $\hat{\beta}_1$ is statistically significantly less than 0, then we can reject the null hypothesis of nonstationarity

   If $\hat{\beta}_1$ is not statistically significantly less than 0, then we cannot reject the null hypothesis of nonstationarity

Note that the standard t-table does not apply to Dickey-Fuller tests

For the case of no constant and no trend (Equation 12.27) the large-sample values for tc are listed on the next slide
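A plain-Python sketch of the simplest Dickey-Fuller regression (Equation 12.27, no constant and no trend) follows. The function names and the simulated series are invented for illustration. The threshold of −1.95 is an assumed large-sample 5% critical value for this case, since the table itself is not reproduced in this copy, so it should be checked against a Dickey-Fuller table before use.

```python
import random


def dickey_fuller_t(y):
    """t-statistic on beta1 in  dY_t = beta1 * Y_{t-1} + v_t  (no constant, no trend)."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lagged = y[:-1]
    sxx = sum(v * v for v in lagged)
    beta1 = sum(d * v for d, v in zip(dy, lagged)) / sxx
    resid = [d - beta1 * v for d, v in zip(dy, lagged)]
    s2 = sum(r * r for r in resid) / (len(dy) - 1)  # one estimated coefficient
    return beta1 / (s2 / sxx) ** 0.5


def simulate_ar1(gamma, n, seed):
    """Simulate Y_t = gamma * Y_{t-1} + v_t with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(gamma * y[-1] + rng.gauss(0, 1))
    return y


ASSUMED_CRITICAL_5PCT = -1.95  # assumption: large-sample value, no constant / no trend

stationary_series = simulate_ar1(0.5, 500, seed=1)
t_stat = dickey_fuller_t(stationary_series)
print(t_stat, t_stat < ASSUMED_CRITICAL_5PCT)
```

The statistic is computed like an ordinary t-ratio; what changes in a Dickey-Fuller test is only the table it is compared against.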

[Table: large-sample critical values for the Dickey-Fuller test; not reproduced]

When should you include a time trend in the DF test?

The decision to use the intercept-only DF test or the intercept & trend DF test depends on what the alternative is and what the data look like.

- In the intercept-only specification, the alternative is that Y is stationary around a constant: no long-term growth in the series

  $\Delta Y_t = \beta_0 + \beta_1 Y_{t-1} + v_t$

- In the intercept & trend specification, the alternative is that Y is stationary around a linear time trend: the series has long-term growth.

  $\Delta Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 t + v_t$

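[The first line of the estimated equation is not recoverable; only the standard errors of its three terms remain: (0.109), (0.0001), (0.014)]

$+\ 0.269\,\Delta\ln(GDP_{t-1}) + 0.178\,\Delta\ln(GDP_{t-2})$, with standard errors (0.069) and (0.070)

DF t-statistic = −2.18

Note that the standard t-table does not apply to Dickey-Fuller tests

Don't compare this to −1.96: use the Dickey-Fuller table!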

1. Which coefficient do you have to test for significance?
2. Which of the three Dickey-Fuller tables would you use?
3. Do we have nonstationarity in our study or not?

Spurious correlation: what was that again?

- a strong relationship between two or more variables that is not caused by a real underlying causal relationship

What was its main cause?

- Nonstationarity

Some more examples: http://www.tylervigen.com/spurious-correlations

NONSTATIONARITY AND COINTEGRATION

Cointegration

If the Dickey-Fuller test reveals nonstationarity, what should we do?

The traditional approach has been to take first differences ($\Delta Y = Y_t - Y_{t-1}$ and $\Delta X = X_t - X_{t-1}$) and use them in place of Yt and Xt in the regressions

Issue: the first-differencing basically "throws away information" about the possible equilibrium relationships between the variables

Alternatively, one might want to test whether the time series are cointegrated, which means that even though individual variables might be nonstationary, it's possible for linear combinations of nonstationary variables to be stationary

Cointegration (cont.)

To see how this works, consider Equation 12.24:

$Y_t = \alpha_0 + \beta_0 X_t + u_t$    (12.24)

Assume that both Yt and Xt have a unit root

Solving Equation 12.24 for ut, we get:

$u_t = Y_t - \alpha_0 - \beta_0 X_t$    (12.30)

In Equation 12.24, ut is a function of two nonstationary variables, so ut might be expected also to be nonstationary

Cointegration refers to the case where this is not the case: Yt and Xt are both nonstationary, yet a linear combination of them, as given by Equation 12.24, is stationary

How does this happen? This could happen if economic theory supports Equation 12.24 as an equilibrium

Cointegration (cont.)

We thus see that if Xt and Yt are cointegrated then OLS estimation of the coefficients in Equation 12.24 can avoid spurious results

To determine if Xt and Yt are cointegrated, we begin with OLS estimation of Equation 12.24 and calculate the OLS residuals:

$e_t = Y_t - \hat{\alpha}_0 - \hat{\beta}_0 X_t$    (12.31)

Next, perform a Dickey-Fuller test on the residuals. Remember to use the critical values from the Dickey-Fuller Table!

If we are able to reject the null hypothesis of a unit root in the residuals, we can conclude that Xt and Yt are cointegrated and our OLS estimates are not spurious
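The residual-based procedure can be sketched in plain Python: estimate the levels equation by OLS, then compute a Dickey-Fuller t-statistic on the residuals. The function names and the simulated data are invented for illustration. The sketch deliberately stops at the t-statistic, because the critical value has to come from the appropriate table.

```python
import random


def ols(x, y):
    """Simple OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope


def dickey_fuller_t(series):
    """t-statistic on beta1 in  d(e_t) = beta1 * e_{t-1} + v_t  (no constant, no trend)."""
    diff = [series[t] - series[t - 1] for t in range(1, len(series))]
    lagged = series[:-1]
    sxx = sum(v * v for v in lagged)
    beta1 = sum(d * v for d, v in zip(diff, lagged)) / sxx
    resid = [d - beta1 * v for d, v in zip(diff, lagged)]
    s2 = sum(r * r for r in resid) / (len(diff) - 1)
    return beta1 / (s2 / sxx) ** 0.5


# Simulated data (hypothetical): X is a random walk and Y = 2 + 3*X + stationary noise,
# so the two series are cointegrated by construction.
rng = random.Random(3)
x = [0.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0, 1))
y = [2 + 3 * xt + rng.gauss(0, 1) for xt in x]

alpha0_hat, beta0_hat = ols(x, y)
residuals = [yt - alpha0_hat - beta0_hat * xt for xt, yt in zip(x, y)]
resid_t_stat = dickey_fuller_t(residuals)
print(beta0_hat, resid_t_stat)
```

A strongly negative statistic on the residuals is what points toward cointegration; for two unrelated random walks the residuals would themselves tend to behave like a random walk.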

A Standard Sequence of Steps for Dealing with Nonstationary Time Series

1. Specify the model (lags vs. no lags, etc)
2. Test all variables for nonstationarity (technically unit roots) using the appropriate version of the Dickey-Fuller test
3. If the variables don't have unit roots, estimate the equation in its original units (Y and X)
4. If the variables have unit roots, test the residuals of the equation for cointegration using the Dickey-Fuller test
5. If the variables have unit roots but are not cointegrated, then change the functional form of the model to first differences ($\Delta X$ and $\Delta Y$) and estimate the equation
6. If the variables have unit roots and also are cointegrated, then estimate the equation in its original units

Assume we are estimating the following model:

$GDP_t = \alpha_0 + \beta_0 MS_t + \varepsilon_t$

1. We first check if each variable is nonstationary: how would you do that?
2. Assume we find out both are. Please write out step by step how you would check for cointegration.
3. If you find no evidence of cointegration, how can you still estimate your model correctly?

AUTOREGRESSION

14-96

4. Autoregressions

(SW Section 14.3)

A natural starting point for a forecasting model is to use past values of Y (that is, $Y_{t-1}, Y_{t-2}, \dots$) to forecast $Y_t$.

An autoregression is a regression model in which $Y_t$ is regressed against its own lagged values.

The number of lags used as regressors is called the order of the autoregression.

In a first-order autoregression, $Y_t$ is regressed against $Y_{t-1}$.

In a pth-order autoregression, $Y_t$ is regressed against $Y_{t-1}, Y_{t-2}, \dots, Y_{t-p}$.


The population AR(1) model is

$Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$

$\beta_0$ and $\beta_1$ do not have causal interpretations.

If $\beta_1 = 0$, $Y_{t-1}$ is not useful for forecasting $Y_t$.

The AR(1) model can be estimated by an OLS regression of $Y_t$ against $Y_{t-1}$ (mechanically, how would you run this regression?).

Testing $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$ provides a test of the hypothesis that $Y_{t-1}$ is not useful for forecasting $Y_t$.
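One way to answer the "mechanically" question is to build the lagged series by shifting the data one period and then run ordinary least squares. The sketch below is only an illustration under stated assumptions: `fit_ar1` is a hypothetical helper name, the data are simulated with made-up parameters, and the standard error is the homoskedasticity-only formula rather than a robust one.

```python
import numpy as np

def fit_ar1(y):
    """OLS of Y_t on a constant and Y_{t-1}; returns (b0, b1, se_b1)."""
    y = np.asarray(y, dtype=float)
    Y = y[1:]                                        # Y_t for t = 2..T
    X = np.column_stack([np.ones(len(Y)), y[:-1]])   # constant and Y_{t-1}
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ Y
    resid = Y - X @ b
    s2 = resid @ resid / (len(Y) - 2)                # homoskedasticity-only variance
    se = np.sqrt(s2 * np.diag(XtX_inv))
    return b[0], b[1], se[1]

# Simulated AR(1) with made-up parameters beta0 = 1.0, beta1 = 0.5
rng = np.random.default_rng(1)
T = 2000
y = np.empty(T)
y[0] = 2.0
for t in range(1, T):
    y[t] = 1.0 + 0.5 * y[t - 1] + rng.normal()

b0, b1, se_b1 = fit_ar1(y)
t_stat = b1 / se_b1                                  # tests H0: beta1 = 0
print(f"b0={b0:.3f}, b1={b1:.3f}, t={t_stat:.2f}")
```

Note that one observation is lost to the lag: the regression uses T - 1 rows.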

Copyright 2015 Pearson Education, Inc. All rights reserved.

14-98

Example: AR(1) model of the growth rate of GDP

Estimated using data from 1962:Q1 to 2012:Q4:

$\widehat{GDPGR}_t = 1.991 + 0.344\,GDPGR_{t-1}$, $R^2 = 0.11$

Standard errors: (0.349) for the intercept and (0.075) for the coefficient on $GDPGR_{t-1}$.

Is the lagged growth rate of GDP a useful predictor of the current growth rate of GDP?

1. t = 0.344/0.075 = 4.59 > 1.96 (in absolute value)

2. Reject $H_0: \beta_1 = 0$ at the 5% significance level

3. Yes, the lagged growth rate of GDP is a useful predictor of the current growth rate, but the $R^2$ is pretty low.


forecasting

The pth-order autoregressive model, AR(p), is

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + u_t$

The AR(p) model uses p lags of Y as regressors.

The AR(1) model is a special case.

The coefficients do not have a causal interpretation.

To test the hypothesis that $Y_{t-2}, \dots, Y_{t-p}$ do not further help forecast $Y_t$, beyond $Y_{t-1}$, use an F-test.

Use t- or F-tests to determine the lag order p.

Or, better, determine p using an information criterion (more on this later).
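The F-test on the extra lags can be sketched by comparing the sum of squared residuals of the AR(p) regression with that of the AR(1) regression on the same sample. This is a minimal sketch, assuming homoskedastic errors. The helper names (`lag_matrix`, `ssr`, `f_stat_extra_lags`, `simulate_ar`) and the simulated coefficients are illustrative and not taken from the slides.

```python
import numpy as np

def lag_matrix(y, p):
    """Rows t = p..T-1: columns are a constant and Y_{t-1}, ..., Y_{t-p}."""
    T = len(y)
    cols = [np.ones(T - p)] + [y[p - j:T - j] for j in range(1, p + 1)]
    return np.column_stack(cols)

def ssr(Y, X):
    """Sum of squared OLS residuals."""
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    e = Y - X @ b
    return e @ e

def f_stat_extra_lags(y, p):
    """Homoskedasticity-only F-statistic for H0: beta_2 = ... = beta_p = 0."""
    y = np.asarray(y, dtype=float)
    Y = y[p:]
    X_u = lag_matrix(y, p)        # unrestricted: lags 1..p
    X_r = X_u[:, :2]              # restricted: constant and lag 1 only
    q = p - 1                     # number of restrictions
    dof = len(Y) - (p + 1)
    ssr_u, ssr_r = ssr(Y, X_u), ssr(Y, X_r)
    return ((ssr_r - ssr_u) / q) / (ssr_u / dof)

def simulate_ar(coefs, T, rng, burn=100):
    """Simulate an AR process with zero intercept and the given lag coefficients."""
    p = len(coefs)
    y = np.zeros(T + burn)
    for t in range(p, T + burn):
        y[t] = sum(c * y[t - j - 1] for j, c in enumerate(coefs)) + rng.normal()
    return y[burn:]

rng = np.random.default_rng(2)
F_ar2 = f_stat_extra_lags(simulate_ar([0.3, 0.4], 1000, rng), 4)  # lag 2 matters
F_ar1 = f_stat_extra_lags(simulate_ar([0.5], 1000, rng), 4)       # only lag 1 matters
print(F_ar2, F_ar1)
```

A large F-statistic, as in the first simulated series, indicates that lags 2 through p add predictive content beyond lag 1.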


Information Criteria

How to choose the number of lags p in an AR(p)?


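Example: AR(2) model of the growth rate of GDP

$\widehat{GDPGR}_t = \hat\beta_0 + \hat\beta_1 GDPGR_{t-1} + \hat\beta_2 GDPGR_{t-2}$, $R^2 = 0.14$

Standard errors: (0.40) for the intercept, (0.08) and (0.08) for the coefficients on the two lags.

$R^2$ increased from 0.11 to 0.14 when lag 2 was added.

So, lag 2 helps to predict the growth of GDP.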


(SW Section 14.5)

How to choose the number of lags p in an AR(p)?

You can use sequential downward t- or F-tests, but the models chosen tend to be too large.

A better way to determine lag lengths is to use an information criterion.

Information criteria trade off bias (too few lags) vs. variance (too many lags).

Two information criteria are the Bayes (BIC) and the Akaike (AIC).


$BIC(p) = \ln\left[\frac{SSR(p)}{T}\right] + (p+1)\frac{\ln T}{T}$

First term: always decreasing in p (larger p, better fit).

Second term: always increasing in p.

The variance of the forecast due to estimation error increases with p, so you don't want a forecasting model with too many coefficients. But what is "too many"?

This term is a penalty for using more parameters, and thus increasing the forecast variance.

Minimizing BIC(p) trades off bias and variance to determine a "best" value of p for your forecast.

The result is that the BIC estimator of p is consistent.


Akaike Information Criterion (AIC)

$AIC(p) = \ln\left[\frac{SSR(p)}{T}\right] + (p+1)\frac{2}{T}$

$BIC(p) = \ln\left[\frac{SSR(p)}{T}\right] + (p+1)\frac{\ln T}{T}$

The penalty term is smaller for the AIC than for the BIC (2 < ln T for T ≥ 8).

The AIC therefore tends to choose more lags (larger p) than the BIC.

This might be desirable if you think longer lags might be important.

However, the AIC estimator of p isn't consistent: it can overestimate p because the penalty isn't big enough.
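Both criteria are easy to compute once each AR(p) has been fit by OLS. The sketch below is an illustration on simulated data, with hypothetical helper names (`ssr_ar`, `information_criteria`). It assumes every candidate model is estimated on the same observations (the first `p_max` are dropped for all p) and uses that common sample size as T in the formulas.

```python
import numpy as np

def ssr_ar(y, p, p_max):
    """SSR of an AR(p) fit by OLS on observations t = p_max..T-1, and the sample size."""
    T = len(y)
    Y = y[p_max:]
    cols = [np.ones(T - p_max)] + [y[p_max - j:T - j] for j in range(1, p + 1)]
    X = np.column_stack(cols)
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    e = Y - X @ b
    return e @ e, len(Y)

def information_criteria(y, p_max):
    """Return arrays (BIC(p), AIC(p)) for p = 0..p_max."""
    y = np.asarray(y, dtype=float)
    bic, aic = [], []
    for p in range(p_max + 1):
        ssr, n = ssr_ar(y, p, p_max)
        fit = np.log(ssr / n)                      # first term, shared by both
        bic.append(fit + (p + 1) * np.log(n) / n)  # penalty uses ln T
        aic.append(fit + (p + 1) * 2 / n)          # penalty uses 2
    return np.array(bic), np.array(aic)

# Simulated AR(2) with made-up coefficients 0.3 and 0.4
rng = np.random.default_rng(3)
T = 1100
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.3 * y[t - 1] + 0.4 * y[t - 2] + rng.normal()
y = y[100:]                                        # drop burn-in

bic, aic = information_criteria(y, p_max=6)
p_bic, p_aic = int(np.argmin(bic)), int(np.argmin(aic))
print(p_bic, p_aic)
```

Because the AIC penalty per coefficient is smaller, the lag length it selects is never below the one selected by the BIC on the same sample.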


BIC, AIC, and $R^2$ for AR(p) models with p = 0, ..., 6:

# Lags    BIC      AIC      $R^2$
0         1.095    1.076    0.000
1         1.067    1.030    0.056
2         0.955    0.900    0.181
3         0.957    0.884    0.203
4         0.986    0.895    0.204
5         1.016    0.906    0.204
6         1.046    0.918    0.204
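Reading the table amounts to finding the lag length with the smallest value of each criterion. A short check, using only the BIC and AIC values listed above (the variable names are illustrative):

```python
lags = [0, 1, 2, 3, 4, 5, 6]
bic = [1.095, 1.067, 0.955, 0.957, 0.986, 1.016, 1.046]
aic = [1.076, 1.030, 0.900, 0.884, 0.895, 0.906, 0.918]

# Pick the lag length that minimizes each criterion
p_bic = min(lags, key=lambda p: bic[p])
p_aic = min(lags, key=lambda p: aic[p])
print(p_bic, p_aic)  # prints: 2 3
```

The BIC selects 2 lags and the AIC selects 3, consistent with the AIC's smaller penalty.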


Predictors and the Autoregressive

Distributed Lag (ADL) Model

Can you use lags of variables other than the dependent variable in your regression?

If so, how do you decide how many lags to use for those independent variables?


Predictors and the Autoregressive Distributed

Lag (ADL) Model

(SW Section 14.4)

So far we have considered models that use only past values of Y.

It makes sense to add other variables (X) that might be useful predictors of Y, above and beyond the predictive value of lagged values of Y:

$Y_t = \beta_0 + \beta_1 Y_{t-1} + \dots + \beta_p Y_{t-p} + \delta_1 X_{t-1} + \dots + \delta_r X_{t-r} + u_t$

This is an autoregressive distributed lag model with p lags of Y and r lags of X, denoted ADL(p,r).
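Estimating an ADL(p,r) by OLS only requires stacking the lags of Y and the lags of X side by side as regressors. The following is a minimal sketch on simulated data. `adl_design` and `fit_adl` are hypothetical helper names, and the data-generating coefficients (1.0, 0.5, 0.8) are made up for the illustration.

```python
import numpy as np

def adl_design(y, x, p, r):
    """Return (Y, X) for an ADL(p, r): constant, Y_{t-1..t-p}, X_{t-1..t-r}."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    T, m = len(y), max(p, r)          # the first m observations are lost to lags
    cols = [np.ones(T - m)]
    cols += [y[m - j:T - j] for j in range(1, p + 1)]
    cols += [x[m - j:T - j] for j in range(1, r + 1)]
    return y[m:], np.column_stack(cols)

def fit_adl(y, x, p, r):
    """OLS coefficients ordered as (intercept, lags of Y, lags of X)."""
    Y, X = adl_design(y, x, p, r)
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return b

# Simulated ADL(1,1): Y_t = 1.0 + 0.5 Y_{t-1} + 0.8 X_{t-1} + u_t
rng = np.random.default_rng(4)
T = 2000
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 1.0 + 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

b = fit_adl(y, x, p=1, r=1)
print(np.round(b, 3))
```

As with the autoregression, these coefficients are forecasting relationships and do not by themselves have a causal interpretation.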


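$\widehat{GDPGR}_t = \hat\beta_0 + \hat\beta_1 GDPGR_{t-1} + \hat\beta_2 GDPGR_{t-2} + \hat\delta_1 TSpread_{t-1} + \hat\delta_2 TSpread_{t-2}$, $R^2 = 0.17$

Standard errors, in the order of the coefficients: (0.48), (0.08), (0.08), (0.42), (0.43).

F-statistic for coefficients on lags of TSpread: F = 4.43 (p-value = 0.01)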


models

Let K = the total number of coefficients in the model (intercept, lags of Y, lags of X). The BIC is

$BIC(K) = \ln\left[\frac{SSR(K)}{T}\right] + K\frac{\ln T}{T}$

You could compute this for every combination of lags of Y and lags of X (but this is a lot)!

Shortcut? Yes:

Require the same number of lags for each variable used (Y, X1, X2).

Or choose the lags of Y by BIC, and decide whether or not to include X using a Granger causality test with a fixed number of lags (the number depends on the data and application).
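A Granger causality test with a fixed number of lags is an F-test that all coefficients on the lags of X are zero in the ADL regression. The sketch below uses the homoskedasticity-only F-statistic on simulated data. The function names (`ssr`, `granger_f`), the lag choices, and the simulated series are assumptions for the example rather than anything prescribed by the slides.

```python
import numpy as np

def ssr(Y, X):
    """Sum of squared OLS residuals."""
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    e = Y - X @ b
    return e @ e

def granger_f(y, x, p, r):
    """F-statistic for H0: all r lags of X have zero coefficients in an ADL(p, r)."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    T, m = len(y), max(p, r)
    Y = y[m:]
    own = [np.ones(T - m)] + [y[m - j:T - j] for j in range(1, p + 1)]
    other = [x[m - j:T - j] for j in range(1, r + 1)]
    X_r = np.column_stack(own)            # restricted: lags of Y only
    X_u = np.column_stack(own + other)    # unrestricted: adds lags of X
    dof = len(Y) - X_u.shape[1]
    ssr_r, ssr_u = ssr(Y, X_r), ssr(Y, X_u)
    return ((ssr_r - ssr_u) / r) / (ssr_u / dof)

# Simulated data: x helps predict y, z does not
rng = np.random.default_rng(5)
T = 2000
x = rng.normal(size=T)
z = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

F_x = granger_f(y, x, p=2, r=2)
F_z = granger_f(y, z, p=2, r=2)
print(F_x, F_z)
```

A large F-statistic, as for `x` here, suggests the lags of that variable have predictive content for Y. A small one, as for `z`, suggests they can be left out.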
