
Applications of Econometrics

Serial Correlation in Time Series Data


Wooldridge (2012) Chapter 12

Semester 2, 2023/24

Applications of Econometrics Ch. 12 Serial Correlation Semester 2, 2023/24 1 / 59


In this lecture

1 Properties of OLS under Serial Correlation

2 N-W/HAC Standard Errors

3 Testing for Serial Correlation

4 Correcting for Serial Correlation with (F)GLS


Properties of OLS under Serial Correlation


Let’s remind ourselves

Recall the model written in its usual form:

yt = β0 + β1 xt1 + . . . + βk xtk + ut

Serial correlation means that the errors, {ut : t = 1, 2, . . .} are correlated.


Does serial correlation affect unbiasedness or consistency directly? No.
If the expected value of ut does not depend on any of the explanatory
variables in any time period – so the explanatory variables are strictly
exogenous, Assumption TS.3 – then OLS is unbiased.


Let’s remind ourselves

If ut is uncorrelated with the explanatory variables at time t – the explanatory variables are contemporaneously exogenous, Assumption TS.3′ – then OLS is consistent, provided the time series are weakly dependent.
There are situations where the nature of the xtj means that serial correlation in {ut} implies that ut is correlated with xtj, an example being an autoregressive model.
There is little to worry about with static and finite distributed lag regression models concerning consistency in the presence of serial correlation.


But wait...it affects something

But serially correlated errors mean that the usual OLS statistical inference is
incorrect, even in large samples.
In many cases, the inference can be very misleading.
(What is the other assumption that affects statistical inference?)
(Heteroskedasticity or HSK also invalidates the usual inference in TS
regressions.)
In some cases, we can improve over OLS by modelling the serial correlation
and using a different estimation method, but additional assumptions are
needed.


Serial correlation and goodness of fit

It is commonly thought that serial correlation invalidates R² and the adjusted R².


If the serial correlation is due to spurious regression – which means {yt} and some of the explanatory variables have unit roots – then R² and the adjusted R² are pretty useless.
But if the data are weakly dependent (perhaps after differencing or using
growth rates), the usual R-squareds are reliable even if there is serial
correlation (and/or HSK).
Spurious regression and weak dependency are introduced in EofE. We will
examine them in more detail in topic 4.


Computing Standard Errors Robust to Serial Correlation and HSK


Remember HSK in CS data?

It is increasingly common to treat serial correlation in TS regression like we often treat HSK in CS regression: as a nuisance that causes the usual inference to be incorrect.
How did we deal with HSK in CS data?
reg y x1 x2 ... xk, robust


What does “, robust” do?

“, robust” means the inference will be robust to HSK of unknown form.


It substitutes constant error variance with heteroskedastic error variances in
the formula for standard errors.
Does it work for TS data?
If we can rule out serial correlation in the errors, we can use exactly the same
command to make inference robust to HSK with time series.


What about serial correlation?

It is also possible to compute standard errors, CIs, and test statistics robust to
general forms of serial correlation – at least approximately.
These statistics are also robust to any kind of HSK.
The underlying theory is complicated, but it is easy to describe the idea.
For example, we might decide up front to allow ut to be correlated with ut−1
and ut−2 , but not the errors more than two periods apart.


Let’s name these standard errors

The resulting standard errors are usually called Newey-West standard errors, and are now computed routinely by Stata and other programs.
The standard errors are sometimes called HAC (heteroskedasticity and autocorrelation consistent) standard errors.


Can they be automated with a command?

The N-W standard errors are not as automated as the adjustment for HSK
because we have to choose a lag.
The choice of the lag q is debatable.
Guidelines: With annual data, the lag is usually fairly short – maybe a couple
of years, so lag = 2 – but with quarterly or monthly data we tend to try longer
lags, such as lag = 24.


How to decide the number of lags?

Are there more specific rules?


A starting estimate is the integer part of 4(n/100)^(2/9).
Newey and West suggest a multiple of n^(1/3). Stock and Watson (2014) follow this and suggest the integer part of (3/4)n^(1/3).
Others have suggested the integer part of n^(1/4).


How to decide the number of lags?

For example, suppose we have 40 years of data.


What are the lag length options?
The lag length options would be:
4(40/100)^(2/9) = 3.26 → q = 3
(3/4)40^(1/3) = 2.56 → q = 2
40^(1/4) = 2.51 → q = 2
Based on these, we can choose either 2 or 3 lags.
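These rule-of-thumb calculations are easy to script. A minimal Python sketch (the helper name lag_choices is ours, not from the lecture):

```python
def lag_choices(n):
    """Return the three rule-of-thumb N-W lag lengths for sample size n."""
    return {
        "4(n/100)^(2/9)": int(4 * (n / 100) ** (2 / 9)),
        "(3/4)n^(1/3)":   int(0.75 * n ** (1 / 3)),
        "n^(1/4)":        int(n ** (1 / 4)),
    }

print(lag_choices(40))   # {'4(n/100)^(2/9)': 3, '(3/4)n^(1/3)': 2, 'n^(1/4)': 2}
```

For the 108-quarter sample used later in this lecture, the same rules give 4, 3, and 3.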


Stata command

The command in Stata is


newey y x1 x2 ... xk, lag(q)
where we have to choose q, and probably will experiment a bit to see how
sensitive the standard errors are.
If we choose q = 0, it yields the same results as
reg y x1 x2 ... xk, robust
What estimator is “newey” based on?
We are still estimating the parameters by OLS. We are only changing how we
estimate their precision and perform inference.


When to use HAC standard errors

Just as with the HSK-robust inference, we can apply the HAC inference irrespective of evidence of serial correlation.
Large differences between the HAC standard errors and the usual ones suggest that serial correlation (autocorrelation) or HSK is present.


HAC is less commonly used

HAC standard errors are less commonly used than HSK-robust errors for several
reasons:
HAC SEs can be poorly behaved if there is substantial serial correlation and
the sample size is small.
The lag length q must be chosen by the researcher and the standard errors
can be sensitive to the choice of lag.

That said, HAC standard errors are becoming more widespread in use.


Example: Federal funds rate, the set-up

Let’s estimate a simple reaction function for the federal funds rate (FEDFUND.DTA)

We difference ffrate, inflation, and the GDP gap as they are highly persistent (topic 4).
We obtain cffrate, cinf, and cgdpgap, where the prefix c denotes the change in a variable.
With quarterly data, try FDLs of order 4 in both cinf and cgdpgap.


Example: Federal funds rate, FDL regression

. reg cffrate cinf cinf_1 cinf_2 cinf_3 cinf_4 cgdpgap cgdpgap_1 cgdpgap_2
cgdpgap_3 cgdpgap_4

Source | SS df MS Number of obs = 177
-------------+------------------------------ F( 10, 166) = 6.23
Model | 48.562746 10 4.8562746 Prob > F = 0.0000
Residual | 129.427415 166 .779683225 R-squared = 0.2728
-------------+------------------------------ Adj R-squared = 0.2290
Total | 177.990161 176 1.01130774 Root MSE = .883

------------------------------------------------------------------------------
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1635009 .0693821 2.36 0.020 .0265158 .3004861
cinf_1 | .0892447 .0749762 1.19 0.236 -.0587852 .2372746
cinf_2 | .2397011 .0766598 3.13 0.002 .0883473 .3910549
cinf_3 | .1603425 .0742329 2.16 0.032 .0137802 .3069048
cinf_4 | .0188896 .0692756 0.27 0.785 -.1178851 .1556644
cgdpgap | .3419624 .077994 4.38 0.000 .1879743 .4959506
cgdpgap_1 | .2432981 .0796212 3.06 0.003 .0860974 .4004988
cgdpgap_2 | .1016662 .077379 1.31 0.191 -.0511077 .2544401
cgdpgap_3 | .0544501 .0335291 1.62 0.106 -.0117484 .1206486
cgdpgap_4 | -.0874404 .0774749 -1.13 0.261 -.2404035 .0655227
_cons | .0395079 .0670281 0.59 0.556 -.0928297 .1718454
------------------------------------------------------------------------------

The usual t statistics show significance of both contemporaneous variables and some lags.


Example: Federal funds rate, FDL with HAC

If we try the Newey-West standard errors with lag = 2...


. newey cffrate cinf cinf_1 cinf_2 cinf_3 cinf_4 cgdpgap cgdpgap_1 cgdpgap_2
cgdpgap_3 cgdpgap_4, lag(2)

Regression with Newey-West standard errors Number of obs = 177
maximum lag: 2 F( 10, 166) = 5.51
Prob > F = 0.0000

------------------------------------------------------------------------------
| Newey-West
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1635009 .0890041 1.84 0.068 -.012225 .3392269
cinf_1 | .0892447 .1168206 0.76 0.446 -.1414011 .3198905
cinf_2 | .2397011 .1328552 1.80 0.073 -.0226026 .5020048
cinf_3 | .1603425 .1170766 1.37 0.173 -.0708086 .3914936
cinf_4 | .0188896 .0738927 0.26 0.799 -.127001 .1647803
cgdpgap | .3419624 .1068802 3.20 0.002 .1309426 .5529822
cgdpgap_1 | .2432981 .0974424 2.50 0.014 .050912 .4356842
cgdpgap_2 | .1016662 .1466274 0.69 0.489 -.1878288 .3911612
cgdpgap_3 | .0544501 .0405701 1.34 0.181 -.0256499 .1345501
cgdpgap_4 | -.0874404 .07741 -1.13 0.260 -.2402755 .0653947
_cons | .0395079 .0593546 0.67 0.507 -.0776794 .1566951
------------------------------------------------------------------------------

...the standard errors generally increase, sometimes by large amounts. For example, cinf and cinf_3 have much smaller t statistics.


Example: Federal funds rate, adding N-W lags

Increasing the N-W lag to four (as would be suggested by using the integer part of 4(n/100)^(2/9))...
. newey cffrate cinf cinf_1 cinf_2 cinf_3 cinf_4 cgdpgap cgdpgap_1 cgdpgap_2
cgdpgap_3 cgdpgap_4, lag(4)

Regression with Newey-West standard errors Number of obs = 177
maximum lag: 4 F( 10, 166) = 7.51
Prob > F = 0.0000

------------------------------------------------------------------------------
| Newey-West
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1635009 .0913361 1.79 0.075 -.0168292 .343831
cinf_1 | .0892447 .1218877 0.73 0.465 -.1514052 .3298947
cinf_2 | .2397011 .1463759 1.64 0.103 -.0492973 .5286995
cinf_3 | .1603425 .1191334 1.35 0.180 -.0748695 .3955545
cinf_4 | .0188896 .0723094 0.26 0.794 -.1238749 .1616542
cgdpgap | .3419624 .1126093 3.04 0.003 .1196314 .5642934
cgdpgap_1 | .2432981 .0966048 2.52 0.013 .0525656 .4340306
cgdpgap_2 | .1016662 .1455609 0.70 0.486 -.1857231 .3890555
cgdpgap_3 | .0544501 .040057 1.36 0.176 -.0246368 .133537
cgdpgap_4 | -.0874404 .0724511 -1.21 0.229 -.2304848 .055604
_cons | .0395079 .0584588 0.68 0.500 -.0759106 .1549264
------------------------------------------------------------------------------

... does not change much, so the standard errors are not very sensitive to the choice of q.

Example: Federal funds rate, testing FD lags

If we do a joint test on the fourth lag...


. test cinf_4 cgdpgap_4

( 1) cinf_4 = 0
( 2) cgdpgap_4 = 0

F( 2, 166) = 0.74
Prob > F = 0.4771

...we fail to reject the null of no impact.


Example: Federal funds rate, dropping FD lags

So we would be justified in dropping them.


. newey cffrate cinf cinf_1 cinf_2 cinf_3 cgdpgap cgdpgap_1 cgdpgap_2
cgdpgap_3, lag(4)

Regression with Newey-West standard errors Number of obs = 178
maximum lag: 4 F( 8, 169) = 9.85
Prob > F = 0.0000

------------------------------------------------------------------------------
| Newey-West
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1529545 .090034 1.70 0.091 -.0247817 .3306907
cinf_1 | .0711862 .1073547 0.66 0.508 -.1407427 .2831152
cinf_2 | .2244522 .1375701 1.63 0.105 -.047125 .4960295
cinf_3 | .1428366 .0961857 1.49 0.139 -.0470436 .3327168
cgdpgap | .3387203 .1090373 3.11 0.002 .1234698 .5539708
cgdpgap_1 | .2413696 .0972818 2.48 0.014 .0493255 .4334136
cgdpgap_2 | .0886014 .1476462 0.60 0.549 -.202867 .3800698
cgdpgap_3 | .0502603 .0376323 1.34 0.183 -.0240297 .1245503
_cons | .038079 .0584926 0.65 0.516 -.0773912 .1535493
------------------------------------------------------------------------------

Overall, there seems to be evidence that the FF rate increases, phased over a
couple of quarters, when inflation increases or when the GDP gap increases
(so actual GDP is above the ideal GDP).


Example: Federal funds rate, lag lengths

FD lag length (3) does not match N-W lag length (4)—problematic?
No. FD lag length concerns explanatory variables, N-W lag length concerns
residuals. There is no reason for them to have the exact same length.


Testing for Serial Correlation


Serial correlation in AR(1)

We specify simple alternative models that allow the errors to be serially correlated, and then use the model to test the null that the errors are not serially correlated.
The most common is an AR(1) model:

ut = ρut−1 + et,

where {et} is serially uncorrelated, has a zero mean, and (usually) a constant variance. What should we set as the null hypothesis?

H0 : ρ = 0.


Practical issues

Often ρ > 0 when there is serial correlation, but we usually use a two-sided alternative.
If we could observe {ut}, we would just estimate a simple AR(1) model for ut and test ρ = 0. [Because E(ut) = 0, this is one case where we would not have to include a constant.]
But we do not observe the errors. Instead, we base a test on the OLS residuals, ût. (Think back to the case of testing for HSK, where we used ût² in place of ut².)
Remember the difference between ût and ut: the former depends on the estimators, β̂j.


Testing ρ under strict exogeneity I

Provided that the {xtj } are strictly exogenous (Assumption TS.3) then we can
implement the test in three steps.
1. Estimate the equation

yt = β0 + β1 xt1 + . . . + βk xtk + ut , t = 1, 2, . . . , n

by OLS, and save the residuals, {ût : t = 1, 2, . . . , n}.


Testing ρ under strict exogeneity II

2. Run the AR(1) regression

ût on ût−1, t = 2, . . . , n

It is not necessary to estimate an intercept—after all, the averages of ût and ût−1 are almost zero over t = 2, . . . , n—but it is harmless to do so.


Testing ρ under strict exogeneity III

3. Compute the usual t statistic for ρ̂, and carry out the test H0 : ρ = 0 in the
usual way.
The test tends to work well in large samples.
It is often applied to static and FDL models because strict exogeneity can be
true.


The role of sample size

With large n, we might reject ρ = 0 even if ρ̂ is “small.”


With small n, we might not reject even if ρ̂ seems fairly large.
The null is that everything is okay. We require the data to tell us, fairly
convincingly, that some action is required.


Extensions to the test

Another statistic, related to the previous one, is called the Durbin-Watson statistic.
Unless the sample size is small, it has little to offer over the simple regression-based test.
With the regression-based test, we can easily add lags, too, and then use an F test: for example, we can regress

ût on ût−1, ût−2, t = 3, . . . , n

and test the two lags for joint significance (using the usual F statistic).


When strict exogeneity fails

So far, we have assumed that regressors are strictly exogenous.


A simple adjustment is needed if the regressors are not strictly exogenous.
All we have to do is add all of the explanatory variables along with the lagged
OLS residual.
This time, we definitely estimate an intercept.
Why? Because E(xtj) ≠ 0.


Testing ρ under contemporaneous exogeneity

So, for the AR(1) test, after getting the OLS residuals exactly as before, run

ût on ût−1, xt1, xt2, . . . , xtk, t = 2, . . . , n

If we take the “ˆ” off the residuals, we can see why we need to include the regressors: if xtj is correlated with ut, and ut is correlated with ut−1, then xtj might be correlated with ut−1.
In other words, leaving out xtj would bias the estimates.


Comparing the two forms

This form of the test is more general than the previous form, even though the
previous test is somewhat more popular.
One must use the extended form if one or more of the xtj is a lag of yt , but it is
needed in other situations where strict exogeneity is violated.


Example: Percent fatalities regression

Using TRAFFIC.DTA, we model prcfata, the percent of accidents resulting in at least one fatality, as the dependent variable.
. reg prcfata spdlaw beltlaw unem feb-dec t

Source | SS df MS Number of obs = 108
-------------+------------------------------ F( 15, 92) = 15.57
Model | .764194266 15 .050946284 Prob > F = 0.0000
Residual | .30105389 92 .003272325 R-squared = 0.7174
-------------+------------------------------ Adj R-squared = 0.6713
Total | 1.06524816 107 .00995559 Root MSE = .0572

------------------------------------------------------------------------------
prcfata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
spdlaw | .0671634 .0204439 3.29 0.001 .02656 .1077668
beltlaw | -.0295827 .023093 -1.28 0.203 -.0754474 .0162819
unem | -.0154371 .0055134 -2.80 0.006 -.0263872 -.004487
feb | -.0001812 .0269749 -0.01 0.995 -.0537557 .0533933
(mar to nov omitted here)
dec | .0089053 .0275565 0.32 0.747 -.0458243 .0636349
t | -.0022355 .0004185 -5.34 0.000 -.0030668 -.0014043
_cons | 1.038472 .0571893 18.16 0.000 .924889 1.152055
------------------------------------------------------------------------------
. predict uh, resid



Example: Percent fatalities, ρ under strict exogeneity

We then regress the residuals on their first lag.


. gen uh_1 = L.uh
(1 missing value generated)

. reg uh uh_1

Source | SS df MS Number of obs = 107
-------------+------------------------------ F( 1, 105) = 8.91
Model | .023532239 1 .023532239 Prob > F = 0.0035
Residual | .277343282 105 .002641365 R-squared = 0.0782
-------------+------------------------------ Adj R-squared = 0.0694
Total | .300875521 106 .002838448 Root MSE = .05139

------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .2816806 .0943712 2.98 0.004 .0945599 .4688012
_cons | .0002994 .0049688 0.06 0.952 -.0095528 .0101516
------------------------------------------------------------------------------

The estimate of ρ is about .282, with tρ̂ = 2.98. What does it suggest?
There is strong evidence of serial correlation, although it is not a huge amount
of serial correlation.


Example: Percent fatalities, ρ under contemporaneous exogeneity

The more general test (including the x's) gives practically the same results: ρ̂ = .283 and tρ̂ = 2.77.
. reg uh uh_1 spdlaw beltlaw unem feb-dec t

Source | SS df MS Number of obs = 107
-------------+------------------------------ F( 16, 90) = 0.48
Model | .023694612 16 .001480913 Prob > F = 0.9505
Residual | .277180909 90 .003079788 R-squared = 0.0788
-------------+------------------------------ Adj R-squared = -0.0850
Total | .300875521 106 .002838448 Root MSE = .0555

------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .2830111 .1021103 2.77 0.007 .0801511 .4858711
spdlaw | -.0019168 .0199115 -0.10 0.924 -.0414744 .0376408
beltlaw | .0011499 .022418 0.05 0.959 -.0433874 .0456872
unem | -.000307 .0054271 -0.06 0.955 -.011089 .0104749
feb | -.0040023 .0270068 -0.15 0.883 -.057656 .0496513
(mar to nov omitted here)
dec | -.0040947 .0276155 -0.15 0.882 -.0589577 .0507684
t | -4.60e-06 .0004176 -0.01 0.991 -.0008342 .000825
_cons | .006583 .0583639 0.11 0.910 -.109367 .122533
------------------------------------------------------------------------------

What do we conclude? There is evidence of serial correlation.


Example: Percent fatalities, N-W lag length

What comes next?


With serial correlation, we should compute Newey-West standard errors, but it
is not clear what the lag should be in N-W.
Our sample size is 108. We can pick lag = 3 or 4.


Example: Percent fatalities, HAC standard errors

Going back to the percent fatalities example, we compute the HAC standard
errors.
. newey prcfata spdlaw beltlaw unem feb-dec t, lag(4)

Regression with Newey-West standard errors Number of obs = 108
maximum lag: 4 F( 15, 92) = 19.74
Prob > F = 0.0000

------------------------------------------------------------------------------
| Newey-West
prcfata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
spdlaw | .0671634 .0264891 2.54 0.013 .0145538 .1197729
beltlaw | -.0295827 .0330354 -0.90 0.373 -.0951939 .0360284
unem | -.0154371 .0059803 -2.58 0.011 -.0273144 -.0035598
feb | -.0001812 .016465 -0.01 0.991 -.0328821 .0325197
(mar to nov omitted here)
dec | .0089053 .0283141 0.31 0.754 -.0473291 .0651396
t | -.0022355 .0005551 -4.03 0.000 -.0033381 -.001133
_cons | 1.038472 .0591372 17.56 0.000 .9210202 1.155924
------------------------------------------------------------------------------

Using lag = 4 reduces the statistical significance of spdlaw and beltlaw (p = 0.001 and 0.203 without HAC).


Example: Percent fatalities, HAC standard errors

Is the estimated effect of increasing the speed limit large or small?


The estimated effect of increasing the speed limit, .067, may seem small.
But the average fatality rate is about .886 with standard deviation = .10.
So, increasing the speed limit (on rural interstates) was associated with about two-thirds of a standard deviation increase in the fatality rate.
The seatbelt law had a negative sign but is not statistically significant.


Example: Federal funds rate, regression with three FD lags

Next, let’s test for serial correlation in the federal funds rate example.
Recall that we decided to include three lags of cinf (change in inflation) and
cgdpgap (change in GDP gap).
. reg cffrate cinf cinf_1 cinf_2 cinf_3 cgdpgap cgdpgap_1 cgdpgap_2 cgdpgap_3

Source | SS df MS Number of obs = 178
-------------+------------------------------ F( 8, 169) = 7.68
Model | 47.4763936 8 5.93454919 Prob > F = 0.0000
Residual | 130.524094 169 .772331915 R-squared = 0.2667
-------------+------------------------------ Adj R-squared = 0.2320
Total | 178.000487 177 1.00565247 Root MSE = .87882

------------------------------------------------------------------------------
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1529545 .0680061 2.25 0.026 .0187037 .2872053
cinf_1 | .0711862 .0727929 0.98 0.330 -.0725143 .2148868
cinf_2 | .2244522 .0729376 3.08 0.002 .080466 .3684385
cinf_3 | .1428366 .0682135 2.09 0.038 .0081763 .2774969
cgdpgap | .3387203 .0769542 4.40 0.000 .186805 .4906355
cgdpgap_1 | .2413696 .0791325 3.05 0.003 .085154 .3975851
cgdpgap_2 | .0886014 .0761234 1.16 0.246 -.0616739 .2388767
cgdpgap_3 | .0502603 .0314743 1.60 0.112 -.0118731 .1123937
_cons | .038079 .0664227 0.57 0.567 -.093046 .169204
------------------------------------------------------------------------------


Example: Federal funds rate, ρ1 under strict exogeneity

What do we conclude about first-order serial correlation?


. predict uh, resid
(4 missing values generated)

. gen uh_1 = L.uh
(5 missing values generated)

. reg uh uh_1

Source | SS df MS Number of obs = 177
-------------+------------------------------ F( 1, 175) = 0.06
Model | .048449077 1 .048449077 Prob > F = 0.7991
Residual | 130.467572 175 .745528986 R-squared = 0.0004
-------------+------------------------------ Adj R-squared = -0.0053
Total | 130.516022 176 .741568304 Root MSE = .86344

------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .0192665 .0755773 0.25 0.799 -.1298939 .1684269
_cons | .000512 .0649001 0.01 0.994 -.1275757 .1285997
------------------------------------------------------------------------------

There is no evidence of first-order serial correlation.


Example: Federal funds rate, ρ2 under strict exogeneity

What do we conclude about second-order serial correlation?


. gen uh_2 = L2.uh
(6 missing values generated)

. reg uh uh_1 uh_2

Source | SS df MS Number of obs = 176
-------------+------------------------------ F( 2, 173) = 6.16
Model | 8.66875741 2 4.33437871 Prob > F = 0.0026
Residual | 121.811858 173 .704114789 R-squared = 0.0664
-------------+------------------------------ Adj R-squared = 0.0556
Total | 130.480616 175 .745603519 Root MSE = .83912

------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .0243798 .0734643 0.33 0.740 -.1206218 .1693814
uh_2 | -.2571086 .0734841 -3.50 0.001 -.4021494 -.1120678
_cons | -.000234 .0632508 -0.00 0.997 -.1250766 .1246085
------------------------------------------------------------------------------

There is evidence of second-order serial correlation.


Example: Federal funds rate, HSK-robust vs. HAC

The HSK-robust standard errors (below) are larger than those robust to both serial correlation and HSK... meaning the robust option alone fails to account for the serial correlation.
. reg cffrate cinf cinf_1 cinf_2 cinf_3 cgdpgap cgdpgap_1 cgdpgap_2 cgdpgap_3,
robust

Linear regression Number of obs = 178
F( 8, 169) = 4.84
Prob > F = 0.0000
R-squared = 0.2667
Root MSE = .87882

------------------------------------------------------------------------------
| Robust
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1529545 .10977 1.39 0.165 -.0637424 .3696515
cinf_1 | .0711862 .1058495 0.67 0.502 -.1377714 .2801439
cinf_2 | .2244522 .1136453 1.98 0.050 .0001049 .4487996
cinf_3 | .1428366 .094696 1.51 0.133 -.0441028 .3297761
cgdpgap | .3387203 .1162425 2.91 0.004 .1092458 .5681947
cgdpgap_1 | .2413696 .0987046 2.45 0.015 .0465168 .4362223
cgdpgap_2 | .0886014 .1482415 0.60 0.551 -.2040422 .381245
cgdpgap_3 | .0502603 .035287 1.42 0.156 -.0193998 .1199203
_cons | .038079 .0617225 0.62 0.538 -.0837674 .1599254
------------------------------------------------------------------------------


A fair question

Given the presence of HAC standard errors, why bother testing for serial
correlation in the first place?
There are several reasons why we would want to detect serial correlation
rather than just always using HAC standard errors.


Why do we test for serial correlation?

1 Lag choice: HAC errors require the choice of a lag and different choices may
lead to different standard errors. If we can show that there is no serial
correlation, then we do not need to make this choice.
2 Efficiency: Without serial correlation, there is no point in using estimation
methods that improve upon OLS only in the presence of serial correlation. In
other words, OLS is efficient if there is no serial correlation.
3 Misspecification: We may have specified a model that should not have serial
correlation. For example, a correctly specified model that includes a lagged
dependent variable (e.g., an AR(1) model with one lag of the dependent
variable) should not have serially correlated errors. Testing for serial
correlation then amounts to testing whether the model is misspecified.
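The residual-based test used in this lecture (regress ût on ût−1 without a constant and inspect the t-statistic on ρ̂) can be sketched in NumPy; this is an illustration and `ar1_test` is a made-up name:

```python
import numpy as np

def ar1_test(resid):
    """Regress u_t on u_{t-1} (no constant); return rho-hat and its t-stat."""
    u1, u0 = resid[1:], resid[:-1]
    rho = (u0 @ u1) / (u0 @ u0)
    e = u1 - rho * u0                      # residuals of the AR(1) regression
    sigma2 = (e @ e) / (len(u1) - 1)       # one parameter estimated
    se = np.sqrt(sigma2 / (u0 @ u0))
    return rho, rho / se
```

A large t-statistic (say, beyond the usual critical values) is evidence of AR(1) serial correlation in the errors.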

Correcting for Serial Correlation with (F)GLS

An alternative to OLS

If we detect serial correlation, then it must be accounted for.


HAC errors are generally easy to compute and do not rely on a strong set of
assumptions. However, these standard errors may be impractically large,
suggesting that a more efficient method could be preferred.
We can directly model serial correlation and apply generalised least squares
(GLS).*

*These procedures require that our regressors are strictly exogenous, a stronger
assumption than needed for computing HAC standard errors.


GLS with AR(1) errors

Consider the following model with AR(1) errors:

yt = β0 + β1 xt + ut (1)

and ut = ρut−1 + et, where the et are uncorrelated random variables with mean
zero and constant variance.
For t ≥ 2, we write
yt−1 = β0 + β1 xt−1 + ut−1 (2)

If we multiply Eq. (2) by ρ and subtract it from Eq. (1), we get

yt − ρyt−1 = (1 − ρ)β0 + β1 (xt − ρxt−1 ) + et , t ≥ 2, (3)

where we have used the fact that et = ut − ρut−1 .
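The algebra in Eq. (3) can be verified numerically. A minimal NumPy sketch (simulated data; the values of β0, β1, and ρ are invented for illustration) builds y with AR(1) errors and checks that the quasi-differenced equation holds with error term exactly et:

```python
import numpy as np

rng = np.random.default_rng(0)
T, rho, b0, b1 = 200, 0.5, 1.0, 2.0     # illustrative parameter values
e = rng.normal(size=T)                  # uncorrelated errors e_t
x = rng.normal(size=T)
u = np.empty(T)
u[0] = e[0]
for t in range(1, T):
    u[t] = rho * u[t - 1] + e[t]        # AR(1) errors u_t
y = b0 + b1 * x + u                     # Eq. (1)

# Quasi-differenced equation for t >= 2: the error is exactly e_t
lhs = y[1:] - rho * y[:-1]
rhs = (1 - rho) * b0 + b1 * (x[1:] - rho * x[:-1])
assert np.allclose(lhs - rhs, e[1:])
```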


GLS with AR(1) errors

We can rewrite Eq. (3) as

ỹt = (1 − ρ)β0 + β1 x̃t + et ,

where
ỹt = yt − ρyt−1 , x̃t = xt − ρxt−1 .

We call ỹt and x̃t quasi-differenced data; the transformed equation satisfies
the Gauss-Markov assumptions.


Have we found the “best” estimators?

To obtain the GLS estimators with the smallest variance, we need to include
the first period of data as well.
For t = 1, we have
ỹ1 = √(1 − ρ²) β0 + β1 x̃1 + ũ1 ,
where
ỹ1 = √(1 − ρ²) y1 , x̃1 = √(1 − ρ²) x1 , and ũ1 = √(1 − ρ²) u1 .
We can add more explanatory variables; we quasi-difference them using the
same formulas.
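The two cases (t = 1 and t ≥ 2) can be collected into one helper. This is an illustrative NumPy sketch, not Stata's implementation, and `quasi_difference` is a made-up name; the transformed data can then be run through ordinary OLS:

```python
import numpy as np

def quasi_difference(y, X, rho):
    """Prais-Winsten transformation: quasi-difference rows 2..T and
    rescale the first row by sqrt(1 - rho^2)."""
    y_t = np.empty_like(y, dtype=float)
    X_t = np.empty_like(X, dtype=float)
    y_t[1:] = y[1:] - rho * y[:-1]          # t >= 2
    X_t[1:] = X[1:] - rho * X[:-1]
    scale = np.sqrt(1 - rho ** 2)           # t = 1
    y_t[0] = scale * y[0]
    X_t[0] = scale * X[0]
    return y_t, X_t
```

Note that X includes the constant column, so the intercept is automatically rescaled to (1 − ρ)β0 for t ≥ 2 and √(1 − ρ²)β0 for t = 1, exactly as in the equations above.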


Are they feasible estimators?

We rarely know the value of ρ and instead must estimate it.


Feasible GLS (FGLS) first obtains an estimate for ρ and then uses this to
quasi-difference the data.
Three steps for FGLS estimation of a model with AR(1) errors:
1 Run the OLS regression of yt on xt1 , . . . , xtk and obtain the residuals ût .
2 Regress ût on ût−1 for t ≥ 2 to obtain an estimate ρ̂.
3 Quasi-difference the data and run OLS to obtain consistent estimates and
asymptotically valid standard errors.
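The three steps above can be sketched end to end in NumPy. This is an illustration under the stated assumptions, not Stata's `prais` command (it is the two-step version, without iteration), and the function names are made up:

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fgls_ar1(y, X):
    """Two-step feasible GLS for AR(1) errors (Prais-Winsten sketch)."""
    # Step 1: OLS on the original data, save the residuals
    u = y - X @ ols(X, y)
    # Step 2: regress u_t on u_{t-1} without a constant
    rho = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])
    # Step 3: quasi-difference (incl. rescaled first obs) and rerun OLS
    scale = np.sqrt(1 - rho ** 2)
    yt = np.concatenate(([scale * y[0]], y[1:] - rho * y[:-1]))
    Xt = np.vstack((scale * X[:1], X[1:] - rho * X[:-1]))
    return ols(Xt, yt), rho
```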


Example: Percent fatalities, estimating ρ

In the percent fatalities example, we first run the OLS regression here.
Next, we estimate an AR(1) model on the estimated errors.
. predict uh, resid

. gen uh_1 = L.uh


(1 missing value generated)

. reg uh uh_1, noconstant

Source | SS df MS Number of obs = 107


-------------+---------------------------------- F(1, 106) = 8.99
Model | .0235243 1 .0235243 Prob > F = 0.0034
Residual | .277352872 106 .002616537 R-squared = 0.0782
-------------+---------------------------------- Adj R-squared = 0.0695
Total | .300877173 107 .002811936 Root MSE = .05115

------------------------------------------------------------------------------
uh | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
uh_1 | .2816134 .0939201 3.00 0.003 .0954077 .467819
------------------------------------------------------------------------------

. scalar rho = _b[uh_1]


Example: Percent fatalities, quasi-differencing

Then, we quasi-difference the data.


. gen const = 1

. foreach var in prcfata spdlaw beltlaw unem feb mar apr may jun jul aug sep oct nov dec t const {
  2. gen `var'_qd = `var' - rho * L.`var'
  3. replace `var'_qd = sqrt(1 - rho^2) * `var' in 1
  4. }
(1 missing value generated)
(1 real change made)
(1 missing value generated)
(1 real change made)
(1 missing value generated)
(1 real change made)
(1 missing value generated)
(1 real change made)
(1 missing value generated)
(1 real change made)
..
.
We never had to generate a constant before. Why here?
The constant must be quasi-differenced too: it is scaled by (1 − ρ) for
t ≥ 2 and by √(1 − ρ²) for t = 1.


Example: Percent fatalities, FGLS

Finally, we use the command


reg prcfata_qd spdlaw_qd beltlaw_qd unem_qd feb_qd mar_qd ///
apr_qd may_qd jun_qd jul_qd aug_qd sep_qd oct_qd nov_qd ///
dec_qd t_qd const_qd, noconstant
This is equivalent to
prais prcfata spdlaw beltlaw unem feb-dec t, twostep
and the procedure is known as Prais-Winsten regression (or Cochrane-Orcutt
regression when the first observation is dropped).


Example: Percent fatalities, comparing estimates

We have four sets of estimates. We know that (1) is the regression without
any adjustment to the standard errors and (3) is the regression with HAC
standard errors.
Of the remaining two, which is FGLS and which is HSK-robust?

(1) (2) (3) (4)


spdlaw 0.0672*** 0.0672*** 0.0672** 0.0644**
(0.0204) (0.0197) (0.0265) (0.0264)
beltlaw -0.0296 -0.0296 -0.0296 -0.0252
(0.0231) (0.0238) (0.0330) (0.0297)
unem -0.0154*** -0.0154** -0.0154** -0.0133*
(0.0055) (0.0060) (0.0060) (0.0070)
feb -0.0002 -0.0002 -0.0002 -0.0019
(0.0270) (0.0254) (0.0165) (0.0228)
.
.
.
dec 0.0089 0.0089 0.0089 0.0082
(0.0276) (0.0260) (0.0283) (0.0243)
t -0.0022*** -0.0022*** -0.0022*** -0.0022***
(0.0004) (0.0005) (0.0006) (0.0005)
constant 1.0385*** 1.0385*** 1.0385*** 1.0184***
(0.0572) (0.0614) (0.0591) (0.0712)


Example: Percent fatalities, answer revealed

             (1)          (2)          (3)        (4)
             Regression   HSK-robust   HAC        FGLS
             "regress"    ", robust"   "newey"    formula or "prais"
spdlaw 0.0672*** 0.0672*** 0.0672** 0.0644**
(0.0204) (0.0197) (0.0265) (0.0264)
beltlaw -0.0296 -0.0296 -0.0296 -0.0252
(0.0231) (0.0238) (0.0330) (0.0297)
unem -0.0154*** -0.0154** -0.0154** -0.0133*
(0.0055) (0.0060) (0.0060) (0.0070)
feb -0.0002 -0.0002 -0.0002 -0.0019
(0.0270) (0.0254) (0.0165) (0.0228)
.
.
.
dec 0.0089 0.0089 0.0089 0.0082
(0.0276) (0.0260) (0.0283) (0.0243)
t -0.0022*** -0.0022*** -0.0022*** -0.0022***
(0.0004) (0.0005) (0.0006) (0.0005)
constant 1.0385*** 1.0385*** 1.0385*** 1.0184***
(0.0572) (0.0614) (0.0591) (0.0712)

The first three columns are all based on OLS. Only the standard errors differ.
The fourth column is based on FGLS, so the point estimates differ as well.
