Semester 2, 2023/24
yt = β0 + β1 xt1 + . . . + βk xtk + ut
But serially correlated errors mean that the usual OLS statistical inference is
incorrect, even in large samples.
In many cases, the inference can be very misleading.
(What is the other assumption that affects statistical inference?)
(Heteroskedasticity or HSK also invalidates the usual inference in TS
regressions.)
In some cases, we can improve over OLS by modelling the serial correlation
and using a different estimation method, but additional assumptions are
needed.
It is also possible to compute standard errors, CIs, and test statistics robust to
general forms of serial correlation – at least approximately.
These statistics are also robust to any kind of HSK.
The underlying theory is complicated, but it is easy to describe the idea.
For example, we might decide up front to allow ut to be correlated with ut−1
and ut−2 , but not the errors more than two periods apart.
The N-W standard errors are not as automated as the adjustment for HSK
because we have to choose a lag.
The choice of the lag q is debatable.
Guidelines: With annual data, the lag is usually fairly short – maybe a couple
of years, so lag = 2 – but with quarterly or monthly data we tend to try longer
lags, such as lag = 24.
Stata command: newey y x1 . . . xk, lag(q)
Just as with the HSK-robust inference, we can apply the HAC inference
irrespective of evidence of serial correlation.
Large differences in the HAC standard errors and the usual ones suggest
serial correlation (autocorrelation) or HSK are present.
HAC standard errors are less commonly used than HSK-robust errors for several
reasons:
HAC SEs can be poorly behaved if there is substantial serial correlation and
the sample size is small.
The lag length q must be chosen by the researcher and the standard errors
can be sensitive to the choice of lag.
That said, HAC standard errors are becoming more widespread in use.
Let’s estimate a simple reaction function for the federal funds rate (FEDFUND.DTA).
We difference ffrate, inflation, and the GDP gap, as they are highly persistent
(Topic 4).
We obtain cffrate, cinf, and cgdpgap, where the prefix c denotes the change in a
variable.
With quarterly data, try FDLs of order 4 in both cinf and cgdpgap.
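Outside Stata, constructing the differenced series and the lag terms for an FDL of order 4 can be sketched with pandas (the synthetic data here are purely illustrative; only the c prefix and the _1 . . . _4 suffixes follow the slides' naming):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# stand-in quarterly series; random walks mimic highly persistent data
df = pd.DataFrame({
    "ffrate": rng.normal(size=12).cumsum(),
    "inf": rng.normal(size=12).cumsum(),
    "gdpgap": rng.normal(size=12).cumsum(),
})

# first-difference the persistent series; the c prefix marks changes
for v in ["ffrate", "inf", "gdpgap"]:
    df["c" + v] = df[v].diff()

# distributed-lag terms: lags 1-4 of the differenced regressors
for v in ["cinf", "cgdpgap"]:
    for k in range(1, 5):
        df[f"{v}_{k}"] = df[v].shift(k)

print(df.columns.tolist())
```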
. reg cffrate cinf cinf_1 cinf_2 cinf_3 cinf_4 cgdpgap cgdpgap_1 cgdpgap_2
cgdpgap_3 cgdpgap_4
------------------------------------------------------------------------------
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1635009 .0693821 2.36 0.020 .0265158 .3004861
cinf_1 | .0892447 .0749762 1.19 0.236 -.0587852 .2372746
cinf_2 | .2397011 .0766598 3.13 0.002 .0883473 .3910549
cinf_3 | .1603425 .0742329 2.16 0.032 .0137802 .3069048
cinf_4 | .0188896 .0692756 0.27 0.785 -.1178851 .1556644
cgdpgap | .3419624 .077994 4.38 0.000 .1879743 .4959506
cgdpgap_1 | .2432981 .0796212 3.06 0.003 .0860974 .4004988
cgdpgap_2 | .1016662 .077379 1.31 0.191 -.0511077 .2544401
cgdpgap_3 | .0544501 .0335291 1.62 0.106 -.0117484 .1206486
cgdpgap_4 | -.0874404 .0774749 -1.13 0.261 -.2404035 .0655227
_cons | .0395079 .0670281 0.59 0.556 -.0928297 .1718454
------------------------------------------------------------------------------
------------------------------------------------------------------------------
| Newey-West
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1635009 .0890041 1.84 0.068 -.012225 .3392269
cinf_1 | .0892447 .1168206 0.76 0.446 -.1414011 .3198905
cinf_2 | .2397011 .1328552 1.80 0.073 -.0226026 .5020048
cinf_3 | .1603425 .1170766 1.37 0.173 -.0708086 .3914936
cinf_4 | .0188896 .0738927 0.26 0.799 -.127001 .1647803
cgdpgap | .3419624 .1068802 3.20 0.002 .1309426 .5529822
cgdpgap_1 | .2432981 .0974424 2.50 0.014 .050912 .4356842
cgdpgap_2 | .1016662 .1466274 0.69 0.489 -.1878288 .3911612
cgdpgap_3 | .0544501 .0405701 1.34 0.181 -.0256499 .1345501
cgdpgap_4 | -.0874404 .07741 -1.13 0.260 -.2402755 .0653947
_cons | .0395079 .0593546 0.67 0.507 -.0776794 .1566951
------------------------------------------------------------------------------
Increasing the N-W lag to four (as would be suggested by using the integer
part of 4(n/100)^(2/9))...
. newey cffrate cinf cinf_1 cinf_2 cinf_3 cinf_4 cgdpgap cgdpgap_1 cgdpgap_2
cgdpgap_3 cgdpgap_4, lag(4)
------------------------------------------------------------------------------
| Newey-West
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1635009 .0913361 1.79 0.075 -.0168292 .343831
cinf_1 | .0892447 .1218877 0.73 0.465 -.1514052 .3298947
cinf_2 | .2397011 .1463759 1.64 0.103 -.0492973 .5286995
cinf_3 | .1603425 .1191334 1.35 0.180 -.0748695 .3955545
cinf_4 | .0188896 .0723094 0.26 0.794 -.1238749 .1616542
cgdpgap | .3419624 .1126093 3.04 0.003 .1196314 .5642934
cgdpgap_1 | .2432981 .0966048 2.52 0.013 .0525656 .4340306
cgdpgap_2 | .1016662 .1455609 0.70 0.486 -.1857231 .3890555
cgdpgap_3 | .0544501 .040057 1.36 0.176 -.0246368 .133537
cgdpgap_4 | -.0874404 .0724511 -1.21 0.229 -.2304848 .055604
_cons | .0395079 .0584588 0.68 0.500 -.0759106 .1549264
------------------------------------------------------------------------------
... does not change much; so standard errors are not very sensitive to the
choice of q.
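The rule of thumb just mentioned, the integer part of 4(n/100)^(2/9), is quick to evaluate (n = 177 here is an assumed sample size, for illustration only):

```python
# integer-part lag-selection rule of thumb: q = int(4 * (n/100)^(2/9))
n = 177  # assumed sample size, for illustration only
q = int(4 * (n / 100) ** (2 / 9))
print(q)  # -> 4
```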
Applications of Econometrics Ch. 12 Serial Correlation Semester 2, 2023/24 22 / 59
N-W/HAC Standard Errors
Since the fourth lags look individually insignificant, we test them for joint significance:
( 1) cinf_4 = 0
( 2) cgdpgap_4 = 0
F( 2, 166) = 0.74
Prob > F = 0.4771
They are jointly insignificant, so we drop them and re-estimate with three lags of each variable.
------------------------------------------------------------------------------
| Newey-West
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1529545 .090034 1.70 0.091 -.0247817 .3306907
cinf_1 | .0711862 .1073547 0.66 0.508 -.1407427 .2831152
cinf_2 | .2244522 .1375701 1.63 0.105 -.047125 .4960295
cinf_3 | .1428366 .0961857 1.49 0.139 -.0470436 .3327168
cgdpgap | .3387203 .1090373 3.11 0.002 .1234698 .5539708
cgdpgap_1 | .2413696 .0972818 2.48 0.014 .0493255 .4334136
cgdpgap_2 | .0886014 .1476462 0.60 0.549 -.202867 .3800698
cgdpgap_3 | .0502603 .0376323 1.34 0.183 -.0240297 .1245503
_cons | .038079 .0584926 0.65 0.516 -.0773912 .1535493
------------------------------------------------------------------------------
Overall, there seems to be evidence that the FF rate increases, phased in over a
couple of quarters, when inflation increases or when the GDP gap increases
(so actual GDP is above potential GDP).
FD lag length (3) does not match N-W lag length (4)—problematic?
No. FD lag length concerns explanatory variables, N-W lag length concerns
residuals. There is no reason for them to have the exact same length.
If the errors follow
ut = ρut−1 + et ,
where {et } is serially uncorrelated, has a zero mean, and (usually) a constant
variance, what should we set as the null hypothesis of no serial correlation?
H0 : ρ = 0.
Practical issues
Often ρ > 0 when there is serial correlation, but we usually use a two-sided
alternative.
If we could observe {ut }, we would just estimate a simple AR(1) model for ut
and test ρ = 0. [Because E(ut ) = 0, this is one case where we would not need to
include a constant.]
But we do not observe the errors. Instead, we base a test on the OLS
residuals, ût . (Think back to the case of testing for HSK, where we used ût2 in
place of ut2 .)
Remember the difference between ût and ut : the former depends on the
estimators, β̂j .
Provided that the {xtj } are strictly exogenous (Assumption TS.3) then we can
implement the test in three steps.
1. Estimate the equation
yt = β0 + β1 xt1 + . . . + βk xtk + ut , t = 1, 2, . . . , n
by OLS and obtain the residuals, ût .
2. Run the regression of
ût on ût−1 , t = 2, . . . , n
and obtain the slope coefficient ρ̂.
3. Compute the usual t statistic for ρ̂, and carry out the test of H0 : ρ = 0 in the
usual way.
The test tends to work well in large samples.
It is often applied to static and FDL models because strict exogeneity can be
true.
So, for the AR(1) test including regressors, after getting the OLS residuals exactly
as before, run the regression of ût on ût−1 , xt1 , . . . , xtk , t = 2, . . . , n, and use
the usual t statistic on ût−1 .
(For an AR(2) test, include ût−2 as well and test the two lags for joint
significance, using a usual F statistic.)
If we take the “∧” off of the residuals, we can see why we need to include the
regressors: if xtj is correlated with ut , and ut is correlated with ut−1 , then xtj
might be correlated with ut−1 .
In other words, leaving out xtj would bias the estimate of ρ.
This form of the test is more general than the previous form, even though the
previous test is somewhat more popular.
One must use the extended form if one or more of the xtj is a lag of yt , but it is
needed in other situations where strict exogeneity is violated.
------------------------------------------------------------------------------
prcfata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
spdlaw | .0671634 .0204439 3.29 0.001 .02656 .1077668
beltlaw | -.0295827 .023093 -1.28 0.203 -.0754474 .0162819
unem | -.0154371 .0055134 -2.80 0.006 -.0263872 -.004487
feb | -.0001812 .0269749 -0.01 0.995 -.0537557 .0533933
(mar to nov omitted here)
dec | .0089053 .0275565 0.32 0.747 -.0458243 .0636349
t | -.0022355 .0004185 -5.34 0.000 -.0030668 -.0014043
_cons | 1.038472 .0571893 18.16 0.000 .924889 1.152055
------------------------------------------------------------------------------
. predict uh, resid
Estimating ρ
. reg uh uh_1
------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .2816806 .0943712 2.98 0.004 .0945599 .4688012
_cons | .0002994 .0049688 0.06 0.952 -.0095528 .0101516
------------------------------------------------------------------------------
The estimate of ρ is about .282, with tρ̂ = 2.98. What does it suggest?
There is strong evidence of serial correlation, although it is not a huge amount
of serial correlation.
The more general test (including the x's) gives practically the same results:
ρ̂ = .283 and tρ̂ = 2.77.
. reg uh uh_1 spdlaw beltlaw unem feb-dec t
------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .2830111 .1021103 2.77 0.007 .0801511 .4858711
spdlaw | -.0019168 .0199115 -0.10 0.924 -.0414744 .0376408
beltlaw | .0011499 .022418 0.05 0.959 -.0433874 .0456872
unem | -.000307 .0054271 -0.06 0.955 -.011089 .0104749
feb | -.0040023 .0270068 -0.15 0.883 -.057656 .0496513
(mar to nov omitted here)
dec | -.0040947 .0276155 -0.15 0.882 -.0589577 .0507684
t | -4.60e-06 .0004176 -0.01 0.991 -.0008342 .000825
_cons | .006583 .0583639 0.11 0.910 -.109367 .122533
------------------------------------------------------------------------------
Going back to the percent fatalities example, we compute the HAC standard
errors.
. newey prcfata spdlaw beltlaw unem feb-dec t, lag(4)
------------------------------------------------------------------------------
| Newey-West
prcfata | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
spdlaw | .0671634 .0264891 2.54 0.013 .0145538 .1197729
beltlaw | -.0295827 .0330354 -0.90 0.373 -.0951939 .0360284
unem | -.0154371 .0059803 -2.58 0.011 -.0273144 -.0035598
feb | -.0001812 .016465 -0.01 0.991 -.0328821 .0325197
(mar to nov omitted here)
dec | .0089053 .0283141 0.31 0.754 -.0473291 .0651396
t | -.0022355 .0005551 -4.03 0.000 -.0033381 -.001133
_cons | 1.038472 .0591372 17.56 0.000 .9210202 1.155924
------------------------------------------------------------------------------
Next, let’s test for serial correlation in the federal funds rate example.
Recall that we decided to include three lags of cinf (change in inflation) and
cgdpgap (change in GDP gap).
. reg cffrate cinf cinf_1 cinf_2 cinf_3 cgdpgap cgdpgap_1 cgdpgap_2 cgdpgap_3
------------------------------------------------------------------------------
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1529545 .0680061 2.25 0.026 .0187037 .2872053
cinf_1 | .0711862 .0727929 0.98 0.330 -.0725143 .2148868
cinf_2 | .2244522 .0729376 3.08 0.002 .080466 .3684385
cinf_3 | .1428366 .0682135 2.09 0.038 .0081763 .2774969
cgdpgap | .3387203 .0769542 4.40 0.000 .186805 .4906355
cgdpgap_1 | .2413696 .0791325 3.05 0.003 .085154 .3975851
cgdpgap_2 | .0886014 .0761234 1.16 0.246 -.0616739 .2388767
cgdpgap_3 | .0502603 .0314743 1.60 0.112 -.0118731 .1123937
_cons | .038079 .0664227 0.57 0.567 -.093046 .169204
------------------------------------------------------------------------------
. predict uh, resid
. reg uh uh_1
------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .0192665 .0755773 0.25 0.799 -.1298939 .1684269
_cons | .000512 .0649001 0.01 0.994 -.1275757 .1285997
------------------------------------------------------------------------------
. reg uh uh_1 uh_2
------------------------------------------------------------------------------
uh | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
uh_1 | .0243798 .0734643 0.33 0.740 -.1206218 .1693814
uh_2 | -.2571086 .0734841 -3.50 0.001 -.4021494 -.1120678
_cons | -.000234 .0632508 -0.00 0.997 -.1250766 .1246085
------------------------------------------------------------------------------
The standard errors that are robust only to HSK (below) differ noticeably from
those robust to both serial correlation and HSK: the robust option fails to
account for the serial correlation.
. reg cffrate cinf cinf_1 cinf_2 cinf_3 cgdpgap cgdpgap_1 cgdpgap_2 cgdpgap_3,
robust
------------------------------------------------------------------------------
| Robust
cffrate | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cinf | .1529545 .10977 1.39 0.165 -.0637424 .3696515
cinf_1 | .0711862 .1058495 0.67 0.502 -.1377714 .2801439
cinf_2 | .2244522 .1136453 1.98 0.050 .0001049 .4487996
cinf_3 | .1428366 .094696 1.51 0.133 -.0441028 .3297761
cgdpgap | .3387203 .1162425 2.91 0.004 .1092458 .5681947
cgdpgap_1 | .2413696 .0987046 2.45 0.015 .0465168 .4362223
cgdpgap_2 | .0886014 .1482415 0.60 0.551 -.2040422 .381245
cgdpgap_3 | .0502603 .035287 1.42 0.156 -.0193998 .1199203
_cons | .038079 .0617225 0.62 0.538 -.0837674 .1599254
------------------------------------------------------------------------------
HAC standard errors
A fair question
Given the presence of HAC standard errors, why bother testing for serial
correlation in the first place?
There are several reasons why we would want to detect serial correlation
rather than just always using HAC standard errors.
1. Lag choice: HAC errors require choosing a lag, and different choices may
lead to different standard errors. If we can show that there is no serial
correlation, then we do not need to make this choice.
2. Efficiency: Without serial correlation, it is pointless to implement
estimation methods that improve upon OLS only in the presence of serial
correlation. In other words, OLS is efficient if there is no serial correlation.
3. Misspecification: We may have specified a model that should not have serial
correlation. For example, a model that includes a lagged dependent variable
should not have serial correlation if we have specified the error structure
correctly (e.g., an AR(1) model that includes one lag of the dependent
variable). We are testing whether the model is misspecified.
An alternative to OLS
*These procedures require that our regressors are strictly exogenous, a stronger
assumption than needed for computing HAC standard errors.
yt = β0 + β1 xt + ut , with ut = ρut−1 + et . (1)
Quasi-differencing (subtracting ρ times the equation lagged one period) gives,
for t = 2, . . . , n,
ỹt = (1 − ρ)β0 + β1 x̃t + et ,
where
ỹt = yt − ρyt−1 , x̃t = xt − ρxt−1 .
We call ỹt and x̃t quasi-differenced data, and the transformed equation satisfies
all Gauss-Markov assumptions.
We need to add the first period of data for these GLS estimators to have the
smallest variance.
For t = 1, we have
ỹ1 = √(1 − ρ²) β0 + β1 x̃1 + ũ1 ,
where
ỹ1 = √(1 − ρ²) y1 , x̃1 = √(1 − ρ²) x1 , and ũ1 = √(1 − ρ²) u1 .
We can add more explanatory variables; we quasi-difference them using the
same formulas.
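The quasi-differencing transformation, including the √(1 − ρ²) scaling of the first observation, can be written as a short helper. A sketch in Python (the function name is my own, not from the slides):

```python
import numpy as np

def quasi_difference(z, rho):
    """Quasi-difference a series: z~_t = z_t - rho * z_{t-1} for t >= 2,
    and z~_1 = sqrt(1 - rho^2) * z_1 for the first observation."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    out[1:] = z[1:] - rho * z[:-1]
    out[0] = np.sqrt(1.0 - rho**2) * z[0]
    return out

y = np.array([1.0, 2.0, 3.0, 4.0])
print(quasi_difference(y, 0.5))  # [0.8660..., 1.5, 2.0, 2.5]
```

Applying the same helper to each regressor (and to a generated constant) reproduces the variables the Stata loop below creates.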
In the percent fatalities example, we first run the OLS regression from before.
Next, we estimate an AR(1) model on the estimated errors.
. predict uh, resid
. reg uh uh_1, noconstant
------------------------------------------------------------------------------
uh | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
uh_1 | .2816134 .0939201 3.00 0.003 .0954077 .467819
------------------------------------------------------------------------------
. foreach var in prcfata spdlaw beltlaw unem feb mar apr may jun jul aug sep oct nov dec t const {
  2. gen `var'_qd = `var' - rho * L.`var'
  3. replace `var'_qd = sqrt(1 - rho^2) * `var' in 1
  4. }
(1 missing value generated)
(1 real change made)
(output repeated for each of the remaining variables)
We never had to generate a constant before. Why here?
Because the constant must be quasi-differenced too: it is scaled by (1 − ρ) for
t ≥ 2 and by √(1 − ρ²) for t = 1, so it is no longer constant across observations.
The first three columns are all based on OLS. Only the standard errors differ.
The fourth column is based on FGLS, so the point estimates differ as well.