
Applications of Econometrics

Tutorial Sheet 4

Unit Roots
1. Explain how we can test whether there is a unit root in {yt }.

Solution: Testing for unit roots begins by writing down an AR(1) model for yt

yt = α + ρyt−1 + et , t = 1, 2, ...,

We think of et as a zero mean process not dependent on past values of yt :

E (et |yt−1 , yt−2 , . . . , y0 ) = 0

{yt } has a unit root if and only if ρ = 1. Because there are multiple ways for the above model
to be a stable AR(1) process (ρ = 0.2, ρ = 0.4, ρ = 0.57, ...), and only one way for it to be an I(1)
process (ρ = 1), it makes sense to think about the null hypothesis as H0 : ρ = 1 and the alternative
hypothesis as H1 : ρ < 1. If the equation is transformed by subtracting yt−1 from both sides,

∆yt = α + θyt−1 + et ,

then this test can be carried out on θ instead of ρ where θ = ρ − 1. In this model the null hypothesis
is now θ = 0 and the alternative hypothesis is θ < 0. Because under the null hypothesis yt is I(1) the
series is not weakly dependent and the test statistics will not be standard normally distributed. This
means the normal critical values will not apply. We need to use critical values from a distribution
known as a Dickey-Fuller distribution (this test is also known as a Dickey-Fuller (DF) test). These
values can be found in the textbook.

If the model contained a time trend we would have to amend the test. The critical values for
the model with a time trend are larger in magnitude (more negative) than the critical values for the
test without a time trend. This is because time series which are trend stationary (they have a linear
trend but are I(0) around the trend) can appear as if they have a unit root. Testing with a linear
trend would be based on the model

∆yt = α + δt + θyt−1 + et ,

The null hypothesis is still H0 : θ = 0, only the critical values change.
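To see the mechanics, here is a small self-contained simulation (not part of the tutorial): it generates a stationary AR(1) series and a pure random walk, then computes the DF t-statistic by OLS of ∆yt on a constant and yt−1. The sample size, seed and the choice ρ = 0.5 are arbitrary illustration values.

```python
import math
import random

def df_tstat(y):
    """OLS of dy_t = a + theta*y_{t-1} + e_t; returns the t-statistic on theta."""
    x = y[:-1]
    dy = [y[t + 1] - y[t] for t in range(len(y) - 1)]
    n = len(dy)
    xbar, dbar = sum(x) / n, sum(dy) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    theta = sum((xi - xbar) * (di - dbar) for xi, di in zip(x, dy)) / sxx
    alpha = dbar - theta * xbar
    resid = [di - alpha - theta * xi for xi, di in zip(x, dy)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    return theta / math.sqrt(sigma2 / sxx)

random.seed(0)
n = 500
e = [random.gauss(0, 1) for _ in range(n)]

y_stat = [0.0]                      # stationary AR(1): rho = 0.5, so theta = -0.5
for t in range(1, n):
    y_stat.append(0.5 * y_stat[-1] + e[t])

y_ur = [0.0]                        # unit root: rho = 1, so theta = 0
for t in range(1, n):
    y_ur.append(y_ur[-1] + e[t])

print(round(df_tstat(y_stat), 2))   # strongly negative: evidence against a unit root
print(round(df_tstat(y_ur), 2))     # much closer to zero: cannot reject a unit root
```

In the stationary case the t-statistic is far below any DF critical value, while for the random walk it typically is not; comparing against the DF (not normal) critical values is what makes this a valid test.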


Figure 1: Examples of AR(1) processes for different values of ρ

[Four panels (A–D), each plotting a simulated AR(1) series for a different combination of α and ρ; plots not reproduced here.]

2. Two variables often used in economic analysis are gross domestic product (gdp) and crude birth rates
(cbr). For this question we use FRED data on annual U.S. GDP and crude birth rates (sources:
https://fred.stlouisfed.org/series/FYGDP and https://fred.stlouisfed.org/series/SPDYNCBRTINUSA).
Table 1 shows regression results based on these data from 1960 to 2021.

Table 1: Fertility and GDP

                        (1)             (2)             (3)
Dependent variable:     ∆ log(gdpt )    ∆ log(cbrt )    log(cbrt )
year                    -0.0019         -0.0007
                        (0.0010)*       (0.0004)*
log(gdpt−1 )            0.0147
                        (0.0152)
log(gdpt )                                              -0.1286
                                                        (0.0090)***
log(cbrt−1 )                            -0.1076
                                        (0.0429)**
Constant                3.6857          1.6722          3.8028
                        (1.8677)*       (0.9151)*       (0.0765)***
Observations            61              61              62
R2                      0.348           0.116           0.774
Standard errors in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01

(a) Comment on whether there is evidence for a unit root in log gdp and/or in log cbr based on the
results in Table 1.

Solution: The Dickey-Fuller test for a unit root in log gdp (and analogously log cbr) if we
allow for a linear trend is based on the model
∆ log (gdpt ) = α + δt + θ log (gdpt−1 ) + et
The null hypothesis is
H0 : θ=0
If the null hypothesis is true, then there is a unit root in log gdp. Column (1) in Table 1 shows
the results of estimating the parameters of this model. Note that it does not matter whether
we include t (starting in t = 0) or year (starting in year = 1960) to control for a linear trend.
Only the constant will be different.
From the table we see that θ̂ = 0.0147. We can calculate the t-statistic:
t = 0.0147/0.0152 ≈ 0.9671
Now we compare this against the critical values (Table 18.3 in the book). The 5% critical value
for a DF test with trend is -3.41. Clearly 0.9671 > −3.41 so we cannot reject H0 at the 5%
level. This means we cannot reject that there is a unit root in log gdp. We do not know whether
there is one, but there might be, so we have to be careful.
We can do the same test for log cbr:
t = −0.1076/0.0429 ≈ −2.5082
Clearly −2.5082 > −3.41 so we cannot reject H0 at the 5% level. This means we cannot reject
that there is a unit root in log cbr. We do not know whether there is one, but there might be,
so we have to be careful.
Overall we could say that there is some evidence that is consistent with unit roots in both
variables, but it is not conclusive.
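The two t-statistics above can be reproduced directly from the Table 1 numbers (the critical value −3.41 is the one quoted in the solution):

```python
# t-statistics for the DF tests implied by Table 1, columns (1) and (2)
t_gdp = 0.0147 / 0.0152        # log(gdp): theta-hat / se
t_cbr = -0.1076 / 0.0429       # log(cbr): theta-hat / se
cv_5pct = -3.41                # 5% critical value, DF test with linear trend

print(round(t_gdp, 4))         # 0.9671
print(round(t_cbr, 4))         # -2.5082
print(t_gdp < cv_5pct, t_cbr < cv_5pct)  # False False: cannot reject H0 in either case
```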

(b) Based on the regression in column (3) can we say anything on the effect of log gdp on log cbr? In
other words, is there evidence that higher gdp leads to lower fertility?

Solution: Because both log gdp and log cbr might have a unit root we cannot just regress one
on the other. The problem is spurious regression. There is a significant negative relationship
between the two variables, but this happens frequently with unit root variables even if there is
no relationship at all.
To see why, consider
log (cbrt ) = β0 + β1 log (gdpt ) + ut
We are trying to test H0 : β1 = 0. If this is true (i.e. under the null hypothesis) we have

log (cbrt ) = β0 + ut

If log(cbrt ) follows a random walk, then this equation can only hold if ut also follows a random
walk. Then TS.3’, TS.4’ and TS.5’ do not hold (under the null). Therefore, we cannot do a
t-test (this means the stars in the table are completely wrong).
The only way to save this regression is if the two variables are cointegrated, which we will look
at below.

Cointegration
3. What does it mean for two variables to be cointegrated?

Solution: If yt and xt are two I(1) processes then in general yt − βxt will also be an I(1) process
for any number β. Nevertheless, it is possible that for some β ≠ 0, yt − βxt is an I(0) process. If
such a β exists, we say that y and x are cointegrated, and we call β the cointegration parameter.

Why is it useful to know this? Well, if two variables are cointegrated, then they have a long
run relationship. A good example of this is the interest rate on U.S. treasury bills. Treasury bills
are securities which are sold at maturity lengths of 1-month, 3-months, 6-months, and 12-months
with a promise to pay a face value amount (typically $100) at maturity. They are sold at less than
face value to create a positive return to those who hold them. For example, if you pay $90 for
a 3-month treasury bill, the interest rate is (100 − 90)/90 ≈ 0.11, or about 11%. It has been shown that the
interest rate on 3-month treasury bills is likely I(1). The same also goes for 6-month treasury bills.
Because of the longer maturity on 6-month treasury bills they pay a higher return. This difference
in returns between a 3-month treasury bill and 6-month treasury bill is known as the yield spread.
However, this difference would not be expected to grow or shrink over time due to arbitrage. While
the difference between them will change over time it is more likely that it fluctuates around a mean.
Econometrically, this can be represented as

r6t = r3t + µ + et

where et is a zero mean, I(0) process. At any time period, there can be deviations from equilibrium,
but they will be temporary: there are economic forces that drive r6 and r3 back toward the equi-
librium relationship.

4. How can we test for cointegration?

Solution: Before testing for cointegration you should have some idea that both series are I(1). This
can be done by examining the series’ autocorrelation coefficients and performing DF tests on them.
To test for cointegration we perform a DF test on the residuals of the following regression,

yt = α + βxt + ut ,

since ut = yt − α − βxt is supposed to be I(0) for yt and xt to be cointegrated. The null hypothesis,
that there is a unit root, means that yt and xt are not cointegrated. If we reject the null hypothesis
then we find that yt and xt are cointegrated. Because we have to take into account the estimation
of β in this method this test is called an Engle-Granger test and uses different critical values from
the traditional DF test.
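The two steps can be sketched on simulated data (everything below, including β = 0.5 and the sample size, is made up for illustration; remember that the second-step statistic must be compared against Engle-Granger rather than standard DF critical values):

```python
import math
import random

def ols(y, x):
    """Simple OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b

def df_tstat(u):
    """DF regression du_t = a + theta*u_{t-1} + e_t; t-statistic on theta."""
    x = u[:-1]
    du = [u[t + 1] - u[t] for t in range(len(u) - 1)]
    a, th = ols(du, x)
    n = len(du)
    resid = [d - a - th * xi for xi, d in zip(x, du)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return th / math.sqrt(sigma2 / sxx)

random.seed(1)
n = 500
x = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + random.gauss(0, 1))                 # x is I(1)
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]    # cointegrated, beta = 0.5

a, b = ols(y, x)                                  # step 1: static regression
uhat = [yi - a - b * xi for yi, xi in zip(y, x)]  # residuals
t_eg = df_tstat(uhat)                             # step 2: DF regression on residuals
print(round(t_eg, 2))  # strongly negative: reject 'no cointegration'
```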

5. Returning to the fertility-gdp example from above, let’s examine the possibility that gdp and cbr are
cointegrated. After running the regression shown in Table 1 column (3) we run the following Stata code:
* residuals from the regression in Table 1 column (3)
predict uhat, residual
* Dickey-Fuller regression on the residuals: D.uhat on L.uhat
regress D.uhat L.uhat

The output is
Source | SS df MS Number of obs = 61
-------------+---------------------------------- F(1, 59) = 3.27
Model | .002398465 1 .002398465 Prob > F = 0.0755
Residual | .043214323 59 .000732446 R-squared = 0.0526
-------------+---------------------------------- Adj R-squared = 0.0365
Total | .045612788 60 .000760213 Root MSE = .02706

------------------------------------------------------------------------------
D.uhat | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
uhat |
L1. | -.0779884 .0430974 -1.81 0.075 -.1642261 .0082493
|
_cons | -.0045375 .0034661 -1.31 0.196 -.0114732 .0023982
------------------------------------------------------------------------------

Is there evidence that the two series are cointegrated?

Solution: The cointegration test here first estimates the parameters of

log (cbrt ) = β0 + β1 log (gdpt ) + ut

Then we predict the residuals ût and conduct a DF test on them by running

∆ût = α + θût−1 + et

The results of this regression are shown in the Stata output. We get θ̂ = −0.0780 with a standard
error of 0.0431. We could now calculate a t-statistic but Stata has already done this and gives us
t = −1.81. We now compare this against Engle-Granger critical values (Table 18.4 in the book).
The 5% critical value is −3.34. Clearly −1.81 > −3.34, so we cannot reject H0 . The null
hypothesis is 'no cointegration'; since we cannot reject it, it may well be true that the two
series are not cointegrated, and the regression in column (3) then looks likely to be spurious. The perhaps
surprising thing about the Engle-Granger test is that it works even though under the null hypothesis
the Gauss-Markov assumptions do not hold and we thus have a spurious regression problem.

6. When yt and xt are cointegrated in the following equation

yt = α + βxt + ut ,

OLS produces consistent estimates of α̂ and β̂. However, because xt is I(1), we cannot use the traditional
inference techniques to test hypotheses about β as the estimator won’t have a normal distribution, even
asymptotically. However, if we are willing to make the assumption that xt is strictly exogenous, then the
estimators will be normally distributed provided that the errors are homoskedastic, serially uncorrelated,
and normally distributed, conditional on the explanatory variables. Explain how this model can be
transformed so that xt is strictly exogenous and we can conduct our usual tests.

Solution: The solution to this is the leads and lags estimator of β. We construct a new set of errors
which are, at least approximately, strictly exogenous. However, because xt is I(1), we need these
new error terms to be strictly exogenous with regard to ∆xt (remember that xt being I(1) means
it takes a form similar to xt = xt−1 + vt ). This can be achieved by writing ut as a function of the
∆xs for all s close to t. For example,

ut = η + φ0 ∆xt + φ1 ∆xt−1 + φ2 ∆xt−2 + γ1 ∆xt+1 + γ2 ∆xt+2 + et

where, by construction, et is uncorrelated with each ∆xs appearing in the equation. The hope is
that et is uncorrelated with further lags and leads of ∆xs . We know that, as |s − t| gets large, the
correlation between et and ∆xs approaches zero, because these are I(0) processes. Now, if we plug
this into the original equation we get

yt = (α + η) + βxt + φ0 ∆xt + φ1 ∆xt−1 + φ2 ∆xt−2 + γ1 ∆xt+1 + γ2 ∆xt+2 + et

This equation looks a bit strange because future ∆xs appear with both current and lagged ∆xt .
The key is that the coefficient on xt is still β, and, by construction, xt is now strictly exogenous in
this equation.
As an example, and a way to think about the leads and lags model conceptually, imagine we are
interested in examining the following model:

M urderRatet = α + βP oliceP Ct + ut

where M urderRatet is the murder rate in a given city at a time t, and P oliceP Ct represents the
number of police per capita at time t. Imagine there is a shock in the murder rate at time t so
that the murder rate increases by more than expected, given the PolicePC at time t. This shock
will be captured by ut . It is quite likely that this shock may result in a change in the number of
police per capita at time t + 1. Therefore, an increase in the error term at time t is correlated with
∆P oliceP Ct+1 . However, it is unlikely that this shock will feed into later changes in police per
capita. In other words, as we move further away from t, we expect the correlation to tend to zero
if both ∆P oliceP Ct and et are I(0). Therefore, we do not expect an increase in et to be correlated
with, say, ∆P oliceP Ct+5 . This explains why we include ∆xs for s close to t. The inclusion of
these terms accounts for possible ways in which the assumption that xt is strictly exogenous may
be violated.
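The construction above is just data-wrangling: for each usable t we pair yt with the regressors [1, xt , ∆xt , . . . , ∆xt−p , ∆xt+1 , . . . , ∆xt+q ]. The helper name and the toy series below are made up for illustration.

```python
def leads_lags_rows(y, x, p=2, q=2):
    """Return (response, rows): each row is
    [1, x_t, dx_t, dx_{t-1}, ..., dx_{t-p}, dx_{t+1}, ..., dx_{t+q}]."""
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]  # dx[t-1] holds delta-x at t
    rows, resp = [], []
    for t in range(p + 1, len(x) - q):                # t with full lags and leads
        row = [1.0, x[t]]
        row += [dx[t - 1 - j] for j in range(p + 1)]      # dx_t, ..., dx_{t-p}
        row += [dx[t - 1 + j] for j in range(1, q + 1)]   # dx_{t+1}, ..., dx_{t+q}
        rows.append(row)
        resp.append(y[t])
    return resp, rows

# made-up toy series, 10 observations
x = [1.0, 1.5, 1.2, 2.0, 2.4, 2.1, 2.9, 3.3, 3.0, 3.8]
y = [0.5 + 2.0 * xi for xi in x]

resp, rows = leads_lags_rows(y, x)
print(len(rows), len(rows[0]))  # 5 7
```

With p = q = 2 we lose three observations at the start and two at the end, leaving 5 usable rows of 7 regressors each; OLS on this design gives the leads-and-lags estimator of β as the coefficient on xt.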

7. If yt and xt are two I(1) processes, and they are NOT cointegrated, then both of the variables can be
used in a regression together when they are used in their difference form. For example,

∆yt = α0 + α1 ∆yt−1 + γ0 ∆xt + γ1 ∆xt−1 + ut

However, if we knew that yt and xt were cointegrated, how could this equation be improved?

Solution: If yt and xt are cointegrated with parameter β, then we have additional I(0) variables
that we can include in the above equation. Let st = yt − βxt , so that st is I(0), and assume for the
sake of simplicity that st has zero mean. This is known as the error correction term. Now, we can

include lags of st in the equation. In the simplest case, this gives

∆yt = α0 + α1 ∆yt−1 + γ0 ∆xt + γ1 ∆xt−1 + δst−1 + ut

or, substituting st−1 = yt−1 − βxt−1 ,

∆yt = α0 + α1 ∆yt−1 + γ0 ∆xt + γ1 ∆xt−1 + δ(yt−1 − βxt−1 ) + ut ,

where δ < 0. If yt−1 > βxt−1 , then y in the previous period has overshot the equilibrium; because
δ < 0, the error correction term works to push y back toward the equilibrium. Similarly, if yt−1 <
βxt−1 , the error correction term induces a positive change in y back toward the equilibrium. This
is an example of an error correction model.
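A tiny deterministic simulation (all parameter values made up) shows why δ < 0 pushes y back toward equilibrium: each period the gap st shrinks by the factor 1 + δ.

```python
# Error-correction dynamics in isolation: delta-y_t = delta * s_{t-1},
# with s_t = y_t - beta * x_t. beta, delta and the starting values are
# arbitrary; x is held fixed and noise, delta-x and lag terms are dropped.
beta, delta = 0.5, -0.4
x, y = 10.0, 9.0          # equilibrium level is beta*x = 5, so y starts 4 above it
gaps = []
for _ in range(10):
    s = y - beta * x      # current deviation from equilibrium
    gaps.append(s)
    y = y + delta * s     # error correction: delta < 0 shrinks a positive gap
print([round(g, 3) for g in gaps])  # 4.0, 2.4, 1.44, ...: each step multiplies by 0.6
```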

Application
8. Figure 2 below shows the evolution of newly confirmed Covid-19 cases (in thousands) on a daily basis for
the UK during the period 11/1/2020–31/12/2021. A researcher interested in predicting future Covid-19
cases based on past values estimates the following model:

Casest = β0 + β1 Casest−1 + ut (1)

Figure 2: Newly confirmed Covid-19 cases in UK (in thousands)

[Daily series of newly confirmed COVID-19 cases in the UK (in 1,000), plotted from 01jan2020 to 01jan2022 on a vertical scale of 0 to 200; plot not reproduced here.]

(a) Figure 2 plots newly confirmed Covid-19 cases per day. Discuss whether Casest is likely to be
covariance stationary or not based on this figure.

Solution: The three conditions for covariance stationarity are:


(i) E[yt ] = µ : Mean is constant for all t
(ii) V ar(yt ) = σ 2 : Variance is constant for all t
(iii) Cov(yt , yt+h ) = f (h): Covariance for given h is constant for all t

We might suspect a random walk based on the figure, so (i) may not hold. From the figure it
also seems the variance increases after July 2021, so (ii) might not hold. It seems fairly
evident that (iii) does not hold, since the covariance appears to depend on where in the
sample we start, not only on the gap h.

Table 2: Regression Outputs

                        (1)         (2)         (3)
Dependent variable:     Casest      ∆Casest     ∆Casest
Casest−1                1.027       0.025
                        (0.006)     (0.007)
∆Casest−1                           0.041       0.090
                                    (0.039)     (0.038)
Constant                -0.219      -0.189      0.246
                        (0.180)     (0.182)     (0.142)
Observations            708         707         707
R2                      0.975       0.027       0.008
Standard errors in parentheses

(b) Based on the output in Table 2 explain whether we might be worried that Casest has a unit root.

Solution: Realisation 1: From column (1) we see that β̂1 = 1.027, which is close to unity so
that indicates there might be a unit root
Realisation 2: Column (2) shows results from an augmented Dickey-Fuller test for a unit root in
Casest . The estimated coefficient on Casest−1 is θ̂ = 0.025.
The null hypothesis for the augmented Dickey-Fuller test is H0 : θ = 0 against the alternative
H1 : θ < 0. In this case we can calculate the t-statistic as
t = 0.025/0.007 ≈ 3.57
This is positive, so it is way above any of the critical values of the DF test. This means we
cannot reject H0 so there could be a unit root in Casest . We wouldn’t even have to do a test
here because both the coefficient and the standard error are positive, so there’s no way the t
could ever be negative (and below the critical values).

(c) Assume that the first difference ∆Casest is covariance stationary. Use the estimates from Table 2
column (3) and the observations in Table 3 to predict the number of new Covid-19 cases on the 1st
of January 2022.

Table 3: Observations of New Covid-19 Cases in the UK

Date New Covid-19 Cases (’000)


29th Dec, 2021 183.04
30th Dec, 2021 190.38
31st Dec, 2021 190.97

Solution: Firstly we need to take first differences as the input to our regression equation.

Date New Covid-19 Cases (’000) First difference
29th Dec, 2021 183.04
30th Dec, 2021 190.38 7.34
31st Dec, 2021 190.97 0.59
Next we input the change in cases into our regression equation:

∆Casest = β0 + β1 ∆Casest−1 + ut (2)


for the period t + 1, which gives 0.246 + 0.090 × 0.59 ≈ 0.299. We still need to convert this to the actual number of
cases: 190.97+0.299=191.27.
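The arithmetic can be checked directly using only the column (3) estimates and the Table 3 observations:

```python
# Forecast of delta-Cases for 1 Jan 2022 from the Table 2 column (3) estimates
b0, b1 = 0.246, 0.090            # constant and coefficient on lagged delta-Cases
d_cases = 190.97 - 190.38        # delta-Cases on 31 Dec 2021 = 0.59 (thousands)
d_forecast = b0 + b1 * d_cases
print(round(d_forecast, 3))           # 0.299 (thousands)
print(round(190.97 + d_forecast, 2))  # 191.27: forecast level for 1 Jan 2022
```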

(d) On the 1st of January there were 163,677 new Covid-19 cases. Calculate the forecast error and
explain how we would calculate the 95% forecast interval.

Solution: The equation over-predicted the number of cases by 191,270 − 163,677 = 27,593.
This is not that surprising given that the regression equation only explains 0.8% of the temporal
variation in cases. One way to calculate the 95% forecast interval is to run the following
regression:
∆Casest = β0 + β1 (∆Casest−1 − 0.59) + ut (3)

The intercept (β̂0 ) will give us the point estimate fˆn for the period t + 1 and its standard error
se(fˆn ). The standard error of the forecast error is se(êt+1 ) = [(se(fˆn ))2 + σ̂ 2 ]1/2 , where σ̂ is
the RMSE. We can't calculate this here from the information in the table, but conceptually,
the 95% forecast interval is given by 0.299 ± 1.96 × se(êt+1 ).
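As a purely hypothetical numerical sketch (the table reports neither se(fˆn ) nor the RMSE, so both values below are invented placeholders; only the formula matters):

```python
import math

point = 0.299   # point forecast of delta-Cases from part (c), in thousands
se_f = 0.14     # HYPOTHETICAL standard error of the intercept (not in the table)
sigma = 3.7     # HYPOTHETICAL regression RMSE (not in the table)

se_e = math.sqrt(se_f ** 2 + sigma ** 2)   # se of the forecast error
lo, hi = point - 1.96 * se_e, point + 1.96 * se_e
print(round(lo, 2), round(hi, 2))          # a wide interval around 0.299
```

Note that the RMSE dominates the interval width here: the uncertainty about the point estimate itself is small relative to the day-to-day noise in the series.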
