
Introduction to Econometrics,

5th edition

Chapter 12: Autocorrelation

FALL 2020
Introduction to Econometrics

Autocorrelation
AUTOCORRELATION

Assumption C.5 states that the values of the disturbance term in the
observations in the sample are generated independently of each other.
1
AUTOCORRELATION

In the graph above, it is clear that this assumption is not valid. Positive values
tend to be followed by positive ones, and negative values by negative ones.
Successive values tend to have the same sign. This is described as positive
autocorrelation. 2
AUTOCORRELATION

In this graph, positive values tend to be followed by negative ones, and


negative values by positive ones. This is an example of negative
autocorrelation. 3
AUTOCORRELATION

Yt = β1 + β2Xt + ut

First-order autoregressive autocorrelation: AR(1)

ut = ρut–1 + εt

A particularly common type of autocorrelation, at least as an approximation,


is first-order autoregressive autocorrelation, usually denoted AR(1)
autocorrelation. 8
AUTOCORRELATION

Yt = β1 + β2Xt + ut

First-order autoregressive autocorrelation: AR(1)

ut = ρut–1 + εt

It is autoregressive, because ut depends on lagged values of itself, and first-order, because it depends only on its previous value. ut also depends on εt, an injection of fresh randomness at time t, often described as the innovation at time t.
AUTOCORRELATION

Yt = β1 + β2Xt + ut

First-order autoregressive autocorrelation: AR(1)

ut = ρut–1 + εt

Fifth-order autoregressive autocorrelation: AR(5)

ut = ρ1ut–1 + ρ2ut–2 + ρ3ut–3 + ρ4ut–4 + ρ5ut–5 + εt

Here is a more complex example of autoregressive autocorrelation. It is


described as fifth-order, and so denoted AR(5), because it depends on
lagged values of ut up to the fifth lag. 8
AUTOCORRELATION

Yt = β1 + β2Xt + ut

First-order autoregressive autocorrelation: AR(1)

ut = ρut–1 + εt

Fifth-order autoregressive autocorrelation: AR(5)

ut = ρ1ut–1 + ρ2ut–2 + ρ3ut–3 + ρ4ut–4 + ρ5ut–5 + εt

Third-order moving average autocorrelation: MA(3)

ut = λ0εt + λ1εt–1 + λ2εt–2 + λ3εt–3

The other main type of autocorrelation is moving average autocorrelation,


where the disturbance term is a linear combination of the current innovation
and a finite number of previous ones. 8
AUTOCORRELATION

Yt = β1 + β2Xt + ut

First-order autoregressive autocorrelation: AR(1)

ut = ρut–1 + εt

Fifth-order autoregressive autocorrelation: AR(5)

ut = ρ1ut–1 + ρ2ut–2 + ρ3ut–3 + ρ4ut–4 + ρ5ut–5 + εt

Third-order moving average autocorrelation: MA(3)

ut = λ0εt + λ1εt–1 + λ2εt–2 + λ3εt–3

This example is described as third-order moving average autocorrelation,


denoted MA(3), because it depends on the three previous innovations as
well as the current one. 8
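To make these processes concrete, here is a minimal numpy sketch that generates AR(1) and MA(3) disturbances of the kind just defined. The sample size, the value of ρ, and the λ weights are illustrative assumptions, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed for reproducibility
T = 50                           # sample size, matching the simulations that follow
eps = rng.normal(0, 1, T)        # innovations: independent N(0, 1) draws

# AR(1): u_t = rho * u_{t-1} + eps_t
rho = 0.7                        # illustrative value
u_ar1 = np.zeros(T)
for t in range(1, T):
    u_ar1[t] = rho * u_ar1[t - 1] + eps[t]

# MA(3): u_t = lam0*eps_t + lam1*eps_{t-1} + lam2*eps_{t-2} + lam3*eps_{t-3}
lam = [1.0, 0.6, 0.3, 0.1]       # illustrative weights
u_ma3 = np.zeros(T)
for t in range(3, T):
    u_ma3[t] = (lam[0] * eps[t] + lam[1] * eps[t - 1]
                + lam[2] * eps[t - 2] + lam[3] * eps[t - 3])
```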
AUTOCORRELATION

ut = ut −1 +  t
3

0 1

-1

-2

-3

We will now look at examples of the patterns that are generated when the
disturbance term is subject to AR(1) autocorrelation. The object is to
provide some bench-mark images to help you assess plots of residuals in
time series regressions. 9
AUTOCORRELATION

ut = ut −1 +  t
3

0 1

-1

-2

-3

We will use 50 independent values of ε, taken from a normal distribution with mean 0, and generate series for u using different values of ρ.
10
AUTOCORRELATION

ut = 0.0ut–1 + εt

We have started with ρ equal to 0, so there is no autocorrelation. We will increase ρ progressively in steps of 0.1.
11
AUTOCORRELATION

ut = 0.1ut–1 + εt

(ρ = 0.1)

12
AUTOCORRELATION

ut = 0.2ut–1 + εt

(ρ = 0.2)

13
AUTOCORRELATION

ut = 0.3ut–1 + εt

With  equal to 0.3, a pattern of positive autocorrelation is beginning to be


apparent.
14
AUTOCORRELATION

ut = 0.4ut–1 + εt

(ρ = 0.4)

15
AUTOCORRELATION

ut = 0.5ut–1 + εt

(ρ = 0.5)

16
AUTOCORRELATION

ut = 0.6ut–1 + εt

With  equal to 0.6, it is obvious that u is subject to positive autocorrelation.


Positive values tend to be followed by positive ones and negative values by
negative ones. 17
AUTOCORRELATION

ut = 0.7ut–1 + εt

(ρ = 0.7)

18
AUTOCORRELATION

ut = 0.8ut–1 + εt

(ρ = 0.8)

19
AUTOCORRELATION

ut = 0.9ut–1 + εt

With  equal to 0.9, the sequences of values with the same sign have
become long and the tendency to return to 0 has become weak.
20
AUTOCORRELATION

ut = 0.95ut–1 + εt

The process is now approaching what is known as a random walk, where ρ is equal to 1 and the process becomes nonstationary. The terms ‘random walk’ and ‘nonstationary’ will be defined in the next chapter. For the time being we will assume |ρ| < 1.
AUTOCORRELATION

ut = 0.0ut–1 + εt

Next we will look at negative autocorrelation, starting with the same set of 50 independently distributed values of εt.
22
AUTOCORRELATION

ut = −0.3ut–1 + εt

We will take larger steps this time.

23
AUTOCORRELATION

ut = −0.6ut–1 + εt

With  equal to –0.6, you can see that positive values tend to be followed by
negative ones, and vice versa, more frequently than you would expect as a
matter of chance. 24
AUTOCORRELATION

ut = −0.9ut–1 + εt

Now the pattern of negative autocorrelation is very obvious.

25
AUTOCORRELATION

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
R-squared 0.998583 Mean dependent var 6.359334
Adjusted R-squared 0.998515 S.D. dependent var 0.437527
S.E. of regression 0.016859 Akaike info criter-5.263574
Sum squared resid 0.011937 Schwarz criterion -5.143130
Log likelihood 121.4304 F-statistic 14797.05
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================

Next, we will look at a plot of the residuals of the logarithmic regression of


expenditure on housing services on income and relative price.
26
AUTOCORRELATION

[Figure: residuals from the logarithmic housing regression, 1959–2003]

This is the plot of the residuals of course, not the disturbance term. But if
the disturbance term is subject to autocorrelation, then the residuals will be
subject to a similar pattern of autocorrelation. 27
AUTOCORRELATION

[Figure: residuals from the logarithmic housing regression, 1959–2003]

You can see that there is strong evidence of positive autocorrelation. Comparing the graph with the randomly generated patterns, one would say that ρ is about 0.7 or 0.8.
Introduction to Econometrics

CONSEQUENCES OF
AUTOCORRELATION
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

The consequences of autocorrelation for OLS are similar to those of


heteroscedasticity. In general, the regression coefficients remain unbiased,
but OLS is inefficient because one can find an alternative regression
technique that yields estimators with smaller variances. 1
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

The other main consequence is that autocorrelation causes the standard


errors to be estimated wrongly, often being biased downwards. Finally,
although in general OLS estimates are unbiased, there is an important
special case where they are biased. 2
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

Unbiasedness is easily demonstrated, provided that Assumption C.7 is satisfied. In the case of the simple regression model shown, we have seen that the OLS estimator of β2 can be decomposed as in the second line, where the at are as defined in the third line.
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

E(β̂2) = β2 + E(Σ at ut) = β2 + Σ E(at ut) = β2 + Σ E(at) E(ut)

Now, if Assumption C.7 is satisfied, at and ut are distributed independently and we can write the expectation of β̂2 as shown. At no point have we made any assumption concerning whether ut is, or is not, subject to autocorrelation.
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

E(β̂2) = β2 + E(Σ at ut) = β2 + Σ E(at ut) = β2 + Σ E(at) E(ut)

All that we now require is E(ut) = 0 and this is easily demonstrated.

5
CONSEQUENCES OF AUTOCORRELATION

ut = ut −1 +  t

ut −1 = ut − 2 +  t −1

ut =  2 ut − 2 +  t −1 +  t

For example, in the case of AR(1) autocorrelation, lagging the process one
time period, we have the second line. Substituting for ut–1 in the first
equation, we obtain the third. 6
CONSEQUENCES OF AUTOCORRELATION

ut = ut −1 +  t

ut −1 = ut − 2 +  t −1

ut =  2 ut − 2 +  t −1 +  t

ut =  t +  t −1 +  2 t − 2 + ...

E (ut ) = E ( t ) + E ( t −1 ) +  2 E ( t − 2 ) + ... = 0

Continuing to lag and substitute, we can express ut in terms of current and lagged values of εt with diminishing weights. Since, by definition, the expected value of each innovation is zero, the expected value of ut is zero.
CONSEQUENCES OF AUTOCORRELATION

ut = 0 t + 1 t −1 + 2 t −2 + 3 t −3

E (ut ) = 0 E ( t ) + 1 E ( t −1 ) + 2 E ( t − 2 ) + 3 E ( t − 3 ) = 0

For higher order AR autocorrelation, the demonstration is essentially similar.


For moving average autocorrelation, the result is immediate.
8
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

E(β̂2) = β2 + E(Σ at ut) = β2 + Σ E(at ut) = β2 + Σ E(at) E(ut)

For multiple regression analysis, the demonstration is the same, except that
at is replaced by at*, where at* depends on all of the observations on all of the
explanatory variables in the model. 9
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

E(β̂2) = β2 + E(Σ at ut) = β2 + Σ E(at ut) = β2 + Σ E(at) E(ut)

We will not pursue analytically the other consequences of autocorrelation.


An important one is that the Gauss–Markov theorem, which guarantees the
efficiency of the OLS estimators, does not apply, since its proof requires no
autocorrelation. 10
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + ut

β̂2 = β2 + Σ at ut  (summed over t = 1, …, T)

at = (Xt – X̄) / Σ (Xs – X̄)²  (summed over s = 1, …, T)

E(β̂2) = β2 + E(Σ at ut) = β2 + Σ E(at ut) = β2 + Σ E(at) E(ut)

Another is that the expressions for the standard errors are invalid since they
are based on the assumption that there is no autocorrelation.
11
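The following Monte Carlo sketch illustrates both consequences for the simple regression model: with a fixed, exogenous X and AR(1) disturbances, the OLS slope remains approximately unbiased, while the conventionally computed standard error understates the true sampling variability. The parameter values and the design of X are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
T, reps, rho = 50, 5000, 0.7              # illustrative settings
beta1, beta2 = 1.0, 2.0                   # assumed true coefficients
X = np.linspace(0, 10, T)                 # fixed, exogenous regressor
x = X - X.mean()

slopes, reported_se = [], []
for _ in range(reps):
    eps = rng.normal(0, 1, T)
    u = np.zeros(T)
    for t in range(1, T):                 # AR(1) disturbance term
        u[t] = rho * u[t - 1] + eps[t]
    Y = beta1 + beta2 * X + u
    b2 = x @ (Y - Y.mean()) / (x @ x)     # OLS slope
    resid = (Y - Y.mean()) - b2 * x       # residuals of the fitted line
    s2 = resid @ resid / (T - 2)          # conventional error variance estimate
    slopes.append(b2)
    reported_se.append(np.sqrt(s2 / (x @ x)))

print("mean of b2:", np.mean(slopes))                    # stays close to 2.0: unbiased
print("actual s.d. of b2:", np.std(slopes))              # true sampling variability
print("mean conventional s.e.:", np.mean(reported_se))   # typically smaller: biased downwards
```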
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + β3Yt–1 + ut

ut = ρut–1 + εt

Now we come to the special case where OLS yields inconsistent estimators
if the disturbance term is subject to autocorrelation.
12
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + β3Yt–1 + ut

ut = ρut–1 + εt

If the model specification includes a lagged dependent variable, OLS


estimators are biased and inconsistent if the disturbance term is subject to
autocorrelation. This will be demonstrated for AR(1) autocorrelation and an
ADL(1,0) model with one X variable. 13
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + β3Yt–1 + ut

ut = ρut–1 + εt

Yt–1 = β1 + β2Xt–1 + β3Yt–2 + ut–1

Lagging the ADL(1,0) model by one time period, we obtain the third line.
Thus Yt–1 depends on ut–1. As a consequence of the AR(1) autocorrelation ut
also depends on ut–1. 14
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + β3Yt–1 + ut

ut = ρut–1 + εt

Yt–1 = β1 + β2Xt–1 + β3Yt–2 + ut–1

Hence we have a violation of part (1) of Assumption C.7. The explanatory variable Yt–1 is not distributed independently of the disturbance term. As a consequence, OLS will yield inconsistent estimates.
CONSEQUENCES OF AUTOCORRELATION

Yt = β1 + β2Xt + β3Yt–1 + ut

ut = ρut–1 + εt

Yt–1 = β1 + β2Xt–1 + β3Yt–2 + ut–1

This was described as a special case, but actually it is an important one.


ADL models are frequently used in time series regressions and
autocorrelation is a common problem. 16
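A small simulation makes the inconsistency visible. In the sketch below, data are generated from an ADL(1,0) model with AR(1) disturbances and the OLS estimate of the coefficient on Yt–1 is recorded across replications; the chosen parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
T, reps = 100, 2000                              # illustrative settings
beta1, beta2, beta3, rho = 1.0, 1.0, 0.5, 0.7    # assumed parameter values
X = rng.normal(0, 1, T)                          # exogenous regressor

b3_estimates = []
for _ in range(reps):
    eps = rng.normal(0, 1, T)
    u = np.zeros(T)
    Y = np.zeros(T)
    for t in range(1, T):
        u[t] = rho * u[t - 1] + eps[t]
        Y[t] = beta1 + beta2 * X[t] + beta3 * Y[t - 1] + u[t]
    # OLS of Y_t on a constant, X_t and Y_{t-1}
    Z = np.column_stack([np.ones(T - 1), X[1:], Y[:-1]])
    coef, *_ = np.linalg.lstsq(Z, Y[1:], rcond=None)
    b3_estimates.append(coef[2])

print("true beta3:", beta3)
print("mean OLS estimate:", np.mean(b3_estimates))   # systematically away from 0.5
```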
Introduction to Econometrics

TESTS FOR AUTOCORRELATION I:


BREUSCH–GODFREY TEST
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

ut = ut −1 +  t

uˆ t =  uˆ t −1 + error

We will initially confine the discussion of the tests for autocorrelation to its
most common form, the AR(1) process. If the disturbance term follows the
AR(1) process, it is reasonable to hypothesize that, as an approximation, the
residuals will conform to a similar process. 1
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

ut = ut −1 +  t

uˆ t =  uˆ t −1 + error

After all, provided that the conditions for the consistency of the OLS
estimators are satisfied, as the sample size becomes large, the OLS estimators
will converge on their true values. 2
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

ut = ut −1 +  t

uˆ t =  uˆ t −1 + error

If the OLS estimators will converge on their true values, the location of the
regression line will converge on the true relationship, and the residuals will
coincide with the values of the disturbance term. 3
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

ut = ut −1 +  t

uˆ t =  uˆ t −1 + error

Hence a regression of ût on ût–1 is sufficient, at least in large samples. Of course, there is the issue that, in this regression, ût–1 is a lagged dependent variable, but that does not matter in large samples.
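As a sketch of this procedure in Python, assuming X already contains a column of ones and y is the dependent variable (the function name is hypothetical):

```python
import numpy as np

def residual_autoregression(y, X):
    """Fit the original model by OLS (X should include a column of ones),
    then regress the residual u_hat_t on u_hat_{t-1} and return the slope,
    the simple estimate of rho."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u_hat = y - X @ b
    return (u_hat[1:] @ u_hat[:-1]) / (u_hat[:-1] @ u_hat[:-1])
```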
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

Yt = 10 + 2.0t + ut

ut = 0.7ut–1 + εt

ût = ρ̂ût–1

[Figure: sampling distributions of ρ̂ for T = 25, 50, 100, 200; true value 0.7]

This is illustrated with the simulation shown in the figure. The true model is as shown, with ut being generated as an AR(1) process with ρ = 0.7.
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

Yt = 10 + 2.0t + ut

ut = 0.7ut–1 + εt

ût = ρ̂ût–1

[Figure: sampling distributions of ρ̂ for T = 25, 50, 100, 200; true value 0.7]

The values of the parameters in the model for Yt make no difference to the distributions of the estimator of ρ.
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

Yt = 10 + 2.0t + ut

ut = 0.7ut–1 + εt

ût = ρ̂ût–1

T      mean of ρ̂
25     0.47
50     0.59
100    0.65
200    0.68

[Figure: sampling distributions of ρ̂ for T = 25, 50, 100, 200; true value 0.7]

As can be seen, when ût is regressed on ût–1, the distribution of the estimator of ρ is left skewed and heavily biased downwards for T = 25. The mean of the distribution is 0.47.
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Simple autoregression of the residuals

Yt = 10 + 2.0t + ut

ut = 0.7ut–1 + εt

ût = ρ̂ût–1

T      mean of ρ̂
25     0.47
50     0.59
100    0.65
200    0.68

[Figure: sampling distributions of ρ̂ for T = 25, 50, 100, 200; true value 0.7]

However, as the sample size increases, the downwards bias diminishes and it
is clear that the distribution of the estimator is converging on 0.7 as the
sample becomes large. Inference in finite samples will be approximate, given
the autoregressive nature of the regression. 8
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

The simple estimator of the autocorrelation coefficient depends on


Assumption C.7 part (2) being satisfied when the original model (the model
for Yt) is fitted. Generally, one might expect this not to be the case. 9
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

If the original model contains a lagged dependent variable as a regressor, or


violates Assumption C.7 part (2) in any other way, the estimates of the
parameters will be inconsistent if the disturbance term is subject to
autocorrelation. 10
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

As a repercussion, a simple regression of ût on ût–1 will produce an inconsistent estimate of ρ. The solution is to include all of the explanatory variables in the original model in the residuals autoregression.
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

If the original model is the first equation where, say, one of the X variables is
Yt–1, then the residuals regression would be the second equation.
12
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

The idea is that, by including the X variables, one is controlling for the
effects of any endogeneity on the residuals.
13
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

The underlying theory is complex and relates to maximum-likelihood


estimation, as does the test statistic. The test is known as the Breusch–
Godfrey test. 14
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

Test statistic: nR², distributed as χ²(1) when testing for first-order autocorrelation

Several asymptotically equivalent versions of the test have been proposed. The most popular involves the computation of the Lagrange multiplier statistic nR² when the residuals regression is fitted, n being the actual number of observations in the regression.
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

Test statistic: nR², distributed as χ²(1) when testing for first-order autocorrelation

Asymptotically, under the null hypothesis of no autocorrelation, nR2 is


distributed as a chi-squared statistic with one degree of freedom.
16
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + ρût–1  (summed over j = 2, …, k)

Alternatively, simple t test on coefficient of ût–1

A simple t test on the coefficient of ût–1 has also been proposed, again with asymptotic validity.
17
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + Σ ρs ût–s  (summed over j = 2, …, k and s = 1, …, q)

The procedure can be extended to test for higher order autocorrelation. If


AR(q) autocorrelation is suspected, the residuals regression includes q
lagged residuals. 18
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + Σ ρs ût–s  (summed over j = 2, …, k and s = 1, …, q)

Test statistic: nR², distributed as χ²(q)

For the Lagrange multiplier version of the test, the test statistic remains nR² (with n smaller than before, the inclusion of the additional lagged residuals leading to a further loss of initial observations).
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + Σ ρs ût–s  (summed over j = 2, …, k and s = 1, …, q)

Test statistic: nR², distributed as χ²(q)

Under the null hypothesis of no autocorrelation, nR2 has a chi-squared


distribution with q degrees of freedom.
20
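A hand-rolled version of the test, written as a sketch in Python with numpy and scipy, might look as follows; the function name and interface are hypothetical, not part of any standard library.

```python
import numpy as np
from scipy import stats

def breusch_godfrey(y, X, q=1):
    """Sketch of the Breusch-Godfrey nR^2 test. X should include a constant.
    Regress y on X, then regress the residuals on X and q lagged residuals;
    n*R^2 from that second regression is referred to a chi-squared(q) distribution."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ b
    T = len(u)
    dep = u[q:]                                            # u_hat_t for the last T - q periods
    lags = np.column_stack([u[q - s: T - s] for s in range(1, q + 1)])
    Z = np.column_stack([X[q:], lags])
    g, *_ = np.linalg.lstsq(Z, dep, rcond=None)
    e = dep - Z @ g
    r2 = 1 - (e @ e) / ((dep - dep.mean()) @ (dep - dep.mean()))
    lm = len(dep) * r2                                     # n R^2, n = T - q
    return lm, stats.chi2.sf(lm, q)                        # statistic and p-value
```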
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + Σ ρs ût–s  (summed over j = 2, …, k and s = 1, …, q)

Alternatively, F test on the lagged residuals

H0: ρ1 = ... = ρq = 0,  H1: not H0

The t test version becomes an F test comparing RSS for the residuals
regression with RSS for the same specification without the residual terms.
Again, the test is valid only asymptotically. 21
TESTS FOR AUTOCORRELATION I: BREUSCH–GODFREY TEST

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + Σ ρs ût–s  (summed over j = 2, …, k and s = 1, …, q)

Test statistic: nR², distributed as χ²(q), valid also for MA(q) autocorrelation

The Lagrange multiplier version of the test has been shown to be asymptotically valid for the case of MA(q) moving average autocorrelation.
22
Introduction to Econometrics

TESTS FOR AUTOCORRELATION II:


DURBIN–WATSON TEST
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

d = Σ (ût – ût–1)² / Σ ût²  (numerator summed over t = 2, …, T; denominator over t = 1, …, T)

The first major test to be developed and popularised for the detection of
autocorrelation was the Durbin–Watson test for AR(1) autocorrelation based
on the Durbin–Watson d statistic calculated from the residuals using the
expression shown. 1
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

d = Σ (ût – ût–1)² / Σ ût²  (numerator summed over t = 2, …, T; denominator over t = 1, …, T)

In large samples d → 2 – 2ρ

It can be shown that in large samples d tends to 2 – 2ρ, where ρ is the parameter in the AR(1) relationship ut = ρut–1 + εt.
2
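For reference, a minimal Python sketch of the statistic, using only numpy:

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared first differences of the residuals
    divided by the sum of squared residuals."""
    u = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(u) ** 2) / np.sum(u ** 2)

# In large samples d is approximately 2 * (1 - rho_hat), where rho_hat is the
# simple residual autoregression estimate of rho.
```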
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

d = Σ (ût – ût–1)² / Σ ût²  (numerator summed over t = 2, …, T; denominator over t = 1, …, T)

In large samples d → 2 – 2ρ
No autocorrelation d→2

If there is no autocorrelation, ρ is 0 and d should be distributed randomly around 2.

3
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

d = Σ (ût – ût–1)² / Σ ût²  (numerator summed over t = 2, …, T; denominator over t = 1, …, T)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0

If there is severe positive autocorrelation, ρ will be near 1 and d will be near 0.

4
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

d = Σ (ût – ût–1)² / Σ ût²  (numerator summed over t = 2, …, T; denominator over t = 1, …, T)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

Likewise, if there is severe negative autocorrelation, ρ will be near –1 and d will be near 4.
5
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 2 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

Thus d behaves as illustrated graphically above.

6
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dcrit 2 dcrit 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

To perform the Durbin–Watson test, we define critical values of d. The null hypothesis is H0: ρ = 0 (no autocorrelation). If d lies between these values, we do not reject the null hypothesis.
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dcrit 2 dcrit 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

The critical values, at any significance level, depend on the number of


observations in the sample and the number of explanatory variables.
8
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dcrit dU 2 dcrit 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

Unfortunately, they also depend on the actual data for the explanatory
variables in the sample, and thus vary from sample to sample.
9
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dcrit dU 2 dcrit 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

However Durbin and Watson determined upper and lower bounds, dU and dL,
for the critical values, and these are presented in standard tables.
10
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dcrit dU 2 dcrit 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

If d is less than dL, it must also be less than the critical value of d for
positive autocorrelation, and so we would reject the null hypothesis and
conclude that there is positive autocorrelation. 11
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dcrit dU 2 dcrit 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

If d is above dU, it must also be above the critical value of d, and so we would not reject the null hypothesis. (Of course, if it were above 2, we should consider testing for negative autocorrelation instead.)
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dcrit dU 2 dcrit 4

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

If d lies between dL and dU, we cannot tell whether it is above or below the
critical value and so the test is indeterminate.
13
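The decision rule can be summarised in a small helper function; the name and interface are hypothetical, and the illustrative call uses the housing regression's d statistic of 0.63 together with the 5% bounds quoted in the next few slides.

```python
def dw_decision(d, dL, dU):
    """Sketch of the Durbin-Watson decision rule using tabulated bounds dL and dU;
    the bounds for negative autocorrelation are 4 - dU and 4 - dL."""
    if d < dL:
        return "reject H0: positive autocorrelation"
    if d < dU:
        return "indeterminate"
    if d <= 4 - dU:
        return "do not reject H0"
    if d <= 4 - dL:
        return "indeterminate"
    return "reject H0: negative autocorrelation"

# Illustration with the housing regression (d = 0.63) and the 5% bounds for
# n = 45, k = 3: dw_decision(0.63, 1.43, 1.62) rejects H0 in favour of
# positive autocorrelation.
```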
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.43 1.62
(n = 45, k = 3, 5% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

Here are dL and dU for 45 observations and two explanatory variables, at the 5%
significance level.
14
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.43 1.62 2.38 2.57
(n = 45, k = 3, 5% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

There are similar bounds for the critical value in the case of negative autocorrelation. They are not given in the standard tables because negative autocorrelation is uncommon, but it is easy to calculate them because they are located symmetrically to the right of 2.
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.43 1.62 2.38 2.57
(n = 45, k = 3, 5% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

So if d < 1.43, we reject the null hypothesis and conclude that there is
positive autocorrelation.
16
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.43 1.62 2.38 2.57
(n = 45, k = 3, 5% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

If 1.43 < d < 1.62, the test is indeterminate and we do not come to any
conclusion.
17
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.43 1.62 2.38 2.57
(n = 45, k = 3, 5% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

If 1.62 < d < 2.38, we do not reject the null hypothesis of no autocorrelation.

18
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.43 1.62 2.38 2.57
(n = 45, k = 3, 5% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

If 2.38 < d < 2.57, we do not come to any conclusion.

19
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.43 1.62 2.38 2.57
(n = 45, k = 3, 5% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

If d > 2.57, we conclude that there is significant negative autocorrelation.

20
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.24 1.42 2.58 2.76
(n = 45, k = 3, 1% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

Here are the bounds for the critical values for the 1% test, again with 45
observations and two explanatory variables.
21
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.24 1.42 2.58 2.76
(n = 45, k = 3, 1% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

The Durbin–Watson test is valid only when all the explanatory variables are deterministic. This is in practice a serious limitation, since interactions and dynamics in a system of equations usually cause Assumption C.7 part (2) to be violated.
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.24 1.42 2.58 2.76
(n = 45, k = 3, 1% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

In particular, if the lagged dependent variable is used as a regressor, the


statistic is biased towards 2 and therefore will tend to under-reject the null
hypothesis. It is also restricted to testing for AR(1) autocorrelation. 23
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.24 1.42 2.58 2.76
(n = 45, k = 3, 1% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

Despite these shortcomings, it remains a popular test and some major


applications produce the d statistic automatically as part of the standard
regression output. 24
TESTS FOR AUTOCORRELATION II: DURBIN–WATSON TEST

Durbin–Watson test

positive no negative
autocorrelation autocorrelation autocorrelation

0 dL dU 2 4
1.24 1.42 2.58 2.76
(n = 45, k = 3, 1% level)

In large samples d → 2 – 2ρ
No autocorrelation d→2
Severe positive autocorrelation d→0
Severe negative autocorrelation d→4

It does have the appeal of the test statistic being part of standard regression
output. Further, it is appropriate for finite samples, subject to the zone of
indeterminacy and the deterministic regressor requirement. 25
Introduction to Econometrics

TESTS FOR AUTOCORRELATION:


EXAMPLES
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: LGFOOD
Method: Least Squares
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 2.236158 0.388193 5.760428 0.0000
LGDPI 0.500184 0.008793 56.88557 0.0000
LGPRFOOD -0.074681 0.072864 -1.024941 0.3113
============================================================
R-squared 0.992009 Mean dependent var 6.021331
Adjusted R-squared 0.991628 S.D. dependent var 0.222787
S.E. of regression 0.020384 Akaike info criter-4.883747
Sum squared resid 0.017452 Schwarz criterion -4.763303
Log likelihood 112.8843 Hannan-Quinn crite-4.838846
F-statistic 2606.860 Durbin-Watson stat 0.478540
Prob(F-statistic) 0.000000
============================================================

The output shown in the table gives the result of a logarithmic regression of
expenditure on food on disposable personal income and the relative price of
food. 1
TESTS FOR AUTOCORRELATION III: EXAMPLES

[Figure: Residuals, static logarithmic regression for FOOD, 1959–2003]

The plot of the residuals is shown. All the tests indicate highly significant
autocorrelation.
2
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
RLGFOOD(-1) 0.790169 0.106603 7.412228 0.0000
============================================================
R-squared 0.560960 Mean dependent var 3.28E-05
Adjusted R-squared 0.560960 S.D. dependent var 0.020145
S.E. of regression 0.013348 Akaike info criter-5.772439
Sum squared resid 0.007661 Schwarz criterion -5.731889
Log likelihood 127.9936 Durbin-Watson stat 1.477337
============================================================

ût = 0.79ût–1

RLGFOOD in the regression above is the residual from the LGFOOD


regression. A simple regression of RLGFOOD on RLGFOOD(–1) yields a
coefficient of 0.79 with standard error 0.11. 3
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
RLGFOOD(-1) 0.790169 0.106603 7.412228 0.0000
============================================================
R-squared 0.560960 Mean dependent var 3.28E-05
Adjusted R-squared 0.560960 S.D. dependent var 0.020145
S.E. of regression 0.013348 Akaike info criter-5.772439
Sum squared resid 0.007661 Schwarz criterion -5.731889
Log likelihood 127.9936 Durbin-Watson stat 1.477337
============================================================

ût = 0.79ût–1

Technical note for EViews users: EViews places the residuals from the most
recent regression in a pseudo-variable called resid. resid cannot be used
directly. So the residuals were saved as RLGFOOD using the genr
command: genr RLGFOOD = resid 4
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.175732 0.265081 0.662936 0.5112
LGDPI -7.36E-05 0.006180 -0.011917 0.9906
LGPRFOOD -0.037373 0.049496 -0.755058 0.4546
RLGFOOD(-1) 0.805744 0.110202 7.311504 0.0000
============================================================
R-squared 0.572006 Mean dependent var 3.28E-05
Adjusted R-squared 0.539907 S.D. dependent var 0.020145
S.E. of regression 0.013664 Akaike info criter-5.661558
Sum squared resid 0.007468 Schwarz criterion -5.499359
Log likelihood 128.5543 F-statistic 17.81977
Durbin-Watson stat 1.513911 Prob(F-statistic) 0.000000
============================================================

nR² = 44 × 0.5720 = 25.17        χ²(1)crit, 0.1% = 10.83

Next, the Breusch‒Godfrey test. Adding an intercept, LGDPI and LGPRFOOD


to the specification, the coefficient of the lagged residuals becomes 0.81
with standard error 0.11. R2 is 0.5720, so nR2 is 25.17. 5
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.175732 0.265081 0.662936 0.5112
LGDPI -7.36E-05 0.006180 -0.011917 0.9906
LGPRFOOD -0.037373 0.049496 -0.755058 0.4546
RLGFOOD(-1) 0.805744 0.110202 7.311504 0.0000
============================================================
R-squared 0.572006 Mean dependent var 3.28E-05
Adjusted R-squared 0.539907 S.D. dependent var 0.020145
S.E. of regression 0.013664 Akaike info criter-5.661558
Sum squared resid 0.007468 Schwarz criterion -5.499359
Log likelihood 128.5543 F-statistic 17.81977
Durbin-Watson stat 1.513911 Prob(F-statistic) 0.000000
============================================================

nR² = 44 × 0.5720 = 25.17        χ²(1)crit, 0.1% = 10.83

(Note that here n = 44. There are 45 observations in the regression in Table
12.1, and one fewer in the residuals regression.) The critical value of chi-
squared with one degree of freedom at the 0.1 percent level is 10.83. 6
TESTS FOR AUTOCORRELATION III: EXAMPLES

============================================================
Breusch-Godfrey Serial Correlation LM Test:
============================================================
F-statistic 54.78773 Probability 0.000000
Obs*R-squared 25.73866 Probability 0.000000
============================================================
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Presample missing value lagged residuals set to zero.
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.171665 0.258094 0.665124 0.5097
LGDPI 9.50E-05 0.005822 0.016324 0.9871
LGPRFOOD -0.036806 0.048504 -0.758819 0.4523
RESID(-1) 0.805773 0.108861 7.401873 0.0000
============================================================
R-squared 0.571970 Mean dependent var-1.85E-18
Adjusted R-squared 0.540651 S.D. dependent var 0.019916
S.E. of regression 0.013498 Akaike info criter-5.687865
Sum squared resid 0.007470 Schwarz criterion -5.527273
Log likelihood 131.9770 F-statistic 18.26258
Durbin-Watson stat 1.514975 Prob(F-statistic) 0.000000
============================================================
Technical note for EViews users: one can perform the test simply by
following the LGFOOD regression with the command auto(1). EViews
allows itself to use resid directly. 7
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Breusch-Godfrey Serial Correlation LM Test:
============================================================
F-statistic 54.78773 Probability 0.000000
Obs*R-squared 25.73866 Probability 0.000000
============================================================
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Presample missing value lagged residuals set to zero.
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.171665 0.258094 0.665124 0.5097
LGDPI 9.50E-05 0.005822 0.016324 0.9871
LGPRFOOD -0.036806 0.048504 -0.758819 0.4523
RESID(-1) 0.805773 0.108861 7.401873 0.0000
============================================================
R-squared 0.571970 Mean dependent var-1.85E-18
Adjusted R-squared 0.540651 S.D. dependent var 0.019916
S.E. of regression 0.013498 Akaike info criter-5.687865
Sum squared resid 0.007470 Schwarz criterion -5.527273
Log likelihood 131.9770 F-statistic 18.26258
Durbin-Watson stat 1.514975 Prob(F-statistic) 0.000000
============================================================
The argument in the auto command relates to the order of autocorrelation
being tested. At the moment we are concerned only with first-order
autocorrelation. This is why the command is auto(1). 8
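For readers working in Python rather than EViews, statsmodels provides a broadly comparable test through acorr_breusch_godfrey. The sketch below uses stand-in data that loosely mimic the fitted food equation, because the original series are not reproduced here; the seed and the generated series are illustrative assumptions only.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Stand-in data: replace with the actual LGFOOD, LGDPI, LGPRFOOD series.
rng = np.random.default_rng(0)
lgdpi = np.linspace(7.0, 9.0, 45)
lgprfood = rng.normal(4.5, 0.1, 45)
lgfood = 2.24 + 0.50 * lgdpi - 0.07 * lgprfood + rng.normal(0, 0.02, 45)

X = sm.add_constant(np.column_stack([lgdpi, lgprfood]))
results = sm.OLS(lgfood, X).fit()

# First-order Breusch-Godfrey test, comparable to EViews auto(1)
lm, lm_pvalue, fstat, f_pvalue = acorr_breusch_godfrey(results, nlags=1)
print(lm, lm_pvalue)    # analogue of the Obs*R-squared line in the EViews output
print(fstat, f_pvalue)  # analogue of the F-statistic line
```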
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Breusch-Godfrey Serial Correlation LM Test:
============================================================
F-statistic 54.78773 Probability 0.000000
Obs*R-squared 25.73866 Probability 0.000000
============================================================
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Presample missing value lagged residuals set to zero.
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.171665 0.258094 0.665124 0.5097
LGDPI 9.50E-05 0.005822 0.016324 0.9871
LGPRFOOD -0.036806 0.048504 -0.758819 0.4523
RESID(-1) 0.805773 0.108861 7.401873 0.0000
============================================================
R-squared 0.571970 Mean dependent var-1.85E-18
Adjusted R-squared 0.540651 S.D. dependent var 0.019916
S.E. of regression 0.013498 Akaike info criter-5.687865
Sum squared resid 0.007470 Schwarz criterion -5.527273
Log likelihood 131.9770 F-statistic 18.26258
Durbin-Watson stat 1.514975 Prob(F-statistic) 0.000000
============================================================
When we performed the test, resid(–1), and hence RLGFOOD(–1), were
not defined for the first observation in the sample, so we had 44
observations from 1960 to 2003. 9
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Breusch-Godfrey Serial Correlation LM Test:
============================================================
F-statistic 54.78773 Probability 0.000000
Obs*R-squared 25.73866 Probability 0.000000
============================================================
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Presample missing value lagged residuals set to zero.
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.171665 0.258094 0.665124 0.5097
LGDPI 9.50E-05 0.005822 0.016324 0.9871
LGPRFOOD -0.036806 0.048504 -0.758819 0.4523
RESID(-1) 0.805773 0.108861 7.401873 0.0000
============================================================
R-squared 0.571970 Mean dependent var-1.85E-18
Adjusted R-squared 0.540651 S.D. dependent var 0.019916
S.E. of regression 0.013498 Akaike info criter-5.687865
Sum squared resid 0.007470 Schwarz criterion -5.527273
Log likelihood 131.9770 F-statistic 18.26258
Durbin-Watson stat 1.514975 Prob(F-statistic) 0.000000
============================================================
EViews uses the first observation by assigning a value of zero to the first
observation for resid(–1). Hence the test results are very slightly
different. 10
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.175732 0.265081 0.662936 0.5112
LGDPI -7.36E-05 0.006180 -0.011917 0.9906
LGPRFOOD -0.037373 0.049496 -0.755058 0.4546
RLGFOOD(-1) 0.805744 0.110202 7.311504 0.0000
============================================================
R-squared 0.572006 Mean dependent var 3.28E-05
Adjusted R-squared 0.539907 S.D. dependent var 0.020145
S.E. of regression 0.013664 Akaike info criter-5.661558
Sum squared resid 0.007468 Schwarz criterion -5.499359
Log likelihood 128.5543 F-statistic 17.81977
Durbin-Watson stat 1.513911 Prob(F-statistic) 0.000000
============================================================

We can also perform the test with a t test on the coefficient of the lagged
variable.
11
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Breusch-Godfrey Serial Correlation LM Test:
============================================================
F-statistic 54.78773 Probability 0.000000
Obs*R-squared 25.73866 Probability 0.000000
============================================================
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Presample missing value lagged residuals set to zero.
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.171665 0.258094 0.665124 0.5097
LGDPI 9.50E-05 0.005822 0.016324 0.9871
LGPRFOOD -0.036806 0.048504 -0.758819 0.4523
RESID(-1) 0.805773 0.108861 7.401873 0.0000
============================================================
R-squared 0.571970 Mean dependent var-1.85E-18
Adjusted R-squared 0.540651 S.D. dependent var 0.019916
S.E. of regression 0.013498 Akaike info criter-5.687865
Sum squared resid 0.007470 Schwarz criterion -5.527273
Log likelihood 131.9770 F-statistic 18.26258
Durbin-Watson stat 1.514975 Prob(F-statistic) 0.000000
============================================================
Here is the corresponding output using the auto command built into
EViews. The test is presented as an F statistic. Of course, when there is
only one lagged residual, the F statistic is the square of the t statistic. 12
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: LGFOOD
Method: Least Squares
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 2.236158 0.388193 5.760428 0.0000
LGDPI 0.500184 0.008793 56.88557 0.0000
LGPRFOOD -0.074681 0.072864 -1.024941 0.3113
============================================================
R-squared 0.992009 Mean dependent var 6.021331
Adjusted R-squared 0.991628 S.D. dependent var 0.222787
S.E. of regression 0.020384 Akaike info criter-4.883747
Sum squared resid 0.017452 Schwarz criterion -4.763303
Log likelihood 112.8843 Hannan-Quinn crite-4.838846
F-statistic 2606.860 Durbin-Watson stat 0.478540
Prob(F-statistic) 0.000000
============================================================

dL = 1.24 (1% level, 2 explanatory variables, 45 observations)

The Durbin–Watson statistic is 0.48. dL is 1.24 for a 1 percent significance


test (2 explanatory variables, 45 observations).
13
TESTS FOR AUTOCORRELATION III: EXAMPLES

Breusch–Godfrey test

Yt = β1 + Σ βj Xjt + ut  (summed over j = 2, …, k)

ût = γ1 + Σ γj Xjt + Σ ρs ût–s  (summed over j = 2, …, k and s = 1, …, q)

Test statistic: nR², distributed as χ²(q)

Alternatively, F test on the lagged residuals

H0: ρ1 = ... = ρq = 0,  H1: not H0

The Breusch–Godfrey test for higher-order autocorrelation is a


straightforward extension of the first-order test. If we are testing for order q,
we add q lagged residuals to the right side of the residuals regression. We
will perform the test for second-order autocorrelation. 14
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1961 2003
Included observations: 43 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.071220 0.277253 0.256879 0.7987
LGDPI 0.000251 0.006491 0.038704 0.9693
LGPRFOOD -0.015572 0.051617 -0.301695 0.7645
RLGFOOD(-1) 1.009693 0.163240 6.185318 0.0000
RLGFOOD(-2) -0.289159 0.171960 -1.681548 0.1009
============================================================
R-squared 0.602010 Mean dependent var 0.000149
Adjusted R-squared 0.560117 S.D. dependent var 0.020368
S.E. of regression 0.013509 Akaike info criter-5.661981
Sum squared resid 0.006935 Schwarz criterion -5.457191
Log likelihood 126.7326 F-statistic 14.36996
Durbin-Watson stat 1.892212 Prob(F-statistic) 0.000000
============================================================

nR² = 43 × 0.6020 = 25.89        χ²(2)crit, 0.1% = 13.82

Here is the regression for RLGFOOD with two lagged residuals. The
Breusch–Godfrey test statistic is 25.89. With two lagged residuals, the
statistic has a chi-squared distribution with two degrees of freedom under
the null hypothesis. It is significant at the 0.1 percent level.
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1961 2003
Included observations: 43 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.071220 0.277253 0.256879 0.7987
LGDPI 0.000251 0.006491 0.038704 0.9693
LGPRFOOD -0.015572 0.051617 -0.301695 0.7645
RLGFOOD(-1) 1.009693 0.163240 6.185318 0.0000
RLGFOOD(-2) -0.289159 0.171960 -1.681548 0.1009
============================================================
R-squared 0.602010 Mean dependent var 0.000149
Adjusted R-squared 0.560117 S.D. dependent var 0.020368
S.E. of regression 0.013509 Akaike info criter-5.661981
Sum squared resid 0.006935 Schwarz criterion -5.457191
Log likelihood 126.7326 F-statistic 14.36996
Durbin-Watson stat 1.892212 Prob(F-statistic) 0.000000
============================================================

We will also perform an F test, comparing the RSS with the RSS for the same regression without
the lagged residuals. We know the result, because one of the t statistics is very high.

16
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample: 1961 2003
Included observations: 43
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.027475 0.412043 0.066680 0.9472
LGDPI -0.001074 0.009986 -0.107528 0.9149
LGPRFOOD -0.003948 0.076191 -0.051816 0.9589
============================================================
R-squared 0.000298 Mean dependent var 0.000149
Adjusted R-squared -0.049687 S.D. dependent var 0.020368
S.E. of regression 0.020868 Akaike info criter-4.833974
Sum squared resid 0.017419 Schwarz criterion -4.711100
Log likelihood 106.9304 F-statistic 0.005965
Durbin-Watson stat 0.476550 Prob(F-statistic) 0.994053
============================================================

Here is the regression for RLGFOOD without the lagged residuals. Note that the sample period has been adjusted to 1961 to 2003, to make RSS comparable with that for the previous regression.
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample: 1961 2003
Included observations: 43
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.027475 0.412043 0.066680 0.9472
LGDPI -0.001074 0.009986 -0.107528 0.9149
LGPRFOOD -0.003948 0.076191 -0.051816 0.9589
============================================================
R-squared 0.000298 Mean dependent var 0.000149
Adjusted R-squared -0.049687 S.D. dependent var 0.020368
S.E. of regression 0.020868 Akaike info criter-4.833974
Sum squared resid 0.017419 Schwarz criterion -4.711100
Log likelihood 106.9304 F-statistic 0.005965
Durbin-Watson stat 0.476550 Prob(F-statistic) 0.994053
============================================================

F(2,38) = [(0.017419 – 0.006935) / 2] / [0.006935 / 38] = 28.72        F(2,35)crit, 0.1% = 8.47

The F statistic is 28.72. This is significant at the 0.1% level. The critical value for F(2,35) at that level is 8.47. That for F(2,38) must be slightly lower.
18
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Breusch-Godfrey Serial Correlation LM Test:
============================================================
F-statistic 30.24142 Probability 0.000000
Obs*R-squared 27.08649 Probability 0.000001
============================================================
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Presample missing value lagged residuals set to zero.
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.053628 0.261016 0.205460 0.8383
LGDPI 0.000920 0.005705 0.161312 0.8727
LGPRFOOD -0.013011 0.049304 -0.263900 0.7932
RESID(-1) 1.011261 0.159144 6.354360 0.0000
RESID(-2) -0.290831 0.167642 -1.734833 0.0905
============================================================
R-squared 0.601922 Mean dependent var-1.85E-18
Adjusted R-squared 0.562114 S.D. dependent var 0.019916
S.E. of regression 0.013179 Akaike info criter-5.715965
Sum squared resid 0.006947 Schwarz criterion -5.515225
Log likelihood 133.6092 F-statistic 15.12071
Durbin-Watson stat 1.894290 Prob(F-statistic) 0.000000
============================================================

Here is the output using the auto(2) command in EViews. The conclusions for the two tests are the same.
19
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: LGFOOD
Method: Least Squares
Sample (adjusted): 1960 2003
Included observations: 44 after adjustments
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.985780 0.336094 2.933054 0.0055
LGDPI 0.126657 0.056496 2.241872 0.0306
LGPRFOOD -0.088073 0.051897 -1.697061 0.0975
LGFOOD(-1) 0.732923 0.110178 6.652153 0.0000
============================================================
R-squared 0.995879 Mean dependent var 6.030691
Adjusted R-squared 0.995570 S.D. dependent var 0.216227
S.E. of regression 0.014392 Akaike info criter-5.557847
Sum squared resid 0.008285 Schwarz criterion -5.395648
Log likelihood 126.2726 Hannan-Quinn crite-5.497696
F-statistic 3222.264 Durbin-Watson stat 1.112437
Prob(F-statistic) 0.000000
============================================================

The output above gives the result of a parallel logarithmic regression with the
addition of lagged expenditure on food as an explanatory variable. Again,
there is strong evidence that the specification is subject to autocorrelation.
20
TESTS FOR AUTOCORRELATION III: EXAMPLES

[Figure: Residuals, ADL(1,0) logarithmic regression for FOOD, 1959–2003]

Here is a plot of the residuals.

21
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1961 2003
Included observations: 43 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
RLGFOOD(-1) 0.431010 0.143277 3.008226 0.0044
============================================================
R-squared 0.176937 Mean dependent var 0.000276
Adjusted R-squared 0.176937 S.D. dependent var 0.013922
S.E. of regression 0.012630 Akaike info criter-5.882426
Sum squared resid 0.006700 Schwarz criterion -5.841468
Log likelihood 127.4722 Durbin-Watson stat 1.801390
============================================================

ût = 0.43ût−1

A simple regression of the residuals on the lagged residuals yields a coefficient


of 0.43 with standard error 0.14. We expect the estimate to be adversely
affected by the presence of the lagged dependent variable in the regression for
LGFOOD. 22
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1961 2003
Included observations: 43 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.417342 0.317973 1.312507 0.1972
LGDPI 0.108353 0.059784 1.812418 0.0778
LGPRFOOD -0.005585 0.046434 -0.120279 0.9049
LGFOOD(-1) -0.214252 0.116145 -1.844700 0.0729
RLGFOOD(-1) 0.604346 0.172040 3.512826 0.0012
============================================================
R-squared 0.246863 Mean dependent var 0.000276
Adjusted R-squared 0.167586 S.D. dependent var 0.013922
S.E. of regression 0.012702 Akaike info criter-5.785165
Sum squared resid 0.006131 Schwarz criterion -5.580375
Log likelihood 129.3811 F-statistic 3.113911
Durbin-Watson stat 1.867467 Prob(F-statistic) 0.026046
============================================================

With an intercept, LGDPI, LGPRFOOD, and LGFOOD(–1) added to the


specification, the coefficient of the lagged residuals becomes 0.60 with
standard error 0.17. 23
TESTS FOR AUTOCORRELATION III: EXAMPLES
============================================================
Dependent Variable: RLGFOOD
Method: Least Squares
Sample(adjusted): 1961 2003
Included observations: 43 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.417342 0.317973 1.312507 0.1972
LGDPI 0.108353 0.059784 1.812418 0.0778
LGPRFOOD -0.005585 0.046434 -0.120279 0.9049
LGFOOD(-1) -0.214252 0.116145 -1.844700 0.0729
RLGFOOD(-1) 0.604346 0.172040 3.512826 0.0012
============================================================
R-squared 0.246863 Mean dependent var 0.000276
Adjusted R-squared 0.167586 S.D. dependent var 0.013922
S.E. of regression 0.012702 Akaike info criter-5.785165
Sum squared resid 0.006131 Schwarz criterion -5.580375
Log likelihood 129.3811 F-statistic 3.113911
Durbin-Watson stat 1.867467 Prob(F-statistic) 0.026046
============================================================

nR² = 43 × 0.2469 = 10.62
χ²(1) crit, 0.1% = 10.83

R² is 0.2469, so nR² is 10.62, significant at the 1 percent level and nearly
significant at the 0.1 percent level. (Note that here n = 43.) The t statistic for
the coefficient of the lagged residual is also highly significant. 23
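For readers working outside EViews, the same arithmetic can be reproduced with a short script. The sketch below is illustrative only (the names y, X and the column layout are assumptions, not part of the text): it runs the auxiliary regression of the residuals on the original regressors and lagged residuals and forms the nR² statistic.

import pandas as pd
import statsmodels.api as sm
from scipy import stats

def breusch_godfrey_nr2(y, X, lags=1):
    # residuals from the original regression (X is assumed to include a constant)
    resid = sm.OLS(y, X).fit().resid
    # auxiliary regressors: original regressors plus lagged residuals
    aux = pd.concat([X, pd.DataFrame({f"resid_lag{j}": resid.shift(j)
                                      for j in range(1, lags + 1)})], axis=1)
    aux_fit = sm.OLS(resid.iloc[lags:], aux.iloc[lags:]).fit()
    lm = len(resid.iloc[lags:]) * aux_fit.rsquared      # nR², asymptotically chi-squared(lags)
    return lm, stats.chi2.sf(lm, df=lags)

# statsmodels.stats.diagnostic.acorr_breusch_godfrey gives the same family of statistics directly.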
Introduction to Econometrics
Chapter heading

ELIMINATING AR(1)
AUTOCORRELATION
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

This sequence shows how AR(1) autocorrelation can be eliminated from a


regression model. The AR(1) process is the equation at the top right. We
will start with the simple regression model, top left. 1
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1

If the regression model is valid at time t, it is also valid at time t – 1. For


reasons that will become obvious in a moment, we have multiplied through
the second equation by ρ. 2
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

We now subtract the second equation from the first.

3
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

The disturbance term now reduces to εt, the innovation at time t in the AR(1)
process. By assumption, this is independently distributed, so the problem
of autocorrelation has been eliminated. 4
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

There is one minor problem. The revised specification involves a nonlinear


restriction. The coefficient of Xt–1 is minus the product of the coefficients of
Xt and Yt–1. 5
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

This means that we should not try to fit the equation using ordinary least
squares. OLS would not take account of the restriction and so we would
end up with conflicting estimates of the parameters. 6
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

Ŷt = 100 + 0.5Yt−1 + 0.8Xt − 0.6Xt−1

For example, we might obtain the equation shown. From it we could deduce
estimates of 0.5 for ρ and 0.8 for β2. But these numbers would be
incompatible with the estimate of 0.6 for ρβ2. 7
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

Yt = β1 + β2X2t + β3X3t + ut        ut = ρut−1 + εt

We therefore need to use a nonlinear estimation technique. Before doing


this, we will extend the model to multiple regression with two explanatory
variables. 8
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

Yt = β1 + β2X2t + β3X3t + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρX2t−1 + β3ρX3t−1 + ρut−1

The procedure is the same. Write the model a second time, lagged one time
period, and multiply through by .
9
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

Yt = β1 + β2X2t + β3X3t + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρX2t−1 + β3ρX3t−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + ut − ρut−1

Subtract the second equation from the first.

10
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

Yt = β1 + β2X2t + β3X3t + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρX2t−1 + β3ρX3t−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt

Again, we obtain a model that is free from autocorrelation.

11
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

Yt = β1 + β2X2t + β3X3t + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρX2t−1 + β3ρX3t−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt

Now there are two restrictions. One involves the coefficients of Yt–1, X2t, and
X2t–1.
12
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

Yt = β1 + β2X2t + β3X3t + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρX2t−1 + β3ρX3t−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt

The other involves the coefficients of Yt–1, X3t, and X3t–1.

13
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

Here is the output for a logarithmic regression of expenditure on housing


services on income and price, assuming an AR(1) process, using EViews.
14
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

EViews allows two ways of specifying a regression equation. One is to list


the variables, starting with the dependent variable, continuing with C for the
intercept, and finishing with a list of the explanatory variables. This is fine
for linear regressions. 15
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

The other method is to write the model as an equation, referring to the


parameters as C(1), C(2), etc. This is what you should do when fitting a
nonlinear model, such as the present one. 16
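If you are not using EViews, the same restricted specification can be fitted by nonlinear least squares in other software. The sketch below is an illustrative Python version (the array names y, x2, x3 and the starting values are assumptions, not part of the text).

# Nonlinear least squares for
#   Y_t = b1(1-rho) + rho*Y_{t-1} + b2*X2_t - rho*b2*X2_{t-1} + b3*X3_t - rho*b3*X3_{t-1} + eps_t
import numpy as np
from scipy.optimize import least_squares

def ar1_residuals(params, y, x2, x3):
    b1, rho, b2, b3 = params
    fitted = (b1 * (1 - rho) + rho * y[:-1]
              + b2 * x2[1:] - rho * b2 * x2[:-1]
              + b3 * x3[1:] - rho * b3 * x3[:-1])
    return y[1:] - fitted

# y, x2, x3: 1-D numpy arrays in time order (e.g. LGHOUS, LGDPI, LGPRHOUS)
# fit = least_squares(ar1_residuals, x0=np.array([0.0, 0.5, 1.0, -0.5]), args=(y, x2, x3))
# b1_hat, rho_hat, b2_hat, b3_hat = fit.x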
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

Here β1 has been denoted C(1).

17
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

ρ, the coefficient of the lagged dependent variable, has been denoted C(2). It
is also a component of the intercept in this model. The estimate of ρ, 0.72, is
quite high. 18
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

β2, the coefficient of income, has been denoted C(3). The estimate is close
to the OLS estimate, 1.03.
19
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

The coefficient of lagged income must then be specified as –C(2)*C(3).

20
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

β3, the coefficient of price, has been denoted C(4). The estimate is the same
as the OLS estimate, –0.48, at least to two decimal places. (This is a
coincidence.) 21
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

The coefficient of lagged price must then be specified as –C(2)*C(4).

22
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 Durbin-Watson stat 1.901081
============================================================

The only problem with this method of fitting the AR(1) model is that
specifying the model in equation form is a tedious task and it is easy to
make mistakes. 23
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
============================================================
Dependent Variable: LGHOUS
Method: Least Squares Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
Convergence achieved after 21 iterations
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.154815 0.354989 0.436111 0.6651
LGDPI 1.011295 0.021830 46.32642 0.0000
LGPRHOUS -0.478070 0.091594 -5.219437 0.0000
AR(1) 0.719102 0.115689 6.215836 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 F-statistic 16757.24
Durbin-Watson stat 1.901081 Prob(F-statistic) 0.000000
============================================================
Since the AR(1) specification is a common one, most serious regression
applications provide some short-cut for specifying it easily. In the case of EViews,
AR(1) estimation is invoked by adding AR(1) to the list of explanatory variables.
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
=============================================================
Dependent Variable: LGHOUS
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.154815 0.354989 0.436111 0.6651
LGDPI 1.011295 0.021830 46.32642 0.0000
LGPRHOUS -0.478070 0.091594 -5.219437 0.0000
AR(1) 0.719102 0.115689 6.215836 0.0000
============================================================

The constant is an estimate of β1.

25
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
=============================================================
Dependent Variable: LGHOUS
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.154815 0.354989 0.436111 0.6651
LGDPI 1.011295 0.021830 46.32642 0.0000
LGPRHOUS -0.478070 0.091594 -5.219437 0.0000
AR(1) 0.719102 0.115689 6.215836 0.0000
============================================================

The income coefficient is the estimate of the elasticity with respect to
current income.
26
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
=============================================================
Dependent Variable: LGHOUS
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.154815 0.354989 0.436111 0.6651
LGDPI 1.011295 0.021830 46.32642 0.0000
LGPRHOUS -0.478070 0.091594 -5.219437 0.0000
AR(1) 0.719102 0.115689 6.215836 0.0000
============================================================

The price coefficient is the estimate of the elasticity with respect to current
price.
27
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
=============================================================
Dependent Variable: LGHOUS
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.154815 0.354989 0.436111 0.6651
LGDPI 1.011295 0.021830 46.32642 0.0000
LGPRHOUS -0.478070 0.091594 -5.219437 0.0000
AR(1) 0.719102 0.115689 6.215836 0.0000
============================================================

The coefficient of AR(1) is an estimate of ρ.

28
ELIMINATING AR(1) AUTOCORRELATION

Yt = β1(1 − ρ) + ρYt−1 + β2X2t − β2ρX2t−1 + β3X3t − β3ρX3t−1 + εt
=============================================================
Dependent Variable: LGHOUS
LGHOUS=C(1)*(1-C(2))+C(2)*LGHOUS(-1)+C(3)*LGDPI-C(2)*C(3)
*LGDPI(-1)+C(4)*LGPRHOUS-C(2)*C(4)*LGPRHOUS(-1)
============================================================
Coefficient Std. Error t-Statistic Prob.
============================================================
C(1) 0.154815 0.354989 0.436111 0.6651
C(2) 0.719102 0.115689 6.215836 0.0000
C(3) 1.011295 0.021830 46.32641 0.0000
C(4) -0.478070 0.091594 -5.219436 0.0000
============================================================
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.154815 0.354989 0.436111 0.6651
LGDPI 1.011295 0.021830 46.32642 0.0000
LGPRHOUS -0.478070 0.091594 -5.219437 0.0000
AR(1) 0.719102 0.115689 6.215836 0.0000
============================================================

The coefficients of lagged income and lagged price are not reported
because they are implicit in the estimates of ρ, β2, and β3.
29
Introduction to Econometrics
Chapter heading

THE COCHRANE–ORCUTT
PROCESS
FOOTNOTE: THE COCHRANE–ORCUTT PROCESS

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

We saw in the previous sequence that AR(1) autocorrelation could be


eliminated by a simple manipulation of the model. The regression model is
nonlinear in parameters, but that now presents no problem for fitting it. 1
FOOTNOTE: THE COCHRANE–ORCUTT PROCESS

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

However, in the early days of computing, nonlinear estimation was not so


simple and it was avoided whenever possible. The Cochrane–Orcutt iterative
procedure was an ingenious method of using linear regression analysis to fit
this nonlinear model. 2
FOOTNOTE: THE COCHRANE–ORCUTT PROCESS

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

It is of no practical interest now, but you may see references to it


occasionally. This sequence explains how it worked.
3
FOOTNOTE: THE COCHRANE–ORCUTT PROCESS

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Ỹt = β̃1 + β2X̃t + εt        Ỹt = Yt − ρYt−1
X̃t = Xt − ρXt−1
β̃1 = β1(1 − ρ)

We return to line 3 and note that the model can be rewritten as shown with
appropriate definitions. We now have a simple regression model free from
autocorrelation. 4
FOOTNOTE: THE COCHRANE–ORCUTT PROCESS

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Ỹt = β̃1 + β2X̃t + εt        Ỹt = Yt − ρYt−1
X̃t = Xt − ρXt−1
ût = ρût−1 + error
β̃1 = β1(1 − ρ)

However, to construct the artificial variables Ỹt and X̃t, we need an estimate
of ρ. We obtain one using the residuals. If the disturbance term follows the
AR(1) process, it is reasonable to hypothesize that, as an approximation, the
residuals will conform to a similar process. 5
FOOTNOTE: THE COCHRANE–ORCUTT PROCESS

Yt = β1 + β2Xt + ut        ut = ρut−1 + εt

ρYt−1 = β1ρ + β2ρXt−1 + ρut−1
Yt − ρYt−1 = β1(1 − ρ) + β2Xt − β2ρXt−1 + ut − ρut−1

Ỹt = β̃1 + β2X̃t + εt        Ỹt = Yt − ρYt−1
X̃t = Xt − ρXt−1
ût = ρût−1 + error
β̃1 = β1(1 − ρ)

Hence one can obtain an estimate of ρ by fitting the residuals regression.


Refinements, including an iterated version, were developed. However, there
is no point in describing them because the method has long been obsolete.
It is now only a historical curiosity. 6
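For completeness, here is a rough sketch of what each pass of the procedure involved (the array names, the convergence settings, and the use of Python are assumptions for illustration; the method itself remains obsolete).

# Sketch of the Cochrane-Orcutt iteration. X is assumed to include a column of ones.
import numpy as np

def cochrane_orcutt(y, X, max_iter=20, tol=1e-6):
    rho = 0.0
    for _ in range(max_iter):
        y_star = y[1:] - rho * y[:-1]            # quasi-differenced Y
        X_star = X[1:] - rho * X[:-1]            # quasi-differenced regressors
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        resid = y - X @ beta                     # residuals from the original equation
        rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    # the ones column of X_star equals (1 - rho), so beta[0] already estimates beta1
    return beta, rho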
Introduction to Econometrics
Chapter heading
AUTOCORRELATION,
PARTIAL ADJUSTMENT AND
ADAPTIVE EXPECTATIONS
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt* = β1 + β2Xt + ut        Yt − Yt−1 = λ(Yt* − Yt−1)
Yt = λYt* + (1 − λ)Yt−1

Yt = λ(β1 + β2Xt + ut) + (1 − λ)Yt−1
   = λβ1 + λβ2Xt + (1 − λ)Yt−1 + λut
   = δ1 + δ2Xt + δ3Yt−1 + λut

This sequence looks at the implications of autocorrelation for the partial


adjustment and adaptive expectations models.
1
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt* = β1 + β2Xt + ut        Yt − Yt−1 = λ(Yt* − Yt−1)
Yt = λYt* + (1 − λ)Yt−1

Yt = λ(β1 + β2Xt + ut) + (1 − λ)Yt−1
   = λβ1 + λβ2Xt + (1 − λ)Yt−1 + λut
   = δ1 + δ2Xt + δ3Yt−1 + λut

In the partial adjustment model, the disturbance term in the fitted model is
the same as that in the target relationship, except that it has been multiplied
by a constant, λ. 2
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt* = β1 + β2Xt + ut        Yt − Yt−1 = λ(Yt* − Yt−1)
Yt = λYt* + (1 − λ)Yt−1

Yt = λ(β1 + β2Xt + ut) + (1 − λ)Yt−1
   = λβ1 + λβ2Xt + (1 − λ)Yt−1 + λut
   = δ1 + δ2Xt + δ3Yt−1 + λut

Thus, if the regression model assumptions are valid in the target


relationship, they will also be valid in the fitted relationship.
3
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt* = β1 + β2Xt + ut        Yt − Yt−1 = λ(Yt* − Yt−1)
Yt = λYt* + (1 − λ)Yt−1

Yt = λ(β1 + β2Xt + ut) + (1 − λ)Yt−1
   = λβ1 + λβ2Xt + (1 − λ)Yt−1 + λut
   = δ1 + δ2Xt + δ3Yt−1 + λut

The only problem is the finite sample bias caused by using the lagged
dependent variable as an explanatory variable, and this is usually disregarded
in practice anyway. 4
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt* = β1 + β2Xt + ut        Yt − Yt−1 = λ(Yt* − Yt−1)
Yt = λYt* + (1 − λ)Yt−1

Yt = λ(β1 + β2Xt + ut) + (1 − λ)Yt−1
   = λβ1 + λβ2Xt + (1 − λ)Yt−1 + λut
   = δ1 + δ2Xt + δ3Yt−1 + λut

Of course, if the disturbance term in the target relationship is autocorrelated, it


will be autocorrelated in the fitted relationship. OLS would yield inconsistent
estimates and you should use an AR(1) estimation method instead. 5
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = β1 + β2λXt + β2λ(1 − λ)Xt−1 + β2λ(1 − λ)²Xt−2 + ...
     + β2λ(1 − λ)^(s−1)Xt−s+1 + β2(1 − λ)^s Xᵉt−s+1 + ut

In the case of the adaptive expectations model, we derived two alternative


regression models. One model expresses Y as a function of current and
lagged values of X, enough lags being taken to render negligible the coefficient
of the unobservable variable Xet–s+1. 6
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = β1 + β2λXt + β2λ(1 − λ)Xt−1 + β2λ(1 − λ)²Xt−2 + ...
     + β2λ(1 − λ)^(s−1)Xt−s+1 + β2(1 − λ)^s Xᵉt−s+1 + ut

The disturbance term in the regression model is the same as that in the
original model. So if it satisfies the regression model assumptions in the
original model it will do so in the regression model, which should be fitted
using a standard nonlinear estimation method. 7
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = β1 + β2λXt + β2λ(1 − λ)Xt−1 + β2λ(1 − λ)²Xt−2 + ...
     + β2λ(1 − λ)^(s−1)Xt−s+1 + β2(1 − λ)^s Xᵉt−s+1 + ut

If it is autocorrelated in the original model, it will be autocorrelated in the


regression model. An AR(1) estimation method should be used.
8
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = β1 + β2λXt + β2λ(1 − λ)Xt−1 + β2λ(1 − λ)²Xt−2 + ...
     + β2λ(1 − λ)^(s−1)Xt−s+1 + β2(1 − λ)^s Xᵉt−s+1 + ut

Yt = β1 + β2λXt + (1 − λ)(Yt−1 − β1 − ut−1) + ut
   = β1λ + β2λXt + (1 − λ)Yt−1 + ut − (1 − λ)ut−1
   = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

The other version of the regression model expresses Y as a function of X


and lagged Y. The disturbance term is a compound of ut and ut–1.
9
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Thus if the disturbance term in the original model satisfies the regression
model assumptions, the disturbance term in the regression model will be
subject to MA(1) autocorrelation (first-order moving average autocorrelation).10
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

If you compare the composite disturbance terms for observations t and t – 1,


you will see that they have a component ut–1 in common.
11
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

The combination of moving-average autocorrelation and the presence of the


lagged dependent variable in the regression model causes a violation of
Assumption C.7. 12
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

ut–1 is a component of both Yt–1 and the composite disturbance term.

13
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

Since the current value of the disturbance term is not distributed


independently of the current value of one of the explanatory variables, OLS
estimates will be biased and inconsistent. Under these conditions, the other
regression model should be used instead. 14
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

ut = ρut−1 + εt

However, suppose that the disturbance term in the original model were
subject to AR(1) autocorrelation.
15
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

ut = ρut−1 + εt

ut − (1 − λ)ut−1 = ρut−1 + εt − (1 − λ)ut−1
                 = εt + (ρ + λ − 1)ut−1

Then the composite disturbance term at time t will be as shown.

16
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

ut = ρut−1 + εt

ut − (1 − λ)ut−1 = ρut−1 + εt − (1 − λ)ut−1
                 = εt + (ρ + λ − 1)ut−1

It is thus a composite of the innovation in the AR(1) process at time t and ut–1.
Now, under reasonable assumptions, both ρ and λ should lie between 0 and
1. Hence it is possible that the coefficient of ut–1 may be small enough for the
autocorrelation to be negligible. 17
AUTOCORRELATION, PARTIAL ADJUSTMENT, AND ADAPTIVE EXPECTATIONS

Yt = β1 + β2Xᵉt+1 + ut        Xᵉt+1 − Xᵉt = λ(Xt − Xᵉt)
Xᵉt+1 = λXt + (1 − λ)Xᵉt

Yt = δ1 + δ2Xt + δ3Yt−1 + ut − (1 − λ)ut−1

Yt−1 = δ1 + δ2Xt−1 + δ3Yt−2 + ut−1 − (1 − λ)ut−2

ut = ρut−1 + εt

ut − (1 − λ)ut−1 = ρut−1 + εt − (1 − λ)ut−1
                 = εt + (ρ + λ − 1)ut−1

If that is the case, OLS could be used to fit the regression model after all.
You should, of course, perform a Breusch–Godfrey test to check that there
is no (significant) autocorrelation. 18
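A small simulation makes the point concrete. The sketch below is illustrative only (the parameter values are arbitrary choices, not from the text): it generates the composite disturbance εt + (ρ + λ − 1)ut−1 and shows that its first-order autocorrelation is small when ρ + λ is close to 1.

import numpy as np

rng = np.random.default_rng(0)
rho, lam, n = 0.7, 0.35, 100_000                 # rho + lam = 1.05, so the weight on u_{t-1} is only 0.05
eps = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]               # AR(1) disturbance of the original model
v = eps[1:] + (rho + lam - 1) * u[:-1]           # composite disturbance of the Y/X/lagged-Y regression
print(np.corrcoef(v[1:], v[:-1])[0, 1])          # small first-order autocorrelation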
Introduction to Econometrics
Chapter heading
Autocorrelation
Example: HOUSING DYNAMICS
HOUSING DYNAMICS

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
R-squared 0.998583 Mean dependent var 6.359334
Adjusted R-squared 0.998515 S.D. dependent var 0.437527
S.E. of regression 0.016859 Akaike info criter-5.263574
Sum squared resid 0.011937 Schwarz criterion -5.143130
Log likelihood 121.4304 F-statistic 14797.05
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================

This sequence gives an example of how a direct examination of plots of the


residuals and the data for the variables in a regression model may lead to an
improvement in the specification of the regression model. 1
HOUSING DYNAMICS

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
R-squared 0.998583 Mean dependent var 6.359334
Adjusted R-squared 0.998515 S.D. dependent var 0.437527
S.E. of regression 0.016859 Akaike info criter-5.263574
Sum squared resid 0.011937 Schwarz criterion -5.143130
Log likelihood 121.4304 F-statistic 14797.05
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================

The regression output is that for a logarithmic regression of aggregate


expenditure on housing services on income and relative price for the United
States for the period 1959–2003. The income and price elasticities seem
plausible. 2
HOUSING DYNAMICS

============================================================
Breusch–Godfrey statistic: 20.02
Dependent Variable: LGHOUS
Method: Least Squares
χ²(1) crit, 0.1% = 10.83
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
R-squared 0.998583 Mean dependent var 6.359334
Adjusted R-squared 0.998515 S.D. dependent var 0.437527
S.E. of regression 0.016859 Akaike info criter-5.263574
Sum squared resid 0.011937 Schwarz criterion -5.143130
Log likelihood 121.4304 F-statistic 14797.05
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================

However, the Breusch–Godfrey and Durbin–Watson statistics both indicate


autocorrelation at a high significance level.
3
HOUSING DYNAMICS

[Figure: plot of the residuals, 1959–2003]

The residuals exhibit a classic pattern of strong positive autocorrelation.

4
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, LGDPI, LGPRHOUS, and RESIDS, 1959–2003]

The actual and fitted values of the dependent variable and the series for
income and price have been added to the diagram. The price series was
very flat and so had little influence on the fitted values. It will be ignored in
the discussion that follows. 5
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, LGDPI, LGPRHOUS, and RESIDS, 1959–2003]

There was a very large negative residual in 1973. We will enlarge this part of
the diagram and take a closer look.
6
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1971–1975]

In 1973, income (right scale) grew unusually rapidly. The fitted value of
housing expenditure (left scale, with actual value) accordingly rose above its
trend. 7
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1971–1975]

This boom was stopped in its tracks by the first oil shock. Income actually
declined in 1974, the only fall in the entire sample period.
8
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1971–1975]

As a consequence, the fitted value of housing expenditure would also have


fallen in 1974. In actual fact it rose a little because the real price of housing
fell relatively sharply in 1974. 9
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1971–1975]

However, the actual value of housing maintained its previous trend in those
two years, responding not at all to the short-run variations in the growth of
income. This accounts for the gap that opened up in 1973, and the large
negative residual in that year. 10
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, LGDPI, LGPRHOUS, and RESIDS, 1959–2003]

There was a similar large negative residual in 1984. We will enlarge this part
of the diagram.
11
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1982–1987]

Income grew unusually rapidly in 1984. As a consequence, the fitted value


of housing also grew rapidly. However the actual value of housing grew at
much the same rate as previously. Hence the negative residual. 12
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1982–1987]

In the years immediately after 1984, income grew at a slower rate. Accordingly
the fitted value of housing grew at a slower rate. But the actual value of
housing grew at much the same rate as before, turning the negative residual
in 1984 into a large positive one in 1987. 13
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, LGDPI, LGPRHOUS, and RESIDS, 1959–2003]

Finally, we shall take a closer look at the series of positive residuals from
1960 to 1965.
14
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1959–1966]

In the first part of this subperiod, income was growing relatively slowly.
Towards the end, it started to accelerate. The fitted values followed suit.
15
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1959–1966]

However, the actual values maintained a constant trend. Because it was


unresponsive to the variations in the growth rate of income, a gap opened
up in the middle, giving rise to the positive residuals. 16
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1959–1966]

In this case, as in the previous two, the residuals are not being caused by
autocorrelation. If that were the case, the actual values should be relatively
volatile, compared with the trend of the fitted values. 17
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1959–1966]

What we see here is exactly the opposite. The actual values have a very
stable trend, while the fitted values respond, as they must, to short-run
variations in the growth of income. The pattern we see in the residuals is
caused by the nonresponse of the actual values. 18
HOUSING DYNAMICS

[Figure: LGHOUS, FITTED, and LGDPI, 1959–1966]

One way to model the inertia in the growth rate of the actual values is to add
a lagged dependent variable to the regression model.
19
HOUSING DYNAMICS

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

We are now hypothesizing that current expenditure on housing services


depends on previous expenditure as well as income and price. Here is the
regression with the lagged dependent variable added to the model. 20
HOUSING DYNAMICS

============================================================
Breusch–Godfrey statistic: 0.20
Dependent Variable: LGHOUS
Method: Least Squares
χ²(1) crit, 5% = 3.84
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

We should check for any evidence of autocorrelation. The Breusch–Godfrey


test statistic is 0.20, not remotely significant, so we do not reject the null
hypothesis of no autocorrelation. 21
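As a cross-check outside EViews, the same test can be run with statsmodels. This is a sketch only: the DataFrame df and its column names are assumptions, not part of the text.

import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

df = df.assign(LGHOUS_1=df["LGHOUS"].shift(1)).dropna()
X = sm.add_constant(df[["LGDPI", "LGPRHOUS", "LGHOUS_1"]])
fit = sm.OLS(df["LGHOUS"], X).fit()
lm, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(fit, nlags=1)
print(lm, lm_pval)   # an insignificant statistic is consistent with no remaining autocorrelation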
HOUSING DYNAMICS

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

The new equation indicates that current expenditure on housing services is


determined only partly by current income and price. Previous expenditure is
clearly very important as well. 22
HOUSING DYNAMICS

============================================================
Breusch–Godfrey statistic: 20.02
Dependent Variable: LGHOUS
χ²(1) crit, 0.1% = 10.83
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================

============================================================
Dependent Variable: LGHOUS
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

We now have an explanation for the apparent autocorrelation exhibited by the


residuals in the plot, the resulting high value of the Breusch–Godfrey statistic,
and the low value of the Durbin‒Watson d statistic, in the original regression. 23
HOUSING DYNAMICS

============================================================
Breusch–Godfrey statistic: 20.02
Dependent Variable: LGHOUS
χ²(1) crit, 0.1% = 10.83
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================

============================================================
Dependent Variable: LGHOUS
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

They were attributable to the omission of an important variable, rather than


to the disturbance term being subject to an AR(1) process.
24
HOUSING DYNAMICS

============================================================
Dependent Variable: LGHOUS
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================

============================================================
Dependent Variable: LGHOUS
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

Note that the income and price elasticities are much lower than in the original
regression. We have already seen the reason for this in the sequence that
discussed the dynamics inherent in a partial adjustment model. 25
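As a rough check (these figures are computed here, not quoted from the output): dividing each short-run coefficient by one minus the coefficient of LGHOUS(–1) gives implied long-run elasticities of about 0.283/(1 − 0.707) ≈ 0.97 for income and −0.117/(1 − 0.707) ≈ −0.40 for price, so the long-run income elasticity is again close to the static estimate.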
Introduction to Econometrics
Chapter heading

COMMON FACTOR TEST


COMMON FACTOR TEST

Yt = β1 + β2Xt + ut
ut = ρut−1 + εt

In a previous sequence it was shown that, if you have a simple regression


model with Y depending on X and a disturbance term u subject to an AR(1)
process, ... 1
COMMON FACTOR TEST

Yt = β1 + β2Xt + ut
ut = ρut−1 + εt

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

... the model can be rewritten with Yt depending on Xt, Yt–1, Xt–1, and a
disturbance term t that is not subject to autocorrelation.
2
COMMON FACTOR TEST

Yt = β1 + β2Xt + ut
ut = ρut−1 + εt

Yt = β1(1 − ρ) + ρYt−1 + β2Xt − β2ρXt−1 + εt

This model is nonlinear in parameters since the coefficient of Xt–1 is equal to


minus the product of the coefficients of Xt and Yt–1.
3
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_t + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_t - \rho\beta_2 X_{t-1} + \varepsilon_t$

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_t + \lambda_3 X_{t-1} + \varepsilon_t$

It can be thought of as a special case of a more general model involving the


same variables.
4
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_t + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_t - \rho\beta_2 X_{t-1} + \varepsilon_t$

Unrestricted model

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_t + \lambda_3 X_{t-1} + \varepsilon_t$

Restriction embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2$

It is special in two senses. First, it imposes the restriction already noted.


Formally, it is the restricted version of the more general model.
5
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_t + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_t - \rho\beta_2 X_{t-1} + \varepsilon_t$

Unrestricted model

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_t + \lambda_3 X_{t-1} + \varepsilon_t$

Restriction embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2$

Second, it imposes an interpretation on the coefficient of Yt–1 that may not


be valid.
6
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

If the original specification is a multiple regression model with two


explanatory variables, and if the disturbance term is subject to an AR(1)
process, the model to be fitted is a little more complex. 7
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

The AR(1) special case is again a restricted version of a more general model.

8
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

In this case, however, the restricted version incorporates two restrictions.

9
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

In general, the number of restrictions in the AR(1) model is equal to the


number of explanatory variables.
10
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

One can, and one should, test the validity of the restrictions. The test is
known as the common factor test.
11
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model ($RSS_R$)

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model ($RSS_U$)

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

The test involves a comparison of RSSR and RSSU, the residual sums of
squares in the restricted and unrestricted specifications.
12
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model ($RSS_R$)

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model ($RSS_U$)

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

RSSR can never be smaller than RSSU and it will in practice be greater,
because imposing a restriction in general leads to some loss of goodness of
fit. The question is whether the loss of goodness of fit is significant. 13
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model ($RSS_R$)

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model ($RSS_U$)

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

If it is, this is an indication that imposing a restriction has caused a


distortion, and so we should conclude that the restrictions are invalid.
14
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model ($RSS_R$)

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model ($RSS_U$)

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

Test statistic:  $n \log \dfrac{RSS_R}{RSS_U}$

Because the restrictions are nonlinear, the F test is inappropriate. Instead,


we construct the test statistic shown above. n is the number of
observations in the regression. log is the natural logarithm. 15
COMMON FACTOR TEST

$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + u_t$
$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model ($RSS_R$)

$Y_t = \beta_1(1 - \rho) + \rho Y_{t-1} + \beta_2 X_{2t} - \rho\beta_2 X_{2,t-1} + \beta_3 X_{3t} - \rho\beta_3 X_{3,t-1} + \varepsilon_t$

Unrestricted model ($RSS_U$)

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

Restrictions embodied in the AR(1) process

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

Test statistic:  $n \log \dfrac{RSS_R}{RSS_U}$

Under the null hypothesis that the restrictions are valid, the test statistic has
a χ² (chi-squared) distribution with degrees of freedom equal to the number
of restrictions. It is in principle a large-sample test. 16
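If your software does not report this statistic directly, it is easy to compute from the two residual sums of squares. The following Python sketch (the function name and the use of scipy are illustrative assumptions, not part of the text) shows the calculation and the comparison with the chi-squared distribution:

import numpy as np
from scipy.stats import chi2

def common_factor_test(rss_restricted, rss_unrestricted, n, n_restrictions):
    """Common factor test statistic n*log(RSS_R/RSS_U) with its chi-squared p-value."""
    stat = n * np.log(rss_restricted / rss_unrestricted)
    p_value = chi2.sf(stat, df=n_restrictions)   # upper-tail probability
    return stat, p_value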
COMMON FACTOR TEST

============================================================
Breusch–Godfrey statistic = 20.02
Dependent Variable: LGHOUS
Method: Least Squares
χ²(1) crit, 0.1% = 10.83
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
R-squared 0.998583 Mean dependent var 6.359334
Adjusted R-squared 0.998515 S.D. dependent var 0.437527
S.E. of regression 0.016859 Akaike info criter-5.263574
Sum squared resid 0.011937 Schwarz criterion -5.143130
Log likelihood 121.4304 F-statistic 14797.05
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================
dL = 1.24 (1%, n = 45, k = 3)

We will perform the test for the logarithmic regression of expenditure on


housing services on income and relative price. The output from the OLS
regression is shown above. The Breusch–Godfrey and Durbin–Watson
statistics indicate severe positive autocorrelation. 17
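For readers replicating the example in Python, a minimal sketch of the static regression and its autocorrelation diagnostics is given below. The DataFrame df and its column names are assumptions made for illustration; the slide reports a Durbin–Watson statistic of 0.63 and a Breusch–Godfrey statistic of 20.02 for this regression.

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

def static_model_diagnostics(df: pd.DataFrame):
    """Fit LGHOUS on LGDPI and LGPRHOUS by OLS and report autocorrelation diagnostics."""
    X = sm.add_constant(df[["LGDPI", "LGPRHOUS"]])
    fit = sm.OLS(df["LGHOUS"], X).fit()
    dw = durbin_watson(fit.resid)                                  # slide reports about 0.63
    bg_lm, bg_pvalue, _, _ = acorr_breusch_godfrey(fit, nlags=1)   # slide reports about 20.0
    return fit, dw, bg_lm, bg_pvalue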
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.154815 0.354989 0.436111 0.6651
LGDPI 1.011295 0.021830 46.32642 0.0000
LGPRHOUS -0.478070 0.091594 -5.219437 0.0000
AR(1) 0.719102 0.115689 6.215836 0.0000
============================================================
R-squared 0.999205 Mean dependent var 6.379059
Adjusted R-squared 0.999145 S.D. dependent var 0.421861
S.E. of regression 0.012333 Akaike info criter-5.866567
Sum squared resid 0.006084 Schwarz criterion -5.704368
Log likelihood 133.0645 F-statistic 16757.24
Durbin-Watson stat 1.901081 Prob(F-statistic) 0.000000
============================================================

Here is the result of fitting the same model using an AR(1) estimation
method. We make a note of the residual sum of squares.
18
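Most packages offer a dedicated AR(1) estimator (Cochrane–Orcutt, Prais–Winsten, or maximum likelihood). If one is not available, the restricted specification can be fitted by a simple Hildreth–Lu style grid search over ρ, as in the sketch below; the names are illustrative, and the resulting RSS should be close to, though not necessarily identical to, the figure produced by a package's own AR(1) routine.

import numpy as np

def fit_ar1_grid(y, X, rhos=np.linspace(-0.99, 0.99, 199)):
    """Hildreth–Lu grid search for the AR(1)-restricted model.

    For each candidate rho, quasi-difference y and the regressors X,
    run OLS on the transformed data, and keep the rho with the smallest RSS.
    y is a 1-D array; X is a 2-D array of regressors without a constant column.
    """
    best = None
    for rho in rhos:
        y_star = y[1:] - rho * y[:-1]
        X_star = X[1:] - rho * X[:-1]
        Z = np.column_stack([np.ones(len(y_star)), X_star])   # intercept estimates beta_1*(1 - rho)
        beta, *_ = np.linalg.lstsq(Z, y_star, rcond=None)
        rss = float(((y_star - Z @ beta) ** 2).sum())
        if best is None or rss < best["rss"]:
            best = {"rss": rss, "rho": rho, "coeffs": beta}
    return best   # best["rss"] plays the role of RSS_R in the common factor test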
COMMON FACTOR TEST

$LGHOUS_t = \beta_1 + \beta_2\, LGDPI_t + \beta_3\, LGPRHOUS_t + u_t$

$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model
$LGHOUS_t = \beta_1(1 - \rho) + \rho\, LGHOUS_{t-1}$
$\qquad + \beta_2\, LGDPI_t - \rho\beta_2\, LGDPI_{t-1}$
$\qquad + \beta_3\, LGPRHOUS_t - \rho\beta_3\, LGPRHOUS_{t-1} + \varepsilon_t$

We are fitting the model shown above, ensuring that the parameter
estimates conform to the two restrictions, one involving the income
variables ... 19
COMMON FACTOR TEST

$LGHOUS_t = \beta_1 + \beta_2\, LGDPI_t + \beta_3\, LGPRHOUS_t + u_t$

$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model
$LGHOUS_t = \beta_1(1 - \rho) + \rho\, LGHOUS_{t-1}$
$\qquad + \beta_2\, LGDPI_t - \rho\beta_2\, LGDPI_{t-1}$
$\qquad + \beta_3\, LGPRHOUS_t - \rho\beta_3\, LGPRHOUS_{t-1} + \varepsilon_t$

... and the other involving the price variables.

20
COMMON FACTOR TEST

$LGHOUS_t = \beta_1 + \beta_2\, LGDPI_t + \beta_3\, LGPRHOUS_t + u_t$

$u_t = \rho u_{t-1} + \varepsilon_t$

Restricted model
$LGHOUS_t = \beta_1(1 - \rho) + \rho\, LGHOUS_{t-1}$
$\qquad + \beta_2\, LGDPI_t - \rho\beta_2\, LGDPI_{t-1}$
$\qquad + \beta_3\, LGPRHOUS_t - \rho\beta_3\, LGPRHOUS_{t-1} + \varepsilon_t$
Unrestricted model
$LGHOUS_t = \lambda_0 + \lambda_1\, LGHOUS_{t-1}$
$\qquad + \lambda_2\, LGDPI_t + \lambda_3\, LGDPI_{t-1}$
$\qquad + \lambda_4\, LGPRHOUS_t + \lambda_5\, LGPRHOUS_{t-1} + \varepsilon_t$

We use OLS to fit the model with no restrictions on the parameters.

21
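The unrestricted specification is an ordinary OLS regression that simply includes the lagged variables. A minimal Python sketch, again with assumed column names, is:

import pandas as pd
import statsmodels.api as sm

def fit_unrestricted(df: pd.DataFrame):
    """OLS of LGHOUS on its own lag and on current and lagged income and price."""
    d = pd.DataFrame({
        "LGHOUS":     df["LGHOUS"],
        "LGHOUS_1":   df["LGHOUS"].shift(1),
        "LGDPI":      df["LGDPI"],
        "LGDPI_1":    df["LGDPI"].shift(1),
        "LGPRHOUS":   df["LGPRHOUS"],
        "LGPRHOUS_1": df["LGPRHOUS"].shift(1),
    }).dropna()                                   # the first observation is lost to the lag
    X = sm.add_constant(d.drop(columns="LGHOUS"))
    fit = sm.OLS(d["LGHOUS"], X).fit()
    return fit                                    # fit.ssr is RSS_U for the common factor test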
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

Here is the output from the unrestricted specification.

22
COMMON FACTOR TEST

============================================================
Breusch–Godfrey statistic = 0.29
Dependent Variable: LGHOUS
Method: Least Squares                  χ²(1) crit, 5% = 3.84
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

Before we do anything else, we should check that the unrestricted model is


not subject to autocorrelation. The Breusch–Godfrey statistic for AR(1)
autocorrelation is 0.29, not remotely significant. 23
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

Before performing the common factor test, it is a good idea to eyeball the
coefficients in the unrestricted regression to see if they appear to conform
to the restrictions. 24
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares –0.73 x 0.28 = –0.20
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

In this case, the restriction involving the income coefficients does not
appear to be satisfied. Minus the product of 0.73 and 0.28 is –0.20, but the
coefficient of lagged income is –0.01. 25
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares –0.73 x –0.23 = 0.17
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

On the price side, the coefficients do appear to conform quite closely.


Minus the product of 0.73 and –0.23 is 0.17, not far from the coefficient of
lagged price. 26
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

The residual sum of squares was 0.006084 in the AR(1) regression and it is
0.001456 in the OLS unrestricted version.
27
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
n log(RSS_R / RSS_U) = 44 log(0.006084 / 0.001456) = 62.9
χ²(2) crit, 0.1% = 13.8
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

The test statistic is 62.9. The critical value of χ² at the 0.1 percent level, with
two degrees of freedom, is 13.8.
28
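As a check on the arithmetic, plugging the two residual sums of squares into the test statistic reproduces the figure on the slide (scipy is used here only to attach a p-value):

import numpy as np
from scipy.stats import chi2

rss_r, rss_u, n = 0.006084, 0.001456, 44
stat = n * np.log(rss_r / rss_u)   # 44 * ln(4.18), approximately 62.9
p_value = chi2.sf(stat, df=2)      # far below 0.001, so the restrictions are rejected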
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
n log(RSS_R / RSS_U) = 44 log(0.006084 / 0.001456) = 62.9
χ²(2) crit, 0.1% = 13.8
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

We therefore reject the restrictions. We should choose the more general


model instead of assuming that the disturbance term is subject to an AR(1)
process. 29
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

We conclude that the apparent autocorrelation in the original OLS


regression of expenditure on housing on income and price was due to the
omission of the lagged variables. 30
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

Note that in this example the coefficients of lagged income and price are not
significant. We will investigate whether we can drop them.
31
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================
R-squared 0.999810 Mean dependent var 6.379059
Adjusted R-squared 0.999785 S.D. dependent var 0.421861
S.E. of regression 0.006189 Akaike info criter-7.205830
Sum squared resid 0.001456 Schwarz criterion -6.962531
Log likelihood 164.5282 F-statistic 39944.40
Durbin-Watson stat 1.763676 Prob(F-statistic) 0.000000
============================================================

The fact that their coefficients have insignificant t statistics is not enough.
We also need to perform an F test of their joint explanatory power. We make
a note of RSS when they are included. 32
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

We also make a note of RSS when they are dropped. The null hypothesis for
the F test is that the coefficients of lagged income and lagged price are both
equal to zero. The alternative hypothesis is that one or both are nonzero. 33
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
F(2,38) = [(0.001566 − 0.001456)/2] / [0.001456/38] = 1.44
F(2,35) crit, 5% = 3.27
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

The F statistic is 1.44. The critical value at the 5% significance level with 2
and 35 degrees of freedom is 3.27. The critical value with 2 and 38 degrees
of freedom must be lower. 34
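The calculation behind the F statistic uses the residual sums of squares from the two regression outputs above; a quick check in Python:

from scipy.stats import f as f_dist

rss_u, rss_r = 0.001456, 0.001566   # with and without the lagged income and price terms
n, k_u, q = 44, 6, 2                # observations, parameters in the larger model, restrictions

F = ((rss_r - rss_u) / q) / (rss_u / (n - k_u))   # about 1.44
p_value = f_dist.sf(F, q, n - k_u)                # well above 0.05, so we do not reject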
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
F(2,38) = [(0.001566 − 0.001456)/2] / [0.001456/38] = 1.44
F(2,35) crit, 5% = 3.27
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

Hence we do not reject the null hypothesis. It appears that we can drop the
lagged variables.
35
COMMON FACTOR TEST

============================================================
Breusch–Godfrey statistic = 0.20
Dependent Variable: LGHOUS
Method: Least Squares
χ²(1) crit, 5% = 3.84
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

The Breusch–Godfrey statistic indicates that the null hypothesis of no


autocorrelation would also not be rejected in this specification of the model.
36
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

Thus we conclude that the omission of the lagged dependent variable was
responsible for the apparent autocorrelation in the original OLS regression.
37
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================

============================================================
Dependent Variable: LGHOUS
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.041458 0.065137 0.636465 0.5283
LGDPI 0.275527 0.067914 4.056970 0.0002
LGPRHOUS -0.229086 0.075499 -3.034269 0.0043
LGHOUS(-1) 0.725893 0.058485 12.41159 0.0000
LGDPI(-1) -0.010625 0.086737 -0.122502 0.9031
LGPRHOUS(-1) 0.126270 0.084296 1.497928 0.1424
============================================================

Assuming that the lagged income and price variables really are redundant,
we obtain an increase in efficiency by dropping them, as reflected in the
smaller standard errors. 38
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

The final model, incidentally, is exactly the same as that in the previous
sequence. In that sequence we were led to this specification by examining
the plots of the variables and the residuals. 39
COMMON FACTOR TEST

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample(adjusted): 1960 2003
Included observations: 44 after adjusting endpoints
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.073957 0.062915 1.175499 0.2467
LGDPI 0.282935 0.046912 6.031246 0.0000
LGPRHOUS -0.116949 0.027383 -4.270880 0.0001
LGHOUS(-1) 0.707242 0.044405 15.92699 0.0000
============================================================
R-squared 0.999795 Mean dependent var 6.379059
Adjusted R-squared 0.999780 S.D. dependent var 0.421861
S.E. of regression 0.006257 Akaike info criter-7.223711
Sum squared resid 0.001566 Schwarz criterion -7.061512
Log likelihood 162.9216 F-statistic 65141.75
Durbin-Watson stat 1.810958 Prob(F-statistic) 0.000000
============================================================

In this sequence we have arrived at the same conclusion by performing the


common factor test, which revealed that the AR(1) specification was
inadequate, and cleaning up afterwards. 40
Introduction to Econometrics
Chapter heading
DYNAMIC MODEL SPECIFICATION
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

Methodologically, in developing a regression specification that survives the


tests to which it is subjected, we have followed what is described as a
specific-to-general procedure for model selection, and it is open to serious
criticism. 1
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

If you start with a poorly specified model, in our case the static model, the
various diagnostic test statistics are likely to be invalidated. Thus there is a
risk that the model may survive the tests and appear to be satisfactory, even
though it is misspecified. 2
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

To avoid this danger, you should in principle adopt a general-to-specific


approach. You should start with a model that is sufficiently general to avoid
potential problems of underspecification, and then see if you can
legitimately simplify it. 3
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

In our case, the starting point should be the model with all the lagged
variables.
4
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

$\lambda_1 = \lambda_3 = \lambda_5 = 0$

Having fitted it, we might be able to simplify it to the static model, if the
lagged variables individually and as a group do not have significant
explanatory power. 5
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

$\lambda_3 = -\lambda_1 \lambda_2 \qquad \lambda_5 = -\lambda_1 \lambda_4$

If the lagged variables do have significant explanatory power, we could


perform a common factor test and see if we could simplify the model to an
AR(1) specification. 6
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

$Y_t = \lambda_0 + \lambda_1 Y_{t-1} + \lambda_2 X_{2t} + \lambda_3 X_{2,t-1} + \lambda_4 X_{3t} + \lambda_5 X_{3,t-1} + \varepsilon_t$

$\lambda_3 = \lambda_5 = 0$

Sometimes we may find that a model with a lagged dependent variable is an


adequate dynamic specification, if the other lagged variables lack significant
explanatory power. 7
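As a schematic illustration of this testing sequence, assuming the residual sums of squares of the general model and of each candidate simplification have already been obtained (for example with sketches like those in the previous section), the comparisons might be organized as follows:

import numpy as np
from scipy.stats import chi2, f as f_dist

def compare_simplifications(rss_general, rss_static, rss_ar1, rss_ldv, n, k_general):
    """Test three candidate simplifications of the general model with lagged variables.

    rss_static : lambda_1 = lambda_3 = lambda_5 = 0  (3 linear restrictions, F test)
    rss_ldv    : lambda_3 = lambda_5 = 0             (2 linear restrictions, F test)
    rss_ar1    : AR(1) common factor restrictions    (2 nonlinear restrictions, chi-squared test)
    """
    results = {}
    for name, rss_r, q in [("static model", rss_static, 3),
                           ("lagged dependent variable", rss_ldv, 2)]:
        F = ((rss_r - rss_general) / q) / (rss_general / (n - k_general))
        results[name] = (F, f_dist.sf(F, q, n - k_general))
    cf_stat = n * np.log(rss_ar1 / rss_general)
    results["AR(1) model"] = (cf_stat, chi2.sf(cf_stat, 2))
    return results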
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

In the case of the housing regression, we have done exactly the opposite.
We started with a crude static model.
8
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

We switched to an AR(1) specification.

9
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

We turned to the more general model when the common factor test revealed
that the AR(1) specification was inadequate.
10
DYNAMIC MODEL SPECIFICATION

General model with lagged variables
        ↓                      ↓                      ↓
  Static model          AR(1) model          Model with lagged dependent variable

Finally, we ended up with a model with a lagged dependent variable, perhaps
being a little lucky to do so.
11
