STK 310 (Autocorrelation / Serial correlation (AC/SC))

Some ideas:
The term autocorrelation may be defined as “correlation between members of series of
observations ordered in time [as in time series data] or space [as in cross-sectional data].”
It is a problem in the structure of the error terms.
It is now common practice to treat the terms autocorrelation and serial correlation as
synonyms, although some authors prefer to distinguish between them. Such a distinction
may be useful, but in this course we treat the two terms synonymously.

Consider the (usual) simple linear regression model:

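$$Y_t = \beta_1 + \beta_2 X_t + u_t \tag{1}$$
$$\hat{Y}_t = \hat{\beta}_1 + \hat{\beta}_2 X_t \tag{2}$$
$$\hat{u}_t = Y_t - \hat{Y}_t \tag{3}$$
$$E(u_i u_j) = 0 \quad \text{for } i \neq j \tag{4}$$
$$\operatorname{Var}(u_t) = \sigma^2 \tag{5}$$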

Consider the autocorrelated model:

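$$Y_t = \beta_1 + \beta_2 X_t + u_t \tag{6}$$
$$u_t = \rho u_{t-1} + \nu_t \tag{7}$$
$$\hat{Y}_t = \hat{\beta}_1 + \hat{\beta}_2 X_t$$
$$\hat{u}_t = Y_t - \hat{Y}_t$$
$$E(u_i u_j) \neq 0 \quad \text{for } i \neq j \tag{8}$$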

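Reminder: with respect to equation (6),

$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}$$

where the lower-case $x_i$ and $y_i$ denote deviations from their sample means. For equation (7), using the usual OLS estimation,

$$\hat{\rho} = \frac{\sum_{t=2}^{n} \hat{u}_t \hat{u}_{t-1}}{\sum_{t=2}^{n} \hat{u}_{t-1}^2} \tag{10}$$
$$\hat{\rho} \approx \frac{\sum_{t=2}^{n} \hat{u}_t \hat{u}_{t-1}}{\sum_{t=1}^{n} \hat{u}_t^2} \tag{11}$$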
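As a minimal sketch of estimators (10) and (11), the Python snippet below computes both from a list of residuals. Python is used here only for illustration; the function name `estimate_rho` and the residual values are hypothetical and do not come from the course data.

```python
def estimate_rho(residuals):
    """Return (rho_ols, rho_approx) for a sequence of OLS residuals.

    rho_ols    follows equation (10): the denominator sums u_{t-1}^2 for t = 2..n.
    rho_approx follows equation (11): the denominator sums u_t^2 for t = 1..n.
    """
    u = list(residuals)
    if len(u) < 2:
        raise ValueError("need at least two residuals")
    # Numerator shared by (10) and (11): sum of u_t * u_{t-1} for t = 2..n.
    cross = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    denom_ols = sum(v * v for v in u[:-1])  # u_1^2 + ... + u_{n-1}^2
    denom_all = sum(v * v for v in u)       # u_1^2 + ... + u_n^2
    return cross / denom_ols, cross / denom_all


# Hypothetical residuals, chosen only to illustrate the arithmetic.
example_residuals = [1.0, 2.0, 1.0, -1.0, -2.0, -1.0]
rho_ols, rho_approx = estimate_rho(example_residuals)
print(round(rho_ols, 4), round(rho_approx, 4))
```

The two values differ only through the denominator, so they move closer together as the number of observations grows.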


Important difference?

$u_t = \rho u_{t-1} + \nu_t$ indicates that the error terms are not independent and hence are
correlated (compare equations (4) and (8)).

How does this influence the traditional OLS results?


[Figure: no serial correlation]

[Figure: serial correlation]

See below and class example for various autocorrelation cases.

• No autocorrelation [figure]

• Positive autocorrelation [figure]

• Negative autocorrelation [figure]

Reasons for SC (see textbook/class discussion for more detail)

1. Inertia

2. Specification bias

3. Cobweb phenomenon

4. Lags

5. Data issues

6. Data transformations

Consequences of SC (see textbook/class discussion for more detail)

1. The OLS estimators of the regression parameters remain unbiased and consistent.

2. The OLS estimators are no longer BLUE (they are not efficient).

3. The estimated residual variance $\hat{\sigma}^2$ is biased.

4. $R^2$ is likely to be overestimated.

5. The usual inference is not valid.

6. Inference might therefore be misleading.

GLS is used to estimate the model.

Detection of SC
1. Graphical methods

• Plot of $\hat{u}_t$ against time

• Plot of $\hat{u}_t$ against $\hat{u}_{t-1}$, for first-order serial correlation.

2. The runs test.


See the positive serial correlation graph above. Initially, we have several residuals that are
negative, then there is a series of positive residuals, and then there are several residuals that
are negative. If these residuals were purely random, could we observe such a pattern?
Intuitively, it seems unlikely.
See the negative serial correlation graph above. We have very frequent jumps between
positive and negative residuals. If these residuals were purely random, could we observe
such a pattern? Intuitively, it seems unlikely.
This intuition can be checked by the so-called runs test.
A run is defined as an uninterrupted sequence of one symbol or attribute, such as + or −. We also
define the length of a run as the number of elements in it.


• State the null and alternative hypotheses. H0: the residuals are random (no serial correlation); H1: the residuals are not random.

• Determine
  – $N$, the total number of observations ($N = N_1 + N_2$).
  – $N_1$, the number of + symbols (the number of positive residuals).
  – $N_2$, the number of − symbols (the number of negative residuals).
  – $R$, the number of runs.
• Under H0, the number of runs is asymptotically normally distributed with mean and variance:

$$E(R) = \frac{2 N_1 N_2}{N} + 1 \tag{12}$$
$$\sigma_R^2 = \frac{2 N_1 N_2 (2 N_1 N_2 - N)}{N^2 (N - 1)} \tag{13}$$

• Determine the 95% confidence interval

$$E(R) \pm 1.96\,\sigma_R \tag{14}$$

• Do not reject the null hypothesis of randomness with 95% confidence if R, the number
  of runs, lies in the preceding confidence interval; reject the null hypothesis if the
  observed R lies outside these limits.
  (Note: You can choose any level of confidence you want.)
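The steps of the runs test can be sketched in Python as follows. This is an illustrative sketch only: the function name `runs_test`, the choice to skip residuals that are exactly zero, and the example sign pattern are assumptions, not part of the course material.

```python
import math


def runs_test(residuals, z=1.96):
    """Runs test on the signs of the residuals (exact zeros are skipped)."""
    signs = [r > 0 for r in residuals if r != 0]
    n1 = sum(signs)            # N1: number of + residuals
    n2 = len(signs) - n1       # N2: number of - residuals
    n = n1 + n2                # N
    if n1 == 0 or n2 == 0:
        raise ValueError("need both positive and negative residuals")
    # A new run starts every time the sign changes.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2 * n1 * n2 / n + 1                                   # equation (12)
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))   # equation (13)
    half_width = z * math.sqrt(var)
    lower, upper = mean - half_width, mean + half_width          # equation (14)
    return {
        "N1": n1, "N2": n2, "runs": runs,
        "mean": mean, "variance": var,
        "interval": (lower, upper),
        "reject_H0": not (lower <= runs <= upper),
    }


# Hypothetical pattern: 9 negative, then 21 positive, then 10 negative residuals.
example = [-1.0] * 9 + [1.0] * 21 + [-1.0] * 10
result = runs_test(example)
print(result)
```

With only three runs in forty observations, the observed R falls well below the lower limit of the interval, which is the signature of positive serial correlation described above.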


3. The Durbin-Watson (DW) test, d statistic


Calculate the DW test statistic

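$$
\begin{aligned}
d &= \frac{\sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{n} \hat{u}_t^2} && (15) \\
  &= \frac{\sum \hat{u}_t^2 + \sum \hat{u}_{t-1}^2 - 2 \sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2} \\
  &\approx \frac{2 \sum \hat{u}_t^2 - 2 \sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2} \\
  &= \frac{2 \left( \sum \hat{u}_t^2 - \sum \hat{u}_t \hat{u}_{t-1} \right)}{\sum \hat{u}_t^2} \\
  &= 2 \left( 1 - \frac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2} \right) \\
  &\approx 2 (1 - \hat{\rho}) \quad \text{since } \hat{\rho} \approx \frac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2} && (16)
\end{aligned}
$$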

ρ̂ is the estimated autocorrelation coefficient. Since −1 ≤ ρ̂ ≤ 1 we have 0 ≤ d ≤ 4.
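A small Python sketch may help to see how close the approximation in (16) is. The function names and the residual values below are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def durbin_watson(residuals):
    """Durbin-Watson d statistic, equation (15)."""
    u = list(residuals)
    if len(u) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((u[t] - u[t - 1]) ** 2 for t in range(1, len(u)))
    denominator = sum(v * v for v in u)
    return numerator / denominator


def rho_approx(residuals):
    """Approximate rho-hat: sum of u_t * u_{t-1} over sum of u_t^2."""
    u = list(residuals)
    cross = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    return cross / sum(v * v for v in u)


# Hypothetical residuals with a positively correlated pattern.
example_residuals = [1.0, 2.0, 1.0, -1.0, -2.0, -1.0]
d = durbin_watson(example_residuals)
approx = 2 * (1 - rho_approx(example_residuals))
# The gap between the two equals (u_1^2 + u_n^2) / sum(u_t^2), the end terms
# that the approximation step in the derivation ignores.
print(round(d, 4), round(approx, 4))
```

In this tiny sample the end terms matter; with a long series they usually become negligible relative to the sum of squared residuals, unless the first or last residual is unusually large.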


Assumptions of the DW test

(a) The model should include an intercept term.
(b) The explanatory (feature) variables are non-stochastic.
(c) $u_t$ should be normally distributed.
(d) The errors follow a first-order autoregressive scheme, as in equation (7).
(e) The regression model should not include lagged values of the dependent variable as
explanatory variables.

Also see below.

Use and interpretation of the DW table

H0: no positive serial correlation
H0*: no negative serial correlation

Range of d                       Decision                   Example 1         Example 2
0 ≤ d < d_lower                  Reject H0                  0 to 0.879        0 to 1.201
d_lower ≤ d ≤ d_upper            Inconclusive               0.879 to 1.320    1.201 to 1.411
d_upper < d < 4 − d_upper        Do not reject H0 or H0*    1.320 to 2.680    1.411 to 2.589
4 − d_upper ≤ d ≤ 4 − d_lower    Inconclusive               2.680 to 3.121    2.589 to 2.799
4 − d_lower < d ≤ 4              Reject H0*                 3.121 to 4        2.799 to 4

Example 1: n = 10, k = 2. Read off the two values from the table: d_lower = 0.879, d_upper = 1.320.
Example 2: n = 20, k = 2. Read off the two values from the table: d_lower = 1.201, d_upper = 1.411.

Remember: $d \approx 2(1 - \hat{\rho})$, with $\hat{\rho} \approx \sum \hat{u}_t \hat{u}_{t-1} / \sum \hat{u}_t^2$.
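The decision regions of the DW table can be written as a small Python helper. This is a sketch under assumptions: the function name `dw_decision` is hypothetical, the handling of values exactly on a boundary follows one common convention, and the d values in the usage lines are made up. Only the bounds 0.879 and 1.320 come from Example 1.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using the tabulated bounds."""
    if not 0 <= d <= 4:
        raise ValueError("d must lie between 0 and 4")
    if d < d_lower:
        return "reject H0 (evidence of positive serial correlation)"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "do not reject H0 or H0*"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "reject H0* (evidence of negative serial correlation)"


# Bounds from Example 1 (n = 10, k = 2); the d values are hypothetical.
for d_value in (0.5, 1.0, 2.0, 3.0, 3.5):
    print(d_value, "->", dw_decision(d_value, 0.879, 1.320))
```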


Remedial measures
1. Check for model mis-specification
2. Generalised Least Squares (GLS)
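Consider the two-variable regression model with first-order serial correlation:

$$Y_t = \beta_1 + \beta_2 X_t + u_t \tag{17}$$
$$u_t = \rho u_{t-1} + \varepsilon_t \tag{18}$$

with $-1 < \rho < 1$ and error terms $\varepsilon_t$ that are i.i.d. $N(0, \sigma_*^2)$.
If equation (17) holds true at time $t$, it also holds true at time $t - 1$. Hence we have

$$Y_t = \beta_1 + \beta_2 X_t + u_t \tag{19}$$
$$Y_{t-1} = \beta_1 + \beta_2 X_{t-1} + u_{t-1} \tag{20}$$

Multiplying equation (20) by $\rho$ yields

$$\rho Y_{t-1} = \rho \beta_1 + \beta_2 \rho X_{t-1} + \rho u_{t-1} \tag{21}$$

Subtracting (21) from (19) yields

$$Y_t - \rho Y_{t-1} = \beta_1 (1 - \rho) + \beta_2 (X_t - \rho X_{t-1}) + u_t - \rho u_{t-1} \tag{22}$$

which we can express as

$$Y_t^* = \beta_1^* + \beta_2 X_t^* + \varepsilon_t \tag{23}$$

where $Y_t^* = Y_t - \rho Y_{t-1}$, $X_t^* = X_t - \rho X_{t-1}$ and $\beta_1^* = \beta_1 (1 - \rho)$.

Note that $\beta_1 = \beta_1^* / (1 - \rho)$.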
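To see what the transformation in equations (19) to (23) does to the coefficients, the Python sketch below applies it to noise-free data with a known ρ. Everything here is hypothetical (the helper names, the data and the parameter values), and the errors are set to zero on purpose so that the rescaled intercept is recovered exactly.

```python
def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def generalised_difference(series, rho):
    """Return series_t - rho * series_{t-1} for t = 2..n (first value is lost)."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]


# Hypothetical noise-free data: Y_t = 2 + 3 X_t, transformed with rho = 0.5.
beta1, beta2, rho = 2.0, 3.0, 0.5
x = [1.0, 4.0, 2.0, 7.0, 5.0, 9.0]
y = [beta1 + beta2 * v for v in x]

x_star = generalised_difference(x, rho)
y_star = generalised_difference(y, rho)
beta1_star, beta2_hat = ols(x_star, y_star)   # estimates beta1 * (1 - rho) and beta2
beta1_hat = beta1_star / (1 - rho)            # back-transform the intercept
print(beta1_star, beta2_hat, beta1_hat)
```

The slope is unchanged by the transformation, while the intercept of the transformed model has to be divided by 1 − ρ to get back to the original scale. One observation is lost because the first value has no predecessor.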
Problem with the above?
$\rho$ is not known and needs to be estimated, using

$$\hat{\rho} = \frac{\sum_{t=2}^{n} \hat{u}_t \hat{u}_{t-1}}{\sum_{t=1}^{n} \hat{u}_t^2} \tag{24}$$

which is based on the residuals from the OLS regression of equation (17) on the observed
data, without adjusting for serial correlation. That is, $\rho$ is estimated with a secondary
OLS regression on the estimated errors (OLS applied to equation (18)):

$$\hat{u}_t = \rho \hat{u}_{t-1} + \varepsilon_t \tag{25}$$

$\rho$ can also be estimated from the Durbin-Watson statistic $d$, see equation (16):

$$\hat{\rho} = 1 - \frac{d}{2} \tag{26}$$


Positive Serial Correlation - Generalised differences
Without Prais-Winsten
Standard OLS regression

The REG Procedure
Model: MODEL1
Dependent Variable: y

Number of Observations Read    200
Number of Observations Used    200

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1             11070          11070     555.42    <.0001
Error             198        3946.33827       19.93100
Corrected Total   199             15016

Root MSE            4.46442    R-Square    0.7372
Dependent Mean    190.02696    Adj R-Sq    0.7359
Coeff Var           2.34936

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              42.34727           6.27424       6.75      <.0001
x             1               0.74172           0.03147      23.57      <.0001

Durbin-Watson D 0.080
Number of Observations 200
1st Order Autocorrelation 0.911
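As a quick arithmetic check, the two figures in the output above can be plugged into equations (16) and (26). The Python lines below only redo that arithmetic; the variable names are illustrative.

```python
d = 0.080              # Durbin-Watson D from the output above
rho_reported = 0.911   # 1st Order Autocorrelation from the output above

d_from_rho = 2 * (1 - rho_reported)   # equation (16): d is roughly 2(1 - rho-hat)
rho_from_d = 1 - d / 2                # equation (26): rho-hat = 1 - d/2
print(round(d_from_rho, 3), round(rho_from_d, 3))
```

The implied values (about 0.178 for d and 0.96 for ρ̂) do not match the reported ones exactly, because (16) is only an approximation, but both routes point to strong positive serial correlation.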


The initial (OLS) estimated regression model is:

Ŷi = 42.347 + 0.742Xi (27)

[Figure: OLS residuals (Residual) plotted against time t, observations 1 to 200]

[Figure: OLS residuals (Residual) plotted against the lagged residuals (lres)]

The REG Procedure
Model: MODEL1
Dependent Variable: res (Residual)

Number of Observations Read                   200
Number of Observations Used                   199
Number of Observations with Missing Values      1

Note: No intercept in model. R-Square is redefined.

Analysis of Variance

Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                 1        3274.34668     3274.34668    2279.37    <.0001
Error               198         284.43008        1.43652
Uncorrected Total   199        3558.77676

Root MSE              1.19855    R-Square    0.9201
Dependent Mean       -0.09893    Adj R-Sq    0.9197
Coeff Var         -1211.54045

Parameter Estimates

Variable    Label    DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
lres                  1               0.91092           0.01908      47.74      <.0001

The REG Procedure
Model: MODEL1
Dependent Variable: ys

Number of Observations Read                   200
Number of Observations Used                   199
Number of Observations with Missing Values      1

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1             19581          19581    18598.0    <.0001
Error             197         207.40840        1.05283
Corrected Total   198             19788

Root MSE           1.02608    R-Square    0.9895
Dependent Mean    16.72654    Adj R-Sq    0.9895
Coeff Var          6.13443

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1               4.42848           0.11586      38.22      <.0001
xs            1               0.69850           0.00512     136.37      <.0001


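From the above we have

$$\hat{Y}_i^* = 4.428 + 0.699 X_i^* \tag{28}$$
$$\hat{\beta}_1^* = 4.428 \tag{29}$$
$$\hat{\beta}_1 = \frac{\hat{\beta}_1^*}{1 - \hat{\rho}} = \frac{4.428}{1 - 0.911} = 49.753 \tag{30}$$
$$\hat{\beta}_2 = 0.699 \tag{31}$$

The final estimated regression model therefore is:

$$\hat{Y}_i = 49.753 + 0.699 X_i \tag{33}$$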

Compare the GLS results to the OLS results in Equation (27).
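The whole workflow (OLS, estimate ρ from the residuals, transform, re-estimate, back-transform the intercept) can be sketched end to end in Python. This is an illustrative sketch on simulated data, not the class data set: the seed, the sample size, the true parameter values and the helper names are all assumptions.

```python
import random


def ols(x, y):
    """Simple OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def two_step_gls(x, y):
    """OLS, then rho from the residuals, then OLS on generalised differences."""
    b1_ols, b2_ols = ols(x, y)
    res = [yi - (b1_ols + b2_ols * xi) for xi, yi in zip(x, y)]
    # Regression through the origin of res_t on res_{t-1}, as in equation (25).
    rho = (sum(res[t] * res[t - 1] for t in range(1, len(res)))
           / sum(r * r for r in res[:-1]))
    xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    b1_star, b2_gls = ols(xs, ys)
    # Back-transform the intercept: beta1 = beta1* / (1 - rho).
    return {"ols": (b1_ols, b2_ols), "rho": rho,
            "gls": (b1_star / (1 - rho), b2_gls)}


# Simulated data with AR(1) errors (all values hypothetical).
random.seed(310)
n, beta1, beta2, true_rho = 2000, 5.0, 0.7, 0.8
x = [random.uniform(0, 10) for _ in range(n)]
u, errors = 0.0, []
for _ in range(n):
    u = true_rho * u + random.gauss(0, 1)   # u_t = rho * u_{t-1} + eps_t
    errors.append(u)
y = [beta1 + beta2 * xi + ui for xi, ui in zip(x, errors)]

fit = two_step_gls(x, y)
print(fit)
```

Both sets of slope estimates should land near the true value, since OLS stays unbiased under serial correlation. The benefit of the transformed regression lies in the standard errors and the validity of the usual inference, which this sketch does not compute.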


SAS Class example: Autocorrelation

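/* Step 1: OLS with the Durbin-Watson statistic; save predictions and residuals */
proc reg data=a;
  model y = x / dw;
  output out=b p=pred r=res;
run;
quit;

/* Step 2: create the lagged residual */
data b;
  set b;
  lres = lag(res);
run;

/* Step 3: plot the residuals against time (t) and against the lagged residuals */
proc gplot data=b;
  plot res*t;
  plot res*lres;
run;
quit;

/* Step 4: estimate rho by regressing res on lres without an intercept */
proc reg data=b;
  model res = lres / noint;
run;
quit;

/* Step 5: generalised differences, using rho-hat = 0.911 from step 4 */
data b;
  set b;
  ys = y - 0.911*lag(y);
  xs = x - 0.911*lag(x);
run;

/* Step 6: OLS on the transformed variables */
proc reg data=b;
  model ys = xs;
run;
quit;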
