
VIOLATIONS OF THE CLRM BASIC ASSUMPTIONS
Recall the CLRM (OLS) Basic Assumptions

1. The regression model is linear in the parameters.


2. X values are fixed in repeated sampling. This also
means Xi and ui are uncorrelated.
3. Zero mean value of disturbance ui
4. Homoscedasticity or equal variance of ui .
5. No correlation between the disturbances.
6. The model is correctly specified.
7. No perfect multicollinearity among independent
variables. (No exact linear relationships among
regressors)
CHAPTER 5:
MULTICOLLINEARITY
MULTICOLLINEARITY
• Multicollinearity: High degree of linear association among
independent variables => a problem of multiple regression model.
• Recall the variances of 𝛽2 and 𝛽3

2
Var ( ˆ2 ) 
 2i 23 )
x 2
(1  r 2

2
Var ( ˆ3 ) 
 3i 23 )
x 2
(1  r 2

• Where r23 is the correlation coefficient between X2 and X3.


• The higher r23 is, the more inaccurate the estimated coefficients are.
• Thus, in an ideal model, the degree of linear association among
independent variables should be low.
OBJECTIVES
5.1. What is the nature of multicollinearity? Distinguish
between perfect and imperfect multicollinearity.

5.2. Estimating the coefficients in the presence of multicollinearity

5.3. Causes of multicollinearity

5.4. Consequences of multicollinearity on OLS estimates

5.5. Detection of multicollinearity

5.6. Remedies for multicollinearity


5.1. MULTICOLLINEARITY
• Don’t be too worried when there are linear
associations among independent variables!
• Different degrees of linear association:
No association => Good
Low and moderate association => ok
High association => a problem! (imperfect/
near perfect multicollinearity)
Perfect association => failed! (Perfect
multicollinearity)
5.1.1
PERFECT MULTICOLLINEARITY

• This is a violation of a basic CLRM assumption!
• Perfect multicollinearity happens when there is
an exact linear relationship among regressors.
• In case of perfect multicollinearity, the value of
one regressor is explained perfectly/completely
by other regressors in the model.
5.1.1.
PERFECT MULTICOLLINEARITY
• Consider a k-variable model:

   Yi = β1 + β2X2i + β3X3i + ... + βkXki + ui
• There exists perfect multicollinearity in the model if the following
exact linear combination holds:

   λ1 + λ2X2 + λ3X3 + ... + λkXk = 0
Where λ1 , λ2 , λ3 ,..., λ𝑘 are constants and not simultaneously equal
to 0.
• Assuming that λ2 ≠ 0, then:

   X2 = −(λ1/λ2) − (λ3/λ2)X3 − (λ4/λ2)X4 − ... − (λk/λ2)Xk
=> Perfect multicollinearity!
5.1.1.
PERFECT MULTICOLLINEARITY
• Example: Regress wage on age and years of experience.
• 𝒘𝒂𝒈𝒆𝐢 = 𝛃𝟏 + 𝛃𝟐 𝒂𝒈𝒆𝐢 + 𝛃𝟑 𝒆𝒙𝒑𝐢 + 𝐮𝐢
• Sample data:
No. Wage Age Exp
1 5 22 0
2 6 23 1
3 8 24 2
4 15 30 8
5 30 42 20
6 42 55 33

• Age = Exp + 22 => Perfect multicollinearity
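A minimal Stata sketch of this sample (variable names wage, age and exp are taken from the table above) shows how OLS reacts to the exact relationship Age = Exp + 22:

   * Sketch: perfect multicollinearity in Stata (sample from the table above)
   clear
   input wage age exp
    5 22  0
    6 23  1
    8 24  2
   15 30  8
   30 42 20
   42 55 33
   end
   regress wage age exp
   * Because age = exp + 22 exactly, Stata drops one of the two regressors
   * ("omitted because of collinearity"): the parameters cannot be estimated separately.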


5.1.2.
IMPERFECT MULTICOLLINEARITY
• Imperfect multicollinearity happens when there is a
less-than-perfect linear association among
regressors.
• In the case of imperfect multicollinearity, the value of a
regressor is not fully/perfectly explained by the other
regressors.
5.1.2.
IMPERFECT MULTICOLLINEARITY
• There exists imperfect multicollinearity among regressors if:

   λ1 + λ2X2 + λ3X3 + ... + λkXk + Vi = 0
• Where λ1 , λ2 , λ3 ,..., λ𝑘 are constants and not simultaneously
equal to 0.
• Vi is the error term.

• Assuming that λ2 ≠ 0, then:

   X2 = −(λ1/λ2) − (λ3/λ2)X3 − (λ4/λ2)X4 − ... − (λk/λ2)Xk − Vi/λ2
=> Imperfect multicollinearity!
5.1.2.
IMPERFECT MULTICOLLINEARITY
• Example: Choose another sample with the same ages but
different years of experience.

No. Wage Age Exp2


1 5 22 1
2 6 23 0
3 8 24 3
4 15 30 10
5 30 42 19
6 42 55 33

• X2 (Age) and X3 (Exp2) have a near-perfect linear relationship:
  X2 = X3 + 22 + Vi
DIFFERENT DEGREES OF
MULTICOLLINEARITY
[Figure: diagrams of Y, X2 and X3 illustrating no multicollinearity and low multicollinearity]
DIFFERENT DEGREES OF
MULTICOLLINEARITY
[Figure: diagrams of Y, X2 and X3 illustrating imperfect multicollinearity and perfect multicollinearity]
5.2. ESTIMATING THE COEFFICIENTS IN
THE PRESENCE OF MULTICOLLINEARITY
Consider a 3-variable model:
• Population regression model:
Yi = β1 + β2 X2i + β3 X3i + ui
• Sample regression model:
Yi = β̂1 + β̂2X2i + β̂3X3i + ûi
5.2. ESTIMATING THE COEFFICIENTS IN
THE PRESENCE OF MULTICOLLINEARITY
• OLS Estimators:

   β̂2 = [ (Σyix2i)(Σx3i²) − (Σyix3i)(Σx2ix3i) ] / [ (Σx2i²)(Σx3i²) − (Σx2ix3i)² ]

   β̂3 = [ (Σyix3i)(Σx2i²) − (Σyix2i)(Σx2ix3i) ] / [ (Σx2i²)(Σx3i²) − (Σx2ix3i)² ]

   β̂1 = Ȳ − β̂2X̄2 − β̂3X̄3
5.2. IN THE CASE OF PERFECT
MULTICOLLINEARITY
• Assuming that there exists perfect multicollinearity among
regressors: 𝑋3𝑖 = λ𝑋2𝑖
𝑥3𝑖 = λ𝑥2𝑖 (λ≠0)

Substitute into the estimators:

   β̂2 = [ (Σyix2i)(λ²Σx2i²) − (λΣyix2i)(λΣx2i²) ] / [ (Σx2i²)(λ²Σx2i²) − λ²(Σx2i²)² ] = 0/0

   β̂3 = [ (λΣyix2i)(Σx2i²) − (Σyix2i)(λΣx2i²) ] / [ (Σx2i²)(λ²Σx2i²) − λ²(Σx2i²)² ] = 0/0

Both estimators are indeterminate (0/0).
Perfect multicollinearity: We cannot estimate the parameters.
5.2. IN THE CASE OF IMPERFECT
MULTICOLLINEARITY
• Assuming that there exists imperfect multicollinearity among
regressors: 𝑋3𝑖 = λ𝑋2𝑖 + 𝑉𝑖
𝑥3𝑖 = λ𝑥2𝑖 + 𝑣𝑖 (λ≠0, 𝑣𝑖 is the error term)
Substitute into the formulas for the estimators (assuming Σx2ivi = 0):

   β̂2 = [ (Σyix2i)(λ²Σx2i² + Σvi²) − (λΣyix2i + Σyivi)(λΣx2i²) ]
        / [ (Σx2i²)(λ²Σx2i² + Σvi²) − (λΣx2i²)² ]

   β̂3 = [ (λΣyix2i + Σyivi)(Σx2i²) − (Σyix2i)(λΣx2i²) ]
        / [ (Σx2i²)(λ²Σx2i² + Σvi²) − (λΣx2i²)² ]
=> Imperfect multicollinearity: We can still get the estimates of
the parameters.
PLAY WITH STATA – BACK TO THE EXAMPLE
OF WAGE
• In the case of perfect multicollinearity: one independent
variable is omitted due to perfect multicollinearity
• In the case of imperfect multicollinearity:
Source SS df MS Number of obs = 6
F( 2, 3) = 345.02
Model 1136.39277 2 568.196385 Prob > F = 0.0003
Residual 4.94056404 3 1.64685468 R-squared = 0.9957
Adj R-squared = 0.9928
Total 1141.33333 5 228.266667 Root MSE = 1.2833

wage Coef. Std. Err. t P>|t| [95% Conf. Interval]

age 1.228248 .4854988 2.53 0.085 -.316826 2.773322


exp2 -.0914135 .4973841 -0.18 0.866 -1.674312 1.491485
_cons -21.45055 10.43491 -2.06 0.132 -54.65909 11.758
PLAY WITH STATA – BACK TO THE EXAMPLE
OF WAGE
• Assuming that we regress Wage on only years of experience
(which means no multicollinearity)

Source SS df MS Number of obs = 6


F( 1, 4) = 909.76
Model 1136.33714 1 1136.33714 Prob > F = 0.0000
Residual 4.99619193 4 1.24904798 R-squared = 0.9956
Adj R-squared = 0.9945
Total 1141.33333 5 228.266667 Root MSE = 1.1176

wage Coef. Std. Err. t P>|t| [95% Conf. Interval]

exp 1.139375 .0377749 30.16 0.000 1.034496 1.244255


_cons 5.513328 .6087107 9.06 0.001 3.823276 7.20338
PERFECT MULTICOLLINEARITY VS.
IMPERFECT MULTICOLLINEARITY

PERFECT MULTICOLLINEARITY:
• A violation of a CLRM assumption
• We cannot estimate the population parameters.

IMPERFECT MULTICOLLINEARITY:
• Not a violation of the CLRM assumptions
• We can still obtain estimates of the population parameters.
5.3. CAUSES OF MULTICOLLINEARITY
• Constraints on the model or in the population being
sampled
For example: Regress electricity consumption on the size of
the house and income of households.
• Data collection method used: sample does not represent
the population.
The degree of linear association among regressors in the
sample is high but it is much lower in the population.
For example: Regress consumption on income and wealth.
• Model specification: adding polynomial terms (square or
cube of a regressor) to the model
• Macroeconomic time series data: regressors share a
common trend.
5.4.
CONSEQUENCES OF MULTICOLLINEARITY
• The consequence of perfect multicollinearity is
that we cannot estimate the population
parameters.
• Consequences of imperfect multicollinearity:
Consequences on the properties of OLS
estimators
Practical consequences
5.4. CONSEQUENCES ON THE PROPERTIES
OF OLS ESTIMATORS
• Unbiasedness: OLS estimators are still
unbiased
• Efficiency: OLS estimators still have the
minimum variance in the class of linear
unbiased estimators.
=> Imperfect multicollinearity does not affect the
statistical properties of the OLS estimators.
5.4. PRACTICAL CONSEQUENCES

1. OLS estimators have larger variances and covariances
than those obtained in the absence of multicollinearity.

Consider the model Yi = β1 + β2X2i + β3X3i + ui

The variances of the estimated coefficients are:

   var(β̂2) = σ² / [ (1 − r23²) Σx2i² ];   var(β̂3) = σ² / [ (1 − r23²) Σx3i² ]

The closer r23 is to 1 (or −1), the larger the variances
of the estimated coefficients.
5.4. PRACTICAL CONSEQUENCES

2. Wider confidence intervals

Confidence interval for parameter βj:

   β̂j ± t(α/2, n−k) · se(β̂j)

If the interval is wide, it provides little information
about the true parameter.
5.4. PRACTICAL CONSEQUENCES

3. Insignificant (very small) t ratios

   t = (β̂j − 0) / se(β̂j)

When a t ratio is close to 0, the "zero null hypothesis"
(i.e., the true population coefficient is zero) is accepted
more readily.
5.4. PRACTICAL CONSEQUENCES

4. A high R2 but few significant t Ratios


• In cases of high collinearity, it is possible that one
or more of the partial slope coefficients are
individually statistically insignificant.
• Yet R2 may be very high, meaning the model
does explain much of the variation in the
dependent variable.
• This is a signal of multicollinearity.
5.4. PRACTICAL CONSEQUENCES

5. Sensitivity of OLS Estimators and Their Standard


Errors to Small Changes in Data
Back to the example of the wage regression:
with wide confidence intervals, small changes in the data
can produce estimates that differ greatly from one another.
5.4. PRACTICAL CONSEQUENCES

6. The signs of the estimated coefficients may
contradict theory.
Back to the example of the wage regression: the
coefficient on experience has a negative sign.
This is because the estimated coefficients are too
imprecise.
5.5. DETECTION OF MULTICOLLINEARITY

1. High R2 but few significant t Ratios


This is the “classic” symptom of multicollinearity.
R2 is high (R2 > 0.8), but none or very few t ratios
of the partial slope coefficients are statistically
significant => multicollinearity
5.5. DETECTION OF MULTICOLLINEARITY

2. High pairwise correlations among regressors


Create a correlation matrix for the regressors
If the pair-wise or zero-order correlation
coefficient between two regressors is in excess of
0.8, then multicollinearity is a serious problem.
This method can only detect the multicollinearity
that happens between 2 regressors!
Find Correlation matrix in GRETL.
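In Stata the same matrix is available with the correlate command; a sketch, assuming regressors named x2, x3 and x4 (hypothetical names):

   * Sketch: pairwise correlation matrix of the regressors
   correlate x2 x3 x4
   * Look for pairwise correlations above roughly 0.8 as a warning sign.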
5.5. DETECTION OF MULTICOLLINEARITY
3. Auxiliary Regressions
• Regress each regressor on other regressors in the model.
• For example: Regress X2 against X3, X4, …, Xk

   X2 = α1 + α2X3 + α3X4 + ... + α(k−1)Xk + vi
• Do the same for X3, X4, …, Xk
• Determine 𝑅𝑗2 for the auxiliary regression of Xj
• High 𝑅𝑗2 means the correlation of Xj with other regressors is high.
• Test the overall significance of each auxiliary regression:

   Fj = [ Rj² / (k − 2) ] / [ (1 − Rj²) / (n − k + 1) ]

• Fj follows an F distribution with (k − 2) and (n − k + 1) degrees of freedom.
• If Fj < critical F => Xj has no significant linear association with the other regressors.
5.5. DETECTION OF MULTICOLLINEARITY

4. Variance Inflation Factor


   VIFj = 1 / (1 − Rj²)

   As Rj² → 1, VIFj → ∞ : Xj is highly collinear
   As Rj² → 0, VIFj → 1 : Xj is not collinear

• VIF > 10 indicates multicollinearity.


• Find VIF in GRETL
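In Stata the VIFs are available right after the main regression; a sketch with hypothetical variable names:

   * Sketch: variance inflation factors after the main regression
   regress y x2 x3 x4
   estat vif
   * Rule of thumb from the slide: VIF > 10 signals multicollinearity.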
5.6. REMEDIES FOR MULTICOLLINEARITY

Ignore multicollinearity when the problem is not


so serious.
• Ignore it when VIF < 10
• Ignore it when the main purpose is only to
predict the value of Y.
5.6. REMEDIES FOR MULTICOLLINEARITY

1. Use a priori information


From previous empirical works
2. Combining cross-sectional and time series data
3. Dropping a variable(s)
One of the “simplest” things to do is to drop one of
the collinear variables
Dropping variables may lead to specification bias.
If economic theory says that one or more variables
should be included in the model explaining the
dependent variable, do not drop them!
5.6. REMEDIES FOR MULTICOLLINEARITY
4. Additional or new data:
Increase sample size
Collect new data sample
5. Transformation of variables:
In some cases, we can transform the regressors to
avoid multicollinearity.
For example: Regress housing price on mortgage rate,
GNP and population.
It is highly possible that GNP and population are
highly correlated.
=> Transform GNP into GNP per capita
CHAPTER 6:
HETEROSCEDASTICITY
OBJECTIVES
6.1. The nature of heteroscedasticity

6.2. Causes of heteroscedasticity

6.3. Consequences of heteroscedasticity

6.4. Detection of heteroscedasticity

6.5. Remedies for heteroscedasticity


6.1. WHAT IS HETEROSCEDASTICITY?
• An assumption of the CLRM:
Homoscedasticity or equal variance of ui .

   var(ui | Xi) = E(ui² | Xi) = σ²

• Remember that var(ui | Xi) = var(Yi | Xi)

• In many cases, the conditional variances of the disturbance
are not constant:

   var(ui | Xi) = σi²,   i = 1, 2, ..., n

=> heteroscedasticity
6.1. WHAT IS HETEROSCEDASTICITY?
Regress consumption Y on income X of households in Vietnam.
𝑌𝑖 = β1 + β2 𝑋𝑖 + 𝑢𝑖
• Scenario 1: Homoscedasticity (variances of ui are constant)
Var(ui) = σ²
 The distribution of consumption around its mean is assumed to be the
same for all levels of income.
 Low-income and high-income households have the same distribution of
consumption.
• Scenario 2: Heteroscedasticity (variances of ui are not
constant)
Var(ui) = σi²
 The consumption of high-income households is more dispersed
than that of low-income households.
6.1. WHAT IS HETEROSCEDASTICITY?

[Figure: scatter plots illustrating homoscedasticity (left) and heteroscedasticity (right)]
6.2. CAUSES OF HETEROSCEDASTICITY
• As people learn, their errors of behavior become smaller
over time or the number of errors becomes more
consistent.

• As data collection techniques improve, σi² is likely to
decrease.
6.2. CAUSES OF HETEROSCEDASTICITY
• Outliers: An outlying observation, or outlier, is an observation
that is very different from the other observations in the
sample.
• Nature of the relationship: consumption – income, savings –
income
• Specification error: some important variables are omitted from
the model.
• Heteroscedasticity is likely to be more common in cross-
sectional data.
Cross-sectional data deals with members of a population at a
given point in time.
These members have different properties.
 Enterprises: very small/ small-sized/ medium-sized/ large-sized
enterprises
 Household/ individual: low/ medium/ high level of income
6.3. CONSEQUENCES

Consider the model:   Yi = β1 + β2Xi + ui

   β̂2 = Σxiyi / Σxi²

• Under homoscedasticity, Var(ui) = σ²:

   var(β̂2) = σ² / Σxi²

• Under heteroscedasticity, Var(ui) = σi²:

   var(β̂2) = Σxi² var(ui) / (Σxi²)² = Σxi²σi² / (Σxi²)²
6.3. CONSEQUENCES
• Unbiasedness: The OLS estimator continues to be
linear and unbiased.
• Efficiency: The standard formulas for the standard
errors of the OLS estimator are biased.

The OLS estimator is no longer the BLUE.


Statistical inferences (testing and prediction) using
the standard formulas will be incorrect.
6.4. DETECTION
Graphical Examination of Residuals
• We evaluate the residuals by plotting ûi or ûi² against an
independent variable or against the fitted values of the dependent
variable.
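A Stata sketch of the graphical check, with hypothetical variable names y and x:

   * Sketch: plot residuals against fitted values and against a regressor
   regress y x
   predict uhat, residuals
   predict yhat, xb
   scatter uhat yhat      // residuals vs fitted values
   scatter uhat x         // residuals vs an independent variable
   * A funnel or fan shape suggests heteroscedasticity.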
6.4. DETECTION
Breusch – Pagan Test:
   Yi = β1 + β2X2i + ... + βkXki + ui   (1)

• Assume that the variances of the observations have some
functional relationship with certain variables Z:

   σi² = α1 + α2Z2 + α3Z3 + ... + αmZm + νi
• Z variables typically are the X in the regression.
• In STATA, Z is the estimated value of Y
6.4. DETECTION
Breusch – Pagan Test:
• Step 1: Use OLS to estimate model (1) to get the residuals
• Step 2: Construct auxiliary model to test for
heteroscedasticity
   ûi² = α1 + α2Z2 + α3Z3 + ... + αmZm + νi

• Step 3: Calculate the test statistic χ²s = n·R*²
• (n: sample size; R*²: coefficient of determination of the auxiliary model)
• χ²s = n·R*² follows a chi-squared distribution with (m − 1) degrees of
freedom.
• Step 4: Reject H0 (homoscedasticity) if χ²s > χ²c
or p-value(χ²s) < α
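In Stata the test can be run directly after the regression; a sketch (estat hettest uses the fitted values of Y as the Z variable, as noted above; variable names are hypothetical):

   * Sketch: Breusch-Pagan test after the main regression
   regress y x2 x3
   estat hettest
   * Reports a chi-squared statistic; a small p-value rejects homoscedasticity.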
6.4. DETECTION
White Test

   Yi = β1 + β2X2i + β3X3i + ui   (1)
• The White test looks to see if there’s a relationship
between squared residuals and all of the independent
variables.
• Auxiliary regression:

   ûi² = α1 + α2X2i + α3X3i + α4X2i² + α5X3i² + α6X2iX3i + vi
6.4. DETECTION
White Test
• Step 1: Use OLS to estimate model (1) to get the residuals
• Step 2: Construct auxiliary model to test for
heteroscedasticity
   ûi² = α1 + α2X2i + α3X3i + α4X2i² + α5X3i² + α6X2iX3i + vi

• Step 3: Calculate the test statistic χ²s = n·R*²
• (n: sample size; R*²: coefficient of determination of the auxiliary model)
• χ²s = n·R*² follows a chi-squared distribution with degrees of freedom
equal to the number of regressors in the auxiliary regression model.
• Step 4: Reject H0 (homoscedasticity) if χ²s > χ²c
or p-value(χ²s) < α
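A Stata sketch of the White test, run after the main regression (hypothetical variable names):

   * Sketch: White's general test for heteroscedasticity
   regress y x2 x3
   estat imtest, white
   * Reports the n*R-squared statistic from the auxiliary regression
   * with levels, squares and cross products of the regressors.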
6.5. REMEDIES
I. When 𝝈𝒊 𝟐 is known:
Using Generalized Least Squares (GLS)/Weighted
Least Squares (WLS) method

• OLS: Criterion for choosing the sample regression line:

   Σûi² = Σ(Yi − β̂1 − β̂2Xi)² → min

• OLS gives equal weight to all the observations in the sample.
• As the variances are not constant, we need another method in which
observations with greater variability are given less weight than
those with smaller variability.
6.5. REMEDIES
The Method of Generalized Least Squares (GLS)
• Transform the original model so that the new model does not
incur heteroscedasticity.
• Estimate the new model using OLS.

• Consider the following model, assuming that heteroscedasticity


exists.
Yi  1   2 X i  ui
• Transform the model: divide both sides of the equation by σi:

   Yi/σi = β1(1/σi) + β2(Xi/σi) + ui/σi
6.5. REMEDIES
• Set  Yi/σi = Yi*,  1/σi = X0i*,  Xi/σi = Xi*,  ui/σi = ui*
• Then the transformed model is

   Yi* = β1*X0i* + β2*Xi* + ui*

   Var(ui*) = Var(ui/σi) = (1/σi²) Var(ui) = 1

   OLS: min Σûi*² = min Σ(Yi* − β̂1*X0i* − β̂2*Xi*)²  =>  β̂1*, β̂2* are BLUE
6.5. REMEDIES
The method of Weighted Least Squares (WLS)

   OLS: min Σei² = min Σ(Yi − β̂1 − β̂2Xi)²

   WLS: min Σwiei² = min Σwi(Yi − β̂1 − β̂2Xi)²,   where wi = 1/σi²
i
• WLS is just a special case of the more general estimating
technique, GLS. In the context of heteroscedasticity, one
can treat the two terms WLS and GLS interchangeably.
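A Stata sketch of WLS, assuming the weight wi = 1/σi² has been stored in a variable named w (hypothetical names):

   * Sketch: weighted least squares with analytic weights proportional to 1/variance
   regress y x [aweight = w]
   * Equivalent idea by hand: multiply each variable by sqrt(w) and run OLS on the
   * transformed model, as in the GLS derivation above.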
6.5. REMEDIES
• WLS Estimators:

   β̂2* = [ (ΣWi)(ΣWiXiYi) − (ΣWiXi)(ΣWiYi) ] / [ (ΣWi)(ΣWiXi²) − (ΣWiXi)² ]

   β̂1* = Ȳ* − β̂2* X̄*    (where Ȳ* = ΣWiYi/ΣWi and X̄* = ΣWiXi/ΣWi)

   var(β̂2*) = ΣWi / [ (ΣWi)(ΣWiXi²) − (ΣWiXi)² ]
6.5. REMEDIES
II. When 𝝈𝒊 𝟐 is not known:
Making assumptions about the heteroscedasticity pattern and
then implementing GLS
• Assumption 1: The error variance is proportional to Xi²

   Var(ui) = E(ui²) = σ²Xi²

• Divide both sides of the equation by Xi (Xi ≠ 0):

   Yi/Xi = β1(1/Xi) + β2 + ui/Xi

• Where:  Var(ui/Xi) = (1/Xi²) Var(ui) = σ²
6.5. REMEDIES
II. When 𝝈𝒊 𝟐 is not known:
Making assumptions about the heteroscedasticity pattern and
then implementing GLS
• Assumption 2: The error variance is proportional to Xi

   Var(ui) = E(ui²) = σ²Xi

• Divide both sides of the equation by √Xi (Xi > 0):

   Yi/√Xi = β1(1/√Xi) + β2√Xi + ui/√Xi

• Where:  Var(ui/√Xi) = (1/Xi) Var(ui) = σ²
6.5. REMEDIES
II. When 𝝈𝒊 𝟐 is not known:
• Respecification of the Model
• A respecification which reduces the magnitude of the
dispersion is taking the natural logarithm of the
independent and dependent variables.
• Original model:

   Yi = β1 + β2Xi + ui

• Take the natural logarithm of the variables:

   lnYi = β1 + β2 lnXi + ui
6.5. REMEDIES
II. When 𝝈𝒊 𝟐 is not known:
• Robust Standard Errors
(White’s Heteroscedasticity – Consistent Variances and
Standard Errors)
• The approach involves deriving estimates of variances for
the observations using the residuals from the OLS
regression and then using those to obtain the correct OLS
standard errors.

In STATA:
   regress [dependent variable] [independent variables], robust
Some notes about the transformations
• In a multiple regression model, we must be careful
in deciding which of the X variables should be
chosen for transforming the data.
• Log transformation is not applicable if some of the Y
and X values are zero or negative.
• When σ𝑖 2 are not directly known and are estimated
from one or more of the transformations discussed
earlier, all our testing procedures using the t
tests, F tests, etc., are, strictly speaking, valid only in
large samples.
CHAPTER 7:
AUTOCORRELATION
OBJECTIVES
7.1. The nature of autocorrelation

7.2. Causes of autocorrelation

7.3. Consequences of autocorrelation

7.4. Detection of autocorrelation

7.5. Remedies for autocorrelation


7.1. NATURE OF AUTOCORRELATION
• An assumption of the CLRM:
No correlation between the disturbances.
cov (ui, uj) = 0 (i ≠ j)

• In many cases, instead of the above assumption, the following


relationship holds:
cov (ui, uj) ≠ 0 (i ≠ j)
For example: an error occurring at period t may be carried over
to the next period t+1

=> autocorrelation
“correlation between members of series of observations ordered in
time [as in time series data] or space [as in cross-sectional data].”
AUTOCORRELATION PATTERNS
[Figure: plots of the disturbance ut against time t, panels (a)-(e), illustrating different autocorrelation patterns]
7.1. NATURE OF AUTOCORRELATION

First order autocorrelation


ut = ρut-1 + vt =>AR(1)
Second order autocorrelation:
ut=ρ1ut-1+ ρ2ut-2+vt => AR(2)
P-th order autocorrelation
ut = ρ1ut-1 +..+ ρput-p+ vt => AR(p)

vt : error term which satisfies OLS assumptions.


ρ is the coefficient of autocorrelation and -1 ≤ ρ ≤ 1
7.2. CAUSES OF AUTOCORRELATION

1. Inertia - Macroeconomic data experience
cycles/business cycles.
2. Specification Bias - Excluded variable
 Appropriate equation:
   Yt = β1 + β2X2t + β3X3t + β4X4t + ut
 Estimated equation:
   Yt = β1 + β2X2t + β3X3t + vt
 Estimating the second equation implies
   vt = β4X4t + ut
7.2. CAUSES OF AUTOCORRELATION

Specification Bias - Excluded variable (cont.)


Lags: the case when the excluded variable is the lags of the
dependent variable.

The value of the dependent variable at time t depends on its


value at time (t-1)
For example: Consumption at current time depends on the
consumption at the previous time
Yt = 1 + 2Xt + 3Yt-1 + ut
7.2. CAUSES OF AUTOCORRELATION

3. Specification Bias - Incorrect Functional Form

   True model:        Yt = β1 + β2X2t + β3X2t² + vt

   Estimated model:   Yt = β1 + β2X2t + ut

   =>  ut = β3X2t² + vt
7.2. CAUSES OF AUTOCORRELATION
4. Systematic errors in measurement
• Suppose a company updates its inventory at a given period in
time.
• If a systematic error occurred then the cumulative inventory
stock will exhibit accumulated measurement errors.
• These errors will show up as an autocorrelated process.
5. Cobweb phenomenon
• In agricultural market, supply reacts to price with a lag of
one time period because supply decisions take time to
implement
QSt = 1 + 2Pt-1 + ut
6. Data manipulation
7.3. CONSEQUENCES
Consider a simple regression model:   Yt = β1 + β2Xt + ut

Using OLS to estimate the model in the case of AR(1) autocorrelation:

   β̂2 = Σxtyt / Σxt²

   var(β̂2)AR(1) = (σ² / Σxt²) · [ 1 + 2ρ (Σ xt xt+1 / Σxt²) + 2ρ² (Σ xt xt+2 / Σxt²)
                                   + ... + 2ρⁿ⁻¹ (x1 xn / Σxt²) ]

   (the sums in the brackets run over t = 1 to n−1, t = 1 to n−2, ..., respectively)
7.3. CONSEQUENCES
1. The OLS estimators are still unbiased and
consistent. This is because both unbiasedness and
consistency do not depend on assumption of no
autocorrelation.
2. The OLS estimators will be inefficient and therefore
no longer BLUE.
3. Hypothesis testing and prediction are no longer
valid.
7.4. DETECTION
Graphical Examination of Residuals
• We evaluate the residuals by plotting them against time.
[Figure: plots of the residuals ût against time t, panels (a)-(e), illustrating different autocorrelation patterns]
7.4. DETECTION
Durbin – Watson Test
• Durbin – Watson d test statistic:

   d = Σ(ût − ût−1)² / Σût²
7.4. DETECTION
Durbin – Watson Test
If n is sufficiently large, then d ≈ 2(1 − ρ̂)
where:

   ρ̂ = Σûtût−1 / Σût²

−1 ≤ ρ̂ ≤ 1  =>  0 ≤ d ≤ 4
ρ̂ = −1 => d = 4: perfect negative autocorrelation
ρ̂ = 0  => d = 2: no autocorrelation
ρ̂ = 1  => d = 0: perfect positive autocorrelation
7.4. DETECTION
Durbin – Watson Test

Step 1: Estimate the model by OLS and obtain the


residuals
Step 2: Calculate the DW statistic
Step 3: Construct the table with the calculated DW
statistic and the dU, dL, 4-dU and 4-dL critical
values.
Step 4: Conclude
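A Stata sketch of the DW procedure, assuming a time variable named year (hypothetical names):

   * Sketch: Durbin-Watson test for AR(1) autocorrelation
   tsset year
   regress y x
   estat dwatson
   * Compare the reported d with dL and dU from the DW tables for the given n and k.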
7.4. DETECTION
Durbin – Watson Test
• Durbin – Watson d test statistic:

Decision zones for the d statistic:

   0 to dL:             positive autocorrelation (ρ > 0)
   dL to dU:            zone of indecision
   dU to 4 − dU:        no autocorrelation (ρ = 0)
   4 − dU to 4 − dL:    zone of indecision
   4 − dL to 4:         negative autocorrelation (ρ < 0)
7.4. DETECTION
Durbin – Watson Test
• Assumptions of the DW test:
1. The regression model includes a constant
2. Autocorrelation is assumed to be of first-order only
3. The equation does not include a lagged dependent variable
as an explanatory variable
4. No missing observations
• Drawbacks of the DW test
1. It may give inconclusive results
2. It is not applicable when a lagged dependent variable is used
3. It can’t take into account higher order of autocorrelation
7.4. DETECTION
Breusch – Godfrey Test
(Lagrange Multiplier Test)

Consider the model:

   Yt = β1 + β2Xt + ut
   ut = ρ1ut−1 + ρ2ut−2 + ... + ρput−p + vt

The BG test can test for higher-order autocorrelation.
7.4. DETECTION
Breusch – Godfrey Test

Step 1: Estimate the model by OLS and obtain the


residuals
Step 2: Use OLS to estimate the following auxiliary
model (LM model)

   ût = α1 + α2Xt + ρ̂1ût−1 + ρ̂2ût−2 + ... + ρ̂pût−p + vt

and obtain R²
7.4. DETECTION
Breusch – Godfrey Test
The null and the alternative hypotheses are:
H0: ρ1= ρ2=…= ρp=0 no autocorrelation
H1: at least one of the ρ’s is not zero, thus,
autocorrelation
Step 3: Compute the LM statistic = (n − p)R² from the LM
model.
• (n − p)R² follows a χ² distribution with p d.f. when n is
sufficiently large.
• Alternatively, use an F statistic for the joint significance of the lagged
residuals.
7.4. DETECTION
Breusch – Godfrey Test

Step 4: Reject H0 if (n − p)R² > χ²(p)
or p-value(χ²) < α

Reject H0 if Fs > Fα
or p-value(F) < α
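A Stata sketch of the BG test, here for autocorrelation up to order 2 (hypothetical names; the data must be declared as time series first):

   * Sketch: Breusch-Godfrey LM test for higher-order autocorrelation
   tsset year
   regress y x
   estat bgodfrey, lags(1 2)
   * Reports chi-squared statistics; small p-values reject "no autocorrelation".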
7.5. REMEDIES
Resolving Autocorrelation
We have two different cases:

(a) When ρ is known


(b) When ρ is unknown
7.5. REMEDIES

Resolving Autocorrelation
Consider the model

   Yt = β1 + β2Xt + ut

where
   ut = ρut−1 + vt
ρ ≠ 0, and vt satisfies the OLS assumptions.
7.5. REMEDIES

Resolving Autocorrelation
(when ρ is known)

Use GLS.
• Transform the original model:

   Yt = β1 + β2Xt + ut            (1)
   Yt−1 = β1 + β2Xt−1 + ut−1      (2)

• Multiply both sides of (2) by ρ:

   ρYt−1 = ρβ1 + ρβ2Xt−1 + ρut−1  (3)
7.5. REMEDIES
Resolving Autocorrelation
(when ρ is known)
Subtract (3) from (1):

   Yt − ρYt−1 = β1(1 − ρ) + β2(Xt − ρXt−1) + (ut − ρut−1)

Set
   Yt* = Yt − ρYt−1,   β1* = β1(1 − ρ),   β2* = β2,   Xt* = Xt − ρXt−1

New model without autocorrelation:

   Yt* = β1* + β2*Xt* + vt
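A Stata sketch of this quasi-differencing for a known (or assumed) ρ, say ρ = 0.7 (hypothetical value and variable names; the data must be tsset):

   * Sketch: generalized (quasi-) differencing with a known rho
   scalar rho = 0.7
   tsset year
   gen ystar = y - rho*L.y        // Yt* = Yt - rho*Y(t-1)
   gen xstar = x - rho*L.x        // Xt* = Xt - rho*X(t-1)
   regress ystar xstar
   * beta2 is estimated directly; beta1 is recovered as _b[_cons]/(1 - rho).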
7.5. REMEDIES
Resolving Autocorrelation
(when ρ is unknown)

The Cochrane-Orcutt iterative procedure.


Step 1: Estimate the regression and obtain residuals
Step 2: Estimate ρ from regressing the residuals to its lagged
terms.
Step 3: Transform the original variables into starred variables using
the ρ̂ obtained from step 2.
Step 4: Run the regression again with the transformed variables
and obtain residuals.
Step 5 and on: Continue repeating steps 2 to 4 for several rounds
until (stopping rule) the estimates of ρ from two successive
iterations differ by no more than some preselected small value,
such as 0.001.
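In Stata the Cochrane-Orcutt iterations are automated by the prais command; a sketch with hypothetical variable names:

   * Sketch: Cochrane-Orcutt iterative estimation
   tsset year
   prais y x, corc
   * Iterates on rho until convergence and reports the transformed-model estimates.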
SPECIFICATION ERROR
Omitting Influential Variables

"True Model":   Yi = β1 + β2X2i + β3X3i + Ui

Research model
(Under-fitted Model):   Yi = b1 + b2X2i + Vi

X3 is omitted from the under-fitted model
=> the OLS estimators b1 and b2 are biased and
inconsistent.
WRONG FORM OF THE MODEL
• Research model: linear relationship between
dependent and independent variable
   Yi = β1 + β2X2i + β3X3i + ui

• True model: nonlinear relationship between the
dependent and independent variables

   Yi = β1 + β2X2i + β3X3i + β4X3i² + ui
The Ramsey Reset Test
Ramsey suggests using Ŷi², Ŷi³ as proxies for the missing
variable Zi.

Step 1: Regress Yi on the Xi's to obtain the fitted values Ŷi
Step 2: Regress Yi on the original independent
variables plus Ŷi², Ŷi³
Step 3: Testing
H0: the coefficients on Ŷi², Ŷi³ are simultaneously equal to
0.
If we reject H0 => the original model is misspecified.
The Ramsey Reset Test
- Calculate the test statistic:

   F = [ (R²new − R²original) / m ] / [ (1 − R²new) / (n − k) ]

where:
m : number of newly-added variables
k : number of parameters in the new model
- If F > F(m, n − k) or p-value(F) < α => Reject H0.
NORMAL DISTRIBUTION
OF THE DISTURBANCE
• For prediction and hypothesis testing purposes, we
assume the normal distribution of the disturbance.
ui ~ N(0, 𝝈𝟐 )
• Does the disturbance actually follow normal distribution?
=> Use a normality test: Jarque–Bera
NORMAL DISTRIBUTION
OF THE DISTURBANCE
Jarque–Bera Test
• Test for normal distribution of the residuals
• Step 1: Use OLS to estimate the original model and save the
residuals
• Step 2: Identify the Skewness and Kurtosis of the distribution
of the residuals.
• Step 3: Test for normal distribution:
H0: The disturbance is normally distributed
H1: The disturbance is not normally distributed
• Calculate the JB test statistic, JB = n[S²/6 + (K − 3)²/24], which
follows χ²(2) (S: skewness, K: kurtosis of the residuals).
• If JB > critical χ² or p-value(JB) < α => Reject H0
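A Stata sketch of computing the JB statistic from the residuals (hypothetical variable names):

   * Sketch: Jarque-Bera normality test on the OLS residuals
   regress y x2 x3
   predict uhat, residuals
   summarize uhat, detail
   scalar JB = r(N)/6 * (r(skewness)^2 + (r(kurtosis) - 3)^2/4)
   display "JB = " JB "   p-value = " chi2tail(2, JB)
   * Reject normality of the disturbance when the p-value is below alpha.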


REMEDIES
• Wrong form of the model: change the model's
specification to capture the non-linear relationship between
the dependent and the independent variable(s)
• The distribution of the disturbance is not normal:
increase the sample size
change model specification
