BASIC ASSUMPTIONS
Recall the CLRM (OLS) Basic Assumptions
$$\mathrm{Var}(\hat\beta_2) = \frac{\sigma^2}{\sum x_{2i}^2\,(1 - r_{23}^2)} \qquad \mathrm{Var}(\hat\beta_3) = \frac{\sigma^2}{\sum x_{3i}^2\,(1 - r_{23}^2)}$$
where the $x_{ji}$ are deviations from sample means and $r_{23}$ is the sample correlation between $X_2$ and $X_3$.
$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \cdots + \beta_k X_{ki} + u_i$$
• There exists perfect multicollinearity in the model if the following exact linear combination holds:
$$\lambda_1 + \lambda_2 X_2 + \lambda_3 X_3 + \cdots + \lambda_k X_k = 0$$
where λ1, λ2, λ3, ..., λk are constants, not all simultaneously equal to zero.
• Assuming that λ2 ≠ 0, then:
$$X_2 = -\frac{\lambda_1}{\lambda_2} - \frac{\lambda_3}{\lambda_2} X_3 - \frac{\lambda_4}{\lambda_2} X_4 - \cdots - \frac{\lambda_k}{\lambda_2} X_k$$
=> Perfect multicollinearity!
PERFECT MULTICOLLINEARITY
• Example: Regress wage on age and years of experience.
• $wage_i = \beta_1 + \beta_2\,age_i + \beta_3\,exp_i + u_i$
• Sample data:
No.  Wage  Age  Exp
1      5    22    0
2      6    23    1
3      8    24    2
4     15    30    8
5     30    42   20
6     42    55   33
Note that $exp_i = age_i - 22$ for every observation: an exact linear relationship between the regressors, i.e., perfect multicollinearity (a numerical check appears below).
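As a quick numerical check of this example, the sketch below (illustrative Python using numpy; the data are the six rows above) shows that the design matrix is rank-deficient, so X'X cannot be inverted:

```python
import numpy as np

# Sample data from the table above: exp = age - 22 for every observation
age = np.array([22, 23, 24, 30, 42, 55], dtype=float)
exp = np.array([0, 1, 2, 8, 20, 33], dtype=float)

# Design matrix [constant, age, exp]
X = np.column_stack([np.ones_like(age), age, exp])

print(np.linalg.matrix_rank(X))   # 2 < 3 columns: perfect multicollinearity
print(np.linalg.det(X.T @ X))     # ~0 up to rounding: X'X is singular
```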
IMPERFECT MULTICOLLINEARITY
• There exists imperfect multicollinearity in the model if the regressors satisfy:
$$\lambda_1 + \lambda_2 X_2 + \lambda_3 X_3 + \cdots + \lambda_k X_k + V_i = 0$$
• Where λ1, λ2, λ3, ..., λk are constants, not all simultaneously equal to zero.
• Vi is an error term.
DIFFERENT DEGREES OF MULTICOLLINEARITY
[Figure: two diagrams of Y, X2, and X3; the left panel illustrates imperfect multicollinearity (X2 and X3 partially overlap), the right panel perfect multicollinearity (X2 and X3 coincide).]
5.2. ESTIMATING THE COEFFICIENTS IN THE PRESENCE OF MULTICOLLINEARITY
Consider a 3-variable model:
• Population regression model: $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
• Sample regression model: $Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i} + \hat u_i$
5.2. ESTIMATING THE COEFFICIENTS IN THE PRESENCE OF MULTICOLLINEARITY
• OLS Estimators:
$$\hat\beta_2 = \frac{(\sum y_i x_{2i})(\sum x_{3i}^2) - (\sum y_i x_{3i})(\sum x_{2i} x_{3i})}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2}$$

$$\hat\beta_3 = \frac{(\sum y_i x_{3i})(\sum x_{2i}^2) - (\sum y_i x_{2i})(\sum x_{2i} x_{3i})}{(\sum x_{2i}^2)(\sum x_{3i}^2) - (\sum x_{2i} x_{3i})^2}$$

$$\hat\beta_1 = \bar Y - \hat\beta_2 \bar X_2 - \hat\beta_3 \bar X_3$$
5.2. IN THE CASE OF PERFECT
MULTICOLLINEARITY
• Assuming that there exists perfect multicollinearity among the regressors: $X_{3i} = \lambda X_{2i}$, so in deviation form $x_{3i} = \lambda x_{2i}$ (λ ≠ 0). Substituting into the estimator formulas:

$$\hat\beta_2 = \frac{(\sum y_i x_{2i})(\lambda^2 \sum x_{2i}^2) - (\lambda \sum y_i x_{2i})(\lambda \sum x_{2i}^2)}{(\sum x_{2i}^2)(\lambda^2 \sum x_{2i}^2) - (\lambda \sum x_{2i}^2)^2} = \frac{0}{0}$$

$$\hat\beta_3 = \frac{(\lambda \sum y_i x_{2i})(\sum x_{2i}^2) - (\sum y_i x_{2i})(\lambda \sum x_{2i}^2)}{(\sum x_{2i}^2)(\lambda^2 \sum x_{2i}^2) - (\lambda \sum x_{2i}^2)^2} = \frac{0}{0}$$
Perfect multicollinearity: We cannot estimate the parameters.
5.2. IN THE CASE OF IMPERFECT
MULTICOLLINEARITY
• Assuming that there exists imperfect multicollinearity among the regressors: $X_{3i} = \lambda X_{2i} + V_i$, so in deviation form $x_{3i} = \lambda x_{2i} + v_i$ (λ ≠ 0, $v_i$ an error term).
Substituting into the formulas for the estimators gives the result sketched below.
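Carrying the substitution through gives a nonzero denominator. This is a sketch under the usual simplifying assumption (not stated on the slide) that $\sum x_{2i} v_i = 0$:

$$\left(\sum x_{2i}^2\right)\left(\lambda^2 \sum x_{2i}^2 + \sum v_i^2\right) - \left(\lambda \sum x_{2i}^2\right)^2 = \left(\sum x_{2i}^2\right)\left(\sum v_i^2\right) \neq 0$$

so the OLS estimates exist; but the smaller $\sum v_i^2$ is (the closer the collinearity is to perfect), the larger the variances of the estimators become.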
PERFECT vs. IMPERFECT MULTICOLLINEARITY
• Perfect multicollinearity: a violation of the CLRM assumptions; we cannot estimate the population parameters.
• Imperfect multicollinearity: not a violation of the CLRM assumptions; we can still obtain estimates of the population parameters.
5.3. CAUSES OF MULTICOLLINEARITY
• Constraints on the model or on the population being sampled.
For example: Regress electricity consumption on the size of
the house and income of households.
• Data collection method used: sample does not represent
the population.
The degree of linear association among regressors in the
sample is high but it is much lower in the population.
For example: Regress consumption on income and wealth.
• Model specification: adding polynomial terms (square or
cube of a regressor) to the model
• Macroeconomic time series data: regressors share a common trend.
5.4. CONSEQUENCES OF MULTICOLLINEARITY
• The consequence of perfect multicollinearity is
that we cannot estimate the population
parameters.
• Consequences of imperfect multicollinearity:
Consequences on the properties of OLS
estimators
Practical consequences
5.4. CONSEQUENCES ON THE PROPERTIES
OF OLS ESTIMATORS
• Unbiasedness: OLS estimators are still
unbiased
• Efficiency: OLS estimators still have the minimum variance among the class of linear unbiased estimators.
=> Imperfect multicollinearity does not affect the statistical properties of the OLS estimators.
5.4. PRACTICAL CONSEQUENCES
The closer $r_{23}$ is to 1 (or −1), the larger the variances of the estimated coefficients become.
5.4. PRACTICAL CONSEQUENCES
Wider confidence intervals: $\hat\beta_j \pm t_{\alpha/2}^{(n-k)}\, se(\hat\beta_j)$
With inflated standard errors the intervals become wide, and a wide interval provides little information about the true parameter.
5.4. PRACTICAL CONSEQUENCES
Insignificant t-ratios: $t = \dfrac{\hat\beta_j - 0}{se(\hat\beta_j)}$
With inflated standard errors, t-ratios tend toward 0, which leads to accepting the "zero null hypothesis" (that the true population coefficient is zero) more readily.
5.5. DETECTION OF MULTICOLLINEARITY: AUXILIARY REGRESSIONS
$$X_2 = \alpha_1 + \alpha_2 X_3 + \alpha_3 X_4 + \cdots + \alpha_{k-1} X_k + v_i$$
• Do the same for X3, X4, …, Xk
• Determine 𝑅𝑗2 for the auxiliary regression of Xj
• High 𝑅𝑗2 means the correlation of Xj with other regressors is high.
• Test the overall significance of auxiliary models:
$$F_j = \frac{R_j^2\,(n-k+1)}{(1 - R_j^2)(k-2)}$$
• Fj follows an F-distribution with (k − 2) and (n − k + 1) degrees of freedom.
• If Fj < critical F => Xj has no significant linear association with the other regressors.
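A sketch of this auxiliary-regression check in Python using statsmodels (illustrative; assumes a pandas DataFrame df whose columns hold the regressors X2, ..., Xk):

```python
import pandas as pd
import statsmodels.api as sm

def auxiliary_r2(df: pd.DataFrame, target: str):
    """Regress one regressor on all the others; return (R_j^2, F-test p-value)."""
    others = sm.add_constant(df.drop(columns=[target]))
    fit = sm.OLS(df[target], others).fit()
    return fit.rsquared, fit.f_pvalue

# Example usage (column names are illustrative):
# for col in ["X2", "X3", "X4"]:
#     r2, pval = auxiliary_r2(df, col)
#     print(col, r2, pval)   # high R_j^2 / small p-value -> collinearity
```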
6.1. THE NATURE OF HETEROSCEDASTICITY
• Homoscedasticity means $\mathrm{var}(u_i \mid X_i) = E(u_i^2 \mid X_i) = \sigma^2$ for all i; under heteroscedasticity the variance $\sigma_i^2$ differs across observations.
• Remember that $\mathrm{var}(u_i \mid X_i) = \mathrm{var}(Y_i \mid X_i)$.
[Figure: two panels contrasting homoscedasticity (constant spread of Y around the regression line) with heteroscedasticity (spread changing with X).]
6.2. CAUSES OF HETEROSCEDASTICITY
• As people learn, their errors of behavior become smaller over time, so the error variance is expected to decrease.
Consider $Y_i = \beta_1 + \beta_2 X_i + u_i$ with OLS estimator
$$\hat\beta_2 = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad x_i = X_i - \bar X$$
Under homoscedasticity:
$$\mathrm{var}(\hat\beta_2) = \frac{\sigma^2}{\sum x_i^2}$$
Under heteroscedasticity:
$$\mathrm{var}(\hat\beta_2) = \frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2}$$
6.3. CONSEQUENCES
• Unbiasedness: The OLS estimator continues to be
linear and unbiased.
• Efficiency: OLS is no longer the minimum-variance linear unbiased estimator, and the standard formulas for the standard errors of the OLS estimators are biased.
$$\sigma_i^2 = \alpha_1 + \alpha_2 Z_2 + \alpha_3 Z_3 + \cdots + \alpha_m Z_m + \nu_i$$
• The Z variables are typically the X regressors of the original model.
• In STATA, Z is by default the fitted value of Y.
6.4. DETECTION
Breusch – Pagan Test:
• Step 1: Use OLS to estimate model (1) and save the residuals $\hat u_i$.
• Step 2: Estimate the auxiliary model to test for heteroscedasticity:
$$\hat u_i^2 = \alpha_1 + \alpha_2 Z_2 + \alpha_3 Z_3 + \cdots + \alpha_m Z_m + \nu_i$$
• Step 3: Test H0: α2 = α3 = ... = αm = 0 (homoscedasticity); under H0 the statistic n·R² from the auxiliary regression follows χ²(m − 1). A code sketch follows.
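A minimal sketch of the Breusch–Pagan test in Python via statsmodels (synthetic data; variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
X = sm.add_constant(rng.uniform(1, 10, size=(200, 2)))   # constant, X2, X3
u = rng.normal(scale=X[:, 1], size=200)                  # error variance grows with X2
y = X @ np.array([1.0, 2.0, 3.0]) + u

res = sm.OLS(y, X).fit()
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, res.model.exog)
print(f"LM = {lm_stat:.2f}, p = {lm_pval:.4f}")  # small p -> reject homoscedasticity
```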
$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i \quad (1)$$
• The White test looks to see if there’s a relationship
between squared residuals and all of the independent
variables.
• Auxiliary regression:
$$\hat u_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$$
6.4. DETECTION
White Test
• Step 1: Use OLS to estimate model (1) and save the residuals $\hat u_i$.
• Step 2: Estimate the auxiliary model to test for heteroscedasticity:
$$\hat u_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$$
• Step 3: Under H0 (homoscedasticity), n·R² from the auxiliary regression follows χ² with degrees of freedom equal to the number of slope terms in the auxiliary model. A code sketch follows.
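Similarly, the White test is available in statsmodels, which builds the squares and cross products internally (a sketch on synthetic data; names are illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(1)
X = sm.add_constant(rng.uniform(1, 10, size=(200, 2)))   # constant, X2, X3
u = rng.normal(scale=0.5 * X[:, 2], size=200)            # error variance tied to X3
y = X @ np.array([1.0, 2.0, 3.0]) + u

res = sm.OLS(y, X).fit()
lm_stat, lm_pval, f_stat, f_pval = het_white(res.resid, res.model.exog)
print(f"n*R^2 = {lm_stat:.2f}, p = {lm_pval:.4f}")  # small p -> heteroscedasticity
```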
$$\sum_{i=1}^{n} \hat u_i^2 = \sum_{i=1}^{n} \left(Y_i - \hat\beta_1 - \hat\beta_2 X_i\right)^2 \rightarrow \min$$
• OLS gives equal weight to all the observations in the sample.
• As the variances are not constant, we need another method in which observations with greater variability are given less weight than those with smaller variability.
6.5. REMEDIES
The Method of Generalized Least Squares (GLS)
• Transform the original model so that the new model does not
incur heteroscedasticity.
• Estimate the new model using OLS.
$$\mathrm{Var}(u_i^*) = \mathrm{var}\!\left(\frac{u_i}{\sigma_i}\right) = \frac{1}{\sigma_i^2}\,\mathrm{var}(u_i) = 1$$

$$\min \sum w_i e_i^2 = \min \sum w_i \left(Y_i - \hat\beta_1^* - \hat\beta_2^* X_i\right)^2, \qquad w_i = \frac{1}{\sigma_i^2}$$
• WLS is just a special case of the more general estimating
technique, GLS. In the context of heteroscedasticity, one
can treat the two terms WLS and GLS interchangeably.
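A small WLS illustration in Python (a sketch that assumes the heteroscedasticity pattern σᵢ = 0.5·Xᵢ is known, so the weights wᵢ = 1/σᵢ² can be formed; names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=200)
sigma_i = 0.5 * x                                    # assumed heteroscedasticity pattern
y = 1.0 + 2.0 * x + rng.normal(scale=sigma_i)

X = sm.add_constant(x)
wls = sm.WLS(y, X, weights=1.0 / sigma_i**2).fit()   # weights w_i = 1 / sigma_i^2
print(wls.params, wls.bse)                           # estimates and standard errors
```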
6.5. REMEDIES
• WLS Estimators:
$$\hat\beta_1^* = \bar Y^* - \hat\beta_2^* \bar X^*$$
$$\mathrm{var}(\hat\beta_2^*) = \frac{\sum W_i}{(\sum W_i)(\sum W_i X_i^2) - (\sum W_i X_i)^2}$$
6.5. REMEDIES
II. When σᵢ² is not known:
Make assumptions about the heteroscedasticity pattern and then implement GLS.
• Assumption 1: The error variance is proportional to $X_i^2$:
$$\mathrm{Var}(u_i) = E(u_i^2) = \sigma^2 X_i^2$$
The resulting transformation is sketched below.
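Under this assumption, dividing the original model through by $X_i$ (the standard transformation step, filled in here) yields a homoscedastic error:

$$\frac{Y_i}{X_i} = \beta_1 \frac{1}{X_i} + \beta_2 + \frac{u_i}{X_i}, \qquad \mathrm{var}\!\left(\frac{u_i}{X_i}\right) = \frac{1}{X_i^2}\,\mathrm{var}(u_i) = \frac{\sigma^2 X_i^2}{X_i^2} = \sigma^2$$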
In STATA:
regress [dependent variable] [independent variables], robust
(The robust option computes heteroscedasticity-consistent standard errors rather than transforming the model.)
Some notes about the transformations
• In a multiple regression model, we must be careful
in deciding which of the X variables should be
chosen for transforming the data.
• Log transformation is not applicable if some of the Y
and X values are zero or negative.
• When σ𝑖 2 are not directly known and are estimated
from one or more of the transformations discussed
earlier, all our testing procedures using the t
tests, F tests, etc., are, strictly speaking, valid only in
large samples.
CHAPTER 7:
AUTOCORRELATION
OBJECTIVES
7.1. The nature of autocorrelation
=> autocorrelation
“correlation between members of series of observations ordered in
time [as in time series data] or space [as in cross-sectional data].”
AUTOCORRELATION PATTERNS
[Figure: five panels (a)–(e) plotting the disturbances $u_t$ against time t, each showing a different autocorrelation pattern.]
7.1. NATURE OF AUTOCORRELATION
Suppose the true model is $Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{2t}^2 + v_t$ but we estimate $Y_t = \beta_1 + \beta_2 X_{2t} + u_t$; then $u_t = \beta_3 X_{2t}^2 + v_t$ follows a systematic pattern, producing autocorrelation.
7.2. CAUSES OF AUTOCORRELATION
4. Systematic errors in measurement
• Suppose a company updates its inventory at a given period in
time.
• If a systematic error occurred, then the cumulative inventory stock will exhibit accumulated measurement errors.
• These errors will show up as an autocorrelated process.
5. Cobweb phenomenon
• In agricultural markets, supply reacts to price with a lag of one time period because supply decisions take time to implement:
$$QS_t = \beta_1 + \beta_2 P_{t-1} + u_t$$
6. Data manipulation
7.3. CONSEQUENCES
Consider a simple regression model: $Y_t = \beta_1 + \beta_2 X_t + u_t$
$$\hat\beta_2 = \frac{\sum x_t y_t}{\sum x_t^2}$$
Under AR(1) autocorrelation of the disturbances:
$$\mathrm{var}(\hat\beta_2)_{AR1} = \frac{\sigma^2}{\sum x_t^2}\left[1 + 2\rho\,\frac{\sum_{t=1}^{n-1} x_t x_{t+1}}{\sum x_t^2} + 2\rho^2\,\frac{\sum_{t=1}^{n-2} x_t x_{t+2}}{\sum x_t^2} + \cdots + 2\rho^{n-1}\,\frac{x_1 x_n}{\sum x_t^2}\right]$$
which no longer equals the usual formula $\sigma^2/\sum x_t^2$.
7.3. CONSEQUENCES
1. The OLS estimators are still unbiased and
consistent. This is because both unbiasedness and
consistency do not depend on assumption of no
autocorrelation.
2. The OLS estimators will be inefficient and therefore
no longer BLUE.
3. Hypothesis testing and prediction are no longer
valid.
7.4. DETECTION
Graphical Examination of Residuals
• We evaluate the residuals by plotting them against time
[Figure: plots of the residuals $\hat u_t$ against time t, panels (a)–(e), displaying possible autocorrelation patterns.]
7.4. DETECTION
Durbin – Watson Test
• Durbin – Watson d test statistic:
$$d = \frac{\sum_{t=2}^{n} (\hat u_t - \hat u_{t-1})^2}{\sum_{t=1}^{n} \hat u_t^2}$$
7.4. DETECTION
Durbin – Watson Test
If n is sufficiently large, then d ≈ 2(1 − ρ̂), where:
$$\hat\rho = \frac{\sum \hat u_t \hat u_{t-1}}{\sum \hat u_t^2}$$
• −1 ≤ ρ̂ ≤ 1 => 0 ≤ d ≤ 4
• ρ̂ = −1 => d = 4: perfect negative autocorrelation
• ρ̂ = 0 => d = 2: no autocorrelation
• ρ̂ = 1 => d = 0: perfect positive autocorrelation
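The d statistic can be computed with statsmodels (a sketch on synthetic AR(1) data; the parameters are illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
n, rho = 200, 0.8
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                  # AR(1) disturbances: u_t = rho*u_{t-1} + e_t
    u[t] = rho * u[t - 1] + e[t]
x = np.arange(n, dtype=float)
y = 1.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(res.resid))        # well below 2 -> positive autocorrelation
```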
7.4. DETECTION
Durbin – Watson Test
Decision zones for the d statistic (0 ≤ d ≤ 4):
• 0 ≤ d < dL: positive autocorrelation (ρ > 0)
• dL ≤ d ≤ dU: zone of indecision
• dU < d < 4 − dU: no autocorrelation (ρ = 0)
• 4 − dU ≤ d ≤ 4 − dL: zone of indecision
• 4 − dL < d ≤ 4: negative autocorrelation (ρ < 0)
7.4. DETECTION
Durbin – Watson Test
• Assumptions of the DW test:
1. The regression model includes a constant
2. Autocorrelation is assumed to be of first-order only
3. The equation does not include a lagged dependent variable
as an explanatory variable
4. No missing observations
• Drawbacks of the DW test
1. It may give inconclusive results
2. It is not applicable when a lagged dependent variable is used
3. It cannot take higher-order autocorrelation into account
7.4. DETECTION
Breusch – Godfrey Test
(Lagrange Multiplier Test)
$$Y_t = \beta_1 + \beta_2 X_t + u_t$$
$$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \cdots + \rho_p u_{t-p} + v_t$$
The BG test can test for higher-order autocorrelation.
7.4. DETECTION
Breusch – Godfrey Test
H0: ρ1 = ρ2 = ... = ρp = 0 (no autocorrelation up to order p)
Reject H0 if Fs > Fα or p-value(F) < α.
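A sketch of the BG test in Python via statsmodels (synthetic AR(1) data; the lag order p = 2 is an illustrative choice):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(4)
n, rho = 200, 0.6
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                  # AR(1) disturbances
    u[t] = rho * u[t - 1] + e[t]
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)
print(f"F = {f_stat:.2f}, p = {f_pval:.4f}")  # small p -> reject H0 of no autocorrelation
```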
7.5. REMEDIES
Resolving Autocorrelation
We have two different cases: when ρ is known and when ρ is unknown.
Resolving Autocorrelation
Consider the model
$$Y_t = \beta_1 + \beta_2 X_t + u_t$$
where
$$u_t = \rho u_{t-1} + v_t$$
ρ ≠ 0, and $v_t$ is a disturbance that satisfies the OLS assumptions.
7.5. REMEDIES
Resolving Autocorrelation
(when ρ is known)
Use GLS.
• Transform the original model:
$$Y_t = \beta_1 + \beta_2 X_t + u_t \quad (1)$$
$$\rho Y_{t-1} = \rho\beta_1 + \rho\beta_2 X_{t-1} + \rho u_{t-1} \quad (2)$$
Subtracting (2) from (1):
$$Y_t - \rho Y_{t-1} = \beta_1(1-\rho) + \beta_2(X_t - \rho X_{t-1}) + (u_t - \rho u_{t-1})$$
Set $Y_t^* = Y_t - \rho Y_{t-1}$, $\beta_1^* = \beta_1(1-\rho)$, $\beta_2^* = \beta_2$, $X_t^* = X_t - \rho X_{t-1}$, so the transformed model
$$Y_t^* = \beta_1^* + \beta_2^* X_t^* + v_t$$
is free of autocorrelation and can be estimated by OLS. A code sketch follows.
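A sketch of this quasi-differencing in Python (assumes ρ is known; the first observation is simply dropped rather than Prais–Winsten transformed, and all names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, rho = 200, 0.6
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                       # AR(1) errors with known rho
    u[t] = rho * u[t - 1] + e[t]
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + u

# Quasi-difference: Y*_t = Y_t - rho*Y_{t-1}, X*_t = X_t - rho*X_{t-1}
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]

res = sm.OLS(y_star, sm.add_constant(x_star)).fit()
beta1 = res.params[0] / (1 - rho)           # recover beta_1 from beta_1* = beta_1(1-rho)
print(beta1, res.params[1])                 # close to (1.0, 0.5)
```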
"True model": $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + U_i$
Research model (under-fitted model): $Y_i = b_1 + b_2 X_{2i} + V_i$
X3 is omitted from the under-fitted model
=> the estimators b1 and b2 are biased and inconsistent.
WRONG FORM OF THE MODEL
• Research model: a linear relationship between the dependent and independent variables is assumed:
$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$
$$F = \frac{(R^2_{new} - R^2_{old})/m}{(1 - R^2_{new})/(n-k)}$$
where:
m: number of newly-added variables
k: number of parameters in the new model
• If F > F(m, n−k) or p-value(F) < α => reject H0 (the newly added variables are jointly significant).
NORMAL DISTRIBUTION
OF THE DISTURBANCE
• For prediction and hypothesis testing purposes, we
assume the normal distribution of the disturbance.
ui ~ N(0, 𝝈𝟐 )
• Does the disturbance actually follow normal distribution?
=> Use a normality test: Jarque–Bera
NORMAL DISTRIBUTION
OF THE DISTURBANCE
Jarque–Bera Test
• Test for normal distribution of the residuals
• Step 1: Use OLS to estimate the original model and save the
residuals
• Step 2: Identify the Skewness and Kurtosis of the distribution
of the residuals.
• Step 3: Test for normal distribution:
H0: the disturbance is normally distributed
H1: the distribution of the disturbance is not normal
• Calculate the JB test statistic, $JB = n\left[\frac{S^2}{6} + \frac{(K-3)^2}{24}\right]$ (S: skewness, K: kurtosis), which follows χ²(2) under H0; reject H0 if JB exceeds the critical value. A code sketch follows.
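A minimal Jarque–Bera check in Python via statsmodels (synthetic residuals as a stand-in for the saved regression residuals):

```python
import numpy as np
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(6)
resid = rng.normal(size=500)            # stand-in for regression residuals

jb_stat, jb_pval, skew, kurtosis = jarque_bera(resid)
print(f"JB = {jb_stat:.2f}, p = {jb_pval:.4f}")  # large p -> do not reject normality
```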