
Heteroskedasticity

Definition of Heteroskedasticity
One of the classical regression assumptions is:
SR3: $\operatorname{var}(e_i) = \sigma^2$ for $i = 1, \dots, N$ (homoskedasticity).
If this assumption is not satisfied by a regression model, then
heteroskedasticity exists, i.e.
$\operatorname{var}(e_i) = \sigma_i^2 \neq \sigma^2$ for at least one value of $i = 1, \dots, N$.
Examples for which Heteroskedasticity is likely to exist
$\text{wage}_i = \beta_1 + \beta_2\,\text{years\_at\_school}_i + e_i$
It is likely that $\operatorname{var}(e_i)$ increases as the number of years at school increases.
$\text{consumption}_i = \beta_1 + \beta_2\,\text{income}_i + e_i$
It is likely that $\operatorname{var}(e_i)$ increases as income increases.
$\text{typing\_errors}_i = \beta_1 + \beta_2\,\text{practice\_time}_i + e_i$
It is likely that $\operatorname{var}(e_i)$ decreases as practice time increases.
Consequences of Heteroskedasticity
SR1: The model is correctly specified.
SR2: $E(e_i) = 0$ for $i = 1, \dots, N$
SR3: $\operatorname{var}(e_i) = \sigma^2$ for $i = 1, \dots, N$ (homoskedasticity)
SR4: $\operatorname{cov}(e_i, e_j) = 0$ for $i = 1, \dots, N$, $j = 1, \dots, N$ and $i \neq j$ (no autocorrelation)
SR5: The variable $x_i$ is not random, and it must take at least two different values
SR6: (optional) $e_i \sim N(0, \sigma^2)$
Under assumptions SR1, SR2, SR4 and SR5:
The OLS estimator $b_j$ is unbiased for $\beta_j$;
The OLS estimator $b_j$ is inefficient.
$\operatorname{var}(b_j)$ is different to what it would be if SR3 was true.
e.g. For a 2-variable regression model
$$\operatorname{var}(b_2) = \frac{\sigma^2}{\sum_{i=1}^{N}(x_i - \bar{x})^2} \quad \text{if SR3 is true,}$$
$$\operatorname{var}(b_2) = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^2\,\sigma_i^2}{\left[\sum_{i=1}^{N}(x_i - \bar{x})^2\right]^2} \quad \text{in general.}$$
The standard OLS standard errors are invalid, as are
the standard t- and F-tests and confidence intervals.
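As a rough numerical illustration of the two variance formulas for $b_2$, the sketch below evaluates both on a small made-up data set. The regressor values, the assumed pattern $\sigma_i = 0.5\,x_i$, and the function names are hypothetical and are not taken from the slides.

```python
import numpy as np

def var_b2_general(x, sigma2_i):
    """Variance of the OLS slope when var(e_i) = sigma2_i (general formula)."""
    d2 = (x - x.mean()) ** 2
    return np.sum(d2 * sigma2_i) / np.sum(d2) ** 2

def var_b2_homoskedastic(x, sigma2):
    """Variance of the OLS slope when var(e_i) = sigma2 for every i (SR3)."""
    return sigma2 / np.sum((x - x.mean()) ** 2)

# Hypothetical inputs, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sigma2_i = (0.5 * x) ** 2      # assumption: error s.d. proportional to x

# True variance of b2 under this heteroskedasticity pattern: 31 / 100.
general = var_b2_general(x, sigma2_i)

# What the SR3 formula gives if the average error variance is plugged in: 2.75 / 10.
naive = var_b2_homoskedastic(x, sigma2_i.mean())

print(general, naive)
```

In this example the SR3 formula understates the variance of $b_2$; when all $\sigma_i^2$ are equal the two functions return the same number.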
Detecting Heteroskedasticity
Graphical Methods: Estimate the model by OLS, plot the
residuals against each of the variables, and look for evidence of
the residual variance changing with the variable.
[Figure: regression residuals (= observed - fitted miles) plotted against income.]
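A numerical counterpart to the residual plot can be sketched as follows: fit the model by OLS, sort the residuals by the regressor, and compare their spread in the lower and upper halves. The data here are simulated under an assumed error standard deviation proportional to income; they are not the data behind the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (an assumption for illustration only).
N = 200
income = rng.uniform(20, 120, size=N)
e = rng.normal(0.0, 5.0 * income)          # error s.d. grows with income
y = 100.0 + 10.0 * income + e

# OLS fit and residuals.
X = np.column_stack([np.ones(N), income])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b

# Compare residual variance for low-income and high-income observations.
order = np.argsort(income)
low = resid[order[: N // 2]]
high = resid[order[N // 2:]]
print(low.var(), high.var())
```

A clearly larger variance in one half is the same pattern that a fanning-out residual plot would show; it is an informal check, not a test.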
Breusch-Pagan/Koenker (Lagrange Multiplier) Test:
Step 1: Estimate the main regression equation
$$y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \dots + \beta_k x_{ki} + e_i$$
by OLS and compute the residuals $\hat{e}_i$.
Step 2: Estimate the auxiliary regression by OLS, i.e.
regress the squared residuals $\hat{e}_i^2$ on the explanatory
variables that are expected to have an effect on the
variance of the errors, e.g.
$$\hat{e}_i^2 = \alpha_1 + \alpha_2 x_{2i} + \alpha_3 x_{3i} + \dots + \alpha_s x_{si} + v_i$$
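Step 3: Compute $BP = N \times R^2$, where
$N$ = the number of observations;
$R^2$ = the coefficient of determination of the auxiliary regression.
Step 4:
$H_0: \alpha_2 = \alpha_3 = \dots = \alpha_s = 0$ (homoskedasticity)
$H_1: \alpha_j \neq 0$ for some $j = 2, \dots, s$ (heteroskedasticity)
where the $\alpha_j$ are the coefficients of the auxiliary regression.
Step 5:
If $H_0$ is true, then $BP \sim \chi^2_{s-1}$ (approximately, in large samples), where $s$ is the number of
regression coefficients in the auxiliary regression.
Therefore, reject $H_0$ if $BP$ exceeds the critical value of the $\chi^2_{s-1}$ distribution at the chosen significance level.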
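The five steps of the Breusch-Pagan/Koenker test can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, assuming a single regressor whose level drives the error standard deviation; the helper names (`ols_residuals`, `r_squared`, `breusch_pagan`) are hypothetical, and the 5% critical value for one degree of freedom is hard-coded from standard chi-square tables.

```python
import numpy as np

def ols_residuals(X, y):
    """Residuals from an OLS regression of y on the columns of X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

def r_squared(X, y):
    """Coefficient of determination; X must include a column of ones."""
    resid = ols_residuals(X, y)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def breusch_pagan(y, X, Z):
    """BP = N * R^2 from regressing squared OLS residuals on Z.

    X holds the regressors of the main equation, Z those of the
    auxiliary regression; both must include a column of ones.
    """
    e_hat = ols_residuals(X, y)             # Step 1
    return len(y) * r_squared(Z, e_hat ** 2)  # Steps 2 and 3

# Simulated data (assumption): error s.d. equal to x2.
rng = np.random.default_rng(1)
N = 500
x2 = rng.uniform(1, 10, size=N)
y = 1.0 + 2.0 * x2 + rng.normal(0.0, x2)
X = np.column_stack([np.ones(N), x2])

bp = breusch_pagan(y, X, X)     # auxiliary regression uses the same regressor
CHI2_1_5PCT = 3.841             # 5% critical value, chi-square with s - 1 = 1 d.f.
print(bp, bp > CHI2_1_5PCT)     # Step 5: reject H0 when True
```

Here the auxiliary regression has $s = 2$ coefficients, so the statistic is compared with a $\chi^2_1$ critical value.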
White's Test:
Step 1: e.g. Consider the regression equation
$$y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + e_i$$
Estimate by OLS and compute the residuals $\hat{e}_i$.
Step 2: Estimate the auxiliary regression by OLS, i.e.
regress the squared residuals $\hat{e}_i^2$ on all the explanatory
variables, their squares and cross-products:
$$\hat{e}_i^2 = \alpha_1 + \alpha_2 x_{2i} + \alpha_3 x_{3i} + \alpha_4 x_{2i}^2 + \alpha_5 x_{3i}^2 + \alpha_6 x_{2i} x_{3i} + v_i$$
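Step 3: Compute $WH = N \times R^2$, where
$N$ = the number of observations;
$R^2$ = the coefficient of determination of the auxiliary regression.
Step 4:
$H_0: \alpha_2 = \alpha_3 = \dots = \alpha_6 = 0$ (homoskedasticity)
$H_1: \alpha_j \neq 0$ for some $j = 2, \dots, 6$ (heteroskedasticity)
where the $\alpha_j$ are the coefficients of the auxiliary regression.
Step 5:
If $H_0$ is true, then $WH \sim \chi^2_{s-1}$ (approximately, in large samples), where $s$ is the number of
regression coefficients in the auxiliary regression (here $s = 6$, giving 5 degrees of freedom).
Therefore, reject $H_0$ if $WH$ exceeds the critical value of the $\chi^2_{s-1}$ distribution at the chosen significance level.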
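White's test for the two-regressor example can be sketched in the same way. The code below is an illustration on simulated data, assuming the error standard deviation equals $x_{2i}$; the function name `white_statistic` is hypothetical, and the 5% critical value for five degrees of freedom is hard-coded from standard chi-square tables.

```python
import numpy as np

def white_statistic(y, x2, x3):
    """WH = N * R^2 for White's test with two explanatory variables."""
    N = len(y)

    # Step 1: main regression and squared residuals.
    X = np.column_stack([np.ones(N), x2, x3])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b) ** 2

    # Step 2: auxiliary regression on levels, squares and the cross-product.
    Z = np.column_stack([np.ones(N), x2, x3, x2 ** 2, x3 ** 2, x2 * x3])
    a, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    r2 = 1.0 - np.sum((e2 - Z @ a) ** 2) / np.sum((e2 - e2.mean()) ** 2)

    # Step 3: test statistic.
    return N * r2

# Simulated data (assumption): error s.d. equal to x2.
rng = np.random.default_rng(2)
N = 500
x2 = rng.uniform(1, 10, size=N)
x3 = rng.uniform(0, 5, size=N)
y = 1.0 + 2.0 * x2 - 1.0 * x3 + rng.normal(0.0, x2)

wh = white_statistic(y, x2, x3)
CHI2_5_5PCT = 11.070            # 5% critical value, chi-square with 6 - 1 = 5 d.f.
print(wh, wh > CHI2_5_5PCT)     # reject H0 when True
```

Unlike the Breusch-Pagan sketch, no choice of variance-driving variables is needed here, at the cost of more auxiliary coefficients.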
Dealing with Heteroskedasticity
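White's Heteroskedasticity-Consistent Variance Estimator:
e.g. for the two-variable regression model
$$\operatorname{var}(b_2) = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^2\,\sigma_i^2}{\left[\sum_{i=1}^{N}(x_i - \bar{x})^2\right]^2}$$
White's estimator of this quantity is
$$\widehat{\operatorname{var}}_w(b_2) = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^2\,\hat{e}_i^2}{\left[\sum_{i=1}^{N}(x_i - \bar{x})^2\right]^2}$$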
$\widehat{\operatorname{var}}_w(b_j)$ is a consistent estimator of $\operatorname{var}(b_j)$.
The OLS estimator with White's variance estimator
provides an unbiased (but inefficient) coefficient estimator
with t- and F-tests and confidence intervals that are valid in large samples.
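For the two-variable model, White's estimator of $\operatorname{var}(b_2)$ can be computed directly from the OLS residuals. The sketch below does so on simulated data (an assumption, with error standard deviation equal to $x_i$) and prints the conventional OLS variance estimate alongside it; the function names are hypothetical.

```python
import numpy as np

def ols_fit(x, y):
    """OLS coefficients and residuals for y = b1 + b2 * x + e."""
    X = np.column_stack([np.ones(len(y)), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def white_var_b2(x, y):
    """White's heteroskedasticity-consistent estimate of var(b2)."""
    _, e_hat = ols_fit(x, y)
    d2 = (x - x.mean()) ** 2
    return np.sum(d2 * e_hat ** 2) / np.sum(d2) ** 2

def conventional_var_b2(x, y):
    """Usual OLS estimate of var(b2), which relies on SR3."""
    _, e_hat = ols_fit(x, y)
    s2 = np.sum(e_hat ** 2) / (len(y) - 2)
    return s2 / np.sum((x - x.mean()) ** 2)

# Simulated data (assumption): error s.d. equal to x.
rng = np.random.default_rng(3)
N = 400
x = rng.uniform(1, 10, size=N)
y = 1.0 + 2.0 * x + rng.normal(0.0, x)

robust = white_var_b2(x, y)
conventional = conventional_var_b2(x, y)
print(np.sqrt(robust), np.sqrt(conventional))   # robust vs. conventional s.e.
```

The coefficient estimates are the same in both cases; only the standard error attached to $b_2$ changes.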
In Gretl, with cross-sectional data, choose the robust
standard errors option in the OLS dialogue box.
Generalised Least Squares (GLS):
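e.g. consider the regression model
$$y_i = \beta_1 + \beta_2 x_{2i} + e_i \quad \text{where } \operatorname{var}(e_i) = \sigma_i^2 \tag{1}$$
Divide by $\sigma_i$:
$$\frac{y_i}{\sigma_i} = \beta_1 \frac{1}{\sigma_i} + \beta_2 \frac{x_{2i}}{\sigma_i} + \frac{e_i}{\sigma_i}$$
i.e.
$$y_i^* = \beta_1 x_{1i}^* + \beta_2 x_{2i}^* + e_i^* \tag{2}$$
where $y_i^* = \dfrac{y_i}{\sigma_i}$, $x_{1i}^* = \dfrac{1}{\sigma_i}$, $x_{2i}^* = \dfrac{x_{2i}}{\sigma_i}$ and $e_i^* = \dfrac{e_i}{\sigma_i}$.
Note that
$$\operatorname{var}(e_i^*) = \operatorname{var}\!\left(\frac{e_i}{\sigma_i}\right) = \frac{1}{\sigma_i^2}\operatorname{var}(e_i) = \frac{\sigma_i^2}{\sigma_i^2} = 1$$
Regression (2) is homoskedastic.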
$$y_i^* = \beta_1 x_{1i}^* + \beta_2 x_{2i}^* + e_i^* \tag{2}$$
Therefore, if the values of $\sigma_i$, $i = 1, \dots, N$ were known,
OLS could be used to estimate the parameters in
Equation (2). This approach to estimating the
coefficients in Equation (1) is GLS.
Since $\sigma_i$ is unknown, the GLS estimator is infeasible.
Note: This GLS estimator may be derived by choosing the
parameter values that minimise the weighted sum of squared
errors, where the weights are $1/\sigma_i^2$.
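To make the GLS idea concrete, the sketch below applies the transformation to simulated data in which $\sigma_i$ is known by construction (an assumption that does not hold in practice, which is exactly why GLS is infeasible). It also solves the weighted least squares problem with weights $1/\sigma_i^2$ to show that the two routes give the same estimates. All variable names are hypothetical.

```python
import numpy as np

# Simulated data; sigma_i is treated as known for illustration only.
rng = np.random.default_rng(4)
N = 300
x2 = rng.uniform(1, 10, size=N)
sigma_i = 0.5 * np.sqrt(x2)                 # assumed: var(e_i) proportional to x2
y = 1.0 + 2.0 * x2 + rng.normal(0.0, sigma_i)

# Route 1: divide every variable by sigma_i and run OLS on Equation (2).
y_star = y / sigma_i
X_star = np.column_stack([1.0 / sigma_i, x2 / sigma_i])
b_gls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

# Route 2: minimise the weighted sum of squared errors, weights 1 / sigma_i^2,
# by solving the weighted normal equations (X'WX) b = X'Wy.
X = np.column_stack([np.ones(N), x2])
w = 1.0 / sigma_i ** 2
b_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

print(b_gls, b_wls)
```

Note that the transformed regression has no separate intercept column: the original constant becomes the regressor $x_{1i}^* = 1/\sigma_i$.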
Feasible Generalised Least Squares (FGLS):
e.g. consider the regression model
$$y_i = \beta_1 + \beta_2 x_{2i} + e_i \quad \text{where } \operatorname{var}(e_i) = \sigma_i^2 \tag{1}$$
Estimate Equation (1) by OLS. Compute the residuals $\hat{e}_i$.
Use OLS to estimate the auxiliary regression
$$\ln(\hat{e}_i^2) = \alpha_1 + \alpha_2 x_{2i} + u_i \tag{2}$$
Generate the fitted values from the auxiliary regression
$$\hat{g}_i = \hat{\alpha}_1 + \hat{\alpha}_2 x_{2i}, \quad i = 1, \dots, N \tag{3}$$
Estimate $\sigma_i^2$ by $s_i^2 = \exp(\hat{g}_i)$, $i = 1, \dots, N$.
Create the transformed variables
$$y_i^+ = \frac{y_i}{s_i}, \quad x_{2i}^+ = \frac{x_{2i}}{s_i}, \quad x_{1i}^+ = \frac{1}{s_i}$$
Use OLS to estimate the regression
$$y_i^+ = \beta_1 x_{1i}^+ + \beta_2 x_{2i}^+ + e_i^+ \tag{4}$$
The FGLS estimator is biased, but is consistent and
asymptotically efficient. t- and F-tests are asymptotically
valid, and confidence intervals have the correct probability
coverage.
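The FGLS procedure for the two-variable model (OLS, log-squared-residual auxiliary regression, fitted variances, transformed OLS) can be sketched as follows. The data are simulated under the assumption $\operatorname{var}(e_i) = \exp(0.3\,x_{2i})$, and the function name `fgls_two_variable` is hypothetical.

```python
import numpy as np

def fgls_two_variable(y, x2):
    """FGLS for y = b1 + b2 * x2 + e with var(e_i) modelled as exp(a1 + a2 * x2)."""
    N = len(y)
    X = np.column_stack([np.ones(N), x2])

    # Estimate Equation (1) by OLS and compute the residuals.
    b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e_hat = y - X @ b_ols

    # Auxiliary regression of ln(e_hat^2) on x2 (assumes no residual is exactly zero).
    a_hat, *_ = np.linalg.lstsq(X, np.log(e_hat ** 2), rcond=None)

    # Fitted values g_hat and estimated standard deviations s_i = sqrt(exp(g_hat)).
    g_hat = X @ a_hat
    s_i = np.sqrt(np.exp(g_hat))

    # Transformed variables, then OLS on the transformed regression.
    y_plus = y / s_i
    X_plus = X / s_i[:, None]       # columns: 1 / s_i and x2 / s_i
    b_fgls, *_ = np.linalg.lstsq(X_plus, y_plus, rcond=None)
    return b_fgls, s_i

# Simulated data (assumption for illustration only).
rng = np.random.default_rng(5)
N = 1000
x2 = rng.uniform(1, 10, size=N)
sigma_i = np.sqrt(np.exp(0.3 * x2))
y = 1.0 + 2.0 * x2 + rng.normal(0.0, sigma_i)

b_fgls, s_i = fgls_two_variable(y, x2)
print(b_fgls)       # estimates of beta_1 and beta_2
```

The intercept of the log-variance regression is generally not estimated consistently, but this only rescales all the $s_i$ by a common factor, which leaves the FGLS coefficient estimates unchanged.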