
Generalized Least Squares and Heteroskedasticity

Walter Sosa-Escudero
Econ 507. Econometric Analysis. Spring 2009

February 4, 2009


The Classical Linear Model:

1. Linearity: Y = Xβ + u.
2. Strict exogeneity: E(u|X) = 0.
3. No multicollinearity: ρ(X) = K, w.p.1.
4. No heteroskedasticity / serial correlation: V(u|X) = σ²Iₙ.

Gauss-Markov: β̂ = (X′X)⁻¹X′Y is best linear unbiased.

Variance: S²(X′X)⁻¹ is unbiased for V(β̂|X) = σ²(X′X)⁻¹.

Valid inference: with the normality assumption, we use t and F tests.

Now we will focus on the consequences of relaxing V(u|X) = σ²Iₙ.
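As a quick numerical illustration (not part of the original slides; the data-generating process and all names are my own), the β̂ and S² formulas above can be checked in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
u = rng.normal(scale=1.5, size=n)        # homoskedastic: V(u|X) = sigma^2 I_n
Y = X @ beta + u

# beta_hat = (X'X)^{-1} X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# S^2 = e'e / (n - K) estimates sigma^2; S^2 (X'X)^{-1} estimates V(beta_hat|X)
e = Y - X @ beta_hat
S2 = e @ e / (n - K)
V_hat = S2 * np.linalg.inv(X.T @ X)
```

With n = 200 the estimates land close to the true β, and S² is close to σ² = 2.25.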


The Generalized Linear Model

Suppose all classical assumptions hold, but now V(u|X) = σ²Ω, where Ω is any symmetric, positive definite n × n matrix.

We are now allowing for heteroskedasticity (the elements of the diagonal of Ω are not restricted to be all equal) and/or serial correlation (off-diagonal elements may now be ≠ 0), but we are not imposing any structure on Ω yet (besides symmetry and positive definiteness).

Plan:

1. Explore the consequences on previous results of relaxing V(u|X) = σ²Iₙ.
2. Find optimal estimators and valid inference strategies for this case (GLS).


Consequences of relaxing V(u|X) = σ²Iₙ

- β̂ is still linear and unbiased (why?), but the Gauss-Markov Theorem does not hold anymore. We will show constructively that β̂ is now inefficient by finding the BLUE for the generalized linear model.
- V(β̂|X) will now be σ²(X′X)⁻¹X′ΩX(X′X)⁻¹ (check it).
- V(β̂|X) is no longer σ²(X′X)⁻¹, and S²(X′X)⁻¹ will be a biased estimator for V(β̂|X).
- t tests no longer have the t distribution, and F tests are no longer valid either.

So, ignoring heteroskedasticity or serial correlation, that is, using β̂ and V̂(β̂|X) = S²(X′X)⁻¹, keeps estimation of β unbiased though inefficient, and invalidates all standard inference.
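The gap between the sandwich variance σ²(X′X)⁻¹X′ΩX(X′X)⁻¹ and the classical formula can be seen numerically; a small sketch (the diagonal Ω here is an arbitrary choice of mine, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma2 = 1.0
Omega = np.diag(rng.uniform(0.5, 2.0, size=n))   # heteroskedastic diagonal

XtX_inv = np.linalg.inv(X.T @ X)
# Correct variance under V(u|X) = sigma^2 Omega: the "sandwich" form
V_true = sigma2 * XtX_inv @ X.T @ Omega @ X @ XtX_inv
# Naive (classical) variance: biased once Omega != I_n
V_naive = sigma2 * XtX_inv

# Sanity check: the two formulas coincide when Omega = I_n
V_id = sigma2 * XtX_inv @ X.T @ np.eye(n) @ X @ XtX_inv
```

The two matrices differ as soon as Ω departs from the identity, which is why the usual standard errors are biased.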

Generalized Least Squares

First we need a simple result: if Ω is n × n, symmetric and pd, there is an n × n nonsingular matrix C such that

Ω⁻¹ = C′C

What does this mean, intuitively?

Consider now the following transformed model

Y* = X*β + u*

where Y* = CY, X* = CX and u* = Cu.
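In practice such a C can be obtained from a Cholesky factorization of Ω⁻¹. A sketch (the AR(1)-type Ω below is my own illustrative choice):

```python
import numpy as np

n, rho = 5, 0.6
# Example symmetric pd Omega: AR(1)-type pattern, Omega[i, j] = rho^|i-j|
idx = np.arange(n)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])

Omega_inv = np.linalg.inv(Omega)
L = np.linalg.cholesky(Omega_inv)   # lower triangular, Omega_inv = L @ L.T
C = L.T                             # so C'C = L @ L.T = Omega_inv

# C Omega C' = I_n, which is why the transformed errors Cu become spherical
assert np.allclose(C @ Omega @ C.T, np.eye(n))
```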


Now check:

1. Y* = X*β + u*, so the transformed model is trivially linear.
2. E(u*|X) = CE(u|X) = 0.
3. ρ(X*) = ρ(CX) = K, w.p.1 (CX is a rank-preserving transformation of X!).
4. V(u*|X) = V(Cu|X) = E(Cuu′C′|X) = CE(uu′|X)C′
   = σ²CΩC′
   = σ²C[Ω⁻¹]⁻¹C′
   = σ²C(C′C)⁻¹C′
   = σ²Iₙ

So...


Since the transformed model satisfies all the classical assumptions, the Gauss-Markov Theorem holds for the transformed model, hence the best linear unbiased estimator is:

β̂_gls = (X*′X*)⁻¹X*′Y*

This is the generalized least squares estimator.

- Careful: the GLS estimator is an OLS estimator of a transformed isomorphic model (the generalized linear model). It provides the BLUE under heteroskedasticity / serial correlation.
- Now it is clear that β̂ is inefficient in the generalized context (why?).
- It is important to see that statistical properties depend on the underlying structure (they are not properties of an estimator per se).

Note that

β̂_gls = (X*′X*)⁻¹X*′Y*
      = (X′C′CX)⁻¹X′C′CY
      = (X′Ω⁻¹X)⁻¹X′Ω⁻¹Y

- When Ω = Iₙ, β̂_gls = β̂.
- The practical implementation of β̂_gls requires that we know Ω (though not σ²).
- It is easy to check that V(β̂_gls|X) = σ²(X*′X*)⁻¹ = σ²(X′Ω⁻¹X)⁻¹.
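A NumPy sketch (my own DGP and names, not code from the lecture) confirming that the Ω⁻¹-weighted formula matches OLS on the transformed model:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])
omega = np.exp(rng.normal(size=n))           # heteroskedastic variances
Omega = np.diag(omega)
Y = X @ beta + rng.normal(size=n) * np.sqrt(omega)

# beta_gls = (X' Omega^{-1} X)^{-1} X' Omega^{-1} Y
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)

# Same estimator via the transformed model: C'C = Omega^{-1}, OLS on (CX, CY)
C = np.linalg.cholesky(Omega_inv).T
Xs, Ys = C @ X, C @ Y
beta_ols_star = np.linalg.solve(Xs.T @ Xs, Xs.T @ Ys)
```

The two routes agree to machine precision, which is the "GLS is OLS on a transformed model" point made above.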


Feasible GLS

Suppose there is an estimate of Ω, label it Ω̂. Then, replacing Ω by Ω̂:

β̂_fgls = (X′Ω̂⁻¹X)⁻¹X′Ω̂⁻¹Y

This is the feasible GLS estimator.
Is it linear and unbiased? Efficient?


Heteroskedasticity

Suppose we keep the no-serial-correlation assumption, but allow for heteroskedasticity, that is:

V(u|X) = σ²Ω = σ² diag(ω₁², ω₂², …, ωₙ²)

It is easy to see that in this case

C = diag(ω₁⁻¹, ω₂⁻¹, …, ωₙ⁻¹)

Hence X* = CX is a matrix with typical element Xᵢₖ/ωᵢ, i = 1, …, n, k = 1, …, K. Y* = CY is constructed in the same fashion.

β̂_gls for this particular case is called the weighted least squares estimator.
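A sketch of weighted least squares under a diagonal Ω: dividing each observation by ωᵢ and running OLS reproduces the GLS formula (the data and names are my own illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
omega = rng.uniform(0.5, 3.0, size=n)        # known standard deviations omega_i
Y = X @ np.array([1.0, 0.5]) + omega * rng.normal(size=n)

# WLS: divide each row of X and Y by omega_i, then run OLS
Xs = X / omega[:, None]
Ys = Y / omega
beta_wls = np.linalg.solve(Xs.T @ Xs, Xs.T @ Ys)

# Algebraically identical to the GLS formula with Omega = diag(omega_i^2)
Oinv = np.diag(1.0 / omega**2)
beta_gls = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ Y)
```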

In some cases the FGLS takes a very simple form. Consider Y = Xβ + u with

V(uᵢ|X) = σ²Xᵢⱼ²

Consider the transformed model

Y* = X*β + u*

where X* is a matrix with typical element

X*ᵢₖ = Xᵢₖ / Xᵢⱼ

u* and Y* are defined in a similar fashion. Note

u*ᵢ = uᵢ / Xᵢⱼ

so V(u*ᵢ|X) = σ². Hence the transformed model is homoskedastic.



- In this case the FGLS is a GLS estimator, since there is no need to estimate unknown parameters.
- In any case, the implementation of a FGLS or GLS requires detailed knowledge of the structure of heteroskedasticity in order to propose a homoskedastic transformed model. This is very rarely available.
- Later on we will explore another approach: keep the OLS estimator (we will lose efficiency) but replace S²(X′X)⁻¹ by some other estimator that behaves correctly even under heteroskedasticity (we need an asymptotic framework for this).


Testing for heteroskedasticity

We will explore two strategies, with no derivations and emphasizing intuition. Later on we will prove all the results (we need an asymptotic framework for this).

a) White test

H₀: no heteroskedasticity. Hₐ: there is heteroskedasticity of some form.

Consider a simple case with K = 3:

Yᵢ = β₁ + β₂X₂ᵢ + β₃X₃ᵢ + uᵢ,   i = 1, …, n


Steps to implement the test:

1. Estimate by OLS, save the squared residuals e².
2. Regress e² on all variables, their squares, and all possible non-redundant cross-products. In our case, regress e² on 1, X₂, X₃, X₂², X₃², X₂X₃, and obtain the R² of this auxiliary regression.
3. Under H₀, nR² ∼ χ²(p) asymptotically, where p = the number of explanatory variables in the auxiliary regression minus one.

Reject H₀ if nR² is too large.
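The steps above can be sketched in NumPy (the heteroskedastic DGP is my own, chosen so the test should reject; nothing here is code from the lecture):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
X2, X3 = rng.normal(size=n), rng.normal(size=n)
# Illustrative DGP: error variance grows with X2^2, so H0 is false
u = rng.normal(size=n) * np.sqrt(1.0 + 4.0 * X2**2)
Y = 1.0 + 2.0 * X2 + 3.0 * X3 + u

def r2(y, Z):
    """R^2 of an OLS regression of y on Z (Z includes a constant)."""
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - resid @ resid / tss

# Step 1: OLS, squared residuals
Xmat = np.column_stack([np.ones(n), X2, X3])
e2 = (Y - Xmat @ np.linalg.lstsq(Xmat, Y, rcond=None)[0]) ** 2

# Step 2: auxiliary regression on levels, squares and cross products (p = 5)
Z = np.column_stack([np.ones(n), X2, X3, X2**2, X3**2, X2 * X3])

# Step 3: nR^2 is asymptotically chi^2(5) under H0
stat = n * r2(e2, Z)
```

With this DGP the statistic comes out far above the χ²(5) critical value, so the test rejects, as it should.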


Intuition: note that under homoskedasticity

E(uᵢ²|X) = σ²

The auxiliary model can be seen as trying to model the variance of the error term. If the R² of this auxiliary regression were high, then we could explain the behavior of the squared residuals, providing evidence that they are not constant.

Comments:

- Valid for large samples.
- Informative if we do not reject the null (no heteroskedasticity).
- When it rejects the null: there is heteroskedasticity, but we do not have any information regarding what causes it. This will cause some trouble when trying to construct a GLS estimator, for which we need to know in a very specific way what causes the heteroskedasticity.

b) The Breusch-Pagan/Godfrey/Koenker test

Mechanically very similar to White's test. Checks whether certain variables cause heteroskedasticity.

Consider the linear model where all classical assumptions hold, but:

V(uᵢ|X) = h(α₁ + α₂Z₂ᵢ + α₃Z₃ᵢ + … + αₚZₚᵢ)

where h(·) is any positive function with two derivatives.

When α₂ = … = αₚ = 0, V(uᵢ|X) = h(α₁), a constant!

Then homoskedasticity means H₀: α₂ = α₃ = … = αₚ = 0, and Hₐ: α₂ ≠ 0 ∨ α₃ ≠ 0 ∨ … ∨ αₚ ≠ 0.


Steps to implement the test:

1. Estimate by OLS, and save the squared residuals eᵢ².
2. Regress eᵢ² on the Zᵢₖ variables, k = 2, …, p, and get the explained sum of squares (ESS). The test statistic is:

(1/2) ESS / σ̂⁴ ∼ χ²(p − 1)

under H₀, asymptotically, where σ̂² = Σᵢ eᵢ²/n. We reject if the statistic is too large.
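Both versions of the statistic can be sketched as follows (the exponential h(·), the single Z variable, and all names are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
Z2 = rng.uniform(size=n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
# Illustrative h(.): exponential, so the variance rises with Z2 and H0 is false
Y = X @ np.array([1.0, 1.0]) + rng.normal(size=n) * np.sqrt(np.exp(1.0 + 2.0 * Z2))

# Step 1: OLS squared residuals
e2 = (Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2

# Step 2: regress e2 on a constant and the Z's, take the explained sum of squares
Zmat = np.column_stack([np.ones(n), Z2])
fitted = Zmat @ np.linalg.lstsq(Zmat, e2, rcond=None)[0]
ESS = ((fitted - e2.mean()) ** 2).sum()

# Breusch-Pagan statistic: ESS / (2 sigma_hat^4), chi^2(p - 1) under H0
sigma2_hat = e2.mean()
bp_stat = ESS / (2.0 * sigma2_hat**2)

# Koenker's variant: n * R^2 of the auxiliary regression (robust to non-normality)
koenker_stat = n * ESS / ((e2 - e2.mean()) ** 2).sum()
```

Since the DGP is strongly heteroskedastic in Z₂, both statistics land well above the χ²(1) critical value.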


Comments:

- Intuition is as in the White test (a model for the variance).
- By focusing on a particular group of variables, if we reject the null we have a better idea of what causes heteroskedasticity.
- Accepting the null does not mean there isn't heteroskedasticity (why?).
- Also a large-sample test.
- Koenker (1980) has proposed using nR², with R² from the auxiliary regression, as a test statistic, which is still valid if the errors are non-normal.
