
Heteroskedasticity

Homoskedasticity: $\mathrm{Var}(u \mid x) = \sigma^2$
Heteroskedastic Case

Heteroskedasticity with a single regressor

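$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x})\, u_i}{\sum_i (x_i - \bar{x})^2}$$

$$\mathrm{Var}(\hat{\beta}_1) = \frac{\sum_i (x_i - \bar{x})^2 \sigma_i^2}{\left(\sum_i (x_i - \bar{x})^2\right)^2}$$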

Heteroskedasticity with a single regressor


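When $\sigma_i^2 \neq \sigma^2$, the variance is estimated by

$$\widehat{\mathrm{Var}}(\hat{\beta}_1) = \frac{\sum_i (x_i - \bar{x})^2 \hat{u}_i^2}{\left(\sum_i (x_i - \bar{x})^2\right)^2},$$

where $\hat{u}_i$ are the OLS residuals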
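The single-regressor robust variance formula can be sketched in a few lines of NumPy. This is a minimal illustration, not the slides' own code: the function name `robust_var_slope` and the simulated data are assumptions made for the example.

```python
import numpy as np

def robust_var_slope(x, y):
    """Robust variance estimate for the slope in y = b0 + b1*x + u."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()                     # x_i - x_bar
    sst_x = np.sum(dx ** 2)               # sum of (x_i - x_bar)^2
    b1 = np.sum(dx * y) / sst_x           # OLS slope
    b0 = y.mean() - b1 * x.mean()         # OLS intercept
    u_hat = y - b0 - b1 * x               # OLS residuals
    var_b1 = np.sum(dx ** 2 * u_hat ** 2) / sst_x ** 2
    return b1, var_b1

# Hypothetical simulated data: the error spread grows with x.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=500)
y = 1.0 + 2.0 * x + rng.normal(size=500) * x
b1, var_b1 = robust_var_slope(x, y)
robust_se = np.sqrt(var_b1)               # robust standard error of the slope
```

The square root of the returned variance is the robust standard error used for inference.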

To calculate the standard error:

Small sample: the homoskedasticity assumption is crucial!
Large sample: heteroskedasticity is okay, but we use a different formula

Robust Standard Error

Heteroskedasticity with multiple regressors


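When $\sigma_i^2 \neq \sigma^2$,

$$\widehat{\mathrm{Var}}(\hat{\beta}_j) = \frac{\sum_i \hat{r}_{ij}^2 \hat{u}_i^2}{SSR_j^2}$$

where
$\hat{u}_i$ are the OLS residuals
$\hat{r}_{ij}$ is the i-th residual from regressing $x_j$ on all other independent variables
$SSR_j$ is the sum of squared residuals from this regression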
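The multiple-regressor formula can be checked numerically. The sketch below computes the robust variance by partialling out, then compares it with the matrix "sandwich" form $(X'X)^{-1} X' \mathrm{diag}(\hat{u}_i^2) X (X'X)^{-1}$ (often called HC0). The sandwich form does not appear on the slides and is used here only as a cross-check; the function names and simulated data are assumptions.

```python
import numpy as np

def robust_var_partial(X, y, j):
    """Robust variance of the OLS coefficient on column j of X.

    X must already contain a column of ones for the intercept.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u_hat = y - X @ beta                              # OLS residuals
    others = np.delete(X, j, axis=1)                  # all other regressors (incl. constant)
    gamma, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    r_hat = X[:, j] - others @ gamma                  # residuals from regressing x_j on the rest
    ssr_j = np.sum(r_hat ** 2)
    return np.sum(r_hat ** 2 * u_hat ** 2) / ssr_j ** 2

def robust_cov_sandwich(X, y):
    """Full robust covariance matrix in sandwich form (cross-check only)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u_hat = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * (u_hat ** 2)[:, None])          # X' diag(u_hat^2) X
    return bread @ meat @ bread

# Hypothetical simulated data with error spread proportional to x1.
rng = np.random.default_rng(1)
n = 300
x1 = rng.uniform(1.0, 5.0, size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 - x2 + rng.normal(size=n) * x1

var_partial = robust_var_partial(X, y, j=1)
var_sandwich = robust_cov_sandwich(X, y)[1, 1]
```

The two numbers should agree up to floating-point error, because row j of $(X'X)^{-1}X'$ equals $\hat{r}_{ij}/SSR_j$.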

Robust Standard Errors

Now that we have a consistent estimate of the variance, $\widehat{\mathrm{Var}}(\hat{\beta}_j)$, its square root can be used as a standard error for inference

Important to remember that these robust standard errors only have an asymptotic justification
With small sample sizes, t statistics formed with robust standard errors will not have a distribution close to the t, and inferences will not be correct

Testing for Heteroskedasticity


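H0: $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \sigma^2$

H0: $E(u^2 \mid x_1, x_2, \ldots, x_k) = E(u^2) = \sigma^2$

We assume the relationship between $u^2$ and $x_j$ is linear and test it as a linear restriction:
$u^2 = \delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k + v$
H0: $\delta_1 = \delta_2 = \cdots = \delta_k = 0$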

The Breusch-Pagan Test

We don't observe the error, but can estimate it with the residuals from the OLS regression

After regressing the squared residuals on all of the x's, we can use the $R^2$ to form an F test

The F statistic is just the reported F statistic for overall significance of the regression:

$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}$$
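The F version of the Breusch-Pagan test described above can be sketched as follows. The helper names (`r_squared`, `breusch_pagan_f`) and the simulated data are assumptions for illustration. The p-value step is left out; it would compare the statistic with an $F_{k,\,n-k-1}$ distribution.

```python
import numpy as np

def r_squared(X, y):
    """R-squared from an OLS regression of y on X (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def breusch_pagan_f(X, y):
    """F statistic from regressing squared OLS residuals on all the x's."""
    n, k = X.shape[0], X.shape[1] - 1       # k regressors plus a constant
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ beta) ** 2                # squared OLS residuals
    r2 = r_squared(X, u2)                   # R^2 of the auxiliary regression
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return f_stat, r2

# Hypothetical simulated data: error spread is proportional to x.
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * x
bp_f, bp_r2 = breusch_pagan_f(X, y)
```

A large `bp_f` relative to the $F_{k,\,n-k-1}$ critical value is evidence against homoskedasticity.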

The White Test


The White test allows for nonlinearities by using squares and interactions of all the x's (the Breusch-Pagan test only detects linear forms of heteroskedasticity)
If we just use an F test for overall significance, we will pretty quickly have too many restrictions.
A simpler way is to use the fitted values from OLS, $\hat{y}$, which are a function of all the x's.
-> $\hat{y}$ and $\hat{y}^2$ can proxy for all of the $x_j$, $x_j^2$, and $x_j x_h$
Regress the squared residuals on $\hat{y}$ and $\hat{y}^2$ and use the $R^2$ to form an F statistic (now only testing 2 restrictions)
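The simplified White test (squared residuals regressed on the fitted values and their squares) can be sketched in NumPy. The function name `white_special_f` and the simulated data are assumptions. With 2 restrictions and 3 estimated parameters in the auxiliary regression, the statistic is compared with an $F_{2,\,n-3}$ distribution.

```python
import numpy as np

def white_special_f(X, y):
    """F statistic from regressing u_hat^2 on y_hat and y_hat^2."""
    n = X.shape[0]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta                                  # OLS fitted values
    u2 = (y - y_hat) ** 2                             # squared OLS residuals
    Z = np.column_stack([np.ones(n), y_hat, y_hat ** 2])
    delta, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    resid = u2 - Z @ delta
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((u2 - u2.mean()) ** 2)
    f_stat = (r2 / 2) / ((1.0 - r2) / (n - 3))        # 2 restrictions
    return f_stat, r2

# Hypothetical simulated data: error spread is proportional to x1.
rng = np.random.default_rng(3)
n = 1000
x1 = rng.uniform(1.0, 5.0, size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n) * x1
white_f, white_r2 = white_special_f(X, y)
```

However many regressors the original model has, the auxiliary regression always has the same two slope terms, which is the point of this version of the test.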

Weighted Least Squares Estimation


The basic idea is to transform the model with
heteroskedastic error into one with homoskedastic
error using a proper weight.

Weighted Least Squares Estimation


The $\beta_j^*$ minimize the weighted sum of squared residuals

Weighted Least Squares Estimation


The transformed model satisfies MLR 5.
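A sketch of the transformation, assuming the variance takes the form $\mathrm{Var}(u_i \mid x_i) = \sigma^2 h_i$ with $h_i = h(x_i) > 0$ known (the $h$ notation follows the FGLS slides; the exact form shown on the original slide is not recoverable). Dividing the model through by $\sqrt{h_i}$ gives

$$\frac{y_i}{\sqrt{h_i}} = \beta_0 \frac{1}{\sqrt{h_i}} + \beta_1 \frac{x_{i1}}{\sqrt{h_i}} + \cdots + \beta_k \frac{x_{ik}}{\sqrt{h_i}} + \frac{u_i}{\sqrt{h_i}}$$

and the transformed error is homoskedastic:

$$\mathrm{Var}\left(\frac{u_i}{\sqrt{h_i}} \,\Big|\, x_i\right) = \frac{\sigma^2 h_i}{h_i} = \sigma^2$$

OLS on the transformed data minimizes $\sum_i (y_i - b_0 - b_1 x_{i1} - \cdots - b_k x_{ik})^2 / h_i$, so each squared residual is weighted by $1/h_i$.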

WLS estimator
Under MLR 1-4 and 6, the transformed model now satisfies MLR 5, so we can calculate standard errors and test statistics for the WLS estimator. The t/F statistics have exact distributions.
Furthermore, WLS is BLUE (that is, more efficient than OLS) and is one example of a Generalized Least Squares (GLS) estimator.
However, we must specify the form of the heteroskedasticity.
In practice, we rarely know how the variance depends on a particular independent variable in a simple form.
Example:

Example: Naturally Given Weights


Weights are naturally given when we only have group averages instead of individual-level data:
e.g. the effect of plan generosity on individual contributions

Feasible Generalized Least Squares (FGLS)


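We usually don't know the form of the heteroskedasticity
In this case, you need to estimate $h(x_i)$
We assume that $\mathrm{Var}(u \mid x) = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k)$
The $\delta_j$ must be estimated because we don't know them.

$u^2 = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \cdots + \delta_k x_k)\, v$

$\ln(u^2) = \alpha_0 + \delta_1 x_1 + \cdots + \delta_k x_k + e$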

Feasible Generalized Least Squares (FGLS)

Implementation:
Step 1. Run the original OLS model and save the residuals, $\hat{u}$
Step 2. Create $\ln(\hat{u}^2)$
Step 3. Regress $\ln(\hat{u}^2)$ on all of the independent variables and get the fitted values, $\hat{g}$
Step 4. Exponentiate the fitted values, that is, $\hat{h} = \exp(\hat{g})$
Step 5. Estimate the original model by WLS using $1/\hat{h}$ as the weight
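The five steps can be sketched in NumPy as below. The helper names (`ols`, `fgls`) and the simulated data are assumptions for illustration. WLS with weight $1/\hat{h}$ is implemented here as OLS on data scaled by $1/\sqrt{\hat{h}}$, which minimizes the same weighted sum of squared residuals.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients; X must include a column of ones."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fgls(X, y):
    """FGLS under the assumed form Var(u|x) = sigma^2 * exp(linear index)."""
    u_hat = y - X @ ols(X, y)              # Step 1: OLS residuals
    log_u2 = np.log(u_hat ** 2)            # Step 2: ln(u_hat^2); breaks if a residual is exactly 0
    g_hat = X @ ols(X, log_u2)             # Step 3: fitted values from regressing on the x's
    h_hat = np.exp(g_hat)                  # Step 4: estimated variance function
    w = 1.0 / np.sqrt(h_hat)               # Step 5: scale rows by 1/sqrt(h_hat) ...
    beta_fgls = ols(X * w[:, None], y * w) # ... so OLS on scaled data is WLS with weight 1/h_hat
    return beta_fgls, h_hat

# Hypothetical simulated data with Var(u|x) = exp(x).
rng = np.random.default_rng(4)
n = 2000
x = rng.uniform(0.0, 3.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * np.sqrt(np.exp(x))
beta_fgls, h_hat = fgls(X, y)
```

Because the constant column is scaled along with the other regressors, the transformed regression has no separate intercept term, matching the transformed model.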

Further Comments:
1. FGLS
2. WLS

The Linear Probability Model Revisited


$P(y = 1 \mid x) = E(y \mid x) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
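A likely reason the linear probability model is revisited here (an inference, since the rest of the slide is missing): a binary $y$ has a conditional variance that depends on $x$, so the model is heteroskedastic by construction. Writing $p(x)$ for the response probability,

$$\mathrm{Var}(y \mid x) = p(x)\,[1 - p(x)], \qquad p(x) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$$

The variance is constant only if $p(x)$ does not vary with $x$.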
