Heteroscedasticity
Objectives:
(1) Causes
(2) Consequences
(3) Detection
(4) Solutions
Econometrics 1
What is Heteroscedasticity
• Recall: Assumptions of OLS
$E(u_i) = 0$
$E(u_i u_j) = 0, \quad i \neq j$
$E(u_i^2) = \sigma^2$
• The LRM assumes that the variance of the equation
disturbance term is constant over the whole sample
period. That is:
$\sigma_t^2 = \sigma^2$ for all $t$, $t = 1, \dots, T$
What is Heteroscedasticity
• When the requirement of a constant
variance holds we have
homoscedasticity
• When the requirement of a constant
variance is violated we have a condition of
heteroscedasticity
What is Heteroscedasticity
– In essence (informally), the estimated regression
line fits some observations better than others, and
the poor fit of those observations can be identified as
being related in a systematic fashion to some cause.
Types of Heteroscedasticity
• Type A.) Error variance related to a
category: variety, gender, type of industry
• Type B.) Related to values of $X_i$
• Type C.) Related to values of $Y_i$
• Type D.) Some function of the explanatory
variables $X_1$, $X_2$, $X_3$
• The residual for each observation i is the
vertical distance between the observed
value of the dependent variable and the
predicted value of the dependent variable
– I.e. the difference between the observed value
of the dependent variable and the line of best
fit value
Diagnose heteroscedasticity by plotting the
residual against the predicted y.
(Assume that this represents multiple
observations of y for each given value of x):
[Figure: purchase price plotted against number of rooms, with positive (+ive) and negative (-ive) residuals marked relative to the fitted line]
If we plot the residual against Rooms, we
can see that its variance increases with
the number of rooms:
[Figure: unstandardized residual plotted against number of rooms]
[Figure: unstandardized residual plotted against number of rooms, annotated "There is clear evidence of increasing variance here"]
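The visual check described above can also be done numerically. The following is a minimal sketch on simulated data, not the housing data behind the slides: the variable names, coefficients and the error scale `3_000 * rooms` are all assumptions chosen so that the error spread grows with the number of rooms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: price rises with rooms, and the error spread grows with rooms.
rooms = rng.integers(1, 13, size=2000).astype(float)
price = 20_000 + 15_000 * rooms + rng.normal(0.0, 3_000 * rooms)

# Fit price = a + b * rooms by OLS and form the residuals.
X = np.column_stack([np.ones_like(rooms), rooms])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
resid = price - X @ coef

# Compare the residual spread for small and large houses.
sd_small = resid[rooms <= 4].std()
sd_large = resid[rooms >= 9].std()
print(f"residual s.d., rooms <= 4: {sd_small:,.0f}")
print(f"residual s.d., rooms >= 9: {sd_large:,.0f}")
```

A residual standard deviation that is clearly larger in the second group is the numerical counterpart of the fan shape in the residual plot.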
Consumption function example (cross-
section data): credit worthiness as a
missing variable?
The Heteroskedastic Case
Causes
• What might cause the variance of the
residuals to change over the course of the
sample?
– the error variance may be related to:
  • the dependent variable and/or the
  explanatory variables in the model,
  • or some combination (linear or non-linear) of all
  variables in the model,
  • or variables that should be in the model.
– But why?
(i) Non-constant coefficient
• Suppose that the slope coefficient varies
across i:
– $y_i = a + b_i x_i + u_i$
• Suppose that it varies randomly around some
fixed value b:
– $b_i = b + e_i$
• Then the regression actually estimated by
EViews will be:
– $y_i = a + (b + e_i) x_i + u_i$
– $\phantom{y_i} = a + b x_i + (e_i x_i + u_i)$
– where $(e_i x_i + u_i)$ is the error term in the regression.
The variance of the error term will thus vary with x.
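To make the dependence on x explicit, here is a short worked step. It rests on assumptions that the slide does not state: $e_i$ and $u_i$ are taken to be independent of each other and of $x_i$, with zero means and constant variances $\sigma_e^2$ and $\sigma_u^2$.

$$
\operatorname{Var}(e_i x_i + u_i \mid x_i) = x_i^2 \sigma_e^2 + \sigma_u^2
$$

Under those assumptions the error variance grows with $x_i^2$, even though both underlying disturbances are homoscedastic.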
(iii) Non-linearities
• If the true relationship is non-linear:
– $y_i = a + b x_i^2 + u_i$
• but the regression we attempt to estimate is
linear:
– $y_i = a + b x_i + v_i$
• then the residual in this estimated
regression will capture the non-linearity
and its variance will be affected
accordingly:
– $v_i = f(x_i^2, u_i)$
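A small simulation can illustrate the point. This is a sketch under assumed values (a = 2, b = 0.5, x uniform on [0, 10], unit-variance u), none of which come from the slides. The true disturbance is homoscedastic by construction, yet the average squared residual from the misspecified linear fit differs sharply across bands of x.

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.uniform(0.0, 10.0, size=3000)
u = rng.normal(0.0, 1.0, size=x.size)     # homoscedastic by construction
y = 2.0 + 0.5 * x**2 + u                  # true relationship is quadratic

# Estimate the (misspecified) linear regression y = a + b x + v.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
v = y - X @ coef

# Mean squared residual within five bands of x: [0,2), [2,4), [4,6), [6,8), [8,10].
bands = np.digitize(x, [2.0, 4.0, 6.0, 8.0])
msr = np.array([np.mean(v[bands == b] ** 2) for b in range(5)])
for b, value in enumerate(msr):
    print(f"band {b}: mean squared residual = {value:.1f}")
```

The pattern here comes from the omitted curvature rather than from the disturbance itself, which is why respecifying the model is listed among the remedies.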
(iv) Aggregation
• Sometimes we aggregate our data across groups:
– e.g. quarterly time series data on income = average
income of a group of households in a given quarter
• If this is so, and the size of the groups used to calculate the
averages varies,
⇒ the variance of the mean will vary
• Larger groups will have a smaller standard error
of the mean.
⇒ the size of the measurement error in each value of our
variable will be related to the sample size of
the groups used.
• Since measurement errors will be captured by the
regression residual
⇒ the variance of the regression residual will vary with the sample size of the
underlying groups on which the data is based.
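The link between group size and the precision of a group mean can be checked with a quick simulation. This is only a sketch: the population mean of 100, the standard deviation `sigma = 10` and the three group sizes are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

sigma = 10.0                      # s.d. of an individual household (assumed)
sizes = [5, 50, 500]              # hypothetical group sizes

# For each group size, draw many groups and record the s.d. of the group mean.
sd_of_mean = {}
for n in sizes:
    means = rng.normal(100.0, sigma, size=(4000, n)).mean(axis=1)
    sd_of_mean[n] = means.std()
    print(f"n = {n:3d}: s.d. of mean = {sd_of_mean[n]:.2f}, "
          f"sigma / sqrt(n) = {sigma / np.sqrt(n):.2f}")
```

Observations built from small groups are therefore noisier than those built from large groups, which is exactly a non-constant error variance.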
The consequences of
heteroskedasticity
The consequences
• OLS estimators are still unbiased (unless there are also
omitted variables)
The consequences
• However, OLS estimators are no longer
efficient or minimum variance
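Both consequences can be seen in a small Monte Carlo experiment. The sketch below assumes an error standard deviation proportional to x and a true slope of 2; these values, and the comparison against weighted least squares with the correct weights, are illustrative assumptions rather than part of the slides.

```python
import numpy as np

rng = np.random.default_rng(3)

n, reps, true_b = 200, 2000, 2.0
x = rng.uniform(1.0, 10.0, size=n)        # regressor held fixed across replications
X = np.column_stack([np.ones(n), x])
w = 1.0 / x                               # 1 / error s.d., known here by construction

ols_b, wls_b = [], []
for _ in range(reps):
    y = 1.0 + true_b * x + rng.normal(0.0, x)   # error s.d. proportional to x
    ols_b.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    wls_b.append(np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0][1])

ols_b, wls_b = np.array(ols_b), np.array(wls_b)
print(f"OLS slope: mean = {ols_b.mean():.3f}, s.d. = {ols_b.std():.3f}")
print(f"WLS slope: mean = {wls_b.mean():.3f}, s.d. = {wls_b.std():.3f}")
```

Both sets of slope estimates centre on the true value, but the OLS estimates are more dispersed, which is what "no longer efficient" means in practice.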
Detecting heteroskedasticity
• Unfortunately, there is usually no
straightforward way to identify
the cause
Detecting heteroskedasticity
• Formal Tests:
H0: Homoskedasticity
HA: Heteroskedasticity
Detecting heteroskedasticity:
Formal Tests
• Ramsey RESET test
• White’s test
• Goldfeld-Quandt test - suitable for a
simple form of heteroskedasticity
• Breusch-Pagan test - a test of more
general forms of heteroskedasticity
Goldfeld-Quandt test: Type B
– S.M. Goldfeld and R.E. Quandt, "Some Tests
for Homoscedasticity," Journal of the American
Statistical Association, Vol. 60, 1965.
Goldfeld-Quandt test: Procedure
(iv) Calculate the test statistic G where:
$G = \dfrac{RSS_2 / \left(\tfrac{1}{2}(n - p) - k\right)}{RSS_1 / \left(\tfrac{1}{2}(n - p) - k\right)}$
Under H0, G has an F distribution: $G \sim F\left[\tfrac{1}{2}(n - p) - k,\ \tfrac{1}{2}(n - p) - k\right]$
• NB: G = F* must be > 1. If not, invert it.
• Compare F* to the critical value Fc.
• Problem: In practice we don’t usually know what z is.
– But if there are various possible z’s then it may not matter
which one you choose if they are all highly correlated with
each other.
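The statistic in step (iv) can be sketched in a few lines of NumPy. The earlier steps are not reproduced on this slide, so the code assumes the standard procedure: sort the observations by z, drop the p central observations, and fit separate regressions on the low-z and high-z sub-samples. The function names `rss` and `goldfeld_quandt` and the simulated data are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def goldfeld_quandt(y, X, z, p):
    """Return (G, df); assumes n - p is even and X includes a constant column."""
    n, k = X.shape
    order = np.argsort(z)                  # sort observations by z
    y, X = y[order], X[order]
    m = (n - p) // 2                       # observations in each sub-sample
    rss1 = rss(X[:m], y[:m])               # low-z sub-sample
    rss2 = rss(X[-m:], y[-m:])             # high-z sub-sample
    df = m - k                             # equals (n - p)/2 - k
    G = (rss2 / df) / (rss1 / df)
    if G < 1.0:                            # follow the note: invert if below 1
        G = 1.0 / G
    return G, df

# Simulated example with error s.d. proportional to x (an assumption).
rng = np.random.default_rng(4)
n = 120
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, x)
G, df = goldfeld_quandt(y, X, z=x, p=20)
print(f"G = {G:.2f} on ({df}, {df}) degrees of freedom")
```

The resulting G would then be compared with the critical value from F tables for the reported degrees of freedom.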
– Assumes that:
$\sigma_i^2 = a_1 + a_2 z_1 + a_3 z_3 + a_4 z_4 + \dots + a_m z_m$ [1]
where the z’s are all independent variables. The z’s can be
some or all of the original regressors, or some other
variables, or some transformation of the original
regressors which you think cause the
heteroscedasticity:
e.g. $\sigma_i^2 = a_1 + a_2 \exp(x_1) + a_3 x_3^2 + a_4 x_4$
Breusch-Pagan Test
$H_0: \sigma_i^2 = \sigma^2$
$H_A: \sigma_i^2 = f(Z_1, Z_2, \dots, Z_n) = f(\gamma_1 Z_1 + \dots + \gamma_n Z_n)$
$H_0: \gamma_1 = \gamma_2 = \dots = \gamma_n = 0$
Breusch-Pagan Test: Problems
• This test is not reliable if the errors are not
normally distributed and if the sample size
is small
• Koenker (1981) offers an alternative
calculation of the statistic which is less
sensitive to non-normality in small
samples:
$B_{Koenker} = nR^2 \sim \chi^2_{m-1}$
where n and $R^2$ are from the regression of $\hat{u}^2$
on the z’s, and $B_{Koenker}$ has a chi-square
distribution with m-1 degrees of freedom
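Koenker's version of the statistic is simple to compute by hand. The sketch below assumes that the z's are just the original regressors and uses simulated data with error standard deviation proportional to x; the function name `koenker_bp` is hypothetical.

```python
import numpy as np

def koenker_bp(y, X, Z):
    """n * R^2 from regressing squared OLS residuals on Z (Z includes a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ coef) ** 2                       # squared OLS residuals
    g, *_ = np.linalg.lstsq(Z, u2, rcond=None)     # auxiliary regression
    fitted = Z @ g
    r2 = 1.0 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)
    return len(y) * r2, Z.shape[1] - 1             # statistic, m - 1 d.f.

# Simulated example (assumed data-generating process).
rng = np.random.default_rng(5)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, x)
X = np.column_stack([np.ones(n), x])
stat, df = koenker_bp(y, X, Z=X)
print(f"nR^2 = {stat:.1f} with {df} d.f. (5% critical value for 1 d.f. is 3.84)")
```

A statistic above the chi-square critical value leads to rejecting homoskedasticity.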
White (1980) Test: Procedure
– (ii) use $\hat{u}^2$ as the dependent variable in another
regression, in which the regressors are:
  • (a) all "k" original independent variables, and
  • (b) the square of each independent variable
  (excluding dummy variables), and all 2-way
  interactions (or cross-products) between the
  independent variables.
– The square of a dummy variable is excluded
because it will be perfectly correlated with the
dummy variable.
Notes on White’s test:
• The White test does not make any
assumptions about the particular form of
heteroskedasticity, and so is quite general
in application.
– It does not require that the error terms be
normally distributed.
– However, rejecting the null may be an
indication of model specification error, as well
as or instead of heteroskedasticity.
White test in Brief
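1.) Run OLS on $Y_i = B_1 + B_2 X_{2i} + B_3 X_{3i} + e_i$ to
obtain $e_i^2$
2.) Regress $e_i^2 = a_1 + a_2 Z_{2i} + a_3 Z_{3i} + \dots + a_K Z_{Ki} + v_i$
3.) Calculate $\chi^{2*} = N \cdot R^2$ where $R^2$ is from (2.)
4.) Compare $\chi^{2*}$ to the critical $\chi^2$ with K d.f., where
K = number of regressors in (2.)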
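The four steps can be sketched as follows for a model with two continuous regressors. The function name `white_test` and the simulated data are hypothetical, no dummy variables are handled, and the degrees of freedom are taken as the number of auxiliary regressors excluding the constant, which is an assumption about what K counts.

```python
import numpy as np

def white_test(y, X):
    """X holds the non-constant regressors (n x k). Returns (N * R^2, d.f.)."""
    n, k = X.shape
    X1 = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    e2 = (y - X1 @ coef) ** 2                      # step 1: squared residuals
    # Step 2: auxiliary regressors are levels, squares and two-way cross-products.
    cols = [X[:, j] for j in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i, k)]
    Z = np.column_stack([np.ones(n)] + cols)
    g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    r2 = 1.0 - np.sum((e2 - Z @ g) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2, Z.shape[1] - 1                  # step 3 statistic and its d.f.

# Simulated example: error s.d. proportional to the first regressor (assumed).
rng = np.random.default_rng(6)
n = 500
X = rng.uniform(1.0, 10.0, size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0.0, X[:, 0])
stat, df = white_test(y, X)
print(f"N * R^2 = {stat:.1f} with {df} d.f. (5% critical value for 5 d.f. is 11.07)")
```

With two regressors the auxiliary regression has two levels, two squares and one cross-product, hence five degrees of freedom in this sketch.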
Ramsey Reset Test
(1) Run the intended regression and obtain the
residuals, εˆ, and the predicted values, yˆ
2. Estimate and specify the auxiliary regression. The
auxiliary regression is thus one where εˆ from the
original regression is regressed on powers of the
predicted values:
Heteroscedasticity
Solutions/Correction:
• GLS Estimation
• Weighted Least Squares
• Respecification of the model
– Include relevant omitted variable(s)
– Express model in log-linear form
– Express variables in per capita form
• White’s Standard Errors
– Where respecification won’t solve the problem, use
robust heteroskedasticity-consistent standard
errors (White standard errors)
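As an illustration of the last remedy, White standard errors can be computed directly from the OLS residuals. The sketch below uses the basic (often labelled HC0) form without any small-sample correction; the function name `ols_with_white_se` and the simulated data, with error standard deviation proportional to x squared, are assumptions for the example.

```python
import numpy as np

def ols_with_white_se(y, X):
    """OLS coefficients with conventional and White (HC0) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b
    # Conventional standard errors assume one common error variance.
    se_ols = np.sqrt(np.diag(XtX_inv) * (e @ e) / (n - k))
    # White: replace the common variance with each squared residual.
    meat = X.T @ (X * (e ** 2)[:, None])           # sum of e_i^2 * x_i x_i'
    se_white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return b, se_ols, se_white

# Simulated example (assumed data-generating process).
rng = np.random.default_rng(7)
n = 2000
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.2 * x**2)
b, se_ols, se_white = ols_with_white_se(y, X)
print(f"slope = {b[1]:.3f}, conventional s.e. = {se_ols[1]:.4f}, "
      f"White s.e. = {se_white[1]:.4f}")
```

The coefficient estimates are unchanged; only the standard errors, and therefore the t-statistics, are adjusted.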
GLS/Weighted Least Squares
• GLS is an alternative estimator that is more efficient
than OLS when errors are heteroscedastic
• A. Weighted Least Squares
– If the differences in variability of the error term can be
predicted from another variable within the model, the
Weight Estimation procedure (available in EViews) can
be used.
  • It computes the coefficients of a linear regression model using
  WLS, such that the more precise observations (that is, those
  with less variability) are given greater weight in determining
  the regression coefficients.
– Problems:
  • Wrong choice of weights can produce biased estimates of the
  standard errors.
  – We can never know for sure whether we have chosen the
  correct weights; this is a real problem.
  • If the weights are correlated with the disturbance term, then the
  WLS slope estimates will be inconsistent.
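The weighting idea can be sketched without any statistics package. This is not the EViews procedure itself: it is a minimal NumPy illustration that assumes the error standard deviation of each observation is known (here proportional to x), divides every observation by it, and runs OLS on the transformed data. The function name `wls` is hypothetical.

```python
import numpy as np

def wls(y, X, sd):
    """WLS by dividing each observation by its error s.d., then running OLS."""
    w = 1.0 / sd
    coef, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return coef

# Simulated example: error s.d. proportional to x (assumed known).
rng = np.random.default_rng(8)
n = 300
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, x)
b_wls = wls(y, X, sd=x)

# Cross-check against the weighted normal equations with weights 1 / sd^2.
W = np.diag(1.0 / x**2)
b_check = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print("WLS via transformed OLS:     ", np.round(b_wls, 3))
print("WLS via weighted normal eqs.:", np.round(b_check, 3))
```

The two calculations agree, which shows that WLS is simply OLS applied after giving the more precise observations greater weight.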
Summary