
ECON335 CH.8 - HETEROSKEDASTICITY 3-27-24

CHAPTER 8
HETEROSKEDASTICITY
We’re now going to consider situations in which one (or more) of the basic
assumptions of the LRM is not present. Two situations we consider involve (i) the
homoscedasticity assumption failing (we call this a situation in which we have
heteroscedasticity), and (ii) the random sampling assumption that the disturbances
are independent of each other fails (the situation we will consider is called serial
correlation). This chapter focuses on the first situation.

Heteroskedasticity: return to Assumption MLR.5 Homoskedasticity (7th: 88): "The error u has the same variance given any value of the explanatory variables. In other words, Var(u|x1,…,xk) = σ2."
Recall what this implied, graphically ….

Ex. Consumption Function: the example we considered was a consumption function where consumption (y) depended on income (x1) and could also depend on a variety of other factors such as preferences, the prices of goods, a person's education, and so on. We are interested primarily in the relationship between consumption and income. Assumption MLR.5 implies the above graph; the variation in consumption is the same for a person with $20,000 in income and a person with $200,000 in income.
The assumption seems inappropriate in this context. Rather, what makes more sense is that the variance becomes greater as income increases. It would look something like …

The question we consider in this chapter is what happens to our analysis when we
are presented with a situation in which variances differ across observations?

Specifically, suppose that V(ui|xi) = σ2i.

[Note the subscript: it implies that the variance can be different for different
observations (notation is important in econometrics.)]

We will assume that all of the other assumptions of the LRM apply.
Importantly, we still have a linear regression [put up the LRM equation], still
assume that the expected value of the disturbance conditional on x equals zero, and
so on.

☞ Remember: Econ Theory → Econometric Model → Estimator.

There are a variety of questions we might ask:


1) Can we still use the OLS estimator as an estimator of the parameters of the
model? Yes.

2) Do we have to change any part of our prior analysis of the OLS estimator?
Yes, the std errors.

3) Is the OLS estimator still the BLUE of the population parameters? No.

4) What is the BLUE of the population parameters? Called the GLS (or WLS)
estimator.

5) How do we estimate the WLS model? What are the relevant R commands?

Hypothesis Testing
Because we do not have to address any of these questions if heteroskedasticity is not present in the data, it makes sense to first consider whether we can perform hypothesis tests to determine its presence. If it is not present, then we can rely on the analysis we undertook in ECON335 (with respect to the Multiple Linear Regression (MLR) model).

Ex [7th: 291 C8] Determinants of college GPA


y = college GPA x1 = HS GPA (quadratic) x2 = ACT Score
x3 = average # of lectures missed per week x4 = 1 if pc at home

The null hypothesis in any test will be that homoscedasticity exists; in other words,


H0: Var(ui|x1i, x2i … xki) = σ2 [Why does this imply homoscedasticity? The variance
doesn’t depend on i.]

☞ How do we get started when thinking about hypothesis testing?

Some theoretical econometricians noted that the zero conditional mean and
homoskedasticity assumptions regarding the disturbances (MLR4 & 5) imply that

E(u2i|x1i, x2i … xki) = σ2.

We can interpret this result as implying that the variance of u does not depend on
any of the x’s; it is constant across the x’s. This observation leads to various
heteroscedasticity tests.

Breusch-Pagan Heteroskedasticity Test [7th: 269]

Ex. Suppose that we propose to estimate the following regression:

u2 = δ0 + δ1•x1 + … + δk•xk + v [eq 8.12 7th: 269]

where the deltas are parameters to be estimated and v is a disturbance (which we assume is independent of the x's).

Interpret the regression: variance depends on the x's.


What would a null hypothesis of homoscedasticity imply with respect to (wrt) the
parameters? Think of the null: it implies that the x’s don’t affect the variance. If
they don’t affect the variance, what would have to be true of the parameters?
[only δ0 is non-zero]
H0: δ1 = δ2 = … = δk = 0
HA: H0 is not true.

Given what we have learned in this class, we know how to undertake this
hypothesis test.

What would we call the test we just proposed? A test of the overall significance of
the regression.

We know that we could undertake this test using an F test.

☞ The F statistic [eq. (8.15) 7th: 270] is F = (R2u/k) / ((1 − R2u)/(n−k−1)),

where R2u is the R2 from the regression in 8.12.

It has an approximate F distribution with k d.f. in the numerator and (n−k−1) d.f. in the denominator. [7th: 788 has F tables for critical values at a 5% l.o.s.]

☞ Problem with the proposed test: we don't have the u's for individual observations.


What can we do? We will see that the OLS estimator turns out to be a consistent estimator of the parameters (betas) of the LRM under heteroskedasticity.

Consider using the residual as an estimator of the disturbance:

û = y − (β̂0 + β̂1•x1 + … + β̂k•xk)   [the residual]

Since the betas are consistent estimators of the population parameters, this is a
consistent estimator of the disturbance.

So, we resolve the problem by first estimating the model using the OLS estimator,
and obtaining the residuals.

Hypothesis Testing Steps for the B-P test using the F statistic [7th: 270]

(1) Estimate the econometric model using OLS and obtain residuals.

(2) Propose to estimate the parameters of the following model:

û2 = δ0 + δ1•x1 + … + δk•xk + v

(3) Propose a null hypothesis that all “slope” parameters (parameters other than
the intercept) equal zero & establish a level of significance (5%).

(4) Estimate the model and calculate the R2 for this regression: R2u.


(5) Calculate the F statistic for testing the overall significance of the regression & its probability value.

(6) If the probability of the statistic is less than the level of significance, then reject the null hypothesis. Or, if the value of the sample statistic, Fs, is greater than the critical value of the statistic, then reject the null hypothesis.

Note: one does not have to use the exact variables in the regression equation of
interest. One could use a subset of the x’s. Another possibility is to use the
predicted values from the OLS regression.
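[Aside: here is a minimal sketch of steps (1)-(6) in Python (the course uses R, but the mechanics are the same). The data are made up, with an error whose spread grows with x, so the numbers are only illustrative.]

```python
# Breusch-Pagan test sketch (simple regression case): regress the squared
# OLS residuals on x and test the overall significance of that regression.
n = 30
x = [float(i) for i in range(1, n + 1)]
u = [(-1) ** i * 0.1 * xi for i, xi in enumerate(x)]   # spread grows with x
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

def slr(x, y):
    """OLS intercept and slope for a simple linear regression."""
    m = len(x)
    xbar, ybar = sum(x) / m, sum(y) / m
    b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
    return ybar - b1 * xbar, b1

# steps (1)-(2): estimate the model by OLS and square the residuals
b0, b1 = slr(x, y)
usq = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]

# steps (3)-(4): regress u-hat^2 on x and compute that regression's R^2
d0, d1 = slr(x, usq)
fitted = [d0 + d1 * xi for xi in x]
ubar = sum(usq) / n
r2u = sum((f - ubar) ** 2 for f in fitted) / \
      sum((ui - ubar) ** 2 for ui in usq)

# steps (5)-(6): F statistic for overall significance, k = 1 regressor here
k = 1
F = (r2u / k) / ((1 - r2u) / (n - k - 1))
print(round(F, 2))   # compare with the 5% critical value F(1, 28), about 4.20
```

With this construction the auxiliary regression has a single regressor, so the test has 1 numerator d.f.; an F far above 4.20 leads us to reject homoskedasticity, as expected given how the data were built.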

Ex. College GPA: assume that the variance is proportional to high school GPA.

8-3a The White Test for Heteroskedasticity [7th: 271]


Another approach to testing for heteroskedasticity was identified by White.

Intuition: Theoreticians have shown that we can replace the homoscedasticity


assumption MLR.5 with a different assumption regarding the relationship between
the squared disturbances and the regressors. White used this observation to derive a
test of heteroscedasticity.

The test, specifically, looks to see if there's a relationship between the squared residuals from an OLS regression and various transformations of the independent variables.
If homoskedasticity is present, we'd expect no relationship between the squared residuals and any of those variables.


Steps for White test


Suppose that our proposed regression model is y = β0 + β1•x1 + β2•x2 + u

(1) Obtain residuals from an OLS regression & square them.

(2) Propose estimating the following equation (which includes all of the
independent variables, squares of independent variables & cross-products of
independent variables).

û2 = δ0 + δ1•x1 + δ2•x1² + δ3•x2 + δ4•x2² + δ5•x1•x2 + v (I)

(3) Propose a null hypothesis that there is no heteroskedasticity. What does


such an assumption imply in the context of the foregoing regression? That
all parameters but the intercept equal zero.

Why? Suppose that all but the intercept = 0.


What result do we get? û2 = δ0 + v. Its expected value will equal δ0, a constant.

Thus, our null hypothesis is H0: δ1 = δ2 = δ3 = δ4 = δ5 = 0


(4) Establish a level of significance for the test: 5%.

(5) Estimate equation (I) and obtain its R2. Then, calculate

LM = n•R2, which has an approximate chi-squared distribution with p d.f., where p = # of slope parameters in equation (I) (i.e., excluding the intercept).

If the probability of n•R2 (given H0) is less than the level of significance of the test, we reject the null hypothesis & conclude that heteroskedasticity is present.
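[Aside: a minimal Python sketch of the White test with a single regressor x, so the auxiliary regression uses x and x² and no cross products arise. The data are made up; LM = n•R2 is compared with the chi-squared(2) 5% critical value.]

```python
# White test sketch with one regressor: regress squared OLS residuals on
# x and x^2, then compute LM = n * R^2 (chi-squared with 2 d.f. under H0).
n = 30
x = [float(i) for i in range(1, n + 1)]
u = [(-1) ** i * 0.1 * xi for i, xi in enumerate(x)]   # heteroskedastic
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

def ols(X, y):
    """Least squares via the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    c = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    M = [row + [ci] for row, ci in zip(A, c)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

# squared residuals from the OLS regression of y on x
b = ols([[1.0, xi] for xi in x], y)
usq = [(yi - b[0] - b[1] * xi) ** 2 for xi, yi in zip(x, y)]

# auxiliary regression: u-hat^2 on 1, x, x^2
d = ols([[1.0, xi, xi ** 2] for xi in x], usq)
fit = [d[0] + d[1] * xi + d[2] * xi ** 2 for xi in x]
ubar = sum(usq) / n
r2 = sum((f - ubar) ** 2 for f in fit) / \
     sum((ui - ubar) ** 2 for ui in usq)

LM = n * r2
print(round(LM, 2))   # compare with the 5% chi-squared(2) critical value, 5.99
```

Because the made-up error variance is quadratic in x, the auxiliary regression fits well and LM comes out far above 5.99, so the null of homoskedasticity is rejected.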

Introduce the Chi-Squared distribution (7th : pp. 709, 790)

A Special Case of the White Test for Heteroskedasticity (7th: 272)


Calculating the squares of all of the independent variables and their cross products can be daunting when you have a large number of independent variables.

Ex. y = college GPA x1 = HS GPA x2 = ACT Score


x3 = average # of lectures missed per week x4 = 1 if pc at home

[Derive the equation which will be estimated & the tested hypotheses ….]
u^ 2 = ….

Note that the square of a dummy variable equals the dummy variable itself. So, we can't include the square.

The regression to estimate can become really long. What to do?

(1) Drop the cross products. Do this for the above example.

(2) A shorter way to test the same thing involves using predicted values from
the OLS regression. We estimate
û2 = δ0 + δ1•ŷ + δ2•ŷ² + v


where ŷ equals …. Note that it is a linear combination of the x's.

For this test the null hypothesis is δ1 = δ2 = 0. It has 2 d.f., a lot fewer than if we
included all of the variables, their squares, and their cross products.

(7th: 272) Can calculate an LM statistic (White) or an F statistic.

☞ Shortcoming of the White test: it isn't necessarily a direct test of heteroskedasticity. It could also pick up variables which are excluded from the regression. So, it might be picking up something other than heteroskedasticity. In this regard, Wooldridge recommends more direct tests of specific functional forms of heteroskedasticity.

Ex. Consumption as a function of income and other variables. We might expect the variances to differ with income. Could undertake a specification test.

Note: this can be done with the B-P test for a subset of the explanatory variables.

8-1 Consequences of Heteroskedasticity for OLS (7th: 262)

Once we have concluded that heteroskedasticity is present in a data set, the question becomes "what should we do?"

The first place we might look is the OLS estimator. Can we still rely on the OLS
estimator to make inferences about the population parameters (the betas)? Let’s
take a look at this.


Bottom Line: the OLS estimator remains unbiased/consistent, but it is no longer efficient.

Bias in OLS Standard Errors (7th: 263) We have seen that

Theorem 3.2 (7th: 88) Sampling Variances of the OLS Slope Estimators

V(β̂j) = σ2 / (SSTj•(1 − R2j))

where, as we know, SSTj is the sum of squared deviations of xj from its mean and R2j is the coefficient of determination from a regression of xj on all of the other independent variables.

Note how the variance of the OLS estimator of beta depends on σ2, the assumed
constant variance of the disturbance. Because we are now assuming that the
variance can differ across observations, it is clear that this formula cannot work for
the model with heteroskedasticity.

So, the OLS formulas for the standard errors, derived under the assumption of homoskedasticity, are wrong.

[Implication wrt R: if you use the “lm” command and don’t account for the
possibility of different variances, the standard errors you get will be incorrect.]


8-2 Heteroskedasticity-Robust Inference after OLS Estimation (7th: 263)

To get an idea of the impact of heteroskedasticity on the formula for the variance of β̂1, let's consider the formula for the variance of β̂1 in the simple linear regression case.

SLR: y = β0 + β1•x1 + u

V(β̂1) = Σi=1..n (x1i − x̄1)²•σ2i / SST1²   [8-2]

Note: it weights each variance by the degree to which x1i deviates from the overall mean of x1.

So, if we are to use the OLS estimator, we need to make sure that we use the
correct formula for the variance of the estimator.

☞ Issue: there's one problem with [8-2]: we don't know the σ2i. Is all lost? No.

It has been shown that we can replace the σ2i with the squared residuals from an OLS regression.

So, we estimate the model using OLS and get the squared residuals from the regression: ûi2.

And insert them into [8-2]:

V̂(β̂1) = Σi=1..n (x1i − x̄1)²•ûi2 / SST1²   [8-3]


Wooldridge (7th: 264 eq. (8-4)) identifies the formula for the multiple linear
regression case.
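[Aside: a minimal Python sketch of formula [8-3] for the simple regression case, using made-up heteroskedastic data; it also computes the conventional (homoskedasticity-based) variance estimate so the two standard errors can be compared.]

```python
# Heteroskedasticity-robust variance of beta1-hat in simple regression,
# per [8-3]: sum of (x_i - xbar)^2 * uhat_i^2, divided by SST^2.
n = 30
x = [float(i) for i in range(1, n + 1)]
u = [(-1) ** i * 0.1 * xi for i, xi in enumerate(x)]   # made-up data
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

xbar, ybar = sum(x) / n, sum(y) / n
sst = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst
b0 = ybar - b1 * xbar
usq = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]

# robust (White) variance estimate, eq. [8-3]
v_robust = sum((xi - xbar) ** 2 * ui for xi, ui in zip(x, usq)) / sst ** 2

# conventional OLS variance estimate sigma^2 / SST (assumes homoskedasticity)
sigma2 = sum(usq) / (n - 2)
v_conv = sigma2 / sst

se_robust, se_conv = v_robust ** 0.5, v_conv ** 0.5
print(se_robust, se_conv)   # the two standard errors differ here
```

The gap between the two standard errors is exactly what the "lm"-without-adjustment warning above is about: with heteroskedastic data, the conventional formula is simply the wrong one.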

These are called [heteroskedasticity-]robust (White) standard errors.

In R, we'll use HC1 [don't worry about the formulae].

Hypothesis Testing

t-Test (7th: 265): all is the same except that we use the new standard error of the parameter estimate.

F Test of Multiple Restrictions: use the heteroskedasticity-robust Wald statistic.

What if a software package does not calculate it? Then we can turn to

8-2a Computing Heteroskedasticity-Robust LM Tests (7th: 267, 172): We won't consider it.

8-4 Weighted Least Squares Estimation (7th: 273)


So far, we have considered using the OLS estimator as an estimator of the parameters of the model with heteroskedasticity. I have also noted that the OLS estimator is not efficient.

Question: does an efficient estimator exist? Answer: Yes.

Generally, it's called the Generalized Least Squares (GLS) estimator. In the context of heteroskedasticity, it is called the Weighted Least Squares (WLS) estimator.

The general idea behind identification of the minimum variance estimator is to re-
characterize the model as one in which all of the assumptions of the MLR model
hold. Specifically, we transform the model into one in which homoscedasticity is
present. Once we have homoscedasticity, we can rely on the Gauss-Markov
Theorem & all of the desirable properties of the OLS estimators. That’s what we
will do.

We make assumptions MLR.1 to 4 and assume that the disturbances can have
different variances.

Because we are going to allow the variances of the disturbances to vary across
observations, I will represent the linear model by adding subscripts for
observations; i.e.,
yi = β0 + β1∙x1i + β2∙x2i + … + βk∙xki + ui

☺ Now, consider the following transformation of the data

yi/σi = β0/σi + β1∙x1i/σi + β2∙x2i/σi + … + βk∙xki/σi + ui/σi



We divide each variable by σi, the standard deviation of that observation's disturbance.


What is the expected value of the transformed variable?

E(yi /σi) = β0/σi + β1∙x1i /σi + β2∙x2i /σi + … + βk∙xki /σi + E(ui/σi)

Let’s consider the mean & variance of the transformed disturbance.

E(ui/σi) = (1/σi)∙E(ui) = 0

V(ui/σi) = (1/σ2i)∙V(ui) = (1/σ2i)∙σ2i = 1.

So, we have a random term with mean = zero & a constant variance = 1.
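[Aside: a tiny Python simulation, with made-up σi's, that makes the algebra above concrete: dividing each disturbance by its own σi produces a transformed disturbance with variance close to 1.]

```python
import random

# Simulate heteroskedastic disturbances u_i with sd sigma_i, then check
# that the transformed disturbances u_i / sigma_i have variance near 1.
random.seed(42)
n = 10_000
sigma = [1.0 + 4.0 * random.random() for _ in range(n)]   # made-up sigma_i
u = [si * random.gauss(0.0, 1.0) for si in sigma]         # V(u_i) = sigma_i^2
t = [ui / si for ui, si in zip(u, sigma)]                 # transformed

var_t = sum(ti ** 2 for ti in t) / n   # second moment; mean is 0 in theory
print(round(var_t, 2))                 # close to 1
```

Of course u_i/σi is a standard normal draw by construction here; the simulation just shows numerically what the derivation above shows algebraically.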

☞ Suppose that we propose estimating a model which incorporates the foregoing transformation. Can we identify an estimator for the model?

What can we say about the transformed model?

Since the variance of the random term is constant w/ mean = 0, we have a model in
which homoskedasticity is present. Further, all of the other assumptions of the
LRM apply.

What is the efficient estimator for this model? The OLS estimator!

That’s the idea behind the GLS estimator and, for heteroscedasticity, the WLS
estimator.


We transform the data and then estimate the model using the OLS estimator. So,
how do we implement the estimator?
Interpretation of GLS Estimator: why is it called a WLS Estimator? (7th: 275)

In order to interpret GLS estimation, let’s set up the basic OLS estimation problem.

Recall that we obtain the OLS estimator by minimizing the sum of squared residuals (7th: 70); i.e.,

min(β̂0, …, β̂k) Σi=1..n (yi − β̂0 − β̂1∙x1i − … − β̂k∙xki)²

Let's represent this minimization for our transformed model:

min(β̂0, …, β̂k) Σi=1..n (yi/σi − β̂0/σi − β̂1∙x1i/σi − … − β̂k∙xki/σi)²

If we let wi = 1/σ2i, this minimization is equivalent to (7th: 275)

min(β̂0, …, β̂k) Σi=1..n wi•(yi − β̂0 − β̂1∙x1i − … − β̂k∙xki)²

If we contrast this minimization with that for the LRM model, we see that the
difference lies in the wi.


☞ Discuss: the relationship between σ2i & the size of the weight. Graphically ….

OLS is not efficient because it does not take into account the differences in the
variances.

[Side note: WLS in R – students don’t have to know this]

8-4a Heteroskedasticity is Known up to a Multiplicative Constant (7th: 273…)

Letting x = [x1 … xk]′, suppose that the variance of u (conditional on x) takes the
following form:
V(u|x) = σ2h(x).

Ex. Consumption as a function of income & variances related to income.

In other words, it is some common variance multiplied by a function of the x variables. Consider transforming our linear regression model by dividing all of its elements by h(x)1/2.

We get yi/h(x)1/2= β0/h(x)1/2 + β1∙x1i /h(x)1/2 + … + βk∙xki /h(x)1/2 + ui/h(x)1/2 (III)

Focus on the variance of the disturbance:

V(ui/h(x)1/2) = (1/h(x))V(ui) = (1/h(x))•σ2h(x) = σ2

This gets us a homoscedastic disturbance. So, we can estimate equation (III) using
the OLS estimator.

Issue: we don’t know what h(x) looks like.

There are some cases in which we have to obtain estimates of the individual
variances and other cases where we do not. We will consider the latter first.

What to do? Consider the following example and possibility:

Ex Savings Function (7th: eq (8.23) 273) Savingsi = β0 + β1∙incomei + ui

and assume that V(ui|x) = σ2•incomei

Here, y = savings, x1 = income. What is h(x)? Incomei.

How would we implement the GLS estimator? We know that we have to divide all
variables (including the intercept) by the square root of h( ).


So, we divide y, the intercept, and income by (income)1/2; note that the transformed equation has no intercept (no column of ones). We then estimate the model using the OLS estimator (with no intercept).
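[Aside: a Python sketch of the savings example with made-up data, assuming V(ui|x) = σ2•incomei. Dividing y, the intercept, and income by (income)1/2 and running OLS with no intercept gives exactly the same estimates as weighted least squares with wi = 1/incomei, which the last line checks.]

```python
# Savings example sketch: V(u|income) = sigma^2 * income, so divide every
# term by sqrt(income) and estimate by OLS with no column of ones.
income = [float(10 + 5 * i) for i in range(20)]          # made-up data
u = [(-1) ** i * 0.3 * inc ** 0.5 for i, inc in enumerate(income)]
savings = [2.0 + 0.1 * inc + ui for inc, ui in zip(income, u)]

def solve2(a11, a12, a22, c1, c2):
    """Solve a symmetric 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a12
    return (c1 * a22 - c2 * a12) / det, (a11 * c2 - a12 * c1) / det

# transformed regressors: beta0 multiplies 1/sqrt(inc), beta1 multiplies
# inc/sqrt(inc) = sqrt(inc); the dependent variable is savings/sqrt(inc)
r = [inc ** 0.5 for inc in income]
z0 = [1.0 / ri for ri in r]
z1 = r
ystar = [s / ri for s, ri in zip(savings, r)]
b0, b1 = solve2(
    sum(a * a for a in z0), sum(a * b for a, b in zip(z0, z1)),
    sum(b * b for b in z1),
    sum(a * c for a, c in zip(z0, ystar)),
    sum(b * c for b, c in zip(z1, ystar)))

# the same estimates via weighted least squares with w_i = 1/income_i
w = [1.0 / inc for inc in income]
b0_w, b1_w = solve2(
    sum(w), sum(wi * inc for wi, inc in zip(w, income)),
    sum(wi * inc ** 2 for wi, inc in zip(w, income)),
    sum(wi * s for wi, s in zip(w, savings)),
    sum(wi * inc * s for wi, inc, s in zip(w, income, savings)))

print(abs(b0 - b0_w) < 1e-9, abs(b1 - b1_w) < 1e-9)   # prints: True True
```

The two normal-equation systems are term-by-term identical, which is precisely the sense in which "OLS on the transformed data" and "WLS with weights 1/h(x)" are the same estimator.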

Notes:

(1) In this type of case, we don't have to obtain estimates of the individual variances; we just need to have the x variables.

(2) Remember that the GLS estimator is efficient. So, if the form of the variance is correct, the estimates are efficient.

8-4b The Heteroskedasticity Function Must Be Estimated: Feasible GLS (7th: 278)
I noted that the above assumptions regarding the variances do not require that we
obtain estimates of the variances. We call the situation in which we must obtain
estimates of the variances FGLS.
We can make a variety of assumptions about the relationship between the variance of observations and independent variables (they can be x's or other variables). A frequently used specification is

V(ui|xi) = σ2•exp(δ0 + δ1•x1i + … + δj•xji).


Why have the exponential function? Note that variances can’t be negative. If we
ran a linear regression we could get negative predicted values. With the
exponential function we can’t get a negative value.

Issue: we don't have the u's. What should we do? Because the OLS estimator is unbiased, we can use the squared residuals from an OLS regression: ûi2. We have

ûi2 = σ2•exp(δ0 + δ1•x1i + … + δk•xki)•vi, where vi is a disturbance.

Issue: how do we estimate the model? It's not linear in the parameters (because they appear in the exponent of the exponential function). Taking the natural log, we get

ln(ûi2) = ln(σ2) + δ0 + δ1•x1i + … + δk•xki + ei, where ei = ln(vi) (IV)

Note that ln(σ2) + δ0 is the intercept in this regression.

Steps of the procedure (7th: 279):

1) Estimate the model using OLS and obtain the squared residuals: ûi2.
2) Estimate equation (IV) and obtain the predicted values of ln(ûi2).
3) The predicted values are natural logs, so exponentiate them: σ̂2i = exp(predicted ln(ûi2)).
4) Use the σ̂2i's in a WLS regression: the weight will be wi = 1/σ̂2i.
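[Aside: a Python sketch of the four FGLS steps for a simple regression, with made-up data whose error variance is exponential in x; all numbers are illustrative only.]

```python
import math

# FGLS sketch: (1) OLS squared residuals, (2) regress ln(uhat^2) on x,
# (3) exponentiate the fitted values, (4) WLS with weights 1/sigma2hat_i.
n = 40
x = [float(i) for i in range(1, n + 1)]
u = [(-1) ** i * math.exp(0.05 * xi) for i, xi in enumerate(x)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

def slr(x, y):
    """OLS intercept and slope for a simple linear regression."""
    m = len(x)
    xbar, ybar = sum(x) / m, sum(y) / m
    b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
    return ybar - b1 * xbar, b1

# step 1: OLS and squared residuals
b0, b1 = slr(x, y)
usq = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]

# step 2: regress ln(uhat^2) on x
d0, d1 = slr(x, [math.log(ui) for ui in usq])

# step 3: exponentiate the fitted values to get sigma2hat_i
sigma2hat = [math.exp(d0 + d1 * xi) for xi in x]

# step 4: WLS with weights w_i = 1/sigma2hat_i (weighted-means formula)
w = [1.0 / s for s in sigma2hat]
sw = sum(w)
xw = sum(wi * xi for wi, xi in zip(w, x)) / sw
yw = sum(wi * yi for wi, yi in zip(w, y)) / sw
b1_fgls = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y)) / \
          sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
b0_fgls = yw - b1_fgls * xw
print(round(b0_fgls, 2), round(b1_fgls, 2))
```

Note how step 4 uses weighted means: that is exactly the minimization of Σ wi•(yi − β̂0 − β̂1•xi)² written in closed form for one regressor.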

Issues/Notes:


(1) We use FGLS to obtain estimates of the parameters. The fact that the independent variables (including the intercept) are divided by an h( ) function does not change this result. The transformation gets us efficient estimates of the parameters of the underlying linear model.

(2) Alternative FGLS model: ln(ûi2) = δ0 + δ1•ŷ + δ2•ŷ² + v [7th: 279 eq 8.34]

(3) [7th: 281] What if the OLS and GLS estimates are notably different (e.g., different signs and both statistically significant)? Since both estimators are consistent, this suggests that there might be some other problem with the regression specification (i.e., some other LRM assumption has failed, such as the form of the CEF).

8-4c What if the Assumed Heteroskedasticity Function Is Wrong? [7th: 281]

Issue: what if we make an incorrect assumption regarding the form of the heteroskedasticity?

Are the WLS estimates biased/inconsistent? What about the efficiency properties?

They are not biased or inconsistent (though the efficiency gain is no longer guaranteed).

If we simply use an h(x) function, the estimates are unbiased because any function of the x's is uncorrelated with the disturbance.

If we have to obtain estimates of the h( ) function (as in 8-4b), then the estimator is consistent, but biased. The consistency is good enough.


What about standard errors in FGLS? They are biased, but we will be okay if we obtain robust standard errors [7th: 282]. So, the suggestion is to always calculate robust standard errors.

8-5 The Linear Probability Model Revisited [7th: 284]


When we discussed the LPM we noted that the conditional variance of y is

V(y|x) = p(x)•(1 − p(x)), where p(x) = β0 + β1•x1 + … + βk•xk [7th: 285]

Is the variance constant? Nope, it depends on the values of x. So, the LPM is
inherently a heteroskedastic model.

Implication: if we use OLS, the standard errors will be incorrect. If we use WLS (GLS), we know how to get efficient estimates.

What to do? Get estimates of the variances. We know that the predicted values from an OLS regression are estimates of the probabilities. So,

V̂(yi|xi) = ŷi•(1 − ŷi) = ĥi   [8.47]

Issue: what happens if we get a predicted value outside of the unit interval? [7th: 285]

WLS won't work if this happens. Possibilities:

(1) Make an adjustment which produces predictions in the unit interval. Problem with this: it's arbitrary.

(2) Use OLS and compute heteroskedasticity-robust standard errors. This is probably preferred.

Summary: Steps for the analysis [7th: 286]

1) Estimate the LPM and obtain the predicted values.

2) If the predicted values all lie between 0 & 1, you can proceed with WLS.

3) Negative predicted values are problematic for the weights. If there are predicted values outside the unit interval, use the OLS estimator with robust standard errors (7th: 285).

4) For WLS, construct the estimated variances from [8.47] (7th: 285).

5) Estimate the equation using WLS.
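[Aside: a Python sketch of these LPM steps with made-up binary data; it checks that the fitted probabilities lie inside the unit interval before forming the weights.]

```python
# LPM sketch: estimate by OLS, check fitted probabilities are inside
# (0, 1), form h_i = yhat_i * (1 - yhat_i), then run WLS with 1/h_i.
x = [float(i) for i in range(1, 21)]
y = [0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1]
n = len(x)

# step 1: OLS estimates of the linear probability model
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
     sum((a - xbar) ** 2 for a in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * a for a in x]

# step 2: all fitted values inside the unit interval -> WLS is usable
ok = all(0.0 < p < 1.0 for p in yhat)

# steps 4-5: estimated variances h_i from [8.47], then WLS with w_i = 1/h_i
h = [p * (1.0 - p) for p in yhat]
w = [1.0 / hi for hi in h]
sw = sum(w)
xw = sum(wi * a for wi, a in zip(w, x)) / sw
yw = sum(wi * b for wi, b in zip(w, y)) / sw
b1_wls = sum(wi * (a - xw) * (b - yw) for wi, a, b in zip(w, x, y)) / \
         sum(wi * (a - xw) ** 2 for wi, a in zip(w, x))
b0_wls = yw - b1_wls * xw
print(ok, round(b0_wls, 3), round(b1_wls, 3))
```

If `ok` came back False, step 3 applies instead: keep the OLS estimates and report heteroskedasticity-robust standard errors rather than forcing the weights.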

