
Heteroscedasticity

Chapter 11
Answers to the following questions:

1. What is the nature of heteroscedasticity?


2. What are its consequences?
3. How does one detect it?
4. What are the remedial measures?

09/10/20 Prepared by Sri Yani K 2


The nature of heteroscedasticity

1. The error-learning models → variance is expected to decrease
2. Discretionary income → variance is expected to increase
3. Data-collecting techniques improve → variance is likely to decrease
4. Presence of outliers
5. Specification bias → some important variables are omitted from the model



The nature of heteroscedasticity

6. Skewness in the distribution of one or more regressors included in the model
7. As David Hendry notes, heteroscedasticity can also arise because of:
a) Incorrect data transformation
b) Incorrect functional form
8. The problem of heteroscedasticity is likely to be more common in cross-section than in time series data.



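OLS estimation in the presence of heteroscedasticity
• The two-variable model:
$Y_i = \beta_1 + \beta_2 X_i + u_i$

$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2}$

$\operatorname{var}(\hat{\beta}_2) = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$

(lowercase $x_i$ and $y_i$ are deviations from the sample means)

• $\hat{\beta}_2$ is still linear, unbiased and consistent, but it is no longer best (minimum variance) → it is not BLUE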



The method of Generalized Least Squares (GLS)
• GLS is OLS on the transformed variables that satisfy the standard least squares assumptions
• GLS estimators are BLUE
• The two-variable model:
$Y_i = \beta_1 + \beta_2 X_i + u_i$
$Y_i = \beta_1 X_{0i} + \beta_2 X_i + u_i$, where $X_{0i} = 1$ for each $i$
Assume that the heteroscedastic variances $\sigma_i^2$ are known. Dividing through by $\sigma_i$:
$\frac{Y_i}{\sigma_i} = \beta_1 \left(\frac{X_{0i}}{\sigma_i}\right) + \beta_2 \left(\frac{X_i}{\sigma_i}\right) + \left(\frac{u_i}{\sigma_i}\right)$



The method of Generalized Least Squares (GLS)
$Y_i^* = \beta_1^* X_{0i}^* + \beta_2^* X_i^* + u_i^*$
where the starred variables are the original variables divided by $\sigma_i$.
$\operatorname{var}(u_i^*) = E(u_i^*)^2 = E\left(\frac{u_i}{\sigma_i}\right)^2$
$= \frac{1}{\sigma_i^2} E(u_i^2)$, since $\sigma_i^2$ is known
$= \frac{1}{\sigma_i^2} \sigma_i^2$, since $E(u_i^2) = \sigma_i^2$
$= 1$, a constant → the transformed error $u_i^*$ is homoscedastic



GLS estimators
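$\frac{Y_i}{\sigma_i} = \beta_1^* \left(\frac{X_{0i}}{\sigma_i}\right) + \beta_2^* \left(\frac{X_i}{\sigma_i}\right) + \left(\frac{u_i}{\sigma_i}\right)$
$Y_i^* = \beta_1^* X_{0i}^* + \beta_2^* X_i^* + u_i^*$

Minimize
$\sum \hat{u}_i^{*2} = \sum \left(Y_i^* - \hat{\beta}_1^* X_{0i}^* - \hat{\beta}_2^* X_i^*\right)^2$
that is,
$\sum \left(\frac{\hat{u}_i}{\sigma_i}\right)^2 = \sum \left[\left(\frac{Y_i}{\sigma_i}\right) - \hat{\beta}_1^* \left(\frac{X_{0i}}{\sigma_i}\right) - \hat{\beta}_2^* \left(\frac{X_i}{\sigma_i}\right)\right]^2$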



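GLS estimators

$\hat{\beta}_2^* = \frac{\left(\sum w_i\right)\left(\sum w_i X_i Y_i\right) - \left(\sum w_i X_i\right)\left(\sum w_i Y_i\right)}{\left(\sum w_i\right)\left(\sum w_i X_i^2\right) - \left(\sum w_i X_i\right)^2}$

$\operatorname{var}(\hat{\beta}_2^*) = \frac{\sum w_i}{\left(\sum w_i\right)\left(\sum w_i X_i^2\right) - \left(\sum w_i X_i\right)^2}$

where $w_i = \frac{1}{\sigma_i^2}$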
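
The two GLS formulas can be checked numerically. The following is a minimal sketch, assuming the $\sigma_i$ are known; the function name `gls_two_variable` and the sample data are illustrative and not from the source. The intercept uses the standard weighted-means result $\hat{\beta}_1^* = \bar{Y}^* - \hat{\beta}_2^* \bar{X}^*$, which the slides do not show.

```python
def gls_two_variable(x, y, sigma):
    """Weighted (GLS) estimates for Y = b1 + b2*X + u with known sigma_i."""
    w = [1.0 / s ** 2 for s in sigma]                 # w_i = 1 / sigma_i^2
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    denom = sw * swxx - swx ** 2                      # shared denominator
    b2 = (sw * swxy - swx * swy) / denom              # GLS slope
    var_b2 = sw / denom                               # var of the GLS slope
    b1 = swy / sw - b2 * swx / sw                     # weighted means (assumed result)
    return b1, b2, var_b2


# Hypothetical data: the spread of u is assumed to grow with X.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.4, 8.6, 11.5]
sigma = [0.5, 1.0, 1.5, 2.0, 2.5]
b1, b2, var_b2 = gls_two_variable(x, y, sigma)
print(b1, b2, var_b2)
```

With all $\sigma_i$ equal to 1 the weights drop out and the expressions reduce to the usual OLS formulas.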



Difference between OLS and GLS

In OLS we minimize a sum of squared residuals:
$\sum \hat{u}_i^2 = \sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)^2$
In GLS we minimize a weighted sum of squared residuals, with $w_i = 1/\sigma_i^2$:
$\sum w_i \hat{u}_i^2 = \sum w_i \left(Y_i - \hat{\beta}_1^* - \hat{\beta}_2^* X_i\right)^2$

Observations coming from a population with larger $\sigma_i$ get a relatively smaller weight, and those from a population with smaller $\sigma_i$ get a proportionally larger weight, in minimizing the RSS



Consequences of using OLS in the presence of heteroscedasticity

$\operatorname{var}(\hat{\beta}_2^*) \le \operatorname{var}(\hat{\beta}_2)$

• Confidence intervals based on the latter (the OLS variance) will be unnecessarily large
• The t and F tests are likely to give inaccurate results: a coefficient may appear statistically insignificant when it is in fact significant
• If we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw or inferences we make may be very misleading



Detection
Informal methods
1. Nature of the problem
2. Graphical method

Formal methods
1. Park test
2. Glejser test
3. Spearman’s rank correlation test
4. Goldfeld-Quandt test
5. Breusch-Pagan-Godfrey test
6. White’s general heteroscedasticity test
7. Koenker-Bassett test



Graphical method
[Figure: five scatter panels, each plotting the squared residuals (vertical axis, u²) against X (horizontal axis)]



Park test
• The variance is some function of the explanatory variable $X_i$:
$\sigma_i^2 = \sigma^2 X_i^{\beta} e^{v_i}$ or $\ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i$
Since $\sigma_i^2$ is generally unknown, use $\hat{u}_i^2$ as a proxy:
$\ln \hat{u}_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i = \alpha + \beta \ln X_i + v_i$
If $\beta$ is statistically significant → heteroscedasticity
Problem: $v_i$ may not satisfy the OLS assumptions and may itself be heteroscedastic
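
The Park regression can be sketched in a few lines. This is a minimal illustration, assuming a two-variable model, strictly positive $X_i$ and nonzero residuals; the helper names `ols_line` and `park_test` and the data are hypothetical. The t ratio it returns would still have to be compared with a critical t value for n - 2 df.

```python
import math


def ols_line(x, y):
    """Two-variable OLS: returns intercept, slope, residuals, se(slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b2 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b1 = my - b2 * mx
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)          # estimated error variance
    return b1, b2, resid, math.sqrt(s2 / sxx)


def park_test(x, resid):
    """Regress ln(u_hat^2) on ln(X); returns beta and its t ratio."""
    ln_x = [math.log(xi) for xi in x]                 # requires X_i > 0
    ln_u2 = [math.log(e * e) for e in resid]          # requires u_hat_i != 0
    _, beta, _, se_beta = ols_line(ln_x, ln_u2)
    return beta, beta / se_beta


# Hypothetical data (not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.3, 3.6, 6.9, 7.2, 12.4, 10.1, 17.8, 13.5]
_, _, resid, _ = ols_line(x, y)
beta, t_beta = park_test(x, resid)
print(beta, t_beta)
```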



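Glejser test
Regress the absolute residuals $|\hat{u}_i|$ on $X_i$, trying functional forms such as:
$|\hat{u}_i| = \beta_1 + \beta_2 X_i + v_i$
$|\hat{u}_i| = \beta_1 + \beta_2 \sqrt{X_i} + v_i$
$|\hat{u}_i| = \beta_1 + \beta_2 \frac{1}{X_i} + v_i$
$|\hat{u}_i| = \beta_1 + \beta_2 \frac{1}{\sqrt{X_i}} + v_i$
$|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i} + v_i$
$|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i^2} + v_i$

Problem: the error term $v_i$ has a nonzero expected value, is serially correlated, and is itself heteroscedastic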



Spearman’s rank correlation test

Spearman’s rank correlation coefficient:

$r_s = 1 - 6 \left[\frac{\sum d_i^2}{n(n^2 - 1)}\right]$

where:
$d_i$ = difference in the ranks assigned to two different characteristics of the i-th individual or phenomenon
n = number of individuals or phenomena ranked



Spearman’s rank correlation test
1. Fit the regression to the data on Y and X and obtain the residuals $\hat{u}_i$
2. Take the absolute values $|\hat{u}_i|$, rank both $|\hat{u}_i|$ and $X_i$, and compute Spearman’s rank correlation coefficient
3. Assuming that the population rank correlation coefficient $\rho_s$ is zero and n > 8, the significance of the sample $r_s$ can be tested by
$t = \frac{r_s \sqrt{n - 2}}{\sqrt{1 - r_s^2}}$ with df = n - 2

If t > the critical t → heteroscedasticity
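
Steps 2 and 3 can be sketched as follows. This is a minimal illustration, assuming the absolute residuals have already been obtained from the fitted regression; the function names and data are hypothetical. Ties receive average ranks, for which the simple $d_i^2$ formula is only approximate, and the t ratio is undefined when $r_s = \pm 1$.

```python
import math


def ranks(values):
    """Ranks from 1 (smallest) to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                                    # extend the run of ties
        avg = (i + j) / 2 + 1                         # average rank of positions i..j
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman_test(abs_resid, x):
    """Returns r_s and t = r_s*sqrt(n-2)/sqrt(1-r_s^2), with df = n - 2."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(abs_resid), ranks(x)))
    rs = 1 - 6 * d2 / (n * (n ** 2 - 1))
    t = rs * math.sqrt(n - 2) / math.sqrt(1 - rs ** 2)
    return rs, t


# Hypothetical absolute residuals and X values (n = 10 > 8).
abs_resid = [0.08, 0.74, 0.61, 1.05, 2.20, 2.06, 3.69, 2.57, 3.10, 4.00]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
rs, t = spearman_test(abs_resid, x)
print(rs, t)
```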



Goldfeld-Quandt test
• Assumes that the heteroscedastic variance, $\sigma_i^2$, is positively related to one of the explanatory variables in the regression model

• The two-variable model: $Y_i = \beta_1 + \beta_2 X_i + u_i$

• $\sigma_i^2$ is positively related to $X_i$ as
$\sigma_i^2 = \sigma^2 X_i^2$
where $\sigma^2$ is a constant



Goldfeld-Quandt test
1. Order the observations according to the values of $X_i$, beginning with the lowest X value
2. Omit c central observations, and divide the remaining observations into two groups, each of (n-c)/2 observations
3. Fit separate OLS regressions to the first (n-c)/2 observations and the last (n-c)/2 observations, and obtain the residual sums of squares $RSS_1$ and $RSS_2$. These RSS each have
$\frac{n - c}{2} - k$ or $\frac{n - c - 2k}{2}$ df
where k is the number of parameters to be estimated, including the intercept
Goldfeld-Quandt test
4. Compute the ratio
$\lambda = \frac{RSS_2 / df}{RSS_1 / df}$
If the $u_i$ are assumed to be normally distributed and the assumption of homoscedasticity is valid, then it can be shown that $\lambda$ follows the F distribution with numerator and denominator df each of (n-c-2k)/2. If $\lambda$ > the critical F, reject the hypothesis of homoscedasticity.
• The success of the GQ test depends on the value of c and on identifying the correct X with which to order the observations
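
The four steps can be sketched as follows. This is a minimal illustration for the two-variable model (k = 2); the function names and data are hypothetical, and n - c is assumed to be even. The returned $\lambda$ would be compared with a critical F value with (df, df) degrees of freedom.

```python
def rss_line(x, y):
    """Residual sum of squares from a two-variable OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b2 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b1 = my - b2 * mx
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y, c):
    """Returns lambda = (RSS2/df) / (RSS1/df) and df = (n - c - 2k)/2 for k = 2."""
    k = 2                                             # intercept and slope
    pairs = sorted(zip(x, y))                         # step 1: order by X
    n = len(pairs)
    m = (n - c) // 2                                  # step 2: size of each group
    low, high = pairs[:m], pairs[n - m:]              # c central observations dropped
    rss1 = rss_line([p[0] for p in low], [p[1] for p in low])      # step 3
    rss2 = rss_line([p[0] for p in high], [p[1] for p in high])
    df = m - k
    lam = (rss2 / df) / (rss1 / df)                   # step 4
    return lam, df


# Hypothetical data (not from the source), n = 10 and c = 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.5, 11.2, 15.9, 14.1, 20.3, 17.6]
lam, df = goldfeld_quandt(x, y, c=2)
print(lam, df)
```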



Breusch-Pagan-Godfrey test
The k-variable regression model:
$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$
Assume that the error variance $\sigma_i^2$ is described as
$\sigma_i^2 = f(\alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi})$
that is, $\sigma_i^2$ is some function of the nonstochastic variables Z.
Specifically, assume that
$\sigma_i^2 = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi}$
that is, $\sigma_i^2$ is a linear function of the Z's.
If $\alpha_2 = \alpha_3 = \dots = \alpha_m = 0$ → $\sigma_i^2 = \alpha_1$ (a constant)



Breusch-Pagan-Godfrey test

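1. Estimate the model by OLS and obtain the residuals $\hat{u}_i$
2. Obtain $\tilde{\sigma}^2 = \sum \hat{u}_i^2 / n$, that is, the ML estimator of $\sigma^2$
3. Construct variables $p_i$ defined as $p_i = \hat{u}_i^2 / \tilde{\sigma}^2$
4. Regress $p_i$ on the Z's as $p_i = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi} + v_i$
5. Obtain the ESS (explained sum of squares) and define
$\Theta = \frac{1}{2}(ESS)$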
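
Steps 2 to 5 can be sketched as follows. This is a minimal illustration, assuming the OLS residuals from step 1 are already available and that there is a single Z variable (m = 2), so that $\Theta$ is compared with a chi-square critical value with m - 1 = 1 df. The function name `bpg_theta` and the data are hypothetical.

```python
def bpg_theta(resid, z):
    """Breusch-Pagan-Godfrey statistic Theta = ESS / 2 for a single Z variable."""
    n = len(resid)
    u2 = [e * e for e in resid]
    sigma2 = sum(u2) / n                              # step 2: ML estimator of sigma^2
    p = [v / sigma2 for v in u2]                      # step 3: p_i
    # Step 4: regress p on a constant and Z (two-variable OLS).
    mz, mp = sum(z) / n, sum(p) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    a2 = sum((zi - mz) * (pv - mp) for zi, pv in zip(z, p)) / szz
    a1 = mp - a2 * mz
    # Step 5: explained sum of squares of the auxiliary regression.
    ess = sum((a1 + a2 * zi - mp) ** 2 for zi in z)
    return ess / 2.0


# Hypothetical residuals and Z values (not from the source).
resid = [-0.1, -0.7, 0.6, -1.0, 2.2, -2.1, 3.7, -2.6]
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(bpg_theta(resid, z))
```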



Breusch-Pagan-Godfrey test
Assuming the $u_i$ are normally distributed, if there is homoscedasticity and if the sample size n increases indefinitely, then
$\Theta \underset{asy}{\sim} \chi^2_{m-1}$

If $\Theta$ > the critical $\chi^2$, we can reject the hypothesis of homoscedasticity
• The test is sensitive to the normality assumption



White’s general heteroscedasticity test
• Does not rely on the normality assumption
• The three-variable regression model:
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
• The k-variable regression model:
$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$



White’s general heteroscedasticity test
The White test proceeds as follows:
1. Estimate the regression model and obtain the residuals $\hat{u}_i$
2. Estimate the following auxiliary regression
$\hat{u}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$
3. $H_0$: there is no heteroscedasticity
4. Obtain $n R^2 \underset{asy}{\sim} \chi^2_{df}$, where df = number of regressors (excluding the constant) in the auxiliary regression
5. If $n R^2$ > the critical chi-square value, the conclusion is that there is heteroscedasticity
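
The auxiliary regression and the $nR^2$ statistic can be sketched as follows. This is a minimal illustration for the three-variable model, assuming the residuals from step 1 are already available; the helper names and data are hypothetical, and the normal equations are solved by plain Gaussian elimination to keep the example free of external libraries.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R^2 from regressing y on the given columns (the first is the constant)."""
    n, k = len(y), len(columns)
    xtx = [[sum(ci[t] * cj[t] for t in range(n)) for cj in columns] for ci in columns]
    xty = [sum(ci[t] * y[t] for t in range(n)) for ci in columns]
    coef = solve(xtx, xty)
    fitted = [sum(coef[j] * columns[j][t] for j in range(k)) for t in range(n)]
    ybar = sum(y) / n
    rss = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    tss = sum((yt - ybar) ** 2 for yt in y)
    return 1 - rss / tss


def white_test(resid, x2, x3):
    """Returns n*R^2 of the auxiliary regression and its df (5 regressors)."""
    n = len(resid)
    u2 = [e * e for e in resid]
    cols = [[1.0] * n, x2, x3,
            [a * a for a in x2], [b * b for b in x3],
            [a * b for a, b in zip(x2, x3)]]
    return n * r_squared(cols, u2), len(cols) - 1


# Hypothetical regressors and residuals (not from the source).
x2 = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0]
x3 = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
resid = [0.5, -1.2, 0.8, 1.5, -0.9, 2.1, -2.4, 1.1, 3.0]
print(white_test(resid, x2, x3))
```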



Koenker-Bassett (KB) test
• Estimate the regression model
$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$
• Obtain the estimated Y values and the residuals, and then estimate
$\hat{u}_i^2 = \alpha_1 + \alpha_2 (\hat{Y}_i)^2 + v_i$
• $H_0$: $\alpha_2 = 0$
If this is not rejected, conclude that there is no heteroscedasticity
It is applicable even if the error term in the original model is not normally distributed
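
The KB regression can be sketched as follows. This is a minimal illustration for a two-variable original model; the function names and data are hypothetical. The returned t ratio for $\alpha_2$ would be compared with a critical t value for n - 2 df.

```python
import math


def ols_line(x, y):
    """Two-variable OLS: returns intercept, slope, residuals, se(slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b2 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b1 = my - b2 * mx
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)
    return b1, b2, resid, math.sqrt(s2 / sxx)


def kb_test(resid, y_hat):
    """Regress u_hat^2 on (Y_hat)^2; returns alpha2 and its t ratio."""
    u2 = [e * e for e in resid]
    y_hat_sq = [v * v for v in y_hat]
    _, alpha2, _, se_alpha2 = ols_line(y_hat_sq, u2)
    return alpha2, alpha2 / se_alpha2


# Hypothetical data (not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.3, 3.6, 6.9, 7.2, 12.4, 10.1, 17.8, 13.5]
_, _, resid, _ = ols_line(x, y)
y_hat = [yi - e for yi, e in zip(y, resid)]           # fitted values
alpha2, t_alpha2 = kb_test(resid, y_hat)
print(alpha2, t_alpha2)
```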



Remedial measures

• When $\sigma_i^2$ is known: the method of weighted least squares
• When $\sigma_i^2$ is not known
  - The true $\sigma_i^2$ are rarely known
  - White’s heteroscedasticity-consistent variances and standard errors → robust standard errors
  - These can be larger or smaller than the uncorrected standard errors
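
The comparison between uncorrected and robust standard errors can be sketched for the two-variable model. This is a minimal illustration, assuming the standard White estimator $\operatorname{var}(\hat{\beta}_2) = \sum x_i^2 \hat{u}_i^2 / (\sum x_i^2)^2$, which replaces the unknown $\sigma_i^2$ in the OLS variance formula by $\hat{u}_i^2$; the slides do not state this formula, and the function name and data are hypothetical.

```python
import math


def ols_with_white_se(x, y):
    """Returns the OLS slope, its usual standard error, and White's robust one."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [xi - mx for xi in x]                        # deviations x_i
    sxx = sum(d * d for d in dx)
    b2 = sum(d * (yi - my) for d, yi in zip(dx, y)) / sxx
    b1 = my - b2 * mx
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    # Usual OLS standard error: sqrt(sigma_hat^2 / sum x_i^2).
    se_ols = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # White's estimator: sqrt(sum x_i^2 u_hat_i^2) / sum x_i^2.
    se_white = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid))) / sxx
    return b2, se_ols, se_white


# Hypothetical data (not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.3, 3.6, 6.9, 7.2, 12.4, 10.1, 17.8, 13.5]
print(ols_with_white_se(x, y))
```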



Remedial measures
Assumptions about the heteroscedasticity pattern
1. The error variance is proportional to $X_i^2$:
$E(u_i^2) = \sigma^2 X_i^2$
Divide the original model through by $X_i$:
$\frac{Y_i}{X_i} = \frac{\beta_1}{X_i} + \beta_2 + \frac{u_i}{X_i} = \beta_1 \frac{1}{X_i} + \beta_2 + v_i$
$E(v_i^2) = E\left(\frac{u_i}{X_i}\right)^2 = \frac{1}{X_i^2} E(u_i^2) = \frac{1}{X_i^2} \sigma^2 X_i^2 = \sigma^2$



Remedial measures
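2. The error variance is proportional to $X_i$:
$E(u_i^2) = \sigma^2 X_i$
The square root transformation (requires $X_i > 0$). The model can be transformed:
$\frac{Y_i}{\sqrt{X_i}} = \frac{\beta_1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + \frac{u_i}{\sqrt{X_i}} = \beta_1 \frac{1}{\sqrt{X_i}} + \beta_2 \sqrt{X_i} + v_i$
$E(v_i^2) = E\left(\frac{u_i}{\sqrt{X_i}}\right)^2 = \frac{1}{X_i} E(u_i^2) = \frac{1}{X_i} \sigma^2 X_i = \sigma^2$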



Remedial measures
3. The error variance is proportional to the square of the mean value of Y:
$E(u_i^2) = \sigma^2 \left[E(Y_i)\right]^2$
The transformed model:
$\frac{Y_i}{E(Y_i)} = \frac{\beta_1}{E(Y_i)} + \beta_2 \frac{X_i}{E(Y_i)} + \frac{u_i}{E(Y_i)} = \beta_1 \frac{1}{E(Y_i)} + \beta_2 \frac{X_i}{E(Y_i)} + v_i$
$E(v_i^2) = \sigma^2$
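
All three transformations divide the model through by a scale factor and then run OLS without a separate intercept on the transformed variables. The following is a minimal sketch of that idea; the function name `transformed_ols` and the data are hypothetical. For pattern 3 the scale $E(Y_i)$ is unknown, so using fitted values from a first-stage OLS in its place is an assumption the slides do not state.

```python
def transformed_ols(x, y, scale):
    """OLS of Y/s on 1/s and X/s (no extra intercept); returns (b1, b2).

    scale = X_i          for pattern 1 (variance proportional to X_i^2)
    scale = sqrt(X_i)    for pattern 2 (variance proportional to X_i)
    scale = fitted Y_i   for pattern 3 (assumed stand-in for E(Y_i))
    """
    z1 = [1.0 / s for s in scale]
    z2 = [xi / s for xi, s in zip(x, scale)]
    ys = [yi / s for yi, s in zip(y, scale)]
    # Normal equations of the two-regressor model through the origin.
    s11 = sum(a * a for a in z1)
    s22 = sum(b * b for b in z2)
    s12 = sum(a * b for a, b in zip(z1, z2))
    t1 = sum(a * c for a, c in zip(z1, ys))
    t2 = sum(b * c for b, c in zip(z2, ys))
    det = s11 * s22 - s12 ** 2
    b1 = (t1 * s22 - t2 * s12) / det
    b2 = (t2 * s11 - t1 * s12) / det
    return b1, b2


# Hypothetical data (not from the source), X_i > 0.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.3, 3.6, 6.9, 7.2, 12.4, 10.1, 17.8, 13.5]
print(transformed_ols(x, y, x))                          # pattern 1
print(transformed_ols(x, y, [xi ** 0.5 for xi in x]))    # pattern 2
```

Note that the coefficients are recovered in the original roles: `b1` is the intercept and `b2` the slope of the untransformed model.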



Remedial measures

4. A log transformation

$\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i$
very often reduces heteroscedasticity
• The log transformation compresses the scales of the variables
• The slope coefficient $\beta_2$ measures the elasticity of Y with respect to X



Concluding remarks on the remedial measures
• All the transformations discussed previously are ad hoc → we are speculating about the nature of $\sigma_i^2$
• Some problems:
1. Beyond the two-variable model, we do not know a priori which of the X variables should be chosen for transforming the data
2. The log transformation is not applicable if some of the Y and X values are zero or negative
3. The problem of spurious correlation
4. All testing procedures using the t test, F test, etc. are, strictly speaking, valid only in large samples
