Heteroskedasticity
In this case, the diagonal elements of the covariance matrix of $\varepsilon$ are all equal, indicating that the variance of each $\varepsilon_i$ is the same, and the off-diagonal elements are zero, indicating that all disturbances are pairwise uncorrelated. This constancy of variance is termed homoskedasticity, and such disturbances are called homoskedastic disturbances.
In many situations this assumption is not plausible and the variances do not remain the same. Disturbances whose variances are not constant across the observations are called heteroskedastic disturbances, and this property is termed heteroskedasticity. In this case
$$\mathrm{Var}(\varepsilon_i) = \sigma_i^2, \quad i = 1, 2, \ldots, n,$$
and the disturbances are pairwise uncorrelated.
[Figure: Homoskedasticity]
Examples: Suppose that in a simple linear regression model, x denotes income and y denotes expenditure on food. It is observed that as income increases, the variation in expenditure on food increases, because the choice and variety of food increase, in general, up to a certain extent. So the variance of the observations on y will not remain constant as income changes. The assumption of homoskedasticity implies that the consumption pattern of food remains the same irrespective of the income of the person. This is generally not a correct assumption in real situations. Rather, the consumption pattern changes, and hence the variance of y, and so the variance of the disturbances, will not remain constant. In general, it will increase as income increases.
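2. Sometimes the observations are in the form of averages, and this introduces heteroskedasticity in the model. For example, it is easier to collect data on the expenditure on clothes for a whole family than for a particular family member. Suppose in a simple linear regression model
$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \varepsilon_{ij}, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, m_i,$$
$y_{ij}$ denotes the expenditure on clothes of the $j$th member of the $i$th family, which has $m_i$ members, and $x_{ij}$ denotes the age of the $j$th person in the $i$th family. It is difficult to record data for individual family members but easier to get data for the whole family, so the $y_{ij}$'s are known only collectively.
Then, instead of per-member expenditure, we have data on the average expenditure per family member,
$$\bar{y}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} y_{ij},$$
and the disturbance of the averaged model is $\bar{\varepsilon}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} \varepsilon_{ij}$. If we assume $E(\varepsilon_{ij}) = 0$ and $\mathrm{Var}(\varepsilon_{ij}) = \sigma^2$, with the $\varepsilon_{ij}$ pairwise uncorrelated, then
$$E(\bar{\varepsilon}_i) = 0, \qquad \mathrm{Var}(\bar{\varepsilon}_i) = \frac{\sigma^2}{m_i},$$
so the variance of the disturbances is not constant but depends on the number of members $m_i$ in the family.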
3. Sometimes theoretical considerations introduce heteroskedasticity in the data. For example, suppose in the simple linear model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$
$y_i$ denotes the yield of rice and $x_i$ denotes the quantity of fertilizer in an agricultural experiment. It is observed that when the quantity of fertilizer increases, the yield increases. In fact, the yield increases initially as the quantity of fertilizer increases; gradually the rate of increase slows down, and if fertilizer is increased further, the crop burns. So $\beta_1$ changes with the level of fertilizer. In such cases, when $\beta_1$ changes, one possible way is to express it as a random variable
$$\beta_{1i} = \beta_1 + v_i, \quad E(v_i) = 0, \; \mathrm{Var}(v_i) = \theta^2, \; E(\varepsilon_i v_i) = 0.$$
The model then becomes
$$y_i = \beta_0 + \beta_{1i} x_i + \varepsilon_i = \beta_0 + \beta_1 x_i + w_i,$$
where $w_i = \varepsilon_i + x_i v_i$ acts like a new random error component. So
$$E(w_i) = 0,$$
$$\mathrm{Var}(w_i) = E(w_i^2) = E(\varepsilon_i^2) + x_i^2 E(v_i^2) + 2 x_i E(\varepsilon_i v_i) = \sigma^2 + x_i^2 \theta^2 + 0 = \sigma^2 + x_i^2 \theta^2.$$
So the variance depends on $i$, and thus heteroskedasticity is introduced in the model. Note that we assume homoskedastic disturbances for the model
$$y_i = \beta_0 + \beta_{1i} x_i + \varepsilon_i, \quad \beta_{1i} = \beta_1 + v_i,$$
but finally end up with heteroskedastic disturbances. This is due to theoretical considerations.
Econometrics | Chapter 8 | Heteroskedasticity | Shalabh, IIT Kanpur
4. Skewness in the distribution of one or more explanatory variables in the model can also cause heteroskedasticity.
5. Incorrect data transformations and an incorrect functional form of the model can also give rise to heteroskedasticity.
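Reasons 2 and 3 above each lead to a closed-form disturbance variance: $\sigma^2$ divided by the family size for averaged data, and $\sigma^2 + x_i^2 \theta^2$ for the random-coefficient model. The following Python sketch evaluates both and checks the second one by simulation. The function names, the use of normal draws and the numerical values are illustrative assumptions, not part of the notes.

```python
import random


def var_of_family_average(sigma2, family_size):
    """Variance of the mean of `family_size` uncorrelated disturbances,
    each with variance sigma2 (reason 2)."""
    return sigma2 / family_size


def var_random_coefficient_error(sigma2, theta2, x):
    """Var(w) for w = eps + x * v, with Var(eps) = sigma2, Var(v) = theta2
    and eps, v uncorrelated (reason 3)."""
    return sigma2 + x ** 2 * theta2


def simulated_var_w(sigma2, theta2, x, draws=100_000, seed=0):
    """Monte Carlo estimate of Var(w); normal draws are an assumption."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(draws):
        w = rng.gauss(0.0, sigma2 ** 0.5) + x * rng.gauss(0.0, theta2 ** 0.5)
        total += w
        total_sq += w * w
    mean = total / draws
    return total_sq / draws - mean ** 2


if __name__ == "__main__":
    # Larger families give smaller variance of the averaged disturbance.
    for m in (1, 2, 4, 8):
        print("family size", m, "variance", var_of_family_average(4.0, m))
    # The variance of w grows with x**2.
    for x in (0.0, 1.0, 2.0):
        exact = var_random_coefficient_error(1.0, 0.25, x)
        print("x =", x, "exact", exact, "simulated", simulated_var_w(1.0, 0.25, x))
```

In both cases the variance changes from one observation to the next, even though the underlying disturbances were assumed homoskedastic.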
1. Bartlett’s test
It tests the null hypothesis
$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_i^2 = \cdots = \sigma_n^2.$$
This hypothesis is termed the hypothesis of homoskedasticity. The test can be used only when replicated data are available.
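With ordinary data, only one observation $y_i$ is available to estimate $\sigma_i^2$, so the usual tests cannot be applied. This problem can be overcome if replicated data are available. So consider a model of the form
$$y_i^* = X_i \beta + \varepsilon_i^*, \quad i = 1, 2, \ldots, n,$$
in which the $i$th group contributes $m_i$ replicated observations. Stacking the $n$ groups gives $y^* = X\beta + \varepsilon^*$, where $y^*$ is a vector of order $\sum_{i=1}^{n} m_i \times 1$, $X$ is a $\sum_{i=1}^{n} m_i \times k$ matrix, $\beta$ is a $k \times 1$ vector and $\varepsilon^*$ is a $\sum_{i=1}^{n} m_i \times 1$ vector. Application of OLS to this model yields
$$\hat{\beta} = (X'X)^{-1} X' y^*,$$
and the residual vector for the $i$th group is
$$e_i^* = y_i^* - X_i \hat{\beta}.$$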
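Based on this, obtain
$$s_i^2 = \frac{1}{m_i - k}\, e_i^{*\prime} e_i^*,$$
$$s^2 = \frac{\sum_{i=1}^{n} (m_i - k)\, s_i^2}{\sum_{i=1}^{n} (m_i - k)},$$
$$C = 1 + \frac{1}{3(n-1)} \left[ \sum_{i=1}^{n} \frac{1}{m_i - k} - \frac{1}{\sum_{i=1}^{n} (m_i - k)} \right].$$
Here $C$ is the correction factor in Bartlett's statistic $\frac{1}{C} \sum_{i=1}^{n} (m_i - k) \ln \left( s^2 / s_i^2 \right)$, which is asymptotically chi-square distributed with $n - 1$ degrees of freedom under $H_0$.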
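For $m$ independent samples, the $i$th containing $n_i$ observations, the likelihood ratio statistic for testing $H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_m^2$ is
$$u = \prod_{i=1}^{m} \left( \frac{s_i^2}{s^2} \right)^{n_i/2},$$
where
$$s_i^2 = \frac{1}{n_i} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2, \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n_i,$$
$$s^2 = \frac{1}{n} \sum_{i=1}^{m} n_i s_i^2,$$
$$n = \sum_{i=1}^{m} n_i.$$
Bartlett's modification replaces $n_i$ by $n_i - 1$ in $-2 \ln u$ and divides by a scalar constant, which gives
$$M = \frac{(n - m) \ln \hat{\sigma}^2 - \sum_{i=1}^{m} (n_i - 1) \ln \hat{\sigma}_i^2}{1 + \dfrac{1}{3(m-1)} \left[ \sum_{i=1}^{m} \dfrac{1}{n_i - 1} - \dfrac{1}{n - m} \right]},$$
which is approximately chi-square distributed with $m - 1$ degrees of freedom under $H_0$, where
$$\hat{\sigma}_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2,$$
$$\hat{\sigma}^2 = \frac{1}{n - m} \sum_{i=1}^{m} (n_i - 1) \hat{\sigma}_i^2.$$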
In experimental sciences it is easier to get replicated data, and this test can be readily applied. In real-life applications it is difficult to get replicated data, and this test may not be applicable. This difficulty is overcome in the Breusch Pagan test.
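The modified form of Bartlett's statistic can be computed directly from replicated samples. The sketch below assumes the standard form of the statistic, with unbiased group variances (divisor $n_i - 1$), the pooled variance with divisor $n - m$, and the correction factor in the denominator. The function name `bartlett_statistic` and the sample values are hypothetical.

```python
import math


def bartlett_statistic(groups):
    """Return (M, degrees_of_freedom) for m replicated samples.

    Assumes every group has at least two observations and a
    strictly positive sample variance.
    """
    m = len(groups)
    sizes = [len(g) for g in groups]
    n = sum(sizes)

    # Unbiased within-group variances (divisor n_i - 1).
    variances = []
    for g in groups:
        mean = sum(g) / len(g)
        variances.append(sum((y - mean) ** 2 for y in g) / (len(g) - 1))

    # Pooled variance (divisor n - m).
    pooled = sum((ni - 1) * v for ni, v in zip(sizes, variances)) / (n - m)

    numerator = (n - m) * math.log(pooled) - sum(
        (ni - 1) * math.log(v) for ni, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (ni - 1) for ni in sizes) - 1 / (n - m)) / (3 * (m - 1))
    return numerator / correction, m - 1


if __name__ == "__main__":
    stat, df = bartlett_statistic([[1, 2, 3], [0, 10, 20]])
    print("M =", stat, "with", df, "degree(s) of freedom")
```

A large value of the statistic relative to the chi-square distribution with $m - 1$ degrees of freedom speaks against homoskedasticity; when all group variances are equal, the statistic is zero.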
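2. Breusch Pagan test
The variance is modelled as
$$\sigma_i^2 = h(Z_i' \gamma) = h(\gamma_1 + Z_i^{*\prime} \gamma^*),$$
where $Z_i' = (1, Z_i^{*\prime}) = (1, Z_{i2}, Z_{i3}, \ldots, Z_{ip})$ is the vector of observable explanatory variables with first element unity, and $\gamma' = (\gamma_1, \gamma^{*\prime}) = (\gamma_1, \gamma_2, \ldots, \gamma_p)$ is a vector of unknown coefficients related to $\beta$ with first element being the intercept term. The heteroskedasticity is defined by these $p$ variables. The $Z_i$'s may also include some of the $X$'s.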
If $H_0: \gamma_2 = \gamma_3 = \cdots = \gamma_p = 0$ is not rejected, it implies that $Z_{i2}, Z_{i3}, \ldots, Z_{ip}$ have no effect on $\sigma_i^2$, and we get $\sigma_i^2 = h(\gamma_1)$, a constant.
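3. Goldfeld Quandt test
This test assumes that one of the explanatory variables explains the heteroskedasticity in the model. Let the $j$th explanatory variable explain the heteroskedasticity, so that
$$\sigma_i^2 \propto X_{ij} \quad \text{or} \quad \sigma_i^2 = \sigma^2 X_{ij}.$$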
2. Split the observations into two equal parts, leaving $c$ observations in the middle. So each part contains $\frac{n-c}{2}$ observations, provided $\frac{n-c}{2} > k$.
3. Run two separate regressions in the two parts using OLS and obtain the residual sums of squares $SS_{res1}$ and $SS_{res2}$.
4. The test statistic is
$$F_0 = \frac{SS_{res2}}{SS_{res1}},$$
which follows an $F$-distribution, i.e., $F\left( \frac{n-c}{2} - k, \frac{n-c}{2} - k \right)$, when $H_0$ is true.
Moreover, the choice of $X_{ij}$ is also difficult. Since $\sigma_i^2 \propto X_{ij}$, if all the important variables are included in the model, it may be difficult to decide which variable is influencing the heteroskedasticity.
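The split-sample steps above can be sketched for a model with an intercept and one explanatory variable ($k = 2$). The code assumes that the observations are ordered by increasing $x$ before splitting, so that the second part is the one suspected of having the larger variance. That ordering, the function names and the data are assumptions made for illustration.

```python
def rss_simple_ols(x, y):
    """Residual sum of squares from regressing y on an intercept and x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def split_sample_f(x, y, c, k=2):
    """Return (F0, df1, df2) after dropping c central observations."""
    n = len(x)
    if (n - c) % 2 != 0:
        raise ValueError("n - c must be even so that both parts are equal")
    size = (n - c) // 2
    if size <= k:
        raise ValueError("each part needs more than k observations")

    # Assumption: order by increasing x, then take the two outer parts.
    order = sorted(range(n), key=lambda i: x[i])
    first, second = order[:size], order[-size:]

    ss_res1 = rss_simple_ols([x[i] for i in first], [y[i] for i in first])
    ss_res2 = rss_simple_ols([x[i] for i in second], [y[i] for i in second])
    return ss_res2 / ss_res1, size - k, size - k


if __name__ == "__main__":
    x = list(range(1, 13))
    # Deterministic "noise" whose size grows with x.
    y = [1 + 2 * xi + 0.1 * xi * (-1) ** i for i, xi in enumerate(x)]
    print(split_sample_f(x, y, c=2))
```

The returned $F_0$ is then compared with the critical value of the $F$-distribution with the two returned degrees of freedom.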
4. Glejser test:
This test is based on the assumption that $\sigma_i^2$ is influenced by a single variable $Z$, i.e., only one variable influences the heteroskedasticity. This variable can be one of the explanatory variables, or it can be chosen from some extraneous source.
4. Conduct the test for $h = \pm 1, \pm \frac{1}{2}$. So the test procedure is repeated four times.
In practice, one can choose any value of $h$. For simplicity, we choose $h = 1$.
• The test has only asymptotic justification, and the four choices of $h$ generally give satisfactory results.
• This test sheds light on the nature of heteroskedasticity.
5. Spearman’s rank correlation test
If $d_i$ denotes the difference in the ranks assigned to two different characteristics of the $i$th object or phenomenon and $n$ is the number of objects or phenomena ranked, then Spearman's rank correlation coefficient is defined as
$$r = 1 - 6\, \frac{\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}, \quad -1 \le r \le 1.$$
This can be used for testing the hypothesis about the heteroskedasticity.
Consider the model
$$y_i = \beta_0 + \beta_1 X_i + \varepsilon_i.$$
1. Run the regression of $y$ on $X$ and obtain the residuals $e$.
2. Consider $|e_i|$.
5. Assuming that the population rank correlation coefficient is zero and $n > 8$, use the test statistic
$$t_0 = \frac{r \sqrt{n-2}}{\sqrt{1 - r^2}},$$
which follows a $t$-distribution with $(n - 2)$ degrees of freedom.
6. The decision rule is to reject the null hypothesis of homoskedasticity whenever $t_0 \ge t_{1-\alpha}(n - 2)$.
If there is more than one explanatory variable, the rank correlation coefficient can be computed between $|e_i|$ and each of the explanatory variables separately, and each can be tested using $t_0$.
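The rank correlation coefficient and the statistic $t_0$ are simple to compute once the absolute residuals are available. The sketch below assumes that the absolute residuals $|e_i|$ are ranked against $X_i$ and that ties receive average ranks (the formula for $r$ is exact only when there are no ties). The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def rank_correlation_test(a, b):
    """Return (r, t0) from r = 1 - 6*sum(d^2)/(n(n^2-1))."""
    n = len(a)
    d_squared = sum((ra - rb) ** 2 for ra, rb in zip(average_ranks(a), average_ranks(b)))
    r = 1 - 6 * d_squared / (n * (n ** 2 - 1))
    if abs(r) >= 1.0:
        return r, math.copysign(math.inf, r)
    return r, r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def spearman_heteroskedasticity_test(x, y):
    """Regress y on x by OLS, then test |residuals| against x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    abs_resid = [abs(yi - intercept - slope * xi) for xi, yi in zip(x, y)]
    return rank_correlation_test(abs_resid, x)
```

The value $t_0$ returned by `spearman_heteroskedasticity_test` is compared with the critical value of the $t$-distribution with $n - 2$ degrees of freedom.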
The OLSE is
$$b = (X'X)^{-1} X' y.$$
Its estimation error is
$$b - \beta = (X'X)^{-1} X' \varepsilon,$$
and
$$E(b - \beta) = (X'X)^{-1} X' E(\varepsilon) = 0.$$
Thus the OLSE remains unbiased even under heteroskedasticity.
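Let $e = H\varepsilon$ be the OLS residual vector, where $H = I - X(X'X)^{-1}X'$, and let $\ell_i = (0, 0, \ldots, 0, 1, 0, \ldots, 0)'$ have 1 in the $i$th position. Then
$$e_i = \ell_i' e = [0, 0, \ldots, 0, 1, 0, \ldots, 0] \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix} = \ell_i' H \varepsilon,$$
$$e_i^2 = \ell_i' e e' \ell_i = \ell_i' H \varepsilon \varepsilon' H \ell_i,$$
$$E(e_i^2) = \ell_i' H E(\varepsilon \varepsilon') H \ell_i = \ell_i' H \Omega H \ell_i.$$
Since
$$H \ell_i = \begin{pmatrix} h_{11} & \cdots & h_{1n} \\ \vdots & & \vdots \\ h_{n1} & \cdots & h_{nn} \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} h_{1i} \\ h_{2i} \\ \vdots \\ h_{ni} \end{pmatrix}$$
is the $i$th column of $H$, we get
$$E(e_i^2) = [h_{1i}, h_{2i}, \ldots, h_{ni}] \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix} \begin{pmatrix} h_{1i} \\ h_{2i} \\ \vdots \\ h_{ni} \end{pmatrix} = \sum_{j=1}^{n} h_{ji}^2 \sigma_j^2.$$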
Thus E (ei2 ) ≠ σ i2 and so ei2 becomes a biased estimator of σ i2 in the presence of heteroskedasticity.
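The bias of $e_i^2$ can be seen numerically by evaluating the quadratic form $E(e_i^2) = \sum_j h_{ji}^2 \sigma_j^2$ for a small design. The sketch assumes a model with an intercept and one explanatory variable, for which the entries of $H = I - X(X'X)^{-1}X'$ have a simple closed form. The function names and the values of $x$ and $\sigma_i^2$ are illustrative.

```python
def residual_maker(x):
    """H = I - X (X'X)^{-1} X' for X = [1, x] (intercept plus one regressor)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [
        [
            (1.0 if i == j else 0.0) - (1.0 / n + (x[i] - xbar) * (x[j] - xbar) / sxx)
            for j in range(n)
        ]
        for i in range(n)
    ]


def expected_squared_residuals(x, sigma2):
    """E(e_i^2) = sum_j h_ji^2 * sigma_j^2 for each i."""
    h = residual_maker(x)
    n = len(x)
    return [sum(h[j][i] ** 2 * sigma2[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    sigma2 = [1, 4, 9, 16, 25]  # variance grows with x
    for s2, e2 in zip(sigma2, expected_squared_residuals(x, sigma2)):
        print("sigma_i^2 =", s2, " E(e_i^2) =", e2)
```

Each expected squared residual mixes the variances of all observations through the weights $h_{ji}^2$, which is why it generally differs from $\sigma_i^2$.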
In the presence of heteroskedasticity, use the generalized least squares estimation. The generalized least
squares estimator (GLSE) of β is
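$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} y.$$
Its estimation error is obtained as
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} (X\beta + \varepsilon),$$
$$\hat{\beta} - \beta = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} \varepsilon.$$
Thus
$$E(\hat{\beta} - \beta) = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} E(\varepsilon) = 0,$$
$$V(\hat{\beta}) = E\left[ (\hat{\beta} - \beta)(\hat{\beta} - \beta)' \right]$$
$$= (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} E(\varepsilon \varepsilon') \Omega^{-1} X (X' \Omega^{-1} X)^{-1}$$
$$= (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} \Omega \Omega^{-1} X (X' \Omega^{-1} X)^{-1}$$
$$= (X' \Omega^{-1} X)^{-1}.$$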
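With one explanatory variable $x$, the variances of the OLSE $b$ and the GLSE $\hat{\beta}$ of the slope are
$$\mathrm{Var}(b) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sigma_i^2}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^2},$$
$$\mathrm{Var}(\hat{\beta}) = \frac{1}{\sum_{i=1}^{n} \dfrac{(x_i - \bar{x})^2}{\sigma_i^2}}.$$
Their ratio is
$$\frac{\mathrm{Var}(\hat{\beta})}{\mathrm{Var}(b)} = \frac{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^2}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 \sigma_i^2 \right] \left[ \sum_{i=1}^{n} \dfrac{(x_i - \bar{x})^2}{\sigma_i^2} \right]}$$
$$= \text{square of the correlation coefficient between } \sigma_i (x_i - \bar{x}) \text{ and } \frac{x_i - \bar{x}}{\sigma_i}$$
$$\le 1$$
$$\Rightarrow \mathrm{Var}(\hat{\beta}) \le \mathrm{Var}(b).$$
So the relative efficiency of the OLSE and the GLSE depends upon the correlation coefficient between $(x_i - \bar{x}) \sigma_i$ and $\dfrac{x_i - \bar{x}}{\sigma_i}$.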
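The comparison of the two variances can be checked numerically. The sketch below simply evaluates the two variance expressions as stated in the text for given values of $x_i$ and $\sigma_i^2$ and reports their ratio. The function names and the data are assumptions made for illustration.

```python
def squared_deviations(x):
    """(x_i - xbar)^2 for each observation."""
    xbar = sum(x) / len(x)
    return [(xi - xbar) ** 2 for xi in x]


def var_olse_slope(x, sigma2):
    """Var(b) = sum((x_i - xbar)^2 * sigma_i^2) / (sum((x_i - xbar)^2))^2."""
    d2 = squared_deviations(x)
    return sum(d * s for d, s in zip(d2, sigma2)) / sum(d2) ** 2


def var_glse_slope(x, sigma2):
    """Var(beta_hat) = 1 / sum((x_i - xbar)^2 / sigma_i^2), as in the text."""
    d2 = squared_deviations(x)
    return 1.0 / sum(d / s for d, s in zip(d2, sigma2))


def variance_ratio(x, sigma2):
    """Var(beta_hat) / Var(b); at most 1 by the Cauchy-Schwarz inequality."""
    return var_glse_slope(x, sigma2) / var_olse_slope(x, sigma2)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    print("equal variances:  ", variance_ratio(x, [2, 2, 2, 2, 2]))
    print("unequal variances:", variance_ratio(x, [1, 4, 9, 16, 25]))
```

With equal variances the ratio is 1 and the two estimators are equally efficient; the more $\sigma_i^2$ varies with $x_i$, the smaller the ratio becomes.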
The generalized least squares estimation assumes that $\Omega$ is known, i.e., the nature of heteroskedasticity is completely specified. In practice, two cases arise:
• $\Omega$ is completely specified, or
• $\Omega$ is not completely specified.
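Case 1: Ω is completely specified
Consider the model
$$y_i = \beta_1 + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + \varepsilon_i.$$
Deflating the model by $\sigma_i$ gives
$$\frac{y_i}{\sigma_i} = \beta_1 \frac{1}{\sigma_i} + \beta_2 \frac{X_{i2}}{\sigma_i} + \cdots + \beta_k \frac{X_{ik}}{\sigma_i} + \frac{\varepsilon_i}{\sigma_i}.$$
Let $\varepsilon_i^* = \dfrac{\varepsilon_i}{\sigma_i}$; then $E(\varepsilon_i^*) = 0$ and $\mathrm{Var}(\varepsilon_i^*) = \dfrac{\sigma_i^2}{\sigma_i^2} = 1$. Now OLS can be applied to this model, and the usual tools for drawing statistical inferences can be used.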
Note that when the model is deflated, the intercept term is lost, since $\beta_1 / \sigma_i$ is itself a variable. This point has to be taken care of when reading software output.
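The deflation by $\sigma_i$ can be sketched for a model with an intercept and one explanatory variable, $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$. After dividing by $\sigma_i$ there is no constant column, so the code fits a regression without intercept on the two deflated regressors $1/\sigma_i$ and $x_i/\sigma_i$. The function name `deflated_ols` and the data are hypothetical, and the $\sigma_i$ are assumed known.

```python
def deflated_ols(y, x, sigma):
    """OLS on the model deflated by sigma_i; returns (beta1, beta2).

    The deflated regressors are 1/sigma_i (carrying beta1) and
    x_i/sigma_i (carrying beta2); no separate intercept is fitted.
    """
    ys = [yi / si for yi, si in zip(y, sigma)]
    z1 = [1.0 / si for si in sigma]
    z2 = [xi / si for xi, si in zip(x, sigma)]

    # Normal equations for a two-regressor model without intercept.
    a11 = sum(v * v for v in z1)
    a12 = sum(u * v for u, v in zip(z1, z2))
    a22 = sum(v * v for v in z2)
    c1 = sum(u * v for u, v in zip(z1, ys))
    c2 = sum(u * v for u, v in zip(z2, ys))

    det = a11 * a22 - a12 ** 2
    beta1 = (a22 * c1 - a12 * c2) / det
    beta2 = (a11 * c2 - a12 * c1) / det
    return beta1, beta2


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    sigma = [1.0, 2.0, 3.0, 4.0]
    y = [1.0 + 2.0 * xi for xi in x]  # noiseless data for illustration
    print(deflated_ols(y, x, sigma))
```

The coefficient attached to $1/\sigma_i$ is the original intercept $\beta_1$, which is the point to watch when the deflated model is fitted with software that adds its own constant term.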
Case 2: Ω may not be completely specified
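Suppose $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$ are partially known, with
$$\sigma_i^2 \propto X_{ij}^{2\lambda} \quad \text{or} \quad \sigma_i^2 = \sigma^2 X_{ij}^{2\lambda}.$$
Deflating the model by $X_{ij}^{\lambda}$ gives
$$\frac{y_i}{X_{ij}^{\lambda}} = \frac{\beta_1}{X_{ij}^{\lambda}} + \beta_2 \frac{X_{i2}}{X_{ij}^{\lambda}} + \cdots + \beta_k \frac{X_{ik}}{X_{ij}^{\lambda}} + \frac{\varepsilon_i}{X_{ij}^{\lambda}}.$$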
Now apply OLS to this transformed model and use the usual statistical tools for drawing inferences.
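A caution is to be kept in mind while doing so. This is illustrated in the following example with a model having one explanatory variable. Consider the model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i.$$
Deflate it by $x_i$, so we get
$$\frac{y_i}{x_i} = \frac{\beta_0}{x_i} + \beta_1 + \frac{\varepsilon_i}{x_i}.$$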
Note that the roles of $\beta_0$ and $\beta_1$ in the original and deflated models are interchanged. In the original model, $\beta_0$ is the intercept term and $\beta_1$ is the slope parameter, whereas in the deflated model $\beta_1$ becomes the intercept term and $\beta_0$ becomes the slope parameter. So essentially, one can use OLS but needs to be careful in identifying the intercept term and the slope parameter, particularly in the software output.
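The interchange of roles can be made concrete with a small sketch: regress $y_i / x_i$ on $1 / x_i$ with an intercept and map the fitted coefficients back to the original parameters. The function names and the noiseless data are assumptions made for illustration.

```python
def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on an intercept and x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def fit_deflated_by_x(x, y):
    """Fit y/x = beta1 + beta0 * (1/x) and relabel the coefficients."""
    u = [1.0 / xi for xi in x]               # regressor of the deflated model
    v = [yi / xi for xi, yi in zip(x, y)]    # response of the deflated model
    intercept, slope = simple_ols(u, v)
    # Intercept of the deflated model is beta1; its slope is beta0.
    return {"beta0": slope, "beta1": intercept}


if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 5.0]
    y = [2.0 + 3.0 * xi for xi in x]  # beta0 = 2, beta1 = 3, no noise
    print(fit_deflated_by_x(x, y))
```

Reading the fitted intercept as $\beta_0$ would be wrong here: the relabelling step inside `fit_deflated_by_x` is exactly the care the text asks for.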