
An important assumption of OLS is that the disturbances $\mu_i$ appearing in the population
regression function are homoscedastic (the error terms all have the same variance),
i.e. the variance of each disturbance term $\mu_i$, conditional on the chosen values of the
explanatory variables, is some constant number equal to $\sigma^2$: $E(\mu_i^2) = \sigma^2$,
where $i = 1, 2, \cdots, n$.
Homo means equal and scedasticity means spread.
Consider the general linear regression model

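$y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \cdots + \beta_k x_{ki} + \varepsilon_i$
If $E(\varepsilon_i^2) = \sigma^2$ for all $i = 1, 2, \cdots, n$, then the assumption of constant
variance of the error term, or homoscedasticity, is satisfied.
If $E(\varepsilon_i^2) \neq \sigma^2$ for some $i$, so that the error variance $\sigma_i^2$ differs across observations, then the assumption of homoscedasticity is violated and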
heteroscedasticity is said to be present. In the case of heteroscedasticity, the OLS
estimators are unbiased but inefficient.
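
As a rough illustration of the definition above, the following Python sketch (using NumPy) draws two sets of simulated errors and compares the average squared error in the lower and upper halves of a single regressor. The sample size, the uniform regressor, and the rule that the error standard deviation equals 0.5 times x are assumptions made for this sketch, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 10_000
x = rng.uniform(1.0, 10.0, size=n)  # a single hypothetical regressor

# Homoscedastic errors: the same standard deviation for every observation.
eps_homo = rng.normal(0.0, 2.0, size=n)
# Heteroscedastic errors: standard deviation grows with x (assumed form).
eps_hetero = rng.normal(0.0, 0.5 * x)


def variance_ratio(eps, x):
    """Mean of eps^2 in the upper half of x divided by that in the lower half."""
    upper = x > np.median(x)
    return np.mean(eps[upper] ** 2) / np.mean(eps[~upper] ** 2)


ratio_homo = variance_ratio(eps_homo, x)
ratio_hetero = variance_ratio(eps_hetero, x)
print(f"homoscedastic ratio:   {ratio_homo:.2f}")
print(f"heteroscedastic ratio: {ratio_hetero:.2f}")
```

A ratio close to 1 is what homoscedasticity looks like in this setup; a ratio well above 1 indicates that the error variance rises with x.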
Examples:

1. The range in family income between the poorest and richest family in town is the classical
example of heteroscedasticity.
2. The range in annual sales between a corner drug store and a general store.

Reasons for Heteroscedasticity


There are several reasons why the variance of the error term $\mu_i$ may vary across observations, some of
which are:

1. Following the error-learning models, as people learn, their errors of behavior become
smaller over time. In this case $\sigma_i^2$ is expected to decrease. For example, consider the number of typing
errors made in a given time period on a test in relation to the hours put into typing practice: as practice increases, both the average number of errors and their variance tend to fall.
2. As income grows, people have more discretionary income and more choice about how to use it, and hence $\sigma_i^2$ is likely to
increase with income.
3. As data collecting techniques improve, $\sigma_i^2$ is likely to decrease.
4. Heteroscedasticity can also arise as a result of the presence of outliers. The inclusion or
exclusion of such observations, especially when the sample size is small, can substantially alter the
results of regression analysis.
5. Heteroscedasticity can arise from violating the CLRM (classical linear regression
model) assumption that the regression model is correctly specified, for example when an important variable is omitted.
6. Skewness in the distribution of one or more regressors included in the model is another
source of heteroscedasticity.
7. Incorrect data transformation or incorrect functional form (e.g., a linear versus a log-linear model) is also a
source of heteroscedasticity.
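
To make the functional-form point in the list above concrete, here is a minimal Python sketch under assumed conditions: the data are generated from a log-linear model with a constant-variance error, and the same data are then fitted once in levels and once in logs. The data-generating process, the coefficient values, and the helper names `ols_residuals` and `spread_ratio` are all hypothetical choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility
n = 5_000
x = rng.uniform(1.0, 10.0, size=n)
e = rng.normal(0.0, 0.3, size=n)   # constant-variance error on the log scale
y = 2.0 * x * np.exp(e)            # assumed truth: log y = log 2 + log x + e


def ols_residuals(y, x):
    """Residuals from an OLS fit of y on an intercept and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def spread_ratio(resid, x):
    """Residual variance in the upper half of x divided by the lower half."""
    upper = x > np.median(x)
    return np.var(resid[upper]) / np.var(resid[~upper])


# Fitting in levels uses the wrong functional form for the error structure.
ratio_levels = spread_ratio(ols_residuals(y, x), x)
# Fitting in logs matches the assumed data-generating process.
ratio_logs = spread_ratio(ols_residuals(np.log(y), np.log(x)), x)
print(f"levels model spread ratio: {ratio_levels:.2f}")
print(f"log-log model spread ratio: {ratio_logs:.2f}")
```

In this setup the residual spread of the levels model grows with x, while the log-log model shows a roughly constant spread, so the apparent heteroscedasticity comes from the choice of functional form rather than from the underlying errors.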

Consequences of Heteroscedasticity
1. The OLS estimators, and the regression predictions based on them, remain unbiased and
consistent.
2. The OLS estimators are no longer BLUE (Best Linear Unbiased Estimators) because
they are no longer efficient, so the regression predictions will be inefficient too.
3. Because the usual estimator of the covariance matrix of the estimated regression
coefficients is biased and inconsistent, the tests of hypotheses (t-test, F-test) are no longer valid.
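
The three consequences above can be checked with a small Monte Carlo sketch in Python. Everything specific here is an assumption for illustration: the true coefficients, the fixed regressor grid, the error standard deviation of 0.1 times x squared, and the use of weighted least squares with weights treated as known (which would rarely be the case in practice) as the more efficient comparison estimator.

```python
import numpy as np

rng = np.random.default_rng(42)      # fixed seed for reproducibility
n, reps = 200, 2000
beta1, beta2 = 1.0, 2.0              # assumed true coefficients
x = np.linspace(1.0, 10.0, n)        # regressor held fixed across replications
sigma = 0.1 * x**2                   # assumed error s.d., rising with x
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
w = 1.0 / sigma**2                   # WLS weights, treated as known here
Xw = X * w[:, None]                  # each row of X scaled by its weight

ols_slopes = np.empty(reps)
wls_slopes = np.empty(reps)
conv_se = np.empty(reps)
for r in range(reps):
    y = beta1 + beta2 * x + rng.normal(0.0, sigma)
    b = XtX_inv @ X.T @ y                        # OLS coefficients
    resid = y - X @ b
    s2 = resid @ resid / (n - 2)                 # usual error-variance estimate
    ols_slopes[r] = b[1]
    conv_se[r] = np.sqrt(s2 * XtX_inv[1, 1])     # conventional OLS slope s.e.
    # WLS coefficients: solve (X'WX) b = X'Wy and keep the slope.
    wls_slopes[r] = np.linalg.solve(Xw.T @ X, Xw.T @ y)[1]

mean_slope = ols_slopes.mean()          # compare with beta2 (unbiasedness)
mc_sd = ols_slopes.std(ddof=1)          # actual sampling s.d. of the OLS slope
mean_conv_se = conv_se.mean()           # what the usual formula reports
wls_sd = wls_slopes.std(ddof=1)         # sampling s.d. of the WLS slope

print(f"mean OLS slope:            {mean_slope:.3f}")
print(f"Monte Carlo s.d. (OLS):    {mc_sd:.3f}")
print(f"mean conventional s.e.:    {mean_conv_se:.3f}")
print(f"Monte Carlo s.d. (WLS):    {wls_sd:.3f}")
```

Under these assumptions the average OLS slope stays close to the true value, the conventional standard error differs from the actual sampling variability (so t and F statistics built on it are unreliable), and the weighted estimator has a smaller sampling spread, which is what the loss of the BLUE property means.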

Note: Problems of heteroscedasticity are likely to be more common in cross-sectional
than in time series data.
