
Detection of heteroscedasticity:

Nature of the problem:


In cross-sectional data, where observations are collected on micro, small, medium and
large farms/firms, heteroscedasticity is likely to be present.

Park Test:
Run a usual regression, like:

ln Yi = β0 + β1 ln Xi + εi        (8.1)

Obtain the residuals ei, square them, and run a regression of the following form:

ln e²i = β0 + β1 ln Xi + vi        (8.2)

If β1 is statistically significant, it indicates the presence of heteroscedasticity.
Let's apply the Park test to our job satisfaction and organizational justice case to
check for a heteroscedasticity problem.
Convert the data on the dependent and independent variables JB, DJ, PJ, IJ, INJ and AEE
into logs using the TRANSFORM and COMPUTE VARIABLE commands in SPSS; let the
newly created log variables be named LJB, LDJ, LPJ, LIJ, LINJ and LAEE.
Regressing a (8.1)-type model:

ln JB = β0 + β1 ln DJ + β2 ln PJ + β3 ln IJ + β4 ln INJ + β5 ln AEE + εi        (8.3)

Obtain residuals using additional SPSS commands: ANALYZE → REGRESSION →
LINEAR → SAVE → RESIDUALS → UNSTANDARDIZED → CONTINUE → OK.
This command will estimate the residuals and put them in the last column of the data
file under the name RES_1. Square this variable (as we need ln e²i as per
equation 8.2), using the TRANSFORM and COMPUTE commands.

Now you can run the regression of the second equation, like (8.2); doing so:

ln e²i = β0 + β1 ln DJ + β2 ln PJ + β3 ln IJ + β4 ln INJ + β5 ln AEE + vi        (8.4)

We get results like:

Coefficients(a)

Model          B        Std. Error   Beta      t        Sig.
1 (Constant)    .240     .098                   2.450    .015
  LDJ          -.157     .026        -.455     -6.124    .000
  LPJ          -.008     .022        -.027      -.341    .733
  LIJ           .026     .024         .069      1.075    .283
  LINJ         -.056     .032        -.129     -1.748    .082
  LAEE          .021     .024         .046       .848    .397

a. Dependent Variable: Lnes (= ln e²i)

Three coefficients (LPJ, LIJ and LAEE) are statistically insignificant, while two
coefficients (LDJ and LINJ) are statistically significant (LINJ only at the 10%
level), suggesting the possibility of a moderate heteroscedasticity problem.
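The Park test steps above can be sketched outside SPSS as well. The following Python sketch uses purely hypothetical data (the data-generating process, sample size and variable names are assumptions, not the job-satisfaction data set); heteroscedasticity is built in by construction so the slope in the auxiliary regression (8.2) comes out significant:

```python
import math
import random

def ols_simple(x, y):
    """OLS of y on x with intercept: returns (b0, b1, t of b1, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)   # residual variance
    return b0, b1, b1 / math.sqrt(s2 / sxx), resid

random.seed(42)
# Hypothetical data: multiplicative errors whose spread grows with X,
# so heteroscedasticity is present by construction
x = [random.uniform(1.0, 10.0) for _ in range(200)]
y = [2.0 * xi ** 0.5 * math.exp(random.gauss(0.0, 0.05 * xi)) for xi in x]

# Step 1 (eq. 8.1): regress ln Y on ln X and keep the residuals e_i
_, _, _, e = ols_simple([math.log(v) for v in x], [math.log(v) for v in y])

# Step 2 (eq. 8.2): regress ln(e_i^2) on ln X; a significant slope
# t-statistic signals heteroscedasticity
_, slope, t_slope, _ = ols_simple([math.log(v) for v in x],
                                  [math.log(ei * ei) for ei in e])
print(f"Park test slope = {slope:.2f}, t = {t_slope:.2f}")
```

With a |t| well above 2, the sketch reaches the same kind of conclusion the SPSS output above supports for LDJ.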

Goldfeld-Quandt Test:
The Goldfeld-Quandt test suggests ordering (ranking) the observations according to the
values of Xi, beginning with the lowest Xi value. Some central observations are then
omitted so that the remaining observations are divided into two equal groups.
These two data groups are used to run two separate regressions, and their residual
sums of squares (RSS1 and RSS2) are obtained; these are then used to
compute the Goldfeld-Quandt F test, namely:

F = (RSS2/df) / (RSS1/df)        (8.5)

If the F is found significant (F-calculated > F-tabulated), the problem of
heteroscedasticity is likely to exist.

Let's run the stated test for the organizational justice and job satisfaction case. The
aforementioned Park test indicated that the log of variable DJ was the most strongly
related to the log of the squared residuals; this suggests that we arrange the data
in ascending order of DJ and then omit the central 14
observations, which leaves 250 observations to be equally divided into two groups of
125 observations each.
The SPSS command is: DATA → SORT CASES → move DJ to the SORT BY box →
ASCENDING → OK.
Remove the 14 central observations, and save the data in two separate files, one
having Group I data (the first 125 observations) and the second having Group II
data (the last 125 observations).
Then running the required two regressions gives the following two ANOVA tables:
GROUP I: ANOVA(b)

Model         Sum of Squares   df    Mean Square   F       Sig.
1 Regression  14.897           5     2.979         6.447   .000(a)
  Residual    54.995           119   .462
  Total       69.892           124

a. Predictors: (Constant), AEE, Procedural justice, Interactive justice,
Distributive justice, INJ
b. Dependent Variable: Job satisfaction

GROUP II: ANOVA(b)

Model         Sum of Squares   df    Mean Square   F       Sig.
1 Regression  4.123            5     .825          5.005   .000(a)
  Residual    19.605           119   .165
  Total       23.728           124

a. Predictors: (Constant), AEE, Distributive justice, Interactive justice,
INJ, Procedural justice
b. Dependent Variable: Job satisfaction
The residual sums of squares (RSS) of the two groups are:
RSSI = 54.995 with df = 119
RSSII = 19.605 with df = 119
Calculating F using (8.5):

F = (RSSII/df) / (RSSI/df)
  = (19.605/119) / (54.995/119)
  = 0.3565        (8.6)

F-calculated = 0.3565 < F-tabulated = 1.29 (at p = 0.05), suggesting there is no
heteroscedasticity.
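The Goldfeld-Quandt computation of (8.5) can be sketched as follows. The sample size of 264 and the omission of 14 central observations mirror the example above, but the data-generating process is an illustrative assumption (here the error variance grows with X, so the test should flag heteroscedasticity):

```python
import random

def rss_simple(x, y):
    """Fit y = b0 + b1*x by OLS and return the residual sum of squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

random.seed(1)
# Hypothetical data with error variance increasing in X
x = [random.uniform(1.0, 10.0) for _ in range(264)]
y = [1.0 + 0.8 * xi + random.gauss(0.0, 0.2 * xi) for xi in x]

# Order observations by X and drop the 14 central ones, as in the text
pairs = sorted(zip(x, y))
c = 14
n1 = (len(pairs) - c) // 2          # 125 observations per group
low, high = pairs[:n1], pairs[-n1:]

rss1 = rss_simple([p[0] for p in low],  [p[1] for p in low])
rss2 = rss_simple([p[0] for p in high], [p[1] for p in high])

# Eq. (8.5): F = (RSS2/df) / (RSS1/df); df = n1 - 2 here (one regressor),
# with the high-X group's RSS in the numerator
f_stat = (rss2 / (n1 - 2)) / (rss1 / (n1 - 2))
print(f"Goldfeld-Quandt F = {f_stat:.2f}")
```

Because the high-X group has the larger error variance here, the F comes out well above 1, the opposite of the job-satisfaction example where no heteroscedasticity was found.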

White's General Heteroscedasticity Test

Unlike the Goldfeld-Quandt test, which requires reordering the observations with
respect to the X variable that supposedly caused heteroscedasticity, or the BPG test,
which is sensitive to the normality assumption, the general test of heteroscedasticity
proposed by White does not rely on the normality assumption and is easy to
implement. As an illustration of the basic idea, consider the following three-variable
regression model:

Yi = β1 + β2 X2i + β3 X3i + ui        (8.7)

Step 1: Given the data, we estimate (8.7) and obtain the residuals, ûi.

Step 2: We then run the following (auxiliary) regression:

û²i = α1 + α2 X2i + α3 X3i + α4 X2i² + α5 X3i² + α6 X2i X3i + vi        (8.8)

Obtain the R² from this (auxiliary) regression.


Step 3: Under the null hypothesis that there is no heteroscedasticity, the sample
size n times the R² from the auxiliary regression asymptotically follows the
chi-square distribution, that is:

n·R² ~asy χ²(df)        (8.9)

where df is the number of regressors (excluding the constant term) in the auxiliary
regression. In our example there are 5 df, since there are 5 regressors in the
auxiliary regression.
Step 4: If the chi-square value obtained in (8.9) exceeds the critical chi-square
value at the chosen level of significance, the conclusion is that there is
heteroscedasticity. If it does not exceed the critical chi-square value, there is no
heteroscedasticity.
Gujarati (2007, p. 422) advises caution in using the White test; he says: "the White
test can be a test of (pure) heteroscedasticity or specification error or both. It has
been argued that if no cross-product terms are present in the White test procedure,
then it is a test of pure heteroscedasticity. If cross-product terms are present, then it
is a test of both heteroscedasticity and specification bias."
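Steps 1 to 4 can be sketched in Python on hypothetical data (the two-regressor model, sample size, coefficients and error process below are all illustrative assumptions, with heteroscedasticity built in so the test rejects):

```python
import random

def ols(X, y):
    """OLS of y on an intercept plus the columns of X.
    Returns (coefficients, residuals); solves the normal equations
    (Z'Z)b = Z'y by Gauss-Jordan elimination."""
    n, Z = len(y), [[1.0] + row for row in X]
    k = len(Z[0])
    A = [[sum(Z[i][p] * Z[i][q] for i in range(n)) for q in range(k)]
         + [sum(Z[i][p] * y[i] for i in range(n))] for p in range(k)]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    b = [A[p][k] / A[p][p] for p in range(k)]
    resid = [y[i] - sum(bj * zj for bj, zj in zip(b, Z[i])) for i in range(n)]
    return b, resid

random.seed(7)
n = 300
# Hypothetical three-variable model (eq. 8.7) with error s.d. growing in X2
x2 = [random.uniform(1.0, 5.0) for _ in range(n)]
x3 = [random.uniform(1.0, 5.0) for _ in range(n)]
y = [1.0 + 2.0 * v2 + 3.0 * v3 + random.gauss(0.0, 0.5 * v2)
     for v2, v3 in zip(x2, x3)]

# Step 1: estimate (8.7) and keep the residuals u_i
_, u = ols([[v2, v3] for v2, v3 in zip(x2, x3)], y)

# Step 2: auxiliary regression (8.8) of u_i^2 on levels, squares, cross product
aux_X = [[v2, v3, v2 * v2, v3 * v3, v2 * v3] for v2, v3 in zip(x2, x3)]
u2 = [e * e for e in u]
_, aux_resid = ols(aux_X, u2)
u2bar = sum(u2) / n
r2 = 1.0 - sum(e * e for e in aux_resid) / sum((v - u2bar) ** 2 for v in u2)

# Steps 3-4 (eq. 8.9): n*R^2 is asymptotically chi-square with 5 df;
# the 5% critical value of chi-square(5) is 11.07
white_stat = n * r2
print(f"White statistic n*R^2 = {white_stat:.2f}")
```

Since the statistic exceeds the 5% critical value of 11.07, the null of homoscedasticity is rejected, as expected given how the data were generated.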

Remedies:
1) If we know σi, then we use the weighted least squares (WLS) estimation
technique, i.e.,

Yi/σi = β0(1/σi) + β1(Xi/σi) + εi/σi        (8.7)

where σi = standard deviation of the error term εi.


2) Log-transformation:

ln Yi = β0 + β1 ln Xi + εi        (8.8)

It reduces heteroscedasticity.
3) Other transformations:

a) Yi/√Xi = β0(1/√Xi) + β1(Xi/√Xi) + εi/√Xi        (8.9)

After estimating the above model, both sides are then multiplied back by √Xi.

b) Yi/Ŷi = β0(1/Ŷi) + β1(Xi/Ŷi) + εi/Ŷi        (8.10)

where Ŷi is the estimated Yi.

Note: In the case of transformed data, the diagnostic statistics (t-ratios and
F-statistics) are valid only in large samples.
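The WLS remedy in (1) can be sketched as follows. Everything here is illustrative: the data are synthetic, and σi is assumed known and proportional to Xi (in practice σi must be estimated). Minimizing the weighted sum of squares is equivalent to running OLS on the transformed model:

```python
import random

def wls(x, y, sigma):
    """Weighted least squares for y = b0 + b1*x with known error s.d. sigma_i.
    Minimizes sum of (1/sigma_i^2)*(y_i - b0 - b1*x_i)^2, which is OLS on
    the transformed model y_i/sigma_i = b0*(1/sigma_i) + b1*(x_i/sigma_i) + e_i."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    b0 = (swy - b1 * swx) / sw
    return b0, b1

random.seed(3)
# Hypothetical data: true model y = 2 + 1.5x, with known sigma_i = 0.5*x_i
x = [random.uniform(1.0, 10.0) for _ in range(200)]
sigma = [0.5 * xi for xi in x]
y = [2.0 + 1.5 * xi + random.gauss(0.0, s) for xi, s in zip(x, sigma)]

b0, b1 = wls(x, y, sigma)
print(f"WLS estimates: b0 = {b0:.2f}, b1 = {b1:.2f}")
```

Down-weighting the high-variance observations recovers coefficients close to the true values of 2 and 1.5, with efficient (minimum-variance) estimates, which plain OLS cannot deliver here.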