Detection of Heteroscedasticity


In cross-sectional data, where we collect observations on micro, small, medium and large farms or firms, heteroscedasticity is likely to be present.

Park Test:

Run the usual regression, for example:

$$\ln Y_i = \beta_0 + \beta_1 \ln X_i + \varepsilon_i \tag{8.1}$$

Obtain the residuals $e_i$, square them, and run a regression of the following form:

$$\ln e_i^2 = \alpha_0 + \alpha_1 \ln X_i + v_i \tag{8.2}$$

A statistically significant slope coefficient in (8.2) indicates a problem of heteroscedasticity. Let us apply the Park test to our job satisfaction and organizational justice case to check whether heteroscedasticity is present.
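The two regressions of the Park test can be sketched in Python. This is only an illustration under stated assumptions: the data are synthetic (not the job satisfaction data), the helper names `ols` and `park_test` are hypothetical, and plain NumPy is used in place of SPSS.

```python
import numpy as np

def ols(x, y):
    """OLS with an intercept; returns coefficients, standard errors, residuals."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, resid

def park_test(x, y):
    """Slope and t ratio from regressing ln(e^2) on ln(x)."""
    lx, ly = np.log(x), np.log(y)
    _, _, e = ols(lx, ly)                  # first regression, as in (8.1)
    beta, se, _ = ols(lx, np.log(e ** 2))  # second regression, as in (8.2)
    return beta[1], beta[1] / se[1]

# Synthetic data (assumption): the error spread grows with x.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 20.0, size=500)
y = np.exp(1.0 + 0.5 * np.log(x) + rng.normal(0.0, 0.05 * x))

slope, t_ratio = park_test(x, y)
print(slope, t_ratio)
```

A large t ratio on the slope of the second regression is what the Park test reads as a sign of heteroscedasticity.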

Convert all dependent and independent variables (JB, DJ, PJ, IJ, INJ and AEE) into logs using the TRANSFORM and COMPUTE VARIABLE commands in SPSS; give the new log variables the names LJB, LDJ, LPJ, LIJ, LINJ and LAEE.

Estimate a model of type (8.1):

$$\ln JB_i = \beta_0 + \beta_1 \ln DJ_i + \beta_2 \ln PJ_i + \beta_3 \ln IJ_i + \beta_4 \ln INJ_i + \beta_5 \ln AEE_i + \varepsilon_i \tag{8.3}$$

LINEAR → SAVE → RESIDUALS → UNSTANDARDIZED → CONTINUE → OK

This command estimates the residuals and stores them in the last column of the data file under the name RES_1. Square this variable and take the natural log of the result (equation 8.2 requires $\ln e_i^2$), using the TRANSFORM and COMPUTE commands.

Now run the second regression, of type (8.2):

$$\ln e_i^2 = \alpha_0 + \alpha_1 \ln DJ_i + \alpha_2 \ln PJ_i + \alpha_3 \ln IJ_i + \alpha_4 \ln INJ_i + \alpha_5 \ln AEE_i + v_i \tag{8.4}$$

We get the following results:

Coefficients

| Model 1 | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | .240 | .098 | | 2.450 | .015 |
| LDJ | -.157 | .026 | -.455 | -6.124 | .000 |
| LPJ | -.008 | .022 | -.027 | -.341 | .733 |
| LIJ | .026 | .024 | .069 | 1.075 | .283 |
| LINJ | -.056 | .032 | -.129 | -1.748 | .082 |
| LAEE | .021 | .024 | .046 | .848 | .397 |

Three coefficients (LPJ, LIJ and LAEE) are statistically insignificant, while LDJ is significant at the 1% level and LINJ only at the 10% level (Sig. = .082), suggesting a moderate heteroscedasticity problem.
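The reading of the coefficient table can be checked with a few lines of Python. The figures are copied from the table; the dictionary layout and the helper name `significant` are hypothetical choices made for this sketch.

```python
# (B, Std. Error, Sig.) copied from the coefficient table above
table = {
    "LDJ":  (-0.157, 0.026, 0.000),
    "LPJ":  (-0.008, 0.022, 0.733),
    "LIJ":  (0.026, 0.024, 0.283),
    "LINJ": (-0.056, 0.032, 0.082),
    "LAEE": (0.021, 0.024, 0.397),
}

def significant(rows, alpha):
    """Names of the regressors whose Sig. value is below alpha."""
    return [name for name, (_, _, sig) in rows.items() if sig < alpha]

# t ratios recomputed from the rounded B and Std. Error values, so they
# only approximate the t values printed by SPSS.
t_ratios = {name: b / se for name, (b, se, _) in table.items()}

at_5_percent = significant(table, 0.05)
at_10_percent = significant(table, 0.10)
print(at_5_percent, at_10_percent)
```

Only LDJ passes at the 5% level; LINJ joins it only when the threshold is relaxed to 10%.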

Goldfeld-Quandt Test:

The Goldfeld-Quandt test orders (ranks) the observations according to the values of Xi, beginning with the lowest Xi value. Some central observations are then omitted so that the remaining observations divide into two equal groups. A separate regression is run on each group and its residual sum of squares (RSS) is obtained; the two values (RSS1 and RSS2) are then used to compute the Goldfeld-Quandt F statistic, namely:

$$F = \frac{RSS_2/df}{RSS_1/df} \tag{8.5}$$

If the computed F exceeds the critical F value at the chosen level of significance, heteroscedasticity is likely to exist.
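A minimal Python sketch of the procedure follows: sort by X, drop the central observations, fit each group, and form the ratio in (8.5). It assumes a single regressor and synthetic data; `rss_and_df` and `goldfeld_quandt` are hypothetical helper names, not SPSS or library functions.

```python
import numpy as np

def rss_and_df(x, y):
    """Residual sum of squares and residual df of an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid), len(y) - X.shape[1]

def goldfeld_quandt(x, y, n_omit):
    """F ratio of (8.5) after sorting by x and omitting n_omit central cases."""
    order = np.argsort(x)                  # ascending order of X
    x, y = x[order], y[order]
    n_group = (len(y) - n_omit) // 2       # size of each remaining group
    rss1, df1 = rss_and_df(x[:n_group], y[:n_group])    # low-X group
    rss2, df2 = rss_and_df(x[-n_group:], y[-n_group:])  # high-X group
    return (rss2 / df2) / (rss1 / df1), df1, df2

# Synthetic data (assumption): error standard deviation proportional to x.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 10.0, size=264)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

f_stat, df1, df2 = goldfeld_quandt(x, y, n_omit=14)
print(f_stat, df1, df2)
```

With error variance that rises with X, the high-X group has the larger RSS and the ratio comes out well above 1.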

Let us run this test for the organizational justice and job satisfaction case. The Park test above indicated that the log of DJ was the variable most strongly related to the log of the squared residuals; this suggests arranging the data in ascending order of DJ and then omitting the 14 central observations, which leaves 250 observations to be divided equally into two groups of 125 observations each.

The SPSS command is: DATA → SORT CASES → move DJ to the SORT BY box → ASCENDING.

Remove the 14 central observations and save the data in two separate files, one holding the Group I data (the first 125 observations) and the other holding the Group II data (the last 125 observations).
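The arithmetic of the split can be checked quickly. In this sketch the total of 264 observations is inferred from the 14 omitted plus the 250 remaining, and the count of six estimated parameters (an intercept plus five slopes) is an assumption based on the five regressors used earlier; `gq_group_size` is a hypothetical helper.

```python
def gq_group_size(n_total, n_omit):
    """Size of each group after omitting the central observations."""
    remaining = n_total - n_omit
    if remaining % 2 != 0:
        raise ValueError("remaining observations cannot be split equally")
    return remaining // 2

n_total = 250 + 14          # inferred sample size
group_size = gq_group_size(n_total, n_omit=14)
n_parameters = 6            # intercept + five slopes (assumption)
residual_df = group_size - n_parameters
print(group_size, residual_df)
```

Each group then has 125 observations and 119 residual degrees of freedom, which agrees with the df values reported in the two ANOVA tables.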

Then running the required two regressions gives the following TWO ANOVA tables:

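GROUP I: ANOVA(b)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 14.897 | 5 | 2.979 | 6.447 | .000(a) |
| Residual | 54.995 | 119 | .462 | | |
| Total | 69.892 | 124 | | | |

a. Predictors: ..., Distributive justice, INJ

b. Dependent Variable: Job satisfaction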

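GROUP II: ANOVA(b)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 4.123 | 5 | .825 | 5.005 | .000(a) |
| Residual | 19.605 | 119 | .165 | | |
| Total | 23.728 | 124 | | | |

a. Predictors: ..., INJ, Procedural justice

b. Dependent Variable: Job satisfaction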

The residual sums of squares (RSS) of the two groups are:

RSS_I = 54.995 with df = 119

RSS_II = 19.605 with df = 119

Calculating F using (8.5):

$$F = \frac{RSS_{II}/df}{RSS_{I}/df} = \frac{19.605/119}{54.995/119} = 0.3565 \tag{8.6}$$

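F-calculated = 0.3565 < F-tabulated = 1.29 (at p = 0.05), suggesting that there is no heteroscedasticity of the kind this ratio tests for, namely error variance that rises with DJ. Because the ratio is below 1, the reverse ratio (54.995/19.605 = 2.805) exceeds the tabulated value, so error variance that falls as DJ rises is not ruled out; this agrees with the negative LDJ coefficient found in the Park test.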
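The calculation in (8.6) can be reproduced directly from the two ANOVA tables. The tabulated value 1.29 is taken from the text as given; the variable names are hypothetical.

```python
rss_1, df_1 = 54.995, 119   # Group I residual sum of squares and df
rss_2, df_2 = 19.605, 119   # Group II residual sum of squares and df

f_calculated = (rss_2 / df_2) / (rss_1 / df_1)
f_tabulated = 1.29          # value quoted in the text (p = 0.05)

exceeds_critical = f_calculated > f_tabulated
print(round(f_calculated, 4), exceeds_critical)
```

Because both groups have the same degrees of freedom, the ratio reduces to RSS_II / RSS_I.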

White's General Heteroscedasticity Test:

Unlike the Goldfeld-Quandt test, which requires reordering the observations with respect to the X variable that supposedly caused heteroscedasticity, or the Breusch-Pagan-Godfrey (BPG) test, which is sensitive to the normality assumption, the general test of heteroscedasticity proposed by White does not rely on the normality assumption and is easy to implement. As an illustration of the basic idea, consider the following three-variable regression model.

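$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i \tag{8.7}$$

Step 1: Given the data, we estimate (8.7) and obtain the residuals, $\hat{u}_i$.

Step 2: We then run the following auxiliary regression of the squared residuals on the original regressors, their squares and their cross product:

$$\hat{u}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i \tag{8.8}$$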

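Step 3: Under the null hypothesis that there is no heteroscedasticity, the sample size (n) times the R² obtained from the auxiliary regression asymptotically follows the chi-square distribution, that is:

$$n \cdot R^2 \overset{\text{asy}}{\sim} \chi^2_{df} \tag{8.9}$$

where df is the number of regressors (excluding the constant term) in the auxiliary regression. In our example, there are 5 df since there are 5 regressors in the auxiliary regression.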

Step 4: If the chi-square value obtained in (8.9) exceeds the critical chi-square value at the chosen level of significance, the conclusion is that there is heteroscedasticity. If it does not exceed the critical chi-square value, there is no heteroscedasticity.
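The four steps can be sketched in Python for the three-variable model. The data are synthetic, the helper names `fit` and `white_test` are hypothetical, and the 5% critical value of the chi-square distribution with 5 df (about 11.07) is hard-coded rather than looked up.

```python
import numpy as np

def fit(X, y):
    """OLS with an intercept; returns residuals and R-squared."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    tss = np.sum((y - y.mean()) ** 2)
    return resid, 1.0 - (resid @ resid) / tss

def white_test(x2, x3, y):
    """n * R^2 of the auxiliary regression and its degrees of freedom."""
    u, _ = fit(np.column_stack([x2, x3]), y)                  # Step 1
    aux = np.column_stack([x2, x3, x2**2, x3**2, x2 * x3])    # Step 2
    _, r2 = fit(aux, u ** 2)
    return len(y) * r2, aux.shape[1]                          # Step 3

CHI2_CRIT_5DF_5PCT = 11.07  # 5% critical value, chi-square with 5 df

# Synthetic data (assumption): error standard deviation proportional to x2.
rng = np.random.default_rng(2)
n = 400
x2 = rng.uniform(1.0, 10.0, n)
x3 = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(0.0, x2)

stat, df = white_test(x2, x3, y)
heteroscedastic = stat > CHI2_CRIT_5DF_5PCT                   # Step 4
print(stat, df, heteroscedastic)
```

With more regressors the auxiliary regression grows quickly, since every square and cross product is added, so the degrees of freedom should always be counted from the auxiliary design matrix rather than fixed at 5.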

Gujarati (2007, p. 422) advises caution in using the White test; he says that the White test can be a test of (pure) heteroscedasticity or specification error or both. It has been argued that if no cross-product terms are present in the White test procedure, then it is a test of pure heteroscedasticity. If cross-product terms are present, then it is a test of both heteroscedasticity and specification bias.

Remedies:

1) If we know $\sigma_i$, then we use the weighted least squares (WLS) estimation technique, i.e.,

$$\frac{Y_i}{\sigma_i} = \beta_0 \frac{1}{\sigma_i} + \beta_1 \frac{X_i}{\sigma_i} + \frac{e_i}{\sigma_i} \tag{8.7}$$

2) Log transformation:

$$\ln Y_i = \beta_0 + \beta_1 \ln X_i + \varepsilon_i \tag{8.8}$$

It reduces the heteroscedasticity.

3) Other transformations:

a)

$$\frac{Y_i}{X_i} = \beta_0 \frac{1}{X_i} + \beta_1 \frac{X_i}{X_i} + \frac{\varepsilon_i}{X_i} \tag{8.9}$$

After estimating the above model, both sides are then multiplied by $X_i$.

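b)

$$\frac{Y_i}{\hat{Y}_i} = \beta_0 \frac{1}{\hat{Y}_i} + \beta_1 \frac{X_i}{\hat{Y}_i} + \frac{\varepsilon_i}{\hat{Y}_i} \tag{8.10}$$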

Note: With transformed data, the diagnostic statistics (t-ratio and F-statistic) are valid only in large samples.
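The first remedy, WLS with a known error standard deviation, can be sketched in Python as follows. The example assumes synthetic data in which sigma_i is known exactly, which is rarely the case in practice; `wls_known_sigma` is a hypothetical helper name.

```python
import numpy as np

def wls_known_sigma(x, y, sigma):
    """Divide every term by sigma_i, then apply OLS to the transformed data.

    The transformed model has no separate intercept: the original intercept
    becomes the coefficient on 1/sigma_i.
    """
    X_star = np.column_stack([1.0 / sigma, x / sigma])
    y_star = y / sigma
    beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta  # estimates of the original intercept and slope

# Synthetic data (assumption): true intercept 2, true slope 3,
# error standard deviation known and proportional to x.
rng = np.random.default_rng(3)
x = rng.uniform(1.0, 10.0, size=2000)
sigma = 0.5 * x
y = 2.0 + 3.0 * x + rng.normal(0.0, sigma)

b0, b1 = wls_known_sigma(x, y, sigma)
print(b0, b1)
```

After the division, the transformed error term has a constant variance of 1, which is why ordinary least squares on the transformed variables is appropriate.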

