Uploaded by Vu Trung Duc

Attribution Non-Commercial (BY-NC)


Applied Econometrics

Lecture 8: Heteroscedasticity

1) The nature of heteroscedasticity

To estimate the model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, we assume that $\varepsilon_i \sim (0, \sigma^2)$. The constant variance assumption $E(\varepsilon_i^2) = \sigma^2$ for all i is called homoscedasticity. Violation of this assumption is the problem of heteroscedasticity.

Heteroscedasticity is a non-constant error variance across the sample. The presence of heteroscedasticity renders least squares estimators inefficient, but they remain unbiased. In other words, they are linear unbiased estimators, not best linear unbiased estimators. Moreover, the standard formulae for the standard errors of the coefficients no longer apply and, hence, statistical inferences based on the t test or F test are not valid.

Since we do not know the population line, we do not know the actual errors (the $\varepsilon$s), but we estimate them by the residuals (e). Hence a look at the residual plot is a first test for the presence of heteroscedasticity.

2) An illustrative example: Urban weekly earnings against age of workers

We have data on the weekly earnings of 261 workers along with their age. The estimated regression of weekly wage income on age is given below (standard errors in brackets):

$$\text{INCOME} = \underset{(21.13)}{8.647} + \underset{(0.61)}{4.883}\,\text{AGE}, \qquad R^2 = 0.196$$

Figure 1 presents the residual versus predicted plot. The points spread vertically wider and wider

with the increase in the predicted value of weekly wage income, indicating heteroscedasticity.

We may also draw the plots of absolute and squared residuals against the predicted values of weekly

wage income. The latter mainly draws our attention to the presence of outliers.
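The three residual series mentioned above can be computed as in the following minimal sketch. It uses only the Python standard library and simulated data, not the survey data; the helper name `fit_simple_ols` and the data-generating numbers are assumptions made for illustration. Instead of drawing a plot, it compares the average absolute residual in the lower and upper halves of the predicted values.

```python
import random

def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least squares line y = b0 + b1*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Simulated data (assumption, NOT the survey data): error spread grows with age.
random.seed(0)
age = [random.uniform(18, 60) for _ in range(261)]
income = [10 + 5 * a + random.gauss(0, 2 * a) for a in age]

b0, b1 = fit_simple_ols(age, income)
predicted = [b0 + b1 * a for a in age]
residual = [y - p for y, p in zip(income, predicted)]
abs_residual = [abs(e) for e in residual]
sq_residual = [e * e for e in residual]

# Compare the residual spread in the lower and upper halves of predicted income.
order = sorted(range(len(age)), key=lambda i: predicted[i])
half = len(order) // 2
low = sum(abs_residual[i] for i in order[:half]) / half
high = sum(abs_residual[i] for i in order[half:]) / (len(order) - half)
print(f"mean |e|, low predicted income:  {low:.1f}")
print(f"mean |e|, high predicted income: {high:.1f}")
```

Plotting `residual`, `abs_residual` and `sq_residual` against `predicted` with any charting tool gives the three plots discussed in the text.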

Written by Huynh Thanh Dien May 24, 2004


Figure 1: The Evidence of Heteroscedasticity. Residual versus Predicted Income (scatter plot; vertical axis: Residual, from -400 to 800; horizontal axis: Predicted Income, from 50 to 400).

Source: Survey of worker households in 1990 in an industrial town in southern India.

3) Detection of heteroscedasticity

A number of tests help us to detect the presence of heteroscedasticity. What follows is an illustration of a selection of commonly used ones.

Glejser's test

Glejser's test checks whether a systematic relation exists between the residuals and the explanatory variables. The test involves regressing the absolute residual separately on $X$, $X^{-1}$ and $X^{1/2}$, and uses t-tests of the hypothesis that the slope coefficients are zero [1]. The hypothesis of homoscedasticity is rejected if any of the slope coefficients turns out to be significantly different from zero.

The test involves the following steps:

1. Run the regression and calculate the residuals (e).

2. Convert the residuals to their absolute values [2].

3. Regress the absolute residual on each regressor separately in the following functional forms:

$$|e| = \beta_0 + \beta_1 X$$

$$|e| = \beta_0 + \beta_1 (1/X)$$

$$|e| = \beta_0 + \beta_1 \sqrt{X}$$

$$|e| = \beta_0 + \beta_1 (1/\sqrt{X})$$

4. Use a t-test to determine if the slope coefficient is significantly different from zero.

[1] If there is more than one explanatory variable, this test is to be repeated for each of the explanatory variables.

[2] In Excel we use the function =ABS() on each residual cell, and in EViews we use GENR ee = abs(e).
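The steps of Glejser's test can be sketched as follows. This is an illustrative sketch on simulated data, not the author's implementation; the function names `ols` and `glejser` and the simulated numbers are assumptions. It returns the slope t-statistic for each of the four functional forms, to be compared with the critical t value (about 1.97 in absolute value at the 5% level with 259 degrees of freedom).

```python
import math
import random

def ols(x, y):
    """Simple OLS of y on x with intercept: (b0, b1, t-statistic of b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b0 = my - b1 * mx
    rss = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)
    return b0, b1, b1 / se_b1

def glejser(x, y):
    """Slope t-statistics from regressing |e| on four transforms of x (x > 0)."""
    b0, b1, _ = ols(x, y)                                      # step 1
    abs_e = [abs(yi - b0 - b1 * xi) for xi, yi in zip(x, y)]   # step 2
    forms = {
        "X": lambda v: v,
        "1/X": lambda v: 1 / v,
        "sqrt(X)": math.sqrt,
        "1/sqrt(X)": lambda v: 1 / math.sqrt(v),
    }
    # steps 3 and 4: one auxiliary regression per functional form
    return {name: ols([g(v) for v in x], abs_e)[2] for name, g in forms.items()}

# Simulated data (assumption): the error standard deviation is proportional to age.
random.seed(1)
age = [random.uniform(18, 60) for _ in range(261)]
income = [10 + 5 * a + random.gauss(0, 2 * a) for a in age]

t_stats = glejser(age, income)
for name, t in t_stats.items():
    print(f"|e| on {name:10s} t = {t:6.2f}")
```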

White's test

White's test also checks whether there is any systematic relation between the squared residuals and the explanatory variables. This is achieved by regressing the squared residuals $e^2$ on all the explanatory variables and on their squares and cross products. Thus, if $X_1$ and $X_2$ are the explanatory variables, then White's test involves regressing $e^2$ on $X_1$, $X_2$, $X_1^2$, $X_2^2$ and $X_1 X_2$, and using the overall F-test to check whether the regression is significant.
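A possible sketch of the auxiliary regression with two explanatory variables is given below. It assumes NumPy is available and uses simulated data; the function name `white_test_f` is hypothetical. It follows the overall F-test described above and returns the F statistic together with its degrees of freedom, to be compared with the critical value of the F distribution.

```python
import numpy as np

def white_test_f(X, y):
    """Overall F-statistic of the regression of e^2 on the regressors,
    their squares and cross products. X has shape (n, p), no constant column."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e2 = (y - Z @ beta) ** 2                       # squared residuals
    # Auxiliary regressors: levels, then squares (i == j) and cross products (i < j).
    cols = [X[:, j] for j in range(p)]
    cols += [X[:, i] * X[:, j] for i in range(p) for j in range(i, p)]
    W = np.column_stack([np.ones(n)] + cols)
    gamma, *_ = np.linalg.lstsq(W, e2, rcond=None)
    rss = np.sum((e2 - W @ gamma) ** 2)
    tss = np.sum((e2 - e2.mean()) ** 2)
    r2 = 1 - rss / tss
    q = W.shape[1] - 1                             # number of slope coefficients
    f_stat = (r2 / q) / ((1 - r2) / (n - q - 1))
    return f_stat, q, n - q - 1

# Simulated data (assumption): the error standard deviation grows with x1.
rng = np.random.default_rng(0)
n = 1000
x1 = rng.uniform(18, 60, n)
x2 = rng.uniform(0, 10, n)
y = 10 + 5 * x1 + 2 * x2 + rng.normal(0, 1, n) * 2 * x1

f_stat, df1, df2 = white_test_f(np.column_stack([x1, x2]), y)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```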

Goldfeld-Quandt test

The test is commonly used when the heteroscedastic variance is suspected to be monotonically (i.e., consistently increasing or decreasing) related to one of the explanatory variables in the regression model. We group the data with respect to different ranges of one of the explanatory variables and test whether the conditional variances of the error term are the same.

The test involves the following steps:

1. Arrange the data in ascending order of the explanatory variable suspected to be related to the error variance.

2. Drop a number of the middle observations, say c, chosen so that (n-c) is divisible by 2. A rule of thumb is to drop about one quarter of the total observations from the middle. This omission sharpens the test.

3. Estimate two separate regressions for the bottom and the top groups of observations (equal sample sizes).

4. Calculate the ratio of the higher to the lower residual sum of squares from the two sub-sample regressions. If the sub-sample variances are the same (homoscedastic errors), the ratio will approximately equal unity.

5. Compare the computed ratio with the critical value of the relevant F distribution with ([(n-c)/2]-k; [(n-c)/2]-k) degrees of freedom at the desired level of significance, where (n-c)/2 is the size of each sub-sample and k is the number of regressors in the equation (including the intercept). If the computed ratio exceeds the critical value, then the hypothesis of homoscedasticity is rejected.

The Goldfeld-Quandt test is only valid under the assumption that the dependent variable is normally distributed.
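The five steps can be sketched for a simple regression as follows. The sketch uses only the standard library and simulated data; the function names, the default `drop_fraction` of one quarter and the rule used to make (n-c) even are assumptions for illustration. With 261 observations it drops 65 middle observations and leaves two sub-samples of 98, so the ratio is compared with the F(96, 96) critical value (approximately 1.4 at the 5% level).

```python
import random

def rss_simple_ols(pairs):
    """Residual sum of squares from regressing y on x (with intercept)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    b1 = sum((x - mx) * (y - my) for x, y in pairs) / sxx
    b0 = my - b1 * mx
    return sum((y - b0 - b1 * x) ** 2 for x, y in pairs)

def goldfeld_quandt(x, y, drop_fraction=0.25, k=2):
    """Return (RSS ratio, degrees of freedom of each sub-sample regression)."""
    pairs = sorted(zip(x, y))                 # step 1: sort by x
    n = len(pairs)
    c = round(n * drop_fraction)              # step 2: drop about a quarter
    if (n - c) % 2:                           # make (n - c) divisible by 2
        c += 1
    m = (n - c) // 2
    rss_low = rss_simple_ols(pairs[:m])       # step 3: bottom group
    rss_high = rss_simple_ols(pairs[n - m:])  # step 3: top group
    ratio = max(rss_low, rss_high) / min(rss_low, rss_high)   # step 4
    return ratio, m - k                       # step 5: compare with F(m-k, m-k)

# Simulated data (assumption): the error standard deviation is proportional to age.
random.seed(2)
age = [random.uniform(18, 60) for _ in range(261)]
income = [10 + 5 * a + random.gauss(0, 2 * a) for a in age]

gq_ratio, gq_df = goldfeld_quandt(age, income)
print(f"RSS ratio = {gq_ratio:.2f} with ({gq_df}, {gq_df}) degrees of freedom")
```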

Bartlett's test

Bartlett's test can be applied to check for the equality of the variances of the dependent variable across groups defined by an explanatory variable.

The conditional variance of Y given X is the same as the conditional variance of the error term, $\sigma_X^2$:

$$V(Y) = V(\beta_0 + \beta_1 X + \varepsilon_X)$$

But, since X is given, $\varepsilon_X$ is uncorrelated with X, i.e. $\text{cov}(\varepsilon_X, X) = 0$, and the variance of $(\beta_0 + \beta_1 X)$ is zero for any given X, we have:

$$V(Y) = V(\beta_0 + \beta_1 X) + V(\varepsilon_X) = V(\varepsilon_X) = \sigma_X^2$$

Hence, one way of checking for heteroscedasticity is to test for the stability of the conditional variance of Y across the range of X in the sample data. In practical situations, we generally do not have multiple observations of Y for a given X. To apply Bartlett's test, therefore, we first sort the data in ascending order of the explanatory variable suspected to be the cause of the heteroscedastic pattern of the error term, and divide the sample into several groups, say k groups, based on this explanatory variable. We then test the hypothesis of homogeneous variances across the groups.

If $\sigma_i^2$ is the variance of Y in the ith group, the null hypothesis we seek to test can then be stated as:

$$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \dots = \sigma_k^2 = \sigma^2$$

Now let $Y_{ij}$ = jth Y value in the ith group; $n_i$ = the number of observations in the ith group; $f_i = (n_i - 1)$; and $f = \sum f_i$. The test is then performed as follows:


1. Compute the sample variance for each group i:

$$s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2 \quad \text{where} \quad \bar{Y}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}$$

$s_i^2$ is the estimator of $\sigma_i^2$, i = 1, 2, 3, ..., k.

2. Compute the pooled sample variance of all the groups together:

$$s^2 = \frac{\sum_{i=1}^{k} f_i s_i^2}{\sum_{i=1}^{k} f_i} = \frac{\sum_{i=1}^{k} f_i s_i^2}{f}$$

$s^2$ is the estimator of $\sigma^2$ under $H_0$.

3. Under the null hypothesis the ratio A/B has approximately a chi-square distribution with (k-1) degrees of freedom, where

$$A = f \ln(s^2) - \sum_{i=1}^{k} f_i \ln(s_i^2)$$

$$B = 1 + \frac{1}{3(k-1)} \left[ \sum_{i=1}^{k} \frac{1}{f_i} - \frac{1}{f} \right]$$

Note that Bartlett's test is only valid under the assumption that the dependent variable is normally distributed.
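The computation of A/B can be sketched directly from the definitions of $f_i$, $f$, $s_i^2$ and $s^2$ above. The function name `bartlett_statistic`, the choice of four equal-sized groups and the simulated data are assumptions for illustration; the statistic is compared with the chi-square critical value for (k-1) degrees of freedom (7.81 at the 5% level for 3 degrees of freedom).

```python
import math
import random

def bartlett_statistic(groups):
    """Return (A/B, k-1) for a list of groups of Y values."""
    k = len(groups)
    f_i = [len(g) - 1 for g in groups]
    f = sum(f_i)
    s2_i = []
    for g in groups:
        mean = sum(g) / len(g)
        s2_i.append(sum((v - mean) ** 2 for v in g) / (len(g) - 1))   # step 1
    s2 = sum(fi * si for fi, si in zip(f_i, s2_i)) / f                # step 2
    a = f * math.log(s2) - sum(fi * math.log(si) for fi, si in zip(f_i, s2_i))
    b = 1 + (sum(1 / fi for fi in f_i) - 1 / f) / (3 * (k - 1))
    return a / b, k - 1                                               # step 3

# Simulated data (assumption): sort Y by X and split into k = 4 groups.
random.seed(4)
age = [random.uniform(18, 60) for _ in range(260)]
income = [10 + 5 * a + random.gauss(0, 2 * a) for a in age]
sorted_income = [y for _, y in sorted(zip(age, income))]
k = 4
size = len(sorted_income) // k
groups = [sorted_income[i * size:(i + 1) * size] for i in range(k)]

stat, df = bartlett_statistic(groups)
print(f"A/B = {stat:.2f} with {df} degrees of freedom")
```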

4) Transformations towards homoscedasticity

Our tests for heteroscedasticity are tests on the residuals as proxies for the errors. But the properties of the residuals are determined by the model specification. The errors in the true model may be homoscedastic while those in a misspecified model are not. Hence residual heteroscedasticity may be a symptom of model misspecification (either incorrect functional form or omitted variables).


One of the most common reasons for heteroscedasticity is the skewness of the distribution of one or more variables [3] involved in a regression with socioeconomic data. How do we find the appropriate transformation to eliminate heteroscedasticity?

If the functional relationship between the variance of the dependent variable ($\sigma_Y^2$) and the mean is known, a transformation exists which will make the variance approximately constant (Rawlings, 1988:309).

The common functional relationship between the variance of the dependent variable and its conditional mean is:

$$\sigma_Y^2 = A^2 \mu_Y^{2k}$$

or, alternatively:

$$\sigma_Y = A \mu_Y^{k}$$

which can conveniently be re-expressed as a double-log equation as follows:

$$\ln(\sigma_Y) = \ln(A) + k \ln(\mu_Y)$$

where

$\sigma_Y^2$ is the conditional variance of Y

$\mu_Y$ is the conditional mean of Y

A and k are constants

The slope coefficient of the corresponding regression tells us which transformation may be most

appropriate.

If $k = 1$, the log transformation will approximately eliminate heteroscedasticity.

If $k \neq 1$, the power transformation $Y^{1-k}$ will approximately eliminate heteroscedasticity.

However, $\sigma_Y$ and $\mu_Y$ are unknown. What we can do is to substitute the absolute residuals $|e|$ for $\sigma_Y$ and the predicted values $Y_p$ for $\mu_Y$ in the above equation, and use the data to estimate its slope coefficient, k, with least squares. The corresponding fitted regression is given as follows:

$$\ln(|e|) = \beta_0 + \beta_1 \ln(Y_p)$$

[3] We may take the log transformation of both the dependent and independent variables to eliminate heteroscedasticity (practice with INDFOOD).


We may use the t-test of the hypothesis that the slope coefficient is equal to one at the desired level of significance.

If $\beta_1 = 1$, the log transformation will approximately eliminate heteroscedasticity.

If $\beta_1 \neq 1$, the power transformation $Y^{1-k}$, with k estimated by $\beta_1$, will approximately eliminate heteroscedasticity.
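A minimal sketch of this diagnostic regression follows. The data are simulated so that the error standard deviation is proportional to the conditional mean (that is, k = 1 by construction); the function name `ols_with_se` and all numbers are assumptions for illustration. The sketch estimates the slope of ln|e| on ln(Y_p) and the t-statistic for the hypothesis that the slope equals one.

```python
import math
import random

def ols_with_se(x, y):
    """Simple OLS of y on x with intercept: (b0, b1, standard error of b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b0 = my - b1 * mx
    rss = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))
    return b0, b1, math.sqrt(rss / (n - 2) / sxx)

# Simulated data (assumption): error sd = 0.2 * conditional mean, so k = 1.
random.seed(3)
x = [random.uniform(18, 60) for _ in range(1000)]
y = [(10 + 5 * xi) * (1 + 0.2 * random.gauss(0, 1)) for xi in x]

b0, b1, _ = ols_with_se(x, y)
y_pred = [b0 + b1 * xi for xi in x]
abs_e = [abs(yi - pi) for yi, pi in zip(y, y_pred)]

# Logs require positive values, so keep only pairs with Y_p > 0 and |e| > 0.
pairs = [(math.log(p), math.log(e)) for p, e in zip(y_pred, abs_e) if p > 0 and e > 0]
log_pred = [p for p, _ in pairs]
log_abs_e = [e for _, e in pairs]

_, k_hat, se_k = ols_with_se(log_pred, log_abs_e)
t_for_one = (k_hat - 1) / se_k      # t-statistic for H0: slope = 1
print(f"estimated k = {k_hat:.2f}, t for k = 1: {t_for_one:.2f}")
```

If the hypothesis that the slope equals one is not rejected, the log transformation is the candidate; otherwise the estimated slope suggests the exponent of the power transformation.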

5) Weighted least squares

If we believe we are dealing with a case of genuine heteroscedasticity then, in some cases, the method of weighted least squares allows us to derive efficient estimators of a regression model with heteroscedastic errors and to make valid inferences.

If the heteroscedastic model is given as follows:

$$Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i \qquad (5.1)$$

where $V(\varepsilon_i) = \sigma_i^2 = w_i^2 \sigma^2$, we can divide equation (5.1) by $w_i$ so as to get the following model:

$$\frac{Y_i}{w_i} = \beta_1 \frac{1}{w_i} + \beta_2 \frac{X_i}{w_i} + \frac{\varepsilon_i}{w_i}$$

This transformed regression has no constant term. The variance of the error term in the new specification is homoscedastic because

$$V\left(\frac{\varepsilon_i}{w_i}\right) = \frac{1}{w_i^2} V(\varepsilon_i) = \frac{1}{w_i^2} w_i^2 \sigma^2 = \sigma^2$$

This procedure of estimation of the regression coefficients is called the weighted least squares method of estimation. The crucial issue in practice is to find the appropriate weights, $w_i$.
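The transformed regression can be sketched as follows: divide Y, the constant and X by $w_i$, then run least squares without a constant term on the two transformed regressors. The function name `wls_line`, the choice $w_i = X_i$ and the simulated data are assumptions for illustration; in practice the weights have to be justified by the data.

```python
import random

def wls_line(x, y, w):
    """Estimate beta1, beta2 in Y = beta1 + beta2*X + eps, V(eps_i) = w_i^2 * sigma^2,
    by OLS without constant of Y/w on the regressors 1/w and X/w."""
    ys = [yi / wi for yi, wi in zip(y, w)]
    z1 = [1 / wi for wi in w]
    z2 = [xi / wi for xi, wi in zip(x, w)]
    # Normal equations of the two-regressor, no-constant regression.
    s11 = sum(a * a for a in z1)
    s12 = sum(a * b for a, b in zip(z1, z2))
    s22 = sum(b * b for b in z2)
    t1 = sum(a * c for a, c in zip(z1, ys))
    t2 = sum(b * c for b, c in zip(z2, ys))
    det = s11 * s22 - s12 ** 2
    beta1 = (t1 * s22 - s12 * t2) / det
    beta2 = (s11 * t2 - s12 * t1) / det
    return beta1, beta2

# Simulated data (assumption): error sd proportional to X, so w_i = X_i.
random.seed(5)
x = [random.uniform(18, 60) for _ in range(261)]
y = [10 + 5 * xi + random.gauss(0, 2 * xi) for xi in x]

beta1_hat, beta2_hat = wls_line(x, y, w=x)
print(f"WLS estimates: intercept = {beta1_hat:.2f}, slope = {beta2_hat:.2f}")
```

Note that with $w_i = X_i$ the coefficient on $1/w_i$ in the transformed regression is the intercept of the original model, and the coefficient on $X_i/w_i$ (a column of ones) is its slope.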

6) White's heteroscedastic consistent standard errors (HCSEs)

If the weighted least squares method is not possible, we may use the heteroscedastic consistent standard errors. In the two variable case we see that:

$$\text{Var}(b) = \frac{\sigma^2}{\sum (X_i - \bar{X})^2} \qquad (6.1)$$

where the formula is simplified using the assumption that $E(\varepsilon_i^2) = \sigma^2$ for all i. Where this assumption is not valid (i.e. the errors are heteroscedastic) then:

$$\text{Var}(b) = \frac{\sum (X_i - \bar{X})^2 \sigma_i^2}{\left[ \sum (X_i - \bar{X})^2 \right]^2} \qquad (6.2)$$

White (1980) showed that substituting the squared residuals ($e_i^2$) for $\sigma_i^2$ in equation (6.2) yields a consistent estimate of the standard errors. However, unlike weighted least squares, this does not give minimum variance estimators.

Inspection of equation (6.2) shows that if the errors are homoscedastic then the expression simplifies

to that in equation (6.1). That is, the heteroscedastic consistent standard errors and those usually

reported will be the same if there is no heteroscedasticity. A divergence between these two sets of

standard errors is thus a rough test for the presence of heteroscedasticity.
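The two standard errors of the slope can be compared with a short sketch. It applies equation (6.1) with $\sigma^2$ estimated by the residual variance and equation (6.2) with $\sigma_i^2$ replaced by $e_i^2$; the function name `slope_standard_errors` and the simulated data are assumptions for illustration, and no finite-sample correction is applied to the White estimate.

```python
import math
import random

def slope_standard_errors(x, y):
    """Return (slope, usual standard error, White's HCSE) for a simple regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    d = [xi - mx for xi in x]                    # deviations X_i - mean(X)
    sxx = sum(di * di for di in d)
    b1 = sum(di * (yi - my) for di, yi in zip(d, y)) / sxx
    b0 = my - b1 * mx
    e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    s2 = sum(ei * ei for ei in e) / (n - 2)
    se_usual = math.sqrt(s2 / sxx)               # equation (6.1)
    se_white = math.sqrt(sum(di * di * ei * ei for di, ei in zip(d, e)) / sxx ** 2)  # (6.2)
    return b1, se_usual, se_white

# Simulated data (assumption): error sd grows with the square of X.
random.seed(6)
x = [random.uniform(18, 60) for _ in range(261)]
y = [10 + 5 * xi + random.gauss(0, 0.05 * xi ** 2) for xi in x]

slope, se_usual, se_white = slope_standard_errors(x, y)
print(f"slope = {slope:.3f}, usual SE = {se_usual:.3f}, White SE = {se_white:.3f}")
```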

References

Maddala, G.S. (1992), Introduction to Econometrics, New York: Macmillan Publishing Company.

Rawlings, John O. (1988), Applied Regression Analysis: A Research Tool, Pacific Grove, CA: Wadsworth & Brooks/Cole.

Mukherjee, Chandan, Howard White and Marc Wuyts (1998), Econometrics and Data Analysis for Developing Countries, London: Routledge.

White, Halbert (1980), A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity, Econometrica 48: 817-38.


Workshop 8: Heteroscedasticity

1) Use the data set INDIA,

1.1) estimate the regression line between the logarithm of wage income and the age of the worker, compute the residuals, and plot the raw, absolute, and squared residuals against the predicted values of wage income and against the age of workers. What do you conclude about the presence or absence of heteroscedasticity?

1.2) do tests for heteroscedasticity with the model featuring the logarithm of income versus the age of the workers in INDIA.

2) Use the data set SOCECON,

2.1) Regress energy consumption (E) on GNP per capita (Y); energy consumption (E) on the degree of urbanization (U), as measured by the percentage of the population living in urban areas; and life expectancy (L) on GNP per capita (Y).

2.2) For each of these simple regressions between raw data, compare the plots of raw,

absolute, and squared residuals against the predicted values of the dependent variable or

against the regressor. In each case, check which plot is most revealing in terms of

detecting heteroscedasticity.

2.3) Using all four tests, test for the presence of heteroscedasticity.

3) Using the data file TPESANT (farm size and household size in Tanzania) estimate the

regression of landholding size on household size with weighted least squares. Do you think

that the resulting regression satisfies the assumptions of classical linear regression?

4) Use the data in data file INDFOOD to test for heteroscedasticity in the regression of

household food expenditure on total expenditure. Repeat the tests using the log of both

variables. Comment on your findings.

5) Using the data in data file LEACCESS, regress life expectancy on (a) income per capita; (b)

logged income per capita; and (c) logged income per capita and access to health. Test for

heteroscedasticity in each regression equation. Comment on your results.
