
1.

We have a database for a sample of 935 workers, with information on the following
variables: SALARY (monthly salary), EDAT (age), EDUCACIO (years of study),
CASAT (dummy variable that is 1 if the individual is married, 0 otherwise),
PERMANENCIA (years spent in the same job), EXPERIENCIA (years of experience)
and IQ (index score of intelligence).

1.c. The following model has been estimated:

Model (1)

We get the following results:

Table 1.

1. From this model, we then obtained the following result:

Table 2
From these results, what are the properties of the estimators of Table 1, in terms of bias,
consistency and efficiency? Do you think it is necessary to estimate model 1 by
Generalized Least Squares (GLS)? Why? (Note: if you perform a hypothesis test, state the
hypotheses, the test statistic, and the conclusion reached) (1 point)

The table shown is the auxiliary regression for the White test. The null hypothesis is
homoskedasticity, while the alternative is heteroskedasticity. The test statistic is
N·R²_aux = 935*0,024029 = 22,467. Under the null hypothesis, this statistic follows a
chi-square distribution with 5 degrees of freedom, whose critical value at the 5%
significance level (α) is 11,1. Since the test statistic is greater than the tabulated
value, we reject the null hypothesis of homoskedasticity. This implies that we have
heteroskedasticity, so the OLS estimators in Table 1 remain unbiased and consistent
(provided the remaining assumptions hold) but are inefficient, and their usual standard
errors are not valid. Therefore it is necessary to estimate by GLS to correct this
inefficiency.
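
The arithmetic of the White test above can be checked with a short sketch. It uses only the figures quoted in the answer (N = 935, auxiliary R² = 0,024029, 5 degrees of freedom). The helper name chi2_sf_df5 is hypothetical, and the critical value 11.07 is the tabulated chi-square(5) figure that the answer rounds to 11,1.

```python
import math

def chi2_sf_df5(x):
    """Upper-tail probability P(X > x) for a chi-square with 5 degrees of freedom.

    Uses the closed form that exists for odd degrees of freedom (here df = 5).
    """
    tail_normal = math.erfc(math.sqrt(x / 2))
    correction = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3)
    return tail_normal + correction

n_obs = 935            # sample size stated in the exercise
r2_aux = 0.024029      # R-squared of the auxiliary regression quoted in the answer
critical_5pct = 11.07  # tabulated chi-square(5) critical value at the 5% level

white_stat = n_obs * r2_aux        # N * R^2 of the auxiliary regression
p_value = chi2_sf_df5(white_stat)  # probability of a value this large under H0
reject_h0 = white_stat > critical_5pct

print(f"White statistic = {white_stat:.3f}")
print(f"p-value = {p_value:.5f}")
print("Reject homoskedasticity" if reject_h0 else "Do not reject homoskedasticity")
```

Comparing the statistic with the critical value and comparing the p-value with 0,05 are equivalent decisions; the sketch reports both.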

2. Based on a sample of 15.111 households from the 2001 Census, we try to explain the
surface area of the family home (SUPERF_VIV) as a function of the characteristics of the
family. To do this, we estimate the following regression:

2.1 What is the value of the Durbin-Watson statistic? How do you interpret this value for
this particular model? Do you think that with this value of the Durbin-Watson statistic
we should take any action? (1 point)

In this particular model, the Durbin-Watson statistic has no meaningful interpretation,
because the data are cross-sectional and the observations have no natural ordering.
Nonetheless, its value is close to 2, which indicates absence of first-order
autocorrelation, so no action is required.
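
As a reminder of why 2 is the reference value, the Durbin-Watson statistic is the sum of squared first differences of the residuals divided by their sum of squares, which is approximately 2(1 - ρ) for first-order autocorrelation ρ. The sketch below computes it for two hypothetical residual series; neither series comes from the exercise.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences over sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]            # sign flips: negative autocorrelation
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # long runs: positive autocorrelation

print(durbin_watson(alternating))  # above 2
print(durbin_watson(persistent))   # below 2
```

The value depends on the order of the observations, which is why the statistic carries no information when cross-sectional data are sorted arbitrarily.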

2.2 An auxiliary regression was also considered:

where the variable RESID_MCO^2 contains the squared and normalized residuals of the
reference OLS model (i.e. the squared residuals divided by the estimated regression
variance). What is this regression? What are the null and alternative hypotheses of
the related test? What is your conclusion? (1 point)

It is the auxiliary regression of a heteroskedasticity test, in this case the
Breusch-Pagan test. H0: no heteroskedasticity. Ha: heteroskedasticity caused by a linear
combination of the variables ANY_ESUDIOS_PP and NUM_OCUP. The test statistic is ESS/2,
where ESS is the explained sum of squares of the auxiliary regression. In this case:
ESS = TSS - RSS = (12367.59)^2*(15111-1) - 2.30e+12 = 11184537186

Therefore ESS/2=11184537186/2=5592268593

This value should be compared with a χ2 distribution whose degrees of freedom equal the
number of explanatory variables in the auxiliary regression, i.e. χ2(2), whose critical
value at the 5% significance level is approximately 5,99. Since the statistic is far
larger, we reject the null hypothesis and conclude that heteroskedasticity is present.
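
The Breusch-Pagan calculation can be reproduced with the sketch below, which uses only the figures quoted in the answer. Reading 12367.59 as the standard deviation of the auxiliary regression's dependent variable is an assumption inferred from the form of the TSS calculation; the variable names are illustrative.

```python
import math

n_obs = 15111            # number of households
sd_dependent = 12367.59  # assumed: s.d. of the auxiliary dependent variable
rss = 2.30e12            # residual sum of squares, as quoted (rounded) in the answer
alpha = 0.05             # significance level

tss = sd_dependent ** 2 * (n_obs - 1)  # total sum of squares
ess = tss - rss                        # explained sum of squares
bp_stat = ess / 2                      # Breusch-Pagan statistic

# With 2 degrees of freedom the chi-square upper tail is exp(-x/2),
# so the critical value has the closed form -2*ln(alpha).
critical = -2 * math.log(alpha)
reject_h0 = bp_stat > critical

print(f"ESS = {ess:.0f}")
print(f"ESS/2 = {bp_stat:.0f}")
print(f"chi-square(2) critical value = {critical:.2f}")
print("Reject homoskedasticity" if reject_h0 else "Do not reject homoskedasticity")
```

Because RSS is quoted to only three significant digits, the resulting ESS is imprecise, but the statistic exceeds the critical value by so many orders of magnitude that the conclusion does not change.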

2.3 Based on the results of questions 2.1 and 2.2, what would be an appropriate
procedure to estimate our model? What form might the Ω matrix take? (1 point).

Given the results obtained in the previous questions, the most appropriate estimation
method is feasible generalized least squares (FGLS), since OLS gives us inefficient
estimates and biased standard errors of the coefficients β. The Ω matrix can be built
either from a linear combination of the variables used in the auxiliary regression of
the Breusch-Pagan test (although this does not guarantee positive values for the
variance) or by taking the exponential of the fitted values of a regression of the
logarithm of the squared residuals on the exogenous variables:

where g denotes these fitted values (obtained by one of the two methods described
above).
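
The exponential variant of the procedure can be sketched as follows. This is a minimal illustration with one regressor and simulated data, not the model of the exercise: the data, the helper wls_line and all variable names are hypothetical, and the sketch does not reconstruct the formula missing above.

```python
import math
import random

def wls_line(x, y, w=None):
    """Fit y = a + b*x by weighted least squares (OLS when w is None); returns (a, b)."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical data: the error standard deviation grows with x.
rng = random.Random(0)
n = 200
x = [1 + 9 * i / (n - 1) for i in range(n)]
y = [1 + 2 * xi + rng.gauss(0, 0.2 * xi) for xi in x]

# Step 1: OLS and its residuals.
a_ols, b_ols = wls_line(x, y)
resid = [yi - a_ols - b_ols * xi for xi, yi in zip(x, y)]

# Step 2: regress log(e^2) on x; g holds the fitted values.
# The small floor avoids log(0) if a residual happens to be exactly zero.
log_e2 = [math.log(max(e * e, 1e-12)) for e in resid]
c0, c1 = wls_line(x, log_e2)
g = [c0 + c1 * xi for xi in x]

# Step 3: estimated variances h = exp(g), positive by construction.
# These form the diagonal of the estimated Omega matrix.
h = [math.exp(gi) for gi in g]

# Step 4: FGLS is weighted least squares with weights 1/h.
a_fgls, b_fgls = wls_line(x, y, [1 / hi for hi in h])

print(f"OLS slope  = {b_ols:.4f}")
print(f"FGLS slope = {b_fgls:.4f}")
```

Taking the exponential is what guarantees positive variance estimates; the linear alternative mentioned above would need an extra check for non-positive fitted values before being used as weights.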
