396403_108533_3_tm_c_tts190318-699842-8-solution

Solution
1.
The mean total expenditure on food is 18.78, with a standard deviation of 11.37, and the median
is 16.822. Because the mean is greater than the median, the distribution of total expenditure on
food appears to be skewed to the right, which suggests that the assumption of normality is
violated.
The mean total expenditure on transportation is 13.71, with a standard deviation of 13.38, and
the median is 10.54. Because the mean is greater than the median, the distribution of total
expenditure on transportation also appears to be skewed to the right, which suggests that the
assumption of normality is violated.
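As a rough numerical check on the mean-versus-median comparison above, Pearson's second skewness coefficient, 3 * (mean - median) / SD, can be computed from the reported summary statistics alone. The following is a minimal Python sketch; the function name is illustrative and not part of the original analysis, and a positive value only suggests right skew rather than formally testing normality.

```python
def pearson_median_skewness(mean, median, sd):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    return 3.0 * (mean - median) / sd

# Summary statistics as reported in the solution text.
food_skew = pearson_median_skewness(18.78, 16.822, 11.37)
transport_skew = pearson_median_skewness(13.71, 10.54, 13.38)

# Positive values point to a longer right tail.
print(f"Food skewness coefficient:           {food_skew:.3f}")
print(f"Transportation skewness coefficient: {transport_skew:.3f}")
```

Both coefficients come out positive, which is consistent with the right-skew reading given above.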
2.
Model – 1 (Dependent variable – Food)

The regression equation is
Food = 38.515 – 0.211 * Sex – 0.0465 * Age – 1.277 * Education – 0.0259 * Family Size
The coefficient of determination is 0.0474, indicating that 4.74% of the variation in the
dependent variable is explained by the regression model, while the remaining 95.26% is left
unexplained.
The 95% confidence interval for the coefficient on Age is (–0.066, –0.027). Because the interval
excludes zero, the Age coefficient is statistically significant at the 5% level.
Model – 2 (Dependent Variable – Transportation)
The regression equation is
Transportation = 19.103 – 0.525 * Sex – 0.0476 * Age – 0.237 * Education – 0.378 * Family Size
The 95% confidence interval for the coefficient on Age is (–0.071, –0.0241).
3.
[Figure: Food, residuals versus fitted values]
The first test for heteroscedasticity reported by imtest is White's test. The null hypothesis is
that the variance of the residuals is homogeneous. Since the p-value is very small, we reject the
null hypothesis in favor of the alternative that the variance is not homogeneous.
[Figure: Transportation, residuals versus fitted values]
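The logic of a heteroscedasticity test of this kind can be sketched in a few lines. The example below is not White's test as reported by imtest; it is a simpler Koenker (studentized Breusch-Pagan) variant for a single regressor, run on simulated data. All function names, the data-generating process, and the sample size are assumptions made purely for illustration. The statistic is n times the R-squared from regressing the squared residuals on the regressor, compared against a chi-square distribution with one degree of freedom.

```python
import math
import random

def simple_ols(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(x, y):
    """R-squared of the one-regressor OLS fit of y on x."""
    a, b = simple_ols(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def koenker_bp_test(x, y):
    """LM = n * R^2 from regressing squared residuals on x; chi-square(1) p-value."""
    a, b = simple_ols(x, y)
    sq_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    lm = len(x) * r_squared(x, sq_resid)
    # Upper tail of chi-square with 1 df: P(Z^2 > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value

# Simulated (hypothetical) data whose error spread grows with x.
rng = random.Random(0)
x = [rng.uniform(1, 10) for _ in range(500)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 0.3 * xi) for xi in x]

lm_stat, p_val = koenker_bp_test(x, y)
print(f"LM statistic = {lm_stat:.2f}, p-value = {p_val:.4g}")
```

A small p-value leads to rejecting the null hypothesis of homogeneous variance, which is the same decision rule applied to the White's test result above.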
4. Model – 1 (Dependent variable – Food)
The regression equation is
Food = 22.14 – 0.000444 * Total Expense – 0.0357 * Income hours
Model – 2 (Dependent Variable – Transportation)
The regression equation is
Transportation = 11.86 – 0.00063 * Total Expense – 0.0435 * Income hours
5.
[Figure: Food, residuals versus fitted values]
[Figure: Transportation, residuals versus fitted values]
6.
Heteroscedasticity is a violation of the assumption that the variance of the error terms is
constant (“Variances among the groups are equal”); it occurs when the variance of the error terms
differs across observations. Heteroscedasticity has serious consequences for the OLS estimator.
Although the OLS estimator remains unbiased, the conventional estimated standard errors are wrong.
Because of this, confidence intervals and hypothesis tests cannot be relied on. In addition, the
OLS estimator is no longer BLUE. If the form of the heteroscedasticity is known, it can be
corrected (via an appropriate transformation of the data), and the resulting estimator,
generalized least squares (GLS), can be shown to be BLUE.
The effects of heteroscedasticity are:
• OLS is still unbiased and consistent
• The standard errors of the estimates are biased
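To make the point about standard errors concrete, the sketch below computes the slope of a one-regressor OLS fit together with two standard errors: the conventional one, which assumes a constant error variance, and White's heteroscedasticity-consistent (HC0) one. This is an illustrative Python sketch, not part of the original analysis; the function name and the simulated data are assumptions.

```python
import math
import random

def slope_standard_errors(x, y):
    """Return (slope, conventional SE, HC0 robust SE) for a one-regressor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [xi - mx for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - my) for d, yi in zip(dx, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Conventional SE: assumes one common error variance, estimated by SSR / (n - 2).
    conventional = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # HC0 SE: weights each squared residual by its own squared deviation in x.
    robust = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid))) / sxx
    return slope, conventional, robust

# Simulated (hypothetical) data with error spread increasing in x.
rng = random.Random(1)
x = [rng.uniform(1, 10) for _ in range(2000)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 0.05 * xi ** 2) for xi in x]

slope, se_conv, se_hc0 = slope_standard_errors(x, y)
print(f"slope = {slope:.4f}, conventional SE = {se_conv:.4f}, HC0 SE = {se_hc0:.4f}")
```

The slope estimate is the same in both cases; only the estimated standard error changes, which is why heteroscedasticity affects inference rather than unbiasedness.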
7.
Multicollinearity arises when there is a strong linear relationship among the independent
variables included in the model. In cases of perfect multicollinearity, the OLS estimators are not
even defined. Strictly speaking, collinearity refers to a single exact linear relationship among
two or more explanatory variables, whereas multicollinearity refers to more than one such
relationship. In perfect collinearity there is an exact linear relationship between two or more
variables, whereas in imperfect collinearity the relationship is only approximate.
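For the two-regressor case, the strength of an approximate linear relationship can be summarized by the variance inflation factor, VIF = 1 / (1 - r^2), where r is the correlation between the two explanatory variables. The Python sketch below is illustrative only; the function names and the toy vectors are assumptions and do not come from the data set used in this solution.

```python
import math

def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_regressors(a, b):
    """VIF = 1 / (1 - r^2); infinite when the relationship is exact."""
    r = pearson_r(a, b)
    denom = 1.0 - r * r
    if denom <= 1e-12:  # perfect collinearity, up to rounding error
        return math.inf
    return 1.0 / denom

# Hypothetical toy regressors.
exact = vif_two_regressors([1, 2, 3, 4], [2, 4, 6, 8])          # b is exactly 2 * a
approximate = vif_two_regressors([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
unrelated = vif_two_regressors([-1, 0, 1], [1, -2, 1])          # correlation is zero

print(exact, approximate, unrelated)
```

An infinite VIF corresponds to the perfect case in which OLS is not defined, a large finite VIF to the imperfect case, and a VIF of 1 to regressors that are uncorrelated.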
Here, we see that the total expense variable is the main source of heteroscedasticity.
The assumption of homogeneity of variances should also be validated before performing the
regression analysis. It is defined as “Variances among the groups are equal”, and it is violated
when the variance of the error terms differs greatly across observations. Heteroscedasticity has a
serious influence on the least squares estimator.
Question 2
1.
This assumption represents the equality of variances assumption. If observations across
households were potentially correlated, then this assumption would be violated, and we could also
say that multicollinearity is present.
2.
Model – 1 (Dependent variable – Food)
The coefficient of determination is 0.0532, indicating that 5.32% of the variation in the
dependent variable is explained by the regression model, while the remaining 94.68% is left
unexplained.
Here, the dummy variable for the South region appears to be a significant predictor of food
expenditure.
Food expenditure appears to be highest in the South region and lowest in the Northeast region.
3.
The coefficient of determination is 0.053, indicating that 5.3% of the variation in the dependent
variable is explained by the regression model, while the remaining 94.7% is left unexplained.
Here, the dummy variable for the South region appears to be a significant predictor of food
expenditure.
Food expenditure appears to be highest in the South region and lowest in the Midwest region.
4.
Here, we see that as income increases, the share of expenditure spent on food decreases. A good
is said to be a necessity if its share decreases with income, so food is a necessity good.
5.
The regression output is given below
[Table: regression output]
Apart from the variables Sex and Family Size, all other independent variables appear to be
significant predictors of food expenditure.
The Stata code is given below
regress food sex_ref age_ref educ_ref fam_size age2, vce(bootstrap, reps(200) seed(1234)) beta
The regression output is given below
[Table: regression output with bootstrap standard errors]
Apart from the variables Sex and Family Size, all other independent variables appear to be
significant predictors of food expenditure.
Comparing the bootstrap results with those obtained via the delta method, there appears to be no
difference in the regression findings.
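The resampling idea behind the vce(bootstrap, reps(200) seed(1234)) option can be illustrated with a pairs bootstrap of a single OLS slope. The Python sketch below is an assumption-laden illustration, not a reproduction of the Stata results: the data are simulated, the function names are hypothetical, and a Python seed of 1234 will not generate the same resamples as Stata's seed.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope of a one-regressor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx

def bootstrap_slope_se(x, y, reps=200, seed=1234):
    """Pairs bootstrap: resample observations with replacement, refit, take the SD."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    return statistics.stdev(slopes), slopes

# Simulated (hypothetical) data, not the expenditure data used in this solution.
data_rng = random.Random(42)
x = [data_rng.uniform(1, 10) for _ in range(200)]
y = [1.0 + 0.5 * xi + data_rng.gauss(0, 1) for xi in x]

boot_se, boot_slopes = bootstrap_slope_se(x, y)
print(f"slope = {ols_slope(x, y):.4f}, bootstrap SE = {boot_se:.4f}")
```

The bootstrap standard error is the standard deviation of the 200 re-estimated slopes; comparing it with an analytic standard error is the same kind of check described above.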
