1.
The mean expenditure on food is 18.78, with a standard deviation of 11.37, and the median is
16.822. Because the mean exceeds the median, the distribution of food expenditure appears to be
skewed to the right, which suggests that the assumption of normality is violated.
The mean expenditure on transportation is 13.71, with a standard deviation of 13.38, and the
median is 10.54. Because the mean exceeds the median, the distribution of transportation
expenditure also appears to be skewed to the right, which suggests that the assumption of
normality is violated.
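The mean-versus-median comparison above can be put on a common scale with Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. The Python sketch below is only an illustration using the summary statistics reported above; the function name is hypothetical, and the coefficient is a rough indicator of skewness, not a formal normality test.

```python
def pearson_median_skewness(mean, median, std_dev):
    """Pearson's second skewness coefficient: 3 * (mean - median) / std_dev."""
    return 3.0 * (mean - median) / std_dev


# Summary statistics as reported in the text above.
food_skew = pearson_median_skewness(18.78, 16.822, 11.37)
transport_skew = pearson_median_skewness(13.71, 10.54, 13.38)

# A positive value points to a right-skewed distribution.
print(f"food: {food_skew:.3f}")
print(f"transportation: {transport_skew:.3f}")
```

Both coefficients come out positive (roughly 0.52 and 0.71), which agrees with the right-skew reading of the two distributions.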
2.
Food = 38.515 – 0.211 * Sex – 0.0465 * Age – 1.277 * Education – 0.0259 * Family Size
The coefficient of determination is 0.0474, indicating that the regression model explains 4.74%
of the variation in the dependent variable, while the remaining 95.26% is left unexplained.
The 95% confidence interval for the coefficient on Age is (–0.066, –0.027).
Transportation = 19.103 – 0.525 * Sex – 0.0476 * Age – 0.237 * Education – 0.378 * Family Size
The coefficient of determination is 0.0082, indicating that the regression model explains 0.82%
of the variation in the dependent variable, while the remaining 99.18% is left unexplained.
The 95% confidence interval for the coefficient on Age is (–0.071, –0.0241).
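As a quick consistency check, the midpoint of each reported interval should equal the Age coefficient in the corresponding equation, and the half-width divided by the critical value gives the implied standard error. The sketch below assumes a large-sample critical value of 1.96 (the exact t value depends on the sample size, which is not given here); the function and variable names are hypothetical.

```python
Z_95 = 1.96  # assumed large-sample critical value for a 95% interval


def interval_summary(lower, upper, critical=Z_95):
    """Return (midpoint, implied standard error) of a symmetric interval."""
    midpoint = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0
    return midpoint, half_width / critical


# Intervals for the Age coefficient as reported for the two models above.
first_mid, first_se = interval_summary(-0.066, -0.027)
second_mid, second_se = interval_summary(-0.071, -0.0241)

print(f"first model:  midpoint {first_mid:.4f}, implied SE {first_se:.4f}")
print(f"second model: midpoint {second_mid:.4f}, implied SE {second_se:.4f}")
```

The midpoints (−0.0465 and about −0.0476) match the Age coefficients in the two fitted equations, so the intervals are consistent with the reported estimates.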
3.
[Figure: Food, residuals versus fitted values]
The first heteroscedasticity test reported by imtest is White's test. The null hypothesis is that
the variance of the residuals is homogeneous (constant). Since the p-value is very small, we
reject the null hypothesis and conclude that the variance is not homogeneous.
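The mechanics of White's test can be sketched in plain Python for the simplest case of a single regressor: fit the model, regress the squared residuals on the regressor and its square, and compare n·R² from that auxiliary regression with a chi-square distribution. This is an illustration on simulated data, not the output discussed above; all function names are hypothetical, and the closed-form p-value exp(−LM/2) holds only because the auxiliary regression here has exactly two slope terms (two degrees of freedom).

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(columns, y):
    """OLS of y on the given columns plus an intercept: (coefs, residuals, R^2)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(rows[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve_linear(xtx, xty)
    resid = [y[i] - sum(bj * xj for bj, xj in zip(beta, rows[i])) for i in range(n)]
    y_bar = sum(y) / n
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_res = sum(e * e for e in resid)
    return beta, resid, 1.0 - ss_res / ss_tot


def white_test_one_regressor(x, y):
    """White's test for y = b0 + b1*x + e; returns (LM statistic, p-value)."""
    _, resid, _ = ols([x], y)
    squared_resid = [e * e for e in resid]
    x_squared = [xi * xi for xi in x]
    # Auxiliary regression: e^2 on x and x^2 (no cross terms with one regressor).
    _, _, r2_aux = ols([x, x_squared], squared_resid)
    lm = len(y) * r2_aux
    p_value = math.exp(-lm / 2.0)  # chi-square survival function with 2 d.f.
    return lm, p_value


# Simulated data whose error standard deviation grows with x (heteroscedastic).
random.seed(0)
x = [random.uniform(1.0, 10.0) for _ in range(500)]
y = [2.0 + 0.5 * xi + random.gauss(0.0, xi) for xi in x]

lm_stat, p_value = white_test_one_regressor(x, y)
print(f"LM = {lm_stat:.2f}, p-value = {p_value:.3g}")
```

With several regressors the auxiliary regression also includes cross products, and the degrees of freedom grow accordingly, so the p-value would then need a general chi-square routine.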
[Figure: Transportation, residuals versus fitted values]
The first heteroscedasticity test reported by imtest is White's test. The null hypothesis is that
the variance of the residuals is homogeneous (constant). Since the p-value is very small, we
reject the null hypothesis and conclude that the variance is not homogeneous.
The coefficient of determination is 0.1072, indicating that the regression model explains 10.72%
of the variation in the dependent variable, while the remaining 89.28% is left unexplained.
5.
[Figure: Food, residuals versus fitted values]
The first heteroscedasticity test reported by imtest is White's test. The null hypothesis is that
the variance of the residuals is homogeneous (constant). Since the p-value is very small, we
reject the null hypothesis and conclude that the variance is not homogeneous.
[Figure: Transportation, residuals versus fitted values]
The first heteroscedasticity test reported by imtest is White's test. The null hypothesis is that
the variance of the residuals is homogeneous (constant). Since the p-value is very small, we
reject the null hypothesis and conclude that the variance is not homogeneous.
6.
Heteroscedasticity is a violation of the assumption that “Variances among the groups are equal”;
it occurs when the variance of the error terms differs across observations.
Heteroscedasticity has serious consequences for the OLS estimator. Although the OLS estimator
remains unbiased, the estimated standard errors are wrong, so confidence intervals and
hypothesis tests cannot be relied on. In addition, the OLS estimator is no longer BLUE. If the
form of the heteroscedasticity is known, it can be corrected (via an appropriate transformation
of the data), and the resulting estimator, generalized least squares (GLS), can be shown to be
BLUE.
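The transformation behind the GLS correction can be written out for a simple case. The derivation below is a sketch under an assumed variance form, Var(ε_i | x_i) = σ² h_i with h_i known; the symbols y_i, x_i, h_i, β_0 and β_1 are illustrative and do not come from the regressions above.

```latex
\begin{align*}
y_i &= \beta_0 + \beta_1 x_i + \varepsilon_i,
  & \operatorname{Var}(\varepsilon_i \mid x_i) &= \sigma^2 h_i \\
\frac{y_i}{\sqrt{h_i}} &= \beta_0 \frac{1}{\sqrt{h_i}}
  + \beta_1 \frac{x_i}{\sqrt{h_i}} + \frac{\varepsilon_i}{\sqrt{h_i}},
  & \operatorname{Var}\!\left(\frac{\varepsilon_i}{\sqrt{h_i}} \,\middle|\, x_i\right)
  &= \frac{\sigma^2 h_i}{h_i} = \sigma^2
\end{align*}
```

The transformed errors have constant variance, so OLS applied to the transformed data (weighted least squares) satisfies the Gauss-Markov conditions again.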
7.
Multicollinearity is an issue when there is a strong linear relationship among the independent
variables included in the study. In perfect multicollinearity there is an exact linear
relationship between two or more explanatory variables (possibly more than one such
relationship), and the OLS estimators are not even defined. In imperfect collinearity the
relationship is not exact but approximate.
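The difference between perfect and imperfect collinearity can be illustrated with the variance inflation factor, VIF = 1 / (1 − R²), where R² comes from regressing one explanatory variable on the others. The sketch below covers only the two-regressor case, where that R² is the squared correlation; the data and function names are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_regressors(x1, x2, tol=1e-12):
    """VIF when the model has exactly two regressors: 1 / (1 - r^2)."""
    r_squared = pearson_r(x1, x2) ** 2
    if 1.0 - r_squared < tol:
        return math.inf  # perfect collinearity: OLS is not defined
    return 1.0 / (1.0 - r_squared)


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_exact = [3.0, 5.0, 7.0, 9.0, 11.0]   # exactly 2 * x1 + 1
x2_approx = [3.0, 5.0, 8.0, 9.0, 11.0]  # close to, but not exactly, linear in x1

vif_exact = vif_two_regressors(x1, x2_exact)
vif_approx = vif_two_regressors(x1, x2_approx)
print(vif_exact, vif_approx)
```

The exact relationship gives an infinite VIF, while the approximate one gives a large but finite value, which is the sense in which imperfect collinearity inflates the variance of the estimates without making them undefined.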
Here, we see that the total expense variable is the main source of heteroscedasticity.
The assumption of homogeneity of variances should also be checked before performing the
regression analysis. It is defined as “Variances among the groups are equal”, and it is violated
when the variance of the error terms differs widely across observations. Heteroscedasticity has
a serious influence on the least squares estimator.
Question 2
1.
2.
The coefficient of determination is 0.0532, indicating that the regression model explains 5.32%
of the variation in the dependent variable, while the remaining 94.68% is left unexplained.
Here, the South Region dummy variable appears to be a significant predictor of food expenditure.
Food expenditure appears to be highest in the South region and lowest in the Northeast region.
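Region dummy coefficients are read relative to the omitted (reference) region. In a regression that contains only an intercept and the region dummies, each coefficient equals that region's mean minus the reference region's mean, which the sketch below illustrates. The observations, the choice of "West" as the reference category, and all names are hypothetical and are not taken from the data set used here.

```python
from collections import defaultdict

# Hypothetical (region, food expenditure) observations, for illustration only.
observations = [
    ("Northeast", 15.0), ("Northeast", 17.0),
    ("Midwest", 16.0), ("Midwest", 18.0),
    ("South", 20.0), ("South", 22.0),
    ("West", 18.0), ("West", 19.0),
]
CATEGORIES = ["Northeast", "Midwest", "South", "West"]
REFERENCE = "West"  # assumed omitted category


def dummy_row(region, categories=CATEGORIES, reference=REFERENCE):
    """0/1 indicators for every category except the reference one."""
    return [1 if region == c else 0 for c in categories if c != reference]


def group_means(obs):
    """Mean expenditure per region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, value in obs:
        totals[region] += value
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}


means = group_means(observations)
# With only an intercept and the dummies, each dummy coefficient is the
# difference between that region's mean and the reference region's mean.
dummy_coefs = {r: m - means[REFERENCE] for r, m in means.items() if r != REFERENCE}

print(dummy_row("South"))
print(dummy_coefs)
```

When other covariates are included, as in the regressions above, the dummy coefficients are adjusted differences rather than raw differences in means, but the comparison is still against the omitted region.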
3.
The coefficient of determination is 0.053, indicating that the regression model explains 5.3%
of the variation in the dependent variable, while the remaining 94.7% is left unexplained.
Here, the South Region dummy variable appears to be a significant predictor of food expenditure.
Food expenditure appears to be highest in the South region and lowest in the Midwest region.
4.
Here, we see that as income increases, the share of expenditure on food decreases. A good is
said to be a necessity if its share decreases with income, so food is a necessity good.
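The necessity criterion used above can be stated in terms of the income elasticity. In the sketch below, F(m) denotes food expenditure at income m, w(m) the food share, and η the income elasticity of food expenditure; these symbols are introduced here for illustration only.

```latex
\begin{align*}
w(m) &= \frac{F(m)}{m}, \qquad
\eta = \frac{\partial \ln F(m)}{\partial \ln m} \\
\frac{\partial \ln w(m)}{\partial \ln m}
  &= \frac{\partial \ln F(m)}{\partial \ln m} - 1 = \eta - 1 \\
\frac{\partial w(m)}{\partial m} < 0
  &\iff \eta < 1
\end{align*}
```

A falling share therefore does not require food spending itself to fall; it only requires spending on food to rise less than proportionally with income.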
5.
The coefficient of determination is 0.0499, indicating that the regression model explains 4.99%
of the variation in the dependent variable, while the remaining 95.01% is left unexplained.
Apart from Sex and Family Size, all other independent variables appear to be significant
predictors of food expenditure.
regress food sex_ref age_ref educ_ref fam_size age2, vce(bootstrap, reps(200) seed(1234)) beta
The regression output is given below
The coefficient of determination is 0.0499, indicating that the regression model explains 4.99%
of the variation in the dependent variable, while the remaining 95.01% is left unexplained.
Apart from Sex and Family Size, all other independent variables appear to be significant
predictors of food expenditure.
Comparing the bootstrap results with those obtained via the delta method, we see no apparent
difference in the regression findings.
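The idea behind bootstrap standard errors can be sketched in Python for a single-regressor model: resample observations with replacement, re-estimate the slope each time, and take the standard deviation of the replicates. This is an illustration on simulated data, not a reproduction of the Stata command above; the 200 replications and the seed 1234 mirror that command, while the data, the analytic comparison, and all function names are assumptions.

```python
import math
import random
import statistics


def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def analytic_slope_se(x, y):
    """Conventional OLS standard error of the slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b1 = slope(x, y)
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return math.sqrt(ss_res / (n - 2) / sxx)


def bootstrap_slope_se(x, y, reps=200, seed=1234):
    """Pairs bootstrap: resample (x, y) rows with replacement, refit the slope."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(slope([x[i] for i in idx], [y[i] for i in idx]))
    return statistics.stdev(draws), draws


# Simulated data with constant error variance, for illustration only.
data_rng = random.Random(0)
x = [data_rng.uniform(20.0, 80.0) for _ in range(100)]
y = [30.0 - 0.05 * xi + data_rng.gauss(0.0, 5.0) for xi in x]

boot_se, draws = bootstrap_slope_se(x, y)
ols_se = analytic_slope_se(x, y)
print(f"bootstrap SE = {boot_se:.4f}, analytic SE = {ols_se:.4f}")
```

The point estimates are unchanged by bootstrapping; only the standard errors are re-estimated, which is why the coefficient of determination and the coefficients themselves are the same in both sets of output.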