Group Name:
INTERPRETATION: Descriptive statistics of the continuous dependent variable (Y) and the
continuous independent variables (X2, X4 and X10)
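For the dependent variable Y, i.e. expenditure, the mean value is 7.26e+17, the standard
deviation is 7.90e+18, the minimum is 1 and the maximum is 8.62e+19 for 119 observations.
The mean is almost exactly the maximum divided by 119, so a single extreme observation
appears to dominate both the mean and the standard deviation; that value should be checked
as a possible data-entry error.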
For the independent variable X2, i.e. household size, the mean value is 7.184474, the
standard deviation is 2.840286, the minimum is 2 and the maximum is 13 for 119 observations.
For the independent variable X4, i.e. amount of saving, the mean value is 998.4286, the
standard deviation is 220.1074, the minimum is 720 and the maximum is 2000 for 119
observations.
For the independent variable X10, i.e. number of jobs created, the mean value is
12.61345, the standard deviation is 2.404478, the minimum is 7 and the maximum is 22 for 119
observations.
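Statistics of this kind are what Stata's summarize command reports. A minimal sketch, assuming the assignment dataset is already in memory and uses the variable names that appear in the later commands of this document (expenditure, households, saving, createdemploy):

```stata
* Assumption: the assignment data are already loaded; the variable names
* are taken from the pwcorr command used later in this document.
summarize expenditure households saving createdemploy

* The detail option adds percentiles and the smallest and largest values,
* which helps to spot an extreme observation in expenditure.
summarize expenditure, detail
```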
. tab1 sex edustat credituse loan loanterm loandisburse bustraining

Variable        Value    Freq.    Percent      Cum.
sex                 0       66      55.46     55.46
                    1       53      44.54    100.00
edustat             0       96      80.67     80.67
                    1       23      19.33    100.00
credituse           0       64      53.78     53.78
                    1       55      46.22    100.00
loan                0       48      40.34     40.34
                    1       71      59.66    100.00
loanterm            0       40      33.61     33.61
                    1       79      66.39    100.00
loandisburse        0       43      36.13     36.13
                    1       76      63.87    100.00
bustraining         0       34      28.57     28.57
                    1       85      71.43    100.00
Sex: 66 respondents (55.46%) are female and 53 (44.54%) are male.
Education status: 96 respondents (80.67%) are illiterate and 23 (19.33%) are literate.
Use of credit: 64 respondents (53.78%) answered no and 55 (46.22%) answered yes.
Loan: 48 respondents (40.34%) answered no and 71 (59.66%) answered yes.
Loan term: 40 respondents (33.61%) answered no and 79 (66.39%) answered yes.
Loan disbursement: 43 respondents (36.13%) answered no and 76 (63.87%) answered yes.
Business training: 34 respondents (28.57%) answered no and 85 (71.43%) answered yes.
2. Compute pairwise correlation coefficients between Y and the other continuous variables only,
and interpret them
. pwcorr expenditure households saving createdemploy

                expenditure   households       saving createdemp~y
expenditure          1.0000
households          -0.1038       1.0000
saving               0.1485       0.0548       1.0000
createdemp~y        -0.0237      -0.0143      -0.4318       1.0000
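INTERPRETATION: Expenditure is weakly negatively correlated with household size (-0.1038),
weakly positively correlated with saving (0.1485) and almost uncorrelated with the number of
jobs created (-0.0237). Among the independent variables, all the correlations are below 0.8
in absolute value (the largest is -0.4318, between saving and jobs created), so there is no
multicollinearity problem among them.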
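The size of a correlation coefficient does not show whether it differs significantly from zero. As a sketch, assuming the same dataset is in memory, pwcorr can print the significance level and the number of observations under each coefficient; the 0.05 level passed to star() is an illustrative choice, not something specified in the assignment:

```stata
* Assumption: the assignment data are already loaded.
* obs  prints the number of observations used for each pair
* sig  prints the significance level (p-value) under each coefficient
* star(0.05) marks coefficients significant at the 5% level
pwcorr expenditure households saving createdemploy, obs sig star(0.05)
```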
. gen lnexpenditure = log(expenditure)
4. Run a multiple linear regression model using the normalized dependent variable (log Y). Do
you think the model should be interpreted? Explain your reason.
. reg lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy
6. Test the degree of multicollinearity among the explanatory variables in the data
using a formal test and interpret the result. If there is any problem, take the required
remedial measure.
. vif
. pwcorr sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy

                       sex  households     edustat      saving   credituse        loan    loanterm
sex                 1.0000
households          0.0550      1.0000
edustat            -0.0533     -0.0696      1.0000
saving             -0.0591      0.0548      0.5470      1.0000
credituse           0.1867      0.0526      0.3573      0.4363      1.0000
loan               -0.1938     -0.1158      0.2723      0.3320      0.1438      1.0000
loanterm            0.2439      0.0214      0.3032      0.3787      0.3028      0.1764      1.0000
loandisburse        0.2869      0.0121      0.2353      0.4155      0.4166      0.2373      0.4276
bustraining         0.0053      0.0874      0.1682      0.3312      0.2878      0.0488      0.2588
createdemp~y        0.0387     -0.0143     -0.2854     -0.4318     -0.1953     -0.3760     -0.3526

                loandisburse   bustraining  createdemp~y
loandisburse          1.0000
bustraining           0.2213        1.0000
createdemp~y         -0.3114       -0.3585        1.0000
VIF INTERPRETATION: All the VIF values are less than 10; thus there is no serious
multicollinearity problem.
PAIRWISE CORRELATION: All the correlations are less than 0.8 in absolute value (the largest
is 0.5470, between edustat and saving); thus there is no serious multicollinearity problem.
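The VIF of a regressor equals 1/(1 - R-squared), where R-squared comes from regressing that variable on all the other explanatory variables. The sketch below assumes the dataset and the lnexpenditure variable created earlier are in memory; it requests the VIF table and then reproduces one VIF by hand, with saving chosen only as an example:

```stata
* Assumption: the assignment data and lnexpenditure are in memory.
regress lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy
estat vif

* VIF of saving by hand: regress saving on the other regressors,
* then compute 1/(1 - R-squared) from the stored e(r2).
regress saving sex households edustat credituse loan loanterm loandisburse bustraining createdemploy
display "VIF(saving) = " 1/(1 - e(r2))
```

The hand-computed value should match the saving row of the estat vif table.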
7. Test whether there is a heteroskedasticity problem in the data using a formal test and
interpret the result. If there is any problem, take the required remedial measure.
. estat hettest
chi2(1) = 30.11
Prob > chi2 = 0.0000
INTERPRETATION: Since the p-value (0.0000) is below 0.01, the null hypothesis of
homoskedasticity is rejected at the 1% level and the model has a heteroskedasticity problem.
As a remedy, the model is re-estimated with robust standard errors.
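The test above is the Breusch-Pagan / Cook-Weisberg test based on the fitted values. As a sketch under the assumption that the same dataset is in memory, the result can be cross-checked with two other variants before switching to robust standard errors; the robust option used in this document is shorthand for vce(robust):

```stata
* Assumption: the assignment data and lnexpenditure are in memory.
regress lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy

* Breusch-Pagan / Cook-Weisberg test using the fitted values (default)
estat hettest
* Same test using the right-hand-side variables instead of fitted values
estat hettest, rhs
* White's general test for heteroskedasticity
estat imtest, white

* Remedy: keep the OLS coefficients but report robust standard errors
regress lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy, vce(robust)
```

Robust standard errors leave the coefficients unchanged; only the standard errors, t statistics and confidence intervals differ from the ordinary regression.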
. reg lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy, robust
Robust
lnexpendit~e Coef. Std. Err. t P>|t| [95% Conf. Interval]
8. Test for non-normality of the error terms in the data using a graphical test and interpret
the result. What should be done if there is a problem?
[Figure: Kernel density estimate; vertical axis: Density]
INTERPRETATION: The kernel density plot of the regression residuals closely follows the
normal curve. This indicates that the normality assumption is not violated and suggests
that there are no outliers distorting the residual distribution.
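A plot of this kind can be produced from the saved residuals. The sketch below assumes the dataset is in memory; the variable name resid is a hypothetical choice, and the Shapiro-Wilk test at the end is an optional formal check that the assignment does not ask for:

```stata
* Assumption: the assignment data and lnexpenditure are in memory.
regress lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy

* Save the residuals (resid is a hypothetical variable name)
predict resid, residuals

* Kernel density of the residuals with a normal density overlaid
kdensity resid, normal

* Quantile-normal plot: points close to the line suggest normality
qnorm resid

* Optional formal check: Shapiro-Wilk test, H0: residuals are normal
swilk resid
```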
9. Test for model misspecification bias in the data using a formal test and interpret the
result. What should be done if there is a problem?
. linktest
. ovtest
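Both tests are run right after the regression. The sketch below, which assumes the dataset is in memory, adds comments on how each output is usually read; it shows no results because none are reported in this document:

```stata
* Assumption: the assignment data and lnexpenditure are in memory.
regress lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy

* Link test: refits the dependent variable on the prediction (_hat) and
* its square (_hatsq). A statistically significant _hatsq points to a
* misspecified model; an insignificant _hatsq does not.
linktest

* Ramsey RESET test using powers of the fitted values.
* H0: the model has no omitted variables. A small p-value rejects H0.
regress lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy
estat ovtest
```

The regression is repeated before estat ovtest because linktest replaces the stored estimation results with its own auxiliary regression.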
10. Explain the final model result after the required diagnostic tests and corrections, and
interpret the coefficients of the model.
. reg lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy, robust
Robust
lnexpendit~e Coef. Std. Err. t P>|t| [95% Conf. Interval]
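Because the dependent variable is the log of expenditure, each coefficient is read as a proportional change in expenditure. For a dummy variable the exact percentage change is 100*(exp(b) - 1); for a continuous variable 100*b is the approximate percentage change per one-unit increase. A sketch, assuming the dataset is in memory, with sex and saving chosen only as examples:

```stata
* Assumption: the assignment data and lnexpenditure are in memory.
regress lnexpenditure sex households edustat saving credituse loan loanterm loandisburse bustraining createdemploy, vce(robust)

* Dummy regressor: exact percentage difference in expenditure between
* the group coded 1 and the group coded 0, other variables held constant
display "sex: " 100*(exp(_b[sex]) - 1) " percent"

* Continuous regressor: approximate percentage change in expenditure
* for a one-unit increase in saving, other variables held constant
display "saving: " 100*_b[saving] " percent per unit (approximate)"
```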