
Model 1

. regress qdpermonth monthlyincome familysize

       Source |       SS       df       MS              Number of obs =      95
--------------+------------------------------           F(  2,    92) =   32.87
        Model |  15009.9174     2  7504.95871           Prob > F      =  0.0000
     Residual |  21005.2405    92  228.317831           R-squared     =  0.4168
--------------+------------------------------           Adj R-squared =  0.4041
        Total |  36015.1579    94  383.139978           Root MSE      =   15.11

-------------------------------------------------------------------------------
   qdpermonth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
monthlyincome |  -.0004131   .0001768    -2.34   0.022    -.0007641     -.000062
   familysize |   6.830266   .8547493     7.99   0.000      5.13266     8.527872
        _cons |     11.702   4.629899     2.53   0.013     2.506624     20.89738
-------------------------------------------------------------------------------

Model 1 has a higher R-squared and adjusted R-squared than model 2, but model 2's root MSE is closer to 0. (Note, though, that the two root MSEs are not directly comparable, since model 2's dependent variable is in logs.) On this basis we conclude that model 2 is the better fit.

Model 2

. regress lnqd lnmi lnfs

       Source |       SS       df       MS              Number of obs =      95
--------------+------------------------------           F(  2,    92) =   30.81
        Model |   9.7667351     2  4.88336755           Prob > F      =  0.0000
     Residual |  14.5814397    92  .158493909           R-squared     =  0.4011
--------------+------------------------------           Adj R-squared =  0.3881
        Total |  24.3481748    94  .259023136           Root MSE      =  .39811

-------------------------------------------------------------------------------
         lnqd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
         lnmi |   -.106809   .0544301    -1.96   0.053    -.2149119     .0012939
         lnfs |   .8450825   .1089885     7.75   0.000     .6286219     1.061543
        _cons |   3.264146    .504188     6.47   0.000     2.262785     4.265507
-------------------------------------------------------------------------------
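The fit comparison above can be reproduced in outline. Below is a minimal Python/numpy sketch, using simulated stand-in data since the survey dataset itself is not available; the variable names (`income`, `famsize`, `qd`) and the coefficients used to generate the data are illustrative only.

```python
import numpy as np

# Simulated stand-in for the survey data (the actual dataset is not available).
rng = np.random.default_rng(0)
n = 95
income = rng.uniform(2000.0, 30000.0, n)          # monthlyincome
famsize = rng.integers(1, 10, n).astype(float)    # familysize
qd = 11.7 - 0.0004 * income + 6.83 * famsize + rng.normal(0.0, 15.0, n)
qd = np.clip(qd, 1.0, None)                       # keep Qd positive so logs exist

def ols(y, X):
    """OLS by least squares; returns coefficients, R-squared, root MSE."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    root_mse = np.sqrt(ss_res / (len(y) - X.shape[1]))
    return b, r2, root_mse

# Model 1: levels; Model 2: log-log (mirroring the two regress commands above).
X1 = np.column_stack([np.ones(n), income, famsize])
X2 = np.column_stack([np.ones(n), np.log(income), np.log(famsize)])
b1, r2_1, rmse_1 = ols(qd, X1)
b2, r2_2, rmse_2 = ols(np.log(qd), X2)
print(f"Model 1: R2={r2_1:.4f}, Root MSE={rmse_1:.4f}")
print(f"Model 2: R2={r2_2:.4f}, Root MSE={rmse_2:.4f}")
```

Note that the two root MSEs come out on different scales (levels of Qd vs. ln Qd), which is why they cannot be compared directly.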
Model 1

. regress qdpermonth monthlyincome familysize, vce(hc3)

Linear regression                                       Number of obs =      95
                                                        F(  2,    92) =   23.05
                                                        Prob > F      =  0.0000
                                                        R-squared     =  0.4168
                                                        Root MSE      =   15.11

-------------------------------------------------------------------------------
              |              Robust HC3
   qdpermonth |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
monthlyincome |  -.0004131   .0001559    -2.65   0.009    -.0007226    -.0001035
   familysize |   6.830266   1.018124     6.71   0.000     4.808184     8.852348
        _cons |     11.702   4.440801     2.64   0.010     2.882188     20.52182
-------------------------------------------------------------------------------

R-squared is different from 0 since the model's p-value is less than 0.05, so the relationship between the dependent variable and the independent variables is statistically significant. Monthly income and family size are both statistically significant predictors of Qd. The model explains 41.68% of the variance in Qd.

Qd = 11.702 - 0.0004131(monthly income) + 6.830266(family size)

Model 2

. regress lnqd lnmi lnfs, vce(hc3)

Linear regression                                       Number of obs =      95
                                                        F(  2,    92) =   33.03
                                                        Prob > F      =  0.0000
                                                        R-squared     =  0.4011
                                                        Root MSE      =  .39811

-------------------------------------------------------------------------------
              |              Robust HC3
         lnqd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
         lnmi |   -.106809   .0592933    -1.80   0.075    -.2245707     .0109526
         lnfs |   .8450825   .1066632     7.92   0.000     .6332402     1.056925
        _cons |   3.264146   .5687828     5.74   0.000     2.134494     4.393797
-------------------------------------------------------------------------------

R-squared is different from 0 since the model's p-value is less than 0.05, so the relationship between the dependent variable and the independent variables is statistically significant. Family size is a statistically significant predictor of Qd, while monthly income is not. The model explains 40.11% of the variance in ln(Qd).

ln(Qd) = 3.264146 - 0.106809 ln(monthly income) + 0.8450825 ln(family size)
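The HC3 standard errors that `vce(hc3)` reports can be computed directly from the hat-matrix leverages. A minimal numpy sketch follows, again on simulated stand-in data with heteroskedastic errors; all names and numbers are illustrative.

```python
import numpy as np

# Simulated data with heteroskedastic errors (stand-in; real data unavailable).
rng = np.random.default_rng(1)
n = 95
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - x2 + rng.normal(size=n) * (1.0 + np.abs(x1))

X = np.column_stack([np.ones(n), x1, x2])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

# HC3 sandwich: (X'X)^-1 X' diag(e_i^2 / (1-h_i)^2) X (X'X)^-1
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # hat-matrix diagonal (leverages)
omega = e**2 / (1.0 - h) ** 2
cov_hc3 = XtX_inv @ (X.T * omega) @ X @ XtX_inv
se_hc3 = np.sqrt(np.diag(cov_hc3))

# Classical (non-robust) standard errors for comparison
sigma2 = (e @ e) / (n - X.shape[1])
se_ols = np.sqrt(np.diag(sigma2 * XtX_inv))
print("HC3 SEs:", se_hc3)
print("OLS SEs:", se_ols)
```

HC3 inflates each squared residual by 1/(1-h_i)^2, so high-leverage observations contribute more to the variance estimate; this is why the robust standard errors above differ from the classical ones even though the coefficients are identical.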
[Figure: Model 1 — observed qdpermonth vs. fitted values]

[Figure: Model 2 — lnqd vs. fitted values]

In these scatterplots, the points for model 2 cluster more tightly around the fitted line, so its trend is clearly identifiable; the points for model 1 are more scattered.
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of lnqd

         chi2(1)      =     0.94
         Prob > chi2  =   0.3329

[Figure: residuals vs. fitted values]

The null hypothesis is that the residuals are homoskedastic. Here we fail to reject the null at the 95% confidence level and conclude that the residuals are homoskedastic.
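Stata's `estat hettest` is a score test of constant variance; the closely related Breusch-Pagan LM form (n times the R-squared of an auxiliary regression of squared residuals on the fitted values) can be sketched with numpy alone. The data below are simulated homoskedastic stand-ins, so the test should usually fail to reject.

```python
import math
import numpy as np

# Simulated homoskedastic data (stand-in; real data unavailable).
rng = np.random.default_rng(2)
n = 95
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ b
e = y - fitted

# Auxiliary regression: squared residuals on the fitted values
Z = np.column_stack([np.ones(n), fitted])
g, *_ = np.linalg.lstsq(Z, e**2, rcond=None)
u = e**2 - Z @ g
ss_tot = ((e**2 - (e**2).mean()) ** 2).sum()
r2_aux = 1.0 - (u @ u) / ss_tot

lm = n * r2_aux                            # LM statistic, chi2 with 1 df
p_value = math.erfc(math.sqrt(lm / 2.0))   # chi2(1) survival function
print(f"chi2(1) = {lm:.2f}, Prob > chi2 = {p_value:.4f}")
```

The chi2(1) p-value uses the identity P(chi2_1 > x) = erfc(sqrt(x/2)), so no external stats library is needed.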

. estat ovtest

Ramsey RESET test using powers of the fitted values of lnqd
       Ho: model has no omitted variables
                F(3, 89) =      0.12
                Prob > F =      0.9458

The null hypothesis is that the model has no omitted-variable bias. The p-value is higher than the usual 0.05 threshold, so we fail to reject the null and conclude that we do not need additional variables.
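The RESET test above is an F-test on powers of the fitted values added to the original regression. A minimal numpy sketch on simulated stand-in data follows; it computes the F statistic only (a p-value would need an F distribution, e.g. from scipy if available). Note the degrees of freedom F(3, 89) match the output above for n = 95 and 6 unrestricted parameters.

```python
import numpy as np

# Simulated data (stand-in; real data unavailable).
rng = np.random.default_rng(3)
n = 95
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.5 * x2 + rng.normal(size=n)

def rss(y, X):
    """Residual sum of squares of the OLS fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

# Restricted model: original regressors only
X_r = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X_r, y, rcond=None)
yhat = X_r @ b

# Unrestricted model: add powers 2..4 of the fitted values, as estat ovtest does
X_u = np.column_stack([X_r, yhat**2, yhat**3, yhat**4])
rss_r, rss_u = rss(y, X_r), rss(y, X_u)

q = 3                     # number of added terms
df_u = n - X_u.shape[1]   # residual df of the unrestricted model: 95 - 6 = 89
F = ((rss_r - rss_u) / q) / (rss_u / df_u)
print(f"F({q}, {df_u}) = {F:.2f}")
```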
. linktest

       Source |       SS       df       MS              Number of obs =      95
--------------+------------------------------           F(  2,    92) =   30.82
        Model |  9.76765036     2  4.88382518           Prob > F      =  0.0000
     Residual |  14.5805244    92  .158483961           R-squared     =  0.4012
--------------+------------------------------           Adj R-squared =  0.3881
        Total |  24.3481748    94  .259023136           Root MSE      =   .3981

-------------------------------------------------------------------------------
         lnqd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
         _hat |   .8383094   2.131999     0.39   0.695    -3.396025     5.072644
       _hatsq |   .0227624   .2996016     0.08   0.940    -.5722723     .6177972
        _cons |    .284781   3.776167     0.08   0.940    -7.215013     7.784575
-------------------------------------------------------------------------------

The p-value for _hatsq is higher than the usual 0.05 threshold, so we fail to reject the null and conclude that our model is correctly specified.
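Stata's `linktest` refits the dependent variable on the linear prediction (_hat) and its square (_hatsq); a significant _hatsq coefficient signals a specification error. The idea can be sketched with numpy on simulated stand-in data:

```python
import numpy as np

# Simulated, correctly specified data (stand-in; real data unavailable).
rng = np.random.default_rng(4)
n = 95
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
hat = X @ b                                   # _hat: the linear prediction

# linktest regression: y on _hat and _hat^2
Z = np.column_stack([np.ones(n), hat, hat**2])
ZtZ_inv = np.linalg.inv(Z.T @ Z)
g = ZtZ_inv @ Z.T @ y
resid = y - Z @ g
sigma2 = (resid @ resid) / (n - Z.shape[1])
se = np.sqrt(np.diag(sigma2 * ZtZ_inv))
t_hatsq = g[2] / se[2]                        # t statistic for _hatsq
print(f"_hatsq coef = {g[2]:.4f}, t = {t_hatsq:.2f}")
```

With a correctly specified model, as here, the _hatsq term should usually be insignificant, matching the conclusion drawn from the Stata output above.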

. estat vif

    Variable |      VIF       1/VIF
-------------+----------------------
        lnfs |      1.01    0.990810
        lnmi |      1.01    0.990810
-------------+----------------------
    Mean VIF |      1.01

The VIFs are all below 10, indicating the absence of multicollinearity.
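Each VIF is 1/(1 - R2_j), where R2_j comes from regressing regressor j on the other regressors. A numpy sketch on simulated, nearly uncorrelated stand-in regressors (mimicking the low VIFs above):

```python
import numpy as np

# Simulated regressors (stand-in; real data unavailable).
rng = np.random.default_rng(5)
n = 95
lnmi = rng.normal(size=n)
lnfs = 0.1 * lnmi + rng.normal(size=n)   # nearly uncorrelated with lnmi

def vif(target, others):
    """VIF_j = 1 / (1 - R2_j), from regressing x_j on the other regressors."""
    X = np.column_stack([np.ones(len(target))] + others)
    b, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ b
    r2 = 1.0 - (resid @ resid) / (((target - target.mean()) ** 2).sum())
    return 1.0 / (1.0 - r2)

vif_lnmi = vif(lnmi, [lnfs])
vif_lnfs = vif(lnfs, [lnmi])
print(f"lnmi VIF = {vif_lnmi:.2f}")
print(f"lnfs VIF = {vif_lnfs:.2f}")
```

With two regressors the VIFs are identical (as in the Stata table), since each auxiliary regression has the same R-squared.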


X1
[Figure: added-variable plot, e(lnqd | X) vs. e(lnmi | X); coef = -.10680902, se = .05443011, t = -1.96]

X2
[Figure: added-variable plot, e(lnqd | X) vs. e(lnfs | X); coef = .84508248, se = .10898852, t = 7.75]

All data points are in range. These added-variable plots were used to check for outliers in Qd with respect to each regressor (X1 and X2), and no outliers were found.
[Figure: kernel density estimate of the residuals vs. the normal density; kernel = epanechnikov, bandwidth = 0.1337]

The distribution of the residuals here is close to a normal distribution.
[Figure: standardized normal probability (P-P) plot of the residuals; Empirical P[i] = i/(N+1)]

In this figure, some of the points lie off the reference line, but the overall trend follows it.
[Figure: quantiles of the residuals vs. the inverse normal (Q-Q plot)]

In this figure, most of the points lie on the reference line, but both tails deviate from it.
. swilk e

                   Shapiro-Wilk W test for normal data

    Variable |    Obs         W          V         z       Prob>z
-------------+----------------------------------------------------
           e |     95    0.94521      4.335     3.244     0.00059

The null hypothesis is that the distribution of the residuals is normal. Here the p-value is 0.00059, so we reject the null at the 95% confidence level.
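The Shapiro-Wilk statistic itself is awkward to compute by hand, but the Jarque-Bera test, a simpler skewness/kurtosis-based alternative normality check, can be sketched with numpy alone. The residuals below are simulated stand-ins drawn from a normal distribution, so this test should usually fail to reject (unlike the real residuals above).

```python
import math
import numpy as np

# Simulated residuals (stand-in; the real residuals are unavailable).
rng = np.random.default_rng(6)
e = rng.normal(size=95)

# Jarque-Bera: JB = n/6 * (S^2 + (K-3)^2/4), asymptotically chi2 with 2 df
n = len(e)
z = (e - e.mean()) / e.std()
S = (z**3).mean()                 # sample skewness
K = (z**4).mean()                 # sample kurtosis
jb = n / 6.0 * (S**2 + (K - 3.0) ** 2 / 4.0)
p_value = math.exp(-jb / 2.0)     # chi2(2) survival function is exp(-x/2)
print(f"JB = {jb:.3f}, Prob > chi2 = {p_value:.5f}")
```

A small p-value here would, like the Shapiro-Wilk result above, lead to rejecting the null of normal residuals.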
