
ECN 4231: INTRODUCTION TO ECONOMETRICS

Dummy Variable Lab

Refer to the data set below, which reports information on homes sold in the PJ area during
the last year. Use the selling price of the home as the dependent variable and determine the
multiple regression equation with all the independent variables.

Price (RM) bed sqrft pool distance garage bath


263115 4 2349 0 17 1 2
182385 4 2102 1 19 0 2
242055 3 2271 1 12 0 2
21357 2 2188 1 16 0 2.5
13986 2 2148 1 28 0 1.5
24543 2 2117 0 12 1 2
32724 6 2484 1 15 1 2
271755 2 2130 1 9 1 2.5
22113 3 2254 0 18 0 1.5
266625 4 2385 1 13 1 2
29241 4 2108 1 14 1 2
20898 2 1715 1 8 1 1.5
27081 6 2495 1 7 1 2
246105 4 2073 1 18 1 2
28134 3 2119 1 16 1 2
172665 4 2189 0 16 0 2
207495 5 2316 0 21 0 5
198855 3 2220 0 10 1 2
20925 6 1901 0 15 1 2
252315 4 2624 1 8 1 2
192915 4 1938 0 14 1 2.5
20925 5 2101 1 20 0 1.5
34533 8 2644 1 9 1 2.3
326295 6 2141 1 11 1 1.5
17307 2 2198 0 21 1 2.2
186175 2 1912 1 26 0 1.5
257175 2 2117 1 9 1 2.2
23301 3 2162 1 14 1 2
18036 2 2041 1 11 0 2
233955 2 1712 1 19 1 2
20709 2 1974 1 11 1 2
247725 5 2438 1 16 1 2
166185 3 2019 0 16 1 2
17712 2 1919 1 10 1 2

Price = selling price (RM)
Bed = number of bedrooms
sqrft = size of the home in square feet
pool = swimming pool (1 = yes; 0 = no)
distance = distance from the center of the city
garage = garage attached (1 = yes; 0 = no)
bath = number of bathrooms

a. Write out the regression equation. Discuss each of the variables.


Answer

Variable Coefficient Std. Error t-Statistic Prob.

C -92946.25 270086.0 -0.344136 0.7334


BED -1519.124 15969.95 -0.095124 0.9249
BATH 32937.00 37713.50 0.873348 0.3902
SQRFT 51.59078 116.2356 0.443847 0.6607
DISTANCE 955.2982 4982.970 0.191713 0.8494
DUMMY_GARAGE 34861.33 53438.34 0.652366 0.5197
DUMMY_POOL 9734.754 47360.85 0.205544 0.8387

Price = -92946.25 - 1519.124 Bed + 32937 Bath + 51.59078 Sqrft + 955.2982 Distance
        + 34861.33 Garage + 9734.754 Pool

• Intercept (β0 = -92946.25): the predicted selling price is -RM92946.25 when
all independent variables equal zero. No home in the data has zero bedrooms,
bathrooms and square feet, so the intercept has no practical interpretation.
• Bed (β1 = -1519.124): the relationship between the number of bedrooms and
selling price is negative. Each additional bedroom lowers the predicted
selling price by RM1519.124, holding the other variables constant.
• Bathrooms (β2 = 32937): the relationship between the number of bathrooms
and selling price is positive. Each additional bathroom raises the predicted
selling price by RM32937, holding the other variables constant.
• Sqrft (β3 = 51.59078): the relationship between the size of the home and
selling price is positive. Each additional square foot raises the predicted
selling price by RM51.59078, holding the other variables constant.
• Distance (β4 = 955.2982): the relationship between distance and selling
price is positive. Each additional unit of distance from the city center raises
the predicted selling price by RM955.2982, holding the other variables constant.
• Garage (β5 = 34861.33): garage is a dummy variable, and its relationship
with selling price is positive. A home with an attached garage is predicted to
sell for RM34861.33 more than a home without one, holding the other variables
constant.
• Pool (β6 = 9734.754): pool is a dummy variable, and its relationship with
selling price is positive. A home with a swimming pool is predicted to sell for
RM9734.754 more than a home without one, holding the other variables
constant.
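To illustrate how a dummy coefficient shifts the fitted price, the sketch below plugs values into the estimated equation from part (a). The function name `predict_price` and the dictionary layout are illustrative choices, not part of the lab; the example home uses the characteristics in the first row of the data set.

```python
# Coefficients copied from the regression output in part (a).
COEF = {
    "const": -92946.25,
    "bed": -1519.124,
    "bath": 32937.00,
    "sqrft": 51.59078,
    "distance": 955.2982,
    "garage": 34861.33,
    "pool": 9734.754,
}


def predict_price(bed, bath, sqrft, distance, garage, pool):
    """Fitted selling price (RM) from the estimated equation (hypothetical helper)."""
    return (
        COEF["const"]
        + COEF["bed"] * bed
        + COEF["bath"] * bath
        + COEF["sqrft"] * sqrft
        + COEF["distance"] * distance
        + COEF["garage"] * garage
        + COEF["pool"] * pool
    )


# Same home, with and without an attached garage (garage = 1 vs garage = 0).
with_garage = predict_price(bed=4, bath=2, sqrft=2349, distance=17, garage=1, pool=0)
without_garage = predict_price(bed=4, bath=2, sqrft=2349, distance=17, garage=0, pool=0)

# The gap between the two fitted prices equals the garage coefficient.
garage_effect = with_garage - without_garage
print(round(with_garage, 2), round(garage_effect, 2))
```

Because the dummy enters the equation additively, switching it from 0 to 1 moves the fitted price by exactly its coefficient, whatever the values of the other variables.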

b. Determine the R² and interpret.


Answer

R-squared 0.054135
Adjusted R-squared -0.156057

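The R² of 0.054 means that the independent variables jointly explain only about 5.4%
of the variation in selling price, so the model fits the data poorly. The negative
adjusted R² (-0.156) shows that, once the number of regressors is penalised, the
model does no better than using the mean selling price alone.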
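The adjusted R² can be checked by hand from the reported R². The sketch below assumes the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), with n = 34 observations and k = 6 regressors; the function name `adjusted_r2` is an illustrative choice.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# R-squared from the output above; 34 homes and 6 independent variables.
adj = adjusted_r2(r2=0.054135, n=34, k=6)
print(round(adj, 6))
```

The result agrees with the reported adjusted R-squared up to rounding of the inputs.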

c. Develop a correlation matrix. Which independent variables have strong or
weak correlations with the dependent variable? Do you see any problems with
multicollinearity?
Answer

PRICE__RM_ BED BATH SQRFT DISTANCE DUMMY_GA... DUMMY_POOL


PRICE... 1.000000 0.077284 0.169117 0.123860 -0.050788 0.107625 -0.011537
BED 0.077284 1.000000 0.143388 0.565693 -0.140474 0.148798 -0.017075
BATH 0.169117 0.143388 1.000000 0.197285 0.035149 -0.095597 -0.286131
SQRFT 0.123860 0.565693 0.197285 1.000000 -0.188876 0.028724 0.034954
DISTA... -0.050788 -0.140474 0.035149 -0.188876 1.000000 -0.523942 -0.172349
DUMM... 0.107625 0.148798 -0.095597 0.028724 -0.523942 1.000000 0.008333
DUMM... -0.011537 -0.017075 -0.286131 0.034954 -0.172349 0.008333 1.000000

Every independent variable is only weakly correlated with selling price. Bed, bath,
sqrft and garage have weak positive correlations (the largest is bath at 0.169), while
distance (-0.051) and pool (-0.012) have weak negative correlations. As a rule of thumb,
multicollinearity is a concern when the correlation between two independent variables
is more than 0.70 or less than -0.70. Here every pairwise correlation between
independent variables lies between -0.70 and 0.70 (the largest in magnitude are 0.566
for bed and sqrft and -0.524 for distance and garage), so there is no evident problem
of multicollinearity.
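The 0.70 rule of thumb can be applied mechanically to the matrix. The sketch below copies the correlations between independent variables from the table; the truncated labels are read as DISTANCE, DUMMY_GARAGE and DUMMY_POOL by matching rows to the column order, which is an assumption, and the names `CORR`, `flagged` and `strongest` are illustrative.

```python
# Pairwise correlations between independent variables, copied from the matrix.
# Truncated labels in the table are assumed to follow the column order.
CORR = {
    ("BED", "BATH"): 0.143388,
    ("BED", "SQRFT"): 0.565693,
    ("BED", "DISTANCE"): -0.140474,
    ("BED", "DUMMY_GARAGE"): 0.148798,
    ("BED", "DUMMY_POOL"): -0.017075,
    ("BATH", "SQRFT"): 0.197285,
    ("BATH", "DISTANCE"): 0.035149,
    ("BATH", "DUMMY_GARAGE"): -0.095597,
    ("BATH", "DUMMY_POOL"): -0.286131,
    ("SQRFT", "DISTANCE"): -0.188876,
    ("SQRFT", "DUMMY_GARAGE"): 0.028724,
    ("SQRFT", "DUMMY_POOL"): 0.034954,
    ("DISTANCE", "DUMMY_GARAGE"): -0.523942,
    ("DISTANCE", "DUMMY_POOL"): -0.172349,
    ("DUMMY_GARAGE", "DUMMY_POOL"): 0.008333,
}

THRESHOLD = 0.70  # rule-of-thumb cut-off used in the answer

# Pairs whose absolute correlation exceeds the threshold.
flagged = {pair: r for pair, r in CORR.items() if abs(r) > THRESHOLD}

# The most strongly correlated pair, for reference.
strongest = max(CORR, key=lambda pair: abs(CORR[pair]))

print(flagged, strongest, CORR[strongest])
```

Pairwise correlations only detect collinearity between two variables at a time, so an empty `flagged` result supports the answer but does not rule out every form of multicollinearity.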

d. Conduct a test of hypothesis on each of the independent variables. Would you
consider deleting any of the variables (if the variable is not significant)? If so,
which ones?
Answer

Individual Test
Hypotheses:
a. H0 : β1 = 0
H1 : β1 ≠ 0
b. H0 : β2 = 0
H1 : β2 ≠ 0
c. H0 : β3 = 0
H1 : β3 ≠ 0

d. H0 : β4 = 0
H1 : β4 ≠ 0
e. H0 : β5 = 0
H1 : β5 ≠ 0
f. H0 : β6 = 0
H1 : β6 ≠ 0

Level of significance : α = 0.05

Critical value = t0.025(27) = 2.05 (two-tailed test, df = 34 - 7 = 27)
Reject H0 if |t| is more than 2.05 or prob is less than α

Test statistic:
Variable Coefficient Std. Error t-Statistic Prob.

C -92946.25 270086.0 -0.344136 0.7334


BED -1519.124 15969.95 -0.095124 0.9249
BATH 32937.00 37713.50 0.873348 0.3902
SQRFT 51.59078 116.2356 0.443847 0.6607
DISTANCE 955.2982 4982.970 0.191713 0.8494
DUMMY_GARAGE 34861.33 53438.34 0.652366 0.5197
DUMMY_POOL 9734.754 47360.85 0.205544 0.8387

Decision:
For every variable we fail to reject H0, because each prob is more than 0.05 and each
|t| is less than 2.05. No independent variable has a statistically significant effect
on selling price. The variables can be deleted one at a time, starting with the one that
has the smallest |t| and the largest prob. Bed has the smallest |t| (0.095) and the
largest prob (0.9249), so bed is dropped from the analysis first.
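Each t-statistic in the table is the coefficient divided by its standard error. The sketch below recomputes them and applies the 2.05 critical value from the decision rule; the names `ESTIMATES`, `t_stats` and `first_to_drop` are illustrative.

```python
T_CRIT = 2.05  # two-tailed critical value used in the decision rule

# (coefficient, standard error) pairs copied from the regression output.
ESTIMATES = {
    "BED": (-1519.124, 15969.95),
    "BATH": (32937.00, 37713.50),
    "SQRFT": (51.59078, 116.2356),
    "DISTANCE": (955.2982, 4982.970),
    "DUMMY_GARAGE": (34861.33, 53438.34),
    "DUMMY_POOL": (9734.754, 47360.85),
}

# t = coefficient / standard error, testing H0: beta = 0.
t_stats = {name: coef / se for name, (coef, se) in ESTIMATES.items()}

# Variables whose |t| exceeds the critical value (reject H0).
significant = [name for name, t in t_stats.items() if abs(t) > T_CRIT]

# Candidate for deletion: the variable with the smallest |t|.
first_to_drop = min(t_stats, key=lambda name: abs(t_stats[name]))

for name, t in t_stats.items():
    print(f"{name:13s} t = {t: .6f}")
print("significant:", significant, "| drop first:", first_to_drop)
```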

e. Re-run the analysis until only significant net regression coefficients remain in
the analysis. Identify these variables.
Answer

1. Excluding bed


Variable Coefficient Std. Error t-Statistic Prob.

C -83962.54 248517.9 -0.337853 0.7380


BATH 32769.07 36999.52 0.885662 0.3833
SQRFT 45.48106 95.14828 0.478002 0.6364
DISTANCE 933.7984 4888.962 0.191001 0.8499
DUMMY_GARAGE 34029.27 51776.31 0.657236 0.5164
DUMMY_POOL 9831.236 46504.55 0.211404 0.8341

For every variable we fail to reject H0, because each prob is more than 0.05 and
each |t| is less than 2.05, so no independent variable significantly affects
selling price. Distance has the smallest |t| (0.191) and the largest prob
(0.8499), so distance is dropped from the analysis next.

2. Excluding bed and distance
Variable Coefficient Std. Error t-Statistic Prob.

C -57330.47 202270.2 -0.283435 0.7789


BATH 32513.33 36355.85 0.894308 0.3785
SQRFT 42.04231 91.86423 0.457657 0.6506
DUMMY_GARAGE 28737.40 43007.02 0.668203 0.5093
DUMMY_POOL 8089.369 44837.64 0.180415 0.8581

For every variable we fail to reject H0, because each prob is more than 0.05 and
each |t| is less than 2.05, so no independent variable significantly affects
selling price. Dummy_pool has the smallest |t| (0.180) and the largest prob
(0.8581), so dummy_pool is dropped from the analysis next.
3. Excluding bed, distance and dummy_pool
Variable Coefficient Std. Error t-Statistic Prob.

C -50948.50 195915.6 -0.260053 0.7966


BATH 30545.98 34118.15 0.895300 0.3778
SQRFT 43.67338 89.93216 0.485626 0.6308
DUMMY_GARAGE 28544.72 42294.83 0.674898 0.5049

For every variable we fail to reject H0, because each prob is more than 0.05 and
each |t| is less than 2.05, so no independent variable significantly affects
selling price. Sqrft (size) has the smallest |t| (0.486) and the largest prob
(0.6308), so sqrft is dropped from the analysis next.
4. Excluding bed, distance, dummy_pool and sqrft (size)
Variable Coefficient Std. Error t-Statistic Prob.

C 35883.99 79079.12 0.453773 0.6532


BATH 33876.84 33007.11 1.026350 0.3127
DUMMY_GARAGE 29546.25 41720.60 0.708193 0.4841

For every variable we fail to reject H0, because each prob is more than 0.05 and
each |t| is less than 2.05, so no independent variable significantly affects
selling price. Dummy_garage has the smallest |t| (0.708) and the largest prob
(0.4841), so dummy_garage is dropped from the analysis next.
5. Excluding bed, distance, dummy_pool, sqrft (size) and dummy_garage
Variable Coefficient Std. Error t-Statistic Prob.

C 61353.99 69876.90 0.878030 0.3865


BATH 31642.23 32599.04 0.970649 0.3390

We again fail to reject H0: with bath as the only regressor, prob (0.3390) is
more than 0.05 and |t| (0.971) is less than 2.05. Bath is not significant either,
so no significant net regression coefficients remain in the analysis.
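The elimination order in part (e) follows one rule: at each step, drop the variable with the largest prob, provided it exceeds α. The sketch below applies that rule to the prob values reported in each table; it does not re-estimate any regression, and the names `STEPS` and `next_to_drop` are illustrative.

```python
ALPHA = 0.05

# Prob values copied from the full model in part (d) and the five re-runs in part (e).
STEPS = [
    {"BED": 0.9249, "BATH": 0.3902, "SQRFT": 0.6607, "DISTANCE": 0.8494,
     "DUMMY_GARAGE": 0.5197, "DUMMY_POOL": 0.8387},
    {"BATH": 0.3833, "SQRFT": 0.6364, "DISTANCE": 0.8499,
     "DUMMY_GARAGE": 0.5164, "DUMMY_POOL": 0.8341},
    {"BATH": 0.3785, "SQRFT": 0.6506, "DUMMY_GARAGE": 0.5093, "DUMMY_POOL": 0.8581},
    {"BATH": 0.3778, "SQRFT": 0.6308, "DUMMY_GARAGE": 0.5049},
    {"BATH": 0.3127, "DUMMY_GARAGE": 0.4841},
    {"BATH": 0.3390},
]


def next_to_drop(p_values, alpha=ALPHA):
    """Return the least significant variable, or None if all are significant."""
    name = max(p_values, key=p_values.get)
    return name if p_values[name] > alpha else None


dropped = [next_to_drop(step) for step in STEPS]
print(dropped)
```

Every step returns a variable rather than `None`, which is another way of seeing that no regressor reaches the 5% significance level at any stage.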

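f. Do homes that have an attached garage sell for a higher price? How do you know?
Answer

The estimated coefficient on the garage dummy is positive (34861.33), so in this sample
a home with an attached garage is predicted to sell for RM34861.33 more than a
comparable home without one. However, the coefficient is not statistically significant
(prob = 0.5197), so the data do not give reliable evidence of a higher price.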

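g. Do homes that have a swimming pool sell for a higher price? How do you know?
Answer

The estimated coefficient on the pool dummy is positive (9734.754), so in this sample
a home with a swimming pool is predicted to sell for RM9734.754 more than a
comparable home without one. However, the coefficient is not statistically significant
(prob = 0.8387), so the data do not give reliable evidence of a higher price.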

h. Test whether having an attached garage and a swimming pool are
statistically significant determinants of selling price.
Answer

Hypotheses:
H0 : β5 = β6 = 0
H1 : Not all the βi are 0; i = 5,6
Level of significance : α = 0.05
Critical value = F0.05(2,31) ≈ 3.30 (2 regressors, 34 - 3 = 31 residual degrees of freedom)
Reject H0 if F value is more than 3.30 or prob is less than α
Test statistic:
Dependent Variable: PRICE__RM_
Method: Least Squares
Date: 01/25/22 Time: 21:51
Sample: 1 34
Included observations: 34

Variable Coefficient Std. Error t-Statistic Prob.

C 110777.7 46177.41 2.398958 0.0226
DUMMY_GARAGE 25477.31 42227.39 0.603336 0.5507
DUMMY_POOL -2940.694 42227.39 -0.069639 0.9449

R-squared 0.011738 Mean dependent var 126685.9
Adjusted R-squared -0.052021 S.D. dependent var 109378.8
S.E. of regression 112187.7 Akaike info criterion 26.17783
Sum squared resid 3.90E+11 Schwarz criterion 26.31251
Log likelihood -442.0231 Hannan-Quinn criter. 26.22376
F-statistic 0.184095 Durbin-Watson stat 2.427277
Prob(F-statistic) 0.832760

Conclusion:
We fail to reject H0, because Prob(F-statistic) = 0.8328 is more than 0.05 and the
F value (0.184) is less than 3.30. Having an attached garage and a swimming pool are
therefore not jointly statistically significant determinants of selling price.
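The reported F-statistic can be reproduced from the R-squared of the regression on the two dummies. The sketch below assumes the standard overall-significance formula F = (R²/k) / ((1 - R²)/(n - k - 1)), with n = 34 and k = 2; the function name `f_from_r2` is an illustrative choice.

```python
def f_from_r2(r2, n, k):
    """Overall F-statistic for a regression with k regressors and n observations."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# R-squared from the garage-and-pool regression above.
f_stat = f_from_r2(r2=0.011738, n=34, k=2)
print(round(f_stat, 4))
```

The value matches the reported F-statistic up to rounding of the R-squared.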
