
Course: Research Methods

Assignment 3 Houses in UK

SPSS Data Analysis

Instructors: Prof. Dr. Nikica Mojsovska Blazevski

Dr. Miodraga Stefanovska Petkovska

Introduction:

Based on the given data, we analyze how certain characteristics of a house affect its price.
At the outset, a few seemingly logical assumptions arise, such as that bigger houses are more
expensive, or that houses with several bathrooms are more expensive than those without. Using
several statistical analyses in SPSS, we aim to determine which characteristics truly influence
a house's price, and how exactly they do so.

(i)

From the table (Appendix A) we can deduce the following about the 127 houses in the sample:

- The average house in the sample has a size of 100.3 square meters (with a standard
deviation of 26.9 square meters), and is 24 years old (with a standard deviation of 33.4 years).
- 37% of the houses are terraced.
- 40% of the houses are detached.
- 23% of the houses are semi-detached.
- 61% of the houses have a garage.
- 69% of the houses have full central heating.
- 33% of the houses have 2 or more bathrooms.

Scatterplot analysis (Appendices B.1 and B.2)

In terms of age, the first thing we notice is that new houses (aged 0 years) vary greatly in price.
The points nevertheless slope somewhat downward, which indicates a negative relationship
between age and price, even though it is slight.

In terms of size, there is an obvious positive relationship between size and price. The upward
slope shows that as a house's size increases, so does its price.

Cross tabulation (Appendix E):

- Terraced houses show a trend of being cheaper than houses that are not terraced.
- Detached houses show a trend of being more expensive than houses that are not detached.
- Semi-detached houses show a trend of being cheaper than houses that are not
semi-detached.
- Houses that have 2 or more bathrooms show a trend of being more expensive than
those that have fewer.
- Houses that have full central heating show a trend of being more expensive than
those that do not.
- Houses that have at least 1 garage show a trend of being more expensive than
those that have none.

(ii)

Since the three dummy variables (TERR, SEMI and DET) all describe the same characteristic, the
type of house, we exclude one of them from the linear regression and use it as the reference
category when interpreting the other two. Including all three together with the constant would
make them perfectly collinear, because for every house they sum to 1. Since both terraced and
semi-detached houses show a trend of being less expensive (as suggested by the cross tabulation),
we use those two in the regression and leave out detached.
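
The reason for dropping one of the three house-type dummies can be illustrated with a minimal sketch. The six houses below are made-up toy values, not the assignment data, and the variable names are chosen only for illustration. The sketch shows that a design matrix with a constant and all three dummies has fewer independent columns than it has columns, while the version without the detached dummy does not.

```python
import numpy as np

# Toy data, made up for illustration only: six houses, two of each type.
terr = np.array([1, 1, 0, 0, 0, 0])
semi = np.array([0, 0, 1, 1, 0, 0])
det = np.array([0, 0, 0, 0, 1, 1])
const = np.ones(6)

# Constant plus all three dummies: terr + semi + det equals the constant column.
X_all = np.column_stack([const, terr, semi, det])
# Constant plus two dummies: detached becomes the omitted reference category.
X_ref = np.column_stack([const, terr, semi])

rank_all = np.linalg.matrix_rank(X_all)
rank_ref = np.linalg.matrix_rank(X_ref)

print("all three dummies:", rank_all, "independent columns out of", X_all.shape[1])
print("detached omitted: ", rank_ref, "independent columns out of", X_ref.shape[1])
```

When the rank is lower than the number of columns, the coefficients cannot be estimated uniquely, which is why one category has to serve as the reference.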

(iii)

To begin with, we estimate two linear regressions, “Price = β1 + all independent variables +
Ut” and “LnPrice = β1 + all independent variables + Ut” (as shown in Appendices C.1 and C.2).
It is important to note that the “Detached” dummy variable is absent from both equations, as
previously stated.

From the two regressions we chose the LnPrice model for further interpretation, as its
R-squared value was 90%, as opposed to 88% for the Price model. In other words, the
independent variables explain 90% of the variability of the dependent variable, LnPrice.
The F-test and t-tests yield the same conclusions about significance in both equations, so the
R-squared was the deciding factor when picking between these two.
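
As a cross-check on the two models, the adjusted R-squared values in Appendices C.1 and C.2 can be reproduced from the reported R-squared, the sample size (127) and the number of predictors (7). This is a minimal sketch using the standard adjusted R-squared formula; the function name is chosen here for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (plus a constant)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n, k = 127, 7  # sample size and number of predictors, from the appendices

adj_price = adjusted_r2(0.883, n, k)    # Price model, R Square from Appendix C.1
adj_lnprice = adjusted_r2(0.901, n, k)  # LnPrice model, R Square from Appendix C.2

print(round(adj_price, 3), round(adj_lnprice, 3))
```

Both rounded values agree with the Adjusted R Square column of the SPSS output (.876 and .895).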

F – Test

The p-value of the F-test (as shown in Appendix C.2) is below 0.05, and thus we can conclude
that the independent variables are jointly statistically significant for the dependent variable.
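
The F statistic itself can be recomputed from the ANOVA table in Appendix C.2. The sketch below uses the rounded sums of squares printed there, so the result is close to, but not exactly, the reported F of 154.875. The variable names are chosen for illustration.

```python
# Values taken from the ANOVA table in Appendix C.2 (LnPrice model).
ss_regression = 13.663
ss_residual = 1.500
ss_total = 15.163
df_regression = 7
df_residual = 119

# F = (explained variation per predictor) / (unexplained variation per residual df)
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# R-squared is the explained share of the total variation.
r_squared = ss_regression / ss_total

print(round(f_stat, 2), round(r_squared, 3))
```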

T – Test

From the table (Appendix C.2) we can see that all variables except for “CHF” and
“GARAGE” have a p-value below 0.05, and thus each of them has an individually statistically
significant effect on the dependent variable.

Interpreting estimated coefficients (log-lin)

- On average, for every additional year of age we would expect to see a decrease in price
of about 0.3%, ceteris paribus.

- On average, for every additional square meter of size we would expect to see an increase in
price of about 0.6%, ceteris paribus.

- On average, the price of terraced houses will be about 24.7% lower than that of houses that
are detached, ceteris paribus.

- On average, the price of houses with 2 or more bathrooms will be about 12.4% higher than
that of houses with fewer bathrooms, ceteris paribus.

- On average, the price of houses that are semi-detached will be about 7.5% lower than that of
houses that are detached, ceteris paribus.
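
The percentages above read each log-lin coefficient b as 100·b, which is an approximation that works well when b is small. A minimal sketch of the exact conversion, 100·(exp(b) − 1), is given below using the significant coefficients from Appendix C.2. The dictionary and function names are chosen for illustration.

```python
import math

# Unstandardized coefficients (B) of the LnPrice model, from Appendix C.2.
coefs = {"AGE": -0.003, "SIZE": 0.006, "TERR": -0.247, "BATH2P": 0.124, "SEMI": -0.075}

def pct_effect(b):
    """Exact percentage change in price for a one-unit change in the regressor."""
    return 100 * (math.exp(b) - 1)

effects = {name: pct_effect(b) for name, b in coefs.items()}

for name, b in coefs.items():
    print(f"{name}: approx {100 * b:.1f}%, exact {effects[name]:.1f}%")
```

For AGE and SIZE the two readings are practically identical. For the dummies with larger coefficients they differ somewhat: the exact effect for TERR is closer to a 22% discount than to 24.7%.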

(iv)

As stated in the linear regression interpretation, the excluded variable (detached) serves as
the reference category when commenting on terraced or semi-detached houses. The coefficients
show that terraced houses are the cheapest, that semi-detached houses are more
expensive than terraced houses, and that detached houses are the most expensive of all
three.

(v)

From the cross tabulation analysis (Appendix E) we noted that houses that have a garage
showed a trend of having a higher price than those that do not. Still, as it stands,
GARAGE is a dummy variable. If we were to transform it into a continuous variable, the
interpretation would become “if the number of garages increases by 1 unit, the price will
increase by x units”.

(vi)

As we previously stated, the relationship between price and size is positive. The assumption
under examination agrees with this, but adds that it holds only up to a certain point, that is,
that prices are lower for the largest houses. To check whether this is true, we took the variable
SIZE and squared it. We then made a new scatterplot (as shown in Appendix D). The scatterplot
shows several cases where smaller houses have a higher price, but the biggest house and the
second biggest house also have high prices. Thus, taking the majority of cases into account,
it is safe to say that the assumption that prices for the largest houses are lower than those
for smaller houses is false.
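
A common alternative to inspecting the scatterplot, not the approach taken in this report, is to fit a quadratic in size and locate its turning point at −b1 / (2·b2). The sketch below uses made-up numbers lying exactly on a hypothetical parabola; they are not the assignment data and say nothing about the actual result. If the turning point lies beyond the largest house in the sample (182 square meters in Appendix A), the fitted price never declines within the observed range.

```python
import numpy as np

# Hypothetical sizes (square meters) and prices, made up for illustration only.
# The prices lie exactly on price = 100 * size - 0.25 * size**2.
size = np.array([50.0, 80.0, 110.0, 140.0, 170.0])
price = 100 * size - 0.25 * size**2

# Fit price = b2 * size^2 + b1 * size + b0 (polyfit returns highest power first).
b2, b1, b0 = np.polyfit(size, price, 2)

# With b2 < 0 the fitted curve peaks at this size and declines afterwards.
turning_point = -b1 / (2 * b2)

max_size_in_sample = 182  # Maximum of SIZE, from Appendix A
declines_in_sample = turning_point < max_size_in_sample

print(round(turning_point, 1), declines_in_sample)
```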

Conclusion:

From the analysis and interpretations that we conducted we can conclude that most of the
variables play a role in the pricing of a house. Our initial assumptions regarding age and size
were confirmed, and by using a scatterplot analysis we were able to determine their respective
negative and positive relationships with price. The regressions, however, did point out that
variables such as central heating and garages have a statistically insignificant role when it
comes to pricing. To wrap up, we can conclude that the biggest houses that are new, detached
and have 2 or more bathrooms are the most expensive in the UK. Contrary to that, old
houses that are smaller and terraced will have the lowest prices in the UK.

APPENDIX A
Descriptive Statistics

                     N     Minimum   Maximum   Mean       Std. Deviation
PRICE                127   20000     132000    59561.29   21221.689
AGE                  127   0         120       24.41      33.395
SIZE                 127   45        182       100.30     26.860
TERR                 127   0         1         .37        .485
DET                  127   0         1         .40        .492
CHF                  127   0         1         .69        .463
BATH2P               127   0         1         .33        .472
GARAGE               127   0         1         .61        .489
SEMI                 127   0         1         .23        .421
Valid N (listwise)   127

APPENDIX B.1

APPENDIX B.2

APPENDIX C.1

Variables Entered/Removed (a)

Model   Variables Entered                                Variables Removed   Method
1       SEMI, SIZE, AGE, CHF, GARAGE, BATH2P, TERR (b)   .                   Enter

a. Dependent Variable: PRICE
b. All requested variables entered.

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .940 (a)   .883       .876                7478.822

a. Predictors: (Constant), SEMI, SIZE, AGE, CHF, GARAGE, BATH2P, TERR

ANOVA (a)

Model           Sum of Squares    df    Mean Square      F         Sig.
1  Regression   50089371387.067   7     7155624483.867   127.933   .000 (b)
   Residual     6656000597.154    119   55932778.127
   Total        56745371984.220   126

a. Dependent Variable: PRICE
b. Predictors: (Constant), SEMI, SIZE, AGE, CHF, GARAGE, BATH2P, TERR

Coefficients (a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B             Std. Error      Beta                        t        Sig.
1  (Constant)    27978.841     4091.206                                    6.839    .000
   AGE           -140.661      24.721          -.221                       -5.690   .000
   SIZE          372.079       36.310          .471                        10.247   .000
   TERR          -12080.847    2867.414        -.276                       -4.213   .000
   CHF           60.750        1497.619        .001                        .041     .968
   BATH2P        11032.510     2673.102        .246                        4.127    .000
   GARAGE        -563.403      2155.158        -.013                       -.261    .794
   SEMI          -5154.908     2488.298        -.102                       -2.072   .040

a. Dependent Variable: PRICE

APPENDIX C.2

Variables Entered/Removed (a)

Model   Variables Entered                                Variables Removed   Method
1       SEMI, SIZE, AGE, CHF, GARAGE, BATH2P, TERR (b)   .                   Enter

a. Dependent Variable: LnPrice
b. All requested variables entered.

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .949 (a)   .901       .895                .11226

a. Predictors: (Constant), SEMI, SIZE, AGE, CHF, GARAGE, BATH2P, TERR

ANOVA (a)

Model           Sum of Squares   df    Mean Square   F         Sig.
1  Regression   13.663           7     1.952         154.875   .000 (b)
   Residual     1.500            119   .013
   Total        15.163           126

a. Dependent Variable: LnPrice
b. Predictors: (Constant), SEMI, SIZE, AGE, CHF, GARAGE, BATH2P, TERR

Coefficients (a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B             Std. Error      Beta                        t         Sig.
1  (Constant)    10.503        .061                                        171.032   .000
   AGE           -.003         .000            -.293                       -8.215    .000
   SIZE          .006          .001            .436                        10.333    .000
   TERR          -.247         .043            -.345                       -5.730    .000
   CHF           .011          .022            .015                        .484      .629
   BATH2P        .124          .040            .169                        3.097     .002
   GARAGE        .001          .032            .001                        .032      .975
   SEMI          -.075         .037            -.091                       -2.010    .047

a. Dependent Variable: LnPrice

APPENDIX D

APPENDIX E

Case Processing Summary

                   Cases
                   Valid           Missing         Total
                   N     Percent   N     Percent   N     Percent
TERR * PRICE       127   100.0%    0     0.0%      127   100.0%
DET * PRICE        127   100.0%    0     0.0%      127   100.0%
CHF * PRICE        127   100.0%    0     0.0%      127   100.0%
BATH2P * PRICE     127   100.0%    0     0.0%      127   100.0%
GARAGE * PRICE     127   100.0%    0     0.0%      127   100.0%
SEMI * PRICE       127   100.0%    0     0.0%      127   100.0%
