Friends I discussed the answer with are Hilmiya Batrisyia Binti Hilmi and Alvina Binti
Asrul
Case 1
Hedonic price
Many factors can influence the price of a house, including its features. As an
economist working for a real estate agent, you have been asked by your manager to find out
which features of a house are important in driving the price up. You are provided with data
from past sales (housing.xlsx), consisting of 400 observations of house prices and their
characteristics. The available data are as follows:
Given the information above, run a regression that will help you to understand the variation in
house pricing. Specifically, you need to
a) present and elaborate the estimation model, explain how and why variables included as
independent variables are relevant to explain the housing prices
Estimation Model
The price of a house is modelled as a function of lot size, age of the house, land value,
living area and the number of bedrooms.
age - age of house (years): When buying a house, the age of the house has an impact on the
price. It seems logical that house prices reflect how up-to-date a home's features are, since
newer amenities, such as a new roof and new kitchen appliances, are worth more because they
will endure for a longer period of time.
landValue - value of land (US dollars): Land rises in value because there is a scarcity of it.
Because of this, demand for land grows in tandem with population growth, pushing
Name: Nabilah Binti Talib
Matric No: 17204258
livingArea - living area (square feet): Even when housing prices decline on a national level,
prices in a given living area may climb, especially in urban areas, because such areas are
desirable. A few examples of geographical considerations that might impact house prices are
neighborhoods, highways, tourist attractions, hospitals, and schools, among others.
Regression Equation
$$\text{Price} = \beta_0 + \beta_1(\text{lotSize}) + \beta_2(\text{age}) + \beta_3(\text{landValue}) + \beta_4(\text{livingArea}) + \beta_5(\text{bedrooms}) + e$$
where
Findings
Because the model has more than one independent variable, multicollinearity can arise if two
or more explanatory variables move together. When that happens, it is difficult to determine
which factor is actually influencing the dependent variable. For that reason, we check for
multicollinearity using the variance inflation factor (VIF).
From Table 3, the VIF of lotSize is 1.027785. The VIFs of age, landValue, livingArea and
bedrooms are also small, at 1.077354, 1.241832, 2.262817 and 1.850711 respectively. Thus,
there is no multicollinearity problem among the independent variables of the house price
regression model, because all the VIF values are lower than 10.
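The VIF check can be made concrete with a short sketch. It assumes the standard definition VIF = 1 / (1 - R^2), where R^2 comes from regressing one explanatory variable on the others; inverting it shows how much of each variable is explained by the rest. The VIF values are the ones reported above; the helper name `implied_r2` is only illustrative.

```python
# VIF values as reported in Table 3 of this write-up.
vif = {
    "lotSize": 1.027785,
    "age": 1.077354,
    "landValue": 1.241832,
    "livingArea": 2.262817,
    "bedrooms": 1.850711,
}

THRESHOLD = 10.0  # rule-of-thumb cutoff used in the text


def implied_r2(v):
    """Invert VIF = 1 / (1 - R^2) to recover the auxiliary-regression R^2."""
    return 1.0 - 1.0 / v


for name, v in vif.items():
    flag = "possible multicollinearity" if v >= THRESHOLD else "ok"
    print(f"{name:<11} VIF={v:.4f}  implied R^2={implied_r2(v):.3f}  {flag}")
```

Even the largest VIF (livingArea) corresponds to an auxiliary R^2 well below the 0.9 that a VIF of 10 would imply.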
(c) evaluate the fitness of the overall model and significance of each variable.
To evaluate the fitness of the overall model and the significance of each variable, I use the
F-statistic and the p-values of the individual t-tests.
Table 4 F-statistics
F-statistic: 546.827741    Critical value: 2.21929417
H0: There is no significant relationship between the price of the house and the size of lot,
age of house, value of land, living area and number of bedrooms (all slope coefficients are zero).
H1: At least one of these variables has a significant relationship with the price of the house.
According to the table above, the F-statistic of 546.827741035845 is greater than the critical
value F(0.05; 5, 1722) = 2.21929417. The null hypothesis is therefore rejected at the 5% level
of significance in favour of the alternative hypothesis. We can infer that the model as a whole
is significant: there is evidence that at least one of the independent variables is related to
the price of the house.
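The decision rule for the F-test can be written out as a small sketch using the figures quoted above. As a side calculation, it also backs out the R^2 implied by the F-statistic, assuming the usual OLS identity F = (R^2 / k) / ((1 - R^2) / df) for a model with an intercept; that R^2 is an inference from the reported numbers, not a value taken from the regression output.

```python
f_stat = 546.827741      # reported F-statistic (Table 4)
f_crit = 2.21929417      # reported critical value at alpha = 0.05
k, df_resid = 5, 1722    # degrees of freedom quoted in the text

# Reject H0 (all slopes are zero) when the statistic exceeds the critical value.
reject_h0 = f_stat > f_crit

# F = (R^2 / k) / ((1 - R^2) / df_resid)  =>  R^2 = k*F / (k*F + df_resid)
implied_r2 = k * f_stat / (k * f_stat + df_resid)

print("reject H0:", reject_h0)
print("implied R^2:", round(implied_r2, 4))
```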
Table 5 T-Statistics
T-Stats P-value
Since the p-value 0.001643 < alpha value 0.05, we reject the null hypothesis.
We can conclude that independent variable X1 (size of lot) contributes to explaining the price
of the house, given that the other independent variables are included in the model.
Since the p-value 6.3979E-7 < alpha value 0.05, we reject the null hypothesis.
We can conclude that independent variable X2 (age of house) contributes to explaining the price
of the house, given that the other independent variables are included in the model.
Since the p-value 2.5165E-83 < alpha value 0.05, we reject the null hypothesis.
We can conclude that independent variable X3 (value of land) contributes to explaining the
price of the house, given that the other independent variables are included in the model.
Since the p-value 1.4494E-124 < alpha value 0.05, we reject the null hypothesis.
We can conclude that independent variable X4 (living area) contributes to explaining the
price of the house, given that the other independent variables are included in the model.
Since the p-value 0.008198 < alpha value 0.05, we reject the null hypothesis.
We can conclude that independent variable X5 (bedrooms) contributes to explaining the
price of the house, given that the other independent variables are included in the model.
We can see here that all five independent variables (lotSize, age, landValue, livingArea and
bedrooms) are significant determinants of the price of the house (the dependent variable) at
the 5% level, since every p-value is below 0.05.
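Applying the same 5% rule to the five p-values quoted above gives a quick cross-check of the individual tests. This is only a sketch of the decision rule; the p-values are copied from the text, and the dictionary keys are the variable names used in the regression equation.

```python
ALPHA = 0.05

# p-values of the individual t-tests, as quoted in the text.
p_values = {
    "lotSize": 0.001643,
    "age": 6.3979e-7,
    "landValue": 2.5165e-83,
    "livingArea": 1.4494e-124,
    "bedrooms": 0.008198,
}

# A coefficient is significant at the 5% level when its p-value is below alpha.
significant = [name for name, p in p_values.items() if p < ALPHA]

for name, p in p_values.items():
    print(f"{name:<11} p={p:.4e}  reject H0: {p < ALPHA}")
```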
The estimates will help property managers understand their target market, allowing them to
make better economic decisions for their business. They can also help a developer or a real
estate agent determine the selling price of a house, and help a buyer decide when to buy.
Managers can use the estimates to determine which variables have a substantial impact on the
house price, and thus better grasp the current price trend in the housing market.
(f) what other variable(s) that might be important to consider in the regression.
Another variable that might be important to consider in the regression is heating. When the
time comes to sell the house, the heating system may have an impact on its final selling
price. Certain systems raise the expense of heating a home, and buyers would take this into
account when making an offer. In addition, new construction might have an impact on the value
of a house. Research suggests that a higher concentration of new, bigger homes has a stronger
positive influence on the price of a given bundle of features in the market, particularly for
houses that are priced lower than the market average. Lastly, central air conditioning may
affect the price of the house. According to Money magazine, installing a new central air
conditioning system might increase the value of a property by 10 percent. Depending on the
market, the price of a house with central air conditioning may be higher than that of a home
without it. At least 87 percent of American homes are equipped with air conditioning, and
according to a 2018 consumer study, homes with air conditioning sold for 2.5 percent more than
those without it.
Case 2
Mortgage lending
A local American bank manager is looking into past data on loan applicants, trying to
understand the likelihood of an applicant obtaining a loan. He has asked the data manager to
provide all necessary information about past applicants. The data manager has provided him with
data on 2380 previous applicants (HMDA.xlsx) with the following variables:
Estimation Model
$$P(\text{deny}) = \frac{1}{1 + e^{-(\beta_1\,\text{pirat} + \beta_2\,\text{chist} + \beta_3\,\text{phist} + \beta_4\,\text{selfemp} + \beta_5\,\text{insurance} + a)}}$$
The likelihood that an applicant will be denied the loan is modelled as depending on the
payment-to-income ratio (pirat), credit history: consumer payments (chist), public bad credit
record (phist), self-employment (selfemp) and whether the individual was denied mortgage
insurance (insurance).
(b) run the regression and present the findings (hint: the dependent variable is a
categorical variable, therefore you cannot use OLS)
where
Findings
McFadden R Square: 0.240256
*Log-likelihood: 1325.123236
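For reference, McFadden's R Square compares the log-likelihood of the fitted model with that of an intercept-only model. The sketch below shows the formula only; the two log-likelihood values are hypothetical round numbers chosen for illustration and are not taken from the regression output above.

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden pseudo R^2 = 1 - LL(fitted model) / LL(intercept-only model)."""
    return 1.0 - ll_model / ll_null


# Hypothetical log-likelihoods (both negative, as log-likelihoods of a
# binary-outcome model are); NOT the values from this assignment.
ll_null = -1000.0
ll_model = -760.0

r2 = mcfadden_r2(ll_model, ll_null)
print(round(r2, 4))
```

A value closer to 1 means the fitted model improves more on the intercept-only model; a value of 0 means no improvement.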
(c) evaluate the fitness of the overall model and significance of each variable
I will use Pearson's Chi Square and the Likelihood Ratio Test to evaluate the fitness of the
overall model and the significance of each variable.
Chi-square P-value
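The p-value of chi-square (1188.8901) is 0.000017, which is less than the alpha value of 0.05.
Therefore, we reject the null hypothesis at the 5% significance level. Taking the null to be
that all slope coefficients are zero, as in the likelihood ratio test, this indicates that the
model as a whole is statistically significant.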
H0: pirat = 0
H1: pirat ≠ 0
The p-value of chi-square (52.9963) is 3.3418E-13, which is smaller than the alpha value of 0.05.
Therefore, we reject the null hypothesis at the 5% significance level, which indicates that the
coefficient on the payment-to-income ratio is significantly different from zero. This means that
the payment-to-income ratio does influence the likelihood that an applicant will be denied a loan.
H0: chist = 0
H1: chist ≠ 0
The p-value of chi-square (79.4859) is 4.8567E-19, which is smaller than the alpha value of 0.05.
Therefore, we reject the null hypothesis at the 5% significance level, which indicates that the
coefficient on credit history: consumer payments is significantly different from zero. This
means that credit history: consumer payments does influence the likelihood that an applicant
will be denied a loan.
H0: phist = 0
H1: phist ≠ 0
The p-value of chi-square (42.7647) is 6.1737E-11, which is smaller than the alpha value of 0.05.
Therefore, we reject the null hypothesis at the 5% significance level, which indicates that the
coefficient on the public bad credit record is significantly different from zero. This means
that the public bad credit record does influence the likelihood that an applicant will be
denied a loan.
H0: selfemp = 0
H1: selfemp ≠ 0
The p-value of chi-square (5.6234) is 0.01772, which is smaller than the alpha value of 0.05.
Therefore, we reject the null hypothesis at the 5% significance level, which indicates that the
coefficient on being self-employed is significantly different from zero. This means that being
self-employed does influence the likelihood that an applicant will be denied a loan.
H0: insurance = 0
H1: insurance ≠ 0
The p-value of chi-square (153.2772) is 3.332E-35, which is smaller than the alpha value of 0.05.
Therefore, we reject the null hypothesis at the 5% significance level, which indicates that the
coefficient on being denied mortgage insurance is significantly different from zero. This means
that being denied mortgage insurance does influence the likelihood that an applicant will be
denied a loan.
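The p-values of the five individual tests can be reproduced from the chi-square statistics quoted above. The sketch assumes each test has 1 degree of freedom (one coefficient restricted to zero), in which case the upper-tail probability is erfc(sqrt(x / 2)); the function name `chi2_1df_pvalue` is only illustrative.

```python
import math

ALPHA = 0.05


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Chi-square statistics of the individual tests, as quoted in the text.
chi_square = {
    "pirat": 52.9963,
    "chist": 79.4859,
    "phist": 42.7647,
    "selfemp": 5.6234,
    "insurance": 153.2772,
}

p_values = {name: chi2_1df_pvalue(s) for name, s in chi_square.items()}

for name, p in p_values.items():
    print(f"{name:<10} chi2={chi_square[name]:>9.4f}  p={p:.4e}  reject H0: {p < ALPHA}")
```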
(d) based on the estimation result, explain what the important factors are explaining the
likelihood of being denied for a loan (Be extra careful in interpreting the coefficients).
$$P(\text{deny}) = \frac{1}{1 + e^{-(-5.0118 + 4.9341\,\text{pirat} + 0.3443\,\text{chist} + 1.3464\,\text{phist} + 0.5023\,\text{selfemp} + 4.7560\,\text{insurance} + a)}}$$
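Because the coefficients enter through the logistic function, they are not changes in probability. A minimal sketch of how to read them follows: exp(coefficient) is the multiplicative change in the odds of denial for a one-unit increase, and a predicted probability needs values for all variables. The applicant profile is hypothetical, and the trailing term a in the equation is treated as zero here; both are assumptions made only for illustration.

```python
import math

# Estimated coefficients from the fitted equation above.
INTERCEPT = -5.0118
COEF = {
    "pirat": 4.9341,
    "chist": 0.3443,
    "phist": 1.3464,
    "selfemp": 0.5023,
    "insurance": 4.7560,
}


def p_deny(applicant):
    """Predicted probability of denial from the logistic model (a assumed 0)."""
    z = INTERCEPT + sum(COEF[name] * applicant[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-z))


# Odds ratios: factor by which the odds of denial change per one-unit increase.
odds_ratios = {name: math.exp(b) for name, b in COEF.items()}

# Hypothetical applicant, then the same applicant if self-employed.
baseline = {"pirat": 0.33, "chist": 1, "phist": 0, "selfemp": 0, "insurance": 0}
self_employed = dict(baseline, selfemp=1)

for name, ratio in odds_ratios.items():
    print(f"{name:<10} odds ratio = {ratio:.3f}")
print("P(deny), baseline:     ", round(p_deny(baseline), 4))
print("P(deny), self-employed:", round(p_deny(self_employed), 4))
```

The change in probability from switching one variable depends on the values of the others, which is why the comparison is made for a fixed applicant profile.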
According to the estimation model, given that the other independent variables are held
constant,