Name: Nabilah Binti Talib

Matric No: 17204258

Friends I discussed the answer with are Hilmiya Batrisyia Binti Hilmi and Alvina Binti
Asrul

EIB3003 Managerial Economics


Case study 1 (15%)
Duration: 7 days (8 May 2022 – 15 May 2022)

Case 1
Hedonic price

There are many factors that could influence the price of a house, including its features. As an
economist working for a real estate agent, you have been asked by your manager to find out
which features of a house are important in driving its price up. You are provided with data
from past sales (housing.xlsx), which consist of 400 observations of housing prices and their
characteristics. The available data are as follows:

Price price (US dollars)


lotSize size of lot (acres)
age age of house (years)
landValue value of land (US dollars)
livingArea living area (square feet)
pctCollege percent of neighborhood that graduated college
bedrooms number of bedrooms
fireplaces number of fireplaces
bathrooms number of bathrooms (half bathrooms have no shower or tub)
rooms number of rooms
heating type of heating system
fuel fuel used for heating
sewer type of sewer system
waterfront whether property includes waterfront
newConstruction whether the property is a new construction
centralAir whether the house has central air

Given the information above, run a regression that will help you to understand the variation in
house pricing. Specifically, you need to

a) present and elaborate the estimation model, explain how and why variables included as
independent variables are relevant to explain the housing prices

Estimation Model

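$$\text{Price} = \beta_0 + \beta_1(\text{lotSize}) + \beta_2(\text{age}) + \beta_3(\text{landValue}) + \beta_4(\text{livingArea}) + \beta_5(\text{bedrooms}) + e$$

The price of a house is expected to be affected by the size of the lot, the age of the house, the
value of the land, the living area and the number of bedrooms.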

Table 1 Independent Variables and Reasons

Independent Variables Reasoning

lotSize - size of lot (acres)
If we are purchasing a house package from a contractor, we will almost certainly have to pay
extra for a bigger plot of land. More acreage implies more preparation work, more landscaping,
and so on, and all of these expenditures are included in the final price. Because of this, the
price of the home will increase.

age - age of house (years)
When buying a house, the age of the house has an impact on the price. It seems logical that
house prices reflect how up-to-date a home's features are, since newer amenities, such as a new
roof and new kitchen appliances, are worth more because they will last for a longer period of
time.

landValue - value of land (US dollars)
Land rises in value because it is scarce. Demand for land grows in tandem with population
growth, pushing the price of land upward over time, and the value of the land is in turn
reflected in the price of the house built on it.

livingArea - living area (square feet)
Living area is the interior floor space of the house, measured in square feet. A larger living
area gives the occupants more usable space, so buyers are generally willing to pay more for it
and the price of the house is expected to rise with it.

bedrooms - number of bedrooms
The greater the number of bedrooms in a house, the higher the price it can be listed at. The
number of bedrooms is a vital feature when comparing two homes, because property owners and
brokers decide listing prices based on what other similar houses are selling for in the same
market.

(b) run the regression and present the findings.

Regression Equation
$$\text{Price} = \beta_0 + \beta_1(\text{lotSize}) + \beta_2(\text{age}) + \beta_3(\text{landValue}) + \beta_4(\text{livingArea}) + \beta_5(\text{bedrooms}) + e$$

where

Price Price (US dollar)

lotSize Size of lot (acres)

Age Age of house (years)

landValue Value of land (US dollars)

livingArea Living area (square feet)

bedrooms Number of bedrooms


Findings

Table 2 Multiple Linear Regression Output


Variable      Beta   Coefficient      t-statistic   P-value
Intercept     β0     41079.844406     6.742447      2.1187E-11
lotSize       β1     6746.946015      3.153087      0.001643
Age           β2     -261.878179      -4.997504     6.3979E-7
landValue     β3     0.959209         20.441549     2.5165E-83
livingArea    β4     92.405293        25.823811     1.4494E-124
bedrooms      β5     -6496.734244     -2.646891     0.008198

R-squared                      0.613567
Adjusted R-squared             0.612445
Observations                   1728
F-statistic                    546.827741
Significance of F-statistic    0.0E0

Based on Table 2:

- The R-squared is 0.613567. This implies that about 61.3567% of the variation in the price of a
house is explained by the size of lot, age of house, value of land, living area and number of
bedrooms. The remaining 38.6433% is explained by other variables not included in the model.
- The adjusted R-squared is 0.612445. This implies that, after adjusting for the number of
independent variables, about 61.2445% of the variation in the price of a house is explained by
the size of lot, age of house, value of land, living area and number of bedrooms. The remaining
38.7555% is explained by other variables.
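
As a quick check on how the two figures relate, the adjusted R-squared can be recomputed from the reported R-squared, the number of observations and the number of independent variables. The snippet below is only a sketch using the standard textbook formula; the function name is made up for illustration, and the inputs are the values reported in Table 2.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Standard adjusted R-squared for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Values taken from the regression output in Table 2
r2 = 0.613567   # R-squared
n_obs = 1728    # number of observations
k_vars = 5      # lotSize, age, landValue, livingArea, bedrooms

adj_r2 = adjusted_r_squared(r2, n_obs, k_vars)
print(f"Adjusted R-squared: {adj_r2:.6f}")
```

Because the sample is large relative to the number of variables, the adjustment is very small, which is why the two measures are almost identical.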

Since there is more than one independent variable in the model, a multicollinearity problem
could arise if two or more explanatory variables move together. In that case, it becomes
difficult to determine which of the factors is influencing the dependent variable. For that
reason, we check for multicollinearity using the variance inflation factor (VIF).

Table 3 VIF output of Multicollinearity


Variable      VIF
lotSize       1.027785
Age           1.077354
landValue     1.241832
livingArea    2.262817
bedrooms      1.850711

From Table 3, the VIF of lotSize is 1.027785. Age, landValue, livingArea and bedrooms also have
small VIF values of 1.077354, 1.241832, 2.262817 and 1.850711 respectively. Thus, there is no
multicollinearity problem among the independent variables of the house price regression model,
because all the VIF values are lower than 10.
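
To make the VIF figures easier to read, the sketch below converts each one back into the R-squared of the auxiliary regression of that variable on the other regressors, using the standard relationship VIF = 1 / (1 - R²). The dictionary and function names are illustrative only; the VIF values are those in Table 3, and the threshold of 10 is the rule of thumb used above.

```python
# VIF values reported in Table 3
vif = {
    "lotSize": 1.027785,
    "age": 1.077354,
    "landValue": 1.241832,
    "livingArea": 2.262817,
    "bedrooms": 1.850711,
}


def implied_r_squared(vif_value: float) -> float:
    """Invert VIF = 1 / (1 - R^2) to get the auxiliary-regression R^2."""
    return 1.0 - 1.0 / vif_value


implied = {name: implied_r_squared(value) for name, value in vif.items()}
all_below_10 = all(value < 10 for value in vif.values())

for name, r2_aux in implied.items():
    print(f"{name:<11} VIF = {vif[name]:.6f}  implied auxiliary R^2 = {r2_aux:.4f}")
print("All VIF values below 10:", all_below_10)
```

Even the largest VIF (livingArea) corresponds to an auxiliary R-squared of only about 0.56, well short of the 0.90 that a VIF of 10 would imply.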

(c) evaluate the fitness of the overall model and significance of each variable.

In order to evaluate fitness of the overall model and significance of each variable, I will be using
F-statistics and p-value.

i) Fitness of the overall model

Table 4 F-statistics

F-Statistics Critical F-value

546.827741 2.21929417

H0: β1 = β2 = β3 = β4 = β5 = 0 (there is no significant relationship between the price of a
house and the size of lot, age of house, value of land, living area and number of bedrooms).
H1: At least one of the coefficients is not equal to zero.

According to the table above, the F-statistic of 546.827741 is greater than the critical value
F(0.05; 5, 1722) = 2.21929417. The null hypothesis is therefore rejected at the 5% level of
significance in favour of the alternative hypothesis. We can infer that the model as a whole is
significant: there is evidence that at least one of the independent variables has a significant
relationship with the price of a house.
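
The reported F-statistic can be cross-checked against the R-squared, since for a linear regression with an intercept the overall F-statistic equals (R²/k) / ((1 - R²)/(n - k - 1)). The snippet below is a minimal sketch of that check; the function name is hypothetical, the inputs come from Table 2, and the critical value is simply copied from Table 4 rather than computed.

```python
def f_statistic_from_r2(r2: float, n: int, k: int) -> float:
    """Overall F-statistic implied by R-squared, n observations and k regressors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


f_stat = f_statistic_from_r2(r2=0.613567, n=1728, k=5)
f_critical = 2.21929417  # critical value at the 5% level, copied from Table 4

reject_h0 = f_stat > f_critical
print(f"F-statistic: {f_stat:.2f}, critical value: {f_critical:.4f}, reject H0: {reject_h0}")
```

The small difference from the reported 546.827741 comes only from using the rounded R-squared.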

ii) Significance of each variable

Table 5 T-Statistics

T-Stats P-value

lotSize 3.153087 0.001643

Age -4.997504 6.3979E-7

landValue 20.441549 2.5165E-83

livingArea 25.823811 1.4494E-124

bedrooms -2.646891 0.008198

Significance of each variable's coefficient in the regression model

(1) H0: β1 = 0 (lotSize)

H1: β1 ≠ 0

Since the p-value (0.001643) is less than the alpha value of 0.05, we reject the null hypothesis.
We can conclude that independent variable X1 (size of lot) contributes significantly to
explaining the price of a house, given that the other independent variables are included in the
model.
(2) H0: β2 = 0 (age)

H1: β2 ≠ 0

Since the p-value (6.3979E-7) is less than the alpha value of 0.05, we reject the null hypothesis.
We can conclude that independent variable X2 (age of house) contributes significantly to
explaining the price of a house, given that the other independent variables are included in the
model.

(3) H0: β3 = 0 (landValue)

H1: β3 ≠ 0

Since the p-value (2.5165E-83) is less than the alpha value of 0.05, we reject the null hypothesis.
We can conclude that independent variable X3 (value of land) contributes significantly to
explaining the price of a house, given that the other independent variables are included in the
model.

(4) H0: β4 = 0 (livingArea)

H1: β4 ≠ 0

Since the p-value (1.4494E-124) is less than the alpha value of 0.05, we reject the null hypothesis.
We can conclude that independent variable X4 (living area) contributes significantly to
explaining the price of a house, given that the other independent variables are included in the
model.

(5) H0: β5 = 0 (bedrooms)

H1: β5 ≠ 0

Since the p-value (0.008198) is less than the alpha value of 0.05, we reject the null hypothesis.
We can conclude that independent variable X5 (bedrooms) contributes significantly to
explaining the price of a house, given that the other independent variables are included in the
model.

We can see here that all five independent variables (lotSize, age, landValue, livingArea and
bedrooms) are significant determinants of the price of a house (the dependent variable), since
each p-value is below 0.05.
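
The decision rule used in the five tests can be sketched as follows. This is an approximation rather than a reproduction of the software output: it assumes that, with 1722 degrees of freedom, the t-distribution is close enough to the standard normal, so the two-sided p-value is computed with the normal distribution. The helper name is made up for illustration; the t-statistics are those in Table 5.

```python
import math

ALPHA = 0.05

# t-statistics reported in Table 5
t_stats = {
    "lotSize": 3.153087,
    "age": -4.997504,
    "landValue": 20.441549,
    "livingArea": 25.823811,
    "bedrooms": -2.646891,
}


def two_sided_p_normal(t: float) -> float:
    """Two-sided p-value under the standard normal approximation to the t-distribution."""
    return math.erfc(abs(t) / math.sqrt(2.0))


p_values = {name: two_sided_p_normal(t) for name, t in t_stats.items()}

for name, p in p_values.items():
    decision = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{name:<11} t = {t_stats[name]:>10.6f}  approx. p = {p:.3e}  -> {decision}")
```

The approximate p-values differ slightly from those in Table 5 because of the normal approximation, but the decision is the same for every variable.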

(d) interpret the coefficients.

$$\widehat{\text{Price}} = 41079.844406 + 6746.946015(\text{lotSize}) - 261.878179(\text{age}) + 0.959209(\text{landValue}) + 92.405293(\text{livingArea}) - 6496.734244(\text{bedrooms})$$

Based on the estimated model above, holding the other variables constant:

1) For every additional acre of lot size, the price of a house increases by $6746.946015.
2) For every additional year of house age, the price of a house decreases by $261.878179.
3) For every additional US dollar of land value, the price of a house increases by $0.959209.
4) For every additional square foot of living area, the price of a house increases by $92.405293.
5) For every additional bedroom, the price of a house decreases by $6496.734244.
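
To illustrate how the coefficients combine, the sketch below plugs a house into the estimated equation. The coefficients are taken from Table 2; the house itself (half an acre, 20 years old, and so on) is entirely hypothetical and is not an observation from the data set, and the function name is invented for this example.

```python
# Estimated coefficients from Table 2
INTERCEPT = 41079.844406
coef = {
    "lotSize": 6746.946015,     # per acre
    "age": -261.878179,         # per year
    "landValue": 0.959209,      # per US dollar of land value
    "livingArea": 92.405293,    # per square foot
    "bedrooms": -6496.734244,   # per bedroom
}


def predict_price(house: dict) -> float:
    """Fitted price: intercept plus coefficient times value for each variable."""
    return INTERCEPT + sum(coef[name] * value for name, value in house.items())


# Hypothetical house, chosen only for illustration
house = {"lotSize": 0.5, "age": 20, "landValue": 30000, "livingArea": 1800, "bedrooms": 3}
price = predict_price(house)
print(f"Predicted price: ${price:,.2f}")

# Marginal effect: one more square foot of living area, everything else unchanged
bigger = dict(house, livingArea=house["livingArea"] + 1)
print(f"Effect of one extra square foot: ${predict_price(bigger) - price:,.2f}")
```

The second calculation shows what "holding the other variables constant" means in practice: only one input changes, and the predicted price moves by exactly that variable's coefficient.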

(e) how the estimation will be useful for the manager.

The estimation helps property managers understand their target market, allowing them to make
better economic decisions for their business. It can also help a developer or a real estate
agent determine the selling price of a house, and it can help buyers decide when to buy.
Managers can use the estimates to determine which variables have a substantial impact on
housing prices and thus better grasp the current price trend in the housing market.

(f) what other variable(s) that might be important to consider in the regression.

Other variables that might be important to consider in the regression are heating, new
construction and central air conditioning. First, the heating system is likely to have an
impact on the final selling price of a house. Certain systems raise the expense of heating a
home, and buyers would take this into account when making an offer. Second, whether a house is
newly constructed might affect its value. Research suggests that a higher concentration of new,
bigger homes has a stronger positive influence on the price of a given bundle of features in
the market, particularly for houses that are priced lower than the market average. Lastly,
central air conditioning may affect the price of a house. According to Money magazine,
installing a new central air conditioning system might increase the value of a property by 10
percent. Depending on the market, a house with central air conditioning may sell for more than
a home without it. At least 87 percent of American homes are equipped with air conditioning
and, according to 2018 consumer research, homes with air conditioning sold for 2.5 percent more
than those without it.

Case 2
Mortgage lending

A local American bank manager is looking into past data on loan applicants and trying to
understand the likelihood of an applicant obtaining a loan. He has asked the data manager to
provide all necessary information about past applicants. The data manager has provided him with
2380 previous applicants (HMDA.xlsx) with the following variables:

deny Factor. Was the mortgage denied?


pirat Payments to income ratio.
hirat Housing expense to income ratio.
lvrat Loan to value ratio.
chist Factor. Credit history: consumer payments.
mhist Factor. Credit history: mortgage payments.
phist Factor. Public bad credit record?
unemp 1989 unemployment rate in applicant's industry.
selfemp Factor. Is the individual self-employed?
insurance Factor. Was the individual denied mortgage insurance?
condomin Factor. Is the unit a condominium?
afam Factor. Is the individual African-American?
single Factor. Is the individual single?
hschool Factor. Does the individual have a high-school diploma?

Given the data,


(a) present and elaborate the estimation model, explain how and why variables included as
independent variables are relevant to explain the likelihood that an applicant will be denied
from getting the loan.

Estimation Model

$$P(\text{deny}) = \frac{1}{1 + e^{-(a + \beta_1\,\text{pirat} + \beta_2\,\text{chist} + \beta_3\,\text{phist} + \beta_4\,\text{selfemp} + \beta_5\,\text{insurance})}}$$

The likelihood that an applicant will be denied a loan is affected by the payments to income
ratio, the consumer payments credit history, a public bad credit record, being self-employed and
having been denied mortgage insurance.
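
For later interpretation it helps to write the same model in its log-odds form. The rearrangement below is standard algebra on the logistic function and introduces no symbols beyond the intercept a and coefficients β1 to β5 of the estimation model:

$$\ln\left(\frac{P(\text{deny})}{1 - P(\text{deny})}\right) = a + \beta_1\,\text{pirat} + \beta_2\,\text{chist} + \beta_3\,\text{phist} + \beta_4\,\text{selfemp} + \beta_5\,\text{insurance}$$

Each coefficient is therefore the change in the log-odds of denial for a one-unit increase in its variable, and its exponential is the corresponding odds ratio.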

Independent variables Reasoning


pirat Payments to income ratio
This is, in my opinion, a critical factor for lenders, who want to ensure that their clients can
afford the mortgage they are given. The payments to income ratio is used to establish this: it
is the ratio of monthly debt payments to monthly income. Repaying a mortgage may be challenging
for borrowers whose payments to income ratio exceeds the limit the lender is willing to accept.
As a result, such an applicant is more likely to be denied a loan.
chist Factor. Credit history: consumer payments
Financial institutions will have access to a client's payment history when they receive a loan
application. Therefore, if the borrower has a bad track record of loan repayment, the lender
will presume the borrower will be unable to repay the loan. The consumer would thus be denied a
mortgage.
phist Factor. Public bad credit record?
A loan is unlikely to be granted when the applicant has a bad credit record that is public.
This is because financial institutions will believe that the applicant is not a reliable
borrower, and they do not want to take the chance of the applicant not repaying the loan.
selfemp Factor. Is the individual self-employed?
To apply for a loan, an applicant must prove a steady income by showing tax statements and
business accounts for at least the last two to three years. Self-employed people often have
difficulty earning a steady income every month, which increases the possibility of being denied
a loan.
insurance Factor. Was the individual denied mortgage insurance?
Without mortgage insurance, lenders are exposed to a bigger risk. The purpose of mortgage
insurance is to safeguard the lender if the applicant fails to repay the loan; hence, lenders
generally try to avoid applicants who lack this protection.

(b) run the regression and present the findings (hint: the dependent variable is a
categorical variable, therefore you cannot use OLS)

Logistic Regression Equation


$$P(\text{deny}) = \frac{1}{1 + e^{-(a + \beta_1\,\text{pirat} + \beta_2\,\text{chist} + \beta_3\,\text{phist} + \beta_4\,\text{selfemp} + \beta_5\,\text{insurance})}}$$

where

deny Factor. Was the mortgage denied?

pirat Payments to income ratio

chist Factor. Credit history: consumer payments

phist Factor. Public bad credit record?

selfemp Factor. Is the individual self-employed?

insurance Factor. Was the individual denied mortgage insurance?

Findings

Table 2 Logistic Regression Output


Variable     Coefficient   Standard Error   P-value      Odds ratio   95% CI lower   95% CI upper
Intercept    -5.011843     0.307888         1.4107E-59   0.00666
pirat        4.934172      0.774328         1.8634E-10   138.958      30.463134      633.858901
chist        0.344295      0.037800         8.3753E-20   1.41099      1.310236       1.519501
phist        1.346414      0.197951         1.0335E-11   3.84362      2.607619       5.665478
selfemp      0.502282      0.205063         0.014309     1.65249      1.105575       2.469950
insurance    4.756032      0.542932         1.9543E-18   116.283567   40.121149      337.025937

Cox & Snell R Square    0.161441
Nagelkerke R Square     0.310786
McFadden R Square       0.240256
*Log-likelihood: 1325.123236
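
The odds ratio and confidence limit columns follow directly from the coefficient and standard error columns: the odds ratio is exp(coefficient), and the limits are exp(coefficient ± z × standard error). The sketch below reproduces them under the assumption that the software used a 95% Wald interval with z ≈ 1.96; the function name is illustrative, and the inputs are copied from the table above.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval (assumed)

# (coefficient, standard error) pairs from the logistic regression output
estimates = {
    "pirat": (4.934172, 0.774328),
    "chist": (0.344295, 0.037800),
    "phist": (1.346414, 0.197951),
    "selfemp": (0.502282, 0.205063),
    "insurance": (4.756032, 0.542932),
}


def odds_ratio_with_ci(coef: float, se: float, z: float = Z_95):
    """Return (odds ratio, lower limit, upper limit) for a Wald interval."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


results = {name: odds_ratio_with_ci(b, se) for name, (b, se) in estimates.items()}

for name, (odds, lower, upper) in results.items():
    print(f"{name:<10} OR = {odds:10.4f}   95% CI [{lower:.3f}, {upper:.3f}]")
```

The very wide intervals for pirat and insurance reflect their larger standard errors, so those odds ratios are estimated much less precisely than the one for chist.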

(c) evaluate the fitness of the overall model and significance of each variable

I will use Pearson's chi-square test and the likelihood ratio test in order to evaluate the
fitness of the overall model and the significance of each variable.

i) Fitness of the overall model

Table 3 Pearson’s Chi Square

Chi-square P-value

Pearson 1188.890122 0.000017

H0: the model is a good fit

H1: the model is not a good fit

The p-value of the chi-square statistic (1188.8901) is 0.000017, which is less than the alpha
value of 0.05. Therefore, we reject the null hypothesis at the 5% significance level: on the
Pearson chi-square test, the data provide evidence against the hypothesis that the model is a
good fit.

ii) Significance of each variable

Table 4 Likelihood Ratio Test

Independent variable   Chi-square    P-value
pirat                  52.996255     3.3418E-13
chist                  79.485935     4.8567E-19
phist                  42.764677     6.1737E-11
selfemp                5.623391      0.017722
insurance              153.277158    3.332E-35

(1) Likelihood ratio test for payments to income ratio

H0: β1 = 0 (pirat)

H1: β1 ≠ 0

The p-value of the chi-square statistic (52.9963) is 3.3418E-13, which is smaller than the alpha
value of 0.05. Therefore, we reject the null hypothesis at the 5% significance level, which
indicates that the coefficient on the payments to income ratio is significantly different from
zero. This means that the payments to income ratio does influence the likelihood that an
applicant will be denied a loan.

(2) Likelihood ratio test for credit history: consumer payments

H0: β2 = 0 (chist)
H1: β2 ≠ 0

The p-value of the chi-square statistic (79.4859) is 4.8567E-19, which is smaller than the alpha
value of 0.05. Therefore, we reject the null hypothesis at the 5% significance level, which
indicates that the coefficient on credit history: consumer payments is significantly different
from zero. This means that the consumer payments credit history does influence the likelihood
that an applicant will be denied a loan.

(3) Likelihood ratio test for public bad credit record

H0: β3 = 0 (phist)
H1: β3 ≠ 0

The p-value of the chi-square statistic (42.7647) is 6.1737E-11, which is smaller than the alpha
value of 0.05. Therefore, we reject the null hypothesis at the 5% significance level, which
indicates that the coefficient on the public bad credit record is significantly different from
zero. This means that a public bad credit record does influence the likelihood that an
applicant will be denied a loan.

(4) Likelihood ratio test for individual self-employed

H0: β4 = 0 (selfemp)
H1: β4 ≠ 0

The p-value of the chi-square statistic (5.6234) is 0.017722, which is smaller than the alpha
value of 0.05. Therefore, we reject the null hypothesis at the 5% significance level, which
indicates that the coefficient on being self-employed is significantly different from zero. This
means that being self-employed does influence the likelihood that an applicant will be denied a
loan.

(5) Likelihood ratio test for mortgage insurance

H0: β5 = 0 (insurance)
H1: β5 ≠ 0

The p-value of the chi-square statistic (153.2772) is 3.332E-35, which is smaller than the alpha
value of 0.05. Therefore, we reject the null hypothesis at the 5% significance level, which
indicates that the coefficient on being denied mortgage insurance is significantly different
from zero. This means that being denied mortgage insurance does influence the likelihood that
an applicant will be denied a loan.
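
The p-values in the likelihood ratio table can be recovered from the chi-square statistics. The sketch below assumes each test has one degree of freedom (one coefficient is dropped at a time), in which case the upper-tail probability of the chi-square distribution equals erfc(sqrt(x / 2)). The function name is made up for illustration; the statistics are those reported in the likelihood ratio table.

```python
import math

ALPHA = 0.05

# Likelihood ratio chi-square statistics from the table above
lr_chi_square = {
    "pirat": 52.996255,
    "chist": 79.485935,
    "phist": 42.764677,
    "selfemp": 5.623391,
    "insurance": 153.277158,
}


def chi_square_p_value_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


lr_p_values = {name: chi_square_p_value_df1(x) for name, x in lr_chi_square.items()}

for name, p in lr_p_values.items():
    decision = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{name:<10} chi-square = {lr_chi_square[name]:>10.4f}  p = {p:.4e}  -> {decision}")
```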

(d) based on the estimation result, explain what the important factors are explaining the
likelihood of being denied for a loan (Be extra careful in interpreting the coefficients).

$$P(\text{deny}) = \frac{1}{1 + e^{-(-5.0118 + 4.9342\,\text{pirat} + 0.3443\,\text{chist} + 1.3464\,\text{phist} + 0.5023\,\text{selfemp} + 4.7560\,\text{insurance})}}$$

According to the estimation model, given that the other independent variables are held
constant:

1. Payments to income ratio is a positive (β = 4.9342) as well as a significant (p-value =
1.8634E-10) predictor of the likelihood that an applicant will be denied a loan. The odds
ratio indicates that for every one-unit increase in the payments to income ratio, the odds
of an applicant being denied a loan increase by a factor of 138.958 (95% CI [30.463,
633.859]).
2. Credit history: consumer payments is a positive (β = 0.3443) as well as a significant
(p-value = 8.3753E-20) predictor of the likelihood of an applicant being denied a loan. The
odds ratio indicates that for every one-unit increase in the consumer payments credit
history variable (chist), the odds of an applicant being denied a loan increase by a factor
of 1.411 (95% CI [1.310, 1.520]).

3. Public bad credit record is a positive (β = 1.3464) as well as a significant (p-value =
1.034E-11) predictor of the likelihood of an applicant being denied a loan. Given the coding
of the variable, the positive coefficient means that a person with a public bad credit
record is more likely to be denied a loan. The odds ratio of 3.8436 (95% CI [2.608, 5.665])
indicates that a person with a public bad credit record has 3.8436 times the odds of being
denied a loan compared with a person without one.

4. Self-employed is a positive (β = 0.5023) as well as a significant (p-value = 0.01431)
predictor of the likelihood of an applicant being denied a loan. Given the coding of the
variable, the positive coefficient means that a person who is self-employed is more likely
to be denied a loan. The odds ratio of 1.6525 (95% CI [1.106, 2.470]) indicates that a
person who is self-employed has 1.6525 times the odds of being denied a loan compared with
a person who is not self-employed.

5. Mortgage insurance is a positive (β = 4.7560) as well as a significant (p-value =
1.954E-18) predictor of the likelihood of an applicant being denied a loan. Given the coding
of the variable, the positive coefficient means that a person who was denied mortgage
insurance is more likely to be denied a loan. The odds ratio of 116.2836 (95% CI [40.121,
337.026]) indicates that a person who was denied mortgage insurance has 116.2836 times the
odds of being denied a loan compared with a person who was not denied mortgage insurance.
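
Because odds ratios are easy to misread, it can help to translate the coefficients into a predicted probability and into a more realistic change in pirat. Since pirat is a ratio, a one-unit increase means payments rising by 100% of income, so the sketch below also reports the odds multiplier for an increase of 0.1. The applicant is entirely hypothetical, and the coding is an assumption not stated in the output above: the factor variables are taken as 0/1 dummies and chist as a numeric score.

```python
import math

# Estimated coefficients from the logistic regression output
INTERCEPT = -5.011843
logit_coef = {
    "pirat": 4.934172,
    "chist": 0.344295,
    "phist": 1.346414,
    "selfemp": 0.502282,
    "insurance": 4.756032,
}


def deny_probability(applicant: dict) -> float:
    """Predicted probability of denial from the fitted logistic model."""
    z = INTERCEPT + sum(logit_coef[name] * value for name, value in applicant.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical applicant (assumed coding: 0/1 dummies, chist as a numeric score)
applicant = {"pirat": 0.33, "chist": 1, "phist": 0, "selfemp": 0, "insurance": 0}
p_base = deny_probability(applicant)

# Same applicant, but with a public bad credit record
p_bad_record = deny_probability(dict(applicant, phist=1))

# Odds multiplier for a 0.1 increase in pirat instead of a full unit
odds_multiplier_01 = math.exp(logit_coef["pirat"] * 0.1)

print(f"P(deny), baseline applicant:        {p_base:.4f}")
print(f"P(deny), with public bad record:    {p_bad_record:.4f}")
print(f"Odds multiplier for +0.1 in pirat:  {odds_multiplier_01:.4f}")
```

Note that the odds ratio for phist applies to the odds, not to the probability: the probability of denial rises when phist changes from 0 to 1, but by less than a factor of 3.84.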
