CA4229 Week 3 Land Use Planning - Applied Val

CA4229 Semester B 2018
Land Use Planning & Applied Valuation
1 February 2018
Week 3
Hedonic Price Analysis
(Cont. & Advanced)
Murakami, Jin
Assistant Professor
Department of Architecture and Civil Engineering
City University of Hong Kong
Today’s Outline 00
• Regression (by MS Excel)

• Non‐Linear Spatial Relations
• Interaction Terms
• Panel Data Analysis
Regression
By MS Excel
Software for OLS Regression 01
Optional
Regression by MS Excel 1 02
$$P = \beta_0 + \beta_1 Z_1 + \beta_2 Z_2 + \beta_3 Z_3 + \gamma t + \varepsilon$$

P: Market Transaction Price [$ per sq ft]
$Z_1$: Internal Attributes
$Z_2$: External Attributes
$Z_3$: Other Attributes
$t$: data year (panel data only)
$\beta_0$: constant
$\beta_1, \beta_2, \beta_3, \gamma$: parameters
$\varepsilon$: error term
Open the data table from MS Excel
File → Options
Place “Analysis ToolPak” under “Active Application Add‐Ins”
Data → Data Analysis
Descriptive Statistics
Organize the format of Descriptive Statistics
Data → Data Analysis
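The same kind of model can also be estimated outside Excel. The sketch below is a minimal illustration in plain Python, not the lecture's procedure: it solves the OLS normal equations for a cross-sectional version of the hedonic model (no t term) on made-up, noise-free data. The helper names `solve` and `ols`, the three attributes, and every number are assumptions introduced for this example.

```python
# Minimal OLS sketch for P = b0 + b1*Z1 + b2*Z2 + b3*Z3 (cross-section, no t term).
# All numbers below are made up for illustration; they are not the lecture's data.

def solve(a, b):
    """Solve the square linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the row with the largest entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Return OLS coefficients (constant first) from the normal equations."""
    x = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical attributes: Z1 = floor area (sq ft), Z2 = distance to MTR (m),
# Z3 = building age (years).
Z = [(400, 100, 5), (550, 300, 10), (700, 250, 20),
     (450, 800, 15), (600, 600, 2), (800, 150, 30)]
true_beta = [9000.0, 2.0, -1.5, -40.0]  # assumed values, used only to build P
P = [true_beta[0] + sum(b * z for b, z in zip(true_beta[1:], row)) for row in Z]

beta_hat = ols(Z, P)  # noise-free data, so the fit recovers true_beta
print(beta_hat)
```

Because the made-up prices contain no error term, `beta_hat` reproduces `true_beta` up to rounding. With real transaction data the estimates would come with standard errors and p-values, as in Excel's regression output.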
Non‐Linear
Spatial Relations
Coefficients of Distance 14
Sensitivity Test: if the other conditions are held the same (every other variable in
the model set to its mean), how sensitive are prices to one of the variables (e.g., Actual
Walking Distance from MTR Station)?
[Figure, two panels: “Estimate from the Linear Model” (Slope = -422.62) and “Estimate from the Full‐log Model” (Elasticity = -0.053%). X‐axis: AW Distance from MTR Station (m), 0 to 1000. Y‐axis: Housing Price (HK$ per sq ft), 0 to 12000.]
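For reference, the two panels correspond to two functional forms. The following is a sketch in the notation of the regression model, where $D$ (walking distance) and the subscript $d$ are labels introduced here, and the dots stand for the other variables held at their means:

```latex
% Linear model: the coefficient is a slope (price change per unit of distance)
P = \beta_0 + \beta_d D + \dots
\quad\Rightarrow\quad
\frac{\partial P}{\partial D} = \beta_d

% Full-log (log-log) model: the coefficient is an elasticity
% (percentage change in price per 1% change in distance)
\ln P = \beta_0 + \beta_d \ln D + \dots
\quad\Rightarrow\quad
\frac{\partial \ln P}{\partial \ln D} = \beta_d
```

The linear form implies the same price change for every extra unit of distance, while the full-log form implies the same percentage response, which produces a curve rather than a straight line when plotted in levels.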
Categorical Approach? 15
• In some facility cases (e.g., a highway interchange), the spatial relation cannot be simply linear (e.g., accessibility benefits are cancelled out by air pollution & noise near the facility).
• In such a complex spatial case, we may try a categorical approach (distance band dummies, 1/0).
[Figure: Y (Price) against X (Distance), with the distance axis divided into Bands 100, 200, 300, 400 and 500; a single straight line is labelled “Not good line”.]
Generate Band Dummy (1/0) 16
Select all parts and sort the dataset by Actual Walking Distance, from small to large.
Add a new column for the new variable “MTR Distance within 100 m dummy”, and give “1” if AWDtoMTRstation is less than 100 meters, or “0” otherwise.
Add a new column for the new variable “MTR Distance within 200 m dummy”, and give “1” if AWDtoMTRstation is between 100‐200 meters, or “0” otherwise.
Repeat for the 300 m, 400 m and 500 m dummies: give “1” if AWDtoMTRstation is between 200‐300, 300‐400 or 400‐500 meters respectively, or “0” otherwise.
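The spreadsheet steps for the band dummies can also be sketched in code. This is an illustration only: the dummy names, the sample distances, and the boundary convention (lower edge included, upper edge excluded) are assumptions, since the slides do not say which band a distance of exactly 100 m belongs to.

```python
# Sketch: turn a continuous walking distance (m) into mutually exclusive
# 100 m band dummies. Dummy names and sample distances are hypothetical.

BAND_EDGES = [100, 200, 300, 400, 500]  # upper edge of each band, in meters

def band_dummies(distance_m):
    """Return {dummy name: 0 or 1}; all zeros means the 500 m+ reference group."""
    dummies = {}
    lower = 0
    for upper in BAND_EDGES:
        name = f"MTR_within_{upper}m"
        # 1 only if the distance falls inside this band: lower <= d < upper
        dummies[name] = 1 if lower <= distance_m < upper else 0
        lower = upper
    return dummies

sample_distances = [40, 150, 299, 300, 480, 750]
rows = [{"AWDtoMTRstation": d, **band_dummies(d)} for d in sample_distances]

for row in rows:
    print(row)
```

Each observation gets at most one “1”, so the properties beyond 500 m (all dummies “0”) act as the comparison group in the regression.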
Regression with Band Dummy 20
Run the regression using the 5 distance band dummy variables (100m, 200m, 300m, 400m & 500m) instead of AWD to MTR Station (m), which has continuous distance values.
[Screenshot annotations: “Insufficient”; “Finalized Model”; “<0.01”]
Make a graph
[Figure: Housing Price Premium (HKD per sq ft, axis from -1000 to 1200) by MTR Station Distance (Band Dummy): Within 100m, Within 200m, Within 300m, Within 400m, Within 500m, Between 500‐1000m. Annotations: “Access benefits are highly localized within 100m?” and “Access values are deeply discounted around 400‐500m?”]
Band dummies can be incorporated into semi‐log and full‐log functions too. You may also consider finer bands (e.g., every 50 meters), but they should not be too fine: very narrow distance bands behave just like continuous distances.
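As an aid to reading band coefficients, the sketch below covers only the special case where the band dummies are the sole regressors. In that case each OLS dummy coefficient equals the band's mean price minus the mean price of the omitted reference band (assumed here to be 500‐1000 m). The lecture's model also contains other attributes, so its premia are net of those controls; the band labels and prices below are hypothetical.

```python
# Sketch: with ONLY band dummies in the model (no other attributes), the OLS
# constant is the reference band's mean price and each dummy coefficient is
# that band's mean price minus the reference mean. Prices are hypothetical.
from statistics import mean

observations = [  # (band label, price in HK$ per sq ft)
    ("within_100m", 9800), ("within_100m", 10200),
    ("within_200m", 9500), ("within_200m", 9700),
    ("within_500m", 8500), ("within_500m", 8700),
    ("500_1000m", 9000), ("500_1000m", 9200),
]
REFERENCE = "500_1000m"  # the band left out of the regression

prices_by_band = {}
for band, price in observations:
    prices_by_band.setdefault(band, []).append(price)

constant = mean(prices_by_band[REFERENCE])  # estimate of the constant
premium = {band: mean(prices) - constant    # estimate of each band dummy
           for band, prices in prices_by_band.items() if band != REFERENCE}

print(constant, premium)
```

A positive value is a premium over the reference band and a negative value is a discount, which is how the bars in a band-dummy graph can be read.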

Interaction
Terms
Independent Variables 23
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$
X1, X2 can be….

• Numeric
• Integer
• Dummy (1/0)
The Idea of “Interaction” 24
In the regression model, you may be able to explain one thing (e.g., property price) by using two or more factors (e.g., size, age, distance from a station, etc.), the so‐called main effects. Each factor has an impact individually, but you may also find additional joint effects among multiple factors.
[Diagram: X1 and X2 pointing to Y, with error term e]
Simple Example 25
$$P = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$
P: Property Price ($/sq ft)
$X_1$: Size of Room (sq ft)
$X_2$: Ocean View (1/0)
P‐X1 Relationship 26
[Figure: P ($/sq ft) plotted against X1 (sq ft)]
P‐X2 Relationship 27
[Figure: P ($/sq ft) for X2 = 0 (No‐view) and X2 = 1 (Ocean view)]
P‐X1‐X2 Relationship 28
[Figure: P ($/sq ft) against X1 (sq ft), with one line for X2 = 1 and one for X2 = 0. Caption: No Interaction Effect]

P‐X1‐X2 Relationship 2 29
[Figure: P ($/sq ft) against X1 (sq ft), with one line for X2 = 1 and one for X2 = 0. Caption: Interaction Effect]

Interaction means… 30
• Room Size (X1) increases Property Price (P)
• Ocean View (X2) increases Property Price (P)
• The combination of X1 and X2 increases
Property Price (P) more.
[Diagram: X1 and X2 pointing to P, with error term e]
Interaction means… 31
[Figure: HK$20,000 and HK$5,000 combine to HK$30,000. Not simply the sum.]
How to write? 1 32
[Figure: P ($/sq ft) against X1 (sq ft), with one line for X2 = 1 and one for X2 = 0. Caption: Interaction Effect]

How to write? 2 33
$$P = \beta_0 + \beta'_1 X_1 + \beta_2 X_2 + \varepsilon$$
$$\beta'_1 = \beta_1 + \beta_3 X_2$$
$$P = \beta_0 + (\beta_1 + \beta_3 X_2) X_1 + \beta_2 X_2 + \varepsilon$$
How to write? 2 34
$$P = \beta_0 + \beta_1 X_1 + \beta_3 X_2 X_1 + \beta_2 X_2 + \varepsilon$$
$$P = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon$$
Interaction Effect: $\beta_3 X_1 X_2$
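In practice, the interaction term is simply a new column equal to X1 multiplied by X2. The sketch below uses hypothetical coefficient values (they are not estimates from the lecture's data) to show that the slope of price with respect to room size is b1 without an ocean view and b1 + b3 with one.

```python
# Sketch of P = b0 + b1*X1 + b2*X2 + b3*X1*X2 with hypothetical coefficients.
B0, B1, B2, B3 = 5000.0, 2.0, 800.0, 1.5  # assumed values, not estimates

def predicted_price(size_sqft, ocean_view):
    """Predicted price for a room size X1 and an ocean-view dummy X2 (0 or 1)."""
    interaction = size_sqft * ocean_view  # the new column X1*X2
    return B0 + B1 * size_sqft + B2 * ocean_view + B3 * interaction

def slope_wrt_size(ocean_view):
    """Change in predicted price per extra sq ft, i.e. b1 + b3*X2."""
    return predicted_price(501, ocean_view) - predicted_price(500, ocean_view)

slope_no_view = slope_wrt_size(0)  # equals b1
slope_view = slope_wrt_size(1)     # equals b1 + b3

print(slope_no_view, slope_view)
```

If b3 were zero, the two slopes would be identical and the two lines in the P‐X1‐X2 graph would be parallel, which is the “No Interaction Effect” case.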
Extensions 35
Technically speaking, interactions can be
• Among more than two variables (e.g., $\beta_3 X_1 X_2 X_3$)
• Numeric & Numeric Variables
• Dummy & Dummy Variables
• Linear, semi‐log, and full‐log forms
• Non‐linear
But do not use too many interactions, or ones that are unreasonable or too complex.
Possible Combinations 36
Another Example 37
$$P = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon$$
P: Property Price ($/sq ft)
$X_1$: Distance from MTR (m)
$X_2$: Public-Private Coordination (1/0)
Another Example 38
Lilian Law with Jin Murakami (2014)

Another Example 39
Lilian Law with Jin Murakami (2014)

Panel Data
Analysis
Your question? 40
There are three basic types of questions that research projects can address:
• Descriptive
• Relational
• Causal
Y = aX + b, where Y is the dependent variable and X is the independent variable
Causality is difficult 41
• Direct Causal Relationship: X → Y
• Indirect Causal Relationship: X → Z → Y
• Spurious Relationship: Z → X and Z → Y
• Bidirectional Causal Relationship: X ↔ Y
• Moderated Causal Relationship: Z alters the effect of X on Y
• Unobserved Relationship: [diagram of X and Y]
Dynamic & Complex 42
Panel vs. Cross‐Sectional 43
Panel > Cross‐sectional
Advantages
• You can follow individual changes
• You can assume more accurate causal relations
• You can test difference‐in‐differences more widely
• You can take into account unobserved individual differences (as fixed effects)
• You can increase the number of cases in your modeling
Disadvantages
• Data collection may be more difficult and time consuming
• Analysis requires more careful attention and more advanced techniques
Panel Data 44
Unit of Analysis = District (N=3)
[Figure: Housing Price plotted against Year t (1996, 2001, 2006, 2011), with one line each for District A, District B and District C]

Panel Data Organization 1 45
            1996    2001    2006    2011
District A  Pa1996  Pa2001  Pa2006  Pa2011
District B  Pb1996  Pb2001  Pb2006  Pb2011
District C  Pc1996  Pc2001  Pc2006  Pc2011
“Wide Format”
District    Year  Price   Xit1  Xit2
District A  1996  Pa1996
District B  1996  Pb1996
District C  1996  Pc1996
District A  2001  Pa2001
District B  2001  Pb2001
District C  2001  Pc2001
District A  2006  Pa2006
District B  2006  Pb2006
District C  2006  Pc2006
District A  2011  Pa2011
District B  2011  Pb2011
District C  2011  Pc2011
“Long Format”
District    Year  Price   Xit1  Xit2
District A  1996  Pa1996
District A  2001  Pa2001
District A  2006  Pa2006
District A  2011  Pa2011
District B  1996  Pb1996
District B  2001  Pb2001
District B  2006  Pb2006
District B  2011  Pb2011
District C  1996  Pc1996
District C  2001  Pc2001
District C  2006  Pc2006
District C  2011  Pc2011
“Long Format”
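The move from wide to long format can be sketched as follows. The price entries are the slide's placeholders (Pa1996 and so on) kept as strings, and the field names `district`, `year` and `price` are labels chosen for this illustration.

```python
# Sketch: reshape a wide panel table (one row per district, one column per year)
# into long format (one row per district-year). Price values are placeholders.
YEARS = [1996, 2001, 2006, 2011]

wide = {
    "District A": ["Pa1996", "Pa2001", "Pa2006", "Pa2011"],
    "District B": ["Pb1996", "Pb2001", "Pb2006", "Pb2011"],
    "District C": ["Pc1996", "Pc2001", "Pc2006", "Pc2011"],
}

# Long format sorted by district, then year.
long_rows = [
    {"district": district, "year": year, "price": price}
    for district, prices in wide.items()
    for year, price in zip(YEARS, prices)
]

# The same rows sorted by year, then district.
long_by_year = sorted(long_rows, key=lambda r: (r["year"], r["district"]))

for row in long_rows:
    print(row)
```

Both orderings hold the same 12 district-year cases; regression software generally expects the long layout, with one row per case.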
Balanced Panel Data 48
Sampling matters
Interventions 49
District    Year  Price   Xit1  Xit2
District A  1996  Pa1996  0
District B  1996  Pb1996  0
District C  1996  Pc1996  0
District A  2001  Pa2001  0
District B  2001  Pb2001  0
District C  2001  Pc2001  0
District A  2006  Pa2006  0
District B  2006  Pb2006  1
District C  2006  Pc2006  0
District A  2011  Pa2011  0
District B  2011  Pb2011  1
District C  2011  Pc2011  1
Time Lag 50
Interventions often affect the dependent variable only after a “time lag”.
Think about the impact of transportation investment on property prices: it takes some years after completion to show up.
You may test several different time lags (e.g., 1~5 years) and pick one. But you may not be able to test very long lags, because you would lose a lot of cases for the model.
Time Lag Table 51
District    Year  Price   Xit1  Xit lag 1  Xit lag 2
District A  1996  Pa1996  0     ‐          ‐
District B  1996  Pb1996  0     ‐          ‐
District C  1996  Pc1996  0     ‐          ‐
District A  2001  Pa2001  0     0          ‐
District B  2001  Pb2001  1     0          ‐
District C  2001  Pc2001  0     0          ‐
District A  2006  Pa2006  0     0          0
District B  2006  Pb2006  1     1          0
District C  2006  Pc2006  0     0          0
District A  2011  Pa2011  0     0          0
District B  2011  Pb2011  1     1          1
District C  2011  Pc2011  1     0          0
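The lag columns in this table can be generated by shifting each district's intervention series. The sketch below is one way to do it, treating one “lag” as one data wave (5 years in this table); the function name `lagged` and the use of `None` for the missing entries shown as “‐” are choices made for this illustration.

```python
# Sketch: build lagged intervention dummies within each district.
# One "lag" here means one data wave (5 years in this table).
YEARS = [1996, 2001, 2006, 2011]
intervention = {  # X_it by district, as in the Time Lag Table
    "District A": [0, 0, 0, 0],
    "District B": [0, 1, 1, 1],
    "District C": [0, 0, 0, 1],
}

def lagged(series, lag):
    """Shift a series by `lag` waves; the first `lag` entries have no value."""
    return [None] * lag + series[:len(series) - lag]

table = []
for district, x in intervention.items():
    lag1, lag2 = lagged(x, 1), lagged(x, 2)
    for i, year in enumerate(YEARS):
        table.append((district, year, x[i], lag1[i], lag2[i]))

# Rows with a missing lag cannot enter the regression, so cases are lost.
usable_with_lag2 = [row for row in table if row[4] is not None]

print(len(table), len(usable_with_lag2))
```

With a two-wave lag only 6 of the 12 cases remain usable, which illustrates why very long lags are costly.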
Equation 52
$$P_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + u_{it} + \varepsilon_{it}$$

i = district (i = A, B, C)
t = year (t = 1996, 2001, 2006, 2011)
N = 3 districts x 4 years = 12 cases
$$u_{it} = D_i + T_t$$
$D_i$: District Specific Effects
$T_t$: Time Specific Effects
Year Dummy 53
You may have to consider the “year specific effect Tt” in panel data analysis, using “year dummy (1/0)” variables.
Year effects are usually unobserved trends or phenomena in specific years.
One of the years should be dropped as “a base year”.
Year Dummy Table 54
District    Year  Price   T2001  T2006  T2011
District A  1996  Pa1996  0      0      0
District B  1996  Pb1996  0      0      0
District C  1996  Pc1996  0      0      0
District A  2001  Pa2001  1      0      0
District B  2001  Pb2001  1      0      0
District C  2001  Pc2001  1      0      0
District A  2006  Pa2006  0      1      0
District B  2006  Pb2006  0      1      0
District C  2006  Pc2006  0      1      0
District A  2011  Pa2011  0      0      1
District B  2011  Pb2011  0      0      1
District C  2011  Pc2011  0      0      1
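The year dummy columns can be produced mechanically once a base year is chosen. A minimal sketch, assuming 1996 is the base year as in this table (the function and field names are hypothetical):

```python
# Sketch: create year dummies with 1996 dropped as the base year.
YEARS = [1996, 2001, 2006, 2011]
BASE_YEAR = 1996

def year_dummies(year):
    """Return {"T2001": 0/1, ...} for every year except the base year."""
    return {f"T{y}": int(year == y) for y in YEARS if y != BASE_YEAR}

districts = ("District A", "District B", "District C")
panel = [(d, y) for y in YEARS for d in districts]
rows = [{"district": d, "year": y, **year_dummies(y)} for d, y in panel]

for row in rows:
    print(row)
```

Base-year rows have all three dummies equal to 0, so each year coefficient is read as the difference from 1996.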
Between vs. Within 1 55
[Figure labels: “Difference between Districts” and “Change within a District”]
Three possible cases:
• Both “Between” and “Within” are statistically significant
• “Between” is statistically significant, while “Within” is not
• “Within” is statistically significant, while “Between” is not
“Between” can be estimated by an OLS regression model.
“Within” requires a Fixed Effects (FE) or Random Effects (RE) model (*I recommend using STATA rather than SPSS).
Fixed Effects (FE) Model 60
$$P_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + u_i + \varepsilon_{it}$$
*You cannot include an independent variable $X_i$ that does not change over the time period.
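One common way to estimate a fixed effects model is the “within” transformation: subtract each district's own mean from its X and P values, then run OLS on the demeaned data. The sketch below illustrates this with a single regressor and synthetic, noise-free data; the district effects, X values and the true coefficient of 3 are assumptions made for the example, and statistical packages may compute FE estimates in other equivalent ways.

```python
# Sketch of the "within" (fixed effects) estimator with one regressor.
# Data are synthetic: P_it = u_i + 3 * X_it, with a different u_i per district.
from statistics import mean

TRUE_BETA = 3.0
district_effect = {"A": 100.0, "B": 50.0, "C": 10.0}  # assumed u_i
x_values = {"A": [1.0, 2.0, 4.0, 7.0],
            "B": [2.0, 3.0, 3.0, 6.0],
            "C": [0.0, 1.0, 5.0, 6.0]}
p_values = {d: [district_effect[d] + TRUE_BETA * x for x in xs]
            for d, xs in x_values.items()}

# Demean X and P within each district; this removes u_i from the equation.
x_dm, p_dm = [], []
for d in x_values:
    x_bar, p_bar = mean(x_values[d]), mean(p_values[d])
    x_dm += [x - x_bar for x in x_values[d]]
    p_dm += [p - p_bar for p in p_values[d]]

# OLS slope on the demeaned data (no constant is needed after demeaning).
beta_within = sum(x * p for x, p in zip(x_dm, p_dm)) / sum(x * x for x in x_dm)

# A time-invariant regressor is demeaned to all zeros, so its coefficient
# cannot be estimated: this is why the FE model cannot include it.
z_constant = [5.0, 5.0, 5.0, 5.0]
z_dm = [z - mean(z_constant) for z in z_constant]

print(beta_within, z_dm)
```

The estimate recovers the assumed coefficient even though the three districts have very different price levels, because only changes within each district are used.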
Random Effects (RE) Model 61
$$P_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + (u_i) + \varepsilon_{it}$$
*You can include an independent variable $X_i$ that does not change over the time period.
FE or RE: Hausman Test 62
To decide between fixed and random effects, you can run a Hausman test, where the null hypothesis is that the preferred model is random effects and the alternative is fixed effects. It basically tests whether the unique errors (ui) are correlated with the regressors; the null hypothesis is that they are not.
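For reference, the textbook form of the Hausman statistic compares the two sets of coefficient estimates. In the sketch below, $\hat\beta_{FE}$ and $\hat\beta_{RE}$ denote the FE and RE estimates of the coefficients on the time-varying regressors, and $k$ is the number of such coefficients; this notation is introduced here and does not appear on the slides.

```latex
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[ \widehat{\mathrm{Var}}(\hat\beta_{FE})
         - \widehat{\mathrm{Var}}(\hat\beta_{RE}) \right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE})
\;\sim\; \chi^2_k \quad \text{under } H_0
```

A large H (small p-value) rejects the null hypothesis and points to the fixed effects model; otherwise the random effects model is retained.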
Prices in the time series 63
Monetary values (e.g., property prices) change over time. To compare values from different periods, we need to adjust them to a certain base year (before starting the analysis). The most typical way is to use the “Consumer Price Index (CPI)”. Each country's government usually publishes its own CPI covering the past decades (annually, or sometimes monthly, by goods and services).
Year  Property Price ($M)  CPI‐96  CPI Adjusted Price ($M)*
1996  50                   100     50
2001  53                   113     46
2006  56                   115     50
2011  63                   107     55
*Year 1996 Value
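A common way to carry out this adjustment is to multiply each nominal price by the ratio of the base-year CPI to that year's CPI. The sketch below applies this standard deflation formula to the Year, Property Price and CPI‐96 columns; that the slide intends exactly this formula is an assumption, and because the slide's adjusted column is illustrative, the computed values need not reproduce it digit for digit.

```python
# Sketch: deflate nominal prices to base-year (1996) values with a CPI whose
# base year is set to 100. Inputs are the Year, Price and CPI-96 columns.
BASE_CPI = 100.0

nominal = {1996: 50.0, 2001: 53.0, 2006: 56.0, 2011: 63.0}  # $M
cpi_96 = {1996: 100.0, 2001: 113.0, 2006: 115.0, 2011: 107.0}

def to_base_year(price, cpi):
    """Real price = nominal price * (base-year CPI / CPI of the price's year)."""
    return price * BASE_CPI / cpi

real = {year: to_base_year(nominal[year], cpi_96[year]) for year in nominal}

for year, value in real.items():
    print(year, round(value, 1))
```

The base year is unchanged by construction, and later years are scaled down whenever their CPI is above 100.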
