Regression: Predicting House Prices
Machine Learning Specialization

Utkarsh Kulshrestha

Predicting house prices


How much is my house worth?

Look at recent sales in my neighborhood
• How much did they sell for?

Plot recent house sales (past 2 years)
[Figure: price ($) vs. square feet (sq.ft.)]

Terminology:
• x – feature, covariate, or predictor
• y – observation or response

Predict your house price from similar houses
[Figure: price ($) vs. square feet (sq.ft.)]
• No house sold recently had exactly the same sq.ft.
• Look at average price in range
• Still only 2 houses!
• Throwing out

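The range-averaging idea can be sketched in a few lines of Python. The sales list, the 2,050 sq.ft. target, and the 100 sq.ft. window are all invented for illustration; none of them come from the slides.

```python
# Hypothetical recent sales as (square feet, sale price in $); values are invented.
recent_sales = [
    (1500, 310_000),
    (1850, 395_000),
    (2000, 420_000),
    (2100, 452_000),
    (2600, 540_000),
    (3200, 610_000),
]


def average_price_in_range(sales, target_sqft, window):
    """Average the prices of houses within `window` sq.ft. of `target_sqft`.

    Returns (average, number_of_houses_used); the average is None if no
    house falls inside the range.
    """
    nearby = [price for sqft, price in sales if abs(sqft - target_sqft) <= window]
    if not nearby:
        return None, 0
    return sum(nearby) / len(nearby), len(nearby)


estimate, n_used = average_price_in_range(recent_sales, target_sqft=2050, window=100)
print(f"Estimate from {n_used} nearby houses: ${estimate:,.0f}")
```

With this window only two of the six sales contribute to the estimate; the other four are ignored entirely.
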
Linear regression

Use a linear regression model
[Figure: price ($) vs. square feet (sq.ft.)]
Fit a line through the data:

\[ f_w(x) = w_0 + w_1 x \]

• \(w_0, w_1\): parameters of the model
• \(f_w\): function parameterized by \(w = (w_0, w_1)\)

Which line?
[Figure: price ($) vs. square feet (sq.ft.)]
Different parameters \(w\) give different lines:

\[ f_w(x) = w_0 + w_1 x \]

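As a small illustration of how the choice of parameters changes the line, the sketch below evaluates two candidate parameter settings at the same house size. The two settings and the 2,000 sq.ft. house are hypothetical values chosen only for this example.

```python
def f(w, x):
    """Linear model f_w(x) = w0 + w1 * x, with w = (w0, w1)."""
    w0, w1 = w
    return w0 + w1 * x


# Two hypothetical candidate settings (intercept in $, slope in $ per sq.ft.).
candidates = {
    "line A": (0.0, 200.0),
    "line B": (50_000.0, 150.0),
}

house_sqft = 2000  # hypothetical house size
predictions = {name: f(w, house_sqft) for name, w in candidates.items()}
for name, predicted in predictions.items():
    print(f"{name}: predicted price ${predicted:,.0f}")
```

The same house gets a different predicted price under each line, so some rule is needed for choosing between them.
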
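“Cost” of using a given line
[Figure: price ($) vs. square feet (sq.ft.)]
Residual sum of squares (RSS):

\[
\begin{aligned}
\mathrm{RSS}(w_0, w_1) ={}& \left(\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}]\right)^2 \\
&+ \left(\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}]\right)^2 \\
&+ \left(\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}]\right)^2
\end{aligned}
\]
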
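A minimal sketch of the RSS computation for a three-house data set follows. The three houses and the two candidate lines are invented for illustration.

```python
# Three hypothetical houses as (square feet, sale price in $); values are invented.
houses = [(1000, 200_000), (2000, 410_000), (3000, 590_000)]


def rss(w0, w1, data):
    """Residual sum of squares of the line w0 + w1 * sqft over `data`."""
    return sum((price - (w0 + w1 * sqft)) ** 2 for sqft, price in data)


rss_a = rss(0.0, 200.0, houses)       # hypothetical candidate line A
rss_b = rss(50_000.0, 150.0, houses)  # hypothetical candidate line B
print(f"RSS of line A: {rss_a:.3e}")
print(f"RSS of line B: {rss_b:.3e}")
```

On this data line A has the smaller RSS, so under this cost it is the better of the two candidates.
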
Find “best” line
[Figure: price ($) vs. square feet (sq.ft.)]
Minimize cost over all possible \(w_0, w_1\):

\[ \min_{w_0, w_1} \mathrm{RSS}(w_0, w_1) \]

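The slides do not say how the minimization is carried out. One standard option for a single feature is the closed-form least-squares solution, sketched below with NumPy on invented data; the function names `fit_line` and `rss` are illustrative choices, not names from the course.

```python
import numpy as np

# Hypothetical sales data (invented values).
sqft = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
price = np.array([210_000.0, 290_000.0, 410_000.0, 480_000.0, 610_000.0])


def rss(w0, w1, x, y):
    """Residual sum of squares of the line w0 + w1 * x."""
    return float(np.sum((y - (w0 + w1 * x)) ** 2))


def fit_line(x, y):
    """Closed-form least-squares estimates (w0_hat, w1_hat) for a line."""
    x_mean, y_mean = x.mean(), y.mean()
    w1_hat = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    w0_hat = y_mean - w1_hat * x_mean
    return float(w0_hat), float(w1_hat)


w0_hat, w1_hat = fit_line(sqft, price)
best_rss = rss(w0_hat, w1_hat, sqft, price)
print(f"w0_hat = {w0_hat:.1f}, w1_hat = {w1_hat:.2f}, RSS = {best_rss:.3e}")
```

Nudging either fitted parameter away from its estimate can only raise the RSS on this data, which is what "best" means here.
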
Predicting your house price
[Figure: price ($) vs. square feet (sq.ft.)]
Fitted line:

\[ f_{\hat{w}}(x) = \hat{w}_0 + \hat{w}_1 x \]

Best guess of your house price:

\[ \hat{y} = \hat{w}_0 + \hat{w}_1\,\text{sq.ft.}_{\text{your house}} \]

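As a rough sketch of the prediction step, the snippet below fits a line with `np.polyfit` and plugs in a house size. The data and the 2,200 sq.ft. house are invented, and `np.polyfit` is just one convenient way to obtain the least-squares estimates.

```python
import numpy as np

# Hypothetical sales data (invented values).
sqft = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
price = np.array([210_000.0, 290_000.0, 410_000.0, 480_000.0, 610_000.0])

# np.polyfit with deg=1 returns least-squares coefficients, highest power first.
w1_hat, w0_hat = np.polyfit(sqft, price, deg=1)

your_house_sqft = 2200.0  # hypothetical size of "your house"
y_hat = w0_hat + w1_hat * your_house_sqft
print(f"Predicted price: ${y_hat:,.0f}")
```
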
Adding higher order effects


Fit data with a line or … ?
[Figure: price ($) vs. square feet (sq.ft.)]
You show your friend your analysis

What about a quadratic function?
[Figure: price ($) vs. square feet (sq.ft.)]

\[ f_w(x) = w_0 + w_1 x + w_2 x^2 \]

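To make the comparison between a line and a quadratic concrete, here is a sketch that fits both to the same invented data and compares their RSS. Sizes are expressed in thousands of sq.ft. and prices in thousands of dollars, an assumption made only to keep the numbers well scaled.

```python
import numpy as np

# Hypothetical sales data (invented values); sizes in 1000s of sq.ft.,
# prices in $1000s, to keep the polynomial fit numerically well behaved.
sqft_k = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
price_k = np.array([220.0, 300.0, 390.0, 450.0, 500.0, 520.0])


def fit_and_rss(x, y, degree):
    """Least-squares polynomial fit of the given degree and its RSS."""
    coeffs = np.polyfit(x, y, deg=degree)  # coefficients, highest power first
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float(np.sum(residuals ** 2))


line_coeffs, rss_line = fit_and_rss(sqft_k, price_k, degree=1)
quad_coeffs, rss_quad = fit_and_rss(sqft_k, price_k, degree=2)

w2, w1, w0 = quad_coeffs  # f_w(x) = w0 + w1 * x + w2 * x**2
print(f"line RSS = {rss_line:.1f}, quadratic RSS = {rss_quad:.1f}")
```

Because a line is the special case of a quadratic with w2 = 0, the quadratic's RSS on the same data can never be larger than the line's.
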
Even higher order polynomial
[Figure: price ($) vs. square feet (sq.ft.)]
Do you believe this fit?

Evaluating overfitting via training/test split


Do you believe this fit?
[Figure: price ($) vs. square feet (sq.ft.)]
Minimizes RSS, but bad predictions

What about a quadratic function?
[Figure: price ($) vs. square feet (sq.ft.)]

\[ f_w(x) = w_0 + w_1 x + w_2 x^2 \]

How to choose model order/complexity
• Want good predictions, but can’t observe future
• Simulate predictions
  1. Remove some houses
  2. Fit model on remaining

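The two listed steps can be sketched as follows. The house data, the choice of holding out two houses, the random seed, and the helper names `train_test_split` and `fit_line` are all assumptions made for this example.

```python
import random

# Hypothetical houses as (square feet, sale price in $); values are invented.
houses = [
    (1000, 210_000), (1200, 245_000), (1500, 290_000), (1800, 365_000),
    (2000, 410_000), (2400, 470_000), (2700, 540_000), (3000, 610_000),
]


def train_test_split(data, n_test, seed=0):
    """Randomly hold out `n_test` items; return (training_set, test_set)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(data):
    """Closed-form least-squares line fit; returns (w0_hat, w1_hat)."""
    n = len(data)
    x_mean = sum(x for x, _ in data) / n
    y_mean = sum(y for _, y in data) / n
    w1 = sum((x - x_mean) * (y - y_mean) for x, y in data) / sum(
        (x - x_mean) ** 2 for x, _ in data
    )
    return y_mean - w1 * x_mean, w1


training_set, test_set = train_test_split(houses, n_test=2)  # 1. remove some houses
w0_hat, w1_hat = fit_line(training_set)                      # 2. fit on the remaining
print(len(training_set), "training houses,", len(test_set), "held-out houses")
```

The held-out houses play no part in the fit, so they can stand in for sales the model has not seen.
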
Training/test split
Terminology:
– training set
– test set

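Training error
[Figure: price ($) vs. square feet (sq.ft.)]

\[
\begin{aligned}
\text{Training error}(w) ={}& \left(\$_{\text{train 1}} - f_w(\text{sq.ft.}_{\text{train 1}})\right)^2 \\
&+ \left(\$_{\text{train 2}} - f_w(\text{sq.ft.}_{\text{train 2}})\right)^2 \\
&+ \left(\$_{\text{train 3}} - f_w(\text{sq.ft.}_{\text{train 3}})\right)^2 \\
&+ \dots \quad [\text{include all training houses}]
\end{aligned}
\]
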
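Test error
[Figure: price ($) vs. square feet (sq.ft.)]

\[
\begin{aligned}
\text{Test error}(\hat{w}) ={}& \left(\$_{\text{test 1}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 1}})\right)^2 \\
&+ \left(\$_{\text{test 2}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 2}})\right)^2 \\
&+ \left(\$_{\text{test 3}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 3}})\right)^2 \\
&+ \dots \quad [\text{include all test houses}]
\end{aligned}
\]
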
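The sketch below computes both errors for a line fitted on training houses only. The training and test data are invented, and `np.polyfit` is used as one convenient way to get the fitted parameters.

```python
import numpy as np

# Hypothetical training and test houses (invented values).
train_sqft = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
train_price = np.array([210_000.0, 290_000.0, 410_000.0, 480_000.0, 610_000.0])
test_sqft = np.array([1200.0, 2200.0, 2800.0])
test_price = np.array([250_000.0, 430_000.0, 565_000.0])


def sum_squared_error(w, sqft, price):
    """Sum over houses of (price - f_w(sqft))^2 for the line f_w."""
    w0, w1 = w
    return float(np.sum((price - (w0 + w1 * sqft)) ** 2))


# Fit on the training houses only; np.polyfit returns highest power first.
w1_hat, w0_hat = np.polyfit(train_sqft, train_price, deg=1)
w_hat = (w0_hat, w1_hat)

training_error = sum_squared_error(w_hat, train_sqft, train_price)
test_error = sum_squared_error(w_hat, test_sqft, test_price)
print(f"training error = {training_error:.3e}, test error = {test_error:.3e}")
```

Both errors use the same fitted parameters; only the set of houses being summed over changes.
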
Training/Test Curves
[Figure: error vs. model complexity]

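One way to trace such curves is to fit polynomials of increasing degree on the training set and record both errors at each degree. This is a sketch on invented data, with polynomial degree standing in for model complexity; sizes are in thousands of sq.ft. and prices in thousands of dollars.

```python
import numpy as np

# Hypothetical data (invented values); sizes in 1000s of sq.ft., prices in $1000s.
train_x = np.array([1.0, 1.3, 1.6, 2.0, 2.3, 2.7, 3.0, 3.4])
train_y = np.array([215.0, 262.0, 318.0, 395.0, 428.0, 490.0, 512.0, 548.0])
test_x = np.array([1.2, 1.9, 2.5, 3.2])
test_y = np.array([250.0, 370.0, 465.0, 530.0])


def sum_squared_error(coeffs, x, y):
    """Sum of squared prediction errors of a fitted polynomial."""
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))


degrees = [1, 2, 3, 4]
training_errors, test_errors = [], []
for degree in degrees:
    coeffs = np.polyfit(train_x, train_y, deg=degree)  # fit on training data only
    training_errors.append(sum_squared_error(coeffs, train_x, train_y))
    test_errors.append(sum_squared_error(coeffs, test_x, test_y))

for degree, tr, te in zip(degrees, training_errors, test_errors):
    print(f"degree {degree}: training error = {tr:.1f}, test error = {te:.1f}")
```

Training error cannot increase as the degree grows, since each lower-degree polynomial is a special case of the next. Test error carries no such guarantee.
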
Adding other features


Predictions just based on house size
[Figure: price ($) vs. square feet (sq.ft.)]

Add more features
[Figure: price ($) vs. square feet (sq.ft.)]

\[ f_w(x) = w_0 + w_1\,\text{sq.ft.} + w_2\,\#\text{bath} \]

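A two-feature model of this form can be fitted with ordinary least squares. The sketch below uses `np.linalg.lstsq` on synthetic data: the house sizes, bathroom counts, and the "true" weights used to generate prices are all invented so that the recovered weights can be checked.

```python
import numpy as np

# Hypothetical houses: square feet and number of bathrooms (invented values).
sqft = np.array([1000.0, 1500.0, 1800.0, 2000.0, 2500.0, 3000.0])
n_bath = np.array([1.0, 2.0, 1.0, 2.0, 3.0, 2.0])

# Synthetic prices generated from known weights, so the fit can be checked.
true_w = np.array([50_000.0, 150.0, 20_000.0])  # w0, w1 ($/sq.ft.), w2 ($/bathroom)
X = np.column_stack([np.ones_like(sqft), sqft, n_bath])  # columns: 1, sq.ft., #bath
price = X @ true_w

# Least-squares fit of f_w(x) = w0 + w1 * sq.ft. + w2 * #bath.
w_hat, *_ = np.linalg.lstsq(X, price, rcond=None)

new_house = np.array([1.0, 2000.0, 2.0])  # leading 1 multiplies the intercept w0
predicted_price = float(new_house @ w_hat)
print(f"w_hat = {w_hat}, predicted price = ${predicted_price:,.0f}")
```

Real sales data would not lie exactly on a plane, so the fitted weights would then be estimates rather than exact recoveries.
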
How many features to use?
• Possible choices:
- Square feet
- # bathrooms
- # bedrooms
- Lot size
- Year built
- …
• See Regression Course!
