Regression 2021 Partie1 Anglais Annote PDF
Simon Malinowski
2 Linear regression
House   House size (sq. feet)   Year built   House price (target, in k€)
1       80                      1985         120
2       92                      2010         180
3       75                      2008         145
4       110                     2015         220
5       85                      2000         140
6       150                     1975         225
7       55                      1992         100
8       105                     1999         ??
9       95                      2018         ??
Simple VS multiple
→ simple regression: one feature variable (X) to predict Y
→ multiple regression: several feature variables (X1, ..., Xp) to predict Y
Linear VS Non-linear
Size Price
80 120
92 180
75 145
110 220
85 140
150 225
55 100
105 ??
95 ??
Year Price
1985 120
2010 180
2008 145
2015 220
2000 140
1975 225
1992 100
1999 ??
2018 ??
2 Linear regression
Data, problem
Linear model
Estimation of the coefficients
Coefficient of determination (R-squared)
Statistical tests about regression models
Estimating the general performance of a model
$I_t = \sum_{i=1}^{n} (y_i - \bar{y})^2$
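As a small numerical illustration, the quantity $I_t$ can be computed on the seven houses whose price is known. This is only a sketch: it assumes that $I_t$ is the total sum of squared deviations of the prices from their mean, and the function name `total_inertia` is chosen here for illustration.

```python
# Prices (in thousands of euros) of the seven houses with a known price.
prices = [120, 180, 145, 220, 140, 225, 100]


def total_inertia(y):
    """Return I_t = sum_i (y_i - mean(y))^2."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)


i_t = total_inertia(prices)
print(f"mean price = {sum(prices) / len(prices):.2f}, I_t = {i_t:.2f}")
```

$I_t$ depends only on the target values, not on any model: it measures how much the prices vary around their mean.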
$Y \simeq \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p$
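To make the formula concrete, the sketch below evaluates the linear model for one house. The coefficient values are hypothetical and serve only to show the arithmetic; they are not fitted to the data, and the helper name `predict` is an assumption of this example.

```python
def predict(beta, x):
    """Evaluate beta[0] + beta[1]*x[0] + ... + beta[p]*x[p-1]."""
    if len(beta) != len(x) + 1:
        raise ValueError("beta needs one intercept plus one coefficient per feature")
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))


# Hypothetical coefficients, chosen only to illustrate the formula (NOT fitted).
beta = [-1000.0, 1.5, 0.5]  # beta_0, beta_1 (house size), beta_2 (year built)
house_8 = [105, 1999]       # size and year built of house 8 in the table

print(f"model output for house 8: {predict(beta, house_8):.1f}")
```

With p = 1 the same function covers simple regression; with p = 2 (size and year built) it is a multiple regression.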
→ Least-squares regression
$I_r(\beta_0, \beta_1) = \sum_{i=1}^{n} ((\beta_0 + \beta_1 x_i) - y_i)^2$
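The sketch below minimises $I_r$ for the simple regression of price on house size, using the standard closed-form least-squares solution (slope = S_xy / S_xx, intercept = mean(y) - slope * mean(x)). The function names are illustrative, and the slides may derive or compute the coefficients differently.

```python
# Size and price (in thousands of euros) of the seven houses with a known price.
sizes = [80, 92, 75, 110, 85, 150, 55]
prices = [120, 180, 145, 220, 140, 225, 100]


def fit_simple(x, y):
    """Return (b0, b1) minimising sum_i ((b0 + b1*x_i) - y_i)^2."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residual_inertia(b0, b1, x, y):
    """Return I_r(b0, b1) as defined above."""
    return sum(((b0 + b1 * xi) - yi) ** 2 for xi, yi in zip(x, y))


b0, b1 = fit_simple(sizes, prices)
i_r = residual_inertia(b0, b1, sizes, prices)
print(f"b0 = {b0:.2f}, b1 = {b1:.4f}, I_r = {i_r:.2f}")

# Predictions for the two houses whose price is unknown (houses 8 and 9).
for size in (105, 95):
    print(f"size {size}: predicted price = {b0 + b1 * size:.1f}")
```

Any other pair of coefficients gives a larger $I_r$ on these seven points, which is what "least squares" means here.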
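[Figure: Student t distribution with n − 2 degrees of freedom. The central region between $-t_{0.025}^{(n-2)}$ and $t_{0.025}^{(n-2)}$ has probability 95%. The two tails beyond $-|b|/\sigma_B$ and $|b|/\sigma_B$ each have probability $p_c/2$.]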
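As a sketch of how the statistic $|b|/\sigma_B$ could be compared with the 95% threshold, the code below runs the usual two-sided t-test of the slope for the regression of price on size. It assumes that $b$ is the estimated slope, that $\sigma_B$ is its estimated standard error, and that the null hypothesis is "slope = 0"; the slides may set the test up differently. The threshold 2.571 is the tabulated value of $t_{0.025}^{(5)}$, since n − 2 = 5 here.

```python
import math

sizes = [80, 92, 75, 110, 85, 150, 55]
prices = [120, 180, 145, 220, 140, 225, 100]

# Tabulated Student quantile t_{0.025} for n - 2 = 5 degrees of freedom.
T_CRIT = 2.571


def slope_t_statistic(x, y):
    """Return (b, sigma_b, t) for the least-squares slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # Residual sum of squares, then the estimated residual variance on n - 2 dof.
    i_r = sum(((a + b * xi) - yi) ** 2 for xi, yi in zip(x, y))
    sigma_b = math.sqrt(i_r / (n - 2) / s_xx)
    return b, sigma_b, b / sigma_b


b, sigma_b, t_stat = slope_t_statistic(sizes, prices)
reject = abs(t_stat) > T_CRIT
print(f"b = {b:.4f}, sigma_B = {sigma_b:.4f}, |b|/sigma_B = {abs(t_stat):.2f}")
print("slope significantly different from 0 at the 5% level:", reject)
```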
Are the coefficient of determination, $I_r$, or the statistical tests suited to telling us whether a given model will make good predictions?
Size Price
80 120
92 180
75 145
110 220
85 140
150 225
55 100
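[Figure: the data set is split K times into a training part and a test part. Split k gives an error estimate $\hat{e}_k$, for k = 1, ..., K.]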
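The train/test idea can be sketched as a K-fold cross-validation on the size/price data. This is one possible reading of the figure, not necessarily the exact procedure used in the slides: the error measure (mean squared error on the test part), the way folds are formed (every K-th point), and the function names are all assumptions of this example.

```python
sizes = [80, 92, 75, 110, 85, 150, 55]
prices = [120, 180, 145, 220, 140, 225, 100]


def fit_simple(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def kfold_errors(x, y, k):
    """Return the k test errors e_1, ..., e_k (mean squared error per fold)."""
    n = len(x)
    errors = []
    for fold in range(k):
        # Point i goes to the test part of fold (i mod k), else to training.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        b0, b1 = fit_simple([x[i] for i in train_idx], [y[i] for i in train_idx])
        mse = sum(((b0 + b1 * x[i]) - y[i]) ** 2 for i in test_idx) / len(test_idx)
        errors.append(mse)
    return errors


errors = kfold_errors(sizes, prices, k=3)
cv_error = sum(errors) / len(errors)
print("per-fold test errors:", [round(e, 1) for e in errors])
print(f"average test error: {cv_error:.1f}")
```

Unlike $I_r$ or R-squared computed on the data used for fitting, each error here is measured on points the model did not see during estimation, so it is the more relevant indicator of predictive performance.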