Regression Performance

고태훈 (thoon.koh@gmail.com)
Regression performance
v Example: Predict a baby’s weight (kg)

Age   Actual weight (y)   Predicted weight (ŷ)
1      5.6                 6.0
2      6.9                 6.4
3     10.4                10.9
4     13.7                12.4
5     17.4                15.6
6     20.7                21.5
7     23.5                23.0

[Figure: actual weight (y) and predicted weight (ŷ) plotted against age (1 to 7); vertical axis from 0 to 25]

v Average error
▶ Indicates whether the predictions are, on average, too high or too low. With the error defined as y - ŷ, a positive value means the model under-predicts on average.

$$\text{Average error} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i) = \frac{2.4}{7} \approx 0.343$$

(computed over the n = 7 rows of the baby-weight example)
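
The average-error calculation can be reproduced with a few lines of plain Python. This is a minimal sketch using the seven rows of the baby-weight example; the names `actual`, `predicted`, and `average_error` are illustrative and do not come from the slides.

```python
# Actual (y) and predicted (y_hat) weights in kg from the baby-weight example.
actual = [5.6, 6.9, 10.4, 13.7, 17.4, 20.7, 23.5]
predicted = [6.0, 6.4, 10.9, 12.4, 15.6, 21.5, 23.0]


def average_error(y, y_hat):
    """Mean of the signed errors y - y_hat (hypothetical helper name)."""
    return sum(a - p for a, p in zip(y, y_hat)) / len(y)


avg_err = average_error(actual, predicted)
print(f"Average error: {avg_err:.3f}")
```

Because positive and negative errors cancel, a value near zero does not by itself mean the individual predictions are accurate.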

v Mean absolute error (MAE)
▶ Gives the average magnitude of the errors, ignoring their sign

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| = \frac{5.8}{7} \approx 0.829$$

(computed over the n = 7 rows of the baby-weight example)
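
As a rough check of the MAE figure, the sketch below recomputes it from the baby-weight example. The function name `mean_absolute_error` and the variable names are assumptions made for illustration.

```python
# Actual (y) and predicted (y_hat) weights in kg from the baby-weight example.
actual = [5.6, 6.9, 10.4, 13.7, 17.4, 20.7, 23.5]
predicted = [6.0, 6.4, 10.9, 12.4, 15.6, 21.5, 23.0]


def mean_absolute_error(y, y_hat):
    """Mean of |y - y_hat|; the sign of each error is discarded."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)


mae_value = mean_absolute_error(actual, predicted)
print(f"MAE: {mae_value:.3f}")
```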

v Mean absolute percentage error (MAPE)
▶ Gives a percentage score of how predictions deviate (on average)
from the actual values.

$$\text{MAPE} = 100\% \times \frac{1}{n}\sum_{i=1}^{n}\frac{\left|y_i - \hat{y}_i\right|}{\left|y_i\right|} = 6.43\%$$

(computed over the n = 7 rows of the baby-weight example)
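
The MAPE value can be recomputed the same way. This is a minimal sketch; the helper name `mape` is illustrative, and the guard against zero actual values is an added assumption, since the formula is undefined when any y is 0.

```python
# Actual (y) and predicted (y_hat) weights in kg from the baby-weight example.
actual = [5.6, 6.9, 10.4, 13.7, 17.4, 20.7, 23.5]
predicted = [6.0, 6.4, 10.9, 12.4, 15.6, 21.5, 23.0]


def mape(y, y_hat):
    """Mean absolute percentage error, returned in percent."""
    if any(a == 0 for a in y):
        # Each term divides by the actual value, so zeros are not allowed.
        raise ValueError("MAPE is undefined when an actual value is 0")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(y, y_hat)) / len(y)


mape_value = mape(actual, predicted)
print(f"MAPE: {mape_value:.2f}%")
```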

v Mean squared error (MSE) and root MSE (RMSE)
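▶ MSE: the average of the squared errors; large errors are penalized more heavily than small ones.

▶ RMSE: the square root of MSE, expressed in the same units as the predicted variable; it is closely related to the standard error of the estimate.

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \frac{6.48}{7} \approx 0.926$$

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \approx 0.962$$

(computed over the n = 7 rows of the baby-weight example)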
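
The sketch below recomputes MSE and RMSE for the baby-weight example. The names `mean_squared_error`, `mse_value`, and `rmse_value` are illustrative choices, not part of the slides.

```python
import math

# Actual (y) and predicted (y_hat) weights in kg from the baby-weight example.
actual = [5.6, 6.9, 10.4, 13.7, 17.4, 20.7, 23.5]
predicted = [6.0, 6.4, 10.9, 12.4, 15.6, 21.5, 23.0]


def mean_squared_error(y, y_hat):
    """Mean of the squared errors (y - y_hat)^2."""
    return sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y)


mse_value = mean_squared_error(actual, predicted)
rmse_value = math.sqrt(mse_value)  # back in kg, the unit of the target
print(f"MSE: {mse_value:.3f}, RMSE: {rmse_value:.3f}")
```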

v Train error
▶ How well does the regression model fit the data?

▶ Goodness-of-fit

v Test error
▶ How well does the regression model predict?

▶ Predictive performance

Train error and test error

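[Figure: workflow for computing train error and test error]

Training data (X, Y) → fit the linear regression model → predict Y values → Ŷ of training data → calculate “train error” using Y and Ŷ

Test data (X, Y) → apply the fitted linear regression model → predict Y values → Ŷ of test data → calculate “test error” using Y and Ŷ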
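
The train/test workflow can be sketched end to end on the baby-weight data. Two things here are assumptions rather than anything stated in the slides: the split (first five rows for training, last two for testing) and the model (a one-variable least-squares line of weight on age). MSE is used as the error measure, and all function names are illustrative.

```python
# Ages and actual weights (kg) from the baby-weight example.
ages = [1, 2, 3, 4, 5, 6, 7]
weights = [5.6, 6.9, 10.4, 13.7, 17.4, 20.7, 23.5]

# Assumed split: first five rows are training data, last two are test data.
x_train, y_train = ages[:5], weights[:5]
x_test, y_test = ages[5:], weights[5:]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def mse(y, y_hat):
    """Mean of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y)


# The model is fitted on the training data only.
intercept, slope = fit_line(x_train, y_train)


def predict(x):
    """Predicted Y values from the fitted line."""
    return [intercept + slope * xi for xi in x]


train_error = mse(y_train, predict(x_train))  # goodness-of-fit
test_error = mse(y_test, predict(x_test))     # predictive performance
print(f"train MSE = {train_error:.4f}, test MSE = {test_error:.4f}")
```

On this particular split the test MSE (about 0.45) comes out higher than the train MSE (about 0.39), which is the usual pattern: the model has already seen the training rows.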
Evaluation of regression model in statistics
v Akaike Information Criterion (AIC)

$$\text{AIC} = n \ln\left(\frac{SSE_k}{n}\right) + 2k$$

v Bayesian Information Criterion (BIC)

$$\text{BIC} = n \ln\left(\frac{SSE_k}{n}\right) + k \ln(n)$$

v Adjusted R²: the ordinary R² adjusted for the number of variables

$$\text{Adjusted-}R^2 = 1 - \left(\frac{n-1}{n-k-1}\right)(1 - R^2)$$

v Mallows's C_k

$$C_k = \frac{SSE_k}{s^2} - (n - 2k)$$

n: number of samples
k: number of selected variables
SSE_k: sum of squared errors of the regression model with k variables
s²: estimate of the error variance, usually the mean squared error of the full regression model
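
The four criteria translate directly into small functions. This is a sketch only: the function names are illustrative, `s2` is taken to be an estimate of the error variance from the full model (the usual reading of Mallows's statistic, assumed here), and the numbers in the usage lines are made-up inputs chosen purely to exercise the functions, not results from the slides.

```python
import math


def aic(n, sse_k, k):
    """AIC = n * ln(SSE_k / n) + 2k."""
    return n * math.log(sse_k / n) + 2 * k


def bic(n, sse_k, k):
    """BIC = n * ln(SSE_k / n) + k * ln(n)."""
    return n * math.log(sse_k / n) + k * math.log(n)


def adjusted_r2(n, k, r2):
    """Adjusted R^2 = 1 - ((n - 1) / (n - k - 1)) * (1 - R^2)."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)


def mallows_ck(n, k, sse_k, s2):
    """C_k = SSE_k / s^2 - (n - 2k)."""
    return sse_k / s2 - (n - 2 * k)


# Hypothetical inputs for illustration only (not taken from the slides).
n, k = 100, 3
sse_k, r2, s2 = 250.0, 0.80, 2.4

print("AIC:", aic(n, sse_k, k))
print("BIC:", bic(n, sse_k, k))
print("Adjusted R^2:", adjusted_r2(n, k, r2))
print("Mallows C_k:", mallows_ck(n, k, sse_k, s2))
```

AIC and BIC share the same fit term and differ only in the penalty, 2k versus k ln(n), so BIC penalizes extra variables more heavily once ln(n) exceeds 2, that is, from n = 8 upward.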
v Y = 5, 6, 7, 8

v Y’ predicted by Model 1 = 6, 5, 8, 13

v Y’ predicted by Model 2 = 7, 8, 5, 10

v Model 1: MAE = 2, MSE = 7

v Model 2: MAE = 2, MSE = 4
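
The two-model comparison can be verified with the short sketch below (helper names are illustrative). Both models have the same total absolute error, but Model 1 concentrates it in a single miss of 5, which squaring turns into 25, so its MSE is higher even though the MAE values are equal.

```python
# Actual values and the two sets of predictions from the example.
y = [5, 6, 7, 8]
model_1 = [6, 5, 8, 13]
model_2 = [7, 8, 5, 10]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


mae_1, mse_1 = mae(y, model_1), mse(y, model_1)
mae_2, mse_2 = mae(y, model_2), mse(y, model_2)

print(f"Model 1: MAE = {mae_1}, MSE = {mse_1}")
print(f"Model 2: MAE = {mae_2}, MSE = {mse_2}")
```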
