Machine Learning and Data Analytics

Fundamentals – Part 3

Dr. Rossana Cavagnini

Deutsche Post Chair – Optimization of Distribution Networks (DPO)


RWTH Aachen University

mlda@dpo.rwth-aachen.de

Agenda

1 Assessing model accuracy


Training vs test errors
The bias-variance trade-off

Assessing model accuracy

Measuring the quality of fit (regression)

Performance of a learning method: how well its predictions actually match the
observed data
Training Mean Squared Error: use the same data used to train the model
1 Fit our learning method to the training observations {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
2 Obtain the estimate f̂ and compute f̂(x_1), f̂(x_2), ..., f̂(x_n)
3 Compute:
\[ \text{Training MSE} = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - \hat{f}(x_i) \bigr)^2 \]

- Small: predictions are close to the true responses (f̂(x_i) ≈ y_i)
- Large: predictions differ substantially from the true responses
The Training MSE is not useful for assessing prediction on new data (a model can simply memorize the training data); it is only used to obtain good parameters.

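A minimal sketch of this computation in Python (assuming NumPy and scikit-learn are available; the synthetic data and the linear model are illustrative assumptions, not part of the slides):

# Minimal sketch: fit a model on training data and compute its Training MSE.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 100, size=50).reshape(-1, 1)            # training inputs x_1, ..., x_n
y = 5 + 0.05 * x.ravel() + rng.normal(scale=0.5, size=50)  # training responses y_1, ..., y_n

model = LinearRegression().fit(x, y)       # step 1: fit the learning method
y_hat = model.predict(x)                   # step 2: compute f_hat(x_1), ..., f_hat(x_n)
training_mse = np.mean((y - y_hat) ** 2)   # step 3: (1/n) * sum_i (y_i - f_hat(x_i))^2
print(f"Training MSE: {training_mse:.4f}")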

Test Mean Squared Error: use previously unseen observations to check performance
1 Fit our learning method to the training observations {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
2 Get (x_0, y_0): a previously unseen test observation (not used to train the model)
3 Compute:
\[ \text{Test MSE} = \operatorname{Ave}\bigl( \hat{f}(x_0) - y_0 \bigr)^2 \]
What if no test observations are available?
Should we simply choose the learning method that minimizes the Training MSE?
No: there is no guarantee that the method with the lowest Training MSE will also have the lowest Test MSE!

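A minimal sketch of the Test MSE computation, assuming some observations are held out and never used for fitting (synthetic data; scikit-learn's train_test_split is one convenient way to do the split):

# Minimal sketch: estimate the Test MSE on held-out observations (x_0, y_0).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=200).reshape(-1, 1)
y = np.sin(x.ravel()) + rng.normal(scale=0.3, size=200)

# Hold out 30% of the observations; they are not used to train the model.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(x_train, y_train)
train_mse = np.mean((y_train - model.predict(x_train)) ** 2)
test_mse = np.mean((y_test - model.predict(x_test)) ** 2)   # Ave((f_hat(x_0) - y_0)^2)
print(f"Training MSE: {train_mse:.3f}, Test MSE: {test_mse:.3f}")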

[Figure: Left panel: Y versus X, with the true function in black, the linear regression fit in orange, and two smoothing spline fits in blue and green. Right panel: Mean Squared Error versus Flexibility, with the Training MSE in grey, the Test MSE in red, and the irreducible error Var(ε) as a dotted line.]

Training MSE decreases as the method becomes more flexible
Test MSE initially decreases with flexibility, but then increases
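
The qualitative behaviour shown in the figure can be reproduced with a small experiment: fit models of increasing flexibility and track both errors. Using the polynomial degree as a proxy for flexibility is an assumption of this sketch:

# Sketch: Training MSE vs Test MSE as flexibility (here, polynomial degree) increases.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=150).reshape(-1, 1)
y = np.sin(x.ravel()) + rng.normal(scale=0.3, size=150)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.4, random_state=0)

for degree in (1, 3, 5, 10, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(x_tr, y_tr)
    mse_tr = np.mean((y_tr - model.predict(x_tr)) ** 2)  # keeps decreasing with flexibility
    mse_te = np.mean((y_te - model.predict(x_te)) ** 2)  # typically U-shaped in flexibility
    print(f"degree {degree:2d}: Training MSE {mse_tr:.3f}, Test MSE {mse_te:.3f}")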

The bias-variance trade-off


Our predictor f̂ depends on:
1 the parameters found by solving the training problem
2 the points selected for the training set
Two sources of randomness: ε and the training set
\[ \underbrace{E\bigl[(y_0 - \hat{f}(x_0))^2\bigr]}_{\text{expected test MSE}} = \underbrace{\mathrm{Var}\bigl(\hat{f}(x_0)\bigr)}_{\text{variance of } \hat{f}(x_0)} + \underbrace{\bigl[\mathrm{Bias}\bigl(\hat{f}(x_0)\bigr)\bigr]^2}_{\text{squared bias of } \hat{f}(x_0)} + \underbrace{\mathrm{Var}(\epsilon)}_{\text{variance of the error term}} \]

Variance: the amount by which f̂ would change if it were estimated with a different training set (more flexible methods have higher variance)
Bias: the error introduced by approximating a complex real-life problem with a simpler model (more flexible methods have lower bias)
To minimize the expected test error, select a learning method with low variance and
low bias
Bad news: models with low variance tend to have high bias (and vice versa)
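
The decomposition can be illustrated empirically by repeatedly drawing training sets from a known data-generating process and inspecting the spread and the systematic offset of f̂(x_0) at a fixed test point. The true function, noise level, and model choices below are assumptions made only for this illustration:

# Sketch: empirical bias-variance decomposition at a single test point x_0.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
def f_true(x):                      # assumed true function f
    return np.sin(x)
sigma = 0.3                         # noise standard deviation, Var(eps) = sigma**2
x0, n, reps = 1.5, 50, 500          # test point, training-set size, number of training sets

for degree in (1, 5):               # a rigid and a more flexible method
    preds = []
    for _ in range(reps):
        x = rng.uniform(-3, 3, size=n).reshape(-1, 1)      # a fresh training set each time
        y = f_true(x.ravel()) + rng.normal(scale=sigma, size=n)
        fit = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(x, y)
        preds.append(fit.predict(np.array([[x0]]))[0])
    preds = np.array(preds)
    variance = preds.var()                       # Var(f_hat(x_0)) across training sets
    bias_sq = (preds.mean() - f_true(x0)) ** 2   # [Bias(f_hat(x_0))]^2
    print(f"degree {degree}: variance {variance:.4f}, squared bias {bias_sq:.4f}, "
          f"expected test MSE ~ {variance + bias_sq + sigma**2:.4f}")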

A model with high variance and low bias is an overfitted model (the model is too complex and too specific to the training data, so it does not generalize well).
A model with low variance and high bias is an underfitted model (the model is not complex enough): this helps the model generalize, but it introduces a high bias, so there is a systematic deviation between the predicted values and the true ones.
From a practical point of view:
Overfitting an estimator is easy: add new parameters (e.g., add quadratic terms to a linear regression)
Making an overfitted model generalize better is hard (regularization may help).

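A minimal sketch of these two practical points: adding many polynomial terms overfits easily, while a regularized fit on the same features (Ridge is used here as one standard option; the penalty strength is an arbitrary assumption) usually generalizes better:

# Sketch: overfitting by adding polynomial terms, and regularization as a remedy.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, size=80).reshape(-1, 1)
y = np.sin(x.ravel()) + rng.normal(scale=0.3, size=80)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.5, random_state=0)

# Same high-degree features; the only difference is the penalty on the coefficients.
overfit = make_pipeline(PolynomialFeatures(15), StandardScaler(), LinearRegression())
ridge = make_pipeline(PolynomialFeatures(15), StandardScaler(), Ridge(alpha=1.0))

for name, model in (("no penalty", overfit), ("ridge", ridge)):
    model.fit(x_tr, y_tr)
    mse_tr = np.mean((y_tr - model.predict(x_tr)) ** 2)
    mse_te = np.mean((y_te - model.predict(x_te)) ** 2)
    print(f"{name:10s}: Training MSE {mse_tr:.3f}, Test MSE {mse_te:.3f}")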

Measuring the quality of fit (classification)

y_i is no longer numerical
Training error rate:
\[ \frac{1}{n} \sum_{i=1}^{n} I(y_i \neq \hat{y}_i) \]
where \( I(y_i \neq \hat{y}_i) \) is the indicator function, equal to 1 if \( y_i \neq \hat{y}_i \) and 0 otherwise
Test error rate:
\[ \operatorname{Ave}\bigl( I(y_0 \neq \hat{y}_0) \bigr) \]

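A minimal sketch for the classification case, computing both error rates via the indicator function; the k-nearest-neighbour classifier and the synthetic labels are illustrative assumptions:

# Sketch: training and test error rates, (1/n) * sum_i I(y_i != y_hat_i) and Ave(I(y_0 != y_hat_0)).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)  # binary labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
train_error = np.mean(clf.predict(X_tr) != y_tr)   # fraction of misclassified training points
test_error = np.mean(clf.predict(X_te) != y_te)    # fraction of misclassified test points
print(f"Training error rate: {train_error:.3f}, Test error rate: {test_error:.3f}")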
