Machine Learning and Data Analytics

Fundamentals – Part 3

Dr. Rossana Cavagnini

Deutsche Post Chair – Optimization of Distribution Networks (DPO)


RWTH Aachen University

mlda@dpo.rwth-aachen.de

Agenda

1 Assessing model accuracy


Training vs test errors
The bias-variance trade-off

Assessing model accuracy

Measuring the quality of fit (regression)

Performance of a learning method: how well its predictions actually match the
observed data
Training Mean Squared Error: use the same data used to train the model
1 Fit our learning method to the training observations {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
2 Obtain the estimate f̂ and compute f̂(x_1), f̂(x_2), ..., f̂(x_n)
3 Compute:
\[ \text{Training MSE} = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - \hat{f}(x_i) \bigr)^2 \]

- Small: predictions are close to the true responses (f̂(x_i) ≈ y_i)
- Large: predictions differ substantially from the true responses
The Training MSE is not useful for assessing prediction on new data (a model can simply memorize the training data); it is only used to obtain good parameters.

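A minimal sketch of this computation in Python (assuming NumPy and scikit-learn are available; the synthetic data and the linear model are illustrative assumptions, not part of the slides):

# Minimal sketch: fit a model on training data and compute its Training MSE.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 100, size=50).reshape(-1, 1)            # training inputs x_1, ..., x_n
y = 5 + 0.05 * x.ravel() + rng.normal(scale=0.5, size=50)  # training responses y_1, ..., y_n

model = LinearRegression().fit(x, y)       # step 1: fit the learning method
y_hat = model.predict(x)                   # step 2: compute f_hat(x_1), ..., f_hat(x_n)
training_mse = np.mean((y - y_hat) ** 2)   # step 3: (1/n) * sum_i (y_i - f_hat(x_i))^2
print(f"Training MSE: {training_mse:.4f}")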

Test Mean Squared Error: use previously unseen observations to check performance
1 Fit our learning method to the training observations {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
2 Get (x_0, y_0): a previously unseen test observation (not used to train the model)
3 Compute:
\[ \text{Test MSE} = \operatorname{Ave}\bigl( \hat{f}(x_0) - y_0 \bigr)^2 \]
What if no test observations are available?
Should we simply choose the learning method that minimizes the Training MSE?
No: there is no guarantee that the method with the lowest Training MSE will also have the lowest Test MSE!

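A minimal sketch of the Test MSE computation, assuming some observations are held out and never used for fitting (synthetic data; scikit-learn's train_test_split is one convenient way to do the split):

# Minimal sketch: estimate the Test MSE on held-out observations (x_0, y_0).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=200).reshape(-1, 1)
y = np.sin(x.ravel()) + rng.normal(scale=0.3, size=200)

# Hold out 30% of the observations; they are not used to train the model.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(x_train, y_train)
train_mse = np.mean((y_train - model.predict(x_train)) ** 2)
test_mse = np.mean((y_test - model.predict(x_test)) ** 2)   # Ave((f_hat(x_0) - y_0)^2)
print(f"Training MSE: {train_mse:.3f}, Test MSE: {test_mse:.3f}")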

[Figure: Left panel: Y versus X, with the true function in black, the linear regression fit in orange, and two smoothing spline fits in blue and green. Right panel: Mean Squared Error versus Flexibility, with the Training MSE in grey, the Test MSE in red, and the irreducible error Var(ε) as a dotted line.]

Training MSE decreases as the method becomes more flexible
Test MSE initially decreases with flexibility, but then increases
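
The qualitative behaviour shown in the figure can be reproduced with a small experiment: fit models of increasing flexibility and track both errors. Using the polynomial degree as a proxy for flexibility is an assumption of this sketch:

# Sketch: Training MSE vs Test MSE as flexibility (here, polynomial degree) increases.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=150).reshape(-1, 1)
y = np.sin(x.ravel()) + rng.normal(scale=0.3, size=150)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.4, random_state=0)

for degree in (1, 3, 5, 10, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(x_tr, y_tr)
    mse_tr = np.mean((y_tr - model.predict(x_tr)) ** 2)  # keeps decreasing with flexibility
    mse_te = np.mean((y_te - model.predict(x_te)) ** 2)  # typically U-shaped in flexibility
    print(f"degree {degree:2d}: Training MSE {mse_tr:.3f}, Test MSE {mse_te:.3f}")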

The bias-variance trade-off


Our predictor f̂ depends on:
1 the parameters found by solving the training problem
2 the points selected for the training set
Two sources of randomness: ε and the training set
\[ \underbrace{E\bigl[(y_0 - \hat{f}(x_0))^2\bigr]}_{\text{expected test MSE}} = \underbrace{\mathrm{Var}\bigl(\hat{f}(x_0)\bigr)}_{\text{variance of } \hat{f}(x_0)} + \underbrace{\bigl[\mathrm{Bias}\bigl(\hat{f}(x_0)\bigr)\bigr]^2}_{\text{squared bias of } \hat{f}(x_0)} + \underbrace{\mathrm{Var}(\epsilon)}_{\text{variance of the error term}} \]

Variance: the amount by which f̂ would change if it were estimated with a different training set (more flexible methods have higher variance)
Bias: the error introduced by approximating a complex real-life problem with a simpler model (more flexible methods have lower bias)
To minimize the expected test error, select a learning method with low variance and
low bias
Bad news: models with low variance tend to have high bias (and vice versa)
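
The decomposition can be illustrated empirically by repeatedly drawing training sets from a known data-generating process and inspecting the spread and the systematic offset of f̂(x_0) at a fixed test point. The true function, noise level, and model choices below are assumptions made only for this illustration:

# Sketch: empirical bias-variance decomposition at a single test point x_0.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
def f_true(x):                      # assumed true function f
    return np.sin(x)
sigma = 0.3                         # noise standard deviation, Var(eps) = sigma**2
x0, n, reps = 1.5, 50, 500          # test point, training-set size, number of training sets

for degree in (1, 5):               # a rigid and a more flexible method
    preds = []
    for _ in range(reps):
        x = rng.uniform(-3, 3, size=n).reshape(-1, 1)      # a fresh training set each time
        y = f_true(x.ravel()) + rng.normal(scale=sigma, size=n)
        fit = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(x, y)
        preds.append(fit.predict(np.array([[x0]]))[0])
    preds = np.array(preds)
    variance = preds.var()                       # Var(f_hat(x_0)) across training sets
    bias_sq = (preds.mean() - f_true(x0)) ** 2   # [Bias(f_hat(x_0))]^2
    print(f"degree {degree}: variance {variance:.4f}, squared bias {bias_sq:.4f}, "
          f"expected test MSE ~ {variance + bias_sq + sigma**2:.4f}")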

A model with high variance and low bias is an overfitted model (the model is too complex and too specific to the training data, so it does not generalize well).
A model with low variance and high bias is an underfitted model (the model is not complex enough): this helps the model generalize, but it introduces a high bias, so there is a systematic deviation between the predicted values and the true ones.
From a practical point of view:
Overfitting an estimator is easy: add new parameters (e.g., add quadratic terms to a linear regression)
Making an overfitted model generalize better is hard (regularization may help).

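A minimal sketch of these two practical points: adding many polynomial terms overfits easily, while a regularized fit on the same features (Ridge is used here as one standard option; the penalty strength is an arbitrary assumption) usually generalizes better:

# Sketch: overfitting by adding polynomial terms, and regularization as a remedy.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(4)
x = rng.uniform(-3, 3, size=80).reshape(-1, 1)
y = np.sin(x.ravel()) + rng.normal(scale=0.3, size=80)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.5, random_state=0)

# Same high-degree features; the only difference is the penalty on the coefficients.
overfit = make_pipeline(PolynomialFeatures(15), StandardScaler(), LinearRegression())
ridge = make_pipeline(PolynomialFeatures(15), StandardScaler(), Ridge(alpha=1.0))

for name, model in (("no penalty", overfit), ("ridge", ridge)):
    model.fit(x_tr, y_tr)
    mse_tr = np.mean((y_tr - model.predict(x_tr)) ** 2)
    mse_te = np.mean((y_te - model.predict(x_te)) ** 2)
    print(f"{name:10s}: Training MSE {mse_tr:.3f}, Test MSE {mse_te:.3f}")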

Measuring the quality of fit (classification)

y_i is no longer numerical
Training error rate:
\[ \frac{1}{n} \sum_{i=1}^{n} I(y_i \neq \hat{y}_i) \]
where \( I(y_i \neq \hat{y}_i) \) is the indicator function, equal to 1 if \( y_i \neq \hat{y}_i \) and 0 otherwise
Test error rate:
\[ \operatorname{Ave}\bigl( I(y_0 \neq \hat{y}_0) \bigr) \]

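A minimal sketch for the classification case, computing both error rates via the indicator function; the k-nearest-neighbour classifier and the synthetic labels are illustrative assumptions:

# Sketch: training and test error rates, (1/n) * sum_i I(y_i != y_hat_i) and Ave(I(y_0 != y_hat_0)).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)  # binary labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
train_error = np.mean(clf.predict(X_tr) != y_tr)   # fraction of misclassified training points
test_error = np.mean(clf.predict(X_te) != y_te)    # fraction of misclassified test points
print(f"Training error rate: {train_error:.3f}, Test error rate: {test_error:.3f}")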
