
Advice for applying machine learning
Deciding what to try next
Machine Learning

Debugging a learning algorithm:
Suppose you have implemented regularized linear regression to
predict housing prices.

However, when you test your hypothesis on a new set of
houses, you find that it makes unacceptably large errors in its
predictions.
What should you try next?
- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing the regularization parameter $\lambda$
- Try increasing the regularization parameter $\lambda$

Andrew Ng

Machine learning diagnostic:
Diagnostic: A test that you can run to gain
insight into what is or isn’t working with a learning
algorithm, and to gain guidance on how best
to improve its performance.
Diagnostics can take time to implement, but
doing so can be a very good use of your
time.


Advice for applying machine learning
Evaluating a hypothesis

Evaluating your hypothesis
[Figure: price plotted against size.]
Fails to generalize to new examples not in the training set.
Features: size of house, no. of bedrooms, no. of floors, age of house, average income in neighborhood, kitchen size

Evaluating your hypothesis
Dataset:

Size    Price
2104    400
1600    330
2400    369
1416    232
3000    540
1985    300
1534    315
1427    199
1380    212
1494    243

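Training/testing procedure for linear regression
- Learn parameter $\theta$ from training data (minimizing training error $J(\theta)$)
- Compute test set error: $J_{test}(\theta) = \frac{1}{2 m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2$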
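The procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's implementation: it reuses the ten (size, price) rows from the dataset slide, and the 70/30 split, the division of size by 1000, and the helper names `design` and `squared_error` are all assumptions made for the example. The rows are taken in their listed order; with real data it is usually safer to shuffle before splitting.

```python
import numpy as np

# Ten (size, price) examples from the dataset slide.
size = np.array([2104, 1600, 2400, 1416, 3000, 1985, 1534, 1427, 1380, 1494], dtype=float)
price = np.array([400, 330, 369, 232, 540, 300, 315, 199, 212, 243], dtype=float)

def design(x):
    """Build the design matrix: an intercept column of ones plus size / 1000."""
    return np.column_stack([np.ones_like(x), x / 1000.0])

def squared_error(theta, X, y):
    """Squared-error cost: 1/(2m) * sum((X @ theta - y)^2)."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

# Assumed 70/30 split: first 7 rows for training, last 3 for testing.
X_train, y_train = design(size[:7]), price[:7]
X_test, y_test = design(size[7:]), price[7:]

# Learn theta by minimizing the training error (ordinary least squares).
theta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

j_train = squared_error(theta, X_train, y_train)
j_test = squared_error(theta, X_test, y_test)
print(f"J_train = {j_train:.1f}, J_test = {j_test:.1f}")
```

The test rows are never seen while fitting `theta`, which is what makes `j_test` usable as an estimate of how the hypothesis generalizes.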

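Training/testing procedure for logistic regression
- Learn parameter $\theta$ from training data
- Compute test set error: $J_{test}(\theta) = -\frac{1}{m_{test}} \sum_{i=1}^{m_{test}} \left[ y_{test}^{(i)} \log h_\theta(x_{test}^{(i)}) + (1 - y_{test}^{(i)}) \log\left(1 - h_\theta(x_{test}^{(i)})\right) \right]$
- Misclassification error (0/1 misclassification error): $err(h_\theta(x), y) = 1$ if $h_\theta(x) \ge 0.5$ and $y = 0$, or if $h_\theta(x) < 0.5$ and $y = 1$; otherwise $0$. Test error $= \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} err(h_\theta(x_{test}^{(i)}), y_{test}^{(i)})$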
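As a rough sketch of the two test-set measures for logistic regression, the snippet below computes both the log-loss and the 0/1 misclassification error. The parameter vector and the four test examples are made-up toy values, and the function name `logistic_test_errors` is hypothetical; the 0.5 threshold is the usual convention for turning a probability into a class label.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping real scores to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_test_errors(theta, X_test, y_test):
    """Return (log-loss, 0/1 misclassification error) on a test set."""
    h = sigmoid(X_test @ theta)
    # Clip probabilities so log() never receives exactly 0 or 1.
    h_safe = np.clip(h, 1e-12, 1.0 - 1e-12)
    log_loss = -np.mean(y_test * np.log(h_safe) + (1 - y_test) * np.log(1 - h_safe))
    # Predict 1 when h >= 0.5, else 0; count the fraction of wrong labels.
    predictions = (h >= 0.5).astype(int)
    misclassification = np.mean(predictions != y_test)
    return float(log_loss), float(misclassification)

# Toy values, assumed for illustration only (first column is the intercept).
theta = np.array([-1.0, 2.0])
X_test = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 0.2]])
y_test = np.array([0, 1, 1, 1])

log_loss, misclassification = logistic_test_errors(theta, X_test, y_test)
print(f"log-loss = {log_loss:.3f}, misclassification = {misclassification:.2f}")
```

In this toy set only the last example is misclassified (its score is below 0.5 while its label is 1), so the misclassification error is one in four.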

Advice for applying machine learning
Model selection and training/validation/test sets

Overfitting example
[Figure: price plotted against size.]
Once parameters were fit to some set of data (the training set), the error of the parameters as measured on that data (the training error $J(\theta)$) is likely to be lower than the actual generalization error.

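Model selection
1. $h_\theta(x) = \theta_0 + \theta_1 x$
2. $h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2$
3. $h_\theta(x) = \theta_0 + \theta_1 x + \cdots + \theta_3 x^3$
...
10. $h_\theta(x) = \theta_0 + \theta_1 x + \cdots + \theta_{10} x^{10}$

Choose the model whose fitted parameters give the lowest test set error.
How well does the model generalize? Report test set error $J_{test}(\theta)$.
Problem: $J_{test}(\theta)$ is likely to be an optimistic estimate of generalization error, i.e. our extra parameter ($d$ = degree of polynomial) is fit to the test set.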

Evaluating your hypothesis
Dataset: the same ten (size, price) examples listed earlier, now divided into training, cross-validation, and test sets.

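Train/validation/test error
Training error: $J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Cross validation error: $J_{cv}(\theta) = \frac{1}{2 m_{cv}} \sum_{i=1}^{m_{cv}} \left( h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)} \right)^2$
Test error: $J_{test}(\theta) = \frac{1}{2 m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2$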

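Model selection
1. $h_\theta(x) = \theta_0 + \theta_1 x$
2. $h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2$
3. $h_\theta(x) = \theta_0 + \theta_1 x + \cdots + \theta_3 x^3$
...
10. $h_\theta(x) = \theta_0 + \theta_1 x + \cdots + \theta_{10} x^{10}$

Pick the model whose fitted parameters give the lowest cross-validation error $J_{cv}(\theta)$.
Estimate generalization error for the test set: $J_{test}(\theta)$ for the picked model.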
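The selection loop over polynomial degrees might look like the sketch below. The data is synthetic (a noisy quadratic generated with a fixed seed), and the 60/20/20 split, the variable names, and the helper `half_mse` are assumptions for illustration. The degree is chosen on the cross-validation set only, and the test set is touched once at the end.

```python
import numpy as np

# Synthetic data, assumed for illustration: a noisy quadratic.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0, 0.1, 60)

# Assumed 60/20/20 split into training, cross-validation, and test sets.
x_train, y_train = x[:36], y[:36]
x_cv, y_cv = x[36:48], y[36:48]
x_test, y_test = x[48:], y[48:]

def half_mse(coeffs, x, y):
    """Squared-error cost 1/(2m) * sum((h(x) - y)^2) for a fitted polynomial."""
    residual = np.polyval(coeffs, x) - y
    return float(residual @ residual) / (2 * len(y))

# Fit one polynomial per degree d = 1..10 on the training set only.
fits = {d: np.polyfit(x_train, y_train, d) for d in range(1, 11)}

# Pick the degree with the lowest cross-validation error.
cv_errors = {d: half_mse(coeffs, x_cv, y_cv) for d, coeffs in fits.items()}
best_d = min(cv_errors, key=cv_errors.get)

# Report the test error of the picked model as the generalization estimate.
j_test = half_mse(fits[best_d], x_test, y_test)
print(f"picked degree d = {best_d}, J_test = {j_test:.4f}")
```

Because `best_d` was never tuned against `x_test`, the reported `j_test` avoids the optimism that comes from choosing the degree on the test set.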

Advice for applying machine learning
Diagnosing bias vs. variance

Bias/variance
[Figure: three plots of price against size.]
- High bias (underfit)
- “Just right”
- High variance (overfit)

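Bias/variance
Training error: $J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Cross validation error: $J_{cv}(\theta) = \frac{1}{2 m_{cv}} \sum_{i=1}^{m_{cv}} \left( h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)} \right)^2$
[Figure: error plotted against degree of polynomial $d$.]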

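Diagnosing bias vs. variance
Suppose your learning algorithm is performing less well than you were hoping ($J_{cv}(\theta)$ or $J_{test}(\theta)$ is high). Is it a bias problem or a variance problem?
[Figure: training error and cross validation error plotted against degree of polynomial $d$.]
Bias (underfit): $J_{train}(\theta)$ will be high, and $J_{cv}(\theta) \approx J_{train}(\theta)$.
Variance (overfit): $J_{train}(\theta)$ will be low, and $J_{cv}(\theta) \gg J_{train}(\theta)$.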
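One way to turn this diagnosis into a quick check is sketched below. The function `diagnose`, its `target_error` argument, and the `gap_ratio` threshold of 2 are hypothetical choices made for this example; the slides give only the qualitative rule (both errors high and close together suggests bias, low training error with much higher cross-validation error suggests variance).

```python
def diagnose(j_train, j_cv, target_error, gap_ratio=2.0):
    """Label a (training error, cross-validation error) pair.

    target_error: the largest error considered acceptable (assumed input).
    gap_ratio: how many times larger than j_train the cross-validation
        error must be before the gap counts as "large" (assumed threshold).
    """
    if j_cv <= target_error:
        return "ok"
    if j_train > target_error:
        # Training error is already high: the model underfits.
        if j_cv > gap_ratio * j_train:
            return "high bias and high variance"
        return "high bias"
    # Training error is low but cross-validation error is not: overfit.
    return "high variance"

print(diagnose(j_train=10.0, j_cv=11.0, target_error=1.0))
print(diagnose(j_train=0.1, j_cv=5.0, target_error=1.0))
```

The thresholds would need tuning for a real problem; the point is only that the comparison uses two numbers that are cheap to compute.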

Advice for applying machine learning
Regularization and bias/variance

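Linear regression with regularization
Model: hypothesis $h_\theta(x)$ fit by minimizing the regularized cost
$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$
[Figure: three plots of price against size.]
- Large $\lambda$: high bias (underfit)
- Intermediate $\lambda$: “just right”
- Small $\lambda$: high variance (overfit)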

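Choosing the regularization parameter
Model: hypothesis $h_\theta(x)$ fit by minimizing the regularized cost $J(\theta)$.
Steps 1 to 12: try a different candidate value of $\lambda$ at each step. For each candidate, minimize $J(\theta)$ and evaluate the cross-validation error $J_{cv}(\theta)$ of the resulting parameters.
Pick (say) the candidate with the lowest cross-validation error.
Test error: $J_{test}(\theta)$ for the picked parameters.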
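A possible sketch of this search follows, using the regularized normal equation to fit each candidate. Everything specific here is an assumption for illustration: the synthetic data, the degree-6 polynomial features, the 30/10/10 split, the helper names, and the grid of 12 candidates (0, then 0.01 doubling up to 10.24). The cross-validation and test errors are computed without the regularization term, so they measure fit alone.

```python
import numpy as np

def poly_features(x, degree):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit_regularized(X, y, lam):
    """Minimize 1/(2m)*sum((X@theta - y)^2) + lam/(2m)*sum(theta[1:]^2)."""
    penalty = np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # the intercept theta_0 is not regularized
    return np.linalg.solve(X.T @ X + lam * penalty, X.T @ y)

def squared_error(theta, X, y):
    """Unregularized cost 1/(2m) * sum((X @ theta - y)^2)."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

# Synthetic data, assumed for illustration.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 50)
y = np.sin(3 * x) + rng.normal(0, 0.2, 50)
X = poly_features(x, 6)
X_train, y_train = X[:30], y[:30]
X_cv, y_cv = X[30:40], y[30:40]
X_test, y_test = X[40:], y[40:]

# Assumed grid of 12 candidates: 0, 0.01, 0.02, 0.04, ..., 10.24.
lambdas = [0.0] + [0.01 * 2**k for k in range(11)]

# Fit on the training set for each lambda, then score on the CV set.
thetas = [fit_regularized(X_train, y_train, lam) for lam in lambdas]
cv_errors = [squared_error(theta, X_cv, y_cv) for theta in thetas]

best = int(np.argmin(cv_errors))
j_test = squared_error(thetas[best], X_test, y_test)
print(f"picked lambda = {lambdas[best]}, J_test = {j_test:.4f}")
```

Doubling the candidate each step covers several orders of magnitude with few fits, which is why a geometric grid is a common choice here.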

Bias/variance as a function of the regularization parameter

Advice for applying machine learning
Learning curves

Learning curves
[Figure: error plotted against training set size.]

High bias
[Figures: error plotted against training set size; price plotted against size.]
If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.

High variance (and small $\lambda$)
[Figures: error plotted against training set size; price plotted against size.]
If a learning algorithm is suffering from high variance, getting more training data is likely to help.
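A learning curve can be computed by refitting on the first m training examples for growing m and recording both errors each time. The sketch below is one way to do that under stated assumptions: the data is synthetic (a noisy quadratic), the model is deliberately a straight line so that it underfits, and the names `learning_curve` and `squared_error` are made up for the example.

```python
import numpy as np

def squared_error(theta, X, y):
    """Cost 1/(2m) * sum((X @ theta - y)^2)."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def learning_curve(X_train, y_train, X_cv, y_cv):
    """Training and cross-validation error for training set sizes 1..m."""
    train_errors, cv_errors = [], []
    for m in range(1, len(y_train) + 1):
        # Fit on the first m training examples only.
        theta, *_ = np.linalg.lstsq(X_train[:m], y_train[:m], rcond=None)
        # Training error is measured on those same m examples;
        # cross-validation error always uses the full CV set.
        train_errors.append(squared_error(theta, X_train[:m], y_train[:m]))
        cv_errors.append(squared_error(theta, X_cv, y_cv))
    return train_errors, cv_errors

# Synthetic data, assumed for illustration: a line fit to a quadratic.
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 80)
y = x**2 + rng.normal(0, 0.05, 80)
X = np.column_stack([np.ones_like(x), x])

train_errors, cv_errors = learning_curve(X[:60], y[:60], X[60:], y[60:])
for m in (1, 5, 20, 60):
    print(f"m = {m:2d}: J_train = {train_errors[m - 1]:.4f}, J_cv = {cv_errors[m - 1]:.4f}")
```

Plotting `train_errors` and `cv_errors` against m gives the curves described on these slides; the shape of the gap between them is what distinguishes the high-bias case from the high-variance case.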

Advice for applying machine learning
Deciding what to try next (revisited)

Debugging a learning algorithm:
Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?
- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing the regularization parameter $\lambda$
- Try increasing the regularization parameter $\lambda$

Neural networks and overfitting
“Small” neural network (fewer parameters; more prone to underfitting). Computationally cheaper.
“Large” neural network (more parameters; more prone to overfitting). Computationally more expensive. Use regularization ($\lambda$) to address overfitting.