Computer Engineering
Unit #2
Evaluating a Learning Algorithm: Deciding what to try next

For logistic regression, the cross-entropy cost function is defined as

J(θ) = −(1/m) Σ_{i=1..m} [ y⁽ⁱ⁾ log h_θ(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − h_θ(x⁽ⁱ⁾)) ]
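As a quick illustration, the cross-entropy cost can be computed with NumPy (a minimal sketch; the data and parameter values here are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    # Logistic function: h_theta(x) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy_cost(theta, X, y):
    # J(theta) = -(1/m) * sum[ y*log(h) + (1-y)*log(1-h) ]
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Tiny illustrative dataset: a bias column plus one feature, four samples
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.zeros(2)
# With theta = 0, h = 0.5 for every sample, so J = -log(0.5) = log 2 ≈ 0.693
print(cross_entropy_cost(theta, X, y))
```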
Model selection: Issue with train/test split
Why Train/Validation/Test Splits?

Suppose we have n candidate models with varying hyperparameters (such as the number of polynomial terms in a regression, or the number of hidden layers and neurons in a neural network), and we choose the one that reports the lowest test error after training.

Would it be correct to report this error as an indicator of the generalized performance of the selected model? Many practitioners do report this metric as the model's performance, but it is advised against. Just as the model parameters were fit to the training samples and therefore report a lower error on the training set, the hyperparameters were fit to the test set and will likewise report an optimistically low error.
To overcome this issue, it is recommended to split the dataset into three parts: train, cross-validation (or validation), and test. The training set is used to optimize the model parameters, and the cross-validation set is used to select the best model among the candidates with varying hyperparameters. Finally, the generalized performance is calculated on the test set, which is not seen during training or model selection. This gives a truly unbiased report of the model's performance metrics, because the hyperparameters were never tuned on the test set.
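The three-way workflow above can be sketched as follows (a minimal NumPy sketch on synthetic data; the candidate models are illustrative polynomial fits):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression problem: y = sin(x) + noise
x = rng.uniform(0, 3, 300)
y = np.sin(x) + 0.1 * rng.normal(size=300)

# 60/20/20 split into train / cross-validation / test
idx = rng.permutation(300)
tr, cv, te = idx[:180], idx[180:240], idx[240:]

def fit_poly(deg):
    # Model parameters (polynomial coefficients) are fit on the TRAIN set only
    return np.polyfit(x[tr], y[tr], deg)

def mse(coeffs, subset):
    return np.mean((np.polyval(coeffs, x[subset]) - y[subset]) ** 2)

# The hyperparameter (polynomial degree) is selected on the CROSS-VALIDATION set
degrees = [1, 2, 3, 5, 9]
models = {d: fit_poly(d) for d in degrees}
best = min(degrees, key=lambda d: mse(models[d], cv))

# Generalized performance is reported on the TEST set, untouched until now
print("chosen degree:", best, "test MSE:", mse(models[best], te))
```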
Model selection: solution to the issue with train/test split
Errors In Machine Learning

There are mainly two types of errors in machine learning:
Reducible errors: these errors can be reduced to improve the model's accuracy. They can further be classified into bias and variance.
Irreducible errors: these errors arise from noise inherent in the data itself and cannot be removed, whichever model is used.
Different Combinations of Bias-Variance

1. Low-Bias, Low-Variance: the combination of low bias and low variance is the ideal machine learning model. However, it is rarely achievable in practice.
2. Low-Bias, High-Variance: predictions are inconsistent but accurate on average. This occurs when the model has a large number of parameters, which leads to overfitting.
3. High-Bias, Low-Variance: predictions are consistent but inaccurate on average. This occurs when the model does not learn the training data well or uses too few parameters, which leads to underfitting.
4. High-Bias, High-Variance: predictions are both inconsistent and inaccurate on average.
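These regimes can be probed numerically: refit a model on many fresh training sets and watch its prediction at one fixed point. The systematic gap from the truth is the bias; the spread across refits is the variance. A minimal NumPy sketch, with an illustrative target function:

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    # Illustrative ground-truth function
    return np.sin(2 * x)

def bias_and_variance(deg, x0=1.0, trials=200, n=30):
    # Refit a degree-`deg` polynomial on `trials` fresh noisy training
    # sets and collect its prediction at the fixed point x0
    preds = []
    for _ in range(trials):
        x = rng.uniform(0, 3, n)
        y = true_f(x) + 0.2 * rng.normal(size=n)
        coeffs = np.polyfit(x, y, deg)
        preds.append(np.polyval(coeffs, x0))
    preds = np.array(preds)
    bias = abs(preds.mean() - true_f(x0))   # systematic error of the average fit
    return bias, preds.var()                # spread of the fits = variance

bias_simple, var_simple = bias_and_variance(deg=1)    # simple model: high bias, low variance
bias_complex, var_complex = bias_and_variance(deg=9)  # flexible model: low bias, high variance
print(bias_simple, var_simple)
print(bias_complex, var_complex)
```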
Overfitting and Underfitting

Overfitting occurs when the machine learning model tries to fit all the data points (or more of them than necessary) in the given dataset. As a result, the model starts capturing the noise and inaccurate values present in the data, and these factors reduce its efficiency and accuracy. An overfitted model has low bias and high variance. The chance of overfitting increases the longer we train the model: the more we train, the more likely an overfitted model becomes.
Underfitting

Underfitting occurs when the machine learning model is not able to capture the underlying trend of the data. To avoid overfitting, the feeding of training data can be stopped at an early stage, but then the model may not learn enough from the training data and may fail to capture the dominant trend. In the case of underfitting, the model cannot learn enough from the training data, which reduces accuracy and produces unreliable predictions. An underfitted model has high bias and low variance.
Diagnosing Bias Vs Variance
Bias-Variance Trade-Off

While building a machine learning model, it is important to take care of bias and variance in order to avoid overfitting and underfitting. If the model is very simple, with few parameters, it may have low variance and high bias; whereas if the model has a large number of parameters, it will have high variance and low bias. A balance must therefore be struck between the bias error and the variance error, and this balance is known as the bias-variance trade-off.
For accurate predictions, an algorithm needs both low variance and low bias. But the two cannot be minimized simultaneously, because they are related to each other:
➢ If we decrease the variance, the bias will increase.
➢ If we decrease the bias, the variance will increase.
So we need to find a sweet spot between bias and variance to build an optimal model. The bias-variance trade-off is about finding that sweet spot and balancing the two errors.
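The trade-off shows up directly if we sweep model complexity and compare training and validation error (a minimal NumPy sketch on synthetic data; the degrees and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy samples of a smooth target function
x = rng.uniform(-1, 1, 40)
y = np.cos(3 * x) + 0.2 * rng.normal(size=40)
x_val = rng.uniform(-1, 1, 200)
y_val = np.cos(3 * x_val) + 0.2 * rng.normal(size=200)

def errs(deg):
    # Fit a polynomial of the given degree, return (train MSE, validation MSE)
    c = np.polyfit(x, y, deg)
    train = np.mean((np.polyval(c, x) - y) ** 2)
    val = np.mean((np.polyval(c, x_val) - y_val) ** 2)
    return train, val

# Training error keeps falling as complexity grows,
# while validation error is U-shaped (underfit -> sweet spot -> overfit)
for d in [1, 3, 9, 15]:
    tr_e, va_e = errs(d)
    print(f"degree {d:2d}: train MSE {tr_e:.3f}, validation MSE {va_e:.3f}")
```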
Bias/Variance as a function of Regularization parameter λ
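How bias and variance shift with λ can be sketched with ridge regression, whose closed-form solution (XᵀX + λI)⁻¹Xᵀy makes the sweep easy. A minimal NumPy sketch; the data, polynomial degree, and λ grid are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# A deliberately flexible polynomial model of a noisy sine, so that
# the regularization strength lambda controls the bias/variance balance
x = rng.uniform(-1, 1, 30)
y = np.sin(3 * x) + 0.2 * rng.normal(size=30)
x_cv = rng.uniform(-1, 1, 200)
y_cv = np.sin(3 * x_cv) + 0.2 * rng.normal(size=200)

def design(xs, deg=10):
    # Polynomial feature matrix [1, x, x^2, ..., x^deg]
    return np.vander(xs, deg + 1, increasing=True)

def ridge_fit(lam):
    # Closed-form ridge solution: (X^T X + lam*I)^{-1} X^T y
    X = design(x)
    I = np.eye(X.shape[1])
    return np.linalg.solve(X.T @ X + lam * I, X.T @ y)

def mse(w, xs, ys):
    return np.mean((design(xs) @ w - ys) ** 2)

# Small lambda -> low bias, high variance (overfit);
# large lambda -> high bias, low variance (underfit)
for lam in [0.0, 1e-4, 1e-2, 1.0, 100.0]:
    w = ridge_fit(lam)
    print(f"lambda={lam:g}: train MSE {mse(w, x, y):.3f}, CV MSE {mse(w, x_cv, y_cv):.3f}")
```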
Learning Curves
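A learning curve tracks training and cross-validation error as the number of training examples m grows. A minimal NumPy sketch for a deliberately simple (high-bias) model, on illustrative synthetic data:

```python
import numpy as np

rng = np.random.default_rng(4)

x_all = rng.uniform(-1, 1, 400)
y_all = np.sin(3 * x_all) + 0.2 * rng.normal(size=400)
x_cv, y_cv = x_all[300:], y_all[300:]  # held-out cross-validation set

def curve_point(m, deg=2):
    # Fit on the first m training examples; measure both errors
    c = np.polyfit(x_all[:m], y_all[:m], deg)
    train = np.mean((np.polyval(c, x_all[:m]) - y_all[:m]) ** 2)
    cv = np.mean((np.polyval(c, x_cv) - y_cv) ** 2)
    return train, cv

# For a high-bias model, training error rises with m, CV error falls,
# and both plateau at a high value - more data alone does not help
for m in [5, 10, 20, 50, 100, 300]:
    tr_e, cv_e = curve_point(m)
    print(f"m={m:3d}: train MSE {tr_e:.3f}, CV MSE {cv_e:.3f}")
```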
Building a Spam Classifier: Prioritizing what to work on
Supervised learning formulation:
x = features of the email
y = spam (1) or not spam (0)

Feature x: choose 100 words indicative of spam / not spam.
E.g., indicative words in spam emails: Buy, Discount, Deal; indicative words in non-spam emails: Andrew, Now.

Step 1: Arrange the 100 indicative words in alphabetical order, e.g. Andrew, Buy, Deal, Discount, Now.
Step 2: Check the email, map its words against the list from Step 1, and define the feature vector X, where each entry is 1 if the corresponding word appears in the email and 0 otherwise:
X = [0 1 1 0 0 …]
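Steps 1 and 2 can be sketched in a few lines of Python (the vocabulary here is just the five example words; a real classifier would use around 100):

```python
# Step 1: vocabulary of indicative words, sorted alphabetically
vocab = ["andrew", "buy", "deal", "discount", "now"]

def email_to_features(email_text):
    # Step 2: x_j = 1 if vocabulary word j appears in the email, else 0
    words = set(email_text.lower().split())
    return [1 if w in words else 0 for w in vocab]

email = "Deal of the day: buy one get one free"
print(email_to_features(email))  # -> [0, 1, 1, 0, 0]
```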
Error Analysis
Thank you
Any Queries..??