Unit III
This happens because, in cases like these, our models don’t learn but
instead memorize; hence, they cannot generalize well on unseen data.
To get started, let’s define some important terms:
• Learning: ML model learning is concerned with the accurate
prediction of future data, not necessarily the accurate
prediction of training/available data.
Evaluation Metrics:
Evaluation metrics are tied to machine learning tasks. There are
different metrics for the tasks of classification, regression,
ranking, clustering, topic modeling, etc. Some metrics, such as
precision-recall, are useful for multiple tasks. Classification,
regression, and ranking are examples of supervised learning,
which constitutes a majority of machine learning applications.
In machine learning, there is always a need to test the stability of the
model. We cannot judge a model based only on the training dataset,
because its performance on the data it was fit to says little about unseen
data. For this purpose, we reserve a particular sample of the dataset that
was not part of the training dataset, and we test our model on that sample
before deployment. Repeating this hold-out process systematically, so that
each part of the data serves as the test sample in turn, is called
cross-validation. This is different from a single, general train-test split.
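The process above can be sketched with scikit-learn; the dataset and model here are illustrative, not taken from the notes:

```python
# Minimal sketch: a single train-test split versus 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=500, random_state=0)

# Plain train-test split: one reserved sample, used once.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 5-fold cross-validation: the training data is split into 5 folds, and
# each fold is held out for testing exactly once.
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X_train, y_train, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

Averaging the fold scores gives a more stable estimate of generalization than a single split.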
Model Accuracy:
Let’s try calculating the accuracy of this model on the above dataset,
given the following results:
In the above case, let’s define TP, TN, FP, and FN:
• TP (true positive): an actual positive case correctly predicted as positive
• TN (true negative): an actual negative case correctly predicted as negative
• FP (false positive): an actual negative case incorrectly predicted as positive
• FN (false negative): an actual positive case incorrectly predicted as negative
This highly accurate model may not be useful, as it is unable to identify
the actual cancer patients; hence, it can have the worst possible
consequences. So, in these types of scenarios, how can we trust machine
learning models?
Accuracy alone doesn’t tell the full story when we’re working with
a class-imbalanced dataset like this one, where there’s a significant
disparity between the number of positive and negative labels.
Let’s try to measure precision and recall for our cancer prediction
use case:
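A small numeric sketch makes the contrast concrete. The confusion-matrix counts below are made up for illustration (the actual numbers from the slides are not reproduced in these notes), but they mimic a class-imbalanced cancer dataset:

```python
# Illustrative counts for an imbalanced dataset: 90 actual positives
# (cancer patients) out of 1000, and a model that finds very few of them.
TP, TN, FP, FN = 8, 900, 10, 82

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)  # of all predicted positives, how many are real
recall = TP / (TP + FN)     # of all actual positives, how many are found

print(f"accuracy:  {accuracy:.3f}")   # high, despite the model being useless
print(f"precision: {precision:.3f}")
print(f"recall:    {recall:.3f}")     # very low: most patients are missed
```

Accuracy comes out above 90% while recall is below 10%, which is exactly the failure mode described above.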
Each statistic can be calculated using the log-likelihood for a model and the data. Log-
likelihood comes from Maximum Likelihood Estimation, a technique for finding or
optimizing the parameters of a model in response to a training dataset.
Akaike Information Criterion
The Akaike Information Criterion, or AIC for short, is a method for scoring and
selecting a model.
It is named for the developer of the method, Hirotugu Akaike, and can be
shown to have a basis in information theory and frequentist inference.
The AIC statistic is defined for logistic regression as follows (taken from “The Elements
of Statistical Learning”):
• AIC = -2/N * LL + 2 * k/N
Where N is the number of examples in the training dataset, LL is the log-likelihood of
the model, and k is the number of parameters in the model. The score as defined is
minimized, i.e. the model with the lowest AIC is selected.
The BIC statistic is calculated for logistic regression as follows (taken from “The
Elements of Statistical Learning“):
• BIC = -2 * LL + log(N) * k
Where log() is the natural logarithm (base e), LL is the log-likelihood of the model,
N is the number of examples in the training dataset, and k is the number of
parameters in the model.
The score as defined above is minimized, i.e. the model with the lowest BIC is
selected.
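The two criteria can be computed directly from the log-likelihood; a minimal sketch, with the input values chosen purely for illustration:

```python
import math


def aic(n, log_likelihood, k):
    # AIC in the scaled form used in "The Elements of Statistical Learning".
    return -2.0 / n * log_likelihood + 2.0 * k / n


def bic(n, log_likelihood, k):
    # BIC = -2 * LL + log(N) * k, with the natural logarithm.
    return -2.0 * log_likelihood + math.log(n) * k


# Illustrative values: 100 training examples, log-likelihood -40, 3 parameters.
print("AIC:", aic(100, -40.0, 3))
print("BIC:", bic(100, -40.0, 3))
```

Both scores penalize model complexity through k; BIC's log(N) factor penalizes extra parameters more heavily as the dataset grows, so it tends to prefer simpler models than AIC.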
Minimum Description Length
The Minimum Description Length, or MDL for short, is a method for scoring and
selecting a model.
It is named for the field of study from which it was derived, namely information theory.
From an information theory perspective, we may want to transmit both the predictions
(or more precisely, their probability distributions) and the model used to generate them.
Both the predicted target variable and the model can be described in terms of the number
of bits required to transmit them on a noisy channel.
The Minimum Description Length is the minimum number of bits, i.e. the minimum of
the sum of the number of bits required to represent the data and the model:
• MDL = L(h) + L(D | h)
Where h is the model, D is the predictions made by the model, L(h) is the number of
bits required to represent the model, and L(D | h) is the number of bits required to
represent the predictions from the model on the training dataset.
The score as defined above is minimized, i.e. the model with the lowest MDL is
selected.
Assignment # 3
Exercise: Build a decision tree model to predict survival based on certain parameters.
Using the following columns from the given file, build a model to predict whether a person would survive or not:
1. Pclass
2. Sex
3. Age
4. Fare
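A starting-point sketch for the assignment, assuming a Titanic-style dataset; since the actual file is not included in these notes, a tiny hand-made DataFrame stands in for it:

```python
# Decision tree on the four assignment columns. The rows below are made up;
# replace the DataFrame with pd.read_csv(<the assignment file>) in practice.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    "Pclass":   [1, 3, 2, 3, 1, 2],
    "Sex":      ["female", "male", "female", "male", "male", "female"],
    "Age":      [29, 22, 35, 40, 54, 18],
    "Fare":     [80.0, 7.3, 21.0, 8.1, 52.0, 13.0],
    "Survived": [1, 0, 1, 0, 0, 1],
})

# Sex is categorical, so encode it numerically for scikit-learn.
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

X = df[["Pclass", "Sex", "Age", "Fare"]]
y = df["Survived"]

model = DecisionTreeClassifier(random_state=0).fit(X, y)

# Predict survival for a hypothetical first-class female passenger.
sample = pd.DataFrame([[1, 1, 30, 75.0]], columns=X.columns)
print(model.predict(sample))
```

With real data you would also apply a train-test split or cross-validation, as discussed above, rather than scoring on the training rows.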
ROC Curve:
The ROC curve is a graph that shows the performance of a classification model
at all possible thresholds (a threshold is a particular value beyond which you say
a point belongs to a particular class). The curve is plotted using two parameters:
• TRUE POSITIVE RATE
• FALSE POSITIVE RATE
AUC-ROC curve:
ROC curve stands for Receiver Operating Characteristics Curve and AUC stands
for Area Under the Curve.
It is a graph that shows the performance of the classification model at different
thresholds.
The AUC-ROC curve can also be used to visualize the performance of a multi-class
classification model, typically by plotting one curve per class (one-vs-rest).
The ROC curve is plotted with TPR (True Positive Rate) on the y-axis and
FPR (False Positive Rate) on the x-axis.
AUC measures how well a model is able to distinguish between classes.
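Both quantities can be computed with scikit-learn; the labels and scores below are made up for the sketch:

```python
# Sketch: TPR/FPR points at each threshold, and the area under the curve.
from sklearn.metrics import roc_auc_score, roc_curve

y_true  = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.3, 0.35, 0.8, 0.4, 0.6, 0.7, 0.9]  # predicted probabilities

# roc_curve sweeps over all thresholds and returns one (FPR, TPR) point each.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", auc)
```

An AUC of 1.0 means the model ranks every positive above every negative; 0.5 means the ranking is no better than chance.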