
EVALUATION OF PREDICTIVE MODELS

Predictive models are proving quite helpful in forecasting the future growth of businesses: they predict outcomes using data mining and probability, and each model consists of a number of predictors or variables. A statistical model can therefore be created by collecting data for the relevant variables.

A predictive model can solve one of two categories of problems, depending on the business need: classification problems and regression problems. There is a fundamental difference between the methods used to evaluate a regression model and a classification model.

With regression, we deal with continuous values, so we can measure the error between the actual and the predicted output directly.

When evaluating a classification model, however, the focus is on the number of predictions that are classified correctly. To evaluate such a model properly, we also have to consider the data points that are classified incorrectly.
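To make the difference concrete, here is a minimal sketch in Python (assuming scikit-learn is available; the toy values are made up and do not come from this handout): a regression model is scored by the size of its error, while a classifier is scored by the share of labels it gets right.

from sklearn.metrics import mean_squared_error, accuracy_score

# Regression: score the size of the error between actual and predicted values
y_true_reg = [3.0, 5.5, 7.2]
y_pred_reg = [2.8, 6.0, 7.0]
print(mean_squared_error(y_true_reg, y_pred_reg))   # average squared error

# Classification: score the share of labels predicted correctly
y_true_cls = ["yes", "no", "yes", "yes"]
y_pred_cls = ["yes", "no", "no", "yes"]
print(accuracy_score(y_true_cls, y_pred_cls))       # fraction classified correctly (0.75)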

Model Evaluation Techniques

Model evaluation is an integral part of the model development process. It helps us find the model that best represents our data and indicates how well the chosen model will work in the future. Evaluating model performance on the training data alone is not acceptable in data science, because it can easily produce overly optimistic and overfitted models. There are two common methods of evaluating models in data science: Hold-Out and Cross-Validation. To avoid overfitting, both methods use a test set (not seen by the model) to evaluate model performance.

Hold-Out

In this method, the (usually large) dataset is randomly divided into two subsets:
1. Training set: the subset of the dataset used to build the predictive model, i.e., the subset used to train the model.
2. Test set (unseen examples): the subset of the dataset used to assess the likely future performance of the model. If a model fits the training set much better than it fits the test set, overfitting is probably the cause. A short code sketch of this split follows the list.
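A minimal hold-out sketch, assuming scikit-learn and a toy dataset generated on the spot (none of these names appear in the handout itself):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy dataset standing in for a real one (the handout does not specify any data)
X, y = make_classification(n_samples=200, random_state=42)

# Randomly divide the dataset into a training subset and a test subset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)   # e.g. 70% training / 30% test

model = LogisticRegression()
model.fit(X_train, y_train)                  # build the model on the training set only
print(model.score(X_test, y_test))           # assess likely future performance on unseen examples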

Cross-Validation

When only a limited amount of data is available, we use k-fold cross-validation to obtain an unbiased estimate of model performance. In k-fold cross-validation, we divide the data into k subsets of equal size. We build the model k times, each time leaving out one of the subsets from training and using it as the test set. If k equals the sample size, this is the “leave-one-out” method.
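A minimal k-fold sketch, again assuming scikit-learn and a toy dataset (an illustration, not a prescribed implementation):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, LeaveOneOut

X, y = make_classification(n_samples=100, random_state=0)   # small toy dataset

# k-fold cross-validation: k = 5 models, each tested on the one fold it did not see
scores = cross_val_score(LogisticRegression(), X, y, cv=5)
print(scores.mean())                                         # averaged performance estimate

# If k equals the sample size, this becomes leave-one-out cross-validation
loo_scores = cross_val_score(LogisticRegression(), X, y, cv=LeaveOneOut())
print(loo_scores.mean())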

Training and Testing Methodologies

Training data and test data are two distinct but equally important parts of machine learning. While training data is needed to teach an ML algorithm, testing data, as the name suggests, helps you validate the progress of the algorithm's training and adjust or optimize it for improved results.
In simple words, when collecting a data set that you'll be using to train your algorithm, keep in mind that part of the data will be used to check how well the training went. This means that your data will be split into two parts: one for training and the other for testing.
Adequate training requires the algorithm to see the training data multiple times, which means the model is repeatedly exposed to the same patterns when it runs over the same data set. To check that it has learned more than those patterns, you need a different set of data that exposes the algorithm to patterns it has not seen. At the same time, you don't want to involve your testing data set before the training ends, since you need it for a different purpose.
• Training data. This type of data builds up the machine learning algorithm. The data scientist feeds
the algorithm input data, which corresponds to an expected output. The model evaluates the data
repeatedly to learn more about the data’s behavior and then adjusts itself to serve its intended
purpose. You train the model using the training set; training the model means creating the model.
• Validation data. During training, validation data infuses new data into the model that it hasn’t
evaluated before. Validation data provides the first test against unseen data, allowing data scientists
to evaluate how well the model makes predictions based on the new data. Not all data scientists use
validation data, but it can provide some helpful information to optimize hyperparameters, which
influence how the model assesses data.
• Test data. After the model is built, testing data once again validates that it can make accurate
predictions. If training and validation data include labels to monitor performance metrics of the
model, the testing data should be unlabeled. Test data provides a final, real-world check of an
unseen dataset to confirm that the ML algorithm was trained effectively. You test the model using
the testing set; testing the model means measuring the accuracy of the model. A minimal three-way splitting sketch follows this list.
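A minimal sketch of a train / validation / test split, assuming scikit-learn and toy data; the 60/20/20 proportions are illustrative, not prescribed by the text:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)   # toy data

# First carve off the training set, then split the remainder into validation and test
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# - fit the model on (X_train, y_train)
# - tune hyperparameters against (X_val, y_val), the first check on unseen data
# - report the final score on (X_test, y_test), touched only once at the very end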

Confusion Matrix in Machine Learning


In the field of machine learning, and specifically in the problem of statistical classification, a confusion matrix (also known as an error matrix) is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. It allows the visualization of the performance of an algorithm.

It allows easy identification of confusion between classes, e.g., when one class is commonly mislabeled as another. Most performance measures are computed from the confusion matrix.

Confusion Matrix:

A confusion matrix is a summary of prediction results on a classification problem. The numbers of correct and incorrect predictions are summarized with count values and broken down by each class; this is the key to the confusion matrix. The confusion matrix shows the ways in which your classification model is confused when it makes predictions. It gives us insight not only into the errors being made by a classifier but, more importantly, into the types of errors that are being made.


For a two-class problem, the classes are labeled as follows:
• Class 1 : Positive
• Class 2 : Negative
Definition of the Terms (a short code sketch computing these counts follows the list):
• Positive (P) : Observation is positive (for example: is an apple).
• Negative (N) : Observation is not positive (for example: is not an apple).
• True Positive (TP) : Observation is positive, and is predicted to be positive.
• False Negative (FN) : Observation is positive, but is predicted negative.
• True Negative (TN) : Observation is negative, and is predicted to be negative.
• False Positive (FP) : Observation is negative, but is predicted positive.
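A short sketch, assuming scikit-learn and toy labels invented for illustration, of how the four counts can be read off a computed confusion matrix:

from sklearn.metrics import confusion_matrix

# Toy labels; "positive" plays the role of Class 1 above, "negative" of Class 2
y_actual    = ["positive", "positive", "negative", "negative", "positive"]
y_predicted = ["positive", "negative", "negative", "positive", "positive"]

# With labels ordered [negative, positive], scikit-learn returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(
    y_actual, y_predicted, labels=["negative", "positive"]).ravel()
print(tp, fn, tn, fp)   # 2 true positives, 1 false negative, 1 true negative, 1 false positive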

Let’s understand TP, FP, FN, and TN in terms of a pregnancy analogy.

• True Positive:
Interpretation: You predicted positive and it’s true.
You predicted that a woman is pregnant and she actually is.
• True Negative:
Interpretation: You predicted negative and it’s true.
You predicted that a man is not pregnant and he actually is not.
• False Positive: (Type 1 Error)
Interpretation: You predicted positive and it’s false.
You predicted that a man is pregnant but he actually is not.
• False Negative: (Type 2 Error)
Interpretation: You predicted negative and it’s false.
You predicted that a woman is not pregnant but she actually is.
• Just remember: we describe predicted values as Positive or Negative and actual values as True or False.
Let's start with an example confusion matrix for a binary
classifier (though it can easily be extended to the case of more than
two classes):
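Reconstructed from the counts used in the bullets and rate calculations below, the example matrix looks like this:

                  Predicted: NO    Predicted: YES
    Actual: NO         50                10
    Actual: YES         5               100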
What can we learn from this matrix?

• There are two possible predicted classes: "yes" and "no". If we were predicting the presence of a disease, for example, "yes" would mean they have the disease, and "no" would mean they don't have the disease.
• The classifier made a total of 165 predictions (e.g., 165
patients were being tested for the presence of that
disease).
• Out of those 165 cases, the classifier predicted "yes" 110
times, and "no" 55 times.
• In reality, 105 patients in the sample have the disease, and
60 patients do not.

Let's now define the most basic terms, which are whole numbers
(not rates):

• true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
• true negatives (TN): We predicted no, and they don't
have the disease.
• false positives (FP): We predicted yes, but they don't
actually have the disease. (Also known as a "Type I error.")
• false negatives (FN): We predicted no, but they actually
do have the disease. (Also known as a "Type II error.")

I've added these terms to the confusion matrix, and also added the
row and column totals:
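Reconstructed from the same counts, the matrix with the four terms and the row and column totals is:

                  Predicted: NO    Predicted: YES    Total
    Actual: NO       TN = 50          FP = 10          60
    Actual: YES      FN = 5           TP = 100        105
    Total                 55              110         165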

This is a list of rates that are often computed from a confusion matrix for a binary classifier (a short sketch recomputing them appears after the list):

• Accuracy: Overall, how often is the classifier correct?
o (TP+TN)/total = (100+50)/165 = 0.91
• Misclassification Rate: Overall, how often is it wrong?
o (FP+FN)/total = (10+5)/165 = 0.09
o equivalent to 1 minus Accuracy
o also known as "Error Rate"
• True Positive Rate: When it's actually yes, how often does it predict yes?
o TP/actual yes = 100/105 = 0.95
o also known as "Sensitivity" or "Recall"
• False Positive Rate: When it's actually no, how often does it predict yes?
o FP/actual no = 10/60 = 0.17
• True Negative Rate: When it's actually no, how often does it predict no?
o TN/actual no = 50/60 = 0.83
o equivalent to 1 minus False Positive Rate
o also known as "Specificity"
• Precision: When it predicts yes, how often is it correct?
o TP/predicted yes = 100/110 = 0.91
• Prevalence: How often does the yes condition actually occur in our sample?
o actual yes/total = 105/165 = 0.64
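The same rates can be recomputed directly from the four cells; here is a minimal Python sketch using the example counts above:

# The four cells of the example matrix above
TP, TN, FP, FN = 100, 50, 10, 5
total = TP + TN + FP + FN              # 165 predictions in all

accuracy    = (TP + TN) / total        # 150/165 ≈ 0.91
error_rate  = (FP + FN) / total        # 15/165  ≈ 0.09, i.e. 1 - accuracy
recall      = TP / (TP + FN)           # 100/105 ≈ 0.95, true positive rate / sensitivity
fp_rate     = FP / (FP + TN)           # 10/60   ≈ 0.17, false positive rate
specificity = TN / (TN + FP)           # 50/60   ≈ 0.83, true negative rate
precision   = TP / (TP + FP)           # 100/110 ≈ 0.91
prevalence  = (TP + FN) / total        # 105/165 ≈ 0.64

print(accuracy, error_rate, recall, precision)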
ASSESSMENT TASK:
1. Suppose we had a classification dataset with 1000 data points. Write the confusion matrix based on the
following:

• 560 positive class data points were correctly classified by the model
• 330 negative class data points were correctly classified by the model
• 60 negative class data points were incorrectly classified as belonging to the positive class by the model
• 50 positive class data points were incorrectly classified as belonging to the negative class by the model
a. Write the confusion matrix
b. Solve for the Accuracy, Error Rate and Precision

