Autumn 2020
Analytics in Practice
Model Evaluation
Learning objectives
1 Business Understanding
2 Data Understanding
3 Data Preparation
4 Modelling
5 Evaluation
6 Deployment
Overfitting
• Generalisation capability: how a model performs on unseen data.
• Typically, accuracy on test data is somewhat lower than on training data.
• If it drops by a large amount, it suggests that the model overfits the training data.
Overfitting: the model is too strongly tailored to the training data and won't work well on new data.
Overfitting examples
[figure: example model fits on training data]
Overfitting in Linear Functions
f(x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3
f(x) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + ⋯ + w_n x_n
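The effect can be sketched numerically. In this illustrative example (data and values are invented, not from the slides), fitting polynomials of increasing degree with NumPy stands in for adding more weighted terms: the high-degree model matches the training points almost exactly but generalises worse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (illustrative): a noisy linear relationship y = 2x + noise
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.2, size=10)
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test + rng.normal(0, 0.2, size=100)

def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))

# Adding terms w_4 x_4 ... w_n x_n corresponds here to a higher degree:
# a degree-9 polynomial can pass through all 10 training points.
errors = {}
for degree in (1, 9):
    w = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(y_train, np.polyval(w, x_train)),   # training error
                      mse(y_test, np.polyval(w, x_test)))     # test error
```

The degree-9 model drives the training error towards zero, yet its test error stays well above it — the large gap is the overfitting signal described above.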
Avoiding overfitting
• Try several different types of model on the same problem, then compare the results to find the best model.
• What would you like to achieve by developing a predictive model?
• Evaluation metrics are the key to measuring the performance of your model when applied to a test dataset:
• Accuracy, error rate, confusion matrix, ROC chart.
• These give insight into the types of errors that are being made.
• False positives are also known as type I errors
• “False alarms”
• For example, we predict that the customer is going to leave, but they actually stay.
• False negatives are also known as type II errors
• “Failed to raise the alarm”
• For example, we predict that the customer will stay with the company, but they actually leave.
Predicting Churn
Confusion matrix:
                 Actual: Yes   Actual: No
Predicted: Yes   TP = 10       FP = 5
Predicted: No    FN = 90       TN = 395
Predicting Churn
• Accuracy is not always a useful metric:
• when one outcome is very rare
• when errors have very different costs
Confusion matrix:
                 Actual: Yes   Actual: No
Predicted: Yes   TP = 10       FP = 5
Predicted: No    FN = 90       TN = 395
F-measure = 2 × (Precision × Recall) / (Precision + Recall) = 2 × (0.67 × 0.10) / (0.67 + 0.10) = 0.17
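The slide's figures follow directly from the churn confusion matrix (TP = 10, FP = 5, FN = 90, TN = 395); a quick sketch:

```python
# Metrics from the churn confusion matrix above
tp, fp, fn, tn = 10, 5, 90, 395

accuracy = (tp + tn) / (tp + fp + fn + tn)                  # 405/500 = 0.81
precision = tp / (tp + fp)                                  # 10/15  ≈ 0.67
recall = tp / (tp + fn)                                     # 10/100 = 0.10
f_measure = 2 * precision * recall / (precision + recall)   # ≈ 0.17
```

Accuracy looks respectable at 81%, yet recall shows the model catches only 10% of the churners — exactly the "rare outcome" pitfall noted above.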
[table: customers with employment type, card, shopping per month, actual response in a month (Yes/No), and prediction score — e.g. Retired, No, 0.67; Student, No, 0.32; Working, …]
Making predictions revisited - Example
• If the prediction score is greater than or equal to the threshold value, we predict the target value as “Yes”, otherwise “No”.

               Actual: Y   Actual: N
Predicted: Y   TP          FP
Predicted: N   FN          TN

FPR = FP / (FP + TN)
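The thresholding rule can be sketched as follows (the scores below are made up for illustration):

```python
def predict_labels(scores, threshold):
    """Score >= threshold -> "Yes", otherwise "No" (the rule above)."""
    return ["Yes" if s >= threshold else "No" for s in scores]

# Hypothetical prediction scores for six customers
scores = [0.91, 0.67, 0.55, 0.32, 0.20, 0.05]
labels = predict_labels(scores, threshold=0.5)
# -> ["Yes", "Yes", "Yes", "No", "No", "No"]
```

Raising the threshold turns some "Yes" predictions into "No", trading false positives for false negatives.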
Receiver Operating Characteristic (ROC) Graph
• The ROC graph illustrates the relative trade-off between true positives (benefits) and false positives (costs).

TPR = TP / (TP + FN)
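Sweeping the decision threshold and recomputing FPR and TPR at each value yields the points of the ROC curve. A small sketch (scores and actual responses below are invented for illustration):

```python
def roc_points(scores, actuals, thresholds):
    """(FPR, TPR) pairs obtained by sweeping the decision threshold."""
    pos = sum(1 for a in actuals if a == "Yes")
    neg = len(actuals) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, a in zip(scores, actuals) if s >= t and a == "Yes")
        fp = sum(1 for s, a in zip(scores, actuals) if s >= t and a == "No")
        points.append((fp / neg, tp / pos))  # FPR = FP/(FP+TN), TPR = TP/(TP+FN)
    return points

# Hypothetical scores and actual responses
scores  = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
actuals = ["Yes", "Yes", "No", "Yes", "No", "No"]
curve = roc_points(scores, actuals, thresholds=[1.01, 0.5, 0.0])
```

A very high threshold gives the point (0, 0), a threshold of zero gives (1, 1), and intermediate thresholds trace out the trade-off in between.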
[table: Employment, Card, Shopping per month, Responder, Prediction Score — customers ranked by predicted score]
... need to consider the top 30% of the customers as ranked by predicted likelihood to respond to the marketing campaign.
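Selecting the top 30% by predicted score might look like this (the customer records below are invented for illustration):

```python
# Hypothetical customers with predicted response scores
customers = [("Retired", 0.67), ("Student", 0.32), ("Working", 0.81),
             ("Working", 0.15), ("Retired", 0.55), ("Student", 0.90),
             ("Working", 0.44), ("Retired", 0.08), ("Student", 0.71),
             ("Working", 0.28)]

# Rank by score, highest first, and target the top 30% (here 3 of 10)
ranked = sorted(customers, key=lambda c: c[1], reverse=True)
k = int(0.3 * len(ranked))
targeted = ranked[:k]
```

Ranking by score rather than applying a fixed threshold is what lets the campaign size be set by budget (the "top 30%") instead of by a cut-off value.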
Questions
Overfitting in Decision Trees
[figure: decision tree splitting on Income and then Credit Card, with leaf classes Low Risk / High Risk; misclassification error rate 0% on the training data, but errors appear on the test data]
Expected Value
Confusion matrices (110 instances each):

Decision Tree             Actual: Respond Yes   Actual: Respond No
Predicted: Respond Yes    TP = 56               FP = 7
Predicted: Respond No     FN = 5                TN = 42

Logistic Regression       Actual: Respond Yes   Actual: Respond No
Predicted: Respond Yes    TP = 50               FP = 5
Predicted: Respond No     FN = 3                TN = 52

Probability(FN) = FN / Total Instances = 5/110 = 0.05 (decision tree)
• False Positive (FP): a consumer who is predicted to respond to the marketing campaign but actually does not respond.
Profit(FP) = −£1
Probability(FP) = FP / Total Instances = 7/110 = 0.06 (decision tree)
For the decision tree, the expected profit is £50.43 per customer.
Expected Value to Compare Classifiers (Costs and Benefits)
• True Positive (TP): a consumer who is offered the product and actually buys it.
Profit(TP) = £100 − £1 = £99
Probability(TP) = TP / Total Instances = 50/110 = 0.45 (logistic regression)
• True Negative (TN):
Profit(TN) = £0
Probability(TN) = TN / Total Instances = 52/110 = 0.47 (logistic regression)

Logistic Regression       Actual: Respond Yes   Actual: Respond No
Predicted: Respond Yes    TP = 50               FP = 5
Predicted: Respond No     FN = 3                TN = 52
Expected Value to Compare Classifiers (Costs and Benefits)
• True Positive:  Profit(TP) = £99,  Prob(TP) = 0.45
• False Negative: Profit(FN) = £0,   Prob(FN) = 0.03
• False Positive: Profit(FP) = −£1,  Prob(FP) = 0.05
• True Negative:  Profit(TN) = £0,   Prob(TN) = 0.47

Decision Tree             Actual: Respond Yes   Actual: Respond No
Predicted: Respond Yes    TP = 56               FP = 7
Predicted: Respond No     FN = 5                TN = 42

Logistic Regression       Actual: Respond Yes   Actual: Respond No
Predicted: Respond Yes    TP = 50               FP = 5
Predicted: Respond No     FN = 3                TN = 52
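Combining each outcome's probability with its profit gives the expected profit per customer, which lets the two classifiers be compared directly. A sketch using the exact counts (note: with exact rather than two-decimal probabilities, the decision tree's figure comes to about £50.34; the slide's £50.43 presumably reflects rounded probabilities such as 0.51 × £99 − 0.06 × £1):

```python
def expected_profit(tp, fp, fn, tn, profits=(99, -1, 0, 0)):
    """Expected profit per customer: sum over outcomes of P(outcome) * profit.

    profits = (profit_TP, profit_FP, profit_FN, profit_TN), in pounds.
    """
    total = tp + fp + fn + tn
    p_tp, p_fp, p_fn, p_tn = profits
    return (tp * p_tp + fp * p_fp + fn * p_fn + tn * p_tn) / total

ev_tree = expected_profit(tp=56, fp=7, fn=5, tn=42)    # decision tree
ev_logit = expected_profit(tp=50, fp=5, fn=3, tn=52)   # logistic regression
```

On these numbers the decision tree (≈ £50.34 per customer) outperforms logistic regression (≈ £44.95 per customer), even though the logistic regression makes fewer false positives.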
Summary