Part 2: Should the Bank Buy Third-Party Credit Information?
TOTAL POINTS 9

1. Introduction 1 point

Part 2 is intended to illustrate how binary classification performance metrics make it possible for you to put an exact value, in dollars per event, on new information that relates to a predictive model.

Note that new information will be worth far more if it is compared to no forecasting model rather than the state of partial knowledge available from the current model. Sellers of information (and data science consultants!) love to take credit for any information gain they achieve over the base rate.

Very often, some intermediate state of knowledge is already available for which no additional spending is required. Evaluating the realistic incremental financial gain from new information, whether licensing a third-party commercial database or collecting new data internally, is therefore of great practical value, as this sets an upper bound on what your company should be willing to pay to license or create the new information.

In this case study, your boss has been in discussions with a credit-risk analytics company, specializing in advanced machine learning and predictive analytics, that claims to score individual probability of default with very high information gain. Let’s call the company Eggertopia. Eggertopia sales representatives claim their pre-processed risk scores can achieve AUC values of .85 or even higher. However, Eggertopia scores are sold per event, and they are expensive!

Your boss asks you to determine the incremental financial value to the bank of purchasing Eggertopia risk scores on future credit-card applicants.

Eggertopia agrees to apply its algorithms to generate credit scores for the 400 individuals in the Training and Test Sets. Eggertopia scores do not need to be combined with anything else to make a model. However, since the scores range from approximately -600 (best credit risk) to 4900 (most likely to default), they will need to be standardized and adjusted to fit the -3.5 to 3.5 range of the AUC Calculator Spreadsheet (below).

AUC_Calculator and Review of AUC Curve.xlsx
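The standardization step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the raw scores are made-up values, `standardize` is a hypothetical helper name, and clipping to the spreadsheet range is one possible reading of "adjusted"; the course spreadsheet may expect a different adjustment.

```python
from statistics import mean, pstdev

def standardize(scores, lo=-3.5, hi=3.5):
    """Convert raw scores to z-scores, then clip them to [lo, hi].

    Clipping is an assumed adjustment, not one specified by the assignment.
    """
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation
    if sigma == 0:
        raise ValueError("scores have zero variance and cannot be standardized")
    return [min(hi, max(lo, (s - mu) / sigma)) for s in scores]

# Made-up raw scores spanning roughly the range quoted in the text.
raw_scores = [-600, -100, 250, 900, 4900]
z_scores = standardize(raw_scores)
```

Standardizing preserves the rank order of the scores, so the AUC itself is unaffected; only the scale on which thresholds are expressed changes.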

You will determine the sustainable AUC of the Eggertopia scores, the sustainable cost-per-event, and the savings per event, when comparing Eggertopia data to the base rate forecast.

You will then calculate the incremental savings per event if you compare use of Eggertopia data to use of your current model developed in Part 1.
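The two comparisons described in this introduction differ only in the baseline that is subtracted. A minimal sketch, using made-up per-event costs that are not the quiz answers:

```python
def savings_per_event(baseline_cost, new_cost):
    """Savings per event from switching from the baseline to the new forecast."""
    return baseline_cost - new_cost

# Hypothetical per-event costs, for illustration only.
cost_base_rate = 1000.0      # no forecasting model at all
cost_current_model = 720.0   # the model the bank already has
cost_new_scores = 650.0      # the purchased third-party scores

absolute_savings = savings_per_event(cost_base_rate, cost_new_scores)
incremental_savings = savings_per_event(cost_current_model, cost_new_scores)
```

With these numbers the absolute savings are much larger than the incremental savings, which is why a seller prefers to quote the gain over the base rate.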

Question: What is the AUC of the Eggertopia Scores on the Training Set? Give your answer to two digits to the right of the decimal point.

.83

.85

.88

.95

2. What is the optimum threshold on the Training Set to minimize the average cost per test? 1 point

.1

.25

.15

.2

3. What is the average cost-per-event at the Training Set optimum threshold? 1 point

$600

$640

$540

$500
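Questions 2 and 3 amount to a search over candidate thresholds for the one with the lowest average misclassification cost. The sketch below shows the idea under several assumptions: applicants scoring above the threshold are classified as defaults, the false-negative and false-positive costs are hypothetical placeholders (use the costs given in Part 1), and the data are made up.

```python
def cost_per_event(scores, labels, threshold, cost_fn, cost_fp):
    """Average cost per applicant when scores above `threshold` are
    classified as default (label 1 = actual default)."""
    total = 0.0
    for s, y in zip(scores, labels):
        predicted_default = s > threshold
        if y == 1 and not predicted_default:
            total += cost_fn   # missed default (false negative)
        elif y == 0 and predicted_default:
            total += cost_fp   # good applicant rejected (false positive)
    return total / len(scores)

def best_threshold(scores, labels, candidates, cost_fn, cost_fp):
    """Return the candidate threshold with the lowest average cost."""
    return min(
        candidates,
        key=lambda t: cost_per_event(scores, labels, t, cost_fn, cost_fp),
    )

# Made-up standardized scores, outcomes, and costs.
demo_scores = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
demo_labels = [0, 0, 1, 0, 1, 1]
demo_candidates = [-0.5, 0.0, 0.5, 1.0]
chosen = best_threshold(demo_scores, demo_labels, demo_candidates,
                        cost_fn=4000.0, cost_fp=1000.0)
chosen_cost = cost_per_event(demo_scores, demo_labels, chosen,
                             cost_fn=4000.0, cost_fp=1000.0)
```

The threshold chosen this way on the Training Set is then held fixed when the cost per event is measured on the Test Set.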
4. What is the AUC of the Eggertopia scores on the Test Set? 1 point

.88

.80

.75

.85

5. Using the same threshold as used on the Training Set, what is the cost per event of the Eggertopia scores on the Test Set? Round to the nearest dollar. 1 point

$833

$803

$838

$823

6. If the bank did not have your model, or any other way of forecasting default, what is the maximum (break-even) price per event that the bank could theoretically pay for Eggertopia scores? In other words, what is the absolute savings-per-event of the Eggertopia scores? 1 point

Hint: Calculate the difference between the cost-per-event at a 25% default rate and the cost-per-event using Eggertopia scores.

$423

$412

$418

$425
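The hint for question 6 relies on a cost-per-event with no forecasting model. One way to sketch that baseline, assuming it is the cheaper of the two blanket policies (accept everyone or reject everyone) and using hypothetical cost figures rather than the ones from Part 1:

```python
def base_rate_cost_per_event(default_rate, cost_fn, cost_fp):
    """Cheapest average cost when every applicant gets the same decision."""
    accept_all = default_rate * cost_fn          # every defaulter is a false negative
    reject_all = (1.0 - default_rate) * cost_fp  # every non-defaulter is a false positive
    return min(accept_all, reject_all)

# 25% default rate from the hint; the two costs are hypothetical placeholders.
baseline_cost = base_rate_cost_per_event(0.25, cost_fn=4000.0, cost_fp=1000.0)

# Hypothetical cost per event when using purchased scores.
scored_cost = 500.0
break_even_price = baseline_cost - scored_cost
```

Paying more than `break_even_price` per score would leave the bank worse off than using no forecast at all.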

7. What is the True Positive rate of the forecasting model using Eggertopia Scores? 1 point

.70

.74

.76

.72
8. What is the Positive Predictive Value (PPV) of the forecasting model using Eggertopia scores? 1 point

Hint: To calculate the PPV, divide the number of True Positives by the total number of Positive Classifications. Review confusion matrix definitions and letter designations on the Information Gain Spreadsheet [PPV is defined at Cell G41], obtain True Positive and False Positive Rates from the AUC Calculator Spreadsheet, and use algebra to solve.

Information Gain Calculator.xlsx

.54

.48

.50

.52
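The algebra mentioned in the hint for question 8 can be written out as a short function: weight the True Positive Rate by the share of defaulters and the False Positive Rate by the share of non-defaulters, then take the true positives as a fraction of all positive classifications. The rates below are made-up values, not those from the AUC Calculator Spreadsheet, and `ppv` is a hypothetical helper name.

```python
def ppv(tpr, fpr, prevalence):
    """Positive Predictive Value from rates and the share of actual defaulters.

    PPV = TPR * p / (TPR * p + FPR * (1 - p)), where p is the prevalence.
    """
    true_pos = tpr * prevalence            # share of all events that are true positives
    false_pos = fpr * (1.0 - prevalence)   # share of all events that are false positives
    if true_pos + false_pos == 0:
        raise ValueError("no positive classifications, so PPV is undefined")
    return true_pos / (true_pos + false_pos)

# Hypothetical rates with a 25% default rate.
example_ppv = ppv(tpr=0.80, fpr=0.10, prevalence=0.25)
```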

9. Incremental Financial Value of Eggertopia Scores 1 point

You calculated a cost per event for your own predictive model on Test Set data to answer Quiz 1 - Part 1, Question 6.

Question: Assuming that the performance of the Eggertopia model and your model both remain stable on any future data (a big assumption), what is the maximum, or break-even, price that the bank could pay per score for Eggertopia, given that it already has your model and data?

700
