Coursera

© All Rights Reserved

Part 1: Building your Own Binary Classification Model | Coursera

13 questions

Introduction:

You work for a bank as a business data analyst in the credit card risk-modeling department. Your bank recently conducted a bold experiment: over a short time interval three years ago, it quietly issued credit cards to everyone who applied, 600 people in all, regardless of their credit risk.

After three years, 150, or 25%, of card recipients had defaulted: they failed to pay back at least some of the money they owed. However, the bank collected very valuable proprietary data that it can now use to optimize its future card-issuing process.

The bank initially collected six pieces of data about each person.

Age

Years at current employer

Years at current address

Income over the past year

Current credit card debt, and

Current automobile debt

You are first asked to propose a binary classification model for default that uses only data from one or more of the above six inputs, and outputs a single score. The relative rank-ordering of scores will determine the model's effectiveness. For convenience, you are asked to use a scale for your score that has a maximum < 3.5 and a minimum > -3.5.

Initially you are not told the bank's best estimate of the cost per False Negative (accepted applicant who becomes a defaulting customer) and per False Positive (rejected customer who would not have defaulted). Therefore, the best you can do is to design a model that maximizes the Area Under the ROC Curve, or AUC.

You are told that if your model is effective (high enough AUC, with the cutoff not defined) and robust (not defined, but in general meaning relatively little change in AUC across multiple sets of available data), it may be adopted by the bank as a predictive model for default, to determine which future applicants will be issued credit cards.

First Binary Classification Model: You are first given a training set of 200 out of the 600 people in the experiment. Design your model on this set. Standardize your data first. You may combine the six inputs by adding them to or subtracting them from each other, taking simple ratios, etc. The only restriction is that your final score needs to be scaled so that the maximum is less than 3.5 and the minimum is greater than -3.5, so you can use the Excel AUC Calculator provided.
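The standardize, combine, and rescale steps described above can be sketched as follows. This is a minimal illustration, not the course's method: the debt-to-income ratio, the function names, and the sample numbers are all hypothetical, and the 0.99 shrink factor is just one way to keep scores strictly inside the -3.5 to 3.5 range.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw values to z-scores (mean 0, population std dev 1)."""
    mu = mean(values)
    sigma = pstdev(values)  # assumes the values are not all identical
    return [(v - mu) / sigma for v in values]

def score_applicants(card_debt, income, limit=3.5):
    """Hypothetical model: standardized ratio of card debt to income."""
    ratio = [d / i for d, i in zip(card_debt, income)]
    z = standardize(ratio)
    biggest = max(abs(s) for s in z)
    if biggest >= limit:
        # Multiply every score by one positive factor so the largest
        # magnitude lands just inside the limit; rank order is unchanged.
        factor = 0.99 * limit / biggest
        z = [s * factor for s in z]
    return z

# Made-up illustrative values, not the course data.
card_debt = [1200.0, 300.0, 5000.0, 800.0, 2500.0]
income = [40000.0, 60000.0, 35000.0, 52000.0, 45000.0]
scores = score_applicants(card_debt, income)
```

Because only the rank order of scores matters for AUC, rescaling by a positive constant changes nothing about the model's effectiveness; it only makes the scores fit the calculator's input range.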

Question 1: What is your model? Give it as a function of two or more of the six inputs that outputs a single numerical score between -3.5 and 3.5 for each applicant.

2.

What is your model's AUC on the Training Set?
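For readers who want to check an AUC outside of Excel, here is a rough sketch that uses the standard pairwise interpretation of AUC: the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, with ties counted as one half. It assumes higher scores mean higher default risk. The function name and sample data are hypothetical, and the course's Excel AUC Calculator may compute the figure differently, so small discrepancies are possible.

```python
def auc(scores, defaulted):
    """Pairwise AUC: share of (defaulter, non-defaulter) pairs ranked correctly."""
    pos = [s for s, d in zip(scores, defaulted) if d]
    neg = [s for s, d in zip(scores, defaulted) if not d]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # defaulter correctly ranked above non-defaulter
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))

# Made-up scores and outcomes (1 = defaulted), not the course data.
example_scores = [2.1, 0.3, -0.5, 1.0, -1.2, 0.3]
example_defaulted = [1, 0, 0, 1, 0, 1]
example_auc = auc(example_scores, example_defaulted)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of defaulters from non-defaulters.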

1/31/2016

3.

Initial Assessment for Over-fitting (testing your model on new data)

Next, test your model, without changing any parameters, on the Test Set of 200 additional applicants.

Question: What is your model's new AUC on the Test Set?

4.

Finding the Cost-Minimizing Threshold for your Model

Now that you have, hopefully, developed your model to the point where it is relatively robust across the training set and test set, your boss at the bank finally gives you its current rough estimate of the bank's average costs for each type of classification error.

[Note that all bank models here include only profits and losses within three years of when a card is issued, so the impact of out-years (years beyond 3) can be ignored.]

Cost Per False Negative: $5000

Cost Per False Positive: $2500

Note that for the 600 individuals who were automatically given cards without being classified, the total cost of the experiment turned out to be 25% * $5,000 * 600, or $750,000. This is $1,250 per event. Only models with a lower cost per event than this have any value.
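The baseline arithmetic can be verified in a few lines. The constant names are illustrative; the figures come from the text above.

```python
COST_PER_FALSE_NEGATIVE = 5000   # dollars, from the text
N_APPLICANTS = 600               # everyone was issued a card
DEFAULT_RATE = 0.25              # 150 of the 600 defaulted

# With no screening, every defaulter is an accepted applicant who
# defaults (a False Negative) and nobody is rejected (no False Positives).
n_defaulters = DEFAULT_RATE * N_APPLICANTS
total_cost = n_defaulters * COST_PER_FALSE_NEGATIVE
baseline_cost_per_event = total_cost / N_APPLICANTS

print(total_cost, baseline_cost_per_event)
```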

Question: What is the threshold score for your current classification model that minimizes cost per event on the training set?

5.

What is your minimum cost per event on the training set?
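One way to search for the cost-minimizing threshold is to try every distinct score as a cut-off and keep the cheapest. The sketch below assumes that higher scores indicate higher default risk and that applicants scoring strictly above the threshold are classified Positive (rejected); the function names and sample data are hypothetical.

```python
def cost_per_event(scores, defaulted, threshold, cost_fn=5000.0, cost_fp=2500.0):
    """Average cost when applicants scoring above `threshold` are rejected."""
    # False Negative: accepted (score <= threshold) but defaulted.
    fn = sum(1 for s, d in zip(scores, defaulted) if d and s <= threshold)
    # False Positive: rejected (score > threshold) but would not have defaulted.
    fp = sum(1 for s, d in zip(scores, defaulted) if not d and s > threshold)
    return (fn * cost_fn + fp * cost_fp) / len(scores)

def best_threshold(scores, defaulted):
    """Return (minimum cost per event, threshold) over all candidate cut-offs."""
    # One candidate below every score (reject everyone) plus each distinct score.
    candidates = [min(scores) - 1.0] + sorted(set(scores))
    return min((cost_per_event(scores, defaulted, t), t) for t in candidates)

# Made-up training data (1 = defaulted), not the course data.
train_scores = [2.1, 0.3, -0.5, 1.0, -1.2, 0.3, -0.8, 0.1]
train_defaulted = [1, 0, 0, 1, 0, 1, 0, 0]
min_cost, threshold = best_threshold(train_scores, train_defaulted)
```

To measure robustness, call `cost_per_event` on the test set with the threshold found on the training set, rather than re-running the search on the test data.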

6.

At that same threshold score (NOT the threshold score that would minimize costs for the new Test Set, but the old threshold score that minimized costs on the Training Set), what is the cost per event on the test set?

7.

Putting a Dollar Value on Your Model Plus the Data

Again assume Test Set results are sustainable long term.

Question: How much money does the bank save, per event, using your model and its data inputs, instead of issuing credit cards to everyone who asks?

8.

Given that it apparently cost the bank $750,000 to conduct the three-year experiment, if the bank processes 1,000 credit card applicants per day on average, how many days will it take for future savings to pay back the investment?
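The savings and payback calculations follow directly from the figures in the text. In the sketch below, the 900.0 passed in is a hypothetical placeholder for your own test-set cost per event, and rounding up to whole days is an assumption about how the answer is expected.

```python
import math

BASELINE_COST_PER_EVENT = 1250.0   # cost of issuing cards to everyone, from the text
EXPERIMENT_COST = 750_000.0        # cost of the three-year experiment, from the text
APPLICANTS_PER_DAY = 1000          # average daily applicants, from the text

def payback_days(model_cost_per_event):
    """Return (savings per event, whole days needed to recoup the experiment)."""
    savings_per_event = BASELINE_COST_PER_EVENT - model_cost_per_event
    if savings_per_event <= 0:
        raise ValueError("the model does not save money relative to the baseline")
    daily_savings = savings_per_event * APPLICANTS_PER_DAY
    return savings_per_event, math.ceil(EXPERIMENT_COST / daily_savings)

# 900.0 is a made-up test-set cost per event, used only for illustration.
savings, days = payback_days(900.0)
```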

9.

Confusion Matrix Metrics at the Cost-Minimizing Threshold for your Model

What is the test incidence of your model, on the test set, at the threshold from the training set? In other words, what percentage of applicants does your model classify as Positive, i.e. as defaulters? (Answers must be given as percentages, e.g. 75.)

10.

On the test set, calculate your model's False Positive Rate (FPR) and compare it to the Test Incidence (TI).

Your FPR should be greater than the TI

Your FPR should be less than the TI

Your FPR should be equal to the TI

11.

On the test set, calculate your model's True Positive Rate (TPR) and compare it to the Test Incidence (TI).

Your TPR should be greater than the TI

Your TPR should be less than the TI

Your TPR should be equal to the TI

12.

What is the model's Positive Predictive Value (PPV)?

Greater than .25

Less than .25

Equal to .25

13.

What is the model's Negative Predictive Value (NPV)?

Less than .75

Equal to .75

Greater than .75
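The confusion matrix metrics asked about in the last five questions can all be derived from four counts. The sketch below assumes that a score strictly above the threshold means the applicant is classified Positive (a predicted defaulter); the function name, the sample data, and the threshold of 0.1 are hypothetical.

```python
def confusion_metrics(scores, defaulted, threshold):
    """Compute test incidence, FPR, TPR, PPV and NPV at a given threshold."""
    tp = fp = tn = fn = 0
    for s, d in zip(scores, defaulted):
        predicted_default = s > threshold
        if predicted_default and d:
            tp += 1
        elif predicted_default and not d:
            fp += 1
        elif not predicted_default and d:
            fn += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    # Each ratio assumes its denominator is non-zero for the data supplied.
    return {
        "test_incidence": (tp + fp) / n,   # share classified Positive
        "fpr": fp / (fp + tn),             # non-defaulters wrongly flagged
        "tpr": tp / (tp + fn),             # defaulters correctly flagged
        "ppv": tp / (tp + fp),             # flagged applicants who default
        "npv": tn / (tn + fn),             # accepted applicants who repay
    }

# Made-up test data (1 = defaulted), not the course data.
test_scores = [2.1, 0.3, -0.5, 1.0, -1.2, 0.3, -0.8, 0.1]
test_defaulted = [1, 0, 0, 1, 0, 1, 0, 0]
metrics = confusion_metrics(test_scores, test_defaulted, threshold=0.1)
```

Test incidence is a weighted average of TPR and FPR, weighted by the share of defaulters and non-defaulters in the data, which is a useful sanity check when comparing the three values.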
