Mang 6054 Credit Scoring and Data Mining
Dr. Bart Baesens
School of Management
University of Southampton
Bart.Baesens@econ.kuleuven.ac.be
Lecturer: Bart Baesens
Studied at the Catholic University of Leuven
(Belgium)
Business Engineer in Management
Informatics, 1998
Ph.D. in Applied Economic Sciences, 2003
Ph.D. title: Developing Intelligent Systems for Credit Scoring Using
Machine Learning Techniques
Associate professor at the K.U.Leuven, Belgium,
and Vlerick Leuven Ghent Management School
Lecturer at the School of Management at the University
of Southampton, United Kingdom
Research: data mining, Web intelligence, neural networks, SVMs,
rule extraction, survival analysis, CRM, credit scoring, Basel II, …
Twitter: www.twitter.com/DataMiningApps
Facebook: Data Mining with Bart
www.dataminingapps.com
Software Support
SAS
SAS Enterprise Miner
SAS Credit Scoring Nodes
– interactive grouping, scorecard,
reject inference, credit exchange
Weka
SPSS
Microsoft Excel
Relevant Background: Credit Scoring Books
Credit Risk Management: Basic Concepts, Van Gestel
and Baesens. Oxford University Press, 2008.
Credit Scoring and Its Applications, Thomas,
Edelman, and Crook. Siam Monographs on
Mathematical Modeling and Computation,
2002.
The Credit Scoring Toolkit, Anderson, Oxford University Press, 2007.
Consumer Credit Models: Pricing, Profit and Portfolios, Thomas, Oxford
University Press, 2009.
The Basel Handbook, Ong, 2004.
Basel II Implementation: A Guide to Developing and Validating a
Compliant Internal Risk Rating System, Ozdemir and Miu, McGraw-Hill, 2008.
The Basel II Risk Parameters: Estimation, Validation and Stress Testing,
Engelmann and Rauhmeier, Springer, 2006.
Recovery Risk: The Next Challenge in Credit Risk Management. Edward
Altman, Andrea Resti, Andrea Sironi (editors), 2005
Developing Intelligent Systems for Credit Scoring, Bart Baesens,
Ph.D. thesis, K.U.Leuven, 2003.
Relevant Background: Data mining books
Hastie T., Tibshirani R., Friedman J., Elements
of Statistical Learning: data mining, inference
and prediction, Springer-Verlag, 2001.
Hand D.J., Mannila H., Smyth P., Principles
of Data Mining, the MIT Press, 2001.
Han J., Kamber M., Data Mining: Concepts and
Techniques, 2nd edition, Morgan Kaufmann, 2006.
Bishop C.M., Neural Networks for Pattern Recognition,
Oxford University Press, 1995.
Cristianini N., Shawe-Taylor J., An Introduction to Support
Vector Machines and other kernel-based learning
methods, Cambridge University Press, 2000.
Cox D.R., Oakes D., Analysis of Survival Data,
Chapman and Hall, 1984.
Allison P.D., Survival Analysis Using the SAS System,
SAS Institute Inc., 1995.
Relevant Background: Credit scoring journals
Journal of Credit Risk
Risk Magazine
Journal of Banking and Finance
Journal of the Operational Research Society
European Journal of Operational Research
Management Science
IMA Journal of Management Mathematics
Relevant Background: Data mining journals
Data Mining and Knowledge Discovery
Machine Learning
Expert Systems with Applications
Intelligent Data Analysis
Applied Soft Computing
European Journal of Operational Research
IEEE Transactions on Neural Networks
Journal of Machine Learning Research
Neural Computation
Relevant Background: Credit scoring Web sites
www.defaultrisk.com
http://www.bis.org/
Websites of regulators:
www.fsa.gov.uk (United Kingdom)
www.hkma.gov.hk (Hong Kong)
www.apra.gov.au (Australia)
www.mas.gov.sg (Singapore)
In the U.S.
4 agencies: OCC, Federal Reserve, FDIC, OTS
Proposed Supervisory Guidance for Internal Ratings Based Systems for
Credit Risk, Advanced Measurement Approaches for Operational Risk, and
the Supervisory Review Process (Pillar 2), Federal Register, Vol 72, No. 39,
February 2007.
Risk-based capital standards: Advanced Capital Adequacy Framework
(Basel II); Final rule, Federal Register, December 2007.
Relevant Background: Data mining web sites
www.kdnuggets.com (data mining)
www.kernelmachines.org (SVMs)
www.sas.com
See support.sas.com for interesting macros
www.twitter.com/DataMiningApps
Course Overview
Credit Scoring
Basel I and Basel II
Data preprocessing for PD modeling
Classification techniques for PD modeling
Measuring performance of PD models
Input selection
Reject inference
Behavioural scoring
Other applications
Section 1
An Introduction to
Credit Scoring
Credit Scoring
Estimate whether applicant will successfully repay
his/her loan based on various information
– Applicant characteristics (e.g. age, income, …)
– Credit bureau information
– Application information of other applicants
– Repayment behavior of other applicants
Develop models (also called scorecards) estimating
the probability of default of a customer
Typically, assign points to each piece of information,
add all points, and compare with a threshold (cutoff)
Judgmental versus Statistical Approach
Judgmental
Based on experience
Five C’s: character, capital, collateral, capacity, conditions
Statistical
Based on multivariate correlations between inputs and risk
of default
Both assume that the future will resemble the past.
Types of Credit Scoring
Application scoring
Behavioral scoring
Profit scoring
Bankruptcy prediction
Application Scoring
Estimate probability of default at the time the applicant applies for the loan!
Use predetermined definition of default
– For example, three months of payment arrears
– Has now been fixed in Basel II
Use application variables (age, income, marital status, years at
address, years with employers, known client, …)
Use bureau variables
– Bureau score, raw bureau data (for example, number of credit checks,
total amount of credits, delinquency history), …
– In the US
FICO scores: between 300 and 850
Experian, Equifax, TransUnion
– Others
Baycorp Advantage (Australia and New Zealand)
Schufa (Germany)
BKR (the Netherlands)
CKP (Belgium)
Dun & Bradstreet (MidCorp, SME)
Example Application Scorecard
So, a new customer applies for credit:
AGE 32 120 points
GENDER Female 180 points
SALARY $1,150 160 points
Total 460 points
Let cutoff = 500
REFUSE CREDIT
...
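As a sketch of the decision logic above, the points-and-cutoff rule can be written as a small function. Python is used here purely for illustration (the course tooling is SAS); the point values follow the example scorecard shown later in these slides, and the function names are illustrative only.

```python
# Illustrative scorecard scoring: look up points per characteristic, sum them,
# and compare the total against the cutoff. Band upper bounds are treated as
# inclusive, matching the slide's example (age 32 falls in the 26-35 band).
SCORECARD = {
    "AGE": [(26, 100), (35, 120), (37, 185), (float("inf"), 225)],
    "GENDER": {"Male": 90, "Female": 180},
    "SALARY": [(500, 120), (1000, 140), (1500, 160), (2000, 200), (float("inf"), 240)],
}
CUTOFF = 500

def points_from_bands(bands, value):
    """Return the points of the first band whose upper bound covers the value."""
    for upper, points in bands:
        if value <= upper:
            return points

def score_applicant(age, gender, salary):
    total = (points_from_bands(SCORECARD["AGE"], age)
             + SCORECARD["GENDER"][gender]
             + points_from_bands(SCORECARD["SALARY"], salary))
    return total, ("ACCEPT" if total >= CUTOFF else "REFUSE")

print(score_applicant(32, "Female", 1150))  # (460, 'REFUSE')
```

This reproduces the slide's example: 120 + 180 + 160 = 460 points, below the cutoff of 500.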
Example Variables Used in Application
Scorecard Development
Age
Time at residence
Time at employment
Time in industry
First digit of postal code
Geographical: Urban/Rural/Regional/Provincial
Residential Status
Employment status
Lifestyle code
Existing client (Y/N)
Years as client
Number of products internally
Internal bankruptcy/write-off
Number of delinquencies
Beacon/Empirical
Time at bureau
Total Inquiries
Time since last Inquiry
Inquiries in the last 3/6/12 mths
Inquiries in the last 3/6/12 mths as % of total
Income
Total Liabilities
Total Debt
Total Debt service $
Total Debt Service Ratio
Gross Debt Service Ratio
Revolving Debt/Total Debt
Other Bank card
Number of other bank cards
Total Trades
Trades opened in last six months
Trades Opened in Last six months /
Total Trades
Total Inactive Trades
Total Revolving trades
Total term trades
% revolving trades
Number of trades current
Number of trades R1, R2..R9
% trades R1R2
% Trades R3 or worse
% trades O1..O9
% trades O1O2
% Trades O3 or worse
Total credit lines - revolving
Total balance - revolving
Total Utilization
total balance all
Application Scoring: Summary
New customers
Two snapshots of the state of the customer
– Second snapshot typically taken around 18 months
after loan origination
Static procedure
Scorecards are out-of-date even before they are
implemented (Hand 2003).
Application scoring aims at ranking customers –
no calibrated probabilities of default needed!
Behavioral Scoring
Study risk behavior of existing customers
Update risk assessment taking into account recent behavior
Example of behavior:
– Average/Max/Min/trend in checking account balance, bureau score,
…
– Delinquency history (payment arrears, …)
– Job changes, home address changes, …
Dynamic
“Video clip” to snapshot (typically taken after 12 months)
(Timeline: observation point, followed by the performance period / “video clip”, ending at the outcome point / snapshot)
Behavioral Scoring
Behavioral scoring can be used for
– Debt provisioning and profit scoring
– Authorizing accounts to go in excess
– Setting credit limits
– Renewals/reviews
– Collection strategies
Characteristics:
– Many, many variables (input selection needed,
cf. infra)
– Purpose is again scoring = ranking customer with
respect to their default likelihood
Profit Scoring
Calculate profit over customer's lifetime
Need loss estimates for bad accounts
Decide on time horizon
– Survival analysis
Dynamic models: incorporate changes in economic
climate
Score the profitability of a customer for a particular
product (product profit score)
Score the profitability of a customer over all products
(customer profit score)
Video clip to video clip
Not being adopted by many financial institutions (yet!)
Corporate default risk modeling
Prediction approach
Predict bankrupt versus nonbankrupt given ratios (solvency, liquidity, …)
describing financial status of companies
Assumes the availability of data (and defaults!)
For example, Altman’s zmodel
– 1968, linear discriminant analysis
– For public industrial companies: Z=1.2X1 + 1.4X2+ 3.3X3 + 0.6X4 +
1.0X5 (healthy if Z > 2.99, unhealthy if Z < 1.81)
– For private industrial companies: Z=6.56X1 + 3.26X2 + 6.72X3 +
1.05X4 (healthy if Z > 2.60, unhealthy if Z < 1.1)
– X1: Working Capital/Total Assets, X2: Retained Earnings/Total Assets,
X3: EBIT/ Total Assets, X4: Market (Book) Value of Equity/Total Liabilities,
X5: Net Sales/Total Assets
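The public-firm z-model above is just a weighted sum of ratios compared against two thresholds. A quick self-contained check (Python for illustration; the ratio values below are invented):

```python
# Altman's 1968 z-model for public industrial companies: a linear combination
# of five financial ratios, with a "healthy" and an "unhealthy" threshold.
def altman_z_public(x1, x2, x3, x4, x5):
    """Z-score for public industrial companies (Altman, 1968)."""
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

def zone(z):
    if z > 2.99:
        return "healthy"
    if z < 1.81:
        return "unhealthy"
    return "grey zone"

# Hypothetical company: working capital/TA = 0.25, retained earnings/TA = 0.20,
# EBIT/TA = 0.15, market value of equity/liabilities = 1.10, sales/TA = 1.30
z = altman_z_public(0.25, 0.20, 0.15, 1.10, 1.30)
print(round(z, 3), zone(z))  # 3.035 healthy
```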
Expert based approach
Expert or expert committee decides upon criteria
Example expert based scorecard (Ozdemir
and Miu, 2009)
Business Risk Score
Industry Position 6
Market Share Trends and Prospects 2
Geographical Diversity of Operations 2
Diversity Product and Services 6
Customer Mix 1
Management Quality and Depth 4
Executive Board oversight 2
…. ….
Corporate default risk modeling
Agency rating approach
In the absence of default data (for example, sovereign exposures, low
default portfolios)
Purchase ratings (for example, AAA, AA, A, BBB) to reflect
creditworthiness
Each rating corresponds to a default probability (PD)
Shadow rating approach
Purchase ratings from a ratings agency
Estimate/mimic ratings using statistical models and data
The Altman Model: Enron Case
Source: Saunders and Allen (2002)
Rating Agencies
Moody’s Investor Service, Standard & Poor’s,
Fitch IBCA
Assign ratings to debt and fixed income securities
at the borrower’s request to reflect financial strength
(“creditworthiness”) of the economic entity
Ratings for
– Companies (private and public)
– Countries and governments (sovereign ratings)
– Local authorities
– Banks
Models are based on quantitative and qualitative
(judgmental) analysis (black box)
Rating Agencies
Source: Van Gestel, Baesens, 2008
Section 2
Basel I, Basel II and Basel III
Regulatory versus Economic capital
Role of capital
Banks should have sufficient capital to protect themselves and
their depositors from the risks (credit/market/operational) they are
taking
Regulatory capital
Amount of capital a bank should have according to a regulation
(e.g. Basel I or Basel II)
Economic capital
Amount of capital a bank has based on internal modeling strategy
and policy
Types of capital:
– Tier 1: common stock, preferred stock, retained earnings
– Tier 2: revaluation reserves, undisclosed reserves, general
provisions, subordinated debt (can vary from country to
country)
– Tier 3: specific to market risk
The Basel I and II Capital Accords
Basel Committee
Central Banks/Bank regulators of major (G10) industrial countries
Meet every 3 months at Bank for International Settlements (BIS) in
Basel
Basel I Capital Accord 1988
Aim is to set up minimum regulatory capital requirements in order to
ensure that banks are able, at all times, to give back depositors’ funds
Capital ratio = available capital/risk-weighted assets
Capital ratio should be > 8%
Both tier 1 and tier 2 capital
Need for new Accord
Regulatory arbitrage
Not sufficient recognition of collateral guarantees
Solvency of debtor not taken into account
Market risk later introduced in 1996
No operational risk
Basel I: Example
Minimum capital = 8% of risk-weighted assets
Fixed risk rates
Retail:
– Cash: 0%
– Mortgages: 50%
– Other commercial: 100%
Example: $100 mortgage
– 50% risk weight → $50 risk-weighted assets
– 8% → $4 capital needed
Risk weights are independent of obligor and facility
characteristics!
Three Pillars of Basel II
Pillar 1: Minimum Capital Requirement
– Credit Risk
Standard Approach
Internal Ratings Based Approach (Foundation or Advanced)
– Operational Risk
– Market Risk
Pillar 2: Supervisory Review Process
– Sound internal processes to evaluate risk (ICAAP)
– Supervisory monitoring
Pillar 3: Market Discipline and Public Disclosure
– Semi-annual disclosure of:
Bank’s risk profile
Qualitative and quantitative information
Risk management processes
Risk management strategy
– Objective is to inform potential investors
Pillar 1: Minimum Capital Requirements
Credit Risk: key components
– Probability of default (PD) (decimal)
– Loss given default (LGD) (decimal)
– Exposure at default (EAD) (currency)
– Maturity (M)
Expected Loss (EL) = PD x LGD x EAD
For example, PD=1%, LGD=20%, EAD=1000 Euros → EL = 2 Euros
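In code, expected loss is simply the product of the three risk parameters. A one-line Python sketch reproducing the slide's numbers (illustration only; the course tooling is SAS):

```python
# Expected loss under Basel II: EL = PD x LGD x EAD
def expected_loss(pd, lgd, ead):
    return pd * lgd * ead

# The slide's example: PD = 1%, LGD = 20%, EAD = 1000 euros -> EL = 2 euros
print(expected_loss(pd=0.01, lgd=0.20, ead=1000))
```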
Standardized Approach
Internal Ratings Based Approach
In the U.S., initially, in the notice of proposed rule making, a
difference was made between ELGD (expected LGD) and LGD
(based on economic downturn). This was later abandoned.
The Standardized Approach
Risk assessments (for example, AAA, AA, BBB, ...)
are based on external credit assessment institution
(ECAI) (for example, Standard & Poor’s, Moody’s, ...)
Eligibility criteria for ECAI given in Accord (objectivity,
independence, ...)
Risk weights given in Accord to compute
Risk Weighted Assets (RWA)
Risk weights for sovereigns, banks, corporates, ...
Minimum Capital = 0.08 x RWA
The Standardized Approach
Example:
Corporate / 1 mio dollars / maturity 5 years / unsecured / external rating AA
→ RW=20%, RWA=0.2 mio, Reg. Cap.=0.016 mio
Credit risk mitigation (collateral)
guidelines provided in accord (simple versus
comprehensive approach)
For retail:
Risk weight = 75% for non-mortgage; 35% for mortgage
But:
Inconsistencies
Sufficient coverage?
Need for individual risk profile!
No LGD, EAD
Internal Ratings Based (IRB) Approach
Two variants: Foundation and Advanced

Approach     PD                 LGD                   EAD
Foundation   Internal estimate  Regulator's estimate  Regulator's estimate
Advanced     Internal estimate  Internal estimate     Internal estimate
IRB Approach
Split exposure into 5 categories:
– corporate (5 subclasses)
– sovereign
– bank
– retail (residential mortgage, revolving, other retail
exposures)
– equity
Foundation IRB approach not allowed for retail
Risk weight functions to derive capital requirements
Phased rollout of the IRB approach across banking
group using implementation plan
Transition period of 3 years
Basel II: Retail Specifics
Retail exposures can be pooled
– Estimate PD/LGD/EAD per rating/pool/segment!
– Note: pool=segment=grade=rating
Default definition
– obligor is past due more than 90 days
(can be relaxed during transition period)
– default can be applied at the level of the credit facility
– varies per country (e.g. with materiality thresholds)!
PD is the greater of the one-year estimated PD or 0.03%
Length of underlying historical observation period to
estimate loss characteristics must be at least 5 years (can
be relaxed during transition period, minimum 2 years at the
start)
No maturity (M) adjustment for retail!
Developing a Rating Based System
Application and behavioral scoring models provide ranking
of customers according to risk
This was OK in the past (for example, for loan approval),
but Basel II requires well-calibrated default probabilities
Map the scores or probabilities to a number of distinct
borrower ratings/pools
Decide on number of ratings and their definition
– Typically around 15 ratings
– Can be defined by mapping onto Moody’s, S&P scale,
or expert based!
– Impact on regulatory capital!
(Diagram: score axis with cut-offs at 600, 580, 570, and 520 partitioning the score range into ratings A to E)
Developing a Rating System
For corporate, sovereign, and bank exposures
“To meet this objective, a bank must have a minimum of
seven borrower grades for non-defaulted borrowers and
one for those that have defaulted”, paragraph 404 of the
Basel II Capital Accord.
For retail
“For each pool identified, the bank must be able to provide
quantitative measures of loss characteristics (PD, LGD,
and EAD) for that pool. The level of differentiation for IRB
purposes must ensure that the number of exposures in a
given pool is sufficient so as to allow for meaningful
quantification and validation of the loss characteristics at
the pool level. There must be a meaningful distribution of
borrowers and exposures across pools. A single pool must
not include an undue concentration of the bank’s total retail
exposure”, paragraph 409 of the Basel II Capital Accord.
A Basel II Project Plan
New customers: application scoring
1 year PD (needed for Basel II)
Loan term/18-month PD (needed for loan approval)
Customers having credit: behavioral scoring
1 year PD
Other existing customers applying for credit:
mix application/behavioural scoring
1 year PD
Loan term/18-month PD
Estimating Well-calibrated PDs
Calculate empirical 1year realized default rates for
borrowers in a specific grade/pool
Use 5 years of data to estimate future PD (retail)
How to calibrate the PD?
– Average
– Maximum
– Upper percentile (stressed/conservative PD)
– Dynamic model (time series, e.g. ARIMA)
1-year PD for rating grade X =
(number of obligors with rating X at the beginning of a given time period
that defaulted during that period) /
(number of obligors with rating X at the beginning of the given time period)
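The calibration choices listed above (average versus a conservative upper percentile, plus the Basel II retail PD floor) can be sketched in a few lines of Python. The default-rate history below is invented for illustration; this is not part of the course's SAS workflow.

```python
# PD calibration sketch for one rating grade: from a history of yearly realized
# default rates, take the average (best estimate) or an upper percentile
# (stressed/conservative PD), then apply the 0.03% Basel II retail floor.
from statistics import mean, quantiles

realized_default_rates = [0.010, 0.013, 0.009, 0.016, 0.021]  # 5 years (retail minimum)

avg_pd = mean(realized_default_rates)
# 90th percentile as a conservative calibration; the "inclusive" method keeps
# the estimate within the observed range
conservative_pd = quantiles(realized_default_rates, n=10, method="inclusive")[-1]
# PD is the greater of the estimate and the 0.03% floor
calibrated_pd = max(avg_pd, 0.0003)
print(round(avg_pd, 4), round(conservative_pd, 4), round(calibrated_pd, 4))
```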
Estimating Well-calibrated PDs
(Chart: empirical 12-month PD for rating grade B over a 60-month window, fluctuating roughly between 0% and 5%)
Basel II model architecture
Level 0: data (Internal Data, External Data, Expert Judgment)
Level 1: model (Scorecard, Logistic Regression)
Level 2: risk ratings definition, PD calibration

Example scorecard (Level 1):
Characteristic Name   Attribute     Scorecard Points
AGE 1                 Up to 26      100
AGE 2                 26 - 35       120
AGE 3                 35 - 37       185
AGE 4                 37+           225
GENDER 1              Male          90
GENDER 2              Female        180
SALARY 1              Up to 500     120
SALARY 2              501 - 1000    140
SALARY 3              1001 - 1500   160
SALARY 4              1501 - 2000   200
SALARY 5              2001+         240

(Chart: empirical 12-month PD for rating grade B, used for PD calibration at Level 2)
Risk Weight Functions for Retail
N() is the cumulative standard normal distribution, N^-1() is the inverse
cumulative standard normal distribution, and ρ is the asset correlation
Regulatory capital = K · EAD

K = LGD · N( (1-ρ)^(-1/2) · N^-1(PD) + (ρ/(1-ρ))^(1/2) · N^-1(0.999) ) - PD · LGD

Asset correlations:
– Residential mortgage exposures: ρ = 0.15
– Qualifying revolving exposures: ρ = 0.04
– Other retail exposures:
ρ = 0.03 · (1 - e^(-35·PD))/(1 - e^(-35)) + 0.16 · (1 - (1 - e^(-35·PD))/(1 - e^(-35)))
Risk Weight Functions For Corporates/Sovereigns/Banks
M represents the nominal or effective maturity (between 1 and 5 years)

K = [ LGD · N( (1-ρ)^(-1/2) · N^-1(PD) + (ρ/(1-ρ))^(1/2) · N^-1(0.999) ) - PD · LGD ]
    · (1 + (M - 2.5) · b) / (1 - 1.5 · b)
with maturity adjustment b = (0.11852 - 0.05478 · ln(PD))²
and asset correlation
ρ = 0.12 · (1 - e^(-50·PD))/(1 - e^(-50)) + 0.24 · (1 - (1 - e^(-50·PD))/(1 - e^(-50)))

Firm size adjustment for SMEs
– S represents total annual sales in millions of euros (between 5 and 50 million)
– subtract 0.04 · (1 - (S - 5)/45) from the asset correlation:
ρ = 0.12 · (1 - e^(-50·PD))/(1 - e^(-50)) + 0.24 · (1 - (1 - e^(-50·PD))/(1 - e^(-50))) - 0.04 · (1 - (S - 5)/45)
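The corporate risk weight function can be sketched directly from the formula above using only the Python standard library (the course tooling is SAS; this stand-alone check and its PD/LGD/M values are purely illustrative, without the SME adjustment).

```python
# Sketch of the Basel II corporate/sovereign/bank risk weight function.
from statistics import NormalDist
from math import exp, log, sqrt

N = NormalDist().cdf          # cumulative standard normal
N_inv = NormalDist().inv_cdf  # its inverse

def corporate_k(pd, lgd, m):
    """Capital requirement K per unit of EAD (no SME firm-size adjustment)."""
    rho_w = (1 - exp(-50 * pd)) / (1 - exp(-50))      # correlation weight
    rho = 0.12 * rho_w + 0.24 * (1 - rho_w)           # asset correlation
    b = (0.11852 - 0.05478 * log(pd)) ** 2            # maturity adjustment
    k_base = lgd * N(N_inv(pd) / sqrt(1 - rho)
                     + sqrt(rho / (1 - rho)) * N_inv(0.999)) - pd * lgd
    return k_base * (1 + (m - 2.5) * b) / (1 - 1.5 * b)

k = corporate_k(pd=0.01, lgd=0.45, m=2.5)
print(round(k, 4))         # capital per unit of exposure
print(round(12.5 * k, 4))  # implied risk weight (RWA / EAD)
```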
Risk Weight Functions
Regulatory Capital=K (PD, LGD). EAD
Only covers unexpected loss (UL)!
– Expected loss=LGD.PD
– Expected loss covered by provisions!
RWA not explicitly calculated. Can be backed out as follows:
– Regulatory Capital=8% . RWA
– RWA=12.50 x Regulatory Capital=12.50 x K x EAD
Note: scaling factor of 1.06 for credit-risk-weighted assets (based on
QIS 3, 4 and 5)
The asset correlations ρ are chosen depending on the business
segment (corporate, sovereign, bank, residential mortgages,
revolving, other retail) using some empirical but not published
procedure!
– Based on reverse engineering of economic capital models from
large banks
– The asset value correlations reflect a combination of supervisory
judgment and empirical evidence (Federal Register)
– Note: asset correlations also measure how the asset class is
dependent on the state of the economy
LGD/EAD Errors More Expensive Than PD Errors
Consider a credit card portfolio where
– PD = 0.03; LGD = 0.5; EAD = $10,000
Basel formulae give capital requirement of K(PD,LGD)·EAD
K(0.03, 0.50)(10,000) = $343.7
10% overestimate on PD means capital required is
K(0.033, 0.50)(10,000) = $367.3
10% overestimate on LGD means capital required is
K(0.03, 0.55)(10,000) = $378.0
10% overestimate on EAD means capital required is
K(0.03, 0.50)(11,000) = $378.0
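These sensitivity numbers can be reproduced (up to rounding) with a short Python sketch of the qualifying-revolving retail formula (ρ = 0.04), using only the standard library; Python is used here purely for illustration alongside the course's SAS tooling.

```python
# Retail Basel II capital requirement and its sensitivity to 10% errors in
# PD, LGD, and EAD, for a credit card portfolio (rho = 0.04).
from statistics import NormalDist
from math import sqrt

N = NormalDist().cdf
N_inv = NormalDist().inv_cdf

def retail_k(pd, lgd, rho):
    """Basel II retail capital requirement per unit of EAD."""
    return lgd * N(N_inv(pd) / sqrt(1 - rho)
                   + sqrt(rho / (1 - rho)) * N_inv(0.999)) - pd * lgd

base = retail_k(0.03, 0.50, rho=0.04) * 10_000    # ~ $343.7
pd_up = retail_k(0.033, 0.50, rho=0.04) * 10_000  # 10% PD overestimate
lgd_up = retail_k(0.03, 0.55, rho=0.04) * 10_000  # 10% LGD overestimate
ead_up = retail_k(0.03, 0.50, rho=0.04) * 11_000  # 10% EAD overestimate
print([round(x, 1) for x in (base, pd_up, lgd_up, ead_up)])
```

Because K is linear in LGD and capital is linear in EAD, the LGD and EAD overestimates move capital by exactly 10%, while the PD overestimate moves it by less: LGD/EAD errors are more expensive than PD errors.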
The Basel II Risk Weight Functions
Basel III
Objective: fundamentally strengthen global capital standards
Key attention points:
– focus on tangible equity capital since this is the component with
the greatest lossabsorbing capacity
– reduced reliance on banks’ internal models
– reduce the reliance on external ratings
– greater focus on stress testing
Loss-absorbing capacity beyond common standards for
systemically important banks
Tier 3 capital eliminated
Address model risk by e.g. introducing a non-risk-based leverage
ratio as a backstop
Liquidity coverage ratio and the Net Stable Funding Ratio
No major impact on underlying credit risk models!
The new standards will take effect on 1 January 2013 and for the
most part will become fully effective by January 2019
Basel II versus Basel III

                                      Basel II                  Basel III
Tier 1 capital ratio                  4% · RWA                  6% · RWA (by 2015)
Core tier 1 capital ratio
(common equity = shareholders'
equity + retained earnings)           2% · RWA                  4.5% · RWA (by 2015)
Capital conservation buffer
(common equity)                       -                         2.5% · RWA (by 2019)
Countercyclical buffer                -                         0% - 2.5% · RWA (by 2019)
Non-risk-based leverage ratio
(tier 1 capital)                      -                         3% · Assets (by 2018)
Total capital ratio                   [Tier 1 Capital Ratio]    [Tier 1 Capital Ratio] +
                                      + [Tier 2 Capital Ratio]  [Capital Conservation Buffer] +
                                      + [Tier 3 Capital Ratio]  [Countercyclical Capital Buffer] +
                                                                [Capital for Systemically Important Banks]

Note: Assets include off-balance-sheet exposures and derivatives.
Section 3
Preprocessing Data
for Credit Scoring and
PD Modeling
Motivation
Dirty, noisy data
– For example, Age= 2003
Inconsistent data
– Value “0” means actual zero or missing value
Incomplete data
– Income=?
Data integration and data merging problems
– Amounts in euros versus amounts in dollars
Duplicate data
– Salary versus Professional Income
Success of the preprocessing steps is crucial to success of
following steps
Garbage in, Garbage out (GIGO)
Very time consuming (80% rule)
Preprocessing Data for Credit Scoring
Types of variables
Sampling
Missing values
Outlier detection and treatment
Coarse classification of attributes
Weights of Evidence and Information Value
Segmentation
Definition of target variable
Types of Variables
Continuous
– Defined on a continuous interval
– For example, income, amount on savings account, ...
– In SAS: Interval
Discrete
– Nominal
No ordering between values
For example, purpose of loan, marital status
– Ordinal
Implicit ordering between values
For example, credit rating (AAA is better than AA, AA is better
than A, …)
– Binary
For example, gender
Sampling
Take sample of past applicants to build scoring model
Think carefully about the population on which the model, built
using the sample, is going to be used
Timing of sample
– How far do I go to get my sample?
– Tradeoff: many data versus recent data
Number of bads versus number of goods
– Undersampling, oversampling may be needed (dependent on
classification algorithm, cf. infra)
Sample taken must be from a normal business period, to get as
accurate a picture as possible of the target population
Make sure performance window is long enough to stabilize bad rate
(e.g. 18 months!)
Example sampling problems
– Application scoring: reject inference
– Behavioural scoring: seasonality depending upon the choice of
the observation point
Missing Values
Keep or Delete or Replace
Keep
– The fact that a variable is missing can be important
information!
– Encode variable in a special way (e.g. separate
category during coarse classification)
Delete
– When too many missing values, removing the
variable or observation might be an option
– Horizontally versus vertically missing values
Replace
– Estimate missing value using imputation
procedures
Imputation Procedures for Missing Values
For continuous attributes
– Replace with median/mean (median more robust to outliers)
– Replace with median/mean of all instances of the same class
For ordinal/nominal attributes
– Replace with modal value (= most frequent category)
– Replace with modal value of all instances of the same class
Advanced schemes
– Regression or treebased imputation
Predict missing value using other variables
– Nearest neighbour imputation
Look at nearest neighbours (e.g. in Euclidean sense) for
imputation
– Markov Chain Monte Carlo methods
– Not recommended!
Dealing with Missing Values in SAS
data credit;
input income savings;
datalines;
1300 400
1000 .
2000 800
. 200
1700 600
2500 1000
2200 900
. 1000
1500 .
;
proc standard data=credit replace out=creditnomissing;
run;
Impute node in
Enterprise Miner!
Outliers
E.g., due to recording or data entry errors or noise
Types of outliers
– Valid observation: e.g. salary of boss, ratio
variables
– Invalid observation: age =2003
Outliers can be hidden in one dimensional views of the
data (multidimensional nature of data)
Univariate outliers versus multivariate outliers
Detection versus treatment
Univariate Outlier Detection Methods
Visually
– histogram based
– box plots
z-score
Measures how many standard deviations an observation
is away from the mean for a specific variable:
z_i = (x_i - µ)/σ
where µ is the mean of variable x and σ its standard deviation
Outliers are observations having |z_i| > 3 (or 2.5)
Note: the mean of the z-scores is 0, their standard deviation is 1
Calculate the z-score in SAS using PROC STANDARD
Univariate Outlier Detection Methods
Box Plot
A box plot is a visual representation of five numbers:
– Median M: P(X ≤ M) = 0.50
– First quartile Q1: P(X ≤ Q1) = 0.25
– Third quartile Q3: P(X ≤ Q3) = 0.75
– Minimum
– Maximum
with the interquartile range IQR = Q3 - Q1
(Figure: box from Q1 to Q3 with the median M inside, whiskers extending 1.5 · IQR beyond the box, and points outside the whiskers flagged as outliers)
Multivariate Outlier Detection Methods
Multivariate outliers!
Treatment of Outliers
For invalid outliers:
– E.g. age=300 years
– Treat as missing value (keep, delete, replace)
For valid outliers:
– Truncation based on z-scores:
Replace all variable values having z-scores > 3
by the mean + 3 times the standard deviation
Replace all variable values having z-scores < -3
by the mean - 3 times the standard deviation
– Truncation based on IQR (more robust than z-scores)
Truncate to M ± 3s, with M = median and s = IQR/(2 x 0.6745)
See Van Gestel, Baesens et al., 2007
– Truncation using a sigmoid
Use a sigmoid transform, f(x) = 1/(1 + e^(-x))
– Also referred to as winsorizing
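The IQR-based truncation rule can be sketched as follows (Python for illustration alongside the course's SAS tools; the income values, including the extreme one, are invented):

```python
# IQR-based truncation of valid outliers: truncate to M +/- 3s, with M the
# median and s = IQR/(2 x 0.6745), a robust estimate of the standard deviation.
from statistics import median, quantiles

def truncate_iqr(values, limit=3.0):
    m = median(values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    s = (q3 - q1) / (2 * 0.6745)
    lower, upper = m - limit * s, m + limit * s
    return [min(max(v, lower), upper) for v in values]

incomes = [1300, 1000, 2000, 2100, 1700, 2500, 2200, 700, 1500, 100_000]
capped = truncate_iqr(incomes)
print(max(capped) < 100_000)  # True: the extreme income is pulled back
```

Note that the median and IQR are barely affected by the single extreme value, so the rule caps it; a plain z-score rule can miss such an outlier in small samples, because the outlier itself inflates the mean and standard deviation.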
Computing the zscores in SAS
data credit;
input age income;
datalines;
34 1300
24 1000
20 2000
40 2100
54 1700
39 2500
23 2200
34 700
56 1500
;
proc standard data=credit mean=0 std=1 out=creditnorm;
run;
Discretization and Grouping of Attribute
Values
Motivation
– Group values of categorical variables for robust analysis
– Introduce nonlinear effects for continuous variables
– Classification technique requires discrete inputs (for example, Bayesian network classifiers)
– Create concept hierarchies (group low-level concepts,
for example, raw age data, into higher-level concepts, for example, young,
middle aged, old)
Also called coarse classification, classing, categorisation, binning, grouping, …
Methods
– Equal interval binning
– Equal frequency binning (histogram equalization)
– Chi-squared analysis
– Entropy-based discretization (using e.g. decision trees)
Coarse Classifying Continuous Variables
Lyn Thomas, A survey of credit and behavioural scoring: forecasting
financial risk of lending to consumers, International Journal of
Forecasting, 16, 149-172, 2000
The Binning Method
For example, consider the attribute income
– 1000, 1200, 1300, 2000, 1800, 1400
Equal interval binning
– Bin width=500
– Bin 1, [1000, 1500[ : 1000, 1200, 1300, 1400
– Bin 2, [1500, 2000[ : 1800, 2000
Equal frequency binning (histogram equalization)
– 2 bins
– Bin 1: 1000, 1200, 1300
– Bin 2: 1400, 1800, 2000
However, neither of these methods takes the default risk
into account!
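The two simple binning schemes above, applied to the slide's income values, can be sketched in a few lines of Python (illustration only; bin edges follow the slide, with the last interval closed):

```python
# Equal interval vs. equal frequency binning on the slide's income values.
incomes = [1000, 1200, 1300, 2000, 1800, 1400]

# Equal interval binning: fixed-width bins (width 500) over the value range
bin1 = [v for v in incomes if 1000 <= v < 1500]    # [1000, 1500[
bin2 = [v for v in incomes if 1500 <= v <= 2000]   # [1500, 2000], last bin closed
print(sorted(bin1), sorted(bin2))

# Equal frequency binning (histogram equalization): equally populated bins
ordered = sorted(incomes)
half = len(ordered) // 2
print(ordered[:half], ordered[half:])
```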
The Chisquared Method
Consider the following example (taken from the book,
Credit Scoring and Its Applications, by Thomas,
Edelman and Crook, 2002)
Suppose we want three categories
Should we take option 1: owners, renters, and others
or option 2: owners, with parents, and others
Attribute       Owner  Rent         Rent       With     Other  No      Total
                       Unfurnished  Furnished  parents         answer
Goods           6000   1600         350        950      90     10      9000
Bads            300    400          140        100      50     10      1000
Good:bad odds   20:1   4:1          2.5:1      9.5:1    1.8:1  1:1     9:1
The Chisquared Method
The number of good owners given that the odds are the same as in
the whole population are (6000+300) x 9000/10000=5670.
Likewise, the number of bad renters given that the odds are the same
as in the whole population are
(1600+400+350+140) x 1000/10000=249.
Compare the observed frequencies with the theoretical frequencies
assuming equal odds using a chisquared test statistic.
Option 1 (owners, renters, others):
χ² = (6000-5670)²/5670 + (300-630)²/630 + (1950-2241)²/2241
   + (540-249)²/249 + (1050-1089)²/1089 + (160-121)²/121 = 583
Option 2 (owners, with parents, others):
χ² = (6000-5670)²/5670 + (300-630)²/630 + (950-945)²/945
   + (100-105)²/105 + (2050-2385)²/2385 + (600-265)²/265 = 662
The higher the test statistic, the better the split (formally, compare
with a chi-squared distribution with k-1 degrees of freedom for k classes
of the characteristic).
Option 2 is the better split!
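The two statistics can be recomputed directly from the observed counts (a Python sketch; expected counts assume the population's 9:1 good:bad odds in each class, and the results match the slide up to rounding):

```python
# Chi-squared statistic for a candidate coarse classification: compare observed
# good/bad counts per class with the counts expected under the overall odds.
def chi_squared(observed_goods, observed_bads, p_good=0.9):
    stat = 0.0
    for g, b in zip(observed_goods, observed_bads):
        total = g + b
        exp_g, exp_b = total * p_good, total * (1 - p_good)
        stat += (g - exp_g) ** 2 / exp_g + (b - exp_b) ** 2 / exp_b
    return stat

# Option 1: owners, renters, others
opt1 = chi_squared([6000, 1950, 1050], [300, 540, 160])
# Option 2: owners, with parents, others
opt2 = chi_squared([6000, 950, 2050], [300, 100, 600])
print(round(opt1), round(opt2))  # option 2 has the higher statistic
```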
Coarse Classification in SAS
data residence;
input default$ resstatus$ count;
datalines;
good owner 6000
good rentunf 1600
good rentfurn 350
good withpar 950
good other 90
good noanswer 10
bad owner 300
bad rentunf 400
bad rentfurn 140
bad withpar 100
bad other 50
bad noanswer 10
;
data coarse1;
input default$ resstatus$ count;
datalines;
good owner 6000
good renter 1950
good other 1050
bad owner 300
bad renter 540
bad other 160
;
data coarse2;
input default$ resstatus$ count;
datalines;
good owner 6000
good withpar 950
good other 2050
bad owner 300
bad withpar 100
bad other 600
;
proc freq data=coarse1;
weight count;
tables default*resstatus / chisq;
run;
proc freq data=coarse2;
weight count;
tables default*resstatus / chisq;
run;
Recoding Nominal/Ordinal Variables
Nominal variables
– Dummy encoding
Ordinal variables
– Thermometer encoding
Weights of evidence coding
77
Dummy and Thermometer Encoding

Dummy encoding (nominal input):
                       Input 1  Input 2  Input 3
Purpose=car               1        0        0
Purpose=real estate       0        1        0
Purpose=other             0        0        1

Thermometer encoding (ordinal input):
Original Input               Categorical  Thermometer Inputs
                             Input        Input 1  Input 2  Input 3
Income ≤ 800 Euro                 1          0        0        0
800 < Income ≤ 1200 Euro          2          0        0        1
1200 < Income ≤ 10000 Euro        3          0        1        1
Income > 10000 Euro               4          1        1        1

Many inputs: explosion of the data set!
78
Weights of Evidence
Measures the risk represented by each (grouped) attribute
category (for example, age 23-26)
The higher the weight of evidence (in favor of being
good), the lower the risk for that category

WOE_category = ln(p_good_category / p_bad_category), where:
  p_good_category = (number of goods in category) / (total number of goods)
  p_bad_category  = (number of bads in category) / (total number of bads)

If p_good_category > p_bad_category then WOE > 0
If p_good_category < p_bad_category then WOE < 0
continued...
continued...
79
Weights of Evidence

Age      Count  Distr. Count  Goods  Distr. Goods  Bads  Distr. Bads     WOE
Missing    50       2.50%       42       2.33%       8      4.12%      -57.28
18-22     200      10.00%      152       8.42%      48     24.74%     -107.83
23-26     300      15.00%      246      13.62%      54     27.84%      -71.47
27-29     450      22.50%      405      22.43%      45     23.20%       -3.38
30-35     500      25.00%      475      26.30%      25     12.89%       71.34
35-44     350      17.50%      339      18.77%      11      5.67%      119.71
44+       150       7.50%      147       8.14%       3      1.55%      166.08
Total:   2000                 1806                 194

WOE = ln(Distr. Goods / Distr. Bads) x 100
...
80
Information Value
Information Value (IV) is a measure of predictive
power used to:
– assess the appropriateness of the classing
– select predictive variables
IV is similar to an entropy measure (cf. infra):

IV = Σ (p_good_category - p_bad_category) x WOE_category

Rule of thumb:
– < 0.02 : unpredictive
– 0.02 – 0.1 : weak
– 0.1 – 0.3 : medium
– 0.3 + : strong

Age      Distr. Goods  Distr. Bads     WOE      IV
Missing      2.33%        4.12%      -57.28   0.0103
18-22        8.42%       24.74%     -107.83   0.1760
23-26       13.62%       27.84%     -71.47    0.1016
27-29       22.43%       23.20%      -3.38    0.0003
30-35       26.30%       12.89%      71.34    0.0957
35-44       18.77%        5.67%     119.71    0.1568
44+          8.14%        1.55%     166.08    0.1095
Information Value: 0.6502
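The WOE and IV columns above can be recomputed from the raw counts of the age table. An illustrative Python sketch (not part of the course's SAS material):

```python
import math

# Goods and bads per age category, taken from the table above
goods = {"Missing": 42, "18-22": 152, "23-26": 246, "27-29": 405,
         "30-35": 475, "35-44": 339, "44+": 147}
bads  = {"Missing": 8, "18-22": 48, "23-26": 54, "27-29": 45,
         "30-35": 25, "35-44": 11, "44+": 3}

total_goods, total_bads = sum(goods.values()), sum(bads.values())

iv = 0.0
for cat in goods:
    p_good = goods[cat] / total_goods          # distribution of goods
    p_bad = bads[cat] / total_bads             # distribution of bads
    woe = math.log(p_good / p_bad)             # weight of evidence
    iv += (p_good - p_bad) * woe               # contribution to IV
    print(f"{cat:8s} WOE = {100 * woe:8.2f}")  # in the slides' x100 scale

print(f"Information Value = {iv:.4f}")
```

The total comes out at about 0.65, i.e. a strong predictor by the rule of thumb above.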
81
WOE: Logical Trend?
[Figure: predictive strength; WOE (y-axis, roughly -200 to 150) plotted per age
band (Missing, 18-22, 23-26, 27-29, 30-35, 35-44, 44+), showing a rising trend]
Younger people tend to represent a higher risk
than the older population
82
Segmentation
When more than one scorecard is required
Build a scorecard for each segment separately
Three reasons (Thomas, Jo, and Scherer, 2001)
– Strategic
Banks might want to adopt special strategies to specific
segments of customers (for example, lower cutoff
score depending on age or residential status).
– Operational
New customers must have a separate scorecard
because the characteristics in the standard scorecard
do not make sense operationally for them.
– Variable Interactions
If one variable interacts strongly with a number of
others, it might be sensible to segment according to
this variable.
continued...
83
Segmentation
Two ways
– Experience based
In cooperation with financial expert
– Statistically based
Use clustering algorithms (k-means,
decision trees, …)
Let the data speak
Better predictive power than single model
if successful!
84
Decision Trees for Clustering Segmentation

All Good/Bads (bad rate = 3.8%)
├─ Existing Customer (bad rate = 1.2%)
│   ├─ Customer > 2 yrs (bad rate = 0.3%)
│   └─ Customer < 2 yrs (bad rate = 2.8%)
└─ New Applicant (bad rate = 6.3%)
    ├─ Age < 30 (bad rate = 11.7%)
    └─ Age > 30 (bad rate = 3.2%)

4 segments
85
Definitions of Bad
30/60/90 days past due
Charge off/write off
Bankrupt
Claim over $1000
Profit based
Negative NPV
Less than x% owed collected
Fraud over $500
Basel II: 90 days, but can be changed by national
regulators (for example, U.S., UK, Belgium, …)
Section 4
Classification Techniques
for Credit Scoring and
PD Modeling
87
The Classification Problem
The classification problem can be stated as follows:
– Given an observation (also called pattern) with
characteristics x = (x_1, ..., x_n), determine its class c
from a predetermined set of classes {c_1, ..., c_m}
The classes {c_1, ..., c_m} are known beforehand
Supervised learning!
Binary (2 classes) versus multiclass (> 2 classes)
classification
Regression for classification
Linear regression gives: Y = β0 + β1 Age + β2 Income + β3 Gender
Can be estimated using OLS (in SAS use proc reg or proc glm)
Two problems:
• No guarantee that Y is between 0 and 1 (i.e. a probability)
• Target/errors not normally distributed
Use a bounding function to limit the outcome between 0 and 1:
88
Customer  Age  Income  Gender  G/B
John       30    1200     M     B
Sarah      25     800     F     G
Sophie     52    2200     F     G
David      48    2000     M     B
Peter      34    1800     M     G

Code the target as Y = 1 (good) or Y = 0 (bad):

Customer  Age  Income  Gender  G/B  Y
John       30    1200     M     B   0
Sarah      25     800     F     G   1
Sophie     52    2200     F     G   1
David      48    2000     M     B   0
Peter      34    1800     M     G   1

f(z) = 1 / (1 + e^(-z))
[Figure: the logistic (sigmoid) curve, rising from 0 to 1 as z goes from -7 to 7]
Logistic Regression
• Linear regression with a transformation such that the output
is always between 0 and 1 can thus be interpreted as a
probability (e.g. probability of good customer):

P(customer = good | age, income, gender, ...) =
    1 / (1 + e^(-(β0 + β1 age + β2 income + β3 gender + ...)))

• Or, alternatively:

ln( P(customer = good | age, income, gender, ...) /
    P(customer = bad | age, income, gender, ...) ) =
    β0 + β1 age + β2 income + β3 gender + ...

• The parameters can be estimated in SAS using proc logistic
• Once the model has been estimated using historical data, we
can use it to score or assign probabilities to new data
89
Logistic Regression in SAS
90
Customer Age Income Gender Response
John 30 1200 M No
Sarah 25 800 F Yes
Sophie 52 2200 F Yes
David 48 2000 M No
Peter 34 1800 M Yes
Historical data
proc logistic data=responsedata;
class Gender;
model Response=Age Income Gender;
run;
Customer Age Income Gender Response score
Emma 28 1000 F 0,44
Will 44 1500 M 0,76
Dan 30 1200 M 0,18
Bob 58 2400 M 0,88
New data, scored with the estimated model:
P(response = yes | Age, Income, Gender, ...) =
    1 / (1 + e^(-(0.10 + 0.22 age + 0.050 income - 0.80 gender + ...)))
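The scoring step can be sketched in a few lines of Python. The coefficients below are the ones shown on the slide; the gender coding (M = 1, F = 0) and the dropped "..." terms are assumptions of this sketch, so the numbers are schematic rather than a reproduction of the table above.

```python
import math

def sigmoid(z):
    """Bounding function mapping any z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Coefficients as on the slide; gender coding (M = 1, F = 0) is an assumption.
def score(age, income, gender_m):
    """P(response = yes | age, income, gender) from the fitted model."""
    z = 0.10 + 0.22 * age + 0.050 * income - 0.80 * gender_m
    return sigmoid(z)

print(round(score(0, 0, 0), 3))  # sigmoid(0.10) ≈ 0.525
```

Because the bounding function is monotone, a higher linear score z always means a higher probability.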
91
Logistic Regression
The logistic regression model is formulated as follows:

P(Y=1 | X_1,...,X_n) = 1 / (1 + e^(-(β0 + β1 X_1 + ... + βn X_n)))
                     = e^(β0 + β1 X_1 + ... + βn X_n) / (1 + e^(β0 + β1 X_1 + ... + βn X_n))

Hence,

P(Y=0 | X_1,...,X_n) = 1 - P(Y=1 | X_1,...,X_n)
                     = 1 / (1 + e^(β0 + β1 X_1 + ... + βn X_n))

so that 0 ≤ P(Y=1 | X_1,...,X_n), P(Y=0 | X_1,...,X_n) ≤ 1.

Model reformulation:

P(Y=1 | X_1,...,X_n) / P(Y=0 | X_1,...,X_n) = e^(β0 + β1 X_1 + ... + βn X_n)
continued...
92
Logistic Regression

ln( P(Y=1 | X_1,...,X_n) / P(Y=0 | X_1,...,X_n) ) = β0 + β1 X_1 + ... + βn X_n

P(Y=1 | X_1,...,X_n) / (1 - P(Y=1 | X_1,...,X_n)) is the odds in favor of Y=1

ln( P(Y=1 | X_1,...,X_n) / (1 - P(Y=1 | X_1,...,X_n)) ) is called the logit
93
Logistic Regression
If X_i increases by 1:

logit(X_i + 1) = logit(X_i) + β_i
odds(X_i + 1) = odds(X_i) . e^(β_i)

e^(β_i) is the odds ratio: the multiplicative increase
in the odds when X_i increases by one (other
variables remaining constant / ceteris paribus)

β_i > 0  =>  e^(β_i) > 1  =>  odds and probability increase with X_i
β_i < 0  =>  e^(β_i) < 1  =>  odds and probability decrease with X_i
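The odds-ratio interpretation is easy to verify numerically. In this sketch the intercept and coefficient are invented; the point is only that a one-unit increase in X_i multiplies the odds by e^(β_i):

```python
import math

b0, b1 = -2.0, 0.3       # hypothetical intercept and coefficient for X_1

def odds(x):
    """Odds in favor of Y=1 at input value x."""
    p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return p / (1.0 - p)

ratio = odds(5) / odds(4)     # odds after / before a one-unit increase
print(round(ratio, 4) == round(math.exp(b1), 4))  # the odds ratio is e^b1
```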
94
Logistic Regression and Weight of
Evidence Coding

Cust ID  Actual Age  Age Group (after classing)  Age WoE (after recoding)
   1         20        1: until 22                     -1.1
   2         31        2: 22 until 35                   0.2
   3         49        3: 35+                           0.9

continued...
95
Logistic Regression and Weight of
Evidence Coding
No dummy variables!
More robust

P(Y=1 | X_1,...,X_n) = 1 / (1 + e^(-(β0 + β1 age_woe + β2 purpose_woe + ...)))
96
Decision Trees
Recursively partition the training data
Various tree induction algorithms
– C4.5 (Quinlan 1993)
– CART: Classification and Regression Trees
(Breiman, Friedman, Olsen, and Stone 1984)
– CHAID: Chi-squared Automatic Interaction
Detection (Hartigan 1975)
Classification tree
– Target is categorical
Regression tree
– Target is continuous (interval)
97
Decision Tree Example

Income > $50,000?
├─ No  → Job > 3 Years?
│        ├─ No  → Bad Risk
│        └─ Yes → Good Risk
└─ Yes → High Debt?
         ├─ Yes → Bad Risk
         └─ No  → Good Risk

98
Decision Trees Terminology
Root: the top node (Income > $50,000)
Internal nodes: nodes with a further split (Job > 3 Years, High Debt)
Branches: the Yes/No edges leaving a node
Leaf nodes: terminal nodes holding the class label (Good Risk / Bad Risk)
Estimating decision trees from data
Splitting decision
• Which variable to split at what value (e.g. age < 30
or not, income < 1000 or not; marital
status=married or not, …)?
Stopping decision
• When to stop growing the tree?
• When to stop adding nodes to the tree?
Assignment decision
• Which class (e.g. good or bad customer) to assign
to a leaf node?
• Usually the majority class!
99
Splitting decision: entropy
Green dots represent good customers; red dots bad
customers
Impurity measures disorder/chaos in a data set
Impurity of a data sample S can be measured as follows:
– Entropy: E(S) = -p_G log2(p_G) - p_B log2(p_B)   (C4.5)
– Gini: Gini(S) = 2 p_G p_B   (CART)
100
Measuring Impurity using Entropy
[Figure: three samples; minimal impurity (all one class), maximal impurity
(half good, half bad), minimal impurity (all the other class)]
Entropy is minimal when p_G = 1 or p_B = 1 and maximal
when p_G = p_B = 0.50
Consider different candidate splits and measure for each
the decrease in entropy or gain
101
[Figure: entropy E(S) plotted against p_G, rising from 0 at p_G = 0 to a
maximum of 1 at p_G = 0.5 and back to 0 at p_G = 1]
102
Example: Gain Calculation

                    400 G, 400 B
         Age < 30  /            \  Age ≥ 30
          200 G, 400 B        200 G, 0 B

Entropy top node   = -1/2 x log2(1/2) - 1/2 x log2(1/2) = 1
Entropy left node  = -1/3 x log2(1/3) - 2/3 x log2(2/3) = 0.91
Entropy right node = -1 x log2(1) - 0 x log2(0) = 0
Gain = 1 - (600/800) x 0.91 - (200/800) x 0 = 0.32

Consider different alternative splits and pick the one with the biggest gain:

Split                      Gain
Age < 30                   0.32
Marital status=married     0.05
Income < 2000              0.48
Savings amount < 50000     0.12
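The gain calculation above can be checked directly. This sketch reports the unrounded gain, about 0.31; the slide rounds the left-node entropy to 0.91 first and reports 0.32.

```python
import math

def entropy(p_good, p_bad):
    """E(S) = -p_G log2(p_G) - p_B log2(p_B), with 0 * log2(0) taken as 0."""
    return sum(-p * math.log2(p) for p in (p_good, p_bad) if p > 0)

top = entropy(0.5, 0.5)                # 400 G / 400 B : 1 bit
left = entropy(200 / 600, 400 / 600)   # 200 G / 400 B : about 0.918
right = entropy(1.0, 0.0)              # 200 G / 0 B   : 0
gain = top - (600 / 800) * left - (200 / 800) * right
print(round(gain, 2))                  # ≈ 0.31
```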
103
Stopping decision: early stopping
As the tree is grown, the data becomes more and more segmented and
splits are based on fewer and fewer observations
Need to prevent the tree from becoming too specific and fitting the noise of
the data (aka overfitting)
Avoid overfitting by partitioning the data into a training set (used for splitting
decision) and a validation set (used for stopping decision)
Optimal tree is found where validation set error is minimal
[Figure: misclassification error versus number of tree nodes; the
training-set error keeps decreasing, while the validation-set error reaches
a minimum and then rises again. STOP growing the tree at that minimum!]
104
Advantages/Disadvantages of Trees
Advantages
– Ease of interpretation
– Nonparametric
– No need for transformations of variables
– Robust to outliers
Disadvantages
– Sensitive to changes in the training data (unstable)
– Remedy: combinations of decision trees (bagging, boosting,
random forests)
Section 5
Measuring the
Performance of Credit Scoring
Classification Models
106
How to Measure Performance?
Performance
– How well does the trained model perform in predicting
new unseen (!) instances?
– Decide on performance measure
Classification: Percentage Correctly Classified (PCC),
Sensitivity, Specificity, Area Under ROC curve
(AUROC), ...
Regression: Mean Absolute Deviation (MAD), Mean
Squared Error (MSE), ...
Methods
– Split sample method
– Single sample method
– N-fold cross-validation
107
Split Sample Method
For large data sets
– Large = more than 1,000 observations, with more than 50 bad customers
Set aside a test set (typically one-third of the data)
which is NOT used during training!
Estimate performance of trained classifier on test set
Note: for decision trees, validation set is part of
training set
Stratification
– Same class distribution in training set and test set
108
Performance Measures for Classification
Classification accuracy, confusion matrix, sensitivity,
specificity
Area under the ROC curve
The Lorenz (Power) curve
The Cumulative lift curve
Kolmogorov-Smirnov
Classification performance: confusion matrix
109
Apply a cutoff of 0.50 to the scores:

          G/B   Score   Predicted
John       G    0.72        G
Sophie     B    0.56        G
David      G    0.44        B
Emma       B    0.18        B
Bob        B    0.36        B

Confusion matrix:
                              Actual status
                     Positive (Good)          Negative (Bad)
Predicted  Positive  True Positive (John)     False Positive (Sophie)
status     Negative  False Negative (David)   True Negative (Emma, Bob)

Classification accuracy = (TP+TN)/(TP+FP+FN+TN) = 3/5
Classification error = (FP+FN)/(TP+FP+FN+TN) = 2/5
Sensitivity = TP/(TP+FN) = 1/2
Specificity = TN/(FP+TN) = 2/3
All these measures depend upon the cutoff!
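The confusion-matrix measures for the five scored customers can be recomputed as follows (an illustrative Python sketch):

```python
# The five scored customers from the slide, evaluated at a cutoff of 0.50
actual = {"John": "G", "Sophie": "B", "David": "G", "Emma": "B", "Bob": "B"}
scores = {"John": 0.72, "Sophie": 0.56, "David": 0.44, "Emma": 0.18, "Bob": 0.36}

cutoff = 0.50
tp = fp = fn = tn = 0
for name, s in scores.items():
    predicted = "G" if s >= cutoff else "B"
    if actual[name] == "G":          # actual positive (good)
        tp += predicted == "G"
        fn += predicted == "B"
    else:                            # actual negative (bad)
        fp += predicted == "G"
        tn += predicted == "B"

accuracy = (tp + tn) / (tp + fp + fn + tn)   # 3/5
sensitivity = tp / (tp + fn)                 # 1/2
specificity = tn / (fp + tn)                 # 2/3
print(tp, fp, fn, tn, accuracy, sensitivity, specificity)
```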
Classification Performance: ROC analysis
• Make a table with sensitivity and specificity for each possible cutoff
• ROC curve plots sensitivity versus 1-specificity for each possible cutoff
• Perfect model has sensitivity of 1 and specificity of 1 (i.e. upper left corner)
• Scorecard A is better than B in the figure below
• ROC curve can be summarized by the area underneath (AUC); the bigger
the better!
110

Cutoff  Sensitivity  Specificity  1-Specificity
0            1            0             1
0.01        ...          ...           ...
0.02        ...          ...           ...
…
0.99        ...          ...           ...
1            0            1             0

[Figure: ROC curves, sensitivity against (1 - specificity), for Scorecard A,
Scorecard B and a random model (the diagonal); A lies above B]
111
The Receiver Operating Characteristic
Curve
[Figure: ROC curves for Scorecard A, a random model, and Scorecard B]
...
112
The Area Under the ROC Curve
How to compare intersecting ROC curves?
The area under the ROC curve (AUC)
The AUC provides a simple figure of merit for the
performance of the constructed classifier
An intuitive interpretation of the AUC is that it provides
an estimate of the probability that a randomly chosen
instance of class 1 is correctly ranked higher than a
randomly chosen instance of class 0 (Hanley and
McNeil, 1983) (Wilcoxon or Mann-Whitney or U
statistic)
The higher the better
A good classifier should have an AUC larger than 0.5
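The Wilcoxon/Mann-Whitney interpretation suggests a direct, if inefficient, way to compute the AUC: count how often a randomly chosen good outranks a randomly chosen bad. A sketch, reusing the five scores from the confusion-matrix slide:

```python
def auc(good_scores, bad_scores):
    """Probability that a random good outranks a random bad (ties count 1/2)."""
    wins = 0.0
    for g in good_scores:
        for b in bad_scores:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(good_scores) * len(bad_scores))

# Scores of the two goods and three bads from the confusion-matrix example
print(auc([0.72, 0.44], [0.56, 0.18, 0.36]))
```

A perfect ranking gives 1.0; a random one gives about 0.5.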
Cumulative Accuracy Profile (CAP)
• Sort the population from high model score to low model score
• Measure the (cumulative) percentage of bads for each score decile
• E.g. the top 30% most likely bads according to the model captures 65% of the true
bads
• Often summarised as top-decile lift: how many bads are in the top 10%
113
114
Accuracy Ratio
The accuracy ratio (AR) is defined as:
(Area below power curve for current model - Area below power
curve for random model) /
(Area below power curve for perfect model - Area below power
curve for random model)
Perfect model has an AR of 1
Random model has an AR of 0
AR is sometimes also called the "Gini coefficient"
AR = 2 x AUC - 1
[Figure: power curves for the current and the perfect model above the random
diagonal; with A the area between the perfect and the current model curve and
B the area between the current model curve and the diagonal, AR = B / (A + B)]
115
The Kolmogorov-Smirnov (KS) Distance
Separation measure
Measures the distance between the cumulative score
distributions P(s|B) and P(s|G)

KS = max_s | P(s|G) - P(s|B) |, where:
P(s|G) = Σ_{x≤s} p(x|G)   (equals 1 - sensitivity)
P(s|B) = Σ_{x≤s} p(x|B)   (equals the specificity)

KS distance metric is the maximum vertical distance
between both curves
KS distance can also be measured on the ROC graph
– Maximum vertical distance between ROC graph and
diagonal
continued...
116
The Kolmogorov-Smirnov Distance
[Figure: cumulative distributions P(s|B) and P(s|G) against the score, both
rising from 0 to 1; the KS distance is the maximum vertical gap between them]
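A minimal sketch of the KS computation; the score lists and the cutoff grid below are my own invented illustration:

```python
def ks_distance(good_scores, bad_scores, cutoffs):
    """Maximum gap between the cumulative score distributions of G and B."""
    ks = 0.0
    for s in cutoffs:
        p_g = sum(x <= s for x in good_scores) / len(good_scores)  # P(s|G)
        p_b = sum(x <= s for x in bad_scores) / len(bad_scores)    # P(s|B)
        ks = max(ks, abs(p_g - p_b))
    return ks

goods = [0.72, 0.44, 0.80, 0.66]   # invented score lists
bads = [0.56, 0.18, 0.36, 0.25]
print(ks_distance(goods, bads, sorted(goods + bads)))
```

Evaluating at every observed score is enough, since the empirical distributions only jump at those points.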
Section 6
Setting the
Classification Cutoff
118
Setting the Classification Cutoff
P(customer is good payer | x)
Depends upon risk preference strategy
No input provided by Basel accord, capital should be in
line with risks taken
Strategy curve
– Choose cutoff that produces same acceptance
rate as previous scorecard but improves the bad
rate
– Choose cutoff that produces same bad rate as
previous scorecard but improves the acceptance
rate
119
Strategy Curve
[Figure: strategy curve; bad rate (0 to 18%) on the y-axis against accept
rate (40 to 95%) on the x-axis]
120
Setting the Cutoff Based on Marginal
Good-Bad Rates
Usually one chooses a marginal good-bad rate of
about 5:1 or 3:1
Dependent upon the granularity of the cutoff change

Cutoff  Cum. goods above score  Cum. bads above score  Marginal good-bad rate
 0.9           1400                      50                      -
 0.8           2000                     100                    12:1
 0.7           2700                     170                    10:1
 0.6           2950                     220                     5:1
 0.5           3130                     280                     3:1
 0.4           3170                     300                     2:1
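The marginal good-bad rates in the table follow from differencing the cumulative counts; a quick check in Python:

```python
# Cumulative counts above each cutoff, taken from the table above
cutoffs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
cum_goods = [1400, 2000, 2700, 2950, 3130, 3170]
cum_bads = [50, 100, 170, 220, 280, 300]

rates = []
for i in range(1, len(cutoffs)):
    marg_goods = cum_goods[i] - cum_goods[i - 1]   # extra goods accepted
    marg_bads = cum_bads[i] - cum_bads[i - 1]      # extra bads accepted
    rates.append(marg_goods / marg_bads)
    print(f"cutoff {cutoffs[i]}: {marg_goods / marg_bads:.0f}:1")
```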
121
Including a Refer Option
Refer option
– Gray zone
– Requires further (human) inspection
[Score range 0 to 400: reject below a score of 190, refer (gray zone)
between 190 and 210, accept above 210]
...
122
Baesens B., Van Gestel T., Viaene S.,
Stepanova M., Suykens J., Vanthienen J.,
Benchmarking State of the Art Classification
Algorithms for Credit Scoring, Journal of the
Operational Research Society, Volume 54,
Number 6, pp. 627-635, 2003.
Case Study 1
123
Benchmarking Study Baesens et al., 2003
8 reallife credit scoring data sets
– UK, Benelux, Internet
17 algorithms or variations of algorithms
Various cutoff setting schemes
Classification accuracy + Area under Receiver
Operating Characteristic Curve
McNemar test + DeLong, DeLong and ClarkePearson
test
continued...
124
Benchmarking Study
125
Benchmarking Study: Conclusions
Neural networks (NN) and RBF LS-SVM classifiers consistently yield good
performance in terms of both PCC and AUC
Simple linear classifiers such as LDA and logistic regression (LOG)
also gave very good performances
Most credit scoring data sets are only weakly
nonlinear
Flat maximum effect
continued...
126
Benchmarking Study: Conclusions
“Although the differences are small, they may be large
enough to have commercial implications.” (Henley and
Hand, 1997)
“Credit is an industry where improvements, even when
small, can represent vast profits if such improvement
can be sustained.” (Kelly, 1998)
For mortgage portfolios, a small increase in PD
scorecard discrimination will significantly lower the
minimum Capital Requirements in the context of Basel
II. (Experian, 2003)
The best way to augment the performance of a
scorecard is to look for better predictors!
Role of Credit Bureaus!
127
Credit Bureaus
Credit reference agencies or credit bureaus
UK: Information typically accessed through name and address
Types of information:
– Publicly available information
For example, time at address, electoral rolls, court judgments
– Previous searches from other financial institutions
– Shared contributed information by several financial institutions
Check whether applicant has loan elsewhere and how he pays
– Aggregated information
For example, data at postcode level
– Fraud Warnings
Check whether prior fraud warnings at given address
– Bureau added value
For example, build generic scorecard for small populations or
new product
In U.K.: Experian, Equifax, Call Credit
In U.S.: Equifax, Experian, TransUnion
Section 7
Input Selection for
Classification
129
Input Selection
Inputs=Features=Attributes=Characteristics=Variables
Also called attribute selection, feature selection
If n features are present, 2^n - 1 possible feature sets
can be considered
Heuristic search methods are needed!
Good feature subsets contain features highly
correlated with (predictive of) the class, yet
uncorrelated with (not predictive of) each other
(Hall and Smith 1998)
Can improve the performance and estimation time
of the classifier
continued...
130
Input Selection
Curse of dimensionality
– Number of training examples required increases
exponentially with number of inputs
– Remove irrelevant and redundant inputs
Interaction and correlation effects
Correlation implies redundancy
Difference between redundant input and relevant input
– An input can be redundant, but that does not mean
it is relevant (due to e.g. high correlation with
another input already in the model)
131
Input selection procedure
Step 1: Use a filter procedure
– For quick filtering of inputs
– Inputs are selected independent of the
classification algorithm
Step 2: Wrapper/stepwise regression
– Use the p-value of the regression for input
selection
Step 3: AUC based pruning
– Iterative procedure based on AUC
132
Filter Methods for Input Selection

                   Continuous target (e.g. LGD)      Discrete target (e.g. PD)
Continuous input   Pearson correlation               Fisher score
                   Spearman rank-order correlation
Discrete input     Fisher score                      Chi-squared analysis
                                                     Cramer's V
                                                     Information Value
                                                     Gain/Entropy
133
Chi-squared Based Filter
Under the independence assumption,
P(married and good payer) = P(married) x P(good payer) =
0.6 x 0.8
The expected number of good payers that are married is
0.6 x 0.8 x 1000 = 480.
Good Payer Bad Payer Total
Married 500 100 600
Not Married 300 100 400
800 200 1000
continued...
134
Chi-squared Based Filter
If both criteria are independent, one should have the
following contingency table
Compare using a chi-squared test statistic
Good Payer Bad Payer Total
Married 480 120 600
Not Married 320 80 400
800 200 1000
135
Chi-squared Based Filter

χ² = (500-480)²/480 + (100-120)²/120 + (300-320)²/320 + (100-80)²/80 = 10.41

χ² is chi-squared distributed with k-1 degrees of freedom, with k
being the number of classes of the characteristic
The bigger (lower) the value of χ², the more (less) predictive the
attribute is
Apply as filter: rank all attributes with respect to their
p-value and select the most predictive ones

Cramer's V = sqrt(χ² / n)

Note: Cramer's V is always bounded between 0 and 1; higher values
indicate a better predictive input.
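Both the χ² statistic and Cramer's V for the marital-status example can be recomputed from the observed contingency table:

```python
import math

observed = [[500, 100],    # married: good, bad
            [300, 100]]    # not married: good, bad

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected under independence
        chi2 += (o - e) ** 2 / e

cramers_v = math.sqrt(chi2 / n)                 # bounded between 0 and 1
print(round(chi2, 2), round(cramers_v, 3))
```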
136
Example: Filters (German credit data set)
Van Gestel, Baesens, 2008
137
Fisher score
Defined as

Fisher score = | x̄_G - x̄_B | / (s_G² + s_B²)

(x̄_G, x̄_B: class means of the input; s_G², s_B²: class variances)
Higher value indicates a more predictive input
Can also be used if target is continuous and input discrete
Example for German credit data set
Van Gestel, Baesens, 2008
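A minimal sketch of the Fisher score for one continuous input, using the slide's definition (absolute mean difference over the sum of the class variances); the income values are invented for illustration:

```python
import statistics

def fisher_score(values_good, values_bad):
    """|mean_G - mean_B| / (s_G^2 + s_B^2) for one continuous input."""
    mean_g = statistics.mean(values_good)
    mean_b = statistics.mean(values_bad)
    var_g = statistics.pvariance(values_good)
    var_b = statistics.pvariance(values_bad)
    return abs(mean_g - mean_b) / (var_g + var_b)

incomes_good = [1800, 2200, 2000, 2400]   # invented values
incomes_bad = [800, 1200, 1000, 1400]
print(fisher_score(incomes_good, incomes_bad))
```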
138
Wrapper/stepwise regression
Use the p-value to decide upon the importance of the
inputs
– p-value < 0.01: highly significant
– 0.01 < p-value < 0.05: significant
– 0.05 < p-value < 0.10: weakly significant
– 0.1 < p-value: not significant
Can be used in different ways:
– Forward
– Backward
– Stepwise
139
Search Strategies
Direction of search
– Forward selection
– Backward elimination
– Floating search
– Oscillating search
Traversing the input space
– Greedy Hill Climbing search (Best First search)
– Beam search
– Genetic algorithms
140
Example Search Space for Four Inputs

{ }
{I1}  {I2}  {I3}  {I4}
{I1,I2}  {I1,I3}  {I1,I4}  {I2,I3}  {I2,I4}  {I3,I4}
{I1,I2,I3}  {I1,I2,I4}  {I1,I3,I4}  {I2,I3,I4}
{I1,I2,I3,I4}
141
Stepwise Logistic Regression
SELECTION=
FORWARD,
PROC LOGISTIC first estimates parameters for
effects forced into the model. These effects are the
intercepts and the first n explanatory effects in the
MODEL statement, where n is the number specified
by the START= or INCLUDE= option in the MODEL
statement (n is zero by default). Next, the
procedure computes the score chi-square statistic
for each effect not in the model and examines the
largest of these statistics. If it is significant at the
SLENTRY= level, the corresponding effect is added
to the model. Once an effect is entered in the
model, it is never removed from the model. The
process is repeated until none of the remaining
effects meet the specified level for entry or until the
STOP= value is reached.
continued...
142
Stepwise Logistic Regression
SELECTION=
BACKWARD
,
Parameters for the complete model as specified in
the MODEL statement are estimated unless the
START= option is specified. In that case, only the
parameters for the intercepts and the first n
explanatory effects in the MODEL statement are
estimated, where n is the number specified by the
START= option. Results of the Wald test for
individual parameters are examined. The least
significant effect that does not meet the SLSTAY=
level for staying in the model is removed. Once an
effect is removed from the model, it remains
excluded. The process is repeated until no other
effect in the model meets the specified level for
removal or until the STOP= value is reached.
continued...
143
Stepwise Logistic Regression
SELECTION=
STEPWISE
similar to the SELECTION=FORWARD option
except that effects already in the model do not
necessarily remain. Effects are entered into and
removed from the model in such a way that each
forward selection step may be followed by one or
more backward elimination steps. The stepwise
selection process terminates if no further effect can
be added to the model or if the effect just entered
into the model is the only effect removed in the
subsequent backward elimination.
SELECTION=
SCORE
PROC LOGISTIC uses the branch and bound
algorithm of Furnival and Wilson (1974)
144
Stepwise Logistic Regression: Example
proc logistic data=bart.dmagecr;
class checking history purpose savings employed marital coapp resident
property age other housing;
model good_bad=checking duration history purpose amount savings employed
installp marital coapp resident property other/selection=stepwise slentry=0.10
slstay=0.01;
run;
145
AUC based pruning
Start from a model (e.g. logistic regression) with all n inputs
Remove each input in turn and re-estimate the model
Remove the input giving the best AUC
Repeat this procedure until the AUC performance decreases
significantly.
[Figure: average AUC performance against the number of inputs]
Section 8
Reject Inference
147
Population

Through-the-door
├─ Rejects: ? Bads, ? Goods
└─ Accepts: Bads, Goods
148
Reject Inference
Problem:
– Good/Bad Information is only available for past
accepts, but not for past rejects.
– Reject bias when building classification models
– Need to create models for the through-the-door
population
– If the past application policy was random, then reject
inference is no problem; however, this is not realistic
Solution:
– Reject Inference: A process whereby the
performance of previously rejected applications is
analyzed to estimate their behavior.
149
Reject Inference

Through-the-door: 10,000
├─ Rejects: 2,950
│   ├─ Inferred Bads: 914
│   └─ Inferred Goods: 2,036
└─ Accepts: 7,050
    ├─ Known Bads: 874
    └─ Known Goods: 6,176
150
Techniques to Perform Reject Inference
Define as bad
– Define all rejects as bads
– Reinforces the credit scoring policy and prejudices
of the past
In SAS Enterprise Miner (reject inference node):
– Hard cutoff augmentation
– Parcelling
– Fuzzy Augmentation
151
Hard Cut-off Augmentation
AKA Augmentation 1
– Build a model using known goods and bads
– Score rejects using this model and establish
expected bad rates
– Set an expected bad rate level above which an
account is deemed bad
– Classify rejects as good and bad based on this
level
– Add to known goods/bads
– Remodel
152
Parcelling
Rather than classify rejects as good or bad, it assigns
them proportionally to the expected bad rate at that
score
– score rejects with the good/bad model
– split rejects into proportional good and bad groups

Score     # Bad  # Good  % Bad   % Good  Rejects  Rej-Bad  Rej-Good
0-99        24      10   70.3%   29.7%     342      240       102
100-199     54     196   21.6%   78.4%     654      141       513
200-299     43     331   11.5%   88.5%     345       40       305
300-399     32     510    5.9%   94.1%     471       28       443
400+        29   1,232    2.3%   97.7%     778       18       760
...
153
Parcelling
However
– Reject bad proportion cannot be the same as
approved
Therefore
– Allocate a higher proportion of bads to the rejects
– Rule of thumb: the bad rate for rejects should be
2–4 times that of the accepts
154
Nearest Neighbor Methods
Create two sets of clusters: goods and bads
Run rejects through both clusters
Compare Euclidean distances to assign most likely
performance
Combine accepts and rejects and remodel
However, rejects may be situated in regions far away
from the accepts
155
Bureau Based Reject Inference
Bureau data
– performance of declined apps on similar products
with other companies
– legal issues
– difficult to implement in practice – timings,
definitions, programming
[Timeline example: applicants declined in Jan 99 who got credit elsewhere;
analyze their performance with the other lender up to Dec 00]
...
156
Additional Techniques to Perform
Reject Inference
Approve all applications
Three-group approach
Iterative Re-Classification
Mixture Distribution modelling
157
Reject Inference
“... There is no unique best method of universal
applicability, unless extra information is obtained. That is,
the best solution is to obtain more information (perhaps
by granting loans to some potential rejects) about those
applicants who fall in the reject region”, David Hand,
1998.
Cost of granting credit to rejects versus benefit of better
scoring model (for example, work by Jonathan Crook)
continued...
158
Reject Inference
Banasik et al. were in the exceptional situation of
being able to observe the repayment behavior of
customers that would normally be rejected. They
concluded that the scope for improving scorecard
performance by including the rejected applicants
into the model development process is present but
modest.
Banasik J., Crook J.N., Thomas L.C., Sample
selection bias in credit scoring models, In:
Proceedings of the Seventh Conference on Credit
Scoring and Credit Control (CSCCVII’2001),
Edinburgh, Scotland, 2001.
159
Reject Inference
No method of universal applicability
Much controversy
Withdrawal inference
– Competitive market
160
Withdrawal Inference

Through-the-door
├─ Withdrawn: ? Bads, ? Goods
├─ Rejects: ? Bads, ? Goods
└─ Accepts: Bads, Goods
Section 10
Behavioral Scoring
162
Why Use Behavioral Scoring?
Debt provisioning
Setting credit limits (revolving credit)
Authorizing accounts to go in excess
Collection strategies
– What preventive actions should be taken
if customer starts going into arrears? (early
warning system)
Marketing applications (for example, customer
segmentation and targeted mailings)
Basel II!
163
Behavioral Scoring
Once behavioral scores have been obtained, they can be used to
increase/decrease the credit limits
But: score measures the risk of default given the current
operating policy and credit limit!
So, using the score to change the credit limit invalidates the
effectiveness of the score!
Analogy: “only those who have no or few accidents when they
drive a car at 30 mph in the towns should be allowed to drive at
70 mph on the motorways” (Thomas, Edelman, and Crook 2002)
Other skills may be needed to drive faster!
Similarly, other characteristics required to manage account with
large credit limits when compared to accounts with small credit
limits
Typically, divide the behavioral score into bands and give
different credit limit to each band.
164
Variables Used in Behavioral Scoring
More variables than application credit scoring
– Application variables
– Credit bureau data (for example, monthly updates)
– Repayment and usage behaviour characteristics
Example usage behavior characteristics
– Maximum and minimum levels of balance, credit
turnover, trend in payments, trend in balance,
number of missed payments, times exceeded
credit limit, ..., during performance period
continued...
165
Variables Used in Behavioural Scoring
Define derived variables measuring financial solvency
of customer
Customer can have many products and have different
roles for each product
– For example, primary owner, secondary owner,
guarantor, private versus professional products, ...
Think about product taxonomy to define customer
behavior
– For example, average savings amount, look at
different savings products
The best way to augment the performance of a
scorecard is to create better variables!
166
Aggregation Functions
Average (sensitive to outliers)
Median (insensitive to outliers)
Minimum/Maximum
– For example, worst status during last 6 months, highest
utilization during last 12 months, …
Absolute trend: x_T - x_{T-6}
Relative trend: (x_T - x_{T-6}) / x_{T-6}
Most recent value, value 1 month ago, …
Ratio variables
– E.g. obligation/debt ratio: calculated by adding the proposed
monthly house payment and any regular monthly installment and
revolving debt, and dividing by the monthly income.
– Others: current balance/credit limit, …
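The aggregation functions above can be sketched for a hypothetical 12-month balance history; all balance values and the credit limit of 2000 are made up for illustration:

```python
import statistics

# Hypothetical monthly balances, most recent last (month T)
balances = [950, 1020, 880, 1100, 1250, 990, 1300, 1150, 1400, 1350, 1500, 1450]

derived = {
    "average": statistics.mean(balances),       # sensitive to outliers
    "median": statistics.median(balances),      # insensitive to outliers
    "maximum": max(balances),
    "most_recent": balances[-1],
    "abs_trend_6m": balances[-1] - balances[-7],                 # x_T - x_{T-6}
    "rel_trend_6m": (balances[-1] - balances[-7]) / balances[-7],
    "utilization": balances[-1] / 2000,         # current balance / credit limit
}
print(derived)
```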
167
Impact of Using Aggregation Functions
Impact of missing values
– How to calculate trends?
Ratio variables might have distributions with fat tails
(large positive and negative values)
– Use truncation/winsorizing!
Data set expands in the number of attributes
– Input selection!
What if input selection allows you to choose between,
for example, average and most recent value of a ratio?
– Choose average value for cyclic dependent ratios
– Choose most recent for structural (noncyclic)
dependent ratios
168
Behavioral Scoring
Implicit assumption
– Relationship between the performance
characteristics and the subsequent delinquency
status of a customer is the same now as 2 or 3
years ago.
Classification approach to behavioral scoring
Markov Chain method to behavioral scoring
169
Classification Approach to Behavioral
Scoring
Performance period versus observation period
Develop classification model estimating probability of
default during observation period
Length of performance period and observation period
Tradeoff
Typically 6 months to 1 year
Seasonality effects
Performance period
Observation point Outcome point
170
How to Choose the Observation Point?
Seasonality, dependent upon observation point
Develop 12 PD models
– Less maintainable
– Need to backtest/stress test/monitor all these
models!
Use sampling methods to remove seasonality
– Use 3 years of data and sample the observation
point randomly for each customer during the
second year
– Only 1 model
– Gives a seasonally neutral data set
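The seasonally neutral sampling idea can be sketched as follows; the customer IDs and the month numbering (months 13-24 being the second of the three years) are our own illustrative assumptions:

```python
import random

# Seasonality-neutral sampling: for each customer, draw the observation
# point uniformly from the 12 months of the middle year of a 3-year window
random.seed(42)
customers = [f"cust_{i}" for i in range(1000)]          # hypothetical IDs
obs_point = {c: random.randint(13, 24) for c in customers}  # months 13-24
```

Because every calendar month of the middle year is equally likely, no single month's seasonal pattern dominates the resulting data set.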
Relevant Background: Data mining books
Hastie T., Tibshirani R., Friedman J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag, 2001.
Bishop C.M., Neural Networks for Pattern Recognition, Oxford University Press, 1995.
Cristianini N., Shawe-Taylor J., An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, 2000.
Han J., Kamber M., Data Mining: Concepts and Techniques, 2nd edition, Morgan Kaufmann, 2006.
Hand D.J., Mannila H., Smyth P., Principles of Data Mining, The MIT Press, 2001.
Cox D.R., Oakes D., Analysis of Survival Data, Chapman and Hall, 1984.
Allison P.D., Survival Analysis Using the SAS System, SAS Institute Inc., 1995.
5
Relevant Background: Credit scoring journals
Journal of Credit Risk
Risk Magazine
Journal of Banking and Finance
Journal of the Operational Research Society
European Journal of Operational Research
Management Science
IMA Journal of Mathematics
6
.
Relevant Background: Data mining journals
Data Mining and Knowledge Discovery
Machine Learning
Expert Systems with Applications
Intelligent Data Analysis
Applied Soft Computing
European Journal of Operational Research
IEEE Transactions on Neural Networks
Journal of Machine Learning Research
Neural Computation
7
.
Relevant Background: Credit scoring Web sites
www.defaultrisk.com
http://www.bis.org/
Websites of regulators:
– www.fsa.gov.uk (United Kingdom)
– www.hkma.gov.hk (Hong Kong)
– www.apra.gov.au (Australia)
– www.mas.gov.sg (Singapore)
In the U.S., 4 agencies: OCC, FDIC, OTS, Federal Reserve
– Proposed Supervisory Guidance for Internal Ratings Based Systems for Credit Risk, Advanced Measurement Approaches for Operational Risk, and the Supervisory Review Process (Pillar 2), Federal Register, Vol 72, No. 39, February 2007.
– Risk-based capital standards: Advanced Capital Adequacy Framework (Basel II), Final rule, Federal Register, December 2007.
8

Relevant Background: Data mining web sites
www.kdnuggets.com (data mining)
www.kernelmachines.org (SVMs)
www.sas.com (see support.sas.com for interesting macros)
www.twitter.com/DataMiningApps
9
Course Overview
Credit Scoring
Basel I and Basel II
Data preprocessing for PD modeling
Classification techniques for PD modeling
Measuring performance of PD models
Input selection
Reject inference
Behavioural scoring
Other applications
10
.
Section 1
An Introduction to Credit Scoring
.
Credit Scoring
Estimate whether the applicant will successfully repay his/her loan based on various information:
– Applicant characteristics (e.g. age, income, …)
– Credit bureau information
– Application information of other applicants
– Repayment behavior of other applicants
Develop models (also called scorecards) estimating the probability of default of a customer
Typically: assign points to each piece of information, add all points, and compare with a threshold (cutoff)
12
13
Judgmental versus Statistical Approach
Judgmental: based on experience; five C's: character, capacity, collateral, capital, condition
Statistical: based on multivariate correlations between inputs and risk of default
Both assume that the future will resemble the past.
Types of Credit Scoring
Application scoring
Behavioral scoring
Profit scoring
Bankruptcy prediction
14
.
Application Scoring
Estimate probability of default at the time the applicant applies for the loan!
Use predetermined definition of default
– For example, three months of payment arrears
– Has now been fixed in Basel II
Use application variables (age, marital status, income, years at address, years with employer, known client, …)
Use bureau variables
– Bureau score, raw bureau data (for example, total amount of credits, number of credit checks, delinquency history, …)
– In the US, FICO scores: between 300 and 850; Experian, Equifax, TransUnion
– Others: Baycorp Advantage (Australia and New Zealand), Schufa (Germany), BKR (the Netherlands), CKP (Belgium), Dun & Bradstreet (MidCorp, SME)
15
Example Application Scorecard
Let cutoff = 500 So, a new customer applies for credit:
AGE:     32      120 points
GENDER:  Female  180 points
SALARY:  $1,150  160 points
Total:           460 points < 500, so REFUSE CREDIT
16
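The accept/reject rule above can be expressed directly in code; this is a minimal sketch, with only the attribute bands this one applicant needs (the full points table is omitted):

```python
# Points table for the slide's example applicant (cutoff = 500)
scorecard = {
    ("AGE", "26-35"): 120,
    ("GENDER", "Female"): 180,
    ("SALARY", "1001-1500"): 160,
}

applicant = [("AGE", "26-35"), ("GENDER", "Female"), ("SALARY", "1001-1500")]
score = sum(scorecard[a] for a in applicant)          # 120 + 180 + 160 = 460
decision = "ACCEPT" if score >= 500 else "REFUSE CREDIT"
```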
...
Example Variables Used in Application Scorecard Development
Age, Time at residence, Time at employment, Time in industry, First digit of postal code, Geographical: Urban/Rural/Regional/Provincial, Residential status, Employment status, Lifestyle code, Existing client (Y/N), Years as client, Number of products internally, Internal bankruptcy/write-off, Number of delinquencies, Beacon/Empirical, Time at bureau, Total inquiries, Time since last inquiry, Inquiries in the last 3/6/12 months, Inquiries in the last 3/6/12 months as % of total, Income, Total liabilities, Total debt, Total debt service $, Total debt service ratio, Gross debt service ratio, Revolving debt/Total debt, Other bank card, Number of other bank cards, Total trades, Trades opened in last six months, Trades opened in last six months/Total trades, Total inactive trades, Total revolving trades, Total term trades, % revolving trades, % term trades, Number of trades current, Number of trades R1, R2, ..., R9, % trades R1-R2, % trades R3 or worse, % trades O1, ..., O9, % trades O1-O2, % trades O3 or worse, Total credit lines (revolving), Total balance (revolving), Total utilization, Total balance all
17
Application Scoring: Summary
New customers
Two snapshots of the state of the customer
– Second snapshot typically taken around 18 months after loan origination
Static procedure
Scorecards are out-of-date, even before they are implemented (Hand 2003).
Application scoring aims at ranking customers – no calibrated probabilities of default needed!
18
Behavioral Scoring
Study risk behavior of existing customers
Update risk assessment taking into account recent behavior
Examples of behavior:
– Average/Max/Min/trend in checking account balance, bureau score, …
– Delinquency history (payment arrears, …)
– Job changes, home address changes, …
Dynamic: "video clip" to snapshot (snapshot typically taken after 12 months)

Performance period/video clip – Observation point – Outcome point/snapshot
19
Behavioral Scoring
Behavioral scoring can be used for
– Debt provisioning and profit scoring
– Authorizing accounts to go in excess
– Setting credit limits
– Renewals/reviews
– Collection strategies
Characteristics:
– Many, many variables (input selection needed, cf. infra)
– Purpose is again scoring = ranking customers with respect to their default likelihood
20
. many variables (input selection needed.
Profit Scoring
Calculate profit over customer's lifetime
Need loss estimates for bad accounts
Decide on time horizon
– Survival analysis
Dynamic models: incorporate changes in economic climate
Score the profitability of a customer for a particular product (product profit score)
Score the profitability of a customer over all products (customer profit score)
Video clip to video clip
Not being adopted by many financial institutions (yet!)
21
.
Corporate default risk modeling
Prediction approach
Predict bankrupt versus non-bankrupt given ratios (solvency, liquidity, …) describing the financial status of companies
Assumes the availability of data (and defaults!)
For example, Altman's z-model (1968, linear discriminant analysis)
– For public industrial companies: Z = 1.2X1 + 1.4X2 + 3.3X3 + 0.6X4 + 1.0X5 (healthy if Z > 2.99, unhealthy if Z < 1.81)
– For private industrial companies: Z = 6.56X1 + 3.26X2 + 6.72X3 + 1.05X4 (healthy if Z > 2.60, unhealthy if Z < 1.1)
– X1: Working Capital/Total Assets, X2: Retained Earnings/Total Assets, X3: EBIT/Total Assets, X4: Market (Book) Value of Equity/Total Liabilities, X5: Net Sales/Total Assets
Expert based approach
Expert or expert committee decides upon criteria
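A quick check of the public-company z-model; the ratio values are invented for illustration only:

```python
def altman_z_public(x1, x2, x3, x4, x5):
    """Altman (1968) Z-score for public industrial companies."""
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

# Hypothetical firm: WC/TA=0.25, RE/TA=0.20, EBIT/TA=0.10, MVE/TL=0.80, Sales/TA=1.50
z = altman_z_public(0.25, 0.20, 0.10, 0.80, 1.50)
# 0.30 + 0.28 + 0.33 + 0.48 + 1.50 = 2.89: between 1.81 and 2.99, i.e. the grey zone
```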
22
.05X4 (healthy if Z > 2. unhealthy if Z < 1. X3: EBIT/ Total Assets.56X1 + 3. …) describing financial status of companies Assumes the availability of data (and defaults!) For example.6X4 + 1.81) – For private industrial companies: Z=6.99.2X1 + 1. unhealthy if Z < 1.60.
Example expert based scorecard (Ozdemir and Miu, 2009)

Business Risk                              Score
– Industry Position                        6
– Market Share                             2
– Trends and Prospects                     6
– Diversity of Products and Services       1
– Customer Mix                             2
– Geographical Diversity of Operations     2
Management Quality and Depth
– Executive Board oversight                4
– …
23
.
Corporate default risk modeling
Agency rating approach
In the absence of default data (for example, sovereign exposures, low default portfolios)
Purchase ratings (for example, AAA, AA, A, BBB) to reflect creditworthiness
Each rating corresponds to a default probability (PD)
Shadow rating approach
Purchase ratings from a ratings agency
Estimate/mimic ratings using statistical models and data
24
The Altman Model: Enron Case
25
Source: Saunders and Allen (2002)
.
Rating Agencies
Moody's Investor Service, Standard & Poor's, Fitch IBCA
Assign ratings to debt and fixed income securities at the borrower's request to reflect the financial strength ("creditworthiness") of the economic entity
Ratings for
– Companies (private and public)
– Countries and governments (sovereign ratings)
– Local authorities
– Banks
Models are based on quantitative and qualitative (judgmental) analysis (black box)
26
.
Rating Agencies
27
Van Gestel, Baesens, 2008
Section 2
Basel I, Basel II and Basel III
.
Regulatory versus Economic capital
Role of capital
Banks should have sufficient capital to protect themselves and their depositors from the risks (credit/market/operational) they are taking
Regulatory capital
Amount of capital a bank should have according to a regulation (e.g. Basel I or Basel II)
Economic capital
Amount of capital a bank has based on internal modeling strategy and policy
Types of capital:
– Tier 1: common stock, preferred stock, retained earnings
– Tier 2: revaluation reserves, general provisions, undisclosed reserves, subordinated debt (can vary from country to country)
– Tier 3: specific to market risk
29
The Basel I and II Capital Accords
Basel Committee
Central Banks/Bank regulators of major (G10) industrial countries
Meet every 3 months at the Bank for International Settlements (BIS) in Basel
Basel I Capital Accord, 1988
Aim is to set up minimum regulatory capital requirements in order to ensure that banks are able, at all times, to give back depositors' funds
Capital ratio = available capital/risk-weighted assets
Capital ratio should be > 8% (both tier 1 and tier 2 capital)
Need for new Accord
– Regulatory arbitrage
– Not sufficient recognition of collateral guarantees
– Solvency of debtor not taken into account
– Market risk later introduced in 1996
– No operational risk
30
. at all times.
Basel I: Example
Minimum capital = 8% of risk-weighted assets
Fixed risk weights. Retail:
– Cash: 0%
– Mortgages: 50%
– Other commercial: 100%
Example: $100 mortgage
– 50% gives $50 risk-weighted assets
– 8% gives $4 capital needed
Risk weights are independent of obligor and facility characteristics!
31
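The fixed-weight arithmetic above, sketched in Python (the asset-type labels are our own shorthand):

```python
# Basel I capital: fixed risk weights, then 8% of risk-weighted assets
RISK_WEIGHTS = {"cash": 0.0, "mortgage": 0.5, "other_commercial": 1.0}

def basel1_capital(exposure, asset_type):
    rwa = exposure * RISK_WEIGHTS[asset_type]  # risk-weighted assets
    return 0.08 * rwa                          # minimum regulatory capital

capital = basel1_capital(100, "mortgage")      # the slide's $100 mortgage
```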
.
Three Pillars of Basel II
Pillar 1: Minimum Capital Requirement
Pillar 2: Supervisory Review Process
Pillar 3: Market Discipline and Public Disclosure
Credit Risk
– Standard Approach
– Internal Ratings Based Approach (Foundation, Advanced)
Operational Risk
Market Risk
Sound internal processes to evaluate risk (ICAAP) Supervisory monitoring
32
Semiannual disclosure of:
– Bank's risk profile
– Qualitative and quantitative information
– Risk management processes
– Risk management strategy
Objective is to inform potential investors
.
Pillar 1: Minimum Capital Requirements
Credit Risk: key components
– Probability of default (PD) (decimal)
– Loss given default (LGD) (decimal)
– Exposure at default (EAD) (currency)
– Maturity (M)
Expected Loss (EL) = PD x LGD x EAD
For example: PD = 1%, LGD = 20%, EAD = 1000 Euros, so EL = 2 Euros
Standardized Approach / Internal Ratings Based Approach
Note: In the U.S., in the notice of proposed rule making, a difference was initially made between ELGD (expected LGD) and LGD (based on economic downturn). This was later abandoned.
33
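The expected-loss identity from this slide, as a one-line function:

```python
def expected_loss(pd, lgd, ead):
    """Expected Loss under Basel II: EL = PD x LGD x EAD."""
    return pd * lgd * ead

# The slide's example: PD = 1%, LGD = 20%, EAD = 1000 euros
el = expected_loss(0.01, 0.20, 1000)
```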
. This was later abandoned..
The Standardized Approach
Risk assessments (for example, AAA, AA, BBB, …) are based on external credit assessment institutions (ECAI) (for example, Moody's, Standard & Poor's, …)
Eligibility criteria for ECAI given in Accord (objectivity, independence, …)
Risk weights given in Accord to compute Risk Weighted Assets (RWA)
Risk weights for sovereigns, banks, corporates, …
Minimum Capital = 0.08 x RWA
34
The Standardized Approach
Example: Corporate / 1 mio dollars / maturity 5 years / unsecured / external rating AA
– RW = 20%, RWA = 0.2 mio, Reg. Cap. = 0.016 mio
Credit risk mitigation (collateral) guidelines provided in Accord (simple versus comprehensive approach)
For retail: Risk weight = 75% for non-mortgage, 35% for mortgage
But: Inconsistencies; Sufficient coverage? Need for individual risk profile! No LGD, EAD
35
.016 mio Credit risk mitigation (collateral) guidelines provided in accord (simple versus comprehensive approach) For retail: Risk weight =75% for nonmortgage.=0. Cap. Reg. 35% for mortgage But: Inconsistencies Sufficient coverage? Need for individual risk profile! No LGD.2 mio.
Internal Ratings Based (IRB) Approach
Internal Ratings Based Approach Foundation Advanced
         Foundation approach     Advanced approach
PD       Internal estimate       Internal estimate
LGD      Regulator's estimate    Internal estimate
EAD      Regulator's estimate    Internal estimate
36
.
IRB Approach
Split exposure into 5 categories:
– corporate (5 subclasses)
– sovereign
– bank
– retail (residential mortgage, revolving, other retail exposures)
– equity
Foundation IRB approach not allowed for retail
Risk weight functions to derive capital requirements
Phased rollout of the IRB approach across the banking group using an implementation plan
Transition period of 3 years
37
Basel II: Retail Specifics
Retail exposures can be pooled
– Estimate PD/LGD/EAD per rating/pool/segment!
– Note: pool = segment = grade = rating
Default definition
– obligor is past due more than 90 days (can be relaxed during transition period)
– default can be applied at the level of the credit facility
– varies per country (e.g. with materiality thresholds)!
PD is the greater of the one-year estimated PD or 0.03%
Length of underlying historical observation period to estimate loss characteristics must be at least 5 years (can be relaxed during transition period, minimum 2 years at the start)
No maturity (M) adjustment for retail!
38
Developing a Rating Based System
Application and behavioral scoring models provide a ranking of customers according to risk
This was OK in the past (for example, for loan approval), but Basel II requires well-calibrated default probabilities
Map the scores or probabilities to a number of distinct borrower ratings/pools
Decide on number of ratings and their definition
– Typically around 15 ratings
– Can be defined by mapping onto a Moody's or S&P scale, or expert based!
– Impact on regulatory capital!
Score cutoffs (example): E | 520 | D | 570 | C | 580 | B | 600 | A
39
Developing a Rating System
For corporate, sovereign, and bank exposures “To meet this objective, a bank must have a minimum of seven borrower grades for nondefaulted borrowers and one for those that have defaulted”, paragraph 404 of the Basel II Capital Accord. For retail “For each pool identified, the bank must be able to provide quantitative measures of loss characteristics (PD, LGD, and EAD) for that pool. The level of differentiation for IRB purposes must ensure that the number of exposures in a given pool is sufficient so as to allow for meaningful quantification and validation of the loss characteristics at the pool level. There must be a meaningful distribution of borrowers and exposures across pools. A single pool must not include an undue concentration of the bank’s total retail exposure”, paragraph 409 of the Basel II Capital Accord.
40
A Basel II Project Plan
New customers: application scoring
– 1 year PD (needed for Basel II)
– Loan term/18-month PD (needed for loan approval)
Customers having credit: behavioral scoring
– 1 year PD
Other existing customers applying for credit: mix of application/behavioural scoring
– 1 year PD
– Loan term/18-month PD
41
Estimating Wellcalibrated PDs
Calculate empirical 1year realized default rates for borrowers in a specific grade/pool
1-year PD for rating grade X = (Number of obligors with rating X at the beginning of the given time period that defaulted during the time period) / (Number of obligors with rating X at the beginning of the given time period)
Use 5 years of data to estimate future PD (retail) How to calibrate the PD? – Average – Maximum – Upper percentile (stressed/conservative PD) – Dynamic model (time series, e.g. ARIMA)
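Calibration over several yearly cohorts can be sketched as follows; the cohort counts are invented for illustration:

```python
# Hypothetical yearly cohorts for one rating grade: obligors at the start
# of the year and how many of them defaulted within the year
cohorts = [
    {"start": 1000, "defaulted": 12},
    {"start": 1100, "defaulted": 19},
    {"start": 950,  "defaulted": 8},
]

# Realized 1-year default rate per cohort
rates = [c["defaulted"] / c["start"] for c in cohorts]

avg_pd = sum(rates) / len(rates)  # long-run average calibration
max_pd = max(rates)               # a conservative (stressed) alternative
```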
42
Estimating Wellcalibrated PDs
[Figure: empirical 12-month PD for rating grade B, plotted over a 60-month period]
43
Basel II model architecture
[Figure: empirical 12-month PD for rating grade B over 60 months, feeding the PD calibration]

Level 2: Risk ratings definition, PD calibration
Level 1: Model / Scorecard / Logistic Regression
Level 0: Internal Data, External Data, Expert Judgment

Characteristic Name   Attribute      Scorecard Points
AGE 1                 Up to 26       100
AGE 2                 26 - 35        120
AGE 3                 35 - 37        185
AGE 4                 37+            225
GENDER 1              Male           90
GENDER 2              Female         180
SALARY 1              Up to 500      120
SALARY 2              501 - 1000     140
SALARY 3              1001 - 1500    160
SALARY 4              1501 - 2000    200
SALARY 5              2001+          240
44
Risk Weight Functions for Retail
K = LGD \cdot N\left(\frac{N^{-1}(PD)}{\sqrt{1-\rho}} + \sqrt{\frac{\rho}{1-\rho}}\, N^{-1}(0.999)\right) - LGD \cdot PD

N(.) is the cumulative standard normal distribution, N^{-1}(.) is the inverse cumulative standard normal distribution, and ρ is the asset correlation. Regulatory capital = K x EAD.
Asset correlations:
– Residential mortgage exposures: ρ = 0.15
– Qualifying revolving exposures: ρ = 0.04
– Other retail exposures: \rho = 0.03\,\frac{1-e^{-35 PD}}{1-e^{-35}} + 0.16\left(1-\frac{1-e^{-35 PD}}{1-e^{-35}}\right)
45

Risk Weight Functions For Corporates/Sovereigns/Banks

K = LGD \cdot \left(N\left(\frac{N^{-1}(PD)}{\sqrt{1-\rho}} + \sqrt{\frac{\rho}{1-\rho}}\, N^{-1}(0.999)\right) - PD\right) \cdot \frac{1+(M-2.5)\,b}{1-1.5\,b}

b = \left(0.11852 - 0.05478\,\ln(PD)\right)^2

\rho = 0.12\,\frac{1-e^{-50 PD}}{1-e^{-50}} + 0.24\left(1-\frac{1-e^{-50 PD}}{1-e^{-50}}\right) - 0.04\left(1-\frac{S-5}{45}\right)

M represents the nominal or effective maturity (between 1 and 5 years)
Firm size adjustment for SMEs (the last term of ρ)
– S represents total annual sales in millions of euros (between 5 and 50 million)
46
.5b
b (0.12 1 e 50 0.5)b) 1 1 N (0.Risk Weight Functions For Corporates/Sovereigns/Banks
K LGD.241 0.
The asset correlations ρ for the business segments (corporate, sovereign, bank, residential mortgages, revolving, other retail) were derived using an empirical but not published procedure!
– Based on reverse engineering of economic capital models from large banks
– The asset value correlations reflect a combination of supervisory judgment and empirical evidence (Federal Register)
– Note: asset correlations also measure how the asset class is dependent on the state of the economy
47
. bank. 4 and 5) The asset correlations ρ are chosen depending on the business segment (corporate.06 for creditriskweighted assets (based on QIS 3. sovereign. residential mortgages.Risk Weight Functions
Regulatory Capital=K (PD.50 x K x EAD Note: scaling factor of 1.50 x Regulatory Capital=12. LGD). revolving.PD – Expected loss covered by provisions! RWA not explicitly calculated. EAD Only covers unexpected loss (UL)! – Expected loss=LGD. Can be backed out as follows: – Regulatory Capital=8% . RWA – RWA=12.
10% over-estimate on PD means capital required is K(0.033, 0.50)(10,000) = $367.3
10% over-estimate on LGD means capital required is K(0.03, 0.55)(10,000) = $378.0
10% over-estimate on EAD means capital required is K(0.03, 0.50)(11,000) = $378.0
48
.0.03.LGD/ EAD Errors More Expensive Than PD Errors
Consider a credit card portfolio where
– PD = 0.03, LGD = 0.50, EAD = $10,000
Basel formulae give a capital requirement of K(PD, LGD) x EAD: K(0.03, 0.50)(10,000) = $343.7
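The sensitivity claim can be verified with Python's statistics.NormalDist, using the Basel II retail capital function K(PD, LGD) = LGD·N((N⁻¹(PD) + √ρ·N⁻¹(0.999))/√(1−ρ)) − LGD·PD with ρ = 0.04 (qualifying revolving exposures such as credit cards):

```python
from statistics import NormalDist

N = NormalDist()  # standard normal

def k_retail(pd, lgd, rho=0.04):
    """Basel II retail capital requirement K (unexpected loss only)."""
    wcdr = N.cdf((N.inv_cdf(pd) + rho ** 0.5 * N.inv_cdf(0.999)) / (1 - rho) ** 0.5)
    return lgd * wcdr - lgd * pd

base    = k_retail(0.03, 0.50) * 10_000   # the slide's base case, ~ $344
pd_err  = k_retail(0.033, 0.50) * 10_000  # 10% over-estimate on PD
lgd_err = k_retail(0.03, 0.55) * 10_000   # 10% over-estimate on LGD
ead_err = k_retail(0.03, 0.50) * 11_000   # 10% over-estimate on EAD
# LGD and EAD enter the capital linearly, so both scale it by exactly 1.1;
# the PD effect is dampened by the normal transform, hence costs less.
```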
The Basel II Risk Weight Functions
49
.
g.Basel III
Objective: fundamentally strengthen global capital standards
Key attention points:
– focus on tangible equity capital since this is the component with the greatest loss-absorbing capacity
– reduced reliance on banks' internal models
– reduced reliance on external ratings
– greater focus on stress testing
Loss absorbing capacity beyond common standards for systemically important banks
Tier 3 capital eliminated
Address model risk by e.g. introducing a non-risk based leverage ratio as a backstop
Liquidity Coverage Ratio and the Net Stable Funding Ratio
No major impact on underlying credit risk models!
The new standards will take effect on 1 January 2013 and for the most part will become fully effective by January 2019
50
.
Basel II versus Basel III

                                                     Basel II    Basel III
Tier 1 capital ratio                                 4% RWA      6% RWA (by 2015)
Core tier 1 capital ratio
(common equity = shareholder + retained earnings)    2% RWA      4.5% RWA (by 2015)
Capital conservation buffer (common equity)          -           2.5% RWA (by 2019)
Countercyclical buffer                               -           0% – 2.5% RWA (by 2019)
Non risk-based leverage ratio (tier 1 capital)       -           3% Assets (by 2018)

Total capital ratio:
– Basel II: [Tier 1 Capital Ratio] + [Tier 2 Capital Ratio] + [Tier 3 Capital Ratio]
– Basel III: [Tier 1 Capital Ratio] + [Capital Conservation Buffer] + [Countercyclical Capital Buffer] + [Capital for Systemically Important Banks]

Note: Assets include off balance sheet exposures and derivatives.
Section 3
Preprocessing Data for Credit Scoring and PD Modeling
.
Motivation
Dirty, noisy data
– For example, Age = 2003
Inconsistent data
– Value "0" means actual zero or missing value
Incomplete data
– Income = ?
Data integration and data merging problems
– Amounts in Euro versus amounts in dollar
Duplicate data
– Salary versus Professional Income
Success of the preprocessing steps is crucial to the success of the following steps
Garbage in, Garbage out (GIGO)
Very time consuming (80% rule)
53
Preprocessing Data for Credit Scoring
Types of variables
Sampling
Missing values
Outlier detection and treatment
Coarse classification of attributes
Weights of Evidence and Information Value
Segmentation
Definition of target variable
54
.
Types of Variables
Continuous
– Defined on a continuous interval
– For example, income, amount on savings account, …
– In SAS: Interval
Discrete
– Nominal: no ordering between values; for example, purpose of loan, marital status
– Ordinal: implicit ordering between values; for example, credit rating (AAA is better than AA, AA is better than A, …)
– Binary: for example, gender
55
Sampling
Take a sample of past applicants to build the scoring model
Think carefully about the population on which the model built using the sample will be used
Timing of sample
– How far back do I go to get my sample?
– Trade-off: many data versus recent data
Number of bads versus number of goods
– Undersampling/oversampling may be needed (dependent on classification algorithm, cf. infra)
Sample must be taken from a normal business period, to get as accurate a picture as possible of the target population
Make sure the performance window is long enough to stabilize the bad rate (e.g. 18 months!)
Example sampling problems
– Application scoring: reject inference
– Behavioural scoring: seasonality depending upon the choice of the observation point
56
.
g. removing the variable or observation might be an option – Horizontally versus vertically missing values Replace – Estimate missing value using imputation procedures
57
.Missing Values
Keep or Delete or Replace Keep – The fact that a variable is missing can be important information! – Encode variable in a special way (e. separate category during coarse classification) Delete – When too many missing values.
Imputation Procedures for Missing Values
For continuous attributes
– Replace with median/mean (median more robust to outliers)
– Replace with median/mean of all instances of the same class
For ordinal/nominal attributes
– Replace with modal value (= most frequent category)
– Replace with modal value of all instances of the same class
Advanced schemes
– Regression or tree-based imputation: predict missing value using other variables
– Nearest neighbour imputation: look at nearest neighbours (e.g. in Euclidean sense) for imputation
– Markov Chain Monte Carlo methods: not recommended!
58
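Median imputation can be sketched in a few lines of Python (illustrative data; the SAS route follows on the next slide):

```python
# Median imputation for a numeric column with missing values (None)
incomes = [2000, None, 1700, 2500, 2200, None, 1000, 1500]

observed = sorted(x for x in incomes if x is not None)
n = len(observed)
# median of the observed values (robust to outliers)
median = (observed[n // 2 - 1] + observed[n // 2]) / 2 if n % 2 == 0 else observed[n // 2]

imputed = [median if x is None else x for x in incomes]
```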
.
Dealing with Missing Values in SAS

data credit;
input income savings;
datalines;
2000 800
. 200
1700 600
2500 1000
2200 900
. 1300
400 1000
. .
1000 1500
;
run;

* PROC STANDARD with the REPLACE option substitutes each missing value with the variable mean;
proc standard data=credit replace out=creditnomissing;
run;

Impute node in Enterprise Miner!
59
Outliers
E.g. due to recording or data entry errors or noise
Types of outliers
– Valid observation: e.g. salary of boss, ratio variables
– Invalid observation: age = 2003
Outliers can be hidden in one-dimensional views of the data (multidimensional nature of data)
Univariate outliers versus multivariate outliers
Detection versus treatment
60
Univariate Outlier Detection Methods
Visually
– histogram based
– box plots
z-score: z_i = (x_i - m) / s
61
Measures how many standard deviations an observation is away from the mean for a specific variable
m is the mean of variable x_i and s its standard deviation
Outliers are observations having |z_i| > 3 (or 2.5)
Note: the mean of the z-scores is 0, the standard deviation is 1
Calculate the z-score in SAS using PROC STANDARD
.
Univariate Outlier Detection Methods
62
.
Box Plot
A box plot is a visual representation of five numbers:
– Median M: P(X ≤ M) = 0.50
– First Quartile Q1: P(X ≤ Q1) = 0.25
– Third Quartile Q3: P(X ≤ Q3) = 0.75
– Minimum
– Maximum
IQR = Q3 - Q1
Observations more than 1.5 * IQR beyond Q1 or Q3 are flagged as outliers.
[Figure: box plot showing Min, Q1, M, Q3, and outliers]
63
.
Multivariate Outlier Detection Methods
Multivariate outliers!
64
.
Treatment of Outliers
For invalid outliers:
– E.g. age = 300 years
– Treat as missing value (keep, delete, replace)
For valid outliers:
– Truncation based on z-scores:
  Replace all variable values having z-scores > 3 by the mean + 3 times the standard deviation
  Replace all variable values having z-scores < -3 by the mean - 3 times the standard deviation
– Truncation based on IQR (more robust than z-scores):
  Truncate to M ± 3s, with M = median and s = IQR/(2 x 0.6745)
  See Van Gestel, Baesens et al., 2007
– Truncation using a sigmoid:
  Use a sigmoid transform, f(x) = 1/(1+e^(-x))
Also referred to as winsorizing
65
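Z-score truncation can be sketched as follows; the data are invented, and we use the slide's alternative cut-off of 2.5 because a single extreme value inflates the standard deviation:

```python
# Truncation (winsorizing) based on z-scores, a minimal sketch
values = [34, 24, 20, 40, 54, 39, 23, 34, 56, 300]  # 300 is an outlier

n = len(values)
mean = sum(values) / n
std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5

# cap everything outside mean +/- 2.5 standard deviations
lower, upper = mean - 2.5 * std, mean + 2.5 * std
truncated = [min(max(v, lower), upper) for v in values]
```

The IQR-based variant is more robust here, precisely because the outlier does not inflate the median and quartiles the way it inflates the mean and standard deviation.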
.
Computing the z-scores in SAS

data credit;
input age income;
datalines;
34 1300
24 1000
20 2000
40 2100
54 1700
39 2500
23 2200
34 700
56 1500
;
run;

proc standard data=credit mean=0 std=1 out=creditnorm;
run;
66
Discretization and Grouping of Attribute Values
Motivation
– Group values of categorical variables for robust analysis
– Introduce nonlinear effects for continuous variables
– Classification technique requires discrete inputs (for example, Bayesian network classifiers)
– Create concept hierarchies (group low level concepts, for example raw age data, to higher level concepts, for example young, middle aged, old)
Also called Coarse classification, Classing, Categorisation, binning, grouping, …
Methods
– Equal interval binning
– Equal frequency binning (histogram equalization)
– Chi-squared analysis
– Entropy based discretization (using e.g. decision trees)
67
2000
68
. International Journal of Forecasting. 16. A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers.Coarse Classifying Continuous Variables
Lyn Thomas, A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers, International Journal of Forecasting, 16, 149-172, 2000.
69
. 1300 – Bin 2: 1400. 1400 Equal interval binning – Bin width=500 – Bin 1. [1500.The Binning Method
For example.
The Chi-squared Method
Consider the following example (taken from the book Credit Scoring and Its Applications, by Thomas, Edelman and Crook, 2002)

Attribute          Goods   Bads   Good:bad odds
Owner              6000    300    20:1
Rent Unfurnished   1600    400    4:1
Rent Furnished     350     140    2.5:1
With parents       950     100    9.5:1
Other              90      50     1.8:1
No answer          10      10     1:1
Total              9000    1000   9:1

Suppose we want three categories. Should we take option 1: owners, renters, and others, or option 2: owners, with parents, and others?
70
Option 1:
χ² = (6000-5670)²/5670 + (300-630)²/630 + (1950-2241)²/2241 + (540-249)²/249 + (1050-1089)²/1089 + (160-121)²/121 = 583

Option 2:
χ² = (6000-5670)²/5670 + (300-630)²/630 + (950-945)²/945 + (100-105)²/105 + (2050-2385)²/2385 + (600-265)²/265 = 662
71
The higher the test statistic, the better the split (formally, compare with a chi-square distribution with k-1 degrees of freedom for k classes of the characteristic). Likewise, the number of bad renters, given that the odds are the same as in the whole population, is (1600+400+350+140) x 1000/10000 = 249. Option 2 is the better split!
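The two test statistics can be reproduced with a short Python function (same counts as the slide):

```python
# Chi-squared statistic for a candidate grouping: expected counts assume
# that every category has the population's overall good:bad odds
def chi2(observed_goods, observed_bads):
    tg, tb = sum(observed_goods), sum(observed_bads)
    total = tg + tb
    stat = 0.0
    for g, b in zip(observed_goods, observed_bads):
        n = g + b
        eg, eb = n * tg / total, n * tb / total  # expected under equal odds
        stat += (g - eg) ** 2 / eg + (b - eb) ** 2 / eb
    return stat

option1 = chi2([6000, 1950, 1050], [300, 540, 160])  # owners / renters / others
option2 = chi2([6000, 950, 2050], [300, 100, 600])   # owners / with parents / others
```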
.The Chisquared Method
The number of good owners, given that the odds are the same as in the whole population, is (6000+300) x 9000/10000 = 5670. Compare the observed frequencies with the theoretical frequencies assuming equal odds using a chi-squared test statistic.
Coarse Classification in SAS

data residence;
input default$ resstatus$ count;
datalines;
good owner 6000
good rentunf 1600
good rentfurn 350
good withpar 950
good other 90
good noanswer 10
bad owner 300
bad rentunf 400
bad rentfurn 140
bad withpar 100
bad other 50
bad noanswer 10
;
run;
72

data coarse1;
input default$ resstatus$ count;
datalines;
good owner 6000
good renter 1950
good other 1050
bad owner 300
bad renter 540
bad other 160
;
run;
73

data coarse2;
input default$ resstatus$ count;
datalines;
good owner 6000
good withpar 950
good other 2050
bad owner 300
bad withpar 100
bad other 600
;
run;
74

proc freq data=coarse1;
tables default*resstatus / chisq;
weight count;
run;
proc freq data=coarse2;
tables default*resstatus / chisq;
weight count;
run;
75
Recoding Nominal/Ordinal Variables
Nominal variables
– Dummy encoding
Ordinal variables
– Thermometer encoding
Weights of evidence coding
76
.
Dummy and Thermometer Encoding
Dummy encoding

                       Input 1   Input 2   Input 3
Purpose=car            1         0         0
Purpose=real estate    0         1         0
Purpose=other          0         0         1

Many inputs: explosion of the data set!

Thermometer encoding

Original Input                                 Categorical Input   Input 1   Input 2   Input 3
Income ≤ 800 Euro                              1                   0         0         0
Income > 800 Euro and Income ≤ 1200 Euro       2                   0         0         1
Income > 1200 Euro and Income ≤ 10000 Euro     3                   0         1         1
Income > 10000 Euro                            4                   1         1         1
77
.
Weights of Evidence
Measures the risk represented by each (grouped) attribute category (for example, age 23-26):
  Weight of Evidence_category = ln( p_good_category / p_bad_category ), where
  p_good_category = number of goods in category / total number of goods
  p_bad_category = number of bads in category / total number of bads
If p_good_category > p_bad_category then WOE > 0; if p_good_category < p_bad_category then WOE < 0.
The higher the weight of evidence (in favor of being good), the lower the risk for that category.
Weights of Evidence

Age       Count   Distr. Count   Goods   Distr. Goods   Bads   Distr. Bads      WOE
Missing      50       2.50%         42       2.33%         8       4.12%      -57.28
18-22       200      10.00%        152       8.42%        48      24.74%     -107.83
23-26       300      15.00%        246      13.62%        54      27.84%      -71.47
27-29       450      22.50%        405      22.43%        45      23.20%       -3.38
30-35       500      25.00%        475      26.30%        25      12.89%       71.34
35-44       350      17.50%        339      18.77%        11       5.67%      119.71
44+         150       7.50%        147       8.14%         3       1.55%      166.08
Total:     2000                   1806                   194

WOE = ln( Distr. Good / Distr. Bad ) x 100
Information Value
Information Value (IV) is a measure of predictive power used to:
– assess the appropriateness of the classing
– select predictive variables
IV is similar to an entropy measure (cf. infra):
  IV = Σ ( p_good_category - p_bad_category ) x WOE_category   (with WOE in natural-log units)
Rule of thumb:
– < 0.02: unpredictive
– 0.02 – 0.1: weak
– 0.1 – 0.3: medium
– 0.3+: strong

Age       Distr. Goods   Distr. Bads      WOE        IV
Missing       2.33%         4.12%       -57.28    0.0103
18-22         8.42%        24.74%      -107.83    0.1760
23-26        13.62%        27.84%       -71.47    0.1016
27-29        22.43%        23.20%        -3.38    0.0003
30-35        26.30%        12.89%        71.34    0.0957
35-44        18.77%         5.67%       119.71    0.1568
44+           8.14%         1.55%       166.08    0.1095
                       Information Value:         0.6502
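The WOE and IV numbers in the age table can be reproduced with a short Python sketch (assumed Python, not part of the original SAS material):

```python
import math

goods = {"Missing": 42, "18-22": 152, "23-26": 246, "27-29": 405,
         "30-35": 475, "35-44": 339, "44+": 147}
bads  = {"Missing": 8, "18-22": 48, "23-26": 54, "27-29": 45,
         "30-35": 25, "35-44": 11, "44+": 3}

total_good, total_bad = sum(goods.values()), sum(bads.values())

# WOE per category in natural-log units (the slide multiplies by 100)
woe = {cat: math.log((goods[cat] / total_good) / (bads[cat] / total_bad))
       for cat in goods}

# IV sums (p_good - p_bad) * WOE over the categories -> ~0.65, i.e. "strong"
iv = sum((goods[cat] / total_good - bads[cat] / total_bad) * woe[cat]
         for cat in goods)
```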
WOE: Logical Trend?
[Chart: weight of evidence (predictive strength) by age band, from Missing through 18-22, 23-26, 27-29, 30-35 and 35-44 to 44+, rising from strongly negative for the younger bands to strongly positive for the oldest]
Younger people tend to represent a higher risk than the older population.
Segmentation
When more than one scorecard is required, build a scorecard for each segment separately.
Three reasons (Thomas, Jo, and Scherer, 2001):
– Strategic: banks might want to adopt special strategies for specific segments of customers (for example, a lower cutoff score depending on age or residential status)
– Operational: new customers must have a separate scorecard because the characteristics in the standard scorecard do not make sense operationally for them
– Variable interactions: if one variable interacts strongly with a number of others, it might be sensible to segment according to this variable
Segmentation
Two ways:
– Experience based: in cooperation with a financial expert
– Statistically based: use clustering algorithms (k-means, decision trees, ...); let the data speak
Better predictive power than a single model if successful!
Decision Trees for Clustering/Segmentation

                        All Good/Bads
                       Bad rate = 3.7%
                      /               \
        Existing Customer            New Applicant
        Bad rate = 1.2%              Bad rate = 6.3%
         /            \               /           \
Customer < 2 yrs  Customer > 2 yrs  Age < 30    Age > 30
Bad rate = 2.8%   Bad rate = 0.2%   Bad rate    Bad rate
                                    = 11.3%     = 3.8%

The four leaves give 4 segments.
Definitions of Bad
– 30/60/90 days past due
– Charge off/write off
– Bankrupt
– Claim over $1000
– Profit based: negative NPV, less than x% of the amount owed collected
– Fraud over $500
– Basel II: 90 days past due, but can be changed by national regulators (for example, U.S., UK, Belgium, ...)
Section 4
Classification Techniques for Credit Scoring and PD Modeling
.
The Classification Problem
The classification problem can be stated as follows:
– Given an observation (also called a pattern) with characteristics x = (x1, ..., xn), determine its class c from a predetermined set of classes {c1, ..., cm}
The classes {c1, ..., cm} are known beforehand: supervised learning!
Binary (2 classes) versus multiclass (> 2 classes) classification.
Regression for Classification

The good/bad status is first coded as a 0/1 target Y:

Customer   Age   Income   Gender   G/B   Y
John        30     1200      M      B    0
Sarah       25      800      F      G    1
Sophie      52     2200      F      G    1
David       48     2000      M      B    0
Peter       34     1800      M      G    1

Linear regression gives: Y = β0 + β1 Age + β2 Income + β3 Gender
Can be estimated using OLS (in SAS use proc reg or proc glm).
Two problems:
– No guarantee that Y is between 0 and 1 (i.e., a probability)
– Target/errors are not normally distributed
Use a bounding function to limit the outcome between 0 and 1:
  f(z) = 1 / (1 + e^(-z))
[Chart: the bounding (sigmoid) function f(z), rising from 0 to 1 for z from -7 to 7]
Logistic Regression
– Linear regression with a transformation such that the output is always between 0 and 1 and can thus be interpreted as a probability (e.g., the probability of being a good customer):
  P(customer good | age, income, gender, ...) = 1 / (1 + e^-(β0 + β1 age + β2 income + β3 gender + ...))
– Or, alternatively:
  ln( P(customer good | age, income, gender, ...) / P(customer bad | age, income, gender, ...) ) = β0 + β1 age + β2 income + β3 gender + ...
– The parameters can be estimated in SAS using proc logistic
– Once the model has been estimated using historical data, we can use it to score or assign probabilities to new data
Logistic Regression in SAS

Historical data:

Customer   Age   Income   Gender   Response
John        30     1200      M       No
Sarah       25      800      F       Yes
Sophie      52     2200      F       Yes
David       48     2000      M       No
Peter       34     1800      M       Yes

proc logistic data=responsedata;
  class Gender;
  model Response=Age Income Gender;
run;

The fitted model scores a new applicant as
  P(response yes | Age, Income, Gender) = 1 / (1 + e^-(β0 + β1 Age + β2 Income + β3 Gender))

New data:

Customer   Age   Income   Gender   Response score
Emma        28     1000      F         0.44
Will        44     1500      M         0.76
Dan         30     1200      M         0.18
Bob         58     2400      M         0.88
Logistic Regression
The logistic regression model is formulated as follows:

  P(Y=1 | X1, ..., Xn) = 1 / (1 + e^-(β0 + β1 X1 + ... + βn Xn))

  P(Y=0 | X1, ..., Xn) = 1 - P(Y=1 | X1, ..., Xn) = 1 / (1 + e^(β0 + β1 X1 + ... + βn Xn))

Model reformulation:

  P(Y=1 | X1, ..., Xn) / P(Y=0 | X1, ..., Xn) = e^(β0 + β1 X1 + ... + βn Xn)

Hence,

  ln( P(Y=1 | X1, ..., Xn) / P(Y=0 | X1, ..., Xn) ) = β0 + β1 X1 + ... + βn Xn

P(Y=1 | X1, ..., Xn) / (1 - P(Y=1 | X1, ..., Xn)) is the odds in favor of Y=1.

ln( P(Y=1 | X1, ..., Xn) / (1 - P(Y=1 | X1, ..., Xn)) ) = β0 + β1 X1 + ... + βn Xn is called the logit.
Logistic Regression
If Xi increases by 1:
  logit(Xi + 1) = logit(Xi) + βi
  odds(Xi + 1) = odds(Xi) x e^βi
e^βi is the odds ratio: the multiplicative increase in the odds when Xi increases by one (other variables remaining constant/ceteris paribus).
  βi > 0: e^βi > 1, so odds and probability increase with Xi
  βi < 0: e^βi < 1, so odds and probability decrease with Xi
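A small Python sketch of the two properties above: a fitted logistic model scores applicants with the bounding function, and each coefficient acts multiplicatively on the odds. The coefficients below are made-up illustrative values, not estimates from the slides:

```python
import math

B0, B_AGE, B_INC = -4.0, 0.1, 0.001   # hypothetical fitted coefficients

def p_good(age, income):
    z = B0 + B_AGE * age + B_INC * income
    return 1 / (1 + math.exp(-z))      # always between 0 and 1

def odds(age, income):
    p = p_good(age, income)
    return p / (1 - p)

# the odds ratio for a one-year age increase equals exp(B_AGE),
# regardless of the starting age or income (ceteris paribus)
ratio = odds(31, 1500) / odds(30, 1500)   # = exp(0.1), about 1.105
```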
Logistic Regression and Weight of Evidence Coding

Actual age -> after classing -> after recoding:

Cust ID   Age   Age Group          Age WoE
1          20   1: up to 22         -1.2
2          31   2: 22 up to 35       0.1
3          49   3: 35+               0.9

P(Y=1 | X1, ..., Xn) = 1 / (1 + e^-(β0 + βage x age_woe + βpurpose x purpose_woe + ...))

No dummy variables! More robust.
Decision Trees
Recursively partition the training data.
Various tree induction algorithms:
– C4.5 (Quinlan 1993)
– CART: Classification and Regression Trees (Breiman, Friedman, Olshen, and Stone 1984)
– CHAID: Chi-squared Automatic Interaction Detection (Hartigan 1975)
Classification tree: the target is categorical.
Regression tree: the target is continuous (interval).
Decision Tree Example

Income > $50,000?
  No  -> Job > 3 Years?
           No  -> Bad Risk
           Yes -> Good Risk
  Yes -> High Debt?
           Yes -> Bad Risk
           No  -> Good Risk
Decision Trees Terminology

Income > $50,000?                 <- root node
  No  -> Job > 3 Years?           <- internal node
           No  -> Bad Risk        <- leaf node
           Yes -> Good Risk       <- leaf node
  Yes -> High Debt?               (the connections between nodes are branches)
           Yes -> Bad Risk
           No  -> Good Risk
Estimating Decision Trees from Data
Splitting decision
• Which variable to split at what value (e.g. age < 30 or not, income < 1000 or not, marital status = married or not, ...)?
Stopping decision
• When to stop growing the tree?
• When to stop adding nodes to the tree?
Assignment decision
• Which class (e.g. good or bad customer) to assign to a leaf node?
• Usually the majority class!
Splitting Decision: Entropy
[Figure: three data samples showing minimal impurity (all one class), maximal impurity (half good/half bad), and minimal impurity again. Green dots represent good customers, red dots bad customers.]
Impurity measures the disorder/chaos in a data set. The impurity of a data sample S can be measured as follows:
– Entropy: E(S) = -pG log2(pG) - pB log2(pB)   (C4.5)
– Gini: Gini(S) = 2 pG pB   (CART)
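The two impurity measures can be sketched directly (an illustrative Python helper, with p_g the proportion of goods in the sample):

```python
import math

def entropy(p_g):
    """Entropy impurity of a sample with good proportion p_g (C4.5)."""
    p_b = 1 - p_g
    if p_g in (0.0, 1.0):      # a pure node has zero impurity
        return 0.0
    return -p_g * math.log2(p_g) - p_b * math.log2(p_b)

def gini(p_g):
    """Gini impurity of the same sample (CART)."""
    return 2 * p_g * (1 - p_g)
```

Both measures are maximal at p_g = 0.5 and vanish for pure nodes.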
Measuring Impurity Using Entropy
Entropy is minimal when pG = 1 or pB = 1 and maximal when pG = pB = 0.50.
[Chart: entropy as a function of pG, rising from 0 at pG = 0 to a maximum of 1 at pG = 0.5 and falling back to 0 at pG = 1]
Consider different candidate splits and measure for each the decrease in entropy, or gain.
Example: Gain Calculation

                 G: 400   B: 400
                /                \
         Age < 30              Age >= 30
      G: 200   B: 400        G: 200   B: 0

Entropy top node   = -1/2 x log2(1/2) - 1/2 x log2(1/2) = 1
Entropy left node  = -1/3 x log2(1/3) - 2/3 x log2(2/3) = 0.91
Entropy right node = -1 x log2(1) - 0 x log2(0) = 0
Gain = 1 - (600/800) x 0.91 - (200/800) x 0 = 0.32

Consider different alternative splits and pick the one with the biggest gain:

Split                     Gain
Age < 30                  0.32
Marital status=married    0.05
Income < 2000             0.48
Savings amount < 50000    0.12
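The gain for the Age < 30 split can be reproduced in Python (the entropy helper is repeated here so the sketch stands alone; the slide rounds the left-node entropy to 0.91, which gives 0.32):

```python
import math

def entropy(p_g):
    p_b = 1 - p_g
    if p_g in (0.0, 1.0):
        return 0.0
    return -p_g * math.log2(p_g) - p_b * math.log2(p_b)

parent = entropy(400 / 800)   # 1.0 for the 400 G / 400 B top node
left = entropy(200 / 600)     # ~0.918 for the 200 G / 400 B left node
right = entropy(200 / 200)    # 0.0 for the pure 200 G / 0 B right node

# weighted decrease in impurity
gain = parent - (600 / 800) * left - (200 / 800) * right   # ~0.31
```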
Stopping Decision: Early Stopping
As the tree is grown, the data becomes more and more segmented and splits are based on fewer and fewer observations.
Need to avoid the tree becoming too specific and fitting the noise in the data (aka overfitting).
Avoid overfitting by partitioning the data into a training set (used for the splitting decision) and a validation set (used for the stopping decision).
The optimal tree is found where the validation set error is minimal.
[Chart: misclassification error versus number of tree nodes; the training set error keeps decreasing while the validation set error reaches a minimum and then rises again. STOP growing the tree at that minimum!]
Advantages/Disadvantages of Trees
Advantages
– Ease of interpretation
– Nonparametric
– No need for transformations of variables
– Robust to outliers
Disadvantages
– Sensitive to changes in the training data (unstable)
– Remedy: combinations of decision trees — bagging, boosting, random forests
Section 5
Measuring the Performance of Credit Scoring Classification Models
.
How to Measure Performance?
Performance
– How well does the trained model perform in predicting new, unseen (!) instances?
– Decide on a performance measure:
  Classification: Percentage Correctly Classified (PCC), Sensitivity, Specificity, Area Under the ROC curve (AUROC), ...
  Regression: Mean Absolute Deviation (MAD), Mean Squared Error (MSE), ...
Methods
– Split sample method
– Single sample method
– N-fold cross-validation
Split Sample Method
For large data sets:
– Large = more than 1000 observations, with more than 50 bad customers
Set aside a test set (typically one-third of the data) which is NOT used during training!
Estimate the performance of the trained classifier on the test set.
Note: for decision trees, the validation set is part of the training set.
Stratification:
– Same class distribution in training set and test set
Performance Measures for Classification
– Classification accuracy, confusion matrix, sensitivity, specificity
– Area under the ROC curve
– The Lorenz (Power) curve
– The cumulative lift curve
– Kolmogorov-Smirnov
Classification Performance: Confusion Matrix

Customer   G/B   Score   Predicted (cutoff = 0.50)
John        G    0.72       G
Sophie      B    0.56       G
David       G    0.44       B
Emma        B    0.18       B
Bob         B    0.36       B

Confusion matrix:

                        Actual Good              Actual Bad
Predicted Good    True Positive (John)     False Positive (Sophie)
Predicted Bad     False Negative (David)   True Negative (Emma, Bob)

Classification accuracy = (TP+TN)/(TP+FP+FN+TN) = 3/5
Classification error    = (FP+FN)/(TP+FP+FN+TN) = 2/5
Sensitivity             = TP/(TP+FN) = 1/2
Specificity             = TN/(FP+TN) = 2/3

All these measures depend upon the cutoff!
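The cutoff-based measures above can be computed in a few lines of Python (an illustrative sketch for the five customers; score >= 0.50 is predicted good):

```python
actual = {"John": "G", "Sophie": "B", "David": "G", "Emma": "B", "Bob": "B"}
scores = {"John": 0.72, "Sophie": 0.56, "David": 0.44, "Emma": 0.18, "Bob": 0.36}

cutoff = 0.50
pred = {c: ("G" if s >= cutoff else "B") for c, s in scores.items()}

tp = sum(1 for c in actual if actual[c] == "G" and pred[c] == "G")
fp = sum(1 for c in actual if actual[c] == "B" and pred[c] == "G")
fn = sum(1 for c in actual if actual[c] == "G" and pred[c] == "B")
tn = sum(1 for c in actual if actual[c] == "B" and pred[c] == "B")

accuracy = (tp + tn) / (tp + fp + fn + tn)   # 3/5
sensitivity = tp / (tp + fn)                 # 1/2
specificity = tn / (fp + tn)                 # 2/3
```

Moving the cutoff changes all four cells, which is what the ROC analysis on the next slide exploits.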
Classification Performance: ROC Analysis
• Make a table with sensitivity and specificity for each possible cutoff:

Cutoff   Sensitivity   Specificity   1 - Specificity
0             1             0              1
0.01        ...           ...            ...
0.02        ...           ...            ...
...
0.99        ...           ...            ...
1             0             1              0

• The ROC curve plots sensitivity versus 1 - specificity for each possible cutoff.
[Chart: ROC curves for Scorecard A, Scorecard B, and the random model (the diagonal)]
• A perfect model has sensitivity of 1 and specificity of 1 (i.e., the upper left corner); Scorecard A is better than B in the figure above.
• The ROC curve can be summarized by the area underneath (AUC): the bigger, the better!
The Receiver Operating Characteristic Curve
[Chart: ROC curve — sensitivity versus (1 - specificity) for Scorecard A, Scorecard B, and the random model]
The Area Under the ROC Curve
How to compare intersecting ROC curves? Use the area under the ROC curve (AUC).
The AUC provides a simple figure-of-merit for the performance of the constructed classifier.
An intuitive interpretation of the AUC is that it provides an estimate of the probability that a randomly chosen instance of class 1 is correctly ranked higher than a randomly chosen instance of class 0 (Hanley and McNeil, 1983) (Wilcoxon or Mann-Whitney U statistic).
The higher the better: a good classifier should have an AUC larger than 0.5.
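The ranking interpretation can be checked directly in Python, reusing the five scored customers from the confusion-matrix example (ties count one half):

```python
from itertools import product

good_scores = [0.72, 0.44]          # John, David
bad_scores = [0.56, 0.18, 0.36]     # Sophie, Emma, Bob

# probability that a randomly chosen good outranks a randomly chosen bad
pairs = list(product(good_scores, bad_scores))
auc = sum(1.0 if g > b else 0.5 if g == b else 0.0
          for g, b in pairs) / len(pairs)   # 5 of the 6 pairs are ranked correctly
```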
Cumulative Accuracy Profile (CAP)
• Sort the population from high model score to low model score
• Measure the (cumulative) percentage of bads for each score decile
• E.g. the top 30% most likely bads according to the model captures 65% of the true bads
• Often summarised as top-decile lift, or how many bads are in the top 10%
Accuracy Ratio
[Chart: CAP curves for the perfect model, the current model, and the random model; A is the area between the perfect and current model curves, B the area between the current and random model curves, and AR = B/(A+B)]
The accuracy ratio (AR) is defined as:
(Area below the power curve for the current model - Area below the power curve for the random model) / (Area below the power curve for the perfect model - Area below the power curve for the random model)
A perfect model has an AR of 1; a random model has an AR of 0.
AR is sometimes also called the "Gini coefficient".
AR = 2 x AUC - 1
The Kolmogorov-Smirnov (KS) Distance
Separation measure: measures the distance between the cumulative score distributions P(s|B) and P(s|G).
KS = max_s | P(s|G) - P(s|B) |, where:
  P(s|G) = Σ_{x<=s} p(x|G)   (equals 1 - sensitivity)
  P(s|B) = Σ_{x<=s} p(x|B)   (equals the specificity)
The KS distance is the maximum vertical distance between both curves.
The KS distance can also be measured on the ROC graph:
– the maximum vertical distance between the ROC curve and the diagonal
The Kolmogorov-Smirnov Distance
[Chart: the cumulative distributions P(s|B) and P(s|G) plotted against the score; the KS distance is the maximum vertical gap between the two curves]
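A minimal Python sketch of the KS computation, again on the five scores from the confusion-matrix example (the maximum gap only needs to be checked at the observed scores):

```python
good_scores = [0.72, 0.44]
bad_scores = [0.56, 0.18, 0.36]

def cum_dist(scores, s):
    """Empirical cumulative distribution: fraction of scores <= s."""
    return sum(1 for x in scores if x <= s) / len(scores)

thresholds = sorted(good_scores + bad_scores)
ks = max(abs(cum_dist(good_scores, s) - cum_dist(bad_scores, s))
         for s in thresholds)
```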
Section 6
Setting the Classification Cutoff
.
Setting the Classification Cutoff
P(customer is good payer | x)
Depends upon the risk preference strategy.
No input provided by the Basel accord: capital should be in line with the risks taken.
Strategy curve:
– choose the cutoff that produces the same acceptance rate as the previous scorecard but improves the bad rate
– choose the cutoff that produces the same bad rate as the previous scorecard but improves the acceptance rate
Strategy Curve
[Chart: bad rate versus accept rate (40% to 95%); moving the cutoff traces the trade-off between acceptance rate and bad rate]
Setting the Cutoff Based on Marginal Good-Bad Rates
Usually one chooses a marginal good-bad rate of about 5:1 or 3:1, dependent upon the granularity of the cutoff change.

Cutoff   Cum. Goods above score   Cum. Bads above score   Marginal good-bad rate
0.9             1400                      50                       -
0.8             2000                     100                     12:1
0.7             2700                     170                     10:1
0.6             2950                     220                      5:1
0.5             3130                     280                      3:1
0.4             3170                     300                      2:1
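The marginal rates in the table are just the extra goods gained per extra bad accepted between successive cutoffs, which a short Python sketch makes explicit:

```python
cutoffs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
cum_goods = [1400, 2000, 2700, 2950, 3130, 3170]
cum_bads = [50, 100, 170, 220, 280, 300]

# goods gained divided by bads gained when lowering the cutoff one step
marginal = [(cum_goods[i] - cum_goods[i - 1]) / (cum_bads[i] - cum_bads[i - 1])
            for i in range(1, len(cutoffs))]
# [12.0, 10.0, 5.0, 3.0, 2.0] -> stop lowering the cutoff around 5:1 or 3:1
```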
Including a Refer Option
Refer option:
– gray zone
– requires further (human) inspection
[Diagram: score scale from 0 to 400, with Reject below 190, Refer between 190 and 210, and Accept above 210]
Case Study 1
Baesens B., Van Gestel T., Viaene S., Stepanova M., Suykens J., Vanthienen J., Benchmarking State-of-the-Art Classification Algorithms for Credit Scoring, Journal of the Operational Research Society, Volume 54, Number 6, pp. 627-635, 2003.
Benchmarking Study: Baesens et al., 2003
– 8 real-life credit scoring data sets: UK, Benelux, Internet
– 17 algorithms or variations of algorithms
– Various cutoff setting schemes
– Classification accuracy + area under the Receiver Operating Characteristic curve
– McNemar test + DeLong, DeLong, and Clarke-Pearson test
Baesens B., Van Gestel T., Viaene S., Stepanova M., Suykens J., Vanthienen J., Benchmarking State-of-the-Art Classification Algorithms for Credit Scoring, Journal of the Operational Research Society, Volume 54, Number 6, pp. 627-635, 2003.

Benchmarking Study
[Table: benchmarking results for the 17 algorithms on the 8 data sets]
Benchmarking Study: Conclusions
– NN and RBF LS-SVM classifiers consistently yield good performance in terms of both PCC and AUC
– Simple linear classifiers such as LDA and LOG also gave very good performances
– Most credit scoring data sets are only weakly nonlinear
– Flat maximum effect
Benchmarking Study: Conclusions
"Although the differences are small, they may be large enough to have commercial implications." (Henley and Hand, 1997)
"Credit is an industry where improvements, even when small, can represent vast profits if such improvement can be sustained." (Kelly, 1998)
For mortgage portfolios, a small increase in PD scorecard discrimination will significantly lower the minimum capital requirements in the context of Basel II (Experian, 2003).
The best way to augment the performance of a scorecard is to look for better predictors!
Role of credit bureaus!
Credit Bureaus
Credit reference agencies or credit bureaus.
UK: information is typically accessed through name and address.
Types of information:
– Publicly available information: for example, electoral rolls, court judgments
– Previous searches from other financial institutions
– Shared information contributed by several financial institutions: check whether the applicant has a loan elsewhere and how he pays
– Aggregated information: for example, time at address, data at postcode level
– Fraud warnings: check whether there are prior fraud warnings at a given address
– Bureau added value: for example, a generic scorecard built for small populations or new products
In the U.K.: Experian, Equifax, Call Credit
In the U.S.: Equifax, Experian, TransUnion
Section 7
Input Selection for Classification
.
Input Selection
Inputs = features = attributes = characteristics = variables.
Also called attribute selection or feature selection.
If n features are present, 2^n - 1 possible feature sets can be considered: heuristic search methods are needed!
Good feature subsets contain features highly correlated with (predictive of) the class, yet uncorrelated with (not predictive of) each other (Hall and Smith 1998).
Can improve the performance and estimation time of the classifier.
Input Selection
Curse of dimensionality:
– the number of training examples required increases exponentially with the number of inputs
– remove irrelevant and redundant inputs
Interaction and correlation effects: correlation implies redundancy.
Difference between a redundant input and an irrelevant input:
– an input can be relevant yet still redundant (due to e.g. a high correlation with another input already in the model)
Input Selection Procedure
Step 1: use a filter procedure
– for quick filtering of inputs
– inputs are selected independent of the classification algorithm
Step 2: wrapper/stepwise regression
– use the p-value of the regression for input selection
Step 3: AUC-based pruning
– iterative procedure based on AUC
Filter Methods for Input Selection

                    Continuous target (e.g. LGD)        Discrete target (e.g. PD)
Continuous input    Pearson correlation,                Fisher score
                    Spearman rank order correlation
Discrete input      Fisher score                        Chi-squared analysis, Cramer's V,
                                                        Information Value, Gain/Entropy
Chi-squared Based Filter

              Good Payer   Bad Payer   Total
Married           500         100        600
Not Married       300         100        400
Total             800         200       1000

Under the independence assumption:
P(married and good payer) = P(married) x P(good payer) = 0.6 x 0.8
The expected number of good payers that are married is 0.6 x 0.8 x 1000 = 480.
Chi-squared Based Filter
If both criteria were independent, one would have the following contingency table:

              Good Payer   Bad Payer   Total
Married           480         120        600
Not Married       320          80        400
Total             800         200       1000

Compare using a chi-squared test statistic.
Chi-squared Based Filter

χ² = (500-480)²/480 + (100-120)²/120 + (300-320)²/320 + (100-80)²/80 = 10.41

χ² is chi-squared distributed with k-1 degrees of freedom, with k being the number of classes of the characteristic.
The bigger (lower) the value of χ², the more (less) predictive the attribute is.
Apply as a filter: rank all attributes with respect to their p-value and select the most predictive ones.
Note: Cramer's V = sqrt(χ²/n) is always bounded between 0 and 1; higher values indicate a more predictive input.
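Both filter statistics for the marriage table can be reproduced with a few lines of Python (an illustrative sketch of the formulas above):

```python
import math

observed = [[500, 100],   # married: goods, bads
            [300, 100]]   # not married: goods, bads
expected = [[480, 120],   # counts under the independence assumption
            [320, 80]]

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))        # ~10.42

n = sum(sum(row) for row in observed)
cramers_v = math.sqrt(chi2 / n)                  # ~0.10, bounded in [0, 1]
```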
Example: Filters (German Credit Data Set)
[Figure: filter scores for the inputs of the German credit data set]
Van Gestel, Baesens, 2008
Fisher Score
Defined as:
  Fisher score = (x̄G - x̄B)² / (s²G + s²B)
A higher value indicates a more predictive input.
Can also be used if the target is continuous and the input discrete.
Example for the German credit data set:
[Figure: Fisher scores for the inputs of the German credit data set]
Van Gestel, Baesens, 2008
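The Fisher score formula is easy to sketch in Python; the income values below are made-up illustrative data, not from the German credit data set:

```python
import statistics

goods = [1200.0, 2200.0, 1800.0, 2000.0]   # hypothetical incomes of goods
bads = [800.0, 900.0, 1100.0]              # hypothetical incomes of bads

def fisher_score(x_good, x_bad):
    """(mean difference)^2 over the sum of the class variances."""
    num = (statistics.mean(x_good) - statistics.mean(x_bad)) ** 2
    den = statistics.variance(x_good) + statistics.variance(x_bad)
    return num / den

score = fisher_score(goods, bads)   # larger -> more predictive input
```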
Wrapper/Stepwise Regression
Use the p-value to decide upon the importance of the inputs:
– p-value < 0.01: highly significant
– 0.01 < p-value < 0.05: significant
– 0.05 < p-value < 0.10: weakly significant
– 0.10 < p-value: not significant
Can be used in different ways:
– Forward
– Backward
– Stepwise
Search Strategies
Direction of search:
– Forward selection
– Backward elimination
– Floating search
– Oscillating search
Traversing the input space:
– Greedy hill climbing search (best-first search)
– Beam search
– Genetic algorithms
Example Search Space for Four Inputs

{}
{I1}   {I2}   {I3}   {I4}
{I1,I2}   {I1,I3}   {I1,I4}   {I2,I3}   {I2,I4}   {I3,I4}
{I1,I2,I3}   {I1,I2,I4}   {I1,I3,I4}   {I2,I3,I4}
{I1,I2,I3,I4}
Stepwise Logistic Regression
SELECTION=FORWARD
PROC LOGISTIC first estimates parameters for effects forced into the model. These effects are the intercepts and the first n explanatory effects in the MODEL statement, where n is the number specified by the START= or INCLUDE= option in the MODEL statement (n is zero by default). Next, the procedure computes the score chi-square statistic for each effect not in the model and examines the largest of these statistics. If it is significant at the SLENTRY= level, the corresponding effect is added to the model. Once an effect is entered in the model, it is never removed from the model. The process is repeated until none of the remaining effects meet the specified level for entry or until the STOP= value is reached.
Stepwise Logistic Regression
SELECTION=BACKWARD
Parameters for the complete model as specified in the MODEL statement are estimated unless the START= option is specified. In that case, only the parameters for the intercepts and the first n explanatory effects in the MODEL statement are estimated, where n is the number specified by the START= option. Results of the Wald test for individual parameters are examined. The least significant effect that does not meet the SLSTAY= level for staying in the model is removed. Once an effect is removed from the model, it remains excluded. The process is repeated until no other effect in the model meets the specified level for removal or until the STOP= value is reached.
Stepwise Logistic Regression
SELECTION=STEPWISE
Similar to the SELECTION=FORWARD option except that effects already in the model do not necessarily remain. Effects are entered into and removed from the model in such a way that each forward selection step may be followed by one or more backward elimination steps. The stepwise selection process terminates if no further effect can be added to the model or if the effect just entered into the model is the only effect removed in the subsequent backward elimination.
SELECTION=SCORE
PROC LOGISTIC uses the branch and bound algorithm of Furnival and Wilson (1974).
Stepwise Logistic Regression: Example

proc logistic data=bart.dmagecr;
  class checking history purpose savings employed marital coapp resident property age other housing;
  model good_bad=checking duration history purpose amount savings employed installp marital coapp resident property other / selection=stepwise slentry=0.10 slstay=0.01;
run;

Note the slentry and slstay significance levels!
AUC-Based Pruning
– Start from a model (e.g. logistic regression) with all n inputs
– Remove each input in turn and re-estimate the model
– Remove the input giving the best AUC
– Repeat this procedure until the AUC performance decreases significantly
[Chart: average AUC performance versus the number of inputs removed]
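One pruning pass can be sketched in Python with a deliberately simple stand-in model: here the "model" scores an applicant by summing the selected inputs, whereas the procedure above would refit a logistic regression at every step. All data below are made up for illustration:

```python
from itertools import product

def auc(scores, labels):
    """Pairwise-ranking AUC (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    pairs = list(product(pos, neg))
    return sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in pairs) / len(pairs)

# three candidate inputs per applicant; input 1 is pure noise
X = [[0.9, 0.1, 0.8], [0.7, 0.9, 0.6], [0.2, 0.8, 0.3], [0.1, 0.2, 0.1]]
y = [1, 1, 0, 0]

inputs = [0, 1, 2]
def score_with(feats):
    return [sum(row[j] for j in feats) for row in X]

# one pruning pass: drop the input whose removal gives the best AUC
best = max(([j for j in inputs if j != drop] for drop in inputs),
           key=lambda feats: auc(score_with(feats), y))
```

On this toy data the pass correctly discards the noise input.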
Section 8
Reject Inference
.
Population
[Diagram: the through-the-door population splits into Rejects and Accepts; Goods and Bads are only observed for the accepts, and remain unknown (? Goods / ? Bads) for the rejects]
Reject Inference
Problem:
– good/bad information is only available for past accepts, but not for past rejects
– reject bias when building classification models
– need to create models for the through-the-door population
– if the past application policy had been random, reject inference would pose no problem; however, this is not realistic
Solution:
– reject inference: a process whereby the performance of previously rejected applications is analyzed to estimate their behavior
Reject Inference

Through-the-door: 10,000
– Rejects: 2,950  ->  Inferred Bads: 914, Inferred Goods: 2,036
– Accepts: 7,050  ->  Known Bads: 874, Known Goods: 6,176
Techniques to Perform Reject Inference
Define as bad:
– define all rejects as bads
– reinforces the credit scoring policy and prejudices of the past
In SAS Enterprise Miner (reject inference node):
– hard cutoff augmentation
– parcelling
– fuzzy augmentation
Hard Cut-Off Augmentation
AKA Augmentation 1:
– build a model using known goods and bads
– score rejects using this model and establish expected bad rates
– set an expected bad rate level above which an account is deemed bad
– classify rejects as good and bad based on this level
– add to the known goods/bads
– remodel
.6% 11.3% 21.Parcelling
Rather than classifying rejects as good or bad outright, parcelling assigns them proportionally to the expected bad rate at each score:
– score the rejects with the good/bad model
– split the rejects into proportional good and bad groups

Score     Reject   # Bad   # Good   % Bad    % Good   Rej-Bad   Rej-Good
0-99        342      24       10    70.6%    29.4%      240        102
100-199     654      54      196    21.6%    78.4%      141        513
200-299     345      43      331    11.5%    88.5%       40        305
300-399     471      32      510     5.9%    94.1%       28        443
400+        778      29     1232     2.3%    97.7%       18        760
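The parcelling allocation in the table can be sketched in Python (an illustrative stand-in for the SAS Enterprise Miner node; the first band's 240 on the slide differs slightly from direct rounding of the exact bad rate):

```python
bands = {
    "0-99":    {"bad": 24, "good": 10,   "rejects": 342},
    "100-199": {"bad": 54, "good": 196,  "rejects": 654},
    "200-299": {"bad": 43, "good": 331,  "rejects": 345},
    "300-399": {"bad": 32, "good": 510,  "rejects": 471},
    "400+":    {"bad": 29, "good": 1232, "rejects": 778},
}

parcelled = {}
for band, d in bands.items():
    # bad rate observed among the accepts in this score band
    bad_rate = d["bad"] / (d["bad"] + d["good"])
    rej_bad = round(d["rejects"] * bad_rate)
    parcelled[band] = (rej_bad, d["rejects"] - rej_bad)
```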
Parcelling
However:
– the reject bad proportion cannot be the same as for the approved population
Therefore:
– allocate a higher proportion of bads to the rejects
– rule of thumb: the bad rate for rejects should be 2-4 times that of the approved applicants
Nearest Neighbor Methods
– create two sets of clusters: goods and bads
– run the rejects through both cluster sets
– compare Euclidean distances to assign the most likely performance
– combine accepts and rejects and remodel
However, rejects may be situated in regions far away from the accepts.
Bureau-Based Reject Inference
Bureau data:
– performance of declined applications on similar products with other companies
– legal issues
– difficult to implement in practice: timings, definitions, programming
[Timeline: Jan 99 (declined; got credit elsewhere) to Dec 00 (analyze performance)]
Additional Techniques to Perform Reject Inference
– Approve all applications
– Three-group approach
– Iterative reclassification
– Mixture distribution modelling
Reject Inference
"... There is no unique best method of universal applicability. That is, unless extra information is obtained, the best solution is to obtain more information (perhaps by granting loans to some potential rejects) about those applicants who fall in the reject region." (David Hand, 1998)
Cost of granting credit to rejects versus the benefit of a better scoring model (for example, work by Jonathan Crook).
Reject Inference
Banasik et al. were in the exceptional situation of being able to observe the repayment behavior of customers that would normally be rejected. They concluded that the scope for improving scorecard performance by including the rejected applicants into the model development process is present but modest. Banasik J., Crook J.N., Thomas L.C., Sample selection bias in credit scoring models, In: Proceedings of the Seventh Conference on Credit Scoring and Credit Control (CSCCVII’2001), Edinburgh, Scotland, 2001.
158
Reject Inference
No method of universal applicability.
Much controversy.
Withdrawal inference:
– competitive market
159
Withdrawal Inference
[Diagram: the through-the-door population splits into Withdrawn, Rejects, and Accepts; Goods and Bads are only observed for the accepts, and remain unknown (? Goods / ? Bads) for the withdrawn and rejected applicants]
Section 10
Behavioral Scoring
.
Why Use Behavioral Scoring?
– Debt provisioning
– Setting credit limits (revolving credit)
– Authorizing accounts to go into excess
– Collection strategies: what preventive actions should be taken if a customer starts going into arrears? (early warning system)
– Marketing applications (for example, customer segmentation and targeted mailings)
– Basel II!
Behavioral Scoring
Once behavioral scores have been obtained, they can be used to increase/decrease the credit limits.
But: the score measures the risk of default given the current operating policy and credit limit! So, using the score to change the credit limit invalidates the effectiveness of the score!
Analogy: "only those who have no or few accidents when they drive a car at 30 mph in the towns should be allowed to drive at 70 mph on the motorways" (Thomas, Edelman, and Crook 2002). Other skills may be needed to drive faster!
Similarly, other characteristics are required to manage accounts with large credit limits when compared to accounts with small credit limits.
Typically, one divides the behavioral score into bands and gives a different credit limit to each band.
Variables Used in Behavioral Scoring
More variables than application credit scoring:
– application variables
– credit bureau data (for example, monthly updates)
– repayment and usage behaviour characteristics
Example usage behavior characteristics:
– maximum and minimum levels of balance, trend in balance, credit turnover, trend in payments, number of missed payments, times the credit limit was exceeded, ... during the performance period
Variables Used in Behavioural Scoring
Define derived variables measuring the financial solvency of the customer.
A customer can have many products and a different role for each product:
– for example, primary owner, secondary owner, guarantor, ...
Think about a product taxonomy to define customer behavior:
– for example, private versus professional products; average savings amount; look at different savings products
The best way to augment the performance of a scorecard is to create better variables!
Aggregation Functions
– Average (sensitive to outliers)
– Median (insensitive to outliers)
– Minimum/maximum: for example, highest utilization during the last 12 months, worst status during the last 6 months, ...
– Absolute trend: (x_T - x_{T-6}) / 6
– Relative trend: (x_T - x_{T-6}) / (6 x_{T-6})
– Most recent value, value 1 month ago, ...
– Ratio variables: e.g. the obligation/debt ratio, calculated by adding the proposed monthly house payment and any regular monthly installment and revolving debt and dividing by the monthly income; others: current balance/credit limit, ...
Impact of Using Aggregation Functions
– Impact of missing values: how to calculate trends?
– Ratio variables might have distributions with fat tails (large positive and negative values): use truncation/winsorizing!
– The data set expands in the number of attributes: input selection!
– What if input selection allows you to choose between, for example, the average and the most recent value of a ratio?
  – choose the average value for cyclically dependent ratios
  – choose the most recent value for structurally (non-cyclically) dependent ratios
Behavioral Scoring
Implicit assumption:
– the relationship between the performance characteristics and the subsequent delinquency status of a customer is the same now as 2 or 3 years ago
Approaches:
– classification approach to behavioral scoring
– Markov chain approach to behavioral scoring
Classification Approach to Behavioral Scoring
Performance period versus observation period:
[Timeline: observation point -> performance period -> outcome point]
Develop a classification model estimating the probability of default during the performance period.
Length of the performance and observation periods: a trade-off, typically 6 months to 1 year.
Seasonality effects.
How to Choose the Observation Point?
Seasonality: dependent upon the observation point.
Develop 12 PD models:
– less maintainable
– need to backtest/stress test/monitor all these models!
Use sampling methods to remove seasonality:
– use 3 years of data and sample the observation point randomly for each customer during the second year
– only 1 model
– gives a seasonally neutral data set
0.799439097. '.3907.881.05.74:897003/:.3/$9430
86:.9.70/:942.9047.9439700 %.33/..
.9.943 090..79..7098.3 .47928 ":3.843%7008
#0.881.3/#0708843%7008 702..7098.943. #07088439700 %.943..4393:4:8 3907.3 803 .3 70/2.3 #%.:78.943 ..
250
Decision Tree Example
[Figure: example classification tree – each internal node tests a characteristic (e.g. income, years at address), each leaf node assigns Good Risk or Bad Risk]

Decision Trees Terminology
Root node, internal nodes, leaf nodes

Growing a Decision Tree
Splitting decision: which characteristic to test at a node (e.g. married or not)
Stopping decision: when to stop growing the tree
Assignment decision: which class (good or bad) to assign to a leaf node
Splitting Decision: Impurity
Measure the impurity (disorder) of a node: a node containing a 50/50 mix of goods and bads has maximal impurity, a pure node has minimal impurity
Entropy: E(S) = −pG log2(pG) − pB log2(pB), with pG (pB) the proportion of goods (bads) in node S
Consider the candidate splits and compute for each the decrease in entropy (the gain); choose the split with the largest gain
[Worked example: the entropy of the top node and of the left and right child nodes is computed with this formula; the gain of the split is the top-node entropy minus the weighted average of the child-node entropies]
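A minimal sketch of the entropy-based splitting calculation for a decision tree; the node counts are illustrative, not the figures from the slides:

```python
import math

def entropy(p_good, p_bad):
    """Entropy of a node with class proportions p_good and p_bad."""
    e = 0.0
    for p in (p_good, p_bad):
        if p > 0:
            e -= p * math.log2(p)
    return e

def gain(parent, children):
    """Information gain of a split; parent = (n_good, n_bad), children = [(n_good, n_bad), ...]."""
    n = sum(parent)
    e_parent = entropy(parent[0] / n, parent[1] / n)
    e_children = 0.0
    for g, b in children:
        m = g + b
        e_children += (m / n) * entropy(g / m, b / m)
    return e_parent - e_children

# Illustrative counts: 400 goods / 400 bads at the top node,
# split into (300 G, 100 B) and (100 G, 300 B).
print(entropy(0.5, 0.5))  # maximal impurity: 1.0
print(gain((400, 400), [(300, 100), (100, 300)]))
```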
Stopping Decision
Use an independent validation set
– Training set for the splitting decision, validation set for the stopping decision
– Avoids growing a tree that overfits the training data
Stop adding nodes when the validation set error is minimal
[Figure: misclassification error versus number of tree nodes – the training error keeps decreasing while the validation error reaches a minimum: STOP growing the tree there]
Properties of Decision Trees
Advantages: interpretable, nonparametric, robust to outliers
Disadvantages: sensitive to small changes in the data
Performance can be improved by ensembles, e.g. bagging, boosting, random forests

Measuring the Performance of Credit Scoring Models
Split-sample method: build the scorecard on a training set (e.g. two-thirds of the data) and measure its performance on an independent test set (the remaining one-third)
Performance measures: classification accuracy, sensitivity, specificity, ROC curve and AUC, CAP curve and accuracy ratio, Kolmogorov-Smirnov distance
The Confusion Matrix
[Table: predicted class versus actual class – true positives (TP), false positives (FP), false negatives (FN), true negatives (TN)]
Classification accuracy = (TP + TN)/(TP + FP + FN + TN)
Classification error = (FP + FN)/(TP + FP + FN + TN)
Sensitivity = TP/(TP + FN)
Specificity = TN/(TN + FP)
Note that all these measures depend upon the cutoff!
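The cutoff-dependent measures can be computed from a confusion matrix as follows; the counts are illustrative:

```python
def confusion_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and classification accuracy from confusion matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of actual positives caught
    specificity = tn / (tn + fp)   # fraction of actual negatives caught
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy

# Illustrative counts, not from the slides.
sens, spec, acc = confusion_metrics(tp=40, fp=10, fn=20, tn=130)
print(sens, spec, acc)
```

Moving the cutoff changes the four counts, so all three numbers change with it.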
The ROC Curve
Plot sensitivity versus 1 − specificity for all possible cutoffs
[Figure: ROC curves – the diagonal represents a random model, a perfect model passes through the upper-left corner; the closer the scorecard's curve gets to the upper-left corner, the better]

The Area Under the ROC Curve (AUC)
The AUC estimates the probability that a randomly chosen good is ranked higher by the scorecard than a randomly chosen bad (equivalent to the Wilcoxon-Mann-Whitney statistic)

The CAP Curve and the Accuracy Ratio
Sort the population from high score to low score and plot, for each percentage of the sorted population, the cumulative percentage of defaulters captured: the Cumulative Accuracy Profile (CAP), also called the Lorenz or power curve
Accuracy Ratio (AR) = area between the model CAP and the random CAP, divided by the area between the perfect CAP and the random CAP
– A perfect model has an AR of 1, a random model an AR of 0
The AR is sometimes called the Gini coefficient and relates linearly to the AUC: AR = 2 × AUC − 1
The Kolmogorov-Smirnov Distance
The KS distance is the maximum vertical separation between the cumulative score distributions of the goods and of the bads:
KS = max over s of |P(S ≤ s | good) − P(S ≤ s | bad)|
[Figure: the two cumulative score distributions with the KS distance indicated]
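A sketch of three discriminatory-power measures used in credit scoring: the AUC as the probability that a random good outscores a random bad, the accuracy ratio via AR = 2·AUC − 1, and the Kolmogorov-Smirnov distance between the two score distributions. The scores are illustrative:

```python
def auc(good_scores, bad_scores):
    """Probability that a random good scores above a random bad (ties count 1/2)."""
    wins = 0.0
    for g in good_scores:
        for b in bad_scores:
            wins += 1.0 if g > b else 0.5 if g == b else 0.0
    return wins / (len(good_scores) * len(bad_scores))

def accuracy_ratio(good_scores, bad_scores):
    """Gini / accuracy ratio via the linear relation AR = 2*AUC - 1."""
    return 2.0 * auc(good_scores, bad_scores) - 1.0

def ks_distance(good_scores, bad_scores):
    """Maximum vertical gap between the cumulative score distributions."""
    def cdf(scores, s):
        return sum(1 for x in scores if x <= s) / len(scores)
    points = sorted(set(good_scores) | set(bad_scores))
    return max(abs(cdf(good_scores, s) - cdf(bad_scores, s)) for s in points)

goods = [620, 660, 700, 740]   # illustrative scores
bads = [500, 560, 660]
print(auc(goods, bads), accuracy_ratio(goods, bads), ks_distance(goods, bads))
```

The pairwise AUC loop is O(n·m); rank-based formulas are used on real portfolios, but give the same number.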
Setting the Cutoff
The cutoff choice depends upon the decision maker's preferences: trade off the acceptance rate against the bad rate among the accepted customers
Strategy curve: plot the bad rate against the acceptance rate for every possible cutoff
[Figure: strategy curve – cumulative goods and bads, bad rate versus accept rate]
The cutoff can also be chosen based on the marginal good-bad rates around the cutoff

Accept, Reject or Refer
Instead of one cutoff, use two: accept above the upper cutoff, reject below the lower cutoff
The gray zone in between requires further human inspection: the Refer option
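A sketch of a strategy-curve computation: for each candidate cutoff, the acceptance rate and the bad rate among the accepted applicants. The scores and labels are made up:

```python
def strategy_curve(scores_with_labels):
    """For each candidate cutoff, (cutoff, acceptance rate, bad rate among accepts).

    scores_with_labels: list of (score, is_bad) pairs; applicants scoring at or
    above the cutoff are accepted."""
    n = len(scores_with_labels)
    curve = []
    for cutoff in sorted({s for s, _ in scores_with_labels}):
        accepts = [(s, bad) for s, bad in scores_with_labels if s >= cutoff]
        accept_rate = len(accepts) / n
        bad_rate = sum(bad for _, bad in accepts) / len(accepts)
        curve.append((cutoff, accept_rate, bad_rate))
    return curve

# Illustrative portfolio, not from the slides: (score, is_bad).
data = [(500, 1), (550, 1), (600, 0), (650, 0), (700, 0)]
for cutoff, ar, br in strategy_curve(data):
    print(cutoff, ar, br)
```

Raising the cutoff lowers both the acceptance rate and the bad rate; the curve makes that trade-off explicit.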
References
Baesens B., Van Gestel T., Viaene S., Stepanova M., Suykens J., Vanthienen J., Benchmarking state-of-the-art classification algorithms for credit scoring, Journal of the Operational Research Society, 2003
Conclusions
Well-designed classifiers of many types (logistic regression, decision trees, SVMs, ...) achieve consistently good performance for credit scoring: the flat maximum effect
Although the performance differences are small and marginal, they can still matter commercially given the size of credit portfolios

Credit Reference Agencies
Examples: Experian, Equifax, TransUnion
Types of information shared: postal address, electoral roll, court judgments, previous searches, default and delinquency information contributed by lenders
Input Selection
Why input selection?
– Curse of dimensionality: the number of training examples needed grows with the number of inputs
– Remove irrelevant inputs (not related to the target) and redundant inputs (correlated with other inputs)
An exhaustive search over all input subsets is infeasible; use filter methods or heuristic search
Filter methods
– Categorical inputs: chi-squared analysis, Cramer's V, information value
– Continuous inputs: Fisher score, Pearson correlation

Chi-Squared Analysis
Example: contingency table of marital status (married yes/no) versus good/bad payer, with row and column totals
Under the independence assumption, the expected frequency of a cell is (row total × column total)/grand total
Test statistic: chi² = sum over all cells of (observed − expected)²/expected, chi-squared distributed with the appropriate degrees of freedom
The larger the value of chi², the more predictive the input
Cramer's V = sqrt(chi²/n) is bounded between 0 and 1: the closer to 1, the more predictive; compute it for every input and select the best ones
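A sketch of the chi-squared and Cramer's V computation for a 2×2 contingency table; the counts are invented, not the ones from the slides:

```python
import math

def chi2_and_cramers_v(table):
    """Chi-squared statistic and Cramer's V for a 2x2 contingency table.

    table = [[a, b], [c, d]] with rows = attribute value, cols = good/bad."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For a 2x2 table, min(rows, cols) - 1 = 1, so V = sqrt(chi2 / n).
    v = math.sqrt(chi2 / n)
    return chi2, v

# Illustrative table (married yes/no versus good/bad payer), counts made up.
chi2, v = chi2_and_cramers_v([[500, 100], [300, 100]])
print(chi2, v)
```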
The Fisher Score
For a continuous input: Fisher score = |mean_G − mean_B| / sqrt(var_G + var_B), where mean_G (mean_B) and var_G (var_B) are the mean and variance of the input for the goods (bads)
The higher the Fisher score, the more the input discriminates between goods and bads
Example: Fisher scores computed on the German credit data set
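A sketch of the Fisher score |mean_G − mean_B| / sqrt(var_G + var_B) for a continuous input; the values are illustrative and not taken from the German credit data:

```python
import math

def fisher_score(good_values, bad_values):
    """Higher score = input separates goods from bads better."""
    def mean(xs):
        return sum(xs) / len(xs)
    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    return abs(mean(good_values) - mean(bad_values)) / math.sqrt(var(good_values) + var(bad_values))

# Illustrative income values (thousands).
goods = [30, 40, 50, 60]
bads = [10, 20, 30, 40]
print(fisher_score(goods, bads))
```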
Forward/Backward/Stepwise Regression
Use the p-values of the estimated coefficients for input selection
– An input enters the model when its p-value is significant at the chosen level and is removed when its p-value is no longer significant
– Inputs are added or removed one at a time based upon their importance
Search Strategies for Input Selection
Greedy heuristics: forward selection, backward elimination, stepwise selection
[Figure: example search space – the lattice of all subsets of the inputs, from the empty set {} to the full input set]
Stepwise Selection in SAS PROC LOGISTIC
SELECTION=FORWARD: starts from the intercept-only model; at each step the most significant effect not yet in the model is entered, provided its p-value meets the SLENTRY level specified in the MODEL statement
SELECTION=BACKWARD: starts from the full model; at each step the least significant effect is removed, until all remaining effects meet the SLSTAY level
SELECTION=STEPWISE: like forward selection, except that effects already in the model can be removed again when they no longer meet the SLSTAY level
SELECTION=SCORE: ranks the best models of each size using the score chi-squared statistic
[SAS example: PROC LOGISTIC run with stepwise selection on an example data set; the output lists the effects entered and removed at each step]

AUC-Based Input Selection
Start with the model containing all inputs
Remove each input in turn and re-estimate the model; drop the input whose removal affects the AUC the least
Repeat this procedure until the AUC performance decreases significantly
Reject Inference
[Figure: the through-the-door population splits into Accepts and Rejects; goods and bads are only observed for the accepts]
The reject inference problem
– The scorecard is built on the accepts only, but is applied to the whole through-the-door population: sample bias!
– Solution: reject inference, i.e. infer the unknown good/bad performance of the rejects
[Figure: through-the-door population with known goods/bads for the accepts and inferred goods/bads for the rejects]
Techniques to Perform Reject Inference
Hard cutoff augmentation
– Build a model using the known goods/bads of the accepts
– Score the rejects using this model and classify them as good or bad by applying a cutoff
– Rebuild the model on the accepts plus the augmented rejects
Parceling
– Band the score range and compute the observed proportions of goods and bads of the accepts per score band
[Table: score bands with the percentage of goods and bads per band]
– Randomly assign the rejects within each band to the good/bad classes in proportion to the observed good/bad rates of that band
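The parceling step can be sketched as follows; the score bands, band bad rates and reject counts are all invented for illustration:

```python
import random

# Within each score band, rejects are randomly labeled good/bad in
# proportion to the observed bad rate of the accepts in that band.
random.seed(1)
band_bad_rate = {"low": 0.40, "medium": 0.15, "high": 0.05}  # illustrative rates

def parcel(rejects_per_band):
    """rejects_per_band: {band: number of rejects}; returns inferred labels per band."""
    labels = {}
    for band, count in rejects_per_band.items():
        labels[band] = ["bad" if random.random() < band_bad_rate[band] else "good"
                        for _ in range(count)]
    return labels

inferred = parcel({"low": 100, "medium": 100, "high": 100})
print({band: labels.count("bad") for band, labels in inferred.items()})
```

The inferred bad counts track the band rates, so low-score rejects receive many more inferred bads than high-score rejects.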
40.20.8 .5574..33490908.5574.7.!.9884:/0 92089.07 #00.0/ %0701470 4.941./81742700./574547943.9 #:0419:2.90147700.
/8 #:3700..0.70:.3.7.88324890 5071472.0894.98.7089047094/8
70.3.0598
.089:.3/700.98974:49../0.90/3704381..0 4230. 174290.3/7024/0 40.:89078 425..9094809841.3/89.3/..:8907844/8.07 700.982.0598.
98 94907.3 0.425.:70.0
:70..0
.:. 5071472.3.5584382.80/#00.30/.9310703.9.
0.:/.:9942502039357.9.70/9080070
3.223
.30/ 49 .0 9238 /0139438 5747.308 0.0 5071472.041/0.88:08 /11.3.7574/:.
.//943.9438 %700
74:5.5574.9.0.%0.0#0
. 907.943 9:70897:94324/03
.0
5574.9310703.55..881.36:0894!071472 #00...
97043 .58 7.98 .250 4743.38948420549039.0
%070834:36:00892094/41:3.3934.078:8030194109907 8.943 507./.943849.9 :3088097.98.390700.9. .3/ 489417..3247031472.31472.#00.3744
..9310703..393.55.078.47324/0 1470.30/%.70/994700.700.4:99480 .98 9008984:94389449.55.39841.4393:0/
.
04370/9 $.3/
.0
.9310703.8 $.47324/08 3 !74.8 744 %42.20390.4741 .0944807.0700.49.:89420789.43.94:/3472.0452039574.03.250 800.4501472574.83.943.08885708039:9 24/089 .:/390700.473.3/70/943974 $'
/3:7 $..398 3949024/0/0.90/%0 .#00.809.9908.89:.55.090705.0394310703.05943.7/ 5071472.94341 03.00/384190$0.3.:/0/9.0703900..3..38.90/.3.470.70/98.
..709
.9 :.0 425099.0
42094/41:3.310703.078 9/7..43974.9310703.55.078.#00..02.
310703.0
%74:
90
/447
9/7.3
#00..0598
../8
44/8
./8
44/8
./8
44/8
.9/7.98
Behavioral Scoring
Scoring existing customers based on their observed behavior, e.g. for setting credit limits and as an early warning system for delinquency
Characteristics used in behavioral scoring
– Examples: payments, number of missed payments, trend in payments, number of times the credit limit was exceeded, checking account balance, credit limit usage
– Can be defined at the product level or aggregated to the customer level
Aggregation functions
– Mean (sensitive to outliers) versus median (insensitive to outliers)
– Minimum and maximum over the last X months
– Most recent value
– Absolute trend: Xt − Xt−1
– Relative trend: (Xt − Xt−1)/Xt−1
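The aggregation functions can be sketched as follows; the balance history and the trend variant (computed over the whole window) are illustrative:

```python
def aggregate(balances):
    """Aggregation functions over a customer's monthly balance history
    (most recent value last). The dictionary keys are illustrative names."""
    most_recent = balances[-1]
    # Relative trend over the whole window (one common variant).
    trend = (balances[-1] - balances[0]) / balances[0]
    return {
        "min": min(balances),
        "max": max(balances),
        "average": sum(balances) / len(balances),
        "most_recent": most_recent,
        "trend": trend,
    }

print(aggregate([1000, 1200, 900, 1100, 1500]))
```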
.0.94.0/897:943891.:.9.25.:08 494..941&8370.9438
25...9412883.7..:08 &8097:3..943...705489.943:3.3/30..8 .909703/8 #.0829.99.0.
250 ..0...07.943.9. /0503/0397.3/248970.039147897:.:041.039..0..07.80905..948
.9135:9800.943 .948 4480248970.448009003 1470..9:7.94 4480.7./0503/0397.3/83903:20741.997:908 35:9800. 343
.:0147.484:94..38473 .
0 . 89.90789.0.9:841.$..5574.3.473 .940.9438509003905071472.74.847 0.47.881...8.943....:8942078908..47.47.8.32094/940..2034.9.4 .3/908:806:039/036:03.473
25..473
.78.7.88:25943 #0.8.
940.3/4807.078:84807.9435074/ 039415071472.05074/.93574./0
411 %5.98
..9435439 :9.9435574.4205439
0..881...473
!071472.05074/.90110.941 /01.3.3..94324/00892.3.47.:9/:734807.. $..24398940.7 $0.05074/ 807...9435074/ %7.881..843.045..9435074/
!071472.
0 00/94..843.943!439
$0.9 /0503/039:5434807..045!24/08 0882.9089..494448090 807.3.39.9435439 0.
8970889089.
943 54397.80..2532094/8947024.243947...080.9.9.9080 24/08 &808.250904807.9 &800.:894207/:7390 80.7841/.43/0.08.7 324/0 ..30:97./.809
.843.843.3/8.3/421470.