
ABC’s Analytics PD Modeling

Training Agenda

June, 2011
Agenda

1. Day 1 – Introductions / High Level Overview / Using WPS
2. Day 2 – Data Cleansing / Background Theory / Position Analysis
3. Day 3 – Single-Factor Analysis / Multi-Factor Analysis
4. Day 4 – Multi-Factor Analysis / Validation / Calibration
5. Day 5 – Q&A / Next Steps

Training 2
1 - Introductions
- High level overview
- WPS
- Matlab

Training 3
Introductions

» ABC's Introductions

Training 4
Training 5
» This training session covers the statistical portion of the model only

Training 6
Project Goals

Data Cleansing
- Cleansing dataset based on rules
- Data cleansing report

Quantitative Model Development
- Single Factor Analysis
- Multi Factor Analysis

Final Model Selection
- Make recommendations based on final models
- Validation process
- Expected or desired final rating distribution
- Recommendations and conclusions

Model Validation and Calibration
- Review of internal rating master scale
- CT estimate
- Calibration process and results

Documentation
- Data cleansing process and results
- Modelling results
- Qualitative model building process
- Excel-based model

Training 7
High level process overview
Data Preparation

Data preparation involves the definition of default, cleaning the financial statement data, and identifying and filtering the customers so they are consistent with the business segment for which the model is being built.

Before beginning any analysis, the data must be cleansed by removing financial statements that do not satisfy the following:

» Ratio check – running the dataset through a series of data cleansing rules

» Make a default definition – a consistent definition of default has to be determined in order to properly classify the underlying data

» Determine the default horizon – determining a time window which classifies defaulters
and non defaulters

» Segmentation Analysis – the analysis of the development sample amongst different splits

Training 8
High level process overview
Single Factor Analysis

The aim of single factor analysis (SFA) is to study the relationship of each factor to a number of criteria. Specifically, single factor analysis looks at a factor's consistency with its underlying economic hypothesis, its relation to default, its power to predict and explain default, and how well the factor is populated for the given sample.

» Given the large number of possible ratios, it is important to reduce the list of ratios that enter the final model selection process based on the best performing. This screening of ratios is based on the following criteria:

– There must be enough observations. Observations where many values are missing typically indicate that the information is difficult to obtain. This information should therefore not be included in the final model → Position Analysis
– The distributions of the ratio values for non-defaulted and defaulted companies should differ significantly to increase the discriminatory power of the model → T-tests (Kolmogorov-Smirnov)
– They must be consistent with their associated economic assumption → Shape Analysis
– The correlation between the ratios should be low to ensure a robust and simple model → Correlation Analysis
– The discriminatory power of each ratio should be sufficient. We want to keep in our set of potential regressors those factors that have a high discriminatory power between defaulted and non-defaulted companies → Accuracy Ratios

Training 9
High level process overview
Single Factor Analysis

Training 10
High level process overview
Multi Factor Analysis

» Multi-Factor Analysis reveals the discriminatory powers of the different combinations of transformed factors from the short-list as a model.

Based on the results of that analysis we exclude some factors from entering the multi-factor analysis. The multi-factor analysis (MFA) uses several selection procedures to further restrict the list of factors down to about n factors.

Training 11
High level process overview
Multi Factor Analysis

Training 12
High level process overview
Financial Model Selection

The final model selection aims at selecting those factors which will constitute the financial part of the rating model.

Model Candidates
Step 1: Keep ONLY positive coefficients (except intercept)
Step 2: Keep candidates without known correlated factors from SFA
Step 3: Keep candidates without any dominant factors
Step 4: Keep candidates with all factors significant
Step 5: Keep candidates with the highest AR
Step 6: Consider business meaning
• Consider the economic meaning of factors
• Preferably in Profit, Leverage and Coverage categories
• With possible broad representation

Five Recommended Models → Validation

Training 13
High level process overview
Validation

» The validation tests the discriminatory power, robustness and stability of the model. If a
huge amount of data is available, out-of-sample tests are the best way for testing the
model regarding these different criteria. However, with a limited amount of data, it is
better to use all data for the development of the model and conduct some tests which
mimic the out-of-sample test, like bootstrapping or modified k-fold test. We will use:

– VIF test to test for model robustness
– K-Fold test for model stability
– Bootstrap to test model stability

Once the final model is selected, it is tested or validated. As indicators of model performance, power statistics (accuracy ratio) are measured on the entire sample.

Training 14
High level process overview
Calibration

» Creates a mapping from score to PD%, such that PD’s generated by the model are
empirically accurate

– Based on calibrated PD, we can compare the risk of customers between different asset classes and portfolios
– Quantification of credit risk enables Pricing, Limit System, Regulatory Capital (Basel II), etc.

» Creates a bucketing system that is capable of assigning obligors into risk grades

» Finally calibration maps each obligor to a bank master scale

Finally, the model needs to be calibrated to probabilities of default. Calibrating the model is necessary because the model is developed on a sample which usually does not reflect the "natural" default rate within a country's economy. Therefore, the model needs to be adjusted to an observed central default tendency within a given time horizon.

Training 15
High level process overview
Calibration

Criteria 1 - Set the distribution of obligors on grades (e.g. 2% obligors on grade 1, 4% obligors on grade 2).
Sort the obligors by score from low risk to high risk. Together with the distribution above, we can determine the score boundaries for each grade.

Criteria 2 - Is the average PD close to the CT value? (Average PD = 2% * PD1 + 4% * PD2 + …)
PD1, PD2, … are the PD midpoints for each grade of the master scale.

Criteria 3 - For each grade, calculate the upper boundary of the default number using a binomial test.
Is the real default number lower than the upper boundary? (conservative)

Training 16
High level process overview
Calibration
If the three criteria have been satisfied, we can do the calibration, i.e. map score to %PD:
1. Map the score boundary to the PD boundary for each grade.
2. Apply PCHIP to get a smoothed calibration curve.

< Chart: master scale PD bounds and the calibration curve – Grade A covers the score range 75–80, with a PD bound from 0.28% (min) to 0.50% (max) >

E.g. a company with a score between 75 and 80 is assigned grade A

Training 17
High level process overview
Combining Qualitative and Quantitative Score

Training 18
WPS

Training 19
WPS

Importing

PROC IMPORT OUT=&ClientLib..ImportExcelWide
            DATAFILE="I:\Avtar\XYZ training\Data\testdata.csv"
            DBMS=CSV REPLACE;
    GETNAMES=YES;
RUN;

Training 20
WPS

Exporting

PROC EXPORT DATA=Temp1
            OUTFILE="&TemplatePath.\01)BenchmarkGradePD.csv"
            DBMS=CSV REPLACE;
RUN;

Training 21
Matlab

Training 22
2 - Data cleansing
- Background Theory

Training 23
Data Cleansing
Before beginning any analysis, the data must be cleansed by removing financial statements that do not satisfy the following:

» Ratio check – running the dataset through a series of data cleansing rules

» Make a default definition – a consistent definition of default has to be determined in order to properly classify the underlying data

» Determine the default horizon – determining a time window which classifies defaulters and non-defaulters

» Segmentation Analysis – the analysis of the development sample amongst different splits

Training 24
Data Cleansing
Ratio check
ID | Rule Name | Verbal Description | Rule
1 | Financial Check | Are financial statement dates in the future or in the very past? | PUBLICDATE within pre-defined boundaries
2 | Intangibles Check | Total Intangibles < 0 | 1110 + 1128 < 0
3 | Total Fixed Assets Check | Total Fixed Assets < 0 | 1290 < 0
4 | Long Term Investment Check | Long Term Investments < 0 | 1190 < 0
5 | Negative Non Current Assets Check | Total Non Current Assets < 0 | 1490 – 1090 < 0
6 | Current Assets Check | Do all current assets add up to the actual sum? | 1020 + 1030 + 1040 + 1050 ≠ 1090
7 | Total Inventory Check | Total Inventory < 0 | 1040 < 0
8 | Total Account Receivables Check | Total Account Receivables < 0 | 1030 < 0
9 | Cash Check | Cash < 0 | 1020 < 0
10 | Negative Current Assets Check | Total Current Assets < 0 | 1090 < 0
11 | Total Assets Check | Total Assets < 0 | 1490 < 0
12 | Current Liabilities Check | Do all current liabilities add up to the actual sum? | 1510 + 1520 + 1530 + 1540 ≠ 1590
13 | Long Term Debt Current Maturities Check | Long Term Debt Current Maturities < 0 | 1520 < 0
14 | Total Accounts Payable Check | Total Accounts Payable < 0 | 1530 < 0
15 | Total Current Liabilities Check | Total Current Liabilities < 0 | 1590 < 0
16 | Long-term Liabilities Check | Do all long term liabilities add up to the actual sum? | 1620 + 1615 + 1620 + 1630 + 1640 – 1645 ≠ 1690
17 | Long-Term Debt Check | Long Term Debt < 0 | 1610 + 1615 < 0
18 | Long-Term Liabilities Check | Long Term Liabilities < 0 | 1690 < 0
19 | Total Liabilities Check 2 | Total Liabilities < 0 | 1590 + 1690 < 0
20 | Total Provisions Check 2 | Total Provisions < 0 | 1710 < 0
21 | Revenue Check | Are the revenues consistent? | 2100 = 0 and 2310 = 0 and 2590 > 0
22 | Income Statement Check | Are the incomes consistent? | 2100 = 0 and 2290 = 0 and 2690 = 0
23 | Net Check | Net sales exist? | 2100 ≠ 0
24 | Total COGS Check | Total costs of goods sold < 0 | 2100 < 0
25 | Amortisation/Depreciation Check | Amortisation and Depreciation < 0 | 3110 < 0
26 | Total Operating Expenses Check | Total Operating Expenses < 0 | 2100 < 0
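As an illustration only, a minimal MATLAB sketch of how rules like these might be applied, assuming the statements sit in a table T whose variables are named after the position codes (pos1020, pos1030, ... are assumed names, not from the deck):

% Illustrative application of two of the rules above (column names are assumed)
isBadCash        = T.pos1020 < 0;                              % Rule 9: Cash < 0
sumCurrentAssets = T.pos1020 + T.pos1030 + T.pos1040 + T.pos1050;
isBadCurrAssets  = abs(sumCurrentAssets - T.pos1090) > 1e-6;   % Rule 6: components do not add up to 1090
toDrop = isBadCash | isBadCurrAssets;                          % in practice, OR together all rule flags
Tclean = T(~toDrop, :);                                        % keep only statements passing the rules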

Training 25
Data Cleansing
Default definition
Default may be defined as any of the following events (sorted by increasing severity):
» Cheque Return
» Watch List
» Past Due Above 90 days
» Debt Restructuring
» Demand for Payment
» Experts Definition
» Ongoing concern (warning of the auditors)
» Non interest bearing
» Provision
» Writing Off
» Bankruptcy
» Assets Liquidation

Training 26
Data Cleansing
Default Horizon
» We will evaluate the predictive power of the financial model over a one-year time horizon. The 1-year horizon is a window prior to the default event in which we look for a financial statement for the borrower. For the 1-year horizon we considered statements dated from 90 days up to 730 days before the default date. Predicting default on horizons shorter than 90 days is not practically useful since many financial statements are not completed and available within this time frame. The process of assigning a default flag is explained below:

– If a firm defaults within 90 days of the financial statement, that firm-year observation is dropped.
– If the difference between the default date and the financial statement date (the number of days
until default) is within the default window (90 to 730 days) then that firm-year observation is
labelled a default.
– If the firm does not default within the window during that firm-year, the observation is labelled as a
non-default.
– We retain all non-default firm-year observations, but only one default firm-year observation per
firm.
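A minimal MATLAB sketch of this flag-assignment logic, assuming statementDate and defaultDate are datetime vectors (defaultDate is NaT where a firm never defaults) aligned with a firm-year table named data; all names are illustrative:

daysToDefault = days(defaultDate - statementDate);          % NaN where there is no default event

dropObs     = daysToDefault >= 0 & daysToDefault < 90;      % default within 90 days: drop the observation
defaultFlag = daysToDefault >= 90 & daysToDefault <= 730;   % default inside the 90-730 day window: label 1
% everything else (no default, or default outside the window) keeps defaultFlag = 0

data        = data(~dropObs, :);
defaultFlag = defaultFlag(~dropObs);
% keeping only one defaulted firm-year per firm is a further step, not shown here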

Training 27
Data Cleansing
Segmentation Analysis

Analysing the development sample along three dimensions:

» Size
– All else being equal, larger companies are more stable, due to the diversification effect from their client base, product range, etc. One way to ensure accuracy is to include the size factor in the model in a way that gives better scores to larger companies.
» Industry
– Divided by different industry sectors

» Time period
– Divided by different time periods, typically by ‘years’.

Training 28
Background Theory

Many different metrics are used:

» Power curves and Accuracy Ratio

» Kolmogorov-Smirnov (K-S)

» Loess regressions

» Correlations (Pearson, Spearman, Kendall's Tau)

Training 29
Rank Ordering Performance : Power Curves and Accuracy Ratio

A power curve measures how efficiently a factor discriminates between defaults and non-defaults. The Accuracy Ratio (AR) summarises the power curve as A.R. = B / (A + B), where B is the area between the model's power curve and the random model and A + B is the corresponding area for the perfect model.

< Charts: (left) a power curve plotting % defaults captured against % of sample, e.g. the worst-scored 20% of the sample capturing 60% of defaults; (right) the relationship between the power curve and the Accuracy Ratio, A.R. = B / (A + B) >

Perfect Model: 100%
Random Model: 0%
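A hedged MATLAB sketch of how the power (CAP) curve and the accuracy ratio can be built from column vectors of model scores and 0/1 default flags; it assumes a higher score means higher risk and is a generic construction rather than code from the deck:

[~, order]  = sort(score, 'descend');              % riskiest obligors first
d           = defaultFlag(order);
cumDefaults = cumsum(d) / sum(d);                  % y-axis: cumulative share of defaults captured
cumSample   = (1:numel(d))' / numel(d);            % x-axis: cumulative share of the sample

B           = trapz([0; cumSample], [0; cumDefaults]) - 0.5;   % area between model curve and random model
p           = mean(defaultFlag);                   % sample default rate
AplusB      = 0.5 * (1 - p);                       % area between perfect model and random model
AR          = B / AplusB;                          % accuracy ratio = B / (A + B)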

Training 30
Kolmogorov-Smirnov (K-S) Test – Differentiate Defaulters from Non-defaulters

We performed the Kolmogorov-Smirnov (K-S) test to determine whether the distributions of each sub-factor on the non-defaulted and defaulted populations are statistically different:
» If the p-value is smaller than the 0.05 significance level, we reject the hypothesis that the non-defaulted population and the defaulted population are the same;
» Hence the factor does discriminate statistically between the two populations.
In general, we would perform the T-test as well. However, the T-test requires the normal distribution assumption, which may not hold for the data. Therefore we only used K-S, as it is a non-parametric method which does not assume any distribution.

Results Highlights:
• The majority of factors passed the above-mentioned K-S test and have discriminatory power.
• Detailed results including test statistics and p-values are available in the appendix.
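A minimal sketch of the two-sample K-S test in MATLAB (Statistics Toolbox), for one candidate factor held in the column vector factorValues with matching 0/1 defaultFlag (names are illustrative):

x0 = factorValues(defaultFlag == 0);     % non-defaulted population
x1 = factorValues(defaultFlag == 1);     % defaulted population
[h, p, ksStat] = kstest2(x0, x1);        % h = 1 rejects equality of the two distributions at the 5% level
passesScreen = (p < 0.05);               % factor discriminates statistically between the populations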

Training 31
Loess regression

The loess regression procedure fits the model y = f(x1,...,xk) + ε nonparametrically, i.e. without assuming a parametric form for f(x1,...,xk) such as f(x1,...,xk) = b0 + b1*x1 + ... + bk*xk.
The method uses data in a "local neighbourhood" of (x1,...,xk) to estimate E(Y) = f(x1,...,xk) via weighted least squares in the local neighbourhoods. The weights are chosen relative to the distance from the point (x1,...,xk).
The method can be tuned to produce a smoother graph by allowing a larger neighbourhood and tuning the weight parameters.
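For a single predictor, a loess fit can be sketched in MATLAB with the Curve Fitting Toolbox smooth function; the 30% span is an illustrative choice:

[xs, idx] = sort(x);                        % order the data by the predictor
ys        = y(idx);
yLoess    = smooth(xs, ys, 0.3, 'loess');   % local quadratic regression over a 30% neighbourhood
plot(xs, ys, '.', xs, yLoess, '-');         % a larger span gives a smoother curve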

Training 32
Correlations

Pearson
Spearman
Kendall’s

Training 33
3 - Single-Factor Analysis
- Multi-Factor Analysis

Training 34
Single-Factor Analysis

» The single factor analysis (SFA) is based on a 5 step approach which reduces the
original number of ratios to the short list of candidate ratios. These five steps use
statistical methods to answer the following questions:

– There must be enough observations. Observations where many values are missing typically indicate that the information is difficult to obtain. This information should therefore not be included in the final model → Position Analysis
– The distributions of the ratio values for non-defaulted and defaulted companies should differ significantly to increase the discriminatory power of the model → T-tests (Kolmogorov-Smirnov)
– They must be consistent with their associated economic assumption → Shape Analysis
– The correlation between the ratios should be low to ensure a robust and simple model → Correlation Analysis
– The discriminatory power of each ratio should be sufficient. We want to keep in our set of potential regressors those factors that have a high discriminatory power between defaulted and non-defaulted companies → Accuracy Ratios

Training 35
Position Analysis
Purpose:

» Assessing data availability to aid model estimation, and to ensure that the resulting
model will be practical

Test:

» Each factor is summarised by the proportion of missing/zero/positive/negative values
– For each factor or ratio we perform a position analysis, which summarizes for what percent of the
population this factor has a missing value. Some positions in financial statements may not be filled
which will lead to missing factor values. Usually, a ratio is dropped, if more than 10% of its values
are missing. The exceptions are growth ratios which have a higher percentage of missing values
because they require two consecutive financial statements. The threshold for dropping growth
ratios is normally 25%. If a ratio is kept and it contains missing values, the missing values will be
replaced with a ‘1’ in the normalisation process of the ratio. This means that a missing value gets
a neutral factor value and this category won’t alter the overall PD for the corporate; the PD is then
solely driven by the remaining factors.
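A sketch of such a position summary in MATLAB, assuming X is a numeric matrix with one column per factor and missing values coded as NaN:

pctMissing  = 100 * mean(isnan(X));         % share of missing values per factor (row vector)
pctZero     = 100 * mean(X == 0);           % NaN compares as false, so shares are of the full sample
pctPositive = 100 * mean(X > 0);
pctNegative = 100 * mean(X < 0);

dropFactor  = pctMissing > 10;              % standard ratios: drop if more than 10% missing
% growth ratios would instead be screened against the 25% threshold described above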

Training 36
T-tests (Kolmogorov-Smirnov)

We performed the Kolmogorov-Smirnov (K-S) test to determine whether the distributions of each sub-factor on the non-defaulted and defaulted populations are statistically different:
» If the p-value is smaller than the 0.05 significance level, we reject the hypothesis that the non-defaulted population and the defaulted population are the same;
» Hence the factor does discriminate statistically between the two populations.
In general, we would perform the T-test as well. However, the T-test requires the normal distribution assumption, which may not hold for the data. Therefore we only used K-S, as it is a non-parametric method which does not assume any distribution.

Results will highlight:
• Whether factors passed the above-mentioned K-S test and therefore have discriminatory power.
• Detailed results including test statistics and p-values are available in the appendix.

Training 37
Shape Analysis

The rank-ordered observations are grouped in bins, say 50; each bin contains, say, 2% of the development sample. The default rate for each bin is calculated and we obtain 50 data points.
The Loess function is used to interpolate between the data points and this curve is interpreted in the shape analysis. A steep monotonic curve is the best obtainable result; a flat or bumpy curve is not useful for a rating model because the discrimination between defaults and non-defaults does not take place.
This means that the shape analysis is closely related to the analysis of the discriminatory power, because a steep monotonic curve has a high discriminatory power and a flat curve has a very low accuracy ratio. The accuracy ratio supports the shape analysis, because a high accuracy ratio indicates a steep and monotonic curve. However, the individual analysis cannot be replaced, because a small monotonic slope may have the same accuracy ratio as a bumpy curve which is not monotonic at all. To make the transformation as consistent as possible and to avoid overfitting, monotonic curves (except for the U-shaped growth ratios) are indispensable.

< SHOW OUTPUTS >

Training 38
Correlation Analysis

The correlation between model factors is analysed: a high correlation signifies that the factors are capturing similar information, and the inclusion of both factors results in increased complexity and the potential over-weighting of a given dimension of the business in the model.
Analysing the correlations between potential model factors helps identify independent factors and, as a result, minimise the likelihood of over-weighting any particular dimension.
The analysis of factor correlations is based on three calculation methods:
• Pearson correlation – measures the dependence between two factors directly
• Spearman's Rank correlation – the Pearson correlation coefficient between the ranked variables
• Kendall's Tau correlation – a non-parametric rank correlation
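A sketch of the three correlation measures in MATLAB (Statistics Toolbox), for a matrix X of candidate factor values with one column per factor; the 0.6 screening threshold is illustrative:

Cpearson  = corr(X, 'Type', 'Pearson',  'Rows', 'pairwise');   % direct (linear) dependence
Cspearman = corr(X, 'Type', 'Spearman', 'Rows', 'pairwise');   % Pearson correlation of the ranks
Ckendall  = corr(X, 'Type', 'Kendall',  'Rows', 'pairwise');   % non-parametric rank correlation

[i, j] = find(triu(abs(Cspearman) > 0.6, 1));                  % flag highly correlated factor pairs
highlyCorrelatedPairs = [i, j];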

Training 39
Rank Ordering Performance : Power Curves and Accuracy Ratio

A power curve measures how efficiently a factor discriminates between defaults and non-defaults. The Accuracy Ratio (AR) summarises the power curve as A.R. = B / (A + B), where B is the area between the model's power curve and the random model and A + B is the corresponding area for the perfect model.

< Charts: (left) a power curve plotting % defaults captured against % of sample, e.g. the worst-scored 20% of the sample capturing 60% of defaults; (right) the relationship between the power curve and the Accuracy Ratio, A.R. = B / (A + B) >

Perfect Model: 100%
Random Model: 0%

Training 40
Factor Transformation

Purpose

» To capture the “realistic” relationship between factors and the default rate.

» A LOESS curve is fitted through the plot to obtain the “transformed” relationship, which is then standardised (rescaled to have a mean of 1).

» To use in subsequent analysis to “map” from the ratio value to the default rate.

» To allow the parameters estimated in the multi-factor analysis to be interpreted meaningfully as weights.

Training 41
Factor Transformation
LOESS is a local polynomial fitting method, which better fits the non-linear nature of the
relationship of factor values and default rates.

< Charts: (left) using a LOESS method to fit and capture the relationship of factor values to default; (right) forcing a monotonic (directional) relationship with default probabilities >

Training 42
Normalisation of the transformed ratios

The transformed ratio values are normalised before they are combined. This is done by dividing the transformation Ti(xi) by its mean, so the normalised transformation has the form NTi(xi) = Ti(xi) / mean(Ti(xi)).

If the value for one observation of ratio xi is missing, then NTi(xi) is given the neutral value ‘1’. The aim of the normalisation is to ensure that the transformation for each ratio has a mean of ‘1’.
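A minimal MATLAB sketch of this normalisation for one ratio, where T holds the transformed values and x the underlying raw ratio values:

NT = T / mean(T, 'omitnan');    % rescale so the transformed ratio has a mean of 1
NT(isnan(x)) = 1;               % missing ratio values receive the neutral value 1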

Training 43
Multi-Factor Analysis

» Multi-Factor Analysis reveals the discriminatory powers of the different combinations of transformed factors from the short-list as a model.

Training 44
Exhaustive search

In general, exhaustive search is the only technique guaranteed to find the predictor variable
subset with the best evaluation criterion. It is often the ideal technique when the number of
possible predictor variables is less than 20 (this number, to some degree, depends on the
computational complexity of evaluating a predictor variable subset).
The problem with exhaustive search is that it is often a computationally intractable
technique for more than 20 possible predictor variables. For regression models with 25
predictor variables, exhaustive search must check 33,554,431 subsets, and this number
doubles for each additional predictor variable considered.
There is no set rule in determining how many ratios a particular rating model should
contain: a model with too few variables will not capture all the relevant information, whereas
another one with too many variables will be powerful in-sample, but unstable when applied
elsewhere and will most likely have onerous data input requirements.
The ratios should cover the categories Return, Profit, Interest Coverage, Leverage, Operating Efficiency, Liquidity, Size and Growth.
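A hedged MATLAB sketch of an exhaustive search over a short-list, scoring every non-empty factor subset with a placeholder evaluation function evalSubset (e.g. an accuracy-ratio based criterion); with K short-listed factors there are 2^K − 1 subsets, which is 33,554,431 for K = 25, so this is only tractable for small K:

K       = size(NTshortlist, 2);      % columns = short-listed transformed factors
bestVal = -Inf;

for s = 1:(2^K - 1)                  % every non-empty subset, encoded as a bit mask
    cols = find(bitget(s, 1:K));     % decode the mask into column indices
    val  = evalSubset(NTshortlist(:, cols), defaultFlag);   % placeholder evaluation criterion
    if val > bestVal
        bestVal  = val;
        bestCols = cols;
    end
end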

Training 45
Probit Regression Framework

For each combination of factors, the model estimates the relationship between the
transformed factors and the default/non-default flags using a Probit Regression framework.
It is used to model the multivariate relationship between independent variables (the
standardised transforms of individual ratios) and default, since it has several useful
properties:

» The model output is bounded between 0 and 1 corresponding to probabilities of default

» It is intuitive in the sense that it is monotonic, so that a worse output leads to a worse PD

» It is reproducible and widely understood

Training 46
Probit Regression Framework

The model is based on the following functional form:

PD = F( Φ( β0 + β1·NT1(x1) + … + βN·NTN(xN) ) )

» where x1,...,xN are the input ratios; β1,…,βN are the estimated coefficients; Φ is the cumulative normal distribution; and F and NT1,...,NTN are normalised non-parametric transformations. The NTs are the transforms of each financial statement variable, which capture the non-linear impacts of financial ratios on default. Y is the dependent variable (i.e. the default / no-default flag). The observed value of Y is either 0 (not defaulted) or 1 (defaulted), whereas the calculated Y can be any value between 0 and 1. The final transform F captures the empirical relationship between the probit model score and actual default probabilities.
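A sketch of the estimation step in MATLAB (Statistics Toolbox), assuming NT is the matrix of normalised transformed ratios and y the 0/1 default flags:

b     = glmfit(NT, y, 'binomial', 'link', 'probit');   % returns [beta_0; beta_1; ...; beta_N]
score = glmval(b, NT, 'probit');                       % fitted Phi(beta_0 + sum_i beta_i * NT_i), in (0,1)
% the final transform F from model score to empirical PD is added later, in calibration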

Training 47
Final Model Selection

When deciding on the final financial model to use, we combine a consistent framework with
our experience to make the selection process as transparent as possible. This framework
includes these considerations:

» High accuracy ratio of the overall model


» Stability across different company sizes, industries, and periods
» Positive coefficients. Negative coefficients indicate an over-shooting of another ratio
» Economic intuition should reflect the lending policy of the bank
» Model should cover most of the different categories
» Significance of the parameters should be above 95% to ensure a stable model
» Check for Multicollinearity to avoid highly collinear variables

Training 48
4 - Validation
- Calibration
- Combining Quantitative and Qualitative Scores

Training 49
Validation

The validation tests the discriminatory power, robustness and stability of the model. If a
huge amount of data is available, out-of-sample tests are the best way for testing the model
regarding these different criteria. However, with a limited amount of data, it is better to use
all data for the development of the model and conduct some tests which mimic the out-of-
sample test. In this case we will use:

» bootstrapping

» k-fold test

» VIF test

Training 50
Bootstrapping

» As the overall sample size and/or number of defaults decreases, the variability of the test statistics
increases making the results less clear.
» A common approach to sizing the variability of a particular statistic is to employ a re-sampling
technique to leverage available data and reduce the dependency on the original sample.

Bootstrapping is employed to leverage available data in an effort to reduce dependency on the original sample dataset and define confidence intervals to assess the consistency of the model.

Training 51
Bootstrapping
Primary Metrics – Kendall’s Tau

» Using the statistical package, MATLAB, the primary performance measure of interest, the Kendall's
Tau coefficient, is calculated with the same parameters.
» The in-built bootstrap function returns an N-by-B matrix of indices, where N is the number of rows in
non-scalar input arguments to the bootstrap function and B is the number of generated bootstrap
replicates.
» The bootstrap function creates each bootstrap sample by sampling with replacement.
» The function within the bootstrap is specified as the Kendall’s Tau coefficient and terminates when
the total number of bootstraps reaches 2000.
» A sufficient distribution of the Kendall's Tau coefficient is constructed, from which a number of statistics such as the mean, median and standard error, as well as confidence intervals, can be extracted.
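A sketch of this bootstrap in MATLAB (Statistics Toolbox); score and defaultFlag are column vectors and 2000 replicates are drawn with replacement, as described above:

nBoot   = 2000;
tauFun  = @(s, d) corr(s, d, 'Type', 'Kendall');          % statistic of interest
bootTau = bootstrp(nBoot, tauFun, score, defaultFlag);    % 2000 resampled Kendall's Tau values

tauMean = mean(bootTau);
tauSE   = std(bootTau);                                   % bootstrap standard error
tauCI   = prctile(bootTau, [2.5 97.5]);                   % e.g. a 95% percentile confidence interval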

Training 52
K-Fold test

The in-sample performance evaluation of a model might show a high degree of power in
distinguishing good credits from bad ones.
In order to increase our confidence in the model, we would like to know whether the model
performance is robust throughout the sample and it is not driven by a particular sub-sample
of it. A standard test for evaluating the robustness of a model is the “modified k-fold test.”
To implement this test, one divides the defaulting and non-defaulting companies into k
equally sized segments. This yields k equally sized observation sub-samples that exhibit
the identical overall default rate and are temporally and cross-sectionally independent.
Accordingly, we calculate the accuracy ratio of the sub-sample (1/k of the portfolio) and of the remaining portfolio ((k-1)/k of the portfolio).
We repeat this procedure for all possible combinations, and compare the results of the k
samples with the k remaining portfolios. This approach shows the performance of the
model on smaller “in-samples”.
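A hedged MATLAB sketch of the modified k-fold procedure, assigning defaulters and non-defaulters to k segments separately so every fold keeps the overall default rate; accuracyRatio is a hypothetical helper implementing the CAP/AR calculation sketched earlier:

k    = 5;                                                % illustrative number of folds
fold = zeros(size(defaultFlag));
for c = 0:1                                              % split each class into k equal segments
    idx       = find(defaultFlag == c);
    idx       = idx(randperm(numel(idx)));
    fold(idx) = mod((0:numel(idx)-1)', k) + 1;
end

arFold = zeros(k, 2);
for f = 1:k
    inFold      = (fold == f);
    arFold(f,1) = accuracyRatio(score(inFold),  defaultFlag(inFold));    % 1/k sub-sample
    arFold(f,2) = accuracyRatio(score(~inFold), defaultFlag(~inFold));   % remaining (k-1)/k portfolio
end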

Training 53
VIF test

Variance Inflation Factor tests assess the relationships among factors to ensure model
robustness. Excessive multicollinearity occurs if a number of the predicting variables
(factors) are highly correlated.

» The collinearity test: The diagonal elements of the inverse correlation matrix for variables
in the model are known as variance inflation factors (VIF).

– VIF = 1.0 => Uncorrelated
– VIF > 4.0 => Excessive multicollinearity (4.0 is a commonly used threshold)
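A minimal MATLAB sketch of this check, for a matrix X whose columns are the factors in a candidate model:

R         = corrcoef(X);     % correlation matrix of the model factors
vif       = diag(inv(R));    % variance inflation factors: diagonal of the inverse correlation matrix
excessive = vif > 4.0;       % factors breaching the commonly used threshold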

Training 54
Calibration

» Creates a mapping from score to PD%, such that PD’s generated by the model are
empirically accurate

– Based on calibrated PD, we can compare the risk of customers between different asset classes
and portfolios
– Quantification of credit risk enables Pricing, Limit System, Regulatory Capital (Basel II) etc.

» Creates a bucketing system that is capable of assigning obligors into risk grades

» Finally calibration maps each obligor to a bank master scale

Training 55
Calibration process

Criteria 1 - Set the distribution of obligors on grades (e.g. 2% obligors on grade 1, 4% obligors on grade 2).
Sort the obligors by score from low risk to high risk. Together with the distribution above, we can determine the score boundaries for each grade.

Criteria 2 - Is the average PD close to the CT value? (Average PD = 2% * PD1 + 4% * PD2 + …)
PD1, PD2, … are the PD midpoints for each grade of the master scale.

Criteria 3 - For each grade, calculate the upper boundary of the default number using a binomial test.
Is the real default number lower than the upper boundary? (conservative)
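A hedged MATLAB sketch of criteria 2 and 3 above, assuming gradeShare holds the target obligor distribution, pdMid the master-scale PD midpoints, nGrade the obligor counts and defGrade the observed defaults per grade; ctEstimate, tolerance and the 99% confidence level are illustrative assumptions:

avgPD = gradeShare(:)' * pdMid(:);                    % average PD = 2% * PD1 + 4% * PD2 + ...
ctOK  = abs(avgPD - ctEstimate) < tolerance;          % criterion 2: close to the CT estimate?

upperBound   = binoinv(0.99, nGrade(:), pdMid(:));    % criterion 3: binomial upper bound per grade
conservative = all(defGrade(:) <= upperBound);        % observed defaults below the upper boundary?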

Training 56
Calibration
If the three criteria have been satisfied, we can do the calibration, i.e. map score to %PD:
1. Map the score boundary to the PD boundary for each grade.
2. Apply PCHIP to get a smoothed calibration curve.

< Chart: master scale PD bounds and the calibration curve – Grade A covers the score range 75–80, with a PD bound from 0.28% (min) to 0.50% (max) >

E.g. a company with a score between 75 and 80 is assigned grade A
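A minimal MATLAB sketch of the mapping step, assuming scoreBound and pdBound hold the matching grade boundary points with scores sorted in increasing order; scoreGrid and obligorScore are illustrative inputs:

pp        = pchip(scoreBound, pdBound);     % shape-preserving piecewise cubic through the boundary points
pdCurve   = ppval(pp, scoreGrid);           % smoothed calibration curve on a grid of scores
obligorPD = ppval(pp, obligorScore);        % calibrated PD for each obligor's model score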

Training 57
Calibration

» Information required:

– The master scale used for calibration

– Central Tendency Estimate for the portfolio

– Expected grade distribution for the portfolio

Training 58
Calibration

Training 59
Combining Quantitative and Qualitative Scores

Training 60
Calculation of relative weights for each ratio

The relative weight of each individual ratio is found by comparing the impact on the model
output of an increment of one standard deviation in the value of each standardised
transformed ratio. So the relative weight of the ith individual ratio is given by
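The formula itself is not reproduced in this extract. As a labelled assumption (not taken from the deck), one common way to express such a weight is the absolute coefficient multiplied by the standard deviation of its standardised transformed ratio, normalised so the weights sum to 100%. A MATLAB sketch under that assumption, with beta as returned by glmfit and NT the matrix of normalised transformed ratios:

% ASSUMED weighting scheme, for illustration only
impact  = abs(beta(2:end)) .* std(NT, 0, 1)';   % one-std-dev move in each transformed ratio, times its coefficient
weights = 100 * impact / sum(impact);           % relative weight of the i-th ratio, in percent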

Training 61
5 - Q&A
- Next steps

Training 62
Questions?

Training 63
Next Steps…

Training 64
© 2011 ABC’s Analytics, Inc. and/or its licensors and affiliates (collectively, “ABC’S”). All rights reserved. ALL INFORMATION CONTAINED HEREIN IS PROTECTED BY
COPYRIGHT LAW AND NONE OF SUCH INFORMATION MAY BE COPIED OR OTHERWISE REPRODUCED, REPACKAGED, FURTHER TRANSMITTED, TRANSFERRED,
DISSEMINATED, REDISTRIBUTED OR RESOLD, OR STORED FOR SUBSEQUENT USE FOR ANY SUCH PURPOSE, IN WHOLE OR IN PART, IN ANY FORM OR MANNER OR
BY ANY MEANS WHATSOEVER, BY ANY PERSON WITHOUT ABC’S PRIOR WRITTEN CONSENT. All information contained herein is obtained by ABC’S from sources believed by
it to be accurate and reliable. Because of the possibility of human or mechanical error as well as other factors, however, all information contained herein is provided “AS IS” without
warranty of any kind. Under no circumstances shall ABC’S have any liability to any person or entity for (a) any loss or damage in whole or in part caused by, resulting from, or relating
to, any error (negligent or otherwise) or other circumstance or contingency within or outside the control of ABC’S or any of its directors, officers, employees or agents in connection
with the procurement, collection, compilation, analysis, interpretation, communication, publication or delivery of any such information, or (b) any direct, indirect, special,
consequential, compensatory or incidental damages whatsoever (including without limitation, lost profits), even if ABC’S is advised in advance of the possibility of such damages,
resulting from the use of or inability to use, any such information. The credit ratings, financial reporting analysis, projections, and other observations, if any, constituting part of the
information contained herein are, and must be construed solely as, statements of opinion and not statements of fact or recommendations to purchase, sell or hold any securities. NO
WARRANTY, EXPRESS OR IMPLIED, AS TO THE ACCURACY, TIMELINESS, COMPLETENESS, MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OF ANY
SUCH RATING OR OTHER OPINION OR INFORMATION IS GIVEN OR MADE BY ABC’S IN ANY FORM OR MANNER WHATSOEVER. Each rating or other opinion must be
weighed solely as one factor in any investment decision made by or on behalf of any user of the information contained herein, and each such user must accordingly make its own
study and evaluation of each security and of each issuer and guarantor of, and each provider of credit support for, each security that it may consider purchasing, holding, or selling.

Training 65
