
Application of Statistical Tools in

Empirical Research

Dr. Arindam Bandyopadhyay

National Institute of Bank Management

What is Quantitative Research?


• Quantitative research is about measurements.
• Statistics and econometrics are the most widely used branches of mathematics in quantitative research.
• Quantitative research using statistical methods typically begins with the collection of data based on a theory or hypothesis, followed by the application of descriptive or inferential statistical methods.
• Those who are likely to be successful researchers/analysts are usually attracted more by the problem-solving side of the work and the practical application of the mathematics and logic than by the mathematical/statistical concepts per se.

The Empirical Research Process

1. Interest → the topic or theme of research
2. Reading earlier research and theoretical literature
3. Specification of the research problem: research questions, framing hypotheses, conceptualization
4. Planning the research process; empirical research design
5. Selection of variables and empirical tools; sources of data
6. Data collection
7. Data filtering, coding, sorting
8. Data analysis (quantitative analysis using statistical packages)
   1. Univariate/multivariate descriptive analysis
   2. Multivariate regression analysis
   3. Diagnostic checks

9. Interpretation of results
1. Answering empirical questions
2. Explanation of results
10. Drawing conclusions and policy actions
11. Bibliographic citations
12. Finalizing the report (sequencing the charts, tables, footnotes,
abstract and text etc.)
13. Publication of results

Data
• When considering the establishment of a framework for statistical testing, it is sensible to ensure the availability of a large enough set of reliable information on which to base the test. For example, if the analyst intends to find a 'one-in-five-year' event, the best way is to have a five-year database.

Problem Solving Approach


• Data analysis: summary statistics
• Central tendency/expectations
• Dispersion/volatility
• Understanding distribution fitting
• In-depth analysis of data
• Covariance and correlation
• Basic concepts in probability; joint probability
• Discrete & continuous distributions
• Hypothesis testing
• Modeling & forecasting
• Simulation & Value at Risk (VaR) techniques
• Simple linear regression
• Multivariate regression: MDA, multiple regression, logistic regression, etc.
• Time series analysis
• Diagnostic checks or validation
Descriptive Analysis
• In descriptive analysis we are interested in describing a single issue or social phenomenon (e.g. its frequency, distribution or magnitude).
• We can simply describe (e.g. the unemployment rate, average wages, support for different parties) or describe and compare (default rates in different regions, average wages in different professions of customers, party alignment within different social strata).
• Univariate analysis deals with single variables.

Descriptive Methods

• Frequency distribution (grouped): histogram, frequency curve, cumulative distribution, etc.
• Measures of central tendency (mean, median, mode, percentiles, etc.)
• Dispersion: SD, mean deviation, CV, range, moments, skewness, kurtosis, etc.
• Forms of distribution: discrete vs. continuous
Grouped Frequency Distribution

• Grouped frequency distribution is a tabular summary of data showing the frequency of items in each of several non-overlapping classes.
• Class interval width = (largest value − smallest value) / number of classes
• Relative frequency = frequency of the class / n
• Percentage frequency = relative frequency × 100
• Cumulative frequency: less-than type or more-than type
• Graphic presentation (a small worked sketch follows below):
  • Histogram
  • Frequency polygon
  • Ogive (cumulative)
  • Lorenz curve
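As an illustration (not from the original slides), a minimal Python sketch that builds a grouped frequency table from raw data; the exposure values are hypothetical:

    import numpy as np

    # hypothetical sample of 20 loan exposures (Rs. crore)
    data = np.array([0.5, 1.2, 3.4, 2.2, 7.8, 0.9, 4.1, 5.5, 2.7, 1.8,
                     6.3, 0.3, 9.9, 2.1, 3.3, 4.8, 1.1, 8.2, 2.9, 5.0])

    k = 5                                      # chosen number of classes
    width = (data.max() - data.min()) / k      # class interval width
    edges = data.min() + width * np.arange(k + 1)
    freq, _ = np.histogram(data, bins=edges)   # frequency of each class

    rel_freq = freq / data.size                # relative frequency
    pct_freq = rel_freq * 100                  # percentage frequency
    cum_freq = np.cumsum(freq)                 # less-than type cumulative frequency

    for i in range(k):
        print(f"{edges[i]:5.2f}-{edges[i+1]:5.2f}  f={freq[i]:2d}  "
              f"rel={rel_freq[i]:.2f}  pct={pct_freq[i]:5.1f}%  cum={cum_freq[i]}")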

Frequency Distribution
• Frequency distribution tabulates and presents all the occurring values arranged in order of magnitude, together with their respective frequencies.
• An inspection of the frequency distribution gives a quick idea of the average in the series and shows how the observations vary around the average (through a histogram or frequency polygon drawn from the frequency distribution).
• Frequency = simple count of the cases with a certain variable value
• Percent = percentage of the cases with a certain variable value
• Cumulative percent = percentage of the cases with the given or a smaller value
Descriptive Stats about Zone-wise Loan Distribution of a Bank
zone_group p1 p5 p10 p25 p50 p75 p90 p95 p99 min max range mean sd cv Kurtosis Gini HHI
Central_Z_I 0.11 0.4 0.69 1 1.68 3 7.67 11.15 84.79 0.01 90.89 90.9 4.042 10.40 2.57 56.47 0.634 0.356
Central_Z_II 0.03 0.39 0.62 0.99 1.43 2.54 6.97 13.71 107.7 0.01 211.43 211.4 5.410 20.75 3.83 76.68 0.724 0.519
East_Z 0.02 0.51 0.95 1.35 2.39 10.8 30.5 55.84 260 0.01 1251 1251.0 20.703 97.27 4.70 137.38 0.792 0.598
Mumbai_Z 0.04 0.29 0.64 1.48 4.14 15 49.6 133.7 560 0.004 1204.4 1204.4 27.815 91.71 3.30 67.32 0.786 0.572
North_Z 0.11 0.5 0.82 1.2 2.22 5.49 13.4 41.45 183.3 0.01 731.03 731.0 10.620 44.61 4.20 159.31 0.739 0.519
South_Z_I 0.13 0.83 0.97 1.31 2.4 6.41 24.3 38.27 97.3 0.02 380.5 380.5 8.735 23.87 2.73 155.87 0.701 0.421
South_Z_II 0.07 0.41 0.79 1.37 3.11 9.74 29 59 272.4 0.04 400 400.0 13.249 37.77 2.85 67.84 0.720 0.442
West_Z_I 0.21 0.73 0.94 1.53 3.27 11 29.7 105.5 225.1 0.12 385.4 385.3 18.296 48.88 2.67 31.59 0.759 0.547
West_Z_II 0.22 0.69 0.83 1.23 2.34 4.93 13.7 27.16 50.32 0.07 99.54 99.5 6.108 11.55 1.89 32.66 0.619 0.299
Total 0.08 0.46 0.79 1.24 2.51 7.94 26.1 52.41 250 0.004 1251 1251.0 15.505 62.26 4.02 163.61 0.771 0.578


Quartiles and Percentiles


• Quartiles divide an ordered list into quarters.
• For example, the first quartile (Q1) is a number greater than (or equal to) the values of 25% of the cases and lower than (or equal to) the values of the remaining 75%.
• In financial risk management the quantile chosen would be 90%, 95% or 99% in most cases, since the largest losses are observed at extreme quantiles. E.g. op-risk capital from a loss distribution approach (LDA) can be quantified by determining the 100p% quantile of the simulated distribution.
• Percentiles divide ordered lists into hundredths.
• One percent of the cases lie below the first percentile (p1) and 99% lie above it. For example, the 1st quartile (Q1) equals the 25th percentile (p25).
• For example, all the cases of a real sample of employees (N=1112) can be ordered according to monthly income (in Rs.); the median is then the value of the 556th and 557th cases (or their average).
Measures of dispersion
• Range = maximum value − minimum value
• Interquartile range (IQR) = Q3 − Q1
• Standard deviation (SD), variance (SD²)
• Coefficient of variation (CV) = SD/Mean
• Skewness (Sk) = 3(Mean − Median)/SD
  = (Mean − Mode)/SD
  or = [(Q3 − Q2) − (Q2 − Q1)] / [(Q3 − Q2) + (Q2 − Q1)]
  or (3rd-moment measure) = √β1 = µ3/σ³
• Kurtosis (4th-moment measure): β2 = µ4/µ2² = µ4/σ⁴; excess kurtosis = β2 − 3
• If β2 < 3, the distribution is platykurtic (thin tails, less peakedness); if β2 > 3, the distribution is leptokurtic (fat tails, high peakedness). When β2 = 3, the distribution is mesokurtic, as for the normal distribution.
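A minimal Python sketch (illustrative, not from the slides) computing these moment-based measures for a hypothetical loss sample:

    import numpy as np

    losses = np.array([1.2, 0.8, 2.5, 1.1, 6.9, 0.4, 1.8, 3.2, 0.9, 2.2])  # hypothetical

    mean, sd = losses.mean(), losses.std()          # population SD (divides by n)
    m3 = np.mean((losses - mean) ** 3)              # 3rd central moment
    m4 = np.mean((losses - mean) ** 4)              # 4th central moment

    cv = sd / mean                                  # coefficient of variation
    skew = m3 / sd ** 3                             # sqrt(beta1)
    beta2 = m4 / sd ** 4                            # kurtosis
    excess = beta2 - 3                              # excess kurtosis

    print(f"CV={cv:.3f} skew={skew:.3f} kurtosis={beta2:.3f} excess={excess:.3f}")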

Descriptive Statistics: Mean, Variance, Skew, Kurtosis, Gini, HHI

• These are the four moments about the mean that describe the nature of the loss distribution in risk measurement.
• The mean is the location of a distribution, and the variance, the square of the standard deviation, measures the scale of a distribution.
• The skew is a measure of the asymmetry of the distribution. In risk measurement, it tells us whether the probability of winning is similar to the probability of losing, and the nature of losses.
• Negative skewness means there is a substantial probability of a big negative return. Positive skewness means there is a greater-than-normal probability of a big positive return.
• Kurtosis is useful in describing extreme events (e.g., losses so bad that they have only a 1 in 1000 chance of happening).
• In extreme events, the portfolio with the higher kurtosis would suffer worse losses than the portfolio with lower kurtosis.
• Skewness and kurtosis are called the shape parameters.
Moments and the Nature of Distribution

• For a normal distribution, skewness = 0.
• The kurtosis of the normal distribution is 3.
• The normal distribution is so commonly used (especially in credit risk) that some researchers define the "excess kurtosis" as the calculated kurtosis minus 3.
• Distributions with a kurtosis greater than that of the normal distribution are said to be leptokurtic.

Kurtosis
• Since kurtosis measures the shape of the distribution (the fatness of the tails), it focuses on how losses are spread around the mean.
• Leptokurtic means a smaller proportion of medium-sized deviations from the mean, but a larger proportion of extremely large and small deviations from the mean. Kurtosis greater than three indicates a sharp/high peak with a thin midrange and fat tails (super-Gaussian type, e.g. Pareto distribution, lognormal distribution, Weibull distribution, etc.).
• Platykurtic means a smaller-than-normal proportion of deviations from the mean that are extremely small or large, and a larger proportion of medium-sized deviations from the mean (this may happen in stock return distributions). Kurtosis of less than three indicates a low peak with a fat midrange on either side (short tails, sub-Gaussian type, e.g. Bernoulli distribution).
• A normal distribution is called mesokurtic and has a kurtosis of 3 (its tails are thin relative to those of a leptokurtic distribution).
Difference between Skewness & Kurtosis
• Skewness measures the degree and direction of symmetry or asymmetry of the distribution.
• A normal or symmetrical distribution has a skewness of zero (0). But in operational loss results, normal distributions are hard to come by.
• Therefore, a distribution may be positively skewed (skewed to the right, longer tail to the right, represented by a positive value; typical of loss series) or negatively skewed (skewed to the left, longer tail to the left, with a negative value; typical of return series).
• Kurtosis measures how peaked a distribution is, and the lightness or heaviness of the tails of the distribution. In other words, how much of the distribution is actually located in the tails?
• A positive excess kurtosis value means that the tails are heavier than a normal distribution and the distribution is said to be leptokurtic (with a higher, more acute "peak"). A negative excess kurtosis value means that the tails are lighter than a normal distribution and the distribution is said to be platykurtic (with a smaller, flatter "peak").

Measures of Moments

• A distribution is fully described by its first 4 moments:
  m1 = ΣX/n is the mean
  m2 = Σ(X − X̄)²/n is the variance
  m3 = Σ(X − X̄)³/n is the absolute measure of skewness
  m4 = Σ(X − X̄)⁴/n is the absolute measure of kurtosis
• Relative measure of skewness: Sk = m3/SD³, which typically ranges between −3 and +3 (Sk = 0 indicates a symmetric distribution)
• Relative measure of kurtosis: kr = m4/SD⁴; kr = 3 indicates a mesokurtic distribution
Herfindahl-Hirschman Index (HHI)
• The Herfindahl index is a commonly used ratio to measure concentration/inequality of a distribution.
• The Herfindahl index measures concentration as the sum of the squared business shares of each loan in the pool (or portfolio), i.e.,

  HHI = Σₙ₌₁ᴺ Eₙ² / (Σₙ₌₁ᴺ Eₙ)² = Σₙ₌₁ᴺ sₙ²

• Where E = loan exposure amount (Rs. Cr.) and s = loan share of the total. The HHI is calculated by summing the squares of the portfolio share of each contributor.
• Theoretically, a perfectly diversified portfolio of 500 equal borrowers would have HHI = 0.002. In contrast, if the bank portfolio is divided amongst five zones in the ratio 5:2:1:1:1, then the implied HHI by sector is 0.32, indicating a significant level of concentration.
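A small Python sketch (illustrative) of the HHI computation, using the slide's 5:2:1:1:1 split as the exposure vector:

    import numpy as np

    exposure = np.array([5.0, 2.0, 1.0, 1.0, 1.0])   # zone exposures (Rs. Cr.)
    shares = exposure / exposure.sum()               # s_n = E_n / sum(E)
    hhi = np.sum(shares ** 2)                        # HHI = sum of squared shares
    print(f"HHI = {hhi:.3f}")                        # 0.320 for the 5:2:1:1:1 split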

Gini Coefficient Measure of Inequality


• The Gini coefficient or Lorenz ratio is a standard measure of inequality or concentration of a group distribution.
• It is defined as a ratio with values between 0 and 1. A low Gini coefficient indicates a more equal distribution of income or loan assets across different industries/groups, sectors, etc., while a high Gini coefficient indicates a more unequal distribution.
• For a portfolio of N loans with exposure shares s1, s2, …, sN (sorted in ascending order), the empirical Gini coefficient is defined as

  G(s1, s2, …, sN) = [Σₙ₌₁ᴺ (2n − 1) sₙ] / N − 1

• Equivalently, from the Lorenz curve, the Gini coefficient is G = 1 − Σᵢ pᵢ(zᵢ + zᵢ₋₁), where pᵢ is the probability or frequency share of borrowers in class i and zᵢ is the cumulative loan share.
• A Gini coefficient close to zero (the 45-degree diagonal line: no inequality) corresponds to a well-diversified portfolio, and a value close to one corresponds to a highly concentrated portfolio.
• A Gini coefficient of about 0.3 or less indicates substantial equality; values between 0.3 and 0.4 are generally regarded as acceptable.
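A minimal Python sketch (illustrative) of the empirical Gini formula above, using a hypothetical exposure vector:

    import numpy as np

    exposure = np.array([5.0, 2.0, 1.0, 1.0, 1.0])   # hypothetical exposures
    s = np.sort(exposure / exposure.sum())           # shares, ascending order
    n = np.arange(1, s.size + 1)                     # ranks 1..N
    gini = np.sum((2 * n - 1) * s) / s.size - 1      # G = sum((2n-1)s_n)/N - 1
    print(f"Gini = {gini:.3f}")                      # ~0.36 for this split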
Geographic Loan Concentration: Gini Coefficient Approach

Zone-wise Inequality Comparison in Loan Distribution

[Chart: Lorenz curves for each zone (Central_Z_I, Central_Z_II, East_Z, Mumbai_Z, North_Z, South_Z_I, South_Z_II, West_Z_I, West_Z_II); x-axis: cumulative % of borrowers, y-axis: cumulative % of loan share.]

We have used deciles to slice the zonal loan exposure distribution of a large bank.

Basic concepts in Probability

• Probability is a numerical measure of the likelihood that an event will occur out of all possible outcomes of an experiment.
• The sample space is the set of all experimental outcomes.
• The probability of an event is greater than or equal to 0 and less than or equal to 1.
• The probability of the entire sample space is 1.
• Probabilities under conditions of:
  • Mutual exclusion
  • Mutual non-exclusion
Probability Axioms

• Marginal probability
  • P(A) = relative frequency of occurrence
• Addition law: probability of either of 2 events occurring
  • P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
• Joint probability: probability of 2 events both occurring
  • P(A ∩ B) = P(A) × P(B) if they are independent
• Conditional probability: probability of an event given that another event has occurred
  • P(A/B) = P(A ∩ B)/P(B)

Few examples

• Tossing an unbiased coin: outcomes H, T; r = 1, s = 1
  • P(A) = s/(s + r), P(B) = r/(s + r)
  • P(A) + P(B) = 1
• Tossing 2 unbiased coins: outcomes TT, TH, HT, HH; r = 2, s = 2
  • P(A = one H & one T, a composite event) = 2/4
  • Prob(both heads): P(A) = 1/4
  • Prob(at least one head) = 3/4
• Similarly, when a die is thrown, there are six possible outcomes: 1, 2, …, 6.
• Find Prob(die giving an even number)
Few examples

• What is the probability that either of two coins gives heads?
  • 1/2 + 1/2 − (1/2 × 1/2) = 3/4 or 0.75
• The probabilities of default of A and B are 0.25% and 0.5% respectively, and the probability that both default is 0.35%. What is the probability that A or B might default?
  • 0.25% + 0.5% − 0.35% = 0.40%, where the two events are not independent.
• What would the probability be if they were independent?

Drawing without replacement


• A loan portfolio contains 8 solvent accounts and 5 defaulted accounts. Two successive draws of 3 accounts are made without replacement. Find the probability that the first drawing gives 3 defaulted and the second 3 solvent facilities.
• Solution: Let A denote the event that the first drawing gives 3 defaulted loans and B the event that the second drawing gives 3 solvent loans. We have to find Prob(A ∩ B).
• By the multiplication theorem of probability, P(A ∩ B) = P(A) × P(B/A)
• P(A) = [8C0 × 5C3]/13C3 = [1 × 10]/286 = 5/143 ≈ 3.50%
• Next, we find P(B/A) = [8C3 × 2C0]/10C3 = [56 × 1]/120 = 7/15 ≈ 46.67%
• Hence, the required probability is P(AB) = (5/143) × (7/15) = 7/429 ≈ 1.63%
• This concept has major applications in op-risk and credit risk portfolio modeling exercises.
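The same calculation can be checked with a few lines of Python (a sketch using the standard-library math.comb):

    from math import comb

    # 8 solvent + 5 defaulted accounts; draw 3, then 3 more, without replacement
    p_a = comb(5, 3) / comb(13, 3)          # first draw: all 3 defaulted
    p_b_given_a = comb(8, 3) / comb(10, 3)  # second draw: all 3 solvent from the 10 left
    p_ab = p_a * p_b_given_a                # multiplication rule
    print(f"P(A)={p_a:.4f}  P(B|A)={p_b_given_a:.4f}  P(A and B)={p_ab:.4f}")  # ~0.0163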
Conditional Probability (without replacement): Example 2

• A box contains five yellow balls and two green balls. What is the probability that three balls randomly taken from the box (without replacement) will all be yellow?
• A = first ball is yellow
• B = second ball is yellow
• C = third ball is yellow
• P(A ∩ B ∩ C) = P(A) P(B/A) P(C/A ∩ B)
• P(A) = 5/7, i.e. 5 yellow balls in a box of 7
• P(B/A) = 4/6, i.e. 4 yellow balls left in a box of 6
• P(C/A ∩ B) = 3/5, i.e. 3 yellow balls left in a box of 5
• Thus: P(A ∩ B ∩ C) = (5 × 4 × 3)/(7 × 6 × 5) = 2/7

Conditional Probability (with replacement): Example

• In a certain repeated experiment, the probability of occurrence of an event is p, and consequently the probability of non-occurrence is 1 − p = q.
• In n repeated trials of the experiment, the probability of occurrence of the event r times is:
• P(r) = nCr pʳ qⁿ⁻ʳ (the Bernoulli or binomial distribution)
• Ex. If 4% of a loan pool are NPAs, determine the probability that out of 4 borrowers chosen at random at most 2 will be defaulting.
• Hint: P(r ≤ 2) = P(r = 0) + P(r = 1) + P(r = 2), where n = 4, p = 4% and q = 96%.
• This concept has major applications in risk loss simulation exercises.
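A quick check of the hint in Python (illustrative; uses scipy.stats.binom):

    from scipy.stats import binom

    n, p = 4, 0.04                      # 4 borrowers, 4% default probability
    p_at_most_2 = binom.cdf(2, n, p)    # P(r <= 2) = P(0) + P(1) + P(2)
    print(f"P(at most 2 defaults) = {p_at_most_2:.5f}")  # ~0.99975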
Conditional Probability: Example1

• From two sets of portfolios A and B, with shares yielding profit as well as loss, what is the probability of picking a profit-yielding share from portfolio A?
• What is the probability of picking a loss-making share given that the share is from portfolio A?

Conditional Probability: Ex2


• Probability of transaction errors = 0.53
• Probability of system failure = 0.50
• Prob(both fail) = 0.27
• Therefore, Prob(Trans_error | System fail) = 0.27/0.50 = 0.54
Conditional Probability: Bayes’ Theorem
• The conditional probability P(Bi/A) of a specified event Bi, when A is stated to have actually occurred, is given by:

  P(Bi/A) = P(Bi) × P(A/Bi) / Σᵢ₌₁ⁿ P(Bi) × P(A/Bi)

• Ex. In a bolt factory, machines A, B and C manufacture respectively 25%, 35% and 40% of the total output. Of their outputs 5, 4 and 2 per cent respectively are defective bolts. A bolt is drawn at random from the product and is found to be defective. What are the respective probabilities that it was manufactured by machine A, B, C?
• Practical application: In a credit scoring model (say a z-score model), once a randomly selected borrower obtains a z-score from its financial ratios, the above theorem helps us to classify him/her with that score. That is, Bayes' theorem tells us into which group the borrower will fall, and with what probability (i.e. whether the customer is of the defaulting or solvent type).
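The bolt-factory example worked out in a short Python sketch (illustrative):

    # prior shares of production and conditional defect rates
    prior = {"A": 0.25, "B": 0.35, "C": 0.40}
    p_def = {"A": 0.05, "B": 0.04, "C": 0.02}

    total_defective = sum(prior[m] * p_def[m] for m in prior)   # P(defective)
    posterior = {m: prior[m] * p_def[m] / total_defective for m in prior}
    print(posterior)  # A: ~0.362, B: ~0.406, C: ~0.232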

Probability Distribution
• In reality, there are an infinite number of possible outcomes for the asset value. We represent the distribution of these possible outcomes with a probability density function (which is linked to the histogram).
• The next figure shows a typical probability density function for credit losses. Along the x-axis is the value of the assets. The height of the function on the y-axis gives the probability of any given loss occurring.
• Higher uncertainty in the asset value increases the probability of defaulting on the debt (for a bond issuer/bank).
Results of 10 Credit-Loss Scenarios

Scenario   Asset Value
1          96.5
2          98.4
3          100.6
4          101.7
5          102.3
6          103.2
7          103.9
8          104.4
9          104.7
10         105.2

Results of 10 possible scenarios for asset values at the end of one year.

Number of Occurrences in Each Range

Range     Occurrences per bin
96-98     1
98-100    1
100-102   2
102-104   3
104-106   3

Table showing the number of results that fall in each range/bin of possible asset values.
Histogram of 10 Credit-Loss Scenarios

[Histogram of Asset_Value, 10 observations. Summary statistics: Mean 102.09, Median 102.75, Maximum 105.2, Minimum 96.5, Std. Dev. 2.8599, Skewness −0.8051, Kurtosis 2.4869, Jarque-Bera 1.1901 (Probability 0.5515).]

Histogram of Credit Loss Scenarios


• The histogram gives us a crude indication of the probability distribution of the asset value. For example, it shows us that there is a 20% chance that the asset value will be less than Rs. 100 (face value).
Probability Density for the Credit-Loss Example

[Probability density plot of Asset_value for the credit-loss example, over the range 96 to 106.]

Cumulative Probabilities
• While the probability density tells us the probability of a variable falling in a given range, the cumulative probability gives the probability of the random variable falling below a given number.
• The cumulative probability can be estimated by multiplying the probability density by the bin width to get probabilities for each bin, and then summing the probabilities for all values less than or equal to the given number (less-than type ogive) or greater than or equal to it (more-than type ogive).
Cumulative Probability for the Credit-Loss Example (less-than type ogive)

[Cumulative probability (less-than type ogive) of Asset_value, rising from 0 to 1 over the range 96 to 106.]

Measure of Relative Location

• Z-scores or standardised values give the number of standard deviations xᵢ lies from the mean:
  zᵢ = (xᵢ − x̄)/sd
• Chebyshev's theorem enables us to estimate the proportion of data values that must lie within a specified number of SDs of the mean:
• At least (1 − 1/z²) of the data values must lie within z SDs of the mean, where z is greater than 1.
Normal Probability Distribution

• Normal probability distribution: a continuous probability distribution whose probability density function is bell-shaped and determined by its mean µ and SD σ.
• The normal pdf is a good model for a continuous random variable whose value depends on a number of factors, each exerting a comparatively small influence.
• The normal pdf is symmetric around the mean/median/mode.
• The probability of obtaining a value far from the mean becomes progressively smaller.
• 68.26%, 95%, 95.44%, 99% and 99.73% of the area is covered by 1, 1.96, 2, 2.58 and 3 SDs respectively.

Normal Distribution
If we measured a randomly distributed characteristic very accurately in a very large sample of cases, we would obtain a frequency distribution that is symmetric and in which most cases cluster around the mean.
Standard Normal Distribution
• Standard normal distribution: a normal probability distribution with mean 0 and SD 1.
• Normal distributions differ from one another in terms of mean and SD.
• Comparison of 2 normal distributions is possible through standardization.
• A new variable Z may be created from a normal distribution, with mean = 0 and SD = 1, where
  Z = (Xᵢ − X̄)/SD
• The standard normal distribution can be used to compute various confidence intervals of probable price/loss/return ranges.
• Most VaR models used in calculating economic capital assume the loss distribution follows a standard normal distribution. Many statistical credit scoring models also assume the error term follows a standard normal distribution.

Examples

• Suppose the daily change in price of a security follows the normal distribution with a mean of 70 bps and a variance of 9. What is the probability that on any given day the change in price is greater than 75 bps?
  • Z = (75 − 70)/3 = 1.67
  • P(X > 75) = P(Z > 1.67)
  • = 1 − P(Z < 1.67) = 1 − 0.9525 = 0.0475
• Now estimate:
  • The probability of the change in price being 75 bps or less
  • The probability of the change in price being between 65 and 75 bps
  • The probability of the change in price being less than or equal to 60 bps
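These probabilities can be verified with scipy.stats.norm (a sketch; the follow-up exercises are included as comments):

    from scipy.stats import norm

    mu, sigma = 70, 3                      # mean 70 bps, variance 9 => SD 3
    print(1 - norm.cdf(75, mu, sigma))     # P(X > 75)        ~0.0478
    print(norm.cdf(75, mu, sigma))         # P(X <= 75)       ~0.9522
    print(norm.cdf(75, mu, sigma) - norm.cdf(65, mu, sigma))  # P(65<X<75) ~0.9044
    print(norm.cdf(60, mu, sigma))         # P(X <= 60)       ~0.0004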
Confidence Intervals for Standard Normal Distribution

• A normal distribution with mean 0 and standard deviation 1 is called a standard normal distribution.
• In risk management, confidence levels are often more useful than confidence intervals, because we are usually concerned with the downside risk or worst-case level (tail risk).
• The confidence level is a single number that will not be exceeded, with a given probability (%).
• For example, there is only a 5% chance that a variable drawn from a standard normal distribution will have a value greater than 1.64.
• We can therefore say that the 95% confidence level for this variable is 1.64. The inverse of the confidence level, α, corresponds to the tail percentile.

Confidence Interval…Example

• Suppose the mean operational loss X̄ = $434,045 and we set the confidence multiplier α = 5%, so that we have a (1 − α) = 95% confidence interval around the estimate of the mean. Such an interval can be calculated using:

  X̄ ± z_α × Stdev(X̄)

• Stdev(X̄), the standard deviation of X̄, is $73,812, and z is the standard normal variable for α = 5%. Using the Normsinv() function, we see that Normsinv(0.95) = 1.64 (or see the standard normal table). Therefore, we can set z = 1.64 and calculate the 95% confidence interval as $312,635 to $555,455.
• In this case, the OR manager may feel comfortable stating the average OR loss as $434,045, with 95% confidence that the actual (population) value lies somewhere close to this value, say, between $312,635 and $555,455.
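The same interval in Python (a sketch; scipy.stats.norm.ppf plays the role of Excel's Normsinv):

    from scipy.stats import norm

    mean, stdev, alpha = 434_045, 73_812, 0.05
    z = norm.ppf(1 - alpha)                    # ~1.6449
    lo, hi = mean - z * stdev, mean + z * stdev
    print(f"z={z:.4f}  95% CI: ${lo:,.0f} to ${hi:,.0f}")  # matches the slide's figures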
Confidence Interval for Calculating Average Defaults

• A bank finds that the survival life of its defaulted housing loan facilities is approximately normally distributed, with mean equal to 600 days and a standard deviation of 40 days. Find the probability that a random sample of 16 defaulted loans will have an average life of less than 550 days.
• Here, the mean is 600 and the standard error is SD = 40/√16 = 10. The desired probability is given by the area of the shaded region in the figure (not reproduced here).
• Corresponding to 550, we find Z = (550 − 600)/10 = −5 and therefore Pr(X̄ < 550) = Pr(Z < −5) = normsdist(−5) = 0.00003%.

Confidence Interval & Precision of Prediction

• A confidence interval is an interval constructed from a sample which includes the parameter being estimated with a specified probability, known as the confidence level.
• If a risk indicator, for example, were sampled on many occasions and the confidence interval calculated each time, then 100×(1 − α)% of such intervals would cover the true population parameter being estimated. Therefore, the width of the confidence interval measures how uncertain we are about the unknown population parameter.
• A very narrow interval indicates less uncertainty (or a lower error rate) about the value of the population parameter than a very wide interval.
• It is important to note that since a C.I. is a function of a sample, it is itself a random variable and will therefore vary from sample to sample.
48
Example: Credit Risk: Bond Default Rates over 19 Years

Year   Bond Default Rate (bp)
1982   125
1983   68
1984   84
1985   99
1986   175
1987   93
1988   146
1989   151
1990   256
1991   297
1992   121
1993   47
1994   52
1995   91
1996   43
1997   52
1998   116
1999   198
2000   212

Source: S&P's Credit Week, Jan 31, 2001.

Histogram of Bond Default Losses

[Histogram of Loss_Rate_bsp, 19 observations. Summary statistics: Mean 127.6842, Median 116, Maximum 297, Minimum 43, Std. Dev. 72.4462, Skewness 0.8446, Kurtosis 2.8805, Jarque-Bera 2.2702 (Probability 0.3214).]
Descriptive Statistics of Credit Loss

[Histogram of Asset_Value, 10 observations, as shown earlier. Summary statistics: Mean 102.09, Median 102.75, Maximum 105.2, Minimum 96.5, Std. Dev. 2.8599, Skewness −0.8051, Kurtosis 2.4869, Jarque-Bera 1.1901 (Probability 0.5515).]

Bank's Loan Loss Distribution

[Histogram of HIST_LGD, sample 1-829, 829 observations. Summary statistics: Mean 0.7519, Median 0.9372, Maximum 1.0000, Minimum 0.0000, Std. Dev. 0.3232, Skewness −1.1604, Kurtosis 3.0635, Jarque-Bera 186.1932 (Probability 0.0000).]
Fitting Beta Distribution to Loan Loss

[Fitted BetaGeneral(0.35405, 0.15230, 0, 1) density overlaid on the loan-loss data; 90% of the fitted mass lies between 0.005 and 1.000.]

Fit-test vs input data: Mean 0.6992 vs 0.7519; Median 0.9308 vs 0.9372; Std. Dev. 0.3737 vs 0.3232; Variance 0.1396 vs 0.1044; Skewness −0.8509 vs −1.1604; Kurtosis 2.0652 vs 3.0635.

Fitted Loss Distribution through Simulation

[Fitted BetaGeneral(0.34846, 0.16739, 0, 1) density from simulation; 90% of the fitted mass lies between 0.004 and 1.000.]

Fitted vs input: Mean 0.6755 vs 0.6994; Median 0.8974 vs 0.93; Std. Dev. 0.3803 vs 0.3740; Variance 0.1446 vs 0.1397; Skewness −0.7338 vs −0.8509; Kurtosis 1.8714 vs 2.0657.
Market Risk Example: Histogram of Daily Returns for S&P CNX NIFTY over a 5-Year Period

[Histogram of SNP_RETURN, sample 1-1275, 1275 observations. Summary statistics: Mean 0.001205, Median 0.002188, Maximum 0.079691, Minimum −0.130539, Std. Dev. 0.014263, Skewness −1.0885, Kurtosis 11.3511, Jarque-Bera 3956.755 (Probability 0.0000).]

Hypothesis Testing
• Testing of hypotheses is one of the main objectives of sampling theory. Hypothesis tests address the uncertainty of the sample estimate.
• When we have to make a decision about the entire population based on sample data, hypothesis tests help us arrive at that decision.
• A test attempts to refute a specific claim about a population parameter based on the sample data.
• The process which enables us to decide, on the basis of sample results, whether a hypothesis is true or not is called a test of hypothesis or test of significance.
Hypothesis Testing Procedure
• All hypothesis tests are conducted the same way. The researcher states a hypothesis to be tested, formulates an analysis plan, analyzes sample data according to the plan, and accepts or rejects the null hypothesis based on the results of the analysis.
• State the hypotheses. Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive: if one is true, the other must be false, and vice versa.
• Formulate an analysis plan. The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.
  • Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10, but any value between 0 and 1 can be used.

One-Tailed vs. Two-Tailed Hypothesis Testing

• One-tailed test
  • A test of a statistical hypothesis where the region of rejection is on only one side of the sampling distribution is called a one-tailed test. In such tests, we are only interested in values greater (or less) than the null. A one-sided hypothesis test is as follows:
  • Test H0: k = 0 against HA: k > 0 (or k < 0), and we reject the null if Tcomp > Tcritical (or Tcomp < −Tcritical).
• Two-tailed test
  • A test of a statistical hypothesis where the region of rejection is on both sides of the sampling distribution is called a two-tailed test. In such tests, we are interested in values both greater and smaller than the null hypothesis.
  • We write this as: Test H0: k = 0 against HA: k ≠ 0, and we reject the null if |Tcomp| > Tcritical.
  • In the two-sided hypothesis test, we calculate the critical value using α/2. For example, with α = 5%, the critical value of the test statistic is T0.025.
Problem 1: Two-Tailed Test

• An inventor has developed a new, energy-efficient lawn mower engine. He claims that the engine will run continuously for 5 hours (300 minutes) on a single gallon of regular gasoline. Suppose a simple random sample of 50 engines is tested. The engines run for an average of 295 minutes, with a standard deviation of 20 minutes. Test the null hypothesis that the mean run time is 300 minutes against the alternative hypothesis that the mean run time is not 300 minutes. Use a 0.05 level of significance. (Assume that run times for the population of engines are normally distributed.)
• Null hypothesis: µ = 300
  Alternative hypothesis: µ ≠ 300
• Note that the null hypothesis will be rejected if the sample mean is too big or too small.

Solution 1: Two-Tailed Test

• Analyze sample data. Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).
• SE = s / sqrt(n) = 20 / sqrt(50) = 20/7.07 = 2.83
  DF = n − 1 = 50 − 1 = 49
  t = (x̄ − µ) / SE = (295 − 300)/2.83 = −1.77
• where s is the standard deviation of the sample, x̄ is the sample mean, µ is the hypothesized population mean, and n is the sample size.
• Since we have a two-tailed test, the P-value is the probability that a t-score having 49 degrees of freedom is less than −1.77 or greater than 1.77.
• We use the t-distribution calculator to find P(t < −1.77) = 0.04 and P(t > 1.77) = 0.04. Thus, the P-value = 0.04 + 0.04 = 0.08.
• Interpret results. Since the P-value (0.08) is greater than the significance level (0.05), we cannot reject the null hypothesis.
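The same test in Python (a sketch using scipy.stats, computed from the summary statistics rather than raw data):

    import numpy as np
    from scipy.stats import t

    n, xbar, s, mu0 = 50, 295, 20, 300
    se = s / np.sqrt(n)                           # standard error
    t_stat = (xbar - mu0) / se                    # ~ -1.77
    p_value = 2 * t.cdf(-abs(t_stat), df=n - 1)   # two-tailed P-value
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}") # p ~ 0.08 > 0.05: fail to reject H0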
Problem2: One-tailed test
„ Bon Air Elementary School has 300 students. The principal of
the school thinks that the average IQ of students at Bon Air is
at least 110. To prove her point, she administers an IQ test to
20 randomly selected students. Among the sampled students,
the average IQ is 108 with a standard deviation of 10. Based on
these results, should the principal accept or reject her original
hypothesis? Assume a significance level of 0.01.
„ Null hypothesis: µ = 110
Alternative hypothesis: µ < 110
„ Note that these hypotheses constitute a one-tailed test. The null
hypothesis will be rejected if the sample mean is too small.

61

Solution 2: One-Tailed Test

• Analyze sample data. Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).
• SE = s / sqrt(n) = 10 / sqrt(20) = 10/4.472 = 2.236
  DF = n − 1 = 20 − 1 = 19
  t = (x̄ − µ) / SE = (108 − 110)/2.236 = −0.894
• where s is the standard deviation of the sample, x̄ is the sample mean, µ is the hypothesized population mean, and n is the sample size.
• Since we have a one-tailed test, the P-value is the probability that a t-score having 19 degrees of freedom is less than −0.894.
• We use the t-distribution calculator to find P(t < −0.894) = 0.19. Thus, the P-value is 0.19.
• Interpret results. Since the P-value (0.19) is greater than the significance level (0.01), we cannot reject the null hypothesis.
Hypothesis Testing: Bond Loss Example 1
Hypothesis Testing for LOSS_RATE_BSP
Date: 10/24/07  Time: 12:50
Sample: 1 19; Included observations: 19

Test of Hypothesis: Mean = 128.0000
Assuming Std. Dev. = 72.44619
Sample Mean = 127.6842
Sample Std. Dev. = 72.44619

Method        Value     Probability
Z-statistic   -0.019    0.9848
t-statistic   -0.019    0.9850

Example 2: Bond Loss

Hypothesis Testing for LOSS_RATE_BSP
Date: 10/24/07  Time: 13:03
Sample: 1 19; Included observations: 19

Test of Hypothesis: Mean = 80.00000
Sample Mean = 127.6842
Sample Std. Dev. = 72.44619

Method        Value      Probability
t-statistic   2.869035   0.0102
Parametric-Mean Difference Test
• Many problems arise where we wish to test hypotheses about the means of two different populations (e.g. comparing ratios of defaulted and solvent firms, or comparing the performance of public sector banks vis-à-vis private banks, etc.).
• Unpaired test: start by assuming H0 (equal means) is true and use the two-sample test statistic to arrive at a decision; in its standard form, t = (X̄1 − X̄2)/√(s1²/n1 + s2²/n2).
• A low p-value (<0.05) will reject the null, and a high p-value (>0.10) will fail to reject the null.

Ex: Difference between Solvent & Defaulted Groups of Borrowers

Variable name                 Solvent mean   Defaulted mean   Difference (t-test)$
PROPERTY AREA (SQ. METER)     101.67         65.99            35.68** (23.81)
GROSS MONTHLY INCOME (RS.)    20,443.30      9,711.90         10,731.40** (28.56)
AGE_BORR                      43             45               -1.79** (-12.42)
NO_DEPEND                     1.445          1.744            -0.2988** (-12.57)
LN_ASSTVAL                    12.75          11.95            0.798** (22.15)
SECVAL_LOANAMT                1.65           1.50             0.15** (10.05)
NO_CO_BORR                    0.48           0.31             0.174** (18.70)
COBOR_MINC                    3,061.04       1,024.64         2,036.4** (12.30)
ORGNL_TERMM                   173.26         176.4            -3.14** (-3.8)
No. of observations           7,321          6,166

$ The last column shows the mean difference, with t-statistics in parentheses.
Errors of Testing
• There are two kinds of errors that can be made in significance testing: (1) a true null hypothesis can be incorrectly rejected, and (2) a false null hypothesis can fail to be rejected.
• The former error is called a Type I error and the latter a Type II error.

                          True state of the null hypothesis
Statistical decision      H0 true          H0 false
Reject H0                 Type I error     Correct
Do not reject H0          Correct          Type II error

• The probability of a Type I error is designated by the Greek letter alpha (α) and is called the Type I error rate; the probability of a Type II error (the Type II error rate) is designated by the Greek letter beta (β).

Relationship Between Alpha, Beta and Power

[Figure illustrating the relationship between α, β and the power of a test.]
Example: Classification Power of a Statistical Scoring Model

Table: Classification power of the Logistic Model 1 for the holdout sample of the years 2003 & 2004

                     Predicted group
Original group       Defaulted     Solvent      Total
Defaulted            47 (94%)      3 (6%)       50 (100%)
Solvent              8 (16%)       42 (84%)     50 (100%)

Note: figures in parentheses denote percentages.

Testing The Power of Credit Risk Models

                                                          % correct classification (within sample)
Model                                                     Good       Bad
Altman Z-Score 1968, reworked with Indian data            84.00%     82.00%
Emerging Market Z-Score 1995, reworked with Indian data   88.20%     75.90%
NIBM Z-Score 2005, developed on Indian data               85.20%     91.00%
Calibrating & Benchmarking A Model

• Take a look at the two graphs (not reproduced here) showing the score-wise distribution of bankrupt and non-bankrupt categories of borrowers.
• The first graph has substantial overlap of observations, making it difficult to predict failure for a large number of borrowers, while the second graph has much less overlapping area between the two categories.

Types of Probability Distributions

• Discrete (for event/frequency prediction) & continuous (for losses/severity)
• Discrete: binomial, Poisson, Bernoulli, negative binomial, …
• Continuous: normal, beta (credit risk, market risk), t, chi-squared, exponential, Weibull (extreme distribution, thick tail), …
Popular Discrete Distributions: Rule of
Thumb for Identifying Them
• Binomial distribution, Poisson distribution and negative binomial distribution
• A useful rule of thumb for choosing between these popular distributions is:

  Binomial: variance < arithmetic mean
  Poisson: variance = arithmetic mean
  Negative binomial: variance > arithmetic mean

• Thus, if we observe that our sample variance is much larger than the sample mean, the negative binomial distribution may be an appropriate choice.

Frequency Distributions

Number of frauds per month (Jan-Aug): 95, 82, 114, 74, 79, 160, 110, 115

Poisson distribution fitted with λ = 102; cumulative distribution function:

  F(x) = Σₖ₌₀ˣ e^(−λ) λᵏ / k!

Estimated quantiles of the fitted Poisson: 91% ≈ 115, 95% ≈ 118, 99% ≈ 126.

[Charts: Poisson PDF and CDF for λ = 102.]

Other popular distributions to estimate frequency are the geometric, negative binomial, binomial, Weibull, etc.
Binomial Distribution

N = 12, p = 0.8

  f(x) = N! / [x!(N − x)!] × pˣ (1 − p)^(N−x)

Mean = Np, and standard deviation: σ = √(Np(1 − p))

[Chart: binomial probabilities against the number of events.]

The parameter p can be estimated by p̂ = x/N.

Summary of Frequency of Loss: Daily Data for Credit Card Fraud

No. of events per day (i)   Observed frauds (ni)   i × ni
0                           19                     0
1                           16                     16
2                           51                     102
3                           9                      27
4                           6                      24
5                           5                      25
6                           4                      24
7                           6                      42
8                           2                      16
9                           1                      9
10                          0                      0
11                          0                      0
12                          2                      24
13                          1                      13
14                          0                      0
15                          2                      30
Total                       124                    352

Poisson distribution: F(x) = Σₖ₌₀ˣ e^(−λ) λᵏ / k!, where e = 2.71828… and x = 0, 1, 2, …

Here, mean (lambda) λ = Σ(i × ni)/Σni = 352/124 = 2.84, and SD = √2.84 = 1.685.
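A short Python sketch (illustrative) estimating λ from the observed frequency table and producing fitted Poisson probabilities:

    import numpy as np
    from scipy.stats import poisson

    events = np.arange(16)                            # 0..15 events per day
    observed = np.array([19, 16, 51, 9, 6, 5, 4, 6, 2, 1, 0, 0, 2, 1, 0, 2])

    lam = (events * observed).sum() / observed.sum()  # 352/124 = 2.84
    fitted = poisson.pmf(events, lam)                 # fitted probabilities
    print(f"lambda = {lam:.2f}, SD = {np.sqrt(lam):.3f}")
    for i, (o, f) in enumerate(zip(observed, fitted)):
        print(f"{i:2d}  observed={o:2d}  fitted={f:6.2%}")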
Distribution of Credit Card Fraud Events

[Bar chart: observed frauds (y-axis, 0-60) against the number of events per day (x-axis, 0-15).]

Fitted Poisson Values for Credit Card Frauds (λ = 2.84)

No. of events   Fitted probability
0               5.84%
1               16.59%
2               23.56%
3               22.31%
4               15.84%
5               9.00%
6               4.26%
7               1.73%
8               0.61%
9               0.19%
10              0.05%
11              0.01%
12 and above    ~0.00%

[Chart: fitted Poisson probabilities against the number of events.]
Chi-Sq. Goodness of Fit Test
• The risk manager should run a fit test to confirm the right selection of distribution.
• One such test is the chi-squared goodness-of-fit test. The test statistic is calculated by dividing the data into n bins (or ranges) and is defined as:

  T̃ = Σᵢ₌₁ⁿ (Oᵢ − Eᵢ)² / Eᵢ

• H0: the data follow a specified distribution (here, Poisson)
• Ha: the data do not follow the specified distribution
• Where Oᵢ is the observed number of events, Eᵢ is the expected (fitted) number of events, and n is the number of categories.
• d.f. = n − k − 1, where k refers to the number of parameters that need to be estimated.

Example: Key Personnel Risk


No. of Back Office Staff Leaving per Month

Number leaving per month (i)   Observed (ni)   i × ni
0                              18              0
1                              20              20
2                              21              42
3                              11              33
4                              4               16
5                              1               5
Total                          75              116

Mean (lambda) = 116/75 ≈ 1.55

• We fit a Poisson distribution to the above data. The parameter λ is estimated at 1.55 (one can check).
• This tells us that there has been a constant turnover in staff of between one and two people per month.
Actual vs Poisson-Fitted Values & Back Office Turnover Risk

[Chart: observed frequencies vs Poisson-fitted frequencies of staff leaving per month (0-5).]

The Poisson distribution appears visually to fit the data fairly well.

Chi-Squared Goodness-of-Fit Result

Observed & expected number of back office staff leaving per month

Numbers leaving per month (i)   Observed (O)   Fitted prob.   Expected (E)   (O−E)²/E
0                               18             21.23%         16             0.2721
1                               20             32.90%         25             0.8852
2                               21             25.50%         19             0.1844
3                               11             13.17%         10             0.1270
4                               4              5.11%          4              0.0077
5                               1              1.58%          1              0.0293

Chi-squared test statistic T̃ = 1.51; critical value at 5% with 5 d.f. = 11.07.

Since the chi-squared test statistic T̃ = 1.51 is less than the critical value of 11.07 at 5 percent significance with 5 degrees of freedom (n − 1 = 6 − 1 = 5), we fail to reject the null hypothesis and conclude that there is no evidence that the observed distribution differs significantly from the expected (Poisson) distribution. [In Excel, use the CHIINV(p, df) formula to obtain the critical value.] Hence, the Poisson distribution fits the data fairly well.
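The test can be reproduced in Python (a sketch; the fitted probabilities are rescaled to the observed total so that observed and expected counts match):

    import numpy as np
    from scipy.stats import poisson, chi2

    observed = np.array([18, 20, 21, 11, 4, 1])
    lam = 1.55
    fitted_prob = poisson.pmf(np.arange(6), lam)
    expected = fitted_prob / fitted_prob.sum() * observed.sum()  # rescale to n=75

    t_stat = np.sum((observed - expected) ** 2 / expected)       # chi-squared statistic
    crit = chi2.ppf(0.95, df=5)                                  # ~11.07
    print(f"T = {t_stat:.2f}, critical value = {crit:.2f}")      # T < crit: fail to reject H0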
Testing the Fitness of Continuous
Distributions

• Jarque-Bera statistic: tests the normality of a distribution
• Kolmogorov-Smirnov test: identifies the fat tails
• Anderson-Darling test: best fits extreme distributions
• Schwarz criterion
• Akaike information criterion
• Graphical: quantile-quantile (Q-Q) plots, probability-probability (P-P) plots

Confidence Interval for Loss Prediction

• Using Chebyshev's theorem we can say that at least 3/4 (or 75%) of loss events (credit card frauds) fall in the interval λ ± 2σ.
• For credit card events (in our example, credit card fraud), 75% of the observations will lie between 0 and 6, since the interval is λ ± 2σ = 2.84 ± 2 × 1.685.
• Similarly, in the case of key personnel risk, 75% of the observations will lie between 0 and 3.
• A similar concept applies in choosing a sample from the population for making statistical inference.
Kolmogorov-Smirnov Test (K-S)
• The Kolmogorov-Smirnov goodness-of-fit test checks whether a set of data comes from a hypothesized continuous distribution.
• It tends to be more sensitive near the center of the distribution than at the tails.
• H0: the data follow the specified distribution. Ha: the data do not follow the specified distribution.
• Test statistic (in its standard form): D = max over i of |F(Yᵢ) − i/N|,
  • where F(Y) is the theoretical fitted distribution, and
  • i/N is the empirical (actual) data distribution.
• The hypothesis regarding the distributional form is rejected if the test statistic, D, is greater than the critical value obtained from a table.
• You can run this test in BestFit, EasyFit or DataPlot software.
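A minimal Python sketch (illustrative) of a K-S test against a fitted normal distribution using scipy.stats.kstest; the severity sample is simulated:

    import numpy as np
    from scipy.stats import kstest, norm

    rng = np.random.default_rng(42)
    losses = rng.lognormal(mean=11.5, sigma=1.0, size=140)   # hypothetical severity data

    # K-S test of the (clearly skewed) sample against a normal fit
    # (strictly, estimating parameters from the same data biases the p-value)
    mu, sigma = losses.mean(), losses.std()
    d_stat, p_value = kstest(losses, norm(mu, sigma).cdf)
    print(f"D = {d_stat:.3f}, p = {p_value:.4f}")   # small p => reject normality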

Anderson-Darling Test
• The Anderson-Darling goodness-of-fit test checks whether a data set comes from a specified distribution.
• It is a modification of the Kolmogorov-Smirnov (K-S) test and gives more weight to the tails than the K-S test.
• The K-S test is distribution-free in the sense that its critical values do not depend on the specific distribution being tested.
• The Anderson-Darling test makes use of the specific distribution in calculating critical values. This has the advantage of allowing a more sensitive test, and the disadvantage that critical values must be calculated for each distribution.
• You can run this test in BestFit, EasyFit or DataPlot software.
• More formally, the test is defined as follows:
  • H0: the data follow a specified distribution.
  • Ha: the data do not follow the specified distribution.
• For the test statistic, see a statistics textbook.
Severity Distribution: Legal Liability Loss

Descriptive statistics of legal liability losses (in British pounds):
Mean 151,944; Median 103,522.9; Standard deviation 170,767.1; Skew 2.8064; Kurtosis 15.3145; No. of obs. 140.

[Histogram of legl_loss: percent frequency against loss size, 0 to 1,500,000.]

Normal Probability Plot for Legal Event Losses

[P-P and Q-Q plots of the fitted Normal(151944, 170767) against the legal-loss data; fitted p-values/quantiles plotted against input p-values/quantiles (values in millions).]
Exponential Probability Plot for Legal Event Losses

[P-P and Q-Q plots of the fitted Expon(149190), shift = +1688.6, against the legal-loss data (values in millions).]

Fitted Exponential Distribution

Expon(149190), shift = +1688.6. Fitted vs actual: Minimum 1,688.6 vs 2,754.2; Mean 150,878 vs 151,944; Median 105,099 vs 103,523; Std. Dev. 149,190 vs 170,767; Skewness 2 vs 2.8064; Kurtosis 9 vs 15.3145.

[Density plot: 90% of the fitted mass lies between 0.009 and 0.449 (values in millions).]
Fitted Weibull Distribution to Cover the Fat Tail

Weibull(1.2154, 192107), shift = −26,732. Fitted vs actual: Minimum −26,732 vs 2,754.2; Mean 153,392 vs 151,944; Median 115,363 vs 103,523; Std. Dev. 148,922 vs 170,767; Skewness 1.492 vs 2.8064; Kurtosis 6.0945 vs 15.3145.

[Density plot: 90% of the fitted mass lies between −0.010 and 0.447 (values in millions).]

Weibull Probability Plot for Legal Event Losses

[P-P and Q-Q plots of the fitted Weibull(1.2154, 192107), shift = −26,732, against the legal-loss data (values in millions).]
Fitting Beta Distribution to Loan Loss

[Fitted BetaGeneral(0.35405, 0.15230, 0, 1) density, as shown earlier; 90% of the fitted mass lies between 0.005 and 1.000. Fit-test vs input: Mean 0.6992 vs 0.7519; Median 0.9308 vs 0.9372; Std. Dev. 0.3737 vs 0.3232; Skewness −0.8509 vs −1.1604; Kurtosis 2.0652 vs 3.0635.]

Beta Distribution

The mean and standard deviation of the beta distribution are given by:

  Mean = α/(α + β)  and  S.D. = √[ αβ / ((α + β)² (α + β + 1)) ]

The parameters of this distribution can be easily estimated using the following (method of moments) equations:

  α̂ = X̄ [ (X̄(1 − X̄)/S²) − 1 ]   and   β̂ = (1 − X̄) [ (X̄(1 − X̄)/S²) − 1 ]
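A small Python sketch (illustrative) of these method-of-moments estimators applied to a hypothetical LGD sample:

    import numpy as np

    lgd = np.array([0.95, 0.10, 1.00, 0.85, 0.40, 0.99, 0.05, 0.90, 0.75, 0.60])  # hypothetical

    xbar, s2 = lgd.mean(), lgd.var()
    common = xbar * (1 - xbar) / s2 - 1          # shared factor in both estimators
    alpha_hat = xbar * common                    # method-of-moments alpha
    beta_hat = (1 - xbar) * common               # method-of-moments beta
    print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}")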
Fitted Loss Distribution through Simulation

[Fitted BetaGeneral(0.34846, 0.16739, 0, 1) density from simulation, as shown earlier; 90% of the fitted mass lies between 0.004 and 1.000. Fitted vs input: Mean 0.6755 vs 0.6994; Median 0.8974 vs 0.93; Std. Dev. 0.3803 vs 0.3740; Skewness −0.7338 vs −0.8509; Kurtosis 1.8714 vs 2.0657.]

VaR

• Value-at-Risk is a risk measure (risk capital) which is generically defined as the maximum possible loss for a given position or portfolio within a known confidence interval over a specific time horizon, due to a certain kind of risk.
Correlation and Dependence Analysis
• Frequency-based joint dependence: using probability and set theory; random sampling with or without replacement.
• Pearson correlation coefficient: r(x,y) = Cov(x,y)/(SDx × SDy)
• Spearman's rank correlation coefficient (ρ): for example, the correlation between salary ratio and gross income generation for 20 traders.
• ρ = 1 − 6Σdᵢ²/[n(n² − 1)], where dᵢ are the differences of the ranked pairs.
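Both coefficients in Python (a sketch with hypothetical data for the 20 traders):

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    rng = np.random.default_rng(7)
    salary_ratio = rng.uniform(0.5, 2.0, size=20)                    # hypothetical
    gross_income = 3.0 * salary_ratio + rng.normal(0, 0.5, size=20)  # noisy linear link

    r, _ = pearsonr(salary_ratio, gross_income)     # linear correlation
    rho, _ = spearmanr(salary_ratio, gross_income)  # rank correlation
    print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")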

Econometric Models

Regression model

Simple Linear Regression (OLS)

• Regression analysis is concerned with the study of the relationship between one variable, called the explained or dependent variable, and one or more other variables, called independent or explanatory variables.
• Linearity implies the slope remains constant.
• In an SLR model, y = b0 + b1x + e
• b0 and b1 are the parameters, and e is a random variable referred to as the error term. 'e' accounts for variability in y not accounted for by the linear relationship between x and y.
• b0 is the intercept or constant term; b1 is the slope, indicating the change in y for a 1-unit change in x.
• The sign of b1 indicates the direction of the relationship.

Assumptions of OLS regression

• The OLS method of estimation minimises the sum of the squared error terms.
• Assumptions behind OLS:
  • The model is correctly specified
  • Mean value of uᵢ = 0
  • Equal variance of uᵢ (homoscedasticity)
  • No autocorrelation, i.e. no correlation between uᵢ and uⱼ
  • Zero covariance between Xᵢ and uᵢ
  • No multicollinearity, Cov(Xᵢ, Xⱼ) = 0 (for multivariate regression)
Regression Analysis -- OLS

[Scatter plot of Y against X with a fitted straight line.]

Ordinary Least Squares (OLS)
• We have a set of data points, and want to fit a line to the data.
• The most "efficient" estimator can be shown to be OLS; it minimizes the squared distance between the line and the actual data points.

Regression Analysis -- OLS

  Yⱼ = a + b·Xⱼ + εⱼ                                  (the basic equation)

  b̂ = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²                  (OLS estimator of b)

  â = Ȳ − b̂X̄                                          (OLS estimator of a)

Here a hat denotes an estimator, and a bar a sample mean.
Regression Analysis -- Confidence

  R² = Σ(Ŷᵢ − Ȳ)² / Σ(Yᵢ − Ȳ)² = (correlation)²

  S.E.(b̂) = √[ (Σ(Yᵢ − Ŷᵢ)² / (n − 2)) / Σ(Xᵢ − X̄)² ] = √[ (RSS/(n − 2)) / Σ(Xᵢ − X̄)² ]

  S.E.(â) = √Variance(â), where Variance(â) = (RSS/(n − 2)) × [ 1/n + X̄²/Σ(Xᵢ − X̄)² ]

Here, the R-squared is a measure of the goodness of fit of our model, while the standard error of b gives us a measure of confidence for our estimate of b.
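A compact NumPy sketch (illustrative, hypothetical data) computing these OLS quantities directly from the formulas above:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 40, size=30)            # hypothetical regressor
    y = 5.0 + 0.3 * x + rng.normal(0, 2, 30)   # hypothetical response

    xbar, ybar = x.mean(), y.mean()
    b_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    a_hat = ybar - b_hat * xbar

    y_hat = a_hat + b_hat * x
    rss = np.sum((y - y_hat) ** 2)
    r2 = np.sum((y_hat - ybar) ** 2) / np.sum((y - ybar) ** 2)
    se_b = np.sqrt(rss / (len(x) - 2) / np.sum((x - xbar) ** 2))
    print(f"a={a_hat:.3f} b={b_hat:.3f} R2={r2:.3f} SE(b)={se_b:.4f} t={b_hat/se_b:.2f}")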

Significance Test for Regression Coefficients

• The t-test in a regression exercise helps us examine whether the regression parameters have any significant effect on the dependent variable.
• For example, a researcher may want to examine the effect of system downtime on the amount of operational risk, or of a fall in GDP growth on the default rate.
• H0: b = 0; a = 0
• Ha: b ≠ 0; a ≠ 0
• Test statistics:

  t(â) = (â − 0)/S.E.(â)  and  t(b̂) = (b̂ − 0)/S.E.(b̂)

A 100×(1 − α)% confidence interval (C.I.) is given by: b̂ ± S.E.(b̂) × t(α/2), where t follows the t-distribution with d.f. n − 2.

Rule of thumb: if the estimated |t| exceeds the critical value and the resulting error rate or p-value is < 0.05, then we reject the null hypothesis and conclude that the independent variable (X) significantly affects the dependent variable (Y). One can also perform a one-tailed test to check whether there is a positive or negative relationship between Y and X.
Overall Significance and Goodness of Fit

  Total sample variance: TSS = ESS + RSS
  ⇒ Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² = Σ(Ŷᵢ − Ȳ)² + Σ(Yᵢ − Ŷᵢ)²

Again, it can be proved that:

  ESS = b̂² × Σ(Xᵢ − X̄)²

• The difference between TSS and RSS represents the improvement obtained by adjusting Y to account for X.
• The measure of goodness of fit R² can be constructed as the ratio of explained variance to total variance, i.e. R² = ESS/TSS = 1 − RSS/TSS.
• For a good-fitting model, ESS will be large, RSS will be small and R² will be large.

Regression Analysis -- Two-Variable

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.9768
R Square            0.9541
Adjusted R Square   0.9449
Standard Error      27.0865
Observations        7

ANOVA
             df   SS           MS         F          Significance F
Regression   1    76274.477    76274.48   103.9621   0.000155729
Residual     5    3668.380     733.676
Total        6    79942.857

             Coefficients   Standard Error   t Stat     P-value
Intercept    87.1061        17.9260          4.859204   0.004636
Q (X)        12.2123        1.1977           10.19618   0.000156

b̂ / S.E.(b̂) = the t-ratio. Combined with critical values from a Student's t distribution, this ratio tells us how confident we are that a value is significantly different from zero.
Regression Fit

[Chart: Q(X) line fit plot showing actual TC(X) and predicted TC(X) against Q(X).]

Interpretation of Model Parameters

Operational Losses and System Downtime

Date     Operational losses ($)   System downtime (minutes)
1-Jun    1,610,371                9
2-Jun    25,677                   0
3-Jun    1,504,852                11
4-Jun    (missing)
5-Jun    913,881                  7
6-Jun    2,352,458                18
7-Jun    3,549,325                19
8-Jun    0                        0
9-Jun    0                        0
10-Jun   1,649,917                13

Using the above data, we estimate the regression equation using OLS:

  Operational loss = −$40,526 + [$155,470 × system downtime];  R² = 0.9314
  (standard errors: 176,688.8 and 15,945.9; t-statistics: −0.229 and 9.750)
  Adjusted R² = 0.9216; F-statistic = 95.06
Regression: Interpretation
• Operational_loss on day i = [intercept] + [slope × system downtime on day i] + [random error on day i]
• Simple linear regression is a conditional expectation.
• The slope parameter "β" measures the relationship between X (the independent variable) and Y (the dependent variable). It is interpreted as the expected change in Y for a 1-unit change in X.
• For example, if we estimate a regression and find E(Y/X) = 2.75 + 0.35X, a 1-unit change in X is expected to lead to a 0.35-unit change in Y.

Statistical Significance Test of Regression Coefficient

• In the op-risk regression example, one can check: does system downtime affect the amount of operational losses?
• We may, if we wish, test this formally with H0: β = 0 against H1: β > 0.
• The S.E. and t-value, with the resulting p-value, will help us test the null hypothesis (a one-tailed test).
Regression Interpretation: Ex2
• JPMC estimated the following regression equation using 15 years of data points:
• Loss Given Default Rate = 1.16 + 0.16 × ln(Default Rate)
  (0.0001) (0.0069)
  R² = 0.44 and adjusted R² = 0.40

Source: M. Araten, M. Jacobs Jr. and P. Varshney (May 2004), "Measuring LGD on Commercial Loans: An 18-Year Internal Study", RMA.

Multivariate Regression Analysis


„ Regression with two or more X variables
„ All variables have to be continuous (at least interval level), X
„ variables can also be dichotomous dummy-variables
„ How does a group of X variables affect Y variable?
„ Regression equation:

In logistic regression the dependent variable is dichotomous


(e.g. yes/no) 112
Multivariate Regression

  ⎡Y1⎤   ⎡1  X21 …  Xk1⎤ ⎡β1⎤   ⎡u1⎤
  ⎢Y2⎥ = ⎢1  X22 …  Xk2⎥ ⎢ ⋮ ⎥ + ⎢u2⎥
  ⎣Yn⎦   ⎣1  X2n …  Xkn⎦ ⎣βk⎦   ⎣un⎦

  y = Xβ + u
  β̂ = (X′X)⁻¹X′y

Regression Results
• Regression analysis produces the following results:
• For the whole regression:
  • R² gives the explanatory power of the regression model (explained variance/total variance)
  • ANOVA (F-test and p-values: a test of overall goodness of fit)
• For each X variable:
  • Regression coefficients (betas)
  • Standard error of the regressor
  • t-test value for statistical significance (with p-values)
R2, Adjusted R2, F statistics for Model Fit

• R² = ESS/TSS measures the explanatory power of the regression model.
• Adjusted R² = 1 − [(RSS/(n − k)) / (TSS/(n − 1))]
• n = no. of observations; k = no. of parameters.
• The F-test examines the goodness of fit of the model.
• H0: β1 = β2 = β3 = … = βn = 0
• F = [R²/(k − 1)] / [(1 − R²)/(n − k)]
• A high F value with a low p (less than 5%) rejects the null hypothesis, and the model fits well.

The Difference between Correlation and Regression

• Correlation analysis measures how two random variables vary together.
• In regression we assume the values taken by the dependent variable are influenced or caused by the independent variables.
• Therefore, regression provides us with a cause-and-effect modeling framework.
• Correlation, on the other hand, informs us that two variables may be related, but it tells us nothing about causation.
Example of Regression Analysis

[Regression output table for the experienced-health example interpreted on the next slide.]

Coefficients

Interpretation:
• Health does not seem to be dependent on sex (P = 0.209 > 0.05).
• Age, smoking and exercising have significant effects on health.
• Age has the strongest effect (Beta = −0.316). The older the person, the weaker the experienced health. Smoking has a negative effect and exercising a positive effect on health.
• In total the model is statistically significant and explains 15.5% of the total variation in experienced health.
Application of Multiple Regression: Ex1
„ Operational Loss = f(system downtime, no. of trainees working, no. of
experienced staff, volume of transactions, no. of transaction errors)
Dependent Variable: OPLOSS
Method: Least Squares
Date: 11/15/09 Time: 00:18
Sample: 1 10
Included observations: 9
Variable Coefficient Std. Error t-Statistic Prob.

C 508467.1 359893.5 1.412827 0.25260


SYSTDOWN 162073.9 11343.19 14.28822 0.00070
TRAINEES 42063.33 19657.59 2.139801 0.12190
EXPRSTAFF -41034.77 9035.252 -4.541629 0.02000
TRANSNO 0.556896 4.598436 0.121106 0.91130
TRANSERR -1.074378 40.7226 -0.026383 0.98060

R-squared 0.99457 Mean dependent var 1289609


Adjusted R-squared 0.985521 S.D. dependent var 1203115
S.E. of regression 144767.8 Akaike info criterion 26.83837
Sum squared resid 6.29E+10 Schwarz criterion 26.96985
Log likelihood -114.7727 F-statistic 109.9071
Durbin-Watson stat 1.940854 Prob(F-statistic) 0.001352
119

Logistic Regression
„ Logistic regression in a nutshell:
„ It is a multiple regression with an outcome variable (or
dependent variable) that is a categorical dichotomy and
explanatory variables that can be either continuous or
categorical
„ In other words, the interest is in predicting which of two
possible events is going to happen given certain other
information
„ For example in Political Science, logistic regression could be
used to analyse the factors that determine whether an
individual participates in a general election or not.

120
Why can't we use a Simple Linear Regression?

„ Let us remember what we have learnt about Simple Linear


Regression:
„ We used it when we had reasons (a theory) to assume

causality between two variables: X → Y.


„ Example:

„ X= Investment in R&D; Y= New Products introduced

„ In particular, we want the ‘X’ to cause the ‘Y’ and not the
reverse.

121

Simple Linear Regression


„ This sort of regression analysis provides us with useful
information:
„ E.g.: For a certain confidence level (95%, for example): How

much the explained variable (Y) changes as a result of a


change in the explanatory variable (X)
„ With a regression we can predict the value of Y given the

value of X

122
Simple Linear Regression
How is this impact of X on Y estimated?

„ We assumed a linear relation between the two variables


„ We introduced ‘u’, unobserved factors affecting Y, which we
are not going to account for in our model
„ Then we postulated the following relation:
Yi = α + βXi + ui

123

Simple Linear Regression


How is this impact of X on Y estimated?

„ We made some assumptions about ‘u’


(Basically we assumed that ui are identically and independently
distributed with zero mean and constant variance)

„ Then we estimated the parameters for the model (using


generally Ordinary Least Squares)

„ Simple Linear Regression provides the ‘best fit’ line. i.e.: the
straight line which best describes the relationship between the
two variables

124
Our example: R&D and New Products

„ How does investment in R&D affect the number of new products
developed? We can postulate the following relation:
# of new products = α + β × Investment in R&D + u
„ Let us look at the scatter plot:
[Scatter plot of NEWPROD (0-50) against RD (0-800)]

125

Our Example: Investment in R&D and introduction of new products

„ It makes sense to assume a linear relation between X and Y in this case.
„ The estimate for β = 0.049
„ This tells us that in order to increase the number of new products by one
unit, we need to invest a little more than 20 monetary units in R&D.
„ If a company invests 1000 in R&D, we would predict this company to
develop around 49 new products
[Scatter plot of NEWPROD against RD with the fitted regression line]

126
Another example: Failing or Passing an
exam

„ Let us define a variable ‘Outcome’


„ Outcome = 0 if the individual fails the exam

= 1 if the individual passes the exam


„ We can reasonably assume that Failing or Passing an exam
depends on the quantity of hours we use to study
„ Note that in this case, the dependent variable takes only two
possible values. We will call it a ‘dichotomous’ variable

127

Regression analysis with dichotomous dependent variables

„ We will be interested then in inference about the probability of


passing the exam.

„ Were we to use linear regression, we would postulate:


Prob (Outcome=1) = α + β*Quantity of hours of study + u

„ As we are concerned about modelling the probability of the event


occurring, this is a probability model

„ As we model the relation between the quantity of hours of study


and the probability of passing the exam as linear, this is a linear
model

„ We will call this model a ‘Linear Probability Model’ (LPM)


128
Linear Probability Models (LPM)

„ Our dataset contains information about 14 students.
„ Our statistical software (SPSS) will happily perform a linear
regression of Outcome on the quantity of study hours.

Student id   Outcome   Quantity of Study Hours
1            0         3
2            1         34
3            0         17
4            0         6
5            0         12
6            1         15
7            1         26
8            1         29
9            0         14
10           1         58
11           0         2
12           1         31
13           1         26
14           0         11
129

Linear Probability Models (LPM) – What is wrong with them?

„ Let us do a scatter plot and insert the regression line:
„ The probability of Outcome=1 can take values between 0 and 1
„ But we do not observe probabilities but the actual event happening
„ A straight line will predict values between negative and positive
infinity, outside the [0,1] interval!
[Scatter plot of OUTCOME against HSTUDY with the fitted line]

130
What is wrong with LPM?
Coefficients
Model           B           Std. Error   Sig.
1 (Constant)    -0.031861   0.161591     0.846994
  HSTUDY        0.026219    0.006483     0.001627
a. Dependent Variable: OUTCOME

„ Above is the SPSS output on the linear regression of ‘Outcome’


on Hours of Study
„ The results suggest that an increase in 1 hour of studying
increases the probability of passing the exam, on average, by
approx. 0.026 or 2.6%.
„ So what would the model predict if we studied 100 hours for
the exam?

131
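„ A quick check of that question with the estimated coefficients (a sketch; the point is that the fitted “probability” leaves the [0,1] interval):

# LPM fitted above: P(pass) = -0.031861 + 0.026219 * hours of study
b0, b1 = -0.031861, 0.026219

for hours in (10, 40, 100):
    p_hat = b0 + b1 * hours
    print(hours, round(p_hat, 3))
# At 100 hours the predicted "probability" is about 2.59 -- far above 1!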

Linear Probability Models (LPM) –


What is wrong with them?
„ Basically, the linear relation we had postulated before between
X and Y is not appropriate when our dependent variable is
dichotomic. Predictions for the probability of the event
occurring would lie outside the [0,1] interval, which is
unacceptable.
„ Two other subtle problems with the LPM:
„ Distribution of ui is not normal as we wished it to be

„ The variance of ui is not constant (problem of

heteroscedasticity)

132
Non Linear Probability Models

„ We want to be able to model the probability of the event


occurring with an explanatory variable ‘X’, but we want the
predicted probability to remain within the [0,1] bounds.
„ There is a threshold above which the probability hardly
increases as a reaction to changes in the explanatory
variable.

„ Many functions meet these requirements (non-linearity and


being bounded within the [0,1] interval).

„ The logistic function is one such function.

„ The Logistic Curve will relate the explanatory variable X to the


probability of the event occurring. In our example, it will relate
the number of study hours with the probability of passing the
exam.
133

Logistic Regression
„ Logistic regression, and related methods such as Probit analysis,
are very useful techniques when one wants to understand or to
predict the effect of a series of variables on a binary response
variable (a variable which can take only two values, 0/1 or
Yes/no, for example).
„ The methodology of logistic regression aims at modeling the
probability of success depending on the values of the
explanatory variables, which can be categorical or numerical
variables.
„ For example, a marketing researcher may want to detect if
customers are likely to renew their savings deposit/Loan Facility
„ Logistic regression can be helpful to model the effect of repeal
of a patent on profitability of textile firms or to examine the key
determinants of likelihood of a firm to export or to evaluate the
risk for a bank that a client will not pay back a loan
The Logit Model
„ A Logit Model states that:
„ Prob(Yi=1) = F (α + βXi)

„ Prob(Yi=0) = 1 - F (α + βXi)

„ Where F(.) is the ‘Logistic Function’.


„ So, the probability of the event occurring is a logistic
function of the independent variables

135

Logit Models

P(Yi = 1) = F(b0 + b1X1i) = 1 / (1 + e^-(b0 + b1X1i))

„ When there is only one explanatory variable (X1), the Logistic


Function is defined as above.
„ Therefore, we will be interested in finding estimates for b0 and
b1 so that the Logistic Function best fits the data

136
How do we find the best Logistic Function
to fit our data?

„ We will estimate our model with Maximum Likelihood
techniques, which are implemented in standard statistical packages.
„ Logistic regression can be run in statistical packages like
STATA, SPSS etc.
„ One can also use XLSTAT to run a logistic regression in Excel:
„ http://www.kovcomp.co.uk/support/XL-Tut/demo-log.html
„ http://www.kovcomp.co.uk/xlstat/index.html
„ One may also use Solver in Excel to run a logistic regression
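„ As a sketch, the same estimation can be done in Python: the code below fits the 14-student dataset shown earlier by maximum likelihood, assuming the statsmodels package is available:

import numpy as np
import statsmodels.api as sm

hours = np.array([3, 34, 17, 6, 12, 15, 26, 29, 14, 58, 2, 31, 26, 11], dtype=float)
outcome = np.array([0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0])

X = sm.add_constant(hours)              # adds the intercept column
result = sm.Logit(outcome, X).fit()     # maximum likelihood estimation
print(result.params)                    # estimates of b0 and b1
print(result.predict(X))                # fitted probabilities, all inside (0, 1)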

Logit Regression in Credit Risk


Management
„ The logit regression technique is popularly used in developing credit
scoring models for Corporate/SME/Retail loans
„ The logit model overcomes the “unboundedness” weakness of MDA and
Linear Probability Models by restricting the estimated range of default
probabilities to lie between 0 and 1.
„ Essentially this is done by plugging the estimated value of Z from the
LPM into the formula: F(z) = 1/(1 + exp(z))
„ where z is the score from a regression equation in which two factors, the
debt-equity ratio (D/E) and the sales-asset ratio (S/A), predict the
probability of default of payment on a loan.
„ The estimated LPM regression equation from past data on the default
behavior of borrowers is: Z = 0.5(D/E) - 0.3(S/A)
„ Assume a prospective borrower has D/E = 0.3 and S/A = 4. Its
expected LPM score = 0.5×0.3 - 0.3×4 = -1.05, and F(z) =
1/(1 + exp(-1.05)) = 74.08%. That is, the estimated probability of
solvency is 74.08% (so the probability of default is 25.92%).
138
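„ The arithmetic of this example is a two-line check in Python; a sketch:

from math import exp

D_E, S_A = 0.3, 4.0
z = 0.5 * D_E - 0.3 * S_A        # LPM score: -1.05
p_solvency = 1 / (1 + exp(z))    # F(z) as defined above
print(z, round(p_solvency, 4))   # -1.05, 0.7408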
Logit Model in Credit Risk
„ Logistic regression is a simple and appropriate technique for
estimating the log of the odds of default as a linear function of
loan application attributes:
ln[Prob(Default)/Prob(Solvency)] = β0 + β1X1 + β2X2 + β3X3 + … + βkXk

„ A logistic model has the flexibility of incorporating both the


qualitative and quantitative factors and is more efficient than
the linear regression probability model.
„ In a logistic regression exercise, we are actually predicting the
probability of a loan default based on financial, non-financial
(qualitative borrower characteristics) and situational factors
(location and local factors) obtained from the credit files of the
Bank, plus external macroeconomic conditions

139

Logistic Regression in Operational Risk


Management
„ Logistic regression is a useful tool for analyzing data that
includes binary dependent variables such as presence or
absence of a fraud and success or failure of a back office
process or system.
„ Logistic regression is simply a nonlinear transformation of the
linear regression model. However, unlike OLS, it does not
require assumptions about normality.
„ The dependent variable is the log odds ratio, or logit
„ For example, observed daily “computer system failures” (coded
yes/no as 1/0) are converted to proportions, which are then fitted
by a logit model that determines the probability that the computer
system will fail today, given causative factors such as the presence
of new trainees, the staff ratio, business volume etc.

140
Multiple Discriminant Analysis
„ Discriminant analysis is appropriate in situations where the
researcher may want to identify those variables/factors which
are effective in predicting group membership or what variables
discriminate well between groups.

141

MDA Analysis: Altman’s Z Score

¾ Altman (1968), for the first time, applied Multiple Discriminant Analysis
(MDA) in response to shortcomings of traditional univariate financial ratio
analysis.
MDA models are developed in the following steps:
¾ Establish a sample of two mutually exclusive groups: firms which have
“failed” and those which are still continuing to trade successfully
¾ Collect financial ratios for each of these companies belonging to both of
these groups
¾ Identify financial ratios which best discriminate between groups (F-test/
Wilk’s Lambda test).
¾ Establish a Z score based on these ratios.
142
Altman’s Z-Score Model

Altman Z model predicts the probability of a company going bankrupt


within a period of 12 months:

Z = 1.2X1 + 1.4X2 + 3.3X3 + 0.6X4 + 0.999X5


Z = The Z score
X1 = Net Working Capital (NWC)/Total Asset (liquidity)
X2 = Retained Earnings/Total Asset (cumulative profitability)
X3 = Profit before Interest and Tax (PBIT)/total assets (productivity)
X4 = Market Value of Equity/Book value of Liabilities (movement in
the asset value)
X5 = Sales/Total Assets (sales generating ability)

143

Altman’s Z-Score Model

¾ It is a classificatory model for corporate customers

Î Z-score > 2.99 - firm is in a good shape

Î 2.99 > Z-score > 1.81 - warning signal
Î Z-score < 1.81 - big trouble; the firm could be heading towards
bankruptcy
Î Therefore, the greater a firm’s distress potential, the lower
its discriminant score

¾ Z-score model can be used as a default probability predictor


model

144
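„ A minimal sketch of the Z-score computation and the classification zones above (the ratio values are made up for illustration):

def altman_z(x1, x2, x3, x4, x5):
    """Altman (1968) Z-score from the five financial ratios."""
    return 1.2*x1 + 1.4*x2 + 3.3*x3 + 0.6*x4 + 0.999*x5

def zone(z):
    if z > 2.99:
        return "good shape"
    if z > 1.81:
        return "warning signal"
    return "big trouble - possible bankruptcy"

# Made-up ratios for one firm
z = altman_z(x1=0.12, x2=0.18, x3=0.09, x4=0.80, x5=1.4)
print(round(z, 2), zone(z))   # 2.57 -> warning signal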
The Z score and weights
„ The discriminant coefficients can be estimated using following formula
based on 2 variables:
„ Z = aX + bY where X = TOL/TA and Y = CR

„ where
„ a = [VarY·(avg.Xsolv - avg.Xdef) - CovXY·(avg.Ysolv - avg.Ydef)] / [VarX·VarY - (CovXY)²]
„ b = [VarX·(avg.Ysolv - avg.Ydef) - CovXY·(avg.Xsolv - avg.Xdef)] / [VarX·VarY - (CovXY)²]
„ where CovXY = Σ(X - avg.X)(Y - avg.Y)/(n - 1)
„ avg. Xsolv=mean of variable X for borrowers in solvent category
„ avg. Xdef=mean of variable X for borrowers in defaulted group
„ avg. Ysolv=mean of variable Y for borrowers in solvent category
„ avg. Ydef=mean of variable Y for borrowers in defaulted category
„ The cut off Z-score is the combined benchmark for identified
independent variables to classify the prospective borrower into defaulted
or solvent category.

145
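„ A numpy sketch of these formulas under one reading of them (made-up samples of X = TOL/TA and Y = CR for each group; the variance and covariance terms are computed over the pooled sample):

import numpy as np

# Made-up samples; columns are (X = TOL/TA, Y = CR)
solvent   = np.array([[1.8, 1.9], [2.1, 1.7], [1.5, 2.2], [2.0, 1.8]])
defaulted = np.array([[3.5, 1.1], [4.0, 0.9], [3.2, 1.2], [3.8, 1.0]])

pooled = np.vstack([solvent, defaulted])
var_x = pooled[:, 0].var(ddof=1)
var_y = pooled[:, 1].var(ddof=1)
cov_xy = np.cov(pooled[:, 0], pooled[:, 1], ddof=1)[0, 1]

dx = solvent[:, 0].mean() - defaulted[:, 0].mean()   # avg.Xsolv - avg.Xdef
dy = solvent[:, 1].mean() - defaulted[:, 1].mean()   # avg.Ysolv - avg.Ydef

den = var_x * var_y - cov_xy**2
a = (var_y * dx - cov_xy * dy) / den
b = (var_x * dy - cov_xy * dx) / den
print(a, b)   # discriminant weights in Z = aX + bY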

Regime Switching Regression-Dummy


Variable Regression
„ When an operational risk event is subject to regime shifts, the
parameters of the statistical model will be time-varying.
„ For example, consider the time series of minutes of system
downtime per month for a particular business unit. It may
happen that downtime fell sharply as a result of the business
unit outsourcing its IT administration. Thus, the change in
management policy had a direct impact on the stochastic
behavior of the OR event “system downtime”.
„ One simple approach for capturing regime shift would be to
model regression using either intercept or slope dummies:
Before and After IT function has been outsourced.

146
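„ A sketch of the intercept-dummy version (made-up monthly downtime series; statsmodels assumed available):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
# 36 months of downtime; IT administration outsourced after month 24
downtime = np.concatenate([rng.normal(400, 40, size=24),
                           rng.normal(150, 40, size=12)])
outsourced = np.concatenate([np.zeros(24), np.ones(12)])  # regime dummy

X = sm.add_constant(outsourced)
result = sm.OLS(downtime, X).fit()
# Constant ~ pre-outsourcing mean; dummy coefficient ~ shift in the mean
print(result.params)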
Multivariate Regression Interpretation
(Including Regime Shifting): Ex2

Repo Rate Determinants (using quarterly data from Nov. 2000 to Dec. 2007)

                          Jt. Jalan & Reddy Period   Reddy Period
                          (Nov. 2000-07)             (Sept 2003-Dec 2007)
                          All          Only money    All          Only money
                          determinants supply        determinants supply
Lagged % change GDP       -0.22***     ---           -0.027       ---
Lagged % change WPI       0.013        ---           -0.01        ---
Lagged % change M3        0.25**       0.24***       0.24***      0.24***
Constant                  4.9***       3.67***       3.22*        3.00***
Adjusted R-squared        0.47         0.22          0.84         0.84
Observations              29           29            18           18
Note: Absence of * indicates not significant even at the 5% level of confidence; * indicates
significance at 5%, ** at 1% and *** at 0.1%

Source: “Triplets: RBI, money supply and the repo”,


by Surjit Bhalla, Business Standard, Jan 6, 2009

147

Linear Probability or Multiple Discriminant


Analysis (MDA) for Op-Risk Scores & Rating
Migration Analysis
„ MDA, a statistical classification technique popularly applied in credit
risk management, can also be applied to design a Qualitative Risk
Rating of the Operational Risk Management Process in a Bank (as part
of QLA).
„ To apply this technique, one has to devise a risk map that
encompasses all types of operational risk (legal, compliance,
operations, security, system, etc.) and then the key qualitative as
well as quantitative factors may be selected through MDA statistical
analysis to develop risk level of operational process of a Bank (e.g.
transaction processing in each business). Such exercise would also
help us to understand the key factors that affect the risk rating.
„ The awarded score will determine the OR Rating (on a 7- or 8-point scale).
„ Next, one can study the rating migration of credit/operational risk
over time (following the Markov chain method), which would help
top management to control or monitor risk and to identify the
factors responsible for downgrades etc.
148
Modeling Computer System Failure Risk
with Logistic Regression
„ Processing risk covers losses from back office operations. It
includes, among other factors, the failure of computer systems.
„ Logistic regression allows us to calculate the probability associated
with such a failure, given assumed risk indicators.
„ For example, given the data, we hypothesize that the probability of
computer failure is related to the ratio of available staff (systems
support and maintenance) to all available staff on a particular day,
and to the volume of computer-related business activity as a
proportion of the recommended maximum capacity of the system.
„ ln[p/(1-p)] = α + β1(staff ratio_i) + β2(volume_i) + ε_i
„ After estimating the logistic regression equation using the daily
data, we obtain the following estimated coefficients (p-values in
brackets):
„ ln[p/(1-p)]=-1.9936 (0.001)+1.416(0.002)×Staff_ratio
+1.3917(0.003)×Volume
149
Note: McFadden’s Pseudo R2 measures the goodness of fit
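„ Given assumed values of the two risk indicators, the failure probability follows by inverting the logit; a sketch using the coefficients estimated above (the indicator values are illustrative assumptions):

from math import exp

b0, b1, b2 = -1.9936, 1.416, 1.3917   # estimated coefficients from above

def p_failure(staff_ratio, volume):
    """Probability of a computer system failure from the fitted logit."""
    z = b0 + b1 * staff_ratio + b2 * volume
    return 1 / (1 + exp(-z))

# Illustrative (assumed) indicator values for one day
print(round(p_failure(staff_ratio=0.10, volume=0.90), 3))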

Diagnostic Checks-validation of regression


estimation
„ Heteroscedasticity: the data violate the assumption that the
disturbance terms all have the same variance.
„ It affects the standard errors of the regression coefficients, leading
to incorrect decisions concerning the reliability of the partial
regression coefficients.
„ Multicollinearity: linear dependence between explanatory variables;
it affects the precision of the estimators (low significance of
individual variables but high R2).
„ If your goal is simply to predict Y from a set of X variables, then
multicollinearity is not a problem. The predictions will still be
accurate, and the overall R2 (or adjusted R2) quantifies how well
the model predicts the Y values.
„ If your goal is to understand how the various X variables impact
Y, then multicollinearity is a big problem. One problem is that the
individual P values can be misleading (a P value can be high, even
though the variable is important).
„ Solutions: The best solution is to understand the cause of
multicollinearity and remove it. You can also reduce its impact by
increasing the sample size.
150
Diagnostic Checks…
„ Serial Correlation: when error terms are serially correlated (it can be
identified by a Durbin-Watson statistic well below 2)
„ Serial correlation will not affect the unbiasedness or consistency of
the OLS estimators, but it does affect their efficiency (standard
errors may be understated under positive serial correlation), which
leads to a tendency to wrongly reject the null hypothesis when it
should not be rejected.

151
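„ Both checks are readily automated; a sketch on made-up data using statsmodels’ variance_inflation_factor (multicollinearity) and durbin_watson (serial correlation) helpers:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)   # nearly collinear with x1
y = 1 + x1 + rng.normal(size=100)

X = sm.add_constant(np.column_stack([x1, x2]))
result = sm.OLS(y, X).fit()

# VIF well above 10 is a common warning sign of serious multicollinearity
print([variance_inflation_factor(X, i) for i in range(1, X.shape[1])])
# Durbin-Watson near 2 suggests no serial correlation in the residuals
print(durbin_watson(result.resid))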

Time Series Techniques of Forecasting


„ A time series is a set of observations on a variable measured at
successive points of time. E.g. interest rate movements, price
movements, GDP trends, yield on bonds/spread movement, credit
growth etc.
„ In order to forecast the time series variables more accurately, one has
to devise a scheme to describe the movement in a time series
adequately.
„ For this purpose, we need to decompose the time series into four
components:
„ Trend component - long term smooth movement.
„ Cyclical component - oscillatory movement around the trend (ups
and downs). One has to find the lag structure to capture cyclicality.
„ Seasonal component - also oscillatory in character but strictly
confined to intra-year movement. This mainly appears in
daily/weekly/monthly data series.
„ Irregular component - the residual, unsystematic (random) movement.

152
The following data represent the closing value of the Dow Jones
Industrial Average for the years 1980-2001.
[Data table not reproduced]
153

Time Series Plot
[Time series plot of the Dow Jones closing values]
154
Monthly WPI Series
[Line chart of the monthly WPI index, 1998-2007]
155

Time Series of Monthly Aaa Bond Yields
[Line chart of monthly Aaa bond yields (BOND_YLD), 1990-1994]
156


Trend Analysis
„ The trend component of a time series reflects the long run
movement. Usually, it is a rising or falling smooth curve.
„ It is very advantageous for forecasting population growth, credit
growth trend etc. if we can represent the trend by a simple
mathematical function.
„ The linear trend is a special case of polynomial trend. An nth degree
polynomial trend can be written as:
Yt = a0 + a1·t + a2·t² + a3·t³ + … + an·tⁿ
„ One can perform OLS regression to get trend fitting.
„ Suppose one has a yearly data series of a bank’s credit supply from
1960 to 2006 and wants to fit a trend to the actual series. For this, one
converts the actual credit series into natural logs (base e), fits an nth
degree polynomial trend function like the above equation, and
estimates the coefficients by the Ordinary Least Squares (OLS)
method. This gives the compound annual growth rate and can also be
used to project the trend for future years.
157
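„ A sketch of such a log-linear trend fit (made-up yearly credit series; np.polyfit performs the OLS fit):

import numpy as np

rng = np.random.default_rng(3)
t = np.arange(47)                                   # years 1960..2006
credit = 100 * 1.12**t * np.exp(rng.normal(scale=0.05, size=t.size))

# Fit ln(credit) = a0 + a1*t by OLS (polyfit returns the slope first for deg=1)
a1, a0 = np.polyfit(t, np.log(credit), deg=1)
print(np.exp(a1) - 1)                               # implied compound growth rate (~0.12)

# Project the fitted trend five years beyond the sample
t_future = np.arange(t[-1] + 1, t[-1] + 6)
print(np.exp(a0 + a1 * t_future))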

ARIMA Technique
„ Any time series which contains no trend can be represented as consisting of two
parts: an AR process (dependence on lags of the variable itself) and an MA process
(dependence on lagged errors, i.e. serial correlation in the disturbance)
„ ARIMA model can improve forecasting power as it incorporates trend, cyclicality
and seasonality
„ STEPS in Building ARIMA model of forecasting:
„ A. Model Identification - stationarity check, identifying the level of stationarity
of the series (or order of integration) and specifying the AR and MA processes
Methods-
„ Correlogram analysis-studying the Auto correlation function (ACF) and partial
auto correlation function (PACF) lag structure
„ Dickey-Fuller Unit Root Test
„ B. Model Estimation: Having determined the orders of the ARIMA model, the
model can be estimated in either EVIEWS 5 or STATA 9 using differenced
regression technique.
„ C. Diagnostic Checks: Once the ARIMA model is specified and its parameters are
estimated, the adequacy of the model may be checked through the Box-Pierce-
Ljung residual test (a white noise test of the residuals)
„ D. Forecasting: After diagnostic checks, the regression equation may be used to
generate short term (static) or long term (dynamic) forecasts
158
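„ The four steps map onto standard Python tools as well; a compact sketch on a simulated series (adfuller, ARIMA and acorr_ljungbox are statsmodels functions; the (1,1,1) order is just an assumption for illustration):

import numpy as np
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(4)
series = np.cumsum(rng.normal(size=200))   # simulated I(1) series

# A. Identification: unit root test (large p-value -> difference the series)
adf_stat, p_value = adfuller(series)[:2]
print(p_value)

# B. Estimation: ARIMA(1,1,1); d=1 applies the differencing internally
result = ARIMA(series, order=(1, 1, 1)).fit()

# C. Diagnostics: Ljung-Box white-noise test on the residuals
print(acorr_ljungbox(result.resid, lags=[10]))

# D. Forecasting: short-term forecasts from the fitted model
print(result.forecast(steps=4))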
Graphical Presentation of ARIMA Process

AR (1) process: [ACF/PACF correlogram not reproduced]

MA (1) process: [ACF/PACF correlogram not reproduced]

159

Correlogram Study…
AR (2) process: [ACF/PACF correlogram not reproduced]

MA (2) process: [ACF/PACF correlogram not reproduced]

Note that the ACF of an explosive or random walk series does not
die out quickly
160
Correlogram Study

AR (1) MA (1) process:

Note: If, after first differencing, the ACF/PACF spikes in the
correlogram are eliminated, the series is I(1); if this happens only
after second differencing, it is an I(2) process
161

Useful References
„ Greene, W. H. (2007). “Econometric Analysis”, Fifth Edition, Low Price Edition,
Pearson Education.
„ Gujarati, D N (2004): “Basic Econometrics”, 4th Edition, Tata McGraw-Hill.
„ Johnston J., and DiNardo J (1997): “Econometric Methods”, 4th Edition, The
McGraw-Hill Companies, Inc. (Important for time series and panel data analysis).
„ Lewis, N. D. C. “Operational Risk: Applied Statistical Methods for Risk
Management”, Wiley Finance.
„ Maddala, G S (1983): “Limited-Dependent and Qualitative Variables in
Econometrics”, Cambridge University Press.
„ Pindyck, R.S., and D. L. Rubinfeld (1981), “Econometric Models and Economic
Forecasts”, McGraw-Hill International Editions.
„ Vose, D. “A Guide to Monte Carlo Simulation Modeling”, John Wiley & Sons.
„ Walpole, R. E. (1982) “Introduction to Statistics”, Publisher: The Macmillan Co., NY.
„ Walpole, R. E., Sharon L Myers, Keying Ye, Raymond H. Myers (2006), “Probability
and Statistics”.
„ EVIEWS help, STATA help, SPSS 17 help etc.
„ @Risk and BestFit Software at Palisade: www.palisade.com.au

162
Thank You

My Email: arindam@nibmindia.org

163
