Application of Statistical Tools in Empirical Research
Empirical Research
The Empirical Research Process
Research Process…
8. Data analysis (quantitative analysis using statistical packages)
1. Univariate/multivariate descriptive analysis
2. Multivariate Regression analysis
3. Diagnostic Checks
9. Interpretation of results
1. Answering empirical questions
2. Explanation of results
10. Drawing conclusions and policy actions
11. Bibliographic citations
12. Finalizing the report (sequencing the charts, tables, footnotes,
abstract and text etc.)
13. Publication of results
Data
When considering the establishment of a framework for statistical
testing, it is sensible to ensure the availability of a large enough
set of reliable information on which to base the test. For
example, if the analyst intends to identify a 'one-in-five-year' event, the
database should span at least five years.
Descriptive Methods
Grouped Frequency Distribution
Frequency polygon
Ogive (cumulative)
Lorenz Curve
Frequency Distribution
Frequency distribution tabulates and presents all the occurring
values arranged in order of magnitude and their respective
frequencies.
An inspection of the frequency distribution gives a quick idea of
the average of the series and shows how the observations vary
around that average (e.g., through a histogram or frequency
polygon drawn from the frequency distribution).
Descriptive Stats about Zone-wise
Loan Distribution of a Bank
zone_group p1 p5 p10 p25 p50 p75 p90 p95 p99 min max range mean sd cv Kurto Gini HHI
Central_Z_I 0.11 0.4 0.69 1 1.68 3 7.67 11.15 84.79 0.01 90.89 90.9 4.042 10.40 2.57 56.47 0.634 0.356
Central_Z_II 0.03 0.39 0.62 0.99 1.43 2.54 6.97 13.71 107.7 0.01 211.43 211.4 5.410 20.75 3.83 76.68 0.724 0.519
East_Z 0.02 0.51 0.95 1.35 2.39 10.8 30.5 55.84 260 0.01 1251 1251.0 20.703 97.27 4.70 137.38 0.792 0.598
Mumbai_Z 0.04 0.29 0.64 1.48 4.14 15 49.6 133.7 560 0.004 1204.4 1204.4 27.815 91.71 3.30 67.32 0.786 0.572
North_Z 0.11 0.5 0.82 1.2 2.22 5.49 13.4 41.45 183.3 0.01 731.03 731.0 10.620 44.61 4.20 159.31 0.739 0.519
South_Z_I 0.13 0.83 0.97 1.31 2.4 6.41 24.3 38.27 97.3 0.02 380.5 380.5 8.735 23.87 2.73 155.87 0.701 0.421
South_Z_II 0.07 0.41 0.79 1.37 3.11 9.74 29 59 272.4 0.04 400 400.0 13.249 37.77 2.85 67.84 0.720 0.442
West_Z_I 0.21 0.73 0.94 1.53 3.27 11 29.7 105.5 225.1 0.12 385.4 385.3 18.296 48.88 2.67 31.59 0.759 0.547
West_Z_II 0.22 0.69 0.83 1.23 2.34 4.93 13.7 27.16 50.32 0.07 99.54 99.5 6.108 11.55 1.89 32.66 0.619 0.299
Total 0.08 0.46 0.79 1.24 2.51 7.94 26.1 52.41 250 0.004 1251 1251.0 15.505 62.26 4.02 163.61 0.771 0.578
Measures of dispersion
Range = maximum value − minimum value
Interquartile range (IQR) = Q3 − Q1
Standard Deviation (SD), Variance (SD²)
Coefficient of Variation (CV) = SD/Mean
Skewness (Sk) = 3(Mean − Median)/SD
 = (Mean − Mode)/SD
 or = [(Q3 − Q2) − (Q2 − Q1)]/[(Q3 − Q2) + (Q2 − Q1)]
 or (3rd-moment measure) = √β1 = µ3/σ³
Kurtosis (4th-moment measure) = β2 = µ4/µ2² = µ4/σ⁴; excess kurtosis = β2 − 3
If β2 < 3, the distribution is platykurtic (thinner tails, flatter peak); if
β2 > 3, the distribution is leptokurtic (fatter tails, sharper peak).
When β2 = 3, the distribution is mesokurtic (normal, symmetric).
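The measures above can be sketched in a few lines of Python. This is a minimal illustration using population moments about the mean; the loss sample is hypothetical.

```python
import statistics as st

def shape_measures(xs):
    """Dispersion and shape measures listed above (population moments)."""
    n = len(xs)
    mean = sum(xs) / n
    med = st.median(xs)
    sd = st.pstdev(xs)                        # population SD
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return {
        "range": max(xs) - min(xs),
        "cv": sd / mean,                      # CV = SD/Mean
        "pearson_skew": 3 * (mean - med) / sd,   # 3(Mean - Median)/SD
        "moment_skew": m3 / sd ** 3,             # mu3 / sigma^3
        "kurtosis": m4 / sd ** 4,                # beta2 = mu4 / sigma^4
    }

# Hypothetical loan-loss sample (illustrative only)
losses = [0.5, 1.0, 1.2, 1.5, 2.0, 2.2, 3.0, 5.0, 9.0, 20.0]
m = shape_measures(losses)
```

For this right-skewed sample with one large loss, the moment skewness is positive and the kurtosis exceeds 3 (leptokurtic), as the definitions above predict.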
These four moments about the mean describe the nature of the
loss distribution in risk measurement.
The mean gives the location of a distribution, and the variance
(the square of the standard deviation) measures its scale.
The Skew is a measure of the asymmetry of the distribution. In risk
measurement, it tells us whether the probability of winning is
similar to the probability of losing and the nature of losses.
Negative skewness means there is a substantial probability of a big
negative return. Positive skewness means that there is a greater-
than-normal probability of a big positive return.
Kurtosis is useful in describing extreme events (e.g., losses that are
so bad that they only have a 1 in 1000 chance of happening).
In the extreme events, the portfolio with the higher kurtosis would
suffer worse losses than the portfolio with lower kurtosis.
Skewness and Kurtosis are called the shape parameters
Moments and the Nature of Distribution
Kurtosis
Since kurtosis measures the shape of the distribution (the fatness of the tails), it
describes how losses are spread around the mean.
Leptokurtic means a smaller proportion of medium-sized deviations from the mean, but
a larger proportion of extremely large and small deviations. Kurtosis
greater than three indicates a sharp, high peak with a thin midrange and fat tails
(super-Gaussian type, e.g. Pareto, lognormal, or Weibull
distributions).
Platykurtic means a smaller-than-normal proportion of extreme deviations from the
mean and a larger proportion of medium-sized deviations
(as may happen in stock return distributions). Kurtosis of less than three
indicates a low peak with a fat midrange on either side (short tails, sub-Gaussian
type, e.g. the Bernoulli distribution).
A normal distribution is called mesokurtic and has a kurtosis of 3 (thin-tailed
relative to a leptokurtic distribution).
Difference between Skewness & Kurtosis
Skewness - measures the degree and direction of symmetry or
asymmetry of the distribution.
A normal or symmetrical distribution has a skewness of zero (0). But in
the operational loss results, normal distributions are hard to come by.
Therefore, a distribution may be positively skewed (skew to the right-loss
series; longer tail to the right; represented by a positive value) or
negatively skewed (skew to the left; longer tail to the left; with a
negative value-return series).
Kurtosis - measures how peaked a distribution is and the lightness or
heaviness of the tails of the distribution. In other words, how much of
the distribution is actually located in the tails?
A positive excess kurtosis (kurtosis above 3) means that the tails are heavier than a
normal distribution's and the distribution is said to be leptokurtic (with a
higher, more acute "peak"). A negative excess kurtosis means that the
tails are lighter than a normal distribution's and the distribution is said to
be platykurtic (with a smaller, flatter "peak").
Measures of Moments
Herfindahl-Hirschman Index (HHI)
The Herfindahl index is a commonly used measure of the
concentration/inequality of a distribution.
The Herfindahl index measures concentration as the sum of the
squared business shares of each loan in the pool (or portfolio), i.e.,

HHI = Σ_{n=1}^{N} (E_n / Σ_{n=1}^{N} E_n)² = Σ_{n=1}^{N} s_n²

where E_n is the loan exposure amount (Rs. Cr.) of borrower n and s_n is its share of the total.
The HHI is calculated by summing the squares of the portfolio share of
each contributor.
Theoretically, a perfectly diversified portfolio of 500 borrowers would
have HHI = 0.002. In contrast, if the bank portfolio is divided amongst
five zones in the ratio of 5:2:1:1:1, then the implied HHI by sector is
0.32, indicating a significant level of concentration.
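The two benchmark cases above can be checked directly. A minimal sketch of the HHI calculation:

```python
def hhi(exposures):
    """Herfindahl-Hirschman Index: sum of squared portfolio shares."""
    total = sum(exposures)
    return sum((e / total) ** 2 for e in exposures)

# A perfectly diversified portfolio of 500 equal borrowers: HHI = 1/500 = 0.002
equal = [1.0] * 500

# Five zones in the ratio 5:2:1:1:1: HHI = 0.5^2 + 0.2^2 + 3 * 0.1^2 = 0.32
zones = [5, 2, 1, 1, 1]
```

Both results match the values quoted on the slide.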
[Figure: Lorenz curve of cumulative % of loan share against cumulative % of
borrowers, by zone (Central_Z_I through West_Z_II)]
Mutual non-exclusion
Probability Axioms
Marginal probability
P(A)=relative frequency of occurrence
Few examples
For mutually exclusive and exhaustive events A and B: P(A)+P(B)=1
Few examples
Conditional Probability (without
replacement): Example 2
A box contains five yellow balls and two green balls. What is
the probability that three balls randomly taken from the box
(without replacement) will all be yellow?
A = first ball is yellow
B = second ball is yellow
C = third ball is yellow
P(A ∩ B ∩ C) = P(A) P(B/A) P(C/A ∩ B)
P(A) = 5/7, i.e. 5 yellow balls in a box of 7
P(B/A) = 4/6, i.e. 4 yellow balls left in a box of 6
P(C/A ∩ B) = 3/5, i.e. 3 yellow balls left in a box of 5
Thus:
P(A ∩ B ∩ C) = (5 × 4 × 3)/(7 × 6 × 5)
= 2/7
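The chain-rule answer can be verified exactly, both by multiplying the conditional probabilities and by brute-force enumeration of all ordered draws:

```python
from fractions import Fraction
from itertools import permutations

# Chain rule: P(A) * P(B|A) * P(C|A and B)
p_chain = Fraction(5, 7) * Fraction(4, 6) * Fraction(3, 5)

# Brute-force check: enumerate ordered draws of 3 balls from
# 5 yellow (Y) and 2 green (G), without replacement
balls = "YYYYYGG"
draws = list(permutations(range(7), 3))           # 7*6*5 = 210 ordered draws
favourable = sum(all(balls[i] == "Y" for i in d) for d in draws)
p_enum = Fraction(favourable, len(draws))         # 60/210 = 2/7
```

Both routes give exactly 2/7.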
Conditional Probability: Example1
Conditional Probability: Bayes’ Theorem
The conditional prob. P(Bi/A) of a specified event Bi, when A is stated
to have actually occurred, is given by:
P(Bi/A) = P(Bi) × P(A/Bi) / Σ_{i=1}^{n} P(Bi) × P(A/Bi)
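A small hypothetical illustration of Bayes' theorem (the grades, priors, and default likelihoods below are invented for the example, not taken from the slides): three borrower grades B1..B3 with prior shares P(Bi) and default likelihoods P(A/Bi); given that a default A occurred, the posterior P(Bi/A) follows from the formula above.

```python
# Hypothetical priors and likelihoods (illustrative values only)
priors = [0.5, 0.3, 0.2]          # P(B1), P(B2), P(B3)
likelihoods = [0.01, 0.05, 0.20]  # P(A|B1), P(A|B2), P(A|B3)

# Denominator: total probability of observing a default A
evidence = sum(p * l for p, l in zip(priors, likelihoods))

# Posterior P(Bi|A) for each grade
posteriors = [p * l / evidence for p, l in zip(priors, likelihoods)]
```

Here the riskiest grade B3, despite its small prior share, accounts for 0.04/0.06 = 2/3 of the posterior probability once a default is observed.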
Probability Distribution
In reality, there are an infinite number of possible outcomes for
the asset value. We represent the distribution of these possible
outcomes with a probability density function (which is linked to
the histogram).
The next figure shows a typical probability density function for
credit losses. Along the x-axis is the value of the assets; the
height of the function on the y-axis gives the probability of any
given loss occurring.
Higher uncertainty in the asset value increases the probability of
default on the debt (for a bond issuer/bank).
Results of 10 Credit-Loss Scenarios
Histogram of 10 Credit-Loss Scenarios
[Histogram of Asset_Value. Observations 10; Mean 102.09; Median 102.75;
Maximum 105.2; Minimum 96.5; Std. Dev. 2.859856; Skewness −0.805147;
Kurtosis 2.486903; Jarque-Bera 1.190132]
Probability Density for the Credit-Loss
Example
[Figure: probability density function for the credit-loss example]
Cumulative Probabilities
While the probability density tells us the probability of a variable
falling in a given range, cumulative probability depicts the
probability of the random variable falling below a given number.
The cumulative probability can be estimated by multiplying the
probability density by the bin width to get probabilities for each
bin, and then summing the probabilities for all values less than
or equal to a given number (less-than-type ogive) or greater than
or equal to it (more-than-type ogive).
Cumulative Probability for the Credit-Loss
Example (less than type Ogive)
[Figure: less-than-type ogive of cumulative probability for the credit-loss example]
Normal Probability Distribution
Normal Distribution
If we measured a randomly distributed characteristic very
accurately in a very large sample of cases, we would obtain a
frequency distribution that is symmetric and in which most cases
cluster around the mean.
Standard Normal Distribution
Standard Normal Distribution: a normal probability distribution with
mean 0 and SD 1.
Normal distributions differ from one another in terms of mean and
SD.
Two normal distributions can be compared through
standardization.
A new variable Z may be created from the normal distribution so
that Z has mean 0 and SD 1, where
Z = (Xi − X̄)/SD
The standard normal distribution can be used to compute
confidence intervals for probable price/loss/return ranges.
Most VaR models used to calculate economic capital assume the loss
distribution follows a standard normal distribution, and many
statistical credit scoring models likewise assume the error term is
standard normal.
Examples
Given that the daily change in price of a security follows the normal
distribution with a mean of 70 bps and a variance of 9. What is the
probability that on any given day the change in price is greater than
75 bps.
Z= (75-70)/3 =1.67
P(X>75)=P(Z>1.67)
=1-P(Z<1.67)= 1-0.9525=0.0475
Now estimate:
the probability of the change in price being 75 bps or less.
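The example above can be reproduced with the standard normal CDF, which is expressible through the error function in Python's standard library (no tables needed):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Daily price change ~ N(mean = 70 bps, variance = 9), so SD = 3
z = (75 - 70) / 3            # = 1.67 (approximately)
p_above = 1.0 - phi(z)       # P(change > 75 bps), approx 0.0475
p_at_most = phi(z)           # P(change <= 75 bps), the follow-up question
```

The second quantity answers the "75 bps or less" follow-up directly, since the two probabilities sum to one.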
Confidence Intervals for Standard Normal
Distribution
Normal distribution with 0 mean and 1 standard deviation is
called a Standard Normal Distribution.
In risk management, confidence levels are often more useful
than confidence intervals, because we are usually concerned
with the downside risk or worst-case loss level (tail risk).
A confidence level is a single threshold that will not be exceeded
with a given probability (1 − α).
Confidence Interval…Example
Suppose the mean operational loss X̄ = $434,045 and set the
significance level α = 5%, so that we have a (1 − α) = 95%
confidence interval around the estimate of the mean. Such an
interval can be calculated using:

X̄ ± z_α × Stdev(X̄)
Example: Credit Risk: Bond Default Rates
over 19 Years
Year Bond Default Rate (bp)
1982 125
1983 68
1984 84
1985 99
1986 175
1987 93
1988 146
1989 151
1990 256
1991 297
1992 121
1993 47
1994 52
1995 91
1996 43
1997 52
1998 116
1999 198
2000 212
Source: S&P's Credit Week, Jan 31, 2001
[Histogram of bond default rates. Series: Loss_rate_bsp; Observations 19;
Mean 127.6842; Median 116; Maximum 297; Minimum 43; Kurtosis 2.880478;
Jarque-Bera 2.270154]
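The summary statistics for the default-rate series can be reproduced from the 19 annual values in the table:

```python
import statistics as st

# Bond default rates (bp), 1982-2000, from the table above
rates = [125, 68, 84, 99, 175, 93, 146, 151, 256, 297,
         121, 47, 52, 91, 43, 52, 116, 198, 212]

mean = st.mean(rates)       # approx 127.68 bp
median = st.median(rates)   # 116 bp
lo, hi = min(rates), max(rates)
```

The computed mean, median, minimum, and maximum agree with the histogram annotations.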
Descriptive Statistics of Credit Loss
[Histogram of Asset_Value, as shown earlier. Observations 10; Mean 102.09;
Median 102.75; Maximum 105.2; Minimum 96.5; Std. Dev. 2.859856;
Skewness −0.805147; Kurtosis 2.486903; Jarque-Bera 1.190132]
[Histogram of historical LGD. Series: HIST_LGD; Sample 1–829; Observations 829;
Mean 0.751924; Median 0.937150; Maximum 1.000000; Minimum 0.000000;
Std. Dev. 0.323241; Skewness −1.160426; Kurtosis 3.063549;
Jarque-Bera 186.1932 (Probability 0.000000)]
Fitting Beta Distribution to Loan Loss
Fitted distribution: BetaGeneral(0.35405, 0.15230, 0.0000, 1.0000)

Statistic        Fit-Test     Input
a1 (shape)       0.354048     N/A
a2 (shape)       0.152297     N/A
min              0            N/A
max              1            N/A
Mean             0.69922      0.75192
Mode             N/A          1
Median           0.93079      0.93715
Std. Deviation   0.37365      0.32324
Variance         0.13962      0.10436
Skewness         -0.8509      -1.1604
Kurtosis         2.0652       3.0635
Fitted distribution (simulation): BetaGeneral(0.34846, 0.16739, 0, 1)

Statistic        Fitted       Input
a1               0.34846      N/A
a2               0.167393     N/A
min              0            N/A
max              1            N/A
Mean             0.6755       0.69935
Mode             N/A          1.0000 [est]
Median           0.89737      0.93
Std. Deviation   0.38027      0.37397
Variance         0.1446       0.13972
Skewness         -0.7338      -0.8509
Kurtosis         1.8714       2.0657
[Histogram of S&P daily returns. Series: SNP_RETURN; Sample 1–1275;
Observations 1275; Mean 0.001205; Median 0.002188; Maximum 0.079691;
Minimum −0.130539; Std. Dev. 0.014263; Skewness −1.088501;
Kurtosis 11.35109; Jarque-Bera 3956.755 (Probability 0.000000)]
Hypothesis Testing
Testing of hypothesis is one of the main objectives of Sampling
Theory. Hypothesis tests address the uncertainty of the sample
estimate.
When we have to make a decision about the entire population
based on the sample data, hypothesis tests help us in arriving at
a decision.
It attempts to refute a specific claim about a population
parameter based on the sample data.
The process which enables us to decide on the basis of the
sample results whether a hypothesis is true or not, is called Test
of Hypothesis or Test of Significance.
Hypothesis Testing Procedure
All hypothesis tests are conducted the same way. The researcher
states a hypothesis to be tested, formulates an analysis plan,
analyzes sample data according to the plan, and accepts or rejects
the null hypothesis, based on results of the analysis.
State the hypotheses. Every hypothesis test requires the analyst
to state a null hypothesis and an alternative hypothesis. The
hypotheses are stated in such a way that they are mutually
exclusive. That is, if one is true, the other must be false; and
vice versa.
Formulate an analysis plan. The analysis plan describes how to
use sample data to accept or reject the null hypothesis. It should
specify the following elements.
• Significance level. Often, researchers choose significance levels
equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can
be used.
Problem2: One-tailed test
Bon Air Elementary School has 300 students. The principal of
the school thinks that the average IQ of students at Bon Air is
at least 110. To prove her point, she administers an IQ test to
20 randomly selected students. Among the sampled students,
the average IQ is 108 with a standard deviation of 10. Based on
these results, should the principal accept or reject her original
hypothesis? Assume a significance level of 0.01.
Null hypothesis: µ = 110
Alternative hypothesis: µ < 110
Note that these hypotheses constitute a one-tailed test. The null
hypothesis will be rejected if the sample mean is too small.
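The test statistic for this example works out as follows (the critical value is the one-tailed t-table entry for 19 degrees of freedom at the 1% level):

```python
from math import sqrt

# Sample results from the IQ example
n, xbar, s = 20, 108.0, 10.0
mu0 = 110.0

# One-sample t statistic
t = (xbar - mu0) / (s / sqrt(n))   # approx -0.894

# One-tailed critical value t(0.01, df = 19) from a t-table
t_crit = -2.539
reject = t < t_crit                # False: fail to reject H0
```

Since t = −0.894 does not fall below −2.539, the principal fails to reject the hypothesis that the average IQ is at least 110.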
Hypothesis Testing: Bond Loss Example 1
Hypothesis Testing for LOSS_RATE_BSP
Date: 10/24/07 Time: 12:50
Sample: 1 19
Included observations: 19
Parametric-Mean Difference Test
Many problems arise where we wish to test hypotheses about the means of two
different populations (e.g. comparing ratios of defaulted and solvent firms or
comparing performance of public sector bank vis a vis private banks etc.)
Unpaired (two-sample) test: start by assuming H0 (equal means) is true
and use the following test statistic to arrive at a decision:

t = (X̄1 − X̄2) / √(s1²/n1 + s2²/n2)

A low p-value (< 0.05) will reject the null, and a high p-value (> 0.10)
will fail to reject the null.
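A minimal sketch of the unpaired test statistic, applied to hypothetical current ratios for defaulted versus solvent firms (the samples are invented for illustration):

```python
import statistics as st
from math import sqrt

def welch_t(a, b):
    """Unpaired two-sample t statistic (unequal variances allowed)."""
    ma, mb = st.mean(a), st.mean(b)
    va, vb = st.variance(a), st.variance(b)   # sample variances
    return (ma - mb) / sqrt(va / len(a) + vb / len(b))

# Hypothetical ratios: defaulted vs solvent firms (illustrative only)
defaulted = [1, 2, 3, 4, 5]
solvent = [2, 4, 6, 8, 10]
t = welch_t(defaulted, solvent)   # approx -1.897
```

The p-value would then be read from a t-distribution (in practice, from a statistical package) and compared with the 0.05/0.10 thresholds above.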
Errors of Testing
There are two kinds of errors that can be made in significance
testing: (1) a true null hypothesis can be incorrectly rejected and
(2) a false null hypothesis can fail to be rejected.
The former error is called a Type I error and the latter error is
called a Type II error.
Example: Classification Power of a
Statistical Scoring Model
Table: Classification Power of the Logistic Model 1 for the Holdout Sample of the
year 2003 & 2004
                          Predicted Group
Original Group     Defaulted     Solvent      Total
Defaulted          47 (94%)      3 (6%)       50 (100%)
Solvent            8 (16%)       42 (84%)     50 (100%)
Note: Figures in parentheses denote percentages.
% Correct Classification
Model (within sample)                                        Good      Bad
Altman Z-Score (1968), reworked with Indian data             84.00%    82.00%
Emerging Market Z-Score (1995), reworked with Indian data    88.20%    75.90%
NIBM Z-Score (2005), developed on Indian data                85.20%    91.00%
Calibrating & Benchmarking A Model
Popular Discrete Distributions: Rule of
Thumb for Identifying Them
Binomial Distribution, Poisson Distribution and Negative
Binomial Distribution
A useful rule of thumb for choosing between these popular
distributions is:
Binomial: variance < arithmetic mean
Poisson: variance = arithmetic mean
Negative Binomial: variance > arithmetic mean
Frequency Distributions
Poisson Distribution: number of frauds per month (January–August).
P(X = x) = e^(−λ) λ^x / x!,  x = 0, 1, 2, …
[Figure: observed monthly fraud counts with fitted Poisson frequencies]
Binomial Distribution
N = 12, p = 0.8
f(x) = [N!/(x!(N − x)!)] p^x (1 − p)^(N−x)
Mean = Np
Standard deviation:
σ = √(Np(1 − p))
The parameter p can be estimated by p̂ = x/N.
[Figure: binomial probabilities against number of events]
Poisson Distribution: observed credit-card fraud events

No. of events per day (i)   Observed frauds (ni)   i × ni
0                           19                     0
1                           16                     16
2                           51                     102
3                           9                      27
4                           6                      24
5                           5                      25
6                           4                      24
7                           6                      42
8                           2                      16
9                           1                      9
10                          0                      0
11                          0                      0
12                          2                      24
13                          1                      13
14                          0                      0
15                          2                      30
Total                       124                    352

F(x) = Σ_{k=0}^{x} e^(−λ) λ^k / k!,  e = 2.71828…, x = 0, 1, 2, …
Here, mean (lambda) λ = Σ(i × ni)/Σ(ni) = 352/124 = 2.84
and SD = √λ = √2.84 = 1.685
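The fitted mean and the expected (Poisson) frequencies can be computed directly from the observed table:

```python
from math import exp, factorial, sqrt

# Observed fraud counts per day, from the table above
counts = {0: 19, 1: 16, 2: 51, 3: 9, 4: 6, 5: 5, 6: 4, 7: 6,
          8: 2, 9: 1, 10: 0, 11: 0, 12: 2, 13: 1, 14: 0, 15: 2}

n = sum(counts.values())                             # 124 days
lam = sum(i * ni for i, ni in counts.items()) / n    # 352/124 = 2.84
sd = sqrt(lam)                                       # SD of a Poisson is sqrt(lambda)

def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)

# Expected frequencies under the fitted Poisson, for a goodness-of-fit check
expected = {i: n * poisson_pmf(i, lam) for i in counts}
```

The `expected` frequencies are what the chi-squared goodness-of-fit test on the next slides compares against the observed counts.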
Distribution of Credit Card Fraud Events
Distribution of Frauds frequency
[Bar chart: observed fraud frequency against number of events per day (0–15)]
[Fitted Poisson probabilities for higher numbers of events per day (11–24) are
all approximately 0.00%]
Chi-Sq. Goodness of Fit Test
The risk manager should run a fit test to confirm the right
selection of distribution.
One such test is the chi-squared goodness-of-fit test:
H0: the data follow a specified distribution (here, Poisson)
Ha: the data do not follow the specified distribution
The test statistic is calculated by dividing the data into n bins
(or ranges) and is defined as:
T̃ = Σ_{i=1}^{n} (Oi − Ei)²/Ei
where Oi is the observed number of events in bin i, Ei is the expected
(fitted) number of events, and n is the number of bins.
d.f. = n − 1 − k, where k refers to the number of parameters that had
to be estimated.
[Bar chart: observed numbers leaving per month (Series3) against fitted
Poisson frequencies, 0–5 leavers]
The Poisson distribution appears visually to fit the data fairly well.
The chi-squared test statistic T̃ = 1.51 is less than the critical value of 11.07 at 5
percent significance with 5 degrees of freedom (n − 1 = 6 − 1 = 5), so we fail to reject the
null hypothesis and conclude that there is no evidence to support the alternative
hypothesis that the observed distribution differs significantly from the expected
(Poisson) distribution. [In Excel, use the CHIINV(p, df) formula to obtain the critical value.]
Hence, the Poisson distribution fits the data fairly well.
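The same test can be sketched in a few lines. The observed counts and fitted mean below are invented for illustration (the slide's underlying data are not reproduced here):

```python
from math import exp, factorial

def chi_sq_stat(observed, expected):
    """T = sum over bins of (Oi - Ei)^2 / Ei."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical monthly attrition counts for 0..5 leavers (illustrative only)
observed = [15, 12, 8, 3, 1, 1]
n, lam = sum(observed), 1.0   # assumed fitted Poisson mean

# Expected frequencies under the fitted Poisson
expected = [n * exp(-lam) * lam ** k / factorial(k) for k in range(6)]

T = chi_sq_stat(observed, expected)
crit = 11.07       # chi-squared critical value, 5% significance, 5 d.f.
fits = T < crit    # fail to reject H0: the Poisson fit is adequate
```

In practice, bins with very small expected counts are usually merged before computing T, since tiny Ei values inflate the statistic.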
Testing the Fitness of Continuous
Distributions
Kolmogorov-Smirnov Test (K-S)
The Kolmogorov-Smirnov goodness-of-fit test assesses whether
a set of data comes from a hypothesized continuous
distribution.
It tends to be more sensitive near the center of the
distribution than at the tails.
H0: The data follow the specified distribution. Ha: The
data do not follow the specified distribution.
Test statistic:
D = max over i of |i/N − F(Yi)|
where F(Y) is the theoretical fitted distribution and
i/N is the empirical (actual) data distribution.
The hypothesized distributional form is
rejected if the test statistic, D, is greater than the
critical value obtained from a table.
You can run this test in software packages such as BestFit,
EasyFit, or Dataplot.
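The D statistic itself is simple to compute by hand. A minimal sketch, checked against a Uniform(0, 1) fit with an evenly spread illustrative sample (both one-sided gaps around each order statistic are examined, as in the usual two-sided form of the test):

```python
def ks_statistic(sample, cdf):
    """Maximum deviation between the empirical CDF and the fitted CDF F(y)."""
    ys = sorted(sample)
    n = len(ys)
    d = 0.0
    for i, y in enumerate(ys, start=1):
        f = cdf(y)
        # compare F(y) with the empirical CDF just before and at y
        d = max(d, abs(i / n - f), abs((i - 1) / n - f))
    return d

# Illustrative sample: 10 evenly spread points, fitted against Uniform(0,1)
sample = [(i + 0.5) / 10 for i in range(10)]
D = ks_statistic(sample, lambda y: y)   # = 0.05 for this sample
```

D would then be compared against the tabulated critical value for n = 10 at the chosen significance level.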
Anderson-Darling Test
The Anderson-Darling goodness-of-fit test assesses whether a data set comes
from a specified distribution.
It is a modification of the Kolmogorov-Smirnov (K-S) test and gives
more weight to the tails than the K-S test.
The K-S test is distribution free in the sense that the critical values do
not depend on the specific distribution being tested.
The Anderson-Darling test makes use of the specific distribution in
calculating critical values. This has the advantage of allowing a more
sensitive test and the disadvantage that critical values must be
calculated for each distribution.
You can run this test in software packages such as BestFit, EasyFit, or Dataplot.
More formally, the test is defined as follows.
H0: The data follows a specified distribution.
Ha: The data do not follow the specified distribution
For Test Statistic, See Statistics Book
Severity Distribution: Legal Liability Loss
[Histogram of legal liability losses (percent scale): Skewness 2.8064,
Kurtosis 15.3145]
[P–P and Q–Q plots for the fitted severity distribution: fitted vs. input
p-values and quantiles (values in millions)]
Exponential Probability Plot for Legal
Event Losses
Fitted distribution: Expon(149190), Shift = +1688.6
[P–P and Q–Q plots for the fitted exponential vs. input data (values in millions)]
Fitted vs. actual: Shift 1688.6 (actual N/A); b 149189.8 (actual N/A);
Minimum 1688.6 vs. 2754.2
[Density overlay: 90.0% of the fitted distribution lies between 0.009 and
0.449 million, with 5.0% beyond the upper marker]
Fitted Weibull Distribution to Cover the
Fat Tail
Fitted distribution: Weibull(1.2154, 192107), Shift = −26732

Statistic        Fitted           Actual
b                192106.5         N/A
Minimum          -26732           2754.2
Maximum          +infinity        1255736
Mean             153392           151944
Mode             19533            13551 [est]
Median           115363           103523
Std. Deviation   148922           170767
Variance         2.22E+10         2.90E+10
Skewness         1.492            2.8064
Kurtosis         6.0945           15.3145

[Density overlay and P–P and Q–Q plots for the fitted Weibull
(values in millions)]
Fitting Beta Distribution to Loan Loss
Beta Distribution
Mean = α/(α + β);  S.D. = √[αβ/((α + β)²(α + β + 1))]
Method-of-moments estimates:
α̂ = X̄ [(X̄(1 − X̄)/S²) − 1];  β̂ = (1 − X̄) [(X̄(1 − X̄)/S²) − 1]
Fitted Loss Distribution through
Simulation
VaR
Correlation and Dependence Analysis
Frequency-based joint dependence: using probability and set
theory (random sampling with or without replacement).
Pearson correlation coefficient (r_x,y):
Cov(x,y)/(SDx × SDy)
Spearman's rank correlation coefficient (ρ): for example,
the correlation between salary ratio and gross income
generation for 20 traders.
ρ = 1 − 6Σdi²/[n(n² − 1)], where the di are the differences of the
ranked pairs.
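Both coefficients can be sketched from their definitions. The data below are invented for illustration (two short series with identical Pearson and Spearman values, assuming no tied ranks):

```python
import statistics as st

def pearson(x, y):
    """r = Cov(x, y) / (SDx * SDy), using sample moments."""
    n = len(x)
    mx, my = st.mean(x), st.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return cov / (st.stdev(x) * st.stdev(y))

def spearman(x, y):
    """rho = 1 - 6*sum(d_i^2) / (n(n^2 - 1)); assumes no ties."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical paired observations (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
```

For data with ties, the rank formula needs a tie correction; packages handle that automatically.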
Econometric Models
Regression model
Simple Linear Regression (OLS)
Regression Analysis -- OLS
[Scatter plot of Y (0–18) against X (0–50) with a fitted OLS line]
Ordinary Least Squares (OLS)
• We have a set of data points and want to fit a line to the data.
• The most efficient estimator can be shown to be OLS, which minimizes
the squared distance between the line and the actual data points.
b̂ = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²   (the OLS estimator of b)

R² = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)² = (correlation)²

S.E.(b̂) = √[ (Σ(Yi − Ŷi)²/(n − 2)) / Σ(Xi − X̄)² ] = √[ (RSS/(n − 2)) / Σ(Xi − X̄)² ]

S.E.(â) = √Variance(â),  Variance(â) = [RSS/(n − 2)] × [1/n + X̄²/Σ(Xi − X̄)²]

Here, the R-squared is a measure of the goodness of fit of our model, while
the standard error of b̂ gives us a measure of confidence for our
estimate of b.
o The difference between TSS and RSS represents the improvement obtained
by adjusting Y to account for X.
o The measure of goodness of fit R² can be constructed by taking the ratio of
explained variance to total variance, i.e. R² = ESS/TSS = 1 − RSS/TSS.
o For a well-fitting model, ESS will be large, RSS will be small, and R² will
be large.
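The estimator and fit measure above can be sketched directly from the formulas, here on illustrative data with an exact linear relation (so R² = 1):

```python
import statistics as st

def ols_fit(x, y):
    """Simple OLS per the formulas above: returns (a_hat, b_hat, r_squared)."""
    mx, my = st.mean(x), st.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                  # slope estimator
    a = my - b * mx                # intercept: line passes through the means
    yhat = [a + b * xi for xi in x]
    tss = sum((yi - my) ** 2 for yi in y)
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    return a, b, 1 - rss / tss     # R^2 = 1 - RSS/TSS

# Illustrative data with the exact relation y = 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
a, b, r2 = ols_fit(x, y)
```

With noisy data, RSS would be positive and R² would fall below 1, exactly as the ESS/RSS decomposition describes.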
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.976786811
R Square 0.954112475
Adjusted R Square 0.94493497
Standard Error 27.08645377
Observations 7
ANOVA
df SS MS F Significance F
Regression 1 76274.47725 76274.48 103.9621 0.000155729
Residual 5 3668.379888 733.676
Total 6 79942.85714
b̂ / S.E.(b̂) = the t-ratio.
Combined with information in critical values from a “student-t”
distribution, this ratio tells us how confident we are that a value is
significantly different from zero.
Regression Fit
[Plot of TC(X) and predicted TC(X) against Q(X) (0–30)]
Using the above data, we estimate the regression equation using OLS:
Operational Loss = −$40,526 + [$155,470 × system downtime]
(standard errors: 176688.8 and 15945.9; t-ratios: −0.229 and 9.750)
R² = 0.9314; Adj. R² = 0.9216; F-stat = 95.06
Regression: Interpretation
Operational_loss day i=[intercept]+[slope × system downtime day i]
+[random error day i]
Source: M. Araten, M. Jacobs Jr. and P. Varshney (May 2004), "Measuring LGD on
Commercial Loans: An 18-Year Internal Study", RMA.
In matrix form:

[Y1]   [1  X21 …  Xk1] [β̂1]   [û1]
[Y2] = [1  X22 …  Xk2] [β̂2] + [û2]
[⋮ ]   [⋮   ⋮       ⋮] [⋮ ]   [⋮ ]
[Yn]   [1  X2n …  Xkn] [β̂k]   [ûn]

y = Xβ + u
β̂ = (X′X)⁻¹X′y
Regression Results
Regression analysis produces the following results:
For the whole regression:
R² measures the explanatory power of the regression model
(explained variance/total variance)
ANOVA (F-test and p-values: a test of overall goodness of fit)
For each X variable:
Regression coefficients (betas)
Standard error of the coefficient
t-test value for statistical significance (with p-values)
R2, Adjusted R2, F statistics for Model Fit
Example of regression analysis
Coefficients
Interpretation:
Health does not seem to depend on sex (P = 0.209 > 0.05).
Age, smoking and exercising have significant effects on health.
Age has the strongest effect (Beta = −0.316): the older the person, the weaker the
experienced health. Smoking has a negative effect and exercising a positive effect on
health.
In total, the model is statistically significant and explains 15.5% of the total variation in
experienced health.
Application of Multiple Regression: Ex1
Operational Loss = f (system downtime, no. of trainees working, no. of
experienced staff, volume of transactions, no. of transaction errors)
Dependent Variable: OPLOSS
Method: Least Squares
Date: 11/15/09 Time: 00:18
Sample: 1 10
Included observations: 9
Variable Coefficient Std. Error t-Statistic Prob.
Logistic Regression
Logistic regression in a nutshell:
It is a multiple regression with an outcome (dependent) variable
that is a categorical dichotomy and explanatory variables that can
be either continuous or categorical.
In other words, the interest is in predicting which of two
possible events are going to happen given certain other
information
For example in Political Science, logistic regression could be
used to analyse the factors that determine whether an
individual participates in a general election or not.
Why Can't We Use a Simple Linear Regression?
In particular, we want the ‘X’ to cause the ‘Y’ and not the
inverse.
Simple Linear Regression
How is this impact of X on Y estimated?
Simple Linear Regression provides the ‘best fit’ line. i.e.: the
straight line which best describes the relationship between the
two variables
Our example: R&D and New Products relation:
# of new products = α + β × Investment in R&D + u
[Scatter plot of NEWPROD (0–20) against RD (0–800) with a fitted line; at
the illustrated level of R&D, we would predict this company to develop
around 49 new products]
Another example: Failing or Passing an
exam
[Scatter plot of the binary OUTCOME variable (0/1) against hours of study,
with a linear probability fit]
What is wrong with LPM?
Coefficients (dependent variable: OUTCOME)
Model            B            Std. Error    Sig.
(Constant)       -0.031861    0.161591      0.846994
HSTUDY           0.026219     0.006483      0.001627
(Among other problems, the LPM can predict probabilities outside [0, 1] and
its errors exhibit heteroscedasticity.)
Non Linear Probability Models
Logistic Regression
Logistic regression, and related methods such as Probit analysis,
are very useful techniques when one wants to understand or to
predict the effect of a series of variables on a binary response
variable (a variable which can take only two values, 0/1 or
Yes/no, for example).
The methodology of logistic regression aims at modeling the
probability of success depending on the values of the
explanatory variables, which can be categorical or numerical
variables.
For example, a marketing researcher may want to detect if
customers are likely to renew their savings deposit/Loan Facility
Logistic regression can be helpful to model the effect of repeal
of a patent on profitability of textile firms or to examine the key
determinants of likelihood of a firm to export or to evaluate the
risk for a bank that a client will not pay back a loan
The Logit Model
A Logit Model states that:
Prob(Yi=1) = F (α + βXi)
Prob(Yi=0) = 1 - F (α + βXi)
Logit Models

F(α + βXi) = P(Yi = 1) = 1 / (1 + e^−(α + βXi))
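The logistic function above maps any linear index into a probability between 0 and 1. A minimal sketch for the exam example, with invented coefficients (the slides do not supply fitted values):

```python
from math import exp

def logit_prob(alpha, beta, x):
    """P(Y = 1 | x) = 1 / (1 + e^-(alpha + beta * x))."""
    return 1.0 / (1.0 + exp(-(alpha + beta * x)))

# Hypothetical coefficients: probability of passing as a function of
# hours studied (illustrative values only)
alpha, beta = -4.0, 0.1
p_20h = logit_prob(alpha, beta, 20)   # linear index -2: well below 0.5
p_60h = logit_prob(alpha, beta, 60)   # linear index +2: well above 0.5
```

At 40 hours the linear index is exactly zero, so the predicted probability is 0.5; unlike the LPM, the prediction can never leave the [0, 1] interval.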
How do we find the best Logistic Function
to fit our data?
Multiple Discriminant Analysis
Discriminant analysis is appropriate in situations where the
researcher may want to identify those variables/factors which
are effective in predicting group membership or what variables
discriminate well between groups.
- Altman (1968), for the first time, applied Multiple Discriminant Analysis
(MDA) in response to shortcomings of traditional univariate financial ratio
analysis.
MDA models are developed in the following steps:
- Establish a sample of two mutually exclusive groups: firms which have
"failed" and those which are still trading successfully.
- Collect financial ratios for each of the companies belonging to both
groups.
- Identify the financial ratios which best discriminate between the groups
(F-test/Wilks' Lambda test).
- Establish a Z score based on these ratios.
Altman’s Z-Score Model
The Z score and weights
The discriminant coefficients can be estimated using the following formulas
based on two variables:
Z = aX + bY, where X = TOL/TA and Y = CR;
where
a = {(VarY × (avg.Xsolv − avg.Xdef)) − (CovXY × (avg.Ysolv − avg.Ydef))}/((VarX × VarY) − (CovXY)²)
b = {(VarX × (avg.Ysolv − avg.Ydef)) − (CovXY × (avg.Xsolv − avg.Xdef))}/((VarX × VarY) − (CovXY)²)
where CovXY = Σ(X − avg.X)(Y − avg.Y)/(n − 1)
avg.Xsolv = mean of variable X for borrowers in the solvent category
avg.Xdef = mean of variable X for borrowers in the defaulted group
avg.Ysolv = mean of variable Y for borrowers in the solvent category
avg.Ydef = mean of variable Y for borrowers in the defaulted category
The cut-off Z-score is the combined benchmark on the identified
independent variables used to classify a prospective borrower into the
defaulted or solvent category.
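The two-variable formulas above can be sketched directly. The solvent and defaulted ratio samples below are invented for illustration (variances and covariance are computed on the pooled sample, one reasonable reading of the formulas):

```python
import statistics as st

def discriminant_weights(xs, ys, xd, yd, x_all, y_all):
    """a and b per the 2-variable formulas above: solvent vs defaulted
    group-mean differences, scaled by the inverse covariance structure."""
    var_x, var_y = st.variance(x_all), st.variance(y_all)
    mx, my = st.mean(x_all), st.mean(y_all)
    n = len(x_all)
    cov = sum((p - mx) * (q - my) for p, q in zip(x_all, y_all)) / (n - 1)
    det = var_x * var_y - cov ** 2
    dx = st.mean(xs) - st.mean(xd)       # avg.Xsolv - avg.Xdef
    dy = st.mean(ys) - st.mean(yd)       # avg.Ysolv - avg.Ydef
    a = (var_y * dx - cov * dy) / det
    b = (var_x * dy - cov * dx) / det
    return a, b

# Hypothetical ratios: X = TOL/TA, Y = current ratio (illustrative only)
x_solv, y_solv = [0.4, 0.5, 0.6], [2.0, 2.2, 1.8]
x_def,  y_def  = [0.9, 1.0, 1.1], [0.9, 1.1, 1.0]
a, b = discriminant_weights(x_solv, y_solv, x_def, y_def,
                            x_solv + x_def, y_solv + y_def)

# Mean Z scores of the two groups: solvent firms should score higher
z_solv = a * st.mean(x_solv) + b * st.mean(y_solv)
z_def  = a * st.mean(x_def)  + b * st.mean(y_def)
```

A cut-off between the two group means of Z would then classify new borrowers; solvent firms score above defaulted ones by construction of the weights.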
Multivariate Regression Interpretation
(Including Regime Shifting): Ex2
Repo Rate Determinants (using quarterly data from Nov. 2000 to Dec. 2007)
The following data represent the closing value
of the Dow Jones Industrial Average for the
years 1980 - 2001.
Monthly WPI Series
[Line chart of the monthly WPI (120–220), 1998–2007]
[Line chart of yield (6.5–9.5), 1990–1994]
ARIMA Technique
Any time series which contains no trend can be represented as consisting of two
parts: AR Process (lag dependent variable itself) and MA Process (lag error
dependence or serial correlation in the disturbance)
ARIMA model can improve forecasting power as it incorporates trend, cyclicality
and seasonality
STEPS in Building ARIMA model of forecasting:
A. Model Identification: stationarity check, identifying the level of stationarity
of the series (or order of integration), and specifying the AR and MA processes
Methods-
Correlogram analysis-studying the Auto correlation function (ACF) and partial
auto correlation function (PACF) lag structure
Dickey-Fuller Unit Root Test
B. Model Estimation: Having determined the orders of the ARIMA model, the
model can be estimated in either EVIEWS 5 or STATA 9 using differenced
regression technique.
C. Diagnostic Checks: Once the ARIMA model is specified and its parameters are
estimated, the adequacy of the model may be checked through the Box–Pierce–Ljung
residual test (a white-noise test of the residuals)
D. Forecasting: After diagnostic checks, the regression equation may be used to
generate short-term (static) or long-term (dynamic) forecasts
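The AR part of steps A–B can be illustrated with a simulated AR(1) series: generate y_t = φ y_{t−1} + e_t and recover φ by regressing y_t on its own lag (a minimal sketch; a real workflow would use EVIEWS/STATA and check stationarity first):

```python
import random
import statistics as st

# Simulate a stationary AR(1) process y_t = phi * y_{t-1} + e_t
random.seed(42)
phi_true, n = 0.7, 500
y = [0.0]
for _ in range(n):
    y.append(phi_true * y[-1] + random.gauss(0, 1))

# Estimate phi by OLS of y_t on y_{t-1}
y_lag, y_cur = y[:-1], y[1:]
mx, my = st.mean(y_lag), st.mean(y_cur)
phi_hat = (sum((a - mx) * (b - my) for a, b in zip(y_lag, y_cur))
           / sum((a - mx) ** 2 for a in y_lag))
```

With 500 observations the estimate lands close to the true φ = 0.7; for an MA component or differencing (the "I" in ARIMA), dedicated routines in a statistics package are the practical route.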
Graphical Presentation of ARIMA Process
AR (1) process:
MA (1) process:
Correlogram Study…
AR (2) process:
MA (2) process:
Note: If, after the first difference, the ACF or PACF spikes in
correlogram get eliminated the series is I(1); if it happens in
second difference then it is I(2) process
Useful References
Greene, W. H. (2007). “Econometric Analysis”, Fifth Edition, Low Price Edition,
Pearson Education.
Gujarati, D N (2004): “Basic Econometrics”, 4th Edition, Tata McGraw-Hill.
Johnston J., and DiNardo J (1997): “Econometric Methods”, 4th Edition, The
McGraw-Hill Companies, Inc. (Important for time series and panel data analysis).
Lewis, N. D. C “Operational Risk: Applied Statistical Methods for Risk
Management”, Wiley Finance.
Maddala, G S (1983): “Limited-Dependent and Qualitative Variables in
Econometrics”, Cambridge University Press.
Pindyck, R.S., and D. L. Rubinfeld (1981), “Econometric Models and Economic
Forecasts”, McGraw-Hill International Editions.
Vose, D. “A Guide to Monte Carlo Simulation Modeling”, John Wiley & Sons.
Walpole, R. E. (1982) “Introduction to Statistics”, Publisher: The Macmillan Co., NY.
Walpole, R. E., Sharon L Myers, Keying Ye, Raymond H. Myers (2006), “Probability
and Statistics”.
EVIEWS help, STATA help, SPSS 17 help etc.
@Risk and BestFit Software at Palisade: www.palisade.com.au
Thank You
My Email: arindam@nibmindia.org