Predicting Probability of Debt Default A Study of Corporate Debt Market in India and Other Countries

PREDICTING PROBABILITY OF DEBT DEFAULT:
A STUDY OF CORPORATE DEBT MARKET IN
INDIA AND OTHER COUNTRIES

THESIS SUBMITTED TO THE UNIVERSITY OF DELHI
FOR THE AWARD OF THE DEGREE OF
DOCTOR OF PHILOSOPHY
By
RAJEEV KUMAR UPADHYAY
Under the Supervision of
Dr. CHANDRA SHEKHAR SHARMA
&
Prof. R. K. SINGH
DEPARTMENT OF COMMERCE
DELHI SCHOOL OF ECONOMICS
UNIVERSITY OF DELHI
DELHI-110007
SEPTEMBER 2018
Electronic copy available at: https://ssrn.com/abstract=3814275

DECLARATION
I hereby declare that the thesis entitled “Predicting Probability of Debt Default: A
Study of Corporate Debt Market in India and other Countries” is an original
research work done by me, and that where the language, quotes, thoughts, expressions or
work of others have been used, appropriate credit has been given in the thesis itself.
Further, no part of the thesis, nor the whole, has been submitted to any university or
authority for the award of any degree or diploma.
Date:
Place:
Rajeev Kumar Upadhyay
Dr. Chandra Shekhar Sharma
(Supervisor)
Associate Professor, Department of Commerce
Shri Ram College of Commerce
University of Delhi
Delhi-110007

Prof. R. K. Singh
(Co-supervisor)
Professor, Department of Commerce
Delhi School of Economics
University of Delhi
Delhi-110007

Prof. Kavita Sharma
Head, Department of Commerce
Dean, Faculty of Commerce and Business
Delhi School of Economics
University of Delhi
Delhi-110007

DEPARTMENT OF COMMERCE
DELHI SCHOOL OF ECONOMICS
UNIVERSITY OF DELHI
DELHI-110007, INDIA
Date: _____________
CERTIFICATE OF ORIGINALITY
The research work embodied in this thesis entitled “Predicting Probability of Debt
Default: A Study of Corporate Debt Market in India and other Countries” has
been carried out by me at the Department of Commerce, Delhi School of Economics,
University of Delhi, Delhi, India. The thesis has been subjected to a plagiarism check
using the Turnitin plagiarism detection software. The work submitted for consideration
for the award of the Ph.D. is original.

(Supervisor)
Associate Professor, Department of Commerce
Shri Ram College of Commerce
University of Delhi
Delhi-110007

(Co-supervisor)
Professor, Department of Commerce
Delhi School of Economics
University of Delhi
Delhi-110007

Prof. Kavita Sharma
Head, Department of Commerce
Dean, Faculty of Commerce and Business
Delhi School of Economics
University of Delhi
Delhi-110007

STUDENT APPROVAL FORM
Name of the Author: Rajeev Kumar Upadhyay
Department: Department of Commerce
Degree: Doctor of Philosophy
University: University of Delhi
Supervisor: Dr. Chandra Shekhar Sharma (Supervisor)
Prof. R. K. Singh (Co-Supervisor)
Thesis Title: “Predicting Probability of Debt Default: A Study of
Corporate Debt Market in India and other Countries”
Year of Award:
Agreement
1. I hereby certify that, if appropriate, I have obtained and attached hereto a written
permission/statement from the owner(s) of each third party copyrighted matter to
be included in my thesis/dissertation, allowing distribution as specified below.
2. I hereby grant to the university and its agents the non-exclusive license to archive
and make accessible, under the conditions specified below, my thesis/dissertation,
in whole or in part, in all forms of media, now or hereafter known. I retain all
other ownership rights to the copyright of the thesis/dissertation. I also retain the
right to use in future works (such as articles or books) all or part of this thesis,
dissertation or project report.
Conditions:
1. Release the entire work for access worldwide
2. Release the entire work for ‘My University’ only for
1 Year
2 Years
3 Years
and after this time release the work for access worldwide
3. Release the entire work for ‘My University’ only while at the
same time releasing the following parts of the work (e.g.
because other parts relate to publications) for worldwide
access.
a) Bibliographic details and synopsis only.
b) Bibliographic details, synopsis and the following
chapters only.
c) Preview/Table of Contents/24 pages only.
4. View only (NO Downloads) (worldwide)

(Researcher)

Signature and Seal of the Supervisor Signature and Seal of the Co-supervisor
Place: Delhi
Date:

ACKNOWLEDGEMENT
I want to express my sincere gratitude to my esteemed supervisor, Dr. Chandra
Shekhar Sharma, an academician and teacher par excellence, who is an institution
in himself. His untiring zeal for research is admirable. Without his
continuous guidance, this study would not have seen the light of day. I wish to extend
heartfelt gratitude to my co-supervisor, Prof. R. K. Singh, who is a dynamic
personality, an excellent reviewer and above all a great human being. Their pearls of
wisdom, coupled with valuable suggestions and constructive criticism, have contributed
largely to the completion of this work.
I would like to extend my heartfelt thanks and regards to Prof. Kavita Sharma, Head
and Dean of the Department of Commerce, Delhi School of Economics, and to my
advisors, Prof. J. P. Sharma and Dr. Sunaina Kanojia, for actively scrutinizing my
work and providing valuable directions, which guided me greatly in refining it.
I would also like to thank the staff members of the Department of Commerce, Delhi
School of Economics, and the Ratan Tata Library for their support. I am thankful to
all my colleagues, friends and students, who helped me ungrudgingly whenever I was
in need.
I would also like to express my gratitude to my dear father, brother and wife for their
continuous support and encouragement in completing this monumental work in time,
through thick and thin. The debt I owe to my father, brother and wife for their
unconditional love, support, care and encouragement since the day I opened my eyes to
this world is supreme and cannot be expressed merely in words.
Above all, I remember the almighty God with humble prayers, who bestowed on me the
confidence and blessings to complete my work.
September, 2018 Rajeev Kumar Upadhyay

ABSTRACT
Default by corporate borrowers is a matter of concern for the banking and financial
services industry and for the economy as a whole. Timely prediction of default is
imperative for bankers and regulators so that they can devise and take preventive
action. A large number of studies have been conducted across the world, but only a few
have used Indian data. An extensive review of the literature reveals that the subject
of default prediction has not been explored much in India. In India, as in the rest of
the world, the number of defaults has been increasing every year. This creates scope
for further studies.
A comparative study has been conducted on data from three countries, namely India,
the US and the UK, using four methods: MDA, logit, reduced form and structural
models. For the Indian sample, the study has developed 24 models using three methods
(MDA, logit and reduced form) for two sample periods, a large sample period and a
small sample period, on a sample of 2450 cases over ten years from 1st April 2004 to
31st March 2014. Using two years of data from 1st April 2014 to 31st March 2016, the
developed models have been validated for the forward period as well as for
out-of-sample firms. The structural model has been tested on the Indian data. For the
American sample, the study has developed three models, one for each of the three
methods (MDA, logit and reduced form), and the structural model has been tested on
the American data. For the British sample, the study has likewise developed three
models, one for each of the three methods, and the structural model has been tested
on the British data. In total, 30 models have been developed in this study.
The study uses multiple discriminant analysis and logistic regression analysis to
calculate composite scores, which are used to predict default. The structural model
uses option mathematics to give the probability of default from the market value of
assets, the book value of debt and the asset drift rate. The reduced form model uses
Poisson regression to estimate the intensity of default, which is transformed into
the probability of default.
The findings for the Indian sample suggest that the logit model yields the highest
classification accuracy, 97.7%, and the reduced form model the lowest, 82.3%. The
models for the small sample period are robust, but the classification accuracies of
the models for the large sample period are higher across the different methods. The
classification accuracies of the 24 models developed for the Indian sample are higher
than those reported in many studies conducted in India and other countries. The
findings for the American sample suggest that the logit model yields the highest
classification accuracy, 95%, and the structural model the lowest, 85.9%; the
classification accuracies of the developed models are higher than those of many
studies. The findings for the British sample suggest that the structural model yields
the highest classification accuracy, 81.6%, and the reduced form model the lowest,
67.9%.
The findings on variables suggest that all three types of variables, namely
accounting, market and economic variables, are significant for all three samples
from India, the US and the UK. In the case of India, all the variables which are
significant for the large sample period are also significant for the small sample
period, indicating that in the long run firms tend to follow a certain pattern and
that default can be predicted with the help of a certain set of variables. In the
short run, however, market variables seem to play a minor role in default
prediction. The analysis of variables shows that the ratio of total book value of debt
to total assets is the most important of the variables with a debt component. The
biggest limitation of this study is that the models for the small sample period are
more robust, while the models for the large sample period have higher classification
accuracy. The study as a whole is also subject to sampling and selection biases.

TABLE OF CONTENTS
Title Page No.

List of Tables i
List of Abbreviations vi
CHAPTER 1: INTRODUCTION 1-16

1.1 Background 1
1.2 Importance of the Study 4
1.3 Past Studies 5
1.4 Research Gap 7
1.5 The Present Study 8
1.6 Objectives 8
1.7 Testable Research Hypotheses 9
1.8 Methods 10
1.8. Data 11
1.8.1. Sample 11
1.8.1.1. Indian Firms 11
1.8.1.2. American Firms 12
1.8.1.3. British Firms 12
1.8.2. Sample Period of the Study 12
1.8.3. Data Sources 12
1.8.3.1. Company specific data 12
1.8.3.2. Interest Rate Proxy 13
1.8.3.3. Market-Proxy 13
1.8.4. The Database 14
1.8.5. Statistical Analysis Packages 14
1.9 Results and Analysis 14
1.10 Organization of the Study 15
CHAPTER 2: CONCEPTUAL FRAMEWORK 17-23

2.1 Introduction 17
2.2 Classification of Default Prediction Models 17
2.2.1 Parametric Models 17
2.2.2 Non-parametric Models 18
2.3 Evolution of Credit Risk Models 18
2.3.1 Fundamental Analysis 18
2.3.2 Univariate Methods 19
2.3.3 Multivariate Methods 19
2.3.4 Structural Models 20
2.3.5 Reduced Form Models 20
2.3.6 Artificial Neural Networks 21
2.3.7 Hybrid Models 22
2.3.8 Data Envelopment Analysis 22
2.4 Limitations in Default Prediction Studies 22
2.4.1 Sampling Bias 22
2.4.2 Selection Bias 23
CHAPTER 3: REVIEW OF LITERATURE 24-60

3.1 Introduction 24
3.2 Earlier Studies 24
3.3 Fundamental Analysis 26
3.4 Univariate Analysis 27
3.5 Multivariate Analysis 29
3.5.1 Discriminant Analysis 29
3.5.2 Regression Methods 35
3.6 Structural Models 41
3.7 Reduced Form Models 47
3.8 Other Models 50
3.9 Indian Perspective 51
3.10 The American Perspective 59
3.11 The British Perspective 59
3.12 Conclusion 60
CHAPTER 4: THEORETICAL FRAMEWORK AND METHODS 61-76

4.1. Introduction 61
4.2. Methods Used in the Study 61
4.2.1. Multiple/Multivariate Discriminant Analysis (MDA) Approach 61
4.2.1.1. Variables Used in the Model 63
4.2.2. Logistic Regression Model 63
4.2.3. Structural Models 64
4.2.4. Reduced Form Model 68
4.3. Variables Used in the Study 70
4.3.1. Accounting Variables 70
4.3.2. Market Variables 72
4.3.3. Economic Variables 73
4.3.4. Categorical Variables 73
4.3.5. Variables for Structural Model 74
4.4. Data 74
4.4.1. Sample 74
4.4.2. Sample Period of the Study 74
4.4.3. Data Sources 75
4.4.4. Statistical Analysis Packages 76
CHAPTER 5: ANALYSIS OF DEFAULT PREDICTION MODELS FOR INDIAN, US AND UK CORPORATE DEBT 77-266

Section I: Indian Sample 77-144
5.1.1. Multiple Discriminant Analysis 77
5.1.1.1. All Firms 78
5.1.1.2. Large Firms 94
5.1.1.3. Large Firms with PSU 112
5.1.1.4. Small and Medium Enterprises 126
5.1.1.5. Findings from the MDA Study 139
5.1.2. Logistic Regression 144
5.1.2.1. All Firms 144
5.1.2.3. Large Firms with Public Sector Units 164
5.1.2.4. Small and Medium Enterprises 173
5.1.2.5. Findings from Logistic Regression Analysis 181
5.1.3. Reduced Form Models 185
5.1.3.1. All Firms 185
5.1.3.3. Findings from the Reduced Form Model 203
5.1.5. Inferences from the Study 207
5.1.6. Conclusion 210
5.1.7 Limitations and Further Scope for Studies 211
Section II: American Sample 212-238

5.2.6 Conclusion 237
5.2.7 Limitations and Further Scope for Studies 238
Section III: British Sample 239-265

5.3.6. Limitations 264
5.3.7. Conclusion 264
Section IV: Comparative Study of Default Prediction Models 265-266
5.4.1. Variables in Equation 265
5.4.2. Statistical Robustness 266
5.4.3. Classification Accuracy 266
CHAPTER 6: SUMMARY AND CONCLUSION 267-276

6.1. Introduction 267
6.2. Summary of Findings 267
6.2.1. Findings from Indian Sample 267
6.2.2. Findings from the American Sample 270
6.2.3. Findings from the British Sample 271
6.2.4. Findings from Comparative Study 272
6.3. Conclusion 274
6.4. Limitations of the Study 275
6.5. Further Scope for Studies 276
BIBLIOGRAPHY 277-285
APPENDICES 286-293

LIST OF TABLES
Table No. Page No.
Table No. 5.1.1.1.1 : Case Processing Summary 79
Table No. 5.1.1.1.2 : Tests of Equality of Group Means 80
Table No. 5.1.1.1.3 : Log Determinants 81
Table No. 5.1.1.1.4 : Box's M Test 82
Table No. 5.1.1.1.5 : Eigenvalues 82
Table No. 5.1.1.1.6 : Wilks' Lambda 83
Table No. 5.1.1.1.7 : Structure Matrix 84
Table No. 5.1.1.1.8 : Standardized Canonical Discriminant Function Coefficients 85
Table No. 5.1.1.1.9 : Canonical Discriminant Function Coefficients (Unstandardized) 87
Table No. 5.1.1.1.10 : Prior Probabilities for Groups 88
Table No. 5.1.1.1.11 : Functions at Group Centroids 88
Table No. 5.1.1.1.12 : In Sample Classification Result 89
Table No. 5.1.1.1.13 : Forward Testing (In Sample Firms) 90
Table No. 5.1.1.1.14 : Out of Sample Validation 91
Table No. 5.1.1.2.1 : Case Processing Summary 96
Table No. 5.1.1.2.2 : Tests of Equality of Group Means 97
Table No. 5.1.1.2.3 : Log Determinants 98
Table No. 5.1.1.2.4 : Test Results 99
Table No. 5.1.1.2.5 : Eigenvalues 99
Table No. 5.1.1.2.6 : Wilks' Lambda 100
Table No. 5.1.1.2.7 : Structure Matrix 101
Table No. 5.1.1.2.8 : Standardized Canonical Discriminant Function Coefficients 102
Table No. 5.1.1.2.9 : Unstandardized Canonical Discriminant Function Coefficients 104
Table No. 5.1.1.2.14 : Out of Sample (Validation) 110
Table No. 5.1.1.3.4 : Test Results 116
Table No. 5.1.1.3.6 : Wilks' Lambda 117
Table No. 5.1.1.4.4 : Box's M Test Results 130
Table No. 5.1.1.4.6 : Wilks' Lambda 131
Table No. 5.1.2.1.2 : Omnibus Tests of Model Coefficients 146
Table No. 5.1.2.1.3 : Model Summary 147
Table No. 5.1.2.1.4 : Hosmer and Lemeshow Test 148
Table No. 5.1.2.1.5 : Variables in Equation 149
Table No. 5.1.2.1.8 : Out of Sample Results (Validation) 152
Table No. 5.1.2.2.8 : Out of Sample Results (Validation) 162
Table No. 5.1.3.1.2 : Statistical Coefficients 187
Table No. 5.1.3.2.2 : Statistical Coefficients 196
Table No. 5.1.4.1 : Prior Probabilities for Groups 205
Table No. 5.1.4.2 : D2 and Default Probabilities 206
Table No. 5.1.4.3 : Classification Accuracy 206
Table No. 5.2.1.1 : Analysis Case Processing Summary 213
Table No. 5.2.1.2 : Group Statistics 214
Table No. 5.2.1.3 : Tests of Equality of Group Means 215
Table No. 5.2.1.4 : Log Determinants 216
Table No. 5.2.1.5 : Test Results 217
Table No. 5.2.1.6 : Eigenvalues 217
Table No. 5.2.1.7 : Wilks' Lambda 218
Table No. 5.2.1.8 : Structure Matrix 219
Table No. 5.2.1.9 : Standardized Canonical Discriminant Function Coefficients 220
Table No. 5.2.1.10 : Canonical Discriminant Function Coefficients (Unstandardized) 221
Table No. 5.2.1.11 : Prior Probabilities for Groups 222
Table No. 5.2.1.12 : Functions at Group Centroids 222
Table No. 5.2.1.13 : Classification Results 223
Table No. 5.2.2.2 : Omnibus Tests of Model Coefficients 225
Table No. 5.2.2.3 : Model Summary 226
Table No. 5.2.2.4 : Hosmer and Lemeshow Test 226
Table No. 5.2.2.5 : Variables in the Equation 227
Table No. 5.2.2.6 : Classification Results 228
Table No. 5.2.3.1 : Case Processing Summary 230
Table No. 5.2.3.2 : Statistical Coefficients 230
Table No. 5.2.3.3 : Variables in Equation 233
Table No. 5.2.3.4 : Classification Result 233
Table No. 5.3.1.2 : Group Statistics 240
Table No. 5.3.1.3 : Tests of Equality of Group Means 242
Table No. 5.3.1.4 : Log Determinants 243
Table No. 5.3.1.5 : Test Results 243
Table No. 5.3.1.6 : Eigenvalues 244
Table No. 5.3.1.7 : Wilks' Lambda 244
Table No. 5.3.1.8 : Structure Matrix 245
Table No. 5.3.1.9 : Standardized Canonical Discriminant Function Coefficients 246
Table No. 5.3.1.10 : Canonical Discriminant Function Coefficients (Unstandardized) 247
Table No. 5.3.1.12 : Functions at Group Centroids 248
Table No. 5.3.1.13 : Classification Results 249
Table No. 5.3.2.2 : Omnibus Tests of Model Coefficients 252
Table No. 5.3.2.3 : Model Summary 252
Table No. 5.3.2.4 : Hosmer and Lemeshow Test 253
Table No. 5.3.2.5 : Variables in Equation 253
Table No. 5.3.2.6 : Classification Results 254
Table No. 5.3.3.1 : Case Processing Summary 256
Table No. 5.3.3.2 : Statistical Coefficients 256
Table No. 5.3.3.3 : Regression Estimate 259
Table No. 5.3.3.4 : Classification Result 259
Table No. 5.4.1.1 : Classification Accuracies 266

LIST OF ABBREVIATIONS
BSE : Bombay Stock Exchange
CA/CL : Current Assets to Current Liabilities Ratio
CG : Credit Growth
D/E : Debt to Equity ratio
df : Degrees of Freedom
DPM : Default Prediction Model
EBIT/INT : Earnings before Interest and Tax to Interest Expense Ratio
EBIT/TA : Earnings before Interest and Tax to Total Assets Ratio
FAT : Fixed Assets Turnover Ratio
GDP : Gross Domestic Product
GRTA : Growth in Total Assets
ITR : Inventory Turnover Ratio
Log(TA/GNP) : Log of Total Assets to GNP Index Ratio
LOGIT : Logistic Regression Analysis
MDA : Multiple Discriminant Analysis
MP/BV : Market Price to Book Value Ratio
MP/EPS : Market Price to Earnings per Share Ratio
MVE/TBD : Market Value of Equity to Total Book Value of Debt Ratio
NI/TA : Net Income to Total Assets Ratio
NP/TE : Net Profit to Total Equity Ratio
OCFR : Operating Cash Flow Ratio
RE/TA : Retained Earnings to Total Assets Ratio
Sales/TA : Sales to Total Assets Ratio
SG : Sales Growth
SG/GNPG : Sales Growth to Gross National Product Growth Ratio
TBD/TA : Total Book Value of Debt to Total Assets Ratio
UK : The United Kingdom
US : The United States of America
WC/TA : Working Capital to Total Assets Ratio

Chapter 1
Introduction
1.1 Background
The banking and financial system is the nervous system of any economy. It mobilizes
financial resources from one sector to another. Of these resources, a large portion is
lent to the corporate sector. A part of this lending is either not recovered in time or
never recovered. This situation is termed default. Events of default generate risk for
the whole banking system and the economy, as they disturb the flow of resources to
both the household and the industrial sectors.
As per the list of defaulters prepared by CIBIL, the number of defaults has increased
every year since 2012. In 2016, the number of defaulters with debt of one crore or more
exceeded fifty thousand, and the number of wilful defaulters who had defaulted on
repayment of 25 lakhs or more exceeded thirty thousand. This is only a partial list,
and it clearly indicates the problem that the banking and financial services industry
is facing. Nor is this a new trend: the banking system has faced this problem for a
long time. This has led the banking and financial system to devise ways and means to
predict events of default beforehand, so that preventive measures can be activated in
time. This objective has given rise to different methods of credit evaluation and
default prediction.
The earlier avatars of credit rating agencies, such as the Mercantile Agency (1841) and
Dun and Bradstreet (1849), used fundamental analysis to provide independent
credit reports. These independent credit reports were prepared with the help of three
types of institutions, namely credit-reporting agencies, the specialized financial press
and investment bankers (Sylla, 2002). In 1909 John Moody devised a scale for rating
the credit quality of risky borrowers by fusing the functions of these three
institutions (Benzschawel, 2015). This started a new era of default studies, which
combined fundamental analysis with ratio analysis to develop a credit evaluation
framework.
Wall (1919), Smith (1930), Smith and Winakor (1935), Fitzpatrick (1932), Durand
(1941) and Merwin (1942) applied univariate methods to financial ratios to establish
a relationship with default for the purpose of credit reporting. This was the first
generation of credit evaluation frameworks. But the seminal works of Beaver (1966) and
Altman (1968) changed the whole course. Beaver (1966) used univariate models,
while Altman (1968) used multiple discriminant analysis to provide a credit
scoring framework (Z-score). Altman's (1968) Z-score model was the starting point of
the first generation of credit evaluation models. Ohlson (1980) used logistic
regression (O-score) to offer a better credit scoring system by avoiding many of the
peculiarities of the MDA assumptions. Zmijewski (1984) used the probit method to
provide a better credit scoring model, with the purpose of removing the limitations
of the logistic regression method.
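As a minimal illustration of how composite scores of this kind are used, the sketch below computes a Z-score as a weighted sum of five financial ratios and maps a logit score to a probability. The weights and the cut-offs of 1.81 and 2.99 are those commonly quoted for Altman's (1968) original model for listed manufacturing firms; they are an assumption made here for illustration and are not the coefficients estimated in this study. The function names and the example ratios are hypothetical.

```python
import math

# Assumption: commonly quoted Altman (1968) weights, with ratios given as decimals.
# These are NOT the coefficients estimated in this thesis.
ALTMAN_1968_WEIGHTS = {
    "WC/TA": 1.2,     # working capital to total assets
    "RE/TA": 1.4,     # retained earnings to total assets
    "EBIT/TA": 3.3,   # earnings before interest and tax to total assets
    "MVE/TBD": 0.6,   # market value of equity to total book value of debt
    "Sales/TA": 1.0,  # sales to total assets
}


def z_score(ratios):
    """Composite discriminant score: weighted sum of the five ratios."""
    return sum(weight * ratios[name] for name, weight in ALTMAN_1968_WEIGHTS.items())


def zone(z, lower=1.81, upper=2.99):
    """Classify a Z-score using the commonly quoted cut-offs (assumed here)."""
    if z < lower:
        return "distress"
    if z > upper:
        return "safe"
    return "grey"


def logit_probability(score):
    """Logistic transform used by logit models: maps a linear score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical firm, for illustration only.
example_ratios = {
    "WC/TA": 0.10,
    "RE/TA": 0.05,
    "EBIT/TA": 0.02,
    "MVE/TBD": 0.40,
    "Sales/TA": 0.80,
}
example_z = z_score(example_ratios)
example_zone = zone(example_z)
```

The point of the sketch is the structure rather than the numbers: MDA produces a linear score that is compared with cut-off values, whereas a logit model passes a linear score through the logistic function to obtain a probability directly.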
The second generation of credit evaluation models started with Black and Scholes
(1973) and Merton (1973). These two studies, along with Merton (1974) on the pricing
of corporate debt, laid the foundation for the structural model (Merton, 1973). The
Merton model for credit risk takes a different perspective on the default process from
that of the scoring models: it is a single-period model which derives default
probabilities from the random variation in the unobservable value of the firm's assets
(Merton, 1974).
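A textbook form of this calculation can be sketched as follows. It assumes that the asset value follows a lognormal process and that default occurs if assets fall below the face value of debt at the horizon; the probability of default is then the standard normal distribution function evaluated at the negative of the distance to default. The asset volatility input, the one-year default horizon and the function names are assumptions made for illustration, and the exact specification used later in the study may differ.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_default_probability(asset_value, debt, drift, volatility, horizon=1.0):
    """Probability that assets are below debt at the horizon (textbook Merton form).

    asset_value : market value of the firm's assets
    debt        : book (face) value of debt due at the horizon
    drift       : expected growth rate of assets per year
    volatility  : asset volatility per year (assumed input)
    horizon     : time to the debt maturity in years
    """
    if asset_value <= 0 or debt <= 0 or volatility <= 0 or horizon <= 0:
        raise ValueError("asset_value, debt, volatility and horizon must be positive")
    distance_to_default = (
        math.log(asset_value / debt) + (drift - 0.5 * volatility ** 2) * horizon
    ) / (volatility * math.sqrt(horizon))
    return normal_cdf(-distance_to_default)


# Hypothetical inputs: the second firm carries more debt against the same assets.
pd_low_leverage = merton_default_probability(100.0, 50.0, drift=0.05, volatility=0.25)
pd_high_leverage = merton_default_probability(100.0, 90.0, drift=0.05, volatility=0.25)
```

With assets, drift and volatility held fixed, a higher debt level shortens the distance to default and raises the estimated probability, which is the mechanism the structural approach relies on.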
Reduced form models can be considered the third generation of credit risk models.
The starting point of reduced form models is Merton (1976), which modelled
bankruptcy as a continuous probability of default (Gugole, 2016). The study assumes
that at the time of default the price of the stock of the defaulting firm is zero.
Option mathematics is used to arrive at the value of the defaulting firm. This method
models bankruptcy as a statistical process rather than through a microeconomic model
of the firm's capital structure, as is the case with structural models (Merton, 1976).
Many experts, however, consider Jarrow and Turnbull (1995) to be the first study on
the reduced form model. Jarrow and Turnbull (1995) is an extension of Merton (1976) to
a random interest rate framework. The model uses multi-factor, dynamic analysis of
interest rates to arrive at the probability of default (Jarrow & Turnbull, 1995).
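The idea of treating default as a statistical process can be illustrated with a constant default intensity. As noted in the abstract, the reduced form model in this study estimates an intensity by Poisson regression and transforms it into a probability of default. The sketch below assumes a log-linear intensity and a constant intensity over the horizon, so that the probability of default within the horizon is one minus the survival probability. The coefficients, covariates and function names are hypothetical, and the sketch does not reproduce the random interest rate framework of Jarrow and Turnbull (1995).

```python
import math


def default_intensity(coefficients, covariates, intercept=0.0):
    """Log-linear intensity, as in a Poisson regression: exp(b0 + sum(b_i * x_i))."""
    if len(coefficients) != len(covariates):
        raise ValueError("coefficients and covariates must have the same length")
    linear_predictor = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return math.exp(linear_predictor)


def default_probability(intensity, horizon=1.0):
    """Probability of at least one default event within the horizon.

    Assumes a constant intensity, so survival over the horizon is
    exp(-intensity * horizon).
    """
    if intensity < 0 or horizon < 0:
        raise ValueError("intensity and horizon must be non-negative")
    return 1.0 - math.exp(-intensity * horizon)


# Hypothetical coefficients and firm covariates, for illustration only.
example_intensity = default_intensity(
    coefficients=[2.0, -1.5], covariates=[0.6, 0.1], intercept=-3.0
)
example_pd = default_probability(example_intensity, horizon=1.0)
```

Because the intensity is always positive under the log-linear form, the resulting probability always lies between zero and one and increases with both the intensity and the length of the horizon.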
Despite so many models, every year billions of dollars are lost by the banking and
financial system across the world. In this era of multinational corporations, this
poses a huge risk to the stability of economies, since some corporations and banking
and financial institutions are far bigger than the GDP of many economies
(PwC, 2017). This is itself a problem. In recent times, the world has experienced
the sub-prime crisis in the US, which eventually affected the whole world adversely,
and many economies are still recovering. The events which followed the sub-prime
crisis show the impact that debt default can have on an economy in particular and on
life in general.
Globalization and the increasing economic and financial integration of world
economies have also increased the frequency of such crises: risk in one economy can
easily contaminate other economies and spread like a forest fire (the sub-prime crisis
in the US is an example). This situation poses a huge challenge to central banks and
regulators across the world. Every year, across the world, some big firms and
numerous small firms fail to service their debt obligations.
In the year 2016-17 alone, around 200 firms listed on Indian stock exchanges were
declared wilful defaulters. Similarly, default by thousands of unlisted firms causes
huge revenue and exchequer losses for India every year. This situation requires a
more prudent system of default prediction designed specifically for the Indian
economic and financial structure.
In India, unlike in many developed economies, the debt market is still at an early
stage of development. Most corporations are completely dependent on banking and
financial institutions for finance. At the same time, these debts, apart from a few
bonds, are not traded on exchanges. Apart from the audited financial statements, there
is no publicly available information about these debts, such as credit ratings or the
credit and default history of their servicing. Banking and financial institutions also
keep their methods of credit assessment and their lending practices confidential.
Together, these factors make the prediction of default very difficult, and models
that have been developed in developed economies may not work in Indian conditions.
At the academic level, only a few studies have been conducted on Indian firms in
comparison with developed markets, perhaps because of the reasons mentioned above.
These peculiarities call for more India-specific studies, as India is growing at a
high rate and is expected to play a vital role in the world economy. Unpredictability
in the banking and financial system of India may be a cause of concern for the whole
world economy. This study is an effort in that direction. It tries to evolve a
method, on the basis of the existing knowledge base, which can produce better results.
1.2 Importance of the Study
Banks direct a large portion of their total lending to corporate borrowers, but some
of these companies default every year. According to market information, hundreds of
listed companies fail to meet their financial obligations every year. For banks and
other lending firms, such information is of great importance.
For banks and regulators, timely and accurate predictions of borrowers’ default
probability hold the key to developing a responsive and effective tool for risk
management. The Basel Committee on Banking Supervision (BCBS) holds that effective
provisioning of required capital demands timely and reliable early warning
indicators of changes in the default risk of borrowers (Basel Committee, 1999).
The composition of the Indian debt market and the availability of debt to the
corporate sector are quite different from those of developed markets. Unlike in
developed markets, in India bank loans dominate the corporate debt market, and very
few Indian companies issue bonds to raise debt. This is because of regulatory reasons
as well as the lower acceptance of bonds in the Indian market. This situation makes
credit ratings by rating agencies largely irrelevant for banks and financial
institutions, as they have to deal with hundreds of corporate loan applications every
day. Apart from this, banks are required by regulation to lend to firms in certain
priority sectors on easier terms (RBI, 2015). This factor also affects the overall
assessment of debt risk and the prediction of corporate defaults, and it needs to be
considered when conducting an assessment. So, for assessing the financial position of
corporate applicants, banks have developed their own internal methods of risk and
default assessment.
Over time, because of changes in financial reporting and the economic environment,
the efficiency of conventional (accounting-based) methods has been questioned. For
example, with the proliferation of high-technology environments in the manufacturing
and services sectors and the introduction of financial derivative instruments,
intangibles have gained progressively greater significance in balance sheets. This
has placed limits on the utility of conventional methods. This situation calls for
more studies that can better predict default probabilities by using the available
accounting, market, economic, qualitative and categorical (dummy) variables for bonds
and bank loans. This study is aimed at achieving that goal.
1.3 Past Studies
Ever since the practice of lending started, lenders have been worried about
borrowers’ ability to return the money. Different ways and methods have been
evolved and used to assess it, and in the process the modern practices of credit
rating and default prediction, internal as well as external, have evolved. The
importance of credit rating and default prediction has increased significantly with the
development of the organized banking and financial services industry, as the whole
world economy now depends considerably on it. The prediction of default has therefore
become crucial to banks, financial institutions and regulators, as the cost of
default is very high for the whole economic system.
Among the large body of work in this direction, Moody’s (1909) devised a scale of
credit rating, which was followed by fundamental and ratio analysis.
FitzPatrick (1932) and Durand (1941) used different ratios to arrive at their
conclusions, an approach which was complex and lengthy but which encouraged further
studies. Since then, different methods and models have evolved, and thousands of
studies across the world have been performed using different data sets and variables,
depending upon the respective economic realities. These studies can broadly be
grouped into three categories, namely empirical (or statistical) models, structural
models and reduced form models.
Beaver (1966) and Altman (1968) are the starting point for empirical models, which
over time have used different methods such as multiple discriminant analysis, logistic
regression and probit. The seminal work of Beaver (1966) used a univariate method and
exhibited the importance of different financial ratios in predicting the failure of
firms, as well as the role of the credit extension practices followed by the banking
and financial sector (Beaver W., 1966). But Altman’s (1968) Z-score model, which used
multiple discriminant analysis, marks the beginning of the modern era of credit
assessment methods (Altman E. I., 1968). Ohlson (1980) used logistic
regression to arrive at a credit score.
A comparison of the ZETA model and the Z-score model shows that the ZETA model
provides improved accuracy. The reasons may be the more relevant data and the larger
number of industrial firms used in the ZETA analysis (Altman E. I., 2000). Liang
(2003) conducted a comparative study of MDA and the logistic regression model on
Chinese firms, showing that logistic regression provides relatively higher
predictive ability than MDA (Liang, 2003). Bandyopadhyay (2006)
developed a Z-score model which exhibited high predictive power in detecting
possible defaulting firms. The developed model outperformed the original Altman
model in the Indian context. The new Z-score model can provide early signals about
firms’ financial positions. The analysis of the logit model clearly indicates that the
inclusion of financial and non-financial parameters improves the predictive ability of
the model. The default probability calculated using logit can be helpful in deciding
credit risk capital as well as in the pricing of debt (Bandyopadhyay A., 2006).
Wilcox (1971) found that firms tend to default when the value of assets is lower than
that of the liabilities or when liquidity is lacking (Wilcox, 1971). Merton (1974) and
Black and Cox (1976), along with Wilcox, mark the beginning of the era of structural
models, which have been carried forward by Longstaff and Schwartz (1995), Leland
and Toft (1996), Ericsson and Reneby (1998), Collin-Dufresne and Goldstein
(2001), Patel and Pereira (2005), Bandyopadhyay (2005) and Sharma, Singh &
Upadhyay (2014). These studies have estimated expected default probabilities (EDPs)
for failed and non-failed firms. A few financial and structural ratios can help to
understand the same (Scott, 1981).
Bandyopadhyay (2007) finds that using the information from structural models
along with credit scoring models, such as the logistic regression model, can improve
the accuracy of prediction. Financial statement information along with firm-specific
information improves the predictive power of the model (Bandyopadhyay, 2007).
1.4 Research Gap
Altman (1968), Altman (2005), Beaver (1966), Beaver (2004), Saretto (2004), Frade
(2008) and Aguado (2012) found that ratios can predict the probability of bond default
and corporate bankruptcy. Duffie (2007), Duan (2011) and Patel (2005) have found some
structural models to be efficient in predicting the probability of bond defaults by
corporates. Bandyopadhyay (2006), Gupta (2014), Jayadev (2006) and Sen Chaudhury
(1999) have used MDA to develop models for Indian firms. Bandyopadhyay (2006)
has also used the logistic regression method. Similarly, there are very few studies on
structural models, such as Bandyopadhyay (2007) and Sharma et al (2014).
Bandyopadhyay (2007) used a hybrid method to arrive at the probability of default. The
survey of literature on default prediction found that reduced form models
have not yet been explored in the Indian scenario.
In the Indian context there are few studies on the subject, and there remains scope for
further study. Some of the gaps are mentioned below:
• Though all the above mentioned studies have considered debt while
predicting the probability of default, they are mostly concerned with bond defaults
and tend to ignore the usual bank loans extended to corporates, which, unlike in
America and other developed countries, are a common source of debt for Indian firms.
• There have been very few studies predicting the probability of debt default
(bank loans and bonds) in the Indian context. This lack of studies
warrants a study on debt defaults by Indian firms.
• There are few studies, like Bandyopadhyay (2006), Bandyopadhyay (2007)
and Sharma et al (2014), using logit, structural and hybrid models.
• Reduced form models appear to be completely unexplored in the Indian context. In
spite of an extensive search of the literature, no published study could be found on
this aspect.
• From the extensive review of literature it appears that the studies on the subject
conducted for Indian samples have not been compared with other countries.
1.5 The Present Study
This study aims at assessing different default prediction models in the context of
India, the USA and the UK. It is well known that default by firms has a
negative impact on the health of the banking and financial services industry as well as
the economy. From Figure No.1.2: Net NPA (India) below, it is clear that
non-performing assets have been increasing for some time in the Indian economy. NPA data
and CIBIL reports clearly indicate that every year thousands of companies fail to meet
their financial obligations. So information about default is of huge importance
to the banking and financial services sector as well as to the regulators, because
timely and accurate prediction of borrowers’ default probability holds the key to
developing a responsive and effective tool for risk management.
The present study aims at filling the gaps mentioned above. The study uses
four methods, namely multiple discriminant analysis, logistic regression analysis,
the structural model and the reduced form model, to model the probability of bank loan
default for three samples: an Indian sample, an American sample and a British sample.
The main focus of the study is the Indian sample; the American and British samples are
used for the purpose of comparison.
1.6 Objectives
Overall, the objective of the study is to develop and compare models for estimating the
probability of default of firms from India, the USA and the UK. For this purpose four
methods have been used to develop the respective models and test them in the respective
markets. The objectives of the study can be summarized in four broad points.
1. To develop models for firms from India, the USA and the UK using four
methods, namely multiple discriminant analysis, logistic regression analysis,
the Black-Scholes-Merton model and the reduced form model.
2. To develop and compare four different models for different sizes of firm for
Indian companies for the large sample period (ten years) as well as the small sample
period (five years): two models for All Firms, four models for Large Firms
excluding public sector undertakings, models for Large Firms including public
sector undertakings and two models for small and medium enterprises, using
multiple discriminant analysis and logistic regression analysis.
3. To develop a new reduced form model for bank loans. For this purpose two
models are developed for All Firms and Large Firms each, for the two sample
periods.
4. To help banks and financial institutions choose the best method
among the analyzed methods of predicting debt default by firms in India, the
USA and the UK.
1.7 Testable Research Hypotheses
The present study uses four methods, namely multiple discriminant analysis, logistic
regression, the reduced form model and the structural model. The research hypotheses
differ by method and are as follows:
With respect to Multiple Discriminant Analysis, the following three null hypotheses are
tested in this study:
a). There is no significant difference between the group means on each of the
independent variables.
b). The covariance matrices do not differ between the groups formed by the dependent
variable.
c). There is no significant discriminating power in the independent variables.
With respect to Logistic Regression Analysis, the following three null hypotheses are
tested in this study:
a). There is no effect of the independent variables, taken together, on the dependent
variable.
b). The model is correctly specified.
c). The coefficient corresponding to each variable is zero.
With respect to Reduced Form Models, the following two null hypotheses are tested in
this study:
a). All slope coefficients are equal to zero.
b). The estimated coefficient of each independent variable is equal to zero.
1.8 Methods
Many methods can be used for predicting default: earlier forms of analysis such as
fundamental analysis, ratio analysis and univariate analysis, as well as parametric and
non-parametric methods. Parametric methods, such as multivariate analysis (multiple
discriminant analysis, logistic regression analysis and probit analysis),
structural models and reduced form models, consider various aspects of the firm such as
different ratios, asset value, outside obligations and variability in the future market
value of assets. Similarly, non-parametric methods like artificial neural networks (ANN),
hazard models, fuzzy models, genetic algorithms (GA) and hybrid models, which use
several of the former models, can be used to predict the probability of default of a firm.
The present study uses four parametric methods, namely multiple discriminant
analysis, logistic regression analysis, structural models and reduced form models.
Financial ratios can indicate the firm’s ability to meet its external liabilities (current
as well as long-term) with the help of a lead indicator. This lead indicator can be
developed using multivariate analysis. Multiple discriminant analysis is a way to
develop an indicator which can inform beforehand whether a firm would default or not
in the near future. The score developed using multiple discriminant analysis is popularly
known as the Z-score (Altman E. I., 2000). Similarly, logistic regression can also be used
to develop an indicator of whether a firm would default or not in the near future.
The score developed using logistic regression is popularly
known as the O-score (Ohlson, 1980).
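To make the idea of a composite lead indicator concrete, the following is a minimal sketch of a discriminant score as a weighted sum of financial ratios. It uses the widely quoted coefficients and cut-offs of the original Altman (1968) Z-score for listed manufacturing firms, not the models estimated in this study; the class name, field names and the example ratios are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class FirmRatios:
    """Five ratios of the original Altman (1968) model, as decimals."""
    working_capital_to_assets: float     # X1
    retained_earnings_to_assets: float   # X2
    ebit_to_assets: float                # X3
    market_equity_to_liabilities: float  # X4
    sales_to_assets: float               # X5


# Published Altman (1968) weights (the last is 0.999 in the original paper and
# is commonly rounded to 1.0). These are NOT the weights estimated in this study.
ALTMAN_1968_WEIGHTS = (1.2, 1.4, 3.3, 0.6, 1.0)


def z_score(ratios: FirmRatios, weights=ALTMAN_1968_WEIGHTS) -> float:
    """Return the discriminant score as a weighted sum of the ratios."""
    values = (
        ratios.working_capital_to_assets,
        ratios.retained_earnings_to_assets,
        ratios.ebit_to_assets,
        ratios.market_equity_to_liabilities,
        ratios.sales_to_assets,
    )
    return sum(w * x for w, x in zip(weights, values))


def zone(score: float) -> str:
    """Classify a score using the original Altman (1968) cut-offs."""
    if score > 2.99:
        return "safe"
    if score < 1.81:
        return "distress"
    return "grey"


if __name__ == "__main__":
    firm = FirmRatios(0.2, 0.3, 0.1, 1.5, 1.2)  # hypothetical firm
    score = z_score(firm)
    print(round(score, 2), zone(score))
```

A model re-estimated on a different sample, as in this study, would keep the same structure but replace the weights and the cut-off.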
Structural models hold that the financial structure of the firm can indicate the
firm’s ability to meet its external liabilities. They argue that default is driven by
a) the market value of the firm’s assets, b) the level of the firm’s outside obligations
(or liabilities) and c) the variability in the future market value of assets. As the
market value of the firm’s assets approaches the book value of liabilities, the default
risk of the firm increases. The default point is the threshold value of the firm’s assets
(somewhere between total liabilities and current liabilities) at which the firm defaults.
Therefore, the relevant net worth of the firm is the difference between the market value
of assets and the default point. Default occurs when the relevant net worth of the firm
approaches zero.
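The three drivers listed above can be combined as in the sketch below, which follows the standard Merton-type calculation: a distance to default is computed from the asset value, the default point and the asset volatility, and the default probability is the normal tail probability of that distance. The function names, the asset drift parameter and the 0.5 weight on long-term liabilities in the default point are assumptions made for illustration; the text only states that the default point lies somewhere between current and total liabilities.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def default_point(current_liabilities: float, total_liabilities: float,
                  weight: float = 0.5) -> float:
    """Threshold between current and total liabilities.

    The weight on long-term liabilities is an illustrative assumption.
    """
    long_term = total_liabilities - current_liabilities
    return current_liabilities + weight * long_term


def distance_to_default(asset_value: float, threshold: float, drift: float,
                        asset_vol: float, horizon: float = 1.0) -> float:
    """Number of standard deviations between asset value and default point."""
    if min(asset_value, threshold, asset_vol, horizon) <= 0:
        raise ValueError("asset value, threshold, volatility and horizon must be positive")
    numerator = (math.log(asset_value / threshold)
                 + (drift - 0.5 * asset_vol ** 2) * horizon)
    return numerator / (asset_vol * math.sqrt(horizon))


def default_probability(asset_value: float, threshold: float, drift: float,
                        asset_vol: float, horizon: float = 1.0) -> float:
    """Probability that assets fall below the default point at the horizon."""
    return normal_cdf(-distance_to_default(asset_value, threshold, drift,
                                           asset_vol, horizon))


if __name__ == "__main__":
    dp = default_point(current_liabilities=40.0, total_liabilities=100.0)
    for assets in (150.0, 100.0, 80.0):  # hypothetical market values of assets
        print(assets, round(default_probability(assets, dp, 0.05, 0.30), 4))
```

Running the example shows the probability rising as the asset value approaches the default point, which is the behaviour described in the paragraph above.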
Unlike methods based on accounting ratios, such as multiple discriminant analysis and
logistic regression analysis, and market based models, such as structural models,
reduced form models use both accounting and market information to predict the
probability of default. They use Poisson regression to arrive at a default intensity,
which is then used to arrive at the probability of default.
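As a rough sketch of how a fitted Poisson regression yields a default intensity, the code below applies the usual log link, in which the intensity is the exponential of a linear combination of covariates, and then converts the intensity into a default probability over a horizon. The coefficient values and covariates are hypothetical and are not estimates from this study.

```python
import math
from typing import Sequence


def default_intensity(intercept: float, coefficients: Sequence[float],
                      covariates: Sequence[float]) -> float:
    """Intensity under the Poisson log link: exp(b0 + sum of b_k * x_k)."""
    if len(coefficients) != len(covariates):
        raise ValueError("coefficients and covariates must have the same length")
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return math.exp(linear)


def probability_of_default(intensity: float, horizon_years: float = 1.0) -> float:
    """Probability of at least one default event within the horizon."""
    if intensity < 0 or horizon_years < 0:
        raise ValueError("intensity and horizon must be non-negative")
    return 1.0 - math.exp(-intensity * horizon_years)


if __name__ == "__main__":
    # Hypothetical coefficients on one accounting and one economic variable.
    lam = default_intensity(-3.0, [1.5, -0.8], [0.6, 0.2])
    print(round(lam, 4), round(probability_of_default(lam, 1.0), 4))
```

Because the link is exponential, the intensity is always positive, so the resulting probability always lies between zero and one.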
This study uses four methods, namely multiple discriminant analysis, logistic
regression analysis, the reduced form model and structural models, to
develop models for firms from India, the USA and the UK and to compare the classification
accuracies of the developed models.
1.8. Data
1.8.1. Sample
1.8.1.1. Indian Firms
For the study of default prediction models for Indian firms, four broad samples are
used. The first sample covers all firms and consists of 226 firms from India with
2450 cases. The second sample consists of 149 large firms with 1642 cases. The
third sample consists of 169 large firms and public sector units with 1881 cases. The
fourth sample consists of 57 small and medium firms with a total of 570 cases.
1.8.1.2. American Firms
This sample consists of 50 firms from the US and has 250 cases.
1.8.1.3. British Firms
This sample consists of 50 firms from the UK and has 250 cases.
1.8.2. Sample Period of the Study
Indian Firms
The sample period for the study on the Indian sample is twelve years, from 1st April 2004
to 31st March 2016. Of this sample period, ten years of data, from 1st April 2004 to
31st March 2014, have been used to develop the models. Within this period, the study is
conducted for two sample periods, namely a large sample period of ten years from
1st April 2004 to 31st March 2014 and a small sample period of five years from
1st April 2009 to 31st March 2014, to find out the impact of the sample period. The data
belonging to the period from 1st April 2014 to 31st March 2016 have been used for forward
testing for both out of sample firms and in sample firms.
American Firms
The study covers a sample period of five years from 1st January 2012 to
31st December 2014 for the firms from the United States of America.
British Firms
The period of study for the firms from the United Kingdom runs up to 31st December 2014.
1.8.3. Data Sources
1.8.3.1. Company specific data
Accounting, market and economic information from the three countries have been
collected. In the case of firms from India, the accounting data have been collected from
Prowess and the respective financial statements of the firms, and the stock prices of the
firms have been collected from the Bombay Stock Exchange. Economic data have been
collected from the RBI and the World Bank. The information about firms’ default position
has been collected from the respective auditor’s reports in the financial statements.
In the case of firms from the United States of America, the accounting
data have been collected from the respective financial statements of the firms, and the
stock prices of the firms have been collected from NASDAQ. Economic data have been
collected from the Federal Reserve and the World Bank. The information about firms’
default position has been collected from the respective auditor’s reports in the
financial statements. In the case of firms from the United Kingdom, the
accounting data have been collected from the respective financial statements of the
firms, and the stock prices of the firms have been collected from the London Stock
Exchange (FTSE). Economic data have been collected from the Bank of
England and the World Bank. The information about firms’ default position has been
collected from the respective auditor’s reports in the financial statements.
1.8.3.2. Interest Rate Proxy
As far as risk free interest rates are concerned, 91-day Treasury bills have been taken
as the proxy for the interest rate for the whole study. The interest rate proxy for
Indian firms has been collected from the Reserve Bank of India. For the firms from the
USA, it has been collected from the Federal Reserve Bank. For the firms from the United
Kingdom, the interest rate proxies over the sample period have been collected from the
Bank of England.
1.8.3.3. Market-Proxy
For the purpose of getting market returns, the study has taken market proxies for each
market. In case of India, the market proxy is BSE 500. For the USA, it is NASDAQ
Composite. In case of the UK, it is FTSE AIM All Share.
1.8.3. The Database
When it comes to collecting data for the study, most of the accounting data have been
collected from the individual financial statements of the firms that are part of the
study. Besides this, in case of India, data have also been collected from Prowess. For
the USA and the UK, no database has been used.
1.8.4. Statistical Analysis Packages
The first tool that has been used is the spreadsheet software Microsoft Excel. The daily
stock price and market price data have been processed using Microsoft Excel.
The processed data are then used for the development of models and for testing various
statistical properties using the statistical analysis packages SPSS 20 and E-Views 9.
SPSS 20 has been used for developing models using multiple discriminant analysis
and logistic regression analysis. E-Views 9 has been used for reduced form modelling
using the Poisson regression method. For the structural model based on the
Black-Scholes-Merton Model, Microsoft Excel has been used.
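The study does not spell out the spreadsheet steps applied to the daily prices. As a hedged illustration of a common preprocessing step for market based models, the sketch below converts a daily price series into log returns and an annualised volatility. The function names and the choice of 250 trading days per year are assumptions, not details taken from the study.

```python
import math
from typing import List, Sequence


def log_returns(prices: Sequence[float]) -> List[float]:
    """Daily log returns from a series of closing prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    return [math.log(later / earlier)
            for earlier, later in zip(prices, prices[1:])]


def annualised_volatility(returns: Sequence[float],
                          trading_days: int = 250) -> float:
    """Sample standard deviation of daily returns scaled to one year.

    The number of trading days per year is an assumed convention.
    """
    n = len(returns)
    if n < 2:
        raise ValueError("at least two returns are needed")
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return math.sqrt(variance * trading_days)


if __name__ == "__main__":
    closes = [100.0, 101.5, 99.8, 102.2, 103.0]  # hypothetical closing prices
    daily = log_returns(closes)
    print(round(annualised_volatility(daily), 4))
```

The same two steps apply to both individual stock prices and the market proxy index.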
1.9 Results and Analysis
For the Indian sample, the models for the small sample period have higher statistical
robustness, but the models for the large sample period have higher classification
accuracy, except for the reduced form model. On comparison of the classification
accuracies, logit models have higher accuracies (97.7%) than the discriminant models
(Gupta V. , 2014) and the structural model as well (Sharma, Singh, & Upadhyay, 2014).
Over time, classification accuracy deteriorates for the developed models. The results
for the MDA and logit models are comparable to many studies like Altman (1968), Ohlson
(1980), Altman & Narayanan (1997), Sen Chaudhury (1999), Altman (2000), Bandyopadhyay
(2006), Jayadev (2006), Agarwal & Taffler (2008) and Gupta (2014). As far as the
accuracies of the reduced form model and the structural model are concerned, the results
of the study are found to be less efficient than those of many studies like Patel &
Pereira (2005), Bandyopadhyay (2007), Agarwal & Taffler (2008), Kulkarni, Mishra, &
Thakkar (2008), Duan, Sun, & Wang (2011) and Sharma, Singh, & Upadhyay (2014). For the
newly developed model using the reduced form method, both the statistical robustness and
the classification accuracy are highest for the model developed for the small sample
period.
For the US and UK samples, all categories of variables are important for model
building for the MDA and logit models, but for the reduced form model only two
categories, accounting and economic variables, are important. The MDA and logit models
have variables with a debt component, but in the case of the reduced form model,
variables with a debt component are not significant.
1.10 Organization of the Study
This study has been divided into six chapters, with each chapter shedding light on
a different step of the study.
• Chapter – 1: Introduction
This chapter introduces the basic issues and the problems relating to the topic of the
study. Briefs on the importance of the study, past studies, the research gap, the
present study, objectives, hypotheses, methods, data and the findings of the study have
been presented in this chapter.
• Chapter – 2: Conceptual Framework
This chapter introduces the basic concepts on which credit risk models are built.
The conceptual framework of the four methods used in the study, along with the
chronological evolution of credit risk models and the limitations of the conceptual
framework, has been presented in this chapter.
• Chapter – 3: Literature Review
This chapter reviews the existing literature on credit risk modelling. It begins by
presenting a comprehensive review of the literature in chronological order from the
beginning to showcase how the field of credit risk modelling has evolved over time.
In the second part of the chapter, literature relating to India, the USA and the UK has
been presented separately.
• Chapter – 4: Theoretical Framework and Methods
This chapter discusses the theories behind the four methods selected for the study and
the methodology that has been followed, along with the data, selected variables, sample
period, sources of data and statistical packages used in the study.
• Chapter – 5: Analysis of Default Prediction Models for Indian, US and UK Corporate Debt
This chapter consists of the empirical results of the study. It is organized in four
parts. Part I presents results and findings from the study on the Indian sample. Part II
presents results and findings from the study on the American sample. Part III presents
results and findings from the study on the British sample, and Part IV presents a
comparative study of default prediction models.
• Chapter – 6: Summary and Conclusion
This chapter presents the summary of the findings and concludes the whole study.
Chapter 2
Conceptual Framework

2.1 Introduction
Although formal studies on credit risk started in the 1930s (Altman E. , 1968),
the practice of credit assessment may be traced to the Mercantile Agency (1841).
Starting with fundamental analysis and moving through univariate analysis, multivariate
analysis and structural models to neural networks, numerous credit risk models have
evolved over time, and after six decades of credit risk measurement there is extensive
development in the credit risk literature.
2.2 Classification of Default Prediction Models
The present form of formal studies on credit risk started in the 1930s (Altman E. ,
1968). The early studies were univariate in nature: single financial ratios were
used to assess the financial position of the borrower. These were followed by
multivariate studies. Over more than seventy years, a large number of studies have been
conducted in different settings. The credit risk models can be classified into the
following categories on the basis of the methods used:
1. Parametric Models
2. Non-parametric Models
2.2.1 Parametric Models
The models which are univariate and multivariate in nature, use financial as well
as non-financial information and focus on the symptoms of bankruptcy may be
categorized as parametric models (Balcaen, 2006). These models face the problems of
sampling bias, selection bias, unsuitable sample periods and the assumptions relating
to dichotomous variables. Accounting and market based models are parametric
models. Examples of accounting based models are the Altman (1968) Z-score model,
the Ohlson (1980) O-score model and the Zmijewski (1984) model.
• Accounting based models
• Market based models
• Structural Model
• Reduced Form Model
2.2.2 Non-parametric Models
These are computer dependent multivariate models which use complex technology like
artificial intelligence to assess credit risk. These models are not capable of
explaining the causal relationships between the independent variables and the
dependent variable. Some examples of non-parametric models are artificial neural
networks, hazard models, fuzzy models, Bayesian models, Data Envelopment Analysis and
hybrid models.
2.3 Evolution of Credit Risk Models
Though the earliest evidence of credit risk assessment dates back to 1841, systematic
efforts at default prediction date back to 1909, when John Moody devised a scale for
rating the credit quality of risky borrowers (Benzschawel, 2015). The journey that began
with fragmented credit reports has reached complex artificial intelligence.
Riding the wave of economic growth in the USA during this period, the whole
outlook of credit assessment has witnessed sea changes. The evolution can be
summarized as follows:
2.3.1 Fundamental Analysis
Fundamental credit analysis makes a qualitative assessment of the past records of the
assets and liabilities mix, sales and earnings dynamics, products and brand positioning
and management, along with the economic scenario. With the help of this
analysis, future trends are predicted regarding success or failure and default risk
relative to other firms in the industry. The earlier avatars of credit rating agencies,
such as the Mercantile Agency (1841) and Dun and Bradstreet (1849), used fundamental
analysis to provide independent credit reports.
2.3.2 Univariate Methods
Univariate methods are the first generation of credit evaluation framework. These
methods used financial ratios to establish a relationship with default. The relationship
between each selected variable and default is established separately, so for each
analysed ratio the method provides separate information on its relationship with
default. These different results normally established different relationships with
default, and it was difficult to arrive at a conclusion. This was the biggest limitation
of the method, and it led to the realization that a composite indicator was needed. A few
examples of univariate methods are Wall (1919), Smith (1930), Smith and Winakor (1935),
Fitzpatrick (1932), Durand (1941) and Merwin (1942). However, the seminal works of
Beaver (1966) and Altman (1968) completely changed the whole course. Beaver (1966) used
univariate discriminant analysis while Altman (1968) used multiple discriminant analysis
to provide a credit scoring framework (Z-score).
2.3.3 Multivariate Methods
The need for a composite score was realized while conducting univariate analyses to
arrive at a decision. It was assumed that, with the help of a few tests, weights can be
assigned to relevant variables and a composite score can then be arrived at. This
method does not try to build a model from the drivers of default. Rather it puts
emphasis on a quantitative score that can indicate default. In this process, the
defaulted firms are compared against non-defaulting firms through a number of
associated variables using a few tests to arrive at a composite score. On the basis of
this concept Altman (1968) developed the Z-score. The Z-score model was the starting
point of the first generation of credit evaluation models. The next important credit
scoring model of the first generation was developed by Ohlson (1980). It used logistic
regression to offer a better credit scoring system (O-score) by removing many
peculiarities in the MDA assumptions. Zmijewski (1984) used the Probit method to provide
a better credit scoring model (Zmijewski Score) with the purpose of removing the
limitations of the logistic regression method. These models give a composite score by
assessing different discrete accounting, market and economic characteristics of the firm
(Altman & Narayanan, 1997). These models don’t try to capture the economics of default.
Rather they follow a much simpler method (Chacko, Sjoman, Motohashi, & Dessain, 2006).
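The difference between the logistic and Probit approaches can be illustrated by how each turns a linear composite score into a probability. The sketch below is generic and does not reproduce the Ohlson or Zmijewski coefficients; the weights and ratios are hypothetical, and the dependent variable is assumed to be coded 1 for default so that a higher score means a higher default probability.

```python
import math
from typing import Sequence


def linear_score(intercept: float, weights: Sequence[float],
                 ratios: Sequence[float]) -> float:
    """Composite score: intercept plus the weighted sum of the ratios."""
    if len(weights) != len(ratios):
        raise ValueError("weights and ratios must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, ratios))


def logit_probability(score: float) -> float:
    """Logistic link, written to avoid overflow for large |score|."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def probit_probability(score: float) -> float:
    """Probit link: standard normal CDF of the score."""
    return 0.5 * (1.0 + math.erf(score / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical weights on leverage and profitability (default coded as 1).
    s = linear_score(-2.0, [3.0, -4.0], [0.7, 0.05])
    print(round(s, 3), round(logit_probability(s), 4),
          round(probit_probability(s), 4))
```

Both links map any real score into the interval from zero to one; they differ in the shape of the curve, so coefficients estimated under one link are not interchangeable with the other.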
2.3.4 Structural Models
Structural models draw on the option pricing work of Black and Scholes
(1973) and Merton (1973). These two studies gave the foundation to structural models.
The Merton (1974) model for credit risk has a different perspective on the default
process than the credit scoring models and is a single period model which derives
default probabilities from the random variation in the unobservable value of the assets
of the firm (Merton, 1974). It incorporates factors like the market value of assets and
the business risk of firms. This approach says that a firm tends to default when the
market value of its assets is less than the debt it has to pay. The model uses the
firm’s balance sheet and capital structure information to assess credit risk (Chacko,
Sjoman, Motohashi, & Dessain, 2006). Structural models are found to be very useful in
predicting corporate bond defaults, but they may not be as useful for the default
prediction of bank loans (Sharma, Singh, & Upadhyay, 2014).
2.3.5 Reduced Form Models
In reduced form models, default is modelled as a statistical process rather than through
a microeconomic model of the firm.
Reduced form models don’t try to build a model on the information from the balance
sheet, nor do they try to develop an indicator from the financial statements of the
firm. They assume that there is nothing like gradual default; rather, default is an
abrupt event. The assumption at work is that there is no relationship between the value
of the firm and default. Default is instead an unpredictable and sudden event of
inexplicable loss in the market value of the firm. This approach still asks for a signal
to indicate the event of default. The reduced form models simply assign a random
external indicator which is known as the default intensity λ (Chacko, Sjoman, Motohashi,
& Dessain, 2006).
Where,
Default intensity: $\lambda$
The probability of default by time $t$ (in years): $1 - e^{-\lambda t}$
The expected time to default: $1/\lambda$
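The two expressions above follow from treating the time to default as exponentially distributed with a constant intensity, which is the simplest reduced form setting. A short derivation under that assumption, where T denotes the random default time and the numerical value of the intensity is purely illustrative:

```latex
% Assumes a constant default intensity \lambda, so that the default time T
% is the first jump of a Poisson process and is exponentially distributed.
\begin{align*}
  S(t) &= \Pr(T > t) = e^{-\lambda t}, \\
  \Pr(T \le t) &= 1 - S(t) = 1 - e^{-\lambda t}, \\
  \mathbb{E}[T] &= \int_0^\infty t \,\lambda e^{-\lambda t}\, dt = \frac{1}{\lambda}.
\end{align*}
% Illustrative value only: \lambda = 0.05 per year gives
% \Pr(T \le 1) = 1 - e^{-0.05} \approx 0.0488 and \mathbb{E}[T] = 20 years.
```

For small intensities the one-year default probability is close to the intensity itself, which is why the two are often quoted interchangeably.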
2.3.6 Artificial Neural Networks
Artificial neural network methods are another approach to credit risk assessment.
These can be termed the fourth generation of credit models. They use artificial
intelligence to build a model with the help of a propagation learning algorithm. In this
method a neural network is trained with either a backward or a forward propagation
learning algorithm so that the developed model can decide on loan applications. These
models are able to learn and adapt to the sample, and they can also capture non-linear
relationships. The models based on neural networks have been able to generate
high levels of accuracy. However, models based on the neural network approach are
criticized because of their ad hoc theoretical foundation, their use of data mining to
identify unobservable correlations among the independent variables and their inability
to explain causal relationships (Khashman, 2009).
2.3.7 Hybrid Models
Hybrid models are models which combine two or more methods to arrive at a
decision. Sometimes, these use parametric as well as non-parametric models at the
same time. These can also be termed the fourth generation of credit models. The study of
Bandyopadhyay (2007) is an example of a hybrid model.
2.3.8 Data Envelopment Analysis
Charnes, Cooper and Rhodes (1978) introduced Data Envelopment Analysis. Relying
on the productivity concept, which defines the efficiency measure as the quotient of a
single output and a single input, they extended it to a multidimensional situation in
which there is more than one input as well as more than one output. Applying this
inference, they were able to propose a very practical system to measure efficiency
(Ferus, 2008).
2.4 Limitations in Default Prediction Studies
Credit risk models, like all other models, have some limitations, mainly because
of their restrictive assumptions, and these limitations are specific to each
model. Overall, however, all the models suffer from some common
limitations, which are mainly related to the sampling process. The details
are as follows:
2.4.1 Sampling Bias
Sampling bias is a bias in which some members of the population are less likely to be
included in the sample than others. In this case, the individual data points from
the population are not all equally likely to be drawn into the sample. That means
members from one group are more likely to be included in the sample while members
from another group are less likely to be included. The result is a non-random sample
that is not representative of the population. So the statistical analyses are
found to be biased, and the conclusions from the analyses are found to be
representative of a group of members rather than of the population (Lavrakas, 2008).
2.4.2 Selection Bias
Selection bias occurs when proper randomization is not achieved in the process of
selecting individual data points from the population. The obtained sample is then not
representative of the population. Because of this problem in selection, there is a
distortion in the statistical analysis, which is biased towards the sample. So the
results of the analyses hold only for the obtained sample, not for the population from
which the sample is drawn. When analyzing the results of studies on such samples, the
conclusions may therefore not be accurate for the whole population. Selection
bias is also known as the selection effect (Lavrakas, 2008).
Chapter 3
Review of Literature

3.1 Introduction
Corporates need funds to carry out their operations, and the funds are in the hands of
households in the form of savings. The banking system mobilizes funds from households
to corporates and vice versa. In this process many borrowers (both corporates and
individuals) fail to service their debt obligations when they fall due, for a number of
reasons. This failure has huge consequences for the banking and financial system as
well as for the economic system. When plagued with such defaults, the economic
system moves towards economic crisis.
Predictions of probable defaults are crucial for banks, financial institutions and
regulators, as defaults impose a huge cost on the whole economic system. Considering the
importance of timely repayment of debt, the banking and financial system has been
working since early times to devise methods of predicting and forecasting such
defaults so as to take preventive action in its lending practices. To serve this
purpose, several studies across the world have been carried out using different sets of
data and variables according to different economic realities to provide reliable
information. This information is of huge importance to the regulatory and economic
system for framing rules, guidelines and practices, as it provides insight for
developing a responsive and effective tool for risk management (Altman E. I., 2002).
3.2 Earlier Studies
Default prediction has always been of huge interest to lenders and regulators.
This has led to a lot of research in this domain, resulting in a huge literature on
default prediction methods and models. Over time, different methods have evolved for
credit assessment and default prediction, from fundamental analysis to univariate
analysis and on to the modern and post-modern quantitative default prediction models
such as multivariate studies, structural models, reduced form models, neural networks,
artificial intelligence and hybrid models.
Wall (1919), Smith (1930), Smith and Winakor (1935), Fitzpatrick (1932), Durand
(1941) and Merwin (1942) used financial ratios with univariate methods to provide
the first generation of credit evaluation framework. But the seminal works of
Beaver (1966) and Altman (1968) completely changed the whole course. Beaver
(1966) used univariate models while Altman (1968) used multiple discriminant
analysis to provide a credit scoring framework (Z-score). The Altman (1968) Z-score
model was the starting point of the first generation of credit evaluation models.
Ohlson (1980) used logistic regression (O-score) to offer a better credit scoring system
by removing many peculiarities in the MDA assumptions. Zmijewski (1984) used the Probit
method to provide a better credit scoring model with the purpose of removing the
limitations of the logistic regression method.
The first generation of credit risk models are basically accounting-based credit scoring
models, which use financial ratios to arrive at a score that could indicate default.
Because of the ease of carrying out tests and the availability of data, this method
changed the whole landscape of default prediction research, and it is still one of the
most prominent methods used for predicting default. Altman's model has performed very
well in different studies across the world, in different economic realities and over
different time periods (Altman & Narayanan, 1997).
Many studies find that, in a number of economic realities, the Altman (1968) model
is very efficient in terms of classification accuracy. But at the same time this model
has received a lot of criticism, as it is based on the book value of accounting data. At
a time when market information has become very important, other dimensions
cannot be ignored while assessing default. These models use only discrete accounting
data and do not allow for non-linear effects among different credit risk factors. Also,
the structural changes in the economy and the increasing importance of intangible assets
and technology could not be included in these models. In such a case, these models would
not be able to capture the effect of adverse business cycles on the
creditworthiness of corporate clients (Kealhofer, 2003). This called for new ways
and means, which led to the second and third generations of credit risk models. Credit
scoring models are also unable to distinguish between asset volatility and leverage.
(1973) and Merton (1973). These two studies gave foundation to structural models
model for credit risk has different perspective to default process than that of the credit
scoring models and is a single period model which derives default probabilities from
the random variation in the unobservable value of assets of the firm (Merton, 1974). It
incorporated factors like the market value of assets and the business risk of firms.
Structural models are found to be very useful in predicting corporate bond defaults
but when it comes to default prediction of bank loans, the same may not be as useful
(Sharma, Singh, & Upadhyay, 2014).
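As a rough illustration of the structural approach described above, the following sketch computes a single-period default probability in the spirit of Merton (1974): default occurs if the asset value falls below the face value of debt at the horizon. The function names, the lognormal asset assumption and the sample figures are illustrative assumptions and are not taken from the studies cited; in practice the asset value and asset volatility are unobservable and have to be inferred from equity data.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_default_probability(
    asset_value: float,
    debt_face_value: float,
    asset_drift: float,
    asset_volatility: float,
    horizon_years: float = 1.0,
) -> float:
    """Probability that assets end below the debt face value at the horizon.

    Assumes (hypothetically) that asset value follows a lognormal process
    with constant drift and volatility, as in a Merton-type model.
    """
    if min(asset_value, debt_face_value, asset_volatility, horizon_years) <= 0:
        raise ValueError("values, volatility and horizon must be positive")
    distance_to_default = (
        math.log(asset_value / debt_face_value)
        + (asset_drift - 0.5 * asset_volatility ** 2) * horizon_years
    ) / (asset_volatility * math.sqrt(horizon_years))
    return normal_cdf(-distance_to_default)


if __name__ == "__main__":
    # Hypothetical firm: assets 100, debt 70, 5% drift, 25% asset volatility.
    print(merton_default_probability(100.0, 70.0, 0.05, 0.25))
```

The sketch makes the role of leverage and asset volatility explicit: a higher debt face value or a higher volatility both reduce the distance to default and raise the default probability.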
is modelled as statistical process rather than as a microeconomic model of the firm's
3.3 Fundamental Analysis
Fundamental credit analysis makes a qualitative assessment of the past records of the
assets and liabilities mix, sales and earnings dynamics, products and brand positioning
as well as management. With the help of this analysis, future trends are predicted
regarding the success or failure and default risk with respect to other firms in the
industry. The earlier avatars of credit rating agencies such as Mercantile Agency
(1841) and Dun and Bradstreet (1849) used fundamental analysis to provide
independent credit reports rather than credit scores. Even the modern avatars of credit
rating agencies such as Moody's, Standard and Poor's and Fitch use fundamental
analysis along with other methods and tools to arrive at credit ratings
through credit scores. Although fundamental credit analysis can help in identifying
potential opportunities and risks, it is very subjective and difficult to quantify,
and it is less precise than the model-based approaches, which give
objective credit scores.
3.4 Univariate Analysis
Initially, ratio analysis or financial ratio analysis was considered to be part of
fundamental analysis, because the relationship between default and ratios was not
explicit. Later on, financial ratios became a tool for credit assessment. The subjective
interpretations in fundamental analysis warranted a review of the way credit
assessment was carried out. This resulted in a more objective approach based on
financial ratios. The early studies concerning ratio analysis for bankruptcy prediction
were univariate studies. These studies focused on individual ratios and compared the
ratios of failed companies with those of successful firms. The univariate studies had
important implications for future model development, as they laid the groundwork for
multivariate bankruptcy prediction models. Compared with the next 50 years (1965 to
present), relatively few studies were published up to 1965. The most prominent of
the univariate studies are summarized as follows:
Wall (1919) conducted a study of 981 US firms using seven ratios (mostly related to
liquidity, directly or indirectly) to find the relationship between the ratios and debt
serviceability. The results showed that ratios can help in discriminating between
failing and non-failing firms, as failed firms had poorer liquidity ratios than
non-defaulting firms (Wall, 1919).
In 1930, the Bureau of Business Research (BBR) published a bulletin with results of a
study of ratios of failing industrial firms. The study analysed 24 ratios of 29 firms to
determine common characteristics of failing firms. Average ratios were developed
based on the ratios of the 29 firms. The ratios of each firm were then compared with
the average ratios to show that the failing firms displayed certain similar
characteristics or trends. The study found eight ratios that were considered good
indicators of the "growing weakness" of a firm. These ratios were Working Capital to
Total Assets, Surplus and Reserves to Total Assets, Net Worth to Fixed Assets, Fixed
Assets to Total Assets, the Current Ratio, Net Worth to Total Assets, Sales to Total
Assets, and Cash to Total Assets. BBR also reported that the Working Capital to Total
Assets ratio appeared to be a more valuable indicator than the Current Ratio, despite
the fact that both ratios were found to be good indicators of weakness (Smith, 1930).
FitzPatrick (1932) compared 13 ratios of 20 failed and successful firms. He found that, in
the overwhelming majority of cases, the successful companies displayed favourable ratios
while the failed firms had unfavourable ratios when compared with "standard" ratios and
ratio trends. FitzPatrick reported that two significant ratios were Net Worth to Debt and
Net Profits to Net Worth. Also, FitzPatrick suggested that less importance should be
placed on the Current Ratio and Quick Ratio for firms with long-term liabilities, indicating
that serviceability of long-term debt is more closely related to the long-term profitability
of firms and their corresponding net worth (FitzPatrick, 1932).
Smith and Winakor (1935) analysed ratios of 183 failed firms from a variety of
industries in a follow-up study to the Smith (1930) results. Smith and Winakor found
that Working Capital to Total Assets was a far better predictor of financial problems
than both Cash to Total Assets and the Current Ratio. They also found that the
Current Assets to Total Assets ratio dropped as the firm approached bankruptcy
(Winakor & Smith, 1935).
In 1942, Merwin published his study focusing on small manufacturers. He used 200
failed and 381 successful firms for the study. The study reported that when comparing
successful with failing firms, the failing firms displayed signs of weakness as early as
four or five years before the failure. Also, Merwin found three ratios that were
significant indicators of business failure - Net Working Capital to Total Assets, the
Current Ratio, and Net Worth to Total Debt (Merwin, 1942).
Chudson (1945) studied patterns in the capital structures of firms in an effort to
determine whether there was a normal pattern. The study found that, in general, at the
economy level there was no normal pattern to capital structures. It reported, however,
that there was clustering of ratios within industry, size and profitability. Although
not directly linked to default prediction, this indicates the need for different default
prediction models on the basis of size, industry and economy (Chudson, 1945).
Jackendoff (1962) compared the ratios of profitable and unprofitable firms. The study
found that two ratios, namely the Current Ratio and Net Working Capital to Total Assets,
were higher for profitable firms than for unprofitable firms. It also
found that the successful firms had lower debt to net worth ratios than unsuccessful
firms (Jackendoff, 1962).
The studies by Wall (1919), FitzPatrick (1932) and Durand (1941) were the stepping
stones in the process, as the relationship between the ratios and failure was not yet
clear. Studies like Fisher (1959) and Beaver (1966) then systematically established the
relationship between the ratios and failure by firms. These early studies laid the
groundwork for the studies that followed. As will be discussed in the next section,
bankruptcy prediction models began to develop with Beaver's (1966) univariate study
and have continued to evolve since then.
Beaver (1966) analyzed 14 financial ratios using univariate discriminant analysis on
a matched sample of 158 US firms. The study found that two ratios, namely the working
capital cash flow to total assets ratio and the net income to total assets ratio, can be
used to classify the firms (Beaver H. W., 1966).
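The univariate classification idea summarized above can be sketched as a search for the single-ratio cutoff that misclassifies the fewest firms. This is a minimal sketch, assuming that a lower ratio signals failure; the function name and the ratio values are hypothetical and are not data from any of the studies cited.

```python
from typing import List, Tuple


def best_cutoff(failed: List[float], healthy: List[float]) -> Tuple[float, int]:
    """Return (cutoff, errors) minimising misclassifications for one ratio.

    A firm is classified as failed when its ratio is below the cutoff.
    Candidate cutoffs are the midpoints between adjacent observed values.
    """
    values = sorted(set(failed) | set(healthy))
    if len(values) < 2:
        raise ValueError("need at least two distinct ratio values")
    candidates = [(low + high) / 2 for low, high in zip(values, values[1:])]
    best = None
    for cutoff in candidates:
        # Failed firms at or above the cutoff and healthy firms below it
        # are the two kinds of classification error.
        errors = sum(r >= cutoff for r in failed) + sum(r < cutoff for r in healthy)
        if best is None or errors < best[1]:
            best = (cutoff, errors)
    return best


if __name__ == "__main__":
    # Hypothetical net income to total assets ratios.
    failed_firms = [-0.10, 0.02, 0.05, 0.12]
    healthy_firms = [0.08, 0.15, 0.20, 0.30]
    print(best_cutoff(failed_firms, healthy_firms))
```

Because only one ratio is examined at a time, different ratios can give conflicting classifications for the same firm, which is one motivation for the multivariate methods discussed next.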
3.5 Multivariate Analysis
3.5.1 Discriminant Analysis
By the 1960s the whole arena of default prediction research had completely changed. The
seminal works of Beaver (1966) and Altman (1968) used discriminant analysis to
come up with a better default prediction framework. Basically, these were credit
scoring models. Numerous studies, such as Deakin (1972), Edmister (1972), Blum (1974),
McFadden (1976), Eisenbeis (1977), Altman et al. (1981), Queen and Roll (1987),
Shumway (2001) and Balthazar (2006), contributed towards credit scoring models in
different economic realities.
Altman (1968) used multiple discriminant analysis to develop a credit
scoring model that categorizes firms into two classes, defaulting and non-defaulting,
using a set of accounting variables for sixty-six manufacturing corporations based in
the USA over the period 1946-1965. These 66 firms were divided into two groups,
defaulting and non-defaulting, with 33 firms in each group. The results of the
study show that five financial ratios, i.e. working capital/total assets, retained
earnings/total assets, earnings before interest and taxes/total assets, market value of
equity/book value of total debt and sales/total assets, have discriminating power
(Altman E. , 1968).
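The five ratios listed above can be combined into a single score as sketched below. The coefficients and the zone cutoffs (1.81 and 2.99) are the commonly cited values for Altman's original Z-score with ratios expressed as decimals; they are not reproduced in this chapter, so they should be read as an assumption of the sketch, as should the function names and the sample figures.

```python
def altman_z_score(
    working_capital: float,
    retained_earnings: float,
    ebit: float,
    market_value_equity: float,
    sales: float,
    total_assets: float,
    total_debt_book: float,
) -> float:
    """Z-score from the five Altman (1968) ratios (commonly cited weights)."""
    if total_assets <= 0 or total_debt_book <= 0:
        raise ValueError("total assets and total debt must be positive")
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_debt_book
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 0.999 * x5


def z_score_zone(z: float) -> str:
    """Map a Z-score to the commonly cited safe / grey / distress zones."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


if __name__ == "__main__":
    # Hypothetical firm, all figures in the same currency unit.
    z = altman_z_score(
        working_capital=20.0,
        retained_earnings=30.0,
        ebit=10.0,
        market_value_equity=80.0,
        sales=150.0,
        total_assets=100.0,
        total_debt_book=40.0,
    )
    print(round(z, 4), z_score_zone(z))
```

Studies that re-estimate the model on local data, as several of the Indian studies reviewed in this section do, would replace these weights with their own discriminant coefficients.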
Sen Chaudhury (1999) used the discriminant analysis method to build a model to
predict default by Indian firms that had borrowed from ICICI Bank Limited over
more than a decade. The proposed model predicts possible defaults with a high degree
of accuracy. The accuracy of the model for ICICI Bank loans was found to be 75%
two years prior to the occurrence of the event, based on the discriminant score (Sen
Chaudhury, 1999).
This study analyses credit rating migrations in Indian corporate bond market to bring
about greater understanding of its credit risk. The ratings data has been collected from
CRISIL over 24 quarters from January 1993 to October 1998 for a sample of 426
firms with 4300 data points (company-quarters of rating data). With refinement of
data, the sample size fell to 3819 company-quarters. Within this sample, there are 255
rating changes implying that about 6.7% of the ratings change during a quarter. The
rating migration probabilities reflect the biases of the sample period. The migration
probabilities may be made more symmetric by averaging the downgrade and upgrade
probabilities with suitable weights. The study pointed out anomalies in the rating
migration data that are suggestive of inadequacies in the rating process itself (Varma
& Raghunathan, 2000).
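The migration probabilities discussed above are, in essence, row-normalized counts of rating transitions per company-quarter. The sketch below shows this calculation on a hypothetical list of transitions and checks the quarterly change rate implied by the reported figures (255 changes in 3819 company-quarters); the function name and the rating labels are illustrative assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def migration_matrix(
    transitions: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Estimate migration probabilities from (start, end) rating pairs.

    Each pair is one company-quarter; every row of the result sums to 1.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for start, end in transitions:
        counts[start][end] += 1
    return {
        start: {end: n / sum(row.values()) for end, n in row.items()}
        for start, row in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical company-quarters: three stay at AA, one is downgraded to A.
    sample = [("AA", "AA"), ("AA", "AA"), ("AA", "AA"), ("AA", "A")]
    print(migration_matrix(sample))

    # Share of ratings that change in a quarter, from the reported counts.
    print(round(255 / 3819 * 100, 1))
```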
Altman (2005) found the internal credit model to be useful because of the
flexibility and ease it offers. The study found that the extensive default histories
of the external rating agencies can be combined to predict the probability of default
over investment horizons stretching from 1 year to 10 years or even longer (Altman
E. I., 2005).
This paper develops three forms of Z-score model. The first model is developed with
the data of the internal credit rating models of Indian banks. The second model is
developed from Altman's five-ratio model with one modification: debt to book
value of equity is used instead of debt to market value of equity. The third model is a
replication of Altman, Hartzell and Peck's model. The study uses a sample of 112
firms from India, derived from the defaulters' lists of five public sector banks,
namely State Bank of India, Bank of India, Central Bank of India, Bank of Baroda and
Punjab National Bank. The total number of defaulters on 31st Dec 2002 was 2047, from
which private limited companies and non-banking financial companies were excluded,
bringing the figure down to 1297. From this sample 56 defaulting companies were
stratified on the basis of asset size and industry category, and 56 matching
non-defaulting companies were chosen, to carry out the study using the multiple
discriminant analysis used by Altman. Current ratio, debt-equity ratio, operating
margin, working capital to total assets, earnings before interest and tax to total
assets, net worth to debt, and asset-turnover ratio are found to be significant in
predicting default.
The classification accuracy of the first model was found to be just 68 percent, while
for the second and third models it increased to 82 percent. It was
observed that Altman's model is capable of predicting default correctly for most of
the defaulting firms in the sample, and the same was true for the holdout sample used
for validation. That means the variables used in Altman's model can predict default
accurately in the Indian scenario. Altman's model has also performed better than the
internal credit rating models of the banks: 82 percent in comparison to 68
percent. On the basis of these findings, it can be said that banks should
use Altman's model rather than just relying on their internal credit rating systems.
In the internal credit rating models used by banks and financial institutions, the
financial risk factors used are found to be less effective in differentiating between
the possible defaulting and non-defaulting firms. The efficiency of the model is found
to be around 68 per cent. For any bank or financial institution this level of accuracy
is very low, and it becomes imperative for banks to choose risk factors more carefully.
Z scores can be useful: banks can use them to improve their internal ratings and
scale up the whole internal assessment process by assigning different credit ratings to
different Z scores. On the basis of their historical database of debts, banks can
estimate the coefficients and develop different Z-score calculators using the
estimated coefficients for different categories of borrowers, such as SMEs, mid-cap
companies and large corporations. Banks are at liberty to assign weights to different
financial ratios using the discriminant coefficients for the purpose of rating
borrowers. After this, banks should compare the estimated results against the actual
results to assess the efficiency of their internal rating process and, if required,
revise the weights of the different parameters. This exercise would help banks to put
themselves on the track of Basel-II (Jayadev, 2006).
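The step of estimating discriminant coefficients from a historical database, as described above, can be sketched with Fisher's two-group linear discriminant. This is a minimal sketch restricted to two ratios so that the matrix inverse can be written out by hand; the function names and the ratio values are hypothetical, and it is not the estimation procedure of any particular study cited here.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, float]


def fisher_discriminant(group_a: Sequence[Row], group_b: Sequence[Row]):
    """Return (weights, cutoff) for two groups described by two ratios.

    Weights are S_w^{-1} (mean_a - mean_b), where S_w is the pooled
    within-group scatter. Scores above the cutoff point towards group_a.
    """
    def mean(rows: Sequence[Row]) -> List[float]:
        n = len(rows)
        return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

    def scatter(rows: Sequence[Row], m: List[float]) -> List[List[float]]:
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    mean_a, mean_b = mean(group_a), mean(group_b)
    s_a, s_b = scatter(group_a, mean_a), scatter(group_b, mean_b)
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    if det == 0:
        raise ValueError("within-group scatter matrix is singular")
    inv = [[s_w[1][1] / det, -s_w[0][1] / det],
           [-s_w[1][0] / det, s_w[0][0] / det]]
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    weights = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
               inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff halfway between the projected group means.
    cutoff = (discriminant_score(weights, mean_a)
              + discriminant_score(weights, mean_b)) / 2
    return weights, cutoff


def discriminant_score(weights: Sequence[float], row: Sequence[float]) -> float:
    """Linear discriminant score of one firm."""
    return weights[0] * row[0] + weights[1] * row[1]


if __name__ == "__main__":
    # Hypothetical (working capital / total assets, EBIT / total assets).
    non_defaulters = [(0.30, 0.12), (0.25, 0.10), (0.35, 0.15), (0.28, 0.09)]
    defaulters = [(0.05, -0.02), (0.10, 0.01), (0.02, -0.05), (0.08, 0.00)]
    w, c = fisher_discriminant(non_defaulters, defaulters)
    print(w, c)
```

Comparing the resulting classifications with actual outcomes on a holdout sample gives the kind of accuracy figures reported in the studies above.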
Within a credit assessment process, distinguishing accurately between solvent and
insolvent borrowers is very important for banks, and this is also the general objective
of discriminant analysis. To achieve this end, banks may use other
functions and parameters, which may contain several independent creditworthiness
criteria. This study found that, with the help of discriminant analysis and a
simple Bayesian classifier technique, banks can estimate the default probability of
their non-banking clients more accurately and precisely. The model developed for the
study analyzed different sets of financial data covering 2 years. Overall, 84% of the
companies that defaulted were correctly classified by the discriminant functions. A
customized rating scale using independent creditworthiness criteria increased the
success rate to around 98%. Of the companies that were classified as not reliable by the
discriminant functions, 97.22% actually ended up defaulting later on (Mileris, 2010).
Ramaratnam & Jayaraman (2010) carried out a study to predict, compare
and analyse the classification accuracy of results for steel companies
from India over the sample period. Altman's original Z-score model
served as the benchmark for the study, and the Z score of five steel companies was
calculated to assess the possibility of their default (Ramaratnam & Jayaraman, 2010).
Rashid and Abbas (2011) aim to identify the financial ratios that are
most significant in bankruptcy prediction for the non-financial sector of Pakistan,
based on a sample of companies which became bankrupt over the period 1996-2006.
Twenty-four financial ratios covering four important financial attributes,
namely profitability, liquidity, leverage and turnover, were examined for a
five-year period prior to bankruptcy. The discriminant analysis produced a
parsimonious model of three variables, viz. sales to total assets, EBIT to current
liabilities, and cash flow ratio (Rashid & Abbas, 2011).
The availability of new statistical models and concepts like univariate and multiple
discriminant analysis facilitated development of new models that can help in
predicting default. These models along with the ratio analysis were found to be more
efficient and accurate than the ratio analysis models alone (Sirirattanaphonkun &
Pattarathammas, 2012).
Gupta, Singh and Maheshwari (2013) carried out a default prediction study on listed
textile companies from Punjab, India using the Zeta model. The study reveals that the
Zeta model can be very reliable, with an accuracy rate of 86.67%, indicating that the
model holds good in the majority of cases (Gupta, Singh, & Maheshwari, 2013).
Li (2014), using data on 108 bankrupt and non-bankrupt firms from the construction
industry in North America during 1985-2013, explores the predictive ability of a
tailored version of Altman's bankruptcy prediction model for the construction industry.
The results of the study suggest that there is no single dominant factor. Also, if the
original 5-factor Z-model is expanded to a 14-factor Z-model, there is no clear-cut
preference, meaning that the original model stands valid and effective in default
prediction (Li, 2014).
Castagnolo & Ferro (2014) examine four default prediction models, namely Z-score,
O-score, Campbell, and the Merton distance to default model (MDDM), with the help of
three techniques (intra-cohort analysis, power curves and discrete hazard rate models)
between 1990 and 2010 on quarterly financial datasets of 10439 firms. The findings of
the study suggest that there is a need for a mix of models rather than just one model.
Ohlson's (1980) O-score model outperforms the other models, while Merton's
Distance to Default Model was found not to be a sufficient predictor of default events.
However, the discrete hazard rate models suggest that combining both should enhance
default prediction models (Castagnolo & Ferro, 2014).
Gupta (2014) conducted a study on Indian data using discriminant analysis and the
logistic regression method to assess classification accuracy on a sample of 120 firms,
consisting of 60 defaulted and 60 solvent firms. The solvent companies were chosen on a
stratified random basis to match the defaulted list. Independent variables were
calculated from the financial statements of the participating firms, and dependent
variables were collected from CRISIL ratings. For the logistic regression method,
macro-economic as well as industry dummy variables were included in the study. The
model built by the study is found to have higher predictive ability than
Altman's original Z-score model and Altman's emerging market model. The
empirical results suggest that the proposed logistic regression model also has high
predictive ability and that, when compared with the discriminant analysis model, the
predictive ability of the logistic regression model is higher. Another
interesting finding of the study is that accounting-based models can predict
financial distress, but there is a need to include some other variables besides the
accounting variables (Gupta V. , 2014).
The study has the objective of developing a default prediction model in the Indian
context over the sample period. The sample consists of 60 firms, of which 30 are
defaulting firms with a CRISIL 'D' rating and 30 are non-defaulting. The
study develops three models using discriminant analysis. The first model is built upon
the five ratios of the original Altman (1968) model. The second model is built upon the
ratios from the Altman model for emerging markets, and the third model is developed from
the significant ratios found in the study. The results of the study reveal that the
classification accuracy of the third model is higher than that of the first two models
(Desai & Joshi, 2015).
This study analyses four public sector steel companies from India during the sample
period of 2011 to 2015 using Altman Z score model and benchmarks original Altman
model for comparison. During the analysis period it has been found that these four
companies have been struggling as their Z score has deteriorated pushing them into
grey zone. This indicates that these firms need to look into their financial aspects
(Kaur & Srivastava, 2016).
The purpose of the study is to establish a relationship between the size of the company
and the probability of default for firms from the Indian steel sector using a logistic
regression model and Altman's Z score model. The study uses financial data of 10
steel companies from India over a 5-year period from 2009-10 to 2013-14. The results
of the study clearly indicate that the size of the firm has an inverse relationship with
the probability of default: as the size of the firm increases, the probability of
failure decreases, and vice-versa. However, the study is restricted to the steel sector,
the sample size is very small, and the sample period is too narrow to generalize the
results. These results may be significant for the sample, but they need to be validated
with a larger set of data over a longer sample period (Singla & Singh, 2017).
3.5.2 Regression Methods
The limitations of the MDA approach led to the development of alternatives such as the
Logit model by Ohlson (1980) and the Probit model by Zmijewski (1984) (Altman & Sabato,
2007). Ohlson (1980) applied a conditional Logit model to a data set of 105 bankrupt
and 2058 non-bankrupt firms from the US over the period 1970-1976, using 7
financial ratios and 2 categorical variables. The classification accuracy was lower than
that of the studies based on MDA (Ohlson, 1980).
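The logit approach described above models the probability of default as a logistic function of a linear combination of firm variables. The following is a minimal sketch that fits such a model by plain gradient ascent on the log-likelihood; it is not Ohlson's specification, and the function names, the single explanatory ratio and the data are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_pd(weights: Sequence[float], row: Sequence[float]) -> float:
    """Probability of default; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
    return sigmoid(z)


def fit_logit(
    features: Sequence[Sequence[float]],
    defaulted: Sequence[int],
    learning_rate: float = 1.0,
    iterations: int = 5000,
) -> List[float]:
    """Fit intercept and coefficients by gradient ascent on the mean log-likelihood."""
    n, k = len(features), len(features[0])
    weights = [0.0] * (k + 1)
    for _ in range(iterations):
        gradient = [0.0] * (k + 1)
        for row, y in zip(features, defaulted):
            error = y - predict_pd(weights, row)
            gradient[0] += error
            for j, x in enumerate(row):
                gradient[j + 1] += error * x
        weights = [w + learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


if __name__ == "__main__":
    # Hypothetical net income to total assets ratios; 1 = defaulted.
    ratios = [[-0.10], [-0.05], [0.00], [0.02], [0.05], [0.08], [0.12], [0.20]]
    outcome = [1, 1, 1, 1, 0, 0, 0, 0]
    w = fit_logit(ratios, outcome)
    print(w, predict_pd(w, [-0.10]), predict_pd(w, [0.20]))
```

Unlike a discriminant score, the output is directly interpretable as a probability, which is one reason the logit model became a popular alternative to MDA. The toy data here are perfectly separable, so the coefficients keep growing with more iterations; real samples overlap and are normally fitted with a statistical package.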
The importance of the analysis and management of credit risk has been increasing
over time. To manage risks regulatory authorities have been coming up with new
regulations to ensure sustainability of financial institutions and banks. These rules and
regulations force banks and other financial institutions to formulate credible processes
and methods to bring down the possibility of default by borrowers. Increasing
competition has affected the risk and reward relationship between clients and
financial institutions, and many new ways and methods of
measuring default risk have been arrived at. Many modern risk indicators, such as
Credit Risk Capital (CRC) and Risk Adjusted Return On Capital (RAROC), are now well
established among banks. These methods use the expected loss of the bank portfolio,
which is problematic because of the assumptions regarding the estimated default
frequency (EDF) for each client or group of clients. Benchmark models for CRC
calculations treat EDFs as exogenous and do not devote much attention to how they can
be obtained (Westgaard & Wijst, 2001). The proposed model uses a logistic regression
model to estimate these rates, assuming that financial variables as well as other firm
characteristics affect the default probability (Westgaard & Wijst, 2001).
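To show why the estimated default frequency matters for such indicators, the sketch below computes a portfolio expected loss as the sum of EDF times loss given default times exposure for each client. The loss given default and exposure inputs, the function name and the figures are assumptions introduced for illustration; they are not taken from Westgaard and Wijst (2001).

```python
from typing import Iterable, Tuple


def portfolio_expected_loss(
    clients: Iterable[Tuple[float, float, float]]
) -> float:
    """Sum of EDF x loss given default x exposure over all clients.

    Each client is (edf, loss_given_default, exposure); EDF and loss
    given default are fractions between 0 and 1.
    """
    total = 0.0
    for edf, loss_given_default, exposure in clients:
        if not (0.0 <= edf <= 1.0 and 0.0 <= loss_given_default <= 1.0):
            raise ValueError("EDF and loss given default must lie in [0, 1]")
        if exposure < 0:
            raise ValueError("exposure must be non-negative")
        total += edf * loss_given_default * exposure
    return total


if __name__ == "__main__":
    # Hypothetical two-client portfolio.
    print(portfolio_expected_loss([(0.02, 0.5, 1000.0), (0.10, 0.4, 500.0)]))
```

Since the EDF enters multiplicatively, any error in its estimate carries straight through to the expected loss, which is why a model-based estimate of the default probability is of practical value.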
Antune, Ribeiro, and Antão (2005) conducted a study of the default situation of
Portuguese non-financial companies. The study was conducted in a regime of relatively
stable interest rates and a phase of moderate economic growth. It estimates a loan's
default probability by credit dimension class, and the average default probability of
the Portuguese portfolio of credit to non-financial firms, given two different
macroeconomic scenarios. Under a scenario with a strong economic deceleration, the
probability of default of non-financial firms is found to be increasing, in the range
of 2% to 3.4% at the end of the 2-year period. The study found that, given the
characteristics of the credit portfolio, the average rate of default would still be
relatively low (Antune, Ribeiro, & Antão, 2005).
Bandyopadhyay (2006) conducted a study on Indian corporates to develop a model for
the prior prediction of default with financial and non-financial variables, using
multiple discriminant analysis and the logistic regression method on a sample of 102
firms consisting of 52 defaulting and 52 non-defaulting firms from 1998 to 2003. Using
the sample, a Z-score model was developed that can predict default one year prior to
default. This model exhibited high classification accuracy in detecting defaulting
firms: 91 percent within sample, and 92 and 88 percent respectively for two holdout
samples. The model can also predict corporate
bankruptcy two years prior to financial distress with accuracy rates of 97 percent
and 96.3 percent respectively. Also, the developed model outperformed two
competing models based on the Altman (1968) and the emerging market score (1995) sets
of ratios respectively. The logit results show that PD is a decreasing function of cash
profit over total assets, working capital to assets, total sales relative to total
assets, solidity, solvency ratio, firm age, ISO certification and top 50 group
affiliation. Further, the industry affiliation of a firm is also an important factor in
explaining its default status and needs to be taken into account
(Bandyopadhyay A. , 2006).
This paper aims to develop a hybrid logistic model using the inputs obtained from the
BSM equity-based option model described in Bandyopadhyay (2007). It investigates the
ability of the market value of assets, asset volatility and the firm's leverage
structure to predict future default. The study uses a sample of 150 publicly
traded firms from India over the sample period of 1998 to 2005. The results of the
study suggest that using information from the structural model improves the predictive
ability of the logistic regression model (Bandyopadhyay A. , 2007).
Altman and Sabato (2007) conducted a study on small and medium sized enterprises (SMEs).
These enterprises are significantly different from large corporates from a credit risk
point of view. The findings demonstrate that managing credit risk for SMEs requires
models and procedures specifically focused on the SME segment. Five financial ratios,
namely EBITDA/Total Assets, Short-term Debt/Equity Book Value, Retained Earnings/Total
Assets, Cash/Total Assets and EBITDA/Interest Expense, are found in combination to be
the best predictors of SME default (Altman & Sabato, 2007).
Zeitun, Tian and Keen (2007) investigate the effect of cash flow
and free cash flow on corporate failure in an emerging market, namely Jordan.
The study arrives at three conclusions in the context of Jordanian firms: (1) a firm's
free cash flow increases the probability of corporate default; (2) a firm's cash flow
decreases the probability of corporate failure; and (3) capital structure is seen as
the main factor affecting the probability of default (Zeitun, Tian, & Keen, 2007).
Frade (2008) found that not all market variables are significant in
a logistic regression model. The financial ratio variables that targeted a company's
current leverage position, its current competitive advantages and its
profitability margins were important factors in determining the likelihood that the
company would default on outstanding debt. Also, a market indicator that
illustrates recent movements in the issuer's value, the price of the last trade, was
found to be significant (Frade, 2008).
With the increasing level of securitization, the attitude of lenders towards
default probability changed, as they have an incentive to originate loans that rate
highly on the parameters reported and disclosed to the shareholders, even though
unreported variables may point to lower-quality borrowers. The study found that lenders
tend to fix lending rates only on the basis of the factors and parameters that
need to be reported to the shareholders, while ignoring other vital
information that might affect the quality of the debt extended. This change in the
behaviour of lenders alters the data generating process by transforming the
mapping from observables to loan defaults, as the study illustrates. It is found that a
statistical default model estimated in a period of low securitization breaks down in a
systematic manner during a period of high securitization. Lenders tend to under-predict
possible defaults among borrowers by ignoring vital soft information that has more
impact on the creditworthiness of the borrowers (Rajan, Seru, & Vig, 2010).
The study is based on a theoretical economic approach, but the results obtained
using publicly available data are found to be very powerful. It uses the dynamic logit
or hazard logit that has been used by Shumway (2001), Chava and Jarrow (2004) as well as
Campbell et al. (2008). The study uses publicly available data rather than resorting to
data mining. As far as the selection of variables is concerned, relevant variables have
been used based on previous literature, along with a few macro-economic variables mainly
based on theory. The study is concerned with corporate failure rather than just
financial distress. The study uses a sample that consists of publicly listed firms under
receivership and liquidation in the UK during 1978 to 2006. The sample includes
7589 bankrupt firms and 49063 non-bankrupt firms. The firms
belong to all sectors except banks, insurance companies, investment trusts and the like.
Following the logic of Campbell et al. (2008), the study extends their model by
investigating the importance of macro-economic variables. It finds that the risk-free
rate of interest, the term structure of interest rates, and an inflation term are significant
variables in improving the basic accounting and market model of Campbell et al.
(2008). Extending the analysis further by adding an industry effect, as in Chava and
Jarrow (2004), marginally increases the power of this model. However, industry
effects seem more important in the context of a pure accounting model. The results of
the study give a few clear indications. The first is that financial statement information
on profitability, cash flow and liabilities is very important for assessing the
bankruptcy probability even if the model uses market variables. The second is that
replacing accounting measures of book value with market measures in financial ratio
denominators improves the predictive power of the model by making that information
more timely. The third is that at longer forecasting horizons, only accounting
information remains useful (Christidis & Gregory, 2010).
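As a minimal sketch of how a dynamic (hazard) logit turns firm-year covariates into
default probabilities, the Python fragment below scores each firm-year with a logistic
function and compounds the yearly hazards into a multi-year probability. The
coefficients, variable names and input values are hypothetical; they are not estimates
from Christidis & Gregory (2010) or from any other study cited here.

```python
import math

# Hypothetical coefficients for illustration only (not taken from any cited study).
INTERCEPT = -4.0
COEFFS = {"profitability": -3.0, "leverage": 2.5, "inflation": 0.1}


def yearly_hazard(covariates):
    """Logit hazard: probability of failure in a year, given survival so far."""
    score = INTERCEPT + sum(COEFFS[name] * value for name, value in covariates.items())
    return 1.0 / (1.0 + math.exp(-score))


def cumulative_default_probability(firm_years):
    """Compound yearly hazards: 1 minus the product of yearly survival probabilities."""
    survival = 1.0
    for covariates in firm_years:
        survival *= 1.0 - yearly_hazard(covariates)
    return 1.0 - survival


# Two hypothetical firm-years for one firm (accounting ratios plus one macro variable).
firm_years = [
    {"profitability": 0.05, "leverage": 0.60, "inflation": 2.0},
    {"profitability": -0.02, "leverage": 0.75, "inflation": 3.0},
]
hazards = [yearly_hazard(fy) for fy in firm_years]
two_year_pd = cumulative_default_probability(firm_years)
```

Each firm-year enters as a separate observation, which is what distinguishes the hazard
specification from a single-period logit fitted on one snapshot per firm.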
Wang & Zhou (2011) used financial data of 20 defaulting and 50 non-defaulting SMEs,
randomly selected from China over the sample period 2004-2007, to develop a binary
logistic regression model with the forward stepwise method, tested by likelihood
ratio. They then tested the model on 35 defaulting and 88 non-defaulting firms from
the same period. For large firms, financial indicators are found to be very important
for assessing the financial health of a firm and its ability to repay debt. This is not
always true for SMEs, whose ability to repay debt depends more on qualitative
variables than on financial information alone (Wang & Zhou, 2011).
Hauser & Booth (2011) use a three-fold validation scheme to assess the classification
and prediction ability of a logistic regression model with the Bianco and Yohai (BY)
estimator versus maximum likelihood (ML) logistic regression for financial ratio data
from 2006 and 2007. The results of the study indicate that the BY robust logistic
regression method can significantly improve the classification and prediction of
defaulting firms. Even in the worst cases, the BY robust logistic regression method has
no impact on the estimated coefficients. This strongly indicates that the BY robust
logistic regression method outperforms ML logistic regression (Hauser & Booth, 2011).
Ong, Yap, & Khong (2011) studied public listed companies in Malaysia to develop a
model that can help predict the financial distress these companies may face. The study
uses the logistic regression method to develop the model, with data from 2001 to 2007
for Bursa Malaysia listed public companies that have been classified as financially
distressed. The sample comprises 105 companies from seven different sectors. The
companies classified as financially distressed during 2001-2006 have been used to
develop the model, and the hold-out sample of companies listed as financially
distressed in 2007 has been used for validation. The results of the study show that
five financial ratios are statistically significant and can be used to predict corporate
loan default in the Malaysian market. The overall accuracy of the model is found to be
91.5%, which is higher than in many studies in Malaysia and many other countries. This
makes the developed model a reliable tool for identifying financial distress. The
classification accuracy is also higher than that of the studies conducted in
Malaysia which used the multiple discriminant analysis method (Ong, Yap, & Khong,
2011).
Considering the importance of SMEs in the world economy, Sirirattanaphonkun &
Pattarathammas (2012) conducted a study on SMEs from Thailand over the period
2000-2010 using MDA and Logit with 22 financial ratios. The study found that the
Logit model has higher predictive accuracy than MDA in the out-of-sample test.
Moreover, both models could be helpful in predicting default
(Sirirattanaphonkun & Pattarathammas, 2012).
Moghadas & Salami (2014) used a logistic regression model with 9
independent variables for a sample of 50 Iranian firms over the period 2002 to 2010.
The logistic model has 89.75% accuracy in discriminating between firms. The prediction
accuracy is 89% for non-bankrupt firms and 91% for bankrupt firms
(Moghadas & Salami, 2014).
Singh & Mishra (2016) conducted a default prediction study on a sample of Indian
manufacturing firms with three objectives: to develop a new bankruptcy prediction
model for Indian manufacturing companies; to revisit and re-estimate the Altman
(1968), Ohlson (1980) and Zmijewski (1984) models in order to examine the
sensitivity of these models to changes in financial conditions and time periods; and
to choose the best model for predicting the financial distress of Indian
manufacturing companies. The study uses a sample of 208 firms, comprising equal
numbers of defaulted and non-defaulted firms, for the period 2006 to 2014. The major
findings reveal that the overall predictive accuracy of all three models improves on
both the estimation and holdout samples when the coefficients are re-estimated. Amongst
the contesting models, the newly proposed model performs best in predicting bankruptcy
for Indian manufacturing companies. The study further suggests that the coefficients of
the models are sensitive to time periods and financial conditions (Singh & Mishra, 2016).
Senapati & Ghosal (2016) develop a model to predict the distress of non-government
non-financial public limited companies. The study uses a balanced panel of
1051 firms over the period 2006-07 to 2013-14. The model estimates, one year in advance,
the probability of a firm becoming financially distressed using the logistic regression
method. The study uses only three financial ratios, viz., long-term liabilities to total
assets, operating profits to total liabilities, and current assets to current liabilities.
The model was tested on some distressed industries/companies and was found to capture
the underlying distress. The results of the study suggest that corporate distress may
be predicted in advance (Senapati & Ghosal, 2016).
3.6 Structural Models
The seminal works of Black and Scholes (1973) and Merton (1974) introduced the first
structural models describing the default risk of companies. In the class of
first-passage time models, pioneered by Black and Cox (1976), the default of a
company is announced at the first time the firm value falls below a certain
boundary. It has been shown by Leland and Toft (1996) that under certain
assumptions this behaviour is optimal for the company owner. However, a fundamental
assumption of these models is that investors have complete information on
the firm's asset value as well as on the default boundary. In fact, investors usually do
not have complete information, and there are several approaches which deal with this
issue. Most researchers concentrate on first-passage time models. Duffie and Lando
(2001) consider the case where investors estimate the firm's asset value from noisy
accounting reports. Coculescu, Geman, and Jeanblanc (2006) consider a model where
investors observe a correlated index, and Frey and Schmidt (2006) filter the asset
value from discretely observed news. In contrast to these filtering approaches, there is
a different branch of research where the investors have incomplete information on
either the firm's asset value or the default barrier (or both), but no additional
information. This results in a class of highly tractable models. For example, Giesecke
(2006) considers the case where the firm value or default barrier (or both) may not be
observed, while in Giesecke and Goldberg (2004) the asset value is observed; both
papers deal with the case of a time-independent default barrier.
Leland (2004) uses exogenous and endogenous default boundary
models in which default costs and recovery rates are matched. Both models are found to
predict default probabilities efficiently and to fit observed default frequencies
equally well. They predict longer-term default frequencies quite
accurately for both investment grade and non-investment grade bonds, though
they tend to be less efficient in predicting shorter-term default frequencies
(Leland, 2004).
Charitou & Trigeorgis (2004) build on option-pricing theory to estimate default
probabilities of 420 distressed U.S. firms for the period 1986-2001. The results
indicate that the primary option variables, such as firm volatility, play an important
role in explaining distress up to five years prior to bankruptcy filing. When the model
is extended to account for the probability of default on interest and debt repayments
due at intermediate times prior to debt maturity, an option-motivated transformation
of the cash flow coverage is shown to have incremental explanatory power, while the
primary option variables remain statistically significant. The significant primary
option variables include the face value of debt owed at maturity (lnB), the current
market value of the firm's assets (lnV), and the standard deviation (σ) of firm value
changes (returns). The distance to default (d2d) and the probability of default at
maturity, N(-d2), were also found to be significant predictor variables (Charitou &
Trigeorgis, 2004).
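The option variables listed above combine into the familiar Black-Scholes-Merton
quantities. The sketch below is illustrative only: it computes d2 and the risk-neutral
probability N(-d2) that asset value ends below the face value of debt at maturity. The
function names and input values are hypothetical, and this is the textbook formula
rather than the exact specification estimated by Charitou & Trigeorgis (2004).

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_d2(asset_value, debt_face_value, risk_free_rate, asset_volatility, horizon_years):
    """d2 of the Black-Scholes-Merton model, often read as a distance to default."""
    drift = (risk_free_rate - 0.5 * asset_volatility ** 2) * horizon_years
    return (math.log(asset_value / debt_face_value) + drift) / (
        asset_volatility * math.sqrt(horizon_years)
    )


def risk_neutral_default_probability(asset_value, debt_face_value, risk_free_rate,
                                     asset_volatility, horizon_years):
    """Risk-neutral probability that assets end below the face value of debt: N(-d2)."""
    return norm_cdf(-merton_d2(asset_value, debt_face_value, risk_free_rate,
                               asset_volatility, horizon_years))


# Hypothetical inputs: V = 120, B = 100, r = 5%, T = 1 year, two volatility levels.
d2 = merton_d2(120.0, 100.0, 0.05, 0.30, 1.0)
pd_low_vol = risk_neutral_default_probability(120.0, 100.0, 0.05, 0.20, 1.0)
pd_high_vol = risk_neutral_default_probability(120.0, 100.0, 0.05, 0.30, 1.0)
```

With leverage held fixed, the higher volatility input gives the higher default
probability, which is consistent with firm volatility being a primary option variable.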
Bharath & Shumway (2004) examine the accuracy and contribution of the default
forecasting model based on Merton's (1974) bond pricing model and developed by
the KMV Corporation, using 1,449 firm defaults covering the period 1980-2003. They
compare the KMV-Merton model to a similar but much simpler alternative and find that it
performs slightly worse as a predictor in hazard models and in out-of-sample
forecasts. Moreover, several other forecasting variables are also important predictors,
and fitted hazard model values outperform KMV-Merton default probabilities out of
sample. Implied default probabilities from credit default swaps and corporate bond
yield spreads are only weakly correlated with KMV-Merton default probabilities after
adjusting for agency ratings, bond characteristics, and the alternative predictor. The
study concludes that the KMV-Merton model does not produce a sufficient statistic
for the probability of default, and it appears to be possible to construct such a
sufficient statistic without solving the simultaneous nonlinear equations required by
the KMV-Merton model (Bharath & Shumway, 2004).
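To make the phrase "simultaneous nonlinear equations" concrete, the sketch below solves
the two standard Merton equations for the unobserved asset value V and asset volatility
sigma_v, given equity value E, equity volatility sigma_e, face value of debt F, risk-free
rate r and horizon T. It is a generic fixed-point iteration written for illustration,
with hypothetical inputs; it should not be read as KMV's proprietary procedure or as the
exact algorithm used by Bharath & Shumway (2004).

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_d2(V, F, r, sigma_v, T):
    """Black-Scholes-Merton d1 and d2 for asset value V and debt face value F."""
    d1 = (math.log(V / F) + (r + 0.5 * sigma_v ** 2) * T) / (sigma_v * math.sqrt(T))
    return d1, d1 - sigma_v * math.sqrt(T)


def solve_merton(E, sigma_e, F, r, T, tol=1e-10, max_iter=1000):
    """Fixed-point iteration on the two Merton equations:
         E       = V * N(d1) - F * exp(-r * T) * N(d2)
         sigma_e = (V / E) * N(d1) * sigma_v
    """
    V = E + F                          # starting guess: book-style asset value
    sigma_v = sigma_e * E / (E + F)    # starting guess: de-levered equity volatility
    for _ in range(max_iter):
        d1, d2 = d1_d2(V, F, r, sigma_v, T)
        V_new = (E + F * math.exp(-r * T) * norm_cdf(d2)) / norm_cdf(d1)
        sigma_new = sigma_e * E / (V_new * norm_cdf(d1))
        converged = abs(V_new - V) < tol and abs(sigma_new - sigma_v) < tol
        V, sigma_v = V_new, sigma_new
        if converged:
            break
    return V, sigma_v


# Hypothetical firm: equity 40, equity volatility 50%, debt face value 60, r = 5%, T = 1.
asset_value, asset_vol = solve_merton(E=40.0, sigma_e=0.5, F=60.0, r=0.05, T=1.0)
```

The starting guesses used here (V = E + F and a de-levered equity volatility) are simple
closed-form quantities, which gives a sense of why a simpler alternative to the full
solution may be attractive.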
Patel and Pereira (2005) use a sample of 52 real estate companies listed on the
London Stock Exchange (LSE) during the period 1984-2004 to estimate expected default
probabilities (EDPs) using different models such as those proposed by Merton (1974),
Black and Cox (1976), Longstaff and Schwartz (1995), Leland and Toft (1996), Ericsson
and Reneby (1998) and Collin-Dufresne and Goldstein (2001). EDPs are estimated for
both failed and non-failed firms. The results are generally consistent with the models'
predictions, and the estimates of EDPs for different models are closely clustered. The
EDPs were analysed using logistic regressions. The findings suggest that the observed
misclassification of companies by structural models is due to special company
management and/or regulatory circumstances rather than limitations of these models
(Patel & Pereira, 2005).
Bandyopadhyay (2007) used the Black, Scholes and Merton model along
with logistic regression on an unbalanced panel of 53 Indian firms observed over
different numbers of time periods, giving a cross section of 131 observations.
The results establish that option-based models that use stock market and balance sheet
information can fairly predict the default status of corporates prior to any credit ratings
issued by credit rating agencies. Distance to default has been found to be the single most
important measure that can help in assessing the asset value and volatility and the
balance sheet liquidity. The option-based model has also been found to be more efficient
than other ratio-based models (Bandyopadhyay, 2007).
Agarwal and Taffler (2008) highlight the fact that accounting-based models have an
important role to play in the prediction of financial distress. They note that whilst
accounting ratio based models are much criticised for a lack of theoretical grounding,
they are capable of out-performing their more theoretically appealing contingent-
claims based alternatives. They point out three factors that might favour accounting
models: first, corporate failure is likely to evolve over time and as such will be
reflected in financial statements; second, the nature of double entry bookkeeping
means that combining accounting information should overcome the effects of policy
changes or “window dressing”; third, loan covenants are generally based on
accounting numbers. It is interesting to note that this latter point is timely in relation
to the current credit and economic crisis, whilst the first point is consistent with our
evidence on prior year bankruptcy prediction (Agarwal & Taffler, 2008).
A firm-value model similar to the one proposed by Black and Cox (1976) is
considered. Instead of assuming a constant and known default boundary, the default
boundary is an unobserved stochastic process. This process has a Brownian
component, reflecting the influence of uncertain effects on the precise timing of the
default, and a jump component, which relates to abrupt changes in the policy of the
company, exogenous events or changes in the debt structure. Interestingly, this setup
admits a default intensity, so the reduced form methodology can be applied. This
paper extends to the case where the default barrier is allowed to be a stochastic
process (Schmidt & Novikov, 2008).
Kulkarni, Mishra, & Thakkar (2008) propose a model to predict the default probability of
firms using the Black, Scholes and Merton framework. The study uses financial data of
twelve companies from 1998 to 2004 from Prowess and rating data from Credence Analytics
for Indian corporates. Over the sample period the objective probability measures are
found to be at the higher end compared with the risk-neutral measures, although the
probability estimates are found to be robust to the default trigger point. The proposed
model's results are found to be comparable with the default rate reported by CRISIL's
Average 1-year rating transitions and the Altman Z-score coefficient. However, like the
literature on credit spreads, this model is not able to generate spreads as high as those
observed in the corporate bond market. The results suggest that probability of default
estimates are significantly dependent on equity price and are sensitive to the default
trigger point (sum of short-term and long-term debt). The rate of asset drift is also
found to fall short of the risk-free rate over the sample period (Kulkarni, Mishra, &
Thakkar, 2008).
Katz & Shokhirev (2010) propose a generalized Black-Cox structural model of default
risk. It has been observed that some firms avoid default for some time even when the
firm's liabilities exceed its assets, while many firms default even when their
assets exceed their liabilities. The proposed
model tries to capture the risk associated with the firm's ability to avoid default even
if its liabilities exceed its assets (Katz & Shokhirev, 2010).
Davydenko (2012) finds that default is triggered by illiquidity in the
firm. That is, when the market value of the firm's assets falls below a certain level
(the default boundary), the firm tends to default on the payment of principal or interest
or both. The study uses observed prices of debt and equity of defaulted firms in the
US market from January 1997 to December 2010, with a sample of 306 firms.
Based on the market values of defaulting firms, the default boundary has
been estimated at 66% of the face value of debt: if the market value of
a firm's assets falls below 66% of the face value of its debt, the firm defaults on its
payments. This generalization looks very attractive as it gives a clear-cut boundary, but
in cross-validation it produces a substantial number of misclassifications, which
eventually brings down the efficiency and accuracy of the model (Davydenko,
2012).
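A minimal sketch of how such a boundary rule classifies firms, and of why
misclassifications arise, is given below. Only the 66% fraction comes from the summary
above; the firm observations and function names are hypothetical and chosen purely to
illustrate the two kinds of error.

```python
BOUNDARY_FRACTION = 0.66  # boundary estimate reported in the summary above


def predicted_default(asset_market_value, debt_face_value,
                      boundary_fraction=BOUNDARY_FRACTION):
    """Predict default when asset value is below the boundary fraction of debt."""
    return asset_market_value < boundary_fraction * debt_face_value


def misclassification_rate(firms, boundary_fraction=BOUNDARY_FRACTION):
    """Share of firms whose predicted status differs from the observed status."""
    errors = sum(
        predicted_default(value, face, boundary_fraction) != defaulted
        for value, face, defaulted in firms
    )
    return errors / len(firms)


# Hypothetical (asset value, face value of debt, actually defaulted) observations.
firms = [
    (50.0, 100.0, True),    # below the boundary and defaulted: correct
    (80.0, 100.0, True),    # above the boundary yet defaulted: missed default
    (60.0, 100.0, False),   # below the boundary yet survived: false alarm
    (150.0, 100.0, False),  # above the boundary and survived: correct
]
rate = misclassification_rate(firms)
```

A single boundary cannot separate firms that default with assets above it from firms
that survive with assets below it, which is the source of the errors noted above.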
Sharma, Singh, & Upadhyay (2014) study the default probabilities of 47 Indian firms over
the period 2007 to 2013. The study uses an options-based method, the Black, Scholes and
Merton model, to predict the probability of default of these firms over the assessment
period. It estimates the market value of assets, asset volatility,
risk-neutral default probability and real default probability of firms and identifies the
factors that have an impact on the default probabilities (Sharma, Singh, & Upadhyay,
2014).
Mishra & Singh (2016) use a sample of 80 firms comprising 30 distressed and 50
non-distressed firms, identified on the basis of BIFR references for the period 2006 to
2014. Accounting and market data have been collected from the respective balance sheets
and the Bombay Stock Exchange (BSE), and the risk-free rate has been collected from
Reserve Bank of India (RBI) publications. The BSM model has been used to derive a
risk-neutral (and objective) indicator of credit risk that can be used to assess the
financial distress of firms. The empirical findings show that the mean probability of
default (PD) estimated from BSM for the distressed group is higher than that
estimated for non-distressed firms. The model can be
applied to calculate direct PD estimates, which investors and customers can use
to take an informed decision on whether to do business with companies that
are likely to default in the near future. Banks and financial institutions can use the
model to predict whether a company is going to default before sanctioning credit
(Mishra & Singh, 2016).
3.7 Reduced Form Models
With reference to credit risk assessment, valuation models can broadly be
divided into two categories, mainly on the basis of how the default time is modelled.
The first category of models estimates the default time T using a Poisson process,
assuming that default is an arbitrary and sudden event. The second category
of valuation models estimates the default time T as the time when the process touches or
crosses a certain boundary or limit, indicating that the firm would be unable to repay
the debt on maturity.
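For the first category, the simplest illustration is a Poisson process with a constant
default intensity, under which the default time T is exponentially distributed. The
sketch below assumes a constant intensity (a simplification; the models reviewed in this
section let the intensity depend on covariates), and the 2% figure is hypothetical.

```python
import math


def survival_probability(intensity, horizon_years):
    """P(T > t) = exp(-lambda * t) for a constant default intensity lambda."""
    return math.exp(-intensity * horizon_years)


def default_probability(intensity, horizon_years):
    """P(T <= t): probability that default occurs within the horizon."""
    return 1.0 - survival_probability(intensity, horizon_years)


# Hypothetical constant intensity of 2% per year.
pd_1y = default_probability(0.02, 1.0)
pd_5y = default_probability(0.02, 5.0)
```

Under this assumption default can arrive at any instant without warning, in contrast to
the second category, where default occurs only once a boundary is reached.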
Duffie, Saita, and Wang (2007) in their study found that the estimated term structures
of default hazard rates of individual firms depend significantly in level and shape on
the current state of the economy and especially on the current leverage of the firm as
captured by distance to default. The maximum likelihood estimators of term structures
of conditional probabilities of corporate default, incorporating the dynamics of firm-
specific and macroeconomic covariates are very helpful in the course of prediction
(Duffie, Saita, & Wang, 2007).
Duan, Sun and Wang (2011) studied a large sample of US industrial and
financial firms spanning the period 1991-2010 on a monthly basis and developed a
reduced-form model for predicting corporate defaults/bankruptcies over different
prediction horizons. They proposed a forward intensity approach for default prediction
over different time horizons. This approach can also be used for credit risk analysis of
individual firms, such as credit ratings. The forward intensity model is also amenable
to aggregation, which allows for an analysis of default behavior at the portfolio and/or
economy level. By applying the aggregation algorithm of Duan (2010) for the
standard intensity model, one can generate the default distribution (in terms of the
number of defaults or the size of exposure) for any credit portfolio. In short, it also
offers a practical bottom-up approach to credit portfolio analysis. The implementation
of this model also factors in momentum in some variables and establishes their
importance in default prediction. Prediction using the forward intensity model is very
accurate for shorter horizons; the accuracy deteriorates somewhat when the
horizon is increased to two or three years, but performance still remains reasonable
(Duan, Sun, & Wang, 2011).
Stein (2007) notes that power and calibration are related. As a result, it is better to
first identify the most powerful model for predicting the default probability from the
available models and then to calibrate that model to suit present needs. There may be
some problems in calibrating the model, but selecting a powerful model raises
the possibility of acceptable results and performance once it is calibrated. The future
is uncertain and unpredictable, so it is not possible to calibrate any model on future
defaults, but future use can be simulated through appropriately designed experiments.
By controlling for sample and time dependence, researchers can test models and
methodologies using the walk-forward testing method. Wherever it is possible, this
technique reduces the possibility of models becoming overfit, because the walk-forward
process does not use the same data in testing that was used to fit the model parameters.
Besides this, researchers can use as much data as possible to fit and test the model
(Stein, 2007).
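The walk-forward idea can be sketched as a generator of training and testing periods in
which each test period lies strictly after the data used for fitting. The expanding
window, the minimum of three training periods and the years used are assumptions made for
illustration; they are not settings taken from Stein (2007).

```python
def walk_forward_splits(periods, min_train_periods=3):
    """Yield (train_periods, test_period) pairs with an expanding training window."""
    ordered = sorted(periods)
    for i in range(min_train_periods, len(ordered)):
        # Fit on everything before position i, test on the period at position i.
        yield ordered[:i], ordered[i]


# Hypothetical annual data set.
years = [2001, 2002, 2003, 2004, 2005, 2006]
splits = list(walk_forward_splits(years))
```

Collecting the out-of-sample predictions from every split gives a test set that uses
most of the data while never scoring an observation with a model fitted on it.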
It has been a common belief that the two main credit risk models, i.e. the structural
model and the intensity model, cannot be unified; Chen (2007) shows that this is not
entirely correct. To this end, the study extends the definition of the classic
default intensity, the instantaneous default intensity, to the forward default intensity in
two different ways. Concurring with Duffie and Lando (2001) and Giesecke (2005),
the instantaneous default intensity does not generally exist unless the default time is
totally inaccessible, and the study applies this to firm-value processes. However, it
is proved that forward default intensities always exist. Based on the firm-value
information from the past to the current time, the conditional probability of default is a
function of the forward default intensity, not the instantaneous default intensity. As a
result, the yield spread and the forward spread are obtained in terms of forward
intensities. It is also shown that the forward default intensity can be decomposed into
the forward default intensity caused by an accessible process and the forward default
intensity caused by a totally inaccessible process (Chen, 2007).
Beaver, McNichols, & Rhie (2005), in their study of secular change in the predictive
ability of financial ratios for default, report two striking findings: (1) the robustness
of the predictive models using financial ratios is strong over time, showing only slight
changes; (2) the slight decline in the predictive ability of the financial ratios can be
offset by including market-related variables. When the financial ratios and
market-related variables are combined, the decline in predictive ability appears to be
very small. The finding is consistent with non-financial-statement information
compensating for a slight loss in predictive power of the financial ratios. It is well
established that financial ratios have predictive power up to at least five years prior
to bankruptcy. The paper extends this literature and the literature on the secular
change in the explanatory power of financial statements by examining changes in the
predictive ability of financial ratios with respect to bankruptcy (Beaver, McNichols,
& Rhie, 2005).
Saretto (2006) conducted a study on corporate bond defaults using a data set from 1979
to 2000 with a sample of 7,282 firms and 48,967 firm-years. The study finds that
corporate bond defaults can be predicted using financial ratios and that default risk is
related to the cross-section of expected stock returns. Using several performance
measures, it has been found that the duration model outperforms many existing models in
correctly classifying both default and non-default firms. Using the default
probabilities predicted in the study, the relationship between default risk and the
Fama-French distress factors, HML and SMB, was analyzed. Evidence is found that
supports the interpretation of HML as a distress-related factor. Both portfolio and
individual stock factor loadings are related
to the estimated default probabilities. The study suggests that there is a negative and
significant contemporaneous correlation between HML and shocks to the level of
aggregate financial distress. The paper thus consists of two major parts.
First, it examines how corporate debt defaults can be predicted using financial
information and proposes a model for forecasting default. Second, it analyses whether
and how this estimated measure of credit risk (the probability of default) is related to
the Fama and French factors at the firm and aggregate level (Saretto, 2006).
3.8 Other Models
Acharya, Chatterjee and Pal (2003) found that the liquidity, debt serviceability and
capital turnover ratios of firms are important indicators for forecasting probable debt
default. They found that the probability of default is inversely related to these three
ratios. Liquidity and capital turnover ratios remain the most
significant indicators even after controlling for heteroscedasticity. After controlling
for heteroscedasticity, the impact of debt service on default
probability becomes insignificant at the 5% level. This result suggests that past debt
service is not important in the prediction of debt default. Similarly, the marginal effect
of size is insignificant at the 5% level (Acharya, Chatterjee, & Pal, 2003).
The study by Aguado and Benito (2012) has suggested that the most robust
determinants of firm default are firm-specific variables such as the ratio of working
capital to total assets, the ratio of retained earnings to total assets, the ratio of total
liabilities to total assets, and the standard deviation of the firm stock return. In
contrast, aggregate variables do not seem to play a relevant role once firm-specific
characteristics (observable and unobservable) are taken into consideration (Aguado &
Benito, 2012).
Chava, Stefanescu, & Turnbull (2010) used a sample of balance sheet and market
information for the period 1980-2008, which included 3555 firms, 46605 firm-years and
477 default events, to model and estimate the loss distribution for credit-risky assets
such as bonds and loans. For this purpose, the probability of default and the
recovery rate have been calculated by developing a new class of default models which
explicitly focus on industry-specific as well as firm-specific characteristics. The
findings suggest that the choice of default model has a significant impact on loss
prediction. Default probabilities and recovery rates are also negatively correlated, and
the degree of correlation depends on the seniority of debt, the credit cycle and industry
(Chava, Stefanescu, & Turnbull, 2010).
Evaluating default correlations and the probabilities of multiple defaults is an
important task in credit analysis and risk management, but it is very difficult because
default correlations cannot be measured directly. Zhou (1997) gives an analytical
formula for calculating default correlations which can be easily implemented and
conveniently used in a variety of financial applications. The result of the paper also
provides a theoretical justification for many empirical results found in the literature
and increases understanding of the important features of default correlations
(Zhou, 1997).
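For orientation, the quantity being evaluated is the ordinary correlation between two
default indicators, which depends on the individual default probabilities and the joint
default probability. The sketch below computes only this standard definition from
hypothetical probabilities; it does not reproduce the closed-form expression for the
joint probability derived by Zhou (1997).

```python
import math


def default_correlation(p_a, p_b, p_joint):
    """Correlation between two default indicators with P(A), P(B) and P(A and B)."""
    covariance = p_joint - p_a * p_b
    return covariance / math.sqrt(p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical one-year default probabilities of 2% and 5%.
independent = default_correlation(0.02, 0.05, 0.02 * 0.05)  # joint = product
clustered = default_correlation(0.02, 0.05, 0.004)          # joint above product
```

The difficulty noted above lies in the joint probability: it is rarely observable for a
given pair of firms, so it has to come from a model.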
To assess the process of loan origination, Bhardwaj & Sengupta (2015) devise a metric
based on credit scores. These scores are then compared with FICO scores using loan-level
data for more than seven million subprime securitized loans
originated during 2000-2006 in the USA. The parametric and non-parametric
estimates of the credit scores give different results, which is more evident for low
credit scores. It has been found that there was a trend of over-emphasis on
high scores in the whole loan origination process, and that this over-emphasis on
high scores is found to have deteriorated over time. However, whenever other attributes
relating to loan origination and changes in economic conditions are controlled for,
FICO scores are found to be stable. The results also suggest a pattern in which
credit scoring was likely used to offset other riskier attributes at origination,
leading to an unconditionally higher rate of default, especially on originations with
low credit scores (Bhardwaj & Sengupta, 2015).
3.9 Indian Perspective
In India, limited research has been conducted on default prediction because of the
absence of proper bankruptcy laws and the lack of reliable data. There is complete
silence on information about default on bank loans. Even in the case of bonds, the
situation is not very encouraging. Sen Chaudhury (1999), Jayadev (2006), Bandyopadhyay
(2007), Kulkarni, Mishra, & Thakkar (2008) and Sharma et al. (2014) are a few studies that
have tried to explore the problem of default with a large set of data and have added
knowledge. Most of the studies, like Bandyopadhyay (2006), Gupta (2014), Jayadev
(2006) and Sen Chaudhury (1999), have used multiple discriminant analysis to
develop a model for Indian firms. Very few studies, like Bandyopadhyay (2006), have
used the logistic regression method. Similarly, there are very few studies on structural
models, like Bandyopadhyay (2007) and Sharma et al. (2014). As per the survey of
literature on default prediction, reduced form models have not been
explored yet in the Indian scenario.
The information on default history of firms is scarce in India (Jayadev, 2006).
The Insolvency and Bankruptcy Code 2016 came into force in 2016. Prior to it, the Sick
Industrial Companies (Special Provisions) Act (SICA) of 1985 served this purpose,
but not up to the mark. The Government has set up an insolvency regulator, the Insolvency
and Bankruptcy Board of India, which is responsible for regulating and overseeing
insolvency matters in the country, with the National Company Law Tribunal as the
adjudicating authority for corporate debtors (Ministry of Law and Justice, Government of
India, 2016).
Sen Chaudhury (1999) used the discriminant analysis method to build a model to
predict default by Indian firms that had borrowed from ICICI Bank Limited over
more than a decade. The proposed model predicts possible defaults with a high degree
of accuracy. The accuracy of the model for ICICI Bank loans was found to be 75%
two years prior to the occurrence of the event, based on the discriminant score (Sen
Chaudhury, 1999).
This study analyses credit rating migrations in Indian corporate bond market to bring
about greater understanding of its credit risk. The ratings data has been collected from
CRISIL over 24 quarters from January 1993 to October 1998 for a sample of 426
firms with 4300 data points (company-quarters of rating data). With refinement of
data, the sample size fell to 3819 company-quarters. Within this sample, there are 255
rating changes implying that about 6.7% of the ratings change during a quarter. The
rating migration probabilities reflect the biases of the sample period. The migration
probabilities may be made more symmetric by averaging the downgrade and upgrade
probabilities with suitable weights. The study pointed out anomalies in the rating
migration data that are suggestive of inadequacies in the rating process itself (Varma
& Raghunathan, 2000).
This paper develops three forms of Z-score models. The first model is developed with
the data of the internal credit rating models of Indian banks. The second model is
developed from Altman's five-ratio model with one modification: using debt to
book value of equity instead of debt to market value of equity. The third model is a
replication of Altman, Hartzell and Peck's model. The study uses a sample of 112
firms from India, derived from the defaulters' lists of five public sector
banks, namely State Bank of India, Bank of India, Central Bank of India, Bank of
Baroda and Punjab National Bank. The total number of defaulters on 31st Dec 2002
was 2047, from which private limited companies and non-banking financial companies
were excluded, bringing the figure to 1297. From this sample, 56 defaulting companies
were stratified on the basis of asset size and industry category, and 56 matching
non-defaulting companies were chosen to carry out the study using the multiple
discriminant analysis used by Altman. Current ratio, debt-equity ratio, operating margin,
working capital to total assets, earnings before interest and tax to total assets, net
worth to debt, and asset-turnover ratio are found to be significant in predicting default.
The classification accuracy of the first model was found to be just 68 percent, while
for the second and third models it increased to 82 percent. Altman’s model is
observed to predict default correctly for most of the defaulting firms in the sample,
and the same was true for the holdout sample used for validation. This means that the
variables used in Altman’s model can predict default accurately in the Indian
scenario. Altman’s model also performed better than the internal credit rating
models of the banks: 82 percent against 68 percent. On the basis of these findings,
banks should use Altman’s model rather than relying only on their internal credit
rating systems. The financial risk factors used in the internal credit rating models
of banks and financial institutions are found to be less effective in differentiating
between likely defaulting and non-defaulting firms; the efficiency of these models is
found to be around 68 percent. For any bank or financial institution this level of
accuracy is very low, and it becomes imperative for banks to choose risk factors more
carefully. Z scores can be useful here: banks can use them to improve the internal
ratings and to scale up the whole internal assessment process by assigning different
credit ratings to different Z scores. On the basis of a historical database of their
debts, banks can estimate the coefficients and develop separate Z-score calculators
for different categories of borrowers, such as SMEs, mid-cap companies and large
corporations. Banks are at liberty to assign weights to different financial ratios
using the discriminant coefficients for the purpose of rating the borrowers. Banks
should then compare the estimated results against the actual results to assess the
efficiency of their internal rating process and, if required, revise the weights of
the different parameters. This exercise would help banks put themselves on track
for Basel-II (Jayadev, 2006).
Bandyopadhyay (2006) conducted a study on Indian corporates to develop a model for
predicting default in advance with financial and non-financial variables, using
multiple discriminant analysis and logistic regression on a sample of 104 firms
consisting of 52 defaulting and 52 non-defaulting firms from 1998 to 2003. Using this
sample, a Z-score model has been developed that can predict default one year prior to
default. The model exhibited high classification accuracy in detecting defaulting
firms, at 91 percent within the sample, and the out-of-sample accuracy for two holdout
samples was found to be 92 and 88 percent respectively. The model can also predict
corporate bankruptcy two years prior to financial distress, with accuracy rates of 97
percent and 96.3 percent respectively. The developed model also outperformed two
competing models based on the Altman (1968) and emerging market score (1995) sets of
ratios respectively. The logit results show that PD is a decreasing function of cash
profit over total assets, working capital to assets, total sales relative to total
assets, solidity, solvency ratio, firm age, ISO certification and top 50 group
affiliation. Further, the industry affiliation of a firm is also an important factor
in explaining its default status and needs to be taken into account (Bandyopadhyay
A. , 2006).
This paper aims to develop a hybrid logistic model using inputs obtained from the
BSM equity-based option model described in Bandyopadhyay (2007). It investigates the
ability of the market value of assets, asset volatility and the firm’s leverage
structure to predict future default. The study uses a sample of 150 publicly traded
firms from India over the period 1998 to 2005. The results suggest that using
information from the structural model improves the predictive ability of the
logistic regression model (Bandyopadhyay A. , 2007).
Bandyopadhyay (2007) used the Black, Scholes and Merton model along with logistic
regression on unbalanced panel data of 53 Indian firms observed over different
numbers of time periods, giving a cross section of 131 observations. The results
establish that option-based models that use stock market and balance sheet
information can fairly predict the default status of corporates before any credit
ratings are issued by credit rating agencies. Distance to default has been found to
be the single most important measure, and it helps in assessing the asset value, its
volatility and the balance sheet liquidity. The option-based model has also been
found to be more efficient than ratio-based models (Bandyopadhyay A. , 2007).
This study proposes a model to predict the default probability of firms using the
Black, Scholes and Merton framework. The study uses financial data of twelve
companies from 1998 to 2004 from Prowess and rating data from Credence Analytics for
Indian corporates. Over the sample period the objective probability measures are
found to be higher than the risk-neutral measures, although the probability
estimates are found to be robust to the default trigger point. The results of the
proposed model are found to be comparable with the default rate reported in CRISIL’s
average one-year rating transitions and with the Altman Z-score coefficient. However,
like the literature on credit spreads, this model is not able to generate spreads as
high as those observed in the corporate bond market. The results suggest that the
probability of default estimates are significantly dependent on equity price and are
sensitive to the default trigger point (sum of short-term and long-term debt). The
rate of asset drift is also found to fall short of the risk-free rate over the sample
period (Kulkarni, Mishra, & Thakkar, 2008).
This study predicts, compares and analyses the classification accuracy of results
obtained for steel companies from India over the sample period. Altman’s original
Z-score model serves as the benchmark for the study, and the Z score of five steel
companies has been calculated to assess the possibility of their default
(Ramaratnam & Jayaraman, 2010).
This paper studies the default probabilities of 47 Indian firms over the period 2007
to 2013. It uses an options-based method, the Black, Scholes and Merton model, to
predict the probability of default of these firms over the assessment period. The
study estimates the market value of assets, asset volatility, risk-neutral default
probability and real default probability of the firms, and identifies the factors
that have an impact on the default probabilities (Sharma, Singh, & Upadhyay, 2014).
Gupta (2014) conducted a study on Indian data using discriminant analysis and
logistic regression to assess classification accuracy on a sample of 120 firms
consisting of 60 defaulted and 60 solvent firms. The solvent companies are chosen on
a stratified random basis to match the defaulted list. Independent variables were
calculated from the financial statements of the participating firms, and the
dependent variable was collected from CRISIL ratings. For the logistic regression,
macro-economic as well as industry dummy variables were included in the study. The
discriminant model built by the study is found to have higher predictive ability
than Altman’s original Z-score model and Altman’s emerging market model. The
empirical results suggest that the proposed logistic regression model also has
higher predictive ability than these benchmarks and that, when compared with the
discriminant analysis model, the predictive ability of the logistic regression model
is higher still. Another interesting finding is that accounting-based models can
predict financial distress, but there is a need to include other variables besides
the accounting ratios (Gupta V. , 2014).
The study aims to develop a default prediction model in the Indian context over the
sample period. The sample consists of 60 firms, of which 30 are defaulting firms with
a CRISIL ‘D’ rating and 30 are non-defaulting firms. The study develops three models
using discriminant analysis. The first model is built upon the five ratios of the
original Altman (1968) model. The second model is built upon the ratios from the
Altman model for emerging markets, and the third model is developed from the
significant ratios found in the study. The results reveal that the classification
accuracy of the third model is higher than that of the first two models
(Desai & Joshi, 2015).
This study analyses four public sector steel companies from India during the sample
period of 2011 to 2015 using the Altman Z-score model, with the original Altman model
as the benchmark for comparison. During the analysis period these four companies have
been struggling: their Z scores have deteriorated, pushing them into the grey zone.
This indicates that these firms need to look into their financial position (Kaur &
Srivastava, 2016).
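The zone language used in these Z-score studies can be illustrated with a minimal
sketch of the original Altman (1968) Z-score for listed manufacturing firms. The
coefficients and the 1.81 / 2.99 cut-offs are the commonly restated ones from Altman
(1968); the function names and the sample ratios are hypothetical and are not taken
from any of the studies reviewed here.

```python
def altman_z(wc_ta, re_ta, ebit_ta, mve_tl, sales_ta):
    """Original Altman (1968) Z-score; all inputs are ratios in decimal form."""
    return (1.2 * wc_ta + 1.4 * re_ta + 3.3 * ebit_ta
            + 0.6 * mve_tl + 1.0 * sales_ta)


def altman_zone(z):
    """Map a Z-score to the conventional distress / grey / safe zones."""
    if z < 1.81:
        return "distress"
    if z <= 2.99:
        return "grey"
    return "safe"


# Hypothetical ratios for one firm (illustrative values only).
z = altman_z(wc_ta=0.10, re_ta=0.10, ebit_ta=0.10, mve_tl=0.50, sales_ta=1.00)
zone = altman_zone(z)
print(round(z, 2), zone)
```

A firm whose score drifts from above 2.99 to between the two cut-offs over successive
years would be described as moving into the grey zone.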
The purpose of this paper is to develop a model to predict the distress of
non-government non-financial public limited companies. The study uses a balanced
panel of 1051 firms over the period 2006-07 to 2013-14. The model estimates the
one-year-ahead probability of a firm becoming financially distressed using logistic
regression. The study uses only three financial ratios, viz., long-term liabilities
to total assets, operating profits to total liabilities, and current assets to
current liabilities. The model was tested on some distressed industries/companies
and was found to capture the underlying distress. The results suggest that corporate
distress may be predicted in advance (Senapati & Ghosal, 2016).
Singh & Mishra (2016) conducted a default prediction study on a sample of Indian
manufacturing firms with three objectives: to develop a new bankruptcy prediction
model for Indian manufacturing companies; to revisit and re-estimate the Altman
(1968), Ohlson (1980) and Zmijewski (1984) models in order to examine the
sensitivity of these models to changes in financial conditions and time periods; and
finally, to choose the best model for predicting financial distress of Indian
manufacturing companies. The study uses a sample of 208 firms, with equal numbers of
defaulted and non-defaulted firms, for the period 2006 to 2014. The major findings
reveal that the overall predictive accuracy of all three models improves on both the
estimation and holdout samples when the coefficients are re-estimated. Among the
competing models, the newly proposed model performs best in predicting bankruptcy
for Indian manufacturing companies. The study further suggests that the coefficients
of the models are sensitive to time periods and financial conditions (Singh &
Mishra, 2016).
The study uses a sample of 80 firms, comprising 30 distressed and 50 non-distressed
firms identified on the basis of BIFR references, for the period 2006 to 2014.
Accounting and market data have been collected from the respective balance sheets
and the Bombay Stock Exchange (BSE), and the risk-free rate has been collected from
Reserve Bank of India (RBI) publications. The BSM framework has been used to derive
a risk-neutral (and objective) indicator of credit risk that can be used to assess
the financial distress of firms. The empirical findings show that the mean PD
estimated from BSM for the distressed group is higher than the mean PD estimated for
the non-distressed firms. The model can be applied to calculate direct PD estimates,
which investors and customers can use to take an informed decision on whether to do
business with companies that are likely to default in the near future. Banks and
financial institutions can use the model to predict whether a company is going to
default before sanctioning credit (Mishra & Singh, 2016).
The purpose of the study is to establish a relationship between the size of the
company and the probability of default for firms from the Indian steel sector, using
a logistic regression model and Altman’s Z-score model. The study uses financial data
of 10 steel companies from India over the five-year period from 2009-10 to 2013-14.
The results clearly indicate that the size of the firm has an inverse relationship
with the probability of default: with an increase in the size of the firm, the
probability of failure decreases, and vice versa. The study is restricted to the
steel sector, the sample size is very small, and the sample period is too narrow to
generalize the results. These results may be significant for the sample companies in
India, but they need to be validated with a larger data set over a longer duration
(Singla & Singh, 2017).
3.10 The American Perspective
The USA has been the birthplace of credit risk models, and most of them have evolved
and developed there. From fundamental studies to univariate and multivariate models,
and from structural models to reduced form models, all these methods have evolved
and developed in the US. The most widely used model, the Altman (1968) Z-score
model, was developed in the US. Other models, such as structural models, reduced
form models, the Ohlson (1980) O-score model, the Zmijewski score model and the KMV
model, have also evolved and developed in the US.
Thousands of studies have been carried out in the US. The list is very long; however,
Wall (1919), Smith (1930), Smith and Winakor (1935), Fitzpatrick (1932), Durand
(1941), Merwin (1942), Beaver (1966), Altman (1968), Black and Scholes (1973),
Merton (1973), Merton (1974), Merton (1976), Ohlson (1980), Zmijewski (1984), Jarrow
and Turnbull (1995), Kealhofer (2003) and Altman (2005) are some of the most
important studies in credit risk modelling, and all of these have been carried out
on US data. There is also a large literature on artificial neural networks, fuzzy
models, Bayesian models and hybrid models in the context of the US.
3.11 The British Perspective
Default prediction studies have long been carried out in the UK and the list is very
long, but the UK lags behind when compared to the US. The studies carried out in the
UK have used almost every method and model, such as MDA, logit, probit, structural
models, reduced form models, hybrid models, Bayesian models and artificial neural
networks, and they have offered some crucial results as well.
Some of the important studies that have assessed credit risk are Taffler (1984), Peel,
Peel, & Pope (1986), Goudie & Meeks (1991), Alici (1995), Bharath & Shumway
(2004), Agarwal & Taffler (2008), Agarwal & Taffler (2008b), Charalambakis,
Espenlaub, & Garrett (2009), Charalambakis, Espenlaub, & Garrett (2009), Christidis
& Gregory (2010), Duan, Sun, & Wang (2011), Blanco, Irimia, & Oliver (2012),
Bunyaminu & Issah (2012), Jackson & Wood (2013) and Lin (2015) etc.
3.12 Conclusion
Since the study of Fitzpatrick (1932), different prediction methods and models have
evolved. These models have used accounting and market information, and their
efficiency has differed across economic realities and periods, creating a demand for
better models. Different variations of multiple discriminant analysis based on
Altman (1968) have been performing very well as credit scoring models, while other
models like the Black, Scholes and Merton model (KMV) work well in predicting
default. Other structural and duration models also provide vital information for the
prediction of default.
Chapter 4
Theoretical Framework and Methods
4.1. Introduction
Four different methods have been used in the present study. Every method has its own
mathematics, methodology and assumptions, and different sets of variables are
required for different methods. This chapter discusses, in separate sections, the
theoretical constructs, the methodologies followed, the variables selected, the
sample period, the data and its sources, and the statistical analysis packages used
in the study.
4.2. Methods Used in the Study
Based on the initial review of the existing literature, four models have been
selected for conducting the study. These models are multiple discriminant analysis,
logistic regression analysis, a reduced form model based on Poisson regression
analysis, and a structural model. The mathematics and the process of these models,
along with the variables used in this study, are discussed in detail in this section.
4.2.1. Multiple/Multivariate Discriminant Analysis (MDA) Approach
Multiple discriminant analysis is a statistical technique used to classify a set of
observations into one of several a priori groups, depending on a set of variables
that describe the individual characteristics of the observations. In this method,
group membership is the dependent variable and the variables relating to the
observations are the independent variables. The purpose of discriminant analysis is
to build a model that can predict a single qualitative variable from one or more
independent variables. The independent variables may be qualitative or quantitative,
while the dependent variable is always qualitative and has at least two groups.
Discriminant analysis derives an equation as a linear combination of the independent
variables that discriminates best between the groups in the dependent variable. This
linear combination is known as the discriminant function. The weights assigned to
each independent variable are corrected for the interrelationships among all the
variables. These weights are referred to as discriminant coefficients.
The MDA technique estimates a set of discriminant coefficients, and with the help of
these coefficients a single transformed discriminant score, or Z-value, is calculated
which is then used to classify the observations into different groups. The model
developed through MDA takes the following form.
$$Z = \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$$
Where Z is the overall index, $\beta_1, \beta_2, \dots, \beta_p$ are the discriminant
coefficients and $X_1, X_2, \dots, X_p$ are the independent variables. The
discriminant score (Z) is taken to estimate the bankruptcy character of the company.
The lower the value of Z, the greater is the firm’s bankruptcy probability, and vice
versa (Brown, 1998).
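As an illustration of how discriminant coefficients and Z-values arise, the sketch
below applies Fisher's two-group linear discriminant to two ratios. It is a minimal
sketch, not the estimation carried out in this study: the eight firms, their ratio
values, the midpoint cut-off and all function names are hypothetical, and the
coefficients are left unstandardized.

```python
def mean_vector(rows):
    """Column means of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """2 x 2 sum of outer products of deviations from the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s


def discriminant_coefficients(healthy, defaulted):
    """Fisher's rule: w = S_w^{-1} (mean_healthy - mean_defaulted)."""
    mh, md = mean_vector(healthy), mean_vector(defaulted)
    sh, sd = scatter(healthy, mh), scatter(defaulted, md)
    sw = [[sh[i][j] + sd[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [mh[0] - md[0], mh[1] - md[1]]
    return [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
            (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]


def z_score(w, x):
    """Discriminant score Z = w1*X1 + w2*X2."""
    return w[0] * x[0] + w[1] * x[1]


# Hypothetical observations: (EBIT/TA, CA/CL) for each firm.
healthy = [(0.15, 1.6), (0.12, 2.2), (0.18, 1.8), (0.10, 2.0)]
defaulted = [(0.02, 0.9), (-0.03, 0.8), (0.01, 1.1), (0.04, 1.0)]

w = discriminant_coefficients(healthy, defaulted)
cutoff = 0.5 * (z_score(w, mean_vector(healthy)) + z_score(w, mean_vector(defaulted)))
labels = ["non-default" if z_score(w, x) > cutoff else "default"
          for x in healthy + defaulted]
print(w, cutoff, labels)
```

Firms scoring below the cut-off are assigned to the defaulting group, which matches
the reading that a lower Z implies a higher bankruptcy probability.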
The estimated coefficients of the discriminant function are partial coefficients
that reflect the specific contribution of each individual variable to the Z-score,
which is the criterion variable for classification into the different groups. In
this study, estimates of the standardized canonical discriminant function
coefficients have been used to assess the relative importance of the independent
variables in the classification. The rule is that the higher the coefficient, the
greater the contribution of that variable.
The objective of discriminant analysis is to test whether the classification of
groups in a variable Y depends on at least one of the Xi’s. In terms of hypotheses,
it can be written as:
H0: Y does not depend on any of the Xi’s.
Ha: Y depends on at least one of the Xi’s.
Or simply, H0: βi = 0 for all i = 1, 2, …, p versus Ha: βi ≠ 0 for at least one i.
The independent variables are, however, found to have a high degree of correlation
with each other. This may seem to be a problem, but as per Altman (2002),
multicollinearity among the independent variables is not a serious problem in
discriminant analysis. The multiple discriminant analysis method is itself capable
of taking care of this problem through the structure matrix, which is similar to
principal component analysis. The advantage of this method is that it can provide a
model with a relatively small number of independent variables that captures a great
deal of information.
4.2.1.1. Variables Used in the Model
For the prediction of default, twenty independent variables belonging to three
categories of variables and one dependent variable have been used to arrive at the
probability of default with the help of the discriminant score in the study.
4.2.2. Logistic Regression Model
A logistic function has a common ‘S’ shape, with equation
$$f(x) = \frac{1}{1 + e^{-x}}$$
As per the linear probability model,
$$P_i = E(Y = 1 \mid X_i) = \beta_1 + \beta_2 X_i$$
Where X is the independent variable, Y = 1 means the firm has defaulted and $P_i$ is
the probability that Y = 1. Now it can be considered as follows:
$$P_i = E(Y = 1 \mid X_i) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 X_i)}}$$
It can be written as
$$P_i = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}$$
Where $Z_i = \beta_1 + \beta_2 X_i$
If $P_i$, the probability of default, is given by the expression above, then
$(1 - P_i)$ is the probability of not defaulting. It can be written as follows:
$$1 - P_i = \frac{1}{1 + e^{Z_i}}$$
Now it can be rewritten as
$$\frac{P_i}{1 - P_i} = \frac{1 + e^{Z_i}}{1 + e^{-Z_i}} = e^{Z_i}$$
If the natural log of the above equation is taken, the relationship below is found:
$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = \beta_1 + \beta_2 X_i$$
The log of the odds ratio L is not only linear in X but also linear in the
parameters. L is called the logit.
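The chain from the linear index to the probability, the odds and back to the logit
can be checked numerically. The following is a minimal sketch with one regressor;
the coefficient values and the value of X are hypothetical and are not estimates
from this study.

```python
import math


def logistic(z):
    """Probability of default P = 1 / (1 + e^{-Z})."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log of the odds ratio, L = ln(P / (1 - P))."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients and regressor value (illustrative only).
beta1, beta2 = -1.0, 2.5
x = 0.6

z = beta1 + beta2 * x      # linear index Z
p = logistic(z)            # probability of default
odds = p / (1.0 - p)       # odds in favour of default, equal to e^Z
print(z, p, odds, logit(p))
```

Taking the logit of the fitted probability recovers the linear index, which is why
the model is linear in the parameters on the log-odds scale even though it is
non-linear in probability.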

For the prediction of default, twenty independent variables belonging to three
categories of variables, two dummies which are categorical variables, and one
dependent variable have been used to arrive at the probability of default with the
help of the logit score in the study.
4.2.3. Structural Models
Default depends on many factors, such as the market value of assets, outside
liabilities, the liquidity position and the variability in the value of assets. Once
the market value of assets falls to the face value of the outside liabilities, the
probability of default increases. This is the basic concept behind the methodology
followed. The methodology used in this study is based on the Black, Scholes and
Merton model. The numerical steps are based on KMV (2001 & 2002) and on Default
Greeks under an objective probability measure (Farmen, Westgaard and Wijst, 2004).
Numerical Steps
According to Merton (1974), a firm’s equity is assumed to be an option on the assets
of the company. At the maturity of the debt, T, the firm faces one of two situations:
either $V_T < D$ or $V_T > D$. In the first case the firm would default and fail to
meet its outside obligations, while in the second case the firm would repay its debt
in time. In the first case the value of equity is zero, and in the second case it is
$V_T - D$, where V is the market value of assets and D is the book value of debt.
As per Merton’s model, the value of the firm’s equity at time T is:
$$E_T = \max(V_T - D, 0)$$
This relationship indicates that equity is a call option on the value of the assets
with a strike price equal to the repayment required on the debt.
According to the Merton model, the value of the firm’s assets evolves over time
according to the equation below.
$$dV = \mu V\,dt + \sigma_V V\,dW$$
Where $V$ is the total value of the firm, $\mu$ is the expected return on the firm’s
assets (i.e., the asset drift), $\sigma_V$ is the volatility of firm value and $dW$
is a standard Wiener process. The incremental changes in $\ln V$ follow a generalized
Wiener process with drift $\mu - \sigma_V^2/2$.
Estimating Asset Value and Volatility
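Using the Black, Scholes and Merton (BSM) option pricing model, the value of assets
and the volatility of assets can be measured by the equation below.
$$dV = \mu V\,dt + \sigma_V V\,dZ$$
Where $V$ and $dV$ are the asset value and the change in asset value, $\mu$ and
$\sigma_V$ are the firm’s asset drift rate and volatility, and $dZ$ is a Wiener
process.
The value of equity using the Black, Scholes and Merton model can be calculated by
the equation below.
$$E = V N(d_1) - D e^{-rT} N(d_2)$$
Where D is the book value of debt which is due at time T, $r$ is the risk-free rate
and $E$ is the market value of the firm’s equity.
$$d_1 = \frac{\ln(V/D) + (r + \sigma_V^2/2)\,T}{\sigma_V \sqrt{T}}$$
$$d_2 = d_1 - \sigma_V \sqrt{T}$$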
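The equity and asset volatility are related by the equation below
$$\sigma_E = \frac{V}{E}\,\frac{\partial E}{\partial V}\,\sigma_V$$
Where $\sigma_E$ is equity volatility, $\sigma_V$ is asset volatility, $V$ is the
value of assets and $E$ is the value of equity. $\partial E/\partial V$ is the change
in the value of equity with respect to a change in the value of assets, which equals
$N(d_1)$ in the BSM model.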
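Because the asset value and asset volatility are not observed, the equity value and
equity volatility relations have to be solved together for the two unknowns. The
sketch below does this with a simple fixed-point iteration. The iteration scheme, the
starting values, the function names and the input numbers are assumptions made for
illustration; they are not necessarily the numerical procedure followed in this
study.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_d2(V, sigma_V, D, r, T):
    d1 = (math.log(V / D) + (r + 0.5 * sigma_V ** 2) * T) / (sigma_V * math.sqrt(T))
    return d1, d1 - sigma_V * math.sqrt(T)


def equity_value(V, sigma_V, D, r, T):
    """E = V N(d1) - D e^{-rT} N(d2)."""
    d1, d2 = d1_d2(V, sigma_V, D, r, T)
    return V * norm_cdf(d1) - D * math.exp(-r * T) * norm_cdf(d2)


def equity_volatility(V, sigma_V, D, r, T):
    """sigma_E = (V / E) N(d1) sigma_V."""
    d1, _ = d1_d2(V, sigma_V, D, r, T)
    return V / equity_value(V, sigma_V, D, r, T) * norm_cdf(d1) * sigma_V


def solve_assets(E, sigma_E, D, r, T, tol=1e-10, max_iter=500):
    """Back out (V, sigma_V) from observed (E, sigma_E) by fixed-point iteration."""
    V = E + D                      # starting guess: equity plus book debt
    sigma_V = sigma_E * E / V      # starting guess: de-levered equity volatility
    for _ in range(max_iter):
        d1, d2 = d1_d2(V, sigma_V, D, r, T)
        V_new = (E + D * math.exp(-r * T) * norm_cdf(d2)) / norm_cdf(d1)
        d1_new, _ = d1_d2(V_new, sigma_V, D, r, T)
        sigma_new = sigma_E * E / (V_new * norm_cdf(d1_new))
        converged = abs(V_new - V) < tol and abs(sigma_new - sigma_V) < tol
        V, sigma_V = V_new, sigma_new
        if converged:
            break
    return V, sigma_V


# Round trip on hypothetical inputs: price equity from known assets, then recover them.
D, r, T = 80.0, 0.05, 1.0
E_obs = equity_value(100.0, 0.20, D, r, T)
sigma_E_obs = equity_volatility(100.0, 0.20, D, r, T)
V_hat, sigma_V_hat = solve_assets(E_obs, sigma_E_obs, D, r, T)
print(E_obs, sigma_E_obs, V_hat, sigma_V_hat)
```

In practice the observed inputs would be the market capitalisation and the
volatility of equity returns, with D taken from the balance sheet.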
Estimating Distance-to-Default
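Distance-to-default (dtd) is calculated based on the Moody’s-KMV definition as follows
$$dtd = \frac{MVA - DP}{MVA \times \sigma_V}$$
Where dtd is distance-to-default, MVA is the market value of assets, DP is the
default point and $\sigma_V$ is the volatility of assets. With $V$ = MVA and $D$ = DP,
this equation can be further expressed as below:
$$dtd = \frac{\ln(V/D) + (r - \sigma_V^2/2)\,T}{\sigma_V \sqrt{T}}$$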
Estimating Probability of Default
The probability of default is the probability that the market value of the firm’s
assets will be less than the book value of the firm’s liabilities by the time the
debt matures.
$$P_t = \Pr\left[V_t \le D_t \mid V_0 = V\right]$$
Where
$P_t$ is the probability of default by time t,
$V_t$ is the market value of the firm’s assets at time t, and
$D_t$ is the book value of the firm’s liabilities due at time t.
The BS model assumes that the random component of the firm’s asset returns is
normally distributed, and as a result we can define the default probability in terms
of the cumulative Normal distribution
$$P_t = N\left[-\frac{\ln(V/D_t) + (r - \sigma_V^2/2)\,t}{\sigma_V \sqrt{t}}\right]$$
Where N(.) is the cumulative standard normal distribution.
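The two distance-to-default expressions and the mapping to a default probability can
be sketched as follows. The sketch assumes the risk-neutral form, in which the
risk-free rate r is used as the drift of assets; the function names and the input
values are hypothetical.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kmv_distance_to_default(mva, default_point, sigma_V):
    """Moody's-KMV style distance: (MVA - DP) / (MVA * sigma_V)."""
    return (mva - default_point) / (mva * sigma_V)


def lognormal_distance_to_default(V, sigma_V, D, r, T):
    """Distance to default under lognormal asset values (assumed drift r)."""
    return (math.log(V / D) + (r - 0.5 * sigma_V ** 2) * T) / (sigma_V * math.sqrt(T))


def default_probability(dtd):
    """Probability of default, N(-dtd)."""
    return norm_cdf(-dtd)


# Hypothetical firm: assets 100, debt 80, asset volatility 20%, r = 5%, one year.
V, sigma_V, D, r, T = 100.0, 0.20, 80.0, 0.05, 1.0
dtd_kmv = kmv_distance_to_default(V, D, sigma_V)
dtd = lognormal_distance_to_default(V, sigma_V, D, r, T)
pd_risk_neutral = default_probability(dtd)
print(dtd_kmv, dtd, pd_risk_neutral)
```

For these inputs the lognormal distance is about 1.27 standard deviations, which
corresponds to a one-year default probability of roughly 10 percent.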
Estimating Drift of Assets and Real Default Probability
As an indicator of default, the real default probability is considered to be more
efficient than the risk-neutral default probability, but it is also more difficult
to estimate. Finding the drift of the assets is therefore the next step, and this
can be done by solving a number of equations, as solved by Bandyopadhyay (2007).
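By using Ito’s Lemma we get the following relation
$$\mu_E E = \Theta + \mu V \Delta + \frac{1}{2}\sigma_V^2 V^2 \Gamma$$
Where $\mu_E$ is the expected return on equity, $\Delta = \partial E/\partial V =
N(d_1)$ is the equity delta, $\Gamma = \partial^2 E/\partial V^2$ is the equity gamma
and $\Theta = \partial E/\partial t$ is the equity theta.
The next step is to obtain equity gamma by using the following expression (Hull, 2006)
$$\text{Option Gamma: } \Gamma = \frac{N'(d_1)}{V \sigma_V \sqrt{T}}$$
Equity Theta Θ can be estimated using the following equation (Hull, 2006)
$$\text{Option Theta: } \Theta = -\frac{V N'(d_1)\,\sigma_V}{2\sqrt{T}} - r D e^{-rT} N(d_2)$$
Where $N'(.)$ is the standard normal density function.
The drift of assets can be estimated by the equation below
$$\mu = \frac{\mu_E E - \Theta - \frac{1}{2}\sigma_V^2 V^2 \Gamma}{V \Delta}$$
CAPM can be used to estimate $\mu_E$. According to CAPM
$$\mu_E = r + \beta_E \left(E(R_m) - r\right)$$
Where $\beta_E$ is the equity beta and $E(R_m)$ is the expected return on the market.
Having found the asset drift $\mu$, $V$ and $\sigma_V$, the real default probability
can be estimated as follows (Sharma, Singh, & Upadhyay, 2014)
$$P_{real} = N\left[-\frac{\ln(V/D) + (\mu - \sigma_V^2/2)\,T}{\sigma_V \sqrt{T}}\right]$$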
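The drift step can be sketched numerically as below. This is a minimal illustration
that assumes the standard Black-Scholes call Greeks and a CAPM estimate of the
expected equity return; the equity beta, the market return and the firm inputs are
hypothetical values, and the function names are not from the study.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d1_d2(V, sigma_V, D, r, T):
    d1 = (math.log(V / D) + (r + 0.5 * sigma_V ** 2) * T) / (sigma_V * math.sqrt(T))
    return d1, d1 - sigma_V * math.sqrt(T)


def equity_value(V, sigma_V, D, r, T):
    d1, d2 = d1_d2(V, sigma_V, D, r, T)
    return V * norm_cdf(d1) - D * math.exp(-r * T) * norm_cdf(d2)


def equity_greeks(V, sigma_V, D, r, T):
    """Delta, gamma and theta of equity viewed as a call on the firm's assets."""
    d1, d2 = d1_d2(V, sigma_V, D, r, T)
    delta = norm_cdf(d1)
    gamma = norm_pdf(d1) / (V * sigma_V * math.sqrt(T))
    theta = (-V * norm_pdf(d1) * sigma_V / (2.0 * math.sqrt(T))
             - r * D * math.exp(-r * T) * norm_cdf(d2))
    return delta, gamma, theta


def capm_equity_return(r, beta_E, market_return):
    """Expected equity return mu_E = r + beta_E (E(R_m) - r)."""
    return r + beta_E * (market_return - r)


def asset_drift(mu_E, V, sigma_V, D, r, T):
    """Solve the Ito relation for mu given the expected equity return mu_E."""
    E = equity_value(V, sigma_V, D, r, T)
    delta, gamma, theta = equity_greeks(V, sigma_V, D, r, T)
    return (mu_E * E - theta - 0.5 * sigma_V ** 2 * V ** 2 * gamma) / (V * delta)


def default_probability(V, sigma_V, D, drift, T):
    """N(-dtd) with the supplied drift (mu for real, r for risk-neutral)."""
    dtd = (math.log(V / D) + (drift - 0.5 * sigma_V ** 2) * T) / (sigma_V * math.sqrt(T))
    return norm_cdf(-dtd)


# Hypothetical inputs (illustrative only).
V, sigma_V, D, r, T = 100.0, 0.20, 80.0, 0.05, 1.0
mu_E = capm_equity_return(r, beta_E=1.2, market_return=0.12)
mu = asset_drift(mu_E, V, sigma_V, D, r, T)
pd_real = default_probability(V, sigma_V, D, mu, T)
pd_risk_neutral = default_probability(V, sigma_V, D, r, T)
print(mu_E, mu, pd_real, pd_risk_neutral)
```

When the expected equity return equals the risk-free rate, the relation returns an
asset drift equal to r, so the real and risk-neutral probabilities coincide; an
asset drift above r gives a lower real default probability.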
For the prediction of default, only three variables, namely the market value of
assets, the book value of debt and the drift rate, have been used to arrive at the
probability of default in this study.
4.2.4. Reduced Form Model
Reduced form models do not try to build a model on information from the balance
sheet, nor do they try to develop an indicator from the financial statements of the
firm. They assume that there is nothing like gradual default; rather, default is an
abrupt event. The assumption at work is that there is no relationship between the
value of the firm and default. Default is instead an unpredictable and sudden event
of inexplicable loss in the market value of the firm. This approach still needs a
signal to indicate the event of default. Reduced form models simply assign a random
external indicator, known as the default intensity λ (Chacko, Sjoman, Motohashi, &
Dessain, 2006).
Where,
λ: Default intensity
The probability of default by time t years: $1 - e^{-\lambda t}$
The expected time to default: $1/\lambda$
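The two quantities implied by a constant default intensity can be computed directly.
The sketch below is only an illustration; the intensity value is hypothetical.

```python
import math


def default_probability(lam, t):
    """Probability of default by time t under constant intensity: 1 - e^{-lam t}."""
    return 1.0 - math.exp(-lam * t)


def expected_time_to_default(lam):
    """Expected time to default in years: 1 / lam."""
    return 1.0 / lam


lam = 0.05  # hypothetical default intensity per year
pd_1y = default_probability(lam, 1.0)
pd_5y = default_probability(lam, 5.0)
years = expected_time_to_default(lam)
print(pd_1y, pd_5y, years)
```

With an intensity of 0.05 the one-year default probability is a little under 5
percent and the expected time to default is 20 years.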
Default intensity λ is calculated using Poisson regression analysis on the
independent variables using the following process.
Where the dependent variable is count data, the probability distribution best suited
to it is the Poisson probability distribution. The probability distribution function
of the Poisson distribution is given by (Gujarati & Sangeetha, 2007):
$$f(Y_i) = \frac{\mu^{Y} e^{-\mu}}{Y!}$$
Where Y = 0, 1, 2, 3, …
In the case of the Poisson distribution, its variance is the same as its mean value.
So the Poisson regression model may be written as
$$Y_i = E(Y_i) + u_i = \mu_i + u_i$$
Where the Y’s are independently distributed as Poisson random variables with mean
$\mu_i$ for each observation, expressed as
$$\mu_i = E(Y_i) = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki}$$
For estimation purposes the model is written as:
$$Y_i = \frac{\mu_i^{Y_i} e^{-\mu_i}}{Y_i!} + u_i$$
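The Poisson probability function and its equal mean and variance can be verified
numerically, as in the minimal sketch below. The mean value used is hypothetical, and
reading the zero-count probability $e^{-\mu}$ as a survival probability is an
assumption made here for illustration.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability f(Y) = mu^Y e^{-mu} / Y!."""
    return mu ** y * math.exp(-mu) / math.factorial(y)


mu = 0.4                 # hypothetical mean number of defaults over the period
support = range(0, 60)   # truncated support; the tail beyond 60 is negligible

total = sum(poisson_pmf(y, mu) for y in support)
mean = sum(y * poisson_pmf(y, mu) for y in support)
variance = sum((y - mean) ** 2 * poisson_pmf(y, mu) for y in support)

p_no_default = poisson_pmf(0, mu)        # probability of zero defaults, e^{-mu}
p_at_least_one = 1.0 - p_no_default      # probability of one or more defaults
print(total, mean, variance, p_no_default, p_at_least_one)
```

The last two lines mirror the intensity formula $1 - e^{-\lambda t}$ with the mean
count playing the role of λt.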

For the prediction of default, twenty independent variables belonging to three
categories of variables and one dependent variable have been used to arrive at the
mean value. With the help of the calculated mean value, the probability of survival
is arrived at. From the probability of survival, the probability of default is
calculated in the study.
The time series of each independent variable is converted into a single-period
variable by taking its mean over the sample period. The total number of defaults
during the sample period is used in the regression analysis as the dependent
variable.
4.3. Variables Used in the Study
In this study, twenty independent variables are used for discriminant analysis and
the reduced form model. For logistic regression, the number of independent variables
is twenty-two. For the structural model, three independent variables have been used.
These independent variables belong to four categories, namely accounting variables,
market variables, economic variables and categorical variables. Fifteen independent
variables are accounting variables, three are market variables, two are economic
variables and two are categorical variables.
The relevance of the independent variables is summarized as follows:
4.3.1. Accounting Variables
Fifteen of the independent variables used in the study are accounting variables.
These variables capture different financial aspects of the firms included in the
study.
CA/CL
It is the ratio of current assets to current liabilities. It provides information on
the relationship between current assets and current liabilities, and tells how
efficiently the firm has been managing its short-term finances. If a firm has a
higher ratio, the firm’s chances of default are expected to be lower.
D/E
It is the ratio of book value of debt to book value of equity. It signifies the level
of debt with respect to owners’ funds. If a firm has a lower ratio, its debt exposure
is lower and the firm’s chances of default are expected to be lower.
EBIT/INT
It is the ratio of EBIT to interest. It tells how many times the firm’s earnings
cover its interest expense. If a firm has a higher ratio, the firm’s chances of
default are expected to be lower.
EBIT/TA
It is the ratio of EBIT to total assets. It tells how efficiently assets are used to
generate earnings for the firm. If a firm has a higher ratio, the firm’s chances of
default are expected to be lower.
FAT
It is the fixed assets turnover ratio. It relates total assets to fixed assets. If a
firm has a higher ratio, the firm’s chances of default are expected to be lower.
GRTA
It is the growth in total assets. The higher the value, the better it is for the
firm. If a firm has a higher value, the firm’s chances of default are expected to be
lower.
ITR
It is the inventory turnover ratio. It tells how much revenue is generated for every
rupee of inventory. If a firm has a higher ratio, the firm’s chances of default are
expected to be lower.
NI/TA
It is the ratio of net income to total assets. It tells the ability of the firm’s
assets to earn net income. If a firm has a higher ratio, the firm’s chances of
default are expected to be lower.
NP/TE
It is the ratio of net profit to book value of total equity. It signifies how much
profit is earned by the firm for every rupee of equity. If a firm has a higher
ratio, the firm’s chances of default are expected to be lower.
71

OCFR
It is operating cash flow ratio. It is a ratio between operating cash and current
liabilities. If a firm is having a higher ratio, the firms’ chances to default are expected
to be lower.
RE/TA
It is a ratio between the retained earnings of the firm to total assets. This tells what the
total accumulated value of net profit over its life span is against total assets. If a firm
is having a higher ratio, the firms’ chances to default are expected to be lower.
SG
It is the value of sales growth. It tells how firm is able to expand its business. If a firm
Sales/TA
It is the ratio between total sales and total assets. This tells how much revenue is
generated by every rupee of assets of the firm. If a firm is having a higher ratio, the
firm’s chances to default are expected to be lower.
TBD/TA
It is the ratio between the total book value of debt and total assets of the firm. This tells
how much debt has been used by the firm to build one rupee of assets. If a firm is having
a lower ratio, the firm’s chances to default are expected to be lower.
WC/TA
It is a ratio between the working capital of the firm to total assets. This signifies that
how much working capital is required for every rupee of total assets. If a firm is
having a lower ratio, the firms’ chances to default are expected to be lower.
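The accounting ratios described above can be computed directly from financial statement items. The following is a minimal sketch for a few of them; the dictionary keys and the example figures are hypothetical, and working capital is assumed to be current assets minus current liabilities.

```python
def accounting_ratios(fs):
    """Compute a few of the accounting ratios described above.

    `fs` is a dict of financial statement items. The key names are
    hypothetical and chosen only for this illustration.
    """
    ta = fs["total_assets"]
    # Assumption: working capital = current assets - current liabilities.
    working_capital = fs["current_assets"] - fs["current_liabilities"]
    return {
        "EBIT/INT": fs["ebit"] / fs["interest_expense"],
        "EBIT/TA": fs["ebit"] / ta,
        "NI/TA": fs["net_income"] / ta,
        "RE/TA": fs["retained_earnings"] / ta,
        "Sales/TA": fs["sales"] / ta,
        "TBD/TA": fs["total_book_debt"] / ta,
        "WC/TA": working_capital / ta,
    }


# Hypothetical firm year (figures are invented for illustration only).
example_firm = {
    "total_assets": 1000.0,
    "ebit": 120.0,
    "interest_expense": 40.0,
    "net_income": 60.0,
    "retained_earnings": 200.0,
    "sales": 900.0,
    "total_book_debt": 400.0,
    "current_assets": 350.0,
    "current_liabilities": 250.0,
}
ratios = accounting_ratios(example_firm)
```

Each value in `ratios` corresponds to one firm year; in the study such ratios are computed for every firm year in the sample.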
4.3.2. Market Variables
In this study, three variables are market variables. These variables tell how the market
perceives the firm.
MP/BV
It is a ratio between the market price of the firms’ stock and book value of the firm.
This signifies that what is the price of every rupee of book value of the firm. If a firm
MP/EPS
It is a ratio between the market price of the firm’s stock and earnings per share. This
tells how much the market is ready to pay for every rupee of earnings per share. If
a firm is having a higher ratio, the firm’s chances to default are expected to be lower.
MVE/TBD
It is the ratio between the market value of equity and total book value of debt. It signifies
what, in the view of the market, is the value of its assets for every rupee of debt. If a firm
4.3.3. Economic Variables
These are the variables that tell how a firm’s finances are responding to
economic activities in the country. There are two economic variables.
Log(TA/GNP)
It is log value of the ratio of total assets to market index of GNP. This informs about
the value of firms assets with respect to GNP index. If a firm is having a higher ratio,
the firms’ chances to default are expected to be lower.
SG/GNPG
It is the ratio between the sales growth and growth in GNP. It tells whether the firm has
been able to match the growth rate of GNP with growth in sales.
4.3.4. Categorical Variables
1 If total liabilities are more than the total assets and 0 if total assets are more than the
total liabilities.
1 if average net profits of two years are less than zero. 0 if average net profits are
more than zero.
4.3.5. Variables for Structural Model
These variables are used by the structural model. Apart from the book value of debt,
market value of assets and drift rate are calculated using balance sheet and market
information.
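For orientation, the textbook Merton distance to default combines these inputs as DD = [ln(V/D) + (mu - sigma^2/2)T] / (sigma sqrt(T)). The sketch below is a generic illustration under stated assumptions and is not necessarily the exact specification used in this study: the asset volatility input, the one year horizon and all example figures are assumptions introduced here.

```python
from math import log, sqrt
from statistics import NormalDist


def distance_to_default(asset_value, debt, drift, asset_vol, horizon=1.0):
    """Textbook Merton distance to default.

    asset_value: market value of assets (V)
    debt:        book value of debt (D)
    drift:       expected asset return (mu)
    asset_vol:   asset volatility (sigma), an assumed input
    horizon:     time horizon in years (T), assumed to be one year
    """
    numerator = log(asset_value / debt) + (drift - 0.5 * asset_vol ** 2) * horizon
    return numerator / (asset_vol * sqrt(horizon))


def default_probability(dd):
    """Probability of default implied by a distance to default, N(-DD)."""
    return NormalDist().cdf(-dd)


# Hypothetical inputs, for illustration only.
dd = distance_to_default(asset_value=150.0, debt=100.0, drift=0.05, asset_vol=0.25)
pd_estimate = default_probability(dd)
```

A larger distance to default implies a lower probability that the asset value falls below the debt level at the horizon.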
4.4. Data
4.4.1. Sample
Indian Firms
For the study of default prediction models for Indian firms, four broad samples are
used. The first sample covers all the firms and consists of 226 firms from India with
2450 cases. The second sample consists of 149 large firms with 1642 cases. The
third sample consists of 169 large firms and public sector units with 1881 cases. The
fourth sample consists of 57 small and medium firms with a total of 570 cases.
American Firms
This sample consists of 50 firms from the US and has 250 cases.
British Firms
This sample consists of 50 firms from the UK and has 250 cases.
4.4.2. Sample Period of the Study
Indian Firms
The sample period for the study of the Indian sample is twelve years, from 1st April 2004 to
31st March 2016. Of this sample period, ten years of data, from 1st April 2004 to 31st March
2014, have been used to develop the models. Within these ten years, the study is conducted
for two sample periods, namely a large sample period of ten years from 1st April 2004 to
31st March 2014 and a small sample period of five years from 1st April 2009 to 31st
March 2014, to find out the impact of the sample period. The data belonging to the period
from 1st April 2014 to 31st March 2016 have been used for forward testing of both
out of sample firms and in sample firms.
American Firms
31st December 2014 for the firms from the United States of America.
British Firms
31st December 2014 for the firms from the United Kingdom.
4.4.3. Data Sources
4.4.3.1. Company specific data
Accounting, market and economic information from the three countries have been
collected. In case of firms from India, the accounting data have been collected from
Prowess and respective financial statements of the firms and the stock prices of the firms
have been collected from Bombay Stock Exchange. Economic data has been collected
from RBI and World Bank. As far as the information about firms’ default position is
financial statements. In case of firms from the United States of America, the accounting
data have been collected from the respective financial statements of the firms and the
stock prices of the firms have been collected from NASDAQ. Economic data has been
collected from Federal Reserve and World Bank. As far as the information about firms’
default position is concerned, this information has been collected from respective
auditor’s reports in financial statements. In case of firms from the United Kingdom, the
accounting data have been collected from the respective financial statements of the firms
and the stock prices of the firms have been collected from Financial Times Stock
Exchange from London Stock Exchange. Economic data has been collected from Bank of
England and World Bank. As far as the information about firms’ default position is
financial statements.
4.4.3.2. Interest Rate Proxy
As far as risk free interest rates are concerned, 91 days Treasury bills have been taken as
proxy for interest rate for the whole study. Interest rate proxy for Indian firms has been
collected from Reserve Bank of India. For the firms from the USA, it has been collected
from Federal Reserve Bank. For the firms from the United Kingdom, the interest proxies
over the sample period have been collected from The Bank of England.
4.4.3.3. Market-Proxy
For the purpose of getting market returns, the study has taken market proxies for each
market. In case of India, the market proxy is BSE 500. For the USA, it is NASDAQ
Composite. In case of the UK, it is FTSE AIM All Share.
4.4.3.4. The Database
When it comes to collecting data for the study, most of the accounting data have been
collected from the individual financial statements of the firms that are part of the
study. Besides this, in case of India, data have also been collected from Prowess. For
the USA and the UK, no database has been used.
4.4.4. Statistical Analysis Packages
The first tool that has been used is the spreadsheet software Microsoft Excel. The daily
stock price data as well as the market index data have been processed using Microsoft Excel.
The processed data are then used to develop the models and to test various statistical
properties using the statistical analysis packages SPSS 20 and E-Views 9.
SPSS 20 has been used for developing models using multiple discriminant analysis
and logistic regression analysis. E-Views 9 has been used for reduced form modelling
using Poisson regression method. For structural model based on Black-Scholes-
Merton Model, Microsoft Excel has been used.
Chapter 5
Analysis of Default Prediction Models
for Indian, US and UK
Corporate Debt
This chapter presents the results relating to default prediction models and their
comparative study in respect of three selected countries namely India, the US and the
UK. The chapter is organized into four sections. Section I analyses the empirical
results obtained from the analysis of data relating to Indian sample companies while
Section II and III respectively examine empirical results relating to the US and the
British sample companies. Section IV presents the comparative study of the
performance of default prediction models in respect of these three countries.
SECTION I: INDIAN SAMPLE
This section contains empirical results and findings from the models developed for
Indian sample. This study examines four methods namely multiple discriminant
analysis, logistic regression, reduced form models and structural model to predict
default in Indian context. The results and findings of the study are arranged into four
parts. Part I consists of the results of study using multiple discriminant analysis which
is further divided into four sub-parts namely, All Firms, Large Firms, Large Firms
with Public Sector Units and Small and Medium Firms. Part II contains results and
finding of study using logistic regression analysis which is further divided into four
sub-parts namely, All Firms, Large Firms, Large Firms with Public Sector Units and
Small and Medium Firms. Part III presents the results and findings of the study using
the reduced form model with Poisson regression, which is further divided into two parts
namely, All Firms and Large Firms. Part IV presents and discusses the results of
the study using the Merton distance to default model.
5.1.1. Multiple Discriminant Analysis
Multiple discriminant analysis has been used to arrive at a credit score to predict default by
the firms. The whole study has been divided into four parts. Part I presents the results of
the analysis of the sample of All Firms, which contains 2450 firm years for which 20
ratios have been calculated using financial statement, market and economic information.
Part II presents the results of the analysis of the sample of large firms, which consists of
1642 firm years. Part III presents the results of the analysis of the sample of large firms and
public sector units, which has 1881 firm years. Part IV presents the results of the analysis
of the sample of small and medium firms, which has a total of 570 firm years.
5.1.1.1. All Firms
This sample has 2450 firm years. Two discriminant models have been developed for
this sample. The first model has been developed for the whole sample period,
from 1st April 2004 to 31st March 2014, and is indicated as the large sample period. The
second model has been developed for a sample period of five years, from 1st April 2009 to
31st March 2014, and is indicated as the small sample period. After checking for outliers using
the Mahalanobis statistic, the firm years for the large sample period and the small sample
period are brought down to 1666 firm years and 859 firm years respectively. Multiple
discriminant analysis is applied to these two samples using SPSS software.
Two hold out samples namely ‘Forward Testing (In Sample Firms)’ and ‘Out of
Sample’ are meant for validation of the classification accuracy of the developed
models. The developed models are compared against the Altman (1968) original
model. The first hold out sample ‘Forward Testing (In Sample Firms)’ consists of the
financial, market and economic information of the same firms that have been used to
develop the models beyond the sample years. This sample contains 308 firm years.
The second sample ‘Out of Sample’ consists of the financial, market and economic
information of the firms which have not been part of the study. This sample contains
161 firm years.
After running the multiple discriminant analysis using SPSS software the following
models have been found.
For large sample period, a 7 factors model is found.
Z = - 2.626 + 0.544 RE/TA + 1.460 EBIT/TA + 0.414 Sales/TA + 0.807 NI/TA
    - 0.346 TBD/TA - 0.013 OCFR + 0.459 Log(TA/GNP)
For small sample period, a 9 factors model is found.
Z = - 2.351 + 0.459 RE/TA + 1.084 EBIT/TA + 0.546 Sales/TA + 0.754 NI/TA + 0.013 NP/TE
    - 0.982 TBD/TA - 0.005 OCFR + 0.012 MP/BV + 0.459 Log(TA/GNP)
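As an illustration of how such a discriminant function is applied to one firm year, the following minimal sketch evaluates the 7 factor model for the large sample period. The coefficients are those stated above; the function name, the dictionary layout and the example ratio values are hypothetical.

```python
# Constant and coefficients of the 7 factor model for the large sample period.
LARGE_PERIOD_CONSTANT = -2.626
LARGE_PERIOD_COEFFICIENTS = {
    "RE/TA": 0.544,
    "EBIT/TA": 1.460,
    "Sales/TA": 0.414,
    "NI/TA": 0.807,
    "TBD/TA": -0.346,
    "OCFR": -0.013,
    "Log(TA/GNP)": 0.459,
}


def z_score(ratios, constant=LARGE_PERIOD_CONSTANT,
            coefficients=LARGE_PERIOD_COEFFICIENTS):
    """Return the discriminant score: constant + sum(coefficient * ratio)."""
    return constant + sum(coef * ratios[name] for name, coef in coefficients.items())


# Hypothetical ratio values for one firm year (invented for illustration).
example_ratios = {
    "RE/TA": 0.20,
    "EBIT/TA": 0.10,
    "Sales/TA": 1.00,
    "NI/TA": 0.05,
    "TBD/TA": 0.40,
    "OCFR": 0.50,
    "Log(TA/GNP)": 1.00,
}
z = z_score(example_ratios)
```

The resulting score is then compared with a dividing point between the group centroids to assign the firm year to the defaulting or non-defaulting group.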
Empirical Results
In this section, the empirical results of the study for All Firms are presented and
discussed in different sub-parts and compared with other studies.
Case Processing Summary
From the following Table 5.1.1.1.1: Case Processing Summary, it is clear that the large
sample period has a total of 1666 firm years, of which 1608 firm years are found to be valid
for the study. The small sample period has a total of 859 firm years, of which 833 firm years
are found to be valid for the study. Only the valid firm years have been used to
develop the Z score model using multiple discriminant analysis.
Table No 5.1.1.1.1: Case Processing Summary
Large Sample Period Small Sample Period

Valid 1608 833
Excluded 58 26
Total 1666 859
Tests of Equality of Group Means
Tests of equality of group means examine whether there are any significant differences
between the groups on each of the independent variables, using the group means. The
null hypothesis of the test is that there is no significant difference between the groups
on each of the independent variables group mean. The alternative hypothesis is that
there is significant difference between the groups on each of the independent variables
group mean.
H0: There is no significant difference between the groups on each of the independent
variables group mean.
H1: There is significant difference between the groups on each of the independent
variables group mean.
From the below Table No 5.1.1.1.2: Tests of Equality of Group Means, it is clear that
the Wilks’ Lambdas of nine variables for the large sample period have p-values
less than the critical p-value of 0.05 by F test.
Table No 5.1.1.1.2: Tests of Equality of Group Means

              Large Sample Period               Small Sample Period
              Wilks' Lambda  F        Sig.      Wilks' Lambda  F        Sig.
WC/TA         1.000          0.690    0.406     1.000          0.026    0.872
RE/TA         0.913          152.893  0.000     0.896          96.484   0.000
EBIT/TA       0.887          205.277  0.000     0.833          166.936  0.000
MVE/TBD       0.999          1.011    0.315     0.998          1.361    0.244
Sales/TA      0.965          58.122   0.000     0.952          41.543   0.000
CA/CL         1.000          0.229    0.633     1.000          0.038    0.845
NI/TA         0.893          191.555  0.000     0.845          152.394  0.000
NP/TE         0.998          3.509    0.061     0.995          4.537    0.033
TBD/TA        0.948          88.112   0.000     0.961          34.032   0.000
EBIT/INT      0.996          6.665    0.010     0.993          6.196    0.013
OCFR          0.996          6.714    0.010     0.993          5.790    0.016
GRTA          1.000          0.177    0.674     1.000          0.265    0.607
ITR           1.000          0.429    0.513     1.000          0.405    0.524
FAT           0.998          3.838    0.050     0.996          3.259    0.071
MP/EPS        1.000          0.343    0.558     0.996          3.630    0.057
MP/BV         1.000          0.262    0.609     0.976          20.185   0.000
D/E           0.998          3.192    0.074     0.996          3.175    0.075
Log(TA/GNP)   0.973          45.396   0.000     0.957          37.019   0.000
SG            0.999          1.199    0.274     0.997          2.118    0.146
SG/GNPG       0.996          6.537    0.011     0.991          7.520    0.006
The Wilks’ Lambdas of nine variables for the large sample period and eleven variables
for the small sample period, out of twenty variables, have p-values less than the critical
p-value of 0.05. So the null hypothesis for the respective variables of the two samples is
rejected and the alternative hypothesis is accepted that there is significant difference
between the groups on the group means of these independent variables. This indicates that
these independent variables are significant for the study. However, the values of Wilks’
Lambda of these significant variables are on the higher side, which is undesirable.
For the large sample period, the nine significant variables are either
accounting or economic variables. No significant variable belongs to the market variables
used in the study. For the small sample period, the significant variables belong to all the
three types of variables.
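The F statistics in the table are tied to the univariate Wilks’ Lambdas by the standard one-way ANOVA identity F = ((1 - Lambda)/Lambda) x ((n - g)/(g - 1)), where n is the number of valid cases and g the number of groups. The sketch below assumes n = 1608 valid firm years for the large sample period and g = 2; because the tabulated Lambda is rounded to three decimals, it only approximately recovers the reported F of 152.893 for RE/TA.

```python
def f_from_wilks_lambda(wilks_lambda, n_cases, n_groups=2):
    """Univariate F statistic implied by a Wilks' Lambda (one-way ANOVA identity)."""
    return (1.0 - wilks_lambda) / wilks_lambda * (n_cases - n_groups) / (n_groups - 1)


# RE/TA, large sample period: Lambda = 0.913 (rounded), n = 1608 valid firm years.
f_re_ta = f_from_wilks_lambda(0.913, 1608)
```

A Lambda close to 1 therefore corresponds to a small F value and a weak difference between the group means of that variable.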
Log Determinants
Table No 5.1.1.1.3: Log Determinants

                      Large Sample Period         Small Sample Period
                      Rank   Log Determinant      Rank   Log Determinant
Non-Defaulting        20     94.088               20     82.990
Defaulting            20     48.654               20     44.410
Pooled within-groups  20     96.545               20     88.404
From the above Table No 5.1.1.1.3: Log Determinants, it is clear that the log
determinants for both the samples are large, making the obtained discriminant
functions significant. The log determinants for the non-defaulting group and the pooled
within-groups matrix are comparable. This signifies that, for both samples, the covariance
matrices of these two do not differ much. But on comparison with the log determinant of
the defaulting group, the log determinants for the non-defaulting group and the pooled
within-groups matrix are found to be nearly double for both the sample periods. This
indicates that the group covariance matrices differ.
Box’s M Test
Box’s M test helps to examine the assumption of equality of covariance across groups
and to find out whether there is a difference in covariance matrices between groups formed
by the dependent variables. The null hypothesis of this test is that the covariance
matrices do not differ between groups formed by the dependent variables. As per the
assumptions, the test should not be significant so that the null hypothesis can be accepted.
H0: The covariance matrices do not differ between groups formed by the dependent
variables.
H1: The covariance matrices differ between groups formed by the dependent
variables.
Table No 5.1.1.1.4: Box’s M Test

                 Large Sample Period   Small Sample Period
Box's M          12987.833             10710.663
F    Approx.     59.657                48.704
     df1         210                   210
     df2         383886.460            272469.871
     Sig.        0.000                 0.000
From the Table No 5.1.1.1.4: Box’s M Test, it is clear that the p-values of
the test for both the samples are lower than the critical p-value of 0.05. That means
the null hypothesis, that the covariance matrices do not differ between groups formed
by the dependent variables, is rejected, and the covariance matrices differ
between groups formed by the dependent variables. This is an undesirable situation in
the analysis and is against the assumption of multiple discriminant analysis.
Eigenvalues
Table No 5.1.1.1.5: Eigenvalues
                      Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
Large Sample Period   0.282        100.0           100.0          0.469
Small Sample Period   0.448        100.0           100.0          0.556
The above Table No 5.1.1.1.5: Eigenvalues provides information about the eigenvalue
and canonical correlation. The eigenvalue of the discriminant function for the large
sample period is 0.282 and for the small sample period, it is 0.448. For both the
sample periods, the eigenvalues are on the lower side, making the functions less robust.
This signifies that the variance in the dependent variable cannot be properly explained by
the discriminant functions for both the samples. The canonical correlation is the measure
of association between the discriminant function and the dependent variable. The
canonical correlation between the discriminant function and the dependent variable
for the large sample period is 0.469 and for the small sample period it is 0.556. This
signifies that the association between the discriminant function and the classification
of groups is strong enough to term the function robust.
Wilks' Lambda
Wilk’s Lambda tests the discriminatory power of the independent variables used in
the discriminant analysis. It has null hypothesis that there is no significant
discriminating power in the independent variables. While the alternative hypothesis is
that there is significant discriminating power in the independent variables.
H0: There is no significant discriminating power in the independent variables.
H1: There is significant discriminating power in the independent variables.
Table No 5.1.1.1.6: Wilks' Lambda
Wilks' Lambda Chi-square df Sig.

Large Sample Period 0.780 396.110 20 0.000
Small Sample Period 0.691 303.647 20 0.000
Wilks' Lambda is a measure of the discriminatory power of the discriminant function. Wilks’
Lambdas for both the functions, for the large sample period as well as the small sample
period, are high. This indicates that these functions are less robust, but the p-values for
the Chi-square test for both the samples are less than the critical p-value of 0.05.
That means the null hypothesis is rejected and the alternative hypothesis is accepted that
there is significant discriminating power in the independent variables of the two
developed models. So the two groups differ from each other and the functions can
discriminate between the two groups.
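With two groups there is a single discriminant function, so the eigenvalue, the canonical correlation and Wilks’ Lambda reported in the tables above are linked by standard identities: Lambda = 1/(1 + eigenvalue) and canonical correlation = sqrt(eigenvalue/(1 + eigenvalue)). The sketch below checks the reported figures and also applies Bartlett’s chi-square approximation, -(n - 1 - (p + g)/2) ln(Lambda); that this approximation matches the software output is an assumption, with n taken as the number of valid firm years, p = 20 variables and g = 2 groups.

```python
from math import log, sqrt


def wilks_lambda(eigenvalue):
    """Wilks' Lambda of a single discriminant function."""
    return 1.0 / (1.0 + eigenvalue)


def canonical_correlation(eigenvalue):
    """Canonical correlation of a single discriminant function."""
    return sqrt(eigenvalue / (1.0 + eigenvalue))


def bartlett_chi_square(lam, n_cases, n_variables, n_groups=2):
    """Bartlett's chi-square approximation for Wilks' Lambda (assumed form)."""
    return -(n_cases - 1 - (n_variables + n_groups) / 2.0) * log(lam)


large_lambda = wilks_lambda(0.282)          # reported: 0.780
small_lambda = wilks_lambda(0.448)          # reported: 0.691
large_corr = canonical_correlation(0.282)   # reported: 0.469
small_corr = canonical_correlation(0.448)   # reported: 0.556
large_chi2 = bartlett_chi_square(large_lambda, 1608, 20)  # reported: 396.110
```

Small differences from the reported chi-square values are expected because the eigenvalues are rounded to three decimals.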
Structure Matrix
Table No 5.1.1.1.7: Structure Matrix
EBIT/TA 0.674 0.670
NI/TA 0.651 0.640
RE/TA 0.581 0.509
TBD/TA -0.441 0.334
Sales/TA 0.358 0.316
Log(TA/GNP) 0.317 -0.303
OCFR 0.122 0.233
EBIT/INT 0.121 -0.142
SG/GNPG -0.120 0.129
FAT 0.092 0.125
NP/TE 0.088 0.110
D/E -0.084 0.099
SG -0.051 0.094
MVE/TBD 0.047 -0.092
WC/TA 0.039 -0.075
ITR 0.031 0.061
MP/EPS 0.028 0.033
MP/BV 0.024 0.027
CA/CL 0.022 0.010
GRTA 0.020 0.008
The above Table No. 5.1.1.1.7: Structure Matrix shows the correlation between
the predictor variables and the standardized canonical discriminant function. The
table indicates that most of the independent variables have low correlation and
contribute negligibly to the discriminant function, besides the top 6 variables for the
large sample period and 7 variables for the small sample period. Also, the top six
variables from the Structure Matrix that have higher correlations with the discriminant
function are the same for both the samples.
Standardized Canonical Discriminant Function Coefficients
The relative importance of each independent variable can be figured out through the
standardized canonical discriminant function coefficients. The higher the
coefficient, the higher the relative discriminatory power of the independent variable.
Table No 5.1.1.1.8: Standardized Canonical Discriminant Function Coefficients

WC/TA 0.727 0.624
RE/TA 0.371 0.343
EBIT/TA 0.283 0.204
MVE/TBD -0.052 -0.141
Sales/TA 0.646 0.872
CA/CL 0.061 0.088
NI/TA 0.130 0.122
NP/TE 0.008 0.064
TBD/TA -0.272 -0.495
EBIT/INT -0.107 -0.106
OCFR -0.025 0.008
GRTA 0.048 0.076
ITR 0.023 -0.025
FAT -0.052 -0.149
MP/EPS 0.800 0.039
MP/BV -0.752 0.105
D/E -0.036 -0.017
Log(TA/GNP) 0.391 0.410
SG -0.002 -0.012
SG/GNPG -0.151 -0.208
From the above Table No 5.1.1.1.8: Standardized Canonical Discriminant
Function Coefficients, it is clear that MP/EPS has the highest coefficient value at
0.800 for the large sample period while for the small sample period, Sales/TA has
the highest coefficient at 0.872. That means the independent variables MP/EPS for the
large sample period and Sales/TA for the small sample period have the highest
discriminatory power among all the independent variables. SG at -0.002 and
OCFR at 0.008 have the lowest coefficients for the large sample period and the small
sample period respectively. Compared with the other independent variables, these
two variables have the lowest discriminatory power among all the independent
variables. Also from the above table it is clear that the coefficients of around seven
independent variables for the large sample period and five independent variables for the
small sample period range between -0.05 and 0.05. An interesting finding from the
above table is that all the ratios that have been calculated using debt or external
liabilities have negligible discriminatory power except TBD/TA. So on the
basis of the Standardized Canonical Discriminant Function Coefficients, it can be
concluded that the amount of debt with respect to market value of equity and book
value of equity, or current liabilities with respect to current assets, have no impact
on default probability, but the ratio of the total book value of debt to total
assets has correlation with the discriminant function.
Canonical Discriminant Function Coefficients (Unstandardized)
The below Table No. 5.1.1.1.9: Canonical Discriminant Function Coefficients
(Unstandardized) gives the values of the coefficients which would be used to arrive at
the discriminant score on the basis of the structure matrix and the significance of
variables in the tests of equality of group means.
Table No 5.1.1.1.9: Canonical Discriminant Function Coefficients (Unstandardized)

WC/TA 0.885 0.623
RE/TA 0.544 0.459
EBIT/TA 1.460 1.084
MVE/TBD 0.000 0.000
Sales/TA 0.414 0.546
CA/CL 0.012 0.020
NI/TA 0.807 0.754
NP/TE 0.001 0.013
TBD/TA -0.346 -0.982
EBIT/INT 0.000 0.000
OCFR -0.013 0.005
GRTA 0.001 0.001
ITR 0.000 0.000
FAT -0.001 -0.004
MP/EPS 0.001 0.000
MP/BV -0.002 0.012
D/E -0.002 -0.001
Log(TA/GNP) 0.459 0.481
SG 0.000 -0.002
SG/GNPG 0.000 0.000
(Constant) -2.626 -2.351
Prior Probabilities for Groups
Prior probabilities for the groups give information about the distribution of the
observations in the groups used in the study. From the below Table No. 5.1.1.1.10:
Prior Probabilities for Groups, it is clear that for the large sample period out of 1608
firm years, 200 firm years belong to defaulting group and for the small sample period
out of 833 firm years, 162 firm years belong to defaulting group.
Table No. 5.1.1.1.10: Prior Probabilities for Groups

Cases Prob. Cases Prob.
Non-Defaulting 1408 0.876 671 0.806
Defaulting 200 0.124 162 0.194
Total 1608 1.000 833 1.000
Functions at Group Centroids
The function of group centroids gives information about the average discriminating
score of each group. From the below Table No. 5.1.1.1.11: Functions at Group
Centroids, it is clear that for the large sample period the centroid of the discriminating
score for defaulting group is -1.407 and for non-defaulting group, it is 0.200. For the
small sample period the centroid of the discriminating score for defaulting group is
-1.360 and for non-defaulting group, it is 0.328.
Table No. 5.1.1.1.11: Functions at Group Centroids
Non-Defaulting 0.200 0.328
Defaulting -1.407 -1.360
The centroids of the two groups are the extreme points which are used to
formulate the decision rule for deciding the group membership of an individual case. From
the above Table No. 5.1.1.1.10: Prior Probabilities for Groups, it is clear that the numbers
of defaulting and non-defaulting cases are not equal, so to find the dividing point,
the centroids are weighted by the group sizes.
Dividing point = (N1 X Lower centroid + N2 X Higher centroid)/(N1 + N2)
For large sample period
Dividing point = (0.200 X 1408 – 1.407 X 200)/1608 = 0.00012
Decision Rule for the Classification
For defaulting firms: -1.407 < Z < 0.00012
For non-defaulting firms: 0.00012 < Z < 0.200
For small sample period
Dividing point = (0.328 X 671 – 1.36 X 162)/833 = -0.00028
For defaulting firms: -1.36 < Z < -0.00028
For non-defaulting firms: -0.00028 < Z < 0.328
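The dividing point arithmetic above can be reproduced with a short sketch. The group sizes and centroids are those reported in the tables; the function names are hypothetical, and the simple rule of assigning every score below the dividing point to the defaulting group (including scores beyond the centroids) is a simplifying assumption.

```python
def dividing_point(n_low, centroid_low, n_high, centroid_high):
    """Weighted dividing point between two group centroids."""
    return (n_low * centroid_low + n_high * centroid_high) / (n_low + n_high)


def classify(z, cutoff):
    """Assign a discriminant score to a group (simplified decision rule)."""
    return "defaulting" if z < cutoff else "non-defaulting"


# Large sample period: 200 defaulting and 1408 non-defaulting firm years.
large_cutoff = dividing_point(200, -1.407, 1408, 0.200)
# Small sample period: 162 defaulting and 671 non-defaulting firm years.
small_cutoff = dividing_point(162, -1.360, 671, 0.328)

# Hypothetical scores, for illustration only.
label_a = classify(-0.85, large_cutoff)
label_b = classify(0.15, large_cutoff)
```

Both dividing points lie very close to zero because the discriminant scores are centred on the pooled sample mean.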
In Sample Classification
Table No. 5.1.1.1.12: In Sample Classification Result

                Large Sample Period                 Small Sample Period
                Developed Model   Altman Original   Developed Model   Altman Original
Accuracy        89.9              39.2              87.4              43.8
Type I Error    1.4               68.9              2.9               68.9
Type II Error   73.3              2.5               54.3              1.9
The above Table No 5.1.1.1.12: In Sample Classification Result presents the in
sample classification accuracy of the developed model as well as of the Altman Original
Model for the two samples. The cut-off probability of group classification has been
taken as 0.5. For the large sample period, the accuracy rate of the developed model is
89.9% and for the Altman Original Model, the accuracy is 39.2%. Type I error for the
developed model is 1.4% and for the Altman Original Model, it is 68.9%. Type II
error for the developed model is 73.3% and for the Altman Original Model, it is 2.5%.
Misclassification of defaulting firms is very high for the developed model, which is
undesirable. For the small sample period, the accuracy rate of the developed model is
87.4% and for the Altman Original Model, the accuracy is 43.8%. Type I error for the
developed model is 2.9% and for the Altman Original Model, it is 68.9%. Type II
error for the developed model is 54.3% and for the Altman Original Model, it is 1.9%.
Misclassification of defaulting firms is again very high for the developed model, which is
undesirable.
From the In Sample Classification results, it can be
concluded that the accuracy levels of the developed models for both the samples are far
higher than the accuracy level of the Altman Original Model. But the problem
with the developed models is that the Type II errors for both the samples are very
high. The in sample classification accuracy of the model developed for the large
sample period is higher than that of the small sample period.
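The three measures in the table can be computed from actual and predicted group labels as sketched below. Following the way the terms appear to be used in the discussion above, Type I error is taken here as the share of non-defaulting cases classified as defaulting and Type II error as the share of defaulting cases classified as non-defaulting; this reading, the function name and the example labels are assumptions.

```python
def classification_summary(actual, predicted):
    """Accuracy, Type I and Type II error rates in percent.

    `actual` and `predicted` are sequences of 1 (defaulting) and
    0 (non-defaulting). Assumed convention:
      Type I  = non-defaulting cases classified as defaulting
      Type II = defaulting cases classified as non-defaulting
    """
    pairs = list(zip(actual, predicted))
    n_default = sum(1 for a, _ in pairs if a == 1)
    n_non_default = len(pairs) - n_default
    correct = sum(1 for a, p in pairs if a == p)
    false_alarms = sum(1 for a, p in pairs if a == 0 and p == 1)
    missed_defaults = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "accuracy": 100.0 * correct / len(pairs),
        "type_i_error": 100.0 * false_alarms / n_non_default,
        "type_ii_error": 100.0 * missed_defaults / n_default,
    }


# Hypothetical labels: 8 non-defaulting and 2 defaulting firm years.
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
summary = classification_summary(actual, predicted)
```

The example shows how a model can reach high overall accuracy while still missing a large share of the defaulting cases when defaults are a small part of the sample.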
Model Validation
This section uses two hold out samples to validate the classification accuracy of the
developed models for the two samples. The first sample consists of 307 firm years of
the same firms but beyond the sample period, for the forward testing. The second
sample consists of 160 firm years of firms which are out of the sample.
Forward Testing (In Sample Firms)
Table No. 5.1.1.1.13: Forward Testing (In Sample Firms)
                         Large Sample   Small Sample   Altman
                         Period         Period         Original
1 Year    Accuracy       69.3           76.4           60.8
          Type I Error   43.9           33.6           56.1
          Type II Error  0.0            0.0            0.0
2 Years   Accuracy       70.5           76.5           59.7
          Type I Error   44.0           34.0           58.0
          Type II Error  0.0            2.0            4.1
Overall   Accuracy       69.7           76.2           60.3
          Type I Error   44.5           34.5           57.4
          Type II Error  0.0            1.0            2.0
The above Table 5.1.1.1.13: Forward Testing (In Sample Firms) presents the results of
the forward testing of the in sample firms beyond the sample period. For the whole forward
period, the accuracy of the model developed for the large sample period is 69.7%, for the
small sample period the accuracy is 76.2% and for the Altman Original Model the accuracy
is 60.3%. So the accuracy of the model developed for the small sample period is the highest
among the three models. Type I error is also the lowest for the small sample
period. As far as the Type II error is concerned, it is the lowest for the large sample
period, but for the small sample period too, it is very low at 1%. For the one year and
two years forward periods, the highest accuracy levels are 76.4% and 76.5% respectively,
both for the small sample period. So on the basis of the results of forward testing, it can
be concluded that the model developed for the small sample period has the highest level of
classification accuracy among the three models and the Altman Original Model has the
lowest level of classification accuracy.
Out of Sample Validation
Table No. 5.1.1.1.14: Out of Sample Validation
                             Large Sample   Small Sample   Altman
                             Period         Period         Original
In Sample    Accuracy        48.3           57.6           42.4
Period       Type I Error    57.0           45.2           64.5
             Type II Error   32.0           32.0           32.0
Forward      Accuracy        39.1           43.5           26.1
             Type I Error    57.1           50.0           78.6
             Type II Error   66.7           66.7           66.7
The above Table 5.1.1.1.14: Out of Sample Validation presents the results of the out of
sample validation. For the in sample period, the accuracy of the model developed for the
large sample period is 48.3%, for the small sample period the accuracy is 57.6% and for
the Altman Original Model the accuracy is 42.4%. So the accuracy of the
model developed for the small sample period is the highest among the three models.
Type I error is also the lowest for the small sample period. As far as the
Type II error is concerned, it is the same for all the three models at 32%, which is on the
undesirably higher side. For the forward period, the classification accuracy of the out of
sample validation is again the highest for the small sample period,
at 43.5%. So on the basis of the results of out of sample validation, it can be
concluded that the model developed for the small sample period has the highest level of
classification accuracy among the three models and the Altman Original Model has the
lowest level of classification accuracy.
Discussion
From the above discussion, it can be said that the statistical coefficients of
the different tests make the models developed for the large sample period as well as the
small sample period statistically robust. On the basis of the in sample classification
accuracy, the model developed for the large sample period has the higher classification
accuracy at 89.9%, which is also higher than the classification accuracy of the Altman
Original Model. So it can be concluded that the discriminant function developed for the
large sample period has the highest discriminatory power. The findings from the
above information can be summarized as follows:
Variables in Equation
• The top six variables from the Structure Matrix that have higher correlations with the
discriminant function are the same for both the samples.
• On the basis of the Standardized Canonical Discriminant Function Coefficients, it
can be concluded that besides total book value of debt with respect to total
assets, no other variable with a debt component contributes to the discriminant
function.
• For the small sample period, the variables that are found to be significant for
model development belong to all the three categories of variables. The same is not
true of the large sample period.
Statistical Robustness
• The large values of the log determinants for both the sample periods make the
discriminant functions robust. The log determinants of the pooled within-groups and
non-defaulting groups are more than twice those of the defaulting groups for both
the samples. This may be the cause of Type II error, as the discriminant functions
may respond better to the non-defaulting firms than to the defaulting firms.
• The low eigenvalues and canonical correlations and the high Wilks’ Lambdas make the
developed discriminant functions less robust, but the Chi-square tests show that
both the discriminant functions are significant.
Classification Accuracy
• From the in sample classification results, it can be concluded that the accuracy
levels of the developed models for both the sample periods are higher than the
accuracy level of the Altman Original Model (1968). But Type II errors for both
the sample periods are very high.
• The in sample classification accuracy for the model developed for the large
• On the basis of the results of forward testing on in sample firms, it can be
concluded that the model developed for the small sample period has the highest
level of classification accuracy among the three models and the Altman Original
Model (1968) has the lowest level of classification accuracy. However, the
classification accuracies of the developed models deteriorate over a period of two
years.
• On the basis of the results of out of sample validation, it can be concluded that
the model developed for the small sample period has the highest level of
classification accuracy among the three models and the Altman Original Model has
the lowest level of classification accuracy.
• On comparing the classification accuracies, the model developed for the small
sample period for All Firms has the highest classification accuracy.
• From the classification results, it is clear that the classification accuracy of the
Altman Original Model (1968) is very low for all the samples and there is a need to
recalibrate the Altman Original Model with the same variables in the Indian context.
5.1.1.2. Large Firms
This sample initially has 1642 firm years. The study in this part is carried out in two
sub-parts, and for this purpose four discriminant models have been developed for this
sample. The first sub-part deals with the Unmatched Sample of Large Firms, which has
149 firms. The second sub-part deals with the Matched Sample of Large Firms, which has
128 firms: 64 defaulting and 64 non-defaulting. Each sub-part is further divided into
two parts, namely the large sample period and the small sample period. The large sample
period is from 1st April 2004 to 31st March 2014 and the small sample period is five
years from 1st April 2009 to 31st March 2014. After checking for outliers using
Mahalanobis statistics, the firm years for the large sample period and the small sample
period for the Unmatched Sample of Large Firms are brought down to 1165 and 604
respectively. Similarly, the firm years for the large sample period and the small
sample period for the Matched Sample of Large Firms are brought down to 1043 and 540
respectively. Multiple discriminant analysis is applied to these four samples using
SPSS software. Two hold out samples, namely ‘Forward Testing (In Sample Firms)’ and
‘Out of Sample’, are meant for validation of the classification accuracy of the
developed models. The developed models are compared against the Altman (1968) original
model. The first hold out sample, ‘Forward Testing (In Sample Firms)’, consists of the
financial, market and economic information, beyond the sample years, of the same firms
that have been used to develop the models. This sample contains 192 firm years. The
second sample, ‘Out of Sample’, consists of the financial, market and economic
information of firms which have not been part of the study. This sample contains 94
firm years.
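The outlier screening mentioned above can be illustrated with a minimal sketch. The study ran this step in SPSS over all the predictor ratios; the two-variable version below, the function name `mahalanobis_sq` and the firm-year values are assumptions made purely for illustration. The cut-off used in the study is not stated, so the sketch only ranks firm years by distance.

```python
def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix (divisor n - 1).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = [dx dy] S^-1 [dx dy]^T, with the 2x2 inverse written out.
        distances.append((dx * dx * syy - 2 * dx * dy * sxy + dy * dy * sxx) / det)
    return distances


# Hypothetical firm years given as (WC/TA, RE/TA); not data from the study.
firm_years = [(0.10, 0.20), (0.12, 0.22), (0.08, 0.18), (0.11, 0.25),
              (0.09, 0.19), (0.13, 0.21), (0.10, 0.23), (0.90, -0.80)]
d2 = mahalanobis_sq(firm_years)

# Firm years with the largest distances are the candidates for removal.
ranked = sorted(zip(d2, firm_years), reverse=True)
```

In practice the distances are usually compared with a chi-square critical value whose degrees of freedom equal the number of variables; that threshold is a modelling choice and is left out here because the source does not report it.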
Unmatched Sample
Z = - 2.954 + 1.198WC/TA + 0.811RE/TA + 1.223EBIT/TA + 0.398Sales/TA -
0.915NI/TA - 0.243TBD/TA + 0.079OCFR + 0.402GRTA + 0.045MP/BV
+ 0.466Log(TA/GNP)
Z = - 2.728 + 0.48WC/TA + 0.873RE/TA - 0.562EBIT/TA + 0.466Sales/TA +
0.848NI/TA + 0.06NP/TE - 0.97TBD/TA + 0.007OCFR - 0.2GRTA + 0.03MP/BV
+ 0.593Log(TA/GNP) + 0.011D/E + 0.564SG
Matched Sample
Z = - 2.876 + 1.227WC/TA + 0.667RE/TA + 3.214EBIT/TA + 0.33Sales/TA -
2.011NI/TA - 0.186TBD/TA + 0.1OCFR + 0.635GRTA + 0.04MP/BV
+ 0.444Log(TA/GNP)
For small sample period, an 11 factors model is found.
Z = - 2.335 + 0.221WC/TA + 1.068RE/TA + 0.106EBIT/TA + 0.466Sales/TA -
0.801NI/TA + 0.054NP/TE - 0.847TBD/TA + 1.005GRTA + 0.025MP/BV
+ 0.443Log(TA/GNP) + 0.893SG
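A Z equation of this kind can be evaluated with a short sketch. The coefficients and constant below are the unstandardized values that Table No. 5.1.1.2.9 reports for the Unmatched Sample, large sample period, restricted to the ten variables of the corresponding Z equation, and the dividing point 0.00038 is the one the text gives for that sample. The function names and the firm ratios are hypothetical, and Log(TA/GNP) is passed in already computed because the base of the logarithm is not stated.

```python
# Coefficients for the Unmatched Sample, large sample period
# (Table No. 5.1.1.2.9), limited to the ten variables of the Z equation.
COEFFICIENTS = {
    "WC/TA": 1.198, "RE/TA": 0.811, "EBIT/TA": 1.223, "Sales/TA": 0.398,
    "NI/TA": -0.915, "TBD/TA": -0.243, "OCFR": 0.079, "GRTA": 0.402,
    "MP/BV": 0.045, "Log(TA/GNP)": 0.466,
}
CONSTANT = -2.954
DIVIDING_POINT = 0.00038  # weighted centroid cut-off reported for this sample


def z_score(ratios):
    """Discriminant score; ratios missing from the dict are treated as zero."""
    return CONSTANT + sum(coef * ratios.get(name, 0.0)
                          for name, coef in COEFFICIENTS.items())


def classify(ratios):
    """The non-defaulting centroid is positive, so higher scores map to it."""
    return "non-defaulting" if z_score(ratios) > DIVIDING_POINT else "defaulting"


# Hypothetical firm year, for illustration only.
example_firm = {"WC/TA": 0.25, "RE/TA": 0.40, "EBIT/TA": 0.15, "Sales/TA": 1.2,
                "NI/TA": 0.08, "TBD/TA": 0.30, "OCFR": 0.5, "GRTA": 0.10,
                "MP/BV": 2.0, "Log(TA/GNP)": 1.5}
example_z = z_score(example_firm)
example_group = classify(example_firm)
```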
Empirical Results
In this section the empirical results of the study for the Sample of Large Firms are
presented and discussed in different sub-parts and compared with other studies.
From the following Table No. 5.1.1.2.1: Case Processing Summary, it is clear that
there are four samples for which Z score models have been developed using multiple
discriminant analysis.
Table No. 5.1.1.2.1: Case Processing Summary
            Unmatched Sample                Matched Sample
            Large Sample    Small Sample    Large Sample    Small Sample
            Period          Period          Period          Period
Valid       1139            595             1020            534
Excluded    26              9               23              6
Total       1165            604             1043            540
group mean.
From the below Table No. 5.1.1.2.2: Tests of Equality of Group Means it is clear that
ten factors for the unmatched-large sample period, thirteen factors for the
unmatched-small sample period, ten factors for the matched-large sample period and
eleven factors for the matched-small sample period are found to have p-values less
than the critical p-value of 0.05 by the F test. So for these variables the null
hypothesis is rejected and the alternative hypothesis is accepted that there is a
significant difference between the group means on each of these independent variables.
Table No. 5.1.1.2.2: Tests of Equality of Group Means
(Wilks' Lambda and Sig. for each sample, in the order Unmatched-Large, Unmatched-Small,
Matched-Large, Matched-Small)
Variable Lambda Sig. Lambda Sig. Lambda Sig. Lambda Sig.
WC/TA 0.982 0.000 0.984 0.002 0.974 0.000 0.978 0.001
RE/TA 0.832 0.000 0.791 0.000 0.816 0.000 0.768 0.000
EBIT/TA 0.890 0.000 0.819 0.000 0.854 0.000 0.810 0.000
MVE/TBD 0.999 0.335 0.997 0.213 0.999 0.424 0.998 0.352
Sales/TA 0.934 0.000 0.894 0.000 0.932 0.000 0.899 0.000
CA/CL 1.000 0.768 1.000 0.776 0.999 0.413 0.999 0.582
NI/TA 0.893 0.000 0.816 0.000 0.864 0.000 0.819 0.000
NP/TE 1.000 0.468 0.986 0.004 0.997 0.087 0.987 0.008
TBD/TA 0.919 0.000 0.870 0.000 0.922 0.000 0.872 0.000
EBIT/INT 0.996 0.026 0.993 0.044 0.996 0.042 0.993 0.057
OCFR 0.988 0.000 0.992 0.029 0.992 0.006 0.995 0.104
GRTA 0.988 0.000 0.983 0.001 0.979 0.000 0.932 0.000
ITR 1.000 0.800 0.999 0.578 1.000 0.685 0.999 0.507
FAT 0.998 0.139 1.000 0.815 1.000 0.701 1.000 0.804
MP/EPS 0.999 0.370 1.000 0.612 0.999 0.397 1.000 0.711
MP/BV 0.960 0.000 0.949 0.000 0.964 0.000 0.956 0.000
D/E 1.000 0.665 0.993 0.047 0.997 0.098 0.994 0.074
Log(TA/GNP) 0.981 0.000 0.956 0.000 0.977 0.000 0.951 0.000
SG 0.999 0.305 0.971 0.000 0.999 0.284 0.947 0.000
SG/GNP 0.999 0.207 0.995 0.093 0.998 0.199 0.995 0.112
So these independent variables are significant for the study, although the Wilks’
Lambdas of these significant variables are undesirably high. It is interesting to find
that the same ten independent variables are significant for all the four samples, and
these significant variables belong to all the three types of variables for all the
four samples.
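For a single variable and two groups, the Wilks' Lambda and F statistic of a test of equality of group means follow directly from sums of squares: Lambda is the within-group sum of squares divided by the total sum of squares, and F = (1 - Lambda) / Lambda × (N - 2). The sketch below shows the arithmetic; the function name and the sample values are hypothetical and are not data from the study.

```python
def wilks_lambda_and_f(group_a, group_b):
    """Univariate Wilks' Lambda and F statistic for a two-group comparison."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    grand_mean = (sum(group_a) + sum(group_b)) / n
    # Within-group and total sums of squares.
    ss_within = (sum((x - mean_a) ** 2 for x in group_a)
                 + sum((x - mean_b) ** 2 for x in group_b))
    ss_total = sum((x - grand_mean) ** 2 for x in list(group_a) + list(group_b))
    wilks = ss_within / ss_total
    # With two groups the F statistic has 1 and N - 2 degrees of freedom.
    f_stat = (1.0 - wilks) / wilks * (n - 2)
    return wilks, f_stat


# Hypothetical ratio values for non-defaulting and defaulting firm years.
lam, f_stat = wilks_lambda_and_f([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

A Lambda close to 1 means the group means barely differ on that variable, which is why values such as 0.98 are described as undesirably high even when the F test is significant in a large sample.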
Log Determinants
Table No. 5.1.1.2.3: Log Determinants
                              Unmatched Sample      Matched Sample
                       Rank   Large      Small      Large      Small
                              Sample     Sample     Sample     Sample
Non-Defaulting         20     69.607     59.787     67.819     57.125
Defaulting             20     38.114     33.236     40.606     33.236
Pooled within-groups   20     71.797     66.006     71.136     63.501
From the above Table No. 5.1.1.2.3: Log Determinants it is clear that the log
determinants for the pooled within-groups matrix and for the non-defaulting group are
comparable for all the samples. This indicates that the covariance matrix of the
non-defaulting group does not differ much from the pooled within-groups covariance
matrix. But for the defaulting group, the log determinants for all the samples are
almost half of the log determinants for the non-defaulting group. This indicates that
the group covariance matrices differ for all the samples. The log determinants for all
the samples for the two groups are large, which indicates that the discriminant
function is robust enough. As confirmed by the classification results, it seems that
the discriminant function can predict the non-defaulting firms better than the
defaulting firms.
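Box’s M Test
The null hypothesis of the test is that the covariance matrices do not differ between
groups formed by the dependent variables. The alternative hypothesis is that the
covariance matrices differ between groups formed by the dependent variables.
Table No. 5.1.1.2.4: Test Results
                  Unmatched Sample                Matched Sample
                  Large Sample    Small Sample    Large Sample    Small Sample
                  Period          Period          Period          Period
Box's M           6710.301        6714.871        7240.478        6115.204
F    Approx.      30.264          29.927          32.749          27.233
     df1          210             210             210             210
     df2          173112.391      136326.296      198472.155      140708.733
     Sig.         0.000           0.000           0.000           0.000
From the above Table No. 5.1.1.2.4: Test Results, the p-values of the test for all the
four samples are lower than the critical p-value of 0.05. That means that the null
hypothesis, that the covariance matrices do not differ between groups formed by the
dependent variables, is rejected. The covariance matrices therefore differ between the
groups. This is an undesirable situation in the analysis and is against the assumption
of multiple discriminant analysis.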
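The Box's M statistic can be recovered from the log determinants reported earlier, using the standard definition M = (N - g) ln|S_pooled| - Σ (n_i - 1) ln|S_i|. The sketch below applies it to the Unmatched Sample, large sample period, taking the group sizes from the prior probabilities table and the log determinants from Table No. 5.1.1.2.3. The function name `box_m` is hypothetical.

```python
def box_m(group_sizes, group_log_dets, pooled_log_det):
    """Box's M from group sizes and log determinants of covariance matrices."""
    n_total = sum(group_sizes)
    n_groups = len(group_sizes)
    m = (n_total - n_groups) * pooled_log_det
    for n_i, log_det in zip(group_sizes, group_log_dets):
        m -= (n_i - 1) * log_det
    return m


# Unmatched Sample, large sample period: 1004 non-defaulting and 135 defaulting
# firm years; log determinants 69.607, 38.114 and 71.797 (pooled within-groups).
m_unmatched_large = box_m([1004, 135], [69.607, 38.114], 71.797)
```

The result is about 6710.1, against the reported 6710.301; the small gap is what one would expect from the log determinants being rounded to three decimals.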
Eigenvalues
Table No. 5.1.1.2.5: Eigenvalues
Function  Eigenvalue  % of Variance  Cumulative %  Canonical Correlation
Unmatched
Matched
The above Table No. 5.1.1.2.5: Eigenvalues presents the eigenvalues and canonical
correlations of the four models developed. The eigenvalues of the functions are not
very large, but the canonical correlations are large enough to make the developed
models robust. Although the eigenvalues of different samples are not directly
comparable, when these values are interpreted in conjunction with the canonical
correlations, the discriminant functions for the small sample period are more robust.
Wilks' Lambda
Table No. 5.1.1.2.6: Wilks’ Lambda
Function Wilks' Lambda Chi-square df Sig.

Unmatched
Matched
From the above Table 5.1.1.2.6: Wilks' Lambda, it is clear that the p-values of the
Chi-square test for all the four samples are less than the critical p-value of 0.05.
That means the null hypothesis is rejected and the alternative hypothesis is accepted
that there is significant discriminating power in the independent variables of the
four developed models. The Wilks’ Lambdas for the four samples are neither large nor
small. In conjunction with the Chi-square test, the developed models can be termed
robust. Therefore, on the basis of the Wilks’ Lambda and Chi-square values, it can be
concluded that the two groups differ from each other for all the four samples and the
functions can effectively discriminate between the two groups.
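With two groups there is a single discriminant function, and its eigenvalue, canonical correlation and Wilks' Lambda are tied together by standard identities: canonical correlation = sqrt(eigenvalue / (1 + eigenvalue)) and Lambda = 1 / (1 + eigenvalue). The sketch below also applies Bartlett's chi-square approximation, which is assumed here to be the form of the Chi-square test reported. The eigenvalue 0.33 is a hypothetical input, since the table values are not available in this text; 1139 cases and 20 predictors correspond to the Unmatched Sample, large sample period.

```python
import math


def single_function_summary(eigenvalue, n_cases, n_predictors, n_groups=2):
    """Canonical correlation, Wilks' Lambda, chi-square and df for one function."""
    canonical_corr = math.sqrt(eigenvalue / (1.0 + eigenvalue))
    wilks = 1.0 / (1.0 + eigenvalue)
    # Bartlett's approximation (assumed to match the reported Chi-square test).
    chi_square = -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(wilks)
    df = n_predictors * (n_groups - 1)
    return canonical_corr, wilks, chi_square, df


# Hypothetical eigenvalue; case and predictor counts as in the unmatched-large sample.
corr, wilks, chi_square, df = single_function_summary(0.33, 1139, 20)
```

The identities show why a modest eigenvalue and a Wilks' Lambda that is "neither large nor small" can still give a highly significant Chi-square value when the sample is large.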
Structure Matrix
Table No. 5.1.1.2.7: Structure Matrix

EBIT/TA 0.611 .652 0.779 0.678
NI/TA 0.601 .658 0.678 0.651
RE/TA 0.783 .712 0.651 0.779
TBD/TA -0.518 -.536 -0.477 -0.477
Sales/TA 0.464 .478 0.444 0.444
Log(TA/GNP) 0.239 .296 0.319 0.253
OCFR 0.189 .124 0.266 0.143
EBIT/INT 0.115 .115 0.253 0.105
SG/GNPG -0.065 -.096 0.241 -0.066

FAT 0.076 .013 0.143 0.020
NP/TE 0.037 .164 0.105 0.088
D/E -0.022 -.114 0.088 -0.085
SG 0.053 .240 -0.085 0.055
MVE/TBD 0.050 .071 -0.066 0.041
WC/TA 0.237 .177 0.055 0.266
ITR 0.013 .032 0.044 0.021
MP/EPS 0.046 .029 0.042 0.044
MP/BV 0.354 .323 0.041 0.319
CA/CL 0.015 .016 0.021 0.042
GRTA 0.189 .182 0.020 0.241
The Structure Matrix shows the correlation of each variable with the discriminant
function; these correlations are similar to the factor loadings in factor analysis.
The above Table 5.1.1.2.7: Structure Matrix shows the correlations between the
predictor variables and the standardized canonical discriminant function. From the
table it can be said that, apart from the top five variables, most of the independent
variables have very low correlations and contribute negligibly to the discriminant
function for all the samples. Also, the top five variables from the Structure Matrix
that have the highest correlations with the discriminant functions are the same for
the four samples.
Table No. 5.1.1.2.8: Standardized Canonical Discriminant Function Coefficients
Variable Unmatched-Large Unmatched-Small Matched-Large Matched-Small
WC/TA 0.349 0.155 0.362 0.073
RE/TA 0.443 0.484 0.345 0.532
EBIT/TA 0.242 -0.102 0.518 0.018
MVE/TBD -0.051 -0.124 -0.036 -0.148
Sales/TA 0.365 0.376 0.287 0.379
CA/CL 0.065 0.105 0.084 0.144
NI/TA -0.146 0.123 -0.267 -0.112
NP/TE 0.047 0.272 0.052 0.259
TBD/TA -0.204 -0.350 -0.165 -0.309
EBIT/INT -0.093 -0.097 -0.123 -0.089
OCFR 0.136 0.013 0.173 0.092
GRTA 0.233 -0.113 0.296 0.239
ITR -0.005 -0.039 -0.004 -0.053
FAT -0.098 -0.119 -0.104 -0.146
MP/EPS -0.001 -0.080 -0.019 -0.052
MP/BV 0.294 0.216 0.253 0.183
D/E -0.025 0.224 -0.039 0.239
Log(TA/GNP) 0.292 0.366 0.285 0.276
SG 0.027 0.370 0.019 0.402
SG/GNPG -0.124 -0.036 -0.121 0.119
From the above Table No 5.1.1.2.8: Standardized Canonical Discriminant Function
Coefficients, it is clear that for the Unmatched Sample of Large Firms for the large
sample period RE/TA has the highest coefficient at 0.443 and MP/EPS has the smallest
coefficient in absolute value at -0.001. For the Unmatched Sample of Large Firms for
the small sample period RE/TA has the highest coefficient at 0.484 and OCFR has the
smallest at 0.013. For the Matched Sample of Large Firms for the large sample period
EBIT/TA has the highest coefficient at 0.518 and ITR has the smallest at -0.004. For
the Matched Sample of Large Firms for the small sample period RE/TA has the highest
coefficient at 0.532 and EBIT/TA has the smallest at 0.018.
For all the four models, the coefficients of a number of the independent variables
lie between -0.05 and 0.05. An interesting finding from the above table is that all
the ratios that have been calculated using debt or external liabilities have
negligible discriminatory power, except TBD/TA for the large sample period. But for
the samples for the large sample period, all the independent variables that have a
debt or external liabilities component have higher coefficients than for the small
sample period. So on the basis of the Standardized Canonical Discriminant Function
Coefficients, it can be concluded that besides the amount of debt with respect to
total assets, no variable that has a debt or external liabilities component has any
correlation with the discriminant function. The coefficients indicate that, along
with accounting variables, market and economic variables have an impact on the
discriminant function and are helpful in deciding the membership of groups.
Table No. 5.1.1.2.9: Unstandardized Canonical Discriminant Function Coefficients
Variable Unmatched-Large Unmatched-Small Matched-Large Matched-Small
WC/TA 1.198 0.480 1.227 0.221
RE/TA 0.811 0.873 0.667 1.068
EBIT/TA 1.223 -0.562 3.214 0.106
MVE/TBD 0.000 0.000 0.000 0.000

Sales/TA 0.398 0.466 0.330 0.466
CA/CL 0.015 0.022 0.019 0.029
NI/TA -0.915 0.848 -2.011 -0.801

NP/TE 0.008 0.060 0.008 0.054
TBD/TA -0.243 -0.970 -0.186 -0.847
EBIT/INT 0.000 0.000 0.000 0.000
OCFR 0.079 0.007 0.100 0.049
GRTA 0.402 -0.200 0.635 1.005
ITR 0.000 0.000 0.000 0.000
FAT -0.009 -0.010 -0.009 -0.011
MP/EPS 0.000 -0.001 0.000 -0.001
MP/BV 0.045 0.030 0.040 0.025
D/E -0.002 0.011 -0.002 0.011
Log(TA/GNP) 0.466 0.593 0.444 0.443
SG 0.007 0.564 0.005 0.893

SG/GNPG 0.000 0.000 0.000 0.000
(Constant) -2.954 -2.728 -2.876 -2.335
The above Table 5.1.1.2.9: Unstandardized Canonical Discriminant Function
Coefficients gives the values of the coefficients which are used to arrive at the
discriminant score. The variables entering each function are selected on the basis of
the structure matrix and the significance of the variables in the tests of equality of
group means.
From Table No. 5.1.1.2.10: Prior Probabilities for Groups, it is clear that out of all
the firm years, 11.9% from the Unmatched Sample of Large Firms for the large sample
period, 19.3% from the Unmatched Sample of Large Firms for the small sample period,
14% from the Matched Sample of Large Firms for the large sample period and 21.5% from
the Matched Sample of Large Firms for the small sample period belong to the defaulting
group.
Table No. 5.1.1.2.10: Prior Probabilities for Groups
                  Unmatched Sample                   Matched Sample
                  Large Sample     Small Sample      Large Sample     Small Sample
                  Period           Period            Period           Period
Z                 Cases   Prob.    Cases   Prob.     Cases   Prob.    Cases   Prob.
Non-Defaulting    1004    0.881    480     0.807     877     0.860    419     0.785
Defaulting        135     0.119    115     0.193     143     0.140    115     0.215
Total             1139    1.000    595     1.000     1020    1.000    534     1.000
score of each group. The below Table No. 5.1.1.2.11: Functions at Group Centroids
presents the centroids of the discriminating scores for the defaulting group as well
as for the non-defaulting group for all the four samples.
Non-Defaulting 0.211 0.352 0.399 0.246
Defaulting -1.566 -1.471 -1.452 -1.508
weights on centroids is used as follows:
For Unmatched Sample of Large Firms for Large sample period
Dividing point = (0.211 × 1004 - 1.566 × 135)/1139 = 0.00038
For Unmatched Sample of Large Firms for Small sample period
Dividing point = (0.352 × 480 - 1.471 × 115)/595 = - 0.00034
For Matched Sample of Large Firms for Large sample period
For Matched Sample of Large Firms for Small sample period
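The weighted-centroid calculation above can be written as a small function. The sketch below reproduces the two dividing points given for the Unmatched Sample from the group centroids and the group sizes; the function name `dividing_point` is hypothetical. The Matched Sample values are not computed because their formulas are not available in this text.

```python
def dividing_point(centroids, group_sizes):
    """Cut-off score: group centroids weighted by the number of cases per group."""
    total = sum(group_sizes)
    return sum(c * n for c, n in zip(centroids, group_sizes)) / total


# Unmatched Sample of Large Firms: (non-defaulting, defaulting) centroids and sizes.
dp_large_period = dividing_point([0.211, -1.566], [1004, 135])  # about 0.00038
dp_small_period = dividing_point([0.352, -1.471], [480, 115])   # about -0.00034
```

Because the discriminant scores are centred, the weighted centroid lands very close to zero; a firm year scoring above the cut-off falls on the side of the non-defaulting centroid.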
                 Unmatched Sample                      Matched Sample
                 Large Period      Small Period        Large Period      Small Period
                 Model   Altman    Model   Altman      Model   Altman    Model   Altman
                         Original          Original            Original          Original
Accuracy         91.2    39.1      88.2    43.7        90.4    37.1      88.1    41.7
Type I Error     0.4     68.6      2.2     69.1        0.7     72.7      2.4     73.6
Type II Error    72.3    2.9       52.2    1.7         64.8    2.8       47.0    1.7

classification accuracy of the four developed models. For the Unmatched Sample of
Large Firms for the large sample period, the classification accuracy of the developed
model is 91.2% against 39.1% for the Altman Original Model. Type I error of the model
is low at 0.4% but the Type II error is very high at 72.3%. For the Unmatched Sample
of Large Firms for the small sample period, the classification accuracy of the
developed model is 88.2% against 43.7% for the Altman Original Model. Type I error of
the model is low at 2.2% but the Type II error is very high at 52.2%. For the Matched
Sample of Large Firms for the large sample period, the classification accuracy of the
developed model is 90.4% against 37.1% for the Altman Original Model. Type I error of
the model is low at 0.7% but the Type II error is very high at 64.8%. For the Matched
Sample of Large Firms for the small sample period, the classification accuracy of the
developed model is 88.1% against 41.7% for the Altman Original Model. Type I error of
the model is low at 2.4% but the Type II error is very high at 47%. A high level of
Type II error is always undesirable.
The In Sample Period classification accuracies of the developed functions for all the
four samples are far higher than that of the Altman Original Model. Also, from the
results it is clear that the developed discriminant functions can classify
non-defaulting cases more accurately than defaulting cases, as suggested by the log
determinants. On comparison of results, it is clear that the classification accuracies
of the models for the large sample period are higher than those of the models for the
small sample period for both the Unmatched and the Matched samples.
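The three figures reported for each model can be computed from the four cells of a classification table. The sketch below assumes the convention implied by the discussion, where Type I error is the share of non-defaulting firm years classified as defaulting and Type II error is the share of defaulting firm years classified as non-defaulting; the source does not define the terms in this section. The function name and the counts are hypothetical.

```python
def classification_rates(nd_correct, nd_wrong, d_correct, d_wrong):
    """Accuracy, Type I and Type II error rates in percent.

    nd_* are counts of non-defaulting firm years, d_* of defaulting firm years.
    Assumed convention: Type I = non-defaulting classified as defaulting,
    Type II = defaulting classified as non-defaulting.
    """
    total = nd_correct + nd_wrong + d_correct + d_wrong
    accuracy = 100.0 * (nd_correct + d_correct) / total
    type_i = 100.0 * nd_wrong / (nd_correct + nd_wrong)
    type_ii = 100.0 * d_wrong / (d_correct + d_wrong)
    return accuracy, type_i, type_ii


# Hypothetical counts: 1000 non-defaulting and 100 defaulting firm years.
accuracy, type_i, type_ii = classification_rates(990, 10, 40, 60)
```

When defaulting firm years are a small share of the sample, overall accuracy is dominated by the non-defaulting group, so a model can report accuracy above 90% while still missing most defaults.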
Validation of Model
developed models for the four samples. For the Unmatched Samples, the first sample
consists of 217 firm years of the same firms, beyond the sample period, for forward
testing, and the second sample consists of 85 firm years of out of sample firms. For
the Matched Samples, the first sample consists of 192 firm years of the same firms,
beyond the sample period, for forward testing, and the second sample consists of 94
firm years of out of sample firms.

Table No. 5.1.1.2.13: Forward Testing (In Sample Firms)
                          Unmatched Sample              Matched Sample
                          Large    Small    Altman      Large    Small    Altman
                          Sample   Sample   Original    Sample   Sample   Original
1 Year    Accuracy        83.6     89.1     64.6        80.6     85.7     66.3
          Type I Error    22.2     15.3     54.2        28.3     20.0     55.0
          Type II Error   5.3      2.6      0.0         5.3      5.3      0.0
2 Years   Accuracy        85.1     86.9     66.4        84.0     88.3     67.0
          Type I Error    20.1     13.4     50.8        24.1     14.8     53.7
          Type II Error   5.0      12.5     5.0         5.0      7.5      5.0
Overall   Accuracy        84.3     88.0     65.4        82.3     87.0     66.7
          Type I Error    21.6     14.4     52.5        26.3     17.5     54.4
          Type II Error   5.1      7.7      2.6         5.1      6.4      2.7

The above Table No 5.1.1.2.13: Forward Testing (In Sample Firms) presents the
classification accuracy of the four developed models. For the Unmatched Sample of
Large Firms for the large sample period, the overall classification accuracy of the
developed model is 84.3% against 65.4% for the Altman Original Model. Type I error of
the model is on the higher side at 21.6% but the Type II error is on the lower side at
5.1%. For the Unmatched Sample of Large Firms for the small sample period, the
classification accuracy of the developed model is 88.0% against 65.4% for the Altman
Original Model. Type I error of the model is on the higher side at 14.4% and the Type
II error is at 7.7%. For the Matched Sample of Large Firms for the large sample period,
the classification accuracy of the developed model is 82.3% against 66.7% for the
Altman Original Model. Type I error of the model is on the higher side at 26.3% but
the Type II error is on the lower side at 5.1%. For the Matched Sample of Large Firms
for the small sample period, the classification accuracy of the developed model is
87.0% against 66.7% for the Altman Original Model. Type I error of the model is on the
higher side at 17.5% but the Type II error is at 6.4%. A high level of Type II error is
always undesirable. The forward testing classification accuracies of the developed
functions for all the four samples are far higher than that of the Altman Original
Model. Also, from the results it is clear that the developed discriminant functions
can classify non-defaulting cases more accurately than defaulting cases, as suggested
by the log determinants. On comparison of results, it is clear that the classification
accuracies of the models for the small sample period are higher than those of the
models for the large sample period for both the Unmatched and the Matched samples.
For the one year and two years forward periods, the highest classification accuracies
are 89.1% for the Unmatched Sample for the small sample period and 88.3% for the
Matched Sample for the small sample period. In the case of one year forward testing,
the model for the unmatched sample gives the highest classification accuracy, and in
the case of two years forward testing, the model for the matched sample gives the
highest classification accuracy. Type I errors for both one year and two years forward
testing are on the higher side for all the four models, but Type II errors are on the
lower side for all the four models. So on the basis of the results of forward testing,
it can be concluded that the model developed for the small sample period has the
highest level of classification accuracy among the four models and the Altman Original
Model has the lowest level of classification accuracy.
Table No. 5.1.1.2.14: Out of Sample (Validation)
                                   Unmatched Sample              Matched Sample
                                   Large    Small    Altman      Large    Small    Altman
                                   Sample   Sample   Original    Sample   Sample   Original
In Sample Period   Accuracy        62.4     58.3     60.0        63.8     64.8     73.4
                   Type I Error    44.1     39.7     91.2        44.4     32.4     20.8
                   Type II Error   11.8     50.0     47.1        9.1      45       45.5
Forward            Accuracy        75.0     46.7     25.0        76.5     46.7     76.5
                   Type I Error    22.2     22.2     66.7        22.2     22.2     0.0
                   Type II Error   28.6     100.0    85.7        25.0     100.0    50.0
The above Table 5.1.1.2.14: Out of Sample (Validation) presents the results of out of
sample validation. For the Unmatched Sample, the In Sample Period classification
accuracy of the model for the large sample period is the highest at 62.4%, in
comparison to the model for the small sample period and the Altman Original Model,
along with the lowest Type II error. For the Matched Sample, the In Sample Period
classification accuracy of the Altman Original Model is the highest at 73.4%, in
comparison to the models for the large and small sample periods. Among all the models,
the In Sample Period accuracy of the Altman Original Model is the highest at 73.4%. As
far as out of sample forward testing is concerned, the classification accuracy is the
highest for the model developed for the Matched Sample of Large Firms for the large
sample period at 76.5%.
From the above results of the out of sample validation, for the In Sample Period as
well as for forward testing, the model developed for the Matched Sample of Large
Firms for the large sample period has the highest level of accuracy.
Discussion
From the above discussion, it can be said on the basis of the statistical coefficients
for the different tests for the developed models that the model developed for the
matched sample of small sample period has higher statistical robustness. And on the
basis of classification accuracy, the model developed for the unmatched sample of
large sample period has a classification accuracy of 91.2%, which is higher than that
of the Altman Original Model. The findings from the above information can be
summarized as follows:
• It is interesting to find that the same ten independent variables are significant
for all the four samples, and these significant variables belong to all the three
types of variables for all the four samples.
• Apart from the top five variables, the independent variables have low correlations
and contribute negligibly to the discriminant function for all the samples. Also, the
top five variables from the Structure Matrix that have the highest correlations with
the discriminant functions are the same for the four samples.
• An interesting finding from the significant variables is that besides the total book
value of debt with respect to total assets, no variable with a debt component is
important for any of the functions.
• The large values of the log determinants for the four samples make the developed
discriminant functions robust. The log determinants of the pooled within-groups and
non-defaulting groups are almost twice those of the defaulting groups for all the
samples. This may be the cause of Type II error, as the discriminant function may be
responding better to the non-defaulting firms than to the defaulting firms.
• Eigenvalues for the four developed models are not very high, but the canonical
correlations of the developed discriminant functions are strong. This makes the
functions robust.
• Wilks’ Lambdas of the four samples are not low, but the Chi-square test results
clearly signify that all the functions are significant.
• From the in sample classification results, the In Sample Period classification
accuracies of the developed functions for all the four samples are far higher than
that of the Altman Original Model. Also, from the results it is clear that the
developed discriminant functions can classify non-defaulting cases more accurately
than defaulting cases, as suggested by the log determinants. On comparison of results,
it is clear that the classification accuracies of the models for the large sample
period are higher than those of the models for the small sample period for both the
unmatched and the matched samples.
• On comparison of the forward testing results, it is clear that the classification
accuracies of the models for the small sample period are higher than those of the
models for the large sample period for both the Unmatched and the Matched samples. So
on the basis of the results of forward testing, it can be concluded that the model
developed for the small sample period has the highest level of classification accuracy
among the four models and the Altman Original Model has the lowest level of
classification accuracy.
• From the classification accuracies of the models for in sample classification,
forward testing and out of sample, it is clear that the classification accuracy for
the unmatched sample and small sample period is higher than that of the other models.
5.1.1.3. Large Firms with PSU
This sample initially has 1880 firm years. After refinement of data and checking for
outliers using Mahalanobis statistics, the firm years are brought down to 1289 for the
sample period of ten years from 1st April 2004 to 31st March 2014. The study is
conducted in two parts: for the large sample period from 1st April 2004 to 31st March
2014 and for the small sample period from 1st April 2009 to 31st March 2014. Multiple
discriminant analysis is applied to these two samples using SPSS software. Two hold
out samples, namely Forward Testing Sample and Out of Sample, are meant for validation
of the classification accuracy of the developed model. The first sample, ‘Forward
Testing Sample’, consists of the financial, market and economic information, beyond
the sample years, of the same firms that have been used to develop the model. This
sample contains 262 firm years. The second sample, ‘Out of Sample’, consists of the
financial, market and economic information of firms which have not been part of the
study. This sample contains 240 firm years.

0.860NI/TA - 0.28TBD/TA + 0.068OCFR + 0.389GRTA + 0.03MP/BV + 0.459
Log (TA/GNP)
Z = - 2.862 + 0.41WC/TA + 0.941RE/TA - 0.256EBIT/TA + 0.377Sales/TA +
0.453NI/TA + 0.264NP/TE - 1.156TBD/TA - 0.021GRTA + 0.025MP/BV + 0.642
Log (TA/GNP) + 0.552SG
Empirical Results
Analysis Case Processing Summary
From the following Table No. 5.1.1.3.1: Case Processing Summary, it is clear that for
the large sample period there are 1263 valid cases in total, while for the small sample
period there are 651 valid cases. On these two samples, multiple discriminant analysis
has been conducted to develop two Z score models.
Table No. 5.1.1.3.1: Case Processing Summary
            Large Sample Period    Small Sample Period
Valid       1263                   651
Excluded    26                     9
Total       1289                   660
The test of equality of group means checks whether there is any significant difference
between the groups on each of the independent variables, using the group means. The
null hypothesis of the test is that there is no significant difference between the
group means on each of the independent variables. The alternative hypothesis is that
there is a significant difference between the group means on each of the independent
variables.

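Table No. 5.1.1.3.2: Tests of Equality of Group Means
(Wilks' Lambda, F and Sig. for the large sample period, then for the small sample period)
Variable Lambda F Sig. Lambda F Sig.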
WC/TA 0.982 22.848 0.000 0.984 10.285 0.001
RE/TA 0.839 241.585 0.000 0.807 155.513 0.000
EBIT/TA 0.897 144.763 0.000 0.841 122.892 0.000
MVE/TBD 0.999 0.845 0.358 0.998 1.105 0.294
Sales/TA 0.943 76.016 0.000 0.915 60.042 0.000
CA/CL 1.000 0.000 0.985 1.000 0.002 0.966
NI/TA 0.902 137.244 0.000 0.843 121.137 0.000
NP/TE 0.999 0.662 0.416 0.989 7.448 0.007
TBD/TA 0.916 115.105 0.000 0.868 98.713 0.000
EBIT/INT 0.996 5.450 0.020 0.993 4.581 0.033
OCFR 0.988 15.051 0.000 0.991 5.626 0.018
GRTA 0.990 12.234 0.000 0.987 8.498 0.004
ITR 1.000 0.036 0.849 1.000 0.188 0.665
FAT 0.998 2.176 0.140 0.997 1.839 0.176
MP/EPS 0.999 1.134 0.287 0.999 0.362 0.547
MP/BV 0.974 33.599 0.000 0.965 23.797 0.000
D/E 1.000 0.274 0.600 0.999 0.613 0.434
Log(TA/GNP) 0.972 36.946 0.000 0.945 38.102 0.000
SG 0.999 1.006 0.316 0.977 15.418 0.000
SG/GNP 0.999 1.632 0.202 0.996 2.541 0.111
From the above Table 5.1.1.3.2: Tests of Equality of Group Means it is clear that
eleven variables for the large sample period have p-values less than the critical
p-value of 0.05 by the F test. For the small sample period, thirteen variables out of
twenty are found to have p-values less than the critical p-value. So for the respective
variables of the two samples the null hypothesis, that there is no significant
difference between the group means on each of the independent variables, is rejected
and the alternative hypothesis, that there is a significant difference between the
group means, is accepted. This indicates that these independent variables are
significant for the study and are part of the models. But the Wilks’ Lambdas of these
significant variables are on the higher side, which is undesirable; smaller values of
Wilks’ Lambda are considered better for the analysis.
For the large sample period, of the eleven significant variables, one variable has a
zero coefficient value. The remaining ten variables belong to all the three types of
variables, namely accounting, market and economic variables. For the small sample
period, of the thirteen significant variables, two variables have zero coefficient
values and the remaining eleven variables belong to all the three types of variables
that have been used in the study. Also, besides the total book value of debt to total
assets ratio, no other variable that has a debt or external liabilities component is
useful.
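The counts of significant variables quoted above can be checked directly against the Sig. columns of Table 5.1.1.3.2. The sketch below copies the p-values from that table and filters them at the 0.05 level; the variable and function names are only illustrative.

```python
# Sig. values from Table 5.1.1.3.2: (large sample period, small sample period).
SIG = {
    "WC/TA": (0.000, 0.001), "RE/TA": (0.000, 0.000), "EBIT/TA": (0.000, 0.000),
    "MVE/TBD": (0.358, 0.294), "Sales/TA": (0.000, 0.000), "CA/CL": (0.985, 0.966),
    "NI/TA": (0.000, 0.000), "NP/TE": (0.416, 0.007), "TBD/TA": (0.000, 0.000),
    "EBIT/INT": (0.020, 0.033), "OCFR": (0.000, 0.018), "GRTA": (0.000, 0.004),
    "ITR": (0.849, 0.665), "FAT": (0.140, 0.176), "MP/EPS": (0.287, 0.547),
    "MP/BV": (0.000, 0.000), "D/E": (0.600, 0.434), "Log(TA/GNP)": (0.000, 0.000),
    "SG": (0.316, 0.000), "SG/GNP": (0.202, 0.111),
}
CRITICAL_P = 0.05


def significant(period_index):
    """Variables whose p-value is below the critical level for one sample period."""
    return [name for name, sig in SIG.items() if sig[period_index] < CRITICAL_P]


large_period = significant(0)   # eleven variables
small_period = significant(1)   # thirteen variables
only_small = sorted(set(small_period) - set(large_period))  # NP/TE and SG
```

The two variables that become significant only in the small sample period, NP/TE and SG, are also the ones that appear only in the small sample period Z equation.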
Log Determinants

                       Large Sample Period        Small Sample Period
Group                  Rank   Log Determinant     Rank   Log Determinant
Non-Defaulting         20     72.185              20     62.983
Defaulting             20     38.114              20     30.539
Pooled within-groups   20     74.170              20     66.097
From the above Table No. 5.1.1.3.3: Log Determinants, it is clear that for both the sample periods the log determinants of the pooled within-groups matrix and of the non-defaulting group are comparable, which suggests that the pooled covariance matrix largely reflects the non-defaulting group. For the defaulting group, however, the log determinants for both the sample periods are almost half of those for the non-defaulting group. This indicates that the group covariance matrices differ for both the samples. The log determinants for both the sample periods are large, which is taken to indicate that the developed discriminant functions are robust enough. As confirmed by the classification results, the discriminant function seems to predict the non-defaulting firms better than the defaulting firms.
Box’s M Test
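Null hypothesis: The covariance matrices do not differ between groups formed by the dependent variables.
Alternative hypothesis: The covariance matrices differ between groups formed by the dependent variables.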
Table No. 5.1.1.3.4: Test Results
                    Large Sample Period   Small Sample Period
Box's M             7069.190              5460.168
F      Approx.      31.887                24.227
       df1          210                   210
       df2          171431.575            113569.107
       Sig.         0.000                 0.000
From the above Table No. 5.1.1.3.4: Test Results, it is clear that the p-values of the test for both the samples are lower than the critical p-value of 0.05. That means the null hypothesis, that the covariance matrices do not differ between groups formed by the dependent variable, is rejected, so the covariance matrices differ between the groups. This is an undesirable situation in the analysis and is against the equal covariance assumption of multiple discriminant analysis.
Eigenvalues
Function Eigenvalue % of Variance Cumulative % Canonical Correlation
From the above Table No. 5.1.1.3.5: Eigenvalues, it is clear that the eigenvalues of the discriminant functions for both the sample periods are not very high. Read in conjunction with the canonical correlations, however, the developed discriminant functions can be termed robust, as they explain a major part of the variance between the groups.
Wilks' Lambda
Table No. 5.1.1.3.6: Wilks' Lambda
Test of Function(s) Wilks' Lambda Chi-square df Sig.

From the above Table No. 5.1.1.3.6: Wilks' Lambda, it is clear that the p-values of the Chi-square test for both the sample periods are less than the critical p-value of 0.05. That means the null hypothesis is rejected and the alternative hypothesis, that the independent variables of the developed models have significant discriminating power, is accepted. So it can be concluded that the two groups differ from each other and the function can effectively discriminate between them. The Wilks' Lambdas for both the sample periods are not very high when interpreted in conjunction with the Chi-square test. This makes both the discriminant functions robust, though not highly so.
Structure Matrix

RE/TA 0.767 0.672
EBIT/TA 0.594 0.597
NI/TA 0.578 0.593
TBD/TA -0.530 -0.535
Sales/TA 0.430 0.417
Log(TA/GNP) 0.300 0.332
MP/BV 0.286 0.263
WC/TA 0.236 0.173
OCFR 0.192 0.128
GRTA 0.173 0.157
EBIT/INT 0.115 0.115
FAT 0.073 0.073
SG/GNPG -0.063 -0.086
MP/EPS 0.053 0.032
SG 0.050 0.211
MVE/TBD 0.045 0.057
NP/TE 0.040 0.147
D/E -0.026 -0.042
ITR 0.009 0.023
CA/CL 0.001 -0.002
The structure matrix shows the correlations between the predictor variables and the standardized canonical discriminant function. From the above table it can be said that, apart from the top six variables, which are the same for both the samples, most of the independent variables have low correlations and contribute negligibly to the discriminant function.

Large sample period Small sample period

WC/TA 0.320 0.129
RE/TA 0.434 0.506
EBIT/TA 0.288 -0.046
MVE/TBD -0.054 -0.138
Sales/TA 0.359 0.346
CA/CL 0.047 0.094
NI/TA -0.134 0.066
NP/TE 0.049 0.425
TBD/TA -0.224 -0.386
EBIT/INT -0.087 -0.094
OCFR 0.112 0.000
GRTA 0.217 -0.114
ITR -0.013 -0.054
FAT -0.121 -0.106
MP/EPS -0.025 -0.110
MP/BV 0.246 0.225
D/E 0.012 0.338
Log(TA/GNP) 0.358 0.420
SG 0.030 0.347
SG/GNPG -0.119 -0.048
From the above Table No. 5.1.1.3.8: Standardized Canonical Discriminant Function Coefficients, it is clear that RE/TA has the highest coefficient for both the sample periods, at 0.434 for the large sample period and 0.506 for the small sample period. That means RE/TA has the highest discriminatory power among all the independent variables in both the samples. In absolute terms, D/E at 0.012 has the lowest coefficient for the large sample period and OCFR at 0.000 has the lowest coefficient for the small sample period. The table also shows that around five independent variables for the large sample period and three independent variables for the small sample period have coefficients between -0.05 and 0.05.
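The reading of the coefficient table can be reproduced in a few lines. The following is a minimal Python sketch, assuming the small sample period column of the table above; the helper names `rank_by_magnitude` and `negligible` are illustrative only and are not part of the study, which was carried out in SPSS.

```python
# Standardized coefficients for the small sample period, copied from the table above.
small_sample_coefficients = {
    "WC/TA": 0.129, "RE/TA": 0.506, "EBIT/TA": -0.046, "MVE/TBD": -0.138,
    "Sales/TA": 0.346, "CA/CL": 0.094, "NI/TA": 0.066, "NP/TE": 0.425,
    "TBD/TA": -0.386, "EBIT/INT": -0.094, "OCFR": 0.000, "GRTA": -0.114,
    "ITR": -0.054, "FAT": -0.106, "MP/EPS": -0.110, "MP/BV": 0.225,
    "D/E": 0.338, "Log(TA/GNP)": 0.420, "SG": 0.347, "SG/GNPG": -0.048,
}


def rank_by_magnitude(coefficients):
    """Return (variable, coefficient) pairs sorted by absolute value, largest first."""
    return sorted(coefficients.items(), key=lambda item: abs(item[1]), reverse=True)


def negligible(coefficients, band=0.05):
    """Return the variables whose coefficient lies within [-band, +band]."""
    return [name for name, value in coefficients.items() if abs(value) <= band]


ranked = rank_by_magnitude(small_sample_coefficients)
strongest, weakest = ranked[0], ranked[-1]
near_zero = negligible(small_sample_coefficients)

print("Highest discriminatory power:", strongest)
print("Lowest discriminatory power:", weakest)
print("Coefficients within +/-0.05:", near_zero)
```

Ranking by absolute value matters here because the sign of a standardized coefficient only gives the direction of the effect, not its strength.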

An interesting finding from the above table is that, for the large sample period, all the ratios calculated using debt or external liabilities have negligible discriminatory power except TBD/TA, while for the small sample period these ratios have higher coefficients than many other variables. So, on the basis of the Standardized Canonical Discriminant Function Coefficients, it can be concluded that for the large sample period the amount of debt relative to the market value of equity, debt relative to the book value of equity, and current liabilities relative to current assets have no material impact on default probability; only the ratio of total book value of debt to total assets contributes to the discriminant function. For the small sample period, by contrast, all the variables with a debt or external liabilities component are significant.


WC/TA 1.130 0.410
RE/TA 0.825 0.941
EBIT/TA 1.499 -0.256
MVE/TBD 0.000 0.000
Sales/TA 0.360 0.377
CA/CL 0.012 0.021
NI/TA -0.860 0.453
NP/TE 0.009 0.264
TBD/TA -0.280 -1.156
EBIT/INT 0.000 0.000
OCFR 0.068 0.000
GRTA 0.389 -0.210
ITR 0.000 0.000
FAT -0.003 -0.002
MP/EPS 0.000 -0.001
MP/BV 0.030 0.025
D/E 0.001 0.026
Log(TA/GNP) 0.530 0.642
SG 0.008 0.553
SG/GNPG 0.000 0.000
(Constant) -3.153 -2.862

The above Table No. 5.1.1.3.9: Canonical Discriminant Function Coefficients presents the unstandardized coefficients of the discriminant functions for the two sample periods.
From the following table of Prior Probabilities for Groups, it is clear that for the large sample period, out of 1263 firm years, 135 firm years belong to the defaulting group, and for the small sample period, out of 651 firm years, 107 firm years belong to the defaulting group.

                   Large Sample Period       Small Sample Period
Group              Firm Years   Prior        Firm Years   Prior
Non-Defaulting     1128         0.893        544          0.836
Defaulting         135          0.107        107          0.164
Total              1263         1.000        651          1.000


For large sample period
Dividing point = (0.197 X 1128 – 1.648 X 135)/1263 = -0.00021
For small sample period
Dividing point = (0.323 X 544 – 1.641 X 107)/651 = 0.0002
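The dividing point above is the size-weighted mean of the two group scores. A minimal Python sketch of this arithmetic follows. It assumes that 0.197 and -1.648 (and 0.323 and -1.641) are the discriminant function values at the group centroids of the non-defaulting and defaulting groups, which the surviving text does not state explicitly; the function name `dividing_point` is hypothetical.

```python
def dividing_point(centroids, group_sizes):
    """Size-weighted mean of the group centroids of the discriminant function."""
    total = sum(group_sizes)
    return sum(c * n for c, n in zip(centroids, group_sizes)) / total


# (non-defaulting, defaulting) centroids and group sizes in firm years,
# taken from the figures used in the formulas above.
large_period = dividing_point(centroids=(0.197, -1.648), group_sizes=(1128, 135))
small_period = dividing_point(centroids=(0.323, -1.641), group_sizes=(544, 107))

print(f"Large sample period dividing point: {large_period:.5f}")
print(f"Small sample period dividing point: {small_period:.5f}")
```

Weighting by group size is equivalent to weighting each centroid by its prior probability, so the same result follows from 0.893 and 0.107 (or 0.836 and 0.164) up to rounding. Both values fall very close to zero.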

                 Large Sample Period            Small Sample Period
                 Model    Altman Original       Model    Altman Original
Accuracy         92.1     40.2                  89.8     27.6
Type I Error     0.4      66.9                  2.2      68.0
Type II Error    79.8     0.0                   51.4     95.3

The above table presents the in sample classification accuracy of the developed models as well as of the Altman Original Model for the two samples. For the large sample period, the accuracy of the developed model is 92.1% against 40.2% for the Altman Original Model. Type I error is 0.4% for the developed model and 66.9% for the Altman Original Model, while Type II error is 79.8% for the developed model and 0.0% for the Altman Original Model. Misclassification of defaulting firms is therefore very high for the developed model, which is undesirable. For the small sample period, the accuracy of the developed model is 89.8% against 27.6% for the Altman Original Model. Type I error is 2.2% for the developed model and 68.0% for the Altman Original Model, while Type II error is 51.4% for the developed model and 95.3% for the Altman Original Model. Misclassification of defaulting firms is again very high for the developed model, which is undesirable.
From the In Sample Classification results, it can be concluded that the accuracy of the developed models for both the samples is far higher than that of the Altman Original Model. The problem with the developed models is that the Type II errors for both the samples are very high. The in sample classification accuracy of the model developed for the large sample period is higher than that of the model developed for the small sample period.
Model Validation
The below Table 5.1.1.3.13: Forward Testing (In Sample Firms) presents the results of forward testing of the in sample firms beyond the sample period. For the whole period, the accuracy of the model developed for the large sample period is 83.2%, the accuracy of the model for the small sample period is 85.5% and the accuracy of the Altman Original Model is 59.9%. So the accuracy of the developed model for the small sample period is the highest among the three models.

                               Large Sample Period   Small Sample Period   Altman Original
One Year Forward    Accuracy   82.4                  85.5                  61.1
Two Years Forward   Accuracy   84.0                  85.5                  58.8
Whole Period        Accuracy   83.2                  85.5                  59.9
If Type I error is considered, it is the lowest for the small sample period. Type II error is the lowest for the large sample period, but for the small sample period too it is on the lower side at 7.6%. Over the whole period, the classification accuracy of forward testing is the highest for the small sample period model at 85.5%. For the one year and two years forward periods, the highest accuracy is also 85.5%, for the small sample period model. So, on the basis of the results of forward testing, it can be concluded that the model developed for the small sample period has the highest classification accuracy among the three models and the Altman Original Model has the lowest.
                                   Large Sample Period   Small Sample Period   Altman Original
In Sample Period   Accuracy        67.8                  77.2                  36.7
                   Type I Error    33.0                  21.1                  67.9
                   Type II Error   23.8                  42.1                  18.2
Forward Period     Accuracy        63.2                  86.5                  39.5
                   Type I Error    36.1                  8.6                   61.1
                   Type II Error   50.0                  100.0                 50.0

The above Table 5.1.1.3.14: Out of Sample (Validation) presents the results of out of sample validation. For the In Sample Period, the accuracy of the model developed for the large sample period is 67.8%, that of the model for the small sample period is 77.2% and that of the Altman Original Model is 36.7%. So the developed model for the small sample period has the highest accuracy among the three models. Type I error is the lowest for the small sample period, while Type II error is on the undesirable higher side. For the Forward Period, the accuracy of the model developed for the large sample period is 63.2%, that of the model for the small sample period is 86.5% and that of the Altman Original Model is 39.5%. So the developed model for the small sample period again has the highest accuracy among the three models. Type I error is the lowest for the small sample period, while Type II error is again on the undesirable higher side. So, on the basis of the results of out of sample validation, it can be concluded that the model developed for the small sample period has the highest classification accuracy among the three models and the Altman Original Model has the lowest.
Discussion
On the basis of the different tests for the developed models, the model developed for the matched sample of the small sample period has higher statistical robustness. On the basis of classification accuracy, the model developed for the unmatched sample of the large sample period performs better, with an in sample accuracy of 92.1%, which is higher than that of the Altman Original Model. The findings from the above information can be summarized as follows:
• For both the sample periods, the significant variables belong to all three categories of variables. Apart from the ratio of total book value of debt to total assets, no other variable with a debt component is significant.
• The large values of the log determinants for both the sample periods make the discriminant functions robust. The log determinants of the pooled within-groups matrix and of the non-defaulting group are roughly twice those of the defaulting group for both the samples. This may be the cause of the Type II error, as the discriminant function may respond better to the non-defaulting firms than to the defaulting firms.
• As per the eigenvalues, canonical correlations and Wilks' Lambdas, the function for the small sample period is more robust. However, the Chi-square test results clearly signify that both the functions are significant.
• From the in sample classification results, the classification accuracy of the model for the large sample period is higher than that of the model for the small sample period, but Type II errors are high.
• From the forward testing results and out of sample validation, the model for the small sample period has the highest classification accuracy.
• The developed models have higher accuracy than the Altman Original Model.
5.1.1.4. Small and Medium Enterprises
This sample initially has 569 firm years. After refinement and checking for the
for the sample period of ten years from 1st April 2004 to 31st March 2014. On this
sample, multiple discriminant analysis is applied using SPSS software. Two hold out samples, namely the Forward Testing Sample and the Out of Sample, are meant for validation of the classification accuracy of the developed model. The first sample, ‘Forward Testing Sample’, consists of the financial, market and economic information, beyond the sample years, of the same firms that have been used to develop the model. This sample contains 63 firm years. The second sample, ‘Out of Sample’, consists of the financial, market and economic information of firms which have not been part of the study. This sample contains 59 firm years.

For the large sample period of Small and Medium Enterprises, a 4 factor model is found.
Z = - 1.853 - 0.398RE/TA + 4.099EBIT/TA + 0.367Sales/TA + 1.484NI/TA
For the small sample period, a 7 factor model is found.
Z = - 1.619 - 0.210RE/TA + 14.112EBIT/TA + 0.882Sales/TA - 11.595NI/TA + 0.003EBIT/INT - 0.025FAT + 0.002MP/EPS
Empirical Results
From the following Table No. 5.1.1.4.1: Case Processing Summary, it is clear that the large sample period has 345 valid firm years and the small sample period has 182 valid firm years. These data have been used to develop the Z score model using multiple discriminant analysis, and all the valid firm years have been included in the study.

           Large Sample Period   Small Sample Period
Valid      345                   182
Excluded   32                    17
Total      377                   199
The test of equality of group means examines whether the group means of each independent variable differ significantly between the groups. The null hypothesis of the test is that there is no significant difference between the group means of an independent variable. The alternative hypothesis is that there is a significant difference between the group means.

                Large Sample Period              Small Sample Period
Variable        Wilks' Lambda   F      Sig.      Wilks' Lambda   F      Sig.
WC/TA 0.999 0.486 0.486 0.995 0.865 0.354
RE/TA 0.980 7.124 0.008 0.966 6.316 0.013
EBIT/TA 0.871 51.001 0.000 0.838 34.760 0.000
MVE/TBD 0.998 0.540 0.463 0.989 1.977 0.161
Sales/TA 0.970 10.613 0.001 0.950 9.374 0.003
CA/CL 0.995 1.655 0.199 0.990 1.787 0.183
NI/TA 0.886 44.060 0.000 0.867 27.601 0.000
NP/TE 0.992 2.701 0.101 0.995 0.873 0.351
TBD/TA 1.000 0.169 0.681 1.000 0.015 0.904
EBIT/INT 0.999 0.229 0.632 0.927 14.194 0.000
OCFR 0.999 0.187 0.666 0.998 0.375 0.541
GRTA 0.999 0.245 0.621 0.998 0.440 0.508
ITR 0.998 0.594 0.441 0.998 0.430 0.513
FAT 0.990 3.620 0.058 0.923 15.075 0.000
MP/EPS 0.999 0.336 0.562 0.978 3.972 0.048
MP/BV 0.999 0.236 0.627 1.000 0.002 0.960
D/E 0.990 3.507 0.062 0.991 1.624 0.204
Log(TA/GNP) 0.998 0.586 0.445 0.998 0.360 0.550
SG 0.991 3.139 0.077 0.990 1.869 0.173
SG/GNP 0.984 5.419 0.021 0.973 5.062 0.026
From the above Table No. 5.1.1.4.2: Tests of Equality of Group Means, it is clear that five variables for the large sample period and eight variables for the small sample period have p-values below the critical p-value of 0.05 under the F test. So the null hypothesis that there is no significant difference between the group means is rejected for these variables, and the alternative hypothesis that the group means differ significantly is accepted. This indicates that these independent variables are significant for the study and are part of the models. However, the Wilks' Lambda values of these significant variables are on the higher side, which is undesirable, as smaller values of Wilks' Lambda are considered better for the analysis. As far as the nature of the variables is concerned, the model for the large sample period contains only accounting variables; no market or economic variable enters the model. For the small sample period, no economic variable enters the model.
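For a single variable, Wilks' Lambda and the F statistic carry the same information: F = ((1 - Lambda)/Lambda) x (N - g)/(g - 1) for N cases and g groups. The Python sketch below recomputes F from the printed Lambdas, assuming N = 345 valid firm years for the large sample period and g = 2 groups. Because the Lambdas are rounded to three decimals, the implied values only approximate the reported ones. The function name `f_from_wilks_lambda` is illustrative.

```python
def f_from_wilks_lambda(wilks_lambda, n_cases, n_groups=2):
    """Univariate F statistic implied by Wilks' Lambda for one variable."""
    df_between = n_groups - 1
    df_within = n_cases - n_groups
    return ((1.0 - wilks_lambda) / wilks_lambda) * (df_within / df_between)


# (variable, Wilks' Lambda, reported F) for the large sample period,
# copied from the Tests of Equality of Group Means table above.
rows = [
    ("EBIT/TA", 0.871, 51.001),
    ("NI/TA", 0.886, 44.060),
    ("Sales/TA", 0.970, 10.613),
]

implied = {name: f_from_wilks_lambda(lam, n_cases=345) for name, lam, _ in rows}
for name, lam, reported_f in rows:
    print(f"{name}: Lambda = {lam}, implied F = {implied[name]:.2f}, reported F = {reported_f}")
```

The formula makes the point in the text concrete: a Lambda close to 1 leaves a numerator close to zero, so even a statistically significant variable may separate the groups only weakly.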
Log Determinants
From the below Table No. 5.1.1.4.3: Log Determinants, it is clear that for both the sample periods the log determinants of the pooled within-groups matrix and of the non-defaulting group are of a similar order, which suggests that the pooled covariance matrix largely reflects the non-defaulting group. For the defaulting group, however, the log determinants are only about half to three-fifths of those for the non-defaulting group. This indicates that the group covariance matrices differ for both the samples. The log determinants for both the sample periods are also large, which is taken to indicate that the developed discriminant functions are robust enough. As confirmed by the classification results, the discriminant function seems to predict the non-defaulting firms better than the defaulting firms.
Rank Large Sample Period Small Sample Period

Non-Defaulting 20 80.336 59.560
Defaulting 20 37.937 36.434
Pooled within-groups 20 89.565 72.377

Box’s M Test
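Null hypothesis: The covariance matrices do not differ between groups formed by the dependent variables.
Alternative hypothesis: The covariance matrices differ between groups formed by the dependent variables.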
Table No. 5.1.1.4.4: Box’s M Test Results
                    Large Sample Period   Small Sample Period
Box's M             5879.022              3555.868
F      Approx.      24.748                14.422
       df1          210                   210
       df2          42638.414             36374.744
       Sig.         0.000                 0.000
From the Table No. 5.1.1.4.4: Box’s M Test Results, it is clear that the p-values of the test for both the samples are lower than the critical value of 0.05. That means the null hypothesis, that the covariance matrices do not differ between groups formed by the dependent variable, is rejected, so the covariance matrices differ between the groups. This is undesirable.
Eigenvalues
Function Eigenvalue % of Variance Cumulative % Canonical Correlation

The above Table No. 5.1.1.4.5: Eigenvalues provides information about the eigenvalues and canonical correlations. The eigenvalue and canonical correlation of the discriminant function for the large sample period are on the lower side, making it less robust. For the small sample period, the eigenvalue and canonical correlation are high, which makes the discriminant function developed for the small sample period highly robust.
Wilks' Lambda
Table No. 5.1.1.4.6: Wilks' Lambda
Wilks' Lambda Chi-square df Sig.

From the above Table No. 5.1.1.4.6: Wilks' Lambda, it is clear that the Wilks' Lambda is 0.729 for the large sample period, making the function less robust, while it is 0.581 for the small sample period, making the discriminant function highly robust. The Chi-square tests for both the sample periods have p-values less than the critical p-value of 0.05. That means the null hypothesis is rejected and the alternative hypothesis, that the independent variables have significant discriminating power, is accepted for both the samples. So it can be concluded that the groups differ from each other, that the function can effectively discriminate between the two groups, and that the discriminant function developed for the small sample period is highly robust.
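With two groups there is a single discriminant function, and in that case Wilks' Lambda, the eigenvalue and the canonical correlation are tied together: Lambda = 1/(1 + eigenvalue) and canonical correlation squared = 1 - Lambda. The Python sketch below derives the implied eigenvalues and canonical correlations from the two Lambdas quoted above. These are derived figures under the single-function assumption, not values read from the eigenvalue table; the function names are illustrative.

```python
import math


def eigenvalue_from_lambda(wilks_lambda):
    """Eigenvalue implied by Wilks' Lambda when there is one discriminant function."""
    return (1.0 - wilks_lambda) / wilks_lambda


def canonical_correlation_from_lambda(wilks_lambda):
    """Canonical correlation implied by Wilks' Lambda for one discriminant function."""
    return math.sqrt(1.0 - wilks_lambda)


implied = {}
for label, lam in [("large sample period", 0.729), ("small sample period", 0.581)]:
    implied[label] = (eigenvalue_from_lambda(lam), canonical_correlation_from_lambda(lam))
    eig, rc = implied[label]
    print(f"{label}: eigenvalue = {eig:.3f}, canonical correlation = {rc:.3f}")
```

Since 1 - Lambda is the share of variance in the discriminant scores explained by group membership, a Lambda of 0.581 corresponds to about 42% explained variance against about 27% for a Lambda of 0.729, which is the sense in which the small sample period function is the stronger of the two.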

Structure Matrix

EBIT/TA 0.632 0.518
NI/TA 0.587 0.462
Sales/TA 0.288 0.269
RE/TA 0.236 0.221
SG/GNPG -0.206 -0.198
FAT 0.168 0.341
D/E -0.166 -0.112
SG -0.157 -0.120
NP/TE 0.145 0.082
CA/CL 0.114 0.117
ITR 0.068 0.058
Log(TA/GNP) 0.068 -0.053
MVE/TBD 0.065 0.124
WC/TA -0.062 -0.082
MP/EPS 0.051 0.175
GRTA 0.044 0.058
MP/BV 0.043 -0.004
EBIT/INT 0.042 0.331
OCFR -0.038 0.054
TBD/TA -0.036 0.011
The structure matrix shows the correlations between the predictor variables and the standardized canonical discriminant function. From the above table it can be said that most of the independent variables have very low correlations and contribute negligibly to the discriminant function. Only five variables contribute to the discriminant function for the large sample period, and only seven variables contribute for the small sample period.


WC/TA 0.857 1.762
RE/TA -0.426 -0.260
EBIT/TA 0.817 3.096
MVE/TBD 0.295 -0.106
Sales/TA 1.016 2.582
CA/CL 0.132 0.152
NI/TA 0.263 -2.445
NP/TE 0.035 -0.005
TBD/TA -0.340 -0.890
EBIT/INT -0.200 0.128
OCFR -0.078 0.008
GRTA 0.078 0.133
ITR -0.038 -0.156
FAT 0.048 -0.190
MP/EPS 6.637 0.219
MP/BV -6.871 -0.060
D/E -0.151 -0.126
Log(TA/GNP) 0.347 0.232
SG -0.103 -0.042
SG/GNPG -0.121 -0.074
From the above Table No. 5.1.1.4.8: Standardized Canonical Discriminant Function Coefficients, it is clear that for the large sample period MP/BV has the largest coefficient in absolute terms at -6.871. That means MP/BV has the highest discriminatory power among all the independent variables for this sample, while NP/TE has the lowest coefficient at 0.035. For the small sample period, the table shows that EBIT/TA has the largest coefficient in absolute terms at 3.096 and therefore the highest discriminatory power, while NP/TE has the lowest discriminatory power with a coefficient of -0.005.

WC/TA 0.508 0.858
RE/TA -0.398 -0.210
EBIT/TA 4.099 14.112
MVE/TBD 0.001 -0.002
Sales/TA 0.367 0.882
CA/CL 0.016 0.039
NI/TA 1.484 -11.595
NP/TE 0.005 0.000
TBD/TA -0.491 -1.033
EBIT/INT -0.001 0.003
OCFR -0.027 0.006
GRTA 0.001 0.001
ITR 0.000 -0.001
FAT 0.003 -0.025
MP/EPS 0.003 0.002
MP/BV -0.010 -0.007
D/E -0.007 -0.004
Log(TA/GNP) 0.436 0.283
SG -0.013 -0.004
SG/GNPG 0.000 0.000
(Constant) -1.853 -1.619
The above Table No. 5.1.1.4.9: Canonical Discriminant Function Coefficients presents the unstandardized coefficients of the discriminant functions for the two sample periods.
From the following table of Prior Probabilities for Groups, it is clear that for the large sample period, out of 345 firm years, 280 firm years belong to the non-defaulting group and 65 firm years belong to the defaulting group. For the small sample period, out of 182 firm years, 127 firm years belong to the non-defaulting group and 55 belong to the defaulting group.

                   Large Sample Period       Small Sample Period
Group              Firm Years   Prior        Firm Years   Prior
Non-Defaulting     280          0.812        127          0.698
Defaulting         65           0.188        55           0.302
Total              345          1.000        182          1.000

formulate decision rule in deciding group membership of individual case From the

For large sample period
Dividing point = (0.293 X 280 – 1.263 X 65)/345 = -0.00016
For small sample period
Dividing point = (0.555 X 127 – 1.282 X 55)/182 = -0.000137
For defaulting firms: -1.282 < Z < -0.000137
For non-defaulting firms: -0.000137 < Z < 0.555
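The decision rule can be illustrated by scoring a firm with the 4 factor model for the large sample period and comparing the score with the corresponding dividing point of -0.00016. The following Python sketch uses the coefficients given in the text; the two firms and their ratios are hypothetical and serve only to show the mechanics, and the function names are illustrative.

```python
# Coefficients of the 4 factor model for the large sample period, as given in the text.
CONSTANT = -1.853
COEFFICIENTS = {"RE/TA": -0.398, "EBIT/TA": 4.099, "Sales/TA": 0.367, "NI/TA": 1.484}
DIVIDING_POINT = -0.00016  # large sample period


def z_score(ratios):
    """Discriminant score Z for one firm year."""
    return CONSTANT + sum(COEFFICIENTS[name] * ratios[name] for name in COEFFICIENTS)


def classify(ratios):
    """Assign the firm year to a group by comparing Z with the dividing point."""
    return "defaulting" if z_score(ratios) < DIVIDING_POINT else "non-defaulting"


# Hypothetical firm years (illustrative ratios, not taken from the sample).
firm_a = {"RE/TA": 0.20, "EBIT/TA": 0.25, "Sales/TA": 2.50, "NI/TA": 0.15}
firm_b = {"RE/TA": -0.30, "EBIT/TA": -0.05, "Sales/TA": 0.60, "NI/TA": -0.10}

for label, firm in [("Firm A", firm_a), ("Firm B", firm_b)]:
    print(f"{label}: Z = {z_score(firm):.3f} -> {classify(firm)}")
```

Scores below the dividing point fall on the side of the defaulting group, which matches the ranges stated above for the small sample period.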
Large Sample Period Small Sample Period Altman Original

Accuracy 88.7 84.6 20.4
Type I Error 2.6 6.3 96.2
The above Table No. 5.1.1.4.12: In Sample Classification Results presents the in sample classification accuracy of the developed models. The classification accuracy of the developed model for the large sample period is 88.7% against 84.6% for the small sample period. For the Altman Original Model it is 20.4%. This clearly indicates that the model developed for the large sample period has outperformed both the model developed for the small sample period and the Altman Original Model. Type I errors of the two developed models are on the lower side at 2.6% and 6.3% respectively, while Type II errors are on the higher side at 64.6% and 36.4%. The results also show that the developed discriminant functions classify non-defaulting cases more accurately than defaulting cases, as suggested by the log determinants. On comparison of the results, it is clear that the model developed for the large sample period has higher accuracy than the other models.
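The three measures reported above follow directly from the classification counts. The Python sketch below follows the usage in this chapter, where Type I error is the share of non-defaulting firm years classified as defaulting and Type II error is the share of defaulting firm years classified as non-defaulting. The misclassification counts of 8 and 20 are an assumption, back-calculated so that they agree with the small sample period percentages for the 127 non-defaulting and 55 defaulting firm years; they are not taken from the classification table.

```python
def classification_metrics(n_non_defaulting, n_defaulting,
                           misclassified_non_defaulting, misclassified_defaulting):
    """Return accuracy, Type I error and Type II error in percent."""
    total = n_non_defaulting + n_defaulting
    correct = total - misclassified_non_defaulting - misclassified_defaulting
    return {
        "accuracy": 100.0 * correct / total,
        "type_i_error": 100.0 * misclassified_non_defaulting / n_non_defaulting,
        "type_ii_error": 100.0 * misclassified_defaulting / n_defaulting,
    }


# Small sample period: group sizes from the prior probabilities table;
# misclassification counts are assumed (back-calculated) values.
metrics = classification_metrics(
    n_non_defaulting=127,
    n_defaulting=55,
    misclassified_non_defaulting=8,
    misclassified_defaulting=20,
)
rounded = {name: round(value, 1) for name, value in metrics.items()}
print(rounded)
```

Because non-defaulting firm years dominate the sample, overall accuracy can stay high even when more than a third of the defaulting firm years are missed, which is why the Type II error is reported alongside accuracy.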
Validation of Model

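                               Large Sample Period   Small Sample Period   Altman Original
One Year Forward    Accuracy   37.5                  59.4                  53.1
Two Years Forward   Accuracy   38.7                  58.1                  48.4
Whole Period        Accuracy   38.1                  58.7                  50.8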
The above Table No. 5.1.1.4.13: Forward Testing (In Sample Firms) presents the results of forward testing of the in sample firms beyond the sample period. For the whole period, the accuracies of the developed models are 38.1% and 58.7% respectively, against 50.8% for the Altman Original Model. Type I errors are very high for all the models, while Type II errors are on the lower side for all the models. For the one year and two years forward periods, the classification accuracies of the model developed for the large sample period are 37.5% and 38.7% respectively, and for the small sample period the accuracies are 59.4% and 58.1%. So, on the basis of the results of forward testing, it can be concluded that the model developed for the small sample period has higher classification accuracy than the other models. Over time the accuracy falls, but the fall is minor.

Accuracy 35.6 44.1 45.8
Type I Error 84.6 71.8 82.1
The above Table 5.1.1.4.14: Out of Sample (Validation) presents the results of out of sample validation. The accuracy of the developed model for the large sample period is 35.6% against 44.1% for the small sample period, and the accuracy of the Altman Original Model is 45.8%. Type I and Type II errors are very high for all the models. On comparison, the Altman Original Model has the highest accuracy among all the models.
Discussion
On the basis of the different tests for the developed models, the model developed for the small sample period has higher statistical robustness. On the basis of classification accuracy, the model developed for the large sample period performs better, with an in sample accuracy of 88.7%, which is higher than that of the Altman Original Model. The findings from the above information can be summarized as follows:
• For the large sample period, all the significant variables are accounting variables. For the small sample period, no significant variable is from the economic variables.
• Only five variables contribute to the discriminant function for the large sample period and seven variables for the small sample period.
• The large values of the log determinants for both the models make the discriminant functions robust.
• From the eigenvalues for the two models, it is clear that the discriminant function developed for the small sample period is highly robust if interpreted in conjunction with the Wilks' Lambdas and the Chi-square test.
• The in sample classification results clearly indicate that the model for the large sample period has the highest accuracy among all the models.
• From the forward testing and out of sample results, the model for the small sample period has higher classification accuracy than the model for the large sample period.
• All the developed models have higher in sample classification accuracy than the Altman Original Model.
5.1.1.5. Findings from the MDA Study
From the above discussion of the four broad samples and ten models developed for two sample periods using discriminant analysis, it is clear that the discriminant functions developed for the small sample period are more robust, but the in sample classification accuracy is the highest for the models developed for the large sample period, at 92.1% for Large Firms with PSU.
For different samples, different categories of variables are found to be significant in default prediction. Four variables, namely RE/TA, EBIT/TA, Sales/TA and NI/TA, are significant for all the models. So the accounting variables are the most important variables in default prediction. The findings of the study are summarized as follows:

All Firms
• All three categories of variables are significant in the model for the small sample period, while for the large sample period two categories of variables, namely accounting and economic, are significant.
• Though the coefficients differ, all the variables significant for the large sample period are also important for the small sample period. This indicates that in the long run firms tend to follow a similar pattern, while in the short run they follow different patterns when it comes to default prediction.
• It can be said that in the long run market information does not play any role in default prediction, so the performance of the firm and of the economy is of more importance.
• In the short run (five years), all three categories of variables are significant. That means in the short run market information, along with accounting and economic information, is useful to forecast the future of the firms.
• Apart from the amount of debt with respect to total assets, no other variable with a debt component is significant.
Large Firms
• For the large firms, all three categories of variables are significant for both the sample periods.
• Though the coefficients differ, all the variables significant for the large sample period are also important for the small sample period. This indicates that in the long run firms tend to follow a similar pattern, while in the short run they follow different patterns when it comes to default prediction.
• For the large sample period, only total book value of debt with respect to total assets is significant among all the variables that have a debt component. For the small sample period, however, all the variables with a debt component are found to be significant. This indicates that the nature of the firm changes over time in the context of default prediction. To accommodate this, there is a need to include variables which can provide input on the short term risks associated with the firm.

Large Firms with PSU
• For both the sample periods, all three types of variables are important for the prediction of default.
• All the variables that are significant for the large sample period are also important for the small sample period. The new variables for the small sample period seem to be those which can provide input on short term risks.
• It seems that in the long term all the firms follow a certain pattern when it comes to default prediction, and a certain set of variables can predict default.
• No variable with a debt component other than total book value of debt with respect to total assets is significant in the analysis.
Small and Medium Enterprises
• For the large sample period, only accounting variables are significant for default prediction.
• For the small sample period, only accounting and market variables are significant.
• All the significant variables for the large sample period are significant for the small sample period too. This indicates that in the long run firms tend to follow a certain pattern which can be predicted with the help of a certain set of variables. The remaining variables used for the small sample period seem to provide input on the short term risks.
In the long run, firms tend to follow a certain pattern as far as default prediction is concerned, and default possibilities can be estimated with the help of a certain set of variables. For the short run, or when the asset size of the firms changes, a few additional variables are included in or excluded from the model to adjust for the additional risks.

On comparison of results of different statistical tests of all the samples over both the
sample periods individually, it is crystal clear that the models developed for the small
sample period of five year for all the samples have higher robustness. However large
values of Log determinant are found to favour large sample period.
These findings can be interpreted as follow:
 The log determinants of the models for the large sample period are higher than
that of the small sample period. It seems that log determinant is dependent size
of data set. Larger is the data set, larger is the log determinant.
 The log determinants of all the models are found to follow a certain pattern. The
log determinants of the non-defaulting group and of the pooled within-group matrix are
almost double the log determinant of the defaulting group for all the models.
This indicates that the developed models can discriminate non-defaulting firms
better than defaulting firms. This is confirmed by the low Type I errors and the
high Type II errors of all the developed models. This
finding is in line with the assumption of multiple discriminant analysis, as it requires
that there should not be a huge difference in the values of the log determinants of the
two groups. The reason for this seems to be the sampling and selection biases.
 The Wilks’ Lambda values of all the models are on the lower side for the small sample
period. From this it seems that a smaller data set leads to smaller values of Wilks’
Lambda, so the robustness of a model depends on sample size. However, this needs
to be statistically investigated.
 From all the statistical tests, it can be concluded that more robust models can be
developed for smaller sample periods than for larger sample periods,
though this needs to be further investigated.
On comparison of the classification accuracies of all the models, it is clear that the model
for the Large Firms with PSU for the large sample period has yielded the highest
classification accuracy at 92.1%. It is also clear that the models for the large
sample period yield higher classification accuracy. Further, the classification
accuracies of the logit models are higher than those of the MDA models.
The findings can be interpreted as follows:
 From the in-sample classification accuracy of the MDA models, it is concluded that
the models developed for the large sample period yield higher classification
accuracy.
 From forward testing and out of sample validation, the models for small sample
period yield higher classification accuracy.
 The classification accuracy of the developed models is found to deteriorate over
time for both the in-sample firms and the out-of-sample firms (Altman E.
I., 2005).
 For the sample of Large Firms, the models developed for the Unmatched
Sample for both the sample periods are found to have higher prediction accuracy
than those of the Matched Sample. This is against the assumption of many
studies such as (Altman E., 1968), (Beaver W., 1966) and (Gupta V., 2014).
Limitations and Further Scope of the Developed Models
The developed models, like other models, face a number of limitations, as follows.
 The developed models face the problem of high Type I and Type II errors.
 The models for the small sample period are more robust but the models for the large
sample period have higher classification accuracy. This is in itself the biggest
limitation of the study.
Conclusion
Considering the different limitations, and on the basis of the statistical robustness and the
classification accuracies of the developed models, it can be concluded that the models
for the large sample period have the highest classification accuracy but the models for
the small sample period are more robust. Over time, classification accuracy deteriorates
for the developed models. The model for the large sample period for Large Firms with
PSU has the highest classification accuracy at 92.1%.
5.1.2. Logistic Regression
Logistic Regression analysis has been used to arrive at a credit score for the firms
with the purpose of predicting their respective default probabilities. The whole
study has been divided into four parts. Part I presents the results of the analysis of All
Firms, which contains 2449 firm years for which ratios have been calculated using
financial statement, market and economic information. Part II presents the results of
the analysis of the sample of large firms, which consists of 1167 firm years. Part III
presents the results of the analysis of the sample of large firms and public sector units,
which has 1880 firm years. Part IV presents the results of the analysis of the sample
of small and medium firms. This sample has a total of 569 firm years.
5.1.2.1. All Firms
This sample has 2450 firm years. Two logistic regression models have been
developed for this sample. The first model has been developed for the whole
sample period, from 1st April 2004 to 31st March 2014, and is indicated as large
sample period. The second model has been developed for a sample period of five
years from 1st April 2009 to 31st March 2014 and is indicated as small sample period.
After checking for outliers using Mahalanobis statistics, firm years for the Large
sample period and Small sample period are brought down to 1666 firm years and 859
firm years respectively. On these two samples logistic regression analysis is carried out
using SPSS software. Two hold out samples, namely ‘Forward Testing (In Sample
Firms)’ and ‘Out of Sample’, are meant for validation of the classification accuracy of
the developed models. The first hold out sample ‘Forward Testing (In Sample Firms)’
consists of the financial, market and economic information of the same firms that
have been used to develop the models, for years beyond the sample years. This sample
contains 330 firm years. The second sample ‘Out of Sample’ consists of the financial,
market and economic information of firms which have not been part of the study. This
sample contains 309 firm years.
Large Sample Period
O = - 1.46EBIT/TA – 0.14MVE/TBD – 1.588Sales/TA + 9.1957NI/TA + 0.011FAT - 0.586 Log (TA/GNP) + 1.846Y
Small Sample Period
O = 1.675 - 0.731RE/TA – 0.304MVE/TBD – 1.765Sales/TA + 8.33NI/TA – 0.107EBIT/INT + 0.019FAT - 0.531Log (TA/GNP) + 0.785X + 1.585Y
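As an illustration of how a score O may be turned into a default probability, the following minimal sketch uses the coefficients of the second equation above (the one with the 1.675 intercept). It assumes the usual logistic link P = 1 / (1 + e^(-O)) and that a higher O corresponds to a higher probability of default; both are assumptions, as the text does not state the coding of the dependent variable. The input ratios of `example_firm` are hypothetical and are not taken from the study data.

```python
import math

# Coefficients of the second equation printed above (1.675 intercept).
SMALL_PERIOD_COEFFS = {
    "RE/TA": -0.731,
    "MVE/TBD": -0.304,
    "Sales/TA": -1.765,
    "NI/TA": 8.33,
    "EBIT/INT": -0.107,
    "FAT": 0.019,
    "Log(TA/GNP)": -0.531,
    "X": 0.785,
    "Y": 1.585,
}
SMALL_PERIOD_INTERCEPT = 1.675


def o_score(ratios, coeffs=SMALL_PERIOD_COEFFS, intercept=SMALL_PERIOD_INTERCEPT):
    """Linear score O = intercept + sum(coefficient * variable)."""
    return intercept + sum(coeffs[name] * ratios[name] for name in coeffs)


def default_probability(score):
    """Logistic transform of the score (assumed link, see lead-in)."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical firm year, for illustration only (not from the thesis sample).
example_firm = {
    "RE/TA": 0.10,
    "MVE/TBD": 1.50,
    "Sales/TA": 0.90,
    "NI/TA": 0.03,
    "EBIT/INT": 2.00,
    "FAT": 3.00,
    "Log(TA/GNP)": -4.00,  # value of the log term, already computed
    "X": 0,
    "Y": 0,
}

if __name__ == "__main__":
    score = o_score(example_firm)
    print("O =", round(score, 3), "P(default) =", round(default_probability(score), 3))
```

A firm year would then be classified as defaulting when the probability exceeds a chosen cut-off; the cut-off used in the study is not stated in this section.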
Empirical Results
From the Table No. 5.1.2.1.1: Case Processing Summary, it is clear that the two
samples have 1608 and 833 valid cases respectively. On these valid cases, two separate
models have been developed to predict the probability of default using the logistic
regression method.
Table No. 5.1.2.1.1: Case Processing Summary
                 Large Sample Period   Small Sample Period
Valid Cases      1608                  833
Excluded Cases   58                    26
Total            1666                  859
Omnibus Tests of Model Coefficients
Omnibus Tests of Model Coefficients are used to check that the new model (with
explanatory variables included) is an improvement over the baseline model. The Null
hypothesis of the test is that there is no effect of the independent variables, taken
together, on the dependent variable and the alternative hypothesis is that there is an
effect of the independent variables, taken together, on the dependent variable.
H0: There is no effect of the independent variables, taken together, on the dependent
variable.
H1: There is an effect of the independent variables, taken together, on the dependent
variable
Table No. 5.1.2.1.2: Omnibus Tests of Model Coefficients
                      Chi-square   df   Sig.
Large Sample Period   603.180      22   0.000
Small Sample Period   440.058      22   0.000
The above Table No. 5.1.2.1.2: Omnibus Tests of Model Coefficients presents
the results of the test. For the large sample period the Chi-square statistic is found
to be 603.180 with a p-value of 0.000. For the small sample period, the Chi-square
statistic is found to be 440.058 with a p-value of 0.000. The p-values of both the
models are less than the critical value. This suggests that the null hypothesis is
rejected and the alternative hypothesis is accepted for both the samples.
That means, for both the samples, there is an effect of the independent variables,
taken together, on the dependent variable. So on the basis of the above result, it can be
concluded that the inclusion of the selected independent variables improves the predictive
ability of the model. If the Chi-square values of the two models are compared, it is
evident that the Chi-square value for the small sample period is lower than that of the
large sample period. It signifies that the correlation between the expected and
observed values for the model developed for the small sample period is higher than
that of the large sample period. So for the in-sample period the model developed for the
small sample period has higher predictive ability.
Model Summary
The below Table No. 5.1.2.1.3: Model Summary presents the coefficients that indicate
the robustness of the developed models for the two samples. The -2LL values are
found to be 604.616 and 380.699 for the large sample period and the small sample period
respectively, making the developed models robust. As per the Nagelkerke R Square,
the model developed for the small sample period can explain
65.5% of the variation of the classification into the groups, while the model developed for
the large sample period can explain 59.2%. The corresponding Cox and Snell R Square
values are 0.410 and 0.313. This makes the model for the small
sample period more robust.
Table No. 5.1.2.1.3: Model Summary
Step                  -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
Large Sample Period   604.616             0.313                  0.592
Small Sample Period   380.699             0.410                  0.655
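The two pseudo R Square values in the Model Summary can be recomputed from figures reported elsewhere in this section. The sketch below assumes the standard definitions, Cox & Snell R² = 1 - exp(-χ²/n) and Nagelkerke R² = Cox & Snell R² / (1 - exp(-(-2LL0)/n)), and assumes that the omnibus Chi-square equals the drop in -2LL from the intercept-only model to the fitted model, so that -2LL0 = -2LL + χ². The inputs are the -2LL values, the omnibus Chi-square values and the valid case counts reported above; the function names are illustrative only.

```python
import math


def cox_snell_r2(model_chi_square, n_cases):
    """Cox & Snell R^2 from the model (omnibus) chi-square and sample size."""
    return 1.0 - math.exp(-model_chi_square / n_cases)


def nagelkerke_r2(neg2ll_model, model_chi_square, n_cases):
    """Nagelkerke R^2: Cox & Snell R^2 rescaled by its maximum attainable value."""
    # Assumption: -2LL of the intercept-only model = fitted -2LL + omnibus chi-square.
    neg2ll_null = neg2ll_model + model_chi_square
    max_r2 = 1.0 - math.exp(-neg2ll_null / n_cases)
    return cox_snell_r2(model_chi_square, n_cases) / max_r2


if __name__ == "__main__":
    # (label, -2LL, omnibus chi-square, valid cases) as reported in this section.
    for label, neg2ll, chi2, n in [
        ("Large Sample Period", 604.616, 603.180, 1608),
        ("Small Sample Period", 380.699, 440.058, 833),
    ]:
        print(label,
              round(cox_snell_r2(chi2, n), 3),
              round(nagelkerke_r2(neg2ll, chi2, n), 3))
```

Under these assumptions the recomputed values agree with the reported table to three decimals, which also shows that the Nagelkerke figure is always the larger of the two by construction.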
Hosmer and Lemeshow Test
Hosmer and Lemeshow test of goodness of fit divides the sample according to the
predicted probabilities and tells whether the data used for developing the model fit
the model or not, that is, whether the model is correctly specified or not. A good fit
model will have a small Hosmer-Lemeshow test statistic and a p-value that is greater
than the critical value. The null hypothesis for the test is that the model is correctly
specified. The alternative hypothesis is that the model is not correctly specified.
H0: The model is correctly specified.
H1: The model is not correctly specified.
Table No. 5.1.2.1.4: Hosmer and Lemeshow Test
Step                  Chi-square   df   Sig.
Large Sample Period   6.351        8    0.608
Small Sample Period   5.249        8    0.731
The above Table No. 5.1.2.1.4: Hosmer and Lemeshow Test presents the results
of the goodness of fit of the developed models. The degree of freedom for the test is
8. The model developed for the large sample period has a Chi-square of 6.351 with a
p-value of 0.608 and the model developed for the small sample period has a
Chi-square of 5.249 with a p-value of 0.731. The p-values of the test for both the models are
higher than the critical value. This clearly indicates that the data fit the
specified models and the null hypothesis is accepted for both the models. So it can be
concluded that the data that have been used to develop the models fit the models well
for both samples.
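The p-values quoted for the Hosmer and Lemeshow test (and for the omnibus test earlier) are upper-tail probabilities of a Chi-square distribution. As a rough check, the sketch below uses the closed form that holds when the degrees of freedom are even, P(χ² > x) = e^(-x/2) · Σ (x/2)^i / i! for i = 0 … df/2 - 1, which covers both df = 8 and df = 22. The function name is illustrative and the routine is not what SPSS itself runs.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(chi-square_df > x), valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Hosmer and Lemeshow statistics reported above (df = 8).
    print(round(chi2_sf_even_df(6.351, 8), 3))
    print(round(chi2_sf_even_df(5.249, 8), 3))
    # Omnibus statistic for the large sample period (df = 22).
    print(chi2_sf_even_df(603.180, 22))
```

With a critical value of 0.05, a p-value above it leaves the null hypothesis of correct specification standing, while the omnibus p-values fall far below it.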
The Variables in Equation table provides information as to which variables in the analysis
are significant and which should be dropped from the model. For this purpose the Wald test
is used, with the null hypothesis that the coefficient corresponding to a variable is zero
and the alternative hypothesis that the coefficient corresponding to the variable is not zero.
H0: βi = 0: The corresponding coefficient to variable is zero.
H1: βi ≠ 0: The corresponding coefficient to variable is not zero.
From the below Table No. 5.1.2.1.5: Variables in Equation, it is clear that seven
variables for the large sample period and nine variables for the small sample period have
p-values less than the critical value of 0.05. So the null hypothesis for these variables
is rejected and the alternative hypothesis is accepted. That means the coefficients
corresponding to these variables are non-zero.
Table No. 5.1.2.1.5: Variables in Equation
(first B, Wald, Sig. group: Large Sample Period; second group: Small Sample Period)
Variable B Wald Sig. B Wald Sig.
WC/TA -0.139 0.105 0.746 0.345 0.419 0.517
RE/TA -0.257 1.078 0.299 -0.731 4.308 0.038
EBIT/TA -10.079 11.609 0.001 -6.831 3.801 0.051
MVE/TBD -0.140 25.758 0.000 -0.304 15.323 0.000
Sales/TA -1.588 41.838 0.000 -1.765 29.740 0.000
CA/CL -0.014 0.253 0.615 0.015 0.130 0.718
NI/TA 9.195 9.791 0.002 8.333 6.228 0.013
NP/TE -0.004 0.050 0.822 0.002 0.007 0.933
TBD/TA 0.325 0.910 0.340 -0.318 0.429 0.513
EBIT/INT 0.001 0.943 0.331 -0.107 10.883 0.001
OCFR -0.001 0.000 0.985 0.119 0.847 0.358
GRTA -0.426 2.019 0.155 -0.910 1.813 0.178
ITR 0.000 0.011 0.917 0.000 2.267 0.132
FAT 0.011 7.908 0.005 0.019 6.472 0.011
MP/EPS 0.000 0.009 0.925 0.000 0.047 0.829
MP/BV -0.038 3.036 0.081 -0.040 2.237 0.135
D/E -0.001 0.010 0.919 0.001 0.022 0.882
Log(TA/GNP) -0.586 22.337 0.000 -0.531 9.412 0.002
SG 0.003 0.011 0.917 0.332 1.076 0.300
SG/GNPG 0.000 2.157 0.142 0.000 2.318 0.128
X 16.229 0.000 1.000 0.785 4.757 0.029
Y 1.846 42.304 0.000 1.585 21.625 0.000
Constant -15.372 0.000 1.000 1.675 5.413 0.020
This signifies that these variables are significant in model development and the
remaining variables are dropped from the model. The intercept for the large sample period is
insignificant but for the small sample period it is significant. Of the two categorical
variables, only categorical variable Y is found to be significant for the large sample
period, while for the small sample period both the categorical variables are
significant.
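The Sig. column of the Variables in Equation table can be reproduced from the Wald column. The sketch below assumes each Wald statistic is referred to a Chi-square distribution with one degree of freedom, in which case the p-value is erfc(√(W/2)); the function name is illustrative. The Wald values used are those reported for RE/TA, EBIT/TA and X in the small sample period.

```python
import math


def wald_p_value(wald_statistic):
    """P(chi-square with 1 df > W), written with the complementary error function."""
    return math.erfc(math.sqrt(wald_statistic / 2.0))


if __name__ == "__main__":
    # (variable, Wald statistic) from the small sample period columns above.
    for name, wald in [("RE/TA", 4.308), ("EBIT/TA", 3.801), ("X", 4.757)]:
        p = wald_p_value(wald)
        verdict = "significant" if p < 0.05 else "not significant"
        print(name, round(p, 3), verdict)
```

This makes the borderline case visible: EBIT/TA in the small sample period has a p-value just above 0.05, which is why it is not carried into the small sample period equation.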
Measure (%)     Large Sample Period   Small Sample Period
Accuracy        97.7                  91.1
Type I Error    2.3                   4.2
Type II Error   38                    28.4
The table above presents the in-sample
classification accuracy of the developed models for the two samples. For the large
sample period the accuracy rate of the developed model is 97.7% and for the small
sample period it is 91.1%. Type I errors are 2.3% and 4.2% for the large sample period
and the small sample period respectively. Type II errors are 38% and 28.4% respectively.
These results are found to be better than those of the models developed using multiple
discriminant analysis for All Firms. Also from the above result it is clear that the
model developed for the large sample period has higher in-sample accuracy than that of
the small sample period, although its Type II error is higher.
Like the models developed using multiple discriminant analysis, the Type II errors for
both the developed models are very high. This is an undesirable situation for the
models and puts a question mark on the robustness of the models developed.
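The three measures reported in the classification results can be computed from a two-by-two classification table. The sketch below assumes that Type I error is the share of non-defaulting firm years classified as defaulting and Type II error is the share of defaulting firm years classified as non-defaulting; this reading is inferred from the way low Type I and high Type II errors are linked to better discrimination of non-defaulting firms earlier in the chapter, and it is an assumption. The counts are hypothetical and are not taken from the study.

```python
def classification_rates(non_default_correct, non_default_as_default,
                         default_as_non_default, default_correct):
    """Return (accuracy, Type I error, Type II error) in percent."""
    non_default_total = non_default_correct + non_default_as_default
    default_total = default_as_non_default + default_correct
    total = non_default_total + default_total
    accuracy = 100.0 * (non_default_correct + default_correct) / total
    type_1 = 100.0 * non_default_as_default / non_default_total
    type_2 = 100.0 * default_as_non_default / default_total
    return accuracy, type_1, type_2


if __name__ == "__main__":
    # Hypothetical counts: 920 non-defaulting and 80 defaulting firm years.
    accuracy, type_1, type_2 = classification_rates(900, 20, 30, 50)
    print(round(accuracy, 1), round(type_1, 1), round(type_2, 1))
```

With a sample dominated by non-defaulting firm years, as in this hypothetical case, overall accuracy stays high even when a large share of the defaulting firm years is missed, which is the pattern behind the high Type II errors discussed above.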
Validation of Model
Period    Measure (%)     Large Sample Period   Small Sample Period
Overall   Accuracy        81.6                  90.3
Overall   Type I Error    0.9                   6.2
Overall   Type II Error   57.6                  17.2
1 Year    Accuracy        83.1                  90.3
1 Year    Type I Error    0.9                   5.6
2 Years   Accuracy        80.1                  90.3
2 Years   Type I Error    0.9                   6.9
The table above presents the results
of the forward testing for the out of sample period. For the whole sample, the accuracy
of the model developed for the large sample period is 81.6%; for the small sample
period the accuracy is 90.3%. Type I error is the lowest for the large sample period and
Type II error is the lowest for the small sample period. For the one year and two year
forward periods, the highest accuracy level is 90.3%, for the model developed for the
small sample period. These results are in line with the results of the multiple
discriminant analysis but the level of classification accuracy is far higher for the
logistic regression models than for the multiple discriminant analysis models. So on the
basis of the above results, it can be concluded that the model developed for the small
sample period has the highest level of classification accuracy, and that the models developed
using logistic regression analysis have higher predictive ability than
the models developed using multiple discriminant analysis.
The Type II errors for
both the developed models are on the higher side, although for the small sample
period the error is not very high. This situation puts a question mark on the robustness
of the models developed and requires better specification of the models.
Table No. 5.1.2.1.8: Out Sample Results (Validation)
Period             Measure (%)   Large Sample Period   Small Sample Period
In Sample Period   Accuracy      88.7                  80.3
Forward            Accuracy      91.7                  82.6
The above Table No. 5.1.2.1.8: Out Sample Results (Validation) presents the results of
validation of the models on out of sample firms. For the in-sample period, the
accuracy of the model developed for the large sample period is 88.7%; for the small
sample period, the accuracy is 80.3%. Type I error is the lowest for the large sample
period and Type II error is the lowest for the small sample period. For the forward
period, the accuracy of the model developed for the large sample period is 91.7%; for
the small sample period, the accuracy is 82.6%. Type I and Type II errors are the lowest
for the large sample period.
These results are in line with the results of the multiple discriminant analysis but the
level of classification accuracy is far higher for the logistic regression models than for the
multiple discriminant analysis models. So, on the basis of the validation results, it can be
concluded that the model developed for the large
sample period has the highest level of classification accuracy, and that the models developed
using logistic regression analysis have higher predictive ability than
the models developed using multiple discriminant analysis.
Discussion
On the basis of the statistical coefficients of the different tests for the two
models developed for the two sample periods, it is clear that the model developed for the
small sample period is statistically more robust. However, from the classification results
it is clear that the model developed for the large sample period has the highest
classification accuracy at 97.7%. The findings from the above information can be
summarized as follows:
 Six out of seven significant variables from the model developed for the large
sample period are part of the model developed for the small sample period. This
signifies that firms tend to follow a certain pattern in long run and to predict the
firm’s default in short run, additional variables are required to accommodate for
short term risk as far as default prediction is concerned.
 All the three categories of independent variables are found to be significant for
the default prediction for both models.
 Out of two categorical variables, only one categorical variable Y is significant
for both the models.
 The intercept for the large sample period is insignificant but for the small
sample period it is significant.
 As per Omnibus Test all the models are significant.
 As per -2LL, Cox & Snell R Square and Nagelkerke R Square, all the models
are robust and the model developed for the small sample period can explain a
higher level of variation of the classification into the groups.
 As per the Hosmer and Lemeshow Test results, both the models fit the
data.
 The classification accuracies for both the sample periods for the model
developed using logistic regression method are higher than the models
developed using multiple discriminant analysis for All Firms.
 As far as in-sample classification accuracies are concerned, the model for the large
sample period has higher predictive ability, at 97.7%, than that of the small sample
period, though Type II errors for both are very high.
 The forward test results indicate that the model developed for the small sample
period has higher classification accuracy, at 90.3%, than that of the large sample period.
 Out of sample validation results clearly indicate that for out of sample firms, for the
in-sample period as well as the forward period, the classification accuracies are
higher for the large sample period at 88.7% and 91.7% respectively. Moreover,
the classification accuracy in the forward period has increased.
5.1.2.2. Large Firms
This sample has 1642 firm years. Four logistic regression models have been
developed for this sample. Two models, for the matched and unmatched samples, have
been developed for the large sample period from 1st April 2004 to 31st March 2014.
For the small sample period, two models have been developed for a
sample period of five years from 1st April 2009 to 31st March 2014. After checking for
outliers using Mahalanobis statistics, firm years for the Large sample period and
Small sample period are brought down to 1165 firm years and 604 firm years
respectively. On these samples logistic regression analysis is carried out using SPSS
software. Two hold out samples, namely ‘Forward Testing (In Sample Firms)’ and
‘Out of Sample’, are meant for validation of the classification accuracy of the
developed models. The first hold out sample ‘Forward Testing (In Sample Firms)’
consists of the financial, market and economic information of the same firms that
have been used to develop the models, for years beyond the sample years. This sample
contains 233 firm years. The second sample ‘Out of Sample’ consists of the financial,
market and economic information of firms which have not been part of the study. This
sample contains 154 firm years.
After running the logistic regression analysis on SPSS, the following models have been
obtained for the different samples.
Unmatched Sample
Large sample period: O = - 8.356EBIT/TA – 0.775MVE/TBD - 0.808Sales/TA + 9.728NI/TA - 0.184NP/TE – 0.353MP/BV + 1.946Y
Small sample period: O = - 2.726RE/TA – 0.846MVE/TBD – 1.023Sales/TA - 0.954NP/TE + 1.895Y
Matched Sample
Large sample period: O = - 8.066EBIT/TA – 0.768MVE/TBD - 0.776Sales/TA + 9.363NI/TA - 0.18NP/TE – 0.356MP/BV + 2.057Y
Small sample period: O = - 2.728RE/TA – 0.833MVE/TBD – 1.014Sales/TA - 0.953NP/TE + 1.901Y
Empirical Results
From the Table No. 5.1.2.2.1: Case Processing Summary, it is clear that the
Unmatched Sample – Large sample period has 1139 valid cases and the Unmatched Sample –
Small sample period has 595 valid cases. The Matched Sample – Large sample period has
1020 valid cases and the Matched Sample – Small sample period has 534 valid cases. On
these valid cases of the four samples, logistic regression analysis has been carried out
using SPSS to develop models for predicting the probability of default.
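Table No. 5.1.2.2.1: Case Processing Summary
                 Unmatched Sample              Matched Sample
                 Large Period   Small Period   Large Period   Small Period
Valid Cases      1139           595            1020           534
Excluded Cases   26             9              23             6
Total            1165           604            1043           540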
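Omnibus Tests of Model Coefficients
H0: There is no effect of the independent variables, taken together, on the dependent
variable.
H1: There is an effect of the independent variables, taken together, on the dependent
variable.
Table No. 5.1.2.2.2: Omnibus Tests of Model Coefficients
Chi-square df Sig.
Unmatched Sample
Matched Sample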
From the above Table No. 5.1.2.2.2: Omnibus Tests of Model Coefficients it is clear
that as per the Chi-square test, all the developed functions are significant. The
p-values of all the four models are less than the critical value. This suggests that the null
hypothesis is rejected and the alternative hypothesis is accepted
for all the four samples. That means, for all the four samples, there is an effect of the
independent variables, taken together, on the dependent variable. So on the basis of
the above result, it can be concluded that the inclusion of the selected independent variables
improves the predictive ability of the model.
Model Summary
The below Table No. 5.1.2.2.3: Model Summary presents the coefficients that
indicate the robustness of the developed models for the four samples. The
-2LL values for all the models are found to be on the higher side. This makes all the models
significant. Cox and Snell R Square values are relatively lower for all the samples but
Nagelkerke R Square values are on the higher side. This makes all the developed models
robust. The models developed for the small sample period can explain the highest
level of variation in group classification. On the basis of the above three criteria,
interpreted separately, it can be said that the model developed for the matched sample
for the small sample period has the highest level of robustness.
Table No. 5.1.2.2.3: Model Summary
                      -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
Unmatched Sample
Matched Sample
Small Sample Period   212.999             0.474                  0.733
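Hosmer and Lemeshow Test
Hosmer and Lemeshow test of goodness of fit divides the sample according to the
predicted probabilities and tells whether the data used for developing the model fit
the model or not, that is, whether the model is correctly specified or not. The null
hypothesis for the test is that the model is correctly specified. The alternative
hypothesis is that the model is not correctly specified.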
Table No. 5.1.2.2.4: Hosmer and Lemeshow Test
Chi-square df Sig.
Unmatched Sample
Matched Sample
The above Table No. 5.1.2.2.4: Hosmer and Lemeshow Test presents the results of the
goodness of fit of the developed models. The degree of freedom for the test is 8. The
p-values of the test for three models, all except the unmatched sample – small sample
period, are higher than the critical value. This clearly indicates that these three
models fit the data and the null hypothesis is accepted for them, while
for the unmatched sample – small sample period the null hypothesis is rejected.
From the Variables in Equation table below, it is clear that seven
variables for the unmatched sample - large sample period, five variables for the
unmatched sample - small sample period, seven variables for the matched sample -
large sample period and five variables for the matched sample - small sample period have
p-values less than the critical value of 0.05. So the null hypothesis for these variables is
rejected and the alternative hypothesis is accepted. That means the coefficients
corresponding to these variables are non-zero. This signifies that these
variables are significant in model development and the remaining variables are dropped
from the model. Of the two categorical variables, only categorical
variable Y is found to be significant for all the samples.

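(B, Sig. groups in order: Unmatched Sample - Large Sample Period; Unmatched Sample - Small Sample Period; Matched Sample - Large Sample Period; Matched Sample - Small Sample Period)
Variable B Sig. B Sig. B Sig. B Sig.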
WC/TA 0.383 0.492 1.154 0.133 0.229 0.687 1.117 0.148
RE/TA -0.863 0.065 -2.726 0.013 -0.719 0.163 -2.728 0.013
EBIT/TA -8.356 0.021 5.977 0.548 -8.066 0.027 6.458 0.551
MVE/TBD -0.775 0.000 -0.846 0.032 -0.768 0.000 -0.833 0.034
Sales/TA -0.808 0.022 -1.023 0.031 -0.776 0.026 -1.014 0.032
CA/CL 0.010 0.778 0.045 0.419 0.011 0.760 0.045 0.422
NI/TA 9.728 0.007 -2.845 0.774 9.363 0.009 -3.366 0.757
NP/TE -0.184 0.046 -0.954 0.017 -0.180 0.045 -0.953 0.017
TBD/TA 0.197 0.626 -0.126 0.848 0.188 0.616 -0.137 0.835
EBIT/INT -0.007 0.669 -0.088 0.466 -0.006 0.691 -0.084 0.489
OCFR -0.008 0.935 0.312 0.260 -0.003 0.975 0.309 0.264
GRTA -0.171 0.736 -1.097 0.332 -0.108 0.828 -1.097 0.333
ITR 0.000 0.006 0.000 0.055 0.000 0.008 0.000 0.059
FAT 0.019 0.111 0.012 0.410 0.019 0.085 0.011 0.419
MP/EPS 0.001 0.674 0.001 0.783 0.001 0.683 0.001 0.786
MP/BV -0.353 0.001 -0.573 0.064 -0.356 0.001 -0.579 0.062
D/E -0.034 0.323 -0.125 0.075 -0.033 0.327 -0.124 0.076
Log(TA/GNP) 0.072 0.765 0.215 0.521 0.052 0.832 0.231 0.492
SG -0.295 0.457 -0.236 0.678 -0.398 0.309 -0.238 0.675
SG/GNPG 0.000 0.730 0.000 0.737 0.000 0.849 0.000 0.739
X 15.940 0.378 0.436 0.384 16.081 1.000 0.423 0.398
Y 1.946 0.000 1.895 0.000 2.057 0.000 1.901 0.000
Constant -16.896 1.000 -0.102 0.940 -16.963 1.000 -0.133 0.921
The intercepts for all the models are found to be insignificant. Another interesting
finding is that for the large sample period, for both the matched and the unmatched sample,
the same variables are found to be significant and the differences in the coefficients of the
two samples are very small. These may be considered to be identical. The same is true of
the small sample period. So it can be concluded on the basis of the above information that
there is no material difference between the matched sample and the unmatched sample as far as
the logistic regression models for the sample data are concerned.
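How close the matched and unmatched coefficients are can be quantified directly. The sketch below takes the B values of the seven large sample period variables that enter the printed equations from the Variables in Equation table above and computes the relative difference of each matched coefficient from its unmatched counterpart. The choice of relative difference as the yardstick is an assumption made for illustration; the study itself only describes the differences as very small.

```python
# B values for the large sample period, read from the Variables in Equation table.
UNMATCHED_LARGE = {
    "EBIT/TA": -8.356, "MVE/TBD": -0.775, "Sales/TA": -0.808, "NI/TA": 9.728,
    "NP/TE": -0.184, "MP/BV": -0.353, "Y": 1.946,
}
MATCHED_LARGE = {
    "EBIT/TA": -8.066, "MVE/TBD": -0.768, "Sales/TA": -0.776, "NI/TA": 9.363,
    "NP/TE": -0.180, "MP/BV": -0.356, "Y": 2.057,
}


def relative_differences(reference, other):
    """Absolute difference of each coefficient, as a fraction of the reference value."""
    return {
        name: abs(reference[name] - other[name]) / abs(reference[name])
        for name in reference
    }


if __name__ == "__main__":
    diffs = relative_differences(UNMATCHED_LARGE, MATCHED_LARGE)
    for name, diff in sorted(diffs.items(), key=lambda item: item[1], reverse=True):
        print(f"{name:10s} {100.0 * diff:5.2f}%")
```

On these figures every coefficient differs by less than 6% between the two samples, with the categorical variable Y showing the largest relative gap.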

                Unmatched Sample              Matched Sample
Measure (%)     Large Period   Small Period   Large Period   Small Period
Accuracy        94.1           92.9           93.3           92.1
Type I Error    2.3            3.1            2.7            3.6
Type II Error   32.6           23.5           30.8           23.5
The table above presents the in-sample
classification accuracy of the developed models for the four samples. For the
unmatched sample - large sample period the accuracy rate of the developed model is
94.1%, for the unmatched sample - small sample period it is 92.9%, for the matched
sample - large sample period it is 93.3% and
for the matched sample - small sample period it is 92.1%. Type I errors are 2.3%,
3.1%, 2.7% and 3.6% respectively. Type II errors are 32.6%, 23.5%, 30.8% and
23.5% respectively. These results are found to be better than those of the models developed
using multiple discriminant analysis for Large Firms. Also from the above result it is clear
that the model developed for the unmatched sample - large sample period has the
highest predictive ability among all the models, although the margin is not large.
Validation of Model
Two hold out samples are used to validate the
developed models for the four samples. For Unmatched Samples, the first sample
consists of 233 firm years of the same firms but beyond the sample period for the
forward testing and the second sample consists of 154 firm years which are out of
sample firms. For Matched Samples, the first sample consists of 206 firm years of the
same firms but beyond the sample period for the forward testing and the second
sample consists of out of sample firms.

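Table No. 5.1.2.2.7: Forward Testing (In Sample Firms)
                          Unmatched Sample              Matched Sample
Period    Measure (%)     Large Period   Small Period   Large Period   Small Period
Overall   Accuracy        92.7           92.3           92.7           90.8
Overall   Type I Error    3.9            1.9            3.9            2.7
Overall   Type II Error   14.1           19.2           12.7           20.2
1 Year    Accuracy        93.1           90.5           92.3           89.4
1 Year    Type I Error    2.6            1.3            3.1            1.5
1 Year    Type II Error   15.8           26.3           15.4           25.6
2 Years   Accuracy        92.3           93.2           93.1           92.2
2 Years   Type I Error    5.2            2.6            4.8            3.2
2 Years   Type II Error   12.5           15.0           10.0           15.0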
The above Table No. 5.1.2.2.7: Forward Testing (In Sample Firms) presents the
classification accuracy of the four developed models. For the whole period, for the
Unmatched Sample of Large Firms for the Large sample period, the classification
accuracy of the developed model is found to be 92.7% against 92.3% for the Unmatched
Sample – small sample period. Type I errors are 3.9% and 1.9% respectively but the
Type II errors are on the higher side at 14.1% and 19.2% respectively. So for the unmatched
samples, the model developed for the large sample period has the highest
classification accuracy. For the Matched Sample – large sample period the
classification accuracy of the developed model is found to be 92.7% against 90.8% for
the matched sample – small sample period. Type I errors are 3.9% and 2.7%
respectively and the Type II errors are 12.7% and 20.2% respectively. Similarly for the
matched samples, the model developed for the large sample period has the highest
classification accuracy.
Like the in-sample period results, the models
developed for the large sample periods are found to have the highest level of
classification accuracy for both the unmatched as well as matched samples for the
forward periods of one year and two years. As far as the misclassification is concerned,
the misclassification level for the models developed for the large sample periods is
highest for both the unmatched and matched samples. But like the models developed
using MDA, these models have a very high level of Type II errors. When the results
of the logistic models are compared with those of the multiple discriminant analysis models,
the classification accuracies of the logistic models are higher.
So on the basis of the results of forward testing, it can be concluded that the models
developed for the large sample periods for both the unmatched and matched samples
have the highest level of classification accuracy. On comparison of the results it is clear
that there is no major difference between the classification accuracies for the unmatched
and matched samples. So it can be said that the assumption of matched or unmatched
samples does not have any impact on classification accuracy.
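Table No. 5.1.2.2.8: Out of Sample Results (Validation)
                                   Unmatched Sample              Matched Sample
Period             Measure (%)     Large Period   Small Period   Large Period   Small Period
In Sample Period   Accuracy        85.6           84.8           86.4           85.1
In Sample Period   Type I Error    4.8            6.8            4.6            6.1
In Sample Period   Type II Error   68.2           65.0           68.2           68.2
Forward            Accuracy        63.0           88.7           61.5           61.5
Forward            Type I Error    10.5           6.8            11.1           11.1
Forward            Type II Error   100.0          100.0          100.0          100.0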
The above Table No. 5.1.2.2.8: Out of Sample Results (Validation) presents the results of
out of sample validation. For the in-sample period, for the Unmatched Sample of Large
Firms for the Large sample period, the classification accuracy of the developed model is
found to be 85.6% against 84.8% for the Unmatched Sample – small sample period. Type I
errors are 4.8% and 6.8% respectively but the Type II errors are on the higher side at
68.2% and 65.0% respectively. So for the unmatched samples, the model developed for the large
sample period has the highest classification accuracy. For the Matched Sample – large
sample period the classification accuracy of the developed model is found to be 86.4%
against 85.1% for the matched sample – small sample period. Type I errors are 4.6%
and 6.1% respectively and the Type II errors are 68.2% for both samples. Similarly for
the matched samples, the model developed for the large sample period has the highest
classification accuracy.
For the forward period, the model developed for the
unmatched sample - small sample period is found to have the highest level of
classification accuracy among all the models developed, for the forward period of one year
and two years for out of sample firms. As far as misclassification is concerned, it is
very high for all the samples for the forward period of out of sample firms.
Discussion
The model for the matched sample for the small sample period has the highest statistical
robustness and the model for the unmatched sample for the large sample period has the
highest classification accuracy at 94.1%. The findings from the above information can
be summarized as follows:
 There is no material distinction between matched and unmatched samples. For the large
sample period, the same variables are significant for both the samples, with nearly the
same coefficients for the respective variables. Similarly for the small sample period, the
same variables are significant for both the samples, with nearly the same coefficients
for the respective variables.
 The significant variables belong to only two categories namely, accounting and
market variables.
 Out of the two categorical variables, only categorical variable Y is found to be
significant.
 The intercepts for all the developed models are insignificant.
 As per the Omnibus Test, all the models are significant.
 As per the Model Summary, all the developed models are robust, and the models for
the small sample period can explain the highest level of variation in group
classification.
 As per the Hosmer and Lemeshow Test, three of the specified models fit the data and
the model for the unmatched sample – small sample period does not fit the data.
 The logit models have higher classification accuracy than the MDA models for
Large Firms.
 From the in sample classification results, for the large sample period, the
classification accuracies for the matched and unmatched samples are almost the
same. Similarly, for the small sample period, the classification accuracies for the
matched and unmatched samples are almost the same.
 From the in sample classification results, it is clear that the classification accuracy
for the large sample period is the highest at 94.1%, although Type II errors are
very high for all the samples.
 For the forward period, the classification accuracy of the model developed for
the small sample period is the highest at 90.3%, and classification accuracy is
found to deteriorate over time.
 For the out of sample validation results, the classification accuracy for the large
sample period is the highest at 88.7%.
5.1.2.3. Large Firms with Public Sector Units
This sample initially has 1880 firm years. Four logistic regression models have been
After refinement and checking for outliers using Mahalanobis statistics, the firm years
for the large sample period and the small sample period are brought down to 1289 firm
years and 659 firm years respectively. On these two samples, logistic regression
analysis is run using SPSS software. Two hold out samples namely ‘Forward Testing
(In Sample Firms)’ and ‘Out of Sample’ are meant for validation of the classification
accuracy of the developed models. The first hold out sample ‘Forward Testing (In
Sample Firms)’ consists of the financial, market and economic information of the
same firms that have been used to develop the models beyond the sample years. This
After running the logistic regression analysis using SPSS software on two samples
the following models are found.
O = – 7.586EBIT/TA – 0.156MVE/TBD – 0.95Sales/TA + 8.407NI/TA + 0.009FAT – 0.202MP/BV + 2.007Y
O = – 2.725RE/TA – 0.824Sales/TA – 0.817NP/TE – 0.114EBIT/INT – 1.015MP/BV + 1.6Y
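A logit score O is converted into a probability through the logistic function. The following is a minimal sketch, assuming that O is the log-odds of default and that Y is a 0/1 dummy variable; neither assumption is confirmed in this section. It uses the coefficients of the first equation above, and the firm-year ratios are hypothetical.

```python
import math

# Coefficients of the first equation above (no intercept term is reported).
FIRST_EQUATION = {
    "EBIT/TA": -7.586,
    "MVE/TBD": -0.156,
    "Sales/TA": -0.95,
    "NI/TA": 8.407,
    "FAT": 0.009,
    "MP/BV": -0.202,
    "Y": 2.007,
}


def logit_score(ratios, coeffs=FIRST_EQUATION):
    """Linear score O = sum of coefficient times ratio."""
    return sum(coeffs[name] * ratios[name] for name in coeffs)


def default_probability(score):
    """Logistic transform, assuming the score is the log-odds of default."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical firm year, for illustration only.
firm_year = {
    "EBIT/TA": 0.10, "MVE/TBD": 2.0, "Sales/TA": 1.0, "NI/TA": 0.05,
    "FAT": 3.0, "MP/BV": 1.5, "Y": 0,
}
score = logit_score(firm_year)
print(f"O = {score:.4f}, P(default) = {default_probability(score):.4f}")
```

A firm year would then be classified as a defaulter when the probability exceeds a chosen cut-off; a cut-off of 0.5 corresponds to O = 0.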
Empirical Results
method.

                  Large Sample Period   Small Sample Period
Valid Cases       1263                  650
Excluded Cases    26                    9
Total             1289                  659
variable.
variable
Chi-square df Sig.
values of both the models are less than the critical p-value. This suggests that the null
hypothesis is rejected and the alternative hypothesis is accepted for both the samples.
That means, for both the samples, there is an effect of the
Model Summary
From Table No. 5.1.2.3.3: Model Summary below, the -2LL values for all the models are
found to be on the higher side. This makes all the models significant. Cox and Snell R
Square values are relatively lower for all the samples, but Nagelkerke R Square values
are on the higher side. This makes all the developed models robust. Of the two models,
the model developed for the small sample period can explain the highest level of
variation in group classification. On the basis of the above three criteria,
interpreted separately, it can be said that the model developed for the small sample
period is highly robust.
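The three Model Summary quantities are linked through the log-likelihood. The following is a minimal sketch of the standard Cox and Snell and Nagelkerke formulas; the log-likelihood values are hypothetical, because the Model Summary figures are not reproduced in this section, and only the number of valid cases (1263) is taken from the text.

```python
import math


def pseudo_r_squared(ll_null, ll_model, n):
    """Cox and Snell and Nagelkerke R Square from two log-likelihoods.

    ll_null:  log-likelihood of the intercept-only model
    ll_model: log-likelihood of the fitted model (so -2LL = -2 * ll_model)
    n:        number of valid cases
    """
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell R Square can reach for this sample.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    return cox_snell, nagelkerke


# Hypothetical log-likelihoods, for illustration only.
cox_snell, nagelkerke = pseudo_r_squared(ll_null=-400.0, ll_model=-250.0, n=1263)
print(f"-2LL = {-2 * -250.0:.1f}")
print(f"Cox and Snell R Square = {cox_snell:.3f}")
print(f"Nagelkerke R Square    = {nagelkerke:.3f}")
```

Because Nagelkerke R Square rescales Cox and Snell R Square by its maximum attainable value, it is never smaller than the Cox and Snell value.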

Sample Period    Chi-square    df    Sig.
Large            7.156         8     0.520
Small            2.148         8     0.976
Table No. 5.1.2.3.4: Hosmer and Lemeshow Test above presents the results of the
goodness of fit of the developed models. The degrees of freedom for the test are 8.
The model developed for the large sample period has a Chi-square of 7.156 with a
p-value of 0.520, and the model developed for the small sample period has a
Chi-square of 2.148 with a p-value of 0.976. The p-values of the test for both the
models are higher than the critical value. This indicates that the specified models
fit the data and the null hypothesis is accepted for both the models.
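The reported p-values follow from the chi-square distribution with 8 degrees of freedom. The following is a minimal sketch that uses the closed-form upper-tail probability available for even degrees of freedom; the function name is illustrative and is not part of SPSS.

```python
import math


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Accumulate sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


for label, chi_square in (("large", 7.156), ("small", 2.148)):
    p_value = chi_square_sf_even_df(chi_square, 8)
    print(f"{label} sample period: chi-square={chi_square}, p={p_value:.3f}")
```

A p-value above 0.05 means the null hypothesis of adequate fit is not rejected, which is the reading applied above.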
variables for the large sample period and six variables for the small sample period
have p-values less than the critical value of 0.05. So the null hypothesis for these
variables is rejected and the alternative hypothesis is accepted. That means the
coefficients corresponding to these variables are non-zero. This signifies that these
variables are significant in model development, and the remaining variables are
dropped from the model. The intercepts for both the samples are insignificant. Of the
two categorical variables, only categorical variable Y is found to be significant for
both the samples.

                Large Sample Period              Small Sample Period
Variable        Coefficient   Wald     Sig.      Coefficient   Wald     Sig.
WC/TA           0.110         0.042    0.838     1.233         2.535    0.111
RE/TA           -0.578        1.987    0.159     -2.725        8.176    0.004
EBIT/TA         -7.586        5.720    0.017     -3.128        0.245    0.620
MVE/TBD         -0.156        12.960   0.000     0.000         0.009    0.923
Sales/TA        -0.950        9.480    0.002     -0.824        5.123    0.024
CA/CL           0.020         0.344    0.558     0.066         1.529    0.216
NI/TA           8.407         7.451    0.006     6.017         0.903    0.342
NP/TE           -0.117        1.892    0.169     -0.817        5.550    0.018
TBD/TA          0.388         0.918    0.338     0.306         0.272    0.602
EBIT/INT        -0.022        0.980    0.322     -0.114        6.910    0.009
OCFR            -0.009        0.011    0.915     0.467         3.344    0.067
GRTA            -0.070        0.023    0.880     -0.183        0.033    0.855
ITR             0.000         1.977    0.160     0.000         0.923    0.337
FAT             0.009         4.343    0.037     0.008         0.347    0.556
MP/EPS          0.001         0.551    0.458     0.001         0.165    0.684
MP/BV           -0.202        4.219    0.040     -1.015        15.009   0.000
D/E             -0.026        0.619    0.431     -0.069        1.251    0.263
Log(TA/GNP)     -0.369        2.918    0.088     -0.043        0.020    0.886
SG              -0.369        0.951    0.329     -0.400        0.527    0.468
SG/GNPG         0.000         0.065    0.799     0.000         0.056    0.812
X               16.611        0.000    1.000     0.811         2.793    0.095
Y               2.007         34.209   0.000     1.600         11.572   0.001
Constant        -16.957       0.000    1.000     -0.119        0.010    0.922
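The significance values in the table can be checked against the test statistics. The following is a minimal sketch, assuming that the second figure reported for each variable is a Wald chi-square statistic with one degree of freedom; the column headings did not survive in the extracted table, so this reading is an assumption.

```python
import math


def wald_p_value(wald):
    """p-value of a Wald chi-square statistic with 1 degree of freedom.

    P(chi2_1 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(wald / 2.0))


# (variable, assumed Wald statistic, reported Sig.) for the large sample period.
rows = [("EBIT/TA", 5.720, 0.017), ("Sales/TA", 9.480, 0.002), ("FAT", 4.343, 0.037)]
for name, wald, reported in rows:
    print(f"{name}: computed p={wald_p_value(wald):.3f}, reported Sig.={reported}")
```

A variable is retained when this p-value is below the critical value of 0.05.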

Accuracy 94.1 92.5
and small sample period respectively. Type II errors are 27% and 29.2% respectively.
These results are found to be better than those of the models developed using multiple
discriminant analysis for Large Firms with PSU. Also, from the above results it is clear
that the model developed for the large sample period has higher predictive ability than
the model developed for the small sample period.
Model Validation

Accuracy 92.6 86.6
Accuracy 93.2 89.4
Accuracy 92.3 83.6
Table No. 5.1.2.3.7: Forward Testing (In Sample Firms) above presents the results of
the forward testing of the out of sample period. For the whole sample, the accuracy of
the model developed for the large sample period is 92.6%; for the small sample period,
accuracy is 86.6%. Type I and Type II errors are the lowest for the large sample
period. For the one year and two years forward periods, the highest accuracy level is
93.2%, for the model developed for the large sample period. That means that over a
longer time period the predictive ability of the models falls. Therefore, on the basis
of the above results, it can be concluded that the model developed for the large sample
period has the highest level of classification accuracy, and that the models developed
using logistic regression analysis have higher predictive ability than the models
developed using multiple discriminant analysis.
both the developed models are on the higher side, although for the large sample
period it is not very high. This situation calls into question the robustness of the
developed models and requires better specification of the models.

                                  Large Sample Period   Small Sample Period
In Sample Period   Accuracy       89.2                  80.7
                   Type I Error   6.0                   15.1
Forward            Accuracy       94.7                  86.5
                   Type I Error   0                     11.4
Table No. 5.1.2.3.8: Out of Sample (Validation) above presents the results of the
validation of the models on out of sample firms. For the In Sample Period, the
accuracy of the model developed for the large sample period is 89.2%; for the small
sample period, accuracy is 80.7%. Type I and Type II errors are the lowest for the
large sample period. For the forward period, the accuracy of the model developed for
the large sample period is 94.7%; for the small sample period, accuracy is 86.5%.
So, on the basis of the validation results, it can be concluded that the model
developed for the large sample period has the highest level of classification
accuracy, and that the models developed using logistic regression analysis have higher
predictive ability than the models developed using multiple discriminant analysis.
Discussion
On the basis of statistical coefficients, the models for the small sample period have
the highest statistical robustness. From the classification results, the models for the
large sample period have the highest classification accuracy at 94.1%. The findings
from the above information can be summarized as follows:
 Seven and six variables are found significant for the large sample period and the
small sample period respectively. Of these, only two variables are common. That
means the two models have almost entirely different sets of variables.
 For both the models, only accounting and market variables are significant. No
economic variable is found significant.
 Of the two categorical variables, only categorical variable Y is found to be
significant for both the samples.
 The intercepts for both the samples are insignificant.
 As per the Omnibus Test, all the models are significant.
 As per the three criteria of the Model Summary, all the models are robust.
Interpreted separately, it can be said that the model for the small sample period
is highly robust.
 The Hosmer and Lemeshow Test clearly indicates that the specified models fit
the data for both the models.
 Logit models have higher classification accuracies than the MDA models for Large
Firms with PSU.
 From the in sample classification results, the model for the large sample period
has the highest classification accuracy at 94.1%, but the Type II errors for both
the models are very high.
 From the forward testing results, the model for the large sample period has the
highest classification accuracy at 92.6%.
 From the out of sample validation results, the model for the large sample period
has the highest classification accuracy at 89.2%.
5.1.2.4. Small and Medium Enterprises
This sample initially has 559 firm years. Two logistic regression models have been
After refinement and checking for outliers using Mahalanobis statistics, the firm years
for the large sample period and the small sample period are brought down to 377 firm
years and 199 firm years respectively. On these two samples, logistic regression
analysis is run using SPSS software. Two hold out samples namely ‘Forward Testing (In
Sample Firms)’ and ‘Out of Sample’ are meant for validation of the classification
accuracy of the developed models. The first hold out sample ‘Forward Testing (In
Sample Firms)’ consists of the financial, market and economic information of the
same firms that have been used to develop the models beyond the sample years. This
After running the logistic regression analysis using SPSS software on the two samples,
the following models are found.
O = 2.138 – 20.002EBIT/TA – 0.42MVE/TBD – 2.412Sales/TA + 18.743NI/TA – 0.658Log(TA/GNP) + 1.409X + 1.404Y
O = – 0.617MVE/TBD – 3.076Sales/TA + 26.007NI/TA + 2.319X
Empirical Results
method.

                  Large Sample Period   Small Sample Period
Valid Cases       345                   182
Excluded Cases    32                    17
Total             377                   199
H0: All the coefficients of independent variables are equal to zero.
H1: There is at least one coefficient of an independent variable that is not equal to zero.
Chi-square df Sig.
values of both the models are less than the critical p-value. This suggests that the null
hypothesis is rejected and the alternative hypothesis is accepted for both the samples.
That means, for both the samples, there is an effect of the
Model Summary
From Table No. 5.1.2.4.3: Model Summary below, the -2LL values for all the models are
found to be on the higher side. This makes all the models significant. Cox and Snell R
Square values are relatively lower for all the samples, but Nagelkerke R Square values
are on the higher side. This makes all the developed models robust. The model developed
for the small sample period can explain the highest level of variation in group
classification. On the basis of the above three criteria, interpreted separately, it
can be said that the model developed for the small sample period has the highest
level of robustness.


Table No. 5.1.2.4.4: Hosmer and Lemeshow Test above presents the results of the
goodness of fit of the developed models. The degrees of freedom for the test are 8.
The model developed for the large sample period has a Chi-square of 4.007 with a
p-value of 0.856, and the model developed for the small sample period has a
Chi-square of 7.331 with a p-value of 0.501. The p-values of the test for both the
models are higher than the critical value. This indicates that the specified models
fit the data and the null hypothesis is accepted for both the models. So it can be
concluded that the models fit the data used to develop them well for both samples.

                Large Sample Period              Small Sample Period
Variable        Coefficient   Wald     Sig.      Coefficient   Wald     Sig.
WC/TA           -0.695        0.408    0.523     -0.196        0.016    0.898
RE/TA           0.403         0.714    0.398     0.105         0.038    0.845
EBIT/TA         -20.002       5.052    0.025     -26.892       3.801    0.051
MVE/TBD         -0.420        5.781    0.016     -0.617        4.288    0.038
Sales/TA        -2.412        20.188   0.000     -3.076        11.658   0.001
CA/CL           -0.039        0.466    0.495     -0.046        0.137    0.711
NI/TA           18.743        4.501    0.034     26.007        4.292    0.038
NP/TE           -0.022        0.245    0.621     0.011         0.009    0.925
TBD/TA          0.371         0.116    0.733     -0.379        0.049    0.825
EBIT/INT        0.001         0.369    0.544     -0.026        0.570    0.450
OCFR            0.077         0.179    0.672     0.049         0.041    0.840
GRTA            -1.129        2.319    0.128     -1.060        0.810    0.368
ITR             0.000         0.056    0.813     0.005         2.057    0.152
FAT             0.024         3.076    0.079     0.107         2.335    0.126
MP/EPS          -0.004        0.862    0.353     -0.006        1.271    0.260
MP/BV           0.053         1.493    0.222     0.065         1.278    0.258
D/E             0.003         0.024    0.877     0.029         0.149    0.699
Log(TA/GNP)     -0.658        7.795    0.005     -0.436        1.615    0.204
SG              0.712         2.447    0.118     1.113         2.408    0.121
SG/GNPG         0.001         0.580    0.446     0.001         0.840    0.359
X               1.409         4.250    0.039     2.319         5.318    0.021
Y               1.404         4.701    0.030     0.942         1.099    0.294
Constant        2.138         6.005    0.014     2.563         4.107    0.043
variables for the large sample period and four variables for the small sample period
have p-values less than the critical value of 0.05. So the null hypothesis for these
variables is rejected and the alternative hypothesis is accepted. That means the
coefficients corresponding to these variables are non-zero. This signifies that these
variables are significant in model development, and the remaining variables are
dropped from the model. The intercept for the large sample period is significant but
for the small sample period it is insignificant. Of the two categorical variables, only
categorical variable X is found to be significant for the small sample period, while
for the large sample period both the categorical variables are significant.

Accuracy 91.9 89.6
Type II Error 31.2 20
and small sample period respectively. Type II errors are 31.2% and 20.0%
respectively. From the above results it is clear that the model developed for the large
sample period has higher predictive ability than that of the small sample period. And
the model developed using logistic regression analysis method has higher predictive
ability than that of the model developed using multiple discriminant analysis.
Validation of Model

Accuracy 88.1 28.4
Accuracy 87.5 18.6
Accuracy 90.0 33.3
of the forward testing of the out of sample period. For the whole sample, the accuracy
of the model developed for the large sample period is 88.1%; for the small sample
period, accuracy is 28.4%. Type I error is the lowest for the large sample period and
Type II error is the lowest for the small sample period. For the In Sample Period, the
model developed for the large sample period has the highest classification accuracy.
For the one year and two years forward periods, the model developed for the small
sample period has the highest classification accuracy.

In Sample Period Accuracy 82.8 37.9
Forward Accuracy 100.0 42.9
Table No. 5.1.2.4.8: Out Sample Results (Validation) above presents the results of the
validation of the models on out of sample firms. For the In Sample Period, the
accuracy of the model developed for the large sample period is 82.8%; for the small
sample period, accuracy is 37.9%. Type I and Type II errors are the lowest for the
large sample period. For the forward period, the classification accuracy of the model
developed for the large sample period is the highest. These results are in line with the
results of the multiple discriminant analysis, but the level of classification accuracy
is far higher for the logistic regression models than for the multiple discriminant
analysis models. So, on the basis of the validation results, it can be concluded that
the model developed for the large sample period has the highest level of
classification accuracy.
Discussion
On the basis of statistical coefficients, the model for the small sample period has the
highest statistical robustness. From the classification results, the model for the large
sample period has the highest classification accuracy at 91.4%. The findings from the
above information can be summarized as follows:
 All the variables that are significant for the small sample period are also
significant for the large sample period.
 For the model for the large sample period, all three categories of variables are
significant, but for the model for the small sample period only accounting and
market variables are significant. No economic variable is found significant for
the small sample period.
 Both the categorical variables are significant for the model developed for the
large sample period, and for the small sample period only categorical variable X
is found to be significant.
 The intercept for the model developed for the small sample period is insignificant.
 As per the Omnibus Test, all the models are significant.
 As per the three criteria of the Model Summary, all the models are significant.
Interpreted separately, the model for the small sample period is highly robust.
 As per the Hosmer and Lemeshow Test, both the specified models fit the data.
 Logit models have higher accuracy than the MDA models for Small and Medium
Enterprises.
 From the in sample classification results, the model for the large sample period
has the highest classification accuracy at 91.4%, but the Type II errors for both
the models are very high.
 From the forward testing results, the model for the large sample period has the
highest classification accuracy at 88.1%.
 From the out of sample validation results, the model for the large sample period
has the highest classification accuracy at 82.8%.
5.1.2.5. Findings from Logistic Regression Analysis
From the above discussion of the four broad samples and the ten models developed for
two sample periods using logistic regression analysis, it is clear that, for the In
Sample Period, the models developed for the small sample period have higher robustness,
but the classification accuracy is highest for the models developed for the large
sample period, at 97.7% for All Firms.
For different samples, different categories of variables are found to be significant,
but four variables, MVE/TBD, Sales/TA, NI/TA and Y, are significant for all the models,
and these four variables belong to the accounting and market categories. The findings
of the study are summarized as follows:
All Firms
 All the three categories of variables are significant for both the models.
 Only one categorical variable, Y, is found significant for both the models.
 Four variables, MVE/TBD, Sales/TA, NI/TA and Y, are significant for all the
models, and these four variables belong to the accounting and market categories.
 All the significant variables for the model for the large sample period are
significant for the small sample period. This indicates that in the long run firms
tend to follow a similar pattern, while in the short run they follow different
patterns when it comes to default prediction.
 For the model for the large sample period, no variable with a debt component is
significant.
 For the model for the small sample period, besides EBIT/INT, no other variable
with a debt component is significant.
Large Firms
 Only two categories of variables, namely accounting and market variables, are
significant for both the models.
 Although the coefficients differ, all the significant variables for the model for
the small sample period are significant for the model for the large sample period.
 Only one categorical variable, Y, is significant for both the models.
 For both the models, of all the variables with a debt component, only the market
value of equity with respect to the total book value of debt is significant.
 The intercepts for both the models are insignificant.
Large Firms with PSU
 Only two categories of variables, namely accounting and market variables, are
significant for both the models.
 Besides two variables, all the other variables in the two models are different.
 Only one categorical variable, Y, is significant for both the models.
 For the model for the large sample period, only the market value of equity with
respect to the total book value of debt is significant, and for the model for the
small sample period, no variable with a debt component is significant.
 The intercepts for both the models are insignificant.
Small and Medium Enterprises
 All the three categories of variables are significant for the model for the large
sample period.
 Only two categories of variables, namely accounting and market variables, are
significant for the model for the small sample period.
 Although the coefficients differ, all the significant variables for the model
for the small sample period are significant for the model for the large sample
period.
 For the model for the large sample period both the categorical variables are
significant, but for the model for the small sample period only one categorical
variable, X, is significant.
 For both the models, only the market value of equity with respect to the total book
value of debt is significant.
 The intercept for the model for the small sample period is insignificant.
For all the models, accounting variables as well as market variables are significant.
Economic variables are important for only three models, although in all the
discriminant functions economic variables have been found significant.
On comparison of the results of the different statistical tests for all the samples
over both the sample periods individually, it is clear that the models for the small
sample period of five years have higher robustness for all the samples. These findings
can be interpreted as follows:
 According to the Omnibus Test, all the models are significant.
 As per the Model Summary, all the developed models are robust. On the basis of
the three criteria, interpreted separately, it can be said that the models for the
small sample period have the highest level of robustness.
 The Hosmer and Lemeshow Test clearly indicates that nine of the specified models
fit the data and the null hypothesis is accepted for these nine models; for the
unmatched sample – small sample period, the null hypothesis is rejected.
On comparison of the classification accuracies of all the models, it is clear that the
sample of All Firms for the large sample period has yielded the highest classification
accuracy at 97.7%. It is also clear from the classification accuracies of the ten
developed models that the large sample period yields higher classification accuracy
using logistic regression analysis.
These findings can be interpreted as follows:
 Logit models have higher classification accuracy than MDA models.
 The in sample classification accuracies of the models for the large sample
period are the highest.
 In the forward testing of in sample firms as well as of out of sample firms, the
classification accuracy is found to be the highest for the models for the large
sample period.
 The classification accuracy of the developed models is found to deteriorate over
time for both the in sample firms and the out of sample firms.
 On the basis of the classification results, there is no distinction between the
matched and unmatched samples. They have almost the same results.
 The developed models face the problem of high Type I and Type II errors, of
which the Type II error is the more problematic.
 The models for the small sample period are more robust, but classification
accuracies are higher for the models for the large sample period. This is itself
one of the biggest limitations of the study.
Conclusion
Considering the different limitations, on the basis of the statistical robustness and
the classification accuracies of the developed models, it can be concluded that the
models for the large sample period have the highest classification accuracy but the
models for the small sample period are more robust. Over time, classification accuracy
deteriorates for the developed models. The model for the large sample period for All
Firms has the highest classification accuracy at 97.7%.
5.1.3. Reduced Form Models
Reduced form models using Poisson probability have been used to arrive at the default
intensity for the Indian firms. This default intensity is used to arrive at the
probability of default. The study has been divided into two parts. Part I deals with
the sample of All Firms, which contains 2450 firm years for which ratios have been
calculated using financial statements and market and economic information. Part II
deals with the sample of Large Firms, which contains 1642 firm years for which ratios
have been calculated using financial statements and market and economic information.
5.1.3.1. All Firms
This sample initially has 2450 firm years of 225 large firms. The study is conducted
in two parts for two sample periods, namely the large sample period of ten years from
1st April 2004 to 31st March 2014 and the small sample period of five years from 1st
April 2009 to 31st March 2014. The default occurrences are counted for every firm over
the whole sample period. Similarly, the 10 year means and 5 year means of the selected
ratios are calculated for every firm. This turns the sample into 225 firms. This
sample is then checked for outliers using Mahalanobis statistics, which brings the
sample size down to 223 firms. The averages of the ratios are taken as independent
variables and the count data is used as the dependent variable. On this sample, the
regression is run using Poisson probability in E-Views software.
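The data preparation described above collapses firm years into one record per firm. The following is a minimal sketch under assumed names: the field names `firm` and `default` and the sample rows are hypothetical, and the thesis itself performs the regression step in E-Views.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_firm(firm_years, ratio_names):
    """Collapse firm-year rows into one record per firm.

    Each output record holds the count of default occurrences (the dependent
    variable) and the mean of every selected ratio (the independent variables).
    """
    grouped = defaultdict(list)
    for row in firm_years:
        grouped[row["firm"]].append(row)

    records = {}
    for firm, rows in grouped.items():
        record = {"default_count": sum(row["default"] for row in rows)}
        for name in ratio_names:
            record[name] = mean(row[name] for row in rows)
        records[firm] = record
    return records


# Hypothetical firm-year rows, for illustration only.
firm_years = [
    {"firm": "A", "default": 0, "EBIT/TA": 0.50, "Sales/TA": 1.00},
    {"firm": "A", "default": 1, "EBIT/TA": 0.25, "Sales/TA": 0.50},
    {"firm": "B", "default": 0, "EBIT/TA": 0.75, "Sales/TA": 2.00},
]
for firm, record in aggregate_by_firm(firm_years, ["EBIT/TA", "Sales/TA"]).items():
    print(firm, record)
```

Using per-firm means is what reduces the 2450 firm years to one observation per firm before the outlier check.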
Two hold out samples, namely Out of Sample Period and Out of Sample, are meant for
validation of the classification accuracy of the developed model. The first sample, for
Forward Testing (In Sample Firms), consists of the financial, market and economic
information of the same firms that have been used to develop the model, beyond the
sample years. This sample contains 329 firm years. The second sample, ‘Out of
Sample’, consists of the financial, market and economic information of firms
which have not been part of the study. This sample contains 357 firm years.
After running the Poisson regression using E-Views software, the following models
are found.
For Large Sample Period
For the large sample period a seven-factor model is found to arrive at Z.
Z = 2.548741 – 15.3073EBIT/TA – 0.25864Log(TA/GNP) + 10.07975NI/TA – 0.18673OCFR – 1.06864Sales/TA – 0.09779TBD/TA – 0.68851WC/TA
For Small Sample Period
For the small sample period a six-factor model is found to arrive at Z.
Z = 1.966049 + 0.000229EBIT/INT – 12.4021EBIT/TA – 0.26898Log(TA/GNP) – 0.1032MVE/TBD + 9.011738NI/TA – 0.94743Sales/TA
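The step from Z to a probability of default can be sketched as follows. This is an illustration under two assumptions that this section does not state explicitly: that Z is the linear predictor of a Poisson regression with the usual log link, so the default intensity is exp(Z), and that the probability of default is the Poisson probability of at least one default event. The firm ratios are hypothetical.

```python
import math

# Coefficients of the large sample period model given above.
LARGE_PERIOD = {
    "const": 2.548741,
    "EBIT/TA": -15.3073,
    "Log(TA/GNP)": -0.25864,
    "NI/TA": 10.07975,
    "OCFR": -0.18673,
    "Sales/TA": -1.06864,
    "TBD/TA": -0.09779,
    "WC/TA": -0.68851,
}


def default_intensity(ratios, coeffs=LARGE_PERIOD):
    """Expected default count exp(Z), assuming a log link."""
    z = coeffs["const"] + sum(
        value * ratios[name] for name, value in coeffs.items() if name != "const"
    )
    return math.exp(z)


def probability_of_default(intensity):
    """Poisson probability of at least one default: 1 - P(N = 0)."""
    return 1.0 - math.exp(-intensity)


# Hypothetical ten-year mean ratios for one firm, for illustration only.
firm = {
    "EBIT/TA": 0.12, "Log(TA/GNP)": -2.0, "NI/TA": 0.06, "OCFR": 0.5,
    "Sales/TA": 0.9, "TBD/TA": 0.4, "WC/TA": 0.15,
}
intensity = default_intensity(firm)
print(f"intensity = {intensity:.4f}, P(default) = {probability_of_default(intensity):.4f}")
```

Under these assumptions the intensity refers to the whole estimation window, because the dependent variable is the default count over the sample period.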
Empirical Results
From Table No. 5.1.3.1.1: Case Processing Summary below, it is clear that the sample
size for the large sample period is 223 firms, of which only 191 firms are included in
the study. The sample size for the small sample period is also 223 firms, of which
only 189 firms are included in the study.

                          Large Sample Period   Small Sample Period
Sample Size (Adjusted)    223                   223
Included Observations     191                   189
Statistical Coefficients
The below Table No. 5.1.3.1.2: Statistical Coefficients presents different coefficients
on the basis of which the relative statistical robustness of the models will be assessed.
Table No. 5.1.3.1.2: Statistical Coefficients

                            Large Sample Period   Small Sample Period
R-squared                   0.557                 0.585
Adjusted R-squared          0.505                 0.535
S.E. of regression          1.423                 1.156
Sum squared residuals       344.118               224.579
Log likelihood              -200.660              -185.614
Restricted log likelihood   -370.600              -320.117
Avg. log likelihood         -1.051                -0.982
Mean dependent var.         1.141                 0.984
S.D. dependent var.         2.022                 1.696
Akaike info criterion       2.321                 2.186
Schwarz criterion           2.679                 2.547
Hannan-Quinn criterion      2.466                 2.332
LR statistic                339.881               269.007
Prob. (LR statistic)        0.000                 0.000
R-squared and Adjusted R-squared
From Table No. 5.1.3.1.2: Statistical Coefficients above, R-squared for the large
sample period is 0.557 and for the small sample period it is 0.585. This indicates
that the data fit the model developed for the small sample period better than the
model developed for the large sample period. Similarly, the Adjusted R-squared values
of 0.505 and 0.535 respectively indicate that the data fit the model developed for the
small sample period better.
Standard Errors of Regression
From Table No. 5.1.3.1.2: Statistical Coefficients above, the Standard Error of the
Regression for the large sample period is 1.423 and for the small sample period it is
1.156. This indicates that the statistical noise in the estimates is higher in the
model developed for the large sample period than in the model developed for the small
sample period. So the model developed for the small sample period estimates the
coefficients better than the model developed for the large sample period.
Sum Squared Residuals
From Table No. 5.1.3.1.2: Statistical Coefficients above, the Sum Squared Residuals
for the model developed for the large sample period is 344.118 and for the model
developed for the small sample period it is 224.579. The Sum Squared Residuals measures
the total squared distance of the data points from the fitted regression line. The
lower the residual, the better the model fits the data. So the model developed for the
small sample period fits the data better than the model developed for the large sample
period.
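The Standard Error of the Regression and the Adjusted R-squared can be recomputed from the other reported figures. The following is a minimal sketch that assumes k = 21 estimated parameters (twenty ratios plus a constant); this value is inferred from the reported statistics and is not stated in the text.

```python
import math


def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n observations and k estimated parameters."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k)


def standard_error_of_regression(sum_squared_residuals, n, k):
    """Square root of the residual sum of squares per degree of freedom."""
    return math.sqrt(sum_squared_residuals / (n - k))


K = 21  # assumption: 20 ratios plus a constant
for label, r2, ssr, n in (("large", 0.557, 344.118, 191), ("small", 0.585, 224.579, 189)):
    print(
        f"{label} sample period: "
        f"adjusted R-squared={adjusted_r_squared(r2, n, K):.3f}, "
        f"S.E. of regression={standard_error_of_regression(ssr, n, K):.3f}"
    )
```

Both statistics depend on the number of included observations, which differs between the two sample periods (191 and 189).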
Log Likelihood, Restricted Log Likelihood and Average Log Likelihood
From Table No. 5.1.3.1.2: Statistical Coefficients above, it is clear from the values
of the Log Likelihood, Restricted Log Likelihood and Average Log Likelihood that
the model developed for the small sample period has higher precision than the
model developed for the large sample period.
Standard Deviation of Dependent Variables
From Table No. 5.1.3.1.2: Statistical Coefficients above, it is clear that the standard
deviation for the model developed for the small sample period is lower than that for
the model developed for the large sample period. So on this criterion the model
developed for the small sample period has better precision than the model developed
for the large sample period.
Akaike Information Criterion (AIC)
From Table No. 5.1.3.1.2: Statistical Coefficients above, the Akaike Information
Criterion for the model developed for the large sample period is 2.321 and for the
model developed for the small sample period it is 2.186. The lower the value of the
Akaike Information Criterion, the higher the quality of the model. It is clear that the
relative quality of the model developed for the small sample period is better than that
of the model developed for the large sample period.
Schwarz Criterion
From Table No. 5.1.3.1.2: Statistical Coefficients above, the Schwarz Criterion for the
model developed for the large sample period is 2.679 and for the model developed for
the small sample period it is 2.547. The lower the value of the Schwarz Criterion, the
more efficient the model. It is clear that the model developed for the small sample
period is more efficient than the model developed for the large sample period.
Hannan-Quinn Criterion
From Table No. 5.1.3.1.2: Statistical Coefficients above, the Hannan-Quinn Criterion
for the model developed for the large sample period is 2.466 and for the model
developed for the small sample period it is 2.332. The lower the value of the
Hannan-Quinn Criterion, the higher the relative quality of the model. It is clear that
the relative quality of the model developed for the small sample period is better than
that of the model developed for the large sample period.
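The three information criteria can be recomputed from the log likelihood. The following is a minimal sketch using the per-observation forms of the criteria, again assuming k = 21 estimated parameters (twenty ratios plus a constant); that value is inferred from the reported figures and is not stated in the text.

```python
import math


def information_criteria(log_likelihood, n, k):
    """Akaike, Schwarz and Hannan-Quinn criteria, each scaled by n."""
    base = -2.0 * log_likelihood / n
    akaike = base + 2.0 * k / n
    schwarz = base + k * math.log(n) / n
    hannan_quinn = base + 2.0 * k * math.log(math.log(n)) / n
    return akaike, schwarz, hannan_quinn


K = 21  # assumption: 20 ratios plus a constant
for label, log_likelihood, n in (("large", -200.660, 191), ("small", -185.614, 189)):
    akaike, schwarz, hannan_quinn = information_criteria(log_likelihood, n, K)
    print(f"{label}: AIC={akaike:.3f}, Schwarz={schwarz:.3f}, Hannan-Quinn={hannan_quinn:.3f}")
```

All three criteria share the same log-likelihood term and differ only in the penalty applied to the number of parameters, so they rank the two models in the same order here.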
LR Statistic and Probability
The LR Statistic is used to find out whether the estimated coefficients of the
independent variables are non-zero. The null hypothesis for the test is that all slope
coefficients are equal to zero.
H0: All slope coefficients are equal to zero.
H1: At least one slope coefficient is not equal to zero.
From Table No. 5.1.3.1.2: Statistical Coefficients above, the p-values of the LR
Statistic for both the models, developed for the large sample period and the small
sample period, are lower than the critical p-value of 0.05. That means the null
hypothesis is rejected for both the models and the alternative hypothesis, that at
least one slope coefficient is not equal to zero, is accepted. The LR Statistic for the
model developed for the large sample period is 339.881 and for the model developed for
the small sample period it is 269.007. The lower the value of the LR Statistic, the
higher the precision of the model. It is clear that the precision of the model
developed for the small sample period is better than that of the model developed for
the large sample period.
Table No. 5.1.3.1.3: Variables in Equation below presents the estimated
coefficients of the independent variables, their respective standard errors, z-statistics
and p-values. The null hypothesis of the test is that the estimated coefficient is zero.
H0: The estimated coefficient of the independent variable is equal to zero.
H1: The estimated coefficient of the independent variable is not equal to zero.

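Table No. 5.1.3.1.3: Variables in Equation
(Columns 2 to 5: large sample period; columns 6 to 9: small sample period)
Variable Coefficient Std. Error z-Statistic Prob. Coefficient Std. Error z-Statistic Prob.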
Const. 2.548741 0.336 7.584 0.000 1.966049 0.388 5.064 0.000
CA/CL 0.002199 0.032 0.068 0.946 0.008934 0.040 0.222 0.825
D/E -0.00972 0.027 -0.365 0.715 -0.0175 0.012 -1.453 0.146
EBIT/INT -0.0006 0.001 -0.546 0.585 0.000229 0.000 2.231 0.026
EBIT/TA -15.3073 2.245 -6.817 0.000 -12.4021 2.856 -4.343 0.000
FAT 0.006739 0.004 1.848 0.065 0.006652 0.004 1.576 0.115
GRTA -0.01094 0.024 -0.457 0.648 -0.01011 0.024 -0.426 0.670
ITR 0.000223 0.000 1.338 0.181 0.000117 0.000 0.482 0.630
Log(TA/GNP) -0.25864 0.088 -2.943 0.003 -0.26898 0.091 -2.960 0.003
MP/BV -0.00736 0.006 -1.268 0.205 0.037539 0.023 1.631 0.103
MP/EPS 0.001724 0.002 0.996 0.319 -0.00041 0.001 -0.305 0.760
MVE/TBD -0.0008 0.001 -0.804 0.422 -0.1032 0.034 -3.075 0.002
NI/TA 10.07975 1.939 5.200 0.000 9.011738 2.886 3.123 0.002
NP/TE -0.05179 0.090 -0.576 0.564 -0.07608 0.046 -1.656 0.098
OCFR -0.18673 0.093 -2.005 0.045 -0.00459 0.116 -0.040 0.968
RE/TA -0.17723 0.119 -1.486 0.137 0.03917 0.224 0.175 0.861
SG -0.00573 0.026 -0.223 0.823 -0.00319 0.015 -0.217 0.828
SG/GNPG 0.00161 0.001 1.824 0.068 0.000456 0.000 1.061 0.289
SALES/TA -1.06864 0.198 -5.408 0.000 -0.94743 0.204 -4.654 0.000
TBD/TA -0.09779 0.050 -1.971 0.049 -0.20768 0.243 -0.856 0.392
WC/TA -0.68851 0.313 -2.203 0.028 -0.02805 0.314 -0.089 0.929

From the above Table No. 5.1.3.1.3: Variables in Equation, it is clear that eight
variables from the large sample period and six variables from the small sample period have
p-values less than the critical p-value of 0.05. That means the null hypothesis for
these variables is rejected. This signifies that these variables will be part of the
developed models along with the significant intercepts.

                Large Sample Period   Small Sample Period
Accuracy (%)    67.7                  73.8
The table above presents the in sample
classification accuracy of the developed models for the two sample periods. For the
large sample period the accuracy rate of the developed model is 67.7% with Type I
error at 35.5% and Type II error at 9.4%. For the small sample period the accuracy
rate of the developed model is 73.8% with Type I error at 31.1% and Type II error at
5.6%. The classification accuracy of the model developed for the small sample period
is higher than that of the model for the large sample period. The misclassification of
defaulting firms (Type II error) is higher, at 9.4%, for the model developed for the large
sample period than for the model for the small sample period, which is undesirable. The
high level of misclassification raises questions about the statistical robustness of the
developed models.
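The accuracy and error rates quoted above can be illustrated with a minimal sketch. It assumes, from the way the text ties the Type II error to defaulting firms, that a Type I error is a non-defaulting firm classified as defaulting and a Type II error is a defaulting firm classified as non-defaulting. The function name and the counts are hypothetical and are not taken from the study.

```python
def classification_summary(actual, predicted):
    """Return accuracy, Type I error and Type II error in percent.

    actual / predicted: sequences of 1 (defaulting) or 0 (non-defaulting).
    Assumed convention: Type I  = non-defaulting classified as defaulting,
                        Type II = defaulting classified as non-defaulting.
    """
    pairs = list(zip(actual, predicted))
    non_defaulting = [p for a, p in pairs if a == 0]
    defaulting = [p for a, p in pairs if a == 1]

    correct = sum(1 for a, p in pairs if a == p)
    type_1 = sum(1 for p in non_defaulting if p == 1)
    type_2 = sum(1 for p in defaulting if p == 0)

    return {
        "accuracy": 100.0 * correct / len(pairs),
        "type_1_error": 100.0 * type_1 / len(non_defaulting),
        "type_2_error": 100.0 * type_2 / len(defaulting),
    }


# Hypothetical sample: 10 defaulting and 30 non-defaulting firms.
actual = [1] * 10 + [0] * 30
predicted = [1] * 9 + [0] * 1 + [0] * 24 + [1] * 6
summary = classification_summary(actual, predicted)
print(summary)
```

Because the two error rates are computed on different groups, the overall accuracy depends on the share of defaulting firms in the sample as well as on the two error rates.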
Model Validation
This section presents the results of the validation of the developed models on two hold
out samples. The first sample, for the forward testing, consists of
329 firm years of the same firms but beyond the sample period. The second sample
consists of 357 firm years of out of sample firms.
Table No. 5.1.3.1.5: Forward Testing (In Sample Firms) below presents the
classification accuracy of the two developed models for the forward period. For the large
sample period the accuracy rate of the developed model is 75.8% with Type I error at
32.9% and Type II error at 5.3%. For the small sample period the accuracy rate of the
developed model is 83.1% with Type I error at 22% and Type II error at 6.3%.


Accuracy 75.8 83.1
Accuracy 74.8 83.2
Accuracy 77.3 82.9
The classification accuracy of all the models has improved in comparison to the in sample
classification results. The model developed for the small sample period has higher
classification accuracy than the model for the large sample period. As far as misclassification
is concerned, it has decreased for both the models in the forward period, and Type II errors
are on the lower side. This is a sign of the robustness of the models developed.
For the one year and two year forward periods, the classification accuracy does not deteriorate
over time as it does in other models and studies such as Bandyopadhyay (2006); rather, it
improves, as is evident from the above table, where the classification accuracy for the two
year forward period is higher than that for the one year forward period. This may be an
exceptional case, or it may indicate the robustness of the model. Further studies may confirm it.

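Table No. 5.1.3.1.6: Out of Sample Results (Validation)
Period             Measure             Large Sample Period   Small Sample Period
In Sample Period   Accuracy (%)        71.7                  83.1
Forward Period     Accuracy (%)        61.4                  61.9
Forward Period     Type I Error (%)    38.2                  33.3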

The above Table No. 5.1.3.1.6: Out of Sample Results (Validation) presents the
classification accuracy of the developed models for the out of sample firms over the
sample period. For the in sample period, the accuracy rate of the model developed for the
large sample period is 71.7% with Type I error at 31% and Type II error at 11.8%. For
the small sample period the accuracy rate of the developed model is 83.1% with Type
I error at 17.7% and Type II error at 13.3%. The classification accuracy of the model
developed for the small sample period is higher than that of the model for the large sample
period. For the large sample period, classification accuracy has improved against the in
sample classification accuracy but is lower than the forward testing results. For the model
for the small sample period too, the classification accuracy has improved in comparison to
the in sample classification accuracy and is the same as the forward testing results.
Discussion
Discussion
From the above discussion, it can be said that the statistical coefficients for the
different tests for the two models for the two samples, as well as the classification
results, clearly indicate that the model developed for the small sample period is more
robust and has higher classification accuracy than the model for the large sample period.
The findings are summarized as follows:
• From the significant variables of the models for the large sample period and the small
sample period, it is clear that accounting and economic variables are important for
predicting default in the long run, while in the short run all three categories
(accounting, market and economic variables) are required. This indicates that market
factors play an important role in default prediction, but only in the short run. In the
long run, the firm's financial health and economic conditions are what matter.
• Of the significant variables of the two models, four are common to
both the models. So these variables are the most important variables.
• From the view of the statistical robustness of the models for All Firms, the
model for the small sample period fits the data better, with less noise
in the coefficient estimates.
• From the in sample classification accuracy results and validation results, the
model for the small sample period has higher classification accuracy.
• On the basis of forward testing and out of sample validation, the classification
accuracy of both the models is higher than the in sample classification accuracy.
This sample initially has 1642 firm years of 148 large firms. The study is conducted
in two parts for two sample periods, namely the large sample period of ten years from
1st April 2004 to 31st March 2014 and the small sample period of five years from 1st
April 2009 to 31st March 2014. The default occurrences are counted for every firm over
the whole sample period. Similarly, the 10 year means and 5 year means of the selected
ratios are calculated for every firm. This reduces the sample to 148 firms. The
sample is then checked for outliers using Mahalanobis statistics, which brings the
sample size down to 147 firms. The averages of the ratios are taken as independent variables
and the default count is used as the dependent variable. On this sample a Poisson
regression is run in E-Views software.
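The collapse from firm years to one observation per firm described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field names (firm, default) and the sample values are hypothetical, and the study itself performs the estimation in E-Views.

```python
from collections import defaultdict


def aggregate_firm_years(firm_years, ratio_names):
    """Collapse firm-year rows into one row per firm.

    Each firm gets the count of default occurrences (dependent variable)
    and the period mean of every ratio (independent variables).
    """
    grouped = defaultdict(list)
    for row in firm_years:
        grouped[row["firm"]].append(row)

    firms = {}
    for firm, rows in grouped.items():
        summary = {"default_count": sum(r["default"] for r in rows)}
        for name in ratio_names:
            summary[name] = sum(r[name] for r in rows) / len(rows)
        firms[firm] = summary
    return firms


# Hypothetical firm years, for illustration only.
sample = [
    {"firm": "A", "default": 0, "Sales/TA": 1.0},
    {"firm": "A", "default": 1, "Sales/TA": 0.6},
    {"firm": "B", "default": 0, "Sales/TA": 1.5},
]
firm_level = aggregate_firm_years(sample, ["Sales/TA"])
print(firm_level)
```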
Two hold out samples, namely Out of Sample Period and Out of Sample, are used for
validation of the classification accuracy of the developed models. The first sample, for
Forward Testing (In Sample Firms), consists of the financial, market and economic
information of the same firms that have been used to develop the models, but for years
beyond the sample years. This sample contains 234 firm years. The second sample, ‘Out of
Sample’, consists of the financial, market and economic information of firms
which have not been part of the study. This sample contains 143 firm years.

following models are found.
For Large Sample Period
For the large sample period a five factor model is found to arrive at Z.
Z = 3.908105 - 12.8526 EBIT/TA - 0.59025 Log(TA/GNP) - 0.19479 MP/BV + 9.955476 NI/TA - 1.17538 Sales/TA
For Small Sample Period
For the small sample period a three factor model is found to arrive at Z.
Z = 2.197437 + 0.012139 FAT - 0.37201 MP/BV - 0.94113 Sales/TA
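As a minimal sketch of how such a Z might be turned into a default measure, the example below assumes the usual log link of a Poisson regression, so that the expected number of defaults is exp(Z) and the probability of at least one default is 1 - exp(-exp(Z)). This mapping is an assumption for illustration and is not stated in the text; the coefficients are those of the two equations above, and the ratio values are hypothetical.

```python
import math

# Coefficients of the two Large Firms models as given in the equations above.
LARGE_PERIOD = {
    "const": 3.908105,
    "EBIT/TA": -12.8526,
    "Log(TA/GNP)": -0.59025,
    "MP/BV": -0.19479,
    "NI/TA": 9.955476,
    "Sales/TA": -1.17538,
}
SMALL_PERIOD = {
    "const": 2.197437,
    "FAT": 0.012139,
    "MP/BV": -0.37201,
    "Sales/TA": -0.94113,
}


def z_score(coefficients, ratios):
    """Linear index Z = const + sum(coefficient * ratio)."""
    return coefficients["const"] + sum(
        coefficients[name] * value for name, value in ratios.items()
    )


def expected_defaults(z):
    """Assumed Poisson log link: expected default count = exp(Z)."""
    return math.exp(z)


def prob_at_least_one_default(z):
    """P(Y >= 1) = 1 - exp(-lambda) for a Poisson count Y with mean lambda."""
    return 1.0 - math.exp(-expected_defaults(z))


# Hypothetical firm-level averages, for illustration only.
example_ratios = {"FAT": 2.0, "MP/BV": 1.5, "Sales/TA": 0.8}
z = z_score(SMALL_PERIOD, example_ratios)
print(round(z, 6), expected_defaults(z), prob_at_least_one_default(z))
```

The negative coefficients on MP/BV and Sales/TA mean that, under this reading, a higher market valuation or asset turnover lowers the expected default count.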
Empirical Results
From the below Table No. 5.1.3.2.1: Case Processing Summary, it is clear that the
sample size for the large sample period is 147 firms, of which only 138 firms
are included in the estimation. The sample size for the small sample period is 146 firms,
of which only 133 firms are included in the estimation.

Table No. 5.1.3.2.1: Case Processing Summary
                         Large Sample Period   Small Sample Period
Sample Size (Adjusted)   147                   146
Included Observations    138                   133
The below Table No. 5.1.3.2.2: Statistical Coefficients presents different coefficients
on the basis of which the relative statistical robustness of the models will be assessed.

Table No. 5.1.3.2.2: Statistical Coefficients
Statistic Large Sample Period Small Sample Period
R-squared 0.685 0.616
Adjusted R-squared 0.632 0.548
S.E. of regression 1.214 0.990
Sum squared residuals 172.451 109.735
Log likelihood -132.298 -109.549
Restricted log likelihood -263.149 -203.816
Avg. log likelihood -0.959 -0.824
Mean dependent var. 1.116 0.880
S.D. dependent var. 2.000 1.472
Akaike info criterion 2.222 1.963
Schwarz criterion 2.667 2.420
Hannan-Quinn criterion 2.403 2.149
LR statistic 261.703 188.534
Prob. (LR statistic) 0.000 0.000
From the above Table No. 5.1.3.2.2: Statistical Coefficients, R-squared for the large
sample period is 0.685 and for the small sample period it is 0.616. This clearly indicates
that the data fit the model developed for the large sample period better than the
model developed for the small sample period. Similarly, from the Adjusted R-squared
it is clear that the data fit the model developed for the large sample period better.
Also, these values are higher than the corresponding values for the All Firms.
From the above Table No. 5.1.3.2.2: Statistical Coefficients, the Standard Error of the
Regression for the large sample period is 1.214 and for the small sample period it is
0.990. This clearly indicates that the statistical noise in the estimates is higher in the model
developed for the large sample period than in the model developed for the small sample
period. So the model developed for the small sample period has better estimated
coefficients than the model developed for the large sample period. Also, these results
are comparatively better than the results for the All Firms.

From the above Table No. 5.1.3.2.2: Statistical Coefficients, the Sum of Squared Residuals
for the model developed for the large sample period is 172.451 and for the model developed
for the small sample period it is 109.735. The Sum of Squared Residuals is a measure of the
distance of the fitted model from the data. So the model developed for the small sample
period fits the data better than the model developed for the large sample period. These
results are better than the results for the All Firms.
From the above Table No. 5.1.3.2.2: Statistical Coefficients, it is clear from the values
of the Log Likelihood, Restricted Log Likelihood and Average Log Likelihood that
the model developed for the small sample period has higher precision than the
model developed for the large sample period.
From the above Table No. 5.1.3.2.2: Statistical Coefficients, it is clear that the standard
deviation for the model developed for the small sample period is lower than that of the
model developed for the large sample period. So on this criterion the model developed
for the small sample period has better precision than the model developed for
the large sample period. Also, the results for the Large Firms have improved over those
for the All Firms.
From the above Table No. 5.1.3.2.2: Statistical Coefficients, the Akaike Information
Criterion for the model developed for the large sample period is 2.222 and for the
model developed for the small sample period it is 1.963. The lower the value of the Akaike
Information Criterion, the higher the relative quality of the model. It is clear that the
relative quality of the model developed for the small sample period is better than that of
the model developed for the large sample period. These values are lower than the values for
the All Firms, indicating an improvement in the models.

Schwarz Criterion
From the above Table No. 5.1.3.2.2: Statistical Coefficients, the Schwarz Criterion for the
model developed for the large sample period is 2.667 and for the model developed for
the small sample period it is 2.420. The lower the value of the Schwarz Criterion, the more
efficient the model. It is clear that the model developed for the small sample period
is more efficient than the model developed for the large sample period. The
results for the models for the Large Firms are better than those for the All Firms.
From the above Table No. 5.1.3.2.2: Statistical Coefficients, the Hannan-Quinn Criterion
for the model developed for the large sample period is 2.403 and for the model
developed for the small sample period it is 2.149. The lower the value of the Hannan-Quinn
Criterion, the higher the relative quality of the model. It is clear that the relative quality of
the model developed for the small sample period is better than that of the model
developed for the large sample period. The results for the models for the Large Firms
are better than those for the All Firms.
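The three information criteria in Table No. 5.1.3.2.2 can be cross-checked from the reported log likelihoods. The sketch below assumes the per-observation form of the criteria (each term divided by the number of observations n) and assumes k = 21 estimated parameters, that is, the intercept plus the twenty ratios listed in the Variables in Equation table. Both are assumptions for illustration rather than statements of the study's procedure.

```python
import math


def information_criteria(log_likelihood, n_obs, n_params):
    """Per-observation Akaike, Schwarz and Hannan-Quinn criteria."""
    base = -2.0 * log_likelihood / n_obs
    aic = base + 2.0 * n_params / n_obs
    schwarz = base + n_params * math.log(n_obs) / n_obs
    hannan_quinn = base + 2.0 * n_params * math.log(math.log(n_obs)) / n_obs
    return aic, schwarz, hannan_quinn


K = 21  # assumed: intercept + 20 ratios

# Log likelihood and included observations from Tables 5.1.3.2.1 and 5.1.3.2.2.
large = information_criteria(-132.298, 138, K)
small = information_criteria(-109.549, 133, K)

print([round(v, 3) for v in large])
print([round(v, 3) for v in small])
```

Under these assumptions the computed values agree with the tabulated Akaike, Schwarz and Hannan-Quinn criteria to three decimals, and all three penalise the same fit term with increasingly heavy charges for the number of parameters.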
From the above Table No. 5.1.3.2.2: Statistical Coefficients, the p-values for the LR
Statistic for both the models developed for the large sample period and the small sample
period are lower than the critical p-value of 0.05. That means the null hypothesis
is rejected for both the models and the alternative hypothesis that not all slope
coefficients are equal to zero is accepted. The LR Statistic for the model developed for
the large sample period is 261.703 and for the model developed for the small sample
period it is 188.534. The lower the value of the LR Statistic, the higher the precision of the
model. It is clear that the precision of the model developed for the small sample period is
better than that of the model developed for the large sample period. The results for the
models developed for the Large Firms have improved in comparison to the All Firms.
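The LR Statistic and its p-value can be reproduced from the log likelihood and restricted log likelihood in Table No. 5.1.3.2.2. The sketch below uses LR = 2 (log L - restricted log L) and a chi-square distribution whose degrees of freedom equal the number of slope coefficients, assumed here to be 20. The closed-form tail probability used is valid only for even degrees of freedom; the function names are illustrative.

```python
import math


def lr_statistic(log_likelihood, restricted_log_likelihood):
    """Likelihood ratio statistic for H0: all slope coefficients are zero."""
    return 2.0 * (log_likelihood - restricted_log_likelihood)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


DF = 20  # assumed: number of slope coefficients

lr_large = lr_statistic(-132.298, -263.149)
lr_small = lr_statistic(-109.549, -203.816)
p_large = chi2_sf_even_df(lr_large, DF)
p_small = chi2_sf_even_df(lr_small, DF)

print(round(lr_large, 3), p_large)
print(round(lr_small, 3), p_small)
```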
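The below Table No. 5.1.3.2.3: Variables in Equation presents the estimated
coefficients of the independent variables, their respective standard errors, z-statistics
and p-values.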

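Table No. 5.1.3.2.3: Variables in Equation
(Columns 2 to 5: large sample period; columns 6 to 9: small sample period)
Variable Coefficient Std. Error z-Statistic Prob. Coefficient Std. Error z-Statistic Prob.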
Const. 3.908105 0.575 6.792 0.000 2.197437 0.679 3.237 0.001
CA/CL -0.006568 0.045 -0.146 0.884 -0.024062 0.079 -0.305 0.761
D/E 0.024696 0.037 0.676 0.499 -0.013391 0.035 -0.388 0.698
EBIT/INT -0.000469 0.001 -0.527 0.598 -0.001805 0.004 -0.504 0.614
EBIT/TA -12.85257 3.705 -3.469 0.001 8.700209 8.342 1.043 0.297
FAT 0.011945 0.007 1.822 0.068 0.012139 0.006 1.972 0.049
GRTA -0.015279 0.034 -0.447 0.655 0.097814 1.137 0.086 0.932
ITR 0.000284 0.001 0.470 0.638 -0.000002 0.000 -0.005 0.996
Log(TA/GNP) -0.590252 0.174 -3.383 0.001 -0.373514 0.205 -1.823 0.068
MP/BV -0.194790 0.079 -2.477 0.013 -0.372006 0.159 -2.336 0.020
MP/EPS -0.006847 0.005 -1.280 0.201 0.001354 0.002 0.571 0.568
MVE/TBD -0.000714 0.002 -0.311 0.756 -0.012371 0.027 -0.463 0.643
NI/TA 9.955476 3.847 2.588 0.010 -10.70547 9.112 -1.175 0.240
NP/TE 0.085442 0.140 0.608 0.543 -0.071570 0.134 -0.533 0.594
OCFR -0.183406 0.127 -1.447 0.148 -0.046706 0.244 -0.191 0.848
RE/TA -0.153841 0.303 -0.508 0.612 -0.550311 0.349 -1.579 0.114
SG -0.128043 0.163 -0.785 0.433 -1.410894 0.887 -1.590 0.112
SG/GNPG 0.000673 0.001 0.675 0.500 -0.001460 0.001 -1.507 0.132
SALES/TA -1.175382 0.261 -4.496 0.000 -0.941125 0.311 -3.022 0.003
TBD/TA -0.109607 0.079 -1.389 0.165 0.361373 0.421 0.859 0.390
WC/TA -0.697146 0.398 -1.753 0.080 0.337056 0.490 0.688 0.492

From the above Table No. 5.1.3.2.3: Variables in Equation, it is clear that five
variables from the large sample period and three variables from the small sample period
have p-values less than the critical p-value of 0.05. That means the null hypothesis
for these variables is rejected. This signifies that these variables will be part of the
developed models along with the significant intercepts.
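The link between the reported z-statistics and p-values can be sketched with the standard normal distribution, under the assumption that the Prob. column is the two-sided p-value of the z-statistic. The rows used below are a subset of the small sample period columns of Table No. 5.1.3.2.3, and the function name is illustrative.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value of a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# z-statistics for the small sample period, taken from Table No. 5.1.3.2.3.
z_statistics = {
    "EBIT/TA": 1.043,
    "FAT": 1.972,
    "Log(TA/GNP)": -1.823,
    "MP/BV": -2.336,
    "SALES/TA": -3.022,
}

p_values = {name: two_sided_p_value(z) for name, z in z_statistics.items()}
significant = sorted(name for name, p in p_values.items() if p < 0.05)

for name, p in p_values.items():
    print(f"{name:12s} p = {p:.3f}")
print("significant at 0.05:", significant)
```

FAT sits very close to the 0.05 threshold, so its inclusion in the model is sensitive to the chosen critical value.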

                Large Sample Period   Small Sample Period
Accuracy (%)    73.1                  77.4
The table above presents the in sample
classification accuracy of the developed models for the two sample periods. For the ...
10.4%. The classification accuracy of the model developed for the small sample
period is higher than that of the model for the large sample period. Also, the in sample
classification accuracy for the Large Firms is higher than that of the model developed for
the All Firms.
Model Validation


Accuracy 88.0 94.0
Accuracy 89.8 95.7
Accuracy 86.2 92.2
The above Table No. 5.1.3.2.5: Forward Testing (In Sample Firms) presents the
classification accuracy of the two developed models for the forward period. For the ...
6.3%. The classification accuracy of all the models has improved in comparison to the
in sample classification results as well as the models developed for the All Firms. The
classification accuracy of the model developed for the small sample period is higher
than that of the model for the large sample period. As far as misclassification is concerned,
it has decreased for both the models in the forward period. This is a positive sign.
For the one year and two year forward periods, the classification accuracy of the model
developed for the small sample period is higher than that of the model for the large sample
period. The classification accuracy deteriorates over time, as is evident from the above
table, where the classification accuracy for the two year forward period is lower than that
for the one year forward period.


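Table No. 5.1.3.2.6: Out of Sample Results (Validation)
Period             Measure             Large Sample Period   Small Sample Period
In Sample Period   Accuracy (%)        75.5                  86.0
Forward Period     Accuracy (%)        62.5                  66.7
Forward Period     Type I Error (%)    16.7                  11.1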
The above Table No. 5.1.3.2.6: Out of Sample Results (Validation) presents the
classification accuracy of the developed models for the out of sample firms over the
sample period. For the In Sample Period, the accuracy rate of the model developed for the
large sample period is 75.5% with Type I error at 20.3% and Type II error at
50.0%. For the small sample period the accuracy rate of the developed model is
86.0% with Type I error at 8.9% and Type II error at 45.0%. The classification
accuracy of the model developed for the small sample period is higher than that of the
model for the large sample period. For the large sample period, classification accuracy has
improved against the in sample classification accuracy but is lower than the forward testing
results. For the model for the small sample period too, classification accuracy has improved
in comparison to the in sample classification accuracy but is lower than the forward testing
results. Also, the results for the out of sample firms for the Large Firms have improved in
comparison to the classification accuracy for the All Firms. For the forward period,
classification accuracy has fallen for both the models. As far as the misclassification of
defaulting firms is concerned, it is on the higher side for both the models.
Discussion
From the above discussion, it can be said that the statistical coefficients for the
different tests for the two developed models for the two samples, as well as the
classification results, clearly indicate that the model developed for the small sample
period for the Large Firms has higher robustness and accuracy. The findings are
summarized as follows:
• For the model for the Large Firms for the large sample period, all three categories
of variables are significant, while for the model for the small sample period
accounting as well as market variables are important.
• In neither model is a debt component directly involved.
• From the view of the statistical robustness of the models developed for the Large
Firms, it is clear that all the tests and coefficients reported for the two regressions,
apart from the R-squared and Adjusted R-squared, strongly indicate that the
model developed for the small sample period has higher statistical robustness
than the model developed for the large sample period.
• From the in sample classification accuracy results and validation results, it is clear
that the model developed for the small sample period has higher classification
accuracy than the model developed for the large sample period.
• On the basis of the classification accuracy from forward testing and out of
sample validation, it is clear that the classification accuracy of both
the developed models improves over the in sample classification accuracy.
5.1.3.3. Findings from the Reduced Form Model
From the above discussion for the four developed models using Poisson Regression
method, the findings can be summarized under four headings namely Statistical
Robustness, Classification Accuracy and Limitations of the Developed Models and
Further Scope.
• For All Firms, accounting and economic variables are significant for the model
for the large sample period, and all three categories of variables for the small sample
period.
• For Large Firms, all three categories of variables are significant for the model
for the large sample period, and accounting and market variables are
significant for the small sample period.
From the statistical coefficients reported for the regressions, it is clear that, among all
the models, statistical robustness is the highest for the model developed for the
Large Firms for the small sample period.
The classification accuracy of the developed models has been gauged on three
classification accuracies, namely in sample classification accuracy, forward period
classification accuracy for the in sample firms, and classification accuracy of out of
sample firms for the in sample period as well as for the forward period.
• In sample, forward and out of sample classification accuracy is the highest for
the model for the small sample period for Large Firms.
• The developed models face the problem of high Type I and Type II errors.
• There is a need to carry out a separate study for small and medium enterprises, as this
study could not do so because of the small data set available for them.

Conclusion
Considering the limitations, and on the basis of the statistical robustness and the
classification accuracies of the models, the model for the small sample period for the
Large Firms fits the data better, with less noise in the coefficient estimates and a
high level of classification accuracy. On forward and out of sample testing, the
classification accuracy improves.
The total number of firm years for the 225 firms is 2449. For the large firms, this number is
1642 firm years for 148 firms. For public sector undertakings, the total is
238 firm years for 20 public sector undertakings. And for the small and medium enterprises,
the total is 569 firm years for 57 firms. But because of inconsistency in the data, a large
number of firm years have been dropped from the sample for the structural model based on the
Black, Scholes and Merton (1973) model. After refinement of the data, the total number
of firm years dropped to 1473.
Table No. 5.1.4.1: Prior Probabilities for Groups
                  All Firms       Large Firms     Large Firms with PSU   SME
                  Cases  Prob.    Cases  Prob.    Cases  Prob.           Cases  Prob.
Non-Defaulting    980    66.5     721    66.7     812    69.2            168    56.2
Defaulting        493    33.5     360    33.3     362    30.8            131    43.8
Total             1473   100      1081   100      1174   100             299    100
From the above Table No. 5.1.4.1: Prior Probabilities for Groups, it is clear that for
All Firms there are 1473 cases, of which 33.5% are defaulting. For the sample of
Large Firms, there are 1081 cases, of which 33.3% are defaulting. For the
sample of Large Firms with PSU, there are 1174 cases, of which 30.8% are
defaulting. And for the sample of SMEs, there are 299 cases, of which 43.8%
are defaulting.
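The prior probabilities are simply the group shares of the cases in each sample. A minimal sketch of the arithmetic, using the case counts from Table No. 5.1.4.1 (the function name is illustrative):

```python
def prior_probabilities(non_defaulting, defaulting):
    """Group shares in percent, rounded to one decimal place."""
    total = non_defaulting + defaulting
    return (
        round(100.0 * non_defaulting / total, 1),
        round(100.0 * defaulting / total, 1),
    )


# (non-defaulting cases, defaulting cases) per sample, from Table No. 5.1.4.1.
cases = {
    "All Firms": (980, 493),
    "Large Firms": (721, 360),
    "Large Firms with PSU": (812, 362),
    "SME": (168, 131),
}

priors = {name: prior_probabilities(*counts) for name, counts in cases.items()}
for name, (p_non_default, p_default) in priors.items():
    print(f"{name:22s} non-defaulting {p_non_default}%  defaulting {p_default}%")
```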

D2 and Default Probabilities
Table No. 5.1.4.2: D2 and Default Probabilities
                  All Firms              Large Firms            Large Firms with PSU    SME
                  Avg. D2   Avg. Prob.   Avg. D2   Avg. Prob.   Avg. D2    Avg. Prob.   Avg. D2    Avg. Prob.
Non-Defaulting    14165.2   27           12346.4   33           13626.25   28           15657.16   88
Defaulting        -569.59   82           -391.165  79           -382.424   78           -961.35    88
Total             10131.6   42           8573.692  47           9972.682   41           10543.77   43
From Table No. 5.1.4.2: D2 and Default Probabilities, it is clear that the average D2 for
all the firms is 10131.6 with average probability of default at 42%. The average D2 for
the large firms is 8573.7 with average probability of default at 47%. The average D2 for
the large firms with PSU is 9972.7 with average probability of default at 41%. And the
average D2 for the SMEs is 10543.8 with average probability of default at 43%.
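For orientation, the textbook Merton relation between D2 and the probability of default can be sketched as below. This is a generic illustration under stated assumptions, not the study's exact implementation: asset value, debt, drift, asset volatility and horizon are hypothetical inputs, and the default probability is taken as N(-D2), where N is the standard normal distribution function.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_d2(asset_value, debt, drift, asset_vol, horizon=1.0):
    """D2 of the Merton model: standardized distance of assets from debt."""
    numerator = math.log(asset_value / debt) + (drift - 0.5 * asset_vol ** 2) * horizon
    return numerator / (asset_vol * math.sqrt(horizon))


def default_probability(d2):
    """Probability that assets fall below debt at the horizon: N(-D2)."""
    return norm_cdf(-d2)


# Hypothetical inputs, for illustration only.
d2_safe = merton_d2(asset_value=150.0, debt=100.0, drift=0.05, asset_vol=0.25)
d2_weak = merton_d2(asset_value=90.0, debt=100.0, drift=0.05, asset_vol=0.25)
print(d2_safe, default_probability(d2_safe))
print(d2_weak, default_probability(d2_weak))
```

A negative D2, as reported on average for the defaulting group, corresponds to a default probability above 50% under this relation, while a large positive D2 corresponds to a probability close to zero.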
Table No. 5.1.4.3: Classification Accuracy
                All Firms   Large Firms   Large Firms with PSU   SME
Accuracy        75.4        69.4          72.8                   82.3
Type I Error    27.6        34.8          29.5                   22.2
Type II Error   16.7        20.8          20.8                   7.6
From the above Table No. 5.1.4.3: Classification Accuracy, it is clear that the
classification accuracy of the structural model is not very high. The highest
classification accuracy, 82.3%, is for the sample of SMEs. Type I and Type
II errors are also the lowest for SMEs. For Large Firms, the classification
accuracy is the lowest at 69.4%, with the highest Type I and Type II errors.

Discussion
From Table No. 5.1.4.1: Prior Probabilities for Groups, it is clear that the proportion of
defaulting firms in the sample, at 33.5% of cases, is high compared with other studies
(Kulkarni, Mishra, & Thakkar, 2008). From the classification results, it is clear that
classification accuracy is not very high for the structural model for all the samples when
compared with other studies. The highest classification accuracy is found to be 82.3%,
which is lower than in other studies (Bandyopadhyay A., 2007). Also, the misclassification
is very high for the samples. The reason for this may be the presence of willful defaulters in
the sample. This may be the cause of the Type II errors.
5.1.5. Inferences from the Study
From the above discussion of the four broad samples and all the models developed
using the four methods, the findings are as follows:
• For the Indian sample, in the long term, for all models and all methods, only accounting
and economic variables are significant, while for the short term models all the
categories of variables are significant. This signifies that in the long run market
information about the firm is not important, but in the short run market dynamics have an
impact on default prediction.
• For the Indian sample, in the long term, for all models and all methods, besides the
amount of debt with respect to total assets, no other variable with a debt
component is significant for default prediction, but in the short run most of the
variables having a debt component are significant. That means that in the short run all
forms of debt are important for default prediction.
• For MDA models, all the significant variables for the large sample period are
significant for the small sample period. This indicates that in the long run firms tend to
follow a similar pattern, while in the short run they follow different patterns in the
context of default prediction. And in the short run, or with a change in asset size,
additional variables are required for the additional risk.
• For the MDA and logit models, the distinction between matched and unmatched
samples is irrelevant.
• For SMEs, only accounting and market variables are significant for both the
sample periods. Economic variables seem to have no role in default prediction
for SMEs.
• For the logistic regression method, the categorical variable Y is significant for all the
models, but X is significant for only one model.
• Unlike the MDA models, the logit models for the two sample periods have almost
entirely different sets of variables.
• For every model except the logit models, the intercept is significant.
• For the MDA and logit methods, the models for the small sample period are more
robust.
• In case of the reduced form models, the models developed for the small sample period
are more robust.
• In case of MDA, log determinants and Wilks’ Lambdas are higher for the large
sample period. It seems that the log determinant and Wilks’ Lambda depend
on the size of the data set.
almost double of the log determinant of the defaulting group. This indicates that
the developed models can discriminate non-defaulting firms better than the
defaulting group.
• For Logit models, as per the Omnibus test and Model Summary, all the developed
models are robust, and on the basis of these criteria, interpreted separately, it
can be said that the models developed for the small sample periods have the
highest level of robustness.
• For Logit models, most of the specified models fit the data.
• The highest classification accuracy, 97.7%, is found for the model for the large sample
period for All Firms using logistic regression analysis.
• Logit models have the highest classification accuracy among all the methods, at 97.7%.
• Models developed for the large sample period have the highest classification accuracy
for all the methods. However, models developed for the small sample period are
more robust.
• The presence of public sector undertakings in the sample does not have any impact,
as the classification results are comparable to the classification results of the large
firms.
• When it comes to forward testing of in sample firms as well as out of sample
firms, the classification accuracy is found to be higher for all the models using
all the methods.
• The classification accuracies of the developed models are found to deteriorate
over time (Altman E. I., 2005).
• In case of MDA models, the highest classification accuracy is found for the Large
Firms with PSU at 92.1%, and the models developed for the Unmatched Sample
have higher prediction accuracy than those for the Matched Sample. This is
against the assumption of many studies such as (Altman E., 1968), (Beaver W.,
1966) and (Gupta V., 2014).
• In case of the Logit method, on the basis of classification results, there is no
distinction between the matched and unmatched samples. They have almost the same
results.
• For reduced form models, the highest classification accuracy is found for Large
Firms for the small sample period at 77.4%, although for the forward period and out
of sample the accuracy improves, up to 94%.
• In case of the structural model, the highest classification accuracy is found to be
82.3% for SMEs, which is lower than in other studies (Bandyopadhyay A., 2007).
• Misclassification is very high for all the models developed using all the four
methods.
5.1.6 Conclusion
From the above analysis of all the models using the four methods, it can be said that
the different methods find different sets of variables significant for the model
development with varying statistical robustness and classification accuracies.
However, it can be concluded that the presence of all the three categories of variables
in the model normally improves the prediction accuracy of the models. Considering
the statistical robustness of the developed models, it is found in general that the
models developed for the small sample period have higher statistical robustness. As
far as the classification accuracies of the models are concerned, it can be concluded
that the models developed for the large sample period have higher classification
accuracy, except for the reduced form model. On comparison of the classification accuracies,
logit models have the highest classification accuracy, at 97.7%, ahead of the
discriminant models (Gupta V. , 2014), the structural model (Sharma, Singh, &
Upadhyay, 2014) and the reduced form models. The models developed for the small
sample period are more robust for all three methods: MDA, Logit and reduced form.
Over time, classification accuracy deteriorates for the developed models.
From the results, it is clear that the models developed using MDA and Logit methods
have the highest classification accuracies up to 97.7% for logit model. These results are
comparable to many studies like Altman (1968), Ohlson (1980), Altman & Narayanan
(1997), Sen Chaudhury (1999), Altman (2000), Bandyopadhyay (2006), Jayadev (2006),
Agarwal & Taffler (2008) and Gupta (2014) etc. As far as the accuracies of the reduced
form model and structural model are concerned, the results of the study are found less
efficient than many studies like Patel & Pereira (2005), Bandyopadhyay (2007), Agarwal
& Taffler (2008), Kulkarni, Mishra, & Thakkar (2008), Duan, Sun, & Wang (2011) and
Sharma, Singh, & Upadhyay (2014) etc. For the newly developed models using the reduced
form method, the statistical robustness as well as the classification accuracy are highest for the
model developed for the small sample period.
5.1.7. Limitations and Further Scope for Studies
 The models developed using the MDA and Logit methods face the problem of
high Type I and Type II errors, of which the Type II error is the more problematic. Is
the high Type II error caused by the inclusion of willful defaulters in the study?
This needs to be answered.
 The model developed using the reduced form method faces the problem of high
Type I error. There is a need to investigate further why non-defaulting firms are
being predicted as defaulting. Is it an indication of financial distress in these firms,
some other indication, or a problem with the model? These questions need
to be answered and are a scope for further studies.
 It is quite possible that a few relevant variables have accidentally been left out
of the study. So there is a need to include other variables
which may provide better results.
 The study has used only accounting, market and economic variables. There is a
need to include qualitative as well as categorical variables in the study. This
may improve the results.
 From the sampling, it is clear that there is a problem of sampling bias and selection bias in
the sample, which cannot be avoided in the case of credit risk models.
 There is further scope for study on a larger data set.
 This study is carried out on a five-year data set and a ten-year data set.
The study indicates that the more robust model is found for the small sample period. But
from the viewpoint of classification accuracy, the models for the large sample
period have higher efficiency. This is the biggest limitation of the study.
 There is further scope for study with larger sample periods.
 There is further scope for sectoral analysis to find out effect of sectors.
same.
SECTION II: AMERICAN SAMPLE
American sample. This study examines four methods namely multiple discriminant
default in American context. The results and findings of the study are arranged into
four parts. Part I consists of the results of study using multiple discriminant analysis.
Part II contains the results and findings of the study using logistic regression analysis. Part III
presents the results and findings of the study using the reduced form model with Poisson
regression. And Part IV presents and discusses the results of the study using the Merton
distance to default model.
Multiple discriminant analysis has been used to arrive at a credit score. The empirical
results of the study have been presented and discussed in this section. The sample size
of the study is 250 firm years for 50 firms over a period of five years from 1st January
2012 to 31st December 2016. This initial sample of 250 firm years was
refined to 229 firm years. After checking for outliers using Mahalanobis statistics, the
sample was brought down to 206 firm years for the sample period of five years
from 2012 to 2016, of which 166 firm years were found to be valid. On this sample,
multiple discriminant analysis is run using SPSS software. One hold-out sample,
namely the Out of Sample, is meant for validation of the classification accuracy of the
developed model. This ‘Out of Sample’ consists of the financial and market
information of firms which have not been part of the study. It contains
32 firm years.
12 factors model was found.
Z = - 5.271 – 14.367EBIT/TA + 0.009MVE/TBD + 0.8Sales/TA + 16.751NI/TA –
2.045TBD/TA – 0.02EBIT/INT + 0.037OCFR + 0.985GRTA - 0.002FAT +
0.001MP/EPS + 1.219Log (TA/GNP) + 1.466SG
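As an illustration of how such a discriminant function is applied, the following minimal Python sketch scores a single firm year. The coefficients are copied from the function reported above; the -0.02 term is keyed to EBIT/INT, as listed in Table No 5.2.1.10. The function name, the treatment of missing ratios as zero and the example firm year are assumptions made only for this sketch and are not taken from the sample.

```python
# Coefficients of the reported discriminant function for the American sample.
# The -0.02 term is keyed to EBIT/INT, as listed in Table No 5.2.1.10.
Z_COEFFICIENTS = {
    "EBIT/TA": -14.367,
    "MVE/TBD": 0.009,
    "Sales/TA": 0.8,
    "NI/TA": 16.751,
    "TBD/TA": -2.045,
    "EBIT/INT": -0.02,
    "OCFR": 0.037,
    "GRTA": 0.985,
    "FAT": -0.002,
    "MP/EPS": 0.001,
    "Log(TA/GNP)": 1.219,
    "SG": 1.466,
}
Z_CONSTANT = -5.271


def z_score(ratios):
    """Return the discriminant score for one firm year.

    `ratios` maps ratio names to values. Ratios that are not supplied are
    treated as zero, an assumption made only to keep the sketch short.
    """
    return Z_CONSTANT + sum(
        coef * ratios.get(name, 0.0) for name, coef in Z_COEFFICIENTS.items()
    )


# Hypothetical firm year, not taken from the sample.
example = {
    "EBIT/TA": 0.10,
    "NI/TA": 0.08,
    "Sales/TA": 1.0,
    "TBD/TA": 0.3,
    "Log(TA/GNP)": 4.5,
}
print(round(z_score(example), 3))
```

The score on its own does not classify a firm; it has to be compared with the dividing point between the two group centroids.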
Empirical Results
In this section, the empirical results of the different tests and the classification accuracy of
the developed model are reported, discussed and compared with other studies.
From the following Table No 5.2.1.1: Analysis Case Processing Summary, it is clear
that, of a total of 206 firm years, 166 valid firm years have been used to develop the Z score
model using multiple discriminant analysis and 40 firm years have been excluded.
Table No. 5.2.1.1: Analysis Case Processing Summary
Cases Percent
Valid 166 80.6
Excluded 40 19.4
Total 206 100
Group Statistics
From the below Table No 5.2.1.2: Group Statistics, it is clear that the means of the selected
ratios of the two groups, which have been used for the development of the model, are
quite different from each other. The means of the ratios of the non-defaulting group are far
healthier than those of the defaulting group. The table shows a stark difference in the
financial performance of the two groups. The same trend is found in the
standard deviations of the two groups.
Table No 5.2.1.2: Group Statistics
Variable Non-Defaulting (Mean, Std. Dev) Defaulting (Mean, Std. Dev) Total (Mean, Std. Dev)
WC/TA 0.119 0.165 0.038 0.321 0.106 0.200
RE/TA 171.843 920.621 772.522 2081.691 269.544 1200.474
EBIT/TA 0.108 0.120 -0.062 0.364 0.081 0.192
MVE/TBD 4.533 3.650 31.453 94.002 8.912 38.766
Sales/TA 1.035 0.798 0.591 0.286 0.963 0.756
CA/CL 1.882 2.386 1.735 1.340 1.858 2.247
NI/TA 0.082 0.104 -0.081 0.333 0.055 0.173
NP/TE 0.321 1.324 -0.066 3.895 0.258 1.969
TBD/TA 0.270 0.132 0.373 0.237 0.287 0.158
EBIT/INT 12.514 8.303 7.193 18.157 11.649 10.653
OCFR 0.321 1.746 -1.438 5.179 0.035 2.684
GRTA 0.024 0.128 -0.119 0.310 0.001 0.178
ITR 27.239 46.028 49.160 99.156 30.805 58.198
FAT 9.421 14.392 3.460 3.198 8.451 13.406
MP/EPS 14.780 9.501 -258.657 1090.347 -29.695 444.584
MP/BV 6.729 20.058 94.812 339.373 21.056 139.815
D/E 4.251 12.374 2.342 11.506 3.941 12.224
Log(TA/GNP) 4.646 0.653 3.008 1.532 4.379 1.046
SG 0.009 0.143 -0.073 0.364 -0.004 0.197
SG/GNPG 0.057 7.920 -1.545 9.806 -0.203 8.244
group mean.
H0: There is no significant difference between the groups on each of the
independent variables group mean.
Table No 5.2.1.3: Tests of Equality of Group Means
Wilks' Lambda F df1 df2 Sig.

WC/TA 0.977 3.821 1 164 0.052
RE/TA 0.966 5.826 1 164 0.017
EBIT/TA 0.893 19.731 1 164 0.0
MVE/TBD 0.934 11.603 1 164 0.001
Sales/TA 0.953 8.109 1 164 0.005
CA/CL 0.999 0.097 1 164 0.756
NI/TA 0.88 22.461 1 164 0.0
NP/TE 0.995 0.874 1 164 0.351
TBD/TA 0.941 10.32 1 164 0.002
EBIT/INT 0.966 5.805 1 164 0.017
OCFR 0.941 10.264 1 164 0.002
GRTA 0.911 15.934 1 164 0.0
ITR 0.981 3.251 1 164 0.073
FAT 0.973 4.567 1 164 0.034
MP/EPS 0.948 8.965 1 164 0.003
MP/BV 0.946 9.432 1 164 0.002
D/E 0.997 0.55 1 164 0.46
Log(TA/GNP) 0.664 83.049 1 164 0
SG 0.976 3.991 1 164 0.047
SG/GNPG 0.995 0.853 1 164 0.357
From the above Table No 5.2.1.3: Tests of Equality of Group Means it is clear that
fourteen variables have p-values less than the critical p-value of 0.05 by F test. So the
null hypothesis, that there is no significant difference between the groups on each of
the independent variables group mean, is rejected for these variables and the alternative
hypothesis, that there is a significant difference between the groups on each of the
independent variables group mean, is accepted. So these fourteen independent variables are
significant for the study. But the problem with these significant variables is that the
Wilk’s Lambdas of these significant variables are on the higher side, which is undesirable,
as a smaller Wilk’s Lambda is better for the analysis.
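The link between the F statistics and the Wilks' Lambdas in Table No 5.2.1.3 can be illustrated with a short Python sketch. For a test on a single variable, Wilks' Lambda is the ratio of the within-group to the total sum of squares, which equals 1 / (1 + F x df1 / df2). The function name is illustrative, and the three variables are chosen only as examples.

```python
def wilks_lambda_from_f(f_stat, df1, df2):
    """Univariate Wilks' Lambda implied by a one-way ANOVA F statistic."""
    return 1.0 / (1.0 + f_stat * df1 / df2)


# (variable, F) pairs copied from Table No 5.2.1.3; df1 = 1 and df2 = 164.
reported_f = [("Log(TA/GNP)", 83.049), ("NI/TA", 22.461), ("CA/CL", 0.097)]

for name, f_stat in reported_f:
    print(name, round(wilks_lambda_from_f(f_stat, 1, 164), 3))
```

A large F statistic therefore corresponds to a small Wilks' Lambda, which is why Log(TA/GNP) has the lowest Lambda in the table.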
Log Determinants
The below Table No 5.2.1.4: Log Determinants indicates that the log determinant for
the whole sample pooled within group is found to be 37.05. It is 17.535 and 28.342
for the non-defaulting group and the defaulting group respectively. This shows that the
covariance matrices for the two groups are not similar. Rather they do differ. The log
determinant of the defaulting group is about 1.6 times that of the non-defaulting
group. This suggests that the variables in this function can predict the
non-defaulting firms better than the defaulting firms, as the larger log determinant,
and hence the wider dispersion, corresponds to the defaulting group.
Table No 5.2.1.4: Log Determinants
Group Rank Log Determinant

Non-Defaulting 20 17.535
Defaulting 20 28.342
Pooled within-groups 20 37.050
Box’s M Test
H0: The covariance matrices do not differ between groups formed by the
dependent variables.
variables.
Table No 5.2.1.5: Test Results
Box's M 2919.379
F Approx. 9.718
df1 210
df2 6794.332
Sig. 0.000
From the Table No 5.2.1.5: Test Results, it is clear that the p-value of the test is
lower than the critical value of 0.05. That means that the null hypothesis, that the
covariance matrices do not differ between groups formed by the dependent variables,
is rejected. That means the covariance matrices differ between groups formed by the
dependent variables. This is undesirable.
Eigenvalues
Table No 5.2.1.6: Eigenvalues
Function Eigenvalue % of Variance Cumulative % Canonical Correlation
The USA 1.315 100.0 100.0 0.754
The above Table No 5.2.1.6: Eigenvalues provides information about the eigenvalue
and canonical correlation. The eigenvalue of the discriminant function is 1.315. This
value is on the higher side. This signifies that most of the variance in the dependent
variable can be explained by the discriminant function. The canonical
correlation between the discriminant function and the dependent variable is 0.754.
This signifies that there is a strong association between the discriminant
function and the classification of groups. The square of the canonical correlation is
about 0.568, so about 56.8% of the variance in the classification of groups
can be explained by the developed discriminant function.
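The relation between the eigenvalue and the canonical correlation can be checked with a few lines of Python. For a single discriminant function, the canonical correlation equals the square root of eigenvalue / (1 + eigenvalue), and its square is the share of variance in group membership that the function accounts for. The variable names are illustrative.

```python
import math

eigenvalue = 1.315  # from Table No 5.2.1.6

# Canonical correlation implied by the eigenvalue of the discriminant function.
canonical_corr = math.sqrt(eigenvalue / (1.0 + eigenvalue))

# Its square is the proportion of variance in group membership explained.
variance_explained = canonical_corr ** 2

print(round(canonical_corr, 3))      # canonical correlation
print(round(variance_explained, 3))  # squared canonical correlation
```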
Wilks' Lambda
Table No 5.2.1.7: Wilks' Lambda
Function Wilks' Lambda Chi-square df Sig.

The USA 0.432 129.292 20 0.000
Wilks' Lambda is a measure of the discriminatory power of the discriminant function. The lower
the Wilk’s Lambda, the higher the discriminatory power of the function. From the
above Table No 5.2.1.7: Wilks' Lambda, it is clear that the Wilk’s Lambda is 0.432,
which is on the lower side. This tells that the discriminatory power of the
discriminant function developed from the sample is high. The Chi-square test has a p-value less
than the critical p-value of 0.05. That means the null hypothesis, that there is no
significant discriminating power in the independent variables, is rejected and the
alternative hypothesis is accepted. So it can be concluded on the basis of the above test that the groups differ from
each other and the function can effectively discriminate between the two groups.
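The figures in Table No 5.2.1.7 can be related to the eigenvalue and the sample size with a short Python sketch. With one discriminant function, Wilks' Lambda equals 1 / (1 + eigenvalue), and the Chi-square value is commonly obtained from Bartlett's approximation, -[N - 1 - (p + g) / 2] ln(Lambda), with p(g - 1) degrees of freedom. Taking N = 166 valid firm years, p = 20 variables and g = 2 groups is an assumption of this sketch based on the tables above; small differences from the reported value arise from rounding.

```python
import math


def bartlett_chi_square(wilks_lambda, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for Wilks' Lambda."""
    multiplier = n_cases - 1 - (n_predictors + n_groups) / 2.0
    return -multiplier * math.log(wilks_lambda)


# Wilks' Lambda implied by the eigenvalue of 1.315 in Table No 5.2.1.6.
wilks = 1.0 / (1.0 + 1.315)

# Chi-square implied by the reported Lambda of 0.432 (assumed N, p and g).
chi2 = bartlett_chi_square(0.432, n_cases=166, n_predictors=20, n_groups=2)
df = 20 * (2 - 1)

print(round(wilks, 3), round(chi2, 1), df)
```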
Structure Matrix
The Structure Matrix shows the correlations of each variable with each discriminant
function and these correlations are similar to the factor loadings in factor analysis.
The below Table No 5.2.1.8: Structure Matrix shows the correlation between the
predictor variables and the standardized canonical discriminant function. From the
table it can be said that most of the independent variables have low correlation and
contribute negligibly to the discriminant function.
Table No 5.2.1.8: Structure Matrix
Variables Coefficients
Log(TA/GNP) 0.62
NI/TA 0.323
EBIT/TA 0.302
GRTA 0.272
MVE/TBD -0.232
TBD/TA -0.219
OCFR 0.218
MP/BV -0.209
MP/EPS 0.204
Sales/TA 0.194
RE/TA -0.164
EBIT/INT 0.164
FAT 0.146
SG 0.136
WC/TA 0.133
ITR -0.123
NP/TE 0.064
SG/GNPG 0.063
D/E 0.05
CA/CL 0.021
Table No 5.2.1.9: Standardized Canonical Discriminant Function Coefficients
WC/TA 0.338
RE/TA 0.357
EBIT/TA -2.615
MVE/TBD 0.326
Sales/TA 0.593
CA/CL 0.026
NI/TA 2.73
NP/TE 0.016
TBD/TA -0.314
EBIT/INT -0.212
OCFR 0.098
GRTA 0.168
ITR -0.203
FAT -0.024
MP/EPS 0.418
MP/BV 0.001
D/E 0.003
Log(TA/GNP) 1.042
SG 0.287
SG/GNPG -0.163
From the above Table No 5.2.1.9: Standardized Canonical Discriminant Function
Coefficients, it is clear that NI/TA has the largest coefficient in absolute value at 2.73,
closely followed by EBIT/TA at -2.615. That
means these two independent variables have the highest discriminatory power
among all the independent variables. The coefficient of MP/BV is 0.001. On
comparing with other independent variables, MP/BV has the lowest discriminatory
power among all the independent variables.
Table No 5.2.1.10: Canonical Discriminant Function Coefficients (Unstandardized)
WC/TA 1.709
RE/TA 0.000
EBIT/TA -14.367
MVE/TBD 0.009
Sales/TA 0.800
CA/CL 0.012
NI/TA 16.751
NP/TE 0.008
TBD/TA -2.045
EBIT/INT -0.020
OCFR 0.037
GRTA 0.985
ITR -0.004
FAT -0.002
MP/EPS 0.001
MP/BV 0.000
D/E 0.000
Log(TA/GNP) 1.219
SG 1.466
SG/GNPG -0.020
(Constant) -5.271
The above Table No 5.2.1.10: Canonical Discriminant Function Coefficients

observations in the groups used in the study. From the below Table No 5.2.1.11: Prior
Probabilities for Groups, it is clear that out of 166 firm years, 139 firm years belong to the
non-defaulting group and 27 firm years belong to the defaulting group.
Table No 5.2.1.11: Prior Probabilities for Groups
Cases Prior Prob.

Defaulting 27 0.163
Total 166 1.00
score of each group. From the below Table No 5.2.1.12: Functions at Group
Centroids, it is clear that centroid of the discriminating score for defaulting group is -
Table No 5.2.1.12: Functions at Group Centroids
Group Centroids
Non-Defaulting 0.502
Defaulting -2.586
formulate decision rule in deciding group membership of individual case. The
decision rule is described below:
Defaulting centroid: -2.586 | Dividing point: -0.00026 | Non-Defaulting centroid: 0.502
From the above Table No 5.2.1.11: Prior Probabilities for Groups, it is clear that the
numbers of defaulting and non-defaulting cases are not equal, so to find the dividing
point, the centroids are weighted by the number of cases in each group.
Dividing point = (0.502 X 139 – 2.586 X 27)/166 = -0.000265
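The weighted dividing point and the resulting decision rule can be sketched in Python as follows. The centroids and group sizes are taken from Table No 5.2.1.12 and Table No 5.2.1.11. The rule that a score above the dividing point is assigned to the non-defaulting group is an assumption of this sketch, inferred from the positions of the two centroids; the function names are illustrative.

```python
def weighted_cutoff(centroids, counts):
    """Dividing point: group centroids weighted by the number of cases."""
    total = sum(counts.values())
    return sum(centroids[group] * counts[group] for group in centroids) / total


def classify(z, cutoff):
    """Assumed rule: scores above the dividing point are non-defaulting."""
    return "Non-Defaulting" if z > cutoff else "Defaulting"


centroids = {"Non-Defaulting": 0.502, "Defaulting": -2.586}
counts = {"Non-Defaulting": 139, "Defaulting": 27}

cutoff = weighted_cutoff(centroids, counts)
print(round(cutoff, 6))
print(classify(0.3, cutoff), classify(-1.0, cutoff))
```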
Classification Result
Table No 5.2.1.13: Classification Results
Actual Group Predicted Non Defaulting Predicted Defaulting Total
Count Non Defaulting 162 10 172
Count Defaulting 11 23 34
%age Non Defaulting 94.2 5.8 100.0
%age Defaulting 32.4 67.6 100.0
Accuracy 89.8%
Type I Error 5.8%
Type II Error 32.4%
The above Table No 5.2.1.13: Classification Results presents the
in-sample classification accuracy of the developed model. The classification accuracy of the
developed model is 89.8% with 5.8% Type I error and 32.4% Type II error. The
classification accuracy is comparable to, but lower than, the results of the study on Indian
firms as well as many other studies. Type I error is on the higher side but Type II error is
not very high, which is desirable.
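As a cross-check, the accuracy and the two error rates can be recomputed from the four cell counts of the classification table. The sketch below assumes, as the reported percentages imply, that Type I error is the share of non-defaulting firm years classified as defaulting and Type II error is the share of defaulting firm years classified as non-defaulting. The function and key names are illustrative.

```python
def classification_summary(nd_as_nd, nd_as_d, d_as_nd, d_as_d):
    """Accuracy and error rates (in %) from a 2 x 2 classification table."""
    total = nd_as_nd + nd_as_d + d_as_nd + d_as_d
    return {
        "accuracy": 100.0 * (nd_as_nd + d_as_d) / total,
        # Non-defaulting firm years wrongly classified as defaulting.
        "type_i": 100.0 * nd_as_d / (nd_as_nd + nd_as_d),
        # Defaulting firm years wrongly classified as non-defaulting.
        "type_ii": 100.0 * d_as_nd / (d_as_nd + d_as_d),
    }


# Counts from Table No 5.2.1.13.
summary = classification_summary(162, 10, 11, 23)
print({name: round(value, 1) for name, value in summary.items()})
```

The same function can be applied to the classification tables of the other three methods.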
Discussion
From the above discussion, it is clear that the model is found to be robust and the
classification accuracy of the developed model is found at 89.8% with lower Type II
error. The findings from the study are summarized as follows:
 The significant variables belong to all the three categories of variables.
 Besides total book value of debt with respect to total assets, no other variable
 As per the different tests, the developed model is statistically robust.
 From the classification accuracy, it is clear that the developed model has
comparable accuracy, which is higher than Altman (1968).
 These results are lower than the results from the Indian sample but higher than
those from the British sample in this study.
Logistic Regression analysis has been used to arrive at a credit score for the firms
with the purpose of predicting their respective default probabilities. This sample
initially has 250 firm years which was refined to 223 firm years. After checking for
for the sample period of five years from 2012 to 2016 of which 201 firm years were
found to be valid. On this sample, factor analysis is carried out because of high
multicollinearity in the data. After the analysis, only eight variables are used for logistic
regression analysis using SPSS software.
After running the logistic regression analysis using SPSS software the following 5
O = 4.372 + 4.513Sales/TA – 3.559NI/TA - 7.671TBD/TA - 0.001MP/EPS –
0.02SG/GNPG
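A logit score of this kind is converted into a probability through the logistic function, P = 1 / (1 + exp(-O)). The Python sketch below applies the coefficients of the equation above to a hypothetical firm year. Which of the two groups is coded as the event is not stated in this section, so the sketch only returns the probability of the event coded as 1; the function names and the example values are assumptions.

```python
import math

# Coefficients copied from the reported logit equation.
LOGIT_COEFFICIENTS = {
    "Sales/TA": 4.513,
    "NI/TA": -3.559,
    "TBD/TA": -7.671,
    "MP/EPS": -0.001,
    "SG/GNPG": -0.02,
}
LOGIT_CONSTANT = 4.372


def logit_score(ratios):
    """Linear score O; ratios that are not supplied are treated as zero."""
    return LOGIT_CONSTANT + sum(
        coef * ratios.get(name, 0.0) for name, coef in LOGIT_COEFFICIENTS.items()
    )


def logistic_probability(score):
    """Probability of the event coded as 1 in the logistic regression."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical firm year, not taken from the sample.
example = {"Sales/TA": 0.9, "NI/TA": 0.05, "TBD/TA": 0.4, "MP/EPS": 15.0}
print(round(logistic_probability(logit_score(example)), 3))
```

With a cut-off probability such as 0.5, the probability can then be turned into a group assignment.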
Empirical Results

From Table No 5.2.2.1: Case Processing Summary, it is clear that, of a total of 206
firm years, 201 valid firm years have been used to develop a model for predicting default
probability using the logistic regression method and 5 firm years have been
excluded.
Cases Percent
Valid 201 97.6
Excluded 5 2.4
Total 206 100
variable.
variable
Table No. 5.2.2.2: Omnibus Tests of Model Coefficients
Chi-square df Sig.
Step 149.676 22 0.000
Block 149.676 22 0.000
Model 149.676 22 0.000
From the above Table No. 5.2.2.2: Omnibus Tests of Model Coefficients, the Chi-square
coefficient is found to be 149.676 and the p-value of the test is 0.000. This
suggests that the null hypothesis is rejected and the alternative hypothesis, that
there is an effect of the independent variables, taken together, on the dependent
variable, is accepted. So it can be said that the inclusion of the independent variables improves the
predictive ability of the model.
Model Summary
From the below Table No. 5.2.2.3: Model Summary, -2LL is found to be 29.831, the
pseudo R square, that is Cox and Snell R Square, is found to be 0.525 and the adjusted pseudo
R square, or Nagelkerke R square, is found to be 0.889. This signifies that about 88.9%
of the variance in the estimation can be explained by the logistic function.
Table No. 5.2.2.3: Model Summary
Step -2 Log likelihood Cox & Snell R Square Nagelkerke R Square

1 29.831 0.525 0.889
Table No. 5.2.2.4: Hosmer and Lemeshow Test
Step Chi-square df Sig.
1 1.055 8 0.998
From the above Table No. 5.2.2.4: Hosmer and Lemeshow Test, the degree of
freedom is found to be 8, the Chi-square is 1.055 and the p-value of the test is 0.998. The
p-value of the test is higher than the critical value of 0.05. So the null
hypothesis is accepted. That means the model is correctly specified and the
data used to develop the model fit the model well.
Table No. 5.2.2.5: Variables in the Equation
B S.E. Wald df Sig.

WC/TA -3.521 4.203 0.702 1 0.402
Sales/TA 4.513 2.599 3.014 1 0.038
NI/TA -3.559 4.138 3.40 1 0.039
TBD/TA -7.671 5.849 3.720 1 0.019
EBIT/TA -0.053 0.134 0.160 1 0.689
MP/BV -0.001 0.001 2.296 1 0.048
D/E 0.157 0.114 1.906 1 0.167
Log(TA/GNP) -0.020 0.030 2.157 1 0.049
X(1) 55.124 1272.160 0.002 1 0.965
Y(1) 51.777 1272.129 0.002 1 0.968
Constant 4.372 2.135 0.001 1 0.027
From the above Table No. 5.2.2.5: Variables in the Equation, it is clear that five variables
have p-values less than the critical value of 0.05. So the null hypothesis is rejected for
these variables. Only these five variables are part of the model.
Classification Results
Table No. 5.2.2.6: Classification Results

Count
Defaulting 5 28 33
%age
Defaulting 15.2 84.8 100.0
Accuracy 95.0%
Type I Error 3.0%
Type II Error 15.2%
The above Table No 5.2.2.6: Classification Results presents the in-sample classification
accuracy of the developed model. The classification accuracy of the developed model is
found to be 95% with Type I error at 3% and Type II error at 15.2%. The classification accuracy of the logit
model is higher than that of the discriminant model. But the problem with the model is that it
has a high level of Type II error. This is undesirable.
Discussion
From the results it is clear that the developed logit model is significant and has a high
classification accuracy of 95%, which is comparable to many studies like Ohlson (1980),
but with a high level of Type II error. The findings from the study are summarized as
follows:
 Besides total book value of debt with respect to total assets, no other variable
 As per Omnibus test, model is significant.
 As per the model summary, the model is robust.
 As per the Hosmer and Lemeshow Test, the model is correctly specified.
 The developed model has yielded a classification accuracy of 95% with low
Type I error, which is comparable to many studies.
 The problem with the model is that it has a high level of Type II error.
intensity for the firms for the sample of American firms. This default intensity is used
to arrive at the probability of default. This sample initially has 250 firm years of 50
firms from the United States of America for the period of five years from 2012 to
2016. The default occurrences are counted for every firm for the whole sample period.
Similarly, the five-year means of the selected ratios are calculated for every firm. This turns
the sample into 50 firms. Then this sample is checked for outliers using
Mahalanobis statistics, which brings down the sample size to 41 firms.
Averages of ratios are taken as independent variables and the count data is used as
the dependent variable. On this sample, a Poisson regression is run using
EViews software.
following a four factors model is found.
Z = 3.220329 + 19.28147EBIT/TA – 1.09715 Log (TA/GNP) – 19.7665NI/TA –
1.81494Sales/TA
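How a default intensity of this kind may be turned into a probability of default can be sketched in Python. The sketch assumes the usual log link of Poisson regression, so that the intensity is exp(Z), and takes the probability of default as the probability of at least one default event, 1 - exp(-intensity). Both choices are assumptions of the sketch, not a description of the exact procedure followed in the study. The coefficients follow Table No. 5.2.3.3, and the example firm is hypothetical.

```python
import math

# Significant coefficients as listed in Table No. 5.2.3.3.
POISSON_COEFFICIENTS = {
    "EBIT/TA": 19.28147,
    "Log(TA/GNP)": -1.09715,
    "NI/TA": -19.7665,
    "Sales/TA": -1.81494,
}
POISSON_CONSTANT = 3.220329


def default_intensity(ratios):
    """Expected default count, assuming a log link: exp(Z)."""
    z = POISSON_CONSTANT + sum(
        coef * ratios.get(name, 0.0) for name, coef in POISSON_COEFFICIENTS.items()
    )
    return math.exp(z)


def probability_of_default(intensity):
    """Probability of at least one default event under a Poisson law."""
    return 1.0 - math.exp(-intensity)


# Hypothetical firm (five-year averages), not taken from the sample.
example = {"EBIT/TA": 0.10, "Log(TA/GNP)": 4.6, "NI/TA": 0.08, "Sales/TA": 1.0}
print(round(probability_of_default(default_intensity(example)), 3))
```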
Empirical Results

From the below Table No. 5.2.3.1.: Case Processing Summary, it is clear that the
sample size is found to be 41. Of these firms, only 39 firms are included in the study.
Table No. 5.2.3.1: Case Processing Summary
The US
Sample Size (Adjusted) 41
Included Observations 39
The below Table No. 5.2.3.2: Statistical Coefficients presents different coefficients on
the basis of which the relative statistical robustness of the models will be assessed.
Table No. 5.2.3.2: Statistical Coefficients
The USA
R-squared 0.368
Adjusted R-squared 0.280
S.E. of regression 0.314
Sum squared residuals 14.300
Log likelihood -47.615
Restricted log likelihood -76.036
Avg. log likelihood -0.287
Mean dependent var. 0.163
S.D. dependent var. 0.370
Akaike info criterion 0.827
Schwarz criterion 1.220
Hannan-Quinn criterion 0.986
LR statistic 56.843
Prob. (LR statistic) 0.000
From the above Table No. 5.2.3.2: Statistical Coefficients, R-squared is 0.368 and
adjusted R-squared is 0.28. It clearly indicates that the data do not fit properly into the
developed model.
From the above Table No. 5.2.3.2: Statistical Coefficients, it is clear that the Standard
Error of the Regression is on the lower side at 0.314. It clearly indicates that the statistical
noise in the estimates is very low. So the developed model is found to be robust.
From the above Table No. 5.2.3.2: Statistical Coefficients, Sum Squared Residuals
for the model is 14.3. Sum Squared Residuals is a measure of the distance from the data
points to the regression line. The lower the residual, the better the model fits the
data. So the model is a better fit for the data.
From the above Table No. 5.2.3.2: Statistical Coefficients, it is clear from the values of
the Log Likelihood, Restricted Log Likelihood and Average Log Likelihood that the
developed model has higher precision.
From the above Table No. 5.2.3.2: Statistical Coefficients, it is clear that the standard
deviation for the developed model is on the lower side at 0.37. So the model is highly
robust.
From the above Table No. 5.2.3.2: Statistical Coefficients, Akaike Information
Criterion for the model is 0.827 which is construed as low. So the model is robust.
Schwarz Criterion
From the above Table No. 5.2.3.2: Statistical Coefficients, Schwarz Criterion for the
model is 1.22 which is on lower side and considered low. This makes the developed
model robust.
From the above Table No. 5.2.3.2: Statistical Coefficients, Hannan-Quinn Criterion
for the model is 0.986. This value is considered low. So it can be said that the
developed model is robust.
From the above Table No. 5.2.3.2: Statistical Coefficients, the LR
statistic for the model is 56.843 with a p-value of 0.000, which is lower than the
critical p-value of 0.05. That means the null hypothesis is rejected for the model and the
alternative hypothesis, that the slope coefficients are not all equal to zero, is accepted.
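The LR statistic in Table No. 5.2.3.2 follows directly from the two log likelihoods reported in the same table, as twice the difference between the log likelihood of the fitted model and that of the restricted (intercept-only) model. A minimal Python check, with an illustrative function name:

```python
def lr_statistic(log_likelihood, restricted_log_likelihood):
    """Likelihood ratio statistic: 2 x (unrestricted - restricted)."""
    return 2.0 * (log_likelihood - restricted_log_likelihood)


# Log likelihood and restricted log likelihood from Table No. 5.2.3.2.
lr = lr_statistic(-47.615, -76.036)
print(round(lr, 3))
```

The small difference from the reported 56.843 is due to rounding of the log likelihoods in the table.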
The below Table No. 5.2.3.3: Variables in Equation presents the estimated
Table No. 5.2.3.3: Variables in Equation
Variable Coefficient Std. Error z-Statistic Prob.

Const. 3.220329 1.613221 1.996211 0.0459
D/E 0.008287 0.023526 0.352254 0.7246
EBIT/TA 19.28147 9.290355 2.075428 0.0379
Log(TA/GNP) -1.09715 0.2568 -4.27238 0
MP/BV 0.000214 0.008656 0.024729 0.9803
NI/TA -19.7665 9.710449 -2.03559 0.0418
SALES/TA -1.81494 0.737225 -2.46185 0.0138
TBD/TA 1.763738 1.547633 1.139636 0.2544
WC/TA -0.11373 2.375402 -0.04788 0.9618
From the above Table No. 5.2.3.3: Variables in Equation, it is clear that four
variables from the sample have p-values less than the critical p-value of 0.05. That
means the null hypothesis is rejected for these variables. That signifies that these
variables will be part of the developed model along with the significant intercept.
Table No. 5.2.3.4: Classification Result
Predicted Group Membership
Original (Z) Non-Defaulting Defaulting Total
Count Non-Defaulting 130 2 132
Count Defaulting 17 6 23
% Non-Defaulting 98.5 1.5 100.0
% Defaulting 73.9 26.1 100.0
Accuracy 87.7%
Type I Error 1.5%
Type II Error 73.9%
The above Table No. 5.2.3.4: Classification Result presents the in-sample classification
accuracy of the developed model. The cut-off probability of group classification has
been taken as 0.5. The classification accuracy of the developed model is 87.7% with low
Type I error at 1.5% and very high Type II error at 73.9%. Overall, the developed
model has a high level of accuracy but the Type II error puts the robustness of the
developed model in question. These results are comparable to the results of the study on Indian
firms.
Discussion
From the results it is clear that the developed model is significant and has a high
classification accuracy of 87.7%, which is comparable to many studies, but with a high level
of Type II error. The findings from the study are summarized as follows:
 Besides total book value of debt with respect to total assets, no other variable is
significant.
 As per the statistical tests, the model is found to be robust and significant.
 The developed model has yielded a classification accuracy of 87.7% with low
Type I error.
 The problem with the model is that it has a high level of Type II error.
The total number of firm years for 50 firms is 250. Of this sample, only 149 firm
years could be used for the structural model. Because of inconsistency in the data, a large
number of firm years have been dropped from the sample for the structural model
based on the Black, Scholes and Merton (1973) model.
The USA
Cases Percent
Defaulting 22 14.8
Total 149 100
total number of firm years is 149. Of this sample, 127 firm years belong to the
non-defaulting group and 22 firm years belong to the defaulting group. That means 14.8% of
the cases in the sample are defaulting cases.
All Firms
Avg. D2 Avg. Prob.
Non-Defaulting 23992.49 15.1%
Defaulting -4240.94 90.2%
Total 18912.38 26.2%
From Table No. 5.2.4.2: D2 and Default Probabilities, it is clear that the average D2
for the non-defaulting cases is 23992.49 with average probability of default at 15.1%. The average
D2 for the defaulting cases is -4240.94 with average probability of default at
90.2%. For all the cases, the average D2 is 18912.38 with average probability of default at 26.2%.
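For orientation, the textbook Merton relation between the distance to default and the probability of default can be sketched in Python. In its standard form, d2 = [ln(V/D) + (mu - 0.5 sigma^2) T] / (sigma sqrt(T)) and the probability of default is N(-d2), where N is the standard normal distribution function. This is a generic sketch under stated assumptions: the inputs (asset value V, debt D, drift mu, asset volatility sigma, horizon T) are hypothetical, and the exact specification and scaling of D2 used in the study may differ.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def distance_to_default(asset_value, debt, drift, asset_vol, horizon=1.0):
    """Textbook Merton d2 for a firm with the given asset value and debt."""
    numerator = math.log(asset_value / debt) + (drift - 0.5 * asset_vol ** 2) * horizon
    return numerator / (asset_vol * math.sqrt(horizon))


def merton_pd(asset_value, debt, drift, asset_vol, horizon=1.0):
    """Probability that assets fall below debt at the horizon: N(-d2)."""
    return normal_cdf(-distance_to_default(asset_value, debt, drift, asset_vol, horizon))


# Hypothetical firms: one well capitalised, one close to the default point.
print(merton_pd(asset_value=100.0, debt=50.0, drift=0.05, asset_vol=0.2))
print(merton_pd(asset_value=100.0, debt=95.0, drift=0.05, asset_vol=0.2))
```

A negative distance to default, as reported on average for the defaulting group, corresponds to a probability of default above 50% in this formulation.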
Defaulting 2 20 22
Non-Defaulting 85.04% 14.96%
Defaulting 9.09% 90.91%
Accuracy 85.91%
Type I Error 14.96%
Type II Error 9.09%
classification accuracy of the structural model is on the higher side at 85.9%. Type I and
Type II errors are on the higher side at 15% and 9.1% respectively. So, on the basis of
classification accuracy, the model can be said to be strong enough to classify firms accurately
into groups.
Discussion
From prior probabilities, it is clear that the proportion of defaulting firms in the
sample is 14.8%. From the classification results, it is clear that classification accuracy
is on higher side at 85.9% for the structural model for the sample but the Type I errors
are comparatively on higher side though Type II error is not very high.
From the above discussion for the all the developed models using four methods,
following are the findings:
 For MDA, logit and reduced form models, all the three categories of variables;
accounting, market and economic variables are found to be significant.
 For MDA and logit models, only total book value of debt with respect to total assets is
significant and no other variable with a debt component is found significant, but
for reduced form models no variable with a debt component is found significant.
 Three variables, namely Sales/TA, NI/TA and log(TA/GNP), are significant for
all the methods and these belong to accounting and economic variables.
 Intercepts for all the models are significant.
 The models for MDA, logit and reduced form models are significant and robust.
 For MDA model the classification accuracy is higher than Altman (1968) and
Bandyopadhyay (2006) at 89.8% although lower than the models for the Indian
firms but higher than the British firms used in the study.
 For logit model the classification accuracy of the model for The US is found at
95% which is lower than the model for Indian firms but higher than the British
firms. These results are higher than Ohlson (1980) and Bandyopadhyay (2006).
 The developed reduced form model has yielded a classification accuracy of
87.7%. It is higher than the models developed for the Indian firms and the
British firms.
 The classification accuracy for the structural model is found at 85.9%, which is higher
than the models developed for the Indian firms as well as the British firms.
 Classification accuracy for logit model is the highest among all the methods for
the US firms.
 Type I and Type II errors for all the methods are relatively higher than many
studies.
5.2.6. Conclusion
From the above analysis of all the models using the four methods, it can be said that for
MDA and logit models, all categories of variables are significant but for reduced form
model, only two categories accounting and economic variables are significant. MDA and
logit models have variables with debt component but in case of reduced form model,
variables with debt component are not significant. As far as the classification accuracy is
concerned, classification accuracy for the logit model is the highest among all the methods. If
compared with classification accuracies from the models for Indian firms and the British
firms, MDA and logit models have lower classification accuracies than the models for Indian
firms but higher accuracies than those for the British firms. For the reduced form models, the
model for the US firms has higher classification accuracy than the models for Indian
firms and British firms. Like the models for Indian firms and British firms, the models for
American firms also have high Type I and II errors.
5.2.7. Limitations and Further Scope for Studies
The developed models using the three methods, like other models, face a number of
limitations as follows:
 Type I and Type II errors are found to be on the higher side for all the methods.
The main concern is the very high Type II error, which
makes these models less effective.
 It is observed that a large number of non-defaulting firms are being predicted as
defaulting by all the methods. Is it an indication of financial distress in these firms,
some other indication, or a problem with the model? This needs to be
investigated.
 From sampling, it is clear there is problem of sampling bias and selection bias in
included in the study accidently.
 The study has used only accounting, market, economic and categorical
variables. There is a need to include qualitative variables as well. This may improve the
results.
 There is further scope for study on a larger data set over a larger sample period.
same.

SECTION III: BRITISH SAMPLE
British sample. This study examines four methods namely multiple discriminant
default in British context. The results and findings of the study are arranged into four
parts. Part I consists of the results of the study using multiple discriminant analysis. Part
II contains the results and findings of the study using logistic regression analysis. Part III
presents the results and findings of the study using the reduced form model based on Poisson
regression. Part IV presents and discusses the results of the study using the Merton
distance to default model.
Multiple discriminant analysis has been used to arrive at a credit score. The empirical
results of the model developed using the MDA technique are presented and
discussed in this section. The sample size of the study is 250 firm years for 50 firms
over a period of five years from 1st January 2012 to 31st December 2016. After
refinement the number of cases fell to 218 firm years. After checking for outliers
using Mahalanobis statistics, the sample was brought down to 197 firm years. On this
sample multiple discriminant analysis was run using SPSS software. Of 197 firm years,
only 171 cases were found to be valid by the discriminant analysis.
seven factors model was found.
Z = -1.222 + 12.717 EBIT/TA - 0.003 MVE/TBD + 1.016 Sales/TA - 9.64 NI/TA - 0.481 TBD/TA + 0.975 MP/BV + 1.419 Log(TA/GNP)
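As an illustration of how the Z equation is applied to a single firm year, the following minimal Python sketch evaluates the score from the seven ratios. The coefficients are copied from the equation; the function name `z_score` and the firm values are hypothetical and are not taken from the study sample.

```python
# Unstandardized coefficients copied from the seven-factor Z equation.
INTERCEPT = -1.222
COEFFICIENTS = {
    "EBIT/TA": 12.717,
    "MVE/TBD": -0.003,
    "Sales/TA": 1.016,
    "NI/TA": -9.64,
    "TBD/TA": -0.481,
    "MP/BV": 0.975,
    "Log(TA/GNP)": 1.419,
}


def z_score(ratios):
    """Return the discriminant score for one firm year.

    `ratios` maps every variable name in COEFFICIENTS to its value.
    """
    return INTERCEPT + sum(
        coefficient * ratios[name] for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical firm year (illustrative values only, not from the sample).
hypothetical_firm = {
    "EBIT/TA": 0.10,
    "MVE/TBD": 3.0,
    "Sales/TA": 1.5,
    "NI/TA": 0.05,
    "TBD/TA": 0.3,
    "MP/BV": 1.0,
    "Log(TA/GNP)": 4.0,
}
print(round(z_score(hypothetical_firm), 4))
```

The resulting score is compared with the cut-off point between the group centroids to assign the firm year to a group.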
Empirical Results


From the following Table 5.3.1.1: Analysis Case Processing Summary, it is clear that
a sample of 171 valid firm years has been used to develop the Z score model using
multiple discriminant analysis, while 26 of the 197 firm years have been excluded.
Table No. 5.3.1.1: Analysis Case Processing Summary
            Cases   Percent
Valid       171     86.8
Excluded    26      13.2
Total       197     100
Group Statistics
Table No. 5.3.1.2: Group Statistics

Variable | Non-Defaulting: Mean, Std. Dev | Defaulting: Mean, Std. Dev | Total: Mean, Std. Dev
WC/TA 0.219 0.965 0.048 0.721 0.147 0.346
RE/TA 171.843 90.621 272.522 281.691 229.544 919.73
EBIT/TA 0.101 0.201 -0.064 0.634 0.09 0.542
MVE/TBD 3.353 9.76 33.53 107.00 7.912 83.67
Sales/TA 1.503 3.98 0.391 12.86 0.937 2.56
CA/CL 2.82 5.86 1.055 19.43 1.385 7.47
NI/TA 0.84 1.04 -0.53 7.733 0.105 2.73
NP/TE 0.213 0.932 -0.767 31.85 0.058 19.869
TBD/TA 1.702 0.132 0.373 0.237 0.287 0.158
EBIT/INT 9.514 8.303 7.193 18.157 11.649 10.653
OCFR 0.321 1.746 -1.438 5.179 0.035 2.684
GRTA 0.024 0.128 -0.119 0.310 0.001 0.178
ITR 7.29 76.028 -4.160 132.56 0.958 48.18
FAT 5.21 14.392 1.460 3.981 4.415 17.46
MP/EPS 4.87 9.501 -258.657 1090.347 -29.695 444.584
MP/BV 8.79 37.58 97.12 793.73 17.56 537.15
D/E 3.251 23.474 6.32 171.56 3.914 142.24
Log(TA/GNP) 4.467 8.53 3.807 1.532 4.197 1.406
SG 0.023 132.143 -0.059 98.64 -0.003 56.17
SG/GNPG 0.073 6.208 -1.459 78.06 -0.393 21.44

From the above Table No. 5.3.1.2: Group Statistics, it is clear that the means of the
selected ratios, which have been used for the development of the model, are quite
different across the groups. The means of the ratios of the non-defaulting group are far
healthier than those of the defaulting group. The table shows a stark difference in the
financial performance of the two groups. The same trend is found in the
standard deviations of the two groups.
between groups on each of the independent variables using group mean or not. The null
hypothesis of the test is that there is no significant difference between the groups on each
of the independent variables group mean. The alternative hypothesis is that there is
significant difference between the groups on each of the independent variables group
mean.
From the below Table No. 5.3.1.3: Tests of Equality of Group Means, it is clear that
thirteen variables have a p-value less than the critical p-value of 0.05 by the F test. So
the null hypothesis, that there is no significant difference between the groups on each
of the independent variables' group means, is rejected and the alternative hypothesis,
that there is a significant difference between the groups on each of the independent
variables' group means, is accepted. So these thirteen independent variables are
significant for the study. The problem with these significant variables is that their
Wilks' Lambdas are on the higher side, which is undesirable, as a smaller Wilks'
Lambda is better for the analysis.

Table No. 5.3.1.3: Tests of Equality of Group Means
Wilks' Lambda F df1 df2 Sig.

WC/TA 0.976 3.81 1 169 0.052
RE/TA 0.967 1.86 1 169 0.087
EBIT/TA 0.893 19.371 1 169 0.0
MVE/TBD 0.934 12.003 1 169 0.001
Sales/TA 0.965 8.109 1 169 0.005
CA/CL 0.999 0.089 1 169 0.765
NI/TA 0.88 23.61 1 169 0.0
NP/TE 0.996 0.894 1 169 0.363
TBD/TA 0.914 11.002 1 169 0.003
EBIT/INT 0.966 5.805 1 169 0.037
OCFR 0.917 10.249 1 169 0.001
GRTA 0.901 15.934 1 169 0.0
ITR 0.981 3.051 1 169 0.079
FAT 0.975 4.67 1 169 0.034
MP/EPS 0.944 8.999 1 169 0.002
MP/BV 0.996 8.32 1 169 0.007
D/E 0.999 0.55 1 169 0.56
Log(TA/GNP) 0.646 86.499 1 169 0
SG 0.987 1.991 1 169 0.074
SG/GNPG 0.992 0.835 1 169 0.378
Log Determinants
The below Table No. 5.3.1.4: Log Determinants indicates that the pooled within-groups
log determinant for the whole sample is 10.291. It is 8.362 and 2.908 for
the non-defaulting group and the defaulting group respectively. This shows that the
covariance matrices of the groups are not similar; rather, they differ. The log
determinant of the non-defaulting group is about thrice that of the defaulting group,
though the difference between the pooled within-groups and non-defaulting log
determinants is smaller. This shows that the variables in this function can better predict
the non-defaulting firms than the defaulting firms, as the larger log determinant
corresponds to the non-defaulting group. Also, the values of the log determinants are not
very high for any group. This shows that the discriminatory power of this function is not
very high.
Table No. 5.3.1.4: Log Determinants
Z Rank Log Determinant

Defaulting 20 2.908
Pooled within-groups 20 10.291
Box’s M Test
variables.
variables.
Table No. 5.3.1.5: Test Results
Box's M         871.912
F   Approx.     7.684
    df1         105
    df2         72917.681
    Sig.        0.000
From Table No. 5.3.1.5: Test Results, it is clear that the p-value of the test is
lower than the critical value of 0.05. That means the null hypothesis, that the
covariance matrices do not differ between the groups formed by the dependent variable,
is rejected. The covariance matrices therefore differ between the groups formed by the
dependent variable. This is undesirable.
Eigenvalues
Table No. 5.3.1.6: Eigenvalues
Function Eigenvalue % of Variance Cumulative % Canonical Correlation

1 0.454 100.0 100.0 0.356
The above Table No. 5.3.1.6: Eigenvalues provides information about the eigenvalue
and the canonical correlation. The eigenvalue of the discriminant function is 0.454, which
is on the lower side. This signifies that the variance in the dependent variable
cannot be explained properly by the discriminant function. The canonical correlation
between the discriminant function and the dependent variable is 0.356, which is poor.
This indicates a weak association between the discriminant function and the classification
of groups.
Wilks' Lambda
Table No. 5.3.1.7: Wilks' Lambda
Test of Function(s) Wilks' Lambda Chi-square df Sig.

1 0.770 30.964 14 0.000
Wilks' Lambda is a measure of the discriminatory power of the discriminant function. The
lower the Wilks' Lambda, the higher the discriminatory power of the function. From the
above Table 5.3.1.7: Wilks' Lambda, it is clear that the Wilks' Lambda is 0.770, which
is on the higher side. This indicates that the discriminatory power of the discriminant
function developed from the sample is on the lower side. The Chi-square test has a p-value
less than the critical p-value of 0.05. That means the null hypothesis is rejected and the
alternative hypothesis, that there is significant discriminating power in the independent
variables, is accepted. So it can be concluded on the basis of the above test that the
groups differ from each other and the function can effectively discriminate between the
two groups.
Structure Matrix
Table No 5.3.1.8: Structure Matrix
Log(TA/GNP) 0.632
NI/TA 0.253
EBIT/TA 0.382
GRTA 0.027
MVE/TBD -0.372
TBD/TA -0.719
OCFR 0.098
MP/BV -0.379
MP/EPS 0.004
Sales/TA 0.579
RE/TA -0.164
EBIT/INT 0.004
FAT 0.016
SG 0.106
WC/TA 0.713
ITR -0.023
NP/TE 0.104
SG/GNPG 0.123
D/E 0.015
CA/CL 0.019
The structure matrix shows the correlation of each variable with the discriminant function;
these correlations are similar to the factor loadings in factor analysis. The above
Table 5.3.1.8: Structure Matrix shows the correlation between the predictor variables
and the standardized canonical discriminant function. From the above table it can be said
that most of the independent variables have low correlations and contribute negligibly to
the discriminant function. The highest coefficient in absolute terms is for TBD/TA at -0.719.
Table No. 5.3.1.9: Standardized Canonical Discriminant Function Coefficients
WC/TA 0.387
RE/TA 0.537
EBIT/TA -3.157
MVE/TBD 0.436
Sales/TA 0.653
CA/CL 0.006
NI/TA 3.673
NP/TE 0.006
TBD/TA -0.614
EBIT/INT -0.001
OCFR 0.008
GRTA 0.001
ITR -0.73
FAT 0.004
MP/EPS 0.008
MP/BV 1.391
D/E 0.003
Log(TA/GNP) 1.42
SG 0.087
SG/GNPG -0.203
From the above Table No. 5.3.1.9: Standardized Canonical Discriminant Function
Coefficients, it is clear that NI/TA has the highest coefficient value at 3.673. That
means the independent variable NI/TA has the highest discriminatory power among
all the independent variables. The coefficient of EBIT/INT is -0.001. It has the lowest
discriminatory power among all the independent variables. From the above table it
can be said that most of the variables have very low coefficients.
Table No. 5.3.1.10: Canonical Discriminant Function Coefficients (Unstandardized)
Variables Coefficient
WC/TA 1.335
RE/TA -0.068
EBIT/TA 12.717
MVE/TBD -0.003
Sales/TA 1.016
CA/CL -0.044
NI/TA -9.640
NP/TE 0.004
TBD/TA -0.481
EBIT/INT 0.000
OCFR 0.000
GRTA 0.000
ITR 0.000
FAT 0.000
MP/EPS 0.000
MP/BV 0.975
D/E 0.000
Log(TA/GNP) 1.419
SG 0.66
SG/GNPG -0.039
(Constant) -1.222
The above Table 5.3.1.10: Canonical Discriminant Function Coefficients


observations in the groups used in the study. From the below Table No. 5.3.1.11:
Prior Probabilities for Groups, it is clear that out of 171 firm years, 152 firm years
belong to the non-defaulting group and 19 firm years belong to the defaulting group.
Table No. 5.3.1.11: Prior Probabilities for Groups
Z                 Prior    Cases Used in Analysis
                           Un-weighted    Weighted
Non-Defaulting    0.888    152            152.000
Defaulting        0.111    19             19.000
Total             1.000    171            171.000
score of each group. From the below Table No. 5.3.1.12: Functions at Group
Centroids, it is clear that the centroid of the discriminant score is -1.481 for the
defaulting group and 0.185 for the non-defaulting group.
Table No. 5.3.1.12: Functions at Group Centroids
Z Function
Non-Defaulting 0.185
Defaulting -1.481
formulate decision rule in deciding group membership of individual case. The
decision rule is described below:
Defaulting centroid: -1.481 | Cut-off: -0.00011 | Non-defaulting centroid: 0.185
From the above Table No. 5.3.1.11: Prior Probabilities for Groups, it is clear that the
numbers of defaulting and non-defaulting cases are not equal, so to find the dividing
point, the centroids are weighted by the group sizes.
Cut-off = (0.185 X 152 - 1.481 X 19)/171 = -0.00011
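The weighted dividing point can be reproduced with a short Python sketch. The centroids and group sizes come from the tables above; the function names are hypothetical, and the rule that scores above the cut-off are assigned to the non-defaulting group is an assumption based on the non-defaulting centroid being the larger of the two.

```python
def weighted_cutoff(centroids, group_sizes):
    """Size-weighted mean of the group centroids, used as the dividing point."""
    total = sum(group_sizes.values())
    return sum(centroids[g] * group_sizes[g] for g in centroids) / total


def classify(z, cutoff):
    # Assumed decision rule: the non-defaulting centroid lies above the
    # cut-off, so higher scores are assigned to the non-defaulting group.
    return "Non-Defaulting" if z > cutoff else "Defaulting"


centroids = {"Non-Defaulting": 0.185, "Defaulting": -1.481}
group_sizes = {"Non-Defaulting": 152, "Defaulting": 19}

cutoff = weighted_cutoff(centroids, group_sizes)
print(round(cutoff, 5))
print(classify(0.9, cutoff), classify(-0.7, cutoff))
```

Weighting by group size pulls the dividing point towards the centroid of the larger group, which is why the cut-off lies close to zero rather than midway between the two centroids.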
Classification Results
Table No 5.3.1.13: Classification Results

Count
Defaulting 6 13 19
%age
Defaulting 31.6 68.4 100.0
Accuracy 73.1%
Type I Error 25.7%
Type II Error 31.6%

accuracy of the developed model. The classification accuracy of the developed model
is 73.1%, with Type I error at 25.7% and Type II error at 31.6%.
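The three summary measures can be derived from the four cells of a classification table, as in the minimal Python sketch below. It assumes, in line with the tables in this chapter, that a Type I error is a non-defaulting firm year predicted as defaulting and a Type II error is a defaulting firm year predicted as non-defaulting. The defaulting counts (6 and 13) are taken from the table; the non-defaulting counts are hypothetical placeholders.

```python
def classification_metrics(nd_as_nd, nd_as_d, d_as_nd, d_as_d):
    """Accuracy and error rates from a 2x2 classification table.

    nd_as_nd: non-defaulting predicted non-defaulting
    nd_as_d:  non-defaulting predicted defaulting   (Type I error)
    d_as_nd:  defaulting predicted non-defaulting   (Type II error)
    d_as_d:   defaulting predicted defaulting
    """
    non_defaulting = nd_as_nd + nd_as_d
    defaulting = d_as_nd + d_as_d
    return {
        "accuracy": (nd_as_nd + d_as_d) / (non_defaulting + defaulting),
        "type_i": nd_as_d / non_defaulting,
        "type_ii": d_as_nd / defaulting,
    }


# Defaulting row from the table; the non-defaulting split (120 / 32)
# is a hypothetical placeholder, not a figure from the thesis.
metrics = classification_metrics(nd_as_nd=120, nd_as_d=32, d_as_nd=6, d_as_d=13)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```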
Discussion
From the above discussion, it is clear that the developed model has a classification
accuracy of 73.1% with a high Type II error. The
findings from the study are summarized as follows:
• Variables with debt component are found significant for the model development.
• From the statistical tests the developed model is significant but not highly robust.
The eigenvalue and canonical correlation are on the lower side.

comparable accuracy at 73.1%.
• These results are lower than the results from the Indian firms as well as the
American firms used in this study.
Logistic regression analysis has been used to arrive at a credit score. The empirical
results of the model developed using the logit method are presented and discussed
in this section. The sample size of the study is 250 firm years for 50 firms over a
period of five years from 1st January 2012 to 31st December 2016. After refinement
the number of cases fell to 218 firm years. After checking for outliers using
Mahalanobis statistics, the sample was brought down to 197 firm years. Because of
high correlation among the variables, factor analysis was carried out on this sample.
Out of the twenty variables, eleven are found suitable for the study.
On the sample thus obtained, logistic regression was run using SPSS software.
Of 197 firm years, only 162 cases were found to be valid.
After running the logistic regression using SPSS software, the following five-factor
model was found:
O = 2.734 - 13.169 EBIT/TA - 3.3 Sales/TA + 12.58 NI/TA + 1.001 Log(TA/GNP) + 1.848 X
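The following minimal Python sketch shows how a logit score of this kind is usually converted into a probability. It assumes that O is the log-odds of the modelled category and that X is a 0/1 categorical variable; the function names and the firm values are hypothetical.

```python
import math

# Coefficients copied from the five-factor logit equation.
INTERCEPT = 2.734
COEFFICIENTS = {
    "EBIT/TA": -13.169,
    "Sales/TA": -3.3,
    "NI/TA": 12.58,
    "Log(TA/GNP)": 1.001,
    "X": 1.848,  # assumed to be a 0/1 categorical variable
}


def logit_score(values):
    """Linear predictor O for one firm year."""
    return INTERCEPT + sum(
        coefficient * values[name] for name, coefficient in COEFFICIENTS.items()
    )


def logistic_probability(score):
    """Map a log-odds score to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical firm year (illustrative values only, not from the sample).
hypothetical_firm = {
    "EBIT/TA": 0.10,
    "Sales/TA": 1.0,
    "NI/TA": 0.05,
    "Log(TA/GNP)": 4.0,
    "X": 0,
}
score = logit_score(hypothetical_firm)
print(round(score, 4), round(logistic_probability(score), 4))
```

With a cut-off probability of 0.5, a firm year is assigned to the modelled category whenever the score O is positive.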

Empirical Results

From Table No. 5.3.2.1: Case Processing Summary, it is clear that a sample of 162
valid firm years has been used to develop a model for predicting default probability using
the logistic regression method, while 35 of the 197 firm years have been excluded.
Table No. 5.3.2.1: Case Processing Summary
            Cases   Percent
Valid       162     82.2
Excluded    35      17.8
Total       197     100
variable.
variable

Table No. 5.3.2.2: Omnibus Tests of Model Coefficients
Chi-square df Sig.
Step 134.57 13 0.000
Block 134.57 13 0.000
Model 134.57 13 0.000
From the above Table No. 5.3.2.2: Omnibus Tests of Model Coefficients, the Chi-square
statistic is found to be 134.57 and the p-value of the test is 0.000. This
suggests that the null hypothesis is rejected and the alternative hypothesis, that
there is an effect of the independent variables, taken together, on the dependent
variable, is accepted. So it can be said that the inclusion of the independent variables
improves the predictive ability of the model.
Model Summary
From the below Table No. 5.3.2.3: Model Summary, -2LL is found to be 105.272, the
pseudo R square, that is the Cox and Snell R Square, is found to be 0.418 and the adjusted
pseudo R square, or Nagelkerke R square, is found to be 0.479. This signifies that about
41.8% of the total variation of the classification can be explained by the model and 47.9% of
the variance in the estimation can be explained by the logistic function.
Table No. 5.3.2.3: Model Summary
Step -2 Log likelihood Cox & Snell R Square Nagelkerke R Square
1 105.272 0.418 0.479

Table No. 5.3.2.4: Hosmer and Lemeshow Test
Step Chi-square df Sig.
1 23.94 8 0.017
From the above Table No. 5.3.2.4: Hosmer and Lemeshow Test, the degrees of
freedom are 8, the Chi-square is 23.94 and the p-value of the test is 0.017.
The p-value of the test is lower than the critical value of 0.05. So the null
hypothesis, that the model is correctly specified, is rejected. That means the
data that has been used to develop the model does not fit the model well.

Table No: 5.3.2.5: Variables in Equation
B S.E. Wald df Sig.

WC/TA -0.983 0.892 0.688 1 0.343
RE/TA -0.172 0.474 0.126 1 0.782
EBIT/TA -13.169 8.09 6.50 1 0.018
MVE/TBD -0.53 0.029 3.406 1 0.065
Sales/TA -3.30 0.534 5.98 1 0.001
CA/CL 0.071 0.087 0.655 1 0.418
NI/TA 12.58 10.340 4.941 1 0.026
NP/TE 0.000 0.060 0.001 1 0.819
TBD/TA -0.810 0.713 1.225 1 0.268
MP/BV 1.015 2.145 0.007 1 0.019
Log(TA/GNP) 1.001 0.085 0.031 1 0.09
X(1) 1.848 0.660 4.402 1 0.036
Y(1) 2.943 1.894 5.847 1 0.161
Constant 2.734 8.769 0.982 1 0.045

From the above Table No. 5.3.2.5: Variables in Equation, it is clear that only five
variables have a p-value less than the critical value of 0.05. So the null hypothesis is
rejected for these five variables, the alternative hypothesis is accepted, and these five
variables are significant in model development.
Table No. 5.3.2.6: Classification Results

Count
Defaulting 7 13 20
%age
Defaulting 35.0 65.0 100.0
Accuracy 72.8%
Type I Error 29.6%
Type II Error 35.0%

accuracy of the developed model. The classification accuracy of the developed model is
found to be 72.8%, with very high Type I and Type II errors at 29.6% and 35.0%. With this
level of classification accuracy and such high misclassification, the developed
model cannot be termed robust.
Discussion
From the above discussion, it is clear that the developed model has a classification
accuracy of 72.8% with a very high Type II error. The
findings from the study are summarized as follows:
• The significant variables belong to all the three categories of variables that have
been used in the study.
• No variable with debt component is found significant for the model
development.
• From the statistical tests the developed model is significant but not highly robust.

comparable accuracy at 72.8%.
• These results are lower than the results from the Indian firms as well as the
American firms used in this study.
intensity for the firms for the sample of British firms. This default intensity is used to
arrive at the probability of default. The sample initially has 250 firm years of 50 firms
from the United Kingdom for the period of five years from 2012 to 2016. The
default occurrences are counted for every firm over the whole sample period. Similarly,
the five-year means of the selected ratios are calculated for every firm. This turns the
sample into 50 firms. The sample is then checked for outliers using Mahalanobis
statistics, which brings the sample size down to 39 firms. Because of high
multicollinearity, factor analysis is carried out. It gives eleven variables for further
analysis. Averages of the obtained eleven ratios are taken as independent variables
and the count data is used as the dependent variable. On this sample a Poisson
regression is run using E-Views software.
The following five-factor model is found:
Z = 0.805871 + 0.80522 CA/CL + 0.0041 EBIT/INT - 4.194 EBIT/TA - 0.03133 MVE/TBD - 1.0644 Log(TA/GNP)
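The step from the fitted equation to a default probability can be sketched in Python as follows. The sketch rests on two assumptions that are not stated explicitly in the text: that the Poisson regression uses the usual log link, so that the default intensity is exp(Z), and that the probability of default is taken as the probability of at least one default event over the sample period. Function names and firm values are hypothetical.

```python
import math

# Coefficients copied from the five-factor reduced form equation.
INTERCEPT = 0.805871
COEFFICIENTS = {
    "CA/CL": 0.80522,
    "EBIT/INT": 0.0041,
    "EBIT/TA": -4.194,
    "MVE/TBD": -0.03133,
    "Log(TA/GNP)": -1.0644,
}


def linear_index(means):
    """Linear predictor Z built from the five-year mean ratios of a firm."""
    return INTERCEPT + sum(
        coefficient * means[name] for name, coefficient in COEFFICIENTS.items()
    )


def default_intensity(means):
    # Assumption: log link, so the expected default count is exp(Z).
    return math.exp(linear_index(means))


def probability_of_default(means):
    # Assumption: PD is the Poisson probability of at least one default.
    return 1.0 - math.exp(-default_intensity(means))


# Hypothetical firm (illustrative five-year means, not from the sample).
hypothetical_firm = {
    "CA/CL": 1.2,
    "EBIT/INT": 5.0,
    "EBIT/TA": 0.08,
    "MVE/TBD": 4.0,
    "Log(TA/GNP)": 4.0,
}
print(round(default_intensity(hypothetical_firm), 4))
print(round(probability_of_default(hypothetical_firm), 4))
```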

Empirical Results

From the below Table No. 5.3.3.1: Case Processing Summary, it is clear that the
sample size is 39. Of these firms, only 37 firms are included in the study.
Table No. 5.3.3.1: Case Processing Summary
The UK
Sample Size (Adjusted) 39
Included Observations 37
The below Table No. 5.3.3.2: Statistical Coefficients presents different coefficients on
the basis of which the relative statistical robustness of the models will be assessed.
Table No. 5.3.3.2: Statistical Coefficients
The UK
R-squared 0.517
Adjusted R-squared 0.284
S.E. of regression 1.313
Sum squared residuals 46.582
Log likelihood -40.010
Restricted log likelihood -65.701
Avg. log likelihood -0.976
Mean dependent var. 0.878
S.D. dependent var. 1.552
Akaike info criterion 2.635
Schwarz criterion 3.220
Hannan-Quinn criterion 2.848
LR statistic 51.383
Prob. (LR statistic) 0.000

From the above Table No. 5.3.3.2: Statistical Coefficients, R-squared is 0.517 and
adjusted R-squared is 0.284. This indicates that the data do not fit the developed
model well.
From the above Table No. 5.3.3.2: Statistical Coefficients, it is clear that the Standard
Error of the Regression is on the lower side at 1.313. This indicates that the statistical
noise in the estimates is low. So the developed model is found to be robust.
From the above Table No. 5.3.3.2: Statistical Coefficients, Sum Squared Residuals
for the model is 46.582. Sum Squared Residuals is a measure of the distance from a
the data. So the model is poor fit for the data.
From the above Table No. 5.3.3.2: Statistical Coefficients, it is clear from the values of
the Log Likelihood, Restricted Log Likelihood and Average Log Likelihood that the
developed model does not have high precision.
From the above Table No. 5.3.3.2: Statistical Coefficients, it is clear that the standard
deviation of the dependent variable is 1.552. So the model is not highly robust.
From the above Table No. 5.3.3.2: Statistical Coefficients, the Akaike Information
Criterion for the model is 2.635, which cannot be construed as low. So the model is
not very robust.

Schwarz Criterion
From the above Table No. 5.3.3.2: Statistical Coefficients, the Schwarz Criterion for the
model is 3.220, which is on the higher side. This makes the developed model less
robust.
From the above Table No. 5.3.3.2: Statistical Coefficients, the Hannan-Quinn Criterion
for the model is 2.848. This value is considered high. So it can be said that the
developed model is not very robust.
From the above Table No. 5.3.3.2: Statistical Coefficients, the LR statistic for the
model is 51.383 with a p-value of 0.000, which is lower than the
critical p-value of 0.05. That means the null hypothesis is rejected for the model and the
alternative hypothesis, that not all slope coefficients are equal to zero, is accepted.
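The LR statistic can be cross-checked against the two log likelihoods reported in Table No. 5.3.3.2. The minimal Python sketch below assumes the usual definition, twice the difference between the unrestricted and the restricted log likelihood; the variable names are illustrative.

```python
# Values copied from Table No. 5.3.3.2: Statistical Coefficients.
log_likelihood = -40.010
restricted_log_likelihood = -65.701

# Likelihood ratio statistic: 2 * (unrestricted LL - restricted LL).
lr_statistic = 2.0 * (log_likelihood - restricted_log_likelihood)
print(round(lr_statistic, 3))
```

The result, 51.382, agrees with the reported 51.383 up to rounding of the tabulated log likelihoods.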
The below Table No. 5.3.3.3: Variables in Equation presents the estimated

Table No. 5.3.3.3: Regression Estimate
Variable Coefficient Std. Error z-Statistic Prob.

Constant 0.805871 0.581105 6.673955 0.0007
CA/CL 0.80522 0.092021 7.467192 0.003
EBIT/INT 0.0041 0.00768 0.2331 0.019
EBIT/TA -4.194 0.71997 -3.908 0.0085
FAT -0.0083 0.0158 -0.7546 0.753
ITR 0.0313 0.0781 0.9309 0.813
MVE/TBD -0.03133 0.06027 -0.9905 0.0430
NP/TE -1.0039 0.01572 -1.9009 0.0573
OCFR 0.03897 0.078391 0.8903 0.192
SALES/TA -1.1809 0.2301 -1.0217 0.218
TBD/TA 0.66605 0.6252 1.27151 0.168
Log(TA/GNP) -1.0644 0.56556 -2.01848 0.0011
From the above Table No. 5.3.3.3: Variables in Equation, it is clear that five variables
from the sample have p-values less than the critical p-value of 0.05. That means
the null hypothesis is rejected for these variables. These variables are therefore
part of the developed model, along with a significant intercept.
Table No. 5.3.3.4: Classification Result
Original Z                  Predicted Group Membership
                            Non-Defaulting   Defaulting   Total
Count   Defaulting          9                11           20
%       Non-Defaulting      69.7             30.3         100.0
        Defaulting          45.0             55.0         100.0
Accuracy 67.9%
Type I Error 30.3%
Type II Error 45.0%

The above Table No. 5.3.3.4: Classification Result presents the classification accuracy
of the developed model. The cut-off probability for group classification has been taken as
0.5. The classification accuracy of the developed model is 67.9%, with a high Type I error at
30.3% and a very high Type II error at 45.0%. Overall the developed model has a reasonable
level of accuracy, but the Type II error calls the robustness of the developed model into question.
These results are comparable to the results of the study on Indian firms.
Discussion
From the results it is clear that the developed model is significant and has a
classification accuracy of 67.9%, which is comparable to many studies, but with a high level
of Type II error. The findings from the study are summarized as follows:
• The significant variables belong to two categories of variables used in the study,
namely accounting and economic variables.
• Variables with debt component are found significant for the model development.
• As per the statistical tests, the model is significant but is not found to be very
robust.
• As per the LR statistic, not all coefficients are zero.
• The developed model has yielded a classification accuracy of 67.9% with low
• The problem with the model is that it has a high level of Type I and Type II errors.
The total number of firm years for the 50 firms is 250. Of this sample only 141 firm
years could be used for the structural model. Because of inconsistency in the data, a large
number of firm years have been dropped from the sample for the structural model
based on the Black, Scholes and Merton (1973) model.
The UK
Cases Prob.
Defaulting 19 13.5
Total 141 100
total number of firm years is 141. Of this sample 122 firm years belong to the
non-defaulting group and 19 firm years belong to the defaulting group. That means 13.5%
of the cases in the sample are defaulting cases.
All Firms
Avg. D2 Avg. Prob.
Non-Defaulting 13792.94 23.5%
Defaulting -5640.40 88.32%
Total 12712.83 32.2%
From Table No. 5.3.4.2: D2 and Default Probabilities, it is clear that the average D2
for the non-defaulting cases is 13792.94 with an average probability of default of 23.5%.
The average D2 for the defaulting cases is -5640.40 with an average probability of
default of 88.32%.
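For orientation, the textbook Merton relation between the distance measure D2 and the probability of default can be sketched in Python as below. This is the standard formulation, PD = N(-D2), and not necessarily the exact computation used for the table; the function names, parameter names and input values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton_d2(asset_value, debt, drift, asset_volatility, horizon=1.0):
    """Distance to default in the textbook Merton model.

    d2 = [ln(V / D) + (drift - 0.5 * sigma^2) * T] / (sigma * sqrt(T))
    """
    numerator = math.log(asset_value / debt) + (
        drift - 0.5 * asset_volatility ** 2
    ) * horizon
    return numerator / (asset_volatility * math.sqrt(horizon))


def default_probability(d2):
    """Probability that asset value falls below debt at the horizon."""
    return norm_cdf(-d2)


# Hypothetical inputs (illustrative only, not from the sample).
d2 = merton_d2(asset_value=120.0, debt=100.0, drift=0.05, asset_volatility=0.25)
print(round(d2, 4), round(default_probability(d2), 4))
```

A larger D2 corresponds to a lower probability of default, which is consistent with the non-defaulting group having the higher average D2 and the lower average probability.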


Defaulting 7 12 19
Non-Defaulting 84.4% 15.6%
Defaulting 36.9% 63.2%
Accuracy 81.6%
Type I Error 15.6%
Type II Error 36.9%
classification accuracy of the structural model is on the higher side at 81.6%. Type I and
Type II errors are also on the higher side at 15.6% and 36.9% respectively. So on the basis of
classification accuracy, the model can be said to be strong enough to classify firms
accurately into groups.
Discussion
From prior probabilities, it is clear that the proportion of defaulting firms in the
sample is 13.5%. From the classification results, it is clear that classification accuracy
is on higher side at 81.6% for the structural model for the sample but the Type I and
Type II errors are comparatively on higher side.
Discussion
From the above discussion, it is clear that the structural model has the highest
classification accuracy at 81.6% and that the three models developed using the MDA, logit
and reduced form methods are significant but not highly robust. The findings from the
study are summarized as follows:
• For all developed models for the British firms, all the three categories of
variables are found significant. So it can be concluded that for predicting default
of British firms, the model should have all the three categories of variables,
namely accounting, market and economic variables.
• Only for two models, the MDA and reduced form models, are variables with debt
component found significant.
• For all developed models for British firms, the test results clearly indicate that
the developed models are significant but not highly robust.
• For the MDA model the classification accuracy, at 73.1%, is comparable to Altman (1968),
Taffler (1984) and Peel, Peel, & Pope (1986), but is lower than the results from the
Indian firms as well as the US firms used in this study.
• For the logit model, the classification accuracy is 72.8%, which is the lowest among all the
models developed in the study using the logit method but is comparable to Charitou,
Neophytou, & Charalambous (2004), Bunyaminu & Issah (2012), and Lin (2015).
• For the reduced form model the classification accuracy is 67.9%.
• For the structural model, the classification accuracy is on the higher side at 81.6%.
It is comparable to studies like Bharath &
Shumway (2004). But the Type I and Type II errors are comparatively on the higher
side.
• All developed models are plagued with high Type I and Type II errors.
• Among all models for the British firms, the classification accuracy of the structural
model is the highest.

5.3.6. Conclusion
From the above discussion it is clear that the structural model has the highest
classification accuracy at 81.6% and the reduced form model has the lowest. The three
models developed using the MDA, logit and reduced form methods are significant but not
highly robust. The classification accuracies of the developed models and the structural
model are comparable to many studies like Taffler (1984), Goudie & Meeks (1991),
Alici (1995), Christidis & Gregory (2010), Duan, Sun, & Wang (2011), Blanco,
Irimia, & Oliver (2012), Jackson & Wood (2013), Charalambakis, Espenlaub, &
Garrett (2009), Agarwal & Taffler (2008), Agarwal & Taffler (2008b). All the
four models are plagued with a high level of Type I and Type II errors. As far as the
significant variables are concerned, all the three categories of variables are important
for the model development. Only for two models, the MDA and reduced form models, are
variables with debt component found significant.
5.3.7. Limitations
The developed models using three methods, like other models, face a number of
limitations as follows:
• Type I and Type II errors are found to be on the higher side for all the methods.
The Type II error in particular is very high and is the main concern. This
makes these models less effective.
• It is observed that a large number of non-defaulting firms are being predicted as
defaulting by all the methods. Is this an indication of financial distress in these firms,
of something else, or of a problem with the model? This needs to be
investigated.
• From sampling it is clear there is problem of sampling bias and selection bias in
included in the study accidentally.
• The study has used only accounting, market, economic and categorical
variables. There is a need to include qualitative variables as well. This may improve the
results.
• There is further scope for the study on a larger data set over a longer sample period.
There is a need to carry out a separate study for small and medium enterprises, as this
study could not do so because of the small data set owing to the unavailability of the same.
SECTION IV:
COMPARATIVE STUDY OF DEFAULT PREDICTION MODELS
This section presents the comparison of the findings from the study on Indian sample,
American sample and British Sample into three parts; variables in equation, statistical
robustness and classification accuracy.
From comparison of the significant variables for the models for the Indian, American and
British samples, the following are the findings:
• For the models for the Indian sample, variables from all the three categories of
independent variables are significant for the MDA, logit and reduced form models.
• For the models for the American and British samples, all the three categories of
variables are significant for the MDA and logit models, but only accounting and
economic variables are significant for the reduced form model.
• For the Indian sample, besides the amount of debt with respect to total assets, no
other variable with debt component is significant in default prediction for the MDA,
logit and reduced form models.
• For the American and British samples, most of the variables with debt component
are found significant for the MDA and logit models, but for the reduced form
model no variable with debt component is found significant. This is a departure
from the findings for the Indian sample.
• For the logit model, categorical variable Y is more important than X for the Indian
sample, but for the American sample no categorical variable is significant, while for
the British sample only X is significant.
• The intercepts of the logit models for the Indian sample are not significant, but for
the American and British samples the intercepts are significant.
• For the Indian and American samples, models for all the methods are significant and
robust, but for the British sample, models for all the methods are significant but not
highly robust.
From the below Table No. 5.4.3.1: Classification Accuracies, the following are the
findings:
• The logit model for the Indian sample has the highest classification accuracy, at 97.7%,
among all the models and methods.
• For the Indian and American samples the logit models have the highest classification
accuracies, and for the British sample the structural model has the highest classification
accuracy.
• For the MDA, logit and reduced form methods, the models for the Indian sample have the
highest classification accuracy.
• For the structural model, the model for the American sample has the highest classification
accuracy.
Table No. 5.4.3.1: Classification Accuracies
India The US The UK

MDA 92.1 89.8 73.1
Logit 97.7 95.0 72.8
Reduced Form Model 94.0 87.7 67.9
Structural Model 82.3 85.9 81.6

Chapter 6
Summary and Conclusion

6.1. Introduction
In this chapter, the summary, the limitations and the further scope for studies for all the
methods and models for the three samples are discussed on the basis of the findings from
Chapter 5, along with the conceptual and theoretical assumptions.
6.2. Summary of Findings
6.2.1 Findings from Indian Sample
From the above discussion for the four broad samples and all the developed models
using four methods for the two sample periods, it can be said that few findings are
universal in nature for all the methods while some findings are associated with
respective methods. Following are the findings:
 For the Indian sample, in the long term only accounting and economic variables
are significant for all models across all methods, while for the short-term models
all the categories of variables are significant. This signifies that in the long run
market information about the firm is not important, but in the short run market
dynamics have an impact on default prediction.
 For the Indian sample, in the long term, besides the amount of debt with respect
to total assets, no other variable with a debt component is significant for default
prediction in any model for any method, but in the short run most of the
variables having a debt component are significant. This means that in the short
run all forms of debt are important for default prediction.
 For MDA models, all the significant variables for the large sample period are
also significant for the small sample period. This indicates that in the long run
firms tend to follow a similar pattern, while in the short run they follow different
patterns in the context of default prediction. In the short run, or when asset size
changes, additional variables are required to capture the additional risk.
 For the MDA and logit models, the distinction between matched and unmatched
sample is irrelevant.
 For SMEs, only accounting and market variables are significant for both the
sample periods. Economic variables seem to have no role in default prediction
for SMEs.
 For the logistic regression method, categorical variable Y is significant for all
the models but X is significant for only one model.
 Unlike MDA models, logit models for the two sample periods have almost
entirely different sets of variables.
 For every model except the logit models, the intercept is significant.
 For MDA and logit method, the models for the small sample period are more
robust.
 In case of reduced form models, models developed for the large sample period
are more robust.
 In case of MDA, log determinants and Wilks’ Lambdas are higher for the large
sample period. It seems that the log determinant and Wilks’ Lambda depend on
the size of the data set.
 The log determinant of the non-defaulting group is almost double the log
determinant of the defaulting group. This indicates that the developed models
can discriminate non-defaulting firms better than the defaulting group.
 For Logit models, as per the Omnibus test and Model Summary, all the developed
models are robust and, on the basis of these criteria interpreted separately, it
can be said that the models developed for the small sample periods have the
highest level of robustness.
 For Logit models, most of the specified models fit the data.
 The highest classification accuracy is found for the model for large sample
period for All Firms using logistic regression analysis at 97.7%.
 Logit models have the highest classification accuracy among all the methods at 97.7%.
 Models developed for the large sample period have the highest classification
accuracy for all the methods. However, models developed for the small sample
period are more robust.
 The presence of public sector undertakings in sample does not have any impact
as the classification results are comparable to the classification results of large
firms.
 In forward testing of in-sample firms as well as out-of-sample firms, the
classification accuracy is found to be higher for all the models using all the
methods.
 The classification accuracies of the developed models are found to deteriorate
over time (Altman, 2005).
 In the case of MDA models, the highest classification accuracy is found for the
Large Firms with PSU at 92.1%, and the models developed for the Unmatched
Sample have higher prediction accuracy than those for the Matched Sample. This
is contrary to the assumption of many studies such as Altman (1968), Beaver
(1966) and Gupta (2014).
 In the case of the Logit method, on the basis of classification results, there is no
distinction between the matched and unmatched samples. They have almost the
same results.
 For reduced form models, the highest classification accuracy is found for Large
Firms for the small sample period at 77.4%, although for the forward period and
out-of-sample firms accuracy improves to 94%.
 In the case of the structural model, the highest classification accuracy is found
to be 82.3% for SMEs, which is lower than in other studies (Bandyopadhyay, 2007).
 Misclassification is very high for all the models developed using all the four
methods.
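
The classification accuracy and the Type I and Type II errors referred to in these findings all come from the same two-group classification table. A minimal sketch follows, assuming the convention common in the bankruptcy-prediction literature in which a Type I error is a defaulting firm classified as non-defaulting and a Type II error is a non-defaulting firm classified as defaulting. The class name, function names and counts are hypothetical and are not results from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationCounts:
    """Hypothetical counts from classifying firms into two groups."""
    default_as_default: int        # defaulting firms correctly classified
    default_as_nondefault: int     # defaulting firms missed (Type I, as assumed here)
    nondefault_as_nondefault: int  # non-defaulting firms correctly classified
    nondefault_as_default: int     # non-defaulting firms wrongly flagged (Type II)


def accuracy(c):
    """Share of all firms that were classified into their actual group."""
    correct = c.default_as_default + c.nondefault_as_nondefault
    total = correct + c.default_as_nondefault + c.nondefault_as_default
    return correct / total


def type_i_error(c):
    """Share of defaulting firms classified as non-defaulting."""
    return c.default_as_nondefault / (c.default_as_default + c.default_as_nondefault)


def type_ii_error(c):
    """Share of non-defaulting firms classified as defaulting."""
    return c.nondefault_as_default / (c.nondefault_as_nondefault + c.nondefault_as_default)


if __name__ == "__main__":
    # Made-up example: 50 defaulting and 50 non-defaulting firms.
    counts = ClassificationCounts(40, 10, 45, 5)
    print(accuracy(counts), type_i_error(counts), type_ii_error(counts))
```

A model can report a high overall accuracy while still missing a sizeable share of defaulting firms, which is why the two error rates are read alongside the accuracy figure.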
6.2.2. Findings from the American Sample
From the results of the American sample, the following findings have been obtained:
 For MDA, logit and reduced form models, all the three categories of variables,
namely accounting, market and economic variables, are found to be significant.
 For MDA and logit models, only total book value of debt with respect to total
assets is significant and no other variable with a debt component is found
significant, but for reduced form models no variable with a debt component is
found significant.
 Three variables, namely Sales/TA, NI/TA and log(TA/GNP), are significant for
all the methods, and these belong to two categories: accounting and economic
variables.
 Intercepts for all the models are significant.
 The MDA, logit and reduced form models are significant and robust.
 For the MDA model, the classification accuracy at 89.8% is higher than Altman
(1968) and Bandyopadhyay (2006); it is lower than the models for the Indian
firms but higher than those for the British firms used in the study.
 For the logit model, the classification accuracy of the model for the US is found
to be 95%, which is lower than the model for Indian firms but higher than that for
the British firms. These results are higher than Ohlson (1980) and
Bandyopadhyay (2006).
 The developed reduced form model has yielded a classification accuracy of
87.7%. It is higher than the models developed for the Indian firms and the
British firms.
 The classification accuracy for the structural model is found to be 85.9%, which
is higher than the models developed for the Indian firms as well as the British
firms.
 Classification accuracy for logit model is the highest among all the methods for
the US firms.
 Type I and Type II errors for all the methods are higher than in many
studies.
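
A logit model of the kind summarized above converts a linear score of firm-level variables into a probability of default through the logistic function. The sketch below illustrates this with the three variables found significant across methods for the American sample. The intercept, coefficients, input ratios and the 0.5 cut-off are hypothetical placeholders chosen only for illustration; they are not estimates from this study.

```python
import math


def default_probability(intercept, coefficients, ratios):
    """Logistic transform of a linear score: p = 1 / (1 + exp(-z))."""
    z = intercept + sum(coefficients[name] * ratios[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.5):
    """Label a firm using an assumed cut-off on the predicted probability."""
    return "default" if probability >= cutoff else "non-default"


if __name__ == "__main__":
    # Hypothetical placeholder coefficients and ratios (not thesis estimates).
    coefficients = {"Sales/TA": -1.2, "NI/TA": -4.0, "log(TA/GNP)": -0.3}
    firm = {"Sales/TA": 0.8, "NI/TA": 0.05, "log(TA/GNP)": -2.0}
    p = default_probability(0.5, coefficients, firm)
    print(round(p, 3), classify(p))
```

The cut-off determines the balance between the two kinds of misclassification, so a different choice would change the reported error rates without changing the fitted model.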
6.2.3. Findings from the British Sample
For the British sample, the structural model has the highest classification accuracy at
81.6%, and the three models developed using MDA, logit and reduced form methods are
significant but not highly robust. The findings from the study are summarized as
follows:
 For all developed models for the British firms, all the three categories of
variables are found significant. So it can be concluded that for predicting default
of British firms, the model should have all the three categories of variables
namely, accounting, market and economic variables.
 Only for two models, MDA and reduced form model, variables with debt
component are found significant.
 For all developed models for British firms, the test results clearly indicate that
the developed models are significant but not highly robust.
 For the MDA model, the classification accuracy at 73.1% is comparable to Altman
(1968), Taffler (1984) and Peel, Peel, & Pope (1986), but is lower than the
results for the Indian firms as well as the US firms used in this study.
 For the logit model, the classification accuracy is 72.8%, which is the lowest
among all the models developed in the study using the logit method but is
comparable to Charitou, Neophytou, & Charalambous (2004), Bunyaminu & Issah
(2012), and Lin (2015).
 For the reduced form model, classification accuracy is 67.9%.
 For the structural model, classification accuracy is on the higher side at 81.6%.
It is comparable to studies like Bharath & Shumway (2004), but the Type I and
Type II errors are comparatively on the higher side.
 All developed models are plagued with high Type I and Type II errors.
 Among all models for the British firms, the classification accuracy for structural
model is the highest.
6.2.4. Findings from the Comparative Study
From comparison of the significant variables for the models for Indian, American and
British sample the following are the findings:
 For the models for Indian sample, the variables from all the three categories of
independent variables are significant for MDA, logit and reduced form model.
 For the models for American and British sample, all the three categories of
variables are significant for MDA and logit model but only accounting and
economic variables are significant for reduced form model.
 For Indian sample besides the amount of debt with respect to total assets, no
other variable with debt component is significant in default prediction for MDA,
logit and reduced form model.
 For American and British sample for MDA and logit models, most of the
variables with debt component are found significant but for reduced form
model, no variable with debt component is found significant. This is a departure
from findings for Indian sample.
 For logit model, categorical variable Y is more important than X for Indian
sample but for American sample no categorical variable is significant while for
British sample only is significant.
 The intercepts for logit models for Indian samples are not significant but for
American and British sample intercepts are significant.
 For Indian and American sample, models for all the methods are significant and
robust but for British sample, models for all the methods are significant but not
highly robust.
The following findings emerge from Table No. 5.4.3.1 (Classification Accuracies):
 Logit model for Indian sample has the highest classification accuracy at 97.7%
among all the models and methods.
 For Indian and American sample logit models have the highest classification
accuracies and for British sample, structural model has the highest classification
accuracy.
 For MDA, logit and reduced form methods, models for Indian sample have the
highest classification accuracy.
 For structural model, model for American sample has the highest classification
accuracy.
6.3. Conclusion
For Indian sample, the models for the small sample period have higher statistical
robustness but the models for the large sample period have higher classification accuracy
except reduced form model. On comparison of the classification accuracies, logit models
have higher accuracies (97.7%) than that of the discriminant models (Gupta V. , 2014)
and structural model as well (Sharma, Singh, & Upadhyay, 2014). And over time
classification accuracy deteriorates for the developed models. These results are
comparable to many studies like Altman (1968), Ohlson (1980), Altman & Narayanan
(1997), Sen Chaudhury (1999), Altman (2000), Bandyopadhyay (2006), Jayadev (2006),
Agarwal & Taffler (2008) and Gupta (2014) etc. As far as the accuracies of the reduced
form model and structural model are concerned, the results of the study are found less
efficient than many studies like Patel & Pereira (2005), Bandyopadhyay (2007), Agarwal
& Taffler (2008), Kulkarni, Mishra, & Thakkar (2008), Duan, Sun, & Wang (2011) and
Sharma, Singh, & Upadhyay (2014) etc. For the newly developed model using the reduced
form method, the statistical robustness as well as the classification accuracy is the
highest for the model developed for the small sample period.
For the US and UK samples, all categories of variables are important for model
building for MDA and logit models, but for the reduced form model only two categories,
accounting and economic variables, are important. MDA and logit models have
variables with a debt component, but in the case of the reduced form model, variables
with a debt component are not significant.
From the analyses of all the models using the four methods for all the samples, it can
be said that the different methods find different sets of variables significant for the
model development with varying statistical robustness and classification accuracies.
However, it can be concluded that the presence of all the three categories of variables
in the model normally improves the predictive ability of the models. For most of the
models for all the samples, debt with respect to assets is found significant. Other
variables with debt components keep changing from one model to another. For the
Indian and the US samples, logit models have the highest classification accuracy
among all the methods, while for the UK sample the structural model has the highest
classification accuracy. Of the four methods, the MDA, logit and reduced form models
have the highest accuracy for the Indian sample, while for the structural model
classification accuracy is the highest for the US sample.
6.4. Limitations of the Study
In this study 30 models have been developed using MDA, logit and reduced form
methods for the three samples. Besides these thirty models, the structural model (BSM)
has been tested on the three samples. In the process of model development, a number of
limiting factors were found which are capable of affecting the models’ statistical
robustness as well as effectiveness. Those limiting factors are as follows:
 MDA and Logit models for all the three samples have high Type I and Type II
errors. High Type II errors weaken the effectiveness of the models.
 For the reduced form method for the Indian sample, Type I error is high. This
weakens the robustness of the model.
 The study indicates that the models for the small sample period are more robust
but classification accuracy is higher for the large sample period. This is the
biggest limitation of the study and needs to be addressed.
 The scarcity of reliable data on default history is a limiting factor in credit risk
modeling.
 The models for the British sample are not highly robust. The reason may be the
small sample size.
6.5. Further Scope for Studies
In any study, limiting factors have a negative impact on the overall efficiency of
models, but at the same time they offer scope for further studies. The scope for
further studies derived from the present study is as follows:
 It is quite possible that a few important variables have inadvertently not been
included in the study. So there is a need to explore and include other
variables which may provide better results.
 The study uses only accounting, market, economic and categorical variables.
The inclusion of qualitative variables may improve the accuracy.
 There is further scope for study on a larger set of data for a larger sample period
to generalize the findings, as default prediction studies face the problem of
sampling and selection biases.
 There is further scope to study with larger sample periods.
 There is further scope for sectoral analysis to find out effect of sectors on default
probabilities.
 There is a need to carry out a separate study for small and medium enterprises
with a larger set of data, as this study could not do so because of the
unavailability of a large sample.
Bibliography

Acharya, A., Chatterjee, I., & Pal, R. (2003). Estimating Default Probability of Firms:
An Indian Perspective. Mumbai: Indira Gandhi Institute of Development
Research, Mumbai, India.
Agarwal, V., & Taffler, R. (2008). Comparing the performance of market-based and
accounting-based bankruptcy prediction models. Journal of Banking &
Finance, 32(8), 1541-1551.
Aguado, C., & Benito, E. (2012). Determinants of Corporate Default: A BMA
Approach. Banco de Espana Working Paper No. 1221.
Altman, E. (1968). Financial Ratios, Discriminant Analysis and the Prediction of
Corporate Bankruptcy. Journal of Finance, 23(4), 589-609.
Altman, E. I. (1968). Financial Ratios, Discriminant Analysis and Prediction of
Corporate Bankruptcy. The Journal of Finance, 589-609.
Altman, E. I. (2000). Predicting financial distress of companies: revisiting the
Z-score and ZETA models. New York: Stern School of Business, New York
University.
Altman, E. I. (2002). Corporate Distress Prediction Models in a Turbulent Economic
and Basel II Environment.
Altman, E. I. (2002). Managing Credit Risk: A Challenge for the. Economic Notes by
Banca Monte dei Paschi di Siena SpA, 31(2), 201-214.
Altman, E. I. (2005). Estimating Default Probabilities of Corporate Bonds over
Various Investment Horizons. 2005 Financial Analysts Seminar, Evanston,
Illinois.
Altman, E. I., & Narayanan, P. (1997). Business Failure Classification Models: An
International Survey. In F. Choi, International Accounting and Finance
Handbook. New York: John Wiley & Sons.
Altman, E. I., & Sabato, G. (2007). Modelling Credit Risk for SMEs: Evidence from
the U.S. Market. Abacus, 43(3), 332-357.
Antune, A., Ribeiro, N., & Antão, P. (2005). Estimation of Probabilities of Default
under Macroeconomic Scenarios, Financial Stability Report 2005. Banco de
Portugal.
Balcaen, S. (2006). 35 years of studies on business failure: an overview of the classic
statistical methodologies and their related problems. British Accounting
Review, 63-93.
Bandyopadhyay, A. (2006). Predicting probability of default of Indian corporate
bonds: logistic and Z-score model approaches. The Journal of Risk Finance,
7(3), 255-272.
Bandyopadhyay, A. (2007). Mapping corporate drift towards default Part 2: a hybrid
credit-scoring model. Journal of Risk Finance, 8(1), 46-55.
Bandyopadhyay, A. (2007). Mapping Corporate Drift towards Default: Part 1: a
market-based approach. Journal of Risk Finance, 8(1), 35-55.
Bank, T. W. (2018). GDP. Retrieved from World Bank:
https://data.worldbank.org/indicator/NY.GDP.MKTP.CD?locations=US
Beaver, H. W. (1966). Financial Ratios as Predictors of Failure. Journal of
Accounting Research, 71-111.
Beaver, W. (1966). Financial Ratios as Predictors of Failure, Empirical Research in
Accounting: Selected Studies. Journal of Accounting Research, 5, 71-111.
Beaver, W. H., McNichols, M. F., & Rhie, J. (2005). Have Financial Statements
Become Less Informative? Evidence from the Ability of Financial Ratios to
Predict Bankruptcy. Review of Accounting Studies, 10(1), 93-122.
Benzschawel, T. (2015). Default Models: Past, Present and Future. In J. Callaghan, &
A. a. Murphy (Ed.), Third International Conference on Credit Analysis and
Risk Management (pp. 5-92). Newcastle: Cambridge Scholars Publishing.
Bharath, S. T., & Shumway, T. (2004). Forecasting Default with the KMV-Merton
Model.
Bhardwaj, G., & Sengupta, R. (2015). Credit Scoring and Loan Default. The Federal
Reserve Bank of Kansas City Research Working Papers.
Brown, C. (1998). Multiple Discriminant Analysis. In C. Brown, Applied Multivariate
Statistics in Geohydrology and Related Sciences (pp. 115-128). Springer.
Castagnolo, F., & Ferro, G. (2014). Models for predicting default: towards efficient
forecasts. The Journal of Risk Finance, 15(1), 52-70.
Chacko, G., Sjoman, A., Motohashi, H., & Dessain, V. (2006). Credit Derivatives: A
Primer on Credit Risk, Modeling and Instruments. Wharton School
Publishing.
Charitou, A., & Trigeorgis, L. (2004). Explaining Bankruptcy Using Option Theory.
Chava, S., Stefanescu, C., & Turnbull, S. (2010). Modeling the Loss Distribution.
Chen, C.-J. (2007). The instantaneous and forward default intensity of structural
models.
Christidis, A. C.-Y., & Gregory. (2010). Some New Models for Financial Distress
Prediction in the UK. Xfi Centre for Finance & Investment, University of
Exeter.
Chudson, W. A. (1945). A Survey of Corporate Financial Structure. In W. A.
Chudson, The Pattern of Corporate Financial Structure: A Cross-Section
View of Manufacturing, Mining, Trade, and Construction, 1937 (pp. 1-16).
NBER.
CNBC. (2017). Here are the retailers that have filed for bankruptcy protection in
2017. Retrieved from CNBC:
https://www.cnbc.com/2017/09/23/here-are-the-retailers-that-filed-for-bankruptcy-protection-in-2017.html
Davydenko, S. A. (2012). When Do Firms Default? A Study of the Default Boundary.
EFA Moscow Meetings Paper; AFA San Francisco Meetings Paper; WFA
Keystone Meetings Paper.
Desai, J., & Joshi, N. A. (2015). A Proposed Model for Industrial Sickness.
International Journal of Engineering Development and Research, 3(4), 754-760.
Duan, J., Sun, J., & Wang, T. (2011). Multiperiod Corporate Default Prediction - A
Forward Intensity Approach. National University of Singapore.
Duffie, D., Saita, L., & Wang, K. (2007). Multi-period Corporate Default Prediction
with Stochastic Covariates. Journal of Financial Economics, 83, 635-665.
Ferus, A. (2008). The DEA Method in Managing the Credit Risk in Companies.
Ekonomika, 109-118.
Finch, G. (2017). London Retains Its Crown as World’s Top Financial Center.
Retrieved from Bloomberg:
https://www.bloomberg.com/news/articles/2017-09-11/london-still-tops-financial-centers-despite-brexit-survey-says
FitzPatrick, P. J. (1932). A Comparison of the Ratios of Successful Industrial
Enterprises With Those of Failed Companies. The Certified Public
Accountant.
Frade, J. (2008). Credit Risk Modeling: Default Probabilities. Florida State
University.
Gugole, N. (2016). Merton Diffusion Jump Model Versus the Black and Scholes
Approach for the Log Returns and Volatility Smile Fitting. International
Journal of Pure and Applied Mathematics, 719-736.
Gujarati, D. N., & Sangeetha. (2007). Basic Econometrics. Tata McGraw-Hill.
Gupta, S., Singh, P., & Maheshwari, N. (2013). Employment of Zeta Model on the
Listed Textile Companies of Punjab. Asia Pacific Journal of Marketing &
Management Review, 2(6), 69-73.
Gupta, V. (2014). An Empirical Analysis of Default Risk for Listed Companies in
India: A Comparison of Two Prediction Models. International Journal of
Business and Management, 9, 223-234.
Hauser, R. P., & Booth, D. (2011). Predicting Bankruptcy with Robust Logistic
Regression. Journal of Data Science, 9, 565-584.
Hoggson, N. F. (1926). Banking Through the Ages. New York: Dodd, Mead &
Company.
India Ratings & Research. (2018). Default History for the Year April 2015 to March
2018. Retrieved from India Ratings:
https://www.indiaratings.co.in/regulatory-disclosurs/default-history
Inman, P., & Barr, C. (2017). The UK's debt crisis. Retrieved from The Guardian:
https://www.theguardian.com/business/2017/sep/18/uk-debt-crisis-credit-cards-car-loans
Jackendoff, N. (1962). A study of published industry financial and operating ratios.
Bureau of Economic and Business Research, Temple University.
Jarrow, R. A., & Turnbull, S. (1995). Pricing Derivatives on Financial Securities
Subject to Credit Risk. Journal of Finance, 53-85.
Jayadev, M. (2006). Predictive Power of Financial Risk Factors: An Empirical
Analysis of Default Companies. Vikalpa, 31(3), 45-56.
Katz, Y. A., & Shokhirev, N. V. (2010). Default risk modeling beyond the first-
passage approximation: Extended Black-Cox model. Physical Review.
Kaur, D., & Srivastava, S. (2016). An Analysis of Financial Soundness and Prediction
of Distress in Public Sector Undertakings of Indian Steel Industry. The Indian
Management Researcher, 3(1), 38-49.
Kealhofer, S. (2003). Quantifying Credit Risk I: Default Prediction. Financial
Analysts Journal, 30-44.
Khashman, A. (2009). A Neural Network Model for Credit Risk Evaluation.
International Journal of Neural Systems, 285-294.
Kulkarni, A., Mishra, A., & Thakkar, J. (2008). How Good is Merton Model at
Assessing Credit Risk? Evidence from India. Second Singapore International
Conference on Finance. Singapore: Second Singapore International
Conference on Finance.
Lavrakas, P. J. (2008). Encyclopedia of Survey Research Methods.
Leland, H. (2004). Prediction of Default Probabilities in Structural Models of Debt.
Journal of Investment Management, 2(2).
Li, W. (2014). Corporate Financial Distress and Bankruptcy Prediction in the North
American Construction Industry. Duke University.
Liang, Q. (2003). Corporate Financial Distress Diagnosis in China: Empirical
Analysis Using Credit Scoring Models. Hitotsubashi Journal of Commerce and
Management, 38, 13-28.
Merton, R. C. (1973). Theory of Rational Option Pricing. Bell Journal of Economics
and Management Science, 141-183.
Merton, R. C. (1974). On the Pricing of Corporate Debt: The Risk Structure of
Interest Rates. Journal of Finance, 449-470.
Merton, R. C. (1976). Option Pricing when Underlying Stock Returns are
Discontinuous. Journal of Financial Economics, 3, 125-144.
Merwin, C. L. (1942). Financing Small Corporations. NBER.
Ministry of Law and Justice, Government of India. (2016). Insolvency and
Bankruptcy Code, 2016. New Delhi: Government of India.
Mishra, A. K., & Singh, B. P. (2016). Predicting probability of default of Indian
companies: A market based approach. Theoretical and Applied Economics,
XXIII(3(608)), 197-204.
Moghadas, H. B., & Salami, E. (2014). Prediction Financial Distress by Use of
Logistic in Firms Accepted in Tehran Stock Exchange. Indian Journal of
Fundamental and Applied Life Sciences, 200-207.
Mukherjee, K. N. (2012). Corporate Bond Market in India: Current Scope and Future
Challenges. Pune: National Institute of Bank Management.
Ohlson, J. A. (1980). Financial Ratios and Probabilistic Prediction of Bankruptcy.
Journal of Accounting Research, 18(1), 109-131.
Ong, S.-W., Yap, V. C., & Khong, R. W. (2011). Corporate failure prediction: a study
of public listed companies in Malaysia. Managerial Finance, 37(6), 553-564.
Patel, K., & Pereira, R. (2005). Expected Default Probabilities in Structural Models:
Empirical Evidence. Journal of Real Estate Finance and Economy, Vol. 34,
Issue 1, 107-133.
PwC. (2017). The World in 2050. PwC.
Raghavan, S., & Sarwono, D. (2012). Corporate Bond Market in India: Lessons from
Abroad and Road Ahead. International Journal of Trade, Economics and
Finance, 3(2).
Rajan, U., Seru, A., & Vig, V. (2010). The Failure of Models That Predict Failure:
Distance, Incentives and Defaults. Chicago: Chicago GSB Research Paper No.
08-10.
Ramaratnam, M. S., & Jayaraman, R. (2010). A study on measuring the financial
soundness of select firms with special reference to Indian steel industry – An
empirical view with Z score. Asian Journal of Management Research, 724-735.
Rashid, A., & Abbas, Q. (2011). Predicting Bankruptcy in Pakistan. Theoretical and
Applied Economics, XVIII(9(562)), 103-128.
RBI. (2015). Priority Sector Lending-Targets and Classification. RBI.
RBI. (2017). Movement of Non-Performing Assets (NPAs) of Scheduled Commercial
Banks. Reserve Bank of India.
Saretto, A. (2006). Predicting and Pricing the Probability of Default. Boston, USA:
American Finance Association 2006 Boston Meetings, USA.
Schmidt, T., & Novikov, A. (2008). A Structural Model with Unobserved Default
Boundary. Applied Mathematical Finance, 15(2), 183-203.
Scott, J. (1981). The probability of bankruptcy: a comparison of empirical predictions
and theoretical models. Journal of Banking and Finance, 5, 317-44.
Sen Chaudhury, J. (1999). A Discriminant Model to Predict Default. Chennai: ICICI
Research Centre.
Senapati, M., & Ghosal, S. (2016). Modelling Corporate Sector Distress in India. RBI
Working Paper Series, RBI.
Sharma, C. S., Singh, R. K., & Upadhyay, R. K. (2014). Predicting Probability of
Default. Primax International Journal of Commerce and Management
Research, II(3), 5-13.
Singh, B. P., & Mishra, A. K. (2016). Re-estimation and comparisons of alternative
accounting based bankruptcy prediction models for Indian companies.
Financial Innovation, 2(6).
Singla, R., & Singh, G. (2017). Assessing the Probability of Failure by Using
Altman’s Model and Exploring its Relationship with Company Size: An
Evidence from Indian Steel Sector. Journal of Technology Management for
Growing Economies, 2, 167-180.
Sirirattanaphonkun, W., & Pattarathammas, S. (2012). Default Prediction for Small-
Medium Enterprises in Emerging Market: Evidence from Thailand. Seoul
Journal of Business, 18(2), 25-54.
Smith, R. F. (1930). A Test Analysis of Unsuccessful Industry Companies. Bureau of
Business Research, 31.
Stein, R. M. (2007). Benchmarking default prediction models: pitfalls and remedies in
model validation. Journal of Risk Model Validation, 1(1).
Sylla, R. (2002). A Historical Primer on the Business of Credit Rating. In R. M.
Levich, & G. a. Majnoni, Ratings, Rating Agencies and the Global Financial
System. Boston: Kluwer Academic Publishers.
Varma, J. V., & Raghunathan, V. (2000). Modelling Credit Risk in Indian Bond
Markets. The ICFAI Journal of Applied Finance, 6(3), 53-67.
Wall, A. (1919, March). Study of Credit Barometrics. Federal Reserve Bulletin, 229-243.
Wang, W.-T., & Zhou, X. (2011). Could traditional financial indicators predict the
default of small and medium-sized enterprises? International Conference on
Economics and Finance Research, IPEDR (pp. 72-76). Singapore: IACSIT
Press.
Westgaard, S., & Wijst, N. V. (2001). Default Probabilities in a Corporate Bank
Portfolio: A Logistic Model Approach. European Journal of Operational
Research, 135, 338-349.
Wilcox, J. (1971). A simple theory of financial ratios as predictors of failure. Journal
of Accounting Research, 9(2), 389-95.
Winakor, A. H., & Smith, R. F. (1935). Changes in the Financial Structure of
Unsuccessful Industrial Corporations. Bureau of Business Research, 51.
Zeitun, R., Tian, G., & Keen, K. (2007). Default Probability for the Jordanian
Companies: A Test of Cash Flow Theory. International Research Journal of
Finance and Economics, 8, 147-162.
Zhou. (1997). Default Correlation: An Analytical Result.
Appendices

APPENDIX A1:
Non-Defaulting Firms: India
1. 3 M India 23. Cadila Healthcare
2. ABB India 24. Castrol India
3. Abbot India 25. Cummins India
4. ACC Ltd 26. Dabur India
5. Adani Ports and Special Economic Zone 27. Divis Lab
6. Aditya Birla Nuvo 28. Dr. Reddy’s Laboratories
7. Ambuja Cements 29. Eicher Motors
8. Apollo Tyres 30. Emami Ltd
9. Ashok Leyland 31. Exide Industries
10. Asian Paints 32. Finolex Cables
11. Aurobindo Pharma 33. Finolex Industries
12. Bajaj Auto 34. Future Retail
13. Bajaj Electricals 35. GlaxoSmithKline Pharma
14. Balkrishna Industries 36. Glenmark Pharma
15. Bayer CropScience 37. Godfrey Phillips India
16. Berger Paints 38. Godrej Consumer Products
17. Bharat Forge 39. Godrej Industries
18. Bharti Airtel 40. Grasim Industries
19. Biocon 41. Gujarat Pipavav Port
20. Bombay Dyeing 42. Havells India
21. Bosch Ltd 43. HCL Technologies
22. Britannia Industries 44. Hero MotoCorp
45. Hindalco Industries 66. Reliance Industries
46. Hinduja Global Solutions 67. Reliance Power
47. Hindustan Unilever 68. Shree Cements
48. Hindustan Zinc 69. Siemens
49. Idea Cellular 70. Sun Pharma
50. India Cements 71. Tata Chemicals
51. Indian Acrylics 72. Tata Consultancy Services
52. Indian Hotels Company 73. Tata Motors
53. Infosys 74. Tata Power Company
54. ITC 75. Tata Steels
55. JSW Steel 76. Tech Mahindra
56. L&T 77. Titan Company
57. Lupin 78. Transport Corporation of India
58. Mahindra & Mahindra 79. TVS Electronics
59. Maruti Suzuki India 80. TVS Motor Company
60. Motherson Sumi Systems 81. Ultratech Cements
61. Nestle India 82. Vedanta
62. Nexxoft Infotel 83. Waterbase Industries
63. Nilkamal 84. Whirlpool of India
64. Pidilite Industries 85. Wipro
65. Reliance Communications
APPENDIX A2:
List of Defaulting Firms: India
1. 3i Infotech 33. Lanco
2. ABD Shipyard 34. Micro Technologies
3. Alok Industries 35. Monnet Ispat Energy
4. Alps industries 36. Moser Baer India
5. Von Corporation 37. MVL
6. Bartronics 38. Nagarjuna Oil Refinery
7. Bellarr Steels Alloys Ltd 39. Nakoda Ltd
8. Best and Crompton Engineering 40. Orchid Chemicals and Pharma
9. Bhushan Steel 41. Pardip Overseas
10. Birla Costyn India 42. Prithvi Information Solution
11. Birla Power Solutions 43. PSL
12. Bombay Rayon Fashions 44. Ramsarup Industries
13. Brandhouse Retails 45. Rei Agro
14. Deccan Chronicle Holdings 46. Ruchi Soya Industries
15. DLF 47. S Kumars Nationwide
16. Dunlop India 48. SEL Manufacturing
17. Electrosteel Steels 49. Sterling Biotech
18. Era Infra Engineering 50. Sujana Universal
19. Euro Ceramics 51. Suzlon Energy Ltd
20. Firstsource Solutions 52. SVOGL Oil Gas and Energy
21. Gangotri Iron and Steel company 53. Temptation Foods
22. Housing Development and 54. Uttam Galva Steel
Infrastructure Ltd 55. Varun Industries
23. IVRCL 56. Videocon Industries
24. Jai Balaji Industries 57. Visa Steel
25. J P Associates 58. Winsome Diamonds and Jewellery
26. JCT Electronics 59. Winsome Yanrs
27. Jindal Stainless 60. Wockhardt
28. Jupiter Bioscience 61. Woolworth India
29. Karma Industries 62. XL Energy
30. Kingfisher Airlines 63. Zenith Birla
31. KS Oils 64. Zylog Systems
32. KSL & Industries
APPENDIX A3:
Public Sector Undertakings: India
1. Balmer Lawrie and Company 11. HPCL
2. Bharat Electronics 12. MMTC
3. BHEL 13. MTNL
4. BPCL 14. National Aluminium Company
5. Chennai Petroleum 15. NMDC
6. Coal India 16. NTPC
7. Container Corporation of India 17. Oil India
8. Engineers India 18. ONGC
9. GAIL India 19. Power Grid Corporation
10. Gujarat Gas 20. SAIL
APPENDIX A4:
List of Small and Medium Enterprises (Non-defaulting): India
1. Alpha Lab 15. Lotus Chocolate
2. Asian Hotel (West) 16. Madhucon Projects
3. Borosil Glass Works 17. Mangalam Organics
4. Dynemic Products 18. Modern India
5. Genesys International 19. Moschip Semiconductor
6. GM Breweries 20. Oswal Agro Mills
7. IFB Agro 21. Oswal Spinning and Weaving Mills
8. India Motors Parts 22. Panacea Biotec
9. Indian Bright Steel 23. R S Software
10. Indian Terrain Fashion 24. Selan Exploration Technologies
11. Kalyani Steels 25. Technofab Engineering
12. KEI Industries 26. Vidhi Speciality Food Ingredients
13. Kwality Ltd 27. West Coast Paper Mills
14. Lifeline Drugs & Pharma 28. Zen Technologies
APPENDIX A5:
List of Small and Medium Enterprises (Defaulting): India
1. Acclaim Industries 16. KDL Biotech
2. Aftek 17. Kemrock Industries
3. Ankur Drugs 18. KEW Industries
4. Atcom Technologies 19. Killick Nixon
5. Bilcare 20. Linkson International
6. Cranes Software 21. Metalman Industries
7. Educomp Solutions 22. Midfield Industries
8. Electrotherm India 23. ORG Informatics
9. Euro Multivision 24. Pithampur Steels
10. First Leasing Company of India 25. Renaissance Steels
11. Geodesic 26. Shonkh Technologies
12. IAG Company 27. Surya Pharma
13. ICSA India 28. Telephone Cables
14. James Hotel 29. Twilight Litaka Pharma
15. Kanchan International
APPENDIX A6:
List of Companies: The US
1. AbbVie Inc 26. United Technologies Corp
2. Altria Group 27. Valero Energy Corp
3. Amgen Inc 28. Arch Coal
4. Asbury Automotive 29. Breitburn Energy
5. Costco Wholesale 30. Cigna Corp
6. CVS Health 31. Cisco
7. Duke Energy 32. Dynegy Inc
8. Exelon Corp 33. Energy XXI Gulf Coast
9. Express Scripts Holding 34. Gaming Partners
10. FedEx Corp 35. General Motors
11. Ford Motor 36. Halcon Resources
12. Intel Corporation 37. HP Inc
13. Johnson and Johnson 38. ITT Educational Services
14. Lowe’s Companies 39. Kodak
15. Oracle Corp 40. Linn Energy
16. PepsiCo 41. Lyondell Petroleum
17. Philip Morris International 42. Midstates Petroleum
18. The Boeing Company 43. NII Holdings
19. The Home Depot 44. Peabody Energy
20. The Kraft Heinz Company 45. Sage Therapeutics
21. The Procter & Gamble 46. SandRidge Energy
22. The Walt Disney 47. Skyworks Solutions
23. Time Warner 48. Titan Energy
24. Twenty First Century Fox 49. Ultra Petroleum
25. United Parcel Service
APPENDIX A7:
List of Companies: The UK
1. 4D Pharma 26. Atalaya Mining
2. Abcam PLC 27. ATTRAQT Group
3. Accesso Technology 28. Augean PLC
4. Advanced Oncotherapy 29. Autins Group
5. Aeorema Communications 30. Avacta Group
6. African Battery Metals 31. Avesoro Resources
7. AIREA 32. Avingtrans plc
8. Akers Bioscience 33. BAE Systems plc
9. Albert Technologies 34. Bango plc
10. Altitude Group 35. Belvoir Lettings plc
11. Alumasc Group 36. Best of the Best plc
12. Amedeo Resources 37. Bigblu Broadband
13. Amino Technologies 38. Blue Prism Group
14. Amryt Pharma 39. BP PLC
15. Andrews Sykes Group 40. Britvic PLC
16. Angel Plc 41. easyJet
17. Angus Energy 42. Electrocomponents plc
18. Animalcare Group 43. Essentra PLC
19. Anpario PLC 44. GlaxoSmithKline
20. APC Technologies 45. Informa plc
21. appScatter Group 46. KAZ Minerals
22. Arden Partners 47. Meggitt
23. Argo Group 48. Northgate
24. Argos Resources 49. The Sage Group
25. ASOS PLC 50. Whitbread