Predicting Default by Indian Firms Using Statistical Methods

Rajeev Kumar Upadhyay1
ABSTRACT
This study assesses the classification accuracies of two statistical methods, namely multiple discriminant analysis and logistic regression, in default prediction. It uses a small sample of 32 Indian firms listed on the Bombay Stock Exchange over a sample period of six years, from 2010-11 to 2015-16. Two models have been built using the two statistical methods. The results indicate that there are no significant differences in classification accuracy because of the change in statistical method. Different statistical measures suggest that the two developed models have comparable classification accuracies in default prediction. However, the specification and robustness of the logit model are found to be weaker than those of the discriminant model, contrary to a large number of studies. The weaker specification of the logit model may be a result of the small sample size.
Keywords: Default, Bankruptcy, Insolvency, Multiple Discriminant Analysis, Logistic Regression
INTRODUCTION

Any form of external liability is a risk to any firm. This is true for firms of every size. It is quite possible that a firm may fail to meet its obligations in time. It can happen even with financially healthy firms. In today's times, when large numbers of borrowers are highly leveraged, even the slightest adverse fluctuation in interest rates may prove very costly for firms (Bandyopadhyay A., 2006). In response to a number of factors (external and internal) as well as the Basel norms for banking, RBI has been issuing guidelines to the banking and financial sectors to deal with increasing non-performing assets, but these guidelines have not been able to prevent defaults by corporate borrowers.

Even after concentrated efforts to improve the situation on NPAs, the problem of NPAs is alarming. The numbers of defaults as well as wilful defaults are increasing. As a result, the banking and financial system is forced to take deep debt haircuts to recover a portion of its NPAs. As per the RBI Financial Stability Report 2016, the NPAs of scheduled banks have increased significantly (RBI, 2016). In such circumstances, prior information about probable defaults by corporate borrowers could be of great help to the whole banking and financial system for corrective measures, at both the operative and the regulatory levels.

As per the available literature, there are few studies on different methods and models for default prediction in the Indian context. Moreover, these studies cannot be termed sufficient. The present study assesses the predictive ability of two statistical methods, namely multiple discriminant analysis and logistic regression, in default prediction. It uses a small sample of 32 Indian firms listed on the Bombay Stock Exchange for the sample period of 2010-11 to 2015-16. For the purpose of the study, two models have been built using the two statistical methods.

LITERATURE REVIEW

For more than a century there have been efforts to predict probable defaults before they actually occur. However, prior to the breakthrough study by Beaver (1966), most studies had been using univariate methods that could give multiple signals. Beaver exhibited the importance of different financial ratios in predicting the failure of firms as well as the role of credit extension to corporates by the banking and financial sector (Beaver, 1966). Altman (1968) extended the work of Beaver (1966) by developing a single and directional credit scoring model. This model used a multivariate method to give a clear signal about possible default. The study used multiple discriminant analysis to build a Z-score model using different financial and market ratios to arrive at a single score. The developed score (Z-score) could provide clear information about a possible default by the firm in the near future. The developed model has classification accuracy of 94 percent two years prior to default (Altman E. I., 1968). This model revolutionised the whole arena of default prediction. Even today, the banking and financial sector across the world uses this method, along with other methods, for developing internal credit scoring models for evaluating loan applications.

Over time, different approaches to default prediction have been developed. In theory, firms tend to default when the value of assets is lower than that of the liabilities or when there is a lack of liquidity (Wilcox, 1971). Ohlson (1980) applied a conditional logit model to a data set with 105 bankrupt and 2058 non-bankrupt firms from the US over the period 1970-1976, using 7 financial ratios and 2 categorical variables. The classification accuracy was lower than that of the studies based on MDA (Ohlson, 1980). Scott (1981) found that a few
1 Assistant Professor, Sri Aurobindo College (Evening), University of Delhi, E-mail: rajeevupadhyay@live.in
J-GIBS Volume 11, Number 1, January-December 2019 89
Electronic copy available at: https://ssrn.com/abstract=3630799

financial and structural ratios can be helpful in understanding the default process (Scott, 1981). A comparison of the ZETA model and the Z-score model shows that the ZETA model provides improved accuracy. There is no certain explanation for this, but the use of more relevant and larger data on industrial firms in the ZETA analysis may provide one (Altman E. I., 2000). Liang (2003) conducted a comparative study of MDA and logistic regression on Chinese firms, showing that logistic regression provides relatively higher predictive ability than MDA (Liang, 2003).

In the Indian context, Bandyopadhyay (2006) developed two models based on the MDA and logit approaches. The Z-score model exhibited high predictive power in detecting defaulting firms and outperformed Altman's original model. The logit model clearly indicates that the inclusion of financial and non-financial parameters improves the predictive ability of the model. The default probability calculated using logit can also be helpful in deciding credit risk capital as well as the pricing of debt (Bandyopadhyay A., 2006). Information from structural models along with firm-specific information can improve the accuracy of prediction in credit scoring models like the logit model (Bandyopadhyay A., 2007). The market value of assets and the volatility in the value of assets are important factors in default prediction and have an inverse relationship with default probabilities (Sharma, Singh, & Upadhyay, 2014).

Singla and Singh (2017) tried to establish a relationship between the size of the company and the probability of default for firms from the Indian steel sector using logit and Altman's Z-score model. The results of the study indicate that the size of the firm is inversely related to the probability of default (Singla & Singh, 2017). This is an interesting result but it requires further validation with a larger sample over a longer sample period. Upadhyay (2018) uses four methods, namely MDA, logit, a Poisson regression based reduced form model and the Merton distance-to-default model, over the sample period from 2004 to 2016 on 2450 cases. The findings of the study suggest that the logit model yields the highest classification accuracy of 97.7 percent and the reduced form model yields the lowest classification accuracy of 82.3 percent, and that the classification accuracies of the developed models are higher than those of many other studies (Upadhyay, 2018).

RESEARCH GAP AND OBJECTIVES OF THE STUDY

There is a need for more studies in the area of default prediction that explore different aspects of default prediction models and practices, such as the impact of size, sector and change in statistical methods and models on classification accuracy. The present study aims to assess how the classification accuracy of the developed models changes with the statistical method. For predicting default on bank loans by Indian corporates, this study uses two statistical methods, namely multiple discriminant analysis and binary logistic regression for a dichotomous response, over a sample period of six years from 2010-11 to 2015-16 for a relatively small sample. The results have been compared to arrive at a conclusion as to which method offers better results. This study has intentionally used a relatively small sample of 32 firms listed on the Bombay Stock Exchange, half defaulting and half non-defaulting, in order to assess the change in the predictive ability of default prediction models for a small sample size.

Data and Methods

For the purpose of the study, a total of 192 cases (firm-years) have been used from 32 firms listed on the Bombay Stock Exchange. These 32 firms have been selected randomly; 16 have defaulted at least once during the sample period of six years and 16 have never defaulted over the sample period. These firms are non-banking and non-financial firms from different industries (manufacturing and services). However, there is a risk of sampling bias as well as selection bias (Upadhyay, 2018). To develop the default prediction model (DPM) using multiple discriminant analysis, the original variables used by Altman (1968) have been used. Similarly, for the logit model, the original variables used by Ohlson (1980) have been used. The financial data has been collected from the respective financial statements of these firms. The information on default has been collected from the auditors' reports in the respective annual reports published by the firms. The market information has been collected from the Bombay Stock Exchange.

Multiple Discriminant Method

For the multiple discriminant analysis, the study has used the same variables as the Altman (1968) model to arrive at a Z-score for every case (Altman E. I., 1968).

$$Z = \alpha - \beta_1\,\mathrm{NWC/TA} + \beta_2\,\mathrm{RE/TA} + \beta_3\,\mathrm{EBIT/TA} + \beta_4\,\mathrm{MVE/BVD} + \beta_5\,\mathrm{Sales/TA}$$

Where

NWC/TA = Net working capital to total assets ratio
RE/TA = Retained earnings to total assets ratio
EBIT/TA = EBIT to total assets ratio
MVE/BVD = Market value of equity to book value of debt ratio
Sales/TA = Sales to total assets ratio

The collected data for the ratios, after checking for outliers, has been processed in SPSS using multiple discriminant analysis. The analysis yields the following model:

$$Z = -1.099 - 0.488\,\mathrm{NWC/TA} - 0.391\,\mathrm{RE/TA} + 10.229\,\mathrm{EBIT/TA} + 0.000\,\mathrm{MVE/BVD} + 0.334\,\mathrm{Sales/TA}$$

Logistic Regression Method

For the logistic regression, the study has used the same variables as the Ohlson (1980) model to build an O-score model for the sample along the lines of the following model (Ohlson, 1980).

$$O = \alpha - \beta_1\,\mathrm{WC/TA} + \beta_2\,\mathrm{TL/TA} + \beta_3\,\mathrm{CL/CA} + \beta_4\,\mathrm{NI/TA} + \beta_5\,\mathrm{FFO/TL} + \beta_6\,\mathrm{CHIN} + \beta_7\,\mathrm{Size} + \beta_8 X + \beta_9 Y$$

Where

WC/TA = Working capital to total assets ratio
TL/TA = Total liabilities to total assets ratio
CL/CA = Current liabilities to current assets ratio
NI/TA = Net income to total assets ratio
FFO/TL = Funds from operations to total liabilities
CHIN = $(NI_t - NI_{t-1}) / (|NI_t| + |NI_{t-1}|)$, where $NI_t$ is net income for the most recent period. The denominator acts as a level indicator. The variable is thus intended to measure change in net income
Size = Log (TA/GNP price-level index)
X = 1 if TL > TA, 0 otherwise
Y = 1 if a net loss for the last two years, 0 otherwise

The collected data for the ratios, after checking for outliers, has been processed in SPSS using logistic regression analysis. The analysis yields the following model:

$$O = -1.289 - 0.974\,\mathrm{WC/TA} + 2.480\,\mathrm{TL/TA} + 0.021\,\mathrm{CL/CA} - 4.307\,\mathrm{NI/TA} + 0.614\,\mathrm{FFO/TL} + 0.251\,\mathrm{CHIN} - 0.302\,\log(\mathrm{TA/GNP}) + 17.540\,X + 2.617\,Y$$

DATA ANALYSIS AND INTERPRETATIONS

Part One: Multiple Discriminant Analysis

Table 1.1 provides the mean and standard deviation of the ratios used in the multiple discriminant analysis to predict default. The table shows large differences in the ratios between the two groups. The means of the ratios of the non-defaulting group are found to be more robust than those of the defaulting group. Similarly, the standard deviations of NWC/TA, RE/TA and EBIT/TA for the defaulting group are unfavourably high in comparison to the non-defaulting group. Comparing the mean and standard deviation of each ratio across the groups suggests that the two groups are clearly different and may form two separate clusters, although this requires further investigation. The pattern is most evident in the market value of equity to book value of debt ratio, whose mean deteriorates sharply from 101.64 to 0.53. These findings confirm those of Wilcox (1971), who finds that when a firm is about to default, its key ratios relating to liquidity and structure start deteriorating over time (Wilcox, 1971).

Table 1.2 shows weak correlations among most of the variables and negative correlations among a few. However, there are strong positive correlations between NWC/TA and RE/TA and between RE/TA and EBIT/TA. Strong positive correlation is undesirable in the analysis, as weak and negative correlation is the favourable condition for predicting default (Cochran, 1964).
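Table 1.1: Group Statistics

| Ratio | Mean (No Default) | Mean (Default) | Mean (Total) | Std. Deviation (No Default) | Std. Deviation (Default) | Std. Deviation (Total) |
|---|---|---|---|---|---|---|
| NWC/TA | 0.2160 | 0.1664 | 0.2067 | 0.1695 | 0.3852 | 0.2253 |
| RE/TA | 0.3879 | -0.0610 | 0.3040 | 0.1698 | 0.9272 | 0.4593 |
| EBIT/TA | 0.1370 | -0.0445 | 0.1031 | 0.0767 | 0.1935 | 0.129 |
| MVE/BVD | 101.639 | 0.5349 | 82.7517 | 292.705 | 0.6538 | 266.73 |
| Sales/TA | 0.8762 | 0.5457 | 0.8145 | 0.6577 | 0.3746 | 0.6274 |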
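Table 1.2: Correlation

| | NWC/TA | RE/TA | EBIT/TA | MVE/BVD | Sales/TA |
|---|---|---|---|---|---|
| NWC/TA | 1.000 | 0.537 | 0.319 | -0.128 | -0.150 |
| RE/TA | | 1.000 | 0.750 | 0.028 | -0.010 |
| EBIT/TA | | | 1.000 | 0.180 | 0.077 |
| MVE/BVD | | | | 1.000 | 0.158 |
| Sales/TA | | | | | 1.000 |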
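The reading of Table 1.2 can be reproduced with a short sketch that scans the upper triangle of the correlation matrix for strongly correlated pairs. The threshold of 0.5 for a "strong" correlation and the helper name `strong_pairs` are assumptions made for illustration; the study does not state a numeric threshold.

```python
# Upper triangle of Table 1.2. None marks cells not printed in the table.
LABELS = ["NWC/TA", "RE/TA", "EBIT/TA", "MVE/BVD", "Sales/TA"]
UPPER = [
    [1.000, 0.537, 0.319, -0.128, -0.150],
    [None, 1.000, 0.750, 0.028, -0.010],
    [None, None, 1.000, 0.180, 0.077],
    [None, None, None, 1.000, 0.158],
    [None, None, None, None, 1.000],
]


def strong_pairs(labels, upper, threshold=0.5):
    """List off-diagonal pairs whose absolute correlation meets the threshold.

    The default threshold of 0.5 is an illustrative assumption.
    """
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            r = upper[i][j]
            if abs(r) >= threshold:
                pairs.append((labels[i], labels[j], r))
    return pairs


if __name__ == "__main__":
    for first, second, r in strong_pairs(LABELS, UPPER):
        print(f"{first} vs {second}: r = {r:.3f}")
```

Under this assumed threshold, the only pairs flagged are NWC/TA with RE/TA and RE/TA with EBIT/TA, which matches the two strong positive correlations named in the text.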

Table 1.3 shows that the eigenvalue for the analysis is 0.475, which is not very high. A high eigenvalue indicates a robust specification of the model and is favourable, but for the developed model this value is on the lower side. This signifies a lower level of robustness of the specified model. However, the canonical correlation for the specified model is slightly on the higher side at 0.567, which improves the model's robustness.

Table 1.3: Eigenvalues

| Eigenvalue | Percent of Variance | Cumulative Percent | Canonical Correlation |
|---|---|---|---|
| 0.475 | 100.0 | 100.0 | 0.567 |

Table 1.4 provides information about the Wilks' Lambda test. In multiple discriminant analysis, Wilks' Lambda indicates the strength of the discriminatory power of the specified model. The lower the value, the higher the discriminatory power of the specified model. The value of Wilks' Lambda for the specified model is 0.678. This signifies that the specified model has lower discriminatory power.

Table 1.4: Wilks' Lambda

| Wilks' Lambda | Chi-square | Sig. |
|---|---|---|
| 0.678 | 68.931 | 0.000 |

Table 1.5 shows that, in the process of building the model, market value of equity to book value of debt would be excluded from the analysis, as the coefficient for this factor is found to be zero. This means that the variable has no discriminatory power as far as this analysis is concerned. However, this observation is contrary to the conclusion suggested by the mean values and standard deviations. It may be confirmed by further investigation with a larger sample and a wider sample period. The importance of the EBIT to total assets ratio, however, is very high, as its coefficient is as high as 10.229.

Table 1.5: Canonical Discriminant Function Coefficients

| Variables | β |
|---|---|
| NWC/TA | -0.488 |
| RE/TA | -0.391 |
| EBIT/TA | 10.229 |
| MVE/BVD | 0.000 |
| Sales/TA | 0.334 |
| (Constant) | -1.099 |

The classification results in Table 1.6 show that the overall in-sample classification accuracy of the specified model is 89.1 percent. The classification accuracy of the developed model is comparable to the findings of many studies such as Altman (1968), Bandyopadhyay (2006), Acharya, Chatterjee, & Pal (2003), Agarwal & Taffler (2008), Mishra & Singh (2016), Gupta (2014) and Upadhyay (2018). The classification accuracy for the non-defaulting firms is 90.5 percent with Type I error of 9.5 percent. The classification accuracy for the defaulting firms is 82.4 percent with Type II error of 17.6 percent. Considering the small sample size as well as the misclassifications, the accuracy of the specified model can be termed satisfactory.

Table 1.6: Classification Results (In-Sample)

| | Observed | Predicted: No Default | Predicted: Default | Total |
|---|---|---|---|---|
| Count | No Default | 143 | 15 | 158 |
| Count | Default | 6 | 28 | 34 |
| Percent | No Default | 90.5 | 9.5 | 100.0 |
| Percent | Default | 17.6 | 82.4 | 100.0 |

Overall Accuracy: 89.1 percent

Part Two: Logistic Regression

Table 2.1 gives information about the goodness of fit of the developed logit model in predicting default by Indian firms on bank borrowings. The observed Nagelkerke R square is 0.518. This suggests that only about 51.8 percent of the variation is explained by the developed logit model. This value is relatively weak and may be the cause of errors in default prediction.

Table 2.1: Model Summary

| -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square |
|---|---|---|
| 107.017 | 0.314 | 0.518 |

Table 2.2 shows that the Wald statistic is small and statistically insignificant for every variable in the model except Y (Y = 1 if a net loss for the last two years, 0 otherwise). This indicates the relatively weaker discriminatory strength of the model.

Table 2.2: Variables in the Equation

| Variables | β | Wald | Sig. |
|---|---|---|---|
| WC/TA | -0.974 | 0.409 | 0.522 |
| TL/TA | 2.480 | 1.018 | 0.313 |
| CL/CA | 0.021 | 0.084 | 0.772 |
| NI/TA | -4.307 | 0.410 | 0.522 |
| FFO/TL | 0.614 | 0.050 | 0.823 |
| CHIN | 0.251 | 0.796 | 0.372 |
| Log(TA/GNP) | -0.302 | 1.311 | 0.252 |
| X(1) | 17.540 | 0.000 | 0.999 |
| Y(1) | 2.617 | 7.925 | 0.005 |
| Constant | -1.289 | 0.188 | 0.665 |
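The coefficients in Table 2.2 can be turned into a score and a probability with the standard logistic transformation, p = 1 / (1 + exp(-O)). The sketch below is hedged on several assumptions: that the model was fitted with default coded as 1, that the helper names (`chin`, `o_score`, `default_probability`) and the example firm are hypothetical, and that CHIN is set to zero when both net income figures are zero, a case the source does not address.

```python
import math

# Intercept and coefficients as reported in Table 2.2.
O_INTERCEPT = -1.289
O_COEFFICIENTS = {
    "WC/TA": -0.974,
    "TL/TA": 2.480,
    "CL/CA": 0.021,
    "NI/TA": -4.307,
    "FFO/TL": 0.614,
    "CHIN": 0.251,
    "Size": -0.302,  # Size = Log(TA / GNP price-level index)
    "X": 17.540,     # 1 if TL > TA, else 0
    "Y": 2.617,      # 1 if a net loss for the last two years, else 0
}


def chin(ni_current, ni_previous):
    """Change in net income scaled by the sum of absolute net incomes."""
    denominator = abs(ni_current) + abs(ni_previous)
    # Assumption: return 0.0 when both periods are zero (undefined in the source).
    return 0.0 if denominator == 0 else (ni_current - ni_previous) / denominator


def o_score(features):
    """Linear predictor of the fitted logit model for one firm-year."""
    return O_INTERCEPT + sum(
        coefficient * features[name] for name, coefficient in O_COEFFICIENTS.items()
    )


def default_probability(score):
    """Logistic transformation of the score, written to avoid overflow."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


# Hypothetical firm-year, for illustration only (not data from the study).
EXAMPLE_FIRM = {
    "WC/TA": 0.20, "TL/TA": 0.50, "CL/CA": 0.80, "NI/TA": 0.05,
    "FFO/TL": 0.10, "CHIN": chin(12.0, 10.0), "Size": 1.0, "X": 0, "Y": 0,
}

if __name__ == "__main__":
    score = o_score(EXAMPLE_FIRM)
    print("O-score:", round(score, 3))
    print("Estimated default probability:", round(default_probability(score), 3))
```

Setting Y to 1 for the same hypothetical firm raises the estimated probability, which is in line with Y being the only statistically significant variable in Table 2.2.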

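Table 2.3: Classification Results (In-Sample)

| | Observed | Predicted: No Default | Predicted: Default | Percent correct (Count rows) / Total (Percent rows) |
|---|---|---|---|---|
| Count | No Default | 154 | 5 | 96.9 |
| Count | Default | 14 | 20 | 58.8 |
| Percent | No Default | 96.9 | 3.1 | 100.0 |
| Percent | Default | 41.2 | 58.8 | 100.0 |

Overall Percentage: 90.2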
Table 3.1: Comparative Results (In-Sample)

| Measure (percent) | Multiple Discriminant Analysis: Developed Model | Logistic Regression: Developed Model |
|---|---|---|
| Overall Accuracy | 89.1 | 90.2 |
| Type I Error | 9.5 | 3.1 |
| Type II Error | 17.6 | 41.2 |
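The figures in the comparative table follow directly from the counts in Tables 1.6 and 2.3. The sketch below recomputes them; the function name `classification_rates` is an illustrative assumption, and the Type I and Type II labels follow this paper's usage (Type I: a non-defaulting case classified as default; Type II: a defaulting case classified as non-default).

```python
def classification_rates(matrix):
    """Compute percentage rates from a 2x2 table of counts.

    matrix is ((no_default_as_no_default, no_default_as_default),
               (default_as_no_default, default_as_default)),
    that is, rows are observed groups and columns are predicted groups.
    """
    (nd_correct, nd_wrong), (d_wrong, d_correct) = matrix
    total = nd_correct + nd_wrong + d_wrong + d_correct
    return {
        "overall_accuracy": 100.0 * (nd_correct + d_correct) / total,
        "type_i_error": 100.0 * nd_wrong / (nd_correct + nd_wrong),
        "type_ii_error": 100.0 * d_wrong / (d_wrong + d_correct),
    }


# Counts as printed in Table 1.6 (discriminant model) and Table 2.3 (logit model).
MDA_COUNTS = ((143, 15), (6, 28))
LOGIT_COUNTS = ((154, 5), (14, 20))

if __name__ == "__main__":
    for name, counts in (("MDA", MDA_COUNTS), ("Logit", LOGIT_COUNTS)):
        rates = classification_rates(counts)
        print(name, {key: round(value, 1) for key, value in rates.items()})
```

Rounded to one decimal place, the recomputed rates are 89.1, 9.5 and 17.6 percent for the discriminant model and 90.2, 3.1 and 41.2 percent for the logit model.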
The overall in-sample classification accuracy of the derived logit model is found to be 90.2 percent. This level of classification accuracy is comparable to the results of many studies such as Ohlson (1980), Bandyopadhyay (2006), Acharya, Chatterjee, & Pal (2003), Agarwal & Taffler (2008), Mishra & Singh (2016), Gupta (2014), Sharma, Singh, & Upadhyay (2014) and Upadhyay (2018). The classification accuracy for the non-defaulting firms is 96.9 percent with Type I error of 3.1 percent. The classification accuracy for the defaulting firms is merely 58.8 percent with Type II error of 41.2 percent. However, the overall misclassification is only 9.8 percent. Considering the small sample size as well as the overall misclassification, the accuracy of the specified model can be termed satisfactory, but the high level of Type II error makes the specification of the model weaker.

Part Three: Comparison of Results

Table 3.1 compares the two developed models. There are no significant differences between the two models in overall accuracy and Type I error, but on the basis of Type II error, the relative robustness and specification of the developed discriminant model is found to be stronger than that of the logit model. So it can be concluded that for a relatively small sample size and a short sample period, there are no significant differences in the classification accuracies of default prediction models. However, the robustness and specification of the developed logit model seem to be affected by the small sample size, the short sample period, or both. No tool has been used to directly assess the impact of the sample period in this study. Still, the sample period must have some impact, as the changing values of the ratios must be reflected in the results. So it can be argued, tentatively, that the weak model specification and robustness might have been caused more by the small sample size than by the short sample period. This result requires further investigation to be confirmed.

CONCLUSION

The overall in-sample classification accuracies are found to be 89.1 percent with misclassification of 10.9 percent for the discriminant model and 90.2 percent with misclassification of 9.8 percent (and a Type II error of 41.2 percent) for the logit model. So on the basis of accuracy, it can be concluded that there are no significant differences in the classification accuracies of the developed default prediction models (DPM) because of the change in statistical method. The classification accuracies of the developed models are also comparable to many studies such as Altman (1968), Ohlson (1980), Bandyopadhyay (2006), Acharya, Chatterjee, & Pal (2003), Agarwal & Taffler (2008), Mishra & Singh (2016), Gupta (2014), Sharma, Singh, & Upadhyay (2014) and Upadhyay (2018). So it can be concluded that the small sample size seems to have no negative impact on the classification accuracies of default prediction models.

However, the relative robustness as well as the specification of the developed discriminant model is found to be higher than that of the developed logit model. So it can be said that the robustness and specification of the developed models are sensitive to the change in statistical method. The weak model robustness and specification may be the cause of the higher Type II error in the developed logit model, which may in turn have been caused by the small sample size. This result requires further investigation to be confirmed.

For the defaulting firms, the key ratios relating to liquidity and structure deteriorate relative to those of the non-defaulting firms. From the mean and standard deviation of each ratio for both the groups, it seems that these two groups are making

two separate clusters, though this again requires further investigation. Also, in the case of the discriminant model, market value of equity to book value of debt is found to be insignificant, suggesting that market variables are not important for the discriminant model. However, all these conclusions are based on the original variables used in Altman (1968) and Ohlson (1980). A study with more variables may provide better results.

REFERENCES

1. Altman, E. I. (1968). Financial Ratios, Discriminant Analysis and Prediction of Corporate Bankruptcy. The Journal of Finance, 589-609.
2. Altman, E. I. (2000). Predicting financial distress of companies: revisiting the Z-score and ZETA models. New York: Stern School of Business, New York University.
3. Bandyopadhyay, A. (2006). Predicting probability of default of Indian corporate bonds: logistic and Z-score model approaches. The Journal of Risk Finance, 255-272.
4. Bandyopadhyay, A. (2007). Mapping corporate drift towards default Part 2: a hybrid credit-scoring model. The Journal of Risk Finance, 8(1), 46-55.
5. Beaver, W. (1966). Financial Ratios as Predictors of Failure, Empirical Research in Accounting: Selected Studies. Journal of Accounting Research, 5, 71-111.
6. Cochran, W. G. (1964). On the performance of the linear discriminant analysis. Technometrics, 179-190.
7. Liang, Q. (2003). Corporate Financial Distress Diagnosis in China: Empirical Analysis Using Credit Scoring Models. Hitotsubashi Journal of Commerce and Management, 38, 13-28.
8. Ohlson, J. A. (1980). Financial Ratios and the Probabilistic Prediction of Bankruptcy. Journal of Accounting Research, 18(1), 109-131.
9. RBI. (2016). Financial Stability Report December 2016. RBI.
10. Scott, J. (1981). The probability of bankruptcy: a comparison of empirical predictions and theoretical models. Journal of Banking and Finance, 5, 317-44.
11. Sharma, C. S., Singh, R. K., & Upadhyay, R. K. (2014). Predicting Probability of Default. Primax International Journal of Commerce and Management Research, 2(3), 5-13.
12. Singla, R., & Singh, G. (2017). Assessing the Probability of Failure by Using Altman's Model and Exploring its Relationship with Company Size: An Evidence from Indian Steel Sector. Journal of Technology Management for Growing Economies, 2, 167-180.
13. Upadhyay, R. K. (2018). Predicting Probability of Debt Default: A Study of Corporate Debt Market in India and other Countries (Doctoral Thesis). University of Delhi.
14. Wilcox, J. (1971). A simple theory of financial ratios as predictors of failure. Journal of Accounting Research, 9(2), 389-95.

