
Decision Support Systems 11 (1994) 545-557


Bankruptcy prediction using
neural networks
Rick L. Wilson and Ramesh Sharda
Oklahoma State University, Stillwater, OK, USA

Prediction of firm bankruptcies has been extensively studied in accounting, as all stakeholders in a firm have a vested interest in monitoring its financial performance. This paper presents an exploratory study which compares the predictive capabilities for firm bankruptcy of neural networks and classical multivariate discriminant analysis. The predictive accuracy of the two techniques is presented within a comprehensive, statistically sound framework, indicating the value added to the forecasting problem by each technique. The study indicates that neural networks perform significantly better than discriminant analysis at predicting firm bankruptcies. Implications of our results for the accounting professional, neural networks researcher and decision support system builders are highlighted.

Keywords: Neural network applications; Bankruptcy prediction; Discriminant analysis; Classification techniques

Rick L. Wilson is currently an Assistant Professor of Management Science and Information Systems at Oklahoma State University. He received his Ph.D. in MIS from the University of Nebraska-Lincoln. Dr. Wilson has published in journals such as Information and Management, International Journal of Production Research, International Journal of Production and Operations Management, among others. He is a member of Decision Sciences Institute, The Institute of Management Science, and Operations Research Society of America. His current research interests include neural networks, decision support systems and integrated management science applications.

Ramesh Sharda is the Conoco/DuPont Professor of Management of Technology at Oklahoma State University. Professor Sharda received his Ph.D. from the University of Wisconsin-Madison. His publications have appeared in major academic journals such as Management Science, Interfaces, Computers and Operations Research, Journal of Intelligent Manufacturing and Socio-Economic Planning Sciences. In addition, he has coedited two books: Impacts of Recent Computer Advances on Operations Research, and Knowledge-Based Systems and Neural Networks: Techniques and Applications. He is a member of The Institute of Management Science, Operations Research Society of America and the Decision Sciences Institute. His current research interests include decision support systems, forecasting, optimization on microcomputers, neural networks and expert systems.

Correspondence to: Ramesh Sharda, Department of Management, College of Business Administration, Oklahoma State University, Stillwater, OK 74078, (405) 744-8638. Email: MGMTRSH@OSUCC.bitnet.

0167-9236/94/$07.00 © 1994 - Elsevier Science B.V. All rights reserved
SSDI 0167-9236(92)00085-F

1. Introduction

The ability to predict firm bankruptcies has been extensively studied in the accounting literature. Creditors, auditors, stockholders and senior management all have a vested interest in utilizing and developing a methodology or model that will allow them to monitor the financial performance of a firm via accounting ratios. This "failure analysis" can be helpful in identifying internal problems, in firm evaluation by investors, and as a tool used by auditors to assist them in their job.

Typically, a number of financial ratios are used in a multivariate discriminant analysis approach in an attempt to predict firm bankruptcies. Discriminant analysis is a statistical technique used to construct classification schemes so as to assign previously unclassified observations to the appropriate group [15]. However, it may be a valid technique only under certain restrictive assumptions, including the requirement for the discriminating variables to be jointly distributed according to a multivariate normal distribution. Should this not be the case, results obtained by the discriminant analysis procedure may be erroneous.

Neural networks represent a field of study within the Artificial Intelligence area where researchers are studying a "biologically inspired" way of processing information. To this point, neural networks have proven to be good at solving some real-world problems, especially in the areas of forecasting and classification decision problems.

The exploratory study presented in this paper contrasts neural network predictive accuracy with that of discriminant analysis for the decision problem of firm bankruptcy prediction. Using a resampling methodological design, a series of experiments was conducted to investigate the effect of the training and testing (holdout) set composition on predictive accuracy. These predictive results were then contrasted with the accuracy obtained by classical discriminant analysis to determine the conditions where neural network models are significantly better predictors.

The major objectives of this paper are the following. First, we report the results of a comprehensive, statistically sound comparison of discriminant analysis and a neural network model. Second, we utilize in this analysis measures of the validity of any classification technique which have been used extensively in the psychology literature. These measures allow researchers to assess the true value added by a technique. Third, we conclude with a brief conjecture on how neural networks may affect decision support systems.

Our objective is not to examine speed of a new algorithm, or study a new architecture, but rather to test a neural network's effectiveness in performing classifications as contrasted against the incumbent techniques. Better algorithms should only improve the performance. In this sense, our results should provide a lower bound of the neural network model's predictive performance in bankruptcy prediction.

Section 2 briefly reviews bankruptcy prediction and the neural network literature. Section 3 describes our comparison procedure, Section 4 presents the results, while Section 5 discusses implications of our results for researchers from three areas: bankruptcy prediction, neural networks, and decision support systems.

2. Brief review of relevant research

2.1. Bankruptcy risk prediction

In the present days of economic turmoil, it is not surprising that the bankruptcy prediction problem remains of great interest to researchers. Firm insolvency is a problem throughout the industrialized countries of the world [4]. Creditors have a vested interest in this decision problem in that they wish to identify negative developments of their borrowers. Stockholders hold similar monetary concerns. Auditors, as a normal responsibility, must evaluate the financial position of a client to determine whether or not the firm's operating ability is endangered [3]. For all parties, it is essential that an objective opinion on the risk of bankruptcy can be formed as early as possible. Thus, senior management of a firm and the board of directors, as well as creditors, shareholders and auditors, can attempt to avert the crisis [34].

Predicting bankruptcy has been studied extensively in the accounting literature. The first studies were performed to determine whether financial ratios provide useful information [1,5]. There have been many different studies since [5] utilizing financial ratios for bankruptcy prediction, a majority of which use a multivariate discriminant analysis approach [1,2,4,7,12,25,32]. The major evolution in these studies is to identify financial and economic variables which improve predictive performance. Two statistical techniques appear to have been used the most: discriminant analysis and logistic regression [6]. No technique clearly provides substantially better results. We have chosen to study discriminant analysis as a comparative classification technique because of its repeated use in many other problem areas.

Discriminant analysis is a statistical technique used to classify objects into distinct groups on the basis of an object's observed characteristics. Basically, a linear discriminant function is developed which will compute a "score" for an object. This function is a weighted linear combination of the object's observed values on discriminating characteristics. These weights represent, in essence, the relative importance and impact of the various characteristics. On the basis of its discriminant score, an object is then classified. Often, computer software packages compute the probability of group membership on the basis of this procedure [39].

Multivariate discriminant analysis is subject to a number of restrictive assumptions, including the requirement for the discriminating variables to be jointly multivariate normal. This multivariate normality of the variables is critical to the discriminant analysis procedure; otherwise, results obtained may be erroneous [25]. This theoretical assumption often cannot be realized in practice [4].

2.2. Neural network applications

Multi-layer, feed forward neural networks have been applied to many problem domains in and outside the business field. For instance, neural networks have been successfully trained to determine whether loan applications should be approved [20]. Additionally, neural networks have been shown to predict mortgage applicant solvency better than mortgage writers [11].

Predicting rating of corporate bonds and attempting to predict their profitability is another area where neural networks have been applied successfully [14,30]. Neural networks outperformed regression analysis and other mathematical modeling tools in predicting bond rating and profitability. The main conclusion reached was that neural networks provided a more general framework for connecting financial information of a firm to the respective bond rating.

Fraud prevention is another area of neural network applications in business. Credit card fraud, a costly and difficult problem faced by banks, was addressed by Chase Manhattan Bank of New York by neural networks [33]. These models were shown to be much more successful than traditional regression analysis. Similarly, neural networks have been used in the validation of bank signatures [19,21]. These networks identified forgeries significantly better than any human 'expert'.

Several people have tested the applicability of neural networks in financial markets. Collard [10] states that this neural network model for commodity trading would have resulted in significant profits over other trading strategies. Kamijo and Tanigawa [24] used a neural network to chart Tokyo Stock Exchange data. Their finding was that the results of the model would beat a 'buy and hold' strategy. Additionally, a neural model for predicting percentage change in the S&P 500 five days ahead using a variety of economic indicators has been developed [18]. The authors claim that the model has provided more accurate prediction than alleged experts in the field using the same indicators.

There have been many other applications of neural networks in non-business related fields such as speech recognition, robotics, radar detection and many others. Additionally, other network paradigms have been useful in solving other types of decision problems. Discussion of these other applications and approaches is beyond the scope of this paper.

Most of the studies which have compared neural networks with statistical techniques report the results on the basis of either a single experiment or in an anecdotal form. There is a need for a thorough comparison using sound statistical procedures. Our study is based on a resampling technique to assess the effectiveness of neural networks on a statistical basis. Further, we borrow some measures from the psychology literature to isolate the value added by a classification technique. We argue that such measures ought to be used in determining alleged superiority of any such model.

3. Method

3.1. Financial ratios and data collection

The basic intent of this study is to compare and contrast the predictive performance of classical multivariate discriminant analysis to that of neural networks for firm bankruptcy. The Altman study [1] has been used as the standard of comparison for subsequent bankruptcy classification studies using discriminant analysis. Most follow-up studies have identified several other attributes to improve prediction performance. In this exploratory study, we wanted to see if the neural networks can come close to the traditional techniques. More sophisticated inputs to the neural network model should not worsen its performance. Thus, this could establish a lower bound on neural network performance in bankruptcy prediction. For these reasons, we used the same financial ratios as Altman [1]. These ratios were:

X1: Working Capital / Total Assets
X2: Retained Earnings / Total Assets
X3: Earnings before Interest and Taxes / Total Assets
X4: Market Value of Equity / Total Debt
X5: Sales / Total Assets
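The five ratios listed above can be computed mechanically once the underlying statement items are available. The following is a minimal sketch only: the `Financials` record and its field names are hypothetical and chosen for illustration, and the sample figures are invented rather than taken from the study's data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Financials:
    """Hypothetical statement items for one firm (field names are assumptions)."""
    working_capital: float
    retained_earnings: float
    ebit: float  # earnings before interest and taxes
    market_value_equity: float
    sales: float
    total_assets: float
    total_debt: float


def altman_ratios(f: Financials) -> List[float]:
    """Return [X1, X2, X3, X4, X5] in the order used in the text."""
    return [
        f.working_capital / f.total_assets,      # X1
        f.retained_earnings / f.total_assets,    # X2
        f.ebit / f.total_assets,                 # X3
        f.market_value_equity / f.total_debt,    # X4
        f.sales / f.total_assets,                # X5
    ]


# Invented figures, for illustration only.
example_firm = Financials(
    working_capital=20.0,
    retained_earnings=30.0,
    ebit=10.0,
    market_value_equity=80.0,
    sales=150.0,
    total_assets=100.0,
    total_debt=40.0,
)
example_ratios = altman_ratios(example_firm)
```

Each firm then contributes one five-element vector, which is the input to both the discriminant function and the neural network.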

The sample of firms for which these ratios were obtained consisted of firms that either were in operation or went bankrupt between 1975 and 1982. The sample, obtained from Moody's Industrial Manuals, consisted of a total of 129 firms, 65 of which went bankrupt during the period and 64 non-bankrupt firms matched on industry and year. Data used for the bankrupt firms is from the last financial statements issued before the firms declared bankruptcy. Thus, the prediction of bankruptcy is to be made about one year in advance.

3.2. Data set generation

In assessing the predictive accuracy of discriminant analysis as compared to neural networks, it is necessary to create two distinct sets of data: a data set to develop the discriminant function (similarly, to train the neural network, often referred to as the training set) and a holdout sample to validate the derived discriminant function (in neural network terminology, the testing set). Because the decision of splitting the original 129 firms could affect the results of the comparison, this study utilizes the concept of Monte Carlo resampling techniques to generate multiple subsamples from the original firms in order to gain a better measure of predictive accuracy (see [37], for example).

The results of this study could be affected by the proportion of non-bankrupt firms to bankrupt firms in both the training and testing sets. That is, the population of all firms contains a certain proportion of firms on the verge of bankruptcy. The proportion of interest in any population is sometimes referred to as the base rate. The base rate may have an impact on a prediction technique's performance in two ways. First, a technique may not work well when the firms of interest (bankrupt) constitute a very small percentage of the population (low base rate). This would be due to a technique's inability to identify the features necessary for classification.

A second effect of the base rate is in terms of differences in base rates between training samples and testing samples. If a classification model is built using a training sample with a certain base rate, does the model still work when the base rate in the test population is different? This issue is important for one more reason. If a classification model based on a certain base rate works across other proportions, it may be possible to build a model using a higher proportion of cases of interest than actually occur in the population.

To study the effects of this proportion on the predictive performance of the two techniques, we created three proportions (or base rates) for each of the training and testing set compositions. The first factor level (or base rate) was a 50/50 proportion of bankrupt to non-bankrupt cases, the second level was an 80/20 proportion (80% non-bankrupt, 20% bankrupt), and the third factor level an approximate 90/10 proportion (approximately 90% non-bankrupt; this proportion is not exact due to required round-off to an integer number of firms). We do not really know the actual proportion of firms going bankrupt. The 80/20 and 90/10 cases should be close. The 50/50 scenario is utilized to investigate the possibility of a better model by using a high base rate in the training set.

Utilizing a full two-factor design, there were nine different experimental cells, whose composition is indicated in Table 1. Within each cell, 20 different training-testing set pairs were generated via Monte Carlo resampling from the original 129 firms. In each case, the training set and test set pairs contained unique firms, i.e., no overlap was allowed. This restriction provides a stronger test of a technique's performance. Thus, a total of 180 distinct training and testing data set pairs were generated from the original data.

Table 1
Number of observations in training and testing sets

Training set               Testing set composition
composition                50%/50%    80%/20%    90%/10% *
50%/50%        TRAIN:      44/44      44/44      44/44
               TEST:       20/20      20/5       20/2
80%/20%        TRAIN:      44/11      44/11      44/11
               TEST:       20/20      20/5       20/2
90%/10% *      TRAIN:      44/5       44/5       44/5
               TEST:       20/20      20/5       20/2

where for each cell:
TRAIN: (nonbankrupt/bankrupt) firms in training set
TEST: (nonbankrupt/bankrupt) firms in testing set
* Approximately 90%/10% ratio
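The resampling design described in this section (nine cells, 20 non-overlapping training/testing pairs per cell, 180 pairs in total) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names, the seed, and the placeholder firm identifiers are hypothetical, and the only facts taken from the text are the cell sizes of Table 1 and the 64 non-bankrupt / 65 bankrupt split of the sample.

```python
import random

# Cell sizes as (non-bankrupt, bankrupt) counts, read from Table 1.
TRAIN_SIZES = {"50/50": (44, 44), "80/20": (44, 11), "90/10": (44, 5)}
TEST_SIZES = {"50/50": (20, 20), "80/20": (20, 5), "90/10": (20, 2)}


def draw_pair(nonbankrupt, bankrupt, train_size, test_size, rng):
    """Draw one training/testing pair with no firm shared between the two sets."""
    train_nb, train_b = train_size
    test_nb, test_b = test_size
    # Sampling without replacement and then slicing keeps the sets disjoint.
    nb = rng.sample(nonbankrupt, train_nb + test_nb)
    b = rng.sample(bankrupt, train_b + test_b)
    train = nb[:train_nb] + b[:train_b]
    test = nb[train_nb:] + b[train_b:]
    return train, test


def generate_all_pairs(nonbankrupt, bankrupt, pairs_per_cell=20, seed=0):
    """Generate pairs_per_cell pairs for each of the nine experimental cells."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    pairs = []
    for train_name, train_size in TRAIN_SIZES.items():
        for test_name, test_size in TEST_SIZES.items():
            for _ in range(pairs_per_cell):
                train, test = draw_pair(
                    nonbankrupt, bankrupt, train_size, test_size, rng
                )
                pairs.append((train_name, test_name, train, test))
    return pairs


# Placeholder identifiers standing in for the 64 non-bankrupt and 65 bankrupt firms.
nonbankrupt_firms = [("NB", i) for i in range(64)]
bankrupt_firms = [("B", i) for i in range(65)]
all_pairs = generate_all_pairs(nonbankrupt_firms, bankrupt_firms)
```

Note that the largest cell needs 44 + 20 = 64 firms of each type, which is exactly what the sample of 64 non-bankrupt and 65 bankrupt firms can supply.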

3.3. Implementation of comparative methods

SYSTAT [39], a personal computer-based statistical package, was used for discriminant analysis. In this study, the training sets were used to set up initial discriminating functions, which were, in turn, evaluated by the corresponding testing sets. All variables were included in each discriminant analysis conducted in the study. SYSTAT fits the standard multivariate general linear model in performing discriminant analysis and also uses information regarding the prior probability specification of the training set in determining the discriminant function. Tests were not undertaken to determine whether the discriminating variables were distributed according to a joint multivariate normal distribution. However, previous research has indicated that neural networks can also perform well in cases of multivariate normal distributions [13]. Thus, the distribution of the data used in our study is not a relevant issue in considering robust predictive accuracy.

BRAINMAKER [35], a personal computer-based neural network software package which implements the aforementioned back propagation training algorithm, was used to construct and test trained neural network models. For each network trained in the study, a structure of 5 input neurons (one for each financial ratio), 10 hidden neurons and 2 output neurons (one indicating bankrupt firm, the other indicating non-bankrupt firm) was used. Such a network structure was chosen on the basis of previously espoused heuristic guidelines [8,9,36]. Figure 1 pictorially illustrates this network.

Fig. 1. A typical NN model for bankruptcy prediction (inputs X1 to X5; output nodes BR and NBR).

As training cases are presented to the network, the output neuron values are examined by the training procedure. For instance, consider a bankrupt training case that had its bankrupt output node valued at 0.85 and its non-bankrupt node at 0.17. When training a neural network, a certain amount of variation away from the desired values of 0 and 1 (indicating bankrupt or non-bankrupt) is typically allowed at the output layer when determining whether adjustments should be made in network weights via back propagation. This allowable variation is referred to as the training tolerance. Thus, a training tolerance of 0.2 would allow the training procedure a variation of up to 0.2 at each output node away from the desired value. Thus, the previous example would satisfy the training tolerance (i.e., 1.0 - 0.85 < 0.2 and 0.17 - 0 < 0.2) and no network weight correction would take place.

In training the networks, a heuristic back propagation algorithm was used to ensure convergence (all firms in the training set classified correctly). A stringent training tolerance was initially used in training the network (a small value of 0.1) and gradually relaxed until such a point was reached when all training cases satisfied the training tolerance criteria. Then, the training tolerance was incrementally lowered (made more stringent) and the network was trained until convergence occurred at this level. This was repeated until no further reductions of the training tolerance could occur. In all 180 subsamples generated, the neural network models were able to obtain 100% classifications of the training set cases. By using a relaxed tolerance, memorization or overtraining should have been avoided.

3.4. Dependent variable: correct predictions

The intent of this study was to compare the predictive capability of discriminant analysis and neural networks. Thus, the number that each method correctly predicts in the testing data sets is the chief measure of predictive success. Other measures are described as introduced.

SYSTAT, in using the multivariate general linear model, utilizes Mahalanobis distances [39] to calculate posterior probabilities for each case, indicating the likelihood of group membership for each group.
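The training tolerance check described in Section 3.3 can be written in a few lines. The sketch below is illustrative only: `within_tolerance` is a hypothetical helper name, not a BRAINMAKER function, and it reproduces just the worked example from the text (outputs of 0.85 and 0.17 against desired values of 1 and 0).

```python
def within_tolerance(outputs, targets, tolerance):
    """True when every output node is closer to its target than the tolerance,
    in which case no weight correction would be triggered for this case."""
    return all(abs(target - output) < tolerance
               for output, target in zip(outputs, targets))


# Worked example from the text: a bankrupt training case.
outputs = (0.85, 0.17)   # (bankrupt node, non-bankrupt node)
targets = (1.0, 0.0)     # desired values for a bankrupt firm

satisfies_relaxed = within_tolerance(outputs, targets, tolerance=0.2)
satisfies_strict = within_tolerance(outputs, targets, tolerance=0.1)
```

With a tolerance of 0.2 the case is accepted, while with the initial stringent value of 0.1 the bankrupt node (off by 0.15) would still trigger a weight correction.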

Simi- 4. In this study. 90/10 72. were determined for the neural network models. since (p = 0. 90/10 100 94. composition 50/50 80/20 90/10 tion and the neural network model.002) (p = 0.59 learning accuracy of non-bankrupt and bankrupt (p = 0.correct classifications (%) (grouped by like for each group. This is the manner in which correct and 50/50 100 88. Additionally. Cases where double classifications were indicated (both neurons > 0.Multivariate discriminant analysis training tolerance.001) (p = 0.59 100 99.76 incorrect classifications were determined for the 80/20 100 90.68 91.67 When evaluating the predictive capability of NN . and non-bankrupt cases.005) (p = 0. a testing threshoM.25 95. on the basis of variation in output neurons can be when predict.68 93.54 100 82.0 86. Wilson and R. and the testing sets also tions that the specific procedure provided on the contained an equal number of the two cases.5 (and the other neuron value nant analysis." L e a r n i n g " Testing set . similar to D A . the techniques to evaluate the 20 holdout samples for number of correct classifications that the particu.65 100 94.69 100 60.33 100 97. each combination of base rates.Neural network neural networks. was < 0.069) (p = 0.correct classification (%) (all cases) The first results to be presented display the Training set Testing set composition learning performance of the discriminant func. 2.550 R.5 88. (p < 0.8 94.13 100 54.001) training cases. while multivariate discriminant analysis was correct 88.25% of the time. Table 3 represents the average Each data set was evaluated by both discrimi.81 It is not surprising that the neural network approach outperforms discriminant analysis. a testing rupt firms. prior probabilities composition) were incorporated based upon the composition (base rate) of the training sets. 
When the train- lar procedure provided on the training set ing sets contained an equal number of bankrupt ("learning") and the number of correct classifica.5) were automati.25 85. testing set ("generalization").625 72. Training sets . is ratio used as the discriminant analysis prediction for NN DA NN DA NN DA that case.875 91. is specified. Sharda / Bankruptcy prediction using neural networks indicating the likelihood of group membership Table 2 Training set . It is on son between the two techniques is their perfor- this basis that correct and incorrect classifications mance in classifying cases in the holdout samples. if one output learned more than classical multivariate discrimi- neuron exceeded 0.1. Results Table 3 4. neural networks correctly classified 97.5). Table 2 also distinguishes between the 80/20 82.499 was used.55 91.046) ing sets.05 level will not cease until all members of the training * *-Significant at 0.L. Table 2 shows NN DA NN DA NN DA the aggregated percentage of correct classifica." G e n e r a l i z a t i o n " neuron.5% of the holdout cases. percentage of correct classifications (irrespective nant analysis and neural networks.91 discriminant analysis method.0 75. The group with Training Combined Non-bankrupt Bankrupt set accuracy cases cases the highest posterior probability. 50/50 97. the network classified the case as the corresponding group associated with the first 4.0 95. or the testing sets. neural networks appear to have threshold of 0. Two different of type of firm) when utilizing the two different measures of accuracy could be determined. strictly learning a set of bankrupt and non-bank- ing group membership.0 89.318) (p = 0.008) the neural network training algorithm employed *-Significant at 0.32 tions of training cases by the two approaches across the three different combinations of train.8 95. Thus. thus.6 91.01 level . Testing sets . therefore.126) (p = 0. 
Perhaps a better measure of accuracy compari- cally counted as incorrect classifications. This testing threshold identifies how stringent the allowable set are correctly classified.
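The testing-threshold rule described on this page (one output neuron above 0.5 and the other below it gives a classification; both above 0.5 counts as incorrect) can be sketched as below. The function names are hypothetical. The handling of a case where neither neuron exceeds 0.5 is not spelled out in the recoverable text, so treating it as "no classification, hence incorrect" is an assumption.

```python
def classify(bankrupt_output, nonbankrupt_output, cutoff=0.5):
    """Map the two output neuron values to a predicted group.

    Returns None for a double classification (both neurons above the cutoff)
    and, as an assumption, also when neither neuron exceeds the cutoff.
    """
    bankrupt_on = bankrupt_output > cutoff
    nonbankrupt_on = nonbankrupt_output > cutoff
    if bankrupt_on and not nonbankrupt_on:
        return "bankrupt"
    if nonbankrupt_on and not bankrupt_on:
        return "non-bankrupt"
    return None


def is_correct(bankrupt_output, nonbankrupt_output, actual_group):
    """A case counts as correct only when a single group is predicted and matches."""
    return classify(bankrupt_output, nonbankrupt_output) == actual_group


single = classify(0.91, 0.08)
double = classify(0.70, 0.65)
```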

tors of firm bankruptcies in the holdout sample Table 6 summarizes the prediction classifica- than discriminant analysis.Significant at 0.56 85. For instance. neural networks cor- combination of factor levels neural networks per. note that neural networks predicted better the testing sets contained 20 percent bankrupt than discriminant analysis at every factor level firms.5 90/10 47.0 92.correct classifications (%) non-bankrupt cases Training set Testing set composition Training set Testing set composition composition 50/50 80/20 90/10 composition 50/50 80/20 90/10 NN DA NN DA NN DA NN DA NN DA NN DA 50/50 97.001) (p = 0.0 67.022) (p = 0. The critical values of this neural network model predicts better than dis- test are also reported in Table 3. rectly predicted 97.115) (p = 0. was undertaken to assess niques. breaking down the cor. R. Those experi.05 level (p = 0. neural networks classified at a 95.52 98.75 96.25 97. Table 5 presents the results for the A non-parametric test.97 data.036) ** . In general. Significance is tested and reported as whether the different correct classification per.92 88.75 97.0 54.0 93. analysis.64 80. 80/20 65.0 80/20 98.5 90. Table 3 indicates that in every different classes of firms.071) (p = 0.Significant at 0.061) (p = 0.0% of the bankrupt firms.439) (p = 0.0 (p = 0.5 98.00 46.07 97. neural net.5 (p < 0.0 70.282) (p < 0.49 90.87 of the two prediction techniques. where training and rect rate. Both methods appear to centages for the two different techniques were predict non-bankrupt firms quite well.6% cor.263) (p = 0.89 44.42 95. while discriminant analysis correctly test set composition was equal among the two classified 91. of 80-20. mentioned previously.75%.080) (p = 0.0 45.5 45. Similarly.75 98. criminant analysis in all but a single combination mental cells that are statistically significant are of factor levels (training set of 90-10.75 92.0 35.289) * .correct classifications (%) bankrupt cases Testing set . 
the Wilcoxon test for prediction of non-bankrupt cases by the two tech- paired observations.0 79.06) 90/10 98. the Wilcoxon paired observation test was Total used to assess the significance of the differences overall cases 90.5 97.304) (p = 0.83 97. From this rect percentages in terms of bankrupt firm pre- dictions and non-bankrupt firm predictions.240) (p = 0. It is apparent from Table 4 that it is in the classifica. statistically not significant).025) (p = 0. tion results for all test sets at each level of the Tables 4 and 5 provide a more detailed look at training set base rate and also differentiates be- the classification results.25 62. and those significant larly.L.0 96.002) (p = 0. Again.25 92.25 54. when the training sets contained a balanced noted by asterisks.5 94.83 96. combination.038) 80/20 62.0 50/50 98. formed better at generalization than discriminant while discriminant analysis predicted only 79.5 (p = 0. and the difference is negligible and works were statistically significant better predic. Sharda / Bankruptcy prediction using neural networks 551 Table 4 Table 5 Testing set . tween the different categories of firms.92 96. Training set Bankrupt Non-bankrupt Total This is important since it is widely accepted in composition cases cases cases terms of predicting bankrupt firms that it is more NN DA NN DA NN DA costly to classify a failed firm as non-failing than 50/50 95. testing set highlighted by asterisks.76 As with the overall aggregate classification 90/10 48. Wilson and R.74 80.25 96.25 96.25 97.00 53. for predicting bankrupt number of bankrupt and non-bankrupt firms but cases.05 82.8%.029) (p = 0.01 level values of this test are given.58 82.51 the converse [38]. The critical . though the significantly different.25 49.0 98.0 82.83 94. Table 6 Training composition effect on classification (%) tion of bankrupt firms where neural networks significantly out perform discriminant analysis.75 98.

and the previous tables, it is apparent that neural networks represent a better predictive approach than multivariate discriminant analysis. When considering all 60 cases where a balanced training set was used, the neural network model correctly predicted 95.74% of the bankrupt firms and 96.83% of the non-bankrupt firms in the holdout samples, compared with the 80.92% of bankrupt firms and 94.83% of non-bankrupt firms that discriminant analysis predicted. While the percentage of correct classifications of bankrupt firms decreased with the increased imbalance of the training cases, the general trend of neural network prediction superiority remained.

It has been shown that to achieve significant levels of predictive validity, the proportion of correct positive predictions (bankrupt firms, in our case) to all positive bankrupt predictions must exceed the base rate of the more frequent class [17,31]. In Table 7, this proportion is calculated for both neural networks and discriminant analysis. Thus, when both the training and testing sets contained a balanced number of bankrupt and non-bankrupt firms (base rate of 50%), a neural network model forecasting a bankrupt firm was correct 98% of the time, while discriminant analysis was correct 94.9%. Note that neural networks outperformed discriminant analysis irrespective of factor levels (base rates), with the exception of measuring prediction with test sets having 90 percent non-bankrupt firms. Also note that, on average, neural networks also exceeded the base rate of the most frequent class, irrespective of the testing set composition, indicative of good predictive validity. However, this analysis used the base rates of the testing sets, information not available to the classification technique.

Table 7
Percent predicted as bankrupt that were bankrupt, by training set composition (rows: 50/50, 80/20, 90/10) and testing set composition (columns: 50/50, 80/20, 90/10), for NN and DA. [Cell values are not recoverable from this copy.]

4.3. Further assessment of predictive capabilities

While the results have clearly shown that neural networks outperformed discriminant analysis in predicting firm bankruptcies, our study must now address whether the neural network prediction results are better than what can be expected by pure chance [22,29]. We will employ tests originally proposed in [29] and further clarified in [22] for studying discriminant analysis classification rates. Because our study uses cross-validation (i.e., testing sets) for measuring classification success and utilizes different base rates for the training and testing set, we will further modify these tests to fit our study.

The underlying concept in comparing a classification or prediction technique to pure chance is to consider what one could do by simply guessing at the predictions. For instance, if the base rate was 50% for a two group problem, 50% correct predictions could be achieved by chance. Similarly, if the base rate was skewed (80%-20%), one could blindly predict with 80% accuracy by predicting all cases to belong to the more frequent class [28,29].

The test statistic utilized will be based upon the proportional chance criterion [22]. This criterion implies that prediction by guessing can achieve a correct rate for each group involved equal to the proportion of that group (base rate). In further investigating the value added by discriminant analysis and neural networks to the classification problem, our standard normal test statistics will be based on only information known to the classification techniques (base rate of the training sets). Only the base rate of the training set is 'learned' by the classification device. Thus, a pure chance technique, exposed to 90% non-bankrupt cases in training, would randomly declare 90% of testing cases to be non-bankrupt. Accordingly, for those training sets with a balanced number of bankrupt and non-bankrupt firms, guessing would result in 50% correct predictions, while when there are 90% non-bankrupt firms, 90% correct predictions of non-bankrupt firms could be achieved by chance. The following standard normal test statistic is calculated as

$$ Z = \frac{(O - E)\, N^{1/2}}{\left[ E\,(N - E) \right]^{1/2}} \qquad (1) $$
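The Table 7 criterion (the share of "bankrupt" predictions that were in fact bankrupt, compared with the base rate of the more frequent class) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example counts are hypothetical.

```python
def positive_predictive_share(true_pos, false_pos):
    """Share of 'bankrupt' predictions that were actually bankrupt."""
    predicted_pos = true_pos + false_pos
    if predicted_pos == 0:
        raise ValueError("no positive (bankrupt) predictions were made")
    return true_pos / predicted_pos


def exceeds_majority_base_rate(true_pos, false_pos, base_rates):
    """True when the share beats the base rate of the more frequent class."""
    return positive_predictive_share(true_pos, false_pos) > max(base_rates)


# Hypothetical holdout results: 9 correct and 1 incorrect 'bankrupt' calls.
share = positive_predictive_share(true_pos=9, false_pos=1)
balanced_ok = exceeds_majority_base_rate(9, 1, base_rates=(0.5, 0.5))
skewed_ok = exceeds_majority_base_rate(8, 2, base_rates=(0.9, 0.1))
```

With a 90/10 testing set the bar is much higher (the share must exceed 0.9), which is consistent with the exception the text notes for test sets having 90 percent non-bankrupt firms.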

where g = groups (bankrupt and non-bankrupt), O = total correct predictions ($\sum o_g$), E = total correct predictions obtainable by chance ($\sum e_g$), N = total number of cases ($\sum n_g$), $o_g$ = observed correct predictions for group g (refer to Table 4), $e_g$ = expected correct predictions for group g by chance ($n_g \cdot b_g$), $n_g$ = number of test cases in group g, and $b_g$ = training base rate of group g. Thus, this statistic will indicate whether predictive results obtained by neural networks and discriminant analysis differ greatly from those that can be obtained by chance. Additionally, one can also calculate a similar statistical measure for each separate classification group. Using the previous notation, the standard normal test statistic for each group (illustrating whether the predictive results obtained by a classification technique significantly differ from chance) is

$$ Z_g = \frac{(o_g - e_g)\, n_g^{1/2}}{\left[ e_g\,(n_g - e_g) \right]^{1/2}} \qquad (2) $$

Table 8
Predictive validity of classifications: standard normal test statistics with p-values, by training set composition (rows: 50/50, 80/20, 90/10) for bankrupt, non-bankrupt, and total cases, NN and DA. [Cell values are not recoverable from this copy.]

As Table 8 indicates, the predictive validity of neural networks and discriminant analysis is extremely significant. Aggregately, both methods are significantly better than pure chance regardless of the base rate of the training sets. Considering the predictive validity by specific groups, the only non-significant result occurs when predicting non-bankrupt firms when the training set base rate is 90%, though it is still considerably better than chance. Not surprisingly, as previous results have already shown, neural networks are judged more statistically significant than chance as compared to discriminant analysis in every case.

Another approach useful in assessing a prediction method is determining how much better a classification approach predicts compared to chance assignment. An index useful in such a setting is the improvement-over-chance or reduction-in-error index [22,26]. Using the same notation as above,

$$ I = \frac{H_o - H_e}{1 - H_e} \qquad (3) $$

where $H_o$ is the observed rate of correct predictions and $H_e$ is the correct prediction rate expected by chance. $H_e$ is defined as $\left(\sum_g b_g\, n_g\right)/N$ for the aggregate case, and $b_g$ for each separate group. The index I represents a reduction-in-error statistic in that 100·I% fewer prediction errors result using the classification rule than would be expected by chance.

Table 9
Reduction-in-error of classifications (percent), by training set composition (rows: 50/50, 80/20, 90/10) for bankrupt, non-bankrupt, and total cases, NN and DA. [Cell values are not recoverable from this copy.]

Table 9 provides this calculation for the neural network and discriminant analysis predictions aggregately across firm type, as well as the improvement-over-chance index for both bankrupt and non-bankrupt cases. Thus, when the 50-50 training set is used to train a neural network, the network model provides 92.98% fewer classification errors than would occur by blind guessing. Also, as another example, the improvement-over-chance for the prediction of bankrupt firms with neural networks trained on a balanced training set is 91.48%. Also of interest is that even on the 90% base rate where neural networks did not indicate significant differences over chance predictions for non-bankrupt cases, the improvement-over-chance percentage is still a relatively high 78.3%.
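The chance-comparison statistics of equations (1) to (3) can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the group counts in the z example are made up; only the 95.74% hit rate and the 50% chance rate come from the text.

```python
import math


def chance_z(observed, expected, n):
    """Standard normal statistic (o - e) * sqrt(n) / sqrt(e * (n - e)).

    Works for the aggregate case (O, E, N) and for one group (o_g, e_g, n_g).
    """
    return (observed - expected) * math.sqrt(n) / math.sqrt(expected * (n - expected))


def aggregate_chance_rate(train_base_rates, test_group_sizes):
    """H_e for the aggregate case: sum(b_g * n_g) / N."""
    total = sum(test_group_sizes)
    expected = sum(b * n for b, n in zip(train_base_rates, test_group_sizes))
    return expected / total


def improvement_over_chance(h_obs, h_exp):
    """Reduction-in-error index I = (H_o - H_e) / (1 - H_e)."""
    return (h_obs - h_exp) / (1.0 - h_exp)


# Bankrupt firms, balanced training set: H_o = 0.9574 and H_e = b_g = 0.5.
i_bankrupt = improvement_over_chance(0.9574, 0.5)

# Hypothetical group: 10 test cases, training base rate 0.5, 9 predicted correctly.
z_group = chance_z(observed=9, expected=10 * 0.5, n=10)
```

Under these inputs the index evaluates to about 0.9148, which agrees with the 91.48% improvement-over-chance the text reports for bankrupt firms under balanced training.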

Table 10
ANOVA - Bankrupt cases

Source        Sum of squares   DF    Mean-square   F ratio   P
Train         49246.942        2     24623.471     49.401    0.000
Test          2663.610         2     1331.805      2.672     0.072
Interaction   3397.222         4     849.305       1.704     0.151
Error         85233.750        171   498.443
R = 0.627, R^2 = 0.394

Table 11
ANOVA - Non-bankrupt cases

Source        Sum of squares   DF    Mean-square   F ratio   P
Train         76.944           2     38.472        3.492     0.033
Test          41.944           2     20.972        1.904     0.152
Interaction   15.556           4     3.889         0.353     0.842
Error         1883.750         171   11.016
R = 0.258, R^2 = 0.067

4.4. Effect of training and testing set composition on generalization

In order to further assess the effect on accuracy of classifications that the factor levels of training and testing set composition have on neural network model predictions, two two-factor ANOVAs were undertaken, one using the percentage of correct classifications of non-bankrupt firms as the dependent variable, another utilizing correct predictions of bankrupt firms as the dependent variable. The results of these two ANOVAs are presented in Tables 10 and 11. Similar analysis was not undertaken for the discriminant analysis results since the neural network approach clearly dominates its performance.

The composition of the training set was significant in determining the neural network prediction accuracy of the bankrupt test cases. This result further reinforces the intuitive thought that to properly train a network (or any model) to recognize two different concepts, utilizing an equal number of examples of each concept is desirable [23].

The moderately significant effect of the testing set composition is not as easily explained. Contrasting the different factor levels, only the difference between the 80/20 and 90/10 evaluation sets was significant (p = 0.0286). From Table 3, it can be seen that similarly trained networks were evaluated more favorably when the testing set was composed of 90 percent non-bankrupt cases and 10 percent bankrupt firms. The 80/20 and 50/50 testing sets showed comparable measures of prediction accuracy across different training factor levels. This variation can perhaps be explained by the small number of bankrupt cases found in the holdout samples of factor level 90/10 (2 cases). By having such a small number of test cases, the random generation of cases may have led to this experimental finding.

Table 11, the ANOVA on the non-bankrupt cases, shows that only the composition of the training set significantly affects neural network predictions. Upon closer contrast analysis among the three different factor levels, the only significant difference between levels is between the 50/50 and 80/20 composition (p = 0.010). In fact, networks trained by the 80/20 composition sets provided more accurate results in classifying non-bankrupt cases than the 50/50 training sets.

5. Discussion

From the results of this experiment, it is apparent that for the bankruptcy prediction problem, neural networks offer a viable alternative approach. In every instance, neural networks outperformed discriminant analysis in classification accuracy, especially in the prediction of bankrupt firms, the more difficult and, arguably, the more important classification problem [38]. With simple data (five variables), neural networks showed extreme promise by correctly predicting as high as a 97% accuracy level (when both the training and testing base rates were 50/50). This level is as good as or better than other studies. It stands to reason that neural networks will perform as well or better with the inclusion of more variables in the analysis. Thus, the results of this exploratory study could be considered to offer a lower bound on the predictive accuracy one can expect with a neural network model for bankruptcy prediction.
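The internal arithmetic of the two-factor ANOVA summaries in Section 4.4 (Tables 10 and 11) can be checked from the sums of squares and degrees of freedom alone. The sketch below is a hypothetical helper, not the SYSTAT procedure itself; it takes the Table 10 (bankrupt cases) values and recomputes the mean squares, F ratios, and R^2. It does not compute p-values, which would need an F distribution.

```python
def anova_summary(sums_of_squares, dfs, error_key="Error"):
    """Recompute mean squares, F ratios and R^2 from an ANOVA table."""
    mean_sq = {src: sums_of_squares[src] / dfs[src] for src in sums_of_squares}
    f_ratio = {src: mean_sq[src] / mean_sq[error_key]
               for src in mean_sq if src != error_key}
    total_ss = sum(sums_of_squares.values())
    r_squared = 1.0 - sums_of_squares[error_key] / total_ss
    return mean_sq, f_ratio, r_squared


# Sums of squares and degrees of freedom for the bankrupt cases (Table 10).
ss = {"Train": 49246.942, "Test": 2663.610,
      "Interaction": 3397.222, "Error": 85233.750}
df = {"Train": 2, "Test": 2, "Interaction": 4, "Error": 171}

mean_sq, f_ratio, r_squared = anova_summary(ss, df)
```

The 171 error degrees of freedom are consistent with a 3 x 3 design of 180 runs (180 minus 9 cells), and the recomputed training-set F ratio of roughly 49.4 dominates the other two effects, which is the pattern the text describes.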

Discriminant analysis classification rules often incorporate prior probabilities that account for both the assumed base rate and the costs associated with misclassification errors if different [16,27]. The major dilemma in utilizing the discriminant analysis model is in estimating the unequal misclassification costs. While all prediction errors are undesirable in firm bankruptcy predictions, it is generally accepted that the incorrect prediction of a bankrupt firm as non-bankrupt is the most costly error. Results have indicated that prediction of the bankrupt firms poses the largest problem to the two different techniques. In our study, prior probabilities were calculated from the base rates of the training sets. By using the base rates as the prior probabilities, the discriminant analysis procedure in this study actually incorporates significant unequal misclassification costs (i.e., misclassifying a bankrupt firm is a more costly error), since the true population of bankrupt firms is probably less than the training set base rate. Even so, neural networks continually predicted bankrupt firms more accurately using symmetric costs (testing threshold of 0.499). Future research investigating performance adjustments given explicit values for asymmetric misclassification costs for both discriminant analysis and neural networks may be warranted.

The investigation of the effects of different training and testing set composition on the predictive results leads to further implications for the decision maker and neural network researcher. Results indicated that the composition of the training set was a significant determinant of neural network predictive accuracy. Neural networks were shown to perform well in predicting both bankrupt firms and non-bankrupt firms when presented with equal numbers of examples in the learning phase. In our comparison study, it was shown that neural networks provide better understanding and differentiation between two concepts (bankrupt firms and non-bankrupt firms) when an equal number of examples of each concept is used in the learning procedure. This result is not dissimilar to one's intuition and previous results in discriminant analysis [23]. Basically, a more accurate classification model will result when developed with an equal number of instances of each category, irrespective of the actual distribution.

Thus, it appears that "smoothing" the distribution of the training set will provide a better model. It is true that neural network performance is less impressive as the proportion of non-bankrupt to bankrupt firms diverges. However, neural network models continue to outperform discriminant analysis. If one follows the recommendation of a 50-50 training set, neural network performance does not deteriorate significantly. Bankrupt firms are predicted correctly in the 92% to 97% range, with similar accuracy for non-bankrupt firms, given a balanced training set.

One caution to this approach in developing the training set is also indicated in the experimental results. The results of predicting non-bankrupt cases improved as the imbalance of bankrupt to non-bankrupt firms increased in the training sets. This result can be attributed to the significantly smaller number of bankrupt firms in the training sets. This phenomenon illustrates that the network "memorizes" and becomes very good at recognizing (i.e., predicting) non-bankrupt firms, at the expense of "learning" about bankrupt firms. Thus, the overall predictive accuracy may remain high; however, the classification accuracy of bankrupt firms is seriously reduced. One would be significantly sacrificing the prediction performance of one important category to marginally increase the prediction performance on the other, easier predicted category. This is obviously not desirable.

Since in the real world the decision maker may not have control over the composition of historical data necessary in the predictive model development, this study indicates that a potential trade-off exists when creating training and testing sets from the pool of existing problem data. A better predictive neural network model can be created by using a balanced training set. Significance of testing set composition in bankrupt firm prediction may have indicated over-reported accuracy due to the small number of bankrupt firms in the 90-10 test sets. Thus, if too few of the hard-to-classify or more important cases exist in the cross-validation set, the model performance could be over or under reported. Either way, this will significantly affect decision maker confidence in the prediction model. Of course, the results of any study are bound by the limitations of the data and methodology. Thus, great care must be taken when creating the training and cross-validation sets when developing a neural network prediction model.

Discriminant analysis is not the only tool that has been postulated for use in classification problems [13], but all other models do have limitations with regard to successful and appropriate use. In the case of discriminant analysis, limitations include the requirement that the variables should be jointly distributed according to a multivariate normal distribution, prior probability specification, and so forth. Neural networks have no such potential restrictive assumptions or requirements; thus, they are more robust prediction techniques. From a decision support systems perspective, this study has illustrated that neural networks are a viable model that should be included in the model base of a DSS. Additionally, neural networks offer additional benefits in reducing managerial concern over choosing the appropriate model in the decision support context.

Much additional research needs to be done regarding neural networks for bankruptcy prediction. The effect of network architecture, network training algorithms and learning paradigms needs to be examined to provide more prescriptive results on implementing a neural network prediction model. As previously mentioned, this exploratory study uses only a small number of variables to achieve its high level of predictive accuracy. Predictive accuracy obtained in this study illustrates the potential of neural networks from a data reduction standpoint. Thus, these models may provide excellent results with less data requirements than other approaches to the problem. However, other variables should be included in the neural network model [2]. Notable omissions include the size of the firm, and time series data (more than just one year's previous financial data), among others. Also, using matched firms by industry and year has been postulated to bias results in predicting bankruptcy [40]. Neural networks may or may not be affected by this; therefore, additional research should study this issue.

6. Conclusion

This paper has compared the predictive capability of neural networks with that of classical multivariate discriminant analysis within the context of forecasting firm bankruptcies on the basis of a small number of financial ratios. In this study, neural networks clearly outperformed discriminant analysis in prediction accuracy of both bankrupt and non-bankrupt firms under varying training and testing conditions. Additionally, it was shown that neural networks offer a significant improvement in prediction over pure chance, and that their use in prediction can reduce errors in this problem domain by as much as 93% over chance. With only five simple ratios, neural networks predicted at a high rate of classification accuracy. Neural networks, therefore, represent a classification technique that is a robust and promising approach in the prediction of firm stability. While this study is exploratory in nature and has some limitations as noted, it has shown the promise of neural networks through the use of a set of solid statistical analyses that should be utilized as research continues in this area.

Acknowledgments

The authors wish to sincerely thank Marcus Odom and Nik Dalai for their help and assistance in data collection and for their insightful comments on previous drafts of this paper. Additionally, the paper has greatly benefitted from comments and suggestions from the anonymous referees.

References

[1] Altman, E.I., Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy, The Journal of Finance, (September 1968), 589-609.
[2] Altman, E.I., Accounting Implications of Failure Prediction Models, Journal of Accounting Auditing and Finance, (Fall 1982), 4-19.
[3] Altman, E.I., Haldeman, R.G., and Narayanan, P., Zeta Analysis, Journal of Banking and Finance, (June 1977), 22-51.
[4] Baetge, J., Huss, M., and Niehaus, H., The Use of Statistical Analysis To Identify The Financial Strength Of Corporations In Germany, Studies in Banking and Finance, Vol. 7 (1988), 183-196.
[5] Beaver, W., Financial Ratios as Predictors of Failure, Empirical Research in Accounting: Selected Studies (1966), 71-111.
[6] Bell, T., Ribar, G., and Verchio, J., Neural Nets vs. Logistic Regression: A Comparison of Each Model's Ability to Predict Commercial Bank Failures, working paper, Peat Marwick Co., (May 1990).

CA. and Chandrasekaran.R. 6. R. Warning Signs of Impending Business Fail- ogy. Miller. J... M.L. B. W. [32] Moyer. [20] Gallant.. San Diego. 2. 11-17. Systems. Journal of Marketing Research. P. (April 1981). 1-25. Interna- [8] Caudill..) New Business Uses For Neurocomput- Finance. Multivariate Normality niques. (June 1988) 53-59. Lexington 156-163.. Communica. Forecasting Financial Failure: A Reexami- [16] Eisenbeis.. 1985). Positive Accounting Systems I. 9-13... Estimation of Financial Distress Prediction Models. and Loick. J. in Advances in Neu- servative Application of Neural Networks. Expert Systems Scores. 59-82. working pa. Sierra Madre. San Diego...E. (Spring 1977). Krishnaiah and Kanal. M. Computer Decisions.G. [36] Surkan. R. and Chatterjee. P. 1-17. Bond-Rating: A Non-Con. J. Issues in the Use and Interpretation of TAT. [26] Klecka.. CA. (June 1990). D.. (January 1991).. [39] Wilkinson. Technical Analysis of (California Scientific Software. per.L. R. and Rosen. R. Interna- (Jan. tice. Hung. 417-424. [18] Fishman. NY.. Vol.. L. R. nition: A Recurrent Neural Network Approach. P. SYSTAT: The System for Statistics.P. ure and Means to Counteract such Prospective Failure.. 573-593. A. E. Patterns or Cutting Approach to the Classification Problem. P.. Antecedent Probability and [13] Denton. J. An Infor. CA. A. (Sage Publishing: [11] Collins.. Neural Networks For Bond [19] Francett. (1969). Analysis in Business. Financial Management.S. R. 1989). Finance and Accounting. and Prakash. Neural Nets Arrive. (Spring 1974). G. [28] Meehl. in Advances in Neural Information Processing [38] Watts. San Diego.. Barr. (1984). a Multiple Neural Network Learning System to Emula.. Neural Network Training Tips and Tech. M. Vol. and Bak.. S. Vol. [29] Meehl. Neural Network Primer: Part III. 1989) 340-347. The National Public Accountant.. and Avery. and Tanigawa. Sharda / Bankruptcy prediction using neural networks 557 [7] Blum. M.V.. 
Clinical versus Statistical Prediction: A The- [12] Deakin. MA. (State University of New York Press. (ed. 1972). Prediction in Criminol.W. Discriminant Analysis. Discriminant Analysis. B..C. 1980). S. [27] Lachenbruch. 393-403. J. L. (1989). mation Theoretic Approach to Rule-Based Connectionist Decision Sciences. Touretsky ed. [24] Kamijo.I.G. N.. Introduction to Neural Networks. (1988) 443-450. Methodological Issues Related to the (1984). [37] Teebagy. Proceedings of ral Information Processing Systems I. and Tarling. 167-179.. (1955) 194-216. Using Neural [35] Stanley. Stocks and Commodities. (Prentice-Hall. 20. Vol... and Scofield.. On the Interpretation of Discriminant [15] Eisenbeis. A. the IEEE International Conference on Neural Networks.. tions of the ACM.. [9] Caudill. S. 52.L. Neural Network News. (1989). 2. 1. Response Model with Applications to Data Analysis. [31] Morrison.E. ing.. 1975). S. (Kaufman Publishing: San Mateo. (SYS- [22] Huberty. 22 Supplement Sample Size Considerations in Pattern Recognition Prac. [34] Siegel.E. [23] Jain. No. Networks in Market Analysis. Albany. [17] Farrington D. I/S Analyzer. D. NY. [30] Mighell. Pitfalls in the Application of Discriminant nation. Psychological Bulletin. Handwritten Signature Verification. and Osyk. 2. T. (Lexington Books. Connectionist Expert Systems. 1989). Vol. (February 1988).. Journal of Busi- [10] Collard. Commodity Trading with a Neural Net. D. A Neural Network the Efficiency of Psychometric Signs. (Hafner Press: tion of Mortgage Underwriting Judgments.. Stock Price Pattern Recog- nal of Accounting Research. 1990). eds. Classification Procedures.B. P. Failing Company Discriminant Analysis. AI Expert. tional Joint Conference on Neural Networks. ness Finance and Accounting. Vol. (April 1991) 18-25. Mateo. (June 1990). With Applications. Discriminant Analysis. and Smyth. . Psychological Bulletin.S. CA. Inference in a Binary [21] Goodman. 1989) 58-62. [25] Karels. (North-Holland. 
tional Joint Conference on Neural Networks. J. Inc. C. Rating Improved by Multiple Hidden Layers. (Feb 1990). Vol 95 [40] Zmijewski. D. (Univer- Business Failures. Inc. Journal of Accounting Research. and Shekhar. 156-171. sity of Minnesota Press: Minneapolis. and Singleton. 10 (October. J. 835-855. A Discriminant Analysis of Predictors of oretical Analysis and a Review of the Evidence. IL. (1990).... Wilson and R. S. K. R. An Application of Beverly Hills.. J. Dimensionality and Journal of Accounting Research. Touretsky ed. Discriminant Analysis and Analysis. 1954). M. Jour. and Zimmerman. W. No. Nestor. and Forecasting of Business Bankruptcy. Journal of [33] Rochester. Back-Propagation and its Application to [14] Dutta. in Handbook of Statistics.E. Evanston. 1982). (Spring 1972). E. C. AI Expert. R.J. (Winter 1987). A. D.M.. NY. (June 1977) 875-900. (Kaufman Publishing: San Theory. 152-169. E. 53-59. 1989) 356-364. M. Ghosh. B. 1986).