net/publication/323737968
1 author:
Safwan Umer
University of Salford
All content following this page was uploaded by Safwan Umer on 13 March 2018.
MSc Dissertation
Abstract
The objective of bankruptcy prediction in the fields of data mining and machine learning is
to develop a model that gives higher prediction accuracy (Tsai, Hsu and Yen, 2014). This
is also the main objective of this thesis. To assess the efficiency of the data mining models,
five years of financial ratios of 464 bankrupt and 464 non-bankrupt firms are used. This
dissertation presents an application of nearly all the data mining models used in the previous
literature, together with many newer techniques, using state-of-the-art data mining software.
The models are developed using SAS Enterprise Miner, WEKA and IBM SPSS. This study
shows the application of 11 models using SAS Enterprise Miner (EM). The Neural Network,
Auto Neural, Regression and High Performance Regression models achieved excellent
bankruptcy prediction accuracy in SAS Enterprise Miner. This study also presents the
application of 21 data mining models using the WEKA data mining software. In WEKA, the
Simple Classification and Regression Trees (SimpleCART), Multi-Boost Ada-Boost
(MultiBoostAB), OneR and Radial Basis Function Network (RBFNetwork) models predicted
bankruptcy efficiently. Finally, 6 IBM SPSS models were employed to determine the
classification accuracy for bankrupt and non-bankrupt firms. The Multi-Layer Perceptron
Neural Network model proved to be the best predictor of bankruptcy in IBM SPSS. Overall,
37 data mining models were applied and their empirical results analysed; the highest
bankruptcy prediction accuracy was achieved by Neural Networks. The results of this study
show that it is possible to forecast bankruptcy five years before it happens.
Keywords: Data mining; Neural Network; Auto Neural; Regression; High Performance
Regression; Simple Classification and Regression Trees; Multi-Boost Ada-Boost; OneR;
Multi-Layer Perceptron Neural Networks
Acknowledgments
I am also very thankful to Dr. Mo Saraee, who gave me a strong theoretical and practical
understanding of data mining classification concepts during the coursework.
I would also like to express my immense appreciation to Dr. Rasool Eskandari for providing
me with financial data and a basic understanding of financial factors. I would never have been
able to complete my research without his help.
Finally, I am very thankful to my parents, family and elder brother, who always supported me
morally and encouraged me with their best wishes.
Contents
Abstract
Acknowledgments
Chapter 1 Introduction and Motivation
1.1 Introduction
1.2 Research Motivations
1.3 Objectives of the thesis
1.4 Contributions
1.5 Thesis Outline
Chapter 2 Literature Review
2.1 Introduction
2.2 Statistical Techniques
2.3 Uni-variate or Linear statistical methods
2.4 Multiple Discriminant Analysis
2.5 Probability, Regression, Logistic and factor analysis models
2.5.1 Linear probability model
2.5.2 Conditional probability models
2.6 Machine learning Models
2.6.1 Neural Networks
2.6.2 Decision trees
2.6.3 Support Vector Machines
2.6.4 Fuzzy logic
2.6.5 Rough Sets
2.6.6 Case based reasoning
2.7 Other Methods
Chapter 3 Financial Distress and Bankruptcy
3.1 Introduction
3.2 Financial Distress
3.1.1 Stages of Financial Distress
3.1.2 Factors of Financial distress
Internal Factors:
External factors:
3.1.3 Causes of Financial Distress
3.1.4 Result of corporate financial Distress
3.2 Bankruptcy
3.2.1 Cost of bankruptcy
3.1.3 Determining cost of bankruptcy
3.1.4 Direct costs of bankruptcy endured by the firm
3.1.5 Indirect costs of bankruptcy endured by the firm
Chapter 4 Data
4.1 Introduction
4.2 Importance of Data sample
4.2.1 Population
4.2.2 Sample
4.2.3 Importance
4.3 Source of Data
4.4 Selection of Ratios
Table 4.1 Financial ratios used in this study
4.5 Data Pre-Processing
4.5.1 Missing values
4.5.2 Outliers
4.6 Descriptive Statistics of data samples
4.7 Summary
Chapter 5: Model development and application
5.1 Introduction
Part-1:
5.2 Overview
5.3 SAS Enterprise miner and its predictive modelling
5.3 Application of the Models
5.3.1 Decision Trees
5.3.2 Decision Trees Model:
5.3.3 High Performance Trees Model
5.3.4 Neural Network
5.3.5 Neural Network Model
5.3.6 Auto Neural Model
5.3.7 High Performance Neural Model
5.3.8 Data Mining Neural Model
5.3.9 Regression Model
5.3.10 High Performance Support Vector Machine Model
5.3.11 High Performance Regression Model
5.3.12 Memory Based Reasoning Model
Part 2:
5.4 WEKA:
5.4.1 Naïve Bayes
5.4.2 Naïve Bayes Model
5.4.3 BayesNet Model
5.4.4 SMO OR SVM Model
5.4.5 RBFNetwork Model
5.4.6 Kstar Model
5.4.7 LWL Model
5.4.8 AdaBoostM1 Model
5.4.9 ClassificationViaRegression Model
5.4.10 Decorate Model
5.4.11 Dagging Model
5.4.12 LogisticBoost Model
5.4.13 MultiBoostAB Model
5.4.14 Random Committee Model
5.4.15 HyperPipes Model
5.4.17 NNge Model
5.4.18 OneR Model
5.4.19 ZeroR Model
5.4.20 Random Forest Model
5.4.21 J48 Model
5.4.22 SimpleCart Model
5.4.23 END Model
Part 3
5.5 IBM SPSS
5.5.1 MLP neural network Model
5.6 Models implementation using variations of decision trees
5.6.1 CHAID Model
5.6.2 CHAID Exhaustive Model
5.6.3 CART Model
5.6.4 QUEST Model
5.6.5 K-NN Model
5.7 Summary
Chapter 6 Results Analysis and Critical Evaluation
6.1 Introduction
6.2 Type-I Error
6.3 Type-II Error
6.4 Total Error
6.5 Classification Accuracy
6.6 Empirical Results Analysis
6.6.1 Analysis of Results of SAS Enterprise Miner Models
6.6.2 Analysis of Results of WEKA
6.6.3 Analysis of results of IBM SPSS models
6.7 Critical Evaluation
6.8 Summary
Chapter 7 Conclusion and Future Directions
7.1 Conclusions
7.2 Future Directions
Bibliography
Appendix-A:
Appendix B
List of Figures
Figure 2.1 Neural Network basic understanding
Figure 2.2 Basic understanding of decision trees
Figure 2.3 Basic idea of the Hyperplanes and support vectors
Figure 2.4 Case Based Reasoning 4-step cycle
Figure 2.5 A comparison of different bankruptcy prediction approaches
Figure 2.6 Accuracy of different methods being used in the past
Figure 2.7 Studies using different model of bankruptcy prediction
Figure 4.1 Method used in SPSS to find 5th and 95th percentile
Figure 5.1 Step by step method of creating any project in SAS Enterprise Miner
Figure 5.2 The step by step implementation of the model generation using SAS EM
Figure 5.3 Final implementation diagram of models using SAS
Figure 5.14 Final application diagram of models using WEKA
Figure 6.1 Bankrupt and non-Bankrupt firms prediction Accuracy
Figure 6.2 Bankrupt firms five years ahead prediction accuracy using WEKA models
Figure 6.3 Non-Bankrupt firms five years prediction accuracy using WEKA models
Figure 6.4 Bankrupt and non-bankrupt firms prediction accuracy
Figure 5.4 Model Decision Trees
Figure 5.5 Model HP Tree
Figure 5.6 Neural Network Model
Figure 5.7 Auto Neural Model
Figure 5.8 HP Neural Model
Figure 5.9 DMNeural Model
Figure 5.10 Regression Model
Figure 5.11 HP SVM Model
Figure 5.12 HP Regression Model
Figure 5.13 Memory Based Reasoning Model
List of Tables
Table 2.1 Some studies that used Univariate statistical methods to predict bankruptcy
Table 2.2 Studies using MDA model from 1968 to 1996
Table 2.3 The use of the logistic model in different studies
Table 4.1 Financial ratios used in this study
Table 6.1 Bankrupt and non-bankrupt five years ahead prediction accuracy table using SAS Enterprise Miner models
Table 6.2 Bankrupt and non-bankrupt firms five years ahead prediction accuracy table using WEKA models
Table 6.3 Bankrupt and non-bankrupt firms five years prediction accuracy table using SPSS
Table 4.2 Containing 5th and 95th percentile for the data one year before bankruptcy
Table 4.3 Containing 5th and 95th percentile for the data 2 year before bankruptcy
Table 4.4 Containing 5th and 95th percentile for the data 3 year before bankruptcy
Table 4.5 Containing 5th and 95th percentile for the data 4 year before bankruptcy
Table 4.6 Containing 5th and 95th percentile for the data 5 year before bankruptcy
Table 4.7 Univariate Statistics for data sample one year before bankruptcy
Table 4.8 Univariate Statistics for data sample two year before bankruptcy
Table 4.9 Univariate Statistics for data sample three year before bankruptcy
Table 4.10 Univariate Statistics for data sample four year before bankruptcy
Table 4.11 Univariate Statistics for data sample five year before bankruptcy
Table 5.1 Prediction accuracy of the model starting from year one to five using Decision Trees Model
Table 5.2 Prediction accuracy of the model starting from year one to five using HP Trees Model
Table 5.3 Prediction accuracy of the model starting from year one to five using Neural Network Model
Table 5.4 Prediction accuracy of the model starting from year one to five using Auto Neural Model
Table 5.5 Prediction accuracy of the model starting from year one to five using HP Neural Model
Table 5.6 Prediction accuracy of the model starting from year one to five using Neural Network Model
Table 5.7 Prediction accuracy of the model starting from year one to five using Neural Network Model
Table 5.8 Prediction accuracy of the model starting from year one to five using HP SVM Model
Table 5.9 Prediction accuracy of the model starting from year one to five using Neural Network Model
Table 5.10 Prediction accuracy of the model starting from year one to five using MBR Model
Table 5.11 Bankruptcy prediction accuracy using Naïve Bayes Model
Table 5.12 Bankruptcy prediction accuracy using BayesNet Model
Table 5.13 Bankruptcy prediction accuracy table using SMO OR SVM Model
Table 5.14 Bankruptcy prediction accuracy table using RBFNetwork Model
Table 5.15 Bankruptcy prediction accuracy table using KSTAR Model
Table 5.16 Bankruptcy prediction accuracy table using LWL Model
Table 5.17 Bankruptcy prediction accuracy table using AdaBoostM1 Model
Table 5.18 Bankruptcy prediction accuracy table using ClassificationviaRegression Model
Table 5.19 Bankruptcy prediction accuracy table using Decorate Model
Table 5.20 Bankruptcy prediction accuracy table using Dagging Model
Table 5.21 Bankruptcy prediction accuracy table using LogisticBoost Model
Table 5.22 Bankruptcy prediction accuracy table using MultiBoostAB Model
Table 5.23 Bankruptcy prediction accuracy table using Random Committee Model
Table 5.24 Bankruptcy prediction accuracy table using HyperPipes Model
Table 5.25 Bankruptcy prediction accuracy table using NNge Model
Table 5.26 Bankruptcy prediction accuracy table using OneR Model
Table 5.27 Bankruptcy prediction accuracy table using ZeroR Model
Table 5.28 Bankruptcy prediction accuracy table using Random Forest Model
Table 5.29 Bankruptcy prediction accuracy table using J48 Model
Table 5.30 Bankruptcy prediction accuracy table using SimpleCart Model
Table 5.31 Bankruptcy prediction accuracy table using END Model
Table 5.32 Bankruptcy prediction accuracy table using MLP neural network Model
Table 5.33 Bankruptcy prediction accuracy table using CHAID Model
Table 5.34 Bankruptcy prediction accuracy table CHAID Exhaustive Model
Table 5.35 Bankruptcy prediction accuracy table CART Model
Table 5.36 Bankruptcy prediction accuracy table QUEST Model
Table 5.37 Bankruptcy prediction accuracy table K-NN Model
Chapter 1 Introduction and Motivation
1.1 Introduction
Data mining is used to find hidden patterns in large sets of data. It has been widely used in
many different fields to uncover the logic in data stored in databases (Shamsinejad, Saraee
and Shekholeslam, 2011). State-of-the-art data mining classification models are now being
used in the field of bankruptcy prediction. The most popular techniques nowadays are
Decision Trees (DT), Artificial Neural Networks (ANN), Support Vector Machines (SVM),
Case-Based Reasoning (CBR), K-Nearest Neighbour (K-NN), Bayesian Networks,
Regression and hybrid methods (Chen et al., 2011).
Forecasting the bankruptcy of an organisation has been a paramount subject in the accounting
and finance literature (Zhang, Hu, Patuwo & Indro, 1999). The financial failure of a company
significantly affects the company itself, its stakeholders, employees and customers, and the
nation. Bankruptcy prediction is one of the areas that have been extensively studied in
accounting and finance (Wilson and Sharda, 1994). No company is immune to bankruptcy,
and bankruptcy does not happen overnight. It is therefore very important to understand and
predict the phenomena that lead to bankruptcy (Kim and Kang, 2009).
Timely prediction of bankruptcy also helps in making the best business decisions for the
future of the company. The accuracy of the prediction is very important: an inaccurate
prediction can have catastrophic results for the company. Predicting corporate failure matters
because it affects the company's employees, management, auditors and debtors (Jardin,
2014).
Companies which do not have enough financial means to operate have to liquidate their
assets and pay their debts. If a company does not have enough money to pay its debts, it
goes into financial distress. A company must remain solvent to sustain its progress
(Blum, 1974).
Bankruptcy can be caused by many factors, such as poor management, insufficient funds, a
shortage of fund providers, declining revenue, lack of assets, lack of management
knowledge, lack of stockholders for fund raising and lack of shares (David and Denis,
1995).
Numerous studies are available on the topic of bankruptcy prediction. These studies have
analysed different financial distress factors that lead to bankruptcy (Wilson and Sharda,
1994). This dissertation includes a comprehensive literature review spanning 1932 to 2014,
covering various theoretical, statistical and machine learning approaches to bankruptcy
prediction.
The major purpose of this dissertation is to evaluate bankruptcy prediction through the use of
data mining models. This study also illustrates the theoretical concepts and practical results of
the data mining models in the prediction of bankruptcy. The tools used in this study are well
known in the data mining community: SAS Enterprise Miner, WEKA and IBM SPSS.
The process of bankruptcy prediction involves several important steps on data containing
financial ratios. First, the data is gathered. Second, the data is processed into a format
suitable for applying different data mining techniques. Third, data mining techniques are
applied to the processed data and different classification models are generated. Finally,
the results of the models are compared and the best model is selected.
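The final comparison step can be sketched in a few lines of Python. This is only an illustration of selecting the model with the highest classification accuracy, not the workflow of SAS Enterprise Miner, WEKA or IBM SPSS; the model names, labels and predictions below are invented placeholders, with 1 denoting a bankrupt firm and 0 a non-bankrupt firm.

```python
def accuracy(actual, predicted):
    """Fraction of firms whose predicted class matches the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def select_best_model(actual, predictions_by_model):
    """Return (model_name, accuracy) for the most accurate model."""
    scores = {name: accuracy(actual, preds)
              for name, preds in predictions_by_model.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical holdout labels (1 = bankrupt, 0 = non-bankrupt).
actual = [1, 1, 1, 0, 0, 0, 1, 0]

# Hypothetical predictions from two placeholder models.
predictions_by_model = {
    "model_a": [1, 1, 0, 0, 0, 1, 1, 0],  # 6 of 8 correct
    "model_b": [1, 1, 1, 0, 0, 0, 0, 0],  # 7 of 8 correct
}

best_name, best_accuracy = select_best_model(actual, predictions_by_model)
print(best_name, best_accuracy)
```

Overall accuracy is only one possible selection criterion; a study may equally compare accuracy on bankrupt and non-bankrupt firms separately.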
The sample data used in this study was gathered from the Financial Analysis Made Easy (FAME)
database. The sample consists of 464 bankrupt and 464 non-bankrupt companies. This
dissertation shows the importance of data and its pre-processing phase using an effective
statistical method. The 41 financial ratios used in this study are also important because
they have been used in most of the research articles. An important contribution of this
dissertation is its use of ratios from five years prior, for different companies from 2000
to 2012, to predict bankruptcy five years ahead.
1.2 Research Motivations
We are witnessing a very competitive era for companies, in which bankruptcy is seen as
tarnishing a company's reputation. Bankruptcy prediction is a very challenging subject. A
company becomes insolvent, and may fail to return to a solvent state, when its debts remain
unpaid because of insufficient liquidity. In this state, the company has either to pay its
debts or to file for bankruptcy (Wruck, 1990).
Many large organizations, such as Delta Airlines, United Airlines, New Century Financial,
Calpine, Lyondell Chemicals, the telecom company Global Crossing, Thornburg Mortgage and
Pacific Gas, have filed for bankruptcy in the last two decades (Anon., 2014). These incidents
unsettled investors around the world and made it even more important to predict financial
distress before bankruptcy. Auditors, as a general duty, use bankruptcy prediction
techniques to assess the financial state of a company before investing in it (Wilson and
Sharda, 2009). Managers who make decisions for their companies are always looking for the
prediction model that gives the best results in bankruptcy prediction. Many techniques have
been used in the past.
1. Utilization of different data mining methods and algorithms using SAS Enterprise
Miner, WEKA and IBM SPSS.
2. Analysis of results obtained from various data mining models implementation.
1.4 Contributions:
The contributions of the dissertation in the field of bankruptcy prediction are:
1. Bankruptcy prediction using ratios from five years prior, whereas most research articles
have used ratios from three years prior.
2. Prediction of bankruptcy five years ahead using ratios from five years prior.
3. Use of the 41 most important financial ratios.
4. Use of 11 SAS Enterprise Miner models, 21 WEKA models and 6 IBM SPSS models.
5. Identification of the most effective model for bankruptcy prediction.
1.5 Thesis Outline
On the basis of the theoretical and practical literature review this dissertation describes
different features of bankruptcy prediction models. This thesis is divided into 7 chapters.
Chapter 2
Chapter 3
This chapter elaborates financial distress, factors of financial distress, causes of financial
distress, bankruptcy definitions and costs of bankruptcy.
Chapter 4
This chapter illustrates the importance of data, ratios and the pre-processing phase of data.
It also elaborates the method of winsorizing to eliminate outliers in the data.
Chapter 5
This chapter offers a complete analysis and applications of different models using SAS
Enterprise Miner, WEKA and IBM SPSS. It also presents prediction accuracy results
provided by each model.
Chapter 6
This chapter gives a complete insight into, and critical evaluation of, each data mining
model. It also gives the five-years-prior results of each model in detail.
Chapter 7
This chapter summarizes the major contributions of this dissertation and gives directions for
future work.
Chapter 2 Literature Review
2.1 Introduction
Various methods have been used in the literature for predicting business failure. Each
methodology has its own importance and contributions in this area, but each prediction
technique is basically used to divide firms into financially healthy and financially failed
firms (Dimitras, Zanakis and Zopounidis, 1996).
Business failure studies have attracted world-wide interest from many researchers and
practitioners. The earliest techniques, from before statistical or machine learning
techniques were available, compared two companies, one in a healthy financial state and the
other in a failed financial state (Bellovary, Giacomino and Akers, 2007). According to
Fitzpatrick (1932), there are five stages of financial failure: incubation, financial
embarrassment, financial insolvency, total insolvency and confirmed insolvency. Statistical
bankruptcy prediction models then began with Beaver's (1966) one-variable model and
Altman's Linear Discriminant Analysis model (Altman, 1968).
Since then, bankruptcy prediction has become a hot topic for researchers, who have started
to use different techniques to get better and more reliable results. Many researchers used
different models to improve on the results of Altman's technique. Data mining techniques
were not used until 1980. The use of data mining techniques such as SVM, NN and decision
trees for bankruptcy prediction started in the late 1980s (Pompe and Feelders, 1997).
There are various statistical, machine learning, soft computing, operational and evolutionary
approaches to predict bankruptcy, and each has its own pros and cons (Kumar and Ravi,
2007). The most important methods used in the past, their research procedures and prediction
accuracy results are discussed in the next section.
2.2 Statistical Techniques
These techniques apply statistical methods to a sample of data containing bankrupt and
non-bankrupt companies. Many studies have applied statistical techniques to different
financial ratios. A statistical technique uses financial parameters and ratios to predict
financial distress. Beaver's univariate model was the starting point of research on these
techniques. Examples are Linear Discriminant Analysis (LDA), Multiple Discriminant Analysis
(MDA), Quadratic Discriminant Analysis (QDA), Logistic regression and Factor analysis
(Kumar and Ravi, 2007).
The traditional statistical methods can better handle huge data sets without losing
prediction performance, while machine learning techniques obtain better performance with
smaller data sets and are adversely affected by large data sets (Chen, 2011).
These are the earlier techniques used to differentiate between a financially stable and
financially failed firm. Table 2.1 shows some of the studies that used Univariate statistical
methods to predict bankruptcy.
The Univariate models were heavily criticised but laid the path for other models like MDA,
Linear Probability Model (LPM), Logistic and Regression.
Table 2.1 some studies that used Univariate statistical methods to predict bankruptcy
2.4 Multiple Discriminant Analysis
MDA is the most commonly used statistical method for bankruptcy prediction. This method
has been used in more than 70 research studies from 1960 to the present. It is used to
classify an observation into one of several a priori groups, depending upon the features of
that observation. The technique is also very efficient in the prediction of qualitative
data. MDA examines the complete profile of features common to the relevant group of
corporations, and it also considers the interaction of these characteristics. The major
benefit of MDA is that it can deal with the classification problem because it considers the
complete profile of financial factors. The MDA method also reduces the analyst's space
dimensionality (Altman, 1968). An MDA model is made up of a linear combination of
variables, which is used to discriminate between failing and non-failing firms (Balcaen and
Ooghe, 2006).
$$Z = V_1 X_1 + V_2 X_2 + V_3 X_3 + \dots + V_n X_n$$
MDA calculates the discriminant coefficients $V_i$, while the independent variables $X_i$
take their actual values, where $i = 1, 2, 3, \dots, n$.
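As an illustration of the discriminant function Z = V_1 X_1 + ... + V_n X_n, the following sketch computes Z for a single firm. The coefficients, ratio values and cut-off are invented for the example. In practice MDA estimates the coefficients from the sample, and the rule of labelling a firm as failing when Z falls below a cut-off is an assumption of this sketch rather than something specified above.

```python
def discriminant_score(coefficients, ratios):
    """Z = V_1*X_1 + ... + V_n*X_n for one firm."""
    if len(coefficients) != len(ratios):
        raise ValueError("need exactly one coefficient per ratio")
    return sum(v * x for v, x in zip(coefficients, ratios))


def classify(z, cutoff):
    """Assumed decision rule: scores below the cut-off indicate a failing firm."""
    return "failing" if z < cutoff else "non-failing"


# Hypothetical discriminant coefficients V_i and financial ratios X_i.
coefficients = [2.0, 0.5, 4.0]
ratios = [0.25, 0.5, 0.125]

z = discriminant_score(coefficients, ratios)  # 0.5 + 0.25 + 0.5
label = classify(z, cutoff=1.0)               # hypothetical cut-off
print(z, label)
```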
Many researchers have used the MDA bankruptcy prediction technique, based on the methodology
of the Altman Z-Score model. Deakin, Edmister and Lis (1972) used the LDA method and obtained
prediction accuracies of 80%, 88% and 83% respectively. Table 2.2 shows the studies using
the MDA model for predicting bankruptcy from 1968 to 2004.
Varun (2009) applied these techniques to 78 failed and 91 non-failed companies over the
period 1999 to 2007. His research showed that the ratios total debt / total assets, cash
flow from operations / interest expense and net profit / total assets were the most
discriminating ratios one year before bankruptcy, while short-term debt / total assets and
sales / total assets were the most discriminating variables two years before bankruptcy.
Table 2.2 Studies using MDA model from 1968 to 2004
Ketz (1978) | General | 16 | 75 failed firms and 597 non-failed firms | Failed firms 56%; non-failed firms 93% | The use of general price level statements to distinguish between failing and non-failing firms.
Castagna and Matolcsy (1981) | Australian firms | 10 | A sample of 21 companies | Failed firms 0% to 90%; non-failed firms 76% to 100% | This study proposed that it is not easy to use a distinct model to predict financial distress efficiently.
Izan (1984) | Australian firms | 5 | A sample of 53 failed and 50 non-failed firms | 40% to 100% | He used company ratios relative to their industry median and made a combination of five variables for the Discriminant model.
Keasey and Watson (1986) | Small UK firms | 5 | A sample of 10 failed and 10 non-failed firms | Failed firms 70%; non-failed firms 66.7% to 68.3% | The use of trade-credit specialists and a statistical model to predict financial failure.
Koh and Killough (1990) | General | 5 | A sample of 400 firms, of which only 14 were bankrupt | Failed firms 78.6%; non-failed firms 88.25% | SAS 34 and SAS 59 were used to make a prediction model. Development of a prediction model which was approximately 88 percent accurate.
Laitinen (1991) | Small and mid-size Finnish firms | 6 | 40 randomly selected failed and non-failed firms | Failed firms 57.5% to 90%; non-failed firms 52.5% to 87.5% | Finding the existence of the failure processes in the firms. These processes were used on selected ratios to predict financial failure.
Alici (1996) | UK Mfg. firms | 4 | 29 failed and 31 non-failed British corporations | Failed firms 60.12%; non-failed firms 71.07% | Introduced wavelet networks; pruning techniques were examined in his model.
Pidado and Rodriques (2004) | Mfg. firms | 15 | 42 bankrupt and 42 non-bankrupt firms | 89.58% | Used the MDA technique in the footwear manufacturing industry.
Lugovskaja (2009) also used the MDA technique to predict the financial failure of Russian
small and medium-sized enterprises (SMEs). He used two MDA models on a data set of 260
bankrupt and 260 non-bankrupt arbitrarily selected SMEs. In the first model he found six
important bankruptcy prediction ratios, and the classification result was 76.2% for the
estimation sample and 68.1% for the holdout sample. In the second model he used
non-financial variables such as size and age together with the financial factors of the
SMEs, and the classification accuracy was 77.9% for the estimation sample and 79% for the
holdout sample.
Pervan et al. (2011) used this statistical technique on a sample of 78 bankrupt and 78
non-bankrupt companies from the Croatian manufacturing and trade industries. This study
noted that financial statements and financial factors are informative for predicting the
bankruptcy of a company. They obtained a bankruptcy prediction accuracy of 79.5%.
Recently, Lee and Choi (2013) provided a multi-industry prediction model. This study used
different sets of variables and produced a model that better reflects the characteristics
of each industry and the selection of ratios, yielding distinct prediction results. The
accuracy of the MDA model was 74.82%. In addition, the study emphasises that it is
necessary to build bankruptcy prediction models for each industry specifically in order to
generate efficient and reliable prediction results.
Meyer and Pifer (1970) presented a simple least squares linear regression technique with a
0/1 dummy variable (0 for non-failed and 1 for financially failed banks). They applied this
technique to a data set of banks consisting of 18 financial ratios, and their empirical
classification accuracy was 67% to 100% for failed and 55% to 89% for non-failed banks.
Later, Grammatikos and Gloubos (1984), Theodossiou (1991), Vranas (1992) and Lennox (1999)
also used this research method in their studies to predict bankruptcy.
The logistic method gives the probability that a firm is going to be bankrupt. In the
logistic model, the probability that company $i$ goes bankrupt, given its vector of
variables $X_i$, is (Dimitras et al., 1996):
$$P(X_i, c) = F(d + c X_i)$$
where $F(d + c X_i)$ is the cumulative logistic function, given by
$$F(d + c X_i) = \frac{1}{1 + e^{-(d + c X_i)}}$$
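The logistic probability can be illustrated with a short sketch. It uses the standard cumulative logistic function F(z) = 1 / (1 + e^(-z)) with z = d + c X_i; the intercept d, the coefficient vector c and the ratio values are invented for the example and are not estimates from any study cited here.

```python
import math


def logistic(z):
    """Cumulative logistic function F(z) = 1 / (1 + exp(-z))."""
    # Two algebraically equivalent branches avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def bankruptcy_probability(d, c, x):
    """P(X_i, c) = F(d + c . X_i) for intercept d, coefficients c and ratios x."""
    if len(c) != len(x):
        raise ValueError("need exactly one coefficient per ratio")
    z = d + sum(cj * xj for cj, xj in zip(c, x))
    return logistic(z)


# Hypothetical intercept, coefficients and ratios, chosen so that z = 0.
d = -1.0
c = [2.0, -4.0]
x = [1.0, 0.25]

p = bankruptcy_probability(d, c, x)
print(p)
```

A larger value of d + c X_i gives a probability closer to 1, and a smaller value gives a probability closer to 0, so the output always lies between 0 and 1.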
Martin (1977) introduced the logistic regression model to predict the financial failure of
banks. He used a data set of about 5700 Federal Reserve member banks, 58 of which had
failed financially. Using ratios from up to six years prior, he obtained a classification
accuracy for failed banks ranging from 91.3% to 41.7%, one to six years before failure. The
results for non-failed banks were also remarkable, ranging from 91.1% to 82.2% over the same
one to six years.
Ohlson (1980) proposed the concept of the conditional probability model. For the first
time, the data set was drawn from 10-K financial statements (the annual report that gives a
comprehensive summary of a firm's financial performance). In this study he elaborated on the
following four statistically important factors for bankruptcy prediction:
He criticised the MDA technique because of three problems associated with it: (i) it relies
on matched samples; (ii) MDA behaves like a discriminating device and does not provide any
statistical importance of variables; (iii) the MDA model gives its output in the form of a
score, which is difficult to understand. The conditional logistic model avoids all of these
problems. The accuracy of this logistic prediction model was 96.12%, 95.55% and 92.84%
for one year, two years and one-two years respectively. Mensah (1983) also used the logistic
analysis method on a sample of 66 manufacturing firms and 32 factors; the classification
accuracy of his model was 18% to 55% for bankrupt firms and 80% to 86% for non-bankrupt
firms. Table 2.3 summarises the use of the logistic model in different studies.
Furthermore, Erkki and Teija (2000) used a combination of the logistic model and a Taylor
series. They used the logistic model to describe insolvency and the Taylor series to
approximate the exponent of the logistic function. They used a sample of 400 firms and
concluded that classification accuracy could be increased by using interacting ratios.
Kalori et al. (2002) applied this technique to develop an early warning system to predict
the financial distress of banks. The classification accuracy of the model was over 96% one
year before failure and 95% two years before failure. In 2003, Foreman analysed bankruptcy
within the US local telecommunications industry using the logistic model.
Moreover, Jones and Hensher (2004) proposed a mixed logistic analysis model to predict the
financial distress of a firm. They specified financial distress in three states: state 0
for non-failed, state 1 for insolvent and state 2 for failed firms. Mei and Lin (2005) also
applied this approach with a quadratic interval regression model. Their empirical findings
show that the quadratic model can help the logistic model to distinguish between failed and
non-failed firms.
Recently, Masten and Masten (2012) used the logistic model with a Classification and
Regression Trees (CART)-based methodology. This was a very simple approach and used dummy
variables. Their practical results show that the combination of these methods gives the
highest prediction accuracy of 95%.
Table 2.3 The use of logistic model in different studies
The probability model is similar to the logistic model, but the function used to calculate
the probability is very different. Grablowsky and Talley (1981) used probability analysis
for the classification of credit applicants and found that it could be used as a substitute
for Discriminant analysis. The studies using this model are less accurate than those using
the logistic model, and only a few researchers have worked in this particular area.
Hanweck (1977) applied this method to banks' financial data to test for financial distress.
He used 6 financial factors and obtained 67% accuracy for failed banks and 99% for
non-failed banks using a holdout sample.
Zmijewski (1984) also investigated this statistical method on a biased sample of the data set
consisting of 40 bankrupt and 800 non-bankrupt firms. He used probability and bivariate
probability analysis to assess the sample bias issue.
Skogsvik (1990) examined this model to investigate the bankruptcy of Swedish mining and
manufacturing firms on a data sample consisting of 17 financial factors over the period 1966
to 1980. His empirical results show a classification accuracy of 84.0% to 71.2% from 1 to 6
years before bankruptcy respectively. Moreover, Theodossiou (1995) used the probability
model for Greek manufacturing firms and obtained a classification accuracy of 95.5% for
bankrupt and 92.6% for non-bankrupt firms. Later, Boritz and Kennedy (1995) and Lennox
(1999) also used this method for bankruptcy prediction.
Canbas et al. (2005) presented an Integrated Early Warning System (IEWS) to investigate the
financial problems of banks by incorporating logistic regression, DA, probability and
principal component analysis. This system helped a great deal in assessing the financial
condition of banks. Their calculated failure prediction probabilities for banks were 56%,
99% and 99.9% for years one, two and three respectively.
Factor analysis was used to describe a set of variables in terms of factors on the basis of the
relation between actual variables. This technique was used in a combination with logistic
estimation by West (1985) to investigate the financial condition of the bank.
2.6 Machine learning Models
Different intelligent techniques are being used these days because statistical techniques
rely on distributional assumptions that financial data do not always satisfy. Machine
learning techniques, which do not require such parametric assumptions, overcome the
limitations of traditional statistical models. Machine learning models belonging to the data
mining domain include artificial neural networks, decision trees, case-based reasoning, SVM,
fuzzy logic and rough sets (Kim and Kang, 2010).
Odom and Sharda (1990) first applied this technique to bankruptcy prediction, in comparison
with the MDA technique. They used a data sample of 128 firms and obtained a classification
accuracy of 77.8% to 81.5%. There are many variations of NN, such as Back Propagation
Neural Networks (BPNN), Self-Organizing Feature Map (SOM), Probabilistic NN, Auto
Associative NN and Cascade Correlation NN. These NNs are divided into categories according
to their learning type, algorithm and the connection of nodes with each other (Kumar and
Ravi, 2007).
Vellido et al., (1999), Wong et al., (1997), Zhang et al., (1999), Atiya (2001), and Paliwal
and Kumar (2009) have reviewed the use of NN in business and other science and
engineering domains.
Jeong et al. (2012) proposed a new architecture of NN models using a hybrid tuning
method. The practical results show that the tuned model was significant in predicting
financial failure. Their research has numerous advantages: the nonlinear aspects of ratios
were reflected using a Generalized Additive Model (GAM), the most favourable parameter
values of the variables were secured, and the model was more profitable than other non-tuned
models such as SVM, Generalized Logistic Model (GLM), MDA, CBR, DT and GAM.
Recently, Lee and Choi (2013) applied BNN and MDA models to the construction, retail and
manufacturing industries to predict financial distress. This study further elaborates on the
relative power of each independent variable, and the classification accuracy of the BNN
model was 81.43%. Figure 2.1 gives an example of a neural network architecture with one
input layer, one hidden layer and one target layer.
Figure 2.1 Neural Network basic understanding (Tsai and Wu, 2008)
[Figure: input nodes VAR-2, VAR-3, ..., VAR-N leading to a Target node]
Kumar and Ravi (2007) mentioned that decision trees provide if-then-else rules, which
are very simple to understand, and they also defined different types of decision tree
algorithms such as CRT, CHAID, QUEST and C5.0 (an enhanced version of C4.5). CRT and
CHAID are new algorithmic techniques. CRT uses the twoing optimum split technique, whereas
CHAID uses chi-square statistics.
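The chi-square statistic on which CHAID relies can be illustrated for a single candidate split. The sketch below computes Pearson's chi-square statistic for a contingency table whose rows are the branches of a split and whose columns are the bankrupt and non-bankrupt counts. The counts are invented, and the sketch covers only the statistic itself, not the full CHAID procedure of merging categories and testing significance.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain observations")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if branch and outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical split: rows are branches, columns are (bankrupt, non-bankrupt).
informative_split = [[30, 10],
                     [10, 30]]
uninformative_split = [[20, 20],
                       [20, 20]]

print(chi_square(informative_split))
print(chi_square(uninformative_split))
```

A larger statistic indicates a stronger association between the branch and the outcome, which is why a split with well-separated classes scores higher than one whose branches have the same class mix.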
Frydman et al. (1985), Bryant (1997) and Curram and Mingers (1994) applied decision trees to
predict financial failures, whereas Hui et al. (2010) gave a comparative study of decision
trees against other data mining models (ANN, SVM and Logistic) and the MDA statistical
method. Decision trees are easily understood by humans and more accurate than NN and SVM,
but an excess of rules sometimes makes them difficult to comprehend (Olson et al., 2012).
The following figure gives a basic understanding of decision trees.
Figure 2.2 Basic understanding of decision trees (SAS Institute Inc., 2012)
Recently, Chih et al. (2014) made a comparative study of different classifiers for bankruptcy
prediction. They applied these techniques using the combination methods of bagging and
boosting on a data set of 220 failed and 220 non-failed firms, and the empirical results
show classification accuracies of 83% for DT-bagging and 85% for DT-boosting.
maximum margin hyperplane is a special kind of linear model. Figure 2.3 gives the basic idea
of the hyperplanes and support vectors (Circled in the figure).
Figure 2.3 Basic idea of the Hyperplanes and support vectors (Han et al., 2006)
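The idea of a separating hyperplane can be sketched for the linear case. A linear SVM classifies a point x by the sign of w . x + b, the support vectors are the training points for which this value equals +1 or -1, and the margin between the two classes has width 2 / ||w||. The weight vector, bias and points below are invented for illustration and are not the result of training on any data.

```python
import math


def decision_value(w, b, x):
    """Signed score w . x + b of point x relative to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Class +1 on or above the hyperplane, class -1 below it."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the two margin boundaries, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0.
w = [3.0, 4.0]
b = -5.0

print(predict(w, b, [2.0, 1.0]))         # point on the positive side
print(predict(w, b, [0.0, 0.0]))         # point on the negative side
print(decision_value(w, b, [2.0, 0.0]))  # lies on the margin boundary
print(margin_width(w))
```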
SVM is very powerful because it integrates statistical methods and machine learning
methods. According to Chaudhuri and De (2011), SVM originated from the idea of
Structural Risk Minimization (SRM) (Shin, Lee and Kim, 2005) and is becoming more popular
due to its better predictive accuracy and performance.
A wide range of research articles has been written on this topic. Tay and Cao (2001) and
Kim (2004) used SVM in financial time series forecasting, with Tay and Cao (2001) applying
a modified version of SVM. Shin, Lee and Kim (2005) investigated the efficiency of SVM for
bankruptcy prediction and concluded that it works better than BPN for smaller training data
sets. Min and Lee (2005) evaluated this technique to find the optimal parameter values of
the SVM kernel function. Chih et al. (2007) implemented a real-valued genetic algorithm to
optimize the parameters of a support vector machine for predicting financial distress.
Later, Gao, Cui and Po (2008) predicted enterprise bankruptcy using a Noisy-Tolerant Support
Vector Machine. Recently, Fong et al. (2014) also used a comparative method of SVM to
predict bankruptcy.
But he also indicated a disadvantage: it is difficult to build and tune membership
functions and rules using the fuzzy logic model.
Fuzzy logic is used in many areas, including credit risk prediction (Chung et al., 2005),
commercial loan analysis systems (Levy et al., 1991), correlations of crude oil systems
(Sunday et al., 2011), disease of a firm (Hernan and Antonio, 2008) and forecasting
exchange rates (Korol, 2014).
Slowinski and Stenfanowski (1994) described that the rough set approach essentially
allows the analysis of a huge set of predictive ratios in order to identify several reduced
sets of ratios that can forecast the characteristics of interest.
This technique was first used by Matarazzo et al. (1998a, 1998b) to predict bankruptcy.
They used the dominance relation and the indiscernibility relation in the first study and
only the dominance relation in the second. Susmaga et al. (1999) also applied this
technique to predict bankruptcy, in comparison with DA and logistic models, and deduced
that the rough set technique performed better than the other two. Mckee (2000) employed a
rough set model on variables specified by a recursive partitioning technique; on a holdout
sample of 100 companies, the empirical results showed 88% classification accuracy. Popova
and Bioch (2001) used the rough set method with a slight modification, using monotone
extensions, to predict bankruptcy.
Slowinski et al. (2001) and Matarazzo et al. (2002) used the dominance-based rough set
approach and concluded that it is the only data mining method that preserves the preference
order of the data. Furthermore, this theory can be used to solve classification problems by
using exact and possible induced decision rules (Kumar and Ravi, 2007). Moreover, research
articles on this topic by Francis and Lixiang (2002), Indrani (2006), Ching et al. (2010),
Chen (2012) and Zhi et al. (2012) also elaborate the use of rough set techniques for
bankruptcy prediction.
Recently, Chiang et al. (2014) used a rough set and hybrid random forest method, with
intellectual capital as a predictive variable for bankruptcy prediction, and concluded that
the hybrid approach provided the best classification rate with the fewest Type-I and
Type-II errors.
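Type-I and Type-II error rates can be computed directly from actual and predicted classes. The sketch below assumes the convention common in the bankruptcy literature, in which a Type-I error is a bankrupt firm classified as non-bankrupt and a Type-II error is a non-bankrupt firm classified as bankrupt. The labels are invented, with 1 denoting bankrupt and 0 non-bankrupt.

```python
def error_rates(actual, predicted):
    """Return (type_1_rate, type_2_rate) under the convention stated above."""
    bankrupt = [p for a, p in zip(actual, predicted) if a == 1]
    healthy = [p for a, p in zip(actual, predicted) if a == 0]
    if not bankrupt or not healthy:
        raise ValueError("both classes must be present in the actual labels")
    # Type-I: bankrupt firms that were predicted as non-bankrupt.
    type_1 = sum(1 for p in bankrupt if p == 0) / len(bankrupt)
    # Type-II: non-bankrupt firms that were predicted as bankrupt.
    type_2 = sum(1 for p in healthy if p == 1) / len(healthy)
    return type_1, type_2


# Hypothetical holdout sample of four bankrupt and four non-bankrupt firms.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 1]

type_1, type_2 = error_rates(actual, predicted)
print(type_1, type_2)
```

The per-class accuracies reported in many of the studies above are the complements of these rates: accuracy on failed firms is 1 minus the Type-I rate, and accuracy on non-failed firms is 1 minus the Type-II rate.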
Figure 2.4 Case-Based Reasoning 4-step cycle (Aamodt and Plaza, 1994)
CBR has not been widely used in the field of bankruptcy prediction but has been widely used
in the fields of management, engineering, medical diagnosis, clash resolution in traffic
control, creating product index for e-shopping malls and in the drawing of semiconductors
(Turban and Aronson, 2001). For further reading on this topic the reader may refer to the
research articles by Ahn and Kim (2009), Sungbin et al. (2010) and Chuang (2013).
Figure 2.5 A comparison of different bankruptcy prediction approaches (Aziz and Dar, 2006)
Figure 2.6 Accuracy of different methods being used in the past (Aziz and Dar, 2006)
Figure 2.7 Studies using different model of bankruptcy prediction (Aziz and Dar, 2006)
Chapter 3 Financial Distress and Bankruptcy
3.1 Introduction
This chapter describes the basic understanding of financial distress that leads to bankruptcy.
Additionally, causes and outcomes of financial distress have been elaborated. Finally, cost of
bankruptcy has also been discussed in the last section of the chapter.
Financial failure is the situation in which the return on invested capital, allowing for
risk, is persistently lower than the prevailing rates on similar investments, and in which
the firm's average return is always below its cost of capital. A firm is not considered to
be in financial distress merely because it is unable to pay a small amount of debt.
Insolvency is another term used to describe negative corporate performance. The financial
distress of a firm is described in many research studies using four general terms: failure,
default, insolvency and bankruptcy. Furthermore, default means that a company is not in a
position to pay debt or interest to creditors when due. Finally, financial distress is
distinguished into technical and legal cases. Technical financial distress is the case in
which a company is unable to keep to the terms of its contract, and the legal case refers to
the failure of the company to meet regular repayments on a loan (Altman and Hotchkiss,
2005).
According to Gaughan (2011), financial failure does not necessarily mean that a company is
unable to meet its due debt obligations; it can occur even when the company has enough net
worth to meet its present legal responsibilities. Additionally, financial distress is not a
necessary measure of corporate bankruptcy, because some companies also default due to
management ineligibility (Perold, 1999). Finally, the reader may refer to Karels and Prakash
(1987) and Lin and Mclean (2000) for further definitions and explorations of financial
distress.
1. Early stage: Customers start complaining about services and quality (Whitaker,
1999), the company starts to see sales decreasing, and stock returns fall below
expectations (Opler and Titman, 1994).
2. Mid-stage: The company faces problems such as cash shortages, lower profits
(Makridakis, 2001), inability to make dividend payments and disruption in the
payment of debts to suppliers (Altman and Hotchkiss, 2005).
3. Final or later stage: According to Altman and Hotchkiss (2005), the company has a
constant cash deficit and breaches its debt contracts with creditors.
The bankruptcy of a company can be predicted about five to six years before it happens;
some of the researchers listed in Table 2.2 have predicted bankruptcy five years ahead.
3.1.2 Factors of Financial distress
Two kinds of factors of financial distress are discussed in different research studies:
internal and external.
Internal Factors:
There are many different internal factors related to financial distress; some of the
important ones are (Keskin, 2002):
1. Bad management.
2. Lack of communication between the business entities.
3. Failure of major projects.
4. Expansion of the business without stability.
5. No agreement between domain growths.
Wruck (1990) and Whitaker (1999) also considered poor management a significant factor in
the financial distress of a company.
External factors:
Every company has to exist in an environment. The external factors are the environmental
factors that lead to financial distress; some of them, as discussed by researchers, are as
follows:
3.1.4 Result of corporate financial Distress
As mentioned above, financial distress is a continuous process, and it takes roughly six
years to reach its final stage, bankruptcy. According to Kumar and Ravi (2007), the health
of a firm or bank relies on its:
And when a company gets more and more liquidated, it gets into a danger zone which is
called bankruptcy.
3.2 Bankruptcy
The concept of bankruptcy has been used to describe firms facing financial trouble. A few
researchers have used the generic term “failed” as a synonym for “bankrupt”. Nonetheless,
bankruptcy is a process that starts financially and ends legally. It is hard to identify the
particular moment at which bankruptcy occurs. It appears to be an intuitive settlement in which
financial distress continues until the firm or its creditors file a legal action. Financial
failure is a necessary, but not sufficient, condition of bankruptcy (Karels and Prakash, 1987).
Firms covered by the provisions of the National Bankruptcy Act are legally bankrupt when they
are either in receivership or have been granted the right to restructure (Altman, 1968).
When a firm is unable to pay its financial obligations as they fall due, as shown by a bond
default, an overdrawn bank account or non-payment of a preferred stock dividend, the firm is
operationally said to be bankrupt, failed or in default (Blum, 1974). According to Deakin (1972),
a firm that experiences insolvency, bankruptcy or liquidation for the benefit of creditors is
said to be a failed firm.
3.2.1 Cost of bankruptcy
The bankruptcy cost is generally divided into two categories (Kalay et al., 2007):
1. Direct cost
2. Indirect cost
The total cost of bankruptcy for a firm, direct and indirect, is 15% of pre-distress
firm value, and 7% for retail firms (Altman, 1984). According to Franks and Torous
(1994), the cost of formal bankruptcy exceeds the informal cost by 4.5%. In the same year,
Opler and Titman (1994) reported that firms with more leverage lose market share.
Finally, Kaplan (1994) concluded that profit from the liquidated financial reshuffle procedure
also increased the cost.
The bankruptcy cost can be divided into four sub-categories (Branch, 2002):
Costs (1), (2) and (3) are considered sub-categories of direct cost, while (4)
belongs to indirect cost.
PDV is considered to be the entire value of the firm’s assets according to its
pre-bankruptcy financial report. Mostly, at the final stage of financial distress, the equity
value of the company is near zero when it is about to file for bankruptcy. On the other hand,
the balance sheet of the bankrupt firm will not show running losses but will represent
some overdue asset values (Branch, 2002).
Finally, Branch (2002) describes who bears the bankruptcy costs in four steps. Firstly,
the bankruptcy cost is imposed on the landlord, suppliers, customers, employees, etc.
Secondly, creditors and claimants also have to face the costs associated with the bankruptcy
of the firm. Thirdly, the par value of the liquidated firm’s debt before bankruptcy is assigned
as follows: 28% to the losses causing bankruptcy, 16% to the cost of dealing with bankruptcy,
and 56% to the claim holders. Lastly, interest holders also bear a cost if the company goes
bankrupt.
Chapter 4 Data
4.1 Introduction
In this chapter I discuss the importance of the bankruptcy prediction data sample and
the database used to obtain the data. Finally, I discuss variable selection, the data
pre-processing phase and the descriptive statistics of the data used in
this dissertation.
4.2.1 Population
A population is the complete collection of objects or items that may be the subject of a study
(Kathleen and Jonathan, 2011); for instance, all manufacturing companies in the UK, all banks
in the UK, all bankrupt firms in the UK, or all non-bankrupt companies that are still active.
4.2.2 Sample
A sample is a sub-group of items from a particular population (Kathleen and Jonathan, 2011); for
example, a group of 63 bankrupt firms randomly selected from a large database containing
records of thousands of bankrupt firms. The data sample must be representative of the whole
population.
4.2.3 Importance
After reading the literature extensively, I have come to understand that selection of the data
sample is the most important aspect of bankruptcy prediction, since computers provide
information according to the data they are given to process. If computers are given
erroneous data, the results will also be erroneous.
Previous studies show that researchers knew the importance of the data sample
in predicting bankruptcy. Initially, researchers used data samples containing a limited number
of bankrupt and non-bankrupt firms. For example, Beaver (1966) used a data sample of 79
bankrupt and 79 non-bankrupt firms, Pinches et al. (1975) used a data sample of 221 firms, and
Altman (1968) and Deakin (1972) used a data sample of 32 bankrupt and 32 non-bankrupt
firms. Later on, some researchers used larger data samples; for instance, Zmijewski
(1984) used a data sample of 40 bankrupt and 800 non-bankrupt firms, and Erkki and Teija
(2000) used an equally divided data sample of 400 bankrupt and non-bankrupt firms.
Since my major concern in this study is to apply data mining classification techniques to
predict bankruptcy, it is very important to select unbiased training and test
data samples. The training data sample employed in this study consists of an unbiased
sample of 464 bankrupt and 464 non-bankrupt UK and Irish firms from the period 2000
to 2012, while the test data sample contains 64 bankrupt and 64 non-bankrupt companies from the
period 2010 to 2012. I have selected ratios from the five years prior to bankruptcy to analyse
bankruptcy prediction. Finally, I divided the data into five data files to perform my analysis,
as follows.
1. Data sample containing financial ratios one year before bankruptcy (dataset1.xlsx).
2. Data sample containing financial ratios two years before bankruptcy (dataset2.xlsx).
3. Data sample containing financial ratios three years before bankruptcy (dataset3.xlsx).
4. Data sample containing financial ratios four years before bankruptcy (dataset4.xlsx).
5. Data sample containing financial ratios five years before bankruptcy (dataset5.xlsx).
4.3 Source of Data
This data sample was collected from the Financial Analysis Made Easy (FAME)
database. This database gives detailed information on all significant private and public
companies in the UK and Ireland. The information provided includes name, number of
employees, profile, location, assets, identification number, status, legal form, incorporation
date, phone number, industry, stock data, mortgage data, account type, accounting figures,
financial statistics, custom data and information related to the directors and owners of the
companies. The database gives access to the past 10 years of financial data for a company.
Using the FAME database, we can produce detailed statistical descriptions, aggregations, linear
regressions and segmentations of data in seconds. Moreover, the FAME database describes the
status of companies in two categories:
1. Active
2. Inactive
1. Dissolved
2. Liquidated
Table 4.1 Financial ratios used in this study

| Ratio name used in this study | Financial ratio | Approximate number of research articles containing this ratio |
|---|---|---|
| X1 | Factor/Consideration | 65 |
| X2 | Net income / Total assets | 60 |
| X3 | Current ratio | 50 |
| X4 | Working capital / Total assets | 50 |
| X5 | Retained earnings / Total assets | 40 |
| X6 | Earnings before interest and taxes / Total assets | 36 |
| X7 | Sales / Total assets | 35 |
| X8 | Quick ratio | 33 |
| X9 | Total debt / Total assets | 31 |
| X10 | Current assets / Total assets | 29 |
| X11 | Net income / Net worth | 25 |
| X12 | Total liabilities / Total assets | 23 |
| X13 | Cash / Total assets | 21 |
| X14 | Market value of equity / book value of equity | 18 |
| X15 | Cash flow from operations / Total assets | 17 |
| X16 | Cash flow from operations / Total liabilities | 19 |
| X17 | Current liabilities / Total assets | 15 |
| X18 | Cash flow from operations / Total debt | 13 |
| X19 | Quick assets / Total assets | 14 |
| X20 | Current assets / Sales | 15 |
| X21 | Earnings before interest and taxes / Interest | 15 |
| X22 | Inventory / Sales | 15 |
| X23 | Operating income / Total assets | 13 |
| X24 | Cash flow from operations / Sales | 13 |
| X25 | Net income / Sales | 14 |
| X26 | Long-term debt / Total assets | 15 |
| X27 | Net worth / Total assets | 13 |
| X28 | Total debt / Net worth | 14 |
| X29 | Total liabilities / Net worth | 14 |
| X30 | Cash / Current liabilities | 15 |
| X31 | Cash flow from operations / Current liabilities | 10 |
| X32 | Working capital / Sales | 9 |
| X33 | Capital / Assets | 7 |
| X34 | Net sales / Total assets | 8 |
| X35 | Net worth / Total liabilities | 7 |
| X36 | Total assets | 7 |
| X37 | Cash flow (using net income) / Debt | 7 |
| X38 | Cash flow from operations | 7 |
| X39 | Operating expenses / Operating income | 7 |
| X40 | Quick assets / Sales | 7 |
| X41 | Sales / Inventory | 7 |
4.5 Data Pre-Processing
To apply data mining techniques, the data must be filtered and prepared so that patterns in
the data can be recognised efficiently. According to Han and Kamber (2000), the data mining
process involves six important steps: select data, filter data, give meaning (value) to the
filtered data, programming, data mining and report generation.
Data cleaning is very important, as it removes errors from the data and improves its quality,
since data obtained from any source may have missing values, outliers and noise. Data
pre-processing is the phase in which data is prepared for analysis using different data cleaning
and processing methods. If data is not pre-processed before models are applied, the
results can differ markedly from those obtained with processed data. Therefore, it is important
to pre-process data for better classification results.
Moreover, the data used in this study is presented as a combination of X and T
variables, where X (from 1 to 41) identifies the ratio and T (from 1 to 5) gives the number
of years before bankruptcy. Since the data contains ratios for the five years prior to
bankruptcy and I have to apply data mining to each year's data separately, I made a separate
data file containing the ratios for each year. For example, to apply data mining models to the
data five years before bankruptcy, I deleted the ratios for years one to four, leaving the
year-five ratios specified as X,T (where X = 1 to 41 and T = 5). I used IBM SPSS to make these
data samples. In addition, I deleted the columns that were not required in this study:
status and event data year. Since I also want to find the most important ratios in
bankruptcy prediction, I made further data files containing different subsets of ratios
(deleting the others). To make the data cleaner, I reduced values with more than six decimal
places to four decimal places using the Excel function ROUNDUP(number, num_digits), which
rounds away from zero rather than truncating.
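The two preparation steps described above, keeping only the ratios for one year and rounding them to four decimal places, can be sketched in Python. This is only an illustration under stated assumptions, not the SPSS and Excel procedure used in the study: the column naming pattern `X<n>T<t>` follows the variable names listed in Section 4.5.1, while the helper names `select_year` and `round_up` and the sample values are hypothetical.

```python
from decimal import Decimal, ROUND_UP
import re

def select_year(row, year):
    """Keep only the ratio columns (X1T<year>, X2T<year>, ...) for one year."""
    pattern = re.compile(rf"^X(\d+)T{year}$", re.IGNORECASE)
    return {name: value for name, value in row.items() if pattern.match(name)}

def round_up(value, digits=4):
    """Round away from zero to `digits` decimals, as Excel's ROUNDUP does."""
    quantum = Decimal(1).scaleb(-digits)  # 0.0001 when digits == 4
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_UP))

# Hypothetical record with ratios for year 1 and year 5 before bankruptcy.
row = {"X1T1": 0.1234567, "X2T1": -0.5, "X1T5": 1.00001234, "X2T5": -0.33333333}

# Keep the year-5 ratios only, then round each one to four decimals.
year5 = {name: round_up(value) for name, value in select_year(row, 5).items()}
print(year5)
```

Because rounding away from zero always increases the magnitude of a value, it introduces a small systematic bias; ordinary rounding would be an alternative if that matters.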
Finally, the data showed the bankruptcy status of the firms in binary form (0 for bankrupt and
1 for non-bankrupt firms). I converted this variable to nominal for classification,
changing 0 to “bankrupt” and 1 to “non-bankrupt” as string values for better classification
analysis. Lastly, I deleted the columns of the sample data that were not required in this
study: Company Name, status and Year of Event.
4.5.1 Missing values
Missing values have always been a problem for researchers, and it is up to researchers how
they deal with them. According to Rubin (2002), there are three major kinds of
missing-value mechanisms:
Missing values may be limited or spread throughout the data. They are limited when only
a few values are missing, and total when the data is full of missing values. The
most commonly used method of handling missing values is to impute them with the
average value. The SPSS missing values analysis gives complete insight into the missing
values in the data one year before bankruptcy. According to this analysis, the variables in
each data sample have the following missing values:
X1T1, X3T1, X4T1, X5T1, X6T1, X8T1, X9T1, X10T1, X11T1, X12T1, X14T1, X16T1, X18T1, X19T1,
X20T1, X21T1, X22T1, X23T1, X24T1, X25T1, X26T1, X27T1, X28T1, X29T1, X30T1, X31T1, X32T1,
X33T1, X34T1, X35T1, X36T1, X39T1, X40T1 and X41T1 contain zero missing values; variables
X7T1, X13T1, X15T1 and X17T1 contain more than 50 missing values, while variable
X2T1 has five missing values. Moreover, SAS and IBM SPSS have methods to
impute the missing values in the data. I have used IBM SPSS to detect and impute missing
values in the data using the mean of nearby points method.
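The idea behind imputing with the mean of nearby points can be sketched as follows. This is a simplified illustration rather than the SPSS implementation: the function name `impute_nearby_mean`, the `span` parameter and the edge handling (using whatever valid neighbours exist) are assumptions made for the example, and the data values are hypothetical.

```python
def impute_nearby_mean(values, span=2):
    """Replace each None with the mean of up to `span` valid values on each side."""
    result = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        # Nearest valid observations before and after the gap (original data only).
        before = [x for x in reversed(values[:i]) if x is not None][:span]
        after = [x for x in values[i + 1:] if x is not None][:span]
        neighbours = before + after
        if neighbours:  # leave the gap untouched if no valid neighbour exists
            result[i] = sum(neighbours) / len(neighbours)
    return result

# Hypothetical column of one ratio with a single missing value.
ratios = [0.10, 0.20, None, 0.40, 0.50]
print(impute_nearby_mean(ratios))
```

The result of such an imputation depends on the order of the rows, because "nearby" is defined by position in the file.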
4.5.2 Outliers
Outliers are values that lie significantly far away from the other observations in
the data (Hansen et al., 1983). Outliers affect the results of the analysis method and also
skew the data away from a normal distribution. The most commonly used methods of dealing with
outliers are (Dhiren and Ghosh, 2012):
In the trimming method, outliers are removed from the data during analysis, whereas
winsorizing assigns an outlier the highest or lowest value in the data that is not an
outlier. A general method of winsorizing is to replace any data value above the 95th
percentile of the sample data with the 95th percentile, and any value below the 5th percentile
with the 5th percentile (Dhiren and Ghosh, 2012).
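The winsorizing rule described above can be sketched in a few lines of Python. The percentile definition used here (linear interpolation between the closest ranks) is an assumption; statistical packages such as SPSS offer several percentile definitions, so the cut-off values may differ slightly from those in Appendix-A. The function names and the sample data are hypothetical.

```python
def percentile(sorted_values, p):
    """p-th percentile by linear interpolation between the closest ranks."""
    k = (len(sorted_values) - 1) * p / 100.0
    lower = int(k)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = k - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

def winsorize(values, low=5, high=95):
    """Clamp values below the `low` percentile and above the `high` percentile."""
    ordered = sorted(values)
    lower_cut = percentile(ordered, low)
    upper_cut = percentile(ordered, high)
    return [min(max(v, lower_cut), upper_cut) for v in values]

# Hypothetical ratio values 1, 2, ..., 101; the extremes are pulled inwards.
data = list(range(1, 102))
clean = winsorize(data)
print(min(clean), max(clean))
```

Unlike trimming, winsorizing keeps the number of observations unchanged, which matters when the sample of bankrupt firms is small.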
Figure 4.1 Method used in SPSS to find 5th and 95th percentile
Tables 4.2, 4.3, 4.4, 4.5 and 4.6 in Appendix-A present the 5th and 95th percentiles of each
year's data.
4.7 Summary
The data has now been pre-processed and cleansed using different statistical methods
and is ready to be used in developing the bankruptcy prediction models. The next
chapter presents this implementation.
Chapter 5: Model development and application
5.1 Introduction
This chapter consists of three parts: Part 1 presents the application of data mining methods
using SAS Enterprise Miner, Part 2 elaborates the use of data mining algorithms using
WEKA software, and Part 3 presents the classification of bankruptcy data using IBM SPSS
Modeler.
Part-1:
5.2 Overview
This part gives a brief description of SAS Enterprise Miner and its predictive modelling
approach. It also introduces the step-by-step implementation of the models,
with a brief introduction to the data mining model nodes used and their execution using SAS
programming.
SAS Enterprise Miner provides a GUI for performing different data mining tasks. The GUI
consists of a workspace into which nodes can be dragged from a toolbar to create a process flow
diagram. Figure 5.1 shows the process of creating a project in SAS Enterprise Miner.
Figure 5.1 Step-by-step method of creating a project in SAS Enterprise Miner
Create a Library -> Create a Diagram -> Place Node in Workspace and Execute
SAS Enterprise Miner has many features, including a set of data mining tools, an easy-to-use
GUI, more accurate predictions, development of better predictive models for later use, and a
text editor for writing code to perform tasks through SAS Enterprise Guide. SAS also supports
the goal of the data mining process, which is to develop predictive models. These models help
to find rules for prediction using the variables and data from one data source. Once a good
predictive model has been created, it can be applied to a new data source for prediction.
5.3 Application of the Models
To develop bankruptcy prediction models, a data set is required. This data set needs to be
imported into SAS Enterprise Miner. Since SAS cannot read the data set in its original form, it is
converted to a SAS data set. In the next step, the SAS data set is divided into training, validation and
testing parts and explored using the StatExplore node. I have used 70% of the data for training
and 30% as validation data to test the results. After the data has been divided,
different predictive models are applied and compared. The validation data set is used to keep a modelling
node from overfitting the training data and to compare different models. Finally, the results of these
models are obtained and the best ones are considered. Figure 5.2 presents the step-by-step implementation
of model generation using SAS EM:
Figure 5.2 The step-by-step implementation of model generation using SAS EM
Data Set -> Different Models Implementation -> Model Accuracy -> Final Results
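The 70% / 30% partition into training and validation data described above can be illustrated with a short Python sketch. It is a plain random split under stated assumptions and does not reproduce the SAS Enterprise Miner partitioning node: the function name `split_train_validation`, the fixed seed and the placeholder firm records are all hypothetical.

```python
import random

def split_train_validation(records, train_fraction=0.7, seed=42):
    """Shuffle the records and split them into training and validation parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Placeholder sample: 464 bankrupt and 464 non-bankrupt firms, as in the study.
firms = [("firm%03d" % i, "bankrupt" if i < 464 else "non-bankrupt") for i in range(928)]

train, validation = split_train_validation(firms)
print(len(train), len(validation))
```

A plain random split does not guarantee the same share of bankrupt firms in both parts; splitting each class separately (a stratified split) would preserve the 50/50 balance.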
In this phase I apply the prediction model nodes to the five data sets separately to perform
the bankruptcy prediction task, with a short introduction to each model employed.
5.3.5 Neural Network Model
This node helps to generate, train and test multilayer feed-forward neural networks
(SAS Institute Inc., 2003). Overall model accuracy using the Neural Network model is 95.4%,
97.7%, 93.25%, 92.2% and 90.1% for one to five years respectively. Table 5.3 in Appendix-A
presents the bankruptcy prediction accuracy. Moreover, Figure 5.6 in Appendix
B shows the classification bar graph and score mode for each year's bankrupt and non-bankrupt
classification.
Overall model accuracy using the HP Neural model is 51.0%, 47.25.0%, 84.0%, 89.4% and 54.6%
for one to five years respectively. Table 5.5 in Appendix-A presents the bankruptcy prediction
accuracy. In addition to the classification accuracy table, Figure 5.8 in Appendix B
shows the classification bar graph, NN diagram and score mode for each year's bankrupt and
non-bankrupt classification.
5.3.8 Data Mining Neural Model
The DMNeural node is used to create an additive nonlinear model. The main purpose of
the algorithm used in the DMNeural node is to overcome problems such as nonlinear
estimation, computing time, and finding a global optimal solution. The training
process of DMNeural creates eight functions. Each function performs a particular
role and is optimised individually. The DMNeural node
chooses the function that gives the most appropriate results (SAS Institute Inc., 2013). Overall
model accuracy using the DMNeural model is 46.64%, 55.15%, 52.4%, 61.1% and 64.7% for one
to five years respectively. Table 5.6 in Appendix-A presents the bankruptcy prediction
accuracy. In addition to the classification accuracy table, Figure 5.9 in Appendix B
shows the classification bar graph, NN diagram and score mode for each year's bankrupt and
non-bankrupt classification.
1. Link Function
2. Error function
The link function is used for distribution problems and the error function is used to perform
linear regression on the data (SAS Institute Inc., 2013). Overall model accuracy using the
Regression model is 46.64%, 55.15%, 52.4%, 61.1% and 64.7% for one to five years respectively.
Table 5.7 in Appendix-A presents the bankruptcy prediction accuracy. In addition to
the classification accuracy table, Figure 5.10 in Appendix B shows the classification bar
graph, NN diagram and score mode for each year's bankrupt and non-bankrupt classification.
In addition to the classification accuracy table, Figure 5.11 in Appendix B shows the
classification bar graph, NN diagram and score mode for each year's bankrupt and
non-bankrupt classification.
Overall model accuracy using the HP Regression model is 99.0%, 50.0%, 47.25%, 49.0% and
50.5% for one to five years respectively. Table 5.9 in Appendix-A presents the bankruptcy
prediction accuracy. In addition to the classification accuracy table, Appendix B shows
the classification bar graph, NN diagram and score mode for each year's bankrupt and
non-bankrupt classification.
In addition to the classification accuracy table, Figure 5.13 in Appendix B shows the
classification bar graph, NN diagram and score mode for each year's bankrupt and
non-bankrupt classification accuracy.
In this part I have applied the different data mining node models available to the data sample.
Each model has its strengths and weaknesses. Chapter 6 gives a complete insight into each
model's results and accuracy. The following is the final implementation diagram of all the SAS
data mining models that I have used in this study for data set 1.
Part 2:
This section gives a brief introduction to WEKA and the method of applying data mining in WEKA.
This part also describes the application of data mining algorithms to the data samples using
WEKA software.
5.4 WEKA:
WEKA is open-source software consisting of a group of algorithms for performing data mining
tasks on large amounts of data. Using WEKA, it is possible to apply different data mining
techniques such as classification, regression, clustering and association rule
mining (Mark et al., 2009). WEKA divides classification algorithms into different groups:
Bayes classifiers, function classifiers, lazy classifiers, meta classifiers, MI classifiers,
rule-based classifiers and tree classifiers. Figure 5.14 gives the step-by-step implementation
of data mining algorithms on data using WEKA.
Pre-process -> Select Bankrupt/Non-Bankrupt as Target -> Select Classification Algorithms and apply -> Calculate Model Accuracy
Since WEKA provides the algorithmic models, this section presents the application of these
models and their empirical classification accuracy using the implementation approach
described above. In each case I process the confusion matrix and calculate the
classification accuracy. For every model generated, I have used 10-fold cross-validation
to validate the accuracy of the model.
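The two calculations mentioned above, 10-fold cross-validation and accuracy from a confusion matrix, can be sketched in Python. This is a simplified illustration and not WEKA's own code: the folds here are contiguous blocks without shuffling or stratification, the function names are hypothetical, and the confusion-matrix counts are invented for the example.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs so that each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def accuracy(confusion):
    """confusion[actual][predicted] holds counts; accuracy = diagonal / total."""
    correct = sum(confusion[label][label] for label in confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total

# Invented counts for a balanced sample of 464 + 464 firms.
confusion = {
    "bankrupt": {"bankrupt": 400, "non-bankrupt": 64},
    "non-bankrupt": {"bankrupt": 100, "non-bankrupt": 364},
}

folds = list(k_fold_indices(928, k=10))
print(len(folds), [len(test) for _, test in folds])
print(round(100 * accuracy(confusion), 2))
```

In each of the ten rounds a model would be trained on `train_idx` and evaluated on `test_idx`, and the ten confusion matrices would be summed before the accuracy is computed.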
5.4.5 RBFNetwork Model
A radial basis function network (RBFNetwork) is a neural network that uses radial basis
functions as activation functions (Schwenker et al., 2001). The overall
prediction accuracy, considering both bankrupt and non-bankrupt firms, using the RBFNetwork
model is 61.7%, 77.5%, 63.5%, 55.7% and 88.3% for one to five years respectively.
Moreover, Table 5.14 in Appendix-A gives the detailed prediction accuracy for both bankrupt
and non-bankrupt firms.
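To make the idea of a radial basis activation concrete, the following Python sketch evaluates a Gaussian radial basis function and a weighted sum of such units. It is an illustration under stated assumptions only: the Gaussian form is one common choice of radial basis function, and the centres, widths and weights below are hypothetical values rather than anything fitted by WEKA.

```python
import math

def gaussian_rbf(x, centre, width):
    """Gaussian radial basis: depends only on the distance between x and centre."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-squared_distance / (2.0 * width ** 2))

def rbf_network_output(x, centres, widths, weights, bias=0.0):
    """Output of a single-output RBF network: bias plus weighted hidden activations."""
    activations = [gaussian_rbf(x, c, s) for c, s in zip(centres, widths)]
    return bias + sum(w * a for w, a in zip(weights, activations))

# Hypothetical network with two hidden units over two financial ratios.
centres = [[0.1, 0.5], [0.8, 0.2]]
widths = [0.3, 0.3]
weights = [1.0, -1.0]

score = rbf_network_output([0.1, 0.5], centres, widths, weights)
print(score > 0)
```

An input close to the first centre activates the first unit strongly and the second only weakly, so the sign of the output can be read as a class decision.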
48.22%, and 56.9% for one to five years respectively. Moreover, Table 5.18 in Appendix-A
gives a detailed prediction accuracy of both bankrupt and non-bankrupt firms applying
ClassificationViaRegression model.
The overall prediction accuracy, considering both bankrupt and non-bankrupt firms, using the
MultiBoostAB model is 57.8%, 52.15%, 57.7%, 87.7% and 56.1% for one to five years
respectively. Moreover, Table 5.22 in Appendix-A gives the detailed prediction accuracy for
both bankrupt and non-bankrupt firms using the MultiBoostAB model.
The overall prediction accuracy, considering both bankrupt and non-bankrupt firms, using the
Random Committee model is 54.0%, 49.5%, 51.9%, 50.4% and 50.4% for one to five years
respectively. Moreover, Table 5.23 in Appendix-A gives the detailed prediction accuracy for
both bankrupt and non-bankrupt firms using the Random Committee model.
The overall prediction accuracy, considering both bankrupt and non-bankrupt firms, using the
HyperPipes model is 49.6%, 48.50%, 48.60%, 49.3% and 46.9% for one to five years
respectively. Moreover, Table 5.24 in Appendix-A gives the detailed prediction accuracy for
both bankrupt and non-bankrupt firms using the HyperPipes model.
The overall prediction accuracy considering both bankrupt and non-bankrupt firms using
OneR Model is 51.3%, 51.02%, 51.3%, 51.02% and 50.05% for one to five years
respectively. Moreover, Table 5.26 in Appendix-A gives a detailed prediction accuracy of
both bankrupt and non-bankrupt firms using OneR model.
5.4.23 END Model
This algorithm belongs to the meta group of algorithms in WEKA. It is used to solve
problems related to two-class classifiers. The overall prediction accuracy, considering both
bankrupt and non-bankrupt firms, using the END model is 52.5%, 52.4%, 54.1%, 51.0% and
52.5% for one to five years respectively. Moreover, Table 5.31 in Appendix-A gives the
detailed prediction accuracy for both bankrupt and non-bankrupt firms using the END model.
Part 3
This section consists of a brief introduction to IBM SPSS, application of MLP neural
networks, different variations of decision trees and nearest neighbour algorithm to predict
bankruptcy.
respectively. Moreover, Table 5.33 in Appendix-A gives a detailed prediction accuracy of
both bankrupt and non-bankrupt firms using CHAID model.
1. Merging
2. Splitting
3. Stopping
In the merging step, the non-significant categories of each explanatory (input) variable are
merged, and each final category has one child node. The merging step also calculates the
p-value, which is used in the splitting step. The splitting step then finds the best split for
each predictor found in the merging step and selects which predictor is to be used to split the
node. In the final step, the stopping step halts the tree-growing process:
The overall prediction accuracy, considering both bankrupt and non-bankrupt firms, using the
CHAID model is 58.5%, 82.2%, 55.2%, 53.0% and 66.3% for one to five years respectively.
Moreover, Table 5.34 in Appendix-A gives the detailed prediction accuracy for both bankrupt and
non-bankrupt firms using the CHAID Exhaustive model.
The overall prediction accuracy, considering both bankrupt and non-bankrupt firms, using the
CART model is 57.9%, 57.1%, 56.2%, 54.4% and 52.7% for one to five years respectively.
Moreover, Table 5.35 in Appendix-A gives the detailed prediction accuracy for both bankrupt and
non-bankrupt firms using the CART model.
The overall prediction accuracy, considering both bankrupt and non-bankrupt firms, using the
QUEST model is 94.0%, 82.0%, 78.2%, 50.0% and 50.0% for one to five years respectively.
Moreover, Table 5.36 in Appendix-A gives the detailed prediction accuracy for both bankrupt and
non-bankrupt firms using the QUEST model.
The overall prediction accuracy, considering both bankrupt and non-bankrupt firms, using the
K-NN model is 61.3%, 53.4%, 45.2, % and 47.1% for one to five years respectively. Moreover,
Table 5.37 in Appendix-A gives the detailed prediction accuracy for both bankrupt and
non-bankrupt firms using the K-NN model.
5.7 Summary
This chapter gives a complete insight into the implementation and generation of the models and
the overall classification accuracy of each model using SAS Enterprise Miner, WEKA and IBM SPSS.
The next step is to critically analyse these results and select the most efficient model from
each data mining software package used in this chapter.
Chapter 6 Results Analysis and Critical Evaluation
6.1 Introduction
This chapter gives a brief description of the Type-I and Type-II errors of bankruptcy prediction
models, followed by the analysis and critical evaluation of the results obtained from
applying the models using SAS Enterprise Miner, WEKA and IBM SPSS.
According to Neves and Vieira (2006) overall Type-I error is calculated as:
According to Neves and Vieira (2006) overall Type-II error is calculated as:
6.5 Classification Accuracy
The classification accuracy of a bankruptcy prediction model is generally measured by the percentage
of correctly classified observations. Following Neves and Vieira (2006), classification accuracy is
calculated as:
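As a hedged illustration, the three measures discussed in this chapter can be computed from the four cells of a two-class confusion matrix. The definitions below follow the convention common in the bankruptcy prediction literature (Type-I error: a bankrupt firm classified as non-bankrupt; Type-II error: a non-bankrupt firm classified as bankrupt); this is an assumption and may differ in detail from the formulas of Neves and Vieira (2006). The function name and the counts are hypothetical.

```python
def error_rates(tp, fn, tn, fp):
    """Return (type_1, type_2, accuracy) as fractions.

    tp / fn: bankrupt firms classified correctly / as non-bankrupt.
    tn / fp: non-bankrupt firms classified correctly / as bankrupt.
    """
    type_1 = fn / (tp + fn)                     # missed bankruptcies
    type_2 = fp / (tn + fp)                     # false bankruptcy alarms
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # correctly classified share
    return type_1, type_2, accuracy

# Invented counts for a sample of 464 bankrupt and 464 non-bankrupt firms.
type_1, type_2, accuracy = error_rates(tp=445, fn=19, tn=443, fp=21)
print(f"Type-I {type_1:.2%}, Type-II {type_2:.2%}, accuracy {accuracy:.2%}")
```

With a balanced sample such as this one, the overall accuracy equals the average of the per-class accuracies reported separately for bankrupt and non-bankrupt firms.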
Table 6.1 (Part-2) and Figure 6.1, which also include the classification accuracy for
non-bankrupt firms over the five years prior, show that the Neural Network and Auto Neural models
give a classification accuracy for bankrupt firms of more than 90% for each year before the event.
According to Table 6.1 (Part-1), the prediction accuracy of Neural Networks is 95.90%, 97.80%,
95.50%, 95.00% and 95.00% for one to five years before bankruptcy respectively, which shows that
NNs are more efficient than the other three models, as the others fluctuate in some years. Similarly,
Table 6.1 (Part-2) shows that the Auto Neural model, which is also a type of NN, gives 93%, 99.5%,
99%, 97.6% and 99% for years one to five respectively.
According to research in the field of bankruptcy prediction, researchers have
used different statistical and intelligent methods to predict bankruptcy, but Neural Networks and
their variants are the most commonly used intelligent methods (Kumar and Ravi, 2007). Cadden (1991)
used a neural network model to predict bankruptcy with a three-year-ahead forecast; his classification
accuracy was 90%, 90% and 80% respectively for bankrupt firms and 100%, 90% and 90% for
non-bankrupt firms. Moreover, Leshno and Spector (1996) also used the Neural Network method to predict
bankruptcy and obtained a prediction accuracy of 76.4% to 76.4% for the two-years-ahead case.
Table 6.1 Bankrupt and non-bankrupt five years ahead prediction accuracy table using SAS Enterprise Miner models
B = (Part-1) Bankruptcy Prediction Accuracy (%) Prior Event; NB = (Part-2) Non-Bankruptcy Prediction Accuracy (%) Prior Event

| Model Name | B One year | B Two years | B Three years | B Four years | B Five years | NB One year | NB Two years | NB Three years | NB Four years | NB Five years |
|---|---|---|---|---|---|---|---|---|---|---|
| Decision Trees | 73.27 | 60.00 | 64.50 | 32.00 | 39.80 | 52.00 | 61.20 | 94.80 | 73.50 | 95.6 |
| HP Trees | 78.44 | 84.00 | 71.70 | 51.00 | 32.00 | 50.40 | 52.90 | 70.00 | 73.00 | 90.20 |
| Neural Network | 95.90 | 97.80 | 95.50 | 95.00 | 95.00 | 95.40 | 97.60 | 92.00 | 92.40 | 93.10 |
| Auto Neural | 94.00 | 99.50 | 0 | 97.80 | 0 | 93.00 | 99.50 | 99.00 | 97.60 | 99 |
| HP Neural | 90.00 | 95.00 | 95.00 | 90.00 | 97.8 | 12.00 | 0.00 | 73.00 | 88.90 | 12.00 |
| DMNeural | 59.30 | 93.70 | 92.27 | 74.20 | 39.80 | 63.57 | 16.60 | 13.00 | 48.00 | 90.51 |
| Regression | 98.70 | 95.00 | 92.27 | 94.42 | 95.90 | 99.50 | 0.00 | 0.00 | 6.20 | 3.40 |
| HP SVM | 67.70 | 40.70 | 40.70 | 38.70 | 33.40 | 49.13 | 67.70 | 67.70 | 70.00 | 63.57 |
| HP Regression | 98.00 | 92.27 | 95.60 | 95.60 | 93.00 | 100.00 | 0.00 | 0.00 | 3.00 | 3.00 |
| MBR | 39.80 | 48.20 | 49.10 | 47.20 | 47.20 | 64.47 | 75.60 | 70.00 | 75.40 | 71.90 |
[Figure: paired bar charts titled Bankrupt Prediction Accuracy and Non-Bankrupt Prediction Accuracy; vertical axis 0 to 120; legend: One year, Two Years, Three Years, Four years, Five Years]
6.6.2 Analysis of Results of WEKA
The results obtained from the WEKA data mining algorithms show
that WEKA is also very good software for performing classification with different algorithms.
Table 6.2 (Part-1) and Figure 6.2 clearly show that SimpleCart, RBFNetwork and
MultiBoostAB are the most efficient algorithms for predicting bankruptcy. The
prediction accuracy of the SimpleCart algorithm is 89.60%, 70.00%, 89.80%, 86.40% and
85.70% for one to five years respectively in the five-years-ahead forecast of
bankruptcy. The MultiBoostAB algorithm also shows good prediction accuracy of 82.10%,
76.29% and 80.40% for the first, second and fourth years, but its classification accuracy is
below 70% for the third and fifth years. Figure 6.3 also shows that RBFNetwork is a very good
predictor of bankruptcy, with a prediction rate of over 70% in the first two years, 90% in the
third and fifth years, and 48% in the fourth year.
The forecast for non-bankrupt firms is also handled efficiently by the OneR, HyperPipes and
Dagging algorithms. Table 6.2 (Part-2) and Figure 6.3 show that the classification accuracy
of OneR is over 95.0% throughout the five-years-ahead forecast. It can also be
observed that the non-bankrupt classification accuracy of the HyperPipes algorithm is more than
80% for the first four years and 77.6% for the fifth year.
Figure 6.2 Bankrupt firms five years ahead prediction accuracy using WEKA models chart
[Chart: one group of bars per WEKA model; legend: One year, Two Years, Three Years, Four years, Five Years]
Figure 6.3 non-Bankrupt firms five years prediction accuracy using WEKA models chart
Table 6.2 Bankrupt and non-bankrupt firms five years ahead prediction accuracy using WEKA models (columns give years prior to bankruptcy)
                             (Part-1) Bankruptcy prediction accuracy      (Part-2) Non-bankruptcy prediction accuracy
Model name                   1 year   2 years  3 years  4 years  5 years  1 year   2 years  3 years  4 years  5 years
Naïve Bayes                  92.00%   6.20%    79.70%   93.75%   92.80%   79.70%   96.30%   92.00%   10.50%   94.60%
BayesNet                     100.00%  6.20%    78.00%   57.10%   57.30%   79.70%   96.30%   98.00%   45.20%   43.00%
SMO                          73.70%   58.20%   62.50%   51.50%   55.60%   49.70%   51.90%   59.40%   53.20%   45.90%
RBFNetwork                   76.70%   75.40%   92.30%   62.90%   95.00%   46.70%   79.50%   34.60%   62.50%   81.70%
KSTAR                        100%     50.20%   49.80%   49.80%   52.80%   100%     47.40%   54.50%   51.00%   47.60%
LWL                          81.50%   61.20%   74.80%   91.60%   10.60%   21.90%   46.70%   29.50%   95.70%   87.50%
AdaBoostM1                   53.20%   51.00%   53.20%   83.80%   45.90%   64.00%   64.00%   37.20%   25.00%   49.80%
ClassificationviaRegression  32.30%   29.31%   62.50%   18.70%   24.70%   68.90%   66.40%   73.06%   78.50%   89.22%
Decorate                     52.80%   92.70%   23.70%   55.20%   88.20%   57.30%   12.50%   79.90%   51.50%   16.60%
Dagging                      50.21%   37.90%   44.60%   52.80%   43.10%   74.40%   81.90%   81.50%   71.20%   84.50%
LogisticBoost                68.10%   73.92%   68.90%   70.00%   60.30%   44.80%   51.50%   70.90%   21.20%   72.20%
MultiBoostAB                 82.10%   76.29%   69.40%   80.40%   61.80%   33.60%   34.00%   46.10%   95.20%   50.31%
Random Committee             56.50%   51.30%   49.50%   53.20%   53.20%   51.50%   47.60%   54.30%   47.60%   50.40%
HyperPipes                   19.20%   16.80%   16.80%   19.50%   16.20%   80.20%   80.20%   80.30%   80.20%   77.60%
NNge                         53.01%   55.80%   57.90%   46.90%   53.50%   48.70%   44.30%   46.70%   41.59%   44.10%
OneR                         6.00%    4.70%    6.00%    4.70%    5.30%    96.70%   97.50%   96.70%   96.70%   95.01%
ZeroR                        39.60%   39.60%   39.60%   39.60%   39.60%   59.50%   59.50%   59.50%   59.50%   59.50%
Random Forest                55.20%   32.80%   42.00%   47.10%   36.60%   47.20%   66.16%   56.40%   47.00%   64.20%
J48                          64.10%   40.80%   28.40%   54.00%   49.70%   40.90%   56.40%   70.60%   48.00%   52.10%
SimpleCart                   89.60%   70.00%   89.80%   86.40%   85.70%   10.10%   29.74%   10.50%   30.40%   21.30%
END                          64.00%   60.00%   66.20%   49.80%   64.00%   40.90%   44.80%   40.90%   52.20%   40.90%
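Because Part-1 and Part-2 report accuracy on bankrupt and non-bankrupt firms separately, a model can look excellent in one part while failing in the other. The following sketch combines the two one-year-ahead figures from the table above into a single balanced accuracy. It assumes that both classes are equally represented in each yearly test sample, as the 464/464 design suggests; the function name and the choice of models are illustrative and were not part of the original analysis.

```python
# Per-class accuracies (%) one year ahead, copied from the table above:
# (bankrupt firms correctly classified, non-bankrupt firms correctly classified)
one_year = {
    "OneR": (6.00, 96.70),
    "SimpleCart": (89.60, 10.10),
    "RBFNetwork": (76.70, 46.70),
    "Naive Bayes": (92.00, 79.70),
    "ZeroR": (39.60, 59.50),
}


def balanced_accuracy(bankrupt_acc, non_bankrupt_acc):
    """Mean of the two per-class accuracies (assumes equal class sizes)."""
    return (bankrupt_acc + non_bankrupt_acc) / 2.0


balanced = {name: balanced_accuracy(*pair) for name, pair in one_year.items()}

# Print the models from best to worst balanced accuracy.
for name, value in sorted(balanced.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:12s} {value:6.2f}%")
```

On these figures OneR, although it classifies over 95% of non-bankrupt firms correctly, scores close to 50% overall, which is roughly what a classifier that labels nearly every firm as non-bankrupt would achieve on a balanced sample.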
6.6.3 Analysis of results of IBM SPSS models
The results obtained from the SPSS models demonstrate that SPSS can also be used to predict
the bankruptcy of a firm effectively. Table 6.3 (Part-1) and Figure 6.4 show that the
Multi-Layer Perceptron Neural Network (MLP Neural Network) is the most effective model for
predicting bankruptcy. The prediction accuracy of this model is 100.00%, 90.40%, 98.10%,
74.40% and 32.10% for the first- to fifth-year forecasts respectively. The Classification and
Regression Tree (CART) model takes second position in the prediction of bankruptcy. The
classification accuracy of the CART model is 84.90%, 72.20%, 86.20%, 83.80% and 95.00%, one
to five years before bankruptcy respectively.
Non-bankrupt firms are also predicted well by the MLP Neural Network model. Table 6.3
(Part-2) and Figure 6.4 present its classification accuracy for non-bankrupt firms: 100.00%,
82.00%, 91.00%, 42.50% and 72.20%, one to five years ahead respectively. Figure 6.6 shows
that the Quick, Unbiased, Efficient Statistical Tree (QUEST) also provides good classification
accuracy for non-bankrupt firms: 88.10%, 56.20%, 100.00% and 100.00% for the first four years,
but 0% for the fifth year.
Table 6.3 Bankrupt and non-bankrupt firms five years prediction accuracy table using SPSS
Columns: One year, Two years, Three years, Four years and Five years, given once for bankrupt firms (Part-1) and once for non-bankrupt firms (Part-2)
6.7 Critical Evaluation
There is one pitfall associated with each data mining software package I have used in this
empirical study. Despite the various advantages of SAS Enterprise Miner, it has one
disadvantage: it works on nodes and does not specify the name of the algorithm used in the
development of a model. WEKA resolves this problem, but it has a problem of its own: it does
not provide a graphical user interface. Both of these problems are eliminated by IBM SPSS,
but I did not have access to the complete IBM SPSS Modeler data mining product.
The dataset samples used in this study were also a big hindrance in performing different data
mining techniques. All data samples had a great many missing values. Although I applied the
IBM SPSS missing value imputation technique to address this drawback, I am not sure that all
the values were imputed efficiently by IBM SPSS.
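The imputation method applied in SPSS is not named here, so the following is only a sketch of one common approach, median imputation per ratio. It also counts how many values were filled in for each ratio, which is one way to quantify the concern raised above. The ratio names and values are invented for illustration.

```python
from statistics import median


def impute_median(rows, columns):
    """Replace None in each column by that column's median.

    Returns a new list of rows and the number of imputed values per column.
    """
    filled = [dict(row) for row in rows]  # copy so the input is not mutated
    counts = {}
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        if not observed:
            raise ValueError(f"column {col!r} has no observed values")
        fill_value = median(observed)
        counts[col] = 0
        for r in filled:
            if r[col] is None:
                r[col] = fill_value
                counts[col] += 1
    return filled, counts


# Hypothetical firms with two hypothetical ratios; None marks a missing value.
firms = [
    {"current_ratio": 1.2, "debt_to_equity": 0.8},
    {"current_ratio": None, "debt_to_equity": 2.5},
    {"current_ratio": 0.7, "debt_to_equity": None},
    {"current_ratio": 2.1, "debt_to_equity": 1.1},
]
completed, n_imputed = impute_median(firms, ["current_ratio", "debt_to_equity"])
print(completed)
print(n_imputed)
```

Reporting the per-ratio counts alongside the results would let a reader judge how much of each sample rests on imputed rather than observed values.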
The final drawback of this approach is that it cannot detect human errors and fraud.
Financial statements are prepared by accountants and other staff of the company. If they do
not give correct information about the company's ratios, these models are unable to predict
the bankruptcy of the company. So, if the financial ratios are faulty, the results will be
correspondingly faulty.
6.8 Summary
This chapter contains the results of all the major software used in this study. I have
concluded that all models of the software used in this study have their particular importance
in the field of bankruptcy prediction. The most important models for predicting bankruptcy
using SAS Enterprise Miner are Neural Network, Auto Neural, Regression and HP Regression. The
most efficient models for forecasting bankruptcy using WEKA are SimpleCart, RBFNetwork, OneR
and MultiBoostAB. For IBM SPSS, the most reliable models for classifying bankruptcy are MLP
Neural Network, CART and QUEST. Finally, the main pitfall in the study is the missing values
in the data.
Chapter 7 Conclusion and Future Directions
7.1 Conclusions
In this study I have used a variety of data mining classification methods to deal with
bankruptcy prediction. I have applied numerous data mining models, using the most
commonly used software, to predict bankruptcy more effectively and accurately.
In this dissertation, there were three major objectives to achieve, using five years of prior
financial ratios of 464 bankrupt and 464 non-bankrupt firms. The first was to develop different
data mining models to predict bankruptcy using three data mining software packages: SAS
Enterprise Miner, WEKA and IBM SPSS. The second was to apply these models and analyse the
accuracy of each model separately. The third was to identify the most accurate model provided
by each data mining software package individually. The first motivation of this study was to
understand the financial distress that leads to bankruptcy, the effects of bankruptcy, the cost
of bankruptcy and the factors involved in bankruptcy. The second motivation was to find the
data mining models most commonly used from 1932 to the present and apply those models to test
their accuracy. A great deal of research has been carried out in the field of bankruptcy
prediction because of the importance of the topic. Nevertheless, each research study has used
only a few machine learning or statistical methods to predict bankruptcy.
Developing an effective data mining classification model is a very significant but somewhat
difficult task for financial organisations. These prediction models test whether or not a new
individual or company will go bankrupt. If the classification accuracy of these prediction
models is poor, this can lead to wrong decisions and cause huge financial loss (Tsai et
al., 2014).
To achieve the goals of my study mentioned above, I developed six chapters, each a building
block towards my goal: Chapter 1 is the introduction, Chapter 2 is the literature review,
Chapter 3 defines bankruptcy and its costs, Chapter 4 gives a complete insight into the data
and test samples used, Chapter 5 gives a detailed description of the development and
application of each model using SAS EM, WEKA and SPSS, and Chapter 6 provides a critical
evaluation of these models.
After carrying out extensive research in the field of bankruptcy prediction, I have
understood the importance of an effective model for bankruptcy prediction. Furthermore,
bankruptcy is an important phenomenon for any company, big or small. Finally, I concluded that
most researchers used only one or two methods to predict bankruptcy. So, I chose to
apply a variety of data mining models using the software packages SAS Enterprise Miner, WEKA
and IBM SPSS.
Then, to give the reader a better understanding of bankruptcy, corporate financial distress,
the actual cause of bankruptcy, was defined and elaborated. Moreover, the different stages of
financial distress, the factors of financial distress, and the causes and results of corporate
distress were discussed. Later on, bankruptcy was defined and four types of costs associated
with bankruptcy were illustrated.
In the next step, data was gathered from the FAME (Financial Analysis Made Easy) database.
This data was cleansed and pre-processed by applying statistical techniques and tools.
Missing values were minimized using the SPSS missing value imputation technique. Outliers
were handled using the winsorization method. In addition, the data was divided into five
different data sets, one for each year prior to bankruptcy. Research in the field of
bankruptcy prediction shows that the selection of financial identifiers (ratios) is also a
very important factor in creating an effective model. If significant identifiers are not
selected, the results of the developed model will not be accurate. Keeping this in mind, I
have chosen 41 financial ratios most commonly used in various research studies from different
ratio groups: liquidity, leverage, solvability, profitability, efficiency and cash flow.
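Winsorization, the outlier treatment mentioned above, clips extreme values to chosen percentiles instead of deleting the affected firms. The sketch below assumes 10th and 90th percentile limits computed with the nearest-rank rule; the limits actually used in the study are not stated, and the sample values are invented.

```python
def percentile_nearest_rank(sorted_values, pct):
    """Nearest-rank percentile of a sorted list; pct is an integer in 1..100."""
    n = len(sorted_values)
    if n == 0:
        raise ValueError("empty data")
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling of pct * n / 100
    return sorted_values[rank - 1]


def winsorize(values, lower_pct=10, upper_pct=90):
    """Clip every value to the [lower_pct, upper_pct] percentile range."""
    ordered = sorted(values)
    low = percentile_nearest_rank(ordered, lower_pct)
    high = percentile_nearest_rank(ordered, upper_pct)
    return [min(max(v, low), high) for v in values]


# Hypothetical values of one financial ratio with two extreme observations.
ratios = [0.9, 1.1, 1.0, 1.3, 0.8, 1.2, 1.4, 0.7, 1.5, 250.0,
          1.0, 1.1, 0.9, 1.2, 1.3, 0.6, 1.0, 1.1, 1.2, -40.0]
clipped = winsorize(ratios)
print(min(clipped), max(clipped))
```

Unlike trimming, this keeps the sample size unchanged, which matters when the bankrupt and non-bankrupt groups are meant to stay matched.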
Chapter 5 consists of three parts. Part 1 elaborates the step-by-step procedure of model
development using SAS EM. I developed 11 models using the Decision Trees, HP Trees,
Neural Network, Auto Neural, HP Neural, DMNeural, Regression, HP SVM, HP Regression
and Memory Based Reasoning (MBR) nodes of SAS Enterprise Miner and implemented these
models on the five distinct yearly data samples. The best bankruptcy prediction models using
SAS EM are Neural Network, Auto Neural, Regression and HP Regression. Next, I illustrated the
step-by-step process of model generation using WEKA, and developed 21 distinct models using
the Naïve Bayes, BayesNet, SMO, RBFNetwork, KSTAR, LWL, AdaBoostM1,
ClassificationviaRegression, Decorate, Dagging, LogisticBoost, MultiBoostAB, Random
Committee, HyperPipes, NNge, OneR, ZeroR, Random Forest, J48, SimpleCart and END algorithms.
The best bankruptcy prediction models using WEKA are SimpleCart, RBFNetwork, OneR and
MultiBoostAB. Finally, I gave a step-by-step plan of model development using SPSS. I
developed six individual models using IBM SPSS: MLP Neural Network, CHAID, Exhaustive CHAID,
CART, QUEST and K-NN. The best classification accuracy in predicting bankruptcy is given by
the MLP Neural Network.
Finally, Chapter 6 critically evaluates the results provided by each software package and
model separately. It is concluded that the classification accuracy of the Neural Network
model is higher than that of all the other models. In the case of SAS EM, the NN model
provided results of 95.90%, 97.80%, 95.50%, 95.00% and 95.00%, and Auto Neural provided
classification accuracy of 93%, 99.5%, 99%, 97.6% and 99% in bankruptcy prediction using five
years of prior ratios of the firms. Moreover, using WEKA, the SimpleCart data mining (DM)
algorithm provided 89.60%, 70.00%, 89.80%, 86.40% and 85.70% classification accuracy for one
to five years respectively, while the RBFNetwork algorithm, which works with hidden layers,
provided 76.70%, 75.40%, 92.30%, 62.90% and 95.00% bankruptcy prediction accuracy on the five
years of financial ratios. Finally, the MLP Neural Network model of IBM SPSS provided
classification accuracy of 100.00%, 90.40%, 98.10%, 74.40% and 32.10% for one to five years
respectively.
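To compare the models quoted above on a common footing, the five yearly accuracies can be summarised by their mean and by the worst single year. The sketch below uses only the figures stated in this chapter; the summary statistics themselves are an illustration and were not part of the original analysis.

```python
from statistics import mean

# Bankruptcy prediction accuracy (%) one to five years ahead, as quoted above.
reported = {
    "SAS EM Neural Network": [95.90, 97.80, 95.50, 95.00, 95.00],
    "SAS EM Auto Neural": [93.0, 99.5, 99.0, 97.6, 99.0],
    "WEKA SimpleCart": [89.60, 70.00, 89.80, 86.40, 85.70],
    "WEKA RBFNetwork": [76.70, 75.40, 92.30, 62.90, 95.00],
    "SPSS MLP Neural Network": [100.00, 90.40, 98.10, 74.40, 32.10],
}

# For each model: (mean over the five horizons, worst single horizon).
summary = {name: (mean(values), min(values)) for name, values in reported.items()}

for name, (average, worst) in summary.items():
    print(f"{name:24s} mean {average:6.2f}%  worst year {worst:6.2f}%")
```

On these numbers the SPSS MLP network has the highest one-year accuracy but the lowest five-year figure (32.10%), so a ranking based on a single horizon can differ from one based on all five.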
In the history of bankruptcy prediction studies, neural network models have held a
significant place. Research on the application of NN models to financial distress prediction
problems began in the 1990s, and these models are still used in today's research. For two
decades, researchers have verified the superiority of NN models over numerous statistical
models such as MDA, logistic regression, and k-NN (Jeong et al., 2012). This dissertation also
supports the superiority of NN models over other data mining models.
Bankruptcy prediction five years ahead has been carried out in this study using numerous
data mining models, but financial statements, balance sheets, income statements and
statements of cash flows could also be used in the near future to predict bankruptcy.
Moreover, the models could also be used to predict the bankruptcy of individuals.
I have applied many data mining models in this study, but many other methods are also
available to predict bankruptcy. In future, research could also be conducted to predict
bankruptcy without financial ratios, by applying data mining directly to financial
statements.
Bibliography
H. Kurniawan, Peter Nwe, Kok Thai, P. Ravi Kumar, V. Ravi, 2008. Soft computing system for bank
performance prediction. Applied Soft Computing, 8(1), pp. 305-315.
SAS Institute Inc., 2003. Data Mining Using SAS® Enterprise Miner™: A Case Study Approach.
2nd ed. Cary, NC: SAS Institute Inc.
A. Garmroodi Asil, A. Shahsavand, 2014. Reliable estimation of optimal sulfinol concentration in gas
treatment unit via novel stabilized MLP and regularization network. Journal of Natural Gas Science
and Engineering, Volume 21, pp. 791-804.
A.I. Dimitras, S.H. Zanakis, C. Zopounidis, 1996. A survey of business failures with an emphasis on
prediction methods and industrial applications. European Journal of Operational Research, I(90), pp.
487-513.
Altman, E., R. Haldeman, P. Narayanan, 1977. Zeta analysis: A new model to identify bankruptcy risk
of corporations. Journal of Banking and Finance, 1(1), pp. 29-51.
Altman, E., B. Loris, 1976. A financial early warning system for over-the-counter broker-dealers.
Journal of Finance, 4(12), pp. 1201-1217.
Altman, E.I., Hotchkiss, E., 2005. Corporate Financial Distress and Bankruptcy: Predict and Avoid
Bankruptcy, Analyze and Invest in Distressed Debt. 3rd ed. New Jersey: John Wiley & Sons.
Altman, E.I., 1968. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy.
Journal of Finance, 4(1968), pp. 589-609.
Altman, E.I., 1984. A further empirical investigation of the bankruptcy cost question. The Journal of
Finance, XXXIX(4), pp. 1067-1089.
Andrea Bichler, Arnold Neumaier, Thilo Hofmann, 2014. A tree-based statistical classification
algorithm (CHAID) for identifying variables responsible for the occurrence of faecal indicator bacteria
during waterworks operations. Journal of Hydrology, 519 Part A(27), pp. 909-917.
Arindam Chaudhuri and Kajal De, 2011. Fuzzy Support Vector Machine for bankruptcy prediction.
Applied Soft Computing, 11(2), pp. 2472-2486.
Arjana Brezigar-Masten, Igor Masten, 2012. CART-based selection of bankruptcy predictors for the
logit model. Expert Systems with Applications, Volume 39, pp. 10153-10159.
B. Wong, T. Bodnovich and Y. Selvi, 1997. Neural network applications in business: A review and
analysis of the literature (1988-95). Decision Support Systems, Volume 19, pp. 301-320.
Bankruptcy prediction with rough sets. (2001) ERIM Report Series Research in Management
(ERS-2001-11-LIS).
Beaver, W., 1966. Financial ratios as predictors of failure. Journal of Accounting Research, 3(1966),
pp. 71-111.
Biggs, D., Ville, B., and Suen, E., 1991. A Method of Choosing Multiway Partitions for Classification
and Decision Trees. Journal of Applied Statistics, 18(1), pp. 49-62.
Blum, M., 1974. Failing company discriminant analysis. Journal of Accounting Research, 1(12), pp.
1-25.
Bose, I., 2006. Deciding the financial health of dot-coms using rough sets. Information &
Management, 43(7), pp. 835-846.
Branch, B., 2002. A cost of bankruptcy: A review. International Review of Financial Analysis, Volume
11, pp. 39-57.
Breiman, L., 2001. Random Forests. Machine Learning, Volume 45, pp. 5-32.
Bris, A., Welch, I., Zhu, N., 2006. The costs of bankruptcy: Chapter 7 liquidation versus Chapter 11
reorganization. Journal of Finance, Volume 61, pp. 1253-1303.
Bryant, S. M., 1997. A case-based reasoning approach to bankruptcy prediction modeling. Intelligent
Systems in Accounting, Finance and Management, Volume 6, pp. 195-214.
Büker, S., Asikoglu, R., Sevil, G., 1997. Finansal Yönetim. 2nd ed. Eskişehir: Anadolu Üniversitesi.
C. Kao, S.-T. Liu, 2004. Predicting bank performance with financial forecasts: A case of Taiwan
commercial banks. Journal of Banking & Finance, Volume 28, pp. 2353-2368.
Castagna, A. a. Z. M., 1981. The prediction of corporate failure: Testing the. Australian Journal of
Management, 1(6), pp. 23-50.
Chen, M.-Y., 2012. Visualization and dynamic evaluation model of corporate financial structure with
self-organizing map and support vector regression. Applied Soft Computing, 12(8), pp. 2274-2288.
Chen, Y.-S., 2012. Classifying credit ratings for Asian banks using integrating feature selection and
the CPDA-based rough sets approach. Knowledge-Based Systems, Volume 26, pp. 259-270.
Chih-Fong Tsai, Jhen-Wei Wu, 2008. Using neural network ensembles for bankruptcy prediction and
credit scoring. Expert Systems with Applications, Volume 34, pp. 2639-2649.
Chih-Fong Tsai, Yu-Feng Hsu, David C. Yen, 2014. A comparative study of classifier ensembles for
bankruptcy prediction. Applied Soft Computing, Volume 24, pp. 977-984.
Chih-Hung Wu, Gwo-Hshiung Tzeng, Yeong-Jia Goo, Wen-Chang Fang, 2007. A real-valued genetic
algorithm to optimize the parameters of support vector machine for predicting bankruptcy. Expert
Systems with Applications, 32(2), pp. 397-408.
Ching-Chiang Yeh, Der-Jang Chi and Ming-Fu Hsu, 2010. A hybrid approach of DEA, rough set and
support vector machines for business failure prediction. Expert Systems with Applications, 37(2),
pp. 1535-1541.
Chuang, C.-L., 2013. Application of hybrid case-based reasoning for enhanced performance in
bankruptcy prediction. Information Sciences, Volume 236, pp. 174-185.
Chudson, W., 1945. The Pattern of Corporate Financial Structure. New York: National Bureau of
Economic Research.
Chulwoo Jeong, Jae H. Min, Myung Suk Kim, 2012. A tuning method for the architecture of neural
network models incorporating GAM and GA as applied to bankruptcy prediction. Expert Systems
with Applications, 39(3), pp. 3650-3658.
Curram, S. P., & Mingers, J., 1994. Neural networks, decision trees induction and discriminant
analysis: An empirical comparison. Journal of the Operational Research Society, 4(45), pp. 440-450.
David J. Denis, Diane K. Denis, 1995. Causes of financial distress following leveraged
recapitalizations. Journal of Financial Economics, 37(2), pp. 129-157.
David L. Olson, Dursun Delen, Yanyan Meng, 2012. Comparative analysis of data mining methods for
bankruptcy prediction. Decision Support Systems, 52(2), pp. 464-473.
Deakin, E. E., 1972. A Discriminant Analysis of Predictors of Business Failure. Journal of Accounting
Research, 1(10), pp. 167-179.
Demir, H., 1997. İşletmelerde Başarısızlığın Nedenleri ve Çıkış Yolları, Dış Ticaret Dergisi, 6.
s.l.:s.n.
Dhiren Ghosh and Andrew Vogt, 2012. Outliers: An Evaluation of Methodologies. Section on Survey
Research Methods, pp. 3455-3460.
E. Frank, Y. Wang, S. Inglis, G. Holmes, I.H. Witten, 1998. Using model trees for classification.
Machine Learning, 32(1), pp. 63-76.
E. Turban, J.E. Aronson, 2001. Decision Support Systems and Intelligent Systems. 6th ed. Upper
Saddle River, NJ: Prentice Hall.
E.I. Altman, E. Hotchkiss, 2005. Corporate Financial Distress and Bankruptcy: Predict and Avoid
Bankruptcy. 3rd ed. New Jersey: John Wiley & Sons.
Edmister, R., 1972. An empirical test of financial ratio analysis for small business failure prediction.
Journal of Financial and Quantitative Analysis, 2(7), pp. 1477-1493.
Eibe Frank, Mark Hall, Bernhard Pfahringer, 2003. Locally Weighted Naive Bayes. In: 19th Conference
in Uncertainty in Artificial Intelligence, pp. 249-256. New York, s.n.
Eisenbeis, R., 1977. Pitfalls in the application of discriminant analysis in business and economics. The
Journal of Finance, Issue 32, pp. 875-900.
Elam, R., 1975. The effect of lease data on the predictive ability of financial ratios. The Accounting
Review, pp. 25-43.
Erkki K. Laitinen, Teija Laitinen, 2000. Bankruptcy prediction: Application of the Taylor's expansion in
logistic regression. International Review of Financial Analysis, 9(4), pp. 327-349.
F.E.H. Tay, L. Cao, 2001. Modified support vector machines in financial time series forecasting.
Omega, 29(4), pp. 309-317.
F.E.H. Tay, L. Cao, 2002. Modified support vector machines in financial time series forecasting.
Neurocomputing, 48(1-4), pp. 847-861.
Fang-Mei Tseng, Yi-Chung Hu, 2010. Comparing four bankruptcy prediction models: Logit,
quadratic interval logit, neural and fuzzy neural networks. Expert Systems with Applications, Volume
37, pp. 1846-1853.
Fang-Mei Tseng, L. Lin, 2005. A quadratic interval logit model for forecasting bankruptcy. Omega,
The International Journal of Management Science, Volume 33, pp. 85-91.
Fitzpatrick, P., 1932. A comparison of ratios of successful industrial enterprises with those of failed
companies. s.l.:s.n.
Foreman, R. D., 2003. A logistic analysis of bankruptcy within the US local. Journal of Economics and
Business, Volume 55, pp. 135-166.
Francis E.H. Tay and Lixiang Shen, 2002. Economic and financial prediction using rough sets model.
European Journal of Operational Research, 141(3), pp. 641-659.
Frank, J. & T. W., 1994. A comparison of Financial restructuring is distress exchanges and chapter
11 reorganization. Journal of Financial Economics, Volume 27, pp. 315-353.
Gaughan, P., 2011. Merger, Acquisitions and Corporate Restructuring. 3rd ed. New York: John Wiley.
Geoffrey I. Webb, 2000. MultiBoosting: A Technique for Combining Boosting and Wagging. Machine
Learning, 40(2), pp. 1-50.
Gleb Lanine, Rudi Vander Vennet, 2006. Failure prediction in the Russian bank sector with logit and
trait recognition models. Expert Systems with Applications, Volume 30, pp. 463-478.
Gordini, N., 2014. A genetic algorithm approach for SMEs bankruptcy prediction: Empirical evidence
from Italy. Expert Systems with Applications, 41(14), pp. 6433-6445.
Grablowsky, B.J. and Talley, W.K., 1981. Probit and discriminant factors for classifying credit
applicants: A comparison. Journal of Economics and Business, Volume 33, pp. 254-261.
Grammatikos, T., and Gloubos, G., 1984. Predicting bankruptcy of industrial firms in
Greece. Spoudai, The University of Piraeus Journal of Economics and Business Statistics and
Operations Research, pp. 3-4, 421-443.
Guoqiang Zhang, Michael Y. Hu, B. Eddy Patuwo, Daniel C. Indro, 1999. Artificial neural networks in
bankruptcy prediction: General framework and cross-validation analysis. European Journal of
Operational Research, 116(1), pp. 16-32.
H. Frydman, E.I. Altman, D. Kao, 1985. Introducing recursive partitioning for financial classification:
The case of financial distress. Journal of Finance, 1(40), pp. 269-291.
H., I., 1984. Corporate distress in Australia. Journal of Banking and Finance, Issue 8, pp. 303-320.
H.Tisshaw, R. T. a., 1977. Going, Going, Gone - Four Factors Which predict. Accountancy, p. 50.
Han, C.-S. P. a. I., 2002. A case-based reasoning with the feature weights derived by analytic
hierarchy process for bankruptcy prediction. Expert Systems with Applications, 23(3), pp. 255-264.
Hansen, M., Madow, W., and Tepping, B., 1983. An Evaluation of Model-Dependent and Probability
Sampling Inferences in Sample Surveys. J. Amer. Stat. Assoc., Volume 78, pp. 776-793.
Hanweck, G., 1977. Predicting bank failures. Research Papers in Banking and Financial Economics,
Financial Studies Section, Board of Governors of the Federal Reserve System. Washington D.C.: s.n.
Hashi, I., 1997. The Economics of Bankruptcy, Reorganization and Liquidation: Lessons for East
European Transition Economics. Russian and East European Finance and Trade, 33(4), pp. 6-34.
Hernan Pedro Vigier and Antonio Terceno, 2008. A model for the prediction of disease of firms by
means of fuzzy relations. Fuzzy Sets and Systems, 159(17), pp. 2299-2316.
Holte, R., 1993. Very simple classification rules perform well on most commonly used datasets.
Machine Learning, Volume 11, pp. 63-91.
Hsueh-Ju Chen, Shaio Yan Huang and Chin-Shien Kin, 2009. Alternative diagnosis of corporate
bankruptcy: A neuro fuzzy approach. Expert Systems with Applications, 36(4), pp. 7710-7720.
Hui Li, Young-Chan Lee, Yan-Chun Zhou, Jie Sun, 2011. The random subspace binary logit (RSBL)
model for bankruptcy prediction. Knowledge-Based Systems, Volume 24, pp. 1380-1388.
Hui-Ling Chen, Bo Yang, Gang Wang, Jie Liu, Xin Xu, Su-Jing Wang, Da-You Liu, 2011. A novel
bankruptcy prediction model based on an adaptive fuzzy k-nearest neighbor method.
Knowledge-Based Systems, 24(8), pp. 1348-1359.
Hyunchul Ahn and Kyoung-jae Kim, 2009. Bankruptcy prediction modeling with hybrid case-based
reasoning and genetic algorithms approach. Applied Soft Computing, 9(2), pp. 599-607.
I.M. Premachandra, Gurmeet Singh Bhabra, Toshiyuki Sueyoshi, 2009. DEA as a tool for bankruptcy
assessment: A comparative study with logistic regression technique. European Journal of
Operational Research, Volume 193, pp. 412-424.
Ibe, O. C., 2014. Introduction to Descriptive Statistics. 2nd ed. Elsevier Inc.: Academic Press.
Ivica Pervan, Maja Pervan, Bruno Vukoja, 2011. PREDICTION OF COMPANY BANKRUPTCY USING
STATISTICAL. Croatian Operational Research Review, Volume 2, pp. 158-167.
J. C. Neves, A. Vieira, 2006. Improving Bankruptcy Prediction with Hidden Layer Learning Vector
Quantization. European Accounting Review, 15(2), pp. 253-271.
J. Levy, E. Mallach, P. Duchessi, 1991. A fuzzy logic evaluation system for commercial loan analysis.
Omega, International Journal of Management Science, 19(6), pp. 651-669.
J. Peltonen, S. Kaski, J. Sinkkonen, 2001. Bankruptcy analysis with self-organizing maps in learning
metrics. IEEE Transactions on Neural Networks, 12(4).
J.P. Ignizio, J.R. Soltyas, 1996. Simultaneous design and training of ontogenic neural network
classifier. Computers Operations Research, 23(6), pp. 535-546.
Jackendoff, N., 1962. A Study of Published Industry Financial and Operating Ratios. Philadelphia:
Temple University, Bureau of Economic and Business Research.
Jae H. Min, Young-Chan Lee, 2005. Bankruptcy prediction using support vector machine with optimal
choice of kernel function parameters. Expert Systems with Applications, 28(4), pp. 603-614.
Jardin, P. d., 2014. Bankruptcy prediction using terminal failure processes. European Journal of
Operational Research.
Jie Sun and Hui Li, 2009. Financial distress early warning based on group decision making. Computers
and Operations Research, Volume 36, pp. 885-906.
Jodi Bellovary, Don Giacomino, Michael Akers, 2007. A Review of Bankruptcy Prediction Studies:
1930 to Present. Journal of Financial Education, Volume 33, pp. 1-42.
Johan Huysmans, Bart Baesens, Jan Vanthienen, Tony van Gestel, 2006. Failure prediction with self
organizing maps. Expert Systems with Applications, 30(3), pp. 479-487.
John G. Cleary, Leonard E. Trigg, 1995. K*: An Instance-based Learner Using an Entropic Distance
Measure. In: 12th International Conference on Machine Learning, pp. 108-114. s.l., s.n.
Junyoung Heo and Jin Yong Yang, 2014. AdaBoost based bankruptcy forecasting of Korean
construction companies. Applied Soft Computing, Volume 24, pp. 494-499.
K.C. Lee, I. Han, Y. Kwon, 1996. Hybrid neural network models for bankruptcy predictions. Decision
Support Systems, Volume 18, pp. 63-72.
K.F. Lam, J.W. Moy, 2002. Combining discriminant methods in solving classification problems in
two-group discriminant analysis. European Journal of Operational Research, Volume 138, pp. 294-301.
K. Kim, 2004. Financial time series forecasting using support vector machines. Neurocomputing,
Volume 55, pp. 307-319.
K.S. Shin, T.S. Lee, H.J. Kim, 2005. An application of support vector machines in bankruptcy prediction
model. Expert Systems with Applications, Volume 28, pp. 127-135.
Kalay, A., Singhal, R., Tashjian, E., 2007. Is Chapter 11 costly? Journal of Financial Economics, Volume
84, pp. 772-796.
Kaplan, S., 1994. Campeau's acquisition of Federated: post-bankruptcy results. Journal of Financial
Economics, Volume 35, pp. 123-136.
Karels, G. V. and Prakash, A. P., 1987. Multivariate Normality and Forecasting of Business
Bankruptcy. Journal of Business Finance and Accounting, 14(4), pp. 573-593.
Kass, G., 1980. An Exploratory Technique for Investigating Large Quantities of Categorical Data.
Applied Statistics, 29(2), pp. 119-127.
Kathleen McMillan and Jonathan Dundee, 2011. How to Write Dissertation and Project Reports. 2nd
ed. Dundee: Pearson.
Keasey, K. and R. Watson, 1986. The prediction of small company failure: Some behavioral evidence
for the UK. Accounting and Business Research, Issue 17, pp. 49-57.
Keskin, Y., 2002. İşletmelerde Finansal Başarısızlığın Tahmini, Çok Boyutlu Model Önerisi ve
Uygulaması, Doktora Tezi, Hacettepe Üniversitesi. s.l.:s.n.
Ketz, J. E., 1978. The effect of general price-level adjustments on the predictive ability of. Journal of
Accounting Research, Supplement(16), pp. 273-284.
Kiviluoto, K., 1998. Predicting bankruptcies with the self-organizing map. Neurocomputing, 21(1-3),
pp. 191-201.
Kolodner, J., 1991. Improving human decision making through case-based decision aiding. AI
Magazine, 12(2), pp. 52-68.
Korol, T., 2014. A fuzzy logic model for forecasting exchange rates. Knowledge-Based Systems,
Volume 67, pp. 49-60.
Kyung-Shik Shin, Taik Soo Lee, Hyun-jung Kim, 2005. An application of support vector machines in
bankruptcy prediction model. Expert Systems with Applications, 28(1), pp. 127-135.
Laitinen, E., 1991. Financial ratios and different failure processes. Journal of Business Finance &
Accounting, 5(18), pp. 649-673.
Lennox, C., 1999. The accuracy and incremental information content of audit reports in predicting
bankruptcy. Journal of Business Finance & Accounting, 26(5/6), pp. 757-778.
Liang, B. J. a. T., 1995. Fuzzy indexing and retrieval in case-based system. Expert Systems with
Applications, 8(1), pp. 135-142.
Lili Sun, Prakash P. Shenoy, 2007. Using Bayesian networks for bankruptcy prediction: Some
methodological issues. European Journal of Operational Research, 180(2), pp. 738-753.
Lin, F.Y. and McClean, S., 2000. The prediction of Financial Distress Using Structured Financial Data
From the Internet. IJCSS International Journal of Computers Science and Signal, 1(1), pp. 43-57.
Loh, W.-Y. and Shih, Y.-S., 1997. Split Selection Method for Classification Trees. Statistica Sinica,
Volume 7, pp. 815-840.
Loh, W.-Y., 2011. Classification and Regression Trees. 1st ed. NY: Wiley & Sons Inc.
Lugovskaja, L., 2009. Predicting default of Russian SMEs on the basis of financial and non-financial
variables. Journal of Financial Services Marketing, 14(4), pp. 301-313.
M. Adnan Aziz, Humayon A. Dar, 2006. Predicting corporate bankruptcy: where we stand? The
International Journal of Business in Society, 6(1), pp. 18-33.
M. Odom and R. Sharda, 1990. A neural network model for bankruptcy prediction. In: Proc. Int. Joint
Conf. Neural Networks. San Diego, CA, s.n.
Makridakis, S., 2001. Insider Trading Behavior Prior to Chapter 11 Bankruptcy Announcements.
Journal of Business Research, 54(1), pp. 63-70.
81
Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian H. Witten, 2009.
The WEKA Data Mining Software: An Update. SIGKDD Explorations, 11(1), pp. 1-50.
Martin, D., 1977. Early warning of bank failures: A logit regression approach. Journal of Banking and
Finance, Volume 1, pp. 249-276.
McKee, T., 2000. Developing a bankruptcy prediction model via rough sets theory. International
Journal of Intelligent Systems in Accounting, Finance and Management, Volume 9, pp. 159-173.
Mensah, Y. M., 1983. The Differential Bankruptcy Predictive Ability of Specific Price Level
Adjustments: Some Empirical Evidence. The Accounting Review, LVIII(2), pp. 228-246.
Merwin, G., 1942. Financing Small Corporations in Five Manufacturing Industries, 1926-1936. New
York: National Bureau of Economic Research.
Meyer, P. and H. Pifer, 1970. Prediction of bank failures. Journal of Finance, 25(4), pp. 853-868.
Morris, E. H. a. R., 1983. The significance of Base year in developing Failure prediction models.
Journal of Business Finance and Accounting, pp. 209-223.
Myong-Jong Kim and Dae-Ki Kang, 2010. Ensemble with neural networks for bankruptcy
prediction. Expert Systems with Applications, Volume 37, pp. 3373-3379.
Myoung-Jong Kim, Dae-Ki Kang, 2009. Ensemble with neural networks for bankruptcy prediction.
Expert Systems with Applications, 37(4), pp. 3373-3379.
Ning Chen, Bernardete Ribeiro, Armando Vieira, An Chen, 2013. Clustering and visualization of
bankruptcy trajectory using self-organizing map. Expert Systems with Applications, 40(1),
pp. 385-393.
Ohlson, J. A., 1980. Financial Ratios and the Probabilistic Prediction of Bankruptcy. Journal of
Accounting Research, 18(1), pp. 109-131.
O'Leary, D., 1992. On bankruptcy information systems. European Journal of Operational Research,
56(1), pp. 67-69.
Opler, T. C. and Titman, S., 1994. Financial Distress and Corporate Performance. The Journal of
Finance, 49(3), pp. 1015-1040.
P. Melville, R. J. Mooney, 2003. Constructing Diverse Classifier Ensembles Using Artificial Training
Examples. In: Eighteenth International Joint Conference on Artificial Intelligence, 505-510. New York,
s.n.
P. Ravi Kumar , V. Ravi, 2007. Bankruptcy Prediction in banks and firms via statistical and intelligent
techniques - A review. European Journal of Operational Research, Volume I, pp. 1-28.
Paliwal, M. and Kumar, U., 2009. Neural networks and statistical techniques: A review of
applications. Expert Systems with Applications, 36(1), pp. 2-17.
Pawlak, Z., 1982. Rough Sets. International journal of Computer and Information Science, Volume 11,
pp. 341-356.
Perold, F., 1999. Long term Capital Management Case Study Harvard Business School. s.l.:s.n.
Pindado, J. and Rodrigues, L.F., 2004. Parsimonious Models of Financial Insolvency in Small
Companies. Small Business Economics, pp. 51-66.
Pirooz Shamsinejad, Mohammad Saraee and Farid Sheikholeslam, 2010. A New Path Planner for
Autonomous Mobile Robots Based on Genetic Algorithm. The 3rd IEEE International Conference on
Computer Science and Information Technology (ICCSIT 2010). Chengdu, China, IEEE, pp. 115-120.
Pompe, P. and Feelders, A., 1997. Using Machine Learning, Neural Networks and Statistics to Predict
Corporate Bankruptcy. s.l., s.n., pp. 267-276.
Pulvino, T., 1999. Effects of bankruptcy court protection on asset sales. Journal of Financial
Economics, Volume 52, pp. 151-186.
R. Slowinski, S. Greco, B. Matarazzo, 2001. Rough sets theory for multicriteria decision analysis.
European Journal of Operational Research, 129(1), pp. 1-47.
R. Susmaga, C. Zopounidis, A.I. Dimitras, R. Slowinski, 1999. Business failure prediction using rough
sets. European Journal of Operational Research, Volume 114, pp. 263-280.
R. Slowinski and J. Stefanowski, 1994. RoughDas: Rough set based data analysis system, Version 2.0,
User's Guide Book. Poznan, Poland. s.l.:s.n.
Rajeev Singhal, Yun (Ellen) Zhu, 2013. Bankruptcy risk, costs and corporate diversification. Journal of
Banking & Finance, Volume 37, pp. 1475-1489.
Rubin, D. B., 2002. Statistical Analysis With Missing Data. 2nd ed. New York: Wiley.
S. Greco, B. Matarazzo, R. Slowinski, 1998. A new rough set approach to evaluation of bankruptcy
risk. C. Zopounidis (Ed.), Operational Tools in the Management of Financial Risks, Kluwer Academic
Publishers, Dordrecht, pp. 121-136.
S. Greco, B. Matarazzo, R. Slowinski, 1998. A new rough set approach to multicriteria and
multiattribute classification. Rough Sets and Current Trends in Computing, pp. 60-67.
S. Jones, D.A. Hensher, 2004. Predicting firm financial distress: A mixed logit model. Accounting
Review, 79(4), pp. 1011-1038.
S. Balcaen, H. Ooghe, 2006. 35 years of studies on business failure: an overview of the classic
statistical methodologies and their related problems. The British Accounting Review, Issue 38,
pp. 63-93.
Sangjae Lee and Wu Sung Choi, 2013. A multi-industry bankruptcy prediction model using
back-propagation neural network and multivariate discriminant analysis. Expert Systems with
Applications, 40(8), pp. 2941-2946.
Sankaran Mahadevan, Ramesh Rebba, 2005. Validation of reliability computational models using
Bayes networks. Reliability Engineering & System Safety, 87(2), pp. 223-232.
SAS Institute Inc., 2012. Applied Analytics Using SAS® Enterprise Miner. Cary, NC: SAS Institute Inc.
SAS Institute Inc., 2013. Getting Started with SAS® Enterprise Miner. Cary, NC: SAS Institute Inc.
SAS Institute Inc., 2013. SAS Enterprise Miner 13.2 Reference Help. 1st ed. Cary, NC: SAS Institute Inc.
Schwenker, Friedhelm; Kestler, Hans A.; Palm, Günther, 2001. Three Learning Phases For Radial Basis
Function Network. Neural Networks, Volume 14, pp. 439-458.
Shapiro, A. F., 2002. The merging of neural networks, fuzzy logic, and genetic algorithms. Insurance:
Mathematics and Economics, 31(1), pp. 115-131.
Sinkey, J., 1975. A multivariate statistical analysis of the characteristics of problem banks. Journal of
Finance, 30(1), pp. 21-36.
Skogsvik, K., 1990. Current cost accounting ratios as predictors of business failure: The Swedish
case. Journal of Business Finance and Accounting, 17(1), pp. 137-160.
Sunday Olusanya Olatunji, Ali Selamat, Abdul Azeez, Abdul Raheem, 2011. Predicting correlations
properties of crude oil systems using type-2 fuzzy logic systems. Expert Systems with Applications,
38(9), pp. 10911-10922.
Sungbin Cho, Hyojung Hong and Byoung-Chun Ha, 2010. A hybrid approach based on the
combination of variable selection using decision trees and case-based reasoning using the
Mahalanobis distance: For bankruptcy prediction. Expert Systems with Applications, 37(4),
pp. 3482-3488.
Sung-Hwan Min, Jumin Lee and Ingoo Han, 2006. Hybrid genetic algorithms and support vector
machines for bankruptcy prediction. Expert Systems with Applications, 31(3), pp. 652-660.
T.-P. Liang, B. Jeng, Y.-M. Jeng, 1997. FILM: A fuzzy learning method for automated knowledge
acquisition. Decision Support Systems, Volume 21, pp. 61-73.
Takahashi, K., Y. Kurokawa and K. Watase, 1984. Corporate bankruptcy prediction in Japan. Journal
of Banking and Finance, 8(2), pp. 229-247.
Tezcan, N., 2002. Firmalarda Mali Başarısızlığın Tahmini. Yüksek Lisans Tezi. s.l.: Yıldız Teknik
Üniversitesi, Sosyal Bilimler Enstitüsü.
Theodossiou, P., 1991. Alternative models for assessing the financial condition of business in
Greece. Journal of Business Finance & Accounting, 18(5), pp. 697-720.
Thomas E. McKee and Terje Lensberg, 2002. Genetic programming and rough sets: A hybrid
approach to bankruptcy classification. European Journal of Operational Research, 138(2),
pp. 436-451.
Thorburn, K. S., 2000. Bankruptcy auctions: costs, debt recovery and firm survival. Journal of
Financial Economics, Volume 58, pp. 337-368.
Toshiyuki Sueyoshi, Mika Goto, 2009. Methodological comparison between DEA (data envelopment
analysis) and DEA–DA (discriminant analysis) from the perspective of bankruptcy assessment.
European Journal of Operational Research, 199(2), pp. 561-575.
Tseng-Chung Tang and Li-Chiu Chi, 2005. Predicting multilateral trade credit risks: comparisons of
Logit and Fuzzy Logic models using ROC curve analysis. Expert Systems with Applications, 28(3),
pp. 547-556.
V. Popova and J.C. Bioch, 2001. Bankruptcy prediction with rough sets, ERIM Report Series Research
in Management (ERS-2001-11-LIS). s.l.:s.n.
Vapnik, V., 1998. in: S. Haykin (Ed.) Statistical Learning Theory. Adaptive and Learning systems,
Volume 736.
Varun, B., 2009. Prediction of Business Failure: a Comparison of Discriminant and Logistic Regression
Analyses. Istanbul University Journal of the School of Business Administration, 38(1), pp. 21-36.
Vranas, A., 1992. The significance of financial characteristics in predicting business failure: An
analysis in the Greek context. Foundations of Computing and Decision Sciences, 17(4), pp. 257-275.
W.J. Banks, L.A. Prakash, 1994. On the performance of linear programming heuristics applied on a
quadratic transformation in the classification problem. European Journal of Operational Research,
74(23), pp. 23-28.
West, R., 1985. A factor analytic approach to bank condition. Journal of Banking and Finance,
Volume 9, pp. 253-266.
Wheelen, T. L. and Hunger, J. D., 2000. Strategic Management: Business Policy. 7th ed. New Jersey:
Prentice Hall.
Whitaker, R. B., 1999. The Early Stages of Financial Distress. Journal of Economics and Finance,
23(2), pp. 123-133.
Wruck, K. H., 1990. Financial distress, reorganization, and organizational efficiency. Journal of
Financial Economics, Volume 27, pp. 419-444.
Yoav Freund, Robert E. Schapire, 1996. Experiments with a new boosting algorithm. In: Thirteenth
International Conference on Machine Learning, 148-156. San Francisco, s.n.
Z. Pawlak, J. Grzymala-Busse, R. Slowinski, W. Ziarko, 1995. Rough sets. Communications of the ACM
Association for Computing Machinery, 38(11), pp. 89-97.
Z. Pawlak, 1984. Rough classification. International Journal of Man–Machine Studies, Volume 20,
pp. 469-483.
Zhi Xiao, Xianglei Yang, Ying Pang, Xin Dang, 2012. The prediction for listed companies’ financial
distress by using multiple prediction methods with rough set and Dempster–Shafer evidence theory.
Knowledge-Based Systems, Volume 26, pp. 196-206.
Zhong Gao, Meng Cui and Lai-Man Po, 2008. Enterprise Bankruptcy Prediction Using Noisy-Tolerant
Support Vector Machine. Leicestershire, International Seminar on Future Information Technology and
Management Engineering.
Zijiang Yang, Wenjie You, Guoli Ji, 2011. Using partial least squares and support vector machines for
bankruptcy prediction. Expert Systems with Applications, 38(7), pp. 8336-8342.
Zmijewski, M., 1984. Methodological issues related to the estimation of financial distress prediction
models. Journal of Accounting Research, Volume 22, pp. 59-82.
Appendix-A:
Table 4.2 5th and 95th percentiles for the data one year before bankruptcy
Table 4.3 5th and 95th percentiles for the data two years before bankruptcy
X5 T2 -.989231 .193456
X6 T2 .020722 2.735512
X7 T2 -.955825 .201191
X8 T2 .037395 1.703400
X9 T2 .037395 .953118
Table 4.4 5th and 95th percentiles for the data three years before bankruptcy
X5 T3 -.920121 .188337
X6 T3 .026572 2.737783
X7 T3 -.774949 .223634
X8 T3 .037395 1.541811
X9 T3 .037395 .941941
Table 4.5 5th and 95th percentiles for the data four years before bankruptcy
X5 T4 -.955825 .201191
X6 T4 .037395 2.700640
X7 T4 .028180 2.981235
X8 T4 .037395 1.367371
X9 T4 .037395 .936470
Table 4.6 5th and 95th percentiles for the data five years before bankruptcy
X5 T5 -.774949 .223634
X6 T5 .037198 2.824415
X7 T5 .020722 2.735512
X8 T5 .037395 1.316398
X9 T5 .037395 .939327
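The percentile bounds listed in Tables 4.2 to 4.6 can be reproduced in principle with a few lines of Python. The sketch below is illustrative only: the data are made up, the function names are hypothetical, and the interpolation rule (Python's "inclusive" method) may differ from the one used by the software that produced the tables. The clipping step shows one common use of such bounds; whether the dissertation applied them this way is an assumption.

```python
from statistics import quantiles


def percentile_bounds(values):
    """Return the (5th, 95th) percentiles of a list of numbers."""
    # n=20 yields 19 cut points; the first is the 5th percentile
    # and the last is the 95th percentile.
    cuts = quantiles(values, n=20, method="inclusive")
    return cuts[0], cuts[-1]


def clip_to_bounds(values, low, high):
    """Assumed use of the bounds: pull extreme values back to the limits."""
    return [min(max(v, low), high) for v in values]


# Made-up ratio values, not taken from the dissertation's data set.
sample_ratio = [float(v) for v in range(0, 101)]
low, high = percentile_bounds(sample_ratio)
clipped = clip_to_bounds(sample_ratio, low, high)
print(low, high)
```

Clipping rather than dropping observations keeps the sample size unchanged, which matters when bankrupt and non-bankrupt firms are matched one to one.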
Table 4.7 Univariate Statistics for data sample one year before bankruptcy
x39T1 928 .164003 .3871066 0 .0 8 35
X40T1 928 26.543483 124.7393648 0 .0 0 127
X41T1 928 .138622 .3487470 0 .0 11 31
Table 4.8 Univariate Statistics for data sample two years before bankruptcy
X33T2 928 1.180883 5.8503652 0 .0 1 29
X34T2 928 20.466627 561.3885151 0 .0 93 95
Table 4.9 Univariate Statistics for data sample three years before bankruptcy
X27T3 928 .889831 8.3839439 0 .0 0 36
X28T3 928 -8.455995 107.5723951 0 .0 133 70
X29T3 928 2.094125 18.7134149 0 .0 1 124
X30T3 928 1.004896 32.5207291 0 .0 120 56
X31T3 928 .281926 1.6874224 0 .0 12 52
X32T3 928 2.094638 20.2466047 0 .0 0 71
X33T3 928 .978278 1.3907070 0 .0 1 36
X34T3 928 -1.258732 30.1861351 0 .0 96 92
X36T3 928 10.128026 265.9225798 0 .0 59 103
X37T3 927 5.807971 58.2262456 1 .1 0 108
X38T3 927 31.327416 229.7271188 1 .1 0 127
x39T3 928 .151201 .1844662 0 .0 4 22
X40T3 928 35.842833 269.1824970 0 .0 0 124
X41T3 928 3.124092 83.5912666 0 .0 3 28
X25T4 928 .218274 .4325690 0 .0 0 33
X26T4 928 .668108 1.2811048 0 .0 0 55
X27T4 928 .392867 .7428170 0 .0 0 24
X28T4 928 -6.885582 100.9061376 0 .0 144 73
X29T4 928 1.818821 6.8452573 0 .0 0 129
X30T4 928 1.139221 31.2651135 0 .0 114 60
X31T4 928 .452752 8.6250310 0 .0 12 56
X32T4 928 1.467498 7.0327693 0 .0 0 69
X33T4 928 1.036377 1.3866071 0 .0 1 31
X34T4 928 -.916501 16.8631678 0 .0 103 104
X35T4 928 26884.588919 373743.1971263 0 .0 52 136
X36T4 928 4.360924 62.1136661 0 .0 65 96
X37T4 927 10.165556 152.0521921 1 .1 0 124
X38T4 927 35.577803 267.1218918 1 .1 0 121
x39T4 928 .142655 .3950704 0 .0 7 28
X40T4 928 39.247402 219.7461392 0 .0 0 124
X41T4 928 2.616322 64.7991745 0 .0 4 33
X22T5 928 -.116831 .8905469 0 .0 116 21
X23T5 928 -4.435434 55.1402660 0 .0 158 58
X24T5 928 -4.851490 58.2163124 0 .0 159 43
X25T5 928 .204382 .4084191 0 .0 0 32
X26T5 928 .647224 1.1521097 0 .0 0 52
X27T5 928 .572762 6.1248955 0 .0 0 22
X28T5 928 -8.703963 227.1230790 0 .0 140 90
X29T5 928 1.656466 6.3122065 0 .0 1 138
X30T5 928 .734257 19.9489760 0 .0 120 52
X31T5 928 -.041617 11.6041550 0 .0 13 56
X32T5 928 1.307071 5.8451373 0 .0 0 64
X33T5 928 1.025311 1.0571790 0 .0 1 35
X34T5 928 2.744535 70.1411300 0 .0 98 98
X35T5 928 32491.360418 426169.2246856 0 .0 50 126
X36T5 928 -4.221002 418.1366420 0 .0 60 88
X37T5 927 6.127394 77.6160452 1 .1 1 117
X38T5 927 39.323022 219.8241031 1 .1 0 123
x39T5 928 .161733 .2420099 0 .0 5 17
X40T5 928 46.842724 408.8061708 0 .0 0 119
X41T5 928 .156068 .2768338 0 .0 7 16
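The header row of the univariate statistics tables did not survive, but the columns appear to be N, mean, standard deviation, missing count, missing percent, and the number of low and high extremes. Under that assumption, the following minimal sketch shows how such a row could be computed. The 1.5 x IQR fence used to count extremes is also an assumption, and the sample values are invented.

```python
from statistics import mean, quantiles, stdev


def univariate_summary(values):
    """Summarise one variable; None marks a missing observation."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    # Quartiles of the observed values (inclusive interpolation).
    q1, _, q3 = quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "n": len(present),
        "mean": mean(present),
        "std_dev": stdev(present),
        "missing_count": missing,
        "missing_percent": 100.0 * missing / len(values),
        "extremes_low": sum(v < low_fence for v in present),
        "extremes_high": sum(v > high_fence for v in present),
    }


# Invented values: ten observations plus one missing entry.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100, None]
summary = univariate_summary(sample)
print(summary)
```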
Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       300                  164                      64.5%
Non-Bankrupt   123                  341                      73.5%
Overall Accuracy %: 69.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       150                  314                      32.0%
Non-Bankrupt   24                   440                      94.8%
Overall Accuracy %: 63.0%
Table 5.2 Prediction accuracy of the model starting from year one to five using HP Trees Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       364                  100                      78.44%
Non-Bankrupt   230                  234                      50.4%
Overall Accuracy %: 61%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       390                  74                       84.0%
Non-Bankrupt   220                  244                      52.9%
Overall Accuracy %: 68.3%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       333                  131                      71.7%
Non-Bankrupt   325                  139                      70.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       234                  230                      51.0%
Non-Bankrupt   225                  239                      73.0%
Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       444                  20                       95.5%
Non-Bankrupt   34                   430                      92.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       440                  24                       95.0%
Non-Bankrupt   30                   434                      92.4%
Table 5.4 Prediction accuracy of the model starting from year one to five using Auto Neural Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       440                  24                       94.0%
Non-Bankrupt   26                   439                      93.0%
Overall Accuracy %: 93.5%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       463                  1                        99.5%
Non-Bankrupt   1                    463                      99.5%
Overall Accuracy %: 99.5%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       0                    464                      0
Non-Bankrupt   0                    464                      100.00

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       454                  10                       97.8%
Non-Bankrupt   9                    455                      97.6%
Table 5.5 Prediction accuracy of the model starting from year one to five using HP Neural Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       420                  44                       90.0%
Non-Bankrupt   404                  60                       12.0%
Overall Accuracy %: 51.0%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       440                  24                       95.0%
Non-Bankrupt   464                  0                        0.0%
Overall Accuracy %: 47.25.0%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       440                  24                       95.0%
Non-Bankrupt   225                  239                      73.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       420                  44                       90.0%
Non-Bankrupt   52                   412                      88.9%
Table 5.6 Prediction accuracy of the model starting from year one to five using Neural Network Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       275                  189                      59.30%
Non-Bankrupt   169                  295                      63.57%
Overall Accuracy %: 46.64%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       435                  29                       93.7%
Non-Bankrupt   392                  72                       16.6%
Overall Accuracy %: 55.15%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       431                  33                       92.27%
Non-Bankrupt   405                  59                       13.0%
Overall Accuracy %: 52.4%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       335                  129                      74.2%
Non-Bankrupt   240                  224                      48.0%
Overall Accuracy %: 61.1%
Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       189                  275                      40.7%
Non-Bankrupt   151                  313                      67.7%
Overall Accuracy %: 54.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       180                  284                      38.7%
Non-Bankrupt   163                  325                      70.0%
Overall Accuracy %: 54.2%
Table 5.9 Prediction accuracy of the model starting from year one to five using Neural Network Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       454                  10                       98.0%
Non-Bankrupt   0                    464                      100.0%
Overall Accuracy %: 99.0%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       431                  33                       92.27%
Non-Bankrupt   464                  0                        0.0%
Overall Accuracy %: 50.0%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       444                  20                       95.6%
Non-Bankrupt   464                  0                        0.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       444                  20                       95.6%
Non-Bankrupt   450                  14                       3.0%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       228                  236                      49.1%
Non-Bankrupt   139                  325                      70.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       221                  243                      47.2%
Non-Bankrupt   114                  350                      75.4%
Table 5.11 Bankruptcy prediction accuracy using Naïve Bayes Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       427                  37                       92.0%
Non-Bankrupt   94                   370                      79.7%
Overall Accuracy %: 85.8%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       29                   435                      6.2%
Non-Bankrupt   17                   447                      96.3%
Overall Accuracy %: 51.1%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       370                  94                       79.7%
Non-Bankrupt   37                   427                      92.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       435                  29                       93.75%
Non-Bankrupt   415                  49                       10.5%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       362                  98                       78.0%
Non-Bankrupt   7                    457                      98.0%
Overall Accuracy %: 88.0%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       265                  199                      57.1%
Non-Bankrupt   254                  210                      45.2%
Overall Accuracy %: 51.1%
Table 5.13 Bankruptcy prediction accuracy table using SMO OR SVM Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       342                  122                      73.7%
Non-Bankrupt   233                  231                      49.7%
Overall Accuracy %: 61.7%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       270                  194                      58.2%
Non-Bankrupt   223                  241                      51.9%
Overall Accuracy %: 55.1%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       290                  174                      62.5%
Non-Bankrupt   204                  260                      56.2%
Overall Accuracy %: 59.4%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       239                  225                      51.5%
Non-Bankrupt   210                  254                      54.7%
Overall Accuracy %: 53.2%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       427                  37                       92.3%
Non-Bankrupt   303                  161                      34.6%
Overall Accuracy %: 63.5%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       227                  237                      48.9%
Non-Bankrupt   174                  290                      62.5%
Overall Accuracy %: 55.7%
Table 5.15 Bankruptcy prediction accuracy table using KSTAR Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       464                  0                        100%
Non-Bankrupt   0                    464                      100%
Overall Accuracy %: 100%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       233                  231                      50.2%
Non-Bankrupt   244                  220                      47.4%
Overall Accuracy %: 49.8%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       229                  235                      49.8%
Non-Bankrupt   211                  253                      54.5%
Overall Accuracy %: 50.3%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       229                  235                      49.8%
Non-Bankrupt   227                  237                      51.0%
Overall Accuracy %: 50.4%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       347                  117                      74.8%
Non-Bankrupt   327                  137                      29.5%
Overall Accuracy %: 52.1%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       425                  39                       91.6%
Non-Bankrupt   20                   444                      95.7%
Overall Accuracy %: 93.6%
Table 5.17 Bankruptcy prediction accuracy table using AdaBoostM1 Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       247                  217                      53.2%
Non-Bankrupt   167                  297                      64.0%
Overall Accuracy %: 58.6%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       234                  221                      51.0%
Non-Bankrupt   167                  297                      64.0%
Overall Accuracy %: 57.0%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       349                  115                      75.21%
Non-Bankrupt   291                  173                      37.2%
Overall Accuracy %: 56.2%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       389                  75                       83.8%
Non-Bankrupt   348                  116                      25.0%
Overall Accuracy %: 54.4%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       290                  174                      62.5%
Non-Bankrupt   125                  339                      73.06%
Overall Accuracy %: 67.75%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       87                   377                      18.7%
Non-Bankrupt   100                  364                      78.5%
Overall Accuracy %: 48.22%
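The accuracy figures in these classification tables follow from the four cell counts. As a minimal sketch (the function name is hypothetical, and the modelling tools may round differently), per-class accuracy is the share of each observed class that was predicted correctly, and overall accuracy is the share of all firms predicted correctly. The counts below are the one-year AdaBoostM1 figures from Table 5.17.

```python
def classification_accuracy(bankrupt_row, non_bankrupt_row):
    """Rows are (predicted bankrupt, predicted non-bankrupt) per observed class."""
    b_correct, b_wrong = bankrupt_row
    nb_wrong, nb_correct = non_bankrupt_row
    total = b_correct + b_wrong + nb_wrong + nb_correct
    bankrupt_acc = 100.0 * b_correct / (b_correct + b_wrong)
    non_bankrupt_acc = 100.0 * nb_correct / (nb_wrong + nb_correct)
    overall_acc = 100.0 * (b_correct + nb_correct) / total
    return bankrupt_acc, non_bankrupt_acc, overall_acc


# One year before the event, AdaBoostM1 (Table 5.17).
b_acc, nb_acc, overall = classification_accuracy((247, 217), (167, 297))
print(f"{b_acc:.1f}% {nb_acc:.1f}% {overall:.1f}%")
```

With equal class sizes (464 firms each), overall accuracy is simply the average of the two per-class accuracies, which gives a quick consistency check on any row of these tables.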
Table 5.19 Bankruptcy prediction accuracy table using Decorate Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       245                  219                      52.8%
Non-Bankrupt   198                  266                      57.3%
Overall Accuracy %: 55.0%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       430                  34                       92.7%
Non-Bankrupt   406                  58                       12.5%
Overall Accuracy %: 52.6%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       110                  354                      23.7%
Non-Bankrupt   93                   371                      79.9%
Overall Accuracy %: 51.8%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       256                  208                      55.2%
Non-Bankrupt   225                  239                      51.5%
Overall Accuracy %: 53.4%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       207                  257                      44.6%
Non-Bankrupt   86                   378                      81.5%
Overall Accuracy %: 63.06%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       245                  219                      52.8%
Non-Bankrupt   134                  330                      71.2%
Overall Accuracy %: 61.9%
Table 5.21 Bankruptcy prediction accuracy table using LogisticBoost Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       316                  148                      68.1%
Non-Bankrupt   256                  208                      44.8%
Overall Accuracy %: 54.5%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       343                  121                      73.92%
Non-Bankrupt   225                  239                      51.5%
Overall Accuracy %: 62.7%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       320                  144                      68.9%
Non-Bankrupt   136                  328                      70.9%
Overall Accuracy %: 69.8%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       325                  139                      70.0%
Non-Bankrupt   366                  98                       21.2%
Overall Accuracy %: 45.5%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       322                  142                      69.4%
Non-Bankrupt   250                  214                      46.1%
Overall Accuracy %: 57.7%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       373                  91                       80.4%
Non-Bankrupt   51                   413                      95.2%
Overall Accuracy %: 87.7%
Table 5.23 Bankruptcy prediction accuracy table using Random Committee Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       262                  202                      56.5%
Non-Bankrupt   225                  239                      51.5%
Overall Accuracy %: 54.0%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       238                  226                      51.3%
Non-Bankrupt   243                  221                      47.6%
Overall Accuracy %: 49.5%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       230                  234                      49.5%
Non-Bankrupt   212                  252                      54.3%
Overall Accuracy %: 51.9%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       247                  217                      53.2%
Non-Bankrupt   243                  221                      47.6%
Overall Accuracy %: 50.4%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       78                   386                      16.8%
Non-Bankrupt   91                   373                      80.3%
Overall Accuracy %: 48.60%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       91                   373                      19.5%
Non-Bankrupt   92                   372                      80.2%
Overall Accuracy %: 49.3%
Table 5.25 Bankruptcy prediction accuracy table using NNge Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       246                  218                      53.01%
Non-Bankrupt   238                  226                      48.7%
Overall Accuracy %: 50.8%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       259                  205                      55.8%
Non-Bankrupt   258                  206                      44.3%
Overall Accuracy %: 50.0%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       269                  195                      57.9%
Non-Bankrupt   247                  217                      46.7%
Overall Accuracy %: 52.3%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       246                  218                      46.9%
Non-Bankrupt   271                  193                      41.59%
Overall Accuracy %: 44.3%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       28                   436                      6.0%
Non-Bankrupt   15                   449                      96.7%
Overall Accuracy %: 51.3%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       22                   442                      4.7%
Non-Bankrupt   15                   449                      96.7%
Overall Accuracy %: 51.02%
Table 5.27 Bankruptcy prediction accuracy table using ZeroR Model
Classification Table for data one year before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       184                  280                      39.6%
Non-Bankrupt   188                  276                      59.5%
Overall Accuracy %: 49.5%

Classification Table for data Two years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       184                  280                      39.6%
Non-Bankrupt   188                  276                      59.5%
Overall Accuracy %: 49.5%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       184                  280                      39.6%
Non-Bankrupt   188                  276                      59.5%
Overall Accuracy %: 49.5%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       184                  280                      39.6%
Non-Bankrupt   188                  276                      59.5%
Overall Accuracy %: 49.5%

Classification Table for data Three years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       195                  269                      42.0%
Non-Bankrupt   202                  262                      56.4%
Overall Accuracy %: 49.2%

Classification Table for data Four years before Event:
Observed       Predicted Bankrupt   Predicted Non-Bankrupt   Accuracy %
Bankrupt       241                  223                      47.1%
Non-Bankrupt   220                  244                      47.0%
Overall Accuracy %: 47.05%
Table 5.29 Bankruptcy prediction accuracy table using J48 Model
Classification Table for data one year before Event: Classification Table for data Two years before Event:
Observed Predicted Observed Predicted
Bankrupt Non- Accuracy % Bankrupt Non- Accuracy %
Bankrupt Bankrupt
Bankrupt 297 167 64.1% Bankrupt 189 275 40.8%
Non- 274 190 40.9% Non- 202 262 56.4%
Bankrupt Bankrupt
48.6%
Overall Accuracy % 52.5% Overall Accuracy %
Classification Table for data Three years before Event: Classification Table for data Four years before Event:
Observed Predicted Observed Predicted
Bankrupt Non- Accuracy % Bankrupt Non- Accuracy %
Bankrupt Bankrupt
Bankrupt 132 332 28.4% Bankrupt 251 213 54.0%
Non- 136 328 70.6% Non- 242 222 48.0%
Bankrupt Bankrupt
49.5% 51.0%
Overall Accuracy % Overall Accuracy %
Classification Table for data Three years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             417        47             89.8%
Non-Bankrupt         415        49             10.5%
Overall Accuracy %                             50.15%

Classification Table for data Four years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             401        63             86.4%
Non-Bankrupt         323        141            30.4%
Overall Accuracy %                             58.4%
Table 5.31 Bankruptcy prediction accuracy table using END Model
Classification Table for data one year before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             297        167            64.0%
Non-Bankrupt         274        190            40.9%
Overall Accuracy %                             52.5%

Classification Table for data Two years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             280        184            60.0%
Non-Bankrupt         256        208            44.8%
Overall Accuracy %                             52.4%
Classification Table for data Three years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             307        157            66.2%
Non-Bankrupt         274        190            40.9%
Overall Accuracy %                             54.1%

Classification Table for data Four years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             231        233            49.8%
Non-Bankrupt         222        242            52.2%
Overall Accuracy %                             51.0%
Classification Table for data Three years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             312        6              98.1%
Non-Bankrupt         29         293            91.0%
Overall Accuracy %                             94.5%

Classification Table for data Four years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             247        85             74.4%
Non-Bankrupt         185        137            42.5%
Overall Accuracy %                             58.7%
Table 5.33 Bankruptcy prediction accuracy table using CHAID Model
Classification Table for data one year before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             366        98             78.9%
Non-Bankrupt         308        156            33.6%
Overall Accuracy %                             56.2%

Classification Table for data Two years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             194        270            41.8%
Non-Bankrupt         137        327            70.5%
Overall Accuracy %                             56.1%
Classification Table for data Three years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             396        68             85.3%
Non-Bankrupt         340        124            26.7%
Overall Accuracy %                             56.0%

Classification Table for data Four years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             60         404            12.9%
Non-Bankrupt         32         432            93.1%
Overall Accuracy %                             53.0%
Classification Table for data Three years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             349        115            75.2%
Non-Bankrupt         301        163            35.1%
Overall Accuracy %                             55.2%

Classification Table for data Four years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             60         404            12.9%
Non-Bankrupt         32         432            93.1%
Overall Accuracy %                             53.0%
Table 5.35 Bankruptcy prediction accuracy table using CART Model
Classification Table for data one year before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             394        70             84.9%
Non-Bankrupt         321        143            30.8%

Classification Table for data Two years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             335        129            72.2%
Non-Bankrupt         269        195            42.0%
Classification Table for data Three years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             400        64             86.2%
Non-Bankrupt         342        122            26.3%
Overall Accuracy %                             56.2%

Classification Table for data Four years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             389        75             83.8%
Non-Bankrupt         348        116            25.0%
Overall Accuracy %                             54.4%
Classification Table for data Three years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             200        264            56.2%
Non-Bankrupt         0          464            100.0%
Overall Accuracy %                             78.2%

Classification Table for data Four years before Event:
Observed             Predicted
                     Bankrupt   Non-Bankrupt   Accuracy %
Bankrupt             0          464            0.0%
Non-Bankrupt         0          464            100.0%
Overall Accuracy %                             50.0%
Table 5.37 Bankruptcy prediction accuracy table using K-NN Model
Classification table one year before Event
Classification table two years before Event
Classification table three years before Event
Classification table four years before Event
Appendix B
Figure 5.5 Model HP Tree
Figure 5.6 Neural Network Model
Figure 5.7 Auto Neural Model
Figure 5.8 HP Neural Model
Figure 5.9 DMNeural Model
Figure 5.10 Regression Model
Figure 5.11 HP SVM Model
Figure 5.12 HP Regression Model
Figure 5.13 Memory Based Reasoning Model