financial statements and differentiating between fraud and non-fraud reporting. The dataset consists of financial ratios obtained from publicly available financial statements.
The paper is organised as follows: Section 2 discusses the relevant prior research, followed by Section 3, which describes the various tricks adopted by management for falsifying financial statements. Section 4 reveals the key variables and financial ratios related to detection of financial statement fraud. Section 5 provides an insight into the data mining techniques used in this study. Section 6 analyses the results, followed by concluding remarks (Section 7).
2. Related Work
An overview of the academic literature concerning detection of financial statement fraud is given here. A number of studies, such as PwC [4] and ACFE [5], report on how fraud is detected. Their findings suggest that fraud is often detected by chance or by accident. For example, the PwC report [4] reveals that 41% of the fraud cases were detected by means of tip-offs or by chance.
Several groups of researchers have devoted a significant amount of effort to studying the use of data mining techniques in detection of financial statement fraud from different perspectives. Beasley [6]
used logit regression to test the prediction that the inclusion of larger proportions of outside members on the board of directors significantly reduces the likelihood of financial statement fraud, with a sample of 150 American firms. The study found that non-fraud firms have boards with significantly higher percentages of outside members than fraud firms. Green and Choi [7] presented a neural network fraud classification model employing endogenous financial data. A classification model created from the learned behavior pattern was then applied to a test sample. Fanning and Cogger [8] also used an artificial neural network to predict management fraud. Using publicly available predictors of fraudulent financial statements (FFS), they found a model of eight variables with a high probability of detection. Kirkos [9] carried out an in-depth examination of publicly available data from the financial statements of various firms in order to detect FFS by using data mining classification methods. In that study, three data mining techniques, namely Decision Trees, Neural Networks and Bayesian Belief Networks, were tested for their applicability in management fraud detection. Spathis et al.
[10] compared multi-criteria decision aids with statistical techniques such as logit and discriminant analysis in detecting fraudulent financial statements. Cecchini et al. [11] developed a novel financial kernel using support vector machines for detection of management fraud. An innovative fraud detection mechanism was developed by Huang et al. [12] on the basis of Zipf’s Law. This technique reduces the burden on auditors in reviewing overwhelming volumes of data and assists them in identifying any potential fraud records. Hoogs et al. [13] presented a genetic algorithm approach to detecting financial statement fraud. Cerullo and Cerullo [14]
explained the nature of fraud and financial statement fraud along with the characteristics of neural networks (NN) and their applications. They illustrated how NN packages could be utilized by various firms to predict the occurrence of fraud. Koskivaara [15] proposed NN-based support systems as a possible tool for use in auditing. That study demonstrated that the main application areas of NN were detection of material errors and management fraud. Busta and Weinberg [16] used NN to distinguish between ‘normal’ and ‘manipulated’ financial data. They examined the digit distribution of the numbers in the underlying financial information. Koh and Low [17] constructed a decision tree to predict the hidden problems in financial statements by examining the following six variables: quick assets to current liabilities, market value of equity to total assets, total liabilities to total assets, interest payments to earnings before interest and tax, net income to total assets, and retained earnings to total assets. Belinna et al. [18]
examined the effectiveness of CART in the identification and detection of financial statement fraud. They concluded that CART is a very effective technique for distinguishing fraudulent financial statements from non-fraudulent ones. Further, Deshmukh and Talluru [19] demonstrated the construction of a rule-based fuzzy reasoning system to assess the risk of management fraud and proposed an early warning system by finding 15 rules related to the probability of management fraud. Zhou & Kapoor [20] examined the effectiveness and limitations of data mining techniques such as regression, decision trees, neural networks and Bayesian networks. They explored a self-adaptive framework based on a response surface model with domain knowledge to detect financial statement fraud. Recently, Ravisankar et al. [21] used data mining techniques such as Multilayer Feed Forward Neural Network (MLFF), Support Vector Machines (SVM), Genetic Programming (GP), Group Method of Data Handling (GMDH), Logistic Regression (LR), and Probabilistic Neural Network (PNN) to identify companies that resort to financial statement fraud. They found that PNN outperformed all the other techniques without feature selection, and that GP and PNN outperformed the others with feature selection, with marginally equal accuracies.
If we summarize the existing academic research, we arrive at the conclusion that detection of financial statement fraud is an instance of classification and
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 10, No. 3, March 2012, p. 50, http://sites.google.com/site/ijcsis/, ISSN 1947-5500