Published by: ijcsis on Mar 08, 2011
Copyright:Attribution Non-commercial

AN IMPROVED MULTIPERCEPTRON NEURAL NETWORK MODEL TO CLASSIFY SOFTWARE DEFECTS
M.V.P. Chandra Sekhara Rao, Aparna Chaparala
Department of CSE, R.V.R. & J.C. College of Engineering, Guntur, India
Dr. B. Raveendra Babu
Director (Operations), Delta Technologies (P) Ltd., Hyderabad, India
Dr. A. Damodaram
JNTU, CSE Department, JNTU College of Engineering, Kukatpally, Hyderabad, India
Abstract: Predicting software defects in modules not only helps in maintaining legacy systems but also helps the software development process and ensures higher reliability. Advantages include better planning of project resources and minimization of budget. Research has been carried out using statistical methodology and machine learning techniques which are generic in nature. Dependence on legacy software systems to meet current demanding requirements is a major challenge for any IT administrator, as is estimating the cost of maintaining them. In this paper, it is proposed to modify the existing multilayer perceptron neural network, a popular supervised classification algorithm, to predict defects in a given module based on the available software metrics.
Keywords: Legacy software, Software metrics, Software reliability, Classification, Multilayer Perceptron Neural network, Fault-proneness.
I. INTRODUCTION
Software reliability and software quality assurance are two major areas of software engineering that ensure high-quality software. Both concepts are applied throughout the development and maintenance process. The major activities used are performance analysis, functional tests, and quantifying time and budget, along with measurement of metrics [1]. In addition, code reviews, key personnel assignment and automatic test-case generation are other strategies applied to reach high reliability [2].

Software quality can be viewed from different perspectives, including time, budget and mean time to failure. Alpha and beta testing help to improve the quality of software but do not ensure zero defects, and are very expensive if not planned properly.

Software quality modeling becomes an important criterion to ensure that the software not only meets the desired quality but also stays within time and budget limits. Defect prediction based on quantifiable metrics, though controversial, has been used successfully to predict defects in modules. Defect prediction models have independent variables captured in the form of product and process metrics and one dependent variable which indicates whether or not there could be a fault in the module. Typically, researchers have used product metrics extensively to predict faults in modules. The independent variables used for prediction of defects can be parameters captured in previous projects, which are available in the configuration management system, or can be computed from the current project.

Predicting module defects also finds application in legacy systems, where it may not be possible to replace the systems through the practice of application retirement. Defect prediction provides a cost-effective process to enhance them.

The previous work carried out by the author [3] investigates the KC1 dataset for defect classification using Decision Tree induction and Bayesian networks. Various pre-processing techniques were also investigated [4]. The results obtained are tabulated in Tables I and II.
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 2, February 2011, p. 124, http://sites.google.com/site/ijcsis/, ISSN 1947-5500
 
TABLE I. CLASSIFICATION ACCURACY ON KC1 DATASET

KC1 Dataset                    Correctly classified %   Incorrectly classified %   Mean absolute error
Random tree                    81.86                    18.14                      0.1924
CART                           84.91                    15.09                      0.2095
Bayesian logistic regression   86.03                    13.97                      0.1397
TABLE II. CLASSIFICATION ACCURACY AFTER PREPROCESSING IN KC1 DATASET

                      % correctly classified   % incorrectly classified
Random Tree           94.5531                  5.4469
Logistic regression   95.6704                  4.3296
CART                  96.7877                  3.2123

In this paper, the efficacy of neural networks for defect prediction is verified using both an available model and our proposed model.

This paper is organized into the following sections. Section II describes software metrics, Section III describes data mining techniques for classification, Section IV gives an introduction to the neural networks used, Section V describes the dataset used in the work, and Section VI includes the improved neural network technique and the output obtained. The last section analyses and concludes the paper.
II. SOFTWARE METRICS
Software metrics are collected at various phases of the software development process. These metrics contain information about the software and can be used to predict software quality in the early stages of the software life cycle.

Software reliability engineering is one of the most important aspects of software quality. Recent studies show that software metrics can be used in software module fault-proneness prediction. A software module has a series of metrics, some of which are related to fault-proneness. Many studies on software quality prediction using the relationship between software metrics and a software module's fault-proneness have been carried out in recent decades. Several techniques have been proposed to classify modules in order to identify fault-prone modules.
III. DATA MINING TECHNIQUES
Data Mining (DM) aims to establish something new from the facts recorded in databases. Originally, data mining was a statistician's term for overusing data to draw illegitimate inferences. DM is the use of powerful tools to sift out important or significant traits that were previously unknown from databases or data warehouses.

Software is prone to errors and bugs. The purpose of software testing is to assess the quality of computer software and verify whether the software complies with the software specification and customer needs. There are two ways to find errors in software testing: manual and automated. Manual debugging is labour intensive and costly, while automated debugging can classify and locate software defects automatically. Data mining based software debugging is becoming more and more accepted, and it can significantly reduce the labour cost of software debugging.

Data mining extracts useful information and knowledge from huge amounts of data. DM methods can be applied to the data generated in every stage of the software life cycle, such as design, development, testing, deployment and maintenance, to uncover potential errors in the software.
IV. NEURAL NETWORKS
Neural networks consist of multiple layers of computational units, usually interconnected in a feed-forward way. Each neuron in one layer has directed connections to the neurons of the subsequent layer. In many applications the units of these networks apply a sigmoid function as an activation function.

The feed-forward neural network was the first and arguably simplest type of artificial neural network devised. As the majority of faults in a software system are found in a few of its modules, there is a need to investigate the modules that are affected severely as compared to other modules, and proper maintenance needs to be done on time, especially for critical applications (Ebru Ardil et al., 2009).

Algorithms based on neural networks have many applications in knowledge engineering. In data mining, the following neural network architectures are used:
Multilayered feed forward neural networks
Kohonen's self-organizing maps
A) Multilayered feed forward neural networks
Multilayered feed forward neural networks (ANNs) are non-parametric regression methods, which approximate the underlying functionality in data by minimizing a loss function. The common loss function used for training an ANN is the quadratic error function. ANNs are adapted by supervised learning. Database records form a training set. During training, specified items of data records are presented as the input of the neural network and its weights are changed in such a way that its output approximates the values in the data set. After the learning process finishes, the learned knowledge is represented by the values of the neural network weights. For training, the back-propagation of error algorithm is often used.
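The training procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch of a standard multilayer perceptron with sigmoid units, a quadratic error function and back-propagation; it is not the improved model proposed in this paper. The class name `TinyMLP`, the toy data set `TOY_DATA` (two hypothetical metrics scaled to [0, 1], label 1 meaning "defective"), the single hidden layer, the learning rate and the epoch count are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy data: (scaled metric values, defect label)
TOY_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]


class TinyMLP:
    """One hidden layer, sigmoid units, trained by back-propagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # w[j][i]: weight from input i to hidden unit j (last entry is the bias)
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                  for _ in range(n_hidden)]
        # v[j]: weight from hidden unit j to the single output (last entry is the bias)
        self.v = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        """Feed-forward pass; returns (hidden activations, output)."""
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(wji * xi for wji, xi in zip(wj, xb)))
                  for wj in self.w]
        hb = hidden + [1.0]
        out = sigmoid(sum(vj * hj for vj, hj in zip(self.v, hb)))
        return hidden, out

    def loss(self, data):
        """Quadratic error: mean of 0.5 * (output - target)^2."""
        return sum(0.5 * (self.forward(x)[1] - t) ** 2
                   for x, t in data) / len(data)

    def train_step(self, x, t, lr):
        """One back-propagation update for a single training record."""
        hidden, out = self.forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Error signal at the output unit (derivative of the quadratic error)
        delta_out = (out - t) * out * (1.0 - out)
        # Error signals at the hidden units, computed before v is updated
        delta_hidden = [delta_out * self.v[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(len(hidden))]
        for j in range(len(hb)):
            self.v[j] -= lr * delta_out * hb[j]
        for j, wj in enumerate(self.w):
            for i in range(len(xb)):
                wj[i] -= lr * delta_hidden[j] * xb[i]

    def fit(self, data, epochs=2000, lr=0.5):
        for _ in range(epochs):
            for x, t in data:
                self.train_step(x, t, lr)
```

A typical use is `net = TinyMLP(n_in=2, n_hidden=3)` followed by `net.fit(TOY_DATA)`; after training, the learned knowledge is held entirely in `net.w` and `net.v`, as the text describes.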
Fig. 1. Multilayer Neural Network. [Diagram: input layer, hidden layers and output layer; labels: inputs x_1, x_i, x_n; weights w_1j, w_2j, w_ij, w_nj, W_jk; outputs O_j, O.]

B) Kohonen's self-organizing maps
Kohonen's self-organizing maps (SOMs) have become a promising technique in cluster analysis. They are adapted by unsupervised learning. In data mining, cluster techniques based on Kohonen's self-organizing maps have the following advantages over standard statistical methods.

DM typically deals with high-dimensional data. A record in a database typically consists of a large number of items. The data do not have a regular multivariate distribution, so traditional statistical methods have their limitations and are not effective. SOMs work with high-dimensional data efficiently.

Kohonen's self-organizing maps provide a means for visualization of multivariate data, because two clusters of similar members activate output neurons that are a small distance apart in the output layer. In other words, neurons that are topologically close will be sensitive to inputs that are similar. No other cluster analysis algorithm has this property.

A SOM is a dynamic system which learns the abstract structure of a high-dimensional input space using a low-dimensional space for representation.
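The unsupervised adaptation described above can be illustrated with a minimal Python sketch of a one-dimensional SOM. It assumes a chain of output units, a Gaussian neighbourhood function and linearly decaying learning rate and radius; the class name `TinySOM`, the helper `sq_dist` and all parameter values are hypothetical choices for the example and do not come from this paper.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


class TinySOM:
    """1-D self-organizing map: a chain of units, each with a weight vector."""

    def __init__(self, n_units, dim, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.random() for _ in range(dim)]
                        for _ in range(n_units)]

    def bmu(self, x):
        """Index of the best-matching unit (closest weight vector to x)."""
        return min(range(len(self.weights)),
                   key=lambda u: sq_dist(self.weights[u], x))

    def update(self, x, lr, radius):
        """Move the winner and its chain neighbours towards x."""
        winner = self.bmu(x)
        for u, w in enumerate(self.weights):
            # Gaussian neighbourhood over positions on the output chain
            h = math.exp(-((u - winner) ** 2) / (2.0 * radius ** 2))
            for i in range(len(w)):
                w[i] += lr * h * (x[i] - w[i])
        return winner

    def fit(self, data, epochs=50, lr0=0.5, radius0=2.0):
        """Unsupervised training with decaying learning rate and radius."""
        for epoch in range(epochs):
            frac = 1.0 - epoch / epochs
            lr = lr0 * frac
            radius = max(radius0 * frac, 0.5)
            for x in data:
                self.update(x, lr, radius)
```

Because neighbouring units on the chain are pulled towards the same inputs, similar records tend to end up with nearby best-matching units, which is the topological property the text refers to.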
V. DATA SET
Data from NASA's Metrics Data Program (MDP) data repository is used. The KC1 dataset used contains LOC measures, cyclomatic complexity, base Halstead measures and derived Halstead measures from various software modules.

The attributes used in this work are described briefly below.

LOC_BLANK - The number of blank lines in a module.
LOC_CODE_AND_COMMENT - The number of lines which contain both code and comment in a module.
LOC_COMMENTS - The number of lines of comments in a module.
CYCLOMATIC_COMPLEXITY - The cyclomatic complexity of a module.
DESIGN_COMPLEXITY - The design complexity of a module.
ESSENTIAL_COMPLEXITY - The essential complexity of a module.
LOC_EXECUTABLE - The number of lines of executable code for a module (not blank or comment).
HALSTEAD_CONTENT - The Halstead length content of a module.
HALSTEAD_DIFFICULTY - The Halstead difficulty metric of a module.
HALSTEAD_EFFORT - The Halstead effort metric of a module.
HALSTEAD_ERROR_EST - The Halstead error estimate metric of a module.
HALSTEAD_LENGTH - The Halstead length metric of a module.
HALSTEAD_LEVEL - The Halstead level metric of a module.
HALSTEAD_PROG_TIME - The Halstead programming time metric of a module.
HALSTEAD_VOLUME - The Halstead volume metric of a module.
NUM_OPERANDS - The number of operands contained in a module.
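To make the relation between base and derived Halstead measures concrete, the sketch below computes the derived values from the four base counts using the standard textbook Halstead definitions. This is an assumption: the exact formulas used by the MDP tooling are not given in this paper and may differ. The names `HalsteadCounts` and `derived_halstead` are hypothetical, and the mapping of dictionary keys to the attribute names above is for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class HalsteadCounts:
    """Base Halstead counts for one module."""
    distinct_operators: int  # n1
    distinct_operands: int   # n2
    total_operators: int     # N1
    total_operands: int      # N2 (assumed to correspond to NUM_OPERANDS)


def derived_halstead(c):
    """Derived measures from the textbook Halstead formulas (an assumption)."""
    length = c.total_operators + c.total_operands           # N = N1 + N2
    vocabulary = c.distinct_operators + c.distinct_operands  # n = n1 + n2
    volume = length * math.log2(vocabulary)                  # V = N * log2(n)
    difficulty = (c.distinct_operators / 2.0) * (
        c.total_operands / c.distinct_operands)              # D = (n1/2) * (N2/n2)
    level = 1.0 / difficulty                                 # L = 1 / D
    effort = difficulty * volume                             # E = D * V
    return {
        "HALSTEAD_LENGTH": length,
        "HALSTEAD_VOLUME": volume,
        "HALSTEAD_DIFFICULTY": difficulty,
        "HALSTEAD_LEVEL": level,
        "HALSTEAD_EFFORT": effort,
        "HALSTEAD_PROG_TIME": effort / 18.0,   # T = E / 18 seconds
        "HALSTEAD_ERROR_EST": volume / 3000.0,  # B = V / 3000
        "HALSTEAD_CONTENT": level * volume,     # I = L * V
    }
```

For example, a module with 4 distinct operators, 4 distinct operands, 8 operator occurrences and 8 operand occurrences has length 16, volume 48, difficulty 4 and effort 192 under these definitions.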
 