
Biswajit Biswas

Table of contents
Abstract
Abbreviations
Introduction
Model Description
Calculations
Other Usage
Plotting of Gompertz curve
Imperatives of the model
Conclusions

Abstract
This paper discusses a reliability model that has been used successfully to predict the reliability of software systems. The model is based on the Rayleigh equation, a member of the Weibull family of distributions. Taking defect density and defect injection rate from the upstream processes as input, it estimates the defect rates that will be uncovered in downstream processes, e.g. the final stages of testing. This estimate gives a picture of the quality of the software system well before it is shipped. SPC methods for defect management, and growth curves used for deciding when to stop testing, can also be used effectively alongside this model, as it gives a priori information on the parameter that needs to be controlled. TEL has used this model in more than 20 live projects over the last 18 months to predict the outgoing defect rate, and the predictions at various stages have been close to the observed values. This paper also discusses the imperatives of the model and the methods of estimating the organizational constants that go into defining it.

Abbreviations
HLD   High Level Design
KLOC  Kilo Lines of Code
LLD   Low Level Design
PDF   Probability Distribution Function
SPC   Statistical Process Control
TEL   Tata Elxsi Limited

Introduction
Software reliability models are used to estimate the reliability, or the number of latent defects, of a software product when it is made available to customers. Such an estimate is important for two reasons:
i) It gives a quantitative statement on the outgoing quality of the software product.
ii) It supports resource planning for the software maintenance stage.

Model Description
The number of defects found during the various life cycle stages of a project conforms to a distribution represented by the Rayleigh equation. The overall defect density of the entire project can be estimated by carrying out non-linear regression analysis using this equation with the defect data observed in design reviews and code reviews. The number of defects for any later stage (e.g. Unit Testing) can then be estimated from the PDF.

The shape of the curve indicates the pattern of defect removal over the life cycle of the project. The steeper the curve, the fewer defects remain in the product when it is delivered to the customer; a flatter curve indicates inefficient defect removal and hence more defects leaked to the customer.

The Rayleigh model, among the Weibull family of distributions, is found to be the most suitable for predicting the reliability of a software product. It predicts the expected defect density at different stages of the project life cycle once the parameters of the curve (the total number of defects or total cumulative defect rate, and the position of the peak of the curve in units of time) are decided. The PDF of the curve can be given as:

F(t) = f(K, tm, t)

where
tm is the time at which the curve peaks
t is the actual time unit
K is the cumulative defect density
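The text gives the curve only as F(t) = f(K, tm, t). As an illustration, the minimal sketch below assumes the standard Rayleigh density, whose scale parameter equals the location of its peak, scaled by K so that the area under the curve is K. The function name and the numeric values of K and tm are hypothetical and not taken from any project.

```python
import math

def rayleigh_defect_density(t, K, tm):
    """Defect density at time t for a Rayleigh curve with total K and peak at tm.

    Assumed form: K * (t / tm^2) * exp(-t^2 / (2 * tm^2)).
    """
    if tm <= 0:
        raise ValueError("tm must be positive")
    if t < 0:
        return 0.0
    return K * (t / tm**2) * math.exp(-t**2 / (2 * tm**2))

# Hypothetical parameter values, for illustration only.
K, tm = 50.0, 1.5
for t in (0.5, 1.5, 2.5, 3.5, 4.5, 5.5):
    print(f"t = {t:.1f}  density = {rayleigh_defect_density(t, K, tm):.2f}")
```

Under this assumed form, a smaller tm moves the peak earlier and makes the tail fall off faster, which corresponds to the steeper curve described above.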

Process control can be exercised using these predicted values of defect density at various stages of testing.

Calculations
The two parameters K and tm are estimated before the Rayleigh curve is plotted for the entire range. A minimum of three points is needed to estimate these parameters using non-linear regression analysis. The parameters are considered valid if the following statistical measures are acceptable:
a) Standard error of estimate
b) Proportion of variance explained, R²
c) Durbin-Watson test for autocorrelation
Once the two parameters are established, the graph is plotted for the entire range.
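The estimation and the three checks can be sketched as follows. This is a minimal sketch under stated assumptions, not the tooling used in the projects: it assumes the standard Rayleigh density form, replaces a full non-linear regression routine with a simple scan over tm (for each tm the best K has a closed form), and runs on made-up defect densities at the HLD, LLD and Implementation time points. All function names are hypothetical.

```python
import math

def rayleigh_shape(t, tm):
    """Unit-area Rayleigh density peaking at tm (assumed functional form)."""
    return (t / tm**2) * math.exp(-t**2 / (2 * tm**2))

def fit_rayleigh(times, observed, tm_lo=0.2, tm_hi=6.0, steps=5801):
    """Least-squares estimate of (K, tm): scan tm, solve K in closed form."""
    best = None
    for i in range(steps):
        tm = tm_lo + (tm_hi - tm_lo) * i / (steps - 1)
        shape = [rayleigh_shape(t, tm) for t in times]
        # For a fixed tm the model is linear in K, so K has a closed form.
        K = sum(s * y for s, y in zip(shape, observed)) / sum(s * s for s in shape)
        sse = sum((y - K * s) ** 2 for s, y in zip(shape, observed))
        if best is None or sse < best[2]:
            best = (K, tm, sse)
    return best[0], best[1]

def diagnostics(times, observed, K, tm, n_params=2):
    """Standard error of estimate, R^2 and Durbin-Watson statistic."""
    resid = [y - K * rayleigh_shape(t, tm) for t, y in zip(times, observed)]
    sse = sum(r * r for r in resid)
    mean = sum(observed) / len(observed)
    sst = sum((y - mean) ** 2 for y in observed)
    std_err = math.sqrt(sse / (len(observed) - n_params))
    r_squared = 1.0 - sse / sst
    dw = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid))) / sse
    return std_err, r_squared, dw

# Made-up defect densities (defects/KLOC) at HLD, LLD and Implementation.
times = [0.5, 1.5, 2.5]
observed = [11.4, 22.3, 17.5]

K, tm = fit_rayleigh(times, observed)
std_err, r_squared, dw = diagnostics(times, observed, K, tm)
print(f"K = {K:.2f}, tm = {tm:.3f}")
print(f"std error = {std_err:.3f}, R^2 = {r_squared:.4f}, Durbin-Watson = {dw:.2f}")
```

A Durbin-Watson value near 2 suggests little autocorrelation in the residuals. The acceptance thresholds for the three checks are not stated in the text, so none are encoded here; with only three points and two parameters the checks rest on a single degree of freedom and should be read with caution.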

Fig: Rayleigh Distribution (y plotted against t; K and the point t = TUT are marked on the plot)

The defect density at life cycle stage t = TUT (unit testing) is K * y (at t = TUT). Thus, the defect density at any stage of the project is predicted by substituting the value of t, reading off the Y-axis value, and multiplying it by K. The following time scale is used in the model for carrying out non-linear regression analysis.

Stage                 Time scale
HLD                   0.5
LLD                   1.5
Implementation        2.5
Unit Testing          3.5
Integration Testing   4.5
System Testing        5.5

Table: Mapping of stages to time scale
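The prediction step, K multiplied by the curve value y at a stage's time point, can be illustrated with a short sketch. The stage-to-time mapping is taken from the table; the Rayleigh density form and the values of K and tm are assumptions made for illustration, and the names are hypothetical.

```python
import math

# Time scale from the table of stages.
STAGE_TIME = {
    "HLD": 0.5,
    "LLD": 1.5,
    "Implementation": 2.5,
    "Unit Testing": 3.5,
    "Integration Testing": 4.5,
    "System Testing": 5.5,
}

def predicted_density(stage, K, tm):
    """Predicted defect density K * y at the time point of the given stage."""
    t = STAGE_TIME[stage]
    # y: unit-area Rayleigh density at t (assumed functional form).
    y = (t / tm**2) * math.exp(-t**2 / (2 * tm**2))
    return K * y

# Hypothetical fitted parameters, for illustration only.
K, tm = 60.0, 1.6
for stage, t in STAGE_TIME.items():
    print(f"{stage:20s} t = {t:.1f}  {predicted_density(stage, K, tm):6.2f} defects/KLOC")
```

Reading the value for "System Testing" gives the predicted outgoing defect density for the chosen parameters.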

Below is the Rayleigh curve plotted for a live project.

[Figure: Rayleigh Curve. Y-axis: Defects/KLOC, 0 to 35. Stage labels on the plot: HLD, LLD, CODING, UT, IT. Data labels on the plot: 1.4, 28.9, 23, 10.5, 3.5.]

The red line indicates the actual defect density observed, as against the predicted values (smooth brown curve) obtained from the theoretical model. The observed defect density closely matches the defect density predicted by the model.

Outgoing quality of the software
The curve indicates the defect density at the time of system testing as 0.68 defects/KLOC, or 21 defects.

Other Usage
The process of bug detection during planned testing against test cases follows a pattern called the S-curve or, more suitably, the Gompertz curve. A plot of the cumulative bug/trouble count against this theoretically predicted pattern, with control boundaries, shows the consistency and effectiveness of the testing process.

[X-axis: Life cycle stages, time scale ticks from 0.05 to 6.85]

Fig: Rayleigh Plot for a TEL project

Plotting of Gompertz curve
The expected number of defects (k) is estimated using the Rayleigh equation. The upper and lower control limits of the Gompertz curve are calculated using the equation:

Y = k * a^(b^t)

where a and b are model parameters whose values depend on the type of software.
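A small sketch of the Gompertz equation may help make the control limits concrete. The text does not say how the upper and lower limits are derived from a and b, so the sketch simply evaluates the curve for two hypothetical parameter pairs and treats them as a band; k, both (a, b) pairs, and the function name are illustrative assumptions, with 0 < a < 1 and 0 < b < 1 so that the curve rises from k * a at t = 0 towards k.

```python
def gompertz(t, k, a, b):
    """Cumulative defect count at time t: k * a ** (b ** t)."""
    return k * a ** (b ** t)

# Hypothetical values, for illustration only.
k = 100.0                 # expected total number of defects
lower_params = (0.05, 0.6)  # (a, b) assumed for the lower control limit
upper_params = (0.20, 0.4)  # (a, b) assumed for the upper control limit

for t in range(0, 11, 2):
    lower = gompertz(t, k, *lower_params)
    upper = gompertz(t, k, *upper_params)
    print(f"t = {t:2d}  lower = {lower:6.2f}  upper = {upper:6.2f}")
```

The observed cumulative bug count at each time point would then be compared against the band; points falling outside it would indicate that the testing process is deviating from the expected pattern.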

Fig: Gompertz curve (y plotted against t)

Imperatives of the model
The basic assumptions made in using the Rayleigh model are:
i) The error injection rate at the various stages is constant.
ii) Defect removal effectiveness remains more or less unchanged.
This means that the upstream processes of the life cycle, such as design reviews and code reviews, need to be followed consistently. In-flight control charts that measure the consistency of the review processes are necessary for this purpose. The implied need of this model is therefore a very high level of process maturity and process capability; implementation of the model will not succeed unless these needs are satisfied.

Conclusions
This model has been implemented in our organization over the last 18 months in more than 20 projects. The outcome has been successful prediction of outgoing quality, which is also reflected in our customer feedback. Effective implementation of this model calls for overall consistency in all software engineering activities, which pays off throughout the life cycle in the form of effective reviews and effective testing. In the projects where the model was used to predict outgoing quality and the outcome matched the prediction, we achieved peer review effectiveness in excess of 75% and testing effectiveness in excess of 95%. This model is one of the main tools for quantitative process management and statistical process management, as its very use calls for an on-line process under statistical control.

