
International Journal of Advanced Computer Science, Vol. 1, No. 6, Pp. 250-256, Dec. 2011.

Software Reliability Model Metrics: Precision and Robustness

Zhanwei Hui, Xiaoming Liu, & Song Huang
Abstract: Software reliability is a key characteristic of software quality, and it is also an important index that is hard to control and measure. After introducing the mechanisms of software reliability models, this paper summarizes the current lack of a general-purpose model and introduces two metrics for software reliability models: precision and robustness. Together with a classification of software reliability models over the software life cycle, these metrics provide a useful criterion for the effective selection of software reliability models. Finally, the paper presents the conclusion and introduces future work.

Manuscript received: 5 Sep. 2011; revised: 14 Nov. 2011; accepted: 15 Dec. 2011; published: 15 Jan. 2012.

Keywords: software reliability model, model precision, model robustness, classification

1. Introduction
Almost all technologies that pervade and control modern life rely heavily on computers and computer software. However, no existing software product is fault-free. Software errors have been known to cause spectacular and sometimes catastrophic failures. For example, on September 17, 1991, a power outage at the AT&T switching facility in New York City interrupted service to 10 million telephone customers for nine hours. During the 1991 Gulf War, a software problem may have prevented the Patriot missile system from tracking the Iraqi Scud missile that killed 28 U.S. soldiers. Given the potentially costly impact and possibly disastrous consequences of software failures in many applications, it is imperative to have sound methodologies to measure, quantify and improve software quality. Software reliability has therefore become an important research area.

Software reliability is defined as the probability that the software functions without failure under given environmental conditions during a specified period of time [1]. Here, a software failure generally means the inability of the software to perform an intended task specified by the requirements. As the complexity and size of software applications grow, the number of faults in the software design increases, and the faults become subtler and harder to detect. In recent years, the costs of developing software and fixing its inherent faults have become major expenses in a system. Hence, reliability estimation and improvement techniques have proven to be useful tools for software developers to bring down cost, to evaluate current reliability, and to predict future performance before releasing software products to the market. Assessing the reliability of software is also important for business decisions such as the release of the software.

Reliability models are a powerful tool for predicting, controlling, and assessing software reliability. Work on software reliability models started in the 1970s, the first model being presented in 1972. Today the number of existing models exceeds one hundred, with more models developed every year. Still, no model can be applied in all cases. Models that are good in general are not always the best choice for a particular data set, and it is not possible to know in advance which model should be used in a particular case [2]. There is no guideline with a high confidence level that can be followed to choose a particular model. No one has succeeded in identifying a priori the characteristics of software that will ensure that a particular model can be trusted for reliability predictions [3].

This paper is organized as follows. Section 2 presents the mechanisms and deficiencies of traditional software reliability models: we first introduce the mechanism of deterministic models and then that of probabilistic models. Section 3 presents the measurement of software reliability models based on applicability analysis and gives two metrics for software reliability models. In Section 4, we briefly present a classification of reliability models based on precision and robustness. Section 5 provides validation of the work. Finally, Section 6 presents the conclusion and introduces future work.

This work was supported by the National High Technology Research and Development Program of China (No. 2009AA01Z402). Z.W. Hui is with the Military Training Software Testing and Evaluation Centre, University of Science and Technology of PLA, Nanjing 210007, China. X.M. Liu and S. Huang are with the Military Training Software Testing and Evaluation Centre of PLA, and the Institute of Command Automation, PLA University of Science and Technology, Nanjing 210007, China.

Zhanwei Hui et al.: Software Reliability Model Metrics: Precision and Robustness.


2. Mechanism and Deficiency of Traditional Software Reliability Models

For the reliability of software, it is very valuable to be able to estimate the faults remaining in a software system. However, the initial number of faults cannot be determined, which introduces uncertainty into the evaluation [4]. In the more than 30 years since the study of software reliability engineering began, many kinds of reliability models, based on different assessment techniques, have been proposed in order to obtain a credible measure of software reliability.

A. Mechanism of Deterministic Models

The existing reliability models can be divided into two categories according to randomness: deterministic models and probabilistic models. A deterministic model counts the number of errors in a program from the explicit operators and operands in it. Models of this kind emphasize the analysis of program structure and rarely involve stochastic events. Two popular deterministic models are Halstead's software metric [5] and McCabe's complexity metric [6]. Although they have different objects, both models first analyze the structure of the software system and then analyze the system quantitatively. Following reference [5], the number of errors in a program is calculated from $n_1$, $n_2$, $N_1$, $N_2$, $N$, $V$ and $I$, where:

$n_1$ = number of distinct operators in the program
$n_2$ = number of distinct operands in the program
$N_1$ = total number of operators in the program
$N_2$ = total number of operands in the program
$N$ = length of the program
$V$ = volume of the program
$I$ = number of machine instructions in the program

McCabe's metric also requires a structural analysis of the program: the program is transformed into a directed control-flow graph, and its complexity is then measured from the number of edges and vertices in the graph. The concept model of these metric models can therefore be abstracted as in Figure 1.

Fig. 1 Concept model of deterministic models (diagram labels: SUT; n1, n2, N1, N2; X1, X2; F; RM)

First, we abstract the required inputs from the structural attributes of the SUT (Software Under Test), obtain the parameters $X_1, X_2, \ldots$ by evaluating the function $F$ on this abstraction and, after some conversion, get the output: the RM of the software. From the above we can see that, if the structural information of the software can be obtained beforehand, such a model gives software designers a standard for well-structured programs and gives testers an effective way to estimate the testing cost. Meanwhile, developers can use the model calculations to learn the cost of fulfilling the specification and the client requirements [7]. However, because these models are based on the structure of the system, they cannot measure a software system completely and are not suited to an overall evaluation of software reliability [8].

B. Mechanism of Probabilistic Models

Reliability models usually refer to probabilistic models. Such models regard failures and the elimination of errors as probabilistic events. In the rest of this paper, "reliability model" means a probabilistic model. A probabilistic model relies on the premise that the dynamic behavior of errors in software is uncertain, that is, that software faults are random. What, then, is the essential reason for this randomness? A popular concept model today is the input-output model shown in Figure 2. Its basic theory can be found in reference [9], and software reliability models are based on this theory as well.

Fig. 2 Software concept model (diagram labels: input space I; input space IF; program; output space O)

International Journal Publishers Group (IJPG)
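To make the input-output model of Fig. 2 concrete, the following minimal sketch simulates it under simplifying assumptions that are not part of the paper: the input space $I$ is taken to be the interval [0, 1), the usage profile is uniform, and the fault-causing subset $I_F$ is a single hypothetical interval (`FAULT_REGION`). Each selected input is one random trial; the program itself is deterministic.

```python
import random


def program(x, fault_region):
    """Deterministic mapping from I to O: 'fail' for inputs in I_F, 'ok' otherwise."""
    lo, hi = fault_region
    return "fail" if lo <= x < hi else "ok"


def estimate_failure_probability(n_runs, fault_region, seed=0):
    """Estimate the per-run failure probability by sampling inputs at random."""
    rng = random.Random(seed)      # seeded so the sketch is repeatable
    failures = 0
    for _ in range(n_runs):
        x = rng.random()           # assumption: uniform usage profile on [0, 1)
        if program(x, fault_region) == "fail":
            failures += 1
    return failures / n_runs


# Hypothetical I_F covering 5% of the input space (illustrative value only).
FAULT_REGION = (0.20, 0.25)

p_hat = estimate_failure_probability(100_000, FAULT_REGION)
print(f"estimated failure probability per run: {p_hat:.4f}")
print(f"estimated per-run reliability:         {1 - p_hat:.4f}")
```

With a uniform profile the estimate approaches the ratio of the size of $I_F$ to the size of $I$; a non-uniform usage profile would change the sampling line only.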

A software program is a correspondence that maps an input space to an output space. From the viewpoint of the input space, selecting an input can be regarded as throwing a point: the ratio of the area of $I_F$ to the area of $I$, together with the usage profile, decides the probability that the input falls into $I_F$. Selecting an input and deciding whether it falls in $I_F$ is therefore a Bernoulli trial. We select a point in the input space $I$ according to some stochastic mechanism, and the program maps it to a point in the output space $O$. The program is just a mapping from $I$ to $O$; in mathematical terms, $P: I \to O$. Whenever the input accepted by the program belongs to $I_F$, the space of fault-causing inputs, the output of the program must fall in the space of faulty outputs $O_F$, which means that the program encounters a fault. Because an input falls into $I_F$ at random, it is reasonable to model this stochastic mapping with a stochastic process.

C. Analysis of the Constraints of Software Reliability Models

From the analysis above, two fundamental conditions must be fulfilled to set up a stochastic software reliability model:
a) The input of the software is stochastic: generating a point in space $I$ is a stochastic process;
b) The processing of the program is deterministic: given the same input, the program produces the same output.

If the software system fulfills these two conditions, the stochastic software reliability model is correct. The mapping between the operational space $Op$ and the input space $I$ is a surjective mapping, as shown in Figure 3, and so is the relationship between the failed operational space $Op_F$ and $I_F$. We can then ensure a one-to-one correspondence between operations and inputs, and between inputs and their operational meaning. If the points in space $I$ are generated stochastically, then every kind of operation in the operational space is stochastic. However, the actual use of software is not stochastic; in other words, the generation of operations in $Op$ is not stochastic, because clients use the software system according to the requirements and with a definite purpose. So condition a) cannot be strictly satisfied.

Fig. 3 Relationship among operational space, input space and output space (diagram labels: operating space Op, input space I, output space O; operating space OpF, input space IF, output space OF; program)

The actual output of software is closely related to both the software and its environment, so the output of the program may differ even with the same input. Stochastic software reliability models are based on the failure data of the software, and these data are obtained without considering the development environment, so condition b) cannot be fully satisfied either.

The defects of traditional software reliability models can be seen from the analysis above. Obtaining more accurate models by transforming these two conditions is one of the important research topics on software reliability models. For instance, the proposal of operational profiles [10] brought about many improved models aimed at b) [3] under conditions that satisfy a). However, the stochastic factors of the software failure process are so complicated that a traditional stochastic process is not enough for modeling it. The traditional Markov process model or binomial model only takes the randomness of input profiles into account and ignores the other random factors that affect software reliability, so the existing software reliability models cannot evaluate the reliability of a software system accurately.

3. Measurement of Software Reliability Model Based on Application Analysis

The theory of software reliability models is by far the most successful one in the field of software reliability testing and evaluation. Hudson proposed the Birth and Death Process model [9] in the first paper on software reliability; this model derived a Weibull distribution for the time between failures. Then, in 1972, Jelinski and Moranda studied the software failure rate further; they assumed a piecewise-constant failure rate that is proportional to the number of remaining errors. The parameters in these models are mostly estimated with the methods of classical statistics. In fact, however, the estimates may be unreliable when the distribution of the observations deviates from the assumed model. With the development of software reliability model theory, many modeling methods have appeared, including non-parametric analysis, Bayesian methods, Markov stochastic process methods and so on. There are at least 40 kinds of models so far, and each requires rather strict preconditions to be applied, which constrains the application of software reliability models. In conclusion, two aspects of a software reliability model are of main concern: its precision and its robustness.

A. Precision of Software Reliability Models

Definition 1. The precision of a software reliability model is how precisely the model evaluates software reliability. Suppose the intrinsic reliability of the software is $Re$ and the reliability obtained from model $M$ is $R_M(Re)$; then the precision of the software reliability model is:

$$\mathrm{Precision}(M) = |R_M(Re) - Re|$$

Given $\mathrm{Precision}(M) = 0$, then $R_M(Re) = Re$, and the evaluation of the model is consistent with the intrinsic reliability of the software. This is the ideal situation. In general, the higher the precision of the model (the smaller the value of $\mathrm{Precision}(M)$), the smaller the evaluation error. Given $R_M(Re) > Re$, then $\mathrm{Precision}(M) = R_M(Re) - Re$, and the model overestimates the reliability of the software; we call such a model a Radical Model. Given $R_M(Re) < Re$, then $\mathrm{Precision}(M) = Re - R_M(Re)$, and the model underestimates the reliability of the software; we call such a model a Conservative Model.

The two kinds of models have different applications in practice. For key software systems whose failure would cost users a great loss, the conservative model should be used, although the testing cost increases. For an ordinary software system whose failure has little impact on users, or can be tolerated by them, the radical model can be used. This trade-off between testing cost and failure tolerance is an important reference for choosing reliability models and test strategies.

Great efforts have been made to improve and optimize models in order to solve problems of precision. Although a model can achieve good precision for an individual SUT, the results get much worse for other SUTs. Since precision alone cannot completely reflect the quality of a model, we introduce another feature: the robustness of software reliability models.

B. Robustness of Software Reliability Models

Definition 2. The robustness of a software reliability model is the extent to which the model can adapt to different software systems and maintain its precision, that is, its ability to evaluate different software systems. Let the different SUTs be $S_i$ $(i = 1, 2, \ldots)$. According to the definition, the mean evaluation inaccuracy of the model over the different tested systems is:

$$\mathrm{Applicability}(M) = \frac{1}{n}\sum_{i=1}^{n}\mathrm{Precision}(M_{S_i})$$

The mean variance of the model is:

$$\frac{1}{n}\sum_{i=1}^{n}\bigl(\mathrm{Precision}(M_{S_i}) - \mathrm{Applicability}(M)\bigr)^2$$

$\mathrm{Applicability}(M)$ reflects the mean result of the evaluations over every $S_i$, and the mean variance reflects the stability of the results for every tested system evaluated by the model. The smaller the mean variance, the better the robustness of the model.

At present many experts choose typical software reliability models to make a comprehensive evaluation, in preference to developing dedicated reliability models for specific software projects [11]-[12], in order to improve the reliability of evaluation and prediction. Actually, such a method helps improve the robustness of the model, not its precision, which is a misunderstanding in some research. As discussed above, a software reliability model is confined by the theoretical assumptions made for it, and many of these assumptions are so idealized that they may not reflect the real testing process. So when the precision of a single model is quite high, its robustness is often not satisfactory, and comprehensively improving the robustness of a model is more important than improving the precision of a single model.

4. Classification of Reliability Models Based on Precision and Robustness

Based on the analysis above, the precision of a single model is relatively high while its robustness is not, and this causes the lack of a general model. We therefore introduce comprehensive optimization through model choice: a feasible way is to choose a reliability model on the basis of a model classification.

A. Classical Classification of Software Reliability Models

There are many classification methods based on different criteria. For instance, Musa [13] of Bell Labs divided the models into exponential, Pareto, Weibull, gamma and geometric classes according to the intensity decay curves of the models; in reference [14], Amrit Goel divided models into time-between-failures models, failure-count models, error-seeding models and input-domain-based models according to the features of the failure process. Among these methods, the most meaningful one for software reliability testing is the classification based on randomness. Figure 4 shows this classical classification.

Fig. 4 Classification of Software Reliability Models Based on Randomness (diagram labels: Software reliability models; Deterministic Model: Halstead software metrics model, McCabe complexity metrics model; Stochastic Model: non-homogeneous Poisson process model, G-O Model, SM Model)
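The precision and robustness metrics of Section 3 can be sketched in a few lines of code. This is only an illustration under stated assumptions: the intrinsic reliability $Re$ is generally unknown in practice, so the sketch assumes a reference value is available for each tested system, and the numbers in `evaluations` are hypothetical, not results from the paper. The function names are chosen here for readability.

```python
def precision(r_model, r_intrinsic):
    """Precision(M) = |R_M(Re) - Re| for one tested system."""
    return abs(r_model - r_intrinsic)


def applicability(precisions):
    """Mean evaluation inaccuracy of the model over all tested systems."""
    return sum(precisions) / len(precisions)


def mean_variance(precisions):
    """Mean squared deviation of the per-system precision from Applicability(M)."""
    mean = applicability(precisions)
    return sum((p - mean) ** 2 for p in precisions) / len(precisions)


# Hypothetical (model estimate, intrinsic reliability) pairs for three SUTs.
evaluations = [(0.95, 0.90), (0.85, 0.90), (0.92, 0.90)]

precisions = [precision(r_m, r_e) for r_m, r_e in evaluations]
print("Precision per SUT:", [round(p, 4) for p in precisions])
print("Applicability(M): ", round(applicability(precisions), 4))
print("Mean variance:    ", round(mean_variance(precisions), 6))
```

A smaller `applicability` value means the model is more precise on average, and a smaller `mean_variance` means its precision is more stable across systems, which is how the text reads robustness.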

B. Classification of Software Reliability Models Based on Software Life Cycle

The software life cycle accompanies the software testing life cycle and provides the foundation for the different phases and kinds of testing. A classification of software reliability models based on the software life cycle [15] can therefore give testers a basis for choosing different models in different phases. The goal of the study in [15] was to provide an approach that would help achieve a better confidence level in the model selection process. Additionally, the approach should be applicable throughout the different phases of the Software Development Life Cycle (SDLC), considering a number of criteria (more than previously used) according to their importance, and it should be applicable to various categories of software reliability models. A survey of over thirty models used in practice was conducted for the study, and a classification of software reliability models according to SDLC phases was then presented. Figure 5 shows the detailed category view. As the figure shows, different models can be applied to different phases and even to the same phase. This is also a research method for generalizing the constraints of the models and broadening their applications.

Although the study gives criteria and an algorithm for software reliability model selection, the selected model is still just one of the traditional ones, and, as we know, a single model is not suitable for the reliability prediction of complex software. The integration information system is a complicated software system whose development requires the cooperation of many developers, so testing should start at a very early phase of the software life cycle. As is well known, the earlier testing starts, the lower the cost and the better the reliability [16]. So the model selection described above suits the integration information system better, and developers can choose the proper model in light of their phase and the actual development conditions so as to improve the reliability of the whole software system.

Fig. 5 Software Reliability Model Selection Based on Software Life Cycle (tree diagram over the phases of the software development life cycle; recoverable labels follow)
Early prediction model: model based on phases; Rome Laboratory model; Rayleigh model; Musa prediction model; industrial data collection; history data collection.
Model based on structural system: non-homogeneous Poisson process model; Laprie model; model of Gokhale; reliability simulation of Gokhale.
Other categories: mixed white box model; mixed black box model; software reliability growth model; model based on input field; reliability prediction model based on time structure; reliability growth model based on input field; model based on states; model based on paths; additional model; S-type model; Au linear model.
Models named: Nelson model; Tsoukalas model; Weiss & Weyuker model; Shooman model; K-M model; Yacoub, Cukic and Ammar model; Everett model; Xie and Wohlin model; Yamada S-type model; Gompertz model; Musa basic model; G-O NHPP model; M-O NHPP model; Musa Poisson execution time model; J-M model; L-V model; Wei-Park model; Rayleigh model.

C. Classification of Software Reliability Models Based on Precision and Robustness

According to the analysis of software reliability model metrics, that is, the precision and robustness of models, we can divide the models into the following categories, as shown in Figure 6.

Radical model: a model whose evaluation is too optimistic, that is, the reliability output by the model exceeds the actual reliability of the software. This kind of model applies to software systems that do not require high reliability.

Conservative model: a model whose evaluation is too conservative, that is, the reliability output by the model is lower than the actual reliability. This kind of model applies to systems that require high reliability or are sensitive to failure.

Initial state model: robustness measures the overall behavior of a model over several software systems, so this category means that the model is still in its initial phase, which is active and unstable. Such a model can provide high precision for some software systems but not for all; that is, it generalizes poorly and needs correction and refinement. The category also covers the models proposed in the initial phases of software reliability research.

Steady state model: compared with the initial state model, this kind of model has better robustness, which means it generalizes better. The model has reached a steady state and has seen some application after being amended during long engineering practice, but it still stays in the research phase and has not been used in actual engineering.

Fig. 6 Classification of Reliability Model Based on Precision and Robustness (diagram labels: Software reliability model; Precision of model: radical model, conservative model; Robustness of model: steady state model, initial state model)

The method of classification can guide the evaluation of models for a project in practice and is especially important for key software systems that require high reliability.

5. Validation

A key challenge in software reliability model research is the validation of a model. Validating a software reliability model against software reliability data is hard [1]; reliability, especially software reliability, is an attribute that is hard to measure and hence even harder to validate. As software reliability models can be divided into two basic classes, depending on the type of data the model uses, we conducted four exploratory empirical studies to validate our metrics. The first two used Interval Data Counts (IDC) data sets, which we collected from [17], and the last two used Time-Between-Failure (TBF) data sets, which we collected from [18].

A. Experiment Tool: SMERFS

We used the SMERFS tool to apply software reliability models to the four data sets. SMERFS (Statistical Modeling and Estimation of Reliability Functions for Software) is a program for estimating and predicting the reliability of software during the testing phase, provided by the Naval Surface Weapons Center [19]. It uses failure data to make these predictions. There are two types of models in SMERFS: interval data counts (IDC) models, which use failure counts, and time-between-failures (TBF) models. Typical IDC models include the generalized Poisson model (GPO), the Brooks and Motley Poisson model (BMP) and binomial model (BMB), the GO (also called NHPP) model, and the Yamada delayed S-shaped model (YSS) [20]. Typical TBF models include the Littlewood and Verrall model (LV), the geometric model (GEO), Musa Basic (MB), Jelinski/Moranda (JM), and Musa-Okumoto (MO) [13].

B. Analysis Result

We used several IDC software reliability models to predict the reliability of data set 1 and data set 2. From Fig. 7 to Fig. 10 we can make the following observations:

1. When these reliability models are applied to a data set, not all of them fit.
2. The TBF data sets are smoother than the IDC data sets. As Fig. 7 and Fig. 8 are IDC data sets and Fig. 9 and Fig. 10 are TBF data sets, we can see that the last two are smoother than the first two. In other words, failure time may be easier to estimate than fault number.
3. GO, JM, SM and MB, which could be called exponential-shaped NHPP reliability models, tend to be radical, and LV tends to be conservative, based on model precision. The other models do not show a precision trend.
4. For IDC models, GPO and NHPP tend to be steady state models, and YAM and SDW tend to be initial state models, based on model robustness.
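The kind of classification used in this validation (radical or conservative by precision, steady state or initial state by robustness) can be sketched as a simple rule. The rule below is an assumption made for illustration, not the procedure used in the paper: a model is called radical if it overestimates on every tested data set, conservative if it underestimates on every one, and otherwise has no clear trend; the `variance_threshold` separating steady state from initial state models is a hypothetical parameter, since the paper does not give one.

```python
def classify_model(signed_errors, variance_threshold):
    """Classify a model from its signed errors R_M(Re) - Re, one per data set.

    Returns (precision_class, robustness_class).
    """
    n = len(signed_errors)
    precisions = [abs(e) for e in signed_errors]
    mean = sum(precisions) / n
    variance = sum((p - mean) ** 2 for p in precisions) / n

    # Assumed rule: a trend exists only if the sign of the error never changes.
    if all(e > 0 for e in signed_errors):
        precision_class = "radical"
    elif all(e < 0 for e in signed_errors):
        precision_class = "conservative"
    else:
        precision_class = "no clear trend"

    # Assumed rule: small spread of precision across data sets means steady state.
    if variance <= variance_threshold:
        robustness_class = "steady state"
    else:
        robustness_class = "initial state"
    return precision_class, robustness_class


# Hypothetical signed errors for two models over several data sets.
print(classify_model([0.05, 0.03, 0.04], variance_threshold=1e-3))
print(classify_model([-0.01, -0.20], variance_threshold=1e-3))
```

Other rules are equally defensible, for example using the sign of the mean error, which would match the "tends to be" wording of the observations more loosely.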

Fig. 7 IDC data set 1

Fig. 8 IDC data set 2

Fig. 9 TBF data set 3


Fig. 10 TBF data set 4

As combinations of models can provide more accurate predictions than the individual models themselves [21], the base models of a combination are important. The model precision and model robustness metrics can therefore be useful for choosing appropriate models for combinations.

6. Conclusion

We propose two metrics for software reliability models, precision and robustness, on the basis of the mechanisms of existing software reliability models. The applicability of a model can be analyzed with these two metrics, which also provide a basis for the selection and combination of models. We introduce a classification method for software reliability models based on the software life cycle, with regard to the features of the integration information system. This method can guide model selection in the different phases of the software life cycle. However, as this is a preliminary implementation, there inevitably remain problems that need further research, for instance generating the smallest test suite and determining the testing profiles, which are key points for further work. Besides, the selection decision also has an impact on the evaluation in later periods. From the discussion above, we know that after unit testing, integration testing and acceptance testing are finished, reliability testing by a third party can help ensure the improvement of software quality and give clients confidence in the software product. There is no doubt that this increases the cost and prolongs the development cycle, but for a software system as important as the integration information system, we believe it is worth the price.

Resources of the PLA Software Test and Evaluation Centre for Military Training were used in this research.

References

[1] M.R. Lyu, Handbook of Software Reliability Engineering, (1996) IEEE Computer Society Press.
[2] A.D. Denton, Accurate Software Reliability Estimation, (1999) Master of Science Thesis, Colorado State University, Fort Collins, Colorado, Fall.
[3] X. Zhang, M.Y. Shin & H. Pham, Exploratory analysis of environmental factors for enhancing the software reliability assessment, (2001) Journal of Systems and Software, vol. 57, pp. 73-78.
[4] H. Pham, System Software Reliability, (2006) Springer.
[5] M.H. Halstead, Elements of Software Science, (1977) Elsevier, New York.
[6] T.J. McCabe, A complexity measure, (1976) IEEE Trans. Software Engineering, vol. 2, no. 4.
[7] V.R. Basili & B.T. Perricone, Software errors and complexity: An empirical investigation, (1984) Communications of the ACM, vol. 27, no. 1.
[8] M.A. Friedman & J.M. Voas, Software Assessment: Reliability, Safety, Testability, (1995) John Wiley & Sons, New York.
[9] G.R. Hudson, Program Errors as a Birth and Death Process, (1967) Report SP-3011, Santa Monica, CA: System Development Corporation.
[10] J.D. Musa, Operational Profile in Software-Reliability Engineering, (1993) IEEE Software, vol. 1, no. 10, pp. 14-32.
[11] Z.F. Zhong, L.H. Qing & W. Lin, Integrated Model of Software Reliability, (2003) Wuhan University Journal, vol. 36, no. 1, pp. 105-108.
[12] Z.F. Zhong & X.R. Zuo, Multi-Model Assessment of Software Reliability, (2002) Tongji University Journal, vol. 30, no. 10, pp. 1183-1185.
[13] J.D. Musa, A theory of software reliability and its applications, (1975) IEEE Trans. on Software Engineering, vol. 1, no. 3.
[14] A.L. Goel, Software reliability models: Assumptions, limitations and applicability, (1985) IEEE Transactions on Software Engineering, 12, pp. 1411-1425.
[15] C.A. Asad, M.I. Ullah & M.J. Rehman, An Approach for Software Reliability Model Selection, (2004) Annual International Computer Software and Applications Conference, 28.
[16] M.R. Lyu, Handbook of Software Reliability Engineering, IEEE Computer Society Press and McGraw-Hill Book Company, 1996.
[17]
[18]
[19]
[20] S. Yamada & S. Osaki, Software Reliability Growth Modeling: Models and Applications, (1985) IEEE Trans. Soft. Eng., vol. 11, no. 12, pp. 1431-1437.
[21] S. Inoue & S. Yamada, Change-point Modeling for Software Reliability Assessment Depending on Two Types of Reliability Growth Factors, (2011) IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Japan.
