
Abstract:

Software reliability and production bugs are major problems for software developers, so the field needs accurate bug and reliability prediction. Around 18 NHPP software reliability growth models (SRGMs) have been proposed to date. Reliability decides whether development work should continue or be abandoned, so reliability prediction should be as accurate as possible. This investigation shows that SVM can be a better alternative for reliability forecasting. We tested the NHPP models and SVM on future predictability, measured their accuracy, and ranked them on the basis of RPE, MSE and R². Three different datasets are used to test the proposed model.

1. Introduction
Reliability is a key software quality attribute, and achieving it is one of the major problems faced by the software industry. Reliable software is scarce in the market while demand for it is growing rapidly. Developing reliable software is not easy: resources are limited and requirements are often unrealistic, so they take time to be met. It is also hard to determine whether delivered software is reliable or not. Nearly 30 software reliability models have been proposed over time to obtain a quantitative measure of software reliability during development.

SRGMs (software reliability growth models) have proven successful in estimating software reliability and the number of errors remaining in the software. Using SRGMs, one can predict the future reliability of software before it is even developed, and one can also guide the testing and debugging process. However, the present models, be it the Yamada model, the S-shaped model or the Goel model, assume that the fault process follows a specific kind of curve, which is not always true and is somewhat unrealistic. As a result, these growth models can be inaccurate and insufficient to fit the actual software failure data for reliability assessment.

SVM (support vector machine) is widely used for classification, pattern recognition and forecasting. It has proven useful in real-world applications, is known for classifying and generalising well in most cases, and is adept at modelling non-linear functional relationships that are difficult to capture with SRGMs. We propose SVM as a growth model and use it here to predict the bugs that will be encountered and the software reliability at any time during the development of the software.

2. Software Reliability Growth Models

Here we present the real projects to which SVM is applied for software reliability growth modelling and prediction. Three software failure datasets, data1, data2 and data3, are used: data1 has 81 data pairs, data2 has 111 data pairs and data3 has 81 data pairs. All three datasets are first normalised to the range [0, 1]. The data are given in Section 4. The input of the SVR function is the normalised successive failure occurrence time, and its output is the normalised accumulated failure count. We use SVRSRG to refer to the SVR-based software reliability growth model.
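As a sketch of this preprocessing step (NumPy-based; the variable names are ours, and the five sample pairs are the first entries of the first dataset in Section 4):

```python
import numpy as np

def normalise(x):
    """Min-max scale a 1-D array to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# First five (time, cumulative-bug) pairs of the first dataset in Section 4.
times = np.array([1, 2, 3, 4, 5])
bugs = np.array([7, 8, 36, 45, 60])

X = normalise(times).reshape(-1, 1)  # SVR input: normalised failure times
y = normalise(bugs)                  # SVR output: normalised cumulative failures
```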

Table 1

MODEL NAME              FORMULA
Goel-Okumoto            m(t) = a(1 - e^(-bt))
Generalized Goel        m(t) = a(1 - e^(-b*t^c))
Musa-Okumoto            m(t) = a*ln(1 + bt)
Delayed S-shaped        m(t) = a(1 - (1 + bt)*e^(-bt))
Inflection S-shaped     m(t) = a(1 - e^(-bt)) / (1 + β*e^(-bt))
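The mean value functions in Table 1 translate directly into code. A minimal sketch (function names are ours; the parameters a, b, c and β must be fitted to failure data, e.g. by gradient descent, before the models can be used for prediction):

```python
import numpy as np

# Mean value functions m(t) for the NHPP models listed in Table 1.
# a = expected total number of faults, b = fault detection rate.

def goel_okumoto(t, a, b):
    return a * (1 - np.exp(-b * t))

def generalized_goel(t, a, b, c):
    return a * (1 - np.exp(-b * t**c))

def musa_okumoto(t, a, b):
    return a * np.log(1 + b * t)

def delayed_s_shaped(t, a, b):
    return a * (1 - (1 + b * t) * np.exp(-b * t))

def inflection_s_shaped(t, a, b, beta):
    return a * (1 - np.exp(-b * t)) / (1 + beta * np.exp(-b * t))
```

All five functions start at m(0) = 0 and saturate towards the total fault count a as t grows.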

3. Support Vector Machine Model


The concept of the support vector machine was proposed by Vapnik, with foundations dating to the late 1960s, for the classification of data. The principle was extended to regression problems by the addition of an alternative loss function. SVR works on the idea of mapping the dataset into a higher-dimensional space to convert the task into a linear problem; this is done by the addition of kernels to the original SVC. The regression function for the linear problem is as follows:

f(x) = w·x + b        (1)

where x is the feature vector of a data point, and w and b are the weight vector and the bias. The weight and bias are estimated by minimising a cost function: the best line is given by the w and b for which the cost function is minimal. The cost function is given as follows:

R = (1/2)||w||^2 + C Σ_i L_ε(y_i, f(x_i))        (2)

where

L_ε(y, f(x)) = |y − f(x)| − ε   if |y − f(x)| ≥ ε;   0 otherwise        (3)

where C and ε are predefined constants: C sets the trade-off between the training error and the model complexity, and L_ε is the ε-insensitive loss function. The term (1/2)||w||^2 is a measure of the complexity (flatness) of the regression function. By introducing the slack variables ξ_i and ξ_i*, which represent the distances from observed values to the corresponding boundary of the ε-tube, the cost function is transformed into a simpler one. When an observed point lies above the ε-tube, ξ_i is the positive difference between the observed value and the upper boundary f(x_i) + ε; when it lies below the tube, ξ_i* is the corresponding difference from the lower boundary f(x_i) − ε. The new optimization problem can now be defined as:

Minimize:

(1/2)||w||^2 + C Σ_i (ξ_i + ξ_i*)        (4)

subject to:

y_i − (w·x_i + b) ≤ ε + ξ_i
(w·x_i + b) − y_i ≤ ε + ξ_i*
ξ_i, ξ_i* ≥ 0
Finally, the result of the regression function at any sample x is given by:

f(x) = Σ_i (α_i − α_i*) K(x_i, x) + b        (5)

Here α_i and α_i* are the Lagrange multipliers, and K(x_i, x_j) is the kernel function: the inner product of the two vectors x_i and x_j in the feature space, i.e., K(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩. Kernel functions must satisfy Mercer's condition. In this study we have used ### and ### kernels.
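The dual-form prediction in Eq. (5) is simple to evaluate once the support vectors and dual coefficients are known. A minimal sketch, assuming an RBF kernel purely for illustration (the study's kernels are left unspecified above, and the function names are ours; dual_coefs holds the differences α_i − α_i*):

```python
import numpy as np

def rbf_kernel(x_i, x_j, gamma=1.0):
    """A common kernel choice: K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    return np.exp(-gamma * np.sum((np.asarray(x_i) - np.asarray(x_j)) ** 2))

def svr_predict(x, support_vectors, dual_coefs, b, gamma=1.0):
    """Evaluate Eq. (5): f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + b
```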

4. Data of Software Failures

TIME BUGS    TIME BUGS    TIME BUGS    TIME BUGS
             20 196       40 345       61 414
1 7 21 200 41 350 62 419
2 8 22 214 42 352 63 420
3 36 23 223 43 356 64 423
4 45 24 246 44 367 65 429
5 60 25 257 45 373 66 440
6 74 26 277 46 373 67 443
7 82 27 283 47 378 68 448
8 98 28 286 48 381 69 454
9 106 29 292 49 383 70 456
10 115 30 297 50 384 71 456
11 120 31 301 51 384 72 456
12 134 32 302 52 387 73 457
13 139 33 310 53 387 74 458
14 142 34 317 54 387 75 459
15 145 35 319 55 388 76 459
16 153 36 323 56 393 77 459
17 167 37 324 57 398 78 460
18 174 38 338 58 400 79 460
19 183 39 342 59 407 80 460
Table 2. Dataset 2 (Failure data from real-time control system [23])

TIME BUGS    TIME BUGS    TIME BUGS    TIME BUGS    TIME BUGS    TIME BUGS
             20 211       40 346       60 460       80 473       100 477
1 5 21 217 41 367 61 463 81 473 101 477
2 10 22 226 42 375 62 463 82 473 102 477
3 15 23 230 43 381 63 464 83 473 103 478
4 20 24 234 44 401 64 464 84 473 104 478
5 26 25 236 45 411 65 465 85 473 105 478
6 34 26 240 46 414 66 465 86 473 106 479
7 36 27 243 47 417 67 465 87 475 107 479
8 43 28 252 48 425 68 466 88 475 108 479
9 47 29 254 49 430 69 467 89 475 109 480
10 49 30 259 50 431 70 467 90 475 110 480
11 80 31 263 51 433 71 467 91 475 111 481
12 84 32 264 52 435 72 468 92 475
13 108 33 268 53 437 73 469 93 475
14 157 34 271 54 444 74 469 94 475
15 171 35 277 55 446 75 469 95 475
16 183 36 293 56 446 76 469 96 476
17 191 37 309 57 448 77 470 97 476
18 200 38 324 58 451 78 472 98 476
19 204 39 331 59 453 79 472 99 476

Table 3. Dataset 3 (Failure data from online bug tracking system [30])

TIME BUGS    TIME BUGS    TIME BUGS    TIME BUGS
             20 43        40 55        60 88
1 9 21 43 41 56 61 92
2 12 22 44 42 59 62 94
3 16 23 45 43 60 63 94
4 25 24 45 44 60 64 94
5 27 25 46 45 60 65 99
6 29 26 47 46 61 66 102
7 29 27 47 47 61 67 104
8 32 28 49 48 62 68 105
9 34 29 50 49 62 69 105
10 35 30 50 50 62 70 106
11 36 31 50 51 62 71 106
12 36 32 50 52 64 72 107
13 39 33 51 53 65 73 108
14 39 34 52 54 66 74 108
15 40 35 53 55 73 75 109
16 40 36 54 56 76 76 112
17 40 37 55 57 81 77 113
18 41 38 55 58 83 78 113
19 42 39 55 59 87 79 115
5. Results and Discussion

Software reliability growth models capture the pattern of a dataset in a specific, fixed form. They are therefore less flexible to fluctuations in the pattern of bug production, which leads to less accurate reliability estimates. To overcome this problem, we need a model or technique that can map the data in a more flexible way and generate more accurate results. Machine learning uses existing data to predict future patterns; we used two of its concepts, gradient descent and the support vector machine, to attack this problem. SVM maps the data into a higher-dimensional space, separated by a hyperplane, to turn a non-linear problem into a linear one. We built our own software reliability growth model and optimised it using SVM. We used the relative predictability error criterion to compare the reliability of the existing models with the proposed model. The models were trained manually using the gradient descent algorithm, as this gave more accurate results than SPSS. We selected 7 well-known models (described in Table 1) from the software engineering literature for ranking. The selected models were trained on 3 different datasets, as mentioned in Table 2, Table 3 and Table 4. The models were first trained on 70% of the data, a prediction was made for the whole dataset, and the Mean Square Error (MSE) was calculated. This process was repeated at 80%, 90% and 100% for all of the datasets. After the MSE was found, the Relative Predictability Error (RPE) was calculated, and finally these results were compared to the results of SVM.
The tables present the RPE and the MSE. RPE is evaluated as (predicted(final) − actual(final)) / actual(final), and MSE is determined from the formula sum((actual − predicted)²) / (number of terms).
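A minimal sketch of these two criteria (function names are ours):

```python
import numpy as np

def mse(actual, predicted):
    """Mean square error: sum((actual - predicted)^2) / number of terms."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean((a - p) ** 2))

def rpe(actual_final, predicted_final):
    """Relative predictability error on the final cumulative failure count."""
    return (predicted_final - actual_final) / actual_final
```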
The parameters ν and C of the SVR used in the experiment (which is a ν-SVR) were optimised by cross-validation. In our evaluation process, for each dataset we first trained the model on 100%, 90%, 80% and 70% of the data individually, then predicted the complete dataset and evaluated the relative predictability error (RPE), which indicates the future predictive power of the model. We used RPE and MSE to rank the models, so each model receives four fitness scores (for 100%, 90%, 80% and 70% training). From the experiment it is evident that SVRSRG achieves a higher fitness score on these datasets than the older models: both the relative predictability error and the mean square error of SVRSRG are smaller than those of the older SRGMs.
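As an illustrative sketch of this evaluation procedure using scikit-learn's NuSVR (the parameter grid and the synthetic example curve below are our own assumptions; the study tunes ν and C by cross-validation but does not give the exact grid):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import NuSVR

def fit_prefix_and_score(times, bugs, fraction):
    """Train nu-SVR on the first `fraction` of a failure dataset, then
    predict the whole (normalised) dataset and return the MSE."""
    t = (times - times.min()) / (times.max() - times.min())
    y = (bugs - bugs.min()) / (bugs.max() - bugs.min())
    n = int(len(t) * fraction)
    # Hypothetical grid; the paper only states that nu and C were tuned
    # by cross-validation.
    grid = GridSearchCV(NuSVR(), {"nu": [0.25, 0.5, 0.75], "C": [1, 10, 100]}, cv=3)
    grid.fit(t[:n].reshape(-1, 1), y[:n])
    pred = grid.predict(t.reshape(-1, 1))
    return float(np.mean((y - pred) ** 2))

# Example on a synthetic, smoothly saturating failure curve
# (the real datasets are given in Section 4).
times = np.arange(1, 31)
bugs = 460 * (1 - np.exp(-0.1 * times))
score = fit_prefix_and_score(times, bugs, 0.7)
```

Repeating this at fractions 0.7, 0.8, 0.9 and 1.0 for each dataset reproduces the four fitness scores per model described above.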

6. Conclusion
We first applied the existing models to calculate software reliability and recorded the results; then we applied the proposed model, i.e. SVM, to the same datasets. The results obtained by SVM had smaller forecasting errors than the NHPP software reliability growth models. Owing to its non-linear mapping, SVM captures the pattern of any dataset and produces better results with lower RPE. SVM's parameters play a major role in the prediction: improper parameter selection can lead to overfitting or underfitting of the model.

[23] Hwang, S., & Pham, H. (2009). Quasi-renewal time-delay fault-removal consideration in software reliability modeling. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 39(1), 200-209.
[30] Yang, J., et al. (2016). Modeling and analysis of reliability of multi-release open source software incorporating both fault detection and correction processes. Journal of Systems and Software, 115, 102-110.
