
A Data-Driven Model for Software Reliability Prediction

Author: Jung-Hua Lo
IEEE International Conference on Granular Computing (2012)

Presented by Young Taek Kim
KAIST SE Lab.
9/4/2013
Contents

Introduction
Background
Overall Approach
Detailed Process
Experimental Results
Conclusion
Discussion


SW Reliability Prediction

Definition of SW Reliability
• Probability of failure-free operation of a software product in a specified environment for a specified time.

SRM (Software Reliability Model)
• To estimate how reliable the software is now.
• To predict the reliability in the future.

Two categories of SRMs
• Analytical models: NHPP (Non-Homogeneous Poisson Process) SRMs
• Data-driven models: ARIMA, SVM


Data-Driven Model

Limitations of analytical models
• Software behavior changes during the testing phase.
• The assumption that "all faults are independent & equally detectable" is violated by real datasets.

Advantages of data-driven models
• Far fewer impractical assumptions: the model is developed directly from collected failure data.
• Abstractions and generalizations of the SW failure process are easy to make, via regression or time-series analysis.


Motivation

Problems
• An actual SW failure data set is rarely purely linear or purely nonlinear.
• No single model is suitable for all situations.

Proposed solution
• A hybrid strategy combining a linear and a nonlinear prediction model:
  - ARIMA model: good performance in predicting linear data
  - SVM model: successfully applied to nonlinear data


Stationarity

Statistical properties (mean, variance, covariance, etc.) are all constant over time:
(1) E(y_t) = μ_y for all t
(2) Var(y_t) = E[(y_t − μ_y)²] = σ_y² for all t
(3) Cov(y_t, y_{t−k}) = γ_k for all t

[Figure: a non-stationary series (μ₁, σ₁², γ₁ ≠ μ₂, σ₂², γ₂) is transformed by differencing into a stationary series (μ₁, σ₁², γ₁ = μ₂, σ₂², γ₂)]


ACF (Autocorrelation Function)

The correlation between observations at different distances apart (lag k):

r_k = Σ_{t=k+1}^{n} (y_t − ȳ)(y_{t−k} − ȳ) / Σ_{t=1}^{n} (y_t − ȳ)²,  where ȳ = (Σ_{t=1}^{n} y_t) / n

A code sketch for computing r_k follows below.
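As a minimal illustration (not from the original slides; NumPy and the helper name `acf` are assumptions), the sample ACF formula above translates directly to code:

```python
import numpy as np

def acf(y: np.ndarray, max_lag: int) -> np.ndarray:
    """Sample autocorrelation r_k for k = 1..max_lag, per the formula above."""
    y_bar = y.mean()
    denom = np.sum((y - y_bar) ** 2)  # sum_{t=1}^{n} (y_t - y_bar)^2
    return np.array([
        np.sum((y[k:] - y_bar) * (y[:-k] - y_bar)) / denom  # sum over t = k+1..n
        for k in range(1, max_lag + 1)
    ])

# Example: the ACF of an AR(1)-like series decays roughly geometrically.
rng = np.random.default_rng(0)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.7 * y[t - 1] + rng.normal()
print(acf(y, max_lag=5))
```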


PACF (Partial ACF)

The degree of association between y_t and y_{t−k} when the effects of the intermediate time lags 1, 2, 3, …, k−1 are removed. It is computed recursively from the autocorrelations:

r_kk = r_1                                                           if k = 1
r_kk = (r_k − Σ_{j=1}^{k−1} r_{k−1,j} r_{k−j}) / (1 − Σ_{j=1}^{k−1} r_{k−1,j} r_j)   if k = 2, 3, …

where r_kj = r_{k−1,j} − r_kk · r_{k−1,k−j} for j = 1, 2, …, k−1.


Removing Non-stationarity

Differencing
• Differenced series: Δy_t = y_t − y_{t−1}
• Repeated until the ACF/PACF of the differenced series look stationary (a sketch follows below).
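A minimal differencing sketch, assuming NumPy (the data and the choice d = 1 are illustrative):

```python
import numpy as np

def difference(y: np.ndarray, d: int = 1) -> np.ndarray:
    """Apply first-order differencing d times: y_t -> y_t - y_{t-1}."""
    for _ in range(d):
        y = np.diff(y)  # each pass shortens the series by one point
    return y

# A series with a linear trend is non-stationary; one difference removes the trend.
t = np.arange(100, dtype=float)
y = 2.0 * t + np.random.default_rng(1).normal(size=100)
dy = difference(y, d=1)
print(dy.mean())  # the differenced series fluctuates around the slope (~2.0)
```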


3 Prediction Models for Stationary Data

AR (Auto Regressive) Model
• Uses past values in the forecast
• AR(p): y_t = α_1 y_{t−1} + α_2 y_{t−2} + ⋯ + α_p y_{t−p} + ε_t

MA (Moving Average) Model
• Uses past residuals (random events) in the forecast
• MA(q): y_t = ε_t + β_1 ε_{t−1} + ⋯ + β_q ε_{t−q}

ARMA (Auto Regressive & Moving Average) Model
• Combination of AR & MA
• ARMA(p, q): y_t = α_1 y_{t−1} + ⋯ + α_p y_{t−p} + ε_t + β_1 ε_{t−1} + ⋯ + β_q ε_{t−q}
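For intuition, a simulation sketch of the three processes, assuming statsmodels (not cited in the slides); note that statsmodels expects lag polynomials, so AR coefficients are negated:

```python
from statsmodels.tsa.arima_process import ArmaProcess

# Lag-polynomial form: ar = [1, -alpha_1, ...], ma = [1, beta_1, ...]
ar1  = ArmaProcess(ar=[1, -0.7], ma=[1])        # AR(1):  y_t = 0.7*y_{t-1} + e_t
ma1  = ArmaProcess(ar=[1], ma=[1, 0.4])         # MA(1):  y_t = e_t + 0.4*e_{t-1}
arma = ArmaProcess(ar=[1, -0.7], ma=[1, 0.4])   # ARMA(1, 1): both terms

for name, proc in [("AR(1)", ar1), ("MA(1)", ma1), ("ARMA(1,1)", arma)]:
    y = proc.generate_sample(nsample=300)
    print(name, "stationary:", proc.isstationary, "sample var:", round(y.var(), 2))
```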


AR (Auto Regressive) Model (1/2)

AR(p)
• y_t = α_1 y_{t−1} + α_2 y_{t−2} + ⋯ + α_p y_{t−p} + ε_t
  - α_i: autocorrelation coefficient
  - ε_t: error at time t

Selection of a model
• ACF decreasing exponentially
  - Directly: 0 < α < 1
  - Oscillating pattern: −1 < α < 0
• PACF identifies the order of the AR model
  - Cut off at lag 1 → AR(1)

[Figures: ACF for an AR(1) data series decays exponentially; PACF cuts off after lag 1 (both with 5% significance limits)]


MA (Moving Average) Model (1/2)

MA(q)
• y_t = ε_t + β_1 ε_{t−1} + ⋯ + β_q ε_{t−q}
  - β_i: MA parameter
  - ε_t: error at time t

Example: 3-period moving average of yearly sales, e.g. MA(3) for 2003 = (1000 + 1500 + 1250) / 3 = 1250

  Year   Sales(B$)   MA(3)
  2000   1000        -
  2001   1500        -
  2002   1250        -
  2003    900        1250
  2004   1600        1217
  2005    950        1250
  2006   1650        1150
  2007   1750        1400
  2008   1200        1450
  2009   2000        1533
  2010   2100        1650
  2011    -          1767

[Figure: line chart of Sales(B$) and MA(3), 2000-2011]


MA (Moving Average) Model (2/2)

Selection of a model
• ACF identifies the order of the MA model
  - Cut off at lag 1 → MA(1)
• PACF decreasing exponentially
  - Directly: 0 < a < 1
  - Oscillating pattern: −1 < a < 0

[Figures: ACF for an MA(1) data series cuts off after lag 1; PACF decays exponentially (both with 5% significance limits)]


ARMA Model

ARMA(p, q) = AR(p) + MA(q)
• y_t = α_1 y_{t−1} + α_2 y_{t−2} + ⋯ + α_p y_{t−p} + ε_t + β_1 ε_{t−1} + ⋯ + β_q ε_{t−q}

Procedure for model identification
• The ACF/PACF patterns serve as a guideline to determine p and q for ARMA (see the identification table in the ARIMA Process slide).


ARIMA Model

Auto Regressive Integrated Moving Average (Box and Jenkins, 1970)
• A linear model for forecasting time-series data: future values are a linear function of several past observations.
• ARIMA(p, d, q)
  - Auto Regression of order p
  - Integrated differencing of order d (extends the model to non-stationary time series)
  - Moving Average of order q
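A minimal fitting sketch, assuming statsmodels; the order (1, 1, 1) and the synthetic data are illustrative choices, not from the paper:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Illustrative data: a noisy upward drift, so one difference (d = 1) is natural.
rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(loc=1.0, scale=2.0, size=120))

res = ARIMA(y, order=(1, 1, 1)).fit()   # p = 1, d = 1, q = 1 (illustrative)
print(res.params)                       # estimated AR, MA, and variance parameters
print("next 5 forecasts:", res.forecast(steps=5))
```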


SVM (Support Vector Machine)

Proposed by Vladimir N. Vapnik (1995)
An algorithm (or recipe) for maximizing a particular mathematical function with respect to a given collection of data.

4 key concepts:
• Separating hyperplane
• Maximum-margin hyperplane
• Soft margin
• Kernel function


Separating Hyperplane

f(x, w, b) = sign(w·x + b)
• Points with w·x + b > 0 are labeled +1; points with w·x + b < 0 are labeled −1.
• The separating hyperplane (= classifier) is the boundary w·x + b = 0.

[Figure: two classes of points separated by a hyperplane]


Maximum Margin

f(x, w, b) = sign(w·x + b)
• M = margin width between the two classes.
• Support vectors are the data points that the margin pushes up against.
• Only the support vectors are used to specify the separating hyperplane!

[Figure: maximum-margin hyperplane, with support vectors x⁺ and x⁻ lying on the margin boundaries]

Kernel Function (1/2)

Nonlinear SVMs
• Datasets that are linearly separable with some noise work out great.
• But what are we going to do if the dataset is just too hard to separate?
• How about mapping the data to a higher-dimensional space, e.g. x → (x, x²)?

[Figure: 1-D points that are not separable on the x axis become separable after mapping to (x, x²)]


Kernel Function (2/2)

Nonlinear SVMs: feature spaces
• General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is linearly separable: Φ: x → φ(x)
• Definition of a kernel function: a function that corresponds to an inner product in some expanded feature space.
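A minimal sketch of a kernel SVM used for regression (the role it plays in this paper), assuming scikit-learn, which the slides do not mention; all hyperparameter values are illustrative:

```python
import numpy as np
from sklearn.svm import SVR

# Illustrative nonlinear 1-D regression problem.
rng = np.random.default_rng(3)
X = np.sort(rng.uniform(0, 6, size=80)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.normal(size=80)

# The RBF kernel maps inputs implicitly into a high-dimensional feature space;
# C, epsilon, gamma are the hyperparameters the paper tunes with a GA.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma=0.5).fit(X, y)
print("train R^2:", round(svr.score(X, y), 3))
```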


Genetic Algorithm

Search & optimization technique
• By J. Holland, 1975
• Based on Darwin's principle of natural selection

Basic operations (a code sketch follows below)
• Selection
• Crossover
• Mutation

[Flowchart: create initial random population (potential solutions) → evaluate fitness of each individual → if an optimal or "good" solution is found, END; otherwise apply selection (killing unfit individuals), crossover, and mutation, then re-evaluate]
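A compact GA sketch for tuning the SVR parameters (C, ε, gamma), the role the GA plays in this paper; this is an assumption-laden illustration (population size, parameter ranges, and mutation scale are invented), not the paper's exact procedure:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(4)

def fitness(genes, X_tr, y_tr, X_va, y_va):
    """Higher is better: negative validation MSE of an SVR with these genes."""
    C, eps, gamma = genes
    model = SVR(kernel="rbf", C=C, epsilon=eps, gamma=gamma).fit(X_tr, y_tr)
    return -np.mean((model.predict(X_va) - y_va) ** 2)

def evolve(X_tr, y_tr, X_va, y_va, pop_size=20, generations=15):
    # Genes (C, epsilon, gamma), sampled log-uniformly in plausible ranges.
    pop = np.exp(rng.uniform(np.log([0.1, 1e-3, 1e-2]),
                             np.log([100.0, 1.0, 10.0]),
                             size=(pop_size, 3)))
    for _ in range(generations):
        fit = np.array([fitness(g, X_tr, y_tr, X_va, y_va) for g in pop])
        parents = pop[np.argsort(fit)[-pop_size // 2:]]        # selection
        kids = []
        while len(kids) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(3) < 0.5, a, b)        # crossover
            child = child * np.exp(rng.normal(0, 0.2, size=3)) # mutation
            kids.append(child)
        pop = np.vstack([parents, kids])
    return max(pop, key=lambda g: fitness(g, X_tr, y_tr, X_va, y_va))
```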


Overall Approach (1/2)

[Flowchart: two branches run over the same failure data set.
• ARIMA branch (linear forecasting): model identification → parameter estimation → model checking on the residuals; if the model is not satisfactory, return to identification; the trained ARIMA model produces the linear forecast.
• SVM branch (nonlinear forecasting): a random initial population of chromosomes encodes the initial SVM parameters → train the SVM model on the nonlinear residuals → fitness evaluation → if the stop criteria are not met, apply genetic operations and retrain; the trained SVM model produces the nonlinear forecast.
The two forecasts are summed to give the software reliability prediction.]


Overall Approach (2/2)

X_t = L_t + N_t
• X_t: time series data
• L_t: linear part of the time series
• N_t: nonlinear part of the time series

After ARIMA model processing, we obtain L̂_t and ε_t:
• L̂_t: predicted value of the ARIMA model
• ε_t: residual at time t from the linear model, ε_t = X_t − L̂_t

Finally, the residuals ε_t are modeled by the SVM model tuned with a GA (Genetic Algorithm); an end-to-end sketch follows below.
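An end-to-end sketch of the hybrid scheme, assuming statsmodels and scikit-learn; the ARIMA order, lag length, and SVR settings are illustrative assumptions (the paper tunes the SVM part with a GA instead of fixing it):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.svm import SVR

def hybrid_forecast(x, order=(1, 1, 1), n_lags=4):
    """One-step-ahead forecast: ARIMA supplies the linear part L_t,
    an SVR on the ARIMA residuals supplies the nonlinear part N_t."""
    arima = ARIMA(x, order=order).fit()
    l_hat = arima.predict(start=0, end=len(x) - 1)    # in-sample linear fit
    resid = x - l_hat                                 # eps_t = X_t - L_hat_t

    # Lagged residual features: predict eps_t from eps_{t-1} .. eps_{t-n_lags}.
    X_feat = np.column_stack([resid[i:len(resid) - n_lags + i]
                              for i in range(n_lags)])
    y_tgt = resid[n_lags:]
    svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X_feat, y_tgt)

    next_linear = arima.forecast(steps=1)[0]                     # L_hat
    next_resid = svr.predict(resid[-n_lags:].reshape(1, -1))[0]  # N_hat
    return next_linear + next_resid

# Example on a synthetic cumulative-failure-like series:
x = np.cumsum(np.abs(np.random.default_rng(5).normal(5, 2, size=60)))
print("predicted next cumulative failure count:", round(hybrid_forecast(x), 1))
```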


ARIMA Process (1/2)

[Flowchart: Data Set → Model Identification → Parameter Estimation → model checking; if not satisfied, return to identification; otherwise → SW Reliability Prediction]

Model Identification
• Stationarize the input data: differencing (determines d), with ACF/PACF checking.
• Determine the values of p and q from the ACF/PACF patterns:

         MA(q)          AR(p)          ARMA(p, q)
  ACF    Cuts after q   Tails off      Tails off
  PACF   Tails off      Cuts after p   Tails off

Parameter Estimation
• MLE (Maximum Likelihood Estimation): find the parameter set θ₁, θ₂, …, θₖ that maximizes L(θ₁, θ₂, …, θₖ) = f(x₁, x₂, …, x_N; θ₁, θ₂, …, θₖ)


ARIMA Process (2/2)

[Flowchart: Data Set → Model Identification → Parameter Estimation → model checking → SW Reliability Prediction]

Model checking: residual randomness
• The residuals of a well-fitted model will be random and follow the normal distribution.
• Check the residuals' ACF and PACF; if they are not white noise, return to model identification (a test sketch follows below).
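One common way to operationalize this residual-randomness check is a Ljung-Box test on the fitted model's residuals; the slides only mention ACF/PACF inspection, so using Ljung-Box here is an assumption:

```python
from statsmodels.stats.diagnostic import acorr_ljungbox

# `res` is a fitted ARIMA results object (see the ARIMA fitting sketch earlier).
# Large p-values => no evidence of leftover autocorrelation in the residuals.
print(acorr_ljungbox(res.resid, lags=[10]))
```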


SVM Process (1/2)

[Flowchart: random initial population (chromosome 1 … chromosome N, the initial parameters) → train SVM model on the nonlinear residuals → fitness evaluation → stop criteria? If no, apply genetic operations and retrain; if yes, the trained SVM model performs the nonlinear forecasting]

• Due to the randomness of the input data, the initial population is selected at random.
  - The chromosomes encode the SVM parameters, e.g. C, ε, σ.
• The data set is divided into two parts: training & testing data.


SVM Process (2/2)

• The higher the fitness value, the higher the survivability: high-fitness candidate chromosomes are retained and combined to produce new offspring.
• GA is applied to the SVM parameter search because:
  - there is no theoretical method for determining a kernel function and its parameters, and
  - there is no a priori knowledge for setting the penalty parameter C.
• Applied GA operations: crossover and mutation (see the GA sketch earlier).



Experimental Results (1/2)

Collected data: cumulative number of failures, x_i, at time t_i
• Data Set (DS-1)
  - RADC (Rome Air Development Center) project reported by Musa
  - 21 weeks of testing, 136 observed failures

Output: predicted value x̂_{i+1}, computed from (x_1, x_2, …, x_i); an evaluation-loop sketch follows below.

[Figures: goodness-of-fit curves and relative-error curves for DS-1]
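A rolling one-step-ahead evaluation sketch matching this setup (the warm-up length and the relative-error definition RE = (x̂ − x)/x are assumptions, as the slides do not state them):

```python
import numpy as np

def rolling_one_step(x, predict_next):
    """Predict x_{i+1} from (x_1, ..., x_i) for each i, as in the experiments.
    `predict_next` is any forecaster, e.g. hybrid_forecast from earlier."""
    preds, rel_err = [], []
    for i in range(10, len(x) - 1):                 # 10-point warm-up (assumption)
        p = predict_next(x[: i + 1])
        preds.append(p)
        rel_err.append((p - x[i + 1]) / x[i + 1])   # relative-error curve value
    return np.array(preds), np.array(rel_err)
```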



Experimental Results (2/2)

Collected data: cumulative number of failures, x_i, at time t_i
• Data Set (DS-2)
  - 28 weeks of SW testing, 234 observed failures

Output: predicted value x̂_{i+1}, computed from (x_1, x_2, …, x_i)

[Figures: goodness-of-fit curves and relative-error curves for DS-2]



Conclusion

Proposed a hybrid methodology for forecasting software reliability
• It exploits the unique strengths of both the ARIMA model and the SVM model.

Test results
• Showed improved prediction performance.


Discussion

Pros
• Provides a possible solution to the difficulty of selecting an SRM.
• Improves SW reliability prediction performance.

Cons
• Does not present detailed test methods (e.g., stop criteria for the SVM, parameter-estimation criteria for ARIMA).

Thank you!
