Professional Documents
Culture Documents
KAIST SE Lab.
9/4/2013
Contents
Introduction
Background
Overall Approach
Detailed Process
Experimental Results
Conclusion
Discussion
SW Reliability Prediction
Definition of SW Reliability
Probability of failure-free operation of a software
product in a specified environment for a specified
time.
SRM (Software Reliability Model)
To estimate how reliable the software is now.
To predict the reliability in the future.
Two categories of SRMs
Analytical Models: NHPP SRMs
Data-Driven Models: ARIMA, SVM
Motivation
Problems
Actual SW failure data sets are rarely purely linear or
purely nonlinear
No general model is suitable for all situations
Proposed Solution
Hybrid strategy combining a linear and a nonlinear
prediction model
• ARIMA model: Good performance in predicting linear data
• SVM model: Successful application to nonlinear data
Stationarity
Statistical properties (mean, variance,
covariance, etc.) are all constant over time.
(1) E(yt) = μy for all t.
(2) Var(yt) = E[(yt − μy)²] = σy² for all t.
(3) Cov(yt, yt−k) = γk for all t.
[Charts: a non-stationary series (left) becomes stationary after differencing (right), so μ1, σ1², γ1 = μ2, σ2², γ2 across time segments]
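Differencing as a stationarizing step can be sketched numerically; the trended series below is an illustrative stand-in, not data from the slides:

```python
import numpy as np

# A series with a linear trend: its mean changes over time, so it is
# not stationary.  (Illustrative example only.)
rng = np.random.default_rng(0)
t = np.arange(200)
y = 0.5 * t + rng.normal(0.0, 1.0, size=200)

# First difference: y'_t = y_t - y_{t-1}
dy = np.diff(y)

# Compare the mean of the first and second halves of each series.
drift_before = abs(y[:100].mean() - y[100:].mean())   # large: mean drifts
drift_after = abs(dy[:100].mean() - dy[100:].mean())  # small: mean is stable
```

After differencing, both halves share (approximately) the same mean, variance, and autocovariance, which is exactly the stationarity condition above.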
ACF (Autocorrelation Function)
rk = Σt=k+1..n (yt − ȳ)(yt−k − ȳ) / Σt=1..n (yt − ȳ)², where ȳ = Σt=1..n yt / n
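The rk formula above can be implemented directly; the white-noise and random-walk test series are illustrative assumptions, not the deck's data:

```python
import numpy as np

def acf(y, k):
    """Sample autocorrelation r_k from the formula above."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    if k == 0:
        return 1.0
    # sum over t = k+1..n of (y_t - ybar)(y_{t-k} - ybar)
    num = np.sum(dev[k:] * dev[:-k])
    return num / denom

rng = np.random.default_rng(1)
white = rng.normal(size=500)   # white noise: r_k near 0 for k > 0
walk = np.cumsum(white)        # random walk: r_1 near 1
```

A near-zero ACF at all positive lags is what a stationary, uncorrelated series looks like; slowly decaying values signal non-stationarity.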
PACF (Partial Autocorrelation Function)
Removing Non-stationarity
Differencing
Differenced series: Δyt = yt − yt−1
[Chart: ACF/PACF of the differenced series]
Selection of a model
ACF of AR model: exponentially decreasing
• Directly: 0 < a < 1
• Oscillating pattern: −1 < a < 0
PACF of AR model: cuts off; the cut-off lag gives the order p of the AR model
[Charts: ACF and PACF for an AR(1) data series; the PACF cuts off at lag 1]
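The PACF cut-off described above can be checked numerically. Computing the PACF at lag k as the last coefficient of a lag-k regression is an equivalent formulation not spelled out on the slide, and the AR(1) series is simulated:

```python
import numpy as np

def pacf_at(y, k):
    """PACF at lag k: the coefficient on y_{t-k} when y_t is regressed
    on y_{t-1}, ..., y_{t-k} (plus an intercept)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # column j holds y_{t-(j+1)} for t = k..n-1
    X = np.column_stack([y[k - 1 - j : n - 1 - j] for j in range(k)])
    X = np.column_stack([np.ones(n - k), X])
    coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return coef[-1]

# Simulated AR(1): y_t = 0.7 y_{t-1} + e_t.  The PACF should be large
# at lag 1 and cut off (near zero) from lag 2 onward.
rng = np.random.default_rng(2)
y = np.zeros(2000)
for t in range(1, 2000):
    y[t] = 0.7 * y[t - 1] + rng.normal()
```

The lag at which the PACF drops to (statistical) zero is read off as the AR order p.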
ACF of MA model: cuts off; the cut-off lag gives the order q of the MA model
PACF of MA model: exponentially decreasing
• Directly: 0 < a < 1
• Oscillating pattern: −1 < a < 0
[Charts: ACF and PACF for an MA(1) data series, with 5% significance limits for the partial autocorrelations; the ACF cuts off at lag 1]
ARMA Model
ARMA(p,q) = AR(p) + MA(q)
𝑦𝑡 = α1 𝑦𝑡−1 + α2 𝑦𝑡−2 + ⋯ + α𝑝 𝑦𝑡−𝑝 + 𝜀𝑡 + 𝛽1 𝜀𝑡−1 + ⋯ + 𝛽𝑞 𝜀𝑡−𝑞
Procedures for model identification
• Guideline to determine p and q for ARMA
ARIMA Model
Auto Regressive Integrated Moving Average
(By Box and Jenkins (1970))
Linear model for forecasting time-series data:
future values are a linear function of several past observations.
ARIMA(p, d, q)
• Auto regression of order p
• Integrated differencing of order d (extends the model to non-stationary time series)
• Moving average of order q
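A minimal ARIMA(1, 1, 0) forecast can be sketched on simulated data, assuming a hand-rolled least-squares fit rather than the full Box and Jenkins procedure:

```python
import numpy as np

# ARIMA(1, 1, 0) sketch: difference once (d = 1), fit an AR(1) on the
# differences (p = 1, q = 0), then integrate the forecast back.
rng = np.random.default_rng(3)
n = 300
inc = np.zeros(n)
for t in range(1, n):
    inc[t] = 0.5 * inc[t - 1] + rng.normal()
y = 10.0 + np.cumsum(inc)            # non-stationary level series

# d = 1: the first difference restores stationarity.
diffs = np.diff(y)

# p = 1: least-squares estimate of the AR coefficient on the differences.
phi = np.sum(diffs[1:] * diffs[:-1]) / np.sum(diffs[:-1] ** 2)

# Forecast the next difference, then add it back to the last level
# (the "integrated" step).
forecast = y[-1] + phi * diffs[-1]
```

The "I" in ARIMA is exactly the cumulative-sum step that undoes the differencing when forecasts are mapped back to the original scale.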
Separating Hyperplane
f(x, w, b) = sign(w · x + b)
• Points with w · x + b > 0 are labeled +1
• Points with w · x + b < 0 are labeled −1
The separating hyperplane is the classifier.
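The decision rule can be written out directly; the weight vector and bias below are made-up values, not fitted ones:

```python
import numpy as np

def classify(x, w, b):
    """f(x, w, b) = sign(w . x + b): +1 on one side of the hyperplane,
    -1 on the other (boundary points fall to -1 here)."""
    return 1 if np.dot(w, x) + b > 0 else -1

# Hyperplane x1 + x2 - 1 = 0, i.e. w = (1, 1), b = -1 (arbitrary demo).
w, b = np.array([1.0, 1.0]), -1.0
```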
Maximum Margin
f(x, w, b) = sign(w · x + b)
M = margin width between the +1 and −1 points
Support vectors are those data points that the margin pushes up against.
Only support vectors are used to specify the separating hyperplane!
How about mapping the data to a higher-dimensional space?
[Chart: 1-D points on the x axis become linearly separable after adding an x² coordinate]
Φ: x → φ(x)
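A toy version of the mapping, assuming the classic x → (x, x²) feature map; the data, labels, and the threshold 2.5 are arbitrary illustrative choices:

```python
import numpy as np

def phi(x):
    """Feature map: x -> (x, x^2)."""
    return np.array([x, x * x])

# 1-D data: class -1 sits between the two class +1 clusters, so no
# single threshold on x alone separates them.
xs     = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [ 1,    1,   -1,  -1,  -1,   1,   1 ]

# After mapping, the second coordinate x^2 separates the classes with
# the linear threshold x^2 = 2.5.
preds = [1 if phi(x)[1] > 2.5 else -1 for x in xs]
```

This is the intuition behind kernel SVMs: a problem that is nonlinear in the input space can become linearly separable in the feature space φ(x).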
Genetic Algorithm
Search & optimization technique
[Flowchart: create an initial random population (potential solutions) → selection (retain or kill) → crossover → mutation → repeat]
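The loop above can be sketched in a few lines; the fitness function and all GA settings (population size, generation count, mutation width) are arbitrary illustrative choices, not the paper's:

```python
import random

# Minimal GA following the slide's loop: initial random population ->
# selection -> crossover -> mutation, repeated until a stop criterion
# (a fixed generation count here).  Toy task: maximize f(x) = -(x-3)^2.
random.seed(42)

def fitness(x):
    return -(x - 3.0) ** 2

pop = [random.uniform(-10, 10) for _ in range(30)]   # initial population
for generation in range(100):                        # stop criterion
    pop.sort(key=fitness, reverse=True)
    survivors = pop[:10]                             # selection: keep the fittest
    children = []
    while len(children) < 20:
        a, b = random.sample(survivors, 2)
        child = (a + b) / 2.0                        # crossover: blend two parents
        child += random.gauss(0.0, 0.1)              # mutation: small perturbation
        children.append(child)
    pop = survivors + children

best = max(pop, key=fitness)
```

In the paper's setting the "chromosome" would encode SVM parameters and the fitness would be prediction accuracy; here both are simplified to a scalar toy problem.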
[Flowchart: genetic operations repeat until the stop criteria are met, then software reliability prediction is performed]
Data Set
Stationarize input data
- Differencing, determine d
- ACF, PACF checking
Model Identification
Data Set
Model Identification
Parameter Estimation
Residual randomness check
- Residuals of the well-fitted model will be random and follow the normal distribution
- Check ACF and PACF
Is model checking satisfied? If no, return to model identification; if yes, proceed to SW reliability prediction
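Parameter estimation plus the residual randomness check can be sketched for an AR(1) stand-in (a deliberate simplification of the full ARIMA fit; the series is simulated):

```python
import numpy as np

# Simulated AR(1) data: y_t = 0.6 y_{t-1} + e_t
rng = np.random.default_rng(4)
n = 1000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + rng.normal()

# Parameter estimation: least-squares slope of y_t on y_{t-1}.
phi = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)

# Residual randomness check: residuals of a well-fitted model should
# look like white noise, i.e. their lag-1 autocorrelation is near 0.
resid = y[1:] - phi * y[:-1]
dev = resid - resid.mean()
r1 = np.sum(dev[1:] * dev[:-1]) / np.sum(dev ** 2)
```

If the residual ACF/PACF showed significant structure instead, the flowchart loops back to model identification with a different (p, d, q).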
Chromosome 1 … Chromosome N: initial parameters
o The higher the fitness value, the higher the chance of survival
o High-fitness candidate chromosomes are retained and combined to produce new offspring
Training the SVM model on the nonlinear residual
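The hybrid idea can be sketched end to end on synthetic data; a small kernel ridge regressor stands in for the SVM, and the GA parameter search is omitted:

```python
import numpy as np

# Hybrid sketch: a linear model captures the trend; a kernel model
# (kernel ridge here, standing in for the SVM) learns the nonlinear
# residual.  Series, kernel width, and ridge strength are all made up.
rng = np.random.default_rng(5)
t = np.arange(120, dtype=float)
y = 0.3 * t + 2.0 * np.sin(t / 5.0) + rng.normal(0.0, 0.1, t.size)

train, test = slice(0, 100), slice(100, 120)

# Step 1: linear model on time.
A = np.column_stack([np.ones(100), t[train]])
coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
linear_pred = coef[0] + coef[1] * t

# Step 2: kernel ridge on the residual of the linear model.
resid = y[train] - linear_pred[train]

def rbf(a, b, gamma=0.05):
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

K = rbf(t[train], t[train])
alpha = np.linalg.solve(K + 1e-2 * np.eye(100), resid)
resid_pred = rbf(t, t[train]) @ alpha

# Step 3: hybrid forecast = linear part + predicted residual.
hybrid = linear_pred + resid_pred

err_linear = np.mean((y[test] - linear_pred[test]) ** 2)
err_hybrid = np.mean((y[test] - hybrid[test]) ** 2)
```

The design point this illustrates: the linear stage handles the part of the series ARIMA is good at, and the residual stage only has to model what is left over.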
Conclusion
Proposed a hybrid methodology for forecasting
software reliability:
exploits the unique strengths of the ARIMA model and
the SVM model
Test results
showed improved prediction performance
Discussion
Pros
Provides a possible solution to the difficulty of SRM
selection
Improves SW reliability prediction performance
Cons
Does not present detailed test methods (e.g., stop criteria
for the SVM, parameter estimation criteria for ARIMA)
Thank you!