The research questions
Is there a model that describes the defect occurrence pattern?
Given this model, how can we predict model parameters for the next release?
[Figure: defect occurrences vs. months for release N+1; the curve after "Now" is unknown (?)]
The research approach
In the context of widely-deployed
production software:
Perform analysis to develop hypotheses
concerning models/methods
Use real world data to empirically test
hypotheses
The data
User-reported defects in 22 releases of
four widely-deployed production
software systems:
8 releases of a commercial operating system
3 releases of a commercial middleware system
8 releases of an open source operating system
(OpenBSD)
3 releases of an open source middleware system
(Jakarta Tomcat)
Relation to prior work
Software reliability modeling and software
certification:
Assume software and hardware configurations and
deployment and usage patterns are known
Prediction of the total number of defects and
identification of defect-prone modules:
Produce results that are insufficient for maintenance
planning and software insurance
No work on projecting defect occurrence
rates for open source software systems
Part 1: which model to use?
[Figure: defect occurrences vs. months; the curve after "Now" is unknown (?)]
Previously published models
Model type    Reference                       Model form
Exponential   Goel & Okumoto [1979]           $\lambda(t) = N \alpha e^{-\alpha t}$
Weibull       Schick-Wolverton [1978]         $\lambda(t) = N \alpha \beta t^{\alpha-1} e^{-\beta t^{\alpha}}$
Gamma         Yamada, Ohba, & Osaki [1983]    $\lambda(t) = N \beta^{\alpha} t^{\alpha-1} e^{-\beta t}$
Power         Duane [1964]                    $\lambda(t) = \alpha \beta e^{-\beta t}$
Logarithmic   Musa-Okumoto [1975]             $\lambda(t) = \alpha (\alpha \beta t + 1)^{-1}$
[Model shape: one plot per model]
Callouts on the model form: total number of defect occurrences (N); increasing component, dominates when t is small; decreasing component, dominates when t is large.
Model comparison
[Figure: defect occurrences with the fitted Weibull model (figure label: 83)]
Conclusion: the Weibull model is the best
Has the best AIC score in 73% of the
releases
Is within the 95% C.I. of the best AIC
score in 95% of the releases
Is good despite differences in the type of
system, style of development, and the
kind of data
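The AIC-based comparison above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes least-squares fits and the common form AIC = n ln(RSS/n) + 2k, and the function names, the monthly counts, and the fitted values are all made up for the example.

```python
import math

def aic_least_squares(actual, fitted, num_params):
    """AIC for a least-squares fit, assuming the form n * ln(RSS / n) + 2k."""
    n = len(actual)
    rss = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return n * math.log(rss / n) + 2 * num_params

def best_model(actual, candidates):
    """candidates maps a model name to (fitted values, number of parameters)."""
    scores = {name: aic_least_squares(actual, fitted, k)
              for name, (fitted, k) in candidates.items()}
    # Lower AIC is better: it balances goodness of fit against parameter count.
    return min(scores, key=scores.get), scores

# Illustrative (made-up) monthly defect counts and two hypothetical fits.
observed = [2, 6, 9, 8, 5, 3]
candidates = {
    "exponential": ([8, 7, 6, 5, 4, 3], 2),  # parameters: N, alpha
    "weibull": ([2, 6, 8, 8, 6, 3], 3),      # parameters: N, alpha, beta
}
winner, scores = best_model(observed, candidates)
print(winner, scores)
```

The extra Weibull parameter costs 2 AIC points here, so it only wins when its fit is enough better to pay for that penalty.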
Part 2: how to extrapolate
model parameters?
Weibull: $\lambda(t) = N \alpha \beta t^{\alpha-1} e^{-\beta t^{\alpha}}$
[Figure: defect occurrences vs. months, up to "Now"]
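To make the role of the three Weibull parameters concrete, here is a minimal sketch that evaluates the rate λ(t) and its integral, N(1 − e^(−β t^α)), month by month. The function names and the parameter values are hypothetical, chosen only for illustration; they are not fitted to any of the releases in the data.

```python
import math

def weibull_rate(t, n_total, alpha, beta):
    """Defect occurrence rate: N * alpha * beta * t^(alpha-1) * exp(-beta * t^alpha)."""
    return n_total * alpha * beta * t ** (alpha - 1) * math.exp(-beta * t ** alpha)

def expected_defects(t, n_total, alpha, beta):
    """Integral of the rate from 0 to t: N * (1 - exp(-beta * t^alpha))."""
    return n_total * (1.0 - math.exp(-beta * t ** alpha))

# Hypothetical parameter values, for illustration only.
N, ALPHA, BETA = 100.0, 2.0, 0.02

# Expected defect occurrences in each of the first 24 months.
monthly = [
    expected_defects(m, N, ALPHA, BETA) - expected_defects(m - 1, N, ALPHA, BETA)
    for m in range(1, 25)
]
peak_month = monthly.index(max(monthly)) + 1
print(peak_month, round(sum(monthly), 2))
```

With α > 1 the monthly counts rise, peak, and then decay, which is the shape the increasing and decreasing components produce together; N bounds the total.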
Parameter extrapolation methods
[Figure: defect occurrences vs. months between t1 and t2, showing the previous, projected, and actual curves, the uninformed guess, the projection difference, and the baseline difference]
Defect projection evaluation
System release        one release  two releases  three releases  four releases  five releases  six releases  seven releases
Open source OS R2.8   1.06         0.70
Open source OS R2.9   1.32         0.93          1.04
Open source OS R3.0   0.87         0.42          0.43            0.44
Open source OS R3.1   0.72         0.70          0.73            0.71           0.73
Open source OS R3.2   0.76         0.91          0.87            0.99           0.97           1.02
Open source OS R3.3   1.56         1.10          0.85            0.86           0.66           0.66          0.57
Variance Bias
Residual standard error
Compares model fits with different numbers
of parameters.
Accounts for variance and bias.
Approximately follows a χ² (chi-squared) distribution.
A difference of 4 corresponds to roughly a 95% confidence interval.
The Theil forecasting statistic
[Figure: historical releases and the current release, with points A1, A2, and P2]
Theil forecasting statistic:
$$\text{Theil statistic} = \frac{\sqrt{\sum (\text{Actual} - \text{Predicted})^2}}{\sqrt{\sum \text{Actual}^2}}$$
where Actual = A2 − A1 and Predicted = P2 − A1.
Perfect forecast: P2 = A2
(Actual − Predicted) = ((A2 − A1) − (P2 − A1)) = ((A2 − A1) − (A2 − A1)) = 0 → Theil statistic of 0
Uninformed forecast: P2 = A1
(Actual − Predicted) = ((A2 − A1) − (P2 − A1)) = ((A2 − A1) − (A1 − A1)) = (A2 − A1) = Actual → Theil statistic of 1
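The two limiting cases above can be checked numerically. The following is a minimal sketch, with a hypothetical function name and made-up values for A1 and A2, that computes the statistic on changes relative to the previous value, as in the slide's derivation.

```python
import math

def theil_statistic(previous, actual, predicted):
    """sqrt(sum((Actual - Predicted)^2)) / sqrt(sum(Actual^2)), where
    Actual = A2 - A1 and Predicted = P2 - A1 are changes from the previous value."""
    actual_change = [a2 - a1 for a1, a2 in zip(previous, actual)]
    predicted_change = [p2 - a1 for a1, p2 in zip(previous, predicted)]
    numerator = math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual_change, predicted_change))
    )
    denominator = math.sqrt(sum(a ** 2 for a in actual_change))
    return numerator / denominator

# Made-up values for illustration: A1 (previous) and A2 (actual).
a1 = [10.0, 12.0, 9.0]
a2 = [14.0, 11.0, 13.0]

perfect = theil_statistic(a1, a2, a2)     # P2 = A2
uninformed = theil_statistic(a1, a2, a1)  # P2 = A1
print(perfect, uninformed)
```

A value below 1 therefore means the projection beats the uninformed guess of "no change", and a value above 1 means it does worse.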