
Measuring Software Product Quality

Eric Bouwers

T +31 20 314 0950


June 20, 2013
info@sig.eu
www.sig.eu
Software Improvement Group

Who are we?
•  Highly specialized advisory company for cost,
quality and risks of software
•  Independent and therefore able to give objective
advice

What do we do?
•  Fact-based advice supported by our automated
toolset for source code analysis
•  Analysis across technologies by use of technology-
independent methods

Our mission:
We give you control over your software.
Services


Software Risk Assessment


•  In-depth investigation of software quality and risks
•  Answers specific research questions

Software Monitoring
•  Continuous measurement, feedback, and decision support
•  Guard quality from start to finish

Software Product Certification


•  Five levels of technical quality
•  Evaluation by SIG, certification by TÜV Informationstechnik

Application Portfolio Analyses


•  Inventory of structure and quality of application landscape
•  Identification of opportunities for portfolio optimization
Today’s topic


Measuring Software Product Quality


Some definitions

Project
•  Activity to create or modify software products
•  Design, Build, Test, Deploy

Product
•  Any software artifact produced to support business processes
•  Either built from scratch, from reusable components, or by customization of “standard” packages

Portfolio
•  Collection of software products in various phases of their lifecycle
•  Development, acceptance, operation, selected for decommissioning
Our claim today

“Measuring the quality of software products is key to
successful software projects and healthy software portfolios”

Today’s topic


Measuring Software Product Quality


The ISO 25010 standard for software quality

ISO 25010 defines eight product quality characteristics:
•  Functional Suitability
•  Performance Efficiency
•  Compatibility
•  Usability
•  Reliability
•  Security
•  Maintainability
•  Portability


Why focus on maintainability?

•  Lifecycle phases: Initiation, Build, Acceptance, Maintenance
•  Initiation, build, and acceptance account for roughly 20% of the costs; maintenance accounts for roughly 80%


The sub-characteristics of maintainability in
ISO 25010

Maintainability is broken down into five sub-characteristics:
•  Modularity
•  Reusability
•  Analysability
•  Modifiability
•  Testability
Measuring maintainability
Some requirements

•  Simple to understand
•  Allow root-cause analyses
•  Technology independent
•  Easy to compute

Suggestions?

Heitlager et al., A Practical Model for Measuring Maintainability, QUATIC 2007
Measuring ISO 25010 maintainability using the
SIG model

The model maps eight source code measurements (Volume, Duplication, Unit size, Unit complexity, Unit interfacing, Module coupling, Component balance, Component independence) onto the maintainability sub-characteristics:
•  Analysability: Volume, Duplication, Unit size, Component balance
•  Modifiability: Duplication, Unit complexity, Module coupling
•  Testability: Unit complexity, Unit size, Component independence
•  Modularity: Module coupling, Component balance, Component independence
•  Reusability: Unit size, Unit interfacing
Measuring maintainability
Different levels of measurement

System

Component

Module

Unit
Source code measurement
Volume


Lines of code
•  Not comparable between technologies

Function Point Analysis (FPA)


•  A.J. Albrecht - IBM - 1979
•  Counted manually
•  Slow, costly, fairly accurate

Backfiring
•  Capers Jones - 1995
•  Convert LOC to FPs
•  Based on statistics per technology
•  Fast, but limited accuracy
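
As a rough illustration of the backfiring idea, the sketch below converts LOC to function points using a table of LOC-per-FP ratios. The class name and ratio values are made up for this example; real conversions rely on the published Capers Jones tables per technology.

// Illustrative sketch of backfiring: converting LOC to function points via
// technology-specific ratios. The ratios below are assumptions for
// illustration only; use real backfiring tables in practice.
import java.util.Map;

public class Backfiring {

    private static final Map<String, Double> LOC_PER_FP = Map.of(
            "java", 53.0,     // assumed illustrative value
            "cobol", 107.0,   // assumed illustrative value
            "sql", 13.0);     // assumed illustrative value

    static double functionPoints(String technology, long linesOfCode) {
        Double ratio = LOC_PER_FP.get(technology.toLowerCase());
        if (ratio == null) {
            throw new IllegalArgumentException("No backfiring ratio for " + technology);
        }
        return linesOfCode / ratio;
    }

    public static void main(String[] args) {
        // 80,000 LOC of Java converted to an approximate function point count
        System.out.printf("80,000 LOC of Java ~ %.0f FP%n", functionPoints("java", 80_000));
    }
}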
Source code measurement
Duplication

 0: abc     34: xxxxx
 1: def     35: def
 2: ghi     36: ghi
 3: jkl     37: jkl
 4: mno     38: mno
 5: pqr     39: pqr
 6: stu     40: stu
 7: vwx     41: vwx
 8: yz      42: xxxxxx

The block def … vwx (seven identical consecutive lines) occurs in both fragments and is counted as duplicated code.
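
A minimal sketch of how such duplication can be measured: count the lines that are part of a block of at least six identical consecutive lines occurring more than once. The six-line block length and the lack of normalization are assumptions of this sketch, not necessarily SIG's exact implementation.

// Sketch: percentage of lines that are part of a block of at least
// `blockSize` identical consecutive lines occurring more than once.
import java.util.*;

public class Duplication {

    static double duplicationPercentage(List<String> lines, int blockSize) {
        if (lines.isEmpty()) return 0.0;
        // Map every window of `blockSize` consecutive lines to its start positions.
        Map<String, List<Integer>> blocks = new HashMap<>();
        for (int i = 0; i + blockSize <= lines.size(); i++) {
            String block = String.join("\n", lines.subList(i, i + blockSize));
            blocks.computeIfAbsent(block, k -> new ArrayList<>()).add(i);
        }
        // Mark all lines covered by a window that occurs more than once.
        boolean[] duplicated = new boolean[lines.size()];
        for (List<Integer> starts : blocks.values()) {
            if (starts.size() > 1) {
                for (int start : starts) {
                    Arrays.fill(duplicated, start, start + blockSize, true);
                }
            }
        }
        long duplicatedLines = 0;
        for (boolean d : duplicated) if (d) duplicatedLines++;
        return 100.0 * duplicatedLines / lines.size();
    }
}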
Source code measurement
Component balance

[Figure: four example component breakdowns (components A-D) with component balance values 0.0, 0.1, 0.2, and 0.6]
A measure for the number and relative size of architectural elements:
•  CB = SBO × CSU
•  SBO = system breakdown optimality, computed as the distance from an ideal breakdown
•  CSU = component size uniformity, computed with the Gini coefficient

E. Bouwers, J.P. Correia, A. van Deursen, and J. Visser, A Metric for Assessing Component Balance of Software Architectures, in Proceedings of the 9th Working IEEE/IFIP Conference on Software Architecture (WICSA 2011)
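
A hedged sketch of part of this computation: CSU as one minus the Gini coefficient of the component sizes, combined as CB = SBO × CSU. The SBO term is taken as an input here because its exact definition (distance from an ideal breakdown) follows the WICSA 2011 paper; treat the code as an illustration rather than SIG's implementation.

// Sketch of component balance CB = SBO * CSU, with CSU = 1 - Gini(sizes).
import java.util.Arrays;

public class ComponentBalance {

    // Gini coefficient of component sizes (0 = perfectly uniform, 1 = maximally skewed).
    static double gini(long[] sizes) {
        long[] s = sizes.clone();
        Arrays.sort(s);
        double n = s.length, total = 0, weighted = 0;
        for (int i = 0; i < s.length; i++) {
            total += s[i];
            weighted += (i + 1) * (double) s[i];
        }
        return (2 * weighted) / (n * total) - (n + 1) / n;
    }

    // sbo: system breakdown optimality, computed as in the paper (input here).
    static double componentBalance(double sbo, long[] componentSizes) {
        double csu = 1.0 - gini(componentSizes);   // component size uniformity
        return sbo * csu;
    }

    public static void main(String[] args) {
        long[] sizes = {12_000, 11_500, 13_000, 12_500};   // four similarly sized components
        System.out.printf("CB = %.2f%n", componentBalance(1.0, sizes));
    }
}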
From measurement to rating
A benchmark based approach

Measurements from a benchmark of systems (e.g. 0.14, 0.96, …, 0.84, 0.09) are sorted (0.96, 0.84, …, 0.14, 0.09) and thresholds are derived from the sorted distribution:

Threshold    Score
0.9          ★★★★★
0.8          ★★★★☆
0.5          ★★★☆☆
0.3          ★★☆☆☆
0.1          ★☆☆☆☆

Example: a measured value of 0.34 is rated ★★☆☆☆.
Note: example thresholds
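
A minimal sketch of this final mapping step, using the example thresholds above (illustrative values from the slide, not calibrated SIG thresholds):

// Map a benchmark-derived metric value to a 1-5 star rating using the
// example thresholds 0.9 / 0.8 / 0.5 / 0.3 / 0.1.
public class Rating {

    static int stars(double score) {
        double[] thresholds = {0.9, 0.8, 0.5, 0.3, 0.1};
        int rating = 5;
        for (double t : thresholds) {
            if (score >= t) return rating;
            rating--;
        }
        return 1;   // below all thresholds: lowest rating
    }

    public static void main(String[] args) {
        // Prints "2 stars", matching the 0.34 -> two-star example above.
        System.out.println(stars(0.34) + " stars");
    }
}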
But what about the measurements
on lower levels?

Several of the measurements in the model (unit size, unit complexity, unit interfacing, module coupling) are taken per unit or module rather than per system:
•  Analysability: Volume, Duplication, Unit size, Component balance
•  Modifiability: Duplication, Unit complexity, Module coupling
•  Testability: Unit complexity, Unit size, Component independence
•  Modularity: Module coupling, Component balance, Component independence
•  Reusability: Unit size, Unit interfacing
Source code measurement
Logical complexity

•  T. McCabe, IEEE Transactions on Software Engineering, 1976

•  Academic: number of independent paths per method


•  Intuitive: number of decisions made in a method
•  Reality: the number of if statements (and while, for, ...)

[Figure: an example method annotated with "McCabe: 4"; see the sketch below]
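
For illustration, a small Java method of the kind the figure shows: one for the method itself plus three decision points (the loop and the two ifs) gives a McCabe value of 4. Counting conventions differ slightly per tool, so treat this as an illustration.

// Example method with McCabe (cyclomatic) complexity 4:
// 1 (base) + 3 decision points (the for loop and the two ifs).
public class McCabeExample {

    static int countPositiveEven(int[] values) {
        int count = 0;
        for (int v : values) {          // +1
            if (v > 0) {                // +1
                if (v % 2 == 0) {       // +1
                    count++;
                }
            }
        }
        return count;
    }
}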
My question …


How can we aggregate this?


Option 1: Summing

                 Crawljax     GOAL    Checkstyle    Springframework
Total McCabe         1814     6560          4611              22937
Total LOC            6972    25312         15994              79474
Ratio               0.260    0.259         0.288              0.288


Option 2: Average

                    Crawljax     GOAL    Checkstyle    Springframework
Average McCabe          1.87     2.45          2.46               1.99

[Histogram: number of methods per McCabe value, ranging from 1 up to 32; the distribution is strongly skewed towards low values, which the average does not show]
Option 3: quality profile

Cyclomatic complexity    Risk category
1-5                      Low
6-10                     Moderate
11-25                    High
> 25                     Very high

Sum the lines of code per risk category and express each as a percentage of the total lines of code:

Low    Moderate    High    Very high
70%    12%         13%     5%

[Bar chart: unit complexity quality profiles of Springframework, Checkstyle, GOAL, and Crawljax, 0-100% of LOC per risk category]
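
A sketch of how such a quality profile can be computed: each unit's lines of code are added to the risk bin of its cyclomatic complexity, and the bins are reported as percentages of the total. The Unit record and the thresholds come from the table above; this is an illustration, not SIG's tooling.

// Build a quality profile: distribute lines of code over risk categories
// based on the cyclomatic complexity of each unit (1-5 / 6-10 / 11-25 / >25).
import java.util.*;

public class QualityProfile {

    record Unit(int mcCabe, int linesOfCode) {}

    static Map<String, Double> profile(List<Unit> units) {
        Map<String, Long> locPerCategory = new LinkedHashMap<>();
        for (String c : List.of("Low", "Moderate", "High", "Very high")) {
            locPerCategory.put(c, 0L);
        }
        long totalLoc = 0;
        for (Unit u : units) {
            String category = u.mcCabe() <= 5 ? "Low"
                    : u.mcCabe() <= 10 ? "Moderate"
                    : u.mcCabe() <= 25 ? "High" : "Very high";
            locPerCategory.merge(category, (long) u.linesOfCode(), Long::sum);
            totalLoc += u.linesOfCode();
        }
        // Convert the LOC per category into percentages of the total LOC.
        Map<String, Double> percentages = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : locPerCategory.entrySet()) {
            percentages.put(e.getKey(), totalLoc == 0 ? 0.0 : 100.0 * e.getValue() / totalLoc);
        }
        return percentages;
    }
}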

First Level Calibration


The formal six-step process

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010
Visualizing the calculated metrics

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010
Choosing a weight metric

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010
Calculate for a benchmark of systems

[Figure: cumulative weighted metric distribution with the 70%, 80%, and 90% points marked]

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010
SIG Maintainability Model
Derivation of metric thresholds

1.  Measure systems in benchmark
2.  Summarize all measurements
3.  Derive thresholds that bring out the metric's variability
4.  Round the thresholds

Derive & round, e.g. for cyclomatic complexity:

Cyclomatic complexity    Risk category
1-5                      Low
6-10                     Moderate
11-25                    High
> 25                     Very high

[Figures from the paper: (a) metric distribution and (b) box plot per risk category, for Module Inward Coupling (file fan-in) and Module Interface Size (number of methods per file)]

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010
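
A simplified sketch of this derivation, following the idea of Alves et al.: weight each unit's metric value by its lines of code and read thresholds at 70%, 80%, and 90% of the cumulative weight. The published method also normalizes weights per system; this sketch pools all units for brevity.

// Simplified benchmark-based threshold derivation: sort units by metric
// value, weight by LOC, and pick the metric values at the given cumulative
// weight quantiles (which must be passed in ascending order).
import java.util.*;

public class ThresholdDerivation {

    record Unit(double metricValue, long linesOfCode) {}

    static double[] deriveThresholds(List<Unit> benchmark, double... quantiles) {
        List<Unit> sorted = new ArrayList<>(benchmark);
        sorted.sort(Comparator.comparingDouble(Unit::metricValue));
        long totalWeight = sorted.stream().mapToLong(Unit::linesOfCode).sum();

        double[] thresholds = new double[quantiles.length];
        long cumulativeWeight = 0;
        int q = 0;
        for (Unit unit : sorted) {
            cumulativeWeight += unit.linesOfCode();
            while (q < quantiles.length && cumulativeWeight >= quantiles[q] * totalWeight) {
                thresholds[q++] = unit.metricValue();
            }
        }
        return thresholds;   // e.g. deriveThresholds(units, 0.70, 0.80, 0.90)
    }
}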
The quality profile

Cyclomatic complexity    Risk category
1-5                      Low
6-10                     Moderate
11-25                    High
> 25                     Very high

Sum the lines of code per risk category and express each as a percentage of the total lines of code:

Low    Moderate    High    Very high
70%    12%         13%     5%

[Bar chart: unit complexity quality profiles of Springframework, Checkstyle, GOAL, and Crawljax, 0-100% of LOC per risk category]

Second Level Calibration


How to rank quality profiles?
Unit Complexity profiles for 20 random systems

[Figure: unit complexity profiles of 20 random systems; how should they be divided over the ratings ★★★★★, ★★★★☆, ★★★☆☆, ★★☆☆☆, ★☆☆☆☆?]

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011
Ordering by highest-category is not enough!

[Figure: the same profiles ordered by the percentage in the highest risk category only; the resulting assignment to the ratings ★★★★★ … ★☆☆☆☆ is unsatisfactory]

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011
A better ranking algorithm

•  Order categories
•  Define thresholds of given systems
•  Find smallest possible thresholds

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011
Which results in a more natural ordering

[Figure: the 20 profiles ranked with the improved algorithm, yielding a natural ordering from ★★★★★ down to ★☆☆☆☆]

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011
Second level thresholds
Unit size example

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011
SIG Maintainability Model
Mapping quality profiles to ratings

1.  Calculate quality profiles for the systems in the benchmark
2.  Sort the quality profiles
3.  Select thresholds based on a 5% / 30% / 30% / 30% / 5% distribution over the ratings ★★★★★, ★★★★☆, ★★★☆☆, ★★☆☆☆, ★☆☆☆☆

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011
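
A sketch of how a quality profile can then be mapped to a rating: a rating is awarded when the percentages in the Moderate, High, and Very high categories stay below the maxima calibrated for that rating. The threshold numbers below are placeholders for illustration, not SIG's calibrated second-level thresholds.

// Map a quality profile (percentages of LOC per risk category) to a star
// rating by checking it against per-rating maximum percentages. The numbers
// in MAX_PERCENTAGES are placeholders, not calibrated SIG values.
public class ProfileRating {

    // Rows: 5, 4, 3, 2 stars; columns: max % Moderate, max % High, max % Very high.
    private static final double[][] MAX_PERCENTAGES = {
            {25, 0, 0},     // 5 stars  (placeholder values)
            {30, 5, 0},     // 4 stars
            {40, 10, 0},    // 3 stars
            {50, 15, 5},    // 2 stars
    };

    static int rate(double moderate, double high, double veryHigh) {
        int stars = 5;
        for (double[] max : MAX_PERCENTAGES) {
            if (moderate <= max[0] && high <= max[1] && veryHigh <= max[2]) {
                return stars;
            }
            stars--;
        }
        return 1;   // profile exceeds even the 2-star limits
    }

    public static void main(String[] args) {
        // The 70 / 12 / 13 / 5 example profile shown earlier.
        System.out.println(rate(12, 13, 5) + " stars");
    }
}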
SIG measurement model
Putting it all together

The ratings are built up in four steps:
a.  source code measurements are summarized into quality profiles
b.  quality profiles are converted into property ratings (★☆☆☆☆ … ★★★★★)
c.  property ratings are combined into quality ratings for the maintainability sub-characteristics
d.  the quality ratings are combined into an overall maintainability rating
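
A sketch of steps c and d, using the property-to-sub-characteristic mapping reconstructed earlier and a plain average as a simplification of the model's aggregation (the exact weighting is not specified on the slide):

// Aggregate property ratings into sub-characteristic ratings and an overall
// rating; assumes a rating is present for every property in the mapping.
import java.util.*;

public class Aggregation {

    private static final Map<String, List<String>> MAPPING = Map.of(
            "Analysability", List.of("Volume", "Duplication", "Unit size", "Component balance"),
            "Modifiability", List.of("Duplication", "Unit complexity", "Module coupling"),
            "Testability", List.of("Unit complexity", "Unit size", "Component independence"),
            "Modularity", List.of("Module coupling", "Component balance", "Component independence"),
            "Reusability", List.of("Unit size", "Unit interfacing"));

    static Map<String, Double> subCharacteristicRatings(Map<String, Double> propertyRatings) {
        Map<String, Double> result = new LinkedHashMap<>();
        MAPPING.forEach((sub, properties) -> result.put(sub, properties.stream()
                .mapToDouble(propertyRatings::get).average().orElse(0)));
        return result;
    }

    static double overallRating(Map<String, Double> subRatings) {
        return subRatings.values().stream().mapToDouble(Double::doubleValue).average().orElse(0);
    }
}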
Your question …


Does this work?


SIG Maintainability Model
Empirical validation

Internal: maintainability. External: issue handling.

•  16 projects (2.5 MLOC)
•  150 versions
•  50K issues

•  The Influence of Software Maintainability on Issue Handling, MSc thesis, Delft University of Technology
•  Indicators of Issue Handling Efficiency and their Relation to Software Maintainability, MSc thesis, University of Amsterdam
•  Faster Defect Resolution with Higher Technical Quality of Software, SQM 2010
Empirical validation
The life-cycle of an issue

Empirical validation
Defect resolution time

Luijten et al., Faster Defect Resolution with Higher Technical Quality of Software, SQM 2010
Empirical validation
Quantification

Luijten et al., Faster Defect Resolution with Higher Technical Quality of Software, SQM 2010
SIG Quality Model
Quantification

Resolution time for defects and enhancements

•  Faster issue resolution with higher quality


•  Between 2 stars and 4 stars, resolution speed increases by factors of 3.5 and 4.0
SIG Quality Model
Quantification


Productivity (resolved issues per developer per month)

•  Higher productivity with higher quality


•  Between 2 stars and 4 stars, productivity increases by a factor of 10
Your question …


Does this work?


Yes
Theoretically
Your question …


But is it useful?
★★★★★
Software Risk Assessment

•  Kick-off session
•  Technical session
•  Risk validation session
•  Strategy validation session
•  Final presentation and Final Report
•  Analysis in SIG Laboratory in between the sessions


Example
Which system to use?

Three candidate systems, rated ★★☆☆☆, ★★★☆☆, and ★★★★☆.
Should we accept delay and cost overrun, or
cancel the project?

[Figure: two variants of the system, each with a User Interface, Business Layer, and Data Layer; one built on a vendor framework, the other custom-built]

Custom
Software Monitoring

The monitoring cycle repeats every week:
•  Source code is delivered
•  Automated analysis
•  Updated website
•  Interpret, discuss, manage, act!
Software Product Certification

Summary


The SIG model maps source code measurements to the ISO 25010 maintainability sub-characteristics:
•  Analysability: Volume, Duplication, Unit size, Component balance
•  Modifiability: Duplication, Unit complexity, Module coupling
•  Testability: Unit complexity, Unit size, Component independence
•  Modularity: Module coupling, Component balance, Component independence
•  Reusability: Unit size, Unit interfacing

Measurements are aggregated into quality profiles and rated against a benchmark:
[Bar chart: unit complexity quality profiles of Springframework, Checkstyle, GOAL, and Crawljax, 0-100% of LOC per risk category]

Thank you!
Eric Bouwers
e.bouwers@sig.eu
