
Effectiveness of Software Quality Techniques

A Dr. Dobb’s Webcast
Sponsored by Coverity
Our Distinguished Panel

Capers Jones, President and CEO, Capers Jones & Associates LLC
Matthew Hayward, Director of Professional Services, Coverity Inc.

SOFTWARE DEFECT REMOVAL IN 2009
Capers Jones, Chief Scientist Emeritus, Software Productivity Research
http://www.spr.com
cjones@spr.com
May 12, 2009
TYPES OF SOFTWARE DEFECTS

DEFECT SOURCE: REQUIREMENTS

• Defect potential    1.0 per function point; 10.0 per KLOC
• Volume              0.5 pages per function point
• Completeness        < 75% of final features
• Rate of change      2% per month
• Defect types        Errors of omission
                      Errors of clarity and ambiguity
                      Errors of logical conflict
                      Errors of judgment (Y2K problem)
• Defect severity     > 25% of severity 2 errors

Copyright © 2009 by Capers Jones. All Rights Reserved.
TYPES OF SOFTWARE DEFECTS

DEFECT SOURCE: DESIGN

• Defect potential    1.25 per function point; 12.5 per KLOC
• Volume              2.5 pages per function point (in total)
• Completeness        < 65% of final features
• Rate of change      2% per month
• Defect types        Errors of omission
                      Errors of clarity and ambiguity
                      Errors of logical conflict
                      Errors of architecture and structure
• Defect severity     > 25% of severity 2 errors
TYPES OF SOFTWARE DEFECTS

DEFECT SOURCE: SOURCE CODE

• Defect potential    1.75 per function point; 17.5 per KLOC
• Volume              Varies by programming language
• Completeness        100% of final features
• Dead code           > 10%; grows larger over time
• Rate of change      5% per month
• Defect types        Errors of control flow
                      Errors of memory management
                      Errors of complexity and structure
• Defect severity     > 50% of severity 1 errors
TYPES OF SOFTWARE DEFECTS

DEFECT SOURCE: USER DOCUMENTS

• Defect potential    0.6 per function point; 6.0 per KLOC
• Volume              2.5 pages per function point
• Completeness        < 75% of final features
• Rate of change      1% per month (lags design and code)
• Defect types        Errors of omission
                      Errors of clarity and ambiguity
                      Errors of fact
• Defect severity     > 50% of severity 3 errors
TYPES OF SOFTWARE DEFECTS

DEFECT SOURCE: BAD FIXES

• Defect potential    0.4 per function point; 4.0 per KLOC
• Volume              7% of defect repairs
• Completeness        Not applicable
• Rate of change      Not applicable
• Defect types        Errors of control flow
                      Errors of memory management
                      Errors of complexity and structure
• Defect severity     > 15% of severity 1 errors
                      > 20% of severity 2 errors
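Taken together, the per-source defect potentials above sum to 5.0 defects per function point. A minimal sketch (our own illustration, not a formula from the slides) that scales these rates to a project of a given size:

```python
# Per-source defect potentials, in defects per function point, from the
# preceding slides (requirements, design, code, documents, bad fixes).
DEFECTS_PER_FP = {
    "requirements": 1.00,
    "design": 1.25,
    "source code": 1.75,
    "user documents": 0.60,
    "bad fixes": 0.40,
}

def defect_potential(function_points):
    """Expected latent defects by source for a project of the given size."""
    return {src: rate * function_points for src, rate in DEFECTS_PER_FP.items()}

# A 1,000 function point application carries roughly 5,000 latent defects.
by_source = defect_potential(1000)
print(sum(by_source.values()))
```

The 1,000 function point project size is an arbitrary example; any size can be substituted.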
FORMS OF SOFTWARE DEFECT REMOVAL

• STATIC ANALYSIS

• GENERAL TESTING

• SPECIALIZED TESTING

• USER TESTING

FORMS OF DEFECT REMOVAL

SOFTWARE STATIC ANALYSIS

FORMS OF STATIC ANALYSIS

REMOVAL STAGE: REQUIREMENT INSPECTIONS

• Occurrence           < 5% of mission-critical software
• Performed by         Clients, Designers, Programmers, SQA
• Schedule             75 function points per hour
• Purpose              Requirements error removal
• Limits               Late additions not covered
• Scope                Full requirement specifications
• Size of software     > 100,000 LOC or 1,000 function points
• Defect potential     1.0 per function point; 10.0 per KLOC
• Removal efficiency   65% to 85% of significant errors
• Bad fix injection    2% to 5%
• Comment              Reduces creep by > 50%
FORMS OF STATIC ANALYSIS

REMOVAL STAGE: DESIGN INSPECTIONS

• Occurrence           Systems software primarily
• Performed by         3 to 8 Designers, Programmers, SQA
• Schedule             100 function points per hour
• Purpose              Design error removal
• Limits               Late features not covered
• Scope                Initial and final specifications
• Size of software     > 10,000 LOC or 100 function points
• Defect potential     1.25 per function point; 12.5 per KLOC
• Removal efficiency   65% to 85% of all defect types
• Bad fix injection    2% to 7%
• Comment              Raises test efficiency by > 10%
FORMS OF STATIC ANALYSIS

REMOVAL STAGE: AUTOMATED STATIC ANALYSIS

• Occurrence           Systems, embedded, open-source
• Performed by         Programmers
• Schedule             500 function points; 25,000 LOC per hour
• Purpose              Coding error detection
• Limits               Only works for Java and C dialects
• Scope                Source code after clean compilation
• Size of software     Flexible: 1 to > 10,000 function points
• Defect potential     1.75 per function point; 17.5 per KLOC
• Detection efficiency > 85% except for performance
• Bad fix injection    2% to 5%
• Caution              Performance and some security issues
FORMS OF STATIC ANALYSIS

REMOVAL STAGE: CODE INSPECTIONS

• Occurrence           Systems software primarily
• Performed by         3 to 6 Programmers, Testers, SQA
• Schedule             2.5 function points or 250 LOC per hour
• Purpose              Coding error removal
• Limits               Late features not covered
• Scope                Source code after clean compilation
• Size of software     > 1,000 LOC or 10 function points
• Defect potential     1.75 per function point; 17.5 per KLOC
• Removal efficiency   65% to 85% except for performance
• Bad fix injection    2% to 5%
• Caution              Cyclomatic complexity > 10
NORMAL DEFECT ORIGIN/DISCOVERY GAPS

[Diagram: defect origins vs. defect discovery across the Requirements, Design, Coding, Documentation, Testing, and Maintenance phases. Without inspections or static analysis, discovery lags origin by several phases; the widening gap is labeled the "Zone of Chaos."]
DEFECT ORIGINS/DISCOVERY WITH INSPECTIONS

[Diagram: with inspections and static analysis in place, defect discovery in each phase aligns closely with defect origin, closing the origin/discovery gap.]
FORMS OF DEFECT REMOVAL

GENERAL SOFTWARE TESTING

GENERAL FORMS OF SOFTWARE TESTING

TEST: UNIT TEST

• Occurrence           All types of software
• Performed by         Programmers
• Schedule             10 function points; 1 KLOC per hour
• Purpose              Coding error removal
• Limits               Interfaces and system errors not found
• Scope                Single modules
• Size tested          100 LOC or 1 function point
• Test cases           5.0 per function point
• Test case errors     20%
• Test runs            10.0 per test case
• Removal efficiency   30% of logic and coding errors
• Bad fix injection    7%
GENERAL FORMS OF SOFTWARE TESTING

TEST: NEW FUNCTION TEST

• Occurrence           Software > 100 function points
• Performed by         Programmers or test specialists
• Schedule             100 function points; 10 KLOC per hour
• Purpose              Incremental features added
• Limits               Regression problems not covered
• Scope                Multiple modules
• Size tested          100 LOC or 1 function point and up
• Test cases           1.0 per function point
• Test case errors     5%
• Test runs            4.5 per test case
• Removal efficiency   35% of functional errors
• Bad fix injection    7%
GENERAL FORMS OF SOFTWARE TESTING

TEST: REGRESSION TEST

• Occurrence           Software updates > 10 function points
• Performed by         Programmers or test specialists
• Schedule             250 function points; 25 KLOC per hour
• Purpose              Find errors caused by updates
• Limits               Errors in new features not covered
• Scope                Multiple modules
• Size tested          100 LOC or 1 function point and up
• Test cases           0.5 per function point
• Test case errors     5%
• Test runs            2.0 per test case
• Removal efficiency   35% of regression problems
• Bad fix injection    5%
GENERAL FORMS OF SOFTWARE TESTING

TEST: SYSTEM TEST

• Occurrence           Software > 1,000 function points
• Performed by         Test specialists or programmers
• Schedule             250 function points; 25 KLOC per hour
• Purpose              Errors in interfaces, inputs, & outputs
• Limits               All paths not covered
• Scope                Full application
• Size tested          100,000 LOC or 1,000 function points
• Test cases           0.5 per function point
• Test case errors     5%
• Test runs            3.0 per test case
• Removal efficiency   50% of usability problems
• Bad fix injection    7%
DISTRIBUTION OF 1500 SOFTWARE PROJECTS
BY DEFECT REMOVAL EFFICIENCY LEVEL

Removal Efficiency Level (%)   Number of Projects   Percent of Projects
> 99                                    6                  0.40%
95 - 99                               104                  6.93%
90 - 95                               263                 17.53%
85 - 90                               559                 37.26%
80 - 85                               408                 27.20%
< 80                                  161                 10.73%
Total                               1,500                100.00%
U.S. AVERAGE DEFECT REMOVAL: 85%

STATIC ANALYSIS

• Informal design reviews   Developers (< 35% efficient)
• Code desk checking        Developers (< 25% efficient)

Pre-test removal: 50%
U.S. AVERAGE DEFECT REMOVAL: 85%

TEST STAGES

• Subroutine tests       Developers
• Unit tests             Developers
• New function tests     Developers
• Regression tests       Developers
• System tests           Developers
• External Beta tests    Clients, users

Test removal: 70%
Cumulative removal: 85%
BEST IN CLASS DEFECT REMOVAL: 99%

STATIC ANALYSIS

• Requirement inspections   Clients, developers, SQA
• Design inspections        Designers, developers, SQA
• Code inspections, or      Programmers, testers, SQA
• Static analysis           Programmers

Pre-test removal: 90%
BEST IN CLASS DEFECT REMOVAL: 99%

TEST STAGES

• Subroutine tests       Programmers
• Unit tests             Programmers
• New function tests     Test specialists, SQA
• Regression tests       Test specialists, SQA
• Performance tests      Test specialists, SQA
• Integration tests      Test specialists, SQA
• System tests           Test specialists, SQA
• External Beta tests    Clients

Test removal: 85%
Cumulative removal: 99%
CONCLUSIONS ON SOFTWARE DEFECT REMOVAL

• No single defect removal method is adequate.
• Testing alone is insufficient to top 90% removal efficiency.
• Formal inspections, static analysis, and tests combined give high removal efficiency, low costs, and short schedules.
• Defect prevention plus static analysis, inspections, and tests gives the highest cumulative efficiency and the best economics.
• Bad fix injections need special solutions.
• Test case errors need special solutions.
REFERENCES ON SOFTWARE QUALITY

Jones, Capers; Estimating Software Costs; McGraw Hill, 2007.
Jones, Capers; Assessments, Benchmarks, and Best Practices; Addison Wesley, 2000.
Jones, Capers; Applied Software Measurement, 3rd edition; McGraw Hill, 2008.
Jones, Capers; Software Quality: Analysis and Guidelines for Success; International Thomson, 1995.
Kan, Steve; Metrics and Models in Software Quality Engineering; Addison Wesley, 2003.
Radice, Ron; High-quality, Low-cost Software Inspections; Paradoxican Publishing, 2002.
Wiegers, Karl; Peer Reviews in Software; Addison Wesley, 2002.
Preventing Field Defect Impact with Static Analysis

Matthew Hayward, Director of Professional Services, Coverity, Inc.
mhayward@coverity.com
May 12, 2009
3rd Generation Static Analysis

• Comprehensive and Accurate
  – Optimal false positive elimination
  – Understands developer intent
  – Bit-accurate, executable representation of your source code using an authentic compiler
• Integrated and Usable
  – Drop-in integration with zero change to environment or build
  – Plugs into the central build and the desktop
  – Effective defect resolution process with ownership and intuitive analytics tools
Authentic Compilation

• Control Flow Graph
  – Accurate representation of flow within each function, including complex C/C++ and Java functions
  – Understand temporary objects, constructor/destructor calls inserted transparently by compilers, gotos, exceptions, etc.
• Call Graph
  – Understand function, file, and model interactions in your code
  – Follow complex call chains crossing linkage units, virtual functions, and function pointers
• 3rd Party Libraries and Platforms
  – Out-of-the-box support for common external libraries and platforms
  – Fully customizable to support your unique external binary components
False Positive Elimination

• Optimal elimination of false paths to ensure any analysis path is logically feasible
• Data Tracking
  – Track all known values for every expression
• Data Propagation
  – Propagate values through function calls interprocedurally
• False Path Pruning
  – Convert all possible values, variables, and operations into a bit-accurate representation to verify whether a path can be logically executed
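To make the idea concrete, here is a toy sketch of false path pruning. It is our own illustration, far simpler than Coverity's bit-accurate solver: a warning is reported only when every branch condition along its path can hold simultaneously for some concrete value.

```python
def path_is_feasible(constraints, domain=range(-1024, 1025)):
    """constraints: predicates over one tracked integer variable.
    A path is feasible if some value satisfies all of them at once."""
    return any(all(c(v) for c in constraints) for v in domain)

# Path 1: `x = 3; if (x > 5) use(x);` -- x == 3 contradicts x > 5, so any
# warning on this path would be a false positive; the path is pruned.
print(path_is_feasible([lambda x: x == 3, lambda x: x > 5]))  # False
# Path 2: `if (x > 5) use(x);` with x unconstrained -- feasible, analyzed.
print(path_is_feasible([lambda x: x > 5]))  # True
```

A production analyzer tracks values symbolically rather than by brute force over a small domain, but the feasibility test plays the same role.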
Defect Detection

• Benefits
  – Automatic detection of logically incorrect code
  – Accurate analysis of 100% of all paths and values
  – Comprehensive understanding of 3rd party platforms and libraries
• Sample Checkers
  OVERRUN_STATIC      INFINITE_LOOP     UNCAUGHT_EXCEPT
  OVERRUN_DYNAMIC     CHAR_IO           OPEN_ARGS
  READLINK            COM.BSTR.CONV     BUFFER_SIZE
  INTEGER_OVERFLOW    COM.BSTR.ALLOC    STRING_OVERFLOW
  DEADCODE            COM.BAD_FREE      JDBC_CONNECT
  UNREACHABLE         MISSING_RETURN    RESOURCE_LEAK
Accuracy: Predict Developer Intent

• Statistical Analysis
  – Monitor common behavior to infer correct behavior
  – Infer proper API error handling
  – Associate variables and locks
• Sample Checkers
  NULL_RETURNS          MISSING_LOCK
  CHECKED_RETURN        ORDER_REVERSAL
  BAD_COMPARE           BAD_EQ
  NO_EFFECT             CALL_SUPER
  BAD_OVERRIDE          LOCK_ORDERING
  MISMATCHED_ITERATOR   GUARDED_BY_VIOLATION
Effectiveness of Static Analysis

Consumers of static analysis often have two fundamental questions about its effectiveness:
1. How many Coverity detected defects would have hit in system testing or QA?
2. How many Coverity detected defects would have hit in the field?

ALL MATERIALS CONFIDENTIAL
Data Sources for Inquiry

In order to answer these questions, we will consider two data sources:
1. The results of Coverity’s open source Scan project (37 projects including Apache HTTPD, gcc, KDE, Linux, and NetBSD)
2. The results of analysis on a sequence of released versions of the Linux Kernel
Survey of Scan Results

To answer the first question:

  How many Coverity detected defects would have hit in system testing or QA?

we will consider the results of open source development for projects covered by Coverity’s Scan project.
Scan Project Data

The defects detected through Coverity’s open source Scan project may be grouped into two categories:
1. Defects that have been inspected by an open source developer.
   – This category allows us to draw conclusions about the use of Coverity to resolve defects
2. Defects which are uninspected.
   – This category allows us to draw conclusions about defects resolved through other means, such as testing, QA, or field defect reports
Scan Data - Inspected Defects

• Of 20,684 detected defects, 12,065 were inspected by open source developers
• Of these, 9,551 defects have been removed
• Inspection of a Coverity detected defect by an open source developer leads to a fix 79% of the time
Scan Data - Uninspected Defects

• Of 20,684 detected defects, 8,619 were left uninspected by open source developers
• Of these, 4,235 defects have been removed, presumably by traditional testing or field defect reports
• Uninspected Coverity detected defects are fixed 49% of the time through other means, such as testing, QA, or field defect reports
Scan Data - Breakdown by Open Source Project

Open source projects typically fix more than half of their Coverity detected defects:

[Bar chart: number of projects vs. percentage of detected defects fixed, in buckets of 0-25%, 26-50%, 51-75%, and 76-100%; most projects fall in the upper buckets.]
Scan Data - Conclusion

• Since the project's inception, 13,786 out of 20,684 detected defects (almost precisely two thirds) have been removed.
• The fixed defects include a mix of:
  – Inspected defects fixed with Coverity’s assistance: 79% fix rate
  – Uninspected defects fixed through traditional means: 49% fix rate
• This means that, on average, between 49% and 79% of Coverity defects would have been hit in testing, QA, or the field.
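The fix rates quoted above follow directly from the Scan counts; reproducing the arithmetic:

```python
# Counts taken from the Scan slides above.
inspected, inspected_fixed = 12_065, 9_551
uninspected, uninspected_fixed = 8_619, 4_235

print(round(inspected_fixed / inspected, 2))       # inspected fix rate: 0.79
print(round(uninspected_fixed / uninspected, 2))   # uninspected fix rate: 0.49
total = inspected + uninspected                    # 20,684 detected defects
total_fixed = inspected_fixed + uninspected_fixed  # 13,786 removed
print(round(total_fixed / total, 2))               # overall: 0.67, ~two thirds
```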


Linux Kernel Case Study

To answer the question:

  How many Coverity detected defects would have hit in the field?

a different approach is required. We will consider the results of running Coverity’s analysis over approximately 200 minor and patch releases of the Linux Kernel made between versions 2.6.12 and 2.6.25.
Linux Case Study - Methodology

• For each release of the Linux Kernel between 2.6.12 and 2.6.25 inclusive, we perform a Coverity analysis using a recent version of Coverity’s analysis engine
• We exclude from our results any defects present in the Scan project’s data set for the Linux Kernel
• For every released defect detected, we track whether it is removed in a subsequent release:
  – If so, it must have been fixed by Linux developers via traditional means of remediating defects which have reached the field
Linux Case Study - Results

• A recent version of Coverity’s analysis engine detected 5,876 defects in these versions of the Linux Kernel
• 2,315 of these defects were removed in subsequent releases
• 39% of the Coverity detected defects present in the Linux Kernel and released to the field were fixed in subsequent releases

[Chart: Linux Kernel fixed defects by impact - Concurrency Issues: 939; Logic Errors: 865; Crash Causing: 164; Memory Corruptions: 148; Security Issues: 140; Resource Leaks: 59]
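As a consistency check, the impact categories above sum to the 2,315 fixed defects, which is the quoted 39% of the 5,876 detected:

```python
# Fixed-defect counts by impact category, from the case-study chart above.
fixed_by_impact = {
    "Concurrency Issues": 939, "Logic Errors": 865, "Crash Causing": 164,
    "Memory Corruptions": 148, "Security Issues": 140, "Resource Leaks": 59,
}
detected = 5_876
fixed = sum(fixed_by_impact.values())
print(fixed)                       # 2315
print(round(fixed / detected, 2))  # 0.39
```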
Conclusions

1. How many Coverity detected defects would have hit in system testing or QA?
   – Open source data indicates between 49% and 79% of Coverity detected defects would otherwise have to be resolved by some other means during testing, QA, or following a release
2. How many Coverity detected defects would have hit in the field?
   – A case study of the Linux Kernel demonstrates that 39% of Coverity detected defects which were allowed into the field were fixed in subsequent patch or minor releases
Coverity Integrity Center
Precision Software Analysis Across the Lifecycle

• Increase customer satisfaction by eliminating product delays and recalls caused by software problems
• Speed time to market by making software changes faster and with less risk
• Innovate rapidly by reducing the time developers spend fixing software design, code, and delivery problems
Thank You

Q&A

Resources

To view this or other events on demand, please visit:
http://www.techweb.com/webcasts

For more information:
http://coverity.com/html/research-library.html