COMP0103-A7P/U: Fundamentals of Testing
Validation & Verification
Part of the slides are used with kind permission of Dr Shin Yoo and Dr Yue Jia
COMP0103 f.sarro@ucl.ac.uk
Why do we test software?
UCL
Major Software Failures
✤ NASA’s Mars Climate Orbiter: September 1999, lost due to a units-conversion fault (one team used metric units, another imperial)
✤ Ariane 5 explosion
https://www.youtube.com/watch?v=gp_D8r-2hwk
http://www.cas.mcmaster.ca/~baber/TechnicalReports/Ariane5/Ariane5.htm
COMP0103-A7P/U f.sarro@ucl.ac.uk
London Heathrow Terminal 5
Opening
Staff successfully tested the brand-new baggage handling system with over 12,000 test pieces of luggage before the opening to the public
London Heathrow Terminal 5
Opening
A single real-life scenario caused the entire system to become confused and shut down
Cost of Software Bugs
What is Software Testing?
Level of Testing Goals
As testing process maturity increases, the goal shifts from trying to show correctness to trying to show problems
Software Testing
Software Qualities
✤ Dependability
✤ Correctness
✤ A program is correct if it is consistent with its specification
✤ Seldom practical for non-trivial systems
✤ Reliability
✤ Probability of correct function for some ‘unit’ of behaviour
✤ Relative to a specification and usage profile
✤ Statistical approximation to correctness (100% reliable = correct)
✤ Safety
✤ Preventing hazards (loss of life and/or property)
✤ Robustness
✤ Acceptable (degraded) behaviour under extreme conditions
✤ Performance
✤ Usability
Software Testing
✤ Testing is the process of finding differences between the expected behaviour specified by
system models and the observed behaviour of the implemented system.
✤ Unit testing finds differences between a specification of an object and its realisation as a
component
✤ Structural testing finds differences between the system design model and a subset of
integrated subsystems
✤ Functional testing finds differences between the use case model and the system
✤ When differences are found, the developers identify the defect causing the observed failure and modify the system to correct it; if the system model is identified as the cause of the difference, the model is updated to reflect the system
Terminology: Fault, Error, Failure
Example: Fault, Error, Failure

A patient gives a doctor a list of symptoms: the symptoms are the observable Failure; the underlying illness is the Fault that causes them
Dynamic vs. Static
What About Software Bugs?
How To Deal With Faults
Object-Oriented Software Engineering: Using UML, Patterns, and Java, 3rd Edition, Prentice Hall, Upper Saddle River, NJ, September 25, 2009
✤ Fault avoidance
✤ Use Reviews
✤ Fault detection
✤ Fault tolerance
✤ Exception handling
✤ Modular redundancy
More Terminology
✤ Test Input: a set of input values that are used to execute a given program
✤ Test Oracle: a mechanism for determining whether the actual behaviour of a test input
execution matches the expected behaviour
✤ Test Effectiveness: the extent to which testing reveals faults or achieves other objectives
✤ Testing vs. Debugging: testing reveals faults, while debugging is used to remove them
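To make these terms concrete, here is a minimal sketch in Python (the `abs_val` function and its property-based oracle are hypothetical, not from the slides):

```python
# Hypothetical function under test: integer absolute value.
def abs_val(x):
    return x if x >= 0 else -x

# Test inputs: a set of input values used to execute the program.
test_inputs = [0, 7, -7, -2**31 + 1]

# Test oracle: decides whether the observed behaviour matches the
# expected behaviour -- here via two properties of absolute value.
def oracle(x, result):
    return result >= 0 and result in (x, -x)

failures = [x for x in test_inputs if not oracle(x, abs_val(x))]
```

An empty `failures` list means no fault was revealed by these inputs; debugging would only begin once a failure-revealing input is found.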
Example
SUT: System Under Test
Testing Activities
Test Design → Test Execution → Test Evaluation
Example
Input x = -2 → SUT → Output y = -5 (expected: y > 0)
A Test Case Failed

Input x = -2 → SUT → Output y = -5 (expected: y > 0)

The defect behind the failure may lie anywhere in the stack around the SUT: the requirements, the libraries, the OS, or the hardware

…but, when re-executed, sometimes it passes!
Brief Look at Software Lifecycle
Waterfall Model (Royce, 1970)
Requirements
Design
Implementation
Integration
Validation
Deployment
Spiral Model (Boehm, 1988)
Recent Paradigms
✤ Agile?
✤ Test-Driven Development?
Testing Activities
Developer Client
Brief Look at Testing Techniques
How Do You Test … ?
Testing Techniques
✤ There is no fixed recipe that always works
Random Testing
✤ Can be either black-box or white-box
✤ Pros:
✤ Cons:
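A minimal random-testing sketch, assuming a hypothetical `my_max` SUT with a deliberately seeded fault; the trusted `max` built-in serves as the oracle:

```python
import random

# Hypothetical SUT with a seeded fault: max of three ints, but it
# mishandles the corner case where all three are equal and negative.
def my_max(a, b, c):
    if a == b == c and a < 0:
        return -a  # seeded fault
    return max(a, b, c)

# Random testing: sample inputs and judge each run with an oracle.
def random_test(trials, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (rng.randint(-100, 100) for _ in range(3))
        if my_max(a, b, c) != max(a, b, c):   # oracle: trusted reference
            return (a, b, c)                  # failure-revealing input
    return None                               # fault not triggered
```

Pros: inputs are cheap to generate and unbiased. Cons: the seeded fault fires only when all three inputs coincide, so a few thousand random trials will almost certainly miss it.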
Combinatorial Testing
✤ Black-box technique
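A small illustration (the configuration space below is hypothetical): combinatorial testing reasons about combinations of parameter values rather than program structure.

```python
from itertools import product

# Hypothetical configuration space for a black-box system under test.
params = {
    "browser": ["firefox", "chrome"],
    "os": ["linux", "macos", "windows"],
    "locale": ["en", "fr"],
}

# Exhaustive combination testing: the full Cartesian product.
all_configs = list(product(*params.values()))
num_configs = len(all_configs)   # 2 * 3 * 2 = 12 configurations
```

The product grows multiplicatively with each added parameter; pairwise (2-way) combinatorial testing instead covers every pair of values with far fewer runs (here, 6 configurations suffice).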
Structural Testing
✤ White-box technique
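A minimal sketch of the white-box idea, using a hypothetical two-branch function: test inputs are chosen to exercise elements of the code itself, here both branches of a decision.

```python
# Hypothetical SUT with one decision point (two branches).
def classify(x):
    if x < 0:
        return "negative"
    return "non-negative"

# Branch coverage: together these inputs execute both outcomes
# of the `if`, covering 100% of the branches.
branch_covering_inputs = [-1, 0]
observed = [classify(x) for x in branch_covering_inputs]
```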
Mutation Testing
✤ White-box technique
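A minimal sketch of the idea behind mutation testing (both functions are hypothetical): a "mutant" is the program with one small syntactic change, and a test suite is judged by whether it can distinguish, or "kill", the mutant.

```python
# Hypothetical original program and one mutant of it.
def original(x, y):
    return x + y

def mutant(x, y):
    return x - y   # mutation: '+' replaced by '-'

# A test input kills the mutant if the two programs disagree on it.
def kills(x, y):
    return original(x, y) != mutant(x, y)
```

`kills(2, 3)` is True, but `kills(2, 0)` is False: any test with y == 0 cannot tell '+' from '-', so a suite consisting only of such tests is too weak.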
Regression Testing
Model-based Testing
Why Is Testing Hard?
Exhaustive Testing & Oracles
Exhaustive Testing
✤ Can we test each and every program with all possible inputs, and guarantee that it is correct every time? Surely then it IS correct
✤ Example: a program that takes three 32-bit integers and tells you whether they can form the three sides of a triangle, and which type if they do
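The slides describe only the interface of this triangle program; the implementation below is a hypothetical sketch, followed by the size of its exhaustive input space.

```python
# Hypothetical implementation of the triangle-classification program.
def triangle_type(a, b, c):
    if a <= 0 or b <= 0 or c <= 0:
        return "not a triangle"
    # Triangle inequality: each side shorter than the sum of the others.
    if a + b <= c or b + c <= a or a + c <= b:
        return "not a triangle"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

# Exhaustive testing over three 32-bit integers is infeasible:
total_inputs = (2 ** 32) ** 3   # 2**96 possible test cases
```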
Exhaustive Testing
✤ 32-bit integers: between -2^31 and 2^31-1, there are 2^32 = 4,294,967,296 numbers
Test Oracle
int testMe(int x, int y) {
    return x / y;
}
✤ In the example, we immediately know something is wrong when we
set y to 0: all computers will treat division by zero as an error
✤ What about those faults that force the program to produce answers that are only slightly wrong?
Oracles and Non-Testable Programs
✤ Weyuker observed that many programs are ‘non-testable’, in the
sense that it is nearly impossible to construct an effective oracle for
them
✤ Many numerical algorithms, e.g., multiplication of two large
matrices containing large values
✤ Must somehow compute result independently to validate it
✤ But that independent computation may be just as faulty
✤ Many large distributed real-time programs, e.g., USA’s Strategic
Defence Initiative (SDI), aka ‘Star Wars’
✤ Testing must demonstrate with sufficient confidence that it
would protect the USA from a nuclear attack
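For the matrix-multiplication case, one standard partial oracle (not on the slide) is Freivalds' randomized check: it tests whether A·B = C using only matrix-vector products per trial, without recomputing the full product, with error probability at most 2^-trials.

```python
import random

# Freivalds' check: a cheap probabilistic oracle for A @ B == C.
def freivalds(A, B, C, trials=20, seed=1):
    n = len(A)
    rng = random.Random(seed)
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(n)]       # random 0/1 vector
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False   # C is certainly not A @ B
    return True            # C equals A @ B with prob. >= 1 - 2**-trials
```

The check never rejects a correct C, and a wrong C survives each trial with probability at most 1/2, so a modest number of trials gives very high confidence at far lower cost than an independent multiplication.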
Oracles and Reliability Testing
✤ Reliability testing gets around some of the problems of non-testable
programs by applying statistical reasoning to a testing activity
✤ Reliability: The probability of failure-free operation over some stated
period of time
✤ Can be estimated through testing, to a level of precision that depends on
how much testing was performed
✤ The greater the amount of testing, the greater the precision
✤ Butler and Finelli observe that it is physically impossible to attain the
stated reliability targets of many safety-critical systems
✤ Example: achieving ‘nine 9s’ reliability would require centuries of
testing
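A back-of-the-envelope version of Butler and Finelli's argument (the numbers are illustrative): to substantiate a failure rate below about 10^-9 per hour, testing must accumulate on the order of 10^9 failure-free operating hours.

```python
# To support a claimed failure rate below ~1e-9 per hour, roughly
# 1e9 failure-free operating hours must be observed under test.
required_hours = 1e9
hours_per_year = 24 * 365
required_years = required_hours / hours_per_year   # ~114,000 years
```

Even running 100 copies of the system in parallel, over a millennium of testing would remain, which is why such targets cannot be demonstrated by testing alone.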
High Dependability vs. Time-to-Market
✤ Mass market products
When To Stop Testing?
✤ When the program has been tested “enough”
✤ Temporal Criteria: the time allocated has run out
✤ Cost Criteria: the budget allocated has run out
✤ Coverage Criteria: a predefined percentage of the elements of a program is
covered by the tests; or test cases covering certain predefined conditions
are selected
✤ Statistical Criteria: a predefined MTBF (mean time between failures) is reached, as predicted by a predefined reliability model
✤ Practical Goals
✤ maximising the number of faults found (may require many test cases)
✤ minimising the number of test cases (and therefore the cost of testing)
Competing Goals…
✤ Practical Goals
✤ maximising the number of faults found (may require many test cases)
✤ minimising the number of test cases (and therefore the cost of testing)
http://crest.cs.ucl.ac.uk/about/