Łukasz Bownik
This text presents a simple method to estimate the quality of a unit test suite that can give some insight into the subject beyond
regular test coverage.
Introduction
The main problem with code coverage expressed as a percentage of executed statements is that it does not equal coverage
measured in terms of exercised use cases. All it measures is the number of executed lines of code, without taking into
consideration whether anything actually gets verified. This limits the utility of the metric when used alone.
Definition of Quality
A high-quality test suite should give high confidence about how the code under test behaves. It should preferably encode all
required behavior (expressed as use cases) of all delivered functionality, so that every unwanted behavior gets detected
automatically. This is the point of perfection. The rest of this text presents an extended metric that helps in monitoring the
quality of a test suite.
Example
To illustrate the presented method, the following code under test is used throughout this text. The class generates odd numbers.
The boolean property instructs the generator to emit only prime numbers.
public class NumberGenerator {

    private boolean primeNumbersOnly;

    public NumberGenerator() {
        this.primeNumbersOnly = false;
    }

    public boolean isPrimeNumbersOnly() {
        return primeNumbersOnly;
    }

    public void setPrimeNumbersOnly(boolean primeNumbersOnly) {
        this.primeNumbersOnly = primeNumbersOnly;
    }

    public List<Integer> generateOdd() {
        // ...
        return result;
    }
}
In addition to the code under test, the following unit test suite is used throughout this text.
@Test
public void testConstruction1() {
    try {
        NumberGenerator ng = new NumberGenerator();
    } catch (Exception e) {
        fail();
    }
}

@Test
public void testPrimeNumbersOnly() {
    // ...
}

@Test
public void testGenerateOdd1() {
    NumberGenerator ng = new NumberGenerator();
    ng.setPrimeNumbersOnly(false);
    // ...
}

@Test
public void testGenerateOdd2() {
    // ...
    ng.setPrimeNumbersOnly(true);
    result = ng.generateOdd();
    assertEquals(Arrays.asList(1, 3, 5, 7), result);
}
Quality Estimation
In order to determine the quality of a test suite, one needs to review both the test cases and the code under test. Preferably,
the whole test suite should be reviewed, but to estimate an existing large test suite, one should gather enough samples to
reason about the whole codebase (at least 100). During a rough inspection, every test case gets assigned three numeric values:
verification ratio, use case coverage and complexity impact.
Verification Ratio
Verification ratio denotes whether the test actually verifies anything. A value of 1 gets assigned to a test that verifies the
behavior of the exercised code (testPrimeNumbersOnly, testGenerateOdd1, testGenerateOdd2); a value of 0 gets assigned to a
test that merely runs the code (testConstruction1 and testConstruction2). There may be cases where things are not
obvious at first glance. In such situations, an arbitrary fractional value may be assigned (remember, this is just an estimate).
Complexity Impact
Complexity impact is a rough estimate of whether the code under test contains functionality worth testing. A value of 1 can be
assigned to a test case for regular code that contains some algorithm: a condition, a loop or a sequence of function invocations
(testGenerateOdd2). A value of 0.1 shall be assigned to test cases exercising getters, setters and other functions that
merely copy variables (testConstruction1, testConstruction2, testPrimeNumbersOnly). In rare cases, a value of 0
may be assigned if it is difficult to determine whether the test actually runs the code under test, or if the test is redundant
(testGenerateOdd1, because it is a subset of testGenerateOdd2). This parameter is very rough by nature, as there is
plenty of freedom between 0.1 and 1.
Results
The result of the code review shall be a table containing 5 columns with assigned values.

Test name          Verification ratio   Use case coverage   Complexity impact   Comment
testGenerateOdd2   1                    1                   1
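The following table presents the values for the example suite; the aggregate value for the suite (0.21) appears in the bottom
right cell.

Test name              Verification ratio   Use case coverage   Complexity impact   Comment     Covered complexity
testPrimeNumbersOnly   1                    0.5                 0.1                             0.05
testGenerateOdd1       1                    0.5                 0                   Redundant   0
testGenerateOdd2       1                    1                   1                               1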
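The arithmetic behind the table can be sketched in a few lines of Java. The sketch assumes that the last value in each row is the product of the three ratings (1 × 0.5 × 0.1 = 0.05 for testPrimeNumbersOnly) and that the suite value is the plain average of those products. The class and method names are hypothetical, and the use case coverage of the two construction tests is a placeholder, because the text does not state it; with a verification ratio of 0 their product is 0 either way.

```java
import java.util.Arrays;
import java.util.List;

/** One row of the review table (hypothetical helper, not part of the article's code). */
final class TestAssessment {
    final String testName;
    final double verificationRatio;
    final double useCaseCoverage;
    final double complexityImpact;

    TestAssessment(String testName, double verificationRatio,
                   double useCaseCoverage, double complexityImpact) {
        this.testName = testName;
        this.verificationRatio = verificationRatio;
        this.useCaseCoverage = useCaseCoverage;
        this.complexityImpact = complexityImpact;
    }

    /** Assumption: covered complexity of one test is the product of its three ratings. */
    double coveredComplexity() {
        return verificationRatio * useCaseCoverage * complexityImpact;
    }
}

final class SuiteQuality {
    /** Plain average of the per-test covered complexity; 0 for an empty list. */
    static double averageCoveredComplexity(List<TestAssessment> rows) {
        if (rows.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (TestAssessment row : rows) {
            sum += row.coveredComplexity();
        }
        return sum / rows.size();
    }

    /** Ratings of the example suite as given in the text. */
    static List<TestAssessment> exampleSuite() {
        return Arrays.asList(
            // Use case coverage of the construction tests is a placeholder (not stated in the text).
            new TestAssessment("testConstruction1", 0, 0, 0.1),
            new TestAssessment("testConstruction2", 0, 0, 0.1),
            new TestAssessment("testPrimeNumbersOnly", 1, 0.5, 0.1),
            new TestAssessment("testGenerateOdd1", 1, 0.5, 0),
            new TestAssessment("testGenerateOdd2", 1, 1, 1));
    }
}
```

Under these assumptions, SuiteQuality.averageCoveredComplexity(SuiteQuality.exampleSuite()) evaluates to (0 + 0 + 0.05 + 0 + 1) / 5 = 0.21, which matches the value quoted for the example suite.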
Values close to 1 mean that the test suite concentrates on complex, non-trivial functionality. Such a test suite is a source of
confidence for the software team. On the other hand, values close to 0 indicate that the suite does not verify much, concentrates
on trivial functionality or does not fully exercise all possible use cases (or all of these combined).
Code coverage and average covered complexity combined provide insight into the quality of a test suite, as summarized in
Figure 1.
Figure 1. Test suite quality.
Both parameters over 0.5 or 50% (green quadrant) denote a suite that covers most of the relevant complex use cases of the code
under test; this is the desirable situation.
Code coverage under 50% and average covered complexity over 0.5 (yellow quadrant) denote a suite of good quality but
insufficient quantity (a lot of functionality has not been covered at all, but the covered functionality is thoroughly exercised);
this situation happens when tests have been abandoned during the development process or are being written after the fact,
usually as a prerequisite for refactoring; to move the suite to the green quadrant, more tests need to be written to cover the
remaining parts of the functionality.
Code coverage over 50% and average covered complexity under 0.5 (red quadrant) denote a “fraud” test suite that was
created only to fool the code coverage metric; such a suite should be deleted.
Both parameters under 0.5 or 50% (grey quadrant) denote that the existence of the test suite itself is questionable; such a
suite should be refactored or deleted.
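The four quadrants can be expressed as a small classification function. This is only a sketch: the names Quadrant and QuadrantClassifier are hypothetical, code coverage is assumed to be given as a fraction between 0 and 1, and a value of exactly 0.5 is treated as "under", since the text does not say where the boundary itself belongs.

```java
/** The four quadrants described in the text (hypothetical type). */
enum Quadrant { GREEN, YELLOW, RED, GREY }

final class QuadrantClassifier {
    /**
     * Maps code coverage (fraction, 0..1) and average covered complexity (0..1)
     * to a quadrant. Assumption: a value of exactly 0.5 counts as "under".
     */
    static Quadrant classify(double codeCoverage, double averageCoveredComplexity) {
        boolean highCoverage = codeCoverage > 0.5;
        boolean highComplexity = averageCoveredComplexity > 0.5;
        if (highCoverage && highComplexity) {
            return Quadrant.GREEN;   // most relevant complex use cases are covered
        }
        if (highComplexity) {
            return Quadrant.YELLOW;  // good quality, insufficient quantity
        }
        if (highCoverage) {
            return Quadrant.RED;     // coverage without meaningful verification
        }
        return Quadrant.GREY;        // existence of the suite is questionable
    }
}
```

With the example suite's average covered complexity of 0.21, the result is RED when code coverage is above 50% and GREY otherwise.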
The point of perfection is a guiding light for a test suite, but reaching it is rarely possible or desirable (it is usually too
expensive). The test suite that provides the highest return on investment is usually placed somewhere within the light green
quadrant, usually leaning towards higher average covered complexity than code coverage. In this case, all non-trivial
functionality gets exercised by the test suite.
Additional Metrics
In order to give even more insight into the problem, additional metrics can be calculated:
Average verification ratio: shows what proportion of test cases actually verify something; low values (0.61 in the example)
mean that many test cases just run the code under test
Average use case coverage: shows the ratio of use cases exercised by tests; low values (0.4 in the example) mean that
tests are not exhaustive
Average complexity impact: shows the average verified complexity of covered code; low values (0.26 in the example)
mean that the test suite concentrates on trivial functionality that is unlikely to misbehave
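Each of these metrics is assumed here to be a plain arithmetic mean over one column of the review table. The sketch below uses a hypothetical helper, ColumnAverage, and applies it to the complexity impact values assigned earlier in the text (0.1 for testConstruction1, testConstruction2 and testPrimeNumbersOnly, 0 for testGenerateOdd1, 1 for testGenerateOdd2); the other two averages would be computed the same way from their own columns.

```java
import java.util.stream.DoubleStream;

/** Hypothetical helper: arithmetic mean of one column of the review table. */
final class ColumnAverage {
    /** Returns the mean of the given ratings, or 0 when no ratings are supplied. */
    static double of(double... values) {
        return DoubleStream.of(values).average().orElse(0.0);
    }
}
```

ColumnAverage.of(0.1, 0.1, 0.1, 0, 1) yields 1.3 / 5 = 0.26, the average complexity impact quoted for the example.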
Automation
The presented method can be automated to some extent. If the programming language supports annotations, a special one can be
developed to keep the parameters assigned to every test case along with the test case itself, as presented in the following code
snippet.
@Test
@Quality(verificationRatio = 1, useCaseCoverage = 0.5, complexityImpact = 0.1,
        comment = "Tests trivial assignment.")
public void testPrimeNumbersOnly() {
    // ...
}
The test case evaluation can be part of regular code reviews, and code coverage tools can be extended to calculate the aggregated
metric during test suite execution.
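One possible shape of such an annotation, together with a reflection-based aggregation, is sketched below. This is not the contents of the jar file mentioned in this text: the element types, the retention policy, the QualityScanner class and the AnnotatedExample class are all assumptions made for illustration, and the per-test value is again assumed to be the product of the three ratings.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Assumed definition; kept at runtime so it can be read through reflection. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Quality {
    double verificationRatio();
    double useCaseCoverage();
    double complexityImpact();
    String comment() default "";
}

/** Hypothetical aggregator: averages the product of the ratings over annotated methods. */
final class QualityScanner {
    static double averageCoveredComplexity(Class<?> testClass) {
        double sum = 0.0;
        int count = 0;
        for (Method method : testClass.getDeclaredMethods()) {
            Quality q = method.getAnnotation(Quality.class);
            if (q == null) {
                continue; // methods without ratings are ignored
            }
            sum += q.verificationRatio() * q.useCaseCoverage() * q.complexityImpact();
            count++;
        }
        return count == 0 ? 0.0 : sum / count;
    }
}

/** Stand-in for a test class; method bodies are irrelevant to the aggregation. */
final class AnnotatedExample {
    @Quality(verificationRatio = 1, useCaseCoverage = 0.5, complexityImpact = 0.1,
            comment = "Tests trivial assignment.")
    public void testPrimeNumbersOnly() { }

    @Quality(verificationRatio = 1, useCaseCoverage = 1, complexityImpact = 1)
    public void testGenerateOdd2() { }
}
```

For the two annotated methods above, QualityScanner.averageCoveredComplexity(AnnotatedExample.class) gives (0.05 + 1) / 2 = 0.525. Ignoring unannotated methods is a design choice of this sketch; a stricter tool could instead count them as 0 to penalize unreviewed tests.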
//NOTE: A jar file containing @Quality annotation and an Apache Ant task calculating described metrics is available here.
Conclusion
The presented combined metrics (code coverage and average covered complexity) provide some qualitative insight into the quality
of a test suite, but it is important to note that average covered complexity is just a rough estimate that may vary considerably.
It is only representative if the set of examined test cases is representative of the entire suite, and it depends on the examiner’s
arbitrary estimates of particular test cases. Nevertheless, both metrics combined give a better picture of test suite quality than
code coverage alone.
History
5th May, 2019: Initial version
License
This article, along with any associated source code and files, is licensed under The BSD License