
Robustness and diagnosability of OO systems designed by contracts
Benoit Baudry, Yves Le Traon, Jean-Marc Jézéquel

To cite this version:


Benoit Baudry, Yves Le Traon, Jean-Marc Jézéquel. Robustness and diagnosability of OO systems designed by contracts. Proceedings of Metrics’01, Apr 2001, London, United Kingdom. hal-00794315

HAL Id: hal-00794315


https://hal.inria.fr/hal-00794315
Submitted on 25 Feb 2013

Robustness and Diagnosability of OO Systems Designed by Contracts

Benoit Baudry, Yves Le Traon and Jean-Marc Jézéquel


IRISA, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France
{Benoit.Baudry, Jean-Marc.Jezequel, Yves.Le_Traon }@irisa.fr

Abstract

While there is a growing interest in component-based systems in industry, little effort has so far been devoted to quality evaluation of these systems. This paper presents the definition of measures for two quality factors, namely robustness and “diagnosability”, for the special case of OO systems for which the approach known as Design by Contract has been used. The main steps in constructing these measures are given, from informal definitions of the factors to be measured to the mathematical model of the measures. To fix the parameters, experimental studies have been conducted, essentially based on applying mutation analysis in the OO context. Several measures are presented that reveal and estimate the contribution of contract quality and density to the overall quality of a system in terms of robustness and “diagnosability”.

1. Introduction

Contracts are elements of formal specification associated with programs in a way that is acceptable to practicing developers [Meyer00]. Contracts have reactive support in Object Oriented languages such as the UML or Eiffel, and can be easily supported in Java (iContracts) or C++.

The objective of this paper is to bridge the gap between the intuitive understanding of what contracts (or, to enlarge the scope of the paper, the use of assertions) improve in the software and a quantitative, and hopefully accurate, estimate of these improvements. We deliberately restrict the problem domain to the design by contract approach and propose two measurements: one of the robustness, and the other of what we call the “diagnosability” of the software [Le Traon98]. Indeed, these factors are related to a contract-based design approach: software with embedded contracts can detect internal anomalies during execution (the system is thus more robust) and helps in pinpointing the fault location (the faulty state can be expected to be close to the fault cause). At this level of understanding, the diagnosability factor can be roughly defined as the degree to which the software allows an easy and precise location of a fault when detected.

This paper mainly reports on the empirical validation of an axiomatization of the behavior of these robustness and diagnosability factors. This validation seems to be successful since the case studies reveal that the measures correspond closely to the intuition. The proposed robustness and diagnosability measures offer an easy way of comparing designs, as well as a method for appraising contract efficiency in terms of both robustness and diagnosis effort and preciseness. These estimates make it possible to predict the effort that must be devoted to contract quality and quantity in order to reach a certain level of robustness and diagnosability. The measures presented, which are based on a generic axiomatization of their expected behavior, can be generalized to classical procedural programming.

Section 2 opens with a presentation of the design by contract approach and an intuitive analysis of some expected benefits of the approach: robustness and diagnosability improvement. Section 3 concentrates on the definition of robustness, the axiomatization of the expected measurement behavior and the calibration of the model parameters on several case studies. For the calibration of the model parameters, we use a particular adaptation of mutation analysis to the OO paradigm. Section 4 is devoted to diagnosability analysis, along the same lines as that given for robustness.

This paper is quite dense, but we prefer to detail each measurement elaboration, since we believe that any realistic software measurement cannot be too simplistic.

2. The Problem Domain: Design by Contract

2.1. Design by contract
The notion of software contract has been defined to capture mutual obligations and benefits among classes. Experience tells us that simply spelling out these contracts unambiguously is a worthwhile design approach [Jezequel97], for which B. Meyer coined the term Design by Contract [Meyer92]. This design by contract approach has a sound theoretical basis in relation to partial functions, and provides a methodological guideline for building robust, yet modular and simple systems. In some ways, design by contract is the exact opposite of defensive programming [Liskov86], where it is recommended to protect every software module by as many checks as possible.

Defensive programming makes it difficult to precisely assign responsibilities among modules, and has the additional harmful side effect of increasing software complexity, which eventually leads to a decrease in reliability.

[Figure: execution of classical software compared with designed-by-contract software, each with its diagnosis scope. Legend: an execution thread; distributed execution threads; infection point (global program state faulty after this point); failure produced at the output; a contract; exception handled by contract; exception treatment (diagnosis and default mode).]
Fig. 1. Contracts for early detection of a fault

The design by contract approach prompts developers to specify precisely every consistency condition that could go wrong, and to explicitly assign the responsibility of its enforcement to either the routine caller (the client) or the routine implementation (the contractor). Along the lines of abstract data type theory, a common way of specifying software contracts is to use boolean assertions called pre- and post-conditions for each service offered, as well as class invariants for defining general consistency properties. A contract carries mutual obligations and benefits: the client should only call a contractor routine in a state where the class invariant and the precondition of the routine are respected. In return, the contractor promises that when the routine returns, the work specified in the postcondition will be done, and the class invariant is still respected.

A failure to meet the contract terms indicates the presence of a fault, or bug. A precondition violation points out a contract broken by the client: the contractor does not then have to try to comply with its part of the contract, but may signal the fault by raising an exception. A postcondition violation indicates a bug in the routine implementation, which does not fulfill its obligations.

The Design by Contract approach is smoothly integrated into the type system of an OO language through the notion of subcontracting as provided by the inheritance mechanism. That is, dynamic binding can be viewed through the perspective of a routine subcontracting its actual implementation to a redefined version. Redefinition is then a semantics-preserving transformation in the sense that the redefined routine must at least fulfill the contract of the original routine, and optionally does more (e.g., accepting cases that would have been rejected by the original contractor or returning a “better” result than originally promised). In other words, in a subclass, preconditions may only be weakened (accept more) and postconditions strengthened (do more).

2.2. Contracts for Robustness and Diagnosability

In this paper, we focus on measuring the benefit of a design by contract approach. The main impact of contracts on final-product software quality is two-fold:
- since contracts can be compiled in such a way as to raise exceptions when violated, they behave as classical, but more meaningful, assertions: a faulty program state during execution (due to a fault or bug) can thus be automatically detected, and the failure that would have certainly occurred can be avoided. Intuitively, it shows
that contracts contribute to software robustness. The question is: to what degree do contracts contribute to software robustness, depending on their “strength” and number?
- when a contract is violated, it indicates the part of the code where a wrong program state has been detected. With no contract embedded in the software, the failure would have been detected elsewhere, perhaps at the system output. Since the scope of diagnosis (the number of statements in which the fault must be located) is reduced when contracts catch the presence of a fault, it can be seen that contracts help the diagnosis task. The question is: to what degree do contracts reduce the diagnosis effort, depending on their “strength” and number?

As presented in Figure 1, design-by-contract software should allow early detection of faults (during their propagation to the outputs) and help locate the faulty part of the software by reducing the “diagnosis scope” in which a fault must be located. What is of great interest with contracts is the fact that their efficiency does not depend on a possibly distributed execution of the software. While faults are really difficult to locate in a distributed execution environment, with contracts there is a good probability of pinpointing where the fault occurred.

2.3. Measurement construction

The literature emphasizes the difficulty in constructing valid measurements [Fenton86, Shepperd93, Briand96, Kitchenham95]. In this paper, since the measured factors first appear as quite abstract and unclear, we choose to axiomatize the measures [Shepperd93]. Figure 2 illustrates the process of the measurement construction: the factor to be measured is first informally defined, and significant and measurable attributes are identified (with intuitive and hopefully convincing arguments and assumptions). Then the intuitive properties of the factor behavior must be expressed using the chosen attributes: this is what we call axioms. An axiom is an expected and understandable property of the measurement that also has a meaning in the mathematical model. It makes the connection between the intuitive/real world and the formal one for the theoretical validation. A formal model of measurement, which is richer than the expressed axioms (otherwise, the formal model is of no interest), can then be proposed. The role of axiomatization is twofold: it provides a unifying framework for evaluating the various measures that can be proposed, and thus helps one to separate the search for a measurement from the statement of the intuition (the expected behavior).

Since the axioms formalize the essential expected properties of any measurement, they define which systems should be comparable and the generic characteristics these measures must satisfy. The axioms constitute the basis of the theoretical evaluation, which was carried out to check the consistency of the proposed measurement with the real world. The theoretical evaluation precedes empirical evaluation since it is less time-consuming and more appropriate to show that the model is internally consistent: it is used to check that no pathological structures exist for which the model produces inappropriate behavior.

[Figure: two layers. Mathematical/formal world: formal model, measured factor, formal properties of the measurement. Real/empirical world: software, measurable attributes, factor to be measured, intuitive properties. Relation labels: "must be consistent with", "uses", "has one or several", "depends on", "should verify the", "has some".]
Fig. 2. The measurement elaboration
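To make the contract vocabulary of Section 2 concrete, the following is a minimal Python sketch, not the Eiffel mechanism used in the paper. All names (`ContractViolation`, `require`, `ensure`, `Stack`, `FaultyStack`, `run`) are hypothetical and introduced only for illustration. It shows a precondition, postconditions and a class invariant that raise an exception when violated, and a deliberately faulty variant whose fault is caught by a postcondition in the routine where it occurs.

```python
class ContractViolation(Exception):
    """Raised when a precondition, postcondition or invariant fails."""


def require(condition, message):
    # Precondition: the obligation of the client (caller).
    if not condition:
        raise ContractViolation("precondition: " + message)


def ensure(condition, message):
    # Postcondition: the obligation of the contractor (routine).
    if not condition:
        raise ContractViolation("postcondition: " + message)


class Stack:
    def __init__(self):
        self._items = []
        self._count = 0
        self._check_invariant()

    def _check_invariant(self):
        # Class invariant: the counter always matches the stored items.
        if self._count != len(self._items):
            raise ContractViolation("invariant: count out of sync")

    def _store(self, item):
        self._items.append(item)
        self._count += 1

    def put(self, item):
        old_count = self._count
        self._store(item)
        ensure(self._count == old_count + 1, "count increased by one")
        ensure(self._items[-1] is item, "item is on top")
        self._check_invariant()

    def pop(self):
        require(self._count > 0, "stack not empty")
        item = self._items.pop()
        self._count -= 1
        self._check_invariant()
        return item


class FaultyStack(Stack):
    def _store(self, item):
        # Injected fault (illustrative): the counter is not updated.
        self._items.append(item)


def run(stack_class):
    """Return 'ok' or the message of the first contract violation."""
    stack = stack_class()
    try:
        stack.put("a")
        stack.pop()
    except ContractViolation as violation:
        return str(violation)
    return "ok"
```

In this sketch, `run(Stack)` completes normally, while `run(FaultyStack)` stops inside `put` with a postcondition violation, that is, in the routine containing the fault rather than at some later output. This is the intuition behind both the robustness and the diagnosability arguments above.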
3. Measuring robustness improvement

In this section, we want to model the relationship between a component’s robustness and its contracts. Therefore we propose a measure of contract efficiency, and we show the improvement in robustness brought by contracts.

3.1. Definitions

Definition - Robustness: Robustness expresses the degree to which the software is able to recover from internal faults that would otherwise have provoked a failure.

In this paper, we consider that the main attribute that is significant for robustness is the capability of the software to detect an internal faulty state.

Definition - Isolated Robustness (Rob_i): The isolated robustness Rob_i of a component C_i in a system S is defined as the probability that a fault internal to C_i is detected by C_i when it is known that this fault would provoke a failure. Conversely, the “weakness” Weak_i of the component is equal to the probability that the fault is not detected.

The detection mechanisms we focus on are executable contracts and other assertions. Many components cannot directly be executed (abstract or generic classes). Nevertheless, they may still be equipped with their own contracts that can detect their internal failure.

Definition - Global robustness (ℜ): The global robustness ℜ of a system composed of a set of interconnected components is defined as the probability that an internal fault is detected by any one of the components.

Without loss of generality, we can consider a component in isolation, and the interconnection rules are those classically defined in OO systems, e.g. in a UML context.

It has to be noted that an internal fault in a component plugged into a system can be detected either by the component itself or by one of its clients or children. Intuitively, the global robustness cannot be directly deduced from the knowledge of the local robustness of the components. We argue that a relationship exists between local and global robustness but that additional information on the architecture is needed to obtain the global robustness. The proposed model extracts the main attributes from a UML model to compute the global robustness based on local robustness measures (the information we need could also be reconstructed through static analysis of programs written in Java, C++, Eiffel…). This consideration leads to the definition of the local robustness of a component plugged into a system.

Definition - Local Robustness (RobInS_i): The local robustness RobInS_i of a component C_i in a system S is defined as the probability that a fault in C_i is detected either by C_i or by the components it interacts with, when it is known that this fault would provoke a failure.

Both isolated and local robustness are measurements local to a component, while the global robustness concerns the whole system. This explains the decomposition of the axiomatization into local and global axioms.

a) Axiomatization

The axioms formalize the essential expected properties of any measurement. They define what should be comparable and the generic characteristics that these measures must satisfy. Based on the definitions given in the previous paragraph, three sets of axioms are provided: global axioms, local axioms (for local and isolated robustness) and axioms linking global and local measures. They constitute the basis of the theoretical evaluation which was carried out.

Measure profiles:
Rob_i: Component → Real over [0..1]
ℜ: System Architecture → Real over [0..1]

Since all robustness measurements are probabilities, they are bounded between 0 and 1, a value of 1 meaning perfect robustness (internal faults are always detected) and 0 indicating a non-robust system or component.

Local robustness axioms:
LRA1 - Component comparison. All components of a system are comparable in terms of local and isolated robustness.
LRA2 - Component with no contracts (or assertions). Components which have no contracts (or assertions or other fault detection mechanisms) have an isolated robustness value of 0.

The following axioms concern the intuitive behavior of the measures under some design operations: system concatenation, contract addition and contract improvement.
- Concatenation: models any operation that allows the connection of two systems to produce a new one (for example using inheritance, client/provider dependencies).
- Contract addition: operation consisting of adding a contract to a system component (pre/post conditions, class invariants).
- Contract improvement: operation consisting of adding a new clause to an existing contract to check the consistency of a previously unverified property of a component. A contract is thus improved iff it checks more properties of the component.

LRA3 - System concatenation. The isolated robustness of a component included in a system S1 is unmodified by concatenation to a system S2 and its local robustness cannot decrease.
LRA4 - Contract (assertion) addition. In a system, the local and isolated robustness of a component cannot decrease by the addition of a contract to a component in the system.
LRA5 - Contract improvement. The improvement of a contract of a component C_i in a system must increase its isolated and local robustness. The local (and obviously isolated) robustness of the other components cannot decrease.

Global robustness axioms:
GRA1 - System comparison. Two systems are always comparable in terms of robustness.
GRA2 - System concatenation. The global robustness of a system obtained by concatenation of two systems S1 and S2 cannot be lower than the lowest robustness of S1 and S2.
GRA3 - Observation point addition. For any system, its global robustness cannot decrease by addition of an observation point.

Inter-level axioms:
The knowledge of local and isolated robustness measures must be sufficient to deduce the system global robustness.
IRA1 - Inter-level relationship. An injective function relates local robustness to global robustness.

3.2. Assumptions and mathematical model

A component isolated from the system will have a basic robustness corresponding to the strength of its embedded contracts. A component plugged into a system has a robustness enhanced by the fact that its clients bring their contracts to help the fault detection. The notion of Test Dependency is thus introduced to determine the relationship between a component and its clients in a system.

Definition - Test dependency: A component class C_i is test-dependent on C_j if it uses some objects from C_j. This dependency relation is denoted: C_i R_TD C_j.
For example, in Figure 3, component C is test-dependent on D, and components A and B are test-dependent on C.

Definition - Det_i^j: If C_i R_TD C_j, then the probability that the contracts of C_i detect a fault due to C_j is denoted Det_i^j.

In the next section, we give a way to estimate the robustness of a component and the probability Det_i^j.

Even though the test dependency relationship is transitive, we only consider faults that are detected by a component directly dependent on the faulty one. Inheritance is a special case of Test Dependency whose impact on robustness measurement is still unclear to us. For our experiments we have thus considered two hypotheses: the pessimistic one, for which we consider that the Det_i^j probability is 0 for inheritance, and the optimistic one, for which we consider that the Det_i^j probability is the same for inheritance as for the client relationship.

[Figure: components A, B, C and D with their contracts; A and B are test-dependent on C, and C is test-dependent on D.]
Fig. 3. Example for test-dependency

The robustness RobInS_i (= 1 - WeakInS_i) of the component C_i in the system S is the probability that a fault in the component C_i is detected either by its own contracts or by the components it interacts with. To calculate this probability, we calculate WeakInS_i. The probability WeakInS_i is the probability that a fault due to C_i is not detected
locally by C_i multiplied by the probability that the fault is not detected by the clients of C_i:

$$\mathit{WeakInS}_i = \mathit{Weak}_i \cdot \prod_{k \,:\, C_k \, R_{TD} \, C_i} \left(1 - \mathit{Det}_k^i\right)$$

Finally, the global robustness ℜ of the system is thus equal to:

$$\Re = 1 - \mathit{Weak} = 1 - \sum_{i=1}^{n} \mathit{Prob\_failure}(i) \cdot \mathit{WeakInS}_i$$

where Prob_failure(i) is the probability that the failure comes from the component C_i knowing that a failure certainly occurs. This probability is approximated by the component’s complexity.

3.3. Experimental model parameterization

a) Mutation analysis for OO Domain

Mutation testing is a testing technique which was first designed to create effective test data, with an important fault revealing power [Offutt96, Voas92]. It was originally proposed in 1978 [DeMillo78], and consists in creating a set of faulty versions or mutants of a program with the ultimate goal of designing a set of test cases that distinguishes the program from all its mutants. In practice, faults are modeled by a set of mutation operators where each operator represents a class of software faults. To create a mutant, it is sufficient to apply its associated operator to the original program.

A set of test cases is relatively adequate if it distinguishes the original program from all its non-equivalent mutants. Otherwise, a mutation score (MS) is associated with the set of test cases to measure its effectiveness in terms of the percentage of non-equivalent mutants detected. It is to be noted that a mutant is considered equivalent to the original program if there is no input data on which the mutant and the original program produce a different output.

During the test selection process, a mutant program is said to be killed if at least one test case detects the fault injected into the mutant. Conversely, a mutant is said to be alive if no test case detects the injected fault.

A benefit of the mutation score is that even if no error is found, it still measures how well the software has been tested, giving the user information about the program test quality. It can be viewed as a kind of reliability assessment for the tested software.

For experiments, our choice of mutation operators includes selective relational and arithmetic operator replacement, variable perturbation, but also referencing faults (aliasing errors) for declared objects; these operators are detailed in [Baudry00]. The operators introduced for the object-oriented domain are the following:
− MCP (Methods Call Replacement): Replace methods by a call to another method with the same signature.
− RFI (Referencing Fault Insertion): Nullify the reference of an object after its creation. Suppress a clone or copy instruction. Insert a clone instruction for each reference assignment. Operator RFI introduces object aliasing and object reference faults, most common in object-oriented programming.

b) Estimating Contracts efficiency: a case study

To compute the contract quality of classes in a system, that is, the isolated robustness of the class, we used two mutation analyses.

A first analysis uses behavioral differences between the initial class and a mutant class. During this first analysis, a test kills a mutant if the execution results of a test on the initial class and on a mutant class are different. This analysis is incremental: first we compute the mutation score of an initial set of test cases; if this score is not satisfactory, we write new test cases to improve the score. At the end of this analysis we have, for each class in the system, a good set of test cases able to kill at least 90% of the class’ mutants.

For the second mutation analysis, we consider all the mutants of a class alive and we try to kill them again, using only the class contracts. For this second analysis, we execute all the test cases we have written during the first analysis on all the mutants. We say that a test kills a mutant if the execution of the test on the mutant class raises an exception. The mutation score of a class’ set of test cases at the end of this analysis is the percentage of mutants the class’ contracts are able to detect. For our experiments, we consider this score as the isolated robustness of the class. If this initial robustness value is not satisfactory, we can improve the contracts until we reach a good robustness.

Once we have the isolated robustness of each class, we can measure Det_i^j. This value, for a component C_i, is measured by injecting faults in C_i’s providers. Then, we execute C_i tests using C_i and its faulty providers. The percentage of killed mutants is Det_i^j.

The parameters of the robustness model are easily fixed using mutation analysis:
Rob_i = 1 - Weak_i = percentage of mutants detected by contracts.
Det_i^j = percentage of mutants in C_j detected by C_i contracts.
Prob_failure(i) = 1 / n, n being the number of classes in the system.

Starting from a system in which each class has an associated test case set, the aims of the case study are the following:
1. To appraise the initial effectiveness of contracts and improve them using this approach,
2. To estimate the robustness of a component with embedded selftest in terms of detecting faults due to supplier classes.

We used the Pylon library (http://www.eiffel-forum.org/archive/arnaud/pylon.htm) as a case study. It is a small, portable, freely available Eiffel library for data structures and other basic features. The class diagram is composed of 50 classes and 134 relations. This library is complex enough to illustrate the approach and obtain interesting results. The way in which the various classes used in this package interact is presented in the Annex. The mutation analysis tool used, called mutant slayer or µSlayer, is dedicated to the Eiffel language. This tool injects faults in a class under test (or a set of classes), executes tests on each mutant program and delivers an analysis to determine which mutants were killed by tests. The process is incremental (for example, we do not restart the execution on already killed mutants) and is parameterized (for example, the user selects the number and types of mutation he wants to apply at any step).

Concerning the improvement of contracts, initial contracts killed an average of 58.5% of mutants, after improvement the mutation score reached an average of 7.5%.

The isolated robustness of classes is significantly improved (the best improvement is from 25% to 100%). The fact that not all faults are detected by the improved contracts reveals the limit of contracts as oracle functions. The contracts associated with these methods are unable to detect faults disturbing the global state of a component. For example, a prune method of a stack cannot have trivial local contracts checking whether the element removed had been previously inserted by a put. In that case, a class invariant would be suited to detecting such faults. At the end of the improvement process, the contractable component has a considerably greater capacity to detect faults (between 72% and 100% in the case of mutation faults for this study). As a result, this approach highlights methods for which the associated contracts are too weak.

To measure Det_i^j values, we generate the mutants for a class, and compute the mutation score for the clients’ set of test cases on the mutants of the methods the client uses. For example, if class P is a provider for class C, we generate mutants for class P, we select the mutants of methods used by C and we compute the mutation score for the set of test cases of C on the selected mutants. This mutation score corresponds to the percentage of mutants of P that the contracts of C are able to detect, and this is what we call Det_i^j.

For these measures, all the classes in the system have their improved contracts (average isolated robustness 87%), and the values range from 50% to 84%.

3.4. Results

To illustrate the interest of a design-by-contract approach for robustness improvement, we applied it a posteriori to three real world case studies in the telecommunications and compiler software domains.
− A Telecommunications Switching System: Switched multimegabits data service (SMDS) is a connectionless, packet-switched data transport service running on top of connected networks such as the Broadband Integrated Service Digital Network (B-ISDN), which is based on the asynchronous transfer mode (ATM). A detailed description of an SMDS server design and implementation can be found in [Jéron99]. The class diagram is composed of 37 classes, with a high connectivity degree (72 relations).
− The Pylon library, which has already been presented above.
− The InterViews library, composed of 146 classes and 420 relations.

For these three systems, we show the evolution of global robustness with the improvement of isolated components’ robustness. To illustrate this evolution, we consider that the Det_i^j probability depends on the component’s robustness as follows:

Det_i^j = K · Rob_i

Using the measures in tables 2 and 3, we fix the coefficient K: K = 0.8. The robustness evolutions for the three systems are shown in Figure 5. For these evolutions, we make the pessimistic assumption that inheritance dependencies have no impact on global robustness.
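The robustness model of Section 3.2 can be evaluated numerically once its parameters are fixed. The sketch below is an illustration under stated assumptions, not the tool used for the case studies: the four-component system mirrors the shape of Figure 3 (A and B use C, C uses D), every isolated robustness value is invented for the example, Prob_failure(i) is taken as 1/n, and the detection probability of a client is approximated as K times its isolated robustness with K = 0.8. The names `CLIENTS`, `local_weakness` and `global_robustness` are hypothetical.

```python
from math import prod

K = 0.8  # coefficient from the text: Det = K * Rob of the detecting client

# Hypothetical system shaped like Fig. 3: A and B use C, C uses D.
# CLIENTS[x] lists the components that are test-dependent on x.
CLIENTS = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}


def local_weakness(component, rob, clients, k=K):
    """WeakInS_i: fault missed by the component and by all its clients."""
    weak = 1.0 - rob[component]
    # Probability that no direct client detects the fault.
    missed_by_clients = prod(1.0 - k * rob[c] for c in clients[component])
    return weak * missed_by_clients


def global_robustness(rob, clients, k=K):
    """1 - sum_i Prob_failure(i) * WeakInS_i, with Prob_failure(i) = 1/n."""
    n = len(clients)
    total_weakness = sum(local_weakness(c, rob, clients, k) for c in clients)
    return 1.0 - total_weakness / n


if __name__ == "__main__":
    # Sweep a uniform isolated robustness value over the toy system.
    for r in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        rob = {name: r for name in CLIENTS}
        print(r, round(global_robustness(rob, CLIENTS), 3))
```

With every isolated robustness set to 0.5, this toy system has a global robustness of 0.63: components A and B have no clients and keep a weakness of 0.5, while C and D benefit from their clients' contracts (weaknesses 0.18 and 0.3). The toy graph is far sparser than the case studies, so the sweep is not expected to reproduce the published curves.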
[Figure: global robustness (vertical axis, 0 to 1) plotted against isolated robustness (horizontal axis, 0 to 1) for the SMDS, InterViews (IV) and Pylon systems.]
Fig. 5. Evolution of global robustness for three systems

With these results, we see that using no contracts help us understand more precisely the importance of
implies that the system is not robust, but that adding inheritance for local and global robustness.
simple contracts improves the global robustness
rapidly. For example, in the InterViews system, if the 4. Measuring Diagnosability
components isolated robustness is 0.4, the global
robustness is almost 0.7. Moreover, the three curves A failure may be observed during the software
show that improving the isolated robustness from 0.8 to 1, which
corresponds to the most costly improvements, brings little gain in
global robustness: for InterViews, the global robustness is already 0.94
when the isolated components’ robustness is 0.8.

The slight differences between the systems correspond to different
dependency densities. Indeed, the local robustness of a component can be
increased by its clients’ contracts, and improving local robustness
improves global robustness. Thus, the more relationships there are
between components, the more local robustness can increase, and the
higher the global robustness.

We have measured how the SMDS robustness evolves under different
weightings of inheritance dependencies in the global robustness. We have
considered three cases. The first case corresponds to the pessimistic
assumption under which we ignore inheritance. In the second case, we
have considered that inheritance is less important than other
dependencies for global robustness, and took a coefficient K=0.2 for
Det_ij. Finally, we considered that inheritance is as important as other
dependencies. The measures showed that the maximum difference between
values is only 3%.

All we can say now is that the three cases bound the true robustness
value, and that future work should

development as well as the maintenance stage. Given the occurrence of a
failure, diagnosis requires a set of additional symptoms in order to
determine the faulty part of the system which causes the detected
failure. Diagnosis is thus defined as the task of locating faulty parts
of a system when a failure is detected. The notion and definition of
what we call “diagnosability” are introduced in this section: the formal
part of the measurement process is not presented. We just briefly
comment on the main results obtained using a diagnosability measurement
and its impact on system quality.

To analyze the diagnosability attribute, one needs to understand the
main methods used for locating faults in the software after they have
been detected.

4.1. Diagnosis practices in the software domain

A first way of locating faults consists of performing some
cross-checking between information resulting from test executions. Such
systematic cross-checking of test results and executed paths may lead to
semi-automated diagnosis strategies [Khalil98]. Along these lines, most
reported diagnosis work is based on program slicing techniques. These
techniques focus on the software code at the unit and integration
levels. Various slicing methods exist [Weiser84, Weiser82, Kamkar95,
Korel97,
Agrawal95] which basically consist in extracting from the program a set
of statements that can be executed independently (this corresponds to a
slice of the program). Fault localization consists in executing the
program slice by slice and analyzing each slice result. The main
limitation of this technique is its cost in terms of human effort.
Indeed, each slice requires an oracle to be determined and, because the
slices have no simple functional meaning, this often needs human
intervention.

Another classical way of locating faulty statements consists of
inserting assertions in the program to detect some internal faulty state
during execution. The systematic use of assertions before and after
procedure calls may be very efficient for detecting and locating faults.
Design by contract is a generalization of this principle. However, the
effort needed to define and insert assertions may be significant because
it implies a good understanding of the internal meaning of the procedure
and of the expected values of the data. Some works have focused on how
to insert assertions where needed in the program, using testability
criteria [Voas95].

Confronted with the problem of diagnosis, which remains a non-automated
task, it would be useful to appraise the probable difficulty of locating
faults in the software beforehand. Such a predictive estimate is called
diagnosability (see [Le Traon98] for data flow designs), and provides a
way of improving design quality.

To illustrate the importance of such a measurement, we concentrate on
the impact of executable contracts/assertions on OO system
diagnosability. The measures are generic enough to be adapted to
classical procedural programming.

4.2. Diagnosability: analysis of the notion

The localization effort is related to the size of the sets of suspected
components or statements. The bigger the suspected sets are, the more
difficult the diagnosis is. Moreover, it is intuitively more difficult
to distinguish a faulty statement among ten statements than among two.
Applying the same reasoning, the diagnosis preciseness obtained by a
test strategy is higher if the fault is located among two statements
rather than among ten. These intuitive considerations can be expressed
by three propositions:
- the diagnosability is composed of the localization effort and the
  diagnosis preciseness,
- the localization effort and the diagnosis preciseness are closely
  connected,
- the diagnosability depends on the capacity to isolate statements in
  the structure, either by applying a test strategy covering this
  structure or by using “watchdogs”, e.g., contracts and other
  assertions.

The underlying attribute expressing the localization of faulty
statements among a set of suspected statements is called
indistinguishability. To conclude, the diagnosis effort and the
preciseness both relate to the number of indistinguishable statements
among which the faulty statement has to be localized.

4.3. Definitions

In this section, we detail the definition of diagnosability. The
refinement of the intuitive diagnosability definition into a set of
behavioral axioms is not presented here for the sake of conciseness. The
process used for producing a diagnosability measurement is similar to
the one used for robustness. Since the intuitive aspects of
diagnosability have already been discussed, diagnosability and its
related attributes can now be defined.

Definition (Informal) - Diagnosability: Diagnosability expresses the
localization effort as well as the precision allowed by a test strategy
on a given system.

The effort depends on the selected test strategy. However, for the sake
of simplicity, we do not consider multiple-path diagnosis strategies
(the problem has been studied in detail in [Le Traon98]) since we focus
on the impact of contracts on diagnosability. While for robustness our
basic measurable attribute was a component and its contracts, our entry
points to diagnosability measurement are the statements executed when a
failure occurs or when a contract (or assertion) detects the fault. The
notion of component is set aside here, since we concentrate first on a
particular software execution.

Assumptions/propositions:
- the software is assumed to be faulty: there exists an execution of the
  system that would provoke a failure if it were not detected by a
  contract,
- contracts are assumed to be correct,
- a fault can be modeled as an invalid state in the global program
  state, which differs from the expected one after the execution of a
  statement. We call this statement the faulty statement, even if it is
  not necessarily the cause of the failure (which can, for example, be
  an omitted statement),
- the main diagnosis task consists of locating this divergence point,
- as a consequence, if an execution flow is faulty at multiple points,
  the diagnosis will point out the first divergence point (faults that
  compensate each other are considered negligible for a global
  estimate).
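
As a rough illustration of the indistinguishability attribute and of the
assumptions above, the following Python sketch cuts a single,
non-distributed execution flow at each contract and reports the groups
of statements left between two consecutive contracts (or between a
contract and the entry or exit of the flow). It is only a sketch under
stated assumptions: the Statement and Contract classes, the function
names and the example flow are hypothetical, and every contract is
assumed to detect an invalid state with certainty, which is a
simplification and not the measurement model of this paper.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Statement:
    """A statement executed in the flow (hypothetical representation)."""
    name: str


@dataclass(frozen=True)
class Contract:
    """An executable contract or assertion checked in the flow (hypothetical)."""
    name: str


FlowItem = Union[Statement, Contract]


def indistinguishability_sets(flow: List[FlowItem]) -> List[List[Statement]]:
    """Group the statements that no contract separates.

    Assumptions: the flow is linear (non-distributed) and every contract
    detects an invalid state with certainty.
    """
    groups: List[List[Statement]] = []
    current: List[Statement] = []
    for item in flow:
        if isinstance(item, Contract):
            if current:  # this contract closes the current group
                groups.append(current)
            current = []
        else:
            current.append(item)
    if current:  # statements between the last contract and the flow exit
        groups.append(current)
    return groups


def suspects_for(flow: List[FlowItem], faulty: Statement) -> List[Statement]:
    """Return the statements that cannot be told apart from the faulty one."""
    for group in indistinguishability_sets(flow):
        if faulty in group:
            return group
    raise ValueError("faulty statement is not part of the flow")


# Hypothetical flow: six statements, with and without two contracts.
s = [Statement(f"s{i}") for i in range(1, 7)]
with_contracts = [s[0], s[1], Contract("c1"), s[2], Contract("c2"),
                  s[3], s[4], s[5]]
without_contracts = list(s)

print([len(g) for g in indistinguishability_sets(with_contracts)])     # [2, 1, 3]
print([len(g) for g in indistinguishability_sets(without_contracts)])  # [6]
print([st.name for st in suspects_for(with_contracts, s[4])])  # ['s4', 's5', 's6']
```

In this hypothetical flow, two contracts reduce the largest suspected
set from six statements to three, and a fault in s3 is isolated exactly.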
Definition - Execution flow: An execution flow is a partially ordered
set of statements and contracts (or assertions) that is executed by a
given program. Implicitly, we only consider flows that would provoke a
failure.

The statements are partially ordered because, particularly in an OO
system, some of the execution may be distributed over several threads.
To simplify the mathematical model, we only consider a non-distributed
flow. The exhaustive mathematical modeling of such flows is not
presented because it does not significantly modify the results of the
measurements while it makes the model much more complex. In our case, an
execution flow is equivalent to a dynamic slice of the system.

Definition - Indistinguishable statements: Two statements are
indistinguishable from each other if they are bounded by consecutive
contracts in an execution flow (or by the entry and exit of the flow).

Definition - Indistinguishability set: An indistinguishability set
corresponds to a set of indistinguishable statements.

Definition - Size of an indistinguishability set: The size of an
indistinguishability set is equal to the cardinality of this set. The
smallest possible size is 1, the case when the suspected set is made up
of only one statement.

For local measurements (attached respectively to a statement and to an
execution flow), the diagnosability is more directly expressed in terms
of diagnosis effort (in that case, a diagnosability improvement
corresponds to a reduced diagnosis effort) while the global
diagnosability measure (attached to a system) is related to the
precision of diagnosis.

Definition - Local diagnosis effort (δ): The local diagnosability δ of a
statement Stat in an execution flow F is the probable effort needed for
determining that Stat is faulty in F when the number of statements, the
number and distribution of contracts and their efficiency are known.

Definition - Local diagnosis effort for a flow (δeff): The global
diagnosis effort for a flow F is the probable effort needed for pointing
out the faulty statement, knowing that a fault is detected in F, and
knowing the number of statements, the number of contracts and their
efficiency.

Definition - Global diagnosability of a system (∆): The global
diagnosability of a system S is the probable degree of preciseness
obtained depending on the density and quality of the embedded contracts.

From these measurement definitions, derived from the informal ones, one
can deduce the expected measure profiles [Le Traon00]. A diagnosis
effort is a ratio, extensive measure (operations such as addition are
possible since it is a counting measure). In our study, the measurable
attribute associated with the diagnosis effort is the size of the
indistinguishability set in which the faulty statement must be located
(by inspection with program slicing, for example). Local diagnosability
and diagnosis effort should thus express the probable size of an
indistinguishability set. These measurements thus take their values in
[1..+∞]. Note that a good diagnosability corresponds to a low diagnosis
effort. For the global diagnosability ∆, a difficulty comes from the
fact that there is no general relationship between the size of the
execution flow and the size or architecture of the system. What we want
to measure is the degree of diagnosis precision obtained with a certain
proportion/quality of contracts in the system, compared to the same
system with no contracts (as an absolute reference value). ∆ is also a
ratio measurement but it is intensive, in the sense that values cannot
be easily combined. For ∆, the best diagnosability corresponds to a
value of 1 and the worst to 0 (when no contracts or assertions allow the
reduction of the diagnosis scope).

Measures profiles:

δ: Statement × Flow → Real over [1..+∞]
δeff: Statement → Real over [1..+∞]
∆: System → Real over [0..1]

The details of the axiomatization and measure definition are not given
in this paper; we just comment on the diagnosability results in relation
to the density of contracts in the design and their quality (in terms of
the probability of detecting the faulty state of the system).

4.4. Results and conclusions

Results show that the introduction of contracts quickly enhances the
global diagnosability of the system. Moreover, the ∆ values are stable
over the [0.17, 1] range of contract density. So, the addition of many
contracts (high contract density) does not significantly improve the
global diagnosability of a system:
∆ ≈ 0.6 with contract efficiency equal to 0.2 and a contract density
∈ [0.17, 1]
∆ ≈ 0.9 with contract efficiency equal to 0.4 and a contract density
∈ [0.17, 1]
Finally, the quality of the contracts is more important than their
number, since it is the only way to raise the upper bound of
diagnosability.

The conclusions we can deduce from this measurement are the following:
- a 0.2 contract/assertion density is enough to reach the upper bound of
  diagnosability for a given average contract efficiency. It has to be
  noted that in good OO designs the size of methods is often small, and
  includes a small number of statements. A 0.2 contract density
  corresponds to a good and feasible density for an OO system. In most
  cases, the use of assertions in the body of methods is thus useless.
  This result could not be easily predicted without a mathematical
  model.
- Quality is better than quantity. For the same contract density, the
  diagnosability is highly sensitive to the quality of contracts. It is
  better to put the effort on a good design with high encapsulation and
  well-defined interfaces (supporting clearer properties derivable into
  contracts) than to put the effort on defensive assertions, which are
  often unclear and dependent on the code.

Design by contract is thus a very efficient way of improving the
diagnosability and robustness of a system and its general quality.

5. Conclusion

The work presented here focused on the definition of measures for two
quality factors in a particular problem domain: the use of a design by
contract approach for designing and implementing OO systems. The main
steps in the construction of the measures have been given, from informal
definitions of the factors to be measured (robustness and
diagnosability) to the mathematical model of the measures. To fix the
parameters, experimental studies have been conducted, essentially based
on applying mutation analysis in the OO context. Several measures have
been presented that estimate the contribution of contract quality and
density to the overall quality of a system in terms of robustness and
diagnosability. Finally, our results confirm that the quality of
contracts is more important than their quantity.

Acknowledgements

We thank Clémentine Nebut and Franck Allion for computing the data about
the Pylon library.

References

[Agrawal95] H. Agrawal, J. Horgan, S. London, and W. Wong, “Fault
Localization using Execution Slices and Dataflow Tests”, presented at
6th International Symposium on Software Reliability Engineering,
Toulouse (France), 1995.
[Baudry00] B. Baudry, Y. Le Traon, H. Vu Le, “Testing-for-Trust: the
Genetic Selection Model applied to Component Qualification”, In
proceedings of TOOLS’2000 (Technology of Object Oriented Languages and
Systems), pp. 108-119, June 2000.
[Briand96] L. C. Briand, S. Morasca, and V. R. Basili, “Property-Based
Software Engineering Measurement”, IEEE Transactions on Software
Engineering, vol. 22, pp. 68-86, 1996.
[DeMillo78] R. DeMillo, R. Lipton, and F. Sayward, “Hints on Test Data
Selection: Help For The Practicing Programmer”, IEEE Computer, vol. 11,
pp. 34-41, 1978.
[Fenton86] N. E. Fenton and R. W. Whitty, “Axiomatic approach to
Software Metrication through Program Decomposition”, in The Computer
Journal, vol. 29, 1986, pp. 330-339.
[Jéron99] T. Jéron, J-M. Jézéquel, Y. Le Traon, and P. Morel, “Efficient
Strategies for Integration and Regression Testing of OO Systems”, In
proc. of the 10th International Symposium on Software Reliability
Engineering (ISSRE’99), pp. 260-269, Boca Raton (Florida), November
1999.
[Jézéquel97] J-M. Jézéquel and B. Meyer, “Design by Contract: The
lessons of Ariane”, Computer, vol. 30, No. 1, pp. 129-130, January 1997.
[Kamkar95] M. Kamkar, “An Overview and Comparative Classification of
Program Slicing Techniques”, Systems Software, vol. 31, pp. 197-214,
1995.
[Khalil98] M. Khalil, Y. Le Traon and C. Robach, “Automated Strategies
for Software Diagnosis”, Proc. of the IEEE International Symposium on
Software Reliability Engineering (ISSRE’98), Paderborn (Germany),
November 1998.
[Kitchenham95] B. Kitchenham, S. L. Pfleeger, N. Fenton, “Towards a
Framework for Software Measurement Validation”, IEEE Transactions on
Software Engineering, vol. 21, No. 12, pp. 929-943, December 1995.
[Korel97] B. Korel, “Computation Of Dynamic Program Slices For
Unstructured Programs”, IEEE Transactions on Software Engineering, vol.
23, pp. 17-34, 1997.
[Le Traon98] Y. Le Traon, F. Ouabdesselam and C. Robach, “Software
Diagnosability”, Proc. of the IEEE International Symposium on Software
Reliability Engineering (ISSRE’98), Paderborn (Germany), November 1998.
[Le Traon00] Y. Le Traon, F. Ouabdesselam and C. Robach, “Analyzing
Testability on Data Flow Designs”, to be presented at the International
Symposium on Software Reliability Engineering 2000 (ISSRE’00), San José
(CA), October 2000.
[Liskov86] B. Liskov and J. Guttag, “Abstraction and Specification in
Program Development”, MIT Press/McGraw-Hill, 1986.
[Meyer92] B. Meyer, “Applying “Design by contract””, IEEE Computer, vol.
25, No. 10, pp. 40-52, October 1992.
[Meyer00] B. Meyer, “Toward More expressive contracts”, Journal of OO
Programming (JOOP), July-August 2000, pp. 39-43.
[Offutt96] J. Offutt, J. Pan, K. Tewary and T. Zhang, “An experimental
evaluation of data flow and mutation testing”, Software Practice and
Experience, vol. 26, No. 2, February 1996.
[Shepperd93] M. Shepperd and D. Ince, “Derivation and Validation of
Software Metrics”, Oxford, 1993.
[Voas92] J. Voas and K. Miller, “The Revealing Power of a Test Case”,
Software Testing, Verification and Reliability, vol. 2, pp. 25-42, 1992.
[Voas95] J. M. Voas, “Software Testability Measurement for Assertion
Placement and Fault Localization”, presented at 2nd International
Workshop on Automated and Algorithmic Debugging (AADEBUG'95), Saint-Malo
(France), 1995.
[Weiser82] M. Weiser, “Programmers Use Slices When Debugging”,
Communications of the ACM, vol. 25, pp. 446-452, 1982.
[Weiser84] M. Weiser, “Program Slicing”, IEEE Transactions on Software
Engineering, vol. 10, pp. 352-357, 1984.
[Figure: class diagram of the Pylon library, showing its container
classes (e.g. Container, Sequence, List, Linked_list, Tree, Dictionary),
its iterator classes (e.g. Iterator, Tree_iterator, Sequence_iterator)
and utility classes such as Date_time, Math and Random.]

Fig. 4. The Pylon library
