Software Quality
Approaches:
Testing, Verification,
and Validation
Software Best Practice 1
Springer
Editors:

Michael Haug
Eric W. Olsen
HIGHWARE GmbH
Winzererstraße 46
80797 München, Germany
Michael@Haug.com
ewo@home.com

Luisa Consolini
GEMINI soc. cons. a r.l.
Via S. Serlio 24/2
40128 Bologna, Italy
luisa@gemini.it
ISBN 978-3-540-41784-2
http://www.springer.de
© Springer-Verlag Berlin Heidelberg 2001
Originally published by Springer-Verlag Berlin Heidelberg New York 2001
The use of general descriptive names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
Cover design: design & production GmbH, Heidelberg
Typesetting: Camera-ready by editors
Printed on acid-free paper SPIN: 10832653 45/3142 ud - 543210
Foreword
C. Amting
Directorate General Information Society, European Commission, Brussels
Under the 4th Framework of European Research, the European Systems and Soft-
ware Initiative (ESSI) was part of the ESPRIT Programme. This initiative funded
more than 470 projects in the area of software and system process improvements.
The majority of these projects were process improvement experiments carrying
out and taking up new development processes, methods and technology within the
software development process of a company. In addition, nodes (centres of exper-
tise), European networks (organisations managing local activities), training and
dissemination actions complemented the process improvement experiments.
ESSI aimed at improving the software development capabilities of European
enterprises. It focused on best practice and helped European companies to develop
world class skills and associated technologies to build the increasingly complex
and varied systems needed to compete in the marketplace.
The dissemination activities were designed to build a forum, at European level,
to exchange information and knowledge gained within process improvement ex-
periments. Their major objective was to spread the message and the results of
experiments to a wider audience, through a variety of different channels.
The European Experience Exchange (EUREX) project has been one of these dissemination
activities within the European Systems and Software Initiative. EUREX has collected
the results of practitioner reports from numerous workshops in Europe and presents,
in this series of books, the results of Best Practice achievements in European
companies over the last few years.
EUREX assessed, classified and categorised the outcome of process improvement
experiments. The theme-based books present the results for the particular problem
areas. These reports are designed to help other companies facing software process
improvement problems.
The results of the various projects collected in these books should encourage
many companies facing similar problems to start some improvements on their
own. Within the Information Society Technology (IST) programme under the 5th
Framework of European Research, new take up and best practices activities will
be launched in various Key Actions to encourage the companies in improving
their business areas.
Preface
M. Haug
HIGHWARE, Munich
In Part II we present the collected findings and experiences of the process im-
provement experiments that dealt with issues related to the problem domain ad-
dressed by the book. Part II consists of the chapters:
4 Perspectives
5 Resources for Practitioners
6 Experience Reports
7 Lessons from the EUREX Workshops
8 Significant Results
Part III offers summary information for all the experiments that fall into the
problem domain. These summaries, collected from publicly available sources,
provide the reader with a wealth of information about each of the large number of
projects undertaken. Part III includes the chapters:
9 Table of PIEs
10 Summaries of Process Improvement Experiment Reports
A book editor managed each of the books, compiling the contributions and
writing the connecting chapters and paragraphs. Much of the material originates in
papers written by the PIE organisations for presentation at EUREX workshops or
for public documentation like the Final Reports. Whenever an author could be
identified, we attribute the contributions to him or her. If it was not possible to
identify a specific author, the source of the information is provided. If a chapter is
without explicit reference to an author or a source, the book editor wrote it.
Many people contributed to EUREX, more than I can express my appreciation to in
this short notice. Representative of all of them, my special thanks go to
the following teams: David Talbot and Rainer Zimmermann (CEC) who made the
ESSI initiative happen; Mechthild Rohen, Brian Holmes, Corinna Amting and
Knud Lonsted, our Project Officers within the CEC, who accompanied the project
patiently and gave valuable advice; Luisa Consolini and Elisabetta Papini, the
Italian EUREX team, Manu de Uriarte, Jon Gomez and Iñaki Gomez, the Spanish
EUREX team, Gilles Vallet and Olivier Becart, the French EUREX team, Lars
Bergman and Terttu Orci, the Nordic EUREX team and Wilhelm Braunschober,
Bernhard Kölmel and Jörn Eisenbiegler, the German EUREX team; Eric W. Olsen
has patiently reviewed numerous versions of all contributions; Carola, Sebastian
and Julian have spent several hundred hours on shaping the various contributions
into a consistent presentation. Last but certainly not least, Ingeborg Mayer and
Hans Wössner continuously supported our efforts with their professional publishing
know-how; Gabriele Fischer and Ulrike Drechsler patiently reviewed the
many versions of the typescripts.
The biggest reward for all of us will be if you, the reader, find something in
these pages useful to you and your organisation, or, even better, if we motivate
you to implement Software Process Improvement within your organisation.
Opinions in these books are expressed solely on behalf of the authors. The European
Commission accepts no responsibility or liability whatsoever for the content.
1 Software Process Improvement

T. Linz, imbus GmbH, info@imbus.de
F. Lopez, Procedimientos-Uno SL, flopez@procuno.pta.es
1.1 Introduction
Enterprises in all developed sectors of the economy - not just the IT sector - are
increasingly dependent on quality software-based IT systems. Such systems sup-
port management, production, and service functions in diverse organisations.
Furthermore, the products and services now offered by the non-IT sectors, e.g., the
automotive industry or the consumer electronics industry, increasingly contain a
component of sophisticated software. For example, televisions require in excess of
half a Megabyte of software code to provide the wide variety of functions we have
come to expect from a domestic appliance. Similarly, the planning and execution
of a cutting pattern in the garment industry is accomplished under software con-
trol, as are many safety-critical functions in the control of, e.g., aeroplanes, eleva-
tors, trains, and electricity generating plants. Today, approximately 70% of all
software developed in Europe is developed in the non-IT sectors of the economy.
This makes software a technological topic of considerable significance. As the
information age develops, software will become even more pervasive and trans-
parent. Consequently, the ability to produce software efficiently, effectively, and
with consistently high quality will become increasingly important for all industries
across Europe if they are to maintain and enhance their competitiveness.
The goal of the European Systems and Software Initiative (ESSI) was to promote
improvements in the software development process in industry, through the take-
up of well-founded and established - but insufficiently deployed - methods and
technologies, so as to achieve greater efficiency, higher quality, and greater econ-
omy. In short, the adoption of Software Best Practice.
All material presented in Chapter I was taken from publicly available information issued
by the European Commission in the course of the European Systems and Software Initia-
tive (ESSI). It was compiled by the main editor to provide an overview of this pro-
gramme.
M. Haug et al. (eds.), Software Quality Approaches: Testing, Verification, and Validation
© Springer-Verlag Berlin Heidelberg 2001
The aim of the initiative was to ensure that European software developers in
both user and vendor organisations continue to have the world class skills, the
associated technology, and the improved practices necessary to build the increas-
ingly complex and varied systems demanded by the marketplace. The full impact
of the initiative for Europe will be achieved through a multiplier effect, with the
dissemination of results across national borders and across industrial sectors.
1.3 Strategy
Any organisation in any sector of the economy that regards the generation of software
as part of its operation may benefit from the adoption of Software Best
Practice. Such a user organisation is often not classified as being in the
software industry, but may well be an engineering or commercial organisation in
which the generation of software has emerged as a significant component of its
operation. Indeed, as the majority of software is produced by organisations in the
non-IT sector and by small and medium sized enterprises (SMEs), it is these two
groups who are likely to benefit the most from this initiative.
(Figure: improvements to the software development process lead to greater efficiency, better quality, better value for money, greater customer satisfaction, and competitive advantage.)
Software Best Practice activities focus on the continuous and stepwise improve-
ment of software development processes and practices. Software process im-
provement should not be seen as a goal in itself but must be clearly linked to the
business goals of an organisation. Software process improvement starts with
addressing the organisational issues. Experience has shown that before
any investments are made in true technology upgrades (through products like tools
and infrastructure computer support) some critical process issues need to be ad-
dressed and solved. They concern how software is actually developed: the meth-
odology and methods, and, especially, the organisation of the process of develop-
ment and maintenance of software.
Organisational issues are more important than methods and improving methods
is, in turn, more important than introducing the techniques and tools to support
them.
Finding the right organisational framework, the right process model, the right
methodology, the right supporting methods and techniques and the right mix of
skills for a development team is a difficult matter and a long-term goal of any
process improvement activity. Nevertheless, it is a fundamental requirement for
the establishment of a well-defined and controlled software development process.
3. Technical approach: methods, procedures, ...
4. Technical support: tools, computers, ...
The European Commission issued three Calls for Proposals for Software Best
Practice in the Fourth Framework Programme in the years 1993, 1995 and 1996.
The first call was referred to as the "ESSI Pilot Phase". The aim was to test the
perceived relevance of the programme to its intended audience and the effective-
ness of the implementation mechanisms. Before the second call in 1995 a major
review and redirection took place. Following the revision of the ESPRIT Work
programme in 1997, a further call was issued whose results are not reviewed in
this book. The four calls varied slightly in their focus. In the following, all
types of projects supported by the ESSI initiative are presented.
(Figure: the types of ESSI projects: Assessments, Process Improvement Experiments, Dissemination actions, Training actions, and Experience Networks.)
1.7.1 Stand Alone Assessments
The main objective of the Stand Alone Assessments action was to raise the awareness
of user organisations to the possibilities for improvement in their software
development process, as well as to give momentum for initiating the improve-
Stand Alone Assessments have been called only in the year 1995.
Process Improvement Experiments have been called in the years 1995, 1996 and 1997.
As the project type "Application Experiment" can be considered the predecessor of PIEs,
it is legitimate to say that PIEs have been subject to all ESSI calls and have formed not
only the bulk of projects but also the "heart" of the initiative.
(Figure: structure of a PIE: analysis of the current situation, experimentation, and analysis of the final situation, running alongside a baseline project, with dissemination and a next stage following.)
1.7.3 Application Experiments
1.7.4 Dissemination Actions
Application Experiments have only been called in 1993. See also the footnote to Process
Improvement Experiments.
Dissemination Actions have been called in 1993, 1995 and 1996.
The ESSI project EUREX which resulted in this book was such a Dissemination Action.
(Figure: dissemination actions channel information from worldwide sources to focused target audiences.)
1.7.5 Experience/User Networks
There was opportunity for networks of users, with a common interest, to pursue a
specific problem affecting the development or use of software. Experience/User
Networks mobilised groups of users at a European level and provided them with
the critical mass necessary to influence their suppliers and the future of the soft-
ware industry through the formulation of clear requirements. A network had to be
trans-national with users from more than one Member or Associated State.
By participating in an Experience/User Network, a user organisation helped to
ensure that a particular problem, with which it was closely involved, was addressed
and that it was able to influence the choice of proposed solution.
Software suppliers (of methodologies, tools, services, etc.) and the software
industry as a whole benefited from Experience/User Networks by receiving valuable
feedback on the strengths and weaknesses of their current offerings, together with
information on what is additionally required in the marketplace.
1.7.6 Training Actions
Training actions have been broad in scope and covered training, skilling and
education for all groups of people involved - directly or indirectly - in the devel-
opment of software. In particular, training actions aimed at:
• increasing the awareness of senior managers as to the benefits of software pro-
cess improvement and software quality
• providing software development professionals with the necessary skills to de-
velop software using best practice
Emphasis had been placed on actions which served as a catalyst for further
training and education through, for example, the training of trainers. In addition,
the application of current material - where available and appropriate - in a new or
wider context was preferred to the recreation of existing material.
Training Actions have been called in 1993 and 1996. Whereas the projects resulting from
the call in 1996 were organised as separate projects building the ESSI Training Cluster
ESSItrain, the result of the call in 1993 was one major project ESPITI which is described
in section 2.4.2.
ESSI PIE Nodes have only been called in 1997.
The objective of an ESBNET was to implement small scale software best practice
related activities on a regional basis, but within the context of a European net-
work. A network in this context was simply a group of organisations, based in
different countries, operating together to implement an ESBNET project, accord-
ing to an established plan of action, using appropriate methods, technologies and
other appropriate support. By operating on a regional level, it was expected that
the specific needs of a targeted audience would be better addressed. The regional
level was complemented by actions at European level, to exploit synergies and
bring cross-fertilisation between participants and their target audiences. A network
had a well defined focus, rather than being just a framework for conducting a set
of unrelated, regional software best practice activities.
The two ESSI tasks newly introduced in the Call for Proposals in 1997 - ESPI-
NODEs and ESBNETs - aimed to continue and build upon the achievements of
the initiative so far, but on a more regional basis. ESPINODEs aimed primarily
to provide additional support to PIEs, whilst ESBNETs aimed to integrate
small-scale software best practice actions of different types implemented on a
regional basis, with an emphasis on the non-PIE community.
By operating on a regional level, it was expected that ESPINODEs and ESBNETs
would be able to tailor their actions to the local culture, delivering the message
and operating in the most appropriate way for the region. Further, it was
expected that such regional actions would be able to penetrate much further into the
very corners of Europe, reaching a target audience which is much broader and
Software Best Practice Networks have only been called in 1997.
Over 70% of the organisations that participated in events organised during the
course of the ESPITI project (see section 2.4.2 below) were Small or Medium
Enterprises (SMEs), many of which had substantially fewer than 250 employees.
This response rate demonstrated a significant interest on the part of SMEs in
finding out more about Software Process Improvement (SPI). Therefore, the
primary target audience for EUREX was those European SMEs, and small teams in
non-IT organisations, engaged in the activity of developing software. Within these
organisations, the focus was on management and technical personnel in a position
to make decisions to undertake process improvement activities.
The ESPITI User Survey presents a clear picture of the needs and requirements
of SMEs concerning software process improvement. For example, 25% of those
who responded requested participation in working groups for experience ex-
change. However, SMEs are faced with many difficulties when it comes to trying
to implement improvement programmes.
For example, SMEs are generally less aware than larger companies of the bene-
fits of business-driven software process improvement. It is perceived as being an
expensive task and the standard examples that are quoted in an attempt to con-
vince them otherwise are invariably drawn from larger U.S. organisations and
therefore bear little relevance for European SMEs. ESSIgram No 11 also reported
that "peer review of experiment work in progress and results would be helpful."
Thus, SMEs need to see success among their peers, using moderate resources,
before they are prepared to change their views and consider embarking upon SPI
actions.
For those SMEs that are aware of the benefits of SPI, there are frequently other
inhibitors that prevent anything useful being accomplished. Many SMEs realise
that they should implement software process improvement actions but do not
know how to do this. They do not have the necessary skills and knowledge to do it
themselves and in many cases they do not have the financial resources to engage
external experts to help them. Consequently, SPI actions get deferred or cancelled
because other business priorities assume greater importance. Even those SMEs
that do successfully initiate SPI programmes can find that these activities are not
seen through to their natural completion stage because of operational or financial
constraints.
Many of the concerns about the relevance of SPI for SMEs were addressed by
EUREX in a series of workshops in which speakers from similarly characterised
companies spoke about their experiences with SPI. The workshops were an
integral part of the EUREX process and provided much of the data presented in this
volume.
The Commission funded EUREX in large measure because the evaluation of
approximately 300 PIEs was too costly for an independent endeavour. Even if some
resource-rich organisation had undertaken this task, it is likely that the results
would not have been disseminated, but would rather have been used to further
competitive advantage. Commission support has ensured that the results are widely
and publicly distributed.
Many ESSI dissemination actions have been organised as conferences or work-
shops. PIE Users register in order to discharge their obligations to the Commis-
sion; however, the selection and qualification of contributions is often less than
rigorous. In addition, many public conferences have added PIE presentation tracks
with little organisation of their content. Small audiences are a consequence of the
competition of that track with others in the conference. The common thread in
these experiences is that organisation of the actions had been lacking or passive.
EUREX turned this model on its head. PIE Users were approached proactively to
involve them in the process. In addition, the information exchange process was
actively managed. The EUREX workshops were organised around several distinct
problem domains, and workshop attendees were supported with expert assistance
to evaluate their situations and provide commentary on solutions from a broadly
experienced perspective. (See chapter 3 for a detailed discussion of the domain
selection process.) Participants were invited through press publications, the local
chambers of commerce, the Regional Organisations of EUREX and through
co-operation with other dissemination actions.
This approach provided a richer experience for attendees. Since the workshops
were domain-oriented, the participants heard different approaches to the same
issues and were presented with alternative experiences and solutions. This was a
more informative experience than simply hearing a talk about experiences in a
The EUREX Software Best Practice Reports (of which this volume is one) and
Executive Reports are directed at two distinct audiences. The first is the
technically oriented IT manager or developer interested in the full reports and
technology background. The second is senior management, for whom the Executive
Reports, summarising the benefits and risks of real cases, are appropriate.
2.3 Partners
Other ESSI Dissemination Actions have also generated significant results that
may be of interest to the reader. These actions include SISSI and ESPITI, both
described briefly below.
2.4.1.1 Overview
The target audience for the SISSI case studies is senior executives, i.e. decision-
makers, in software producing organisations throughout Europe. This includes both
software vendors and companies developing software for in-house use. The mate-
rial has been selected in such a way that it is relevant for both small and large
organisations.
SISSI produced a set of 33 case studies, of about 4 pages each, and distributed
50 case studies overall, together with cases from previous projects. Cases are not
exclusively technical; rather, they have a clear business orientation and are fo-
cused on action. Cases are a selected compendium of finished Process Improvement
Experiments (PIEs) funded by the ESSI programme of the EC. They are classified
according to parameters and keywords so that tailored and selective extractions
can be made by potential users or readers. The main selection criteria are the busi-
ness sector, the software process affected by the improvement project and its busi-
ness goals.
The dissemination mechanisms of SISSI were the following: a selective tele-
phone-led campaign addressed to 500 appropriate organisations together with
follow up actions; an extensive mailing campaign targeting 5000 additional or-
ganisations which have selected the relevant cases from an introductory document;
joint action with the European Network of SPI Nodes - ESPINODEs - to distrib-
ute the SISSI material and provide continuity to the SISSI project; WWW pages
with the full contents of the case studies; synergistic actions with other
Dissemination Actions of the ESSI initiative, like EUREX, SPIRE, RAPID; co-operation with
other agents like European publications, SPI institutions, or graduate studies act-
ing as secondary distribution channels.
SISSI developed an SPI Marketing Plan to systematically identify and access
this target market in any European country and distributed its contents through the
European Network of SPI Nodes both for a secondary distribution of SISSI Case
Studies, and for a suitable rendering of the ESPINODEs services. The plan was
implemented for the dissemination of the SISSI Case Studies in several European
countries, proving its validity.
2.4.1.2 Objectives
The main goals of the approach taken in the SISSI project have been as follows:
• The material produced comprises a wide variety of practical real cases
selected by the consultants of the consortium, and documented in a friendly and
didactic way to capture the interest of companies.
• The cases have clearly emphasised the key aspects of the improvement projects
in terms of competitive advantage and tangible benefits (cost, time to market,
quality).
• Most of the cases have been successes, but unsuccessful ones have also
been sought in order to analyse causes of failure, e.g. inadequate analysis of the
plan before starting the project.
• The project has not been specially focused on particular techniques or application
areas; rather, it has been a selected compendium of the current and finished
Process Improvement Experiments (PIEs). They have been classified according
to different parameters and keywords so that tailored and selective extractions
can be made by potential users or readers. The main selection criteria have
been: business sector (finance, electronics, manufacturing, software houses,
engineering, etc.), the software process, the business goals and some technological
aspects of the experiment.
• The Dissemination Action should open new markets, promoting the SPI benefits
in companies not already contacted by traditional ESSI actions.
• The SISSI Marketing Plan should provide the methodology and the information
not only to disseminate the SISSI material, but also to be generic enough to
direct the marketing of other ESSI services and SPI activities in general.
The SISSI material should be used in the future by organisations, other
dissemination actions and best practice networks as reference material to guide
software improvement and the practical approaches to it. In particular,
SISSI has to provide continuity beyond the project itself, supporting
the marketing of SPI in any other ESSI action.
2.4.2 ESPITI
3 The EUREX Taxonomy

One of the most significant tasks performed during the EUREX project was the
creation of the taxonomy needed to drive the Regional Workshops and, ultimately,
the content of these Software Best Practice Reports. In this chapter, we examine in
detail the process that led to the EUREX taxonomy and discuss how the taxonomy
led to the selection of PIEs for the specific subject domain.
A set of more than 150 attributes was refined in several iterations to arrive at a
coarse grain classification into technological problem domains. These domains
were defined such that the vast majority of PIEs fall into at least one of these do-
mains. There were seven steps used in the process of discovering the domains, as
described in the following paragraphs.
In part because of the distributed nature of the work and in part because of the
necessity for several iterations, the classification required 6 calendar months to
complete.
Each partner examined the PIEs conducted within its region and assigned attrib-
utes from the list given above that described the work done within the PIE (more
than one attribute per PIE was allowed). The regions were assigned as shown in
Table 3.1.
Partner Region
HIGHWARE Germany mapped all keywords used by the partners into a new set
of attributes, normalising the attribute names. No attribute was deleted, but the
overall number of distinct attributes decreased from 164 to 127. These attributes
were further mapped into classes and subclasses that differentiate the members of
each class. This second mapping led to a set of 24 classes, each containing 0 to 13
subclasses. The resulting classes are shown in Table 3.2.
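The two-stage mapping described above (raw keywords → normalised attributes → classes with subclasses) can be sketched as follows. This is an illustration only: the keyword, attribute and class names below are invented placeholders, not entries from the actual EUREX taxonomy.

```python
# Illustrative sketch of the two-stage attribute mapping described above.
# All keyword, attribute and class names are invented placeholders.

# Stage 1: normalise the raw keywords used by the partners into a common
# attribute vocabulary (synonyms collapse onto one attribute name).
NORMALISE = {
    "unit testing": "Testing",
    "test automation": "Testing",
    "code inspection": "Inspection",
    "inspections": "Inspection",
    "QA": "Quality Management",
    "quality assurance": "Quality Management",
}

# Stage 2: group the normalised attributes into classes with subclasses.
CLASSES = {
    "Testing": ("Verification & Validation", "Testing"),
    "Inspection": ("Verification & Validation", "Inspection"),
    "Quality Management": ("Quality Management", None),
}

def classify(raw_keywords):
    """Map a PIE's raw keywords to a sorted list of (class, subclass) pairs."""
    attributes = {NORMALISE[k] for k in raw_keywords if k in NORMALISE}
    return sorted(CLASSES[a] for a in attributes)

print(classify(["unit testing", "test automation", "QA"]))
# [('Quality Management', None), ('Verification & Validation', 'Testing')]
```

Note how the set comprehension in stage 1 mirrors the reduction reported in the text: several raw keywords collapse onto one attribute, so the attribute count shrinks without any attribute being deleted outright.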
The classification achieved by the process described above was reviewed by the
partners and accepted with minor adjustments. It is important to note that up to
this point, the classification was independent of the structure of the planned publi-
cations. It simply described the technical work done by PIEs in the consolidated
view of the project partners.
In the next step, this view was discussed and grouped into subject domains
suitable for the publications planned by the consortium.
Out of the original 24 classes, 7 were discarded from further consideration,
either because the number of PIEs in the class was not significant or because the
domain was already addressed by other ESSI Dissemination Actions (e.g. formal
methods, reengineering, and so on). The 17 remaining classes were grouped into the
subject domains shown in Table 3.3 such that each of the resulting 5 domains
forms a suitable working title for one of the EUREX books.
(Figure: bar chart of the number of PIEs by country, across A, B, D, DK, E, F, GR, I, IRL, ISR, N, NL, P, S, SF and UK.)
There were 33 PIEs that were not classified by EUREX. There were generally two
reasons for the lack of classification.
Fig. 3.2 Classification of PIEs Europe-wide (pie chart; the segment covering Configuration & Change Management and Requirements Engineering is labelled 18%).
1. Neither the EUREX reviewer alone nor the consortium as a whole was able to
   reach a conclusion for classification based on the PIE description as published.
2. The PIE addressed a very specific subject that did not correspond to a class
   defined by EUREX and/or the PIE was dealt with by other known ESSI projects,
   e.g. formal methods. The consortium tried to avoid too much overlap
   with other projects.
(Fig. 3.3: bar chart of unclassified PIEs by country, across A, B, D, E, F, GR, I, IRL, ISR, N, NL, P, S, SF and UK.)
When one of these rules was applied, the corresponding PIE was given no
classification and was not considered further by the EUREX analysis. Fig. 3.3 shows
the breakdown of unclassified PIEs by country.
As can be seen in Fig. 3.3, there were 33 PIEs that remained unclassified once
the EUREX analysis was complete.
Part II presents the results of the EUREX project with respect to the Testing,
Verification, and Quality Management classification. The attributes associated with this
classification were Testing, Validation and Verification, and Quality Management.
Within this classification there were a total of 93 PIEs that were assigned one or
more of these attributes. The distribution of these PIEs throughout Europe is
shown in Fig. 3.4.
Fig. 3.4 Testing, Verification, Validation, and Quality Management PIEs by Country (bar chart of PIE counts across A, B, D, DK, E, F, GR, I, IRL, ISR, N, NL, P, S, SF and UK).
4 Perspectives
L. Consolini
GEMINI, Bologna
Virtually all of the PIEs examined by EUREX fall into five subject domains.
GEMINI, the consortium partner for Italy, was responsible for the domain classi-
fied as Testing, Verification, Validation, and Quality Management, consisting of
88 PIEs performed throughout Europe between 1994 and 1997.
This volume discusses the results obtained by EUREX concerning this domain
and focuses primarily on the improvement of the product verification process
through better testing practices.
This chapter provides an introduction to the central theme of the domain and
presents the contributions of three authors who analyse the state-of-the art and the
state-of-the practice from different perspectives.
The activities performed during the development and maintenance process that
ensure that these aspects of quality are adequately represented in the software
product form the core of Software Quality Assurance. Some of these activities are
intended to ensure the implementation of a defined quality standard, others are
targeted at assessing and controlling the products of the software process to check
for defects and remove them before delivery.
The latter group of activities is more precisely named Quality Control activities
or, using terminology more common in the software industry, Verification
and Validation (V&V) activities. They consist mostly of document reviewing,
code inspection and testing; testing being by far the most widespread.
It is evident that the level and amount of V&V cannot be equal for all types of
software: what is suited to the production of safety critical software could be ex-
cessive and unaffordable in the production of low-risk commercial software. The
selection of the appropriate V&V activities in the context of a specific software
project or product revolves around the following issues:
• the nature of the specific quality targets to be achieved;
• the nature of the product;
• the specific customer's demands;
• the available resources;
• the available skills;
• the level of risk that can be accepted;
• schedule related issues.
The application of an appropriate product verification process consists of
assessing these issues, establishing the right quality targets and adopting the most
suitable strategy to achieve them. Such a strategy involves the selection of meth-
ods, techniques and tools that can be applied to perform V&V at different levels of
depth, thoroughness, productivity and skills demand.
Adopting the right product verification process according to the quality objec-
tives is also known as V&V planning (which also includes test planning) and is a
core Quality Management component.
Depending on the nature of the software production model used by an organisa-
tion, V&V planning can be performed anew for each project (custom-built soft-
ware, second party regulated software) or can be simplified by the tailored appli-
cation of standard practices described in internal procedures (commercial soft-
ware, off-the-shelf software). In the latter case, planning will concentrate on the
identification of the specific controls and tests to be carried out and on the set up
of an adequate environment to perform them in compliance with an organisation's
internal standards. Most of the PIEs represented in this Part II followed this ap-
proach.
The current culture and experience in the application of product verification
methods, techniques and tools is unfortunately quite unsatisfactory in the software
industry, principally in the commercial software area. More know-how is found
among the producers of highly regulated or safety critical software.
4.1 Introduction to the Subject Domain 35
The three authors represented are first of all practitioners and their articles are
based on their direct working experience; references to the application of general
concepts into a real environment are therefore frequent and substantiate the au-
thors' perspective on the subject domain.
More introductory information is also found in the expert presentation that
opened the third EUREX workshop in Spain (see Chapter 7.2.2).
M. Paradiso
IBM Semea Sud, Bari
Michele Paradiso is currently an Advisory IT Specialist at IBM Semea Sud in the
Application Products Software Development Center in Bari (Italy).
He has worked on software quality assurance applied to software development,
ISO 9000 auditing, software measurement, application of software reliability
growth modelling and test process improvement.
Since 1996 he has been a member of the IBM internal ISO auditor team and, for
his activities on test process improvement, he received an IBM Outstanding
Technical Achievement Award.
He received a bachelor's degree in Computer and Information Science from the
University of Bari (Italy).
Inspections and reviews of documents and code are executed to "assure" full
adherence to customer needs and to development standards. The execution
of these V&V activities marks a Checkpoint (CP) in the development process. A
specific approval is required to proceed to the next CP.
The "Develop & Verify Product" phase includes all aspects of product design,
coding and testing, together with the development of plans for marketing, distribu-
tion, servicing and supporting the product.
Current industry practices rely extensively on testing to ensure software quality.
The following levels of testing are executed to assess all the product quality char-
acteristics, which are part of the quality targets of the product, such as reliability,
functionality, performance, usability, maintainability and portability (as defined
by the ISO 9126 standard):
• Unit Testing (Developers): to test each module separately in order to verify
that it executes as specified without any programming error.
• Functional Testing (Application Domain Specialist): to test each function
separately in order to verify that functional requirements are implemented as
stated in the Functional Specification document. Formal test cases are defined
and executed; errors are recorded and test results analysed.
• Product Testing (Application): to verify proper execution of the whole product and to
11
The phase ends when the product meets the established specifications as demonstrated by
successful completion of V&V activities.
testing is enough not only on the basis of technical requirements but also on the
basis of risk and business considerations.
Unfortunately factors such as the increased complexity of the business, tech-
nology and development environment, as well as the lack of adequately trained
people, have increased the probability and the cost of failure.
The main weakness of the testing process currently used to test software products
can be identified in the areas of:
• test execution
• test documentation management
• measurement framework
• testing organisation and the cultural environment.
All these areas and their current shortfalls (as found in most software
development organisations) are analysed hereafter. In the remainder of this article
the same areas will be seen from the point of view of how an improved process
could work.
Any software process improvement should not be seen as a goal in itself; it
pays off only if it is clearly linked to the business goals of an organisation. Software
process improvement starts with addressing the organisational issues.
To increase business profitability and market share, several software development
companies declare their strong commitment to achieving greater customer
satisfaction by improving the quality of products and services as well as reducing
development costs and delivery time. The improvement of the testing process can
4.2 Software Verification & Validation Introduced 41
To achieve these benefits an improved testing process should address the main
weaknesses identified in chapter 4.1.2. The improved model will be described
hereafter according to the same decomposition of the testing process into the areas
of:
• test execution
• test documentation management
• measurement framework
• testing organisation and the cultural environment.
Packaging Testing), so this approach is not recommended for all levels and types
of testing.
Much of this investment can be reused when the applied software process
model is evolutionary and development proceeds by new releases of the same
product.
Another aspect where automated data management can considerably help the
testing effort is maintaining a cross-reference between test cases and requirements
specifications, to make the task of identifying the test cases related to changes
easier and more reliable.
get over this problem. Wide visibility of progress could be a facilitating factor,
too: the results achieved and the benefits perceived in daily work would be
appreciated by the people involved, becoming a motivating factor.
It is also important to promote an independent unit responsible for monitoring and
supporting the development teams during the transition. This unit could be involved
in the evaluation of the improvement action results and in the dissemination of the
experience gained.
Assessment: make a testing process assessment guided by an external organisation
and involving the developers directly performing the testing; set a baseline
against which future improvements can be measured. Outputs: awareness of the
weak points with respect to business needs; clarification of the key product
quality characteristics with respect to customer needs; identification of
improvement actions.

Consensus Building: share the results of the assessment with both senior
management and developers and get their commitment on the implementation of
the improvement actions. The commitment should be formalised. Output: diffused
and formalised commitment.

Organisation: assign roles and responsibilities for the improvement project,
particularly project management, technical and methodological direction, and
internal process support. Output: an adequate "Process Improvement Organisation".

Planning: organise and implement the improvement actions on a pilot project as
self-contained work packages, each of them associated with well identified
improvement objectives. Output: project plan.

Best Practices Definition: define the new, improved practices and identify the
skills to be acquired. Outputs: defined methodology; training programme.

Field Trial: apply the defined practices to a pilot project and continuously
monitor the results. Outputs: input to the evaluation step; refined practices
definition.

Evaluation: make a final assessment with the same approach as the initial one.
The comparison with the initial baseline will measure the improvement. Output:
an evaluation of the new practices in technical and business terms.

Diffusion: illustrate the results achieved to a wider audience in the
organisation, and turn them into the new process standards. Output:
institutionalised practices.
Any organisation, in any sector of the economy, that regards the production of
software as part of its operations may benefit from the adoption of the roadmap
described. Such a user organisation is often not classified as being in
the software industry, but may well be an engineering or commercial organisation
in which software has emerged as a significant component of its products, services
or processes.
Finally, a cost/benefit analysis (Table 4.5) has been included to help set out the
parameters according to which the suggested improvement actions can be measured
within or after the timeframe of an improvement pilot project.
12
The use of the recording tools requires an investment in terms of test case structure and
maintenance of the output recording. Several of these test cases could be automatically
re-executed during the system test of the new release of the "same product".
4.3 Testware
F. Milanese
Compuware, Milano
What is software testing? There are many definitions of software testing but the
classical definition is:
"Testing is the process of executing a program with the intent of finding errors"
But testing is much more...
First of all we will try to focus on customer needs: it is essential to understand
what a customer needs from an automated testing tool and what the goals are.
This is the first, most important step of the testing process.
Generally a customer needs better quality in the software produced and, at the
same time, needs to save time; an automated testing tool should therefore help to
improve the quality of software but should also reduce testing times. A testing tool
should be easy to learn and to use, and not very expensive; it should interface with
the most common planning and development tools and it should automate everything
that is tedious and boring in the testing process.
How to achieve these goals?
In order to obtain better quality software it is necessary to identify exactly
the quality factors that are essential for the application: reliability, integrity,
security, safety, correctness, ease of use, maintainability, portability, performance
and so on. For every quality factor, try to identify the best type of test to be
performed and the best testing tool.
In order to save time the customer should plan every testing action accurately
and should identify the most repetitive phases of testing: these are the steps to
automate!
There are many types of testing and therefore there are many methods of testing
an application system. Let's try to identify the main types of testing.
4.3.4 Debugging
Debugging is the process of finding the location of an error and correcting it.
Even if debugging is not strictly part of the testing process, it is the logical
consequence of testing and should always be followed by another testing
phase.
It is extremely important to collect as much fault information as possible in
order to optimise the debugging phases. The process of notifying an error from the
testing environment to the debugging environment and of tracking errors is also
known as defect tracking. Automated defect tracking tools document faults, notify
and assign faults to programmers, define the category and priority of faults, keep
track of the evolution and fixing times of problems, and document re-testing phases
until the closure of the problem.
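The defect record just described can be sketched as a small data structure with an enforced lifecycle; the field names and state names here are assumptions for illustration, not those of any particular defect tracking tool.

```python
from dataclasses import dataclass, field

# Illustrative defect lifecycle: each defect advances one state at a time,
# from being reported to being closed after successful re-testing.
STATES = ["open", "assigned", "fixed", "retested", "closed"]

@dataclass
class Defect:
    summary: str
    category: str          # e.g. "functional", "usability" (assumed taxonomy)
    priority: int
    assignee: str = ""
    history: list = field(default_factory=list)
    state: str = "open"

    def move_to(self, new_state):
        # Enforce the lifecycle: a defect may only advance to the next state.
        if STATES.index(new_state) != STATES.index(self.state) + 1:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.history.append((self.state, new_state))
        self.state = new_state

d = Defect("crash on save", category="functional", priority=1)
d.move_to("assigned"); d.assignee = "programmer-1"
d.move_to("fixed"); d.move_to("retested"); d.move_to("closed")
print(d.state, len(d.history))   # closed 4
```

The `history` list is what gives the tracking tool its "evolution and fixing times" view: every transition is recorded, so re-testing phases are documented until closure.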
There are many methods to debug a program:
• debugging by brute force (a method consisting of storage dumps, inserting print
statements throughout the program and using automated debugging tools),
• debugging by induction (the process of proceeding from the particulars to the
whole, that is, starting with the clues, the symptoms, to find the error) or
deduction (the process of proceeding from some general theories or premises,
using the processes of elimination and refinement, to arrive at the location of the
error),
• debugging by backtracking (an effective error-locating method for small programs
that starts at the point in the program where the incorrect result was produced
and deduces, from the observed output, what the values of the program's
variables must have been),
There are many other techniques for testing software used sometimes in particular
cases and critical applications.
We will mention:
• mutation analysis (a testing method that generates programs very similar to the
one under test and generates test cases for all these mutated programs)
• quality metrics (identification of metrics or benchmarks in order to evaluate the
quality of software)
• symbolic execution (where the program is executed symbolically, that is, a
variable can take on symbolic as well as numeric values, a symbolic value being
an expression containing symbolic and numeric values)
• test case generators (automated tools that, starting from specifications, generate
test data randomly for a particular program)
• simulators (tools that simulate the environment surrounding the system under test,
typically used when the real environment is too expensive or impossible to use)
• predictive models (models that estimate the number of errors remaining in a
program and determine when to stop testing and how much it will cost).
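The idea behind mutation analysis can be shown in a few lines: a "mutant" is the program with one small change, and a test suite is judged by whether it distinguishes the mutant from the original. This toy example (function and mutation chosen for illustration, not from any real mutation tool) shows a weak suite letting a boundary mutant survive until a boundary case is added.

```python
def is_adult(age):
    return age >= 18

def is_adult_mutant(age):        # mutation: >= replaced by >
    return age > 18

def kills(mutant, tests):
    """A mutant is 'killed' if at least one test case fails against it."""
    return any(mutant(arg) != expected for arg, expected in tests)

tests = [(20, True), (10, False)]          # original suite: no boundary case
assert not kills(is_adult_mutant, tests)   # mutant survives: the suite is weak

tests.append((18, True))                   # add the boundary test case
assert kills(is_adult_mutant, tests)       # now the mutant is killed
print("boundary mutant killed")
```

A surviving mutant is thus a concrete pointer to a missing test case, which is exactly the diagnostic value the technique offers.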
4.3.6 Tools
For almost every category of testing there are testing tools that help the automa-
tion of the testing process and make the work easier. We will mention the main
categories of these tools referring to the corresponding testing techniques and
identifying their key characteristics.
should not be confused with compilers even if some functions (e.g. warning
generation) are very similar.
They give information about the correctness of code, the number of lines of
comment compared to the number of lines of code, variables declared but neither
initialised nor used, unreachable code, loops, functions declared but never used,
and so on.
They should have graphing and reporting functions, because they can be used
to analyse the quality of the code, and they should be integrated with the compiler
of the specific language used by developers in order to be easily used during
programming.
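Two of the checks just listed, unreachable code and unused variables, can be sketched with Python's standard `ast` module; this is a deliberately minimal toy analyser, not a substitute for a real static analysis tool.

```python
import ast

SOURCE = '''
def double(x):
    unused = 1          # assigned but never read
    return x * 2
    print("done")       # unreachable: follows the return
'''

func = ast.parse(SOURCE).body[0]
warnings = []

# Unreachable-code check: any statement after a `return` in the same block.
for i, stmt in enumerate(func.body[:-1]):
    if isinstance(stmt, ast.Return):
        warnings.append(f"unreachable code at line {func.body[i + 1].lineno}")

# Unused-variable check: names assigned (Store context) but never read (Load).
stored = {n.id for n in ast.walk(func)
          if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
loaded = {n.id for n in ast.walk(func)
          if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)}
warnings += [f"variable '{name}' is never used" for name in stored - loaded]

for w in warnings:
    print(w)
```

Real tools of this category perform the same kind of syntax-tree inspection, only across whole projects and with many more rules.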
execution, and it should be able to read data from an external database and insert
data automatically into the program.
These tools should have the ability to manage the development of descriptive
manual tests and they should also have the ability to define pre-execution and
post-execution rules, environmental set-up and clean-up tasks.
They should have an open architecture that makes possible the integration
with as many automated testing tools as are required to test applications.
In many cases test plan management tools provide the ability to execute tests on
remote machines, so you can test an application across its distributed components
as if it were fully operational. Distributed test execution should also make it possible
to perform parallel testing, distributing a large testing load across a network
to make effective use of system resources.
These tools are usually tightly integrated with other development and management
tools. For example, they can make it possible to import test plans from
word processors and spreadsheets, to work with version control systems, to report and
graph results in reporting tools, to integrate with event debuggers, and to send fault
records automatically to defect tracking tools.
Test analysis is another fundamental aspect of test plan management tools: they
store the results of each run in a database, consolidate test results from multiple
tools in a single view, and allow the tester to see the passed or failed status of test
cases. Data can eventually be extracted in standard report formats.
As regression testing is performed, the results trend graphs make it possible to
compare results from multiple test cycles to determine the progress of application
quality.
Debugging Tools
The process of program debugging can be described as the activity performed
after executing a test case that revealed a fault. Debugging tools help the user to
locate errors with a precise static analysis of source code: inserting breakpoints,
executing the program step by step, watching the variables and, with the best
tools, performing event debugging.
All the debugging activity is based on the information provided by the testing
phase, so we can say that the quality and the performance of the debugging phase
depend heavily on the quality of the information sent by the automated testing
tools.
exchange systems - that need in-depth testing. In all these cases it could be necessary
to develop application-specific testing tools.
There is no limit to the complexity of these "ad hoc" tools: we need only mention
that they could be very expensive in terms of money, human resources and
know-how, and it is very important to evaluate all these aspects before deciding to
write specific testing programs.
4.3.7 Testware
What are the benefits of these testing tools and what are their limits?
It is hard to say without considering the specific testing environment. We can
say that the more repetitive the testing, the more convenient the use of automated
testing tools; the more creative the testing, the less convenient their use.
The best results will be obtained on repetitive testing actions in which the initial
testing effort is paid back by re-use of the testing scripts. There are many tools
that can help testers to automate actions, but no program exists that accepts a
specification as input and produces an automated test script as output: this step is
still human-dependent.
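The pay-off from script re-use can be illustrated with a data-driven suite: the runner is written once, the cases are data, and every re-run on a new release is essentially free. The function under test here is a toy stand-in, not a real application.

```python
# Toy application function standing in for the system under test.
def discount(price, percent):
    return round(price * (1 - percent / 100), 2)

# Repetitive, data-driven cases: cheap to re-run on every new release.
CASES = [
    ((100.0, 10), 90.0),
    ((80.0, 25), 60.0),
    ((19.99, 0), 19.99),
]

def run_suite():
    """Execute every case and record (args, expected, actual, passed)."""
    results = []
    for args, expected in CASES:
        actual = discount(*args)
        results.append((args, expected, actual, actual == expected))
    return results

failures = [r for r in run_suite() if not r[3]]
print(f"{len(CASES) - len(failures)} passed, {len(failures)} failed")
```

Adding a regression case for a newly found bug is one line of data, which is exactly where the initial scripting effort starts paying for itself.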
The best results will also be obtained only if a certain amount of time is dedicated
to test plans: planning a testing process is as important as development planning.
Tools will be useless without a strategic methodology setting out what to do
and how to do it.
Testing will not be completely automated by testing tools, but once the test
team is skilled the quality of software produced will be greater and a great amount
of time will be saved.
4.3.9 References
[Myers78]
Glenford J. Myers, The Art of Software Testing, Wiley-Interscience, 1978
[Ghezzi91]
Ghezzi, Fuggetta, Morasca, Morzenti, Pezzè, Ingegneria del Software, Mondadori Informatica, 1991
[Perry95]
William Perry, Effective Methods for Software Testing, Wiley, 1995
4.4 Classic Testing Mistakes 57
B. Marick
Testing Foundations
Brian Marick has 11 years of experience as programmer, tester, and line manager,
and has been the owner of Testing Foundations since 1992.
A trainer and consultant, he also spends a good deal of time on independent product
("black box") testing. In recent years, a considerable amount of his work has
been with mass-market software.
Brian Marick is the author of a groundbreaking book for practitioners: The
Craft of Software Testing (see Chapter 5).
It's easy to make mistakes when testing software or planning a testing effort.
Some mistakes are made so often, so repeatedly, by so many different people, that
they deserve the label Classic Mistake.
Classic mistakes cluster usefully into five groups, which I've called "themes":
• The Role of Testing: who does the testing team serve, and how does it do that?
• Planning the Testing Effort: how should the whole team's work be organised?
• Personnel Issues: who should test?
• The Tester at Work: designing, writing, and maintaining individual tests.
• Technology Rampant: quick technological fixes for hard problems.
I have two goals for this paper. First, it should identify the mistakes, put them
in context, describe why they're mistakes, and suggest alternatives. Because the
context of one mistake is usually prior mistakes, the paper is written in a narrative
style rather than as a list that can be read in any order. Second, the paper should be
a handy checklist of mistakes. For that reason, the classic mistakes are printed in a
larger bold font when they appear in the text, and they're also summarised at the
end.
Although many of these mistakes apply to all types of software projects, my
specific focus is the testing of commercial software products, not custom software
or software that is safety critical or mission critical.
This paper is essentially a series of bug reports for the testing process. You may
think some of them are features, not bugs. You may disagree with the severities I
assign. You may want more information to help in debugging, or want to volun-
teer information of your own. Any decent bug reporting system will treat the
original bug report as the first part of a conversation. So should it be with this
A first major mistake people make is thinking that the testing team is responsible
for assuring quality. This role, often assigned to the first testing team in an organi-
sation, makes it the last defence, the barrier between the development team (ac-
cused of producing bad quality) and the customer (who must be protected from
them). It's characterised by a testing team (often called the "Quality Assurance
Group") that has formal authority to prevent shipment of the product. That in itself
is a disheartening task: the testing team can't improve quality, only enforce a
minimal level. Worse, that authority is usually more apparent than real. Discover-
ing that, together with the perverse incentives of telling developers that quality is
someone else's job, leads to testing teams and testers who are disillusioned, cyni-
cal, and view themselves as victims. We've learned from Deming and others that
products are better and cheaper to produce when everyone, at every stage in de-
velopment, is responsible for the quality of their work ([Deming86], [Ishi-
kawa85]).
In practice, whatever the formal role, most organisations believe that the pur-
pose of testing is to find bugs. This is a less pernicious definition than the previous
one, but it's missing a key word. When I talk to programmers and development
managers about testers, one key sentence keeps coming up: "Testers aren't finding
the important bugs." Sometimes that's just griping, sometimes it's because the
programmers have a skewed sense of what's important, but I regret to say that all
too often it's valid criticism. Too many bug reports from testers are minor or ir-
relevant, and too many important bugs are missed.
What's an important bug? Important to whom? To a first approximation, the
answer must be "to customers". Almost everyone will nod their head upon hearing
this definition, but do they mean it? Here's a test of your organisation's maturity.
Suppose your product is a system that accepts email requests for service. As soon
as a request is received, it sends a reply that says "your request of 5/12/97 was
accepted and its reference ID is NIC-051297-3". A tester who sends in many
requests per day finds she has difficulty keeping track of which request goes with
which ID. She wishes that the original request were appended to the acknowl-
edgement. Furthermore, she realises that some customers will also generate many
requests per day, so would also appreciate this feature. Would she:
• file a bug report documenting a usability problem, with the expectation that it
will be assigned a reasonably high priority (because the fix is clearly useful to
everyone, important to some users, and easy to do)?
• file a bug report with the expectation that it will be assigned "enhancement
request" priority and disappear forever into the bug database?
• file a bug report that yields a "works as designed" resolution code, perhaps with
an email "nastygram" from a programmer or the development manager?
• not bother with a bug report because it would end up in cases (2) or (3)?
If usability problems are not considered valid bugs, your project defines the
testing task too narrowly. Testers are restricted to checking whether the product
does what was intended, not whether what was intended is useful. Customers do
not care about the distinction, and testers shouldn't either.
Testers are often the only people in the organisation who use the system as
heavily as an expert. They notice usability problems that experts will see. (Formal
usability testing almost invariably concentrates on novice users.) Expert customers
often don't report usability problems, because they've been trained to know it's
not worth their time. Instead, they wait (in vain, perhaps) for a more usable prod-
uct and switch to it. Testers can prevent that lost revenue.
While defining the purpose of testing as "finding bugs important to customers"
is a step forward, it's more restrictive than I like. It means that there is no focus on
an estimate of quality (and on the quality of that estimate). Consider these two
situations for a product with five subsystems.
• 100 bugs are found in subsystem 1 before release. (For simplicity, assume that
all bugs are of the highest priority.) No bugs are found in the other subsystems.
After release, no bugs are reported in subsystem 1, but 12 bugs are found in
each of the other subsystems.
• Before release, 50 bugs are found in subsystem 1. 6 bugs are found in each of
the other subsystems. After release, 50 bugs are found in subsystem 1 and 6
bugs in each of the other subsystems.
From the "find important bugs" standpoint, the first testing effort was superior.
It found 100 bugs before release, whereas the second found only 74. But I think
you can make a strong case that the second effort is more useful in practical terms.
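The totals in the two situations can be verified with a quick computation over the per-subsystem counts:

```python
# Per-subsystem bug counts for the two situations described above.
pre_release_1, post_release_1 = [100, 0, 0, 0, 0], [0, 12, 12, 12, 12]
pre_release_2, post_release_2 = [50, 6, 6, 6, 6],  [50, 6, 6, 6, 6]

print(sum(pre_release_1), sum(post_release_1))   # effort 1: 100 before, 48 after
print(sum(pre_release_2), sum(post_release_2))   # effort 2: 74 before, 74 after
```

Effort 1 finds more bugs (100 versus 74), but only effort 2 produces information about every subsystem.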
Let me restate the two situations in terms of what a test manager might say before
release:
• "We have tested subsystem 1 very thoroughly, and we believe we've found
almost all of the priority 1 bugs. Unfortunately, we don't know anything about
the bugginess of the remaining four subsystems."
• "We've tested all subsystems moderately thoroughly. Subsystem 1 is still very
buggy. The other subsystems are about 1/10th as buggy, though we're sure
bugs remain."
This is, admittedly, an extreme example, but it demonstrates an important point.
The project manager has a tough decision: would it be better to hold on to the
product for more work, or should it be shipped now? Many factors - all rough
estimates of possible futures - have to be weighed: Will a competitor beat us to
release and tie up the market? Will dropping an unfinished feature to make it into
a particular magazine's special "Java Development Environments" issue cause us
(Chart: bugs found and bugs fixed, by build, builds 1 to 10)
Early test design can do more than prevent coding bugs. As will be discussed in
the next theme, many tests will represent user tasks. The process of designing
them can find user interface and usability problems before expensive rework is
required. I've found problems like no user-visible place for error messages to go,
pluggable modules that didn't fit together, two screens that had to be used together
but could not be displayed simultaneously, and "obvious" functions that couldn't
be performed. Test design fits nicely into any usability engineering effort ([Niel-
sen93]) as a way of finding specification bugs.
I should note that involving testing early feels unnatural to many programmers
and development managers. There may be feelings that you are intruding on their
turf or not giving them the chance to make the mistakes that are an essential part
of design. Take care, especially at first, not to increase their workload or slow
them down. It may take one or two entire projects to establish your credibility and
usefulness.
I'll first discuss specific planning mistakes, then relate test planning to the role of
testing.
It's not unusual to see test plans biased toward functional testing. In functional
testing, particular features are tested in isolation. In a word processor, all the op-
tions for printing would be applied, one after the other. Editing options would later
get their own set of tests.
But there are often interactions between features, and functional testing tends to
miss them. For example, you might never notice that the sequence of operations
"open a document, edit the document, print the whole document, edit one page,
print that page" doesn't work. But customers surely will, because they don't use
products functionally. They have a task orientation. To find the bugs that custom-
ers see - that are important to customers - you need to write tests that cross func-
tional areas by mimicking typical user tasks. This type of testing is called scenario
testing, task-based testing, or use-case testing.
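A scenario test in this sense is simply a test whose steps mimic a user task across functional areas. Here is a sketch against an invented toy `Document` class (not any real word-processor API), exercising the open-edit-print sequence described above.

```python
# Toy stand-in for the application under test.
class Document:
    def __init__(self, pages):
        self.pages = list(pages)
    def edit(self, page, text):
        self.pages[page] = text
    def print_all(self):
        return "\n".join(self.pages)
    def print_page(self, page):
        return self.pages[page]

def test_edit_then_print_scenario():
    # Mimic the user task: open, edit, print the whole document,
    # edit one page, print that page.
    doc = Document(["intro", "body"])
    doc.edit(0, "new intro")
    assert doc.print_all() == "new intro\nbody"   # whole-document print sees the edit
    doc.edit(1, "new body")
    assert doc.print_page(1) == "new body"        # single-page print sees the edit

test_edit_then_print_scenario()
print("scenario passed")
```

Each assertion crosses a feature boundary (editing versus printing), which is precisely the interaction that isolated functional tests tend to miss.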
A bias toward functional testing also under-emphasises configuration testing.
Configuration testing checks how the product works on different hardware and
when combined with different third party software. There are typically many
combinations that need to be tried, requiring expensive labs stocked with hardware
and much time spent setting up tests, so configuration testing isn't cheap. But, it's
worth it when you discover that your standard in-house platform which "entirely
conforms to industry standards" actually behaves differently from most of the
machines on the market.
Both configuration testing and scenario testing test global, cross-functional as-
pects of the product. Another type of testing that spans the product checks how it
behaves under stress (a large number of transactions, very large transactions, a
large number of simultaneous transactions). Putting stress and load testing off to
the last minute is common, but it leaves you little time to do anything substantive
when you discover your product doesn't scale up to more than 12 users.15
Two related mistakes are not testing the documentation and not testing installa-
tion procedures. Testing the documentation means checking that all the proce-
dures and examples in the documentation work. Testing installation procedures is
a good way to avoid making a bad first impression.
15 Failure to apply particular types of testing is another reason why developers complain
that testers aren't finding the important bugs. Developers of an operating system could be
spending all their time debugging crashes of their private machines, crashes due to
networking bugs under normal load. The testers are doing straight "functional tests" on
isolated machines, so they don't find bugs. The bugs they do find are not more serious
than crashes (usually defined as highest severity for operating systems), and they're
probably less.
4.4 Classic Testing Mistakes 63
Beta programs are also useful for building word of mouth advertising, getting
"first glance" reviews in magazines, supporting third-party vendors who will build
their product on top of yours, and so on. Those are properly marketing activities,
not testing.
17 I use "confidence" in its colloquial rather than its statistical sense. Conventional testing
that searches specifically for bugs does not allow you to make statements like "this
product will run on 95±5% of Wintel machines". In that sense, it's weaker than statistical
or reliability testing, which uses statistical profiles of the customer environment to both
find bugs and make failure estimates. (See [Dyer92], [Lyu96], and [Musa87].) Statistical
testing can be difficult to apply, so I concentrate on a search for bugs as the way to get a
usable estimate. A lack of statistical validity doesn't mean that bug numbers give you
nothing but "warm and fuzzy (or cold and clammy) feelings". Given a modestly stable
testing process, development process, and product line, bug numbers lead to distinctly
better decisions, even if they don't come with p-values or statistical confidence intervals.
18 It's expensive to test quality into the product, but it may be the only alternative. Code
redesigns and rewrites may not be an option.
4.4.2.3 "So, Winter's early this Year. We're still going to Invade
Russia."
Good testers are systematic and organised, yet they are exposed to all the chaos
and twists and turns and changes of plan typical of a software development pro-
ject. In fact, the chaos is magnified by the time it gets to testers19 because of their
position at the end of the food chain and typically low status. One unfortunate
reaction is sticking stubbornly to the test plan. Emotionally, this can be very
satisfying: "They can flail around however they like, but I'm going to hunker
down and do my job." The problem is that your job is not to write tests. It's to find
the bugs that matter in the areas of greatest uncertainty and risk, and ignoring
changes in the reality of the product and project can mean that your testing
becomes irrelevant.20
That's not to say that testers should jump to readjust all their plans whenever
there's a shift in the wind, but my experience is that more testers let their plans
fossilise than overreact to project change.
Fresh out of college, I got my first job as a tester. I had been hired as a developer,
and knew nothing about testing, but, as they said, "we don't know enough about
you yet, so we'll put you somewhere where you can't do too much damage". In
due course, I "graduated" to development.
Using testing as a transitional job for new programmers is one of the two clas-
sic mistaken ways to staff a testing organisation. It has some virtues. One is that
you really can keep bad hires away from the code. A bozo in testing is often less
dangerous than a bozo in development. Another is that the developer may learn
something about testing that will be useful later. (In my case, it founded a career.)
And it's a way for the new hire to learn the product while still doing some useful
work.
The advantages are outweighed by the disadvantage: the new hire can't wait to
get out of testing. That's hardly conducive to good work. You could argue that the
testers have to do good work to get "paroled". Unfortunately, because people tend
to be as impressed by effort as by results, vigorous activity - especially activity
that establishes credentials as a programmer - becomes the way out. As a result,
the fledgling tester does things like becoming the expert in the local programma-
ble editor or complicated freeware tool. That, at least, is a potentially useful role,
19 How many proposed changes to a product are rejected because of their effect on the
testing schedule? How often does the effect on the testing team even cross a developer's
or marketer's mind?
20 This is yet another reason why developers complain that testers aren't finding the
important bugs. Because of market pressure, the project has shifted to an Internet focus,
but the testers are still using and testing the old "legacy" interface instead of the now
critically important web browser interface.
though it has nothing to do with testing. More dangerous is vigorous but misdi-
rected testing activity; namely, test automation. (See the last theme.)
Even if novice testers were well guided, having so much of the testing staff be
transients could only work if testing is a shallow algorithmic discipline. In fact,
good testers require deep knowledge and experience.
The second classic mistake is recruiting testers from the ranks of failed pro-
grammers. There are plenty of good testers who are not good programmers, but a
bad programmer likely has some work habits that will make him a bad tester, too.
For example, someone who makes lots of bugs because he's inattentive to detail
will miss lots of bugs for the same reason.
So how should the testing team be staffed? If you're willing to be part of the
training department,21 go ahead and accept new programmer hires. Accept as
applicants programmers who you suspect are rejects (some fraction of them really
have gotten tired of programming and want a change) but interview them as you
would an outside hire. When interviewing, concentrate less on formal
qualifications than on intelligence and the character of the candidate's thought.
A good tester has these qualities:22
• methodical and systematic.
• tactful and diplomatic (but firm when necessary).
• sceptical, especially about assumptions, and wants to see concrete evidence.
• able to notice and pursue odd details.
• good written and verbal skills (for explaining bugs clearly and concisely).
• a knack for anticipating what others are likely to misunderstand. (This is useful
both in finding bugs and writing bug reports.)
• a willingness to get one's hands dirty, to experiment, to try something to see
what happens.
Be especially careful to avoid the trap of testers who are not domain experts.
Too often, the tester of an accounting package knows little about accounting.
Consequently, she finds bugs that are unimportant to accountants and misses ones
that are. Further, she writes bug reports that make serious bugs seem irrelevant. A
programmer may not see past the unrepresentative test to the underlying important
problem. (See the discussion of reporting bugs in the next theme.)
Domain experts may be hard to find. Try to find a few. And hire testers who are
quick studies and are good at understanding other people's work patterns.
Two groups of people are readily at hand and often have those skills. But test-
ing teams often do not seek out applicants from the customer service staff or the
technical writing staff. The people who field email or phone problem reports de-
21 Some organisations rotate all developers through testing. Well, all developers except
those with enough clout to refuse. And sometimes people not in great demand don't seem
ever to rotate out. I've seen this approach work, but it's fragile.
22 See also the list in [Kaner93], chapter 15.
velop, if they're good, a sense of what matters to the customer (at least to the
vocal customer) and the best are very quick on their mental feet.
Like testers, technical writers often also lack detailed domain knowledge.
However, they're in the business of translating a product's behaviour into terms
that make sense to a user. Good technical writers develop a sense of what's impor-
tant, what's confusing, and so on. Those areas that are hard to explain are often
fruitful sources of bugs. (What confuses the user often also confuses the pro-
grammer.)
One reason these two groups are not tapped is an insistence that testers be able
to program. Programming skill brings with it certain advantages in bug hunting. A
programmer is more likely to find the number 2,147,483,648 interesting than an
accountant will. (It overflows a signed integer on most machines.) But such tricks
of the trade are easily learned by competent non-programmers, so not having them
is a weak reason for turning someone down.
If you hire according to these guidelines, you will avoid a testing team that
lacks diversity. All of the members will lack some skills, but the team as a whole
will have them all. Over time, in a team with mutual respect, the non-programmers
will pick up essential titbits of programming knowledge, the programmers will
pick up domain knowledge, and the people with a writing background will teach
the others how to deconstruct documents.
All testers - but non-programmers especially - will be hampered by a physical
separation between developers and testers. A smooth working relationship be-
tween developers and testers is essential to efficient testing. Too much valuable
information is unwritten; the tester finds it by talking to developers. Developers
and testers must often work together in debugging; that's much harder to do re-
motely. Developers often dismiss bug reports too readily, but it's harder to do that
to a tester you eat lunch with.
Remote testing can be made to work - I've done it - but you have to be careful.
Budget money for frequent working visits, and pay attention to interpersonal is-
sues.
Some believe that programmers can't test their own code. On the face of it, this
is false: programmers test their code all the time, and they do find bugs. Just not
enough of them, which is why we need independent testers.
But if independent testers are testing, and programmers are testing (and inspect-
ing), isn't there a potential duplication of effort? And isn't that wasteful? I think
the answer is yes. Ideally, programmers would concentrate on the types of bugs
they can find adequately well, and independent testers would concentrate on the
rest.
The bugs programmers can find well are those where their code does not do
what they intended. For example, a reasonably trained, reasonably motivated pro-
grammer can do a perfectly fine job finding boundary conditions and checking
whether each known equivalence class is handled. What programmers do poorly is
discovering overlooked special cases (especially error cases), bugs due to the
interaction of their code with other people's code (including system-wide
properties like deadlocks and performance problems), and usability problems.
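The kind of testing programmers do well can be made concrete. The function below is invented for the illustration; the tests show one representative value per equivalence class plus the boundaries between classes, which is exactly the sort of checking a reasonably trained programmer can apply to her own code.

```python
# Illustrative only: boundary-value and equivalence-class checks.
# classify_withdrawal is a made-up example function, not from the text.

def classify_withdrawal(amount, balance):
    if amount <= 0:
        return "rejected"       # non-positive requests
    if amount > balance:
        return "overdraft"      # more than is available
    return "approved"           # 0 < amount <= balance

# One representative per equivalence class, plus the boundaries.
assert classify_withdrawal(-5, 100) == "rejected"    # class: amount <= 0
assert classify_withdrawal(0, 100) == "rejected"     # boundary: amount == 0
assert classify_withdrawal(1, 100) == "approved"     # boundary: just above 0
assert classify_withdrawal(100, 100) == "approved"   # boundary: amount == balance
assert classify_withdrawal(101, 100) == "overdraft"  # class: amount > balance
```

What this style of testing will not find is the overlooked case, the interaction with someone else's code, or the usability problem, which is where independent testers earn their keep.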
Crudely put,23 good programmers do functional testing, and testers should do
everything else. Recall that I earlier claimed an over-concentration on functional
testing is a classic mistake. Decent programmer testing magnifies the damage it
does.
Of course, decent programmer testing is relatively rare, because programmers
are neither trained nor motivated to test. This is changing, gradually, as companies
realise it's cheaper to have bugs found and fixed quickly by one person, instead of
more slowly by two. Until then, testers must do both the testing that programmers
can do and the testing only testers can do, but must take care not to let functional
testing squeeze out the rest.
When testing, you must decide how to exercise the program, then do it. The doing
is ever so much more interesting than the deciding. A tester's itch to start breaking
the program is as strong as a programmer's itch to start writing code - and it has
the same effect: design work is skimped, and quality suffers. Paying more atten-
tion to running tests than to designing them is a classic mistake. A tester who is
not systematic, who does not spend time laying out the possibilities in advance,
will overlook special cases. They may be the same subtle ones that the program-
mers overlooked.
Concentration on execution also results in unreviewed test designs. Just like
programmers, testers can benefit from a second pair of eyes. Reviews of test de-
signs needn't be as elaborate as product design reviews, but a short check of the
testing approach and the resulting tests can find significant omissions at low cost.
23 Independent testers will also provide a "safety net" for programmer testing. A certain
amount of functional testing might be planned, or it might be a side effect of the other
types of testing being done.
Design 1
Setup: initialise the balance in account 12 with $100.
Procedure:
Start the program.
Type 12 in the Account window.
Press OK.
Click on the 'Withdraw' toolbar button.
In the withdraw popup dialog,
click on the 'all' button.
Press OK.
Expect to see a confirmation popup that
says "You are about to withdraw all
the money from this account. Continue?"
Press OK.
Expect to see a 0 balance in the account window.
Separately query the database to check
that the zero balance has been posted.
Exit the program with File->Exit.
Design 2
Setup: initialise the balance with a positive value.
Procedure:
Start the program on that account.
Withdraw all the money from the account
using the 'all' button.
It's an error if the transaction happens
without a confirmation popup.
Immediately thereafter:
- Expect a $0 balance to be displayed.
- Independently query the database to check
that the zero balance has been posted.
log via the toolbar. Maybe the menu was always used. Maybe the toolbar but-
ton doesn't work at all!
• By spelling out all inputs, the first style prevents testers from carelessly overus-
ing simple values. For example, a tester might always test accounts with $100,
rather than using a variety of small and large balances. (Either style should in-
clude explicit tests for boundary and special values.)
However, there are also some disadvantages:
• The first style is more expensive to create.
• The inevitable minor changes to the user interface will break it, so it's more
expensive to maintain.
• Because each run of the test is exactly the same, there's no chance that a varia-
tion in procedure will stumble across a bug.
• It's hard for testers to follow a procedure exactly. When one makes a mistake -
pushes the wrong button, for example - will she really start over?
On balance, I believe the negatives often outweigh the positives, provided there
is a separate testing task to check that all the menu items and toolbar buttons are
hooked up. (Not only is a separate task more efficient, it's less error-prone. You're
less likely to accidentally omit some buttons.)
I do not mean to suggest that test cases should not be rigorous, only that they
should be no more rigorous than is justified, and that we testers sometimes err
on the side of uneconomical detail.
Detail in the expected results is less problematic than in the test procedure, but
too much detail can focus the tester's attention too much on checking against the
script he's following. That might encourage another classic mistake: not noticing
and exploring "irrelevant" oddities. Good testers are masters at noticing "some-
thing funny" and acting on it. Perhaps there is a brief flicker in some toolbar but-
ton which, when investigated, reveals a crash. Perhaps an operation takes an oddly
long time, which suggests to the attentive tester that increasing the size of an "ir-
relevant" dataset might cause the program to slow to a crawl. Good testing is a
combination of following a script and using it as a jumping-off point for an explo-
ration of the product.
An important special case of overlooking bugs is checking that the product
does what it's supposed to do, but not that it doesn't do what it isn't supposed to
do. As an example, suppose you have a program that updates a health care ser-
vice's database of family records. A test adds a second child to Dawn Marick's
record. Almost all testers would check that, after the update, Dawn now has two
children. Some testers - those who are clever, experienced, or subject matter ex-
perts - would check that Dawn Marick's spouse, Brian Marick, also now has two
children. Relatively few testers would check that no one else in the database has
had a child added. They would miss a bug where the programmer over-generalised
and assumed that all "family information" updates should be applied both to a
patient and to all members of her family, giving Paul Marick (aged 2) a child.
4.4 Classic Testing Mistakes 71
Ideally, every test should check that all data that should be modified has been
modified and that all other data has been unchanged. With forethought, that can be
built into automated tests. Complete checking may be impractical for manual tests,
but occasional quick scans for data that might be corrupted can be valuable.
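That forethought can be sketched directly. The schema, names, and update below are invented for the illustration (an in-memory SQLite table stands in for the health care database in the text); the technique is to snapshot the data before the operation, then assert both on the rows that should change and on every row that should not.

```python
# Sketch: check that everything else is unchanged. Table and names are
# invented; sqlite3 is Python's standard-library SQLite binding.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT PRIMARY KEY, children INTEGER)")
conn.executemany("INSERT INTO patients VALUES (?, ?)",
                 [("Dawn Marick", 1), ("Brian Marick", 1), ("Paul Marick", 0)])

def snapshot(conn):
    """Capture the whole table as {name: children}."""
    return dict(conn.execute("SELECT name, children FROM patients"))

before = snapshot(conn)

# The update under test: add a second child to Dawn's (and her spouse's) record.
conn.execute("UPDATE patients SET children = children + 1 "
             "WHERE name IN ('Dawn Marick', 'Brian Marick')")

after = snapshot(conn)

# Check the data that should have changed...
assert after["Dawn Marick"] == 2
assert after["Brian Marick"] == 2
# ...and that all other rows are untouched. This is the check that
# catches the over-generalisation bug that gives Paul Marick a child.
unchanged = {n for n in before if n not in ("Dawn Marick", "Brian Marick")}
assert all(after[n] == before[n] for n in unchanged)
```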
Design 3
Withdraw all with confirmation and normal check for O.
That means the same thing as Design 2 - but only to the original author. Test
suites that are understandable only by their owners are ubiquitous. They cause
many problems when their owners leave the company; sometimes many months'
worth of work has to be thrown out.
I should note that designs as detailed as Designs 1 or 2 often suffer a similar
problem. Although they can be run by anyone, not everyone can update them
when the product's interface changes. Because the tests do not list their purposes
explicitly, updates can easily make them test a little less than they used to. (Con-
sider, for example, a suite of tests in the Design 1 style: how hard will it be to
make sure that all the user interface controls are touched in the revised tests? Will
the tester even know that's a goal of the suite?) Over time, this leads to what I call
"test suite decar," in which a suite full of tests runs but no longer tests much of
2
anything at all.
Another classic mistake involves the boundary between the tester and pro-
grammer. Some products are mostly user interface; everything they do is visible
on the screen. Other products are mostly internals; the user interface is a "thin
pipe" that shows little of what happens inside. The problem is that testing has to
use that thin pipe to discover failures. What if complicated internal processing
produces only a "yes or no" answer? Any given test case could trigger many
internal faults that, through sheer bad luck, don't produce the wrong answer.25
In such situations, testers sometimes rely solely on programmer ("unit") testing.
In cases where that's not enough, testing only through the user-visible interface is
a mistake. It is far better to get the programmers to add "testability hooks" or
"testpoints" that reveal selected internal state. In essence, they convert a product
like that shown in Fig. 4.2 into one like shown in Fig. 4.3.
24 The purpose doesn't need to be listed with the test. It may be better to have a central
document describing the purposes of a group of tests, perhaps in tabular form. Of course,
then you have to keep that document up to date.
25 This is an example of the formal notion of "testability". See [Friedman95] or [Voas91]
for an academic treatment.
[Diagram: the User Interface sits atop the Guts of the Product]
Fig. 4.2 Program without testability hooks

[Diagram: a Testing Interface alongside the User Interface, both atop the Guts of the Product]
Fig. 4.3 Program with testing interface
It is often difficult to convince programmers to add test support code to the prod-
uct. (Actual quote: "I don't want to clutter up my code with testing crud.") Perse-
vere, start modestly, and take advantage of these facts:
• The test support code is often a simple extension of the debugging support code
programmers write anyway.26
• A small amount of test support code often goes a long way.
A common objection to this approach is that the test support code must be
compiled out of the final product (to avoid slowing it down). If so, tests that use
the testing interface "aren't testing what we ship". It is true that some of the tests
won't run on the final version, so you may miss bugs. But, without testability
code, you'll miss bugs that don't reveal themselves through the user interface. It's
a risk trade-off, and I believe that adding test support code usually wins. See
[Marick95], chapter 13, for more details.
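A minimal sketch of such a testability hook follows. The product class is invented; the point is the test-only method that reveals selected internal state which the "thin pipe" of the public interface never shows.

```python
# Sketch of a "testability hook". RateLimiter is a hypothetical product
# component whose public interface answers only yes or no.

class RateLimiter:
    def __init__(self, limit):
        self._limit = limit
        self._count = 0

    def allow(self):
        """The thin pipe: callers only ever see True or False."""
        self._count += 1
        return self._count <= self._limit

    def _testpoint(self):
        """Test support code: expose internal state for testers.
        In a shipping build this might be compiled or stripped out."""
        return {"limit": self._limit, "count": self._count}

limiter = RateLimiter(limit=2)
assert limiter.allow() and limiter.allow() and not limiter.allow()
# Through the hook, a tester can check *why* the answer was "no",
# not merely that it was "no".
assert limiter._testpoint() == {"limit": 2, "count": 3}
```

Without the hook, an internal fault (say, a miscounted request) could pass through the yes/no interface undetected by sheer luck.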
In one case, there's an alternative to having the programmer add code to the
product: have a tool do it. Commercial tools like Purify, Boundschecker, and Sen-
tinel automatically add code that checks for certain classes of failures (such as
26 For example, the Java language encourages programmers to use the toString method to
make internal objects printable. A programmer doesn't have to use it, since the debugger
lets her see all the values in any object, but it simplifies debugging for objects she'll look
at often. All testers need (roughly) is a way to call toString from some external interface.
memory leaks).27 They provide a narrow, specialised testing interface. For market-
ing reasons, these tools are sold as programmer debugging tools, but they're
equally test support tools, and I'm amazed that testing groups don't use them as a
matter of course.
Testability problems are exacerbated in distributed systems like conventional
client/server systems, multi-tiered client/server systems, Java applets that provide
smart front-ends to web sites, and so forth. Too often, tests of such systems
amount to shallow tests of the user interface component because that's the only
component that the tester can easily control.
• That area of the product is buggy.29 It's well known that bugs tend to cluster.
• That area of the product was inadequately tested. Otherwise, why did the bug
originally escape testing?
An appropriate response to several customer bug reports in an area is to sched-
ule more thorough testing for that area. Begin by examining the current tests (if
they're understandable) to determine their systematic weaknesses.
Finally, every bug report is a gift from a customer that tells you how to test bet-
ter in the future. A common mistake is failing to take notes for the next testing
effort. The next product will be somewhat like this one, the bugs will be somewhat
like these, and the tests useful in finding those bugs will also be somewhat like the
ones you just ran. Mental notes are easy to forget, and they're hard to hand to a
new tester. Writing is a wonderful human invention: use it. Both [Kaner93] and
[Marick95] describe formats for archiving test information, and both contain gen-
eral-purpose examples.
29 That's true even if the bug report is due to a customer misunderstanding. Perhaps this
area of the product is just too hard to understand.
each run once. Beware of irrational, emotional reasons for automating, such as
testers who find programming automated tests more fun, a perception that auto-
mated tests will lead to higher status (everything else is "monkey testing"), or a
fear of not rerunning a test that would have found a bug (thus leading you to
automate it, leaving you without enough time to write a test that would have found
a different bug).
You will likely end up in a compromise position, where you have:
• a set of automated tests that are run often.
• a well-documented set of manual tests. Subsets of these can be rerun as neces-
sary. For example, when a critical area of the system has been extensively
changed, you might rerun its manual tests. You might run different samples of
this suite after each major build.30
• a set of undocumented tests that were run once (including exploratory "bug
bash" tests).
Beware of expecting to rerun all manual tests. You will become bogged down
rerunning tests with low bug-finding value, leaving yourself no time to create new
tests. You will waste time documenting tests that don't need to be documented.
You could automate more tests if you could lower the cost of creating them.
That's the promise of using GUI capture/replay tools to reduce test creation cost.
The notion is that you simply execute a manual test, and the tool records what you
do. When you manually check the correctness of a value, the tool remembers that
correct value. You can then later play back the recording, and the tool will check
whether all checked values are the same as the remembered values.
There are two variants of such tools. What I call the first generation tools cap-
ture raw mouse movements or keystrokes and take snapshots of the pixels on the
screen. The second generation tools (often called "object oriented") reach into the
program and manipulate underlying data structures (widgets or controls).31
First generation tools produce un-maintainable tests. Whenever the screen lay-
out changes in the slightest way, the tests break. Mouse clicks are delivered to the
wrong place, and snapshots fail in irrelevant ways that nevertheless have to be
checked. Because screen layout changes are common, the constant manual updat-
ing of tests becomes insupportable.
Second generation tools are applicable only to tests where the underlying data
structures are useful. For example, they rarely apply to a photograph editing tool,
where you need to look at an actual image - at the actual bitmap. They also tend
30 An additional benefit of automated tests is that they can be run faster than manual tests.
That allows you to reduce the time between completion of a build and completion of its
testing. That can be especially important in the final builds, if only to avoid pressure
from executives itching to ship the product. You're trading fewer tests for faster time to
market. That can be a reasonable trade-off, but it doesn't affect the core of my argument,
which is that not all tests should be automated.
31 These are, in effect, another example of tools that add test support code to the program.
not to work with custom controls. Heavy users of capture/replay tools seem to
spend an inordinate amount of time trying to get the tool to deal with the special
features of their program - which raises the cost of test automation.
Second generation tools do not guarantee maintainability either. Suppose a ra-
dio button is changed to a pull-down list. All of the tests that use the old controls
will now be broken.
GUI interface changes are of course common, especially between releases.
Consider carefully whether an automated test that must be recaptured after GUI
changes is worth having. Keep in mind that it can be hard to figure out what a
captured test is attempting to accomplish unless it is separately documented.
As a rule of thumb, it's dangerous to assume that an automated test will pay for
itself this release, so your test must be able to survive a reasonable level of GUI
change. I believe that capture/replay tests, of either generation, are rarely robust
enough.
An alternative approach to capture/replay is scripting tests. (Most GUI cap-
ture/replay tools also allow scripting.) Some member of the testing team writes a
"test API" (application programmer interface) that lets other members of the team
express their tests in less GUI-dependent terms. Whereas a captured test might
look like this:
Captured Test
text $main.accountField "12"
click $main.OK
menu $operations
menu $withdraw
click $withdrawDialog.all
Script
select-account 12
withdraw all
The script commands are subroutines that perform the appropriate mouse clicks
and key presses. If the API is well-designed, most GUI changes will require
changes only to the implementation of functions like withdraw, not to all the tests
that use them.32 Please note that well-designed test APIs are as hard to write as any
other good API. That is, they're hard, and you shouldn't expect to get it right the
first time.
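The layering can be sketched as follows. The low-level commands imitate the captured-test vocabulary from the example above; here they merely log what they would do, since there is no real GUI to drive, and all widget names are taken from that example.

```python
# Sketch of a "test API" over raw GUI commands. The functions text,
# click and menu imitate the captured-test commands; in a real tool
# they would deliver events to the application.

gui_log = []
def text(field, value): gui_log.append(("text", field, value))
def click(widget):      gui_log.append(("click", widget))
def menu(item):         gui_log.append(("menu", item))

# The test API layer: a GUI change (say, Withdraw moving from a menu
# to a toolbar button) means editing one function here, not every test.
def select_account(number):
    text("$main.accountField", str(number))
    click("$main.OK")

def withdraw_all():
    menu("$operations")
    menu("$withdraw")
    click("$withdrawDialog.all")

# A test written against the API reads like the "Script" example:
select_account(12)
withdraw_all()
assert ("click", "$withdrawDialog.all") in gui_log
```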
32 The "Joe Gittano" stories and essays on my web page,
http://www.stlabs.com/marick/root.htm, go into this approach in more detail.
In a variant of this approach, the tests are data-driven. The tester provides a ta-
ble describing key values. Some tool reads the table and converts it to the appro-
priate mouse clicks. The table is even less vulnerable to GUI changes because the
sequence of operations has been abstracted away. It's also likely to be more
understandable, especially to domain experts who are not programmers. See
[Pettichord96] for an example of data-driven automated testing.
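In the spirit of that approach, a data-driven test might look like this. The field names and the driver are invented for the illustration; a real harness would translate each row into mouse clicks, while here the driver only records the abstract operation.

```python
# A data-driven sketch: each table row describes a transaction in domain
# terms; a small driver converts rows into operations. Field names are
# hypothetical, not from any real tool.

table = [
    {"account": 12, "operation": "withdraw", "amount": "all"},
    {"account": 7,  "operation": "deposit",  "amount": 50},
]

executed = []
def run_row(row):
    # A real harness would perform the clicks here; this driver just
    # records what would be done. Note the row never mentions menus,
    # buttons, or any other GUI detail.
    executed.append((row["account"], row["operation"], row["amount"]))

for row in table:
    run_row(row)

assert executed[0] == (12, "withdraw", "all")
```

Because the table carries no GUI detail at all, a domain expert can read and extend it, and interface changes touch only the driver.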
Note that these more abstract tests (whether scripted or data-driven) do not nec-
essarily test the user interface thoroughly. If the Withdraw dialog can be reached
via several routes (toolbar, menu item, and hotkey), you don't know whether each
route has been tried. You need a separate (most likely manual) effort to ensure that
all the GUI components are connected correctly.
Whatever approach you take, don't fall into the trap of expecting regression
tests to find a high proportion of new bugs. Regression tests discover that new or
changed code breaks what used to work. While that happens more often than any
of us would like, most bugs are in the product's new or intentionally changed
behaviour. Those bugs have to be caught by new tests.
For the same reason, removing tests from a regression test suite just because
they don't add coverage is dangerous. The point is not to cover the code; it's to
have tests that can discover enough of the bugs that are likely to be caused when
the code is changed. Unless the tests are ineptly designed, removing tests will just
remove power. If they are ineptly designed, using coverage converts a big and
lousy test suite to a small and lousy test suite. That's progress, I suppose, but it's
addressing the wrong problem.33
A grave danger of code coverage is that it is concrete, objective, and easy to
measure. Many managers today are using coverage as a performance goal for
testers. Unfortunately, a cardinal rule of management applies here: "Tell me how a
person is evaluated, and I'll tell you how he behaves." If a person is evaluated by
how much coverage is achieved in a given time (or in how little time it takes to
reach a particular coverage goal), that person will tend to write tests to achieve
high coverage in the fastest way possible. Unfortunately, that means short-
changing careful test design that targets bugs, and it certainly means avoiding
in-depth, repetitive testing of "already covered" code.34
Using coverage as a test design technique works only when the testers are both
designing poor tests and testing redundantly. They'd be better off at least targeting
33 Not all regression test suites have the same goals. Smoke tests are intended to run fast
and find grievous, obvious errors. A coverage-minimised test suite is entirely
appropriate.
34 In pathological cases, you'd never bother with user scenario testing, load testing, or
configuration testing, none of which add much, if any, coverage to functional testing.
4.4 Classic Testing Mistakes 79
their poor tests at new areas of code. In more normal situations, coverage as a
guide to design only decreases the value of the tests or puts testers under unpro-
ductive pressure to meet unhelpful goals.
Coverage does play a role in testing, not as a guide to test design, but as a rough
evaluation of it. After you've run your tests, ask what their coverage is. If certain
areas of the code have no or low coverage, you're sure to have tested them shal-
lowly. If that wasn't intentional, you should improve the tests by rethinking their
design. Coverage has told you where your tests are weak, but it's up to you to
understand how.
You might not entirely ignore coverage. You might glance at the uncovered
lines of code (possibly assisted by the programmer) to discover the kinds of tests
you omitted. For example, you might scan the code to determine that you under-
tested a dialog box's error handling. Having done that, you step back and think of
all the user errors the dialog box should handle, not how to provoke the error
checks on line 343, 354, and 399. By rethinking design, you'll not only execute
those lines, you might also discover that several other error checks are entirely
missing. (Coverage can't tell you how well you would have exercised needed code
that was left out of the program.)
There are types of coverage that point more directly to design mistakes than
statement coverage does (branch coverage, for example).35 However, none - and
not all of them put together - are so accurate that they can be used as test design
techniques.
One final note: Romances with coverage don't seem to end with the former
devotee wanting to be "just good friends". When, at the end of a year's use of
coverage, it has not solved the testing problem, I find testing groups abandoning
coverage entirely. That's a shame. When I test, I spend somewhat less than 5% of
my time looking at coverage results, rethinking my test design, and writing some
new tests to correct my mistakes. It's time well spent.
4.4.6 Acknowledgements
My discussions about testing with Cem Kaner have always been illuminating. The
LAWST (Los Altos Workshop on Software Testing) participants said many inter-
esting things about automated GUI testing. The LAWST participants were Chris
Agruss, Tom Arnold, James Bach, Jim Brooks, Doug Hoffman, Cem Kaner, Brian
Lawrence, Tom Lindemuth, Noel Nyman, Brett Pettichord, Drew Pritsker, and
Melora Svoboda. Paul Czyzewski, Peggy Fouts, Cem Kaner, Eric Petersen, Joe
Strazzere, Melora Svoboda, and Stephanie Young read an earlier draft.
35 See [Marick95], chapter 7, for a description of additional code coverage measures. See
also [Kaner96b] for a list of more than one hundred types of coverage.
80 4 Perspectives
4.4.7 References
[Cusumano95]
M. Cusumano and R. Selby, Microsoft Secrets, Free Press, 1995.
[Dyer92]
Michael Dyer, The Cleanroom Approach to Quality Software Development,
Wiley, 1992.
[Friedman95]
M. Friedman and J. Voas, Software Assessment: Reliability, Safety, Testability,
Wiley, 1995.
[Kaner93]
C. Kaner, J. Falk, and H.Q. Nguyen, Testing Computer Software (2/e), Van
Nostrand Reinhold, 1993.
[Kaner96a]
Cem Kaner, "Negotiating Testing Resources: A Collaborative Approach," a po-
sition paper for the panel session on "How to Save Time and Money in Test-
ing", in Proceedings of the Ninth International Quality Week (Software Re-
search, San Francisco, CA), 1996. (http://www.kaner.com/negotiate.htm)
[Kaner96b]
Cem Kaner, "Software Negligence & Testing Coverage," in Proceedings of
STAR 96 (Software Quality Engineering, Jacksonville, FL), 1996.
(http://www.kaner.com/coverage.htm)
[Lyu96]
Michael R. Lyu (ed.), Handbook of Software Reliability Engineering, McGraw-
Hill, 1996.
[Marick95]
Brian Marick, The Craft of Software Testing, Prentice Hall, 1995.
[Marick97]
Brian Marick, "The Test Manager at the Project Status Meeting," in Proceed-
ings of the Tenth International Quality Week (Software Research, San Fran-
cisco, CA), 1997. (http://www.stlabs.com/~marick/root.htm)
[McConnell96]
Steve McConnell, Rapid Development, Microsoft Press, 1996.
[Moore91]
Geoffrey A. Moore, Crossing the Chasm, Harper Collins, 1991.
[Moore95]
Geoffrey A. Moore, Inside the Tornado, Harper Collins, 1995.
[Musa87]
J. Musa, A. Iannino, and K. Okumoto, Software Reliability: Measurement,
Prediction, Application, McGraw-Hill, 1987.
[Nielsen93]
Jakob Nielsen, Usability Engineering, Academic Press, 1993.
[Pettichord96]
Brett Pettichord, "Success with Test Automation," in Proceedings of the Ninth
International Quality Week (Software Research, San Francisco, CA), 1996.
L. Consolini
Gemini, Bologna
M. Haug et al. (eds.), Software Quality Approaches: Testing, Verification, and Validation
© Springer-Verlag Berlin Heidelberg 2001
84 5 Resources for Practitioners
5.2 Books
Brian Marick - The Craft Of Software Testing - Prentice Hall, NJ, 1995
This book is the logical sequel to Myers' fundamental work. It explores new
techniques and reveals the potential of sub-system testing.
Cem Kaner, Jack Falk, Hung Quoc Nguyen - Testing Computer Software -
International Thompson Computer Press, 1993
This book is about testing under real-world conditions. It is full of insight and
useful advice.
Boris Beizer - Black-box Testing - John Wiley, 1995
Another savvy book by Beizer. This one is focused squarely on functional testing
of software and systems.
Daniel J. Mosley - The Handbook Of MIS Application Software Testing -
Yourdon Press, 1993
5.3 Organisations
Table 5.1 Organisations
L. Consolini
Gemini, Bologna
Among the PIEs examined by EUREX and involved in the workshops, several par-
ticularly significant PIEs were selected for a more in-depth analysis (see Table
6.1). Their experience is both interesting and relevant to many of the key issues
involved in the application of Validation and Verification to real-life software.
At the same time, these PIEs were chosen to represent a wide range of or-
ganisations (SMEs, large companies, not-for-profit organisations) and domains
(technical software, aerospace software, Internet software, commercial MIS soft-
ware).
88 6 Experience Reports
PI3 provides useful insights into the issues emerging from the (lack of) testing of
Internet-based applications and the applicability of traditional testing techniques to
this new application software paradigm.
Finally, GUI-Test explores the transition from manual GUI testing - time-
consuming and cumbersome - to automated GUI testing based on commercial
tools. GUI-Test compares manual, semi-automated and automated methods to
establish a cost-effective strategy that respects the needs and resources of a small
company providing customer-specific software.
G. Bazzana
Onion
The explosive growth of the WWW and the increasing complexity of Internet
applications, the interaction with legacy systems and large DBMSs, and the use of
web-based interfaces for business applications require the adoption of systematic
testing activities in the Internet realm as well.
In the world of WWW technologies, the PI3 Project helped an innovative and
dynamic small company enhance product quality, timeliness and productivity at
the same time.
The PIE shows an interesting approach combining mature testing methods and
inspection techniques to ensure the overall quality of Internet-based applications;
in fact, even static web pages can contain bugs and should be checked for legal
syntax and for additional problems (portability across browsers, for example, is an
issue).
PI3 is also relevant for its practical and business oriented approach to the meas-
urement of the results based on a Goal Question Metrics (GQM) approach. The
company claims a "THREE-DIMENSIONAL" improvement in product quality
(+17%), time-to-market (-10%) and cost (-9%). An analysis of the ROI for intro-
ducing HTML validation tools is reported.
The following are the key lessons learnt from this experiment:
• the introduction of more systematic testing methods and tools is of paramount
importance for Level 1 SMEs and can be accomplished successfully in a short
time, whereas the introduction of Configuration Management requires specific
care from both a methodological and a cultural point of view
• pursuing two improvement actions (Configuration Management and testing) at
a time was perceived as difficult and demanding
• during the PIE the company felt the need for an overall framework for its im-
provement actions; although it had not planned for this, the company defined a
first draft Quality Manual adhering to ISO 9000 before getting to the definition
of detailed guidelines.
6.1 PI3 Project Summary 91
6.1.1 Participants
Table 6.4 Values of indicators for the two PI3 baseline projects
With respect to the main business goals of the software-producing unit, the
quantitative improvements summarised in Table 6.5 have been observed.
In addition, the PI3 experiment has also originated some indirect benefits; among
them, the following must be mentioned:
• a common company approach has been established with respect to ISO
9000 certification
• an important echo has been generated at the international level.
6.1.4.2 Technical
According to ONION's Technology Director the global evaluation of the technical
results can be summed up as follows:
• The definition of Onion's Intranet services architecture and the definition of
ONION's Software Development Factory have been achieved as a conse-
quence of the PI3 project.
• A testing checklist and tool have been defined for the automation of various
tests for ONION's products (mainly for Web applications).
In addition to the deployment activities already agreed and under way, the follow-
ing actions are foreseen (some of them already done) after the end of the PIE:
• installation of testing tools on a server accessible to the whole development
community
• regular exhaustive regression testing
• deployment of a Web-based tracking system to all designers, and integration
with a defect report database
Table 6.6 Historical data and results from new test practice
6.1.5 References
[Onion]
ONION, "Process Improvement in Internet Service Providing", available at
http://net.onion.it/pi3/
[Bazzana96]
G. Bazzana, E. Fagnoni, M. Piotti, G. Rumi, F. Visentin, "Testing in the Inter-
net", Proceedings of EuroStar 1996
[Visentin96]
F. Visentin, E. Fagnoni, G. Rumi, "Onion Technology Survey on Testing and
Configuration Management", Onion, Id: PI3-D02, April 1996, Excerpts availa-
ble at: http://net.onion.it/pi3/
[Bowers96]
N. Bowers, "Weblint: Quality Assurance for the World Wide Web", Proceed-
ings of the 5th International WWW Conference, Paris, May 1996, pp. 1283-1290
[ImagiWare]
ImagiWare, "Doctor HTML", http://www2.imagiware.com/RxHTML/
[Bach]
J. Bach, "Testing Internet Software", available at http://www.stlabs.com/inet.htm
[McGraw]
G. McGraw, D. Hovemeyer, "Untangling the Woven Web: Testing Web-based
software", available at http://www.rstcorp.com/~anup/Ibconf/Ibconf.html
[Berghel]
H. Berghel, "Using the WWW Test Pattern to check HTML client compliance",
IEEE Computer, Vol. 28, No. 9, pages 63-65, http://www.uark.edu/~wrg/
[Driscoll]
S. Driscoll, "Systematic Testing of WWW Applications", available at
http://www.oclc.org/webart/paper2
[Mercury]
Mercury Interactive, "Automated Testing for Internet and Intranet Applica-
tions", available at
http://www-heva.mercuryinteractive.com/resources/library/whitepapers/
[Yourdon96]
E. Yourdon, "Testing Internet Software", Corporate Internet, Vol. II, No. 10,
October 1996
[ST Labs]
ST Labs, "Internet Testing: Keeping Bugs Off of the Web", available at
http://www.stlabs.com/Bugs_Off.htm
[AM&PM]
AM&PM Consulting, "Testing Your Internet Security"
[Software QA]
Software QA, "Web Site Test Tools and Site Management Tools", available at
http://www.softwareqatest.com/qatwebl.html
6.2 PROVE Project Summary 97
B. Quaquarelli
Think3 (formerly CAD.LAB)
The relevance of the PIE PROVE stems from a fundamental consideration: the
goal of high software quality is obvious, namely to produce software that works
flawlessly, but quality has to be reached without hindering development; thus the
verification process had to be compatible with other priorities like time-to-market
and adding leading-edge features to the product.
The need to achieve considerable improvement in software verification under
strong competitive pressure and tight schedules pushed this PIE to implement a
comprehensive testing environment to support the design, implementation, execu-
tion and reuse of test cases. The environment was complemented by an errors
database, which was a pivotal tool for actually measuring the effectiveness of the
verification process.
The company has kept enhancing its environment after PROVE, and it is now
seen as an essential component of the developers' workbench.
PROVE also revealed the critical importance of improving the test design skills
of developers and testers: the dramatic improvement in the effectiveness of testing
cannot be achieved by technology alone.
6.2.1 Participants
CAD.LAB, a CAD/CAM systems producer based in Italy, carried out the Process
Improvement Experiment (PIE) PROVE to improve the quality of its software
products by implementing a measurable verification process in parallel with the
software development cycle. Two approaches were experimented with: dynamic
verification (testing) and static verification (inspection).
About CAD.LAB:
• Established in 1979
• Software Factory
• CAD (Computer Aided Design) and PDM (Product Data Management) systems
for the manufacturing industry
• Hardware independent software (WS and PC)
As product complexity increases and customers' demand for high quality software
grows, the verification process is becoming a crucial one for software producers.
Unfortunately, even though verification techniques have been available for a few
years, little experience in their application can be found among commercial software
producers. For this reason we believe that our experience will be of significant
relevance for a wider community, not least because it could demonstrate the feasi-
bility of a structured and quantitative approach to verification in a commercial
software producer whose products sell on the national and international market.
• Our business goal: to produce software that works flawlessly.
• The objective of the experiment: defining a measured verification process inte-
grated with our software development cycle.
• A fundamental requirement: doing the best job possible in the time available.
Some key sentences summarise the lessons that we consider most valuable for
whoever will repeat a similar experiment:
• "A cultural growth on testing is paramount"
• "Developers won't accept change which does not provide a clear return on their
effort investment"
• "Provide senior managers with results that are useful for pursuing their business
strategy"
Fig. 6.1 A global verification strategy. Testing asks: will this particular input cause
the program to fail? Inspection asks: is there any input that causes the program to
fail?
The baseline project in PROVE was identified with a significant selection of the
subsystems of Eureka, a 3D (three-dimensional) design environment. The subsys-
tems selected differ in design and technology, and PROVE took into account
their importance within the product; the baseline covers about 25% of the whole
product.
PROVE consisted of these steps:
• To define a global verification strategy - tailored to the distinct characteristics
of Eureka's subsystems - in which testing and inspection are balanced.
• To build up and experiment with an automated testing environment, compatible
with the different technological environment of the baseline project subsys-
tems.
• To define inspection procedures focused on those aspects which cannot be
dynamically tested, and to facilitate re-execution by means of partial automa-
tion.
• To identify an initial set of software metrics - to be applied by means of static
analysis tools - to assess the design quality of the code (namely the OO code).
• To set up an ongoing measurement mechanism based on testing results logging
and error tracking to obtain quantitative data about the correctness of the code
and the defect removal capability.
[Figure: the verification process, linking the system's functional specification,
inspection, and testing to a TEST-LOG DB and an ERRORS DB]
Particular care was put into selecting and deploying supportive technology, inte-
grated into an overall verification environment, and into acquiring the necessary
training and external assistance. An illustration of these aspects of the work per-
formed is shown in Fig. 6.4.
PROVE's work plan was conceived around these fundamental assumptions:
• Each of the baseline project subsystems has peculiar quality aspects to be veri-
fied. For this reason the most suitable verification approach for each of these
components had to be planned for. Both testing and inspection techniques had
to be included because not all the relevant quality aspects could be easily veri-
fied through just one of those techniques.
• The software process model of CAD.LAB (repeated evolutionary cycles) could
benefit from a set of reusable test cases and inspection procedures to be re-
executed on every new release in an automated way. Automation had to be
compatible with the development environment which, at the time PROVE
started, presented major differences from one subsystem to another.
• Results of applying the new verification process had to be measured. Measur-
ing meant setting up a test log database to monitor the quality level before re-
lease and an error database to track and analyse quality related data after re-
lease.
• Testers first had to be trained on the fundamentals of testing and inspection,
and then take more in-depth, hands-on training on a testing method.
To make the new verification practices ready for being adopted by all R&D
staff, and to achieve integration with the development process on all the com-
pany's product lines, CAD.LAB made the methods, tools and measures defined by
PROVE available to software engineers as immediately accessible practices on the
internal WEB site.
PROVE has certainly moved the verification work from a very raw state to a
much more mature one; the problems and initiatives are understood and accepted,
and there is greater awareness of verification's importance at all levels.
The results of PROVE are becoming embedded in CAD.LAB's process, chang-
ing some of its phases.
A significant impact of PROVE was that, for the first time, clear roles and re-
sponsibilities for the testing process could be identified. We chose to make the
programmers responsible for subsystem testing, since this kind of testing is based
on knowledge of the subsystem's structural details. As regards system testing and
inspection, we preferred a mixed solution: the programmer and the tester (where
tester means an independent tester who does not know how the system was built,
but does know how the system will be used by its users) design the test plan
together. The programmers carry out a more "technical" testing, focused on per-
formance, accuracy and geometrical consistency of results, and portability, whilst
independent testers exercise the product by emulating what a user could do with it.
As a consequence of the new verification practices, defects are found before re-
lease, saving later work and costs; moreover, the same method can be applied
alongside program development to prevent mistakes, enhancing error-prevention
capability.
[Figure: a repository of test suites, test cases and scenarios supporting both manual
and automated runs, feeding the TEST-LOG DB and the ERRORS DB]
At the time PROVE started, verifying the product meant spending time on setting
up an environment and developing test cases over and over again. Errors were
registered (unevenly) in a database through a cumbersome text based interface that
made this activity slow and frustrating.
Within PROVE such dispersion of resources has been removed by providing an
infrastructure that assures repeatability, traceability and availability of the infor-
mation.
[Chart: test scenarios per subsystem, e.g. CCInters3d (31 scenarios) and SSinters
(16 scenarios)]
[Chart: cumulative error counts, classified as Open, Fixed after, Fixed before, Not
an error, Frozen, Duplicated and Can't reproduce, plotted over release dates from
release 8.0B3 to 8.0D]
Fig. 6.6 Cumulative Error trend and resolution status over EUREKA releases
[Chart: errors fixed and errors opened since the previous revision, plotted over
releases 8.0B3 to 8.0D]
Fig. 6.7 Open and fix capability
Fig. 6.8 Test failed on the same build, same release over different platforms
[Figure: screenshot of the Web-based error form; when an error is submitted, an
e-mail addressed to the corrector is fired off, and when the error is fixed, the
submitter receives an e-mail notice of the fix]
A. Silva
Agusta
6.3.1 Participants
In the Aerospace Industry, product costs and time-to-market are today two key
competitive levers. The helicopter is a very software-intensive product, in which
the avionics software contributes more than 30% of the product costs and 40% of
the overall time-to-market. The software Testing and Validation activities
contribute significantly to these high avionics software costs (50%) and lead-time
(40%).
[Tables and figures: requirement variants and versions/releases (e.g. RAF1, RAF2,
SMC4, SMC4.1), their status in the requirements database, and their links to
acceptance procedures and test specifications (TS0301, TS0304, TS0505)]
Fig. 6.16 Dynamic operations of Requirement construction using the TestTracker toolset
[Figures: screenshots of the requirement tree (VM4.5, VM5, VM5.1, ...) and charts
of the number of tests per category for the RAF1 and RAF2 variants]
F. López, P. Hodgson
Procedimientos-Uno SL
The interest of FCI-STDE lies in the subject matter of this experiment: code inspec-
tions. It is widely claimed that code inspections can have a higher defect discovery
potential than testing, at lower cost. This PIE field-tested this claim and came
to some relevant conclusions:
• A relationship exists between the cost of inspecting and testing and the com-
plexity of the object subject to these procedures. A linear cost/complexity ten-
dency fits the experimental data well, taking into account the overall limita-
tions of a small experiment.
• Code inspections were only marginally cost effective in the PIE context what-
ever the complexity of inspected objects.
• Young and relatively flat organisations, such as our company, offer little or no
resistance to the introduction of code inspections. Current literature tends to
maintain the opposite.
• The introduction of formal code inspections has led to changes that are more
organisational in nature (affecting workflow and deliverables) than technical.
The successful implementation of the FCI-STDE process improvement experi-
ment at Procedimientos-Uno, S.L. has led to the improvement of code maintain-
ability as well as to better knowledge of the code and product. As a consequence,
the risks related to engineers leaving the organisation have been positively re-
duced for the company.
6.4.1 Participants
Code inspections have not been widely adopted by software developers of techni-
cal applications, due mainly to relatively low testing costs and to the fact that tech-
nical knowledge of the domain was considered more important than good coding
practice. In small software development units this situation becomes worse, as clas-
sical code inspection plans require more human resources than are usually available.
The widespread introduction of graphical user interfaces and the demand for dis-
tributed computing have seriously raised testing costs for all types of software, and
have led Procedimientos-Uno to consider the introduction of code inspections.
Table: the classical scope of code inspections versus the scope of this experiment.
Classical scope (IEEE 1028-1988, M.E. Fagan): large companies such as IBM and
AT&T; IT departments; very large projects (100,000 NCSL, person-years); large
working teams; limited range of conditions to be tested.
This experiment's scope: small businesses (SMEs); technical software; small
projects (10-20,000 NCSL, person-months); small working teams (also one-person
teams); large range (almost unlimited) of conditions to be tested.
The motivation for the PIE was to contain the ever-increasing testing costs and
to improve software quality, especially the quality as perceived by the end-user.
Indirect benefits were expected as well: Procedimientos-Uno has always acted
as a well-integrated team, but as the size of the business grows, methods are re-
quired to enhance communication within the development team.
Improvement in code maintainability was also expected, as many coding stan-
dards are oriented to enhance code readability.
Additional improvements in software reliability were also expected, based on
the fact that code inspections can discover many faulty aspects that software test-
ing cannot capture.
Concretely, the following objectives were set:
• higher level of quality of components at less cost
• eliminate errors that software testing cannot capture
• enhance code maintainability
• enhance code readability
6.4 FCI-STDE Project Summary 121
• enhance reliability
• group motivation
The experiment was carried out in the context of a baseline project that is strategi-
cally vital to Procedimientos-Uno and was subject to external auditing by a public
institution (CDTI of the Spanish Ministry of Industry). The project, known as
NovaMedia, falls in the category of development platforms and can be described
in short as a true visual language based on COM technology.
The experiment was designed in two phases, each one covering a complete de-
velopment, inspection and testing cycle of different components of the baseline
project.
Code inspection took place when the coding effort was at its peak on the base-
line project NovaMedia.
The following steps were carried out:
• Review of current C++ coding standards to bring them up-to-date and to make
them appropriate to the product and development environment characteristics.
• Review Test Metrics, with respect to the PIE's main objective. This revision
suggested collecting detailed effort data in order to enable later evaluation of
FCI cost effectiveness.
• Software tool acquisition and set-up. The decision to update to "Microsoft
Visual C++ Development System Version 5.0" was made. This product offered
additional benefits beyond improved static code analysis.
• Selection of code to be reviewed (Test plan 1). Special attention was paid to
establishing similar amounts of code (in effort, complexity, etc.) to be subject
or not to Formal Code Inspection (FCI), based on existing specifications and
test plan.
• Formal Code Inspections - Phase 1. Three engineers were involved in the roles
of: Inspector/Moderator, Inspector/Reporter and Author in the following proc-
esses: Inspection overview, Inspection session, Meeting, Rework and Follow
up.
• Process Improvement Plan. This required situating the FCI practice in a general
plan aiming at wide adoption of reviews and inspections, based on clearly de-
fined business objectives.
• Statistics.
• Review of standards and metrics, with the objective of improving the standards
and criteria established in WP1 and WP2.
• Selection of code to be reviewed. The main practical difference of this new
round was that the objects subject to the FCI were now classes of a complex
container rather than simple components.
• Formal Code Inspections phase 2.
In the FCI-STDE PIE two full cycles of coding, inspection and testing have been
completed. The following tables and figures show the results:
[Chart: effort in minutes plotted against relative complexity for the inspected and
tested objects; a linear tendency fits the data]
Analysis of the figures and charts shows FCI to be, at best, only marginally bene-
ficial in the terms of the experiment in our environment. At our current level of
competence in the inspection process, and taking only the experimental measures
into account, Procedimientos-Uno can save an overall average of 0.25 person-days
per KLOC delivered to our clients.
If only inspection, testing and rework are compared, careful attention has to be
paid to the size of the inspection team versus the size of the testing team. The
amount of testing required in a single cycle (not including re-testing) is independ-
ent of the inspection effort. Obviously inspections have to be very efficient (and
testing thorough) if the inspection team is the larger.
As for the metrics for code inspections themselves, in the first phase inspectors
covered an average of 155 lines per hour and discovered an average of 2.02 criti-
cal defects per KLOC. In the second phase inspectors covered an average of 233
lines per hour and discovered an average of 0.89 critical defects per KLOC.
The experiment has proved that FCI improves knowledge dissemination within
the organisation. This directly reduces the business risks related to the loss of
part of the organisation's engineering workforce.
In spite of the marginal 0.25 person-days per KLOC overall gain implied by
the experiment's results, Procedimientos-Uno has decided to adopt FCI (and re-
views in general) at an organisation-wide level. There are several reasons for this:
• http://www.procuno.pta.es/fci-stde/
• http://epic.onion.it/workshops/w07/slides03/index.htm
• http://www.esi.es/VASIE/
• mailto:phodgson@procuno.pta.es
J.C. Sánchez
Integración y Sistemas de Medida, SA
TESTLIB deals with the key issues in software verification: the automation of test
software generation using a high-level command language, the integration of the
generated code into a testing environment usable by independent testers, and the
management of the generated code in a database.
TESTLIB took an innovative approach to these issues, basing its test software generation on re-use. To this aim, the results created by TESTLIB are:
• A set of libraries containing different types of modules to be used in the gen-
eration of tests and test flow sequences.
• An automated process to integrate all the required modules into a test engine in
order to generate, without user intervention, the final run-time test application
software.
Finally, the set of tools created by TESTLIB has been encapsulated within the commercial development environment HP SoftBench, so that the testing environment integrates smoothly with the development environment.
The work carried out in this experiment demonstrated that the traditional drawbacks of test automation - namely low efficiency, long development cycles, and high-risk, costly projects - can be addressed and overcome.
Data gathered in this experiment show that, with TESTLIB's innovative approach, it is feasible to generate test software based on re-usability. This approach does not come without pain: a very high software development effort was necessary before the results could feasibly be applied.
By applying the techniques involved in this experiment, software engineering and test engineering specialists interact within the test software development process in a highly efficient way, through task specialisation.
6.5.1 Participants
The experiment has been carried out by Integración y Sistemas de Medida, S.A., an SME highly specialised in the development of application software for turn-key testbeds - ATS (Automated Test Systems) - used in automated systems testing, particularly of telecommunication devices.
6.5 TESTLIB Project Summary 127
The motivation behind TESTLIB comes from the current drawbacks affecting the development of ATS software, namely:
• long development time, high costs
• low re-use
• difficult maintenance of test cases and test procedures, particularly when maintenance is performed by people other than those who developed them in the first place
• the impossibility for the final user to make extensions and changes
TESTLIB explored the use of C++ object libraries for the automated generation of reusable test software code to be used in an automated testing environment for telecommunication devices.
As a direct consequence of the experiment, test engineers, rather than software engineers, will be able to develop OO testing code with a relatively small training effort, based on generic code re-use.
Software re-use will allow development effort to be reduced by 30 to 40 percent, with lower risks and higher quality.
Fig. 6.27 Benefits from the company's point of view
128 6 Experience Reports
Currently, every single programmable instrument used in an ATS must have its own proprietary driver within the test code, and this driver is designed according to the tests to be performed every time a new ATS is planned. This implies the involvement of both test engineers and software engineers any time a new test is coded for every single instrument included in the ATS. This is a tedious and repetitive process, subject to improvement for much higher productivity and reliability.
The experiment was organised around three main lines of action:
• Development of object-oriented test and instrumentation libraries.
• Integration and application of object-oriented libraries to the baseline project.
• Integration and encapsulation within a standard CASE tool (HP SoftBench).
A generic instrumentation drivers library was developed.
Test and measurement definitions were made by means of a visual programming environment, to simplify the way test and measurement procedures are programmed.
Test software code was generated from the above definitions by means of a high-level code compiler.
The generation of a relational database for allocating and handling all test code components and definitions was automated.
• Object Orientation and software reuse: lower development costs and time
• Graphical programming and interfaces: lower software skills required
• Based on Industry Standards: market strength and continuity
6.5 TESTLIB Project Summary 129
Basic Design
• Basic Object Management
• Unique Identification network service
• Object Persistence using ANSI-SQL
• Remote Connectivity using ONC RPCs
• Object typing and browsing
Code Generation
• Automatic C++ classes
• Binding process of generic objects to specific instrument drivers
Fig. 6.28 Time used in Design Phase [pie charts; recoverable labels: Basic Object Management 30%, Graphical Environment 25%, Automation Code Generation; Problem Domain 50%, Environment 30%, Standards Development 15%, Alternatives 5%]
S. Daiqui
Deutsche Forschungsanstalt für Luft- und Raumfahrt e. V.
The ATECON PIE is particularly significant for the completeness of its approach:
It covers all phases of testing, a wide range of tools and different development
environments.
As a consequence ATECON shows how to make a verification approach scal-
able, adaptable and tailored to projects with different reliability, availability, main-
tainability and safety requirements.
The approach has been applied and validated in seven real-world projects from
different application areas using heterogeneous hardware/software environments
and traditional programming languages like FORTRAN or C.
To all these domains ATECON applied a consistent and very systematic testing approach, which proved to pay off in terms of higher quality and reduced effort, especially when applied early in the project.
The partners in the ATECON project are planning to extend the test approach
to cover object oriented system development with new challenges like dynamic
linking and overloading.
6.6.1 Participants
ATECON was performed by the Deutsche Forschungsanstalt für Luft- und Raumfahrt e. V. (DLR). DLR, the German aerospace research establishment, is a non-profit institution that develops systems for aerospace applications and the corresponding ground support systems. DLR's quality and safety division was responsible for the overall experiment management as well as for the development of the testing concept.
The ideas of the application experiment were tested in seven baseline projects.
Five of them were other divisions of DLR, covering a wide range of different
application areas like robotics, space operation centre, and the German remote
sensing data centre. The project partners CAM and ZETTLER performed the other
two baseline projects. CAM develops software systems for the monitoring and
control of technical facilities like ground support stations or systems for aerospace
applications. ZETTLER develops electrical and electronic systems with monitor-
ing and control systems that contain embedded real-time software with hard reli-
ability, availability and safety requirements.
6.6 ATECON Project Summary 133
The objective of the ATECON project was to define and apply a cost-effective, efficient, state-of-the-art system and software test concept.
This concept should include test methods, detailed practical procedures and the
suggestion for state-of-the-art test tools. The concept had to be modular to be
scalable to projects with different requirements regarding reliability, availability,
maintainability and safety (RAMS).
The test concept had to cover all aspects of testing, i.e. the unit, integration,
system and acceptance testing. For each phase off-the-shelf test tools had to be
selected and integrated in a state-of-the-art software and system testing environ-
ment. Special attention had to be given to the integration of the test concept into
the overall system and software development life cycle, considering the necessary
interfaces regarding the methods, procedures, and tools utilised during the differ-
ent life cycle phases.
The concept had to be tested by seven baseline projects from a wide range of
application areas and different hardware/software environments to make sure that
the know-how can be transferred to industry.
ATECON set out to prove that the result of systematic testing is higher quality
software systems and measurable improvements regarding reliability, availability,
maintainability and safety.
The test approach, consisting of a test concept and a supporting test environment, covered all testing phases, from the coding and debugging phase, through the module, integration, and system testing phases, up to the acceptance testing phase.
For all testing phases, state-of-the-art testing methods, procedures, and tools
were defined and applied. In particular, projects were provided with tools support-
ing the test specification, preparation, performance, and evaluation of regression
tests on module, component, subsystem, and system level.
Tools also supported the measurement and analysis of the achieved test coverage.
These methods and tools utilised by the different baseline projects were supplemented by static and dynamic software quality analysis activities and RAMS (reliability, availability, maintainability, safety) analysis activities, performed centrally for all baseline projects by appropriate specialists.
The broad range of projects and users selected for this application experiment
ensured the definition of a modular, scalable test concept and test environment,
ensuring transferability to future projects and to other organisations.
The overall test concept consisted of four sub-concepts for the unit, integration, system and acceptance testing phases. Each sub-concept has been defined as a set of "Software Engineering Modules" (SEMs). A SEM describes a specific test
The project has shown that many practitioners underestimate the power of a systematic test approach. Not only does the test effort become more predictable and the software more stable; knowing more about testing techniques and starting the testing activities earlier in the project can also lead to higher quality systems and reduced effort.
Among the most important technical lessons learned were the following:
• Even highly critical systems have only a few components that are critical. Knowing which components are critical can reduce the overall testing effort dramatically, since only those have to undergo more elaborate test procedures.
• Testing starts with the requirements phase and never ends. This is well known
by the experts but seldom applied in real projects.
• The programming languages influence the methods and tools used for testing to a much higher degree than expected. Not only were minor differences in applying a method observed, but also the use of totally different methods and procedures.
• Most of the testing methods and procedures described in the literature have two shortcomings: it is not well defined under which conditions and for which kinds of systems they can be applied, and they are often purely academic and cannot be used in real projects without tool support, which is often not available. ATECON overcame these problems by providing detailed descriptions of step-by-step procedures, methods and tools applicable in real-world projects.
• A training program tailored to the individual needs of the project was extremely
important to transfer the theoretical concepts into practice.
• In tool selection, test installations and trial periods are an absolute must, especially when the tools are used in different hardware/software environments.
Testing is not an art. One can define and apply strict procedures based on objective criteria, similar to well-known requirements and design methods. And these test procedures even provide feedback to improve the usage of requirements and design methods!
T. Linz
imbus GmbH
In the age of highly interactive graphical user interfaces, testers have to cope with the increasing complexity of thoroughly testing the large number of combinations of options available to the user.
The manual approach is scarcely effective, costly and, worst of all, does not produce any re-usable asset.
GUI testing tools are certainly the largest family of test automation tools available on the market these days. They carry big promises of increased productivity and lower costs. However, it is well known that automated GUI tests can be difficult to re-use and maintain as the interface changes (and you can bet it will!).
GUI-Test looked into this issue to establish the most appropriate GUI testing strategy fitting imbus' business needs and its process, which is already defined as part of the company's Quality Management System.
GUI-Test is therefore a good example of an experiment focused on objectively assessing - with an eye to the bottom line - the benefits of switching from manual to automated testing in a critical area of the software process.
6.7.1 Participants
Today's software systems usually have Graphical User Interfaces (GUI) offering a
large amount of control elements to the system's users. As there are thousands of
possible interactions, testing such systems is extremely difficult and labour inten-
sive.
GUI-Test aimed to standardise and optimise GUI testing methods by introducing commercial tools to automate the testing.
The effectiveness of GUI testing is very relevant to imbus since their customers
expect a robust graphical user interface for practically every software project.
From a business point of view imbus' motivation was to gain the following bene-
fits:
• a cost reduction and a more accurate estimate of testing costs, the reason for this being the standardisation and repeatability of testing procedures
• lower residual error rates, because testing is more thorough and effective, leading to savings in bug repair costs
• lower total development costs and shorter time-to-market.
Given that imbus is also a service provider in the field of testing, one of the motivations behind the PIE was to acquire know-how to improve its competence as a "Third Party Tester".
At the start of the PIE, GUI systems at imbus were tested only manually, a labour-intensive and costly approach. The completeness and effectiveness of the manual tests were strongly dependent on the ability of the tester, and the results were difficult to quantify and qualify.
In addition, such testing effort did not produce any re-usable asset, and the testing work had to be done again and again over subsequent versions of the software.
To address this issue imbus opted for the application of commercially available
GUI testing tools to automate the tests.
imbus ran a formal tool selection phase before committing to a final selection
consisting of:
• GUI test automation tools "WinRunner" and "TestDirector" from "Mercury
Interactive".
• Runtime Analyser & Error Detection tools for MS-Windows "BoundsChecker"
from "NuMega Technologies".
The tools were evaluated in a real software development project: the chosen baseline project was the development of the latest version of an integrated PC software tool for GSM radio base stations. The application had been developed as a distributed system with several tasks and DLLs using Microsoft Visual
[Figure: PIE GUI-Test overview. Goals: optimise and automate GUI testing. Approach: automate defined tests and repeat them using a GUI testing tool, integrated into the imbus testing procedure for the next stage. Baseline project: PC software tool for GSM radio base station maintenance, 97,890 lines of C++ code; tests defined and initially all run manually.]
Tests were run on the baseline project using both traditional manual methods and the new, semi-automated or automated methods. The proportion of tests that could be automated was assessed, and the old and new methods were compared.
The PIE results showed that it is possible to automate up to 90% of the GUI test cases which previously had to be performed manually.
Improvements:
• Buy a capture & replay tool.
• Let the testers learn to program and use it.
• Build up and maintain a test case library.
• Integrate tool usage into your testing process.
6.7 GUI-Test Project Summary 139
6.7.4.3 Conclusions
The tools selected (but also competitors' tools that were not selected) worked fine, but easy usage requires a large amount of tool-specific programming know-how and tool set-up. Therefore imbus tried to isolate as many reusable test cases as possible and put them into a test case library to be used within the GUI testing tool.
However, GUI-Test showed that once the tool is well known, its regular usage requires less effort than manual testing.
As an intermediate result, GUI-Test found that the break-even point for GUI test automation within the baseline project was reached after the 2nd to 8th repetition of an automated test run, proving that test automation pays off in the medium to long term if tests are re-run frequently.
This result was better than imbus expected, but must still be confirmed during ongoing evaluation.
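The break-even reasoning above can be sketched as simple arithmetic. The cost figures below are illustrative assumptions, not imbus' measured data; the experiment only reports that break-even fell between the 2nd and 8th repetition.

```python
import math

def break_even_runs(automation_cost, manual_cost_per_run, replay_cost_per_run):
    """Smallest number of runs n for which
    automation_cost + n * replay_cost_per_run < n * manual_cost_per_run."""
    saving_per_run = manual_cost_per_run - replay_cost_per_run
    if saving_per_run <= 0:
        raise ValueError("automated replay must cost less than a manual run")
    return math.floor(automation_cost / saving_per_run) + 1

# Illustrative effort figures in person-hours (assumed, not measured):
# scripting the suite once costs 40 h, a manual pass 8 h, a replay 1 h.
print(break_even_runs(40, 8, 1))  # -> 6
```

With these assumed figures, automation is cheaper from the sixth repetition onward, which is the kind of range the experiment observed; the model ignores script maintenance costs, which in practice push break-even later.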
7 Lessons from the EUREX Workshops
L. Consolini
Gemini, Bologna
The workshop was held on 28 May 1998 in Milan. GEMINI, the Italian partner of EUREX, organised the workshop. The title of the second workshop was "The Testing of Software". 19 participants attended the workshop, mostly representing SME software companies.
The subject domain was selected on the basis of the results of the classification of the PIEs, which showed that 8 out of 52 Italian PIEs dealt or were dealing with software testing. The workshop focused on the following aspects of testing:
• Testing automation: does it pay off?
• The economics of testing: how can you tell how much testing is needed and economically viable?
• Is it possible to re-use tests?
• Who is responsible for testing?
• How do the traditional testing methods apply to Internet based applications?
These were identified as the major questions arising from the PIEs' experience.
The PIEs most involved in these issues, who achieved significant results suitable for dissemination to a wider audience, were invited to be part of the workshop panel: PD, PROVE, and TRUST. The PIEs and the experts formed the panel of the workshop.
M. Haug et al. (eds.), Software Quality Approaches: Testing, Verification, and Validation
© Springer-Verlag Berlin Heidelberg 2001
The workshop aimed at determining which approaches to testing, and particularly to test automation, are most beneficial to the software industry, what impact they will have on a company, and how to measure their results. This aim was pursued by elaborating a "workshop hypothesis" to be discussed and proved or disproved during the workshop.
In the first part of the workshop the experts introduced the topic and illustrated the workshop hypothesis. Three PIEs presented their experience and commented on the hypothesis.
In the second half of the workshop, one of the experts stimulated discussion among all the participants by defending a set of provocative theses, fairly critical of the effectiveness of testing in ensuring product quality.
A lively discussion involved PIEs, experts and the audience; the panel drew the
final conclusions.
Two domain experts participated in the event. The first was Gualtiero Bazzana, Partner and Chief Executive of ONION Communications - Technologies - Consulting. He has more than 10 years of experience conducting IT projects in several application sectors and more than 40 international publications, of which about 10 are dedicated to the domain of testing. Lately he has specialised in test automation and in verification and validation of Internet/Intranet applications.
The other expert was Luisa Consolini, president of GEMINI S. cons. a r. l., a consortium for Software Engineering and Software Quality, fields in which she has worked since 1991. She assisted the PIE PROVE, which focused on the verification and testing process.
They were selected for their hands-on experience in the field and for being well
known in Italy and knowledgeable on testing in the software sector.
Gualtiero Bazzana introduced the topic with a presentation on Web testing methods, that is, on the new issues raised by applying testing methods to Web-based applications. Mr. Bazzana identified the quality characteristics specific to Internet software and the technical and process-related peculiarities of Web projects.
These peculiarities have an impact on the testing methods and Bazzana high-
lighted two major classes of tests:
• tests targeted to static aspects, relevant mostly for static HTML pages
• tests targeted to dynamic aspects, relevant for real interactive applications
which apply the client/server paradigm to the Web.
7.1 Second Italian Workshop 143
Both classes were described in terms of techniques and tools available on the market. Tools and techniques were related to the specific quality characteristics they could verify most effectively. A paper on Web testing methods written by the expert is included hereafter.
The paper covers the following aspects:
• testing challenges in Internet applications
• a survey on existing tools in the specific application domain
• a proposed approach for testing Internet applications
• testing aspects to be taken into account
• the relationships with the overall improvement program, results achieved and
lessons learnt.
G. Bazzana, E. Fagnoni
Onion, Milan
7.1.3.1 Background
The last years have seen explosive growth of the WWW. Currently the Web is the most popular and fastest growing information system deployed on the Internet, representing more than 80% of its traffic.
The picture shown in Fig. 7.1 represents the growth of WWW servers worldwide (data derived from the Netcraft Web Survey - http://www.netcraft.co.uk/Survey), which now (May 1999) number well over 5 million.
The success of the Internet is in fact partly due to the ease of building HTML documents, which allows everyone to have their own home page.
A wide variety of web browsers, each implementing its own interpretation of HTML, has also favoured this massive phenomenon.
Unfortunately, humans performing manual tasks tend to make mistakes; moreover, a large number of non-programmers are also creating web pages, so we cannot assume that everyone perfectly knows the Document Type Definition (DTD) specifications.
Browsers are supposed to be liberal in what they accept, and do their best to render pages, no matter how badly formed they are. But a page that looks great under Netscape may be unviewable under other browsers.
Search engines are becoming the first line of attack when surfing, so it is important that web pages are suitable for automatic processing, particularly for extracting titles and paragraph headings.
In HTML development there are two kinds of tools, generally used by different categories of authors. Well experienced programmers often prefer standard HTML editors, which allow more control over HTML tags, while non-programmers tend to prefer WYSIWYG editors (also known as Web authoring tools), which promise an easier approach to creating web pages, freeing them from knowing HTML syntax.
[Chart: growth of worldwide WWW server counts, rising from near zero to about 5,000,000 on a scale up to 6,000,000]
All these things mean that it is increasingly important that web pages are
checked for legal syntax and for additional problems, such as portability across
browsers.
In addition, we have seen the emergence of new "active" languages in develop-
ing WWW applications (e.g.: Perl, Java) as well as the increasing development of
dynamic applications.
Additional trends are:
• interaction of Web-based solutions with large DBMS
• web-portals
• usage of Web-based interfaces for Intranet/ Extranet applications that directly
interface the company legacy system
• usage of Web-based approaches for critical applications (e.g.: on-line trading)
• access to the Web by different media (e.g.: mobile phones, TV)
• need to allow equal opportunities to Web access also for impaired or disabled
people, in order not to exclude them from the new "Information Society".
This has increased the complexity and criticality of applications, requiring the adoption of systematic testing activities in the Web-based realm as well, which is far too often wrongly considered an application domain populated mostly by hackers.
Today, we can say that Web-based applications demand a high level of all the software quality characteristics defined in the ISO 9126 standard, namely:
• Functionality: Verified content of the Web must be ensured, as well as fitness for the intended purpose.
• Reliability: Security and availability are of utmost importance, especially for applications that require trusted transactions or that must exclude the possibility that information is tampered with.
• Efficiency: Response times are one of the success criteria for on-line services.
• Usability: High user satisfaction is the basis for success.
• Portability: Platform independence must be ensured at client level.
• Maintainability: High evolution speed of services (a "Web Year" normally lasts
a couple of months) requires that applications can be evolved very quickly.
[Figure: typical Web application architecture - client side with browser (HTML, Java machine); Web server(s) hosting Web applications, objects, and a script interpreter; DBMS and legacy applications; all on top of the OS and middleware]
Test Levels
As a consequence of the highlighted architecture, the following test levels have
been defined depending on whether the Web-based application is dynamic or only
static.
[Figure: server-side variability - HTTP servers (Netscape, MS-IIS, Apache), server extensions (Perl CGI programs, ASP programs), databases (MS-SQL, Oracle, MiniSQL, MS-Access), operating systems (MS-NT, Unix)]
The remainder of this document covers the two aspects in more detail.
Service tests have the goal of validating the resulting service from a user's point of view, thus adopting a black-box strategy without any assumptions about the underlying architecture and implementation choices.
First of all, it should be clear that on the WWW, just as in real life, there is no silver bullet for absolute security, and that security techniques and checks must be tailored to the value to be protected.
Moreover, it is important to underline that security enforcement involves both organisational and technical issues; indeed, organisational issues are often much more important than technical ones, at least in Intranet and Extranet applications.
In such cases, the approach is to define a "Security Policy" at company level and then to tailor it to the risk level associated with the various services. Security testing will thus focus on checking that those policy rules that rely on technical aspects have been correctly implemented.
For Web-based applications intended for Internet usage with security constraints, specific tests will have to be devised on a case-by-case basis, remembering that, for the time being, the Internet is a so-called "open net" without any central management. It follows that the availability of any service cannot be guaranteed, nor can the confidentiality and integrity of unencrypted communications.
In general, aspects to be covered by security testing will include:
• Password security and authentication.
• Encryption of business Transaction on WWW (SSL, Secure Socket Layer)
• Encryption of e-mails (PGP - Pretty Good Privacy)
• Firewalls, Routers and Proxy Servers
• Web Site Security
• Virus Detection
• Transmission Logging
• Physical Security and Backups.
Testing can benefit from available programs (most are shareware), among which: COPS, CRACK, TCP Wrapper and SATAN. It is interesting to note that some of them have been developed by hackers and crackers.
be done with respect to: normal behaviour, destructive behaviour and the behaviour of inexperienced users.
Aspects to be tested with care for performance include: searches on databases, turnaround time of custom applications embedded in the WWW, and verification of server response. This has to be done with respect to: platforms, browsers and network connections.
Aspects to be tested with care for loading include: support for many connections and users, and management of message peaks and resource locking. This has to be done with respect to server load and client load.
DoctorHTML
Doctor HTML [ImagiWare] is a Web site analysis product, whose main features
are:
Web Lint
Weblint [Bowers96] is a syntax and style checker for HTML: a Perl script which picks fluff off HTML pages, much as traditional lint picks fluff off C programs.
There are several dozen different checks and warnings that can be enabled or disabled individually, as per your preference.
Weblint is free and regularly updated; it is written in the Perl scripting language, which was designed for processing and generating text and has powerful regular expression capabilities. Perl is also very portable, so weblint can be used under Unix, VMS, Windows, Mac and other platforms.
Weblint does not perform a strict HTML validation test, but it gives you some level of assurance that web pages provide the intended content to the reader. It is easy to obtain, install, use and configure, and its warnings are easy to understand. It also has many web-based gateway interfaces (a weblint gateway is an HTML form which lets you type in a URL and have it checked by weblint without having to install weblint locally).
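As an illustration of the kind of "fluff" such a checker picks up, here is a minimal sketch in Python (weblint itself is a Perl script). It implements only one of weblint's several dozen checks - unbalanced container tags - over a small, hypothetical tag set:

```python
import re

# Toy subset of container tags that require a matching close tag.
PAIRED = {"html", "head", "body", "title", "table", "ul", "ol"}

def lint(html):
    """Return warnings about unbalanced container tags."""
    warnings, stack = [], []
    for match in re.finditer(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>", html):
        closing, tag = match.group(1) == "/", match.group(2).lower()
        if tag not in PAIRED:
            continue  # ignore tags outside the toy subset
        if not closing:
            stack.append(tag)
        elif stack and stack[-1] == tag:
            stack.pop()
        else:
            warnings.append(f"unexpected </{tag}>")
    warnings.extend(f"unclosed <{tag}>" for tag in reversed(stack))
    return warnings

print(lint("<html><body><p>hello</body>"))  # -> ['unclosed <html>']
```

A real validator parses the page against the DTD; this sketch merely shows why even a shallow check catches common hand-editing mistakes that browsers silently tolerate.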
check for HTML compliance. Its goal is to check the inconsistency with which Web client developers comply with the emerging standards.
As a matter of fact, the HTML protocol has evolved in stages. As a consequence of the "browsers' battle" between Netscape and Microsoft, non-standard extensions have emerged in parallel with orthodox versions; as a result, typical Web-client developers usually make no claim of HTML compatibility but simply add as many features as they feel they can manage in the latest browser release.
Hence, the WWW Test Pattern has been created for:
• monitoring the degree of HTML compliance of Web clients;
• checking Netscape Mozilla and Microsoft Explorer extensions;
• analysing MPEG, AVI and QuickTime animations.
Some of the tests are passive (that is, the user merely loads the test document and views the results), whereas others require direct user involvement.
The WWW Test Pattern has to be kept aligned with the fast evolution of the browsers, and thus its standard suite of tests for text, audio, graphics, meta-links, animations, forms and tables is always under modification. This also raises the need to reduce the multiplicity of tests and to provide a standardised test report.
checker which validates your HTML to ensure the correctness of your pages;
supporting various HTML, Netscape and Internet Explorer versions;
• Hot Dog includes a syntax and spelling checker as well as a Width Checker
which you can use to see how pages will appear to users whose monitors are
running with screen widths of 640, 800 or 1024;
• Web Suite includes a Load Manager which provides size and download
information for components created in the Component Editor. It determines
download times for the different connection speeds, and lets you keep the
download time in mind while you are creating your components.
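The download-time arithmetic behind such checkers is straightforward. This is a hedged sketch with assumed figures: raw line rates for period-typical connections and a hypothetical 60 KB page, ignoring protocol overhead, latency and compression:

```python
# Period-typical connection speeds in bits per second (illustrative).
SPEEDS_BPS = {"14.4k modem": 14_400, "28.8k modem": 28_800, "ISDN 64k": 64_000}

def download_seconds(page_bytes, bits_per_second):
    """Seconds to transfer page_bytes at a given raw line rate."""
    return page_bytes * 8 / bits_per_second

page_bytes = 60_000  # hypothetical page: HTML plus images
for name, bps in SPEEDS_BPS.items():
    print(f"{name}: {download_seconds(page_bytes, bps):.1f} s")
```

Even this crude model makes the point of such tools: the same page that loads in a few seconds over ISDN takes over half a minute on a 14.4k modem, so page size deserves attention during authoring.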
Fig. 7.5 CGI programming [figure: HTTP server with server extension; CGI program connecting to a transaction manager, database and legacy application]
Besides the well-known techniques for client-server testing, you should beware of the complexity introduced by included software layers; it has to be remembered that often more than 90% of the software is out of the developers' control, being re-used from other sources.
Fig. 7.8 Grey-box HTTP testing approach (capture and playback of a script file against the WWW server; HTTP server with server extension, DB and legacy application behind it)
Performix / Web
This tool captures actual user activity at the workstation and derives a script
from it. The script is then compiled and executed; scripts can also be combined to
simulate multiple users.
At the end, log files and reports are produced. The tool reveals bottlenecks at
the client, the server and the network, adopting a good recording technology that
allows testing without many PCs: one driver machine emulates unlimited users.
Its peculiarities are: conversion facilities for scripts from all major vendors, as
well as the client application and the DB server being tested simultaneously.
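The capture/playback idea described above can be sketched roughly as follows. This is not the Performix tool itself; the script contents, emulated request delay and user count are invented for illustration:

```python
# Minimal sketch of the capture/playback idea: a recorded script of user
# actions is replayed by many emulated users from a single driver machine,
# and per-action timings are logged for later bottleneck analysis.
import threading
import time

recorded_script = ["GET /home", "GET /search?q=x", "POST /order"]
log = []
log_lock = threading.Lock()

def play(user_id: int, script: list) -> None:
    """Replay the captured script for one emulated user, logging timings."""
    for action in script:
        start = time.perf_counter()
        time.sleep(0.001)          # stand-in for the real HTTP request
        elapsed = time.perf_counter() - start
        with log_lock:
            log.append((user_id, action, elapsed))

# One driver machine emulates several concurrent users as threads.
users = [threading.Thread(target=play, args=(i, recorded_script))
         for i in range(5)]
for t in users:
    t.start()
for t in users:
    t.join()
```

At the end, the `log` list plays the role of the tool's report: per-user, per-action response times from which client, server or network bottlenecks could be inferred.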
Webstone
Webstone is the most widely cited WWW benchmark, useful in evaluating a
server's capability to deliver static web objects.
A rigorous testing methodology makes use of a Webmaster process that con-
trols several WebChildren, which do the actual banging of the HTTP server.
It is downloadable for free.
services, integrating Web-based applications and MIS, and setting up simple,
cross-platform applications on top of a simple-to-manage and more centralised IT
infrastructure.
As far as testing is concerned, challenges are on:
• security,
• load testing,
• user authentication,
• server authentication,
• connection privacy,
• message integrity,
• payment security.
Tool providers are performing certified integration with ERP systems.
7.1.3.5 References
[Bazzana96]
G. Bazzana, E. Fagnoni, M. Piotti, G. Rumi, "Process Improvement in SMEs:
the ONION Experience in Internet Service Providing", SP 96 Conference Pro-
ceedings, Brighton, December 1996.
[Bazzana99]
G. Bazzana, E. Fagnoni, "Process Improvement in SMEs: an Experience in
Internet Service Providing", in Better Software Practice for Business Benefit,
IEEE Software, 1999.
[Visentin96]
F. Visentin, E. Fagnoni, G. Rumi, ONION Technology Survey on Testing and
Configuration Management, ONION, Id: PI3-D02, April 1996.
Excerpts available at: http://net.ONION.it/pi3/
[ImagiWare]
ImagiWare, Doctor HTML, http://www2.imagiware.com/RxHTML/
[Bowers96]
N. Bowers, "Weblint: Quality Assurance for the World Wide Web", Proceed-
ings of 5th International WWW Conference, Paris, May 1996, pages 1283-1290.
[Berghel]
H. Berghel, "Using the WWW Test Pattern to check HTML client compliance",
IEEE Computer, Vol. 28, No. 9, pages 63-65, http://www.uark.edu/~wrg/
[CEC91]
Commission of the European Communities, Information Technology Security
Evaluation Criteria (ITSEC), Version 1.2, CEC, 28 June 1991.
[CEC93]
Commission of the European Communities, Information Technology Security
Evaluation Manual (ITSEM), Version 1.0, CEC, 10 September 1993.
The starting point for discussion in the workshop was a hypothesis derived from
the direct analysis of the PIEs experience and drafted by GEMINI who, before the
workshop, circulated it to the invited PIEs to stimulate their reactions.
The thesis, illustrated in the following statements, summarises what was dis-
cussed with the PIEs before the workshop and an analysis of their project reports
available at the time.
As product complexity increases and customers' demand for high quality soft-
ware grows, the verification process is becoming a crucial one for software pro-
ducers. Unfortunately, even if verification techniques have been available for a
few years, little experience in their application can be found among commercial
software producers. For this reason we believe that the PIEs' experience will be of
significant relevance for a wider community, not least because it could demon-
strate the feasibility of a structured and quantitative approach to verification.
The approach of the PIEs focusing on testing selected by EUREX consisted of
setting up a verification method and supporting it with an automated infrastruc-
ture; in general they anticipated the following benefits:
• fewer defects escaping from quality control
• more reliability over subsequent evolutionary releases of the software product
• more productive verification activities, relying on a replicable set of test cases
and test procedures
• the availability of quantitative data on the correctness of the product.
Some key morals capture the lessons learned by the PIEs that we consider the
most valuable for a wider community; they are summarised in the following
paragraphs.
up a testing culture is high priority to put testers on the same level as design engi-
neers.
Traceability
Some of the PIEs explored the aspects involved in linking test cases to require-
ments and tried to measure coverage as requirements coverage.
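The traceability idea can be sketched as a small matrix linking test cases to requirements, from which requirements coverage is derived; the requirement and test-case identifiers are hypothetical:

```python
# Minimal sketch of requirements coverage via a traceability matrix:
# each test case is linked to the requirement(s) it exercises, and
# coverage is the fraction of requirements hit by at least one test case.
requirements = {"R1", "R2", "R3", "R4"}
traceability = {            # test case -> requirements it covers
    "TC-01": {"R1"},
    "TC-02": {"R1", "R2"},
    "TC-03": {"R4"},
}

covered = set().union(*traceability.values())
coverage = len(covered & requirements) / len(requirements)
uncovered = requirements - covered
```

Here `coverage` is 0.75 and `uncovered` exposes R3 as a requirement with no linked test case, which is exactly the gap such traceability is meant to reveal.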
Tools
There are several types of testing tools which can be applied at various points of
code integration (unit testing, integration testing, system testing). Many of these
tools are very sophisticated and use existing or proprietary coding languages. The
effort of automating an existing manual test process is no different from a pro-
grammer using a coding language to write programs that automate any other
manual process. Treat the entire process of automating testing as you would any
other software development effort. The usual components of software develop-
ment, such as configuration management, also apply.
Skills
Test automation combines identifying and writing the right test cases (i.e. good
testing skills) with writing code (i.e. good programming skills). Consequently a
good tester does not necessarily make a good test automator: the two roles are
different and the skill sets are different. Related to this point is the idea that testers
and software developers need to work as a team to make test automation effec-
tive; far from disrupting test automation, such teamwork brings additional advan-
tages. Working with developers also promotes building "testability" into the
application code.
A Final Caution
Test automation is not a substitute for walkthroughs, good project management,
coding standards, good configuration management, etc. Most of these efforts pro-
duce a higher payback on the investment than test automation does. Testing
should not be regarded as the primary activity in Software Quality Assurance.
Quality cannot be put into the product at the end just by debugging the faults
discovered by testing. If you want quality at the end, you have to design it in
earlier.
Who is responsible?
"If a constructor has built a house for someone and his work was not well done
so that the house crumbles killing those that live in it, then the constructor must be
executed" Code of Hammurabi, 1750 B.C., para. 229
Giving responsibility for quality to independent testers means separating those
who do the job from those who secure its quality. This does not work; we should
return to the ancient concept (see the Hammurabi law on craftsmen's account-
ability) of being accountable for the quality of your own work.
Not all quality characteristics can be verified by testing; many relevant aspects
require a different approach.
No matter how much we believe in the effectiveness of testing, the crude reality
is that testing is not done because we have no time. No commercial organisation
can really afford to spend 25% of the effort required by the implementation of a
change request on testing. Consequently, testing is either skimped or done
ineffectively.
Fig. 7.13 ROI of testing? (The evolution of quality: Inspection, 1900; Quality Control, 1950-'60; Quality Assurance, 1970-'80; Quality System, 1980-'90; Total Quality, 1990-...)
In other sectors Quality has come a long way; in the software sector we are still
stuck at the Quality Control stage.
Automation makes big promises to make testing affordable and repeatable. But
automated tests have to be maintained and they have to keep pace with the prod-
uct's evolution. It is very much an "Achilles and the tortoise" paradox: tests inevi-
tably lag behind.
[Cartoon: the manager buys the test automation package dreaming of "the test liberation" and a development champion; instead he gets "the test automation nightmare" and a development victim.]
There is a new "silver bullet": test automation. The myth leads us to think that
automation will relieve us of the burden of testing: we automate it so it will run
by itself! But the development and maintenance work caused by test automation is
a potentially disruptive "time bomb".
Final Warning
Test automation is not a substitute for:
choose a wrong model". In every organisation you have to identify the right way
of testing, from a technical and organisational point of view. For example, you
need to focus your testing as best you can.
You must remember that there is no silver bullet.
Another aspect is that engineers do not see a career in testing as an interesting
prospect; it is very difficult to find brilliant people willing to do testing. The tester
is a rare kettle of fish!
PIE (IBM): As in many things, the truth takes something from both positions:
pros and cons. A frequent error is to think that testing is a panacea. Thinking that a
tool or a specific technique solves the quality problem is wrong. The best ap-
proach is to start from your immediate need: I have a problem now and I want to
start solving it immediately, in the simplest and most specific way, very much like
the TRUST approach. As you get more experience and your process evolves, you
will be mature enough to apply more sophisticated solutions. So the problem is
not whether testing is useful or not, but the way you approach it.
Participant: It is true that you cannot test in quality, but we should not forget
that software is a flexible product, very different from a car, so we can profit from
the fact that you can amend software very easily. It's an opportunity that we have
and other sectors do not.
It is true that other sectors have come a long way on quality, but they are not
renouncing final quality control anyway.
As regards the responsibility for quality nowadays software producers are at
least liable for any damage caused by their products.
It is true that many characteristics of software quality are not verifiable through
testing, but we can develop different measurement mechanisms; let's not forget
that software metrics is a young discipline.
As regards the affordability of testing we must be clear that testing our software
is a duty not an option, and our clients will be more and more demanding about
that.
Participant: I am a tester; my domain is integration testing. We need develop-
ers' testing: if we tested the code without this first level of testing it would be a
disaster, even though we have a structured life cycle approach and also use code
inspection tools.
Panel: One of the arguments against testing is the cost of correcting errors dis-
covered too late, mostly errors in requirements. We must keep in mind the
economic implications of testing and debugging. Testing is affordable only if it
is highly effective. The effectiveness of testing is related to finding the relevant
bugs, i.e. those highly visible or serious to the user.
Also related to the economy of testing is the impact on the bottom line of cor-
recting errors while the product is still under guarantee.
Panel: The affordability and usefulness of testing is also determined by your
market situation: if you are competing on a global scale, the reliability of your
product is a critical factor. So you need to allocate resources to improve the prod-
uct quality no matter how you do it. When your market share is at stake, your in-
vestment in testing is perceived in a totally different way.
Participant: We sell embedded systems; in our case software is a minor com-
ponent of our systems, and we have a hard time convincing our customers to pay
for the software, let alone the testing. However, we bear the costs of testing be-
cause we must deliver a reliable product: there is no discussion about that. We
also give a guarantee of one year or more, during which we have to support our
customer whatever happens to our product, and we usually stretch this period to
please our customers.
The presentation focused on the Defects Detection activities that CMM, ISO, and
other quality management models define for software quality improvement; this
experience was carried out during the development of the Exchange products in
Alcatel.
7.2 Third Spanish Workshop 171
Introduction
Software development projects nowadays face three big problems, which de-
velopment models try to solve: lack of product quality, time-to-market and devel-
opment costs.
The Software Quality produced is directly related to:
• the correct implementation of the specified requirements
• the absence of problems with the code and the data
• the ease of use of the Documentation given to Customers
• the ease of maintaining and updating the product
A sample of a project life cycle is shown in figure 7.16. In this figure the main
phases overlap one another indicating some important mechanisms that are carried
out during all the phases, facilitating the achievement of the objectives of the
development process. These mechanisms are:
Fig. 7.16 Sample project life cycle (Requirements Management, System Design, Software Design, Installation, Maintenance), supported throughout all phases by Defect Prevention, Defect Detection and the Project Quality Plan
Table 7.2 shows the efficiency in the detection of errors before the product is
released, according to the CMM level in which the development process is found.
• The review should only begin if the criteria specified in the Project Quality
Plan have been fulfilled. One of these criteria is the availability of the docu-
ment with sufficient time for its review.
• Reviewers should carry out the review on an individual basis.
• Comments should be fully documented so that the author can have answers to
them before the review meeting, thus increasing the productivity of the process.
• The review meeting should take place whenever the comments need to be dis-
cussed and resolved in common, in order to reach an implementation agree-
ment.
• All comments that are accepted by the author should be included in the revised
document before the end of the review meeting.
• Summary metrics should be available. These metrics are:
• Effectiveness: errors/size
• Efficiency: errors/hour
• Speed of review: size/hour
• The review is finished when all the output criteria are met according to the
Quality Plan of the Project.
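The three summary metrics named above are simple ratios; a minimal sketch follows, in which the sizes and units (pages, hours) and the sample values are illustrative assumptions:

```python
# Sketch of the three review summary metrics for a document review.
# Size is measured in pages here; for code reviews it could be lines.
def review_metrics(errors_found: int, size_pages: float,
                   effort_hours: float) -> dict:
    return {
        "effectiveness": errors_found / size_pages,   # errors per page
        "efficiency": errors_found / effort_hours,    # errors per hour
        "review_speed": size_pages / effort_hours,    # pages per hour
    }

m = review_metrics(errors_found=12, size_pages=40, effort_hours=8)
```

Tracking these ratios across reviews lets the process be compared against the objectives set in the Project Quality Plan.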
Experts, who will receive the document, together with the author, should take
part under the supervision of a co-ordinator.
An inspection is basically the same as a formal review with the following differ-
ences:
• more formalism in the input/output and during the process, using the appropri-
ate checklists
• mandatory comparison with the original document
• a lower reading speed
• more specialised roles
• the size of the inspected material is determined dynamically in relation to the
results obtained.
However, certain dangers must be avoided which could easily lead to this activity
not being fruitful. These dangers are:
• Inadequate planning, in which the reviewers have not been assigned a certain
period of time.
• The authors do not accept the process or the review results of their documents.
Authors should know that all human activity produces mistakes and that this
does not invalidate the quality of their work.
• The document or the code to be reviewed is not made available with sufficient
time for review.
• Errors found are primarily publishing mistakes.
• Metrics being used in some way to evaluate the author(s) work.
• The process not being continuously improved.
7.2 Third Spanish Workshop 175
• People not feeling committed to the quality of the products for which they are
responsible.
• The essential mechanisms to be used in review/inspection activities are not
known and accepted at the beginning of the project.
• Input and output criteria are not respected, i.e. the conditions that need to be
fulfilled before a process can be started or before the intermediate/final product
can be delivered. These criteria are applied either by phase, by design area or
by function to be implemented, depending on the selected life cycle.
Checklists are a set of questions to verify, relating to the process carried out,
the effective use of the procedures and tools, or the completeness and consistency
of the product.
Different checklists are applied to each of the intermediate products to be re-
viewed or inspected.
Quality and process metrics are a set of measurable conditions of each prod-
uct/process that objectively measure its fulfilment or the quality of the delivered
product. They measure the completeness of a phase, the number of defects found,
the effectiveness, the efficiency, the productivity of the work, etc.
Metrics should always be contrasted with the previously defined objectives,
which are related to the type of process used.
Tests
The aim of the tests is to validate that the designed software product fulfils all
specified requirements. For their correct implementation, the sub-processes in
which the test phase can be divided are:
• Definition of Test Types
• Test Preparation and Specification
• Execution and Reporting
• Regression Tests
• Problem Control
• Defect Cause Analysis.
Types of Tests
In figure 7.17, we can observe a breakdown of the testing system. Based on the
breakdown model, the following types can be defined.
Under this simulated environment the following aspects will be tested: the
module interface, the content of the updated databases, the hardware accesses, if
any, etc.
Fig 7.17 Breakdown model of the testing system (test types, preparation, execution, regression tests, problem control)
Subsystem Tests
The designer (or a specialised tester) integrates the software module within the
whole set of modules that make up a subsystem, and with the hardware in which it
will be integrated.
Simulated environments can also be used, because not all subsystems may
reach the same quality level at the same time. The largest part of the software runs
in the final configuration (operating system, complete hardware, application code
and data) and will be supported by it right to the end.
The aim is to check, by area or subsystem, the requirements defined in the pro-
ject, as well as checking by regression that the rest of the reused Software (Code
and Data) has not been altered by its introduction.
Simulators are used to generate external events, required to create test condi-
tions.
Service Tests
The tester carries out the tests of each service or individual facility from the
customer's point of view, placing special emphasis not only on its performance
but also on its use. All the subsystems which make up the application are tested
for each service.
In this case, no simulated environments are used and the service runs in a to-
tally real environment; however, simulators are still used to generate external
events.
In this phase a complete Verification of the User's Documentation is made.
System Tests
The primary objective is to try the system under real working conditions. This
means that a set of services is tested working simultaneously, in conjunction with
load tests, processor overload tests, equipment strength tests, limit tests, capacity
tests, etc.
Therefore, simulators are necessary to generate external events for these tests,
as the real load generation can only be done with them.
Qualification Tests
These are carried out by a qualifying team that is independent of the project. A
positive result of these tests is necessary to authorise delivery of the product to the
customer and its commercialisation.
To guarantee the quality of the product, these tests are carried out by means of
random samples and statistical results.
These tests should be carried out in an environment similar to the customer's
actual environment, non-simulated, with the same commercialised software, with
no corrections and with the User's documentation that will then be delivered.
The results of this qualification are analysed during the product Delivery Re-
view: Management guarantees that the product fulfils the output criteria and the
quantitative targets, and authorises the product's delivery and its commercialisa-
tion.
Acceptance Tests
Some customers wish to carry out their own tests before authorising the instal-
lation of the product; in that case the customer usually generates his/her own set
of tests or uses the Supplier's Qualification Tests. At other times, the customer
trusts the supplier to fulfil the specified quality level.
The tests are carried out in a real environment, and simulators are used to gen-
erate automatic tests.
One of the customer's aims is to check once more that the problems existing in
previous versions of the product are not reproduced.
Test Preparation
This consists of three main parts:
• Definition of a testing strategy
• Specification of Test Cases
• Detailed Planning: Test Plan
• If it is not a blocking problem, the next test case is executed. When the new
source is corrected, it will be checked that the error is correctly fixed, and re-
gression tests will be applied to the global operation to make sure that the solu-
tion has not affected the rest of the software.
• Follow-up Criteria
• Periodically, the evolution of the defined metrics is followed with the aim of
determining when the tests are finished.
• Ending criteria can be a minimum number of test cases to be passed, the num-
ber of pending errors to detect, or even the behaviour of any of the ratios or
metrics. These criteria should be predetermined in the test strategy.
• Ending criteria should be guaranteed by a meeting at the end of the phase, in
which input/output criteria, the checklists and the metrics are analysed. The
decision should be made objectively.
Existing Metrics
Table 7.3 shows some examples of metrics to be used in the test phase for its
control and follow-up:
Regression Tests
Regression tests are those tests carried out not because of newly introduced
functionality, but with the aim of checking that the newly introduced software has
not negatively affected existing capabilities of the previously commercialised
version.
Problem Control
Problem Configuration Control permits:
• clear specification of all detected errors through formalised failure reports
• knowledge of the error-fixing status
• knowledge of which errors are fixed and in which version of the intermediate
product/final product
• knowledge of which versions of the development documents are up to date in
relation to the errors fixed
• knowledge of which solutions to errors have been delivered to the customer
• the collection of metrics and problem statistics
• knowledge of the applicability of problems across projects.
The control of problems should be done with a tool that manages failure re-
ports, has a node structure and communication mechanisms, and clearly defines
the problem status sequence (at least: CREATED, ACCEPTED, CORRECTED
and VERIFIED).
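The status sequence can be sketched as a minimal state machine; only the four statuses named above are modelled (a real tool would also handle e.g. rejection), and the class and the report text are invented for illustration:

```python
# Sketch of the failure-report status sequence: CREATED -> ACCEPTED ->
# CORRECTED -> VERIFIED; any other transition is rejected.
from enum import Enum

class Status(Enum):
    CREATED = 1
    ACCEPTED = 2
    CORRECTED = 3
    VERIFIED = 4

_NEXT = {Status.CREATED: Status.ACCEPTED,
         Status.ACCEPTED: Status.CORRECTED,
         Status.CORRECTED: Status.VERIFIED}

class FailureReport:
    def __init__(self, summary: str):
        self.summary = summary
        self.status = Status.CREATED

    def advance(self) -> None:
        """Move the report to the next status in the defined sequence."""
        if self.status not in _NEXT:
            raise ValueError("report already VERIFIED")
        self.status = _NEXT[self.status]

fr = FailureReport("DB update lost on rollback")
fr.advance()   # CREATED  -> ACCEPTED
fr.advance()   # ACCEPTED -> CORRECTED
```

Enforcing the sequence in the tool guarantees that every error's fixing status is always known unambiguously.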
The Problem configuration control should be carried out at least from the be-
ginning of the Subsystem test phase.
The information supplied by each failure report, indicating when the error was
introduced, when it should have been detected, and other details, is a very impor-
tant input for error cause analysis, the aim of which is to define and improve the
processes by carrying out preventive actions.
Conclusion
The practices described in this article are a required support for the verification
and validation activities of a software product. We firmly believe that any Soft-
ware producing unit should follow the steps mentioned by CMM and ISO 9000/3,
to produce high quality products.
Alcatel has experience in their use. The results obtained make us confident that
the defect detection activities described are an excellent mechanism for obtaining
a good quality software product.
Four areas or layers were established to focus the discussions in the Workshop.
The four areas of interest used for the exchange of ideas and opinions were the
following:
• Methodology and Process area
• Technology area (tools used/implemented)
• Change Management area
• Business area
The work groups were made up of approximately 8 to 10 people. In this work-
shop, in order to optimise the size of the working groups, three groups were set up
(Areas A and B were joined into one group).
Prior to the Workshop, the people responsible for the organisation (SO-
CINTEC), with the collaboration of the expert in the area (Miguel del Coso Lam-
preabe, from ALCATEL España), analysed and processed the information received
from the PIEs. Their aim was to obtain preliminary conclusions and findings,
which would serve as discussion topics and stimuli in the work group meetings.
In addition, three people from SOCINTEC and the experts joined the groups
with the aim of helping the discussion.
Summaries of the conclusions presented by each of the groups can be found be-
low.
Measurements
It is necessary to use some minimum indicators that permit us to know the scope
of the tests and their status, the number of tested components and the effort dedi-
cated to verification and validation activities, for example.
The use of metrics is important to cover various fundamental aspects:
• What is the coverage of a set of test cases?
• What is the status of the tests?
• When can we start to test the system as a whole?
• When do we know that the system is sufficiently tested?
Various metrics exist that are considered to facilitate the above-mentioned analysis:
• number of tested requirements
• number of specified test cases already executed
• test effort used vs. planned
• number of detected defects
• evolution of the defect ratio per test case
• evolution of the effort ratio per test case
Tools
Producers and distributors assure us that there are tools covering the whole test
life cycle. However, doubts have been raised about their usefulness outside the
development and maintenance of business management applications.
debates are also held in many countries at a similar level of development. Never-
theless, these debates are not sufficiently efficient and profitable, because man-
agement is not involved and does not make decisions about software re-engineer-
ing in their companies.
Exercises similar to these work sessions should be prepared and encouraged in
the software community, with a closer relationship with the press, whether techni-
cal or not, so that company management is made aware of the need for re-engi-
neering and continuous process improvement.
not stand out, and are usually rewarded to a lesser extent. Without forgetting the
just acknowledgement due to fire-fighters, evaluation processes should give more
consideration to all those who deliver their products efficiently and with quality.
by contract. One of the terms that should be considered in this contract is the pres-
entation of quality metrics; their numeric objectives should be stricter every year.
Some metrics to be supplied relate to the foreseen number of defects, the maxi-
mum period of time to fix them, the minimum performance guaranteed, etc.
It is fundamental to review the test plan, with the aim of establishing the test
coverage and reducing the duration.
The test plan is a powerful mechanism that defines what should be tested, when
and how. Therefore, one of the inputs to the test plan should be the classification
of elements as critical, analysing their complexity, reusability and modification
percentage as well as the results of their reviews and inspections. In this way it
will be known which elements are more prone to defects and, based on this, the
test plan can be redefined.
Through in-depth revision, the test plan can be simplified, aiming for maximum
test coverage at minimum cost and in a minimum time period. Any serious effort
in this direction would yield a significant reduction of project cost and time, en-
suring that the delivery criteria are met without impact on quality objectives.
really occurs, resulting in less effort than in previous projects and with productiv-
ity (tests/person-day) and quality (errors/test case) ratios that are much better.
• MPCM by Transaction.
The workshop was structured in three sessions. In the first session the PIEs pre-
sented their experience; in the second session the expert introduced the topic of
object-oriented testing; and the last session was devoted to discussion among all
the participants: PIEs, expert and external audience.
Tool vendors participated and provided additional views on the subject.
Keynote Address of Robert V. Binder
Software development presents two hard problems that in my view have been
insufficiently addressed: requirements and testing. These problems are closely
related in that testing of a software system is meant to demonstrate - to some
degree of satisfaction - that the requirements for that software system have been
met by a given implementation. In spite of the advances that object-oriented pro-
gramming languages and methodologies offer, testing remains necessary. Devel-
opers make mistakes. Complex systems can easily produce unanticipated results.
The social and economic costs attributed to software failure continue to rise.
Testing as an element of the software development process is normally gov-
erned by economic constraints. That is to say, the degree to which a system is
tested is governed by the tension between the desire for reliability and the time
and money available to achieve it. We distinguish between a reliability-driven
process, which uses testing to demonstrate that a particular reliability goal has
been met, and a resource-limited process, which uses time and money to remove
as many rough edges from the system as possible. (R. Binder's presentation is
summarised here by L. Consolini.) The effect of this trade-off is
evident throughout the life cycle. Consider table 7.4:
These figures illustrate the cost in terms of reliability of various testing envi-
ronments.
Testing presents several fundamental technical difficulties. Since we can never
hope to exercise all possible inputs, paths, and states of a system under test, which
should we try? When should we stop? If we must rely on testing to prevent certain
kinds of failures, how can we design systems that are testable as well as reliable
and efficient?
Such issues have been considered in detail for so-called "conventional" soft-
ware systems. Many answers, both practical and otherwise, have been proposed
and debated and a few have even been subjected to empirical validation. Until
recently, less attention has been paid to testing of object-oriented implementations.
More attention is needed, however, because the increased power of object-oriented
languages creates new opportunities for error.
Testing of object oriented implementations raises further issues. Each lower
level in an inheritance hierarchy is a new context for inherited features; correct
behaviour at one level in no way guarantees correct behaviour at another level.
Polymorphism with dynamic binding dramatically increases the number of possi-
ble execution paths. Static analysis of source code to identify paths (the bedrock
of conventional testing) is of relatively little help here. While limiting the scope
of effects, encapsulation is an obstacle to the controllability and observability of
the implementation state. Components offered for reuse should be highly reliable;
extensive
testing is warranted when reuse is intended. However, each reuse is a new context
of usage and re-testing is prudent. It seems likely that more, not less, testing is
needed to obtain high reliability in object-oriented systems.
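To make the inheritance and polymorphism points concrete, consider a minimal sketch (the `Account` classes and amounts are invented for illustration, not taken from the workshops):

```python
class Account:
    def __init__(self, balance: float):
        self.balance = balance

    def withdraw(self, amount: float) -> float:
        """Allow any withdrawal covered by the current balance."""
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        return self.balance


class SavingsAccount(Account):
    """A new context for the inherited feature: the contract is narrowed,
    so tests that passed against Account prove nothing about this class."""
    MIN_BALANCE = 100.0

    def withdraw(self, amount: float) -> float:
        if self.balance - amount < self.MIN_BALANCE:
            raise ValueError("would fall below minimum balance")
        return super().withdraw(amount)


def close_month(account: Account) -> float:
    # A single static call site, but dynamic binding selects the behaviour
    # at run time: every concrete type reachable here is a distinct
    # execution path that static path analysis will not reveal.
    return account.withdraw(50.0)


for acct in (Account(120.0), SavingsAccount(120.0)):
    try:
        print(type(acct).__name__, "->", close_month(acct))
    except ValueError as exc:
        print(type(acct).__name__, "-> rejected:", exc)
```

The same call succeeds for the base class and fails for the subclass; a test suite that exercised only `Account` would miss the second path entirely.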
In addition, the role of the tester must be considered. Traditionally, testing has
been viewed as a low-level task, beneath the concern of "real" programmers. It is
becoming increasingly clear that the importance of testing is on a par with that of
design and development. Most implementations can benefit from various forms of
testing throughout the life cycle. It is important to consider, for example, whether
requirements are testable. If a requirement cannot be tested, how will it be judged
completed, much less correctly implemented?
7.3 Pilot German Workshop
• The advantages of testing for project members should be clearly pointed out in
order to get them "onboard".
• It is important to look for success experiences in an early stage.
• The PIE as a first experiment causes a higher effort. The advantage will be
shown in the following projects.
• Improving test quality adds value to the jobs of those involved in testing and
therefore provides motivation and satisfaction. This leads to the effect that
more experienced people join the test groups, which further improves the
quality of testing.
• Job rotation between development and test groups is an effective means to
improve satisfaction and understanding.
• It is advisable to win management support with a business case for improving
testing practices grounded in ROI and customer satisfaction rather than in
technical issues. Evaluating and presenting the maintenance costs saved by
increased test quality should be addressed to convince management.
• Code reviews are recognised as one of the most effective quality practices;
however, it seems very hard to argue for this method in order to convince
management and even team members.
• Establishing sound test practices costs too much effort for small companies.
Errors are then detected at the customer's site.
7.3.3.2 Tools
It was recognised that technology should be seen as a supporting aid to a well-
defined process. Technology cannot be the first and only concern for testing im-
provement. A list of key lessons for success in the choice of tools was reached at
the workshop:
• It is necessary for the potential customer to evaluate a tool in his own environ-
ment and with his own data. A demonstration by the vendor is not enough.
• A checklist of evaluation criteria establishes a sound base for communication
between the customer and the tool vendor. This common base will improve the
selection.
• The quality of the documentation is essential; quality here means readability
and conformity with the software.
• The "standing" of the tool vendor (i.e. will the vendor still exist in ten years?)
was pointed out as an important selection criterion.
The various workshops reported lessons in several areas. Chief among these were
the importance of managing people and business issues; however, a number of
technical conclusions were reached as well. The editors chose the most relevant
lessons and tied them back to the workshops' conclusions presented previously.
7.4 Lessons Learned from the Workshops
7.4.1.2 Lesson 2
There is a persistent training and motivation problem with testing. Few engineers
have real testing skills and it is extremely difficult to find personnel motivated to
pursue a career in testing. A poor but common practice is to assign those who are
not good at programming to testing.
The human factors associated with gathering measurements should be studied
in great detail. The measurement objectives should be clear and respected by
all who use them. Everyone involved should agree to the use of measurement for
process improvement and that the data is not to be questioned afterwards.39, 40
37 Refer to the third Spanish workshop, chapter 7.2.3: Perception of professional category
difference between software developers and "testers"; a specialised "tester" is very effi-
cient, but creates barriers for developers; in some cases the independence of the tests is
requested by regulations.
38 Refer also to the pilot German workshop, chapter 7.3.3, people issues.
39 Refer to the third Spanish workshop, chapter 7.2.3: Company culture to reward "fire-
men"; not much training in techniques and test planning.
40 Refer also to the second Italian workshop, chapter 7.1.3: A cultural growth on testing is
paramount; skills.
An analysis of where the risk is highest should be part of test planning.
7.4.2.2 Lesson 4
The highest ROI can be achieved through re-use of test cases, but to ensure a
focused re-use you need to link tests and requirements, to identify which tests
verify which requirements.44 This type of traceability can be achieved by extending
the usual capabilities of configuration and version management tools.
7.4.2.3 Lesson 5
It is necessary to define a metrics policy so that an adequate balance can be ob-
tained between the cost of measuring and the benefits sought.
Metrics make it possible for the clients to change their perception of processes
and services in the long term. The information used should be part of the culture
of the organisation.
Metrics have served to increase awareness of the measuring objectives and to
transmit the priorities to the organisation.
Metrics are considered very important as a means to align the technological ob-
jectives with those of the business. They also help to determine the value that
technology adds to the company and to improve the credibility of the techni-
cians.
7.4.2.4 Lesson 6
The introduction of a software verification and measurement program is a strate-
gic project, and should not be planned exclusively as a classical investment benefit
analysis. If the implementation is to be successful there must be an adequate plan,
organisation and the necessary resources.47, 48
41 Refer to the third Spanish workshop, chapter 7.2.3: Software test is an unquestionable
fact.
42 Refer to the second Italian workshop, chapter 7.1.3: Handling schedule pressure and
risk.
43 Refer also to the pilot German workshop, Chapter 7.3.3.
44 Refer to the second Italian workshop, chapter 7.1.3: The ROI of automated testing.
45 Refer to the third Spanish workshop, Chapter 7.2.3: Data collection during system opera-
tion: making developers aware of extended quality.
7.4.3.2 Lesson 8
With new technologies we are taking steps backwards as regards testing. It is
very common to have no testing at all of Web applications. It is mistakenly be-
lieved that static Web applications do not need any testing, and the staff involved
is usually very much on the hacker side. We have to adapt traditional testing
methods to this new technological setting and ensure at least a 15% testing effort
within a project.52
7.4.3.3 Lesson 9
We should not be over-ambitious in the collection of the measuring data. Only
the information related to the variables with which you are going to work should
be collected, as the validity of the data and its trend are more important than its
precision.
Metrics should be used in an OO software development project:
• During the design and development stage, to validate the quality of the software
architecture and of the code.
• They should be calculated at the same time as the corresponding elements come
to form part of the configuration management.
The measurement objectives should be taken into account during the whole
identification and implementation processes. The use of the GQM methodology
has proved very useful.
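The Goal-Question-Metric idea behind that recommendation can be captured in a few lines; the goal, questions and metrics below are invented examples of the kind of hierarchy GQM prescribes, where every metric must answer a question and every question must serve a stated goal:

```python
from dataclasses import dataclass

@dataclass
class Question:
    text: str
    metrics: list[str]   # each metric exists only to answer this question

@dataclass
class Goal:
    purpose: str
    questions: list[Question]

# An invented GQM tree for a testing-improvement goal.
goal = Goal(
    purpose="Improve system-test effectiveness from the project manager's viewpoint",
    questions=[
        Question("How many defects escape to the customer?",
                 ["post-release defects per KLOC"]),
        Question("Where is the test effort spent?",
                 ["hours per test phase", "test cases run per phase"]),
    ],
)

# Collect only the metrics that trace back to the goal; anything else
# is, by construction, data we should not bother gathering.
metrics = [m for q in goal.questions for m in q.metrics]
print(metrics)
```

Deriving the collection list from the goal tree, rather than the other way round, is exactly the discipline that keeps measurement cost in balance with the benefits sought.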
49 Refer to the third Spanish workshop, chapter 7.2.3: Tools.
50 Refer to the second Italian workshop, chapter 7.1.3: Developing a testing automation
strategy is very important; Tools.
51 Refer also to the pilot German workshop, Chapter 7.3.3: Tools.
52 Refer to the second Italian workshop, chapter 7.1.2: Expert presentations.
In conclusion, the most relevant message was to avoid applying ready-made but
inadequate models to your organisation. One should take a balanced approach
where different levels and classes of testing and testing automation coexist. The
needs and the available resources should determine the approach.
It is evident that there is great interest in this subject, but the level of use of
metrics continues to be very limited. In many cases it is apparent that a metrics
program is carried out by personal or group initiative, rather than by the imple-
mentation of structured programs promoted by company management.
The use of metrics in some areas is still at an immature stage. In general, a gap
has been observed between the world of research and the final users, who cannot
yet easily use these techniques.
Because of this lack of experience, market interest suffered until recently.
There are a limited number of consultancy companies that can offer experience,
knowledge and products within this area.
It is not possible to improve software quality without knowing the quantitative
change in process improvement. Metrics provide a necessary base that makes this
possible.
53 Refer to the third Spanish workshop, Chapter 7.2.3: Measurements.
54 Refer also to the pilot German workshop, Chapter 7.3.2: Expert presentations.
8 Significant Results
L. Consolini
Gemini, Bologna
The EUREX workshops covered a great deal: methods, tools, skills and new proc-
esses have been widely and deeply touched upon, either by the speakers or in the
material collected and analysed by the organisers.
It is now time to draw some conclusions from it all.
In chapter 4, inadequate application of product verification methods, techniques
and tools by commercial software developers was lamented. The contrary is true
with the ESSI PIEs covered here: most of them were performed by commercial
software organisations. Evidently, rigorous software verification is rapidly exiting
its traditional niche (those who provide safety critical software) and entering a
world of carefully controlled budgets, tight schedules, and strong competitive
pressures. Until recently, we have been forced to come to terms with defective
commercial software. Customers learned to accept low quality software, almost
never fighting back, while developers continued to patch the code here and there
and testers continued to be frustrated and relegated to the lowest ranks of the de-
velopment team.
What emerged from the EUREX workshops is a realisation that the situation is
rapidly changing, in Michele Paradiso's words: "The competitive pressure on high
quality software, stringent budgets and aggressive cycle times requires increased
productivity while sustaining quality in all phases of the software development life
cycle. The business imperative for organisations in the 2000s is to gain competi-
tive advantage while reducing time to market and at the same time minimising
business risk: all this means getting a new application and/or solution out of the
door in a hurry."
Since it is no longer possible to safely ignore (in business terms) the product
quality issue, many companies are now asking the crucial question: What and how
should be improved in our process to achieve and maintain a competitive level of
product quality?
M. Haug et al. (eds.), Software Quality Approaches: Testing, Verification, and Validation
© Springer-Verlag Berlin Heidelberg 2001
8.1 Barriers Preventing Change of Practices

In the author's experience, there are several barriers preventing software organisa-
tions from moving more decisively to upgrade their V&V practices. These are
discussed in the following sections.
In each of the EUREX workshops, it was pointed out several times that there is not
enough cultural support for testing and software verification. Software developers
do not learn how to do it and no reward system is in place to motivate and recom-
pense good testers. In many companies testing is not even a "job", it is simply
what remains for developers to do after the code is finished.
This leads us to conclude that training efforts aimed at spreading awareness and
knowledge are critical to really change the "quick and dirty" approach we have
been accustomed to.
Unfortunately an appropriate training offer seems lacking or, at least, it has a
difficult time in reaching many SMEs and practitioners who are still far from
getting the basic education needed to successfully implement and use new emerg-
ing software verification support tools.
In this context the risk is that the first and only contact practitioners have with
verification methodologies is through highly commercially pitched tools. Also, as
the experts made clear in their introduction, many tools are now commercially
available at affordable prices, although their capabilities need a "reality check"
and the people possessing the skills necessary to use them effectively are rare.
Developers and managers alike can be lured into thinking that tools (or even
just one tool) are the silver bullet. The expectations raised by catchy marketing
messages are almost never met. It becomes clear that automating poorly designed
tests only makes them easier to execute; the quality level of the product remains
basically the same.
Designing good tests means designing tests with a high probability to find de-
fects, simple enough to make defects easily reproducible, significant enough to
make fixing the errors behind the defects well worth doing. When automation and
tools come into play, a good test is also a maintainable test, with minimal depend-
ence on what changes frequently in the code.
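What "minimal dependence on what changes frequently" can look like in practice is sketched below (the screen, driver and prices are invented): the test talks only to a small, stable driver interface, so when the volatile details behind it change, only the driver is updated, never the test.

```python
import unittest

class OrderScreenDriver:
    """A thin, stable driver: only this layer knows the volatile details
    (widget names, field layout) of the screen under test."""

    def __init__(self):
        self._items = []

    def add_item(self, name: str, price: float) -> None:
        self._items.append((name, price))

    def displayed_total(self) -> float:
        # A real driver would read this from the GUI; simulated here.
        return round(sum(price for _, price in self._items), 2)


class TestOrderTotal(unittest.TestCase):
    def test_total_is_sum_of_item_prices(self):
        screen = OrderScreenDriver()
        screen.add_item("widget", 19.99)
        screen.add_item("gadget", 5.01)
        # The test asserts observable behaviour, not screen layout, so a
        # cosmetic redesign of the screen does not invalidate it.
        self.assertEqual(screen.displayed_total(), 25.00)


if __name__ == "__main__":
    unittest.main(argv=["order-tests"], exit=False)
```

The failure such a test produces is also easy to reproduce and cheap to diagnose, which is the other half of what the text calls test quality.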
To achieve this "test quality" you need test design skills, experience and a cer-
tain amount of flair. Perhaps you need to become a professional tester: testing as
an "if-we-have-spare-time-occupation" is not enough any more.
This is the hardest barrier to overcome. In an industry where we are not used to
measuring the results of process change, and even less to quantifying the costs of
non-quality, it is generally difficult to tell whether and when change pays off.
On one side you have the costs, the resources and the time that a verification
process change takes out of your pocket, but you seldom have similarly concrete
values on the other side.
One must be very clear about this: you have to go a long way in training, men-
tality change and infrastructure set up before you can really have meaningful data
to measure your results in a way that can speak a clear business message to those
who hold the purse strings. You have to commit yourself to a difficult endeavour
and to keep momentum mostly out of your goodwill and of the comfort that oth-
ers' successful experiences can give you.
In this respect the ESSI PIEs and the analysis carried out by EUREX can be help-
ful and inspiring, mostly to assist with choosing a suitable path and to avoid com-
mon mistakes.
In particular all three EUREX workshops warned against an unconsidered, head-
long plunge into testing automation. It was clear that automation pays off only if
applied in the right way and to the right types of testing. Moreover, testing is not
the only verification means and, very probably, not the most effective. EUREX
gave inspection-oriented PIEs a forum to present their results and to dispel some
myths. EUREX confirmed that there is a whole range of very promising manual or
semiautomatic verification techniques that deserve some attention because they
are effective and focused on early discovery of errors (not to mention their poten-
tial for prevention).
Quite naturally manual or semiautomatic verification techniques are more
costly and require higher skill levels, but as more data are being gathered about
the effectiveness of such techniques, they become more and more convincing, at
least for critical code.
It is unpleasant news, but there is still too much acceptance of low quality by
software consumers. Many software organisations will not fully commit to quality
improvement unless they feel some market pressure to do it. To the contrary,
many workshop participants remarked that many customers are not willing to pay
for higher quality and they set the time and cost targets so low that meeting them
becomes a matter of deciding where to cut.
"How can you care for quality when the fiscal legislation changes overnight
and your customers want the new release at no cost and at lightning speed, lest
they incur bitter sanctions?" Similar questions arose time and time again at the EUREX
workshops. It is hard to give a reasonable answer except by offering very general
advice, such as: "Aim for enough quality, tailor your quality targets to what is
suitable to compete and win in your market!" In other words quality should be
providing value to somebody who recognises it. There is no such thing as absolute
quality.
From this line of reasoning stems the relevance of choosing the right (i.e. ade-
quate and appropriate) software quality strategy, which will then be your specific
strategy, not a blind application of standards and models good for all seasons.
It is evident that raising software verification, and testing in particular, from the
lower ranks of software tasks has many educational and cultural implications.
The reported experiences revealed a dramatic lack of structured and effective
courses on product verification integrated within the usual software engineers'
educational curricula. In addition, little investment in training has traditionally
been made to prepare competent testers. The result is a lack of knowledge and,
consequently, a general lack of motivation to undertake a career in testing.
The best return on investment is certainly obtainable through skill improve-
ment.
8.2 Best Practices Recommended by Experts

In the improved process model discussed by one of the experts in Chapter 4, prod-
uct verification ceased to be an indistinct package of work to be carried out at the
end of development; rather, product verification was described as a process struc-
tured into manageable units and critical decision checkpoints. This view is coher-
ent with the unanimous emphasis that experts put on planning and making V&V
sufficient to achieve the desired quality targets.
Most of the PIEs related more to execution, and how to make it effective, than to
planning; however, many of them realised along the way that to gain real benefits
it was necessary to formalise the new practices and to integrate them into the de-
velopment process.
Some of the PIEs interpreted integration as traceability, particularly with the
requirements process, others explored modifications to the project planning and
estimation process to take V&V into account. Of particular interest was the inte-
gration of regression testing and maintenance to ensure a consistent quality level
over time.
The great steps forward that have been made in automation are certainly part of
the reason for the predominance of product verification over other improvement
areas in ESSI. Both the experts and the software organisations performing the
PIEs are focused on the same theme. In accordance with the diffusion of more
rigorous verification practices into the commercial software industry on the one
hand and the general growth of competitive pressure on the other, it is clear
that the increase in productivity and reuse promised by automation can be ex-
tremely appealing.
The tool vendors are clearly responding to this market opportunity with a wide
and diversified range of products, which were described in Chapter 4. On the
down side, Chapter 4 also warns against common mistakes and unjustified expec-
tations that usually go hand in hand with automation. The PIEs' experience con-
firms that some bad news mitigates the enthusiasm engendered by the availability
of tools:
• investment in automation can be jeopardised by the product evolution if meas-
ures are not taken to stabilise test cases with respect to software changes
• tools need a considerable set-up and tailoring effort if not the development of
in-house integration software
• automation does not mean automatically improving the quality of the verifica-
tion cases that we run against our code. In other words automating a badly de-
signed set of cases only means doing a bad job faster.
Perhaps the most useful lessons to be derived from the EUREX workshops con-
cern the recommended approaches to the use of automation and, in particular,
the common mistakes to avoid around code coverage:
• Embracing code coverage with the devotion that only simple numbers can
inspire.
• Removing tests from a regression test suite just because they don't add cover-
age.
• Using coverage as a performance goal for testers.
• Abandoning coverage entirely.
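To ground these warnings, it helps to see what a coverage number actually is, reduced to a minimal statement-coverage tracer (a sketch built on Python's standard `sys.settrace` hook; `classify` and its inputs are invented). Note what the numbers do not say: both runs below raise coverage without checking a single result.

```python
import sys

def traced_lines(func, *args):
    """Run func and record which of its line offsets execute."""
    hits = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is code:
            hits.add(frame.f_lineno - code.co_firstlineno)
        return tracer  # keep tracing lines inside new frames

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return hits

def classify(x):
    if x >= 0:
        return "non-negative"
    return "negative"

one_input = traced_lines(classify, 5)                 # exercises the first branch only
both_inputs = one_input | traced_lines(classify, -5)  # a second input adds a line
print(len(one_input), "covered lines ->", len(both_inputs), "covered lines")
```

Coverage rises from two executed lines to three, yet neither run asserts anything about `classify`'s answers: precisely why coverage makes a poor performance goal and a poor deletion criterion, while remaining useful as a detector of untested code.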
8.4 The EUREX Process

A last word on the EUREX process is in order. It is clear that we are not talking of a
scientifically validated way to formulate hypotheses and to check them out. How-
ever, EUREX made an effort to achieve a systematic and grounded interpretation of
a considerable number of field cases and to derive a set of lessons transferable to a
wider community of potential users.
The process for doing this was conceived of from scratch. For the most part,
nothing of the sort had been attempted before in the software engineering and
quality field. As a result, it inevitably underwent considerable day-to-day adjust-
ments as it was deployed.
The EUREX approach had to consider different cultures, changing audiences, and
the contingent difficulties. The interpretation and the value of the data we were
gathering were neither always evident nor readable in just one unambiguous way.
However we are fairly convinced of the final results for several reasons:
• the conclusions of the various workshops were uniform; little if any contradiction
was observed by the editors;
• the PIEs' view - mostly a practitioners' one - was balanced by the experts'
position that was certainly more inspired by what the theory and the literature
say;
• the final lessons have been elaborated on the basis of a rich and vast material:
PIEs reports, workshop proceedings, experts' papers, domain specific literature.
The confidence we have developed in the EUREX results makes us also confi-
dent that there is a need to publish them and bring them to the attention of the
software community at large in order to stimulate further discussion and action.
We therefore claim not an academic but a practical, practitioner-oriented
value for our work, very much in line with what the workshop audiences always
asked of us.
Part III
Table 9.1 below lists each of the PIEs considered as part of the EUREX taxonomy
within the problem domain of Testing, Validation and Verification.
9 Table of PIEs
MATICA S.p.A.
24266 1996 EXOTEST   DASSAULT ELECTRONIQUE                                F
24157 1996 FCI-STDE  PROCEDIMIENTOS - UNO S.L.                            E
21367 1995 FI-TOOLS  TT TIETO TEHDAS OY                                   SF
23887 1996 GRIPS     SIMCORP A/S                                          DK
24306 1996 GUI-TEST  IMBUS GmbH                                           D
23833 1996 IDEA      ISTITUTO NAZIONALE PREVIDENZA SOCIALE                I
24078 1996 IMPACTS2  DTK GESELLSCHAFT FÜR TECHNISCHE KOMMUNIKATION mbH    D
21733 1995 INCOME    FINSIEL S.p.A.                                       I
10482 1993 IQASP     LABEIN                                               E
10482 1993 IQASP     B.Y.G. SYSTEMS LTD                                   UK
10163 1993 IRMA      BRITISH AEROSPACE DEFENCE LTD                        UK
23690 1996 MAGICIAN  MAGIC SOFTWARE ENTERPRISES LTD                       ISR
21224 1995 METEOR    LOG.IN                                               I
10228 1993 MIST      GEC-MARCONI AVIONICS LTD                             UK
10788 1993 ODP       FESTO Ges.m.b.H.                                     A
24053 1996 OMP/CAST  OM PARTNERS N.V.                                     B
23743 1996 PCFM      LUCAS AEROSPACE                                      UK
10438 1993 PET       BRUEL & KJAER MEASUREMENTS A/S                       DK
21199 1995 PI'       ONION                                                I
24344 1996 PIE-TEST  LGTsoft                                              B
23705 1996 PREV-DEV  MOTOROLA COMMUNICATIONS ISRAEL                       ISR
21417 1995 PROVE     CAD.LAB S.p.A.                                       I
23834 1996 QUALITAS  Management Data, Datenverarbeitungs- und Unternehmensberatungsges. m.b.H.  A
23978 1996 RESTATE   BOSCH Telecom GmbH                                   D
10494 1993 SDI-WAN   TECNOMET PESCARA S.p.A.                              I
10824 1993 SIMTEST   DATASPAZIO TELESPAZIO E DATAMAT PER L'INGEGNERIA DEI SISTEMI SPA  I
21612 1995 SMUIT     ABB Netzleittechnik GmbH                             D
21394 1995 SPIDER    ETRA SA                                              E
10875 1993 SPIMP     PHILIPS MEDICAL SYSTEMS                              NL
23750 1996 SPIP      ONYX TECHNOLOGIES                                    ISR
21799 1995 SPIRIT    BAAN COMPANY N.V.                                    NL
24193 1996 STOMP     TECHNODATA INFORMATIONSTECHNIK GmbH                  D
21160 1995 STUT-IU   OY LM ERICSSON AB                                    SF
23855 1996 SWAT      TELECOM SCIENCES CORP. Ltd                           UK
21385 1995 TEPRIM    IBM SEMEA SUD s.r.l.                                 I
The Experiment
To achieve the above mentioned objective, we will design and build an automated
system which gives the biggest benefit, meaning reduced workload and smallest
fault tolerance. During the baseline project we will concurrently do manual code
reviews and run static and dynamic code analysis tools to determine the soft-
ware quality factor. The results of the reviews and the resulting metrics will be
combined to find a correlation. Out of this correlation report we will be able to
make an automated selection of sources which have to be manually reviewed. By
using the metric tools we will already exclude part of the faults while the auto-
matic selection system will make sure the critical sources get reviewed.
Though the framework must be applicable to any software development project
we will use one of Denkart's typical legacy business application migration pro-
jects as baseline project.
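The selection mechanism described above can be sketched in a few lines (file names, metric values and the threshold are invented; the real experiment would feed in its own review and tool data): correlate the per-file metric with the defects found by manual review on the baseline project and, if the correlation is strong, let the metric flag which sources must still be reviewed.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Baseline-project observations: a complexity-style metric per source file,
# and the defect count found by manual review of the same file.
metric = {"a.c": 4, "b.c": 21, "c.c": 9, "d.c": 30}
defects = {"a.c": 0, "b.c": 5, "c.c": 1, "d.c": 7}

files = sorted(metric)
r = pearson([metric[f] for f in files], [defects[f] for f in files])

# If the correlation is strong, the metric alone can flag the critical
# sources that must still be reviewed manually.
THRESHOLD = 15
flagged = [f for f in files if r > 0.8 and metric[f] > THRESHOLD]
print(flagged)
```

Only the files the metric flags go to manual review, while the cheaper tool checks run over everything, which is exactly the workload reduction the experiment aims for.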
10 Summaries of PIE Reports

A process support group has been put in place in the engineering department with
two basic tasks:
• initiating and sustaining process change,
• supporting the projects as they use methods, standards, technology (normal
operations).
This group serves as a consolidating force for the changes that have already been
made. Without such guidance, lasting process improvement is practically impossi-
ble.
At the end of this experiment, SAIT Devlonics has outlined the following key
lessons learnt:
• the introduction of integrated tool support for the software development life
cycle made it possible to automate the procedures defined in the SAIT Quality
System (SQS);
• the quality level was improved by greater involvement from quality control
(QC) and the existence of static and dynamic metrics. However, the lack of
transparency between the host and target environments, the weakness of data
set management and the absence of functional testing must still be resolved.
• the configuration management system is appropriate for document manage-
ment. Indeed, it offers much flexibility for adapting the configuration to the va-
riety of projects developed by SAIT Devlonics. But the facilities available for
the configuration of source code and executables (software component reus-
ability) are insufficient.
To conclude, this experiment has contributed to SAIT Devlonics' efforts to
pass ISO 9001 certification.
The next actions will be carried out to improve software component reusability
and to increase productivity in order to avoid losing competitive power.
SAIT DEVLONICS s.a./n.v.
Chaussee de Ruisbroek, 66
B 1190 BRUSSEL
BELGIUM
interest groups for this project are companies who test software as part of their
product development life cycle, and those who need to automate the process or
parts of it, using tools.
The ALCAST project ran from January 1994 until June 1995. During Phase 1,
software testing practices in both companies were first assessed against current
best practice in the industry. Having identified key areas for improvement, the V
Model was implemented as a process framework and then further enhanced using
the Systematic Test and Evaluation Process (STEP) methodology.
In Phase 2 of the project, the VHI piloted an on-line test environment with
automated defect tracking and change management. In QFS, support for STEP
was included in their existing corporate information system and automation of
both regression testing and static analysis took place. The main lessons learned
were as follows:
• Specific ALCAST lessons:
• Testing should be involved at the project requirements stage.
• The STEP methodology has proved effective when tailored for individual com-
pany needs.
• Unit testing pays, but overheads and administration should be kept to a mini-
mum.
• Test automation is beneficial but has a significant learning curve.
• Metrics should be kept simple and usable.
• Training for best practices is essential to ensure the success of a company wide
implementation.
• General Project Management lessons:
• Improvement must be managed as a mainstream project in a company, with
equal or higher priority than core business projects.
• Expertise in tools and automation should be gathered as a company asset into
teams and used as a resource on projects.
• The initial assessment in the cycle of Assess, Improve and Measure is critical
for gauging the success of the Project.
The next actions will be to run an end of project dissemination event in Ireland
(estimated 150 companies attending) and distribute this report in booklet form to
Q'SET customers (>7000).
The project was regarded as a success in all 3 companies and plans are in place
for company wide implementation.
Members of the ALCAST Project gratefully acknowledge the moral support
and financial help provided by the ESSI Group at the European Commission with-
out which ALCAST would not have happened.
10.4 AMIGO 21222

• The experiment has confirmed an initially negative attitude among the soft-
ware engineers towards maintenance work. Some improvements introduced
in the management of software defects have achieved outstanding acceptance
from the people involved.
ELIOP, S.A.
The key lessons learnt from this experiment can be summarised as follows:
• Development of an Operational Profile proved to be the most difficult and
time-consuming task.
• Special care should be given to data collection, since this will have a great
impact on the accuracy of the measurements and the results.
• Training of the people involved in SRE projects is very important for the suc-
cess of the experiment.

10.6 ASTEP 23860
The Experiment
The test approach, consisting of a test concept and a supporting test and simulation
environment, shall cover all system test phases, from the factory test up to
the final acceptance test.
In particular, it is foreseen to provide several sub-projects within the baseline
project with tools supporting the test specification, preparation, and performance
on system level. Special care is taken to integrate the engineers of the customer
in the experiment from the very beginning. In a second step, the test environment
will be extended by an efficient simulation system, making it possible to determine
in advance the operational behaviour and the side effects due to changes in the
plant configuration as well as in production. The last step will be the introduction
of the test and
simulation environment at site for an even better integration of the end-user.
The Experiment
The baseline project chosen for this experiment is the Envisat-l Monitoring and
Control Facility (MCF) project. This 2.8 MECU development is being performed
to tight budget and schedule constraints, and can benefit directly from the applica-
tion of the techniques being proposed within ASTERIX.
The ASTERIX experiment will determine whether the additional effort sup-
plied in the system testing phase is repaid in terms of higher product quality at a
reduced overall price, compared with the measured quality on other similar pro-
jects performed by Anite Systems for major clients. This reflects the overall goal
to demonstrate that the approach proposed will result in real gains both for the
customer (through receiving a better quality product, allowing him to reduce op-
erational costs) and for Anite (through reducing the overall lifecycle costs).
The test approach defined in the ATECON project will be exploited by all project partners. It will be integrated into their respective software development standards and therefore used in future development projects within their organisations. It is also planned to extend the test approach to cover object oriented system development, with new challenges like dynamic linking and overloading.
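The testing challenge that dynamic binding poses can be sketched with a small example (illustrative only, not taken from the ATECON project): a test written once against a base-class interface exercises different code depending on the concrete object supplied at run time, so every subclass needs its own test run.

```python
# Illustrative sketch (hypothetical classes): why dynamic dispatch
# complicates testing. A test written against the base class must be
# re-run for every subclass, because the behaviour actually executed
# is chosen at run time.

class Queue:
    """Base class: FIFO container."""
    def __init__(self):
        self._items = []
    def put(self, x):
        self._items.append(x)
    def get(self):
        return self._items.pop(0)

class PriorityQueue(Queue):
    """Subclass overrides get(): smallest item first."""
    def get(self):
        smallest = min(self._items)
        self._items.remove(smallest)
        return smallest

def check_retrieval(q):
    """A test written once against the base interface."""
    q.put(2)
    q.put(1)
    return q.get()

# The same test code exercises different behaviour per concrete class:
assert check_retrieval(Queue()) == 2          # FIFO order
assert check_retrieval(PriorityQueue()) == 1  # overridden order
```

Because the overridden `get` is bound only at run time, static inspection of the client code cannot reveal which version runs; only executing the test against each concrete class does.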
This project was carried out in the framework of the Community Research Pro-
gramme with a financial contribution by the Commission of European Communi-
ties.
DLR
Oberpfaffenhofen
Germany
By executing the ATM project we have been able to experiment with, and verify on a real project, the methodological, technological and organisational solutions adopted. In particular, the activities performed have made it possible:
• to assess the initial testing process and to define the improvement and measurement plan
• to train the people involved in the experiment on the theoretical and practical aspects of software testing
• to define an appropriate test life cycle coherent with the company software life cycle
• to formalise the test procedure and to define the responsibilities of the roles involved in the test activities
• to define the new testing environment, integrated into the company development environment
• to trial the test procedures and standards, and the test documentation management system, on a real project
• to determine the most effective organisation for testing activities
• to assess the final testing process in order to measure the achieved improvement
• to verify the usefulness of measures for evaluating the final product quality
Finally, some key lessons have been learnt from the ATM project execution, as a result of the difficulties and problems met during the experimentation. The main lessons learnt concern organisational, technological and business aspects. In particular, we have been able to verify the importance of training people, the need for strong support from top management, and the usefulness of trying out organisational and technical solutions before adopting them.
SVIMSERVICE
Via Massaua "Complesso il Faro", Puglia
70123 Bari
Italy
object oriented systems is not sufficiently developed. Thus the main interest of the partners is to extend their software engineering approach by introducing a systematic test concept for object oriented systems. This project aims at developing and putting into practice an efficient, state-of-the-art, well-founded, and cost-effective test
approach for object oriented systems that covers all test phases (unit, integration,
system and acceptance test phases). The test approach must be scalable to projects
of different sizes and with different reliability, availability, maintainability and
safety requirements. It must be a pragmatic approach to be applied on real-world
projects right away. This concept is to extend the software engineering approach
from the partners (DLR and ConSol) and is a further step towards the ISO 9001
certification. The installation of this defined test process will enable the improve-
ment of the software quality produced by the partners. Therefore, a higher cus-
tomer satisfaction will be reached and the consortium will achieve a quicker reac-
tion to new requirements imposed by the market. This will enable us to reach a
better position in the market, to increase the efficiency and effectiveness and
therefore to reduce costs. Selected results and experiences gained within the ex-
periment will be available to all companies interested in a systematic test approach
for object oriented systems. Additionally, companies aiming at ISO 9001 certification will be able to gain insight into the costs, efforts and experiences of the ATOS consortium, and to analyse the possibility of including this test approach in their own companies.
The Experiment
Within the framework of the entire development life cycle, the focus of this ex-
periment is the definition and application of a systematic test approach for object
oriented systems and the enrichment of the knowledge on testing object oriented
systems of the baseline projects team members and their management. The test
approach generated and applied in this application experiment shall fulfil the fol-
lowing main requirements:
• the test approach shall be based on a modular test concept and on a supporting test environment
• the concepts and the environment shall consider the interfaces to the different
life cycle phases
• it shall cover all test phases, from the module test through the integration and system test phases to the acceptance test phase
• the integration of quality assurance activities shall enforce the correct applica-
tion of the test methods, procedures and tools defined within this approach and
result in an ongoing improvement of these concepts and environments
• CASE tools should be used to support the different test activities.
The main activities to be performed within this application experiment are:
• an assessment in the baseline projects to identify the existing test practices and to determine what knowledge of methods and techniques for testing object oriented systems is available in the project teams.
10.11 AUTOMA 10564
• configuration management
• regression testing
The project has selected the appropriate tools and technologies, and has used
them to build two complementary experiment scenarios, based on the maintenance
activities of two project groups (one for each partner).
The project has been fully successful in the experimentation of Configuration Management and Requirements Management.
In the first case, the whole maintenance line of a complex system has been put under fully automated control, developing (on top of the selected tool) a CM environment and related procedures capable of ensuring full control while avoiding any extra effort for the maintenance teams (actually contributing to improving the overall efficiency).
On the second aspect, the specifications of another system (in continuous evo-
lution due to changing and increasing user requirements) have been formalised
and are now under tool-supported control.
The results obtained on testing show some problems: at the beginning, more resistance was experienced on these aspects from the development teams, despite the reduced involvement requested of them (the preparation of test procedures was performed by dedicated resources), owing to the difficulty of showing the advantages of the approach.
The preparation phase was nevertheless successful, and allowed us to derive interesting lessons on how to extract and formalise the functional knowledge required to prepare good, effective functional tests.
Once an initial set of automated test procedures had been prepared, its exploitation suffered from the high level of change that the two systems are still experiencing from one release to the next; this has so far prevented a real deployment of automated testing on one of the two systems, while a partial automated testing approach is currently operational for the other.
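A minimal sketch of the kind of automated regression run described above (hypothetical code, not the project's actual tool chain): each release is checked against a recorded baseline, and a high rate of change between releases shows up directly as a long list of mismatches that someone must review and re-approve.

```python
# Hedged sketch of an automated regression runner (hypothetical, not the
# AUTOMA tool chain): each test case pairs an input with the output recorded
# from the previous release; every mismatch in the new release is reported
# rather than silently accepted.

def run_regression(system_under_test, baseline):
    """baseline: dict mapping input -> expected output from the last release."""
    failures = {}
    for test_input, expected in baseline.items():
        actual = system_under_test(test_input)
        if actual != expected:
            failures[test_input] = (expected, actual)
    return failures

# Release N behaviour, captured as the baseline:
baseline = {1: 2, 5: 10, 7: 14}

# Release N+1: behaviour changed for one input (doubling became tripling).
def release_n_plus_1(x):
    return x * 3 if x == 7 else x * 2

failures = run_regression(release_n_plus_1, baseline)
assert failures == {7: (14, 21)}
```

When the system changes heavily between releases, most "failures" are intended behaviour changes, and the baseline must be re-recorded each time, which is exactly the maintenance burden the paragraph above describes.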
Despite these difficulties, however, the need to formalise test procedures has injected a radical organisational change into the two maintenance teams, which now handle testing-related activities considerably better. This is demonstrated by the comparison of the process assessments conducted before and after the experiment.
DATAMAT INGEGNERIA DEI SISTEMI SPA
Autoqual deals with automating the quality system of SAET s.p.a., an Italian
SME, whose business is in electrical engineering, but with a growing part of soft-
ware activities.
To gain control of software production, SAET decided in 1995 to set up an ISO 9001 compliant quality system for its software department. The quality system is, to date, paper oriented and has a number of drawbacks: time wasted by both the staff and the quality manager on clerical tasks, and a negative influence on the attitude of staff towards the quality system.
Autoqual aims at automating the clerical tasks of the quality system (document search and retrieval, communication of documents, access to the quality manual, access to quality sheets and logs), by exploiting as much as possible the context of the organisation: a PC for each staff member, PCs connected by a LAN, and MS Office tools on each PC.
The PIE is relevant for any organisation having a quality system supporting the
production of software embedded in larger systems and trying to automate it.
While the experience in automation is generic, the experience gained in supporting
tools will be specific to tools running on networks of PCs.
The main lessons learnt from the PIE are:
• Automation of the quality system of a project - oriented company requires a
flexible and highly customisable tool.
• The customisation of the tool requires important resources, in the design, im-
plementation and put into service phase. The workflow analysis effort should
not be underestimated.
• With the prerequisites above, an automated quality system, well adapted to the
needs and structure of a company, is a powerful tool. In our case it allows the
project managers and the technical director to exploit quality records to have a
real-time overview of the state of projects.
• Before the PIE, with a paper quality system, quality records were not easily usable, and were actually not used except by the quality function.
• While the costs of setting up such an automated system are easy to compute, the benefits are difficult to quantify, but we believe they make the investment worthwhile.
• The automation has some drawbacks too: the automated system needs skilled technical roles to be maintained. If such people are not available inside the company, any modification becomes slow and costly.
The Autoqual project is funded by the European Community, under the Esprit
ESSI initiative.
SAET S.P.A.
Viale dell'Industria 14
35030 Rubano (PD)
Italy
The Experiment
The objective of this PIE proposal is to integrate into the development process of the Municipality of Kavala robust methods & tools for conducting the Acceptance Testing and Verification & Validation phases of distributed, client/server, transaction oriented applications connected with large databases of indexed images. These methods & supporting tools shall be employed both for in-house development and in the management of subcontractors, and shall cover functional and non-functional requirements of such systems.
The AVE project shall be an experiment in specifying & conducting acceptance tests & verification procedures for systems delivered by third parties following public Calls for Tenders. The experiment shall cover all related phases, from the Call for Tenders write-up ("setting the rules of the game") down to the actual acceptance testing and system verification. The experiment shall be conducted according to the guidelines defined by the Information Strategy Plan & Process Assessment study.
The baseline project (Human Resources Management) is considered a most critical application for the Prime User, and it encompasses a variety of technologies and characteristics (large database content in tabular and image forms, client/server architecture, distributed nature, GUIs, smart card integration etc.). Issues to be addressed are process sequencing, methodology support, the handling of functional & non-functional issues, data correctness & completeness and standardisation, as well as organisational and contracting issues.
nance costs, frequent operations disruption, it became quite clear that a robust, well defined process, adequately supported by mature methodologies and state of the art software engineering tools, is the key issue. As the bulk of the software to be acquired will be contracted to third parties, acceptance testing & verification is considered to be of principal importance. It should also be noted that the experience gained and the lessons learnt are also critical for a very large number of user organisations sharing similar needs and concerns. Therefore, it is believed that a successful project and a well planned dissemination component will result in a broad and deep impact on a whole class of IT users.
Municipality of Kavala
10, Cyprou str
Kavala 654 03, GREECE
pragmatic subset may be identified in order to monitor and track future improvements.
Chase Computer Services Ltd
83-85 Mansell Street
London E1 8AN
United Kingdom
The Experiment
The CITRATE PIE will introduce methods and tools for automated testing, in the
software development process of NOVABASE, in a stepwise and continuous way.
The total duration of the experiment will be 18 months, starting in March 1997.
The baseline project associated with the CITRATE PIE will be the develop-
ment and upgrade of the CSI software product line, in the areas of materials man-
agement and act management, which constitutes an integrated offer for health care
providers.
The testing effort, the number of errors and the maintenance costs will be compared throughout the experiment against the company's current metrics.
NOVABASE employs 80 people, 10 of them involved in the baseline project. The main objectives of the experiment are:
• To reduce by 50% the total number of errors found after product release, leading to a higher quality level and a considerable reduction of risk in the implementation phase.
• To reduce by 50% the effort needed for client support, transferring the available resources to other productive areas.
NOVABASE
Sistemas de Informação e Bases de Dados S.A.
Av. António José de Almeida SF, 6°
1000 Lisboa
Portugal
The Experiment
Context of the Experiment
The experiment shall be performed in the context of the European Photon Imaging Camera (EPIC) project, to be flown on the next XMM/ESA spacecraft. The software for the OnBoard Data Handling units is considered the Application Experiment.
10.18 CLEANAP 21465
Description of the Baseline Project
The software for the OnBoard Data Handling shall assure the telecommand and
telemetry link between the payload low-level controller and the spacecraft central
data handling. As the payload is intended for an operational life of at least 2 years
with extension up to 10 years, the requirement for high reliability is very impor-
tant for the software as well.
In compliance with PSS-05, the software process is governed by a Software Project Management Plan (SPMP), referring to a Software Quality Assurance Plan (SQAP), both approved by ESA. Accordingly, a waterfall process is set up through four main phases: Software Requirements phase (SR), Architectural Design phase (AD), Detailed Design and Production phase (DD), and Transfer phase (TR). Incremental deliveries are not explicitly stated, though several issues of EM, EQM, and FM models are planned. Different levels of testing are planned, but not by means of a separate validation team, although reviews are managed by Q.A. personnel separate from the development team.
Process Improvement Experiment Steps
The introduction of a Cleanroom process has many impacts on a traditional software development area, such as the engineering process, quality assurance methods and configuration management, which shall be set up as a rigorous, but not heavyweight, process tool.
Besides an initial assessment and a final evaluation of results, two main steps shall be performed in the experiment:
• Introduction of a Cleanroom process in the software development, where the software life cycle shall be designed for incremental development and accompanied by a suitable plan of quality assurance reviews, able to support the increasing complexity of the released software. Metrics shall be introduced to monitor progress.
• Cleanroom experimentation, in which software is incrementally developed starting from the most critical components, such as the kernel or operating system, in order to achieve early control of reliability trends and a monitored reliability growth from the software releases for the EQM up to the last release for the Flight Model (FM).
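The reliability-growth monitoring mentioned above can be illustrated with a small sketch (the figures are invented for illustration; the report does not give the project's actual metrics): record test hours and observed failures per incremental release, and check that the failure intensity keeps falling from release to release.

```python
# Hedged sketch of reliability-growth monitoring across incremental
# releases (illustrative data, hypothetical helper names). For each
# release we record (test hours, observed failures) and require the
# failure intensity (failures per test hour) to decrease monotonically.

def failure_intensity(test_hours, failures):
    return failures / test_hours

def reliability_is_growing(history):
    """history: list of (test_hours, failures) per release, oldest first."""
    intensities = [failure_intensity(h, f) for h, f in history]
    return all(later < earlier
               for earlier, later in zip(intensities, intensities[1:]))

# Example: early engineering model first, then EQM builds, then the
# Flight Model release.
history = [(100, 12), (150, 9), (200, 4), (250, 1)]
assert reliability_is_growing(history)

# A regression in the last release would be flagged:
assert not reliability_is_growing([(100, 12), (150, 9), (200, 20)])
```

In a real Cleanroom setting the intensity data would feed a statistical reliability-growth model rather than a simple monotonicity check, but the monitored quantity is the same.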
final product with reference to the level of safety of each project. Project managers shall be able to use data coming from other similar projects to make more accurate predictions of the resulting reliability level and the needed resources.
LABEN S.p.A.
S.S. Padana Superiore, 290
20090 VIMODRONE (MI)
Italy
The Experiment
The experiment will involve benchmarking manual tests in old projects. The process of testing will be defined and applied to baseline projects. This includes the use of metrics, the design of new test specifications, and the criteria for accepting software for testing. Test automation plays a key role in this experiment.
There are 66 staff in Credo Group; 27 in actual development.
Project Goals
By implementing software control management, the project manager can optimise the development process; in concrete terms:
• Cost reduction (10-15%)
• Elimination of errors in an early phase of the process (1 instead of 10)
• Quality improvement of delivered programmes
• Reliability increase of installed programmes
• And last but not least, acceleration of the final product delivery (about 10%)
Reaching these goals indirectly results in a better working atmosphere for programmers, analysts, project managers and management.
This experiment will also be part of the efforts TeSSA Software NV is making to produce a quality manual and obtain ISO 9000 certification (especially ISO 12207).
Work done
A quality manager was appointed and an internal baseline report was written to situate problems and costs. The global IT company strategy was defined, and the specific requirements of this PIE were defined exactly to fit this strategy. In the course of this PIE we had to change the global plan a few times. Looking for other existing models we found SPIRE (ESSI Project 21419), promoting CMM and BootCheck: two very interesting projects, giving a wider frame for the global plan.
The strategic choice between the different tools is part of this PIE, and the choice has been made:
• Version control system and configuration management: PVCS
• Test tool: SQA TeamTest
One employee was trained in PVCS, another in SQA TeamTest. Both products are installed, we received consultancy on both products, and a global session on test methods was given to everyone in the company. This was an important session for convincing everyone of the strategic choices.
In both domains the first procedures were implemented.
Results
At the end of the experiment, every employee agrees that the quality and reliability of the software development process have improved significantly. First figures give a global improvement of 5%. This is less than expected (7 to 10%), but we believe that the positive influence on productivity and reliability will become more and more visible over the next few years.
The confidence in this experiment certainly helps to create a better working atmosphere.
The responses of our customers prove their confidence in the strategy of our company: we are working hard on the improvement of our internal processes, and they see the first results of the new working methods.
Future Actions
Now that the procedures are consolidated and standardised to support the development cycle internally on the same LAN, the next step will be to extend them to also support external employees.
With the help of our internal Lotus Notes applications, the proceedings and procedures are now continuously disseminated internally.
At this moment we are still looking for opportunities to disseminate our knowledge externally.
TeSSA NV
Clara Snellingsstraat 29
2100 Deurne
Belgium
customers to deliver products to market faster, at reduced cost and ever increasing quality. An assessment was made of our ISO 9001 certified development processes, via our internal audit programme, to understand where key processes or tools required improvement in order to meet these objectives of efficiency and effectiveness. While compliant with international quality standards, there is always potential for improvement, and the following common process characteristics were assessed as candidates for a series of mini improvement experiments to be performed through the funding of the ESSI programme.
• Projects employed only a standard "V" life cycle and made no use of iterative or rapid development life cycles and supporting tools, where potentially these could efficiently reduce cycle time or improve the quality of the requirements capture process.
• Limited use was made of CASE tools to support systems analysis. This was restricted to classical database entity relationship analysis, with little potential for efficient forward engineering into subsequent project phases.
• Limited use was made of Object Oriented analysis and design methods or tools to support the existing Object Oriented implementation using conventional 3GL (C++) environments, compromising the quality and continuity of the analysis, design and implementation life cycle.
• Unit testing was performed by individual programmers but was neither formally specified to ensure rigour nor analysed for effectiveness with coverage tools, thus diluting the effect of a valuable early stage of verification and validation.
• System testing was a textual scripting process that was manually executed. With any repetition of the test cycle this process would become increasingly inefficient; by limiting repetition to save time, the potential effectiveness of this process for finding system errors was not sufficiently exploited.
• The Quality Management System was manually implemented, with a manual document control system that reduced the potential effectiveness of the quality practices and the reuse of existing documentation, because of inadequate and inefficient access to documentary and intellectual assets within the business.
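The point about analysing unit tests for effectiveness with coverage tools can be sketched as follows (illustrative Python standing in for a dedicated coverage tool; all names are hypothetical): a tracing hook records which lines of the unit under test actually execute, so a branch that a weak test never touches becomes visible.

```python
# Illustrative sketch (hypothetical names): measuring which lines of a
# unit are actually executed by its tests, using Python's tracing hook
# as a stand-in for a dedicated coverage tool.

import sys

def classify(x):
    if x < 0:
        return "negative"
    return "non-negative"

def lines_covered(test_callable, func):
    """Return the set of line numbers of `func` executed by the test."""
    target = func.__code__
    hit = set()
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is target:
            hit.add(frame.f_lineno)
        return tracer
    sys.settrace(tracer)
    try:
        test_callable()
    finally:
        sys.settrace(None)
    return hit

# A weak test exercises only one branch; a stronger test covers both:
weak = lines_covered(lambda: classify(5), classify)
strong = lines_covered(lambda: (classify(5), classify(-5)), classify)
assert weak < strong  # the weak test misses the negative branch
```

The comparison of the two sets is exactly the kind of effectiveness analysis the bullet above says was missing: without it, the untested branch stays invisible until it fails in a later phase.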
By applying a range of mini experiments to improve these processes it was in-
tended that the business goals of productivity and quality would be more readily
achieved.
The results of the experiments are as follows:
• Projects can now employ a RAD life cycle with 4GL tools, providing up to 709% productivity gains with an improved requirements capture and maintenance process.
• CASE tools are now available to support object oriented analysis, providing productivity gains of 85% over previous methods, with improvement in the quality of analysis due to formal analysis methods.
• Object Oriented analysis and design methods that show promise of high quality reusable components, developed to better meet requirements by the use of an iterative life cycle.
• Rigorous unit testing that has improved the process for assuring the quality of third party developed software on receipt.
• System testing techniques that have provided a 600% cost reduction by detecting errors earlier in the development life cycle, with the potential for further cost reduction and increased effectiveness through automation of test repetition.
• An on-line Quality Management System that has reduced asset management costs by 68% and increased the availability of quality procedures and reusable documentary assets across the company. A 0.4% increase in productivity per employee through better practice or document reuse would see the system cost paid back within one year.
From these results, the ESSI experiment has been a success and has contributed to achieving the overall business objectives of improved productivity and quality in our processes and products.
Fame Computers Ltd
Fame House,
Ashted Lock
Aston Science Park,
Birmingham,
UK
• compare, analyse and evaluate the alternative method against the previous approach
• develop common guidelines for future projects
• disseminate the experience
For this experiment, 17 functions of already developed air data computer software were taken to form the baseline. For details, please refer to Annex I.
The experiment started at the beginning of January '96, with a duration of 12 months, and consisted of 14 tasks. It was run under the European Software and Systems Initiative (ESSI) with funding by the European Commission.
In advance of the experiment, the following objectives had been defined:
• Automation of module integration test execution
• Efficient usage of test resources/equipment
• Independence from target hardware
• Reproducibility of each individual test
• Use of the experience in other/future projects
After completion and evaluation of the results, the project accomplishments can be
summarised as follows:
• module integration tests on a host computer system, available to multiple users
• integration tests without any target hardware environment
• automated test runs controlled by a test executive
• automated test protocol generation
• automated test result evaluation
• automated regression testing
• parallel testing by multiple engineers possible
• overall cost reduction for the complete S/W development process, achieved in this test phase
After analysing the results of the module integration tests with the new method, we can conclude that a reduction of integration test time by up to 35% is feasible. The envisaged goal could be met.
Nord-Micro
Victor-Slotosch-Strasse 20
D-60388 FRANKFURT
Germany
10.24 ENG-MEAS 21162
The Experiment
To achieve these objectives, we will evaluate the potential contribution of statistical techniques in each and every aspect of the development cycle, from the unit tests to the acceptance test. These techniques, combined with our test tools (DEVISOR and SYLVIE) and methods, are: code quality measurement (using M-Square), and statistical testing and software reliability modelling (using M-elopee).
The experiment will be performed on the embedded software of electronic equipment for commercial aircraft, developed by a five-person team over a two-year period. The main idea is to set up a test team, using new testing strategies, in addition to the project team.
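Statistical testing in the sense used above can be sketched as follows (a simplified illustration; the DEVISOR/SYLVIE tools and the M-elopee reliability models are not reproduced here): test inputs are drawn at random according to an operational profile, so the observed failure rate estimates the failure rate users would see in operation.

```python
# Sketch of statistical (usage-profile-based) testing. The unit under
# test, the oracle, and the profile are all invented for illustration.

import random

def statistical_test(unit_under_test, oracle, profile, n, seed=0):
    """profile: list of (input_generator, weight) pairs; returns failure rate."""
    rng = random.Random(seed)
    generators, weights = zip(*profile)
    failures = 0
    for _ in range(n):
        gen = rng.choices(generators, weights=weights)[0]
        x = gen(rng)
        if unit_under_test(x) != oracle(x):
            failures += 1
    return failures / n

# Unit under test with a defect on negative inputs; the oracle is the spec.
def buggy_abs(x):
    return x          # wrong for x < 0
def spec_abs(x):
    return abs(x)

# Operational profile: negative inputs occur in about 10% of operational use.
profile = [(lambda r: r.randint(0, 100), 0.9),
           (lambda r: r.randint(-100, -1), 0.1)]
rate = statistical_test(buggy_abs, spec_abs, profile, n=1000)
assert 0.03 < rate < 0.2  # failure rate tracks the profile weight (~10%)
```

The same failure counts, plotted over test time, are what a reliability-growth model consumes to predict the operational failure rate of a release.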
them measurable targets and direct feedback on the results of their work. Furthermore, it will also give concrete elements to decision-makers.
If these techniques are demonstrated to be powerful, they will be transferred to our software development process at large.
DASSAULT ELECTRONIQUE
55 Quai Marcel Dassault
92214 SAINT-CLOUD
France
Procedimientos-Uno S.L.
Juan López Peñalver, P.T.A.
29590 Málaga
Spain
Main Conclusions
From a methodological point of view, an object-oriented CASE tool builds a connection between the design and implementation phases of the project.
The personnel in Banking and Financial Systems have become aware of component libraries. This is a remarkable point, from both the business and the technical point of view, in the longer run. When we produce new components, it is crucial for their further use that they are very well tested.
TT Tieto Oy
Kutojantie 10
Espoo Finland
SimCorp A/S
Kompagnistræde 20-22
1208 Copenhagen
Denmark
The Experiment
We will first expand our know-how about formal test specifications, test methods, and the methodology of GUI testing, and ensure that this know-how is state-of-the-art. On this basis we will select the most promising methods for the GUI test and demonstrate these within the company.
We will select a commercially available tool which allows us to apply the test methods selected in Action 1 to the baseline project. We will procure the selected test tool and introduce it to the employees involved in the PIE and the baseline project.
We will evaluate these new testing methods and test tool in the baseline project.
For this purpose, during the test phase of the baseline project, we will conduct
tests both according to our traditional manual methods and utilising the new, semi-
automated or automated methods.
The old and the new methods will be compared. The criteria for this compari-
son will be established at the beginning of the PIE. If this analysis shows the supe-
riority of the new methods, we will begin using these throughout the company at
the conclusion of the PIE.
• A test notation will be selected and put into use which permits the test specifications to be transformed into the syntax required by the test tools with a minimum amount of manpower.
• Templates will be available for the efficient specification and notation of the
tests. If necessary, templates will also be available for the specification of the
corresponding system requirements.
• GUI tests will be executable and recordable almost entirely automatically, by means of tools.
• The test documentation will be generated by the testing tool.
• An appropriate test tool will be available and have been tested in routine use.
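The transformation from test notation to tool syntax described above can be sketched like this (both the notation and the driver commands are invented for illustration; the actual tool and syntax were selected during the PIE): a compact, human-written specification is translated mechanically into the commands a capture/replay test tool would execute.

```python
# Hedged sketch: translating a simple declarative GUI test notation into
# executable driver commands. Notation, widget names, and command names
# are all hypothetical.

def transform(spec):
    """Translate 'widget action [value]' lines into driver commands."""
    commands = []
    for line in spec.strip().splitlines():
        parts = line.split(maxsplit=2)
        widget, action = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else None
        if action == "type":
            commands.append(("set_text", widget, value))
        elif action == "click":
            commands.append(("press", widget))
        elif action == "expect":
            commands.append(("verify_text", widget, value))
        else:
            raise ValueError(f"unknown action: {action}")
    return commands

spec = """
name_field type Jane
ok_button click
status_bar expect saved
"""
assert transform(spec) == [
    ("set_text", "name_field", "Jane"),
    ("press", "ok_button"),
    ("verify_text", "status_bar", "saved"),
]
```

Keeping the specification in the compact notation, and generating the tool syntax from it, is what allows the tests to be rewritten for a different tool "with a minimum amount of manpower".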
IMBDS GmbH
Kleinseebacher Str 9
91096 Moehrendorf
Germany
The Experiment
The intent of this project is to define a set of document standards, together with the definition of clear rules and roles involved in document flow management. Moreover, we intend to experiment with Verification and Validation activities to be executed on the outputs of each phase. Available environments, such as Lotus Notes for document management and a metrication tool such as Metrication (by SPC) for the collection and analysis of metrics (also derived from V&V activities), will be tried out in the PIE project.
The major activities in the experiment will be: PIE management, PIE qualification and monitoring; set-up of the IDEA experiment, inclusive of training, definition of a methodology defining the document and V&V processes, and software tools selection
The Experiment
The development process shall be significantly improved by focusing on two key
areas of SPI: testing and configuration & change management. Thus the PIE will
deal with the following:
• Introducing configuration & change management techniques and tools,
• Introducing systematic testing methods and procedures, supported by suitable
tools.
The PIE will concentrate on that part of the software that is already reused many times under different configurations. This part would benefit most and has the most significance for a successful evaluation of techniques and tools. It will be referred to as the baseline project.
include central and local government departments, leading banks and large indus-
trial groups.
The baseline project was a CASE tools development project, to which significant resources are assigned each year in different geographical sites, and in which several innovative technologies are used.
The PIE is now completed and can be considered successful from several points of
view:
• the approach followed in the experiment is valid, the adoption of SPICE and
ami has been effective and the two methods appear to be complementary;
• the improvement actions defined and executed in the areas of Project Management, Testing and Configuration Management produced progress in the baseline project development process, as shown by the specific indicators and by the process assessment performed at the end of the PIE applying the prospective SPICE standard;
• both the approach and some of the solutions within the improvement actions can be generalised and reused in a more general context within Finsiel and the IT community; indeed, a new improvement plan is being defined within a different Business Unit of Finsiel.
Finsiel S.p.A
v. Matteucci (ang. v. Malagoli)
56100 PISA
ITALY
In order to reduce the cost associated with the production and maintenance of each new system version and variant, we need to improve the efficiency and effectiveness of our software process by the use of an automated and comprehensive test environment. An external review of our processes has recommended the adoption of an integrated test management system as an essential step in achieving our quality goals.
The Experiment
This 12-month PIE will enable us to complete the final selection of a suitable test tool set, its integration into the environment and its trial use in a typical development project. The baseline project is planned to be part of a release of MAGIC Version 8 that will include expanded capability in handling of Web connectivity.
In the PIE, the DB Gateway part of the product will be re-tested using new auto-
matic procedures and compared to the current manual method.
In addition to the work to be done by Magic Software Enterprises itself, KPA Ltd, an Israeli consultancy specialising in Software Process Improvement, will assist, particularly in re-engineering part of the test process. Consultants from the chosen tool supplier will also be used to integrate the test tools into the experimental environment.
that small teams are highly interactive and multifunctional, so that conventional approaches used in large firms do not work.
The new team model is based on 'Programming by Contract', in which the single modules are given to sub- and sub-subcontractors. The concept is that each piece of functionality is programmed as a separate task with a small and well-defined interface to the rest of the system (an object-oriented philosophy). The implementation inside the module matters less, provided the programmer follows the rules of quality assurance. This idea follows the strategy of cohesion, coupling and information hiding. In this way we hope to reduce the amount of software corrections and the cost of each bug and alteration. Parallel work on the project is also only possible with a concept like this, because each group can simulate the interface of a task that is being developed by another group.
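The contract idea above can be sketched in code: each task sits behind a small, well-defined interface, so one group can develop against a simulated (stub) implementation while another group builds the real module. All class and function names here are hypothetical illustrations, not taken from the report.

```python
# Minimal sketch of 'Programming by Contract' with interface simulation.
# Names (PositioningTask, move_to, ...) are invented for illustration.
from abc import ABC, abstractmethod

class PositioningTask(ABC):
    """The contract: the only thing the rest of the system may depend on."""
    @abstractmethod
    def move_to(self, x_mm: float, y_mm: float) -> bool:
        """Return True once the target position is reached."""

class StubPositioning(PositioningTask):
    """Simulated interface, letting other groups work in parallel."""
    def move_to(self, x_mm, y_mm):
        return True  # always succeeds; good enough for early integration

def calibration_sequence(drive: PositioningTask) -> bool:
    # Client code depends only on the contract, not on the implementation.
    return all(drive.move_to(x, 0.0) for x in (0.0, 10.0, 20.0))

print(calibration_sequence(StubPositioning()))  # True
```

When the real module is ready, it replaces the stub without any change to client code, which is exactly what makes the parallel working described above possible.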
In November 1995 our team, with its Life-Cycle and Team Model, was ISO 9001 certified.
FESTO Gesellschaft m.b.H.
Luetzowgasse 14
AUSTRIA
The Experiment
The experiment will consist of the following actions:
• Market survey, selection and acquisition of a software testing tool
• Installation of a measurement method
• Set-up of the test environment
• Experiment with change management of the testing environment in the light of
software evolutions (releases)
10.38 PCFM 23743
The Experiment
The technical objectives of the experiment are to integrate design and V&V by
formalising the definitions and terms used in the design and to enforce the neces-
sary constructs/constraints to ensure that these formal definitions and terms will
always be correct in the code.
The experiment starts by producing formal definitions for commonly used defi-
nitions and terms within our projects. These are then used to specify and prove
part of the baseline design in parallel with the baseline project design and V&V.
The results from both can then be directly compared and the benefits quantified.
The experiment is resourced from the project teams to ensure that the process is
practical and acceptable to the ultimate users.
The baseline project will be a real time, safety critical control system for an
aerospace application.
Lucas Aerospace, York Road, design, develop and manufacture real time,
safety critical control systems. The site employs 880 people, of whom 130 are involved with engineering software.
embedded real-time software follow the same pattern as other types of software
reported by Boris Beizer.
We have also found that the largest cause of reported bugs (36%) is directly related to requirements, or can be derived from problems with requirements. Improved tracking of requirements through the development process has been achieved through the introduction of a life-cycle management CASE tool. Unfortunately the customisation of the life-cycle management tool took longer than expected, so no actual numbers on the positive effect of the tool are available at present, but it is expected that the integration and system testing phases can be combined, resulting in a major reduction of testing effort.
The second largest cause of bugs (22%) stems from lack of systematic unit test-
ing, allegedly because of the lack of tools for an embedded environment. We have
found that tools do exist to assist this activity, but their application requires some
customisation. We have introduced a unit testing environment based on EPROM
emulators enabling the use of symbolic debuggers and test coverage tools for
systematic unit testing. The unit testing methods employed were static and dynamic analysis.
We have demonstrated that the number of bugs that can be found by static and
dynamic analysis is quite large, even in code that has been released. The results
we have found are applicable to the software community in general, not only to
embedded real-time software, because the methods and tools are generally avail-
able. Finally a cost/benefit analysis of our results with static and dynamic analysis
indicates that there could be an immediate payback on tools and training already
on the first project.
The efficiency of static analysis in finding bugs was very high (only 1.6 hours/bug). Dynamic analysis was found to be less efficient (9.2 hours/bug), but still represented a significant improvement over finding bugs after release (14 hours/bug). We achieved a test coverage (branch coverage) for all units in the product of 85%, which is considered best practice for most software, e.g. non-safety-critical software.
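The cost/benefit argument can be checked with simple arithmetic: the effort-per-bug figures come from the text above, and the saving relative to post-release bug fixing is just the difference.

```python
# Effort per bug (hours) as quoted in the report's results.
effort_per_bug = {
    "static analysis": 1.6,
    "dynamic analysis": 9.2,
    "after release": 14.0,
}

def saving_per_bug(method: str) -> float:
    """Hours saved per bug versus finding the same bug after release."""
    return effort_per_bug["after release"] - effort_per_bug[method]

for method in ("static analysis", "dynamic analysis"):
    print(f"{method}: {saving_per_bug(method):.1f} hours saved per bug")
```

A saving of over 12 hours per bug for static analysis is what makes the report's claim of an immediate payback on tools and training plausible.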
The PET project has been funded by the Commission of the European Commu-
nities (CEC) as an Application Experiment under the ESSI programme: European
System and Software Initiative.
Mr. Otto Vinter
Brueel & Kjaer Measurements A/S
Skodsborgvej 307,
Denmark
10 Summaries of PIE Reports
10.40 PI3 21199
For the future the company is aiming at continuing Process Improvement activi-
ties, especially in the following directions:
• full deployment of the enhanced practices to the daily routine work of all pro-
jects;
• definition of life cycle and methodologies/tools for Rapid Application Development;
• completion of the Quality Management System for the sake of ISO 9001 registration.
The PI3 Project was run under the auspices of the CEC DG III within the scope of the ESSI Initiative of the ESPRIT Fourth Framework. This support has proven to be extremely important in ensuring the overall success of the initiative, which lies in the mainstream of the company's core business.
ONION
Via L. Gussalli 11
25131 BRESCIA
ITALY
http://net.onion.it/
The Experiment
We will introduce SQA test tools and provide training and consulting to the employees involved with the base project. People from QA in particular will be involved, since they will be the key to reducing maintenance costs.
Special attention will be given to the testing method, as this is the only guarantee that we will use the tool as efficiently as possible.
LGTsoft currently employs 9 people, 8 of whom are involved in software development.
The Experiment
The experiment will investigate patterns of errors that commonly occur in the
development process and will define and implement techniques to prevent these
errors from occurring. Success in this area will reduce the amount of re-work
caused by errors and shorten the development time.
10.43 PROVE 21417
The PIE will define and implement defect prevention methods for each phase of the development life cycle and will arrive at a strategy for determining the amount of testing needed in a defect prevention environment.
The Experiment
The implementation of a wider and deeper testing approach is meant to help us
achieve the objectives stated above. There are three areas for improving the cur-
rent test process which are reflected in the three experiments of the QUALITAS
project:
I. Enhanced document testing by running intensified reviews on early development results, i.e. functional specifications and design documents.
II. Systematic functional testing and test-case determination is to be used to en-
sure that the functional requirement specification document is complete, that there
are no versioning problems, and that implemented functions work as intended.
III. Automated installation and system integration testing is to be performed for
each delivery to a customer site. Each delivered product is to come with an auto-
mated installation test suite guaranteeing that the product works properly in the
customer's software and hardware environment.
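Experiment III above ships an automated installation test suite with each delivery. A minimal sketch of that idea, with purely hypothetical checks (the real CORONA CS suite is not described in the text), might look like this:

```python
# Hypothetical automated installation test: each delivery runs a small
# suite verifying the product can work in the customer's environment.
import os
import sys

def check_runtime_version(minimum=(3, 8)) -> bool:
    """Is the installed runtime recent enough?"""
    return sys.version_info[:2] >= minimum

def check_install_dir_writable(path=".") -> bool:
    """Can the product write to its installation directory?"""
    return os.access(path, os.W_OK)

checks = {
    "runtime version": check_runtime_version,
    "install dir writable": check_install_dir_writable,
}

failures = [name for name, check in checks.items() if not check()]
print("installation OK" if not failures else f"failed checks: {failures}")
```

The point of such a suite is that it runs unattended at the customer site, so environment problems surface at delivery time rather than in production.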
The experiments will be conducted in the context of the realisation of a new
version of CORONA CS, a client/server solution for the automatic reconciliation
of accounts.
Management Data Vienna has 132 employees, with about 50% of them being
employed in the software development unit. A staff of 15-20 is involved in the
CORONA CS team.
The Experiment
The experiment will use TTCN to introduce a formal notation for specifying ab-
stract system test cases which are then automatically transformed into executable
programs. These may be used and reused at system test but also during develop-
ment and maintenance. Thus the experiment consists of two steps.
The first step is the system testing of a product feature with TTCN, by means of the specification, translation and execution of TTCN test suites.
The second step is the integration of a TTCN test system in the development
environment, and the use of that system together with the test suites arising from
the first step.
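The workflow described above, in which abstract test cases are mechanically transformed into executable tests, can be modelled generically. Note that this sketch is not TTCN notation; the event names and the stand-in system under test are invented for illustration only.

```python
# Generic model of "abstract test case -> executable test" (NOT TTCN
# syntax; events and the system under test are hypothetical).

# An abstract test case: a sequence of (stimulus, expected response) pairs.
abstract_case = [
    ("SETUP", "SETUP_ACK"),
    ("DATA", "DATA_ACK"),
    ("RELEASE", "RELEASE_ACK"),
]

def echo_ack_system(stimulus: str) -> str:
    """Stand-in for the system under test: acknowledges every stimulus."""
    return stimulus + "_ACK"

def execute(case, system) -> str:
    # The 'translation' step: interpret each abstract event against the SUT.
    for stimulus, expected in case:
        if system(stimulus) != expected:
            return "FAIL"
    return "PASS"

print(execute(abstract_case, echo_ack_system))  # PASS
```

Because the abstract case is data rather than code, the same suite can be re-executed at system test, during development and in maintenance, which is the reuse benefit the experiment aims at.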
The baseline project will be the comparable test tasks of the private communication network division, performed with the presently prevailing development and test methods. The software department employs approx. 120 employees, 80 of them involved in the release development of the communication network which includes the baseline project.
The main lessons learned from this experiment can be summarised as follows:
• it is better to take into account the use of these tools at a very early stage of
system design as they could require architectural and/or implementation peculi-
arities;
• automatic test procedures can be more exhaustive and can produce better coverage than the traditional approach. This is paid for with extra effort in their preparation, although capture-and-playback features can ease the MMI test coding. An economic return is mainly possible for systems which undergo many changes during their operational life and for which non-regression testing is a significant cost;
• the quality and robustness of the software produced are better assured, as deeper and more intensive tests can easily be implemented and run at low cost;
• the repeatability of the tests is truly guaranteed with an automated tool, as there is no way to misunderstand or only partially perform a test step, as can happen during manual test execution;
10.48 SMUIT 21612
• if properly planned, the testing tool can do much more than just test the system: it can effectively be used to encapsulate the application under development, reproducing the context environment and facilitating integration.
As a result of the experiment, our Company is evaluating the use of the same approach for a new development in a similar application; the quantitative data collected during the experiment form the basis for this decision.
Dataspazio
Telespazio e Datamat per l'Ingegneria dei Sistemi S.p.A.
• You need sufficient theoretical and technical know-how to apply code analysis
systematically.
• To get all benefits from code analysis it has to be integrated in the organisa-
tion's software process model.
• Code analysis must not be used to measure the capability of individual developers.
• The results of code analysis serve as an input to support management decision
processes.
• Code analysis will only be accepted and applied by developers if they get benefits from it.
The know-how gained during the PIE is "stored" by means of a "test office" which was established and is working at ABB Netzleittechnik. The main reason behind the concept of a test office was that the core competence and knowledge needed to systematically perform our testing and analysis procedures should be concentrated in one place. The test office offers method and tool know-how, as well as practical support for all testing activities, to all S.P.I.D.E.R. development teams.
The software process improvement project reported here was funded as a so
called Process Improvement Experiment (PIE) by the ESSI program of the Euro-
pean Community (ESSI stands for European System and Software Initiative).
ABB Netzleittechnik GmbH
Postfach 1140
D-68520 Ladenburg
Germany
The Experiment
ETRA's products typically have a long life cycle, of the order of five to ten years. This fact, together with their complexity, makes Maintainability, Errors Management (EM), Tests, Configuration Management (CM) and Requirements Management (RM) crucial issues. Up to now, none of these issues has been satisfactorily tackled.
The pilot application to be used in SPIDER as the Baseline Project (BP) to carry out the PIE will be the kernel of ETRA's Traffic Control System.
The User Requirements, Analysis, Design and Coding of the BP will be revis-
ited, and the processes and documentation standards defined in MACRO will be
applied.
The testing, installation and maintenance phases will be formalised and the corresponding procedures defined, so that they are implemented, in co-ordination with the user of the system, within the framework of the BP.
In parallel with the above, the EM, RM and CM processes will be defined and implemented. Special attention will be paid to measuring cost effectiveness and the level of error reduction.
ETRA employs 70 people, 25 of whom are involved in the development unit.
ETRA SA
Tres Forques, 147
46014 Valencia
Spain
The Experiment
The experiment will include selecting tools and methodologies for configuration
management and testing. The tools and methodologies will be used by the baseline
project for 9 months. During this time data will be gathered and the results of the
experiment will be evaluated according to this data. Based on the evaluation,
changes will be introduced (if necessary) and a plan for assimilation will be de-
veloped.
The experiment will be performed at our office on a client/server project of about 15 man-years. Onyx Technologies employs 90 full-time employees, 70 of them involved in software engineering, 6 of them as part of the baseline project.
The Experiment
• definition of procedures for systematic tool-based software testing
• building up a database of test cases
• training of staff in test methods and tools
• comparison of the new test method with the existing test procedure
• detection of a higher proportion of defects before shipment to customers
TECHNODATA
INFORMATIONSTECHNIK GmbH
Postfach 1346
71266 Renningen
Germany
and evaluate the applicability of the STUT method on an industrial scale. In the project, the STUT method was adapted, in its applicable parts, to the baseline test process.
In the STUT method a new concept, the Function Usage Model (FUM), is used. It has proved to be a powerful way to understand and describe the functionality and to use it as a basis for test case specification. Usage modelling has obvious benefits when the function can be described by a black-box interface and when a graphical view of the function helps in learning the functionality. Results show that the FUM must be a compromise between real usage and the test requirements of Function Test. Also, a good STUT tool is essential in order to industrialise the STUT method properly. When the STUT method was adapted to our baseline test process, it was noticed that the role of reliability estimates was not as great as had been supposed earlier.
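A usage model of this kind can be read as a graph of user-visible states with transition probabilities, from which test cases are drawn as walks through the model. The sketch below is an assumption-laden illustration of that general idea; it does not reproduce the actual FUM notation of the STUT method, and the states and probabilities are invented.

```python
# Illustrative statistical usage model (states/probabilities invented;
# this is NOT the real FUM notation of the STUT method).
import random

usage_model = {
    "idle": [("dial", 0.9), ("idle", 0.1)],
    "dial": [("talk", 0.7), ("idle", 0.3)],
    "talk": [("idle", 1.0)],
}

def generate_test_case(model, start="idle", steps=5, seed=0):
    """Draw one weighted random walk; each walk is a candidate test case."""
    rng, state, path = random.Random(seed), start, [start]
    for _ in range(steps):
        states, weights = zip(*model[state])
        state = rng.choices(states, weights=weights)[0]
        path.append(state)
    return path

print(generate_test_case(usage_model))
```

Generating test cases in proportion to expected usage is what lets statistical usage testing concentrate effort on the behaviour customers will actually exercise.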
The STUT-IU project has proved that the STUT method is technically applicable to the baseline project. Evaluation results indicate that slightly better quality is obtained with the STUT method than with the baseline test method. They also show that it is beneficial to apply the STUT method to the network and traffic function classes in the short term. For these function classes productivity has increased a little. As the figures in the productivity calculations are based on rather little data, they should only be interpreted as indications, not as statistically strong conclusions.
In the next baseline project the STUT method will be applied to the network function class, and its applicability will be studied in more detail for the traffic function class. A new STUT tool that fulfils the required criteria will be taken into use. In the long term, essential improvements should be made to test execution support, so that wider application of the STUT method would be profitable.
Oy LM Ericsson Ab
FIN-02420 Jorvas
FINLAND
testing; increasing the number of defects found and reducing the time taken to fix them by providing better test feedback.
The Experiment
The achievement of the above objectives requires the improvement of the software testing phase by performing the following tasks:
1. Perform a "Health Check" on the existing Automatic Test System and update it.
2. Measure software complexity and feed into test creation activity.
3. Compare manual and automatic testing on the baseline project.
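Task 2 feeds a complexity measure into test creation. One common choice, an assumption here since the report does not name the metric, is McCabe's cyclomatic complexity, which also gives the number of basis paths a test set should cover:

```python
# Cyclomatic complexity from a decision-point count (McCabe's metric,
# assumed here as the complexity measure; the unit names are invented).
def cyclomatic_complexity(decision_points: int) -> int:
    """V(G) = decisions + 1 for a single-entry, single-exit routine."""
    return decision_points + 1

# Allocate more test-creation effort to the more complex units.
units = {"parser": 12, "logger": 2, "scheduler": 7}
for name, decisions in sorted(units.items(), key=lambda kv: -kv[1]):
    print(f"{name}: at least {cyclomatic_complexity(decisions)} basis-path tests")
```

Ranking units this way turns the raw complexity measurement into a concrete test-creation priority list, which is the "feed into test creation activity" step.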
This SWAT PIE involves automating the test phase of a separate baseline software development project (typically a future release of Central Unit software). The baseline project will be selected on the basis of its suitability for use in the SWAT PIE. This baseline software requires a team of around 6 engineers over an extended period, and involves Alpha and Beta Test phases. A 50% reduction in the length of these phases and an increase in the number of defects found are SWAT PIE aims. TSc employs around 470 people, of whom 45 are involved in software development.
tory of all the test data records. A total of 40 data items were selected to cover test plan, test case information, test execution data and test errors.
• Execution of two pilots, with 450 test cases in electronic, reusable form, 170 test cases recorded and re-executable, test errors stored, and all test data usable for statistics and improvement.
• Identification, experimentation and validation of a quality profile (a set of metrics) for the Business Management System (BMS) product, following the AMI approach and ISO/IEC 9126.
• Acquisition of advanced skills in testing tools and methodologies, in the SPICE assessment model and in process/product metrics (AMI approach).
• Introduction of a workgroup organisation for testing management activities.
• Specific internal and external dissemination activities were implemented to give wide visibility to the experiment, including the availability of a web page (http://max.tno.it/esssiteprim/teprim.htm).
The results achieved were considered very positive, and the use of the new testing environment was extended to other software product development projects. Plans are in place for Company-wide implementation of TEPRIM and for further dissemination activities at both internal and external levels.
The TEPRIM project has been funded by the Commission of the European Communities (CEC) as a Process Improvement Experiment under the ESSI programme.
IBM SEMEA SUD s.r.l.
The Experiment
The aim of the proposed Process Improvement Experiment (PIE) is to improve the software testing phase at the component and the integration level. The experiment especially focuses on the relation between testing and requirements coverage, which is central to satisfying our customers' needs. This experiment will address the following aspects of the development process:
1. Software testing during the various testing phases, to obtain satisfactory coverage within project time and budget constraints, using the Logiscope tool (or similar).
2. Management of system requirements allocated to software, using the RTM tool (or similar).
3. Management of requirements changes using a data base application developed
at IAI (RCR). This will be used to control requirements changes: description of
desired change, cost estimation of the change, impact and risk assessment of
the change and recording of resolution. The output of this change control proc-
ess will drive the changes to the base line requirements data base in RTM.
The baseline project is an advanced avionics system enhancement including hardware modifications as well as extensive software development of its five main components. The development effort is presently estimated at over 20 man-years. This baseline project is typical for IAI and reflects the commercial market trend of our products, which demands high quality, increasingly complex functionality and short development times.
The Experiment
The objective above will be achieved by developing appropriate testing procedures; selecting the most suitable tools to help developers during the testing phase, both in the testing itself and in test management and accounting; refining the method through its application to the pilot project; and disseminating the method both internally and externally. The baseline project will be the SRS I and II projects. These are parts I and II of a technical decision-support system running on UNIX workstations which are connected on-line with a SCADA (Supervisory Control And Data Acquisition) system controlling a large electricity transportation network. SRS-I is a 10 man-year system which was completed during the first quarter of 1995, while SRS-II, the second part of SRS, is a 4 man-year project scheduled to last from July 95 through November 96. Most of the data from
project SRS-I are already available, such as effort spent in testing and bugs reported, among others. Since SRS-II includes new functionality plus the deployment of the system at a new customer, real quality data such as bugs found and number of complaints will be available for use in this PIE, and the improvement in the testing phases of SRS-II due to the PIE can easily be compared with earlier data from SRS-I.
and an improved commercial image for the developer. Also, since much of the repetitive testing is expected to become automated, the maintenance phase of products is expected to be cheaper and of better quality.
LABEIN
Parque Tecnologico Ed 101
48170 Zamudio (Bizkaia)
Spain
• A library for the automated generation of databases has also been produced.
• A library of generic instrumentation drivers has been generated, and an importation tool for incorporating drivers of specific instruments from different vendors, developed according to the emerging industry standard VXI p&p, has also been developed.
• A library for the automation of measurement software code generation in C++ has been delivered.
• The automation of the integration process for all the required modules into the Test Engine Core (baseline project) to generate, without user intervention, the final run-time test application software.
• The encapsulation and integration of TestLib within the standardised, commercially available CASE tool HP SoftBench, widely used in Unix (Sun, HP) software development environments.
Due to the technical success of this project, the contractor has decided to invest in further software development to create a commercial product based on the proven technology used by TestLib. The product is planned for release into the test & measurement automation market before the end of next year.
Integracion y Sistemas de Medida, SA
C/Esquilo, 1
28230 LAS ROZAS (Madrid)
Spain
The Experiment
In order to achieve this objective, requirements traceability will be exploited as the means to directly and unambiguously relate subsets of testing and validation sequences to specific subsets of requirements. The goal is to keep track of what should be tested, and how, when requirements change or when fixes to faults in one product Release/Variant have to be propagated to all the other relevant active Releases/Variants.
Participants
• Agusta - Un'Azienda Finmeccanica SpA: helicopter manufacturer, in charge of the development of both mechanical and avionics
• TXT Ingegneria Informatica: software house with large experience in software engineering practice and product development, involved as external assistance provider.
Alcatel SEL AG
Lorenzstraße 10
70435 Stuttgart
Germany
ported by Rational's TestMate. Data from the baseline project, including the integration test phase, has been used to determine the cost effectiveness of the different review and test activities. Analysis has also been undertaken to determine which metrics best indicate where verification effort should be concentrated.
Results indicate that:
• Fagan inspections are two and a half times more effective at finding major
defects than code reviews.
• Static analysis can find 26% more major defects than code reviews. Further-
more only 5% of the defects detected through static analysis were also detected
by code review.
• The benefits of TestMate for unit test case generation are limited within the
baseline project.
• The people are important to the effectiveness of the process.
The conclusion of the VERA experiment is that the most cost effective error
detection process for real-time software is static analysis coupled with Fagan in-
spections.
This project was funded by ESSI under the PIE programme and will be of in-
terest to the radar control and processing software, real-time software, reliability
and embedded Ada software communities.
GEC-Marconi Radar and Defence Systems, Radar Division
Glebe Rd
Chelmsford, Essex CM1 1QW
UK
quirements. Engineers were trained in using the tools and a plan for introduction of the tools in a running project, the base-line, was formulated. Furthermore the role of testing in our CMM-based software process model was defined as a separate chapter in our system handbook.
Development activities in the software packages originally to be used in the
base-line project were postponed due to changed market conditions for nuclear
geometry viewing front-ends and back-ends. Therefore, another project was cho-
sen to serve as a base-line. Adjusting the planning for VISTA to the planning for
this project introduced a delay with respect to the original VISTA-time schedule.
An initial inventory of the testing process and error statistics in the base-line showed that internal pre-delivery testing was undervalued. With the aid of tool consultants, test sets and test scripts were built according to the test specification. Structured validation was introduced in version deliveries in the baseline project.
Techniques developed in our baseline project have been used in structuring tests in another project concerned with the analysis of large time-series of measurements. This application was less complex in terms of underlying hardware architecture and allowed us to demonstrate the value of the new techniques much more quickly.
Test tools are now in common use at large software development firms. At some firms, software test engineers outnumber software engineers. Introducing test tools at the scale of our organisation (18 software developers) brings its own problems. The complete set of tool functionality is large, and the view of the application-under-test differs significantly from the software engineer's view. Given the size of our software engineering group, this forced us to make careful choices and not be too ambitious in adopting tool functionality. As a last activity in work package II, deliveries of releases of the base-line project software are now made after automated regression testing using the tools.
The results of our work package III, evaluation, indeed indicate the value of structuring and automating the testing process. Furthermore, the tools seem very promising for managing defect repair and for analysing the delivery process. In our organisation the qualitative evaluation and quantification of our experiences, and further dissemination, have been performed. The value of the new techniques for a number of typical application types was determined; an application-potential matrix is given per application type. Our experience is that the size and scope of projects suited to setting up a test organisation are confined to those where application-domain knowledge can be shared across two teams. For all application types, structuring the test process is useful; for large model-based client/server systems with large GUIs, and for user-intensive data-driven applications, automation of test execution and management is most appropriate.
Looking at the system development process, testing automation should be set up apart from the software design. Testing and design engineers should consider projects from different perspectives. Quantitatively, the amount of data collected is too small to justify conclusions about the business advantage. Qualitative information, as gathered from the engineers participating in the project, is very positive.
This report is an extension of the MTR of July 1998 [1]. Emphasis has been laid in this report on the evaluation phase of the project and on an assessment of the usability of the new procedures and techniques for software development groups in R&D organisations comparable to ECN.
A first international dissemination of the results was given as part of an EPIC workshop (Eindhoven, 28 April 1998). At EuroSTAR 1999 a final presentation of the project will be given. A paper describing our experiences [2] to the software engineering community is in preparation.
ECN
P.O. box 1
1755 ZG Petten
The Netherlands
procedures at module and integration level, with a reviewing method for each development product, from requirements to delivery; assure a committed level of quality in the product delivery by applying statistical control of errors related to the approved requirements.
Establishing a cost-benefit analysis through a controlled experiment will give us the objective information needed to validate the procedures, methods and tools defined to carry out a complete Testing process in a baseline project. With all other practices maintained as they are today during the experiment, we will be able to measure the effect of the Testing process on the key features of quality, schedule and cost of the resulting software product.
For the PIE measurements we have defined internal quality, committed quality and testing cost in the following way, according to the main objective of the proposal:
• Internal Quality will be measured as the ratio of errors discovered in the acceptance test to the errors reported and removed in the software test phase. The current measure that will be used for reference is 25% to 40%, with a figure of 37% measured in our last project as the control reference for the pilot project.
• Committed Quality will be measured as the ratio of errors discovered in our standard six-month guarantee period to the errors reported in the acceptance test. The current measure that will be used for reference is 15% to 75%.
• Testing Cost will be measured as the actual time used in testing activities (planning and execution), as a ratio of the total development cost. In our last project, used as a control, we recorded a reference cost of 27% (9 man-months) of a total cost of 33 man-months.
The proposed target is a radical reduction in the number of errors delivered to the customer. To that end, two complementary actions have been proposed: to improve the module and integration test phase, and to perform an acceptance test phase able to demonstrate the satisfaction of the requirement specifications without the need to test each individual requirement exhaustively, relying instead on statistically controlled parameters.
The original Workplan defined for the project contains four main activities. Now, at the middle of the project, we have completed the first one and are working on the Base Line project:
• Training in and definition of the formal procedures, methods and tools that need to be put in place to establish an operational and efficient Systematic Testing process.
• Application of the Test planning and execution procedures on the base line
project.
Information Society Technology programme V 7
Instituto Portugues da Qualidade 24
Instrumentation software 130
Integra Sys 170
Integracion y Sistemas de Medida 126
Integration
  integration software 203
  of product verification 88
  of test concept 133
  of verification process 35
Internet 89, 90, 141, 143, 145, 150
  applications 90, 143
  realm 90
  service 87
  software 142
  usage 150
Internet Engineering Task Force W3C 151
Internet/Intranet 50, 91
  applications 91, 142
  remote access 53
Intranet services 94
Intranet/Extranet applications 144, 146, 150, 158
INTRASOFT SA 24
Italian PIEs 141

Jalote, Pankaj 84
Java 49, 158
  applets 73
  development environments 59
  test tools 154

Macquarie University 86
Manual GUI testing 89
MARI 20, 24
Marick, Brian 1, 35, 57, 80, 83, 84, 204
Market share 40, 169
McCabe & Associates 83
McConnell, Steve 80
McGraw, G. 96
Metric Plan definition 45
Microsoft
  Explorer 153
  Visual C++ Development System 121
  Visual C++ Visual Studio 137
Milanese, Fabio 35, 46
Miller, K. 81
MKS Web Integrity 154
Moore, Geoffrey A. 80
Morell, L. 81
Mosley, Daniel J. 84
Musa, J. 80
Myers, Glenford J. 56

Netscape 143
  Mozilla 153
  Server API 154
New Procedures
  adoption of 116
Nguyen, H.Q. 80
Nielsen, Jakob 80
Norsk Regnesentral 24
Nyman, Noel 79