ACM SIGSOFT Software Engineering Notes, May 2014, Volume 39, Number 3

State of the Art in Software Quality Assurance


R. Nandakumar, Systems Reliability Area, Space Applications Centre, ISRO, Ahmedabad, India, +91-79-2691 4186, nandakumar@sac.isro.gov.in
A.K. Lal, Systems Reliability Area, Space Applications Centre, ISRO, Ahmedabad, India, +91-79-2691 4629, aklal@sac.isro.gov.in
R.M. Parmar, Systems Reliability Area, Space Applications Centre, ISRO, Ahmedabad, India, +91-79-2691 5028, rmparmar@sac.isro.gov.in

ABSTRACT
The authors present a summary of their understanding of the publications that appeared in the computing literature during the last five-year period, with the word quality as part of their title and dealing with some or other aspects of software quality assurance. Specifically, the summary covers the following eight sub-topics, viz., (i) quality models; (ii) timely QA feedback; (iii) quantitative approaches to predicting software quality and the effectiveness of software QA; (iv) optimal choice of QA methods; (v) design-level QA; (vi) impact of parallel development options on software quality; (vii) continued QA efforts even after the operational deployment of the software product; and (viii) use of CASE tools and perceived value for QA at NASA JPL. This digest also includes summaries of three short papers describing ongoing research on (i) prioritizing QA efforts; (ii) QA in an Agile development context; and (iii) presenting the software quality status to stakeholders during development. Major trends observed in the summary pertain to (i) continuing software QA efforts even after the operational deployment of the software product, which may also be pertinent to the proliferation of mobile applications; (ii) presenting the current software quality status to different stakeholders using concise sets of software metrics; and (iii) the effectiveness of a hybrid approach of tool use combined with expert judgment for software QA. The emphasis on software QA activities spreading throughout the software product life cycle, from its beginning to end, is clearly observed in the reviewed literature.

Categories and Subject Descriptors
D.2.5 Testing and Debugging; D.2.8 Metrics; D.2.9 Management - Software quality assurance (SQA)

General Terms
Management, Measurement, Design

Keywords
Quality assurance, quality models, just-in-time approach, quantitative approach, design quality, effect of branching, real-life testing, tool use, ongoing research, perceived value

1. INTRODUCTION
A summary of the publications that appeared in the computing literature during the last five-year period, with the word quality as part of their title and dealing with some or other aspects of software quality assurance, is presented in this paper. This work is a part of the authors' undertaking in bringing out an internal technical note on the state-of-the-art practices in software quality assurance (SQA) in aerospace industries, as well as the state of the art in the computing literature covering the last five-year period. The reviewed research papers [1-17] are arranged in descending order of their publication year. References [18-25] have been cited in the reviewed publications [1-17], and are also referred to and cited as part of this digest.

Section 2 deals with current software quality models. Section 3 describes an empirical study on just-in-time software quality assurance. Section 4 covers quantitative approaches to SQA. Section 5 covers the assurance of software design quality. Section 6 deals with the effect of branching on quality as part of software configuration management. Section 7 describes real-life testing, or testing that is continued on the software product after its operational deployment. Section 8 deals with NASA JPL experiences in SQA. Section 9 includes a summary of ongoing research endeavours, and Section 10 contains the authors' conclusions.

2. SOFTWARE QUALITY MODELS
Klaus Lochmann and Andreas Goeb [11] list six different concepts that relate to the quality of software products.
(i) Quality attributes grouped under eight major classes, viz., (a) functional suitability; (b) performance efficiency; (c) compatibility; (d) usability; (e) reliability; (f) security; (g) maintainability; and (h) portability, as in the product quality model defined in ISO 25010 [18].
(ii) An activity-based quality model defined by F. Deissenboeck, et al. [19] relating product characteristics (e.g. static and dynamic behaviours) and activities performed on the product/system (e.g. analysis or implementation). Both of these are refined down to two levels of detail, and their influences on each of these last-level details are represented in a matrix format.
(iii) Domain-specific quality models described by B. Shim, et al. [20] for representing design quality (say, for the SOA (Service Oriented Architecture) context) as a four-partite directed graph. In this graph, the edges (direct influence) connect the four partitions of vertices in a particular sequence, namely design metrics, derived metrics, design properties and quality attributes.
(iv) The practice of specifying non-functional or quality requirements for a software product apart from specifying its functional requirements.
(v) The practice of different quality assurance techniques such as (a) functional tests; (b) static code analysis; and (c) inspections.
(vi) Prescribing guidelines for coding, user interface design, formal specifications, etc.

Lochmann and Goeb [11] suggest a unifying model for software quality taking many of the above individual quality concepts into consideration, by way of a graphical representation combining the concepts presented by B. Shim, et al. [20] and F. Deissenboeck, et al. [19]. This model relates the individual use cases for each stakeholder to the activity-properties (such as efficiency or effectiveness) of appropriate and relevant product quality attributes, which in turn are linked to the software product properties (grouped into functional and structural properties), which in turn are connected to one or more of the functional test cases, manual inspection items, static code analysis rules or environment-related metrics. The sample graph presented pertains to the quality attributes maintainability and usability, and similar future work is proposed to cover other quality attributes.

3. JUST-IN-TIME SQA
Yasutaka Kamei, et al. [1] predict defect-prone code changes in real-time (just-in-time, when the code submissions are made and their logic is fresh in the minds of the implementers), i.e., as soon as the code is submitted to the code repository, using 14 different factors grouped under five dimensions, as depicted in Table 1 below.
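The entropy factor in the diffusion dimension can be illustrated with a short sketch. This is an illustrative reading of the factor, not Kamei et al.'s actual implementation; the function name and the per-file line counts are assumptions made for the example:

```python
import math

def change_entropy(modified_lines_per_file):
    """Shannon entropy of how a single change's modified lines are
    distributed across the files it touches. A change spread evenly
    over many files has high entropy (more diffuse, hence riskier in
    the prediction model); a change confined to one file has entropy 0."""
    total = sum(modified_lines_per_file)
    shares = [n / total for n in modified_lines_per_file if n > 0]
    return -sum(p * math.log2(p) for p in shares)

# A change touching three files evenly is maximally diffuse,
# while one concentrated in a single file is not diffuse at all.
diffuse = change_entropy([10, 10, 10])   # log2(3), about 1.585
focused = change_entropy([30])           # 0.0
```

In such a scheme, the factor value would be computed per submitted change and fed, together with the other factors, into whatever classifier is used for the defect-proneness prediction.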

DOI: 10.1145/2597716.2597724 http://doi.acm.org/10.1145/2597716.2597724


Table 1: Factors Considered in Real-time Prediction of Defect-prone Code Changes (derived from Yasutaka Kamei, et al. [1])

Category/Dimension of Group of Factors: Specific Factors
Diffusion: (i) number of modified subsystems; (ii) number of modified directories; (iii) number of modified files; (iv) entropy (meaning the distribution of modified code across the files)
Size of code change: (i) lines of code added; (ii) lines of code deleted; (iii) original lines of code in the file before change incorporation
Purpose of change: to fix an earlier defect in the file or not
History of changes: (i) number of developers who made changes to the affected files; (ii) average time interval between the last change and the current change; (iii) number of unique changes made to the modified files
Experience of Change Implementers: (i) developer experience; (ii) their recent experience with the current system; (iii) their experience with the subsystem that needs change incorporation

Yasutaka Kamei, et al. [1] have empirically validated their prediction model on a number of large software projects, under both open source and commercial categories. They report that on average, 64% of defect-inducing changes are actually detected (recall), with an accuracy of 68% and a precision of 34% in both categories. Here recall is defined as "the ratio of correctly predicted defect-inducing changes to the total number of defect inducing changes;" accuracy is defined as "the ratio of correctly classified changes (both defect-inducing and non-defect-inducing) with respect to the total number of changes;" and precision is defined as "the ratio of correctly predicted defect-inducing changes to all changes predicted as defect-inducing." The advantages of just-in-time quality assurance predictions are threefold. First, they are fine-grained (code-level changes are identified, as against file-level or package-level changes, and the files or packages may include many such code segments); second, they identify the specific developer to whom the code review is to be assigned; and third, the predictions are made just-in-time, when the changes incorporated are fresh in the minds of the developers.

4. QUANTITATIVE APPROACHES TO SQA

4.1 Predicting Defect Content and Quality Assurance Effectiveness
For predicting (i) the defect content of any software development phase output (with particular emphasis on the software requirements analysis phase); and (ii) the defect-finding effectiveness of corresponding SQA review activities applicable to that phase (here SRS reviews), Michael Kläs, et al. [17] propose a hybrid model-based solution, combining the opinions of experts and the history of past projects undertaken in the organization. This is carried out in order to quantitatively aid management in (i) both planning and executing SQA activities for new software development projects; (ii) ensuring the mitigation of software quality related risk in the finished product; and (iii) ascertaining whether the planned level of risk reduction has been realised. The authors also discuss a case study related to the Independent Verification and Validation activity of one mission-critical, onboard space system software product at the Japanese aerospace agency JAXA. Among the possible causal factors (about 100, as per Jacobs J., et al. [21]) affecting defect injection and defect detection, a specific list of factors applicable to the current context is identified. Opinions of experts are then obtained on the influence of those factors through a questionnaire. The defects present are then estimated by multiplying:
(i) the size of the phase output;
(ii) the base value of the average defect density applicable to that phase, based on previous history;
(iii) 1 plus the defect density increase factor (due to the influence of causal factors);
(iv) the base value of effectiveness for the specific SQA activity, based on history; and
(v) 1 plus the effectiveness improvement factor (due to the influence of effectiveness improvement factors).

The adjustment factors (not to be confused with the causal factors) for the two base values are reportedly obtained from opinions gathered from different experts and a Monte-Carlo simulation. The simulation details are reportedly available in another publication by the same authors, Michael Kläs, et al. [22]. By plotting relative defect density against relative defect detection effectiveness, and observing into which quadrant the current project falls, one can effectively plan the SQA activities for that phase. For the JAXA case study with data from five past projects, the Mean Magnitude of Relative Error of the estimates from the hybrid model, determined by cross-validation (by successively leaving out one of the five projects for estimating), is observed to be 29.6%, compared to 76.5% resulting from the most accurate purely data-based model.

4.2 A Framework for the Balanced Optimization of SQA Strategies
Michael Kläs, et al. [16] have refined the basic six steps of the Quality Improvement Paradigm (QIP), namely,
(i) "characterization of the current situation";
(ii) "definition of improvement goals";
(iii) "planning of improvement activities";
(iv) "execution";
(v) "analysis of the results"; and
(vi) "packaging of the results for reuse"
into an 11-step process framework applicable to small and medium-sized software enterprises, which have limited resources for process improvement and no explicit process improvement groups. The authors claim that the proposed quality improvement framework is a concrete and easy-to-adopt set of steps for such organizations. As the final step, the experiences gained with respect to the quality assurance technologies used, such as efforts, effectiveness, effects of individual influencing factors, and lessons learnt, are archived in the best practices or technology repository for reuse. Michael Kläs, et al. [16] report the following four lessons learnt from the case study conducted in one mid-size software company, viz.,
(i) establishment of a consistent defect classification scheme;
(ii) need for the establishment of a measurement infrastructure;
(iii) conducting interviews at different levels of personnel; and
(iv) the primary basis of the optimization suggestions being derived from the interview findings together with measurement data.

4.3 Inter-Play between Cost, Schedule and Quality
The authors of two short conference papers (Omar Alshathry, et al. [15]; Omar Alshathry and Helge Janicke [13]) discuss the inter-play between cost, schedule and quality, and propose models to support meaningful decision-making by project managers for SQA activities during the software development life cycle. The papers discuss weight assignment to phase-wise SQA activities in direct proportion to their risk rating or defect removal efficiency, and a regression model between the size of the work product and the number of defects originating from that work product.

4.4 Cost-Benefit Analysis towards Scheduling SQA Activities
Tilmann Hampp [6] claims that informed decision-making in scheduling SQA activities requires cost-benefit analysis derived from previous projects. The work describes a model named CoBe developed by the author and his team for this purpose. The model relates the costs and

benefits of any SQA activity with its defect detection effectiveness, as defined by S. H. Kan [23], based on the total test cases considered and the competence of the testers. The model was evaluated with 21 student projects and two industry projects. A discussion of the sensitivity analysis to model inputs is included. The model assumes a standard life-cycle process and is not readily tunable for agile practices. What-if analyses carried out for industry projects resulted in a good fit, while the same for student projects showed variation.

5. SOFTWARE DESIGN QUALITY ASSURANCE
Ganesh Samarthyam, et al. [3] share their experiences in implementing a Method for Intensive Design quality ASsessment (MIDAS) of software applications for the industry, energy, healthcare, infrastructure and cities sectors at the Siemens corporate development centre for Asia and Australia. MIDAS enables design experts to assess software design quality using a systematic application of design analysis tools with a three-view model, comprising (i) standard design principles; (ii) project-specific constraints; and (iii) an 'ility'-based quality model. The quality attributes considered are understandability, effectiveness, flexibility, extendibility, reusability, and functionality. The design principles considered, under three categories from coarse-grained to fine-grained, include abstraction, encapsulation, modularity, and hierarchy at the top level; classification, aggregation, grouping, coherence, hiding, decomposition, locality, ordering, factoring, layering, generalization, and substitutability at the middle level; and the single-responsibility principle, the acyclic-dependencies principle, Liskov's substitution principle, etc. at the bottom level (as described by R. C. Martin [24]). Project-specific constraints might include those that pertain to language and platform, framework, domain, architecture, hardware and process. Ganesh Samarthyam, et al. [3] cite example design problems such as
(i) "cyclic dependencies between classes";
(ii) use of recursion in an embedded system, where it is not allowed;
(iii) a class that must implement the dispose pattern (with regard to resource management) does not implement it; and
(iv) a GUI application, which is supposed to use the Model-View-Controller (MVC) pattern, does not strictly follow it.
The authors Ganesh Samarthyam, et al. [3] also report three salient characteristics of MIDAS. First, its goal-oriented assessments are mentioned, such as (i) a go/no-go decision for a major refactoring initiative; (ii) an estimate of the current status of design quality; or (iii) the need to know priorities among the components that require refactoring. Second, it is reported that MIDAS combines "manual review and tool analysis," facilitating the manual assessment of the results of design analysis tools such as the SPQR tool. The third characteristic mentioned is the customized reporting feature of MIDAS for software stakeholders, tuning the reports to answer the concerns of (i) managers, giving a summary with suggestions for preventive and corrective actions; (ii) architects, providing a high-level view of the problems, their impact on design quality, and possible remedial strategies; or (iii) software developers, providing details of design problems with priorities to refactor.

6. EFFECT OF BRANCHING ON SOFTWARE QUALITY
Emad Shihab, Christian Bird, and Thomas Zimmermann [7] report that in large software projects, and especially in projects where multiple similar releases are needed for different end users, branching from the baseline trunk version maintained in the codebase, typically using a configuration management tool, is inevitable so as to enable parallel modifications to an existing software component. An essential post-activity after branching is merging, to combine the different code changes that were incorporated in the different branches, as applicable for the current software release. The authors of this paper empirically study different branching strategies with regard to post-release software quality with two case studies involving Microsoft operating system software projects, and claim that their findings are universally applicable.

A few definitions have been adopted verbatim from Emad Shihab, Christian Bird, and Thomas Zimmermann [7]. A branch family is a sub-tree branching off from the main trunk. Sometimes, a branch may have to be added to an existing branch family to provide further isolation in development. In those cases, the new branch is said to be one level deeper in the branch tree. Branch depth is defined as the measure of how deep a branch is away from the main trunk. Software changes are classified into two categories, namely development changes and branching changes. Development changes relate to additions, deletions or modifications within a code segment, while branching changes relate to the integration of changes done in different branches to realise a software component. The following definitions of five branching-related metrics, which are needed to state the findings, are adopted from Emad Shihab, Christian Bird, and Thomas Zimmermann [7].
(i) Branching Activity: the ratio of the number of branching changes to the number of development changes, per software component.
(ii) Branching Scatter: the ratio of the number of unique branch families that a software component is in, to the number of development changes.
(iii) Branch Scatter Entropy: "the entropy (Shannon entropy) of the scatter of the changes to a software component across branch families." This is related to the percentage of changes done for a software component across branch families. If an equal percentage of changes is performed in all branch families, the Branch Scatter Entropy will have its maximum value, and if all changes are restricted to one branch family, the entropy will have its least value.
(iv) Branching Depth (Low, Medium, High): "the ratio of development changes to a software component in low/medium/high depth branches."
(v) Branch Depth Entropy (Low, Medium, High): "the entropy of the changes at each depth level (low, medium, and high)."
The major findings follow.
(i) Branching Activity "has a negative impact on software quality," meaning more Branching Activity will imply poorer software quality.
(ii) "Branch Scatter has a negative impact on software quality."
(iii) "Branch Scatter Entropy has a slight positive impact on software quality in Windows Vista and negative impact on software quality in Windows 7."
(iv) "Branch Depth and Branch Depth Entropy have very little or no impact on software quality."
The authors also conclude that, in addition to aligning the branching structure with the software architectural structure, branches should also be aligned with the organizational structure of the teams. That is, the development responsibility for each branch family should be restricted to a specific organizational sub-group specializing in the functional aspect dealt with in the concerned branch family.

7. REAL-LIFE TESTING/CONTINUED SQA

7.1 In-vivo Testing
Christian Murphy, et al. [14] present a new testing methodology called in vivo (meaning real-life, in Latin) testing, in which testing is continued in the deployment environment, as against laboratory testing, known as in vitro (meaning in the glass, in Latin, or alternatively in an artificial environment) testing. They also introduce a new type of in vivo tests to be used in the new testing methodology. They list at least five conditions in which in vivo testing is warranted, which are:
(i) "incorrect assumptions on state of objects" (such as clean initial conditions for dynamic data structures) in the application during unit testing;

(ii) inadequate laboratory testing of possible real-life configuration options;
(iii) inability to practically simulate the deployment environment conditions during laboratory testing (such as multi-core/multi-CPU target environments);
(iv) possible user actions making the application go into unexpected states, which were not anticipated or tested before; and
(v) bugs "that only appear intermittently," which could not be reproduced during the laboratory testing.

The main idea here is that there are simply too many possible configurations and options to test in the development environment, and hence tests cannot be run exhaustively on-site to ensure proper quality assurance. The authors also describe an example framework developed by them for carrying out in vivo testing, and demonstrate its feasibility without actually affecting the user experience or deteriorating the performance characteristics of the application. They also report that the inspiration for this work came from combining two concepts reported earlier in the literature, namely self-checking software and perpetual testing.

7.2 Automated Continuous QA
Johannes Neubauer, et al. [8] illustrate the importance of automated SQA that is continued even after the first operational deployment of evolving enterprise-wide client-server software, with the specific example of their experience with Springer Verlag's Online Conference System (OCS). SQA efforts, particularly the testing efforts, in a system development life cycle up to the first deployment of the system in the user environment are well above 20-30% of the total development efforts. This percentage increases further if we consider the testing efforts for each new upgrade of the system during its operation. Even then, (i) some number of "functional bugs"; (ii) possible non-acceptance of certain software features by the users; as well as (iii) some performance issues due to "feature interaction" remain unnoticed during the testing efforts. The authors Johannes Neubauer, et al. [8] explain how automata learning (as provided by LearnLib, a software tool described by Maik Merten, et al. [25]), used to study the user behavioral model of OCS, has helped mitigate three problems encountered during "classical" testing of functional properties carried out manually, viz.,
(i) inadequate test coverage;
(ii) manual regression testing of system upgrades resulting in a large number of false positives, as the outputs of the upgraded version are compared with the expected outputs of the previous or original version of the system; and
(iii) test results that only indicate the presence of problems and lack diagnostic information to reach the root cause of the problems.
The authors Johannes Neubauer, et al. [8] also explain that their approach has helped in identifying and resolving non-functional issues with the system, such as performance bottlenecks, and in indicating which portions of the system are stable and which are altered, through detection of feature interactions. They claim the automata learning-based behavioral-model testing can provide increasing reliability in the evolving system.

7.3 SQA Environment for Distributed and Continuous QA
Adam Porter, et al. [4], in a yet-to-be-published work, claim that as the current SQA techniques do not scale to the needs of agile and continuously evolving software products, they have developed an SQA environment named SKOLL. It defines a generic set of distributed, continuous and novel SQA processes, including certain supporting tools. They describe their environment and its advantages with the results of two case studies. Shortcomings of current SQA techniques are stated to include differences in (i) test cases; (ii) input load conditions; (iii) software version; and (iv) the platform used during testing, with respect to the field conditions.

8. NASA JPL EXPERIENCES IN SQA

8.1 Tool Use
Denise Shigeta, et al. [5] argue for the use of tools for effective SQA in large and complex software systems, in order to analyze a large number of complex artifacts such as requirements, design descriptions, etc., as well as large amounts of quality records such as anomaly reports. They state that 100% of participants in a recent survey conducted at NASA JPL vouched for the current dominance of manual methods and the limited use of tools in SQA. Reasons for poor penetration include (i) high cost; (ii) lack of user training; (iii) steep learning curve; (iv) lack of coordination within the institution; (v) overheads in identifying proper tools; (vi) difficulty in understanding the interactions and relations among the different tools at any given time; as well as (vii) identification of proper combinations of tools to address specific SQA needs. The importance of early error detection, and the associated steep increase in cost due to the delayed fixing of errors, are stated to be the primary causes that warrant the use of tools for SQA, especially for assessing (i) the correctness, completeness and traceability of requirements at different granularities among different artifacts; as well as (ii) within the requirements themselves, relating non-functional (quality) requirements to the functional requirements. Manual methods run the risk of human errors and the possible lack of coverage, while automated SQA tools can provide efficiency and increased effectiveness. Requirements traceability matrix generation is a mandatory activity at JPL. There are tens of thousands of individual requirements for typical aerospace software systems such as the Mars Reconnaissance Orbiter. The size and complexity in the number of possible relations between non-functional and functional requirements, and the associated efforts and costs involved in manually analyzing these requirement relationships, clearly indicate the need to use automated requirements management tools for efficiency in analysis. Another mandated activity at JPL is the identification and analysis of the top-N risks. The authors Denise Shigeta, et al. [5] provide an example of insufficient risk analysis carried out manually, with the risk of possible unit mismatch among interfacing software components causing a mission failure (Mars Climate Orbiter, 1998), even though it was a well-known, documented risk identified from previous projects. They argue the case for tools to ensure completeness. The list of possible areas where tools could help to hasten the manual methods includes architecture, code, contractor, cost, delivery, product, project, reliability, requirements, resource, risk, safety, schedule, security, test assurance, and assurance management. The authors Denise Shigeta, et al. [5] provide a summary of a study carried out at JPL evaluating a number of tools for different SQA tasks, in the form of a graph plot. Sample conclusions relate to (i) the use of the Constructive Cost Model (COCOMO) for cost, project and schedule related aspects; (ii) the Architecture Analysis and Design Language (AADL) for architecture, reliability and resource related aspects; (iii) the use of the platform named JIRA for risk analysis; and (iv) the tool Coverity for code analysis purposes.

8.2 Perceived Value of SQA
Dan Port and Joel Wilf [12] report the results of a study conducted by them at NASA JPL on how different stakeholders in a software project have different perceptions about the value of SQA activities, and also about what constitutes the success of SQA activities, based on interviews and surveys conducted with more than 40 stakeholders. The 40 stakeholders are grouped into five categories, namely (i) mission assurance managers as acquirers of SQA; (ii) flight software managers and (iii) project managers as customers; (iv) software quality improvement staff as assessors; and (v) software quality engineers as providers. The differences in perception make it difficult to plan and justify SQA activities and to conduct performance assessment of SQA activities in any given project. The authors Dan Port and Joel Wilf [12] claim that their observations are a snapshot in time and are not to be taken as foregone conclusions for all times to come, as both perceptions and practice continually evolve. JPL, over time, has increasingly recognized the importance of software quality, especially after mission failures due to software errors, as happened in the Mars Climate Orbiter and Mars Polar Lander. As the software size, complexity and cost grow, the quality

concerns also grow. The factors contributing to the confusion come from (i) the different definitions of SQA prevailing within NASA-JPL and specific mission assurance plans; (ii) the different perspectives of SQA held by stakeholders, viz., (a) as testing, (b) as verification and validation, (c) as another set of reviews, (d) as proof of compliance to standards, (e) as risk management and (f) as quality management; and (iii) damaging misperceptions such as (a) SQA is an unnecessary activity; (b) SQA focuses only on defects found; (c) SQA can be performed by anyone; (d) SQA is only useful in helping the project pass milestone reviews; and (e) the prevailing confusion between process improvement and SQA. The authors Dan Port and Joel Wilf [12] also list a few consequences of this confusion, i.e., different interpretations and values of SQA perceived in the organization, such as (i) unclear priorities; (ii) not performing risk assurance; and (iii) reporting only the observations without commenting on overall quality.

9. ONGOING RESEARCH IN SQA

9.1 Pragmatic Prioritization of SQA Efforts [10]
It is well known that maintenance and evolution efforts amount to as much as 90% of software development costs. According to statistics reported in the literature, the current size of the code base at Fortune 100 companies is 35 million lines of code, and this size is expected to double every seven years. SQA forms a large part of the code maintenance effort. Hence, there is a need for practical approaches to prioritize SQA efforts. Software Analytics, dealing with the mining of software repositories, is an active area of research. The majority of ongoing research on prioritizing SQA efforts focuses on “code-level quality” (predicting code locations that are error prone) and “change-level quality” (predicting software changes that are error prone). However, their penetration into practice is too low, considering the coarse granularity of these predictions (being made at the file level). The proposed research work by Emad Shihab [10] aims to resolve this situation by suggesting four principles for such predictions, viz., (i) focus on finer levels of granularity; (ii) provide timely feedback, when changes are fresh in the minds of developers; (iii) provide an estimate of the SQA effort; and (iv) evaluate the general applicability of the predictions across different projects and domains. Research has already progressed well for code-level quality, while similar progress is expected for change-level quality in the near future. The heuristics proposed for identifying code-level quality, i.e., for identifying the specific functions or methods for which unit tests are to be carried out, are: (i) “most frequently modified;” (ii) “most recently modified;” (iii) “most frequently fixed;” (iv) “most recently fixed;” (v) “largest modified;” (vi) “largest fixed;” (vii) “size risk” (functions with risk defined as the ratio of the number of error fixes to the size of the function in lines of code); (viii) “change risk” (functions with risk defined as the ratio of the number of error-fixing changes to the total number of changes); and (ix) random selection of functions. The size (in lines of code) of the function is suggested as an estimate of the effort needed to perform unit tests. A metric named usefulness is defined to indicate whether the predictions were really effective in finding new errors in the predicted functions.

9.2 Quality Assurance in Agile [9]
Sonali Bhasin [9] writes that organizations are yet to be convinced of (i) whether agile methods have adequate rigor, as compared to traditional methods, to ensure and manage the quality of software products; and (ii) how one achieves improved quality in an Agile environment. The objectives of her research on SQA in an Agile environment are threefold, viz., (i) to identify “key drivers” that help realise product quality; (ii) to study how product quality further impacts cost reduction and increased business value; and (iii) to propose an “Agile quality framework” that can help organizations adopt effective SQA practices using Agile methodologies. The core characteristics of Agile, the Agile influence on quality and a list of control variables are brought out, respectively, as (i) “customer involvement, test early and often, shorter feedback and prioritized requirements;” (ii) “defect reduction, early defect detection, cycle time improvement, and code quality;” and (iii) “Scrum practices, continuous integration, refactoring, experience of team members, geographical team distribution, ‘done’ compliance, condition of satisfaction, test driven development, acceptance test driven development, test coverage and sprint commitment.”

9.3 Fostering Software Quality Assessment [2]
In another paper [2], the author Martin Brandtner proposes a new way of representing software quality by distilling information, both direct observations and derived metrics, from different software repositories such as codebases in version control systems, development tool usage, issue managers or bug trackers, project wikis, etc. Such observations and metrics are proposed to be presented in customized form for the different stakeholders, such as architects, developers and testers, in a timely fashion, presenting a snapshot of the project quality status, either as “stakeholder context” or as “technical context.” Architects would require a set of quality measurements indicating potential violations of the architecture. Developers would need source code history and build history to support understanding of the root cause of any problem. Testers would want a set of quality measurements and statistics on events occurring in software tools, such as the version control system, the build and release tool Jenkins and the static code analysis tool Sonar, during and after any code change. The proposed work involves identifying a model for the software quality assessment context, its tailoring or customization for different stakeholder needs, and its presentation and interaction.

10. CONCLUSIONS
This paper summarizes the state-of-the-art research work reported in the computing literature on software quality assurance during 2008 to 2013, particularly covering (i) quality models; (ii) timely QA feedback; (iii) quantitative approaches to predicting software quality and the effectiveness of software QA; (iv) optimal choice of QA methods; (v) design-level QA; (vi) the impact of parallel development options on software quality; (vii) continued QA efforts running into the operational life of the software product; and (viii) the use of CASE tools and the perceived value of QA at NASA JPL. A few reports on ongoing research are also summarized. Major trends observed in the above summary pertain to (i) continuing software QA efforts into the operational life of the software product, which may also be pertinent to the proliferation of mobile applications; (ii) presenting the current software quality status to all stakeholders through an optimal set of software metrics; and (iii) the effectiveness of a hybrid approach of tool use combined with expert judgment for software QA. The emphasis on continued software QA activities spreading throughout the software product life cycle, from its beginning to its end, is clearly observed from the reviewed literature.

11. ACKNOWLEDGMENTS
The authors acknowledge Shri A. S. Kiran Kumar, Director, Space Applications Centre, ISRO, for the encouragement given to this study assignment. The authors also thank the two internal reviewers at Space Applications Centre for their careful scrutiny of the manuscript. Critical comments by the SEN Editor Michael Wing greatly helped to improve the readability of this article.

12. REFERENCES
[1] Yasutaka Kamei, Emad Shihab, Bram Adams, et al., 2013, A Large-Scale Empirical Study of Just-in-Time Quality Assurance, IEEE Transactions on Software Engineering, Vol. 39, No. 6, June 2013, pp 757-773; IEEE http://dx.doi.org/10.1109/TSE.2012.70
[2] Martin Brandtner, 2013, Fostering Software Quality Assessment, ICSE 2013, San Francisco, CA, USA, Doctoral Symposium, pp 1393-1396; IEEE http://dx.doi.org/10.1109/ICSE.2013.6606725
[3] Ganesh Samarthyam, Girish Suryanarayana, Tushar Sharma, et al., 2013, MIDAS: A Design Quality Assessment Method for Industrial Software, ICSE 2013, San Francisco, CA, USA, Software Engineering in Practice, pp 911-920; IEEE http://dx.doi.org/10.1109/ICSE.2013.6606640

[4] Adam Porter, et al., Skoll: A Process and Infrastructure for Distributed Continuous Quality Assurance, to appear in IEEE Transactions on Software Engineering.
[5] Denise Shigeta, Dan Port, Allen P. Nikora, Joel Wilf, 2013, Tool Use Within NASA Software Quality Assurance, 46th Hawaii International Conference on System Sciences, pp 4938-4947; IEEE http://dx.doi.org/10.1109/HICSS.2013.554
[6] Tilmann Hampp, 2012, A Cost-Benefit Model for Software Quality Assurance Activities, PROMISE ’12, September 21-22, 2012, Lund, Sweden, pp 99-108; ACM DL http://dx.doi.org/10.1145/2365324.2365337
[7] Emad Shihab, Christian Bird, Thomas Zimmermann, 2012, The Effect of Branching Strategies on Software Quality, ESEM ’12, September 19-22, 2012, Lund, Sweden, pp 301-310; ACM DL http://dx.doi.org/10.1145/2372251.2372305
[8] Johannes Neubauer, B. Steffen, D. Bauer, et al., 2012, Automated Continuous Quality Assurance, FormSERA 2012, Zurich, Switzerland, pp 37-43; IEEE http://dx.doi.org/10.1109/FormSERA.2012.6229787
[9] Sonali Bhasin, 2012, Quality Assurance in Agile – A Study Towards Achieving Excellence, Agile India 2012, pp 64-67; IEEE http://dx.doi.org/10.1109/AgileIndia.2012.18
[10] Emad Shihab, 2011, Pragmatic Prioritization of Software Quality Assurance Efforts, ICSE ’11, May 21-28, 2011, Waikiki, Honolulu, HI, USA, pp 1106-1109; ACM DL http://dx.doi.org/10.1145/1985793.1986007
[11] Klaus Lochmann, Andreas Goeb, 2011, A Unifying Model for Software Quality, WoSQ ’11, September 4, 2011, Szeged, Hungary, pp 3-10; ACM DL http://dx.doi.org/10.1145/2024587.2024591
[12] Dan Port, Joel Wilf, 2011, A Study on the Perceived Value of Software Quality Assurance at JPL, Proceedings of the 44th Hawaii International Conference on System Sciences; IEEE http://dx.doi.org/10.1109/HICSS.2011.34
[13] Omar Alshathry, Helge Janicke, 2010, Optimizing Software Quality Assurance, 34th Annual IEEE Computer Software and Applications Conference Workshops, pp 87-92; IEEE http://dx.doi.org/10.1109/COMPSACW.2010.25
[14] Christian Murphy, G. Kaiser, I. Vo, M. Chu, 2009, Quality Assurance of Software Applications Using the In Vivo Testing Approach, International Conference on Software Testing Verification and Validation, pp 111-120; IEEE http://dx.doi.org/10.1109/ICST.2009.18
[15] Omar Alshathry, H. Janicke, H. Zedan, A. Alhussain, 2009, Quantitative Quality Assurance Approach, International Conference on New Trends in Information and Service Science, pp 405-408; IEEE http://dx.doi.org/10.1109/NISS.2009.114
[16] Michael Kläs, S. Wagner, M. Pizka, et al., 2009, A Framework for the Balanced Optimization of Quality Assurance Strategies Focusing on Small and Medium Sized Enterprises, 35th Euromicro Conference on Software Engineering and Advanced Applications, pp 335-342; IEEE http://dx.doi.org/10.1109/ICSM.2007.4362631
[17] Michael Kläs, H. Nakao, F. Elberzhager, J. Münch, 2008, Predicting Defect Content and Quality Assurance Effectiveness by Combining Expert Judgment and Defect Data – A Case Study, 19th International Symposium on Software Reliability Engineering, pp 17-26; IEEE http://dx.doi.org/10.1109/ISSRE.2008.43
[18] ISO/IEC 25010:2011, Systems and software engineering – Systems and software Quality Requirements and Evaluation (SQuaRE) – System and software quality models.
[19] F. Deissenboeck, S. Wagner, M. Pizka, et al., 2007, An Activity-Based Quality Model for Maintainability, Proc. 23rd Int. Conf. on Software Maintenance (ICSM 2007), IEEE CS Press. http://dx.doi.org/10.1109/ICSM.2007.4362631
[20] B. Shim, S. Choue, S. Kim, S. Park, 2008, A Design Quality Model for Service-Oriented Architecture, Proc. 15th Asia-Pacific Software Engineering Conf. (APSEC ’08), pp 403-410. http://dx.doi.org/10.1109/APSEC.2008.32
[21] J. Jacobs, Jan van Moll, Rob Kusters, et al., 2007, Identification of factors that influence defect injection and detection in development of software intensive products, Inf. Softw. Technol., vol. 49, no. 7, pp 774-789. http://dx.doi.org/10.1016/j.infsof.2006.09.002
[22] Michael Kläs, A. Trendowicz, A. Wickenkamp, J. Münch, 2008, The Use of Simulation Techniques for Hybrid Software Cost Estimation and Risk Analysis, Advances in Computers, 74, pp 115-174, Elsevier. http://dx.doi.org/10.1016/S0065-2458(08)00604-9
[23] S. H. Kan, 2003, Metrics and Models in Software Quality Engineering, Addison Wesley.
[24] R. C. Martin, 2000, Design Principles and Design Patterns, Object Mentor.
[25] Maik Merten, B. Steffen, F. Howar, T. Margaria, 2011, Next Generation LearnLib, in P. Abdulla and K. Leino, editors, Tools and Algorithms for the Construction and Analysis of Systems, volume 6605 of Lecture Notes in Computer Science, pp 220-223, Springer Berlin/Heidelberg. http://dx.doi.org/10.1007/978-3-642-19835-9_18
