
RECAP: A Requirements Elicitation, Capture and Analysis Process Prototype Tool for Large Complex Systems

Michael L. Edwards
Code B44, Naval Surface Warfare Center
Dahlgren Division
10901 New Hampshire Avenue
Silver Spring, MD 20903

Matt Flanzer, Mark Terry and Joseph Landa
Trident Systems Incorporated
10201 Lee Highway, Suite 300
Fairfax, Virginia 22030

Abstract
Complete, correct, and consistent requirements are essential to the development of large complex systems. The Requirements Elicitation, Capture and Analysis Process prototype tool (RECAP) provides automated assistance in identifying, capturing, analyzing, and using requirements. RECAP combines recent advances in natural language parsing, requirements specification, and knowledge-based rules to support semiautomatic capture, elicitation, analysis, and use of requirements data. The process begins with unformatted natural language text, which is inherently ambiguous, incomplete, and cumbersome. RECAP assists the user in translating requirements from natural language text into concise requirements data which can be organized and accessed by user-defined views. Requirements and domain specific rules are used to help the user analyze and maintain the requirements data throughout the system life-cycle.

Introduction
System requirements specifications for large complex systems, such as naval command and control systems, airspace battle management systems, or international financial trading systems, invariably consist of many documents containing pages full of jargon, acronyms, and symbols (JAS) that are meant to fully define the desired product. The system requirements in these documents must be identified, interpreted, and extended to produce a design which, when instantiated as a product, satisfies the specifier. Too often the product fails to meet such expectations because the supplied requirements were incomplete, inconsistent, incorrect, uninterpretable, or simply unmanageable. A complete, correct, and precise set of requirements is essential to ensure that the desired system's functionality is provided. Factors such as the inherent ambiguity of natural language representations and the scope and complexity of the required system combine to make writing, interpreting, and using requirements documents difficult.

The development of a set of complete and consistent requirements for large complex computer systems is hampered by the intrinsic properties of such systems, including size, span of control, and application complexity. Typically, the top level system requirements documentation consists of several natural language requirements documents. Undocumented assumptions on the part of the system designers lead to incorrect, inconsistent, ambiguous, incomplete, or forgotten requirements. When combined, these factors hinder human understanding of the system being specified and can lead to massive losses in time and money spent in designing a system which fails to satisfy the needs of the specifiers, developers, and end-users.

Because the ability of humans to assimilate, recall, and transfer large volumes of information is limited, the benefits of mechanical and digital requirements engineering tools have been explored for years. Techniques and tools have been devised to address difficulties arising from the capture and management of requirements. Many of these efforts either require the specifier to use a formal requirement language or capture natural language text without being able to analyze the meaning of that text, which reduces them to keyword search tools and organizers.

0-8186-7123-8/95 $4.00 © 1995 IEEE

Figure 1. RECAP process overview

This paper discusses the benefits of employing RECAP, a prototype tool being developed for the Engineering of Complex Systems technology block at Trident Systems Incorporated under NSWC oversight. RECAP consists of a natural language processing module, a requirements parser, and a requirements analysis engine. Each major section of the tool and the underlying research is described. Figure 1 shows the RECAP process overview, beginning with unformatted documents, tagging those documents, parsing requirements based on a requirement process template, and then conducting analysis to ensure the requirements data set is complete, consistent, and unambiguous.

Requirements Structure

RECAP organizes requirements data according to a requirements template. This template can be defined by the user, and additions to the template can be made throughout the project. Requirements capture in RECAP characterizes requirements according to their adjudged purpose in accordance with the template. RECAP is designed to identify requirements in a text file and to set the attributes of each requirement according to the system requirements views structure. In ongoing work, requirements have been partitioned into five categories: capabilities, constraints, processes, testing, and environment [1]. By partitioning requirements into these or other user-defined capture views, and further associating a particular requirement with a domain and with attributes, the requirements data is effectively partitioned into manageable sections which are analyzed based on a requirements or domain specific rule base.

Use of RECAP allows requirements to be viewed as a data set rather than a document set. Requirements are reorganized and indexed to fit into a structured template. The template is an outline of how the user wants the requirements data parsed. Part of an example template follows:

system capabilities
    data
        type
        updated by
        received by
        produced by
        used by
    functions
        initiation
        termination
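
One way to picture the example template is as a nested mapping whose nesting mirrors the outline. The Python sketch below is illustrative only: the nesting of the entries, the names TEMPLATE, paths, lowest_level, and index_entries, and the idea of matching on node names are all assumptions made for this sketch and are not taken from the RECAP implementation.

```python
# Hypothetical nested-dict form of the example template (nesting is assumed).
TEMPLATE = {
    "system capabilities": {
        "data": {
            "type": {},
            "updated by": {},
            "received by": {},
            "produced by": {},
            "used by": {},
        },
        "functions": {
            "initiation": {},
            "termination": {},
        },
    },
}


def paths(template, prefix=()):
    """Yield every template node as a tuple path starting at the root."""
    for name, children in template.items():
        path = prefix + (name,)
        yield path
        yield from paths(children, path)


def lowest_level(terms, template=TEMPLATE):
    """Return the deepest template path whose node name is among `terms`."""
    matches = [p for p in paths(template) if p[-1] in terms]
    return max(matches, key=len, default=None)


def index_entries(path):
    """A requirement classified at `path` is indexed under every level above it."""
    return [path[:i] for i in range(1, len(path) + 1)]
```

For instance, a requirement whose text yields the terms "data" and "updated by" would be classified at the "updated by" node and indexed under that node and both of its ancestors.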

The template indexes the requirements according to the implied hierarchy. Each item in the requirements template points to a set of terms in the RECAP lexicon, which accounts for synonyms, word stems, idiomatic phrases, and acronyms. For example, a requirement which is a sensing capability is indexed as belonging to both capability and sensing. RECAP attempts to classify a requirement to the lowest level within the template. This process makes later data access faster and helps RECAP derive requirements.

Candidate requirements are identified by grammatical patterns and subject. Once a term or sentence is flagged as a potential requirement, RECAP tries to find places in the template to associate with that requirement. RECAP has a robust lexicon and lexicon builder associated with the template. The lexicon and lexicon builder define words and terms which help to identify and classify requirements.

RECAP Inputs

The RECAP process begins with some version of top level requirements, a mission needs statement, a functional description of a proposed system, or any draft document. Original documents are indexed and preserved for reference and traceability. The document is passed through a series of filters to strip out headers, footers, covers, and other information-deficient parts of the document. All tables and figures are indexed and referenced for later analysis.

Requirements Parsing

Once a document has been filtered, the contents are examined from a requirements point of view. Natural language processing and understanding is a complex field of research. Many of the challenges in this subject are the result of written documents not rigorously following the rules of English grammar. Idiomatic expressions and acronyms can complicate even small documents. Fortunately, requirements are a subset of all text. A set of grammatical rules which apply to requirements is easier to define than a set of grammatical rules which apply to all text, such as science fiction, technical literature, and business contracts.

The first step in decomposing a document into requirements data is to tag the parts of speech throughout the document. Once all parts of speech are known, sentence, statement, and paragraph subjects are deduced. Several types of automated part of speech taggers currently exist. Stochastic taggers work by assigning a tag based on a probability distribution. The probabilities are estimated from manually tagged documents. The primary problem with this method is that it does not take advantage of linguistic information specific to requirements. Additionally, stochastic taggers are incapable of learning, while rule based taggers are capable of learning. Rule based taggers learn as a requirements data set grows through a project life-cycle, and they infer unknown parts of speech based on previous experiences or generalized linguistic rules.

To accomplish the goal of identifying parts of speech within a document, a pair of natural language taggers has been built. The first is a parts of speech tagger based on a modified Brill algorithm (Brill [2]). This algorithm is a trainable rule-based parts of speech tagger which has been tailored to derive the unknown parts of speech which are most important in identifying requirements. The second tagger is a requirements specific tagger which identifies phrases common in requirements. The original Brill tagger is based on transformation-based error-driven learning. In this type of learning, a corpus is used for building a lexicon. Based on rules and error correction during training, an effective tagger has been customized for the specific needs of requirements parsing. Identification of the present and future tense auxiliary verb phrase is not sufficient to identify requirements because many occurrences of these verbs are found in non-requirements text. Many requirements are hidden inside complex statements which do not contain any present or future tense auxiliary verbs.

A secondary rule base is defined which helps derive the sentence subject and determines whether that subject is a component, subsystem, or system under consideration. The determination of subject is important in both finding requirements and classifying those requirements as belonging to a particular domain.

The RECAP parser takes advantage of an object-oriented design to include a robust and easily maintainable file class hierarchy. This hierarchy consists of a generic RFile class, used as a base class, and other classes that represent specific types of files used in RECAP. These files are the process document, template, grammatical pattern, and lexicon.

The RFile classes, as they are known, provide an interface to move through a regular text file in various ways. A file may be navigated by regular expression search, by direct indexing, per word, or per sentence. The definitions of what makes a word or a sentence are contained within the base class, so these fundamental definitions can be altered in one place and affect all file operations on all subclasses. Identifying a specific location within an arbitrarily large RFile is handled by the RIndex class. The RIndex class simply represents a location within any RFile. Currently, the RIndex is implemented using a byte offset from the beginning of the file, but it has been constructed to allow this method to change.
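
The file class hierarchy and location class described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RECAP code: the class and method names are illustrative, the text is held in memory rather than read from disk, the offset is a character offset (equal to a byte offset only for ASCII text), and the word and sentence patterns are deliberately crude (for example, a value written as ".9" would split a sentence).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RIndex:
    """A location within a file; here an offset from the start of the text."""
    offset: int


class RFile:
    """Base class: word and sentence definitions live here and nowhere else."""
    WORD = re.compile(r"\S+")
    SENTENCE = re.compile(r"[^.!?]+[.!?]?")

    def __init__(self, text):
        self.text = text

    def words(self):
        """Yield (location, word) pairs, one per word."""
        for m in self.WORD.finditer(self.text):
            yield RIndex(m.start()), m.group()

    def sentences(self):
        """Yield (location, sentence) pairs, skipping leading whitespace."""
        for m in self.SENTENCE.finditer(self.text):
            raw = m.group()
            stripped = raw.strip()
            if stripped:
                lead = len(raw) - len(raw.lstrip())
                yield RIndex(m.start() + lead), stripped

    def search(self, pattern):
        """Return the location of every regular expression match."""
        return [RIndex(m.start()) for m in re.finditer(pattern, self.text)]

    def at(self, index, length):
        """Direct indexing: return `length` characters starting at `index`."""
        return self.text[index.offset:index.offset + length]


class LexiconFile(RFile):
    """Hypothetical subclass; it inherits the base definition of a word."""

    def terms(self):
        return [word.lower() for _, word in self.words()]
```

Because subclasses never redefine WORD or SENTENCE, changing either pattern in the base class changes navigation for every file type, and callers that only pass RIndex values around are unaffected if the offset representation changes.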

RECAP Analysis

The RECAP engine is used to identify and index all potential requirements according to the process template patterns and lexicon. The RECAP engine searches both the text and the index file. The text is searched to identify original requirements and to partition the requirements set. The index file is searched to find the attributes and markings based on the conditional in the rule set.

To conduct analysis of requirements once they are partitioned by requirement capture view, a set of rules must be applied. An interpreted language has been defined for use in RECAP which allows the user to define rules based on a set of functions which the RECAP engine can evaluate. A conditional evaluates into two parts: a boolean condition and the matching requirements. The matching requirements are passed along to the action section of the engine. The boolean condition determines which action, if any, is taken. Functions which can be evaluated are: CONTAINS, IS ALSO, HAS VALUE, HAS SUBJECT, TRUE, FALSE.

Some functions evaluate to no matching requirements but can have a boolean condition and affect the action taken. The rules can be applied to any section of the requirements data. A wildcard is available to apply a rule or rule set to all requirements; however, a rule set is often applied to all requirements within a particular view or domain. Some examples of rules are provided below.

REQUIREMENTS TEXT
    Acoustic detection system will have a minimum probability of detection of .9 to ensure a high degree of confidence that no targets pass undetected

DOMAIN RULE
    IF [signal processing] CONTAINS "Pd" and NOT CONTAINS "Pfa" THEN incomplete ENDIF

This rule executes on all requirements in the signal processing domain and checks to ensure that any requirement which specifies a Pd also specifies a Pfa. The terms Pd (probability of detection) and Pfa (probability of false alarm) are referenced into a lexicon so all synonyms are evaluated equally.

GENERAL REQUIREMENTS RULE
    IF probability HAS VALUE > 1 THEN incorrect ENDIF

This rule will mark any probabilities with a value greater than 1 as incorrect. Requirements can be marked more than once, and one rule can have multiple results. Once a document is marked based on the rules, a user may wish to query the requirements data based on markings. The user may want to find all requirements that belong to the signal processing domain and have been labeled incomplete by RECAP. RECAP has a dialog box which allows for this type of query, and, in this case, a user would enter signal processing AND incomplete. RECAP would respond with the set of requirements that are in the signal processing domain and have been labeled incomplete by earlier analysis. If upon inspection a user determines that a requirement is not incomplete, an annotation is made which indicates that the conflict has been resolved or that no conflict exists; a text field is referenced where the user may enter a justification for that change.

Discussion

While RECAP provides a robust facility for the parsing of requirements and the generation and application of rules, further work is needed to support more complex functions and the acquisition and use of domain expertise. We will continue to exploit the use of domain knowledge in dealing with complex system requirements. The RECAP prototype is currently in use at Trident Systems Incorporated and is being applied to the Engineering of Complex Systems sample problem. The prototype version is being used to develop general and domain rules in several important areas. RECAP development, research, and testing will continue throughout the next year. Recent advances in requirements engineering and traceability will be incorporated, and a new release is planned for 1996.

References

[1] Edwards, Michael & White, Stephanie, "RE-VIEWS: A Requirements Structure and Views," NCOSE, 1994.

[2] Brill, Eric, "Some Advances in Transformation-Based Part of Speech Tagging," Twelfth National Conference on AI, 1994.
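
As a supplementary illustration of the two example rules and the marking query described in the analysis section, the following Python sketch applies them to a small requirements set. It is a sketch under stated assumptions and does not reproduce the RECAP rule interpreter: the Requirement record, the LEXICON contents, the regular expression used to stand in for HAS VALUE, and all function names are hypothetical, and the rules are hard-coded rather than parsed from the rule language.

```python
import re
from dataclasses import dataclass, field

# Hypothetical lexicon: each term maps to the phrases treated as synonyms.
LEXICON = {
    "Pd": ["pd", "probability of detection"],
    "Pfa": ["pfa", "probability of false alarm"],
}


@dataclass
class Requirement:
    text: str
    domains: set
    marks: set = field(default_factory=set)


def contains(req, term):
    """CONTAINS: true if the text mentions the term or any of its synonyms."""
    text = req.text.lower()
    synonyms = LEXICON.get(term, [term.lower()])
    return any(re.search(r"\b" + re.escape(s) + r"\b", text) for s in synonyms)


def probability_values(req):
    """Crude stand-in for HAS VALUE: first number after each 'probability'."""
    found = re.findall(r"probability[^.\d]*?(\d*\.?\d+)", req.text.lower())
    return [float(v) for v in found]


def domain_rule(reqs):
    """IF [signal processing] CONTAINS "Pd" and NOT CONTAINS "Pfa" THEN incomplete."""
    for r in reqs:
        in_domain = "signal processing" in r.domains
        if in_domain and contains(r, "Pd") and not contains(r, "Pfa"):
            r.marks.add("incomplete")


def general_rule(reqs):
    """IF probability HAS VALUE > 1 THEN incorrect."""
    for r in reqs:
        if any(v > 1 for v in probability_values(r)):
            r.marks.add("incorrect")


def query(reqs, domain, mark):
    """Query such as: signal processing AND incomplete."""
    return [r for r in reqs if domain in r.domains and mark in r.marks]
```

Applied to the acoustic detection requirement quoted in the paper (assumed here to belong to the signal processing domain), the domain rule marks it incomplete because it states a Pd without a Pfa, and the query for signal processing AND incomplete returns it.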

