
EA-Miner: a Tool for Automating Aspect-Oriented Requirements Identification

Américo Sampaio
Lancaster University
Computing Department, InfoLab21 Building, South Drive, Lancaster University, Lancaster LA1 4WA, UK
00 44 (0) 1524 510346
a.sampaio@comp.lancs.ac.uk

Ruzanna Chitchyan
Lancaster University
Computing Department, InfoLab21 Building, South Drive, Lancaster University, Lancaster LA1 4WA, UK
00 44 (0) 1524 510499
rouza@comp.lancs.ac.uk

Awais Rashid, Paul Rayson
Lancaster University
Computing Department, InfoLab21 Building, South Drive, Lancaster University, Lancaster LA1 4WA, UK
00 44 (0) 1524 510316
{marash, paul}@comp.lancs.ac.uk

ABSTRACT
Aspect-oriented requirements engineering helps to achieve early separation of concerns by supporting systematic analysis of broadly-scoped properties such as security, real-time constraints, etc. The early identification and separation of aspects and base abstractions crosscut by them helps to avoid costly refactorings at later stages such as design and code. However, if not handled effectively, the aspect identification task can become a bottleneck requiring a significant effort due to the large amount of, often poorly structured or imprecise, information available to a requirements engineer. In this paper, we describe a tool, EA-Miner, that provides effective automated support for identifying and separating aspectual and non-aspectual concerns as well as their crosscutting relationships at the requirements level. The tool utilises natural language processing techniques to reason about the properties of the concerns and model their structure and relationships.

Categories and Subject Descriptors
D.2.1 [Requirements/Specifications]: tools. I.7.5 [Document Capture]: document analysis. K.6.3 [Software Management]: software development, software process.

General Terms
Management.

Keywords
Aspect oriented requirements engineering, AORE, aspect-oriented software development, AOSD, aspect mining and identification, tool support, automation, concern identification.

1. INTRODUCTION
Recently, some researchers [1-3] proposed the adoption of aspect-oriented requirements engineering (AORE) as an effective approach to deal with requirements engineering tasks. The benefits brought by aspect-oriented approaches were first evidenced at the implementation level and recently at other stages, such as requirements and design.

Aspects are abstractions used to modularise crosscutting concerns, e.g., security, distribution, real-time constraints, etc., in software development. AORE helps to effectively achieve separation of crosscutting concerns at the requirements level and results in improved modularity of requirement artefacts, facilitating their creation, composition/decomposition and evolution [1-3]. Moreover, early identification and separation of aspects and base abstractions crosscut by them helps to avoid costly refactorings at later stages, such as design and code.

One crucial point in adopting AORE is having an effective and scalable mechanism to deal with its most complex and time consuming activities, such as identification and analysis of base and crosscutting concerns and their crosscutting relationships. Identifying crosscutting concerns in requirements documents of varying size and structure is a complex task due to the massive amount of information and the tangled nature of requirements sources (e.g., interview transcripts, legacy documentation, manuals, etc.). Even structured requirements specifications written with non-AORE methods (e.g., use case descriptions) can be hard to analyse for the purpose of identifying early aspects because they can also contain scattered and tangled concerns in a large volume of information.

Some research has focused on the problem of providing tool support for identifying crosscutting concerns, most notably Theme/Doc [2]. It provides a tool for semi-automatic identification of crosscutting behaviours in requirements specifications. However, with this approach a developer has to manually provide a list of action words from the input document, implying that s/he has to previously read the complete set of documents to scan for these words. Considering, for example, that the rate of reading is between 250 and 350 words per minute, a person would take roughly 15 to 20 minutes just to read a 5,000-word document and would then spend more time deciding which words to add to the action words list and adding them to the tool. Despite the important contributions of this approach, it does not scale well for complex projects with large documents.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ASE’05, November 7–11, 2005, Long Beach, California, USA.
Copyright 2005 ACM 1-58113-993-4/05/0011...$5.00.

This paper describes a tool (EA-Miner) that provides effective automated support for identifying and separating aspectual (called early aspects) and non-aspectual concerns as well as their crosscutting relationships at the requirements level. The tool utilises natural language processing techniques (part-of-speech and semantic tagging, word frequencies, etc.) to reason about the properties of the concerns and model their structure and crosscutting relationships. The remainder of this paper is structured as follows: Section 2 shows our approach for identifying aspects while Section 3 describes in more detail how the tool works. Finally, Section 4 concludes the paper.

2. EA-MINER TOOL AND APPROACH
The EA-Miner tool uses the WMATRIX [5] natural language processor to pre-process the input documents and get relevant information. WMATRIX uses part-of-speech and semantic tagging, frequency analysis and concordances to identify concepts of potential significance in the domain. Part-of-speech analysis automates the extraction of syntactic categories from the text (e.g., nouns, verbs).

EA-Miner is a web-based tool that utilises NLP features from WMATRIX to identify base concerns, early aspects and crosscutting relationships between them. For the purposes of the tool, a base concern is any matter of interest at the requirements level that can be decomposed into a unit of the chosen decomposition model. For example, if the chosen decomposition model is viewpoint-oriented, a base concern is a viewpoint, while if the technique used is UML-based, a base concern can be a use case or a scenario (at a finer-grained level). An early aspect is a requirements-level aspect: an entity that semantically crosscuts a concern. The early aspect modularises responsibilities that would otherwise be scattered across several concerns or tangled inside one concern if the aspectual model was not applied. Therefore, early aspects provide a means for modularising the crosscutting behaviour at the requirements level and applying it to the base concerns using a composition mechanism.

EA-Miner helps to achieve separation of concerns at the requirements level by modularising crosscutting concerns as early aspects. This helps to improve requirements maintainability by facilitating change management.

Figure 1 shows a high level view of the EA-Miner architecture to highlight how it has been designed to be model independent so that different techniques can be supported. Currently, there is support for the viewpoint-based AORE model in [1].

The Web-based GUI component is a normal web browser (e.g., Internet Explorer, Mozilla) that handles GUI interactions with the user. The Controller is responsible for checking the state of the tagging process in WMATRIX and instantiating the specific internal model parser (e.g., Viewpoint parser), which parses the requirements document and generates the specific internal model. After the internal model is created, the web-based GUI can use the view generator component to produce a specific view (e.g., viewpoint-based view as in Figure 5) for the user.

Figure 1 - EA-Miner high level architecture (components shown: Web-based GUI; View Generator with Viewpoint View and Other Model View; Controller; WMATRIX; Requirements to internal model parser with Viewpoint parser and Other Model parser; Internal model with Viewpoint and Other models)

Besides model independence, some other advantages of using EA-Miner are as follows:
• It helps to automate time consuming and error-prone activities, such as identification of concerns, early aspects and their relationships.
• It can easily provide support for model refinement and specification generation.
• It enables the requirements engineer to get a quick understanding of the system by showing the key abstractions and relevant information.

Having said that, it is important to present a more detailed explanation of how the identification of base concerns, early aspects and crosscutting relationships is carried out in a requirements engineering approach using EA-Miner. Figure 2 presents a detailed view of how the tool works within the requirements engineering process.

The four main RE activities supported by EA-Miner are elicitation, identification, presentation of results and screening out of irrelevant abstractions. EA-Miner supports the elicitation activity by helping the requirements engineer to focus on specific parts of the input documents while the following activities are executed, and by assisting him/her in getting a quick understanding of the system.

In the identification activity, EA-Miner parses the input files (of different types and structures) provided by the engineer and sends them to WMATRIX for part-of-speech (POS) and semantic tagging. WMATRIX then generates an XML file consisting of the input file with POS and semantic annotations. This file is passed to EA-Miner in order to produce an internal representation of the file as Java objects. This internal model is produced for a specific technique (e.g., viewpoint, scenario based, etc.) selected previously by the user. In the presentation of results activity, the requirements engineer can view the internal model in different ways (e.g., diagrams, textual representations, etc.) represented in the specific AORE model selected previously (e.g., viewpoint-based). During the screening out and specification generation activity, the previous AORE model can be refined by discarding irrelevant abstractions previously identified or by adding new ones. Moreover, the tool helps to generate part of the requirements specification document by translating the refined model into several formats such as XML, DOC, etc. Due to lack of space, in Section 3 we will focus on describing in detail the identification task of Figure 2, which represents the core of the approach.
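
The model-independent design of Figure 1 can be illustrated with a small Java sketch in which a controller picks the internal model parser that matches the technique selected by the user. This is only a minimal illustration under our own assumptions, not EA-Miner's actual code: the names ArchitectureSketch, InternalModel, ModelParser, ViewpointParser and Controller, as well as the string keys used to select a technique, are hypothetical, and the placeholder parser simply treats each sentence as one abstraction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ArchitectureSketch {

    /** Technique-specific internal model (hypothetical interface). */
    interface InternalModel {
        String technique();
        List<String> abstractions();
    }

    /** Builds an internal model from tagged requirement sentences (hypothetical interface). */
    interface ModelParser {
        InternalModel parse(List<String> taggedSentences);
    }

    /** Placeholder viewpoint parser: treats each sentence as one abstraction. */
    static class ViewpointParser implements ModelParser {
        @Override
        public InternalModel parse(List<String> taggedSentences) {
            final List<String> copy = new ArrayList<>(taggedSentences);
            return new InternalModel() {
                @Override public String technique() { return "viewpoint"; }
                @Override public List<String> abstractions() { return copy; }
            };
        }
    }

    /** Selects the parser registered for the technique the user chose. */
    static class Controller {
        private final Map<String, ModelParser> parsers = new HashMap<>();

        void register(String technique, ModelParser parser) {
            parsers.put(technique, parser);
        }

        InternalModel buildModel(String technique, List<String> taggedSentences) {
            ModelParser parser = parsers.get(technique);
            if (parser == null) {
                throw new IllegalArgumentException("No parser registered for " + technique);
            }
            return parser.parse(taggedSentences);
        }
    }

    public static void main(String[] args) {
        Controller controller = new Controller();
        controller.register("viewpoint", new ViewpointParser());
        InternalModel model = controller.buildModel("viewpoint",
                Arrays.asList("The gizmo is read by the toll gate.", "Vehicles must be authorised."));
        System.out.println(model.technique() + ": " + model.abstractions().size() + " abstractions");
    }
}
```

Under this arrangement, supporting another technique (e.g., a scenario-based model) would only require registering one more parser with the controller, which is the sense in which the design is model independent.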

Figure 2 - Approach supported by EA-Miner

3. HOW IT WORKS
In Figure 3 EA-Miner displays the file used as input that is further tagged by WMATRIX. The file contains a partial description of the requirements of a toll collection system for the Portuguese highways [1].

Figure 3 - Simple description of a system

Figure 4(a) shows a list of some nouns identified by WMATRIX. The bottom of Figure 4(a) displays the concordance of the word “vehicle” so that the user can inspect its surrounding context in the text. Figure 4(b) shows how semantic analysis groups related words and multi-word expressions into semantic fields, e.g., “M3” or “vehicles and transport on land”, and also provides statistical data such as LL (log-likelihood) showing that this is the most significant category. The “permission category” is highlighted to show that it contains the word “authorised” that might be related to the non-functional requirement “security”. From this example we can see how the information produced by WMATRIX can be used by EA-Miner to make inferences about the identification of base concerns (e.g., nouns can be used to identify viewpoint candidates such as vehicle and gizmo) and crosscutting concerns (e.g., words whose meaning pertains to a non-functional concept such as “authorised” meaning permission).

WMATRIX produces an XML file containing part-of-speech and semantic tagging for each word of the input file. The main output of the identification task, the internal representation, starts to be built as soon as WMATRIX tagging is finished. EA-Miner detects this and starts parsing the XML file generated by WMATRIX. The XML file is structured in sentences (<s> </s>) containing groups of words with part-of-speech and semantic annotations. For instance, (<w pos="JJ" sem="S7.4+"> authorised </w>) means that the word “authorised” is tagged with the “JJ” part-of-speech tag, which represents “general adjective”, and the semantic tag “S7.4+”, which represents permission (Figure 4(b)).

Figure 4 (a): Part of speech tagging in WMATRIX; (b): Semantic tagging in WMATRIX

While the XML file is parsed, the Java objects are created and populated based on the selected technique (e.g., viewpoint-based). Note that each abstraction contains specific heuristics for its identification in a given model (e.g., each noun is a viewpoint candidate in the viewpoint model). Each sentence is treated as a requirement decomposition unit that can be used later on during abstraction identification. This approach is common in NLP-based requirements engineering, as can be seen in [7,8]. For abstraction identification (e.g., base concerns, early aspects, and crosscutting relationships) the tool currently implements the AORE model based on viewpoints [1]. Below, for each of the abstractions in the model, we describe the heuristics for identifying it.

Base Concern: In this model the base concerns are represented by the viewpoints. The basic rule for identifying them is: if the word's POS tag represents a noun (pos tag = “NN1” or “NN2”) then the word is selected and a new Viewpoint object is created in the Java model. The Viewpoint class contains a Java collection that represents the requirements of that viewpoint. Each sentence that contains the identified viewpoint (e.g., gizmo) is then instantiated as a “Requirement” object and added to the collection. The problem with the list of viewpoints (Figure 5) is that, depending on its size, a document can contain numerous nouns. Also, the noun-based identification leads to classification of the same concepts as different viewpoints due to variations, such as “driver” and “drivers”. These problems are minimised with some adjustments done in the “Screening out” activity, such as using lemmatisation to recognise words that have the same root (driver and drivers) and also using a dictionary of synonyms to catch
different words with the same meaning. Moreover, the frequency lists produced by WMATRIX can be used to list only the most significant candidates in the context of the requirements document. Figure 5 shows, on the top left hand side, a list of some candidate viewpoints identified by EA-Miner. When the user clicks on a specific viewpoint, the right hand side (the Java collection of Requirement objects) is updated. For the example shown, the “gizmo” word occurs in sentences 3, 5 and 6.

Figure 5 - Identified Abstractions

Early Aspect: The early aspect identification is based on a domain specific lexicon (XML file) that was built by observing non-functional related words (e.g., the “authorised” word with the meaning of permission) existing in the NFR framework catalogues [4]. The task of the tool is to check whether each word in the document is “equalTo” an NFR concept. The “equalTo” procedure is defined as: a word is lexically equal, ignoring case and suffixes, to the word in the lexicon AND the word has the same semantic class as a word in the lexicon. The comparison after the AND avoids identifying words in the text that have the same spelling but are used with a completely different meaning (e.g., the word performance can be used to indicate a constraint on software or to indicate the act of a dancer or artist in a show). The “EarlyAspect” object also contains a collection of “Requirement” objects representing sentences where the aspect occurs.

Crosscutting Relationships: When the user clicks on an early aspect candidate (e.g., the “authorised” word of type security) from the list on the left hand side, the viewpoints list in the centre is updated to show which viewpoints that specific early aspect crosscuts. This is currently identified by verifying, for each viewpoint, the intersection of the set of its requirements (e.g., sentences {1, 4, 10} for “vehicles”) with the set of the early aspect requirements (e.g., sentences {1, 4, 8} for the “authorised” word). The result of the intersection, {1, 4}, is used to determine the crosscutting points. During the screening out and specification generation activity, the models produced in the previous activities can be refined to exclude irrelevant abstractions or to add relevant ones that were not found. It will be common at this step to add/exclude specific requirements to/from the early aspects because the ones contained in their collections are more likely to pertain to the viewpoints but are helpful to indicate the crosscutting points. Also, it is likely that different viewpoints will contain the same requirements, but this can also be detected by the tool and corrected by the requirements engineer until the model is refined enough to generate the requirements specification in different formats.

4. CONCLUSIONS
Aspect-oriented requirements engineering (AORE) provides an effective way for modularising concerns, aspectual requirements (early aspects) and their crosscutting relationships. However, the variety of requirements related documents (e.g., interview transcripts, user manuals, legacy documents) of varying size and structure imposes serious difficulties on the task of identifying and modularising the crosscutting properties.

Therefore, to effectively support this task it is imperative to provide automated support to reduce the burden of identifying and structuring base concerns, aspects and their crosscutting relationships. In this paper we presented the EA-Miner tool and described how the tool automates parts of the requirements process, helping to reduce the complexity and effort required to perform these tasks. Regarding scalability, we have tested the approach with several requirements documents and the tool takes just a few minutes to process and display the results, even for large documents (tens of thousands of words).

Our future work will focus on further improving the identification of early aspects (both functional and non-functional) as well as the screening out functionalities to further minimise the effort of the requirements engineer. Also, we will investigate how other approaches [2,3] can be supported by the tool and how the tool can be extended to address other activities of later stages of development such as architecture and design.

5. ACKNOWLEDGMENTS
This work is supported by European Commission grant IST-2-004349: European Network of Excellence on Aspect-Oriented Software Development (AOSD-Europe), 2004-2008.

6. REFERENCES
[1] A. Rashid, A. Moreira, and J. Araujo, "Modularisation and Composition of Aspectual Requirements," presented at 2nd International Conference on Aspect Oriented Software Development (AOSD), Boston, USA, 2003.
[2] E. Baniassad and S. Clarke, "Theme: An Approach for Aspect-Oriented Analysis and Design," presented at International Conference on Software Engineering, Edinburgh, Scotland, UK, 2004.
[3] J. Whittle and J. Araujo, "Scenario Modeling with Aspects," IEE Proceedings - Software, vol. 151, pp. 157-172, 2004.
[4] L. Chung, B. A. Nixon, E. Yu, and J. Mylopoulos, Non-Functional Requirements in Software Engineering: Kluwer Academic Publishers, 2000.
[5] P. Sawyer, P. Rayson, and R. Garside, "REVERE: Support for Requirements Synthesis from Documents," Information Systems Frontiers, vol. 4, pp. 343-353, 2002.
[6] A. Sampaio, N. Loughran, A. Rashid, and P. Rayson, "Mining Aspects in Requirements," presented at Early Aspects 2005: Aspect-Oriented Requirements Engineering and Architecture Design Workshop (held with AOSD 2005), Chicago, Illinois, USA, 2005.
[7] V. Ambriola and V. Gervasi, "Processing natural language requirements," presented at International Conference on Automated Software Engineering, Los Alamitos, 1997.
[8] L. Goldin and D. Berry, "AbstFinder: A Prototype Natural Language Text Abstraction Finder for Use in Requirements Elicitation," Automated Software Engineering, vol. 4, 1997.
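
As an illustration of the identification heuristics described in Section 3 (noun-based viewpoint candidates, the “equalTo” lexicon test, and crosscutting detection by set intersection), the following Java sketch walks through them on a tiny tagged input. It is a sketch under stated assumptions rather than the tool's implementation: the class and method names are hypothetical, the sample XML (including its root element and the tags given to each word) is made up for the example, sentences are assumed to be numbered from 1 in document order, and the regular expression in stem is only a crude stand-in for the suffix handling mentioned in the text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class IdentificationSketch {

    /** One tagged word: text, part-of-speech tag, semantic tag and sentence number. */
    static final class Word {
        final String text, pos, sem;
        final int sentence;
        Word(String text, String pos, String sem, int sentence) {
            this.text = text; this.pos = pos; this.sem = sem; this.sentence = sentence;
        }
    }

    /** Reads <s> sentences and their <w pos=... sem=...> words; sentences are numbered from 1. */
    static List<Word> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<Word> words = new ArrayList<>();
        NodeList sentences = doc.getElementsByTagName("s");
        for (int i = 0; i < sentences.getLength(); i++) {
            NodeList ws = ((Element) sentences.item(i)).getElementsByTagName("w");
            for (int j = 0; j < ws.getLength(); j++) {
                Element w = (Element) ws.item(j);
                words.add(new Word(w.getTextContent().trim(),
                        w.getAttribute("pos"), w.getAttribute("sem"), i + 1));
            }
        }
        return words;
    }

    /** Base concerns: every NN1/NN2 noun is a viewpoint candidate, mapped to its sentences. */
    static Map<String, Set<Integer>> viewpoints(List<Word> words) {
        Map<String, Set<Integer>> result = new LinkedHashMap<>();
        for (Word w : words) {
            if (w.pos.equals("NN1") || w.pos.equals("NN2")) {
                result.computeIfAbsent(w.text.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                      .add(w.sentence);
            }
        }
        return result;
    }

    /** Crude suffix stripping; a real tool would rely on proper lemmatisation. */
    static String stem(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("(ed|ing|e|s)$", "");
    }

    /** "equalTo": same spelling ignoring case and suffixes AND same semantic tag. */
    static boolean equalTo(Word w, String lexiconWord, String lexiconSem) {
        return stem(w.text).equals(stem(lexiconWord)) && w.sem.equals(lexiconSem);
    }

    /** Early aspect: the sentences containing a word that is equalTo the lexicon entry. */
    static Set<Integer> earlyAspect(List<Word> words, String lexiconWord, String lexiconSem) {
        Set<Integer> result = new TreeSet<>();
        for (Word w : words) {
            if (equalTo(w, lexiconWord, lexiconSem)) result.add(w.sentence);
        }
        return result;
    }

    /** Crosscutting points: sentences shared by a viewpoint and an early aspect. */
    static Set<Integer> crosscut(Set<Integer> viewpoint, Set<Integer> aspect) {
        Set<Integer> shared = new TreeSet<>(viewpoint);
        shared.retainAll(aspect);
        return shared;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input in the <s>/<w> shape described in Section 3.
        String xml = "<doc>"
            + "<s><w pos=\"JJ\" sem=\"S7.4+\"> authorised </w><w pos=\"NN2\" sem=\"M3\">vehicles</w></s>"
            + "<s><w pos=\"NN1\" sem=\"O2\">gizmo</w></s>"
            + "<s><w pos=\"NN2\" sem=\"M3\">vehicles</w></s>"
            + "</doc>";
        List<Word> words = parse(xml);
        Set<Integer> vehicles = viewpoints(words).get("vehicles");
        Set<Integer> security = earlyAspect(words, "authorise", "S7.4+");
        System.out.println("vehicles   -> " + vehicles);
        System.out.println("authorised -> " + security);
        System.out.println("crosscut   -> " + crosscut(vehicles, security));
    }
}
```

Applied to the sets used as an example in Section 3, crosscut would return {1, 4} for the viewpoint sentences {1, 4, 10} and the early aspect sentences {1, 4, 8}. Keeping sentence numbers in sorted sets makes this intersection a single retainAll call, and the semantic tag check in equalTo is what keeps a word with the same spelling but a different meaning out of the early aspect list.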
