Software Productivity Consortium
Herndon, VA 22070
ABSTRACT
Many methods for representing software components for reuse have been proposed.
These include traditional library and information science methods, knowledge based
methods, and hypertext. There has been no empirical evaluation of these methods,
and consequently there is no data about their relative costs and effectiveness. Proteus
is an experimental reuse library system that will be used to help gather such data.
Reuse library systems are usually tied to one method; Proteus supports multiple
methods.
1. INTRODUCTION
One of the many problem areas in software reuse concerns the representation of
reusable components. Specifically, how should one represent reusable software
components so that they can be found, understood, and perhaps automatically
synthesized? These components will be not only code, but also other software
lifecycle objects such as requirements, designs, test cases, and perhaps general
knowledge about software products and processes. The last few years have seen a
proliferation of methods proposed for representing reusable software, and the
concomitant development of systems to support those methods. These methods
are drawn from three major areas: library and information science, artificial
intelligence, and hypertext technology [FRA89,90].
While many representations have been proposed for reuse, little empirical
evaluation of them has been done. Currently, the relative costs and effectiveness
of these representation methods in helping a user to find, understand, and
synthesize reusable components are unknown. There is therefore a clear need for
empirical research to provide this information to software practitioners.
The first experimental method we are considering is 'known item' search. For this
approach, each subject will be given a list of descriptions for items in the database,
and be randomly assigned to one of the representation methods for searching. The
dependent variables will be the number of items correctly located, and the time it
takes to find the items.
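
The known-item design described above can be sketched in a few lines of Python. This is an illustration only, not the study's actual protocol: the names `assign_subjects` and `score_session`, the method labels, and the choice to balance group sizes by dealing a shuffled subject list round-robin are all assumptions made for the example.

```python
import random

# Hypothetical labels for the four representation methods.
METHODS = ["keyword", "enumerated-graph", "faceted", "attribute-value"]

def assign_subjects(subjects, methods, seed=None):
    """Randomly assign each subject to one method, keeping group sizes balanced."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects round-robin across the methods.
    return {s: methods[i % len(methods)] for i, s in enumerate(shuffled)}

def score_session(target_items, found_items, elapsed_seconds):
    """Compute the two dependent variables for one subject's session."""
    correct = len(set(target_items) & set(found_items))
    return {"items_correct": correct, "seconds": elapsed_seconds}
```

For example, `score_session(["a", "b", "c"], ["a", "c", "x"], 120.5)` counts two correctly located items and records the elapsed time alongside that count.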
3. PROTEUS OVERVIEW
Figure 1 is the first user screen in Proteus. The user selects one or more available
databases from the Databases window, and a representation/search method from
the Methods window. Figure 2 shows the interface for keyword search. The top
window displays messages to the user. The Keywords-in-database window displays
all of the keywords in the database. Below it is a menu of operators that a user can
select with a mouse to construct queries. The List-of-previous-queries window
displays the search history. The one below it displays the items retrieved by the
currently selected query. The window on the bottom is where the user enters
queries. Queries use standard Boolean operators, prefix or infix notation, and
truncation.
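
As a rough illustration of Boolean keyword queries with truncation, the sketch below evaluates prefix-notation queries against a small keyword index. It does not reproduce Proteus's query syntax: the nested-tuple query form, the trailing `*` as the truncation marker, the function names, and the sample index are all assumptions made for the example.

```python
def matches_term(term, keywords):
    """True if some keyword matches the term; a trailing '*' means truncation."""
    if term.endswith("*"):
        stem = term[:-1]
        return any(k.startswith(stem) for k in keywords)
    return term in keywords

def evaluate(query, keywords):
    """Evaluate a prefix-notation query against one item's keyword set."""
    if isinstance(query, str):
        return matches_term(query, keywords)
    op, *args = query
    if op == "and":
        return all(evaluate(a, keywords) for a in args)
    if op == "or":
        return any(evaluate(a, keywords) for a in args)
    if op == "not":
        if len(args) != 1:
            raise ValueError("'not' takes exactly one argument")
        return not evaluate(args[0], keywords)
    raise ValueError("unknown operator: %r" % (op,))

def search(query, index):
    """Return the names of all items whose keywords satisfy the query."""
    return sorted(item for item, kws in index.items() if evaluate(query, kws))

# Invented index: item name -> set of keywords (not taken from a Proteus database).
index = {
    "LIST-SORT": {"LIST", "SORT", "ORDER"},
    "ALPHA-SORT-TWO": {"SORT", "ORDERING"},
    "INSERT-IN-ORDER-ALPHA": {"INSERT", "ORDER"},
}
```

With this index, `search(("and", "SORT", "ORD*"), index)` retrieves the two sort routines, because the truncated term `ORD*` matches both `ORDER` and `ORDERING`.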
Figure 3 shows the interface for the enumerated graph search. The top window
displays user messages. The Current-root-class window displays the node where
the user is currently positioned. Below that, the subclasses are displayed. To the
right are displayed the super-classes of the current class, and the parts in the
current node. We plan to replace this interface with a purely graphical one.
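
The navigation described above (a current node with its sub-classes, super-classes, and parts) can be modelled with a small class graph. The sketch below is a minimal illustration under assumed names: `ClassNode`, `add_subclass`, and `all_parts` are hypothetical and do not describe the Proteus implementation. The `all_parts` helper additionally collects the parts of a class together with those of its descendant classes.

```python
class ClassNode:
    """One node in an enumerated classification graph."""

    def __init__(self, name):
        self.name = name
        self.superclasses = []
        self.subclasses = []
        self.parts = []  # names of components classified directly here

    def add_subclass(self, child):
        """Link child below this node and return it."""
        self.subclasses.append(child)
        child.superclasses.append(self)
        return child

def all_parts(node, _seen=None):
    """Parts of node and of every descendant class, visiting each class once."""
    seen = set() if _seen is None else _seen
    if id(node) in seen:
        return []
    seen.add(id(node))
    found = list(node.parts)
    for child in node.subclasses:
        found.extend(all_parts(child, seen))
    return found
```

A user positioned at a node would see `node.subclasses`, `node.superclasses`, and `node.parts`; moving down the graph amounts to making one of the sub-classes the new current node.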
Figure 4 shows the interface to the faceted classification method. A spreadsheet
is a good way to represent and search a faceted classification, and it is the approach
we have used. The top window is for user messages, as in the other methods. The next
window contains the facet names, one per column. Below this are the terms in
each of the facets. The windows below the facets window display the current facet
name, facet term, and the current part. Other windows provide the ability to sort
the terms within a facet, to center the spreadsheet on a given term, and to retrieve
information about the current part.
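
The spreadsheet view of a faceted classification can be approximated as a table with one column per facet. The sketch below assumes that each part carries exactly one term per facet. The facet names, the sample parts, and the functions `facet_columns` and `parts_with` are invented for illustration.

```python
def facet_columns(table):
    """Return {facet name: sorted list of distinct terms}, one column per facet."""
    columns = {}
    for terms in table.values():
        for facet, term in terms.items():
            columns.setdefault(facet, set()).add(term)
    return {facet: sorted(terms) for facet, terms in columns.items()}

def parts_with(table, wanted):
    """Parts whose classification matches every facet/term pair in wanted."""
    return sorted(
        part for part, terms in table.items()
        if all(terms.get(facet) == term for facet, term in wanted.items())
    )

# Invented classification: part name -> {facet: term}.
table = {
    "LIST-SORT": {"FUNCTION": "SORT", "OBJECT": "LIST", "LANGUAGE": "LISP"},
    "ALPHA-SORT-TWO": {"FUNCTION": "SORT", "OBJECT": "STRING", "LANGUAGE": "LISP"},
    "STACK-PUSH": {"FUNCTION": "INSERT", "OBJECT": "STACK", "LANGUAGE": "ADA"},
}
```

Here `facet_columns(table)` gives the sorted terms under each facet heading, and `parts_with(table, {"FUNCTION": "SORT"})` returns the parts filed under one term.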
Figure 5 shows the interface for the attribute-value method. The top window, as
before, is for user messages. The attributes available window lists all attributes
available for the selected database(s). The values window lists all values for the
current attribute. A search is specified by selecting sets of attribute-value pairs.
In figure 5, the pairs AUTHOR = W WOODFORD and LANGUAGE = ADA
have been selected. The Parts that match pattern window shows that two parts
match the search.
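
An attribute-value search of this kind can be sketched as a filter over per-part records. The sketch assumes that a part matches only when every selected pair matches. The part names and records are invented, and the function names are hypothetical. Only the pairs AUTHOR = W WOODFORD and LANGUAGE = ADA come from the text.

```python
def attributes_available(parts):
    """All attribute names used by any part (the attributes list)."""
    return sorted({attr for record in parts.values() for attr in record})

def values_for(parts, attribute):
    """All distinct values recorded for one attribute (the values list)."""
    return sorted({rec[attribute] for rec in parts.values() if attribute in rec})

def parts_matching(parts, pattern):
    """Parts whose record contains every (attribute, value) pair in pattern."""
    return sorted(
        name for name, rec in parts.items()
        if all(rec.get(attr) == value for attr, value in pattern)
    )

# Invented records: part name -> {attribute: value}.
parts = {
    "PART-A": {"AUTHOR": "W WOODFORD", "LANGUAGE": "ADA"},
    "PART-B": {"AUTHOR": "W WOODFORD", "LANGUAGE": "ADA"},
    "PART-C": {"AUTHOR": "W WOODFORD", "LANGUAGE": "FORTRAN"},
    "PART-D": {"AUTHOR": "SOMEONE ELSE", "LANGUAGE": "ADA"},
}

pattern = [("AUTHOR", "W WOODFORD"), ("LANGUAGE", "ADA")]
```

With these records, `parts_matching(parts, pattern)` selects `PART-A` and `PART-B`, the only parts that satisfy both pairs.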
4. DESIGN AND IMPLEMENTATION OF PROTEUS
To support these various methods, the Proteus system has been designed in a
three-layer architecture (see figure 6). Software component data resides in the
Information Retrieval (IR) Layer. The IR layer is the lowest level of the Proteus
library system. The next layer up, the Reusable Functions Layer (RF), consists of
methods to search, organize, change, and perform any other tasks required to
manipulate information stored in the IR layer. The third and highest layer is the
Library Representation (LR) Layer. The LR layer consists of application
programs that are independent implementations of library representations. They
can each access the same information about the components, but each represents
those components in a different way.
4.1 THE INFORMATION RETRIEVAL (IR) LAYER
The IR layer does not relate the components to each other; it only makes data
about each object available. Retrieval of data about a part consists of asking that
part to provide data about itself. No data item of any object is easier to retrieve
than any other datum about any other object. Data about multiple components,
relationships between components, components with similar attributes, or
anything else beyond the part under examination is not available
to the part object. This separation encapsulates the task of retrieving component
data in the IR layer, and the tasks of relating components and data about them
in the RF layer.
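
The division of labor between the two layers can be illustrated with a small sketch. It assumes a hypothetical `Part` class standing in for an IR-layer object and a hypothetical `parts_sharing` function standing in for an RF-layer routine. Neither name comes from Proteus.

```python
class Part:
    """IR-layer object: holds data about one component and nothing else."""

    def __init__(self, name, **data):
        self.name = name
        self._data = dict(data)

    def get(self, item):
        """Answer a question about this part only; None if the item is unknown."""
        return self._data.get(item)

def parts_sharing(parts, item, value):
    """RF-layer routine: relate parts by asking each one about itself."""
    return [p.name for p in parts if p.get(item) == value]

# Invented component data for the example.
library = [
    Part("A", language="ADA"),
    Part("B", language="LISP"),
    Part("C", language="ADA"),
]
```

A `Part` has no way to see its neighbours, so any question that spans several components, such as `parts_sharing(library, "language", "ADA")`, has to be answered one layer up.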
4.2 THE REUSABLE FUNCTIONS (RF) LAYER
All representations have certain requirements, and need a tool set of functions
that can be used to implement those requirements. All representations must, for
example, have functions that implement a user interface. Different tool sets must
all be able to add descriptions of components to the IR layer. Once a list of
candidate components has been found, it will need to be sorted and/or examined.
Proteus has a set of generic functions for these tasks.
By making these functions generally available, a tool set for a new representation
can be built in less time. By isolating them in a separate layer and building them
from another set of reused lower level functions (that are also in the RF layer),
differences in the performance of different representations caused by factors
extrinsic to the representation can be minimized.
4.3 LIBRARY REPRESENTATION (LR) LAYER
For the experiments to be valid, Proteus must not favor some representations over
others. The underlying services that are supplied to the different representations
must fulfill all of the requirements of each representation, while not favoring
some over others. For example, an enumerated class structure might be favored
by a database that stores the components in a tree structure.
To minimize the effects of the design and implementation of these services, care
has been taken to make all component information equally retrievable by all
representations. The sub-classes of a class of component can be returned via a
function call, with approximately the same delay as a function that returns all the
keywords of a group of components. The list of all components that are members
of a class, along with all members of that class's descendant classes, is available
with a delay equivalent to that of returning a list of all components that match a
two or three word Boolean keyword query. Under the object oriented design paradigm
used in the design and implementation of Proteus, information about
an object is retrieved via a function call, whether it is static data or the result of a
function that computes the data. This helps to enforce the requirement of having all types
The graphical user interfaces (GUIs) of the different representations are all built from
the same window building toolkit, which was produced in support of the Proteus
project. It is an object oriented set of tools that allows the interface windows
to be built quickly, and the services supplied by the classes of objects
available from the toolkit to be easily expanded. This inherently produces interfaces with a common
"look and feel", as most interface objects in each representation are of the same
common classes of objects. For example, each representation must have an
interface that allows the user to examine the data available on a single component,
once a likely component is found. All the representations use an identical window
for this purpose. Also, each interface supplies some information that is
appropriate to the database of components that the user has chosen (keywords
defined for the database in a keyword based representation, or the list of all sub- or
super-classes of a class in the enumerated class representation), and the window
objects that display these data in each representation are of the same class of GUI
object. The help windows that are attached to each representation's GUI objects
are also instances of the same class of objects.
The lower level data structures and the tools that manipulate these structures are
shared by all the representations. These services include the retrieval of data on
a component by that component's name, the ability to order textual or numerical
data for sorting, the search of a data structure for a datum by matching some
search criteria to an aspect or attribute of the data, and the ability to present this
information to a user through GUI objects and to define and customize
these objects. As stated above, Proteus has been designed and implemented to
minimize the effect of the relationships between representations and underlying
services. Whenever feasible, underlying services have been implemented so that
no representation will gain apparent advantage in functionality or usability
through its commonality or similarity to the implementation of underlying services.
5. CURRENT STATUS OF PROTEUS
The window toolkit in Apollo/Lucid Lisp was also more difficult to work with
for two reasons. First, unlike the TI toolkit, which is OOD defined, the Apollo
toolkit is functionally defined. To follow the design of Proteus, and to reuse components
between the representations, an OOD library of user interface objects had to be
defined and implemented. This brought about the second problem: the Domain
window toolkit environment was not robust. There were many difficult-to-explain
errors (some of which were fixed by patches from Apollo once they
were made aware of the problems) and Lisp system crashes that were difficult to
reproduce, as they did not give any error messages. These crashes would often
crash the Unix shell process running the Lisp process as well, making the errors
even more difficult to trace. Some of the errors found to cause this
behavior were as simple as a misspelled symbol name being evaluated during the
initialization of graphical objects. About 40% of the time spent building the
graphical user interface for Proteus was spent diagnosing errors in the Lisp
window toolkit, or repeatedly rebuilding images and rebooting systems
that exhibited errors that could not be traced because they were not dependably
repeatable.
The following table gives timings (in seconds) for Proteus operations. Empty cells
in the table indicate that no time is required for the operation. For example,
faceted and enumerated searches have essentially no retrieval time associated
with them because of the way they are structured. We feel that time differences
between operations for different representations are not so large as to bias the
experiments.
7. CONCLUSIONS
In this paper, we have described Proteus, a reuse library tool that allows multiple
methods to be used. Proteus will be used to empirically evaluate reuse
representation methods, so that cost/performance data about them can be used to
guide practitioners. Planned experiments were described. We have discussed how
the design of Proteus attempts to handle experimental problems, and have given
descriptive data about Proteus' execution speed and databases.
References
[BOO86] Booch, G. (1986), Software Components with Ada: Structures, Tools, and
Subsystems, Menlo Park, CA: Benjamin/Cummings.
[CON86] Conte, S., Dunsmore, H., and Shen, V. (1986), Software Engineering Metrics
and Models, Menlo Park, CA: Benjamin/Cummings.
[WOO89] Wood, J. (1989), Joint Application Design: How to Design Quality Systems
in 40% Less Time, New York: Wiley.
Figure 1
[Screen image not reproduced. Message window: "Welcome to Proteus... Please select one or more databases. Mouse click left to choose. Then choose a search method in the same way. Mouse click on the go box to continue your search." Databases: BOOCH-PART, COSMIC-PART, LISP-PART, UNIX-RPD-TOOL, UNIX-TOOL. Methods: ATTRIBUTE-VALUE, ENUMERATED-GRAPH, FACETED-CLASSIFICATION, KEYWORD-2.]
Figure 2
[Screen image not reproduced. Keyword search interface. Legible keywords: NOTATION, NTH, NTH-KEY, NTH-KEYTH, NTHS, NULL, OBJECT, OBJECT-INSTANCE, OBJECT-INSTANCES, ODD, ONLY, OPS, ORDER, ORDERING, OTHERWISE. Legible retrieved items: ALPHA-SORT-TWO, INSERT-IN-ORDER-ALPHA, INSERT-IN-ORDER-ALPHA-UNIQUE, LIST-SORT.]
Figure 3
[Screen image not reproduced. Enumerated graph interface for the COSMIC-PART database. Legible entries under the sub-classes of the current root include ATHENA-FORTRAN-COMPILER and CUAS-CODE-USAGE-ANALYSIS-SYSTEM; the other entries are truncated in the original.]
Figure 4
[Screen image not reproduced. Faceted classification interface; the only legible facet name is FUNCTIONAL-AREA.]
Figure 5
[Screen image not reproduced. Attribute-value interface. Pattern to match: AUTHOR W WOODFORD, LANGUAGE ADA. Two parts match the pattern; their names are truncated in the original.]