Professional Documents
Culture Documents
hite ture of a
D. W. Walker
Computer S
ien
e and Mathemati
s Division
Oak Ridge National Laboratory
PO Box 2008, Oak Ridge TN 37831-6367, USA
M. Li, O. F. Rana, M. S. Shields, and Y. Huang
Department of Computer S
ien
e
Cardi University, PO Box 916, Cardi CF24 3XF, UK
September 20, 2000
Abstra
t
This paper des
ribes the fun
tionality and software ar
hite
ture of a
generi
problem-solving environment (PSE) for
ollaborative
omputa-
tional s
ien
e and engineering. A PSE is designed to provide transparent
a
ess to heterogeneous distributed
omputing resour
es, and is intended
to enhan
e resear
h produ
tivity by making it easier to
onstru
t, run,
and analyze the results of
omputer simulations. Although implementa-
tion details are not dis
ussed in depth, the r^ole of software te
hnologies
su
h as CORBA, Java, and XML is outlined. An XML-based
omponent
model is presented. The main features of a Visual Component Composi-
tion Environment for software development, and an Intelligent Resour
e
Management System for s
heduling
omponents, are des
ribed. Some
prototype implementations of PSE appli
ations are also presented.
1 Introdu tion
1
The main motivation for developing PSEs is that they provide software tools
and expert assistan
e to the
omputational s
ientist in a user-friendly environ-
ment, allowing more rapid prototyping of ideas and higher resear
h produ
tivity.
By relieving the s
ientist of the burdens asso
iated with the inessential and of-
ten ar
ane details of spe
i
hardware and software systems, the PSE leaves the
s
ientist free to
on
entrate on the s
ien
e.
This paper des
ribes the general fun
tionality and software ar
hite
ture of
a generi
distributed PSE. The use of CORBA, Java, XML, and other soft-
ware te
hnologies is dis
ussed, but this paper's emphasis is on des
ribing the
overall fun
tionality of the PSE and on how the dierent sub-systems of the
PSE intera
t, rather than on implementation details. Implementation details
an be found in [28℄. Therefore, some issues of importan
e in the design of
PSEs are not fully addressed in this paper, although a brief referen
e to them
is made in se
tion 7.2, with respe
t to the
urrent PSE infrastru
ture. These
in
lude se
urity and authenti
ation, fault toleran
e, debugging, quality
ontrol
and validation of software
omponents.
The two major parts of the PSE are the Visual Component Composition
Environment (VCCE) and the Intelligent Resour
e Management Sub-system
(IRMS). These are des
ribed in Se
tions 2 and 3, respe
tively. In Se
tion 4
other aspe
ts of PSEs are dis
ussed, su
h as the role of \intelligen
e" and mobile
agents. Prototype implementations are des
ribed in Se
tion 5. Related resear
h
into PSEs is presented in Se
tion 6. A summary is given in Se
tion 7.
2
2. Components themselves may be hierar
hi
al (i.e.,
onstru
ted from other
omponents) and be of arbitrary granularity. Thus, a
omponent may
perform a simple task, su
h as nding the average of a set of input values,
or it may be a
omplete appli
ation for solving a
omplex problem.
3. Ea
h
omponent is represented by a well-dened model spe
ied in XML,
as des
ribed in Se
tion 2.5.
4. A
omponent may have individual, group, or world a
ess permissions to
spe
ify who may use it.
5. Information is passed from one
omponent to another via uni-dire
tional
typed
hannels. A
hannel
onne
ts an outport of one
omponent to an in-
port of another
omponent. A
omponent may have zero or more inports.
The set of data obje
ts referen
ed by the
hannels
onne
ted to a
ompo-
nent's inport(s) together dene its input interfa
e. Similarly, a
omponent
may have zero or more outports, and the set of data obje
ts referen
ed
by the
hannels
onne
ted to a
omponent's outport(s) together dene its
output interfa
e.
6. A set of
onstraints may be asso
iated with ea
h
omponent, indi
ating
on what platforms it is li
ensed to run, and whether it requires generi
software, su
h as MPI or the BLAS, to be bound in later in order to run.
7. A performan
e model is optionally asso
iated with ea
h
omponent. Ide-
ally, this gives the runtime of the
omponent as a fun
tion of its input,
ma
hine,
ommuni
ation, and network parameters. If no performan
e
model is available, an upped bound on performan
e
an be approximated
based on
riteria su
h as the number of inputs/outputs, the implementa-
tion language, the exe
ution platform et
. Subsequent exe
utions of the
omponent
an be logged into a database, and used to provide a perfor-
man
e model.
8. Information on a
omponent's purpose, the algorithms it uses, and other
pertinent explanatory data is optionally asso
iated with a
omponent.
There are several ways in whi
h a pie
e of exe
utable
ode may be
ome
asso
iated with a
omponent. The binding of an exe
utable with a
omponent
generally takes pla
e after the VCCE has been used to
onstru
t the appli
ation.
Thus, a
omponent may be available only as an exe
utable spe
i
to a
ertain
type of hardware, and in this
ase the binding pro
ess is trivial. Alternatively,
the
omponent may be available as sour
e
ode, in whi
h
ase the PSE must
arrange for it to be
ompiled for a target ar
hite
ture. Finally, if neither the
sour
e nor the exe
utable
ode is available, the Software Information Servi
e is
used within the IRMS to lo
ate an implementation of the
omponent. This latter
ase is useful when using
ommonly available software, su
h as the BLAS, whi
h
may be available from a
ompute server. For example, if it was ne
essary to
perform a matrix-matrix multipli
ation, it might be better to allow the PSE to
3
nd the BLAS routine to do this on some platform, rather than the
omposer
of the appli
ation making the de
ision within the VCCE. To be exe
uted, a
omponent must be resolved to a given server implementation. Hen
e, although
a
omponent in the VCCE represents a pla
e holder for exe
utable
ode, it must
be linked to a server whi
h
an implement the fun
tionality, and this
he
k is
also made by the VCCE prior to invoking the resour
e management system.
This topi
is dis
ussed further in Se
tion 3.
There are four
ommon types of
omponent. The main task of
ompute
omponents is to do
omputational work, su
h as performing a matrix multipli-
ation or a fast Fourier transform. Template
omponents require one or more
omponents to be inserted into them in order to be
ome fully fun
tional. Tem-
plate
omponents
an be used to embody
ommonly-used algorithmi
design
patterns. This allows a developer to experiment with dierent numeri
al meth-
ods or model physi
s. Thus, in a partial dierential equation solver, a template
might require a pre-
onditioner to be inserted. Template
omponents
an also
be used to introdu
e
ontrol
onstru
ts su
h as loops and
onditionals into a
higher-level
omponent. For example, in a simple for loop, the initial and nal
values of the loop
ontrol variable, and its in
rement, must be spe
ied, together
with a
omponent that represents the body of the loop. For an if-then-else
onstru
t, a Boolean expression and a
omponent for ea
h bran
h of the
on-
ditional must be given. A template
omponent
an be viewed as a
ompute
omponent in whi
h one or more of the inputs is a
omponent. Data manage-
ment
omponents are used to read and
onvert between the dierent types of
data format that are held in les and internet data repositories. The intention
of this type of
omponent is to be able to (1) sample data from various ar
hives,
espe
ially if there is a large quantity of data to be handled; (2)
onvert between
dierent data formats;(3) support spe
ialized transports between
omponents if
large quantities of data need to be migrated a
ross a network; and (4) undertake
data normalization, and perhaps also generate SQL or similar queries to read
data from stru
tured databases. The fourth type of
ommon
omponent is the
user interfa
e
omponent, whi
h allows a user to
ontrol an appli
ation through
a graphi
al user interfa
e and plays a key role in
omputational steering.
4
a new hierar
hi
al
omponent has been built in this way, it may then be added
to an appropriate CR for future use and for sharing with other appli
ation
developers.
The basi
fun
tionality of the VCCE is similar to that of other modular visu-
alization environments su
h as AVS [24℄, IBM Data Explorer [1℄, IRIS Explorer
[14℄, and Khoros [38℄. A re
ent arti
le by Wright et al. [37℄ has reviewed these
types of modular visualization environments. The VCCE diers from these en-
vironments through its use of an XML-based
omponent model, and an event
model that is able to support
he
kpointing and
ontrol
onstru
ts su
h as loops
and
onditionals. In addition, the VCCE in
orporates a number of sophisti
ated
support tools whi
h are des
ribed below in this sub-se
tion.
The VCCE in
ludes a tool for building inports and outports from a
ompo-
nent's input and output interfa
es. Although a
omponent's interfa
es
annot
be
hanged, inports and outports
an be
onstru
ted out of the data obje
ts
omprising the input and output interfa
es, respe
tively. Ea
h
omponent has
a set of default inports and outports dened by the author of the
omponent.
For example, suppose the input to a
omponent
onsists of two
oating point
numbers, a and b, and an integer, j . These
ould be
ongured into a single
inport (a; b; j ), or as two inports ((a; b) and (j ), or (a) and (b; j ), or (b) and
(a; j )), or as three inports (a), (b), and (j ). It is also permitted to have a data
obje
t asso
iated with two or more inports. Thus, in the previous example,
the inports
ould be
ongured as (a; b) and (b; j ). In general, this results in a
non-deterministi
program with a ra
e
ondition on b.
The VCCE also in
ludes a text editor for
reating a new
omponent from
s
rat
h in some programming language. The Wrapper Tool
an then be used to
onvert the new
ode into a CORBA-
ompliant
omponent that
an be pla
ed
in an CR, if required. The Wrapper Tool
an also be used to
onvert lega
y
odes into
omponents [22℄. A
omplete lega
y appli
ation may be wrapped as
a single
omponent, or it may be divided and wrapped as a series of smaller
omponents, provided the internal stru
ture of the appli
ation is known.
The Annotation Tool allows a developer of a
omponent to annotate it with
information that may be of use to other users and parts of the PSE. This
information might in
lude an explanation of what the
omponent does and of
its input and output interfa
es, together with a URL giving the web address
of more detailed information. Other annotations in
lude
onstraints on the
omponent's use and information about its performan
e.
The Component Analysis Tool (CAT) of the VCCE
an be used to display
the hierar
hi
al stru
ture of a
omponent. A
omponent
an be expanded to
show the data
ow graph of the
omponents
omprising it. Alternatively, if it is
not
omposed of lower-level
omponents, the sour
e
ode is displayed. This may
be annotated to show the signi
an
e of important variables, data stru
tures,
and lines of
ode. A URL may be asso
iated with ea
h
omponent giving the
web address of do
umentation about the
omponent explaining, for example, its
purpose, the algorithms used, the meaning of the input and output arguments,
et
. This do
umentation may also in
lude links to sour
es of information related
to the
omponent external of the PSE.
5
The Expert Assistant (EA) helps users to lo
ate and use
omponents based
on a de
ision tree and/or rule-based approa
hes. The EA is also responsible for
mat
hing user requirements to
omponent types, based on
omponent interfa
es
spe
ied in XML (as dis
ussed in Se
tion 2.5). Therefore, the EA
an be used
to sear
h
omponent repositories that are registered for a parti
ular PSE. The
EA is a
ore part of the user interfa
e. It en
odes rules that are domain spe
i
,
and
an help a novi
e user evaluate the
omponents that may be most useful
for his or her parti
ular problem. These domain spe
i
rules
an either be
asso
iated with a
omponent for use within the EA, or they may be developed
in-house by an organization using the PSE.
The Input Wizard
he
ks that the input provided to an appli
ation does not
violate any algorithmi
onstraints, and
orre
tly represents the system being
modelled.
Figure 1 shows the software ar
hite
ture of the VCCE part of a PSE.
Expert
Third party Assistant
Input
software Wizard
6
Computational steering is often used in situations in whi
h data are presented
to the user from within a loop, for example, a time-stepping loop. Another use
o
urs when the user wishes to pause the appli
ation at some point in order to
set a value based on the appli
ation results at that point.
These two types of
omputational steering
an be provided for by having
ea
h
omponent identied (through the
omponent model des
ribed in Se
tion
2.5) as either steerable or non-steerable. Steerable
omponents may, or may not,
be initiated upon entry in the paused state. If a steerable
omponent is in the
paused state, then input from a user interfa
e
omponent is required to make
it
ontinue. The user interfa
e
omponent steering the appli
ation
an be used
to swit
h a steerable
omponent between the paused and non-paused states.
Figure 2 shows a portion of a task graph that is used for
omputational
steering. The steerable
omponent requires two inputs, a and b. Input a may
ome from either the user interfa
e
omponent doing the steering or from some
pre
eding
omponent in the task graph. A steerable
omponent will always
a
ept input preferentially from the user interfa
e
omponent, if any is available;
otherwise, it will a
ept input from the pre
eding
omponent. The output
from
the steerable
omponent loops ba
k into the user interfa
e
omponent, as well
as being passed on to another
omponent for
onsumption.
a a b
User interface
component Steerable component
c c d
7
simulations and/or databases, often through a visualization interfa
e. Within
the VCCE, a PSE
an be used
ollaboratively by \
loning" user interfa
e
om-
ponents. This permits ea
h user to view, or intera
t with, the data through
their own
lone of the master user interfa
e
omponent. Thus, all user interfa
e
omponents are
lonable or non-
lonable, and the
lones may, or may not, in-
herit from the master the ability to steer a
omponent in the task graph. The
loning fa
ility is mediated through a web page asso
iated with the master user
interfa
e. In a typi
al
ollaborative appli
ation, the master user would initiate
an appli
ation in the paused state (see Se
tion 2.3) and then wait for other users
to a
ess the web page through whi
h they generate their user interfa
e
lones.
Users may join the
ollaborative appli
ation at any time that the master user
interfa
e is a
tive.
8
me
hanism is parti
ularly useful when PSEs share
omponents or when the user
wishes to evaluate other
omponents that may be used as alternatives. XML
has also be
ome a useful integration me
hanism with Java Beans, and numerous
ommer
ial tools are readily available to support this integration. These reasons
suggest that XML is the most appropriate way of
reating
omponent interfa
es
within a PSE. The use of XML also enables the generation of
ontext-sensitive
help and leads to the development of self-
ataloguing
omponents.
XML tags are used to dene a standard
omponent model within the PSE
that will be deployed within all subsystems. Components are stored in the
Component Repository using this format, and any binary data asso
iated with a
omponent must also be tagged appropriately. The XML interfa
e to a CORBA
obje
t or a Java Bean, for instan
e, does not make a distin
tion between an em-
bedded
omponent obje
t or an obje
t that is native to CORBA-IDL or Java-
IDL. Therefore, a wrapped
omponent
an be a Java Bean, a COM obje
t, C
or C++ library
alls, wrapped Fortran
ode
alled from C,
alls to DLLs or
shared-obje
ts, and other run-time linking methods. In ea
h
ase, the XML in-
terfa
e is used to dene interfa
e information, for a
essing parti
ular attributes
and methods, for registering listeners, and for handling ex
eptions and events.
Further, spe
ialized support for expressing
onstraints on the run-time environ-
ment is also provided, along with me
hanisms to express hierar
hy and dire
tory
stru
tures if
omponents are stored within a stru
tured le system.
Our XML based
omponent model
ontains the following tags:
1. Context and header tags. These are used to identify a
omponent and the
types of PSEs that a
omponent may be usefully employed in. The
om-
ponent name must be unique, with an alternative alphanumeri
identier;
however, any number of PSEs may be spe
ied. These details are grouped
under the prefa
e tag, hen
e:
The hierar
hy tag is used to identify parent and
omponents, and works
in a similar way to the Java pa
kage denition. A
omponent
an have
one parent and multiple
hildren.
2. Port tags. These are used to identify the number of input and output
ports and their types. An input port
an a
ept multiple data types and
an be spe
ied in a number of ways by the user. The DTD for the ports
part of the
omponent model
an be spe
ied as:
9
<!ELEMENT inportnum INTEGER>
<!ELEMENT outputnum INTEGER> ...
<ports>
<inport id=1 type=stream>
<parameter value=regression value=NIL/>
<href name=http://www.
s.
f.a
.uk/PSE/ value=test.txt>
</inport>
</ports>
When reading data from a le, the href tag is hanged to:
<ports>
<inport id=1 type=stream>
<parameter value=regression value=NIL/>
<href name=file:/home/pse/test.txt value=NIL>
</inport>
</ports>
This gives a user mu
h more
exibility in dening data sour
es and using
omponents in a distributed environment. The general type:
or
are appli
able within any tag, and are used to spe
ify (name,value) pairs
that o
ur frequently in interfa
e denitions. The user may also dene
more
omplex input types, su
h as a matrix, stream, or an array in a
similar way.
3. Steerability tabs. A
omponent
an be tagged to indi
ate that it
an be
steered via the user interfa
e. In order to a
hieve this, the input/output
to/from a
omponent must be obtainable from a user interfa
e. Hen
e,
with a steerable
omponent, the inports and outports of a
omponent
must have a spe
ialized tag to identify this. Hen
e, for a
onventional
omponent without steerability, the ports denition is:
10
<ports>
<inport id=1 type=stream>
<parameter type=velo
ity value=10.2 />
</inport>
</ports>
<ports>
<steerable>
<inport id=1 type=stream
<parameter type=velo
ity value=10.2 />
</inport>
</steerable>
</ports>
This will automati
ally produ
e an additional input port over whi
h in-
tera
tive inputs
an be sent to the
omponent, with the input being of the
same type as in the non-steerable version of the
omponent.
4. Exe
ution tags. A
omponent may have exe
ution-spe
i
details asso
i-
ated with it, su
h as whether it
ontains MPI
ode, if it
ontains internal
parallelism, et
. If only a binary version of a
omponent is available, then
this must also be spe
ied by the user. Su
h
omponent-spe
i
details
may be en
losed in any number of type tags.
The exe
ution tag is divided into a software part and a platform part.
The former is used to identify the internal properties of the
omponent,
while the latter is used to identify a suitable exe
ution platform or a
performan
e model.
5. Help tags. A user
an spe
ify an external le
ontaining help for a parti
u-
lar
omponent. The help tag
ontains a
ontext option that enables the
asso
iation of a parti
ular le with a parti
ular option to enable display
of a pre-spe
ied help le at parti
ular points in appli
ation
onstru
tion.
The
ontexts are predened, and all
omponent interfa
es must use these.
Alternatively, the user may leave the
ontext eld empty, suggesting that
the same le is used every time help is requested for a parti
ular
ompo-
nent. If no help le is spe
ied, the XML denition of the
omponent is
used to display
omponent properties to a user. Help les
an be kept
lo
ally, or they
an be remote referen
es spe
ied by a URL. One or more
help les may be invoked within a parti
ular
ontext, some of whi
h may
be lo
al.
6. Conguration tags. As with the help tag, a user
an spe
ify a
onfiguration
tag, whi
h enables a
omponent to load predened values from a le, from
a network address, or using parameter,value. This enables a
omponent
11
to be pre-
ongured within a given
ontext (e.g., to perform a given a
tion
when a
omponent is
reated or destroyed). The
onfiguration tag is
parti
ularly useful when the same
omponent needs to be used in dierent
appli
ations, enabling a user to share parts of a hierar
hy, while dening
lo
al variations within a given
ontext.
7. Performan
e model. Ea
h
omponent has an asso
iated performan
e model,
whi
h
an be spe
ied in a le using a similar approa
h to
omponent
onguration dened above. A performan
e model is en
losed in the
performan
e tag and may range from being a numeri
al
ost of running
the
omponent on a given ar
hite
ture, to being a parameterized model
that
an a
ount for the range and types of data it deals with, to more
omplex models that are spe
ied analyti
ally.
8. Event model. Ea
h
omponent supports an event listener. Hen
e, if a
sour
e
omponent
an generate an event of type XEvent, then any listener
(target) must implement an Xlistener interfa
e. Listeners
an either be
separate
omponents that perform a well dened a
tion, su
h as handling
ex
eptions, or
an be more general and support methods that are invoked
when the given event o
urs. We use an event tag to bind an event to a
method identier on a parti
ular
omponent:
12
<prefa
e>
<name alt=DA id=DA02>Data Extra
tor</name>
<pse-type>Generi
</pse-type>
<hierar
hy id=parent>Tools.Data.Data_Extra
tor</hierar
hy>
<hierar
hy id=
hild></hierar
hy>
</prefa
e>
The s
ript tags are used to spe
ify what method to invoke in another
omponent, when the given event o
urs.
9. Additional tags. These tags are not part of the
omponent model and
an
be spe
ied by the user in a blo
k at the end of ea
h se
tion by using:
13
XML
VCCE
Component
Repository
XML
Representation
Python
Scheme Output
Output
To
Resource CORBA-IDL
Web Pages Manager Output
On
e a
omplete appli
ation has been spe
ied using the VCCE the resulting
task graph, annotated with information about the
omponents' performan
e
and
onstraints on exe
ution, is passed to the Intelligent Resour
e Management
System (IRMS) to be s
heduled for exe
ution. The stru
ture of the IRMS is
shown in Fig. 4 and will now be des
ribed.
14
Task graph Compile
from VCCE and Install
Constraints Service
Perf. model
Scheduler Computational
Choose granularity of task graph Software infrastructure
Resolve components Information Computers
Schedule components on hardware Service Libraries
Networks
Executable
task graph
Performance Hardware
History Information
Database Service
Execution Controller
Dispatch components to hardware
Perform checkpointing
Handle I/O
Monitor performance
Executable
components
Local Resource
Manager Interfaces Hardware
CODINE resources
LSF
etc.
15
passed to the Exe
ution Controller.
The S
heduler must devise a s
hedule a
ording to
ertain goals embodied in
a s
heduling poli
y and in a
ordan
e with a set of
onstraints. It makes use of
a number of information resour
es to devise an exe
utable s
hedule. S
heduling
involves the
oupled a
tivities of determining an appropriate level of granularity
for s
heduling the task graph obtained from the VCCE and de
iding on whi
h
resour
es to run the
omponents of the task graph.
Currently, a fairly simple S
heduler is envisioned that takes a top down
approa
h, starting at the highest level of granularity. The task graph re
eived
from the VCCE is initially regarded as having a single node that represents
the whole appli
ation. This node is then expanded (i.e., repla
ed by its sub-
omponents), and an improved s
hedule is sought. If no su
h s
hedule
an be
found then the initial task graph and its asso
iated s
hedule is a
epted, and
output to the Exe
ution Controller. Otherwise, we expand again and repeat
until we either generate a task graph that is worse then the previous one, or
until we
annot expand any more. The metri
used to de
ide if one task graph
is better than its prede
essor is important, and dierent metri
s
orrespond to
dierent s
heduling poli
ies. For example, the metri
that always views TG(i) as
better than TG(i+1) (provided the former obeys all the
onstraints)
orresponds
to a poli
y that attempts to always s
hedule at the highest granularity possible,
regardless of
ost or performan
e. The metri
that regards TG(i) as better than
TG(i+1) if it has a smaller expe
ted time to
ompletion will result in a s
hedule
that attempts to minimize exe
ution time. The expe
ted time to
ompletion is
derived from the performan
e model asso
iated with a
omponent.
In the simple s
heme des
ribed above, \expansion" means expanding ea
h
node of the
urrent task graph, but they
ould be expanded one at a time
(starting with the most resour
e hungry) to see whi
h is best. The s
heduling
s
heme is represented in Fig. 5.
3.2 Constraints
A
omponent is said to be
ompatible with a pie
e of hardware if all its sub-
omponents
an be exe
uted on that hardware. Clearly, the S
heduler must
always expand
omponents that are not
ompatible.
The S
heduler
an run a
omponent on a parti
ular pie
e of hardware only if
it
an lo
ate or
reate an exe
utable for that
omponent/hardware
ombination.
If no exe
utable is expli
itly asso
iated with a
omponent, and if the lo
ation of
the sour
e
ode is known, then the S
heduler will use the Compile and Install
Servi
e to
reate an exe
utable for the hardware platform of interest.
Resour
es are owned and operated by dierent individuals or organizations
and exist in dierent administrative domains. Therefore, resour
e usage may
have dierent se
urity restri
tions or be subje
t to parti
ular poli
y
onstraints,
restri
ting the s
heduling of
omputational tasks to these devi
es (or groups of
devi
es). Li
ensing
onstraints are a pertinent example of these restri
tions,
whereby mathemati
al or s
ienti
software is restri
ted to a single ma
hine or
parti
ular groups of ma
hines.
16
START
i=1
Schedule TG(i)
i=i+1
Yes
Is
Can Yes Refine Schedule TG(i+1)
TG(i) be granularity TG(i+1) better than
refined to give TG(i+1) TG(i)
No No
Accept TG(i)
STOP
Figure 5: S
heduling a task graph. The input task graph TG(1) is the
omplete
appli
ation re
eived from the VCCE.
17
3.4 The Exe
ution Controller
Components
an be run on hardware under the dire
t
ontrol of the Exe
ution
Controller, or they may be passed to lo
al resour
e managers that will handle
their exe
ution. The simplest
ase is where the entire appli
ation is handled
by a single lo
al resour
e manager, su
h as CODINE [10℄ or GRAM [17℄. Co-
ordinating appli
ations whose sub-
omponents are handled by multiple lo
al
resour
e managers, and also dire
tly by the Exe
ution Controller, may present
signi
ant diÆ
ulties.
18
by the Intelligent Resour
e Management System (see Se
tion 3). A database of
previous performan
e history is built up to store the performan
e of
omponents
for dierent hardware platforms and inputs. This may then be used to help
s
hedule
omponents using geneti
algorithms and neural networks.
19
agent
an take this and attempt to
ompile and install the
omponent on the
ma
hines known to the PSE through the HIS. The resulting exe
utable
an then
be used in appli
ations built within the VCCE. Again, this a
tivity
an take
pla
e
ontinuously, independent of the users of the PSE.
5 Prototype Implementations
20
se
ond is to provide a me
hanism that
an be used to dynami
ally
reate links
between
omponents. The third is to provide dynami
method invo
ation on
parti
ular
omponents within the environment.
The solutions to date, using the Java programming language, provide the
ability for the system to dis
over a
omponent's properties at design time and
display them via a simple \Obje
t Inspe
tor." Thus, for a display
omponent
the value that is set for the
omponent is shown as an editable string, and
for an operator
omponent the two input values are shown as editable strings,
and the operator as a sele
tion from a \drop-down list." The system
an also
dynami
ally
reate links between two
omponents and use these links to
all
\set methods" to update the properties of a given
omponent instan
e.
Figure 6 illustrates the ar
hite
ture of the MDS-PSE. Java IDL is used as the
fundamental infrastru
ture for dening
omponent interfa
es in MDS-PSE. Java
IDL is a CORBA
ompliant IDL shipped with JDK1.2. The Wrapper
onverts
the lega
y
ode in the server node to a CORBA obje
t. It is responsible for
ommuni
ation with the lega
y
ode through a shared le and provides servi
es
to the
lient. Sin
e the operations performed in MDS-PSE are known at
ompile
time, the
lient invokes the Wrapper through an IDL stub on the
lient side and
an IDL skeleton on the server side. Using CORBA, the
lient
an submit input
simulation data and wait for simulation results without knowing the lo
ation of
the wrapper and the exa
t implementation of the simulation.
The operations provided by the
lient and the wrapper in MDS-PSE are dened
by the IDL denitions in Code Segment 1. The IDL interfa
e hides the Client
from the implementation of the wrapper and the programming language used
to implement the appli
ation.
In Code Segment 1, the servi
e provided by the Wrapper is startSimulation(),
whi
h has two input parameters. One parameter is the referen
e to the Client
obje
t, whi
h will be used to invoke the Client in order to display simulation
results to the user. The other parameter is the input simulation data re
eived
21
Client node Server node
ORB
22
WrapperRef.startSimulation(ClientRef, simulationParameters);
...
}
23
void startSimulation( Client obj, String SimulationParameter)
{
NativeC(String SimulationParameter);
}
...
}
Code Segment 4. The Wrapper
ode used to invoke the MDS lega
y
ode.
Se
ond, the Wrapper reads the simulation results from a shared le and
requests the Client
(obj.displaySimulation()) to display the results to the user, as shown in
Code Segment 5.
lass Wrapper{
...
readData();
obj.displaySimulation( a,f1,f2,f3,f4,f5);
...
}
Code Segment 5. The Wrapper
ode used to request a servi
e from the
Client.
6 Related Work
The
urrent
on
ept of a PSE for
omputational s
ien
e has its origins in an
April 1991 workshop funded by the U.S. National S
ien
e Foundation (NSF) [15,
16℄. The workshop found that the availability of high performan
e
omputing
resour
es,
oupled with advan
es in software tools and infrastru
ture, made the
reation of PSEs for
omputational s
ien
e a pra
ti
al goal, and that these PSEs
would greatly improve the produ
tivity of s
ientists and engineers. This is even
more true today with the advent of web-based te
hnologies, su
h as CORBA
and Java, for a
essing remote
omputers and databases.
A se
ond NSF-funded workshop on S
alable S
ienti
Software Libraries and
Problem-Solving Environments was held in September 1995 [29℄. This workshop
assessed the status of PSE resear
h and made a number of re
ommendations for
future development. One parti
ular re
ommendation was the need to develop
PSE infrastru
ture and tools and to evaluate these in
omplete s
ienti
PSEs.
Sin
e the 1991 workshop, PSE resear
h has been mainly dire
ted at im-
plementing prototype PSEs and at developing the software infrastru
ture { or
\middleware" { for
onstru
ting PSEs. Mu
h of this work has been presented
in a re
ent book by Houstis et al. [18℄. Initially, many of the prototype PSEs
that were developed fo
used on linear algebra
omputations [9℄ and the solution
of partial dierential equations [20℄. More re
ently prototype PSEs have been
developed spe
i
ally for s
ien
e and engineering appli
ations [11, 12, 21, 32℄.
Tools for building spe
i
types of PSEs, su
h as PDELab [34℄ (a system for
building PSEs for solving PDEs), and PSEWare [6℄ (a toolkit for building PSEs
24
fo
used on symboli
omputations) have been developed. More generi
infras-
tru
ture for building PSEs is also under development. This infrastru
ture ranges
from fairly simple RPC-based tools for
ontrolling remote exe
ution [2, 31℄, to
more ambitious and sophisti
ated systems, su
h as Legion [19℄ and Globus [13℄,
for integrating geographi
ally distributed
omputing and information resour
es.
The Virtual Distributed Computing Environment (VDCE) developed at
Syra
use University [33℄ is broadly similar to the PSE software ar
hite
ture
des
ribed in Se
tions 2 and 3. However,
omponents in the VDCE are not hi-
erar
hi
al, whi
h simplies the s
heduling of
omponents. Also, the transfer of
data between
omponents in VDCE is not handled using CORBA, but instead
is the responsibility of a Data Manager that uses so
kets. The Appli
ation-
Level S
heduler (AppLeS) [5℄ and the Network Weather Servi
e [36℄ developed
at the University of California, San Diego [5℄ are similar to the IRMS and HIS
des
ribed in Se
tion 3, sin
e both make use of appli
ation performan
e models
and dynami
ally gathered resour
e information.
Visual programming based on the spe
i
ation of appli
ations and algo-
rithms with dire
ted graphs is the basis of the Heterogeneous Network Com-
puting Environment (HeNCE) [4℄ and the Computationally Oriented Display
Environment (CODE) [25℄. Browne et al. have reviewed the use of visual pro-
gramming in parallel
omputing and
ompared the approa
hes of HeNCE and
CODE [7℄. Though a similar approa
h is used by the VCCE des
ribed in Se
tion
2, HeNCE and CODE were designed for use at a ner level of algorithm design;
thus, they require a greater degree of sophisti
ation in their design. SCIRun
[27℄ is a PSE for parallel s
ienti
omputing that also uses dire
ted graphs to
visually
onstru
t appli
ations and has been designed to support visual steering
of large-s
ale appli
ations.
This paper has des
ribed the software ar
hite
ture of a problem-solving envi-
ronment for
ollaborative
omputational s
ien
e and engineering. The PSE is
designed to provide transparent a
ess to heterogeneous distributed
omputing
resour
es and is intended to enhan
e resear
h produ
tivity by making it easier
to
onstru
t, run, and analyze the results of
omputer simulations. A PSE may
be used for tasks su
h as rapid prototyping, design optimization, and detailed
analysis.
7.1 Summary
Mu
h of the PSE infrastru
ture des
ribed in this paper is generi
and
an be
used to build PSEs for a range of appli
ations. Thus, a PSE for a parti
ular
appli
ation domain has a tiered stru
ture, with the lowest tier
onsisting of
the generi
PSE infrastru
ture. The middle tier
ontains the domain-spe
i
infrastru
ture (
omponents, expert assistan
e, input wizards, performan
e his-
tory database) that is needed to
reate a PSE for an appli
ation domain. The
25
upper tier
ontains the models (high-level
omponents and asso
iated input and
output data) that will be used from within the PSE.
Three key aspe
ts of a PSE are
ollaborative tools, visualization, and intel-
ligen
e. It is these features that distinguish a PSE as a powerful environment
for resear
h and knowledge dis
overy, rather than being merely a sophisti
ated
interfa
e.
Collaborative working is supported in several ways within a PSE. First,
the VCCE provides a
ollaborative software development environment through
the
on
ept of a web-a
essible Component Repository. Se
ond,
ollaborative
data analysis and exploration is supported through the
loning of user interfa
e
omponents. Third, ele
troni
notebooks and living do
uments are a me
hanism
for sharing the results of previous simulations, and provide a resear
h re
ord.
Visualization is be
oming in
reasingly important in
omputational s
ien
e
and engineering, both as a me
hanism for runtime monitoring and steering of
appli
ations, and for post-mortem data analysis and navigation. Therefore, it is
essential that a PSE tightly integrates visualization,
omputation, and analysis.
This
an be done using the
omponent-based software engineering approa
h
upon whi
h the VCCE is based. Using the VCCE,
omputation and analysis
omponents
an be linked to user interfa
e
omponents that display the data
visually. It is important to make the mode of visualization \resour
e aware."
Thus, if the visualization platform is a CAVE a fully immersive representation
of the data should be provided. If a semi-immersive
at-panel display is used,
the visualization should be appropriate for that platform. Finally, if a PC is
used as the visualization platform the data
ould be displayed using VRML, or
some other me
hanism for depi
ting virtual reality on a PC.
\Intelligen
e" is important for ensuring the PSE is easy to use and eÆ
ient,
and is in
orporated into the PSE in a number of ways. The Expert Assistant
provided in the VCCE helps users to lo
ate and use
omponents using de
i-
sion tree and/or rule-based approa
hes. The Input Wizard
he
ks that the
input provided to an appli
ation does not violate any physi
al or algorithmi
onstraints. Lastly, the IRMS
an use information from previous runs of an ap-
pli
ation or
omponent to make s
heduling de
isions. This approa
h
an make
use of intelligent methods, su
h as geneti
algorithms and neural networks.
PSEs have the potential to fundamentally
hange how
omputational re-
sour
es, instrumentation, and people intera
t in doing resear
h. Although some
interesting PSEs are now beginning to emerge, resear
h into PSEs of the type
des
ribed in this paper is still at an early stage in many areas. It is hoped
that PSE resear
h will lead to the adoption of software standards that in turn
will provide the basis on whi
h PSE infrastru
ture
an be based. The Common
Component Ar
hite
ture is one su
h standardization eort [3℄.
26
se
tion brie
y outlines our approa
h to ta
kling some of these issues in the fu-
ture, and is not meant to be an exhaustive
overage of possible ways to ta
kle
these
on
erns. Issues su
h as quality of servi
e (QoS) are
urrently beyond the
s
ope of this work, and we rely on third party tools from low level infrastru
-
ture providers to guarantee this. Support for resour
e reservation is also not
provided, and real time appli
ations whi
h require su
h servi
es must negotiate
resour
es with providers before
ommen
ing a simulation.
Fault toleran
e
an be provided in the PSE in two ways, (1) repli
ating
ommon and more frequently used
omponents to ensure availability, (2) up-
dating the
omponent model to support state
he
kpointing. Repli
ation
an
be easily integrated into our
urrent PSE infrastru
ture, as every
omponent
within the VCCE is primarily a pla
e holder or referen
e to an exe
utable
ode.
Su
h a referen
e
an
ontain both a primary referen
e, whi
h is always given
preferen
e, and multiple se
ondary referen
es. Hen
e, subje
t to li
ensing, the
exe
utable
ode for a
omponent may exist in multiple pla
es, and when adding
a new referen
e to the
omponent repository, the developer is en
ouraged to
provide multiple lo
ations where the exe
utable
ode for the
omponent may be
found. If a new exe
utable
ode is being added into the PSE, the developer is
again en
ouraged to pla
e the
ode at multiple lo
ations. To a
hieve this, the
omponent integration tool [23℄ will prompt the user to provide multiple sites
for an exe
utable
ode.
State
he
kpointing
an be a
hieved by updating the
omponent model to
in
lude a
he
kpoint
ag, whi
h suggests that
ertain input variables, or state
variables within a
omponent
an be saved periodi
ally. A
he
kpoint servi
e is
then provided by a given
omponent, whi
h monitors the state of these variables
at regular intervals, and maintains their values in a database. All
omponents
whi
h need to be
he
kpointed must subs
ribe to this servi
e rst. If a
om-
ponent
rashes during exe
ution, it
an be restarted from its last stored state.
The validity of this approa
h depends on the type of
omponent, and the type
of state variables being re
orded. It is left to the developer of a
omponent to
ensure that enough state information within a
omponent is tagged with the
he
kpoint
ag to ensure that a
omponent
an be re-started.
Se
urity is an important
on
ern for many users of s
ienti
appli
ations,
and this
an relate to a number of issues, (1) authenti
ating users requiring
a
ess to a
omponent (perhaps due to li
ensing), (2) ownership and rights of
generated data from a simulation, (3) re-play of the a
ess pro
ess to subvert
password prote
tion, and (4)
omponent modi
ation. Some of these issues
an
be dealt with by providing se
urity
erti
ates for every
omponent through a
entral issuing authority. Ea
h
omponent exe
utable
an also be signed prior
to adding it to a repository, to ensure that it
annot be subsequently modied.
Additional
on
erns with respe
t to migratable
ode, in the use of mobile agents
for instan
e, is still an open resear
h issue [30℄, and being investigated a
tively
in the agents
ommunity. Low level se
urity provision through a Se
ure So
kets
Layer (SSL) may be too limiting, as our emphasis is at a high level of
omponent
manipulation in the VCCE, and not at the
ommuni
ations layer. Individual
omponent developers may in
lude SSL as a way to
ommuni
ate with their
27
omponents; however, the PSE infrastru
ture
annot impose su
h a requirement.
Referen
es
[1℄ G. Abram and L. Treinish, \An Extended Data-Flow Ar
hite
ture for Data Anal-
ysis and Visualization," Computer Graphi
s, Vol. 29, No. 2, pages 17{21, 1995.
[2℄ P. Arbenz, C. Sprenger, H. P. Luthi, and S. Vogel, \SCIDDLE: A Tool for Large-
S
ale Distributed Computing," Te
hni
al Report 213, Institute for S
ienti
Com-
puting, ETH Zuri
h, Mar
h 1994.
[3℄ R. Armstrong, D. Gannon, A. Geist, K. Keahey, S. Kohn, L. M
Innes, S.
Parker, and B. Smolinksi, \Toward a Common Component Ar
hite
ture for High-
Performan
e S
ienti
Computing," in Pro
eedings of High Performan
e Dis-
tributed Computing (HPDC99), 1999.
[4℄ A. Beguelin, J. J. Dongarra, G. A. Geist, R. Man
hek, and V. S. Sunderam,
\Graphi
al Development Tools for Network-Based Con
urrent Super
omputing,"
in Pro
eedings of Super
omputing 91, pages 435{444, 1991.
[5℄ F. Berman, R. Wolski, S. Figueira, J. S
hopf, and G. Shao, \Appli
ation Level
S
heduling on Distributed Heterogeneous Networks," in Pro
eeding of Super
om-
puting 1996, Pittsburgh, Pennsylvania, November, 1996.
[6℄ R. Bramley and D. Gannon. See web page on PSEWare at
http://www.extreme.indiana.edu/pseware.
[7℄ J. C. Browne, S. I. Hyder, J. J. Dongarra, K. Moore, P. Newton, \Visual Pro-
gramming and Debugging for Parallel Computing," Te
hni
al Report TR94-229,
Department of Computer S
ien
es, University of Texas at Austin, 1994.
[8℄ B. Carpenter, Y.-J. Chang, G. Fox, D. Leskiw, and X. Li, \Experiments with
HPJava," Con
urren
y: Pra
ti
e and Experien
e, Vol. 9, no. 6, pp. 633-648, June
1997.
[9℄ H. Casanova and J. J. Dongarra, \NetSolve: A Network-Enabled Server for Solv-
ing Computational S
ien
e Problems," Int. J. Super
omputing Appl. Vol. 11,
No. 3, pp. 212-223, Fall 1997.
[10℄ CODINE. See http://www.geniasoft.
om/produ
ts/
odine/des
ription.html
for further information.
[11℄ J. E. Cuny, R. A. Dunn, S. T. Ha
kstadt, C. W. Harrop, H. H. Hersey, A. D.
Malony, and D. R. Toomey, \Building Domain-Spe
i
Environments for Compu-
tational S
ien
e: A Case Study in Seismi
Tomography," Int. J. Super
omputing
Appl. Vol. 11, No. 3, pp. 179-196, Fall 1997.
[12℄ K. M. De
ker and B. J. N. Wylie, \Software Tools for S
alable Multilevel Appli-
ation Engineering," Int. J. Super
omputing Appl. Vol. 11, No. 3, pp. 236-250,
Fall 1997.
[13℄ I. Foster and C. Kesselman, \GLOBUS: A Meta
omputing Infrastru
ture
Toolkit," Int. J. Super
omputing Appl. Vol. 11, No. 2, pp. 115-128, Summer
1997. See also web site at http://www.globus.org/.
[14℄ D. Foulser, \IRIS Explorer: A Framework for Investigation," Computer Graphi
s,
Vol. 29, No. 2, pages 13{16, 1995.
28
[15℄ E. Gallopoulos, E. N. Houstis, and J. R. Ri
e, \Workshop on Problem-Solving En-
vironments: Findings and Re
ommendations," ACM Computing Surveys, Vol. 27,
No. 2, pp. 277-279, June 1995.
[16℄ E. Gallopoulos, E. N. Houstis, and J. R. Ri
e, \Computer as Thinker/Doer:
Problem-Solving Environments for Computational S
ien
e," IEEE Computa-
tional S
ien
e and Engineering, Vol. 1, No. 2, pp. 11-23, 1994.
29
[30℄ Se
urity in Mobile Agent Systems, See web site at:
http://www.informatik.uni-stuttgart.de/ipvr/vs/projekte/mole/se
urity.html
[31℄ S. Sekigu
hi, M. Sato, H. Nakada, S. Matsuoka, and U. Nagashima, \Ninf: A
Network-Based Information Library for Globally High Performan
e Computing,"
in Pro
eedings of the 1996 Parallel Obje
t-Oriented Methods and Appli
ations
Conferen
e, Santa Fe, New Mexi
o, February 1996.
[37℄ H. Wright, K. Brodlie, J. Wood, and J. Pro
ter, \Problem Solving Environments:
Extending the Role of Visualization Systems," Pro
eedings of the European Con-
feren
e on Parallel Computing (EuroPar 2000), held in Muni
h, Germany, 29
August - 1 September, 2000.
[38℄ M. Young, D. Argiro, and S. Kubi
a, \Cantata: Visual Programming Environ-
ment for the Khoros System," Computer Graphi
s, Vol. 29, No. 2, pages 22{24,
1995.
30