
Scientific workflow systems

(Short Paper)
Jacques Wainer*    Mathias Weske, Gottfried Vossen†
Claudia Bauzer Medeiros‡

July 11, 1997

Abstract
This paper will discuss some of the basic assumptions behind office work and
scientific work, and show that workflow systems for these two endeavors should
have different functionalities. In particular, we discuss the idea that data
management and management of workflow models are the most important aspects of
scientific workflows.

1 Introduction
Workflow management systems (WFMS) have been marketed as systems that control
the sequencing of the activities in a procedure (or workflow). At this very abstract
level, WFMS could serve to control the execution both of business or office procedures
and of scientific procedures. Both families of procedures involve the execution of
activities, some of them manual, some of them automatic, and the dependence
relationships among these activities can be very complex, yielding complex problems of
synchronizing their execution.
However, this view presupposes that the process being implemented with a WFMS
has been modeled in advance, and that, at enactment time, the WFMS is only
following what the process model dictates. This is where the rosier view of WFMS breaks
down. In real life, both in office and in scientific lab environments, the enactment of a
workcase may deviate significantly from what was planned or modeled. In extreme cases,
the execution of a workcase may be totally ad hoc. This has given rise to a growing
research area concerned with enabling WFMS to help their users deal with these
deviation-from-the-model cases. To that end, a better understanding of what
is important in office work and in scientific work is necessary in order to provide the
correct functionalities for the WFMS.
* DCC - IMECC - UNICAMP, 13081-970 Campinas SP Brazil. Email: wainer@dcc.unicamp.br
† Wirtschaftsinformatik, University of Muenster, Germany. Email: {weske,vossen}@uni-muenster.de
‡ DCC - IMECC - UNICAMP, 13081-970 Campinas SP Brazil. Email: cmbm@dcc.unicamp.br

In a recent paper, the WASA architecture has been proposed in the context of
scientific workflows [MedVW95] (WASA stands for "Workflow-based Architecture to support
Scientific Applications"). That paper discusses properties of scientific work and functionalities
that an environment to support scientific work must have in order to be useful. The
WASA architecture is currently being used in the context of a scientific workflow in
molecular biology, namely in DNA Fragment Assembly. This workflow, the resulting
FAT-WASA architecture, and a prototypical implementation of the FAT-WASA system
using a business workflow tool are discussed in [MeidVW95].
While [MedVW95] discusses properties of scientific work, and architectures and
functionalities for these systems, this paper concentrates on scientific workflow management
and its relations to business workflows and ad hoc workflows. It is organized as follows.
Section 2 discusses basic properties of workflows in an office environment. Section 3 reviews
the basic properties of scientific workflows and relates them to properties of business
workflows. Another kind of workflow discussed in the literature is the ad hoc
workflow; its relationship to scientific workflows is discussed in Section 4.
Concluding remarks complete the paper.

2 Office Work
We propose that office work is mostly about achieving goals defined by the rules of
enterprises. In more modern WFMS, there has been a lot of emphasis on exception handling
and ad hoc planning for special cases [BK95][BW95][BN95][SMM+94]. These two
concepts, exception handling and ad hoc planning, show that it is acceptable to deviate
from a planned process in order to achieve the goals that were behind the process itself. For
example, if the CEO of a company sends a memo that a particular customer's purchase
should proceed as urgently as possible, the workflow for that purchase will be changed
accordingly. In another example, if a customer is late in providing some documents, the
credit checking activity may be postponed and the production scheduling activity may
start before credit evaluation, although credit evaluation should normally come first,
provided this change is approved by someone at the appropriate level in the organization.
These examples show that in office work situations what is important is not to follow
the rules but "to get the job done" or, even better, to achieve what doing the job would
achieve, possibly in a more efficient way than planned. In the business workflow
literature this is called "exception handling" or "situated planning".

3 Scientific Work
Whereas office work is about goals, scientific work is about data. Collecting, generating,
and analyzing large amounts of heterogeneous data is the essence of such work, or at
least of the components of scientific work that are more naturally the target of WFMS.
Gathering and merging data from various experiments, generating data from a computer
model, and performing statistical analyses on the data are among the activities that could
profit from WFMS support.

The degree of flexibility that scientists have in their work is usually much higher than
in the business domain, where business processes are usually predefined and executed in
a routine fashion. Assume a scientist decides to filter a data set coming from a measuring
device; even if such filtering was not planned for, that is a perfectly acceptable attitude,
provided the resulting data is tagged as being the output of the filtering activity.
This example shows what we believe is the most important characteristic of a
scientific workflow: it serves as a way of identifying data sets. The details and parameters of a
workflow should be added to the data set in order to identify the data. Thus, by
accessing this identification tag on a data set, the scientist would know how the data was
generated (devices, algorithms, time, place) and which data manipulation activities were
performed.
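The tagging idea can be illustrated with a minimal Python sketch. It is not the WASA implementation: the names `TaggedDataSet`, `ActivityRecord`, `apply_activity`, and `keep_below` are hypothetical, and the tag is assumed to be a free-text origin plus an ordered history of activities and parameters.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Tuple


@dataclass(frozen=True)
class ActivityRecord:
    """One data manipulation step: its name and the parameters used."""
    activity: str
    parameters: Tuple[Tuple[str, Any], ...]


@dataclass(frozen=True)
class TaggedDataSet:
    """A data set plus the identification tag describing how it was produced."""
    values: Tuple[float, ...]
    origin: str  # assumed free-text description: device, algorithm, time, place
    history: Tuple[ActivityRecord, ...] = ()


def apply_activity(
    data: TaggedDataSet,
    name: str,
    func: Callable[..., Iterable[float]],
    **params: Any,
) -> TaggedDataSet:
    """Run func on the values and return a new data set whose tag records the step."""
    new_values = tuple(func(data.values, **params))
    record = ActivityRecord(name, tuple(sorted(params.items())))
    return TaggedDataSet(new_values, data.origin, data.history + (record,))


def keep_below(values: Iterable[float], threshold: float) -> list:
    """An unplanned filtering step: keep readings at or below the threshold."""
    return [v for v in values if v <= threshold]


raw = TaggedDataSet((1.0, 9.5, 2.0, 7.0), origin="measuring device A")
filtered = apply_activity(raw, "filter", keep_below, threshold=5.0)
print(filtered.history)
```

Because the data sets are immutable, the raw data keeps its empty history while the filtered data carries a record of the filtering activity and its threshold.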
While flexibility is a major property of scientific work, there are numerous
standard procedures that can be assembled to perform complex scientific experiments. An
important difference from business workflows is that a scientific workflow is often not
completely defined before it starts. The scientist performs some tasks and decides on
the further steps only after evaluating the previous ones. These sequences of steps that
make up part of a scientific experiment are known as partial workflows. Partial
workflows may be re-used in later experiments. Managing partial
workflows is therefore an essential goal in scientific workflow management.
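A partial workflow that grows step by step and is later re-used might be sketched as follows. This is an illustrative assumption, not a description of any existing system: the class `PartialWorkflow`, the `library` dictionary, and the two example steps are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional

Step = Callable[[List[float]], List[float]]


class PartialWorkflow:
    """A named sequence of steps that can be extended after it has started."""

    def __init__(self, name: str, steps: Optional[Iterable[Step]] = None) -> None:
        self.name = name
        self.steps: List[Step] = list(steps or [])

    def extend(self, step: Step) -> "PartialWorkflow":
        """Append a step decided on only after earlier results were evaluated."""
        self.steps.append(step)
        return self

    def run(self, data: List[float]) -> List[float]:
        for step in self.steps:
            data = step(data)
        return data


def drop_negatives(data: List[float]) -> List[float]:
    return [x for x in data if x >= 0]


def normalize(data: List[float]) -> List[float]:
    peak = max(data)
    return [x / peak for x in data]


# Hypothetical store of partial workflows kept for later experiments.
library: Dict[str, PartialWorkflow] = {}

cleaning = PartialWorkflow("cleaning").extend(drop_negatives)
intermediate = cleaning.run([4.0, -1.0, 2.0])  # the scientist inspects this result
cleaning.extend(normalize)                     # and only then chooses the next step
library[cleaning.name] = cleaning

# A later experiment starts from the stored partial workflow.
later = PartialWorkflow("experiment-2", library["cleaning"].steps)
result = later.run([8.0, -3.0, 2.0])
```

The later experiment copies the stored steps, so extending it further does not alter the library entry.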
The above illustrates that workflow systems can prove invaluable in supporting activity
tracking, data tagging and documentation, even for experiments performed by a single
scientist. This is particularly true for scientists working on computational models; they
generate large amounts of data, each data set produced by changing different parameters
in the computer models, and all of it must be properly identified.
There is one other aspect that distinguishes office from scientific workflows: an office
workflow must be brought to a "satisfying" end. If a customer cancels an order, that
purchase case must be further processed to be brought to an acceptable end state: the
production may be rescheduled, or the organization may sue the customer for expenses
or for breaking the contract. In a scientific workflow, cases may be abandoned at any
moment and at any step. If the scientist thinks that there was some contamination of
the data, the experiment may simply be stopped.

4 Ad hoc flow
We will call ad hoc flow the possibility of altering the flow of a particular workcase.
Because of its particular characteristics, a workcase may have to follow a different
sequencing of activities from the one planned for more standard workcases of that type.
There are different forms of ad hoc flow discussed in the recent literature: ad hoc
planning is the case where a particular actor in the workflow may alter the plan of
activities of the workcase [BK95][BN95][SMM+94]. This re-planning can be restricted
to a particular domain in the organization: the credit checking department proposes a
different plan for this workcase because it is in some way a special case, but outside the
credit checking department the case will proceed as planned. Or the re-planning can
affect the whole plan for the workcase.

A different form of ad hoc flow can be described as "pass the buck." In this case,
the flow of the workcase is not planned, but at the end of each activity the actor decides
to whom the workcase should be sent next. Ad hoc planning has been discussed in the
literature as an important way of dealing with specificities of a workcase. The pass the
buck mode is discussed by [WB96].
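The pass the buck mode can be pictured with a small Python sketch in which each actor finishes an activity and names the next recipient. The actors, the routing rule, and the `max_hops` guard are hypothetical illustrations, not taken from [WB96].

```python
from typing import Callable, Dict, List, Optional

# An actor performs its activity on the workcase and returns the name of the
# next actor, or None when the workcase needs no further handling.
Actor = Callable[[dict], Optional[str]]


def clerk(workcase: dict) -> Optional[str]:
    workcase["checked"] = True
    return "manager" if workcase["amount"] > 1000 else "archive"


def manager(workcase: dict) -> Optional[str]:
    workcase["approved"] = True
    return "archive"


def archive(workcase: dict) -> Optional[str]:
    workcase["archived"] = True
    return None


def pass_the_buck(
    workcase: dict, first: str, actors: Dict[str, Actor], max_hops: int = 100
) -> List[str]:
    """Route a workcase with no predefined plan; return the path it took."""
    trail: List[str] = []
    current: Optional[str] = first
    while current is not None:
        if len(trail) >= max_hops:
            raise RuntimeError("workcase is still circulating")
        trail.append(current)
        current = actors[current](workcase)
    return trail


actors = {"clerk": clerk, "manager": manager, "archive": archive}
big = pass_the_buck({"amount": 5000}, "clerk", actors)
small = pass_the_buck({"amount": 200}, "clerk", actors)
```

No route exists before enactment: the path of each workcase emerges from the decisions made at the end of each activity.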
In scientific work, both forms of ad hoc flow seem very important: a group may
replan a certain sequence of activities because of characteristics of the data, or because
the scientists want to try a new data analysis procedure. Or a solitary scientist may not
even replan in advance, but, given the results of an activity, decide what to do next. One
can see that because the WFMS will manage the data, the scientist may find it more
attractive to describe to the WFMS what activity should be performed next instead of
just doing it, since in the latter case she would have to manually attach to the resulting
data the information on which activity and parameters were used to generate it.
Scientific workflows should also provide another functionality for ad hoc planning,
which we call rewind. A scientist may decide after performing a sequence of data analysis
activities (say high frequency filtering and outlier removal) that a different form of data
analysis should have been performed (say principal component analysis). The scientist
should be able to rewind the workcase to a step previous to this data analysis sequence
and from there perform the alternative data analysis procedure.
The rewind concept should not be confused with the redo concept, which is common
in office and software engineering workflows. In a redo, the flow of the workcase is
redirected so that a particular activity is executed again. The difference is that in the
redo all data additions performed by the subsequent activities are available for the redone
activity. For example, in software production workflows it is common to have loops of
code/test where the code activity is redone if the test activity detects errors. The code
activity has access to all the test results, and in fact depends on them. If it were rewound,
the code activity would start again, from the specifications, with no data from later
activities available. The rewind functionality is of course based on versioning the data
produced by the activities. One would like to be able to restore the full context after
(or before) some activity was performed and proceed with another course of action.
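The contrast between redo and rewind can be made concrete with a minimal Python sketch based on versioning the data context. It rests on assumptions not stated in the paper: the `Workcase` class and its methods are hypothetical, one snapshot of the context is kept per performed activity, and a rewind simply discards the later snapshots.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Activity = Callable[[Context], Context]


class Workcase:
    """Keeps one versioned snapshot of the data context per performed activity."""

    def __init__(self, initial: Context) -> None:
        self.versions: List[Context] = [dict(initial)]  # versions[i]: after i activities
        self.log: List[str] = []

    @property
    def context(self) -> Context:
        return self.versions[-1]

    def perform(self, name: str, activity: Activity) -> None:
        new_context = dict(self.context)
        new_context.update(activity(self.context))
        self.versions.append(new_context)
        self.log.append(name)

    def redo(self, name: str, activity: Activity) -> None:
        """Execute an activity again; data from later activities stays visible."""
        self.perform(name, activity)

    def rewind(self, steps_kept: int) -> None:
        """Restore the context as it was after the first steps_kept activities."""
        del self.versions[steps_kept + 1:]
        del self.log[steps_kept:]


def write_code(context: Context) -> Context:
    return {"code": "draft", "saw_test_results": "test_results" in context}


def run_tests(context: Context) -> Context:
    return {"test_results": "1 failure"}


redone = Workcase({"spec": "sort a list"})
redone.perform("code", write_code)
redone.perform("test", run_tests)
redone.redo("code", write_code)      # the code activity sees the test results

rewound = Workcase({"spec": "sort a list"})
rewound.perform("code", write_code)
rewound.perform("test", run_tests)
rewound.rewind(0)                    # back to the specifications only
rewound.perform("code", write_code)  # no data from later activities available
```

A system meant for scientists would probably archive the discarded versions rather than delete them, so that the abandoned analysis remains documented.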

5 Conclusions
The objective of this paper was to exhibit di erences between classes of work ows:
scienti c, oce, and ad-hoc ones. Our goal is to contribute to a better understanding
of what targets exist for work ow management systems. The main focus of this paper,
however, is on scienti c work ows. We showed that these do have special properties, such
as identifying and tagging data sets, versioning data sets, allowing to rewind work ows.
This emphasis is currently being implemented in the WASA prototype; it is not available
in any commercial work ow management system yet.

Acknowledgements
This work was partially supported by CNPq Brazil and by the German Ministry of
Science and Technology (BMFT), within a bilateral cooperation on Database Technology
and Knowledge-Based Systems, and by FAPESP Brazil.

References
[BK95] D.P. Bogia and S.M. Kaplan, Flexibility and Control for Dynamic Workflows in
the wOrlds Environment, in Proceedings of the 1995 ACM Conference on Organizational
Computing Systems (COOCS'95), N. Comstock and C.A. Ellis (eds.), pp 148-159,
Milpitas, California, 1995.
[BN95] R. Blumenthal and G.J. Nutt, Supporting Unstructured Workflow Activities
in the Bramble ICN System, in Proceedings of the 1995 ACM Conference on Organizational
Computing Systems (COOCS'95), N. Comstock and C.A. Ellis (eds.), pp 130-137,
Milpitas, California, 1995.
[BW95] P. Barthelmess and J. Wainer, Workflow Systems: a few definitions and a
few suggestions, in Proceedings of the 1995 ACM Conference on Organizational
Computing Systems (COOCS'95), N. Comstock and C.A. Ellis (eds.), pp 138-147, Milpitas,
California, 1995.
[MedVW95] C.B. Medeiros, G. Vossen and M. Weske, WASA: A Workflow-Based
Architecture to Support Scientific Database Applications (Extended Abstract), in Proceedings
of the 6th DEXA Conference, N. Revell and A.M. Tjoa (eds.), Springer LNCS 978, pp
574-583, London, 1995.
[MeidVW95] J. Meidanis, G. Vossen and M. Weske, Using Workflow Management in
DNA Sequencing, Fachbericht Angewandte Mathematik und Informatik 23/95-I,
Universität Münster, 1995.
[SMM+94] K.D. Swenson, R.J. Maxwell, T. Matsumoto, B. Saghari and K. Irwin,
A Business Process Environment Supporting Collaborative Planning, in CSCW'94,
ACM, 1994.
[WB96] J. Wainer and P. Barthelmess, Workcase-centric Workflow Model, submitted
to NSF Workshop on Workflow and Process Automation, 1996.
