Generation of Block-Structured Parallel Process Models by Demonstration

Article
Julijana Lekić 1, Dragan Milićev 2, and Dragan Stanković 3
1 Faculty of Technical Sciences, Kneza Milosa 7, Kosovska Mitrovica, 38220, Serbia; julijana.lekic@pr.ac.rs
2 Faculty of Electrical Engineering, Bulevar kralja Aleksandra 73, Belgrade, 11000, Serbia; dmilicev@etf.bg.ac.rs
3 Cisco Meraki, 500 Terry A Francois Blvd, San Francisco, California 94158, United States; sfsgagi@gmail.com
* Correspondence: julijana.lekic@pr.ac.rs
Abstract: Programming by Demonstration (PBD) is a technique that allows end users to create, modify, adapt, and extend programs by demonstrating what the program is supposed to do. Although the ideal of general-purpose programming by demonstration or by example has been rejected as practically unrealistic, the approach has found application and shown potential when limited to specific narrow domains and ranges of applications. This paper presents an original method of applying the principles of programming by demonstration in the area of process mining (PM) to the interactive construction of block-structured parallel business process models. The idea is based on the following principle: using a demonstrational user interface, a user demonstrates scenarios of execution of the parallel business process activities, and the system produces a generalized process model specification. To this end, a modified process mining technique with the α||(L) algorithm, applied to weakly complete event logs, is used to create parallel business process models by demonstration.
Keywords: Programming by Demonstration, Process mining, graphical user interface, business process model discovery, block-structured parallel process models, α-algorithm, α||-algorithm, weakly complete event log.
Citation: Lastname, F.; Lastname, F.; Lastname, F. Title. Symmetry 2021, 13, x. https://doi.org/10.3390/xxxxx

Copyright: © 2021 by the authors. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Programming by demonstration (PBD) is an approach in which software is developed partly or completely by interactively demonstrating to the computer how to behave in specific situations, i.e., how to perform actions and data processing on specific examples. With PBD, a designer can demonstrate to a computer how to behave in individual cases, while the generalized specification is created by the computer, possibly with assistance and suggestions from the designer.
In software development research, the Programming by Example (PBE) and Programming by Demonstration (PBD) techniques have appeared as ways to define a set of operations without the need to learn a programming language. The user does not need to know a specific programming language in order to automate repetitive tasks; instead, the user presents the program using the program's own user interface (or a stub of it). Such demonstrational interfaces allow a user to perform actions on specific examples of objects (often using direct manipulation), although the examples represent a much more general class of objects. The term "demonstration" is used because the user demonstrates the desired results through examples of values.
1.1. Motivation for interactive construction of business process models

The topic of this paper is a prototype program with a graphical user interface (GUI) for the interactive construction of a specific category of business process models. The idea of interactive specification of behavior by demonstration is not new: it appears in the work of many PBE and PBD researchers and is precisely described by Lieberman [1], where the term "intelligent interface agent" denotes a system (machine) that communicates with the user directly through the user interface, has the ability to reason and draw conclusions, and uses procedures to perform tasks demonstrated by a user. This paper shows how the same paradigm can be applied to process mining [2, 3] and to the inference of a specific category of business process models by a system through interactive demonstration.
Many information systems can record their execution and thus generate a trace of events describing the real system behaviour. Process mining techniques are based on the assumption that there is a strong relationship between process models and the "reality" recorded in the event log in the form of traces [2]. From behavioural sample records in the event log, the α-algorithm automatically generates a process model that belongs to a subclass of Petri nets [4] known as workflow (WF) nets [5]. The property of log completeness often implies that a large number of traces is needed in the log from which the "representative" model of the behaviour seen in the log is to be constructed. Our challenge was therefore to find logs with a potentially much lower number of traces, which may not be complete but are sufficiently valid so that, using an appropriate algorithm based on the evidence recorded in such logs, a "representative" model can be obtained. To achieve this, we partially modified the technique of process model discovery as well as the α-algorithm itself; we named the modified version the α||-algorithm [6, 7].
The graphical user interface that we created is a tool that visually shows the steps of the α||-algorithm. Such a tool could serve as a learning tool and playground for those who want to learn more about how the more general α-algorithm, which is based on the same principles as its modified version, the α||-algorithm, works. This is achieved through the idea presented in this paper: a user enters log entries (scenarios) step by step and observes the current model, which is obtained from the log entries entered so far.
In the specific case presented in this paper, the user performs possible scenarios of executed activities of a parallel business process using a graphical user interface, and the system generates a business process model based on this performance. After each played scenario, a candidate model is generated, which represents the model of all performed scenarios of executed activities but need not be the model of all possible scenarios of activity execution of the process in question. The goal is for the system to create a model that also fits all possible scenarios of activity execution, called the final model, as soon as possible, i.e., based on as few performed scenarios as possible. To achieve that, the system leads the user to perform a scenario that gives the system the most useful information for generalizing the model specification, as explained in detail in the description of the demonstration procedure later in the paper.
1.2. Related Work

The creation of business process models by demonstrating behavior through scenarios, presented in this paper, is based on Lieberman's elements of scenario-oriented recommendation [8], and especially on Harel's play-in and play-out techniques [9].
Addressing the issue of how to describe the actions and objects selected by users in the user interface in PBE, and thus of determining which type of generalization is possible, Lieberman and associates [10] propose visual generalization. In visual generalization, visual properties of the interaction elements themselves, such as the size, shape, colour, and appearance of graphical objects, are used to describe user intentions.
Harel and his associates have developed an approach in which specially profiled interaction models with formal execution semantics, called Live Sequence Charts (LSC), can be specified by demonstrating the behavior of the system through scenarios (which they call play-in) [9]. After the LSC model has been specified, the system itself can reproduce the defined scenarios in specific cases (a technique they call play-out). To realize these scenario-based techniques, a tool called Play-Engine was created. This approach is designed primarily for the development of general-purpose, event-driven reactive systems.
Today, the PBD technique is most widely used in the field of robotics [11], where, based on examples or demonstrations, a strategy is created that allows the robot to perform a certain activity; this is known in the literature as Learning from Demonstration (LfD) [12]. In LfD, a number of state-action pairs are defined and recorded during the demonstration of the desired robot behavior. LfD algorithms use this set of examples to develop a strategy by which a robot chooses a particular action based on the current state, and thus reproduces the demonstrated behavior.
The primary goal of our research was the interactive generation of a process model. We found guidelines for achieving this goal in the aforementioned literature. Harel's play-in and play-out techniques [9], Lieberman's scenario-oriented recommendation [8], and the recording of state-action pairs and reproduction of demonstrated behaviour in LfD [12] all gave us an idea of which road to take in achieving our goal. Likewise, based on Lieberman's visual generalization [10], we used differently colored scenarios to indicate their different roles in the process of model generation by demonstration. Besides giving us an idea for adopting a strategy for playing out process execution scenarios, the term Learning from Demonstration [12] itself encouraged us to think that our graphical user interface could serve as a learning device for process model discovery from demonstrated log entries.
1.3. Challenges

Discovering models based on recorded scenarios of process execution led us to process mining techniques and the α-algorithm [2, 3, 13]. However, the α-algorithm, which is applicable to complete event logs [3], did not allow us to achieve our goal. Our further research in that area resulted in a modification of the process mining technique for process model discovery and of the α-algorithm itself, as well as in the definition of new types of event logs and the conditions that they fulfill. The results of that part of our research, namely the modified PM technique, the α||(L) algorithm, and causally complete and weakly complete event logs, are presented in our papers [6, 7].
In [6] we briefly introduced our modified PM technique and the α||-algorithm, and we defined two types of logs, the so-called causally complete and weakly complete logs. In addition, the results of applying the modified PM technique and the α||-algorithm to these types of logs, as incomplete event logs, were shown on an example of a parallel business process.
The paper [7] is largely devoted to causally complete logs. It gives a more detailed presentation of our modified PM technique and algorithm, with the formal support of definitions, theorems, and proofs. It also compares the results obtained by applying our algorithm with the results of other process mining algorithms on causally complete logs. Finally, it presents the results of an extensive experimental analysis in which the minimal sizes of complete logs and causally complete logs are compared for 100 real examples of parallel business processes. The paper [7] also presents plug-ins for the existing ProM framework, designed for the needs of the experimental analysis.
To realize the idea of interactive business process model generation, the modified PM technique and the α||-algorithm applied to weakly complete event logs were used [6]. A detailed description of weakly complete event logs and their role in process mining is presented in another paper of ours that is currently in the process of publication. In this paper, attention is focused on a specific class of parallel processes, block-structured parallel processes [7], which are suitable for the use of the modified PM technique and the α||-algorithm, as will be shown later in this paper.
This paper is dedicated to generating models of block-structured parallel business processes by direct manipulation through the created graphical user interface. Besides a detailed presentation of the model generation procedure, it describes how the graphical user interface functions as an "intelligent" user interface, which is essential for the realization of our idea. The paper also presents commented results of an experimental analysis carried out on 100 real examples of parallel business processes, with the goal of examining how many scenarios of process execution must be performed by direct manipulation in order to generate the process model.
2. Materials and Methods

In this section we address some of the concepts from our papers [6, 7] that are necessary to understand this paper. To illustrate the effect of our modified method on the capability and efficiency of process model discovery, we define the particular class of business processes to which our method is restricted at this stage, namely parallel processes. The process models we have dealt with form a subclass of block-structured process models that we named block-structured parallel process models [6, 7].
2.1. Properties of Weakly Complete Event Logs

The fact that applying the α||-algorithm to weakly complete event logs may lead to model discovery was successfully used for the generation of block-structured parallel process models by demonstration. Therefore, the basic concepts and properties of weakly complete event logs that were used to interactively generate process models are presented below.
As stated in [7], the relations between activities can be represented by a matrix, called the footprint of the event log, where the relations between any two activities in the modified PM technique for discovering process models are defined as presented in Table 1 in [7].
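As an illustration only, the following sketch derives footprint-style relations from a small log. It does not reproduce Table 1 in [7], which is not restated here; the definitions used are assumptions made for this example: direct succession holds when one activity immediately follows another in some trace, indirect succession when it follows later but not immediately, causality when a direct succession is never contradicted by the opposite order, and parallelism when both orders are observed (directly or indirectly). The function name `footprint_relations`, the activity names, and the assumption that each activity occurs at most once per trace are hypothetical.

```python
def footprint_relations(log):
    """Derive relations between activities from a list of traces.

    Assumed definitions (an illustration, not Table 1 in [7]):
      direct   (x, y): y immediately follows x in some trace
      indirect (x, y): y follows x later in some trace, not immediately
      causal   (x, y): x directly precedes y and y never precedes x
      parallel (x, y): x precedes y in one trace and y precedes x in another
    """
    direct, indirect = set(), set()
    for trace in log:
        for i, x in enumerate(trace):
            for j in range(i + 1, len(trace)):
                pair = (x, trace[j])
                (direct if j == i + 1 else indirect).add(pair)

    precedes = direct | indirect
    causal = {(x, y) for (x, y) in direct if (y, x) not in precedes}
    parallel = {(x, y) for (x, y) in precedes if (y, x) in precedes}
    return direct, indirect, causal, parallel


# Hypothetical process: s, then a, b, c in parallel, then e.
# The second trace performs the middle activities in reversed order.
log = [
    ["s", "a", "b", "c", "e"],
    ["s", "c", "b", "a", "e"],
]
direct, indirect, causal, parallel = footprint_relations(log)
print("causal:  ", sorted(causal))
print("parallel:", sorted(parallel))
```

With these two traces, activity b is found to be parallel to both a and c, yet it obtains neither a causal predecessor nor a causal successor, which is the kind of situation discussed for weakly complete logs.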
For a particular process model to be discovered, there may be a large (in general, unlimited) number of different complete logs [6, 7]. However, all these complete logs have the same footprint, i.e., the same causality relation. We call this relation the basic causality relation and denote it by $\rightarrow_{BN}$ [6, 7]. As shown in [7], the main task is therefore to find a log whose causality relation is equal to the basic causality relation and then to apply the α||-algorithm, which leads to the discovery of the original network of parallel processes.
During our research, we found that there are logs from which a causality relation equal to $\rightarrow_{BN}$ can be discovered, although not from the evidence recorded in the log alone: individual elements of the causality relation have to be inferred in the course of applying the α||-algorithm. The examples that we analyzed show that the original network can be discovered from an incomplete log L for which the following holds: $\rightarrow_{BN} \subseteq (\rightarrow_L \cup \gg_L)$ and $\rightarrow_L \subset \rightarrow_{BN}$. We call such a log a weakly complete log and denote it by $L_w$ [6].
Although in such logs it cannot be concluded from the traces in the log that $\rightarrow_{L_w} = \rightarrow_{BN}$ (but $\rightarrow_{L_w} \subset \rightarrow_{BN}$ instead), the elements of the causality relation that are not in $\rightarrow_{L_w}$ but are found in $\rightarrow_{BN}$ can subsequently be inferred from the footprint of the log. These elements form a causality relation that we call the inferred causality relation and denote by $\rightarrow_i$. The causality relation that enters the final form of the log footprint (denoted by $\rightarrow_{L_f}$), and to which the α||-algorithm is applied, becomes $\rightarrow_{L_f} = \rightarrow_{L_w} \cup \rightarrow_i$, which gives $\rightarrow_{L_f} = \rightarrow_{BN}$. Finding a footprint causality relation $\rightarrow_{L_f}$ that is equal to the basic causality relation $\rightarrow_{BN}$ is a sufficient condition for discovering the process model based on $L_f$ using the α||-algorithm, as shown on a concrete example in [6].
2.2. The Dangling Nodes Problem

A network obtained from a weakly complete log often contains dangling nodes, i.e., activities (nodes in the Petri net [4]) without predecessors and/or successors. The definition of a WF-net [5] includes an assumption of network connectivity, meaning that all nodes in the network are connected, which prohibits the existence of dangling nodes. In an attempt to overcome the problem of dangling nodes, we observed the relations in footprints and, based on them, defined rules for inferring direct successors and predecessors from indirect ones. Thus, for each activity that is a dangling node, a successor and/or predecessor can be found [6].
A network obtained from a weakly complete event log that contains dangling nodes has, in the event log footprint, at least one activity whose table row does not contain the relation $\rightarrow$ (if the activity has no successor) or the relation $\leftarrow$ (if the activity has no predecessor) [6]. Let a, b, and c be activities of the observed process. In the case of dangling nodes, the following rules apply:
Rule 1. (Determining the inferred causality relation $\rightarrow_i$ when an activity has no successor) Let a be an activity whose log footprint row contains no relation $\rightarrow$. Then, by definition, $a \rightarrow_i c$ iff in the footprint $a \gg c$ and there is b such that $b \rightarrow c$, where $a \| b$.
Rule 2. (Determining the inferred causality relation $\leftarrow_i$ when an activity has no predecessor) Let a be an activity whose log footprint row contains no relation $\leftarrow$. Then, by definition, $a \leftarrow_i c$ iff in the footprint $a \ll c$ and there is b such that $b \leftarrow c$, where $a \| b$.
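The two rules can be sketched in code as follows. This is a hedged illustration rather than the implementation used in the tool: it assumes that the relations are available as sets of ordered pairs, that the indirect-predecessor relation is simply the inverse of the indirect-successor relation, and that inferred relations are stored in successor direction (so "a has inferred predecessor c" is stored as the pair (c, a)). The function name `infer_causality` and the example data are hypothetical.

```python
def infer_causality(activities, causal, indirect, parallel):
    """Apply Rule 1 and Rule 2 to activities lacking a causal successor
    or predecessor. All relations are sets of (x, y) pairs, where
    (x, y) in indirect means that y indirectly follows x."""
    inferred = set()
    for a in activities:
        has_successor = any(x == a for (x, _) in causal)
        has_predecessor = any(y == a for (_, y) in causal)
        for c in activities:
            # Rule 1: a has no successor, c indirectly follows a, and some
            # b parallel to a is a causal predecessor of c.
            if (not has_successor and (a, c) in indirect
                    and any((b, c) in causal and (a, b) in parallel
                            for b in activities)):
                inferred.add((a, c))
            # Rule 2: a has no predecessor, c indirectly precedes a, and
            # some b parallel to a is a causal successor of c.
            if (not has_predecessor and (c, a) in indirect
                    and any((c, b) in causal and (a, b) in parallel
                            for b in activities)):
                inferred.add((c, a))
    return inferred


# Hypothetical relations for a process s -> (a || b || c) -> e observed
# through the traces s a b c e and s c b a e.
activities = ["s", "a", "b", "c", "e"]
causal = {("s", "a"), ("s", "c"), ("a", "e"), ("c", "e")}
indirect = {("s", "a"), ("s", "b"), ("s", "c"), ("s", "e"), ("a", "c"),
            ("c", "a"), ("a", "e"), ("b", "e"), ("c", "e")}
parallel = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"),
            ("a", "c"), ("c", "a")}

inferred = infer_causality(activities, causal, indirect, parallel)
final_causal = causal | inferred  # corresponds to the union with the inferred relation
print(sorted(inferred))
```

In this example the dangling activity b receives e as an inferred successor (Rule 1) and s as an inferred predecessor (Rule 2), so that every activity is connected.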
2.3. Elements of Automatic Inference in the Demonstrational Program
The basic idea of applying the programming by demonstration paradigm in this paper is that the user performs possible scenarios of executed activities of a business process and, based on this performance, the system generates a process model that is in accordance with all demonstrated scenarios of process execution. For this purpose, a dedicated demonstrational graphical user interface was created, which enables the user to perform different scenarios of process activity execution using direct manipulation. The graphical user interface is accessible at the address given in [14].
The system contains some components of artificial intelligence, reflected in the fact that the system itself suggests the order in which process activities are performed (in order to discover the process model as soon as possible) and "infers" some relations between activities that were not performed during the demonstration procedure.
In addition to recording the relations demonstrated by a user, the system infers certain relations, using heuristics and specific rules, in order to generalize, although those relations were not performed and therefore cannot be seen in the demonstrated event log. To find and infer relations that have not been performed (inferred relations), the system uses the event log footprint together with Rule 1 and Rule 2, in the manner described in Section 2.2. The system's capability to infer relations among activities that are not recorded in the event log during the demonstration allows it to generate the model of the performed process from a very small number of traces in the event log. This statement was confirmed by the results of the experimental analysis presented later in this paper.
The system itself largely influences the size of the event log, that is, the number of recorded traces necessary for generating the final process model. Namely, by suggesting the order in which process activities are to be performed, the system "leads" the user to perform the scenario of process execution that would provide the most useful information for generalizing the process model specification. For the choice of the activity order that the system suggests to the user, it is of key importance that the system can infer the causal relations $\rightarrow_i$ ($\leftarrow_i$) based on the relations of indirection $\gg$ according to Rule 1 ($\ll$ according to Rule 2). Applying Rule 1 and Rule 2 in the procedure of inferring causal relations also requires the relation of parallelism, which, according to Definition 1 [6] and Definition 4 [7], can be discovered on the basis of relations of indirection. The system therefore "knows" that the scenario of process execution in which the largest number of relations of indirection (parallelism) occurs, namely the scenario in which the activities are performed in reversed order relative to the last performed scenario, will be of most use for generalizing the process model specification. By this principle, a large number of parallel activities, which can be performed in a mutually independent order, will swap their order in the next scenario, which enables the determination of parallel relations between them, as required by the modified PM technique. Following this principle further, the system can very quickly detect all parallel relations in the model, as well as the causal relations, either from the records themselves or by applying Rules 1 and 2 to find causal relations for the dangling nodes. In this way, the system can quickly discover the basic causal relation $\rightarrow_{BN}$ and therefore the final model of the demonstrated process.
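The reversal heuristic can be sketched in a few lines. The sketch assumes that the first and the last activity keep their positions (since block-structured parallel processes have one input and one output) and that only the activities between them are reversed; the function name `suggest_next_order` and the activity names are hypothetical.

```python
def suggest_next_order(last_scenario):
    """Propose the order for the next scenario: keep the first and last
    activity in place and reverse the activities in between."""
    if len(last_scenario) < 2:
        raise ValueError("a scenario needs at least a first and a last activity")
    first, *middle, last = last_scenario
    return [first, *reversed(middle), last]


last_scenario = ["s", "a", "b", "c", "e"]
proposed = suggest_next_order(last_scenario)
print(proposed)
```

Because every pair of middle activities appears in the opposite order in the proposal, any pair that can really be executed in both orders is observed in both orders after the two scenarios, which is what the discovery of parallel relations relies on.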
The system influences the user's performance in one more way, again with the objective of discovering the process model from the smallest possible number of performed scenarios of process execution. Namely, the system uses different colours to mark the performed scenarios depending on whether or not the scenario has provided new information for obtaining the process model. Green marks the scenarios whose performance led to changes in the process model, and red marks the repeated scenarios. The appearance of red on the screen prompts the user to step away from the suggested order and perform a scenario of their own choice that differs from the previous ones, so that more useful information can be obtained; this is explained in detail in the following description of the demonstration procedure. In this way, by using heuristics and inference, the system influences the user's performance and therefore the obtained results.
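A minimal sketch of the colour marking is given below. It assumes that the model can be compared before and after a scenario is added, and it takes the discovery step as a parameter (`discover`) instead of implementing the α||-algorithm; in the usage example a simple stand-in that collects directly-follows pairs plays that role. The text does not say how a new scenario that leaves the model unchanged is marked, so the label "unmarked" is an assumption, as are all names in the code.

```python
def mark_scenario(log, scenario, discover):
    """Return (colour, updated_log) for a newly performed scenario.

    red      : the scenario repeats one that is already in the log
    green    : adding the scenario changes the discovered model
    unmarked : new scenario, model unchanged (assumed third case)
    """
    scenario = tuple(scenario)
    if scenario in log:
        return "red", log
    new_log = log + [scenario]
    changed = discover(new_log) != discover(log)
    return ("green" if changed else "unmarked"), new_log


def direct_pairs(log):
    """Stand-in for model discovery: the set of directly-follows pairs."""
    return {pair for trace in log for pair in zip(trace, trace[1:])}


log = []
colour1, log = mark_scenario(log, ["s", "a", "b", "e"], direct_pairs)
colour2, log = mark_scenario(log, ["s", "b", "a", "e"], direct_pairs)
colour3, log = mark_scenario(log, ["s", "a", "b", "e"], direct_pairs)
print(colour1, colour2, colour3)
```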
3. Interactive Model Generating Procedure

The demonstration and creation of a parallel business process model will be shown on the example of the process in Figure 1, which is our running example in this paper. This process example was found by internet search and is located at the address given in [15].

Figure 1. Accelerating data processing via parallelism.
When the program at the address given in [14] is started, the screen provides a space for entering the activities of the business process whose model is being created, as shown in Figure 2.

Figure 2. The initial layout of the graphical user interface.
In the initial layout of the graphical user interface, the activities that are performed first and last during the execution of the process are entered in the first and last place, respectively, which is consistent with the property of block-structured parallel processes of having one input and one output [7]. All other process activities can be entered in an arbitrary order, independent of the order in which they are executed. One possible order of entering the activities of the process from Figure 1 is shown in Figure 3.

Figure 3. One possible order of entering the activities of the observed process.
After the Continue key is pressed, the sequence of activities that the system proposes to be performed next (from left to right) appears on the screen; it is the same as the order in which the activities were initially entered (Figure 4). Since block-structured parallel processes have one input and one output, the system always places the first and the last process activity in the order of play as they were entered the first time.

Figure 4. Recommended order of activities for the first scenario of process execution.
By pressing the activities from the offered order in sequence, the user creates a scenario of execution of the process activities (Figure 5). When the first scenario of process execution is performed, the user is completely free to choose the order of the remaining activities (all except the first and last), independently of the offered order. Pressing the Undo key cancels the previous activity selection.

Figure 5. Performing the first scenario.
After the first scenario of process execution is completed, the screen shows the scenario and the process model obtained from it, which represents a candidate process model (Figure 6). The green colour that marks the scenario means that performing this scenario has led to a change in the model, which is expected, considering that this is the first scenario and the first candidate model.

Figure 6. A candidate process model corresponding to the first performed scenario.
After the Next button from Figure 6 is pressed, the screen shows the order of activities proposed by the system for the next scenario (Figure 7).

Figure 7. Suggested order of activities for performing the next scenario.
For the next performance, the system suggests performing the activities in the order reversed from the last performance. The user's task is then to follow this rule for selecting activities from the order proposed by the system:
From the proposed order of activities, always select the first activity (from left to right) whose execution is possible, and so on up to the last activity.
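The selection rule can be illustrated with the following sketch. In the tool it is the user who knows which activities can currently be executed; here that knowledge is modelled, purely as an assumption for the example, by a hypothetical mapping `requires` from each activity to the set of activities that must be completed before it. The function name `play_scenario` and the example process are likewise hypothetical.

```python
def play_scenario(proposed_order, requires):
    """Build a scenario by repeatedly taking, from left to right, the first
    not-yet-performed activity whose prerequisites are all completed."""
    remaining = list(proposed_order)
    done, scenario = set(), []
    while remaining:
        for activity in remaining:
            if requires.get(activity, set()) <= done:
                break  # first executable activity found
        else:
            raise ValueError("no remaining activity can be executed")
        remaining.remove(activity)
        done.add(activity)
        scenario.append(activity)
    return scenario


# Hypothetical process: s starts two parallel branches, a -> b and c,
# which are joined by e.
requires = {"a": {"s"}, "b": {"a"}, "c": {"s"}, "e": {"b", "c"}}
proposed_order = ["s", "c", "b", "a", "e"]  # reversal of s a b c e
scenario = play_scenario(proposed_order, requires)
print(scenario)
```

Here b is skipped at first because a has not yet been executed, so the resulting scenario differs from the proposed order while still following it as closely as the process allows.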
If performing the new scenario has led to a change in the model (as in this case), the scenario is marked in green and a new candidate process model is obtained (Figure 8) that supports both performed scenarios of process execution.

Figure 8. Candidate process model that corresponds to the performed scenarios.
The procedure continues by pressing the Next button, after which the user performs the next scenario by following the above rule for selecting activities from the order proposed by the system. If one of the already performed scenarios is repeated during the demonstration, the repeated scenario is highlighted in red (Figure 9). Because the candidate model already corresponds to all performed scenarios, the repeated scenario does not lead to any changes in the model.

Figure 9. The repeated scenario does not lead to changes in the model.
Repetition of a scenario means that, if the rule for selecting activities from the proposed order continues to be followed, other scenarios will be repeated as well, which can prevent some other scenarios from being performed. Consequently, any further candidate model to which those scenarios correspond could not be detected. Therefore, the appearance of a repeated scenario signals to the user not to follow the rule for selecting activities from the proposed order in the next demonstration, but to perform a scenario of their own choice (if possible, different from those already performed), as shown in Figure 10.

Figure 10. After a repeated scenario, the user performs a scenario of their own choice, not following the rule for selecting activities from the proposed order.
Furthermore, the procedure continues by following the rule for selecting activities from the proposed order until the next occurrence of a repeated scenario (marked in red), at which point we deviate from the rule and continue in the aforementioned way. If one of the scenarios played in this procedure leads to a change in the model, it is marked in green and a new candidate model is obtained, as shown in Figure 11.

Figure 11. The last performed scenario led to the discovery of a new candidate model.
The described demonstration procedure encourages the user to perform as many different scenarios of process execution as possible in order to create the final process model. However, the fact that many scenarios have been performed, whether different or repeated, does not mean that they will necessarily lead to changes in the model (Figure 12).

Figure 12. The final model of the observed process example.
With the described demonstration procedure, in the majority of the cases examined, the first appearance of a repeated scenario means that none of the following scenarios will lead to a change in the model, that is, the last obtained candidate model is in fact the final model of the process, which will be discussed later.
3.1. Layout of the Relations of the Modified PM Technique in the Demonstration Procedure
During the demonstration, after each performed scenario, all relations defined in the modified PM technique (Definition 1 [6], Definition 4 [7]) appear on the screen (Figure 13), as well as the footprint of the event log from which the relations were established (Figure 14), obtained on the basis of all the scenarios performed.
Figure 13 shows the layout of the relations obtained after the two played scenarios of execution of the observed process shown in Figure 8. In addition to the relations defined by the modified PM technique, the layout shows:
- NO_DIRECT $\rightarrow$: activities that do not have a direct successor,
- NO_DIRECT $\leftarrow$: activities that do not have a direct predecessor,
- INFERRED $\rightarrow_i$, INFERRED $\leftarrow_i$: causality relations that were not revealed by the performed scenarios but were subsequently inferred from the event log using Rule 1 or Rule 2, for activities that do not have a direct successor or predecessor, respectively.

Figure 13. The layout of the relations after the two performed scenarios shown in Figure 8.
Figure 14 shows the layout of the event log footprint after the two performed scenarios
of execution of the observed process shown in Figure 8. In addition to the other relations
of the modified PM technique obtained from the scenarios, the footprint contains the
causal relations, whether obtained directly from the performed scenarios or inferred by
applying Rule 1 and Rule 2, i.e. the causal relations listed under INFERRED, marked with
i and i.
Figure 14. The footprint of the event log after the two performed scenarios shown in Figure 8.
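As a rough illustration of what an event log footprint contains, the following sketch builds a footprint table from a list of scenarios. It uses the relations of the standard α-algorithm (causality `->`, reverse causality `<-`, parallelism `||`, no relation `#`) as an assumption; the modified PM technique of [6] and [7] and its inferred relations are not reproduced, and the function name `footprint` is hypothetical.

```python
from typing import Dict, Iterable, Sequence, Tuple


def footprint(traces: Iterable[Sequence[str]]) -> Dict[Tuple[str, str], str]:
    """Map every ordered pair of activities to its footprint symbol."""
    traces = [tuple(t) for t in traces]
    # Directly-follows pairs observed in any scenario.
    follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}
    activities = sorted({a for t in traces for a in t})
    table: Dict[Tuple[str, str], str] = {}
    for a in activities:
        for b in activities:
            ab, ba = (a, b) in follows, (b, a) in follows
            if ab and ba:
                table[(a, b)] = "||"   # seen in both orders: parallel
            elif ab:
                table[(a, b)] = "->"   # a causes b
            elif ba:
                table[(a, b)] = "<-"   # b causes a
            else:
                table[(a, b)] = "#"    # never adjacent
    return table


log = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]
fp = footprint(log)

# Print the table as a matrix, one row per activity.
names = sorted({a for a, _ in fp})
print("   " + " ".join(f"{b:>2}" for b in names))
for a in names:
    print(f"{a:>2} " + " ".join(f"{fp[(a, b)]:>2}" for b in names))
```

Recomputing the table after each newly performed scenario mirrors how the displayed footprint is said to be based on all scenarios performed so far.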
4. Results
Our experimental analysis was performed on a sample of 100 real examples obtained by an
arbitrary manual search of the Internet, selecting publicly available models of business
processes (or models similar to them) which fulfill our conditions for block-structured
models of parallel processes¹ [7]. The considered examples can be found at the address
given in [15].
Some characteristics that reflect the network structure and size of the analysed examples
are given in Tables 2 and 3 in [7]. These characteristics are expressed by the total
number of activities in the network and the number of branches in the network. Since a
block-structured parallel process has one input and one output and can be represented by
a tree structure [7], by "branch" we mean a direct route from the entrance to the exit of
the network. Therefore, in the example presented in Figure 1 there are three branches,
with activities:
- B1-B2-B3-B8-B9,
- B4-B5-B8-B9 and
- B6-B7-B9.
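The notion of a branch as a route from the entrance to the exit can be sketched by enumerating paths through a nested block structure. The encoding below is an assumption made for illustration: the tuple form `("seq", ...)` / `("par", ...)`, the function name `branches`, and the nesting of the example model, which is only inferred from the three branches listed above and not taken from Figure 1 itself.

```python
from typing import List, Tuple, Union

Node = Union[str, Tuple]


def branches(node: Node) -> List[List[str]]:
    """Enumerate entrance-to-exit routes of a block-structured model."""
    if isinstance(node, str):
        return [[node]]  # a single activity is its own route
    kind, *children = node
    if kind == "par":
        # Each parallel child contributes its own routes.
        return [path for child in children for path in branches(child)]
    if kind == "seq":
        # Routes of a sequence are concatenations of the children's routes.
        paths: List[List[str]] = [[]]
        for child in children:
            paths = [p + q for p in paths for q in branches(child)]
        return paths
    raise ValueError(f"unknown block type: {kind!r}")


# Assumed structure: (B1 B2 B3 || B4 B5) joins in B8; that block runs in
# parallel with B6 B7; everything joins in B9.
model = (
    "seq",
    ("par",
     ("seq", ("par", ("seq", "B1", "B2", "B3"), ("seq", "B4", "B5")), "B8"),
     ("seq", "B6", "B7")),
    "B9",
)

for route in branches(model):
    print("-".join(route))
# B1-B2-B3-B8-B9
# B4-B5-B8-B9
# B6-B7-B9
```

Counting the returned routes gives the number of branches used as a size characteristic of the network.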
The experimental analysis consisted of multiple different executions of the processes
given at the address under [15], tracking the results obtained by these executions, and
drawing conclusions from them. Across multiple different executions of the considered
parallel processes, after each performed scenario a candidate model that supports all the
performed scenarios is obtained; in addition, it was noticed that after a certain set of
performed scenarios the candidate model no longer changes. This means that the last
obtained candidate model supports not only the performed scenarios of process execution
but also the scenarios that could still be performed by the observed process, making it
the final process model.
¹ The models were found by searching the Web for the keywords: block-structured parallel
process, parallel business process, activity diagram, BPMN diagrams etc.
It was of particular interest to consider after how many scenarios the final model can be
reached, and on what this depends. Namely, multiple different executions of the same
process indicate that the final model of the observed process is not always reached after
the same number of performed execution scenarios. The obtained results showed that the
number of execution scenarios necessary for obtaining the final model depends, quite
expectedly, on the sequence of activities in the first scenario, given the structure of
the network. In trying to determine which feature of the first performed scenario
determines the number of scenarios needed to obtain the final model, we concluded that
what matters is whether all activities belonging to one branch are executed successively.
Accordingly, we adopted two different strategies for performing the processes from the
sample, and followed the results obtained:
Strategy 1. (Depth-first) In the first performed scenario, the execution order of
activities is such that all the activities of one branch are executed first, then all the
activities of another branch parallel to the previous one, and so on until all activities
from all parallel branches are executed.
Strategy 2. (Breadth-first) In the first performed scenario, the order of execution is
such that activities from different parallel branches are executed alternately.
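The difference between the two strategies can be sketched as two ways of ordering the activities of parallel branches in the first scenario. This is a simplified illustration under assumptions: the branches are treated as a flat set of parallel sequences, the join activities B8 and B9 of the example are left out, and the function names `depth_first` and `breadth_first` are hypothetical.

```python
from itertools import zip_longest
from typing import List, Sequence


def depth_first(branches: Sequence[Sequence[str]]) -> List[str]:
    """Strategy 1: finish one branch completely before starting the next."""
    return [activity for branch in branches for activity in branch]


def breadth_first(branches: Sequence[Sequence[str]]) -> List[str]:
    """Strategy 2: take one activity from each branch in turn."""
    skip = object()  # filler for branches that are already exhausted
    return [
        activity
        for step in zip_longest(*branches, fillvalue=skip)
        for activity in step
        if activity is not skip
    ]


parallel = [["B1", "B2", "B3"], ["B4", "B5"], ["B6", "B7"]]
print(depth_first(parallel))
# ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7']
print(breadth_first(parallel))
# ['B1', 'B4', 'B6', 'B2', 'B5', 'B7', 'B3']
```

Both orderings keep the order of activities within each branch; they differ only in whether the activities of one branch appear consecutively, which is the feature the text identifies as decisive.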
Strategy 1
By applying Strategy 1, the results of the experimental analysis showed that in 99
examples (out of the total of 100), the final model is always obtained after only two
different performed scenarios (marked in green). This means that the first appearance of
a repeated scenario (marked in red) indicates that the obtained candidate model is
actually the final model as well. In only one case was it necessary, after the first
appearance of a repeated scenario, to continue with the procedure given in the
description of the demonstration flow in order to reach the final model.
The model of the demonstrative example shown in Figure 1 can also be obtained after only
two different performed scenarios if Strategy 1 is followed when performing the first
scenario, as shown in Figure 15. Figure 15 shows that the appearance of the first repeated
scenario (marked in red) indicates that the final model of the considered process has
been discovered. Figure 15 also shows that the final model of the considered process is
the same as the model shown in Figure 12, although different scenarios had to be
performed to obtain each of them.
Figure 15. The final model of the demonstrative process example obtained by performing
scenarios according to Strategy 1.
Strategy 2
By applying Strategy 2, the results of the experimental analysis showed that in 88
examples (out of the total of 100), the final model of the process is always obtained
after only three different successively performed scenarios (marked in green). This means
that the first appearance of a repeated scenario (marked in red) indicates that the
candidate model is in fact the final process model as well. In 12 cases it was necessary,
after the first appearance of a repeated scenario, to continue with the procedure given
in the description of the demonstration flow in order to reach the final process model.
The total number of performed scenarios needed to obtain the final model in these 12
examples depends on the order in which the user happens to perform the scenarios in the
described procedure, and ranges from five scenarios upwards.
5. Discussion
As can be seen from the demonstration procedure shown, when the demonstration example of
the process shown in Figure 1 is executed according to Strategy 2, the appearance of the
first repeated scenario is not an indication that the final model has been discovered.
Only after the sixth scenario was performed did the appearance of the second repeated
scenario mean that the final model of the considered process had been discovered. The
demonstration example from Figure 1 therefore belongs to the group of 12 examples from
the experimental analysis, executed with Strategy 2, for which further different
scenarios of process execution had to be performed after the first repeated scenario in
order to obtain the final process model.
Thus, the results of the experimental analysis performed on 100 examples of parallel
business processes have shown:
- using Strategy 1, in 99% of the considered examples, the final model is obtained at the
occurrence of the first repeated scenario of process execution, which takes place after
two performed scenarios,
- using Strategy 2, in 88% of the considered examples, the final model is obtained at the
occurrence of the first repeated scenario of process execution, which takes place after
three performed scenarios.
From this it can be concluded that the final model of a parallel business process whose
execution is demonstrated can be reached after a very small number of performed scenarios
(usually 2-3 scenarios). In addition, at any time the obtained candidate model correctly
reflects the structure and behavior expressed by all previously demonstrated process
execution scenarios.
6. Conclusions
This paper has presented the use of elements of the programming by demonstration
technique in the area of process mining for the interactive construction of models of
block-structured parallel business processes. The idea was that the user performs
possible scenarios of business process activity execution, and that the system generates
from them a process model (candidate model) that corresponds to the demonstrated process
execution. In addition, the goal was to provide, on the basis of the demonstrated
behavior, a generalized specification of the process model (final model) that corresponds
to the desired behavior of the process. The problem of noncompliance between the model
and the actual system behaviour, which is evident in hand-made models, is overcome by the
interactive creation of the process model. A particular advantage of the presented way of
creating models is that the last created model always supports all previously performed
scenarios, so that it is always a valid model of the system behaviour presented so far.
This provides many possibilities for modification and expansion of the system, since the
model always adapts to each newly performed scenario.
To accomplish our goal, a graphical user interface was created, through which a user
demonstrates different scenarios of process execution. In realizing the idea of
generating the parallel business process model from the demonstrated scenarios, the
ability of the system to use heuristics and to infer relations between activities that
were not performed is of key importance.
The paper has the potential to make two relevant contributions. The first is the
description of a tool that visually shows the steps of the α||-algorithm. Such a tool
could serve as a learning tool and playground for those who want to learn more about how
the much better known and more general α-algorithm, which is based on the same
principles, works. This was achieved through the idea shown in the paper, in which a user
enters log entries (scenarios) step by step and observes the current model as it is
presented.
The second contribution is related to the experimental analysis shown in the paper, which
set out to find how many log entries are needed to obtain a model that is the same as the
original model, which would lead to fast discovery of processes. The results of the
experimental analysis showed that a very small number of log entries is required, which
makes those event logs (weakly complete event logs) very small. This supports the
assertion made in [6] that weakly complete event logs can be significantly smaller than
the complete event logs used by the α-algorithm. The assertion is also supported by
another, completely different type of experimental analysis performed with the ProM tool
[16], whose statistically processed results, together with a detailed description of
weakly complete event logs, are presented in our paper that is in the process of
publication.
The research presented in this paper was focused on a specific class of business
processes, namely block-structured parallel business processes. Our future work will
focus on extending the idea of creating a process model through demonstration to any type
of process. This will bring us back to process mining, where we will try to find an
algorithm and event logs that would allow us to achieve similar effects with any type of
process. The graphical user interface could then be modified in accordance with the
results obtained.
Supplementary Materials: The following are available online at
https://drive.google.com/drive/u/0/folders/0B7gNCSuMP3pKd1JfZzNHa0N2OE0
https://julijanagraph.000webhostapp.com/
https://drive.google.com/drive/u/0/folders/1pxSgKZpnpcmEZcsPOLFu--xvT4N_1H03
Funding: This research received no external funding.
Data Availability Statement: The data cited in this manuscript are available from the
published papers or from the corresponding author.
Conflicts of Interest: The authors declare no conflict of interest associated with this manuscript.
References
1. Lieberman, H.; Selker, T. Agents for the User Interface, in Handbook of Agent Technology, ed.; Bradshaw, J.; MIT Press, to appear.
2. Aalst van der, W.M.P. Process Mining: Discovery, Conformance and Enhancement of Business Processes, Springer-Verlag, Berlin, 2011.
3. Aalst van der, W.M.P.; Weijters, A.J.M.M.; Maruster, L. Workflow Mining: Discovering Process Models from Event Logs, IEEE Transactions on Knowledge and Data Engineering 2004, Volume 16(9):1128-1142.
4. Aalst van der, W.M.P. The Application of Petri Nets to Workflow Management, Journal of Circuits, Systems and Computers 1998, Volume 8(1):21-66.
5. Aalst van der, W.M.P. Verification of Workflow Nets, in Application and Theory of Petri Nets, 2nd ed.; Azema P., Balbo G., Eds.; pp. 407-426, Berlin: Springer-Verlag, 1997.
6. Lekic, J.; Milicev, D. Discovering Models of Parallel Workflow Processes from Incomplete Event Logs, in Proc. of the 3rd International Conference on Model-Driven Engineering and Software Development (MODELSWARD-2015), pages 477-482.
7. Lekic, J.; Milicev, D. Discovering Block-Structured Parallel Process Models from Causally Complete Event Logs, Journal of Electrical Engineering 2016, Volume 67, Issue 2, Pages 111–123, ISSN 1339-309X, DOI: 10.1515/jee-2016-0016.
8. Shen, E.; Lieberman, H.; Lam, F. What Am I Gonna Wear: 3, in Proc. of International Conference on Intelligent User Interfaces (IUI-07), Honolulu, January 2007.
9. Harel, D.; Marelly, R. Come, Let's Play: Scenario-Based Programming Using LSCs and the Play-Engine, Springer-Verlag, 2003.
10. Lieberman, H.; Amant, R.St.; Potter, R.; Zettlemoyer, L. Visual Generalization in Programming by Example, Communications of the ACM 2000. Also in Lieberman, H. Your Wish is My Command, ed.; Kaufmann, M.; 2001.
11. Billard, A.; Callinon, S.; Dillmann, R.; Schaal, S. Robot programming by demonstration, in Handbook of Robotics, 2nd ed.; Siciliano, B.; Khatib, O., Eds.; Springer, New York, NY, USA, 2008 (Chapter 59).
12. Argall, B.D.; et al. A survey of robot learning from demonstration, Robotics and Autonomous Systems 2009, Volume 57, Issue 5, pages 469–483.
13. Leemans, S.J.J.; Fahland, D.; Aalst van der, W.M.P. Discovering Block-structured Process Models from Incomplete Event Logs, in Applications and Theory of Petri Nets, 2nd ed.; Ciardo, G.; Kindler, E., Eds.; Volume 8489 of Lecture Notes in Computer Science, pages 91-110. Springer-Verlag, Berlin, 2014.
14. https://julijanagraph.000webhostapp.com/
15. https://drive.google.com/drive/u/0/folders/0B7gNCSuMP3pKd1JfZzNHa0N2OE0
16. Aalst van der, W.M.P.; van Dongen, B.F.; Günther, C.W.; Mans, R.S.; Alves de Medeiros, A.K.; Rozinat, A.; Rubin, V.; Song, M.; Verbeek, H.M.W.; Weijters, A.J.M.M. ProM 4.0: Comprehensive Support for Real Process Analysis, in Application and Theory of Petri Nets and Other Models of Concurrency (ICATPN 2007), 2nd ed.; Kleijn, J.; Yakovlev, A., Eds.; Volume 4546 of Lecture Notes in Computer Science, pages 484-494. Springer-Verlag, Berlin, 2007.

Copyright: © 2021 by the authors.

Symmetry 2021, 13, x. https://doi.org/10.3390/xxxxx www.mdpi.com/journal/symmetry
