Thesis: Testing in Scheduling Problems for Information Retrieval
Ioannis Samaras
Student number: 2577900
31 July 2016
Thesis committee:
Thesis supervisor: Dr. Ir. R.A. Sitters
Second Reader: Dr. D.A. van der Laan
Abstract
This thesis focuses on the two-machine flow shop problem with unknown delays. This is a problem
often found in the design of manufacturing facilities, where the equipment can be ordered only
after its specifications are known with full confidence. In this industry, the management uses the
Critical Path Method (CPM), which, given the stochastic nature of the problem, can lead to
undesired prolongation of the project duration. For this reason, the idea of testing some of the
jobs is proposed as an exploration-through-exploitation concept.
Keywords: Two-machine flow shop with unknown delays, scheduling with testing, unknown
time lags, exploration through exploitation, information retrieval
Contents
1 Introduction 4
1.1 Motivation for this thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Some details about the various project phases . . . . . . . . . . . . . . . . . . . 5
1.3 What happens in reality and what can be achieved . . . . . . . . . . . . . . . . . 5
1.4 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Literature Review 6
2.1 Flow shops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Main categories of scheduling problems . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Learning and Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 Model Description 9
3.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1.1 Deterministic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1.2 Stochastic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1.3 Stochastic Model with Testing . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Model Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4 Preliminaries 11
4.1 Lower Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4.2 Theorems and Observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.3 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 Local search properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
6 Modeling 27
6.1 Project phases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
6.2 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
6.2.1 Free Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
6.2.2 Costly Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
7 Computational Results 31
7.1 Deterministic Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
7.1.1 Equal Processing Times . . . . . . . . . . . . . . . . . . . . . . . . . . 32
7.1.2 General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
7.1.3 Deterministic problem - Discussion . . . . . . . . . . . . . . . . . . . . 35
7.2 Stochastic Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.2.1 Normal processing times at WS1 and equal processing times at WS2 . . . 35
7.2.2 General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
7.2.3 Stochastic problem discussion . . . . . . . . . . . . . . . . . . . . . . . 42
7.3 k-free Testing vs Costly Testing . . . . . . . . . . . . . . . . . . . . . . . . . . 42
7.3.1 Equal Processing Times . . . . . . . . . . . . . . . . . . . . . . . . . . 43
7.3.2 General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
7.3.3 Testing discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Abbreviations
CPM Critical Path Method
LB Lower Bound
WS Work Station
1 Introduction
1.1 Motivation for this thesis
In the field of chemical and pharmaceutical manufacturing, the introduction of new products is
necessary for a company's innovation and sustainability. New products also bring new challenges
with respect to the expansion or modification of the current production facilities, and more
specifically the production lines, making facility design an absolutely essential operation. The
planning and scheduling functions in a company rely mostly on the critical path method (CPM) to
allocate limited resources to the activities that have to be done. This allocation of resources has
to be done in such a way that the company optimizes its objectives and achieves its goals. These
resources can be engineering teams, licenses for computer-aided-engineering software, the project
budget, equipment or machinery that can be re-used from other finished projects, etc. The above-mentioned
planning also defines the deadlines of the projects according to the organization's objectives,
often ignoring details from a microscopic view of the project. The following example
illustrates the role of planning and scheduling in a real industrial situation, namely in the manage-
ment of large construction and installation projects that consist of many stages. Consider a phar-
maceutical company that intends to produce a new drug (e.g. liquid form). The project involves
a number of distinct tasks including electrical and instrumentation design, the process design, the
procurement of instruments and equipment, the construction phase, the programming phase
(implementation) and the documentation phase. A precedence relationship structure exists among these
tasks: some can be done in parallel (concurrently), whereas others can only start when certain
predecessors have been completed. The goal is to complete the entire project in minimum time,
in other words to minimize the maximum completion time also known as the makespan. Planning
and scheduling provide a coherent process to manage the project as well as a good estimate for
its completion time but sometimes fail to reveal which tasks are critical and determine the actual
duration of the entire project. This failure is attributed mostly to the outsourcing policy that such
organizations employ. For example, the engineering teams for electrical & instrumentation design
and process design are outsourced, whereas the initial planning (e.g. deadlines, resources and
budget), the coordination and supervision are performed by the organization. A good approximation
of such a project is shown in the following figure.
1.2 Some details about the various project phases
Table 1 shows possible phases for a project in the design of manufacturing facilities. It is
important to mention at this point that the graph in Figure 1 will not be used explicitly in this
thesis, but only two of its nodes, for reasons explained in the following section. Therefore, Table 1
primarily intends to introduce one instance of a real application, which was the motivation for this
thesis. In addition, since the characteristics of the possible phases are realistic, we emphasize
that in order to build a basic model one has to make the assumptions and simplifications
mentioned in later sections.
Table 1: Overview of possible project phases

S (Basic Engineering): Provides the basis (the basic engineering design, in engineering terms) for
the jobs in A and B. It also contains information about the processing times of the jobs in A and B
with 30% precision. However, this is not entirely an optimistic-pessimistic approach but rather a
set of assumptions about additional work (jobs that might need to be reprocessed due to engineering
refinements).

A (Electrical Engineering): The processing times are known with 30% precision, whereas the release
dates are equal to 0. In principle, the duration of the same job in A is less than or equal to its
processing time in B.

B (Process Engineering): The processing times are known with 30% precision, whereas the release
dates are 0. The sequencing policy of the jobs in B is defined by priority rules that are out of
the scope of this thesis. The machines in B cannot be idle. The processing times are unknown until
the jobs are fully processed.

C (Automation Specification): The processing times are known with 30% precision, whereas the
release date of each job depends on its completion time in B. The processing times are unknown
until the jobs are fully processed.

D (Montage): The processing times are known with 30% precision, whereas the release date of each
job depends on its completion time in B plus the delay, which is the lead (delivery/transportation)
time of the equipment. The delay of each job remains unknown until its completion in B.

E (Automation): The processing times are known, whereas the release date of each job depends on its
completion time in D.

G (Safety Documentation): The processing times are known, whereas the release date of each job
depends on its completion time in C.

F (Final Documentation): The processing times are known, whereas the release date of each job
depends on its latest completion time in E or G.
parallel). When job j is finished in B, there is a minimum delay before it can then be processed
by D. This delay becomes known only when the job is completed in B, and as a result the earliest
possible release time rj for phase D can be determined. Moreover, there is constant communication
between the teams in A and B, which is why it is assumed that if a job stops being processed by a
machine in A, it also stops being processed by a machine in B, and vice versa. This also implies
that preemption is allowed and that every stop is considered a preemption. The processing times in
D become known as soon as job j is finished in A (they are also known with 30% precision once S is
finished). Summarizing, and in an attempt to also state the problem: the minimum time that is
needed between B and D is unknown and can lead to undesired prolongation of the completion time of
the project. The fact that these delays between B and D are typically longer than the processing
times suggests that the critical path is the path S-B-D-E-F. The aim of this thesis is to focus on
the two most significant nodes of the graph (B and D), which constitute a two-machine flow shop
problem with minimum delays. The reason is that every reduction in the completion time of this
flow shop is an actual reduction in the completion time of the whole project. We will attempt to
use testing to reveal information about the delays and assess whether this is a strategy that can
be employed when information on the delays is missing. It will also be investigated whether the
initial policy of scheduling the jobs according to EDD, usually equivalent to the LPT schedule,
can be replaced by another scheduling rule.
1.4 Outline
The main characteristic of our approach is the concept of exploration through exploitation, which
consists of testing some of the jobs in B in order to learn their delays and employ another
scheduling policy that leads to a shorter completion time. The notion of testing is also the
novelty of this thesis, and it can be applied in the realistic project management scenario above.
Because not much work has been done in the field of testing in scheduling problems, we consider a
single machine for each stage of the flow shop. Section 2 provides a detailed literature review to
build up the necessary theoretical background for this scheduling problem and its innovative
approach. In Section 3 we give a formal description of the basic problem, which is divided into three
distinct models. Afterwards, in Section 4 we define the lower bounds and we give some theorems
and general observations about the three models. Moreover, we present the optimal algorithm for
a special case of the deterministic problem, and we propose five algorithms that are believed to
produce near-optimal schedules. In an attempt to extend some of the observations to more complex
settings we also give three properties of local search. In Section 5 the problem is also formulated
as an Integer Linear Program aiming to obtain the optimal solution of every instance. After that,
in Section 6 we show how the industrial application described in the introduction was modeled
as a two-machine flow shop with delays for two cases, namely when testing is for free and when
testing is costly including a detailed cost function. We also show how the proposed algorithms
are modified to cope with the stochastic version and the two cases of testing. Section 7 is devoted
to the results obtained by simulation for various cases and different distributions for the delays.
Finally, this thesis concludes with a general discussion of the findings and suggestions for further
research in Section 8.
2 Literature Review
2.1 Flow shops
The flow shop is one of the most classic scheduling problems, whose conception traces back to the
1950s. However, one of the main assumptions of those problems has always been that the time
required to move a job from one workstation to another is negligible (Johnson, 1954). This
assumption might hold in some situations, but according to the description of the facility design,
a "transportation" time is required for each and every job. Our problem is the
two-machine flow shop with minimum delays. Nawijn and Kern (1991) study a single-machine
problem with two operations per job and intermediate minimum delays, which is equivalent to
the two-machine flow shop problem with minimum delays. They show that the problem is NP-hard
in the ordinary sense if the solution space is not restricted to permutation schedules. The
result is strengthened to NP-hardness in the strong sense for F2|dj|Cmax (Yu et al., 2004), for
F2|dj, p1j = p2j|Cmax (Dell'Amico, 1996), for F2|dj ∈ {0, d}, p1j = p2j|Cmax (Yu, 1996), and
finally for F2|dj, p1j = p2j = 1|Cmax (Yu et al., 2004). Dell'Amico (1996) proposed several
lower bounds, which were used in the derivation of several polynomial 2-approximation
algorithms for the makespan objective. In the same paper, he also proposed a tabu search algorithm
that produced good results. Orman and C. Potts (1997) study the problem of minimizing the idle
time of a radar, which they formulate as a single-machine scheduling problem with exact rather
than minimum delays. They identify some special cases that are polynomially solvable. The
counterpart of the above problem, in which the delay constraints are restricted to be "maximum", has
also been studied in the literature. Yang and Chern (1995) show that the problem to minimize
the makespan objective is NP-hard and propose a branch and bound algorithm. Another closely
related problem is the proportionate two-machine flow shop (p1j = p2j), for which Ageev (2007)
improved the results of Dell'Amico, giving a 3/2-approximation for this special case. The latest
development in the two-machine flow shop problem with exact delays was made by Leung et al.
(2007), who focused on the makespan and total completion time objectives. For the
makespan objective they show that the problem is strongly NP-hard even if there are only two
possible delay values. They showed that some special cases are solvable in polynomial time. They
also designed approximation algorithms for the general case and some NP-hard special cases. It
is also noteworthy that they showed that the optimal schedule for the problem F2|dj|Cmax does
not have to be a permutation schedule. This agrees with the findings of C. N. Potts et al. (1991),
according to which, for the problem of minimizing the maximum completion time, the value of the
best permutation schedule can be worse than that of the true optimal schedule by a factor of more
than (1/2)√m, where m is the number of machines. In another paper, Strusevich and Zwaneveld (1994)
consider the two-machine flow shop with setup, processing, and removal times separated. They show
that there may not exist an optimal solution that is a permutation schedule, and that the problem
is NP-hard in the strong sense. Furthermore, long before all these problems were mentioned, the
first to introduce the notion of delays were Johnson (1958), Mitten (1959), Nabeshima (1963) and
Szwarc (1968). They refer to them as time lags, which can be viewed as processing times of a
non-bottleneck machine in between the two workstations, thus comprising a special case of an
F3||Cmax problem. So one could solve the problem by applying Johnson's rule to the processing
times (p1j + dj, dj + p2j). More specifically, Johnson proposed that for two jobs i and j:
job i precedes job j if min{p1j + dj, di + p2i} > min{p1i + di, dj + p2j}.
With respect to Johnson's solution of the F3||Cmax problem and the definition of a non-bottleneck
machine, Kamburowski (2000) summarized all the progress regarding non-bottleneck machines.
According to that paper, a middle machine can be perceived as a non-bottleneck machine and
Johnson's algorithm is applicable under any of the following conditions:
1. Johnson (1954): p1j ≥ di or p2i ≥ dj for all i ≠ j, which basically implies that the middle
machine is dominated by machine 1 or machine 3.
3. Monma and Rinnooy Kan (1983): λ·p1j + (1 − λ)·p2i ≥ dk for all i, j and k, and some λ,
0 ≤ λ ≤ 1.
4. Kamburowski (2000): λ(p1j − di) + (1 − λ)(p2i − dj) ≥ 0 for all i ≠ j and some λ,
0 ≤ λ ≤ 1.
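This classical reduction can be sketched in code. The example below is illustrative rather than taken from the thesis: the instance is invented, only permutation schedules are evaluated (the setting in which the rule is optimal), and the order produced by Johnson's rule on the surrogate times (p1j + dj, dj + p2j) is checked against brute force:

```python
from itertools import permutations

def perm_makespan(jobs, order):
    """Makespan of a permutation schedule for a two-machine flow shop with
    minimum time lags: job j may start at WS2 no earlier than C1j + dj,
    and both stations process the jobs in the same sequence."""
    t1 = t2 = 0
    for i in order:
        p1, d, p2 = jobs[i]
        t1 += p1                   # completion time at WS1
        t2 = max(t2, t1 + d) + p2  # earliest feasible start at WS2, then process
    return t2

def johnson_with_lags(jobs):
    """Johnson's rule applied to the surrogate times (p1 + d, d + p2)."""
    surro = [(p1 + d, d + p2, i) for i, (p1, d, p2) in enumerate(jobs)]
    front = sorted([s for s in surro if s[0] <= s[1]], key=lambda s: s[0])
    back = sorted([s for s in surro if s[0] > s[1]], key=lambda s: -s[1])
    return [i for _, _, i in front + back]

jobs = [(2, 3, 4), (1, 7, 2), (3, 1, 3)]  # (p1, d, p2), invented data
best = min(perm_makespan(jobs, list(o)) for o in permutations(range(3)))
order = johnson_with_lags(jobs)
print(order, perm_makespan(jobs, order), best)  # Johnson matches the optimum
```

For this instance the rule returns the sequence [2, 0, 1] with makespan 15, which brute force confirms is the best permutation schedule.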
its relative priority might not be known exactly. In a recent article, Levi et al. (2015) identify that
there are cases, where collecting more information on a job requires the allocation of the same
resources used to process the job. This gives rise to operational trade-offs of exploration versus
exploitation, specifically, how to dynamically allocate resources between diagnostic work called
testing that aims to collect more information on the arriving jobs, and processing work called
working, which simply serves the jobs (customers) in the system. In the same article, they are the
first to introduce this new class of problems that capture exploration-versus-exploitation trade-offs
in service environments. Moreover, they provide a structural analysis after formulating the
problem as a high-dimensional Dynamic Program and give a characterization of optimal policies.
Apart from this article, and despite the wide spectrum of problems that has been explored in the
area of scheduling, the topic of testing itself seems not to have received attention.
3 Model Description
This section gives detailed information regarding the three models used in this thesis. These are
the deterministic model, the stochastic model, and the stochastic model with testing. Before
breaking down the three models we state their common basis.
following:
- There is unlimited capacity between the two work stations (no blocking effect).
- Non-permutation schedules are allowed (the jobs are not required to follow the same sequence in
the two WS).
- The actual value of dj is revealed only after job j is processed at WS1 (O1j is completed).
- There is unlimited capacity between the two work stations (no blocking effect).
- Non-permutation schedules are allowed (the jobs are not required to follow the same sequence in
the two WS).
- The actual value of dj can also be revealed, at some cost, after job j is tested instead of
processed.
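Under these assumptions, a minimal evaluator (a sketch, not code from the thesis; the instance data are hypothetical) computes Cmax for a given WS1 sequence, revealing each dj on completion at WS1 and serving WS2 in FCFS order of the realized release times:

```python
def fcfs_makespan(p1, d, p2, order):
    """Cmax for the two-machine flow shop with minimum delays:
    WS1 processes jobs in `order`; each delay d[j] is revealed when O1j
    completes; WS2 serves jobs FCFS by release time r_j = C1j + d[j]."""
    t1, release = 0, {}
    for j in order:
        t1 += p1[j]
        release[j] = t1 + d[j]    # earliest possible start at WS2
    t2 = 0
    for j in sorted(order, key=release.get):  # FCFS at WS2
        t2 = max(t2, release[j]) + p2[j]
    return t2

# hypothetical 2-job instance
p1, d, p2 = [2, 1], [5, 2], [1, 1]
print(fcfs_makespan(p1, d, p2, [0, 1]))  # → 8
```

Note that FCFS at WS2 makes the schedule non-permutation in general, in line with the assumptions above.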
The testing assumption corresponds to the investment of an additional amount of time in the job
(some percentage of its processing time p1j at WS1) and consequently the retrieval of some
information about its delay prior to its actual processing. The most common example to describe the
basic two-machine flow shop without delays is that of the paint shop. Consider a paint shop
and a building with multiple levels. Every level is a job, which has to undergo two operations:
sanding and painting. If it were just for this problem, it would be solved optimally by Johnson's rule.
Now consider that every level has to be painted with a specific color, and this color can be
specified and purchased only after the sanding is done. Assume that not all the colors are immediately
available so their delivery times are the delays. Testing one job would mean to decide on the color
in advance (before sanding) and then contact the color supplier to learn its approximate delivery
time. The amount of time spent on this task is called Cost of Testing (CoT), and its cost function
for a job j can be defined as follows:

CoTj = αj · p1j,

where αj is a percentage, different for each job, that depends, in general, on the difficulty level
of each job. In our model, it is assumed that testing occurs before processing the jobs at WS1,
which implies that the jobs are available for processing at WS1 at time Σ_{j=1}^{n} CoTj instead of
zero, as in the other two models. Note that the jobs are available for processing at WS1 at time
zero if the testing is for free. Of course, the cost function can be defined appropriately for
every different problem, as is done in Section 6.2.2 for the project management problem mentioned
in the beginning. For instance, for the paint shop example αj might be a constant α, the same for
all jobs, and thus CoTj = α · p1j.
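As a small numeric illustration (the percentage values and processing times below are hypothetical), the cost of testing, defined as a percentage of each job's WS1 processing time, and the resulting earliest start at WS1 can be computed as:

```python
def cost_of_testing(alpha, p1):
    """CoT_j = alpha_j * p1_j for every job j (alpha_j: per-job percentage)."""
    return {j: alpha[j] * p1[j] for j in p1}

# hypothetical jobs with WS1 processing times and testing percentages
p1 = {"a": 4.0, "b": 2.0, "c": 6.0}
alpha = {"a": 0.25, "b": 0.50, "c": 0.10}

cot = cost_of_testing(alpha, p1)
start_ws1 = sum(cot.values())  # jobs become available at WS1 at the sum of all CoT_j
print(cot, start_ws1)
```

With free testing, start_ws1 would simply be 0, matching the other two models.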
All in all, the assumption of testing is the novelty of this model and this thesis, since it provides
an alternative way to tackle the stochastic version of the problem and introduces a new category
of scheduling problems.
4 Preliminaries
In this section, the theoretical background for dealing with the problem is built up, and several
observations are made for the general case as well as for special cases. Moreover, some algorithms
that are expected to perform near-optimally are described.
Moreover, there is another job with max_j{p1j + dj}, and let this job be scheduled first at WS1.
Ideally this job would finish last at WS2, suggesting that all other jobs have started later at WS1
but completed earlier at WS2 after the FCFS rule was applied. This gives the second LB, equal to:

LB2 = max_j {p1j + dj + p2j}   (4)
Finally, some job j is scheduled last at WS1 and, in the best case, this job also has delay
dj = dmin. Assuming that this job is completed last at WS2, this gives another LB with significant
value, especially for the case where p2j = p or d is uniformly distributed, which is equal to:

LB3 = Σ_{j=1}^{n} p1j + dmin + min_j p2j   (5)
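The lower bounds of this subsection can be collected in a short sketch. The exact algebraic form of LB2 and LB3 below is a reconstruction consistent with the unit-processing-time evaluation used in Section 4.2, and the instance data are hypothetical:

```python
def lower_bound(p1, d, p2):
    """Three simple lower bounds on Cmax for F2 with minimum delays:
    LB1: the earliest any job can reach WS2, plus all of the WS2 work;
    LB2: some job must traverse WS1, its delay, and WS2;
    LB3: all WS1 work, plus the smallest delay and a last WS2 operation."""
    n = len(p1)
    lb1 = min(p1[j] + d[j] for j in range(n)) + sum(p2)
    lb2 = max(p1[j] + d[j] + p2[j] for j in range(n))
    lb3 = sum(p1) + min(d) + min(p2)
    return max(lb1, lb2, lb3)

# hypothetical unit-time instance with delays spaced by 2
p1 = p2 = [1, 1, 1, 1]
d = [7, 5, 3, 1]
print(lower_bound(p1, d, p2))  # → max{1+1+4, 1+7+1, 4+1+1} = 9
```

On unit-time instances all three bounds reduce to the expressions used later in the proof of Theorem 2.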
C1a = p.
C1b = p + p = 2p.
ra = dL + p.
rb = dS + 2p.
There are three possible cases:
(i) ra = rb,
which implies that dL + p = dS + 2p ⇔ dL = dS + p. Then it does not matter which job comes
first at WS2. Let that job be a. As a result:
C2a = ra + p = dL + p + p = dL + 2p. Obviously, rb < C2a, so b is processed without idle time.
Cmax = C2b = C2a + p = dL + 2p + p = dL + 3p = C′max.
(ii) ra < rb,
which implies dL + p < dS + 2p ⇔ dL < dS + p. Since ra < rb, a can be processed first at WS2,
and it is not known whether there is going to be idle time after that or not. So for WS2 we have:
C2a = ra + p = dL + p + p = dL + 2p.
C2a − rb = dL + 2p − dS − 2p = dL − dS > 0, so again no idle time at WS2.
Cmax = C2b = max{rb, C2a} + p = C2a + p = dL + 2p + p = dL + 3p = C′max.
(iii) ra > rb,
which implies that dL + p > dS + 2p ⇔ dL > dS + p. In this case b is processed first at WS2
and it is unknown whether there will be idle time in between.
C2b = rb + p = dS + 2p + p = dS + 3p.
C2b − ra = dS + 3p − dL − p = dS − dL + 2p = (dS + p) − dL + p < dL − dL + p = p.
We need to distinguish cases again, since we cannot determine whether there is idle time just from
the inequality C2b − ra < p.
- C2b ≥ ra:
Cmax = C2a = max{C2b, ra} + p = C2b + p = dS + 3p + p = dS + 4p < C′max
(dS + 4p < dL + 3p ⇔ dS + p < dL).
- C2b < ra:
Cmax = C2a = max{C2b, ra} + p = ra + p = dL + p + p = dL + 2p < C′max.

Figure 4: ra > rb without idle time.
Figure 5: ra > rb with idle time.

We observe that Cmax ≤ C′max, which means it is optimal to schedule a before b, as claimed
initially.
Example 1. The following shows that Theorem 1 cannot be generalized to more than 2 jobs;
in other words, the rule longest delay first is not optimal (see Figure 6). In the first schedule

Table 2: Example instance where the assumption of LDF optimality does not hold

the jobs at WS1 are scheduled according to the rule longest delay first, giving Cmax = 10. The
reason for this is that with this schedule rA = rB = rC = rD = 6. Another scheduling rule could
be to schedule the jobs according to their delays in pairs, LPT-SPT. In detail, the longest delay
comes first, but after it follows a job with short delay, so that the jobs are scheduled as early as
possible, thus releasing jobs at different times. For the given instance, the LPT-SPT schedule also
gives the optimal solution, Cmax = 9.
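The contents of Table 2 are not reproduced in the text above, but a hypothetical unit-time instance with delays (5, 4, 3, 2) for jobs A, B, C, D is consistent with every number reported in the example: under LDF all jobs are released at time 6 and Cmax = 10, while the LPT-SPT pairing gives Cmax = 9:

```python
p1 = p2 = [1, 1, 1, 1]   # unit processing times (assumed)
d = [5, 4, 3, 2]         # hypothetical delays for jobs A, B, C, D

def cmax(order):
    """Cmax with the given WS1 order and FCFS at WS2."""
    t1, rel = 0, {}
    for j in order:
        t1 += p1[j]
        rel[j] = t1 + d[j]
    t2 = 0
    for j in sorted(order, key=rel.get):
        t2 = max(t2, rel[j]) + p2[j]
    return t2

print(cmax([0, 1, 2, 3]))  # LDF: every job is released at time 6 → 10
print(cmax([0, 3, 1, 2]))  # LPT-SPT pairing of delays → 9
```

Staggering the releases is exactly what lets WS2 start earlier under LPT-SPT.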
Theorem 2. For the deterministic problem and for every instance with unit processing times at
both work stations (p1j = p2j = 1), where consecutive delays also differ by at least 2
(dj − dj+1 ≥ 2, with the jobs indexed so that d1 > d2 > ... > dn), it is optimal to schedule the
jobs according to the rule longest delay first (LDF) at WS1 and apply the rule First Come First
Served (FCFS) at WS2.
Proof. We will prove that this algorithm gives Cmax = LB. We assume that the jobs are indexed
and sorted in an order such that dn < dn−1 < ... < d2 < d1. From the LB definition in Section
4.1 we can determine the LB for this instance as:

LB = max{ min_j{p1j + dj} + Σ_{j=1}^{n} p2j , max_j{p1j + dj + p2j} , Σ_{j=1}^{n} p1j + dn + p2n }
   = max{1 + dn + n, 1 + d1 + 1, n + dn + 1} = max{1 + dn + n, 1 + d1 + 1}.

According to LDF we schedule job 1 first, then job 2, and so on. As a result, we get:
C11 = p11, C12 = p11 + p12, ... , C1(n−1) = p11 + p12 + ... + p1(n−1),
C1n = p11 + p12 + ... + p1n = Σ_{j=1}^{n} p1j.
The release dates then become:
r1 = C11 + d1, r2 = C12 + d2, ... , rn−1 = C1(n−1) + dn−1, rn = C1n + dn,
or equivalently:
r1 = p11 + d1, r2 = p11 + p12 + d2, ... , rn−1 = p11 + p12 + ... + p1(n−1) + dn−1,
rn = p11 + p12 + ... + p1n + dn = Σ_{j=1}^{n} p1j + dn.
Moreover, it can be proved that:

rn < rn−1,   (7)

since rn − rn−1 = p11 + p12 + ... + p1n + dn − (p11 + p12 + ... + p1(n−1) + dn−1) =
p1n + dn − dn−1 = 1 + dn − dn−1 ≤ −1, because dn−1 − dn ≥ 2.
Due to (7), and because d1 − dn ≥ 2(n − 1) ≥ n − 1 implies 1 + d1 + 1 ≥ 1 + dn + n, the lower
bound becomes LB = 1 + d1 + 1. If we compare sequentially rn−1 and rn−2, down to r2 and r1, we
see that the same inequality holds between every two adjacent jobs, so rn < rn−1 < ... < r1.
Consequently, the job that finished last at WS1 is scheduled first at WS2.
At WS2 the order is n, n−1, ..., 2, 1, and thus the respective completion times are:
C2n = rn + 1 = n + dn + 1, C2(n−1) = rn−1 + 1 = (n − 1) + dn−1 + 1 = n + dn−1, ... ,
C22 = r2 + 1 = 2 + d2 + 1, C21 = r1 + 1 = 1 + d1 + 1.
Note that every job at WS2 is released no sooner than the previous job is completed. This is easily
shown because: rn−1 − C2n = p11 + p12 + ... + p1(n−1) + dn−1 − (rn + 1) = p11 + p12 + ... +
p1(n−1) + dn−1 − (p11 + p12 + ... + p1n + dn + 1) = dn−1 − p1n − dn − 1 ≥ 0.
The job that is completed last is the one with the longest delay:
Cmax = C21 = r1 + 1 = p11 + d1 + 1 = 1 + 1 + d1 = LB = OPT.
The job with max_j{p1j + dj}, namely the job associated with the longest delay (d1), is scheduled
first and completed last without any delay, which is optimal, since it is equal to the lower bound.
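Theorem 2 can be checked numerically. The following sketch (instances invented) builds unit-time instances whose consecutive delays differ by at least 2 and verifies that LDF at WS1 with FCFS at WS2 attains the lower bound 1 + d1 + 1:

```python
def ldf_fcfs_cmax(d):
    """Unit processing times at both stations; LDF at WS1, FCFS at WS2."""
    n = len(d)
    delays = sorted(d, reverse=True)         # longest delay first at WS1
    releases = [(j + 1) + delays[j] for j in range(n)]  # C1j = j + 1
    t2 = 0
    for r in sorted(releases):               # FCFS at WS2
        t2 = max(t2, r) + 1
    return t2

# invented delay vectors whose consecutive gaps are all at least 2
for d in ([9, 7, 5, 3, 1], [12, 10, 8, 4, 2], [6, 2]):
    assert ldf_fcfs_cmax(d) == 1 + max(d) + 1   # Cmax equals the lower bound
print("LDF + FCFS matched the bound on all instances")
```

When the gap condition fails, Example 1 already showed LDF need not be optimal.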
Theorem 3. For the stochastic problem where d is an IID random variable and the processing
times at WS2 are equal (p2j = p), it is optimal to apply Shortest Processing Time first (SPT) at
WS1 and FCFS at WS2.
Proof. The two main ideas are that SPT reveals information about the time delays d faster, and
that SPT does not use any information from the time delays d. Intuitively, SPT creates release
dates at a faster rate and minimizes them. Consider work stations 1 and 2. There are n jobs
in total that have to be processed at WS1 and afterwards at WS2. For WS1 the release dates
are 0 for all jobs, whereas for WS2 they are given by the formula rj = C1j + dj, with dj an IID
random variable. Assume that the processing times are indexed 1, 2, ..., n with the property
p11 ≤ p12 ≤ ... ≤ p1n at WS1. With the use of SPT the completion times, which comprise
the first term of the sum that gives the release dates, are as follows: C11 = p11, C12 = p11 +
p12, ..., C1n = p11 + p12 + ... + p1n = Σ_{j=1}^{n} p1j.
Respectively, the release dates become r1 = C11 + d1, r2 = C12 + d2, ..., rn = C1n + dn =
Σ_{j=1}^{n} p1j + dn, where each dj is set equal to its expected value.
Due to SPT it holds that C11 < C12 < ... < C1n. As a result, the expected values satisfy
E⟨r1⟩ < E⟨r2⟩ < ... < E⟨rn⟩, and the expected sequence at WS2 is the same as at WS1. By
definition, for WS2 it is optimal to process the jobs according to the rule FCFS, as it is not
optimal for a machine to remain idle. It is also known that SPT minimizes Σ_{j=1}^{n} C1j and thus
C11, C12, ..., C1n, and in this case r1, r2, ..., rn.
We distinguish 3 cases for the analysis:
- If p ≤ min_j p1j for all jobs, the expected Cmax becomes E⟨Cmax⟩ = E⟨rn⟩ + p, with
expected idle times. This holds because:
C11 = p11 and r1 = p11 + E⟨d⟩.
C12 = p11 + p12 and r2 = p11 + p12 + E⟨d⟩.
...
C1n = p11 + p12 + ... + p1n and rn = p11 + p12 + ... + p1n + E⟨d⟩.
For WS2 we also have:
C21 = r1 + p.
C22 = max{r2, C21} + p, for which it can easily be proved that r2 ≥ C21,
since p11 + p12 + E⟨d⟩ ≥ p11 + E⟨d⟩ + p.
C2n = max{rn, C2(n−1)} + p, with rn ≥ C2(n−1), which holds for all jobs.
- If p ≥ max_j p1j for all jobs, the expected Cmax becomes E⟨Cmax⟩ = E⟨r1⟩ + Σ_{j=1}^{n} p2j =
E⟨r1⟩ + np, without expected idle times. The release dates are as above, while for WS2:
C21 = r1 + p.
C22 = max{r2, C21} + p, for which it can easily be proved that r2 ≤ C21,
since p11 + p12 + E⟨d⟩ ≤ p11 + E⟨d⟩ + p.
C2n = max{rn, C2(n−1)} + p, with rn ≤ C2(n−1), which holds for all jobs.
- If min_j p1j < p < max_j p1j, the expected Cmax satisfies the relation:
E⟨rn⟩ + p < E⟨Cmax⟩ < E⟨r1⟩ + np.
In the first case, SPT has minimized C1n = Σ_{j=1}^{n} p1j and thus E⟨rn⟩ = C1n + E⟨dn⟩,
and in the second case the earliest starting time of WS2, E⟨r1⟩ = C11 + E⟨d1⟩. In all three cases,
it minimizes the expected makespan, which completes the proof.
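As a Monte Carlo illustration of Theorem 3 (a sketch with invented WS1 processing times and a uniform delay distribution; the theorem itself does not depend on that particular distribution), SPT-FCFS can be compared with LPT-FCFS:

```python
import random

def expected_cmax(p1, order, p, trials=2000, seed=7):
    """Average Cmax over random IID delays, with FCFS at WS2 (p2j = p)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        d = [rng.uniform(0.0, 10.0) for _ in p1]  # IID delay per job
        t1, rel = 0.0, []
        for j in order:
            t1 += p1[j]
            rel.append(t1 + d[j])
        t2 = 0.0
        for r in sorted(rel):   # FCFS by realized release time
            t2 = max(t2, r) + p
        total += t2
    return total / trials

p1 = [1, 3, 5, 7]               # invented WS1 processing times
spt = expected_cmax(p1, [0, 1, 2, 3], p=2)
lpt = expected_cmax(p1, [3, 2, 1, 0], p=2)
print(spt < lpt)                # SPT gives the smaller average makespan
```

Both calls use the same seed, so the two rules are compared on identical delay realizations; SPT need not win on every single realization, only in expectation.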
Example 2. This instance shows that if the above assumptions do not hold and the processing
times at WS2 are arbitrary, the SPT rule might not be optimal (see Figure 7). The first schedule is
the one whose sequence at WS1 is constructed by SPT, instead of by LPT as in the second.
Table 3: Example instance where SPT at WS1 is not optimal
Figure 7: Illustration of the schedules at WS1 and WS2 constructed by SPT-FCFS and LPT-FCFS
respectively for Example 2
Observation 1. For the stochastic problem with testing, if testing is for free, it is optimal to test
all jobs in advance.
By definition, if the d component is known, the problem becomes the deterministic F2|dj|Cmax.
Then, if the transportation delay is uniform (equal to d for all jobs), the problem is in P and one
has to apply Johnson's algorithm to the processing times (p1j + dj, dj + p2j). Even though the
problem we study is not the variant that can be solved optimally by Johnson's rule, several
algorithms have been proposed in the literature (Karuno and Nagamochi, 2003), apart from a
3/2-approximation algorithm for the case where all processing times are equal (Ageev, 2007), so it
is better to solve the deterministic version. As a result, it is optimal to test all jobs in advance.
Observation 2. For the stochastic problem with testing, where d is an IID random variable, and
with unit processing times at WS2 for all jobs (p2j = p = 1 ≤ p1j), if it is allowed to test a
single job without any testing cost, then the decision maker chooses to test the job with the
longest processing time at WS1.
We now give a rough sketch of the proof. As shown in Theorem 3, it is optimal to apply the SPT
rule, as the d components are revealed faster and the release times are minimized. Also, with the
distribution of d known, it is possible to classify d into three categories, namely
{dSmall, dMedium, dLarge}. After a delay d becomes known, we say that it is dLarge if there is an
interval [dLargemin, +∞) such that d ∈ [dLargemin, +∞). For simplicity, we write that
dLarge ≥ E⟨dj⟩ + 1. Sorting the jobs according to SPT at WS1 gives:
p11 ≤ p12 ≤ ... ≤ p1(n−1) ≤ p1n. It has been proved in Theorem 3 that the expected completion
time at WS2 will be:
E⟨Cmax⟩ = E⟨rn⟩ + p = Σ_{j=1}^{n} p1j + E⟨dj⟩ + p. We will denote the last job (index n) as job y
and the second to last (n − 1) as job x for the analysis. If we test the longest job y at WS1, its
dy component
17
is revealed. If dy Ehdj i we keep the sequence as it is and proceed to WS2.
In case d_y > E[d_j], then d_y is said to be d_Large. For the expected completion time it then holds:

E[C′max] = E[r_n] + p = Σ_{j=1}^{n} p_1j + d_Large + p,
since d is no longer an expected value. We need to show that the best we can do, knowing the real d of only one job, is to swap the last two jobs in the sequence. Originally it was p_11 ≤ p_12 ≤ ... ≤ p_x ≤ p_y. Thus, swapping x and y and leaving the sequence of all other jobs the same, we get for WS1 the sequence 1, 2, 3, ..., n − 2, y, x, where p_11 ≤ p_12 ≤ ... ≤ p_1(n−2) and p_y ≥ p_x.
Job y then has completion time at WS1

C_1y = p_11 + p_12 + ... + p_1(n−2) + p_y,

and job x now completes last:

C_1x = p_11 + p_12 + ... + p_1(n−2) + p_y + p_x.

As a result, their release times at WS2 are r_y = C_1y + d_Large and r_x = C_1x + E[d_j] (in expectation, since x was not tested). Knowing only that d_Large ≥ E[d_j] + 1 is not enough to determine which of the jobs x and y will be scheduled first at WS2, so we consider all possible cases:
Case 1: r_y < r_x. This also implies d_Large < p_x + E[d_j]. Job y is scheduled first, so E[C_2y] = C_1y + d_Large + p. Job x completes last, but we cannot determine whether there will be idle time between y and x at WS2: C_2x = max{r_x, C_2y} + p.

If r_x ≥ C_2y, then
E[Cmax] = E[C_2x] = r_x + p = Σ_{j=1}^{n} p_1j + E[d_j] + p < E[C′max].

If r_x < C_2y, then
E[Cmax] = E[C_2x] = E[C_2y] + p = C_1y + d_Large + 2p = Σ_{j=1}^{n} p_1j − p_x + d_Large + 2p ≤ E[C′max], since p_x ≥ p.

Case 2: r_y = r_x.
E[Cmax] = E[C_2x] = r_y + 2p = Σ_{j=1}^{n} p_1j − p_x + d_Large + 2p ≤ Σ_{j=1}^{n} p_1j + d_Large + p = E[C′max].

Case 3: r_y > r_x. Job x is scheduled first and
E[Cmax] = E[C_2y] = r_y + p = Σ_{j=1}^{n} p_1j − p_x + d_Large + p < E[C′max].
In all cases E[Cmax] ≤ E[C′max], which completes the claim that it is optimal to test the longest job at WS1.
Example 3. This example shows how testing the longest job can improve the expected completion time. We assume that at WS1 there are n − 2 other jobs that have already been scheduled, together with the two longest jobs x and y.

Table 4: Example instance for the optimality of testing the longest job

After testing job y and revealing its delay d_y, the completion time becomes E[Cmax] = 148, as also depicted in Figure 8. The second schedule is constructed after swapping the jobs x and y, resulting in an expected improvement of E[Cmax].
Observation 3. For the stochastic problem with testing, where d is an IID random variable and the processing times at WS2 are unit (p2j = p = 1 ≤ p1j for all j), if it is allowed to test k jobs without any testing cost, then the decision maker chooses to test the k longest jobs.
According to Observation 2, but also due to the improvements in Cmax that can be obtained by scheduling the jobs with the longest delays first, we generalize the testing of the longest job to the k longest jobs. We claim that, under circumstances, changing the sequence of these k jobs at WS1 can produce a better schedule; otherwise the schedule remains as it was originally. Examples 4 and 5 show the application of Observation 3: in the former an actual improvement can be achieved, unlike in the latter.
Example 4. We consider an instance of n jobs, where the first n − 2 jobs are completed at WS1 in 100 time units. The decision maker is allowed to test 3 jobs for free and chooses the longest jobs x, y and z, with the data shown in Table 5.
Initially E[Cmax] = 153, but after testing the realized Cmax = 158. Knowing this, the decision maker can swap the order of the jobs at WS1 and reduce the completion time at WS2 to Cmax = 152, as can also be seen in Figure 9.
Example 5. We consider the same instance as before, with the only difference being the actual d of the jobs x, y and z. After the testing, it is revealed that the job associated with the shortest d is scheduled last, as can be observed in Figure 10. For this instance there is no possible swap between these 3 jobs that produces a smaller Cmax at WS2. This can easily be verified by creating all 3! = 6 possible schedules at WS1 and their corresponding FCFS schedules at WS2 and calculating all completion times.
Table 6: Example instance where testing the k-longest jobs is not effective
Remark 1. In a more complicated setting, Observations 2 and 3 can also hold under certain circumstances; in order to determine the optimal sequence between two jobs, we give some properties used in the local search of Section 4.4.
Figure 10: Illustration of the schedules at WS1 and WS2 for Example 5
4.3 Algorithms
As mentioned before, for the problem F2|dj|Cmax the optimal schedule does not have to be a permutation schedule such as the one given by Johnson's rule. This means that, ideally, in order to find an optimal non-permutation schedule one should generate all permutations for WS1 and then schedule the jobs at WS2 according to the FCFS rule. The optimal solution would be the minimum Cmax over all permutations, but this requires O(n!) time.
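The exhaustive search just described can be sketched as follows (a minimal Python sketch; the function names and the two-job instance in the test are mine, and FCFS at WS2 serves jobs in non-decreasing order of their release times):

```python
from itertools import permutations

def makespan(seq, p1, p2, d):
    """Cmax of a WS1 sequence `seq`, with FCFS (earliest release first) at WS2."""
    t1, releases = 0.0, []
    for j in seq:
        t1 += p1[j]                       # completion time of j at WS1
        releases.append((t1 + d[j], j))   # release (arrival) time at WS2
    releases.sort()                       # FCFS: serve in order of arrival
    t2 = 0.0
    for r, j in releases:
        t2 = max(t2, r) + p2[j]
    return t2

def brute_force(p1, p2, d):
    """Optimal non-permutation schedule by trying all n! WS1 orders."""
    jobs = range(len(p1))
    return min(makespan(seq, p1, p2, d) for seq in permutations(jobs))
```

This confirms the factorial cost: the search is only usable for validating heuristics on small instances.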
For the general case, which concerns the stochastic problem with testing, in addition to the O(n log n) Johnson's algorithm we also propose four algorithms that generate non-permutation schedules. Note that Johnson's rule applies to the case where the delays are uniform, which implies that the sequence at WS2 also follows the FCFS rule. However, this is not the case for the delays in our model, and since Johnson's rule produces permutation schedules, we modify it to produce non-permutation schedules (Flexible Johnson's). The following algorithms concern the deterministic model (or, equivalently, the stochastic model with free testing for all jobs) and will later be modified for the general stochastic variant where testing is subject to limitations. As far as the four algorithms are concerned, Theorem 1 and Example 1 were the basis for their development. LDF simply schedules the jobs in descending order of their delays, whereas LPT1LPT2 and LPT1SPT2 prioritize the jobs with long delays but differ from each other in the choice of the job scheduled after a long-delay job. Unlike those three algorithms, SPT1LPT2 prioritizes the short-delay jobs but at the same time schedules a long-delay job after every short-delay job, thus being LPT1SPT2 with the order reversed in every pair.
Johnson's rule
1: Test all jobs for free
2: Partition the jobs into two sets A and B
3: ▷ A contains the jobs with p1j ≤ p2j and B the jobs with p1j > p2j
4: Sort the jobs from set A in increasing order of p1j + dj (SPT(p1j + dj))
5: Sort the jobs from set B in decreasing order of p2j + dj (LPT(p2j + dj))
6: Apply list scheduling for the jobs in A followed by the jobs in B at WS1
7: Apply list scheduling for the same sequence at WS2
8: return Cmax = C2(last)
Flexible Johnson's
1: Test all jobs for free
2: Partition the jobs into two sets A and B
3: ▷ A contains the jobs with p1j ≤ p2j and B the jobs with p1j > p2j
4: Sort the jobs from set A in increasing order of p1j + dj (SPT(p1j + dj))
5: Sort the jobs from set B in decreasing order of p2j + dj (LPT(p2j + dj))
6: Apply list scheduling for the jobs in A followed by the jobs in B at WS1
7: Calculate rj for all jobs
8: Apply FCFS at WS2
9: return Cmax = C2(last)
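Assuming all delays are known after the free testing, the Flexible Johnson's variant can be sketched as follows (a minimal sketch; the tuple representation of a job as (p1, p2, d) and the function names are mine):

```python
def johnson_order(jobs):
    """jobs: list of (p1, p2, d) tuples. WS1 sequence per Johnson's partition."""
    A = [j for j in jobs if j[0] <= j[1]]      # jobs with p1 <= p2
    B = [j for j in jobs if j[0] > j[1]]       # jobs with p1 > p2
    A.sort(key=lambda j: j[0] + j[2])          # SPT on p1 + d
    B.sort(key=lambda j: -(j[1] + j[2]))       # LPT on p2 + d
    return A + B

def flexible_johnson_cmax(jobs):
    """Johnson's sequence at WS1, but FCFS (by release time) at WS2."""
    t1, releases = 0.0, []
    for p1, p2, d in johnson_order(jobs):
        t1 += p1
        releases.append((t1 + d, p2))          # release time at WS2
    t2 = 0.0
    for r, p2 in sorted(releases):             # FCFS at WS2
        t2 = max(t2, r) + p2
    return t2
```

The classic rule would process WS2 in the same sequence as WS1; the only change here is the FCFS re-sort by release time.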
LDF
1: Test all jobs for free
2: Sort d in descending order
3: Apply list scheduling for d at WS1
4: Calculate the rj for all jobs
5: Apply FCFS at WS2
6: return Cmax = C2(last)
LPT1LPT2
1: Test all jobs for free
2: Sort d in descending order
3: Partition jobs into two sets A and B
4: ▷ set A contains the first half of the jobs with the longest delays, whereas B contains the rest (short-delay jobs)
5: repeat
6: Create a pair of jobs (n/2 pairs in total), where the first job is the one with the longest delay from set A and the second is the job with the longest delay from B, and index this pair counting from 1 onwards
7: Delete the two jobs from A and B respectively
8: until Sets A and B are empty
9: Apply list scheduling of the pairs according to lowest index first
10: Calculate the rj for all jobs
11: Apply FCFS at WS2
12: return Cmax = C2(last)
Algorithm 3: Delays in pairs SPTshort LPTlong (SPT1LPT2)
SPT1LPT2
1: Test all jobs for free
2: Sort d in descending order
3: Partition jobs into two sets A and B
4: ▷ set A contains the first half of the jobs with the longest delays, whereas B contains the rest (short-delay jobs)
5: repeat
6: Create a pair of jobs (n/2 pairs in total), where the first job is the one with the shortest delay from set B and the second is the job with the longest delay from A, and index this pair counting from 1 onwards
7: Delete the two jobs from A and B respectively
8: until Sets A and B are empty
9: Apply list scheduling of the pairs according to lowest index first
10: Calculate the rj for all jobs
11: Apply FCFS at WS2
12: return Cmax = C2(last)
LPT1SPT2
1: Test all jobs for free
2: Sort d in descending order
3: Partition jobs into two sets A and B
4: ▷ set A contains the first half of the jobs with the longest delays (long-delay jobs), whereas B contains the rest (short-delay jobs)
5: repeat
6: Create a pair of jobs (n/2 pairs in total), where the first job is the one with the longest delay from set A and the second is the job with the shortest delay from B, and index this pair counting from 1 onwards
7: Delete the two jobs from A and B respectively
8: until Sets A and B are empty
9: Apply list scheduling of the pairs according to lowest index first
10: Calculate the rj for all jobs
11: Apply FCFS at WS2
12: return Cmax = C2(last)
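The pair-construction step shared by LDF and the three pairing heuristics can be sketched in one function (the function name and the even-n assumption are mine; the delay values in the test are illustrative):

```python
def heuristic_sequence(delays, variant="LPT1LPT2"):
    """WS1 sequence for the delay-pairing heuristics; `delays` indexed by job.
    Assumes an even number of jobs, matching the thesis' pairing description."""
    order = sorted(range(len(delays)), key=lambda j: -delays[j])  # d descending
    if variant == "LDF":
        return order
    half = len(order) // 2
    A, B = order[:half], order[half:]       # long-delay half / short-delay half
    seq = []
    for i in range(half):
        if variant == "LPT1LPT2":           # longest of A, then longest of B
            seq += [A[i], B[i]]
        elif variant == "LPT1SPT2":         # longest of A, then shortest of B
            seq += [A[i], B[half - 1 - i]]
        elif variant == "SPT1LPT2":         # shortest of B, then longest of A
            seq += [B[half - 1 - i], A[i]]
        else:
            raise ValueError(variant)
    return seq
```

For delays [10, 40, 30, 20], for example, LPT1LPT2 pairs job 1 (delay 40) with job 3 (delay 20), then job 2 with job 0, yielding the order [1, 3, 2, 0].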
Property 1. For the problem F2|dj|Cmax, if two adjacent jobs h and k at WS1 satisfy

(ii) p1h ≥ p2k ≥ dh,

then it is optimal to schedule k before h at WS1 and then apply FCFS at WS2.
Proof. If k is scheduled first, then we define the completion times of jobs k and h at WS1 as follows:

C_1k = p_1k,
C_1h = p_1k + p_1h,
r_k = p_1k + d_k,
r_h = p_1k + p_1h + d_h.

For WS2, we determine which job can start first:

r_h − r_k = (p_1k + p_1h + d_h) − (p_1k + d_k) = p_1h + d_h − d_k > 0.

As a result, k is scheduled first at WS2:

C_2k = r_k + p_2k = p_1k + d_k + p_2k,
C_2k − r_h = (p_1k + d_k + p_2k) − (p_1k + p_1h + d_h) = d_k − d_h + p_2k − p_1h ≥ 0.

So C_2h = C_2k + p_2h = p_1k + d_k + p_2k + p_2h = Cmax,k-first.
Now suppose h is scheduled first at WS1, so that r_h = p_1h + d_h and r_k = p_1h + p_1k + d_k; then r_k − r_h = p_1k + d_k − d_h ≥ 0, and hence job h starts first at WS2:

C_2h = r_h + p_2h = p_1h + d_h + p_2h.

We need to determine whether the release time of k happens before the completion of h:

C_2h − r_k = (p_1h + d_h + p_2h) − (p_1h + p_1k + d_k) = p_2h − p_1k + d_h − d_k ≥ 0.

So C_2k = C_2h + p_2k = p_1h + d_h + p_2h + p_2k = Cmax,h-first. Comparing this with the Cmax when k was scheduled first, we get:

Cmax,h-first − Cmax,k-first = (p_1h + d_h + p_2h + p_2k) − (p_1k + d_k + p_2k + p_2h) = (p_1h − p_1k) + (d_h − d_k) ≥ 0,

which follows from the assumed relations p_1h ≥ p_1k, d_h ≥ d_k and p_1h ≥ p_2k. We observe that in both cases Cmax,k-first ≤ Cmax,h-first, so the claim is proved.
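The interchange arguments of Properties 1–3 (and the local search referred to in Section 4.4) amount to evaluating both orders of a pair of adjacent jobs and keeping the cheaper one. A minimal sketch of that step (function names and the two-job instance in the test are mine):

```python
def cmax(seq, p1, p2, d):
    """Makespan of a WS1 sequence with FCFS (earliest release first) at WS2."""
    t1, releases = 0.0, []
    for j in seq:
        t1 += p1[j]
        releases.append((t1 + d[j], j))   # release time at WS2
    t2 = 0.0
    for r, j in sorted(releases):         # FCFS at WS2
        t2 = max(t2, r) + p2[j]
    return t2

def better_adjacent_order(h, k, p1, p2, d):
    """Return the order of the two jobs that gives the smaller makespan."""
    return min([(h, k), (k, h)], key=lambda s: cmax(s, p1, p2, d))
```

On instances where the properties' conditions are hard to check analytically, this pairwise comparison can simply be evaluated numerically.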
Property 2. For the problem F2|dj|Cmax, if two adjacent jobs h and k at WS1 satisfy

(i) p_1k ≥ p_2k ≥ d_k,

and h is scheduled first, then the jobs h and k are scheduled successively without idle time.

C_1h = p_1h,
C_1k = p_1h + p_1k,
r_h = C_1h + d_h = p_1h + d_h,
r_k = C_1k + d_k = p_1h + p_1k + d_k.

We need to determine which of the two jobs should start first at WS2:

r_k − r_h = (p_1h + p_1k + d_k) − (p_1h + d_h) = p_1k + d_k − d_h.

The sign is not clear, so we need to consider cases. If p_1k + d_k − d_h ≥ 0, then r_k ≥ r_h and job h will be scheduled first: C_2h = r_h + p_2h = p_1h + d_h + p_2h. It remains to determine whether there will be idle time between h and k at WS2:

C_2h − r_k = (p_1h + d_h + p_2h) − (p_1h + p_1k + d_k) = d_h + p_2h − p_1k − d_k ≥ 0.

As a result, there is no idle time between h and k, and C_2k = C_2h + p_2k = p_1h + d_h + p_2h + p_2k = Cmax,h-first ≤ Cmax,k-first.
Property 3. For the problem F2|dj|Cmax, if two adjacent jobs h and k at WS1 satisfy:

(i) p_1k ≥ p_2h
5.1 Formulation

We formulate a mixed integer linear programming (MILP) model for the general problem in order to determine the sequence of jobs that minimizes the makespan criterion. The notation used is the following:

i: index of jobs, i = 1, ..., n
P1i: processing time of job i at WS1
P2i: processing time of job i at WS2
Di: duration of the delay of job i
s1i: starting time of operation 1 of job i at WS1
s2i: starting time of operation 2 of job i at WS2
x_ij = 1 if s1i < s1j, 0 otherwise
y_ij = 1 if s2i < s2j, 0 otherwise
Z = Cmax: maximum completion time at WS2

minimize Z
subject to
x_ij + x_ji = 1, ∀ i, j with i < j (C1)
s1i − s1j + M·x_ij ≤ M − P1i, ∀ i ≠ j (C2)
s1i − s2i ≤ −(P1i + Di), ∀ i (C3)
y_ij + y_ji = 1, ∀ i, j with i < j (C4)
s2i − s2j + M·y_ij ≤ M − P2i, ∀ i ≠ j (C5)
s2i ≤ Z − P2i, ∀ i (C6)
x_ij, y_ij ∈ {0, 1}, ∀ i, j (C7)
s1i ≥ 0, ∀ i (C8)
s2i ≥ 0, ∀ i (C9)

Note that all variables and constants are integers and M is a very large number. Constraint (C1) states that for any two jobs i and j, either i precedes j or j precedes i at WS1. The second constraint enforces that the first machine executes only one job at a time. Constraint (C3) states that the second operation of a job can only begin once the job has arrived at the second machine. Constraint (C4) ensures that all jobs are also sequenced at the second machine, and the fifth constraint enforces that the second machine executes only one job at a time. Constraint (C6) implies that the completion of any job at WS2 is less than or equal to the makespan. Constraint (C7) states that x_ij and y_ij are binary, and constraints (C8) and (C9) state that the starting times at WS1 and WS2 are non-negative.
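On small instances the model can be cross-checked without a solver: every assignment of the sequencing variables corresponds to a pair of machine orders, and earliest-start scheduling under the one-job-at-a-time and arrival constraints gives the smallest Z for that assignment. A stdlib-only sketch, O(n!²) and intended purely for validation (function name and instance data are mine):

```python
from itertools import permutations

def min_makespan(P1, P2, D):
    """Enumerate WS1 orders (the x variables) and WS2 orders (the y variables);
    earliest-start scheduling satisfies (C2), (C3), (C5); Z is read off via (C6)."""
    n = len(P1)
    best = float("inf")
    for ws1 in permutations(range(n)):
        t1, ready = 0.0, {}
        for i in ws1:                      # (C2): WS1 processes one job at a time
            t1 += P1[i]
            ready[i] = t1 + D[i]           # (C3): s2i >= s1i + P1i + Di
        for ws2 in permutations(range(n)):
            t2 = 0.0
            for i in ws2:                  # (C5): WS2 processes one job at a time
                t2 = max(t2, ready[i]) + P2[i]
            best = min(best, t2)           # (C6): Z >= s2i + P2i for all i
    return best
```

An MILP solver would of course explore the same decision space far more efficiently via branch and bound.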
6 Modeling

The facility design problem mentioned in the introduction has to be modeled in order to automate the current method used by the management. This chapter focuses on the modeling of the problem. It starts with the modeling of the two phases and the algorithms, and provides information about the data and how it was modified to create other datasets.
the two nodes will be perceived as a Work Station (WS). Ideally, each work station could have m machines at its disposal and the jobs could be processed in parallel, but for simplicity we use only a single machine at each WS. Furthermore, single machine models are important in decomposition methods, where scheduling problems in more complicated machine environments are broken down into a number of smaller single machine scheduling problems whose results can provide a basis for heuristics for the complicated setting (M. L. Pinedo, 2004).
The processing times at WS1 and WS2 as well as the delays were initially raw data collected during my professional experience as a Systems Engineer in the design of chemical facilities. The data was extracted from cost estimation data sheets and other tools that monitored the progress of the projects, and was converted to hours. Then, with the use of Matlab, the data was fitted to various distributions until a close match was found. Table 7 summarizes the empirical data used for the problem.
Component   Distribution   Parameters
p1          Normal         μ = 10.4, σ = 3.2
p2          Normal         μ = 15.1, σ = 4.2
d           Lognormal      μ = 4.51, σ = 1
In order to test the algorithms and their validity when the input data changes, several other distributions were created, scaled to match the expected value of the original distributions. The empirical data have a very large variance, which was reduced for the other distributions but kept large enough. This was done firstly to make the comparison of the results easier and secondly to keep the problem as close to reality as possible.
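The four delay distributions used later all share (approximately) the same expectation: LogNorm(4.51, 1) has mean exp(4.51 + 1/2) ≈ 150, which matches N(150, 55²), the two-component normal mixture, and U(145, 155). A stdlib-only sampling sketch (the 50/50 mixture weights and the truncation of negative normal draws at zero are my assumptions; the thesis does not state how negative draws were handled):

```python
import random

def sample_delay(kind, rng):
    """Draw one delay; all four variants share (approximately) E[d] = 150."""
    if kind == "lognormal":                    # empirical fit from Table 7
        return rng.lognormvariate(4.51, 1.0)
    if kind == "normal":
        return max(0.0, rng.gauss(150, 55))
    if kind == "mixture":                      # assumed 50/50 two-component mixture
        return max(0.0, rng.gauss(60, 15) if rng.random() < 0.5
                   else rng.gauss(240, 50))
    if kind == "uniform":
        return rng.uniform(145, 155)
    raise ValueError(kind)

rng = random.Random(0)
means = {kind: sum(sample_delay(kind, rng) for _ in range(100_000)) / 100_000
         for kind in ("lognormal", "normal", "mixture", "uniform")}
```

Sampling 100,000 draws per distribution confirms that all four empirical means land close to 150.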
6.2 Algorithms
In this section we describe how the algorithms proposed in Section 4.3 will function given the fact
that the delays are unknown but testing is allowed.
long job associated with a long delay should be scheduled first. However, no claim was made regarding how to schedule the unknown (untested) jobs, and considering that such a claim is hard to prove mathematically, simulations are the next best option. Intuitively, given the nature of the proposed scheduling algorithms, which essentially avoid jeopardizing the schedule by ensuring that jobs with long delays are not scheduled last, it is plausible to schedule the unknown jobs first or somewhere in the middle. Tables 9 and 10 illustrate all the options that are included in each and every algorithm that was implemented. These two categories of options correspond to two major questions this research intends to answer, in other words: "which jobs to test" and "how to schedule the unknown jobs".
Table 9: Options for the question "which jobs to test"

Option   Explanation
LPT      Tests the k allowed jobs starting from the job with the longest processing time at WS1.
SPT      Tests the k allowed jobs starting from the job with the shortest processing time at WS1.
Random   Tests the k allowed jobs randomly.
Table 10: Options for the question "how to schedule the unknown jobs"

Option           Explanation
Unknown first    Schedules the unknown jobs according to SPT; the algorithm then creates the sequence of the known jobs, which are scheduled right after.
Unknown middle   The algorithm creates the sequence of the known jobs, which is split into two groups; the unknown jobs are then scheduled according to SPT in the middle.
Unknown last     Schedules the unknown jobs according to SPT; the algorithm then creates the sequence of the known jobs, which are scheduled right before the set of the unknown.
Unknown nested   Schedules the unknown jobs according to SPT; the algorithm then creates the sequence of the known jobs, and after each pair of known jobs follows one unknown job.
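The four placement options can be sketched as one merge function (the function name and the job labels in the test are mine; `known` is the sequence produced by the underlying algorithm and `unknown` is assumed already SPT-sorted):

```python
def place_unknown(known, unknown, option):
    """Merge the untested jobs into the known-job sequence per the four options."""
    if option == "first":
        return unknown + known
    if option == "last":
        return known + unknown
    if option == "middle":
        half = len(known) // 2
        return known[:half] + unknown + known[half:]
    if option == "nested":                 # one unknown job after each pair of known jobs
        seq, queue = [], list(unknown)
        for i, job in enumerate(known, start=1):
            seq.append(job)
            if i % 2 == 0 and queue:
                seq.append(queue.pop(0))
        return seq + queue                 # any leftover unknown jobs go last
    raise ValueError(option)
```

Keeping the merge separate from the sequencing algorithm lets every algorithm of Section 4.3 be combined with every placement option.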
Next, we show how the algorithms of Section 4.3 are extended. We only show this for Algorithm 2 (LPT1LPT2); the other algorithms, including Johnson's and the modified Johnson's, are adapted similarly.
Algorithm 2a: Delays in pairs LPTlong LPTshort (LPT1LPT2) with k-free testing
LPT1LPT2
1: Choose which k jobs to test ▷ LPT, SPT, Random
2: Sort the unknown jobs according to the SPT rule
3: Choose how to schedule the unknown jobs ▷ Unknown first, Unknown middle, Unknown last, Unknown nested
4: Sort d in descending order for the known jobs
5: Partition jobs into two sets A and B ▷ set A contains the first half of the long-delay jobs, whereas B contains the rest (short-delay jobs)
6: repeat
7: Create a pair of jobs (n/2 pairs in total), where the first job is the one with the longest delay from set A and the second is the job with the longest delay from B, and index this pair counting from 1 onwards
8: Delete the two jobs from A and B respectively
9: until Sets A and B are empty
10: Sort the pairs according to lowest index first
11: Merge the unknown jobs with the pairs according to the selected option
12: Apply list scheduling at WS1
13: Calculate the rj for all jobs
14: Apply FCFS at WS2
15: return Cmax = C2(last)
The following variant shows how the algorithm LPT1LPT2 works in the case of k-free testing when the option "how to schedule the unknown jobs" is replaced by another idea: the unknown delays are replaced by the expected delay, and those jobs are treated by the algorithm as "known".
Algorithm 2b: Delays in pairs LPTlong LPTshort (LPT1LPT2) with k-free testing and expected
delay
LPT1LPT2
1: Choose which k jobs to test ▷ LPT, SPT, Random
2: Sort the unknown jobs according to the SPT rule
3: Sort d in descending order for all the jobs ▷ the not-tested jobs have delay equal to the expected delay
4: Partition jobs into two sets A and B ▷ set A contains the first half of the long-delay jobs, whereas B contains the rest (short-delay jobs)
5: repeat
6: Create a pair of jobs (n/2 pairs in total), where the first job is the one with the longest delay from set A and the second is the job with the longest delay from B, and index this pair counting from 1 onwards
7: Delete the two jobs from A and B respectively
8: until Sets A and B are empty
9: Apply list scheduling at WS1
10: Calculate the rj for all jobs
11: Apply FCFS at WS2
12: return Cmax = C2(last)
amount of time in the task and the retrieval of some information prior to its actual processing. This practically implies that the processing at WS1 will last Σ_{j=1}^{n} (p1j + CoTj), where Σ_{j=1}^{n} CoTj is not only the cost of testing but also the starting time of O1j of the very first job. Alternatively, considering that it has been decided to test k jobs first and process all the jobs at WS1 after testing, this means that the release dates of all jobs are equal to the total testing cost. As mentioned before, in the industrial application the delays correspond to lead times of equipment. The delays can be revealed only after the equipment is completely specified and ordered. In this case, the testing option offers the following alternative: the engineering team (Node B in Figure 1), instead of designing and specifying the equipment (O1j), can spend some amount of time to investigate the task, suggest potential equipment, and contact the respective suppliers to learn the preparation and delivery times of the equipment. The described task can last 10%-30% of O1j, but it reveals the corresponding delay; this constitutes the idea of learning through testing. As a result, the cost function CoT for a job j in this specific industrial problem can be defined as follows:
CoT_j = α_j · p1j, where

α_j = 0.1, if d_j ∈ [d_min, d_min + Δ)
α_j = 0.2, if d_j ∈ [d_min + Δ, d_min + 2Δ)    (9)
α_j = 0.3, if d_j ∈ [d_min + 2Δ, d_max]

and Δ is equal to (d_max − d_min)/3.
7 Computational Results
In this section we will present the results obtained by the simulation of the industrial problem for
the three models described in Section 3.1. The data was modified further from the Basic Case
shown in Table 7 to include simulations regarding the Theorem 3 and some special cases (e.g.
proportionate two-machine flow shop with minimum delays).
7.1.1 Equal Processing Times
The following 4 tables show the results obtained for different numbers of jobs, for 4 different distributions and a sample size of 10,000 simulations. Since the processing times at WS1 and WS2 are equal to 13 for all jobs, this is a special case of the two-machine flow shop called the proportionate flow shop. The minimum of each column is colored to make it easier to spot the algorithm that achieved the shortest schedule.
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 415.60 573.61 702.15 817.30 923.80 1022.41 1367.08 2098.79
SPT(p1j + dj ) 439.57 627.30 784.52 926.75 1057.67 1183.92 1621.47 2544.61
Flexible Johnsons 439.57 627.30 784.52 926.75 1057.67 1183.92 1621.47 2544.61
Johnsons 439.57 627.30 784.52 926.75 1057.67 1183.92 1621.47 2544.61
LDF 391.56 519.67 619.96 712.09 793.74 877.38 1162.17 1830.54
LPT1-LPT2 392.31 518.72 614.39 699.27 774.00 850.70 1111.64 1728.10
SPT1-LPT2 405.27 532.01 628.36 713.50 788.21 864.25 1119.75 1723.56
LPT1-SPT2 393.37 520.25 617.06 703.07 778.61 855.46 1113.14 1720.40
LB 387.81 510.61 603.14 681.98 748.91 812.35 1022.85 1513.72
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 273.99 343.45 408.25 475.76 539.81 606.94 868.58 1522.12
SPT(p1j + dj ) 292.77 377.28 452.15 524.74 595.70 665.23 936.25 1600.68
Flexible Johnsons 292.77 377.28 452.15 524.74 595.70 665.23 936.25 1600.68
Johnsons 292.77 377.28 452.15 524.74 595.70 665.23 936.25 1600.68
LDF 253.48 327.89 420.95 500.87 573.84 644.69 919.50 1586.88
LPT1-LPT2 253.52 310.51 384.74 456.04 529.66 597.24 867.87 1528.64
SPT1-LPT2 267.02 326.20 391.99 457.35 522.49 587.45 846.68 1495.65
LPT1-SPT2 257.73 321.32 388.48 454.45 519.92 585.04 844.71 1494.11
LB 241.17 262.20 283.18 325.12 381.60 442.69 692.56 1331.14
Table 13: p1j = p2j and d ∼ (N(150, 55²) + N(150, 55²))
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 321.85 397.76 464.08 529.41 595.37 659.47 920.92 1572.49
SPT(p1j + dj ) 343.09 437.42 515.43 588.22 659.63 729.21 1000.72 1664.35
Flexible Johnsons 343.09 437.42 515.43 588.22 659.63 729.21 1000.72 1664.35
Johnsons 343.09 437.42 515.43 588.22 659.63 729.21 1000.72 1664.35
LDF 297.34 339.06 390.92 484.81 583.78 679.70 982.57 1651.22
LPT1-LPT2 298.80 350.38 396.04 456.58 524.86 594.89 865.40 1528.19
SPT1-LPT2 312.45 365.20 417.16 475.96 539.87 603.71 862.25 1514.70
LPT1-SPT2 300.99 354.26 407.21 470.49 536.26 601.33 861.18 1514.22
LB 291.66 320.59 333.45 343.35 377.27 437.37 693.87 1339.45
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
SPT(p1j + dj ) 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
Flexible Johnsons 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
Johnsons 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
LDF 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
LPT1-LPT2 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
SPT1-LPT2 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
LPT1-SPT2 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
LB 224.70 288.86 353.55 418.39 483.27 548.21 808.09 1458.01
Table 15: p1j ∼ N(10.4, 3.2²), p2j ∼ N(15.1, 4.2²) and d ∼ LogNorm(4.51, 1)
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 407.11 567.74 690.89 804.48 895.51 975.17 1248.67 1893.21
SPT(p1j + dj ) 428.54 618.31 768.73 904.93 1022.23 1124.00 1485.33 2282.48
Flexible Johnsons 420.52 599.32 739.67 863.91 970.18 1060.92 1379.14 2079.62
Johnsons 426.49 613.56 761.57 894.82 1009.13 1107.94 1458.84 2229.99
LDF 391.45 534.99 646.70 746.11 835.20 914.93 1206.75 1986.40
LPT1-LPT2 391.77 533.14 638.01 728.42 804.69 868.83 1115.16 1807.32
SPT1-LPT2 401.62 543.09 647.78 739.25 815.51 880.08 1123.21 1796.45
LPT1-SPT2 392.31 533.98 639.37 731.55 808.66 874.42 1120.16 1798.25
LB 387.52 525.36 625.66 712.73 782.79 838.19 1051.59 1664.76
Table 16: p1j ∼ N(10.4, 3.2²), p2j ∼ N(15.1, 4.2²) and d ∼ N(150, 55²)
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 263.85 321.34 382.18 445.22 512.56 582.18 873.81 1620.49
SPT(p1j + dj ) 281.87 353.14 416.93 476.30 533.90 591.41 843.22 1567.56
Flexible Johnsons 274.27 336.63 392.02 444.71 500.15 559.55 830.44 1563.59
Johnsons 279.77 348.60 409.53 466.44 521.78 577.48 832.65 1563.59
LDF 252.48 329.09 432.45 529.77 617.21 698.70 1015.33 1789.83
LPT1-LPT2 252.34 313.74 398.49 475.49 557.73 632.74 942.25 1705.87
SPT1-LPT2 262.39 322.88 395.28 468.14 540.45 614.87 909.18 1656.63
LPT1-SPT2 255.38 320.92 395.24 468.77 541.31 615.79 911.12 1659.38
LB 240.42 264.19 302.10 361.11 430.21 502.29 792.61 1537.24
Table 17: p1j ∼ N(10.4, 3.2²), p2j ∼ N(15.1, 4.2²) and d ∼ (N(60, 15²) + N(240, 50²))
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 316.04 375.20 426.30 480.24 534.60 589.64 846.47 1584.78
SPT(p1j + dj ) 336.27 416.02 478.10 540.82 597.54 654.80 876.47 1550.45
Flexible Johnsons 328.24 397.50 449.90 502.98 550.83 601.31 826.75 1549.20
Johnsons 334.20 411.12 470.68 530.86 584.58 639.08 851.60 1549.20
LDF 300.71 340.36 399.23 497.66 600.29 702.28 1075.54 1855.05
LPT1-LPT2 301.53 349.35 394.60 455.45 524.80 601.21 916.34 1682.54
SPT1-LPT2 312.66 361.84 413.63 474.09 538.31 607.29 901.49 1653.84
LPT1-SPT2 303.32 353.34 406.17 469.39 536.85 608.43 905.30 1657.73
LB 295.25 321.85 332.26 361.15 422.59 497.25 795.60 1546.13
Table 18: p1j ∼ N(10.4, 3.2²), p2j ∼ N(15.1, 4.2²) and d ∼ U(145, 155)
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT(p1j ) 233.18 306.76 382.24 456.12 530.77 606.18 906.78 1662.75
SPT(p1j + dj ) 232.23 305.15 380.48 454.40 528.97 604.16 904.59 1660.63
Flexible Johnsons 231.64 304.89 380.37 454.33 528.93 604.11 904.58 1660.62
Johnsons 231.64 304.89 380.37 454.33 528.93 604.11 904.58 1660.62
LDF 239.58 315.96 392.55 467.53 542.69 618.06 919.57 1676.52
LPT1-LPT2 239.59 315.69 392.29 467.16 542.42 617.79 919.12 1676.16
SPT1-LPT2 236.02 311.51 387.90 462.66 537.87 612.71 914.26 1671.27
LPT1-SPT2 239.15 315.25 391.61 466.35 541.58 616.68 918.19 1674.99
LB 231.12 304.73 380.26 454.27 528.88 604.06 904.57 1660.61
7.2.1 Normal processing times at WS1 and equal processing times at WS2
This section presents the simulation results related to Theorem 3, for which the sample size was 10,000. The tables are divided into 3 groups, consistent with the 3 distinct cases used in the proof of Theorem 3. Note that for the stochastic problem, SPT(p1j) and SPT(p1j + dj) yield the same results: the first does not use the delay information and the second uses the expected value of the distribution, which makes no difference to the sequence of the jobs. Hence SPT(p1j) and SPT(p1j + dj) were merged into SPT.
Tables 19, 20 and 21 show the computational results specifically for Theorem 3, where the values for SPT and LB are the expected completion time and the expected lower bound. Hence there is no need to distinguish between the various distributions, since they all have the same expected value. The data was modified so that the properties of each case were satisfied.
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 204.35 256.09 308.01 360.05 411.93 463.98 671.68 1192.07
LB 204.35 256.09 308.01 360.05 411.93 463.98 671.68 1192.07
Table 20: minj p1j < p2j < maxj p1j and E[d] = 150
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 212.96 265.07 317.16 369.31 421.25 473.10 680.84 1200.73
LB 212.96 265.07 317.16 369.31 421.25 473.10 680.84 1200.73
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 231.82 305.60 379.96 454.52 529.22 603.99 903.41 1652.78
LB 231.82 305.60 379.96 454.52 529.22 603.99 903.41 1652.78
The remaining simulations of this section follow the same structure, but this time the results show the best that can be achieved under uncertainty. Tables 22 to 25 show the results for the case where minj p1j ≥ p2j for all jobs. For these simulations the processing times were adjusted such that p1j ≥ 2 and p2j = 2.
Table 22: minj p1j ≥ p2j and d ∼ LogNorm(4.51, 1)
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 387.83 540.80 672.12 778.53 877.09 949.40 1239.83 1826.79
LPT 393.61 557.37 699.41 812.88 918.34 992.75 1316.19 1989.97
Random 390.77 549.97 687.45 797.38 895.84 970.78 1280.21 1905.20
Flexible Johnsons 390.33 548.77 683.52 792.87 898.52 968.39 1272.29 1911.17
Johnsons 394.03 556.69 695.96 808.81 917.63 991.20 1307.34 1965.51
LB 368.88 499.41 607.25 687.25 758.78 802.18 990.33 1370.33
Table 23: minj p1j ≥ p2j and d ∼ N(150, 55²)

Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 245.95 299.31 348.99 402.47 453.35 505.98 711.65 1228.96
LPT 252.69 313.20 368.84 422.59 474.76 528.30 738.21 1261.78
Random 248.71 305.36 359.37 410.85 463.17 515.46 722.34 1242.74
Flexible Johnsons 249.36 307.56 358.93 411.32 462.97 515.08 724.57 1243.37
Johnsons 252.38 311.86 363.97 416.35 468.22 520.19 729.54 1248.63
LB 225.71 246.76 259.31 279.59 309.24 353.65 551.00 1059.20
Table 24: minj p1j ≥ p2j and d ∼ (N(60, 15²) + N(240, 50²))
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 299.74 359.26 409.37 460.26 511.06 562.11 766.79 1285.54
LPT 305.36 372.57 426.64 479.43 536.13 588.80 799.45 1324.12
Random 302.38 366.41 418.10 467.22 522.16 578.72 781.27 1303.74
Flexible Johnsons 302.32 365.40 419.52 472.01 521.79 573.69 780.92 1302.52
Johnsons 305.71 370.95 425.83 478.10 528.24 579.79 787.27 1308.98
LB 280.12 308.52 321.76 328.87 335.19 357.77 551.67 1069.28
Table 25: minj p1j ≥ p2j and d ∼ U(145, 155)
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 204.01 256.00 308.25 360.02 411.98 464.29 671.98 1193.17
LPT 204.27 256.49 308.91 360.49 412.69 465.11 673.04 1194.38
Random 204.12 256.13 308.32 360.22 412.08 464.31 672.04 1193.52
Flexible Johnsons 204.02 256.00 308.30 360.08 412.06 464.29 672.01 1193.17
Johnsons 204.04 256.03 308.34 360.10 412.08 464.30 672.04 1193.19
LB 200.70 251.92 303.87 355.52 406.76 459.47 666.97 1188.27
Tables 26 to 29 show the results for the case where minj p1j < p2j < maxj p1j for all jobs. For these simulations only the processing times were adjusted, such that p2j = 10.
Table 26: minj p1j < p2j < maxj p1j and d ∼ LogNorm(4.51, 1)
Algorithm \ # jobs    5    10    15    20    25    30    50    100
SPT 396.23 549.23 680.45 786.89 883.35 957.79 1248.22 1835.04
LPT 402.07 565.83 707.80 821.38 929.29 1001.04 1324.58 1998.48
Random 399.19 558.31 695.92 805.74 908.45 979.11 1288.49 1913.45
Flexible Johnsons 396.68 550.06 682.13 788.90 885.20 959.85 1253.06 1859.51
Johnsons 414.64 592.03 747.20 873.69 991.22 1082.38 1449.12 2201.75
LB 376.89 507.42 615.25 695.28 766.83 810.26 998.58 1378.88
Table 27: min_j p1j < p2j < max_j p1j and d ~ N(150, 55²)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 255.16 309.78 359.89 413.33 463.90 516.36 721.96 1239.04
LPT 262.03 323.31 380.12 434.84 487.79 543.73 764.32 1332.54
Random 258.18 315.42 370.28 422.57 473.88 527.89 735.65 1256.66
Flexible Johnsons 256.25 311.34 362.82 416.42 468.83 519.41 728.71 1249.28
Johnsons 271.75 339.41 397.70 454.26 506.98 560.21 767.43 1284.66
LB 233.72 254.80 267.48 288.42 318.99 364.14 561.19 1068.81
Table 28: min_j p1j < p2j < max_j p1j and d ~ (N(60, 15²) + N(240, 50²))
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 308.57 368.48 418.91 471.33 519.27 571.47 775.95 1294.60
LPT 314.19 381.95 436.24 489.20 543.48 598.92 810.56 1349.76
Random 311.28 375.43 427.47 478.99 531.43 588.21 790.94 1314.05
Flexible Johnsons 309.03 369.81 422.03 475.65 523.96 575.96 783.30 1305.25
Johnsons 325.68 401.61 462.48 516.76 569.82 623.89 829.09 1348.33
LB 288.16 316.52 329.76 336.87 343.19 366.71 562.03 1078.72
Table 29: min_j p1j < p2j < max_j p1j and d ~ U(145, 155)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 213.44 265.40 317.46 369.40 420.80 473.87 681.26 1202.23
LPT 219.31 277.38 335.30 392.69 449.84 507.99 737.60 1313.44
Random 216.08 270.05 322.84 375.92 428.27 481.24 690.29 1212.63
Flexible Johnsons 213.70 265.74 317.85 369.94 421.38 474.37 681.88 1202.87
Johnsons 213.73 265.79 317.91 369.99 421.42 474.41 681.95 1202.91
LB 209.83 261.37 313.28 365.12 416.37 469.23 676.47 1197.34
Tables 30 to 33 show the results for the case where max_j p1j ≤ p2j for all the jobs. For these simulations the processing times p1j were capped at 15, whereas p2j = 15.
Table 30: max_j p1j ≤ p2j and d ~ LogNorm(4.51, 1)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 401.65 554.89 686.11 793.14 890.14 964.66 1262.47 1897.84
LPT 407.34 571.01 712.44 826.52 933.84 1006.00 1333.51 2041.80
Random 404.57 563.60 701.20 811.18 913.48 984.72 1299.66 1962.42
Flexible Johnsons 401.65 554.89 686.11 793.14 890.14 964.66 1262.47 1897.84
Johnsons 428.91 619.14 787.05 925.89 1057.99 1158.59 1580.53 2475.92
LB 382.02 513.27 621.72 705.19 779.04 827.81 1064.16 1670.13
Table 31: max_j p1j ≤ p2j and d ~ N(150, 55²)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 262.04 321.29 378.85 442.16 507.97 578.41 869.11 1610.66
LPT 268.77 334.35 400.60 472.81 547.34 624.40 931.70 1690.29
Random 265.10 326.53 389.01 457.52 526.40 600.55 901.03 1650.96
Flexible Johnsons 262.04 321.29 378.85 442.16 507.97 578.41 869.11 1610.66
Johnsons 286.38 368.68 442.63 516.75 587.07 661.58 956.08 1703.17
LB 239.09 264.11 298.49 360.15 427.91 499.68 789.52 1527.71
Table 32: max_j p1j ≤ p2j and d ~ (N(60, 15²) + N(240, 50²))
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 314.73 375.49 426.50 479.64 530.02 587.13 839.26 1572.58
LPT 319.90 388.31 443.51 497.67 556.60 622.93 921.75 1687.43
Random 317.32 381.83 434.59 487.03 542.32 604.54 879.75 1628.65
Flexible Johnsons 314.73 375.49 426.50 479.64 530.02 587.13 839.26 1572.58
Johnsons 340.11 429.38 504.64 575.06 648.05 722.41 1017.32 1760.88
LB 293.39 321.41 335.08 356.75 420.48 494.04 790.60 1535.80
Table 33: max_j p1j ≤ p2j and d ~ U(145, 155)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 231.88 305.43 379.83 454.29 528.98 603.62 903.26 1652.15
LPT 239.31 315.70 391.40 466.97 542.42 617.70 918.38 1669.12
Random 235.63 310.78 385.83 460.91 535.61 610.81 910.68 1660.87
Flexible Johnsons 231.88 305.43 379.83 454.29 528.98 603.62 903.26 1652.15
Johnsons 231.91 305.51 379.96 454.42 529.17 603.85 903.68 1652.64
LB 230.39 303.86 378.05 452.54 527.07 601.77 901.10 1650.11
Table 34: p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ LogNorm(4.51, 1)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 407.11 567.74 690.89 804.48 895.51 975.17 1248.67 1893.21
LPT 412.97 584.20 717.59 832.41 934.27 1019.21 1328.31 2040.63
Random 410.37 575.09 703.12 815.97 913.62 998.27 1288.02 1973.52
Flexible Johnsons 407.45 567.75 692.21 804.78 897.59 977.18 1256.21 1903.01
Johnsons 433.83 630.03 789.12 931.40 1056.51 1164.70 1559.98 2452.49
LB 387.52 525.36 625.66 712.73 782.79 838.19 1051.59 1664.76
Table 35: p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ N(150, 55²)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 263.85 321.34 382.18 445.22 512.56 582.18 873.81 1620.49
LPT 270.40 336.42 404.45 476.26 552.80 631.07 941.02 1711.11
Random 266.95 328.75 392.76 461.44 531.54 605.57 906.48 1661.40
Flexible Johnsons 263.74 321.13 381.91 444.12 510.29 581.41 873.19 1620.36
Johnsons 287.38 367.91 443.23 516.13 589.59 662.09 960.56 1712.13
LB 240.42 264.19 302.10 361.11 430.21 502.29 792.61 1537.24
Table 36: p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ (N(60, 15²) + N(240, 50²))
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 316.04 375.20 426.30 480.24 534.60 589.64 846.47 1584.78
LPT 322.41 390.58 445.91 505.00 563.47 629.89 932.99 1709.77
Random 319.46 382.35 434.60 491.75 547.41 608.12 886.71 1640.84
Flexible Johnsons 316.63 375.37 425.69 479.90 533.30 588.55 845.17 1584.39
Johnsons 341.66 430.16 503.75 579.62 650.81 725.91 1022.16 1775.05
LB 295.25 321.85 332.26 361.15 422.59 497.25 795.60 1546.13
Table 37: p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ U(145, 155)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 233.18 306.76 382.24 456.12 530.77 606.18 906.78 1662.75
LPT 240.93 319.11 396.90 472.97 549.15 625.93 930.56 1695.29
Random 237.05 312.54 388.94 463.64 538.56 614.14 915.28 1672.34
Flexible Johnsons 232.89 306.64 382.21 456.11 530.75 606.18 906.78 1662.75
Johnsons 232.93 306.74 382.34 456.30 530.95 606.42 907.15 1663.22
LB 231.12 304.73 380.26 454.27 528.88 604.06 904.57 1660.61
some cases could compensate for the costly testing. The comparison of worst-case ratios between the stochastic and the deterministic problem is also useful for identifying instances where testing can be employed effectively (see the Appendix). Moreover, all four proposed algorithms use the option "Schedule the Unknown Jobs First" whenever 0 < k < n, where n is the total number of jobs. For the same case, the Johnsons algorithm as well as its modified version use the expected delays E[dj] and treat the unknown jobs as "known", something that did not have the same effect when implemented in the other four algorithms. For the results of testing, we used figures instead of tables, since it is more interesting to observe the qualitative behavior of the algorithms as they use more information on the delays. Finally, all algorithms choose to test the k longest jobs in all cases, as this seems to be the best option compared to the other two. More details about the two policies ("schedule the unknown first" and "test k-longest") and why they were chosen can be found in the Appendix.
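The two policies can be sketched as follows. This is a minimal illustration, not the thesis's exact implementation: the function name `schedule_with_testing`, the use of p1j as the "length" of a job, and the SPT ordering of the tested jobs are assumptions.

```python
def schedule_with_testing(jobs, k, expected_delay):
    """Illustrative sketch of the two policies: test the k longest jobs,
    then schedule the unknown (untested) jobs first.
    Length measure (p1j) and the SPT order of tested jobs are assumptions."""
    # Policy "test k-longest": reveal the delays of the k longest jobs.
    by_length = sorted(jobs, key=lambda j: j["p1"], reverse=True)
    tested, untested = by_length[:k], by_length[k:]
    for j in tested:
        j["d_est"] = j["d"]            # testing reveals the true delay
    for j in untested:
        j["d_est"] = expected_delay    # untested jobs keep E[d]
    # Policy "schedule the unknown first": untested jobs lead the sequence,
    # the tested jobs follow (here in SPT order on p1j).
    return untested + sorted(tested, key=lambda j: j["p1"])
```

A schedule produced this way uses the revealed delays only for the tail of the sequence, which is exactly the regime 0 < k < n discussed above.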
Figure 13: 25 jobs, p1j = p2j = 13, d ~ (N(60, 15²) + N(240, 50²)). Panels: (a) k-testing for free; (b) k-testing costly.
Figure 15: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ LogNorm(4.51, 1)
Figure 16: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ N(150, 55²). Panels: (a) k-testing for free; (b) k-testing costly.
Figure 17: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ (N(60, 15²) + N(240, 50²))
Figure 18: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ U(145, 155)
those cases. Recall that in the case of equal processing times the algorithms SPT(p1j + dj), Flexible Johnsons and Johnsons are generally weak since they have no sorting ability, so the results of the deterministic problem can only be worse than those of the stochastic one, where they all apply their free stochastic counterparts. For the lognormal distribution, the algorithms LDF, LPT1LPT2, SPT1LPT2 and LPT1SPT2 improve their solutions as they utilize more information about the delays, even when the testing is not free. With regard to the normal and bimodal distributions, the LDF algorithm, to our surprise, cannot use the extra information to its benefit after a certain point, even when testing is free. That said, note that some algorithm curves are concave, which is a result of the fact that the algorithms are memoryless. For example, the LDF in Figure 13 achieves its best solution when it tests 40% of the jobs. Yet, if it can test 50% of the jobs, it produces an entirely new schedule without taking into account that a better schedule might exist if it used less information.
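The memoryless behavior described above can be checked on small instances with a makespan evaluator. Below is a minimal sketch of the standard recurrence for permutation schedules in a two-machine flow shop with delays; assuming machine 2 processes jobs in the same order as machine 1 is a simplification (non-permutation schedules may do better).

```python
def makespan(order):
    """Makespan of a permutation schedule in a two-machine flow shop with
    delays: operation O2j may start only d time units after O1j completes.
    Jobs are given as (p1, p2, d) tuples in processing order."""
    t1 = t2 = 0.0
    for p1, p2, d in order:
        t1 += p1                     # machine 1 is never idle
        t2 = max(t2, t1 + d) + p2    # wait for the delay and for machine 2
    return t2
```

With such an evaluator one can compare, for each testing budget k, the schedule an algorithm outputs against the schedules it produced for smaller k.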
For the general case, the Flexible Johnsons and the Johnsons algorithm are no longer disabled, and for normally and uniformly distributed delays they actually improve their results as the problem tends towards the deterministic version. Except for the case where the delays follow the lognormal distribution, no algorithm gives a better result than the SPT in the general case of full costly testing. Note that the SPT here is actually SPT(p1j), a free algorithm, and thus does not require testing any jobs. It should also be mentioned that all the algorithms presented in Section 7.2 can be used as free algorithms using only the expected delays E[dj], but we chose SPT as the only free algorithm in this section due to its overall performance for the stochastic problem. Finally, Figure 17 gives some indication that the stochastic problem with testing can yield a schedule better than both the purely stochastic and the deterministic schedule.
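For reference, Johnson's rule extended to time lags in the spirit of Mitten (1959, cited below) can be sketched as follows; whether this matches the thesis's "Johnsons" implementation exactly is an assumption.

```python
def mitten_order(jobs):
    """Johnson's rule applied to delay-augmented times a = p1 + d and
    b = d + p2, following Mitten's (1959) extension to time lags.
    Jobs are (p1, p2, d) tuples; returns a processing order."""
    aug = [(p1 + d, d + p2, (p1, p2, d)) for p1, p2, d in jobs]
    # Jobs with a <= b go first in nondecreasing a; the rest follow in
    # nonincreasing b, exactly as in the classical Johnson's rule.
    first = sorted((j for j in aug if j[0] <= j[1]), key=lambda j: j[0])
    rest = sorted((j for j in aug if j[0] > j[1]), key=lambda j: j[1], reverse=True)
    return [j[2] for j in first + rest]
```

When all processing times are equal, a and b coincide across jobs, which illustrates why the sorting-based rules lose their discriminating power in that case.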
Choosing the SPT over the LPT or Random can lead to significant reductions in the completion time as the number of jobs increases. As a result, the EDD rule used by the CPM could be replaced by another scheduling rule. Returning to the other initial research question, whether testing can be employed as a means to retrieve the delays, which was the central innovation of this research, there is no straightforward answer; we can only speak for the instances we simulated. In general, whether to test rather than use one of the "free" algorithms (e.g. SPT or Flexible Johnsons) to create a schedule when the delays are unknown depends on the distribution, the total number of jobs, and the cost function of testing. Specifically, whenever the delays are uniformly distributed the stochastic version does not differ from the deterministic one, as the improvement is 0.3%, which makes testing obsolete for instances of 25 jobs. The same applies more or less to the normal distribution, as the improvement with free testing is almost 2%. However, the improvement in the case of the lognormal distribution with free testing is 10% for the general case and 16% if the processing times are equal, which makes improvements of 6.3% and 11.6%, respectively, possible in the case of costly testing for the given cost function. An overview of detailed recommendations for the simulated instances can be found in the Appendix.
For further research, it is recommended that the given algorithms acquire memory. We saw that in the bimodal case the LPT1LPT2 achieves a solution better than that of the deterministic setting when 40% of the jobs are tested. With memory, the algorithm would be able to reproduce the best schedule obtained so far, even if that schedule was achieved under uncertainty; as a result, testing would be more effective. Furthermore, as mentioned before, the value of testing is related to the distribution but also to the cost function. It would be interesting to study how these algorithms perform not only with other distributions but also with the same distributions and different parameter values. Another aspect worth investigating is that of learning, in particular the case where testing can reduce the processing time of O1j, which would be based on the assumption that testing is not only information retrieval but also a form of processing. Finally, we recommend further research on the ILP to investigate in depth whether the calibration of M is the right strategy to obtain optimal feasible solutions or whether additional constraints are required.
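The "memory" recommendation amounts to a simple wrapper: keep the best schedule seen over all testing budgets k tried so far. A sketch, where `evaluate` and `schedules_by_k` are hypothetical stand-ins for a makespan evaluator and the per-k output of any of the algorithms:

```python
def best_with_memory(evaluate, schedules_by_k):
    """Return the best schedule over all testing budgets k.
    `evaluate` maps a schedule to its (estimated) makespan; lower is better.
    `schedules_by_k` maps each budget k to the schedule produced for it.
    Both names are illustrative stand-ins, not the thesis's API."""
    best = None
    for k in sorted(schedules_by_k):
        candidate = schedules_by_k[k]
        if best is None or evaluate(candidate) < evaluate(best):
            best = candidate
    return best
```

With this wrapper, the concave curves discussed earlier could never rise above their own minimum: a larger budget can only match or improve the retained schedule.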
References
Ageev, Alexander A. (2007). A 3/2-Approximation for the Proportionate Two-Machine Flow Shop
Scheduling with Minimum Delays. 5th International Workshop, WAOA 2007 Eilat, Israel, Oc-
tober 11-12, 2007 Revised Papers, pp. 55-66.
Biskup, D. (1999). Single-machine scheduling with learning considerations. European Journal of Operational Research 115, pp. 173-178.
Boudhar, Mourad and Nacira Chikhi (2011). Two-machine flow shop with transportation times. Available at http://studia.complexica.net/Art/RI090204.pdf.
Burns, Fennell and John Rooker (1975). A special case of the 3 × n flow shop problem. Naval Research Logistics Quarterly 22, pp. 811-817.
Dell'Amico, Mauro (1996). Shop Problems with Two Machines and Time Lags. Operations Research 44, pp. 777-787.
Frasch, Janick V., Sven Oliver Krumke, and Stephan Westphal (2011). MIP Formulations for
Flowshop Scheduling with Limited Buffers. Theory and Practice of Algorithms in (Computer)
Systems 6595, pp. 127-138.
Johnson, S. M. (1954). Optimal two- and three-stage production schedules with setup times included. Naval Research Logistics Quarterly 1, pp. 61-68.
(1958). Sequencing n Jobs on Two Machines with Arbitrary Time Lags; alternate proof and discussion of the general case. Available at http://www.rand.org/pubs/papers/P1526.html.
Kamburowski, Jerzy (2000). Non-bottleneck machines in three-machine flow shops. Journal of
Scheduling 3, pp. 209-223.
Karuno, Yoshiyuki and Hiroshi Nagamochi (2003). A Better Approximation for the Two-Machine
Flowshop Scheduling Problem with Time Lags. 14th International Symposium, ISAAC 2003
Kyoto, Japan, December 15-17, 2003 Proceedings, pp. 309-318.
Lageweg, B. J., E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan (1981). Computer-aided
complexity classification of deterministic scheduling problems. Tech. rep. BW138. Amster-
dam, The Netherlands: Centre for Mathematics and Computer Science.
(1982). Computer-aided complexity classification of combinatorial problems. Communica-
tions of the ACM 25, pp. 817-822.
Lawler, E.L., J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys (1993). Sequencing and schedul-
ing: algorithms and complexity, Handbooks in Operations Research and Management Science,
Vol.4: Logistics of Production and Inventory. North-Holland.
Leung, Joseph Y-T., Haibing Li, and Hairong Zhao (2007). Scheduling two-machine flow shops
with exact delays. International Journal of Foundations of Computer Science 18.2, pp. 341-
359.
Levi, Retsef, Thomas Magnanti, and Yaron Shaposhnik (2015). Scheduling with Testing. IN-
FORMS.
Mitten, L. G. (1959). Sequencing n Jobs on Two Machines with Arbitrary Time Lags. Available at
http://pubsonline.informs.org/doi/pdf/10.1287/mnsc.5.3.293.
Monma, C.L. and A.H.G. Rinnooy Kan (1983). A concise survey of efficiently solvable special cases of the permutation flow-shop problem. RAIRO - Operations Research 17.2, pp. 105-119.
Nabeshima, I. (1963). Sequencing on two machines with start lag and stop lag. Journal of the Operations Research Society of Japan 5, pp. 97-101.
Nawijn, W.M. and W. Kern (1991). Scheduling multi-operation jobs with time lags on a single
machine. Proceedings 2nd Twente Workshop on Graphs and Combinatorial Optimization, U.
Faigle and C. Hoede (eds.), Enschede.
Orman, A.J. and C.N. Potts (1997). On the complexity of coupled-task scheduling. Discrete Ap-
plied Mathematics 72, pp. 141-154.
Pan, Chao-Hsien (1997). A study of integer programming formulations for scheduling problems.
International Journal of Systems Science 28, pp. 33-41.
Pinedo, Michael L. (2004). Planning and Scheduling in Manufacturing and Services. Springer.
(2008). Scheduling: Theory, Algorithms, and Systems. Springer.
Potts, Chris N., David B. Shmoys, and David P. Williamson (1991). Permutation vs. non-permutation
flow shop schedules. Operations Research Letters 10, pp. 281-284.
Rothkopf, M. H. (1966). Scheduling with random service times. Management Science 12, pp.
703-713.
Strusevich, Vitaly A. and Carin M. Zwaneveld (1994). On Non-Permutation Solutions to Some
Two Machine Flow Shop Scheduling Problems. ZOR - Mathematical Methods of Operations
Research 39, pp. 305-319.
Szwarc, W. (1968). On some sequencing problems. Naval Research Logistics Quarterly 15, pp. 127-155.
Uetz, Marc (2001). Algorithms for deterministic and stochastic scheduling. PhD Thesis, TU Berlin.
Weiss, G. and M. Pinedo (1980). Scheduling tasks with exponential service times on non-identical
processors to minimize various cost functions. Journal of Applied Probability 17, pp. 187-202.
Wright, T.P. (1936). Factors affecting the cost of airplanes. Journal of Aeronautical Science 3, pp.
122-128.
Wu, C.C. and W.C. Lee (2009). Single-machine and flow shop scheduling with a general learning
effect model. Computers and Industrial Engineering 56, pp. 1553-1558.
Xingong, Zhang and Yan Guangle (2010). Machine scheduling problems with a general learning
effect. Mathematical and Computer Modelling 51, pp. 84-90.
Yang, D.L. and M.S. Chern (1995). A two-machine flow shop sequencing problem with limited
waiting time constraints. Computers and Industrial Engineering 28, pp. 63-70.
Yu, Wenci (1996). The two-machine flow shop problem with delays and the one-machine total
tardiness problem. PhD Thesis, Available at http://repository.tue.nl/461119.
Yu, Wenci, Han Hoogeveen, and Jan Karel Lenstra (2004). Minimizing makespan in a two-machine
flow shop with delays and unit-time operations is NP-Hard. Journal of Scheduling 7, pp. 333-
348.
Appendix
In this section, we present the results of the simulations that led to useful conclusions about the scheduling policy whenever the testing option is limited. These conclusions were implemented and used to form the final configuration of the algorithms and to determine their behavior when only limited testing is possible. The two most important questions when the resources allow one to test and learn the delays of only some jobs are: which k jobs should be tested, and where should the untested jobs be placed in the schedule?
The following sections are devoted to the most important comparisons that determined the above-mentioned scheduling policies.
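The simulated instances behind the comparisons below can be sampled directly from the distributions given in the table and figure captions. A minimal sketch for the general case with lognormal delays; truncating the normal draws at zero is our assumption, since processing times must be nonnegative:

```python
import random

def sample_instance(n, rng=random):
    """Draw one instance of the general case used in the simulations:
    p1j ~ N(10.4, 3.2^2), p2j ~ N(15.1, 4.2^2) and d ~ LogNorm(4.51, 1)
    (one of the four delay distributions used in the thesis)."""
    jobs = []
    for _ in range(n):
        p1 = max(0.0, rng.gauss(10.4, 3.2))   # truncate at zero (assumption)
        p2 = max(0.0, rng.gauss(15.1, 4.2))
        d = rng.lognormvariate(4.51, 1.0)     # lognormal delay
        jobs.append((p1, p2, d))
    return jobs
```

The other delay distributions from the captions (N(150, 55²), the bimodal mixture of N(60, 15²) and N(240, 50²), and U(145, 155)) can be substituted for the lognormal draw in the same way.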
Figure 19: Behavior of the algorithms Flexible Johnsons and Johnsons w.r.t. the k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 20: Behavior of the algorithms Flexible Johnsons and Johnsons w.r.t. the k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
(a) d ~ LogNorm(4.51, 1) (b) d ~ N(150, 55²)
Figure 21: Behavior of the algorithms LDF and LPT1LPT2 w.r.t. the k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 22: Behavior of the algorithms LDF and LPT1LPT2 w.r.t. the k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 23: Behavior of the algorithms SPT1LPT2 and LPT1SPT2 w.r.t. the k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
(a) d ~ (N(60, 15²) + N(240, 50²)) (b) d ~ U(145, 155)
Figure 24: Behavior of the algorithms SPT1LPT2 and LPT1SPT2 w.r.t. the k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 25: Behavior of the algorithms Flexible Johnsons and Johnsons w.r.t. the position of k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 26: Behavior of the algorithms Flexible Johnsons and Johnsons w.r.t. the position of k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
(a) d ~ LogNorm(4.51, 1) (b) d ~ N(150, 55²)
Figure 27: Behavior of the algorithms LDF and LPT1LPT2 w.r.t. the position of k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 28: Behavior of the algorithms LDF and LPT1LPT2 w.r.t. the position of k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 29: Behavior of the algorithms SPT1LPT2 and LPT1SPT2 w.r.t. the position of k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
(a) d ~ (N(60, 15²) + N(240, 50²)) (b) d ~ U(145, 155)
Figure 30: Behavior of the algorithms SPT1LPT2 and LPT1SPT2 w.r.t. the position of k-tested jobs for various instances of free testing: 25 jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²)
Figure 31: Ratios of the algorithms w.r.t. the number of jobs, p1j = p2j = 13 and d ~ LogNorm(4.51, 1)
(a) Stochastic (b) Deterministic
Figure 32: Ratios of the algorithms w.r.t. the number of jobs, p1j = p2j = 13 and d ~ N(150, 55²)
Figure 33: Ratios of the algorithms w.r.t. the number of jobs, p1j = p2j = 13 and d ~ (N(60, 15²) + N(240, 50²))
Figure 34: Ratios of the algorithms w.r.t. the number of jobs, p1j = p2j = 13 and d ~ U(145, 155)
(a) Stochastic (b) Deterministic
Figure 35: Ratios of the algorithms w.r.t. the number of jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ LogNorm(4.51, 1)
Figure 36: Ratios of the algorithms w.r.t. the number of jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ N(150, 55²)
Figure 37: Ratios of the algorithms w.r.t. the number of jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ (N(60, 15²) + N(240, 50²))
(a) Stochastic (b) Deterministic
Figure 38: Ratios of the algorithms w.r.t. the number of jobs, p1j ~ N(10.4, 3.2²), p2j ~ N(15.1, 4.2²) and d ~ U(145, 155)
Table 38: p1j = p2j = 13 and d ~ LogNorm(4.51, 1)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 415.60 573.61 702.15 817.30 923.80 1022.41 1367.08 2098.79
LPT 415.60 573.61 702.15 817.30 923.80 1022.41 1367.08 2098.79
Random 415.60 573.61 702.15 817.30 923.80 1022.41 1367.08 2098.79
Flexible Johnsons 415.60 573.61 702.15 817.30 923.80 1022.41 1367.08 2098.79
Johnsons 439.57 627.31 784.52 926.76 1057.67 1183.92 1621.47 2544.62
LB 387.81 510.61 603.14 681.98 748.91 812.35 1022.85 1513.72
Table 39: p1j = p2j = 13 and d ~ N(150, 55²)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 273.99 343.45 408.25 475.76 539.81 606.94 868.58 1522.12
LPT 273.99 343.45 408.25 475.76 539.81 606.94 868.58 1522.12
Random 273.99 343.45 408.25 475.76 539.81 606.94 868.58 1522.12
Flexible Johnsons 273.99 343.45 408.25 475.76 539.81 606.94 868.58 1522.12
Johnsons 292.77 377.28 452.15 524.74 595.70 665.23 936.25 1600.68
LB 241.17 262.20 283.18 325.12 381.60 442.69 692.56 1331.14
Table 40: p1j = p2j = 13 and d ~ (N(60, 15²) + N(240, 50²))
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 321.85 397.76 464.08 529.41 595.37 659.47 920.92 1572.49
LPT 321.85 397.76 464.08 529.41 595.37 659.47 920.92 1572.49
Random 321.85 397.76 464.08 529.41 595.37 659.47 920.92 1572.49
Flexible Johnsons 321.85 397.76 464.08 529.41 595.37 659.47 920.92 1572.49
Johnsons 343.09 437.42 515.43 588.22 659.63 729.21 1000.72 1664.35
LB 291.66 320.59 333.45 343.35 377.27 437.37 693.87 1339.45
Table 41: p1j = p2j = 13 and d ~ U(145, 155)
Algorithm / # jobs        5        10        15        20        25        30        50       100
SPT 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
LPT 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
Random 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
Flexible Johnsons 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
Johnsons 231.37 297.10 362.41 427.57 492.69 557.76 817.92 1468.00
LB 224.70 288.86 353.55 418.39 483.27 548.21 808.09 1458.01
Detailed recommendations
The following tables summarize the recommendations for instances of 25 jobs. Note that these recommendations apply only to this specific number of jobs, since the behavior of the algorithms is affected by the number of jobs.
Table 42: Overview of recommendations for 25-job instances and testing for free
Table 43: Overview of recommendations for 25-job instances and costly testing