
Exact and Approximate Algorithms for Scheduling Nonidentical Processors

ELLIS HOROWITZ

University of Southern California, Los Angeles, California


AND

SARTAJ SAHNI

University of Minnesota, Minneapolis, Minnesota

ABSTRACT. Exact and approximate algorithms are presented for scheduling independent tasks in a multiprocessor environment in which the processors have different speeds. Dynamic programming type algorithms are presented which minimize finish time and weighted mean flow time on two processors. The generalization to m processors is direct. These algorithms have a worst-case complexity which is exponential in the number of tasks. Therefore approximation algorithms of low polynomial complexity are also obtained for the above problems. These algorithms are guaranteed to obtain solutions that are close to the optimal. For the case of minimizing mean flow time on m processors an algorithm is given whose complexity is O(n log mn).

KEY WORDS AND PHRASES: scheduling independent tasks, uniform processors, unrelated processors, finish time, mean flow time, weighted mean flow time, exact algorithms, approximate algorithms, complexity

CR CATEGORIES: 4.32, 5.39

1. Introduction
We are concerned here with scheduling n ≥ 1 independent tasks T_1, ..., T_n on m ≥ 1 processors P_1, ..., P_m. Thus we assume there are no precedence constraints on the tasks and also that all schedules must be nonpreemptive. The execution time of task T_j on processor P_i will be denoted by t_ij, yielding an m × n matrix of processing times. Each t_ij is assumed to be a positive rational number, and without loss of generality t_ij, 1 ≤ j ≤ n, is normalized to a positive integer.

Formally, a schedule S for m processors is a partition of the set of task indices {1, 2, ..., n} into m disjoint, ordered sets R_1, ..., R_m such that

(i) R_i = {r_i1, r_i2, ..., r_ij_i}, j_i ≥ 0,

(ii) ∪_{1≤i≤m} R_i = {1, 2, ..., n}.

Informally this means that the tasks specified by the indices in R_i are to be executed on processor i. Task T_{r_ik} is to be the kth task executed on processor i.

Copyright © 1976, Association for Computing Machinery, Inc. General permission to republish, but not for profit, all or part of this material is granted provided that ACM's copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery.
This work was supported by the National Science Foundation under Grants GJ-44207 and DCR74-10081.
Authors' addresses: E. Horowitz, Computer Science Program, University of Southern California, Los Angeles, CA 90007; S. Sahni, Department of Computer, Information and Control Sciences, University of Minnesota, 114 Main Engineering Building, Minneapolis, MN 55455.

Journal of the Association for Computing Machinery, Vol. 23, No. 2, April 1976, pp. 317-327.

There are two commonly used measures of the worth of a schedule, finish time and mean flow time. The finish time of processor i for any schedule S = {R_1, ..., R_m}, written F_i(S), is defined as

F_i(S) = Σ_{1≤k≤j_i} t_{i,r_ik}, 1 ≤ i ≤ m.

The finish time of the entire schedule, F(S), is then

F(S) = max_{1≤i≤m} {F_i(S)}.

We will be interested in finding schedules which minimize F(S). Such a schedule will be denoted by S*. We also speak of the finish time of task T_k on processor i, denoted as f_k(S), and define it as

f_k(S) = Σ_{1≤l≤d} t_{i,r_il}, where r_id = k.

Then the mean flow time of schedule S is defined to be

mft(S) = Σ_{1≤k≤n} f_k(S),

or the sum of the finish times of the jobs on their respective processors. A third criterion of scheduling is the weighted mean flow time, where n additional nonnegative weights w_1, ..., w_n are specified and the weighted mean flow time is defined as

wmft(S) = Σ_{1≤k≤n} w_k f_k(S).

Throughout we will be interested in finding schedules which minimize one of these three functions. A schedule which yields the minimum value will be called an optimal schedule.

Quite a bit is already known about minimizing finish and mean flow time for m ≥ 1 identical processors. In [1] it is shown that for m ≥ 2 obtaining a schedule that minimizes either finish time or weighted mean flow time is NP-complete. Informally this means that obtaining an algorithm of polynomial time complexity that can generate a schedule with minimum finish time is as hard as obtaining an algorithm of polynomial complexity for such problems as the traveling salesman, knapsack, and maximum clique problems (see Karp [8] and Sahni [13] for further discussion of NP-complete problems). The theory plus the diversity of problems developed in [8] and [13] indicates that in all likelihood any problem which is NP-complete does not admit of an algorithm of polynomial complexity.
This puts added importance on the development of approximation algorithms. Bruno, Coffman, and Sethi in [1-3] and Graham in [5] have presented O(n log n) heuristic algorithms for identical processors. These heuristics guarantee schedules that have a finish time close to the minimum. For example, it is shown in [5] that the finish time of the schedule obtained by assigning jobs in order of decreasing execution times (usually called an LPT schedule) is at most 4/3 - 1/(3m) times the optimal.
For the case of mean flow time on identical processors it is well known that a schedule S is minimal if and only if it is an SPT schedule (shortest processing time first); see [4]. An SPT schedule can be found in time O(n log n), and thus an approximation algorithm for this problem is unnecessary.
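For concreteness, the following is a minimal sketch (ours, not from [4]; the function name and structure are assumptions) of one way to construct an SPT schedule on m identical processors in O(n log n) time, using a heap of processor finish times.

```python
import heapq

def spt_schedule(times, m):
    # Sort jobs by processing time (SPT) and always place the next job on the
    # processor that becomes free earliest; on identical processors this
    # minimizes the mean flow time (the sum of the jobs' finish times).
    order = sorted(range(len(times)), key=lambda j: times[j])
    free = [(0, i) for i in range(m)]        # (current finish time, processor)
    heapq.heapify(free)
    total_flow, assign = 0, [[] for _ in range(m)]
    for j in order:
        finish, i = heapq.heappop(free)
        finish += times[j]                   # completion time of job j
        total_flow += finish
        assign[i].append(j)
        heapq.heappush(free, (finish, i))
    return total_flow, assign
```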
In this paper we are concerned chiefly with the situation of nonidentical processors. Moreover it is convenient to distinguish between two situations. The first is when some processors are uniformly faster than others. This we refer to as m-uniform processors P_1, ..., P_m with relative speeds s_1 = 1, s_2, ..., s_m, where s_i ≥ 1 for 2 ≤ i ≤ m. Equivalently, processor i is said to run s_i times faster than processor 1. Such a situation may often arise. For instance we might have a computing system with several processors differing only in their speeds. A job shop may have several machines for performing the
same tasks but which run at different speeds. Such a system has already been studied by J. Liu and C. Liu in [11], where they give results on the question of minimizing finish time on uniform processors where s_i = 1 for 1 ≤ i ≤ m - 1 and s_m ≥ 1. They present some heuristics and compare the resulting schedules with arbitrary schedules. Since for uniform processors the execution times t_ij are related for fixed j, namely t_ij = t_1j/s_i for 1 ≤ i ≤ m and 1 ≤ j ≤ n, we will from now on only specify the times t_j = t_1j, 1 ≤ j ≤ n, and the relative speeds s_i, 1 ≤ i ≤ m.
The second situation of nonidentical processors arises when the processors are radically different and hence unrelated. This may occur when there are several job shops of radically different structure, so that certain tasks could be executed efficiently in one job shop yet others executed more efficiently in a different job shop. We shall refer to this situation as one of unrelated processors. In [2] Bruno, Coffman, and Sethi present an O(max{mn², n³}) algorithm to obtain a schedule which minimizes the mean flow time in the case of m unrelated processors.
In Section 2 we present exact algorithms for the finish time, mean flow time, and weighted mean flow time problems. Section 2.1 presents an algorithm for minimum finish time on both uniform and unrelated processors. Section 2.2 presents an algorithm for minimum mean flow time on m-uniform processors whose computing time is O(n log mn). In Section 2.3 we give an algorithm for minimum weighted mean flow time for uniform processors.
Our algorithms for obtaining a schedule that minimizes either the finish time or the weighted mean flow time require in the worst case an exponential amount of time. Since both of these problems are known to be NP-complete, it is unlikely that any polynomially bounded algorithms for these problems exist. In any practical situation the time required to obtain the schedule must also be considered. Thus if T_S is the time taken to compute the schedule S and F(S) is the finish time of S, then we actually wish to minimize the quantity T_S + F(S), or T_S + wmft(S) in the case of the weighted mean flow time. When T_S is comparable to or significantly greater than F(S) we would be better off using a schedule that approximates F(S*) but which can be obtained easily. Alternatively, it is often the case that the execution times are themselves only estimates. In such a case an approximate solution would probably be just as meaningful as an exact solution. Therefore in Section 3 we turn our attention to the problem of obtaining a schedule that approximates the minimal finish time or the minimal weighted mean flow time. These algorithms have a complexity which is either linear or quadratic in the number of tasks to be scheduled. Moreover they allow one to find a schedule whose time is as close as one wants to the minimum.

2. Exact Algorithms
2.1. MINIMUM FINISH TIME. In this section we present an algorithm which minimizes the finish time on two unrelated processors. The generalization to more than two processors is direct and will be discussed later. Thus we are given n jobs with times t_ij, 1 ≤ j ≤ n, i = 1, 2. The algorithm is of the dynamic programming type, this type having already been shown to be useful for many other problems such as the partition problem; see [6]. In this case the algorithm proceeds by computing sets S^(0), ..., S^(n) which contain 3-tuples. A 3-tuple (F_1, F_2, e) has coordinates which are defined to be

F_1 ::= the finish time on processor 1;
F_2 ::= the finish time on processor 2;
e ::= an encoded bit string whose ones indicate jobs allocated to processor 1.

From the set S^(n) the minimum finish time and the corresponding schedule can be obtained.

Algorithm FT
Input: n; t_ij for 1 ≤ j ≤ n, i = 1, 2.
Output: A schedule which assigns jobs to processors 1 and 2 such that the finish time is minimized.
1. S^(0) ← {(0, 0, 0)}; F ← finish time of the schedule obtained by executing each task on the processor for which its time is minimal.
2. for j ← 1, ..., n do
     S^(j) ← {S^(j-1) + (t_1j, 0, 2^j)} ∪ {S^(j-1) + (0, t_2j, 0)};
   Note. Addition of 3-tuples (+) is done componentwise; the union operation (∪) should be done so that (i) if (a, b, c), (d, e, f) ∈ S^(j) are such that a = d and b < e, then only (a, b, c) is retained, and (ii) no tuple (a, b, c) for which a or b is greater than F is retained.
3. In S^(n) choose the 3-tuple having the smallest maximum of the first and second coordinates.

The schedule is obtained by decoding the third coordinate, which gives the tasks to be executed on processor 1. The remainder go onto processor 2. The tasks may be executed in any order and the finish time will be minimal. However, executing them in nonincreasing order of t_ij will reduce the mean flow time of the resulting schedule. □
It is easy to verify the correctness of Algorithm FT by noting that all possible schedules are generated in step 2; the only ones eliminated are those that cannot lead to an optimal schedule. A more complete proof of a similar dynamic programming algorithm can be found in [6].

This dynamic programming approach, as embodied by Algorithm FT, may also be viewed as a branch-and-bound method in which S^(j) represents the nodes at level j of the branch-and-bound tree. The elimination rule of step 2 eliminates those nodes that are dominated by other nodes in the tree and thus corresponds to the bounding operation performed in a branch-and-bound algorithm. A more detailed treatment of the general branch-and-bound technique can be found in [10].
Computing time. Let F(S*) be the finish time corresponding to an optimal schedule S*. Then an upper bound on F(S*) is F(S*) ≤ F, where F is the finish time of the schedule obtained by executing each task on the processor on which its execution time is minimum. Since the tuples in S^(j) have distinct integer first coordinates, it follows that no S^(j) has more than F + 1 tuples. Also, the size of each S^(j) at most doubles at each iteration of step 2. Hence another bound on the number of tuples in S^(j) is 2^n. If τ is the time required to add two tuples together, then the overall computing time becomes O(min{nF, 2^n}·τ). No sorting is needed in step 2, as the S^(j) may be generated in nondecreasing order of first coordinates. As written, the time τ required for the tuple addition could be O(n), as the number of digits in the encoding is O(n). To get around this, for large n (i.e. n > number of bits in one computer word), the encoding may be maintained as a linked list of indices which can be shared by several tuples. In this case the addition of tuples can be carried out in time τ = O(1), and so the overall computing time of Algorithm FT is O(min{nF, 2^n}). □
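To make the tuple mechanics concrete, here is a minimal Python sketch of Algorithm FT (our reconstruction, not the authors' code; names are ours). A dictionary keyed on the first coordinate performs the dominance pruning of step 2, and e is an integer bit string with bit j set when job j (0-based) runs on processor 1.

```python
def ft(t1, t2):
    # t1[j], t2[j]: time of job j on processors 1 and 2.
    n = len(t1)
    # F: finish time when each job runs on the processor where it is faster.
    F = max(sum(a for a, b in zip(t1, t2) if a <= b),
            sum(b for a, b in zip(t1, t2) if a > b))
    S = {0: (0, 0)}                    # F1 -> (smallest F2, encoding e)
    for j in range(n):
        nxt = {}
        for f1, (f2, e) in S.items():
            for a, b, c in ((f1 + t1[j], f2, e | 1 << j),   # job j on P1
                            (f1, f2 + t2[j], e)):           # job j on P2
                if a <= F and b <= F and (a not in nxt or b < nxt[a][0]):
                    nxt[a] = (b, c)
        S = nxt
    f1, (f2, e) = min(S.items(), key=lambda kv: max(kv[0], kv[1][0]))
    return max(f1, f2), e

# The example that follows in the text: finish time 4, jobs 1 and 2 on P1.
print(ft([1, 3, 5], [4, 7, 3]))        # -> (4, 3); 3 = binary 011
```

Since the dictionary holds at most F + 1 entries per stage, this matches the O(min{nF, 2^n}) bound.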
Example. Let n = 3, m = 2, and t_1j = (1, 3, 5), t_2j = (4, 7, 3). Without pruning the sets, the S^(j) would be generated as

S^(1) = {(1, 0, 2^1), (0, 4, 0)},
S^(2) = {(4, 0, 2^1 + 2^2), (3, 4, 2^2), (1, 7, 2^1), (0, 11, 0)},
S^(3) = {(9, 0, 2^1 + 2^2 + 2^3), (8, 4, 2^2 + 2^3), (6, 7, 2^1 + 2^3), (5, 11, 2^3), (4, 3, 2^1 + 2^2), (3, 7, 2^2), (1, 10, 2^1), (0, 14, 0)}.

However, assigning jobs 1 and 2 to processor 1 and job 3 to processor 2 gives a finish time of 4. Using this value for F reduces the size of the S^(j), so that Algorithm FT actually produces

S^(1) = {(1, 0, 2^1), (0, 4, 0)}, S^(2) = {(4, 0, 2^1 + 2^2), (3, 4, 2^2)},
S^(3) = {(4, 3, 2^1 + 2^2)}.

The optimal schedule has minimum finish time max{4, 3} = 4 and is obtained by decoding the third coordinate 2^1 + 2^2.

Finally we note that the algorithm can be simplified if the processors are uniform. In this case the second coordinate becomes unnecessary, as the finish time on processor 2 can be computed from the first coordinate t as (Σ_{1≤i≤n} t_i - t)/s (s the speed of processor 2 and t_i the time for job i on processor 1).
The generalization of this method to m processors works by considering (m + 1)-tuples whose ith coordinate is the finish time on processor i, 1 ≤ i ≤ m, and whose last coordinate encodes the jobs on the various processors. S^(j) is computed from S^(j-1) by adding (m + 1)-tuples componentwise as in step 2. The computing time for this algorithm will be O(min{nF^{m-1}, m^n}), where F is the finish time of the schedule obtained by assigning each job to the processor on which its execution time is minimal.
2.2. MINIMUM MEAN FLOW TIME. In [2] Bruno, Coffman, and Sethi have given an algorithm which finds a schedule with minimal mean flow time for n jobs with times t_ij on m unrelated processors. Their algorithm takes time O(max{mn², n³}) and proceeds by reducing this problem to a minimum cost network flow problem.

The algorithm given in this section will find a schedule which minimizes the mean flow time for a set of n jobs to be run on m uniform processors with speeds 1, s_2, ..., s_m, respectively. The computing time of this algorithm is O(n log mn). To begin we show
that for uniform processors an SPT schedule can work very poorly. In the simplest case imagine two processors with speeds 1 and s = s_2, and two jobs both requiring time t on processor 1. The SPT schedule would assign one job to each processor, yielding a mean flow time of t + t/s = t(1 + 1/s). An alternative schedule might place both jobs on processor 2, yielding a mean flow time of t/s + 2t/s = 3t/s. The ratio of these schedules is (s + 1)/3, which gets arbitrarily large with s. Thus the point is that SPT schedules can work very poorly when the processor speeds are very different but uniform.
The algorithm which follows was originally described by Conway, Maxwell, and Miller in [4, pp. 78-79]. We present a more precise implementation of their idea, which works by assigning jobs in nonincreasing order of their times. At each iteration it assigns the kth job to that processor for which the flow time is minimally increased. The determination of this processor can be rapidly accomplished by picking the minimum of a set of m numbers.

Algorithm MFT
Input: m processors with speeds 1, s_2, ..., s_m, 1 ≤ s_2 ≤ ... ≤ s_m; n tasks initially sorted so that t_1 ≤ t_2 ≤ ... ≤ t_n, where the times t_i are for processor 1.
Output: Sets R_i, 1 ≤ i ≤ m. The tasks in R_i are to be run on processor i in increasing order of their execution times.
1. for j ← 1, ..., m - 1 do R_j ← ∅; z_j ← 1/s_j end;
   R_m ← {n}; z_m ← 2/s_m;
   (Note that the above assigns the job with the largest processing time to the fastest processor, m.)
2. for k ← n - 1, ..., 1 do
     let l be the largest index such that z_l = min_{1≤i≤m} {z_i};
     R_l ← R_l ∪ {k}; z_l ← z_l + 1/s_l
   end

The sets R_1, ..., R_m contain the indices of the jobs to be run on the m processors. For each processor the jobs should be run in increasing order of their times.
Computing time. Step 2 requires n - 1 iterations of the loop, where each iteration needs no more than m - 1 comparisons. The number of comparisons for finding l can be reduced to O(log₂ m) by using a heap as in [9, p. 152]. This represents an improvement only for larger values of m because of the overhead (m should be greater than about six). The jobs need to be initially sorted, for a final time requirement of O(n log mn).
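A heap-based sketch of Algorithm MFT follows (our reconstruction; names are ours). It relies on the fact that adding a job with time t to a processor i that already holds u jobs increases the total flow time by (u + 1)·t/s_i, so the processor minimizing z_i = (u_i + 1)/s_i is always chosen.

```python
import heapq

def mft(t, s):
    # t: task times on processor 1, sorted so t[0] <= ... <= t[n-1];
    # s: speeds with s[0] = 1 <= s[1] <= ... <= s[m-1].
    m, n = len(s), len(t)
    R = [[] for _ in range(m)]
    # Heap of (z_i, -i) with z_i = (u_i + 1)/s_i; ties are broken in favor of
    # the largest index, so the first (largest) job goes to the fastest processor.
    heap = [(1 / s[i], -i) for i in range(m)]
    heapq.heapify(heap)
    for k in reversed(range(n)):       # assign jobs in nonincreasing order of time
        z, neg_i = heapq.heappop(heap)
        R[-neg_i].append(k)
        heapq.heappush(heap, (z + 1 / s[-neg_i], neg_i))
    for r in R:
        r.reverse()                    # each R_i runs in increasing order of time
    return R
```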
THEOREM 2.1. Algorithm MFT produces a schedule with minimum mean flow time.

PROOF. Suppose not. Then let S* be a schedule whose mean flow time is minimal and smaller than that of the schedule S produced by MFT. We may assume that in both S* and S the tasks on each processor are scheduled to be executed in increasing order of their execution times, and hence in increasing order of their indices. Let k be the largest index such that task T_k is scheduled on different processors in S* and S. If there is no such k then the schedules are identical. Let i be the processor on which T_k is scheduled in S* and j the corresponding processor in S. Let u_i be the number of tasks scheduled to be processed after and including task T_k on processor i in S*. Let u_j be the corresponding number for task T_k on processor j in S. Since S* and S agree for task indices greater than k and Algorithm MFT assigned task T_k to processor j, it follows that u_j/s_j ≤ u_i/s_i.

Let T_l be the task at position u_j on processor j in S*. If there is no such task, let l = 0 with t_0 = 0. Then, by interchanging the positions of tasks T_k and T_l in S*, the mean flow time of S* changes by

(u_i·t_l/s_i + u_j·t_k/s_j) - (u_i·t_k/s_i + u_j·t_l/s_j) = (u_i/s_i - u_j/s_j)(t_l - t_k) ≤ 0,

since by the definition of k and the ordering of task execution times it follows that t_l - t_k ≤ 0.

Thus this interchange either produces a schedule better than S*, contradicting the assumption that S* was optimal, or it produces a schedule as good as S* but agreeing with schedule S on tasks T_k, ..., T_n. In the latter case, by successively repeating this interchange process on the new schedule, we either transform S* into S with no change in mean flow time (in which case S is also optimal) or we get a better schedule than S* (in which case S* was not optimal). □
2.3. WEIGHTED MEAN FLOW TIME ON UNIFORM PROCESSORS. Again we consider an exact algorithm for n jobs with times t_1, ..., t_n, in the context of two uniform processors of speeds 1 and s. Now we are also given positive weights w_1, ..., w_n and asked to find a schedule which minimizes the weighted mean flow time. For the single processor case we already know the following:

LEMMA 2.3.1. For a single processor the weighted mean flow time is minimized if the jobs are processed in nonincreasing order of w_i/t_i.

PROOF. See [4, p. 44].
We may view any 2-processor schedule as a partition of the set of jobs into sets R_1 and R_2, with the jobs in R_i executed on processor i. From Lemma 2.3.1 it follows that the wmft of the schedules corresponding to the partition R_1, R_2 is minimized if the jobs in R_1 and R_2 are executed in nonincreasing order of w_i/t_i. Consequently, given a partition, its minimum wmft is easily computed. Thus the major problem is to determine a partition with minimum wmft. Therefore we return to the dynamic programming type of algorithm which worked so well in Section 2.1.

We will be computing sets S^(0), ..., S^(n) where each set contains 3-tuples of the form (T, t, e), where

T ::= the current wmft;
t ::= the finish time on processor 1;
e ::= an encoded bit string whose ones indicate jobs allocated to processor 1.

From S^(n) we can choose the tuple with minimum first coordinate, the corresponding third coordinate yielding the proper schedule.

Algorithm WMFT
Input: n positive integers w_1, ..., w_n;
  n jobs with times t_1, ..., t_n ordered so that w_i/t_i ≥ w_{i+1}/t_{i+1}, 1 ≤ i < n;
  s, the speed of the second processor.
Output: A schedule which assigns jobs to processors 1 and 2 such that the wmft is minimized.
1. S^(0) ← {(0, 0, 0)}; TSUM ← 0;
2. for i ← 1 to n do
     A ← B ← ∅;
     for every (T, t, e) ∈ S^(i-1) do
       A ← A ∪ {(T + w_i(t + t_i), t + t_i, e + 2^i)};
       B ← B ∪ {(T + w_i(TSUM - t + t_i)/s, t, e)}
     end;
     S^(i) ← merge A and B into a single set ordered by the second coordinate, where for each distinct second coordinate only the tuple with the smallest first coordinate is retained;
     TSUM ← TSUM + t_i
   end
3. In S^(n) choose the 3-tuple with minimum first coordinate.
Computing time. Since the number of 3-tuples may double at each stage, 2^i bounds the number of tuples in S^(i). The merge time is linear in the size of S^(i), since A and B are ordered. Thus the total time for n iterations is the minimum of n·TSUM and Σ_{1≤i≤n} 2^i = O(2^n), i.e. O(min{n·TSUM, 2^n}). For large enough values of TSUM the computing time is exponential in n, and in addition the space requirements are also exponential.
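The recurrence of step 2 translates directly into the following sketch (our reconstruction; a dictionary keyed on the second coordinate replaces the ordered merge). Jobs are assumed to be pre-sorted by nonincreasing w_i/t_i, and integer times are assumed.

```python
def wmft(w, t, s):
    # w[i], t[i]: weight and time of job i; s: speed of the second processor.
    n = len(t)
    S = {0: (0, 0)}        # P1 finish time -> (smallest wmft so far, encoding)
    tsum = 0               # total time of the jobs considered so far
    for i in range(n):
        nxt = {}
        for t1, (T, e) in S.items():
            cands = (
                # job i on P1: it completes at time t1 + t[i]
                (T + w[i] * (t1 + t[i]), t1 + t[i], e | 1 << i),
                # job i on P2: the time already on P2 is tsum - t1
                (T + w[i] * (tsum - t1 + t[i]) / s, t1, e),
            )
            for T2, t2, e2 in cands:
                if t2 not in nxt or T2 < nxt[t2][0]:
                    nxt[t2] = (T2, e2)
        S = nxt
        tsum += t[i]
    return min(S.values())  # (minimum wmft, encoding of P1's jobs)
```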

3. Approximate Algorithms
The exact algorithms presented in Section 2 for minimizing the finish time and the weighted mean flow time can be exponential in the worst case. In this section we present algorithms whose complexity is a polynomial function of the number of jobs, n. These algorithms are ε-approximate in that they guarantee a fractional error of no more than ε between the optimal and approximate solutions. Formally, we have:

Definition 3.1 [12]. An algorithm will be said to be an ε-approximate algorithm for a minimization problem P if and only if (F̂ - F*)/F* ≤ ε for some constant ε > 0. F* is the value of an optimal solution and F̂ that of the approximate solution; F* is assumed to be greater than 0.
3.1. FINISH TIME ON UNIFORM PROCESSORS. We present the approximate algorithm for the case m = 2. The generalization to an arbitrary number of processors is direct and follows the generalization of the exact algorithm in Section 2.1. Since any schedule can be represented as a partition R_1, R_2, let s = s_2, let t be the finish time on processor 1, t = Σ_{i∈R_1} t_i, and let T = Σ_{1≤i≤n} t_i. For any schedule S, the finish time is

max{t, (T - t)/s} = T/(1 + s) + max{t - T/(1 + s), (T/(1 + s) - t)/s}.

We are interested in obtaining a partition R_1, R_2 for which

max{t - T/(1 + s), (T/(1 + s) - t)/s}

is minimized. This is equivalent to solving the following two sum of subsets problems:

(i) maximize F_1 = Σ t_i·δ_i subject to Σ t_i·δ_i ≤ T/(1 + s), δ_i ∈ {0, 1};

(ii) minimize F_2 = Σ t_i·δ_i subject to Σ t_i·δ_i ≥ T/(1 + s), δ_i ∈ {0, 1}.

Then the minimal finish time is (T - F_1)/s if T/(1 + s) - F_1 ≤ (F_2 - T/(1 + s))·s; otherwise it is F_2. The approximation algorithm proceeds by solving approximately the two sum of subsets problems (i) and (ii) and then choosing the solution which has the better finish time.

The 2-uniform processor approximation algorithm follows the strategy used in the approximation algorithm for the knapsack problem by Ibarra and Kim [7]. Our presentation differs from theirs chiefly in that steps not relevant to our scheduling problem are omitted. Let F* be the optimal finish time. Then it is easy to verify that T/2s ≤
F* ≤ T/s. Let r be an integer such that 10^r ≤ T/s < 10^{r+1}, and let l be the smallest integer such that, for the given ε, ε ≥ 2·10^{-l}. The algorithm proceeds by dividing the tasks into two sets A and B, depending on whether t_i is greater than 10^{r-l} or not. The execution times of all tasks in A are truncated, retaining only a few digits, i.e. t_i' = ⌊t_i·10^p⌋ where p = min{0, 2l - r}. Since T/s < 10^{r+1} it follows that F*·10^{2l-r} ≤ 10^{2l+1}. All sums ≤ 10^{2l+1} obtainable from A are computed. Tasks from B are then put into the schedules obtained from A, in any order. The best such schedule is the approximate schedule. Formally, we have:

Algorithm APPROX FT
Input: n tasks T_1, ..., T_n with execution times t_1, ..., t_n;
  two processors with speeds 1 and s;
  ε, the desired accuracy.
Output: A schedule S for the tasks such that its finish time F̂ is within ε of the optimal F*, i.e. (F̂ - F*)/F* ≤ ε.
1. [Initialize] T ← Σ_{1≤i≤n} t_i;
   l ← smallest integer such that ε ≥ 2·10^{-l};
   r ← smallest integer such that T/s < 10^{r+1};
   A ← {i | t_i > 10^{r-l}};
   B ← {i | t_i ≤ 10^{r-l}};
   a ← number of tasks in A.
2. [Generate all subset sums of A that are ≤ 10^{2l+1}]
   S^(0) ← {(0, 0)} (the first coordinate is the finish time on processor 1, while the second is an encoding of the jobs to be processed on processor 1);
   p ← min{0, 2l - r};
   for i ← 1, 2, ..., a do
     S^(i) ← S^(i-1) ∪ {S^(i-1) + (⌊t_i·10^p⌋, 2^i)}
     (t_i is the execution time of the ith task in A)
   end
   (Note that the union operation above deletes all tuples with first coordinate greater than 10^{2l+1}. For each distinct first coordinate only one tuple is retained.)
3. [Update the tuples in S^(a) so that the first coordinate represents the actual finish time of P_1, and fill in with tasks from B to bring the finish time near T/(1 + s).]
   for each tuple (t, e) ∈ S^(a) do
     let b_i be the ith digit in the binary representation of e; set t ← Σ t_i·b_i (t now represents the untruncated finish time on processor 1);
     for each task T_j ∈ B do
       if t + t_j ≤ T/(1 + s) then [(t, e) ← (t + t_j, e + 2^j)]
     end;
L1:  if F̂ > max{t, (T - t)/s} then [F̂ ← max{t, (T - t)/s}; ê ← e];
     if t < T/(1 + s) then [let j be the index of the smallest task not in e; t ← t + t_j; e ← e + 2^j; go to L1]
   end
   [Return F̂ with the encoded schedule ê.]
Computing time. While actually implementing the above algorithm, the encoding e should be maintained as a linked list of indices, so that the addition of a new task as in step 2 (S^(i-1) + (⌊t_i·10^p⌋, 2^i)) can be carried out in constant time. With this in mind, one readily verifies that this algorithm is of complexity O(10^{2l}·n). □
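A rough, simplified sketch of APPROX FT follows (our reconstruction; integer task times are assumed, and Python integers replace the linked-list encoding, so this version does not achieve the constant-time tuple additions discussed above).

```python
import math

def approx_ft(t, s, eps):
    n, T = len(t), sum(t)
    l = 0
    while eps < 2 * 10 ** -l:            # smallest l with eps >= 2*10^-l
        l += 1
    r = 0
    while T / s >= 10 ** (r + 1):        # smallest r with T/s < 10^(r+1)
        r += 1
    A = [j for j in range(n) if t[j] > 10 ** (r - l)]
    B = [j for j in range(n) if t[j] <= 10 ** (r - l)]
    p = min(0, 2 * l - r)
    # Step 2: all subset sums of the truncated A-times, capped at 10^(2l+1);
    # one encoding is kept per distinct sum.
    S = {0: 0}
    for j in A:
        tj = math.floor(t[j] * 10 ** p)
        for a, e in list(S.items()):
            if a + tj <= 10 ** (2 * l + 1) and a + tj not in S:
                S[a + tj] = e | 1 << j
    # Step 3: recover the exact P1 time, fill in from B, keep the best finish.
    target = T / (1 + s)
    best, best_e = float("inf"), 0
    for e in S.values():
        t1 = sum(t[j] for j in A if e >> j & 1)
        for j in B:                      # greedy fill from B up to the target
            if t1 + t[j] <= target:
                t1, e = t1 + t[j], e | 1 << j
        while True:                      # the L1 loop of step 3
            if max(t1, (T - t1) / s) < best:
                best, best_e = max(t1, (T - t1) / s), e
            if t1 >= target:
                break
            rest = [j for j in range(n) if not e >> j & 1]
            if not rest:
                break
            j = min(rest, key=lambda j: t[j])   # smallest task not in e
            t1, e = t1 + t[j], e | 1 << j
    return best, best_e
```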
THEOREM 3.1. Algorithm APPROX FT is an ε-approximate algorithm for minimizing the finish time on two uniform processors.

PROOF. The proof is along the lines of Ibarra and Kim [7]. Let tasks i_1, i_2, ..., i_q be scheduled on processor P_1 in some optimal schedule, and let the finish time on P_1 be F_1*. Then F_1* = Σ_{1≤k≤q} t_{i_k} and F* = max{F_1*, (T - F_1*)/s} is the minimal finish time. Let i_1, ..., i_j, j ≤ q, be those tasks with execution time greater than 10^{r-l}. We assume 2l - r < 0, as otherwise the sums formed from A are exact and the only error arises from the fill-in from B; this then implies that (F̂ - F*)/F* ≤ 10^{r-l}/10^r = 10^{-l}. Let t_i' = ⌊t_i·10^{2l-r}⌋.
Then, for t_i > 10^{r-l} we obtain:

t_i'·10^{r-2l} ≤ t_i < (t_i' + 1)·10^{r-2l} ≤ t_i'(1 + 10^{-l})·10^{r-2l}.  (1)

Let α be the contribution to F_1* of the tasks with processing times ≤ 10^{r-l}. Then α = Σ_{j<k≤q} t_{i_k}.

Let γ = Σ_{1≤k≤j} t'_{i_k} = Σ_{1≤k≤j} ⌊t_{i_k}·10^{2l-r}⌋. From (1) we obtain:

γ·10^{r-2l} + α ≤ F_1* ≤ γ(1 + 10^{-l})·10^{r-2l} + α.  (2)

Since all sums obtainable from the t_i' are computed in step 2, it follows that there is a tuple (γ, e) ∈ S^(a). Let the exact value of its first coordinate as computed in step 3 be t_γ. We have from (1):

γ·10^{r-2l} ≤ t_γ ≤ γ(1 + 10^{-l})·10^{r-2l}.  (3)

If t_γ < T/(1 + s) then let β be the amount of time added on to P_1 by tasks from B such that t_γ + β ≤ T/(1 + s). At some time in step 3 we consider a schedule that has a contribution from A of t_γ. If t_γ ≥ T/(1 + s) then the worst case occurs when t_γ takes on its maximum value of γ(1 + 10^{-l})·10^{r-2l} and F_1* its minimum of γ·10^{r-2l} + α. Hence,

(F̂ - F*)/F* ≤ (F̂ - F_1*)/F_1* ≤ (γ·10^{r-3l} - α)/(γ·10^{r-2l} + α) ≤ 10^{-l} ≤ ε.

If t_γ < T/(1 + s) then the worst case occurs when t_γ has its minimum value of γ·10^{r-2l} and F_1* its maximum. At this time β is added on to t_γ. Hence,

(F* - F̂)/F* ≤ (F_1* - F̂)/F* ≤ (γ·10^{-l}·10^{r-2l})/(γ·10^{r-2l}(1 + 10^{-l})) + |α - β|/F*
  ≤ 10^{-l} + 10^{r-l}/10^r  (as F* ≥ 10^r)
  ≤ 2·10^{-l} ≤ ε.  □
3.2. FINISH TIME ON UNRELATED PROCESSORS. The technique here is similar to that used in [12] to obtain an approximate algorithm for the case of two identical processors. Once again we present the method only for the case m = 2. Let γ be the finish time of the schedule obtained by executing each task on the processor on which its time is minimum, and let F* be the finish time of an optimal schedule. Then we have γ/2 ≤ F* ≤ γ. Let r be the largest integer such that F* ≥ 10^r, and let l be the smallest integer with ε ≥ 10^{-l}. Divide the interval [1, 10^{r+1}] into n·10^{l+1} equal parts, each of size ⌊10^{r-l}/n⌋. The approximate algorithm is essentially Algorithm FT of Section 2.1. However, in step 2 the ∪ operation is carried out in such a way that if there are several tuples whose first coordinate lies in the same subinterval of size ⌊10^{r-l}/n⌋, then only the one with the smallest second coordinate is retained. If the first coordinate lies outside the range [1, 10^{r+1}] then the tuple is deleted. Thus each set has at most n·10^{l+1} tuples and the computing time of the algorithm becomes O(n²·10^l). In going from S^(i) to S^(i+1) an error of at most ⌊10^{r-l}/n⌋ is introduced in the first coordinate. Hence the solution obtained in this way from S^(n) may be bigger than the optimal by at most n·⌊10^{r-l}/n⌋ ≤ 10^{r-l}. The fractional error is then ≤ 10^{r-l}/10^r = 10^{-l}.
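A compact sketch of this interval-condensed variant (our reconstruction; since F* is unknown, γ is used to estimate r, per the bound γ/2 ≤ F* ≤ γ):

```python
import math

def approx_ft_unrelated(t1, t2, eps):
    n = len(t1)
    gamma = sum(min(a, b) for a, b in zip(t1, t2))  # gamma/2 <= F* <= gamma
    l = 0
    while eps < 10 ** -l:
        l += 1
    r = max(0, math.floor(math.log10(gamma)))      # estimate of r with F* >= 10^r
    d = max(1, 10 ** (r - l) // n)                  # subinterval (bucket) width
    S = {0: (0, 0, 0)}                              # bucket of F1 -> (F1, F2, e)
    for j in range(n):
        nxt = {}
        for f1, f2, e in S.values():
            for a, b, c in ((f1 + t1[j], f2, e | 1 << j), (f1, f2 + t2[j], e)):
                if a > 10 ** (r + 1):               # outside [1, 10^(r+1)]: delete
                    continue
                k = a // d
                if k not in nxt or b < nxt[k][1]:   # keep smallest F2 per bucket
                    nxt[k] = (a, b, c)
        S = nxt
    f1, f2, e = min(S.values(), key=lambda x: max(x[0], x[1]))
    return max(f1, f2), e
```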
3.3. WMFT ON UNIFORM PROCESSORS. The approximation scheme here is similar to that used in the previous section. A bound on the value of the minimum wmft is obtained by using the relationship [4, p. 79]

[(m + n)/m(n + 1)]·F_1*(n) ≤ F_m*(n) ≤ F_1*(n),  (4)

where F_1*(n) is the minimum wmft when the n tasks are executed on one processor and F_m*(n) is the minimum for m identical processors. For m = 2 we obtain from (4)

½F_1*(n) ≤ F_2*(n) ≤ F_1*(n).  (5)

Let F*(n) be the minimum wmft obtainable on two uniform processors with speeds 1 and s, respectively. Let F_1*(n) be the wmft of the schedule obtained by processing all tasks on P_1 in order of nonincreasing w_i/t_i. Then one obtains the following for a 2-uniform processor system: F_1*(n)/2s ≤ F* ≤ F_1*(n)/s, or equivalently,

F* ∈ [F_1*(n)/2s, F_1*(n)/s].
Let r be the largest integer for which F_1*(n) ≥ 2s·10^r and l the smallest integer for which 10^{-l} ≤ ε. Then divide the interval [1, 10^{r+1}] into n·10^{l+1} equal parts, as in Section 3.2. Now use Algorithm WMFT of Section 2.3, and in step 2 bin sort S^(i) on the first coordinate rather than the second. For all tuples in S^(i) that have a first coordinate in the same subinterval of [1, 10^{r+1}], retain only the one which has the smallest second coordinate as well as the one which has the largest second coordinate. As we shall see in Lemma 3.3.1 below, only one of these two tuples can lead to an optimal schedule. Again, each S^(i) has at most 2n·10^{l+1} tuples, and so the complexity of the approximation algorithm is O(n²·10^l).
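Only the retention rule changes from Section 3.2; a sketch of it (our code, with assumed parameter names) is below. Each bucket of first coordinates keeps the two tuples with extreme second coordinates, which Lemma 3.3.1 shows is sufficient.

```python
def condense(tuples, d, cap):
    # tuples: the (T, t, e) 3-tuples from step 2 of Algorithm WMFT;
    # d: subinterval width; cap: tuples with first coordinate > cap are deleted.
    buckets = {}
    for T, t, e in tuples:
        if T > cap:
            continue
        k = T // d                      # subinterval of the first coordinate
        if k not in buckets:
            buckets[k] = [(T, t, e), (T, t, e)]
        else:
            lo, hi = buckets[k]
            if t < lo[1]:
                buckets[k][0] = (T, t, e)
            if t > hi[1]:
                buckets[k][1] = (T, t, e)
    out = []
    for lo, hi in buckets.values():     # keep smallest and largest t per bucket
        out.append(lo)
        if hi[1] != lo[1]:
            out.append(hi)
    return out
```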
LEMMA 3.3.1. If during the generation of a set S^(p) in step 2 of Algorithm WMFT of Section 2.3 several tuples have the same first coordinate, then only the one with the smallest or the one with the largest second coordinate can lead to an optimal schedule.

PROOF. Let (T, t_1, e_1), (T, t_2, e_2), ..., (T, t_q, e_q) be generated for S^(p). Let u, v be a partitioning of the remaining tasks resulting in an optimal schedule, with u scheduled on P_1 and v on P_2. Let the tasks in u be {(ŵ_1, t̂_1), ..., (ŵ_k, t̂_k)} and those in v be {(w̄_1, t̄_1), ..., (w̄_h, t̄_h)}, with ŵ_i/t̂_i ≥ ŵ_{i+1}/t̂_{i+1} and w̄_i/t̄_i ≥ w̄_{i+1}/t̄_{i+1}. Let us suppose that the optimal schedule is obtained by using this partition to complete the schedule corresponding to the tuple (T, t_i, e_i). Then the wmft of the completed schedule is

wmft = T + ŵ_1(t_i + t̂_1) + ... + ŵ_k(t_i + t̂_1 + ... + t̂_k)
       + (w̄_1/s)(t̄ + t̄_1) + ... + (w̄_h/s)(t̄ + t̄_1 + ... + t̄_h),

where t̄ = (Σ_{1≤j≤p} t_j) - t_i is the time already scheduled on P_2. This may be rewritten as

wmft = T + T_1' + T_2'/s + t_i·w_1 + t̄·w_2,

where w_1 = Σ_{1≤i≤k} ŵ_i, w_2 = Σ_{1≤i≤h} w̄_i/s,
T_1' = ŵ_1·t̂_1 + ŵ_2(t̂_1 + t̂_2) + ... + ŵ_k(t̂_1 + ... + t̂_k),
T_2' = w̄_1·t̄_1 + w̄_2(t̄_1 + t̄_2) + ... + w̄_h(t̄_1 + ... + t̄_h).

Thus,

wmft = T + T_1' + T_2'/s + t_i·w_1 + ((Σ_{1≤j≤p} t_j) - t_i)·w_2
     = T + T_1' + T_2'/s + w_2·Σ_{1≤j≤p} t_j + t_i(w_1 - w_2).

The first four terms are independent of t_i. Therefore if w_1 - w_2 < 0 then wmft is minimized when t_i is maximum; if w_1 - w_2 ≥ 0 then wmft is minimized when t_i is minimum. □
In conclusion we note that for m-uniform processors the approximate algorithms carry over, using either m- or (m + 1)-tuples. In general this gives a computing time which is O((10^l·n²)^{m-1}).

4. Conclusion
When we consider the general question of scheduling we usually consider three basic problems: minimizing the mean flow time, the finish time, and the weighted mean flow time. Algorithms for these problems have been extensively studied when the m processors are identical, but far less is known for unrelated processors (the exception being mean flow time in [3]). Using the general notion of m processors with speeds 1, s_2, ..., s_m, where each s_i is greater than or equal to 1 (called uniform processors), we have given exact algorithms for each of these three problems. For the mean flow time our algorithm has complexity O(n log mn), which is an order of magnitude faster than the one given in [3], though theirs is for the more general case of unrelated processors. As regards the other problems, which are known to be NP-complete, the algorithms we present are of the dynamic programming type and hence may be exponential in the worst case. Therefore we have also derived approximation algorithms for these problems. For two processors these algorithms are very fast, being either linear or quadratic in the number of jobs, and they find solutions which are guaranteed to be close to the optimal solution.

Table I summarizes the approximation algorithms which have recently been devised. Their computing times are a function of n, the number of jobs, and l, where 10^{-l} bounds the fractional error.

TABLE I. COMPUTING TIMES OF APPROXIMATE ALGORITHMS FOR TWO PROCESSORS

Problem                      Identical       Uniform         Nonidentical
Finish time                  10^{2l}·n (a)   10^{2l}·n (b)   10^l·n² (b)
Weighted mean flow time      10^l·n² (a)     10^l·n² (b)

(a) By Sahni in [12].
(b) By Horowitz and Sahni (this paper).

REFERENCES
(Note. Reference [14] is not cited in the text.)
1. BRUNO, J., COFFMAN, E.G. JR., AND SETHI, R. Scheduling independent tasks to reduce mean finishing-time. Comm. ACM 17, 7 (July 1974), 382-387.
2. BRUNO, J., COFFMAN, E.G. JR., AND SETHI, R. Algorithms for minimizing mean flow time. Proc. IFIP Congr. 74, North-Holland Pub. Co., Amsterdam, 1974, pp. 504-510.
3. COFFMAN, E.G., AND SETHI, R. Algorithms minimizing mean flow time: Schedule length properties. Comput. Sci. Dep., Pennsylvania State U., University Park, Pa., 1974.
4. CONWAY, R.W., MAXWELL, W.L., AND MILLER, L.W. Theory of Scheduling. Addison-Wesley, Reading, Mass., 1967.
5. GRAHAM, R.L. Bounds on multiprocessing timing anomalies. SIAM J. Appl. Math. 17, 2 (March 1969), 416-429.
6. HOROWITZ, E., AND SAHNI, S. Computing partitions with applications to the knapsack problem. J. ACM 21, 2 (April 1974), 277-292.
7. IBARRA, O.H., AND KIM, C.E. Fast approximation algorithms for the knapsack and sum of subset problems. Comput. Sci. Tech. Rep. #74-13, U. of Minnesota, Minneapolis, Minn., 1974.
8. KARP, R.M. Reducibility among combinatorial problems. In Complexity of Computer Computations, R.E. Miller and J.W. Thatcher, Eds., Plenum Press, New York, 1972, pp. 85-103.
9. KNUTH, D.E. The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley, Reading, Mass., 1973.
10. KOHLER, W.H., AND STEIGLITZ, K. Characterization and theoretical comparison of branch-and-bound algorithms for permutation problems. J. ACM 21, 1 (Jan. 1974), 140-156.
11. LIU, J.W.S., AND LIU, C.L. Bounds on scheduling algorithms for heterogeneous computing systems. Proc. IFIP Congr. 74, North-Holland Pub. Co., Amsterdam, 1974, pp. 349-353.
12. SAHNI, S. Algorithms for scheduling independent tasks. J. ACM 23, 1 (Jan. 1976), 116-127.
13. SAHNI, S. Computationally related problems. SIAM J. Comput. 3, 4 (Dec. 1974), 262-279.
14. ULLMAN, J.D. Polynomial complete scheduling problems. Proc. 4th Symp. on Operating Systems Principles, Yorktown Heights, N.Y., Oct. 1973, pp. 96-101; also in J. Comput. Syst. Scis. 10, 3 (June 1975), 384-393.

RECEIVED NOVEMBER 1974; REVISED MAY 1975
