Abstract—The task scheduling algorithm, by which tasks with precedence constraints are assigned to suitable processors, is vital for obtaining high performance in a multiprocessor system. In this paper, we first analyze three typical list-based algorithms: the MCP, ETF, and BDCP algorithms. We show that these algorithms cannot guarantee that critical tasks are given priority to be scheduled first. To solve this problem, we propose a new global scheduling algorithm based on the dynamic critical path (GDCP). In the GDCP algorithm, tasks on the critical path are given priority at each scheduling step, and a global search strategy is applied to select a suitable processor for each task, thereby reducing the schedule length. Experimental results show that the proposed algorithm outperforms the other algorithms.

Keywords- scheduling algorithm; dynamic critical path; global selection strategy; schedule length

I. INTRODUCTION

The objective of task scheduling, an important issue in parallel computing, is to assign related tasks to a multiprocessor system so as to obtain the minimum schedule length. An efficient task scheduling method can achieve high performance on such a system. It is well known, however, that multiprocessor scheduling for most precedence-constrained task graphs is an NP-complete problem in its general form [5]. Therefore, task scheduling has been studied extensively, and various heuristics have been proposed in the literature [1][6][7][8][9]. In compile-time scheduling, these heuristics are classified into a variety of categories, such as list scheduling, task duplication, and clustering.

A scheduling algorithm needs to address a number of issues. It should take into account task granularity and arbitrary computation and communication costs. Moreover, to be of practical use, it should have low complexity and be economical. List scheduling [2][3][4] is generally accepted as an attractive approach since it combines low complexity with good results. In this paper, we propose a new static scheduling algorithm called GDCP (global scheduling algorithm based on the dynamic critical path). It schedules arbitrary precedence-constrained task graphs onto a multiprocessor system with an unlimited number of fully connected identical processors. The GDCP algorithm overcomes the drawbacks of past approaches and obtains better performance.

The remainder of this paper is organized as follows. In the next section, we define the task scheduling problem and describe three typical scheduling algorithms. Our GDCP algorithm is proposed in Section 3. In Section 4, we use an example to illustrate these algorithms and give experimental results. Section 5 provides the concluding remarks.

II. BACKGROUND

A. Task Scheduling Problem

A scheduling problem consists of an application, a target computing environment, and performance criteria for scheduling.

A parallel program is represented as a directed acyclic graph G=(V,E), where V is the set of nodes and E is the set of edges. A node in the graph represents a task, which is a set of instructions that can be executed on any of the available processors. The weight of node vi, denoted by wi, is its computation cost. The edges between the nodes represent the dependencies between the tasks. For two nodes vi,vj∈V, the edge eij∈E denotes the precedence constraint that task vj must not start its execution until task vi completes. The weight of edge eij, denoted by cij, represents the communication cost. The source node and the destination node of an edge are called the parent node and the child node, respectively. In a task graph, a node with no parent is called an entry node, and a node with no child is called an exit node. A node that has received all the messages from its parent nodes and is waiting to be scheduled to a processor is called a ready node.

A critical path (CP) of a task graph is a path from an entry node to an exit node for which the sum of the computation and communication costs is maximum.

The target system is a multiprocessor system with an unlimited number of fully connected identical processors. The processors exchange data by message passing, and every processor can execute a task and communicate with other processors at the same time. We assume that all inter-processor communications are performed without contention and that the communication overhead between two tasks scheduled onto the same processor is zero.

B. Related Work

As is well known, most of the reported scheduling algorithms are based on the list scheduling approach. The basic idea of list scheduling is to assign a priority to each task of the graph, place the tasks in a list ordered by priority, and then schedule the nodes of the list to processors. The following steps are repeated until all nodes have been scheduled to the multiprocessor system:
1) Select the node with the highest priority from the scheduling list;
2) Select a suitable processor for that node;
3) Schedule the node to the selected processor.
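The generic list-scheduling loop can be sketched as follows. The five-task graph, the cost values, the bottom-level priority, and the earliest-start processor choice are all illustrative assumptions for this sketch; they are not taken from the paper and do not represent the GDCP strategy introduced later.

```python
from collections import defaultdict

# Illustrative task graph (hypothetical): node -> computation cost,
# edge (u, v) -> communication cost incurred only across processors.
w = {"v1": 2, "v2": 3, "v3": 3, "v4": 4, "v5": 5}
edges = {("v1", "v2"): 4, ("v1", "v3"): 1, ("v2", "v4"): 1,
         ("v3", "v4"): 1, ("v4", "v5"): 1}

children, parents = defaultdict(list), defaultdict(list)
for (u, v) in edges:
    children[u].append(v)
    parents[v].append(u)

def blevel(v):
    """Priority = bottom level: the longest path (computation plus
    communication) from v to an exit node, a common static priority."""
    if not children[v]:
        return w[v]
    return w[v] + max(edges[(v, u)] + blevel(u) for u in children[v])

NUM_PROCS = 2
proc_free = [0.0] * NUM_PROCS      # time at which each processor becomes free
placed = {}                        # node -> (processor, start, finish)

unscheduled = set(w)
while unscheduled:
    # Step 1: among ready nodes (all parents placed), take the highest
    # priority; ties are broken by node name for determinism.
    ready = [v for v in unscheduled if all(u in placed for u in parents[v])]
    node = max(ready, key=lambda v: (blevel(v), v))
    # Step 2: pick the processor giving the earliest start time for the node.
    best = None
    for p in range(NUM_PROCS):
        # Data from a parent on another processor arrives after its edge delay.
        data_ready = max((placed[u][2] + (edges[(u, node)] if placed[u][0] != p else 0)
                          for u in parents[node]), default=0.0)
        start = max(proc_free[p], data_ready)
        if best is None or start < best[1]:
            best = (p, start)
    # Step 3: schedule the node on the chosen processor.
    p, start = best
    placed[node] = (p, start, start + w[node])
    proc_free[p] = start + w[node]
    unscheduled.remove(node)

makespan = max(f for (_, _, f) in placed.values())
print(placed)
print("schedule length:", makespan)
```

Because the edge v1→v2 is expensive relative to the node weights, the earliest-start rule keeps every task on one processor here; this greedy, per-node choice is exactly the behavior the processor-selection discussion below revisits.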
According to the critical path algorithm in graph theory [10], if epst(vi) is equal to lpst(vi), then node vi must be a critical node; that is, node vi is on the CP. Thus, our algorithm first assigns a higher priority to nodes whose epst and lpst are equal than to the other nodes. Then, among nodes with the same priority, the smaller the epst, the higher the priority. Finally, the tasks are ordered by decreasing priority. In this way, the most important nodes, those on the CP, are considered first at each scheduling step, which effectively reduces the final schedule length.

C. Processor Selection

After choosing a node to schedule, we need a method to select a suitable processor for it. The classic scheduling algorithms select the processor that allows the minimum start time for the node. This method may produce a locally optimized result, but it usually gives poor results when communication costs are large. For example, Figure 1(a) shows a task graph; Figure 1(b) gives the schedule produced by the above-mentioned selection method, which assigns each node to the processor providing its earliest start time, and the resulting schedule length is 4. Figure 1(c) shows, however, that when all nodes are scheduled to the same processor, the schedule length is only 3.

D. The Proposed GDCP Algorithm

The pseudo-code of the proposed algorithm is shown in Algorithm 1. The GDCP algorithm has a time complexity of O(n²), where n is the number of tasks.

Algorithm 1. The pseudo-code for the proposed algorithm
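Since the pseudo-code figure is not reproduced in this text, the following sketch illustrates the two ingredients described above: CP-first priorities derived from epst/lpst, and a global processor choice that minimizes the current schedule length rather than the individual node's start time. The four-task graph, the cost values, and all helper names are illustrative assumptions, not the paper's exact algorithm or example.

```python
from collections import defaultdict

# Illustrative task graph (hypothetical, not the paper's Figure 2).
w = {"a": 2, "b": 3, "c": 1, "d": 2}
comm = {("a", "b"): 1, ("a", "c"): 5, ("b", "d"): 2, ("c", "d"): 1}

children, parents = defaultdict(list), defaultdict(list)
for (u, v) in comm:
    children[u].append(v)
    parents[v].append(u)

def epst(v):
    # Earliest possible start time, ignoring processor assignment.
    return max((epst(u) + w[u] + comm[(u, v)] for u in parents[v]), default=0)

cp_len = max(epst(v) + w[v] for v in w)  # critical-path length

def lpst(v):
    # Latest possible start time that still meets the CP length.
    if not children[v]:
        return cp_len - w[v]
    return min(lpst(u) - comm[(v, u)] for u in children[v]) - w[v]

critical = {v for v in w if epst(v) == lpst(v)}  # epst == lpst => on the CP

NUM_PROCS = 2
proc_free = [0.0] * NUM_PROCS
placed = {}  # node -> (processor, start, finish)

def start_on(v, p):
    # Earliest start of v on processor p given the already-placed parents;
    # cross-processor parents add their edge's communication delay.
    ready = max((placed[u][2] + (comm[(u, v)] if placed[u][0] != p else 0)
                 for u in parents[v]), default=0.0)
    return max(proc_free[p], ready)

todo = set(w)
while todo:
    ready = [v for v in todo if all(u in placed for u in parents[v])]
    # CP nodes first; ties broken by smaller epst, then by name.
    node = min(ready, key=lambda v: (v not in critical, epst(v), v))
    # Global selection: choose the processor that minimizes the resulting
    # schedule length after placing the node, not merely its start time.
    def resulting_makespan(p):
        finish = start_on(node, p) + w[node]
        return max([finish] + [f for (_, _, f) in placed.values()])
    p = min(range(NUM_PROCS), key=lambda q: (resulting_makespan(q), start_on(node, q)))
    s = start_on(node, p)
    placed[node] = (p, s, s + w[node])
    proc_free[p] = s + w[node]
    todo.remove(node)

print("critical nodes:", critical)
print("schedule length:", max(f for (_, _, f) in placed.values()))
```

In this sketch the critical nodes a, c, d are scheduled before b, and the global criterion keeps the heavy edge a→c on one processor, so the final schedule length beats the CP length (which counts the cross-processor communication cost).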
E. An Example
The schedule generated by the GDCP algorithm is shown in Figure 2(b). The GDCP algorithm schedules the task graph in the order v1, v3, v7, v4, v9, v12, v5, v10, v14, v16, v6, v11, v15, v17, v18, v2, v8, v13, and the schedule length is 450 time units. In the first step, only node v1 has equal values of epst and lpst among all ready nodes, so it is selected and scheduled to processor PE0, which shortens the current schedule length of the task graph. After several nodes (v1, v3, v7, v4, v9, v12) have been scheduled, v5 is the only ready node on the CP, and it is scheduled to processor PE1, where the current schedule length can be reduced. Finally, the remaining nodes v2, v8, v13, which have lower priorities, are sorted by their epst values and scheduled to processors in sequence. This completes the schedule of the task graph generated by the proposed algorithm.

IV. PERFORMANCE EVALUATION

Random graphs are generally used to compare scheduling algorithms. We implemented a graph generator based on the method of [5]. A random DAG is generated as follows: the computation cost of each node in the graph is randomly selected from a uniform distribution with mean 40. Beginning with the first node, a random number indicating the number of children is chosen from a uniform distribution with mean v/10. The communication cost of each edge is also randomly selected from a uniform distribution, with mean equal to 40 times the specified value of the CCR (communication-to-computation ratio). We generated a batch of random task graphs consisting of subsets in which the number of nodes varies from 20 to 160 in increments of 20; each subset consists of graphs with different CCRs (0.1, 0.2, 1.0, 5.0, and 10.0). Finally, we use the normalized schedule length (NSL) [5] to compare the scheduling algorithms.

Figure 3 shows the results of the different scheduling algorithms in the above-mentioned test environment. From Figure 3(a), it can be seen that the larger the value of the CCR, the better the result the proposed algorithm obtains. As can be observed from Figure 3(b), the NSLs of all algorithms increase slightly with the number of nodes, but the GDCP algorithm performs better than the others in every condition.

V. CONCLUSIONS

After analyzing several typical list scheduling algorithms, this paper presents a global scheduling algorithm based on the dynamic critical path. The algorithm schedules the nodes on the CP preferentially and reduces the schedule length of the task graph as far as possible at each scheduling step, so as to obtain the shortest final schedule length. Experimental results show that the proposed algorithm works well on various random graphs and achieves better performance than the other algorithms.

ACKNOWLEDGMENT

I really appreciate my tutor, Professor Yang, whose help and patience made this paper get off the ground and come to a close smoothly.

Last but not least, thanks are given to my roommates, who have shared my worries, my frustrations, and, hopefully, my ultimate happiness in eventually finishing this paper.

REFERENCES
[1] T. Hagras and J. Janecek, "Static vs. Dynamic List-Scheduling Performance Comparison," Acta Polytechnica, vol. 43, January 2003.
[2] J. J. Hwang, Y. C. Chow, F. D. Anger, and C. Y. Lee, "Scheduling Precedence Graphs in Systems with Interprocessor Communication Times," SIAM Journal on Computing, vol. 18, pp. 244-257, June 1989.
[3] M. Y. Wang and D. D. Gajski, "Hypertool: A Programming Aid for Message-Passing Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 1, pp. 330-343, July 1990.
[4] Wei Shi and Wei-Min Zheng, "The Balanced Dynamic Critical Path Scheduling Algorithm of Dependent Task Graphs," Chinese Journal of Computers, vol. 24, pp. 991-997, September 2001.
[5] Y. K. Kwok and I. Ahmad, "Benchmarking and Comparison of the Task Graph Scheduling Algorithms," Journal of Parallel and Distributed Computing, vol. 59, pp. 381-422, December 1999.
[6] J. Barbosa and A. P. Monteiro, "A List Scheduling Algorithm for Scheduling Multi-user Jobs on Clusters," Lecture Notes in Computer Science, vol. 5336, pp. 123-136, December 2008.
[7] I. Ahmad and Y. K. Kwok, "On Exploiting Task Duplication in Parallel Program Scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 9, pp. 872-892, September 1998.
[8] H. Topcuoglu, S. Hariri, and M. Y. Wu, "Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, March 2002.
[9] Y. K. Kwok and I. Ahmad, "Efficient Scheduling of Arbitrary Task Graphs to Multiprocessors Using a Parallel Genetic Algorithm," Journal of Parallel and Distributed Computing, vol. 47, no. 1, pp. 58-77, November 1997.
[10] M. A. Weiss, Data Structures and Algorithm Analysis in C. Post & Telecom Press, 2005.