
JOURNAL OF COMPUTING, VOLUME 3, ISSUE 9, SEPTEMBER 2011, ISSN 2151-9617 HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING WWW.JOURNALOFCOMPUTING.ORG

A Partitioning-based method according to budget distribution for task scheduling in Computational Grids
Mostafa Ghobaei Arani, Sam Jabbehdari and Nasser Modiri
Mostafa Ghobaei Arani is with the Computer Engineering Department, Kashan Branch, Islamic Azad University, Kashan, Iran. Sam Jabbehdari is with the Computer Engineering Department, North Tehran Branch, Islamic Azad University, Tehran, Iran. Nasser Modiri is with the Computer Engineering Department, Zanjan Branch, Islamic Azad University, Zanjan, Iran.

2011 Journal of Computing Press, NY, USA, ISSN 2151-9617

Abstract: The goal of computational grids is to aggregate heterogeneous distributed resources for solving large-scale problems in science, engineering and commerce. The dynamism and heterogeneity of grid resources, together with the varied demands of grid applications, make grid scheduling complex. Effective resource scheduling is therefore necessary to achieve high performance in grid systems. Most Quality of Service (QoS) constraint based workflow scheduling algorithms are based on either budget or deadline constraints. In this paper, we solve the budget constrained scheduling problem by dividing the overall problem into several partitions and distributing the budget over them. After budget distribution, we find a locally optimal schedule for each partition based on its sub-budget. We evaluate the proposed algorithm against the Back-tracking and BTO scheduling algorithms in terms of execution time and execution cost. Simulation results show that the proposed algorithm provides better performance at low budget levels.

Index Terms: Computational Grids, Workflow Scheduling, QoS Constraints, Partition.

1 INTRODUCTION

A computational Grid is a software and hardware infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities [1]. In recent years, Grid technology has provided the basis for a service-oriented paradigm that enables a new way of service provisioning based on utility computing models. Typically, service providers charge higher prices for higher QoS. Users are charged for consuming services based on their usage and the QoS level required. Workflow scheduling algorithms must be able to analyze users' QoS requirements and map workflows onto suitable resources so that the workflow execution completes within the users' QoS constraints [2]. Processing time and execution cost are two typical QoS constraints for workflow execution on utility Grids. Let B be the cost constraint (budget) and D be the time constraint (deadline) specified by a user for workflow execution. The budget constrained scheduling problem is to map every task Ti onto a suitable service so as to minimize the execution time of the workflow and complete it with a total cost less than B. Similarly, the deadline constrained scheduling problem is to map every Ti onto a suitable service so as to minimize the execution cost of the workflow and complete it with a total time less than D [3].

Several strategies have been proposed to address scheduling problems based on users' deadline and budget constraints. Buyya Time Optimization (BTO) and Buyya Cost Optimization (BCO) are derived from the cost and deadline optimization algorithms in Nimrod-G [4,5,6], which was initially designed for scheduling independent tasks on Grids. BTO solves the time optimization problem within a budget. It sorts services by their processing times and assigns as many tasks as possible to the fastest services without exceeding the budget. BCO solves the cost optimization problem within a deadline. It sorts services by their processing prices and assigns as many tasks as possible to the cheapest services without exceeding the deadline. More recently, iterative heuristics such as Back-tracking [7], and LOSS and GAIN [8], have been proposed to solve constrained optimization problems. They iteratively amend a schedule optimized for one factor to satisfy the other factor, in a way that gains maximum benefit or incurs minimum loss. However, they need many iterations to modify and recompute the current schedule until it meets the constraint, and thus result in a large scheduling computation time.

In this paper, we present a budget constrained scheduling algorithm that follows the divide-and-conquer technique. It divides the workflow tasks into several partitions, distributes the overall budget over the partitions, and then finds a locally optimal schedule for each partition based on its sub-budget. We then compare the proposed algorithm with the BTO and Back-tracking algorithms in terms of execution cost and execution time.

The remainder of the paper is organized as follows: Section 2 provides a description of the workflow scheduling problem. We describe the proposed scheduling algorithm in Section 3. Experimental details and simulation results comparing the proposed algorithm with the Back-tracking and BTO algorithms are presented in Section 4. Finally, we conclude the paper with directions for further work in Section 5.

2 PROBLEM DESCRIPTION
Before presenting the proposed algorithm, we describe the workflow scheduling problem more precisely. A workflow application can be modeled as a Directed Acyclic Graph (DAG). Let $\Gamma$ be the finite set of tasks $T_i$ ($1 \le i \le n$). Let $\Lambda$ be the set of directed edges. Each edge is denoted by $(T_i, T_j)$, where $T_i$ is called an immediate parent task of $T_j$, and $T_j$ the immediate child task of $T_i$. There is a transmission time and cost associated with each edge. We assume that a child task cannot be executed until all of its parent tasks are completed. Then, the workflow application can be described as a tuple $(\Gamma, \Lambda)$. In a workflow graph, a task which does not have any parent task is called an entry task, denoted as $T_{entry}$, and a task which does not have any child task is called an exit task, denoted as $T_{exit}$. In this paper, we assume there is only one $T_{entry}$ and one $T_{exit}$ in the workflow graph. If there are multiple entry tasks or exit tasks in a workflow, we can connect them to a zero-cost pseudo entry or exit task [8].

The execution requirements for tasks in a workflow could be heterogeneous, so a service may be able to execute only some of the workflow tasks. Let m be the total number of services available. There is a set of services $S_i^j$ capable of executing the task $T_i$, but only one service can be assigned for the execution of a task.

$$ \{\, S_i^j \mid 1 \le i \le n;\ 1 \le j \le m_i,\ m_i \le m \,\} \qquad (1) $$

Services have varied processing capability delivered at different prices. We denote $t_i^j$ as the sum of the processing time and data transmission time, and $c_i^j$ as the sum of the service price and data transmission cost for processing $T_i$ on service $S_i^j$ [3].

3 PROPOSED ALGORITHM
In this section, we extend the procedure of the deadline distribution scheduling algorithm introduced in [9] to budget constrained scheduling. We solve the scheduling problem by following the divide-and-conquer technique in three phases:
Phase 1: Partition the workflow tasks into partitions.
Phase 2: Distribute the overall budget over the partitions.
Phase 3: Make advance reservations based on the locally optimal solution of each partition.
We describe the details of phases 1-3 in the following subsections.

3.1 Workflow Tasks Partitioning Phase
Workflow tasks are categorized as either synchronization tasks or simple tasks. A synchronization task is defined as a task which has more than one parent or child task. For example, in Figure 1, T1, T10 and T14 are synchronization tasks. Other tasks, which have only one parent task and one child task, are simple tasks. In the example of Figure 1, T2 to T9 and T11 to T13 are simple tasks.

Fig. 1. Before partitioning [9]

A simple partition is a set of interdependent simple tasks that are executed sequentially between two synchronization tasks. For example, the simple partitions in Figure 2 are {T2, T3, T4}, {T5, T6}, {T7}, {T8, T9}, {T11} and {T12, T13}.

Fig. 2. After partitioning [9]
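The partitioning rule of Section 3.1 can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `partition_workflow`, the edge-list input format, and the five-task workflow are all hypothetical, and the sketch assumes that any task that is not a synchronization task (including an entry or exit task with a single neighbour) is treated as simple.

```python
from collections import defaultdict


def partition_workflow(edges):
    """Split a workflow DAG into synchronization tasks and simple partitions."""
    parents, children = defaultdict(list), defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
        parents[child].append(parent)
    # Tasks in order of first appearance in the edge list.
    tasks = list(dict.fromkeys(t for edge in edges for t in edge))

    # Synchronization task: more than one parent or more than one child.
    sync = [t for t in tasks if len(parents[t]) > 1 or len(children[t]) > 1]
    simple = set(tasks) - set(sync)

    partitions = []
    for task in tasks:
        if task not in simple:
            continue
        # Start a chain only at a simple task whose parent is not simple.
        if parents[task] and parents[task][0] in simple:
            continue
        chain = [task]
        # Follow the single child while it is also a simple task.
        while children[chain[-1]] and children[chain[-1]][0] in simple:
            chain.append(children[chain[-1]][0])
        partitions.append(chain)
    return sync, partitions


# Hypothetical workflow (assumption): T1 fans out to two branches that join at T5.
EDGES = [("T1", "T2"), ("T2", "T3"), ("T3", "T5"), ("T1", "T4"), ("T4", "T5")]
SYNC_TASKS, SIMPLE_PARTITIONS = partition_workflow(EDGES)
```

In this hypothetical workflow, T1 and T5 come out as synchronization tasks, while the chains T2, T3 and T4 form two simple partitions, mirroring how the simple tasks of Figure 1 are grouped in Figure 2.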

We partition the workflow tasks into independent partitions $P_i$ ($1 \le i \le k$) and synchronization tasks $Y_i$ ($1 \le i \le l$), where k and l are the total numbers of partitions and synchronization tasks in the workflow, respectively. Let V be the set of nodes in a DAG corresponding to the set of partitions $V_i$ ($1 \le i \le k + l$). Let E be the set of directed edges of the form $(V_i, V_j)$, where $V_i$ is a parent partition of $V_j$ and $V_j$ is a child partition of $V_i$. Then, the partition graph for the budget constrained problem is denoted as G(V, E, B).

3.2 Budget Distribution Phase
After workflow task partitioning, we distribute the overall budget among the partitions $V_i$ in G. The sub-budget assigned to $V_i$, denoted $bdg[V_i]$, is a share of the overall budget B. We assign budgets according to the following policies:

1. The sub-budget of any partition ($bdg[V_i]$) should not be more than the expected execution cost of that partition ($eec[V_i]$).

$$ bdg[V_i] \le eec[V_i] \qquad (2) $$

2. The sum of the sub-budgets assigned to all partitions is equal to the overall budget.

$$ \sum_{V_i \in G} bdg[V_i] = B \qquad (3) $$

3. The overall budget is distributed over the partitions in proportion to the average execution cost (processing cost and data transmission cost) of their tasks.

When distributing the overall budget, we assign a sub-budget to each partition based on the average execution cost and the data transmission cost of the tasks in that partition. In a workflow, tasks may require various kinds of services with different prices, and their computational workload and required I/O data transmission vary between services. Therefore, the portion of the overall budget each task obtains should be based on the proportion of its expense requirements. Since there are multiple possible services and data links for executing a task, their average cost values are used for measuring the expense requirements. The expected budget for a task $T_i$ is its average cost over the services able to execute it, and the expected execution cost of a partition V is the sum over its tasks:

$$ eec[V] = \sum_{T_i \in V} \operatorname{avg}_{1 \le j \le |S_i|} c_i^j \qquad (4) $$

where:

$$ \operatorname{avg}_{1 \le j \le |S_i|} c_i^j = \frac{\sum_{1 \le j \le |S_i|} c_i^j}{|S_i|} \qquad (5) $$

3.3 Planning for Scheduling Phase
The planning phase produces an optimized schedule for advance reservation and run-time execution. The scheduler makes its decision by selecting the fastest services that can execute each task within the assigned sub-budget. After budget distribution, we can find a locally optimal schedule for every partition according to the sub-budget of that partition. If each local schedule guarantees that its tasks can be completed within the sub-budget, the whole workflow execution will be completed within the overall budget. There are two types of partitions: Simple Partitions and Synchronization Partitions. A Simple Partition consists of one or more simple tasks, and a Synchronization Partition includes one synchronization task. The scheduling solutions for each type of partition are described as follows:

3.3.1 Synchronization Partition Scheduling
For Synchronization Partition Scheduling, the scheduler considers only one synchronization task. The optimal decision is to select the fastest service that can process the task within the assigned sub-budget. The objective function for scheduling one synchronization task $Y_i$ is:

$$ \min t_i^j, \quad \text{where } 1 \le j \le m_i \text{ and } c_i^j \le eec(Y_i) \qquad (6) $$

3.3.2 Simple Partition Scheduling
If the partition contains only one simple task, Simple Partition Scheduling is the same as Synchronization Partition Scheduling. If the partition contains multiple tasks, the scheduler assigns one service to each task, and each task executes after its parent task completes. The optimal decision is to minimize the total execution time of the partition while completing its tasks within the assigned sub-budget. The objective function for scheduling partition $P_j$ is as follows:

$$ \min \sum_{T_i \in P_j} t_i^k, \quad \text{where } 1 \le k \le m_i \text{ and } \sum_{T_i \in P_j} c_i^k \le eec(P_j) \qquad (7) $$
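Phases 2 and 3 can be sketched together in a few lines, under stated assumptions. The proportional split follows policies 2 and 3 with the expected costs of Eqs. (4) and (5); the local schedule is found by exhaustive search, which is a stand-in chosen for this illustration because the paper does not state how the local optimum is computed. The sketch bounds each partition by its assigned sub-budget, as the prose describes, whereas Eqs. (6) and (7) write the bound as eec. All names (`avg_cost`, `distribute_budget`, `schedule_partition`) and all time and cost values are hypothetical.

```python
from itertools import product


def avg_cost(services):
    """Eq. (5): mean cost over the services able to run one task."""
    return sum(cost for _, cost in services) / len(services)


def distribute_budget(partitions, services, budget):
    """Policies 2 and 3: split the budget in proportion to eec (Eq. 4)."""
    eec = {name: sum(avg_cost(services[t]) for t in tasks)
           for name, tasks in partitions.items()}
    total = sum(eec.values())
    # Policy 1 (bdg <= eec) holds whenever budget <= total.
    return {name: budget * value / total for name, value in eec.items()}


def schedule_partition(tasks, services, sub_budget):
    """Fastest service choice whose total cost fits the sub-budget.

    Exhaustive search (an assumption of this sketch). Returns
    (total_time, total_cost, chosen_service_indices) or None.
    """
    best = None
    for choice in product(*(range(len(services[t])) for t in tasks)):
        time = sum(services[t][j][0] for t, j in zip(tasks, choice))
        cost = sum(services[t][j][1] for t, j in zip(tasks, choice))
        if cost <= sub_budget and (best is None or time < best[0]):
            best = (time, cost, choice)
    return best


# Hypothetical data: each task maps to a list of (time, cost) pairs,
# one pair per service able to execute it.
SERVICES = {
    "T1": [(10, 40), (20, 20)],
    "T2": [(10, 30), (20, 18), (30, 12)],
    "T3": [(5, 36), (10, 24), (20, 12)],
    "T4": [(8, 14), (16, 6)],
}
PARTITIONS = {"Y1": ["T1"], "P1": ["T2", "T3"], "P2": ["T4"]}

SUB_BUDGETS = distribute_budget(PARTITIONS, SERVICES, budget=63)
SCHEDULE = {name: schedule_partition(tasks, SERVICES, SUB_BUDGETS[name])
            for name, tasks in PARTITIONS.items()}
TOTAL_COST = sum(cost for _, cost, _ in SCHEDULE.values())
```

Because every partition stays within its own sub-budget and the sub-budgets sum to B, the total cost of the combined schedule cannot exceed B, which is the argument made in Section 3.3. Exhaustive search grows exponentially with partition size, so it is only suitable for short chains like the ones in this example.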

4 RESULT EVALUATION
We use GridSim [10], [11] to simulate a Grid testbed for our experiments. As execution requirements for tasks in scientific workflows are heterogeneous, we use service types to represent different types of service. Every task in our experimental workflow applications requires a certain type of service. We model 4 types of services with different prices to simulate a heterogeneous environment, each of which is supported by 10 different service providers with different processing capabilities. The topology of the Grid system is specified so that the services are connected to each other. The available network bandwidths between services are 100 Mbps, 200 Mbps, 512 Mbps and 1024 Mbps. In the experiments, the cost that a user needs to pay for workflow execution comprises two parts: processing cost and data transmission cost. Table 1 shows an example of processing cost, while Table 2 shows an example of data transmission cost. The length of each task is measured in MI (Million Instructions), and we use MIPS (Million Instructions per Second) to represent the processing capability of services. As the tables show, the processing cost and transmission cost are inversely proportional to the processing time and transmission time, respectively.

TABLE 1
SERVICE SPEED AND CORRESPONDING PRICE.
Service ID    Processing Time (MIPS)    Cost (G$/sec)
1             1200                      300
2             600                       600
3             400                       900
4             300                       1200

TABLE 2
TRANSMISSION BANDWIDTH AND CORRESPONDING PRICE.
Bandwidth (Mbps)    Cost (G$/sec)
100                 1
200                 2
512                 5.12
1024                10.24

Because applications may have different structures, we use a workflow structure that is common in scientific applications for the simulation experiments, as shown in Figure 3.

Fig. 3. A part of a workflow in an application with parallel structure [3]

We compared the proposed algorithm with the BTO and Back-tracking algorithms on the workflow shown in Figure 3. The BTO algorithm sorts services by their processing time and assigns as many tasks as possible to the fastest services without exceeding the budget constraint, while the Back-tracking algorithm assigns ready tasks to the fastest computing resources. If the execution cost exceeds the budget, it back-tracks to the previous step, removes the fastest computing resource from its resource list, and reassigns tasks with the reduced resource set.

We use two metrics, execution time and execution cost, to evaluate the scheduling algorithms. Execution cost indicates whether the schedule produced by a scheduling algorithm can be executed at a cost lower than the specified budget, and execution time indicates how much time is consumed in executing the workflow tasks on the testbed. Figure 4 and Figure 5 compare the execution cost and execution time of the budget constrained scheduling algorithms BTO, Back-tracking and the proposed algorithm for user budgets of 500, 1000, 1500, 2000 and 2500 Grid dollars (G$).

As Figure 4 shows, the BTO algorithm does not meet the user-specified budget in every case: for budgets of 500, 1000 and 1500 G$, its execution cost is higher than the budget. The other two algorithms meet the budget constraint in all cases; in other words, they complete the execution at a cost lower than the user-specified budget. In addition, at lower budgets the proposed algorithm meets the budget constraint at a lower cost than the Back-tracking algorithm. Therefore, in terms of execution cost, the proposed algorithm is better than Back-tracking, and Back-tracking is better than BTO.
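As a rough illustration of the Back-tracking baseline described in this section, the following sketch schedules a chain of tasks, always trying the fastest remaining resource first. It is a simplification made for this example, not the algorithm of [7]: tasks are assumed to run one after another, the cost of a task is assumed to be its length divided by the resource speed times a per-second price, and the function name `backtracking_schedule` and the resource values are hypothetical.

```python
def backtracking_schedule(task_lengths, resources, budget):
    """Assign a chain of tasks to resources; returns (assignment, cost) or None.

    task_lengths: task sizes in MI, in execution order.
    resources: list of (speed_mips, price_per_second) tuples.
    """
    pool = sorted(resources, key=lambda r: r[0], reverse=True)  # fastest first
    assignment = []
    spent = [0.0]  # spent[k] is the total cost after scheduling k tasks
    while len(assignment) < len(task_lengths):
        if not pool:
            return None  # no remaining resource set fits the budget
        speed, price = pool[0]
        cost = task_lengths[len(assignment)] / speed * price
        if spent[-1] + cost <= budget:
            assignment.append(pool[0])
            spent.append(spent[-1] + cost)
        else:
            pool.pop(0)  # remove the fastest resource from the list
            if assignment:  # back-track to the previous step
                assignment.pop()
                spent.pop()
    return assignment, spent[-1]


# Hypothetical resources (assumption): faster resources cost more per second.
RESOURCES = [(100, 1), (400, 9), (200, 3)]
TASK_LENGTHS = [4000, 2000, 2000]
RESULT = backtracking_schedule(TASK_LENGTHS, RESOURCES, budget=160)
```

With these hypothetical values, the first task stays on the fastest resource, while the later tasks are moved to a slower and cheaper one after a single back-tracking step. Each back-tracking step discards work already done, which is the source of the scheduling overhead that the introduction attributes to iterative heuristics.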


Fig. 4. Execution cost of three algorithms (execution cost (G$) versus user budget (G$) for the proposed algorithm, Back-tracking and BTO)

As Figure 5 shows, for all three algorithms the execution time decreases as the budget increases. This is expected: with a larger budget, we can choose faster services, which reduces the execution time. The proposed algorithm has a lower execution time than the other two algorithms at low budgets. As the budget increases, the execution times of the three algorithms become closer to each other.

Fig. 5. Execution time of three algorithms (execution time (min) versus user budget (G$) for the proposed algorithm, Back-tracking and BTO)

Finally, Figure 4 and Figure 5 show that the proposed algorithm behaves better than the other two algorithms in terms of execution cost and time. The Back-tracking algorithm behaves better than the BTO algorithm in terms of execution cost, but BTO performs better than Back-tracking in terms of execution time. This is perhaps because BTO uses the fastest, but more expensive, services to solve the budget constrained problem. The proposed algorithm also performs better than the other two algorithms at lower budgets. Therefore, the proposed algorithm will be more effective than the other two algorithms, especially when users have limited budgets.

5 CONCLUSION
Grid computing has recently progressed towards a service-oriented paradigm, which defines a new way of service provisioning based on utility computing models. Within utility Grids, each resource is represented as a service for which consumers can negotiate their usage and quality of service. Workflow scheduling is one of the key issues in the management of workflow execution. Scheduling allocates suitable resources to workflow tasks so that the execution can be completed while satisfying objective functions specified by users. In this paper, we presented a budget constrained scheduling algorithm that follows the divide-and-conquer technique to partition workflow tasks into smaller partitions. The simulation results indicate that the proposed scheduling algorithm behaves better than the BTO and Back-tracking approaches in terms of execution cost and time. In terms of execution cost, the proposed algorithm meets the budget constraint at lower budgets while consuming less than the Back-tracking algorithm. At low budgets it also has a lower execution time than the other two algorithms. Therefore, the proposed algorithm will be more effective than the other two algorithms, especially when users have limited budgets.

REFERENCES
[1] I. Foster, C. Kesselman, "The Grid: Blueprint for new computing infrastructure", Elsevier Inc, 2004.
[2] Jia Yu and Rajkumar Buyya, "A Budget Constrained Scheduling of Workflow Applications on Utility Grids using Genetic Algorithms", Workshop on Workflows in Support of Large-Scale Science, HPDC 2006, Paris, France, June 19-23, 2006.
[3] Jia Yu, Rajkumar Buyya, "QoS-based Scheduling of Workflow on Global Grids", Ph.D. Thesis, Department of Computer Science and Software Engineering, The University of Melbourne, Australia, Oct. 2007.
[4] D. Abramson, R. Buyya, and J. Giddy, "A Computational Economy for Grid Computing and its Implementation in the Nimrod-G Resource Broker", Future Generation Computer Systems (FGCS), 18(8): 1061-1074, Elsevier Science, The Netherlands, October 2002.
[5] R. Buyya, "Economic-based Distributed Resource Management and Scheduling for Grid Computing", Ph.D. Thesis, School of Computer Science and Software Engineering, Monash University, Melbourne, Australia, April 2002.
[6] Y. Mahdavifar and M. Meybodi, "Scheduling Algorithms for Time Optimization in Economic Computational Grid", CEE, 2007.
[7] D. A. Menasce and E. Casalicchio, "A Framework for Resource Allocation in Grid Computing", Proceedings of the IEEE Computer Society's 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems (MASCOTS 2004), Volendam, Netherlands, October 5-7, 2004.
[8] R. Sakellariou, H. Zhao, E. Tsiakkouri and M. D. Dikaiakos, "Scheduling Workflows with Budget Constraints", in S. Gorlatch, M. Danelutto (Eds.), Integrated Research in Grid Computing, CoreGrid series, Springer-Verlag, 2004.
[9] Jia Yu, Rajkumar Buyya and Chen Khong Tham, "Cost-based Scheduling of Workflow Applications on Utility Grids", Proceedings of the 1st IEEE International Conference on e-Science and Grid Computing (eScience2005, IEEE CS Press, Los Alamitos, CA, USA), Melbourne, Australia, December 5-8, 2005.
[10] R. Buyya and M. Murshed, "GridSim: A Toolkit for the Modeling and Simulation of Distributed Resource Management and Scheduling for Grid Computing", Journal of Concurrency and Computation: Practice and Experience, pp. 1-32, May 2002.
[11] A. Sulistio and R. Buyya, "A Grid Simulation Infrastructure Supporting Advance Reservation", in 16th International Conference on Parallel and Distributed Computing and Systems (PDCS 2004), MIT Cambridge, Boston, USA, November 9-11, 2004.
