
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 4, July 2010

**Scheduling of Workflows in Grid Computing with Probabilistic Tabu Search**

R. Joshua Samuel Raj

CSE, VV College of Engineering Tirunelveli, India joshuasamuelraj@gmail.com

Dr. V. Vasudevan

Prof. & Head/IT, Kalasalingam University Srivilliputur, India drvvmca@yahoo.com

Abstract: In a grid environment the number of resources and of tasks to be scheduled is usually variable and dynamic. This characteristic makes scheduling a complex optimization problem. Scheduling is a key issue in grid computing, and a better scheduling scheme can greatly improve efficiency. The objective of this paper is to explore Probabilistic Tabu Search for compute-intensive grid applications, so as to maximize the Job Completion Ratio and minimize lateness in job completion, based on an understanding of the challenges and the state of the art of current research. Experimental results demonstrate the effectiveness and robustness of the proposed algorithm. A comparative evaluation against other scheduling algorithms, namely First Come First Serve (FCFS), Last Come First Serve (LCFS), Earliest Deadline First (EDF) and Tabu Search, is also plotted.

Key words: grid computing, workflow, Tabu Search, scheduling problem, Probabilistic Tabu Search

INTRODUCTION

Grid computing, a pioneering technique for harnessing geographically dispersed computing power, has changed the perception of the utility and availability of that power. It amalgamates a very large number of computing devices into a grid environment, augmenting computing capability and resolving tasks within the operational grid by enabling the sharing, selection and aggregation of geographically distributed autonomous resources dynamically at runtime, depending on their availability, capability, performance and cost. The focus thereby shifts to collaborative environments that federate services and exchange transactions in order to share resources and achieve common goals, enhancing productivity and speeding up progress in much the same way that the Internet did in yesterday's economy. This has paved the way for numerous research efforts in grid scheduling mechanisms.

Grid computing is our greatest hope for delivering computing as a utility to homes and offices. Many large-scale applications, such as scientific, engineering and business problems (Hai et al., 2005; Cannataro et al., 2002), are solved effectively using the logical amalgamation of geographically dispersed grid resources (Bernan et al., 2002). Grid computing, analogous to the pervasive electrical power grid, enables resource sharing and cooperative work among distributed computational sites. In a grid environment, applications are often described as workflows. A workflow is composed of atomic tasks that are processed in a specific order to fulfill a complicated goal. Grid workflows generally require far more intensive computation and process larger data than traditional workflows, so their performance becomes a critical issue for workflow management systems. One of the most challenging problems is to map each task to a corresponding service instance so as to meet the customers' quality of service (QoS) requirements and achieve high workflow performance. This problem is found to be NP-complete. Grid scheduling poses many challenges that require the simultaneous optimization of several incommensurable and competing objectives:

• Unpredictable challenges in grid resources
• The need for multiple resource types to complete a job
• The need for parallel or concurrent execution of tasks in a workflow

Under the OGSA, the workflow scheduler has to balance several QoS requirements, including makespan and cost. Consequently, many traditional workflow scheduling algorithms, such as Opportunistic Load Balancing, Minimum Completion Time, Min-min, Max-min and Duplex, are not


http://sites.google.com/site/ijcsis/ ISSN 1947-5500

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 4, July 2010

suitable since they only tackle the makespan requirement. In recent years, a number of studies have focused on scheduling problems involving more than one QoS requirement. The traditional approach of advance reservations for scheduling workflows suffers from problems such as overloading and power failure. The overloading and scheduler-failure problems are overcome by a two-level scheduling scheme in which the first level handles frequent small jobs and the second level handles large jobs. The market-oriented approach succeeded in the distributed scheduling of workflows but could not complete more workflows within their deadlines: the success ratio of workflows mapped to grid sites is 30% (Chien et al., 2005) when 30 workflows are scheduled at a time. Workflows submitted to computational grids by resource consumers carry a budget proposal, client authentication and the requirements for their execution, as shown in Fig 1. Resource providers state their willingness to complete a job. Hence grid schedulers search the state space for solutions that achieve high performance in terms of both solution quality and execution speed.

Grid Client's Job Submission form: Client Name: Jeny; Password: ******; CPU Power: 30 TFLOPS

The literature has also presented a scheduling approach for economics-driven grids that optimizes cost under a deadline constraint. In fact, a mixed-integer non-linear programming algorithm was introduced to optimize cost while considering other QoS requirements. As workflow applications grow larger, conventional deterministic approaches may fail to give a satisfactory solution. Moreover, for most practical grid scheduling applications, a scheduler that delivers good-quality plans suffices, and searching for optimality is unnecessary. In fact, in a highly dynamic grid environment it is not even possible to define the optimality of a plan as it is defined in combinatorial optimization, because grid schedulers run as long as the grid system exists and performance is therefore measured not only for particular applications but also in the long run. Meta-heuristics are well known to compute high-quality feasible solutions in a short time, and they have been receiving growing interest owing to their powerful global search capability.

Motivated by the above, in this paper we apply the probabilistic Tabu search algorithm to the generalized grid scheduling problem. The basic idea behind the algorithm is to use preprocessing operations to arrive at a probability value for each vertex, which roughly corresponds to its probability of being included in an optimal solution, and to use these probability values to shrink the neighborhood of each solution to manageable proportions. We report results from computational experiments that demonstrate the superiority of this method over the generic Tabu search method.

PROBLEM DESCRIPTION

Memory: 19 MB; Deadline: 12/09/07; Quality of Service: Best Effort Service; [Submit]

Fig. 1: Job submission blueprint

The literature has proposed a grid workflow scheduling algorithm that optimizes cost while aiming to minimize the makespan.

The Super Schedule Grid Architecture (SSGA), illustrated with an eight-node grid environment, is shown in Fig 2. This architecture can be used for practical applications in normal grid environments. The setup was tested at the TIFAC Core in Network Engineering under a DST project. The goal of the SSGA is to find the allocation sequence of workflows on each grid site. Four major entities are involved in this architecture:

• The grid users submit their requests for job completion to the local grid managers.
• The grid managers receive all the tasks and make the scheduling decision when deploying the requests to the Intra-Grid schedulers.


• The Intra-Grid schedulers hold up-to-date information on the grid resources that are idle at time t; this information is updated frequently. Smaller jobs can be scheduled within their deadlines by the Intra-Grid schedulers in their respective administrative domains. Scheduling at this level is often dynamic.
• Data-intensive applications, where the jobs are larger, need resources worldwide. For these, Inter-Grid schedulers are necessary, and scheduling at this level is often static.

Fig 2: Super Schedule Grid Architecture

The workflow allocation strategy in a grid environment differs from traditional ones. The goal of the Inter-Grid scheduler is to receive requests from the different Intra-Grid schedulers and to produce a schedule that lets as many workflows as possible complete within their deadlines. The following DAG workflows and the penalty cost of each workflow are considered for the experiments.

The duration of each workflow, the penalty cost incurred and the grid resources required are shown in Table 1. The tasks used in the experiment have predecessors and successors: for example, T2 follows T1, or T2 and T3 run as parallel computations once task T1 has executed.

Table 1: Experimental workflows

Fig 3: DAG workflow model
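The DAG workflows described above can be illustrated with a minimal sketch. The four-task workflow below is hypothetical (it is not one of the experimental workflows W1 to W3); it only shows how tasks with predecessors can be grouped into stages whose members may run in parallel, using Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
# T2 and T3 both depend on T1, so they become runnable together once T1 is done.
workflow = {
    "T1": set(),
    "T2": {"T1"},
    "T3": {"T1"},
    "T4": {"T2", "T3"},
}

def execution_stages(dag):
    """Group tasks into stages; tasks within one stage have no mutual dependencies."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tasks whose predecessors are finished
        stages.append(ready)
        sorter.done(*ready)                 # mark the whole stage as executed
    return stages

print(execution_stages(workflow))
```

Each inner list is one stage: a scheduler could dispatch all tasks of a stage to idle grid sites at the same time.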

The workflow models for W1, W2 and W3 are shown in Fig. 3. FCFS maps tasks to the idle grid sites in order of arrival: the first task to arrive is served first. The EDF algorithm executes the tasks whose absolute deadline is the earliest. It therefore estimates the execution deadline of each individual workflow on a standalone system and schedules so that the workflows requiring greater completion time are served first. In EDF the task priorities are not fixed but change depending on the closeness of their absolute deadlines. The experimental setting consists of workflows under the following assumptions:

• Each workflow received by the Inter-Grid scheduler consists of a set of tasks T1, T2, T3 and so on.
• The tasks in each workflow form a Directed Acyclic Graph (DAG) (Fig. 3).
• The output of a task can be transferred to other tasks as per the DAG model, and all jobs are available at time zero.
• At any time a task can be executed only on a grid site that has been reported to the Inter-Grid scheduler as idle via the Intra-Grid scheduler.
• There is no pre-emption of tasks or workflows.
• The sequential order of workflow allotment changes.

Here we present a scheduling approach for the wide-area problem, in which the resources and jobs are geographically dispersed.
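The two baseline orderings can be compared with a small sketch. It assumes a single grid site, no pre-emption, all workflows available at time zero, and textbook EDF (earliest absolute deadline first); the `Workflow` class and the three sample workflows are hypothetical and are not taken from Table 1.

```python
from dataclasses import dataclass

@dataclass
class Workflow:
    name: str
    arrival: int      # arrival order at the scheduler (hypothetical field)
    exec_time: float  # estimated execution time
    deadline: float   # absolute deadline

def fcfs(workflows):
    """First Come First Serve: order by arrival."""
    return sorted(workflows, key=lambda w: w.arrival)

def edf(workflows):
    """Earliest Deadline First: order by absolute deadline."""
    return sorted(workflows, key=lambda w: w.deadline)

def completed_on_time(sequence):
    """Run workflows back to back on one site and count those meeting their deadline."""
    clock, on_time = 0.0, 0
    for w in sequence:
        clock += w.exec_time  # no pre-emption: each workflow runs to completion
        if clock <= w.deadline:
            on_time += 1
    return on_time

workflows = [
    Workflow("W1", arrival=0, exec_time=4, deadline=12),
    Workflow("W2", arrival=1, exec_time=3, deadline=4),
    Workflow("W3", arrival=2, exec_time=2, deadline=6),
]

print("FCFS on time:", completed_on_time(fcfs(workflows)))
print("EDF on time:", completed_on_time(edf(workflows)))
```

With this sample, serving in arrival order lets the long workflow W1 delay the two tighter ones, whereas ordering by deadline completes all three in time. The same `completed_on_time` count is the numerator of a job completion ratio.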

PROPOSED METHOD OF PTS

In this study, the PTS heuristic for solving the scientific workflow scheduling problem in grids is discussed. The roots of Tabu search go back to the 1970s; it was first presented in its present form by Glover [Glover, 1986], and the basic ideas were also sketched by Hansen [Hansen, 1986]. Further formalization is reported in [Glover, 1989], [de Werra & Hertz, 1989] and [Glover, 1990]. Many computational experiments have shown that tabu search is now an established optimization technique that can compete with almost all known techniques and, through its flexibility, can beat many classical procedures.

The generic TS is a metaheuristic strategy based on neighborhood search that overcomes local optimality. It works deterministically, trying to model human memory processes. Memory is implemented by the implicit recording of previously seen solutions using simple but effective data structures. The approach centers on a Tabu list of moves that have been performed recently and are forbidden for a certain number of iterations, which helps to avoid cycling and promotes search in a diversified space. At each iteration, TS moves to the best solution that is not forbidden and is thus not trapped by local optima. The generic TS introduces flexible memory structures that articulate strategic restrictions and aspiration levels as a means of exploiting search spaces. TS can generate solutions of notably high quality because it escapes from local minima and implements an explorative strategy. TS is an iterative procedure for searching for a global optimum of a discrete combinatorial problem. The philosophy of TS is to avoid entrapment in cycles by forbidding or penalizing moves that take the solution, in the next iteration, to points in the solution space previously visited.

To improve the efficiency of the exploration process, one needs to keep track not only of local information (such as the current value of the objective function) but also of some information related to the exploration process. This systematic use of memory is an essential feature of Tabu search. While most exploration methods keep in memory essentially the value f(i*) of the best solution i* visited so far, TS also keeps information on the itinerary through the last solutions visited. Such information is used to guide the move from i to the next solution j chosen in N(i). The role of the memory is to restrict the choice to some subset of N(i), for instance by forbidding moves to some neighbor solutions. More precisely, the structure of the neighborhood N(i) of a solution i in fact varies from iteration to iteration.

The main problem with such a tabu search algorithm is the size of the neighborhood of each solution. Generic Tabu search is thus able to execute only a few iterations within reasonable execution times, which hampers matching a job to the appropriate resource in the shortest time possible. The Probabilistic Tabu search for grid scheduling addresses this concern.

SOLUTION CONSTRUCTION

The structure of Probabilistic Tabu search is as follows. The basic idea is to look at only a subset of the neighborhood of each solution, namely the subset with the maximum likelihood of containing the best tabu and non-tabu neighbors. The belief is that a large enough set of locally optimal solutions collectively contains predominantly those features that are present in globally optimal solutions and rarely contains features that are absent from globally optimal solutions.
In this approach, a pre-defined number of starting solutions are chosen from widely separated regions of the sample space and used in local search procedures to obtain a set of locally optimal solutions. These locally optimal solutions are then examined to estimate the probability of each solution element being included in an optimal solution. Using this estimate, the neighborhood of each solution is searched in a probabilistic manner.

General Scheme of PTS: The structure of the PTS algorithm is formalized below.

Step 0 (Generating Probabilities): Generate a set of s solutions S = {S1, S2, ..., Ss}, using an extension to the local search method to obtain a local optimum for each. For each solution element vi compute the associated probability pi. Go to Step 1.

Step 1 (Initialization): Define all solution elements as non-tabu. Choose an initial solution S, set BestSolution ← S, and set Iteration ← 1. Go to Step 2.

Step 2 (Termination): If a pre-defined termination condition is satisfied, output BestSolution and exit. Else go to Step 3.

Step 3 (Iteration): Consider each neighbor N of S with a probability of (1 − pi) pj, where vi = S \ N and vj = N \ S. If vi or vj is marked 'tabu' then N is a tabu neighbor; otherwise it is a 'non-tabu' neighbor. If the best tabu neighbor considered has a cost lower than the cost of BestSolution, go to Step 4; else replace S by the best non-tabu neighbor considered. Mark the solution elements participating in this move (i.e. the vertex that has left the solution and the vertex that has entered the solution to form the neighbor) as tabu for the next TENURE moves. If this best non-tabu neighbor is better than BestSolution, replace BestSolution with this neighbor. Set Iteration ← Iteration + 1. Go to Step 2.

Step 4 (Aspiration): Replace BestSolution and S with the tabu neighbor of S. Remove the tabu status of all solution elements. Set Iteration ← Iteration + 1. Go to Step 2.

For every solution move in the TS procedure, the neighborhood solution is evaluated with a Dual Objective Function that minimizes the total penalty cost of the chosen workflow sequence and maximizes the number of workflows completed within their deadlines (Job Completion Ratio). In our proposed method, the workflows are created based on the DAG model and the deadline is fixed at 1.5 * Execution time.

RESULTS AND DISCUSSION

An initial job sequence is selected at random from the set of job sequences, and the dual objective function value of this solution is taken as the best cost. The solution obtained is recorded as the initial step of the Probabilistic Tabu scheduling mechanism. Next, the set of neighborhood solutions of S is generated, the dual objective function (DOF) is calculated again, and the best cost in the history record is replaced if necessary.
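Steps 0 to 4 of the general scheme can be sketched in code under some loudly stated assumptions, none of which come from the paper's implementation: a solution is encoded as a task-to-site mapping, a solution element is a (task, site) pair, a move reassigns one task (so exactly one element leaves and one enters), termination is a fixed iteration budget, and the cost function is a plain makespan used as a stand-in for the dual objective function. All function names, parameters and sample data are hypothetical.

```python
import random
from collections import Counter

def local_search(sol, cost, sites):
    """Greedy descent: move one task to another site while that lowers the cost."""
    sol = dict(sol)
    improved = True
    while improved:
        improved = False
        for task in list(sol):
            for site in sites:
                cand = {**sol, task: site}
                if cost(cand) < cost(sol):
                    sol, improved = cand, True
    return sol

def element_probabilities(tasks, sites, cost, samples, rng):
    """Step 0: estimate how often each (task, site) element appears in local optima."""
    counts = Counter()
    for _ in range(samples):
        start = {t: rng.choice(sites) for t in tasks}
        counts.update(local_search(start, cost, sites).items())
    # Smoothed frequencies keep every probability strictly between 0 and 1.
    return {(t, s): (counts[(t, s)] + 1) / (samples + 2) for t in tasks for s in sites}

def pts(tasks, sites, cost, iterations=200, tenure=3, samples=10, seed=0):
    rng = random.Random(seed)
    p = element_probabilities(tasks, sites, cost, samples, rng)     # Step 0
    current = {t: rng.choice(sites) for t in tasks}                  # Step 1
    best = dict(current)
    tabu_until = {}  # element -> last iteration at which it is still tabu
    for it in range(1, iterations + 1):                              # Step 2: fixed budget
        best_tabu = best_free = None
        for t in tasks:                                              # Step 3
            for s in sites:
                if s == current[t]:
                    continue
                leaving, entering = (t, current[t]), (t, s)
                if rng.random() >= (1 - p[leaving]) * p[entering]:
                    continue  # this neighbor is not sampled
                cand = {**current, t: s}
                tabu = max(tabu_until.get(leaving, 0), tabu_until.get(entering, 0)) >= it
                if tabu:
                    if best_tabu is None or cost(cand) < cost(best_tabu):
                        best_tabu = cand
                elif best_free is None or cost(cand) < cost(best_free[0]):
                    best_free = (cand, leaving, entering)
        if best_tabu is not None and cost(best_tabu) < cost(best):   # Step 4: aspiration
            current, best = best_tabu, dict(best_tabu)
            tabu_until.clear()
        elif best_free is not None:
            current, leaving, entering = best_free
            tabu_until[leaving] = tabu_until[entering] = it + tenure
            if cost(current) < cost(best):
                best = dict(current)
    return best

# Hypothetical data: four tasks with execution times, two grid sites.
exec_time = {"T1": 4, "T2": 3, "T3": 2, "T4": 5}
sites = ["G1", "G2"]

def makespan(sol):
    """Stand-in cost: the load of the busiest site."""
    load = {s: 0 for s in sites}
    for task, site in sol.items():
        load[site] += exec_time[task]
    return max(load.values())

best = pts(list(exec_time), sites, makespan)
print(best, makespan(best))
```

Because each neighbor is examined only with probability (1 − pi) pj, moves that drop a frequently seen element or add a rarely seen one are mostly skipped, which is how the neighborhood shrinks. Swapping `makespan` for a function that combines penalty cost and deadline hits would bring the sketch closer to the dual objective described above.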

The comparative increase in workflow completions achieved by the PTS dual-objective scheduling mechanism over the other algorithms, FCFS, EDF and TS, is shown in Fig 4 and Fig 5.

Fig 4: Job completion ratio (JCR versus number of workflows for FCFS, EDF, TS and PTS)

The results show that PTS outperforms TS in the number of workflows completed. Table 2 lists the penalty cost incurred by the Inter-Grid scheduler for jobs not completed. By this measure PTS outperforms the other scheduling mechanisms considered.

Fig 5: DOF for PTS, TS, EDF and FCFS (DOF versus number of workflows)

Table 2: Penalty cost incurred for the workflow sequence

| No. of workflows | FCFS | EDF | TS | PTS |
|---|---|---|---|---|
| 5 | 41.34 | 29 | 25 | 20.88 |
| 10 | 43.75 | 35.78 | 29.94 | 27.63 |
| 15 | 45.67 | 42.87 | 33.78 | 30.84 |
| 20 | 56.45 | 45.78 | 40.82 | 37.62 |
| 25 | 61.45 | 50.83 | 51.98 | 39.652 |
| 30 | 74.55 | 58.34 | 59.674 | 45.67 |
| 35 | 84.3 | 73.46 | 68.3 | 50.64 |
| 40 | 97.55 | 79.83 | 74 | 62.1 |
| 45 | 100.98 | 87.67 | 79.56 | 75.3 |
| 50 | 108.3 | 97.25 | 85.65 | 82.5 |
| 55 | 112.7 | 106 | 99.32 | 89.41 |
| 60 | 119.5 | 112.3 | 106.9 | 100.26 |
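The gap between TS and PTS in Table 2 is easier to read as a relative reduction. The short sketch below uses only the TS and PTS penalty costs from Table 2; the helper name `penalty_reduction` is introduced here for illustration.

```python
# Penalty costs copied from Table 2 (TS and PTS columns).
workflow_counts = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
ts_cost = [25, 29.94, 33.78, 40.82, 51.98, 59.674, 68.3, 74, 79.56, 85.65, 99.32, 106.9]
pts_cost = [20.88, 27.63, 30.84, 37.62, 39.652, 45.67, 50.64, 62.1, 75.3, 82.5, 89.41, 100.26]

def penalty_reduction(baseline, improved):
    """Percentage by which each improved cost lies below the matching baseline cost."""
    return [100 * (b - i) / b for b, i in zip(baseline, improved)]

reductions = penalty_reduction(ts_cost, pts_cost)
for n, r in zip(workflow_counts, reductions):
    print(f"{n:>2} workflows: PTS penalty is {r:.1f}% below TS")
```

For 5 workflows, for instance, the reduction is 100 × (25 − 20.88) / 25 = 16.48%. PTS incurs the lower penalty in every row of the table.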


CONCLUSION AND FUTURE WORK

In this paper, we have applied the probabilistic tabu search algorithm to the generalized grid scheduling problem. In this approach, a pre-defined number of starting solutions are chosen from widely separated regions of the sample space and used in local search procedures to obtain a set of locally optimal solutions. These locally optimal solutions are then examined to estimate the probability of an element being included in an optimal solution. Using these estimates, the neighborhood of each solution is searched in a probabilistic manner. Our computational experience shows that this probabilistic tabu search method outperforms generic tabu search most of the time. In the near future we plan to combine Probabilistic Tabu search with simulated annealing and a sharing method to increase efficiency. Similarly, ant colony properties can be incorporated into the existing algorithm for scalability. The procedure can also be suitably modified and applied to any kind of grid scheduling in different problem environments, optimizing any number of objectives concurrently.

REFERENCES

E.H.L. Aarts, P.J.M. van Laarhoven, J.K. Lenstra, and N.L.J. Ulder, "A Computational Study of Local Search Algorithms for Job Shop Scheduling", ORSA Journal on Computing 6 (1994), pp. 118-125.

I. Foster and C. Kesselman, The Grid: Blueprint for a Future Computing Infrastructure, San Mateo, CA: Morgan Kaufmann, 1999.

M. Maheswaran, et al., "Dynamic mapping of a class of independent tasks onto heterogeneous computing systems", Journal of Parallel and Distributed Computing, Vol. 59, 1999, pp. 107-131.

R. Buyya, D. Abramson, and J. Giddy, "A case for economy grid architecture for service oriented grid computing", 10th Heterogeneous Computing Workshop (HCW 2001), 2001.

I. Foster, C. Kesselman, S. Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", Intl J. Supercomputer Applications, 2001.

H. XiaoShan, S. XiaoHe, "QoS guided min-min heuristic for grid task scheduling", Journal of Comput. Sci. & Technol., Vol. 18, No. 4, 2003, pp. 442-451.

Diptesh Ghosh, "A Probabilistic Tabu Search algorithm for the Generalized Minimum Spanning Tree Problem", Indian Institute of Management (Ahmedabad), 2003.

A. A. Mandal, et al., "Scheduling strategies for mapping application workflows onto the grid", in Proceedings of the 14th IEEE International Symposium on High Performance and Distributed Computing (HPDC-14), 2005, pp. 125-134.

J. Yu, R. Buyya, and C.K. Tham, "Cost-based scheduling of scientific workflow applications on utility grids", Proceedings of the 1st International Conference on e-Science and Grid Computing (e-Science '05), 2005, pp. 140-147.

M.M. López, E. Heymann, M.A. Senar, "Analysis of dynamic heuristics for workflow scheduling on grid systems", in Proceedings of the Fifth International Symposium on Parallel and Distributed Computing (ISPDC '06), IEEE, 2006.

A. Afzal, J. Darlington, A.S. McGough, "QoS-constrained stochastic workflow scheduling in enterprise and scientific grids", The 7th IEEE/ACM International Conference on Grid Computing, 2006, pp. 1-8.

Name: R. Joshua Samuel Raj

Affiliation: Assistant Professor / CSE, VV College of Engineering.

**Brief Biographical History:**

2005 - Graduated from the Computer Science and Engineering Department of PETEC under Anna University

2007 - Received M.E. degree in Computer Science and Engineering from Jaya College of Engineering under Anna University

2009 - Working towards the Ph.D. degree in the area of Grid scheduling under Kalasalingam University

Main Works: Grid computing, Mobile Adhoc Networking, Multicasting and so forth

Name: V. Vasudevan

Affiliation: Director, Software Technologies Lab, TIFAC Core in Network Engineering, Srivilliputhur, India

**Brief Biographical History:**

1984 - M.Sc. in Mathematics; worked in several areas related to Representation Theory

1992 - Received his Ph.D. degree from Madurai Kamaraj University

2008 - Project Director for the Software Technologies Group of TIFAC Core in Network Engineering and Head of the Department of Information Technology at Kalasalingam University, Srivilliputhur, India

Main Works: Grid computing, Agent Technology, Intrusion Detection system, Multicasting and so forth
