
INTRA-GROUP LOAD BALANCING IN GRID COMPUTING USING GENETIC ALGORITHM

Mrs. J. Jayabharathy and *Mrs. S. Neelavathi

Senior Lecturer, Department of Computer Science and Engg, Pondicherry Engineering College, Puducherry, bharathyhari_raja@yahoo.co.in
*M.Tech student, Department of Computer Science and Engg, Pondicherry Engineering College, Puducherry, neelabaski@gmail.com

Abstract Effective and efficient load balancing algorithms are fundamentally important for improving the global throughput of a grid environment. Although the load balancing problem in classical distributed systems has been intensively studied, new challenges in Grid computing still make it an interesting topic, and many research projects are under way. This is due to the characteristics of the Grid and to the complex nature of the problem. This paper presents an efficient dynamic load balancing scheme based on a genetic algorithm (GA), which includes a fitness evaluation mechanism for a task load balancing model in a Grid environment. This task load balancing model is characterized by three main features: (i) it is hierarchical; (ii) it supports heterogeneity and scalability; and (iii) it is totally independent of any physical Grid architecture and uses a hierarchical load balancing strategy to balance tasks among Grid resources. In the GA-based intra-group load balancing scheme we propose, the subset of nodes to which migration requests are sent is adaptively determined by a learning procedure in order to reduce unnecessary requests. The learning procedure consists of standard genetic operations such as selection, crossover and mutation applied to a population of binary strings, each of which stands for a list of nodes to which the migration requests are sent. Each node has its own population, and the fitness of a string depends on how efficiently the destination of a migration is found. We show the effectiveness of the proposed strategy in terms of mean response time and average job slowdown.

I. INTRODUCTION
The availability of low-cost powerful computers, coupled with the popularity of the Internet and high-speed networks, has moved the computing environment from classical distributed systems to Grid environments. In fact, recent research on computing architectures has allowed the emergence of a new computing paradigm known as Grid computing. A Grid is a type of distributed system which supports the sharing and coordinated use of resources, independently of their physical type and location [1]. This technology allows the use of geographically widely distributed and multi-owner resources to solve large-scale applications such as meteorological simulations, data-intensive applications, etc. [2].

Although the load balancing problem in conventional distributed systems has been intensively studied, new challenges in Grid computing still make it an interesting topic, and many research projects are under way. This is due to the characteristics of Grid computing and to the complex nature of the problem itself. Load balancing algorithms for classical distributed systems, which usually run on homogeneous and dedicated resources, cannot work well in Grid architectures. Grids have many specific characteristics [3], such as heterogeneity, autonomy and dynamicity, which remain obstacles to applying conventional load balancing algorithms directly.

Load balancing is a mapping strategy that efficiently distributes the task load over multiple computational resources in the network, based on the system status, to improve performance [4]. The essential objective of load balancing is to allocate tasks among the processors so as to maximize processor utilization and minimize the mean response time. While the overall system load across the grid remains light, the intra-group algorithm performs well because it succeeds in finding a node to which a task can be migrated. When the system load becomes heavy, it is difficult to find such a node because most nodes cannot afford to receive additional tasks. Therefore, many requests and rejects are repeatedly sent back and forth, and a lot of time is consumed before execution.

Our main contribution to dynamic load balancing in the Grid environment using a genetic algorithm is a strategy that determines the destination processor for a migrating task through a learning procedure carried out within each node. The objective of the learning is to reduce the time for deciding the destination and thereby improve the acceptance rate of the request messages. We realize the learning via genetic operators such as selection, crossover and mutation applied to a population of binary strings, each of which stands for the combination of processors to which request messages are sent.


II. TREE-BASED LOAD BALANCING MODEL OF GRID

A. Grid topology
From the topological point of view, we regard a Grid as a set of clusters in a multi-node platform. Each cluster owns a set of worker nodes and belongs to a local domain, i.e. a LAN (Local Area Network). Every cluster is connected to the global network, or WAN (Wide Area Network), by a switch. Figure 1 describes this topology.

Fig. 1. Example of a Grid topology

B. Mapping a Grid into a tree-based model
The load balancing strategy is based on mapping any Grid architecture into a tree structure. This tree is built by aggregation as follows. First, for each cluster we create a two-level subtree. The leaves of this subtree correspond to the nodes of the cluster, and its root, called the cluster manager, represents a virtual tree node associated with the cluster, whose role is to manage the cluster workload. Second, the subtrees corresponding to all clusters are aggregated to build a three-level tree whose root is a virtual tree node designated as the Grid manager. The final tree is denoted by C/N, where C is the number of clusters that compose the Grid and N the total number of worker nodes. As illustrated in Figure 2, this tree can be transformed into two specific trees: C/N and 1/N, depending on the values of C and N. The mapping function generates a non-cyclic connected graph where each level has specific functions.

Fig. 2. Tree-based representation of a Grid

Level 0: In this first level, we have the Grid manager, which realizes the following functions:
(i) Maintains the workload information about the cluster managers.
(ii) Decides to start a global load balancing between the clusters of the Grid, which we will call intra-Grid load balancing.
(iii) Sends the load balancing decisions to the cluster managers of level 1 for execution.

Level 1: Each cluster manager of this level is associated with a physical cluster of the Grid. This manager is responsible for:
(i) Maintaining the workload information relating to each one of its worker nodes.
(ii) Estimating the workload of its associated cluster and sending this information to the Grid manager.
(iii) Deciding to start a local load balancing, which we will call intra-cluster load balancing.
(iv) Sending the load balancing decisions to the worker nodes which it manages, for execution.

Level 2: At this last level, we find the worker nodes of the Grid linked to their respective clusters. Each node at this level is responsible for:
(i) Maintaining its workload information.
(ii) Sending this information to its cluster manager.
(iii) Performing the load balancing decided by its manager.
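The three-level tree described in this section can be sketched as a small data structure. This is only an illustrative sketch: the class names, the `load` field and the `build_grid` helper are assumptions introduced here and are not part of the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorkerNode:
    """Level 2: a worker node that maintains its own workload."""
    name: str
    load: float = 0.0  # hypothetical workload index


@dataclass
class ClusterManager:
    """Level 1: virtual tree node managing one physical cluster."""
    name: str
    workers: List[WorkerNode] = field(default_factory=list)

    def workload(self) -> float:
        # The cluster workload is estimated from its own worker nodes.
        return sum(w.load for w in self.workers)


@dataclass
class GridManager:
    """Level 0: root of the tree; it only sees cluster-level information."""
    clusters: List[ClusterManager] = field(default_factory=list)

    def shape(self) -> str:
        # C/N notation: C clusters and N worker nodes in total.
        n = sum(len(c.workers) for c in self.clusters)
        return f"{len(self.clusters)}/{n}"

    def workload_report(self) -> Dict[str, float]:
        # Workload information reported upward by each cluster manager.
        return {c.name: c.workload() for c in self.clusters}


def build_grid(cluster_loads: Dict[str, List[float]]) -> GridManager:
    """Aggregate one two-level subtree per cluster under a single root."""
    clusters = []
    for cname, loads in cluster_loads.items():
        workers = [WorkerNode(f"{cname}-n{i}", load) for i, load in enumerate(loads)]
        clusters.append(ClusterManager(cname, workers))
    return GridManager(clusters)


grid = build_grid({"c0": [4.0, 1.0, 3.0], "c1": [0.5, 0.5]})
```

Here `grid.shape()` gives the C/N label of the tree, and `grid.workload_report()` mimics the information a Grid manager would hold about its cluster managers.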

III. GENETIC APPROACH BASED DYNAMIC LOAD BALANCING IN GRID COMPUTING


A. Overview
The genetic approach to dynamic load balancing [4] employs a three-level load system (Light, Normal, Heavy), determined by the length of the task queue in each processor.
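As a small illustration of the three-level load system, the sketch below classifies a processor by its task-queue length. The threshold values `LIGHT_MAX` and `NORMAL_MAX` are assumptions chosen for the example; the paper does not state the queue lengths that separate the levels.

```python
from enum import Enum


class LoadLevel(Enum):
    LIGHT = "Light"
    NORMAL = "Normal"
    HEAVY = "Heavy"


# Assumed thresholds on the task-queue length; the paper gives no values.
LIGHT_MAX = 2
NORMAL_MAX = 5


def load_level(queue_length: int) -> LoadLevel:
    """Classify a processor by the length of its task queue."""
    if queue_length < 0:
        raise ValueError("queue length cannot be negative")
    if queue_length <= LIGHT_MAX:
        return LoadLevel.LIGHT
    if queue_length <= NORMAL_MAX:
        return LoadLevel.NORMAL
    return LoadLevel.HEAVY
```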

Each processor has its own population, a multiset of strings, to which genetic operators such as selection, crossover and mutation are applied, as illustrated in Fig. 3. A string is defined as a binary-coded vector indicating the set of processors to which requests are sent: a bit is 1 if a request is dispatched to the corresponding processor and 0 otherwise. Each string is associated with its payoff values and has its own fitness value. The fitness value of a string is the average of its last T payoff values. A string is selected from the population with probability proportional to its fitness value. The learning procedure is realized by applying these genetic operators to the population of binary strings, each of which stands for the combination of processors to which the request messages are sent; its effect on the whole system is assessed in terms of mean response time.
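To make the string representation concrete, here is a minimal sketch of a request string, its fitness and fitness-proportional selection. The window size `T = 5`, the choice g(m) = 1/m for the payoff, and all class and function names are assumptions made for illustration. The payoff rule and the random fallback when every fitness value is zero follow the rules stated in the Fitness Evaluation subsection of the paper.

```python
import random
from collections import deque
from typing import Deque, List, Sequence

T = 5  # assumed window size: fitness averages the last T payoffs


def payoff(any_accepted: bool, requests_sent: int) -> float:
    """0 if every request was rejected, otherwise g(m) with assumed g(m) = 1/m."""
    if not any_accepted or requests_sent <= 0:
        return 0.0
    return 1.0 / requests_sent


class RequestString:
    """Binary string: bits[i] == 1 means 'send a request to processor i'."""

    def __init__(self, bits: Sequence[int]):
        self.bits: List[int] = list(bits)
        self.payoffs: Deque[float] = deque(maxlen=T)  # keeps only the last T

    def record(self, value: float) -> None:
        self.payoffs.append(value)

    @property
    def fitness(self) -> float:
        # Average of the recorded payoffs; 0 before any payoff is recorded.
        return sum(self.payoffs) / len(self.payoffs) if self.payoffs else 0.0

    def targets(self) -> List[int]:
        # Indices of the processors that would receive a request.
        return [i for i, b in enumerate(self.bits) if b == 1]


def select(population: Sequence[RequestString], rng: random.Random) -> RequestString:
    """Fitness-proportional (roulette-wheel) selection."""
    weights = [s.fitness for s in population]
    if sum(weights) == 0:
        # All fitness values are 0: select a string at random.
        return rng.choice(list(population))
    return rng.choices(list(population), weights=weights, k=1)[0]


rng = random.Random(0)
population = [RequestString([1, 0, 1, 0]), RequestString([0, 1, 1, 1])]
population[0].record(payoff(True, 2))  # accepted after issuing 2 requests
population[0].record(payoff(True, 1))  # accepted after issuing 1 request
chosen = select(population, rng)
```

In this example only the first string has a non-zero fitness, so selection always returns it; with several non-zero fitness values the choice becomes probabilistic.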

Fig. 3. An overview of the genetic approach to dynamic load balancing in a DCS

B. Fitness Evaluation
The fitness value of a string is the average of its last T payoff values, each of which is given at every migration. If at least one of the migration requests issued according to a string is accepted, the string receives a positive payoff which is inversely proportional to the number of requests issued. Otherwise, it receives zero payoff. That is, the payoff of a string is defined as follows: Payoff = 0 if all the requests were rejected, Payoff = g(m) otherwise, where m is the number of requests issued and g(.) is a monotone decreasing function. A string is selected from the population with probability proportional to its fitness value. That is, the probability of the j-th string being selected is $f_j / \sum_{k=1}^{N} f_k$, where $f_j$ is the fitness value of the j-th string and N is the number of strings in the population. When all fitness values are 0, a string is selected at random. String duplication leads to inconsistent fitness values, because identical strings may carry different fitness.

IV. INTRA GROUP LOAD BALANCING STRATEGY

A. Principles
In accordance with the proposed model, we distinguish two load balancing levels: intra-cluster load balancing and intra-Grid load balancing.

Intra-cluster load balancing: In this first level, depending on the current workload of its associated cluster, estimated from its own worker nodes, each cluster manager decides whether or not to start a load balancing operation. If it decides to start one, a combination of request messages is issued before migrating a task. This combination is determined by a learning procedure carried out within each processor, whose objective is to reduce the time for deciding the destination and to improve the acceptance rate of the request messages in a dynamically changing environment. This learning procedure is realized through genetic operators such as selection, crossover and mutation applied to a population of binary strings. Hence, C local load balancing operations can proceed in parallel, where C is the number of clusters.

Intra-Grid load balancing: Load balancing at this level is performed only if some cluster managers fail to balance their workload among their associated worker nodes. The local balancing failure may be due either to saturation of the cluster or to insufficient supply. In this case, tasks of overloaded clusters are transferred to underloaded ones according to the communication cost and the selection criteria. The chosen underloaded clusters are those which need minimal communication cost for transferring tasks from overloaded clusters.

B. Generic Strategy
At any load balancing level, we propose the following strategy. As the description is given in a generic way, we use the concepts of group and element. Depending on the case, a group designates either a cluster or the Grid (level 1 or level 0 in the tree). An element is a group component (a worker node of level 2 or a cluster of level 1).
The main steps of our strategy can be summarized as follows:

STEP 1: Estimation of current group workload
Knowing the number of available elements under its control and their computing capabilities, each group manager estimates its group capability by performing the following functions:
(i) Estimates the current workload of the group based on workload information.
(ii) Computes the standard deviation over the workload index in order to measure the deviations between the involved elements.
(iii) Sends workload information to its grid manager.

STEP 2: Decision Making
The manager decides whether or not it is necessary to perform a load balancing operation. To do so, it performs the following two actions: defining the imbalance/saturation state of the group, and group partitioning.
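A rough sketch of the STEP 1 estimation, together with the state test that STEP 2 applies to it, is given below. Several points are assumptions made for illustration rather than definitions from the paper: the processing time of an element is taken as load divided by speed, the group speed and capacity are taken as sums over the elements, and the two threshold values are arbitrary. The order of the checks (balance first, then saturation) follows the generic algorithm listed later in this section.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    load: float      # current workload of the element
    speed: float     # computing capability of the element
    capacity: float  # load level at which the element is saturated (assumed)


@dataclass
class GroupEstimate:
    load: float      # total workload of the group
    speed: float     # aggregated speed of the group
    capacity: float  # aggregated capacity of the group
    tex: float       # processing time of the group
    sigma: float     # standard deviation over element processing times


def estimate_group(elements: List[Element]) -> GroupEstimate:
    """STEP 1: aggregate workload information and measure its dispersion."""
    load = sum(e.load for e in elements)
    speed = sum(e.speed for e in elements)
    capacity = sum(e.capacity for e in elements)
    tex = load / speed  # assumed definition: processing time = load / speed
    variance = sum((e.load / e.speed - tex) ** 2 for e in elements) / len(elements)
    return GroupEstimate(load, speed, capacity, tex, math.sqrt(variance))


def group_state(g: GroupEstimate, balance_threshold: float,
                saturation_threshold: float) -> str:
    """STEP 2, first action: classify the group before any partitioning."""
    if g.sigma <= balance_threshold:
        return "balanced"
    if g.load / g.capacity > saturation_threshold:
        return "saturated"
    return "imbalanced"


group = estimate_group([Element(8.0, 2.0, 10.0), Element(2.0, 2.0, 10.0)])
state = group_state(group, balance_threshold=0.5, saturation_threshold=0.9)
```

In this example the two elements have processing times 4 and 1 around a group value of 2.5, so the deviation exceeds the balance threshold while the group is far from saturation, and partitioning into sources and receivers would be the next action.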

Defining the imbalance/saturation state of the group: The standard deviation measures the average deviation between the processing times of the elements and the processing time of their group, so the group is in a balanced state when this deviation is small. To define the balanced state, we introduce a balance threshold below which we consider that the standard deviation tends to zero and hence the group is balanced. A group can be balanced while being in a saturated state. To measure saturation we introduce another threshold, called the saturation threshold. When the current workload of a group borders its capacity, it is useless to balance, since all of its components are saturated.

Group partitioning: For an imbalance case, we determine the overloaded elements (sources) and the underloaded ones (receivers), depending on the processing time of every element relative to the average processing time of the associated group.

STEP 3: Tasks transferring
The following heuristic is proposed to transfer tasks from overloaded elements to underloaded ones.
1. Evaluate the total amount of load supply available on receiver elements.
2. Compute the total amount of load demand required by source elements.
3. If the supply is much lower than the demand, it is not recommended to start local load balancing. We introduce a third threshold to measure the relative deviation between supply and demand, and perform the task transfer.

STEP 4: Initialization
An Initialization procedure is executed in each processor only when the whole system starts to work. The load level of a processor is set to Light and a population of strings is generated randomly with no string duplications. All the initial strings are assigned a fitness value of 0.

STEP 5: Check-load
Whenever a new task is born in a processor, a Check-load procedure is called to observe the processor's own load by checking the length of the task queue. If the observed load is Heavy, the Check-load procedure selects a string from the population with probability proportional to its fitness and then sends off the Request messages according to the contents of the string. The total number of Requests issued is used by a Message-evaluation procedure to know whether all the answers to the Requests have returned or not.

STEP 6: Message-Evaluation
A Message-evaluation procedure is woken up whenever a processor receives a message from other processors through the communication network. When a processor P_j receives a Request_ij message, it sends back an Accept_ji if it is lightly loaded and ready to accept an additional task; otherwise, it sends back a Reject_ji. When a received message at P_j is an Accept_ij, the number i is put into an available list which indicates the processors ready to accept an additional task. After receiving all answers, the Message-evaluation procedure in P_j sends off a Migrate_ji message, together with a task taken out of its own waiting queue, to a processor indicated by the available list. The processor P_i which receives a Migrate_ji message puts the received task into its own task queue. When a processor receives a Terminate_ij message, it gets the results of the execution.

STEP 7: String Evaluation
A String-evaluation procedure starts to work after all answers to the previously issued requests have returned. This procedure calculates the payoff value and the fitness value of the selected string according to the result of the requests.

STEP 8: Genetic Operations
The String evaluation is followed by a Genetic-operations procedure. Genetic operations such as crossover and mutation are executed on the population as follows. First, selection is applied to eliminate strings with low fitness values. Then, crossover and mutation are applied to randomly chosen strings. Lastly, the offspring generated in the former step are added to the population. The initial fitness and payoff of an offspring are directly inherited from its parents. The correct fitness of a new string is evaluated by the String-evaluation procedure, which is carried out only after the new string has been selected to decide the destinations of requests.

C. Generic Intra-Group Load Balancing With Genetic Algorithm
Step 1: Workload estimation
1. Workload information about each element $E_i$ of G:
   For every $E_i$, according to its specific period, do
      Send its workload $LOD_i$ to its group manager
   end For
2. According to its period, the group manager performs:
   a- Computes speed $SPD_G$ and capacity $SAT_G$ of G.
   b- Evaluates current load $LOD_G$ and processing time $TEX_G$.
   c- Computes the standard deviation $\sigma_G$ over processing times.
   d- Sends workload information of G to its manager.

Step 2: Decision making
3. Balance criteria:
   Switch
      G = Cluster: If ($\sigma_G \le$ balance threshold) then Cluster is in balance state; Return end If
      G = Grid: If (# overloaded clusters $\le$ given threshold) then Grid is balanced; Return end If
   end Switch
4. Saturation criteria:
   If ($LOD_G / SAT_G >$ saturation threshold) then Group G is saturated; Load balancing Fail; Return end If
5. Partitioning G into overloaded (GES), underloaded (GER) and balanced (GEN):
   $GES \leftarrow \emptyset$; $GER \leftarrow \emptyset$; $GEN \leftarrow \emptyset$
   For every element $E_i$ of G do
      If ($E_i$ saturated) then
         $GES \leftarrow GES \cup \{E_i\}$
      else
         Switch
            $TEX_i > TEX_G + \sigma_G$: $GES \leftarrow GES \cup \{E_i\}$
            $TEX_i < TEX_G - \sigma_G$: $GER \leftarrow GER \cup \{E_i\}$
            $TEX_G - \sigma_G \le TEX_i \le TEX_G + \sigma_G$: $GEN \leftarrow GEN \cup \{E_i\}$
         end Switch
      end If
   end For

Step 3: Tasks transferring
6. Test on Supply and Demand:
   $\mathrm{Supply} = \sum_{E_r \in GER} \left( LOD_G \cdot SPD_r / SPD_G - LOD_r \right)$
   $\mathrm{Demand} = \sum_{E_s \in GES} \left( LOD_s - LOD_G \cdot SPD_s / SPD_G \right)$
   If (Supply/Demand $\le$ supply/demand threshold) then local Load balancing Fail; Return end If
7. Tasks transferring:
   If (G = Cluster) then Perform Heuristic 1 else Perform Heuristic 2 end If

Heuristic 1: Intra-cluster tasks transferring
S1: Initialization of all nodes with population strings
S2: Check-load to observe the nodes' load
S3: Perform the following steps if the load is heavy
S4: Message evaluation()
S5: String evaluation()
S6: Genetic operators
   1. selection
   2. uniform crossover
   3. mutation
S7: Repeat steps 4 to 6 until terminating condition

Heuristic 2: Intra-Grid tasks transferring
S1: Initialization()
S2: Check-load()
S3: Perform the following steps if the load is heavy
S4: Message evaluation()
S5: String evaluation()
S6: Genetic operators
   1. selection
   2. uniform crossover
   3. mutation
S7: Repeat steps 4 to 6 until terminating condition

V. SIMULATION RESULTS
We executed several simulations of the proposed genetic algorithm for intra-group load balancing to compare it with the conventional load balancing strategy.

Simulation 1: We compared the performance of the proposed intra-cluster algorithm with the conventional one in terms of response time.

The above figure shows the response time as the number of nodes per cluster varies from 5 to 25. The proposed intra-cluster algorithm reduces the response time, and the reduction becomes more significant as the number of nodes in a cluster increases.

Simulation 2: We compared the performance of the proposed intra-cluster algorithm with the conventional one in terms of job slowdown.

The above figure shows the job slowdown as the number of nodes per cluster varies from 5 to 25. The proposed intra-cluster algorithm reduces the job slowdown, and the reduction becomes more significant as the number of nodes in a cluster increases.

Simulation 3: We compared the performance of the proposed intra-Grid algorithm with the conventional one in terms of response time.

The above figure shows the response time as the number of clusters varies from 5 to 25. The proposed intra-Grid algorithm reduces the response time, and the reduction becomes more significant as the number of clusters increases.

Simulation 4: We compared the performance of the proposed intra-Grid algorithm with the conventional one in terms of job slowdown.

The above figure shows the job slowdown as the number of clusters varies from 5 to 25. The proposed intra-Grid algorithm reduces the job slowdown, and the reduction becomes more significant as the number of clusters increases.

VI. CONCLUSION AND FUTURE WORKS
This paper summarizes the implementation of two levels of load balancing algorithms, i.e. an intra-cluster load balancing algorithm and an intra-Grid load balancing algorithm, combined with a genetic approach that consists of a learning procedure using standard genetic operators such as selection, mutation and crossover applied to a population of binary strings. The work improves the effectiveness of the whole system in terms of mean response time and job slowdown. In the future, we plan to integrate our work into known Grid simulators such as GridSim and HyperSim, which will allow us to measure the effectiveness of our strategy in existing simulators. Finally, as another perspective, we plan to extend the proposed model to a fully distributed model of the Grid (removal of the root from the tree structure).

REFERENCES
[1] I. Foster, C. Kesselman, and S. Tuecke, The anatomy of the grid: Enabling scalable virtual organizations, International Journal of High Performance Computing Applications, vol. 15, no. 3, 2001.
[2] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets, Journal of Network and Computer Applications, vol. 23, no. 3, pp. 187-200, 2000.
[3] M. Baker, R. Buyya, and D. Laforenza, Grids and grid technologies for wide-area distributed computing, International Journal of Software: Practice and Experience (SPE), vol. 32, no. 15, 2002.
[4] N. Mansour and G. C. Fox, A hybrid genetic algorithm for task allocation in multicomputers, in Proceedings of the Fourth International Conference on Genetic Algorithms (R. K. Belew, ed.), pp. 466-473, Morgan Kaufmann Publishers, 1991.
[5] M. D. Kidwell, Using genetic algorithms to schedule distributed tasks on a bus-based system, in Proceedings of the Fifth International Conference on Genetic Algorithms (S. Forrest, ed.), pp. 368-374, Morgan Kaufmann Publishers, 1993.
[6] D. L. Eager, E. D. Lazowska, and J. Zahorjan, Adaptive load sharing in homogeneous distributed systems, IEEE Transactions on Software Engineering, vol. 12, no. 5, pp. 662-675, May 1986.
[7] C. Xu and F. Lau, Load Balancing in Parallel Computers: Theory and Practice. Kluwer, Boston, MA, 1997.
[8] H. Johansson and J. Steensland, A performance characterization of load balancing algorithms for parallel SAMR applications, Uppsala University, Department of Information Technology, Tech. Rep. 2006-047, 2006.
[9] H. Shan, L. Oliker, R. Biswas, and W. Smith, Scheduling in heterogeneous grid environments: The effects of data migration, in Proceedings of ADCOM2004: International Conference on Advanced Computing and Communication, India, December 2004.
[10] J. Cao, D. P. Spooner, S. A. Jarvis, and G. R. Nudd, Grid load balancing using intelligent agents, Future Generation Computer Systems, vol. 21, pp. 135-149, 2005.
[11] Belabbas Yagoubi and Meriem Medebber, A load balancing model for grid environment, IEEE, 2007.
[12] Masaharu Munetomo, Yoshiaki Takai, and Yoshiharu Sato, A genetic approach to dynamic load balancing in a distributed computing system.
[13] Belabbas and Slimani, Dynamic load balancing strategy for grid computing, Proceedings of World Academy of Science, Engineering and Technology.
[14] Seong Lee and Chong Sun Hwang, A dynamic load balancing approach using genetic algorithm in distributed systems, IEEE, 1998.
