“Load balancing in grid environment for using
heirarichal approaach”





Submitted By:



For the award of the Degree of




Under the Guidance of:

Er.Sharad Chauhan

(H.O.D,CSE Deptt)





M.Tech – Synopsis Submission Form
[To be filled-in by the student]

Name of the student: Shalu Soni Roll No:7212157

Department:C.S.E Date of submission:

Name of Research Guide(s): Er.Sharad Chauhan

Title of thesis Synopsis (as approved guide) in block letters:

Signature of the Student, with Date:
[Recommendation by Evaluation Committee]
--------------------------------------------------------------------------------------------- -------------------
------------------------------------------------------------------------------------------------------- ---------
Date of Synopsis: Open Seminars:

Date of Committee meeting for approval of synopsis:
Decision of Evaluation Committee: Acceptable/not-acceptable :
Name & Signature of Evaluation Committee:
M.Tech Coordinator (CE) HOD (CSE)
Note: The following documents must be enclosed:
1. ONE soft copy on CD in .pdf of the Synopsis.
2. Declaration by the student


CHAPTER 1: INTRODUCTION……………………………………………………….....
1.1 Grid Computing…………………………………………………...…...
1.2 Problem areas in Grid Computing……………………………….……
1.3 Load Balancing Policies…………………………………..
1.4 Load Balancing Strategy…………………………………….............

CHAPTER 2: LITERATURE SURVEY……………………………………………………
2.1 Heirarichal Model………………………………………
2.2 Grid Topology………………………….……..…..

3.1 Objective
3.2 Characteristics
CHAPTER 4: REFERENCES…………………………………………………..…………


1.1 Grid Computing

Grids enable the sharing, selection, and aggregation of a wide variety of resources including
supercomputers, storage systems, data sources, and specialized devices that are geographically
distributed and owned by different organizations for solving large-scale computational and data
intensive problems in science, engineering, and commerce. The concept of Grid computing [17] started
as a project to link geographically dispersed supercomputers, but now it has grown far beyond its
original intent.
The Grid infrastructure [3] can benefit many applications, including collaborative
engineering, data exploration, high throughput computing, distributed supercomputing, and
service-oriented computing. Figure 1.1 on page 2 shows the architecture of grid computing.
The last decade has seen a substantial increase in commodity computer and network
performance, mainly as a result of faster hardware and more sophisticated software. Nevertheless, there
are still problems in the fields of science, engineering, and business, which cannot be effectively
dealt with using the current generation of supercomputers. In fact, due to their size and complexity,
these problems are often resource (computational and data) intensive and consequently entail the use of
a variety of heterogeneous resources that are not available in a single organization. The ubiquity of the
Internet as well as the availability of powerful computers and high-speed network technologies as low-
cost commodity components is rapidly changing the computing landscape.

Figure 1.1: Grid Computing Architecture

These technology opportunities have led to the possibility of using wide-area distributed
computers for solving large-scale problems, leading to what is popularly known as Grid computing.
The term Grid is chosen as an analogy to the electric power Grid that provides consistent, pervasive,
dependable, transparent access to electricity, irrespective of its source. Such an approach to network
computing is known by several names: meta computing, scalable computing, global computing,
Internet computing, and more recently Peer-to-Peer (P2P) computing [33] .Moreover, due to the
rapid growth of the Internet and Web, there has been a growing interest in Web-based distributed
computing, and many projects have been started and aim to exploit the Web as an infrastructure for
running distributed and parallel applications. In this context, the Web has the capability to act as a
platform for parallel and collaborative work as well as a key technology to create a pervasive and
ubiquitous Grid-based infrastructure [3]. The user essentially interacts with a resource broker that
hides the complexities of Grid computing. The broker discovers resources that the user can access
using information services, negotiates for access costs using trading services, maps tasks to resources
(scheduling), stages the application and data for processing (deployment), starts job execution, and

finally gathers the results.
1.2 Problem Areas in Grid Computing

 Lack of security: Since the number of users within a system is increased, new security
mechanisms are needed to ensure that malicious code cannot legitimate services running on
the grid.
 Fault Tolerance: Grid is a heterogeneous system in which failure cannot be avoided.
Failures of system resources have unfavourable affects on applications performance. Failures
can make a process run slower than normal or even stop it.
 Scheduling of Resource: A grid federates a large number of resources contributed by
individual machines into a greater total virtual resource. For applications that are grid-enabled,
the grid can offer a resource balancing effect by scheduling grid jobs on machines with low
utilization. This feature can prove invaluable for handling occasional peak loads of activity in
parts of a larger organization.
 Load Balancing Problem: Load Balancing is tuff to achieve in grid environment. Load
balancing is a technique to spread work between two or more computers, network links,
CPUs, hard drives, or other resources, in order to get optimal resource utilization, throughput,
or response time. Using multiple components with load balancing, instead of a single
component, may increase reliability through redundancy.
 Discovery of Resource: Grid environment is an environment in which applications are
composed of large set of hardware and software resources distributed among many locations.
Hence, the problem of resource discovery can be profoundly complex due to the size and the
complexity of Grid system [21].
 Load Balancing:A typical distributed system involves a large number of geographically
distributed worker nodes which can be interconnected and effectively utilized in order to
achieve performances not ordinarily attainable on a single node. Each worker node possesses
an initial load, which represents an amount of work to be performed, and may have a different
processing capacity.
To minimize the time needed to perform all tasks, the workload has to be evenly
distributed over all nodes which are based on their processing capabilities. This is why load
balancing is needed. The load balancing problem is closely related to scheduling and resource
allocation. It is concerned with all techniques allowing an evenly distribution of the workload
among the available resources in a system. The main objective of a load balancing consists
primarily to optimize the average response time of applications; this often means the
maintenance the workload proportionally equivalent on the whole system resources.

 Load balancing is usually described in the literature as either load balancing or load sharing.
These terms are often used interchangeably, but can also attract quite distinct definitions.
In the following, we distinguish between three forms of load balancing.
 Load Sharing: This is the coarsest form of load distribution. Load may only be placed on
idle resources, and can be viewed as a binary problem, where a resource is either idle or busy.
 Load Balancing: Where load sharing is the coarsest form of load distribution, load
balancing is the finest. Load balancing attempts to ensure that the workload on each resource
is within a small degree, or balance criterion, of the workload present on every other resource
in the system.
 Load Levelling: Load levelling introduces a third category of load balancing to describe the
middle ground between the two extremes of load sharing and load balancing. Rather than
trying to obtain a strictly even distribution of load across all resources, or simply utilizing idle
resources, load levelling seeks to avoid congestion on any resource. A Grid load balancer
receives applications from Grid users, selects feasible resources for these applications
according to acquired information, and finally generates application-to-resource mappings ,
on the basis of objective functions and predicted resource performance. Basically , a load
balancing system can be generalized into four basic steps:
 (1) Monitoring resource load and state.
 (2) Exchanging load and state information between resources.
 (3) Calculating the new load distribution.
 (4) Updating data movement.

1.3 Load balancing policies:

• Information policy: Specifies what workload information to be collected, when it is to be
collected and from where [27].
• Triggering policy: Determines the appropriate period to start a load balancing operation.
• Resource type policy: Classifies a resource as server or receiver of tasks according to its
availability status.
• Location policy: Uses the results of the resource type policy to find a suitable partner for a server
or receiver.


• Selection policy: Defines the tasks that should be migrated from overloaded resources (source) to
most idle resources (receiver).

1.4 Load Balancing Strategy
Principles: In accordance with the structure of proposed model, the load
balancing strategy is also hierarchical. Hence, we distinguish between three load balancing levels:
 Intra-site load balancing: In this first level, depending on its current load, each site
decides to start a load balancing operation. In this case, the site tries, in priority, to load
balance its workload among its computing elements.
 Intra-cluster load balancing: In this second level, load balance concerns only one cluster,
Ck, among the clusters of a grid. This kind of load balance is achieved only if some sites Sjk
fail to load balance its workload among their respective computing elements. In this case,
they need the assistance of its direct parent, namely cluster Ck.
 Intra-grid load balancing: The load balance at this level is used only if some clusters
Ck’s fail to load balance their load among their associated sites.
The main advantage of this strategy is to prioritize local load balancing first (within a site,
then within a cluster and finally on the whole grid). The goal of this neighbourhood strategy is to
decrease the amount of messages between computing elements. As consequence of this goal, the
overhead induced by our strategy is reduced. In terms of complexity, we can easily prove that the
amount of messages is linear according to the number of computing elements.




In order to well explain as model, we first define the topological structure for a grid
computing. Research works on load balancing has been focused on system-level load balancing or
task scheduling [12],[14]. Their main objective is to maximize the overall throughput or average
response time. Most application-level load balancing approaches are oriented on application
partitioning via graph algorithms. However, it does not address the issue of reducing migration cost.
That is the cost entailed by load redistribution, which can consume order of magnitude more time than
the actual computation of a new decomposition. Some works have proposed a latency-tolerant
algorithm that takes advantage of overlapping the internal data computation and incoming data
communication to reduce data migration cost. Unfortunately, it requires applications to provide such a
parallelism between data processing and migration, which restricts its applicability. Propose a
dynamic load balancing approach to provide application level load balancing for individual parallel
jobs in Grid computing environment. Agent-based approaches [20] have been tried to provide load
balancing in cluster of machines [29]. Enhance the MPI_Scatterv primitive to support master-slave
load balancing by taking into consideration the optimization of computation and data distribution
using a linear programming algorithm. However, this solution is limited to static load balancing.[19]
propose an optimal data migration algorithm in diffusive dynamic load balancing through the
calculation of Lagrange multiplier of the Euclidean form of transferred weight. This work can
effectively minimize the data movement in homogeneous environments, but it does not consider the
network heterogeneity. In particular, workload migration is critical to be considered because the wide
area network performance is dynamic, changing throughout execution, instable, etc., in addition to
considering the resource heterogeneity. This communication aspect is neglected in traditional
application-level load balancing strategies. Scheduling sets of computational tasks on distributed
platforms is a key issue but a difficult problem. Although, as mentioned above, a large number of
scheduling techniques and heuristics have been presented in the literature, most of them target only

homogeneous resources. However, modern computing systems, such as the computational Grid, are
most likely to be widely distributed and strongly heterogeneous. Therefore, it is essential to consider
the impact of heterogeneity on the design and analysis of scheduling techniques. The traditional
objective, when scheduling sets of computational tasks, is to minimize the overall execution time
called makespan. However, in the context of heterogeneous distributed platforms, makespan
minimization problem is in most cases NP-complete, sometimes even APX-complete. When dealing
with large scale systems, an absolute minimization of the total execution time is not the only objective
of a load balancing strategy. We think that the communication cost, induced by load redistribution, is
also a critical issue. For this purpose, we propose, in this paper, a novel load balancing strategy to
address the new challenges in Grid computing.
2.2 Grid topology
We suppose that a grid computing (see Fig. 2.1) is a finite set of G clusters Ck, interconnected by
gates gtk, k ∈ {0, ...,G − 1},where each cluster contains one or more sites Sjk interconnected by
switches SWjk and every site contains some Computing Elements CEijk and some Storage Elements
SEijk, interconnected by a local area network.

Figure 2.1: Grid topology


3.1 Objective
 The objective of model is to implement load balancing in grid environment using Hierarchical
 The proposed algorithm will balance the load in the grid, based on the queue length of each
resource and transfer the job to the resource having minimum queue length.

From the base paper that paper based on homogeneous environment but our proposed work is
based on heterogeneous & that paper implement on grid sim toolkit , proposed work based on
java(create own simulator on java) & use MS access as back end for storing the values.

Proposed work:
1. We create the clusters
2. calculate the load
3. intrasite load balancing
4. intersite load balancing
5. intercluster load balancing

3.2 Characteristics of the proposed model

The main features of our proposed load balancing generic model are listed below:
1) It is hierarchical: This characteristic facilitate the information flow through the tree and well
defines the message traffic in our strategy.
2) It supports scalability of grids: Adding or removing entities (computing
elements, sites or clusters) are very simple operations in our model
(adding or removing nodes, sub-trees).
3) It is totally independent from any physical architecture of a grid: The transformation of a grid into a
tree is an univocal transformation. Each grid corresponds to one and only one tree.


[1] Belabbas Yagoubi and Yahya Slimani , “Dynamic Load Balancing Strategy for Grid Computing,”
World Academy of Science, Engineering and Technology, 2006.
[2] E. Badidi, “Architecture and services for load balancing in object distributed systems,” PhD
thesis, Faculty of High Studies, University of
Montreal, Mai 2000.
[3] F. Berman, G. Fox, and Y. Hey, “Grid Computing: Making the Global Infrastructure a Reality,”
Wiley Series in Communications Networking & Distributed Systems, 2003.
4. B. Yagoubi, and M. Medebber, “A load balancing model for grid environment,” In Proceeding of
International Symposium on Computer and Information Sciences (ISCISC 2007), pp. 1-7, 7
November 2007.
5. K. Lu, and A. Zomaya, “A Hybrid Policy for Job Scheduling and Load
Balancing in Heterogeneous Computational Grids,” In Proceeding of 6
International Symposium on
Parallel and Distributed Computing, pp.19-26, 5 July 2007.
[6] M. Baker, R.Buyya, and D. Laforenza, “Grids and grid technologies
for wide-area distributed computing,” International Journal of Software:
Practice and Experience (SPE), 2002.
[7] M. Dobber, R. Mei, and G. Koole, “ Dynamic Load Balancing and Job Replication in a Global-
Scale Grid Environment: A Comparison,” IEEE Transaction on Parallel and Distributed Systems,
20(2): 207- 218, February 2009.
[8] H. Shan, L. Oliker, R. Biswas, and W. Smith, “Scheduling in heterogeneous grid
environments: The effects of data migration,” In Proceedings of (ADCOM2004) International
Conference on Advanced Computing and Communication, India, December 2004.
[9] J. Cao, D.P. Spooner, S. A. Jarvi, and G.R. Nudd, “Grid load balancing using intelligent agents,”
Future Generation Computer Systems, 135-149, January 2005.

[10] Youchan Zhu, Lei An, Shuangxi Liu, “ A Resource Discovery Method of Grid Based on
Resource Classification,” In Proceedings of First International Conference on Intelligent Networks
and Intelligent Systems, pp. 716-719, August 2008.
11. P. K. Suri, and Manpreet Singh, “An Efficient Decentralized Load Balancing Algorithm For
Grid,” IEEE 2
International Advance Computing Conference, pp. 10-13, February 2010.
[12] Bin Lu, and Hongbin Zhang, “Grid Load Balancing Scheduling Algorithm Based on Statistics
Thinking,” IEEE 9th International Conference, pp. 288-292, 2008.
13. Han Zhao, Xinxin Liu, and Xiaolin Li, “DLBEM: Dynamic Load Balancing Using Expectation-
Maximization,” IEEE, August 2008.