Critical Analysis of Load Balancing Strategies For Cloud Environment
Anurag Jain*
Department of Computer Science and Engineering,
Chandigarh Engineering College,
Landran, Mohali, Punjab, India
Email: er.anuragjain14@gmail.com
*Corresponding author
Rajneesh Kumar
Department of Computer Science and Engineering,
Maharishi Markandeshwar Engineering College,
Maharishi Markandeshwar University,
Mullana, Ambala, India
Email: drrajneeshgujral@mmumullana.org
1 Introduction
Cloud computing refers to two things: first, applications delivered as services over the
internet and, second, the software and hardware in the data centres that provide those
services. Virtualisation, scalability, ubiquity and pay-per-use pricing are some of the
essential characteristics of a cloud environment. Cloud services are provided in the form
of three models, namely:
1 software as a service
2 platform as a service
3 infrastructure as a service.
These models can be implemented as public, private or hybrid clouds (Jain and Kumar, 2014;
Sharma, 2015a). The random nature of job arrivals and the large volume of data
exchanged over the network leave components sometimes over-utilised and sometimes
under-utilised. This causes delays in service and occasionally failures, which
lower customer satisfaction. To solve this issue, an efficient load balancing
algorithm is needed for the cloud environment (Khiyaita et al., 2012; Sasikala, 2011).
This paper presents an in-depth analysis of the load balancing strategies used in a
cloud environment and compares them through simulation on several metrics suitable
for cloud environments.
This paper is organised as follows: Section 2 gives an overview of load balancing, its
types, challenges, issues and metrics in a cloud environment. Section 3 discusses big
data and the problems associated with it. Section 4 gives an overview of the cloud
analyst simulator. Section 5 contains the in-depth analysis of load balancing approaches.
Section 6 presents the results and a comparative analysis of the load balancing techniques
discussed in Section 5, obtained using the cloud analyst simulator. Section 7 concludes
the paper and outlines the scope for enhancement and future work.
Load balancing can be defined as the process of distributing tasks among multiple
computers, processes, disks or other resources in order to achieve optimal resource
utilisation and to reduce computation time. The future of cloud computing depends on
effective installation of infrastructure, efficient utilisation of resources and dynamic
transformation of load. There is therefore a need for an efficient and effective load
balancing strategy suited to cloud environments. Load balancing algorithms can be
classified in two ways.
Depending on how tasks are assigned and how resources are allocated to nodes, an
algorithm can be of three types as follows:
• Centralised approach: In this approach, a single scheduler or node manages all the
task assignment and resource allocation activities.
• Distributed approach: In this approach, every node not only maintains its own load
vector, but also maintains the load and resource information of some nearby nodes. If
a node is not able to handle the task, it migrates the task to another node whose load
is least. Decisions are made in a distributed manner. This is more suitable for cloud
environments.
• Mixed approach: A mixture of the two approaches is used to get the benefits of
both. Scheduling is usually done on two levels: a centralised approach is followed
on the first level and a distributed approach on the second. This approach is best
suited to cloud environments.
Depending on the informational status of the nodes (system topology), it can be of three
types as follows:
• Dynamic approach: This approach considers the current parameters while assigning
the task to a node. It is more suitable for cloud environments. Such algorithms are
hard to implement as they have to constantly monitor the nodes and task progress
and take the decision based upon that.
• Adaptive approach: This approach also considers the current parameters while
making a task assignment decision. It differs from the dynamic approach in that it
changes not only the parameters of the load balancing strategy but also the
algorithm itself as the system load changes. It is the hardest to implement, but it
is the most suitable for cloud environments (Khiyaita et al., 2012; Nuaimi et al.,
2012).
3 Big data
Data sources such as social media sites, geographical information systems and weather
forecasting systems produce data prolifically, leading to exponential growth of
structured and unstructured data that is heterogeneous in nature. This rampant
generation of data has given birth to the concept of big data. Present data models are
not capable of handling big data: data outlets are generating data at an ever faster
pace, while data models are improving too slowly to cope with the inundation of data
(Sharma, 2015b). Significant dimensions of big data are as follows:
• Data processing: Data can be processed either in streaming or in batch mode.
• Data source: Social media sites, online shopping sites, geographical information
sites, weather forecasting sites and trading sites are major big data producing outlets.
• Data modelling: Because big data comes in different formats, appropriate
transformation operations are required to convert it from one format to another.
Normalisation techniques are applied to remove data redundancy. Data cleansing
techniques are applied to remove corrupt data and to separate complete data from
incomplete data.
• Data formatting: Because the data outlets generating big data are so diverse, its
format can be structured, unstructured or semi-structured, and it may be
homogeneous as well as heterogeneous in nature.
• Data management: Traditional data models are not capable of managing big
data. To resolve this hurdle, researchers have proposed NoSQL (Not only SQL) data
management systems, which have four important types:
1 Document data model: Data is stored as a collection of documents. Documents
are encoded in XML or JSON format.
2 Graph data model: This model stores the data along the vertices and edges of a
graph structure.
3 Wide column store: This model stores the data in a single extendable column. It
is suitable for distributed data storage.
4 Key value store: This model stores the data as key value pairs, where the data is
stored as the value and is retrieved using the key element of the pair (Sharma
et al., 2015a).
Sharma et al. (2014, 2015b) have given a comparative analysis of the various NoSQL
data models proposed to handle big data. However, because data models capable of
handling big data are developing slowly, this is still a challenging research area.
4 Cloud analyst
Cloud analyst is a GUI-based tool that can simulate a cloud environment. It is
built on the CloudSim architecture. Its components include the following:
• Internet characteristics: This part models the characteristics of the internet during
the simulation. It includes the latencies and available bandwidths between regions,
the present traffic levels, and present performance level information for the data
centres.
• VM load balancer: This component simulates the load balancing policies used
by data centres while serving the allotted tasks.
• Cloud app service broker: It simulates the service brokers which control the traffic
routing between user bases and data centres (Wickremasinghe et al., 2010).
Figure 2 Cloud analyst user interface (see online version for colours)
In Armstrong et al. (1998), the minimum execution time (MET) approach to load balancing
has been discussed. In this approach, tasks are assigned to resources in a first come,
first served manner. Each task is scheduled on the virtual machine which takes the least
execution time (ET) for it. The MET scheduler accesses the VM allocation table to assign
tasks to a virtual machine. The VM allocation table stores the virtual machine id and
virtual machine power. A virtual machine with more processing power can execute a task
faster. This centralised load balancing approach is static in nature: it considers
neither the present load nor the task size.
Algorithm MET ( )
{
    Populate the VM allocation table.
    While (there is a task received by MET scheduler)
    {
        Choose that virtual machine from VM allocation table whose processing power is highest.
        Assign the received task to selected machine.
    }
}
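As a rough illustration of the MET steps above, the following Python sketch picks the virtual machine with the highest processing power for every task. The `VM` class, its `power` field and the sample values are assumptions made for this example; they are not taken from the paper or from the cloud analyst simulator.

```python
from dataclasses import dataclass


@dataclass
class VM:
    vm_id: int
    power: float  # processing power in an arbitrary unit (hypothetical)


def met_assign(tasks, vms):
    """Assign every task, in arrival order, to the most powerful VM."""
    fastest = max(vms, key=lambda vm: vm.power)  # static choice: load is ignored
    return [(task, fastest.vm_id) for task in tasks]


vms = [VM(0, 1000.0), VM(1, 2500.0), VM(2, 1500.0)]
schedule = met_assign(["t1", "t2", "t3"], vms)
# schedule == [("t1", 1), ("t2", 1), ("t3", 1)]
```

Because the choice ignores queue length and task size, every task lands on the same virtual machine, which is the static behaviour described above.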
In Ritchie and Levine (2005), the minimum completion time (MCT) approach to load
balancing has been discussed. Tasks are assigned to resources in a first come, first
served manner. Each task is scheduled on the virtual machine which gives the least
completion time for it. To assign a task to a virtual machine, the MCT scheduler accesses
the VM allocation table, which stores the virtual machine id, virtual machine power,
number of tasks in queue and completion time of each virtual machine. This approach is
dynamic in nature as it considers the current load of the virtual machines.
Algorithm MCT ( )
{
    Populate the VM allocation table.
    While (there is a task received by MCT scheduler)
    {
        Choose that virtual machine from VM allocation table whose completion time is least.
        Assign the received task to selected machine.
        Update the VM allocation table.
    }
}
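A minimal sketch of MCT follows, assuming each task is described by its length in instructions and each virtual machine by a processing power in instructions per second. The list `ready` plays the role of the completion time column of the VM allocation table; the function name and the numbers are illustrative only.

```python
def mct_assign(task_lengths, powers):
    """Give each task, in arrival order, to the VM that would finish it earliest."""
    ready = [0.0] * len(powers)  # time at which each VM's queue drains
    schedule = []
    for length in task_lengths:
        finish = [ready[i] + length / powers[i] for i in range(len(powers))]
        best = min(range(len(powers)), key=finish.__getitem__)  # ties: lowest id
        ready[best] = finish[best]  # update the VM allocation table
        schedule.append(best)
    return schedule, ready


schedule, ready = mct_assign([400, 400, 400], powers=[100.0, 200.0])
# schedule == [1, 0, 1]; ready == [4.0, 4.0]
```

Unlike MET, the second task goes to the slower machine because the faster one is already busy.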
In Wang et al. (2010), the min-min approach to load balancing has been discussed. This
algorithm does not follow the first come, first served sequence; instead it uses two
criteria: MET and MCT. Tasks with the minimum execution time are preferred over tasks
with the maximum execution time. The algorithm chooses the task which has the MET and
assigns it to the virtual machine which gives the MCT. It involves two minimum selection
criteria, so it is called the min-min approach. This centralised approach considers both
the present load of the virtual machines and the task size. However, because tasks are
buffered in order to choose the one with the MET, it may result in some delay in
response time.
Algorithm min-min ( )
{
    Populate the VM allocation table.
    While (there is a task received by scheduler)
    {
        Store the task in task allocation table.
        If (task allocation table is filled)
        {
            Sort the tasks in ascending order of the number of instructions in the task.
            For (i=1 to n)
            // n is the index of last task in task allocation table
            {
                Choose that virtual machine from VM allocation table whose completion time is lowest.
                Assign the ith task to selected machine.
                Update the VM allocation table.
            }
        }
    }
}
It may cause a load imbalance problem if a large number of larger tasks are present. It
also causes starvation of the maximum execution time tasks.
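The batch behaviour of min-min can be sketched as below, under the same assumptions as before: a task is represented only by its length in instructions, a virtual machine only by its processing power, and the whole buffer is scheduled at once. The function name and sample values are hypothetical.

```python
def min_min(task_lengths, powers):
    """Schedule a buffered batch: shortest task first, each on the earliest-finishing VM."""
    ready = [0.0] * len(powers)  # completion time of the work queued on each VM
    schedule = []
    for length in sorted(task_lengths):  # first minimum: smallest task
        finish = [ready[i] + length / powers[i] for i in range(len(powers))]
        best = min(range(len(powers)), key=finish.__getitem__)  # second minimum: MCT
        ready[best] = finish[best]
        schedule.append((length, best))
    return schedule


schedule = min_min([600, 200, 400], powers=[100.0, 200.0])
# schedule == [(200, 1), (400, 1), (600, 0)]
```

The largest task is placed last, which is how long tasks can be delayed when short tasks keep arriving.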
In Wang et al. (2010), the max-min approach to load balancing has been discussed. This
algorithm does not follow the first come, first served sequence. It uses two criteria:
maximum execution time and MCT. Tasks with the maximum execution time are preferred over
tasks with the MET. The algorithm chooses the task which has the maximum execution time
and assigns it to the virtual machine which gives the MCT. It involves one maximum and
one minimum selection criterion, so it is called the max-min approach. This centralised
approach considers both the present load of the virtual machines and the task size.
However, because tasks are buffered in order to choose the one with the maximum execution
time, it may result in some delay in response time.
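Algorithm max-min ( )
{
    Populate the VM allocation table.
    While (there is a task received by scheduler)
    {
        Store the task in task allocation table.
        If (task allocation table is filled)
        {
            Sort the tasks in descending order of the number of instructions in the task.
            For (i=1 to n)
            // n is the index of last task in task allocation table
            {
                Choose that virtual machine from VM allocation table whose completion time is lowest.
                Assign the ith task to selected machine.
                Update the VM allocation table.
            }
        }
    }
}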
It gives a better schedule if only a few minimum execution time tasks are present. But at
the same time it causes starvation of the minimum execution time tasks.
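To make the difference between the two orderings concrete, the sketch below schedules one buffered batch either largest first (max-min) or smallest first (min-min) and reports the makespan, i.e. the time at which the last virtual machine finishes. The workload, the two identical machines and the function name are assumptions chosen for illustration.

```python
def batch_makespan(task_lengths, powers, largest_first):
    """Sort the buffered tasks, then give each one to the VM that finishes it earliest."""
    ready = [0.0] * len(powers)
    for length in sorted(task_lengths, reverse=largest_first):
        finish = [ready[i] + length / powers[i] for i in range(len(powers))]
        best = min(range(len(powers)), key=finish.__getitem__)
        ready[best] = finish[best]
    return max(ready)


tasks = [1, 1, 1, 1, 8]
powers = [1.0, 1.0]
max_min = batch_makespan(tasks, powers, largest_first=True)   # 8.0
min_min = batch_makespan(tasks, powers, largest_first=False)  # 10.0
```

On this particular workload the largest-first order finishes earlier; other workloads can behave differently.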
In Tyagi and Kumar (2015), the throttled load balancing strategy has been discussed.
The throttled load balancer uses a single job scheduler, which makes it centralised in
nature. The job scheduler maintains a table named the VM allocation table, which stores
the id and status of all the virtual machines. A virtual machine can have only two
states, occupied or idle, denoted by 1 and 0 respectively in the array. Initially all
virtual machines are idle. On receiving a task, the job scheduler searches for a virtual
machine which suits the requirements of that task and is not busy. If it finds such a
virtual machine, then it assigns the task to that virtual machine. If no virtual machine
is available to accept the job, then the job scheduler stores the job in a waiting list
maintained at the job scheduler level. No queues are maintained at the virtual machine
level. A virtual machine can accommodate only one task, and another task can be
allocated only when the current task has finished (Mohapatra et al., 2013).
Algorithm TLB ( )
{
    Populate the VM allocation table.
    While (Job scheduler receives a new task or waiting list is non empty)
    {
        Job scheduler scans the array to find the idle virtual machine which suits the
        requirements of the task in terms of size.
        If (Job scheduler finds suitable virtual machine) then
        {
            It assigns that task to that virtual machine.
            It updates the status of virtual machine in the VM allocation table.
        }
        Else
        {
            Stores that task in the waiting list.
        }
        If (any virtual machine has completed the assigned task) then
        {
            It notifies the job scheduler and job scheduler updates the status of that
            virtual machine in the VM allocation table
        }
    }
}
The TLB approach is static in nature and easy to implement. Its major drawback is that,
on the arrival of a new job, the job scheduler has to scan the index table repeatedly
until it finds a suitable virtual machine.
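The throttled policy can be sketched as a small class that keeps the occupied/idle table and the waiting list. This is a simplified illustration: the class and method names are hypothetical, and the check that a virtual machine suits the size of the task is left out.

```python
from collections import deque


class ThrottledBalancer:
    def __init__(self, vm_ids):
        self.busy = {vm_id: False for vm_id in vm_ids}  # VM allocation table
        self.waiting = deque()                          # kept at scheduler level

    def submit(self, task):
        """Return the id of the VM that takes the task, or None if it must wait."""
        for vm_id, occupied in self.busy.items():       # linear scan of the table
            if not occupied:
                self.busy[vm_id] = True
                return vm_id
        self.waiting.append(task)
        return None

    def release(self, vm_id):
        """Called when a VM finishes; returns the waiting task it picks up, if any."""
        if self.waiting:
            return self.waiting.popleft()               # VM stays occupied
        self.busy[vm_id] = False
        return None


lb = ThrottledBalancer([0, 1])
placed = [lb.submit(t) for t in ("a", "b", "c")]        # [0, 1, None]
next_task = lb.release(0)                               # "c" now runs on VM 0
```

The scan inside `submit` corresponds to the repeated table lookup named above as the main drawback.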
In Lu et al. (2011), the join idle queue scheduling approach to load balancing has been
discussed. The authors have implemented two-level scheduling, realised through
distributed schedulers. The number of schedulers is much smaller than the number of
virtual machines. Every scheduler maintains a queue of idle virtual machines. On
receiving a task, a scheduler first consults its idle queue. If it finds an idle virtual
machine, it immediately assigns the task to that virtual machine and removes that
virtual machine from its idle queue. If it does not find any idle virtual machine, it
allots the task to a randomly chosen virtual machine. After completing a job, a virtual
machine reports its status to the idle queue of a randomly chosen scheduler. This
approach separates the task of discovering idle servers from the task of assigning jobs
to virtual machines.
Algorithm JIQ ( )
{
    Every Scheduler maintains the list of idle virtual machines in its idle queue
    While (there is a task received by data centre)
    {
        Data centre forwards the task randomly towards any of the schedulers.
        On receiving a task, scheduler checks its idle queue.
        If (idle queue is not empty) then
        {
            Scheduler removes the idle server from the queue and assigns the received task.
        }
        else
        {
            Scheduler assigns the task to any randomly selected virtual machine.
        }
        If (any virtual server gets idle) then
        {
            Virtual server randomly selects scheduler and adds itself to the idle queue
            of that scheduler.
        }
    }
}
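The two-level structure can be sketched as follows. The class name, method names and the seeded random generator are assumptions for this example; a real deployment would run the schedulers as separate processes rather than as lists inside one object.

```python
import random
from collections import deque


class JIQ:
    def __init__(self, n_schedulers, vm_ids, seed=0):
        self.rng = random.Random(seed)       # seeded only to make runs repeatable
        self.vm_ids = list(vm_ids)
        self.idle = [deque() for _ in range(n_schedulers)]  # one idle queue each
        for vm in self.vm_ids:               # initially every VM is idle
            self.report_idle(vm)

    def report_idle(self, vm):
        """An idle VM joins the idle queue of a randomly chosen scheduler."""
        self.rng.choice(self.idle).append(vm)

    def dispatch(self):
        """Route one task: random scheduler, then idle VM if known, else random VM."""
        queue = self.rng.choice(self.idle)
        if queue:
            return queue.popleft()
        return self.rng.choice(self.vm_ids)


jiq = JIQ(n_schedulers=2, vm_ids=range(6))
vm = jiq.dispatch()      # id of the VM that receives the task
jiq.report_idle(vm)      # the VM reports back once it has finished
```

Finding idle machines (`report_idle`) happens off the dispatch path, which is the separation of concerns the approach is built on.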
In Gupta et al. (2007), the join shortest queue scheduling approach to load balancing in
a distributed environment has been discussed. This approach uses only a single
scheduler, which dispatches each task to the virtual machine whose queue length is
smallest. The scheduler maintains a VM allocation table which stores the queue length
of each virtual machine. This helps the scheduler to redirect the received task to a
suitable virtual machine. No queues are maintained at the scheduler level. Queues are
maintained only at the virtual machine level.
Algorithm JSQ ( )
{
    Scheduler initialises the VM allocation table.
    While (there is a task received by JSQ scheduler)
    {
        Scheduler forwards the task towards that VM whose queue length is smallest and
        updates VM allocation table
    }
}
This approach is centralised in nature and does not provide the facility of task
migration from the queue of one virtual machine to another. So, it does not show good
performance in terms of response time, throughput and resource utilisation.
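A minimal sketch of the JSQ dispatch step, assuming the VM allocation table is a dictionary from virtual machine id to current queue length (the dictionary and function name are illustrative):

```python
def jsq_dispatch(queue_len):
    """Send one task to the VM with the shortest queue and update the table."""
    vm = min(queue_len, key=queue_len.get)  # ties go to the first VM in the table
    queue_len[vm] += 1
    return vm


table = {0: 2, 1: 0, 2: 1}
chosen = [jsq_dispatch(table) for _ in range(3)]
# chosen == [1, 1, 2]; table == {0: 2, 1: 2, 2: 2}
```

Once a task has joined a queue it stays there, since nothing in this scheme moves queued tasks between virtual machines.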
In Chaudhary and Kumar (2014), the round robin scheduling algorithm for load
balancing in a cloud environment has been discussed. The basis of this algorithm is the
principle of time scheduling. The scheduler assigns the tasks received through the data
centre controller to a list of virtual machines on a rotation basis. The scheduler
assigns the first request to a randomly selected virtual machine from the list of
available virtual machines. After the task assignment, the virtual machine id is moved
to the end of the list of available virtual machines.
Algorithm RR ( )
{
    Currentvm=0; //currentvm holds the id of last selected VM
    While (scheduler receives a task)
    {
        Currentvm++;
        if (Currentvm >= n) // n is the no of VMs; ids run from 0 to n-1
        {
            Currentvm = 0;
        }
        Assign the task to the virtual machine having id=Currentvm
    }
}
This approach has very simple logic. But it is not suitable for a dynamic environment,
as it considers neither the present load of the machines nor the task size.
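The rotation can be sketched in a few lines with `itertools.cycle`. In this illustration the rotation starts at the first virtual machine in the list rather than at a randomly selected one, and the function name is hypothetical.

```python
from itertools import cycle


def round_robin(tasks, vm_ids):
    """Pair each task with the next VM id in a fixed rotation."""
    rotation = cycle(vm_ids)
    return [(task, next(rotation)) for task in tasks]


schedule = round_robin(["a", "b", "c", "d", "e"], vm_ids=[0, 1, 2])
# schedule == [("a", 0), ("b", 1), ("c", 2), ("d", 0), ("e", 1)]
```

Nothing about the tasks or the machines enters the decision, which is why the result is the same whatever the current load.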
In Bagwaiya and Raghuwanshi (2014), the equally spread current execution load
balancing approach for the cloud environment has been discussed. This algorithm uses the
spread spectrum approach. It works in such a way that the number of active tasks on each
virtual machine is the same at any instant. The scheduler maintains a VM allocation
table which stores the VM id, active task count and VM status, and updates the table
whenever a new task is assigned or a task completes. Based on the size and requirements
of a received task, it finds a lightly loaded and appropriate virtual machine and
assigns the task to it. The scheduler periodically analyses the load of the virtual
machines and reshuffles the load to keep it equal by transferring load from overloaded
virtual machines to under-loaded virtual machines. Repeated scanning of the queue
results in additional computational overhead.
Algorithm ESCE ( )
{
    Job scheduler populates the VM allocation table.
    While (Job scheduler receives new task or queue of any virtual machine has crossed
    the threshold limit)
    {
        Job scheduler scans the active task count of each VM to find the lightly loaded
        virtual machine which suits the requirements of the task.
        If (Job scheduler finds suitable virtual machine) then
        {
            It assigns that task to that virtual machine.
            It updates the VM allocation table.
        }
        If (any virtual machine has completed the assigned task) then
        {
            It updates the VM allocation table.
        }
        If (load of any virtual machine has crossed the threshold limit) then
        {
            Job scheduler finds the lightly loaded suitable machine.
            Transfer the task from overloaded VM to lightly loaded VM.
            Update the VM allocation table.
        }
    }
}
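The assignment and reshuffling steps can be sketched as below. The class, its method names and the simple count-based threshold are assumptions for illustration; task size and VM suitability checks are omitted.

```python
class ESCE:
    def __init__(self, vm_ids, threshold):
        self.active = {vm: [] for vm in vm_ids}  # VM id -> active tasks
        self.threshold = threshold               # max active tasks before reshuffling

    def _lightest(self):
        return min(self.active, key=lambda vm: len(self.active[vm]))

    def assign(self, task):
        vm = self._lightest()                    # lightly loaded VM
        self.active[vm].append(task)
        return vm

    def complete(self, vm, task):
        self.active[vm].remove(task)

    def rebalance(self):
        """Move tasks from VMs above the threshold to the lightest VM."""
        moved = []
        for vm, tasks in self.active.items():
            while len(tasks) > self.threshold:
                target = self._lightest()
                if len(self.active[target]) + 1 >= len(tasks):
                    break                        # moving would not improve balance
                task = tasks.pop()
                self.active[target].append(task)
                moved.append((task, vm, target))
        return moved


lb = ESCE([0, 1, 2], threshold=1)
placed = [lb.assign(t) for t in "abcdef"]        # [0, 1, 2, 0, 1, 2]
for vm, task in [(1, "b"), (1, "e"), (2, "c")]:
    lb.complete(vm, task)                        # VM 0 is now the most loaded
moved = lb.rebalance()                           # [("d", 0, 1)]
```

Each call to `_lightest` rescans the whole table, which mirrors the scanning overhead mentioned above.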
• Grouping model:
a No. of concurrent users from a single user base = 1,000
b No. of concurrent requests a single application server instance can sustain = 100
c Length of executable instructions per request = 250 bytes.
User  Region  No. of requests    Data size per     Peak hour   Peak hour  Average peak  Average off-peak
base          per user per hour  request in bytes  start time  end time   hour users    hour users
UB1   0       100                100               13          15         86,500        8,650
UB2   1       90                 150               15          17         56,500        5,650
UB3   2       120                200               20          22         97,500        9,750
UB4   3       150                175               1           3          116,500       11,650
UB5   4       70                 125               21          23         20,000        1,200
UB6   5       60                 135               9           11         6,500         650
Note: Duration of simulation: 12 hrs; service broker policy: optimise response time
Source: Internet World Stats (2015)
6.2 Results
Cost (in $)
No. of    Round   Equally  Throttled  Join idle  Join shortest  Minimum
users     robin   spread              queue      queue          completion time
50,000    353     340      320        47         62             50
100,000   695     680      670        81         117            100
150,000   1,033   1,020    1,000      115        170            150
200,000   1,370   1,350    1,323      149        219            200
250,000   1,719   1,700    1,695      184        272            250
300,000   2,061   2,020    2,040      218        323            300
Figure 12 Data processing time comparisons (see online version for colours)
Regression   Round   Equally  Throttled  Join idle  Join shortest  Minimum
analysis     robin   spread              queue      queue          completion time
Value of R   0.9988  0.9988   0.9992     0.9769     0.9914         0.995567
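The text does not state which series or which curve model the R values above were fitted to. Purely as an illustration of how a correlation coefficient of this kind can be computed, the sketch below evaluates Pearson's r between the number of users and the round robin cost column of the cost table; the choice of column and of a linear model is an assumption, so the result is not expected to reproduce the reported values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


users = [50_000, 100_000, 150_000, 200_000, 250_000, 300_000]
round_robin_cost = [353, 695, 1033, 1370, 1719, 2061]  # from the cost table
r = pearson_r(users, round_robin_cost)
print(f"R = {r:.4f}")
```

A value close to 1 indicates that cost grows almost linearly with the number of users for that policy.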
Figure 14 Curve estimation of round robin algorithm (see online version for colours)
Figure 15 Curve estimation of equally spread algorithm (see online version for colours)
Figure 16 Curve estimation of throttled algorithm (see online version for colours)
Figure 17 Curve estimation of join idle queue algorithm (see online version for colours)
Figure 18 Curve estimation of join shortest queue algorithm (see online version for colours)
Figure 19 Curve estimation of MCT algorithm (see online version for colours)
Because of the inundation of tasks at the data centre, an efficient load balancing
strategy is required for efficient utilisation of resources and satisfaction of the
service level agreement. There are many different load balancing strategies for the
cloud environment, but no single one is good enough to satisfy all the related
parameters. In this paper, the authors have thoroughly discussed nine different load
balancing approaches for the cloud environment. Of these nine approaches, six have been
tested using the cloud analyst simulator in different simulation environments on
parameters such as response time, data processing time and cost. It has been identified
that the join idle queue approach is the most suitable amongst the tested approaches.
The simulation results have also been validated by regression analysis.
As future work, the authors plan to propose a hybrid load balancing approach
that combines the best characteristics of the analysed techniques. The authors also plan
to work on load balancing of big data at the data centre level. The authors expect this
work to serve as helpful guidance on load balancing in cloud computing.
References
Amandeep, V.Y. and Mohammad, F. (2014) ‘Different strategies for load balancing in cloud
computing environment: a critical study’, International Journal of Scientific Research
Engineering & Technology (IJSRET), Vol. 3, No. 1, pp.85–90.
Armstrong, R., Hensgen, D. and Kidd, T. (1998) ‘The relative performance of various mapping
algorithms is independent of sizable variances in run-time predictions’, Proceedings of
Seventh Heterogeneous Computing Workshop (HCW 98), pp.79–87.
Bagwaiya, V. and Raghuwanshi, S.K. (2014) ‘Hybrid approach using throttled and ESCE load
balancing algorithms in cloud computing’, Proceedings of International Conference on Green
Computing Communication and Electrical Engineering (ICGCCEE), pp.1–6.
Chaudhary, D. and Kumar, B. (2014) ‘Analytical study of load scheduling algorithms in cloud
computing’, Proceedings of International Conference on Parallel, Distributed and Grid
Computing (PDGC), pp.7–12.
Gupta, V., Harchol-Balter, M., Sigman, K. and Whitt, W. (2007) ‘Analysis of join-the-shortest-
queue routing for web server farms’, Proceedings of International Symposium on Computer
Modelling, Measurement and Evaluation, pp.1–29.
Internet World Stats (2015) http://www.internetworldstats.com (accessed August 2015).
Jain, A. and Kumar, R. (2014) ‘A taxonomy of cloud computing’, International Journal of
Scientific and Research Publications, Vol. 4, No. 7, pp.1–5.
Khiyaita, A., Zbakh, M., El Bakkali, H. and El Kettani, D. (2012) ‘Load balancing cloud
computing: state of art’, Proceedings of National Days of Network Security and Systems
(JNS2), pp.106–109.
Lu, Y., Xie, Q., Kliot, G., Geller, A., Larus, J. and Greenberg, A. (2011) ‘Join-idle-queue: a novel
load balancing algorithm for dynamically scalable web services’, Performance Evaluation,
Vol. 68, No. 11, pp.1056–1071.
Mohapatra, S., Rekha, K.S. and Mohanty, D.S. (2013) ‘A comparison of four popular heuristics for
load balancing of virtual machines in cloud computing’, International Journal of Computer
Application, Vol. 68, No. 6, pp.33–38.
Nuaimi, K.A., Mohamed, N., Nuaimi, M.A. and Al-Jaroodi, J. (2012) ‘A survey of load balancing
in cloud computing: challenges and algorithms’, Proceedings of Second Symposium on
Network Cloud Computing and Applications (NCCA), pp.137–142.
Ritchie, G. and Levine, J. (2005) ‘A fast, effective local search for scheduling independent jobs in
heterogeneous computing environments’, Journal of Computer Applications, Vol. 25, No. 5,
pp.1190–1192.
Sasikala, P. (2011) ‘Cloud computing: present status and future implications’, International
Journal of Cloud Computing, Vol. 1, No. 1, pp.23–36.
Sharma, S. (2015a) Evolution of As-a-Service Era in Cloud, arXiv preprint arXiv: 1507.00939.
Sharma, S. (2015b) An Extended Classification and Comparison of Nosql Big Data Models, arXiv
preprint arXiv: 1509.08035.
Sharma, S., Shandilya, R., Patnaik, S. and Mahapatra, A. (2015b) ‘Leading nosql models for
handling big data: a brief review’, International Journal of Business Information Systems,
Vol. 18, No. 4, pp.1–25.
Sharma, S., Tim, S.U., Gadia, S. and Wong, J. (2015a) Proliferating Cloud Density through Big
Data Ecosystem, Novel Xclouds Classification and Emergence of As-a-Service Era,
White paper.
Sharma, S., Tim, S.U., Wong, J., Gadia, S. and Sharma, S. (2014) ‘A brief review on leading big
data models’, Data Science Journal, Vol. 13, pp.138–157.
Tyagi, V. and Kumar, T. (2015) ‘ORT broker policy: reduce cost and response time using throttled
load balancing algorithm’, Proceedings of International Conference on Computer,
Communication and Convergence (ICCC 2015), pp.217–221.
Wang, S.C., Yan, K.Q., Liao, W.P. and Wang, S.S. (2010) ‘Towards a load balancing in a three-
level cloud computing network’, Proceedings of 3rd IEEE International Conference on
Computer Science and Information Technology (ICCSIT), pp.108–113.
Wickremasinghe, B., Calheiros, R.N. and Buyya, R. (2010) ‘Cloudanalyst: a cloudsim-based visual
modeller for analysing cloud computing environments and applications’, Proceedings of 24th
International Conference on Advanced Information Networking and Applications (AINA
2010), pp.446–452.