
Chapter 5

LITERATURE SURVEY

5.1 Static Environment


In a static environment, the processing elements (machines) of a cloud are homogeneous
resources, and these resources are not flexible. In this scenario, the cloud requires prior
knowledge of each node's capacity, processing power, memory and performance, together with
statistics of user requirements. These user requirements do not change at run time. Algorithms
proposed for load balancing in a static environment therefore cannot adapt to run-time changes
in load. Although a static environment is easier to simulate, it is not well suited to a
heterogeneous cloud environment.

FCFS (First Come First Serve) [67] is the simplest static scheduling algorithm: the task that
arrives first is scheduled first and is allocated the resources it needs. Tasks for which no
resources are available at a particular time are kept in a queue until a resource becomes free.
Once a task has executed, the next task in the queue is scheduled and assigned the free
resource. The algorithm considers only the arrival time of tasks and is easy to implement.
No optimization function is used to map the tasks to the resources; it is a purely greedy
algorithm.
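
The queueing behaviour described above can be illustrated with a minimal sketch. The function name `fcfs_schedule`, the tuple layout of a task and the assumption of identical resources are illustrative choices, not taken from [67].

```python
import heapq

def fcfs_schedule(tasks, num_resources):
    """tasks: list of (name, arrival_time, run_time) tuples.

    Returns {name: (start_time, finish_time)}.
    """
    # Min-heap of the times at which each resource next becomes free.
    free_at = [0] * num_resources
    heapq.heapify(free_at)
    schedule = {}
    # Serve strictly in arrival order; sorted() is stable, so ties keep input order.
    for name, arrival, run_time in sorted(tasks, key=lambda t: t[1]):
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # the task queues if no resource is free
        finish = start + run_time
        schedule[name] = (start, finish)
        heapq.heappush(free_at, finish)
    return schedule

example = [("t1", 0, 5), ("t2", 1, 3), ("t3", 2, 1)]
print(fcfs_schedule(example, num_resources=2))
# {'t1': (0, 5), 't2': (1, 4), 't3': (4, 5)}
```

Task t3 arrives at time 2 but waits until time 4, when the first resource becomes free. Nothing in the loop compares task lengths or resource speeds.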

Round Robin [67] is another static scheduling method, in which each task is executed for a
fixed time slot. A task that does not finish within its slot is put at the end of the queue and
resumes its remaining execution when it reaches the front again. The algorithm takes the
expected execution time into account and also balances the load. The resources are provisioned
to the tasks on a first-come-first-serve (FCFS) basis, i.e. the task that entered first is
allocated the resource first, and tasks are scheduled in a time-sharing manner. It can be
considered a greedy (first-fit) round-robin scheduling policy for mapping tasks to virtual
machines in a static environment.
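
A minimal sketch of the time-slot mechanism on a single resource follows. The name `round_robin`, the `quantum` parameter and the assumption that all tasks are present at time 0 are illustrative, not taken from [67].

```python
from collections import deque

def round_robin(burst_times, quantum):
    """burst_times: {task: required time}. Returns {task: completion time}."""
    queue = deque(burst_times.items())
    clock = 0
    completion = {}
    while queue:
        task, remaining = queue.popleft()
        run = min(quantum, remaining)  # run for one slot, or less if the task ends
        clock += run
        if remaining > run:
            queue.append((task, remaining - run))  # unfinished: back of the queue
        else:
            completion[task] = clock
    return completion

print(round_robin({"t1": 5, "t2": 3, "t3": 1}, quantum=2))
# {'t3': 5, 't2': 8, 't1': 9}
```

The short task t3 completes at time 5 even though it was queued last, which is the response-time benefit of time sharing. The sketch ignores the cost of switching between tasks.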

Priority Scheduling [68]: assigning priorities to tasks is key to solving the scheduling
problem. Priority in cloud computing can be based on multiple attributes. Priority-based job
scheduling is a suitable example of an on-line mode heuristic scheduling algorithm. A job
scheduling algorithm for cloud environments should pay attention to the multi-attribute and
multi-criteria properties of jobs. Numerous mathematical functions exist to assign weights to
tasks so that the tasks can be mapped to resources. These functions generate a weight for each
evaluation criterion according to the decision maker's pairwise comparisons of the criteria.
Weights are computed for each task on all resources. The main focus of the priority scheduling
algorithm is the priority of the tasks and not the optimization of resources.

Shortest Job First is a simple static priority-based scheduling algorithm in which the tasks are
assigned priority on the basis of the task length. The task with the shortest length is assigned
resources first. A queue is maintained if all resources are busy.
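
As a small illustration of the ordering rule, the sketch below runs tasks shortest-first on one resource. The name `sjf_order` and the single-resource setting are assumptions made for brevity.

```python
def sjf_order(task_lengths):
    """task_lengths: {task: length}. Returns [(task, start, finish)] on one resource."""
    clock = 0
    plan = []
    # Priority is the task length: the shortest task gets the resource first.
    for task, length in sorted(task_lengths.items(), key=lambda kv: kv[1]):
        plan.append((task, clock, clock + length))
        clock += length
    return plan

print(sjf_order({"a": 8, "b": 2, "c": 4}))
# [('b', 0, 2), ('c', 2, 6), ('a', 6, 14)]
```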

Improved Cost-Based Algorithm for Task Scheduling [69] is an improved cost-based algorithm
that maps tasks efficiently to the available resources. The tasks are sorted by priority
(decided using some criteria) and placed in three different lists, namely high, medium and low
priority. The main aim of this task scheduling was to minimize total task completion time and
cost. The algorithm did not address complicated scenarios involving other QoS attributes.

5.2 Static & Dynamic Heuristic Strategies


Static task scheduling strategies are preferred when it is certain that all tasks will arrive
at the same time and will be considered for scheduling simultaneously. Dynamic strategies are
particularly useful when the scheduler is not aware of all incoming tasks at the beginning or
when the set of available machines changes dynamically. The tasks are passed to the scheduler
as they arrive or in iterations. In certain architectures, the tasks arrive asynchronously and
some machines even go offline at certain intervals. Dynamic heuristics can be used in either
on-line mode or batch mode. In on-line mode, each task is scheduled to a particular machine as
it enters the system or is submitted to the broker. In batch mode, tasks are collected into a
set and the set is scheduled at prescheduled times.

OLB (Opportunistic Load Balancing) [70] assigns tasks, randomly or in free order, to the
resources available on the cloud. The implementation is very easy as no computation is required,
and it does not consider any constraints while assigning tasks to resources. For instance, it
does not consider the expected execution time of a task on different resources. The idea is to
ensure that all resources or machines get work.
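
The following sketch shows one way to read this rule: each task goes to whichever machine becomes free next. The name `olb` and the matrix `etc` (expected time of task t on machine m) are hypothetical. The matrix is used only to advance the clocks, never to choose a machine.

```python
def olb(etc):
    """etc[t][m]: expected time of task t on machine m.

    Returns (assignment, makespan), where assignment[t] is the chosen machine.
    """
    num_machines = len(etc[0])
    ready = [0] * num_machines  # time at which each machine becomes free
    assignment = []
    for row in etc:
        # Pick the machine that is free soonest; execution time is not consulted.
        m = min(range(num_machines), key=lambda j: ready[j])
        ready[m] += row[m]
        assignment.append(m)
    return assignment, max(ready)

print(olb([[4, 2], [3, 6], [5, 1]]))
# ([0, 1, 0], 9)
```

Every task in this example lands on the machine where it runs slower, because only machine availability is considered.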

MET (Minimum Execution Time) [70] is also simple and easy to implement. Each task is assigned
to the machine on which it is expected to execute in the minimum time. This seems reasonable,
but the strategy does not take into consideration the current load and availability of the
machine: it simply assigns the task to the best machine. This can result in poor load balance
across machines. Load Balance Min-Min (LBMM) [70] is an example of a Minimum Execution Time
task scheduling algorithm.
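
A minimal sketch of the MET rule follows, assuming a hypothetical matrix `etc` in which `etc[t][m]` is the expected execution time of task t on machine m. The example is chosen so that machine 0 is the fastest for every task.

```python
def met(etc):
    """Assign every task to the machine with its minimum execution time.

    Returns (assignment, makespan), where assignment[t] is the chosen machine.
    """
    ready = [0] * len(etc[0])  # accumulated load per machine
    assignment = []
    for row in etc:
        m = min(range(len(row)), key=lambda j: row[j])  # current load is ignored
        ready[m] += row[m]
        assignment.append(m)
    return assignment, max(ready)

print(met([[2, 3], [4, 5], [1, 6], [3, 4]]))
# ([0, 0, 0, 0], 10)
```

All four tasks pile onto machine 0 while machine 1 stays idle, which is the load imbalance the paragraph describes.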

MCT (Minimum Completion Time) [71] works differently from Minimum Execution Time. Rather than
the execution time, it considers the completion time of the task, that is, the time at which
the task would finish on a machine given the work already assigned to that machine. The chosen
machine may take more time to execute the task, but the task completes earlier there than on
any other machine. Tasks are taken in arbitrary order and each is assigned to the machine with
the minimum completion time for that task. However, the strategy fails to ensure that a task
takes its minimum execution time.
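
A minimal sketch of the MCT rule follows, assuming a hypothetical matrix `etc` in which `etc[t][m]` is the expected execution time of task t on machine m. Completion time is modelled as the machine's current ready time plus the execution time, and tasks are taken in list order.

```python
def mct(etc):
    """Assign each task, in list order, to the machine that finishes it earliest.

    Returns (assignment, makespan), where assignment[t] is the chosen machine.
    """
    ready = [0] * len(etc[0])  # time at which each machine becomes free
    assignment = []
    for row in etc:
        # Completion time on machine j = ready time of j + execution time on j.
        m = min(range(len(row)), key=lambda j: ready[j] + row[j])
        ready[m] += row[m]
        assignment.append(m)
    return assignment, max(ready)

print(mct([[2, 3], [4, 5], [1, 6], [3, 4]]))
# ([0, 1, 0, 0], 6)
```

The second task runs on machine 1, where it executes more slowly (5 instead of 4) but finishes earlier (at time 5 instead of 6).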

MOMCT (Modified Ordered Minimum Completion Time) [71] builds on MCT (Minimum Completion Time),
which allocates tasks in a random order to the minimum completion time machine. It suggests an
ordered approach to the MCT heuristic, which orders tasks according to the mean difference
between the completion time on each machine and the completion time on the minimum completion
time machine.

Min-min [41] is based on the completion time of all tasks that are still waiting for resource
allocation. The idea is to compute, for every waiting task, its minimum completion time over
all machines. The task with the overall minimum completion time is scheduled on the machine on
which its completion time is minimum. The task is then removed from the list of waiting tasks
and the same procedure is repeated for the remaining tasks in the list.
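
The Min-min loop can be sketched as follows. The matrix `etc` (expected time of task t on machine m) and the function name `min_min` are hypothetical. Completion time is modelled as machine ready time plus execution time.

```python
def min_min(etc):
    """Returns (order, makespan); order lists (task, machine) in scheduling order."""
    num_machines = len(etc[0])
    ready = [0] * num_machines
    unscheduled = set(range(len(etc)))
    order = []
    while unscheduled:
        # For every waiting task, find its minimum completion time and machine.
        best = {}
        for t in sorted(unscheduled):
            m = min(range(num_machines), key=lambda j: ready[j] + etc[t][j])
            best[t] = (ready[m] + etc[t][m], m)
        # Schedule the task whose minimum completion time is smallest overall.
        t = min(best, key=lambda k: best[k])
        finish, m = best[t]
        ready[m] = finish
        unscheduled.remove(t)
        order.append((t, m))
    return order, max(ready)

print(min_min([[3, 6], [6, 4], [2, 7], [9, 8]]))
# ([(2, 0), (1, 1), (0, 0), (3, 1)], 12)
```

The completion times are recomputed in every round because each assignment changes the ready time of one machine.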

Max-min [41] is also based on the completion time of tasks and is quite similar to the Min-min
heuristic in its implementation. The only difference between Min-min and Max-min is the
selection of the task to schedule in each round. It also keeps a set of all unscheduled tasks,
and again the minimum completion time of every task that is still waiting for resource
allocation is computed. But rather than selecting the task with the overall minimum completion
time, here the task with the overall maximum of these minimum completion times is scheduled on
the machine that gives it its minimum completion time. The task is then removed from the list
of tasks that are waiting for resource allocation and the same procedure is followed for all
the remaining tasks in the list.
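
A minimal sketch of Max-min in its commonly described form follows: the task with the largest minimum completion time is picked first and placed on the machine that completes it earliest. The matrix `etc` (expected time of task t on machine m) and the name `max_min` are hypothetical.

```python
def max_min(etc):
    """Returns (order, makespan); order lists (task, machine) in scheduling order."""
    num_machines = len(etc[0])
    ready = [0] * num_machines
    unscheduled = set(range(len(etc)))
    order = []
    while unscheduled:
        # Minimum completion time and corresponding machine for each waiting task.
        best = {}
        for t in sorted(unscheduled):
            m = min(range(num_machines), key=lambda j: ready[j] + etc[t][j])
            best[t] = (ready[m] + etc[t][m], m)
        # Unlike Min-min, pick the task whose minimum completion time is largest.
        t = max(best, key=lambda k: best[k])
        finish, m = best[t]
        ready[m] = finish
        unscheduled.remove(t)
        order.append((t, m))
    return order, max(ready)

print(max_min([[3, 6], [6, 4], [2, 7], [9, 8]]))
# ([(3, 1), (1, 0), (0, 0), (2, 0)], 11)
```

Placing the longest task first lets the shorter tasks fill the other machine. On this particular matrix the makespan is 11, but the outcome depends on the task mix.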

5.3 Other Scheduling Algorithms


In paper [66], different task scheduling schemes are reviewed. A novel taxonomy is also
proposed for the problem of task scheduling in the cloud environment. Goal Oriented Task
Scheduling (GOTS) schemes give service providers a fair chance to apply a specific approach and
schedule tasks and resources so as to generate the maximum possible economic gain while
provisioning the fewest resources. Low resource provisioning allows providers to use their
resources to the fullest, at the price of only a marginal increase in Makespan.

In paper [62], an Autonomous Agent Based Load Balancing Algorithm (A2LB) is proposed. The
objective of this algorithm is to provide dynamic load balancing using ants as the migration
agents. A2LB focuses on parameters such as improving throughput, minimizing response time, and
dynamic resource scheduling with scalability and reliability. It works by ensuring that all
resources are properly utilized in a manner that keeps the load balanced. The A2LB mechanism
comprises three agents: Load Agent, Channel Agent and Migration Agent. Load and Channel Agents
are static agents, whereas the Migration Agent is an ant. The Load Agent is responsible for
calculating the load on every available virtual machine after a new job is allocated in the
data centre. It maintains this information in a table termed the VM_Load_Fitness table. The
Channel Agent initiates Migration Agents on receiving a request from the Load Agent. The idea
is to search for virtual machines with a similar configuration. It maintains the information
received from the Migration Agents in a table termed the Response Table. The Migration Agent
communicates with the Load Agents of other data centres to find a compatible VM whose fitness
value is greater than some threshold value. If such a VM is found, the Channel Agent migrates
the task to that VM. The proposed mechanism has been implemented and found to provide good
results. The only problem with this algorithm is the number of migrations. Cloudlets are
randomly assigned to virtual machines, and once all virtual machines are busy, the algorithm
spends much of its time finding a suitable alternative virtual machine and fails to ensure
that resource optimization is fully achieved.

GA (Genetic Algorithm) [41] is another popular heuristic strategy used to find near-optimal
solutions to complex problems. It is a population-based heuristic. The first step of GA is to
randomly initialize the population of chromosomes. An objective function is then designed based
on one or two parameters; one of the most widely used QoS parameters is Makespan time. After
the initial population is generated, all chromosomes in the population are evaluated on the
basis of their respective fitness value (Makespan time). The next step is to perform a
crossover operation that selects a random pair of chromosomes and picks a random point in the
first chromosome; the allocation of resources is then exchanged between the corresponding
tasks. The last step is to perform the mutation operation. It randomly selects a chromosome and
a task within the chromosome, and the task is then re-assigned to a selected resource. The same
process is repeated for a number of iterations until the stopping criterion, which is defined
in terms of the objective function, is met.
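
The steps above can be sketched as a small program. This is an illustrative sketch, not the implementation from [41]: the matrix `etc` (expected time of task t on machine m), the keep-the-best-half selection rule, the fixed generation count used as stopping criterion and all parameter defaults are assumptions.

```python
import random

def makespan(chromosome, etc):
    """Fitness: the largest total load on any machine (lower is better)."""
    load = [0] * len(etc[0])
    for task, machine in enumerate(chromosome):
        load[machine] += etc[task][machine]
    return max(load)

def genetic_schedule(etc, pop_size=20, generations=50, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    n_tasks, n_machines = len(etc), len(etc[0])
    # Random initial population; gene i is the machine chosen for task i.
    population = [[rng.randrange(n_machines) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate fitness and keep the better half as parents (assumed selection rule).
        population.sort(key=lambda c: makespan(c, etc))
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Single-point crossover: allocations after the point come from b.
            point = rng.randrange(1, n_tasks)
            child = a[:point] + b[point:]
            # Mutation: re-assign one random task to a random machine.
            if rng.random() < mutation_rate:
                child[rng.randrange(n_tasks)] = rng.randrange(n_machines)
            children.append(child)
        population = parents + children
    best = min(population, key=lambda c: makespan(c, etc))
    return best, makespan(best, etc)

best, best_makespan = genetic_schedule([[3, 6], [6, 4], [2, 7], [9, 8]])
print(best, best_makespan)
```

Because parents are carried over unchanged, the best makespan in the population never gets worse from one generation to the next. The sketch needs at least two tasks and a population of at least four.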

In paper [18], a load balancing strategy has been implemented using a Genetic Algorithm.
Although the problem of task scheduling in cloud computing is generally dynamic in nature, at
any given point there is a certain set of tasks to be assigned to the available resources. In
this paper, two vectors were used to represent the current load of the VMs at any given time
and the information related to the jobs submitted to the cloud. The focus of this paper was to
optimize the cost function. Simulation results show that the load balancer using GA performs
much better than static algorithms.

In paper [63], a cloud task scheduling policy based on the Ant Colony Optimization (ACO) [38]
algorithm has been implemented and its performance compared with scheduling algorithms such as
First Come First Served (FCFS) and Round-Robin (RR). The paper proposes a probabilistic
function based on the expected time to compute each task. The function takes into consideration
the pheromone concentration, the transfer time of the task, the expected time to compute, the
length of each task and the processing capabilities of each virtual machine, including
bandwidth. The paper also suggests a function for updating the pheromone value, which is
updated after each tour by an ant. The length computed after each tour refers to the Makespan
value. The function also considers trail decay, i.e. the decay of pheromone concentration on
each path. The main goal of the algorithm is to minimize the Makespan of a given task set.
Tasks varying in size from 100 to 100 are scheduled using the ACO algorithm. Experimental
results showed that cloud task scheduling based on ACO outperformed the static FCFS and
Round-Robin algorithms. However, the paper does not suggest any measure to improve the
imbalance factor beyond the implementation of the ACO algorithm. The algorithm does not
consider parameters other than Makespan time; the probabilistic function could be extended to
consider other QoS parameters.
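
The exact functions used in [63] are not reproduced here. The sketch below shows the generic ACO form of the two ingredients mentioned above, a transition probability and a pheromone update with trail decay. The parameter names `alpha`, `beta`, `rho` and `q`, and the use of the inverse expected time as the heuristic term, are assumptions.

```python
def transition_probabilities(pheromone_row, expected_times, alpha=1.0, beta=2.0):
    """Probability of sending one task to each VM.

    pheromone_row[v]: pheromone on the (task, VM v) pair.
    expected_times[v]: expected time to compute the task on VM v.
    """
    # A VM with more pheromone and a shorter expected time is more attractive.
    weights = [(tau ** alpha) * ((1.0 / t) ** beta)
               for tau, t in zip(pheromone_row, expected_times)]
    total = sum(weights)
    return [w / total for w in weights]

def update_pheromone(pheromone, tour, tour_makespan, rho=0.5, q=1.0):
    """Apply trail decay, then reinforce the (task, VM) pairs used by the tour."""
    for row in pheromone:
        for vm in range(len(row)):
            row[vm] *= (1.0 - rho)  # trail decay on every path
    for task, vm in enumerate(tour):
        pheromone[task][vm] += q / tour_makespan  # shorter tour, larger deposit
    return pheromone

print(transition_probabilities([1.0, 1.0], [2.0, 4.0]))
print(update_pheromone([[1.0, 1.0], [1.0, 1.0]], tour=[0, 1], tour_makespan=4.0))
```

With equal pheromone, the VM that is twice as fast receives four times the weight when `beta` is 2, so it is chosen with probability of about 0.8.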

In paper [49], Ant Colony Optimization (ACO) [38] has been implemented to solve the cloud
scheduling problem. The paper suggests that the local and global pheromone updates are the key
steps in implementing the ACO scheduling algorithm. The local pheromone update ensures that
edges or paths that have not yet been explored are preferred over those already travelled by
the ant. It further ensures that all ants do not end up converging on the same path. The global
pheromone is updated in each iteration to ensure that the ants explore the search space in the
neighbourhood too. The paper compares both Makespan and total execution time of tasks with
FCFS. It aims to minimize the Makespan, which is important because a lower Makespan can
increase the overall throughput of the tasks.

Particle Swarm Optimization (PSO) [37] is based on the behaviour of animals that form a group
and find the best position in that group to form a swarm. A single fish or bird in a swarm is
called a particle, and each particle moves in a certain direction at a certain speed. The next
position of a particle is therefore decided by the direction and speed in which it is moving.
PSO is best suited to problems that are continuous in nature, but it can also be used to solve
discrete problems such as task scheduling in a cloud environment. It generally takes less time
to converge to a solution than algorithms such as ACO.
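
The movement rule can be sketched with the standard PSO velocity update. The inertia weight `w`, the coefficients `c1` and `c2`, and the rounding step that turns a continuous position into a task-to-machine assignment are common choices assumed here for illustration, not details taken from [37].

```python
import random

def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """One update of a single particle; all arguments are equal-length lists."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        # Keep part of the old velocity, then pull towards both best positions.
        v_next = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity

def to_assignment(position, num_machines):
    """Map a continuous position to machine indices by rounding and clamping."""
    return [min(num_machines - 1, max(0, round(x))) for x in position]

pos, vel = pso_step([0.2, 1.7], [0.0, 0.0], [1.0, 1.0], [0.0, 2.0])
print(to_assignment(pos, num_machines=3))
```

Each coordinate of the position stands for one task, and `to_assignment` is what makes the continuous method usable for the discrete scheduling problem.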

In paper [72], a task scheduling algorithm based on Particle Swarm Optimization (PSO) has
been implemented. It aims to optimize the average utilisation of resources. The paper proposes
a multi-dimensional QoS-based improved particle swarm optimization algorithm (QoS-DPSO). The
idea of this algorithm is to consider multi-dimensional QoS parameters and adjust these
parameters dynamically to find better particle positions.

In paper [73], an approach to load balancing using Particle Swarm Optimization in cloud
computing has been proposed. VM migration from one host to another is a time-consuming task
and the proposed method aims to overcome this drawback. Rather than migrating the VM itself,
only the tasks of the overloaded VM are migrated. The focus of the proposed algorithm is to
reduce the time taken to balance the load.

In paper [64], an approach to migration along with allocation has been suggested. The
objective of this algorithm is to reduce the number of migrations in order to achieve better
energy consumption and resource utilization. The algorithm succeeded in reducing energy
consumption and in converging to a solution in less time by combining two different sets of
algorithms, one for allocation of the resources and one for migration of the resources.

In paper [50], a heuristic-based method is proposed to schedule data-intensive and
computation-intensive applications so as to minimize the overall execution cost. Scientific
analysis is computation intensive and involves a lot of data, and hence takes a long time to
execute. Many web services now also fall into the category of data-intensive applications.
Creating a web environment for data-intensive computing is very useful for achieving much
higher scalability than performing data-intensive computing on a single high-end system in one
place. The main challenge in implementing such an infrastructure is to achieve minimum response
time. The task scheduling algorithm in this paper reduces the amount of data movement between
the nodes. A task's processing cost varies with the processor assigned to it. Data movement is
a key performance indicator and is considered part of the total expected time to execute a task
on a virtual machine. It is important to reduce data movement so that less time is wasted in
migrating tasks to the cloud server.

In paper [46], a broad overview of bio-inspired algorithms used to tackle various challenges
in cloud computing resource management has been presented. Bio-inspired algorithms play a very
important role in computer networks, data mining, power systems, economics, robotics,
information security, control systems, image processing, etc. There are great opportunities for
exploring and enhancing this field with innovative ideas, since bio-inspired algorithms bridge
knowledge between different communities such as computer science, biology, economics and
artificial intelligence.

Table 5.1: Comparison of different scheduling algorithms.


| Algorithm | Description | Findings | Limitations | Remarks |
|---|---|---|---|---|
| First Come First Serve algorithm [67] | Tasks are saved in a queue based on submission time. The tasks are then assigned to the free resource. | The algorithm is easy to implement and requires less effort to identify resources for tasks. | Only arrival time is considered for scheduling the tasks; all other parameters are ignored. | Can only be used where a large number of free resources are available and optimization is not the primary requirement. |
| Round Robin algorithm [67] | Tasks are executed concurrently in a time-shared manner. No task has to wait for an indefinite period to be assigned the resource. | The idea is to reduce the response time and minimize the expected delay. | Due to the pre-emptive nature of the algorithm, switching between tasks wastes crucial time, and transfer of data is also an issue. | Better when no deadline is set; preferred when response time is the only scheduling criterion. |
| Priority scheduling algorithm [68] | Tasks are assigned priority on the basis of some parameters and are saved in queues. The tasks are then assigned to free resources on the basis of highest priority first. | Numerous multi-objective mathematical models have been proposed to compute the priority. The algorithm fares well in a static environment. | The algorithm is not suitable for a dynamic environment, and tasks with low priority can suffer from starvation when it is implemented under a dynamic environment. | A hybrid with other heuristic algorithms can be used to design objective functions using the priority model. |
| Improved cost-based algorithm for task scheduling [69] | It is an improved priority-based scheduling algorithm. The algorithm divides tasks into three different lists depending on the priority of each task. It makes an efficient mapping of available resources to tasks. | It measures both the computation performance and the cost of resources. The primary objective is to reduce the cost of processing. | The algorithm focuses only on cost and does not improve resource utilization. | The algorithm is good under a static environment; improvement is still required for it to be used in an online dynamic environment, as it does not consider resource utilization. |
| OLB (Opportunistic Load Balancing) [70] | Tasks are randomly assigned to available resources on the cloud. | The implementation is very easy as no computation is required, and it does not consider any constraints while assigning tasks to resources. | It follows a greedy approach and tasks are randomly assigned. No optimization criterion is followed. | OLB can be used as a hybrid approach with other algorithms; when the load is low, OLB can be used under a specific threshold value to get good results. |
| MET (Minimum Execution Time) [70] | Each task is assigned to the machine on which it takes the minimum time to execute. | It is very logical to follow this approach, but it does not take into consideration the current load and availability of the machine. | This strategy can result in poor load balance across machines, as better machines can become heavily loaded. | MET can be used as an optimization function for other heuristic algorithms. For example, it can be used to compute the expected time to execute on a machine, which can then be used by other heuristic algorithms. |
| MCT (Minimum Completion Time) [72] | Tasks are assigned by considering the minimum completion time of the tasks. | Rather than considering the execution time, it works on the completion time of the task, which can result in better Makespan time. | The strategy fails to ensure that a task takes its minimum execution time. Resources are not guaranteed the best mapping. | Modified approaches to the traditional MCT algorithm can be used for better performance and efficient utilization of resources. |
| MOMCT (Modified Ordered Minimum Completion Time) [71] | Tasks are assigned to the minimum completion time machines after being ordered according to the mean difference between the completion time on each machine and that on the minimum completion time machine. | It is able to identify the MCT for randomly assigned tasks using an ordered approach. | Even though it works faster and in an efficient manner, it still does not consider efficient mapping of resources to tasks. | Resource awareness can be added to the ordered approach for better efficiency. Still, the algorithm cannot be used effectively in a real-time environment. |
| Min-min [41] | It computes the minimum completion time of every waiting task, selects the task with the smallest value and assigns it to the machine with the minimum completion time. | It targets the Makespan time and tries to ensure that the best Makespan time is obtained. | It cannot be used effectively when all VMs have the same computing capacity or differ very little. | The algorithm again uses very few parameters and has to be extended to use other QoS parameters as well. |
| Max-min [41] | It computes the minimum completion time of every waiting task, selects the task whose minimum completion time is largest and assigns it to the machine that gives that minimum completion time. | It targets the Makespan time and tries to ensure that the best Makespan time is obtained. | It cannot be used effectively when all VMs have the same computing capacity or differ very little. | It is a variation of Min-min; whether it performs better than Min-min or not is only a matter of chance, depending upon the incoming tasks. |
| Goal Oriented Task Scheduling (GOTS) [66] | It is based on achieving a specific goal such as maximization of profits. | It uses very little resource provisioning for task scheduling. | Makespan time is compromised and the only gain is for the provider. Not many parameters are considered. | There is a need to consider the user parameters as well, to give a fair chance to both provider and user. Focusing on only one parameter does not guarantee effective usage. |
| Autonomous Agent Based Load Balancing Algorithm (A2LB) [62] | Tasks are randomly assigned to the machines. If a machine becomes overloaded, agents are used to migrate the task to some other VM. | The three agents are able to ensure that all resources get an almost equal amount of work, with respect to finish time only. | The algorithm suffers from a large number of migrations when the system becomes fully loaded, and it fails to implement efficient resource utilization. | There is a need for a better allocation policy so that the mapping of resources and tasks results in better optimization of resources. |
| GA (Genetic Algorithm) [41] | It is a population-based heuristic strategy used to find near-optimal solutions to complex problems. | GA consists of three steps: random initialization, crossover function and mutation operation. The three steps are repeated for a number of iterations until the stopping criterion is met. | The algorithm tends to perform well when all the tasks are submitted before the first iteration. It struggles in a dynamic environment. | GA gives a satisfactory result. Performance is on par with other heuristic algorithms. Further improvement can be made by making it adaptable to a dynamic environment, like ACO. |
| Ant Colony Optimization (ACO) [38] | The Ant Colony Optimization (ACO) meta-heuristic is inspired by the behaviour of real ants finding the shortest path between their colonies and a source of food. | The algorithm is based on finding the minimum length. Length in this case can be Makespan time. Hence the solution with minimum length, or Makespan time, is the final solution. | The solution involves large and complex computations at every stage and the process is time consuming. | The algorithm gives a fair result. There is a need to design better optimization functions that cover a wide range of QoS parameters, and possibly to design a resource-aware hybrid algorithm. |
| Particle Swarm Optimization (PSO) [37] | It is a swarm-based intelligence algorithm influenced by the social behaviour of animals such as birds. | A particle in a PSO implementation represents a bird, and the movement of each particle is co-ordinated by a velocity, i.e. magnitude and direction. A particle's position represents a solution, and the best position is measured by a fitness value. | The algorithm sometimes fails to cover the whole search space and is caught in a local best, which results in premature convergence. | Computation is less than for ACO and it provides results faster than ACO. The solution given is sometimes greedy. There is a need to hybridize it with other heuristics to overcome its weaknesses. |
