
Summer 2015

Resource Management and Scheduling in Cloud

Under the guidance of


Dr. MingHwa Wang

Khushboo Bhuva
Sumati Aneja
Priyanka Botny Srinath

ACKNOWLEDGEMENTS
We would like to express our special appreciation and gratitude to our professor,
Dr. MingHwa Wang, for giving us this opportunity and encouraging our research.
We would also like to thank the Santa Clara University library for providing the various
resources needed for our research on this project.
Finally, we would like to thank our families and friends for their constant support and
cooperation.

TABLE OF CONTENTS
1. Introduction

What is the problem?


Why is the project related to the class?
Why other approaches are no good?
Why you think your approach is better?
Statement of the problem
Scope of Investigation

2. Theoretical bases and literature review

Definition of the problem


Theoretical background of the problem
Related research to solve the problem
Advantage/Disadvantage of the research
Our Solution to solve the problem
Why our solution is different from others?
Why our solution is better?

3. Hypothesis
Positive/Negative Hypothesis
Multiple Hypotheses
4. Methodology

How to generate/collect input data?


How to solve the problem?
Algorithm Design
Language used
Our prototype
Tools used
How we generated an Output
How to test against hypothesis?
How to prove correctness

5. Implementation
Code

Design document and flowchart


6. Data Analysis and Discussion

Output generation
Output analysis
Compare output against hypothesis
Abnormal case explanation
Statistic regression
Discussion
7. Conclusions and recommendations

Summary and conclusions


Recommendations for future studies
8. Bibliography
9. Appendices
Program flowchart
Program source code with documentation
Input/ Output Listing
Related Material

1. INTRODUCTION

What is the problem?


Cloud computing is a growing business around the world. Software as a Service,
Platform as a Service and Infrastructure as a Service are the business models adopted
in the cloud industry. Cloud computing promises to provide reliable services through
cloud centers built on virtualization and storage technologies. Migrating to a
cloud computing environment creates the need for task scheduling, so that cloud
services are utilized optimally during execution. However, scheduling tasks or
applications on parallel and distributed computing systems is an NP-complete
problem. To date, researchers have proposed a number of algorithms, such as genetic
algorithms, simulated annealing and many others based on existing algorithms, to meet
the requirements of the cloud environment.
Why is the project related to the class?
As a team, we are delivering resource management strategies for scheduling tasks in a
cloud environment, using creative approaches to reduce power consumption, cost and
time.
Companies as well as academia can implement this feature on their cloud infrastructure
to best deliver a cloud environment as a product, service or platform.
Scope of Investigation
- Resource Management in Cloud Environment
- Task Scheduling Algorithm Implementation
- Comparative study with existing algorithms like round robin and greedy
algorithm
- Simulation using CloudSim with Cloud center data samples
- Load Balancing by Quality of Service using priority and completion time
of tasks
Our Solution to solve the problem
We propose a task scheduling algorithm for cloud computing that is driven by Quality of
Service (QoS). First, we compute the priority of each task from its attributes, then
sort the tasks by priority and place them in a task queue. The algorithm then
evaluates the completion time of each task on the different VMs and schedules each task
onto the VM that can complete it soonest.
Why our solution is different from others?

Most task scheduling algorithms, such as First Come First Serve (FCFS), Round
Robin (RR) and Min-Min, schedule jobs largely on the basis of their arrival order or
execution time, without regard to task priority. The cloud environment, however, is a
heterogeneous system in which the speed of each processor varies quickly. Our algorithm
is more appropriate for such an environment: it differs in that a task with higher
priority is scheduled first and must be completed as soon as possible.
Why our solution is better?
Our approach assigns each task to the best available resource in order to satisfy the
user's QoS requirement (Quality of Service describes the user's degree of satisfaction
with the service). Because this tends to concentrate tasks on a few optimal resources,
the current load of each VM is also taken into account when a task is placed. For the
same number of tasks, the makespan decreases as the number of virtual machines
increases (the smaller the makespan, the better the performance). Our proposed
algorithm blends several task attributes, including user privilege, task length and
pending time in the queue, and is designed to achieve high performance and load
balancing by driving QoS from both priority and completion time.
2. THEORETICAL BASES AND LITERATURE REVIEW

Definition of the problem


Theoretical background of the problem
Related research to solve the problem
Advantage/Disadvantage of the research
Our Solution to solve the problem
Why our solution is different from others?
Why our solution is better?

3. HYPOTHESIS
Proposed Goal of Task Scheduling Model
- A task with higher priority should be scheduled before a task with lower
priority: achieve QoS
- A task should be completed as soon as possible: achieve low latency
- The current load of the VMs should be taken into account before assigning a new
task: achieve load balance across all VMs
4. METHODOLOGY

In cloud computing, benchmarking our results in a comparative study against other task
scheduling algorithms requires experiments on a repeatable and scalable cloud
environment. However, because cloud service providers differ in their scheduling
policies and environments, it is very hard to arrive at a common benchmark for
comparing our results with other task scheduling strategies.
In order to perform testing, we use a simulator called CloudSim to model the cloud.
The main function of CloudSim is to provide a generalized and extensible simulation
framework for emerging cloud computing infrastructures and application services. By
using CloudSim, researchers and developers can focus on the specific system design
issues they want to investigate, without being concerned with the low-level details of
cloud-based infrastructures and services.
Using CloudSim, we propose to compare execution time and resource efficiency against
other scheduling algorithms, using a common input.

How to generate/collect input data?

To evaluate the efficiency of our algorithm in simulation, we need to conduct
experiments using data traces from a real system. The input consists of CPU
utilization data from 1090 nodes on servers located at around 507 places worldwide,
tested with 200-2400 tasks (PlanetLab data, as provided in the testbed).

How to solve the problem?

Every VM has attributes such as Id, name and ability (processor, bandwidth, storage),
and every task has attributes such as:
1. Id
2. Usertype - the privilege class of the owner. We define three values A, B and C,
with A being the highest privilege
3. Length of execution - the number of instructions for that task
4. Priority level - the urgency of the task to be scheduled. We set it to urgent,
high, medium or low
5. Arrival time - the time when the task arrived
As we are focusing only on compute functionality, we use only the above attributes
for a task.
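
The task attributes listed above could be modelled in Java roughly as follows. This is only a sketch: the class, enum and field names are hypothetical, and the choice of numeric types and the rejection of non-positive lengths are assumptions that the text does not specify.

```java
import java.util.Objects;

/** Privilege class of the task owner; A is the highest privilege. */
enum UserType { A, B, C }

/** Urgency of the task to be scheduled. */
enum PriorityLevel { URGENT, HIGH, MEDIUM, LOW }

/** Immutable description of one task; all names here are hypothetical. */
final class Task {
    final int id;
    final UserType userType;
    final long length;                 // number of instructions for the task
    final PriorityLevel priorityLevel;
    final double arrivalTime;          // time when the task arrived (unit is an assumption)

    Task(int id, UserType userType, long length,
         PriorityLevel priorityLevel, double arrivalTime) {
        // Assumption: a task must contain at least one instruction.
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        this.id = id;
        this.userType = Objects.requireNonNull(userType, "userType");
        this.length = length;
        this.priorityLevel = Objects.requireNonNull(priorityLevel, "priorityLevel");
        this.arrivalTime = arrivalTime;
    }

    @Override
    public String toString() {
        return "Task{id=" + id + ", userType=" + userType + ", length=" + length
                + ", priorityLevel=" + priorityLevel + ", arrivalTime=" + arrivalTime + "}";
    }
}
```

Using enums for the user type and priority level keeps the allowed values (A, B, C and urgent, high, medium, low) explicit, so an invalid value cannot be constructed.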

Algorithm Design

1. Calculate normalized value
For each attribute, calculate a normalized value, 0 being the lowest and 1 the highest,
based on
Ni = (Ai - MinVal) / (MaxVal - MinVal)
- Ni is the normalized value of the attribute for Task i.
- Ai is the value of the attribute for Task i.
- MinVal is the minimum value of the attribute across all tasks waiting to be
scheduled
- MaxVal is the maximum value of the attribute across all tasks waiting to be
scheduled
From this step, we have normalized values for every attribute of each task.
2. Calculate priority of each task by
Pi = w1(TUP) + w2(TP) + w3(TAT) + w4(TLE)
- Here, TUP, TP, TAT and TLE are the normalized values of the task's user privilege,
priority level, arrival time and length of execution
- w1, w2, w3, w4 are the weights, or contribution factors, of each attribute
towards the final priority, where w1 + w2 + w3 + w4 = 1
- Priority 1 is the highest and 0 is the lowest.
3. Sort all tasks based on the above priority
4. Create ETC (Expected Time to Compute) Matrix
- It contains the estimates of the expected execution times of all tasks on all
virtual machines
- The elements along a row give the estimates of the expected execution times of a
given task on the different virtual machines
- The elements along a column give the estimates of the expected execution times of
the different tasks on a given virtual machine
5. Create ST (Start Time) Matrix
- It contains estimates of the earliest time at which each virtual machine becomes
available, after it has executed and completed the tasks already allocated
to it
6. Create MCT (Minimum Completion Time) Matrix
- It contains the estimates of the expected completion times of all tasks on
all VMs
- The elements along a row give the estimates of the expected completion times of a
given task on the different VMs
- The elements along a column give the estimates of the expected completion times of
the different tasks on a given VM

7. Schedule the task on the VM which has the minimum value of MCT(i,j)

The total completion time of each VM is the computing load of the VM, i.e. the
greater the value of MCT(i,j), the heavier the load of the VM.
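
The seven steps above can be sketched in Java as follows. This is a minimal illustration under several assumptions that the text does not specify: each attribute is already encoded as a number where a larger value means the task should run sooner, an attribute whose values are all equal normalizes to 0, ETC(i,j) is modelled as task length divided by VM speed, and MCT(i,j) = ST(j) + ETC(i,j). The class and method names are hypothetical, and plain arrays stand in for CloudSim objects.

```java
import java.util.Arrays;
import java.util.Comparator;

final class QosSchedulerSketch {

    /** Step 1: min-max normalization of one attribute across all waiting tasks. */
    static double[] normalize(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // Assumption: if every task has the same value, the attribute contributes 0.
            out[i] = (max == min) ? 0.0 : (values[i] - min) / (max - min);
        }
        return out;
    }

    /** Step 2: Pi = sum over k of w[k] * normalized attrs[i][k]. */
    static double[] priorities(double[][] attrs, double[] weights) {
        int n = attrs.length;
        double[] p = new double[n];
        for (int k = 0; k < weights.length; k++) {
            double[] column = new double[n];
            for (int i = 0; i < n; i++) {
                column[i] = attrs[i][k];
            }
            double[] norm = normalize(column);
            for (int i = 0; i < n; i++) {
                p[i] += weights[k] * norm[i];
            }
        }
        return p;
    }

    /** Steps 3-7: returns assignment[i] = index of the VM chosen for task i. */
    static int[] schedule(double[] priority, double[] lengths, double[] vmSpeeds) {
        int n = lengths.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Step 3: highest priority first; the sort is stable, so ties keep input order.
        Arrays.sort(order, Comparator.comparingDouble((Integer idx) -> priority[idx]).reversed());

        double[] st = new double[vmSpeeds.length]; // Step 5: time at which each VM is free
        int[] assignment = new int[n];
        for (int task : order) {
            int best = 0;
            double bestMct = Double.POSITIVE_INFINITY;
            for (int vm = 0; vm < vmSpeeds.length; vm++) {
                double etc = lengths[task] / vmSpeeds[vm]; // Step 4 (assumed ETC model)
                double mct = st[vm] + etc;                 // Step 6 (assumed MCT model)
                if (mct < bestMct) {
                    bestMct = mct;
                    best = vm;
                }
            }
            assignment[task] = best; // Step 7: VM with the minimum completion time
            st[best] = bestMct;      // the chosen VM is busy until this task completes
        }
        return assignment;
    }
}
```

Because ST(j) grows every time a task is placed on VM j, a fast but busy VM can lose to a slower idle one, which is how the current load of each VM enters the decision.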

Language used: Java

Tools used: CloudSim and Eclipse IDE

Generating an Output:

The energy utilization and the time taken to schedule the cloudlets, or tasks, on
different virtual machines are obtained as part of the output. The performance of
our proposed algorithm will be compared with that of other scheduling algorithms
using statistical analysis of the results obtained during the implementation
phase of the project.

How to test against hypothesis?

1. Makespan - Execution time span
- Test for lower makespan to achieve better performance
2. Average latency - Average waiting time of long tasks
- Measured by the ratio of the total waiting time of long tasks to the
number of long tasks
- Lower average latency helps to reduce makespan
3. Load Balancing Index - a metric that measures whether a system is
well load balanced
- The smaller the load balance index, the better the load balancing
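
The three metrics above could be computed from simulation results along the following lines. This is a hedged sketch: the text does not define what counts as a "long" task, so `longThreshold` is a hypothetical cut-off on task length, and it does not give a formula for the load balance index, so the coefficient of variation of the per-VM finish times is used here purely as an assumed stand-in. All names are illustrative.

```java
import java.util.Arrays;

final class SchedulingMetrics {

    /** Makespan: the time at which the last VM finishes its allocated tasks. */
    static double makespan(double[] vmFinishTimes) {
        return Arrays.stream(vmFinishTimes).max().orElse(0.0);
    }

    /**
     * Average latency: total waiting time of long tasks divided by their count.
     * Assumption: a task is "long" when its length is at least longThreshold.
     */
    static double averageLatency(double[] waitingTimes, double[] lengths, double longThreshold) {
        double total = 0.0;
        int count = 0;
        for (int i = 0; i < lengths.length; i++) {
            if (lengths[i] >= longThreshold) {
                total += waitingTimes[i];
                count++;
            }
        }
        return count == 0 ? 0.0 : total / count;
    }

    /**
     * ASSUMED definition: standard deviation of the VM finish times divided by
     * their mean. It is 0 when every VM carries the same load and grows as the
     * load becomes more uneven.
     */
    static double loadBalanceIndex(double[] vmFinishTimes) {
        double mean = Arrays.stream(vmFinishTimes).average().orElse(0.0);
        if (mean == 0.0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (double t : vmFinishTimes) {
            sumSquares += (t - mean) * (t - mean);
        }
        return Math.sqrt(sumSquares / vmFinishTimes.length) / mean;
    }
}
```

Whatever definition of the load balance index is finally adopted, it should preserve the property stated above: a smaller value means better load balancing.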

BIBLIOGRAPHY
[1] Resource Management and Scheduling in Cloud Environment, Vignesh V, Sendhil
Kumar KS, Jaisankar N
[2] Resource Allocation and Scheduling in Cloud Computing, Elghoneimy E, Bouhali O,
Alnuweiri H
[3] Credit Based Scheduling Algorithm in Cloud Computing Environment, Antony
Thomas, Krishnalal G, Jagathy Raj V P
[4] Virtual Machine Scheduling Management on Cloud Computing Using Artificial Bee
Colony, B Kruekaew, W Kimpan
