ABSTRACT
This thesis developed a computing architecture and algorithms for supporting soft
real-time task scheduling in a cloud computing environment through the dynamic
provisioning of virtual machines.
This project proposes an Adaptive Scoring Job Scheduling (ASJS) algorithm for
the grid environment. Compared to other methods, it can decrease the completion time of
submitted jobs, which may be composed of computing-intensive and data-intensive jobs.
Python 3.6 is used as the front-end language to develop the application.
CONTENTS
CHAPTER NO. TITLE
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 Introduction
2 System Analysis
3 System Specification
4 Software Description
5 Project Description
6 System Testing
7 System Implementation
8 Appendix
9 References
LIST OF ABBREVIATIONS
CLR : Common Language Runtime
CLS : Common Language Specification
CLI : Cross Language Inheritance
CTS : Common Type System
SQL : Structured Query Language
IL : Intermediate Language
ADO : ActiveX Data Objects
COM : Component Object Model
HTML : Hyper Text Markup Language
DHTML : Dynamic Hyper Text Markup Language
XML : eXtensible Markup Language
CHAPTER 1
INTRODUCTION
With the exponential growth of big data induced by the proliferation of Internet
technologies, there has been an increase in the demand for cloud computing over the last
few years. Cloud computing is a type of distributed parallel computing system consisting
of a collection of inter-connected and virtualized computers that are dynamically
provisioned and presented as one or more unified computing resources based on service-
level agreements between the service provider and consumers. Cloud computing
facilitates flexible and dynamic outsourcing of applications while improving cost-
effectiveness.
In IaaS, the consumer is provisioned with virtualized computing, storage and
network resources that are delivered on an on-demand basis. In the IaaS model, the
consumer does not control the underlying cloud infrastructure; however, the consumer can
control the operating system, the storage and the deployed applications, besides possibly
controlling limited network components such as firewalls. Cloud computing is enabling the development of
the next generation of computing services, which would be heavily geared toward
massively distributed on-line computing. It also affords a new model of globally
accessible on-demand high-performance computing (HPC) services.
The authors of [2] stated that fair queuing is a technique that allows
each flow passing through a network device to have a fair share of network resources.
Previous schemes for fair queuing that achieved nearly perfect fairness were expensive to
implement: specifically, the work required to process a packet in these schemes was
O(log n), where n is the number of active flows. This is expensive at high speeds. On
the other hand, cheaper approximations of fair queuing that have been reported in the
literature exhibit unfair behavior. In this paper, the authors described a new
approximation of fair queuing, which they called Deficit Round Robin. Their scheme
achieves nearly perfect fairness in terms of throughput, requires only O(1) work to
process a packet, and is simple enough to implement in hardware. Deficit Round Robin is
also applicable to other scheduling problems where servicing cannot be broken up into
smaller units, and to distributed queues.
When there is contention for resources, it is important for resources to be
allocated or scheduled fairly. There need to be firewalls between contending users, so that
the "fair" allocation is enforced strictly. For example, in an operating system, CPU
scheduling of user processes controls the use of CPU resources by processes, and
insulates well-behaved users from ill-behaved users. Unfortunately, in most computer
networks there are no such firewalls; most networks are susceptible to badly behaving
sources. A rogue source that sends at an uncontrolled rate can seize a large fraction of the
buffers at an intermediate router; this can result in dropped packets for other sources
sending at more moderate rates. A solution is needed that isolates the effects of bad
behavior to the users who are behaving badly. [2]
According to the authors, the rapid proliferation of cloud and service-
oriented computing infrastructure is creating an ever-increasing thirst for storage within
data centers. Ideally, management applications in cloud deployments should operate in
terms of high-level goals, and not present specific implementation details to
administrators. Cloud providers often employ Storage Area Networks (SANs) to gain
storage scalability. SAN configurations have a vast parameter space, which makes them
one of the most difficult components to configure and manage in a cloud storage offering.
Cloud computing has begun to flourish in recent years [16]. Its growth has led to a
significant increase in the demand for storage, and thus scalable, manageable storage
infrastructures. The promise to customers of low Total Cost of Ownership (TCO),
combined with the ability of cloud computing to cope with those customers’ dynamic
infrastructure requirements, have fueled the emergence and growth of cloud service
providers such as Amazon S3 and EC2.
The authors said that the industry-driven evolution of cloud computing tends to
obfuscate the common underlying architectural concepts of cloud offerings and their
implications for hosted applications. Patterns are one way to document such architectural
principles and to make good solutions to recurring (architectural) cloud challenges
reusable.
To capture cloud computing best practice from existing cloud applications and
provider-specific documentation, they proposed to use an elaborated pattern format
enabling abstraction of concepts and reusability of knowledge in various use cases. They
presented a detailed step-by-step pattern identification process supported by a pattern
authoring toolkit. They continuously apply this process to identify a large set of cloud
patterns. In this paper, they introduced two new cloud patterns they identified in
industrial scenarios recently. The approach aims at cloud architects, developers, and
researchers alike to also apply this pattern identification process to create traceable and
well-structured pieces of knowledge in their individual fields of expertise. As an entry
point, they recap challenges introduced by cloud computing in various domains.
Through the use of cloud computing, cloud providers and their customers benefit
from the fundamental cloud properties: elasticity, pay-per-use, standardization, and
resource sharing. Elasticity empowers cloud users to reserve and release cloud resources
dynamically, based on the currently experienced workload. [4]
The authors stated that nature-inspired algorithms are among the most powerful
algorithms for optimization. Their paper provides a detailed description of a new
Firefly Algorithm (FA) for multimodal optimization applications. They compared the
proposed firefly algorithm with other meta-heuristic algorithms such as particle swarm
optimization (PSO). Simulations and results indicate that the proposed firefly algorithm is
superior to existing metaheuristic algorithms. Finally, they discussed its applications and
implications for further research.
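As a rough illustration of the method (not the authors' implementation), a basic firefly loop for minimising a continuous function might look like this; all parameter values are illustrative defaults:

```python
import math
import random

def firefly(objective, dim, n=15, iters=50, alpha=0.2, beta0=1.0, gamma=1.0,
            bounds=(-5.0, 5.0)):
    """Minimise `objective` with a basic Firefly Algorithm sketch.

    Brightness is the negative objective (lower objective = brighter): each
    firefly moves toward every brighter one with attractiveness
    beta0 * exp(-gamma * r^2), plus a small random walk scaled by alpha.
    """
    lo, hi = bounds
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    light = [objective(x) for x in pop]
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                if light[j] < light[i]:          # j is brighter: move i toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    pop[i] = [min(hi, max(lo, a + beta * (b - a)
                                          + alpha * (random.random() - 0.5)))
                              for a, b in zip(pop[i], pop[j])]
                    light[i] = objective(pop[i])
    best = min(range(n), key=lambda k: light[k])
    return pop[best], light[best]
```

On a simple sphere function the population converges near the origin within a few dozen iterations, which is the behaviour exploited for multimodal optimization.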
They concluded that the algorithm also considers the priorities of the tasks.
Honey bee behavior inspired load balancing improves the overall throughput of
processing, and priority-based balancing focuses on reducing the amount of time a task
has to wait in the queue of a VM. [6]
The authors stated that cloud computing, a new concept, is a pool of virtualized
computer resources. Internet-based development, in which dynamically scalable and
often virtualized resources are provided as a service over the Internet, has become a
significant trend. Cloud computing describes both a platform and a type of application. A
cloud computing platform dynamically provisions, configures, reconfigures, and de-
provisions servers as needed. Servers in the cloud can be physical machines or virtual
machines spanned across the network. Thus it utilizes the computing resources (service
nodes) on the network to facilitate the execution of complicated tasks that require large-
scale computation. The selection of nodes (load balancing) for executing a task in cloud
computing must therefore be considered carefully, and to exploit the effectiveness of the
resources, nodes have to be properly selected according to the properties of the task.
In this paper, a soft computing based load balancing approach has been proposed.
A local optimization approach Stochastic Hill climbing is used for allocation of incoming
jobs to the servers or virtual machines (VMs). Performance of the algorithm is analyzed
both qualitatively and quantitatively using CloudAnalyst. CloudAnalyst is a CloudSim-
based Visual Modeller for analyzing cloud computing environments and applications. A
comparison is also made with Round Robin and First Come First Serve (FCFS)
algorithms. [7]
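The stochastic hill climbing selection step can be sketched as follows; the notion of "load" and the step count here are simplified assumptions, not the exact CloudAnalyst model:

```python
import random

def stochastic_hill_climb(vm_loads, steps=100, seed=None):
    """Pick a VM for the next incoming job by stochastic hill climbing.

    Start from a random VM and repeatedly sample a random neighbour in the
    solution space (here: any other VM), moving to it whenever it is less
    loaded. Returns the index of the chosen VM.
    """
    rng = random.Random(seed)
    current = rng.randrange(len(vm_loads))
    for _ in range(steps):
        candidate = rng.randrange(len(vm_loads))
        if vm_loads[candidate] < vm_loads[current]:   # only accept improvements
            current = candidate
    return current
```

Being a local search, it only climbs toward less-loaded VMs from wherever it starts, which keeps each allocation decision cheap compared with scanning every VM on every job arrival.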
CHAPTER 3
SYSTEM STUDY
3.5 FEASIBILITY STUDY
The two criteria to judge feasibility are cost required and value to be attained. As such, a
well-designed feasibility study should provide a historical background of the business or project,
description of the product or service, accounting statements, details of the operations and
management, marketing research and policies, financial data, legal requirements and tax
obligations. Generally, feasibility studies precede technical development and project
implementation.
The feasibility of the system is analyzed in this phase, and a business proposal is put forth
with a general plan for the project and some cost estimates. During the system analysis of the
thesis, the feasibility study of the proposed system is carried out. This is to ensure that the
proposed system is not a burden to the company. For feasibility analysis, some understanding of
the major requirements for the system is essential.
a. Technical Feasibility
b. Economic Feasibility
c. Operational Feasibility
3.5.1 ECONOMIC FEASIBILITY
For any system if the expected benefits equal or exceed the expected costs, the
system can be judged to be economically feasible. In economic feasibility, cost benefit
analysis is done in which expected costs and benefits are evaluated. Economic analysis is
used for evaluating the effectiveness of the proposed system.
The application consumes fewer resources in the system so that the network
administrator can access the application more economically.
3.5.2 OPERATIONAL FEASIBILITY
The aspect of this study is to check the level of acceptance of the system by the user.
This includes the process of training the user to use the system efficiently. The user must
not feel threatened by the system, but must instead accept it as a necessity. The level of
acceptance by the users solely depends on the methods that are employed to educate
users about the system and to make them familiar with it. Users' confidence must be
raised so that they are also able to offer constructive criticism, which is welcomed.
The parameters given to the application are few, so any normal user can access
the application easily. Moreover, the application is command-line oriented, so
invoking it is as simple as running python <filename>.
3.5.3 TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the
available technical resources, as this would in turn place high demands on the user.
Since the network administrators are going to test the algorithms, the system is
technically feasible. In addition, all the nodes are provided with the Python runtime,
so a separate installation is not necessary to run or test this application.
FRONT END
PYTHON DESCRIPTION
Python is a dynamic, high-level, free, open-source and interpreted programming
language. It supports object-oriented programming as well as procedural
programming. In Python, we don't need to declare the type of a variable because it is a
dynamically typed language.
Features in Python
There are many features in Python, some of which are discussed below –
1. Easy to code
Python is a high-level programming language. Python is very easy to learn
compared to other languages like C, C#, JavaScript and Java. It is very easy to code in
Python, anybody can learn the basics in a few hours or days, and it is a
developer-friendly language.
2. Free and Open Source
Python is freely available on its official website, python.org, and you can
download it from the Downloads section there.
Since it is open source, the source code is also available to the public. So you
can download it, use it, as well as share it.
3. Object-Oriented Language
One of the key features of Python is object-oriented programming. Python
supports object-oriented concepts such as classes, objects, encapsulation, etc.
4. GUI Programming Support
Graphical user interfaces can be made using modules such as PyQt5, PyQt4,
wxPython or Tk in Python. PyQt5 is a popular option for creating graphical apps
with Python.
5. High-Level Language
Python is a high-level language. When we write programs in Python, we do not
need to remember the system architecture, nor do we need to manage the memory.
6. Extensible feature
Python is an extensible language. We can write some of our Python code in C or
C++, compile it as a C/C++ extension, and use it from Python.
7. Python is Portable language
Python is also a portable language. For example, if we have Python code for
Windows and we want to run it on another platform such as Linux, Unix or Mac,
we do not need to change it; we can run the same code on any platform.
8. Python is Integrated Language
Python is also an integrated language, because we can easily integrate Python
with other languages like C, C++, etc.
9. Interpreted Language
Python is an interpreted language: Python code is executed line by line.
Unlike languages such as C, C++ and Java, there is no separate compilation step, which
makes it easier to debug our code. The source code of Python is converted into an
intermediate form called bytecode.
10. Large Standard Library
Python has a large standard library which provides a rich set of modules and
functions, so you do not have to write your own code for every single thing. There are
many libraries in Python for tasks such as regular expressions, unit testing, web
browsers, etc.
11. Dynamically Typed Language
Python is a dynamically typed language. That means the type (for example int,
float or str) of a variable is decided at run time, not in advance. Because of this
feature we don't need to specify the type of a variable.
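For example:

```python
x = 10                      # the name x is bound to an int object
assert type(x) is int
x = "ten"                   # rebinding the same name to a str is fine:
assert type(x) is str       # types belong to objects, not to variables
x = 3.14
assert isinstance(x, float)
```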
PYTHON APPLICATIONS
1. Web and Internet Development
Python lets you develop a web application without too much trouble. It has
libraries for formats like HTML, XML and JSON, for e-mail processing, and for protocols
like FTP and IMAP, plus an easy-to-use socket interface. The package index has still more libraries:
Requests – An HTTP client library
BeautifulSoup – An HTML parser
Feedparser – For parsing RSS/Atom feeds
Paramiko – For implementing the SSH2 protocol
Twisted Python – For asynchronous network programming
We also have a gamut of frameworks available, such as Django and Pyramid. We
also get microframeworks like Flask and Bottle. We've discussed these in our write-up on
an Introduction to Python Programming.
We can also write CGI scripts, and we get advanced content management systems like
Plone and Django CMS.
2. Applications of Python Programming in Desktop GUI
Most binary distributions of Python ship with Tk, a standard GUI library. It lets
you draft a user interface for an application. Apart from that, some toolkits are available:
wxWidgets
Kivy – for writing multitouch applications
Qt via pyqt or pyside
And then we have some platform-specific toolkits:
GTK+
Microsoft Foundation Classes through the win32 extensions
Delphi
3. Science and Numeric Applications
This is one of the very common applications of python programming. With its
power, it comes as no surprise that python finds its place in the scientific community. For
this, we have:
SciPy – A collection of packages for mathematics, science, and engineering.
Pandas- A data-analysis and -modeling library
IPython – A powerful shell for easy editing and recording of work sessions. It
also supports visualizations and parallel computing.
Software Carpentry Course – It teaches basic skills for scientific computing and
running bootcamps. It also provides open-access teaching materials.
Also, NumPy lets us deal with complex numerical calculations.
4. Software Development Application
Software developers make use of python as a support language. They use it for
build-control and management, testing, and for a lot of other things:
SCons – for build-control
Buildbot, Apache Gump – for automated and continuous compilation and testing
Roundup, Trac – for project management and bug-tracking.
Roster of Integrated Development Environments
5. Python Applications in Education
Thanks to its simplicity, brevity, and large community, Python makes for a great
introductory programming language. Applications of Python programming in education
have huge scope, as it is a great language to teach in schools or even learn on your own.
If you still haven’t begun, we suggest you read up on what we have to say about
the white and dark sides of Python. Also, check out Python Features.
6. Python Applications in Business
Python is also a great choice to develop ERP and e-commerce systems:
Tryton – A three-tier, high-level general-purpose application platform.
Odoo – A management software with a range of business applications. With that,
it’s an all-rounder and forms a complete suite of enterprise-management
applications in-effect.
7. Database Access
With Python, you have:
Custom and ODBC interfaces to MySQL, Oracle, PostgreSQL, MS SQL Server,
and others. These are freely available for download.
Object databases like Durus and ZODB
Standard Database API
8. Network Programming
With all those possibilities, how would Python slack in network programming? It
does provide support for lower-level network programming:
Twisted Python – A framework for asynchronous network programming. We
mentioned it in section 2.
An easy-to-use socket interface
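As a small illustration of that socket interface, socket.socketpair() gives two connected endpoints without binding a real port, so a "client" and "server" exchange can be shown in a few lines:

```python
import socket

# A connected socket pair stands in for the client and server ends of a
# conversation; no address, port or network setup is needed.
server, client = socket.socketpair()
client.sendall(b"ping")
request = server.recv(1024)                    # server reads the request
server.sendall(b"pong" if request == b"ping" else b"error")
reply = client.recv(1024)                      # client reads the response
server.close()
client.close()
print(reply)  # b'pong'
```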
CHAPTER 6
PROJECT DESCRIPTION
In this module, task details such as required RAM (in MB), Storage Capacity (in
MB), CPU required (in MHz) and network bandwidth (in Mbps) are retrieved and stored
in various dictionary entries.
In this module, the cluster id is keyed in, with the ATP (Average Transmission
Power), ACP (Average Computing Power) and CS (Cluster Score) values set to zero. The
details are saved in the 'Cluster' table.
6.3.4 ADD RESOURCE
In this module, the resource id, Resource Name, IPAddress, CPU MHz, CPU
MHz available, Load Percent, CP (available computing power) and Storage Capacity
details are keyed in. The details are saved in ‘Resources’ table.
In this module, the cluster id is fetched from the 'Cluster' table and the resource id is
fetched from the 'Resources' table. The ids are selected from combo boxes and are saved in
the 'Clusters_Resources' table.
In this module, the job id, name, required RAM in MB, required hard disk
storage in MB, CPU MHz and network bandwidth are keyed in and saved into the 'Jobs'
table.
In this module, the cluster score is calculated based on the following formula:

CSi = a * ATPi + b * ACPi

where CSi is the cluster score for cluster i, a and b are the weight values of ATPi and ACPi
respectively, the sum of a and b is 1, and ATPi and ACPi are the average transmission
power and average computing power of cluster i respectively. ATPi means the average
available bandwidth the cluster i can supply to the job and is defined as:

ATPi = (1/n) * SUM(k = 1..n) bandwidthk

Similarly, ACPi means the average available CPU power cluster i can supply to the job
and is defined as:

ACPi = (1/n) * SUM(k = 1..n) CPU_speedk * (1 - loadk)
where CPU_speedk is the CPU speed of resource k in cluster i, loadk is the current load of
resource k in cluster i, and n is the number of resources in cluster i.
Because the transmission power and the computing power of a resource will actually
affect the performance of job execution, these two factors are used for job scheduling.
Since the bandwidth between resources in the same cluster is usually very large, we only
consider the bandwidth between different clusters.
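The scoring formulas above translate directly into code; the weight values a and b and the sample numbers below are illustrative:

```python
def cluster_score(bandwidths, cpu_speeds, loads, a=0.5, b=0.5):
    """Compute CSi = a * ATPi + b * ACPi for one cluster of n resources.

    bandwidths: available bandwidth each resource can supply to the job.
    cpu_speeds: CPU speed of each resource; loads: its current load as a
    fraction in [0, 1], so (1 - load) is the free CPU share.
    """
    assert abs(a + b - 1.0) < 1e-9, "the weights a and b must sum to 1"
    n = len(cpu_speeds)
    atp = sum(bandwidths) / n                                      # ATPi
    acp = sum(s * (1 - l) for s, l in zip(cpu_speeds, loads)) / n  # ACPi
    return a * atp + b * acp                                       # CSi
```

For a two-resource cluster with 100 Mbps available on each link and CPU speeds of 2000 MHz (50% loaded) and 1000 MHz (idle), the score is 0.5 * 100 + 0.5 * 1000 = 550.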
Local update and global update are used to adjust the score. After a job is submitted to a
resource, the status of the resource will change and local update will be applied to adjust
the cluster score of the cluster containing the resource. What local update does is to get
the available CPU percentage from Information Server and recalculate the ACP, ATP and
CS of the cluster. After a job is completed by a resource, global update will get
information of all resources in the entire grid system and recalculate the ACP, ATP and
CS of all clusters.
In addition to the existing formula, the storage capacity is also calculated, like the
Average Transmission Power, with a weight of its own, so that the sum of the weights is 1.
All other calculations are used in the same scenario as in the above module. The job is
split into sub-tasks, with weight values for each sub-task, so one cluster is assigned to one
task and other clusters to the other tasks.
6.4 SYSTEM ARCHITECTURE
[Figure: System architecture. Task requirements (CPU, bandwidth, RAM, OS) enter the
task pool; the scheduler analyzes the false positive rate, calculates ATP, ACP and
available memory, derives and ranks the cluster scores, and allocates tasks to available
VM resources, allocating a new VM at runtime when the cluster score is not high enough.]
6.6 DATA FLOW DIAGRAM
[Figure: Data flow diagram. The admin logs in, adds clusters and resources, assigns
resources to clusters (saved in the 'Cluster', 'Resources' and 'Clusters_Resources'
stores), adds jobs, and calculates the cluster score from the stored cluster and
resource data.]
6.9 INPUT DESIGN
System analysis decides the following input design details: what data to input,
what medium to use, how the data should be arranged or coded, which data items and
transactions need validation to detect errors, and finally the dialogue to guide the user in
providing input.
Input data of a system need not necessarily be raw data captured in the system
from scratch; it can also be the output of another system or subsystem. The design of
input covers all the phases of input from the creation of initial data to actual entering of
the data to the system for processing. The design of inputs involves identifying the data
needed, specifying the characteristics of each data item, capturing and preparing data
from computer processing and ensuring correctness of data.
Any ambiguity in the input leads to a total fault in the output. The goal of designing
the input data is to make data entry as easy and error-free as possible.
6.10 OUTPUT DESIGN
Output design generally refers to the results and information that are generated by
the system. For many end-users, output is the main reason for developing the system and
the basis on which they evaluate the usefulness of the application.
As the outputs are the most important source of information for the users, a better
design improves the system's relationship with its users and also helps in decision-
making. Form design elaborates the way output is presented and the layout available for
capturing information.
CHAPTER 7
SYSTEM TESTING
7.1 TESTING
Unit testing is done to check whether the individual modules of the source code
are working properly, i.e. testing each and every unit of the application separately by the
developer in the developer's environment. It is also known as module testing or
component testing.
The unit testing done here checks whether all the modules in the project work
without any error for the given inputs, using the various methods and algorithms
given in the project.
The integration testing in this project verifies that the modules combine to form a
complete application with improved response time. The modules are integrated along
the given flow, in which the job is split into sub-tasks and the cluster score is calculated
so that the job completes much earlier.
System testing is black-box testing of the fully integrated application, also called
end-to-end scenario testing. It ensures that the software works on all intended target
systems, verifies every input in the application against the desired outputs, and tests the
user's experience with the application. System testing is done after the completion of
both unit and integration testing, and it checks whether the algorithm and the
functionality still behave correctly after the corrections made during the earlier testing
phases.
The Master Node and virtual machines attempt to assign each task to an empty
processor. In the absence of such a processor, a new VM node is dynamically created and
assigned the task. The results of the demonstrative implementation of the proposed
architecture and algorithms in the present study were expressed in terms of the number of
deadline exceptions fired by each algorithm, the number of extra resources provisioned to
each algorithm to handle the deadline exceptions, and the average response time of the
tasks.
This project proposes an adaptive scoring method to schedule jobs in a grid
environment. ASJS selects the fittest resource to execute a job according to the status of the
resources. Local and global update rules are applied to get the newest status of each
resource. Local update rule updates the status of the resource and cluster which are
selected to execute the job after assigning the job and the Job Scheduler uses the newest
information to assign the next job.
The results suggest the applicability of scheduling periodic real-time tasks
in cloud computing. In addition to the current problem, other interesting and relevant
issues that remain to be addressed in future are:
The non-consideration of some other important scheduling parameters such as
cost and energy consumption constitutes a limitation of the present study. Further
experimental study is planned to consider such parameters as secondary priorities in
addition to meeting the constrained deadlines of real-time tasks.
SAMPLE CODING
import matplotlib.pyplot as plt
import random as rnd
import numpy as np
"""
First 100 lines are Algorithm 1 in Base Paper - Schedulability
Virtualized cloud environment
"""
#Receive real time tasks
outputstr=''
with open('tasks.txt', 'r') as f:
    lines = f.readlines()
totallines = len(lines)
# Preview at most the first five task lines
if totallines >= 5:
    for i in range(0, 5):
        print(lines[i])
else:
    print(lines)
tasks={}
tid=1
for l in lines:
    tok = l.split('\t')
    singletask = {}
    singletask['taskid'] = tok[0]
    singletask['rammb'] = int(tok[1])
    singletask['storagemb'] = int(tok[2])
    singletask['cpumhz'] = int(tok[3])
    singletask['nwbandwidthmbps'] = int(tok[4])
    singletask['windowsorlinux'] = int(tok[5])
    tasks["T" + str(tid)] = singletask
    tid += 1
print('TASK DETAILS')
print('------------')
#print(tasks)
for i1 in range(1, tid):
    dict1 = tasks["T" + str(i1)]
    print('Task Id : ', dict1['taskid'])
    outputstr = outputstr + '\nTask Id : ' + dict1['taskid']
    print('===========')
    outputstr = outputstr + '\n==========='
    print(' RAM (MB) : ', dict1['rammb'])
    outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
    print(' Storage (MB) : ', dict1['storagemb'])
    outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
    print(' CPU (MHz) : ', dict1['cpumhz'])
    outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
    print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
    outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
    print(' Windows(1)/Linux(2) : ', dict1['windowsorlinux'])
    outputstr = outputstr + '\n Windows(1)/Linux(2) : ' + str(dict1['windowsorlinux'])
    print('------------')
    outputstr = outputstr + '\n------------'
taskscount=len(tasks)
#========Create VM Pool
VM={}
OriginalRAMs=[]
OriginalSTs=[]
OriginalCPUs=[]
OriginalNBs=[]
for i in range(1, 11):
    vmresource = {}
    vmresource['resid'] = i
    vmresource['rammb'] = 100 + round((rnd.random() * 100), 0)
    if i == 6:
        vmresource['rammb'] = 500
    OriginalRAMs.append(vmresource['rammb'])
    vmresource['storagemb'] = 10000 + round(10240 + (rnd.random() * 1024), 0)
    OriginalSTs.append(vmresource['storagemb'])
    vmresource['cpumhz'] = 512
    OriginalCPUs.append(vmresource['cpumhz'])
    vmresource['cpuloadpercent'] = 0
    # Fields read later (network bandwidth, OS type) and the VM-pool entry
    # itself were missing here; added so the pool is actually populated.
    vmresource['nwbandwidthmbps'] = 100 + round((rnd.random() * 100), 0)
    OriginalNBs.append(vmresource['nwbandwidthmbps'])
    vmresource['windowsorlinux'] = rnd.randint(1, 2)
    VM['vm' + str(i)] = vmresource
print('--------------------------------')
outputstr= outputstr+'\n--------------------------------\n'
vmcount=len(VM)
VMPendingResources={}
#all resources are available
for j in range(1, vmcount + 1):
    vmresource = VM['vm' + str(j)]
    vmpendingresource = {}
    vmpendingresource['resid'] = vmresource['resid']
    vmpendingresource['rammb'] = vmresource['rammb']
    vmpendingresource['storagemb'] = vmresource['storagemb']
    vmpendingresource['cpumhz'] = vmresource['cpumhz']
    vmpendingresource['cpuloadpercent'] = vmresource['cpuloadpercent']
    vmpendingresource['nwbandwidthmbps'] = vmresource['nwbandwidthmbps']
    vmpendingresource['windowsorlinux'] = vmresource['windowsorlinux']
    vmpendingresource['taskids'] = []
    VMPendingResources['vm' + str(j)] = vmpendingresource
# The following fragment belongs inside the task-assignment loop of the base
# algorithm, which is elided in this excerpt; 'taskassignedcounter' and
# 'exitfor' are defined there, so it is kept as a comment.
# i += vmcount
# print('i new =' + str(i))
# if taskassignedcounter > taskscount:
#     exitfor = 1
#     break
# if exitfor == 1:
#     break
print('----------------------------------')
outputstr= outputstr+'\n----------------------------------'
print('Available Resources in VMs')
outputstr= outputstr+'\nAvailable Resources in VMs'
#print(VMPendingResources)
print('--------------------------------')
outputstr= outputstr+'\n----------------------------------'
#print(VM)
for i1 in range(1, 11):
    dict1 = VMPendingResources['vm' + str(i1)]
    print('Resource Id : ', dict1['resid'])
    outputstr = outputstr + '\nResource Id : ' + str(dict1['resid'])
    print('===========')
    outputstr = outputstr + '\n==========='
    print(' RAM (MB) : ', dict1['rammb'])
    outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
    print(' Storage (MB) : ', dict1['storagemb'])
    outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
    print(' CPU (MHz) : ', dict1['cpumhz'])
    outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
    print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
    outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
    print('--------------------------------')
    outputstr = outputstr + '\n--------------------------------'
failedvm=0
#Task execution
vmpendingcount=len(VMPendingResources)
suspect=0 #for algorithm2
for j in range(1, vmpendingcount + 1):
    vmtmp = VMPendingResources['vm' + str(j)]
    print(vmtmp['taskids'])
    outputstr = outputstr + '\nTask Ids of VM'
    for tmp1 in vmtmp['taskids']:
        outputstr = outputstr + '\n' + str(tmp1)
    excep = 0
    for x in vmtmp['taskids']:
        task = tasks['T' + str(x)]
        if task['windowsorlinux'] == vmtmp['windowsorlinux']:
            # Deduct the task's demands from the VM's remaining capacity
            vmtmp['rammb'] = vmtmp['rammb'] - task['rammb']
            vmtmp['storagemb'] = vmtmp['storagemb'] - task['storagemb']
            vmtmp['cpumhz'] = vmtmp['cpumhz'] - task['cpumhz']
            vmtmp['nwbandwidthmbps'] = vmtmp['nwbandwidthmbps'] - task['nwbandwidthmbps']
            VMPendingResources['vm' + str(j)] = vmtmp
            dict1 = VMPendingResources['vm' + str(j)]
            print('Pending Resource Id with new Status : ', dict1['resid'])
            outputstr = outputstr + '\nPending Resource Id with new Status : ' + str(dict1['resid'])
            print('===========')
            outputstr = outputstr + '\n==========='
            print(' RAM (MB) : ', dict1['rammb'])
            outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
            print(' Storage (MB) : ', dict1['storagemb'])
            outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
            print(' CPU (MHz) : ', dict1['cpumhz'])
            outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
            print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
            outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
            print('--------------------------------')
            outputstr = outputstr + '\n--------------------------------'
if vmtmp['rammb']<0:
print('DeadLine Exception due to insufficient RAM')
outputstr= outputstr+'\nDeadLine Exception due to insufficient RAM'
excep=1
if vmtmp['storagemb']<0:
print('DeadLine Exception due to insufficient Storage')
outputstr= outputstr+'\nDeadLine Exception due to insufficient Storage'
excep=1
if vmtmp['cpumhz']<0:
print('DeadLine Exception due to insufficient CPU')
excep=1
if vmtmp['nwbandwidthmbps']<0:
print('DeadLine Exception due to insufficient Network Bandwidth')
outputstr= outputstr+'\nDeadLine Exception due to insufficient Network Bandwidth'
excep=1
if rnd.random()<0.1 :
print('DeadLine Exception due to delayed time')
outputstr= outputstr+'\nDeadLine Exception due to delayed time'
suspect=1
excep=1
if excep==1:
print('VM' + str(j) + ' Failed for task ' + task['taskid'])
outputstr= outputstr+'\nVM' + str(j) + ' Failed for task ' + task['taskid']
failedvm=j
print('***Look for a node with idle processor***')
outputstr= outputstr+'\n***Look for a node with idle processor***'
VMPendingResources['vm' + str(j)]=vmtmp
break;
else:
VMPendingResources['vm' + str(j)]=vmtmp
#print(vmtmp)
dict1=VMPendingResources['vm' + str(j)]
print('Pending Resource Id with new Status : ',dict1['resid'])
outputstr= outputstr+'\nPending Resource Id with new Status : '+ str(dict1['resid'])
print('===========')
outputstr= outputstr+'\n==========='
print(' RAM (MB) : ',dict1['rammb'])
outputstr= outputstr+'\n RAM (MB) : '+str(dict1['rammb'])
print(' Storage (MB) : ',dict1['storagemb'])
outputstr= outputstr+'\n Storage (MB) : '+str(dict1['storagemb'])
print(' CPU (MHz) : ',dict1['cpumhz'])
outputstr= outputstr+'\n CPU (MHz) : '+str(dict1['cpumhz'])
print(' Network Bandwidth (Mbps) : ',dict1['nwbandwidthmbps'])
outputstr= outputstr+'\n Network Bandwidth (Mbps) : '+ str(dict1['nwbandwidthmbps'])
print('--------------------------------')
outputstr= outputstr+'\n--------------------------------'
if excep==1:
break;
if excep == 1:
    # Look for another VM with enough free capacity on all four
    # resources to take over the failed task.
    found = 0
    for j in range(1, vmpendingcount + 1):
        if failedvm != j:
            ram = 0
            sto = 0
            cpu = 0
            bw = 0
            vmtmp = VMPendingResources['vm' + str(j)]
            if vmtmp['rammb'] > task['rammb']:
                ram = 1
            if vmtmp['storagemb'] > task['storagemb']:
                sto = 1
            if vmtmp['cpumhz'] > task['cpumhz']:
                cpu = 1
            if vmtmp['nwbandwidthmbps'] > task['nwbandwidthmbps']:
                bw = 1
            if ram == 1 and sto == 1 and cpu == 1 and bw == 1:
                print('VM' + str(j) + ' can take the task: ' + task['taskid'])
                outputstr = outputstr + '\nVM' + str(j) + ' can take the task: ' + task['taskid']
                found = 1
                break
    if found == 0:
        # No existing VM can host the task, so provision a new VM.
        print('New VM created:')
        outputstr = outputstr + '\nNew VM created:'
        print('Task ' + task['taskid'] + ' needs to be assigned to that new VM')
        outputstr = outputstr + '\nTask ' + task['taskid'] + ' needs to be assigned to that new VM'
        vmnew = VM['vm' + str(1)]
        vmnew['resid'] = vmcount + 1
        VM['vm' + str(vmcount + 1)] = vmnew
        dicttmp = VM['vm' + str(vmcount + 1)]
        outputstr = outputstr + '\nResource Id : ' + str(dicttmp['resid'])
        print('Resource Id : ', str(dicttmp['resid']))
        print('===========')
        outputstr = outputstr + '\n==========='
        outputstr = outputstr + '\n RAM (MB) : ' + str(dicttmp['rammb'])
        print(' RAM (MB) : ' + str(dicttmp['rammb']))
        outputstr = outputstr + '\n Storage (MB) : ' + str(dicttmp['storagemb'])
        print(' Storage (MB) : ' + str(dicttmp['storagemb']))
        outputstr = outputstr + '\n CPU (MHz) : ' + str(dicttmp['cpumhz'])
        print(' CPU (MHz) : ' + str(dicttmp['cpumhz']))
        outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dicttmp['nwbandwidthmbps'])
        print(' Network Bandwidth (Mbps) : ' + str(dicttmp['nwbandwidthmbps']))
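The four per-resource flags (ram, sto, cpu, bw) checked above can be collapsed into a single test. The following sketch uses the same dictionary keys as the listing; the function name and sample dictionaries are illustrative only.

```python
# Resource keys follow the VM/task dictionaries used in the listing.
RESOURCE_KEYS = ('rammb', 'storagemb', 'cpumhz', 'nwbandwidthmbps')

def vm_can_take(vm, task):
    """Return True only if the VM has spare capacity for the task on every resource."""
    return all(vm[k] > task[k] for k in RESOURCE_KEYS)

# Illustrative capacities and demands (not taken from the experiment data).
vm = {'rammb': 2048, 'storagemb': 10240, 'cpumhz': 2400, 'nwbandwidthmbps': 100}
task = {'rammb': 512, 'storagemb': 1024, 'cpumhz': 800, 'nwbandwidthmbps': 10}
print(vm_can_take(vm, task))  # True: every pending value exceeds the demand
```

A single failing resource makes `all()` return False, matching the flag-based logic above.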
# The with-statement closes the file automatically, so no explicit
# close() call is needed.
with open('Output.txt', 'w') as of:
    of.write(outputstr)
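Every message in the listing is emitted twice, once through print() and once by concatenating onto outputstr. A small helper could do both in one step; the names `log` and `output_lines` below are illustrative, not part of the original listing.

```python
# Buffer for the report lines (illustrative replacement for outputstr).
output_lines = []

def log(msg):
    """Print a message and buffer it for the Output.txt report in one step."""
    print(msg)
    output_lines.append(str(msg))

log(' RAM (MB) : 512')
log('--------------------------------')

# One join at the end replaces the repeated string concatenation.
with open('Output.txt', 'w') as report:
    report.write('\n'.join(output_lines))
```

Appending to a list and joining once is also faster than repeated `+` on a growing string.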
# DRAWING CHART
# Drawing RAM capacity
xaxistitles = []
RAMs = []
for i in range(0, vmcount):
    dicttmp = VMPendingResources['vm' + str(i + 1)]
    RAMs.append(dicttmp['rammb'])
    xaxistitles.append('VM' + str(dicttmp['resid']))
fig, ax = plt.subplots()
plt.xlim(0, 10)
plt.ylim(0, max(OriginalRAMs) + 10)
width = 0.35
ax.plot(OriginalRAMs)
ax.plot(RAMs)
ax.bar(xaxistitles, RAMs, width, color='r')
ax.set_title('Virtual Machines Pending RAM Capacity')
plt.show()
# Drawing network bandwidth capacity
xaxistitles = []
NBs = []
for i in range(0, vmcount):
    dicttmp = VMPendingResources['vm' + str(i + 1)]
    NBs.append(dicttmp['nwbandwidthmbps'])
    xaxistitles.append('VM' + str(dicttmp['resid']))
fig, ax = plt.subplots()
plt.xlim(0, 10)
plt.ylim(0, max(OriginalNBs) + 10)
width = 0.35
ax.plot(NBs)
ax.plot(OriginalNBs)
ax.bar(xaxistitles, NBs, width, color='g')
ax.set_title('Virtual Machines Pending Network Bandwidth (Mbps) Capacity')
plt.show()
# Algorithm 2
if suspect == 1:
    print('Task: ' + task['taskid'] + ' is removed from queue')
    print('Task: ' + task['taskid'] + ' is suspected and so sent to master')
    print('Task: ' + task['taskid'] + ' is added to the Master Scheduler\'s urgent queue.')
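The Algorithm 2 hand-off printed above can be sketched as a pair of queues: a suspected (deadline-delayed) task is removed from its VM's queue and promoted to the front of the master scheduler's urgent queue. The queue names and task ids below are illustrative, not part of the original listing.

```python
from collections import deque

# Illustrative queues: tasks waiting on a VM, and the master's urgent queue.
vm_queue = deque(['T1', 'T2', 'T3'])
urgent_queue = deque()

def escalate(task_id):
    """Move a suspected task from the VM queue to the front of the urgent queue."""
    vm_queue.remove(task_id)
    urgent_queue.appendleft(task_id)

escalate('T2')
print(list(vm_queue))      # ['T1', 'T3']
print(list(urgent_queue))  # ['T2']
```

Using `appendleft` gives the suspected task priority over tasks already waiting at the master.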
SCREENSHOTS
REFERENCES
[1] A. Y. Zomaya and Y.-H. Teh, "Observations on using genetic algorithms for dynamic load-balancing," IEEE Trans. Parallel Distrib. Syst., vol. 12, no. 9, pp. 899–911, Sep. 2001.
[3] D. Eyers, R. Routray, R. Zhang, D. Willcocks, and P. Pietzuch, "Towards a middleware for configuring large-scale storage infrastructures," in Proc. 7th Int. Workshop Middleware Grids, Clouds e-Sci., 2009, p. 3.
[5] X.-S. Yang, "Firefly algorithms for multimodal optimization," in Proc. Int. Symp. Stochastic Algorithms. Berlin, Germany: Springer, 2009, pp. 169–178.
[6] L. D. D. Babu and P. V. Krishna, "Honey bee behavior inspired load balancing of tasks in cloud computing environments," Appl. Soft Comput., vol. 13, no. 5, pp. 2292–2303, May 2013.
[7] B. Mondal, K. Dasgupta, and P. Dutta, "Load balancing in cloud computing using stochastic hill climbing - a soft computing approach," Procedia Technol., vol. 4, pp. 783–789, Jun. 2012.