
INVESTIGATING THE SCHEDULABILITY OF PERIODIC REAL-TIME

TASKS IN VIRTUALIZED CLOUD ENVIRONMENT

ABSTRACT

This thesis developed a computing architecture and algorithms for supporting soft
real-time task scheduling in a cloud computing environment through the dynamic
provisioning of virtual machines.

The architecture integrated soft real-time task scheduling algorithms, namely the
master node and virtual machine node schedulers. In addition, the Adaptive Scoring
Job Scheduling algorithm is also applied. As science and technology advance, the
problems encountered become more complicated and need more computing power.

In contrast to the traditional notion of using supercomputers, grid computing is
proposed. Distributed computing supports resource sharing, while parallel computing
supplies computing power; grid computing aims to harness the power of both. The goal
of grid computing is to aggregate idle resources on the Internet, such as Central
Processing Unit (CPU) cycles and storage space, to facilitate their utilization.

Grid technology, which connects a number of personal computer clusters with
high-speed networks, can achieve the same computing power as a supercomputer, at a
lower cost. However, a grid is a heterogeneous system, and scheduling independent
tasks on it is more complicated. In order to utilize the power of the grid completely,
we need an efficient job scheduling algorithm to assign jobs to resources in the grid.

This project proposes an Adaptive Scoring Job Scheduling (ASJS) algorithm for
the grid environment. Compared to other methods, it can decrease the completion time of
submitted jobs, which may be composed of computing-intensive jobs and data-intensive jobs.
Python 3.6 is used as the front end language to develop the application.
CONTENTS
CHAPTER NO. TITLE

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

1 Introduction

2 System Analysis

2.1 Existing System

2.2 Drawbacks of Existing System

2.3 Proposed System

2.4 Advantages of Proposed System

2.5 Feasibility Study

2.5.1 Economical Feasibility

2.5.2 Operational Feasibility

2.5.3 Technical Feasibility

3 System Specification

3.1 Hardware Requirements

3.2 Software Requirements

4 Software Description

4.1 Front End

4.2 Back End

5 Project Description

5.1 Overview of the project

5.2 Module Description

5.3 System Flow Diagram

5.4 Data Flow Diagram

5.5 E-R Diagram


5.6 Database Design

5.7 Input Design

5.8 Output Design

6 System Testing

6.1 Unit Testing

6.2 Acceptance Testing

6.3 Test Cases

7 System Implementation

8 Conclusion and Future Enhancements

9 Appendix

9.1 Source Code

9.2 Screen Shots

10 References
LIST OF ABBREVIATIONS
CLR : Common Language Runtime
CLS : Common Language Specification
CLI : Common Language Infrastructure
CTS : Common Type System
SQL : Structured Query Language
IL : Intermediate Language
ADO : ActiveX Data Objects
COM : Component Object Model
HTML : Hyper Text Markup Language
DHTML : Dynamic Hyper Text Markup Language
XML : Extensible Markup Language
CHAPTER 1
INTRODUCTION
With the exponential growth of big data induced by the proliferation of Internet
technologies, there has been an increase in the demand for cloud computing over the last
few years. Cloud computing is a type of distributed parallel computing system consisting
of a collection of inter-connected and virtualized computers that are dynamically
provisioned and presented as one or more unified computing resources based on service-
level agreements between the service provider and consumers. Cloud computing
facilitates flexible and dynamic outsourcing of applications while improving cost-
effectiveness.

Cloud computing offers three different service models: Software-as-a-Service
(SaaS), Platform-as-a-Service (PaaS) and Infrastructure-as-a-Service (IaaS). In the
SaaS model, the consumer uses the provider's applications hosted in a cloud
infrastructure. In PaaS, the consumer is provided with the software and hardware
tools required to develop and deploy cloud applications that are hosted in the
provider's cloud infrastructure.

In IaaS, the consumer is provisioned with virtualized computing, storage and
network resources that are delivered on demand. In the IaaS model, the consumer does
not control the cloud infrastructure; however, he can control the operating system,
the storage and the deployed applications, besides possibly controlling limited
network components such as firewalls. Cloud computing is enabling the development of
the next generation of computing services, which would be heavily geared toward
massively distributed on-line computing. It also affords a new model of globally
accessible on-demand high-performance computing (HPC) services.

However, such benefits are presently not available to real-time safety-critical
applications. This could be due to the inability of current cloud infrastructures to
support timing constraints. Compared with existing real-time scheduling platforms
such as multiprocessors and multicores, the handling of real-time constraints on
cloud platforms is more complex due to the difficulty of predicting system
performance in virtualized and heterogeneous environments.

However, real-time applications are gradually progressing onto cloud computing
platforms, driven by the tremendous possibilities afforded by such platforms. Examples
of hard and soft real-time applications of cloud computing are military distributed
control systems for remote surveillance, early warning and response systems, sensor-
driven unmanned vehicles with augmented intelligence, and cloud gaming. Cloud
computing technology is not actually geared toward hard real-time applications in
closed environments, but toward soft real-time applications that do not require direct
exposure to the hardware, bypassing system software.

Incidentally, soft real-time scheduling policies can be integrated in virtualization
platforms, enabling the system to deliver hierarchical real-time performance. This project
proposes and analyzes the possibility of applying real-time task scheduling, or deadline-
constrained task scheduling, on IaaS cloud computing service model, for the support of
soft and near-hard real-time applications. Examples of such applications are cloud-based
gaming, online video streaming, and telecommunication management. Such applications
could benefit significantly from cloud computing despite the limitations associated with
the start-up time and communication of virtual machines. This is because of the ability of
cloud computing to support dynamic workloads, thereby enabling the elastic allocation of
resources.
OBJECTIVES

 To consider the problem of scheduling independent periodic real-time
tasks with implicit deadlines (i.e. task deadlines are equal to task
periods, d_i = p_i) on a virtualized cloud environment that offers IaaS
services to cloud subscribers.
 To avoid missed deadlines and to maintain the system criticality.
 In real-time systems, a periodic real-time task is one that is 'released'
periodically at constant intervals. Such periodic real-time tasks can be
found in a broad range of real-time system applications. Scheduling
independent tasks in this setting is more complicated, so an efficient
job scheduling algorithm is needed to assign jobs to resources.
 To decrease the completion time of submitted jobs, which may be composed
of computing-intensive jobs and data-intensive jobs.
 To fit tasks requiring a particular operating system into VMs with the
corresponding operating system (either Windows or Linux).
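The first objective above can be illustrated with the standard utilization-based schedulability test for independent periodic tasks with implicit deadlines (d_i = p_i) under preemptive EDF: the set is schedulable iff total utilization does not exceed 1. The task set below is illustrative, not taken from the thesis.

```python
# Hedged sketch: utilization test for implicit-deadline periodic tasks
# scheduled by preemptive EDF on a single VM (schedulable iff U <= 1).

def edf_schedulable(tasks):
    """tasks: list of (wcet, period) pairs, same time unit for both.
    Returns True if total utilization sum(C_i / p_i) <= 1."""
    utilization = sum(c / p for c, p in tasks)
    return utilization <= 1.0

tasks = [(1, 4), (2, 6), (1, 8)]   # (C_i, p_i); U = 0.25 + 0.333 + 0.125
print(edf_schedulable(tasks))      # True
```

A task set such as [(3, 4), (2, 3)] has U ≈ 1.42 and would be rejected.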
CHAPTER 2
LITERATURE SURVEY
2.1 RELATED WORK

According to the authors of this paper, load-balancing problems arise in many
applications but, most importantly, they play a special role in the operation of parallel
and distributed computing systems. Load balancing deals with partitioning a program
into smaller tasks that can be executed concurrently and mapping each of these tasks to a
computational resource such as a processor (e.g., in a multiprocessor system) or a computer
(e.g., in a computer network). By developing strategies that map these tasks to
processors in a way that balances out the load, the total processing time is reduced
and processor utilization improved.

Most of the research on load balancing has focused on static scenarios that, in most
cases, employ heuristic methods. However, genetic algorithms have gained immense
popularity over the last few years as a robust and easily adaptable search technique. The
work proposed here investigates how a genetic algorithm can be employed to solve the
dynamic load-balancing problem. A dynamic load-balancing algorithm is developed
whereby optimal or near-optimal task allocations can "evolve" during the operation of the
parallel computing system. [1]
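The load-balancing idea surveyed above can be sketched as a small genetic algorithm. The chromosome encoding (gene i is the processor of task i), the fitness (makespan, i.e. the most-loaded processor) and all parameters below are illustrative assumptions, not the cited paper's exact design.

```python
import random

def makespan(chromosome, task_costs, n_procs):
    """Largest total load on any processor under this mapping."""
    loads = [0.0] * n_procs
    for task, proc in enumerate(chromosome):
        loads[proc] += task_costs[task]
    return max(loads)

def evolve(task_costs, n_procs, pop_size=30, generations=100, seed=1):
    rng = random.Random(seed)
    n = len(task_costs)
    pop = [[rng.randrange(n_procs) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: makespan(c, task_costs, n_procs))
        survivors = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)             # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:                # random-reset mutation
                child[rng.randrange(n)] = rng.randrange(n_procs)
            children.append(child)
        pop = survivors + children
    best = min(pop, key=lambda c: makespan(c, task_costs, n_procs))
    return best, makespan(best, task_costs, n_procs)

mapping, cost = evolve([4, 2, 7, 1, 3, 5], n_procs=3)
print(mapping, cost)
```

Over the generations the population "evolves" toward mappings whose processor loads are balanced, which is the dynamic load-balancing behaviour described above.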

The authors in this paper [2] stated that Fair queuing is a technique that allows
each flow passing through a network device to have a fair share of network resources.
Previous schemes for fair queuing that achieved nearly perfect fairness were expensive to
implement: specifically, the work required to process a packet in these schemes was
O(log(n) ), where n is the number of active flows. This is expensive at high speeds. On
the other hand, cheaper approximations of fair queuing that have been reported in the
literature exhibit unfair behavior. In this paper, the authors described a new
approximation of fair queuing, that they called Deficit Round Robin. Their scheme
achieves nearly perfect fairness in terms of throughput, requires only O(1) work to
process a packet, and is simple enough to implement in hardware. Deficit Round Robin is
also applicable to other scheduling problems where servicing cannot be broken up into
smaller units, and to distributed queues.
When there is contention for resources, it is important for resources to be
allocated or scheduled fairly. We need firewalls between contending users, so that the
"fair" allocation is followed strictly. For example, in an operating system, CPU
scheduling of user processes controls the use of CPU resources by processes, and
insulates well-behaved users from ill-behaved users. Unfortunately, most computer
networks have no such firewalls; most networks are susceptible to badly behaving
sources. A rogue source that sends at an uncontrolled rate can seize a large fraction of the
buffers at an intermediate router; this can result in dropped packets for other sources
sending at more moderate rates. A solution to this problem is needed to isolate the effects
of bad behavior to the users that are behaving badly. [2]
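The Deficit Round Robin scheme summarized above can be sketched as follows: each active flow keeps a deficit counter, and a packet is sent only when the counter covers its size. Packet sizes and the quantum below are illustrative; a true O(1)-per-packet implementation also keeps an active list of backlogged flows, which this simple sketch omits.

```python
from collections import deque

def drr_schedule(flows, quantum):
    """flows: list of deques of packet sizes. Returns the service order
    as (flow_index, packet_size) pairs, per Deficit Round Robin."""
    deficits = [0] * len(flows)
    order = []
    while any(flows):
        for i, q in enumerate(flows):
            if not q:
                deficits[i] = 0          # an idle flow keeps no credit
                continue
            deficits[i] += quantum       # one quantum per round
            while q and q[0] <= deficits[i]:
                pkt = q.popleft()        # send while deficit covers it
                deficits[i] -= pkt
                order.append((i, pkt))
    return order

flows = [deque([200, 700]), deque([500]), deque([100, 100, 100])]
print(drr_schedule(flows, quantum=500))
```

Note how flow 0's 700-byte packet waits one round until its deficit (300 + 500) covers it, which is exactly the mechanism that yields near-perfect throughput fairness.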

According to the authors of this paper, the rapid proliferation of cloud and service-
oriented computing infrastructure is creating an ever-increasing thirst for storage within
data centers. Ideally, management applications in cloud deployments should operate in
terms of high-level goals, and not present specific implementation details to
administrators. Cloud providers often employ Storage Area Networks (SANs) to gain
storage scalability. SAN configurations have a vast parameter space, which makes them
one of the most difficult components to configure and manage in a cloud storage offering.

As a step towards a general cloud storage configuration platform, this paper
introduces a SAN configuration middleware that aids management applications in their
task of updating and troubleshooting heterogeneous SAN deployments. The middleware
acts as a proxy between management applications and a central repository of SAN
configurations. The central repository is designed to validate SAN configurations against
a knowledge base of best-practice rules across cloud deployments. Management
applications contribute local SAN configurations to the repository, and also subscribe to
proactive notifications for configurations no longer considered safe.

Cloud computing has begun to flourish in recent years [16]. Its growth has led to a
significant increase in the demand for storage, and thus scalable, manageable storage
infrastructures. The promise to customers of low Total Cost of Ownership (TCO),
combined with the ability of cloud computing to cope with those customers’ dynamic
infrastructure requirements, have fueled the emergence and growth of cloud service
providers such as Amazon S3 and EC2.

The highly heterogeneous, multi-vendor interconnection of devices created by the
cloud service providers allows multiple qualitative gradations of service. During the
whole lifecycle starting from data placement to retirement of workload, it remains a
monumental task for the cloud providers to perform the planning, configuration and
migration on demand [17]. Such operations are always error-prone given the complexity
of interrelated physical and logical provisioning. Cloud providers may choose from a
number of different approaches to effect scalable storage. [3]

The authors said that the industry-driven evolution of cloud computing tends to
obfuscate the common underlying architectural concepts of cloud offerings and their
implications on hosted applications. Patterns are one way to document such architectural
principles and to make good solutions to reoccurring (architectural) cloud challenges
reusable.

To capture cloud computing best practice from existing cloud applications and
provider-specific documentation, they proposed to use an elaborated pattern format
enabling abstraction of concepts and reusability of knowledge in various use cases. They
presented a detailed step-by-step pattern identification process supported by a pattern
authoring toolkit. They continuously apply this process to identify a large set of cloud
patterns. In this paper, they introduced two new cloud patterns they identified in
industrial scenarios recently. The approach aims at cloud architects, developers, and
researchers alike to also apply this pattern identification process to create traceable and
well-structured pieces of knowledge in their individual field of expertise. As an entry
point, they recap challenges introduced by cloud computing in various domains.

Through the use of cloud computing, cloud providers and their customers benefit
from the fundamental cloud properties: elasticity, pay-per-use, standardization, and
resource sharing. Elasticity empowers cloud users to reserve and release cloud resources
dynamically and based on the currently experienced workload. [4]
The authors stated that nature-inspired algorithms are among the most powerful
algorithms for optimization. The paper provides a detailed description of a new
Firefly Algorithm (FA) for multimodal optimization applications and compares it
with other meta-heuristic algorithms such as particle swarm optimization (PSO).
Simulations and results indicate that the proposed firefly algorithm is superior to
existing metaheuristic algorithms. Finally, they discussed its applications and
implications for further research.

Biologically inspired algorithms are becoming powerful in modern numerical
optimization [22], especially for NP-hard problems such as the travelling salesman
problem. Among these biology-derived algorithms, multi-agent meta-heuristic
algorithms such as particle swarm optimization form hot research topics in state-of-
the-art algorithm development in optimization and other applications. Particle swarm
optimization (PSO) was developed by Kennedy and Eberhart in 1995, based on
swarm behaviour such as fish and bird schooling in nature, the so-called swarm
intelligence. Though particle swarm optimization has many similarities with genetic
algorithms, it is much simpler because it does not use mutation/crossover operators.
[5]
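A toy version of the firefly algorithm summarized above, minimizing f(x) = x^2 in one dimension (here "brighter" means a lower objective value): dimmer fireflies move toward brighter ones with attractiveness beta0 * exp(-gamma * r^2) plus a small random walk. The objective and every parameter below are illustrative choices, not the paper's setup.

```python
import math, random

def firefly_minimize(f, n=15, iters=60, lo=-5.0, hi=5.0, seed=3):
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    beta0, gamma, alpha = 1.0, 1.0, 0.1
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                if f(xs[j]) < f(xs[i]):      # j is brighter: attract i
                    r2 = (xs[i] - xs[j]) ** 2
                    beta = beta0 * math.exp(-gamma * r2)
                    xs[i] += beta * (xs[j] - xs[i]) \
                             + alpha * (rng.random() - 0.5)
        alpha *= 0.97                        # cool down the random walk
    return min(xs, key=f)

best = firefly_minimize(lambda x: x * x)
print(best)
```

The distance-dependent attractiveness is what lets the population split across several optima in multimodal landscapes, which is the property the paper emphasizes over PSO.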

The authors stated that scheduling of tasks in cloud computing is an NP-hard
optimization problem. Load balancing of non-preemptive independent tasks on virtual
machines (VMs) is an important aspect of task scheduling in clouds. Whenever certain
VMs are overloaded and the remaining VMs are underloaded with tasks for processing,
the load has to be balanced to achieve optimal machine utilization. In this paper, the
authors proposed an algorithm named honey bee behavior inspired load balancing
(HBB-LB), which aims to achieve a well-balanced load across virtual machines for
maximizing the throughput.

They concluded that this algorithm also considers the priorities of the tasks.
Honey bee behavior inspired load balancing improves the overall throughput of
processing, and priority-based balancing focuses on reducing the amount of time a task
has to wait in the queue of a VM. [6]
The authors stated that cloud computing, a new concept, is a pool of virtualized
computer resources. An Internet-based development where dynamically scalable and
often virtualized resources are provided as a service over the Internet has become a
significant issue. Cloud computing describes both a platform and a type of application. A
cloud computing platform dynamically provisions, configures, reconfigures, and de-
provisions servers as needed. Servers in the cloud can be physical machines or virtual
machines spanned across the network. Thus it utilizes the computing resources (service
nodes) on the network to facilitate the execution of complicated tasks that require large-
scale computation. Selecting nodes (load balancing) for executing a task in cloud
computing must be considered, and to exploit the effectiveness of the resources, they
have to be properly selected according to the properties of the task.

In this paper, a soft computing based load balancing approach has been proposed.
A local optimization approach Stochastic Hill climbing is used for allocation of incoming
jobs to the servers or virtual machines (VMs). Performance of the algorithm is analyzed
both qualitatively and quantitatively using CloudAnalyst. CloudAnalyst is a CloudSim-
based Visual Modeller for analyzing cloud computing environments and applications. A
comparison is also made with Round Robin and First Come First Serve (FCFS)
algorithms. [7]
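The stochastic hill-climbing allocation summarized above might be sketched as follows, with an illustrative objective (minimize the load of the most-loaded VM) and illustrative parameters; this is a sketch of the general technique, not the cited paper's exact procedure.

```python
import random

def hill_climb_assign(job_sizes, n_vms, iters=500, seed=7):
    """Stochastic hill climbing: start from a random job-to-VM mapping,
    repeatedly try moving one random job to a random VM, and keep the
    move only if the maximum VM load does not get worse."""
    rng = random.Random(seed)
    assign = [rng.randrange(n_vms) for _ in job_sizes]

    def max_load(a):
        loads = [0] * n_vms
        for job, vm in zip(job_sizes, a):
            loads[vm] += job
        return max(loads)

    cur = max_load(assign)
    for _ in range(iters):
        j = rng.randrange(len(job_sizes))   # random neighbour: move one
        old = assign[j]                     # job to a random VM
        assign[j] = rng.randrange(n_vms)
        new = max_load(assign)
        if new <= cur:
            cur = new                       # accept improving/equal move
        else:
            assign[j] = old                 # otherwise revert
    return assign, cur

assign, cur = hill_climb_assign([5, 3, 8, 2, 4, 6], n_vms=3)
print(assign, cur)
```

Being a local search, it can get stuck in local optima, which is why the paper evaluates it empirically against Round Robin and FCFS rather than claiming optimality.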
CHAPTER 3

SYSTEM STUDY

3.1 EXISTING SYSTEM


In the existing system, a deadline look-ahead module was incorporated into each of
the algorithms to fire deadline exceptions, avoid missed deadlines and maintain the
system criticality. It develops a computing architecture and algorithms for
supporting soft real-time task scheduling in a cloud computing environment through the
dynamic provisioning of virtual machines. The results not only suggest the feasibility of
soft real-time scheduling of periodic real-time tasks in cloud computing but also that the
process can be scaled up to handle near-hard real-time task scheduling.

3.2 DRAWBACKS OF EXISTING SYSTEM


 Each task is considered as a separate unit.
 A single task is given to one selected virtual machine only.
 Additional virtual machine creation is required if the current cloud node is not
capable of executing a task.
 Splitting of tasks across various virtual machines is not considered.
3.3 PROPOSED SYSTEM

Along with the existing system implementation, the Adaptive Scoring Job Scheduling
(ASJS) algorithm is applied for job execution. In addition, jobs can be divided
into sub-tasks and given to one or more clusters. The storage capacity requirement is also
included in the Cluster Score calculation. During task assignment to VMs, tasks
requiring a particular operating system are fitted to VMs with the corresponding
operating system (either Windows or Linux). Also, a time factor is considered for
deciding task execution success or failure.
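As a rough sketch of the scheme described above, the following shows how a Cluster Score combining ATP, ACP and free storage capacity might be computed and used to pick an OS-matched cluster for a sub-task. The weights, field names and sample values are assumptions for illustration, not the thesis's exact formula.

```python
# Hedged sketch: score clusters and pick the best one whose operating
# system matches the sub-task's requirement. Weights are illustrative.

def cluster_score(atp, acp, free_storage, w_t=0.4, w_c=0.4, w_s=0.2):
    """Weighted score from transmission power, computing power and
    free storage capacity (all pre-normalized to [0, 1] here)."""
    return w_t * atp + w_c * acp + w_s * free_storage

def pick_cluster(clusters, task_os):
    """Return the best-scoring cluster with a matching OS, or None."""
    candidates = [c for c in clusters if c["os"] == task_os]
    if not candidates:
        return None
    return max(candidates, key=lambda c: cluster_score(
        c["atp"], c["acp"], c["free_storage"]))

clusters = [
    {"id": 1, "os": "linux",   "atp": 0.8, "acp": 0.6, "free_storage": 0.9},
    {"id": 2, "os": "windows", "atp": 0.7, "acp": 0.9, "free_storage": 0.5},
    {"id": 3, "os": "linux",   "atp": 0.5, "acp": 0.9, "free_storage": 0.4},
]
print(pick_cluster(clusters, "linux")["id"])  # 1
```

Recomputing these scores whenever a sub-task finishes is what lets the proposed system adapt cluster assignment while a job is still partially complete.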

3.4 ADVANTAGES
The proposed system has following advantages.

 Each job is divided into sub-tasks.
 A single job can be given to multiple selected clusters, since jobs are split into
sub-tasks.
 Cluster Score values are recalculated even while a job is only partially
completed, i.e. whenever a particular sub-task is finished.
 Storage capacity of cluster resources is taken into account.
 Multiple  and  values are calculated for each sub-task, so cluster assignment
is more effective than in the existing system.
 The job split method assists in faster job completion.
3.5 FEASIBILITY STUDY

The two criteria to judge feasibility are cost required and value to be attained. As such, a
well-designed feasibility study should provide a historical background of the business or project,
description of the product or service, accounting statements, details of the operations and
management, marketing research and policies, financial data, legal requirements and tax
obligations. Generally, feasibility studies precede technical development and project
implementation.

The feasibility of the system is analyzed in this phase, and a business proposal is put
forth with a general plan for the project and some cost estimates. During the system
analysis of the thesis, the feasibility study of the proposed system is carried out. This is
to ensure that the proposed system is not a burden to the company. For feasibility
analysis, some understanding of the major requirements for the system is essential.

Three key considerations involved in feasibility analysis are

a. Technical Feasibility
b. Economical Feasibility
c. Operational Feasibility
3.5.1 ECONOMIC FEASIBILITY

For any system if the expected benefits equal or exceed the expected costs, the
system can be judged to be economically feasible. In economic feasibility, cost benefit
analysis is done in which expected costs and benefits are evaluated. Economic analysis is
used for evaluating the effectiveness of the proposed system.

In economic feasibility, the most important part is cost-benefit analysis. As the name
suggests, it is an analysis of the costs to be incurred in the system and the benefits
derivable from it. The study is carried out to check the economic impact that the system
will have on the organization. The funds that the company can put into the research
and development of the system are limited. The designed application is found to
be more economical than the current system.

The application consumes fewer resources in the system so that the network
administrator can access the application more economically.

3.5.2 OPERATIONAL FEASIBILITY

The aspect of study is to check the level of acceptance of the system by the user.
This includes the process of training the user to use the system efficiently. The user must
not feel threatened by the system, instead must accept it as a necessity. The level of
acceptance by the users solely depends on the methods that are employed to educate the
user about the system and to make him familiar with it. The level of confidence must be
raised so that he is also able to make some constructive criticism, which is welcomed.

The parameters given to the application are few, so that any normal user can access
the application easily. Moreover, the application is command-line oriented, so
accessing the application is very easy by giving python <filename>.
3.5.3 TECHNICAL FEASIBILITY

In technical feasibility, the following issues are taken into consideration.

 Whether the required technology is available or not
 Whether the required resources are available: manpower (programmers,
testers and debuggers), software and hardware. Once the technical
feasibility is established, it is important to consider the monetary
factors also, since developing a particular system may be technically
possible but may require huge investments while the benefits are less.
For evaluating this, the economic feasibility of the proposed system is
carried out.

This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the
available technical resources, as this would in turn place high demands on the user.

Since the network administrators are going to test the algorithms, the system is
technically feasible. In addition, all the nodes in the Windows environment are provided
with the required runtime software, so a separate installation is not necessary to run or
test this application.

The algorithm, if applied in a grid environment, is capable of decreasing the completion
time of jobs, and the performance of ASJS is expected to be better than existing
approaches. In addition, the algorithm can be modified such that it can be executed in
real grid applications.
CHAPTER 4
SYSTEM SPECIFICATION

4.1 HARDWARE REQUIREMENTS


This section gives the details and specification of the hardware on which the
system is expected to work.

Processor : Intel Dual Core


RAM : 4 GB SD
Monitor : 17-inch Color
Hard disk : 500 GB
Keyboard : 102 keys
Mouse : Optical Mouse

4.2 SOFTWARE REQUIREMENTS


This section gives the details of the software that are used for the development.

Operating System : Windows 10


Backend : Python 3.6
CHAPTER 5
SOFTWARE DESCRIPTION

FRONT END

PYTHON DESCRIPTION

Python is a dynamic, high-level, free, open-source and interpreted programming
language. It supports object-oriented programming as well as procedure-oriented
programming. In Python, we don't need to declare the type of a variable because it is a
dynamically typed language.

Features in Python

There are many features in Python, some of which are discussed below.
1. Easy to code
Python is a high-level programming language. It is very easy to learn compared to
other languages like C, C#, JavaScript and Java, it is very easy to code in, and
anybody can learn the basics of Python in a few hours or days. It is also a
developer-friendly language.
2. Free and Open Source
Python is freely available on the official website and can be downloaded via the
Download Python link there. Since it is open source, the source code is also available
to the public, so you can download it, use it and share it.
3. Object-Oriented Language
One of the key features of Python is object-oriented programming. Python
supports object-oriented concepts such as classes, objects and encapsulation.
4. GUI Programming Support
Graphical user interfaces can be built using modules such as PyQt5, PyQt4,
wxPython or Tk. PyQt5 is the most popular option for creating graphical apps
with Python.
5. High-Level Language
Python is a high-level language. When we write programs in Python, we do not
need to remember the system architecture, nor do we need to manage the memory.
6. Extensible Feature
Python is an extensible language: performance-critical parts of a program can be
written in C or C++, compiled, and called from Python code.
7. Python is a Portable Language
Python is also a portable language. For example, Python code written for Windows
can be run unchanged on other platforms such as Linux, Unix and Mac.
8. Python is an Integrated Language
Python is also an integrated language, because it can easily be integrated with
other languages like C and C++.
9. Interpreted Language
Python is an interpreted language: code is executed line by line, and unlike
languages such as C, C++ and Java there is no separate compilation step, which
makes it easier to debug. The source code is first converted into an intermediate
form called bytecode.
10. Large Standard Library
Python has a large standard library which provides a rich set of modules and
functions, so you do not have to write your own code for every single thing. There
are libraries for tasks such as regular expressions, unit testing and web browsers.
11. Dynamically Typed Language
Python is a dynamically typed language: the type (for example int, double or long)
of a variable is decided at run time, not in advance, so we do not need to specify
the type of a variable.
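A minimal illustration of the dynamic typing described in feature 11: the same name can be rebound to values of different types at run time, and the type is determined only when the program runs.

```python
# The name x is first bound to an int, then rebound to a str;
# no type declaration is needed in either case.
x = 42
print(type(x).__name__)   # int
x = "forty-two"
print(type(x).__name__)   # str
```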
PYTHON APPLICATIONS

1. Web and Internet Development
Python lets you develop a web application without too much trouble. It has
libraries for internet formats and protocols such as HTML, XML, JSON, e-mail
processing, FTP and IMAP, and an easy-to-use socket interface. The package index
offers many more libraries:
 Requests – An HTTP client library
 BeautifulSoup – An HTML parser
 Feedparser – For parsing RSS/Atom feeds
 Paramiko – For implementing the SSH2 protocol
 Twisted Python – For asynchronous network programming
A gamut of frameworks is also available, such as Django and Pyramid, along with
microframeworks like Flask and Bottle. We can also write CGI scripts, and there are
advanced content management systems like Plone and Django CMS.
2. Applications of Python Programming in Desktop GUI
Most binary distributions of Python ship with Tk, a standard GUI library. It lets
you draft a user interface for an application. Apart from that, some toolkits are available:
 wxWidgets
 Kivy – for writing multitouch applications
 Qt via pyqt or pyside
And then we have some platform-specific toolkits:
 GTK+
 Microsoft Foundation Classes through the win32 extensions
 Delphi
3. Science and Numeric Applications
This is one of the very common applications of python programming. With its
power, it comes as no surprise that python finds its place in the scientific community. For
this, we have:
 SciPy – A collection of packages for mathematics, science, and engineering.
 Pandas- A data-analysis and -modeling library
 IPython – A powerful shell for easy editing and recording of work sessions. It
also supports visualizations and parallel computing.
 Software Carpentry Course – It teaches basic skills for scientific computing and
running bootcamps. It also provides open-access teaching materials.
 Also, NumPy lets us deal with complex numerical calculations.
4. Software Development Application
Software developers make use of python as a support language. They use it for
build-control and management, testing, and for a lot of other things:
 SCons – for build-control
 Buildbot, Apache Gump – for automated and continuous compilation and testing
 Roundup, Trac – for project management and bug-tracking.
 Roster of Integrated Development Environments
5. Python Applications in Education
Thanks to its simplicity, brevity, and large community, Python makes for a great
introductory programming language. Applications of python programming in education
has huge scope as it is a great language to teach in schools or even learn on your own.
If you still haven’t begun, we suggest you read up on what we have to say about
the white and dark sides of Python. Also, check out Python Features.
6. Python Applications in Business
Python is also a great choice to develop ERP and e-commerce systems:
 Tryton – A three-tier, high-level general-purpose application platform.
 Odoo – A management software with a range of business applications. With that,
it’s an all-rounder and forms a complete suite of enterprise-management
applications in-effect.
7. Database Access
With Python, you have:
 Custom and ODBC interfaces to MySQL, Oracle, PostgreSQL, MS SQL Server,
and others. These are freely available for download.
 Object databases like Durus and ZODB
 Standard Database API
8. Network Programming
With all those possibilities, how would Python slack in network programming? It
does provide support for lower-level network programming:
 Twisted Python – A framework for asynchronous network programming. We
mentioned it in section 2.
 An easy-to-use socket interface
CHAPTER 6
PROJECT DESCRIPTION

6.1 PROBLEM DEFINITION

The problem is to formally develop a cloud computing architecture and
algorithms for soft and near-hard real-time scheduling. The idea is to exploit the
features of cloud computing, namely the dynamic allocation and release of computing
resources in response to workload needs. For this purpose, deadline look-ahead
schedulers were developed to fire deadline exceptions before they occur.

The main problem is to analyze the possibility of applying real-time
(deadline-constrained) task scheduling on the IaaS cloud computing service model to
support soft and near-hard real-time applications. Examples of such applications are
cloud-based gaming tools, video-streaming software and telecommunication tools. Such
applications could benefit significantly from cloud computing despite the limitations
associated with the start-up time and communication of virtual machines.
6.3 MODULE DESCRIPTION

The following modules are present in the project.

6.3.1 TASKS DETAILS COLLECTION


6.3.2 EXECUTION OF MN SCHEDULER, VM NODE SCHEDULER AND
DEADLINE EXCEPTION HANDLER
6.3.3 ADD CLUSTER
6.3.4 ADD RESOURCE
6.3.5 ASSIGN CLUSTER TO RESOURCE
6.3.6 ADD JOB
6.3.7 CLUSTER SCORE CALCULATION FOR DATA INTENSIVE AND
COMPUTATION INTENSIVE STRATEGY
6.3.8 CLUSTER SCORE CALCULATION WITH STORAGE CAPACITY

6.3.1 TASKS DETAILS COLLECTION

In this module, task details such as required RAM (in MB), Storage Capacity (in
MB), CPU required (in MHz) and network bandwidth (in Mbps) are retrieved and stored
in various dictionary entries.
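As a sketch of this module, a tab-separated task record could be parsed into a dictionary as follows; the field order follows the module description, and the sample line and helper name are illustrative, not the project's actual code:

```python
def parse_task(line):
    """Parse one tab-separated task record into a dictionary."""
    tok = line.strip().split('\t')
    return {
        'taskid': tok[0],
        'rammb': int(tok[1]),            # required RAM in MB
        'storagemb': int(tok[2]),        # required storage in MB
        'cpumhz': int(tok[3]),           # required CPU in MHz
        'nwbandwidthmbps': int(tok[4]),  # required bandwidth in Mbps
    }

tasks = {}
for i, line in enumerate(['T1\t256\t1024\t300\t100'], start=1):
    tasks['T' + str(i)] = parse_task(line)
print(tasks['T1']['rammb'])  # 256
```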

6.3.2 EXECUTION OF MN SCHEDULER, VM NODE SCHEDULER AND
DEADLINE EXCEPTION HANDLER

In this module, the algorithms described in the system methodology are executed
to determine whether a new virtual machine needs to be created. Each virtual
machine's pending resources are updated every time a task is accepted and processed.
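The decision described above — check each VM's pending resources against a task's demands and fall back to creating a new VM when none fits — can be sketched as follows; the function names and sample values are illustrative assumptions, not the project's actual implementation:

```python
def can_host(vm, task):
    """True if the VM's pending resources cover the task's demands."""
    return (vm['rammb'] >= task['rammb']
            and vm['storagemb'] >= task['storagemb']
            and vm['cpumhz'] >= task['cpumhz']
            and vm['nwbandwidthmbps'] >= task['nwbandwidthmbps'])

def assign(task, vms):
    """Assign to the first VM that fits; otherwise signal a new VM."""
    for name, vm in vms.items():
        if can_host(vm, task):
            for key in ('rammb', 'storagemb', 'cpumhz', 'nwbandwidthmbps'):
                vm[key] -= task[key]     # update the VM's pending resources
            return name
    return 'new-vm'                      # no existing VM fits: provision one

vms = {'vm1': {'rammb': 200, 'storagemb': 5000,
               'cpumhz': 512, 'nwbandwidthmbps': 512}}
task = {'rammb': 150, 'storagemb': 1000,
        'cpumhz': 300, 'nwbandwidthmbps': 100}
first = assign(task, vms)
print(first)  # 'vm1'; a second identical task would need a new VM
```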

6.3.3 ADD CLUSTER

In this module, the cluster id is entered along with the ATP (Average Transmission
Power) and ACP (Average Computing Power) values, with the CS (Cluster Score) value
set to zero. The details are saved in the ‘Cluster’ table.
6.3.4 ADD RESOURCE

In this module, the resource id, Resource Name, IPAddress, CPU MHz, CPU
MHz available, Load Percent, CP (available computing power) and Storage Capacity
details are keyed in. The details are saved in ‘Resources’ table.

6.3.5 ASSIGN CLUSTER TO RESOURCE

In this module, the cluster id is fetched from ‘Clusters’ table and resource id is
fetched from ‘Resource’ table. The ids are selected from combo boxes and are saved in
‘Clusters_Resources’ table.

6.3.6 ADD JOB

In this module, the job id, name, required RAM in MB, required hard disk
storage in MB, CPU MHz and network bandwidth are keyed in and saved into the ‘Jobs’
table.

6.3.7 CLUSTER SCORE CALCULATION FOR DATA INTENSIVE AND
COMPUTATION INTENSIVE STRATEGY

In this module, the cluster score is calculated based on the following formula.

CSi = α · ATPi + β · ACPi

where CSi is the cluster score of cluster i, α and β are the weight values of ATPi and
ACPi respectively with α + β = 1, and ATPi and ACPi are the average transmission
power and average computing power of cluster i. ATPi is the average available
bandwidth that cluster i can supply to the job and is defined as:

ATPi = ( Σ j=1..m, j≠i Bandwidth_available i,j ) / (m − 1)

where Bandwidth_available i,j is the available bandwidth between cluster i and cluster j,
and m is the number of clusters in the entire grid system.

Similarly, ACPi is the average available CPU power that cluster i can supply to the job
and is defined as:

ACPi = ( Σ k=1..n CPU_speed k × (1 − load k) ) / n

where CPU_speed k is the CPU speed of resource k in cluster i, load k is the current load
of resource k in cluster i, and n is the number of resources in cluster i. Also let
CPk = CPU_speed k × (1 − load k) denote the available computing power of resource k.

Because the transmission power and the computing power of a resource will actually
affect the performance of job execution, these two factors are used for job scheduling.
Since the bandwidth between resources in the same cluster is usually very large, we only
consider the bandwidth between different clusters.

Local update and global update are used to adjust the score. After a job is submitted to a
resource, the status of the resource will change and local update will be applied to adjust
the cluster score of the cluster containing the resource. What local update does is to get
the available CPU percentage from Information Server and recalculate the ACP, ATP and
CS of the cluster. After a job is completed by a resource, global update will get
information of all resources in the entire grid system and recalculate the ACP, ATP and
CS of all clusters.
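The score calculation above can be sketched in Python; the helper names and sample values are illustrative, and the averaging follows the definitions in this section (bandwidths to the other clusters for ATP, load-discounted CPU speeds for ACP):

```python
def cluster_score(atp, acp, alpha=0.5, beta=0.5):
    """CSi = alpha * ATPi + beta * ACPi, with alpha + beta = 1."""
    assert abs(alpha + beta - 1.0) < 1e-9
    return alpha * atp + beta * acp

def avg_transmission_power(bandwidths):
    # bandwidths: available bandwidth from this cluster to each OTHER cluster
    return sum(bandwidths) / len(bandwidths)

def avg_computing_power(cpu_speeds, loads):
    # CPk = CPU_speed_k * (1 - load_k): available power of resource k
    cps = [s * (1 - l) for s, l in zip(cpu_speeds, loads)]
    return sum(cps) / len(cps)

atp = avg_transmission_power([100, 80, 120])         # Mbps to 3 other clusters
acp = avg_computing_power([2000, 3000], [0.5, 0.2])  # MHz speeds, current loads
cs = cluster_score(atp, acp, alpha=0.3, beta=0.7)
print(round(cs, 1))  # 1220.0
```

After a job is assigned (local update) or completed (global update), recomputing ATP, ACP and CS with fresh load figures, as described above, keeps the scores current.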

6.3.8 CLUSTER SCORE CALCULATION WITH STORAGE CAPACITY

In addition to the existing formula, the average storage capacity is calculated in the
same way as the Average Transmission Power, and a third weight γ is introduced so
that the sum of α, β and γ is 1. All other calculations follow the same scenario as the
previous module. The job is split into sub-tasks, each with its own α, β and γ values,
so that one cluster is assigned to one sub-task and other clusters to the remaining
sub-tasks.
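A sketch of the extended score, assuming a third weight (here gamma) applied to the average storage capacity so that the three weights sum to 1; the function name and sample values are illustrative:

```python
def cluster_score_with_storage(atp, acp, asc, alpha, beta, gamma):
    """Extended score: weights over ATP, ACP and average storage
    capacity (ASC) must sum to 1."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9
    return alpha * atp + beta * acp + gamma * asc

# ATP = 100 Mbps, ACP = 1700 MHz, average storage capacity = 500 (MB)
cs = cluster_score_with_storage(100, 1700, 500, 0.2, 0.5, 0.3)
print(cs)
```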
6.4 SYSTEM ARCHITECTURE

[Figure: System architecture. Task details (CPU, RAM, bandwidth, OS) are read from
the task database and passed to the master node scheduler (SRTA/AJSC), which analyses
the false-positive rate and calculates the minimum VM resources required. Clusters are
configured; ATP, ACP and memory are calculated and clusters are ranked by cluster
score. If a VM with sufficient resources is available at runtime, the task is allocated
to it; otherwise a new VM is allocated. Completed allocations update the task data, and
remaining tasks return to the task pool for further job scheduling.]
6.6 DATA FLOW DIAGRAM

[Figure: Data flow diagram. The admin logs in and can add clusters, add resources, and
assign resources to clusters (saved in the Cluster, Resource and Clusters_Resources
tables). The admin then adds jobs and calculates the cluster score for each job using
the Resource and Clusters_Resources tables.]
6.9 INPUT DESIGN

Input design is the process of converting user-originated inputs into a computer-
understandable format. It is one of the most expensive phases of operating a
computerized system and is often a major source of problems: many faults in a system
can be traced back to faulty input design and methods. Every aspect of input design
should therefore be analyzed and designed with utmost care.
The system takes input from the users, processes it and produces an output. Input
design is the link that ties the information system to the world of its users, so the
system should be user-friendly in gathering the required information. The decisions
made during input design are:
 To provide a cost-effective method of input.
 To achieve the highest possible level of accuracy.
 To ensure that the input is understood by the user.

The system analyst decides the following input design details: what data to input,
what medium to use, how the data should be arranged or coded, which data items and
transactions need validation to detect errors, and finally the dialogue to guide users
in providing input.
The input data of a system need not necessarily be raw data captured from scratch;
it can also be the output of another system or subsystem. Input design covers all
phases of input, from the creation of the initial data to the actual entry of the data
into the system for processing. It involves identifying the data needed, specifying the
characteristics of each data item, capturing and preparing data for computer
processing, and ensuring the correctness of the data. Any ambiguity in the input leads
to faults in the output, so the goal of input design is to make data entry as easy and
error-free as possible.
6.10 OUTPUT DESIGN
Output design generally refers to the results and information generated by the
system. For many end users, output is the main reason for developing the system and
the basis on which they evaluate the usefulness of the application.

The output is designed to be attractive, convenient and informative. The
application is coded in Python with various features that make the console output
more pleasing.

As outputs are the most important source of information for users, good output
design improves the system's relationship with its users and aids decision-making.
Form design elaborates the way output is presented and the layout available for
capturing information.
CHAPTER 7
SYSTEM TESTING
7.1 TESTING

Software testing is the process of evaluating the functionality of a software
application to find out whether the developed software meets the specified
requirements, and of identifying defects so that a defect-free, quality product can
be delivered.

 Manual Testing: Manual testing is the process of testing the software manually
to find defects. The tester takes the perspective of an end user and ensures that
all features work as described in the requirement document. In this process,
testers execute the test cases and generate the reports manually, without using
any automation tools.
 Automation Testing: Automation testing is the process of testing the software
using an automation tool to find the defects. In this process, testers execute the
test scripts and generate the test results automatically by using automation tools.
Some of the famous automation testing tools for functional testing are QTP/UFT
and Selenium.

There are levels in software testing. They are


 Unit testing
 Integration testing
 System testing
 Acceptance testing
7.2 Unit testing

Unit testing checks whether the individual modules of the source code work
properly, i.e. each unit of the application is tested separately by the developer in the
developer’s environment. It is also known as Module Testing or Component Testing.
In this project, unit testing checks that every module works without error for the
given inputs, using the various methods and algorithms described in the project.
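As a sketch, a unit test for one small scheduling helper might look like this using Python's standard unittest module; the can_host helper is a hypothetical example for illustration, not the project's actual code:

```python
import unittest

def can_host(vm, task):
    """Unit under test: does a VM's free RAM cover a task's demand?"""
    return vm['rammb'] >= task['rammb']

class TestCanHost(unittest.TestCase):
    def test_fits(self):
        self.assertTrue(can_host({'rammb': 512}, {'rammb': 256}))

    def test_does_not_fit(self):
        self.assertFalse(can_host({'rammb': 128}, {'rammb': 256}))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCanHost)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```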

7.3 Integration Testing

Integration testing is the process of testing the connectivity and data transfer
between unit-tested modules. It is also known as I&T Testing or String Testing, and is
subdivided into the Top-Down, Bottom-Up and Sandwich (a combination of Top-Down
and Bottom-Up) approaches.

In this project, the modules are integrated to form a complete application and to
improve the response time: the job is split into sub-tasks and the cluster score values
are calculated so that the job completes earlier.

7.4 System Testing (end to end testing)

It is a black-box test of the fully integrated application, also called end-to-end
scenario testing. It ensures that the software works on all intended target systems,
verifies every input in the application against the desired outputs, and tests the
user’s experience with the application. System testing is done after both unit and
integration testing are complete, and checks whether the algorithms and functionality
still behave correctly after the corrections made during the earlier testing rounds.

7.5 Acceptance Testing

Acceptance testing is performed to obtain customer sign-off so that the software
can be delivered and payment received. Its types are Alpha, Beta and Gamma testing.
Acceptance testing comes last because it is the final step before the project is
delivered. In this project, acceptance testing of the Adaptive Scoring Job Scheduling
(ASJS) algorithm for the grid environment was carried out with the given inputs by
various employees in the testing department, and the scheduling of independent
periodic real-time tasks with implicit deadlines was verified.
CHAPTER 8
SYSTEM IMPLEMENTATION
Once the initial design of the system was completed, the client was consulted for
acceptance of the design so that further system development could proceed. After
development, a demonstration of the working system was given; the aim of this
illustration was to identify any malfunction of the system. After management approved
the system, it was implemented in the concern and initially run in parallel with the
existing manual system. The system has been tested with live data and has proved to be
error-free and user-friendly.
Implementation is the process of converting a new or revised system design into an
operational one. When the initial design was done, a demonstration of the working
system was given to the end users. This process is used to verify the system and
identify any logical errors in its working by feeding various combinations of test data.
After approval by both the end users and management, the system was implemented.
System implementation is made up of many activities; the major ones are as
follows.
Coding
Coding is the process whereby the physical design specifications created by the
analysis team are turned into working computer code by the programming team.
Testing
Once coding has begun, testing can proceed in parallel, as each program module
can be tested individually.
Installation
Installation is the process during which the current system is replaced by the new
system. This includes conversion of existing data, software, documentation and work
procedures to those consistent with the new system.
Documentation
Documentation results from the installation process; user guides provide
information on how to use the system and its flow.
Training and support
A training plan is a strategy for training users so that they quickly learn the new
system; its development probably began earlier in the project.
The best-suited application package to develop the system is Python IDE 3.6
under the Windows XP environment.
CHAPTER 9
CONCLUSION AND FUTURE ENHANCEMENTS

The Master Node and virtual machines attempt to assign each task to an empty
processor; in the absence of such a processor, a new VM node is dynamically created
and assigned the task. The results of the demonstrative implementation of the proposed
architecture and algorithms in the present study were expressed in terms of the number
of deadline exceptions fired by each algorithm, the number of extra resources
provisioned to each algorithm to handle the deadline exceptions, and the average
response time of the tasks.
This project proposes an adaptive scoring method to schedule jobs in a grid
environment. ASJS selects the fittest resource to execute a job according to the status
of the resources. Local and global update rules are applied to obtain the newest status
of each resource: the local update rule refreshes the status of the resource and cluster
selected to execute a job immediately after the job is assigned, and the Job Scheduler
uses this newest information to assign the next job.
The results suggest the applicability of scheduling periodic real-time tasks in
cloud computing. Beyond the current problem, other interesting and relevant issues
remain to be addressed in future work: the non-consideration of other important
scheduling parameters, such as cost and energy consumption, constitutes a limitation
of the present study. Further experimental work is planned to consider such parameters
as secondary priorities in addition to meeting the constrained deadlines of real-time
tasks.
SAMPLE CODING
import matplotlib.pyplot as plt
import random as rnd
import numpy as np
"""
First 100 lines are Algorithm 1 in Base Paper - Schedulability
Virtualized cloud environment
"""
#Receive real time tasks
outputstr = ''
with open('tasks.txt', 'r') as f:
    lines = f.readlines()
totallines = len(lines)

if totallines >= 5:
    for i in range(0, 5):
        print(lines[i])
else:
    print(lines)

tasks = {}
tid = 1
for l in lines:
    tok = l.split('\t')
    singletask = {}
    singletask['taskid'] = tok[0]
    singletask['rammb'] = int(tok[1])
    singletask['storagemb'] = int(tok[2])
    singletask['cpumhz'] = int(tok[3])
    singletask['nwbandwidthmbps'] = int(tok[4])
    singletask['windowsorlinux'] = int(tok[5])
    tasks["T" + str(tid)] = singletask
    tid += 1
print('TASK DETAILS')
print('------------')
#print(tasks)

for i1 in range(1, tid):
    dict1 = tasks["T" + str(i1)]
    print('Task Id : ', dict1['taskid'])
    outputstr = outputstr + '\nTask Id : ' + dict1['taskid']
    print('===========')
    outputstr = outputstr + '\n==========='
    print(' RAM (MB) : ', dict1['rammb'])
    outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
    print(' Storage (MB) : ', dict1['storagemb'])
    outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
    print(' CPU (MHz) : ', dict1['cpumhz'])
    outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
    print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
    outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
    print(' Windows(1)/Linux(2) : ', dict1['windowsorlinux'])
    outputstr = outputstr + '\n Windows(1)/Linux(2) : ' + str(dict1['windowsorlinux'])
    print('------------')
    outputstr = outputstr + '\n------------'
taskscount = len(tasks)
#======== Create VM pool
VM = {}
OriginalRAMs = []
OriginalSTs = []
OriginalCPUs = []
OriginalNBs = []
for i in range(1, 11):
    vmresource = {}
    vmresource['resid'] = i
    vmresource['rammb'] = 100 + round((rnd.random() * 100), 0)
    if i == 6:
        vmresource['rammb'] = 500
    OriginalRAMs.append(vmresource['rammb'])
    vmresource['storagemb'] = 10000 + round(10240 + (rnd.random() * 1024), 0)
    OriginalSTs.append(vmresource['storagemb'])
    vmresource['cpumhz'] = 512
    OriginalCPUs.append(vmresource['cpumhz'])
    vmresource['cpuloadpercent'] = 0
    vmresource['nwbandwidthmbps'] = 512
    OriginalNBs.append(vmresource['nwbandwidthmbps'])
    if i <= 5:
        vmresource['windowsorlinux'] = 1
    else:
        vmresource['windowsorlinux'] = 2
    VM['vm' + str(i)] = vmresource
print('VIRTUAL MACHINE RESOURCE DETAILS')
outputstr = outputstr + '\nVIRTUAL MACHINE RESOURCE DETAILS'
print('--------------------------------')
outputstr = outputstr + '\n--------------------------------'
#print(VM)
RAMs = []
for i1 in range(1, 11):
    dict1 = VM['vm' + str(i1)]
    print('Resource Id : ', dict1['resid'])
    outputstr = outputstr + '\nResource Id : ' + str(dict1['resid'])
    print('===========')
    outputstr = outputstr + '\n==========='
    print(' RAM (MB) : ', dict1['rammb'])
    RAMs.append(dict1['rammb'])
    outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
    print(' Storage (MB) : ', dict1['storagemb'])
    outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
    print(' CPU (MHz) : ', dict1['cpumhz'])
    outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
    print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
    outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
    print('--------------------------------')
    outputstr = outputstr + '\n--------------------------------\n'
vmcount = len(VM)

xaxistitles = []
for i in range(0, vmcount):
    dicttmp = VM['vm' + str(i+1)]
    xaxistitles.append('VM' + str(dicttmp['resid']))
#plt.bar(xaxistitles, RAMs)
fig, ax = plt.subplots()
plt.xlim(0, 10)
plt.ylim(0, max(RAMs))
width = 0.35
ax.bar(xaxistitles, RAMs, width, color='r')
ax.set_title('RAM Capacity of VMs')
plt.show()

VMPendingResources = {}
#all resources are available initially
for j in range(1, vmcount+1):
    vmresource = VM['vm' + str(j)]
    vmpendingresource = {}
    vmpendingresource['resid'] = vmresource['resid']
    vmpendingresource['rammb'] = vmresource['rammb']
    vmpendingresource['storagemb'] = vmresource['storagemb']
    vmpendingresource['cpumhz'] = vmresource['cpumhz']
    vmpendingresource['cpuloadpercent'] = vmresource['cpuloadpercent']
    vmpendingresource['nwbandwidthmbps'] = vmresource['nwbandwidthmbps']
    vmpendingresource['windowsorlinux'] = vmresource['windowsorlinux']
    vmpendingresource['taskids'] = []
    VMPendingResources['vm' + str(j)] = vmpendingresource

#Assign tasks to VMs
print("Tasks Count " + str(taskscount))
taskassignedcounter = 1
exitfor = 0
for i in range(1, taskscount+1):
    vmidx = 1
    print('i = ', i)
    for j in range(1, vmcount+1):
        print("Task " + str(taskassignedcounter) + " to VM " + str(vmidx))
        outputstr = outputstr + '\nTask ' + str(taskassignedcounter) + ' to VM ' + str(vmidx)
        vmtmp = VMPendingResources['vm' + str(j)]
        vmtmp['taskids'].append(str(taskassignedcounter))
        VMPendingResources['vm' + str(j)] = vmtmp
        taskassignedcounter += 1
        vmidx += 1
        if vmidx > vmcount:
            i += vmcount  #note: reassigning i has no effect on the range() loop variable
            print('i new =' + str(i))
        if taskassignedcounter > taskscount:
            exitfor = 1
            break
    if exitfor == 1:
        break

print('--------------------------------')
outputstr = outputstr + '\n--------------------------------'
print('Available Resources in VMs')
outputstr = outputstr + '\nAvailable Resources in VMs'
#print(VMPendingResources)
print('--------------------------------')
outputstr = outputstr + '\n--------------------------------'

for i1 in range(1, 11):
    dict1 = VMPendingResources['vm' + str(i1)]
    print('Resource Id : ', dict1['resid'])
    outputstr = outputstr + '\nResource Id : ' + str(dict1['resid'])
    print('===========')
    outputstr = outputstr + '\n==========='
    print(' RAM (MB) : ', dict1['rammb'])
    outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
    print(' Storage (MB) : ', dict1['storagemb'])
    outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
    print(' CPU (MHz) : ', dict1['cpumhz'])
    outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
    print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
    outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
    print('--------------------------------')
    outputstr = outputstr + '\n--------------------------------'
failedvm = 0
#Task execution
vmpendingcount = len(VMPendingResources)
suspect = 0  #for algorithm 2
for j in range(1, vmpendingcount+1):
    vmtmp = VMPendingResources['vm' + str(j)]
    print(vmtmp['taskids'])
    outputstr = outputstr + '\nTask Ids of VM'
    for tmp1 in vmtmp['taskids']:
        outputstr = outputstr + '\n' + str(tmp1)
    excep = 0
    for x in vmtmp['taskids']:
        task = tasks['T' + str(x)]
        if task['windowsorlinux'] == vmtmp['windowsorlinux']:
            vmtmp['rammb'] = vmtmp['rammb'] - task['rammb']
            vmtmp['storagemb'] = vmtmp['storagemb'] - task['storagemb']
            vmtmp['cpumhz'] = vmtmp['cpumhz'] - task['cpumhz']
            vmtmp['nwbandwidthmbps'] = vmtmp['nwbandwidthmbps'] - task['nwbandwidthmbps']
            VMPendingResources['vm' + str(j)] = vmtmp
            dict1 = VMPendingResources['vm' + str(j)]
            print('Pending Resource Id with new Status : ', dict1['resid'])
            outputstr = outputstr + '\nPending Resource Id with new Status : ' + str(dict1['resid'])
            print('===========')
            outputstr = outputstr + '\n==========='
            print(' RAM (MB) : ', dict1['rammb'])
            outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
            print(' Storage (MB) : ', dict1['storagemb'])
            outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
            print(' CPU (MHz) : ', dict1['cpumhz'])
            outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
            print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
            outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
            print('--------------------------------')
            outputstr = outputstr + '\n--------------------------------'
        if vmtmp['rammb'] < 0:
            print('DeadLine Exception due to insufficient RAM')
            outputstr = outputstr + '\nDeadLine Exception due to insufficient RAM'
            excep = 1
        if vmtmp['storagemb'] < 0:
            print('DeadLine Exception due to insufficient Storage')
            outputstr = outputstr + '\nDeadLine Exception due to insufficient Storage'
            excep = 1
        if vmtmp['cpumhz'] < 0:
            print('DeadLine Exception due to insufficient CPU')
            excep = 1
        if vmtmp['nwbandwidthmbps'] < 0:
            print('DeadLine Exception due to insufficient Network Bandwidth')
            outputstr = outputstr + '\nDeadLine Exception due to insufficient Network Bandwidth'
            excep = 1
        if rnd.random() < 0.1:
            print('DeadLine Exception due to delayed time')
            outputstr = outputstr + '\nDeadLine Exception due to delayed time'
            suspect = 1
            excep = 1
        if excep == 1:
            print('VM' + str(j) + ' Failed for task ' + task['taskid'])
            outputstr = outputstr + '\nVM' + str(j) + ' Failed for task ' + task['taskid']
            failedvm = j
            print('***Look for a node with idle processor***')
            outputstr = outputstr + '\n***Look for a node with idle processor***'
            VMPendingResources['vm' + str(j)] = vmtmp
            break
        else:
            VMPendingResources['vm' + str(j)] = vmtmp
            dict1 = VMPendingResources['vm' + str(j)]
            print('Pending Resource Id with new Status : ', dict1['resid'])
            outputstr = outputstr + '\nPending Resource Id with new Status : ' + str(dict1['resid'])
            print('===========')
            outputstr = outputstr + '\n==========='
            print(' RAM (MB) : ', dict1['rammb'])
            outputstr = outputstr + '\n RAM (MB) : ' + str(dict1['rammb'])
            print(' Storage (MB) : ', dict1['storagemb'])
            outputstr = outputstr + '\n Storage (MB) : ' + str(dict1['storagemb'])
            print(' CPU (MHz) : ', dict1['cpumhz'])
            outputstr = outputstr + '\n CPU (MHz) : ' + str(dict1['cpumhz'])
            print(' Network Bandwidth (Mbps) : ', dict1['nwbandwidthmbps'])
            outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dict1['nwbandwidthmbps'])
            print('--------------------------------')
            outputstr = outputstr + '\n--------------------------------'
    if excep == 1:
        break

if excep == 1:
    found = 0
    for j in range(1, vmpendingcount+1):
        if failedvm != j:
            ram = 0
            sto = 0
            cpu = 0
            bw = 0
            vmtmp = VMPendingResources['vm' + str(j)]
            if vmtmp['rammb'] > task['rammb']:
                ram = 1
            if vmtmp['storagemb'] > task['storagemb']:
                sto = 1
            if vmtmp['cpumhz'] > task['cpumhz']:
                cpu = 1
            if vmtmp['nwbandwidthmbps'] > task['nwbandwidthmbps']:
                bw = 1
            if ram == 1 and sto == 1 and cpu == 1 and bw == 1:
                print('VM' + str(j) + ' can take the task: ' + task['taskid'])
                outputstr = outputstr + '\nVM' + str(j) + ' can take the task: ' + task['taskid']
                found = 1
                break
    if found == 0:
        print('New VM created:')
        outputstr = outputstr + '\nNew VM created:'
        print('Task ' + task['taskid'] + ' needs to be assigned to that new VM')
        outputstr = outputstr + '\nTask ' + task['taskid'] + ' needs to be assigned to that new VM'
        vmnew = VM['vm' + str(1)]
        vmnew['resid'] = vmcount + 1
        VM['vm' + str(vmcount+1)] = vmnew
        dicttmp = VM['vm' + str(vmcount+1)]
        outputstr = outputstr + '\nResource Id : ' + str(dicttmp['resid'])
        print('Resource Id : ', str(dicttmp['resid']))
        print('===========')
        outputstr = outputstr + '\n==========='
        outputstr = outputstr + '\n RAM (MB) : ' + str(dicttmp['rammb'])
        print(' RAM (MB) : ' + str(dicttmp['rammb']))
        outputstr = outputstr + '\n Storage (MB) : ' + str(dicttmp['storagemb'])
        print(' Storage (MB) : ' + str(dicttmp['storagemb']))
        outputstr = outputstr + '\n CPU (MHz) : ' + str(dicttmp['cpumhz'])
        print(' CPU (MHz) : ' + str(dicttmp['cpumhz']))
        outputstr = outputstr + '\n Network Bandwidth (Mbps) : ' + str(dicttmp['nwbandwidthmbps'])
        print(' Network Bandwidth (Mbps) : ' + str(dicttmp['nwbandwidthmbps']))
with open('Output.txt', 'w') as of:
    of.write(outputstr)

#DRAWING CHARTS
#Drawing RAM capacity
xaxistitles = []
RAMs = []
for i in range(0, vmcount):
    dicttmp = VMPendingResources['vm' + str(i+1)]
    RAMs.append(dicttmp['rammb'])
    xaxistitles.append('VM' + str(dicttmp['resid']))
fig, ax = plt.subplots()
plt.xlim(0, 10)
plt.ylim(0, max(OriginalRAMs) + 10)
width = 0.35
ax.plot(OriginalRAMs)
ax.plot(RAMs)
ax.bar(xaxistitles, RAMs, width, color='r')
ax.set_title('Virtual Machines Pending RAM Capacity')
plt.show()

#Drawing storage (MB) capacity
xaxistitles = []
StorageMBs = []
for i in range(0, vmcount):
    dicttmp = VMPendingResources['vm' + str(i+1)]
    StorageMBs.append(dicttmp['storagemb'])
    xaxistitles.append('VM' + str(dicttmp['resid']))
fig, ax = plt.subplots()
plt.xlim(0, 10)
plt.ylim(0, max(OriginalSTs) + 10)
width = 0.35
ax.plot(StorageMBs)
ax.plot(OriginalSTs)
ax.bar(xaxistitles, StorageMBs, width, color='r')
ax.set_title('Virtual Machines Pending Storage (MB) Capacity')
plt.show()

#Drawing CPU (MHz) capacity
xaxistitles = []
CPUMHzs = []
for i in range(0, vmcount):
    dicttmp = VMPendingResources['vm' + str(i+1)]
    CPUMHzs.append(dicttmp['cpumhz'])
    xaxistitles.append('VM' + str(dicttmp['resid']))
fig, ax = plt.subplots()
plt.xlim(0, 10)
plt.ylim(0, max(OriginalCPUs) + 10)
width = 0.35
ax.plot(CPUMHzs)
ax.plot(OriginalCPUs)
ax.bar(xaxistitles, CPUMHzs, width, color='b')
ax.set_title('Virtual Machines Pending CPU (MHz) Capacity')
plt.show()

#Drawing network bandwidth capacity
xaxistitles = []
NBs = []
for i in range(0, vmcount):
    dicttmp = VMPendingResources['vm' + str(i+1)]
    NBs.append(dicttmp['nwbandwidthmbps'])
    xaxistitles.append('VM' + str(dicttmp['resid']))
fig, ax = plt.subplots()
plt.xlim(0, 10)
plt.ylim(0, max(OriginalNBs) + 10)
width = 0.35
ax.plot(NBs)
ax.plot(OriginalNBs)
#ax.bar(xaxistitles, OriginalNBs, 0.45, color='b')
ax.bar(xaxistitles, NBs, width, color='g')
ax.plot(NBs, 'g*', OriginalNBs, 'ro')
ax.scatter(NBs, OriginalNBs)
ax.set_title('Virtual Machines Pending Network Bandwidth (Mbps) Capacity')
plt.hist(NBs, bins=5)
plt.show()
#print(NBs)
#print(OriginalNBs)

#Algorithm 2
if suspect == 1:
    print('Task: ' + task['taskid'] + ' is removed from queue')
    print('Task: ' + task['taskid'] + ' is suspected and so sent to master')
    print('Task: ' + task['taskid'] + ' is added to the Master Scheduler\'s urgent queue.')
SCREEN SHOTS