Characterization of Elasticity in Clouds with Promise of SLAs
Sudha Pelluri, Ramachandram Sirandas, Uma Krishna. H, Rasagna Veeramallu
Department of Computer Science & Engineering
University College of Engineering
Osmania University
Hyderabad, India
{sudhapv@gmail.com}

Abstract- Cloud computing has become very popular due to
users' ability to effectively access and use large pools of
resources, which are virtualized and dynamically provisioned in
response to variable workloads and usage. The promise of
elasticity makes the cloud one of the most sought-after
approaches in modern organizations. In this paper we discuss
the need for scaling, scaling approaches, and an analysis of
various scaling mechanisms in clouds. In clouds, the resource
demands of applications keep changing. The cloud should
therefore scale its resources so that they are provisioned
effectively as demand changes. Our
contention is that scaling cannot be absolute: there are
constraints in the form of Service Level Agreements (SLAs) that
the provider must live up to in order to survive market challenges. We
analyze the benefits and drawbacks of horizontal and vertical
scaling and find that in certain cases a hybrid approach
works better. An algorithm for managing SLAs by efficient scaling
is discussed.
Keywords- Elasticity, horizontal, vertical and hybrid
scaling, Service Level Agreements, deadlines
I. INTRODUCTION
Cloud computing promises cost-effective computing solutions
for end-users as well as improved resource utilization for cloud
providers. Cloud computing owes much of its popularity to its
elastic nature. It is viewed as a pay-per-use or
pay-as-you-go model. Elasticity is a key characteristic of cloud
computing: it enables users to acquire
and then release resources dynamically as
demand changes. It is the ability of a system to grow and
shrink dynamically so that it uses only those resources which are
required for the current workload. Appropriately dimensioning the
resources for an application is a difficult task, however, because
most applications have large and fluctuating loads. In
predictable situations, resources are provisioned in advance
by means of capacity planning techniques. For unplanned
load spikes, an automatic scaling (auto-scaling) system
is used, which relieves the end user from resource allocation
decisions. The auto-scaling system adjusts the resources
allocated to an application depending on its needs at any time.
Auto-scaling should minimize the cost of resource
acquisition while complying with the Service Level
Objectives (SLOs).
The need for huge upfront investments, manpower and energy for
maintaining large pools of resources is a thing of the past.
These resources are now available in the form of virtual machines.
Hence, users can request a suitable number of virtual machines
for running their applications and pay only for those.
Clouds have three service models: Infrastructure-as-a-Service
(IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service
(SaaS). IaaS is the provisioning of IT and network resources
such as processing, storage, bandwidth and management
middleware. Examples are Amazon EC2 [1], RackSpace [2]
and the new Google Compute Engine [3]. Platform-as-a-Service
(PaaS) provides programming environments and tools,
supported by cloud providers, that consumers use to
build and deploy applications onto the cloud
infrastructure. Examples of PaaS include Heroku [4],
Google App Engine [7], Amazon Elastic Beanstalk [3],
Force.com [6] and Microsoft Windows Azure [5].
Software-as-a-Service (SaaS) provides hosted vendor applications.
Examples are Google Apps, Microsoft Office 365 [8] and
Salesforce.com [9].
Our work considers the IaaS provider perspective. We
consider a typical scenario in which a client hosts an application
using the resources provided by an IaaS cloud provider. The client
is the owner of the application and the user is the end user of
the application. Specific algorithms are described that both
maximize profit by scaling only when required and live up to the
SLAs agreed upon with the customers.
II. RESOURCE PROVISIONING IN CLOUDS
With the emergence of IaaS clouds, enterprises need not
own and maintain their own computing infrastructure to
build their applications. The end users and the cloud providers
enter into contracts such as Service Level Agreements (SLAs),
which describe the user requirements in terms of computing
capacity, response times, deadlines for tasks, cost, etc. This
information helps the cloud providers determine the amount
of infrastructure they need to allocate to the users. This is done
through provisioning algorithms, which allocate the virtual
machines running user applications to the physical cloud
infrastructure provided by the cloud provider. VM
provisioning is the provisioning of physical resources in the
cloud to different VMs to run applications. It helps in
efficient load balancing, application execution and proactive
failure handling, thereby improving the reliability and
efficiency of the entire cloud and of the applications running in
it. VMs can be assigned to the physical machines within a
cloud by either static or dynamic VM provisioning.
Resource provisioning can be done statically or
dynamically depending upon the user application demands. If
the demands are predictable or do not change
frequently, static provisioning can be used. If the
demands fluctuate, the virtual machines
(VMs) have to be migrated to new compute nodes on the fly
in the cloud. This technique allows adaptability, but it may
negatively affect the end users through runtime
overhead costs as well as potential execution delays. A better
provisioning scheme is a hybrid approach that
starts with a good static placement, which is then
tuned by dynamic re-provisioning as long as doing so is effective
and not excessively costly.

III. ELASTICITY BY EFFICIENT SCALING
Elasticity can be described as the ability to scale the
infrastructure within minutes or seconds, which can prevent
both the under-utilization and the over-utilization of in-house
resources [18]. Elasticity is an important characteristic of cloud
platforms that enables on-demand resource acquisition in
response to time-varying workloads, and it
benefits both cloud providers
and end-users. It reduces server under-utilization for cloud
providers while guaranteeing Quality of Service (QoS) for
end-users. Cloud providers offer elastic capabilities
by means of rule-based algorithms, in which
scaling conditions are defined in terms of a target metric
reaching some threshold [19]. Rule-based auto-scaling
techniques are offered by several cloud providers such as Amazon,
as well as by RightScale [20] and Azure Watch [21].
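A rule of the kind described above can be sketched in a few lines. This is a minimal illustration only, not the rule engine of any particular provider: the class `ThresholdRule`, the function `decide_vm_count` and the threshold values (80% and 30% CPU utilization) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """Hypothetical rule: scale out above `upper`, scale in below `lower`."""
    upper: float = 0.80  # assumed CPU utilization that triggers scale-out
    lower: float = 0.30  # assumed CPU utilization that triggers scale-in
    min_vms: int = 1
    max_vms: int = 10


def decide_vm_count(rule: ThresholdRule, cpu_utilization: float, current_vms: int) -> int:
    """Apply the rule once and return the new number of VMs."""
    if cpu_utilization > rule.upper:
        return min(current_vms + 1, rule.max_vms)  # add one replica, respect the cap
    if cpu_utilization < rule.lower:
        return max(current_vms - 1, rule.min_vms)  # remove one replica, keep the floor
    return current_vms  # metric is inside the band: no action
```

Because the rule only reacts once the metric has already crossed a threshold, it illustrates the purely reactive behaviour of rule-based scaling.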
In a cloud, customers need not provision for peak
workloads. They can request additional resources when the
workload increases and release them when the
workload decreases. This is made possible by online
provisioning and de-provisioning by means of server
virtualization. Virtualization is used to create virtual machines.
Virtual machines can be cloned when the customer requests
additional resources due to an increase in workload.
Likewise, virtual machines can be shut down when the
workload demand comes down. To add or
remove resources from a running virtual machine, virtualization
provides another technique, online VM resizing. It is
supported by most modern hypervisors and allows
CPU or memory resources to be added to a virtual machine without
bringing it down.
Public clouds usually provide an interface for requesting
additional virtual machine instances, whereas private clouds or
virtualized data centers provide a more fine-grained interface,
which enables users to request additional resources for
existing VM instances as well as new VM
instances. This lets enterprises provision only the
required amount of resources in clouds and thereby derive cost
savings. An application should be able to dynamically identify
a configuration such that cloud resources are
used in a profitable manner.
Flexible resource provisioning by instantiating virtual
machines on physical machines is one of the key requirements
of cloud computing. Many IaaS providers use
virtualization technologies to encapsulate applications
and to provide isolation among uncooperative users. But if
the physical resources are statically partitioned into virtual
machines according to each application's peak demand, resource
utilization will be poor. Overbooking [10] can be used to improve
resource utilization. To achieve
performance isolation among co-located applications, resource
capping is applied, which guarantees that an application cannot
use more resources than are allocated to it. But resource
demands are rarely static. If the resource cap is set too low,
it may lead to Service Level Objective (SLO) violations.
If the resource cap is set too high, resources are wasted
and the cost incurred by the cloud
provider increases. The way to tackle this problem is to have
an elastic resource scaling system which dynamically adjusts
the resource cap based upon changes in the application's
resource demands [11].
In response to demand variations, an application can be
scaled by means of two approaches: horizontal scaling
and vertical scaling. Scalability is the ability to marshal
resources so that an application can run smoothly even as the
number of users increases.
A. Horizontal Scaling
Cloud application scalability is of great importance in the
present era of cloud computing. Scaling is typically automated
by means of a set of rules defined by the cloud service
provider. These rules govern how the service can scale up or
down in order to adapt to a varying or fluctuating load. The
rules are composed of conditions which, when met, trigger
actions on the cloud infrastructure or
platform, as the case may be. The degree to which the rules
can be customized and the degree of automation vary.
Some systems allow users to build simple conditions based
upon fixed infrastructure or platform metrics such as CPU
and memory, while others allow the use of server-level metrics
such as cost-to-benefit ratio. More complex conditions,
such as arithmetic and logical combinations of simple rules,
can also be included. When the conditions are met, the scaling
action is performed. Horizontal scaling is done by
adding new server replicas and load balancers so that
the load can be distributed among all available replicas. For
this purpose the load balancer has to support the addition of
new servers to the set among which it distributes the load.
Amazon performs the load balancing of replicated VMs
through its Elastic Load Balancing capabilities. Load balancers
and the algorithms used to distribute the load among the
servers play an important role in the horizontal scalability of
cloud applications.
Horizontal scaling can be used for applications with a
clustered architecture, in which a gateway or master node
distributes requests among the worker nodes or virtual
machines. When the workload increases, more nodes are added to
the cluster, and when the workload decreases, some
nodes can be removed from the cluster, thereby freeing up
resources. In a clustered architecture, it is the gateway that
maintains the list of nodes which are part of the cluster. The
reconfiguration cost differs from application to application and
also depends on the ease of adding nodes to and removing nodes
from the cluster. Reconfiguration is easy for stateless
applications, but a stateful application has to
be partially or completely brought down in order to add or
remove nodes in a cluster. Horizontal scaling is used in most
enterprise clouds [23].
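The role of the gateway in a clustered architecture can be illustrated with a small sketch. It assumes a stateless application and a simple round-robin policy; the class `Gateway` and its method names are hypothetical and do not correspond to any specific load balancer.

```python
class Gateway:
    """Hypothetical gateway that keeps the node list and routes round-robin."""

    def __init__(self, nodes):
        self.nodes = list(nodes)  # nodes currently part of the cluster
        self._next = 0            # index of the node that gets the next request

    def add_node(self, node):
        """Scale out: register a new worker node."""
        self.nodes.append(node)

    def remove_node(self, node):
        """Scale in: drop a worker node and restart the rotation."""
        self.nodes.remove(node)
        self._next = 0

    def route(self, request):
        """Return the node that should serve `request`."""
        if not self.nodes:
            raise RuntimeError("no worker nodes available")
        node = self.nodes[self._next % len(self.nodes)]
        self._next = (self._next + 1) % len(self.nodes)
        return node
```

For a stateful application, `add_node` and `remove_node` would additionally have to move or rebuild session state, which is where the reconfiguration cost mentioned above comes from.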
B. Vertical Scaling
Vertical scaling is done by changing the resources assigned
to an instance which is already running, such as
allocating more physical CPU to an already running virtual
machine. It is limited by the availability of free CPU
cores and memory on the physical server which hosts the
virtual machine. It is used to increase or decrease the resources
of a specific element in the system, by changing
the number of CPUs or by adapting the memory or bandwidth of
a single virtual machine. Most modern hypervisors support live
migration, which allows virtual machines to be moved
from one physical server to another. By
provisioning additional resources to the scaled VM and migrating
other VMs which are already running on the server, live
migration has increased the scope of vertical scaling. Vertical
scaling is useful for dynamic consolidation in data centers.
Figure 1. IaaS scaling (from [22])
C. A Hybrid Scaling Approach
Even though horizontal scaling and vertical scaling have
several benefits, each has its own disadvantages as well.
Vertical scaling has a limited range, but it has lower resource
and configuration costs. Moreover, running the servers at high
utilization can affect the performance of the application.
Horizontal scaling can scale the application to a
large throughput, but it increases the potential cost. By
combining the strengths of horizontal and vertical
scaling, we can create an intelligent combination of adding
new VMs and resizing VMs such that application scaling can be
done within the stated SLA parameters and at low cost.
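One possible way to combine the two mechanisms is sketched below. It is an illustrative heuristic, not the scheme developed later in this paper: it assumes that the cheaper vertical resizing is used first, up to the free cores on the host, and that new VMs are added only for the remainder. The function name and parameters are hypothetical.

```python
def plan_scaling(extra_cores_needed: int, free_cores_on_host: int, cores_per_new_vm: int):
    """Return (cores added by resizing, number of new VMs to start)."""
    # Vertical scaling is bounded by the free cores of the physical host.
    vertical = min(extra_cores_needed, free_cores_on_host)
    remaining = extra_cores_needed - vertical
    # Ceiling division: whatever is left is covered by whole new VMs.
    new_vms = -(-remaining // cores_per_new_vm)
    return vertical, new_vms
```

With 7 extra cores needed, 2 free cores on the host and 2 cores per new VM, the plan resizes by 2 cores and starts 3 new VMs.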
IV. AUTO-SCALING
Auto-scaling is the technique by which on-demand resources
are provided according to load fluctuations in a cloud
computing system. In auto-scaling, the load for a short
window of time is estimated and the resources are then
scaled up or down as demand requires. This reduces
the wastage of resources and also helps in maintaining
application service quality. Cloud services such as Amazon
EC2 [1] and Google App Engine [7] use auto-scaling
services [15]. Auto-scaling takes into account
parameters such as the cloud provider's pricing model, unit charge,
VM boot time and resource acquisition log [14].
Auto-scaling is the process of automatically changing the
amount of resources used to run a service in an IaaS
system [16]. An auto-scaling mechanism should be able
to anticipate changes in service demand. Services with
time-varying demands on IaaS resources can be run by
allocating only those resources needed at any point in time
without violating SLAs. Thus more resources are allocated
when demand increases, and resources are released when
they are no longer needed.

An auto-scaling technique needs two elements: a monitor
and a scaling unit. The monitoring system gathers
performance metrics for scaling purposes, such as:
Hardware: memory usage, CPU utilization, network
interface access, disk access.
General OS process: real memory (resident set), page
faults, CPU time.
Load balancer: session rate, number of current
sessions, size of request queue, number of denied
requests, transmitted bytes, number of errors.
Web server: number of connections in specific states,
transmitted bytes and requests.
Application server: active thread count, total thread
count, session count, used memory, processed
requests, dropped requests, pending requests,
response time.
Database server: number of transactions in a particular
state, number of active threads.
With this information the scaling unit decides which scaling
action to perform. The objective is to meet the SLO and
minimize the overall cost of renting cloud resources [14].
Several cloud providers make auto-scaling possible through
schedule-based as well as rule-based mechanisms. In the
schedule-based approach, the cyclic pattern of the load is taken
into account and the scaling configurations are set
manually. This makes it difficult for the system to adapt to
unexpected changes in the load. The rule-based approach is used
by Amazon EC2. Scaling can also be classified
by its timing, specifically into proactive (predictive) and
reactive scaling. In proactive scaling, future
resource requirements are anticipated and the availability of
the resources is ensured ahead of time, whereas in reactive
scaling nothing is anticipated and the system simply reacts to
changes.
The different categories of auto-scaling techniques are static
threshold-based policies, reinforcement learning, queuing
theory, control theory and time-series analysis. The
threshold-based policies, or rule-based approach, are purely
reactive. The lack of anticipation can hamper effective
auto-scaling: the Slashdot
effect or sudden traffic bursts can lead to poor scaling.
Moreover, the time taken to instantiate new VMs can be too
long. Time-series analysis is a purely proactive scaling
approach. It uses the past history of a time series to
predict future values. The other auto-scaling techniques,
such as control theory, reinforcement learning and queuing
theory, cannot be classified clearly as reactive or proactive.
Classic queuing theory requires modeling each
application VM as a queue of requests. There are
established methods for estimating performance metrics for
every scenario. Some reinforcement learning algorithms
need no prior knowledge of the system model, but they
may take an infeasibly long time to converge to an optimal
policy. The control theory auto-scaling technique uses a
proactive or reactive controller to adjust the required resources
automatically to the application demand. All the above stated
approaches for scaling need to be used based
V. SERVICE LEVEL AGREEMENTS
Service provisioning in the cloud relies on Service Level
Agreements, which represent a contract signed between the
customer and the service provider, including the non-functional
requirements of the service specified as Quality of
Service (QoS). An SLA includes obligations, service pricing, and
penalties in case of agreement violations.
Reliably adhering to SLAs is of paramount
importance for cloud service providers and consumers alike.
With flexible and timely reactions to possible SLA
violations, user interaction with the system can be minimized,
and by avoiding SLA violations, penalties can
be prevented. SLA management includes a number of
tasks, such as discovering service providers, defining SLAs,
monitoring SLA violations, terminating SLAs and enforcing
penalties for violations. These are defined as the tasks of the
SLA lifecycle. A broker is generally entrusted with managing all
these activities, initially discovering the SLA features
supported by the cloud providers based on users' requests.
Some typical SLA parameters for IaaS are provided by [24]:
Functional        Non-Functional
CPU cores         Response time
Memory size       Budget
CPU speed         Completion time
I/O bandwidth     Data transfer time
OS type           Availability
Storage size      Persistence (Yes/No)
Image URL         Reservation (Yes/No)
Based on reasonable values provided by the user, the broker
performs the matchmaking / negotiation with the
provider. When agreement is reached, the resources are
reserved for the user.
Once the SLA document has been negotiated between the
service provider and the requester, it should be deployed. SLA
deployment is the process of validating and distributing the
contract.
The service provider uses the signed contract as the foundation
for optimizing its use of infrastructure in order to meet the
signed terms of service. The service user uses the SLA to
enforce the level of QoS required and to maintain
business models for service provisioning.
The chief requirements of a Service Level Agreement
include:
A format that clearly describes the service.
The level of service performance.
Methods by which the parameters of the service
can be monitored.
Penalties for failing to meet the requirements.
Business metrics such as billing.
The main specifications for describing web-based SLAs are:
Web Services Agreement (WS-Agreement)
Web Service Level Agreement language and
framework (WSLA)
WSLA is made up of a set of concepts and an XML language. It
is designed to express an SLA in a formal way. It
comprises three entities:
1. Parties: WSLA encapsulates the details of the
service provider, the consumer and any third parties
involved in the contract.
2. SLA parameters: The parameters in the SLA are
specified through metrics. The metrics may be
resource metrics (retrieved directly from the provider
of the service) or composite metrics (combinations of
multiple resource metrics calculated using a specific
algorithm).
3. Service Level Objectives (SLOs)
SLA metrics for Platform-as-a-Service (PaaS) include
integration, scalability, pay-as-you-go billing, deployment
environments, servers, browsers and number of developers.
SLA metrics for Software-as-a-Service (SaaS) include
reliability, usability, scalability, availability and
customizability.
The penalty function (a linear function) included in the SLA
in [25] is a notable feature for
enhancing utility. It reduces the budget of a job after
its deadline has lapsed rather than after its runtime. To improve
the aggregate utility of the cluster, the proposal considers the
utility of each job in order to determine which jobs have a higher
return. It assumes that jobs with shorter deadlines, and hence a
higher expected utility per unit runtime, produce higher returns.
Jobs with shorter deadlines are penalized more heavily than jobs
with longer deadlines; this discourages accepting new requests
when doing so would delay already accepted jobs, which would
eventually reduce the aggregate utility of the cluster.
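The effect of a linear penalty can be made concrete with a short sketch. The exact function used in [25] is not reproduced here; the version below is an assumed form in which the utility equals the budget until the deadline and then falls linearly with the delay. The parameter `penalty_rate` and the choice to let the utility become negative are assumptions of this example.

```python
def job_utility(budget: float, deadline: float, finish_time: float, penalty_rate: float) -> float:
    """Utility earned for a job under an assumed linear penalty after the deadline."""
    delay = max(0.0, finish_time - deadline)  # no penalty before the deadline
    return budget - penalty_rate * delay      # may become negative for long delays
```

Under this form, a job with a budget of 100 and a penalty rate of 5 per time unit yields a utility of 80 when it finishes 4 time units late.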
To give resource owners full
control over resource sharing, allocation and access policies, a
heuristic based on greedy allocation is used in [26].
This heuristic focuses on increasing the payoff function of
resource owners. The greedy backfilling LRMS (Local Resource
Management System) periodically
goes through the local SLA bids to finalize the contract that best
fits the payoff function of the resource owner.
SLA contracts facilitate job migrations in the contract net.
Requests arrive at the manager GFA (Grid Federation Agent),
which queries the federation
directory to obtain the quote of a contractor GFA that
matches the consumer-specified SLA parameters. The
consumer can seek optimization in terms of response time or
budget spent. On receiving the quote of the required
contractor GFA, the manager GFA signs an SLA negotiation contract
that allots only a part of the job deadline as time for
negotiation with the
chosen contractor. As the number of super-scheduling iterations
increases, the manager GFA gives the contractor less time to
decide on an SLA that meets the consumer's specified deadlines.
If the consumer's SLA parameters cannot be satisfied,
the job is rejected by the manager GFA.
In [27], Service Level Agreements are specified in terms of
deadlines for application execution. The user supplies an
estimate of execution times along with the deadlines at
job submission. The number of resources required is
sent to the Dynamic Provisioning service, which in turn acquires
them from external sources. If the deadlines are achievable
with the available resources, the job goes to the
provisioned state. If more resources must be obtained through
dynamic provisioning, the job remains in the under-provisioned
state. If the deadlines cannot be met, the job goes to the
infeasible state. Once all the resources required to complete
the execution are present, the job can run to completion.
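The three job states can be illustrated with a small classification sketch. It is not the algorithm of [27]: it assumes a job made of independent tasks with the same estimated execution time, and the function `classify_job` and the parameter `max_obtainable` (the most resources that dynamic provisioning could supply) are hypothetical.

```python
import math
from enum import Enum


class JobState(Enum):
    PROVISIONED = "provisioned"
    UNDER_PROVISIONED = "under-provisioned"
    INFEASIBLE = "infeasible"


def classify_job(n_tasks: int, est_task_time: float, time_to_deadline: float,
                 available: int, max_obtainable: int) -> JobState:
    """Classify a job from its task estimate, its deadline and the resource counts."""
    if n_tasks <= 0 or est_task_time <= 0:
        raise ValueError("n_tasks and est_task_time must be positive")
    if est_task_time > time_to_deadline:
        return JobState.INFEASIBLE  # not even a single task fits before the deadline
    # Tasks one resource can run back to back before the deadline.
    tasks_per_resource = math.floor(time_to_deadline / est_task_time)
    required = math.ceil(n_tasks / tasks_per_resource)
    if required <= available:
        return JobState.PROVISIONED
    if required <= max_obtainable:
        return JobState.UNDER_PROVISIONED  # more resources must be acquired first
    return JobState.INFEASIBLE
```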
To minimize both the cost of infrastructure at the virtual
machine level and the number of SLA violations, an algorithm
that maps users' quality of
service requirements to resources is implemented in [28]. The
author proposes increasing profit by cutting
cost through reusing the VM with the maximum available space.
The worst-fit approach is used if more than one VM is available
with maximum available space. However, this method can
reduce profits when the VM with the most available space is
taken up by small jobs. Another approach increases
profit by cutting cost through reusing the VM with the
minimum available space. It overcomes the above-mentioned
disadvantage because the request is assigned to the VM with the
least available space, as in the best-fit approach.
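The difference between the two reuse strategies can be sketched as follows. This is a generic illustration of worst-fit and best-fit selection, not the implementation of [28]; the dictionary of free space per VM and the function names are assumptions of the example.

```python
def pick_vm_worst_fit(free_space: dict, request: int):
    """Reuse the VM with the MOST free space that can still hold the request."""
    candidates = {vm: space for vm, space in free_space.items() if space >= request}
    return max(candidates, key=candidates.get) if candidates else None


def pick_vm_best_fit(free_space: dict, request: int):
    """Reuse the VM with the LEAST free space that can still hold the request."""
    candidates = {vm: space for vm, space in free_space.items() if space >= request}
    return min(candidates, key=candidates.get) if candidates else None
```

With free space `{"vm1": 8, "vm2": 3, "vm3": 5}` and a request of size 2, worst fit selects `vm1` while best fit selects `vm2`, leaving the large VM free for a later, larger request.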
An Application Environment (AE), which embeds an arbitrary
application topology that can span one or more VMs, is
associated with specific performance goals stated in
the SLA contract.
The Local Decision Module (LDM) [29], which is associated
with application performance models, is used to compute the
utility function. The utility function provides a
measure of the application's satisfaction with a specific
resource allocation given its current load. The LDM evaluates the
options of allocating more VMs to the AE or releasing
existing VMs from it based on the current
workload, using service-level metrics. The Global Decision
Module (GDM) takes the utility function produced by the
LDM as one of its inputs in order to determine the VM
allocation vector for each application and the VM packing [29].
The VM provisioning and packing done by the LDM and GDM
are expressed as constraint satisfaction
problems.
Self-optimization is achieved by combining the utility function
with a constraint programming approach. Every SLA includes
charges for running the job, obligations, and penalties that
must be paid when the provider fails to meet the SLA
parameters. As discussed in [30], the penalties include:
Flat penalty
Penalty proportional to load
Proportional penalties with upper bounds
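The three penalty types listed above can be written as simple functions. The exact definitions in [30] are not reproduced here; the forms below, including the parameters `amount`, `rate` and `upper_bound`, are assumptions made for illustration.

```python
def flat_penalty(violated: bool, amount: float) -> float:
    """Fixed charge whenever the SLA is violated."""
    return amount if violated else 0.0


def proportional_penalty(violated: bool, rate: float, load: float) -> float:
    """Charge that grows linearly with the load affected by the violation."""
    return rate * load if violated else 0.0


def bounded_proportional_penalty(violated: bool, rate: float, load: float,
                                 upper_bound: float) -> float:
    """Proportional charge that never exceeds `upper_bound`."""
    return min(proportional_penalty(violated, rate, load), upper_bound)
```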
The QoS of the SLA is measured by the observed average
waiting time (computed as the arithmetic mean of the waiting
times of all the waiting jobs that belong to the same session),
which should be greater than the specified threshold.
Optimization in terms of waiting time is preferred from the
service provider's point of view. The measure of QoS
preferred by the user is the average response time,
taking the job lengths into account.
A self-managing SLA enactment and resource management tool has
been used for cloud computing infrastructure at the VM level.
A-priori learning is not required, and adaptation to the
computing environment occurs during request execution,
taking into account the characteristics of the various resource
types. The approach is general, requires no specialized domain
knowledge and is independent of parameters to be tuned.
Two approaches for autonomic parameter adaptation have
been proposed in [31]:
Based on cost function
Based on workload volatility
If the cost increases over a period of time, the respective
threshold should be selected and adapted. Workload volatility,
defined as the intensity of change in the measured workload
traces of a certain resource, is calculated as the percentage
similarity of the current workload to the previous workload.
Cloud Infrastructure SLA (CI-SLA) [32] is offered by a
public CISP (Cloud Infrastructure Service Provider) to its
users, assuring the quality levels of resource capabilities and
specifications. It presents the perspective of the public cloud.
SLAaaS (SLA aware Services) integrates cloud QoS with the
SLA requirements. Cloud Based Application SLA (CA-SLA) [32]
assures the quality level of the application running
on the infrastructure of the public cloud. It is mainly concerned
with the quality metrics of the applications running on the
cloud infrastructure.
VI. INNOVATIVE APPROACH
Having reviewed the general approaches adopted to
meet the SLAs specified by users, we now turn to more specific
solutions for meeting the most important SLA constraints, namely
deadlines and cost. A deadline-aware scaling
scheme is discussed.
A deadline can be characterized not as a unique absolute
point in time specified by the user but rather as a time duration
allowed for the completion of the task / application.
Case-specific information about the resources
available for the task (CPU, memory), measured against the
required completion time, execution time and waiting time, is
necessary to be able to achieve the completion time at
minimum cost.
The pseudo-code can be stated as:
-----------------------------------------------------------------
/* Input: tasks $T_i$, $i = 1$ to $N$ */
For each task $T_i$:
$t_{dl}$: completion time (expected deadline)
/* only a time quantum, as the exact submission time is not of
consequence */
$P_t$: provisioning time
$Q_{waiting}$: waiting time before the submission is accepted
$$ t_{dl} - (P_t + E_t + Q_{waiting}) > 0 \tag{1} $$
Task execution is successfully started only when the expected
deadline is greater than the sum of the provisioning time, the
execution time and the delay due to waiting.
/* $E_t$: execution time
$P_{tV}$: provisioning time, vertical scaling
$P_{tH}$: provisioning time, horizontal scaling
$CP_{VM\,capacity}$: computation capacity of the VM */
$$ T_{execution} = \frac{\text{Total task (computation required)}}{CP_{VM\,capacity}\ \text{(cores)}} \tag{2} $$
$$ T_{execution\text{-}scaled\text{-}Horizontal} = \frac{\text{Total task}}{CP_{VM\,capacity}\ \text{(cores)} + (\text{scaled number of VMs})} \tag{3} $$
$$ T_{execution\text{-}scaled\text{-}Vertical} = \frac{\text{Total task}}{CP_{VM\,capacity}\ \text{(cores)} + (\text{scaled no. of cores})} \tag{4} $$
Basically, the time required for provisioning by scaling
(horizontal / vertical) should be considerably less than the time
gained by scaling:
$$ \frac{T_{execution\text{-}scaled\text{-}Horizontal}}{T_{execution}} - P_{tH} > 0 $$
$$ \frac{T_{execution\text{-}scaled\text{-}Vertical}}{T_{execution}} - P_{tV} > 0 $$
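The checks above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: capacity units are treated as additive, as in equations (3) and (4), and the "time gained by scaling" is read, following the prose, as the difference between the unscaled and the scaled execution time. That reading and all function names are assumptions of this example.

```python
def can_start(t_dl: float, p_t: float, e_t: float, q_waiting: float) -> bool:
    """Condition (1): the deadline must exceed provisioning + execution + waiting."""
    return t_dl - (p_t + e_t + q_waiting) > 0


def execution_time(total_task: float, capacity: float) -> float:
    """Equation (2): execution time for a given computation capacity."""
    return total_task / capacity


def scaling_pays_off(total_task: float, capacity: float, extra_units: float,
                     provisioning_time: float) -> bool:
    """True if the time saved by scaling exceeds the time spent provisioning.

    `extra_units` are the added VMs (horizontal) or cores (vertical),
    counted in the same units as `capacity`, as in equations (3) and (4).
    """
    gained = (execution_time(total_task, capacity)
              - execution_time(total_task, capacity + extra_units))
    return gained - provisioning_time > 0
```

For a task of 120 units on 4 cores, adding 2 units shortens execution from 30 to 20 time units, so scaling pays off only if provisioning takes less than 10 time units.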
For horizontal scaling, the time required for preparing the
VMs is itself quite high, hence the above condition might not
hold. So we need prediction-based approaches for
identifying the amount of resources that will be required in the
future for a specific user / workload (which can be a collection
of user requests). Prediction approaches such as time-series
analysis, principal component analysis, exponential
smoothing and artificial neural networks can be used to
predetermine the resources required in the next time frame.
The provisioning time is then reduced to merely assigning
already prepared VMs to the tasks.
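As one example of such a predictor, simple exponential smoothing can be sketched as below. The smoothing factor `alpha` and the function name are assumptions of this illustration; the other listed techniques would take the place of this function in the same role.

```python
def exponential_smoothing_forecast(history, alpha: float = 0.5) -> float:
    """One-step-ahead forecast of resource demand by simple exponential smoothing."""
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = float(history[0])  # initialise the level with the first observation
    for observed in history[1:]:
        # New level = weighted mix of the latest observation and the old level.
        level = alpha * observed + (1 - alpha) * level
    return level
```

With a demand history of 4, 8 and 10 VMs and `alpha = 0.5`, the forecast for the next time frame is 8 VMs, which could then be prepared ahead of time.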
A slight variation in the resource allocation, measured as the
time gain required to meet the deadline and arising from the
difference between predicted and real-time demand, can be met by
vertical scaling, that is, by allocating more resources to the
already allocated VMs.
With the current resources that were allocated to the VMs,
if ($t_{dl} < T_{execution}$), scaling is required.
Required scaling:
$$ X = \frac{\text{Total task (computation)}}{(t_{dl} - T_{execution}) \times \text{individual VM capacity}} - \text{originally allocated } CP_{VM\,capacity}\ \text{(cores)} \tag{5} $$
In auto-scaling using the hybrid approach, a combination of
horizontal and vertical scaling is used. Using a prediction
approach such as time-series analysis, reinforcement learning,
neural networks or exponential smoothing, Y units of VMs are
estimated to be necessary for the task. As already discussed, the
predicted number of resources can be made available
before the task is actually submitted. When the task actually
arrives, or when the user requests resources at a
certain instant, CP computing resources have already been made
available.
So (Y - CP) units are ready for horizontal scaling. Based on
deadlines, the required scaling is X units. After completion
of horizontal scaling, which does not add to the delay
because the resources are pre-bundled and kept ready in VMs,
only vertical scaling is required to cover the estimation error
caused by prediction variation.
If X > (Y - CP):
    Units to be vertically scaled now = X - (Y - CP)
Else:
    Deallocate [(Y - CP) - X] units and return them to the pool
    of ready VMs.
--------------------------------------------------------------------
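The decision rule above can be sketched in a few lines of Python. This is only an illustrative reading of the rule, not the authors' implementation: the names `ScalingDecision` and `plan_hybrid_scaling` are hypothetical, and the sketch assumes that X, Y, and CP are expressed in the same units and that Y is at least CP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingDecision:
    horizontal_units: float  # pre-provisioned units taken from the (Y - CP) ready units
    vertical_units: float    # extra units added to already allocated VMs
    released_units: float    # surplus ready units returned to the pool


def plan_hybrid_scaling(x_required, y_predicted, cp_allocated):
    """Split the required scaling X between horizontal and vertical scaling.

    x_required:   X, units of scaling needed to meet the deadline.
    y_predicted:  Y, units estimated by the prediction step.
    cp_allocated: CP, units already made available to the task.
    """
    if x_required < 0:
        raise ValueError("X must be non-negative")
    ready = y_predicted - cp_allocated  # (Y - CP) units prepared in advance
    if ready < 0:
        raise ValueError("this sketch assumes Y >= CP")

    if x_required > ready:
        # Prediction fell short: use every ready unit, scale up for the rest.
        return ScalingDecision(ready, x_required - ready, 0)
    # Prediction was sufficient: release the surplus (Y - CP) - X.
    return ScalingDecision(x_required, 0, ready - x_required)


if __name__ == "__main__":
    print(plan_hybrid_scaling(x_required=10, y_predicted=12, cp_allocated=4))
    print(plan_hybrid_scaling(x_required=5, y_predicted=12, cp_allocated=4))
```

In the first call the prediction leaves a shortfall of 2 units that is covered by vertical scaling; in the second, 3 surplus units go back to the pool of ready VMs.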
Using this approach, auto-scaling can be completed within
user-specified deadlines using the least time for scaling. This
algorithm can be implemented in simulators like CloudSim using
real traces such as Google Cluster Data or IBM SGE data, or
using synthetic workloads generated using RuBiS VA. Results of
the execution will be included shortly as an extension to this
work. Metrics such as the reduction in resource allocation time
per task, the resulting cost saving, and the overall cost saving
per job can be used to show the efficiency of the stated approach.
By incorporating the cost constraint, that is, minimizing the
cost of resource allocation, keeping the time delay to a minimum,
and allocating only the resources the user actually needs, the
overall cost is reduced to the minimum possible. Inclusion of the
cost constraint as an SLA requirement is an extension of this work.
To implement the above algorithm in real time, the computational
resources required for each task need to be identified.
Maximum returns are seen over successive implementations: as
the prediction algorithms become better trained, the accuracy of
further predictions improves. The scaling mechanisms then
require very little time, so user deadlines are met at the
least cost.
VII. CONCLUSION
In this paper, the importance of scaling in cloud computing
has been emphasized. Scaling is essential for achieving better
performance at low cost in response to fluctuating user demands.
As discussed, the two main approaches to scaling, horizontal
scaling and vertical scaling, both have advantages as well as
disadvantages. It can therefore be concluded that hybrid scaling
provides a better scaling approach for cloud computing
applications in a cost-effective manner. The promise of SLAs can
be met more efficiently with the scaling algorithm discussed.
Real-time results will be discussed and added shortly.
REFERENCES
[1]. Amazon EC2: http://aws.amazon.com/ec2 accessed on
25th March 29, 2014
[2]. Rackspace. The open cloud company.
http://www.rackspace.com/ accessed on 25th March 29,
2014
[3]. AWS Elastic Beanstalk(beta). Easy to begin, Impossible
to outgrow. http://aws.amazon.com/elasticbeanstalk/
accessed on 25th March 29, 2014
[4]. Heroku. Cloud application platform.
http://www.heroku.com/ accessed on 25th March 29,
2014
[5]. Microsoft Windows Azure.
https://www.windowsazure.com/ accessed on 25th March
29, 2014
[6]. Force.com(salesforce). http://www.force.com/ accessed
on 25th March 29, 2014
[7]. Google App Engine. http://cloud.google.com/products/
accessed on 25th March 29, 2014
[8]. Microsoft Office 365. http://www.microsoft.com/en-
us/office365/online-software.aspx accessed on 25th
March 29, 2014
[9]. Salesforce.com. http://www.salesforce.com/ accessed on
25th March 29, 2014
[10]. B. Urgaonkar, P. Shenoy, et al. Resource
overbooking and application profiling in shared hosting
platforms. In Proc. OSDI, 2002.
[11]. Zhiming Shen, Sethuraman Subbiah, Xiaohui Gu,
John Wilkes, CloudScale: Elastic Resource Scaling for
Multi-Tenant Cloud Systems, In Proc. SOCC '11, 2011
[12]. Kim H, Kim W and Kim Y, Grid and Distributed
Computing, Control and Automation (Communications in
Computer and Information Science vol 121) ed Kim T h,
Yau S S, Gervasi O, Kang B H, Stoica A and Dominik
(Springer Berlin Heidelberg) pp 84-94
[13]. Mahfuzur Rahman, Peter Graham. Hybrid resource
provisioning for clouds, High Performance Computing
Symposium 2012.
[14]. Tania Lorido-Botran, Jose Miguel-Alonso, Jose A.
Lozano. Auto-scaling Techniques for Elastic
Applications in Cloud Environments. Technical Report
EHU-KAT-IK-09-12
[15]. Ching-Chi Lin, Jan-Jan Wu, Jeng-An Lin, Li-Chung
Song, Pangfeng Liu, Automatic Resource Scaling Based
on Application Service Requirements. IEEE Fifth
International Conference on Cloud Computing 2012, pp.
941-942
[16]. Fabio Morais, Francisco Brasileiro, Raquel Lopes,
Ricardo Araujo, Wade Satterfield, Leandro Rosa,
Autoflex: Service Agnostic Auto-scaling Framework for
IaaS Deployment Models. 13th IEEE/ACM International
Symposium on Cluster, Cloud, and Grid Computing 2013,
pp 42-49.
[17]. Pavlos Kranas, Andreas Menychtas, Vasileios
Anagnostopoulos, Theodora Varvarigou, ElaaS: An
innovative Elasticity as a Service framework for dynamic
management across the cloud stack layers. Sixth
International Conference on Complex, Intelligent, and
Software Intensive Systems 2012, pp 1042-1049
[18]. Jeremy Geelan. Twenty one experts define cloud
computing. Virtualization, August 2008.
[19]. Laura R. Moore, Kathryn Bean, Tariq Ellahi,
Transforming Reactive Auto-scaling into Proactive
Auto-scaling. CloudDP '13, pp 7-12.
[20]. RightScale http://www.rightscale.com/ accessed on
25th March 29, 2014
[21]. AzureWatch http://www.paraleap.com/azurewatch
accessed on 25th March 29, 2014
[22]. Luis M. Vaquero, Luis Rodero-Merino, Rajkumar
Buyya, Dynamically Scaling Applications in the
Cloud. ACM SIGCOMM Computer Communication
Review,2011, pp 45-52.
[23]. Sourav Dutta, Sankalp Gera, Akshat Verma, Balaji
Vishwanathan, SmartScale: automatic Application
Scaling in Enterprise Clouds. IEEE Fifth International
Conference on Cloud Computing,2012, pp 221-228.
[24]. Foued Jrad, Jie Tao, Achim Streit, SLA based service
Brokering in Intercloud Environments. CLOSER 2012,
pp 76-81.
[25]. Yeo, Chee Shin and Buyya, Rajkumar, Service
level agreement based allocation of cluster resources:
Handling penalty to enhance utility. Cluster Computing,
2005. IEEE International, 2005, pp. 1-10.
[26]. Ranjan, Rajiv and Harwood, Aaron and Buyya,
Rajkumar, SLA-based coordinated superscheduling
scheme for computational Grids. Cluster Computing,
2006 IEEE International Conference, 2006, pp 1-8.
[27]. Buyya, Rajkumar and Garg, Saurabh Kumar and
Calheiros, Rodrigo N, SLA-oriented resource
provisioning for cloud computing: Challenges,
architecture, and solutions. Cloud and Service
Computing (CSC), 2011 International
Conference,2011,pp 110.
[28]. Wu, Linlin and Garg, Saurabh Kumar and Buyya,
Rajkumar, SLA-based resource allocation for software as
a service provider (SaaS) in cloud computing
environments. Cluster, Cloud and Grid Computing
(CCGrid), 11th IEEE/ACM International
Symposium, 2011, pp 195-204.

[29]. Van, Hien Nguyen and Tran, Frederic Dang and
Menaud, J-M, SLA-aware virtual resource management
for cloud infrastructures. Computer and Information
Technology, 2009. CIT'09. Ninth IEEE International
Conference, 2009.

[30]. Mazzucco, Michele, Towards autonomic service
provisioning systems.Proceedings of the 2010 10th
IEEE/ACM International Conference on Cluster, Cloud
and Grid Computing,2010, pp 273-282
[31]. Maurer, Michael and Brandic, Ivona and Sakellariou,
Rizos, Self-adaptive and resource-efficient SLA
enactment for cloud computing infrastructures. Cloud
Computing (CLOUD), IEEE 5th International
Conference 2012,pp 368-375.
[32]. Suleiman, Basem and Sakr, Sherif and Jeffery, Ross
and Liu, Anna, On understanding the economics and
elasticity challenges of deploying business applications on
public cloud infrastructure. Journal of Internet Services
and Applications 2012, pp 173-193
[33]. Patel, Pankesh and Ranabahu, Ajith H and Sheth,
Amit P, Service level agreement in cloud
computing. 2009.
[34]. Alhamad, Mohammed and Dillon, Tharam and
Chang, Elizabeth, Conceptual SLA framework for cloud
computing. Digital Ecosystems and Technologies
(DEST), 2010 4th IEEE International Conference, 2010, pp
606-610.