Abstract: Virtualized servers have been key to the efficient deployment of cloud applications. As the application demand increases, it is important to dynamically adjust the CPU allocation of each component in order to save resources for other applications and keep performance high, e.g., the client mean response time (mRT) should be kept below a Quality of Service (QoS) target. In this work, a new form of Kalman filter, called the Maximum Correntropy Criterion Kalman Filter (MCC-KF), is used to predict, and hence adjust, the CPU allocations of each component while the RUBiS auction site workload changes randomly as the number of clients varies. MCC-KF has shown high performance when the noise is non-Gaussian, as is the case for CPU usage. Numerical evaluations compare our framework with the current state of the art using real data obtained via the RUBiS benchmark website deployed on a prototype Xen-virtualized cluster.

Index Terms: Resource provisioning, virtualized servers, CPU allocation, CPU usage, RUBiS, Kalman filter.

978-1-5090-6505-9/17/$31.00 © 2017 IEEE

I. INTRODUCTION

Popular applications, such as Instagram and Dropbox, use the cloud computing paradigm for their services. Virtualization [1] is one of the fundamental technologies used for server consolidation in cloud computing. When a physical machine is virtualized, it is transformed into one or more virtual execution environments, called virtual machines (VMs), and their resource allocations must be adjusted online in order to match their workload needs. Applications can run in isolation on each VM. Infrastructure as a Service (IaaS) constitutes a form of cloud computing in which virtual computing resources are provisioned and utilized over the Internet [2]. IaaS platforms offer automated resource provisioning (AutoScaling) for adjusting resource allocation based on demand. This makes IaaS well-suited for enterprises that experience workload variations. Unexpected workload variations should also be handled seamlessly without any performance degradation.

Server consolidation is the efficient usage of server resources in order to reduce the total number of physical machines required. Consolidated applications share server resources, such as CPU time, memory, network bandwidth and disk space. To operate efficiently, each application is allocated enough resources from the hosting server to meet the performance requirements of its Service Level Objectives (SLOs), as measured, for example, by high throughput, availability, and the requests' mean Response Time (mRT). However, adjusting the resource shares of consolidated applications while their demands change over time is challenging, because (a) the workloads may vary significantly over time, (b) the workload of some applications may change tremendously over a very short period of time, and (c) resource allocation should be optimized for better performance (see, for example, [3]–[6] and references therein).

While performing virtualization, it is important to ensure that the applications meet their SLOs. Towards this end, autonomic resource management methods are needed to dynamically allocate resources across virtualized applications with diverse and highly fluctuating workload demands. Autonomic resource management in a virtualized environment using control-based techniques has recently gained significant attention; see [7] for a survey. One of the most common approaches to controlling application performance is to control its CPU utilization within the VM; see, for example, [8] and references therein.

The Kalman filter, adopted in [8], [9], is one of the most widely used adaptive filters due to its simplicity, optimality, and versatility. The traditional Kalman filter is derived under the minimum mean square error (MMSE) criterion and is the optimal estimator if the measurement and process noises are Gaussian. However, in non-Gaussian noise environments, the Kalman filter uses only second-order signal information, and hence it is suboptimal. In general, the use of MSE is desirable if the signals follow a Gaussian distribution. Otherwise, if the distribution is non-Gaussian, the performance of the system may degrade considerably, and in these cases a non-quadratic cost is more desirable than MSE [10]. In order to handle environments with non-Gaussian disturbances, such as impulsive noise and Gaussian mixture noise, and improve the robustness of state estimation, a Maximum Correntropy Criterion Kalman filter (MCC-KF) was introduced in [11] and [12]. The MCC-KF adopts the robust Maximum Correntropy¹ Criterion (MCC) instead of the MMSE criterion, which is sensitive to large outliers and results in robustness decay of the Kalman filter in non-Gaussian environments.

¹ The correntropy is a similarity measure related to the probability of how similar two random variables are [13]. Since correntropy is insensitive to outliers, it is a natural robust adaptation cost in the presence of heavy-tailed impulsive noise.

Dynamic resource provisioning (DRP) is characterized by workloads that have sudden and irregular fluctuations. In this work, motivated by the workload fluctuations in such an environment, we use the MCC-KF for better state estimation,
and hence, a more efficient resource provisioning control mechanism. More specifically, a new controller, based on the MCC-KF, is proposed. The proposed controller allocates CPU resources in virtualized applications and aims at meeting its SLO, which is to maintain the requests' mRT below a certain threshold. Experimental evaluation shows that the proposed controller dynamically allocates the resources in order to meet its SLO. Compared to the current state of the art in the literature, the proposed controller reduces the overall mRT and the instances for which the SLO is not met, without providing more resources than necessary. This is achieved because better state prediction is obtained: the new controller takes into account the fact that the process noise may be non-Gaussian and is characterized by sudden workload changes.

The remainder of the paper is organized as follows. In Section II, we provide a simple motivating example and present related work in the field. In Section III, we provide the notation, and we describe the performance metric being considered and the model adopted for capturing the dynamics of the CPU utilization. In Section IV, we introduce the controller developed for our system. The performance of the controller is evaluated and compared with the state of the art in Section V. Finally, in Section VI we conclude this work and draw directions for future research.

II. BACKGROUND

A. Motivating example

In this subsection, we provide a simple example with three single-component applications that can be hosted on a single physical server machine using DRP, in order to show the concept of server consolidation.

Traditionally, each application is hosted on a dedicated server, as shown in the left diagram of Fig. 1. Assuming that two of the applications have workload requirements such that the sum of resources for both applications does not exceed the total available physical resources of a server machine, two VMs are created, each one hosting an application with fixed resources allocated as required (as in the middle diagram in Fig. 1). In this way, both applications are served adequately and the total resource utilization of the physical machine is now increased simply by co-locating two running servers. Consider now the case where the workload in both applications changes considerably. For example, consider the case in which resource utilization in VM A increases, while in VM B it decreases so fewer resources are needed. If there is no DRP, the application in VM A experiences under-provisioning, resulting in performance degradation. At the same time, the application in VM B experiences over-provisioning, which does not affect the performance of the system, but resources are under-utilized. In the right diagram of Fig. 1, we see how, with DRP, one can dynamically allocate the resources and even include a third application on the same server. As a result, applications that would traditionally require three dedicated physical servers are now all served by a single physical server, thus reducing the resources needed for the same number and type of requests. Virtualization further facilitates moving applications from dedicated servers to dynamically provisioned servers, thus reducing the number of physical machines, which results in significant energy and cost savings.

Fig. 1: Motivating example with three single-component applications that can be hosted on a single physical server machine using DRP.

In this paper, we use dynamic resource adaptation to capture changes in the workload demands. Our approach adopts modern virtualization platforms which export a user-level interface to bound the maximum resource allocation per VM at runtime.

B. Related work

The application performance can be controlled by controlling its CPU utilization. It has been observed that the application response times stay low [14] as long as the utilization remains below the allocation by a certain threshold. When the threshold is exceeded, the response times increase dramatically and, as a result, the performance of the application drops.

In [15] and [16], the authors directly control application response times through runtime CPU resource allocation using an offline system identification approach, with which they tried to model the relationship between the response times and the CPU allocations in regions where it is measured to be linear. However, as this relationship is application-specific and relies on offline identification of performance models, it cannot be applied when multiple applications are running concurrently, and it is not easy to adjust to new conditions. This triggered the search for other approaches.

The authors in [16] and [17] were among the first to connect the control of the application CPU utilization within the VM with the response times. The use of control-based techniques has emerged as a natural approach for resource provisioning in a virtualized environment. Control-based approaches design controllers that continuously update the maximum CPU allocated to each VM based on CPU utilization measurements. For example, Padala et al. [18] present a two-layer non-linear controller to regulate the utilization of the virtualized components of multi-tier applications. Kalyvianaki et al. [8], [9] formulate the CPU allocation problem as a tracking one and propose adaptive Kalman-based controllers to track and maintain the CPU utilization to a user-defined threshold. Even though the Kalman filter provides an optimal estimate if the noise is Gaussian, it may perform poorly if the noise
characteristics are different. To account for uncertainties in the system model and noise statistics, Charalambous et al. [19] propose the use of an H∞ controller in order to minimize the maximum error caused by the uncertainties in the model. This type of controller showed improved performance in saturation periods and sudden workload changes, but it requires tuning an extra parameter for achieving the desired performance.

Multi-Input-Multi-Output (MIMO) feedback controllers have also been considered; see, for example, [4] and [9]. These controllers make global decisions by coupling the resource usage of all components of multi-tier server applications. In addition, the resource allocation problem across consolidated virtualized applications under conditions of contention has been considered in [18], [20]: when some applications demand more resources than physically available, the controllers share the resources among them, while respecting the user-given priorities.

In another line of research, researchers have used neuro-fuzzy control to drive the CPU utilization decisions of virtual-machine-level controllers. For example, Sithu et al. [21] use CPU load profiles to train the neuro-fuzzy controller to only predict the usage with 100% allocation, without having any CPU allocation scheme to train on. Deliparaschos et al. [22], instead, use data from established controllers, such as [8] and [19], to train their neuro-fuzzy controller.

III. NOTATION AND PRELIMINARIES

A. Notation

$\mathbb{R}$ and $\mathbb{R}_+$ represent the real and the nonnegative real numbers, respectively. Vectors are denoted by small letters, matrices are denoted by capital letters, and sets by calligraphic capital letters. $A^T$ and $A^{-1}$ denote the transpose and inverse of matrix $A$, respectively. By $I$ we denote the identity matrix. $\hat{a}_{k|k-1}$ and $\hat{a}_{k|k}$ denote the a priori and a posteriori estimates of random vector $a_k$ for time instant $k$. $P_k$ denotes the matrix $P$ at time instant $k$. $E\{\cdot\}$ represents the expectation of its argument.

B. Performance metric

One of the most widely used metrics for measuring server [...] input queues for long and, as a result, their response times increase dramatically to relatively high values.

In this work, the response time of every type of request was captured by calculating the time difference between the request and its response, as Fig. 2 shows. All requests were issued to our RUBiS cluster, and specifically to the Web Server, through the Client Emulator that was deployed on a separate physical machine. When all requests were completed, the mean value of the response times in each time interval of 1 s was calculated in order to obtain a picture of the mRT over time.

Fig. 2: Request-to-response path.

To maintain good server performance, operators try to keep the CPU utilization below 100% of the machine capacity by a certain value, which is usually called headroom. Headroom values are chosen such that they form the boundary between the second and the third mRT regions. At such values the server is well provisioned and response times are kept low. If the utilization exceeds the boundary due to increased workload demands, operators should increase the server resources.

Firstly, we measure the server's performance when 100% of resources is provisioned, without any controller adjusting the allocation of resources, in order to extract the required headroom. In this work, we consider a Browsing Mix workload type, in order to specify the server's performance while the number of clients varies. Fig. 3 shows the mean response times (mRT) when the number of clients increases in steps of 200, until the mRT grows above 1 s.

[Plot: client mRT; vertical axis: mRT (ms)]
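The per-interval mRT computation described in this subsection can be sketched as follows. This is a minimal illustration, not the measurement code used in the experiments: the function names, the (request time, response time) sample format, and the choice to bin each request by the interval in which its response arrives are all assumptions made here.

```python
from collections import defaultdict


def mean_response_times(samples, interval_s=1.0):
    """Return {interval index: mean response time in ms}.

    samples: iterable of (request_time_s, response_time_s) pairs.
    Assumption: a request is counted in the interval in which its
    response arrives; the paper does not state the binning rule.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t_request, t_response in samples:
        rt_ms = (t_response - t_request) * 1000.0  # response time of one request
        idx = int(t_response // interval_s)        # interval that holds the response
        sums[idx] += rt_ms
        counts[idx] += 1
    return {idx: sums[idx] / counts[idx] for idx in sorted(sums)}


def intervals_above(mrt_by_interval, threshold_ms):
    """Return the interval indices whose mRT exceeds a caller-supplied threshold."""
    return [idx for idx, mrt in mrt_by_interval.items() if mrt > threshold_ms]
```

For example, calling `mean_response_times` on a list of timestamp pairs collected at the client emulator and then passing the result to `intervals_above` with the chosen mRT target gives the intervals in which that target was exceeded.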
[Figure: CPU (%) and mRT (ms) versus time interval (s), 0 to 100 s]
Fig. 8: Upper: H∞ filter CPU usage and allocation. Lower: mRT for H∞ filter.

In this experiment the H∞ filter predicts and adjusts the CPU allocations for a period of 100 s. During time interval [...]
[Figure: CPU (%) allocation and usage (upper) and MCC-KF mRT (ms) (lower) versus time interval (s), 0 to 100 s]
Fig. 9: Upper: MCC-KF filter CPU usage and allocation. Lower: mRT for MCC-KF filter.

This experiment shows how the MCC-KF responds when the number of clients issuing requests to the server increases from 400 to 1000, as in the previous experiments. In this experiment, the sudden workload change occurred during the time period 30–40 s. The MCC-KF operates well enough under sudden CPU usage changes, while the mRT is kept at 1800 ms. In all other regions, the mRT stays at low levels because the remaining amount of CPU resources is enough for serving the stable workload.

In order to observe how each controller operates, we ran an experiment with exactly the same demand for all controllers, with a big workload change at time interval 20–30 s. During this experiment, the allocations predicted by the controllers were not applied, since only one controller can handle the CPU allocations in each experiment using the Xen hypervisor. Fig. 10 shows that, while the workload is relatively stable, the predictions of the controllers are almost the same. However, when the demand changes suddenly, the H∞ filter predicts a larger allocation for the RUBiS application than the Kalman filter and the MCC-KF. However, the [...]

[Figure: CPU (%) versus time interval (s), 0 to 50 s; legend: Kalman, H∞, MCC-KF, usage]
Fig. 10: Allocation predictions with the same CPU usage per controller.

Overall, we observe that all controllers perform well while the workload is relatively stable, and thus the mRT is kept in the low mRT region. However, when there is a big workload change because of an increase in the number of clients issuing requests to RUBiS, the MCC-KF manages to keep the mRT at lower values than the Kalman filter and slightly lower than the H∞ filter. The H∞ filter, as shown in Fig. 10, manages to keep the mRT low by allocating more CPU resources, while the MCC-KF allocates less and, as a result, more resources are available for use by other applications on the same physical machine.

VI. CONCLUSIONS AND FUTURE DIRECTIONS

A. Conclusions

Several available virtualization technologies allow for dynamic adjustment of the CPU allocation of each component in a virtualized server environment so as to prevent resource saturation. In order to adjust the allocation, suitable controllers must be developed to take on the role of predicting, and hence adjusting, the CPU allocations of each server component. In this work, a Maximum Correntropy Criterion Kalman Filter (MCC-KF) has been used to adjust the CPU allocations of each component when random workload changes are introduced from a RUBiS auction site, deployed on a Xen-virtualized cluster, for a varying number of clients. The MCC-KF allows for state estimation in the presence of non-Gaussian noise, as is the case for CPU usage.

B. Future Directions

In this work, all the controllers were implemented assuming prior knowledge of the distribution pertaining to the workload changes. However, the system dynamics change with time, since the volume and fluctuation of requests might vary significantly due to increased/decreased popularity of a site, advertisement, etc. For this reason, it is important that self-adaptation techniques are adopted in order to adjust the proposed controllers to these fluctuations [26]–[28].
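To make the difference between the MMSE-based and the correntropy-based updates concrete, the following is a minimal sketch, assuming a scalar random-walk model of CPU usage and a Gaussian-kernel weight on the normalized innovation, in the spirit of the MCC-KF of [11], [12]. It is not the controller evaluated in this paper: the names `filter_step` and `run_filter`, the kernel bandwidth `sigma`, and the usage trace are hypothetical, and the exact recursions in [11], [12] may differ.

```python
import math


def filter_step(x, p, y, q, r, sigma=None):
    """One predict/update step for the scalar model x[k+1] = x[k] + w, y[k] = x[k] + v.

    q and r are the process and measurement noise variances.
    sigma=None gives the standard Kalman update; a finite sigma scales the
    gain by a Gaussian kernel of the normalized innovation (assumed form).
    """
    x_pred = x          # random walk: prediction equals the previous estimate
    p_pred = p + q      # predicted error variance
    innov = y - x_pred  # innovation (measurement residual)
    if sigma is None:
        weight = 1.0
    else:
        weight = math.exp(-(innov * innov / r) / (2.0 * sigma * sigma))
    gain = weight * p_pred / (r + weight * p_pred)
    x_new = x_pred + gain * innov
    p_new = (1.0 - gain) ** 2 * p_pred + gain ** 2 * r  # Joseph-form variance update
    return x_new, p_new


def run_filter(usage, x0, p0, q, r, sigma=None):
    """Filter a whole usage trace and return the list of estimates."""
    x, p = x0, p0
    estimates = []
    for y in usage:
        x, p = filter_step(x, p, y, q, r, sigma)
        estimates.append(x)
    return estimates


# Hypothetical CPU usage trace (%) with a single impulsive sample.
usage = [40.0, 41.0, 39.5, 95.0, 40.5, 40.0]
kalman = run_filter(usage, x0=40.0, p0=4.0, q=1.0, r=4.0)
mcc = run_filter(usage, x0=40.0, p0=4.0, q=1.0, r=4.0, sigma=2.0)
```

In this sketch a large `sigma` recovers the standard Kalman gain, while a small `sigma` discounts samples with a large innovation, so the two estimates diverge only around the impulsive sample. How strongly a genuine workload change should be discounted is a tuning decision that this illustration does not address.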
Resource needs across multiple dimensions, such as compute, storage, and network bandwidth, are coupled; hence, workload consolidation should be performed while taking care of the resource coupling in multi-tier virtualized applications in order to provide timely allocations during workload changes. For this reason, part of ongoing research concentrates on using system identification/learning to extract the coupling between the resource needs for more efficient workload consolidation.

Data centers distributed in different geo-locations can work collaboratively as a whole. They are connected with high-speed internet or dedicated high-bandwidth communication links. VMs can be migrated, if necessary, within a data center, or even across data centers [29]. The combination of dynamically changing the size of VMs and their migration, while a long-standing issue, constitutes a very challenging and open problem.

ACKNOWLEDGEMENTS

The authors would like to thank Ms. Kika Christou from the Information Systems & Technology Service of Cyprus University of Technology, for her continued support on computing and networking infrastructure related issues.

REFERENCES

[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, “Xen and the Art of Virtualization,” in Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), 2003, pp. 164–177.
[2] L. Rodero-Merino, L. M. Vaquero, V. Gil, F. Galán, J. Fontán, R. S. Montero, and I. M. Llorente, “From infrastructure delivery to service management in clouds,” Future Generation Computer Systems, vol. 26, no. 8, pp. 1226–1240, 2010.
[3] G. Jung, K. R. Joshi, M. A. Hiltunen, R. D. Schlichting, and C. Pu, “A cost-sensitive adaptation engine for server consolidation of multitier applications,” in Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware. New York, NY, USA: Springer-Verlag New York, Inc., 2009, pp. 9:1–9:20.
[4] E. Kalyvianaki, T. Charalambous, and S. Hand, “Resource Provisioning for Multi-Tier Virtualized Server Applications,” Computer Measurement Group (CMG) Journal, vol. 126, pp. 6–17, 2010.
[5] D. Ardagna, B. Panicucci, M. Trubian, and L. Zhang, “Energy-aware autonomic resource allocation in multitier virtualized environments,” IEEE Transactions on Services Computing, vol. 5, no. 1, pp. 2–19, March 2012.
[6] C. Mastroianni, M. Meo, and G. Papuzzo, “Probabilistic consolidation of virtual machines in self-organizing cloud data centers,” IEEE Transactions on Cloud Computing, vol. 1, no. 2, pp. 215–228, July 2013.
[7] J. Zhang, H. Huang, and X. Wang, “Resource provision algorithms in cloud computing: A survey,” Journal of Network and Computer Applications, vol. 64, pp. 23–42, 2016.
[8] E. Kalyvianaki, T. Charalambous, and S. Hand, “Adaptive resource provisioning for virtualized servers using Kalman filters,” ACM Transactions on Autonomous and Adaptive Systems, vol. 9, no. 2, pp. 10:1–10:35, July 2014.
[9] E. Kalyvianaki, T. Charalambous, and S. Hand, “Self-Adaptive and Self-Configured CPU Resource Provisioning for Virtualized Servers using Kalman Filters,” in Proceedings of the 6th International Conference on Autonomic Computing (ICAC). New York, NY, USA: ACM, 2009, pp. 117–126.
[10] B. Chen, Y. Zhu, J. Hu, and J. C. Principe, System Parameter Identification: Information Criteria and Algorithms, 1st ed. Amsterdam, The Netherlands: Elsevier Science Publishers B. V., 2013.
[11] B. Chen, X. Liu, H. Zhao, and J. C. Príncipe, “Maximum Correntropy Kalman Filter,” arXiv:1509.04580 [cs, stat], Sep. 2015. [Online]. Available: http://arxiv.org/abs/1509.04580
[12] R. Izanloo, S. A. Fakoorian, H. S. Yazdi, and D. Simon, “Kalman filtering based on the maximum correntropy criterion in the presence of non-Gaussian noise,” in Annual Conference on Information Science and Systems (CISS), Mar. 2016, pp. 500–505.
[13] W. Liu, P. P. Pokharel, and J. C. Príncipe, “Correntropy: Properties and Applications in Non-Gaussian Signal Processing,” IEEE Transactions on Signal Processing, vol. 55, no. 11, Nov. 2007.
[14] L. Kleinrock, Queueing Systems, Volume 1, Theory. Wiley-Interscience, 1975.
[15] Z. Wang, X. Zhu, and S. Singhal, “Utilization and SLO-Based Control for Dynamic Sizing of Resource Partitions,” in Proceedings of the IFIP/IEEE International Workshop on Distributed Systems: Operations and Management (DSOM), October 2005, pp. 133–144.
[16] X. Zhu, Z. Wang, and S. Singhal, “Utility-Driven Workload Management using Nested Control Design,” in Proceedings of the American Control Conference (ACC), 2006, pp. 6033–6038.
[17] Z. Wang, X. Liu, A. Zhang, C. Stewart, X. Zhu, T. Kelly, and S. Singhal, “AutoParam: Automated Control of Application-Level Performance in Virtualized Server Environments,” in Proceedings of the IEEE International Workshop on Feedback Control Implementation and Design in Computing Systems and Networks (FeBID), 2007.
[18] P. Padala, K. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, A. Merchant, and K. Salem, “Adaptive Control of Virtualized Resources in Utility Computing Environments,” in Proceedings of the European Conference on Computer Systems (EuroSys), 2007, pp. 289–302.
[19] T. Charalambous and E. Kalyvianaki, “A min-max framework for CPU resource provisioning in virtualized servers using H∞ Filters,” in 49th IEEE Conference on Decision and Control (CDC), Dec. 2010, pp. 3778–3783.
[20] P. Padala, K.-Y. Hou, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, and A. Merchant, “Automated Control of Multiple Virtualized Resources,” in Proceedings of the 4th ACM European Conference on Computer Systems (EuroSys ’09). New York, NY, USA: ACM, 2009, pp. 13–26.
[21] M. Sithu and N. L. Thein, “A resource provisioning model for virtual machine controller based on neuro-fuzzy system,” in 2011 The 2nd International Conference on Next Generation Information Technology (ICNIT), Jun. 2011, pp. 109–114.
[22] K. M. Deliparaschos, T. Charalambous, E. Kalyvianaki, and C. Makarounas, “On the use of fuzzy logic controllers to comply with virtualized application demands in the cloud,” in European Control Conference (ECC), June 2016, pp. 649–654.
[23] W. Liu, P. P. Pokharel, and J. C. Príncipe, “Correntropy: A Localized Similarity Measure,” in The 2006 IEEE International Joint Conference on Neural Network Proceedings, 2006, pp. 4919–4924.
[24] R. He, W. S. Zheng, and B. G. Hu, “Maximum Correntropy Criterion for Robust Face Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8, pp. 1561–1576, Aug. 2011.
[25] A. Singh and J. C. Principe, “Using Correntropy As a Cost Function in Linear Adaptive Filters,” in Proceedings of the 2009 International Joint Conference on Neural Networks, ser. IJCNN’09. Piscataway, NJ, USA: IEEE Press, 2009, pp. 1699–1704. [Online]. Available: http://dl.acm.org/citation.cfm?id=1704175.1704421
[26] A. Filieri, C. Ghezzi, A. Leva, and M. Maggio, “Self-adaptive software meets control theory: A preliminary approach supporting reliability requirements,” in Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering, ser. ASE ’11. Washington, DC, USA: IEEE Computer Society, 2011, pp. 283–292.
[27] A. Filieri, H. Hoffmann, and M. Maggio, “Automated design of self-adaptive software with control-theoretical formal guarantees,” in Proceedings of the 36th International Conference on Software Engineering, ser. ICSE 2014. New York, NY, USA: ACM, 2014, pp. 299–310.
[28] A. Filieri, M. Maggio, K. Angelopoulos, N. D’Ippolito, I. Gerostathopoulos, A. B. Hempel, H. Hoffmann, P. Jamshidi, E. Kalyvianaki, C. Klein, F. Krikava, S. Misailovic, A. V. Papadopoulos, S. Ray, A. M. Sharifloo, S. Shevtsov, M. Ujma, and T. Vogel, “Software engineering meets control theory,” in Proceedings of the 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, ser. SEAMS ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 71–82.
[29] P. T. Endo, A. V. de Almeida Palhares, N. N. Pereira, G. E. Goncalves, D. Sadok, J. Kelner, B. Melander, and J. E. Mangs, “Resource allocation for distributed cloud: concepts and research challenges,” IEEE Network, vol. 25, no. 4, pp. 42–46, July 2011.