Dynamic CPU Resource Provisioning in Virtualized Servers Using Maximum Correntropy Criterion Kalman Filters


Evagoras Makridis∗, Kyriakos M. Deliparaschos∗, Evangelia Kalyvianaki† and Themistoklis Charalambous‡
∗ Cyprus University of Technology, Limassol, Cyprus, Email: ep.makridis@edu.cut.ac.cy, k.deliparaschos@cut.ac.cy
† City University of London, London, United Kingdom, Email: evangelia.kalyvianaki.1@city.ac.uk
‡ Aalto University, Espoo, Finland, Email: themistoklis.charalambous@aalto.fi
Abstract—Virtualized servers have been the key for the efficient deployment of cloud applications. As the application demand increases, it is important to dynamically adjust the CPU allocation of each component in order to save resources for other applications and keep performance high, e.g., the client mean response time (mRT) should be kept below a Quality of Service (QoS) target. In this work, a new form of Kalman filter, called the Maximum Correntropy Criterion Kalman Filter (MCC-KF), is used to predict, and hence adjust, the CPU allocations of each component while the RUBiS auction site workload changes randomly as the number of clients varies. MCC-KF has shown high performance when the noise is non-Gaussian, as is the case for CPU usage. Numerical evaluations compare our framework with other current state-of-the-art approaches using real data via the RUBiS benchmark website deployed on a prototype Xen-virtualized cluster.

Index Terms—Resource provisioning, virtualized servers, CPU allocation, CPU usage, RUBiS, Kalman filter.

978-1-5090-6505-9/17/$31.00 © 2017 IEEE

I. INTRODUCTION

Popular applications, such as Instagram and Dropbox, use the cloud computing paradigm for their services. Virtualization [1] is one of the fundamental technologies used for server consolidation in cloud computing. When a physical machine is virtualized, it is transformed into one or more virtual execution environments, called virtual machines (VMs), and their resource allocations must be adjusted online in order to match their workload needs. Applications can run in isolation on each VM. Infrastructure as a Service (IaaS) constitutes a form of cloud computing in which virtual computing resources are provisioned and utilized over the Internet [2]. IaaS platforms offer automated resource provisioning (AutoScaling) for adjusting resource allocation based on demand. This makes IaaS well-suited for enterprises that experience workload variations. Unexpected workload variations should also be handled seamlessly without any performance degradation.

Server consolidation is the efficient usage of server resources in order to reduce the total number of physical machines required. Consolidated applications share server resources, such as CPU time, memory, network bandwidth and disk space. To operate efficiently, each application is allocated enough resources from the hosting server to meet its performance requirements, expressed as Service Level Objectives (SLOs) and measured, for example, by throughput, availability, and the mean response time (mRT) of requests. However, adjusting the resource shares of consolidated applications while their demands change over time is challenging, because (a) the workloads may vary significantly over time, (b) the workload of some applications may change tremendously over a very short period of time, and (c) resource allocation should be optimized for better performance (see, for example, [3]–[6] and references therein).

While performing virtualization, it is important to ensure that applications meet their SLOs. Towards this end, autonomic resource management methods are needed to dynamically allocate resources across virtualized applications with diverse and highly fluctuating workload demands. Autonomic resource management in a virtualized environment using control-based techniques has recently gained significant attention; see [7] for a survey. One of the most common approaches to controlling application performance is to control its CPU utilization within the VM; see, for example, [8] and references therein.

The Kalman filter, adopted in [8], [9], is one of the most widely used adaptive filters due to its simplicity, optimality, and versatility. The traditional Kalman filter is derived under the minimum mean square error (MMSE) criterion and is the optimal estimator if the measurement and process noises are Gaussian. However, in non-Gaussian noise environments, the Kalman filter uses only second-order signal information, and hence it is suboptimal. In general, the use of MSE is desirable if the signals follow a Gaussian distribution. Otherwise, if the distribution is non-Gaussian, the performance of the system may degrade considerably, and in these cases a non-quadratic cost is more desirable than MSE [10]. In order to handle environments with non-Gaussian disturbances, such as impulsive noise and Gaussian mixture noise, and improve the robustness of state estimation, a Maximum Correntropy Criterion Kalman filter (MCC-KF) was introduced in [11] and [12]. The MCC-KF adopts the robust Maximum Correntropy¹ Criterion (MCC) instead of the MMSE criterion, which is sensitive to large outliers and degrades the robustness of the Kalman filter in non-Gaussian environments.

¹ The correntropy is a similarity measure related to the probability of how similar two random variables are [13]. Since correntropy is insensitive to outliers, it is a natural robust adaptation cost in the presence of heavy-tailed impulsive noise.

Dynamic resource provisioning (DRP) is characterized by workloads that have sudden and irregular fluctuations. In this work, motivated by the workload fluctuations in such an environment, we use the MCC-KF for better state estimation,
and hence, a more efficient resource provisioning control mechanism. More specifically, a new controller, based on the MCC-KF, is proposed. The proposed controller allocates CPU resources to virtualized applications and aims at meeting its SLO, which is to maintain the requests' mRT below a certain threshold. Experimental evaluation shows that the proposed controller dynamically allocates the resources in order to meet its SLO. Compared to the current state of the art in the literature, the proposed controller reduces the overall mRT and the number of instances in which the SLO is not met, without providing more resources than necessary. This is achieved because better state prediction is obtained: the new controller takes into account the fact that the process noise may be non-Gaussian and is characterized by sudden workload changes.

The remainder of the paper is organized as follows. In Section II, we provide a simple motivating example and present related work in the field. In Section III, we provide the notation and describe the performance metric being considered and the model adopted for capturing the dynamics of the CPU utilization. In Section IV, we introduce the controller developed for our system. The performance of the controller is evaluated and compared with other state-of-the-art controllers in Section V. Finally, in Section VI we conclude this work and draw directions for future research.

II. BACKGROUND

A. Motivating example

In this subsection, we provide a simple example with three single-component applications that can be hosted on a single physical server machine using DRP, in order to show the concept of server consolidation.

Traditionally, each application is hosted on a dedicated server, as shown in the left diagram of Fig. 1. Assuming that two of the applications have workload requirements such that the sum of resources for both applications does not exceed the total available physical resources of a server machine, two VMs are created, each one hosting an application with fixed resources allocated as required (as in the middle diagram in Fig. 1). In this way, both applications are served adequately and the total resource utilization of the physical machine is increased simply by co-locating two running servers. Consider now the case where the workload in both applications changes considerably. For example, consider the case in which resource utilization in VM A increases, while in VM B it decreases, so that fewer resources are needed. If there is no DRP, the application in VM A experiences under-provisioning, resulting in performance degradation. At the same time, the application in VM B experiences over-provisioning, which does not affect the performance of the system, but resources are under-utilized. In the right diagram of Fig. 1, we see how, with DRP, one can dynamically allocate the resources and even include a third application on the same server. As a result, applications that would traditionally require three dedicated physical servers are now all served by a single physical server, thus reducing the resources needed for the same number and type of requests. Virtualization further facilitates moving applications from dedicated servers to dynamically provisioned servers, thus reducing the number of physical machines, which results in significant energy and cost savings.

Fig. 1: Motivating example with three single-component applications that can be hosted on a single physical server machine using DRP.

In this paper, we use dynamic resource adaptation to track changes in workload demands. Our approach adopts modern virtualization platforms which export a user-level interface to bound the maximum resource allocation per VM at runtime.

B. Related work

The application performance can be controlled by controlling its CPU utilization. It has been observed that the application response times stay low [14] as long as the utilization remains below the allocation by a certain threshold. When the threshold is exceeded, the response times increase dramatically and, as a result, the performance of the application drops.

In [15] and [16], the authors directly control application response times through runtime CPU allocation, using an offline system identification approach with which they tried to model the relationship between the response times and the CPU allocations in regions where it is measured to be linear. However, as this relationship is application-specific and relies on offline-identified performance models, it cannot be applied when multiple applications are running concurrently, and it cannot easily be adjusted to new conditions. This motivated the search for other approaches.

The authors in [16] and [17] were among the first to connect the control of the application CPU utilization within the VM with the response times. The use of control-based techniques has emerged as a natural approach for resource provisioning in a virtualized environment. Control-based approaches design controllers that continuously update the maximum CPU allocated to each VM based on CPU utilization measurements. For example, Padala et al. [18] present a two-layer non-linear controller to regulate the utilization of the virtualized components of multi-tier applications. Kalyvianaki et al. [8], [9] formulate the CPU allocation problem as a tracking one and propose adaptive Kalman-based controllers to track and maintain the CPU utilization to a user-defined threshold. Even though the Kalman filter provides an optimal estimate if the noise is Gaussian, it may perform poorly if the noise
characteristics are different. To account for uncertainties in the system model and noise statistics, Charalambous et al. [19] propose the use of an H∞ controller in order to minimize the maximum error caused by the uncertainties in the model. This type of controller showed improved performance in saturation periods and sudden workload changes, but it requires tuning an extra parameter to achieve the desired performance.

Multi-Input-Multi-Output (MIMO) feedback controllers have also been considered; see, for example, [4] and [9]. These controllers make global decisions by coupling the resource usage of all components of multi-tier server applications. In addition, the resource allocation problem across consolidated virtualized applications under conditions of contention has been considered in [18], [20]: when some applications demand more resources than physically available, the controllers share the resources among them, while respecting the user-given priorities.

In another line of research, neuro-fuzzy control has been used for the CPU utilization decisions of virtual-machine-level controllers. For example, Sithu et al. [21] use CPU load profiles to train the neuro-fuzzy controller to only predict the usage with 100% allocation, without having any CPU allocation scheme to train on. Deliparaschos et al. [22], instead, use data from established controllers, such as [8] and [19], to train their neuro-fuzzy controller.

III. NOTATION AND PRELIMINARIES

A. Notation

$\mathbb{R}$ and $\mathbb{R}_+$ represent the real and the nonnegative real numbers, respectively. Vectors are denoted by small letters, matrices are denoted by capital letters, and sets by calligraphic capital letters. $A^T$ and $A^{-1}$ denote the transpose and inverse of matrix $A$, respectively. By $I$ we denote the identity matrix. $\hat{a}_{k|k-1}$ and $\hat{a}_{k|k}$ denote the a priori and a posteriori estimates of random vector $a_k$ for time instant $k$. $P_k$ denotes the matrix $P$ at time instant $k$. $E\{\cdot\}$ represents the expectation of its argument.

B. Performance metric

One of the most widely used metrics for measuring server performance is the client mean request response time (mRT). It is difficult to predict the values of the mRT of server applications across operating regions and different applications and workloads. However, it is known to have certain characteristics [14]. More specifically, its values can be divided into three regions (see, e.g., [8], [9], [19]):
(a) when the application has abundant resources and, therefore, all requests are served as they arrive and the response times are kept low;
(b) when the utilization approaches 100% (e.g., around 70-80% on average), the mRT increases above the low values of the previous region, because there are instances in which the requests increase abruptly, with utilization approaching 90-100%;
(c) when resources are scarce and utilization is very close to 100%: since requests compete for limited resources, they wait in the input queues for long and, as a result, their response times increase dramatically to relatively high values.

In this work, the response time of every type of request was captured by calculating the time difference between the request and its response, as Fig. 2 shows. All requests were issued to our RUBiS cluster, and specifically to the Web Server, through the Client Emulator that was deployed on a separate physical machine. When all requests were completed, the mean of the response times over each time interval of 1 s was calculated, in order to obtain a picture of the mRT over time.

Fig. 2: Request-to-response path.

To maintain good server performance, operators try to keep the CPU utilization below 100% of the machine capacity by a certain value, which is usually called headroom. Headroom values are chosen such that they form the boundary between the second and the third mRT regions. At such values the server is well provisioned and response times are kept low. If the utilization exceeds the boundary due to increased workload demands, operators should increase the server resources.

Firstly, we measure the server's performance when 100% of resources is provisioned, without any controller adjusting the allocation of resources, in order to determine the required headroom. In this work, we consider a Browsing Mix workload type, in order to characterize the server's performance while the number of clients varies. Fig. 3 shows the mean response times (mRT) when the number of clients increases in steps of 200, until the mRT grows above 1 s.

[Plot: client mRT (ms) versus number of clients, 200 to 2000.]
Fig. 3: Mean Response Times (mRT) for different workloads. It is evident that for more than 1200 clients, the mRT grows fast and beyond 1700 the SLO is violated.

Initially, as the number of clients increases, the mRT stays low. However, when the number of clients exceeds 1200, the mRT starts to grow above the low values. Specifically, the QoS target of 1 s is reached when the number of clients simultaneously issuing requests to the server is 1700.
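The per-interval mRT bookkeeping described in this subsection can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RUBiS Client Emulator's code: each completed request is assumed to be available as a hypothetical `(sent_s, received_s)` pair of timestamps in seconds, a request is assigned to the 1 s interval in which it was sent, and the function names are invented for this sketch. The 1 s QoS target is the one used in the text.

```python
from collections import defaultdict


def mean_response_times(requests, interval=1.0):
    """Return {interval_index: mRT in ms} for (sent_s, received_s) pairs.

    Assumption: a request is binned by the interval in which it was sent.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for sent, received in requests:
        k = int(sent // interval)                 # index of the 1 s interval
        sums[k] += (received - sent) * 1000.0     # response time in ms
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


def slo_violations(mrt_by_interval, target_ms=1000.0):
    """Indices of intervals whose mRT exceeds the QoS target."""
    return [k for k, mrt in mrt_by_interval.items() if mrt > target_ms]


# Hypothetical request log: two fast requests, then two slow ones.
log = [(0.10, 0.15), (0.50, 0.60), (1.20, 2.70), (1.80, 2.90)]
mrt = mean_response_times(log)
violations = slo_violations(mrt)
```

With this made-up log, the first interval averages roughly 75 ms and the second roughly 1300 ms, so only the second interval is reported as exceeding the target.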
Fig. 4 shows the average CPU usage per component while the number of clients increases. As shown in this figure, the database server demand is lower than the web server's for the same number of clients. When the number of clients exceeds 1400, the web server's CPU usage becomes the bottleneck and, even though the database server does not use 100% of its resources, its usage remains (almost) constant. Hence, it is important to establish the required resources for all the components involved in serving the requests.

[Plot: average CPU usage (%) of the Web Server and the Database Server versus number of clients, 200 to 2000.]
Fig. 4: Average CPU usages per component for different workloads.

C. System model

For this paper, we assume a two-tier server application composed of two components. Each component runs on a different VM. However, the allocations are predicted and adapted on one component while the other component has 100% of its resources available. The time-varying CPU utilization per component is modeled as a random walk given by the following linear stochastic difference equation [8], [9], [19], [22]:

$$x_{k+1} = x_k + w_k, \tag{1}$$

where $x_k \in \mathbb{R}_+$ is the percentage of the total CPU capacity actually used by the application component during time-interval $k$. The independent random process $w_k \in \mathbb{R}_+$ represents the process noise. The process noise models the change in utilization between successive intervals caused by workload changes, e.g., requests being added, doing work from previous intervals, or leaving the server.

By $y_k \in \mathbb{R}_+$ we denote the total CPU utilization actually observed in the VM, i.e., $y_k$ models the observed utilization in addition to any usage noise coming from other sources, such as the operating system, to support the application, i.e.,

$$y_k = x_k + v_k, \tag{2}$$

where $v_k \in \mathbb{R}_+$ denotes the utilization measurement noise.

By $a_k \in \mathbb{R}_+$ we denote the CPU capacity of a physical machine allocated to the VM, i.e., the maximum amount of resources a VM can use. The purpose of a designed controller is to control the allocation of the VM running a server application while observing its utilization in the VM, maintaining good server performance in the presence of workload changes. This is achieved by adjusting the allocation to values above the utilization. For each time-interval $k$, the desired relationship between the two quantities is given by:

$$a_k = \min\{(1 + h)x_k,\ a_{\max}\}, \tag{3}$$

where $h \in (0, 1)$ represents the headroom (i.e., how much extra resources are provided above the actual CPU utilization) and $a_{\max}$ is the maximum CPU that can be allocated. To maintain good server performance, the allocation $a_k$ should adapt to the utilization $x_k$. In [8], [9] and in [19] Kalman and H∞ filters have been designed, respectively, to predict the CPU utilization per component of the server and thus allocate the correct amount of resources. Let $\mathcal{Y}_k$ represent the set of all observations up to time $k$. Let the a posteriori and a priori state estimates be denoted by $\hat{x}_{k|k} = E\{x_k \mid \mathcal{Y}_k\}$ and $\hat{x}_{k+1|k} = E\{x_{k+1} \mid \mathcal{Y}_k\}$, respectively; hence, $\hat{x}_{k+1|k}$ is the predicted CPU utilization for time-step $k + 1$. As a result, the CPU allocation controller is given by

$$a_{k+1} = \max\{0,\ \min\{(1 + h)\hat{x}_{k+1|k},\ a_{\max}\}\}. \tag{4}$$

IV. CONTROLLER DESIGN

In this work a new Kalman filter approach is used that uses the Maximum Correntropy Criterion (MCC) for state estimation, referred to in the literature as the MCC Kalman filter (MCC-KF) [11], [12]. The correntropy criterion measures the similarity of two random variables using information from high-order signal statistics [13], [23]–[25]. Since the Kalman filter uses only second-order signal information, it is not optimal if the process and measurement noises are non-Gaussian disturbances, such as shot noise or a mixture of Gaussian noises.

Consider the system with dynamics

$$x_{k+1} = Ax_k + w_k, \tag{5a}$$
$$y_k = Cx_k + v_k, \tag{5b}$$

where $x_k \in \mathbb{R}^{n_x}$ is the state of the system, $u_k \in \mathbb{R}^{n_u}$ is the control input, $w_k \in \mathbb{R}^{n_x}$ is a stochastic disturbance with zero mean and finite second-order matrix $W_k$, $y_k \in \mathbb{R}^{n_y}$ is the observation of the state of the system $x_k$, $v_k \in \mathbb{R}^{n_y}$ is a stochastic disturbance with zero mean and finite second-order matrix $V_k$, and $A$ and $C$ are matrices of appropriate dimensions. Let, also, the a posteriori (updated) and a priori (predicted) error covariances be given by

$$P_{k|k} = E\{(x_k - \hat{x}_{k|k})(x_k - \hat{x}_{k|k})^T \mid \mathcal{Y}_k\},$$
$$P_{k+1|k} = E\{(x_{k+1} - \hat{x}_{k+1|k})(x_{k+1} - \hat{x}_{k+1|k})^T \mid \mathcal{Y}_k\}.$$

The equations for the MCC-KF are summarized below [12]. For the prediction phase:

$$\hat{x}_{k|k-1} = A\hat{x}_{k-1|k-1}, \tag{6a}$$
$$P_{k|k-1} = AP_{k-1|k-1}A^T + W_k, \tag{6b}$$
and for the update phase:

$$L_k = \frac{G_\sigma\left(\left\| y_k - C\hat{x}_{k|k-1} \right\|_{V_k^{-1}}\right)}{G_\sigma\left(\left\| \hat{x}_{k|k-1} - A\hat{x}_{k-1|k-1} \right\|_{P_{k|k-1}^{-1}}\right)}, \tag{6c}$$
$$K_k = \left(P_{k|k-1}^{-1} + L_k C^T V_k^{-1} C\right)^{-1} L_k C^T V_k^{-1}, \tag{6d}$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\left(y_k - C\hat{x}_{k|k-1}\right), \tag{6e}$$
$$P_{k|k} = (I - K_k C)P_{k|k-1}(I - K_k C)^T + K_k V_k K_k^T, \tag{6f}$$

where $G_\sigma$ is the Gaussian kernel, i.e.,

$$G_\sigma(\|x_i - y_i\|) = \exp\left(-\frac{\|x_i - y_i\|^2}{2\sigma^2}\right)$$

with kernel size $\sigma$. Note that $L_k$ is called the minimized correntropy estimation cost function and $K_k$ is the Kalman gain.

Since in this work we are using a SISO controller in our model, given by (1) and (2), the original MCC-KF equations (6a)-(6f) are simplified to:

$$\hat{x}_{k|k-1} = \hat{x}_{k-1|k-1}, \tag{7a}$$
$$P_{k|k-1} = P_{k-1|k-1} + W_k, \tag{7b}$$
$$L_k = \frac{G_\sigma\left(\left\| y_k - \hat{x}_{k|k-1} \right\|_{V_k^{-1}}\right)}{G_\sigma\left(\left\| \hat{x}_{k|k-1} - \hat{x}_{k-1|k-1} \right\|_{P_{k|k-1}^{-1}}\right)}, \tag{7c}$$
$$K_k = \frac{L_k}{\left(P_{k|k-1}^{-1} + L_k V_k^{-1}\right)V_k}, \tag{7d}$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\left(y_k - \hat{x}_{k|k-1}\right), \tag{7e}$$
$$P_{k|k} = (1 - K_k)^2 P_{k|k-1} + K_k^2 V_k. \tag{7f}$$

As can be observed in (7a)-(7f), MCC-KF has the same structure as the Kalman filter, but in addition, it uses high-order statistics to improve state estimation.

V. PERFORMANCE EVALUATION

A. Experimental Setup

A two-tier prototype cluster of the Rice University Bidding System (RUBiS)² was deployed in order to evaluate the performance of our system. The cluster consists of a web server and a database server which are deployed over Xen Hypervisor VMs. Another physical server emulates the clients' behavior by sending requests to the RUBiS auction site via the RUBiS Client Emulator. All controllers presented in this work were added to the base project code called ViResA³, which was first deployed for the synthetic data generation and the performance evaluation of the controllers in [22]. The performance of the dynamic CPU allocation is evaluated using the mRT, which is measured at the client side of our prototype RUBiS server application, as the performance metric. The goal of the control system is to adapt the CPU resource allocations on a single component (e.g., the Web Server VM) in order to save resources for other applications that can be hosted on the same physical machine. There are two parameters by which a server's performance can be affected: (i) the number of clients that send requests simultaneously to the server and (ii) the workload type. An overview of the system's architecture is shown in Fig. 5 below.

² RUBiS is an auction site benchmark that implements the core functionality of an auction site. It is modeled after ebay.com with a web front end and a database. It has provisions for selling, browsing and bidding items, allowing for different sessions for different types of users in the form of visitor, buyer and seller to be implemented.

³ ViResA (Virtualized [server] Resource Allocation) is a base project code hosted in Atlassian Bitbucket (https://bitbucket.org) as a private Git repository. For download requests, please contact the authors.

Fig. 5: System architecture.

B. Headroom

Let parameter $c$ denote the desired CPU utilization to CPU allocation ratio, i.e., $c = 1/(1 + h)$. The mRT with respect to parameter $c$ for the Kalman filter, the H∞ filter and MCC-KF is shown in Fig. 6. In this evaluation, we set a stable workload of 500 clients sending requests simultaneously to the RUBiS auction site. Each measurement is derived from an experiment, where $c$ differs every time in steps of 0.1. As $c$ approaches 1, more resources are saved for other applications to run, but the mRT of requests increases and thus the performance of the RUBiS benchmark drops. This occurs because, as the headroom approaches 0, there are not enough resources left to be used, which means that the requests have to wait in a queue.

[Plot: mRT (ms) versus parameter c, 0.5 to 1.0, for Kalman, H∞ and MCC-KF.]
Fig. 6: mRT with respect to $c$: the closer parameter $c$ is to 1, the larger the mRT becomes. It is observed that MCC-KF can have a larger value of $c$ (i.e., smaller headroom) than the Kalman filter for the same mRT.

As can be observed, all filters can allocate resources without a big increase of mRT when $c$ is less than 0.8. However, when parameter $c$ increases above 0.8, the Kalman filter performs worse than H∞ and MCC-KF. This happens because H∞ and MCC-KF can predict the next allocation when CPU
usage changes in a random way, which brings out the non-Gaussian noise environment of the demand. Based on these observations, we can see that when parameter $c$ is set to 0.8, the mRT does not increase. However, we set parameter $c$ to 0.7 for the subsequent experiments, in order to evaluate the controller's performance while the number of clients, and hence the workload, varies randomly.

In all experiments shown in the following figures, $c$ is set to 0.7. For all the schemes, the initial value of the error covariance matrix, $P_0$, is set to 10, the variance of the process noise, $W$, is set to 4, and the variance of the measurement noise, $V$, is set to 1. All CPU measurements for utilization and allocations are exported from the Web Server's VM, via the Xen Hypervisor. We set up experiments of 100 s total duration, which is enough to evaluate the system performance. At the start of each experiment and for the first couple of seconds, the CPU usage remains relatively stable at a mean value of 25% as a result of the constant workload of 400 clients that issue requests to the RUBiS auction site. During time interval 30-70 s, the RUBiS application is loaded with 1000 clients in total, which then drops to 400 clients after 70 s.

C. Kalman filter

The upper graph of Fig. 7 shows the predicted Kalman filter allocations that are used to adjust the Web Server tier allocations. The lower graph of Fig. 7 depicts RUBiS performance with respect to mRT during Kalman filtering.

[Plots: upper, CPU (%) allocation and usage versus time interval (s), 0 to 100; lower, mRT (ms) versus time interval (s), 0 to 100.]
Fig. 7: Upper: Kalman filter CPU usage and allocation. Lower: mRT for Kalman filter.

Evidently, the Kalman filter performs well as long as the workload fluctuation is relatively small. During time interval 0-30 s, 400 clients issue requests to the RUBiS auction site, while at 30 s, 600 more clients are added to the RUBiS application, which effectively increases the demand in CPU resources. As a result, the CPU usage changes abruptly and the allocation does not adjust well, which leads to a sudden mRT increase, reaching a value of almost 2200 ms (see lower graph of Fig. 7). After this sudden CPU demand increase, the mRT is kept low at its normal values.

D. H∞ filter

Next, we evaluate the performance of the H∞ filter, proposed in [19], with the same parameters as before. Fig. 8 (upper and lower graphs) depicts the CPU usages and allocations and the mRT for the H∞ filter, respectively.

[Plots: upper, CPU (%) allocation and usage versus time interval (s), 0 to 100; lower, mRT (ms) versus time interval (s), 0 to 100.]
Fig. 8: Upper: H∞ filter CPU usage and allocation. Lower: mRT for H∞ filter.

In this experiment the H∞ filter predicts and adjusts the CPU allocations for a period of 100 s. During time interval 0-30 s, 400 clients issue requests to the RUBiS auction site, while at 30 s, 600 more clients are added to the RUBiS application, which effectively increases the demand in CPU resources. After 40 s, the workload of the application remains relatively stable with no big fluctuations, hence the mRT is kept low. It is evident that the H∞ filter performs better than the Kalman filter as the workload increases. As a result, when the CPU usage changes abruptly, the allocation is adjusted satisfactorily while maintaining a lower mRT value than the Kalman filter. Note that the proposed H∞ filter in [19]
was originally evaluated only via simulations using synthetic performance of each controller is obtained through our per-
data, while in this work a real-time testbed is used. formance metric, which is mRT and thus MCC-KF performs
slightly better than Kalman and H∞ filters.
E. MMC-Kalman filter (MCC-KF)
The CPU usages and allocations that MCC-KF predicts 100
and adjusts are shown in the upper graph Fig. 9. RUBiS 90
performance in respect to mRT is shown in the lower graph 80
of Fig. 9. 70
60
CPU (%)
100 50
90 allocation
usage 40
80 30 Kalman
70 20 H∞
60 MCC−KF
10 usage
CPU (%)
50 0
0 10 20 30 40 50
40 time interval (s)
30 Fig. 10: Allocation Predictions with the same CPU usage per
20 Controller.
10 Overall, we observe that all controllers perform well while
0
0 10 20 30 40 50 60 70 80 90 100 the workload is relatively stable, and thus the mRT is kept at
time interval (s) the low mRT region. However, when there is a big workload
3000
[Fig. 9, lower panel: mRT (ms) versus time interval (s) for MCC-KF.]
Fig. 9: Upper: MCC-KF filter CPU usage and allocation. Lower: mRT for MCC-KF filter.

This experiment shows how MCC-KF responds when the number of clients issuing requests to the server increases from 400 to 1000, as in the previous experiments. In this experiment, the sudden workload change occurs during the interval 30–40 s. MCC-KF copes well with the sudden change in CPU usage, while the mRT is kept at 1800 ms. In all other intervals, the mRT stays low because the remaining CPU resources are enough to serve the stable workload.

In order to observe how each controller operates, we ran an experiment with exactly the same demand for all controllers, with a large workload change during the interval 20–30 s. During this experiment, the allocations predicted by the controllers were not applied, since in each experiment only one controller can handle the CPU allocations using the Xen hypervisor. Fig. 10 shows that, while the workload is relatively stable, the predictions of all controllers are almost the same. However, when the demand changes suddenly, the H∞ filter allocates more resources to the RUBiS application than the Kalman filter and MCC-KF. However, when the workload changes because of an increase in the number of clients issuing requests to RUBiS, MCC-KF manages to keep the mRT at lower values than the Kalman filter and slightly lower than the H∞ filter. The H∞ filter, as shown in Fig. 10, manages to keep the mRT low by allocating more CPU resources, while MCC-KF allocates less; as a result, more resources are available for use by other applications on the same physical machine.

VI. CONCLUSIONS AND FUTURE DIRECTIONS

A. Conclusions

Several available virtualization technologies allow the CPU allocation of each component in a virtualized server environment to be adjusted dynamically so as to prevent resource saturation. In order to adjust the allocation, suitable controllers must be developed to take on the role of predicting, and hence adjusting, the CPU allocations of each server component. In this work, a Maximum Correntropy Criterion Kalman Filter (MCC-KF) has been used to adjust the CPU allocations of each component when random workload changes are introduced from a RUBiS auction site, for a varying number of clients, deployed on a Xen-virtualized cluster. The MCC-KF allows for state estimation in the presence of non-Gaussian noise, as is the case for CPU usage.

B. Future Directions

In this work, all the controllers were implemented assuming prior knowledge of the distribution pertaining to the workload changes. However, the system dynamics change with time, since the volume and fluctuation of requests might vary significantly due to increased or decreased popularity of a site, advertisement, etc. For this reason, it is important that self-adaptation techniques are adopted in order to adjust the proposed controllers to these fluctuations [26]–[28].
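As a hedged illustration of what such self-adaptation could look like (this is not one of the controllers evaluated in this work), the sketch below shows a scalar random-walk Kalman filter that re-estimates its process-noise variance from a sliding window of recent CPU usage measurements. The class name `AdaptiveScalarKalman`, the window length, the default noise values, and the rule Q ≈ mean(Δz²) − 2R (which follows from the random-walk model when process and measurement noise are zero-mean and independent) are all assumptions made for illustration.

```python
from collections import deque


class AdaptiveScalarKalman:
    """Scalar random-walk Kalman filter with a windowed process-noise estimate.

    Hypothetical illustration only; not the MCC-KF controller of this paper.
    """

    def __init__(self, x0, p0=1.0, q0=1.0, r=1.0, window=10, q_floor=1e-6):
        self.x = x0              # current usage estimate (e.g. % CPU)
        self.p = p0              # estimate error variance
        self.q = q0              # process-noise variance (adapted online)
        self.r = r               # measurement-noise variance (assumed known)
        self.q_floor = q_floor   # keeps q strictly positive
        self.diffs = deque(maxlen=window)  # recent measurement differences
        self.last_z = None

    def update(self, z):
        """Consume one usage measurement z and return the new estimate."""
        if self.last_z is not None:
            self.diffs.append(z - self.last_z)
        self.last_z = z

        # For x[k+1] = x[k] + w and z = x + v, E[(z[k+1] - z[k])^2] = Q + 2R,
        # so the windowed mean square of the differences gives an estimate of Q.
        if self.diffs:
            mean_sq = sum(d * d for d in self.diffs) / len(self.diffs)
            self.q = max(mean_sq - 2.0 * self.r, self.q_floor)

        # Standard scalar Kalman predict and correct steps.
        p_pred = self.p + self.q
        gain = p_pred / (p_pred + self.r)
        self.x += gain * (z - self.x)
        self.p = (1.0 - gain) * p_pred
        return self.x


# Example: a stable workload followed by a sudden jump in usage.
kf = AdaptiveScalarKalman(x0=40.0)
for usage in [40.0] * 10 + [80.0] * 5:
    estimate = kf.update(usage)
```

With this rule the filter trusts its model while the workload is stable and raises its gain as soon as the recent differences grow, so a sudden change is tracked within a few samples. Other adaptation rules (for example innovation-based ones) would be equally valid choices.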
Resource needs across multiple dimensions, such as compute, storage, and network bandwidth, are coupled; hence, workload consolidation should be performed while taking care of the resource coupling in multi-tier virtualized applications in order to provide timely allocations during workload changes. For this reason, part of ongoing research concentrates on using system identification/learning to extract the coupling between the resource needs for more efficient workload consolidation.

Data centers distributed across different geographic locations can work collaboratively as a whole. They are connected with high-speed internet or dedicated high-bandwidth communication links. VMs can be migrated, if necessary, within a data center, or even across data centers [29]. The combination of dynamically changing the size of VMs and their migration, while a long-standing issue, constitutes a very challenging and open problem.

ACKNOWLEDGEMENTS

The authors would like to thank Ms. Kika Christou from the Information Systems & Technology Service of Cyprus University of Technology for her continued support on computing and networking infrastructure related issues.

REFERENCES

[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, “Xen and the Art of Virtualization,” in Proceedings of the ACM Symposium on Operating Systems Principles (SOSP), 2003, pp. 164–177.
[2] L. Rodero-Merino, L. M. Vaquero, V. Gil, F. Galán, J. Fontán, R. S. Montero, and I. M. Llorente, “From infrastructure delivery to service management in clouds,” Future Generation Computer Systems, vol. 26, no. 8, pp. 1226–1240, 2010.
[3] G. Jung, K. R. Joshi, M. A. Hiltunen, R. D. Schlichting, and C. Pu, “A cost-sensitive adaptation engine for server consolidation of multitier applications,” in Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware. New York, NY, USA: Springer-Verlag New York, Inc., 2009, pp. 9:1–9:20.
[4] E. Kalyvianaki, T. Charalambous, and S. Hand, “Resource Provisioning for Multi-Tier Virtualized Server Applications,” Computer Measurement Group (CMG) Journal, vol. 126, pp. 6–17, 2010.
[5] D. Ardagna, B. Panicucci, M. Trubian, and L. Zhang, “Energy-aware autonomic resource allocation in multitier virtualized environments,” IEEE Transactions on Services Computing, vol. 5, no. 1, pp. 2–19, March 2012.
[6] C. Mastroianni, M. Meo, and G. Papuzzo, “Probabilistic consolidation of virtual machines in self-organizing cloud data centers,” IEEE Transactions on Cloud Computing, vol. 1, no. 2, pp. 215–228, July 2013.
[7] J. Zhang, H. Huang, and X. Wang, “Resource provision algorithms in cloud computing: A survey,” Journal of Network and Computer Applications, vol. 64, pp. 23–42, 2016.
[8] E. Kalyvianaki, T. Charalambous, and S. Hand, “Adaptive resource provisioning for virtualized servers using Kalman filters,” ACM Transactions on Autonomous and Adaptive Systems, vol. 9, no. 2, pp. 10:1–10:35, July 2014.
[9] E. Kalyvianaki, T. Charalambous, and S. Hand, “Self-Adaptive and Self-Configured CPU Resource Provisioning for Virtualized Servers using Kalman Filters,” in Proceedings of the 6th International Conference on Autonomic Computing (ICAC). New York, NY, USA: ACM, 2009, pp. 117–126.
[10] B. Chen, Y. Zhu, J. Hu, and J. C. Príncipe, System Parameter Identification: Information Criteria and Algorithms, 1st ed. Amsterdam, The Netherlands: Elsevier Science Publishers B. V., 2013.
[11] B. Chen, X. Liu, H. Zhao, and J. C. Príncipe, “Maximum Correntropy Kalman Filter,” arXiv:1509.04580 [cs, stat], Sep. 2015. [Online]. Available: http://arxiv.org/abs/1509.04580
[12] R. Izanloo, S. A. Fakoorian, H. S. Yazdi, and D. Simon, “Kalman filtering based on the maximum correntropy criterion in the presence of non-Gaussian noise,” in Annual Conference on Information Science and Systems (CISS), Mar. 2016, pp. 500–505.
[13] W. Liu, P. P. Pokharel, and J. C. Príncipe, “Correntropy: Properties and Applications in Non-Gaussian Signal Processing,” IEEE Transactions on Signal Processing, vol. 55, no. 11, Nov. 2007.
[14] L. Kleinrock, Queueing Systems, Volume 1, Theory. Wiley-Interscience, 1975.
[15] Z. Wang, X. Zhu, and S. Singhal, “Utilization and SLO-Based Control for Dynamic Sizing of Resource Partitions,” in Proceedings of the IFIP/IEEE International Workshop on Distributed Systems: Operations and Management (DSOM), October 2005, pp. 133–144.
[16] X. Zhu, Z. Wang, and S. Singhal, “Utility-Driven Workload Management using Nested Control Design,” in Proceedings of the American Control Conference (ACC), 2006, pp. 6033–6038.
[17] Z. Wang, X. Liu, A. Zhang, C. Stewart, X. Zhu, T. Kelly, and S. Singhal, “AutoParam: Automated Control of Application-Level Performance in Virtualized Server Environments,” in Proceedings of the IEEE International Workshop on Feedback Control Implementation and Design in Computing Systems and Networks (FeBID), 2007.
[18] P. Padala, K. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, A. Merchant, and K. Salem, “Adaptive Control of Virtualized Resources in Utility Computing Environments,” in Proceedings of the European Conference on Computer Systems (EuroSys), 2007, pp. 289–302.
[19] T. Charalambous and E. Kalyvianaki, “A min-max framework for CPU resource provisioning in virtualized servers using H∞ Filters,” in 49th IEEE Conference on Decision and Control (CDC), Dec. 2010, pp. 3778–3783.
[20] P. Padala, K.-Y. Hou, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, and A. Merchant, “Automated Control of Multiple Virtualized Resources,” in Proceedings of the 4th ACM European Conference on Computer Systems (EuroSys ’09). New York, NY, USA: ACM, 2009, pp. 13–26.
[21] M. Sithu and N. L. Thein, “A resource provisioning model for virtual machine controller based on neuro-fuzzy system,” in 2011 The 2nd International Conference on Next Generation Information Technology (ICNIT), Jun. 2011, pp. 109–114.
[22] K. M. Deliparaschos, T. Charalambous, E. Kalyvianaki, and C. Makarounas, “On the use of fuzzy logic controllers to comply with virtualized application demands in the cloud,” in European Control Conference (ECC), June 2016, pp. 649–654.
[23] W. Liu, P. P. Pokharel, and J. C. Príncipe, “Correntropy: A Localized Similarity Measure,” in The 2006 IEEE International Joint Conference on Neural Network Proceedings, 2006, pp. 4919–4924.
[24] R. He, W. S. Zheng, and B. G. Hu, “Maximum Correntropy Criterion for Robust Face Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8, pp. 1561–1576, Aug. 2011.
[25] A. Singh and J. C. Príncipe, “Using Correntropy As a Cost Function in Linear Adaptive Filters,” in Proceedings of the 2009 International Joint Conference on Neural Networks, ser. IJCNN’09. Piscataway, NJ, USA: IEEE Press, 2009, pp. 1699–1704. [Online]. Available: http://dl.acm.org/citation.cfm?id=1704175.1704421
[26] A. Filieri, C. Ghezzi, A. Leva, and M. Maggio, “Self-adaptive software meets control theory: A preliminary approach supporting reliability requirements,” in Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering, ser. ASE ’11. Washington, DC, USA: IEEE Computer Society, 2011, pp. 283–292.
[27] A. Filieri, H. Hoffmann, and M. Maggio, “Automated design of self-adaptive software with control-theoretical formal guarantees,” in Proceedings of the 36th International Conference on Software Engineering, ser. ICSE 2014. New York, NY, USA: ACM, 2014, pp. 299–310.
[28] A. Filieri, M. Maggio, K. Angelopoulos, N. D’Ippolito, I. Gerostathopoulos, A. B. Hempel, H. Hoffmann, P. Jamshidi, E. Kalyvianaki, C. Klein, F. Krikava, S. Misailovic, A. V. Papadopoulos, S. Ray, A. M. Sharifloo, S. Shevtsov, M. Ujma, and T. Vogel, “Software engineering meets control theory,” in Proceedings of the 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, ser. SEAMS ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 71–82.
[29] P. T. Endo, A. V. de Almeida Palhares, N. N. Pereira, G. E. Goncalves, D. Sadok, J. Kelner, B. Melander, and J. E. Mangs, “Resource allocation for distributed cloud: concepts and research challenges,” IEEE Network, vol. 25, no. 4, pp. 42–46, July 2011.
