A CONTEXT AWARE SELF-ADAPTING ALGORITHM FOR
MANAGING THE ENERGY EFFICIENCY OF IT SERVICE CENTRES

Tudor Cioara1, Ionut Anghel1, Ioan Salomie1, Georgiana Copil1, Daniel Moldovan1 and Barbara Pernici2
1 Technical University of Cluj-Napoca, Computer Science Department, Cluj-Napoca, Romania
{tudor.cioara, ionut.anghel, ioan.salomie, georgiana.copil, daniel.moldovan}@cs.utcluj.ro
2 Politecnico di Milano, Dipartimento di Elettronica e Informazione, Milano, Italy
barbara.pernici@polimi.it

ABSTRACT
This paper addresses the problem of run-time management of a service centre's energy efficiency by using a context aware self-adapting algorithm. The algorithm considers the service centre's current context situation and predefined Green/Key Performance Indicators (GPIs/KPIs) to adapt and optimize the service centre energy consumption to the incoming workload. The service centre energy/performance context situation is obtained by collecting current context data related to the business applications scheduled to run on the service centre resources, to the service centre IT computing resources and to the service centre facilities. A reinforcement learning approach is used to decide the best sequence of actions to be executed to bring the service centre resources as close as possible to a GPI/KPI compliant state while minimizing the energy consumption.

Keywords: energy efficiency, service centre, context aware computing, self-adaptive algorithm.

1 INTRODUCTION AND RELATED WORK

Over the last years the energy efficiency management of IT processes, systems and service centres has emerged as one of the most critical environmental challenges to be dealt with. Since computing demand and electricity prices are continuously rising, the energy consumption of IT systems and service centres is expected to become a priority for the industry. Nowadays, service centre electricity consumption accounts for almost 2% of the world electricity production, with a corresponding share of the overall carbon emissions [1], [2]. The GAMES (Green Active Management of Energy in IT Service Centres) [3] research project aims at developing a set of innovative methodologies, metrics, services and tools for the active management of the energy efficiency of IT service centres. The GAMES vision is to create a new generation of Green and Energy Aware IT service centres, both by designing energy efficient service centres and by managing service centre energy efficiency at run-time.

This paper addresses the problem of managing service centre energy efficiency at run-time by proposing a context aware self-adapting algorithm for adjusting the resource usage to the incoming workload. The service centre context consists of the following categories of energy / performance data: (i) data related to the business applications to be executed on the service centre IT computing resources, (ii) data related to the workload and energy consumption of the service centre IT computing resources and (iii) ambient data related to the service centre environment. The self-adapting algorithm implements the four phases of a MAPE control loop: Monitoring, Analyzing, Planning and Execution. In the monitoring phase, the service centre energy / performance context data is collected and represented in a programmatic manner by using our RAP context model [4]. In the analyzing phase, the service centre Green and Key Performance Indicators (GPIs/KPIs) are evaluated on the service centre energy / performance context instance and compared against their predefined values. If the GPIs/KPIs are not fulfilled, in the planning phase a reinforcement learning approach is used to generate the best sequence of actions to be taken in the execution phase for bringing the service centre as close as possible to a GPIs/KPIs compliant state. The adaptation algorithm decides on two types of adaptation actions: consolidation actions and DPM (Dynamic Power Management) actions. The consolidation actions are executed for balancing the service centre workload in an energy efficient manner. DPM actions identify the service centre over-provisioned resources, targeting to set them into low power states.

In the literature, the development of energy-aware computing systems is proposed in [5]. Different energy aware techniques based on integer programming, genetic algorithms or heuristics have been proposed in the last years [6], [7], [8], [9]. A systematic analysis of cost issues in the design of hardware and network systems is presented in [10], while a cost-oriented design of data centres in which
applications are allocated with a fixed number of tiers for satisfying response time requirements is proposed in [11].

To reduce energy consumption, autonomic techniques have been developed to manage workload fluctuations [12] and to determine the optimum trade-offs between performance and energy costs, mainly by switching servers on and off. It has been shown that by implementing Dynamic Voltage Scaling (DVS) mechanisms (i.e., reduction of the CPU frequency), significant energy reductions can be obtained [13], [14]. Self-* algorithms for ad-hoc networks, in which the connections between nodes are added / removed for energy efficiency purposes, are proposed in [15], [16]. An agent based solution for power management in complex service centres comes from the IBM researchers R. Das and J. O. Kephart [17]. The authors use virtualization and load-balancing techniques for the efficient and energy aware management of service centre resources. In [18], a self-adaptive solution for optimizing and balancing performance and energy consumption for RAID storage is proposed. This technique groups the available data into hot (frequently accessed) and cold data. Hot data is kept on an active disk group, while the cold data is kept in an inactive disk group which spins at low speed in order to save energy. Energy-aware provisioning and load dispatching algorithms that take into consideration the energy / performance models and the user experience are designed in [19] for servers that host a large number of long-lived TCP connections. An architecture which supports the design and programming of self-optimizing algorithms for data locality and access in distributed systems is considered in [20]. Based on the application's run-time behaviour, the data access and distribution is re-modelled with respect to a set of predefined energy performance objectives.

The rest of the paper is organized as follows: Section 2 details each MAPE phase of our context aware self-adapting algorithm; Section 3 shows how the context aware self-adapting algorithm can be used to manage the energy efficiency of a service centre, while Section 4 concludes the paper.

2 THE CONTEXT AWARE SELF-ADAPTING ALGORITHM

To define the self-adapting property for a context aware system we have used three main situation calculus concepts: (1) the situation concept, also referred to as system situation or context situation, represents the state of a context aware system (the system's self and execution environment) at a moment of time, (2) the set of conditions concept represents a set of rules that define the possible system-context interactions and are used to guide and control the system execution, (3) the action concept represents the adaptation actions or action plans that are executed by the system to enforce the set of conditions in a specific context situation. We define the self-adapting property for a context aware system as a function which associates to each context situation that fails to fulfil the prescribed set of conditions an adaptation action plan that should be executed in order to re-enforce the set of conditions.

Fig. 1 presents the proposed self-adapting algorithm for context aware systems. The algorithm is based on a control feedback loop with the following MAPE phases: (i) the Monitoring Phase – collects raw data about the system's self and execution environment and uses it to construct and represent the system context situation in a programmatic manner (lines 7-9), (ii) the Analysis Phase – analyses the context situation in order to detect significant abnormal changes (lines 10-11), (iii) the Planning Phase – decides and selects the appropriate adaptation action plan to be executed (lines 12-24), (iv) the Execution Phase – executes the selected action plan (lines 25-26).

Figure 1: The self-adapting algorithm.

The following sub-sections detail the first three phases of the self-adapting algorithm. The execution phase is trivial and is not further detailed.

2.1 The Monitoring Phase

The goal of the self-adapting algorithm monitoring phase is to collect the system's self and execution environment raw context data and to construct / represent the system context situation in a programmatic manner. The system's self raw context data refers to the system internal state and is used to assure the system's self-awareness. This kind of context data is usually gathered using low level system calls, logging or life cycle events. The
system’s execution environment raw context data are example). Each Reference has a unique name that
used to ensure the system environmental awareness. can be referred from all the other XML elements of
This kind of context data is captured using sensor the policy. The reference resources collection is
networks or intelligent devices with the goal of defined by applying three types of restrictions:
assuring the system context awareness. domain restriction (location as criteria), type
The system’s self and execution environment restriction (context resource class as criteria) and
context data collected from various sources is property restriction (context resource properties other
represented in heterogeneous formats and difficult to than location as criteria).
interpret and analyze. There is an evident need for The XML Subject element uses an Reference
providing an integrated, uniform and semantically name to identify the context resources collection on
enhanced context data representation model that can which the policy imposes restrictions. The Subject
be automatically processed and interpreted at run- element has two children: (i) EvaluationTrigger
time. To solve these problems we have used our XML element that contains a set of possible events
RAP (Resources, Actions and Policies) context that may trigger the policy evaluation and (ii)
model defined in [4] to represent the system’s EvaluationCondition XML element containing a set
context situation in a programmatic manner. The of conditions for which the policy is enforced.
RAP model defines two types of context information The XML Target element represents a collection
representations: (i) set based, used to determine the of context resources for which the goals specified by
context changes and (ii) ontology based, used to infer the PolicyGoals XML element applies. The
new context information by means of reasoning and PolicyGoals element is used to determine the action
learning algorithms. plan that has to be executed to enforce the policy.
In the set based approach, the context information To be evaluated, the policies are automatically
is modelled as a triple, C = <R, A, P >, where R is converted into SWRL (Semantic Web Rule
the set of context resources that provide raw data Language) rules and injected into the RAP specific
about the system’s self and execution environment, context model instance ontology. The SWRL rules
A is the set of adaptation actions which are executed are used to reason about RAP context model instance
to enforce system execution guiding rules and P is a ontology individuals in terms of ontology classes and
set of policies which define the rules that guide the properties. Rules are written in the form of an
system execution. In the ontology-based implication between an antecedent (body) and a
representation, the relationships between the context consequent (head).
model sets are modelled in a general purpose context The process of obtaining the SWRL rule
ontology core. The domain specific concepts are antecedent from the XML policy description consists
represented as sub trees of the core ontology by from two phases: (i) the generation of the SWRL
using is-a type relations. A RAP context model atoms used to identify the specific context model
instance represents a context situation and contains instance ontology individuals involved in the
the values of the RAP context model sets elements in reasoning process and (ii) the transformation of the
a specific moment of time. XML policy evaluation condition into SWRL rules.
The SWRL rule consequent contains a SWRL
2.2 The Analysis Phase atom that sets the EvaluationResultProp attribute to
In the analysis the context policies that guide the true for all RAP context model Policy class
system execution are evaluated for the current individuals, which correspond to the XML broken
context situation with the goal of identifying those policies.
context situations in which the rules are broken. The
context policies are represented in XML and 2.2.2 The context situation entropy
automatically converted in SWRL for evaluation The system context situation entropy measures
using our policy representation / evaluation model the level of system’s self and execution environment
proposed in [22]. For measuring the degree of disorder by evaluating the degree of fulfilling the
fulfilling the context policies in a context situation, context policies. If the evaluated system context
we have defined the concept of context situation situation entropy is below a predefined threshold T E,
entropy (ES) and its associated threshold (T E). The then all the policies are fulfilled and adaptation is not
next sub-sections present a short overview of our the required. Otherwise the system must execute
policy representation / evaluation model and adaptation actions to bring the entropy below TE. The
introduce the context situation entropy concept. entropy for a context situation is:

2.2.1 The context policy representation / ∑ ∑ (1)


evaluation model
The context policies are represented into XML where:
using three elements: Reference, Subject and Target.
The XML Reference element represents a
collection of context resources (see Fig. 12 for an
- pwi is the weight of policy i and represents the importance of the policy in the context;
- rwij is the weight of system resource j in policy i. The resource weight reflects the system resource importance for the policy. If a policy imposes no restrictions on a resource, then the weight of the resource for that policy is zero;
- vij is the deviation between the recorded value of system resource j and the accepted value of policy i.

We also define the entropy contribution (Ei) of a policy:

    Ei = pwi · Σj rwij · vij    (2)

Taking into account the system toleration to changes, we have defined two types of entropy thresholds: a restrictive threshold and a relaxed threshold. In the first case, we define the threshold TE = 0 as the lowest entropy value. This means that, whenever a policy-imposed restriction is broken, the self-healing algorithm is triggered. In the second case, for each policy we define an accepted entropy contribution value Ei_accepted and compute the current threshold TE as in (3). For the relaxed threshold, broken policies are tolerated if their entropy contribution is lower than the accepted value.

    TE = Σi Ei_accepted = Σi pwi · Σj rwij · vij_accepted    (3)

2.3 The Planning Phase

This phase deals with deciding and planning the actions that need to be taken in order to enforce the broken policies. The phase starts with identifying previously encountered similar context situations in which the same policies were broken (Section 2.3.1). If a similar situation is found, the same action plan is selected and executed, otherwise a new action plan is constructed using a reinforcement learning based approach (Section 2.3.2).

2.3.1 Equivalent System Context Situations

To identify and classify similar system context situations we have defined the equivalence context situation class concept and an equivalence relation. Two context situations are equivalent if they fulfil and break the same policies. To classify the system context situations in equivalence classes we use information system theory. An information system is defined as an association (W, Atr), where W is a non-empty finite set of objects and Atr is a non-empty finite set of attributes. For this association, a function af that assigns to every object in W a value for each attribute in Atr can be defined. For a context aware system, the non-empty set of objects W is mapped onto the set of context situations S, while Atr is mapped to the set of policies P. We define a function Fsk that assigns to every system context situation sk a list of values returned by evaluating all the defined policies in that context situation. Two system context situations s1 and s2 belong to the same equivalence class if and only if Fs1 and Fs2 generate the same lists of policy evaluation values (Table 1).

Table 1: Determining the equivalence classes for the system context situations.

As a result, the equivalence class is defined as the set of system context situations in which the same adaptation action plan a is executed (we use the notation Sa, where a is the adaptation action plan).

2.3.2 Adaptation actions selection

The action selection phase of the self-adapting algorithm considers the broken policies one by one, ordered by their contribution to the overall entropy (see relation 2). The action selection phase can be divided in three main steps (Fig. 2): (1) select the policy with the highest contribution to the entropy (line 9), (2) for the selected policy, determine the set of affected system resources and organize them into inter-independent resource groups (IIRGs) (line 10) and (3) use reinforcement learning (if necessary) to generate the best action sequence to be taken in order to bring the IIRG's resources as close as possible to a policy compliant state (lines 11-15). The last two steps of the action selection phase are detailed in the following subsections.

Figure 2: The actions selection phase procedure pseudo-code.
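As an illustration of relations (1)-(3), the following minimal Python sketch (ours, not the GAMES implementation; the policy data layout is an assumption of the sketch) computes the context situation entropy and the relaxed threshold:

    # Sketch of relations (1)-(3); a policy carries its weight pw, the resource
    # weights rw_ij, the recorded deviations v_ij and the accepted deviations.
    def entropy_contribution(policy, deviations):
        # relation (2): Ei = pw_i * sum_j(rw_ij * v_ij)
        return policy["pw"] * sum(rw * v for rw, v in zip(policy["rw"], deviations))

    def context_entropy(policies):
        # relation (1): ES = sum_i pw_i * sum_j(rw_ij * v_ij)
        return sum(entropy_contribution(p, p["v"]) for p in policies)

    def relaxed_threshold(policies):
        # relation (3): TE = sum_i Ei_accepted, from the accepted deviations
        return sum(entropy_contribution(p, p["v_accepted"]) for p in policies)

    # Example: one policy over two resources; adaptation is needed when ES > TE.
    policies = [{"pw": 0.7, "rw": [0.6, 0.4], "v": [0.5, 0.0], "v_accepted": [0.2, 0.1]}]
    ES, TE = context_entropy(policies), relaxed_threshold(policies)
    print(ES, TE, "adapt" if ES > TE else "no adaptation needed")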
Extracting Inter-Independent Resource Groups

The inter-independent resource group (IIRG) concept is defined and used in the reinforcement learning algorithm to avoid the situation in which healing a context policy may cause breaking another one because their associated resources are dependent. For example, a temperature sensor and a humidity sensor are dependent when considering a room temperature policy and a human thermal comfort factor policy. In this example, an action executed to heal the room temperature policy changes the values of the context resources associated with that policy and triggers a variation of the values of its dependent context resources, with the result of breaking the human thermal comfort factor policy.

A system resource is dependent on another system resource when its recorded value changes with the variation of the other resource's value. We represent the dependency between system resources as a directed weighted graph having the system resources as nodes, the dependency relations as edges and the dependency degrees as edge weights (dependency graph). The dependency degree between two context resources is computed using a dependency function (Df) defined as follows:

    Df: Resources × Resources → [0, 1]
    Df(i, j) = 0, if there is no path between i and j
    Df(i, j) = 1/k, where k is the length of the shortest path from resource i to j    (4)

The dependency degree between two resources is equal to the inverse of the shortest path length in the dependency graph between their nodes. The dependency graph is also represented as a distance matrix (D) whose elements dx,y are the Df function values representing the dependency degree between Rx and Ry. A dependency graph matrix example for a context policy with three associated context resources is presented in Table 2, where R1 is not dependent on R2 but R2 is dependent on R1.

Table 2: Dependency graph associated distance matrix example.

Two context resources Ri and Rj are inter-independent if the distance matrix contains zero values in positions (i,j) and (j,i). An Inter-Independent Resources Group (IIRG) is a set of inter-independent resources (i.e. for all Ri, Rj in an IIRG, Ri and Rj are inter-independent). The IIRG extraction algorithm is presented in Fig. 3.

Figure 3: The IIRG extraction algorithm.

Fig. 4 shows a tracing scenario for the IIRG extraction algorithm based on the resource dependency graph matrix presented in Table 2. We assume that the resource weight values are in the relation rw2 ≥ rw3 ≥ rw1. Initially we build the priority queue according to the resource weights: (R2, R3, R1). In the first step, we de-queue the R2 resource and place it in the first IIRG set. In the second step, we de-queue the next resource, R3, and try to add it to IIRG1 (the current IIRG). According to the dependency matrix from Table 2, R2 is independent of R3 and R3 is independent of R2, therefore R3 is added to IIRG1. In the third step, the resource R1 is de-queued. Because it is not inter-independent with R2 and R3 (the resources of the current IIRG), we must place it into IIRG2 (a newly constructed IIRG). The queue is now empty and the process is over, resulting in two IIRGs: IIRG1 = (R2, R3) and IIRG2 = (R1).

Figure 4: The IIRG extraction algorithm tracing.
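The tracing above can be reproduced with a short Python sketch of the IIRG extraction of Fig. 3 (our reconstruction; the concrete dependency degrees and weights are assumptions chosen to match the Fig. 4 scenario, and the sketch scans all groups rather than only the current one, which gives the same result here):

    # Sketch of the IIRG extraction: resources are de-queued by weight and grouped
    # so that every pair inside a group is inter-independent.
    # D[i][j] = Df(R(i+1), R(j+1)); non-zero degrees are illustrative.
    D = [[0,   0, 0.5],   # R1
         [1,   0, 0  ],   # R2 depends on R1
         [0.5, 0, 0  ]]   # R3 depends on R1 (illustrative)
    weights = {"R2": 0.9, "R3": 0.6, "R1": 0.3}   # rw2 >= rw3 >= rw1, values assumed

    def inter_independent(i, j):
        return D[i][j] == 0 and D[j][i] == 0

    def extract_iirgs(weights):
        queue = sorted(weights, key=weights.get, reverse=True)   # priority queue: R2, R3, R1
        groups = []
        for name in queue:
            i = int(name[1:]) - 1
            # add the resource to a group whose members are all inter-independent with it
            for g in groups:
                if all(inter_independent(i, int(m[1:]) - 1) for m in g):
                    g.append(name)
                    break
            else:
                groups.append([name])    # open a new IIRG
        return groups

    print(extract_iirgs(weights))        # [['R2', 'R3'], ['R1']]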
Determining the Best Action Sequence

The best action sequence to be taken for a certain IIRG is found by merging the best sequences of actions determined for each of the IIRG resources. For an IIRG resource, the sequence of actions is computed through a reinforcement learning algorithm, by using a policy iteration technique (see Fig. 5).

Figure 5: The reinforcement learning algorithm.

The learning process considers all possible system context situations and builds a decision tree by simulating the execution of all available actions for each system context situation. A tree node represents a system context situation. A tree path between two nodes A and B defines a sequence of actions which, executed on the system context situation represented by node A, generates a new system context situation represented by node B. The reinforcement learning process uses a reward / penalty approach to find the best sequence of actions that the system may execute in a given context situation. In our case, the minimum entropy path in the reinforcement learning decision tree represents the best sequence of actions.

The learning process may generate different types of results, discussed below and shown in Fig. 6.

Case 1: The algorithm finds a possible system context situation that has its entropy lower than the predefined threshold. In this case the sequence of actions that leads to this context situation is selected and the search is stopped.

Case 2: The current system context situation entropy is higher than the threshold, but smaller than the minimum entropy determined so far. We replace the minimum entropy with the new entropy and continue the search process.

Case 3: The current entropy is higher than both the threshold and the minimum entropy; the reinforcement learning algorithm continues the search process. If all exercised paths of the decision tree are cycles, the algorithm stops and chooses the path leading to a state with the minimum entropy.

Figure 6: Reinforcement Learning Process Tracing.

3 SERVICE CENTRE ENERGY AWARE ADAPTATION CASE STUDY

The proposed self-adapting algorithm for context aware systems has been used to manage the energy efficiency of a service centre. We have considered a service centre with the following IT resources (see Fig. 7): (i) a server cluster composed of three physical servers on which virtualized applications are executed, (ii) an external shared storage server and (iii) a set of sensors and facilities interconnected through a sensor network which controls the service centre environment. The self-adapting algorithm resides on a Service Centre Control Unit.

Figure 7: The test case service centre resources.

We also consider that the service centre workload is composed of virtualized activities annotated with Quality-of-Service (QoS) requirements.

Table 3: QoS annotations of the virtualized activities.

    Act. No. | Activity KPI
             | Cores No. | CPU Frequency | Memory  | Storage
    1        | 4         | 2000 MHz      | 2000 MB | 520 MB
    2        | 4         | 1000 MHz      | 2000 MB | 600 MB
    3        | 3         | 1200 MHz      | 750 MB  | 512 MB
    4        | 4         | 2000 MHz      | 1000 MB | 400 MB
    5        | 1         | 600 MHz       | 1000 MB | 512 MB
    6        | 1         | 1200 MHz      | 500 MB  | 600 MB
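For concreteness, the QoS annotations in Table 3 can be encoded as plain records (an illustrative encoding of ours, not the GAMES data model):

    # Illustrative encoding of the Table 3 QoS annotations
    # (cores, CPU frequency in MHz, memory in MB, storage in MB).
    from dataclasses import dataclass

    @dataclass
    class Activity:
        no: int
        cores: int
        cpu_mhz: int
        mem_mb: int
        storage_mb: int

    ACTIVITIES = [
        Activity(1, 4, 2000, 2000, 520),
        Activity(2, 4, 1000, 2000, 600),
        Activity(3, 3, 1200, 750, 512),
        Activity(4, 4, 2000, 1000, 400),
        Activity(5, 1, 600, 1000, 512),
        Activity(6, 1, 1200, 500, 600),
    ]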
Table 3 presents the QoS annotations of the virtualized activities considered in our case study. We use virtualization as an abstraction level because it allows us to manage the activities in a uniform manner and enables activity migration.

The hardware configuration of each service centre physical server that is part of the server cluster, together with the configuration of the external shared storage server, is presented in Table 4. The workload virtualized activities are kept on the external shared storage because: (i) the virtual machine hypervisors require that all virtual machines reside on an ISCSI-3 compatible shared storage to enable real-time migration, (ii) the virtualized activity migration time is significantly reduced and (iii) the un-deployed virtual machines are accessible from all the service centre cluster servers.

Table 4: The case study service centre hardware resources.

    Name           | Cores No. | CPU Freq. | Memory  | Storage | Power source
    Server1        | 8         | 2930 MHz  | 6144 MB | -       | 500 W
    Server2        | 8         | 2930 MHz  | 6144 MB | -       | 500 W
    Server3        | 8         | 2930 MHz  | 6144 MB | -       | 500 W
    Storage Server | 2         | 2400 MHz  | 2048 MB | 600 GB  | 350 W

The service centre ambient monitoring infrastructure components (sensors and actuators) and their ranges of values / available actions are shown in Table 5. The actuators are used to control the service centre ambient characteristics and to enact the adaptation actions.

Table 5: The service centre ambient monitoring infrastructure components.

    Infrastructure Component | Type     | Range of values / Available actions
    Temperature              | Sensor   | [-20, 40] ℃ / -
    Humidity                 | Sensor   | [0, 100] % / -
    Light                    | Sensor   | {ON, OFF} / -
    Air Conditioning Unit    | Actuator | - / {Decrease by 5 ℃, Increase by 2 ℃}
    Humidity Controller      | Actuator | - / {Increase by 3 %, Decrease by 3 %}
    Light Controller         | Actuator | - / {Turn ON, Turn OFF}

In the following sub-sections, each of the self-adapting algorithm's MAPE phases is used to optimize the energy consumption of the test case service centre. For each MAPE phase, the involved hardware/software resources and their interactions are presented and discussed.

3.1 The Monitoring Phase

The goal of the self-adapting algorithm monitoring phase is to collect the service centre energy / performance related context data (Section 3.1.1) and to update / represent the service centre context in a programmatic manner (Section 3.1.2). We have identified the following categories of service centre context data: (i) business data related to the application which is executed on the service centre IT computing resources, (ii) current workload and energy consumption data of the service centre IT computing resources and (iii) ambient data related to the service centre environment.

3.1.1 The Service Centre Monitoring Infrastructure

In this section we present the software / hardware monitoring infrastructures used to gather the current workload / energy consumption data and the environment data. The business context data is gathered from the application QoS requirements and therefore no monitoring infrastructure is needed for it.

To accurately capture the workload and energy consumption data of the service centre IT computing resources, we have adapted and used the monitoring architecture recommended by the Standard Performance Evaluation Corporation [24], which defines the following monitoring components (see Fig. 8): System Under Test (SUT), Software Driver, Monitor and Measurement Server.

Figure 8: Workload and energy consumption monitoring infrastructure.

The System Under Test represents the service centre IT computing resources whose workload and energy consumption data is collected. The configuration of each SUT is shown in Table 4. The Software Driver represents the virtualized activities that are used to stress the service centre servers. The Monitor represents a third-party software component that registers the workload and energy consumption data of the SUTs. In our test case the Nagios monitoring framework [25] has been used. Nagios defines two major components: (i) the Nagios Software Server, which runs on the Measurement Server and collects data about the SUTs, and (ii) the Nagios Software Daemons, which run on each monitored SUT and send data to the Nagios Software Server. The Measurement
Server represents the physical system on which the monitor and the driver reside, in our case the Service Centre Control Unit.

The service centre ambient context data is collected through the Wi-microSystem wireless sensor network [26]. The collected sensor data is encoded as MIDI messages which are wirelessly transmitted in real time, over Bluetooth, to the Measurement Server for analysis. The Bluetooth receiver, located on the Measurement Server, is mapped as a Virtual Serial Port (VSP). The MIDI message information is extracted using the Microsoft Windows multimedia API operations and published through web services (see Fig. 9).

Figure 9: Environment monitoring infrastructure.

For simulation purposes, we have implemented a sensor Random Manipulation Mechanism (RMM), which assigns values to the service centre sensors using a predefined pattern (or random values). The mechanism is used to generate context situations in which adaptation is needed (i.e. the temperature and humidity in the service centre are above a predefined limit).

3.1.2 Context Data Representation

To represent the service centre context data we have developed the Energy Aware Context Model (EACM model). The EACM model was constructed by mapping the RAP context model onto the service centre energy efficiency domain, aiming at identifying / classifying the specific elements that populate the RAP context model sets (see Fig. 10).

Context Resources (R). Three types of context resources that generate / collect context data were identified in service centres: (1) Service Centre IT Facility Context Resources (Facility Resources for short), (2) Service Centre IT Computing Context Resources (Computing Resources for short) and (3) Business Context Resources (Business Resources for short). Facility Resources are physical or virtual entities which provide or enforce the service centre ambient properties. Passive Resources capture and store service centre ambient data (e.g. sensors), while Active Resources execute adaptation actions to modify the service centre ambient properties (e.g. actuators). Computing Resources are physical or virtual entities which supply context data related to the service centre's actual workload and performance capabilities. A Computing Resource is characterized by its list of energy consuming states. For example, an Intel Core i7 860@2.8GHz processor has 12 P-states varying from 1197 MHz (40%) up to 2926 MHz (100%). The Computing Resources are classified as Simple and Complex. A Simple Computing Resource provides only one atomic performance property throughout its lifecycle. For example, the CPU energy-related performance is characterized only by its frequency value. A Complex Computing Resource is composed of a set of Simple Resources and is characterized by a set of performance properties. For example, the server energy-related performance is expressed by means of its components' energy-related performance properties, such as the spindle speed for the HDD or the clock rate for the CPU. A Business Resource is a virtual entity which provides information about the QoS requirements of the application executed on the service centre.

Figure 10: The EACM model ontology representation.

Context Actions (A). Three types of service centre adaptation actions are identified by considering the category of context resources through which they are enforced: (i) IT Computing Resources Adaptation Actions, (ii) IT Facility Resources Adaptation Actions and (iii) Application Adaptation Actions. IT Facility Resources Adaptation Actions (e.g. adjust the room temperature or start the CRAC) are enforced through the Active Resources. IT Computing Resources Adaptation Actions are executed to enforce the set of predefined GPI and KPI indicators on the Computing Resources. We have defined two types of such adaptation actions: Consolidation Actions and Dynamic Power Management (DPM) Actions. Consolidation Actions aim at identifying the Computing Resources for which the workload is inefficiently distributed from the energy efficiency point of view and at balancing the workload distribution in an energy efficient manner. DPM Actions aim at determining the over-provisioned resources, with the goal of putting them into low power states for minimizing the energy consumption; a simplified sketch of how these two action types complement each other is given below. Application Adaptation Actions are executed on the service centre application activities (such as application redesign for energy efficiency).
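The following simplified Python sketch (our own illustration, not the GAMES decision logic; the utilization threshold and data layout are assumptions) shows consolidation emptying under-utilized servers by migration, after which DPM hibernates the servers left idle:

    # Simplified sketch: consolidation proposes migrations away from under-utilized
    # servers, then DPM hibernates the servers left without activities.
    def plan_actions(servers, low_util=0.2):
        # servers: {name: {"util": fraction, "activities": [activity ids]}}
        actions = []
        donors = [s for s, d in servers.items() if 0 < d["util"] < low_util]
        targets = [s for s, d in servers.items() if d["util"] >= low_util]
        for s in donors:
            for act in servers[s]["activities"]:       # Consolidation Actions
                if targets:
                    actions.append(("Move", act, s, targets[0]))
            servers[s]["activities"] = []
        for s, d in servers.items():                   # DPM Actions
            if not d["activities"]:
                actions.append(("Hibernate", s))
        return actions

    servers = {"Server1": {"util": 0.8, "activities": [1, 4]},
               "Server2": {"util": 0.1, "activities": [2]},
               "Server3": {"util": 0.0, "activities": []}}
    print(plan_actions(servers))
    # [('Move', 2, 'Server2', 'Server1'), ('Hibernate', 'Server2'), ('Hibernate', 'Server3')]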
Context Policies (P). The constraints regarding the service centre energy / performance are modelled through a predefined set of GPI/KPI policies. We have identified and modelled three categories of GPI/KPI policies: (i) Environmental Policies, imposing restrictions on the service centre ambient conditions, (ii) IT Computing Policies, describing the energy / performance characteristics of the service centre, and (iii) Business Policies, describing the rules imposed by the business on the application execution. Table 6 shows the set of GPI/KPI policies defined and used in the current case study.

Table 6: Service centre energy/performance and environmental policies.

    Environmental Policy         | Accepted values
    Temperature and Humidity     |
    Light                        |

    Energy/Performance Policy    | Accepted values
    Workload values for Server 1 | CPU workload in [600, 2065], memory workload in [1000, 4000] (see Fig. 12)
    Workload values for Server 2 |

3.2 The Analysis Phase

In this phase the GPI/KPI policies are evaluated to determine if they are fulfilled in the current service centre context situation, represented by the EACM model ontology (see Fig. 11).

Figure 11: GPI/KPI policies evaluation process.

The rest of this section presents the analysis process steps taken to determine if the energy / performance policy for SUT1 is fulfilled in the current context situation:

1. The Server 1 energy / performance policy is represented in XML according to our policy model detailed in [22] (see Fig. 12).

    <Reference Name="Server1 Policy">
      <Restrictions Property="Server" Value="Server1" Operator="resourceID">
        <PropertyRestriction Property="associated-resource" Operator="Cpu">
          <PropertyRestriction Property="workload" Operator=">" Value="600" />
          <PropertyRestriction Property="workload" Operator="<" Value="2065" />
        </PropertyRestriction>
        <PropertyRestriction Property="associated-resource" Operator="Memory">
          <PropertyRestriction Property="workload" Operator=">" Value="1000" />
          <PropertyRestriction Property="workload" Operator="<" Value="4000" />
        </PropertyRestriction>
      </Restrictions>
    </Reference>

Figure 12: SUT1 energy / performance policy XML representation.

2. The XML of the energy / performance policy for SUT1 is converted into SWRL rules using our XML to SWRL conversion model presented in [22] (see Fig. 13 for the SWRL representation).

    SimpleResource(?Server)
      ^ resourceID(?Server, "Server1")
      ^ resourceID(?CPU, "Server1_CPU")
      ^ currentWorkload(?CPU, ?currentCPUWorkload)
      ^ swrlb:greaterThanOrEqual(?currentCPUWorkload, 600)
      ^ swrlb:lessThanOrEqual(?currentCPUWorkload, 2065)
      ^ resourceID(?Memory, "Server1_Memory")
      ^ currentWorkload(?Memory, ?currentMEMWorkload)
      ^ swrlb:greaterThanOrEqual(?currentMEMWorkload, 1000)
      ^ swrlb:lessThanOrEqual(?currentMEMWorkload, 4000)
      ^ SimpleResource(?CPU)

Figure 13: Server 1 energy / performance policy SWRL representation.

3. The Server 1 energy / performance policy SWRL representation is injected into the service centre context situation ontology representation and evaluated using the Pellet [21] reasoner.

4. The evaluation result is used to calculate the entropy of the current service centre context situation (relation 1). In our test case the entropy threshold TE is set to 1 and the GPI/KPI related policies can be evaluated to true or false (1 / 0).

5. If the obtained entropy is higher than TE, the planning phase is fired; otherwise nothing happens because no adaptation is needed.

3.3 The Planning Phase

The phase goal is to decide and select the actions that need to be taken in order to enforce a broken GPI/KPI policy. The phase starts with identifying the previously encountered similar context situations in which the same policies had been broken. If a similar situation is found, the same adaptation action is taken. Otherwise, the sequence of actions is determined through a reinforcement learning algorithm, by using a policy iteration technique. The reinforcement
learning process considers all possible service centre context situations and builds a decision tree by using a what-if approach (see Fig. 14). The execution of all available actions is simulated and rewards / penalties are used to find the best sequence of actions.

Figure 14: Determining the adaptation action – a reinforcement learning based approach.

We have defined the following generic adaptation actions: Dx,y: Deploy Activity x to Server y; Hx: Put Server x in Hibernate State; Mx,y: Move Activity x to Server y; Wx: Wake-up Server x.

For reducing the overhead, the algorithm keeps track of the visited nodes while expanding the search space in a greedy manner. During each phase, the algorithm searches for the maximum rewarded node among the existing leafs and expands the search tree. The expansion goes on until the best path is found. The reward at each node is calculated using the formula:

    Reward = (ES − ES') + γ · (MaxActionCost − CurrentActionCost)    (5)

In formula (5), ES is the entropy of the current context state, ES' is the entropy value of a future context state obtained by executing a simple adaptation action, γ represents the future reward's discount factor, MaxActionCost refers to the highest cost of executing a simple adaptation action (the most expensive action), determined by using the what-if analysis, and CurrentActionCost is the cost of executing the current simple adaptation action.

To test the self-adapting algorithm we have considered a scenario in which the virtualized activities presented in Table 3 are executed on the service centre. In the initial service centre state, all the servers are considered as hibernating. The planning phase tracing for this scenario consists of the following steps:

Step 1. Activity 1 is received. The defined GPI/KPI context policies are evaluated. The GPI/KPI requirements of Activity 1 are broken because all the servers are hibernating. The reinforcement learning algorithm starts evaluating the options and finds the following sequence of adaptation actions: <W1: Wake-up Server 1, D1,1: Deploy Activity 1 to Server 1>.

Step 2. Activity 2 is received. There are no available resources to deploy it, therefore the reinforcement learning algorithm determines the following sequence of adaptation actions: <D2,1: Deploy Activity 2 to Server 1>.

Step 3. Activity 3 and Activity 4 are simultaneously received. There are no available resources to deploy them on Server 1. The following sequence of adaptation actions is determined: <W3: Wake-up Server 3, D3,3: Deploy Activity 3 to Server 3, M2,3: Move Activity 2 to Server 3, D4,1: Deploy Activity 4 to Server 1>.

Step 4. Activity 5 and Activity 6 are simultaneously received. There are no available resources to deploy them, therefore the reinforcement learning algorithm determines the following sequence of adaptation actions: <D5,3: Deploy Activity 5 to Server 3, D6,3: Deploy Activity 6 to Server 3>.

Fig. 15 shows the reinforcement learning algorithm decision tree constructed and used to find the adaptation action sequence solution of Step 3 of the scenario. At each iteration the algorithm does a what-if analysis by exploring and simulating all the available generic adaptation actions for each context situation and chooses the decision tree leaf with the largest reward value. This eliminates the greedy pitfalls, by ensuring that the search is always conducted from the best state encountered so far.

Figure 15: Planning phase "what-if" decision tree for step 3.

Fig. 16 shows the estimated energy consumption (EEC) values for our test case service centre in two situations: (i) the service centre is managed by our self-adapting algorithm (marked with green) and (ii) the service centre is unmanaged (marked with red).

Figure 16: Energy consumption for the current scenario.
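The planning-phase tracing above can be reproduced in outline with the following sketch of the greedy what-if search (our reconstruction; the reward follows relation (5) as reconstructed above, while entropy(), apply_action() and cost() are placeholder models to be supplied by the caller, and the default TE = 1 matches the case study threshold):

    # Sketch of the greedy what-if tree search: always expand the best-rewarded leaf,
    # simulating every generic adaptation action, until a state with ES <= TE is found.
    import heapq

    def best_action_sequence(state, actions, entropy, apply_action, cost,
                             TE=1.0, gamma=0.9, max_cost=1.0, max_steps=1000):
        # leaf priority = -reward; the counter breaks ties between equal rewards
        frontier = [(-0.0, 0, state, [])]
        counter = 1
        while frontier and counter < max_steps:
            _, _, s, path = heapq.heappop(frontier)
            if entropy(s) <= TE:
                return path                      # policy compliant state reached
            for a in actions:
                s2 = apply_action(s, a)
                # reward per relation (5), as reconstructed in the text
                reward = (entropy(s) - entropy(s2)) + gamma * (max_cost - cost(a))
                heapq.heappush(frontier, (-reward, counter, s2, path + [a]))
                counter += 1
        return None                              # search budget exhausted, no solution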
To estimate the energy consumption of the service centre servers, the JouleMeter tool created by Microsoft Research was used [23]. We have also defined the server base energy consumption as the energy used by a server in idle mode (it can be estimated using the vendor specification). The energy consumption of the service centre in normal operation is computed as:

    (6)

Using relation (6), the managed and unmanaged service centre energy consumption can be computed. The better results of the managed service centre in Fig. 16 are mainly due to the fact that, in this case, the servers remain in a low power state while not needed.

The self-adapting algorithm will always try to find the best solution, by considering both the service centre energy consumption and the workload activities' QoS requests. The results in Fig. 17 show that the algorithm converges quite fast, taking less than 600 ms to find a solution if one exists. For improving the real-time performance of the algorithm, a suboptimal solution can be returned by limiting the number of search steps based on a trade-off between decision time and solution quality (energy efficiency). This avoids exploring the entire search space in case no optimal solution exists.

Figure 17: Decision time for the current scenario.

4 CONCLUSIONS

This paper proposes a context aware self-adapting algorithm for balancing the performance and energy consumption of a service centre. The self-adapting algorithm automatically detects / analyzes the workload and performance changes in the service centre and then decides on how it should adapt in order to minimize the energy consumption. The obtained results are promising, showing that the algorithm is capable of managing the service centre energy efficiency at run-time by dynamically taking consolidation and dynamic power management actions. The results also show that an energy saving of about 55% is obtained when the service centre is managed by our self-adapting algorithm.

ACKNOWLEDGEMENT

The work has been done in the context of the EU FP7 GAMES project [3].

5 REFERENCES

[1] V. Metha: A Holistic Solution to the IT Energy Crisis, available at http://greenercomputing.com (2007).
[2] J.M. Kaplan, W. Forrest, N. Kindler: Revolutionizing Data Center Energy Efficiency, McKinsey & Company, Technical Report (2008).
[3] GAMES Research Project, http://www.green-datacenters.eu/.
[4] T. Cioara, I. Anghel, I. Salomie: A Generic Context Model Enhanced with Self-configuring Features, Journal of Digital Information Management, vol. 7/3, pp. 159-165, ISSN 0972-7272 (2009).
[5] L. A. Barroso, U. Hölzle: The Case for Energy-Proportional Computing, IEEE Computer, vol. 40/12, pp. 33-37 (2007).
[6] L. Zeng, B. Benatallah, M. Dumas, J. Kalagnanam, H. Chang: QoS-aware middleware for Web services composition, IEEE Trans. on Software Engineering, vol. 30/5 (2004).
[7] D. Ardagna, M. Comuzzi, E. Mussi, P. Plebani, B. Pernici: PAWS: a framework for processes with adaptive Web services, IEEE Software, vol. 24/6, pp. 39-46 (2007).
[8] G. Canfora, M. Penta, R. Esposito, M. L. Villani: QoS-Aware Replanning of Composite Web Services, Proc. of the Int. Conference on Web Services (ICWS'05), pp. 121-129 (2005).
[9] T. Yu, Y. Zhang, K.-J. Lin: Efficient algorithms for web services selection with end-to-end QoS constraints, ACM Trans. Web, vol. 1/1, pp. 1-26 (2007).
[10] D. Ardagna, C. Francalanci, M. Trubian: Joint Optimization of Hardware and Network Costs for Distributed Computer Systems, IEEE Trans. on Systems and Cybernetics, vol. 2 (2008).
[11] W. Lin, Z. Liu, C.H. Xia, L. Zhang: Optimal Capacity Allocation for Multi-Tiered Web Systems with End-to-End Delay Guarantees, Perform. Eval., vol. 62/1-4, pp. 400-416 (2005).
[12] J. S. Chase, D. C. Anderson: Managing Energy and Server Resources in Hosting Centers, In ACM Symposium on Operating Systems Principles, vol. 35/5 (2001).
[13] D. Ardagna, M. Trubian, L. Zhang: SLA based resource allocation policies in autonomic environments, Journal of Parallel and Distributed Computing, vol. 67/3, pp. 259-270 (2007).
[14] D. Kusic, N. Kandasamy: Risk-aware limited lookahead control for dynamic resource provisioning in enterprise computing systems, Cluster Computing, vol. 10/4, pp. 395-408 (2007).
[15] S. Ganeriwal, A. Kansal, M. B. Srivastava: Self Aware Actuation for Fault Repair in Sensor Networks, Proc. of the IEEE International Conference on Robotics and Automation, pp. 5244-5249 (2004).
[16] J. Saia, A. Trehan: Picking up the Pieces: Self-Healing in Reconfigurable Networks, In Proc. of the IEEE International Parallel and Distributed Processing Symposium, pp. 1-12 (2008).
[17] R. Das, J. O. Kephart, C. Lefurgy, G. Tesauro, D. W. Levine, H. Chan: Autonomic multi-agent management of power and performance in data centers, In Proc. AAMAS, pp. 107-114 (2008).
[18] Q. Zhu, Y. Zhou: Chameleon: A Self-Adaptive Energy-Efficient Performance-Aware RAID, IBM Austin Conference on Energy-Efficient Design (ACEED) (2005).
[19] G. Chen, W. He: Energy-Aware Server Provisioning and Load Dispatching for Connection-Intensive Internet Services, The 5th USENIX Symp. on Net. Syst. Design and Impl., pp. 337-350 (2008).
[20] R. Buchty, J. Tao, W. Karl: Automatic Data Locality Optimization Through Self-Optimization, LNCS 4124, pp. 187-201, Springer-Verlag Berlin Heidelberg (2006).
[21] E. Sirin, B. Parsia, B. C. Grau, A. Kalyanpur, Y. Katz: Pellet: A practical OWL-DL reasoner, Web Semantics: Science, Services and Agents on the World Wide Web, vol. 5/2, pp. 51-53 (2007).
[22] T. Cioara, I. Anghel, I. Salomie: A Policy-based Context Aware Self-Management Model, 11th Int. Symp. on Symbolic and Numeric Alg. for Scientific Comp. (SYNASC), pp. 333-341, ISBN 978-0-7695-3964-5 (2009).
[23] A. Kansal, F. Zhao, J. Liu, N. Kothari, A. Bhattacharya: Virtual Machine Power Metering and Provisioning, Proceedings of the 1st ACM Symposium on Cloud Computing, pp. 39-50, ISBN 978-1-4503-0036-0 (2010).
[24] The Standard Performance Evaluation Corporation, www.spec.org.
[25] D. Josephsen: Building a Monitoring Infrastructure with Nagios, Prentice Hall PTR, ISBN 0132236931 (2007).
[26] I. Anghel, T. Cioara, I. Salomie, M. Dinsoreanu: An Agent-based Context Awareness Management Framework, Proceedings of the 8th International Conference RoEduNet, pp. 107-113, ISBN 978-606-8085-15-9 (2009).
