Greenhead: Virtual Data Center Embedding Across Distributed Infrastructures
Ahmed Amokrane∗, Mohamed Faten Zhani†, Rami Langar∗, Raouf Boutaba†, Guy Pujolle∗
∗ LIP6 / UPMC - University of Pierre and Marie Curie; 4 Place Jussieu, 75005 Paris, France
† University of Waterloo; 200 University Ave. W., Waterloo, ON, Canada
Email: ahmed.amokrane@lip6.fr; mfzhani@uwaterloo.ca; rami.langar@lip6.fr; rboutaba@uwaterloo.ca; guy.pujolle@lip6.fr

Abstract—Cloud computing promises to provide on-demand computing, storage and networking resources. However, most cloud providers simply offer virtual machines (VMs) without bandwidth and delay guarantees, which may hurt the performance of the deployed services. Recently, some proposals suggested remedying this limitation by offering Virtual Data Centers (VDCs) instead of VMs only. However, they have only considered the case where VDCs are embedded within a single data center. In practice, infrastructure providers should have the ability to provision requested VDCs across their distributed infrastructure to achieve multiple goals, including revenue maximization, operational cost reduction, energy efficiency and green IT, or simply to satisfy geographic location constraints of the VDCs. In this paper, we propose Greenhead, a holistic resource management framework for embedding VDCs across geographically distributed data centers connected through a backbone network. The goal of Greenhead is to maximize the cloud provider's revenue while ensuring that the infrastructure is as environment-friendly as possible. To evaluate the effectiveness of our proposal, we conducted extensive simulations of four data centers connected through the NSFNet topology. Results show that Greenhead improves the requests' acceptance ratio and revenue by up to 40% while ensuring high usage of renewable energy and a minimal carbon footprint.

Index Terms—Green Computing, Energy Efficiency, Cloud Computing, Virtual Data Center, Distributed Infrastructure

1 INTRODUCTION

Cloud computing has recently gained significant popularity as a cost-effective model for hosting large-scale online services in large data centers. In a cloud computing environment, an Infrastructure Provider (InP) partitions the physical resources inside each data center into virtual resources (e.g., Virtual Machines (VMs)) and leases them to Service Providers (SPs) in an on-demand manner. A service provider, in turn, uses those resources to deploy its service applications, with the goal of serving its customers over the Internet. Unfortunately, current InPs like Amazon EC2 [1] mainly offer resources in terms of virtual machines without providing any performance guarantees in terms of bandwidth and propagation delay. The lack of such guarantees significantly affects the performance of the deployed services and applications [2].

To address this limitation, recent research proposals urged cloud providers to offer resources to SPs in the form of Virtual Data Centers (VDCs) [3]. A VDC is a collection of virtual machines, switches and routers that are interconnected through virtual links. Each virtual link is characterized by its bandwidth capacity and its propagation delay. Compared to traditional VM-only offerings, VDCs provide better isolation of network resources, and thereby improve the performance of service applications. Despite its benefits, offering VDCs as a service introduces a new challenge for cloud providers called the VDC embedding problem, which aims at mapping virtual resources (e.g., virtual machines, switches, routers) onto the physical infrastructure. So far, few works have addressed this problem [2], [4], [5], but they only considered the case where all the VDC components are allocated within the same data center.

Distributed embedding of VDCs is particularly appealing for SPs as well as InPs. In particular, a SP uses its VDC to deploy various services that operate together in order to respond to end-user requests. As shown in Fig. 1, some services may need to be in the proximity of end-users (e.g., Web servers), whereas others may not have such location constraints and can be placed in any data center (e.g., MapReduce jobs). On the other hand, InPs can also benefit from embedding VDCs across their distributed infrastructure. In particular, they can take advantage of the abundant resources available in their data centers and achieve various objectives including maximizing revenue, reducing costs and improving the infrastructure sustainability.

In this paper, we propose a management framework able to orchestrate VDC allocation across a distributed cloud infrastructure. The main objectives of such a framework can be summarized as follows.

• Maximize revenue. Certainly, the ultimate objective of an infrastructure provider is to increase its revenue by maximizing the amount of leased resources and the number of embedded VDC requests. However, embedding VDCs requires satisfying different constraints, namely the capacity and location constraints: the embedding scheme must ensure that the capacity of the infrastructure is never exceeded, and it must satisfy the location constraints imposed by SPs.


Fig. 1: Example of VDC deployment over a distributed infrastructure

• Reduce backbone network workload. To cope with the growing traffic demand between data centers, infrastructure providers tend to build their own proprietary wide-area backbone networks to interconnect their facilities (e.g., the Google G-scale network [6]). It has been reported recently that the cost of building an inter-data center network is much higher than the intra-data center network cost, and that it accounts for 15% of the total infrastructure cost [7]. Furthermore, according to several studies [8], wide-area data transport is bound to be the major contributor to data transport costs. Hence, to reduce operational costs and avoid an eventual inter-data center bottleneck, it is crucial to reduce the backbone network traffic and place highly-communicating VMs within the same data center whenever possible. In other words, one key objective when embedding VDCs is to minimize the traffic within the backbone network.

• Reduce data center operational costs. Reducing data centers' operational costs is a critical objective of any infrastructure provider as it impacts its budget and growth. This can be achieved through minimizing energy costs, which constitute a significant portion of the total operational expenditure. To this end, two key techniques can be adopted: (1) placing more workload into the most energy-efficient data centers, and (2) taking advantage of the difference in electricity prices between the locations of the infrastructure facilities. Indeed, energy-efficient data centers can be identified by their Power Usage Effectiveness (PUE) and favored to host more virtual machines. In addition, InPs can achieve more savings by considering the fluctuation of electricity prices over time and the price differences between the locations of the data centers. In that case, VMs can be efficiently placed such that the total electricity cost is minimized.

• Reduce the carbon footprint. Recent research has reported that, in 2012, the carbon footprint of data centers around the world accounted for 0.25% of the worldwide carbon emission, which constitutes 10% of Information and Communication Technologies (ICT) emissions [9]. As a result, InPs are facing a lot of pressure to operate on renewable sources of energy (e.g., solar and wind power) to make their infrastructure more green and environment-friendly. Obviously, the availability of renewables depends on the data center geographical location, the time of the day (e.g., day and night for solar power) as well as the weather conditions (e.g., wind, atmospheric pressure). Hence, an efficient VDC embedding scheme should maximize the usage of renewables and take into account their availability. Furthermore, whenever power from the electric grid is used, the placement of the VMs is critical since the carbon footprint per watt of power varies from location to location. In that case, the VDC embedding scheme has to minimize the infrastructure carbon footprint.

Based on these observations, we propose Greenhead, a resource management framework for VDC embedding across a distributed infrastructure. Greenhead aims at maximizing the InP's revenue by minimizing energy costs, while ensuring that the infrastructure is as environment-friendly as possible. To reduce the complexity of the problem, we propose a two-step approach. We first divide a VDC request into partitions such that the inter-partition bandwidth demand is minimized and the intra-partition bandwidth is maximized. The aim of such partitioning is to embed VMs exchanging high volumes of data in the same data center. This significantly reduces the traffic carried by the backbone, and thereby improves the requests' acceptance ratio. We then propose a simple yet efficient algorithm for assigning partitions to data centers based on electricity prices, data centers' PUEs, availability of renewables and the carbon footprint per unit of power. To the best of our knowledge, this is the first effort to address the VDC embedding problem over a distributed infrastructure taking into account energy efficiency as well as environmental considerations.

The remainder of this paper is organized as follows. In Section 2, we present related works relevant to ours. We then describe the proposed management framework in Section 3. We provide a mathematical formulation of the VDC embedding problem across a distributed infrastructure in Section 4. Section 5 presents a detailed description of the proposed algorithms for VDC partitioning and embedding. Section 6 discusses the simulation results showing the effectiveness of Greenhead. We finally provide concluding remarks in Section 7.

2 RELATED WORK

In this section, we survey relevant research in the literature. We classify previous work into three related topics, namely: VDC embedding within a single data center, virtual network embedding, and workload scheduling across geographically distributed data centers.

• VDC embedding within a single data center. So far, only few works have addressed the VDC embedding problem. Guo et al. [5] proposed a data center network virtualization architecture called SecondNet that incorporates a greedy algorithm to allocate resources to VDCs. Ballani et al. [2] proposed two abstractions for VDCs, namely a virtual cluster and an oversubscribed virtual cluster. They developed Oktopus, an implementation of those abstractions that uses a greedy algorithm for mapping virtual resources to a tree-like physical topology. Zhani et al. [4] presented VDC Planner, a resource management framework for data centers that leverages dynamic VM migration to improve the acceptance ratio of VDCs, and thereby increases the InP's revenue. They also proposed a VDC consolidation algorithm to minimize the number of active physical servers during low-demand periods. Unfortunately, the above proposals cannot be directly applied to allocate resources in multiple data centers due to the large size of the resulting topology.

• Virtual network embedding. Virtual network embedding has been extensively studied in the literature. It basically aims at embedding virtual nodes (mainly routers) and links on top of a physical backbone substrate. Current proposals have addressed the embedding problem either in a single domain (i.e., a backbone owned and managed by a single InP) or in multiple domains (i.e., multiple networks managed by different InPs). In the single-domain case, the virtual network request is sent to a single InP, which has the responsibility of allocating the required resources. The InP tries to embed the virtual networks while aiming to achieve multiple objectives, including: (1) minimizing the embedding cost [11], (2) maximizing the acceptance ratio and revenue [12], [13], and (3) improving energy efficiency [14], [15]. In the multi-domain case, the request is provisioned across multiple domains belonging to different InPs. For instance, Houidi et al. [16] proposed a centralized approach where the SP first splits the request using Max-Flow Min-Cut based on the prices offered by the different InPs, then decides where to place the partitions. Chowdhury et al. [17] proposed a distributed embedding solution called PolyVine. In PolyVine, the virtual network request is sent to a single InP, which tries to allocate as much resources as possible in its own network before forwarding the un-embedded nodes and links to a neighboring provider. The process continues recursively until the whole request is embedded. The above proposals on virtual network embedding cannot be directly applied to the VDC embedding problem for many reasons. While a virtual network can be made of tens of nodes (mainly routers), a VDC, expected to be similar to a real data center, may comprise thousands of nodes of different types (e.g., VMs, virtual switches and routers). There is, therefore, a definite need for developing new solutions able to embed large-scale VDCs and to consider the diversity of resources. Furthermore, in a distributed environment, different considerations should be taken into account, such as the carbon footprint of the data centers and the variability of electricity prices over time and between different locations.

• Workload placement in geographically distributed data centers. Several works have addressed the problem of workload placement in geographically distributed data centers. They either aimed at reducing energy costs [18]-[20], minimizing the carbon footprint [21], [22], or both [23]. For instance, Qureshi et al. [20] addressed the problem of replica placement and request routing in Content Distribution Networks (CDN). They aimed at reducing electricity costs by dynamically placing data at locations with low electricity prices. Zhang et al. [18] used a model predictive control framework for service placement. Services are dynamically placed in data centers and migrated according to the demand and price fluctuation over time, while considering the migration cost and the latency between services and end-users. Le et al. [24] addressed the same problem but aimed at minimizing energy costs. In their work, energy costs are cut down by taking advantage of the variability of electricity prices between different data centers, and even at the same location over time. Liu et al. [21] proposed a framework for workload assignment and dynamic workload migration between data centers that minimizes the latency between end-users and services while following the renewables and avoiding the use of power from the electricity grid. In their work, the carbon footprint is reduced by following the renewables available during some periods of the day. Gao et al. [22] aimed at minimizing the carbon footprint and the delay between end users and the location of the data. Finally, the authors of [25] proposed a workload assignment framework across multiple data centers that minimizes the costs of the energy consumed by IT and cooling equipment, depending on the fluctuations of electricity prices and the variability of the data centers' PUEs. The only work we are aware of that addressed the multi-data center embedding problem is that of Xin et al. [10]. They proposed an algorithm that uses minimum k-cut to split a request into partitions before assigning them to different locations. However, this work has only aimed at load balancing the workload through request partitioning, without considering other objectives like revenue maximization, energy efficiency and green IT. In addition, it does not consider constraints on VM placement. Unfortunately, previous works do not take advantage of the variability of electricity prices between different locations and also ignore environmental considerations.

In summary, our work is different from traditional virtual network and VDC embedding proposals since it considers resource allocation for VDCs across the whole infrastructure, including the data centers and the backbone network connecting them. It also differs from service placement works since we provision all types of resources, including computing, storage and notably networking (i.e., bandwidth and delay). Indeed, the main limitation of the aforementioned service placement proposals is that they ignore the communication patterns and the data exchanged between the VMs. This makes such approaches not applicable for embedding VDCs, since we also need to consider bandwidth and delay requirements between the VDC components. To the best of our knowledge, this work is the first effort to address VDC embedding across a distributed infrastructure while considering the usage of the backbone network, energy efficiency as well as environmental considerations.

3 SYSTEM ARCHITECTURE

Fig. 2: VDC embedding across multiple data centers

In this work, we consider a distributed infrastructure consisting of multiple data centers located in different regions and interconnected through a backbone network (see Fig. 2). The entire infrastructure (including the backbone network) is assumed to be owned and managed by the same infrastructure provider. Each data center may operate on on-site renewable energy (e.g., wind, solar) and resorts to the electricity grid only when its on-site renewable energy becomes insufficient. While renewable energy has no carbon footprint, renewables are not always available as they depend on the data center location, the time of the day and external weather conditions. Unfortunately, energy from the grid is usually produced by burning coal, oil and gas, generating high levels of carbon emissions. The generated carbon depends on the source of power used by the electric grid supplier, which could be a renewable source, a conventional one, or a mix of both. Hence, whenever electricity is drawn from the grid, the cloud provider has to pay a penalty proportional to the generated carbon emission. It is also worth noting that grid electricity prices differ between regions and even vary over time in countries with deregulated electricity markets.

Naturally, the cloud provider will make use of its distributed infrastructure with the objective of maximizing its revenue while minimizing energy costs and carbon footprint. This is where our proposed management framework, Greenhead, comes into play. As shown in Fig. 2, a SP sends the VDC request specifications to the InP, which has the responsibility of allocating the required resources. Greenhead is composed of two types of management entities: (1) a central controller that manages the entire infrastructure, and (2) a local controller deployed at each data center to manage the data center's internal resources. The central management entity includes five components, as depicted in Fig. 2:

• The Partitioning Module is responsible for splitting a VDC request into partitions such that the inter-partition bandwidth is minimized. Each partition is supposed to be entirely embedded into a single data center. The aim of this module is to reduce the number of virtual links provisioned between data centers. The motivation behind such partitioning will be explained in Section 5.

• The Partition Allocation Module is responsible for assigning partitions to data centers based on runtime statistics collected by the Monitoring Module. It ensures that all partitions are embedded while achieving cost effectiveness, energy efficiency and green IT objectives such as reducing energy costs from the power grid and maximizing the use of renewable sources of energy.

• The Inter-data center Virtual Link Allocation Module allocates virtual links in the backbone network. Those virtual links connect VMs that have been assigned to different data centers.

• The VDC Information Base contains all information about the embedded VDCs, including their partitions and their mapping onto either the data centers or the backbone network.

• The Monitoring Module is responsible for gathering different statistics from the data centers. The collected information includes the PUE, resource utilization, outdoor temperature, electricity price and the amount of available renewable energy.

Regarding the local controller deployed at each data center, its main role is to manage the resources within the data center itself. Specifically, it allocates resources for a partition of a VDC as requested by the central controller. It is worth noting that different resource allocation schemes can be deployed locally at the data centers (e.g., SecondNet [5], Oktopus [2], VDC Planner [4]). If the embedding of a partition is not possible (for example, due to unavailability of resources), the local controller notifies the central controller; subsequently, the Partition Allocation Module will attempt to find another data center able to embed the rejected partition. Furthermore, each local controller has to periodically report statistics including the PUE, resource usage and availability of renewables to the central controller.

4 PROBLEM FORMULATION

In this section, we formally define the VDC embedding problem across multiple data centers as an Integer Linear Program (ILP). Table 1 describes the notations used in our ILP model.

TABLE 1: Table of notations

Notation | Meaning
$PUE_i$ | PUE of data center $i$
$\zeta_i$ | Electricity price in data center $i$
$\eta_i$ | On-site renewable power cost in data center $i$
$N_i$ | Residual renewable power in data center $i$
$C_i$ | Carbon footprint per unit of power from the power grid in data center $i$
$\alpha_i$ | Cost per ton of carbon in data center $i$
$z_{ik}^j$ | Boolean variable indicating whether data center $i$ satisfies the location constraint of VM $k$ of VDC $j$
$x_{ik}^j$ | Boolean variable indicating whether VM $k$ is assigned to data center $i$
$f_{e,e'}$ | Boolean variable indicating whether the physical link $e \in E$ is used to embed the virtual link $e' \in E^j$
$D_i^j$ | Cost of embedding the VDC request $j$ in data center $i$
$P_{i,IT}^j$ | Amount of power consumed only by IT equipment (i.e., servers and switches) in data center $i$
$P_i^j$ | Total power consumed in data center $i$
$\sigma^r$ | Price per unit of resource type $r$
$\sigma^b$ | Price per unit of bandwidth
$c_p$ | Cost per unit of bandwidth in the backbone network

The physical infrastructure is represented by a graph $G(V \cup W, E)$, where $V$ denotes the set of data centers and $W$ the set of nodes of the backbone network. The set of edges $E$ represents the physical links in the backbone network. Each link is characterized by its bandwidth capacity and propagation delay. We assume that time is divided into slots $[1..T]$. The metrics characterizing each data center (e.g., PUE, electricity price) are measured at the beginning of each time slot and are considered constant during the corresponding time slot. For readability, we omit the time reference in all variables defined in the remainder of this section.

A VDC request $j$ is represented by a graph $G^j(V^j, E^j)$. Each vertex $v \in V^j$ corresponds to a VM, characterized by its CPU, memory and disk requirements. Each edge $e' \in E^j$ is a virtual link that connects a pair of VMs; it is characterized by its bandwidth demand $bw(e')$ and its propagation delay $d(e')$. Furthermore, each VDC $j$ has a lifetime $T_j$.

Let $\mathcal{R}$ denote the different types of resources offered by each node (i.e., CPU, memory and disk). We assume the revenue generated by VDC $j$, denoted by $R^j$, to be proportional to the amount of CPU, memory and bandwidth required by its VMs and links. The revenue generated by VDC $j$ can be written as follows:

$$R^j = \sum_{v \in V^j} \sum_{r \in \mathcal{R}} C_j^r(v) \times \sigma^r + \sum_{e' \in E^j} bw(e') \times \sigma^b \qquad (1)$$

where $C_j^r(v)$ is the capacity of VM $v$ belonging to VDC $j$ in terms of resource $r$, and $\sigma^r$ and $\sigma^b$ are the selling prices of a unit of resource type $r$ and a unit of bandwidth, respectively.

We assume that each VM $v \in V^j$ may have a location constraint, i.e., it can only be embedded in a particular set of data centers. To model this constraint, we define the binary parameter $z_{ik}^j$, which is equal to 1 if VM $k$ of VDC $j$ can be embedded in data center $i$, and 0 otherwise.

The problem of embedding a given VDC $j$ across the infrastructure involves two steps:

• First, assign each VM $k \in V^j$ to a data center. To do so, we define the decision variable $x_{ik}^j$, which is equal to 1 if VM $k$ of VDC $j$ is assigned to data center $i$, and 0 otherwise.

• Second, embed every virtual link belonging to $E^j$ either in the backbone network, if it connects two VMs assigned to different data centers, or within the same data center.
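To make the revenue model of equation (1) above concrete, the short Python sketch below computes $R^j$ for a toy VDC request. The data structures and the unit prices are illustrative assumptions introduced here for the example; they are not values from the paper.

```python
# Minimal sketch of the revenue computation in equation (1).
# VM capacities, link demands and unit prices below are illustrative only.

def vdc_revenue(vms, links, sigma_r, sigma_b):
    """vms: {vm: {resource: capacity}}, links: {(u, v): bandwidth in Mbps},
    sigma_r: {resource: price per unit}, sigma_b: price per Mbps."""
    node_revenue = sum(cap * sigma_r[r]
                       for caps in vms.values()
                       for r, cap in caps.items())
    link_revenue = sum(bw * sigma_b for bw in links.values())
    return node_revenue + link_revenue

# Toy VDC with two VMs and one virtual link.
vms = {"vm1": {"cpu": 2, "memory": 4, "disk": 50},
       "vm2": {"cpu": 4, "memory": 8, "disk": 100}}
links = {("vm1", "vm2"): 20}                              # 20 Mbps virtual link
sigma_r = {"cpu": 0.05, "memory": 0.01, "disk": 0.001}    # $ per unit per slot (assumed)
sigma_b = 0.002                                           # $ per Mbps per slot (assumed)

print(vdc_revenue(vms, links, sigma_r, sigma_b))          # revenue R^j of this request
```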

Hence, the ultimate objective of the InP when embedding a VDC request is to maximize its profit, defined as the difference between the revenue (denoted by $R^j$) and the total embedding cost, which consists of the embedding cost in the data centers (denoted by $D^j$) plus the embedding cost in the backbone network (denoted by $P^j$). Our problem can therefore be formulated as an ILP with the following objective function:

$$\text{Maximize } R^j - (D^j + P^j) \qquad (2)$$

subject to the following constraints (3)-(8):

• A VM has to be assigned to a data center that satisfies its location constraints:
$$x_{ik}^j \le z_{ik}^j, \quad \forall i \in V, \forall k \in V^j \qquad (3)$$

• A VM is assigned to one and only one data center:
$$\sum_{i \in V} x_{ik}^j = 1, \quad \forall k \in V^j \qquad (4)$$

• The capacity constraint of the backbone network links should not be exceeded:
$$\sum_{e' \in E^j} f_{e,e'} \times bw(e') \le sbw(e), \quad \forall e \in E \qquad (5)$$

where $sbw(e)$ is the residual bandwidth of the backbone network link $e$.

• The required propagation delay of every virtual link allocated in the backbone should be satisfied:
$$\sum_{e \in E} f_{e,e'} \times d(e) \le d(e'), \quad \forall e' \in E^j \qquad (6)$$

• The flow conservation constraint, which ensures that the physical links selected for a virtual link form a continuous path between the data centers hosting its endpoints:
$$f_{e_1,e'} - f_{e_2,e'} = x^j_{d(e_1)d(e')} - x^j_{s(e_2)s(e')}, \quad \forall e_1, e_2 \in E \text{ with } d(e_1) = s(e_2), \forall e' \in E^j \qquad (7)$$

where $s(e)$ and $d(e)$ denote the source and destination of link $e$, respectively.

• Finally, the central controller should ensure that each data center is able to accommodate the VMs and virtual links assigned to it. To model this constraint, let $G_i^j(V_i^j, E_i^j)$ denote the partition of $G^j$ assigned to data center $i$, where $V_i^j$ and $E_i^j$ are the sets of VMs and virtual links belonging to VDC $j$ and assigned to data center $i$, respectively. They can be written as:

$$V_i^j = \{k \in V^j \mid x_{ik}^j = 1\}, \qquad E_i^j = \{e' \in E^j \mid s(e') \in V_i^j \text{ and } d(e') \in V_i^j\}$$

We then define the function $Embed_i(G_i^j)$, which is equal to 1 if data center $i$ can accommodate $V_i^j$ and $E_i^j$, and 0 otherwise. Hence, to ensure that data center $i$ can host the assigned VMs and links, we should satisfy:

$$x_{ik}^j \le Embed_i(G_i^j), \quad \forall i \in V, \forall k \in V^j \qquad (8)$$

Let us now focus on the expression of the embedding costs $D^j$ and $P^j$ in the data centers and the backbone network, respectively. Recall that these costs are part of the objective function.

- The cost of embedding in the data centers. In this work, we evaluate the request embedding cost in the data centers in terms of energy and carbon footprint costs. To do so, we first evaluate the amount of power required to embed the partition $G_i^j$ in a data center $i$, denoted by $P_i^j$. Let $P_{i,IT}^j$ denote the amount of power consumed only by IT equipment (i.e., servers and switches) in order to accommodate $G_i^j$ (expressed in kilowatt). This amount of power depends mainly on the local allocation scheme, the current mapping and the availability of resources at data center $i$. The power consumed at data center $i$ by IT equipment and other supporting systems (e.g., cooling) to accommodate the partition $G_i^j$ can then be computed as:

$$P_i^j = P_{i,IT}^j \times PUE_i \qquad (9)$$

where $PUE_i$ is the power usage effectiveness of data center $i$. The mix of power used in data center $i$ is given by:

$$P_i^j = P_{i,L}^j + P_{i,D}^j \qquad (10)$$

where $P_{i,L}^j$ and $P_{i,D}^j$ denote, respectively, the on-site consumed renewable power and the amount of power purchased from the grid. Note that the amount of on-site consumed power should not exceed the amount of produced power, which is captured by $P_{i,L}^j \le RN_i$, where $RN_i$ is the amount of residual renewable power in data center $i$ expressed in kilowatt.

Hence, the embedding cost (expressed in dollars) of the partition $G_i^j$ in data center $i$ can be written as:

$$D_i^j = P_{i,L}^j \times \eta_i + P_{i,D}^j \times (\zeta_i + \alpha_i C_i) \qquad (11)$$

where $\eta_i$ is the on-site renewable power cost in data center $i$ expressed in dollars per kilowatt-hour ($/kWh), $\zeta_i$ is the electricity price in data center $i$ ($/kWh), $C_i$ is the carbon footprint per unit of power used from the grid in data center $i$ expressed in tons of carbon per kWh (t/kWh), and $\alpha_i$ is the cost per unit of carbon ($/t). Note that $\eta_i$ includes the upfront investment, maintenance and operational costs. Finally, the total embedding cost of request $j$ in all available data centers can be written as follows:

$$D^j = \sum_{i \in V} D_i^j \qquad (12)$$
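As a concrete illustration of equations (9)-(11), the Python sketch below derives the embedding cost of one partition in one data center from its IT power draw, using renewables before grid power as described in Section 3. All numeric inputs and the one-hour slot assumption are illustrative, not values from the paper.

```python
# Minimal sketch of the data center embedding cost, equations (9)-(11).
# Assumes a one-hour time slot so that kW values map directly to kWh.

def partition_cost(p_it_kw, pue, residual_renewable_kw,
                   eta, zeta, carbon_per_kwh, alpha):
    """Return (total power, renewable used, grid power, cost in $) for one slot."""
    p_total = p_it_kw * pue                        # equation (9): IT power scaled by PUE
    p_renew = min(p_total, residual_renewable_kw)  # on-site renewables used first
    p_grid = p_total - p_renew                     # remainder drawn from the grid, eq. (10)
    cost = p_renew * eta + p_grid * (zeta + alpha * carbon_per_kwh)  # equation (11)
    return p_total, p_renew, p_grid, cost

# Example: 50 kW of IT load, PUE 1.4, 40 kW of residual renewables (assumed values).
print(partition_cost(p_it_kw=50, pue=1.4, residual_renewable_kw=40,
                     eta=0.01, zeta=0.08, carbon_per_kwh=0.0005, alpha=30))
```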

- The cost of embedding in the backbone network. Virtual links between VMs that have been assigned to different data centers should be embedded in the backbone network. Let $P^j$ denote the cost incurred by the InP in order to accommodate those virtual links. We assume that it is proportional to their bandwidth requirements and to the length of the physical paths onto which they are mapped. It is given by:

$$P^j = \sum_{e' \in E^j} \sum_{e \in E} f_{e,e'} \times bw(e') \times c_p \qquad (13)$$

where $c_p$ is the cost incurred by the InP per unit of bandwidth allocated in the backbone network.

The above embedding problem can be seen as a combination of the bin packing problem and the multi-commodity flow problem, which are both known to be NP-hard. In the next section, we present our solution.

5 VDC PARTITIONING AND EMBEDDING

As mentioned earlier, to reduce operational costs and avoid an eventual inter-data center bottleneck, it is highly recommended to reduce the inter-data center traffic [8]. Our motivation stems from three main observations: (i) the cost of the inter-data center network accounts for 15% of the total cost, which is much higher than the cost of the intra-data center network [7]; (ii) wide-area transit bandwidth is more expensive than building and maintaining the internal network of a data center [26]; and (iii) the inter-data center network might become a bottleneck, which would eventually reduce the acceptance ratio of VDC requests. Hence, our solution consists of two stages: (1) VDC partitioning, and (2) partition embedding. It first divides the VDC request into partitions such that the inter-partition bandwidth is minimized, then uses a greedy algorithm that places the obtained partitions in data centers based on location constraints and on embedding costs that account for energy consumption, carbon footprint, electricity prices and the PUEs of the different facilities; once the partitions are placed, it connects them through virtual links across the backbone network. In the following, we present these two stages.

5.1 VDC Partitioning

Before starting the embedding process, the VDC partitioning module splits the VDC request into partitions such that the inter-partition bandwidth is minimized. Minimizing the inter-partition bandwidth aims at reducing the bandwidth usage within the backbone network. Note that, in order to use an ILP solver directly, one should know the embedding costs of all possible partitions of the VDC graph in all data centers; each local controller would then have to provide the central management framework with the embedding cost of every possible partition. This may result in a large computational overhead, not only at the local controllers but also at the central controller, since the number of possible partitions can be significant, especially for large-scale VDC requests.

The VDC partitioning problem reduces to the weighted graph partitioning problem, which is known to be NP-hard [27]. To solve it, we propose to use the Louvain algorithm [28]. We chose the Louvain algorithm because it is a heuristic algorithm that determines automatically the number of partitions, has a low time complexity of O(n log n), and is shown to provide good results [28]. The objective of the Louvain algorithm is to maximize the modularity, which is defined as an index between -1 and 1 that measures the intra-partition density (i.e., the sum of the links' weights inside partitions) compared to the inter-partition density (i.e., the sum of the weights of the links between partitions). Generally, graphs with high modularity have dense connections (i.e., a high sum of weights) between the nodes within partitions, but sparse connections across partitions.

In a nutshell, the original Louvain algorithm proceeds as follows. Initially, every node is considered as a partition. The algorithm then considers each node and tries to move it into the same partition as one of its neighbors. The neighboring node is chosen such that the gain in modularity is maximal. Then a new graph is built by considering the partitions found during the first phase as nodes and by collapsing inter-partition links into one link (the weight of the new link is equal to the sum of the original links' weights). The same process is applied recursively to the new graph until no improvement in the modularity is possible. For more details on the original version of the Louvain algorithm, please refer to [28].

However, this algorithm is not directly applicable to the VDC partitioning problem since it does not take into account location constraints. Indeed, two VMs with different location constraints should not be assigned to the same data center, and hence they have to belong to different partitions; the Louvain algorithm may not separate them, which would result in non-feasible solutions. To address this limitation, we modified the Louvain algorithm to take location constraints into account in the partitioning process: a node is never moved from one partition to another whenever the location constraint could be violated. The resulting heuristic, called the Location-Aware Louvain Algorithm (LALA), is described in Algorithm 1. Note that, unlike previous approaches in the literature, where the number of partitions is known in advance [10] or where partitions are built around detected star-shaped structures [29], LALA determines the number of partitions as well as the shape and size of the partitions based on the modularity.

Algorithm 1 Location-Aware Louvain Algorithm (LALA)
1: IN: $G^j(V^j, E^j)$: the VDC request to partition
2: repeat
3:   Put every node of $G^j$ in its own partition
4:   Save the initial modularity
5:   while nodes moved between partitions do
6:     for all $v \in G^j$ do
7:       Find the partition P such that, if we move v from its partition to P:
8:         - we get a maximum modularity increase
9:         - there will not be two nodes with different location constraints in P
10:      if such a partition P exists then
11:        Move v to the partition P
12:      end if
13:    end for
14:  end while
15:  if current modularity > initial modularity then
16:    End ← false
17:    Change $G^j$ to be the graph of partitions
18:  else
19:    End ← true
20:  end if
21: until End
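To make the location-aware variant more concrete, here is a minimal Python sketch of a single LALA-style local-move pass under simplifying assumptions: the VDC is a small symmetric weighted adjacency dictionary, the modularity is recomputed from scratch at every candidate move (the real algorithm uses incremental modularity gains and repeats the pass on a collapsed graph), and the constraint check simply refuses to create a partition holding two different location constraints. All identifiers are illustrative.

```python
# Minimal sketch of one local-move pass in the spirit of LALA (Algorithm 1).

def modularity(graph, partition):
    """Newman modularity of 'partition' over a symmetric weighted graph."""
    m2 = sum(w for u in graph for w in graph[u].values())   # 2*m, each edge counted twice
    if m2 == 0:
        return 0.0
    degree = {u: sum(graph[u].values()) for u in graph}
    q = 0.0
    for u in graph:
        for v, w in graph[u].items():
            if partition[u] == partition[v]:
                q += w - degree[u] * degree[v] / m2
    return q / m2

def lala_pass(graph, location):
    """One pass of greedy, location-constrained modularity maximization.
    location maps a node to its data-center constraint or None."""
    partition = {u: i for i, u in enumerate(graph)}          # start with singleton partitions
    moved = True
    while moved:
        moved = False
        for u in graph:
            best_q, best_c = modularity(graph, partition), partition[u]
            for v in graph[u]:                               # only try neighbors' partitions
                c = partition[v]
                # Constraint: the target partition must not mix two different locations.
                locs = {location[x] for x in graph
                        if partition[x] == c and location[x] is not None}
                if location[u] is not None:
                    locs.add(location[u])
                if len(locs) > 1:
                    continue
                old = partition[u]
                partition[u] = c
                q = modularity(graph, partition)
                partition[u] = old
                if q > best_q:
                    best_q, best_c = q, c
            if best_c != partition[u]:
                partition[u] = best_c
                moved = True
    return partition

# Toy VDC: two location-constrained VMs (different sites) plus two free VMs.
graph = {"a": {"b": 10, "c": 1}, "b": {"a": 10},
         "c": {"a": 1, "d": 8},  "d": {"c": 8}}
location = {"a": "NY", "b": None, "c": "CA", "d": None}
print(lala_pass(graph, location))   # groups {a,b} and {c,d}, never merging "NY" with "CA"
```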

5.2 Partition Embedding

Once the VDC partitioning is completed, the resulting partitions, which are connected through virtual links, can be seen as a multigraph $G^j_M(V^j_M, E^j_M)$, where $V^j_M$ is the set of nodes (partitions) and $E^j_M$ is the set of virtual links connecting them. The next step is to embed this multigraph in the infrastructure, that is, to assign the partitions to the data centers in such a way as to minimize the operational costs as well as the carbon footprint, and to provision virtual links across the backbone network to connect them. To do so, we could use the ILP formulation introduced in Section 4 by replacing the VDC request $G^j$ with its graph of partitions $G^j_M$. However, even though the VDC partitioning process significantly reduces the number of components to be embedded (partitions rather than VMs), the formulated ILP is still NP-hard. Therefore, a solution that is both efficient and scalable is required. Hence, we propose a simple yet efficient heuristic algorithm to solve the problem, described in Algorithm 2.

The idea is to start by assigning the location-constrained partitions first, then select the most cost-effective data centers that satisfy these constraints. For each partition $v \in V^j_M$ to embed, we build the list of data centers able to host it based on the location constraints (lines 6-8). The next step is to select the data center that will host the partition $v$ (lines 10-14). To do so, the central management entity queries the Local Controller of each data center $s$ that satisfies the location constraints to get the embedding cost of $v$. The cost is returned by the remote call getCost(s, v), which includes both the power and carbon footprint costs as described in equation (11). The selected data center is the one that incurs the lowest embedding cost and where it is possible to embed the virtual links between $v$ and all previously embedded partitions (denoted by $N(v)$), i.e., the requirements of all these virtual links in terms of bandwidth and delay can be satisfied (achieved when LinksEmbedPossible(s, v) = true). Finally, the links between the partition $v$ and partitions assigned to other data centers are embedded in the backbone network using the shortest path algorithm (lines 19-21). If the whole multigraph is successfully embedded, Algorithm 2 provides the mapping of all the partitions to the data centers as well as the mapping of the virtual links that connect them in the backbone network. The complexity of this algorithm is $O(|V^j_M| \times |V|)$, where $|V^j_M|$ is the number of partitions and $|V|$ is the number of data centers.

Algorithm 2 Greedy VDC Embedding Across Data Centers
1: IN: $G(V \cup W, E)$, $G^j_M(V^j_M, E^j_M)$
2: OUT: Assign each partition in $V^j_M$ to a data center; embed the links between the partitions assigned to different data centers in the backbone network
3: for all $i \in V$ do
4:   ToDC[i] ← {}
5: end for
6: for all $v \in V^j_M$ do
7:   $S_v$ ← {$i \in V$ / $i$ satisfies the location constraint}
8: end for
9: for all $v \in V^j_M$ do
10:   $i$ ← $s \in S_v$ with the smallest cost getCost(s, v) and LinksEmbedPossible(s, v) = true
11:   if no data center is found then
12:     return FAIL
13:   end if
14:   ToDC[i] ← ToDC[i] ∪ {v}
15:   for all $k \in N(v)$ do
16:     if $k \in$ ToDC[i] then
17:       ToDC[i] ← ToDC[i] ∪ {$e_{vk}$}
18:     else
19:       if $\exists l \neq i \in V$ / $k \in$ ToDC[l] then
20:         Embed $e_{vk}$ in $G$ using the shortest path
21:       end if
22:     end if
23:   end for
24: end for
25: return ToDC
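The Python sketch below mirrors the structure of Algorithm 2 under simplifying assumptions: get_cost and links_embeddable are plain callables supplied by the caller, standing in for the remote calls getCost and LinksEmbedPossible to the local controllers, and backbone path embedding is reduced to recording which inter-data-center virtual links need to be mapped. All identifiers are illustrative, not the paper's actual interfaces.

```python
# Minimal sketch of the greedy partition embedding in Algorithm 2.

def greedy_embed(partitions, allowed_dcs, neighbors, get_cost, links_embeddable):
    """partitions: iterable of partition ids, location-constrained ones first.
    allowed_dcs: {partition: set of candidate data centers}.
    neighbors: {partition: set of partitions connected to it by virtual links}."""
    placement = {}                      # partition -> selected data center
    backbone_links = []                 # virtual links to embed in the backbone
    for part in partitions:
        candidates = [dc for dc in allowed_dcs[part]
                      if links_embeddable(dc, part, placement)]
        if not candidates:
            return None                 # FAIL: no feasible data center
        best_dc = min(candidates, key=lambda dc: get_cost(dc, part))
        placement[part] = best_dc
        for other in neighbors[part]:
            if other in placement and placement[other] != best_dc:
                # Inter-data-center virtual link, to be mapped on a shortest path.
                backbone_links.append((part, other))
    return placement, backbone_links

# Toy usage with two data centers and flat, assumed costs.
parts = ["p1", "p2"]
allowed = {"p1": {"NY"}, "p2": {"NY", "CA"}}
neigh = {"p1": {"p2"}, "p2": {"p1"}}
result = greedy_embed(parts, allowed, neigh,
                      get_cost=lambda dc, p: {"NY": 2.0, "CA": 1.0}[dc],
                      links_embeddable=lambda dc, p, placed: True)
print(result)   # ({'p1': 'NY', 'p2': 'CA'}, [('p2', 'p1')])
```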

6 PERFORMANCE EVALUATION

In order to evaluate the performance of Greenhead, we ran extensive simulations using a realistic topology and parameters. In the following, we present the settings of the conducted simulations, the performance metrics that we evaluated, as well as the obtained results.

6.1 Simulation Settings

- Physical infrastructure: We consider a physical infrastructure of 4 data centers situated in four different states: New York, Illinois, California and Texas. The data centers are connected through the NSFNet topology as a backbone network [30]. NSFNet includes 14 nodes located in different cities in the United States. We assume all NSFNet links have the same capacity of 10 Gbps [8]. Each data center is connected to the backbone network through the node closest to its location. As illustrated in Fig. 3, the electricity price, the available renewable energy and the carbon footprint per unit of power drawn from the grid not only depend on the location but are also subject to change over time. We use the electricity prices reported by the US Energy Information Administration (EIA) for the different locations [32], and real solar and wind renewable energy traces collected from different US states [33]; the amounts of power generated during two days are extracted from [33]. In order to evaluate the carbon footprint generated at each data center, we use the values of carbon footprint per unit of power provided in [34]. We considered the on-site renewable power cost to be $\eta_i = 0.01$ $/kWh, $\forall i$ [35]. Finally, in order to evaluate the PUEs of the different data centers, we adopted the technique described in [37].

Fig. 3: Available renewables, electricity price, carbon footprint per unit of power and cost per ton of carbon in the data centers (normalized values over 48 hours): (a) New York, (b) California, (c) Illinois, (d) Texas

- VDC requests: In our simulations, VDCs are generated randomly according to a Poisson process with arrival rate λ. Their lifetime follows an exponential distribution with mean 1/μ. This mimics a real cloud environment where VDCs could be allocated for a particular lapse of time depending on the SP requirements; this is the case for Amazon EC2, for example, where a SP can dynamically create VMs and use them only for a specific duration. The number of VMs per VDC is uniformly distributed between 5 and 10 for small-sized VDCs and between 20 and 100 for large-sized VDCs. Two VMs belonging to the same VDC are directly connected with probability 0.5, with a bandwidth demand uniformly distributed between 10 and 50 Mbps and a delay uniformly distributed between 10 and 100 milliseconds. In addition, a fraction of the VMs in each VDC, denoted by Ploc ∈ [0, 1], is assumed to have location constraints. In our experiments, we simulate two working days (i.e., 48 hours).

- The simulator: We developed a C++ discrete event simulator for the central and local controllers, consisting of about 3000 lines of code. The exchange between the central controller and the local controllers is implemented using remote procedure calls. The results are obtained over many simulation instances for each scenario, and we then calculate the average value of each performance metric, with a margin of error less than 5%. We do not plot confidence intervals for the sake of presentation.

- The baseline approach: Since previous proposals on virtual network embedding and VDC embedding are not directly applicable to the studied scenario (see Section 2), we developed a baseline embedding algorithm that does not consider VDC partitioning. The baseline algorithm maps a VDC onto the physical infrastructure by embedding its VMs and links one by one; that is, it considers each single VM as a partition and then applies the Greenhead embedding algorithm.
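As a sketch of how such a workload could be generated, the Python snippet below draws VDC requests with Poisson arrivals, exponential lifetimes, and the size, connectivity, bandwidth and delay distributions listed above. It is an illustrative reconstruction of the setup for small-sized VDCs, not the authors' simulator code.

```python
# Illustrative generator for the VDC workload described above:
# Poisson arrivals, exponential lifetimes, 5-10 VMs per (small) VDC,
# links with probability 0.5, 10-50 Mbps bandwidth, 10-100 ms delay.
import random

def generate_requests(rate_per_hour=8, mean_lifetime_h=6, horizon_h=48,
                      p_link=0.5, p_loc=0.15, seed=0):
    rng = random.Random(seed)
    t, requests = 0.0, []
    while True:
        t += rng.expovariate(rate_per_hour)          # Poisson arrival process
        if t > horizon_h:
            break
        n_vms = rng.randint(5, 10)                   # small-sized VDCs
        vms = [{"id": k, "constrained": rng.random() < p_loc} for k in range(n_vms)]
        links = [{"ends": (u, v),
                  "bw_mbps": rng.uniform(10, 50),
                  "delay_ms": rng.uniform(10, 100)}
                 for u in range(n_vms) for v in range(u + 1, n_vms)
                 if rng.random() < p_link]
        requests.append({"arrival_h": t,
                         "lifetime_h": rng.expovariate(1.0 / mean_lifetime_h),
                         "vms": vms, "links": links})
    return requests

print(len(generate_requests()))   # roughly rate * horizon requests over 48 hours
```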

- Performance metrics: In order to compare our approach to the baseline, we evaluate several performance metrics, including the acceptance ratio, the revenue, the energy costs, the carbon footprint and the backbone network utilization. Similarly to previous works [4], [11]-[13], [16], [31], [36], the acceptance ratio is defined as the ratio of the number of embedded VDCs to the total number of received VDCs (i.e., including embedded and rejected VDCs). It is given by:

$$A_t = \frac{U_t}{N_t} \qquad (14)$$

where $U_t$ and $N_t$ are the number of VDC requests that have been embedded and the total number of received VDCs up to time $t$, respectively. The instantaneous revenue at a particular time $t$ is given by:

$$R(t) = \sum_{j \in Q(t)} R^j \qquad (15)$$

where $Q(t)$ is the set of VDC requests embedded in the infrastructure at time $t$ and $R^j$ is as defined in (1). The cumulative revenue up to time $t$, denoted by $CR(t)$, can then be written as:

$$CR(t) = \int_0^t R(x)\,dx \qquad (16)$$

The instantaneous cost at time $t$, which comprises the embedding costs in the data centers and in the backbone network, is given by:

$$C(t) = \sum_{j \in Q(t)} \left( D_t^j + P^j \right) \qquad (17)$$

where $D_t^j$ is defined as in (12); note that we add the time slot as a subscript to $D_t^j$ since we are considering the variations between different time slots. The cumulative cost up to time $t$ can be written as:

$$CC(t) = \int_0^t C(x)\,dx \qquad (18)$$

Naturally, the instantaneous and cumulative profits are given by the difference between the instantaneous revenue and cost, and between the cumulative revenue and cost, respectively. Finally, we define the cumulative objective function at time $t$ as the sum of the objective function values associated with the VDCs embedded at that time. It can be written as:

$$B(t) = \sum_{j \in Q(t)} \left( R^j - (D^j + P^j) \right) \qquad (19)$$

where $R^j - (D^j + P^j)$ is the objective function value of embedding VDC $j$, as defined in equation (2).

6.2 Simulation Results

Through extensive experiments, we first show the effectiveness of our framework in terms of time complexity, acceptance ratio, instantaneous revenue and backbone network utilization. We then study the utilization of the available renewable energy in the different data centers. Finally, we investigate the carbon footprint and discuss how to spur the development of a green infrastructure.

1) Greenhead provides a near-optimal solution within a reasonable time frame. In our first set of simulations, we compare Greenhead to an optimal solution provided by an ILP solver, as well as to the baseline, in terms of computation time and solution quality (i.e., cumulative objective function). To compute the optimal solution, we developed a C++ implementation of the branch-and-bound algorithm. The experiments were conducted on a machine with a 3.4 GHz dual-core processor and 4.00 GB of RAM running Linux Ubuntu. In these experiments, we fixed the arrival rate λ to 8 requests per hour. Table 2 reports the average computation time needed to partition and embed a VDC request. The results show that Greenhead takes a very short time to partition and embed a VDC request (less than one millisecond for small-sized VDCs and up to 31 milliseconds for larger VDCs). The baseline needs the least computation time since no partitioning is performed, whereas the ILP solver takes more than 13 seconds even for small-sized VDCs.

TABLE 2: Computation time for Greenhead, the baseline and the ILP solver (in milliseconds)

VDC size | Greenhead: Partitioning | Greenhead: Embedding | Greenhead: Total | Baseline | ILP Solver
5-10 VMs | 0.061 | 0.214 | 0.275 | 0.079 | 13540
20-100 VMs | 0.28 | 31.41 | 31.69 | 2.2 | -

Fig. 4 compares the cumulative objective function (equation (19)) of the aforementioned algorithms for small-sized VDC requests consisting of 5-10 fully connected VMs. We can observe that the mean values obtained for Greenhead are very close to, or even overlap with, the values obtained with the ILP solver. This means that Greenhead provides a solution close to the optimal one. We can also see that Greenhead improves the cumulative objective function value by up to 25% compared to the baseline. Note that the results for the optimal solution with large-sized VDCs are not reported, since the solver was not able to find the optimal solution due to memory outage.

Fig. 4: Cumulative objective function obtained with Greenhead, the baseline and the ILP solver

2) Improve backbone network utilization, acceptance ratio and revenue. In the second set of experiments, we compare Greenhead to the baseline approach in terms of acceptance ratio, revenue and backbone network utilization. We first fixed the arrival rate λ to 8 requests per hour, the average lifetime 1/μ to 6 hours and the fraction of location-constrained VMs Ploc to 0.15. Results are illustrated in Fig. 5.

Fig. 5: Greenhead vs. the baseline: (a) acceptance ratio, (b) instantaneous profit, (c) average link utilization in the backbone network, (d) used bandwidth per request in the backbone network (λ = 8 requests/hour, 1/μ = 6 hours, Ploc = 0.15, duration = 48 hours)
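For clarity, the sketch below shows one way to accumulate, slot by slot, the quantities compared in the following figures (acceptance ratio, cumulative revenue and cost, and the cumulative objective function of equation (19)), approximating the integrals in (16) and (18) with a rectangle rule. It is an illustrative reconstruction, not the simulator's actual bookkeeping code.

```python
# Illustrative accumulation of the evaluation metrics (14)-(19) over time slots.
# 'active' would be maintained by the simulator's arrival/departure events.

def update_metrics(stats, active, embedded_total, received_total, slot_hours=1.0):
    """stats: running totals; active: embedded VDCs currently running,
    each with per-slot 'revenue' and 'cost'."""
    acceptance = embedded_total / received_total if received_total else 0.0   # eq. (14)
    revenue = sum(v["revenue"] for v in active)                               # eq. (15)
    cost = sum(v["cost"] for v in active)                                     # eq. (17)
    stats["cumulative_revenue"] += revenue * slot_hours                       # eq. (16)
    stats["cumulative_cost"] += cost * slot_hours                             # eq. (18)
    stats["cumulative_objective"] += (revenue - cost) * slot_hours            # eq. (19)-style total
    stats["acceptance"] = acceptance
    return stats

stats = {"cumulative_revenue": 0.0, "cumulative_cost": 0.0,
         "cumulative_objective": 0.0, "acceptance": 0.0}
active = [{"revenue": 12.0, "cost": 7.5}, {"revenue": 20.0, "cost": 11.0}]
print(update_metrics(stats, active, embedded_total=3, received_total=4))
```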

From Fig. 5, we can see that Greenhead achieves, on average, a 40% higher acceptance ratio than the baseline (Fig. 5(a)) and up to 100% more instantaneous profit (Fig. 5(b)). Although both schemes lead to almost the same utilization of the backbone network on average (Fig. 5(c)), they differ in the fact that Greenhead avoids embedding virtual links with high bandwidth demand in the backbone network, thanks to the partitioning algorithm. Hence, it ensures that the embedded requests consume as little bandwidth as possible inside the backbone network. This is confirmed by Fig. 5(d), which compares the average used bandwidth per request inside the backbone network for both schemes. It is clear that requests embedded by Greenhead use, on average, 40% less bandwidth in the backbone network than the baseline algorithm.

Figs. 6 and 7 show the performance results when varying the arrival rate λ and Ploc, respectively. From Fig. 6, we can notice that, as the arrival rate increases, the acceptance ratio goes down, since there is no room to accept all the incoming requests. It is also clear from this figure that the acceptance ratio as well as the revenue are always higher for Greenhead than for the baseline. As a result, more requests are embedded, which results in higher revenue.

Fig. 6: Acceptance ratio and revenue for different arrival rates

Fig. 7 shows the impact of the fraction of location-constrained VMs. When the fraction of constrained VMs is between 0 and 1, Greenhead has more freedom to decide on the placement of the non-constrained VMs. Hence, Greenhead is able to optimize the VDC allocation and significantly improve the acceptance ratio and revenue compared to the baseline. However, this benefit is reduced when Ploc = 0, as shown in Fig. 7(c). In fact, when there are no location constraints and the data centers are not overloaded, the VDCs can be hosted in any data center; their placement is only driven by the availability of renewables, the electricity price and the carbon footprint. In this case, any particular VDC is entirely hosted in the same data center, which results in low backbone network utilization, as shown in Fig. 7(c). On the other hand, when Ploc = 1, all the VMs have to be placed as required by the SP, and Greenhead is not able to perform any optimization.

Fig. 7: Impact of the fraction of location-constrained VMs (λ = 8 requests/hour)

3) Maximize renewables' usage. To illustrate how our proposed framework exploits the renewables available in the different data centers, we studied the power consumption across the infrastructure and particularly the usage of renewable energy. In this experiment, we fixed the arrival rate λ to 8 requests per hour, the average lifetime 1/μ to 6 hours and the fraction of location-constrained VMs Ploc to 0.15. Fig. 8 shows the total power consumption across the infrastructure for both Greenhead and the baseline approach. It is clear from this figure that Greenhead consumes much more power than the baseline, simply because it accepts more VDC requests. We can also see that Greenhead uses up to 30% more renewable power than the baseline.

Fig. 8: Power consumption across the infrastructure (λ = 8 requests/hour)

Fig. 9 shows the impact of the fraction of location-constrained VMs on the power consumption across the infrastructure; in this experiment, λ is set to 8 requests/hour. With Ploc = 0, Greenhead uses 100% of the available renewables. However, when Ploc gets higher, up to 15% of the available renewables are not used. This is due to the fact that the VMs with location constraints can only be embedded in specific data centers, which may not have available renewables. Consequently, as the fraction of constrained nodes increases, more power is drawn from the grid.

Fig. 9: Utilization of the renewables in all data centers for different fractions of location-constrained nodes for Greenhead: (a) Ploc = 0.0, (b) Ploc = 0.1, (c) Ploc = 0.6, (d) Ploc = 1 (λ = 8 requests/hour)

4) Reduce energy consumption and carbon footprint per request. Fig. 10 compares the results obtained with both schemes for all the studied metrics. We can observe that Greenhead improves the acceptance ratio by up to 40%, which translates into up to 48% more profit. In addition, Greenhead uses up to 15% more renewables and reduces the consumed power per request by 15% compared to the baseline approach, while generating the same amount of carbon footprint per request as the baseline.

Fig. 10: Comparison of the average values of the different metrics (acceptance ratio, cumulative profit, backbone network utilization, consumed power per request, utilization of renewables, carbon footprint per request)

5) Green infrastructure is possible through tuning. Fig. 11 shows the impact of varying the cost per unit of carbon ($\alpha_i = \alpha, \forall i \in V$) on the carbon footprint of the whole infrastructure as well as on the total power cost. From this figure, we can see that a tradeoff between the carbon footprint and the power cost can be achieved: an InP can set a carbon footprint target to reach by choosing the corresponding value of α, at the expense of the power cost. For instance, one can reduce the carbon footprint by 12% while increasing the power cost by only 32% by setting α to 80 $/t.

Fig. 11: The carbon footprint (normalized values) of the whole infrastructure with variable cost per ton of carbon

To explain the power cost increase when reducing the carbon footprint, let us explore Fig. 12, which presents the power consumption in the different data centers. We can notice that, for small values of α (i.e., α ≤ 160 $/t), Greenhead relies more on the data centers in Illinois and New York. These two data centers have low electricity prices (see Fig. 3) but a high carbon footprint per unit of power (0.0006 t/kWh and 0.0005 t/kWh, respectively). As α increases, Greenhead relies more on the data center in California, since it has the smallest carbon footprint per unit of power (0.0003 t/kWh) but a higher electricity price (on average, 100% higher compared to the New York data center); the carbon footprint thus decreases while more is spent on electricity. It is worth noting that, nowadays, the carbon cost is imposed by governments as a carbon tax of between 25 and 30 $ per ton [38]-[40]. However, such a cost is not enough to force InPs to reduce their carbon footprint. From these results, we can conclude that: (i) to reduce data centers' carbon footprint, governments, and socially-responsible InPs themselves, should consider carbon costs much higher than today's carbon taxes, and (ii) using Greenhead, an InP can reach a target carbon footprint at the expense of a limited increase in the power cost.

C. CFI ’11. C. M. Louati. Guttag. IEEE TRANSACTIONS ON CLOUD COMPUTING 13 Greenhead.” Comput.” Communications Letters. pp. 20. Wen.” SIGMETRICS Perform. 526 –535. and Y. VISA ’10. “Geographical load balancing with renewables. J. A. Taiwan. Q. A. “Virtual network embedding through topology-aware node ranking. and M. Wu. 62–66. X. Ballani. Lu. “SDN Stack for Service Provider Networks. Simon. 233–244. G. G. M. pp. june 2012. P. pp. pp. Costa. “The future of data center wide-area networking. and J. Boutaba. H. vol. feb. 2012. J. 2012. but has not been fully edited.” in IEEE 32nd International Conference on Distributed Computing Systems (ICDCS). Wu. Rabbani. a holistic resource management framework for embedding VDCs across a geographically distributed infrastructure. Ben Ameur. Curtis. R. and J. Zhani. Chowdhury. N. Aitsaadi. pp. W. reducing energy costs and ensuring the environmental friendliness of the infrastructure.int/ITUT/climatechange/ess/index. june 2011. The results show that Greenhead provides near-optimal solution within a reasonable computational time frame and improves requests’ acceptance ratio and InP revenue by up to 40% while ensuring high usage of renewable energy and minimal footprint per request. no. S. 1. Greenberg. and S.. A. Yang. Baldine. Dec. SIGMETRICS ’11. Bari. services and applications to the cloud. Amin Vahdat. Zhang. Maltz. Qureshi. “http://aws. and R. Commun. Pujolle. Netw. 4. “Data center network virtualization: A survey. Su. 123–134. W. Zhu. Cheng. November 2012. T. Lisandro. F. We conducted extensive simulations for four data centers connected through the NSFNet topology. X. “Polyvine: policybased virtual network embedding across multiple domains. J.html Y. 242–253. G. H. Zhani.. Furthermore. 2011.4 0. “Virtual network provisioning across multiple substrate networks.” in 3rd IEEE International Conference on Smart Grid Communications (IEEE SmartGridComm). Cloud providers take advantage of the worldwide market to deploy geographically distributed infrastructures and enlarge their coverage. Boutaba. Z. Gupta. Zhang. Eval.” in Proceedings of the ACM SIGCOMM conference . 2012. Gao. Y. L. “Dynamic Service Placement in Geographically Distributed Clouds. vol. P. S. Zimmermann. Wang. we proposed Greenhead. their carbon footprint is a rapidly growing fraction of total emissions. Maggs. pp.” SIGCOMM Comput. and A. M. P. D. Keshav. 41. “Greening geographical load balancing. R. Rowstron. 3. Q. pp.” in Proceedings of the 6th International Conference on Future Internet Technologies. Lin. 16. Low. multiple data centers consume massive amounts of power. no. I. F. 39. a socially-responsible InP should consider higher carbon costs. 38–47. J. Chowdhury.” in Proceedings of the ACM SIGCOMM 2011. and P. Dec. ITU.” in Proceedings of the second ACM SIGCOMM workshop on Virtualized infrastructure systems and architectures. Duelli. vol. Wong. 1–6. D. 1. no. Granvilleand. Heermann. Apr. and L. 2010. Mandal. Chase. Kong. “Toolkit on Environmental Sustainability for the ICT Sector (ESS).” in IEEE 5th International Conference on Cloud Computing (CLOUD). “VNEAC: Virtual Network Embedding Algorithm Based on Ant Colony Meta-heuristic.” in 2012 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS).2 1 Propotion 0.” H. Rev. Rev. Q. S. “Energy Efficient Geographical Load Balancing via Dynamic Deferral of Workload. “Embedding virtual topologies in networked clouds. A. Schlosser. Su. In this paper. 39.” in Proceedings IFIP/IEEE Integrated Network Management Symposium (IM 2013). no. 
pp. M. Cheng. 756 – 759. 127–132. Yang. 2011. Hamilton. 68–73. 1. Mar. Boutaba. Y. and R. Zhang. Commun. Wierman. M. pp.” IEEE/ACM Transactions on Networking. R.. J. Guo. X. Botero. 39. Houidi.” 2010. Luo. Luo. 2011. R. and H. ser. F.” 2012. vol. G. The goal of Greenhead is to find the best trade-off between maximizing revenue. “SecondNet: a data center network virtualization architecture with bandwidth guarantees. Weber. and R. “Energy-aware virtual network embedding through consolidation..” SIGCOMM Comput. I.” SIGCOMM Comput. Wang. Z. Sugihara. Karagiannis. 12: The power from the grid (normalized values) used in different data centers with variable cost per ton of carbon α [11] 7 C ONCLUSIONS [12] The last few years witnessed a massive migration of businesses. Zhang. Podlesny. and A. Zhang. 4. vol. F.8 0. Rev. F. “Cutting the electric bill for internet-scale systems. 2011. Boutaba. no.This article has been accepted for publication in a future issue of this journal. He. A.. Fischer. “Vineyard: Virtual network embedding algorithms with coordinated node and link mapping. Wang. Wang. F. “SociallyResponsible Load Scheduling Algorithms for Sustainable Data Centers over Smart Grid. H. and R. Adnan. Zhang. M. 1011–1023.com/ec2/. 2008. 2. pp. pp. [Online]. [23] [24] M. 2010. Co-NEXT. Md. pp. [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] R EFERENCES [1] [2] [3] Amazon Elastic Compute Cloud (Amazon EC2). vol. 26–29. Rahman. M. Deng. Wu.” in Proceedings of the 6th International Conference. 55. vol. “Energy Efficient Virtual Network Embedding. May 2013. Patel. F. Xin. “The cost of a cloud: research problems in data center networks.6 0. pp. Commun. Wang. to force Greenhead to use environmentfriendly data centers to reduce the carbon footprint. no. The partitions and the virtual links connecting them are then dynamically assigned to the infrastructure data centers and backbone network in order to achieve the desired objectives. Y. X. A. “VDC Planner: Dynamic Migration-Aware Virtual Data Center Embedding for Clouds. Esteves. Sun. Balakrishnan. 2011. However. Hesselbach. 1–12. J. and B.2 0 −200 0 200 400 600 Cost per ton of carbon ($/t) 800 New York California Illinois Texas [4] [5] [6] [7] [8] [9] [10] Fig. Aug. W. Zeghlache. D. X. and D. R. Available: http://www.4 1. Boutaba. 2009. Andrew.” Open Networking Summit. The key idea of the proposed solution is to conquer the complexity of the problem by partitioning the VDC request based on the bandwidth requirements between the VMs. Samuel. pp. Rev.” in IEEE International Conference on Communications (ICC). M. Content may change prior to final publication. pp. ser. “It’s not easy being green. Z. Z. may 2012. and D. and H. and R. de Meer.” 2012. A.. 206 –219. Fajjari. 5. Forrester Research. ser. 49–56. pp. IEEE. Liu. ——. Y. IEEE.amazon. June 2012.itu. no. Zhani. S. C.” in Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems. H. even by artificially increasing these costs. 188 –195. Q. Yumerefendi. “Towards predictable datacenter networks. M. B. ser. I.

and matching. France. Technical report. “A quantitative analysis of cooling power in container-based data centers. ence from Telecom ParisTech. University. Maltz. and de Cachan (ENS Cachan). Ghazar and N.” Report the University of Paris IX and Paris XI on prepared for the European Climate Foundation and Green Budget.This article has been accepted for publication in a future issue of this journal.D. He is currently a May 2012.Sc. in [30] The National Science Foundation Network (NSFNET). Schaeffer. Lambiotte. In 2007 and 2008. Paris. He graduated with honors first Gbit/s network to be tested in 1980. several best paper awards and other recognitions such as the [35] Xiang. . He has received 2012. M.-L. fessor at the LIP6.com. His research in[32] U. X. the IEEE Hal Sobol Award “Eco-Aware Online Power Management and Load Scheduling in 2007. degrees from the National School of Computer Science.carbonfootprint.fr). the potential of d’Etat” degrees in Computer Science from carbon pricing to reduce Europes fiscal deficits. Tunisia in 2003 and 2005. J.fr). terests include network.gov/rredc/. City of Boulder. Marie Curie .” in Proceedings of Waterloo. Telecommunication Systems. Zhang. nov.”2010) and on the editorial boards of other journals. J. 27 – 64. Guy Puand 2011. J. “Fast unfolding of communities in large networks. Nguyen. Colorado. A. He is a fellow of the IEEE. and the Joe LociCero for Green Cloud Datacenters. Korea. and P.ly/L7i3td Professor at Pierre et Marie Curie University [39] Climate Action Plan Tax. Jaluria. Oct. no. 1975 and 1978 respectively. E. R. support. Hamilton. France. Wu and Junfeng. Sun Yat-sen and the Dan Stokesbury awards in 2009. He is currently a jolle is the French representative at the Technical Committee on Ph. N. [Online]. France.D. “http://www.ucopia. and Green Communications (www. he was with the P. Patel. Canada. Qouneh. University Pierre and Marie Curie. Waterloo.” in SIGCOMM. a member of the Insti[40] “British columbia carbon tax. He is the founding editor in chief of “http://www. Kim. and E.” February 2008. degree in network and computer sciStorage and Analysis (SC). Networking. as a Postthe ACM SIGCOMM conference on Data communication. Guy Pujolle received the Ph. He is an editor for ACM International Journal Marie Curie. New York. May 207. Rodriguez. NY. the IEEE Transactions on Network and Service Management (2007[34] Carbon Footprint Calculator. pp. respectively. [36] Phil Heptonstall. 2007. His research interests include mobility SIGCOMM ’09. resource management in cloud computing environments. and protocols for comRami Langar is currently an Associate Proputer communication. [Online]. EtherTrust (www. [37] A. Li. Since then. Guy Pujolle is a pionetworking in wireless networks. [Online]. 1. April 2013. Virtuor (www. the Fred W. vehicular ad-hoc and [27] S. and resource management in wireless mesh.D. Jain. [26] A. Sweden.fr).” vice management in wired and wireless net[33] The Renewable Resource Data Center (RReDC) website.” 2012. Lahiri. pp. several important patents like DPI or virtual networks.” in Interand Marie Curie . He was participating in from the ESI and ENS Cachan. His research interests include virtualization. candidate at the University Pierre and Networking at IFIP. technologies. June . and Editor terests include energy efficient and green in Chief of Annals of Telecommunications.nsf. Sengupta.” Journal of Statistical Mechanics: Theory and Experiment. Li. Ucopia Communications (www. C. 
network performance evaluation and software defined networking. Premiers Research Excellence Award. 51–62. Ph.” in Proceedings of the IEEE Global Communications Ph. Paris. Shen and Jian. 1–12. He was also ´ Ecole Nationale Superieure dInformatique Professor and Head of the MASI Laboratory at Pierre et Marie ´ (ESI). degree in network and comT. Deng and Di.ly/XyGk32 at POSTECH. pp.ly/JLUurv of The Royal Physiographical Academy of Lund. [29] T. 2011.D. Ellersick Prize in 2008. Algeria and Ecole Normale Superieure Curie University (1981-1993). and a member Available: http://bit. resource and ser“http://www. performance evaluation and quality-of-service vol. and S. femtocell networks. 1.S.VirtuOR. C.” CS Department.S. he has been a postdoctoral research fellow at the University of Waterloo. tut Universitaire de France.Sc. Guy Pujolle is co-founder of QoSMOS (www. ser. 1990 and 1994.” professor of computer science at the Univer[31] N. He. He is currently a “http://www. 2008. Lefebvre.eia. Sirivianos. works. Paris.S.Paris 6. and ”Thse [38] “Carbon taxation and fiscal consolidation. Samaan.of Network Management. S. in 2006. ON.greenMohamed Faten Zhani received engineer. ing and M. R.” 2011. He spent the period 1994Ahmed Amokrane received the Engineer 2000 as Professor and Head of the comand M.communications. Kandula. Greenberg.gov. and ceived the M. Content may change prior to final publication. datacenters neer in high-speed networking having led the development of the and the cloud.” UK Energy Research Centre Working Paper.gov. Canada in 2011. Meng. and the national Conference for High Performance Computing. D. engineering at POSTECH. Laoutaris. He received his Ph. in Computer science from the University of Quebec in Montreal.ethertrust. His research in. and T. Le. Professor at ENST (1979-1981).com). D. USA. France. 211–222. respectively. [28] V. He re[25] K. professor at the division of IT convergence 2011. “Graph clustering. Available: http://bit. Energy Information Administration.” Computer Science Review. Bianchini. IEEE TRANSACTIONS ON CLOUD COMPUTING 14 on Applications. University of Pierre and 2012. Y. but has not been fully edited. respectively. “A Review of Electricity Unit Cost Estimates. J. Doctoral Research Fellow. degrees in computer science from puter science department of Versailles University. “VL2: a School of Computer Science.nrel.D. “Hierarchical approach for efficient virtual network embedding based on exact subgraph Raouf Boutaba received the M. University of scalable and flexible data center network. architectures. P. Blondel. Guillaume. degrees in computer science from the Conference (GLOBECOM). R. Yang. 2011. ser. pp. 74–85. 2009. Available: http://bit. pp. SIGCOMM’12. “Reducing electricity cost through virtual machine puter science from the University of Pierre placement in high performance computing clouds.com). a distinguished invited professor 2010.Paris 6 in 2002. Paris.qosmos. in 2010 a member of the scientific staff of INRIA (1974-1979).Paris 6. “Intersity of Waterloo and a distinguished visiting datacenter bulk transfers with Netstitcher.

Sign up to vote on this title
UsefulNot useful

Master Your Semester with Scribd & The New York Times

Special offer: Get 4 months of Scribd and The New York Times for just $1.87 per week!

Master Your Semester with a Special Offer from Scribd & The New York Times