
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TNSM.2017.2704427, IEEE Transactions on Network and Service Management

An Efficient Survivable Design with Bandwidth Guarantees for Multi-tenant Cloud Networks

Hyame Assem Alameddine, Sara Ayoubi, Chadi Assi, Senior Member, IEEE
{hy_alame, sa_ayou, assi}@encs.concordia.ca

Abstract—In cloud data centers (DCs), where hosted applications share the underlying network resources, network bandwidth guarantees have been shown to improve the predictability of application performance and cost. However, recent empirical studies have also shown that DC devices and links are often not all that reliable, and that failures may cause service outages, rendering significant revenue loss for the affected tenants as well as the cloud operator. Accordingly, cloud operators are pressed to offer both reliable and predictable performance for the hosted applications. While much work has been done on solving both problems separately, this paper seeks to develop a joint framework by which cloud operators can offer both performance and availability guarantees for the hosted tenants. In particular, this paper considers a simple model to abstract the bandwidth guarantee requirements of the tenant and presents a protection plan design which consists of backup virtual machine (VM) placement and bandwidth provisioning to optimize the internal data center traffic. We show through solid motivational examples that finding the optimal protection plan design is highly perplexing and encompasses several constituent challenges. Owing to its complexity, we decompose it into two subproblems and solve them separately. First, we solve a placement subproblem for the minimum number of backup VMs; then, we explore the most efficient correspondence between backup and primary VMs (i.e., the protection plan) which minimizes the bandwidth redundancy. Further, we study various facets of the design of such a plan by exploiting bandwidth sharing opportunities in multi-tenant cloud networks.

Index Terms—Bandwidth guarantees, cloud networks, network resilience, optimization, survivability, virtualization.

This work is supported by an NSERC Discovery Grant. An earlier version of this paper was presented at IEEE DRCN 2016 [1].

I. INTRODUCTION

Today, Internet applications with different functional and network requirements (such as online banking and audio/video streaming) are being increasingly deployed in the cloud and share the same physical infrastructure [2]. The sharing of computing (VMs, CPU, memory, etc.) and networking resources (bandwidth) among these applications, without interference in their functionalities, is made possible by Network Virtualization (NV). NV enables the virtualization of multiple network components (servers, links, switches, etc.) to form a single, fully isolated entity known as a Virtual Network [3].

However, contention for non-virtualized resources (such as network bandwidth) creates several limitations and challenges for cloud providers, such as the provisioning of guaranteed performance to the hosted applications [4]. The best-effort sharing scheme provided by Internet protocols (i.e., the Transmission Control Protocol (TCP)) used in the cloud allows network traffic from competing applications to interfere with each other, causing unpredictable application performance [5]. Such variable performance is a major problem for many cloud clients (tenants) running critical services and expecting comparable performance of their applications at any time, independently of the network workload [4]. This results in significant revenue losses for cloud providers and their clients. To this end, much work has been done to guarantee bandwidth [6], [7], [8], [9], [10], [11] through virtualizing client bandwidth demands. Such virtualization is provided by the design of a simple and intuitive interface, known as an abstraction model, allowing cloud tenants to express their computing and network requirements to the cloud provider.

Another challenge facing cloud operators today is providing high availability [12], [13], [14], [15], [16], [17] for the hosted applications. For instance, a survey of over 200 companies in North America disclosed that more than $26.5 billion in revenue is lost each year due to outages [18]. Another survey [19] found that one in 10 companies requires more than 99.999% availability, since it cannot tolerate any service outage. Thus, cloud operators seek to provide high service availability for their clients, either through the provisioning of redundant resources (backup VMs and bandwidth) [12], [13], [14], [20], or by using a Worst Case Survival (WCS) scheme [9], [21] through a proper placement of the tenant's VMs.

Concurrently, providing bandwidth guarantees and high survivability is crucial for the reputation and popularity of cloud providers. However, only a few works in the literature [21], [22], [1] addressed both problems jointly (bandwidth guarantees and survivability), either by focusing on the trade-off between providing fault tolerance based on WCS and reducing the bandwidth to guarantee [21], or by relocating traffic to a different DC upon failure to reduce resource capacity requirements [22]. Unlike prior works, we emphasize in this paper the interdependence that exists between providing high service survivability and reducing the backup bandwidth footprint. To ensure high survivability under a single failure¹ while guaranteeing the same network performance before and after any service disruption, redundant resources (backup VMs, backup bandwidth) must be provisioned in the form of a protection plan. We show in this work that designing such a plan is a perplexing problem.

We assume that tenants are already hosted in the DC and use a simple abstraction model to specify the bandwidth requirements of their applications.

¹ A failure can affect either a server or an interconnection of servers (i.e., server enclosures) connected to one TOR switch (through a single switch port). For simplicity, in our illustrative examples, we consider a single server failure.

1932-4537 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

We explore through several motivational examples the challenges encountered in designing such a plan. Further, we observe that the backup bandwidth provisioned for each tenant's protection plan remains idle until the occurrence of a failure; thus, we explore several bandwidth sharing opportunities between the protection plans of different tenants. We evaluate and compare our protection plan design against a WCS approach and two baseline algorithms. Our numerical evaluation shows that our protection plan design can decrease the tenants' rejection rate by an average of 80.9% while providing 100% fault tolerance with complete service restoration under any single server failure. By applying our bandwidth sharing opportunities to the provided protection plans, cloud providers can increase their revenues. Our contributions can be summarized as follows:

• We present a protection plan which leverages several insights to determine the amount and the provisioning technique of the additional backup resources needed to guarantee the availability of critical applications.
• We devise a backup bandwidth provisioning policy based on the hose model that consists of guaranteeing the minimum required bandwidth while considering the reuse of primary bandwidth as backup bandwidth.
• We formulate the protection plan design problem as a mixed integer non-linear program able to provide the optimal correspondence between primary and backup VMs. Further, we implement a heuristic to solve it.
• Motivated by insightful illustrative examples, we extend our protection plan design to consider the sharing of backup bandwidth between tenants. To the best of our knowledge, we are the first to explore this problem. Our numerical evaluation shows that this sharing technique can provide an average of 33.96% of bandwidth saving.

The remainder of this paper is organized as follows: Section II provides a literature review on the bandwidth guarantee and high survivability problems in cloud DCs. Section III explains a bandwidth allocation scheme in the cloud. Section IV presents the challenges of designing a protection plan. Section V formally presents our protection plan design problem. Section VI proposes our two-step method for solving it. We discuss switch and power failures in Section VII. Section VIII explores several motivational examples of bandwidth sharing opportunities in the cloud and introduces a heuristic to share bandwidth between tenants. Our numerical evaluation is presented in Section IX. VM synchronization is explored in Section X. We conclude in Section XI.

II. RELATED WORK

Recently, there has been much effort on guaranteeing network performance and providing high service availability for cloud applications. In order to reduce the bandwidth usage in the core of the network, Oktopus [7] developed a VM embedding heuristic that collocates VMs of the same tenant under the smallest sub-tree while guaranteeing bandwidth based on the hose model [7], [6], [5]. The hose model is an abstraction model that allows tenants to express their resource requirements (VMs, bandwidth) to the cloud provider independently of the underlying infrastructure. However, Oktopus overlooked the fact that collocation decreases fault tolerance. The OpenStack Neutron project [23], dedicated to managing networks and IP addresses in the cloud, recently introduced a quality of service (QoS) application programming interface (API) to guarantee bandwidth for the communication of each tenant's VMs. It offers a QoS data model with a variety of policies, rules and bandwidth guarantee approaches (best-effort sharing, bandwidth limit, etc.). Its bandwidth guarantee services focus on reserving bandwidth for each tenant without accounting for any efficient bandwidth utilization techniques such as those we present in this work.

Alternatively, the CloudMirror team [8], [9] proposed the Tenant Application Graph (TAG), which reflects the structure of the tenant's application, to guarantee bandwidth. Such structure is often unknown to the tenant, which makes the TAG model impractical. CloudMirror also considered the WCS requested by the client to provide fault tolerance. WCS, being the smallest number of VMs that should remain functional during the failure of a single sub-tree, causes service degradation in case of failure. Bodik et al. [21] also employed the WCS as a measure of fault tolerance. They proposed the K-way cut algorithm to provide an initial embedding for the VMs while minimizing the bandwidth at the core of the network. They improved this elementary allocation by realizing multiple moves of the embedded VMs in order to achieve fault tolerance. They assumed that a physical server can only host one VM of the same virtual data center (VDC), which extensively spreads the tenant's VMs. In this work, we compare against the WCS approach and show that it significantly decreases the network admission rate and the cloud provider's revenues due to its high bandwidth usage.

The work in [14] provided 1-redundant and k-redundant approaches to support the failure of critical nodes. In order to minimize the bandwidth cost, it implemented bandwidth sharing techniques known as cross-sharing and backup-sharing. Unlike [14], the authors of [24] presented ProRed, a prognostic redesign approach that explores the number of backup VMs needed for a tenant (between 1 and K VMs) while promoting backup resource sharing (backup VMs, backup bandwidth) within the VMs of the same tenant. They consider clustering the primary VMs into sets, each protected by a single backup VM, while promoting cross-sharing and backup-sharing. The work in [25] presented RELIEF, a joint framework of two sub-systems, JENA and ARES. JENA is a sub-system that performs virtual network (VN) embedding to provide just-enough availability guarantees based on the availability of the physical servers in the physical network hosting the virtual one. In contrast, ARES deals with the variable availability demands of the requests by migrating the VN or adding backup nodes. To reduce the idle backup bandwidth, the study in [20] provided two heuristics; the first one solves the virtual node embedding and the link embedding problems separately. It chooses the virtual node embedding solution that minimizes the reserved backup bandwidth. The other heuristic solves both problems jointly by adopting a link packing approach.

The works in [14], [20], [24], [25] guarantee VM-to-VM bandwidth by assuming that a dedicated link exists between each pair of communicating VMs. This assumption is unrealistic because links are shared among multiple VMs. In addition, VM communication dependencies change over time. The pipe model [9], [26] relies on such an assumption.

The work in [27] proposed a congestion-free proactive approach for handling faults against link and switch failures. Their forward fault correction (FFC) approach spreads network traffic to avoid congestion in case of an arbitrary combination of up to k faults, thus reducing the number of network updates needed. However, such updates are required in order to adjust to changing traffic demands and to react to failures; the latter are not considered by the authors. Further, the work in [27] does not provide efficient bandwidth utilization of the network capacity, because a portion of it must always be left vacant in order to handle traffic rescaling.

Our work builds on top of the studies presented in the literature by jointly solving the problems of bandwidth and survivability guarantees, using the hose model [7] to guarantee bandwidth and offering a protection plan design to ensure high survivability. With the purpose of providing efficient resource utilization, we determine the number and the placement of backup VMs to ensure survivability while considering the sharing of backup resources (backup bandwidth, backup VMs). To the best of our knowledge, we are the first to explore a backup bandwidth sharing strategy between multiple tenants.

III. BANDWIDTH PROVISIONING IN CLOUD DCS

To provision bandwidth for tenants, a cloud provider requires the knowledge of their requirements (in terms of VMs and the bandwidth required for their communication). In addition, tenants need a simple and intuitive interface to express these requirements independently of the underlying physical infrastructure of cloud DCs. Such an interface is known as an "abstraction model". In fact, many abstraction models were discussed in the literature. The pipe model [5], [26] guarantees host-to-host connectivity; however, it is not practical because it assumes knowledge of the communication matrix between VMs, which is hard for the tenant to determine or estimate. The hose model [5], [7], [26] guarantees the minimum bandwidth required by each VM. The tenant application graph (TAG) proposed by [8], [9] relies on the tenant's knowledge of the application structure in order to determine the bandwidth to guarantee. Without loss of generality, we use in this work the hose model to guarantee bandwidth between VMs due to its simplicity in expressing the tenant's requirements.

Consider a tenant request <N, B> of N primary VMs and B bandwidth to be guaranteed for the communication between those VMs. The hose model interconnects the N VMs through a central virtual switch, with an aggregate bandwidth of N∗B (Fig. 1(a)). This ensures a maximum communication rate of N∗B between those VMs. Because multiple VMs are likely to communicate at the same time with a single destination that can only receive data at a rate B, the hose model provisions the minimum bandwidth needed by each VM.

Fig. 1: Hose model representation with bandwidth guarantees. (a) Tenant request abstraction (N VMs connected to a virtual switch); (b) bandwidth reservation for a tenant <6, B>: the highlighted link separates component C1 (5 VMs) from C2 (1 VM), so the bandwidth to reserve on it is min(5, 1)∗B = 1B.

For instance, consider a tenant request <6, B> of 6 primary VMs, with B the bandwidth to be guaranteed for the communication between those VMs. Such a request is embedded as shown in Fig. 1(b). We observe that the link of interest (thick line in Fig. 1(b)) divides the network into 2 components: C1 of m = 5 VMs and C2 of N−m = 6−5 = 1 VM. Hence, the bandwidth to be guaranteed on this link, based on the hose model, is min(m, N−m)∗B. We refer to the hose interconnecting the primary VMs as the pre-failure hose.

IV. PROTECTION PLAN DESIGN

Given the failure-prone nature of the cloud infrastructure [28], cloud providers are determined to provide high service availability in their networks to incentivize enterprises to move and keep their services in the cloud. One way to ensure a certain level of service continuity is to devise a protection plan for each tenant's service, where sufficient backup resources are provisioned to restore the service in the presence of failures.

Fig. 2: Trade-off: collocation vs. bandwidth guarantees. (a) All VMs collocated on one server: total bandwidth (primary and backup) to reserve = 0B; (b) one-to-one mapping: total bandwidth (primary and backup) to reserve = 8B + 4B = 12B.

Typically, a protection plan consists of backup VMs, where each backup VM is in charge of protecting one or more primary VMs (the initial VMs running the tenant's service prior to an outage). Further, for each backup VM, sufficient backup bandwidth must be provisioned to resume the communication with the remaining active VMs (the VMs that were not affected by any failure). For example, Fig. 2(b) shows a tenant request of 4 VMs hosted on a single-rooted DC network topology, where each VM is hosted on a different physical server. One way to ensure the availability of this service against a single server failure is to provision a backup VM on a distinct physical server, namely, in this example, on physical server S5. Now, when S1 fails, the failed primary VM will resume its service on S5. Therefore, a protection plan must provision enough network bandwidth for the communication between S2, S3, S4 and S5. Given the finite network resources, it is in the cloud provider's best interest to find the lowest-cost protection plans that consume the minimal amount of network resources while guaranteeing the availability of the hosted services.
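The hose-model reservation rule above (min(m, N−m)∗B on each link) can be sketched in a few lines. This is a minimal illustration for a single-rooted tree; the helper names and the assumption of a fixed number of servers per TOR are ours, not the paper's.

```python
def hose_bw(m, n_total, B):
    """Hose-model bandwidth on a link whose lower subtree hosts m of the tenant's n_total VMs."""
    return min(m, n_total - m) * B

def total_hose_bw(vms_per_server, servers_per_tor, B):
    """Total bandwidth reserved for one tenant on a single-rooted tree:
    one server-to-TOR link per server plus one TOR-to-root link per TOR."""
    n = sum(vms_per_server)
    total = sum(hose_bw(m, n, B) for m in vms_per_server)      # server uplinks
    for i in range(0, len(vms_per_server), servers_per_tor):   # TOR uplinks
        total += hose_bw(sum(vms_per_server[i:i + servers_per_tor]), n, B)
    return total

# Fig. 1(b): the highlighted link separates 5 VMs from 1, so it carries min(5, 1) * B = 1B
link_bw = hose_bw(5, 6, 1)
# Fig. 2(b): 4 VMs, one per server on S1 to S4 (assuming two servers per TOR)
spread = total_hose_bw([1, 1, 1, 1, 0, 0, 0, 0], 2, 1)
```

Under the two-servers-per-TOR assumption, the second call reproduces the 8B primary reservation of Fig. 2(b).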


Fig. 3: Tenant of 6 VMs with the backup VMs needed for 100% availability. (a) Total primary bandwidth to reserve = 8B; (b) total bandwidth (primary and backup) to reserve = 8B + 22B = 30B; (c) total bandwidth (primary and backup) to reserve = 8B + 15B = 23B; (d) total bandwidth (primary and backup) to reserve = 8B + 10B = 18B.

An efficient utilization of network resources will enable the cloud operator to host more tenants, thereby maximizing his/her long-term revenue.

Finding the lowest-cost restoration plan is indeed a perplexing problem with several interplaying challenges; mainly, it consists of identifying the minimum number of backup VMs needed, finding the placement of these backup VMs that minimizes the backup footprint, and finally, determining the primary-to-backup VM correspondence (that is, which backup VM is in charge of protecting which primary VM(s)). In the remainder of this section, we elucidate each of these challenges and highlight their interplaying nature.

A. Identifying the Number of Backup VMs

One factor that impacts the number of backup VMs needed is the primary VM placement. For instance, if the primary VMs are collocated on a single physical server, then a single server failure will bring down all the primary VMs. Hence, the corresponding protection plan needs to provision N backup VMs (placed on a distinct server) to ensure the availability of the tenant's service. On the other hand, if the primary VMs are spread across multiple servers, then the number of backup VMs (placed on a distinct server) needed is reduced to the maximum number of collocated primary VMs on any server.

Clearly, there exists a trade-off between the primary embedding solution and the incurred backup footprint. Indeed, while collocating primary VMs significantly reduces the primary bandwidth needed (since VMs collocated on the same server communicate internally via the top-of-rack switch), it requires a significant number of backup VMs. Alternatively, spreading the VMs, though it increases the primary traffic provisioned, also enhances the service's fault tolerance and subsequently reduces the backup footprint. Fig. 2 depicts the two extremes of admitting a tenant while collocating the VMs on a single server (Fig. 2(a)), versus spreading the VMs in a one-to-one mapping scheme (Fig. 2(b)).

In this work, we assume (unless otherwise stated) that the primary VM placement is performed by collocating the VMs under the smallest sub-tree, as discussed in [7].

B. Finding the Backup VMs Placement

The second challenge in designing the protection plan for a given tenant is deciding where to place the backup VMs. This decision has a reciprocal impact on the number of backup VMs to provision, as well as the incurred backup footprint. To further illustrate this, consider the example presented in Fig. 3(a). Here, we observe the embedding of a tenant with 6 VMs with uniform resource requirements. In order to provide service continuity for this tenant, the network provider needs a protection plan that matches the number of primary VMs, and their network resource requirements, following any single node failure. By inspecting the figure, it is easy to verify that 3 backup VMs are required to assume the failure of S1, 2 in the event of S2's failure, and 1 backup VM is needed following S3's failure. Given that at most a single server will fail at any point in time, 3 backup VMs will guarantee the availability of the tenant's service in the face of any single server failure (as shown in Fig. 3(b)), granted that these backup VMs are not provisioned on any server hosting primary VMs (namely, in this case, S1, S2, and S3).

Alternatively, collocating backup VMs on the servers hosting the primary ones can be advantageous in terms of reducing the backup bandwidth. For instance, Fig. 3(c) shows that by collocating both primary and backup VMs, the total bandwidth (primary and backup) required for the tenant is reduced from 30B (Fig. 3(b)) to 23B (Fig. 3(c)), and further to 18B (as shown in Fig. 3(d)), but at the expense of provisioning more backup VMs (4 VMs vs. 3 VMs in Fig. 3(c)).

Hence, the first two subproblems of designing a protection plan exhibit an intricate interplay, since the number of backups needed is greatly impacted by the primary, as well as the backup, VM placement decisions.

C. Determining the Primary-to-Backup VMs Correspondence

The third challenge is to determine the primary-to-backup VM correspondence, that is, which backup VM is protecting which primary VM(s). This decision plays an essential role in determining the cost of the protection plan in terms of the incurred backup bandwidth. To further illustrate this, consider the examples presented in Fig. 4. Fig. 4(a) shows one embedding of a tenant of 8 primary VMs and 3 backup VMs. The total primary network bandwidth required for the communication between the primary VMs is 14B. Figs. 4(b) and 4(c) show two different primary-to-backup correspondences for the same placement of backup VMs. Fig. 4(b) depicts that when S1 fails, its 2 VMs are migrated to S3 and S6, respectively, and the resultant hose requires a total backup bandwidth of 12B to guarantee the continuity of the tenant's service. However, Fig. 4(c) shows a different correspondence where both failed VMs on S1 are restored by the backup VMs hosted on S6, demanding in total a backup bandwidth of 10B to restore the tenant's service (16% bandwidth saving).


Fig. 4: Different protection plans for the same server failure result in different bandwidth consumption. (a) Total primary bandwidth to reserve when no failure occurs = 2B+B+2B+3B+3B+3B = 14B; (b) total backup bandwidth to reserve = B+B+2B+3B+B+2B+2B = 12B; (c) total backup bandwidth to reserve = B+2B+3B+2B+B+B = 10B.

D. Determining the Backup Bandwidth to Reserve

Determining the backup bandwidth to be provisioned on each substrate link can be achieved by sequentially considering each failure scenario (Fig. 5). Let l denote a link in the network and b̂_l the backup bandwidth that must be provisioned on this link: b̂_l = max_i(b̂_l^i), where b̂_l^i is the backup bandwidth required on link l to assume the failure of the VMs hosted on server S_i. b̂_l^i is determined according to the correspondence between primary and backup VMs. The total bandwidth to reserve on link l becomes b_l = b̄_l + b̂_l, where b̄_l refers to the bandwidth required on link l for the pre-failure hose.

To illustrate the backup bandwidth provisioning process, we consider the primary and backup VM embedding of a tenant <6, B> as shown in Fig. 5(a). This tenant requires 6 primary VMs, hosted on servers S1, S2 and S3. The bandwidth to provision for the communication between those primary VMs is 8B, as shown by the pre-failure hose specified in that same figure. We determine a protection plan for this tenant by provisioning 4 backup VMs (Fig. 5(a)) and we define the primary-to-backup VM correspondence as follows: the backup VMs hosted on servers S2 and S3 protect the primary VMs embedded on S1; the backup VMs hosted on S1 and S3 take care of the primary VMs hosted on S2. Similarly, the primary VM hosted on S3 is protected by the backup VM embedded on S1. Hence, by considering a single node failure of each of the physical servers S1 (Fig. 5(b)), S2 (Fig. 5(c)) and S3 (Fig. 5(d)) and determining the backup bandwidth needed upon each failure, we can determine the backup bandwidth to be reserved on each link as the maximum backup bandwidth over all those provisioned by all the defined post-failure hoses (a post-failure hose is the interconnection between all VMs of the tenant that are operational following a failure). It can be verified that this backup bandwidth, depicted in Fig. 5(e), is sufficient to ensure service continuity upon any single node failure.

E. Bandwidth Reuse

Upon any failure of a physical server affecting a tenant, the primary bandwidth reserved for the communication of the primary VMs (hosted on the failed server) is released, and thus it can be reused by the post-failure hose of the same tenant. Let S1, S2 and S3 be the set of servers hosting the primary VMs (Fig. 5). To determine b̂_l, we consider sequentially the failures of all these servers: S1 (Fig. 5(b)), S2 (Fig. 5(c)) and S3 (Fig. 5(d)). For each failure, there corresponds a post-failure hose, which requires guaranteed bandwidth on each link in the network. Therefore, by considering bandwidth reuse, b̂_l can be calculated as b̂_l = max(0, max_i(b̂_l^i) − b̄_l), where b̄_l, b̂_l and b̂_l^i are defined earlier. Consequently, instead of reserving the sum of primary and backup bandwidths on each link (b_l = b̄_l + max_i(b̂_l^i), Fig. 5(f)), one can provision b_l = b̄_l + max(0, max_i(b̂_l^i) − b̄_l), as shown in Fig. 5(g). By considering such bandwidth reuse, we can save 7B by reserving 11B (Fig. 5(g)) instead of 18B (Fig. 5(f)), a 39% bandwidth saving.

V. SURVIVABLE VIRTUAL MACHINE PLACEMENT WITH BANDWIDTH GUARANTEES (SVMP-BG)

A. Problem definition

Let <N, B> denote a tenant request, where N is the number of VMs requested and B is the bandwidth to be guaranteed for each VM. We assume this tenant is already embedded, and we seek to determine a protection plan to ensure service continuity in the presence of failures. Determining such a plan entails deciding the number of backup VMs, their placement, and the correspondence between primary and backup VMs. Our objective is to find the lowest-cost restoration plan that guarantees full service recovery against facility node failure.

B. Problem formulation

This work targets DCs with structured topologies (e.g., fat trees). Let G(V, E) represent the substrate network, where V denotes the set of physical nodes and E the set of edges. Without loss of generality, we assume G represents a single-path tree topology; however, our work can easily be extended to handle other network topologies by simply adding flow-conservation constraints to our problem formulation. Let P be the set of physical servers (P ⊂ V), TOR the set of top-of-rack switches (TOR ⊂ V), Aggregate the set of aggregate switches (Aggregate ⊂ V), and Core the set of core switches (Core ⊂ V). We assume each VM requires a discrete and normalized resource from the physical server.

Parameters

c_p: Capacity of a physical server p (e.g., c_p is the total number of VMs which can be accommodated by server p).
c_ij: Capacity of a physical link (ij) ∈ E.
f_ij: Bandwidth reserved for the communication of the tenant's primary VMs on link (ij) ∈ E.
L: Set of switch levels. We distinguish between TOR, Aggregate and Core switch levels.
l_i: Level of switch i such that l_i ∈ L.
x_np ∈ {0, 1}: indicates that a primary VM n is mapped (1) to physical server p (or not, 0).
v_j^i ∈ {0, 1}: specifies that node j is a descendant of node i (1) in the tree topology (or not, 0).

1932-4537 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

[Fig. 5 graphic: servers S1-S8 with per-link bandwidth annotations. Legend: available VMs, primary VMs, idle backup VMs, active backup VMs, primary BW B, backup BW B, failed server, pre-failure hose, post-failure hose. Panels: (a) total primary bandwidth to reserve = 8B; (b) total backup bandwidth to reserve when S1 fails = 8B; (c) total backup bandwidth to reserve when S2 fails = 8B; (d) total backup bandwidth to reserve when S3 fails = 4B; (e) total backup bandwidth to reserve considering all possible failures = 2B+2B+2B+2B+2B = 10B; (f) total bandwidth (primary and backup) to reserve = 8B+10B = 18B; (g) total bandwidth (primary and backup) to reserve considering bandwidth reuse = 3B+2B+2B+2B+2B = 11B.]

Fig. 5: Backup bandwidth reservation procedure.
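The per-link computation that Fig. 5 walks through (worst-case post-failure hose demand on a link, minus the primary bandwidth released by the failure) can be sketched as follows. This is a toy sketch with our own variable names and illustrative numbers, not code from the paper:

```python
# Toy sketch of the Fig. 5 computation (our own data layout, not the paper's).
# per_failure[p] = (active VMs under the link, active VMs above it) once the
# backups replacing server p's VMs are activated; B is the per-VM hose rate.

def link_backup_bw(per_failure, primary_bw, B):
    # hose model: a link carries at most the smaller of its two sides
    worst = max((min(inside, outside) * B
                 for inside, outside in per_failure.values()), default=0)
    # bandwidth reuse: the failed server's primary bandwidth is released,
    # so only the excess over the primary reservation must be added
    return max(0, worst - primary_bw)

B = 100  # Mbps, illustrative
per_failure = {"S1": (2, 2), "S2": (2, 2), "S3": (2, 2)}
print(link_backup_bw(per_failure, primary_bw=3 * B, B=B))  # reuse covers it: 0
print(link_backup_bw(per_failure, primary_bw=1 * B, B=B))  # extra B needed: 100
```

The same min/max/reuse structure is what the formal model later encodes in constraints (11)-(14).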


(1) in the tree topology (or not, 0).

Decision Variables
w_ip: Number of primary and backup VMs hosted on a physical server or under switch i that are active when p fails.
b^p_ij: Bandwidth needed on link (ij) ∈ E when p fails.
t_ij: Bandwidth required on link (ij) ∈ E considering all possible failures.
t̂_ij: Backup bandwidth to reserve on link (ij) ∈ E considering the reuse of primary bandwidth, as discussed in Section IV-E.
y_kp ∈ {0, 1}: indicates whether a backup VM k is mapped (1) to physical server p (or not, 0).
z_nk ∈ {0, 1}: specifies whether a backup VM k is protecting (1) a primary VM n (or not, 0).

We aim at finding the lowest-cost protection plan design, expressed in terms of the number of backup VMs and the backup bandwidth provisioned. We refer to this problem as the Survivable Virtual Machine Placement with Bandwidth Guarantees (SVMP-BG) problem.

Minimize  α (Σ_{p=1}^{P} Σ_{k=1}^{K} y_kp) + ((1−α)/B) (Σ_{(ij)∈E} t̂_ij)    (1)

subject to

Σ_{n=1}^{N} x_np + Σ_{k=1}^{K} y_kp ≤ c_p,  ∀p∈P    (2)
y_kp + Σ_{n=1}^{N} x_np z_nk ≤ 1,  ∀k∈K, ∀p∈P    (3)
Σ_{k=1}^{K} z_nk = 1,  ∀n∈N    (4)
Σ_{n=1}^{N} z_nk ≤ N Σ_{p=1}^{P} y_kp,  ∀k∈K    (5)
Σ_{n=1}^{N} z_nk ≥ Σ_{p=1}^{P} y_kp − 1/N,  ∀k∈K    (6)
Σ_{p=1}^{P} y_kp ≤ 1,  ∀k∈K    (7)
w_pp′ = Σ_{n=1}^{N} (x_np′ + Σ_{k=1}^{K} y_kp′ x_np z_nk),  ∀p,p′∈P: p′≠p    (8)
w_ip = Σ_{p′=1}^{P} w_pp′ v_p′i,  ∀i∈TOR, ∀p,p′∈P: p′≠p    (9)
w_ip = Σ_{j=1}^{|V−P|} w_jp v_ji,  ∀i∈Aggregate ∪ Core, ∀p∈P    (10)
b^p_ip′ = min(w_pp′ B, Σ_{p″∈P: p″≠p,p′} w_pp″ B),  ∀i∈TOR, ∀p,p′∈P (p′≠p)    (11)
b^p_ij = min(w_jp B, Σ_{j′∈{V−P}: j′≠j, l_j′=l_j} w_j′p B),  ∀i∈Aggregate ∪ Core, ∀j∈{V−P}, ∀p∈P    (12)
t_ij = max_{p∈P} b^p_ij,  ∀(ij)∈E    (13)
t̂_ij = max(0, t_ij − f_ij),  ∀(ij)∈E    (14)
t̂_ij + f_ij ≤ c_ij,  ∀(ij)∈E    (15)

α allows adjusting the weight between the dual objectives, namely minimizing the number of backup VMs required and achieving the lowest amount of backup bandwidth provisioned. Note that the term expressing the second objective (Σ_{(ij)∈E} t̂_ij) depicts the backup bandwidth in Mbps, while the first objective counts VMs with no specific unit. Thus, in order to normalize the objective function, we divide Σ_{(ij)∈E} t̂_ij by B.

Constraint (2) ensures that the physical servers' capacity is respected. Constraint (3) specifies that a backup VM k can only protect a primary VM n if it is hosted on a different physical server; it also guarantees that a primary VM n hosted on a server p cannot be protected by a backup VM k hosted on that same server. Constraint (4) ensures that each primary VM n is protected by one and only one backup VM. Constraint (5) certifies that a backup VM k can protect at most all N primary VMs, and only if it is hosted on a substrate node. Constraint (6) makes sure that a hosted backup VM k must be protecting at least one primary


[Fig. 6 graphic: servers S1-S8 hosting two tenants, with per-link bandwidth annotations. Legend: tenant 1 (primary VMs, backup VMs, primary BW B1, backup BW B1), tenant 2 (primary VMs, backup VMs, primary BW B2, backup BW B2), available VMs, bandwidth-sharing links, bandwidth-reuse links. Panels: (a) total bandwidth (primary and backup) to reserve without considering bandwidth sharing = 16B1+8B2; (b) total bandwidth (primary and backup) to reserve while considering bandwidth sharing = 16B1+5B2 (B1 > B2).]

Fig. 6: Bandwidth sharing between tenants.
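The arithmetic behind the saving in Fig. 6 is a single rule: on a link shared by tenants that cannot fail simultaneously, reserve the maximum of their backup demands rather than their sum. A minimal sketch (illustrative numbers, our own function name):

```python
# Sketch: backup reservation on one shared link (cf. Fig. 6). demands holds
# each tenant's backup bandwidth on the link; sharing is allowed only for
# tenants that cannot be disrupted by the same single failure.

def shared_reservation(demands, can_all_share):
    """Reserve max(demands) if all tenants may share, else their sum."""
    return max(demands) if can_all_share else sum(demands)

B1, B2 = 150, 100  # Mbps, illustrative (B1 > B2 as in Fig. 6)
print(shared_reservation([2 * B1, B2], can_all_share=True))   # max: 300
print(shared_reservation([2 * B1, B2], can_all_share=False))  # sum: 400
```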

VM n ∈ N. Constraint (7) ensures that a backup VM can only be hosted on one physical server.

In order to decide the backup bandwidth which needs to be reserved, one needs to calculate the number of primary and backup VMs hosted under each node (on the physical servers of the sub-tree rooted at the specified node). For this purpose, three constraints, each dedicated to a type of node, are defined: the node can be a physical server as specified in constraint (8), a TOR switch as in constraint (9), or an aggregate or core switch as in constraint (10). To determine the backup bandwidth, constraint (11) defines the bandwidth needed on link ip′ between TOR switch i and physical server p′ upon the failure of p. This bandwidth is calculated, based on the hose model, as the minimum of the bandwidth required by the active VMs on server p′ and the bandwidth needed by those hosted on the remaining servers, excluding the failed one. Constraint (12) calculates the bandwidth on link ij between the network switches; it is the minimum of the bandwidth needed by the active VMs hosted under switch j and by those hosted under the switches j′ at the same level (TOR, Aggregate, Core) as j. Constraint (13) defines the required bandwidth on each link: the total bandwidth needed on a link is the maximum bandwidth required on that link over all possible failures. Constraint (14) enables the reuse of bandwidth by calculating the additional backup bandwidth needed on each link as the difference between the total bandwidth considering all failures and the bandwidth guaranteed between primary VMs when no failure occurs. Finally, constraint (15) represents the bandwidth capacity constraint on each link.

The SVMP-BG model is a Mixed Integer Non-Linear Program (MINLP) owing to constraint (8). Due to space limitations, we have omitted the linearization details; however, standard linearization techniques can be used to linearize constraints (8), (11), (12), (13) and (14). The SVMP-BG problem can be shown to be NP-complete by a reduction from the bin packing problem, where backup VMs must be packed into a finite number of physical servers representing the bins. Next, we present a scalable solution for the SVMP-BG problem along with an efficient heuristic to share backup bandwidth between tenants.

VI. SVMP-BG: A TWO-SUBPROBLEM SOLUTION

Our methodology for solving the SVMP-BG problem is depicted in Algorithm 1.

A. The Backup Placement Problem

The backup VM embedding problem consists of deciding on the placement of backup VMs to support a single node failure of a certain tenant. Since the objective is to provide 100% restoration while minimizing the backup footprint, we first attempt to determine the minimum number of backup VMs required to protect the tenant's primary VMs. In addition, we need an efficient placement of those backup VMs such that they consume the minimum amount of network bandwidth. We recall the observations made from our motivational examples in Section IV, where we showed that the best placement of backup VMs is obtained by collocating them with the primary VMs of the given tenant. The most efficient collocation is the one that does not require more backup VMs than the maximum number of primary VMs hosted on the same physical server; we will refer to this number as NbB. Such an embedding is cost-effective because it not only reduces the number of backup VMs needed but also minimizes the bandwidth to be reserved. A similar placement approach is motivated by previous works in the literature [6], [7], [9], which showed that collocating VMs reduces the internal DC traffic. Our SVMP-BG heuristic performs a greedy embedding of NbB backup VMs by considering their collocation with the primary VMs. The backup VM embedding is performed by the recursive function addBackupForRequest() (line 7), which conducts a recursive search to host the NbB backups, starting from the sub-tree where the primary VMs of the request are placed; we will refer to this sub-tree as the request sub-tree. This function uses the collocate parameter, which selects between two types of searches:
1-Search with collocation: The search with collocation starts by embedding the backup VMs on the server (Smin) hosting the minimum number of primary VMs for the given request. When Smin is saturated, or when the number of remaining backup VMs is equal to the number of primary VMs hosted on Smin, the collocation terminates. For the remaining backup VMs, the algorithm selects random servers not hosting any primary VMs in the request sub-tree; this ensures the protection of the primary VMs hosted on Smin. If the request sub-tree is unable to admit all the backup VMs due to lack of resources, the search with collocation restarts from the request sub-tree's parent switch, and onwards, reaching the whole network if needed.
2-Search without collocation: The search without collocation is executed when collocation did not yield a feasible


solution. Here, the search starts from the request sub-tree and begins by randomly choosing a server, not hosting any primary VMs, to host the backups. If no embedding solution is found in the request sub-tree, the search without collocation restarts from the request sub-tree's parent switch, and onwards, reaching the whole network if needed. Note that we prioritize the search with collocation because hosting backup VMs on the same servers embedding the primary VMs reduces the bandwidth cost [6], [7], [9]. For that same reason, the search always starts from the request sub-tree. The choice of the servers hosting the minimum number of primary VMs to embed the backup VMs is motivated by the fact that collocation on those servers ensures that only NbB backup VMs are required to provide 100% availability, without the need to conduct any additional verifications (as shown in Section IV).

B. Protection Plan Design (PPD) Problem

Once the backup VMs of the request are placed, a protection plan has to be elaborated. The protection plan consists of determining the correspondence between primary and backup VMs; that is, which backup VM will protect which primary VM in case of a failure. Such correspondence is driven by the motive of minimizing the internal DC traffic. As shown in Fig.4, multiple protection plans exist, each of which requires a different amount of backup bandwidth for the resulting hose. To determine the protection plan, we solve the SVMP-BG model with the variables y_kp now fixed (line 13) (y_kp is fixed since the backup VM placement is decided through the previous sub-problem). This considerably reduces the complexity of the original SVMP-BG. We refer to this problem as the PPD problem and formulate it as follows:

Minimize  Σ_{(ij)∈E} t̂_ij    (16)

subject to constraints (3), (4), (8), (9), (10), (11), (12), (13), (14), (15).

If no solution is found (e.g., lack of bandwidth in the considered sub-tree s), then we restart the search for a new backup placement in the sub-tree rooted at the parent switch of s (line 20). This new sub-tree contains a larger number of physical servers and therefore allows the exploration of different backup VM placements. The placement follows the search criteria detailed in the previous section; that is, we attempt first to collocate the primary and backup VMs of the same tenant, and we repeat solving the PPD problem. We keep repeating until either a feasible solution is found or all possible sub-trees are explored, at which point we switch the placement criteria by no longer considering collocation, and restart from the original request sub-tree (line 24).

Algorithm 1 SVMPBG (Request r, subtree t, collocate)
1: Given:
2: G(V, E) network where the request is embedded
3: Primary Embedding of the request r < N, B >
4: Subtree s = null;
5: objectiveValue = 0;
6: Sub-Problem 1: Embed backups for the request
7: s = addBackupForRequest(r, t, collocate);
8: if (s == null) then
9:   Release the request from the network G;
10:  return false;
11: end if
12: Sub-Problem 2: Map the backup to primary VMs
13: objectiveValue = executePPD();
14: if (objectiveValue ≥ 0) then
15:   Reserve backup bandwidth for the request;
16:   return true;
17: end if
18: if (s.rootNode.level < G.HEIGHT) then
19:   Release the allocated backup VMs for the request;
20:   return SVMPBG(r, s.getParentTree(), collocate);
21: end if
22: if (collocate) then
23:   Release the allocated backup VMs for the request;
24:   return SVMPBG(r, r.subtree, false);
25: end if
26: Reject the request;
27: Release the request allocated resources from G;
28: return false;

VII. SWITCH AND POWER FAILURES

Indeed, failures are not limited to the failure of physical servers; they can also be caused by an outage of a switch or a power node [28], [21]. Our protection plan design approach can be easily extended to accommodate power and switch failures by exploiting the concept of fault domain [21]. We define a fault domain as a set of physical servers sharing a single point of failure, such as a TOR switch or a common power supply. For instance, if a TOR switch fails, all the servers sharing it will fail.

A. Protection Plan for Fault Domains

Consider the example in Fig.7(a), where we show a tenant <4, B> of 4 primary VMs and bandwidth B to be guaranteed for the communication between them. We consider 4 different fault domains (F1, F2, F3 and F4), and we seek to restore the complete service of the tenant under a single TOR switch failure. Given that the maximum number of primary VMs hosted in the same fault domain for this tenant is 3 VMs (embedded in F1), we need 3 backup VMs to restore the service. Thus, one can choose to spread those backup VMs under different fault domains where no primary VMs are embedded, as shown in Fig.7(a). In this case, we consider that the primary VMs hosted on S1 are protected by backup VMs hosted on S5, and the primary VMs embedded on S2 and S3 are protected by the backup VM hosted on S7. The total bandwidth to reserve is 17B (obtained by applying the hose model and the bandwidth reuse technique (Sections IV-D and IV-E)).

Another embedding approach can be obtained by collocating backup VMs with the primary ones while respecting the following two rules, which we refer to as fault tolerance rules in the remainder of this manuscript:
1-Backup VMs that are designated to protect the primary VMs hosted on servers in the failed fault domain should be


[Fig. 7 graphic: servers S1-S8 partitioned into fault domains F1-F4, with per-link bandwidth annotations. Legend: empty VMs, primary VMs, backup VMs, BW B, fault domain Fi. Panels: (a) total bandwidth to reserve = 17B; (b) total bandwidth to reserve = 11B.]

Fig. 7: Protection plans under a single TOR switch failure.

embedded on servers located in a different fault domain.
2-A backup VM can protect at most one primary VM from each fault domain.

By applying these fault tolerance rules, we show in Fig.7(b) the collocation of backup VMs with primary ones. Here, the backup VMs hosted on server S3 are in the same fault domain F2 as the primary VM embedded on that same server. If the TOR switch in F2 fails, both the primary and backup VMs on server S3 will fail; thus, none of the backup VMs on S3 can protect the primary VM hosted on that same server. Hence, the backup VMs on S3 protect the primary VMs embedded on server S1, and the primary VM on S3 is protected by the backup VM on S5. Further, if the TOR switch in F1 fails, the primary VMs on S1 and S2 will fail. In this case, 3 backup VMs are needed to recover from the failure. Since the backup VMs on S3 are protecting those on S1, they cannot at the same time protect the primary VM on S2; the latter must be protected by the backup VM on S5. The total bandwidth to reserve in this scenario is 11B. Thus, a 35% bandwidth saving can be obtained by collocation while accounting for a single TOR switch failure.

B. SVMP-BG For Fault Domains

To account for power and switch failures, our SVMP-BG model (Section V-B) can be extended by replacing constraint (3) with constraint (17), which prevents a backup VM from protecting a primary one if both are embedded in the same fault domain. In addition, constraint (18) prevents a backup VM from protecting more than one primary VM from each fault domain.

z_nk ≤ 1 − y_kp d_fp x_np′ d_fp′,  ∀k∈K, ∀n∈N, ∀p,p′∈P, ∀f∈F    (17)
(z_nk + z_n′k) x_np d_fp x_n′p′ d_fp′ ≤ 1,  ∀k∈K, ∀n,n′∈N (n≠n′), ∀p,p′∈P, ∀f∈F    (18)

d_fp is a parameter which specifies that server p ∈ P is in the fault domain f ∈ F, where F is the set of fault domains defined by a switch or a power failure.

C. SVMP-BG Heuristic For Fault Domains

Similarly, the SVMP-BG heuristic needs to enforce the same fault tolerance rules by updating the backup VM embedding accomplished by the addBackupForRequest() function (line 7 of Algorithm 1) as follows:
1-Let NbB = total number of backup VMs to embed = maximum number of primary VMs hosted in the same fault domain.
2-Let mp be the minimum number of primary VMs hosted in the same fault domain. Collocate NbB − mp backup VMs in the same server/fault domain hosting the minimum number of primary ones.
3-Collocate the remaining mp backup VMs under a fault domain not hosting any primary VMs and residing in the smallest sub-tree where the primary VMs are embedded.

Further, the PPD model needs to be updated by adding constraints (17) and (18) to decide on the protection plan, ensuring that a backup VM protects a primary VM hosted in a different fault domain.

While our protection plan design is based on an embedding of primary VMs that prioritizes their collocation in the smallest sub-tree in order to reduce the bandwidth to reserve, we clearly discussed in Section IV and showed in Fig.2 that collocation reduces fault tolerance and thus increases the backup footprint.

VIII. BANDWIDTH SHARING BETWEEN TENANTS

A. Motivation and Challenges

Observing that the backup bandwidth is only used following a failure, it can be shared between multiple tenants that will not require it simultaneously. In fact, upon considering a single node failure, we can identify two cases in which tenants can share their backup bandwidths:

1-Non-concurrent failure of tenants
Considering a single node failure, tenants that do not have primary VMs hosted on the same physical servers are not vulnerable to a simultaneous service disruption. Hence, they can share their protection bandwidth on the common links along their routes (on their protection plans).

In the example of Fig.6, we depict the embedding of 2 tenants: tenant 1 <5, B1> and tenant 2 <3, B2>. Since the primary VMs of both tenants are hosted on different physical servers, any single node failure will result in the service disruption of only one of the tenants at a time. This suggests that the backup VMs of tenant 1 and tenant 2 will not be used at the same time. Thus, those tenants can share the backup bandwidth (on their respective protection plans) that is needed for their communication. Such bandwidth sharing requires the reservation of the maximum backup bandwidth needed by each of them on their shared links (thick lines in Fig.6(b)). By comparing Fig.6(a) and (b), one can notice the importance of bandwidth sharing between tenants, which results in a saving of 3B2 (when B1=B2=B, a saving of 12.5% is obtained). A key observation is that sharing between tenants is only possible on those links where no bandwidth reuse (Section IV-E) between a tenant's primary and backup bandwidths takes place. In fact, even though the dashed links in Fig.6 are common to both tenants, no bandwidth sharing is possible on them because the primary bandwidth of each tenant is reused by its post-failure hose on those links. Thus, we make the following observation:

Observation 1. Bandwidth sharing on a link l is allowed between tenants whose protection plans do not reuse their primary bandwidths on l.

2-Simultaneous failure of tenants
While in the previous example (Fig.6) we have shown that services that do not fail simultaneously can share their backup


[Fig. 8 graphic: servers S1-S8 hosting two tenants, with per-link bandwidth annotations and links l1, l2, l3. Legend: tenant 1 (primary VMs, backup VMs, primary BW B1, backup BW B1), tenant 2 (primary VMs, backup VMs, primary BW B2, backup BW B2), available VMs, bandwidth-sharing links, bandwidth-reuse links, failed server, post-failure hose. Panels: (a) total primary bandwidth to reserve = 4B1+2B2; (b) total bandwidth to reserve when S1 fails = 8B1+2B2; (c) total bandwidth to reserve when S3 fails = 4B1+6B2; (d) total bandwidth to reserve when S4 fails = 6B1+6B2; (e) total bandwidth (primary and backup) to reserve while considering bandwidth reuse = 10B1+7B2 (B1 > B2); (f) total bandwidth (primary and backup) to reserve while considering bandwidth reuse and sharing = 10B1+6B2 (B1 > B2).]

Fig. 8: Bandwidth sharing between tenants that may fail simultaneously.
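The per-link test that Fig. 8 walks through can be sketched as follows, using a toy tenant record of our own: reuse_links holds the links on which a tenant reuses its primary bandwidth, and hose_links[s] holds the links its post-failure hose needs if server s fails. Sharing on a link is refused if either tenant reuses primary bandwidth there, or if some common server failure routes both post-failure hoses onto it:

```python
# Sketch of the pairwise sharing test illustrated by Fig. 8 (our own toy
# data model, not the paper's code).

def can_share(t1, t2, link):
    # no sharing where either tenant already reuses its primary bandwidth
    if link in t1["reuse_links"] or link in t2["reuse_links"]:
        return False
    # a server failure hitting both tenants must not need `link` for both hoses
    for srv in set(t1["hose_links"]) & set(t2["hose_links"]):
        if link in t1["hose_links"][srv] and link in t2["hose_links"][srv]:
            return False
    return True

# illustrative records loosely following Fig. 8's scenario
t1 = {"reuse_links": set(),
      "hose_links": {"S1": {"l1", "l2", "l3"}, "S4": {"l1", "l2"}}}
t2 = {"reuse_links": set(),
      "hose_links": {"S3": {"l1", "l2", "l3"}, "S4": {"l1", "l2", "l3"}}}
print(can_share(t1, t2, "l3"))  # True: no single failure needs l3 for both
print(can_share(t1, t2, "l1"))  # False: S4's failure needs l1 for both hoses
```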

bandwidths, we illustrate in Fig.8 the cases where tenants that are vulnerable to a simultaneous failure may also share their backup bandwidths on the same links traversed by their corresponding post-failure hoses. We consider a network of two tenants: tenant 1 <3, B1> and tenant 2 <2, B2>, embedded with their backup VMs as presented in Fig.8(a). Tenant 1's primary VMs hosted on S1 are protected by its backup VMs hosted on S5 and S7, while the primary VM hosted on S4 is protected by its backup VM embedded on S7. The primary VMs of tenant 2 hosted on S3 and S4 are protected by its backup VM embedded on S6. Both tenants have primary VMs hosted on server S4; thus, they can fail simultaneously if S4 fails.

In order to determine the bandwidth which needs to be reserved for each of the two tenants, we consider the failure of each of the servers S1, S3 and S4 hosting the primary VMs of both tenants. When S1 fails (Fig.8(b)), only tenant 1 fails, requiring bandwidth B1 on links l1, l2 and l3. When server S3 fails (Fig.8(c)), only tenant 2 fails, demanding bandwidth B2 on links l1, l2 and l3. However, if we consider the failure of S4 (Fig.8(d)), both tenants fail, since the two of them have primary VMs hosted on this server. Tenant 1 demands bandwidth B1 to be reserved on links l1 and l2 for its service restoration, while tenant 2 requires bandwidth B2 to be reserved on link l3 (in addition to that already reserved on links l1 and l2). Since tenant 1 and tenant 2 require backup bandwidth to be reserved simultaneously on links l1 and l2 upon the failure of S4, they cannot share their backup bandwidth on those links. Thus, we make the following observation:

Observation 2. If two tenants have primary VMs hosted on the same server S and their post-failure hoses go through the same link l, then upon the failure of S they cannot share their backup bandwidth on l.

Now, following the failure of any server hosting primary VMs of tenant 1 or tenant 2 (S1, S3 or S4), only one of the two tenants requires bandwidth on link l3 at a time. Hence, both tenants can share bandwidth on this link. Thus, instead of reserving B1 + B2 on l3, we can reserve max(B1, B2). In Fig.8(e), we represent the total primary and backup bandwidth (10B1 + 7B2) needed for both tenants while considering bandwidth reuse on the dashed links. As mentioned previously, no bandwidth sharing is possible on links where bandwidth reuse is considered. Fig.8(f) depicts that 10B1 + 6B2 is to be reserved for the communication of tenant 1 and tenant 2 while considering bandwidth reuse and bandwidth sharing between them, saving one B2 through sharing (when B1=B2=B, a saving of 6% is obtained).

B. Multi-Tenant Bandwidth Share Design (MTBSD)

Multiple tenants may have their protection plans traversing the same link l. Hence, bandwidth sharing is not limited to two tenants but can be extended to multiple tenants, some of which may be able to share their backup bandwidth on l while others may not. Hence, we can group those tenants into several subsets which we define as independent sharing sets (ISS). An ISS is a subset of one or more tenants t ∈ T that may share their backup bandwidths on l (an independent set is a subset of nodes of a graph G such that no two of them are adjacent [29]). We would like to group the tenants (using link l) into sharing subsets such that the total backup bandwidth b̂l to reserve is minimized. Therefore, the Multi-Tenant Bandwidth Share Design (MTBSD) problem consists of determining, for each link in the network, the ISS(s) that maximize its saved bandwidth. Hence, we provide the following problem definition:

Definition 1. Given a set of tenants (t ∈ T), each requiring a backup bandwidth blt on a link l, find the independent tenant sharing sets that minimize the bandwidth to reserve on l.

Theorem 1. The optimal Multi-Tenant Bandwidth Share Design (MTBSD) problem is NP-complete. (See Appendix)


              SVMP-BG Model                                      SVMP-BG Heuristic
α      Rej. Rate (%)  Exec. Time (ms)  Tot. Res. BW (Mbps)  Backup VMs    Rej. Rate (%)  Exec. Time (ms)  Tot. Res. BW (Mbps)  Backup VMs
0           20            486094             18230              35             20             330              18846              26
0.25         0            152068             13740              39              0             389              13740              40
0.5         10              3718             18054              30             10             405              18358              30
0.75        10             27829             19156              30             10             470              21612              31
1           20              4531             12332              27             30             330               9414              27

TABLE I: SVMP-BG model and heuristic comparison over a small network of 12 servers with θ = 4 each.

C. Multi-Tenant Bandwidth Share Design (MTBSD)-Heuristic and removed from the sortedRequests array (line 13). After
In order to share backup bandwidth between multiple ten- evaluating all the requests in the sortedRequests array and
ants using the same link l, we seek at partitioning them into adding those who can share their bandwidth to the set s, the
several ISS(s), where a tenant can only be part of a single heuristic will try to build a new sharing set with the remaining
set. The bandwidth to reserve for each set is the maximum requests (the requests that are not part of s). This is performed
backup bandwidth required by all the tenants in the set by calling the MTBSD algorithm again and passing to it the
(Section VIII-A). Hence, our methodology for sharing backup updated sortedRequests array (line 17). The code will keep
bandwidth on each link l in the network starts by identifying on calling the MTBSD heuristic until all the eligible requests
the tenants that are eligible to share their backup bandwidths become part of a sharing set of l. The bandwidth reserved on l
on l, sorting and storing them in decreasing order of their backup bandwidths in an array that we denote sortedRequests. An eligible tenant is a tenant that is not reusing his primary bandwidth as backup bandwidth on l (Observation 1).

Algorithm 2 MTBSD (Array sortedRequests)
1: Given:
2: l: link on which we are solving the MTBSD problem
3:
4: SharingSet s = newSharingSet(l);
5: Request r = sortedRequests.get(0);
6: s.requests.add(r);
7: s.bandwidthToReserve = r.backupBandwidth[l];
8: sortedRequests.remove(0);
9: for (i = 0; i < sortedRequests.size(); i++) do
10:   r = sortedRequests.get(i);
11:   if (r.canShareBw(s)) then
12:     s.requests.add(r);
13:     sortedRequests.remove(i);
14:   end if
15: end for
16: if (sortedRequests.size() > 0) then
17:   MTBSD(sortedRequests);
18: end if

Given the sortedRequests array, the MTBSD heuristic, depicted in Algorithm 2, recursively builds the sharing sets of l. It starts by creating a set s for l (line 4) and adds to it the first tenant in the sortedRequests array (lines 5-6). This tenant will be the one with the highest backup bandwidth demands on l. Thus, the algorithm sets the bandwidth to reserve for s equal to the backup bandwidth demands of this tenant (line 7). The latter is then removed from the sortedRequests array (line 8). Afterwards, the algorithm loops over the remaining requests in the array (line 9) and checks whether each one of them is able to share its backup bandwidth with all the requests that belong to s, through a call to the canShareBw(s) function (line 11). The canShareBw(s) function verifies that the post-failure hoses of the request of interest do not go through l upon the service disruption of any of the requests in s (Observation 2). If the request of interest can share its backup bandwidth on l with all the requests in s, it will be added to the set s (line 12) and removed from the array (line 13); the bandwidth reserved on l will be updated to consider the bandwidth to reserve for each defined sharing set. If the sortedRequests array was of size n, looping over the n requests will take O(n). In addition, if none of the requests was able to share its bandwidth, MTBSD will be called recursively n times. Hence, the worst-case complexity of the MTBSD algorithm is O(n²).

IX. NUMERICAL RESULTS

We evaluate the performance of the SVMP-BG heuristic (Section VI) against the optimal solution obtained by the SVMP-BG model (Section V), as well as the WCS algorithm [21], the N + L heuristic and a benchmark that we denote as PPD with Randomized placement (PPDR). Further, we evaluate the benefits of our proposed bandwidth sharing opportunities through applying them on the solution provided by the SVMP-BG heuristic. We simulate three-level tree topologies with no path diversity of various sizes. We assume all VMs have homogeneous CPU and memory capacity, and each server has a total capacity of θ VM slots. We use Poisson traffic arrival, and we vary the load by varying the arrival rate (λ) while fixing the average service time (µ) of the requests (load = λ/µ). All our numerical evaluations are conducted using Cplex version 12.4 on an Intel Core i7-4790 CPU at 3.60 GHz with 16GB RAM.

A. SVMP-BG Heuristic vs. SVMP-BG Model

We consider a small test network consisting of 12 servers, each with a capacity θ = 4. We use this network to run the SVMP-BG model and obtain reference results to compare them with those acquired using the SVMP-BG heuristic. We fix the capacity of the links interconnecting the switches to 1 Gbps. We randomly generate sets of 10 requests each. The requests have varying VM requirements ([5-10] VMs) and bandwidth demands ([50-200] Mbps). Our results are depicted in Table I, which clearly shows that our two-step approach is able to achieve very close performance to the SVMP-BG model with a much faster runtime.

For a small α, the model attempts to exclusively optimize the bandwidth use, irrespective of the number of backup VMs. As α increases, the model tries to create a balance between both bandwidth consumption and computing resources. For instance, when α=0, the model requires 18230 Mbps (total bandwidth) and 35 backup VMs. However, the 2-step approach
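To make the recursion of Algorithm 2 concrete, the following self-contained Python sketch builds the sharing sets of a single link. The data structures are illustrative assumptions, not the authors' implementation: tenants are plain ids, backup demands sit in a dictionary, and a precomputed pairwise conflict relation stands in for the canShareBw check of Observation 2. The recursion is unrolled into a loop, and each pass scans a snapshot of the pending requests, so removals cannot skip the element that follows them (as a literal reading of lines 9-14 of the pseudocode would).

```python
# Illustrative sketch of Algorithm 2 (MTBSD) for a single link l.
# conflicts encodes Observation 2: a frozenset {ti, tj} means the two
# tenants' post-failure hoses traverse l simultaneously, so they
# cannot share backup bandwidth on l.

def mtbsd(requests, backup_bw, conflicts):
    """Partition tenants into sharing sets on one link.

    requests  : tenant ids requiring backup bandwidth on l
    backup_bw : dict tenant id -> backup bandwidth demand on l
    conflicts : set of frozensets {ti, tj} that cannot share
    Returns a list of (members, bandwidth_to_reserve) sharing sets.
    """
    # Sort tenants in decreasing order of backup bandwidth demand.
    pending = sorted(requests, key=lambda t: backup_bw[t], reverse=True)
    sharing_sets = []
    while pending:                 # recursion of lines 16-18, unrolled
        head = pending.pop(0)      # highest remaining demand (lines 5-8)
        members = [head]
        reserve = backup_bw[head]  # reserve the set's maximum (line 7)
        remaining = []
        for r in pending:          # lines 9-15
            if all(frozenset((r, m)) not in conflicts for m in members):
                members.append(r)       # r shares bandwidth with the set
            else:
                remaining.append(r)     # r waits for the next sharing set
        pending = remaining
        sharing_sets.append((members, reserve))
    return sharing_sets
```

For example, with demands {a: 100, b: 50, c: 30} and a single conflict {a, b}, the sketch returns two sharing sets, {a, c} reserving 100 and {b} reserving 50. Each outer pass removes at least one tenant, so the running time stays within the O(n²) worst-case bound derived for the heuristic.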

1932-4537 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 9: Comparative analysis between the SVMP-BG heuristic, the WCS, the PPDR and the N + L algorithms. Panels: (a) rejection rate over load; (b) average bandwidth over load; (c) average backup VMs over load; (d) SVMP-BG heuristic runtime; (e) fault tolerance for server failure; (f), (g) fault tolerance for switch failure.

entails 18846 Mbps (3.37% more) but only 26 backup VMs (∼25% less computing resources). When α=1, both reserve the same number of backup VMs, but the 2-step method guarantees 23.6% less bandwidth. Here, the model ignores the bandwidth usage in its objective.

B. SVMP-BG Heuristic vs. WCS, N+L and PPDR

We compare the SVMP-BG heuristic against the WCS, the N + L and the PPDR algorithms. In the WCS algorithm [21], the primary VMs are spread across different fault domains and embedded on random servers chosen from each fault domain. In our simulations, we consider that the physical servers rooted at the same TOR switch form a fault domain. The WCS algorithm does not consider any provisioning of backup resources. The N + L heuristic consists of reserving, for each request <N,B>, L backup VMs, where L is the minimum number of backup VMs required to restore the service upon any single server failure. Here, the backup VMs are collocated on a server not hosting any primary VMs of the request. If there exists no server able to accommodate all the L backup VMs, the request is rejected. Using the N + L algorithm, any primary-to-backup VM correspondence does not affect the backup bandwidth footprint, given that all backup VMs are hosted on the same server. The placement of the backup VMs of a tenant (we only place the minimum required), performed by the PPDR method, is done at random by considering all servers in the DC. This placement begins by selecting a random server and embedding as many backup VMs as possible without violating the server's capacity constraint. If more backup VMs need to be placed, another server is selected at random, and the same procedure is repeated until all backup VMs are hosted. Once this placement is done, PPD is invoked to decide the protection plan and determine its feasibility. Here, a feasible protection plan may not exist, either due to the lack of bandwidth, or because the number of backup VMs is not sufficient to protect all the primary VMs. In this case, a different placement is selected. With this method, we consider a certain number of possible different placements, and for each one, PPD returns a solution. The solution that optimizes the PPD objective (i.e., minimize Σ_(ij)∈E t̂_ij) is selected.

We simulate Poisson arrivals of 100 requests; each request demands at random between [5-25] VMs, and requires bandwidth guarantees at random between [100-500] Mbps. We consider networks of 128 servers. Each server has a capacity θ = 6, and the capacity of each link is set to 10 Gbps. All our results are averaged over 10 runs of 100 requests each.

1-Rejection Rate: Our results in Fig.9(a) are presented with a 95% confidence interval. We observe from the figure that SVMP-BG substantially outperforms the WCS, the N + L and the PPDR in terms of rejection rate. The SVMP-BG heuristic can decrease the rejection rate by an average of 80.9% in comparison to the WCS method, and by 72% and 74% against the N + L and the PPDR algorithms, respectively. In fact, the WCS has a higher rejection rate given that it spreads the primary VMs across the network and therefore consumes much of the network capacity for allowing VMs to communicate with each other. In contrast, the SVMP-BG, N + L and PPDR algorithms prioritize primary VM collocation. However, SVMP-BG outperforms the N + L and the PPDR since its backup VM placement is driven by the advantages of collocating as much as possible, and therefore consumes less network bandwidth. While the N + L collocates the backup VMs on the same server located under the smallest sub-tree hosting the request, the PPDR selects servers at random, which could be dispersed in the network, and thus may end up with a protection plan that consumes more network bandwidth. Further, PPDR may select a placement for the backup VMs which may not be feasible to protect all primary VMs. This is true since the number of backup VMs is predetermined (for both PPDR and SVMP-BG) as explained earlier. Such infeasible placements are pruned out by SVMP-BG. Finally, as we increase the load in the network,
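The randomized placement step used by the PPDR baseline (pick a random server, embed as many backup VMs as its free slots allow, repeat, and reject the request when the DC runs out of capacity) can be sketched as follows. The free-slot dictionary, the seeded generator, and the convention of returning None on rejection are illustrative assumptions, not the authors' implementation.

```python
import random

def random_backup_placement(num_backups, free_slots, rng=random):
    """Randomly scatter backup VMs over servers (PPDR-style).

    num_backups : number of backup VMs to place
    free_slots  : dict server id -> free VM slots (mutated in place)
    rng         : random source (module or random.Random instance)
    Returns dict server id -> backup VMs hosted, or None if the DC
    cannot hold them all (the request would then be rejected).
    """
    placement = {}
    candidates = [s for s, free in free_slots.items() if free > 0]
    remaining = num_backups
    while remaining > 0:
        if not candidates:
            return None                        # not enough capacity anywhere
        server = rng.choice(candidates)        # pick a server at random
        take = min(remaining, free_slots[server])  # fill it as much as possible
        placement[server] = placement.get(server, 0) + take
        free_slots[server] -= take
        remaining -= take
        candidates.remove(server)              # server is now full or unneeded
    return placement
```

A caller would then hand the resulting placement to the PPD model to check whether a feasible protection plan exists, retrying with a fresh random placement on failure, as described above.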


Fig. 10: Comparative analysis between the SVMP-BG heuristic and the bandwidth share method. Panels: (a) rejection rate over load; (b) bandwidth gain over time.

the rejection rate of the four methods increases.

2-Total reserved bandwidth and backup VMs: We measure the total bandwidth (primary and backup) (Fig.9(b)) and the number of backup VMs (Fig.9(c)) reserved using the four methods. Here, we focus only on the tenants that were admitted by the four solutions. Fig.9(b) shows that SVMP-BG requires less network bandwidth than WCS, N + L and PPDR. The primary bandwidth reserved by the WCS is greater than the total of primary and backup bandwidth provisioned by the SVMP-BG for the same admitted requests, which shows that the SVMP-BG outperforms the WCS in terms of bandwidth consumption. Since PPDR spreads the backup VMs, it requires more bandwidth than the WCS and the SVMP-BG heuristic. The bandwidth provisioned by the N + L algorithm is the highest, since it does not collocate primary and backup VMs together as performed by the SVMP-BG and the PPDR methods. Fig.9(c) depicts that our protection plan design does not indeed lead to a high backup VM footprint. For instance, the simulated requests demanding between [5-25] primary VMs require an average of 4 backup VMs each. Given that the WCS approach does not provision backup resources, zero backup VMs are provisioned by this method.

3-Execution Time: In order to explore the scalability of the SVMP-BG heuristic, we present in Fig.9(d) the execution time of this algorithm over three single-path tree networks of varying size: P = 64, P = 128 and P = 256 physical servers, respectively, with θ = 6 VMs for each server. Fig.9(d) depicts that as we increase the size of the network, the runtime of the SVMP-BG heuristic increases. Such variation of the SVMP-BG runtime is due to the PPD model used to solve the primary-to-backup VM correspondence. The same figure shows that the variation of the load has no effect on the execution time of the SVMP-BG heuristic.

4-Fault Tolerance: To study the reliability provided by our SVMP-BG method, we periodically simulate several random faults and measure the average fault tolerance guaranteed for the admitted requests. Hence, we consider two types of faults: a single server fault (Fig.9(e)) and a single TOR switch fault (Fig.9(f)).

FT(f) = (1/M) Σ_{i=1}^{M} (N_i − θ_i^f) / N_i        (19)

The fault tolerance is computed by Eq.(19) [21], where M is the total number of requests occupying the network during the failure, N_i is the total number of primary VMs required by tenant i ∈ M, and θ_i^f is the number of primary VMs allocated to tenant i affected by fault f ∈ F. Fig.9(e) depicts that the SVMP-BG, N + L and PPDR methods outperform the WCS as they provide 100% fault tolerance. Such a result is indeed expected given that the SVMP-BG and PPDR algorithms implement the PPD model, which provides a pre-planned protection plan able to restore the service completely upon a single server failure, whereas the N + L ensures that, upon any single server failure, there always exist N VMs running for the service. Fig.9(f) shows that SVMP-BG provides better fault tolerance than the WCS due to the fact that the backup VMs it reserves may not necessarily be affected by the TOR switch failure and can be used to restore the service. In contrast, even though the WCS spreads the primary VMs across different fault domains, a request may end up having several primary VMs provisioned in the same fault domain to get admitted in the network. Thus, a fault can affect several primary VMs, none of which is protected as in the SVMP-BG heuristic. Finally, one can notice that the PPDR algorithm provides the highest fault tolerance given that it spreads the backup VMs. In contrast, the N + L heuristic has the worst fault tolerance given that the failed switch may cause the failure of more than L VMs.

Fig.9(g) depicts that the SVMP-BG for fault domains is able to provide 100% fault tolerance under a single TOR switch failure, as it is designed to account for such failures, unlike the SVMP-BG, which is designed to restore the service completely under a single server failure only.

C. SVMP-BG Heuristic vs. Bandwidth Sharing

We evaluate the impact of bandwidth sharing between tenants on the revenue over time, rejection rate and bandwidth gain metrics. We compare the SVMP-BG heuristic to a bandwidth sharing approach that consists of applying the MTBSD heuristic (Algorithm 2) to the solution provided by the SVMP-BG heuristic. Hence, we perform our simulations over a network of 128 physical servers with θ = 6. We set the capacity of the links interconnecting the switches to 10 Gbps. We randomly generate sets of 100 requests each, of varying VM ([5-25] VMs) and network ([100-500] Mbps) requirements. We consider a Poisson traffic arrival of requests and average our results over 10 runs of 100 requests each.

1-Rejection Rate: Cloud providers are interested in admitting more tenants in their DCs. Thus, we measure and
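The fault tolerance metric of Eq. (19) is straightforward to evaluate per simulated fault. The sketch below assumes simple per-tenant dictionaries for N_i and θ_i^f; the function and parameter names are illustrative, not from the authors' code.

```python
def fault_tolerance(n_primary, n_affected):
    """FT(f) of Eq. (19): average surviving fraction of primary VMs.

    n_primary  : dict tenant -> total primary VMs N_i
    n_affected : dict tenant -> primary VMs of tenant i hit by fault f
                 (theta_i^f); tenants absent from the dict are unaffected
    Returns a value in [0, 1]; 1.0 means no primary VM was affected.
    """
    m = len(n_primary)  # M: requests occupying the network at failure time
    return sum((n - n_affected.get(i, 0)) / n
               for i, n in n_primary.items()) / m
```

For instance, a fault hitting 2 of tenant t1's 4 primary VMs while tenant t2's 2 VMs survive gives FT = ((4−2)/4 + 2/2)/2 = 0.75, matching the metric averaged over all admitted requests in the simulations.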


compare the rejection rate of the SVMP-BG heuristic against the bandwidth share method. Our results in Fig.10(a) are presented with a 95% confidence interval. Since sharing bandwidth between tenants increases the available bandwidth in the network, one can directly guess that the rejection rate should decrease. This is true, given that the requests that were rejected because of a lack of bandwidth in the network using the SVMP-BG heuristic are more likely to get admitted using the bandwidth share approach. This is clearly depicted in Fig.10(a), which shows that the bandwidth share method can decrease the rejection rate by an average of 11.4% over the load in comparison to the SVMP-BG heuristic.

2-Bandwidth Gain Over Time: Sharing bandwidth between the admitted tenants increases the amount of available network resources and can provide an average of 37.96% of bandwidth gain, as depicted by Fig.10(b).

BandwidthGain = ((bwTs − sBw)/bwTs) × 100        (20)

The bandwidth gain is calculated using Eq.(20), where bwTs is the total backup bandwidth that can be shared between tenants and sBw represents the total bandwidth reserved for the tenants after sharing.

X. VIRTUAL MACHINES SYNCHRONIZATION

To provide high availability and fault tolerance of applications, backup resources are provisioned to protect the primary ones. Whenever a primary VM fails, its corresponding backup VM should be available to take over immediately and restore the service while keeping the failure transparent to the user. Hence, a backup VM needs to be continuously synchronized with its primary one to be kept in a nearly identical state [31]. Many techniques exist to provide VM synchronization for fault tolerance: for instance, the traditional backup methods, where full or incremental backups are scheduled to be performed hourly, daily or weekly [32]; and the checkpoint-recovery protocol [32], [33], [34], [35], which captures the execution state of the running VM at relatively high frequency in order to propagate changes to the backup VM almost instantly. In each of these methods, the bandwidth needed for synchronization between the primary and the backup VMs needs to be provisioned. We refer to such bandwidth as synchronization bandwidth.

Even though in this work we have overlooked the synchronization bandwidth and assumed that backup and primary VMs are in sync at all times, such bandwidth can be integrated into our work depending on the considered synchronization approach. For instance, if synchronization is performed in a scheduled full or incremental backup approach, a certain amount of synchronization bandwidth can be reserved on all the links of the network and used by all the tenants at different time slots. Another approach could be reusing the primary bandwidth as synchronization bandwidth during non-peak times of each tenant's application. In contrast, if the primary VM state needs to be communicated instantly to its corresponding backup VM (as in the case of the checkpoint-recovery protocol), additional synchronization bandwidth needs to be provisioned at all times for each tenant. In this case, synchronization bandwidth cannot be shared or reused.

XI. CONCLUSION

This paper addresses the problem of providing high service availability with bandwidth guarantees for cloud tenants through proposing a protection plan design. The proposed design considers the trade-off which exists between the number of backup VMs to provision and the backup bandwidth to guarantee. Given an embedding of primary VMs, we formulate the SVMP-BG model to provide an optimal protection plan for a given tenant through the provisioning of backup resources. Owing to its complexity, we develop the SVMP-BG heuristic, proved to be much more scalable than the SVMP-BG model. We conduct extensive simulations and confirm that our SVMP-BG heuristic outperforms a WCS approach and the N + L and PPDR baseline algorithms. We show that the SVMP-BG heuristic decreases the rejection rate by 80.9% in comparison to the WCS due to its efficient resource utilization. Our simulations depict that spreading VMs across different fault domains in a WCS method increases the provisioned bandwidth and the rejection rates.

In addition to the protection plan approach that we propose, we exploit several bandwidth sharing opportunities to better utilize cloud DC network resources. Our numerical results show that our MTBSD heuristic can increase cloud providers' revenues while decreasing the tenants' rejection rate by 11.4% when used with the protection plan provided by the SVMP-BG heuristic. It is able to save on average 37.96% of bandwidth in the network. The proposed bandwidth sharing opportunities can be applied on any network and tenants' protection plans.

APPENDIX

Proof. We prove that the MTBSD problem is NP-Complete (Theorem 1) by a reduction from the NP-Complete graph coloring problem. For completeness, we present a formal definition of the graph coloring decision problem.

Definition 2. "Let G = (V, E) be an undirected graph. Is there a k-coloring of V, such that no two adjacent vertices have the same color?" [36].

Given a substrate link l and a set T of tenants requiring backup bandwidth on l and not incurring any bandwidth reuse on it (Observation 1), we construct a conflict graph Gp = (Vp, Ep), where each vertex vi ∈ Vp corresponds to a tenant ti ∈ T (|Vp| = |T|). Ep is the set of edges in the conflict graph, where an edge e is added between two vertices vi, vj ∈ Vp whose corresponding tenants ti, tj ∈ T cannot share their backup bandwidth on l (Observation 2). Subsequently, we can reformulate the MTBSD decision problem as follows:

"Given a link l, a set of T tenants, and a conflict graph Gp = (Vp, Ep) where every vertex vi ∈ Vp corresponds to a tenant ti ∈ T, and every edge e ∈ Ep denotes that a pair of tenants cannot share their bandwidth on l; is there a partitioning of Vp into w independent sets?"

First, we show that the MTBSD problem is in the NP class for the given graph Gp(Vp, Ep). We consider a partitioning of Vp into w independent sets. One can verify, in polynomial time, that ∀vi ∈ Vp, vi belongs to exactly one


independent set s_j^l ∈ S^l (S^l = {s_1^l, s_2^l, ..., s_w^l}). In addition, one can validate that if (vi, vj) ∈ Ep, then vi and vj belong to s_x^l, s_y^l ∈ S^l with s_x^l ≠ s_y^l.

Next, we show that the graph coloring problem is polynomial-time reducible to the MTBSD problem, which proves that the MTBSD is NP-Hard. Consider an instance {G = (V, E), k} of the graph coloring problem, where V is the set of vertices, E is the set of edges, and k is the number of colors used to color the graph G. We transform G into an instance of the MTBSD problem (l, T, Gp = (Vp, Ep), w), where Vp=V, Ep=E and k=w. Further, we restrict our MTBSD problem by considering that all the tenants in T require a uniform backup bandwidth on l. Now, we show that there exists a k-coloring of Vp in Gp if and only if there exist w independent sets of Vp.

Suppose that there exists a k-coloring of Vp in Gp such that no two adjacent nodes in Vp have the same color; then each color corresponds to an independent set s of vertices in Gp that are not connected by any edge in Ep. Thus, there exist k independent sets of Vp. Conversely, if the MTBSD problem (l, T, Gp = (Vp, Ep), w) has a solution, which yields a partitioning of Vp into w independent sets such that in each independent set there exists no edge between any pair of vertices, it follows that each independent set corresponds to a color in G. Thus, we obtain a w-coloring of V in G. Indeed, if any adjacent vertices in G were associated with the same color, it would mean that this pair has an edge between them in Gp, which contradicts the fact that they belong to the same independent set in Gp. This completes the proof of the reduction. It follows that the restricted MTBSD problem is NP-Complete. Further, the problem is trivially at least as hard when the tenants in T require heterogeneous backup bandwidths.

Note that the graph coloring problem only solves the restricted version of the MTBSD problem where all the tenants t1, t2, ..., tn ∈ T require a uniform backup bandwidth b_t^l = B on l. In this case, defining the MINIMUM number w of ISS(s) will solve our MTBSD problem.

REFERENCES

[1] Hyame Assem Alameddine, Sara Ayoubi, and Chadi Assi. Protection plan design for cloud tenants with bandwidth guarantees. In 12th International Conference on the Design of Reliable Communication Networks (DRCN), pages 115–122. IEEE, 2016.
[2] Md Mashrur Alam Khan et al. Simple: Survivability in multi-path link embedding. In CNSM, pages 210–218. IEEE, 2015.
[3] Sara Ayoubi et al. Minted: Multicast virtual network embedding in cloud data centers with delay constraints. IEEE Transactions on Communications, 63(4):1291–1305, 2015.
[4] Jörg Schad et al. Runtime measurements in the cloud: observing, analyzing, and reducing variance. Proceedings of the VLDB Endowment, 3(1-2):460–471, 2010.
[5] Li Chen et al. Allocating bandwidth in datacenter networks: a survey. Journal of Computer Science and Technology, 29(5):910–917, 2014.
[6] Hitesh Ballani et al. Chatty tenants and the cloud network sharing problem. In USENIX NSDI 13, pages 171–184, 2013.
[7] Hitesh Ballani et al. Towards predictable datacenter networks. In ACM SIGCOMM CCR, volume 41, pages 242–253. ACM, 2011.
[8] Jeongkeun Lee et al. Cloudmirror: Application-aware bandwidth reservations in the cloud. In HotCloud. Citeseer, 2013.
[9] Jeongkeun Lee et al. Application-driven bandwidth guarantees in datacenters. In ACM SIGCOMM CCR, volume 44, pages 467–478. ACM, 2014.
[10] Lucian Popa et al. Elasticswitch: practical work-conserving bandwidth guarantees for cloud computing. In ACM SIGCOMM CCR, volume 43, pages 351–362. ACM, 2013.
[11] Katrina LaCurts, Jeffrey C Mogul, Hari Balakrishnan, and Yoshio Turner. Cicada: Introducing predictive guarantees for cloud networks. volume 14, pages 14–19, 2014.
[12] Md Golam Rabbani et al. On achieving high survivability in virtualized data centers. IEICE Transactions on Communications, 97(1):10–18, 2014.
[13] Qi Zhang et al. Venice: Reliable virtual data center embedding in clouds. In INFOCOM, 2014 Proceedings IEEE, pages 289–297. IEEE, 2014.
[14] Hongfang Yu et al. Cost efficient design of survivable virtual infrastructure to recover from facility node failures. In ICC, pages 1–6. IEEE, 2011.
[15] William Lau et al. Failure-oriented path restoration algorithm for survivable networks. IEEE Transactions on Network and Service Management, 1(1):11–20, 2004.
[16] M Rizwanur Rahman and Raouf Boutaba. SVNE: Survivable virtual network embedding algorithms for network virtualization. IEEE Transactions on Network and Service Management, 10(2):105–118, 2013.
[17] Ramesh Govindan et al. Evolve or die: High-availability design principles drawn from Google's network infrastructure. In ACM SIGCOMM, pages 58–72. ACM, 2016.
[18] Chandler Harris. IT downtime costs $26.5 billion in lost revenue, 2011 (accessed 2016-03-31). http://www.informationweek.com/it-downtime-costs-$265-billion-in-lost-revenue/d/d-id/1097919?
[19] Martin Perlin. Downtime, outages and failures - understanding their true costs (accessed 2016-04-12). http://www.evolven.com/blog/downtime-outages-and-failures-understanding-their-true-costs.html?utm_source=twitterfeed&utm_medium=twitter&utm_campaign=Feed%3A+evolvenblog%2Frss+%28Evolven+Blog%29
[20] Jielong Xu et al. Survivable virtual infrastructure mapping in virtualized data centers. In IEEE Cloud Computing, pages 196–203. IEEE, 2012.
[21] Peter Bodík et al. Surviving failures in bandwidth-constrained datacenters. In ACM SIGCOMM, pages 431–442. ACM, 2012.
[22] Chris Develder et al. Dimensioning backbone networks for multi-site data centers: exploiting anycast routing for resilience. In RNDM, pages 34–40. IEEE, 2015.
[23] OpenStack Neutron Team. Neutron QoS API models and extension, 2016.
[24] Sara Ayoubi, Yiheng Chen, and Chadi Assi. Towards promoting backup-sharing in survivable virtual network design. IEEE/ACM Transactions on Networking, 24(5):3218–3231, 2016.
[25] Sara Ayoubi, Yanhong Zhang, and Chadi Assi. A reliable embedding framework for elastic virtualized services in the cloud. IEEE Transactions on Network and Service Management, 13(3):489–503, 2016.
[26] Jeffrey C Mogul and Lucian Popa. What we talk about when we talk about cloud network performance. ACM SIGCOMM CCR, 42(5):44–48, 2012.
[27] Hongqiang Harry Liu et al. Traffic engineering with forward fault correction. ACM SIGCOMM CCR, 44(4):527–538, 2015.
[28] Phillipa Gill et al. Understanding network failures in data centers: measurement, analysis, and implications. In ACM SIGCOMM CCR, volume 41, pages 350–361. ACM, 2011.
[29] Michel Raynal. Distributed Algorithms for Message-Passing Systems. Springer, 2013.
[30] Muntasir Raihan Rahman et al. Survivable virtual network embedding. In International Conference on Research in Networking, pages 40–52. Springer, 2010.
[31] Daniel J Scales et al. The design of a practical system for fault-tolerant virtual machines. ACM SIGOPS Operating Systems Review, 44(4):30–39, 2010.
[32] Yuyang Du and Hongliang Yu. Paratus: Instantaneous failover via virtual machine replication. In 2009 Eighth International Conference on Grid and Cooperative Computing, pages 307–312. IEEE, 2009.
[33] Jim Gray. Why do computers stop and what can be done about it? In Symposium on Reliability in Distributed Software and Database Systems, pages 3–12. Los Angeles, CA, USA, 1986.
[34] Jun Zhu et al. Optimizing the performance of virtual machine synchronization for fault tolerance. IEEE Transactions on Computers, 60(12):1718–1729, 2011.
[35] Balazs Gerofi et al. Utilizing memory content similarity for improving the performance of replicated virtual machines. In UCC, pages 73–80. IEEE, 2011.
[36] Charles Fleurent and Jacques A. Ferland. Genetic and hybrid algorithms for graph coloring. Annals of Operations Research, 63:437–461, 1996.
