Deliverable D2.2 Report On Definitions of Traffic Management Mechanisms and Initial Evaluation Results

Seventh Framework STREP No. 317846 D2.
2 Definitions of Traffic Management Mechanisms

Public

Version 1.3 Page 1 of 174
Copyright 2013, the Members of the SmartenIT Consortium
Socially-aware Management of
New Overlay Application Traffic with
Energy Efficiency in the Internet
European Seventh Framework Project FP7-2012-ICT- 317846-STEP

Deliverable D2.2
Report on Definitions of Traffic Management
Mechanisms and Initial Evaluation Results

The SmartenIT Consortium

Universitt Zrich, UZH, Switzerland
Athens University of Economics and Business - Research Center, AUEB, Greece
Julius-Maximilians Universitt Wrzburg, UniWue, Germany
Technische Universitt Darmstadt, TUD, Germany
Akademia Gorniczo-Hutnicza im. Stanislawa Staszica w Krakowie, AGH, Poland
Intracom SA Telecom Solutions, ICOM, Greece
Alcatel Lucent Bell Labs, ALBLF, France
Instytut Chemii Bioorganicznej PAN, PSNC, Poland
Interoute S.P.A, IRT, Italy
Telekom Deutschland GmbH, TDG, Germany


For more information on this document or the SmartenIT project, please contact:

Prof. Dr. Burkhard Stiller
Universitt Zrich, CSG@IFI
Binzmhlestrasse 14
CH8050 Zrich
Switzerland

Phone: +41 44 635 4331
Fax: +41 44 635 6809
E-mail: info@smartenit.eu
D2.2 Definitions of Traffic Management Mechanisms Seventh Framework STREP No. 317846
Public

Page 2 of 174 Version 1.3
Document Control

Title: Report on Definitions of Traffic Management Mechanisms and Initial
Evaluation Results
Type: Public
Editor(s): Valentin Burger
E-mail: valentin.burger@informatik.uni-wuerzburg.de
Author(s): Thomas Bocek, Valentin Burger, Paolo Cruschelli, George Darzanos, Manos
Dramitinos, Zbigniew Dulinski, Jakub Gutkowski, Gerhard Halinger, David
Hausheer,Tobias Hofeld, Fabian Kaup, Sylvaine Kerboeuf, Roman Lapacz,
Andri Lareida, Lukasz Lopatowski, Sergios Soursos, Guilherme Sperb
Machado, Ioanna Papafili, Patrick Poullie, Sabine Randriamasy, George D.
Stamoulis, Rafal Stankiewicz, Michael Seufert, Corinna Schmitt, Matthias
Wichtlhuber, Mateusz Wielgosz, Krzysztof Wajda, Piotr Wydrych
Doc ID: D2.2-v1.3
AMENDMENT HISTORY

Version Date Author Description/Comments
V0.1 November 1, 2012 Burkhard Stiller First version
V0.2 May 7, 2013 Valentin Burger, Michael Seufert,
Tobias Hofeld
Draft for TOC
V0.3 May 31, 2013 Valentin Burger, Ioanna Papafili,
Matthias Wichtlhuber
Addressed comments on TOC, included vINCENT
V0.4 June 7, 2013 Valentin Burger Responsibilities, List of TMS
V0.5 June 17, 2013 Valentin, Paolo, Michael, Ioanna,
Patrick, Piotr
Included Traffic Management Solutions with Bullet Points, Tables that
give Overview of TM solutions and TM mechansims
V0.6 July 8, 2013 Corinna, Roman, Ioanna, George D.,
George S., Manos
Included more Traffic Management Solutions
V0.7 July 22, 2013 Ioanna, Roman Included MPLS Solution, Sections for models,
V0.8 August2, 2013 Sabine, Sylvaine, Ioanna, Patrick,
Andri, Matthias, Michael, Piotr, Valentin
Included sections 4.13 and 4.14, Provided Chapter 3 and Solutions in
Chapter 4 in text-form.
V0.9 September 9, 2013 Lukasz, Patrick, Ioanna, Paolo,
Mateusz, Corinna, Thomas, Fabian,
Andri
Completed missing solutions in chapter 4, Chapter 5 added, Game
theoretic model and energy models added
V1.0 October 7, 2013 Sergios, Krzysztof, Rafal, Lukasz,
Valentin, Partick, Thomas, Guillherme,
Andri, Ioanna, Michael, Fabian,
Matthias, David, Piotr, Gerhard, Sabine
Executive Summary added, Document format .docx, Sections in
chapter 4 were revised, Section 5 added, Added Mappings to
SmartenIT architecture, Two additional traffic management solutions
added, Revision of section 3, Introduction added, Conclusion added,
References formatted
V1.1 October 21, 2013 Burkhard, Sprios, George, Chris,
Valentin
Document reviewed and commented, Reviews merged
V.1.2 October 29, 2013 Valentin, all contributors Addressed reviewer comments, revised contributions, merged
revisions, final formatting
V.1.3 October 31, 2013 George, Sergios, Chris, Fabian,
Valentin
Addressed comments, revision of summary, finalization

Legal Notices
The information in this document is subject to change without notice.
The Members of the SmartenIT Consortium make no warranty of any kind with regard to this document,
including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The
Members of the SmartenIT Consortium shall not be held liable for errors contained herein or direct, indirect,
special, incidental or consequential damages in connection with the furnishing, performance, or use of this
material.
Seventh Framework STREP No. 317846 D2.2 Definitions of Traffic Management Mechanisms
Public

Table of Contents
1 Executive Summary ............................................................................................................ 12
2 Introduction ....................................................................................................................... 14
2.1 Purpose of this Document ..................................................................................................... 15
2.2 Document Outline ................................................................................................................. 16
3 Definition of Relevant Applications and Related Models ..................................................... 1
3.1 Selection of Relevant Applications for SmartenIT ................................................................. 1
3.2 !o"els for A""resse" Applications ...................................................................................... 1#
3.2.1 Simulation Models ......................................................................................................... 18
3.2.2 Theoretical Models ........................................................................................................ 29
4 SmartenI! !raffic Mana"ement Solutions ........................................................................... 42
$.1 %ome Router Sharin& 'ase" on Trust ................................................................................... $2
4.1.1 Addressed Scenarios ...................................................................................................... 43
4.1.2 Definition of SmartenIT Traffic Management Mechanisms .......................................... 44
4.1.3 Identification of Ke Influence !actors .......................................................................... 4"
4.1.4 Ke #erformance Metrics ............................................................................................... 4"
4.1.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... 4"
4.1." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... 4+
4.1.+ %,am)le Instantiation of Mechanism ............................................................................ 4+
$.2 Sociall()a*are T! for +fficient ,ontent Deliver( ................................................................. $#
4.2.2 Definition of SmartenIT Traffic Management Mechanism ............................................ $-
4.2.3 Identification of Ke Influence !actors .......................................................................... $1
4.2.4 Ke #erformance Metrics ............................................................................................... $2
4.2.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... $2
4.2." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... $$
4.2.+ %,am)le Instantiation of Mechanism ............................................................................ $$
$.3 !echanism for Inter),lou" ,ommunication ........................................................................ 55
4.3.1 Addressed Scenarios ...................................................................................................... $"
4.3.2 Definition of SmartenIT Traffic Management Mechanisms .......................................... $+
4.3.3 Identification of Ke Influence !actors .......................................................................... "-
4.3.4 Ke #erformance Metrics ............................................................................................... "1
4.3.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... "1
4.3." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... "2
4.3.+ %,am)le Instantiation of Mechanism ............................................................................ "2
$.$ D(namic Traffic !ana&ement .............................................................................................. 63
4.4.1 Addressed Scenarios ...................................................................................................... "4
4.4.2 Definition of SmartenIT Traffic Management Mechanisms .......................................... ""
Public

4.4.3 Identification of Ke Influence !actors .......................................................................... "8
4.4.4 Ke #erformance Metrics ............................................................................................... "8
4.4.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... "8
4.4." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... "9
4.4.+ %,am)le Instantiation of Mechanism ............................................................................ "9
$.5 R-)Trac.er/ 0ser Traffic !ana&ement ................................................................................. 1
4.$.1 Addressed Scenarios ...................................................................................................... +-
4.$.2 Definition of SmartenIT Traffic Management Mechanisms .......................................... +1
4.$.3 Identification of Ke Influence !actors .......................................................................... +1
4.$.4 Ke #erformance Metrics ............................................................................................... +1
4.$.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... +2
4.$." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... +2
4.$.+ %,am)le Instantiation of Mechanism ............................................................................ +3
$.6 Selection !echanism for Stora&e Provi"ers ......................................................................... $
4.".1 Addressed Scenarios ...................................................................................................... +4
4.".2 Definition of SmartenIT Traffic Management Mechanisms .......................................... +4
4.".3 Identification of Ke Influence !actors .......................................................................... ++
4.".4 Ke #erformance Metrics ............................................................................................... +8
4.".$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... +8
4."." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... +8
4.".+ %,am)le Instantiation of Mechanism ............................................................................ +8
$. Static Resource Allocation in the IaaS 2e"eration ................................................................ 3
4.+.1 Addressed Scenarios ...................................................................................................... 8-
4.+.2 Definition of SmartenIT Traffic Management Mechanisms .......................................... 8-
4.+.3 Identification of Ke Influence !actors .......................................................................... 8-
4.+.4 Ke #erformance Metrics ............................................................................................... 81
4.+.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... 81
4.+." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... 81
4.+.+ %,am)le Instantiation of Mechanism ............................................................................ 82
$.# Optimi4e" 0p&ra"e an" Plannin& Processes in 5oa" -alancin& 6et*or.s .......................... #3
4.8.1 Addressed Scenarios ...................................................................................................... 8$
4.8.2 Definition of SmartenIT Traffic Management Mechanisms .......................................... 8"
4.8.3 Identification of Ke Influence !actors .......................................................................... 8"
4.8.4 Ke #erformance Metrics ............................................................................................... 8+
4.8.$ Initial %&aluation for Ste).ise /)grades in !ull Mesh 0ore 1et.or2s ......................... 8+
4.8." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... 89
4.8.+ %,am)le Instantiation of Mechanism ............................................................................ 9-
$.3 vI6,+6T ................................................................................................................................ 31
4.9.2 Definition of SmartenIT Traffic Management Mechanisms .......................................... 91
4.9.3 Identification of Ke Influence !actors .......................................................................... 92
4.9.4 Ke #erformance Metrics ............................................................................................... 93
4.9.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................... 93
Public

4.9." Ma))ing of Mechanism to SmartenIT Architecture ...................................................... 9$
4.9.+ %,am)le Instantiation of Mechanism ............................................................................ 9$
$.11 A5TO)"riven Application 7ualit( A&&re&ation S(stem ...................................................... 35
4.1-.1 Addressed Scenarios ................................................................................................... 9"
4.1-.2 Definition of SmartenIT Traffic Management Mechanisms ....................................... 9"
4.1-.3 Identification of Ke Influence !actors ....................................................................... 98
4.1-.4 Ke #erformance Metrics ........................................................................................... 99
4.1-.$ Initial %&aluation 'esults and ()timi*ation #otential ................................................ 99
4.1-." Ma))ing of A3AS to SmartenIT Architecture ............................................................ 99
4.1-.+ %,am)le Instantiation of A3AS ................................................................................... 99
$.11 !ulti),riteria Application +n")Point Selection ................................................................ 111
4.11.1 Addressed Scenarios ................................................................................................. 1-1
4.11.2 Definition of SmartenIT Traffic Management Mechanisms ..................................... 1-1
4.11.3 Identification of Ke Influence !actors ..................................................................... 1-4
4.11.4 Ke #erformance Metrics ......................................................................................... 1-4
4.11.$ Initial %&aluation 'esults and ()timi*ation #otential .............................................. 1-"
4.11." Ma))ing of Mechanism to SmartenIT Architecture ................................................. 1-$
4.11.+ %,am)le Instantiation of Mechanism ....................................................................... 1-$
$.12 7o+ an" +ner&( A*are !o'ile Traffic !ana&ement ..................................................... 116
4.12.1 Addressed Scenarios ................................................................................................. 1-"
4.12.2 Definition of SmartenIT Traffic Management Mechanisms ..................................... 1-+
4.12.3 Identification of Ke Influence !actors ..................................................................... 1-9
4.12.4 Ke #erformance Metrics ......................................................................................... 11-
4.12.$ Initial %&aluation 'esults and ()timi*ation #otential .............................................. 11-
4.12." Ma))ing of Mechanism to SmartenIT Architecture ................................................. 11-
4.12.+ %,am)le Instantiation of Mechanism ....................................................................... 111
# $onfi"uration and $ommunication %rame&or's ............................................................... 112
5.1 Inter)A5TO ,ommunication 2rame*or. ............................................................................. 112
$.1.1 S)ecification of the !rame.or2 ................................................................................... 11$
$.1.2 A))lication to Traffic Management Solutions ............................................................. 11"
$.1.3 %&aluation %n&ironment and Initial 'esults ................................................................. 11+
5.2 Open2lo*)'ase" 6et*or. ,onfi&uration 2rame*or. ........................................................ 113
$.2.1 S)ecification of the !rame.or2 ................................................................................... 119
$.2.2 A))lication to Traffic Management Solutions ............................................................. 121
$.2.3 %&aluation %n&ironment and Initial 'esults ................................................................. 122
5.3 !P5S)'ase" 6et*or. ,onfi&uration 2rame*or. ............................................................... 123
$.3.1 S)ecification of the !rame.or2 ................................................................................... 123
$.3.2 A))lication to Traffic Management Solutions ............................................................. 124
$.3.3 %&aluation %n&ironment and Initial 'esults ................................................................. 124
( Syner"ies )et&een Mec*anisms ....................................................................................... 12(
Public

6.1 A"herence to the Scenarios ................................................................................................ 126
6.2 Properties of !echanisms .................................................................................................. 131
6.3 O'servation an" Decision !etrics ...................................................................................... 136
6.$ Discussion of S(ner&ies 'et*een !echanisms ................................................................... 1$1
Summary and $onclusions ............................................................................................... 142
.1 8e( Outcomes an" 5essons 5earnt ...................................................................................... 1$2
.2 6e9t Steps ........................................................................................................................... 1$$
+ Smart ,)-ectives .............................................................................................................. 14#
. References ....................................................................................................................... 14
1/ A))reviations ............................................................................................................... 1#4
11 Ac'no&led"ements ...................................................................................................... 1#(
12 Appendices ................................................................................................................... 1#
12.1 Open2lo* Test ,onfi&uration Details .............................................................................. 15
12.2 !P5S Test ,onfi&uration Details ..................................................................................... 161
12.3 !appin& of !echanism to SmartenIT Architecture ........................................................ 16#

Public

List of Figures
Figure 1: Interaction of users on Facebook during a day [90]. ............................................. 19
Figure 2: Buffered playtime while streaming a YouTube video. ........................................... 23
Figure 3: Finite state machine for a video streaming source traffic model. .......................... 24
Figure 4: Cumulative distribution of measured YouTube video bit-rates and video sizes. ... 24
Figure 5: Cumulative distribution of measured block sizes. ................................................. 25
Figure 6: Cumulative distribution of pre-buffered playtime. .................................................. 25
Figure 7: Video buffer while streaming a video with limited network data rate [58]. ............. 26
Figure 8: Most important use cases for Dropbox [1]. ........................................................... 28
Figure 9: MOS as function of waiting times for four difference task scenarios: initialization,
storage, retrieval, and multidevice sync. Figures are taken from [1]. ........................ 30
Figure 10: Mapping functions of stalling parameters to MOS. Video duration is fixed at 30
s. No initial delay is introduced. Parameters are given in Table 7 ............................. 32
Figure 11: Simple QoE model maps a number N of stalling events of average length L to a
MOS value, I, N = S.Su c u.1SI +u.19N +1.Su. [58] ........................................... 33
Figure 12: N = 4000 traffic traces and the estimated 95-th percentile. ................................. 35
Figure 13: Difference of the 95-th percentile minus actual traffic traces. ............................. 35
Figure 14: Throughput under a specific transit charge, with and without the scheduling
mechanism operating ideally and with perfect information. ....................................... 36
Figure 15: Basic HORST functionality.................................................................................. 43
Figure 16: Potential of caching on the end-user device for response-time and energy
consumption .............................................................................................................. 47
Figure 17: Video hosted on Facebook video server. ............................................................ 49
Figure 18: Video hosted on YouTube video server. ............................................................. 49
Figure 19: Prefetching accuracy vs. number of watched videos. ......................................... 53
Figure 20: Prefetching accuracy vs. number of pre-fetched videos. .................................... 53
Figure 21: Inter-AS traffic generated due to prefetching. ..................................................... 53
Figure 22: Total inter-AS traffic generated. .......................................................................... 54
Figure 23: Total inter-AS traffic during one simulation day. ................................................. 54
Figure 24: Cloud federation. ................................................................................................ 57
Figure 25: Instantiation of the ICC mechanism in the case of a cloud federation. ............... 63
Figure 26: Sample network model for the use-cases description ......................................... 65
Figure 27: Cost functions used for accounting cost of inter-domain traffic ........................... 66
Figure 28: A cost map as a function of traffic volume on both inter-domain links and cost
optimization potential ................................................................................................ 67
Figure 29: Illustration of a traffic compensation mechanism ................................................ 68
Public

Figure 30: Comparison of traffic growth on links during the accounting period with (green
curve) and without (red) DTM ................................................................................... 69
Figure 31: Traffic pattern in link 2 with and without DTM mechanism .................................. 70
Figure 32: Initial experiments show no traffic peak reduction in a random preference
scenario.[11] ............................................................................................................. 72
Figure 33: Deployment example of RB-Tracker. .................................................................. 73
Figure 34: Optimized load balancing often includes NP-complete problems, e.g., Bin-
Packing [47]. ............................................................................................................. 86
Figure 35: Options for algorithms and graphical views provided by the TE-Scout tool ........ 87
Figure 36: Cost and energy optimization by stepwise upgrades in an 8-node full mesh ...... 88
Figure 37: Timing and Cost Decrease for Stepwise Upgrades ............................................ 89
Figure 38: vINCENT Infrastructure ................................................................................... 91
Figure 39: Virtual Node concept of vINCENT ...................................................................... 92
Figure 40: Measurement of existing P2P streaming system. ............................................... 94
Figure 41: Energy Efficiency of end-devices. ....................................................................... 94
Figure 42: Application quality aggregation system for an ALTO guided population of ALTO
Endpoint QoE Cost values. ....................................................................................... 97
Figure 43: example deployment of MUCAPS: Multi-Cost ALTO Client block integrated in
an ISP DNS resolver and coupled with (i) an automated Application Metric Mapping
function (ii) an automated metric weight tuning function. ........................................ 103
Figure 44: Prototype and example scenario for MUCAPS-based AEP selection. .............. 104
Figure 45: video streaming application with 3 candidate AEPs and 2 metrics. .................. 105
Figure 46 Architecture of the Network Optimizer from [67] ................................................ 108
Figure 47: Inter-ALTO communication framework architecture. ......................................... 115
Figure 48: Example instantiation of the inter-ALTO framework topology. ....................... 116
Figure 49: Example instantiation of the inter-ALTO framework communication schemes.117
Figure 50: Example instantiation of the inter-ALTO framework resulting data flows. ...... 117
Figure 51: Topology used during simulations. N means that there are N links of a
indicated category between given ASes. ................................................................ 118
Figure 52: Average traffic on the link between AS5 and AS6. ........................................... 119
Figure 53: An OpenFlow communication between OpenFlow-enabled switch and
Controller (ONF, OpenFlow Switch Specification 1.0.0, Dec. 31, 2009). ................ 120
Figure 54: OpenFlow evolution [25] ................................................................................... 120
Figure 55: OpenFlow switch 1.3.0 (ONF, OpenFlow Switch Specification 1.3.0, June 25,
2012). ...................................................................................................................... 121
Figure 56: OpenFlow test domain topology ....................................................................... 122
Figure 57: Multi-domain network for MPLS tests ............................................................... 125
Public

Figure 58: Topological view of architecture with added scenario overlay (based on
component map taken from D3.1). .......................................................................... 127
Figure 59: Detailed MPLS test topology............................................................................. 160
Figure 60: Mapping of HORST to SmartenIT architecture. ................................................ 168
Figure 61: Mapping of SECD to SmartenIT architecture. ................................................... 169
Figure 62: Mapping of ICC to SmartenIT architecture. ...................................................... 169
Figure 63: Mapping of DTM to SmartenIT architecture. ..................................................... 170
Figure 64: Mapping of RB-Tracker to SmartenIT architecture. .......................................... 170
Figure 65: Mapping of SMSP to SmartenIT architecture.................................................... 171
Figure 66: Mapping of MRA to SmartenIT architecture. ..................................................... 171
Figure67: Mapping of OptiPlan to SmartenIT architecture. ................................................ 172
Figure 68: Mapping of vINCENT to SmartenIT architecture. .............................................. 172
Figure 69: Mapping of AQAS to SmartenIT architecture. ................................................... 173
Figure 70: Mapping of MUCAPS to SmartenIT architecture. ............................................. 173
Figure 71: Mapping of QoEnA to the SmartenIT architecture ............................................ 174

Public

List of Tables
Table 1: Results of the service relevance survey. Sum over all criteria. Green: very
relevant, red: not relevant at all. .............................................................................. 18
Table 2: Extended distribution of video categories in YouTube to include 19 categories.
Assignment of popularity values to the 2 new categories (i.e. 18 and 19) and
normalization of the 10 most popular categories so that popularities of all 19
categories sum up to 100%. .................................................................................... 20
Table 3: Bandwidth statistics [23]. ..................................................................................... 22
Table 4: State transition matrix for the state transition function :S x S ..................... 23
Table 5: Characteristics of Dropbox accounts from the 49 volunteers [1] .......................... 27
Table 6: Characteristics of User Profiles (B=Beginners [22% of users], S=Synchronization
Users [30%], P=Power Users [48%]) [1] ................................................................. 28
Table 7: Parameters of mapping functions (seeFigure 10) of stalling parameters to MOS
together with coefficient of determination R2 as goodness-of-fit measure. [59] ...... 32
Table 8: Users and its Cloud services preference ranking ................................................ 76
Table 9: The steps to combine Cloud services preference ranking of each users ............ 77
Table 10: Probabilistic Values for the Preference Ranking. ............................................... 77
Table 11: Local application scenario before shaping ......................................................... 82
Table 12: Local application scenario after shaping ............................................................ 83
Table 13: Savings in stepwise link upgrade cycles ............................................................ 89
Table 14: Adherence of mechanisms to scenarios. ......................................................... 128
Table 15: Overview of proposed TMS w.r.t. scenarios. Absolute values. 3 (dark green):
TMS mainly addresses scenario, 0 (white): TMS does not address scenario. ...... 129
Table 16: Summary of mechanisms properties ............................................................... 133
Table 17: Overview of mechanisms decision-taking process and envisioned innovation 138
Table 18: Overall SmartenIT SMART objective addressed. (Source: [110]) .................... 146
Table 19: Theoretical SmartenIT SMART objectives addressed. (Source: [110]) ............ 146
Public

(This page is left blank intentionally.)
Public

1 Executive Summary
This document is the deliverable D2.2 Report on Definitions of Traffic Management
Mechanisms and Initial Evaluation Results of Work Package 2 Theory and Modelling
within the ICT SmartenIT Project 317846. The main objectives of Deliverable D2.2 are as
follows:
Objective 1: Propose and justify traffic management mechanisms, which overcome
current limitations of existing services in communication networks. In particular different
viewpoints of the involved stakeholders are not taken into account. In SmartenIT the view
points of the stakeholders, specifically Internet Service Providers (ISPs), Cloud Providers,
and End-Users, are addressed, to avoid unnecessary costs, e.g., by saving energy or
expensive inter domain traffic while providing good service quality in terms of Quality-of-
Experience to end users. More details on the addressed stakeholders and scenarios can
be found in D1.1 and D1.2 respectively. Further on, current traffic management solutions
do not take into account social awareness which can be utilized to predict demand of
content and analyze user interaction.
In order to collect propositions for a subsequent SmartenIT traffic management solution, a
set of potential mechanisms with innovative features have to be defined, together with
initial evaluation results and assessment of its positioning against the SmartenIT scenarios
and architecture, thus providing evidence whether the traffic management mechanism
should be further pursued and assessed.
Objective 2: Provide models for the evaluation of the proposed mechanisms. For
subsequent performance evaluation of the proposed mechanisms, theoretical models and
simulation models are necessary. The models need to take into account the different
stakeholders and their goals which are involved in service delivery chain. In particular
models need to be developed for mechanisms that specifically address applications
selected by SmartenIT.
Addressing objective 1, a broad set of traffic management solutions were proposed, on the
basis of both the scope of the project and of the overview of existing overlay traffic
management solutions given in D2.1 [109]. For each solution proposed, the scenarios
defined in D1.2 [108] were addressed, namely inter-cloud communication, global service
mobility, social awareness and energy efficiency.
Since considerable overlaps in the four scenarios defined initially were identified, it was
decided to merge these scenarios in the recent progress of the project. This established a)
the operator focused scenario, which covers the perspective of ISPs and cloud operators
and solutions with decision metrics at data centers or the backbone-network and b) the
end user focused scenario, which covers traffic management solutions that address the
perspective of the end user and deploy decision metrics at end devices or access-
networks, as documented in D1.2, which evolved in parallel with the present deliverable.
Therefore, the traffic management solutions introduced and studied in this deliverable
were also mapped to the operator focused and the end user focused scenarios. The
mapping shows that the traffic management solutions mainly addressing inter-cloud
communication constitute the operator focused scenario, whereas solutions addressing
mainly global service mobility and social awareness are covered by the end user focused
scenario.
For subsequent performance evaluation of traffic management solutions, the factors that
have a key influence on the associated mechanism were identified, as well as key
Public

performance metrics. An initial evaluation for each proposed traffic management
mechanism was provided, as far as possible in this stage of the project. The evaluations
were based on initial estimations and argumentation, and/or were taken from literature and
address whether the relevant solution is worth detailed examination.
Addressing objective 2, relevant applications for a SmartenIT solution were identified in the
present deliverable. For this purpose, a service relevance survey and an application
relevance survey of services and applications considered for a SmartenIT traffic
management solution were conducted. The results of the surveys show that video-on-
demand and file-storage are the most relevant services for SmartenIT, as well as that
YouTube and Dropbox are the most relevant applications for the respective services. For
evaluation purposes, we use existing theoretical and simulation models from the literature,
but also develop new models, to cope with proposed modifications in the protocols and
algorithms. Since applications like Dropbox came up just recently, appropriate models for
such applications barely exist and may have to be developed.
Furthermore models for the Quality-of-Experience perceived by end-users for each
specific application were defined, in order to assess the performance of the mechanisms
from the end-user perspective. To get one step further towards a complete evaluation
framework, for each simulation and theoretic model, it was shown how it is deployed in a
broader environment to evaluate traffic management solutions. This was done by
identifying models that should complement them in a complete evaluation framework for
the proposed traffic management solutions and addressed scenarios, which will be
developed in task T2.4.
The synergies of the proposed mechanisms were studied extensively to find possible
overlaps and complementarities. The mechanisms were grouped into 5 different
categories reflecting the main characteristics of the mechanisms, namely Content
Placement, Delivery Scheduling, Ranking, Communication Protocols and
Configuration Frameworks. The categories show traffic management mechanisms that
share the same characteristics. The main outcome of the deliverable is that certain
proposed solutions share the same goal. The identified synergies allow grouping traffic
management solutions that can be combined to fit a particular use-case. Thus, a basis for
deliverable D2.3 is provided where the use-cases and their parameters will be defined.
Finally, this deliverable also provides a basis for the decision on which traffic management
solutions will be further evaluated by analysis and simulation in WP2. The analysis of the
mechanisms as well as that of their synergies will also serve as the basis for limiting the
set of mechanisms whose implementations will be integrated in the system architecture
developed in WP3.

Public

2 Introduction
SmartenIT targets an incentive-compatible cross-layer network management scheme for
network and cloud operators, cloud service providers, and end-users, as denoted in [110].
Specifically, SmartenIT aims to address accordingly load and traffic patterns or special
application requirements, to employ Quality-of-Experience (QoE)-awareness. Additionally,
one of the key targets of SmartenIT is the exploitation of social awareness (in terms of
user social relationships and interests as an extra channel of information to characterize
the end-users of the cloud services, and thus, predict demand. As a result, efficient
content placement and pre-fetching can be supported, as well as migration of workload
and Virtual Machines (VMs), etc.
Moreover, one of the key objectives of SmartenIT is energy efficiency both in the Provider-
and the End-User-side. Therefore, SmartenIT aims to design Traffic Management (TM)
mechanisms that will achieve energy efficiency, i.e. keep energy consumption low for data
centers, networks or in end-users mobile devices. Thus, the energy efficiency with respect
to both end-user devices and underlying networking and application provisioning
infrastructure is tackled to ensure an operationally efficient management. Nonetheless,
incentive-compatibility of network management mechanisms for improving metrics in all
layers and among all players will serve as the major mechanism to deal with real-life
scenarios. Furthermore, major overlay applications, whose traffic is to be tackled by
SmartenIT, as selected by WP1, include video streaming and online storage applications;
major representatives of which are YouTube and Dropbox, respectively.
Regarding the design of appropriate TM mechanisms that inevitably deal with TM of inter-
domain flows at the Internet scale, a major challenge has been how to assure that the
data/content transfers will indeed elicit the desired Quality-of-Service (QoS) properties;
thus, attaining the relevant QoE goals for the end user. To this end, three alternatives
appear:
a. Pure IP and the current EGP (e.g., BGP) / IGP (e.g., OSPF) set-up.
b. IP networks management/control plane enhancements allowing inter-domain QoS-
related prioritization mechanisms or novel TE products, so as to have some control
on the per-AS statistical performance of the routes selected.
c. Sub-IP layer routing and/or internal and external routing protocols configuration.
Regarding the first option, this is the most general approach, ensuring the applicability of
the proposed mechanisms at Internet scale; this is, thus, a worst case scenario but also
the most generic approach. The network topology and routing are taken as given and
routing decisions cannot be affected in inter-domain scale. In particular, the respective
mechanisms rely on overlay decisions regarding the placement of caches, load balancing
techniques relying on DNS or overlay information, deciding on the scheduling of data
transfers, and their respective sending rates along with shaping.
Regarding the second option, on the IP layer, the softest approach is the Differentiated
Services (DiffServ) mechanism, if applied in different interconnected domains. However,
the involved network operators have to agree on similar DiffServ classes, accept marking
at the Points of Interest (PoIs) and respect commonly defined QoS policies and
classification schemes. Technically, this is supported with the use of DiffServ brokers.
However, network operators dont seem to have enough incentives to deploy such
brokers, and therefore, we can conclude that DiffServ is currently not supported in the
inter-domain context.
Public

The third and final option is complementary to the first one. Mechanisms defined to work
on top of the Internet could be further enhanced with TE allowing for better routes for
content delivery that are engineered: i) in sub-IP layer, i.e. properly configured and
provisioned MPLS tunnels, or ii) to alter IGP/EGP configurations using Intelligent Route
Control techniques [40]. Such solutions are applicable only for multi-homed networks and
result in significant gains only for large networks with multiple neighboring networks.
The SmartenIT proposed mechanisms in this deliverable have been mostly designed to
function over pure IP. This means that no additional functionality is required so that the
proposed TM solutions can work and improve any considered service. This decision is due
to the fact that the large-scale IP network is the dominant paradigm today, i.e. networks
across multiple administrative domains simply exchange BGP information and data, which
do not implement inter-domain QoS.
Finally, another interesting issue that affects the design of the SmartenIT mechanisms
presented in this deliverable is the awareness regarding the type of traffic for which the
mechanisms are applicable. A major constraint here is that inter-domain traffic over
peering and transit links is by definition a service-agnostic aggregation of both elastic and
inelastic traffic. Although DPI is deployed in certain cases, its existence cannot be taken
for granted. In general, it is inherently too costly for networks to examine the composition
of those traffic aggregates and try to treat differently the various constituent traffic streams.
Thus, this is a major constraint that also complicates the mechanism design as well as the
actual potential of intervention of the SmartenIT mechanisms on the inter-domain network
layer. Concluding, the deployment of smart overlays or cross-layer mechanisms that
combine information from both the network and the cloud layer at the network edges are
promising in this context.
2.1 Purpose of this Document
The main goals of this deliverable are as follows:
Proposal and specification of incentive-based TM mechanisms and their intelligence
for the efficient handling of traffic generated by overlay applications in an energy
efficient manner.
Definition of related scenarios and use-cases for each proposed TM mechanism
and identification of key influence factors and evaluation metrics.
Development of theoretical and simulation models for the evaluation of the specified
TM mechanisms, as well as models employing game-theoretic aspects to
investigate behaviors of different stakeholders when adopting such TM
mechanisms,
Description of preliminary results of the evaluation of the various TM mechanisms
(where applicable).
Deliverable D2.2 is the 2
nd
deliverable of WP2 and sets the basis for the development of
intelligence, mechanisms and models that will constitute the heart of the SmartenIT
solutions. The work presented in Deliverable D2.2 will be further evolved in the next
phases of the project, and it will be finalized and concluded in Deliverable D2.4 Report on
Definitions of Traffic Management Mechanisms and Initial Evaluation Results (Final
Version) which will be delivered at the end of Year 2 of the project.
Public

2.2 Document Outline
This document is organized as follows:
Chapter 3 initially summarizes work performed in WP1 on the selection of overlay
applications whose traffic is to be tackled by SmartenIT. Then, appropriate models
developed within SmartenIT for the assessment of QoE/QoS as well as other metrics
related to the selected application categories are provided.
Chapter 4 provides the main contribution of this document, which is the specification of the
various TM mechanisms proposed by SmartenIT, the main scenarios that they address,
the key influence factors, i.e. parameters that have significant impact on their performance,
key performance metrics that should be monitored and are aimed to be improved by each
mechanisms, and finally, some preliminary evaluation results, where available already.
Chapter 5 provides communication and configuration frameworks that might be useful for a
SmartenIT solution. For each framework, its specification, its potential applicability to TMS
and initial theoretical or functional evaluation results are provided.
Chapter 6 addresses the potential synergies among the specified mechanisms and aims to
qualitatively assess the impact of their operation when combined, so as to address more
complex use cases, or use cases that are not sufficiently addressed by each of
mechanisms alone.
Chapter 7 summarizes the deliverable and draws the major conclusions on the specified
TM mechanisms and next steps of the investigations of SmartenIT WP2.
Chapter 8 reports which SMART objectives, as described in SmartenITs Description of
Work (DoW) [110], have been addressed by the work performed in WP2 and reported in
D2.2.

Public

3 Definition of Relevant Applications and Related
Models
In this chapter we define relevant applications considered for a SmartenIT solution.
Relevant applications were identified in a two stage survey among partners. For more
details on the application selection survey, please refer to deliverable D1.2 [108].
For the selected applications we define simulation models and theoretical models for the
evaluation of the traffic management solution proposed in Chapter 4. To be able to
evaluate the solutions with respect to the scope of the project we further define models for
energy efficiency, Quality-of-Experience and game theory.
3.1 Selection of Relevant Applications for SmartenIT
The number of cloud service applications provided in the Internet is constantly increasing.
To be able to focus on a manageable subset of applications to work on in SmartenIT, a
cloud application survey has been conducted among the project partners. The purpose of
the survey was to identify the applications most relevant for SmartenIT, based on carefully
selected criteria.
The cloud application survey was performed among the partners in two steps. The first
step was a service relevance survey. Its goal was to identify the most relevant services for
SmartenIT. In the second step most relevant applications were selected out of the most
relevant service categories.
For each service the relevance of 13 criteria was rated with a value from 1 to 5, for not
relevant at all to very relevant for SmartenIT. Table 1 shows the service with highest
sum over all criteria. For end-users most relevant services in the survey are File Storage
and File Sharing and Video on Demand. For the service enabling technologies Data
Centers are rated most relevant for SmartenIT.
Based on the most relevant services File Storage and Video on Demand a subsequent
survey on applications relevant for SmartenIT was conducted. The criteria were limited to
8 to cover only the criteria which differ for the considered applications.
The Video / Music on Demand application most relevant for SmartenIT is YouTube since
it has the highest mean scores, is very popular and produces a high traffic volume. The
problem with applications like YouTube is the limited intervention potential. The clients are
based on html5 / flash player and are proprietary.
The File Storage application most relevant for SmartenIT is Dropbox since it has the
highest mean scores, is very popular and produces a high traffic volume. The intervention
potential of Dropbox is not expected to be high, but has still to be investigated. Zettabox is
an application similar to Dropbox which was developed by a project partner and therefore
has a high intervention potential. OwnCloud is an open source Dropbox clone and also
gives the opportunity to modify a Dropbox like file storage application.
Public

Table 1: Results of the service relevance survey. Sum over all criteria.
Green: very relevant, red: not relevant at all.

It was agreed within the project to have YouTube and Dropbox as primary applications that
represent respectively the services Video on Demand and File Storage. If a
modification of the client/server functionality is needed for a solution the corresponding
Open Source implementations Zeta-Box, PiCsMu, Owncloud for File Storage and VLC
for Video on Demand will be used.
3.2 Models for Addressed Applications
In order to evaluate the performance of existing applications and newly proposed traffic
management solutions, models of the addressed applications and their key metrics are
needed. Thereby, the foundations are laid for analytical and simulative evaluations.
However, all models have to cope with a trade-off between accuracy and complexity. The
more accurately a model represents the application behavior, the more complex it is to get
results, and vice versa.
By modifying the application models, insights into possible optimization approaches are
provided. Moreover, these modified models can be used to predict the expected gain with
respect to a certain metric. In this section initial models for the simulation of the addressed
applications are presented, and theoretical models which are relevant for optimization are
described.
3.2.1 Simulation Models
In this subsection, we present models developed to serve the need of simulating and
evaluating some of the traffic management mechanisms proposed in Chapter 4.
3.2.1.1 Model for Video Dissemination among the Users of an OSN
The model for video dissemination among the users of an OSN was developed in order to
evaluate a Socially-aware mechanism for Efficient Content Delivery (SECD) (presented in
Section 4.2), which employs a P2P overlay and a Social Proxy Server (SPS) to assist
video delivery among the users of an OSN, and compare it with an existing approach in
literature, i.e. SocialTube [76]. Therefore, we designed and implemented a complete
evaluation framework to simulate an OSN, whose users are consumers of a video service
offered either by the servers of the OSN, or by the server of a third-party-owned
Public

CDN/video platform; the designed evaluation framework can be used to evaluate other
similar approaches. To do so, we did not model accurately the evolution of the P2P
overlay, rather we focused more on the definition of demand and supply models, i.e. video
viewing and video uploading/sharing respectively, video distributions per interest category,
users distribution per AS, etc. Below, we provide the constituent elements of the
evaluation model.
3.2.1.1.1 Time considerations
In our evaluation, time is slotted in slots of 20 minutes, in order to be consistent with the
fact that a user is active in Facebook about 20 minutes per day on the average [31]. Thus,
a user will be active only in one of these 20-minute slots, and each of his activities will
occur within this interval. Regarding the users activity, we assume that only 50% of the
OSN users are active daily. We chose randomly the users that will be active each given
day, but users with more friends have an extra possibility to be active. Specifically, a
weight is assigned to each user denoting the users probability to be active within a day;
this weight is calculated according to the formula:
uscr_wcigbt = 1 + uscr_ricnJs 1uuu,
If a user is active in Facebook on a given day, he is considered to be active only for 20
minutes in a selected 20-minutes slot. For the selection of the slot, we perform weighted
random choice based on the information extracted by Figure 1.

Figure 1: Interaction of users on Facebook during a day [91].
Furthermore, we assume that each user is active in the Internet for 140 minutes [32] (i.e.
seven 20-minutes slots). We assume that these seven timeslots are contiguous, and thus
when a user log in Facebook does it in the middle of any of these timeslots. Regardless a
users activity in Facebook within a specific day, that user can seed content which he has
stored while active in Facebook in previous timeslots.
3.2.1.1.2 Users characteristics
We have assigned 4 video interest categories to each user, while each user is considered
to share and watch videos only out of these 4 categories. To decide in which 4 categories
a user is interested in, we used a weighted random choice and we chose 4 categories out
of 19 total interest categories. Based on the popularity of the video categories of YouTube
as reported in [24], we extended the list of categories by two more and assigned
popularities to each video following the Power Law distribution [21] as it appears in Table
2.
Public

Table 2: Extended distribution of video categories in YouTube to include 19 categories.
Assignment of popularity values to the 2 new categories (i.e. 18 and 19) and normalization
of the 10 most popular categories so that popularities of all 19 categories sum up to 100%.
Category Percentage of videos
Entertainment
25.3 % (-0,1)
Music
24.7 % (-0,1)
Comedy
8.6% (-0,1)
People & Blogs
8.6 % (-0,1)
Films & Animation
8.5 % (-0,1)
Sports
7.5 % (-0,1)
News & Politics
3.5 % (-0,1)
Autos & Vehicles
3.3 % (-0,1)
How-to & Style
2.3 % (-0,1)
Pets & Animals
1.6 % (-0,1)
Travel & Events 1.6 %
Education 1.1 %
Science & Technology 1.0 %
Unavailable 0.8 %
Nonprofits & Activism 0.3 %
Gaming 0.2 %
Removed 0.2 %
Added Category 18 0.5%
Added Category 19 0,5%

Additionally, we assume that each user is located in one specific AS by assigning him an
AS id. We have assigned to each AS a rank that denotes the popularity of the AS,
assuming that ASes with higher popularity have more users. Then, in order to distribute
the OSN users among the ASes, we used the Zipf distribution.
3.2.1.1.3 Categorization of viewers
We categorize the viewers of an uploader in three categories, as follows:
Followers: are considered to be the 1-hop or 2-hops friends, who watch over 80%
of the videos uploaded by the uploader.
Non-followers: are assumed to be the 1-hop or 2-hops friends who watch less than
80% but more than 30% of the videos uploaded by the uploader.
Other viewers: are assumed to be the 1-hop or 2-hops friends who watch less than
30% but more than 20% of the videos uploaded by the uploader, since every viewer
of an uploader is assumed to watch at least 20% of videos uploaded by the latter.
Based on the aforementioned categorization of users and the observations in [31] for the
number of users watching specific percentages of uploaders videos, we assume for each
viewers category in 1 and 2 social hops that it holds: 90% of viewers are at most within two
social hops, while the remaining 10% are in three or more hops; while viewers having at
least one common interest with the users that they follow is a prerequisite. Thus, according
Public

to these percentages, the assignment of users in 1 and 2 hops (i.e. 90% of viewers) is a
random choice from users with at least one common interest, as follows:
Followers
33% of viewers are characterized as Followers at 1-hop.
2% of viewers are characterized as Followers at 2-hops.
Non-followers
37% of viewers are characterized as Non-followers at 1-hop.
12% of viewers are characterized as Non-followers at 2-hops.
Other viewers
2% of viewers are characterized as other viewers at 1-hop.
6% of viewers are characterized as other viewers at 2-hops.
3.2.1.1.4 Video viewing and related parameters
We consider a pool of videos in order to simulate a video platform like YouTube. Since
each user is active in Facebook for 20 minutes on the average per day and taking into
account that the average length of a video is 4 minutes [24], we assume that a user may
watch 1 to 5 videos in this 20-minutes interval, where the number of videos watched
follows the uniform distribution. Each user can have access to the videos published from
his 1-hops friends because of the privacy settings of Facebook, however we assume that
he watches videos only related to his interests. As expected, videos of top interest for
users, as well as videos with highest popularity are more likely to be watched.
Moreover, we assume that the number of videos uploaded daily in our system is equal to
1/20 of the total number of user in our system. In each day we decide who users will be
uploaders, i.e., they upload and share videos. For this choice we use Bernoulli distribution,
where the users are chosen with uniformly random way. Additionally, each user can
upload none, one or more videos, but only within the 20-minute slot that he is active in
Facebook. Finally, the probability for a user to re-share a video that he has already
watched from a friend, i.e. stored in Facebooks server, is 11.8%, while the probability to
upload a video watched from the video server of a third-party is 88.2% [76].
Finally, each user is considered to be able to push only one video prefix through his
messaging overlays in any given day. We make this assumption because a user hardly
uploads a video per day, so there is no point trying to push more video prefixes. In the
case where a user uploads more than one video, say two, then he is considered to push
only one video prefix within that day, while he pushes the video prefix of the remaining un-
pushed video in the next day.
3.2.1.1.5 Implementation details
We assume that each user is available to serve the local P2P overlay, i.e., a P2P overlay
network, which is built per video and AS to support the dissemination of that specific video,
as a leecher during a 4-minutes slot is active in Facebook, while each user is available to
serve the local P2P overlay as a seeder during a 20-minute slot is online more generally in
Internet. Then, the estimation of the intra-AS traffic generated by a user watching a video
is based on the percentage of seeders and leechers that are active during that 4-minutes
slot and additionally, are located within the same AS. On the other hand, the estimation of
inter-AS traffic generated by a user watching a video is based on the percentage of
Public

seeders and leechers that are active during this 4-minutes slot but are located in other
ASes, plus the contribution of the external server where the video is hosted.
Furthermore, we use the upload bandwidth available to each user from the other users in
the swarm as well as from the SPS (or the external server in case of SocialTube) as a
proxy for users QoE. Our main objective is to keep this available bandwidth for each user
higher than the (average) bit rate of the video being watched in order to assure high (or at
least adequate) QoE. In order to estimate both traffic and QoE, we assign an UL and a DL
bandwidth to every user in our system; the assignment is based on statistics presented in
Table 3.
Table 3: Bandwidth statistics [23].

Next, we will describe the framework setup for our evaluation: First, we created 3963
nodes and defined their social relationships based on the SNAP dataset [80]. Second, we
distribute users (nodes) in 4 different ASes of varying sizes using the Zipf distribution;
specifically, we assume that the AS with id 1 has rank 1 and thus, the highest number of
users is assigned to it, while the AS with id 4 has rank 4 and thus, the least users of all 4
ASes.
Moreover, we created a pool of 9000 videos and we assigned to each video an interest
category and a popularity value following the methodology described in Section 4.2.4.
Additionally, each video has been considered to have a random size from 20 to 30 MB
(uniform selection), and the bit rate of each video has been set equal to 330 Kbps.
Furthermore, we set the cache size of each user equal to 300 MB, which can be
considered as a rather low value taking into account the TBs of storage available (at low
cost) in users premises, and the cache size on each one of the four SPSs, one SPS per
AS, to be proportional to the number of users assigned to the respective AS. For each
user connected to the SPS, the SPS increase the size his cache for one prefix and one
video, i.e., 33 MB in total for a prefix and a video. Finally, the simulation lasted 30 cycles
corresponding to 30 days. Finally, we implemented our evaluation framework in MATLAB.
3.2.1.2 Simulation Model for HTTP Video Streaming
In February 2012 YouTube introduced the Range Algorithm to control the data flow for http
video streaming. In contrary to the previously used Throttling Algorithm, the Ranged
Algorithm only requests a block of data when the pre-buffered playtime drops below a
certain threshold. Therefore network bandwidth is consumed only when needed.
Especially when users dont watch a video until the end or if they jump through the video
this saves bandwidth compared to the Throttling Algorithm. The requests of the Throttling
Algorithm only depend on the bitrate of the video and do not consider the pre-buffered
playback time.
In this subsection we present our results from experiments with the YouTube video
playback buffer to investigate the Range Algorithm. The goal of the experiments is to
reverse engineer the range algorithm to develop a model for HTTP video streaming. For
the study measurements for 100 randomly chosen popular videos were conducted. Each
video was replayed at least 12 times in different resolutions, so that more than 2400
Public

samples were obtained. Samples with measurement errors were omitted for the analysis.
The YouTube monitoring tool YoMo was used to monitor and record the buffered playtime
while the videos were played back in the measurements.

Figure 2: Buffered playtime while streaming a YouTube video.
Figure 2 shows the pre-buffered playtime of a YouTube video dependent of the playtime.
The buffered playtime increases sharply at the beginning of the playback to pre-buffer
playtime. Hence, blocks are requested immediately after completing the previous block.
After exceeding a threshold of 50 seconds pre-buffered playtime blocks are only requested
when the playtime drops below that threshold. When the last block is downloaded the rest
of the video is downloaded and can be played back.
This behavior can be modeled by a finite state machine. Figure 3 shows a simple finite
state machine which models the YouTube player requests and hence the YouTube source
traffic. A finite state machine is defined by a quintuple (, S, s_0, , F).
Table 4: State transition matrix for the state transition function :S x S
Input\Current State pb == 0 pb += s/ pb -= c
a, pb> pb += s/ pb -= c pb -= c
a, pb< pb += s/ pb += s/ pb += s/
!a, pb> pb == 0 pb == 0 pb -= c
!a, pb< pb == 0 pb == 0 pb += s/
Public


Figure 3: Finite state machine for a video streaming source traffic model.
The input alphabet is ={a,!a}x{pb>,pb<}, where a indicates that blocks are available, pb
is the pre-buffer size and is the block request threshold. Thus, the input alphabet
indicates whether blocks are available and whether the buffer is above or below the
threshold. Moreover, the set of states is S={pb==0,pb+=s/,pb-=c}, which represent an
empty buffer, an ongoing download (buffer is increased by block size s divided by block
bitrate ), and the playback state (buffer is decreased by update time c). The initial state is
s
0
={pb==0} where the buffer is empty, the final state F is empty, and the state transition
function which is described in the state transition matrix in Table 4.
The parameters defining the model are the block request threshold , the block-bitrate ,
the block-size s and the video size S. To parameterize a model for YouTube we measure
YouTube video downloads in different qualities. In the following we describe the results of
the initial measurements.

Figure 4: Cumulative distribution of measured YouTube video bit-rates and video sizes.
Public

Figure 4 shows the cumulative distribution of measured YouTube video bitrates and video
sizes for three different resolutions 240p, 360p and 480p. As expected the video bit-rate
increases with the resolution of the video. Videos of the same resolution can have different
bit-rates, since for recent codecs the video-bitrate highly depends on the amount of self-
information in the video. E.g., videos with frequently changing scenes need a higher bit-
rate than still images, because more information has to be encoded. For these initial
measurements a uniform distribution of the video bit-rates can be assumed. Additional
measurements are required to get a better assessment of the video bit-rate distribution for
more videos. Furthermore, the distribution of block bit-rates has to be measured, which is
needed as parameter for the model.
Corresponding to higher bit-rates video sizes tend to be larger for higher video resolutions.
The video size distribution has a decent tail, e.g., for resolution 480p more than 95% of the
measured videos are smaller than 50MB and few videos are larger than 90 MB. This
proposes to use an Erlang-k distribution to model the video size distribution of YouTube
videos. The parameters of the Erlang-k distribution to fit YouTube video sizes still have to
be determined.

Figure 5: Cumulative distribution of
measured block sizes.

Figure 6: Cumulative distribution of pre-
buffered playtime.
Figure 5 shows the cumulative distribution of block sizes of YouTube video streams in
three different resolutions. The block size distributions are depicted separately for middle
blocks and the last block of a video. A YouTube video can consist of zero or more middle
blocks and one last block. The middle blocks have constant size and the size of the last
block is simply the size of the rest of the video. For resolutions 240p and 360p a middle
block of a YouTube video stream is about 1.78 MB. For 480p resolution the middle block
of a YouTube video stream is about 2.46 MB. The last blocks containing the rest of the
video are assumed to be distributed uniformly with lower bound 0 and upper bound 1.78
MB for resolutions 240p/360p or upper bound 2.46 MB or resolution 480p.Hence, for
YouTube video streaming we can define the block size parameter s for a video with size S:
s
= _
C
S (i 1) C

vi, i C < S
, clsc.

C is a constant which is 1.78 MB for 240p/360p and 2.46 MB for 480p resolution for
YouTube video streaming. Hence a video with size S has n = |SCj + 1 blocks. The video
duration D can be determined by dividing the block sizes s
with the block bitrates y
:
Public

=
s
n
=1

Figure 6 shows the buffered playtime at the time of a block request for different video
resolutions. The buffered playtime is 50 seconds on average when a new block is
requested. The playtime when a block is requested varies and can be fit with a normal
distribution. Hence, in a detailed model for YouTube the threshold can be modeled as a
random variable that follows a normal distribution with mean 50 seconds.
This basic model does not consider the available bandwidth and download speed of the
blocks. The key influence factor on http video streaming QoE is stalling. Stalling occurs if
the video buffer drops below a certain threshold, such that a fluent playback is no more
possible. The buffer of a video that is stalling while playback is depicted in Figure 7. The
video is initially pre-buffered until the video buffer hits a certain threshold. Then the video
playback starts. In this case the network data rate is slower than the video bit-rate, due to
a bottleneck which could be limited bandwidth. The video buffer decreases until it drops
below a threshold and the video stalls. The video stops playing until the buffer exceeds the
playing threshold and starts running again.

Figure 7: Video buffer while streaming a video with limited network data rate [59].
The basic model could be adopted by including the network data rate such that the state
pb==0 is reached if the data rate is too slow, which means that the video is stalling. A
more detailed model for http video streaming should also consider dynamic adaptive
changes of the video resolution, which could be modeled by adding a dimension in the
state machine, having one state machine for each resolution that are connected to each
other.
The simulation model for HTTP video streaming can be used as a source traffic model for
video traffic. Such it can be implemented in any flow- or packet-level simulation
environment including video sources, which could be servers in data centers of a CDN or
Public

peers in a P2P video streaming system. For a complete evaluation model of a HTTP video
streaming system we need QoE models for HTTP streaming which are described in
3.2.2.2. To evaluate video streaming in the context of the whole service infrastructure
which consists of several servers, and to take into account social awareness, we further
need models for the video-CDN infrastructure and the propagation and requests of videos
in online social networks. There have been various measurement studies which analyze
the CDN infrastructure of YouTube [3][116][4].A model for video requests and the video
popularity in online social networks is provided in Section 3.2.1.1 and in [77].
3.2.1.3 Simulation Model for Dropbox
To derive typical usage scenarios and QoE influence factors of cloud storage services for
a subsequent simulation model, we conducted a Dropbox survey. For this survey, a
dedicated application was installed on the participants Dropbox account in order to gather
information on available and used storage capacity. Depending on users goals and
specific purposes for using Dropbox, their personal characteristics and the usage situation,
the impact of influence factors on Dropbox QoE may differ. Therefore, the information
collected in this second survey is used to define user profiles and groups of QoE influence
factors by using the Expectation Maximization (EM) cluster algorithm. For modeling
Dropbox QoE depending on the actual usage context and situation, we analyse the
connection between these clusters to map user groups to sets of QoE influence factors.
For the Dropbox survey 49 volunteers were recruited. Table 5 depicts the percentage of
workers and volunteers with different Dropbox account sizes and amounts of stored data
(i.e. used space). The table shows that 17% of the workers only have the initial amount of
data stored (example files and folders with a total size of 1.4 MB) in their Dropbox folder.
For 71.15%, the available account size is between 3GB and 10GB Dropbox space.
Table 5: Characteristics of Dropbox accounts from the 49 volunteers [1]
Used Space Percentage of volunteers
Initial amount 17.31%
100MB 67.31%
1GB 40.38%
Account Size Percentage of volunteers
2GB 17.31%
3GB 71.15%
10GB 57.69%

We defined a user profile by taking into account: (a) the usage duration of Dropbox (for a
few days, up to one year, more than a year), (b) the number of linked devices (1,, 5,
more than 5), (c) experience with in-conflict files, and (d) especially their main use case /
reason to use Dropbox (backup, synchronization, collaboration, file sharing and version
control). We used the Expectation Maximization (EM) cluster algorithm of the machine
learning software WEKA to determine different user groups. This approach resulted in six
clusters containing two empty clusters and one with only three respondents who did not
answer some of the questions. These three clusters were excluded from further analysis.
In the following we will refer to the three remaining clusters as beginners, synchronization
users and power users.
Public

Table 6: Characteristics of User Profiles (B=Beginners [22% of users], S=Synchronization
Users [30%], P=Power Users [48%]) [1]
Characteristic Beginners B Sync. Users S Power Users P
Ratio 22% 30% 48%
Usage Time Up to 1 Year > 1 Year > 1 Year
Avg. Number of Devices 1.4 3.8 2.6
OS (Main Device) Windows Windows, MacOS Windows, MacOS
In-conflict Files 10.0% 0.0% 68.2%

In Table 6 some characteristics of the different user profiles are shown. The beginners
cluster contains 10 respondents using Dropbox up to one year. The synchronization users
cluster consists of 14 users and is characterized by a common Dropbox usage time of
more than a year (78.6%). 22 users are part of the power users cluster. In this cluster all
the respondents use Dropbox for more than a year. Further, the table depicts that
synchronization users push to many linked devices (mean=3.8) while power users on
average use 2.6 devices for using Dropbox.
Figure 8depicts the main usage for the different clusters. The beginners use Dropbox
mainly for collaboration (50.0%) and file sharing (40.0%) while the synchronization users
make a greater use of it for synchronization (64.3%) and backup (14.3%). For the power
users synchronization is still dominating (40.9%) but the use purposes are more balanced
than in the other clusters. Moreover, Table 6 shows, that 10.0% of the beginners and
68.2% of the long-term users have experienced in conflict files. This can be explained by
the use of Dropbox for collaboration of the beginners and power users and the overall
usage duration of the power users. [1]

Figure 8: Most important use cases for Dropbox [1].
For a complete evaluation model of a cloud storage system we need the QoE model
specified in the next section 3.2.2.1. Furthermore we need models for the number of files
stored on the file storage systems and their file-sizes. Finally we need to study and model
the propagation of files in collaboration networks and the file-sharing behavior of cloud
Public

storage users. This includes the formation of groups which share a set of files dependent
on file-size and file-type.
To simulate Dropbox traffic at flow-level we need to further investigate the Dropbox-P2P
protocol and its dissemination strategy.
3.2.2 Theoretical Models
For evaluating the user perceived quality of the selected applications in SmartenIT, QoE
models are required which allow mapping objectively measurable parameters like
download time onto QoE. In this section, theoretical models will be described which can be
used for further evaluations and studies.
3.2.2.1 Quality-of-Experience Models for Dropbox
The cloud-based file storage service QoE Model was described in detail in [1] from which
the following material is taken:
A subjective lab study was conducted at the premises of Telecommunications Research
Center Vienna (FTW) in order to quantify the impact of waiting times on QoE for cloud
storage and file synchronization services like Dropbox considering the following tasks:
initialization, storage, retrieval, multi-device sync. Figure 9 depicts the obtained results in
terms of overall quality. The study results do not only show that perception (and rating) is
highly non-linear and exhibits only limited saturation effects (similar to file downloads), but
also that end-user sensitivity is dependent on the task context. For example, users tend to
be more tolerant with slow storage operations as compared to retrieve ones, as observed
in Figure 9b and Figure 9c, and are even more patient with multi-device file
synchronization, as depicted in Figure 9d. In addition, saturation effects are different for
both storage/retrieval scenarios: a slight saturation effect occurs for file retrieval after 2
seconds, which is not observed in the case of storage. For more details on this study
please refer to [19].
According to the actual situation S including the actual task and conditions like number of
files, different shapes of curves are observed in Figure 9. Thus, the QoE model function
f
S
(t) provides the MOS for this situation depending on the short-term influence factor
waiting time t; an appropriate such function is f
S
(t) = a log(t) + b. However, the overall QoE
Q(S, t, F) also needs to take into account the long-term influence factors F which provide
an upper bound for Q. Additional degradations during the usage of the service, i.e. through
waiting times, may occur. For example, security is not affected during the usage of
Dropbox, but it may define an upper bound for QoE. In contrast, during the usage of
Dropbox mainly waiting times, but also the appearance of in-conflict files shape the user
perception, and thus should be taken into account. For the sake of simplicity, we focus on
waiting times only as short-term influence factor. Thus,

o
ot
(S, t, F) =
J
Jt
S
(t)
and for t = 0 it is
(S, u, F) = w
eP

Thus, the importance of an annoyance factor i is reflected by the weight w
i
. For the
example f
S
(t) = a log(t)+b, we arrive at (S, t, F) = o log(t) +_ w
eP
with b = _ w
eP
.
Public

(a) Client initialization (b) File storage (upload)
(c) File retrieval (download) (d) Multi-device sychnchronization

Figure 9: MOS as function of waiting times for four difference task scenarios: initialization,
storage, retrieval, and multidevice sync. Figures are taken from [1].
For a holistic File Storage QoE model, the different usage scenarios and the user profiles
have to be taken into account which is the case for Q(S, t, F). However, future research is
needed on how to quantify and measure long-term influence factors. Moreover it is not
clear whether long-term and short-term influence factors are interacting and whether the
model parameters are interacting (e.g., a and b). Future research also should guide how to
integrate long- and short-term factors in a QoE model and how to integrate several short-
term factors like waiting times and in-conflict files. Such a holistic model would enable the
precise evaluation of the effects of our traffic management solutions on the QoE of
Dropbox.

3.2.2.2 Quality-of-Experience Models for HTTP-Streaming
A QoE Model of HTTP video streaming was presented in [59] from which the most
important material is taken.
Public

User perceived quality of video streaming applications in the Internet is influenced by a
variety of factors. As a common denominator, four different categories of influence factors
[56], [101] are distinguished, which are influence factors on context, user, system, and
content level.
- The context level considers aspects like the environment where the user is
consuming the service, the social and cultural background, or the purpose of using
the service like time killing or information retrieval.
- The user level includes psychological factors like expectations of the user, memory
and recency effects, or the usage history of the application.
- The technical influence factors are abstracted on the system level. They cover
influences of the transmission network, the devices and screens, but also of the
implementation of the application itself like video buffering strategies.
- For video delivery, the content level addresses the video codec, format, resolution,
but also duration, contents of the video, type of video and its motion patterns.
In this section, a simple QoE model for YouTube is presented whose primary focus is its
application for QoE monitoring (within the network or at the edge of the network).
Therefore, we take a closer look at objectively measurable influence factors, especially on
the system and content level. For this purpose, subjective user studies are designed that
take into account these influence factors; for more details refer to [59].
The identification of key influence factors has shown that YouTube QoE is mainly
determined by stalling frequency and stalling length. To quantify YouTube QoE and derive
an appropriate model for QoE monitoring, we first provide mapping functions from stalling
parameters to MOS values. Then, we provide a simple model for YouTube QoE monitoring
under certain assumptions. Finally, we highlight the limitations of the model.
3.2.2.2.1 QoE Mapping Functions
As fundamental relationship between the stalling parameters and QoE, we utilize the IQX
hypothesis [37] which relates QoE and QoS impairments x with an exponential function
(x) = oc
-[x
+y. In [57], concrete mapping functions for the MOS values depending on
these two stalling parameters, i.e. number N of stalling events and length L of a single
stalling event, were derived. To be more precise, YouTube videos of 30 s length were
considered in the bottleneck scenario leading to periodical stalling events. In order to
determine the parameters , , of the exponential function, nonlinear regression was
applied by minimizing the least-squared errors between the exponential function and the
MOS of the user ratings. This way we obtain the best parameters for the mapping
functions with respect to goodness-of-fit.
However, the aim here is to derive a model for monitoring YouTube QoE. Therefore, we
reduce the degree of freedom of the mapping function and fix the parameters and . If
we consider as QoS impairment x either the number of stalling events or the stalling
duration, we observe the following upper and lower limits for the QoE f(x), i.e.
lim
x-0
(x) = o +y and lim
x-
(x) = y, respectively. In case of no stalling, i.e. x=0, the
video perception is not disturbed and the user perceives no stalling. As we asked the user
Did you experience these stops as annoying?, the maximum MOS value is obtained, i.e.
+=5. In case of strong impairments, however, i.e. x - , a well-known rating scale
effect in subjective studies occurs. Some users tend to not completely utilize the entire
scale, i.e. avoiding ratings at the edges leading to minimum MOS values around 1.5 [115].
Hence, we assume =3, =5 and derive the unknown parameter from the subjective user
Public

tests. The obtained mapping functions as well as the coefficient of determination R as
goodness-of-fit measure are given in Table 7. In particular, the mapping function f
L
(N)
returns the MOS value for a number N of stalling events which have a fixed length L. It can
be seen that R is close to one, which means a very good match between the mapping
function and the MOS values from the subjective studies.
Figure 10 depicts the MOS values for 1s, 2s,3s, and 4s stalling length for varying number
of stalling events together with exponential fitting curves (as discussed in [37]). The x-axis
denotes the number of stalling events, whereas the y-axis denotes the MOS rating. The
results show that users tend to be highly dissatisfied with two or more stalling events per
clip. However, for the case of a stalling length of one second, the user ratings are
substantially better for same number of stalling events. Nonetheless, users are likely to be
dissatisfied in case of four or more stalling events, independent of stalling duration. As
outlined in [102], most of the users accept a quality above 3 on the ACR scale, i.e. a fair
quality.

Table 7: Parameters of mapping functions (seeFigure 10) of stalling parameters to MOS
together with coefficient of determination R2 as goodness-of-fit measure. [59]
event length L
(in s)
mapping function depending on
number N of stalling events
coefficient of determination
R
2

1 f
1
(N)=3.50 e
-0.35 N
+ 1.50 0.941
2 f
2
(N)=3.50 e
-0.49 N
+ 1.50 0.931
3 f
3
(N)=3.50 e
-0.58 N
+ 1.50 0.965
4 f
4
(N)=3.50 e
-0.81 N
+ 1.50 0.979

Figure 10: Mapping functions of stalling parameters to MOS. Video duration is fixed at 30
s. No initial delay is introduced. Parameters are given in Table 7
Public

It has to be noted that it is not possible to characterize the stalling pattern by a simple total
stalling duration T=LN only, as the curves for f
L
(N) depending on the total stalling duration
T=LN differ significantly [60]. Therefore, stalling frequency and stalling length have to be
considered separately in the QoE model.
3.2.2.2.2 Simple Model for QoE Monitoring
Going beyond the pure mapping functions, we develop next an appropriate QoE model for
monitoring. The intention of the monitoring is to provide means for QoE management
[60]for ISPs or the video streaming service provider. Hence, the model has to consider an
arbitrary number N of stalling events and stalling event length L, while the subjective user
studies and the provided mapping functions f
L
(N) in the previous section only consider a
finite number of settings, i.e. I e {1,2,S,4]s. As a result of the regression analysis in the
previous section, the parameter L of the exponential mapping function
L
(N) = S.S
-[
L
N
+
1.S is obtained as given in Table 7.
The parameter
L
of the obtained mapping function for given length L of single stalling
event can be fitted with a linear approximation which yields a high goodness-of-fit R close
to 1. The linear relationship can be easily found as (L) = 0.15L+ 0.19.
As simple QoE model f(L,N), we therefore combine our findings, i.e. f
L
(N) and (L), into a
single equation taking the number of stalling events N and the stalling length L as input
(I, N) = S.Suc
-(0.15L+0.19)N
+ 1.Su foi I e R
+
, N e H (QoE)
Figure 11 illustrates the obtained model for YouTube QoE monitoring as surface plot. On
the x-axis the number N of stalling events is depicted, on the y-axis the stalling event
length L, while the actual MOS value f(L,N) according to Eq.(QoE) is plotted on the z-axis.
The figure clearly reveals that the number of stalling events determines mainly the QoE.
Only for very short stalling events in the order of 1 s, two stalling events are still accepted
by the user with a MOS value around 3. For longer stalling durations, only single stalling
events are accepted.

Figure 11: Simple QoE model maps a number N of stalling events of average length L to a
MOS value, (I, N) = S.Su c
(-0.15L+0.19)N
+ 1.Su. [59]
Public

For other influence factors and limitations of this model please refer to [59]. In the context
of SmartenIT, this QoE model can be exploited in order to estimate the influence of
networking conditions on the user perceived quality. To this end, the stalling pattern as
perceived on application layer, i.e. the number of stalling events and the average length of
stalling events, are required as input. This application layer information can be easily
extracted from the application directly or by analyzing the network traces on IP layer as
described in [59].
3.2.2.3 Simple model for the Estimation of Transit Cost
Considering the inter-cloud communication scenario (as initially described in Deliverable
D1.1 [1]), we expect that it is likely that traffic flows between different clouds may span
multiple ASes, while traffic generated due to the inter-cloud communication, including:
data/content replication and placement generated by, e.g., video streaming
platforms such as YouTube,
data storage and replication for fault tolerance employed by online storage systems
like Dropbox,
workload and VM migration within data centers of one or more cloud operators, etc.
will cross expensive inter-domain links in the ISP level. Therefore, in this subsection, we
propose a simple model to investigate the impact of a traffic management mechanism
performing scheduling; nevertheless, the proposed model can be employed to assess the
impact of any traffic management mechanism addressing inter-domain traffic.
In particular, we use the model for a first evaluation of the mechanism for Inter-Cloud
Communication (ICC), while is described in Section 4.3. The ICC mechanism is
considered to exploit the discontinuity of the 95-th percentile rule, which is applied to
estimate transit costs between ISPs, in order to route larger traffic volumes, without
simultaneous increase of the inter-connection costs; a similar investigation has also been
performed in [71]. The avoidance of the inter-connection costs increase can also be
achieved by hiding the extra traffic volumes, i.e., sending them within 5-minute intervals,
when traffic is lower than the expected 95-th percentile of a given period, i.e. one month.
To this end, we address neither how the expected 95-th percentile is to be predicted, nor
the scheduling mechanism itself; we only address the optimization potential of such a
mechanism, if it would operate ideally and with perfect information.
For this investigation, we generated traffic traces using the Pareto distribution [70] for N =
4000 5-minute intervals. Then, we calculated the resulting 95-th percentile of the N
samples, and the difference of the 95-th percentile and the actual traffic measurement at
every 5-minute interval:
T
extra
= max {T
percentile
- T
instanteneous
, 0}
.
Figure 12 depicts the generated traffic traces following the Pareto distribution, while Figure
13 shows the difference of the 95-th percentile minus the traffic samples, whenever
positive. Note that if the difference is negative, that would imply that the specific traffic
trace exceeds the 95-th percentile, it belongs to the upper 5% of the observed samples.
Then, further investigation is needed: if the top 4% values are increased, then the 95-th
percentile is not affected; otherwise, if the 95th observation increases, then an increase of
the inter-domain transit cost can be expected.
Public


Figure 12: N = 4000 traffic traces and the estimated 95-th percentile.

Figure 13: Difference of the 95-th percentile minus actual traffic traces.
Finally, Figure 14 illustrates the total volume of traffic that can be routed through an inter-
domain link where the 95-th percentile rule is applied, when the scheduling mechanism is
not employed (blue bar), and when it is (green bar). As it can be observed, throughput is
more than 80% higher when such a scheduling mechanism is in place.
Additional significant benefit can be attained by an algorithm that transmits more traffic, up
to the capacity of the link, within the 5-minute intervals with higher load than that of the
95th percentile; if this interval could be predicted accurately, then such an intervention
would not affect the interconnection cost either. In general, the transmission of data
volumes within the 5-minute intervals whose traffic traces are similar to that of the 95th
percentile must be performed carefully, and requires accurate estimation of the expected
traffic during these intervals.
0 500 1000 1500 2000 2500 3000 3500 4000
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
5min intervals
T
r
a
f
f
i
c

instantaneous trace
95-th percentile
0 500 1000 1500 2000 2500 3000 3500 4000
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
5min intervals
T
p
e
r
c
e
n
t
i
l
e
-
T
i
n
s
t
a
n
t
a
n
e
o
u
s
Public


Figure 14: Throughput under a specific transit charge, with and without the scheduling
mechanism operating ideally and with perfect information.
Next steps of this investigation involve the development of a model to accurately predict,
e.g., based on historical data, the amount of traffic passing through a transit link at time t,
as well as the expected 95-th percentile; such a model will allow ICC mechanism to
perform more efficiently the scheduling of the data flows of the inter-cloud communication.
3.2.2.4 Energy Models
Energy models may be used to estimate the energy consumption of networked devices
based on their device state. This is beneficial as no direct power measurement is required
after the model generation phase, and the estimation of the energy consumption may be
executed on an entity other than the device to be gauged. This gauging entity then needs
the knowledge of the energy model as well as the device state to infer the target devices
energy consumption.
To allow a holistic view on the network, energy models for all networked devices used
within SmartenIT are required. These models can be taken from the literature, if the
publication was recent and the changes in the technology are small. Otherwise, models
have to be generated by measuring the power consumption of a representative device,
while stressing different aspects of the hardware. Most likely, this might be done by
loading the network or varying the number of connected devices, but also idle modes must
be considered.
The generation of models for individual devices, in particular end-user devices, is a linear
process. Generating models of cloud instances requires knowledge of the cloud data
center configuration, its relative usage, the placement of VMs on the physical machines,
the services running, and their demands on CPU, hard drives, and network, as well as the
energy consumption of the individual servers. As the available energy on mobile devices is
limited, and in general scarce, this section focuses on the improvement of the energy
efficiency of the mobile devices and the traffic control to satisfy the requirements of the
mobile user while reducing the absolute energy consumed on the mobile device. Below
energy models for mobile devices, NaDas, network infrastructure devices, and its
implications on the exemplary applications within SmartenIT are discussed.
0
50
100
150
200
250
T
h
r
o
u
g
h
p
u
t

no mechanism
95th-aware mechanism
+82%
Public

Energy measurements on mobile devices may be conducted by directly measuring the
power consumption between the battery and the device, or by modeling the power
consumption of the mobile device, based on system parameters. This approach is
proposed by [123]. The model generation is a two-step process. First, the power
consumption of the device is recorded together with device parameters, like CPU usage,
network activity, display status, brightness, and others. These measurements are then
evaluated and the influence of each component is derived in a regression based approach.
As the measurements in [123] are quite aged, and the available devices and their feature
set have evolved, models must be recreated for the devices used within SmartenIT. These
models can then be used to accurately estimate the power consumption of the mobile
devices, based only on the system parameters. Gross et al. show in [45], that the error in
most cases is below 5%. Still, more experiments must be conducted to reliably determine
the error margin.
Similar methods are also possible for wired network devices. The measurement of the
power models for switching and routing hardware is straight-forward. Hlavacs et al. [54]
suggest modeling the power consumption as a constant. This is justified, because the
influence of the traffic moved through the device is low. Under some circumstances, it
might even be beneficial to move large amounts of traffic. The power consumption of the
optical Internet backbone is analyzed in [53]. For OpenFlow hardware, no models are
available yet.
The energy consumption of cellular networks can be modeled for conventional base
stations as well as for microcells. Arnold et al. have developed a model for both types of
GSM and UMTS cells in [10]. Guo et al have simulated the energy consumption of a 4G
network in a London case study [46], also making use of microcells to show possible
energy optimizations.
For video streaming, Hinton et al. [53] separate the influences on the power consumption
into network transmission and storage. Hinton et al. also suggest storing multiple copies of
popular videos throughout the network to reduce the power consumption of the network.
The modeling of video on demand, according to [53], is straight-forward, by adding the
fixed power consumption of the servers to the transmission cost. Still, they disregard the
need for an increase in server capacity for highly popular content.
In the case of UNaDas, first measurements indicate almost constant power consumption
while running [16]. Still, more measurements are necessary to develop a fully qualified
model. The network throughput as well as the power consumption of connected storage is
not considered yet.
The above described power models can be combined with throughput models to derive the
energy cost for individual users or data transmissions. This is possible by simply dividing
the power consumption by the number of users or the number of bytes transmitted. These
models need to be combined in the SmartenIT Traffic Manager to derive the optimal
routing and caching decisions.
As far as the hardware is standardized, or differences between different hardware models
are expected to be low, measurements may be omitted. Instead, the analytical models
described in the literature might be used to calculate the energy consumption of the
hardware components. If these models are known, and proven to be accurate, traffic
measurements, or even the number of connected users suffice to model the energy
consumption of the full network with reasonable accuracy. It is part of current work to
develop these models and energy consumption functions.
Public

Knowing the energy models of the individual components, the energy consumption for
different services can be estimated. To accomplish this, traces of the application
generated traffic must be recorded. These traces can then be combined with the energy
model to derive the power consumption of the network components and the end user
device. After establishing the state-of-the-art power consumption, experiments can be run
to optimize the energy efficiency. This can be accomplished using the models only.
Possible optimizations are improved routing or scheduling of connections. Here, the
applications under consideration, YouTube and Dropbox, offer different optimization
potential. As Dropbox, in general, can be considered delay-tolerant, scheduling of
connections to more energy efficient time periods or connection types are possible. In the
case of YouTube, the margins are much smaller, but as the same content is consumed by
different users, in network caching or pre-fetching may be used to store a copy of the
content close to the user, hence optimizing the delivery time to the mobile device.
3.2.2.5 Model on Resource Allocation
Data centers offer resources, such as CPU, RAM, disk space, and bandwidth, to
customers, e.g., end-users or cloud service providers, which consume different amounts of
these, in particular different ratios. For example, one customer may consume 2GHz, 2GB
RAM, 1TB disk space, and 1GB/s of bandwidth while another will consumes 1GHz, 1GB
RAM, 3TB disk space, and 1GB/s, which makes it is hard to say which customer
consumes more resources. In academic environments, where different customers, e.g.,
chairs, do not have to pay for the resources of, e.g., a cluster rented by their university,
ensuring fairness is important. However, it may also be desirable, to integrate fairness
guarantees into SLAs of commercial data centers, which usually offer their resources
either in static shares or a nontransparent best effort manner.
Subsequently, a definition to make consumption profiles comparable is presented, which, if
used to guide resource allocations, allows to make share guarantees while also enabling
statistical multiplexing, i.e., the advantages of allocating resources statically and
dynamically are combined. In particular, every consumption profile is mapped to a number
that can be associated to the greediness of that customer, i.e., if the number is positive the
customer consumes beyond his means without appropriately ceding resources of what
would be his share to other consumers, and, if the number is negative, the opposite is the
case. This number is therefore referred to as the greediness of a customer. If resource
scarcity occurs, constraining greedy customers stronger in their resource usage will
enforce fairness. By not trimming resources of the share of a customer, who has
greediness less or equal to zero, will implement resource guarantees.
Such sharing policy can be integrated into the SLA offered by commercial data centers, as
it allows combining the attractiveness of guarantees with the attractiveness of on demand
resource shares. However, also for non-commercial resource sharing such policy will
make sense. When resources are not to be paid for, customers are usually asked for their
expected needs by the administration to rent an according infrastructure. When the
infrastructure is deployed subsequently, resources that were claimed by a customer
should be guaranteed to him. On the other hand, when resources are not used by a
customer, these should be reallocated to optimize costs.
Let d
i,j
! 0 be what customer c
i
consumes of resource r
j
, e.g., CPU used or RAM allocated.
For any resource r
j
, let e(c
i
,r
j
) be the endowment of customer c
i
of resource r
j
. For
example, if resources are divided equally among all customer, we have e(c
i
,r
j
)= q(r
j
) m
for all c
i
and r
j
, where m is the number of customers and q(r
j
) what is available of resource
r
j
. Then, d
i,j
e(c
i
,r
j
) is the amount of resource r
j
that customer c
i
consumes beyond his
Public

endowment (if the difference is negative, c
i
is willing to release some of his endowment). If
d
i,j
> e(c
i
,r
j
), other customers have to release some of their endowment of r
j
in order to
cover for c
i
s additional demand, i.e., for what c
i
demands beyond his endowment.
Therefore the additional demand should be added to the greediness of c
i
. If d
i,j
" e(c
i
,r
j
),
customer c
i
s greediness should decrease. However, in this case, only to an extent that
other customers benefit from the release, which applies, when other customers request r
j

beyond their endowment. This notion can be formalized as follows. Define #(r
j
) as the sum
of what customers demand beyond their equal-share of r
j
, i.e.,

and $(r
j
) as the sum of what customers release of r
j
, i.e.,

The greediness of customer c
i
is defined as
,
where b(d
i,i
) is the balance for consumption d
i,j
and defined as

Note that in case $(r
j
) is 0, the else-part of the above definition is never reached and
therefore #(r
j
) is never divided by 0.
A data center operator can monitor the different resources consumed by customers and
then apply the definition to identify heavy/greedy customers. In case scarcity occurs in the
data center, the scarce resources allocated to greedy customers can be trimmed stronger
than those allocated to non-greedy customers. In this way, the greediness of customers is
aligned and thereby fairness achieved.
3.2.2.6 Model on Federation of Clouds
A federation of clouds or data centers is formed by smaller players to increase geographic
diversity and offer a richer resource pool, thereby enabling competition with larger players.
In order to allow customers of a federation to take the best possible advantage of the
federations diversity, they must be able to choose data centers within the federation from
which they consume resources, i.e., they are not forced to connect to a certain data center
that then relays their requests. However, if a customer is allowed to request resources
from any data center in the federation individual data centers may easily be overloaded,
wherefore load balancing between data centers becomes necessary. Furthermore,
fairness between customers may have to be enforced. If this is done by each data center
independently (each data center enforces internal/local fairness), customers are forced to
consume resources evenly on all data centers in the federation. Such enforcement of local
fairness is inefficient, if customers have complementary demands, which may be the case,
for example, for geographical reasons or because the resources of some data center are
better suited for a customer (faster disk, better graphic rendering, faster CPU, etc.). To
illustrate the difference between local and global fairness consider a federation of two data
centers D1 and D2 with two customers C1 and C2 (with equal endowments). Local
Public

fairness would require both customers to receive 50% of D1's resources and 50% of D2's
resources (shares of different resources inside a data center may be different). Global
fairness, however, would for example, also allow to allocate 100% of D1's resources to C1
and 100% of D2's resources to C2 (assuming that both data centers provide an equal
amount of resources). Note that, an allocation that is locally fair for every data center is
also globally fair, but not vice versa, wherefore global fairness is a relaxation that allows
for more flexible thus more efficient resource allocation.
To allow a data center federation to arrive at a globally fair allocation and also give
incentive to load balancing throughout the federation, each data center has to account for
the resources consumed by customers. This local consumption is then announced to all
data centers, such that all data centers can calculate the global consumption of customers
to apply shaping based on this information and thereby ensure global fairness. (In Section
4.7, it will be argued why the Greediness metric presented there is suited to exchange
such information in a compact manner.) When this scenario is modeled game-theoretically
two kinds of self-interested agents exist: data centers and customers. It is assumed that
customers pay a flat fee to use the federations resources. In particular, even if volume
based charging would be assumed, this charging model would not take into account the
location of the consumed resources. Therefore, customers will try to consume as many
resources from their preferred data centers as possible while they have no inherent
interest in load balancing (Tussle I). This tussle can only be resolved by close cooperation
of data centers and the use of a suited consumption metric. Depending on whether the
federation splits revenues by a contribution based scheme or a fixed scheme, interest of
data centers is to use their capacity to the highest possible degree (as this increases their
worth for the federation and therefore payments they receive) or to not have their resource
used at all (as this decreases their energy costs). Therefore data centers have incentive to
over-report used resources or to not provide resources to customers (Tussle II). In order to
expose such strategic data centers, other data centers can introduce dummy customers to
the system and monitor how much resources they receive. Since a malicious data center
does not know which customers are the dummy ones, it will misreport resource usage for
these customers as well. By comparing received and reported resources for dummy
customers, malicious data centers can be identified easily.
3.2.2.7 Game-theoretic Model
In order to sort out game-theoretic models for the strategic situation of SmartenIT
stakeholders, three aspects are needed. That is players, their strategies and payoffs they
achieve. Deliverable D1.1 [1] proposed terminology and relation model describing
stakeholders and their interests (crucial in assessing payoffs). This document provides us
with strategies (traffic management mechanisms) and offers insights into our
understanding of potential payoffs.
It is important to understand that not only goals of stakeholders differ, but also their
willingness to employ new strategies may be different. For ISP, cloud service providers ,
content providers, even a small change in energy cost, small reduction of network
congestion or transit traffic can add up to a considerable improvement of revenue. Also
they can be expected to be rational players. This may not be the case for end-users. Users
may evaluate new application or traffic solution via larger set of metrics, including, for
example, habits that reduce willingness to try out new applications. Therefore strong
incentives may be required when users approach. Subjective study (3.2.2.1) shows that
users QoE relation to objective measures is non-linear, and therefore may mitigate
perception of improvement brought by SmartenIT solutions.
Public

Therefore, some of traffic management mechanisms presented in this document can be
considered as vertical games, games between different types of stakeholders involving
different metrics for players. Other, horizontal games, will involve strategic decisions
among stakeholders of the same type, strategizing in similar manner.
Users may be indifferent to traffic solutions that keep traffic local, unless it brings them
significant improvement in QoE. Thus localization of traffic can be treated as a game,
where all players are interested in reduction of transit traffic. This makes it easier to
coordinate all players to coordinate their strategies.
Trust based solution like HORST (4.1), while beneficiary towards all SmartenIT scenarios,
can potentially introduce interesting aspect from game theory standpoint. If high trust score
ensures benefits from resources shared by other HORST users, then game is competitive,
with players putting effort in order to gain high score.
If the solution requires coordination of actions from different type stakeholders, it will be
essential for solutions to satisfy their differentiated interests in order to expect
coordination. Game-theoretic evaluation models of each TM solution will require outcomes
in relevant metrics for involved parties.
This section provides the basis for a detailed definition of game-theoretic models. It is part
of future work to investigate which game-theoretic models can be applied to the selected
traffic management solutions and use-cases, which will be defined in D2.3.
Public

4 SmartenIT Traffic Management Solutions
In this section we propose and justify traffic management solutions. A traffic management
solution is composed by one or more traffic management mechanisms. The scenarios
addressed by the traffic management solutions are described to enable classifying the
proposed solutions to identify common use-cases in task T2.2.
The structure of the subsections is recurring with the aim to provide all means for
subsequent performance evaluation of the solutions. Despite the description of the traffic
management solutions the factors that have a key influence on the associated
mechanisms are identified. To assess the performance of the mechanisms key
performance metrics are defined. Finally initial evaluation results for each proposed traffic
management mechanism are provided.
4.1 Home Router Sharing based on Trust
According to [20], the number of mobile-connected devices will exceed the world's
population in 2013 and mobile data traffic is ever increasing. To handle the growth and
reduce the load on the mobile networks, offloading to WiFi has come to the center of
industry thinking [119]. In 2012 already 33% of total mobile data traffic was offloaded onto
the fixed network through WiFi or femtocell, and the number of public WiFi hotspots is
increasing up to several millions. Additionally, there is a much larger amount of private
WiFi hotspots, which could also be utilized for data offloading.
With users increasingly sharing their lives in online social networks (OSNs) and content
spreading along connected friends (so called social cascades), there is a new reason to
utilize private home routers. Social awareness, i.e. the collection and exploitation of social
signals, can be used to predict social cascades, i.e. the propagation of content links in
OSNs, and thus specify where and by whom content will be requested. As home routers
are/can be equipped with storage capacities, a socially-aware traffic management
mechanism is possible which proactively sends the content to a router at which it is/will be
requested.
Home router sharing based on trust (HORST) is such a mechanism which addresses three
use cases: data offloading, content caching/prefetching, and content delivery. Our solution
consists of a firmware for a home router, an OSN application, and a mobile device app.
The firmware sets up two WiFi networks (SSIDs) - one for private usage and one for
sharing. Additionally, a user-owned nano data center (UNaDa) is established on the home
router. The owner of the home router uploads the WiFi access information of the shared
WiFi to the OSN application. Each user can share his WiFi information to other trusted
users via the app and request access to other shared WiFis. As the application knows the
position of the users, it can recommend WiFis near to the users, or automatically request
access and connect the users for data offloading. Social cascades will be used to predict
which content will be requested by which user. As the application also knows about the
current and future users of each WiFi, the UNaDa on the home router can be used to
cache or prefetch delay-tolerant content which will be delivered when the user is
connected to the WiFi. Finally, HORST federates all UNaDas to form an overlay content
delivery network (CDN), which allows for efficient content placement and traffic
management.
Public


Figure 15: Basic HORST functionality.
4.1.1 Addressed Scenarios
The main aspect of HORST is providing a ubiquitous Internet access via WiFi to all
participating users. Additionally, it is a socially-aware traffic management solution which
utilizes network resources more efficiently and improves users' QoE.
Figure 16 shows the basic functionality of HORST. HORST uses personal data, friendship
relations, and communication patterns from OSNs to compute trust relations which, e.g.,
can be based on common trusted connections, or reliability, or recent cooperative
behavior. Furthermore, the OSN can provide information about the popularity of content
and about the interest of specific users. The OSN may also provide geo-location
information about the users, which allows for recommendation of nearby shared WiFi
access points of trusted users, and prediction of content request locations. HORST uses
this information a) to authorize access to WiFi for data offloading, b) to enable content
caching and prefetching, and c) to efficiently manage an overlay CDN.
Thus, HORST addresses all four SmartenIT scenarios social awareness (data from OSNs
and position information are used to compute trust score, to predict user location, and to
enable pre-fetching), global service mobility (global WiFi access is provided, and delay-
tolerant data can be made available globally via NaDas), energy efficiency (data offloading
to WiFi reduces load on 3G links, and idle home routers can reach a higher utilization),
and inter-cloud communication (content distribution). In the following paragraphs the
covered use cases within these scenarios are described in more detail.

Public

4.1.1.1 Data Offloading
As users can access shared WiFis through HORST, they can use the WiFi network to
connect to the Internet. This reduces the load on the 3G link and will eventually lead to
less cost for mobile network operators. End users experience a higher bandwidth and their
mobile device has lower energy consumption because of lower signal strength and faster
data transmission. Additionally, HORST can guide users to the nearest WiFi and manage
the access request based on the trust between them and the owners. This leverages the
scheduling of both upload and download transmissions for delay-tolerant content, such
that they are conducted when the user is connected to a WiFi network.
4.1.1.2 Content Caching and Prefetching
OSN and UNaDas provide information such as social cascades and access history, which
can be used to calculate and predict temporal and spatial popularity of content. Based on
that, it is possible to decide which content to prefetch to which local cache, and how long
to keep the content in the cache for best performance. As a result, users who want to
access content which is shared via OSN (social cascade) or which is requested frequently
via their friends' home routers will often find that their home router already stores the
wanted content. Additionally, users can indicate that they want to access content at a later
time, e.g., when they have WiFi coverage or when they are back at home, and in the
meantime HORST will prefetch this content to the specified UNaDa. Thus, it is possible to
download content which will not be consumed immediately in off-peak hours to avoid
congestion. When users request content via WiFi from their router, they can access it with
much less delay and a higher bandwidth resulting in a higher QoE for almost all services.
At the same time cloud operators benefit from reduced load on servers, decreased storage
demand, and improved Quality-of-Service.
4.1.1.3 Content Delivery
Content which is about to become popular for end users will be requested from the original
content provider and distributed via the UNaDa-induced overlay CDN. In doing so, HORST
will efficiently utilize the network resources, e.g., if possible request and distribute content
only in non-peak hours. If a user requests content which is not yet stored on her UNaDa,
HORST will decide from which resource the content is requested. The selected resource
can be, e.g., another UNaDa or the server of a cloud service provider, depending on
different metrics, e.g., network (latency, congestion), cost (inter-AS traffic), cloud (data
center workload), overlay (location of users and stored content), energy (offloading
savings), or QoE (user satisfaction) metrics. Thus, HORST can flexibly integrate various
metrics (if available) in order to perform traffic management and optimize the overlay.
4.1.2 Definition of SmartenIT Traffic Management Mechanisms
The traffic management mechanism will consist of the following components which will be
described briefly: home router firmware, online social network, and mobile device
application. Additionally, a decision entity is needed which decides on traffic management
based on different metrics. However, the placement (i.e., centralized or decentralized) of
this entity, the scope of decisions, and the concrete decision algorithms are not finalized
yet, and thus, this entity is omitted here and will be covered by future work.
4.1.2.1 Home Router Firmware
Due to legal issues, a shared WiFi, which is separated from the private WiFi network, is
required for home router sharing. To host a nano data center and to contribute to an
Public

efficient CDN overlay, further requirements arise. The home routers need to push or pull
content from another node in the overlay network (which includes other UNaDas as well as
original content providers, e.g., a cloud service). Then, they need to be able to intercept
requests from end users to directly serve cached content. Based on the load and location
of the home router, content requests can also be redirected to other nodes in the overlay.
Thus, load balancing and traffic management can be established, and service quality is
assured.
4.1.2.2 Online Social Network Application
The major innovation of HORST is the OSN application, which provides input for all traffic
management mechanisms. It allows for the utilization of the convenient and well-known
user management of the OSN. Thus, to participate in HORST users simply log on with
their OSN credentials and grant permissions to the application. The required permissions
include access to personal data, communication data, and position data. Moreover, users
have to specify information about their home router, i.e. WiFi SSID, WiFi and UNaDa
access passwords, home router position, and IP address.
Furthermore, the OSN application provides a mechanism to compute trust scores. This
may be an explicit rating of other users, i.e. a user indicates which other users she does or
does not trust, or an implicit mechanism in which the application computes trust scores
based on OSN topology, personal data, and communication data. Users could then set a
rule, e.g., to automatically trust all users which have a score above a certain threshold.
Moreover, also a combination of explicit and implicit mechanisms is possible, e.g., a
system which recommends trustworthy users which have to be confirmed explicitly.
Users who want to get access to another WiFi have to send a request to the owner. If the
owner trusts the user, she can get the WiFi credentials and access the new WiFi. While
users are moving, the application can analyze their position data and recommend or
automatically request access to near WiFis. An incentive mechanism which rewards users
for sharing of their home router still has to be developed, e.g., a credit point system in
which users gain credit points for each share, but have to pay some credit points for each
request. This mechanism should also take into account users who have no router to share
but also want to participate in HORST for improved QoE.
With the OSN application, information about users (interests, preferences, position) and
content (popularity, social cascades) can be gathered and exploited for enhanced traffic
management. First, popular content can be detected and distributed over the network of
UNaDas on the home routers to minimize delay and increase reliability. Additionally,
content access patterns can be taken into account, such that the content can be
distributed more efficiently for network operators, e.g., during non-peak hours. Second,
depending on the importance and sensitivity of the content, it is possible to share and
distribute content only to UNaDas of trusted users. Finally, the same mechanisms can be
used in combination with users' location data to prefetch or cache content, which is
interesting for a specific user, on that home router to which she already is or soon will be
connected. Here again, it would be possible that a user explicitly indicates the content she
wants to consume, the time of consumption, and the home router to which it shall be
transported.
To put it in a nutshell, based on the information from the OSN, HORST allows for efficient
user-centric content placement which minimizes the distance between content and users,
reduces loading times, and thus increases the users' QoE. Additionally, it takes into
Public

account when and where the content is accessed, which makes it possible to utilize the
resources of network operators more efficiently.
4.1.2.3 Mobile Device Application
Instead of using the OSN application in a browser, a mobile device application makes the
usage of HORST more natural. The mobile device automatically provides the needed data
to the HORST application (e.g., position), such that the user can benefit without the need
to be constantly engaged and manually upload information. Moreover, the mobile device
application not only requests the WiFi credentials from the OSN application, but also
stores them on the device for automatic connection to the WiFi network. It manages the
handover between different interfaces (3G, WiFi) or between different access points.
Finally, it includes a transmission scheduler which manages for both upload and
download, whether content is more or less delay-tolerant, and whether it can or cannot be
offloaded to a WiFi network.
4.1.3 Identification of Key Influence Factors
As HORST brings benefits through applying a caching/prefetching mechanism for content,
its performance is mainly influenced by content demand and popularity. Therefore, spatial,
temporal, and topic dependent characteristics of the demand should be considered when
distributing the content via the UNaDas. Second, HORSTs ability to save inter-AS traffic is
depending on the distribution of cloud service resources, UNaDas, and users over ISPs. If
many actors are in the same AS, locality can be exploited by HORST. Third, the
geographic distribution and movement patterns of users influence the performance of
HORST. In particular they determine where content will be requested and by which access
technology, i.e., whether an offloading is possible (in case the user has WiFi coverage) or
not. This has a big influence on energy savings introduced by HORST. Finally,
upload/download bandwidth of cloud resources, UNaDas, and end user devices influence
the content distribution speed and can thus influence the resulting quality perceived by the
end user.
4.1.4 Key Performance Metrics
The performance of HORST can be measured by several metrics. As HORST will optimize
the overlay application service, the improvement in terms of QoS parameters and QoE
parameters of that overlay application must be taken into account. Moreover, HORST
specific performance metrics can be applied. As HORST is a caching approach, its cache-
hit rate measures its performance in exploiting social information and predicting the
content demand correctly. Next, energy savings (savings of data center resources as well
as better utilization of home routers) and traffic savings (inter- and intra-AS traffic and
offloading potential) have to be considered.
4.1.5 Initial Evaluation Results and Optimization Potential
Nano data centers (NaDa) are a distributed computing platform on ISP-controlled home
gateways, which were first presented in [12], [13]. They can be used for content delivery
and showed to be significantly more energy efficient compared to traditional data centers.
Thus, NaDas can form a CDN on their own but also facilitate the deployment of
applications such as peer (NaDa) assisted video on demand streaming [14]. In [15], it is
described that shared WiFi routers could be utilized for pre-fetching of content, which
reduces the perceived delay up to 50%.
Public

In [69] the end device is used as NaDa and serves as its own cache. Both content that is
globally popular and content that is of personal interest are cached overnight directly on
the end device. Figure 16 shows, that there is a high potential to reduce both, response
time and energy consumption with respect to different access technologies.

Figure 16: Potential of caching on the end-user device for response-time and energy
consumption
Social awareness is a novel approach to traffic management on the Internet. With
socially-aware caching future access to user generated content (e.g., videos) shall be
predicted based on information from OSNs. Hints are generated for replica placement
and/or cache replacement which show to increase cache performance. In [16] the classical
approach of placing replicas based on the access history is improved. Therefore social
cascades are identified in an OSN, and locations of potential future users (i.e. OSN friends
of previous users) are taken into account. In [17] standard cache replacement strategies
are augmented with geo-social information from OSNs. Again social cascades are
analyzed to recognize locally popular content and keep it longer in the cache. Specialized
solutions [18], [19] exist for video streaming, which explore social relationships, interest
similarity, and access patterns for efficient pre-fetching to improve users QoE.
4.1.6 Mapping of Mechanism to SmartenIT Architecture
The components of HORST can be seen in Figure 60. HORST works mainly on the end
user level where it employs an overlay management and a social monitor. The cloud is
considered as a (fallback) content source and thus also part of HORST. The HORST
mechanism basically relies on social awareness and traffic management, but an improved
decision algorithm taking into account more metrics is possible.
4.1.7 Example Instantiation of Mechanism
The HORST mechanism eases data offloading to WiFi by sharing WiFi networks among
trusted friends. Moreover, it places the content near to the end user such that users can
access it with less delay and higher speed, which generally results in a higher Quality-of-
Public

Experience. In order to participate, a user needs a flat rate Internet access at home, he
has to install the HORST firmware to his home router, and he needs to install an
application to his mobile device.
The HORST firmware establishes two separate WiFis (a private and a shared WiFi) and
manages the local storage of the home router as a cache. It forms an overlay with HORST
systems on other routers to exchange overlay information (such as home router location,
cache content, prefetch commands) and conducts (active or passive) traffic measurements
between the end points. The application on the mobile device of users sends social
information (location, activity patterns, and interest) to the HORST system on its own
router. Thus, private data of the user stays on its own devices. Additionally, the HORST
router has a social monitor component to collect social information from an online social
network about the routers owner and his trusted friends. If a user approaches the home
router of a trusted friend, he is provided with access data via the mobile application to
connect to the shared WiFi. After the user has connected, the friends home router sends a
notification to the users own router.
Every HORST system predicts the content consumption (i.e., when and where will which
content be requested) of his owner based on location, activity patterns, interests, and
information from the online social network such as content popularity and spreading. If a
predicted content is not yet available in the local cache, it will be prefetched. If the user is
connected to a friends home router, a prefetch command is sent to the HORST system on
the friends router. For prefetching as well as for actual requests which cannot be served
locally, HORST chooses the best source (either another home router or a cloud source)
based on overlay information and traffic measurements, and fetches the desired content.
In regular intervals, HORST checks if the content in his own local cache is still relevant
(either for local consumption or as a source for content delivery) and decides whether to
keep or replace it.
As the development phase is ongoing, the basic usage of HORST might be altered in
some of the described aspects. Moreover, it shall be noted that the HORST system leaves
room for improvement in several ways, e.g., by performing explicit traffic management
between the home routers, by taking into account external network information, by
introducing an incentive system for sharing resources (such that also users without a
home router or a flat rate could also participate), or by collaborating with cloud services,
ISPs, or OSNs.
4.2 Socially-aware TM for Efficient Content Delivery
There is plenty of work in literature examining the distribution of content published on
OSN, either popular or long-tailed, as well as the exploitation of information by the OSNs
to accommodate the dissemination of this content. Two recent approaches, i.e. SocialTube
[76] and WebCloud [124], have been overviewed in Deliverable D2.1 [109], have
comprised the basis for the development of an innovative TM mechanism that sufficiently
addressed the requirements set by SmartenIT (also described in [109]).
Therefore, in this section, we propose a traffic management mechanism, called Socially-
aware mechanism for Efficient Content Delivery (SECD), in order to efficiently deliver
videos published in OSN websites. First, we describe the use-cases of video viewing,
which are being addressed by the proposed socially-aware mechanism, and we proceed
with the specification of the proposed mechanism and its components. Then, we identify
key influence factors and key performance metrics for our mechanism and its evaluation,
Public

while finally, we provide evaluations results of the proposed mechanism, as well as we
compare to an existing approach in literature.
The proposed socially-aware mechanism for efficient content delivery addresses two use-
cases, which can be categorized under the more generic scenario of the exploitation of
social information for efficient content delivery described in Deliverable D1.1 [1].
We take Facebook (as the most popular OSN) and videos published on Facebook as
case-study for our approach. Video viewing is an increasingly popular application on
Facebook. Most of the users upload videos in their profiles; the videos uploaded are
hosted either by a Facebook video server or by an external video platform like YouTube.
4.2.1.1 Video viewing case 1
In the first use-case, we consider that a user shares a video to his profile on Facebook
which is uploaded and hosted to a Facebook video server; this is the case for 14% of the
videos shared in Facebook [31]. Then, users can view the video directly from the
Facebook video server as depicted in Figure 17.

Figure 17: Video hosted on Facebook video server.
4.2.1.2 Video viewing case 2
In the second use-case, we assume that a user copies a link of a video from an external
site, e.g., directing to a YouTube server, and posts this link on his wall in Facebook; this is
the case for up to 80% of videos [31]. Users can then view that video by clicking on this
link and becoming redirected to the external server that hosts the video. Finally, the viewer
will download and watch the video from that server as depicted in Figure 18.

Figure 18: Video hosted on YouTube video server.
Public

4.2.2 Definition of SmartenIT Traffic Management Mechanism
The SECD mechanism exploits social relationships, interest similarities with respect to
content and OSN content locality of exchange in order to enhance the content delivery that
takes place on top of OSNs. The proposed mechanism has been initially designed for
enabling efficient video dissemination; nevertheless, it provides the capability to efficiently
handle any type of content shared in OSNs.
The basic constituent elements of our mechanism include:
a. a socially-aware messaging overlay for alerting potential viewers of a video in
order to request the prefix (first chunk) of a video.
b. a Social Proxy Server (SPS) located in every local region (e.g., Autonomous
System AS) in order to enable social awareness and achieve traffic localization.
Each user of the OSN is considered to be connected to the SPS of his AS (region).
c. a content-based P2P overlay to perform video dissemination, both in intra- and
inter-AS level if peering links between ISPs exist.
d. a two level caching strategy employed both in the caches of the users and the
cache of the SPS.
Moreover, the successful operation of the SECD mechanism is heavily based on the
functionality of the following algorithms:
i. the socially-aware messaging overlay construction algorithm: the algorithm
creates clusters of nodes (users) who are potential viewers of an uploader, where
messages will be disseminated later by a pull-based prefetching algorithm. This
algorithm runs as soon as a video is published/posted (a future adjustment is to run
periodically) and the necessity of this algorithm is to determine which user is a
potential viewer of the uploader for each of the categories of his interest. This
algorithm can run either in the clients of the end-users, e.g., browser add-ons, or in
local SPS.
ii. the socially-aware pull-based prefetching algorithm: the algorithm uses the
messaging overlay to push an alert message and triggers the prefetching for the
nodes (users) which received the alert message. This algorithm runs for each user
when the user uploads a video, and pushes the alert message in all users in the
cluster corresponding to the videos category of interest. Part of the algorithm runs
on the users client and the other part on local SPS. The prefetching algorithm is
pull-based because users are considered to request the prefix from their local SPS
and then the local SPS pushes it to them; we decide to follow this strategy in order
to avoid multiple downloads of the same video prefix within the same AS.
iii. the content-based local P2P overlay construction algorithm: the algorithm is
activated when a watch activity for a video occurs. The algorithm creates a local
P2P overlay, e.g., a BitTorrent-like overlay, with nodes (users), which have already
viewed and stored the video in order to assist SPS in video sharing (seeders) and
with users watching the video at this moment (leechers). If a swarm already exists
for a specific video, the algorithm will just update the set of users that constitute the
swarm. The algorithm runs in each local SPS.
The objective of our mechanism is to improve the QoE of OSN users in terms of
decreased latency, i.e. time to start watching the video, as well as to reduce inter-AS traffic
and therefore, reduce transit inter-connection costs for the ISP.
Public

In this section, we summarize the set of parameters, which are considered significant in
the context of the proposed mechanism, which have also been used in our evaluation
studies; note that a detailed description of this set of parameters is provided in Section
3.2.1.1.
As described in Subsection 3.2.1.1.1, in our evaluation, time is slotted in slots of 20
minutes, in order to be consistent with the fact that a user is active in Facebook about 20
minutes per day on the average. Additionally, we take that the average video length is 4
minutes [24]; thus, we assume that a user may watch from 1 to 5 videos in this 20-minutes
interval, where the number of videos watched follows the uniform distribution. As
expected, videos of top interest for users, as well as videos with highest popularity are
more likely to be watched.
Regarding the users activity, we assume that only 50% of the OSN users are active
daily. We chose randomly the users that will be active each given day, but users with more
friends have an extra possibility to be active. If a user is active in Facebook on a given
day, he is considered to be active only for the duration of a selected 20-minute slot.
Furthermore, we assume that each user is active in the Internet (in general) for 140
minutes (i.e. seven 20-minutes slots). We assume that these seven timeslots are
continuous, and thus when a user logs in Facebook, he does it in the middle of any of
these timeslots. Regardless a users activity in Facebook within a specific day, that user
can seed content, which he has stored while active in Facebook in previous timeslots.
Concerning users interests in video categories, we have assigned 4 video interest
categories to each user, while each user is considered to share and watch videos only out
of these 4 categories. To decide in which 4 categories a user is interested in, we used a
weighted random choice and we chose 4 categories out of 19 total interest categories.
Additionally, we considered a distribution of users in ASes; specifically, we assumed that
each user is located in one specific AS by assigning an AS id per user. Then, in order to
distribute the OSN users among the ASes, we used the Zipf distribution.
Depending on the users viewing behavior, we performed a categorization of them into
followers, non-followers and other viewers, while taking into account the social graph and
based on the number of hops between a user (uploader) and a viewer of his, we extended
the categorization of followers, non-followers and other viewers to those in 1 and those in
2 (social-graph) hops away from the uploader of a video.
Regarding the video popularity (potentially) disseminated within the OSN, we created a
pool of videos in order to simulate a video platform like YouTube, and assigned to each of
them popularity following the Power Law distribution and an interest category by weighted
random choice using as weights the percentages of each interest category as it appears in
Table 2.
Moreover, we assume that the number of videos uploaded daily in our system is equal
to 1/20 of the total number of user in our system. In each day we decide which users will
upload/share videos, where the probability that a user becomes an uploaded is modeled
by a Bernoulli distribution. Additionally, each user can upload none, one or more videos
watched from the video server of a third-party, or re-share a video that he has already
watched from a friends wall but only within the 20-minute slot that he is active in
Facebook.
Public

Finally, concerning the number of prefixes that can be uploaded be a user, we assume
that each user is able to push only one video prefix through his messaging overlays in any
given day. We make this assumption because a user hardly uploads a video per day, so
there is no point trying to push more video prefixes. In the case where a user uploads
more than one video, say two, then he is considered to push only one video prefix within
that day, while he pushes the video prefix of the remaining un-pushed video in the next
day.
In order to evaluate the SECD mechanism and to be able to compare it with other
approaches in literature, such as SocialTube proposed in [76], we define a set of
performance metrics of interest. Specifically, we consider and monitor during our
simulations the following metrics:
Inter/Intra AS traffic: traffic generated by video dissemination (including prefetching) both in the intra-AS
and inter-AS links.
Contribution of server hosting the video: the percentage of traffic handled by the external server, e.g.,
YouTube or Facebook server, where the video is hosted.
Caching accuracy of SPS: the percentage of video prefixes or videos, which had already been stored in the
cache of the SPS, when a user requested it.
Accuracy of prefetching: the percentage of video prefixes stored in the cache when a user requested to
watch the corresponding video.
Useless prefetching: the amount of video prefixes pushed to and never used by the users who received
them.
Redundant prefetching: Redundant prefetching occurs, when the same prefix is being pushed to a user
from multiple sources, i.e. two or more of his friends.
In this section, we provide preliminary evaluation results obtained by means of the
simulation framework described in Subsection 3.2.1.1.
First, we evaluated the SECD mechanism w.r.t. to the prefetching accuracy. In particular,
we observed that both the proposed mechanism and SocialTube [76] achieve high overall
prefetching accuracy around 88%. This is due to the fact that both mechanisms follow the
same philosophy in the clusters construction of users where a prefix will be pushed. Both
mechanisms add in the clusters of a user his followers and his non-followers have the
corresponding interest. In others words both mechanisms follows same approach in
prefetching. Their difference is that in our mechanism the users send an alert message
through the messaging overlay and then the prefix pushed by the local SPS, while in
SocialTube the prefix is pushed from the users directly to their viewers in the clusters.
In Figure 19, the prefetching accuracy of SECD for an increasing number of watched
videos is depicted. We observe that as the number of watched videos by a user increases,
our mechanism achieves an improved prefetching accuracy. Next, in Figure 20, the
prefetching accuracy of SECD as the number of video prefixes pushed to a user
increases. Again, the results indicate that as pushed prefixes increase, our mechanism
achieves higher improvement.
Public


Figure 19: Prefetching accuracy vs. number of watched videos.

Figure 20: Prefetching accuracy vs. number of pre-fetched videos.
One of the main targets of our mechanism is to achieve reduction of the inter-AS traffic
generated due to the video and prefixes dissemination. The results in Figure 20 and Figure
21 show that SECD indeed achieves significant reduction of inter-AS traffic. As it appears
in Figure 22, if we assume that the prefetching in the current OSN video sharing system
architecture, i.e., following a client-server architecture, generates 100% inter-AS traffic,
then SocialTube is found to generate 69% of inter-AS traffic, while SECD only 18%;
namely, 82% reduction of inter-AS traffic is achieved. This high reduction of inter-AS traffic
achieved by SECD is due to the fact that the prefix of each video is downloaded only once
per AS and then; it is cached from the local SPS. Thus, SECD minimizes the redundant
traffic due to multiple downloads of the same prefixes in inter-AS links.

Figure 21: Inter-AS traffic generated due to prefetching.
-
1-
2-
3-
4-
$-
"-
+-
8-
9-
1--
# 1# 2# 3#
0
r
e
f
e
t
c
*
i
n
"

a
c
c
u
r
a
c
y
1um)er of &atc*ed videos
-
1-
2-
3-
4-
$-
"-
+-
8-
9-
1--
1 2 3 4 #
0
r
e
f
e
t
c
*
i
n
"

a
c
c
u
r
a
c
y
1um)er of prefetc*ed videos
-
$-
1--
1//2
(.2
1+2
0lient4Ser&er
SocialTu5e
#ro)osed
Public

Moreover, in Figure 22, we can see that SECD achieves high reduction of the total traffic
inter-AS traffic (i.e. traffic created from a watch activity by a user on video) generated by
the video dissemination. If we consider again the client-server architecture where a user
downloads the complete video from the video server is hosted, we assume that inter-AS
traffic created is 100%. For the mechanism of SocialTube, inter-AS traffic created is 90%
of total traffic, while for SECD inter-AS traffic created accounts the 18% of all traffic. Figure
23 illustrates inter-AS traffic generated by each mechanism within a full day; as expected
traffic is higher (under all approaches, when users activity is higher (see Section 4.1.2).

Figure 22: Total inter-AS traffic generated.

Figure 23: Total inter-AS traffic during one simulation day.
Thus, concerning the preliminary evaluation of the proposed mechanism, we summarize
below major results obtained by means of simulations. Specifically, the socially-aware
traffic management mechanism:
Improves the QoE of OSN users by achieving high overall prefetching accuracy
~88%.
Achieves high reduction of the inter-domain traffic by keeping (to high extent) the
video dissemination locally within the boundaries of the AS that deploys it, and thus,
may decrease the potentially high transit inter-connection costs.
Next steps in this evaluation include the measurement of the server contribution, as well
as the reduction of redundant traffic due to pre-fetching.
-
$-
1--
1//2
./2
1+2
-
1----
2----
3----
4----
1 8 1$22293"43$-$+"4+1
t
r
a
f
i
c
c

i
n

M
3
A day42/min intervals
$lient5Server
Social!u)e
0roposed
Public

The SECD TM mechanism addresses mainly the Network and End User layers, and partly
the Cloud layer. SECD aims to enhance content delivery by improving end-users QoE and
reducing traffic redundancy in the links of the network layer. SECD will employ the Overlay
Management component to handle the local P2P overlays, the Social Monitor to capture
the social activities of the users of the OSN and the Social Awareness to exploit useful
information by the derived data of the Social Monitor. Moreover, the QoE Monitor is
needed to identify poor end-users QoE, so as to enable SPS participation in the video
dissemination. Additionally, in the network layer, Traffic Monitoring is employed on intra-
and (most importantly) inter-domain links, as well as Incentives & Economics component
to estimate inter-connection cost. The mapping of the SECD mechanism to the SmartenIT
architecture is depicted in Figure 61.
In order to provide an instantiation of the SECD mechanism, we consider video viewing
case 2 (see Subsection 4.2.1.2). The SECD mechanism involves the construction of a
socially aware messaging overlay per user and per interest category; specifically, each
cluster consists of this users 1-hop and 2-hop friends, i.e. all of his followers or those non-
followers with a common interest (as defined in Section 4.2.1).
Then, the socially-aware pull-based prefetching algorithm is performed; according to it,
whenever a user wants to upload or share a video pushes an alert message in the
respective cluster, which contains the link of the video which is about to be published.
Then, any user who becomes recipient of this alert message requests to prefetch the first
prefix of the video from the Social Proxy Server (SPS) of his region, i.e. his NSP. When
the local SPS receives a prefix request, it downloads the prefix of the video from the third-
party video server where the video is hosted, pushes the prefix of the video to the users
who have asked to prefetch it, while the latters are storing the video prefix in their local
cache.
As a final step, a local content-based P2P overlay is created in order to assist the video
delivery among the users that request to watch it. The SPS of each NSP operates both as
a P2P tracker and a cache, while for each video a local P2P overlay is constructed with:
Users (seeders) who have already viewed and stored the video in order to assist
the SPS in video sharing, and
Users (leechers) who are watching the video in present time.
Note that if the available total upload bandwidth per user exceeds a specific threshold,
e.g., the bit rate of the video that is served, then the SPS stops acting like a cache and the
video is delivered only by other users in the overlay. On the other hand, if the total upload
bandwidth per user is rather low, the SPS serves most of the chunk requests of the
leechers.
4.3 Mechanism for Inter-Cloud Communication
In this section, we propose a TM mechanism to address Inter-Cloud Communication (ICC).
Specifically, we consider communication among multiple cloud operators, the data centers
of each of which are in general placed in geographically distributed locations served by
multiple different Network Service Providers (NSPs), each of which consists of one or
more Autonomous Systems (ASes).
Public

The ICC mechanism addresses mainly the Inter-Cloud Communication scenario and the
Collaboration for Energy Efficiency scenario, while also some use-cases fall under the
Global Service Mobility scenario, as identified in [1]. Currently, after the consolidation of
the four initial scenarios identified in [1] to two new scenarios described in [108], we
consider that the proposed mechanism sufficiently addresses most aspects of the first one.
Below we provide two indicative concrete use cases:
A. Use Case 1: Bulk data transfer service for cloud operators
We consider the case of N clouds and N network operators, each of which is providing
connectivity to one specific cloud operator. Therefore, the traffic of each cloud operator is
handled by its home NSPs from which IP connectivity was purchased, while its home NSP
is typically Tier 2 or 3 NSPs; thus, in order to deliver global Internet connectivity, they rely
on purchasing transit from Tier-1 NSP. Therefore the inter-domain traffic of the clouds is
delivered to Tier-1 NSP(s) through transit links; the respective traffic is typically charged
under the 95-th percentile rule.
In this case, we assume that bulk data of the cloud (customers), e.g., static content of a
Content Provider like CNN web-site or online personal storage of end-users like Dropbox,
are periodically (e.g., every 24 hours) replicated to a backup facility, i.e. another cluster
located in a different physical location, in order to increase redundancy and security. In the
case of Dropbox, we also consider personal data replication to a secondary DC to meet
demand by, e.g., users with high mobility; this is another case following the GSM scenario
described in [1]. An alternative instantiation of this use case would be that the periodic
massive bulk data transfers are performed due to the need (and respective agreement) for
sharing data, for business or scientific purposes. For instance, astronomic observation
data could be exchanged periodically among space agencies clouds so as to collaborate
on scientific projects on deep space exploration.
The service model for this use case is push-based, while a traffic management mechanism
would be necessary in order to perform destination selection, i.e. in which cloud (data
center) to replicate the data, and scheduling, i.e. when to perform the replication, taking
possibly into account both cost, e.g., the cost of interconnection between NSPs, and the
cost of energy consumption by the data centers, and QoS metrics, e.g., transfer rate,
latency, etc.
For simplicity reasons and without loss of generality, we assume that for each cloud
involved in the bulk data transfer service there is a single known Point of Interconnect
where the respective traffic either originates from or must be delivered to.
B. Use case 2: Federation of small/medium-sized clouds
This use case is an extension of the previous one: in particular, we consider N clouds,
each of which purchases connectivity from a different Tier 2/3 NSP. The N clouds are
considered to be small/medium-size and geographically dispersed; therefore, we assume
that they have established a federation in order to expand their footprint, i.e. offer their
(complementary) services to remote customers, to meet demand through economies of
scale, i.e. renting on-demand resources of others in the federation, and thus, to be able
to compete against large clouds. In particular, we assume that service provision is
performed over the federation of clouds, where the resources of the individual clouds are
combined according to business policy rules so as to create a large virtual pool of
resources, at multiple network locations to the mutual benefit of all participants (see Figure
24).
Public

These business policy rules will inevitably be a product of negotiation among the
interested parties, while depending on them different types of federation may exist [30],
e.g., i) only technical standardization, ii) technical standardization and information sharing,
or iii) alliance for joint business. In this use case, we assume that the cloud federation
belongs to the third category; thus, the federation will allow the creation of large virtual
clouds that can efficiently provision their services over large geographical regions and
across multiple networks. Such a cloud federation is in accordance with the idea of
collaboration among CDNs, the so-called CDNi approach [87].

Figure 24: Cloud federation.
Specifically, and without loss of generality, we consider here the case of personal online
storage (such as Dropbox or Zettabox) being offered over the federation. We consider a
cloud provider F in France and two cloud providers I1 and I2 in Italy. If they all belong to
the federation then the online storage service initially provided by F can now be served in
Italy by either I1 or I2; in order words, the customers of F in Italy can store their content in
the data center of either I1 or I2. Then, F has to make a decision on which of them will
serve a customers request. This decision can be made either jointly with optimal resource
and/or path selection for the content, or separately.
Additionally, both bulk data transfers (e.g., for fault-tolerance) and QoE enhancement
services as discussed in the previous subsections can be considered also in the current
setup, as long as the respective incentives of the individual cloud operators are aligned.
Regarding the network layer, similarly to the two previous use cases, the traffic generated
by the inter-cloud communication is handled by their home NSPs by means of pure IP TE
and is delivered to upper tier NSP(s) through transit links, where traffic is typically charged
according to the 95-th percentile rule.
We propose a TM mechanism for Inter-Cloud Communication (ICC) to address the
aforementioned cases of the inter-cloud communication, as described in Section 0.
Nevertheless, the ICC mechanism can be extended to address dynamic services, e.g.,
VoD service or SaaS applications offered to end-users; in such a case, QoS requirements
(e.g., in terms of latency) would be more strict. The ICC mechanism involves two layers:
the network layer and the cloud layer. The constituent elements of the mechanism include:
Public

SmartenIT Information Service (SmaS): SmaS is provided by a centralized server in the network layer, and
is responsible for characterizing a set of destinations for a specific amount of data to be replicated within a
specific time interval. Specifically, SmaS receives as input from the cloud layer: i) the amount of data to be
transferred in MBs, ii) the priority level for this amount of data declared as an integer which is mapped to a
certain QoS level in terms of duration in seconds/minutes, and iii) the set of candidate destinations (i.e. IP
addresses of data centers of the same or different cloud operator).
Periodically, SmaS gathers values of several cost metrics related to the underlying IP network such as
latency (i.e. seconds), congestion (i.e. number of timeouts or packets dropped), available bandwidth, number
of hops, geographical distance and BGP information as a proxy for cost related to interconnection
agreements (e.g., transit or peering). Then, the cost characterization of the network path leading to each
specific destination is performed by calculating the minimum "weight" or cost (corresponding to network cost)
( ) p P
d
for each considered destination d based on BGP hop count and the aforementioned criteria that are
related to network load statistics at a given time. It is worth mentioning that the aforementioned cost metrics
need to be computed over multiple (source, destination) pairs corresponding to paths of ICC and time
epochs. The values of the cost metrics are greatly affected by the volatility of the network conditions
compared to the time duration of the ICC data transfers. This volatility is known to be high enough to prevent
QoS extensions of BGP such as qBGP to work properly in terms of information accuracy and scalability;
hence, even for small N the SmaS service is important in order to gather accurate information that is required
by the mechanism in order to make educated decisions and optimize resource consumption over cost
minimization. SmaS is consistent with the ALTO approach [61] and the FI design principles [92].
Cloud Scheduler (CloS): CloS is a centralized service running in the cloud layer and is responsible for
making decision of where to allocate data, i.e. to which cloud(s) to send data, e.g., either for fault-tolerance,
or for QoE enhancement. In the case where no federation is established, a CloS instance runs in each cloud,
else the federated clouds are assumed to trust a third-party entity running CloS, which is responsible for
scheduling data exchanges within the federation. In the latter case, in each cloud operator a Cloud
Information Service (CloI) is installed. The role of CloI is to feed CloS with the necessary cloud-related
information as described above.
CloS provides as input to SmaS a list of destinations and receives back a list where a weight (representing
network cost) is assigned to each one of them. CloS makes the final decision d* for a specific traffic flow of a
given cloud operator based on the total cost
d
C that characterizes each potential destination, where
d
C
includes i) the network cost
d
P , i.e. the weight provided by SmaS, and ii) the cloud cost
d
L associated to
each destination d by the scheduler. Ultimately the CloS may be aiming to achieve either an optimum for that
specific flow:
( )
d d d
d
L P C d , min arg
*
=
,
or a an optimum for a broader set of flows, or even for the entire cloud federation, considering at the same
time the other flows as well and the impact of each decision on them.
Note that CloS treats the weights provided by SmaS as network cost. Nonetheless, SmaS reveals only
relative values, this is also why they are called weights, and not actual network cost, as the latter might
result in revealing critical information to competitors and outsiders.
Moreover, the successful operation of the proposed mechanism also relies on the
following supportive or complementary functionality:
o Inter-SmaS communication protocol: This protocol supports the exchange (or
exposure) of information between the SmaSs of the NSPs involved in the inter-
cloud communication. This communication protocol can follow and extend the inter-
ALTO IETF approach [29].
o (Cross-layer) CloS-to-SmaS communication protocol: This protocol is required
for the communication of the network layer (SmaS) and the cloud layer (CloS).
o CloS-to-Federated Cloud communication protocol: This protocol facilitates the
periodic feed of CloS by CloIs with overlay information necessary to make its
scheduling decisions.
Public

4.3.2.1 SmaS Intelligence
Let us assume that the network cost of each path consists of the inter-domain network
cost and the intra-domain network cost, i.e.
( ) ( ) ( ) p P p P p P
ra d er d d int , int ,
+ =
.
We consider the 95-th percentile rule to calculate the inter-domain network cost, which is
universally applied to estimate transit costs between ISPs. In order to mitigate or even
eliminate this cost, we propose a scheduling mechanism that predicts the 95-th percentile
trace based on historical data and exploits the non-continuousness of the 95-th percentile
rule to route larger traffic volumes, without simultaneous increase of the inter-connection
costs; a similar investigation has also been performed in [71]. This can be achieved by
hiding the extra traffic volumes, i.e. sending them within 5-minute intervals, when traffic is
lower than the expected 95-th percentile of a given period, or when traffic exceeds it.
Specifically, the available bandwidth in the t-th 5-minute interval is calculated as follows: if
the QoS requirement is elastic, then the traffic flow will be split and sent over multiple 5-
min intervals to avoid an increase of the 95-th percentile, e.g., if a traffic flow F is to be
sent, then at each 5-min interval t, T(t) is sent:
( ) [ ] ( ) [ ] ( ) [ ] ( ) [ ]
( ) [ ]

=
. , 95
, 0 95 , 95
) (
else T G
t T T if t T T
t T

whereT(t) is other background traffic, until 0 ) ( ) 1 ( = +

t
t T F t F . In this case, the 95-
th percentile is not violated and thus, no charges apply. On the other hand, in the case of
non-elastic time requirements, the entire traffic flow F will be sent at once and immediately:
at each 5-min interval t, T(t) is sent: ( ) [ ] t T G t T = ) ( (where again T(t) is other
background traffic), and thus, any potential increase of the 95-th percentile (or part of it)
will be passed to the cloud layer.

In the case of multiple flows, the available bandwidth per flow must be apportioned to the
competing flows. The total available bandwidth T(t) can be allocated to each flow: either
uniformly, or following a max-min fairness rule, or using the declared (by the cloud
operator) priorities as weights.
The time when a data transfer needs to be performed by the network layer may be varying
for different cloud operators, and especially, for different types of applications, i.e. real-time
or delay-tolerant ones. The utility of the cloud operator which is the source of the data to
be transferred can be modeled as decreasing function inversely related to the time that
passes by until the data transfer is completed by his home NSP, i.e.:
( )
S ex ex C
t t when t f U =

,
1
,
where t
S
is the requested time interval within which the data flow must be transferred.
First, we assume that the time interval is strict and the data transfer must take place
immediately. Let us suppose that a single 5-minute interval is sufficient to transfer the
entire data flow: then, taking also into account the large volumes of such data exchanges,
we expect that the 95-th percentile will be affected, and the inter-domain cost for
destination d will be non-zero; thus, the cost will increase proportionally to the (expected)
increase of the inter-domain cost. Clearly, prioritizing and admitting flows over specific 5-
minute intervals based on the preferences of the cloud layer creates value for the NSP, as
Public

the latter will be compensated for the increase the 95-th percentile by the cloud operator,
while the cloud operator will be charged by the NSP for QoS provision. Also it is worth
noting that E[T(95)] cannot be based only on historical data but recomputed periodically
according to the nature of the new contracts of the NSP. Alternatively, E[T(95)]can be
estimated dynamically, following the approach of Dynamic Locality presented in [111], i.e.
by measuring the inter-domain traffic in the first portion of a 5-minute interval (e.g., 50% of
it), and then deciding how much more traffic can be hidden in the remaining part.
Furthermore, historically the volume of traffic increases while the unit price for transit
decreases. Therefore the expected E[T(95)] will inevitably reflect this trend and needs to
be updated periodically. Pricing models applied by the NSP for QoS provision towards
cloud operators in this context are left as future work.
On the other hand, if the time interval is not defined or is set to , then SmaS is allowed
to schedule the data transfer in multiple 5-minute intervals to avoid exceeding the
threshold of the (expected) 95-th percentile. In this case, the inter-domain cost of
destination d is eliminated, i.e. no increase of inter-connection cost due to the traffic flow
towards d; thus, it holds:
( ) ( ) p P p P
ra d d int ,
=
.
In order for the NSP to efficiently address the QoS requirements of the cloud operator, we
assume that an SLA will be established among them. This SLA will dictate that whenever
the requested time interval of a data transfer is exceeded by the NSP, then as time passes
by: a) either the connectivity cost of the cloud operator is reduced, or b) a compensation or
penalty fee is returned to him by the NSP. Finally, in order to support the deferred data
transmission, after the cloud layer performs destination selection based on the costs
returned by SmaS and sends data away, the NSP may employ caches to temporarily store
the data until the 5-min interval that they will transmitted towards their destination.
The key influence factors based on which the functionality of the proposed mechanism
might affected include:
Type of service: The main services associated with this scenario, as also argued in [108], are video
delivery, bulk data transfers, load and VM migrations among clouds and data centers for data replication and
backup purposes to assure fault-tolerance, content placement and load or VM migration to enhance users
QoE. Based on the type of service we can also characterize the traffic to be delivered over the inter-cloud
communication, e.g., real-time, delay-tolerant, etc.
Service model: The service model may be pull-based, e.g., content pre-fetching initiated by the destination
data center, or push-based, e.g., bulk data transfer initiated by the source data center.
Type of customers: Inter-cloud communication is generated by service offered to customers of clouds;
these can be either other service providers, called Over-The-Top providers (OTTs), such as content
providers and gaming providers, or end-users, both corporate, e.g., banks, organizations, and residential
ones.
Network topology: The network topology mainly refers to the inter-connection of different NSPs or NSPs
and OTTs, as well as respective interconnection agreements, e.g., transit or peering.
Network traffic management type: The type of traffic management refers to the algorithms assumed to run
in the physical network to allocate traffic to network links in order to meet the traffic demand of the upper
layers, i.e. cloud; examples of such mechanisms include MPLS TE or IP TE.
Collaboration: The type of collaboration between the various stakeholders is associated to the various
business agreements that may be established between them and may significantly influence the operation of
the proposed mechanism. Specifically, we identify three types of collaboration:
Public

Collaboration in the network layer between NSPs and particularly their SmaS servers,
Collaboration in the cloud layer between cloud operators; fin our discussion in Section 0, we have
identified the cases of non-federation and federation, and
Cross-layer collaboration between the network and the cloud layers, between NSP(s) and clouds (or a cloud
federation), between SmaS and CloS.
In order to evaluate the ICC mechanism, we define a set of performance metrics of
interest. We categorize them below based on the stakeholder to whom they are significant.
Performance metrics related to the network layer, i.e. the NSP:
Intra-/inter-domain traffic: The traffic generated by the inter-cloud communication both in the intra-domain
and inter-domain links of an NSP.
Path length: The reduction in the path length of customer flows in the network. This metric can also be
combined with measurements on the average delay, jitter, and bandwidth attained.
Inter-connection cost: The (normalized) amount of money paid for the data transfers.
Transit: IP transit rates are currently mainly dependent on the location of the PoI where the IP Transit
customer is connected to its provider. Transit prices are declining, especially in Europe and the US,
however the per-unit decrease should be considered along with the on-going increase of the volume
of traffic and especially video. Indicative values for transit prices as provided by Telegeography [114]
show that for Gigabit Ethernet interfaces the median price per Mbps is USD 3.13 in Europe and USD
3.50 in USA, as opposed to the much higher price of USD 15 in Asia. In a trans-continental cloud
federation, TM could exploit such differences in transit prices to reduce the overall cost.
Peering: It is important to capture the impact of the mechanism on the traffic ratio over the peering
link, which may possibly affect existing and new interconnection agreements. Specifically, large
asymmetries in the traffic ratios between two peers typically lead to disputes and in extremis de-
peering, thus threatening the stability of the inter-domain connectivity and traffic.
Energy consumption due to network operation: The energy consumption in routers, switches and other
networking equipment, which are involved in the data transmission of the inter-cloud communication. Issues
that need to be considered include the model of power consumption w.r.t. traffic load and especially in idle
state, as well as related energy cost functions.
Performance metrics important to the cloud layer:
Distribution of data/work load: The percentage of traffic handled by each server (or more data center).
Energy consumption due to cloud operation: The energy consumption for each individual cloud, which is
depends on their respective load. This parameter is significant and need to be considered especially in peak
times, when energy consumption is maximized. A suitable model that correlated energy consumption and
load has been proposed in [18].
Overall federation energy consumption: The energy consumption across the points of presence of the
federated clouds or data centers.
The ICC mechanism has not been evaluated in its complete form yet, as it is also not fully
specified. Though, as reported in Subsection 3.2.2.3, we have investigated the impact of a
concrete scheduling mechanism with perfect information, which exploits the 95-th
percentile rule. This mechanism can serve as a component of the proposed mechanism,
which estimates transit costs between ISPs, in order to route larger traffic volumes, without
simultaneous increase of the inter-connection costs.
Figure 14 in Subsection 3.2.2.3 illustrates the total volume of traffic that can be routed
through an inter-domain link where the 95-th percentile rule is applied, when the
Public

scheduling mechanism is not employed (blue bar), and when it is (green bar). As it can be
observed, throughput is more than 80% higher when such a scheduling mechanism is in
place. Significant benefit can also be attained by the algorithm described in Subsection
4.3.2.1. However, the transmission of data volumes within the 5-minute intervals whose
traffic traces are similar to that of the 95th percentile must be performed carefully, as it
requires accurate estimation of the expected traffic during these intervals.
These evaluation results appear to be really promising for the proposed mechanism. The
detailed concrete specification of the mechanisms constituent components and the
precise level of control and information exchange distributed among the various layers and
the respective actors involved leave large potential for further optimizations. These are to
be investigated in later phases of the SmartenIT project.
The ICC TM mechanism addresses the Network and Cloud layers. Specifically, in the
Network layer, ICC employs Network Traffic Management to calculate paths and Cloud
Traffic Management to characterize the cost of the various destinations provided by the
Cloud layer, as well as Traffic Monitoring of both intra- and inter-domain links. Incentives &
Economics should be also enabled to estimate inter-connection cost, as well as cost due
to specific traffic flows. In the Cloud layer, Overlay Management must be employed to
orchestrate data replications, load and VMs migration between the data centers of different
cloud operators, based on overlay criteria, e.g., availability, energy consumption, etc.
Specifically, information on the energy consumption in each data centers is derived by the
Energy Analyzer. Finally, the QoS management components needs to be employed to
assure that QoS requirements of the OTT/Cloud service providers (i.e. customers of the
Cloud operator) are sufficiently met. The mapping of the ICC mechanism to the SmartenIT
architecture is depicted in Figure 62.
We consider the use case illustrated in Figure 24. Without loss of generality, we consider
here the case of file storage (e.g., Dropbox) being offered over the federation. The
mechanism is triggered by the cloud operator that aims to improve the end-users' QoE at
low operational cost. The operation of the ICC mechanism is described below:
The CloI of the source cloud sends to CloS a request, asking where to replicate data
providing its QoS requirement. Next, CloS provides a list of destinations to the SmaS of its
home NSP including the QoS requirements, asking for weights, and simultaneously, it
requests cloud-related information from the CloIs of all federated clouds, i.e. the level of
energy consumption for each one of the data centers of the federated clouds. In the
network layer, SmaS either tries to evaluate them alone, or asks neighboring SmaSs for
information. We consider the simple case where the rating of destinations is based only on
latency, i.e. RTT; thus, the rating of the various destinations is performed only by the
SmaS of the home NSP. Additionally, depending on the QoS requirements received by the
cloud layer, the SmaS calculates when the specific traffic flow can be sent to each of the
specific destinations and at what cost (see Subsection 4.3.2.1).
Finally, CloS taking into account the feedback by SmaS and CloS, makes decision on
where to send the traffic flow in a prioritized way, i.e. the set of destinations is clustered
according to their current energy consumption (in decreasing order), and out of the data
centers in the cluster of the lowest consumption the one is chosen that has received higher
rating by the SmaS. The decision making is consistent with the case where the destination
Public

within a unique cloud is selected based only on energy-related criteria. Specifically, in such
a case the data center with the highest energy consumption would be selected in order to
achieve consolidation of running machines and avoid duplicate costs for cooling, etc. of
multiple data center sites.
The proposed traffic management mechanism including its components, communication
protocols and message flows for the case of inter-cloud communication between multiple
cloud operators, when a federation is established between them, is depicted in Figure 25.

Figure 25: Instantiation of the ICC mechanism in the case of a cloud federation.
4.4 Dynamic Traffic Management
The dynamic traffic management (DTM) is a generic concept addressing the problem of
minimization of the ISP's costs related to the amount of traffic in the network. The cost may
reflect a monetary cost for inter-domain traffic, cost of energy consumption by network
nodes or devices, or other monetary or non-monetary costs related to the amount of traffic.
The latter may reflect network conditions, utilization of resources, congestion, etc., which
are used to construct a metric expressing the general cost. The dynamic management
mechanism performs optimization to find a best solution over criteria and conditions
defined by the ISP.
The optimization procedure finds a best solution to be achieved at the end of an
accounting period. Using cost functions and information on the traffic distribution in the
previous period, it makes a prediction for the next period and finds a better traffic
distribution in which the cost is minimized (optimal solution). Then the DTM strives to
influence the traffic distribution in such a way that the cost at the end of the current
accounting period will be close to the optimal solution. The whole period is divided into
several intervals. During an interval, all factors influencing the cost are observed. On the
basis of these measurements, a new prediction of the cost, which will be achieved at the
end of period, is calculated. If the predicted cost differs from the optimal solution an
intervention is needed to compensate for the undesired trend. This intervention is achieved
by appropriate traffic management, realized by influencing the overlay application
operations. This type of intervention may be performed on a very short timescale.
Additionally, the DTM mechanism may also recalculate the goal function during the
accounting period. The new optimization process can be triggered by the following two
Public

factors: (a) the mechanism realizes that a previously calculated goal function cannot be
achieved; (b) it turns out that the goal function calculation was too conservative and there
is a potential for further reduction of total cost.
To sum up, a generic DTM mechanism consists of the following main components:
an procedure used to find the optimal solution;
the compensation procedure which in each interval determines how the traffic
distribution should be influenced to achieve the optimal solution at the end of the
accounting period;
the interface to the application which translates traffic compensation decisions into
application-specific recommendations for traffic management.
The proposed DTM mechanism is assumed to be run on an ISP-owned server, under full
control of the ISP. This server is conceptually a kind Oracle server [1]. For example, it
might be an implementation of an ALTO server [106] or SIS (SmoothIT Information Server)
[55], equipped with a DTM mechanism or, in SmartenIT, DTM might be a component of S-
Box. The server is responsible for gathering traffic information from the local ISP domain,
exchanging some information with similar servers in other domains operated by trusted
ISPs (e.g., using an inter-ALTO protocol), and providing guidelines to overlay applications.
At the current stage of development, the DTM addresses primarily global service mobility
and inter-cloud communication scenarios. The main assumption is that the content or
service needed is not available in the autonomous system where the request is originated.
For instance, the following situations might be considered:
The end-user request the content or service not available in its domain
The ISP realizes that some content or service currently not available in its domain
will be highly demanded in the near future (e.g., based on information from OSN):
o ISP wants to cache this content to serve users locally
o ISP wants to download the content and place it on NaDas
A cloud provider having its DC in ISPs domain realizes that the content or service
will be highly demanded in the near future so it decides to deploy this content on
this given DC.
In all of the above examples, the content must be downloaded to a given location in ISPs
domain. We assume that:
ISP operates an AS with at least two inter-domain links (i.e., has a multi-homed
AS).
The desired content is available from multiple resources outside ISPs AS: servers,
clouds, data centers, or peers (in P2P application), located in distinct autonomous
systems.
Depending of the selection of content source, the generated traffic will be
transferred through one or the other inter-domain link. The routing paths from
various content sources to the destination in ISPs domain may be different and
pass through different transit ASes and inter-domain links.
The cost of downloading the traffic through different inter-domain links is different
Public

DTM is closely related to global service mobility and also inter-cloud considerations. As
social network considerations are very important here since they are one of the major
triggers for content/service demands in a given AS. The cost of energy might be included
in the cost function. Summarizing, DTM mechanism addresses all SmartenIT scenarios.
Figure 26 is an illustration of sample use-cases. ISP-0 deployed SmartenIT server (S-Box
functionality) with the DTM mechanism. This is concerned with influencing the selection of
the resource of content to be downloaded by local peers, end-users, or DCs.
4.4.1.1 Content delivery based on peer-to-peer technology
In this use-case we consider a P2P application. All or parts of the desired content are
possessed by peers in the ISP-1, ISP-2 and ISP-3 domains. The peers in the ISP-0
domain may select a subset of remote peers to communicate with. The selection may be
influenced by ISP. Peers in ISP-0 domain receive a list of preferred peers sent by a
SmartenIT server owned by the ISP. ISPs preferences are related to optimization done by
DTM.
4.4.1.2 Content delivery cloud service case
In a cloud-based service the situation is significantly different. The required content might
be provided by a few DCs. The routing paths from various DCs to the end-user device may
cross-different inter-domain links. The end-user requesting the content from a content
provider is not aware of the existence of neither multiple resources nor their location. It
contacts the cloud load balancer and it gets redirected to one of content servers. The
decision is taken at cloud side. Thus, if the ISP wants to influence the decision on content
source selection, it should cooperate with cloud service provider. ISP gives hits (its
preferences) on resource selection. The Cloud service provider may take into account
those hints and serve the end-user from a preferred server or DC.

Figure 26: Sample network model for the use-cases description
Public

We have designed a practical implementation of the DTM mechanism. It is assumed that
the ISP will use DTM to minimize the total cost of inter-domain traffic by influencing overlay
traffic.
The ISP's autonomous system is multi-homed. The tariff used on each inter-domain link
may be based on different cost functions. The cost function is defined as c
(B
) where
c
is a cost calculated for link i,
() is a cost function used on that link, and B
is measured
traffic at link i. The cost function may be based on traffic volume or 95
th
percentile. The
total cost of inter-domain traffic is calculated as C = _ c
+_ c
p
N
p
=1
N
=1
whereN
and N
p
are
the number of links charged according to traffic volume and percentile rules, respectively.
Costs calculated on respective links are denoted as c
and c
p
. DTM strives to distribute
the inter-domain traffic among links in such a way that the total cost is minimized The goal
of optimization is to minimize C, i.e., the total cost.
The overall traffic on each inter-domain link is classified as manageable (generated by a
cooperative application that may follow ISPs preference on resource selection) or non-
manageable (background). The latter cannot be influenced by DTM. Even the cooperative
application follows the ISP's suggestion with some probability, referred to as a success
rate. The capability of DTM to influence the total cost of inter-domain traffic depends on
two factors: the share of manageable traffic in the total traffic, and the success rate.
DTM consist of the optimization procedure and the compensation procedure which are
explained on the example. For simplicity's sake, we assume that the AS has two inter-
domain links, both charged for the traffic volume. Cost functions on those links are
presented at Figure 27.

Figure 27: Cost functions used for accounting cost of inter-domain traffic
4.4.2.1 The Optimization Procedure
The optimization procedure is used to find an optimal solution (cost) to be achieved. The
plot in Figure 28 shows a cost map related to traffic volumes at both inter-domain links.
The X-axis and Y-axis represent traffic volumes B
1
and B
2
, respectively. The contour plot
shows total monetary cost calculated for B
1
and B
2
traffic using respective cost functions.
Point C shows traffic volumes on both links a represents a total cost for inter-domain traffic
in the previous accounting period. The red curve shows how the cumulative traffic volume
grew on both links during the previous accounting period. The manageable traffic at link 1
is denoted as B
1
m
. If all this traffic were downloaded from sources available via link 2
instead of link 1, the traffic volume on link 1 could be decreased by B
1
m
while the traffic
volume at link 2 would be increased by the same value (green arrows in Figure 28). In
turn, B
2
m
denotes manageable traffic on link 2. Therefore, the optimal solution can be found
at a dashed line presented in Figure 28. The optimization procedure finds a solution
Public

denoted as C%, e.g., a minimal cost lying on the line. For N links the cost function is defined
over N-dimensional space. Using the 95
th
percentile rule on some links also complicates
the optimization problem.

Figure 28: A cost map as a function of traffic volume on both inter-domain links and cost
optimization potential
4.4.2.2 The Compensation Procedure
Traffic management is performed during the accounting period. The optimal cost C% is
translated into desired traffic volume at both links B'
1
and B'
2
. Those values define the so
called reference vector B
that determines the desired direction of the traffic growth on

links over time (Figure 29).
The accounting period is divided into several intervals. Traffic measurements are
performed during a time interval and at the end a current traffic vector X
t
is calculated,
where t denotes a current time. The direction of this vector is compared to the reference
vector. As a result, the amount of traffic that needs to be shifted from one link to the other
is determined (a compensation vector u
t
). The compensation vector is found by a

compensation procedure which takes into account traffic and cost estimation on several
links. Finally, the compensation vector is translated into recommendations for overlay
applications on the content and traffic management. This recommendation is valid for a
current interval. At the end of the interval, the compensation vector is recalculated. The
compensation procedure works for the cost functions based on traffic volume and 95
th

percentile.
Public


Figure 29: Illustration of a traffic compensation mechanism
The performance of this mechanism is mainly influenced by metrics that directly influence
the cost. In the considered use-case the cost depends on the following influence factors:
Amount of manageable traffic
Cost functions related to traffic distribution (e.g., the tariffs used for accounting for
traffic transferred through inter-domain links)
Distribution of content/service resources, availability of multiple resources
Ownership of content/service sources multiple or single provider of the same
content/ service, cooperativeness and relations between multiple content providers
The number of inter-domain links the traffic might be downloaded
Availability of routing information (downstream direction)
Content demand/popularity
The main performance metric is total cost, or cost savings. In the presented example the
cost of inter-domain traffic is assumed. In general, depending on the ISPs goal, the
performance metric might be a selection or all of the following: (a) total cost for the traffic
transfer (cost savings), (b) amount of traffic moved between links, (c) energy savings, (d)
QoE of cloud/DC based services and applications, (e) network resource utilization. Other
performance metrics are possible.
DTM mechanism has been implemented in a proprietary simulator. We have performed a
number of experiments to evaluate the performance of DTM mechanism. The source of
Public

manageable overlay traffic is a BitTorrent application. Peers do not use a random list of
peers provided by a tracker, but a list ranked (using DTM) by a SmartenIT server. We
assume a success rate of 30%, since some remote peers may refuse connection with local
peers. The peer rating is changed dynamically every five minutes based on current
measurements of the traffic.
The green and red curves in Figure 30 show the growth of traffic volume with and without
DTM, respectively. The final cost of the inter-domain traffic is lower when DTM is used
(point C%). It can be easily seen from Figure 30 that the traffic on link 1 was decreased
while on link 2 it was increased. In Figure 31 we show traffic patterns on link 2 with and
without DTM. Due to operation of DTM, the traffic on link 2 is increased.
As shown in Figure 63, the DTM mechanism is meant to be used locally by an ISP. It may
use any metrics measured by the components to predict the applications behavior and
modify it accordingly. Moreover, it uses Overlay Management component to gain
information on the overlay networks topologies and influence them.
The presented instantiation example of the DTM is related to data transfer between data
centers. A data center wants to download a specific content from another data center. It
has been supposed that the required content is present in a few data centers (or clouds).
Data centers possessing this content are located in foreign ISPs domains. The local data
center (lacking of content) will choose a partner data center in such a way that generated
traffic will follow the chosen reference vector.

Figure 30: Comparison of traffic growth on links during the accounting period with (green
curve) and without (red) DTM
Public


Figure 31: Traffic pattern in link 2 with and without DTM mechanism
The following architecture components are involved. The local data center asks the traffic
manager for assistance in selecting a proper data center for download. The traffic
monitoring component reports amount of traffic transferred through inter domain links. The
reports are sent to the traffic manager component on a time schedule basis. The traffic
manager implements DTM and according to this mechanism a compensation vector is
calculated. This vector indicates which inter-domain link should be used. The traffic
manager knows the location of the data centers possessing required content. The traffic
manager consults network analyzer (topology/proximity monitor) in order to determine
which inter-domain link will be used for the download from each potential data center. The
remote data center is selected. The traffic manager instructs the local data center which
remote data center should be used.
4.5 RB-Tracker: User Traffic Management
RB-Tracker is a traffic management solution, which aims at reducing Internet traffic peaks
of user induced traffic. By caching content, which a user will consume in the future, during
low traffic times, e.g., at night, RB-Tracker makes the content available locally close to the
user. That way, some Internet traffic can be offloaded from the peak hours and the content
is available faster.
The target deployment scenario of RB-Tracker involves UNaDas, which can be used as
caches. Since traffic optimization is the main objective of RB-Tracker running caching
tasks during off-peak hours (e.g., nighttime) is key to the mechanism. Therefore, UNaDas
constitute the ideal deployment platform for RB-Tracker. Furthermore, UNaDas are more
energy efficient as regular computers and are usually running 24/7 anyways. RB-Tracker
enabled services running on UNaDas can easily be transferred from one UNaDa to
another and therefore enable global service mobility.
RB-Trackers prediction algorithm relies on common interests of users. In a later step
information from OSNs could improve this prediction algorithm. Therefore, RB-Tracker
could benefit from a social awareness service addressed in the social awareness scenario.
Public

RB-Tracker [72] is currently investigated in combination with the BitTorrent (BT) network,
because BT offers more intervention potential than commercial services. However, the
principles used for the BT network can also be applied to other application areas such as
video streaming. Therefore it will be described in respect of YouTube. The basic
assumption is that a user, which has certain interests in certain videos will have watched
several of these Therefore, a users video set is an expression of his or her interests since
a user would not watch videos he or she has no interest in.
Based on these assumptions RB-Tracker can build neighbor relations through watched
videos (e.g., user A and user B watch the same video, therefore, they are neighbors) and
that way build an overlay topology. Neighbors exchange lists of all the movies they
watched in the last period (a period could be a month) and then these lists are checked for
unknown resources which are ranked according to popularity among the neighbors and to
the similarity between the respective neighbors and the users resource list. The rationale
is that the more movies two users have in common the more similar is the taste of them.
RB-Tracker uses the ranking relevant videos for prefetching of the most interesting
relevant videos. Additionally all the videos being watched by the user are cached. This
way every user becomes a source for the videos watched. When a user starts watching a
video it can be fetched either from (1) the original source, e.g., YouTube, (2) the local
cache because it has been prefetched, or (3) from a neighbor who has already cached /
prefetched the video. The decision from where to load the video is ideally made based on
underlay network status. Therefore metrics like RTT, Hop-Count, and Bandwidth from the
user to the source need to be collected and evaluated.
Prefetching is only done when the network utilization is low, since the goal is to reduce
traffic imposed on the network during peak hours. The utilization can be measured by the
existing transfers for prefetching or it can come from an external service.
RB-Tracker is mainly influenced by three factors: the users content consumption behavior,
the users location, and the state of the network.
Users behavior in consuming content has the largest influence on RB-Trackers
performance, since it defines what kind of content is being cached. The relevant factors
are when users consume the content and how well the similarity between users can
predict the content that will be consumed.
Closeness of resources and consumers is a beneficial property since it reduces the
likelihood of inter-domain traffic and also keeps traffic away from the backbone. Closeness
does not necessarily mean geographical proximity; it means that the closer source and
request in terms of the underlay network the better.
Two metrics can be used to evaluate the performance of RB-Tracker. The first is
bandwidth since RB-Tracker aims at reducing traffic peaks, bandwidth consumption with
and without RB-tracker can be compared. Bandwidth performance appears on several
levels, e.g., user-access links, inter-domain links, or end-to-end, ideally all these levels are
measured and compared in the evaluation. The second metric is cache-hit ratio; it is
calculated by dividing the number of cached resources through the number of cache
accesses. This metric indicates how good the caching algorithm performs.
Public

Initial experiments were executed with a RB-Tracker implementation for Vuze, which is a
Java based BitTorrent client. The experiment scenario involved randomly selected torrents
having different popularity for a set of peers. The experiments were divided in phases,
where phases 1 & 2 are low traffic and phases 3 & 4 are high traffic times.
These experiments have shown that a reduction of traffic peaks could not be
accomplished in the given scenario. The main reasons why the traffic did not decrease are
the tiny fraction of correct cache predictions, independent of the strategy used. Only about
10.43% of the cache downloads were requested by the clients afterwards in average,
89.57% of the cache downloads just produced additional traffic without any need for the
caching peers themselves. However, this big amount of additionally cached data leads to a
better overall data availability. As a consequence, other peers were able to download Files
with at a higher speed, which increases the overall traffic in general and unfortunately in
peak hours as well.[11]

Figure 32: Initial experiments show no traffic peak reduction in a random preference
scenario.[11]
Figure 32 shows the average traffic during the 4 phases of the experiment for various
caching strategies. It can be seen that no strategy yielded any improvement regarding
traffic reduction in any of the phases. However, the caching resulted in higher download
completion at the end of the experiment.
RB-Tracker is focused on End-Users and is deployed on one of their devices, ideally this is
a UNaDa. RB-Tracker pre-fetches content for its users during times of low network
utilization / congestion. Pre-fetching ideally uses sources that are close in terms of
domains.
To accomplish this RB-Tracker acquires social information by observing the content
requested by a user and then finding similar users that provide content for pre-fetching;
Public

this is entirely done in user space and needs the core and access networks only for data
transmission. The network monitoring and locality properties require information about the
underlying network. Therefore, RB-Tracker is mapped up to the core network of the
architecture. It includes the Traffic Monitoring, Overlay Management, Social- Awareness,
and Monitoring components.
To make it easier to understand RB-Tracker, an example deployment / instantiation is
given here. In this example RB-Tracker is deployed on a UNaDa and monitors the
resources its user(s) consume. Such a resource can be a video stored in a cloud provider
such as YouTube or any other file even collection of files such as torrents. These videos
are cached on the UNaDa. RB-Tracker is connected to other instances which have videos
in common (cached or prefetched) which monitor other users videos and can therefore
discover unknown videos through its peers. RB-Tracker then makes a selection which
videos are most likely to be watched by its own users, either based on similarity of users
combined with popularity of the content. If available, information from OSNs is used as
well, i.e., videos that are posted on a users Facebook wall are very likely of being
watched. The top ranked videos are then being downloaded or pre-fetched.
Videos that are located on other UNaDas will be downloaded from them if they are closer
and or less busy than the original source. By monitoring round trip times (RTT) RB-Tracker
can make assumptions of the status of the network and decide when it is a good time to
pre-fetch or not and also if a source is busy or not. Furthermore, by using bi-directional
AS-hop count between two UNaDas, locality is improved and close sources are preferred
over others.

Figure 33: Deployment example of RB-Tracker.
Public

Figure 33 shows an example of how RB-Tracker could be deployed to improve network
traffic of serving YouTube videos. In the depicted situation the laptop in Home Network 2
requests video X. The UNaDa acts as a proxy for the request and the RB-Tracker instance
running on it joins the overlay network for video X. There are multiple sources but the best
would be the RB-Tracker instances in Home Network 1 since it is also connected to ISP A
(green arrow). If the resources of the nodes in Home Network 1 were not sufficient the
video should be streamed from the cloud (yellow arrow). Streaming from the UNaDa in
Home Network 3 must be avoided since it would involve a longer path than all other
options. Furthermore, RB-Tracker could be used in mobile networks and form Ad-Hoc
networks to stream the videos what results in a higher bandwidth, e.g., 3G vs. Wi-Fi.
4.6 Selection Mechanism for Storage Providers
This section defines a generic traffic optimization mechanism in the scope of Cloud
Storage Systems (CSS) using an overlay-based approach. In this case, the underlay
would be multiple Cloud services where content/data is stored and in the overlay the
management information about the content/data is kept.
This selection mechanism for storage providers (SMSP) is mainly centered on selecting
the best Cloud services (underlay) based on Internet traffic conditions, Cloud services
performance, and location of users and content/data.
The following scenarios are addressed with the respective reasons:
- Global Service Mobility: users location should be detected and known by the
system. Also, monitoring actions in order to evaluate Cloud services performance
should be performed. Based on these network conditions, the Cloud storage system
can decide where to store content/data.
- Social Awareness: who shares content/data must be aware of the network
conditions of who is consuming. Therefore, the Cloud storage system must consider
the interaction between friends to decide where to store content/data.
The proposed traffic optimization solution aims to select the best Cloud services to store
data/content. The mechanism is based on formulating a preference ranking (for each user)
composed by Cloud services, classified by a composition of parameters monitored over
time. In the following paragraphs, the solution is described in a high-level.
In overlay-based Cloud Storage Systems, the information of where, how, and what was
stored is kept in a file index. This file index describes, for example, in which Cloud service
the file is stored, etc. Therefore, the file index is built during the file storage since the
CSS selects where to store and which process are applied to the content/data. The
decision of selecting Cloud services to store content/data is essential for traffic
optimization purposes. One typical scenario is when a CSS user wants to share
content/data to other users within the system. The CSS must decide employ specific
mechanisms where it should place the data.
A rudimentary version employed by CSSes is based on a random function: it means that,
depending on which Cloud services credentials are available for a specific user, the CSS
would randomly select a Cloud service to upload data. However, even though a random
function can be beneficial in some cases (e.g., distribute data in an unpredictable manner,
Public

which may be desired for privacy purposes), it does not present clear advantages for traffic
optimization nor for QoE (Quality-of-Experience). It can happen that a randomly chosen
Cloud service performs well for the user that shares (e.g., low RTT, always reachable,
etc), but not for the users which will consume the content/data. It depends on a set of
factors that are discussed in this Traffic Optimization mechanism.
Therefore, this Traffic Optimization mechanism for Cloud Storage Systems introduces an
enhanced solution based on a Cloud service preference ranking. A preference ranking is
composed by Cloud services, in a strict order, also allowing ties. It means the two or more
Cloud services can have the same weight decision when choosing Cloud services to store
content/data. The preference ranking is expressed by each CSS user, independently.
In order to compose the preference ranking, two types of monitoring dimensions should be
considered: (1) dimensions related to the Cloud service performance, and (2) dimensions
related to the user location.
Considering (1), the following dimensions are observed, always targeting to monitor Cloud
services:
ICMP RTT: an ICMP echo request should be performed to the Cloud service put()
API endpoint. The ICMP echo response should be observed, and the RTT
measured.
Number of Hops: TCP and UDP traceroute should be performed to the Cloud
service put() and get() API endpoint. The number of hops from the CSS user until
the Cloud service should be observed.
API testing: perform an API test. This is a service-specific feature of each Cloud
service. It may be available or not. A valid test could be, for example, to perform a
wrong API request and wait for the answer. Therefore, the CSS user can conclude
that the Cloud service API is available.
The ICMP RTT, Number of Hops, and API testing are monitored within a context. The
context is related to the user location, which the monitoring type (2) dictates.
Considering (2), the following dimensions are observed related to the user location. The
purpose here is to discover where the CSS user is located and what are the conditions of
his/her network.
Bandwidth testing: estimate the bandwidth that the user has at the moment. This
estimation is made in two different manners. The first way is to observe the Cloud
service put() and get() times (related to put and get a file, respectively), exactly
when the CSS user tries to store or to download files. But, since the user may not
use the CSS for some time period of time, a central server should be used. Then,
the CSS should periodically upload/download small files to/from such central entity,
in order to estimate the users bandwidth.
Autonomous System (AS) identification: identify in which AS the user is located.
This can be done through identifying the public IP address and matching to the AS
that advertises such address.
The bandwidth testing and the AS identification should be constantly monitored, in order to
define the context for the monitoring type (1). These dimensions do not play a direct role in
the Cloud service preference ranking, but they specify to which scope the preference
ranking relates. Therefore, if the context changes (the CSS detects that the user is located
Public

in another area), the monitoring dimensions (1) should only consider the newest specific
context, and not previous measurements of different contexts.
The dimensions related to (1) compose the preference ranking as follows. First, the
dimensions cited in (1), ICMP RTT, Number of Hops, and API testing, should be
normalized to numerical values of the same unit even though some dimensions are
presented in milliseconds, in an absolute number, or a Boolean (e.g., the API testing true
or false). For each context, an internally within the CSS, a preference ranking will be
composed. It means that, for each user location, the user will have a preference ranking of
Cloud services. However, the user will expose his/her preference ranking as one single
ranking. This unique preference ranking encapsulates the preferences of all locations in
one single ranking, which is formed in a weighed manner: if the user stays in location Z for
2/3 of the total time (that the user is running the CSS application), then, the CSS should
form the unique preference ranking with 2/3 values from the preference ranking of location
Z, and 1/3 of location Y.
Once the CSS composes the unique preference ranking for the user, and considering that
user A wants to share content to users B, C, and D, the process works as follows. Also,
consider CS
x
as the representation for Cloud services, having the total of 4 Cloud services
available in the CSS application.
Table 8: Users and its Cloud services preference ranking
User Preference Ranking
A
1. CS
2

2. CS
3

3. CS
4

B
1. CS
1

2. CS
3

C
1. CS
2

2. CS
1

D
1. CS
1

2. CS
4

Table 8 shows the Cloud services preference ranking for each user. In this example, user
A does not have CS
1
in the preference ranking since user A did not interact to CS
1
in any
manner. It means that user A either did not provide any Cloud credential of CS
1
, or did not
consume (download) any content/data that had CS
1
as the host. The same happens for
user B, C, and D, if observed that each user just have two Cloud services composing the
preference ranking, out of four Cloud services available in the system.
Since user A wants to share content/data to B, C, and D, CSS should combine the
preference rankings to find a resulting Cloud services ranking that satisfies everyone
involved. The satisfaction relates to Cloud services performance and, indirectly, also
prioritizing less traffic consumption. The system assumes that the closer the Cloud service
is, the faster it can perform, and, as a consequence, traffic consumption will be saved
since the traffic flow will involve less ASs.
Public

Table 9: The steps to combine Cloud services preference ranking of each users
Cloud
Services
Step 1: Computation Step2: Sum Step 3: Average Step 4: Result
Users
A B C D
CS
1
0 4 3 4 11 11/3 = 3.6 1. CS
2

2. CS
1

3. CS
3

4. CS
4

CS
2
4 0 4 0 8 8/2 = 4
CS
3
3 3 0 0 6 6/2 = 3
CS
4
2 0 0 3 5 5/2 = 2.5

Table 9 shows the combination steps for the preference ranking combination. Step 1
computes the preference ranking of each user, which is composed by values from 0 to 4,
where 0 represents that the given user could not evaluate (monitor) the Cloud service, and
values from 4 to 1 represent the highest to the lowest ranking value. If the CSS would
have more than 4 Cloud services available, then such values would vary from the total
number of available Cloud services to 1 (n..1).
Step 2 simply sums the ranking values of Cloud services for each user. Step 3 performs
the average of each Cloud service, discarding the users that did not evaluate/monitor the
specific Cloud service. Step 4, as a result, shows the combined preference ranking,
classified from the highest to lowest average value.
Once user A combined the result, the system composes a probabilistic ranking considering
also its available Cloud credentials (for the specific user who wants to share). Table 10
shows the available Cloud services for user A, and the combined preference ranking
considering users A, B, C and D. Even though the CS
1
appears in the combined
preference ranking, user A cannot use such Cloud service to share content/data since A
does not have any credential related to CS
1
. Considering such fact, the CSS composes
probabilistic values related to the preference ranking, excluding the CS
1
. Such probabilistic
preference ranking specifies that, for example, CS
2
should be used 66.66% of the time to
store content when sharing content/date to users B, C, and D. Since that one common
characteristic of CSSs is to fragment files to store in different Cloud facilities, the CSS may
pick one random Cloud service from the preference ranking but respecting the probabilistic
values for example, the random function should guarantee that the CS
2
will be selected
66.66% of the times, CS
3
will be chosen in 22.22% of times, and CS
4
11.11%.
This Traffic Optimization mechanism is mainly influenced by two factors: the CSS users
location, and the Internet traffic situation (e.g., congestion, delay).
The mechanism attempts to identify what are the best Cloud services to store
content/data, considering the experience of multiple CSS users (e.g., in the sharing use
case). Therefore, depending of the users experience (that is obtained through monitoring),
the CSS can select Cloud services that are convenient to all involved users prioritizing
Cloud services that has better performance (e.g., low RTT), higher availability, and that are
closer (e.g., less number of hops).
Table 10: Probabilistic Values for the Preference Ranking.
Public

User A
Available Cloud
Services
Preference Ranking
Combination Result
Probabilistic Values for the
Preference Ranking
to store Content/Data
CS2, CS3, CS4
1. CS
2

2. CS
1

3. CS
3

4. CS
4

1. CS
2
66.66%
2. CS
3
22.22%
3. CS
4
11.11%
Moreover, the employed monitoring identifies the closeness and performance of Cloud
services to its users, and is a beneficial property since it may reduce the likely hood of
inter-domain traffic in cases of content/data sharing.
Two metrics can be used to evaluate the performance of the proposed Traffic Optimization
Mechanism. The first is Total Time of Download/Upload content to the CSS, since the
mechanism aims to select Cloud services that are closer and with the best performance.
The second is the Total Cost of ISP intra-domain traffic spent, since the mechanism
prioritizes Cloud services that are closer to users, i.e., less hops to the Cloud service may
mean less traffic charges for the user ISP.
Other key performance metrics related to CSS QoE (Quality-of-Experience) can be
defined, since the main concern of the presented Traffic Optimization mechanism is
performance.
At this point, there is no evaluation results related to the presented Traffic Optimization
Mechanism. However, it can be shown that the presented dimensions that supports the
preference ranking expresses established metrics to monitor the network state and users
location.
The Traffic Optimization Mechanism can be evaluated in the scope of CSSs, e.g., PiCsMu
[16], which is an open source project with a working flexible prototype.
SMSP is focused on end-user, and may be deployed in any device located in end-users
premises. Considering storage for multiple users (distributed environment), each storage
system should have access to a mechanism that best select other storage facilities to
persist content/data. This selection is based on current network state (traffic monitoring)
and social information. Pre-fetching may be an option, since SMSP can choose the best
storage facility either to persist content/data on users demand. The pre-fetching prediction
should be based on a third-party prediction mechanism.
In order to give a better understanding on how SMSP can be instantiated, the following
lines depict a scenario example.
Public

It is reasonable to define that a Cloud Storage System may be composed of one or more
public storage facilities (e.g., Dropbox, SkyDrive) or UNaDas (owned by users). Also, it is
reasonable to assume that end-users may use such Cloud Storage System for private
purposes (private storage) or to interact with other end-users (sharing content).
Such Cloud Storage System client runs in end-users devices, either in a UNaDa or in its
private computer/device (e.g., laptop, mobile phone). Therefore, the SMSP mechanism
should be deployed close to the Cloud Storage System client.
SMSP will monitor how end-users interact with others, and where (i.e., which storage
facility) is the best place to persist content/date when sharing to the other user. Therefore,
SMSP supports the Cloud Storage System in identifying storage facilities that are closer
and that have better performance. SMSP achieves such results by monitoring several
dimensions related to end-users location (e.g., AS peering info, AS hops), how end-users
share/consume content, and parameters regarding storage facilities performance (e.g.,
RTT, API testing).
Cloud Storage Systems must interface to SMSP in some manner, in order to direct the
traffic to the storage facilities that are optimum (returned by SMSP). The interface between
Cloud Storage Systems and SMSP is also enacted in the end-user side.
4.7 Static Resource Allocation in the IaaS Federation
As discussed in Section 3.2.2.6, in case a data center or cloud federation offers its
services as a collective, customers may deploy any cloud in the federation, which
increases efficiency but also makes load balancing between data centers necessary, as
customers may concentrate load on certain data centers. This section describes how a
federation of data centers or clouds can deploy the greediness metric introduced in
Section 3.2.2.4 to ensure federation wide/global fair resource allocation while at the same
time giving load balancing incentive to customers. In particular, this Multi-resource
Allocation (MRA) mechanism allows to ensure multi-resource fairness, that is, contrary to
competing approaches, not based on static consumption profiles and can be applied in
completely decentralized manner.
Assume each data center calculates the (internal/local) greediness of customers and
announces this information to all other data centers in the federation. Data centers can
then add up the greediness to conclude on the global greediness of users. If a data center
sees congestion on a resource, it can apply resource trimming based on this summed
up/global greediness. In this way, global fairness is enforced, as individual data centers
take the overall resource consumption of customers into account and not only the local
when trimming resources. Furthermore, it gives load balancing incentive to customers, as
argued subsequently. Recall that greediness is defined such that releasing resources are
only credited to the extent that others request this resource beyond their endowment (cf.
else-part of the definition of b(d
i,i
) in Section 3.2.2.6). If a data center sees high load on a
resource, there must be customers that request this resource from this data center beyond
their endowment. On the other hand, if a data center sees low load on a resource, most
customers requests do not exceed their endowment. It follows that the release of a
resource is credited higher in data centers that see high load for that resource compared
to data centers that see low load for this resource. Therefore, if customers have the choice
to release resources on different data centers, it is beneficial for them to release these
resources on highly loaded data centers, as this decreases their greediness stronger and
therefore makes the chance that they are trimmed small in the entire federation. Note that
this resolves Tussle I described in Sections 3.2.2.6. As data centers can exchange
Public

information about customer greediness in an efficient P2P manner, the approach works
completely decentralized, wherefore no central decision point is needed. In particular,
local/decentralized decisions are taken to ensure global greediness.
Although data center customers (be it cloud service providers or end users) are effected
by the mechanism, these only have passive role in this mechanism, instead data center in
a federation apply the mechanism actively to achieve global fairness and load balancing,
wherefore the mechanism is clearly data center focused. In particular, the mechanism is
focused on inter cloud-communication, as it provides a metric to data centers that allows to
quantify customer consumptions and exchange this information efficiently with other data
centers in the federation, thereby allowing for targeted trimming that strengthens load
balancing between data centers.
A data center operator measures for each customer how much he consumes of the
provided resources (CPU, RAM, bandwidth, disk space). If these measurements are
available for every customer, the metric presented in Section 3.2.2.6 can be applied to
calculate a rational number for each customer, that expresses its greediness, i.e., if this
number for customerC1 is larger than for customerC2, C1s consumption profile is more
expensive for the cloud than that of C2 (note that it is not straightforward to say which
consumption profile is more expensive if different resources are offered).
In case a load situation occurs for C1, i.e., the request for a resource exceeds the supply;
the access to it has to be throttled for some customers. Since the greediness of customers
is represented by a simple rational number the cloud can quickly determine the greedy
customers who access the scarce resource and throttle access of those. If greedy
customers are throttled stronger than less greedy customers, this aligns the greediness of
customers and therefore enforces multi-resource fairness.
The mechanism should be extended for a data center federation by exchange of
greediness information between data centers, which will allow throttling customers based
on the global greediness and thereby giving incentive to customers to consume resources
such that the load between clouds is balanced.
The size and number of data centers in the federation just as the number of customers will
heavily influence the performance of the mechanism. As the metric provides guidelines on
how to handle overload situations and the frequency of these depends on the degree of
overprovisioning, the latter will also influence the performance of the mechanism. Also the
homogeneity of clouds with respect to technical specs and geographical placement is
relevant, as this influences how customer preferences will differ over clouds. For example,
if the geographical distance between clouds in the federation is large, customers will still
tend to use resources on the closest cloud, even though it is overloaded, since the round
trip times to the other clouds are too high. Similarly, if a cloud provides dedicated
infrastructure for certain computing tasks, customers may still have a benefit from using it
when it is overloaded, compared to other less suited clouds, that are not that much loaded.
Public

The performance of the described mechanism will be measured by (i) fairness, i.e., envy-
freeness, (ii) Pareto efficiency, and (iii) technical efficiency of the allocations produced in
overload situations. Furthermore, also load distribution over clouds is a key performance
metric. Note that the presented greediness metric already provides a fairness
indicator/definition, as similar greediness for customers (in the best case 0 for all) can be
considered fair and largely varying number can be considered unfair. However, in the latter
case, it becomes important to make different unfair allocations comparable with respect to
fairness. In particular, the vector produced by the greediness metric needs to be mapped
to a scalar by a fairness function. In literature several such functions have been presented,
with the probably most prominent being Jains Index [62]. A recent paper by Joe-Wong et
al. also proposed to families of fairness functions [63]. However, these functions (including
Jains index) map from the space of non-negative numbers, wherefore it has to be
investigated how they can be adapted to appropriately handle a vector produced by the
greediness metric, which may contain negative numbers.
At this point no numerical evaluation results exist. However, it can be shown that the
greediness metric, which constitutes the core of the mechanism, expresses an intuitive
understanding of fairness and was presented on the 7th International Conference on
Autonomous Infrastructure, Management and Security (AIMS 2013) on June 25-28, 2013,
where it received positive feedback of the audience consisting of different experts in the
computer networking and management field.
Since the greediness can be calculated in linear time, optimization potential is currently
investigated in the trimming process: when trimming is applied, resources allocated to
customers have to be trimmed such that their greediness is aligned. This implies, that first
resources of the greediest customer have to be trimmed, which will decrease his
greediness until there is a second customer with equal greediness. Then resources of both
customers have to be trimmed, until they reach the greediness of the third greediest
customer. This process has to be continued until no scarcity exists. Although the idea of
the process is straight forward, an efficient implementation is not. However, it is important
to implement the process efficiently, since in case of scarcity the trimming needs to
happen fast. Also starvation limits need to be integrated in the trimming process.
Although the MRA mechanism determines how resource access of data center customers
is restricted in case of congestion, these customers are only passively involved in the
mechanism. In particular, customers not even need to be aware of the MRA mechanism
being executed (although it is reasonable to inform them, such that they take this as an
load balancing incentive). Data centers, on the other hand, have to run the MRA, that is,
they have to monitor resource consumptions of customers, feed them to the greediness
metric, exchange greediness information with other data centers and apply resource
trimming accordingly, if need be. Therefore the MRA mechanism is data center focused. In
particular, since the MRA mechanism not only implies the exchange information between
data centers but also greatly aids their functioning as a federation, the mechanism clearly
addresses inter-cloud communication.
As the MRA mechanism gives incentive to customers to support load balancing and
addresses a multi-resource allocation problem, which is clearly of economical nature,
Public

when mapped to the SmartenIT architecture, the MRA mechanism is located in the
Incentives & Economics component.
Assume a data center provides computing power to four customers C1, C2, C3, and C4.
The data center not only accounts for the consumed computing power but also for the
consumed bandwidth and arrives at the values shown in

Table 11. Since the traffic resource is greatly overbooked, it is necessary to limit
customers access to this resource. If a single resource policy would be applied, e.g., max-
min fairness, all customers would receive a bandwidth share of 10 Mb/s. Considering that
customers C2, C3, and C4 greatly profit from C1s low consumption of computing power
(each receives 3 units that C1 does no use of his endowment), one can easily see that
such single resource policy does not yield a fair throttling of customers.

Table 11: Local application scenario before shaping
Computing Power Traffic in Mb/s Greed
Endowment 10 10
C1 1 16 -3
C2 13 16 9
C3 13 16 9
C4 13 16 9
Exceed 0 24 24

The MRA mechanism takes all resources into account, when calculating the fair bandwidth
shares and works in the following way.
First, the greediness of customers (also shown in

Table 11) is calculated. The resulting values already imply that C2, C3, and C4 profit from
C1s modesty. Second, the starvation limits for customers are calculated. Starvation limits
depict the minimum amount of resources that a customer receives and depend on the
greediness for the consumption of other resources and the endowment to the resource
that has to be throttled. More precisely, let r be the resource that has to be throttled, e be
the endowment of a customer to r and g be the customers greediness when all resources
but r are considered. If g is less or equal to zero, the starvation limit is e and if g is greater
than zero the starvation limit is given by
.
While the mathematical justification of this formula is omitted for the sake of brevity, note
that its upper bound is e, wherefore the starvation limit always allows for a feasible
allocation. The third step of the MRA mechanism, is to allocate every customer at least its
starvation limit of r. When an amount of r is left after this, the customer, that desires more
Public

resources and has the lowest greediness (this greediness is also influenced by what is
received of r), receives them.
For the example shown in

Table 11, the MRA mechanism will yield the allocation shown inTable 12. It becomes
apparent, that the resulting allocation is more fair than those calculated by a single-
resource fair mechanism, as the MRA mechanism swaps 2 Mb/s from C2, C3, and C4 to
C1, as the former three each receive 3 units computing power of the latter.
One may argue that the bisecting of bandwidth will limit C2-C4s ability to make use of
their allocated computing power. However, in such case they may devote part of their
computing power to compression of traffic, which will reduce consumed bandwidth. More
generally, the MRA mechanism gives incentive to customers to deploy resources in a load
balanced manner, as discussed in Section 3.2.2.6.
The application of the MRA mechanism to the federation case is similar. Here, data
centers additionally receive information about the greediness of customers from other data
centers. In the second step of the mechanism, this information would be added to r and
therefore influence the starvation limit of customers. Furthermore, the information would
also be added to the greediness that is considered in the third step to allocate resources
beyond the starvation limit.
Table 12: Local application scenario after shaping
Computing Power Traffic in Mb/s (Starvation limit) Greed
Endowment 10 10
C1 1 16 (10) -3
C2 13 8 (7.36) 1
C3 13 8 (7.36) 1
C4 13 8 (7.36) 1
Exceed 0 0 24
4.8 Optimized Upgrade and Planning Processes in Load Balancing
Networks
Operation and planning processes for interconnection networks have to ensure scalability
by continuous upgrades of transport capacities to cope with steadily increasing Internet
traffic. The trend of growing traffic and the application demands behind the development
are summarized in deliverables D1.1, section 3.6 and D1.2, section 5.1 of SmartenIT with
regard to IP traffic in general and especially for clouds, data centers, and CDN support for
content delivery.
Decisions for upgrading the resources on network and service platforms are based on
permanent monitoring of the load in order to indicate bottlenecks. As the usual reaction,
threshold-based upgrades are triggered when a resource is approaching an overload
regime. In pure IP networks, this was traditionally done separately for each link based on
link load measurement, but advanced network-wide load balancing mechanisms are
employed in future networks for achieving optimum resource utilization. Moreover, content
distribution networks and clouds are organized as distributed systems where content is
Public

made available from several sources based on virtualized system-wide concepts, requiring
also integrated monitoring, management and upgrade processes for the whole system.
Optimized load balancing mechanisms have been developed on the IP layer although
there is no standard support for determining IP traffic matrices and the manipulation of link
weights is only an indirect method to control and balance link loads [38]. The deployment
of multi-protocol label switching (MPLS) [12]in many provider networks during the last
decade made a basic set of monitoring and management functions available. MPLS
establishes explicit paths between edge routers, which allow
to set up dedicated source to destination paths (LSP: Label Switched Path) between
each pair of edge routers (LER: Label Edge Router), rather than depending on the
shortest path principle of IP routing,
to monitor the traffic volume on each path (LSP) which gives visibility of the traffic
matrix between the edge routers (LER),
to prepare backup paths for fast rerouting after failures, embedded in a network-wide
path design for throughput optimization in each failure scenario [82].
We refer to Section 0 for a more detailed description and discussion of MPLS functions.
Currently, the flexibility in traffic management is further developing through software
defined networking (SDN) approaches and the OpenFlow protocol, which improve
granularity in extended control functions on the signaling plane. Measurement of traffic
demand matrix gives basic planning information in core networks. MPLS deployment
made traffic matrices visible in standard path monitoring as starting point to develop tools
(TE-Scout, [48][104]) supporting complete traffic engineering cycles
starting from monitoring of the traffic matrix through statistics per path in 15 minute
intervals,
via offline optimization algorithms to construct a path design for achieving maximum
throughput in the current network topology regarding the traffic matrix [48],
to finally configuring the pre-computed design on the routers.
A variety of commercial and open source traffic engineering tools has been developed,
e.g., the TOTEM toolset [38][39] with focus on optimization via manipulation of routing
weights in traditional IP routing protocols, which are relevant as a fallback option also in
MPLS.
Modifications of the path design are triggered only after changes of the network topology
and link bandwidths or when shifts to unbiased traffic load persist, rather than for short
term reactions which may cause instability. Reconfiguration of a small number of paths is
considered as requirement for optimized path design [38][48].
We have studied linear programming and heuristic methods to determine the paths for
traffic flows with the main goal of achieving maximum network throughput [1][93][121].
Those methods can be applied to a given topology and traffic demand matrix. Moreover,
modifications of the current network topology can be investigated to determine improved
alternatives of network development and planning:
Most efficient upgrades or topology adaptations are identified to widen bottlenecks [48].
The throughput reduction caused by link failures is analyzed [1][82].
The optimization can be extended to include cost models for investment in network
resources over time [49] and estimations for operational costs.
Public

Moreover, the optimization can include different sources being available in content
delivery network (CDN) and data center infrastructures, as another optimization
dimension to shorten the transport paths.
Naturally, no general exact optimization solutions can be expected with regard to all those
aspects, but a limited set of scenarios can be analyzed depending on the performing of the
methods.
Within a usual time frame for traffic engineering cycles, the monitoring of the traffic
demand matrix is updated in 5- or 15- minute intervals. Based on new traffic matrix
information, optimization of traffic paths by linear programming techniques can be
performed in a few seconds for networks up to a size of at least 100 nodes. The decision
to reconfigure modified optimum path design into the routers depends on the achievable
improvement in balanced load and mainly on the need to reduce overload on links beyond
predefined thresholds in the current design caused by shifts in the demands or topology
after the last reconfiguration. Most relevant for indication of critical load situations is the
measurement in the busy hours of the daily traffic profile. When a re-optimization is
required, it is usually done in a maintenance window in low load regime at night time. In
severe failure cases the network operators may have to react spontaneously based on a
failure analysis and recommendations for shifts of traffic paths by the tool. The tools can
also be used to investigate failure resilience and to enhance the topology for throughput
optimization, as shown in case studies for single link failures [26][48][82].
As compared to advanced MPLS traffic engineering functions, dynamic mobile and ad-hoc
network environments make traffic engineering often much more challenging while offering
less control functions. Fast and efficient algorithms for local monitoring and management
response have to be established when the network topology and usage pattern are subject
to a high rate of change. Then self-organizing networks (SON) [1][112] as standardized by
3GPP can provide distributed functionality, which is required as soon as a centralized
management is no longer capable of tracking the state of the whole dynamic network.
Therefore more intelligence has to be put on network elements in order to enable

decentralized monitoring techniques based on gossiping and tree based distribution of
information [120]; as an alternative monitoring scheme, structured peer-to-peer
networks using distributed hash tables are proposed for connecting self-organizing
entities. Gossiping and peer-to-peer approaches can deal with dynamic networks
subject to a high churn rate.
aggregation and evaluation of information in management overlays, from simple
functions, e.g., a sum or a minimum to more complex evaluations of network wide
threshold crossings [120],
estimation of the size of networks or of groups with specific characteristics in the
network at minimum messaging overhead, as a basic function to be embedded in
higher layer evaluations and applications,
efficient topology discovery with scalable information propagation and synchronization
[112] and
search methods for highly dynamic networks using combinations of flooding and
random walks including partially available information about the path to a target [52].
Public

Network topology and traffic demands
The TE-Scout tool has been set up to investigate several methods for constructing an
optimized path design which transports a maximum flow through a network regarding
source to destination traffic demands. Basically a single case of a predefined topology and
traffic demand matrix is analyzed but in general alternatives and sequences of upgrades or
failure situation can be considered. Options for the control of optimization methods and
visualization are represented in a graphical user interface.
The input for defining the considered network and traffic is expected in three files defining
the network topology as well as the traffic demands.

Optimization goals and preconditions
Starting from an arbitrary current network topology with known traffic matrix, the maximum
traffic scaling factor
max
is of main interest for network providers, such that all current
demands can still be delivered through the network after a uniform traffic growth by the
factor
max
. We apply optimization methods for load balancing, which offload flows from
bottleneck links on paths via lower loaded links.
Such optimization methods often encounter NP-complete problems. The simple example
of Figure 34 for balancing a set of flows with fixed traffic rates F
1
, F
2
, , F
M
on two paths
with capacity C
1
and C
2
, is obviously equivalent to the bin-packing problem: Can the set of
objects of size F
1
, F
2
, , F
M
be distributed and put into two buckets of size C
1
and C
2
?
With basic NP-complete problems included, only heuristic, approximate solutions are
feasible for the optimum network throughput, e.g., simulated annealing [48].

Figure 34: Optimized load balancing often includes NP-complete problems, e.g., Bin-
Packing [48].
We experienced that a slight variation of the boundary conditions helps to avoid NP-
complete problems in load balancing: When we allow a flow to be split up over multiple
paths then the complexity of load balancing reduces to a linear programming approach
[1][48][93][121], which is efficiently and exactly solved based on the Simplex algorithm.
In order to identify key influence factors, we have implemented linear programming,
simulated annealing and other optimization methods in the TE-Scout tool [48] to be chosen
from the menu shown in Figure 35. The Shortest paths algorithm routes each demand on
the shortest path regarding IP link weights. We use it for comparison to explicit path
routing and also included a simple Weight Optimization heuristics. Adaptation of IP
routing weights turned out to be surprisingly effective for load balancing [38] with an
overview of heuristic adaptations given in [39]. The choice of Exact refers to linear
FlowDemands F
k
to
Destinations
C
2
F
1
+ F
4
+ F
5
+ ... C
1
F
1
F
2
F
M
F
1
F
2
F
M
F
2
+ F
3
+ F
6
+ ... C
2
C
2
Capacity C
1
C
1
from
Sources
F1

, F4

, F5

,
F
2

, F
3

, F
6

,
.
.
.
.
.
.
FlowDemands F
k
to
Destinations
C
2
F
1
+ F
4
+ F
5
+ ... C
1
F
1
F
2
F
M
F
1
F
2
F
M
F
2
+ F
3
+ F
6
+ ... C
2
C
2
Capacity C
1
C
1
from
Sources
F1

, F4

, F5

,
F
2

, F
3

, F
6

,
.
.
.
.
.
.
Public

programming, whereas Heuristics offers several approximate methods for explicit paths
including simulated annealing.
Moreover, constraint shortest path first routing is applicable, such that traffic demands are
allocated sequentially on the shortest path on a subset of those links, which arent
overloaded by the new demand. Therefore the options smallest or largest demand first
(SDF, LDF) and file order are available to determine the sequence.
Finally, the optimization can include a set of failure situations (Failure Analysis).

Figure 35: Options for algorithms and graphical views provided by the TE-Scout tool
The main performance metrics to be evaluated are
the maximum achievable throughput in a given network topology with regard to the
traffic demands,
throughput degradation in failure scenarios and indication of worst failure cases within
a set of relevant scenarios [82], e.g., single link breakdowns,
additionally required bandwidth and upgrade planning in order to achieve demanded
throughput for scalable network provisioning according to traffic forecast for the
planned life cycle period of a platform,
cost and energy saving potential by optimized upgrade processes.

4.8.5 Initial Evaluation for Stepwise Upgrades in Full Mesh Core Networks
In order to demonstrate typical benefits of advanced traffic management, we consider a full
mesh network scenario, whose analysis is straightforward and simpler than for arbitrarily
meshed networks, including basic modeling of costs and energy consumption.
In particular, we start from a fully connected 8-node network with unique 40Gb/s capacities
shown in Figure 36. Traffic demands are also assumed to be uniform and bidirectional
Public

between each pair of routers. A 40Gb/s maximum network throughput obviously is
achieved by transporting each demand on the direct link between each pair of routers.
When traffic demands are growing beyond 40Gb/s, capacity upgrades become necessary.
The simplest option would be to simultaneously upgrade all links to 100Gb/s.
Alternatively, several intermediate upgrading steps can be taken, each of which expands a
subset of links to 100Gb/s. In this way, the network throughput is kept closer to the
demands and most investments in interfaces of higher capacity are delayed. As a
consequence, upgrade costs of link capacity is expected to decrease over time and energy
consumption is reducing. A simplified but nonetheless relevant modeling approach
assumes that
traffic demands between router pairs stay uniform, with exponential increase from
40Gb/s to 100Gb/s over 2.5 years,
upgrading steps are carried out as soon as the traffic demand exceeds the maximum
network throughput,
upgrading costs from 40Gb/s to 100Gb/s bandwidth are the same for all links from the
start, but are subject to exponential decrease to half of the price over the 2.5 year
upgrading period, i.e., the exponential increase in traffic demands is compensated by
decreasing costs per bandwidth over time.

Figure 36: Cost and energy optimization by stepwise upgrades in an 8-node full mesh
In particular, the stepwise upgrades are illustrated in Figure 36, where seven links to and
from a single node (A) are extended first to 100Gb/s, then six links for another node (D),
next seven of nine links to and from two further nodes in the third step. The remaining
eight links are left for the last step to finally establish the 100Gb/s network throughput
between all router pairs.
The first step is increasing the achievable network throughput from 40Gb/s to 48.6Gb/s,
because the new 60Gb/s capacity to and from node A can carry 60/7 Gb/s traffic between
all router pairs. Therefore the transport of the demand is split up over the direct 40Gb/s link
and a path over two links via node A. Growing traffic will reach 48.6Gb/s network
throughput after about half a year, when another upgrade is required.
Figure 37 and Table 13 indicate the dates when upgrade steps become necessary, their
costs, and their effect on the network throughput. Concluding, the costs of stepwise
upgrades over the 2.5 year period are 19.5% below a complete immediate network
upgrade and the delayed upgrades save 16.6% of energy consumption, when we assume
that 100Gb/s links consume twice as much energy as 40Gb/s. General formulas for timing,
cost and energy analysis of stepwise upgrading processes are provided in [49] for arbitrary
rates of exponential traffic growth and cost decrease factors over time, together with a
more detailed discussion on optimized upgrading scenarios.
In general, traffic paths have to be redirected to offload other, higher loaded links for
making better use of the new available capacity. Otherwise, the mean utilization over an
1. Upgrade Step:
7 Links to/from Node A
A
D
F
G
2. Upgrade Step:
6 Links to/from Node D
3. Upgrade Step:
9 Links to/from Node F,G
40 Gb/s link
100 Gb/s link
1. Upgrade Step:
7 Links to/from Node A
A
D
F
G
2. Upgrade Step:
6 Links to/from Node D
3. Upgrade Step:
9 Links to/from Node F,G
40 Gb/s link
100 Gb/s link
Public

upgrading period with exponential traffic growth is only 72% for 2-fold upgrades and even
less for larger steps, i.e. only 54% for 4-fold, and 39% for 10-fold upgrades as partly done
in Ethernet networks [49]. When a next upgrading period in the same full mesh of Figure
36 proceeds towards 400Gb/s links then we calculate higher percentages of 26.7% cost
saving and 22.4% energy saving assuming that 400Gb/s links have 2.5-fold higher costs
and energy consumption as compared to 100Gb/s links.

Upgrade Step
1 2 3 4
Number of links 7 6 7 8
Achieved network
throughput [Gb/s]
48.6 57.1 74.2 100
Upgrade time [Year] 0 0.53 0.98 1.64
Cost decrease factor 1 0.87 0.76 0.63
Energy saving factor 0.625 0.732 0.86 1

Table 13: Savings in stepwise link upgrade cycles

Figure 37: Timing and Cost Decrease for Stepwise Upgrades
Concluding, the effect of utilization gaps after capacity upgrades essentially contributes to
current overprovisioning in core networks and provides considerable potential for
improvement by advanced traffic engineering support.
The mapping of this mechanism is very similar to the case of MPLS-based inter-cloud
traffic management. The focus is on meshed interconnection and core networks in a single
domain and under unique administration of an ISP or a cloud or overlay provider with own
networking infrastructure. In order to balance the load during a steady upgrading process
for resources, traffic monitoring information has to be gathered and based on the
optimization results a recommended path design finally has to be configured into the
routing equipment.
Therefore the mechanism is working on the network layer, eventually also for integrating
concepts with lower layers or on the basis of virtualized networks (VPN). The transport
optimization can be extended to include multiple sources as in clouds, CDNs and
distributed data center architectures. Mobile networking environments havent been
0
0,5
1
1,5
2
2,5
3
1 101 201 301 401 501
C
o
s
t

d
e
c
r
e
a
s
e

f
a
c
t
o
r

/

b
a
n
d
w
i
d
t
h

[
G
b
/
s
]
Reihe1
Reihe5
Reihe2
Reihe3
Time in Upgrade Period [Year]
0 0,5 1 1,5 2 2,5
120
100
80
60
40
1
0,5
0
T
r
a
f
f
i
c

G
r
o
w
t
h

a
n
d

T
h
r
o
u
g
h
p
u
t

D
e
m
a
n
d
C
o
s
t

D
e
c
r
e
a
s
e

F
a
c
t
o
r
[Gb/s]
Throughput: Complete Upgrade
Throughput: Stepwise Upgrades
Exponential Traffic Growth
Exponential Cost Decrease
0
0,5
1
1,5
2
2,5
3
1 101 201 301 401 501
C
o
s
t

d
e
c
r
e
a
s
e

f
a
c
t
o
r

/

b
a
n
d
w
i
d
t
h

[
G
b
/
s
]
Reihe1
Reihe5
Reihe2
Reihe3
0 0,5 1 1,5 2 2,5
120
100
80
60
40
1
0,5
0
T
r
a
f
f
i
c

G
r
o
w
t
h

a
n
d

T
h
r
o
u
g
h
p
u
t

D
e
m
a
n
d
C
o
s
t

D
e
c
r
e
a
s
e

F
a
c
t
o
r
[Gb/s]
0
0,5
1
1,5
2
2,5
3
1 101 201 301 401 501
C
o
s
t

d
e
c
r
e
a
s
e

f
a
c
t
o
r

/

b
a
n
d
w
i
d
t
h

[
G
b
/
s
]
Reihe1
Reihe5
Reihe2
Reihe3
0 0,5 1 1,5 2 2,5
120
100
80
60
40
1
0,5
0
T
r
a
f
f
i
c

G
r
o
w
t
h

a
n
d

T
h
r
o
u
g
h
p
u
t

D
e
m
a
n
d
C
o
s
t

D
e
c
r
e
a
s
e

F
a
c
t
o
r
[Gb/s]
Public

included yet, which add a challenging dynamics and optimization potential also for the last
mile and the destinations due to moving user population with a daily profile and flash
crowd events etc.
Costs and energy savings are the main criteria for the efficiency of the mechanism in a
tradeoff with providing sufficient QoS/QoE for the users.
Operation and planning processes for interconnection networks have to ensure scalability
by permanent upgrades of transport capacities to cope with steadily increasing Internet
traffic.
Decisions for upgrading the resources on network and service platforms are based on
permanent monitoring of the load in order to indicate bottlenecks. As the usual reaction,
threshold-based upgrades are triggered when a resource is approaching an overload
regime. In pure IP networks, this was traditionally done separately for each link based on
link load measurement, but advanced network-wide load balancing mechanisms are
employed in future networks for achieving optimum resource utilization. Moreover, content
distribution networks and clouds are organized as distributed systems where content is
made available from several sources based on virtualized system-wide concepts, requiring
also integrated monitoring, management and upgrade processes for the whole system.
The main application areas are meshed IP/MPLS networks, where Multiprotocol Label
Switching (MPLS) provides an advanced basis for network monitoring and management
functions. A set of heuristics optimization algorithms and linear programming are involved
as main optimization techniques. For an integrated solution, a cyclic process has to be
established starting from monitoring via optimization and upgrading recommendations to
configuration of optimized traffic paths into the network elements.
4.9 vINCENT
vINCENT is a relevant traffic management mechanism for overlay-based video streaming
as it differs from related approaches (e.g., [1]) in that it has a clear focus on mobile video
streaming traffic that is becoming increasingly important as recent studies [100] show.
Additionally, vINCENT clearly addresses the challenges of video streaming applications
such as YouTube (see Section 3.1) and it is embedded as a key incentive mechanism in
the end-user scenario. The relevance of a P2P-based content distribution mechanism is
further confirmed by recent approaches such as BTLive [15] and Akamai NetSession [6].
vINCENT acts as a traffic management mechanism for multi-source media streaming
incorporating an incentive mechanism allowing for the consideration of social data. A
stable backbone of the system is built by user-owned nano data centers (UNaDas),
helping in disseminating the content (see Figure 38). The system builds upon the rich work
in the area of Peer-to-Peer streaming making use of scalable video codecs [105]. As an
incentive, a reciprocal mechanism is employed.

Public


Figure 38: vINCENT Infrastructure
Content is pushed to the system using a gateway provided by the Internet Service
Provider or an Application Service Provider. This gateway offers a cloud-like interface for
Application Providers willing to distribute their content using the UNaDa platform, thus
allowing UNaDas to be used as an alternative cloud/CDN concept (Figure 38).
The system mainly addresses the global service mobility scenario and the social
awareness scenario.
Global service mobility: The global service mobility scenario covers two main
types of mobility, vertical and horizontal service mobility (see D1.2). The vINCENT
mechanism is based on UNaDas and allows bringing content very close to the user.
Thus, vINCENT can realize vertical service mobility.
Social awareness: The efficiency of the solution is optimized by exploiting social
information, on the one hand as a source of trust, on the other hand as a source of
information on content that is likely to be consumed by the end-user (prefetching).
Besides the mainly addressed scenarios, there are also aspects of other scenarios, which
are covered by the solution, e.g., vINCENT uses the UNaDa platform, which is expected to
be more energy efficient than a content distribution scheme based on centralized
infrastructure [118].
vINCENT treats the user owned NaDas (UNaDa) as a platform for content distribution. The
resources of UNaDas can be reserved via some cloud-like interface offered by the Internet
service provider (ISP) to application providers offering content, e.g., Vimeo or YouTube.
The proposed architecture is shown in Figure 38.
Public


Figure 39: Virtual Node concept of vINCENT

As a contribution of resources by all users NaDas is needed in the system, a Tit-for-Tat
style approach is used, where any node can only consume as much data as he contributes
to the system. This mechanism is extended for the support of virtual nodes as depicted in .
A virtual node represents a number of nodes trusting each other, where only inter- virtual
node contributions have to be balanced, while intra-node communication may be
unbalanced. Node instances can appear as helper instances creating credit for other
members of the virtual node or as data sinks, consuming the credit of helper instances.
The socially enhanced virtual Tit-for-Tat mechanism is understood as a distributed traffic
management mechanism. By employing vINCENT, traffic can be load-balanced based on
social relations.
The concept of virtual nodes is introduced for two purposes:
Compensation of resource heterogeneity: Bandwidth and energy resources are
not distributed uniformly in the system, i.e., UNaDas will have varying bandwidth
and there will be mobile nodes participating in the system. A user can compensate
for lack of resources by utilizing the resources of, e.g., a cloud instance or a UNaDa
owned by the user, which participates in his virtual node.
As a primitive to project social relations to the system: The utilization of social
relationships needs a mechanism, which reflects these relations on the network
layer. The concept of virtual nodes can be used to cluster nodes according to social
data. If a membership in multiple virtual nodes is allowed, there is additional space
for optimization, as the complete social graph can be spanned with virtual nodes.
Besides that, social data can be used to improve the performance of the system by
applying prefetching strategies to the content. As Tit-for-Tat as an incentive scheme has a
slow startup time caused by the necessity to randomly unchoke new peers in order to
make an entry to the system possible, social data can be used to prefetch parts of a file in
advance, e.g., files viewed by a node close in the social graph.
A number of influence factors are key performance drivers of the system:
Public

Availability of social data: A crucial factor for the performance of the solution is
the availability of social data. vINCENT assumes social information to stay with the
end user in order to preserve his privacy. However, the user can utilize his social
data to increase his own performance gain from the system.
Bandwidth constraints: As vINCENT is mainly dealing with bulk data delivery, the
availability of upstream bandwidth is a crucial factor on the UNaDa. Due to the
asymmetric nature of Digital Subscriber Lines, there might be a need to allow a
fallback using the UNaDa gateway to substitute missing bandwidth.
Content Popularity: The solution will work more efficiently for popular content, as
more users will show interest to contribute bandwidth. This effect can be mitigated
by prefetching strategies and by letting the UNaDa gateway owned by the provider
act as a fallback solution.
In order to measure the performance of the system, a number of key performance indices
are needed. These factors can be divided into two types of indices, the SmartenIT specific
metrics and general metrics.
Among the SmartenIT specific performance metrics is energy efficiency and efficiency of
incentive, i.e., the price of anarchy compared to the efficiency of a globally optimal
solution. The metric of energy efficiency is to be compared to the efficiency of
disseminating the same content using a central data center. A theoretical comparison as
conducted in [118] can be performed as a first step. However, in a later stage of the
project, the comparison can be performed using energy models or doing live
measurements in a test bed.
Moreover, the global service mobility scenario includes the possibility of using WiFi for
offloading. This affects the energy efficiency of mobile devices, which do not have to use
energy expensive cellular technology. The question of fairness and incentive efficiency is
tied to the Quality-of-Service and Quality-of-Experience a user can draw from the system
depending on his contributions to the system and his available resources.
The non-SmartenIT specific performance metrics are the general metrics applying for
multi-source streaming systems. Quality-of-Service and Quality-of-Experience can help to
judge the general performance of the system under artificial and real-world workloads.
Moreover, the utilization of NaDa resources is a factor to be considered. Utilization should
be distributed fairly among the UNaDas. The available bandwidth resources should be
utilized when not in use by the user, while freeing them immediately, if they are needed for
other purposes.
Previous work done in the area of multi-source streaming show these systems to be able
to provide reliable service, especially in a scenario that can be considered stable regarding
the availability of nodes in the system, such as in the UNaDa scenario.
Public


Figure 40: Measurement of existing P2P streaming system.

Figure 40 shows Quality-of-Experience [57] measurements in a P2P live streaming system
under a varying fraction of present non-cooperative nodes The system measured is
swarming based and a Tit-for-Tat based incentive is deployed. The overall applicability of
the incentive scheme is tested by showing that the scheme can separate non-cooperative
nodes efficiently by preserving a good playback quality for those nodes, who are
contributing to the system, while punishing peers who do not. In the absence of free-riders,
a nearly perfect MOS can be achieved.
As discussed above, vINCENT will allow to build virtual nodes in such an incentive
scheme, which is expected to further stabilize resilience against non-cooperative nodes.
Moreover, the utilization of UNaDas may lead to an even higher performance, as a higher
stability of these nodes is expected.

Figure 41: Energy Efficiency of end-devices.
Moreover, we have evaluated the potential for saving energy of end-user devices along
two axes:
Public

Offloading between different wireless technologies
Preventing upload of data by preventing mobile nodes from contributing data using
a helper instance
For this purpose, we measured the energy efficiency of a mobile phone while
performing up- and downloads at the same time with varying data rates (see Figure
41). We conclude that offloading has considerable potential for saving energy (~ 450
mW), while the prevention of upload on a mobile phone accounts for roughly 100 mW.
In the best case, vINCENT can enable offloading and the prevention of upload at the
same time, thus increasing energy efficiency for end-users.
The vINCENT mechanism spans the domain of the end-user premise and the ISPs
network. Thus, the mechanism is clearly focused on the end-user focused macro scenario.
vINCENT uses the overlay management component on the NaDa box to provide sufficient
incentives to share resources among UNaDa owners.
The ultimate goal of vINCENT is the coupling of received system performance to
contributed resources of a user (reciprocal incentive scheme). We describe a sample
instantiation for a video on demand streaming service based on UNaDas as a technical
infrastructure. The ISP offers a gateway to initially place videos in the system.
Assuming a certain video is to be placed in the UNaDa system, as nodes want to prefetch
the video, because their user is likely to watch the video soon (e.g., based on social data).
Thus, the ISP will act as a seeder in a BitTorrent-like swarm; the interested nodes will act
as leechers.
If reciprocity as in BitTorrent is assumed, mobile nodes participating in the system will
suffer from bad playback performance, as they cannot contribute a sufficient amount of
resources. vINCENT allows mobile phones to seek the assistance of UNaDas (e.g., of a
friend) while downloading. Both devices form a virtual node, where the mobile phone
downloads the video, only, while the UNaDa performs the upload. Thus, the mobile phone
can receive good service without sacrificing resources.
4.10 ALTO-driven Application Quality Aggregation System
The described mechanism gathers QoE information on a set of selected Application
Endpoints (AEPs) such as video servers, based on QoE measurements collected from a
set of selected application client or user Endpoints (UEPs). The particularity is that the
Application QoE Monitoring System (AQMS) can optionally be coupled with the ALTO
Server of an ISP hosting some of the AEPs and the UEPs. ALTO drives QoE
measurements and their aggregation on selected (AEP, UEP) pairs, ultimately putting
abstracted QoE Cost values in an ALTO Server, gathering thus abstracted information on
both the transport and application layer in a standardized way.
AQAS does not directly drive content placement or application redirection but is a possible
add-on enriching AEP selection with QoE information. Both ALTO and QoE measurement
functions can be used separately as parts or enablers of the SmartenIT architecture.
The application QoE monitoring system (AQMS) can work in a scheduled or on-line mode.
A popularity based filtering function adds reliability to the measurements. The reliability of
Public

the resulting ALTO QoE information is based on both the measured activity and the use of
an updated ALTO Cost Map or an ALTO Endpoint Cost Map. The ALTO Server may
choose to provide QoE Cost values on either pairs of individual source/destination
Endpoints, or on Cost Maps with cost values among groups of Endpoints, noted PIDs.
Example applications include: video download and streaming, virtual applications involving
several physical endpoints performing an application.
AQAS is an innovation that brings added value to both QoE monitoring and the ALTO
Service: on one hand, it adds QoE awareness to the ALTO Endpoint Cost Service to serve
the needs of customers of overlay networks such as content providers; on the other hand,
it provides ISPs with a view on overlay network owned AEP allowing them to avoid
needless use of network paths leading to congested endpoints.
The mechanism provides improvements to the following scenario components:
Global service mobility: application providers may use the QoE and network level
costs provided by the ALTO service to adapt the placement of content among caches
of equivalent interest.
Social awareness: the mechanism loosely uses social awareness in that it captures
the rate and number of user connections to an endpoint.
ALTO aims at improving the QoE of the application while reducing resource consumption
in the underlying network infrastructure. To this end, ALTO Servers deployed by ISP,
provide requesting ALTO Clients with information that affects the performance and
efficiency of the data transmission between the hosts. Such information currently includes
the ISP-centric view of the network topology and associated routing costs, application
endpoint attributes such as their routing cost or connectivity type. ALTO Servers are meant
to provide information that cannot be gathered via end to end real-time like measurement
tools. End user QoE metrics such as application packet loss, application flow jitter (e.g.,
video jitter), freeze or re-buffering, bitrate, content access time, objective content quality
measures are often considered out of the ALTO scope as they can be measured on-line
by individual tools embedded in end systems and lack generality. However when there is a
significant number of samples captured at carefully chosen places and times, their
aggregation becomes a reliable AEP value indicator. To keep ALTO generic and
confidential, a selection of QoE metrics should be exposed in an abstracted form, to
represent AEP QoE costs so as to guide applications w.r.t. their QoE needs. Last,
virtualized applications run on sets of diverse physical machines on which with QoE
performance needs to be expressed in a harmonized way. Orchestration systems cannot
afford doing pervasive on-line measurements and analysis to select them and at the same
time they need to rely on a robust set of pre-computed QoE values.
The system integrates mechanisms to:
Provide an ALTO guided automation of the application QoE monitoring mechanism
used to populate an ALTO Server.
Provide means for the ALTO protocol to select AEPs such as CDN or DC servers
w.r.t end user QoE specific metrics.
Public

Provide reliable QoE metric values associated to an AEP (thanks to aggregation
and filtering of collected QoE measurements).
The resulting system is illustrated in Figure 42, where AQMS stands for Application Quality
Monitoring System. DB stands for Data Base. In this figure, it is assumed that an ALTO
Server provides QoE values on individual AEPs. That is, AQAS populates an ALTO Server
with QoE costs when the latter are provided via the ALTO Endpoint Cost service.

Figure 42: Application quality aggregation system for an ALTO guided population of ALTO
Endpoint QoE Cost values.
The ALTO QoE Cost values are gathered by the AQMS that is deployed on a number of
User Endpoints (UEPs) on which it captures the QoE perceived when a UEP is connected
to a given AEP (e.g., a video server). The measurements are gathered in an AQMS DB
associated to the AEP. The AQMS then filters authorized metrics that can be unveiled by
the overlay network provider to be integrated with ISP managed ALTO information and
sends their values to a central ALTO dedicated AQMS database. The QoE EP results are
filtered then aggregated in ALTO QoE Costs and sent to the ALTO Server. These
operations are orchestrated according to rules set by the ALTO Server Management Entity
(AOSME). The AQMS is triggered either on-line or in a scheduled mode.
AQAS involves the following components:
- An Application QoE Monitoring System (AQMS): performs QoE measurements on
selected application Endpoints (EPs).
Public

- 2 ALTO Agents: are functions that: (i) know the semantics of ALTO and of the
functions using ALTO information, (ii) pass information and requests between these
entities (iii) embed an ALTO Client or communicate with an ALTO Server data base.
o ALTO Agent 1: manages the AQMS requests. It gets the identity of source-
destination (S,D) EPs on which to monitor the QoE via its associated ALTO
Client, by downloading the EP QoE Costs map. Given rules and a frequency F1
set by the ALTO Server management entity (AOSME), it triggers AQMS
requests on the (S,D) pairs of EPs. EP AQMS databases must send the
requested QoE measurements to the central ALTO dedicated AQMS database.
Note that an alternative system architecture is when Agent 1 triggers directly the
AQMS clients for sending their QoE measurements to the central database. For
example, it can be the case when the AQMS system is composed of the AQMS
clients and the central ALTO dedicated database.
o ALTO Agent 2: transcripts the QoE measurements that are gathered in the
central AQMS database in ALTO semantics.
It generates ALTO QoE information only for the EPs registered in the
ALTO network map that it downloads periodically.
It aggregates the application QoE measurements over time and space in
a non-real time abstracted form of collected ALTO QoE Costs values so
as to protect the confidentiality of operator performances.
The aggregation rules are set by AOSME
The QoE Cost values are sent to the ALTO Server under the condition
that they comply with a set of AOSME reliability rules.
These rules are based on parameters including: number of QoE
measurements, number of application flows, popularity/hit rate,
and date of measurement.
The resulting ALTO QoE Cost values are sent to the ALTO Server at a frequency F2
specified by the AOSME. Note that F1 is not necessarily synchronized with F2.
AQAS has thus the following characteristics:
Decisions are taken at several levels: (i) the QoE measurement decisions are taken
independently by the AQMS. (ii) ALTO Agent1 knows via the ALTO Server from which
pairs (UEP, AEP) to capture QoE measurements to be filtered W.R.T reliability. (iii)
ALTO Agent2 then knows when to upload abstracted aggregates to the ALTO Server.
There are 2 centralized decision taking points: (i) the AQMS schedule configuration, (ii)
the AOSME provides timers for ALTO Agent1 to trigger requests to send QoE
measurements to the central ALTO dedicated AQMS database and to ALTO Agent2 to
process applicable QoE results and send them to ALTO Server.
The decision scope covers the set of AEPs of the Overlay network monitored by the
AQMS and at the same time covered by the QoE aware ALTO Services.
AQAS can assist demand prediction: it captures popularity and hit rate and can use
statistics.
Typical QoE metrics captured by AQAS are: application packet loss, freeze, content
quality and jitter.
The performances of AQAS are influenced by the following factors:
Public

The ability to collect enough QoE data with significant statistics: : depending on the
demand of end-users on applications,
the ability of applications to offer a choice among several AEPs,
the ability of ALTO servers to support an abstraction of QoE metrics in addition to
the usual ALTO metrics that are rather transport network related.
The benefits of using AQAS can be evaluated w.r.t. the following key performance metrics:
The QoE of end-users: in terms of e.g., user throughput, download delay and
application specific QoE metrics such as flow freeze for video streaming,
Network provider transport costs: in terms of monetary routing costs and network
resources consumption,
Application network provider performances such as number of serviced requests per
time period.
At this point, there is no evaluation result related to the AQAS Mechanism. A prototype of
the QoE monitoring tool, called VITALU is under development. VITALU measures QoE on
video streaming flows.
An important optimization potential, when VITALU associated with ALTO, is the hidden
usage by an ALTO Server of QoE values: that is the ALTO Server is not required to
explicitly provide QoE values but can use them to tie-break or further rank AEPs with
similar topology cost values, e.g., equal number of hops. While preserving confidentiality
on overlay network performances, an ALTO Server may indicate to its Clients that it uses a
given QoE metric to tie break.
4.10.6 Mapping of AQAS to SmartenIT Architecture
The AQAS system comprises two functional blocs:
- A QoE monitoring system that is deployed by the application network provider
(ANP), that is a cloud provider or a CDN provider in (i) application delivery nodes
(ADN) located in Data Centers, (ii) in the QoE manager to schedule and place QoE
measurements and analyze the results and (iii) in end-user systems that trigger an
application on which QoE is tested.
- Optionally an ALTO-QoE Client-Server bloc that is deployed in an ISP network
where the ANP has ADNs on which the ANP wants to make ISP topology-aware
QoE assessment to help an ISP making QoE aware topology based application
guidance and/or the ANP placing contents with additional ISP awareness.
4.10.7 Example Instantiation of AQAS
In the instance of AQAS that is being implemented, the QoE monitoring system is a tool
called VITALU that measures the QoE of a video stream. The primary goal of the tool is to
evaluate/monitor the QoE for identifying multimedia delivery issues. VITALU supports
multiple video streaming protocols such as HTTP Live Streaming, RTP/RTSP, RTP
Multicast, Flash HTTP.
Public

AQAS is triggered by the ANP on dedicated end systems that test performances
representative of a given end user location area. The ANP specifies the schedule and
spatial placement of the QoE measurements. VITALU measurements are thus no
permanent online processes and neither do they generate a QoE degradation alarm.
The VITALU function deployed on end systems connects to video delivery servers that are
each associated to one or more video caches.
VITALU has the following main functions for each monitored video flow:
Collect/Analyze IP network info (Packet Lost, Jitter, Delay, Timestamp, Sequence
number),
Compute video quality metrics (Video Quality Score VQS, ESR(%Freeze),
ESR(%Frame Error))
Provide frame by frame analysis, with jitter buffer simulation for RTP protocol.
The metrics measured by VITALU include: DateTime Capture (s), Date, Spot#, Clip Name,
Resolution, Operator, Session ID, RTT Avg, RTT Std, RTT Max, Video Access Time (s),
Video Duration (s), Stream Duration (s), Video Bitrate (bps), VQS Avg (Real), VQS Avg
(Theor.), VQS Std (Real), VQS Std (Theor.), Frame Error ESR5_20 (%), Freeze ESR5_20
(%), ESR5_20 Nb Windows, Nb of Freezes, Freeze Duration (Avg) (s), Freeze Duration
(Max) (s), Nb of Error Frames, Nb. of Packets, Packets Retransmitted (%), Nw Video Jitter
(Avg) (s), Nw Video Jitter (Max) (s), Nw Bitrate (Avg) (bps), Nw Bitrate (Std) (bps), Late &
Err. Frame Ratio (%), Frame Rate (Avg) (fps), Frame Rate (Std) (fps), Nb of Frame
Resolutions, Frame Resolution (px).
When the ALTO option is activated, the restricted set of VITALU metrics that are sent to
the ALTO dedicated AQMS database, attached to the VITALU System, for abstraction into
aggregated ALTO QoE metrics include mainly: Freeze or re-buffering, Impaired frame
ratio (Frame error ESR5_20 (%), or Late & erroneous frame ratio), application packet loss,
video quality score (VQS), flow jitter (video jitter).Only VITALU QoE measurements that
are authorized (usable) by ALTO are sent. Before sending an ALTO QoE metric value to
the ALTO Server, ALTO Agent 2 verifies its reliability against parameters such as number
of QoE measurements, number of application flows, popularity/hit rate, and date of
measurement.
4.11 Multi-Criteria Application End-Point Selection
MUCAPS is a functional block performing a selection of Application Endpoints (AEPs) with
awareness of the underlying transport network state, the application type and the access
type of the connected end user. It is used for applications that offer a choice among
several possible endpoints initially selected by the application network provider that can be
a CDN or a cloud provider. AEPs are application resources locations such as data centers,
video delivery nodes or virtual machines.
One key function of cloud traffic management is to redirect the user request to another
data center or NaDa. This is usually done by the overlay network upon its own priorities
but without any awareness of the underlying transport network. Some ISPs investigate
mechanisms that revise the initial AEP choice with additional information on their network
topology and connection history. An example approach published in [94] enriches initial
AEP sets produced by DNS and further ranks the AEPs w.r.t. information provisioning
tools such as ALTO. MUCAPS builds upon such revision mechanisms by adding the
following features:
Public

Usage of a proposed ALTO protocol extension that introduces metrics abstracting the
transport and overlay network state,
Associated robust vector-based AEP ranking function jointly using several metrics,
Addition of QoE impacting metrics such as abstracted path bandwidth score, to the
traditional ALTO routing cost and hop count metrics.
Automation of the choice of the AEP evaluation metrics, based on QoE requirements,
Automation of the metric weight tuning, based on user access type awareness,
Possible time sensitivity of metric values.
The MUCAPS mechanism provides improvements to the following scenario components:
Energy efficiency: energy can be saved at the transport network level by selecting
shorter data paths, avoiding congested paths and at the application network level, by
avoiding congested content resources locations.
Global Service Mobility: given end-user location, MUCAPS results can be used by an
end system on behalf of a cloud provider to assist the latter in services placement w.r.t.
user location.
Social awareness: if social related metrics are integrated in the AEP selection metrics.
Indirect: when costs values are time-dependent.
4.11.2.1 MUCAPS Motivation
The growth of QoE demanding cloud applications stresses the need to:
complete the usual application network provider selection of AEPs with a transport
network topology based selection decisions to optimize transport resources usage,
Provide mutual usage incentives by adding QoE impacting metrics reflecting for
instance bandwidth and delay capabilities that meet the interests of applications, to the
routing cost metric that meet the interests of the ISPs,
Deploy the resulting layer cooperative selection mechanism at the transport network
control level in order to meet interests of both ISPs and application or cloud providers.
MUCAPS can achieve this by using the ALTO protocol that guides applications on the
selection of their AEPs based on information abstracting the transport network topology.
MUCAPS integrates automated mechanisms selecting the ALTO metrics that evaluate the
AEPs and adapting their influence with respect to the User Endpoint (UEP), because of
the high rate of AEP selections to process and for the following reasons:
the multiplicity of application metrics needed for a reliable AEP selection,
the variation of network conditions in time, that induce a variation of the needed
selection metrics and their weights,
the diversity of user equipment and their related access capabilities,
todays increase of inter access technology handovers either for 3G offloading
purposes or due to user mobility.
Public

4.11.2.2 MUCAPS Overview
The functional blocks of MUCAPS are illustrated in Figure 43. MUCAPS is hooked to an
ISP managed AEP Gathering (AEPG) function such as a DNS Resolver that provides an
initial selection of AEPs. MUCAPS captures the initial selection and/or ordering of the
AEPG function and revises it according to ISP based metrics and network usage policies.
The ISP based information is provided by the ALTO Service that abstracts the ISP
topology and may provide costs reflecting the ISP preferences in terms of monetary cost,
bandwidth usage, hop count and other QoE impacting metrics. As a result, the AEPs are
selected so as to jointly optimize interests of application network providers and ISP.
MUCAPS is a multi-party incentive mechanism as it jointly addresses issues of the ISP
seeing its resources wasted by needless connections to sub-optimal or remote AEPs, the
end users having therefore poor QoE and of the application provider observing customer
frustration. MUCAPS implements two proposed ALTO protocol extensions that propose a
richer set of metrics and allow time sensitivity of ALTO cost values. Above all, it automates
the choice of application metrics and weights w.r.t. the AEPs and further metric weight
tuning w.r.t. the UEP connection.
The MUCAPS functions are integrated in a prototype that appends to the AEP gathering
function (AEPG) an ALTO Client augmented with Multiple Metric capabilities and a vector
based AEP ranking function. Two mechanisms for automated metric and weight setting
are located in a functional block called Application Metrics and Weight Manager (AMWM).
The functional blocks of MUCAPS, as illustrated in Figure 43, include:
1. An ALTO Client supporting transactions with multiple ALTO cost metrics. An example
metric is the bandwidth score, a scalar in range [0, N] to be maximized, whose value
increases as there is more available path bandwidth, and that does not unveil the real
available bandwidth value of the ISP network.
2. A metric selector located in the AMWM: it is an AEP centered decision mechanism that
automatically maps a selection of evaluation metrics to a selection of candidate AEPs,
among a set of metrics supported by an ALTO Server, considering the type and QoE
needs of the application and optionally the time.
3. A metric weight tuning function located in the AMWM: it is a UEP centered decision
mechanism that automatically tunes the influence of the already chosen AEP metrics
weight by considering the network access type of the UEP and optionally the time.
4. A Multi-Criteria EP Evaluation function performing vector-based AEP ranking.
5. An AMWM configuration setting function that: activates or deactivates automation of
metric selection and weight selection, uploads and updates the set of configurations
used in the AMWM look up tables.
The AEPG in our deployment case is an open source DNS resolver, presumably managed
by the ISP as it is able to map UEP addresses with access type, contrary to public DNSs
such as Open DNS or Google DNS.
As one can see, the decision on the metric weights is made in two stages:
- The initial choice made by mechanisms one w.r.t to the AEP intrinsic QoE needs,
- A multiplying factor applied to the previously selected weight w.r.t. UEP context.
Public


Figure 43: example deployment of MUCAPS: Multi-Cost ALTO Client block integrated in
an ISP DNS resolver and coupled with (i) an automated Application Metric Mapping
function (ii) an automated metric weight tuning function.
4.11.2.3 MUCAPS Characteristics
MUCAPS has the following characteristics:
- The functions interacting with MUCAPS include: an ALTO Server, an ALTO Client, an
AEPG function and an application quality metrics configuration controller.
- MUCAPS decides which AEP(s) are selected to host or provide application resources, ,
which metrics and weight to use for the AEP selection.
- The decision taking point is thus centralized at the ISP resolver-level.
- As for the decision scope, the decision is taken for each user end system and each
related application endpoint, w.r.t a view of the transport access and core network.
- Connectivity is considered in that the user connection type impacts the metric weights.
- MUCAPS uses ISP network metrics such as hop count, routing cost and abstracted
path bandwidth, cloud overlay metrics such as abstracted AEP bandwidth.
- If AQAS is supported, MUCAPS may use QoE metrics to further refine AEP selection.
- If time sensitivity of ALTO Cost values is supported, the variation of network conditions
due to predictable social activity patterns can be reflected,
- Time sensitivity can also be used to configure the set of metrics or their weight.
An example usage of time sensitivity by the AMWM is that when the UEPs network is
busy, it is necessary to augment the weight factor of the path bandwidth scores (PBS)
whereas during off peak hours this factor may be lowered. Moreover, the controller may
decide that the PBS is not used in off peak hours.
Public

The ALTO Cost values sensitivity relates to social activity at the AEP side and besides,
may be very useful to integrate the time shift for cloud applications involving physical AEPs
located in different time zones.
The performances of MUCAPS are influenced by the following factors:
The ability of the AEP gathering (AEPG) function to maintain a reliable diversity in the
choice of AEPs: the ability of the DNS resolver to support a reliable choice of candidate
AEPs needs to be further studied, together with other applicable ISP cooperative
AEPG functions,
The ability of the ISP ALTO Server to support appropriate cost metrics: it is likely
however that ISPs willing to implement MUCAPs will also implement the support of
appropriate metrics in their ALTO Server.
The performances of MUCAPS can be evaluated w.r.t. the following key performance
metrics:
The QoE of end-users: in terms of, e.g., user throughput, download delay and
application specific QoE metrics such as flow freeze for video streaming,
Network provider transport costs: in terms of monetary routing costs and network
resources consumption,
Application or overlay network provider performances such as number of serviced
requests per time period, customer satisfaction represented by opinion scores.

Figure 44: Prototype and example scenario for MUCAPS-based AEP selection.
Public

MUCAPS is a functional block that is hooked to an ISP managed Application Endpoint
Gathering (AEPG) function that identifies candidate application endpoints (AEPs). It thus
deployed in an ISP network and ends up redirecting a user request to another data center
or nano data center. It thus acts for the Cloud Traffic Manager and comprises:
- A Multi-Cost capable ALTO Client getting information on the AEPs from an ALTO
Server deployed in the ISP network
- A vector based AEP ranking function F.
- An Application Metric and Weight Manager (AMWM), automatically selecting the
ALTO metrics to request and weights to use by F. This function interacts with a
configuration module that activates or deactivates automation and maps metrics to
application classes.
In the instance of MUCAPS that is being implemented, the AEPG function is an ISP
managed DNS Resolver. The Multi-Cost ALTO Client block is integrated with an ISP DNS
Server following well known integration schemes. An ISP may choose to run the ALTO
client itself as the overlay network chooses the AEPs (whether with ALTO or not)
independently of the ISP priorities and does not always have access to detailed ISP
information.
Figure 45 gives an example functional scenario where a user performs video streaming,
the DNS resolver identifies candidate AEPs that are ranked by MUCAPS w.r.t. 2 metrics.
The mapping between (i) AEPs and ALTO metrics and (ii) UEPs and metric weights is
done with look up tables (LUT) containing configurations set via the AMWM function w.r.t.
to QoE needs and possibly time.

Figure 45: video streaming application with 3 candidate AEPs and 2 metrics.
Public

This mechanism is cross-layer and incentive-compatible. It provides network layer
awareness to the application layer. It provides mutual incentives to the ISP needing users
to connect to AEPs in a network resources friendly way and users who in exchange of
minimizing ISP routing costs would get informed of performances related to, e.g.,
bandwidth and an AEP selection that best fits their device.
A prototype is under development: MUCAPS is integrated with an open source ISP
managed DNS Server. End users are located in a multi-access network and run video
streaming from multiple delivery servers, currently audio and video applications. An
example scenario ran with the prototype is illustrated in Figure 44.
The ability of the DNS resolver to support a reliable choice of candidate AEPs needs to be
further studied, together with other applicable ISP cooperative AEPG functions.
4.12 QoE and Energy Aware Mobile Traffic Management
The QoE and Energy Aware Mobile Traffic Management is a mechanism focused on
improving the QoE and at the same time reducing the energy consumption on mobile
devices by intelligently scheduling network traffic generated on the mobile device.
Offloading traffic using WiFi networks can save between 75% and 90% of the energy used
for network transmissions compared to 3G connectivity only [41]. At the same time, the
risk of network caused stalls may be reduced by increased data rates on WiFi, improving
the QoE of the end-user. The traffic management proposed here is based on QoS maps,
user mobility prediction, energy models, and QoE models. The main contribution is the
scheduling of traffic generated on the mobile device to different connections or locations
with the target to improve the QoE of the end user while reducing the energy consumption
of the mobile device. At the same time, traffic on the cellular network is reduced and/or
equalized over time and space, which reduces congestion and thus allows satisfying more
users with the existing infrastructure. This is also clearly advantageous for the mobile
network operator, as network upgrades may be deferred. The traffic management is
conducted on the mobile device only, eliminating the need of network support and network
communication. However, collected data can be made available to the network to optimize
content placement and reduce the energy consumption within the core network. At the
same time, network support may be helpful in creating a network quality map, which is
used to determine suitable combinations of location, time and network availability to
offload traffic from the cellular network.
The described mechanisms address the following scenarios:
Global Service Mobility
Collaboration for Energy Efficiency
The GSM scenario specifies traffic offloading using WiFi access points, reducing the load
on the cellular network and likely improving the energy efficiency. This mechanism
provides the means to dynamically offload the traffic controlled by the mobile device.
Offloading might be achieved by using public WiFi access points, restricted WiFi networks
(i.e. public networks only available to registered users), or private networks. It further
allows shifting the generated traffic in space and time, such reducing the load on often
congested links. Combined with the concept of UNaDas, which are part of the GSM
Public

scenario, this mechanism allows the download of content pre-fetched on UNaDas by
automatically connecting to the provided access points.
In the Collaboration for Energy Efficiency scenario, the proposed mechanism supports the
offloading of mobile traffic, reducing the power consumption of the mobile device. The
mechanism can also be extended to allow input from the network to control the timing of
network transmissions to reduce the energy consumption in the overall network as well. As
one of the components of the architecture is the energy model and the energy analyzer,
accurate power estimates can be made available to a central server. This server can then
create an overview of the network and propose optimizations. In addition to that, the
network quality parameters (QoS) are measured and the QoE is derived and may be
transmitted to a central authority. A possible extension of the proposed mechanism is to
make content downloaded by the device available to close-by devices using ad-hoc
connectivity or WiFi access points, hence reducing the demand on cellular infrastructure
and possibly also saving energy.
Architecture: The mechanism is based on the architecture proposed in [68]. The main
component is the Network Optimizer, which is given in the large, green box. The approach
is a combination of well-known components targeting the improvement of the perceived
network quality. The components to be developed or modified are
Request analyzer
Network quality model
Energy models
QoE model
Network action recommender
These are the components drawn with green color in the architecture diagram in Figure
46. The boxes partly contained in the Network Optimizer are existing components, which
are to be extended. To be fully functional, also some other, external components are
required. These are
Network interface
Mobility prediction
The Network Optimizer must be able to control the network interface; hence control
functions for disabling/enabling individual interfaces as well as deferring network
transmissions are required. The mobility prediction must be able to provide user mobility
predictions in the range of minutes based on location data, which may be provided by the
Network Optimizer or the mobile operating system.
The information flow is the following: Incoming network requests are intercepted at the
network interface. The intercepted request is then sent to the request analyzer, which
determines the QoS requirements of the incoming requests. The analysis is based on the
originating application as well as the ports of the connection. If the service requirements
are unclear, it is also possible to request input from the user to determine the
corresponding QoS requirements.
Public

Mo5ilit #rediction
/ser ID #rediction
%nerg Model 1et.or2
Model
3o% Model
3oS %nerg
1et.or2 Action
'ecommender
1et.or2 Interface
Action
#o.er Tutor (.n Models
'e6uest
Anal*er
3oS Ser&ice 'e6uest
1
e
t
.
o
r
2

(
)
t
i
m
i
*
e
r

Figure 46 Architecture of the Network Optimizer from [68]
The derived QoS requirements are fed to the network model. It contains a map of the
measured network quality in the area, which is combined with the mobility prediction to
derive a list of possible connection options. The QoS map itself is generated by
periodically probing the network or monitoring data transmissions passively. If the network
interface is saturated, the maximum throughput can be determined, otherwise only RTT
and packet loss rate can be determined.
The list of the possible connection options is then sent to the energy model, where the
estimated energy consumption for each option is calculated and appended to the
corresponding entries to the list. This list is forwarded to the QoE model, appending the
estimated QoE. The estimated QoE is calculated using a model based approach. Here, a
new model needs to be generated to include the user experience of different network
services. This model also requires the the battery usage data of the mobile device.
The Network Action Recommender executes the variant with the highest QoE. This
approach may later be extended to include more sophisticated scheduling decisions based
on the connection parameters and user feedback. The control decisions are executed by
the network interface. This might be controlled by freezing the TCP stack to avoid data
transmissions and such powering up the network interface.
Applicability: The proposed mechanism is applicable for user generated network
requests as well as background data transmissions. The request analyzer derives the
traffic requirements from the device state in combination with the originating applications
requirements to estimate the urgency of a network request. In the case of the device being
actively used, and the foreground application requesting data, the urgency is high. In the
case of a periodically occurring background data transfer, which might pre-fetch social
media information or update the application state, the urgency is low. Still, the scheduler
might execute deferred network transfers, if the connection is opened by another
application anyway. This reduces the cost of the background transfer, as the ramp and tail
energies of the mobile network connection are incurred only once. In this case, the
estimated size/duration of the connection is also of importance, as large file transfers are
still better executed via WiFi.
Public

For foreground requests, also QoE requirements must be considered. Rendering the first
elements on a web page should not take longer than one second [44], otherwise a mental
context switch occurs. Commonly it can be assumed, that the user may cancel an
operation, if the response time is larger than 10s [17]. This is a result from a study in 2000,
hence it can be assumed, that this is the upper limit and requirements may even be stricter
now.
Implementation considerations: The proposed mechanism is applicable to a large
number of applications. Still, not all QoS requirements can be determined a priori. To
support adding new services, it might be beneficial to request input from the user, if the
QoS requirements a request cannot be classified. The simplest solution would be to ask
the user for the urgency of the request and the type of service (i.e. video streaming, VoIP,
file transfer, web browsing, app-data-transfer). This can then be included in a shared
database to prevent requesting the same information multiple times.
Feedback to the user might be given to communicate the current network quality and
possible improvements on the predicted path. Similar to a traffic light, which counts down
the seconds until it turns green, a counter may be placed in the notification area, showing
the estimated remaining time to encounter the next service level (i.e. red to green in x
seconds, red to yellow in y seconds, etc.; here red stands for poor, yellow for average, and
green for good QoS). Thereby the user is aware of the estimated network conditions
before starting the network request. Based on this information, he can decide to start the
request immediately or wait for the specified time. In the first case, the algorithm uses the
available connection, and the user receives the QoE as announced in the notifications, in
the latter case, the QoE is expected to be higher.
The key influence factors for the proposed mechanism are the
Availability of wireless networks (i.e. WiFi access points, cellular networks)
Predicted mobility of the user
Current device state
The availability and quality of the WiFi and cellular networks is measured by the proposed
mechanism. To successfully offload traffic to other networks, other networks must exist
and it must be possible to connect to. Here, the social aspects introduced by HORST may
be useful to increase the number of available hotspots by sharing the WiFi connection
based on trust.
The mobility prediction, which in the proposed mechanism is an external component, must
be able to predict the mobile users path up to a few minutes to allow meaningful
scheduling of data transfers. Still, related work [85] shows that bandwidth estimation with
an accuracy of 100kb is possible in 80% of the predictions for a distance of 6 steps
(~600m).
The device state, in particular the availability of network interfaces and their performance
(i.e. maximum data rate) are of importance. If the user disables network interfaces, the
efficacy of the mechanism is highly affected by reducing the solution space to time and
space diversity only. Another factor influencing the quality of the approach is the remaining
battery capacity of the mobile device. It might be advantageous in terms of overall energy
consumption to share already downloaded content with nearby devices using ad-hoc
connections, but this could lead to depleting the battery, which results in a bad user
Public

experience. In the case of low battery capacity on the users device the mechanism may
also be extend the remaining battery lifetime by deferring background data transfers until
connected to the grid or a high quality connection is available.
The performance of the proposed mechanism can be evaluated by a number of measures.
These are the:
Energy consumption of the end-user device caused by communication
User experienced service quality/QoE
Offloaded traffic vs. cellular traffic
The Energy consumption of the air interfaces can be evaluated by measuring the energy
consumption of the device for arbitrary data connections with a physical power meter
attached. Knowing the device state and combining this with the measurements, a model of
the energy consumption of the mobile device can be generated. This energy model can be
used in the field to evaluate the gain in energy efficiency of the mechanism.
The QoE of the end user can be derived by observing the network traffic on the network
interface and deriving QoE metrics from the observations. A model to derive the QoE for
video content is described in [37]. In this mechanism also other services are of interest.
One measure is the retrieval time for a typical mobile web page or the time to the
availability of a bulk data transfer. The extended QoE proposed here also includes the
energy consumption of the device. The influence of this is to be evaluated and modeled in
future work.
Another important metric, in particular for the end-user, is the consumed traffic on the
cellular interface, as for most contracts the allowed data volume is limited. In general,
cellular data consumption is expensive. The consumed cellular data volume and the
offloaded data volume can be used as metrics for the evaluation of the proposed
mechanism.
Currently, first measurements of the cellular network coverage are available. More data, in
particular QoS data, needs to be collected to support offloading decisions. The mechanism
needs to be implemented and the scheduling decisions taken must be evaluated.
Possible extensions of the QoE and Energy Aware Mobile Traffic Management are the use
of social trust information to allow the use of WiFi access points on the way, or moving
content to UNaDas on the predicted path to mitigate bandwidth limitations on the WiFi
uplink.
The QoE and Energy Aware Mobile Traffic Management mechanism is deployed on
mobile devices only. It makes use of the Energy-, QoE-, and Network Monitor to derive
data to base the scheduling decisions on. It further uses the devices location and network
information derived by the Topology/Proximity monitor. Further, the device state and
mobility are considered when scheduling connections. The control decisions are executed
by the Switching/Forwarding component on the end-users device.
Public

The proposed mechanism works in the following way: A mobile user, on his commute, may
be reading news on his mobile phone. The mobility prediction, combined with the
knowledge of the available networks (cellular, WiFi), and their QoS at the time of the
commute is used to select the optimal network to transfer the data. If a WiFi network is
close, and the QoE browsing the web is not affected, the data may be scheduled to use
the WiFi connection. This, at the same time, reduces the battery consumption of the data
transfer by reducing transmission times. Still, the window for optimizations for interactive
applications is narrow.
A background transfer to be executed while on the go allows greater flexibility. The
scheduling duration might be extended much longer, and the connection restricted to WiFi
networks only. Here, the transmission may be deferred for minutes, possibly even until the
destination is reached. Still, if the time interval is too large, a connection using the cellular
network may be necessary. As background transfers usually dont directly affect the QoE,
this allows greater flexibility in optimizing the connectivity considering the QoE and energy
efficiency at the same time.

Public

5 Configuration and Communication Frameworks
During the definition of SmartenIT Traffic Management Solutions some technologies were
identified that do not constitute a TMS per se but provide important functionalities with a
high potential for use in TMSs specified in section 4. Three selected frameworks have
been described in the subsequent sections.
Inter-ALTO protocol specifies a communication scheme to be used between ALTO servers
located in different Autonomous Systems (AS). Such an inter-ALTO communication
extends the ALTO service capabilities and provides additional information on remote
nodes (i.e., nodes located in other ASs) that may be considered very useful in a number of
TMSs dealing with multi-domain environments.
OpenFlow- and MPLS-based frameworks can be used in any TMS that needs to directly
alter underlying network configuration. This is realized by configuring flows and LSPs,
respectively. Both solutions are particularly suitable for different scenarios depending, e.g.,
on the required management granularity.
Each framework is described by providing its specification, its potential applicability to TMS
and initial theoretical or functional evaluation results.
5.1 Inter-ALTO Communication Framework
This section describes the rationale for a communication to be used between ALTO
servers located in different autonomous systems (AS). Such an inter-ALTO communication
extends the ALTO service [106] capabilities and provides additional information on remote
nodes, i.e., nodes located in other ASes. To make the consideration more clear we
distinguish local AS and remote ASes. Local AS is the one from which perspective we
describe the communication. Local nodes are located in the local AS and are served by a
local ALTO server. On the contrary, all other nodes are located in remote ASes. Those
nodes are referred to as remote and are served by remote ALTO server. This basic
terminology adheres to majority of considerations in this section.
The motivation for the ALTO service as discussed in the ALTO problem statement [106]
focuses on the overlay traffic optimization based on information gathered from the Internet
Service Provider (ISP) domain, i.e., an Autonomous System (AS). Due to the suggested
approach, information on the AS internal topology and some routing information obtained
from the global Internet (the BGP routing tables) may be used for the node selection
procedures. The data transfer cost can be also incorporated. However, there are some
parameters which can be used for the better node selection mechanism, but they are not
available in the local AS and must be obtained from outside, i.e., from a remote AS. For
example, the BGP routing information available in the AS identifies only the upstream
traffic. The information about the downstream traffic is not present or is incomplete and the
full BGP information for this traffic could be obtained from the remote AS containing the
subnetwork where the node sending downstream traffic is attached. In order to obtain such
data, we propose to use the inter-ALTO communication framework.
It is assumed that the ALTO servers are deployed in the local and remote ASes. The
ALTO server located in the client AS can request desired information from the ALTO
server which is located in the remote AS. Each server is managed by a respective ISP.
Each ISP decides what type of information can be exposed to the requesting party. The
ALTO server responds with the type of information that was previously agreed to
Public

exchange. The information delivered by remote ALTO servers may be used by a local
ALTO server to perform advanced rating/ranking procedure of nodes.
Due to the intrinsic nature of how cloud services are composed (different stakeholders
composing and purchasing the service, creating an overlay network above the underlying
network infrastructure), Inter-ALTO techniques are particularly suited to nearly all
SmartenIT scenarios, particularly to those scenarios (Inter-cloud communication and
Global Service Mobility) which are characterized by the following key factors:
1. The need to move content or services from one location to another, thus
crossing and utilizing WAN network links owned by different ISPs
2. The need to realize a fair ISP link utilization to avoid potential congestion and
general slowness of network
3. The need to optimize the placement i.e. to find optimal location and time-slot to
operate the content movement.
The primary stimulus for content (or service) movement can vary from scenario to
scenario. In inter-cloud communication scenario the trigger for content movement is
activated by Data center Operator while within global service mobility the trigger could be
activated by End-user as well as Cloud Service provider. In particular, the following key
aspects motivate the proposal of the inter-ALTO communication framework:
Route Asymmetry: The ALTO server can obtain routing information locally (e.g.,
from BGP routers) and can determine the upstream path. Information about the
downstream path is usually not easily available. Some additional routing information
can be obtained from Looking Glass Servers, but not all ASes provide them. The
inter-ALTO communication framework provides the ability to exchange the relevant
information between ALTO servers. Especially, the downstream path can be reliably
determined using the information provided by remote ALTO server. Due to route
asymmetry in the Internet [27], such information appears to be necessary for a
better optimization of a node rating/ranking algorithm, as assumption that inter-AS
routes follow symmetrical paths can give not only sub-optimal, but misleading and,
in effect, harmful results.
Many ASes within One ISP: An ISP may possess a complex topology network
composed of many autonomous systems. Current ALTO specification allows for
deployment of independent ALTO servers in each AS. In such a case the overlay
traffic management performed by the ALTO server is restricted to a single AS since
cost maps have a local meaning. An ISP operating a multi-AS network may be
interested in managing the traffic in the whole administrative domain in a consistent
and coordinated manner. The information possessed by a single ALTO server is
insufficient. To obtain a complete knowledge on the multi-AS network a
communication between ALTO servers is needed. As a result, local cost maps
originating from different autonomous systems may be coordinated. A uniform cost
map reflecting the whole network structure may be created and distributed between
ALTO servers.
Different Types of Business Relations: An AS may be connected to many other
ASes with various agreements. The cost of inter-AS traffic transfer may differ
depending on which neighbor AS the path passes. For this reason an ISP may
prefer that its own customers exchange data with remote nodes located in such
ASes that the path directed to them passes cheaper links. The ALTO server may
sort nodes taking into account these criteria. To receive almost complete
Public

information on routing paths to and from different remote domains the information
provided by remote ALTO server using inter-AS communication may be helpful.
Congestion Avoidance: A node rating/ranking procedure may also take into
account the congestions on inter-AS links. An ISP is able to monitor queues on its
inter-domain links and assign metrics indicating the buffer occupancy or bandwidth
utilization. These metrics can express percentage use of buffers or bandwidth on a
particular inter-AS link. If one inter-domain link is congested it is desirable to
promote nodes reachable through lightly loaded links. Again, information provided
by the remote ALTO server would support such optimization. The aim of the inter-
ALTO communication is not to replace the existing congestion avoidance
mechanisms. The idea is to support the present mechanism by the exchange of
parameters describing the load on inter-AS links.
Proximity Awareness: For a set of reasons (e.g., the performance of an
application) the ALTO server may suggest its customers to connect to remote
nodes located in its proximity. The simplest measure of proximity is the number of
inter-AS hops. As indicated above, due to the route asymmetry, the number of hops
may significantly differ between the upstream and downstream paths. Such
information for the downstream path may be provided by the remote ALTO server.
A more advanced metric of proximity can be found in the delay that can be
approximated by exchanging messages between ALTO servers. The ALTO servers
can be equipped with an application-layer ping functionality, which only operates
between ALTO servers. By exchanging special packets prepared by the ALTO
servers, these servers can estimate delay and packet loss.
Remote ISP's Preference: If two ISPs agree on cooperation, the remote ALTO
server may provide its preference parameters (remote preference parameters)
indicating which nodes are better from the point of view of the remote ISP. For
instance, the AS in which the remote ALTO server is located may possess two
subnetworks connected to the operator's core network by distinct links. It may
happen that a connection to one of the subnetworks is cheaper than the other. The
remote operator may prefer connections through cheaper link, so nodes located in
the subnetwork transferring data via this cheaper link are preferred. The remote
preference parameter may be also used when a remote ISP wants to suggest
nodes which are connected to the Internet through access links of higher capacity.
This way, the remote ALTO server, without exposing the exact values of access link
bandwidth, may indicate nodes with higher throughput. The remote preference
parameters have only local meaning, i.e., their values are comparable for nodes
located in the same AS only.
Coordination of ISPs' Policies: Operators may have an incentive to coordinate
their efforts in order to decrease transfer costs on inter-AS links or improve quality
experienced by nodes, i.e., coordinate their node rating/ranking strategies. This
way, operators may avoid contradictory strategies resulting in inefficiency of
rating/ranking algorithms. Operators may agree to promote each other's nodes.
Sensitivity of Topology Information: The minimum information that the remote
AS provides to the local ALTO server via the inter-ALTO communication may be the
number of inter-AS hops and the number of the local AS's neighbor in the
downstream path (the full downstream AS_PATH may be not exchanged). Such
information does not reveal any sensitive information neither on the ISP internal
Public

topology details nor remote AS connections with other ASes, but does provide basic
and very useful information for the local ALTO server.
Out-dated Information: It is expected that some information (parameters) from
routing protocols that will be used in the rating/ranking procedures may outdate.
Also information related to the network performance is constantly changing.
Therefore, the information obtained from the remote AS requires updates. This
updates may be generated on request (pull mechanism), on event base schema or
periodically (push mechanism). The inter-ALTO communication should be equipped
with mechanisms for updates. The need for the present information describing
network conditions and some routing parameters are arguments for introducing
specific protocol for the communication between ALTO servers.
Mobile Networks: The inter-ALTO communication may be very useful for mobile
network operators and content providers serving mobile clients. An ALTO server
may recognize mobile clients and properly assign them to PIDs. Some information
about the mobile network resources gathered from mobile network nodes located in
different networks should be exchanged between operators for better then random
node selection. ALTO servers should possess information which allows to make
proper node selection, taking into account, e.g., the mobile network load (including
the load in the radio access network and in the circuit- and packet-switched
domains).
Route Aggregation: The BGP protocol allows the aggregation of specific routes
into one route. In such a case the aggregate route is advertised. The full path is
either lost completely or the AS set information is available. In the latter case only
the set of ASes behind the aggregating router is known but the detailed information
about the routing path, including AS sequence and AS-hop count, is lost. From the
overlay traffic optimization point of view the knowledge on ASes located behind
aggregating router and the number as well as sequence of inter-AS hops may be
useful, e.g., because of route asymmetry problem described earlier. The solution for
this problem is information exchange between ALTO servers located in ASes ahead
and behind the router aggregating routes.
5.1.1 Specification of the Framework
The architecture of the Inter-ALTO communication framework is shown in Figure 47. Both
ALTO servers gather the information from their information sources like routing protocols,
provisioning policies, or dynamic network information sources. The local ALTO server
needs to communicate with a remote ALTO server to obtain information, which is available
only at the entities in the remote AS.

Figure 47: Inter-ALTO communication framework architecture.
Public

5.1.2 Application to Traffic Management Solutions
The inter-ALTO communication framework may be used in all situations when a
mechanism cannot perform guidance optimally based on local database, e.g.:
In case of inter-DC / inter-cloud communications, the inter-ALTO framework may
provide information on how DCs communicate with the end-users.
In case of collaboration between data centers for energy efficiency, the inter-ALTO
framework may guide shutting down DCs without harming end-users QoE.
In case of global service mobility, the inter-ALTO framework may provide
information on where the data/service should be duplicated.
In case of exploitation of social awareness for content delivery, the inter-ALTO
framework may provide information on end-users interests.
The influence of the inter-ALTO communication framework is influenced by the following
three factors: (a) availability of ALTO servers capable and configured to use the inter-
ALTO communication framework; (b) willingness of stakeholders to share the topology
information with other stakeholders; and (c) amount of information not available locally
(e.g., on route asymmetry, different types of business relations, congestions, ISPs'
policies, route aggregation).
Since Inter-ALTO is a generic framework enabling inter-stakeholder exchange of
aggregated information on any numerical cost, there is no restriction on which metric is
influenced by enhancing guidance with inter-ALTO information exchange.
5.1.2.1 Example Instantiation of Framework
In this example, it is assumed that an Application Provider constructs a service from,
among others, PaaS and IaaS cloud services. The service provided to the end-users
consumes mainly downstream bandwidth and is jitter- and delay-sensitive. Therefore, the
Application Provider may assume that the lower number of AS hops the packets traverse,
the higher QoE perceived by the end-users is.

Figure 48: Example instantiation of the inter-ALTO framework topology.
In Figure 48, the topology is presented. The Application provider has to provide the
application run by the end-user with IP addresses of exactly one PaaS DC and exactly one
IaaS DC on which the services are present. In each case, it may choose from two
alternatives.
Public


Figure 49: Example instantiation of the inter-ALTO framework communication schemes.
Figure 49 presents the information flows that are needed to properly address the problem
of end-user redirection. When the Application Providers DC server asks its local ALTO
server for guidance, the ALTO server has to use inter-ALTO framework and gather the
topology information from remote ALTO servers. Since the (e.g., BGP-based) topology
information sources store information only on upstream paths, there is no need to discover
and contact any ALTO servers run by end-users ISP. The Application Providers ALTO
server gather information on the paths being used by the flows from all candidate DCs to
the end-user. Then, the Application Provider is able to apply its policy (metric) and provide
the application with proper guidance.

Figure 50: Example instantiation of the inter-ALTO framework resulting data flows.
Figure 50 shows the result of the applying inter-ALTO framework to the guidance problem
described in this example. The application run by the end-user is redirected to and served
by DC p1 and DC i2.
5.1.3 Evaluation Environment and Initial Results
The framework has been implemented in the Eruption.S engine, an event-driven underlay-
aware packet-based interactive BitTorrent network simulator written in C#.
In the proposed algorithm [28], a local operator assigns preference numbers to particular
neighboring ASes. It is done according to the type of a given inter-domain link and related
costs of traffic transfer. Since the cost of upstream and downstream traffic might be
different, separate preference numbers are assigned for each direction.
Public


Figure 51: Topology used during simulations. N means that there are N links of a
indicated category between given ASes.
The simulated topology is presented in Figure 51. An AS may identify its links to neighbor
ASes as peering, provider or customer (in figures we use acronyms for those link types:
PE, PR and CU, respectively). There are 20 transit ASes and 9 ASes where nodes are
located. One of them, labeled as AS5, plays also a role of transit AS, that is, it serves its
own residential customers and provides Internet access for AS6. AS5 possesses all types
of inter-domain links. Stub ASes have peering and/or provider types of inter-domain
connections.
For simplicity, we do not consider the types of accounting and the real monetary costs. In
the experiments, local nodes (located in the same domain with ALTO server) always
receive AS preference value 0. Among the remote nodes the most preferred are those the
traffic is exchanged with over customer inter-domain links. Such nodes receive AS
preference value 1. Node reachable over a peering link receives AS preference value 2.
Finally, nodes the traffic is sent through a provider link receive AS preference value 3,
since they are least preferred. Nodes receive two separate rating values: for upstream and
downstream links. The final node rating is calculated by adding the ratings for the two
directions.
The experiments are designed to evaluate performance of the three following inter-ALTO
applications: route asymmetry, different types of business relations, and coordination of
ISPs' policies. There are three basic simulation scenarios: (a) No ALTO server, (b) 1 ALTO
server (in AS5), and (c) 9 ALTO servers. As expected and presented in detail in [28], the
ALTO mechanism supported by the inter-ALTO communication framework reduces the
traffic on the most expensive links and shortens the download time.
Among others, we study the situation when the policies of operators are contradictory.
Interesting observations were made for the link between two stub ASes: AS5 and AS6.
Originally, AS5 serves as a provider for AS6, thus, from the point of view of AS5 the link is
of the customer type, while AS6 sees it as a provider link. If both ASes implement ALTO,
their policies are contradictory. AS5 promotes nodes located in AS6 to other remote
nodes. Therefore nodes located in AS5 willingly initiate communication with those located
in AS6. On the other hand, operator of AS6 wants to decrease the traffic on this link to pay
less, so it discourages its nodes from initiating communication with nodes in AS5. This is
Public

the situation examined in scenario (c). In Figure 52 this scenario is labeled as CU-PR. To
avoid the situation of contradictory intentions, ISPs may agree on a consistent routing (and
optimization) policy for the overlay traffic. To evaluate the results of various possible
policies two extra scenarios were simulated: (d) 9 ALTO servers PR-PR, and (e) 9 ALTO
servers PE-PE. In scenario (d), AS5 agree to decrease the overlay traffic on the customer
link and assign the same priority as for provider links. AS6 still has to pay for the traffic
sent over this link but AS5 agrees on lower income yet removes contradicting policies and
supports the overlay traffic optimization. Scenario (e) is introduced to show what would
happen if the link type was changed to peering, i.e., the overlay traffic is exchanged
between AS5 and AS6 with no cost.

Figure 52: Average traffic on the link between AS5 and AS6.
Introducing ALTO into AS5 only causes the traffic to increase. When both AS5 and AS6
deploy ALTO servers, the volume of the traffic flowing between them decreases. However,
the reduction of the traffic is not as high as it can be when both operators focus on the
reduction of the traffic, what is presented in Figure 52. When both AS5 and AS6 promote
the link between them, the volume of traffic is significantly higher than it is in the situation
when both of them reduce the traffic.
5.2 OpenFlow-based Network Configuration Framework
OpenFlow [81] is a standardized communication protocol which enables to control the
forwarding behavior of Ethernet switches in Software Defined Networks [88]. It was initially
released at Stanford University. Currently the Open Networking Forum is responsible for
maintaining its specification. The ONFs OpenFlow Switch Specification [89] covers not
only specification of the OpenFlow protocol itself but also the components and basic
functions of the switch. Most of todays articles, researches or implementations are still
based on OpenFlow Switch Specification 1.0.0 from December 2009, although the latest
version is 1.3.2 from April 2013.
An OpenFlow Switch enables to separate the fast packet forwarding and the high level
routing decisions. The first functionality is still performed by the switch, whereas the latter
is moved to an external controller. The communication between the OpenFlow Switch and
Controller is performed by means of the OpenFlow protocol. The first version of the
OpenFlow protocol supports three types of messages:
controller-to-switch messages used by the Controller for direct management and
inspections of the OpenFlow Switch,
asynchronous messages triggered by the OpenFlow Switch in order to notify
network events and its state changes to the Controller,
Public

symmetric messages initiated by either the OpenFlow Switch or Controller such as
Hello or Echo.
Figure 53 shows an OpenFlow-enabled switch with the Flow Table responsible for packet
lookup and forwarding and the Secure Channel to an external controller.

Figure 53: An OpenFlow communication between OpenFlow-enabled switch and
Controller (ONF, OpenFlow Switch Specification 1.0.0, Dec. 31, 2009).
The Flow Table stores flow entries consisting of header fields, list of counters and actions.
Header fields are used to match against incoming traffic. Counters in the list should be
updated when specific packets are forwarded. Actions define how the switch handles
matching packets. The Secure Channel is the interface that enables Controller to configure
and manage the switch.
OpenFlow Switch supports two types of flow insertion: proactive and reactive. Reactive
flow insertion launches when packet reaches an OpenFlow switch port and cant be
matched against any entry from the Flow Table. In this case the packet is sent to the
controller which performs packet evaluation. Next, new appropriate flows are added, and
the packet is forwarded accordingly. The latter type of flow insertion allows the controller to
insert flows in the Flow Table before the packets arrive.
Figure 54 presents OpenFlow specification evolution.

Figure 54: OpenFlow evolution [25]
The most important changes introduced since version 1.0.0 are:
multiple Flow Tables allowing multi-action pipeline processing (see Figure 55),
support for multipath allowing directing a flow over several paths (packet cloning),
support for IPv6,
Public

support for VLAN Q-in-Q encapsulation,
multiple parallel channels between switch and controller,
per flow meters enabling measuring and controlling of rate of packets sent to the
controller.

Figure 55: OpenFlow switch 1.3.0 (ONF, OpenFlow Switch Specification 1.3.0, June 25,
2012).
OpenFlow specification contains limited amount of QoS-related features. Since the first
version, OpenFlow allows forwarding packets through queues attached to ports of the
switch. However the creation and behavior of the queues is outside the scope of
OpenFlow. The second feature enables to include VLAN priority and IP ToS value to
packet header fields. Version 1.3.0 brings new Meter Table which consists of per-flow
meter entries which allow for the implementation of various simple QoS operations such as
rate-limiting.
The OpenFlow Switch Specification does not take into consideration multi-domain
environments. Thus, several European projects try to address this issue. For example the
FELIX project [36] aims to specify a novel framework capable of provisioning end-to-end
network services between resources attached to SDN clouds controlled by OpenFlow [14].
In order to achieve this it is planned to specify a set of extensions to the NSI [84] which
would allow utilizing functionalities of both OpenFlow and NSI for creation of Federated
SDN Services.
The Software-defined Networking becomes more popular these days. Most of the network
equipment vendors deploy SDN solutions in their devices. One of the most popular
technologies from the SDN portfolio that is widely implemented in network devices is the
OpenFlow Switch (protocol). Currently OpenFlow Switch Specification 1.0.0 is the most
common. However, for example Juniper announced that its EX9200 supports also version
1.3.0.
OpenFlow-based configuration framework can be used by all TMSs that need to alter
routing and/or forwarding in the underlying network. This would be realized by
configuration of proper flows in the network equipment. This approach is suitable only for
Public

the case of configuration done within a single domain. This approach enables finer
configuration granularity than the MPLS.
The use of OpenFlow-based approach implies that to some extent the ISP is willing to
deploy the SmartenIT TMS and enable it to control some of the resources of its network in
automatic manner.
In order to initially validate the proposed solution a pilot deployment of the OpenFlow
controlled domain based on the Mininet [83] tool was prepared. Mininet is a framework
which enables the creation of virtual network topologies using Open vSwitch [90].
Figure 56 shows the pilot topology which consists of four core switches and two gateway
switches each interconnected with four hosts. Each switch communicates using
OpenFlow protocol with the Floodlight [95] controller. By default Floodlight controller starts
with a Forwarding module which enables reactive forwarding. In order to use the proactive
approach the Forwarding module has to be disabled or a proper flow has to be inserted in
each switch (see empty-rule from Section 12.1). The latter option was chosen. The goal of
the tests was to use proactive flow insertion in order to enable communication (ping)
between hosts: lh1 and rh2. To achieve this, to each switch on the path (red line) four
flows (see curl commands from Section 12.1) were manually sent using the Floodlight
REST API [96]. This allowed pinging host lh1 from rh1 and vice versa.

Figure 56: OpenFlow test domain topology

Public

5.3 MPLS-based Network Configuration Framework
Multi-Protocol Label Switching [98] is a well-known standard and commonly used
technology in ISP networks. At its core, MPLS enables creation of virtual end-to-end
tunnels called Label Switch Paths (LSP). Data packets are forwarded along an LSP based
solely on attached short label placed in an MPLS header. Routers along the path do not
perform resource consuming IP lookups but direct packets according to the entries of the
Label Forwarding Information Base (LFIB) that comprises forwarding rules for each LSP
that traverses given router. In the MPLS original specification the LDP [9] protocol is used
to establish LSPs, i.e. exchange information about labels to be used on each link building
up the LSP. LSP can be specified on hop-by-hop basis utilizing typical routing information
(as in traditional IP networks) but also specified explicitly (calculated by a single entity, i.e.
ingress router, or configured manually) what brings many benefits and possibilities (e.g.,
balanced network resources utilization).
Over the years MPLS has been enhanced with additional features including the Traffic
Engineering extensions [12] that enable efficient and reliable network operations while
simultaneously optimizing network resources utilization. Introduction of TE capabilities to
MPLS involved extensions to protocols used on both routing and signaling planes. The
already used IGP routing protocols, e.g., OSPF-TE [67], were extended to distribute TE
data providing additional information on network resources within a domain. This
information is then used during optimal LSP calculation performed at the Label Edge
Router (LER). LER employs a Constrained Shortest Path First (C-SPF) algorithm to
calculate optimal LSPs taking into account current network conditions and administratively
induced constraints. In MPLS-TE for LSP signaling purposes the RSVP-TE [13] protocol is
used that natively facilitates resource reservation (i.e. for LSP bandwidth guarantees) and
has been extended with label distribution capabilities. MPLS-TE provides also path
protection and restoration capabilities.
The Path Computation Element [34] concept and the proposed PCE-based architecture
enhanced the existing constraint-based path computation mechanisms, being the core of
MPLS-TE operations, with capabilities to run complex path computation processes in
large, multi-domain networks on dedicated computational components.
Originally MPLS LSPs were established within a single network domain but the
requirements for inter-domain MPLS TE were defined [74][122] and a number of possible
solutions were specified to enable MPLS usage in multi-domain environments [33].
Proposed solutions consider new signaling options, multi-domain path computation
techniques and the problem of reachability and TE information distribution between
domains. In particular several inter-domain LSP signaling options exist [35]. The concept
of nested, contiguous and stitched inter-domain LSPs was introduced. Moreover, BGP
implements a mechanism defined in [97] for MPLS label information distribution across
domain borders. When BGP is used to distribute a particular route, it can be also used to
distribute a MPLS label which is mapped to that route. This standard is supported by
various network equipment vendors, e.g., in form of BGP Labeled Unicast (BGP-LU) [64]
protocol in case of Juniper.
Moreover, multiple connectivity applications have been defined that leverage MPLS as the
underlying technology, e.g., VPLS [73], BGP/MPLS VPN [99], that enable setting up
different type of connectivity services on top of the existing LSPs tailored for particular
communication requirements.
Public

A major advantage of the adoption of MPLS-TE based network configuration framework in
SmartenIT is the possibility to use well-defined, widely deployed mechanisms rather than
developing them from scratch.
MPLS-TE is able to support any TMS that needs to differentiate traffic, set up dedicated
paths in the network, monitor the load on each path or make QoS guarantees.
MPLS LSPs might be set up on-demand on per data transfer basis in the case when the
delivery paths need to be established for a considerably long time period. It is also
possible to establish a mesh of LSPs and provide periodical LSPs re-configuration what
may be employed by load balancing and overall cost minimization focused TMSs.
MPLS supports cases when the communication paths span multiple ISP domains without
tampering with the established interconnection agreements between particular ISP
domains. A multi-domain LSP could be set up in a similar way to the one presented in the
LSP stitching concept. It would be constructed of segments (intra-domain LSPs)
configured separately in each domain. The MPLS labels assigned to particular LSP
segments can be advertised to other domains by means of BGP based label distribution
protocol, e.g., the BGP Labeled Unicast, configured between BGP peers. This would
enable seamless MPLS end-to-end connectivity.
The use of MPLS implies that to some extent the ISP is willing to deploy the SmartenIT
TMS and enable it to control some of the resources of its network in automatic manner.
Vast number of MPLS related solutions deployed in ISP networks indicate that those
technologies are reliable, efficient and scalable. Moreover, those mechanisms are
commonly supported by network devices, e.g., implemented as part of Junos OS [66] in
the case of Juniper equipment.
More sophisticated network equipment supports device virtualization that enables creation
of quite complex virtual network topologies by means of a single virtualized physical
device. This facilitates functional validation of the underlying protocols, proposed solutions
and developed mechanisms in real network environment.
In order to initially validate the proposed approach a simple multi-domain virtual testbed
was set up using Juniper MX80 device virtualization capabilities. Three different device
virtualization options are available and the Logical Systems (LS) were used. Juniper MX80
device supports creation of up to 15 Logical Systems. Logical tunnels were used for inter-
LS connectivity.
The test topology comprising 11 virtual routers (Logical Systems) organized into 4
Autonomous Systems is presented in Figure 57.
Public


Figure 57: Multi-domain network for MPLS tests
The conducted tests covered several configuration phases:
Logical Systems configuration and connection by means of logical tunnel interfaces.
Intra-domain protocols configuration in all domains: OSPF-TE, RSVP-TE, MPLS-
TE.
Intra-domain LSPs configuration using both path calculation based purely on IGP
and by means of the CSPF.
BGP Labeled Unicast configuration for both intra-domain and inter-domain peering
what resulted in an end-to-end stitched inter-domain LSP creation connecting LS1
and LS43.
L3VPN configuration on top of BGP/MPLS between client sites LS-CE1 and LS-
CE2.
Proper configuration was validated by ping and traceroute tests run between CE routers.
Router configuration was prepared based on Juniper documentation available online [65].
Detailed test topology along with detailed configuration extracted from the router is
included in Section 12.2.
This simple test, focused purely on functional aspects of the proposed MPLS-based
framework, proved that it is feasible and the configuration is rather straightforward. It was
also confirmed that the device virtualization capabilities can be easily used to create more
complex and sophisticated network topologies having in mind final tests of the developed
TMSs and demonstrations.

Public

6 Synergies between Mechanisms
Having presented the details of the traffic management mechanisms proposed in Section
4, as well as the communication and configuration frameworks in Section 5, it becomes
important to categorize those mechanisms, spot their overlaps and their
complementarities, in order to identify the possible synergies or even combination of
mechanisms that will provide more concrete and wider solutions for the management of
overlay and Cloud traffic. It is expected that certain mechanisms share the same
characteristics, or have similar goals. Moreover, some mechanisms introduce new
decision metrics that can be utilized by other mechanisms as well. All these observations
render the identification of synergies an important task that will optimize the design of
efficient mechanisms and narrow down the set of possible mechanisms that will require to
be hosted by the system architecture, as initially specified by WP3. The
merging/combination of mechanisms will not only serve for implementation purposes but
also for the study (theoretical or simulative) of the specific characteristics of final
mechanisms. In this section, a preliminary investigation on the existence of possible
synergies is provided, considering the mechanisms adherence to the identified scenarios
from WP1, the shared characteristics between certain mechanisms and the common basis
for the intelligence/logic foreseen by each mechanism.
6.1 Adherence to the Scenarios
A basic step for providing a taxonomy of the mechanisms proposed in this document as
well as for identifying synergies, we provide a mapping of the mechanisms to the
functionalities identified in deliverable 3.1 (Report on Initial System Architecture) and the
two scenarios proposed in deliverable 1.2 (Description of Cloud and Overlay Service
Classifications and final Scenario Development). By doing so, we bridge the gap between
architecture, scenarios and mechanisms. Moreover, we identify synergies between
mechanisms by discussing a survey on mechanisms based on synergy ratings. The
mapping was done for both the old scenario definition with the four scenarios, namely
inter-cloud communication, global service mobility, social awareness and energy
efficiency, and the new operator focused and end-user focused scenario definition. The
mapping on the four scenarios of the old scenario definition has been performed by
examining the descriptions of the traffic management solutions in Section 4, whereas the
mapping on the new scenario definition was achieved by deploying the mechanism
functionality on a topological diagram of the architecture taken from D3.1.
Public


Figure 58: Topological view of architecture with added scenario overlay (based on
component map taken from D3.1).
Figure 58 shows the topological view of the architecture taken from D3.1. The topological
diagram has been used to get a first overview of the traffic management solutions w.r.t the
SmartenIT architecture, rather than a detailed mapping to components. Basically, three
domains of entities are present in this figure:
Data Center/Cloud Layer: This layer comprises data centers and their virtual
interconnections using the Internet as a network.
Core and Access Network Layer: The core and access layer network contains
components in the ISP network and the private networks of data center operators.
End User Layer: The end-user layer covers access network infrastructure as well
as the end-users devices. In particular, this layer contains the users home gateway
(UNaDa).
In D3.1, a common set of functional blocks were identified which, when combined in a
functionality stack, can provide on the different entity layers the required operations.
These functionality stacks are depicted as a stack of blue/orange boxes and are
associated to the respective entities.
Moreover, as a main outcome of deliverable D1.2, two scenarios were defined. The
components and entities covered by both scenarios are depicted as additional overlays
in Figure 58.
Public

End User Focused Scenario: This scenario mainly addresses end user related
aspects. Thus, the scenario is mainly related to the End-User Layer and touches
the Data Center/Cloud Layer and Core and Access Network Layer only partially. It
is depicted with the yellow overlay shape.
Operator Focused Scenario: The Operator Focused scenario mainly covers intra-
and inter-data center aspects. Thus, the end-user is merely a passive entity in this
scenario, while Core and Access Network Layer and Data Center/Cloud Layer are
in the focus. It is depicted with the red overlay shape.
With this set of definitions in mind, scenarios, entities and functionality can be aligned
along a common axis. This allows mapping each mechanism proposed in this
deliverable to a set of entities and envisioned functionality, while at the same time
showing the mainly addressed scenario. A detailed view of the mapping of all
mechanisms to the respective scenarios/entities/functionality is attached to the
appendix of this deliverable. As a summary, we provide the adherence of mechanisms
to one of the respective scenarios as depicted in Table 14: Adherence of mechanisms
to scenarios. There are merely two mechanisms (AQAS and MUCAPS), which cannot
be matched to a single scenario, because these mechanisms address aspects of both
scenarios.
Table 14: Adherence of mechanisms to scenarios.
Operator Focused End-User Focused
Mechanism for Inter Cloud
Communication (ICC)
OpenFlow-, MPLS-based Inter-Cloud
Traffic Management Framework
(OpenFlow, MPLS)
Multi resource allocation in the IaaS
Federation (MRA)
Inter-ALTO (Inter-ALTO)
Dynamic Traffic Management (DTM)
Upgrading and Planning Processes in
Load Balanced Networks (OptiPlan)
Virtual Incentive (vINCENT)
RB-Tracker: User Traffic Management
(RB-Tracker)
Home Router Sharing based on Trust
(HORST)
Socially-aware TM for Efficient Content
Delivery (SECD)
Selection Mechanism for Storage
Providers Systems (SMSP)
QoE and Energy Aware Mobile TM
(QoEnA)
Application Quality Aggregation System (AQAS)
Multi-Criteria Application End-Point Selection (MUCAPS)

This view on scenario adherence is further strengthened by Table 15, giving an overview
of the proposed traffic management solutions and the adherence to scenarios based on a
survey. In the table the coherence to the scenarios inter-cloud communication, global
service mobility, social awareness and energy efficiency of the old scenario definition is
shown. For each mechanism every scenario was rated from 0 (scenario is not addressed
by the traffic management solution) to 3 (scenario is mainly addressed by the traffic
Public

management solution). The solutions were grouped by the scenario they mainly address.
This is highlighted by the background color of the TMS. The right column shows the sum
of the ratings for each proposed solution.
The largest group of solutions addresses the inter-cloud communication scenario.
Furthermore most of these solutions are more or less restricted on inter-cloud
communication, which is reflected by the low sum of ratings. To fit the wide scope of the
SmartenIT project, these solutions might provide mechanisms for a combined solution
addressing a larger span of scenarios.
As the sum of the ratings of the proposed solutions increase the mechanisms are less
specific but provide comprehensive concepts for a SmartenIT solution that span several
scenarios. Solutions that address social awareness can provide concepts that can
combine different scenarios, by, e.g., using the information from social networks to predict
user interest or movement and such improve the content delivery or resource
management.
The bottom row shows the total sum of the ratings for each scenario. The high values for
inter-cloud communication and global service mobility indicate that these scenarios are
sufficiently represented in traffic management solutions. Energy efficiency and social
awareness are slightly less represented in the traffic management solutions. The reason is
that most traffic management solutions mainly address a scenario different from energy
efficiency or social awareness, but use metrics from these scenarios. Therefore energy
efficiency and/or social awareness are also addressed in the majority of scenarios if only
not as main scenario. Furthermore the solutions which address mainly energy efficiency
focus on nano data centers or mobile devices of end-users. In the scenario description in
D1.2 [108] the definition of scenario energy efficiency mainly addresses the federation of
data centers and live migration of virtual machines.
Table 15: Overview of proposed TMS w.r.t. scenarios. Absolute values. 3 (dark green):
TMS mainly addresses scenario, 0 (white): TMS does not address scenario.
Traffic Management Solution or
0ommunication !rame.or2

Inter4
cloud
com.
7lo5al
ser&ice
mo5ilit
Social
a.ar4
eness
%nerg
effici4
enc
S
u
m
8

Multi 'esource Allocation in the IaaS !ederation 9M'A:
(
)
e
r
a
t
o
r

f
o
c
u
s
e
d

3 - - - 3
()en!lo. 5ased Inter40loud TM !rame.or2 9()en!lo.: 3 - - - 3
M#;S 5ased Inter40loud TM !rame.or2 9M#;S: 3 - - 1 4
/)grades< #lanning in ;oad =alanced 1et.or2s 9()ti#lan: 3 - - 1 4
Inter4A;T( 9Inter4A;T(: 3 2 - - $
Mechanism for Inter40loud 0ommunication 9I00: 3 - - 2 $
Dnamic Traffic Management 9DTM: 2 2 1 2 +
A))lication 3ualit Aggregation Sstem 9A3AS: - 3 1 - 4
Multi 0riteria A))lication %nd4#oint Selection 9M/0A#S:

- 3 1 2 "
Selection Mechanism of Storage #ro&iders 9SMS#:

- 3 2 - $
>irtual Incenti&e 9&I10%1T:

- 3 2 - $
Sociall4a.are TMM for %fficient 0ontent Deli&er 9S%0D:

- - 3 - 3
?ome 'outer Sharing 5ased on Trust 9?('ST:

- 2 3 2 +
/ser Traffic Management 9'=Trac2er:

- 2 3 2 +
3o% and %nerg A.are Mo5ile TM 93o%nA:

- 2 - 3 $
Total8 2- 22 1" 1$

%
n
d

u
s
e
r

f
o
c
u
s
e
d

Public

However none of the proposed solutions mainly addresses this definition of energy
efficiency. Overall the different scenarios are sufficiently addressed which shows that the
solutions cover a wide scope of the project.
The traffic management solutions in Table 15 were further grouped according to the new
operator focused and end-user focused scenario definitions. As expected the traffic
management solutions which address mainly inter-cloud communication are operator
focused, whereas traffic management solutions having their emphasis not on inter-cloud
communication tend to be more end-user focused. Details for each solution on the
addressed scenarios can be found in the respective sections of Chapter 4.
6.2 Properties of Mechanisms
To proceed with the identification of synergies, we now summarize the presented
mechanisms with respect to some main properties that they share. This summary is
included in Table 16. This table highlights the main contribution areas of each mechanism
(data offloading, content replication, content delivery, delivery scheduling), if and how
different information (social, overlay, energy, cost or QoE-related) is used, whether an
inter-domain protocol is described (in case it is required), whether a decision algorithm is
provided and what type of network traffic is targeted. These attributes nicely summarize
the mechanisms described in Sections 4 and 5.
One can observe that the majority of mechanisms propose content replication, caching
and pre-fetching as a mean to bias and localize the traffic related to the consumption of
the specific content. To extend this observation, this happens due to the fact that most
mechanisms deal with offline, cacheable content. Continuing the analysis, few of the
mechanisms consider a way of content delivery other than best effort. Moreover, there is a
group of mechanisms that use social awareness as a metric for either predicting the
demand or evaluating trust. Most of the mechanisms consider a variety of metrics for their
decision process, but since the work is still in its early phases, there is no concrete
decision algorithm described and, hence, the metrics are not very specific. Finally, as
already identified in the previous section, there are some mechanisms that do not foresee
some decision process, but rather require an external decision process to be in place and
can act in complement to the decision process, allowing for the enforcement/configuration
of the decision.
Based on the above observations and the details of Table 16, the following classification of
mechanisms is proposed, along with some analysis:
Category A Content Placement: This category deals with selection of the
content localization, i.e., where to place the content and which content to replicate,
prefetch or move, based, e.g., on content popularity, demand prediction, energy
considerations etc. In this class, HORST, SECD, RBTracker and vINCENT
mechanisms are included. All four mechanisms deal with content caching, pre-
fetching. HORST, RBTracker and vINCENT focus on the end-user level while
SECD focus on the AS level. However, the actual placement of the storage
resources (either UNaDas at end-user premises or caches at ISP-owned servers)
does not alter the concept of the mechanisms. HORST and SECD base their
approaches on the existence of OSNs (generic or purpose-specific) that provide the
necessary trust among participants, while vINCENT proposes an incentive scheme
to achieve a level of trust. HORST offers an additional feature of forming a
federation of WiFi Access Points that co-reside with the local caches (both under a
UNaDa). vINCENT puts more focus in P2P networks, so its inclusion in this
Public

category is marginal, since it only deals with placement of content at peers, where
the content will be stored anyway. However, sharing resources between mobile and
static nodes (in the case of vINCENT, these resources refer to bandwidth capacity
and battery limitations) is common in both vINCENT and HORST (where a mobile
node can instruct a static node to prefetch content to be consumed at a later point).
Category B Delivery Scheduling: This category deals with mechanisms which
decide when to transfer the content (e.g., during low load period) but also from
where to fetch it, i.e., from which inter-domain link, based, e.g., on past period's
(accounting) measurements. Both ICC and DTM mechanisms focus on managing
(at the AS level) inter-domain traffic based on network costs. The main difference is
that DTM considers the case of multi-homed domains, while ICC doesnt. They both
target on reducing the traffic during high-peak periods, while considering the 95
th

percentile charging rule, and try to schedule data exchanges between clouds or
data centers at off-peak periods. ICC further considers Cloud-based information for
the scheduling of transfers. It seems that both mechanisms try to reduce inter-
domain traffic by biasing only the traffic generated by Clouds while doing nothing
with other types of traffic (considered as background). Hence, there might be the
case of unfair treatment of Cloud/overlay traffic, an issue that needs to be further
considered. On the other hand, RBTracker and QoEnA try to schedule the delivery
of content to an end-user device based on the status of the devices network
interfaces. Finally, OptiPlan aims at load-balancing the intra-domain links according
the network-related events (like congestion), hence, in a sense, it also proposes a
mechanism for scheduling how traffic should circulate inside a domain. Extensions
of such a mechanism, could also consider other metrics, such as energy or QoE
related metrics.
Category C Ranking: This category includes all mechanisms that have proposed
a kind of ranking algorithm as their key (but not sole) contribution. Here, the
mechanisms RBTracker, SMSP, MRA, ICC, AQAS and the MUCAPS are included.
All these mechanism deal with ranking end users, peers, storage systems or even
content according to several metrics, depending on the focus of each mechanism.
For example, end users or peers can be ranked according to their network distance
from a specific content source/destination, or according to their use of resource
(more or less greedy). More details about the metrics will be given in the following
paragraphs. However, the combination of those metrics seems a viable outcome
which will broaden the scope of the mechanisms and will also provide multi-
parameter rankings and optimizations. Note that although some metrics are specific
to a given environment (e.g., P2P networks), there might be cases that they can be
re-used and adapted to other environments as well, preserving their overlay-centric
nature but not constrained by specific characteristics of deployed technologies.
Category D Communication Protocols: This category contains only
communication protocols that may be used by other mechanisms and system
elements. AQAS and inter-ALTO are included in this class. These mechanisms
define intra-domain (AQAS) or inter-domain (inter-ALTO) protocols that allow the
(vertical or horizontal) exchange of information that is required to local decision-
taking processes. It is expected that a combination of such protocols will be used in
order to disseminate the results of a decision process to intra-domain or inter-
domain entities.
Public

Category E Configuration Frameworks: This category contains proposals that
offer a network configuration solution. In this class, OpenFlow and MPLS are
included. Both approaches offer a uniform way to configure network elements in
order to bias the data transfer at the network layer. OpenFlow is more generic,
since it allows configuring several attributes of the network entity, while MPLS
allows configuring network paths. Actually, OpenFlow could also configure path
labels that will be later used by the MPLS label switching process. It is expected
that decision-making algorithms that require the configuration of network elements
will make use of such mechanisms to enforce their decisions in an intra-domain
environment.
The above classification can provide a first level of overlaps among the proposed
mechanisms. At the same time, it provides an initial view on the possible synergies. It is
expected that a complete mechanism that addresses all the issues of a specific scenario
(proposed by WP1) will have to cover most of (if not all) the aspects that the
aforementioned categories address. Hence, several super-mechanisms could be
designed by picking a proposal from each class of mechanisms; the selection of
mechanisms should be made in such a way so that the final outcome applies to the
existing scenarios.
Public

Table 16: Summary of mechanisms properties
0roperties 4
Mec*anism
6,RS! SE$D I$$ Inter5A7!, D!M R3!rac'er SMS0 MRA ,pen%lo& M07S ,pti0lan vI1$E1! A8AS M9$A0S

8oEnA
E:uipment
home
router @
storage 9A
/1aDa:
Social
#ro,
Ser&er
9S#S:
SmartenIT
Informatio
n Ser&ice
9SmaS:<
0loud
Scheduler
90loS:<
0loud
Informatio
n Ser&ice
90loI:
A;T(
ser&er
IS#Bs
decision
sstem
Trac2er

()en!lo.
ena5led
s.itches @
()en!lo.
controller
M#;S
ena5led
routers
and
s.itches
(ffline tool
)rocessing
data from
monitoring
sstems

/1aDa<
/1aDa
7ate.a
A;T(
0lient and
Ser&er<
3o%
monitoring
sstem and
informatio
n D=
A;T(
Ser&er<
A;T(
0lient<
D1S4li2e
Ser&er<
a))lication
6ualit
re6uest
controller
Mo5ile
#hone
Data
offloadin"
indirectl
9Ci!i &s.
mo5ile
data:

es 9D0 to
D0< D0 to
/1aDa:
es
es 9from
D0 to D0:

;oad
5alancing
.ithin
net.or2

es 9using
Ci!i:
$ontent
replication4ca
c*in"4prefetc
*in"
es es es es
es es

Informatio
n on
content
)lacement
o)tions as
in)ut
es
$ontent
delivery
;ot*er t*an
)est5effort<
es

es

Sc*edulin" of
data transfer
off4)ea2 or
scheduled
5 user 9if
)ossi5le:

Scheduling
5ased 9$th
)ercentileD
resourceEd
estination
selection
5ased on
net.or2
and cloud
metrics

off4)ea2
9indireclt:

local<
scheduling
in
timeEs)ace
dimension
Social info as
trust metric
es

es 9&irtual
nodes:

Social info for
demand
prediction
es es
es 9later:

Public

%ormation of
overlay
es
9through
/1aDa
induced
0D1:
es
9through
messaging:

es
es. An
o&erla of
end4users
through a
D?T
9Distri5ute
d ?ash
Ta5le: to
e,change
the
)reference
ran2ing.

!lo.s o&er
e,isting
net.or2
to)olog
>#1EM#;S
o&er ;2
)rotocol

es<
random
mesh of
/1aDas

1et&or'
metrics

es es e,changes es
es es
es es
as o)tional
in)ut
es
es 9no
details:
as in)ut
$loud4overlay
metrics
es es es
es es
es
9greedines
s:

$ost metrics

es es es

as in)ut for
cost
o)timi4
*ation
Messaging
o&erhead<
fairness

Mo5ile
data
consum)ti
on< energ
consum)ti
on
8oE metrics

es

as in)ut for
dimen4
sioning
es
es< see
5elo.
es 9no
details:
es
Inter5domain
protocol

de&elo)me
nt of inter4
)ro,
ser&er
)rotocolD
)ro,ies can
5e located
in different
Ases< or
IS#s
es es

es 9=7#
.ith M#;S
la5els
distri5utio
n :

02043it!orren
t5specific

es
#2#
ena5les
the
e,change
of
informatio

Public

n among
end4users.
1o decision
mec*anism
es

)artiall<
social
)lacement
of content
re6uires
e,ternal
algorithm
es
decision on
.hich
A%#S<
.hich
metrics
and their
.eight
!ype of traffic
;e.".= content=
computation
data= ...<
content content
an data
9content or
com)utati
on data:
an an
content
contentEda
ta
user
consum)ti
on )rofiles
an data an data
An
aggregated
sourceElin2
traffic
content an

Public

6.3 Observation and Decision Metrics
The decision-making process, being the core of each mechanism, reveals some the
properties of a mechanism and may also point to possible synergies at a lower level. For
this reason, a second table is provided that presents in more detail the decision metrics
that are foreseen by the mechanisms (see Table 17). This table provides an overview on
how each mechanism meets the following questions: where the decision is taken and what
is its scope? Is there a demand prediction mechanism in place? Does the mechanism
consider different connectivity technologies for the end devices? What type of metrics
(social, network, cloud/overlay, cost, energy or QoE-related) are considered for the
decision? What is the innovation of the mechanism? The answers to these questions will
further help to identify overlaps, complementarities and possible synergies between the
proposed mechanisms.
Looking into the different metrics that each mechanism proposes to use, the following
preliminary observations can be made:
There is a balance between centralized and de-centralized solutions. Since the
mechanisms are described in a more abstract way (more like frameworks), its not
possible at the moment to evaluate how these two approaches are compared and
what are the advantages/disadvantages that may influence the further development
of the mechanisms.
Social metrics such as interests of users and users relationships are used to
predict demand of content. There are three cases which make such use of social
information. In another case, the OSN graph can be used as a means of trust as
well.
Three mechanisms considering the access technology are proposed, i.e. WiFi vs.
3G/4G access, local traffic scheduling, content pre-fetching, and virtual nodes.
From the users point of view improved QoE, prolonged the battery life and reduced
expenses are possible; from the network operators perspective, peak loads may be
reduced and possibly the overall load as well. Collaboration between these
mechanisms should be considered.
Network metrics are foreseen in most mechanisms, as expected. There is a big
overlap among the proposed metrics in the various mechanisms, probably due to
the fact that the exact decision algorithms are not yet in place. However, it is
expected that most mechanisms will use the same evaluation functions when
considering the network status. A collaboration potential is observed here.
Regarding Cloud/Overlay metrics, the location of the users and the content are
most popular. The congestion and availability of the content servers may also be
considered by some mechanisms.
Cost metrics reveal whether a mechanism aims at reducing the costs of a
stakeholder (e.g., of an ISP). In other words, it shows whether a decision algorithm
has a cost minimization objective. This objective is directly tackled by mechanisms
that manage with inter-domain traffic. Others deal with this objective indirectly, since
caching, by default, decreases inter-domain traffic, hence saves some expenses
related to it, if done in off-peak times.
Public

Energy metrics are not specified accurately and few mechanisms consider them
directly. It is expected that as the mechanisms will be further specified, the relation
with energy consumption and optimization will become clearer.
QoE metrics are considered by some mechanisms, but it remains for future
specification to make clear how they are used.
The impact of mobility to the perceived QoE is considered by one mechanism.

Public

Table 17: Overview of mechanisms decision-taking process and envisioned innovation
Decision
!a'in"
6,RS! SE$D I$$ Inter5A7!, D!M R3!rac'er SMS0 MRA ,pen%lo& M07S ,pti0lan vI1$E1! A8AS M9$A0S 8oEnA
Decision5
ta'in" point
;centrali>ed=
decentrali>ed=
etc<
Decentrali*
ed 9F:
0entrali*ed 0entrali*ed
n
o

d
e
c
i
s
i
o
n

m
a
2
i
n
g

9
i
t

i
s

a

c
o
m
m
u
n
i
c
a
t
i
o
n

s
c
h
e
m
e
:

0entrali*ed
Decentrali*
ed
Decentrali*
ed
Decentrali*
ed

0entrali*ed
Decentrali*
ed
0entrali*ed
0entrali*ed
at the IS#
resol&er4
le&el
decentrali*
ed< end4
user
e6ui)ment
Scope of
decision
;"lo)al=
selfis*= local=
etc<
local 9F: local
glo5al<
selfish in
case of no
federation<
non4selfish
in case of
federation
selfish<
IS#Bs
domain
local
9influences
ans.ers to
re6uests:
glo5al
9local
decision
5ased on
glo5al
)reference
s:
glo5al
9localEdece
ntrali*ed
decisions
5ased on
glo5al
greediness
:

Intra4
domain<
local
Selfish
7lo5al8
intersectio
n of
end)oint
D= of
a))lication
net.or2
and A;T(
Ser&er
co&erage
*one
Decision
made on
each user
end sstem
and related
a))lication
end)oint<
..r.t a
glo5al &ie.
of the
access G
core
net.or2
;ocal9F:
Demand
0rediction
=ased on
social and
o&erla
metrics<
e.g.< social
cascades
=ased on
Social
metrics

Interests of
users
)o)ular
content

as rele&ant
in)ut for
scala5ilit
and timing
for
u)grades
;ocal &ie.
of (S1

$onnectivity
;mo)ile
devices<
Select
access
technolog
< access
)oints

Selected
de&ices
form
&irtual
nodes

o)timi*ed
5ased on
locationEm
o5ilit
)rediction
Social metrics
Interests of
users and
(S1 gra)h<
trust
Interests of
users and
(S1 gra)h

Interests of
users and
(S1 gra)h

Trust and
interest
deri&ed
from social
gra)h
Public

1et&or'
metrics

=and.idth
9du)licate
transfers
as )ro,
for inter4
domain
traffic:
latenc<
congestion
< num5er
of ho)s<
=7# info<
geogra)hic
al location
utili*ation<
congestion
< net.or2
status
'TT 9uT#:
to ada)t
transfer
s)eed and
determine
H5usinessH
of
net.or2<
AS ho)s to
im)ro&e
localit
'TT<
num5er of
ho)s< =C<
AS num5er

1et.or2
to)olog<
5and.idth
s< traffic
demand
metrics<
e,)ected
gro.th
Standard
3oS
metrics for
rece)tion
of static
content

?o) count<
a5straction
on )ath
5and.idth
a&ail.
=and.idth<
delaE'TT<
loss rate
$loud4overlay
metrics
;ocation of
users and
content
9e.g.<
geogra)hic
< AS:<
)o)ularit
of content
/se of
cache
ser&er 9as
)ro, for
cache
congestion
:<
a&aila5ilit<
redundant
transfers
congestion
<
a&aila5ilit

;ocation of
users 9AS:
and
content
9.hich
0loud
ser&ice is
located:<
ser&ice
a&aila5ilit
Hgreedines
sH8 amount
of
resources
re6uested
90#/<
'AM< =C<
storage:

a5straction
on
end)oint
5and.idth
$ost metrics

1um5er of
transfers
of content
items from
outside to
the inside
of an ASD
num5er of
du)licate
items as
)ro, for
redundant
traffic
0ost due to
inter4
domain
traffic
99$th
)ercentile:
D cost due
to energ
consum)ti
on 92C:
5ased on
inter4
domain
tariffs
99$th
)ercentile
or total
&olume:

as in)ut for
cost
o)timi*atio
n

mo5ile
data cost
Ener"y
metrics

9see
a5o&e:

Mo5ile
5atter
de)letion
Public

8oE metrics

hitEmiss as
)ro,ies for
goodE)oor
)erforman
ce

3o%
metrics for
streaming
a))lication
)ac2et
loss<
free*e<
content
6ualit<
IitterJ
3o%
metrics
used to tie
5rea2
1et.or2
and 0ost
Metric
5ased
decisions
as limit for
deferring
connection
s
Innovation
1. Ci!i
access 5
calculating
trust from
(S1sD 2.
/1aDa
induced
0D1 .hich
ta2es into
account
socialEclou
dEnet.or2
EcostEener
gE3o%
metrics
;ocali*atio
n of traffic
5 2ee)ing
content
re)licas
locallD if
the cache
cannot
ade6uatel
address
demand is
assisted 5
a #2#
o&erlaD
inno&ati&e
o&erla
constructio
n 5ased on
social
metrics
%,change
of
informatio
n 5et.een
cloud and
net.or2
laer so as
to ma2e
decisions
on .here
and .hen
to route
trafficD
e,)loitatio
n of the
9$4th
)ercentile
rule to
send more
traffic at
no e,tra
cost 9.e
consider a
full4I#
architectur
e:
e,change
of
aggregated
IS#sB
sensiti&e
informatio
n
91:
addresses
the
)ro5lem of
cost4
o)timal
distri5utio
n of
ine&ita5le
inter4
domain
traffic on
multi)le
lin2s
92: ena5les
dnamic
o&erla
traffic
manageme
nt and
ada)tation
to
constantl
changing
traffic
)atterns to
meet cost
o)timi*atio
n goals
#redict
user
5eha&ior
and
)re)are
content
5eforehan
d during
lo.
net.or2
usage
times to
ease
net.or2
load in
)ea2
times.
/suall
Storage
(&erlas
Iust
consider
file si*e<
file t)e
etc. This
)ro)osed
mechanism
also ta2es
into
considerati
on the
net.or2
state and
the user
location 44
therefore<
it can sa&e
inter4
domain
traffic and
enhance
userBs 3o%.
M/;TI
resource
fairness<
that is<
contrar to
com)eting
a))roache
s< not
5ased on
static
consum)ti
on )rofiles
and can 5e
a))lied
com)letel
decentrali*
ed
dnamic
9on
demand:<
automatic
connecti&it

)ro&isionin
g

net.or2
mechanism
integration
.ith
e,ternal
)ath
selection
9decision:
algorithm
dnamic
9on
demand:<
automatic
connecti&it

)ro&isionin
g

net.or2
mechanism
integration
.ith
e,ternal
)ath
selection
9decision:
algorithm
Although
net.or2
monitoring
< u)grading
and
)lanning is
a classical
tas2< it has
to 5e re4
thought for
ne. net4
.or2ing
conce)ts
including
&irtuali*ed
clouds<
o&erlas<da
ta center
inter4
connection
.ith focus
on energ
consum)ti
on
0lustering
of )ri&ate
de&ices<
common
sense of
cost and
contri5utio
n for user4
5ound
incenti&e<
not de&ice4
5ound
incenti&e.
Adds 3o%
a.areness
to A;T(
%nd)oint
0ost to
ser&e
needs of
a))lication
net.or2s
5uing IS#
ser&ices
C.r.t.
#ADIS
scheme8 9i:
adds multi4
cost A;T(
5ased A%#
selection
9ii:
automates
decision on
.hich
A;T(
metric to
re6uest 9iii:
automates
decision on
.eight
setting for
multi4
o5Iecti&e
A%#
e&aluation
()timi*ati
on of
foreground
and
5ac2groun
d net.or2
re6uests
5ased on
mo5ilit
)rediction<
considerin
g the
resulting
3o%
including
energ
consum)ti
on

Public

6.4 Discussion of Synergies between Mechanisms
In this section we discuss observations on possible synergies. To investigate if the
identified synergies are also in-line with the partners proposing the solutions we conducted
a survey among all partners. In the survey each partner that proposed a solution was
asked to rate the synergy of the proposed solution with the other traffic management
solutions. Such partners were encouraged to deal with the different solutions and find
similarities.
Before we discuss the outcomes of the survey we show some problems of the survey.
First the list of traffic management solutions has grown since the survey. Hence not all
traffic management solutions proposed to date were considered in the survey. Another
problem was the diversity of ratings between partners. To cope with this problem we
normalized the ratings of each partner. Finally the significance of the ratings is rather low,
since we have only a sample of two ratings for each pair of mechanisms. However, the
results of the survey provide an impression of the synergies partners find in the traffic
management solutions with respect to their limited perspective.
Based on the results of the survey we could identify two major cliques: One clique
including all solutions dealing with content placement or nano data centers, i.e. SMSP,
vINCENT, SECD, HORST and RBTracker. The second clique contains solutions
addressing inter-cloud communication, i.e. MRA, OpenFlow, Inter-ALTO, ICC and DTM.
Hence, the cliques reflect the end-user focused and operator focused scenario definition.
However, the synergies identified in the survey were not all in-line with the categories
defined in Section 6.2. A category which was not covered by the cliques is Category C
Ranking. For partners the ranking aspect of the mechanisms did not play such a big role
rating the synergies. However, a closer look to these mechanisms has shown that the
metrics of these mechanisms for ranking could be combined which yields high synergy
and the opportunity to evaluate them on a common basis. Finally, the categories identified
in this subsection provide the synergies of the traffic management mechanisms in detail,
which cover the high-level synergies identified in the survey.

Public


7 Summary and Conclusions
In this deliverable D2.2, SmartenIT had four main targets:
The first target was to propose and specify incentive-based TM mechanisms and their
intelligence for the efficient handling of traffic generated by overlay applications in an
energy efficient manner. The main feature of TM mechanisms is to knowingly influence
transfer of traffic flows among stakeholders of distributed cloud-based architectures, each
with a relevantly defined more specific target or objective. Each of the proposed TM
mechanisms try to explore characteristics of contemporary cloud-based overlay
applications, and/or use social awareness information, and/or enhance energy efficiency,
while respecting user-perceived QoE aspects.
The second target was to describe all TM mechanisms proposed by SmartenIT partners.
In fact, direct documentation of all proposals made, reflecting wide scope of partners
interests, resulted in inclusion of 12 solutions in this document, with rather different levels
of maturity, flexibility and potential for further implementation.
The third target was to develop theoretical and simulation models for the evaluation of the
specified TM mechanisms and of the benefits and reactions of different stakeholders when
implementing such TM mechanisms
The fourth target was to present preliminary results of the evaluation for certain TM
mechanisms, which was complemented with example configuration patterns for OpenFlow
and MPLS test configuration settings.
All targets are following and extending general and high-level investigations, analysis and
conclusions about currently deployed cloud-based applications carried out within WP1 and
documented in deliverables D1.1 and D1.2. Additionally these targets also take advantage
of preliminary descriptions, characteristics and worked out taxonomy of TM solutions
documented in D2.1, starting from classical management concepts described in widely
accepted TM standards and proposing emerging traffic management solutions relevant for
cloud-based applications; these solutions reflect partners visions of concepts potentially
applicable for SmartenIT architecture and practical implementation, and explore cross-
layer incentives, social awareness information and energy efficiency.
Besides proposing a rather wide range of TM mechanisms, there was an important effort
to describe those using consistent categories to be able to clearly compare them. This
approach in some cases led to some formal difficulties but finally it allows understanding
the scope of the mechanisms, their key aspects and innovations, and their overlap and
complementarities.
The next two subsections summarize key outcomes and lessons learnt from our
investigations towards the four targets mentioned above, and finally outline the next steps
and open issues to be addressed in future works of SmartenIT, mainly oriented towards
specification and implementation of selected solutions done within WP3.
7.1 Key Outcomes and Lessons Learnt
This deliverable documents all TM mechanisms and frameworks proposed or suggested
by partners, both classical, currently thoroughly elaborated, widely known (such as MPLS),
and also new, prospective traffic management solutions made as original proposals for
emerging applications (e.g, UNaDa, vINCENT). They are described and analyzed with a
Public

main focus on suitability of traffic management mechanisms with regard to employed
incentives, energy efficiency and social awareness aspects. Besides presenting TM
mechanisms with basic features, there are formulated conclusions and directions for
SmartenIT project.
A representative selection of different applications for SmartenIT TM solutions is
given in Subsection 3.1. This work is an extract from achievements accomplished within
WP1 on the selection of overlay applications whose traffic is to be tackled by SmartenIT.
This description is customized for the purposes of WP2 work on mapping mechanisms and
scenarios and use cases.
Then, theoretical models for the assessment of QoE/QoS relations developed within
SmartenIT as well as other metrics related to the selected application categories are
provided in Subsection 3.2. Simulation models for video streaming and Dropbox are also
included.
The definition and detailed description of all TM mechanisms is provided in Section 4.
This part of D2.2 is extensive since proposed mechanisms are presented both as
theoretical concepts and also a preliminary mapping to the architecture is included. In
order to make this presentation more informative and to allow for their comparison, all
descriptions are made using the same template. In particular, each description of TM
mechanism starts from the main addressed scenarios (the set of scenarios being worked
out within SmartenIT), then the key influence factors, i.e. parameters that have significant
impact on their performance, key performance metrics that should be monitored and are
aimed to be improved by each mechanisms, and finally in some cases also preliminary
evaluation results.
Section 6 provides a very important outcome by analyzing potential synergies between
all specified mechanisms, obtained as results of comparative studies and interactions
among partners, with the general aims to work out taxonomy of solutions and also to
recognize addressed scenarios covered, as well as related overlaps and
complementarities of the mechanisms.
Results are included in two tables. All features of proposed mechanisms, potential
aspects for intervention, evaluation metrics are summarized in Table 16. The concise
form of comparison of all mechanisms together with an overview of mechanisms
decision-taking process and envisioned innovation is included in Table 17. From the
above two tables, the lessons learnt can be summarized in the following points:
most mechanisms are in a very initial state, mainly described as frameworks;
additional work will be carried out in the remainder of the project to make them
concrete and precise,
there is a clear trend towards content placement enhanced with social awareness
for the mechanisms focusing to the end user scenario,
operator-focused mechanisms consider aggregated traffic and theres a question
how social-awareness can help in this direction,
the underlying business model is a significant factor with respect to the deployment
of such mechanism(s); e.g. its not clear how ISPs can bias the decisions in the
user-focused mechanisms or how operator-focused mechanisms can consider
user-specific metrics,
regarding the cross-layer impact of the mechanisms, user-oriented mechanisms,
combine overlay- and social-related metrics, while operator-focused mechanisms
Public

mainly consider network and cost metrics; combining complementary mechanisms
can lead to true cross-layer solutions,
energy-awareness is treated indirectly, in most cases.
Two general and two theoretical SmartenIT objectives, defined preliminarily in DoW[110]
are discussed in Section 8. Mapping of objectives on tasks within the general framework of
WP2 is also performed.
In order to outline implementation aspects for proposed TM mechanisms example
configuration patterns for OpenFlow and MPLS test configuration settings were included
as Appendices 12.1 and 12.2. The presented settings suggest usability of those general
TM mechanisms to the implementation phase to be done within WP3 work.
7.2 Next Steps
The final content of deliverable D2.2 reflects important studies and interactions within
SmartenIT consortium aiming at making classification of proposed impressive number of
12 TM solutions in order to figure out their relevance for the project, value for
implementation, relation to scenarios and to the architecture, etc.
Results from D2.2 will directly impact the process of choosing traffic management
mechanisms for practical implementation and then evaluation within the experimental
phase of SmartenIT project. Also, the complete range of recognized functions and key
performance metrics for TM solutions will be translated into specifications of architecture
entities and components being specified, designed, implemented and assessed. The direct
task relying on D2.2 results is T2.3 with the target to finally select a subset of TM
mechanisms to be further specified, in accordance with the system architecture designed
within T3.1 and documented in D3.3. Furthermore, the set of traffic management solutions
provided in D2.2 serves as basis for the decision on use cases that will be defined in D2.3.

Public

8 Smart Objectives
Through this document, four SmartenIT SMART objectives defined in Section B1.1.2.4 of
the SmartenIT Description of Work (DoW, [110]) have been partially addressed. Namely,
two overall (O1 and O3, see Table 18) and two theoretical (O1.2 and O1.3, see Table 19)
SMART objectives were addressed.
The overall Objective 3 is defined in DoW as following:
Objective 3 SmartenIT will investigate economically, QoE, and energy-wise such models
in theory by simulations to guide the respective prototyping work in terms of
a successful, viable, and efficient architectural integration, framework and
mechanisms engineering, and its subsequent performance evaluation.
The research topics in this deliverable cover: Simulative investigations and Incentive
compatibility. This document presents in Chapter 3 first the selection of relevant
application and subsequently models for relevant applications. The models include
simulation and theoretical models, addressing objective 3. In Chapter 4, detailed
descriptions of traffic management solutions were presented that cover objective 1. In that
chapter traffic management solutions were presented with respect to incentives, energy
efficiency and social awareness, contributing to T2.3 (Traffic Management Mechanisms
Specification) and T2.4 (Theoretical Investigations).
This deliverable contributes to answering two specific theoretical questions:
1. Objective O1.2: What overlay information is needed for the network to optimize its
resources and how can this information be conveyed from overlay operators,
application providers, or end users to network operators through relevant and viable
incentives?
In order to find relevant information to optimize resources, it is important to find first
the relevant application for further analysis. The relevance of applications was
defined and presented in Chapter 3. Based on these relevant applications,
simulation and theoretical models were presented to assess and further analyze the
relevant information. An extensive analysis of incentives of overlay operators,
application providers, end users, or network operators was already addressed in
Chapter 5 in D1.1 [1].
2. Objective O1.3: Which incentive schemes will be needed (for application overlays
and clouds) to adapt to the existing physical network structure? For what
optimization criteria (revenue, energy efficiency) can this be attained more
effectively?
Chapter 4 discusses several incentive schemes such as rewarding home router
owners, who make their private WiFi available to other people assuming an online
social network application. Another mentioned incentive is inspired by operators,
which want to coordinate their efforts in order to decrease transfer costs on inter-AS
links or improve QoE by nodes. As more and more data is stored in clouds an
incentive mechanism must be developed, which deals with load balancing across
clouds to manage user traffic. The proposed application vINCENT incorporates an
incentive mechanism allowing for the consideration of social data and supports the
Tit-for-Tat mechanism.
The introduced incentive schemes can be used to optimize the energy efficiency, as
well as load balancing on resources and traffic.
Public

According to the SMART objectives set within SmartenIT DoW, those ones of relevance
for D2.2and the respective task within WP2, i.e., T2.3 andT2.4 the targeted for objectives
have been met.
Table 18: Overall SmartenIT SMART objective addressed. (Source: [110])
Objective
No.
Specific
Measurable
Achievable Relevant
Timely
Deliverable
Number
Mile Stone
Number
O1
Incentive
compatibility
D2.2 Study, design Advanced MS2.2
O3
Simulative
investigations
D2.2
Analysis. Simulation,
evaluation
Complex MS2.2
Table 19: Theoretical SmartenIT SMART objectives addressed. (Source: [110])
Objective
ID
Specific
Measurable
Achievable Relevant
Timely
Metric
Project
Month
O1.2
What overlay information is
needed for the network to
optimize its resources and
how can this information be
conveyed from overlay
operators, application
providers, or end users to
network operators through
relevant and viable
incentives?
Number of identified types of
information (AS affiliation,
requested service, legal
requirements, energy costs,
traffic demand, traffic
volume, latency/QoE
requirements, social
information and meta-
information)
Design,
simulation
T2.3
Highly
relevant
output of
relevance for
providers and
users
M24
O1.3
Which incentive schemes will
be needed (for application
overlays and clouds) to adapt
to the existing physical
network structure? For what
optimization criteria (revenue,
energy efficiency) can this be
attained more effectively?
Number of identified
incentive schemes and their
cost reduction in terms of
money, traffic, and energy
footprint
Design,
simulation
T2.3, T2.4
Extremely
relevant
output of
relevance for
providers and
users
M24

Public

9 References
[1] 3rd Generation Partnership Project (3GPP) TS 32.500 Telecommunication
management: Self-Organizing Networks (SON); Concepts and requirements (2013)
www.3gpp.org/SON, October 2013.
[2] O. Abboud, K. Pussep, A. Kovacevic, and R. Steinmetz: Quality Adaptive Peer-to-
Peer Streaming using Scalable Video Coding; Wired-Wireless Multimedia Networks
and Services Management, 2009.
[3] V. K. Adhikari, J.Sourabh, and Z. Zhi-Li: Where Do You Tube? Uncovering YouTube
Server Selection Strategy; In Computer Communications and Networks (ICCCN),
2011 Proceedings of 20th International Conference, IEEE, 2011.
[4] V. K. Adhikari, J. Sourabh, Y. Chena and Z. Zhi-Li. Vivisecting YouTube: An active
measurement study; In INFOCOM Proceedings IEEE, 2012.
[5] V. Aggarwal, A. Feldmann, and C. Scheideler: Can ISPs and P2P users cooperate for
improved performance? SIGCOMM Comput. Commun. Rev., July 2007.
[6] Akamai Technologies: Akamai NetSession; http://www.akamai.com/client, October
2013.
[7] R. Alimi, R. Penno and Y. Yang: ALTO Protocol; Internet-Draft draft-ietf-alto-protocol-
17, July 2013.
[8] P. Amrehn, K. Vandenbroucke, T. Hofeld, K. de Moor, M. Hirth, R. Schatz and
Pedro Casas; Need for Speed? On Quality of Experience for File Storage Services;
to be published at the 4th International Workshop on Perceptual Quality of Systems
(PQS 2013), Vienna, Austria, September 2013.
[9] L. Andersson, I. Minei and B. Thomas: LDP Specification; RFC 5036, October 2007.
[10] O. Arnold, F. Richter, G. Fettweis, and O. Blume: Power consumption modeling of
different base station types in heterogeneous cellular networks; in Future Network
and Mobile Summit, 2010.
[11] C. Anliker: Caching and Replication Strategies for RB-Tracker; (Master Thesis)
University of Zurich, Faculty of Economics, 2013.
[12] D. Awduche, J. Malcolm, J. Agogbua, M. O'Dell and J. McManus: Requirements for
Traffic Engineering Over MPLS; RFC 2702, September 1999.
[13] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan and G. Swallow: RSVP-TE:
Extensions to RSVP for LSP Tunnels; RFC 3209, December 2001.
[14] B. Belter: Federating SDN-enabled islands with an extended NSI Framework;
Maastricht, Netherlands https://tnc2013.terena.org/core/presentation/29, June 2013.
[15] BitTorrent Inc.: BitTorrent Live; http://live.bittorrent.com/, October 2013.
[16] T. Bocek: TP-Link TL-MR3020, 2013; http://nope.tv/articles/tp3020/, August 2013.
[17] A. Bouch, A. Kuchinsky, and N. Bhatti: Quality is in the Eye of the Beholder: Meeting
Users Requirements for Internet Quality of Service; Proceedings of the SIGCHI
conference on Human Factors in Computing Systems, 2000.
[18] J. Buysse, K. Georgakilas, A. Tzanakaki, M. De Leenheer, B. Dhoedt, and C.
Develder: Energy-efficient resource-provisioning algorithms for optical clouds;
Journal of Optical Communications and Networking, 2013.
Public

[19] P. Casas, H. Fischer, S. Suette, and R. Schatz: A first look at quality of experience in
personal cloud storage services; IEEE-ICC International Workshop on Mobile Cloud
Networking, Budapest, Hungary, 2013.
[20] Cisco: Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update,
2012-2017; 2013.
[21] M. Cha, H. Kwak, P. Rodriguez, Y. Ahn, and S. Moon: I tube, you tube, everybody
tubes: analyzing the world's largest user generated content video system;
In Proceedings of the 7th ACM SIGCOMM conference on Internet
measurement (IMC '07), 2007.
[22] X. Cheng and J. Liu: NetTube: Exploring Social Networks for Peer-to-Peer Short
Video Sharing; IEEE INFOCOM, April 2009.
[23] X. Cheng and J. Liu: NetTube: Exploring Social Networks for Peer-to-Peer Short
Video Sharing; IEEE INFOCOM, April 2009.
[24] X. Cheng, J. Liu and C. Dale: Understanding the Characteristics of Internet Short
Video Sharing: YouTube as a Case Study; in IEEE Transactions on Multimedia 2010.
[25] Cisco Open Network Environment: An Introduction to OpenFlow; Cisco Open Network
Environment Webinar Series,
http://www.cisco.com/web/solutions/trends/open_network_environment/docs/cisco_o
ne_webcastan_introduction_to_openflowfebruary142013.pdf, February 2013.
[26] R. Cohen and G. Nakibly: Maximizing Restorable Throughput in MPLS Networks;
Proc. IEEE Infocom, Phoenix, AZ, USA, 2008.
[27] Z. Dulinski, M. Kantor, W. Krzysztofek, R. Stankiewicz and P. Cholda: Optimal Choice
of Peers based on BGP Information; Proceedings of 2010 IEEE International
Conference on Communications (ICC), May 2010.
[28] Z. Dulinski, R. Stankiewicz, P. Wydrych, M. Kantor and P. Cholda: Cost-driven peer
rating algorithm; In Communications (ICC) IEEE, June 2011.
[29] Z. Dulinski, P. Wydrych, R. Stankiewicz: Inter-ALTO Communication Problem
Statement; http://tools.ietf.org/html/draft-dulinski-alto-inter-problem-statement-01,
January 2012.
[30] The ETICS project: Deliverable D2.3 - Business and technical requirements for future
network architectures Final version, April 2013.
[31] Facebook Marketing: Sponsor your Page posts;
https://www.facebook.com/notes/facebook-marketing/sponsor-your-page-
posts/10150675727637217, October 2013.
[32] Facebook: Key Facts Facebooks latest news, announcements and media
resources; http://newsroom.fb.com/Key-Facts, October 2013.
[33] A. Farrel, J.-P. Vasseur and A. Ayyangar: A Framework for Inter-Domain Multiprotocol
Label Switching Traffic Engineering; RFC 4726, November 2006.
[34] A. Farrel, J. P. Vasseur, and G. Ash: A Path Computation Element (PCE)-Based
Architecture, IETF RFC 4655, 2006.
[35] A. Farrel, A. Ayyangar and J.-P. Vasseur: Inter-Domain MPLS and GMPLS Traffic
Engineering -- Resource Reservation Protocol-Traffic Engineering (RSVP-TE)
Extensions; RFC 5151, February 2008.
[36] The FELIX Project: FP7 FELIX project website; http://ict-felix.eu/, October 2013.
Public

[37] M. Fiedler, T. Hofeld and P. Tran-Gia: A generic quantitative relationship between
quality of experience and quality of service; IEEE Network - Special issue on
Improving Quality of Experience for Network Services 24(2), March 2010.
[38] B. Fortz, J. Rexroth and M. Thorup: Traffic engineering with traditional IP routing
protocols; IEEE Com. Magaz., October 2002.
[39] B. Fortz and H. mit: Efficient techniques and tools for intra domain traffic
engineering; Internat.Transact.in Operational Research, 2011.
[40] R. Gao: Interdomain traffic engineering for multi-homed networks; Georgia Tech, PhD
Thesis, 2007.
[41] N. Gautam, H. Petander, and J. Noel: A comparison of the cost and energy efficiency
of prefetching and streaming of mobile video; in Proceedings of the 5th Workshop on
Mobile Video, 2013.
[42] H. Gredler, J. Medved, S. Previdi, A. Farrel and S. Ray: North-Bound Distribution of
Link-State and TE Information using BGP; draft-ietf-idr-ls-distribution-03, May 2013.
[43] H. Gredler, J. Medved and S. Previdi: Advertising Traffic Engineering Information in
BGP; draft-gredler-bgp-te-01, July 2011.
[44] I. Grigorik: High Performance Browser Networking: OReilly, 2013.
[45] C. Gross, F. Kaup, D. Stingl, B. Richerzhagen, D. Hausheer, and R. Steinmetz:
EnerSim&: An Energy Consumption Model for Large-Scale Overlay Simulators; in
LCN, 2013.
[46] W. Guo, S. Wang, T. OFarrell, and S. Fletcher: Energy Consumption of 4G Cellular
Networks: A London Case Study; in IEEE Vehicular Technology Conference
(VTC2013-Spring), 2013.
[47] F. Hartleb, G. Halinger and S. Kempken: Network Planning and Dimensioning for
Broadband Access to the Internet; Handbook of Research on Telecommunications
Planning and Management for Business, IGI global publ., 2009.
[48] G. Halinger, S. Schnitter and M. Franzke: The efficiency of traffic engineering with
regard to failure resilience; Telecommunication Systems Vol. 29, 2005.
[49] G. Halinger, G. Nunzi, C. Meirosu, C. Fan and F.-U. Andersen: Traffic Engineering
Supported by Inherent Network Management: Analysis of Resource Efficiency and
Cost Saving Potential; International Journal on Network Management, Wiley, 2011.
[50] G. Halinger, A. Schwahn and F. Hartleb: 2-state (semi-)Markov processes beyond
Gilbert-Elliot: Traffic and channel models based on 2. Order statistics; Proc. IEEE
Infocom, Turin, Italy, 2013.
[51] G. Halinger and F. Hartleb: Content Delivery and Caching from a Network Providers
Perspective; Special Issue on Internet based Content Delivery, Computer Networks
55, 2011.
[52] G. Halinger and T. Kunz: Challenges for Routing and Search in Dynamic and Self-
organizing Networks; Proc. Ad Hoc Now Conf., Murcia, Spain, 2009.
[53] K. Hinton, J. Baliga, M. Feng, R. Ayre, and R. S. Tucker: Power Consumption and
Energy Efficiency in the Internet; IEEE Network, April 2011.
[54] H. Hlavacs, G. Da Costa, and J.-M. Pierson: Energy Consumption of Residential and
Professional Switches; in International Conference on Computational Science and
Engineering, 2009.
Public

[55] T. Hofeld, D. Hausheer, F. V.Hecht, F. Lehrieder, S. Oechsner, I. Papafili, P. Racz et
al.: An Economic Traffic Management Approach to Enable the Triple Win for Users,
ISPs, and Overlay Providers; In Future Internet Assembly, 2009.
[56] T. Hofeld, R. Schatz, S. Biedermann, A. Platzer, S. Egger, M. Fiedler: The Memory
Effect and Its Implications on Web QoE Modeling; In: Proc. of 23rd International
Teletraffic Congress (ITC23), San Francisco, USA, September 2011.
[57] T. Hofeld, R. Schatz, M. Seufert, M. Hirth, T. Zinner and P. Tran-Gia: Quantification
of YouTube QoE via Crowdsourcing; In: Proc. of IEEE International Workshop on
Multimedia Quality of Experience - Modeling, Evaluation, and Directions (MQoE
2011), Dana Point, CA, USA, December 2011.
[58] T. Hofeld, S. Egger, R. Schatz, M. Fiedler, K. Masuch and C. Lorentzen; Initial Delay
vs. Interruptions: Between the Devil and the Deep Blue Sea; QoMEX 2012, Yarra
Valley, Australia, July 2012.
[59] T. Hofeld, R. Schatz, E. Biersack, L. Plissonneau; Internet Video Delivery in
YouTube: From Traffic Measurements to Quality of Experience in Data Traffic
Monitoring and Analysis: From Measurement, Classification and Anomaly Detection
to Quality of Experience; Editors: Ernst Biersack, Christian Callegari, Maja
Matijasevic, Springers Computer Communications and Networks Series, 2012.
[60] T. Hofeld, F. Liers, R. Schatz, B. Staehle, D. Staehle, T. Volkert and F. Wamser:
Quality of Experience Management for YouTube: Clouds, FoG and the AquareYoum;
PIK - Praxis der Informationsverarbeitung und -kommunikation (PIK), August 2012.
[61] IETF: Application-Layer Traffic Optimization (alto); http://datatracker.ietf.org/wg/alto/,
October 2013.
[62] R. Jain, D.M. Chiu and W.R. Hawe: A quantitative measure of fairness and
discrimination for resource allocation in shared computer system; Eastern Research
Laboratory, Digital Equipment Corp., 1984.
[63] C. Joe-Wong, S. Sen, T. Lan, M. Chiang: Multiresource Allocation: Fairness
Efficiency Tradeoffs in a Unifying Framework; in Proceeding of INFOCOM 2012,
IEEE, 2012.
[64] Juniper Networks: Network Scaling with BGP Labeled Unicast;
http://www.juniper.net/us/en/local/pdf/design-guides/8020013-en.pdf, January 2010.
[65] Juniper Networks: Junos Logical Systems configuration
guidelines;http://www.juniper.net/techpubs/en_US/junos11.4/information-
products/pathway-pages/config-guide-logical-systems/config-guide-logical-
systems.html#configuration, October 2013.
[66] Juniper Networks: Junos Network Operating System;
http://www.juniper.net/us/en/products-services/nos/junos/, October 2013.
[67] D. Katz, K. Kompella and D. Yeung: Traffic Engineering (TE) Extensions to OSPF
Version 2; RFC 3630, September 2003.
[68] Fabian Kaup and David Hausheer: Optimizing Energy Consumption and QoE on
Mobile Devices; In: IEEE International Conference on Network Protocols (ICNP
2013), October 2013.
[69] E. Koukoumidis, D. Lymberopoulos, K.Strauss,J. Liu and D. Burger: Pocket
cloudlets; ACM SIGPLAN Notices, 2012.
Public

[70] G. Kramer: On generating self-similar traffic using pseudo-Pareto distribution; A Short
Tutorial-Like, Network Research Lab., Department of Computer Science, University
of California, 2000.
[71] N. Laoutaris, G. Smaragdakis, P. Rodriguez and R. Sundaram: Delay tolerant bulk
data transfers on the Internet; In Proceedings of the eleventh international joint
conference on Measurement and modeling of computer systems. ACM, 2009.
[72] A. Lareida, T. Bocek, M. Waldburger, and B. Stiller: RB-Tracker: A Fully Distributed,
Replicating, Network-, and Topology-aware P2P CDN; IFIP/IEEE International
Symposium on Integrated Network Management Workshop, IM'13, IEEE, 2013.
[73] M. Lasserre and V. Kompella: Virtual Private LAN Service (VPLS) Using Label
Distribution Protocol (LDP) Signaling; RFC 4762, January 2007.
[74] J.-L. Le Roux, J.-P. Vasseur and J. Boyle: Requirements for Inter-Area MPLS Traffic
Engineering; RFC 4105, June 2005.
[75] F. Lehrieder, S. Oechsner, T. Hossfeld, D. Staehle, Z. Despotovic, W. Kellerer and M.
Michel: Mitigating unfairness in locality-aware peer-to-peer networks; International
Journal of Network Management, January 2011.
[76] Z. Li, H. Shen, H. Wang, G. Liu and J. Li: SocialTube: P2P-assisted video sharing in
online social networks; IEEE INFOCOM, 2012.
[77] H. Li, H. Wang, J. Liu and K. Xu: Video Requests from Online Social Networks:
Characterization, Analysis and Generation; IEEE INFOCOM, 2013
[78] LongTail Video: Social Sharing - Publish Your Videos to Facebook, Twitter &
YouTube; http://www.longtailvideo.com/blog/22445/social-sharing-publish-your-
videos-to-facebook-twitter-youtube/, October 2013.
[79] G. S. Machado, F. V. Hecht, M. Waldburger and B. Stiller: Bypassing Cloud
Providers Data Validation to Store Arbitrary Data; IFIP/IEEE International
Symposium on Integrated Network Management, Ghent, Belgium, May 2013.
[80] J. McAuley and J. Leskovec: Learning to Discover Social Circles in Ego Networks;
http://snap.stanford.edu/data/egonets-Facebook.html, NIPS, 2012.
[81] N. McKeown et al.: OpenFlow Documents; http://www.openflow.org/wp/documents/,
October 2013.
[82] M. Menth, M. Duelli, R. Martin and J. Milbrandt: Resilience Analysis of Packet
Switched Networks; IEEE/ACM Trans. on Networking Vol. 17/6, 2009.
[83] Mininet Team: Mininet: An Instant Virtual Network on your Laptop (or other PC)
Mininet; http://mininet.org/, October 2013.
[84] Network Service Interface Working Group: Overview - NSI WG - Open Grid Forum;
http://redmine.ogf.org/projects/nsi-wg, October 2013.
[85] A.J. Nicholson; B.D. Noble, Bread Crumbs: Forecasting Mobile Connectivity, Proc.
MobiCom, 2008.
[86] The Nielsen Company: Top U.S. Web Brands and Parent Companies for September
2009; http://www.nielsen.com/us/en/newswire/2009/top-u-s-web-brands-and-parent-
companies-for-september-2009.html, October 2013
[87] B. Niven-Jenkings, F. Le Faucheur, N. Bitar: Content Distribution Network
Interconnection (CDNI) Problem Statement; http://tools.ietf.org/html/rfc6707,
September 2012.
Public

[88] Open Networking Foundation: Software-Defined Networking: The New Norm for
Networks; ONF White Paper, 2012.
[89] Open Networking Foundation: OpenFlow; https://www.opennetworking.org/sdn-
resources/onf-specifications/openflow, October 2013.
[90] Open vSwitch Team: Open vSwitch; http://openvswitch.org/, October 2013.
[91] Oracle Corporation: Managing Your Facebook Community: Findings on Conversation
Volume by Day of Week, Hour, and Minute; Oracle White Paper, August 2012.
[92] D. Papadimitriou, T. Zahariadis, P. Martinez-Julia, I. Papafili, V. Morreale, F. Torelli,
and P. Demeester. Design principles for the future Internet architecture. In The
Future Internet; Springer Berlin Heidelberg, 2012.
[93] M. Pioro and D. Mehdi: Routing, Flow and Capacity Design in Communication and
Computer Networks; Morgan Kaufmann Publishers, 2004.
[94] I. Poese, B. Franck, B. Ager, G. Smaragdakis, S. Uhlig and A. Feldmann: Improving
Content Delivery with PaDIS; Internet Computing, June 2012.
[95] Project Floodlight: Floodlight OpenFlow Controller;
http://www.projectfloodlight.org/floodlight/, October 2013.
[96] Project Floodlight: Static Flow Pusher API;
http://www.openflowhub.org/display/floodlightcontroller/Static+Flow+Pusher+API,
March 2013.
[97] Y. Rekhter, E. Rosen: Carrying Label Information in BGP-4; RFC 3107, May 2001.
[98] E. Rosen, A. Viswanathan and R. Callon, Multiprotocol Label Switching Architecture;
RFC 3031, January 2001.
[99] E. Rosen and Y. Rekhter: BGP/MPLS IP Virtual Private Networks (VPNs); RFC 4364,
February 2006.
[100] Sandvine: Spring 2012 Global Internet Phenomena Repotr; Technical Report,
Sandvine Corporation, 2012.
[101] R. Schatz, S. Egger and T. Hofeld: Understanding Ungeduld - Quality of Experience
Assessment and Modeling for Internet Applications; In: Proc. of 11th Wrzburg
Workshop on IP: Joint ITG and Euro-NF Workshop Visions of Future Generation
Networks (EuroView 2011), Wrzburg, Germany, August 2011.
[102] R. Schatz, T. Hofeld, L. Janowski and S. Egger: From Packets to People: Quality of
Experience as a New Measurement Challenge in Data Traffic Monitoring and
Analysis: From Measurement, Classification and Anomaly Detection to Quality of
Experience; Editors: Ernst Biersack, Christian Callegari, Maja Matijasevic, Springers
Computer Communications and Networks Series, 2012.
[103] S. Schnitter, A. Barth, O. Schnitter and M. Horneffer: Benefits from 2-layer traffic
engineering in IP/MPLS networks; Proc. Networks, 2008.
[104] S. Schnitter, F. Hartleb and M. Horneffer: Quality-of-service class specific traffic
matrices in ip/mpls networks; Internet Measurement Conference IMC, 2007.
[105] H. Schwarz, M. Wien: The Scalable Video Coding Extension of the H.264/AVC
Standard; IEEE Signal Processing Magazine, 2008.
[106] J. Seedorf and E. Burger: Application-Layer Traffic Optimization (ALTO) Problem
Statement; RFC 5693, October 2009.
Public

[107] The SmartenIT project: Deliverable D1.1 - Report on Stakeholders Characterization
and Traffic Characteristics; April 2013.
[108] The SmartenIT project: Deliverable D1.2 - Report on Cloud Service Classifications
and Scenarios; October 2013.
[109] The SmartenIT project: Deliverable D2.1 - Report on Overview ofOverlay Traffic
Management Solutions; April 2013.
[110] The SmartenIT project: Grant Agreement for STREP: Annex I 'Description of Work
(DoW)'; 2012.
[111] The SmoothIT Project: Deliverable D2.4 Performance, Reliability, and Scalability
Investigations of ETM mechanisms; 2011.
[112] SOCRATES project of the European Commission: Self-Optimisation and Self-
configuration in Wireless Networks, www.fp7-socrates.org, 2010.
[113] Statistic Brain: Facebook Statistics; http://www.statisticbrain.com/facebook-statistics/,
October 2013.
[114] TeleGeography: IP transit price declines steepen;
http://www.telegeography.com/products/commsupdate/articles/2012/08/02/ip-transit-
price-declines-steepen/, August 2012.
[115] T. Tominaga, T. Hayashi, J. Okamoto and A. Takahashi: Performance comparisons
of subjective quality assessment methods for mobile video; In: Proc. of 2nd
International Workshop on Quality of Mulitmedia Experience (QoMEX 2010),
Trondheim, Norway, July 2010.
[116] R. Torres, A. Finamore, J. R. Kim, M. Mellia, M. M. Munafo and S. Rao: Dissecting
video server selection strategies in the YouTube CDN;2011 31st International
Conference on Distributed Computing Systems (ICDCS),IEEE, 2011.
[117] S. Traverso, K. Huguenin, I. Trestian, V. Erramilli, N. Laoutaris and K. Papagiannaki:
TailGate: handling long-tail content with a little help from friend; ACM WWW, 2012.
[118] V. Valancius, N. Laoutaris, L. Massouli, C. Diot, and P. Rodriguez: Greening the
Internet with Nano Data Centers. ACM CoNEXT, New York, 2009.
[119] Wireless Broadband Alliance: WBA Industry Report 2011: Global Developments in
Public Wi-Fi; 2011.
[120] F. Wuhib, M. Dam, R. Stadler and A. Clemm: Robust Monitoring of Network-wide
Aggregates through Gossiping; IEEE Transactions on Network and Service
Management, Vol. 6, No. 2, June 2009.
[121] H. Xie, A. Krishnamurthy, A. Silberschatz and Y.R. Yang: P4P: Provider Portal for
Applications; Proc. ACM Sigcomm, Seattle, USA, 2008.
[122] R. Zhang and J.-P. Vasseur: MPLS Inter-Autonomous System (AS) Traffic
Engineering (TE) Requirements; RFC 4216, November 2005.
[123] L. Zhang, B. Tiwana, R. P. Dick, Z. Qian, Z. M. Mao, Z. Wang, and L. Yang: Accurate
Online Power Estimation and Automatic Battery Behavior Based Power Model
Generation for Smartphones; in CODES + ISSS10, 2010.
[124] F. Zhou, L. Zhang, E. Franco, A. Mislove, R. Revis and R. Sundaram: WebCloud:
Recruiting Social Network Users to Assist in Content Distribution; IEEE Computer
Society NCA, 2012.
Public

10 Abbreviations
3GPP 3rd Generation Partnership Project
AAA Authentication, Authorization, Accounting
API Application Programming Interface
BGP Border Gateway Protocol
CAGR Compound Annual Growth Rate
CAPEX Capital Expenditure
CDF Cumulative Distribution Function
CDN Content Distribution Network
CPP Critical Peak Pricing
CPU Central Processing Unit
CRM Customer Relationship Management
DC Data Center
DFA Deterministic Finite Automaton
DNS Domain Name System
DoW Description of Work
DPI Deep Packet Inspection
EGP Exterior Gateway Protocol
FoF Friend-of-Friend
GPS Global Positioning System
GSM Global System for Mobile Communication
HSDPA High-Speed Downlink Packet Access
HTML Hyper-Text Markup Language
HTTP Hyper-Text Transfer Protocol
IaaS Infrastructure as a Service
ICT Information and Communications Technology
IGP Interior Gateway Protocol
IP Internet Protocol
IRT Interoute S.A.
ISN Index Serving Node
ISP Internet Service Provider
IT Information Technology
LTE Long-Term Evolution
MNO Mobile Network Operator
MOS Mean Opinion Score
Public

MPLS Multi-Protocol Label Switching
NADA Nano Data Center
NIST National Institute of Standards and Technology
NSP Network Service Provider
OLA Operation Level Agreement
OPEX Operating Expenditure
OSN Online Social Network
OSPF Open Shortest Path First
OTT Over-The-Top
P2P Peer-to-Peer
PaaS Platform as a Service
PC Personal Computer
PoI Point of Interest
RAM Random Access Memory
RTMP Real Time Messaging Protocol
QoE Quality-of-Experience
QoS Quality-of-Service
REST Representational State Transfer
SaaS Software as a Service
SLA Service Level Agreement
SnF Store and Forward
SOAP Simple Object Access Protocol
SSL Secure Sockets Layer
STREP Specific Targeted Research Project
T4T Tit-for-Tat
TCP Transmission Control Protocol
TDG Telekom Deutschland GmbH
ToU Time of Use
TV Television
UDP User Datagram Protocol
VC Value Chain
VDC Virtual Data Centre
UNADA User Owned Nano Data Center
VM Virtual Machine
VN Value Network
Public

VNC Value Network Configuration
VoIP Voice over IP
VPN Virtual Private Network
XML X-tensible Markup Language
WAN Wide Area Network
WiFi Wireless Fidelity - 802.11 standards family

11 Acknowledgements
Besides the deliverable authors, as indicated in the document control section, this
deliverable was made possible with the considerable help of the WP2 team of SmartenIT.
Many thanks to all of them! Special thanks go to the reviewers Spiros Spirou, George D.
Stamoulis, Krzysztof Wajda and Burkhard Stiller for detailed revisions and valuable
comments.

Public

12 Appendices
12.1 OpenFlow Test Configuration Details
Following is the test Python script used to deploy the test topology shown in Figure 56 by
means of the Mininet tool.
from mininet.net import Mininet
from mininet.cli import CLI
from mininet.node import RemoteController,Node,Host
from mininet.log import setLogLevel, info
from mininet.link import Intf

def smartenIT():

net Mininet( controllerRemoteController, a!to"etMacsTr!e )
net.addController( #c$#, controller RemoteController, ip%&'(.$.$.&% )

%Create "martenIT topolog)..%

left*ateI+ #lg#
left*ate+,I+ #$$$$$$e$df-.(&d/#

coreI+s 0 #s1&#:#$$$&.2-(--3-f-3(#,
#s1'#:#$$$'.2-(--3-f-3(#,
#s12#:#$$$&.2-(--3-f.3(#,
#s1.#:#$$$'.2-(--3-f.3(# 4
core"1itc5es 04

rig5t*ateI+ #rg#
rig5t*ate+,I+ #$$$$-'-f-''&'c2'#

rig5tHostI+s 0 #r5&#: 0#mac#:#3'32$$($2.$(#4,
#r5'#: 0#mac#:#3'32$$3.(3&(#4,
#r56#: 0#mac#:#3'32$$a$'&&f#4,
#r52#: 0#mac#:#3'32$$7&ed32#4 4
rig5tHosts 04

leftHostI+s 0 #l5&#: 0#mac#:#3'32$$($2.$-#4,
#l5'#: 0#mac#:#3'32$$3.(3&-#4,
#l56#: 0#mac#:#3'32$$a$'&'$#4,
#l52#: 0#mac#:#3'32$$7&ed33#4 4
leftHosts 04

left*ate net.add"1itc5( left*ateI+, dpidleft*ate+,I+, listen,ort..3.
)
rig5t*ate net.add"1itc5( rig5t*ateI+, dpidrig5t*ate+,I+,
listen,ort..3( )

port ..3-
for k,v in coreI+s.iteritems():
print k, v
core"1itc5es8k9 net.add"1itc5( k, dpidv,
listen,ortport )
port port : &

net.addLink( left*ate, core"1itc5es8#s1&#9, 3, . )
net.addLink( core"1itc5es8#s1&#9, core"1itc5es8#s1'#9, &, & )
Public

net.addLink( core"1itc5es8#s1&#9, core"1itc5es8#s12#9, ', 6 )
net.addLink( core"1itc5es8#s1'#9, core"1itc5es8#s12#9, 6, ' )
net.addLink( core"1itc5es8#s1'#9, core"1itc5es8#s1.#9, ', ' )
net.addLink( core"1itc5es8#s12#9, core"1itc5es8#s1.#9, &, & )
net.addLink( core"1itc5es8#s1.#9, rig5t*ate, (, 3 )

for k,v in rig5tHostI+s.iteritems():
print k, v
rig5tHosts8k9 net.addHost( k, macv8#mac#9 )

net.addLink( rig5t*ate, rig5tHosts8#r5&#9, &, &)
net.addLink( rig5t*ate, rig5tHosts8#r5'#9, ', &)
net.addLink( rig5t*ate, rig5tHosts8#r56#9, 6, &)
net.addLink( rig5t*ate, rig5tHosts8#r52#9, 2, &)

for k,v in leftHostI+s.iteritems():
print k, v
leftHosts8k9 net.addHost( k, macv8#mac#9 )

net.addLink( left*ate, leftHosts8#l5&#9, &, &)
net.addLink( left*ate, leftHosts8#l5'#9, ', &)
net.addLink( left*ate, leftHosts8#l56#9, 6, &)
net.addLink( left*ate, leftHosts8#l52#9, 2, &)

net.start()
CLI(net)
net.stop()

if ;;name;; #;;main;;#:
setLogLevel( #info# )
smartenIT()

Following Floodlight controller commands were executed in order to enable ping between
hosts lh1 and lh2 in the deployed test OpenFlowdomain.
At first, empty flows with priority 0, which allow to drop all packets in the domain, were
created the Floodlight forwarding is not triggered.
c!rl <d #0%s1itc5%: %$$:$$:$$:e$:df:-.:(&:d/%, %name%:%empt)<r!le%, %priorit)%:%$%,
%active%:%tr!e%4# 5ttp:==&'(.$.$.&:-$-$=1m=staticflo1entr)p!s5er=>son
c!rl <d #0%s1itc5%: % $$:$&:.2:-(:--:3-:f-:3(%, %name%:%empt)<r!le%, %priorit)%:%$%,
c!rl <d #0%s1itc5%: % $$:$':.2:-(:--:3-:f-:3(%, %name%:%empt)<r!le%, %priorit)%:%$%,
c!rl <d #0%s1itc5%: % $$:$&:.2:-(:--:3-:f.:3(%, %name%:%empt)<r!le%, %priorit)%:%$%,
c!rl <d #0%s1itc5%: % $$:$':.2:-(:--:3-:f.:3(%, %name%:%empt)<r!le%, %priorit)%:%$%,
c!rl <d #0%s1itc5%: % $$:$$:-':-f:-':'&:'c:2'%, %name%:%empt)<r!le%, %priorit)%:%$%,

Later on a set of flows were created on several switches to enable the actual ping between
hostslh1 and rh1.
Switch lg
Public

c!rl <d #0%s1itc5%: %$$:$$:$$:e$:df:-.:(&:d/%, %name%:%arp<r!le<&<3%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$32%,%ingress<port%:%&%,%active%:%tr!e%,
%actions%:%o!tp!t3%4# 5ttp:==&'(.$.$.&:-$-$=1m=staticflo1entr)p!s5er=>son
c!rl <d #0%s1itc5%: %$$:$$:$$:e$:df:-.:(&:d/%, %name%:%arp<r!le<3<&%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$32%,%ingress<port%:%3%,%active%:%tr!e%,
%actions%:%o!tp!t&%4# 5ttp:==&'(.$.$.&:-$-$=1m=staticflo1entr)p!s5er=>son
c!rl <d #0%s1itc5%: %$$:$$:$$:e$:df:-.:(&:d/%, %name%:%ipv2<r!le<&<3%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$2-%,%ingress<port%:%&%,%active%:%tr!e%,
c!rl <d #0%s1itc5%: %$$:$$:$$:e$:df:-.:(&:d/%, %name%:%ipv2<r!le<3<&%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$2-%,%ingress<port%:%3%,%active%:%tr!e%,
Switch SW1
c!rl <d #0%s1itc5%: %$$:$&:.2:-(:--:3-:f-:3(%, %name%:%arp<r!le<&<.%,
%actions%:%o!tp!t.%4# 5ttp:==&'(.$.$.&:-$-$=1m=staticflo1entr)p!s5er=>son
c!rl <d #0%s1itc5%: %$$:$&:.2:-(:--:3-:f-:3(%, %name%:%arp<r!le<.<&%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$32%,%ingress<port%:%.%,%active%:%tr!e%,
c!rl <d #0%s1itc5%: %$$:$&:.2:-(:--:3-:f-:3(%, %name%:%ipv2<r!le<&<.%,
%actions%:%o!tp!t.%4# 5ttp:==&'(.$.$.&:-$-$=1m=staticflo1entr)p!s5er=>son
c!rl <d #0%s1itc5%: %$$:$&:.2:-(:--:3-:f-:3(%, %name%:%ipv2<r!le<.<&%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$2-%,%ingress<port%:%.%,%active%:%tr!e%,
Switch SW2
c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f-:3(%, %name%:%arp<r!le<&<'%,
%actions%:%o!tp!t'%4# 5ttp:==&'(.$.$.&:-$-$=1m=staticflo1entr)p!s5er=>son
c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f-:3(%, %name%:%arp<r!le<'<&%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$32%,%ingress<port%:%'%,%active%:%tr!e%,
c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f-:3(%, %name%:%ipv2<r!le<&<'%,
c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f-:3(%, %name%:%ipv2<r!le<'<&%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$2-%,%ingress<port%:%'%,%active%:%tr!e%,
Switch SW6
c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f.:3(%, %name%:%arp<r!le<3<'%,
c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f.:3(%, %name%:%arp<r!le<'<3%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$32%,%ingress<port%:%'%,%active%:%tr!e%,
c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f.:3(%, %name%:%ipv2<r!le<3<'%,
Public

c!rl <d #0%s1itc5%: %$$:$':.2:-(:--:3-:f.:3(%, %name%:%ipv2<r!le<'<3%,
%priorit)%:%6'(.-%, %et5er<t)pe%:%'$2-%,%ingress<port%:%'%,%active%:%tr!e%,
Switch lg
c!rl <d #0%s1itc5%: %$$:$$:-':-f:-':'&:'c:2'%, %name%:%arp<r!le<3<&%,
c!rl <d #0%s1itc5%: %$$:$$:-':-f:-':'&:'c:2'%, %name%:%arp<r!le<&<3%,
c!rl <d #0%s1itc5%: %$$:$$:-':-f:-':'&:'c:2'%, %name%:%ipv2<r!le<3<&%,
c!rl <d #0%s1itc5%: %$$:$$:-':-f:-':'&:'c:2'%, %name%:%ipv2<r!le<&<3%,
12.2 MPLS Test Configuration Details
In this section we provide parts of the test configuration extracted from the virtualized
Juniper device used during the tests. Only representative parts of the configuration are
shown but the deployed scenario can be easily replicated.
Tests were conducted on an Juniper MX80 router on which 11.2 version of Junos was
running.
The detailed test topology is shown in Figure 59.

Figure 59: Detailed MPLS test topology
Following is the basic configuration for a Logical System (LS) creation on the Juniper MX
box. This listing comprises a new user configuration that has the rights restricted to
configuration and monitoring of a particular LS only.
Public

version &&.'R2.6?
s)stem 0
5ost<name M@<test<&?
root<a!t5entication 0
encr)pted<pass1ord %A&A&Bm2CRD6A,$3vB$>ETID@F=GHkTBa&.%?
4
login 0
class adminCD& 0
logical<s)stem L"<CD&?
permissions all?
4
!ser L"CD&Hdmin 0
!id '$&3?
class adminCD&?
a!t5entication 0
encr)pted<pass1ord %A&A3Nkt=3m.AHMI2&fI5JgN@d2KL6CRM3=%?
4
4
4
4
logical<s)stems 0
L"<CD& 0
NN Logical s)stem config!ration (interfaces : protocols)
4
4
c5assis 0
fpc & 0
pic & 0
t!nnel<services 0
7and1idt5 &g?
4
4
4
4

Following is the configuration of one of the Provider Core routers being in this case also an
AS border router.
L"2 0
interfaces 0
lt<&=&=&$ 0
!nit 3 0
description L"2<OL"'?
encaps!lation et5ernet?
peer<!nit 2?
famil) inet 0
address &$.$./.&-=6$?
4
famil) mpls?
4
!nit ( 0
description L"2<OL"6?
peer<!nit .?
famil) inet 0
address &$.$.&$.&-=6$?
4
famil) mpls?
Public

4
!nit &'$ 0
description %L"2PH"&<OL"'&PH"'%?
peer<!nit &'&?
famil) inet 0
address &('.&..&'.&=6$?
4
famil) mpls?
4
!nit &6$ 0
description %L"2PH"&<OL"6&PH"6%?
peer<!nit &6&?
famil) inet 0
address &('.&..&6.&=6$?
4
famil) mpls?
4
4
lo$ 0
!nit . 0
famil) inet 0
address &/'.&.-.&.2=6'?
4
4
4
4
protocols 0
rsvp 0
interface lt<&=&=&$.3?
interface lt<&=&=&$.(?
4
mpls 0
icmp<t!nneling?
la7el<s1itc5ed<pat5 to<L"&<cspf 0
to &/'.&.-.&.&?
7and1idt5 '$m?
4
pat5 defa!lt<pat5?
interface lt<&=&=&$.&'$?
interface lt<&=&=&$.&6$?
4
7gp 0
famil) inet 0
la7eled<!nicast 0
ri7 0
inet.6?
4
4
4
gro!p internal<peers 0
t)pe internal?
local<address &/'.&.-.&.2?
famil) inet<vpn 0
!nicast?
4
Public

eBport 8 advertise.lo$ n5s<all 9?
neig57or &/'.&.-.&.&?
4
gro!p eBternal<peers 0
t)pe eBternal?
famil) inet<vpn 0
!nicast?
4
eBport 8 advertise.lo$ n5s<all 9?
neig57or &('.&..&6.' 0
peer<as 6?
4
neig57or &('.&..&'.' 0
peer<as '?
4
4
4
ospf 0
traffic<engineering?
area $.$.$.$ 0
interface lo$.. 0
passive?
4
4
4
4
polic)<options 0
polic)<statement advertise.lo$ 0
term self 0
from 0
protocol direct?
ro!te<filter $.$.$.$=$ prefiB<lengt5<range =6'<=6'?
4
t5en accept?
4
4
polic)<statement in>ect<ospf 0
term & 0
from 0
protocol ospf?
area $.$.$.$?
4
t5en accept?
4
4
polic)<statement n5s<all 0
term & 0
t5en 0
neBt<5op self?
4
4
4
4
ro!ting<options 0
interface<ro!tes 0
ri7<gro!p inet ifrg<inet$<to<inet6?
4
Public

ri7<gro!ps 0
ifrg<inet$<to<inet6 0
import<ri7 8 inet.$ inet.6 9?
4
4
ro!ter<id &/'.&.-.&.2?
a!tonomo!s<s)stem &?
4
4

Following is the configuration of one of the Provider Edge (PE) routers.
L"& 0
interfaces 0
lt<&=&=&$ 0
!nit $ 0
description L"&<OL"'?
peer<!nit &?
famil) inet 0
address &$.$.-.&6=6$?
4
famil) mpls?
4
!nit ' 0
description L"&<OL"6?
peer<!nit 6?
famil) inet 0
address &$.$.(.&6=6$?
4
famil) mpls?
4
!nit &$$& 0
description to<L"<CD&?
peer<!nit &$$$?
famil) inet 0
address &('.&..&$$.'=6$?
4
4
4
lo$ 0
!nit 3 0
famil) inet 0
address &/'.&.-.&.&=6'?
4
4
4
4
protocols 0
rsvp 0
interface lt<&=&=&$.$?
interface lt<&=&=&$.'?
4
mpls 0
icmp<t!nneling?
la7el<s1itc5ed<pat5 to<L"2<cspf 0
to &/'.&.-.&.2?
Public

7and1idt5 &$m?
link<protection?
primar) defa!lt<pat5?
secondar) defa!lt<pat5 0
stand7)?
4
4
pat5 defa!lt<pat5?
interface all?
4
7gp 0
gro!p internal<peers 0
t)pe internal?
local<address &/'.&.-.&.&?
famil) inet 0
la7eled<!nicast 0
ri7 0
inet.6?
4
4
4
famil) inet<vpn 0
!nicast?
4
eBport advertise.lo$?
neig57or &/'.&.-.&.2?
4
4
ospf 0
traffic<engineering?
area $.$.$.$ 0
interface lo$.3 0
passive?
4
interface ge<&=$=-.$ 0
passive?
4
4
4
4
polic)<options 0
polic)<statement advertise.lo$ 0
term & 0
from 0
protocol direct?
ro!te<filter $.$.$.$=$ prefiB<lengt5<range =6'<=6'?
4
t5en accept?
4
term ' 0
t5en re>ect?
4
4
4
ro!ting<instances 0
Public

7l!e 0
instance<t)pe vrf?
interface lt<&=&=&$.&$$&?
ro!te<disting!is5er &/'.&.-.&.&:&$$?
vrf<target target:'$:'$?
vrf<ta7le<la7el?
protocols 0
7gp 0
gro!p to<CD 0
t)pe eBternal?
local<address &('.&..&$$.'?
peer<as &$$?
neig57or &('.&..&$$.&?
4
4
4
4
4
ro!ting<options 0
interface<ro!tes 0
ri7<gro!p inet ifrg<inet$<to<inet6?
4
ri7<gro!ps 0
ifrg<inet$<to<inet6 0
import<ri7 8 inet.$ inet.6 9?
4
4
ro!ter<id &/'.&.-.&.&?
a!tonomo!s<s)stem &?
4
4

Following is the configuration of one of the Consumer Edge (CE) routers.
L"<CD& 0
interfaces 0
lt<&=&=&$ 0
!nit &$$$ 0
description to<L"&<,D?
peer<!nit &$$&?
famil) inet 0
address &('.&..&$$.&=6$?
4
4
4
lo$ 0
!nit &$$ 0
famil) inet 0
address &/'.&.-.&$$.&=6'?
4
4
4
4
protocols 0
7gp 0
eBport eBport;loop7ack?
gro!p to<,D 0
t)pe eBternal?
Public

local<address &('.&..&$$.&?
peer<as &?
neig57or &('.&..&$$.'?
4
4
4
polic)<options 0
polic)<statement eBport;loop7ack 0
from 0
ro!te<filter &/'.&.-.&$$.&=6' eBact?
4
t5en accept?
4
4
ro!ting<options 0
ro!ter<id &/'.&.-.&$$.&?
a!tonomo!s<s)stem &$$?
4
4

Public

12.3 Mapping of Mechanism to SmartenIT Architecture
In the following you find the mappings of the traffic management solutions described in
Chapter4 to the SmartenIT architecture which will be defined in deliverable D3.3. The red
box shows the part of the architecture which is data center / operator focused whereas the
yellow box shows the part which end-user focused. The grey boxes show where the
particular traffic management solution is deployed. The blue text-boxes show
functionalities implemented by the traffic management solutions. Architecture components
used by the traffic management solution are given in the yellow text-boxes.

Figure 60: Mapping of HORST to SmartenIT architecture.
Public


Figure 61: Mapping of SECD to SmartenIT architecture.

Figure 62: Mapping of ICC to SmartenIT architecture.
Public


Figure 63: Mapping of DTM to SmartenIT architecture.

Figure 64: Mapping of RB-Tracker to SmartenIT architecture.
Public


Figure 65: Mapping of SMSP to SmartenIT architecture.

Figure 66: Mapping of MRA to SmartenIT architecture.
Public


Figure67: Mapping of OptiPlan to SmartenIT architecture.

Figure 68: Mapping of vINCENT to SmartenIT architecture.
E
n
d
-
U
s
e
r

F
o
c
u
s
s
e
d
D
a
t
a
-
C
e
n
t
e
r

F
o
c
u
s
s
e
d
OptPlanLoad

Public


Figure 69: Mapping of AQAS to SmartenIT architecture.

Figure 70: Mapping of MUCAPS to SmartenIT architecture.
Public


Figure 71: Mapping of QoEnA to the SmartenIT architecture

Deliverable D2.2 Report On Definitions of Traffic Management Mechanisms and Initial Evaluation Results

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Deliverable D2.2 Report On Definitions of Traffic Management Mechanisms and Initial Evaluation Results

Uploaded by

Copyright:

Available Formats

Seventh Framework STREP No. 317846 D2.

2 Definitions of Traffic Management Mechanisms

with the block bitrates y

is a cost calculated for link i,

() is a cost function used on that link, and B

that determines the desired direction of the traffic growth on

). The compensation vector is found by a

You might also like