Performance Analysis
of an SD-WAN Infrastructure
Implemented Using Cisco
System Technologies
GIANLORENZO MOSER
Abstract
Software-Defined Wide Area Networking (SD-WAN) is an emerging
technology that has the potential to satisfy the increasing demand for reliable
and efficient Wide Area Networks (WANs) in the enterprise-network market.
This thesis focuses on the main features of an SD-WAN network and on the
technical challenges of designing and implementing an SD-WAN
infrastructure. It also provides a detailed comparison between SD-WAN
and other WAN solutions such as MultiProtocol Label Switching (MPLS).
The thesis is based on a project concerning the migration of a network
infrastructure that uses MPLS technology to a network infrastructure that
uses SD-WAN technology. The migration process includes several phases,
such as the analysis of the existing MPLS-based infrastructure, the
identification of suitable appliances based on customer requests, and the
design of an SD-WAN infrastructure that can be implemented without
disrupting network operation during the transition stage. The thesis provides
a detailed description of these steps and discusses the trade-offs that were
made during the design phase of the project.
The results presented in the thesis were obtained through on-site tests
performed on the new SD-WAN infrastructure. The tests were designed to
evaluate some of the main SD-WAN functionalities, such as load
balancing, traffic shaping, and high availability. The results show
that the network infrastructure functions effectively and illustrate some of
the main advantages that the new SD-WAN infrastructure has over the old
MPLS infrastructure. Finally, this thesis could be of interest to network
professionals and employees who are considering SD-WAN as a possible solution
for their company’s business.
Keywords
SD-WAN, SDN, MPLS, Load Balancing, Traffic Shaping
Sammanfattning
Software-Defined Wide Area Networking (SD-WAN) är en framväxande
teknik som har potential att tillgodose den ökande efterfrågan på tillförlitliga
och effektiva Wide Area Networks (WAN) på företagsnätverksmarknaden.
Denna avhandling fokuserar på huvudfunktionerna i ett SD-WAN-nätverk och
på de tekniska utmaningarna för design och implementering av en SD-WAN-
infrastruktur. Det ger också en detaljerad jämförelse mellan SD-WAN och
andra WAN-lösningar som MultiProtocol Label Switching (MPLS).
Avhandlingen
bygger på projektet som handlar om migrering av nätverksinfrastruktur
som använder MPLS-tekniken till en nätverksinfrastruktur som använder
SD-WAN-tekniken. Migreringsprocessen omfattar många faser, till exempel
analys av befintlig MPLS-baserad infrastruktur, identifiering av lämpliga
apparater baserat på kundförfrågningar och utformningen av SD-WAN-
infrastrukturen som kan implementeras utan att nätverket fungerar under
övergångssteget. Avhandlingen ger en detaljerad beskrivning av dessa steg och
diskuterar de avvägningar som gjordes under projektets designfas.
Resultaten som presenteras i avhandlingen erhålls genom test på plats för den
nya SD-WAN-infrastrukturen. Testerna utfördes i syfte att utvärdera några av
de viktigaste SD-WAN-funktionerna som lastbalansering, trafikformning och
hög tillgänglighet. De erhållna resultaten visar att nätinfrastrukturen fungerar
effektivt och illustrerar några av de största fördelarna som den nya SD-WAN-
infrastrukturen har jämfört med den gamla MPLS-infrastrukturen. Slutligen
kan denna avhandling vara av intresse för nätverkspersonal och anställda som
anser SD-WAN som en möjlig lösning för företagets verksamhet.
Nyckelord
SD-WAN, SDN, MPLS, Load Balancing, Traffic Shaping
Sommario
Software-Defined Wide Area Networking (SD-WAN) è una tecnologia
emergente che ha il potenziale per soddisfare la crescente domanda di reti
geografiche (WAN) affidabili ed efficienti nel mercato delle reti aziendali.
Questa tesi si concentra sulle caratteristiche principali di una rete SD-WAN e
sulle sfide tecniche che devono affrontare la progettazione e l’implementazione
di un’infrastruttura SD-WAN. Fornisce inoltre un confronto dettagliato tra
la SD-WAN e le altre soluzioni WAN come MultiProtocol Label Switching
(MPLS).
La tesi si basa sul progetto che riguarda la migrazione di un’infrastruttura di
rete che utilizza la tecnologia MPLS ad un’infrastruttura di rete che utilizza
la tecnologia SD-WAN. Il processo di migrazione comprende molte fasi
come l’analisi dell’infrastruttura esistente basata su MPLS, l’identificazione
di dispositivi idonei in base alle richieste dei clienti e la progettazione
dell’infrastruttura SD-WAN che può essere implementata senza interrompere
il funzionamento della rete durante la fase di transizione. La tesi fornisce una
descrizione dettagliata di questi passaggi e discute i compromessi che sono
stati fatti durante la fase di progettazione del progetto.
I risultati presentati nella tesi sono ottenuti attraverso test eseguiti in loco per
la nuova infrastruttura SD-WAN. I test sono stati eseguiti con l’obiettivo di
valutare alcune delle principali funzionalità SD-WAN come load balancing,
traffic shaping, e high availability. I risultati ottenuti mostrano l’effettivo
funzionamento dell’infrastruttura di rete e illustrano alcuni dei principali
vantaggi che la nuova infrastruttura SD-WAN presenta rispetto alla vecchia
infrastruttura MPLS. Infine, questa tesi potrebbe interessare professionisti e
dipendenti di rete che considerano SD-WAN come una possibile soluzione
per il business della propria azienda.
Parole Chiave
SD-WAN, SDN, MPLS, Load Balancing, Traffic Shaping
Acknowledgments
I would first like to thank my thesis supervisor Sladana Josilo of the School of
Electrical Engineering and Computer Science at KTH. Without her assistance
and dedicated involvement in every step throughout the process, this thesis
would never have been accomplished. I am very grateful to her
for her support and understanding over these past four months. I would also
like to show gratitude to my examiner György Dán of the School of Electrical
Engineering and Computer Science at KTH.
I would like to express my sincere gratitude to VEM Sistemi. Despite the
constraints of the COVID-19 pandemic, they welcomed me into their company
and allowed me to carry out my degree project. In particular, I thank Nicola
Gatto and all the colleagues of the Padua office.
I would like to offer my special thanks to EIT Digital Master School for the
opportunity to be one of the students participating in the double
degree master's program. Moreover, special thanks go to all the fantastic people
from every part of the world whom I met during the two years of my master's
degree and who made my experience unique.
Finally, I must express my very profound gratitude to my family for providing
me with unfailing support and continuous encouragement throughout my years
of study and through the process of researching and writing this thesis. This
accomplishment would not have been possible without them. Thank you.
Stockholm, August 2021
Gianlorenzo Moser
Contents
1 Introduction
1.1 Motivation and Challenges
1.2 Research Methodology
1.3 Delimitation
1.4 Thesis Structure
1.5 Sustainability and ethical concerns
2 Background
2.1 Types of internet connection technologies
2.1.1 Multiprotocol Label Switching
2.1.2 Internet
2.2 Software Defined Networking
2.3 Software-Defined Access
2.3.1 Mobility
2.3.2 Segmentation
2.3.3 Management
2.3.4 Summary
2.4 Software-Defined Wide Area Networking
2.4.1 Network Appliances
2.4.2 Topologies
2.4.3 The Essentials of SD-WAN Architecture
2.4.4 Economic advantages
2.4.5 Vendors comparison
2.5 Key Features of SD-WAN
2.5.1 Security
2.5.2 Segmentation
2.5.3 Cloud integration
3 Methodology
3.1 Research Process
3.2 Experimental design
3.2.1 Reliability and validity of the network infrastructure
6 Discussion
References
List of Figures
List of Tables
List of acronyms and abbreviations
DC Data Center
DHCP Dynamic Host Configuration Protocol
DMZ Demilitarized Zone
DoS Denial of Service
DSCP Differentiated Services Code Point
DTLS Datagram Transport Layer Security
HA High Availability
SA Security Association
SaaS Software as a Service
SD-Access Software-Defined Access
SD-WAN Software-Defined Wide Area Networking
SDN Software-Defined Networking
SFP Small Form-factor Pluggable transceiver
SIG Secure Internet Gateway
SSL Secure Sockets Layer
STP Spanning Tree Protocol
UI User Interface
USB Universal Serial Bus
VIP Virtual IP
VLAN Virtual Local Area Network
VN Virtual Network
VoIP Voice over IP
VPN Virtual Private Network
VRRP Virtual Router Redundancy Protocol
Chapter 1
Introduction
The demand for a variety of emerging cloud services is growing across every
line of business. Cloud services usually have high bandwidth and low
latency requirements that cannot be met by a traditional Wide-Area Network
(WAN) designed to provide best-effort service [1]. A promising approach
to support these emerging cloud services is Software-Defined Wide Area
Networking (SD-WAN). Owing to its potential to improve the user experience
and provide fast, scalable, affordable, and flexible connectivity between
different network environments, SD-WAN-based solutions are increasingly
being deployed. This trend is confirmed by a recent estimate by Cisco,
according to which SD-WAN traffic is expected to grow at a Compound
Annual Growth Rate (CAGR) of 37%, compared to 3% for traditional WAN, in
the next few years [2].
SD-WAN-based solutions have the potential to extend the capabilities of an
organization by leveraging the corporate WAN and multi-cloud connectivity.
One of the key benefits of SD-WAN is that it provides dynamic path selection
between connectivity options such as Multiprotocol Label Switching (MPLS),
Long Term Evolution (LTE) / Fifth-Generation (5G) and the Internet
[3]. Moreover, the traffic shaping features of SD-WAN allow inbound and
outbound traffic to be segmented. In addition, SD-WAN allows
for traffic prioritization based on both user/group policies and the types of
applications in use. Consequently, SD-WAN solutions have the potential to
allow organizations to access business-critical cloud applications quickly and
easily, and thus to provide high-speed application performance at the
WAN edge of branch offices [1]. Finally, SD-WAN can increase the
1.3 Delimitation
The main objective of this thesis project is the realization of an SD-WAN
infrastructure that satisfies the customer’s HA requirements
and keeps services available during the transition to the
new infrastructure. Since the implementation is physically carried out with
components available on the market, it is not possible to carry out an
in-depth analysis of the algorithms and protocols, as they are often proprietary
and therefore not subject to disclosure to third parties. Finally, since the
implementation is a real deployment, sensitive information related to places, people,
specifications, and public Internet Protocol (IP) addresses is omitted.
Chapter 2
Background
The label edge routers instead have the task of calculating the entire path of
the packet and adding/removing an MPLS header that identifies the traffic as
MPLS traffic. To do this, the packet is tagged, as shown in Figure 2.2, with a
4-byte (32-bit) MPLS header consisting of four fields:
• label value (20 bits): the actual tag, which distinguishes different streams
of packets,
• traffic class (Exp) (3 bits): used to prioritize a particular flow (QoS),
• bottom of stack (S) (1 bit): when set, it signifies that the current label is
the last in the stack,
• Time To Live (TTL) (8 bits).
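As a minimal sketch of the header layout listed above, the following Python
snippet packs the four fields into a single 32-bit word in network byte order
and unpacks them again. The function names are illustrative only and do not
belong to any library; only the field widths are taken from the text.

```python
import struct


def pack_mpls_header(label: int, traffic_class: int,
                     bottom_of_stack: bool, ttl: int) -> bytes:
    """Pack the four MPLS fields (20 + 3 + 1 + 8 bits) into 4 bytes."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= traffic_class < 2 ** 3:
        raise ValueError("traffic class must fit in 3 bits")
    if not 0 <= ttl < 2 ** 8:
        raise ValueError("TTL must fit in 8 bits")
    word = (label << 12) | (traffic_class << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)  # "!I": big-endian unsigned 32-bit integer


def unpack_mpls_header(data: bytes) -> dict:
    """Split the first 4 bytes of data back into the four MPLS fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "traffic_class": (word >> 9) & 0x7,
        "bottom_of_stack": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }
```

For instance, `unpack_mpls_header(pack_mpls_header(100, 5, True, 64))` returns
the same four values that were packed.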
Owing to the speed with which the label switch routers apply the rules
associated with the labels [8], MPLS provides benefits in terms of performance,
scalability, bandwidth utilization, reduced network congestion, and improved
QoS and QoE. Moreover, thanks to resource reservation, the providers that
support MPLS very often offer bandwidth, latency and QoS guarantees. The
trade-off is the cost of an MPLS connection, which is significantly higher
than that of an internet connection with the same bandwidth. Consequently,
MPLS is often chosen to route the traffic that is most sensitive to latency
and jitter (e.g., Voice over IP (VoIP) traffic).
It is important to mention that the MPLS traffic is only accessible to other
MPLS customers on the same infrastructure, and thus the network is not
exposed to Denial of Service (DoS) attacks because only a limited set of
users can access it [8]. For this reason, MPLS itself does not implement
encryption, but confidentiality can be obtained by adding encryption at
a higher layer, for example, by creating a Virtual Private Network (VPN) over
MPLS. Despite this lack of native encryption, the MPLS network is considered
secure compared to the Internet, as there are fewer possible attacks that can
cause service disruption or theft of user information. The principal risk is
related to a physical intrusion by an attacker. Therefore, it is of primary
importance to secure the network edges (i.e., the network entry points) and to
protect them with security measures, such as firewalls and/or Intrusion
Detection System (IDS) / Intrusion Prevention System (IPS) [9].
The main disadvantage of MPLS is its high cost. The cost of an
MPLS link mainly depends on the geographical location of the sites, on the
provider that offers the service and on the negotiation between provider and
customer. The cost is usually ten times higher than in the case of a traditional
internet connection [10]. This difference in cost is likely to remain in the
future, since with the diffusion of optical fiber, the cost per megabit of
internet connectivity has become lower and optical fiber has become more
widely available [10]. Finally, in the case of offices located in different
countries, the MPLS network has to be agreed upon with several providers,
making the procedure long and expensive [8].
2.1.2 Internet
Wired internet connection
Unlike MPLS, the Internet is based on a best-effort service,
and thus it does not provide any guarantees on bandwidth, latency, or other
performance metrics. Consequently, it is easier to encounter congestion or
network degradation, which can compromise the QoS. Some solutions
(e.g., services offered by the ISP) provide bandwidth guarantees through
traffic prioritization. However, these guarantees usually come at
an additional cost.
The main advantage of an internet connection is its price, which is lower than
that of MPLS mainly because of the high competition among internet
service providers. Another advantage is that a classic internet connection
can be activated/deactivated much faster than an MPLS connection. In
particular, the activation of a new internet connection does not require
particular interactions between provider and customer and can be done at any
time. This fast activation comes, however,
at the price of exposure to different types of cyber attacks. For
this reason, a company using a classic internet connection needs to have
firewalls, IDS / IPS and, when possible, use encrypted VPN tunnels to secure
port, MAC address and VLAN. If there is no match (i.e.,
in the case of a new flow), the device must contact the controller. The controller
then identifies the best route for the flow and subsequently informs all of
the involved devices about the new rule that needs to be inserted in the flow
table [14].
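The lookup-then-ask-the-controller behaviour described above can be sketched
as follows. This is a toy model rather than an implementation of a real SDN
protocol: the class names, the choice of match fields for the key and the
static routing logic of the controller are assumptions made for illustration,
and a single switch stands in for all the devices on the path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    """Match fields of a flow (hypothetical subset)."""
    in_port: int
    mac: str
    vlan: int


class Controller:
    """Hypothetical controller holding a static MAC -> output port map."""

    def __init__(self, routes: dict):
        self.routes = routes
        self.requests = 0

    def best_route(self, key: FlowKey) -> int:
        self.requests += 1
        # -1 stands for "drop" when the destination is unknown.
        return self.routes.get(key.mac, -1)


class Switch:
    def __init__(self, controller: Controller):
        self.controller = controller
        self.flow_table = {}

    def forward(self, key: FlowKey) -> int:
        if key not in self.flow_table:
            # New flow: ask the controller and install the returned rule.
            self.flow_table[key] = self.controller.best_route(key)
        return self.flow_table[key]
```

In this sketch only the first packet of a flow reaches the controller; later
packets with the same key are served directly from the flow table.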
The main advantages of SDN compared to a traditional network are the
centralization and simplification of the control of an entire network. For
example, SDN makes it possible to register particular policies/rules on the
controller, which then distributes them to all the switches. Therefore, SDN has
the potential to reduce errors and configuration times of the various devices in
the network. Centralized control is especially beneficial in the case of
devices distributed across geographically distant locations [15], which will be
discussed in more detail in Section 2.4. Apart from centralized control,
SDN has the potential to improve data plane performance and to make traffic
programmable. Finally, it allows the implementation of cloud abstraction,
and thus it can simplify the process of unifying cloud resources. Indeed,
SDN controllers can manage all the network components that make up
massive data center platforms [16].
As discussed above, centralization brings the intelligence of the entire network
to a single place, which besides the mentioned advantages may also be a source
of certain disadvantages, for example, in the context of security. In particular,
in case of a failure or cyber-attack, the entire network including the control
2.3.1 Mobility
A user who changes access point potentially changes network and
is assigned a new IP address. This is problematic
because any policy bound to specific IP addresses (a policy-IP
association) has to be reassigned to the new address. One way of solving
this problem is to use the Locator/Identifier Separation Protocol (LISP).
LISP is used with SD-Access technology, and it is a protocol that informs the
control node of the switch to which the user’s device is connected. Using this
information, the controller can dynamically provide routes and services based
2.3.2 Segmentation
Segmentation is one of the main aspects of SD-Access because users can
be classified based on macro rules (macro-segmentation) such as role
(learned in the authentication phase) or type of device. This allows a
user to be assigned to a particular virtual network (created with VXLAN),
which keeps groups of users with different privileges separate. In addition
to this, it is possible to apply micro-segmentation, which
determines which users can be contacted by another user within the same Virtual
Network (VN). This approach reduces the risk of illegitimate access to various
parts of the network [19].
2.3.3 Management
In SD-Access, administration becomes simpler than in a traditional
network because segmentation and access control no longer rely on network
primitives (e.g., IP addresses or VLANs). The simplification
is mainly related to the fact that there is no need to translate
business intent into IP addresses and Access Control Lists (ACLs), since the
network operator can directly define the connectivity matrix between the
groups of users. Consequently, administration in SD-Access becomes more
efficient in terms of scalability, error occurrence, and complexity. At the
network level, each router keeps track of each endpoint through its IP address,
and when a new user logs in it adds a 16-bit tag that represents the user's
group. In this way, the connectivity rules contained in the matrix can be
applied to the user. Therefore, network administration is radically simplified,
and common operations such as IP address assignment planning or ACL
configuration become much easier [19].
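To make the idea of a group-based connectivity matrix more concrete, the
following sketch tags each endpoint with a 16-bit group value at login and
checks the matrix on the tags rather than on the IP addresses. The group
names, tag values, the default-deny behaviour and the class name are all
assumptions made for illustration; they are not taken from the Cisco
implementation.

```python
# Hypothetical 16-bit group tags; real values are assigned by the operator.
GROUP_TAGS = {"employees": 0x0010, "guests": 0x0020, "printers": 0x0030}

# Connectivity matrix: (source tag, destination tag) pairs that may talk.
ALLOWED = {
    (GROUP_TAGS["employees"], GROUP_TAGS["employees"]),
    (GROUP_TAGS["employees"], GROUP_TAGS["printers"]),
}


class EdgeRouter:
    def __init__(self):
        self.tag_by_ip = {}

    def login(self, ip: str, group: str) -> None:
        """Bind the endpoint's current IP address to its group tag."""
        tag = GROUP_TAGS[group]
        if not 0 <= tag <= 0xFFFF:
            raise ValueError("group tag must fit in 16 bits")
        self.tag_by_ip[ip] = tag

    def is_allowed(self, src_ip: str, dst_ip: str) -> bool:
        src = self.tag_by_ip.get(src_ip)
        dst = self.tag_by_ip.get(dst_ip)
        # Unknown endpoints and unlisted pairs are denied (assumed default).
        return (src, dst) in ALLOWED
```

Because the rules refer to groups, a user who logs in again from a different
IP address keeps the same permissions without any change to the matrix.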
2.3.4 Summary
SD-Access allows users to access resources through customized
policies that are implemented dynamically and centrally. At the
same time, SD-Access keeps the same services intact regardless of
the point (switch or access point) to which the user is connected. Moreover,
systems that use Cisco technology, such as Cisco Digital Network Architecture
(Cisco DNA) with Identity Services Engine (ISE) integration, also allow
access to the parts of the system that are responsible for traffic monitoring
and analysis. SD-Access thus provides an overview of the group and
authorization policies and of the status of the entire infrastructure, as well
as efficient traffic control.
2.4.2 Topologies
The interconnections between different branches can be numerous and very
different from each other. However, thanks to SD-WAN it is possible to have
a logical network topology that differs from the physical network topology. This
allows for the coexistence of an underlay topology (the real one) and an overlay
topology (the desired one). The various topologies can be categorized as
mesh topology, hub and spoke topology, and hybrid topology [23], as illustrated
in Figure 2.5.
Mesh topology: The full mesh topology requires all the sites to
be directly interconnected with each other. It uses a separate VPN tunnel
for each site interconnection. Therefore, even with few locations,
the number of tunnels can be very high. A high number of tunnels can
lead to hardware/software limitations and loss of performance. The positive
aspects of this design are lower latency compared to
other topologies and the fact that traffic does not have to pass through the
Headquarters (HQ) because there is a direct connection with the destination
site.
Hub and spoke topology: In this topology, the main office is set as a hub, while
the other offices act as spokes. The peculiarity is that a spoke can communicate
only with one or more hubs, not with the other spokes, and the hub
therefore acts as a link between the various spokes. The reason why the
hub and spoke topology is so widely used is that, in practice, only a part of
the traffic is exchanged between two sites without going through the data
center. Its advantages are greater scalability, reduced costs and the central
role of the HQ in the network topology.
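The difference in the number of tunnels between the two topologies can be
illustrated with a short calculation. The sketch below assumes exactly one
tunnel per pair of directly connected sites and ignores any hub-to-hub
tunnels; real deployments with several WAN links per site would need more
tunnels than this.

```python
def full_mesh_tunnels(sites: int) -> int:
    """One tunnel per pair of sites: n * (n - 1) / 2."""
    return sites * (sites - 1) // 2


def hub_and_spoke_tunnels(sites: int, hubs: int = 1) -> int:
    """Each spoke builds one tunnel to every hub (hub-to-hub links ignored)."""
    return hubs * (sites - hubs)


for n in (5, 10, 50):
    print(n, full_mesh_tunnels(n), hub_and_spoke_tunnels(n))
```

Under these assumptions, 50 sites require 1225 tunnels in a full mesh but only
49 with a single hub, which shows why the full mesh scales poorly.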
Hybrid topology: As illustrated in Figure 2.5c, hybrid WAN architecture
controller send and receive the information related to updating routing and
reachability, policies and security (i.e., the encryption keys) [3].
In the case of reachability and routing, the data plane relies on a routing
protocol such as Open Shortest Path First (OSPF) or Border Gateway Protocol
(BGP) to check the available routes. It also uses the Bidirectional
Forwarding Detection (BFD) network protocol to detect a link failure and
to notify the controller immediately. BFD is used because it can be configured
independently of the routing protocol in use and because its echo interval can
be set independently of the routing protocol timers [24] [25]. In addition, BFD
can be used to monitor a virtual connection, for example, a tunnel. In the case
of SD-WAN, BFD can be used to monitor the liveness and to measure the quality
(latency and jitter) of each VPN created between two locations [3].
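The following sketch illustrates, in a simplified way, how BFD-style probing
can yield both a liveness decision and quality metrics for a tunnel. The
detection time is modelled as the transmit interval multiplied by a detection
multiplier, without the timer negotiation of the real protocol, and jitter is
computed as the mean absolute difference between consecutive round-trip
times, which is only one of several possible definitions. All function names
and timer values are assumptions.

```python
from statistics import mean


def detection_time_ms(tx_interval_ms: float, detect_multiplier: int) -> float:
    """Time without replies after which the session is declared down."""
    return tx_interval_ms * detect_multiplier


def tunnel_is_up(now_ms: float, last_reply_ms: float,
                 tx_interval_ms: float, detect_multiplier: int) -> bool:
    """The tunnel is up while the silence is shorter than the detection time."""
    silence = now_ms - last_reply_ms
    return silence < detection_time_ms(tx_interval_ms, detect_multiplier)


def latency_and_jitter(rtts_ms: list) -> tuple:
    """Mean round-trip time and mean absolute change between samples."""
    latency = mean(rtts_ms)
    if len(rtts_ms) < 2:
        return latency, 0.0
    jitter = mean([abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])])
    return latency, jitter
```

With a 1000 ms interval and a multiplier of 3, for example, a tunnel that has
been silent for 3000 ms or more would be reported as down in this model.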
authentication in the lower ISO-OSI protocol layers (layer three or layer two)
to provide security and connection over an otherwise insecure network. The
most common network protocol used for this purpose is at the IP layer and is
known as IP Security (IPsec).
IP Security
The IPsec protocol is a layer three protocol, used for most VPN
implementations, and it is considered secure and reliable. It ensures
authentication of the packet source, integrity, confidentiality and access
control [18]. It can work in two modes: transport mode and tunnel mode.
In transport mode, only the payload of the IP packet is protected, and the
original IP header is left in place.
In tunnel mode, the whole packet is encapsulated in a new IP packet, and hence
the address of the gateway is shown as the destination instead of the real one.
In addition to this differentiation, two types of encapsulation can be chosen:
the Authentication Header (AH) or the Encapsulating Security Payload (ESP).
The first one provides only integrity and authentication through a Hash-based
Message Authentication Code (HMAC). AH allows the recipient, using a hash
function, to verify that the content is authentic and has not been modified in
transit. The second type of encapsulation, in addition to authentication,
encrypts the TCP/UDP segment, and thus it also ensures confidentiality. In the
case of tunnel mode, the original IP header is encrypted too, and thus the real
destination IP can be hidden from third parties, which allows for better
confidentiality [26].
It is necessary to establish a Security Association (SA) to ensure that the host
or destination gateway correctly decrypts and verifies the IPsec packets. The
SA is a set of information (unique for each IPsec flow) that makes it possible
to correctly identify and decrypt the packets. This information includes the
encryption protocol, the key, the method (AH or ESP), the mode (transport or
tunnel), and other specific parameters [18]. To set up IPsec communication,
the Internet Key Exchange (IKE) protocol is used. IKE is a
key management protocol that uses the Diffie-Hellman key exchange to establish
the symmetric keys shared by the two peers for encrypted communication.
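The principle behind the Diffie-Hellman exchange used by IKE can be shown
with a toy example in which two peers derive the same secret while exchanging
only public values. The modulus and generator below are illustrative
assumptions and must not be used in practice; IKE relies on standardized
groups and additionally authenticates the peers, which this sketch omits.

```python
import secrets

# Toy parameters chosen for illustration only (2**127 - 1 is prime).
P = 2 ** 127 - 1
G = 3


def make_keypair() -> tuple:
    """Return (private, public), with the private value kept local."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    """Both peers obtain G ** (a * b) mod P without sending a or b."""
    return pow(peer_public, own_private, P)


a_private, a_public = make_keypair()
b_private, b_public = make_keypair()
# Only a_public and b_public travel over the network.
assert shared_secret(a_private, b_public) == shared_secret(b_private, a_public)
```

In a real exchange, the shared value would then be fed into a key derivation
step to obtain the symmetric keys stored in the SA.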
Auto-VPN
Many SD-WAN vendors, such as Cisco Meraki or Juniper, use the Auto-VPN
concept to create an IPsec tunnel between various peers on the SD-WAN
network [27]. Auto-VPN creates a VPN
tunnel automatically, without the two peers having to be configured manually, even
Figure 2.7 – Auto-VPN and Automatic NAT traversal with UDP hole punching
The advantages of this technology are many, but the two most important are
the simplicity of configuring and creating the VPN tunnel between the two
devices and the ability of the devices to update their cryptographic material
quickly and automatically.
2.5.1 Security
Security is one of the most fundamental aspects of SD-WAN. The traditional
approach without SD-WAN involved access to the network only through a data
center, protected by firewalls or other protection systems against threats
coming from outside. With the increasing use of SaaS and IaaS,
passing all traffic through the data center has become obsolete and
inefficient. A possible solution could be the use of SD-WAN together
with a local internet gateway secured by a firewall in each branch. In
this way, all the inbound/outbound traffic can be analyzed and verified, and
each branch can reach the Internet in a secure way [1].
The solution described above is, however, not very scalable, and a more
widely used alternative is the Secure Internet Gateway (SIG). SIG allows all
traffic coming from one or more branches to converge in the cloud, ensuring
adequate security. It keeps unauthorized traffic from entering an organization’s
network by analyzing the incoming/outgoing traffic to block malicious website
traffic, viruses, and malware. In this way, the IaaS and SaaS traffic is not
conveyed to the HQ (reducing congestion), but at the same time, adequate
traffic security and monitoring are guaranteed. SIGs often provide threat
databases and other useful tools for sharing security information with other
SIGs around the world.
2.5.2 Segmentation
An important aspect of network infrastructures, and in particular of SD-WAN,
is network segmentation. Network segmentation is an architectural
approach that divides a network into multiple segments or subnets. It
improves monitoring and control of the network through policies. It also has
the potential to improve security by preventing simultaneous access
to all of the resources. Consequently, in case of intrusion, only the part of
the resources exposed to the attack will be compromised, while the rest of the
infrastructure. For example, on the LAN side and also on the WAN side for
the VPN traffic. On the LAN side of the offices, the traffic is often split into
various VLANs in order to expose only the necessary ones via SD-WAN to
the other offices. In this way, only part of the network is shared with the other
offices, allowing some areas to be kept private. On the WAN side, from the point
of view of VPN traffic, segmentation and micro-segmentation are applied
to steer traffic onto a specific connection based on the type of user. The
user is first differentiated based on the ACL and the destination of the traffic
(macro-segmentation) and, after that, based on the policies and the analysis
of the application layer (micro-segmentation).
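The two-stage classification described above can be sketched as a macro step
that matches source and destination against an ACL, followed by a micro step
that selects a link according to the detected application. The networks,
segment names, application labels and link names are hypothetical, and
application detection itself is assumed to have happened elsewhere.

```python
from ipaddress import ip_address, ip_network
from typing import Optional

# Macro-segmentation: ACL entries (source net, destination net, segment).
ACL = [
    (ip_network("10.1.0.0/16"), ip_network("10.2.0.0/16"), "corporate"),
    (ip_network("10.9.0.0/16"), ip_network("0.0.0.0/0"), "guest"),
]

# Micro-segmentation: per-segment policy keyed by the detected application.
LINK_POLICY = {
    ("corporate", "voip"): "mpls",
    ("corporate", "web"): "internet",
    ("guest", "web"): "internet",
}


def macro_segment(src: str, dst: str) -> Optional[str]:
    """Return the segment of the first ACL entry matching src and dst."""
    for src_net, dst_net, segment in ACL:
        if ip_address(src) in src_net and ip_address(dst) in dst_net:
            return segment
    return None


def select_link(src: str, dst: str, application: str) -> Optional[str]:
    """Pick the link for a flow, or None if no rule allows it."""
    segment = macro_segment(src, dst)
    if segment is None:
        return None  # not matched by the ACL
    return LINK_POLICY.get((segment, application))
```

In this sketch a corporate VoIP flow is steered onto the MPLS link, while a
guest flow is only ever offered the internet link.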
data center or the centralized internet gateway. In particular, traffic to the cloud
is limited due to the need to concentrate traffic to the outside in order to monitor
it. Monitoring can take place through the use of firewalls and other systems
that ensure the security of the internal network. However, this strategy leads
to higher latencies and, therefore, to a lower QoE [45].
based on its priority level. Hence, the local data policies can be summarized as
access lists that allow the provision of QoS [3].
Centralized policies, on the contrary, can concern the control plane or the data
plane. The control plane policy affects the communication routes between the
controller and the switches. The data plane policy instead affects only the
rules that select the routes for the forwarding of packets. In addition, data
plane policies are also used to decide which packets to analyze at the payload
level [46].
the transmitted data and the recovery of the corrupted packets in the case of
few errors. However, the FEC technique may not be effective in the case of
burst errors, which can damage both the packet and its FEC data. For
this reason, additional techniques are needed to spread
the redundant information across different packets. One strategy is to perform
the FEC check at each hop to immediately identify the presence of errors and
carry out the correction. However, this requires that the TCP/IP flow control
is not end-to-end. Another strategy, which can work together with the one just
described, is to compute the FEC over n non-consecutive packets. For example,
the first FEC packet will not refer to sequential packets (e.g., 1, 2, 3, 4, 5)
but to interleaved packets (e.g., 1, 11, 21, 31, 41). In this case, the impact
of a burst error is reduced since the packets protected together are
transmitted at different times. However, one of the possible disadvantages of
the described solution is that it requires a larger buffer memory for storing
all the received packets before carrying out the FEC check [50].
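The benefit of interleaving can be illustrated with a very simple FEC scheme.
The sketch below uses a single XOR parity packet per group, which can rebuild
at most one lost packet in that group; this code is chosen only for brevity
and is not necessarily the one used by SD-WAN vendors. It also assumes
equal-length packets and that the parity packets themselves are not lost.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def interleaved_groups(n_packets: int, stride: int) -> list:
    """Group g protects packets g, g + stride, g + 2 * stride, ..."""
    return [list(range(g, n_packets, stride)) for g in range(stride)]


def sequential_groups(n_packets: int, size: int) -> list:
    """Each group protects `size` consecutive packets."""
    return [list(range(s, min(s + size, n_packets)))
            for s in range(0, n_packets, size)]


def repair(packets: list, groups: list, lost: set) -> dict:
    """Return the lost packets that XOR parity can rebuild (one per group)."""
    repaired = {}
    for group in groups:
        # Parity computed by the sender and transmitted as the FEC packet.
        parity = reduce(xor_bytes, (packets[i] for i in group))
        missing = [i for i in group if i in lost]
        if len(missing) == 1:
            survivors = (packets[i] for i in group if i not in lost)
            repaired[missing[0]] = reduce(xor_bytes, survivors, parity)
    return repaired
```

With 50 packets and a burst that drops five consecutive ones, the interleaved
grouping (stride 10) places each lost packet in a different group, so all of
them can be rebuilt, whereas sequential groups of five may lose a whole group
at once and recover nothing.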
The issues related to VoIP usually stem from the transmission of VoIP traffic
over a single link. With the introduction of SD-WAN, it is possible to
choose between more than one link and decide in real time which one to use
based on the current network conditions. The choice of the link makes it
possible to obtain a better call QoE than the traditional approaches for the
same values of bandwidth, latency and jitter [51]. Beyond that, vendors are
increasingly adopting a technique called packet duplication [1].
This technique involves sending the same information on two different physical
links, as illustrated in Figure 2.10b. In this way, if packet loss occurs, it
is sufficient to use the backup stream to obtain the lost packets. The
disadvantage of this technique is higher bandwidth usage, but, thanks to
the ever-increasing bandwidth offered by the providers, the method may be
Failover
One of the main aspects of HA is related to failover, which is the ability to
create redundancy at the level of equipment and/or links to keep the entire
infrastructure up and running. It must take place at all levels of SD-WAN:
data, control and application planes, and the connections between the various
levels. All these particular levels are equally important, and for each of them,
several solutions have been designed to overcome the issues. The following
sections provide a brief overview of the mentioned levels.
API and management plane
The management level of SD-WAN is ensured through a cluster of
applications. The clusters are located in the same Data Center (DC) and are
all kept active at the same time. Moreover, to maintain geo-redundancy, they
are also duplicated in various DCs, in which case the applications have to
remain active/passive [3]. In addition to this, the applications are designed
in a modular way to avoid cascading faults between different services, and
therefore to limit inefficiencies between different functions of the management
level [52]. Finally, in the event of a general fault (e.g., caused by
non-reachability of all management servers), SD-WAN is designed to maintain the
current state and therefore guarantee operation under the current
policies [52].
Control plane
There are numerous approaches used at the control level to guarantee
HA in case of a fault of one or multiple controllers. One approach is
SMaRtLight [53], which uses a data store to provide an
immediate backup in case of a controller failure. This approach allows the
configuration to be copied to another controller. Other approaches, implemented
with various protocols, rely on continuous update
messages and on verification of the appliance’s status between active and dormant
controllers [52]. One example of such an implementation is CPRecovery,
proposed by Fonseca [54].
In addition, it is also possible to use controllers not belonging to the
zone concerned, and therefore re-associate the switches to the remaining
controllers. This approach could be very interesting in the case of several
controllers associated with various devices in the network, and it is
discussed in more detail later.
Data plane
At the data plane, HA can be achieved by connecting two routers and
configuring them through the Virtual Router Redundancy Protocol (VRRP).
VRRP virtualizes multiple physical routers and merges them into
a single virtual router (i.e., a router with a virtual IP), as shown in Figure 2.11.
In this way, the traffic is always directed towards the virtual IP address, and it
is the task of VRRP to redirect the traffic towards the new master
router. In the case of master failure, the secondary router (slave) takes its place
and continues routing the traffic [3]. This approach is particularly effective
because the devices do not need to change their gateway, which always remains the
virtual router.
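The behaviour described above can be sketched as a toy model. This is not an implementation of the VRRP protocol: advertisement timers and the election messages are left out, and the class names, priorities and the address 192.0.2.1 are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    priority: int        # assumed rule: the highest priority among live routers wins
    alive: bool = True


class VirtualRouter:
    """Toy model of a redundancy group: one virtual IP, several physical routers."""

    def __init__(self, virtual_ip: str, routers: list[Router]):
        self.virtual_ip = virtual_ip
        self.routers = routers

    def master(self) -> Router:
        """Return the live router with the highest priority."""
        candidates = [r for r in self.routers if r.alive]
        if not candidates:
            raise RuntimeError("no router available for " + self.virtual_ip)
        return max(candidates, key=lambda r: r.priority)

    def forward(self, packet: str) -> str:
        """Hosts only know the virtual IP; the current master handles the packet."""
        return f"{packet} -> {self.virtual_ip} (handled by {self.master().name})"


primary = Router("R1", priority=200)
secondary = Router("R2", priority=100)
group = VirtualRouter("192.0.2.1", [primary, secondary])

before = group.forward("pkt-1")
primary.alive = False                 # simulated failure of the master
after = group.forward("pkt-2")
print(before)
print(after)
```

The hosts keep sending to the same gateway address before and after the failure; only the router that answers for it changes.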
It is also necessary to maintain redundancy at the physical level, and
a technique called switch stack has been adopted for this purpose. This
technique relies on using two or more physically separate switches that are
interconnected through special ports called stack ports [55]. The switches
have to be configured such that they are aware of the stack. This operation can
be done via the primary switch, which is in charge of discovering and managing
the other switches in the stack. The other switches remain as slaves and can
replace the master if a failure of the master is detected [55].
Link Aggregation Control Protocol
Redundancy has also been considered for links. In this context, it has been
standardized by IEEE 802.3ad, which defines the aggregation
of multiple physical interfaces into a single logical link [56]. According to
this standard, the traffic is distributed over the various links by the
load balancing of the Link Aggregation Control Protocol (LACP), which allows
the bandwidth of all the links to be used together. For example, in the
case of 4 links of 10 Gbps, the connection looks like a single logical link of 40
Gbps. Besides, if the links are configured for HA, a single link failure
leaves the logical link operational, with only a loss of bandwidth
(e.g., in the described example, the bandwidth loss will be 25%).
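The arithmetic of the example can be written out as a short sketch; the function names are hypothetical and the figures are the ones quoted in the text.

```python
def aggregate_gbps(links_gbps: list[float]) -> float:
    """Capacity of the logical link: the sum of its member links."""
    return sum(links_gbps)


def loss_after_failure(links_gbps: list[float], failed: int) -> float:
    """Fraction of the aggregate capacity lost when member `failed` goes down."""
    return links_gbps[failed] / sum(links_gbps)


links = [10.0, 10.0, 10.0, 10.0]          # four 10 Gbps member links
total = aggregate_gbps(links)
lost = loss_after_failure(links, failed=0)
print(f"logical link: {total:g} Gbps, loss after one failure: {lost:.0%}")
```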
Failure detection
The rapid identification of a failure is one of the most essential steps for
the proper functioning of HA. The failure detection usually occurs through
protocols and algorithms that have the task to identify the malfunction of a
link. The monitoring of the failures can happen between the control and data
plane or between two switches in the data plane.
In the case of failure at the data plane level, a possible approach is Sentinel
[57], which supports the creation and installation of backup tunnels for
each link that could fail. In this way, when a failure occurs on a specific
connection, it is sufficient to use the previously created backup. Other
implementations, such as SafeGuard [58], provide the proactive creation of
tunnels and backups. SafeGuard also analyzes the traffic present
on each link, which allows the use of traffic weighting methods. In this way,
SafeGuard can prevent situations in which the failure of a link leads to an
overload of a backup link. There are also other approaches that use the BFD
protocol to verify that, after a link failure, a quick recovery is carried out on
the backup link [25] and that the remaining links are not overloaded. These
two objectives can be achieved through a recalculation of the optimal path
with BFD once the traffic migration to the secondary link is finished.
Variations of the above approaches are called hybrid [59], and they allow
for the use of a backup tunnel after the malfunction of the
link has been detected. When the malfunction of the link is detected, BFD is used to check the
status of the remaining links. Subsequently, the routes are recalculated based
on the new parameters, which allows for a complete change of the topology.
Finally, there are approaches that exploit relations derived from the monitoring
system and the virtual connections. The main purpose of these approaches is
to deduce the topology of the underlying route and to calculate the best path
when the failure occurs [60].
In the case of a failure between the control and the data plane, the two widely
used algorithms are Greedy failover (GF) and Pre-partitioning failover (PPF)
[61]. The GF algorithm is based on echo messages between the controller and
switch that are used to verify the actual presence and operation of the devices.
When a switch loses connectivity with its controller, it sends a frame using the
Link Layer Discovery Protocol (LLDP), which indicates that the switch is no
longer controlled by the controller. The other switches forward this message to
their own controllers, and subsequently one of the notified controllers sends a
request to the orphaned switch. At the end of the operations, the controller
that becomes associated with the orphaned switch updates the information
concerning the switches it controls. The PPF algorithm is instead proactive, and
it builds on the GF algorithm. The main difference is that the PPF algorithm
selects a backup controller for each switch in advance. Therefore, when a controller
detects a failure of another controller, it tries to establish a connection with all
the switches affected by the failure. If the switches recognize a controller as a
backup, they accept the new association. Once the connection is established
with each of the orphaned switches, the backup controller updates its database.
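The pre-partitioning idea can be sketched as follows. This is an illustration under stated assumptions, not the algorithm from [61]: the backup is simply taken to be the next controller in a list, at least two controllers are assumed, and the echo and LLDP messaging is omitted. All names are hypothetical.

```python
def preselect_backups(primary: dict[str, str],
                      controllers: list[str]) -> dict[str, str]:
    """Pick a backup controller for every switch before any failure happens.

    Assumption: the backup is the next controller in the list; a real
    deployment would more likely use load or latency to decide.
    """
    backup = {}
    for switch, ctrl in primary.items():
        idx = controllers.index(ctrl)
        backup[switch] = controllers[(idx + 1) % len(controllers)]
    return backup


def fail_over(failed: str, primary: dict[str, str],
              backup: dict[str, str]) -> dict[str, str]:
    """Return the switch-to-controller map after controller `failed` goes down.

    Each orphaned switch accepts only the controller it already knows as backup.
    """
    updated = dict(primary)
    for switch, ctrl in primary.items():
        if ctrl == failed:
            updated[switch] = backup[switch]
    return updated


controllers = ["C1", "C2", "C3"]
primary = {"S1": "C1", "S2": "C1", "S3": "C2", "S4": "C3"}
backup = preselect_backups(primary, controllers)
after = fail_over("C1", primary, backup)
print(after)
```

Because the backup is chosen ahead of time, the failover step is a lookup rather than a negotiation, which is the main difference from the greedy variant described above.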
Finally, approaches that take inspiration from Sentinel and SafeGuard were
applied to make the control-data connection fault-tolerant [62]. The main
idea on which these approaches are based is to proactively create backup links
between all the controllers and the switches. Consequently, in the event of a
controller fault, it is not necessary to redistribute switches at the moment as the
Chapter 3
Methodology
This chapter describes the research process and the phases of the performed
experimental design. It also discusses the data collection required for the
experiments, its analysis and evaluation.
Background study
The background study involved a systematic analysis of different topics
related to the SD-WAN. First, the literature on the characteristics of SD-
WAN networks was studied. Second, the literature on planning, design, and
implementation of SD-WAN networks was reviewed. Finally, the SD-WAN
solutions provided by several vendors were compared.
Evaluation
In the last phase, the proposed solution was evaluated through simulations.
The first set of simulations was performed in order to evaluate the aspects
related to traffic shaping. Then, the ability of the new network infrastructure
to withstand failover events was tested. Finally, the previous and the proposed
infrastructures were compared in terms of the fault detection performance.
Chapter 4
From MPLS based network to SD-WAN infrastructure
This chapter describes the network infrastructure present on the study site,
the customer change requests, the functionalities that a new infrastructure
should provide, and the various steps that need to be performed during the
design of the new network infrastructure. Moreover, the chapter describes
the implementation of the SD-WAN solution, with a particular focus on the
various features, like SD-WAN and traffic shaping. Therefore, this chapter
collects all the operational steps carried out during the degree project.
based on the IPsec standard and the necessary information (IP addresses,
authentication certificates and cryptographic material) required to establish
the VPN tunnel.
between the HQ and the peripheral site. The local internet traffic is routed
towards the default gateway. If the primary router is active and the secondary
is passive, the traffic is forwarded from the primary to the secondary gateway
that permits local internet navigation. Therefore, it is not required to transfer
all the traffic via MPLS to the HQ in order to provide internet browsing.
had to take place only in the case of failure of both primary links. Finally,
concerning the HA, it was required to implement architectural and technical
features capable of guaranteeing adequate network availability.
between the non-migrated office and the migrated office. Indeed, in the case of
communication between two branches (one already migrated and the other one
waiting for the migration), the flow needs to pass through the HQ to guarantee
the reachability. Therefore, the entire network infrastructure has a hub and
spoke topology with the HQ being a unique hub. Initially, the traffic passes
through the MPLS path from a non-migrated site to the HQ using the existing
interconnection. Subsequently, the traffic is sent to the SD-WAN concentrator
and then via the VPN tunnel from the HQ to the destination site that has already
been migrated. The exact operation is explained in detail in Section
4.3.2.
functionality that the appliance must perform. There are switch products
(MS), network security & SD-WAN (MX) products, access points (MR),
wireless WAN (MG), and Internet of Things (IoT) products divided into
cameras (MV) and environmental sensors (MT). Each of the categories has
sub-products that differ in minor characteristics and computational power.
Meraki MX67C
The MX67C model is classified by Cisco Meraki as specific for a small
branch. Indeed, it supports a limited number of users (up to 50) and has four LAN
Ethernet ports. Despite this, the MX67C appliance fully satisfies the required
characteristics, as it can establish, via auto-VPN, the VPN tunnels for SD-WAN
and monitor the entire network and the devices connected to it via
the Meraki Dashboard.
Hardware and interfaces
The MX67C is characterized by a WAN port and four LAN Ethernet ports with
1 Gigabit Ethernet (GbE) capacity. However, when SD-WAN is performed, it
is possible to configure via software one of the LAN ports as an additional
uplink port. Moreover, there is a slot for the LTE cellular model and a
Universal Serial Bus (USB) 2.0 port for connecting an additional device in
the case of LTE failover.
1 https://meraki.cisco.com/product/security-sd-wan/medium-branch/mx84/
2 https://meraki.cisco.com/product/security-sd-wan/small-branch/mx67c/
Network
Besides typical SD-WAN features, such as load balancing, the MX67C
supports a few additional network features. These include VLANs,
static routes, and a Dynamic Host Configuration
Protocol (DHCP) server for internal networks.
Security
From a security point of view, the MX67C performs firewall functions. In
particular, the MX67C allows traffic segmentation based on identity-based
policies, traffic analysis at the application level (layer 7), content filtering and
web searches. Moreover, an intrusion prevention system (IPS) and advanced
protection against malware are guaranteed based on Cisco Advanced Malware
Protection (AMP) technology.
Throughput
The MX67C provides a stateful firewall throughput of 450 Mbps and the
throughput of each VPN is equal to 200 Mbps. For LTE connectivity, the
link speed is around 300 Mbps.
Power supply
The MX appliance is a compact device designed for small branches. Its
electrical consumption is around 18 Watt, which is a considerably low value
when compared to other products that usually consume hundreds of watts
during their normal use.
Meraki MX84
The MX84 model is classified by the same manufacturer as specific for a
medium branch. It allows for a larger number of users (up to 200) and more
interfaces than the previous model (Meraki MX67C). The MX84 is positioned
in the HQ because it is the concentrator of all VPN tunnels (a
hub and spoke topology is used) and because it acts as a gateway for internet access.
Therefore, the MX84 needs to have higher performance than the appliances in the
peripheral sites. As with the Meraki MX67C, the VPN tunnels for SD-WAN are created
through auto-VPN, and it is possible to monitor the entire network and the devices
connected to the network infrastructure through the Meraki Dashboard.
Hardware and interfaces
The MX84 is characterized by two WAN ports and eight 1 GbE LAN ports.
The MX appliance is also equipped with two slots for small form-factor
pluggable transceiver (SFP) at 1 Gbps for fiber optic connections. There is no
slot for the LTE backup, as it is assumed that it is difficult to provide services
only with the LTE backup. Furthermore, the availability of WAN connectivity
is significantly greater in HQ than in the peripheral offices.
Network
Similar to the MX67C, the MX84 supports features such as load balancing,
VLANs, static routing, and DHCP service for internal networks. In addition
to these features, the MX84 appliance supports application prioritization and
web caching, which can be used to speed up services for the end-users.
Security
The MX84 appliance supports the same set of security functions as MX67C,
therefore it is sufficient to refer to the characteristics already described above.
Throughput
In terms of throughput, the MX84 has better performance than the MX67C.
The MX84 supports a larger number of users, a stateful
firewall throughput of 500 Mbps, and a throughput of 250 Mbps for each VPN.
Beyond this, the number of supported tunnels is equal to 100, instead of only
50 for the MX67C model used in the peripheral offices.
Power supply
Although the whole MX series is particularly undemanding in terms of
electricity consumption, the MX84 has an ordinary consumption of 100W, an
index of greater computational power that allows it to manage a larger number
of clients, VPN tunnels and features.
such as the integration of IaaS and SaaS connectivity and the "split-tunnel"
functionality. The "split-tunnel" function allows direct forwarding of the traffic
to the local internet in the case of certain applications that are considered
safe and reliable (e.g., cloud or similar applications defined by the network
manager). These features can be obtained through additional licenses or, in
case of IaaS and SaaS connectivity, for some cloud service providers only.
However, the customer in the case of this project did not consider additional
features necessary, and thus Cisco Meraki was chosen as the most adequate
solution in terms of the supported features and monetary cost.
Another important aspect that needs to be considered when designing an
SD-WAN based network infrastructure is the HA. In this case, neither the MX67C
nor the MX84 allows power redundancy. However, the HQ requires
higher availability than a branch, so a cluster of two MX84s was created to mitigate
a device failure. This is necessary as the HQ is the core of the hub and
spoke network infrastructure and, therefore, acts as an intermediary between
all the branches. Consequently, in the case of an MX84 failure, the branches
remain in communication with each other and with the HQ, thanks to the
secondary MX84 in active-active mode. For the secondary offices, by contrast,
the creation of an MX cluster is not envisaged, as the failure of the device
would only lead to the non-reachability of that office, with minor
effects compared to a failure at the HQ. In the event of a device
malfunction, the replacement of the device is guaranteed by Cisco Meraki in
a single working day. From the Meraki Dashboard it is possible to clone and
import the entire configuration file onto the new device, and thus reconfiguring
a replacement MX is quick and easy.
Intermediate infrastructure
Headquarter
The MX84s are installed in the HQ and a Demilitarized zone (DMZ) is created
on the firewall to allow the MX84s to communicate with the HQ’s internal
network and the internet. The first reason for allowing the reachability from
the HQ to the MX84 is the necessity to be able to forward the traffic coming
from the VPN tunnels towards the MPLS edge router for the reachability of the
offices not yet migrated. Moreover, this reachability is necessary to provide
connectivity to the new offices and to allow the registration and configuration of
MX devices on the Meraki Dashboard. Therefore, the MX84s have, as can be
seen in Figure 4.3 and Figure 4.4, one uplink connected to the network called
WAN II and one uplink connected to the edge router of the MPLS. Moreover,
each MX84 has a LAN port connected to the firewall’s DMZ. Furthermore,
since the two MX84 devices are in the stack, they are provided with a virtual IP
and two physical IPs for each WAN port. On the LAN side, however, this
distinction is not necessary. In this phase, the routers named R2 and R3 (cf.
Figure 4.3) are left in their position to guarantee backup reachability for the
other sites not yet migrated.
Branch
For the migration of each branch-side location, the router cluster that performs
the HSRP is replaced with an MX67C. The MX67C has the LAN port
connected towards the internal network and has the two WAN ports connected
to the MPLS edge router and the internet. The connectivity to the HQ and
other locations is guaranteed by the VPN tunnels established through the
auto-VPN. Therefore, the device monitoring through the Meraki Dashboard
is allowed through the HQ internet connection. However, for non-migrated
offices, there is no reachability with the migrated office because the LAN is no
longer advertised on the MPLS network. Therefore, it is necessary to forward
the traffic for the new offices through the HQ via R3 router. Subsequently, the
traffic is sent through static routes towards the firewall that further forwards it
on the DMZ relative to the MX84s. Finally, the MX84s encapsulate the traffic
in the VPN tunnel that reaches the newly migrated site. The setup described
permits reachability between heterogeneous sites from the migration point
of view to guarantee connectivity and productivity that would otherwise be
significantly affected.
Final infrastructure
At the end of the migration of all secondary offices, it is possible to eliminate
the part of the topology used in the intermediate phase, as illustrated in Figure
4.5. Precisely, the static routes on the core that allowed traffic forwarding from
the MPLS circuit to the MX84s need to be eliminated. These static routes are
no longer necessary because all traffic to and from the offices is sent/received
by the MX84s, which act as a VPN concentrator for traffic between the offices.
The remaining traffic is forwarded by default in the direction of the firewall.
The firewall forwards the packets based on the IP to the internet port or to the
internal network. Furthermore, the routers R2 and R3 are removed because
they are no longer used.
Branch to headquarter
The traffic from the branch office to the HQ is routed from the MX67C, which
acts as a gateway. Based on the exposed VLAN, the MX67C decides whether
to forward the traffic through the VPN tunnel to the HQ. The traffic is decrypted
once it reaches the MX84. After the MX84 the traffic reaches the firewall via
the DMZ, and from there, it is directly accessible to the HQ.
Branch to internet
The internet traffic from each branch is conveyed to the HQ and routed from
there under the firewall supervision. The concentration of all internet flows in a
single point brings an advantage because there is only one "insecure" gateway
to the internet. The gateway is protected through a firewall that analyzes
all incoming and outgoing traffic for the entire organization. Indeed, in the
previous configuration, each site could go out on the internet locally, making
the surface exposed to attacks from the outside much greater.
Branch to branch
The traffic between two branches takes place in a similar way to that between a
branch and the HQ because the SD-WAN topology is hub-spoke, with the HQ
being the only hub. Therefore, the traffic passes from branch to branch through
MX84 in HQ. This approach allows increasing the choice of the best path
because there are two physical connections for each hub-spoke. Therefore,
the communication between two branches could travel on the internet from
the starting branch to the HQ and on the MPLS from the HQ to the arrival
branch. The path decision depends on the network congestion and the health
parameters of the connection. On the contrary, in the full-mesh approach,
it is not possible to change the physical connection because there is not an
intermediate hub. The result is that a single section affected by network
congestion involves the use of the MPLS connection, considered more reliable
in terms of QoS, with possible overload repercussions.
The last step of this particular configuration is to set a static route that forwards
all the traffic to the firewall. The firewall selects on which interface (internal
or external) to forward the traffic. The static route is necessary as the MX84s
are directly connected to the internet to establish the VPN tunnels with the
other offices. Therefore, in the case of internet traffic, the MX would attempt
to send the traffic to the WAN port without going through the firewall.
For each appliance the VLANs and the public IP are set based on the
configuration on the previous router. Figure 4.7 illustrates the configuration
needed for establishing the VPN tunnels for the SD-WAN. As shown in the
figure, one needs to set: the role of the appliance (hub or spoke), the hubs
to refer to, and the subnets exposed to the other branches (in this particular
case, there is only one subnet). At the end of these operations, the VPN
tunnels are established, and the overlay topology is created. Subsequently,
the configurations relating to the SD-WAN features are set, and these features
are general for all the sites of the analyzed topology.
Load Balancing
The load balancing involves the use of two internet links with the addition of
the cellular connection in case of failure of both WAN links. As illustrated
in Figure 4.8a, each link has a bandwidth limit that depends on the link
characteristics, and the WAN I link is enabled as primary.
SD-WAN allows for creating policies for splitting the traffic between the two
links. In this case, as illustrated in Figure 4.8b, the VoIP traffic migrates on
WAN II if the performance of the network is too degraded. In particular, the
link change is performed if any of the requirements for the latency, jitter, or
loss parameters are not satisfied. Otherwise, if the QoE is satisfactory, the
primary link should be used.
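The policy just described can be expressed as a minimal decision rule. The sketch below is not Meraki's implementation; the class and function names are hypothetical, and the threshold values are placeholders rather than the values configured in Figure 4.8b.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkHealth:
    latency_ms: float
    jitter_ms: float
    loss_pct: float


@dataclass(frozen=True)
class Thresholds:
    max_latency_ms: float
    max_jitter_ms: float
    max_loss_pct: float


def meets(health: LinkHealth, limits: Thresholds) -> bool:
    """True only if latency, jitter and loss are all within their limits."""
    return (health.latency_ms <= limits.max_latency_ms
            and health.jitter_ms <= limits.max_jitter_ms
            and health.loss_pct <= limits.max_loss_pct)


def choose_uplink(primary: LinkHealth, limits: Thresholds) -> str:
    """Keep VoIP on WAN I unless any one requirement is violated."""
    return "WAN I" if meets(primary, limits) else "WAN II"


# Placeholder limits, chosen only for illustration.
voip_limits = Thresholds(max_latency_ms=150.0, max_jitter_ms=30.0, max_loss_pct=1.0)

healthy = LinkHealth(latency_ms=20.0, jitter_ms=3.0, loss_pct=0.0)
congested = LinkHealth(latency_ms=20.0, jitter_ms=45.0, loss_pct=0.0)

print(choose_uplink(healthy, voip_limits))
print(choose_uplink(congested, voip_limits))
```

In the second case only the jitter limit is exceeded, which is enough to move the flow, matching the "any of the requirements" wording above.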
Traffic shaping
The traffic shaping rules are used to allow for better management of the
available bandwidth resources and to improve the QoE for essential services.
In particular, two rules are created to improve the QoE of users (cf. Figure
4.9).
The first rule guarantees unrestricted bandwidth for VoIP traffic.
Owing to this rule, it is possible to have a separate queue for the VoIP traffic,
and thus to provide a better VoIP service in terms of latency. The second rule
concerns online backups and allows for a maximum bandwidth of 2 Mbps for
each user. Indeed, a backup is not an essential service, and therefore the file
transfer can take longer but with less instantaneous impact on bandwidth.
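One common way to enforce a per-user cap such as the 2 Mbps limit is a token bucket. The sketch below illustrates that generic technique only; the source does not say how the appliance enforces the limit, and the class name, burst size and traffic pattern are assumptions.

```python
class TokenBucket:
    """Rate limiter: tokens are bits, refilled at `rate_bps` up to `burst_bits`."""

    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate_bps = rate_bps
        self.burst_bits = burst_bits
        self.tokens = burst_bits      # start with a full bucket
        self.last = 0.0               # time of the previous call, in seconds

    def allow(self, size_bits: int, now: float) -> bool:
        """Return True if a packet of `size_bits` may be sent at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last = now
        if size_bits <= self.tokens:
            self.tokens -= size_bits
            return True
        return False


# A backup client offering about 12 Mbps (one 1500-byte packet every 1 ms)
# against a 2 Mbps cap, simulated for 10 seconds.
bucket = TokenBucket(rate_bps=2_000_000, burst_bits=24_000)
packet_bits = 12_000
sent_bits = 0
for step in range(10_000):
    if bucket.allow(packet_bits, now=step / 1000):
        sent_bits += packet_bits

average_mbps = sent_bits / 10 / 1_000_000
print(f"average rate: {average_mbps:.2f} Mbps")
```

The average stays close to the configured rate while short bursts up to `burst_bits` are still allowed, which is consistent with a limit that is respected on average rather than at every instant.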
Chapter 5
Results and Analysis
The following chapter presents the results obtained from tests carried out on
the SD-WAN infrastructure in a hub and spoke topology. The tests mainly
focus on the features of load balancing, traffic shaping, and certain failure
situations that can be mitigated thanks to the HA strategy.
[Figure 5.1: time (seconds) for the Failure and Recover events]
The above setups were evaluated in terms of the switching time between the
primary and the secondary link, and the time needed to recover the primary
connectivity, respectively. The corresponding results are shown in Figure 5.1.
The results suggest that the routers in the HSRP setup have the worst performance,
with the switching time and the time needed to recover the primary link
equal to approximately 20 s and 18 s, respectively. This is mainly because
multiple actions need to be performed upon the failure. For example, during
the switching time the fault is identified based on the HSRP timer, the traffic
control is migrated from the primary to the secondary router, and the new VPN
tunnel on the secondary link is established.
The use of a single MX for managing both links halves the detection times and
the time needed for the new VPN tunnel creation, which results in a switching
time that is approximately equal to 9s in the case of the MX without Active-
Active VPN setup. Furthermore, the time to recover the primary link is null
because the appliance first recreates the tunnel and then migrates the data flow.
Finally, in the case of the MX with Active-Active VPN setup, the results show
that both the switching time and the time needed to recover the primary link
are equal to zero. This is because a secondary VPN tunnel on the secondary
uplink already exists at the time of the failure. Therefore, one can conclude
that the MX with Active-Active VPN setup is resilient to the link failures, and
[Figure: bandwidth (Mbps) versus time (seconds) for the WAN I, WAN II, and Cellular links]
[Figure 5.3: switching time (seconds) for the primary-secondary and secondary-primary transitions]
Figure 5.3 shows the switching times between the primary and secondary
appliances together with the switching times between secondary and primary
appliances in the case when the VIP functionality is both disabled and enabled.
The results show that, when the VIP functionality is disabled and a failure of the
primary appliance occurs, approximately 90 s elapse before
the secondary appliance becomes operational. On the contrary, when the VIP
functionality is enabled the switching time is reduced by approximately 91%.
This is because, with the VIP functionality disabled, the spare appliance
uses an IP other than that of the primary. This means that the IP pointed to by the branch
appliances is not a virtual one, and hence changing the routes and recreating VPN
tunnels is time-consuming. On the contrary, in the case with enabled VIP, the
traffic is routed on the same IP, which reduces the failover time.
The results also show that the switching times from secondary to primary
appliance are smaller than the switching times from primary to secondary
appliance in both cases. This is because the primary appliance is preemptive:
when the primary appliance becomes active again, it informs the secondary appliance
and takes its role back. Finally, based on the above observations, one can
conclude that the VIP functionality has the potential to improve the availability
of the network infrastructure, and thus to reduce the disruption of the services.
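As a quick check of the figures quoted above, the short sketch below converts the reduction of approximately 91% into seconds, starting from the approximate 90 s measured with the VIP disabled. Both inputs are rounded values taken from the text, so the result is only indicative, and the function name is hypothetical.

```python
def reduced_time(baseline_s: float, reduction: float) -> float:
    """Time remaining after cutting `baseline_s` by the fraction `reduction`."""
    return baseline_s * (1.0 - reduction)


vip_disabled_s = 90.0                          # approximate value from the text
vip_enabled_s = reduced_time(vip_disabled_s, 0.91)
print(f"switching time with VIP enabled: about {vip_enabled_s:.1f} s")
```

In other words, the switching time with the VIP enabled is on the order of 8 s.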
[Figure: bandwidth (Mbps) versus time (seconds) for the WAN I and WAN II links]
According to the Meraki documentation, the flows are split between the WAN
[Plot: bandwidth versus time (seconds) for WAN I and WAN II, with one axis in Mbps and one in Kbps]
Figure 5.5 – Internet traffic analysis with SD-WAN policies - bandwidth usage
Next, the traffic shaping rule was evaluated for the settings shown in Figure 4.9.
The traffic shaping rule enables the routing of VoIP traffic on the primary link
as long as the latency, jitter and loss parameters remain in the predetermined
range. In order to validate the functioning of the traffic shaping rule, the
test was performed for a high amount of background traffic on the WAN I
uplink. The background traffic was started prior to the VoIP traffic and the
corresponding bandwidth usage and the latency are shown in Figure 5.5 and
Figure 5.6, respectively. Figure 5.5 shows that the bandwidth consumption on
the WAN II link increases from 0 to approximately 180 Kbps after introducing
the VoIP traffic. As expected, this happens because the VoIP traffic was routed
to the WAN II link due to high latency caused by the congestion on the WAN
I link. The increase of the latency on the WAN I link, shown in Figure 5.6,
is a result of the background traffic on the WAN I link.
[Plot: latency (milliseconds) versus time (seconds) for WAN I and WAN II]
Figure 5.6 – Internet traffic analysis with SD-WAN policies - latency values
the QoE of the other services. In order to keep the amount of traffic at a
reasonable level, a limit of 2 Mbps was imposed on the traffic generated
by the backup software. This limit is shown in Figure 5.8 together with the
bandwidth consumption of the backup traffic, the bandwidth consumption
of the other types of traffic, referred to as mixed internet traffic,
and the total bandwidth consumption. The figure shows that the
total bandwidth consumption increases due to the start of the backup software.
However, the limit imposed for the backup traffic is respected on average.
Consequently, the internet traffic does not increase uncontrollably, and thus
one can expect that the traffic shaping feature can be used for reducing the
occurrence of the slowdowns and the service interruptions.
[Figure 5.8: bandwidth (Mbps) versus time (seconds) for total internet traffic, mixed internet traffic, backup traffic, and the imposed limit]
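A limit that is respected on average but can be exceeded momentarily is the typical behavior of a token-bucket shaper. The thesis does not state which mechanism the appliance uses, so the following is only an illustrative sketch under that assumption. The function name shape, the 1 Mbit burst size, the 5 Mbps offered load, and the 100 ms step are all hypothetical; only the 2 Mbps limit and the 40-second window come from the text and the figure.

```python
def shape(offered_bits, rate_bps, burst_bits, step_ms):
    """Return the bits forwarded in each step by a token-bucket shaper."""
    refill = rate_bps * step_ms // 1000  # tokens added per step
    tokens = burst_bits                  # bucket starts full, so a burst is possible
    sent = []
    for demand in offered_bits:
        tokens = min(burst_bits, tokens + refill)
        forwarded = min(demand, tokens)
        tokens -= forwarded
        sent.append(forwarded)
    return sent


STEP_MS = 100
DURATION_S = 40
LIMIT_BPS = 2_000_000    # 2 Mbps limit from the text
BURST_BITS = 1_000_000   # assumed bucket size

steps = DURATION_S * 1000 // STEP_MS
# Assumed load: the backup software tries to send 5 Mbps the whole time.
offered = [5_000_000 * STEP_MS // 1000] * steps

sent = shape(offered, LIMIT_BPS, BURST_BITS, STEP_MS)

average_mbps = sum(sent) / DURATION_S / 1e6
peak_mbps = max(sent) * 1000 / STEP_MS / 1e6

print(f"average: {average_mbps:.2f} Mbps")  # average: 2.02 Mbps
print(f"peak:    {peak_mbps:.2f} Mbps")     # peak:    5.00 Mbps
```

With these assumed values, the short initial burst exceeds the limit, while the average over the 40-second window stays within the bound of the rate plus the burst size divided by the window length.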
Chapter 6
Discussion
Chapter 7
Conclusions and Future work
7.1 Conclusions
This thesis summarizes the main characteristics of the SD-WAN technology
and provides a detailed description of the process carried out during the
migration from an MPLS-based infrastructure to an SD-WAN infrastructure.
In particular, the thesis describes the steps that need to be followed in order to
perform the migration in a way that keeps the network functioning properly
during the transition period. Finally, the thesis presents results on
the impact of some of the main SD-WAN features on the overall network
performance.
The first part of the thesis focuses on the analysis of the SD-WAN
technology and its main features, such as load balancing, traffic shaping,
cloud integration, and HA. It also provides
a summary of the various solutions proposed by the main vendors in the
SD-WAN sector.
The second part of the thesis focuses on the process of migrating an
existing MPLS-based infrastructure to an SD-WAN infrastructure using Cisco
Meraki. The implementation of the new infrastructure required numerous
phases. Each of them required in-depth knowledge of the
76 | Conclusions and Future work
7.3 Reflections
This thesis shows that SD-WAN technology enables substantial
improvements to network infrastructure. These improvements include security,
HA, and easier integration of new offices into the existing network.
The SD-WAN solutions considered in this thesis also have value in terms
of environmental sustainability. For example, SD-WAN technology allows
the overlay network to be established and maintained remotely, and
thus reduces the number of physical interventions by network infrastructure
managers. Since physical interventions usually require travel to the site by
various means of transport, SD-WAN technology has the potential to reduce
pollutant emissions.
Regarding the ethical aspects, SD-WAN technology supports reliable
transmission of sensitive information over insecure internet connections,
because VPN tunnels are used to encrypt the
connection. Furthermore, SD-WAN can create
these VPN tunnels automatically, which eliminates the
problems of exchanging cryptographic material. Finally, most SD-WAN
infrastructures make fundamental user services less prone to malfunctions
and inefficiencies caused by ISPs, and thus help keep network
services accessible to all users.
78 | Conclusions and Future work
REFERENCES | 79
References
td/docs/routers/sdwan/configuration/policies/vedge/policies-book.pdf,
[Accessed: 2021-03-06].
[47] S. Rajagopalan, “An overview of sd-wan load balancing for wan
connections,” in 2020 4th International Conference on Electronics,
Communication and Aerospace Technology (ICECA), 2020. doi:
10.1109/ICECA49313.2020.9297574 pp. 1–4.
[48] S. Troia, F. Sapienza, L. Varé, and G. Maier, “On deep
reinforcement learning for traffic engineering in sd-wan,” IEEE
Journal on Selected Areas in Communications, pp. 1–1, 2020. doi:
10.1109/JSAC.2020.3041385
[49] T. Semong, T. Maupong, S. Anokye, K. Kehulakae, S. Dimakatso,
G. Boipelo, and S. Sarefo, “Intelligent load balancing techniques
in software defined networks: A survey,” Electronics, vol. 9,
no. 7, 2020. doi: 10.3390/electronics9071091. [Online]. Available:
https://www.mdpi.com/2079-9292/9/7/1091
[50] E. Vollset, “Maelstrom: Transparent error correction for lambda
networks,” in 5th USENIX Symposium on Networked Systems Design and
Implementation (NSDI 08). San Francisco, CA: USENIX Association,
Apr. 2008. [Online]. Available: https://www.usenix.org/conference/
nsdi-08/maelstrom-transparent-error-correction-lambda-networks
[51] D. Radcliffe, E. Furey, and J. Blue, “An sd-wan solution
assuring business quality voip communication for home based
employees,” in 2019 International Conference on Smart Applications,
Communications and Networking (SmartNets), 2019. doi:
10.1109/SmartNets48225.2019.9069755 pp. 1–6.
[52] P. C. Fonseca and E. S. Mota, “A survey on fault management in software-
defined networks,” IEEE Communications Surveys Tutorials, vol. 19,
no. 4, pp. 2284–2321, 2017. doi: 10.1109/COMST.2017.2719862
[53] F. Botelho, A. Bessani, F. M. V. Ramos, and P. Ferreira, “On
the design of practical fault-tolerant sdn controllers,” in 2014 Third
European Workshop on Software Defined Networks, 2014. doi:
10.1109/EWSDN.2014.25 pp. 73–78.
[54] P. Fonseca, R. Bennesby, E. Mota, and A. Passito, “A replication
component for resilient openflow-based networking,” in 2012
REFERENCES | 85
www.kth.se