
Intelligent Network Management for 5G Systems:

The SELFNET Approach


Wei Jiang∗† , Mathias Strufe∗ , and Hans D. Schotten†∗
∗ Intelligent Networking Group, German Research Center for Artificial Intelligence (DFKI)
Trippstadter street 122, Kaiserslautern, 67663 Germany
Emails: {wei.jiang, mathias.strufe, hans.schotten}@dfki.de
† Institute for Wireless Communication and Navigation, University of Kaiserslautern

Building 11, Paul-Ehrlich street, Kaiserslautern, 67663 Germany


Emails: {wei.jiang@dfki.uni-kl.de, schotten@eit.uni-kl.de}

Abstract: The maintenance and management of current Fourth Generation (4G) networks are still performed in a manual or semi-automatic manner, which is costly and time-consuming. This imposes a great challenge on the network management of heterogeneous, software-defined and virtualized Fifth Generation (5G) systems. With the advent of network intelligence, the possibility of intelligent management opens up for the 5G system. Without intervention by network administrators, the novel approach can autonomically deal with network failures, cyber-attacks and inefficient resource utilization, which in turn can lower operational expenditure, improve user experience and reduce the time-to-market of new services. In this paper, the reference architecture, functionality, closed-loop control and enabling algorithms of the network intelligence are presented. An intelligent 5G test-bed is set up and the experimental results verify the feasibility and effectiveness.

I. INTRODUCTION

As of today, troubleshooting in mobile networks (system failures, cyber-attacks, performance degradations, etc.) still cannot avoid manual operations, such as reconfiguring software and repairing or replacing hardware. A mobile operator has to keep an operational group with a large number of highly skilled network administrators, leading to a costly Operational Expenditure (OPEX) that is currently three times the Capital Expenditure (CAPEX) and keeps rising [1]. On the other hand, to meet the radical KPI requirements of mobile broadband access and emerging services (Internet of Things, virtual and augmented reality, self-driving cars, etc.), the forthcoming Fifth Generation (5G) system [2] is envisioned to become far more complicated and heterogeneous than current systems. It inevitably imposes a great challenge on today's manual and semi-automatic network management, which is already costly, vulnerable and time-consuming.

In this context, the EU H2020 SELFNET project [3] has been established to design and implement an intelligent management framework for 5G mobile networks. Taking advantage of cutting-edge technologies including Software-Defined Networking (SDN) [4], Network Function Virtualization (NFV) [5], Self-Organized Network (SON) [6] and Artificial Intelligence [7], this framework provides the capabilities of self-healing against network failures, self-protection against distributed cyber-attacks, and self-optimization to improve users' Quality-of-Experience (QoE). Although SON has a self-managing function, it is limited to static network resources. It does not suit some 5G scenarios, such as network slicing [8] and multi-tenancy [9], where dynamic resource utilization is required. Moreover, SON can only respond reactively to detected network events, while the intelligent framework is capable of proactively performing preventive actions for predicted problems. SELFNET aims to help network operators simplify management and maintenance tasks, which in turn lowers OPEX, improves user experience and shortens the time-to-market of new services [10].

In addition to the SDN/NFV-based infrastructure [11], the SELFNET framework mainly consists of: 1) sensors and a monitor that can extract network metrics; 2) actuators and an orchestrator that perform corrective and preventive actions; and 3) the network intelligence that is in charge of diagnosing network problems and making tactical decisions. This paper focuses on presenting and validating the proposed network intelligence. The rest of this paper is organized as follows: Section II briefly introduces the architecture of the SELFNET framework in order to provide a complete view. Section III proposes the network intelligence, as well as its closed-loop control and enabling algorithms. Sections IV and V illustrate the setup of the intelligent 5G test-bed and some experimental results, respectively. Finally, Section VI concludes this paper.

II. THE SELFNET FRAMEWORK

Taking into account SDN and NFV technologies for the 5G system, the network intelligence is applied in a software-defined and virtualized network infrastructure. To provide a complete view of the intelligent management, the reference architecture of the SELFNET framework [12] is reviewed, as shown in Fig. 1. The differentiated layers are briefly explained as follows:

∗ This work was supported by the European Union's Horizon 2020 Programme under the 5G-PPP project: Framework for Self-Organized Network Management in Virtualized and Software Defined Networks (SELFNET) with Grant no. H2020-ICT-2014-2/671672.

© 2017 IEEE
978–1–5386–3873–6/17/$31.00
[Fig. 1. Reference architecture of the SELFNET framework. Labels recoverable from the figure: Infrastructure Layer (Physical Sublayer, Virtualization Sublayer), Data Layer (Data Plane Sublayer), Control Layer (SDN Controllers, Control Plane Sublayer, Sensors, Actuators), Autonomic Layer (Monitor, Network Intelligence, Orchestrator), NFV Orchestrator & Management Layer (VNF Manager, VIM).]

• Infrastructure Layer: All physical and virtual network resources are located in this layer. It encompasses a physical and a virtualization sublayer. The former provides access to physical resources (networking, computing, storage, etc.), while the latter instantiates virtual infrastructures on top of the physical sublayer.
• Data Layer: It implies an architectural evolution towards the SDN paradigm by decoupling the control plane from the data plane. In this framework, the Data Layer represents simple data forwarding, which can be realized by either non-virtualized or virtualized network functions.
• Control Layer: This layer includes two internal sublayers: SDN controllers and the control plane sublayer. SDN/NFV sensors and actuators, which are capable of collecting data from the entire system and enforcing actions, respectively, are also included.
• Autonomic Layer: This layer consists of three modules: the monitor extracts features related to network behavior, which are sent to the intelligence part to decide which action should be taken. The orchestrator coordinates physical/virtual resources and manages actuators to execute the decided action.
• NFV Orchestration and Management Layer: This layer is responsible for orchestrating and managing Virtual Network Functions (VNFs) via the VNF manager, as well as virtual resources through the Virtualized Infrastructure Manager (VIM). It conforms to NFV Management and Orchestration (MANO) specified by the European Telecommunications Standards Institute (ETSI) [13].

III. THE NETWORK INTELLIGENCE

One of the main innovative aspects of the SELFNET framework is the network intelligence, which enables autonomic management for 5G networks. Taking advantage of cutting-edge techniques in the field of artificial intelligence, it provides the capabilities of self-healing, self-protection and self-optimization by reactively and proactively dealing with detected and predicted network problems, respectively. This section presents the closed-loop control of the network intelligence, as well as the enabling algorithms.

[Fig. 2. Control loop of the network intelligence: Sensor → Monitor → (Features) → Network Intelligence → (Actions) → Orchestrator → Actuator.]

A. Control Loop

On top of the underlying software-defined and virtualized network infrastructure, a closed control loop, starting from the sensors and terminating at the actuators, is designed. Once the monitor has detected or predicted a network problem, a control loop is initiated. The network intelligence diagnoses the cause of the problem, decides on a tactic and plans an action. As soon as the orchestrator receives an action request, it coordinates the physical and virtual resources to enforce this action. As shown in Fig. 2, the input and output of the network intelligence are features and actions, respectively, which are explained as follows:

• Feature: According to [14], five differentiated data sources have been identified in the SELFNET framework. All monitoring information retrieved from physical devices, the data plane, the SDN controller, SDN/NFV sensors, and the VIM is uniformly called sensor data. The monitor is capable of analyzing and aggregating the collected sensor data so as to derive a set of network features that can be evaluated to indicate the characteristics of an existing or emerging network problem.
• Action: An action is an implementable countermeasure that describes how a decision is to be enforced, taking into account the available physical and virtual resources.

B. Enabling Algorithms (1): Feature Selection

In practice, a large number of features (network metrics) can be extracted from the 5G infrastructure. Each feature generally needs to be recorded periodically, resulting in a huge volume of data. When the management system tackles a specific problem, e.g., traffic congestion, it is inefficient (if not infeasible) to process all data. That is because generally only a relatively small subset of all available features is informative, while the others are either irrelevant or redundant. As a data-driven approach, the network intelligence should be built on relevant features, while discarding the others, so that irrelevant and redundant features do not degrade either training speed or predictive accuracy.
[Fig. 3. The setup of intelligent 5G test-bed. RAN Edge: mini-PC as UE with LTE USB surfstick (HUAWEI E398), eNodeB with USRP (Ettus B210). DC Core: EPC server (with HSS, SPGW & MME), iPerf3 server, network intelligence server (with ZABBIX master), Internet access switch towards the Internet (YouTube). The data plane and the management plane (ZABBIX clients) are drawn as separate paths.]

Feature Selection (FS) is one of the most important intelligence techniques and an indispensable component in machine learning and data mining [15]. It can reduce the dimensionality of data by selecting only a subset of features to build the learning machine. In this paper, we take advantage of a classical FS algorithm called Relief-F [16] to calculate the relevance of the collected features.

C. Enabling Algorithms (2): Classification

In the terminology of machine learning, classification is an instance of supervised learning. It is applied to identify which class a new observation belongs to on the basis of a training dataset. An example would be assigning an incoming email to the 'SPAM' or 'non-SPAM' class in terms of the observed features of the email (source IP address, text length, title content, etc.). Several classification algorithms that are used in the test-bed are briefly reviewed as follows:

1) Decision Tree: Decision Tree (DT) [17] is a classical supervised learning method used for classification. Decision rules are inferred from a training dataset and a tree-shaped diagram is built. Each node of the decision tree relies on a feature to separate the data, and each branch represents a possible decision. DT is simple, interpretable and fast, whereas it is hard to apply in complex and non-linear cases.

2) Support Vector Machine: Support Vector Machine (SVM) [18] utilizes a so-called hyperplane to separate all data points of one class from those of another. The number of features does not affect the computational complexity of SVM, so that it can perform well in the case of high-dimensional and continuous features. However, it is a binary classifier, and a multi-class problem can be solved only by transforming it into multiple binary problems.

3) Nearest Neighbor: Another algorithm called k Nearest Neighbor (kNN) is applied for data classification following the hypothesis that data points in close proximity, in terms of inter-data distance, are similar. The class of an unclassified observation can be decided by observing the classes of its nearest neighbors. It is among the simplest algorithms and offers good predictive accuracy, but it needs high memory usage, is vulnerable to noisy data and is not easy to interpret.

IV. AN INTELLIGENT 5G TEST-BED

A. Test-Bed Setup

To demonstrate intelligent management for 5G, a mobile network test-bed is established, as shown in Fig. 3. Its architecture conforms to ETSI's Mobile Edge Computing (MEC) [19], which was specially designed for the forthcoming 5G networks. To approach a realistic network as closely as possible, an open-source LTE implementation called OpenAirInterface (OAI) [20] is adopted. OAI provides a full protocol stack of the 3GPP LTE standards for the E-UTRAN radio access and the EPC core network. Relying on a software-defined radio module (Ettus USRP B210) at the eNodeB side, a radio link is established between user equipments (UEs) and the eNodeB. In our test-bed, commercial UEs have been successfully tested to connect to the eNodeB and access the Internet, e.g., using an LTE-enabled Apple iPad to browse webpages and watch YouTube videos. Here, a mini-PC with an LTE surfstick (Huawei E398) is used rather than a commercial UE, because measurement tools are easier to install on it. The UE and eNodeB form the RAN Edge of MEC.

On the other side of this test-bed is the MEC Data Center (DC) Core, where three servers and two switches are deployed. First, the EPC server acts as the LTE EPC core network. It is connected to the eNodeB at one side and to a switch at the other side. The Internet access (data plane), marked by the blue-solid lines in Fig. 3, is granted to the UE. Second, to facilitate controllable network testing, a server is used to deploy network tools like iPerf3 [21] (to flexibly generate the desired traffic) and to provide internal services such as video-streaming. Third, the network intelligence server runs machine learning algorithms so as to monitor, diagnose and control. For the illustration shown in this paper, this server acts as the data sink for collecting network features with the help of the ZABBIX monitoring solution [22], and implements the feature selection and classification modules. The hardware, including servers, switches, mini-PC, radio modules and antennas, fits into a server rack, as shown in the bottom-left part of Fig. 3, where the RAN Edge and the DC Core are not physically separated.

[Fig. 4. Relevance weights of features. Axes: Features (x), Relevance Weight (y).]

TABLE I
LIST OF FEATURES

No.  Metric             Definition
1    EPC Traffic In     Incoming traffic of EPC server
2    EPC Traffic Out    Outgoing traffic of EPC server
3    UE Traffic In      Incoming traffic of UE
4    UE Traffic Out     Outgoing traffic of UE
5    Server Traffic In  Incoming traffic of iPerf3 server
6    PLR                Average Packet Loss Rate
7    Delay              Round trip delay
8    EPC Packet In      Number of EPC's incoming packets
9    UE Packet In       Number of UE's incoming packets
10   eNB CPU Util       eNodeB's CPU utilization in percentage (%)
11   eNB CPU Temp       eNodeB's CPU temperature (°C)
12   eNB Mem Util       eNodeB's memory utilization
13   EPC CPU Util       EPC's CPU utilization in percentage (%)
14   EPC CPU Temp       EPC's CPU temperature (°C)
15   EPC Mem Util       EPC's memory utilization
16   UE CPU Util        UE's CPU utilization in percentage (%)
17   UE CPU Temp        UE's CPU temperature (°C)
18   UE Mem Util        UE's memory utilization
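To make the classification step on the network intelligence server more concrete, the following minimal sketch implements the kNN rule reviewed in Section III-C in plain Python, using two of the metrics from Table I (PLR and Delay) as inputs. All numbers are hypothetical values assumed to be pre-scaled to [0, 1]; they are not measurements from the test-bed, and `knn_predict` is an illustrative name rather than part of the SELFNET software.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training observations."""
    # Euclidean distance from x to every training observation, closest first.
    by_distance = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: (PLR, Delay), both scaled to [0, 1].
train_X = [(0.02, 0.10), (0.05, 0.15), (0.03, 0.12),
           (0.80, 0.70), (0.90, 0.85), (0.75, 0.90)]
train_y = ["normal", "normal", "normal",
           "congested", "congested", "congested"]

high_loss = knn_predict(train_X, train_y, (0.85, 0.80))  # near the congested group
low_loss = knn_predict(train_X, train_y, (0.04, 0.11))   # near the normal group
```

Because kNN relies on distances, features with very different units (for example bytes per second and degrees Celsius) would normally be rescaled before use, which is why the toy values above are assumed to be normalized already.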
[Fig. 5. Predictive accuracy as a function of an incremental number of features. Axes: Number of Features (x), Prediction Accuracy (y). Curves: DT, DT-r, SVM, SVM-r, kNN, kNN-r.]

B. Acquiring Network Features

Since intelligence algorithms based on machine learning are data-driven, data collection is necessary for both the training and prediction phases. To guarantee the reporting of network features, a separate traffic path for the management plane is required, as highlighted by the red-dashed lines in Fig. 3. ZABBIX clients are installed on the servers and the mini-PC where features need to be extracted. These clients are connected to the ZABBIX database running on the network intelligence server via a switch that is dedicated to the management plane's traffic. Note that the data traffic, such as YouTube video-streaming, is transferred over an independent network route through the Internet access switch. The reason is to avoid potential collisions. In previous tests, where the data and management traffic were not separated, the management traffic was also blocked when testing data traffic congestion.

In this paper, we illustrate the operation of the test-bed by means of an example of traffic congestion. A testing procedure is designed as follows:
1) Configure the maximal bandwidth of the Internet access switch to 768 kByte/s.
2) Run the eNodeB and EPC.
3) Connect the UE to the network, visit YouTube.com and start a video down-streaming.
4) Generate traffic of 2.5 MByte/s with the iPerf3 server and inject the traffic into the Internet access switch. Once the iPerf3 traffic arrives, congestion occurs.
5) After a period of congestion, terminate the iPerf3 traffic so as to return to the normal status.

During the whole test, the network features listed in Table I are collected periodically (one sample per second) and stored in the ZABBIX database. A training dataset consisting of 18 features with 240 observations is then obtained.

V. EXPERIMENTAL RESULTS

In this test, we take advantage of a classical algorithm called Relief-F [16] to determine which features listed in Table I are relevant to the traffic congestion. The acquired training dataset with the dimension of 240×18 is fed into the Relief-F algorithm. A metric called relevance weight, ranging from −1 to 1, is used to indicate the relevance of features. The larger a weight is, the more relevant the corresponding feature. As shown in Fig. 4, the 6th and 14th features, namely PLR and EPC CPU Temp, are the most relevant and the most irrelevant, respectively. As we observed in the test, the CPU temperature of the EPC server fluctuates randomly around 39 °C, independently of the occurrence of traffic congestion.

Then, predictive accuracies of three different classifiers (DT, SVM, and kNN) are tested in terms of an incremental number of features, where the number of features selected for
classification is gradually increased from 1 to 18. In the first iteration, the number of used features is set to 1. The most relevant feature, PLR, is the only feature in the training data, whereas the other features are ignored. The three classifiers are trained and the achieved predictive accuracies are 95.4%, 95.8%, and 96.3%, respectively. In other words, when the problem of traffic congestion occurs 100 times, an average of around 96 occurrences can be correctly detected by the network intelligence so that further countermeasures can be taken.

In contrast, the classifiers can also be built on the basis of the most irrelevant feature, EPC CPU Temp. The resulting predictive accuracies are only 62.9%, 62.9%, and 61.7%, respectively. That is to say, the network intelligence is unaware of almost 40% of the congestion that occurred. If we merely decided (guessed) at random whether there is congestion, like tossing a coin, the average accuracy would be 50%. In comparison, the classifiers with the most irrelevant feature perform poorly, since their results are only slightly better than those of random decision-making.

In the second iteration, the number of used features increases to two by adding another feature. To be specific, the most and second most relevant features, namely PLR and EPC Packet In, are selected to train and predict. Similarly, EPC CPU Temp and Server Traffic In are used in the irrelevant case. In each iteration, one feature is added to the relevant and irrelevant tests according to relevance, until all 18 features are used. Using DT as an example, as shown in Fig. 5, the curve 'DT' shows the predictive accuracy of the DT classifier with an incremental number of relevant features. The curve 'DT-r' gives its predictive accuracy with an incremental number of irrelevant features.

Generally speaking, the three classifiers can achieve a high predictive accuracy using merely a few of the most relevant features. Using 6 features, for example, DT achieves its highest accuracy of 98.3%. Then, its performance decreases slightly as the number of features increases. This reveals that the number of features does not follow the rule of "the more the better". Notice that SVM and kNN can achieve the optimal accuracy of 100%. Starting from 7 features, SVM keeps the optimal accuracy of 100% until all 18 features are used. With 12 features, kNN achieves the optimal accuracy, while there is a slight decrease when the number of features is larger than 16. We can remark here that the classifiers achieve a very high accuracy of congestion detection with a reasonable number of network features. This indicates that the network intelligence is both feasible and effective.

Note that our objective in this paper is not to compare different classifiers. The aim is to verify the feasibility and effectiveness of applying artificial intelligence techniques to implement intelligent 5G network management. The experiments provide a basic verification of this feasibility and effectiveness.

VI. CONCLUSIONS

This paper explored the possibility of intelligent network management for software-defined and virtualized 5G systems. Starting from a brief overview of the SELFNET framework, the closed-loop control and enabling algorithms of the network intelligence, which provides the functionalities of self-healing, self-protection and self-optimization, have been presented. To demonstrate the proposed approach, an example problem of traffic congestion has been tested over an MEC-compliant 5G test-bed. The experimental results validated that the network intelligence is capable of autonomically detecting the anomaly with very high accuracy with the aid of feature selection and classification algorithms. This verified the feasibility and effectiveness of the proposed intelligent 5G management.

REFERENCES

[1] “Top ten pain points of operating networks,” Aviat Networks, 2011.
[2] “5G white paper,” NGMN, Feb. 2015.
[3] EU H2020 5G-PPP SELFNET project. [Online]. Available: https://selfnet-5g.eu/
[4] B. A. A. Nunes et al., “A survey of software-defined networking: Past, present, and future of programmable networks,” IEEE Commun. Surveys, vol. 16, no. 3, pp. 1617–1634, 2014.
[5] R. Mijumbi et al., “Network function virtualization: State-of-the-art and research challenges,” IEEE Commun. Surveys, vol. 18, no. 1, pp. 236–262, 2016.
[6] S. Dixit et al., “On the design of self-organized cellular wireless networks,” IEEE Commun. Mag., vol. 43, no. 7, pp. 86–93, Jul. 2005.
[7] A. He et al., “A survey of artificial intelligence for cognitive radios,” IEEE Trans. Veh. Technol., vol. 59, no. 4, pp. 1578–1592, May 2010.
[8] X. Zhou et al., “Network slicing as a service: enabling enterprises’ own software-defined cellular networks,” IEEE Commun. Mag., vol. 54, no. 7, pp. 146–153, Jul. 2016.
[9] K. Samdanis et al., “From network sharing to multi-tenancy: The 5G network slice broker,” IEEE Commun. Mag., vol. 54, no. 7, pp. 32–39, Jul. 2016.
[10] L. J. G. Villalba et al., “D2.1 - Use cases definition and requirements of the system and its components,” EU H2020 SELFNET project, Tech. Rep., Oct. 2015. [Online]. Available: https://selfnet-5g.eu/deliverables/
[11] P. Neves et al., “The SELFNET approach for autonomic management in an NFV/SDN networking paradigm,” Intl. Journal of Distributed Sensor Networks, vol. 16, no. 2, pp. 1–17, Feb. 2016.
[12] R. Cale et al., “D2.2 - Definition of APIs and interfaces of the SELFNET framework,” EU H2020 SELFNET project, Tech. Rep., Mar. 2016. [Online]. Available: https://selfnet-5g.eu/deliverables/
[13] “Network Functions Virtualisation (NFV): Management and Orchestration,” ETSI, Tech. Rep., Dec. 2014. [Online]. Available: http://www.etsi.org/
[14] L. J. G. Villalba et al., “D4.1 - Report and prototypical implementation of the monitoring and discovery module,” EU H2020 SELFNET project, Tech. Rep., Sep. 2016.
[15] V. Kumar and S. Minz, “Feature selection: A literature review,” Smart Computing Review, vol. 4, no. 3, pp. 211–229, Jun. 2014.
[16] I. Kononenko et al., “Estimating attributes: analysis and extensions of RELIEF,” in Proc. of the 6th European Conf. on Machine Learning, Catania, Italy, Apr. 1994, pp. 171–182.
[17] S. K. Murthy, “Automatic construction of decision trees from data: A multi-disciplinary survey,” Journal on Data Mining and Knowledge Discovery, vol. 2, no. 4, pp. 345–389, Dec. 1998.
[18] C. J. Burges, “A tutorial on support vector machines for pattern recognition,” Journal on Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, Dec. 1998.
[19] M. Patel et al., “Mobile-Edge Computing introductory technical white paper,” Mobile-Edge Computing (MEC) industry initiative, 2014.
[20] N. Nikaein et al., “OpenAirInterface: A flexible platform for 5G research,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 5, pp. 33–38, 2014.
[21] iPerf - The ultimate speed test tool for TCP, UDP and SCTP. [Online]. Available: https://iperf.fr/
[22] ZABBIX: The enterprise-class monitoring solution for everyone. Zabbix LLC. [Online]. Available: http://www.zabbix.com/
