A Performance Management System for Telecommunication Network Using AI Techniques

Abstract

Anomaly detection in telecommunication networks has become increasingly difficult
because of the variety of networking technologies and the growing number of
unauthorized activities reflected in the performance data. This paper builds a
performance management system based on the one-class support vector machine (OCSVM)
and the K-means clustering algorithm, which achieves not only the automatic detection
of network anomalies but also the clustering of those anomalies into different levels.
The OCSVM detects anomalies by solving an optimization problem that separates the
nominal data from the anomalies; the detected anomalies are then classified into minor,
medium and severe levels using K-means clustering. Real telecommunication performance
data are used for the investigation, and the numerical results demonstrate the
promising performance of the system.

1. Introduction

As telecommunication networks become more and more complicated, operators face major
operational problems, and a larger number of elements in the network are required to
monitor or manage the performance of the system. Each element generates performance
data that indicate the performance of the network. As the data of each element have
their own trend and typical values according to the nature of the element, such data
can characterize the network behaviour and are therefore used for network anomaly
detection.
An anomaly is a deviation from the normal trend, and anomaly detection, which
identifies anomalous activities [1], is one of the main tasks of a performance
management (PM) system. Because knowledge of the so-called “novelty” for a given
system is insufficient and its representation inaccurate, anomaly detection has
become a challenging topic.
Several techniques and algorithms have been reported for anomaly detection. One
approach is to define the abnormal conditions explicitly [2]; however, because unknown
behaviours are difficult to define, such rule-based algorithms are often not
applicable in real applications. More generally, anomaly detection can be regarded as
a binary classification problem, and many classification algorithms have been used to
detect anomalies, such as probabilistic techniques [3], neural networks [4, 5],
support vector machines [6, 7], K-nearest neighbour (KNN) [8] and hidden Markov
models [9]. Strictly speaking, however, these are not anomaly detection algorithms, as
they require knowing what kind of anomaly to expect, which departs from the
fundamental objective of anomaly detection; in addition, they may be sensitive to
noise in the training samples. Segmentation and clustering algorithms [10] seem to be
better choices because they do not need to know the signatures of the series. Their
shortcomings are that they need parameters specifying a proper number of segments or
clusters, and that the detection procedure has to shift from one state to another.
Negative selection algorithms [11-13] are designed for one-class classification;
however, they can fail as the diversity of the normal set increases, and they are not
meant for problems with a small number of self samples or for general classification
problems in which the probability distribution plays a crucial role. Other approaches,
such as time series analysis, have also been applied to anomaly detection [14], but
they may not be suitable for most real application cases.
As anomalous data can be regarded as outliers of the “normal” data distribution [15],
the One-Class SVM (OCSVM), an extension of support vector machines [16], was proposed
to detect outliers [17, 18]. The OCSVM is an optimization algorithm whose idea is
first to map the data into a high-dimensional Hilbert space and then to maximize the
margin between the mapped data and the origin. A trade-off parameter ν is introduced
in the objective function to control the maximum percentage of outliers in the data
set. The OCSVM overcomes the shortcomings of traditional SVMs in coping with noisy
one-class data and is expected to possess good generalization ability in outlier
detection. In this paper, the OCSVM is introduced into the telecommunication network
PM system and is verified on real telecommunication performance data for anomaly
detection.

The rest of this paper is organised as follows: Section 2 presents a brief introduction of
the OCSVM. Section 3 introduces the PM anomaly detection system, which includes
the detector training module and anomaly detection module. Experiments are executed
in Section 4 to show the performance of the system. Conclusion and future work are
discussed in Section 5.

2. A Brief Introduction of the OCSVM


Consider a data set T = {x1, x2, …, xl}, xi ∈ R^N. The task is to find a function f
that takes the value “+1” for most of the vectors in the data set and “-1” for the
remaining small part.
The strategy of the OCSVM is first to map the input data into a Hilbert space H
according to a mapping function X = ϕ(x), and then to separate the data from the
origin with maximum margin; a hyper-plane f(x) is built to mark the boundary of
separation.
The key idea is that not all the data need to lie on the same side of the hyper-plane
f(x); on the contrary, a small number of points may lie on the other side. To allow
this, slack variables are introduced into the objective function of the support vector
machine, and the OCSVM solves the following quadratic optimization problem:
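For reference, the primal problem of the standard OCSVM formulation, written with the
symbols used in this section (w, ρ, ξi, ν, l), reads as follows. The numbering (1) and
(2) is an assumption chosen to match the references in the text.

```latex
\begin{align}
\min_{w,\,\xi,\,\rho}\quad & \frac{1}{2}\lVert w\rVert^{2}
    + \frac{1}{\nu l}\sum_{i=1}^{l}\xi_i - \rho \tag{1}\\
\text{s.t.}\quad & w\cdot\phi(x_i) \ge \rho - \xi_i,\qquad
    \xi_i \ge 0,\qquad i = 1,\dots,l \tag{2}
\end{align}
```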
In Functions (1) and (2), w is the normal vector perpendicular to the hyper-plane and
ρ is the bias of the hyper-plane. ξi are slack variables acting as penalization in the
objective function. ν ∈ (0,1) is the trade-off parameter that balances the normal and
anomalous data in the data set; at most ν × 100% of the data points are expected to
return negative values of f(x) = w·ϕ(x) − ρ. Deriving the dual representation, the
OCSVM solves the following problem:

Select the kernel function K(x, x') in the Hilbert space H and the trade-off parameter
ν, then construct and solve the following optimization problem to find the solution
α* = (α1*, …, αl*):
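In the standard OCSVM formulation, the dual problem takes the form below. The
numbering (3) and (4) is an assumption, chosen so that ν appears in Function (4) as
referenced in Section 4.

```latex
\begin{align}
\min_{\alpha}\quad & \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}
    \alpha_i\,\alpha_j\,K(x_i, x_j) \tag{3}\\
\text{s.t.}\quad & 0 \le \alpha_i \le \frac{1}{\nu l},\qquad
    \sum_{i=1}^{l}\alpha_i = 1 \tag{4}
\end{align}
```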

where K(xi, xj) = ϕ(xi)·ϕ(xj) is called the kernel function and can take various
forms.

It has been proved that ν × 100% is an upper bound on the percentage of data points
expected to be outliers in the training data [11], and a vector xi is detected as an
outlier in the training set if and only if αi = 1/(νl). The parameter ν directly
determines the sensitivity of outlier detection using the OCSVM.
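
Once α* and ρ are known, the decision function can be evaluated through the kernel
alone, since w is a weighted sum of the mapped training points. A minimal Python
sketch, assuming the standard expansion f(x) = Σ αi K(xi, x) − ρ; the function names,
the linear kernel and the numeric values are illustrative and not taken from the paper:

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def linear_kernel(a: Vector, b: Vector) -> float:
    """Plain dot product, used here only as the simplest valid kernel."""
    return sum(ai * bi for ai, bi in zip(a, b))


def decision_value(x: Vector, train: Sequence[Vector], alphas: Sequence[float],
                   rho: float,
                   kernel: Callable[[Vector, Vector], float] = linear_kernel) -> float:
    """Evaluate f(x) = sum_i alpha_i * K(x_i, x) - rho."""
    return sum(a * kernel(xi, x) for a, xi in zip(alphas, train)) - rho


def is_anomaly(x: Vector, train: Sequence[Vector], alphas: Sequence[float],
               rho: float,
               kernel: Callable[[Vector, Vector], float] = linear_kernel) -> bool:
    """A point is flagged when the decision value is negative."""
    return decision_value(x, train, alphas, rho, kernel) < 0.0


# Illustrative (hypothetical) dual solution for two orthogonal training points.
TRAIN = [(1.0, 0.0), (0.0, 1.0)]
ALPHAS = [0.5, 0.5]  # satisfies sum(alpha) = 1
RHO = 0.5
```

With these illustrative values, the point (1, 1) lies on the normal side of the
hyper-plane, while the origin returns a negative value and would be flagged.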

3. Anomaly Detection System for Telecommunication Data

3.1 Anomaly Detection in Telecommunication Data

Performance management in telecommunication addresses the problem of intelligently
managing the performance data. One key function of PM is to detect anomalies and then
generate alarms according to the detection results. There are two types of PM data:
qualitative and quantitative. Qualitative data, also known as key performance
indicators (KPIs), measure the service quality. These indicator values are percentages
between zero and one hundred. Examples of qualitative KPIs are the system interchange
success rate and the paging success rate. Instead of recording a percentage,
quantitative data trace the traffic at each service point in the network. Figure 1
shows the normal performance quantity traces of a service for the same day (Thursday)
in the last five weeks. The values of the quantitative data are logged at 15-minute
intervals throughout the day, giving 96 raw values per day per recording. In this
figure, the obvious anomalies are marked with circles. The objective of the PM system
is to flag such anomalous events occurring in the network.

3.2 PM Anomaly Detection System

The anomaly detection system is built in two steps, as shown in Figure 2. First, the
offline data are used to train the OCSVM and generate the model function f(x); once
validated, the model is transferred to the performance monitoring and management
system to detect anomalies in the performance data online.
As mentioned in step 3 of Section 2, the decision function returns real-valued
negative values for the anomalies. These negative values reflect the degree of
deviation of the abnormal events: the lower the value, the more abnormal the event.
According to these values, the detected anomalies can be clustered using a clustering
algorithm into three types, namely severe, medium and minor alarms.

Figure 2 shows the structure of the PM anomaly detection system. In the system, the
OCSVM is trained on the offline data to generate the detection model, and the model
function is then employed for anomaly detection. The values returned by the decision
function are passed to the clustering machine and grouped into different clusters.

4. Experiments for Telecommunication Anomaly Detection Using OCSVM

In this section, numerical experiments are carried out to demonstrate the
effectiveness of the anomaly detection system. As mentioned above, there are two main
types of performance data in telecommunication networks. Our experiments focus on the
analysis of the traffic data, which is one type of quantitative data. As can be seen
in Figure 1, anomaly detection based on the traffic data is not straightforward
because of the nonlinear nature of the performance curve.

4.1 Experiment on a Small Data Set

4.1.1 Data Pre-processing and Feature Extraction: As seen in Figure 1, the traffic
data follow a nonlinear performance curve, and the range of values differs from week
to week. Because of this nonlinear nature, the data cannot be used directly for
training the OCSVM model. As is well known for telecommunication networks, network
failures cause different PM indicators to sink or rise, so the first-order gradients
of the data set are used as the new feature for training and testing. That is, a
feature set F = {y1, y2, …, yl} of a training set T = {x1, x2, …, xl} is defined as
follows:
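
One common way to form first-order gradients, assumed here purely for illustration, is
the backward difference yi = xi − x(i−1), with y1 set to 0 so that F keeps the same
length l as T. Both the direction of the difference and the padding of the first
element are assumptions, and the function name is hypothetical:

```python
from typing import List, Sequence


def first_order_gradient(series: Sequence[float]) -> List[float]:
    """Backward difference of a traffic series, padded to the input length.

    Assumption: y[0] = 0 and y[i] = x[i] - x[i-1] for i >= 1.
    """
    if not series:
        return []
    return [0.0] + [series[i] - series[i - 1] for i in range(1, len(series))]
```

A sudden rise or drop in traffic then appears as a large positive or negative feature
value, independent of the absolute traffic level of the week.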

4.1.2 Training and Testing: The five weeks of data in Figure 1 are divided into
training and testing sets. The first week, presented as Series 1 in Figure 1, is used
as the training set, and the other four weeks, presented as Series 2, 3, 4 and 5, are
used for testing. The Radial Basis Function (RBF) kernel is chosen as the kernel
K(x, y), which can be expressed as:
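
A minimal Python sketch of an RBF kernel, assuming the widely used Gaussian form
K(x, y) = exp(−‖x − y‖² / (2σ²)). Whether the paper places 2σ² or σ² in the
denominator is not known, so the parametrization below is an assumption:

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float = 2.5) -> float:
    """Gaussian RBF kernel; assumes the exp(-||x - y||^2 / (2 * sigma^2)) form."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

The kernel equals 1 for identical inputs and decays towards 0 as the inputs move
apart; σ controls how quickly that decay happens.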
The parameter σ is set to 2.5, and the trade-off parameter ν in Function (4) is set
to 0.01, which assumes that at most 1% of the data points in the training set are
abnormal. After training, the model is applied to the testing data set. Figure 3 and
Figure 4 illustrate the offline monitoring results for week 3 and week 4,
respectively. In both figures, Series 1 plots the original traffic values for one
week, and Series 2 presents the values returned by the decision function f(x). These
results perfectly match the result of human visual inspection. It can be seen from the
figures that one anomaly at data point 7 is detected in week 3, and three anomalies at
data points 7, 9 and 94 are detected in week 4.

Some of the values returned by the decision function for week 4 are listed in Table 1.
There are altogether three anomalies in week 4, with returned values of -0.0801,
-0.1376 and -0.2056. Comparing the returned values with the original traffic values
shows that the farther a value deviates from the normal trend, the smaller (more
negative) the returned value.

4.2 Experiment on a Large Data Set

After the anomalies in the small data set were successfully detected, the experiment
moves on to a large data set for further testing. The large data set contains the
traffic data of 31 consecutive days, 2976 values in total with 96 values per day. The
original traffic data are shown in the lower plot of Figure 5.

4.2.1 Training and Testing Using the OCSVM Detector: The first-order gradient feature
is extracted for both the training and testing data. The parameter σ is set to 2.5,
and the trade-off parameter ν to 0.01. The results are displayed in the upper part of
Figure 5. In total, 32 abnormal data points are detected by the OCSVM detector.

4.2.2 Clustering the Anomalies into Different Levels: As floods of anomalies are often
displayed at the same time in telecommunication networks, determining the level of
anomaly severity helps users prioritize their work. When a major or critical anomaly
is reported, immediate attention is required, whilst minor anomalies can be ignored or
can wait until normal maintenance is performed. As described before, the OCSVM
detector returns a negative value for each anomaly, and these values can be used to
analyze the severity of the anomalies. The returned values are used as the inputs of
the K-means clustering algorithm, which clusters the anomalies into three classes:
minor, medium or severe. Table 2 and Figure 6 show the results after clustering.
Altogether, 32 anomalies are detected by the OCSVM anomaly detector; they are then
clustered into three classes, with 18 minor alarms (rectangles in Figure 6), 12 medium
alarms (circles) and two severe alarms (triangles). With different levels of alarms,
the PM administrator can react accordingly.

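
The severity grouping can be sketched with a small one-dimensional K-means (Lloyd's
algorithm) over the negative decision values. This is a minimal illustration, not the
authors' implementation: the function names, the deterministic initialisation and the
rule that the cluster with the lowest centroid is the "severe" one are assumptions.

```python
from typing import List, Sequence, Tuple


def kmeans_1d(values: Sequence[float], k: int = 3,
              max_iter: int = 100) -> Tuple[List[float], List[int]]:
    """Lloyd's algorithm on scalars; returns (centroids, labels)."""
    if k < 2 or len(values) < k:
        raise ValueError("need k >= 2 and at least k values")
    ordered = sorted(values)
    # Deterministic start: centroids spread evenly over the sorted values.
    centroids = [ordered[(len(ordered) - 1) * j // (k - 1)] for j in range(k)]

    def assign(cs: List[float]) -> List[int]:
        # Each value goes to the nearest centroid.
        return [min(range(k), key=lambda c: abs(v - cs[c])) for v in values]

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)


def severity_levels(scores: Sequence[float]) -> List[str]:
    """Map each negative decision value to 'minor', 'medium' or 'severe'."""
    centroids, labels = kmeans_1d(scores, k=3)
    # Lower (more negative) centroid means a more abnormal cluster.
    order = sorted(range(3), key=lambda c: centroids[c])
    names = {order[0]: "severe", order[1]: "medium", order[2]: "minor"}
    return [names[lab] for lab in labels]
```

Applied to the three week-4 values quoted in Section 4.1.2 (-0.0801, -0.1376,
-0.2056), this sketch labels them minor, medium and severe respectively; with only
three points each value forms its own cluster, so the example shows the mapping rather
than a realistic grouping.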
5. Conclusion

This paper proposes a PM anomaly detection system based on the OCSVM and K-means
clustering. The OCSVM is introduced into the performance monitoring and management
system to detect anomalous events. By solving an optimization problem with slack
variables and a trade-off parameter, the OCSVM captures most of the data in a data set
as normal within a “small” region and flags a very small part of the data as
anomalies. The detected anomalies are clustered into different classes to generate
alarms of different levels for the PM system. Experiments show that the system can
successfully detect anomalies and generate alarms for a telecommunication system.