2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC 2020)

A DDoS Attack Detection Method Based on Information Entropy and Deep Learning in SDN
Lu Wang, Ying Liu
School of Electronic and Information Engineering, Beijing Jiaotong University
Beijing, China
luwangbjtu@bjtu.edu.cn, yliu@bjtu.edu.cn

Abstract—Software Defined Networking (SDN) decouples the control plane from the data plane and eases the deployment of new services. However, it also introduces the threat of a single point of failure: an attacker can launch DDoS attacks against the controller through the switches. In this paper, a DDoS attack detection method based on information entropy and deep learning is proposed. First, the controller inspects suspicious traffic through information entropy detection. Then, a convolutional neural network (CNN) model performs fine-grained, packet-based detection to distinguish normal traffic from attack traffic. Finally, the controller applies the defense strategy to intercept the attack. The experiments indicate that the accuracy of this method reaches 98.98%, which shows its potential to detect DDoS attack traffic effectively in the SDN environment.

Keywords—Software Defined Network, information entropy, deep learning, DDoS attack detection

I. INTRODUCTION

With the rapid development of the Internet, the disadvantages of traditional networks have gradually been exposed. Emerging issues can only be addressed by patching the network, which makes the network more bloated and weakens its controllability. Software Defined Networking (SDN) solves these problems by decoupling the control function from the forwarding hardware. Because of the centralized control architecture, the controller can obtain the state of all switches in its domain and control the whole network through the Southbound Application Programming Interface (API).

Although the controllability and flexibility of the network have been enhanced, SDN also introduces new security issues. The centralized nature of the controller makes it an easy target for DDoS attacks. Attackers can send a large number of packets with spoofed source IP addresses; these packets match no flow entry in the switch and are therefore sent to the controller. Processing these spoofed packets consumes excessive computing and memory resources of the controller, leaving it unable to process the legitimate requests of normal users. Thus, the purpose of launching a DDoS attack against the controller is achieved. Traditional methods for detecting DDoS attacks include access control and intrusion detection systems. Access control requires human intervention, so its labor cost is relatively high. Intrusion detection systems usually have a high false alarm rate. Additionally, it is not easy to distinguish real attacks from all alerts.

To resolve this security issue, this paper proposes a DDoS attack detection method based on information entropy and deep learning in the SDN environment. This method combines the advantages of both: the low complexity of information entropy detection and the high accuracy of deep learning detection. Thus, it has the potential to effectively detect DDoS attacks against the controller in SDN and ensure the security of the SDN network.

This paper is organized as follows: Section II discusses related work on DDoS detection methods in SDN. Section III describes the architecture design and details of the detection mechanism. Section IV presents the performance evaluation experiments of the detection method. Section V concludes the paper.

II. RELATED WORK

Some researchers have discussed DDoS attacks in the SDN environment. Scott-Hayward et al. [1] analyzed the security of SDN and pointed out the potential threats and corresponding solutions. Dayal and Srivastava [2] analyzed DDoS attacks in SDN. They summarized the possible threats to the SDN architecture caused by different kinds of DDoS attacks and identified their features. Khairi et al. [3] classified DDoS attacks in SDN into application layer, control layer, and data layer attacks, and studied and analyzed the anomaly detection technology for them. The articles mentioned above provide theoretical guidance for the development of attack detection methods.

In recent years, DDoS attack detection in SDN networks has mainly been divided into two categories: statistical analysis and machine learning. Common methods of statistical analysis include information entropy, chi-square statistics, and principal component analysis. Mousavi and St-Hilaire [4] proposed a method based on the change in destination address entropy, which consumes fewer resources. Zuo et al. [5] used the traffic matrix and principal component analysis to detect abnormal traffic. The accuracy reached 91% while the false alarm rate was 4.2%. A difficulty of statistical methods is that only a fixed threshold is used as the basis for judgment, so it is easy to misjudge flash crowds as DDoS attacks. Moreover, rich experience is required when adjusting the threshold for different environments, which directly affects the accuracy of

978-1-7281-4390-3/20/$31.00 ©2020 IEEE 1084

Authorized licensed use limited to: UNIVERSITY OF BIRMINGHAM. Downloaded on May 10,2020 at 08:02:25 UTC from IEEE Xplore. Restrictions apply.
detection. Lastly, the accuracy of statistical methods is not high enough for practical detection.

Machine learning methods include supervised and unsupervised learning. Ye et al. [6] used the support vector machine (SVM) to detect DDoS attacks in the SDN network. In recent years, deep learning has developed rapidly, and many scholars have begun to consider applying it to attack detection. Tang et al. [7] proposed an intrusion detection system based on a deep neural network (DNN), which demonstrated the strong potential of deep learning in traffic anomaly detection. Yuan et al. [8] implemented a DDoS attack detection method based on a long short-term memory (LSTM) neural network. Compared with the random forest, the error rate of LSTM decreased from 7.517% to 2.103%. The problem with machine learning methods is that flow collection is prone to cause congestion of the secure channel. Some methods need only a small amount of data, but their accuracy is only about 95%. In addition, most evaluations focus on accuracy and do not pay attention to efficiency.

To sum up, researchers have achieved a great deal in DDoS attack detection, but some problems in this area still need to be solved urgently. Firstly, accuracy is not high enough for many practical applications. Secondly, information collection can easily cause network bottlenecks. Finally, most methods emphasize only one of accuracy and efficiency. Therefore, this paper proposes a DDoS attack detection method based on information entropy and deep learning. Two-level detection is performed on the controller and the deep detection server. Because it is based on PACKET_IN messages, the entropy analysis does not increase the data exchange and has the potential to ensure efficiency. Deep learning is used to improve the accuracy of detection.

III. ARCHITECTURE DESIGN AND DETAILS

In order to ensure high accuracy and low computational complexity at the same time, two-level detection is applied to network traffic. The controller performs a preliminary stage based on information entropy to ensure high efficiency. The deep detection server performs the packet-based deep stage to guarantee fine granularity and high accuracy. The controller then applies the defense strategy to intercept anomalous traffic. The architecture of the method is shown in Fig. 1.

Fig. 1. The architecture of the detection method.

A. Controller Detection Design

The first stage of the overall method comprises PACKET_IN message rate detection, port entropy detection, and a control module. The controller filters suspicious traffic in advance in order to improve recall. The control module is implemented by the controller itself. The entropy detection module, which is based on information entropy, is realized by modifying the core code of the controller.

(1) Rate detection module: A DDoS attack is a typical flooding attack, in which attackers send a large number of packets to the network in a short time. In the SDN environment, an attacker can launch a DDoS attack by sending spoofed packets that cannot match any flow entry of the switch. The switch then sends PACKET_IN messages to the controller to request the processing method. Thus, the packet rate received by the controller increases rapidly in a short time. In accordance with this principle, the packet rate is regarded as the first criterion. When it exceeds the normal threshold, the possibility that the network is being attacked increases. The next step is to ascertain whether the abnormal behavior is caused by a flash crowd or by an attack.

(2) Entropy detection module: An abnormal packet rate indicates that the switch has received a large number of packets that cannot match any flow entry. The next detection step must determine which port these packets come from. If they enter the network through relatively concentrated ports, they may be attack packets. Because a flash crowd may behave similarly, it is not easy to identify an attack from the port-based packet rate alone. The difficulty here is that flash crowd traffic is similar to attack traffic in terms of packets per second, bits per second, and flows per second. Therefore, the essential differences between them must be found in order to discern attacks effectively.

A flash crowd is generated by the normal access of legitimate users, so its traffic is relatively scattered. Because attackers create attack traffic with scripts, attack flows are more similar to one another than normal flows are. Entropy can estimate the similarity of traffic: the more similar the traffic is, the smaller the entropy is. Because of the similarity of attack traffic, the information entropy of certain features will deviate from that of normal traffic. Based on the above discussion, the entropy of the traffic features is calculated for each port of a switch with an abnormal packet rate. When the value differs significantly from that of normal traffic, it is determined that attack traffic may be entering the network through this port.

B. Deep Detection Server Design

Because of its automatic feature extraction, the convolutional neural network is applied in deep detection. It has the potential to distinguish between attack traffic and normal traffic at packet granularity and to avoid the problem of manually designed features. Consequently, it enhances accuracy and decreases the false alarm rate. The detection includes the data processing section and the deep learning detection section. The traffic will be transformed into the acceptable input shape of the convolutional neural network in

the first section. The second section outputs the classification results based on the deep learning method.

(1) Data processing module: The convolutional neural network is a significant deep learning method that is mainly used for image classification. A packet can be transformed into an image and then classified by the convolutional neural network model. Fig. 2 shows a fragment of a packet captured by Wireshark. As the figure shows, each byte of a packet is 8 bits, which can be represented by two hexadecimal digits. The gray-scale value of each pixel ranges from 0 to 255, which can also be represented by two hexadecimal digits. Consequently, each byte of a packet can be converted into a pixel, and the pixels can be gathered into a picture. The convolutional neural network is then used to classify the generated traffic pictures.

Fig. 2. A fragment of a packet captured by Wireshark.

(2) Deep learning detection module: The convolutional neural network model selected through experiments includes two convolutional layers, two pooling layers, and two fully connected layers, as shown in Fig. 3. The convolutional layer is the core of the model and is responsible for extracting high-dimensional features of the data. The pooling layer decreases the number of training parameters and reduces the consumption of computing resources. The fully connected layer synthesizes the extracted features and performs the classification based on them. A dropout layer is added before the output layer to prevent overfitting and improve the generalization ability of the model. It randomly activates or deactivates neurons according to the probability parameter P.

Fig. 3. The convolutional neural network model.

IV. EVALUATION AND DISCUSSION

A. Experimental Environment

The experiments are performed in the Mininet environment, and POX is selected as the controller of the network. The hardware and operating system configuration is an Intel Core i5-7300HQ CPU, 8 GB RAM, and an Ubuntu 5.4.0-6ubuntu1 system. The convolutional neural network is implemented with the TensorFlow framework. The CICIDS2017 dataset [9] is chosen as the detection evaluation dataset. The experimental topology consists of six switches, one deep detection server, and one controller. The topology is shown in Fig. 4. Hping3 is used to simulate DDoS attacks, while normal traffic is generated by a script. Host1, Host3, and Host5 are chosen as attack hosts.

Fig. 4. Experimental topology.

B. Parameters of the Method

(1) Rate threshold: The threshold value is directly related to precision and recall. Because the deep detection is based on the convolutional neural network, which has the potential to ensure high precision, a threshold slightly higher than the normal packet rate is selected to achieve high recall. In the experiment, normal traffic started at 0 seconds, while attack traffic was added to the network from the 25th second to the 50th second. With a sampling period of 1 second, the PACKET_IN message rate received by the controller is shown in Fig. 5.

Fig. 5. PACKET_IN message rate of the switch.

It can be seen that the PACKET_IN message rate between 0 seconds and the 25th second fluctuated around 50 Packets/s. From 0 seconds to the 10th second, because normal traffic entered the network for the first time, a relatively large number of PACKET_IN messages were triggered to establish flow entries. After the flow entries had been created, the rate dropped to a stable level below 50 Packets/s. After the 25th second, the packet rate increased sharply, and the peak value reached nearly 350 Packets/s, because the attack traffic entered the network and raised the number of packets. After the 50th second, the rate began to decline gradually and returned to a normal level at the 55th second. From the figure, the maximum packet rate generated by normal traffic is about 90

Packets/s. Hence, the threshold is selected as 100 Packets/s, which has the potential to ensure high recall.

(2) Information entropy features: There are several types of DDoS attacks, and their traffic differs in some features. In order to represent the difference between attack traffic and normal traffic effectively, features common to different kinds of DDoS attack traffic are selected. Four typical DDoS attacks, HTTP flooding, UDP flooding, ICMP flooding, and SYN flooding, are used in the experiment. A flow is identified by the five-tuple of TCP/IP (Source IP Address, Destination IP Address, Source Port, Destination Port, and Protocol). Because the lengths of attack packets generated by the same script are similar, Packet Length is also chosen to distinguish the two kinds of traffic. The entropy of this six-tuple is calculated for attack traffic and compared with that of normal traffic. The resulting entropy values are shown in Fig. 6.

Fig. 6. Six-tuple information entropy of attack traffic and normal traffic.

According to Fig. 6, the entropy values of attack traffic and normal traffic are significantly different. Therefore, it is feasible to distinguish them with information entropy. Empirically, the larger the difference between the entropy of attack traffic and that of normal traffic, the more effective the feature is at distinguishing them. On this basis, and as depicted in the figure, Source IP Address, Packet Length, and Protocol are selected.

(3) Information entropy thresholds: The entropy of each of the three traffic features is calculated and compared with its threshold. If a threshold is exceeded, it is supposed that an attack may be happening and that further deep detection is needed. Precision and recall also need to be considered when setting the entropy thresholds. Because the deep detection is expected to keep precision at a high level, the information entropy detection should keep recall as high as possible. Hence, the traffic is judged to be suspicious as soon as any one of the three thresholds has been exceeded. The threshold value is obtained by calculating the sample size $N$, the mean $\mu$, the standard deviation $\sigma$, and the confidence interval of the entropy. A threshold slightly larger (or smaller) than the confidence interval of the normal flow has the potential to ensure higher recall. Equation (1) gives the confidence interval. According to the standard normal distribution table, when the significance level $\alpha$ is 5%, $Z_{1-\alpha/2}$ is 1.96.

\text{Confidence Interval} = \left[ \mu - Z_{1-\frac{\alpha}{2}} \frac{\sigma}{\sqrt{N}},\ \mu + Z_{1-\frac{\alpha}{2}} \frac{\sigma}{\sqrt{N}} \right] \quad (1)

(4) The number of convolutional neural network model layers: The classification accuracy of a neural network is closely related to its number of layers. Generally speaking, the deeper a neural network is, the higher its accuracy can potentially be. Nevertheless, a deeper model causes the convergence time to grow geometrically. Therefore, a relatively simple model with fewer layers should be chosen in order to balance high accuracy against low training time.

The computation of the convolutional neural network is mainly concentrated in the convolutional layers and the fully connected layers. Hence, the number of these layers has a significant influence on the training time and the accuracy of the classification results. The experimental models differ mainly in the number of these two structures. In the experiment, each model is composed of two or three convolutional layers and one, two, or three fully connected layers. The number of pooling layers is two. Six different models are established in this way. The performance of the models is evaluated by accuracy, precision, recall, F1-score, and training time.

The performance comparison of the six models is shown in Table I. 2C1F indicates that the model contains two convolutional layers and one fully connected layer. According to Table I, most evaluation metrics increase as the number of convolutional layers and fully connected layers increases. The accuracy of 2C2F is nearly 0.2 percentage points higher than that of 2C1F, while the training time increases slightly, by 20.85 seconds. The accuracy of 2C3F and 3C1F is lower than that of 2C2F, while their training time is higher. The accuracy of 3C2F is 0.07 percentage points higher than that of 2C2F, but the training time increases by 52.77 seconds. The same holds for 3C3F. According to the comparison, the 2C2F model has the potential to maintain a relatively low training time while achieving high accuracy. Therefore, this model is selected for further evaluation.

TABLE I. ACCURACY METRICS OF DIFFERENT LAYERS

Model   Accuracy (%)   Precision (%)   Recall (%)   F1-score (%)   Training time (s)
3C3F    99.06          99.05           99.08        99.06          157.36
3C2F    99.05          99.04           99.07        99.05          125.58
3C1F    98.95          98.96           98.94        98.95          83.53
2C3F    98.95          99.00           98.90        98.95          121.43
2C2F    98.98          98.99           98.96        98.97          72.81
2C1F    98.79          98.69           98.90        98.80          51.96
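The entropy check of Section III-A and the confidence-interval threshold of Equation (1) can be illustrated with a minimal Python sketch. It is not the authors' controller code. It assumes Shannon entropy with base-2 logarithms and a population standard deviation, neither of which is specified in the paper. The function names and the `margin` parameter are hypothetical; `margin` is one possible reading of a threshold "slightly larger (or smaller)" than the confidence interval.

```python
import math
from collections import Counter
from statistics import mean, pstdev

Z_95 = 1.96  # Z_{1-alpha/2} for alpha = 5%, as stated in the text


def shannon_entropy(values):
    """Shannon entropy (base 2, an assumption) of the observed values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def confidence_interval(samples, z=Z_95):
    """Equation (1): [mu - z*sigma/sqrt(N), mu + z*sigma/sqrt(N)]."""
    n = len(samples)
    mu = mean(samples)
    sigma = pstdev(samples)  # population standard deviation (assumption)
    half_width = z * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width


def is_suspicious(window, baselines, margin=0.05):
    """Return True if any feature's entropy leaves its widened interval.

    window:    dict mapping a feature name to the values seen on one port
               during one sampling window
    baselines: dict mapping a feature name to the (low, high) confidence
               interval measured on normal traffic
    margin:    hypothetical widening of the interval; 0 uses it unchanged
    """
    for feature, (low, high) in baselines.items():
        h = shannon_entropy(window[feature])
        if h < low - margin or h > high + margin:
            return True  # one exceeded threshold is enough, as in item (3)
    return False
```

In use, `confidence_interval` would be applied to entropy values recorded during normal operation for each selected feature (Source IP Address, Packet Length, Protocol), and `is_suspicious` would be called on each port of a switch whose PACKET_IN rate exceeds the rate threshold. Flagged traffic would then be passed to the deep detection stage.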

C. Analysis of Experimental Results

Based on the experimental results of the parameter section, the 2C2F model is selected for performance evaluation and compared with three typical machine learning methods. The evaluation metrics are accuracy, training time, Receiver Operating Characteristic (ROC) curves, and the area under the ROC curve (AUC). The horizontal coordinate of the ROC curve is the false positive rate (FPR), and the vertical coordinate is the true positive rate (TPR). The steeper the ROC curve, the better the performance of the model. Consequently, the larger the AUC is, the better the classifier is. Hence, the AUC has the potential to represent the performance of the model.

It can be seen from Table II that the accuracy of CNN is 4.25 to 8.20 percentage points higher than that of the SVM, DNN, and decision tree algorithms, with similar gains in precision, recall, and F1-score. The training time of CNN is higher than that of the other three algorithms. Since the detection model is generally trained offline and not updated frequently, a slightly higher training time is acceptable in exchange for high accuracy. The ROC curve of CNN is steeper than that of the other three machine learning algorithms. The AUC of CNN is 0.949, which is 0.081 higher than that of the second-best model. This indicates that the convolutional neural network model has an advantage in traffic detection.

TABLE II. ACCURACY COMPARISON OF DIFFERENT MODELS

Model           Accuracy (%)   Precision (%)   Recall (%)   F1-score (%)   Training time (s)
CNN             98.98          98.99           98.96        98.97          72.81
DNN             94.73          95.76           93.58        94.66          63.23
SVM             92.14          92.37           91.85        92.11          36.75
Decision Tree   90.78          91.31           91.94        91.62          21.54

Fig. 7. ROC curve comparison of different models.

To sum up, compared with the other three typical machine learning methods, the convolutional neural network achieves higher accuracy, precision, recall, and F1-score. At the same time, its training time is slightly higher than that of the others but remains in an acceptable range. This demonstrates that the method has the potential to distinguish DDoS traffic from normal traffic effectively.

V. CONCLUSION

In this paper, a DDoS attack detection method based on information entropy and deep learning in SDN has been proposed. The method uses two-level detection to identify the attack. The initial stage, based on entropy, is executed by the controller to detect the switch through which the suspicious traffic entered the network. Fine-grained, packet-based deep detection then distinguishes DDoS attack traffic using a convolutional neural network. The controller issues a flow table to implement the defense strategy and intercept the attack traffic. Evaluations were performed by comparing the method with a deep neural network, a support vector machine, and a decision tree. The proposed method performs well in terms of accuracy, precision, recall, and F1-score, with acceptable training time.

ACKNOWLEDGMENT

This work was supported by National Key R&D Program of China under Grant 2018YFB1800305. (Corresponding author: Ying Liu)

REFERENCES

[1] S. Scott-Hayward, S. Natarajan, S. Sezer, "A survey of security in software defined networks", IEEE Commun. Surv. Tutor., vol. 18, no. 1, pp. 623-654, 2016.
[2] N. Dayal, S. Srivastava, "Analyzing behavior of DDoS attacks to identify DDoS detection features in SDN", 9th International Conference on Communication Systems and Networks (COMSNETS-2017), pp. 274-281, 2017.
[3] M. H. H. Khairi, S. H. S. Ariffin, N. M. A. Latiff, A. S. Abdullah, M. K. Hassan, "A review of anomaly detection techniques and distributed denial of service (DDoS) on software defined network (SDN)", Eng. Technol. Appl. Sci. Res., vol. 8, no. 2, pp. 2724-2730, 2018.
[4] S. Mousavi, M. St-Hilaire, "Early detection of DDoS attacks against SDN controllers", Proc. Int. Conf. Comput. Netw. Commun. (ICNC), pp. 77-81, Feb. 2015.
[5] Q. Zuo, M. Chen, X. Wang, B. Liu, "Online traffic anomaly detection method for SDN", J. Xidian Univ., vol. 42, no. 1, pp. 155-160, 2015.
[6] J. Ye, X. Cheng, J. Zhu, L. Feng, L. Song, "A DDoS Attack Detection Method Based on SVM in Software Defined Network", Security and Communication Networks, vol. 2018, pp. 8, 2018.
[7] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, M. Ghogho, "Deep learning approach for network intrusion detection in software defined networking", Proc. Int. Conf. Wireless Netw. Mobile Commun. (WINCOM), pp. 258-263, Oct. 2016.
[8] X. Yuan, C. Li, X. Li, "DeepDefense: Identifying DDoS Attack via Deep Learning", 2017 IEEE International Conference on Smart Computing (SMARTCOMP), pp. 1-8, 2017.
[9] I. Sharafaldin, A. Habibi Lashkari, A. A. Ghorbani, "Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization", Proc. 4th Int. Conf. Inf. Syst. Secur. Priv., pp. 108-116, 2018.
