
2019 International Conference on Communications, Information System and Computer Engineering (CISCE)

Network Intrusion Detection Based on Deep Learning


Wang Peng, Xiangwei Kong, Guojin Peng, Xiaoya Li, Zhongjie Wang
Testing Technology Institute
China Flight Test Establishment
Xi'an, China

Abstract—With the continuous development of computer network technology, security problems in networks are emerging one after another and are becoming increasingly difficult to ignore. For network administrators, preventing malicious hackers from intruding, so that network systems and computers operate safely and normally, is an urgent task. This paper proposes a network intrusion detection method based on deep learning. The method uses a deep confidence neural network (deep belief network, DBN) to extract features from network monitoring data and uses a BP neural network as the top-level classifier to classify intrusion types. The method was validated on the KDD CUP'99 dataset from the Lincoln Laboratory of the Massachusetts Institute of Technology. The results show that the proposed method achieves significantly higher accuracy than traditional machine learning methods.

Keywords-network intrusion, deep learning, deep confidence, BP neural network

I. INTRODUCTION

With the continuous development of computer network technology and the rapid growth in the number of Internet users, security problems in networks have emerged in an endless stream and have become more and more important [1]. Cyber security threats such as spam, viruses, and spyware can lead to widespread identity theft and property fraud, causing serious economic losses to consumers and businesses [2]. In today's society, computer networks play an extremely important role in many key infrastructure areas, such as government and military organizations, as well as enterprises [3]. Therefore, it is an urgent task for network administrators to study how to prevent malicious hackers from intruding and to keep network systems and computers in a safe and normal state. Protecting such network systems requires not only fixing known vulnerabilities [4] but also deploying firewalls and Intrusion Detection Systems (IDS). An IDS can review audit data as it is generated. This allows attacks and attempted attacks to be discovered while they are in progress, enabling real-time response and real-time protection of network systems [5].

Although current intrusion detection systems offer many advantages for network protection, most traditional systems rely on rule bases and traditional machine learning [6], which struggle with today's complex network environment and constantly evolving attack methods. Their algorithms are computationally complex and cannot adapt to the new network environment [7]. Many problems have arisen: low detection efficiency, poor adaptability, and high false negative and false positive rates. In recent years, theoretical and practical results in deep learning have emerged in an endless stream. Deep learning has achieved remarkable results in speech recognition and image classification and is well suited to processing large-scale data [8].

To address the shortcomings of traditional intrusion detection methods, namely slow detection speed and weak adaptive ability, this paper proposes a hybrid intrusion detection method based on a deep confidence neural network [9]. The method effectively identifies and classifies massive, high-dimensional, nonlinear intrusion data and improves the accuracy of IDS classification.

II. INTRUSION DETECTION TECHNOLOGY

A. Basic Concepts

Intrusion detection monitors network traffic and computer system activity and analyzes the data to determine whether there are malicious attacks by hackers or behavior that damages computers and network resources. An intrusion detection system (IDS) is a combination of hardware and software that implements intrusion detection [10]. The deployment diagram of an intrusion detection system is shown in Fig. 1. The IDS is the second layer of protection; the first layer is the firewall. The firewall mainly targets external attacks and offers no resistance to internal attacks. It is generally based on policies that use fixed rules to block known attacks, and it is inflexible. An IDS can detect internal and external attacks in real time and respond promptly [11].

Figure 1. Deployment diagram of the intrusion detection system

B. Intrusion Detection Data Source

For an IDS, choosing the right input data is the first problem to be solved. The intrusion detection technology used by an IDS is

978-1-7281-3681-3/19/$31.00 ©2019 IEEE


DOI 10.1109/CISCE.2019.00102
different for different types of data sources [12]. Today, four types of data sources are commonly used and analyzed:

(1) Operating system audit records. Audit records generated by the audit subsystem inside the operating system were the first data source adopted by IDS.

(2) Operating system logs. These are log files generated by the operating system's logging mechanism and form a data source closely tied to the host.

(3) Application logs. These record user-level activity information and are chosen as an IDS input data source because they are easy to analyze and process.

(4) Network data. Network-based data sources consist mainly of two parts: network data packets and network connection service records.

III. DEEP CONFIDENCE NEURAL NETWORK

A. Back Propagation Algorithm

Figure 2. Back propagation algorithm

The structure of the network model is shown in Fig. 2. In addition to the input layer and the output layer, the network generally contains at least one hidden layer. Neurons within the same layer are not connected and do not influence one another, while neurons in adjacent layers are connected by network weights [13]. In the forward propagation of the signal, each neuron in the network applies a nonlinear sigmoid activation function:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{1}$$

In the backpropagation of the error, the weights and thresholds of each neuron in the network are repeatedly corrected by the gradient descent algorithm to minimize the error function [14]. Suppose there is a data set {(x(1),y(1)), (x(2),y(2)),…, (x(m),y(m))} of m training samples, and the maximum number of network layers is set to k. For a single sample (x, y), its cost function is defined as:

(2)

Then for all sample data, the overall cost function is:

(3)

In the above formula, the first term is the mean square error, and the second term regularizes the weights to avoid over-fitting. The gradient descent rule used to update the network model parameters in the BP algorithm is:

(4)

B. Restricted Boltzmann Machine

In 1986, Hinton and other researchers proposed the Restricted Boltzmann Machine (RBM), a new type of stochastic neural network model. Compared with the traditional Boltzmann machine, the network structure of an RBM is a bipartite graph. There are no connections within the visible layer and no connections within the hidden layer. Only visible layer units and hidden layer units are connected to each other, as shown in Fig. 3.

Figure 3. Structure diagram of RBM network

In the RBM network model of Fig. 3, the number of visible units (v) is n and the number of hidden units (h) is m. Each visible node v is affected only by the m hidden nodes h and has no relationship with the other visible nodes. Similarly, each hidden node is affected only by the n visible nodes. The values of v and h lie in {0, 1}.

The energy function E(v, h) of the RBM is:

$$E(v, h) = -\sum_{i=1}^{n} b_i v_i - \sum_{j=1}^{m} c_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i W_{ij} h_j \tag{5}$$

where v_i is a visible layer unit, h_j is a hidden layer unit, b_i is the bias of v_i, c_j is the bias of h_j, and W_ij is the weight between v_i and h_j.

In the RBM, the joint probability distribution of (v, h) is the Gibbs distribution defined by the energy function above:

$$P(v, h) = \frac{1}{Z} e^{-E(v, h)}, \qquad Z = \sum_{v, h} e^{-E(v, h)} \tag{6}$$
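As an illustration of the RBM energy function and joint distribution described above, the following is a minimal sketch in plain Python. It assumes the standard RBM form E(v, h) = -Σ b_i v_i - Σ c_j h_j - Σ v_i W_ij h_j and P(v, h) = e^{-E(v, h)} / Z. The function names `energy` and `joint_distribution` and the toy parameter values are hypothetical and do not come from the paper. The partition function Z is computed by brute-force enumeration, which is only feasible for very small n and m.

```python
import itertools
import math


def energy(v, h, b, c, W):
    """RBM energy: E(v, h) = -sum_i b_i v_i - sum_j c_j h_j - sum_ij v_i W_ij h_j."""
    visible_term = sum(b_i * v_i for b_i, v_i in zip(b, v))
    hidden_term = sum(c_j * h_j for c_j, h_j in zip(c, h))
    pair_term = sum(v[i] * W[i][j] * h[j]
                    for i in range(len(v)) for j in range(len(h)))
    return -(visible_term + hidden_term + pair_term)


def joint_distribution(b, c, W):
    """Return {(v, h): P(v, h)} by enumerating every binary state (tiny models only)."""
    n, m = len(b), len(c)
    states = [(v, h)
              for v in itertools.product((0, 1), repeat=n)
              for h in itertools.product((0, 1), repeat=m)]
    # Unnormalized Gibbs weights exp(-E) for every joint state.
    weights = {s: math.exp(-energy(s[0], s[1], b, c, W)) for s in states}
    Z = sum(weights.values())  # partition function
    return {s: w / Z for s, w in weights.items()}


# Hypothetical toy parameters: n = 3 visible units, m = 2 hidden units.
b = [0.1, -0.2, 0.0]                          # visible biases b_i
c = [0.3, -0.1]                               # hidden biases c_j
W = [[0.5, -0.4], [0.2, 0.1], [-0.3, 0.6]]    # W[i][j] links v_i and h_j

p = joint_distribution(b, c, W)
```

The exponential cost of enumerating all 2^(n+m) states is what makes sampling-based training necessary for RBMs of realistic size.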

Because there are no connections between hidden layer units, the marginal distribution derived from the joint distribution above is very simple to compute:

(7)

There are two commonly used sampling algorithms for RBM:

(1) Gibbs sampling

Gibbs sampling is a commonly used sampling method based on the Markov chain Monte Carlo strategy. Suppose X is a K-dimensional sample vector whose conditional probability distributions are known but whose joint distribution cannot be solved directly. Gibbs sampling can then be used to approximate the joint probability distribution of X. The procedure is expressed by (8).

(8)

(2) Contrastive Divergence (CD)

Gibbs sampling needs a large number of state transitions to reach the desired distribution. Training an RBM this way is therefore inefficient, especially when the feature dimension of the sample set is high. To address this problem, Professor Hinton proposed a new sampling algorithm, the contrastive divergence algorithm. Experiments show that the CD algorithm needs only a few steps to obtain a good enough approximation, which greatly improves RBM training speed. The fast learning process of the CD algorithm is shown in Fig. 4.

Figure 4. CD-based fast learning algorithm flow chart (steps: start; randomly initialize model parameters; compute H = sigm(c + Wv); compute V' = sigm(b + Wh); update network model parameters; repeat until the maximum number of iterations is reached; end)

IV. EXPERIMENTAL ANALYSIS

A. Intrusion Detection Method Based on Deep Confidence Neural Network

The RBMs in the deep confidence neural network learn from data. An RBM has five parameters: h, v, b, c, W, where b, c, and W are the corresponding biases and weights, which are learned. For a sample x, the RBM is trained using the contrastive divergence algorithm:

1) Assign x to the visible layer v1 and calculate the probability P(h1|v1) that each neuron in the hidden layer is activated;

2) Draw a sample from the calculated probability distribution by Gibbs sampling:

h1 ~ P(h1|v1) (9)

3) Reconstruct the visible layer from h1, that is, infer the visible layer back from the hidden layer, and calculate the probability P(v2|h1) of each neuron in the visible layer;

4) Similarly, draw a sample from the calculated probability distribution by Gibbs sampling:

v2 ~ P(v2|h1) (10)

5) Calculate again from v2 the probability P(h2|v2) that each neuron in the hidden layer is activated, and update the weights:

(11)

After several rounds of training, the hidden layer can not only represent the features of the visible layer more accurately but also reconstruct the visible layer. When the number of hidden layer neurons is smaller than the number of visible layer neurons, a "data compression" effect is produced, similar to an autoencoder.

Here a BP neural network is used as the top-level classifier to classify the intrusion type.

B. Introduction to Experimental Data Sets

The KDD CUP'99 dataset is a collection of network data gathered in 1998 by the Massachusetts Institute of Technology's Lincoln Laboratory in conjunction with the US Department of Defense (DARPA) for an intrusion detection assessment project. In this project, they simulated a US Air Force local area network to study and analyze the various types of attacks faced by different types of users in a real network environment.

C. Experimental Data Preprocessing

Some feature attribute values in the data set used in the experiment are of character type, which the algorithm cannot process directly. Therefore, character-type feature attributes must be converted to numeric values in advance. In addition, the feature attributes are measured on very different scales, which hinders feature learning, and the value ranges of different feature attributes differ widely. This difference is likely to cause "large numbers" to swamp "small numbers". The problem is that

the value of a certain feature is so small that it is overwhelmed, which affects the experimental results. Therefore, to eliminate this effect, we normalize the data set, that is, scale each feature attribute to the same range.

D. Evaluation Indicators

The performance indicators commonly used for intrusion detection systems are accuracy (AC), false alarm rate (FA), and CPU time consumed. In general, given a real intrusion detection data sample, there are four possible outcomes when an IDS predicts its category. The accuracy AC is the ratio of the number of correctly classified samples in the test set to the total number of samples in the test set:

$$AC = \frac{\text{number of correctly classified samples in the test set}}{\text{total number of samples in the test set}} \tag{18}$$

The false alarm rate FA is the ratio of the number of normal samples in the test set that are falsely reported as abnormal to the total number of normal samples:

$$FA = \frac{\text{number of normal samples falsely reported as abnormal}}{\text{total number of normal samples}} \tag{19}$$

E. Analysis of Results

In this experiment, the data sets S1, S2, S3, and S4 selected from KDD CUP'99-10% were used as the original data sets. Features were extracted with the three feature representation methods (PCA, gain ratio, and DBN), and the resulting feature data were then classified with an SVM. The DBN in this experiment uses a 5-layer RBM network structure, specifically 122-110-90-60-30-10, with 100 iterations of RBM pre-training and 200 iterations of BP weight fine-tuning.

TABLE I. COMPARISON OF INTRUSION DETECTION ACCURACY OF DIFFERENT FEATURE LEARNING METHODS

Data    PCA       Gain ratio    DBN
S1      82.64%    82.14%        91.68%
S2      83.15%    83.06%        93.87%
S3      83.49%    82.65%        94.94%
S4      83.96%    82.54%        95.45%

Figure 5. Comparison of detection accuracy of different feature learning methods

The comparison results in Table I and Fig. 5 show that, when trained for intrusion detection on the four data sets, the DBN-based feature learning method outperforms the traditional feature learning methods by a large margin. For the larger data set S4, the DBN-based feature learning method is 11.58% higher than the PCA method and 12.91% higher than the gain ratio method. Therefore, the DBN-based feature learning algorithm is more suitable for feature learning tasks in high-dimensional space.

V. CONCLUSION

In this paper, a network intrusion detection method based on a deep confidence neural network is proposed. The influence of the DBN model parameters on intrusion detection performance is analyzed experimentally. The deep feature learning model based on the deep confidence neural network is compared experimentally with other common traditional feature learning methods. According to the experimental results, the detection rate of this method is significantly improved compared with traditional machine learning methods.

REFERENCES
[1] Javaid A Y, Niyaz Q, Sun W, et al. A Deep Learning Approach for Network Intrusion Detection System[C]// 9th EAI International Conference on Bio-inspired Information and Communications Technologies. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2015.
[2] Tang T A, Mhamdi L, Mclernon D, et al. Deep Learning Approach for Network Intrusion Detection in Software Defined Networking[C]// The International Conference on Wireless Networks and Mobile Communications (WINCOM'16). IEEE, 2016.
[3] Alrawashdeh K, Purdy C. Toward an Online Anomaly Intrusion Detection System Based on Deep Learning[C]// IEEE International Conference on Machine Learning & Applications. IEEE, 2017.
[4] Roy S S, Mallik A, Gulati R, et al. A Deep Learning Based Artificial Neural Network Approach for Intrusion Detection[C]// International Conference on Mathematics and Computing. Springer, Singapore, 2017.
[5] Qu F, Zhang J, Shao Z, et al. An Intrusion Detection Model Based on Deep Belief Network[C]// Proceedings of the 2017 VI International Conference on Network, Communication and Computing. ACM, 2017.
[6] Hong C, Guangxue W, Zhenjiu X, et al. Intrusion detection method of deep belief network model based on optimization of data processing[J]. Journal of Computer Applications, 2017.
[7] Tan Q, Huang W, Li Q. An intrusion detection method based on DBN in ad hoc networks[C]// International Conference on Wireless Communication & Sensor Network. 2016.
[8] Benmessahel I, Xie K, Chellal M. New Improved Training for Deep Neural Networks Based on Intrusion Detection System[J]. IOP Conference Series Materials Science and Engineering, 2018, 435.
[9] Papamartzivanos D, Marmol F G, Kambourakis G. Introducing Deep Learning Self-Adaptive Misuse Network Intrusion Detection Systems[J]. IEEE Access, 2019, PP(99):1-1.
[10] Shone N, Ngoc T N, Phai V D, et al. A Deep Learning Approach to Network Intrusion Detection[J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2018, 2(1):41-50.
[11] Shone N, Ngoc T N, Phai V D, et al. A Deep Learning Approach to Network Intrusion Detection[J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2018, 2(1):41-50.
[12] Chuan-Long Y, Yue-Fei Z, Jin-Long F, et al. A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks[J]. IEEE Access, 2017, PP(99):1-1.

[13] Dong B, Wang X. Comparison deep learning method to traditional methods using for network intrusion detection[C]// IEEE International Conference on Communication Software & Networks. IEEE, 2016.
[14] Songlin K, Le L, Chuchu L, et al. Intrusion detection based on multiple layer extreme learning machine[J]. Journal of Computer Applications, 2015.

