
2019 the 7th International Conference on Smart Energy Grid Engineering

Modeling Network Intrusion Detection System Using Feed-Forward Neural Network Using UNSW-NB15 Dataset

Liu Zhiqiang Ghulam Mohi-ud-din


School of Software and Micro Electronics School of Software and Micro Electronics
Northwestern Polytechnic University Northwestern Polytechnic University
Xi’an, China Xi’an, China
e-mail: gmohiudin@hotmail.com e-mail: 3474195255@qq.com

Li Bing Luo Jianchao


School of Software and Micro Electronics School of Software and Micro Electronics
Northwestern Polytechnic University Northwestern Polytechnic University
Xi’an, China Xi’an, China

Zhu Ye Lin Zhijun


School of Software and Micro Electronics School of Software and Micro Electronics
Northwestern Polytechnic University Northwestern Polytechnic University
Xi’an, China Xi’an, China

Abstract—Ordinary machine learning algorithms are not very efficient at solving the network intrusion classification problem because of the huge amount of data involved. Deep Learning has proven to be more effective in this scenario, since it can effectively classify data with high dimensionality and complex features. In this paper, a deep learning IDS is proposed using the state-of-the-art UNSW-NB15 dataset. Experiments conducted to select the optimal activation function and features, followed by testing on unseen data, demonstrate high accuracy and a lower false alarm rate. The evaluation results show that the proposed classifier outperforms other machine learning models, thus opening new dimensions in research on Network Intrusion Detection.

Keywords-intrusion detection; deep learning; feed forward network; UNSW-NB15

978-1-7281-2440-7/19/$31.00 ©2019 IEEE 299

Authorized licensed use limited to: Carleton University. Downloaded on November 17,2020 at 18:08:16 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

The development of the Internet in recent times has not only changed the way of learning and development but has at the same time exposed networks and systems to even more advanced security threats. Cybersecurity is a set of processes and technologies created to safeguard computers, networks, data and programs from unauthorized access, attacks, modification, and destruction [1]. A substantial research milestone in the information security area is the Intrusion Detection System (IDS). It can quickly identify an intrusion, whether the intrusion is still ongoing or has just occurred. One of the key difficulties in cybersecurity is the provision of an effective and robust IDS. How to identify the countless network attacks, especially previously unseen attack types, is a crucial issue that needs to be resolved urgently.

In analyzing network behavior, anomaly-based methods study normal system behavior and network traffic. These methods identify an anomaly or attack whenever the system or network behavior deviates from its standard or general behavior. Anomaly-based methods are most often used because of their ability to adapt to zero-day or new attacks. Another advantage of anomaly-based methods is that the profile of normal behavior of a system or network is different for each application and protocol, which makes it hard for an attacker to break into the system. Furthermore, the information on which an attack alert is triggered usually identifies misuse. The main downside of an anomaly-based technique is the higher rate of false identification of an attack, known as a false alert. A false alert occurs when normal traffic is identified as an attack. This happens because activity that has not been seen before is categorized as an attack or anomaly.

Machine learning methodologies are popular for identifying different kinds of attacks, and a machine learning strategy can assist the system administrator in taking the corresponding measures to stop intrusions. Nevertheless, because of feature engineering and shallow learning, traditional machine learning cannot handle the anomaly or attack problem that occurs in a real network environment. Introducing neural networks into machine learning led to the new area known as Deep Learning. The inspiration for deep learning came from the structure of the human brain, which is based on neurons and neural networks; these neurons and networks are used for analytical learning. Deep learning mimics the mechanism by which the human mind interprets information such as pictures, sounds, and texts [2]. Deep learning is an area of Machine Learning that uses neurons as the mathematical elements [3] for learning the task. Neural networks have been around for several decades [4] and have been both gaining and losing the favor of research groups.

The remainder of the paper is organized as follows. In Section II, we survey related research in the field of intrusion detection, especially how deep learning methodologies advance intrusion detection. Section III introduces commonly used datasets for intrusion detection. In Section IV, we give an overview of our proposed approach to an Intrusion Detection System: pre-processing of the UNSW-NB15 dataset, the feed-forward Deep Learning model, and the experimental methodology. In Section V, the performance of our methodology is assessed based on experimental results and compared with related methodologies for Intrusion Detection Systems. Our conclusions and suggestions for future research are presented in Section VI.

II. RELEVANT WORK

In the continually expanding world of networks, network and information security is becoming more and more dependent on IDS. To improve the anomaly detection mechanism and increase its accuracy and precision, supervised, semi-supervised and unsupervised machine learning approaches are in use.

Laskov et al. [5] provided a comparative assessment of supervised and unsupervised learning approaches with respect to their detection accuracy and their ability to identify unknown attacks. Martinez-Balleste and Solanas [6] offered clustering algorithms for anomaly detection. An extensive collection of anomaly-based IDS was presented by Bhattacharyya and Kalita [7], and Tavallaee et al. compared the effectiveness of distinct classification algorithms such as Naive Bayes, Support Vector Machines, and Decision Trees on the NSL-KDD dataset.

An HG-GA based anomaly detection mechanism was proposed by Raman et al. [8]. They used the Hypergraph Genetic Algorithm for feature selection and for setting the parameters of a Support Vector Machine. They reported that their technique outperformed existing methods, with a 97.14% detection rate on the NSL-KDD dataset, which was used for validation and experimentation of the IDS.

Another hybrid approach was used by Teng et al. [9], combining Decision Trees and a Support Vector Machine. They tested their model on the KDD CUP99 dataset and obtained an accuracy of 89.02%. However, because of the presence of redundant records in the KDD CUP99 dataset, the classifier is often biased. Additionally, SVM is not considered ideal for handling the colossal network traffic data from today's fast-growing networks because of its poor performance and substantial computation cost.

A Deep Learning method was proposed by Potluri and Diedrich [10]. They used a Deep Neural Network with three hidden layers, a soft-max layer and two auto-encoders (AE), with 41 features. However, they obtained mixed results; the results were better for a small number of classes than for the others. The authors attributed this to insufficient data for some classes.

In [11], an adaptive IDS for the industrial Internet of Things that uses the One-Class Support Vector Machine (OCSVM) algorithm was proposed. The use of OCSVM allows the model to classify unknown anomalies. To accommodate changes in the network architecture, the proposed system uses Spearman's rank correlation coefficient to match unknown traffic with known traffic. Six datasets with different configurations were used to evaluate the proposed system. The datasets were created using the Hybrid Environment for Design and Validation testbed.

A cascade of boosting-based artificial neural network multiclass classifiers for intrusion detection was proposed in [12]. The performance of the proposed method was evaluated using two datasets (i.e., KDD-CUP99 and UNSW-NB15). The results showed that the proposed method performed better on the KDD-CUP99 dataset when approaching the problem as both a binomial and a multiclass problem.

A multi-layer perceptron feed-forward artificial neural network with a single hidden layer was proposed in [14] to detect DoS attacks. The model was evaluated on the NSL-KDD [13] dataset and the UNSW-NB15 dataset. The authors also experimented with the number of hidden layers and reported their effect on the model accuracy, loss, and training and testing time.

III. DATASET

For network security analysis, data is considered the most crucial component. To conduct better research, the selection of suitable data and its practical use are key to success. The performance of Machine Learning and Deep Learning models is also affected by the size of the data.

A. DARPA Intrusion Detection Data Sets

MIT Lincoln Laboratory collected and published the DARPA IDS [14] datasets for network IDS. This was the first complete and thorough dataset for network intrusion detection, since it contained an extensive amount of traffic as well as attack data.

B. KDD CUP99 Dataset

A widely used dataset for Network Intrusion Detection is KDD Cup99 [15]. The 'KDDCUP99 Data' datasets were issued for the Classifier-Learning Competition. These datasets are based on the DARPA datasets. The full dataset contains approximately 4,900,000 records. A connection is defined as a sequence of TCP packets from a source IP to a destination IP during a specific time under some well-established protocol. Each connection is labeled as
an anomaly or normal network traffic. Attack categories are divided into Denial of Service Attack, Remote to Local Attack, User to Root Attack or Probing Attack; normal network traffic is categorized as Normal. The number of features for each record is 41, of which 34 are continuous properties and 7 are symbolic properties.

C. NSL-KDD Dataset

Analysis shows that the KDD Cup99 dataset has statistical issues that lead to poor approximation and estimation. Those issues were addressed in the NSL-KDD [16] dataset. Several files are available to download for further studies and research. Table 3 shows the details of each file. The NSL-KDD dataset addresses some of the shortcomings of the KDD Cup99 dataset.

D. ADFA Dataset

The Australian Defence Force Academy, also known as ADFA [17], issued a dataset for host-level intrusion detection. Because of its importance, this dataset is widely used in NIDS products. The dataset covers two widely used operating system platforms, Windows and Linux. The Windows part is named ADFA-WD, and the Linux part is named ADFA-LD. The dataset contains system calls, and the attack types are labeled on that basis. ADFA-LD records the invocation of system calls over a period of time. The Linux kernel provides the standard interface, and it is solely the job of the kernel to handle user-space requests. Interfaces for system resources, reading, writing, etc. act as a connection between user space and kernel space.

E. UNSW-NB15

Researchers from ACCS created UNSW-NB15 [18], a dataset publicly available since 2015, for modern Network Intrusion Detection Systems. The 2.5 million records of the dataset are distributed in CSV format and comprise 49 features, including both flow-based and packet-based features. Those features are further divided into four categories: content, basic, flow, and time. Initially, the dataset is labeled with two distinct traffic labels (attack and normal). The attacks are then further divided into nine classes based on their attack type, as mentioned in the table.

In contrast to other publicly available benchmark datasets such as KDD-Cup99 and NSL-KDD, which have been used by researchers to evaluate Network Intrusion Detection Systems, UNSW-NB15 is a fairly new dataset containing modern network traffic, both normal and abnormal, including newer low-footprint attack types, which makes it a more suitable choice for testing the proposed classifier. The above-mentioned datasets not only include redundant attack records and an unbalanced amount of traffic, but are also very old; thus they are not up to the challenges and threats of modern networks.

IV. PROPOSED METHODOLOGY

A. Dataset Pre-Processing

Instead of using the prepared training/testing datasets as in [6], we used the full dataset to train the Deep Learning classifier for intrusion detection. The complete dataset is available in four separate CSV files. Our observations from exploring the dataset further are as follows:
1) The features of the prepared dataset differ from those of the full dataset.
2) A number of records are redundant. Some records also have a local IP address.

We propose a binomial classifier for the Network Intrusion Detection System, and we also implemented and evaluated the described model. The proposed Deep Learning model is built using Python; the feed-forward ANN is trained using back-propagation and stochastic gradient descent.

1) Deep Learning Model: Our proposed Deep Learning model contains ten hidden layers with ten neurons each, for a total of 100 hidden neurons. The number of hidden layers corresponds to the number of features, and each set of neurons is equal to the number of features in the dataset. The full dataset with 10-fold cross-validation and ten epochs is used for training. For cross-validation, the stratified fold assignment method is used. The squared sum of the incoming weights per unit is constrained by a maximum value of ten.

B. Methodology

The performance of the Deep Learning model in classifying attacks and normal traffic is evaluated using three different experiments. Each experiment is repeated ten times to obtain reliable results.

The first experiment is carried out to find a suitable activation function. Three different activation functions are applied to the entire dataset using 10-fold cross-validation. The second experiment uses the best activation function from the first experiment and identifies the important features using the Gedeon method, along with their effects on the model error metrics. Finally, the third experiment applies the findings of the first two experiments to unseen data.

V. RESULTS AND EVALUATION

A. Metrics

Accuracy, F1-score, precision, recall and AUC are used to evaluate the model. The performance metrics are calculated as follows, where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives:

• Accuracy (AC): AC indicates the overall percentage of correct predictions (True Positive or True Negative); Equation 1 details how to calculate it.

$$AC = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

• Precision (P): P signifies the percentage of accurate predictions; it is obtained by dividing the total correct
predictions by the total number of true and false positive predictions, as shown in Equation 2.

$$P = \frac{TP}{TP + FP} \tag{2}$$

• Recall (R): R signifies the percentage of attacks that are predicted correctly; it is obtained by dividing the number of correctly predicted attacks by the total number of attacks or intrusions, as shown in Equation 3.

$$R = \frac{TP}{TP + FN} \tag{3}$$

• F-measure (F): F is considered the most crucial statistic of network intrusion detection, combining both precision (P) and recall (R), as shown in Equation 4.

$$F = \frac{2 \cdot P \cdot R}{P + R} \tag{4}$$

B. Evaluation Environment

All experiments were executed on a standalone H2O (v3.10.5.1) cluster running under Windows 10 with 24 GB RAM and a 3.4 GHz quad-core Intel Core i7 processor.

C. Results

The proposed Deep Learning model was evaluated on unseen data; for that purpose, the dataset was divided into training, validation and testing sets in the ratio 60%, 10% and 30%, respectively. 10-fold cross-validation is used during model training. We obtain six different models with the validation set using 5%, 10%, 15%, 20%, 25% and 100% of the features, respectively. Figures 1 and 2 show the best and the worst model, respectively, on the basis of AUC. Figure 1 shows that the proposed Deep Learning model achieves very high accuracy.

Figure 1. Comparison of final model with the maximum of area under curve (AUC).

Figure 2. Comparison of final model with the minimum of area under curve (AUC).

Finally, we compare our model with the state-of-the-art methods available for Network Intrusion Detection. It can be seen that our method achieves higher accuracy and lower FAR than other machine learning techniques, as shown in Table I.

TABLE I. COMPARISON OF PROPOSED METHOD WITH OTHER MACHINE LEARNING ALGORITHMS

Algorithm                   Accuracy (%)   FAR
Logistic Regression         83.15          18.48
Naive Bayes                 81.2           18.3
Artificial Neural Network   81.5           22.1
EM Clustering               78.4           23.7
CANID                       99.36          -
Our Method                  99.5           0.47

VI. CONCLUSION AND FUTURE WORK

In this paper, a Deep Learning model using a feed-forward neural network is introduced for intrusion detection. Experimental results demonstrate that an accuracy of 99% is achieved with a lower FAR. Future work includes the introduction of a Self-Taught Learning system and Ending techniques for dimensionality reduction and a lower FAR.

REFERENCES

[1] S. Aftergood, “Cybersecurity: The cold war online,” Nature, vol. 547, no. 7661, pp. 30–31, Jul. 2017.
[2] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, May 2015.
[3] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[4] M. Minsky and S. Papert, Perceptrons. Cambridge, MA, USA: MIT Press, 1969.
[5] P. Laskov, P. Dessel, C. Schafer, and K. Rieck, “Learning intrusion detection: Supervised or unsupervised,” in Proc. 13th Int. Conf. Image Anal. Process. (ICIAP), Cagliari, Italy, F. Roli and S. Vitulano, Eds. Berlin, Germany: Springer, Sep. 2005, pp. 50–57.
[6] A. Solanas and A. Martinez-Balleste, Advances in Artificial Intelligence for Privacy Protection and Security (Intelligent Information Systems). Hackensack, NJ, USA: World Scientific, 2010. [Online].
[7] D. K. Bhattacharyya and J. K. Kalita, Network Anomaly Detection: A Machine Learning Perspective. Boca Raton, FL, USA: CRC Press, 2013.
[8] M. R. G. Raman, N. Somu, K. Kirthivasan, R. Liscano, and V. S. S. Sriram, “An efficient intrusion detection system based on hyper-graph Genetic algorithm for parameter optimization and feature selection in support vector machine,” Knowl.-Based Syst., vol. 134, pp. 1–12, Oct. 2017.
[9] S. Teng, N. Wu, H. Zhu, L. Teng, and W. Zhang, “SVM-DT-based adaptive and collaborative intrusion detection,” IEEE/CAA J. Automatica Sinica, vol. 5, no. 1, pp. 108–118, Jan. 2018.
[10] S. Potluri and C. Diedrich, “Accelerated deep neural networks for enhanced intrusion detection system,” in Proc. IEEE 21st Int. Conf. Emerg. Technol. Factory Autom., Berlin, Germany, Sep. 2016, pp. 1–8.
[11] B. Stewart, L. Rosa, L. A. Maglaras, T. J. Cruz, M. A. Ferrag, P. Simoes, and H. Janicke, “A novel intrusion detection mechanism for SCADA systems which automatically adapts to network topology changes,” vol. 4, no. 10. [Online]. Available: http://eudl.eu/doi/10.4108/eai.1-2-2017.152155
[12] M. M. Baig, M. M. Awais, and E.-S. M. El-Alfy, “A multiclass cascade of artificial neural network for network intrusion detection,” vol. 32, no. 4, pp. 2875–2883. [Online]. Available: http://content.iospress.com/articles/journal-of-intelligent-and-fuzzy-systems/ifs169230
[13] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6.
[14] R. P. Lippmann et al., “Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation,” in Proc. DARPA Inf. Survivability Conf. Expo. (DISCEX), vol. 2, 2000, pp. 12–26.
[15] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in Proc. IEEE Int. Conf. Comput. Intell. Secur. Defense Appl., Jul. 2009, pp. 1–6.
[16] S. Revathi and A. Malathi, “A detailed analysis on NSL-KDD dataset using various machine learning techniques for intrusion detection,” in Proc. Int. J. Eng. Res. Technol., 2013, pp. 1848–1853.
[17] M. Xie, J. Hu, X. Yu, and E. Chang, “Evaluating host-based anomaly detection systems: Application of the frequency-based algorithms to ADFA-LD,” in Proc. Int. Conf. Netw. Syst. Secur., 2014, pp. 542–549.
[18] N. Moustafa and J. Slay, “UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set),” in 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6.
[19] N. Moustafa and J. Slay, “The evaluation of network anomaly detection systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set,” vol. 25, no. 1, pp. 18–31. [Online]. Available: http://dx.doi.org/10.1080/19393555.2015.1125974
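
As a supplementary illustration of the metrics in Section V (Equations 1 to 4), the following Python sketch computes accuracy, precision, recall and F-measure from confusion-matrix counts. It is not taken from the paper's H2O implementation: the class and function names are hypothetical. The paper reports FAR in Table I without defining it, so the sketch assumes the common definition FP / (FP + TN). Returning 0.0 when a denominator is empty is likewise a convenience assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary attack/normal confusion matrix (hypothetical helper)."""
    tp: int  # attacks predicted as attacks
    tn: int  # normal traffic predicted as normal
    fp: int  # normal traffic predicted as attacks (false alerts)
    fn: int  # attacks predicted as normal


def _ratio(numerator, denominator):
    # Assumption: report 0.0 instead of raising when the denominator is empty.
    return numerator / denominator if denominator else 0.0


def accuracy(c):
    # Equation 1: (TP + TN) / (TP + TN + FP + FN)
    return _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn)


def precision(c):
    # Equation 2: TP / (TP + FP)
    return _ratio(c.tp, c.tp + c.fp)


def recall(c):
    # Equation 3: TP / (TP + FN)
    return _ratio(c.tp, c.tp + c.fn)


def f_measure(c):
    # Equation 4: 2 * P * R / (P + R)
    p, r = precision(c), recall(c)
    return _ratio(2 * p * r, p + r)


def false_alarm_rate(c):
    # Assumed definition (not stated in the paper): FP / (FP + TN)
    return _ratio(c.fp, c.fp + c.tn)
```

For example, `accuracy(ConfusionCounts(tp=90, tn=95, fp=5, fn=10))` evaluates 185 correct predictions out of 200 records. Keeping the four counts in one immutable object makes it harder to mix up the argument order between the individual metric functions.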

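
Section IV mentions that a stratified fold assignment is used for 10-fold cross-validation. A minimal pure-Python sketch of one way to perform such an assignment is given below. It is not H2O's implementation: the function name, the `seed` parameter and the round-robin dealing strategy are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_fold_assignment(labels, n_folds=10, seed=0):
    """Return one fold index (0 .. n_folds - 1) per record.

    Records of each class are shuffled and then dealt to the folds in
    round-robin order, so every fold receives nearly the same share of
    each class (attack / normal in the binomial setting of the paper).
    """
    rng = random.Random(seed)

    # Group record indices by class label, in order of first appearance.
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    folds = [0] * len(labels)
    for indices in indices_by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled records of this class to the folds like cards.
        for position, index in enumerate(indices):
            folds[index] = position % n_folds
    return folds
```

With 30 attack records and 70 normal records and `n_folds=10`, every fold receives 3 attack and 7 normal records, which preserves the class ratio of the full dataset in each fold.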