
A Hybrid Network Intrusion Detection System Using Neural Networks and Ensemble Classifier

Meenakshi Sumeet Arya (a,*) and Ryan Collins (b)

(a,b) Department of Computer Science and Engineering, Faculty of Engineering and
Technology, SRM Institute of Science and Technology (Vadapalani Campus),
Chennai, India

raina.arya@rediffmail.com

Dr Meenakshi S Arya is a Professor of Computer Science at SRM Institute of Science and
Technology, Vadapalani Campus, Chennai, and an alumna of the prestigious College of
Engineering, Pune. She completed her Ph.D. in the field of Biometric Watermarking and has
filed a patent for that work. She has diverse academic and professional experience
spanning more than 16 years and has served in many engineering colleges across
the country in various capacities. Her research interests include Data
Science & Analytics and Digital Image Processing. She has published papers in reputed
journals and conferences and has many book chapters to her credit. She is a senior
member of IEEE and a fellow of IETE. She serves on the editorial and review boards of a
multitude of journals related to Computer Science.

With the spread of the internet into people's daily lives and the growing
dependence on e-commerce and e-services, the vulnerability of networks to
attacks has become a major issue. What is needed are robust intrusion
detection systems capable of effectively detecting network attack behaviour,
which is pivotal to network security. Many previously proposed systems built on
single classifiers fail to achieve high scores on performance metrics such as
accuracy, precision and recall, because the nature and behaviour of malware is
unpredictable. In this paper, a hybrid network intrusion detection model based on
a neural network and ensemble classifiers is proposed. The proposed method learns
effective features using optimized neural networks and machine learning
algorithms, and the final predictions are produced by a voting
ensemble classifier. The KDD Cup 99 standard network intrusion detection
dataset is used for evaluation and experimentation. The experimental results show
that the multiclass network intrusion detection model proposed in this
paper improves the performance evaluation metrics and provides stable results.

Keywords: Ensemble Classifiers, Hybrid Method, Neural Networks, Intrusion Detection, Voting Classifiers.

Introduction

As businesses expand their reach, the probability of a security breach of private data
available online has also increased significantly. Organizations rely increasingly on
Network Intrusion Detection Systems (NIDS) to maintain their networks and secure
private assets [1]. These systems are also used for continual monitoring and reporting
of any abnormal activity or behavior. Traditionally, a NIDS operates in either anomaly
detection or misuse detection mode. Misuse detection looks for particular signatures of
known malicious conduct, whereas anomaly-based detection builds a model of normal
network traffic patterns [2] and then detects deviations from it. Anomaly detection
offers the attractive ability to detect novel attacks even before they have been
categorized by security analysts, as well as variations on existing attack methods. In
our proposed system, we aim to classify the anomalies using both supervised and
unsupervised methods.
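
The difference between the two detection modes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not part of the proposed system: the signature strings, the single "bytes per connection" feature, the three-standard-deviation threshold and the function names are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical signature database: payload substrings of known attacks.
KNOWN_SIGNATURES = {"/etc/passwd", "' OR 1=1"}


def misuse_detect(payload: str) -> bool:
    """Misuse detection: flag a payload only if it contains a known signature."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def fit_normal_profile(values):
    """Model normal traffic as the mean and standard deviation of one feature."""
    return mean(values), stdev(values)


def anomaly_detect(value, profile, threshold=3.0):
    """Anomaly detection: flag values far from the learned normal profile."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma


# Toy "bytes per connection" observations of normal traffic (made up).
normal_traffic = [480, 510, 495, 505, 520, 490, 500, 515]
profile = fit_normal_profile(normal_traffic)

print(misuse_detect("GET /../../etc/passwd"))  # matches a known signature
print(anomaly_detect(9000, profile))           # far outside the normal profile
print(anomaly_detect(505, profile))            # within the normal profile
```

The signature check can only catch attacks that are already catalogued, while the profile-based check can flag unseen behaviour at the cost of possible false alarms.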

Creating data for an Intrusion Detection System [3] requires configuring a real working
environment in which all possible attacks can be analyzed, which is not cost efficient.
The data analysis phase comprises data validation, data pre-processing and feature
engineering; it methodically discovers the patterns in the collected data and relates
them to the defined problem. It is a procedure of analyzing, modeling and transforming
the data and deciding how to organize, classify, interrelate, compare and display it.
Data quality concerns mainly the accuracy and reliability of the data collected and
used in an evaluation [4] [16]. Fields such as image processing, web-site analysis,
medical applications and remote sensing have standard and legitimate ground-truth
databases for analysis. Likewise, most network IDS research uses the KDD Cup 99
dataset for the classification analysis of network traffic.
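
As a rough illustration of the validation and pre-processing steps, the sketch below drops incomplete records, integer-encodes a categorical column and rescales a numeric column to [0, 1]. The paper does not state which transformations were applied, so the choice of label encoding and min-max scaling is an assumption, as are the helper names and the toy records. The column names protocol_type and src_bytes are features of KDD Cup 99.

```python
def label_encode(values):
    """Map each distinct category to an integer index (sorted for determinism)."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def min_max_scale(values):
    """Rescale numeric values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy records (protocol_type, src_bytes), made up for illustration.
raw = [("tcp", 200), ("udp", 100), ("icmp", None), ("icmp", 1000), ("tcp", 550)]

# Validation: drop records with missing fields.
records = [r for r in raw if None not in r]

# Pre-processing: encode the categorical column, scale the numeric one.
protocols, mapping = label_encode([r[0] for r in records])
src_bytes = min_max_scale([r[1] for r in records])

print(mapping)
print(list(zip(protocols, src_bytes)))
```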

This paper proposes an approach that combines a voting ensemble classifier with a
neural network model to give accurate and consistent results for the type of attack that
a network has been subjected to. The proposed model is compared against other
classifiers, namely the random forest, Gaussian naive Bayes (GaussianNB), K-nearest
neighbours (KNeighbors) and decision tree classifiers, alongside the neural network
model, to show that the voting ensemble classifier in conjunction with neural networks
produces the best results.

The paper is organised as follows. Section 2 reviews the work done by other
researchers in this field. Section 3


Related work

Proposed Methodology

Fig 1: Flowchart Depicting the Proposed Methodology

The network intrusion detection system is a hybrid system because it uses an ensemble
classifier that combines ML algorithms with a neural network model, which is wrapped as
a classifier. We use the whole of the 10% subset of the KDD Cup 99 dataset for accurate
results. Traditional ML algorithms work better on organized, uniform data, whereas the
neural network model performs better as more data is added. Consequently, the results
should continue to improve if more data is added to the dataset.

The dataset is divided into two parts, training and testing, and a dummy variable (Y)
is created from the outcome column for the neural network model. The neural network
model is then built and converted into a classifier so that it can be passed to the
ensemble classifier together with the other ML classifiers.
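
The split and the dummy-variable step can be sketched as follows. This is a minimal standard-library illustration and not the authors' code: the helper names, the 25% test fraction, the seed and the toy labels are assumptions, and a real pipeline would more likely rely on library utilities for both steps.

```python
import random


def split_train_test(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle indices reproducibly, then split rows and labels into two parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(rows) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [rows[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


def one_hot(labels):
    """Build dummy variables: one 0/1 column per class, in sorted class order."""
    classes = sorted(set(labels))
    return [[1 if y == c else 0 for c in classes] for y in labels], classes


# Toy outcome column and feature rows (made up for illustration).
labels = ["normal", "dos", "probe", "normal", "dos", "normal", "r2l", "normal"]
rows = [[i, i * 2] for i in range(len(labels))]

Y, classes = one_hot(labels)  # dummy variable Y for the neural network
X_train, X_test, y_train, y_test = split_train_test(rows, labels)

print(classes)
print(Y[0])
print(len(X_train), len(X_test))
```

Encoding the outcome column before splitting keeps the dummy columns in the same order for the training and testing parts.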

The ML classifiers and the Keras classifier (the neural network model wrapped as a
classifier) are passed to the voting classifier as estimators. The voting classifier is
then trained and tested.
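
The paper does not state whether hard or soft voting is used, so the sketch below shows both combination rules in plain Python rather than a specific library call. Hard voting takes the majority label across estimators; soft voting averages the predicted class probabilities and picks the largest. The estimator outputs are hypothetical values for a single connection record, and the function names are assumptions.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority vote over one predicted label per estimator."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities, classes):
    """Average per-class probabilities across estimators and pick the argmax."""
    n = len(probabilities)
    avg = [sum(p[k] for p in probabilities) / n for k in range(len(classes))]
    return classes[avg.index(max(avg))], avg


classes = ["normal", "dos", "probe"]

# Hypothetical outputs of three estimators for one connection record.
predicted_labels = ["dos", "normal", "dos"]
predicted_probas = [
    [0.30, 0.60, 0.10],  # e.g. random forest
    [0.55, 0.35, 0.10],  # e.g. Gaussian naive Bayes
    [0.20, 0.70, 0.10],  # e.g. neural network classifier
]

print(hard_vote(predicted_labels))
print(soft_vote(predicted_probas, classes))
```

Soft voting requires every estimator to output class probabilities, which is one reason for wrapping the neural network model as a classifier with a probability output before adding it to the ensemble.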

In general, this method gives the best and most consistent predictions, but it is not
cost efficient.

Results:
Discussion of Results:
The results show that the voting ensemble classifier achieves an accuracy of 99.96%. The
random forest classifier also achieves high accuracy; however, when more data is added
in the future, the accuracy and consistency of the voting ensemble classifier are
expected to be better because of the neural network model (Keras classifier). The same
cannot be said of the random forest classifier or any of the other ML classifiers that
have been used.
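
For reference, the metrics named in this paper (accuracy, precision and recall) can be computed for a multiclass problem as in the sketch below, which treats one class at a time as the positive class. The labels are toy values made up for illustration and are not the experimental results reported above; the function names are assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy ground truth and predictions (not the paper's data).
y_true = ["normal", "dos", "dos", "probe", "normal", "dos"]
y_pred = ["normal", "dos", "normal", "probe", "normal", "dos"]

print(accuracy(y_true, y_pred))
for cls in sorted(set(y_true)):
    print(cls, precision_recall(y_true, y_pred, cls))
```

Reporting precision and recall per class matters for intrusion data because a rare attack class can be missed entirely while overall accuracy stays high.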

Conclusion:
The voting ensemble classifier produces more consistent and accurate results than the
other ML algorithms and the Keras classifier on diverse datasets.

[1] Z. Abedjan, X. Chu, D. Deng, R. C. Fernandez, I. F. Ilyas, M. Ouzzani, P. Papotti, M. Stonebraker,
and N. Tang, “Detecting data errors: Where are we and what needs to be done?” PVLDB, vol. 9, no. 12,
pp. 993–1004, 2016.
[2] F. Chiang and R. J. Miller, “Discovering data quality rules,” PVLDB, vol. 1, no. 1, pp. 1166–1177,
2008.
[3] M. Tiwari, R. Kumar, A. Bharti, and J. Kishan, “Intrusion detection system,” International Journal
of Technical Research and Applications, e-ISSN: 2320-8163, vol. 5, pp. 38–44, 2017.
[4] G. Webb and J. Vreeken, “Efficient discovery of the most interesting associations,” ACM TKDD, vol.
8, no. 3, pp. 1–31, 2014.
[16] P. C. Arocena, B. Glavic, G. Mecca, R. J. Miller, P. Papotti, and D. Santoro, “Messing up with BART:
error generation for evaluating data-cleaning algorithms,” PVLDB, vol. 9, no. 2, pp. 36–47, 2015.
