
HIGH PERFORMANCE NETWORK

INTRUSION DETECTION ENGINE

SAMPATH C V GOWDA - 20201CCS0074


AFFAN BAIG M - 20201CCS0101
VISHAAL - 20201CCS0109
CHANDAN R - 20201CCS0112
G M HARIKRISHNAN - 20201CCS0095

Mr. PRAVEEN PAWASKAR


Sampath C V Gowda, 20201CCS0074
Affan Baig M, 20201CCS0101
Vishaal V, 20201CCS0109
Chandan R, 20201CCS0112
G M Harikrishnan, 20201CCS0095

Signature(s) of the Students

ABSTRACT

LIST OF TABLES

Sl. No.   Table Name   Table Caption

1         Table 1.1    General summary of attack families in UNSW NB-15 dataset

LIST OF FIGURES

Sl. No.   Figure Name   Caption

1         1             Software modules versus Reusable components
2         2             Activity Diagram
3         3             Sequence Diagram
4         4             RNN Diagram
5         5             Use Case Diagram

TABLE OF CONTENTS

CHAPTER NO.   TITLE

ABSTRACT
ACKNOWLEDGMENT

1.

2.
    2.1 Hybrid Deep Neural Network Intrusion Detection System
    2.2 Optimizing Network Security using LSTM algorithm
    2.3 NIDS Method of Power Monitoring System
    2.4 Research on technical application of AI in NIDS
    2.5 An intrusion detection model combining signature based recognition and two round
        immune based recognition

3.
    3.1 UBP – Signature based IDS
    3.2 IDS using ML
    3.3 NIDS Power Monitoring
    3.4 Research on technical application of AI in NIDS
    3.5 NIDS model combining two round immune based
    3.6 A general AI based NIDS
    3.7 ML Algorithm
    3.8 Deep Learning Algorithm

4.
    4.1 Methodologies
    4.2 Data Collection
    4.3 Data Preprocessing
    4.4 Training Model
    4.5 System Evaluation
    4.6 Points to be kept in mind

5.
    5.1 Purpose of IDS

6.
    6.1 Introduction to design document
    6.2 Data flow diagram
    6.3 Architecture Diagram
    6.4 LSTM Cell
    6.5 Activity Diagram
    6.6 Sequence Diagram
    6.7 RNN Architecture
    6.8 Use Case Diagram

7.
    7.1 Gantt Chart

8. OUTCOME

9.
    9.1 Results
    9.2 Discussions

10.
    10.1 Conclusion

REFERENCES
APPENDIX – A (PSEUDOCODE)
APPENDIX – B (SCREENSHOTS)
APPENDIX – C (ENCLOSURE)

CHAPTER-1

Network intrusion detection and prevention have become critical in the constantly changing field of
cybersecurity. Conventional signature-based detection techniques are no longer sufficient for spotting
complex, ever-changing cyber threats. This has driven the development of more sophisticated methods
for Network Intrusion Detection Systems (NIDS), such as machine learning.

Our study aims to use the UNSW-NB15 dataset to build a network intrusion detection engine
through machine learning techniques. Because the UNSW-NB15 dataset is large, diverse, and covers
a range of network traffic scenarios, it's a perfect choice for training and evaluating intrusion
detection models.

The Intrusion Detection System (IDS) is one of the most important advances in data security because
it can distinguish attack traffic from legitimate activity. This work aims to distinguish between
abnormal and normal traffic behavior, and to separate normal traffic from each of four attack
classes: Denial of Service (DoS), User to Root (U2R), Probe (probing), and Remote to Local (R2L).
In short, the main goal of intrusion detection is to increase the accuracy with which classifiers
recognize intrusive activity.

1.1 Significance of Network Intrusion Detection


Network intrusion detection plays a large part in protecting computer networks from hostile
activity such as unauthorized access, data exfiltration, and denial-of-service attacks. As cyber
threats grow more complex, there is an increasing need for intelligent, adaptive solutions that can
recognize and react to unusual network behavior.
1.1.1 Early Threat Detection
Through the identification and notification of any security breaches to security professionals,
NID systems offer an early warning system. It is essential to detect cyberattacks early in
order to stop them or lessen their effects.
1.1.2 Protecting Sensitive Data
Sensitive data is shielded against unwanted access, alteration, and eavesdropping with the aid
of NID. These systems monitor network traffic in order to identify anomalies and unusual
activity that can point to an attempt to jeopardize data confidentiality and integrity.

1.1.3 Maintaining Trust and Reputation

A successful cyberattack has the potential to undermine consumer confidence and harm an
organization's brand. Organizations can show stakeholders that they are committed to
cybersecurity by putting strong NID measures in place.

1.2 Machine Learning in Intrusion Detection

Machine learning has shown encouraging results in improving the precision and
effectiveness of intrusion detection systems because it can identify patterns and
adapt to new data. By training models on large datasets such as UNSW-NB15, the system
learns to distinguish between malicious and legitimate network traffic and can detect
new threats that conventional rule-based systems would miss.

1.2.1 Anomaly Detection


Because they can recognize deviations from standard patterns of network behavior, machine
learning models are highly effective at detecting anomalies. Unsupervised learning
algorithms, such as autoencoders and clustering, work especially well for identifying unknown
or previously unseen threats.
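
To make the anomaly-detection idea concrete, the sketch below flags records that lie far from the
centroid of benign traffic. It is a deliberately simple stand-in for the clustering and autoencoder
models mentioned above, not the method used in this project. The two-feature records and the
threshold rule (mean plus three standard deviations of the training distances) are assumptions
chosen for illustration.

```python
import math
from statistics import mean, pstdev


def fit_centroid_detector(normal_rows, k=3.0):
    """Learn a centroid and a distance threshold from benign-only records."""
    n_features = len(normal_rows[0])
    centroid = [mean(row[i] for row in normal_rows) for i in range(n_features)]
    dists = [math.dist(row, centroid) for row in normal_rows]
    # Assumed rule: anything farther than mean + k * std is treated as anomalous.
    threshold = mean(dists) + k * pstdev(dists)
    return centroid, threshold


def is_anomaly(row, centroid, threshold):
    """Return True when the record is farther from the centroid than the threshold."""
    return math.dist(row, centroid) > threshold


# Hypothetical scaled features (for example, flow duration and byte count).
normal = [[0.10, 0.20], [0.12, 0.18], [0.09, 0.22], [0.11, 0.21], [0.10, 0.19]]
centroid, threshold = fit_centroid_detector(normal)

print(is_anomaly([0.11, 0.20], centroid, threshold))  # close to benign traffic
print(is_anomaly([0.90, 0.95], centroid, threshold))  # far from benign traffic
```

A detector like this needs no attack labels, which is the property that makes unsupervised
methods attractive for threats that have never been seen before.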

1.2.2 Behavioral Analysis

Models that examine and comprehend the behavior of network traffic over time can be
created thanks to machine learning. These algorithms can differentiate between legitimate
activity and possibly harmful conduct by spotting patterns and trends, which improves
intrusion detection accuracy.

1.2.3 Real-Time Detection


Optimizing ML models for real-time processing allows security events to be identified and
handled quickly. This is essential for stopping or limiting the effects of attacks that unfold
quickly.

1.3 UNSW NB-15 Dataset


UNSW-NB15 is a publicly accessible dataset that replicates several network traffic scenarios,
including both normal activity and various attack types. Its extensive feature set offers a
thorough depiction of network behavior, which makes it a valuable resource for developing and
assessing machine learning models for intrusion detection.

In the context of Network Intrusion Detection Systems (NIDS), datasets like these play a crucial
role. NIDS researchers and practitioners typically select datasets that offer a diverse
representation of network traffic scenarios.

Once chosen, the dataset undergoes preprocessing to remove noise and convert it into a suitable
format. Features relevant to network behavior, such as packet sizes, IP addresses, and protocols,
are then extracted. Machine learning models, including supervised learning algorithms or deep
learning models, are trained on these datasets to distinguish between normal and malicious
network activities.

The models are subsequently evaluated on separate datasets to assess their performance, and
fine-tuning may be performed based on the evaluation results. It's important to refer to the
specific documentation and research associated with the "UNSW NB 15" dataset for detailed
insights into its characteristics and considerations for use in NIDS. Keep in mind that the field of
network security is dynamic, and staying updated with the latest literature is crucial for
incorporating the most recent methodologies and datasets.

CHAPTER-2

2.5.1 Abstract:

The outcomes of the simulation confirm the effectiveness of the model and its ability to improve security
posture in the constantly changing field of cybersecurity.

2.5.2 Advantages:

1. Behavioral Analysis:
Two-round immune-based recognition inherently involves behavioral analysis by considering
the dynamics and interactions between different components of the immune system. This
approach allows for a more nuanced understanding of network behavior, contributing to the
detection of subtle anomalies that may be indicative of attacks.
2. Comprehensive Threat Coverage:
Signature-based recognition excels at identifying known patterns of attacks by comparing
network activities with predefined signatures. Combining it with immune-based recognition
provides a more comprehensive approach, allowing the system to adapt to new and evolving
threats that may not have known signatures.

2.5.3 Disadvantages:
1. Adaptability to Evolving Threats:
Immune-based recognition, inspired by the human immune system, aims to adapt to new
threats over time. However, the two-round immune-based recognition may still struggle to
keep pace with rapidly evolving attack strategies and variations.

CHAPTER-3

3.1 User Behavior Pattern - Signature-Based Intrusion Detection

3.1.1 Research Gap:

Dynamic and Evolving User Behavior:


- Current signature-based intrusion detection systems face challenges in adapting to
dynamic and evolving user behaviors. There is a research gap in developing models that
can effectively capture and respond to changes in user patterns over time, especially in

environments with fluid and frequently modified user behaviors.

3.1.2 Future Research Opportunity:

Machine Learning Augmentation:


- Investigate the integration of machine learning techniques to enhance the adaptability
of signature-based intrusion detection systems. This approach aims to enable systems to
dynamically evolve and update their patterns, effectively capturing and responding to
changes in user behaviors over time.

3.2 Intrusion Detection System Using Machine Learning

3.2.1 Research Gap:

Interpretable Machine Learning Models:


- The research lacks focus on developing interpretable machine learning models for
intrusion detection systems, hindering understanding and trust in the decision-making
processes of these systems.

3.2.2 Research Opportunity:

Adaptive Machine Learning Models:


- Explore the development of adaptive machine learning models for intrusion detection
systems. These models should possess the ability to dynamically evolve and update
their algorithms in response to emerging threats and changing patterns of cyberattacks.

3.3 Network Intrusion Detection Method of Power Monitoring System Based on Data Mining

3.3.1 Research Gap:

Dynamic Threat Adaptation:

- The research lacks focus on developing network intrusion detection methods for
power monitoring systems that dynamically adapt to evolving threats, posing a gap in
effectively addressing emerging cybersecurity challenges.

3.3.2 Research Opportunity:

Privacy-Preserving Intrusion Detection:


- Explore methods to enhance intrusion detection while prioritizing user privacy,
addressing the balance between effective threat detection and privacy protection.

3.4 Research on the technical application of artificial intelligence in network intrusion
detection system

3.4.1 Research Gap:


Explainable AI in Intrusion Detection:
- The research lacks a focus on developing artificial intelligence applications in network
intrusion detection systems that prioritize explainability and interpretability, posing a gap
in understanding and trust in the decision-making processes of these systems.

3.4.2 Research Opportunity:

Federated Learning for Intrusion Detection:


- Explore the potential of federated learning in the context of intrusion detection
systems. Investigate how a decentralized, collaborative learning approach can improve
the efficiency and accuracy of intrusion detection while addressing privacy concerns by
keeping sensitive data localized.

3.5 An Intrusion Detection Model Combining Signature-Based Recognition and Two-Round
Immune-Based Recognition

3.5.1 Research Gap:

Real-Time Adaptive Signature Updates:

- The research lacks a focus on creating an intrusion detection model that combines
signature-based recognition with two-round immune-based recognition, especially in the
aspect of real-time adaptive updates to signature databases. This poses a gap in
effectively responding to evolving threats in a dynamically changing cyber landscape.

3.5.2 Research Opportunity:

Cross-Domain Application:
- Explore the adaptability and effectiveness of the intrusion detection model, combining
signature-based recognition and two-round immune-based recognition, across diverse
domains beyond cybersecurity. This includes fine-tuning the model for specific domains to
optimize its performance, making its decision-making process transparent to build trust and
improve the understanding of anomalies, and developing mechanisms for real-time anomaly
detection that trigger appropriate responses in each domain.

3.6 A general AI-based NIDS methodology

NIDS developed using ML and DL techniques typically involve three main phases: (i) data
preprocessing, (ii) training, and (iii) testing, as shown in Figure 5. In all proposed solutions,
the dataset is first preprocessed to convert it into a format suitable for use in the algorithm.

This stage typically includes encoding and normalization. It also involves cleaning the dataset
where necessary by removing duplicate entries and entries with missing data. The preprocessed
data is then randomly split into a training set and a test set. Usually, 80% of the original
dataset is allocated to the training set, while the remaining 20% is used for testing [53, 54].
The ML or DL algorithm is subsequently trained using the training set.
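
The cleaning and 80/20 split described above can be sketched in a few lines of Python. This is
a minimal illustration using only the standard library; the function names, the toy records, and
the fixed random seed are assumptions made for this example and are not taken from the project's
code.

```python
import random


def drop_incomplete_and_duplicates(rows):
    """Remove rows that contain a missing value (None) or repeat an earlier row."""
    seen, clean = set(), []
    for row in rows:
        key = tuple(row)
        if None in key or key in seen:
            continue
        seen.add(key)
        clean.append(row)
    return clean


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (feature value, label).
rows = [(i, "attack" if i % 4 == 0 else "normal") for i in range(10)]
rows.append(rows[0])      # a duplicate entry
rows.append((99, None))   # an entry with a missing value

clean = drop_incomplete_and_duplicates(rows)
train, test = train_test_split(clean)
print(len(clean), len(train), len(test))  # 10 8 2
```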

The size of the dataset and the complexity of the proposed model determine how long it takes
to train the algorithm. Because of their deeper and more complex structure, DL models usually
take longer to train.

After training, a model is tested on a test set of data and assessed according to its predictions.
Network traffic examples are predicted by the NIDS model to fall into one of two classes: the
benign (normal) class or the attack class.

3.7 ML Algorithms (Decision Tree)

ML is a subset of AI that consists of all the techniques and algorithms that let computers
automatically learn from big datasets and draw conclusions by using mathematical models [13, 55].
The most popular ML algorithms (sometimes referred to as shallow learning) used for IDS are
decision trees, K-nearest neighbors (KNN), artificial neural networks (ANN), support vector
machines (SVM), K-means clustering, fast learning networks, and ensemble methods.

Decision trees quantify a dataset's disorder or impurity using a metric known as entropy. The
information gain is then computed to see how well the data are separated by a given attribute.
The objective is to reduce entropy and increase information gain.
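
The following sketch shows how entropy and information gain can be computed for a candidate
split. It uses the standard definitions (Shannon entropy in bits, and the parent entropy minus the
size-weighted entropy of the child groups). The attribute names and the four toy records are
hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on a feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


labels = ["normal", "normal", "attack", "attack"]
protocol = ["tcp", "tcp", "udp", "udp"]   # separates the classes perfectly
service = ["http", "dns", "http", "dns"]  # carries no information about the class

print(entropy(labels))                     # 1.0
print(information_gain(protocol, labels))  # 1.0
print(information_gain(service, labels))   # 0.0
```

A tree-building algorithm would pick the attribute with the highest gain (here `protocol`) as the
next split.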

One of the core supervised machine learning methods, the Decision Tree (DT) model, uses a
collection of decisions (rules) to perform classification and regression on a given dataset. The
node, branch, and leaf structures of a typical tree characterize the DT model. A node represents
a feature or attribute, a branch represents a decision or rule, and a leaf represents a possible
outcome or class label.

Decision trees can get quite deep and complicated, which can cause overfitting. Pruning removes
extraneous branches from the tree and improves the tree's ability to generalize to new data. The
DT algorithm automatically finds the best features to build a tree, and then prunes the tree to
remove irrelevant branches and avoid overfitting. The most popular DT models are CART, C4.5,
and ID3. Advanced learning algorithms such as Random Forest (RF) and XGBoost are built from
multiple decision trees [60].

K-Nearest Neighbor

KNN is a supervised machine learning method that predicts the class of a given data sample using
the notion of "feature similarity". KNN classifies a sample based on its nearest neighbors, which
are found by measuring the sample's distance to the training points. The parameter k influences
the model's performance: if k is very small, the model is prone to overfitting; if k is very
large, the sample is likely to be misclassified. Karatas et al. [63] compare the performance of
different machine learning methods on an updated benchmark dataset. They used the synthetic
minority oversampling technique (SMOTE) to address dataset imbalance by lowering the imbalance
ratio. As a result, the detection rate for minority-class attacks improved.
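
A minimal KNN classifier is sketched below to show how the choice of k affects the prediction.
It uses Euclidean distance and a simple majority vote. The training points are hypothetical
two-feature records, and the imbalance between the classes (four "normal" against two "attack")
is chosen on purpose to show how a very large k lets the majority class win.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, sample, k=3):
    """Predict the label of `sample` by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(x, sample), label) for x, label in zip(train_x, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical scaled features; "attack" is the minority class.
train_x = [[0.1, 0.1], [0.2, 0.1], [0.1, 0.2], [0.15, 0.15], [0.9, 0.9], [0.8, 0.9]]
train_y = ["normal", "normal", "normal", "normal", "attack", "attack"]

sample = [0.85, 0.85]
print(knn_predict(train_x, train_y, sample, k=3))  # attack
print(knn_predict(train_x, train_y, sample, k=6))  # normal (k too large)
```

With k equal to the size of the training set, every sample receives the majority label, which is
one reason class imbalance has to be handled before training.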

To avoid overfitting, decision trees use stopping criteria. Common stopping conditions include
reaching a specific depth in the tree, having a minimum number of samples in a leaf node, or
finding that further splitting will not appreciably increase information gain.


3.8 Deep Learning Algorithms

Deep learning (DL) is a type of machine learning (ML) that uses multiple layers to extract
features from data. Because these techniques have a deeper structure and can learn the key
properties of a dataset automatically, they are often more effective than shallow machine
learning. This section describes the DL techniques used in the reviewed studies to propose
DL-based NIDS solutions.

Recurrent neural networks (RNNs) extend feed-forward networks to model sequence data. An RNN is
made up of input, hidden, and output units, with the hidden units acting as the memory elements.
Each RNN unit bases its decision on its current input and the output of the previous step. RNNs
are widely used in many domains, including semantic understanding, handwriting prediction,
speech processing, and human activity recognition. In intrusion detection systems (IDS), RNNs
can be used for feature extraction and supervised classification. Most RNNs can only handle
sequences of limited length; if the sequence is long, they suffer from short-term memory
problems.

Naseer used a GPU-based testbed to conduct a comparative study of IDS based on several DL and ML
algorithms. With the NSL-KDD dataset as the benchmark, the experimental findings indicate that
LSTM and deep CNN models yielded higher accuracy than the other models.

Transformers were first created for use in natural language processing applications, but they
have now evolved into a core architecture for many other uses. They are very scalable because
they process incoming data in parallel using self-attention techniques.

A generator and a discriminator that have been concurrently trained via adversarial training
make up a GAN. GANs have applications in image synthesis and style transfer and are used to
generate fresh data samples, like realistic images.

CHAPTER-4
PROPOSED METHODOLOGY

4.1 Methodologies
Recurrent neural networks are made up of input, output, and hidden units; the hidden units perform
the majority of the work. In the RNN model, information flows in one direction from the input units
to the hidden units. Figure 1 shows how this one-way flow is combined with the flow from the hidden
unit at the previous time step to the hidden unit at the current time step. Hidden units can be
thought of as the storage of the whole network, retaining end-to-end information. When the RNN is
unfolded over time, it can be seen to embody deep learning. RNNs can be used for supervised
classification learning.
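
The recurrence described above can be written out as a short forward pass. The sketch below is a
minimal Elman-style RNN in plain Python, where each hidden state is computed from the current
input and the previous hidden state. The weight values and the one-feature input sequences are
arbitrary assumptions for illustration; a real model would learn its weights during training.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, b_h):
    """Run a tiny RNN over a sequence and return the hidden state at every step.

    inputs: list of input vectors, one per time step
    w_xh:   hidden_size x input_size weights (input -> hidden)
    w_hh:   hidden_size x hidden_size weights (previous hidden -> hidden)
    b_h:    hidden_size biases
    """
    hidden = [0.0] * len(b_h)  # initial hidden state
    states = []
    for x in inputs:
        new_hidden = []
        for j in range(len(b_h)):
            total = b_h[j]
            total += sum(w * xi for w, xi in zip(w_xh[j], x))
            total += sum(w * hi for w, hi in zip(w_hh[j], hidden))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
        states.append(hidden)
    return states


# Arbitrary illustrative weights for one input feature and one hidden unit.
w_xh, w_hh, b_h = [[0.5]], [[0.8]], [0.0]

# Both sequences end with the same input, but their final hidden states differ
# because the hidden unit remembers what came before.
after_event = rnn_forward([[1.0], [0.0]], w_xh, w_hh, b_h)[-1]
after_quiet = rnn_forward([[0.0], [0.0]], w_xh, w_hh, b_h)[-1]
print(after_event, after_quiet)
```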

4.2. Data Collection:


Data collection is an essential part of our research. The right dataset must be supplied in order to
get trustworthy findings. Our data consists of the labelled network traffic records of the
UNSW-NB15 dataset, which we obtain from Kaggle and analyze. After verifying the data's accuracy,
we use it in our model.

4.2.1 DATASET DESCRIPTION

A number of datasets were assessed in order to test the suggested NIDS. The NIDS required a
training and testing dataset that included information about network attacks in addition to typical
network traffic in order to detect both flow-based and packet-based intrusions.

Table 1 General summary of attack families in UNSW NB-15 dataset

4.3 Data Pre-Processing:

Raw data must be converted into a machine-readable form, because machines, unlike humans, cannot
interpret arbitrary kinds of data. Raw data is also typically incomplete or inaccurate. Partitioning
the dataset, checking for missing values, and other tasks are all part of data pre-processing.
This process involves several key steps. Initially, data cleaning is performed to address missing
values, outliers, and errors, ensuring the dataset's integrity. Subsequently, data normalization or
scaling may be applied to bring features to a consistent scale, preventing the dominance of certain
variables. Feature engineering may involve creating new informative features or transforming
existing ones to improve model performance. Categorical variables are often encoded, and text data
may be tokenized and vectorized. Handling imbalanced datasets, dealing with redundant features,
and addressing multicollinearity are additional considerations. Data pre-processing is a crucial stage
in the data analysis workflow, significantly influencing the success and accuracy of subsequent
modeling or analytical endeavors.
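
Two of the steps named above, scaling numeric features and encoding categorical ones, are
sketched below with min-max normalization and one-hot encoding. This is only an illustration of
the techniques; the sample values are made up, and the project may use a different scaler or
encoder.

```python
def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categorical values as one-hot vectors; returns (categories, rows)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded


# Hypothetical column values: flow durations and transport protocols.
durations = [0.0, 5.0, 10.0]
protocols = ["tcp", "udp", "tcp"]

print(min_max_scale(durations))  # [0.0, 0.5, 1.0]
print(one_hot(protocols))        # (['tcp', 'udp'], [[1, 0], [0, 1], [1, 0]])
```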

4.4 Training Model:

A model learns from the data it is fed. The collected dataset is used to train the model. Training
starts from an initial, raw dataset. From the same dataset, a more refined view containing the
intended output is then derived, using several methods to refine the data and yield the desired
outcomes. During the training phase, the model learns patterns and relationships within the
training dataset, allowing it to generalize and make predictions on similar but previously unseen
data.

Training is typically organized into epochs, where one epoch represents a complete pass through the
entire training dataset. The dataset is often divided into batches, and model updates are performed
after processing each batch. The combination of epochs and batch size affects the learning dynamics.

To monitor the model's generalization performance, a separate validation dataset is often used. Early
stopping is a regularization technique that involves halting the training process when the model's
performance on the validation set ceases to improve, preventing overfitting.
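
The interplay of batches, epochs, and early stopping can be sketched as follows. The code does
not train a real model: the per-epoch validation losses are supplied as a plain list so that the
stopping rule itself is easy to follow. The function names and the patience value of 2 are
assumptions made for this example.

```python
def iter_batches(rows, batch_size):
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, stopped_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


print(list(iter_batches([0, 1, 2, 3, 4], batch_size=2)))  # [[0, 1], [2, 3], [4]]

# Hypothetical validation losses, one per epoch.
print(early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.66, 0.5]))  # (2, 4)
```

In the example, training stops at epoch 4 and the weights from epoch 2 would be kept, even
though a later epoch would have improved again. That trade-off is controlled by the patience
value.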

4.5 System Evaluation:

We obtained the dataset for the proposed project from Kaggle. This dataset has not yet been
processed, so the first step is to convert the raw data into processed data. Because prediction
needs only a fraction of the raw data's many attributes, feature extraction, a dimensionality
reduction technique, is used for this. A system's structure, attitudes, and behaviors are all
described by a structural model.

System evaluation is a crucial phase in the development of a machine learning model or system,
focusing on assessing its performance, robustness, and generalization to new, unseen data. This
phase involves various metrics and methodologies to gauge the effectiveness of the model in meeting
its intended objectives. Different machine learning tasks, such as

4.6 Points to be kept in mind

- Not all types of intrusions are known.
- Creating totally secure systems is not practical.
- The majority of systems in use today contain security flaws.
- Timely intrusion detection can help identify intruders and minimize damage.

CHAPTER-5

The proposed anomaly-based network intrusion detection system (NIDS), built on the Long
Short-Term Memory (LSTM) algorithm, aims to offer a novel intrusion detection tool that uses AI
technology to detect and highlight network anomalies with low false positive rates and high
accuracy.

The proposed approach aims to improve cybersecurity by providing a more effective and efficient
means of detecting network anomalies, which is a critical component in safeguarding critical
network infrastructures from potential attacks. Artificial intelligence (AI) will be used by the system
to identify patterns and abnormalities in the data that could be difficult for people to see. The
network security business may be significantly impacted if this endeavor is successful. The
utilization of the LSTM algorithm is a key distinguishing feature of this intrusion detection system.
LSTM, a type of recurrent neural network (RNN), is well-suited for processing sequential data,
making it particularly effective for capturing and understanding patterns in network traffic over time.
False positives can lead to unnecessary alerts and operational disruptions, impacting the efficiency of
cybersecurity operations. By leveraging the capabilities of LSTM and AI technology, the proposed
NIDS aims to minimize false positives, ensuring that flagged incidents are more likely to be genuine
security threats.

Moreover, the commitment to high accuracy underscores the system's reliability in identifying and
classifying network anomalies. Anomaly detection systems aim to identify deviations from normal
patterns in data, which may indicate potential security threats or irregularities.

The emphasis on high accuracy also aligns with the goal of fostering a proactive security posture. A
proactive approach involves identifying and mitigating potential security risks before they escalate
into significant threats. A system with high accuracy in anomaly detection can provide timely alerts
and responses to security incidents, allowing organizations to address vulnerabilities and minimize
the impact of potential breaches.
LSTM Algorithm and Long-Term Dependencies:

The LSTM algorithm is specifically mentioned for its ability to capture long-term dependencies in
data. Unlike traditional neural networks or machine learning models that may struggle with capturing
patterns in sequential data, LSTMs excel in retaining information over extended periods. In the
context of network security, this is valuable because security threats often manifest as complex
patterns that unfold over time. The LSTM's capability to capture long-term dependencies enhances
the system's ability to recognize and understand the evolving nature of both common and
sophisticated intrusion patterns.
Recognition of Common and Sophisticated Intrusion Patterns:

This adaptability is crucial in the ever-evolving landscape of cybersecurity, where attackers
continually develop new strategies and techniques to compromise network security.
In summary, the proposed Anomaly-Based NIDS with LSTM algorithm seeks to redefine intrusion
detection by offering a cutting-edge solution that harnesses the capabilities of AI. By focusing on
low false positive rates and high accuracy, this system aims to provide cybersecurity professionals
with a robust tool for effectively safeguarding networks against a diverse range of threats, ultimately
enhancing the overall security posture of information systems.

CHAPTER-6

6.1 Introduction to Design document

The software design aids in the software development process for Android applications by
providing guidelines for the application's structure. The software design specifications are a
narrative and graphical description of the software design for the project; they comprise use
case models, sequence diagrams, and other supporting required information.

This phase involves creating comprehensive software design specifications, which serve as a
blueprint for the entire development process. These specifications encompass various
elements, including use case models, sequence diagrams, and other essential information.

6.2 DATA FLOW DIAGRAM

A data flow diagram (DFD) is a visual representation of how data flows through a system. A
DFD contains no loops, decision rules, or control flows.
A flowchart, by contrast, is a diagram that describes a particular operation based on data.
A data flow diagram can be drawn using a number of different notations.

Fig 6.1 Data Flow Diagram

Step 1: Load datasets
Load the intrusion datasets, which may be text files or CSV files (we have loaded the UNSW
NB15 dataset).

Step 2: Data processing

Step 3: Dataset field names
Assign field names (column names) to the datasets.
column names :

Step 4: Remove fields
Remove the unwanted data, fields, and columns from the test and train datasets.

Step 5: Mapping
Assign a class name to each type of attack, for example:
ipsweep=Probe, teardrop=DoS, perl=U2R, ftp_write=R2L,
and normal=normal

Step 6: Attack class distribution

Step 7: Model training

Step 8: The test dataset is used to test the validated model. If an intrusion is present, the
model detects it.
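
Steps 5 and 6 can be sketched as a dictionary lookup followed by a count. The mapping below
contains only the five examples listed in Step 5; a real mapping would cover every attack name in
the dataset. The sample labels and the "unknown" fallback are assumptions added for
illustration.

```python
from collections import Counter

# Only the example pairs from Step 5; extend this for the full label set.
ATTACK_CLASS = {
    "ipsweep": "Probe",
    "teardrop": "DoS",
    "perl": "U2R",
    "ftp_write": "R2L",
    "normal": "normal",
}


def map_attack_classes(labels, mapping=ATTACK_CLASS, default="unknown"):
    """Map raw attack names to their attack class (case-insensitive)."""
    return [mapping.get(label.strip().lower(), default) for label in labels]


def class_distribution(classes):
    """Return the fraction of records that fall into each class."""
    counts = Counter(classes)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Hypothetical raw labels from a handful of records.
raw_labels = ["normal", "teardrop", "normal", "Ftp_write"]
classes = map_attack_classes(raw_labels)

print(classes)                      # ['normal', 'DoS', 'normal', 'R2L']
print(class_distribution(classes))  # {'normal': 0.5, 'DoS': 0.25, 'R2L': 0.25}
```

Inspecting the distribution before training shows how imbalanced the classes are, which informs
whether resampling is needed.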

6.3 Architecture Diagram

Fig 6.2 Architecture Diagram

1. Data Collection: Information on network traffic is gathered and kept in a database from a
variety of sources, including network sensors and honeypots.

2. Preprocessing of the Collected Data: This step includes handling missing values, feature
extraction, and normalization of the data.

3. Feature Extraction: The network input of the suggested RNN-LSTM architecture
receives many feature channels, which enhances the detection performance.

4. The RNN-LSTM Model: The RNN-LSTM model is the central component of the system
and is comprised of two primary parts:

5. RNNs (Recurrent Neural Networks): RNNs are appropriate for sequential data
processing because they can detect temporal connections in network traffic data.

6. Long Short-Term Memory (LSTM): LSTM is a unique type of RNN that can analyse
and forecast input sequentially, which makes it perfect for identifying intricate patterns in
network traffic.

7. Training and Evaluation: Using an appropriate loss function, such as binary or
categorical cross-entropy, the RNN-LSTM model is trained on the preprocessed,
feature-extracted data. A variety of performance indicators, including accuracy, precision,
recall, and F1-score, are used to evaluate the model's performance.

8. Intrusion Detection: To find intrusions in the network traffic data, the trained RNN-
LSTM model is employed. Based on the input data, the model can produce predictions, and
it is possible to establish a threshold value to distinguish between intrusions and non-
intrusions.

9. Visualization and Monitoring: Real-time visualization and monitoring of the intrusion
detection process's outcomes makes it simple to identify any threats and take the necessary
countermeasures.

Overall, the RNN-LSTM architecture of a network intrusion detection system combines the
advantages of RNNs and LSTMs to efficiently identify intricate patterns in network traffic data,
enhancing intrusion detection system performance and enhancing network and computer system
security.
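
The thresholding in step 8 and the evaluation metrics in step 7 can be sketched together. The
code below turns model scores into intrusion / non-intrusion decisions and computes accuracy,
precision, recall, and F1-score from the resulting confusion counts. The scores, true labels, and
the 0.5 threshold are hypothetical values chosen for this example.

```python
def classify(scores, threshold=0.5):
    """Label a record as an intrusion (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = intrusion)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and model scores for six records.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]

y_pred = classify(scores)
print(y_pred)  # [1, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

Lowering the threshold catches more intrusions (higher recall) at the cost of more false alarms
(lower precision), so the threshold is usually tuned on validation data.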

9.3.1 Long Short Term Memory:

LSTMs are artificial recurrent neural networks (RNNs) that are used in deep learning.
They can be used to classify, process and make predictions from time series data.
LSTMs have the advantage of being able to selectively remember patterns for long
periods of time, as there can be long lags in the time series between important events.
LSTMs differ from standard RNNs in that gates control what the cell state keeps,
forgets and outputs at each step, which is what makes them useful in real-world problems.
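
A single LSTM time step can be sketched as follows, assuming the standard formulation with
forget, input and output gates, and reduced to one hidden unit with a scalar input so that
it runs in plain Python. The weight values are hypothetical and are not taken from the
trained model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step for a single hidden unit and a scalar input."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much of the new candidate to write
    g = gate("candidate", math.tanh)   # candidate value for the cell state
    o = gate("output", sigmoid)        # how much of the cell state to expose
    c = f * c_prev + i * g             # updated long-term memory
    h = o * math.tanh(c)               # updated hidden state (short-term output)
    return h, c

params = {  # hypothetical (w_x, w_h, b) per gate
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, 0.2, 0.0),
    "candidate": (1.0, 0.3, 0.0),
    "output": (0.4, 0.1, 0.0),
}
h, c = 0.0, 0.0
for x in [0.5, -1.0, 0.25]:  # a short hypothetical input sequence
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

The cell state `c` is only scaled by the forget gate and added to, which is why information
can survive across long gaps between important events.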

Fig 6 3 Long Short Term Memory

6.3 ACTIVITY DIAGRAM

Fig 6.3 Activity Diagram

Both concurrent and sequential activities can be modeled. In both cases, an activity
diagram has an initial state at the beginning and a final state at the end.

6.4 SEQUENCE DIAGRAM

Sequence diagrams show how classes interact with each other through a message exchange
over time. They are also referred to as event diagrams.

A sequence diagram is a visual way to represent and validate different runtime scenarios.

When you model a new system, a sequence diagram can be used to anticipate how the
system will act and to determine any class roles.

Figure 6.5 Sequence Diagram

6.5 RNN:

The hidden units can be thought of as the network's memory, retaining information from one
end of the sequence to the other. Unrolling the RNN over time shows that it is itself a deep
network. RNNs can be used for supervised classification learning. Recurrent neural networks
introduce a directional loop that can retain and use prior information.

Figure 4 RNN Diagram
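
The directional loop can be written as a recurrence in which each hidden state depends on
the current input and on the previous hidden state. The sketch below assumes a single
hidden unit with a tanh activation; the weights are hypothetical.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Return the hidden state after each step of a one-unit recurrent network."""
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)  # the loop: h feeds back into itself
        states.append(h)
    return states

# Only the first input is non-zero, yet later states are still non-zero
# because the hidden state carries it forward.
states = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
print(states)
```

With `w_h` set to zero the loop is cut and each state depends only on the current input,
which removes the memory effect.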

Use Case diagram:

Use case models describe the different ways that users (actors) and the application
interact and play out scenarios. They specify the features and functionality that the
program needs to offer in order to satisfy user needs. Use cases ensure that all potential
user interactions are taken into account during the design phase by helping to identify the
various paths users can follow within the program.

Figure 5 Use Case Diagram

6.5 Implementation

Python is used to implement the project; it supports both object-oriented and procedural
programming. Using a partitioned memory space for data and functions, object-oriented
programming offers a method for modularizing programs. This memory space can then be used
as a template to create further copies of the module as needed.
Software implementation is the process of installing a package in its final form in a
real-world setting, ensuring that the system functions properly and that the intended users
are satisfied. Users may not initially see that the software is intended to make their jobs
easier, so the active user must understand the system's advantages and gain trust in the
program.
The user should also receive adequate guidance to be comfortable when using the
application.

The flask_ngrok package is used together with Flask web apps. It offers an easy way to use
ngrok, a service that builds secure tunnels to your localhost, to expose your locally
running Flask application to the internet.

For your Flask application, flask_ngrok configures ngrok to produce a dynamically generated
public URL. This means that ngrok assigns a new URL to your Flask application each time you
launch it, which anybody else can use to visit it. This is particularly useful if you work
on multiple networks or if your IP address changes regularly. By default, ngrok offers
secure HTTPS tunnels, which is helpful in testing and development when simulating a
production-like environment that includes secure connections.

1. Intrusion detection systems are crucial in the realm of cybersecurity for thwarting
network assaults.
2. To optimize the system's adaptability, anomaly detection through a learning system
must be included in lieu of signature-based detection.
3. The goal of this paper is to provide a succinct overview of intrusion detection systems
based on deep learning, including many aspects of both deep learning and intrusion
detection techniques.
4. Compiling datasets that are accessible to the general public and providing details about
their features and restrictions.

CHAPTER-9

9.1 Result

When creating a high-throughput Network Intrusion Detection System (NIDS) with the
UNSW-NB15 dataset, real-time processing problems and dataset characteristics must be
carefully taken into account. LSTM networks are used in this process. The approach that
follows combines methods to deal with these issues and maximize the NIDS for high
throughput.

On the UNSW-NB15 dataset, the deployment of a high-throughput Network Intrusion Detection
System (NIDS) using Long Short-Term Memory (LSTM) networks produced encouraging outcomes
and insights. The LSTM-based NIDS proved effective in managing large amounts of network
traffic and identifying known attacks and abnormalities. By using the temporal
relationships captured by the LSTM, the system demonstrated a strong ability to analyze
network data sequences in real time, which makes it suitable for situations with high
throughput needs. Standard measures, including accuracy, precision, recall, and F1-score,
were used to assess the model's performance. These metrics demonstrated the model's
capacity to sustain low false positive rates and high detection accuracy.

The analysis is conducted on the UNSW-NB15 dataset, a network traffic dataset commonly used
for evaluating intrusion detection systems. This dataset contains a variety of normal and
attack instances, making it suitable for training and testing network intrusion detection
models.
The deployed Network Intrusion Detection System is designed for high throughput, meaning
it can efficiently handle large amounts of network traffic. This is achieved by utilizing Long
Short-Term Memory (LSTM) networks, which are a type of recurrent neural network (RNN)
known for their ability to capture temporal dependencies in sequential data, making them
well-suited for time-series analysis.

The LSTM-based NIDS demonstrated effectiveness in identifying both known attacks and
abnormalities in the network traffic. This suggests that the model has learned patterns and
temporal relationships within the data that enable it to distinguish between normal and
malicious activities.

Real-time analysis of network data: The system showed a strong ability to analyze network
data sequences in real time. The LSTM-based NIDS can process and make predictions on
incoming network data as it arrives, allowing timely detection of potential intrusions or
abnormalities.
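
One way to feed arriving records to a sequence model is a fixed-length sliding window that
always holds the most recent records. The sketch below is an assumption about how such a
window could be built; the window size of 3 and the one-feature records are hypothetical.

```python
from collections import deque

def sliding_windows(records, window_size):
    """Yield the most recent `window_size` records each time a new one arrives."""
    window = deque(maxlen=window_size)  # oldest record is dropped automatically
    for record in records:
        window.append(record)
        if len(window) == window_size:
            yield list(window)

stream = [[0.1], [0.4], [0.3], [0.9], [0.2]]  # hypothetical per-flow feature vectors
windows = list(sliding_windows(stream, window_size=3))
for w in windows:
    print(w)
```

Each yielded window is one input sequence for the model, so a prediction becomes available
as soon as the newest record has been appended.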

LSTM networks are particularly adept at capturing long-term dependencies and temporal
relationships in sequential data. In the context of network traffic analysis, this implies that
the model leverages the historical context of data to make informed predictions about the
current state, enhancing its ability to identify anomalies.

Standard performance metrics, including accuracy, precision, recall, and F1-score, were used
to assess the model's performance. These metrics provide a comprehensive view of how well
the NIDS is performing in terms of true positives, false positives, true negatives, and false
negatives.

The mentioned metrics, such as accuracy, precision, recall, and F1-score, indicated that the
LSTM-based NIDS achieved low false positive rates (minimizing incorrect alarms) and high
detection accuracy. This is a crucial aspect in intrusion detection systems, as it reflects the
system's ability to effectively identify and distinguish between normal and malicious
network activities.
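
The relationship between these metrics and the four confusion-matrix counts can be made
concrete with a short sketch. The labels and predictions below are hypothetical and are not
the results of this project.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = attack)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def detection_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1-score and false positive rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_positive_rate": ratio(fp, fp + tn),
    }

y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # hypothetical ground truth
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]  # hypothetical model predictions
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading when attacks are rare, which is why precision, recall and
the false positive rate are reported alongside it.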

The results of this study emphasize the potential of LSTM-based approaches in developing
high-throughput NIDS solutions, providing a foundation for further research and
optimization in the realm of network security.

These findings contribute to the understanding of how LSTM-based approaches can be
optimized for more efficient and effective network security applications. Optimization may
involve fine-tuning the models, improving algorithms, or addressing specific challenges
associated with network intrusion detection.

LSTM-based approaches have demonstrated potential in the development of high-throughput
Network Intrusion Detection Systems. The positive outcomes from this research provide a
basis for future investigations and optimizations within the field of network security. It
opens avenues for researchers and practitioners to explore and enhance LSTM-based models
for more robust and efficient intrusion detection in network environments.

OUTPUT:

9.2 Discussions

There is a great deal of opportunity and complexity in designing a high-throughput Network
Intrusion Detection System (NIDS) using Long Short-Term Memory (LSTM) networks and the
UNSW-NB15 dataset. LSTMs capture temporal dependencies in network traffic, which
facilitates the identification of intricate attacks. However, attaining high throughput,
which is essential for real-time monitoring in large networks, calls for careful model
development and optimization.

It is crucial to balance computing efficiency against model complexity. Improving
throughput depends mostly on careful data preparation, hardware acceleration, and parallel
processing. The UNSW-NB15 dataset offers a strong basis for training and testing the
LSTM-based NIDS because of its wide range of normal and attack scenarios.

The system needs to be tuned for quick sequence analysis and decision-making in order to
reach high throughput. To ensure that intrusion detection remains accurate and effective with
the least possible impact on network performance, strategies including hardware
acceleration, parallel processing, and model tuning are essential. In order to effectively
manage the UNSW-NB15 dataset's diversity and dynamic structure, LSTM-based NIDS
must carefully strike a balance between computational efficiency and model complexity,
which will eventually result in a reliable and high-performing network security solution.
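
One common way to raise throughput is to score records in batches, so that the model is
called once per batch rather than once per record. The sketch below illustrates the idea
under stated assumptions: `score_batch` is a hypothetical stand-in for a batched model
call, and the batch size is arbitrary.

```python
import time
from itertools import islice

def make_batches(records, batch_size):
    """Yield consecutive lists of at most `batch_size` records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def score_batch(batch):
    """Hypothetical stand-in for one batched model call (mean of each record)."""
    return [sum(features) / len(features) for features in batch]

def score_stream(records, batch_size=256):
    """Score every record, calling the model once per batch."""
    scores = []
    for batch in make_batches(records, batch_size):
        scores.extend(score_batch(batch))
    return scores

records = [[i / 10.0, 1.0] for i in range(10)]  # hypothetical two-feature records
start = time.perf_counter()
scores = score_stream(records, batch_size=4)
elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
print(f"{len(scores)} records scored, about {len(scores) / elapsed:.0f} records/s")
```

Larger batches usually improve throughput on accelerated hardware but add latency before
the first prediction, which is the efficiency trade-off discussed above.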

CHAPTER-10
CONCLUSION
10.1 Conclusion

Network intrusion detection systems (NIDSs) help protect a network from unauthorized
access and attacks. A NIDS monitors and analyzes network traffic to identify and prevent
network security breaches. NIDSs protect sensitive information, identify vulnerabilities,
and prevent network attacks. By proactively detecting and responding to network intrusions,
organizations can reduce the risk of network damage and protect critical assets.

The process begins with a methodical approach to selecting papers. The primary focus is on
AI-based NIDS, implying the use of Artificial Intelligence techniques in the context of
Network Intrusion Detection Systems.
Before delving into specific papers, the approach starts with providing a brief overview of
the concept of Intrusion Detection Systems (IDS). This includes an exploration of various
classification schemes related to IDS. This initial step helps establish a foundational
understanding of the field.

The selection process is based on a literature review that examines existing research in
the field of AI-based NIDS. This literature review covers aspects such as methodologies,
algorithms, and trends in intrusion detection.

Once relevant papers are identified, the next step involves analyzing the methodologies
presented in each article. This analysis includes understanding the techniques, algorithms, or
models proposed in the papers for improving intrusion detection in network systems.

The relevant papers in the area of AI based NIDS are selected using a methodical approach.
First, a brief overview of the concept of IDS and the various classification schemes are
presented, based on the reviewed literature.


1. Upload the dataset:
a. Uploading the UNSW NB-15 dataset.
b. Pre-process the dataset.

2. Running using flask-ngrok:
a. Running on “http://b492-35-247-180-122.ngrok-free.app”
b. Then we should give as a developer

3. Details Entry:
a. Prompt the user to enter relevant details or data related to the chosen attack type.
b. Provide clear instructions on the type of information required for detection.

4. Detection Process:
a. Utilize a machine learning model trained to detect the specific attack type chosen by the
user.
b. Process the user-provided details through the detection model.
c. If an attack is detected, proceed to the alert and details display step. If not, inform the
user that no attack was detected.

5. Alert and Details Display:
a. Notify the user that an attack has been detected.
b. Display the details of the detected attack, including relevant information provided by
the user.
c. Provide recommendations on how to address the detected attack and enhance security.

6. Logging and Analysis:
a. Log all detected attacks and their characteristics for future analysis and model
improvement.
b. Analyze the user-provided details to enhance the detection model's accuracy and
effectiveness.
c. Analyze the accuracy graph
8. User Feedback and Improvement:
a. Allow users to provide feedback on the detection process and the accuracy of the
results.
b. Use user feedback to improve the detection system and enhance user experience.

9. Continuous Monitoring:
a. Implement continuous monitoring of network traffic and user behavior to detect
ongoing and potential attacks.
b. Update the detection model and system to adapt to new attack patterns and techniques.

10. System Maintenance and Updates:
a. Regularly maintain and update the detection system to incorporate the latest security
measures and attack detection capabilities.
b. Keep the system up to date with new machine learning algorithms and intrusion
detection techniques.

12. Collaboration and Research:
a. Collaborate with security experts and researchers to stay informed about the latest
developments in network intrusion detection.
b. Contribute to research efforts aimed at advancing the field of intrusion detection and
network security.
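
Steps 4 to 6 of the pseudocode above (detection, alert and logging) can be sketched in
Python as follows. This is only an illustration: `score_record` is a hypothetical stand-in
for the trained model, the 0.5 threshold is assumed, and the field names `dur` and `spkts`
are assumed to follow the UNSW-NB15 feature naming.

```python
import json
from datetime import datetime, timezone

ATTACK_THRESHOLD = 0.5  # assumed decision threshold, not taken from the report

def score_record(details):
    """Hypothetical stand-in for the trained model; returns a score in [0, 1].

    Toy rule: a high source-packet rate is treated as suspicious.
    """
    packet_rate = details["spkts"] / max(details["dur"], 1e-6)
    return min(packet_rate / 1000.0, 1.0)

def detect(details, attack_type, attack_log):
    """Score the record, alert if needed, and log detected attacks."""
    score = score_record(details)
    if score < ATTACK_THRESHOLD:
        return "No attack detected."
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "attack_type": attack_type,
        "score": score,
        "details": details,
    }
    attack_log.append(json.dumps(entry))  # one JSON line per detected attack
    return f"ALERT: possible {attack_type} attack (score {score:.2f})"

attack_log = []
print(detect({"dur": 0.001, "spkts": 500}, "DoS", attack_log))
print(detect({"dur": 10.0, "spkts": 20}, "DoS", attack_log))
```

Keeping each log entry as a self-contained JSON line makes it straightforward to reload the
detected attacks later for the analysis and model improvement described in step 6.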

1. Home Page

2. DoS attack

3. Normal

4. Analysis

5. Exploits

6. Backdoor

7. Exploits

8. Reconnaissance

9. Shellcode

10. Generic
