Network Intrusion Detection Using Wireless Controllers

05/03/2023
SYNOPSIS
• Domain Introduction
• Abstract
• Introduction
• Objectives
• Existing System & Disadvantages
• Proposed System & Advantages
• System Requirements
• System Architecture
• Flow Diagram
• Modules
• Module Description

DOMAIN INTRODUCTION

• Data mining is the computing process of discovering patterns in large datasets, involving methods at the intersection of machine learning, statistics and database systems.
• The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.
• Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.
• Data mining is about finding new information in large amounts of data.
• The information obtained from data mining should be both new and useful.

ABSTRACT
• Intrusion detection is one of the important security problems in today's cyber world.
• A significant number of techniques based on machine learning approaches have been developed.
• To identify intrusions, we apply a machine learning algorithm.
• Using the algorithm, we detect intrusions and can also identify the attacker's details.
• Intrusion detection systems (IDS) are mainly of two types: host-based and network-based.
• A Host-based Intrusion Detection System (HIDS) monitors an individual host or device and sends alerts to the user if suspicious activities are detected, such as modifying or deleting a system file, an unwanted sequence of system calls, or unwanted configuration changes.

• A Network-based Intrusion Detection System (NIDS) is usually placed at network points such as gateways and routers to check for intrusions in the network traffic.
• Here the C4.5 decision tree algorithm is used. It is a supervised machine learning algorithm for classification problems.
• C4.5 builds a single decision tree by repeatedly splitting the training data on the attribute with the highest gain ratio; a record is assigned the class of the leaf it reaches.

• The C4.5 decision tree algorithm is intended to improve the overall classification and prediction results.

INTRODUCTION
• A detailed investigation and analysis of various machine learning techniques has been carried out to find the causes of the problems these techniques have in detecting intrusive activities.
• A classification of attacks is provided, together with a mapping of the features corresponding to each attack.
• Issues related to detecting low-frequency attacks using a network attack dataset are also discussed, and viable methods are suggested for improvement.
• Machine learning techniques have been analyzed and compared in terms of their capability to detect the various categories of attacks.

OBJECTIVE

• To classify and predict the data effectively.
• To reduce the sparsity problem.
• To enhance the performance of the overall prediction results.

EXISTING SYSTEM

• Hacking incidents are increasing day by day as technology advances. A large number of hacking incidents are reported by companies each year.
• The existing system does not effectively classify and predict attacks present in the network.

DISADVANTAGES

• Not efficient at handling large volumes of data.
• Theoretical limits.
• Incorrect classification results.
• Low prediction accuracy.

PROPOSED SYSTEM

• The proposed model is introduced to overcome the disadvantages of the existing system.
• This system aims to increase the accuracy of the classification results by classifying the network data as normal or attack using the C4.5 decision tree classification algorithm.
• It enhances the performance of the overall classification results.

ADVANTAGES

• High performance.
• Provides accurate prediction results.
• Avoids sparsity problems.
• Reduces information loss and the inference bias caused by multiple estimates.

SYSTEM REQUIREMENTS

Software Requirements

• Operating System : Windows 8.1
• Language : Python
• IDE : Anaconda - Spyder

Hardware Requirements
• Hard Disk : 1000 GB
• Monitor : 15-inch VGA color
• Mouse : Microsoft
• Keyboard : 110 keys enhanced
• RAM : 4 GB

SYSTEM ARCHITECTURE

[Architecture diagram: DATASET → FORMATTED DATASET → TRAIN / TEST → CLASSIFICATION → PREDICTION]

FLOW DIAGRAM
START → SELECT DATASET → CLEANING DATASET → SPLIT TRAIN AND TEST → CLASSIFICATION → PREDICTION

MODULES
• Data Selection and Loading
• Data Preprocessing
• Splitting Dataset into Train and Test Data
• Feature Extraction
• Classification
• Prediction
• Result Generation

MODULES DESCRIPTION

DATA SELECTION AND LOADING

• Data selection is the process of selecting the data used for detecting attacks.
• In this project, the KDDCUP dataset is used for detecting attacks.
• The dataset contains information about the duration, flag, service, src_bytes, dest_bytes and class labels.
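
As a minimal sketch of the loading step, the snippet below parses a few comma-separated records with the columns named above. The sample rows, the column order and the `load_records` helper are illustrative assumptions, not the project's actual loader or real KDDCUP records.

```python
import csv
import io

# Hypothetical sample rows; a real KDDCUP file has many more columns and records.
SAMPLE = """duration,service,flag,src_bytes,dest_bytes,class
0,http,SF,181,5450,normal
0,private,S0,0,0,anomaly
2,ftp_data,SF,491,0,normal
"""

def load_records(text):
    """Parse CSV text into a list of dicts, converting numeric fields to int."""
    numeric = ("duration", "src_bytes", "dest_bytes")
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        for name in numeric:
            row[name] = int(row[name])
        records.append(row)
    return records

records = load_records(SAMPLE)
print(len(records), records[0]["service"], records[1]["class"])
```

In an Anaconda environment the same step is commonly done with `pandas.read_csv`; the standard-library version is used here only to keep the sketch self-contained.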

DATA PREPROCESSING
• Data preprocessing is the process of removing unwanted data from the dataset. It consists of two steps: missing data removal and encoding of categorical data.
• Missing data removal: null or missing values are handled (dropped or filled in) using an imputer library.
• Encoding categorical data: categorical data are variables with a finite set of label values. Most machine learning algorithms require numerical input and output variables, so integer encoding and one-hot encoding are used to convert categorical data to numeric data.
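
The two preprocessing steps can be sketched in plain Python as follows. The helper names and sample rows are hypothetical; dropping incomplete rows is shown as one assumed way to handle missing values, and the label ids follow the normal = 0, anomaly = 1 convention described in the application walkthrough.

```python
def drop_missing(rows):
    """Keep only rows in which no field is None or an empty string."""
    return [r for r in rows if all(v not in (None, "") for v in r.values())]

def integer_encode(values):
    """Map each distinct label to an integer, in sorted label order."""
    mapping = {label: i for i, label in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

def one_hot(codes, n_classes):
    """Turn integer codes into one-hot vectors of length n_classes."""
    return [[1 if c == k else 0 for k in range(n_classes)] for c in codes]

# Hypothetical records; the second one has a missing service value.
rows = [
    {"service": "http", "flag": "SF", "class": "normal"},
    {"service": "", "flag": "S0", "class": "anomaly"},
    {"service": "ftp", "flag": "SF", "class": "anomaly"},
]

clean = drop_missing(rows)
codes, mapping = integer_encode([r["service"] for r in clean])
vectors = one_hot(codes, len(mapping))

# Class labels use a fixed mapping rather than sorted order.
LABEL_IDS = {"normal": 0, "anomaly": 1}
labels = [LABEL_IDS[r["class"]] for r in clean]
```

Integer encoding suits tree-based classifiers, while one-hot encoding avoids implying an order between categories such as service names.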

SPLITTING DATASET INTO TRAIN AND TEST DATA
• Data splitting is the act of partitioning the available data into two portions, usually for cross-validation purposes.
• One portion of the data is used to develop a predictive model and the other to evaluate the model's performance.
• Separating data into training and testing sets is an important part of evaluating data mining models.
• Typically, when a data set is separated into a training set and a testing set, most of the data is used for training and a smaller portion is used for testing.
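
A shuffled hold-out split can be sketched as below. The `split_train_test` helper, the seed and the 0.2 test fraction are assumptions for illustration; that fraction happens to reproduce the 995/249 split of 1244 records reported in the application walkthrough.

```python
import math
import random

def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Round the test size up so that small datasets still get a test record.
    n_test = math.ceil(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(1244))  # stand-in for 1244 dataset records
train, test = split_train_test(rows)
print(len(train), len(test))  # prints "995 249"
```

Fixing the seed keeps the split reproducible between runs, which makes accuracy figures comparable across algorithms.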

FEATURE EXTRACTION

• Feature scaling is a method used to standardize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data preprocessing step.
• Scaling brings the features onto a comparable scale (for example, a fixed range, or zero mean and unit variance). It can also speed up the calculations in some algorithms.
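
Two common forms of feature scaling are sketched below: standardization (zero mean, unit variance) and min-max normalization (range 0 to 1). The function names and the sample `src_bytes` values are hypothetical, and the slides do not state which form the project uses.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale values to zero mean and unit variance (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(x - mu) / sigma for x in column]

def min_max(column):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

src_bytes = [0, 181, 491, 5450]  # hypothetical feature values
z = standardize(src_bytes)
scaled = min_max(src_bytes)
```

Decision trees are largely insensitive to feature scale, whereas distance-based and gradient-based models such as SVM and ANN usually benefit from it.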

CLASSIFICATION

• The C4.5 algorithm is used in data mining as a decision tree classifier, which can be employed to generate a decision based on a sample of data (univariate or multivariate predictors). A decision tree is a classification tool in machine learning that uses a tree structure in which internal nodes represent tests and leaves represent decisions. C4.5 makes use of information-theoretic concepts such as entropy to classify the data.
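
The entropy-based split criterion of C4.5 can be illustrated with a short sketch. It computes the gain ratio (information gain divided by split information) for categorical attributes only; threshold handling for continuous attributes and pruning are omitted, and the sample rows and function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain_ratio(rows, attribute, target="class"):
    """Information gain of splitting on `attribute`, divided by split information."""
    labels = [r[target] for r in rows]
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy([r[attribute] for r in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical records: `flag` separates the classes perfectly, `service` does not.
rows = [
    {"flag": "SF", "service": "http", "class": "normal"},
    {"flag": "SF", "service": "ftp", "class": "normal"},
    {"flag": "S0", "service": "http", "class": "anomaly"},
    {"flag": "S0", "service": "ftp", "class": "anomaly"},
]
best = max(["flag", "service"], key=lambda a: gain_ratio(rows, a))
print(best)  # prints "flag"
```

A tree is grown by applying this choice recursively to each resulting subset until the records in a node share one class or no attributes remain.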

PREDICTION

• Prediction is the process of predicting attacks in the network from the dataset.
• This project aims to predict attacks from the dataset effectively, improving the overall prediction results.

In the above screen, click the ‘Upload NSL KDD Dataset’ button and upload the dataset.
A Detailed Investigation and Analysis of using Machine Learning Techniques for Intrusion Detection
In the above screen I am uploading the ‘intrusion_dataset.txt’ file; after uploading the dataset we get the screen below.
Now click the ‘Pre-process Dataset’ button to clean the dataset: string values are removed and attack names are converted to numeric values.
After pre-processing, all string values are removed and the string attack names are converted to numeric values: a normal signature gets id 0 and an anomaly (attack) signature gets id 1.
Now click on ‘Generate Training Model’ to split the data into train and test sets and generate a model for prediction using SVM and ANN.
In the above screen we can see that the dataset contains 1244 records in total, of which 995 are used for training and 249 for testing.
Now click on ‘Run SVM Algorithm’ to generate the SVM model and calculate its accuracy.

In the above screen we can see that SVM achieved 84.73% accuracy. Now click on ‘Run ANN Algorithm’ to calculate the ANN accuracy.
In the above screen ANN achieved 96.88% accuracy. Now click the ‘Upload Test Data & Detect Attack’ button to upload test data and predict whether each test record is normal or contains an attack. The test data has no class label (0 or 1); the application predicts it and gives us the result. Some records from the test data are shown below.
The above test data has no ‘0’ or ‘1’ label; the application will detect the class and give us the result.
In the above screen I am uploading the ‘test_data’ file, which contains the test records; after prediction we get the results below.
In the above screen, each test record is predicted as either ‘Normal Signatures’ or ‘infected’. Now click the ‘Accuracy Graph’ button to see the SVM and ANN accuracy comparison as a graph.
From the above graph we can see that ANN achieved better accuracy than SVM. The x-axis shows the algorithm name and the y-axis shows the accuracy of that algorithm.
Thank You…
