
SCHOOL OF INFORMATION TECHNOLOGY

Network and Information Security


SLOT: B2+TB2

Faculty: Prof. Shantharajah S P

Submitted by:
Aditya Kumar (18BIT0235)
Problem Statement
To implement a smart (dynamic) HoneyPot and Intrusion Detection System using
Machine Learning.

Machine Learning is commonly used to identify patterns, which lets a computer
detect suspicious traffic dynamically and thereby helps protect critical
systems.

HoneyPot
A honeypot is a computer system that acts as a decoy, fooling cybercriminals into
thinking it is a legitimate target. It holds fabricated data that mimics the original
system, so attackers let their guard down once they are in. The honeypot is then used
to systematically track every action of the attackers and gather as much data about
them as possible without alerting them.
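
To make the decoy idea concrete, here is a minimal sketch of a low-interaction honeypot, assuming a single TCP service on the local machine. The banner text, the function names `serve_decoy` and `start_decoy`, and the log fields are all hypothetical and chosen only for illustration; a real deployment would need isolation, persistent logging, and far more realistic service emulation.

```python
import socket
import threading
from datetime import datetime, timezone

# Fabricated banner shown to visitors; the text is purely illustrative.
FAKE_BANNER = b"220 files.example.internal FTP server ready\r\n"


def serve_decoy(server_sock, log, max_connections=1):
    """Accept connections on a decoy socket and record what each peer sends."""
    for _ in range(max_connections):
        conn, (peer_ip, peer_port) = server_sock.accept()
        with conn:
            conn.settimeout(2.0)
            conn.sendall(FAKE_BANNER)        # pretend to be a real service
            try:
                payload = conn.recv(4096)    # first bytes the visitor sends
            except socket.timeout:
                payload = b""
            log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "peer_ip": peer_ip,
                "peer_port": peer_port,
                "payload": payload,
            })


def start_decoy(host="127.0.0.1", port=0, max_connections=1):
    """Start the decoy in a background thread; return (port, log, thread)."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_sock.bind((host, port))           # port 0 lets the OS pick a free port
    server_sock.listen()
    bound_port = server_sock.getsockname()[1]
    log = []

    def run():
        with server_sock:
            serve_decoy(server_sock, log, max_connections)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return bound_port, log, thread
```

Calling `start_decoy()` returns the port the decoy is listening on and a list that fills with one record per visitor. The decoy never grants access to anything; it only answers with the banner and records the first bytes it receives.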

Proposed Idea

Integration of Machine Learning with Honey-Potting


Many approaches have been proposed in the area of network security, but they still struggle
to handle newer malware. System security software such as firewalls and antivirus programs
often fails to detect previously unseen malicious software.

Combining the analytical power of Machine Learning with various honeypot techniques may
enable dynamic identification and detection of malware and other security threats.

Methodology
The complete honeypot and machine learning system will be kept entirely separate from the
actual critical system.

The design consists of network components such as wireless routers, the honeypots, and the
actual system. An attacker tries to penetrate the network, but because the honeypot is in
place the actual system is not affected. All traffic from the external network comes directly
into the internal network.

The honeypot is responsible for capturing the traffic packets that have entered the internal
network and storing them in a virtual network.
The router forwards packets from the external network into the internal network and then on
to the honeypot, as intended. The classification process is carried out in the steps below:

• Each record is classified as either malicious or harmless. Because the dataset contains
unlabeled data, we first train a model on the labeled data and use it to predict labels for
the unlabeled records. We call these predictions pseudo data: they are not completely
accurate, but they are derived from the database of previous attack data.

• We then take these predictions and label each unlabeled record with the output that was
predicted for it.

• We then train our model on the full dataset, which now consists of both the truly labeled
data and the pseudo data.

• Finally, the extracted malware is used for training and is monitored manually to check for
errors. This step requires a cybersecurity expert.
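
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nearest-centroid classifier is a stand-in chosen so the example needs no external libraries, and the two features (connections per minute, failed logins) and all sample values are hypothetical rather than taken from any real attack dataset.

```python
from math import dist


def centroid(rows):
    """Mean of a non-empty list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit(features, labels):
    """Fit a nearest-centroid model: one mean vector per class label."""
    by_label = {}
    for x, y in zip(features, labels):
        by_label.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in by_label.items()}


def predict(model, features):
    """Assign each vector the label of its closest class centroid."""
    return [min(model, key=lambda y: dist(model[y], x)) for x in features]


def pseudo_label_train(labeled_x, labeled_y, unlabeled_x):
    # Step 1: train on the truly labeled records only.
    base_model = fit(labeled_x, labeled_y)
    # Step 2: predict a pseudo-label for every unlabeled record.
    pseudo_y = predict(base_model, unlabeled_x)
    # Step 3: retrain on the true labels plus the pseudo-labels.
    final_model = fit(labeled_x + unlabeled_x, labeled_y + pseudo_y)
    # The pseudo-labels are returned so that an expert can review them.
    return final_model, pseudo_y


# Hypothetical features: [connections per minute, failed logins].
labeled_x = [[2.0, 0.0], [3.0, 1.0], [40.0, 9.0], [55.0, 12.0]]
labeled_y = ["harmless", "harmless", "malicious", "malicious"]
unlabeled_x = [[4.0, 0.0], [48.0, 10.0], [1.0, 1.0]]

model, pseudo_y = pseudo_label_train(labeled_x, labeled_y, unlabeled_x)
```

Returning `pseudo_y` alongside the model keeps the manual review step possible: a cybersecurity expert can inspect and correct the predicted labels before the final model is trusted.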

The malware detection system carries out the learning process for the classification and
analysis of the data.
The above diagram shows the basic design of the honeypot and the analysis mechanism.
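
Before a detection system can learn from captured traffic, each record has to be turned into numbers. The following is a rough sketch of one way that might look, assuming the honeypot stores one dictionary per connection. The field names (`peer_ip`, `dst_port`, `payload`), the function `extract_features`, and the three features are hypothetical; the proposal does not specify a record format or feature set.

```python
from collections import defaultdict


def extract_features(records):
    """Group honeypot records by source IP and compute simple numeric features."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["peer_ip"]].append(rec)

    features = {}
    for ip, recs in grouped.items():
        total_bytes = sum(len(r["payload"]) for r in recs)
        features[ip] = [
            float(len(recs)),                           # number of connections
            float(len({r["dst_port"] for r in recs})),  # distinct ports probed
            total_bytes / len(recs),                    # mean payload size in bytes
        ]
    return features


# Hypothetical captured records, one per connection.
records = [
    {"peer_ip": "10.0.0.5", "dst_port": 21, "payload": b"USER"},
    {"peer_ip": "10.0.0.5", "dst_port": 22, "payload": b""},
    {"peer_ip": "10.0.0.5", "dst_port": 23, "payload": b"rootroot"},
    {"peer_ip": "10.0.0.9", "dst_port": 80, "payload": b"GET / HTTP/1.1\r\n"},
]

features = extract_features(records)
```

Each source address ends up with one fixed-length vector, which is the shape a classifier that separates malicious from harmless traffic would expect as input.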

Future Work

As I am already working on a Machine Learning based IDS for the J-Component, I plan to
implement this idea using virtual networks in the AWS Free Tier.

References:
http://www.refworks.com/express/expressimport.asp?vendor=IEEE&filter=RefWorks%20Tagged%20Format&database=&url=https%3A%2F%2Fieeexplore.ieee.org%2Fxpl%2FrefWorkGen%3Farnumber%3D7155011%26dlSelect%3Dcite%26fileFormat%3DRefWork&encoding=65001
