A REPORT
SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE AWARD OF THE DEGREE
OF
BACHELOR OF ENGINEERING
IN
DIVISION OF INFORMATION TECHNOLOGY
Submitted by:
RISHABH SETIYA 2018UIT2597
SRIRAM M. PANT 2018UIT2623
AYUSH GOEL 2018UIT2582
University of Delhi
Delhi-110007, India
We, (Rishabh Setiya Roll No. 2018UIT2597, Sriram M. Pant Roll No. 2018UIT2623 and
Approach to Detect and Avoid Sources of Misclassified Cyber Attack Data” which is
Technology, Delhi (University of Delhi) in partial fulfillment of the requirement for the
award of the degree of Bachelor of Engineering, is original and not copied from any
source without proper citation. This work has not previously formed the basis for the
Place: Delhi
Date: 20 December 2021
CERTIFICATE
University of Delhi
Delhi-110007, India
This is to certify that the work embodied in the Project-Thesis titled “A Federated
Learning Approach to Detect and Avoid Sources of Misclassified Cyber Attack Data” has
been completed by Rishabh Setiya Roll No. 2018UIT2597, Sriram M. Pant Roll No.
requirements for the award of the degree of Bachelor of Engineering. This work has not
ABSTRACT
Federated learning is a form of machine learning in which the training data stays at the source
while the model is trained on data arising from a number of sources. The model
is communicated between a central server and the data source nodes, which train it
independently. The resulting models are averaged, weighted by the amount of data available at each
source.
In such a scenario, it is possible that some of the data sources provide mislabelled data
which could be due to a deliberate attempt to tamper with the learning process or inability
to correctly label the data. Our attempt is to identify such sources and gradually reduce
We assign a trust parameter to each of the data sources involved in training. Then we run a
single epoch of training on each of the sources in a loop. A test dataset is available at the
central server. It is used for checking whether the previous epoch had a constructive or
destructive effect on the model, that is, whether the accuracy of the model improved or
degraded on the test dataset. If the effect was constructive, the trust parameter of that
source is increased, and if it was destructive, the trust parameter is decreased.
The expectation from this experiment is that the trust parameter of malicious sources
would eventually drop to nil and the trust parameter of trustworthy sources would keep
increasing.
The dataset used in this project is an intrusion detection dataset which contains labelled
data corresponding to “Benign” and “Attack” classes. The model that we use for
LIST OF CONTENTS
DECLARATION
CERTIFICATE
ABSTRACT
LIST OF CONTENTS
LIST OF FIGURES
1.3 AUTOENCODERS
1.5.5 BOTNET
CHAPTER 3 METHODOLOGY
3.1.1 TENSORFLOW
3.1.2 PANDAS
3.1.3 MATPLOTLIB
3.1.4 SCIKIT-LEARN
3.4 SIMULATION
5.1 CONCLUSION
REFERENCES
PLAGIARISM REPORT
APPENDIX
LIST OF FIGURES
1.2 Convolutional layer having filter size 3x3 and input size 5x5
CHAPTER 1 INTRODUCTION
It's a struggle to design safe networks and systems for everyday usage as the
world becomes increasingly reliant on computers and automation [1]. With the
based on data [2]. With more data and information available than ever before,
it's more critical than ever to properly analyse and evaluate it. There exists
are constantly coming up with new exploits and attack methods to bypass your
security.
of devices.
These attacks can sometimes be identified from the flow level data, which is
information summarized from a number of packets. This flow level data can
However, there are concerns of privacy, as this flow level data may be
confidential to the organization to which the computer belongs. This issue can
paradigm in which the training data does not move from one machine to
another but the updates to model weights are sent to the centralized server.
Nevertheless, there may be machines on which the data does not get correctly
network may deliberately tamper with the data so that the model does not gain or
The aim of this project is to identify such malicious actors and reduce their
ability to damage the model. If these actors later turn helpful, their
training ability should be restored so that their data also helps in
nature.
• keeps actual data secure and unavailable to any party except the
source itself
Neural networks [4] are a subset of machine learning inspired by the
prediction of future from past trends in areas like stock market, decision
Fig 1.1: Architecture of a feed-forward deep neural network
In feed-forward neural networks [5], the inputs enter through the first layer,
propagate in the forward direction and the last layer produces the outputs. Each
neuron sums up the inputs coming into it and outputs the result after applying
For the neural network to learn, it must modify its weights to produce the
expected results on the output layer. In the beginning, however, the outputs
The difference between the previous output and the expected output propagates
in the backward direction and depending on this error the weights of each layer
get modified slightly. This process is repeated until the error is less than a
threshold.
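The learning loop described above can be illustrated with a minimal sketch. It assumes a single sigmoid neuron trained on the logical OR pattern, which is far simpler than the networks used in this thesis; the function name `train_neuron`, the learning rate and the error threshold are illustrative choices only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, lr=0.5, threshold=0.01, max_epochs=10000):
    """Train one sigmoid neuron with gradient descent until the mean
    squared error falls below `threshold` (or max_epochs is reached)."""
    w = [0.0, 0.0]
    b = 0.0
    error = float("inf")
    for _ in range(max_epochs):
        error = 0.0
        for (x1, x2), target in samples:
            out = sigmoid(w[0] * x1 + w[1] * x2 + b)  # forward pass
            diff = out - target                       # error at the output
            grad = diff * out * (1.0 - out)           # error passed back through the sigmoid
            w[0] -= lr * grad * x1                    # small weight corrections
            w[1] -= lr * grad * x2
            b -= lr * grad
            error += diff * diff
        error /= len(samples)
        if error < threshold:                         # stop once the error is small enough
            break
    return w, b, error

# Logical OR: the output is 1 when at least one input is 1.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias, final_error = train_neuron(OR_DATA)
```

After training, `sigmoid(weights[0] * x1 + weights[1] * x2 + bias)` rounds to the expected OR output for each of the four input pairs.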
For many cases, a single hidden layer is sufficient to give good results. In this
Fig 1.2: Convolutional layer having filter size 3x3 and input size 5x5
The simple feed-forward neural networks may give a decent result on some
simple binary patterns like handwritten characters but they do not generalize
slides over the input matrix. The element-wise products are summed up to give
The number of columns that the filter shifts towards the right in each step and then
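The sliding-filter operation can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: a square input, a square filter, no padding, and a hypothetical helper named `conv2d`. The `stride` argument is the amount by which the filter shifts in each step.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for i in range(0, out_size * stride, stride):
        row = []
        for j in range(0, out_size * stride, stride):
            total = 0
            for di in range(k):
                for dj in range(k):
                    # element-wise product of the filter and the patch under it
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

image = [[r * 5 + c for c in range(5)] for r in range(5)]  # 5x5 input holding 0..24
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]                 # 3x3 filter that picks the centre cell
feature_map = conv2d(image, kernel)                        # 3x3 output for stride 1
```

With a 5x5 input and a 3x3 filter, a stride of 1 gives a 3x3 output, and a stride of 2 gives a 2x2 output.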
Another layer used in CNNs is known as pooling layer. Each cell of output of
input.
The last few layers of CNNs are fully connected. This means that the 2D matrix
is converted to a 1D array and the rest of the operations are similar to simple
RNNs [7] are used for sequential or time series data. The output of the previous
step is also fed as input to every neuron in the recurrent layer. The idea
autoencoder.
1.3 AUTOENCODERS
consists of as many neurons as the input layer and the model is trained to
This means that the input features and the expected outputs are the same. The
is same or somewhat similar to the patterns on which the model was trained.
encoder part followed by a decoder part. The difference between the original
pattern and output of the network is used to calculate the error known as
reconstruction error.
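As a small illustration, the reconstruction error of one pattern can be computed as below. The sketch assumes the root mean square error as the measure, since RMSE is the error mentioned elsewhere in this thesis; the function name and the sample vectors are made up for the example.

```python
import math

def reconstruction_error(original, reconstructed):
    """Root mean square error between a pattern and its reconstruction."""
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return math.sqrt(sum(squared) / len(squared))

pattern = [0.2, 0.4, 0.6, 0.8]
close_copy = [0.2, 0.4, 0.6, 0.8]   # what a well-trained model might return
poor_copy = [0.8, 0.6, 0.4, 0.2]    # what it might return for an unfamiliar pattern

low_error = reconstruction_error(pattern, close_copy)
high_error = reconstruction_error(pattern, poor_copy)
```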
A low value indicates that the pattern is similar to training patterns while a
There are three components in an autoencoder:
• Bottleneck: This is the layer that holds the compressed form of the input
data. It is the layer in which the input data is reduced to the lowest number
of dimensions.
• Decoder: In this, the model recreates the data from the encoded form to
into a set of basic signals and then attempts to rebuild the input from all of
those signals. We can also utilize CAE [9] to alter the image geometry or to
2. Variational Autoencoders (VAEs): New images can be produced by using
of the latent vector often matches the training data much better than a
added to the source images [11] and then noise removal learning is performed.
As a result, rather than duplicating the input to the output, data features are
recover the original unaltered input. To cancel out the added noise, the model
learns a vector field that maps the input data towards
a lower-dimensional manifold that reflects the natural data. This is how by
identifying the most relevant features, the encoder will learn a more robust
traditional approach, data from all the sources is aggregated on a single server
and then the model is trained. In federated learning, the current model at any
The training happens at the source itself. The data does not leave the sources
at any time.
Only the weight updates are sent back to the aggregation server. Homomorphic
encryption is used to ensure that the updates cannot be used to infer statistics
about the edge device data. The aggregation server updates the weights.
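A common aggregation rule in federated learning weights each source's update by the amount of data it was trained on. The sketch below illustrates only that weighting; the function name `federated_average` and the numbers are illustrative assumptions, and the encryption of updates is left out.

```python
def federated_average(updates, sample_counts):
    """Average weight updates, weighting each source by its sample count."""
    total = sum(sample_counts)
    size = len(updates[0])
    averaged = [0.0] * size
    for update, count in zip(updates, sample_counts):
        for i in range(size):
            averaged[i] += update[i] * count / total
    return averaged

updates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one weight update per source
sample_counts = [100, 300, 600]                 # samples each source trained on
global_update = federated_average(updates, sample_counts)
```

Here the third source contributes 60% of the aggregated update because it holds 60% of the samples.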
The update from each source is weighted by the amount of data that is used
While theoretically viable, this architecture would not have been practical in
for computing heavy tasks starting with Huawei’s Mate, Google’s Pixel, and
This method of learning poses another big challenge. The model may contain
a large number of weights. Downloading and uploading all the weight changes
Often there is a large gap between the download and upload speeds. The
download speed could be as high as 100 Mbps while the upload speed may be
Uploading of the entire model from the sources to the server is often the
preservation of privacy. The owners of data may not agree to share their data
data privacy. In addition to delivering an update to the shared model, the
for example, might create a model for predicting the next word on a virtual
keyboard.
Learning.
In many circumstances, the app will have already saved this information
5. Model Evaluation: After the tasks have been properly trained, the
the models are distributed to held-out clients for assessment on their local
6. Model Deployment: Eventually, after choosing a good model, it follows
inspection, live A/B testing (where the updated model is used on various
bad behaviour may be identified and corrected before it affects too many
launch method, which is usually unrelated to how the model was trained.
Put another way, this phase would apply to either a model built
technique.
gain access to the information stored on the computer, to intercept the data
flowing to and from the system or to simply stop the machine from functioning.
The abilities of attackers are growing along with the progress in hardware
Smaller and simpler passwords can be guessed in a few minutes while longer
passwords with a mix of upper- and lower-case letters could take many years to guess.
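The difference in guessing time follows from the size of the search space, which a short calculation can make concrete. The guessing rate of one million attempts per second and the two password shapes below are hypothetical values chosen only for illustration.

```python
def worst_case_seconds(alphabet_size, length, guesses_per_second):
    """Time needed to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second

RATE = 1_000_000                    # hypothetical guesses per second
SECONDS_PER_YEAR = 365 * 24 * 3600

short_lower = worst_case_seconds(26, 6, RATE)    # 6 lower-case letters
long_mixed = worst_case_seconds(52, 12, RATE)    # 12 upper- and lower-case letters
long_mixed_years = long_mixed / SECONDS_PER_YEAR
```

At the assumed rate, the six-letter lower-case space is exhausted in roughly five minutes, while the twelve-letter mixed-case space would take on the order of ten million years.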
Brute force attacks are of different kinds depending on what passwords are
attempted. Some attacks use dictionary words while others use logical guesses
also be used.
Servers have a threshold on the number of requests they can serve at a time.
If the number of requests grows too much then the server would take a lot of
time to serve the clients which may frustrate them to leave the website and go
In extreme cases, the requests would result in timeout and not get served at
all.
Attackers try to disrupt the functioning of a server by sending a large number of
If the malicious requests are sent from a number of computers, then the attack
Websites which accept user input in the form of blog posts, comments or other
types of forms may be vulnerable to cross-site scripting. The attacker can insert
This code could be used to change the content that is displayed on the website
on users’ computers. This allows the attacker to do any action on the website
Fig 1.14 Representation of Cross Site Scripting attack
fetching some data from a database based on a user’s query entered in a text
The application uses the user’s input to construct an SQL query which is
then they may collect information which they are not authorized to access. It may
even be possible to delete entire tables from the database using this attack.
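The mechanism can be demonstrated with Python's built-in `sqlite3` module. The table, the column names and the two lookup functions below are hypothetical and exist only for this illustration; the first function builds the query by pasting the user's input into the SQL text, while the second passes the input as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a-secret"), ("bob", "b-secret")])

def lookup_unsafe(name):
    # Vulnerable: the user's input becomes part of the SQL statement itself.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_safe(name):
    # Parameterized: the input is passed as data and is never parsed as SQL.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()

malicious_input = "nobody' OR '1'='1"
leaked = lookup_unsafe(malicious_input)   # the injected OR clause matches every row
blocked = lookup_safe(malicious_input)    # no user has this literal name
```

The injected text turns the condition into one that is always true, so the unsafe lookup returns every row, while the parameterized lookup returns none.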
1.5.5 Botnet
The term is a short form for robot-network. The attacker infects a large number
of computers and uses the whole network of such computers to carry out
IoT devices like set top boxes, cameras, voice assistant speakers, smart
watches etc may also be used as zombies. These botnets could be used for
mining cryptocurrency, launching DDoS attacks or recruiting more zombies
the effect is not easily observable because the devices appear to function
It helps them find out which devices are vulnerable to attacks and which ones
It tells them which services are running on the device, their versions and
CHAPTER 2 RELATED WORK
on a federated learning model it sends the model to the cloud itself rather
than sharing the user’s personal data with the cloud, which may create privacy
issues for the user. The algorithm also increments the weights of
particular high-priority devices and avoids updating the weights
of the remaining devices, depending on a threshold value that
determines whether a device is important for the cloud server. This
reduces irrelevant updates, thereby reducing the bandwidth
and extra computational cost required for sending those parameters to the
server. The cloud server starts updating the model parameters on receiving
model updates from multiple client devices without waiting for the rest of the
client devices whose models are still processing the data. This is because the
aggregating and updating the parameters of the global model. Along with that
from GRU (Gated recurrent unit) and SVM (Support Vector Machine) instead
of using SGD (Stochastic Gradient Descent), and is relatively faster and uses
more accurate compared to the other centralized learning algorithms and is
algorithms.
find out the optimum model required for their IDS research. Hindy et al. [15]
unknown flaw or bug in the system which has not been discovered by any
control of the system or steal data from the system such as user credentials.
These attacks pose a serious challenge for many other machine learning
detected threats for identifying current threats which may be attacking the
network. Hence, they suffer from many false negatives due to the increasing
number and variety of new threats to the system which may remain undetected.
The authors are aiming to resolve this issue by building a better IDS model
using an autoencoder model which can detect zero-day attacks a lot faster than
For demonstrating the efficiency of their autoencoder model they have also
compare the results along with using two different datasets one of which is
Now there are some differences between the CICIDS2017 and NSL-KDD
common cyber-attacks such as Heartbleed, SSH & FTP Brute Force, DDOS
and many more which is suitable to represent real world attacks as on daily
many types of attacks a network has to face whereas NSL-KDD covers only
benign traffic, Denial of Service (DoS), probing, Remote to Local (R2L) and
User to Root (U2R). Before using the datasets for training and validation, the
correlated features are dropped from the datasets to reduce model instability
and flow-based data are used, as they are better suited for the IDS.
During the scaling and selection of the features only benign data was selected
The models are then trained on both datasets, with 75% of the normal data
used for training and the rest for testing. The results show that
class SVM showing an accuracy of 89-99% for the NSL-KDD dataset and 75-
as previous research weren’t able to give good enough results using the
bias and variance present in the model and is best suited for weak learners such
as decision trees.
Therefore, the authors’ goal is to reduce the imbalance present in the training
dataset using SMOTE, which generates synthetic samples for the minority class
and thus improves the sensitivity of the model towards the minority class, and to select
important features from the new dataset with the help of PCA and ensemble
feature selection, which reduces the dimensionality of the dataset while
limiting data loss. The researchers have thus found their
whether the overlap between anomaly and benign data flow in the network can
be eliminated. He has also used the CICIDS2017 dataset due to its wide range
of attacks. Also, during the pre-processing part of the dataset, the time stamps
of all the flow data have been modified to include even microseconds to make
the data more precise. The author has used an autoencoder along with ReLU
(Rectified Linear Unit) as activation function and RMSE (Root Mean Square
in autoencoder model and found out that on increasing number of neurons the
RMSE value decreased but for both benign and malicious data they changed
and on some specific substructure of the data. However, the test results
showed that an overlap still existed between the benign and malicious
traffic flows, suggesting that some overlap between the two will always
exist no matter how the data is labelled.
performance of all the combinations on the basis of four parameters time,
Evaluator with Naive Bayes, Classifier Subset Evaluator with J48, and
algorithms include REPTree and OneR. For training and evaluation, the same
dataset CICIDS-2017 is used. The software tool for data analysis and
investigation was WEKA. As a result, they found that feature selection
reduced the dataset size and time and gave high performance. For Port Scan
Web Attack, and DDoS Attack, the REPTree classification algorithm with
Evaluator with Naive Bayes features selection technique performs best for
infiltration attack.
Farukee et al. [19] proposed a model for DDoS attack detection in IoT
networks using a one-dimensional convolutional neural network (1D CNN) and
Training and evaluation take place on the same data set which is CICIDS2017.
The main motive for the proposed work was to detect the DDoS attack as soon
obtained was 99.63%. They also concluded that the feature selection approach
interpretability.
CHAPTER 3 METHODOLOGY
3.1.1 TensorFlow
TensorFlow, is ongoing.
takes care of the finer details behind the scenes. Keras is the high-level API
for TensorFlow.
Each layer has a single input tensor and a single output tensor. We have used
3.1.2 Pandas
or an SQL Table. The columns are unidimensional arrays and each column
This library gives us the capability to read a CSV file containing column names
3.1.3 Matplotlib
Matplotlib is the most widely used library for plotting graphs in Python [23].
It plots two-dimensional graphs such as a simple line plot between two arrays of
numbers, scatter plot, bar graph, step graph,
histogram, box plot, pie chart and contours. It can also convert a three-dimensional
array to an image by using the data as RGB values for the colour
of each pixel.
3.1.4 Scikit-Learn
processing of data.
It also allows us to train neural networks but does not offer GPU support, which
limits its usefulness for large-scale applications. The customization which
In this thesis, we use this library for pre-processing of data using Min-Max
scaler.
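Min-max scaling maps every feature to the range 0 to 1 using (x - min) / (max - min), which is the formula scikit-learn's `MinMaxScaler` applies per feature with its default settings. The plain-Python sketch below only illustrates that formula; the helper names and the sample rows are hypothetical, and the thesis itself uses the library implementation.

```python
def fit_min_max(rows):
    """Record the per-column minimum and maximum of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def transform_min_max(rows, mins, maxs):
    """Map each feature to [0, 1] using (x - min) / (max - min)."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0   # constant columns map to 0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return scaled

train = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
mins, maxs = fit_min_max(train)              # learned from the training data only
scaled_train = transform_min_max(train, mins, maxs)
```

The minimum and maximum are learned once and then reused, so that data seen later is scaled in the same way as the training data.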
3.2 PROPOSED ALGORITHM
runs on the machine that aggregates the updates and computes the global model
and the edge algorithm which runs on the devices which provide the data for
training.
The aggregation server stores a database of all the edge devices which are
interested in training the global model. It assigns a default value to each of them and we
will refer to this value as the trust of that device. As we need to evaluate the
impact of the update from each device, we cannot aggregate all the updates and
compute their weighted average based on the amount of data. Instead, we fix the
amount of data that would be used for training by each device in a single epoch.
The server sends the current model, along with the
amount of data that should be used for training, to the edge devices. The server
then receives the updates to weights from the edge devices, scales them by
the old value of trust and applies these updates to the global model. Then it
evaluates the impact of this update by computing the accuracy of the model
on a dataset which is kept for this purpose. If the impact is positive (accuracy
increases), the trust of the device which sent that update is incremented. In
case the impact turns out to be negative (accuracy decreases), the value of trust
The server then moves to the next edge device and the above-mentioned
The process is repeated with all the edge devices in the database cyclically. If
any new device wants to join the system, it will be assigned the default
value of trust and added at the end of the database.
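The per-device step of the server algorithm can be sketched as follows. This is a minimal illustration, assuming geometric changes to trust; the factors 1.1 and 0.9 are hypothetical, while the bounds 0.1 and 5 are the values reported in Chapter 4. The function names are made up for the example, and whether a destructive update is rolled back is not covered here.

```python
def scale_update(update, trust):
    """Scale a device's weight update by that device's current trust."""
    return [trust * u for u in update]

def update_trust(trust, old_accuracy, new_accuracy,
                 up=1.1, down=0.9, lower=0.1, upper=5.0):
    """Raise trust after a constructive update and lower it after a
    destructive one, keeping the value between the two bounds."""
    if new_accuracy > old_accuracy:
        trust *= up        # constructive effect on the test dataset
    elif new_accuracy < old_accuracy:
        trust *= down      # destructive effect on the test dataset
    return min(upper, max(lower, trust))

# One step for a single device: scale its update, then revise its trust.
scaled = scale_update([1.0, -2.0], trust=0.5)
rewarded = update_trust(1.0, old_accuracy=0.80, new_accuracy=0.90)
penalised = update_trust(1.0, old_accuracy=0.90, new_accuracy=0.80)
```

In a full round the server would add the scaled update to the global weights, measure the accuracy on its held-out dataset and pass the accuracies before and after to `update_trust`.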
The upper bound and lower bound in this algorithm have their own
significance.
In the absence of any upper bound, the trust would keep increasing and the updates
to weights would grow so large that the model would oscillate from one
suboptimal state to another but would never arrive at the
optimal state. Later on, the updates will be so large that the model would be
The lower bound is necessary to stop the trust from reaching a negative value
to grow reasonably when it has correctly labelled data which would improve
An edge device receives a request from the aggregation server to train the
model. The input that it receives contains the current global model as well as
the amount of data that it should use for training. Alternatively, it already
knows the amount of data required and sends a request to the aggregation
server for allowing it to join the epoch of training the global model. The
amount of data is fixed to maintain uniformity among all the edge devices.
It uses the specified amount of data to train the model using backpropagation
and gradient descent. It sends back the update that it wants the aggregation
request for training from the server or it has generated the fixed amount of
data that it must use for training. Once that much data is available, it can itself
request the server to download the current model as it would like to train it.
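The edge-side behaviour can be sketched in the same spirit. The example below substitutes a linear model with a squared-error gradient step for the autoencoder and backpropagation used in this thesis, purely to keep the illustration short; the function name `local_update`, the learning rate and the data are hypothetical.

```python
def local_update(global_weights, samples, amount, lr=0.1):
    """Train a copy of the global model on `amount` local samples and
    return the change in weights; the data itself is never returned."""
    weights = list(global_weights)
    for features, target in samples[:amount]:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        # Gradient descent step for squared error on a linear model.
        weights = [w - lr * error * x for w, x in zip(weights, features)]
    return [new - old for new, old in zip(weights, global_weights)]

local_data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
delta = local_update([0.0, 0.0], local_data, amount=2)  # only two samples are used
```

Only `delta` is sent back to the aggregation server, and every device trains on the same fixed number of samples.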
way that all the benign samples have reconstruction error below it and all the
errors of benign and anomalous samples. Even if we take the threshold equal
to 0.9 times the maximum reconstruction error of the training data, we see that
almost all the samples get classified as benign. This is because the benign
dataset also has some samples which are highly dissimilar to the other samples
and hence have a huge reconstruction error, which is greater than the
Fig 3.1 RMSE distribution for Friday Morning Working Hours after training on benign data (Monday)
that 10% of the benign samples would also get classified as malicious.
reconstruction errors.
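One way to read the 10% figure is that the threshold is taken as a percentile of the reconstruction errors of benign data. The sketch below assumes the 90th percentile, computed with the nearest-rank method, and uses made-up error values; the function names are hypothetical and the exact rule used in the experiments may differ.

```python
def percentile_threshold(errors, percent=90):
    """Nearest-rank percentile: the smallest error value such that at
    least `percent` percent of the samples are at or below it."""
    ordered = sorted(errors)
    rank = (percent * len(ordered) + 99) // 100   # integer ceiling of percent * n / 100
    return ordered[max(rank, 1) - 1]

def classify(error, threshold):
    """Label a flow by comparing its reconstruction error to the threshold."""
    return "Attack" if error > threshold else "Benign"

# Made-up benign errors with one outlier, as described in the text.
benign_errors = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.06, 0.08, 0.90]
threshold = percentile_threshold(benign_errors)
flagged = [e for e in benign_errors if classify(e, threshold) == "Attack"]
```

With these values one benign sample in ten lies above the threshold, which is the price paid for a threshold that is not dominated by a single outlier.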
3.3 MATHEMATICAL JUSTIFICATION
are the model parameters. The aim of the model is to reach the optimal
location in one function. There are two guiding forces which modify the
variables. One force makes the model move towards the optimal location
the model towards the optimal location of the other function which
is undesired. The model is cyclically led by both the forces turn by turn.
Up to a few steps, the model feels that both the forces are friendly and
it trusts both of them. After a certain step it realizes that while one force
its destination when it is led by one force, its trust in that force starts
to fade away and the steps it takes in the direction indicated by that
force become smaller and smaller. It is able to see the destination
coming closer when led by the other force and its step size for being led
Fig 3.2 Initial view of mathematical description presented
In Figure 3.2, the circle at the start of the dotted arrow represents the initial
position of the model. The dotted arrow is the misguiding force. The triangle
The cross symbol represents the centre of attraction or optimal point of the
good guiding force. The hollow circle is the position of the model after it takes
a step in the direction of the misguiding force. This move decreases its distance
from both the centres. However, the model only knows its distance
from the correct destination and as this distance has decreased, its trust in the
The next figure, Figure 3.3, shows the turn of ‘good’ force. The distance
between the cross symbol and current position decreases once again and
Fig 3.3 The model is guided by ‘good’ force
Figure 3.4 shows the point where the distance of the model from its
optimal destination point increases when it is the turn of the ‘bad’ force.
At this point, the model realizes that it should not trust this force and
gradually the size of the step on this ‘bad’ force’s turn decreases until
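The two-force picture can be simulated in a few lines. The sketch below is a toy under stated assumptions: the coordinates of the two optimal points, the step rate and the trust factors 1.1 and 0.9 are hypothetical, and the trust bounds 0.1 and 5 are the ones used in the experiment of Chapter 4. The model only measures its distance from the desired optimum after every step.

```python
import math

GOOD = (10.0, 0.0)               # optimum of the desired function (hypothetical)
BAD = (5.0, 5.0)                 # optimum the misguiding force pulls towards (hypothetical)
RATE, UP, DOWN = 0.1, 1.1, 0.9   # hypothetical step rate and trust factors
LOWER, UPPER = 0.1, 5.0          # trust bounds

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def led_step(position, target, trust):
    """Move towards `target`, with the step scaled by the trust in that force."""
    return tuple(p + trust * RATE * (t - p) for p, t in zip(position, target))

def simulate(rounds=100):
    position = (0.0, 0.0)
    trust = {"bad": 1.0, "good": 1.0}
    targets = {"bad": BAD, "good": GOOD}
    for _ in range(rounds):
        for force in ("bad", "good"):           # the forces lead turn by turn
            before = distance(position, GOOD)   # the only distance the model knows
            position = led_step(position, targets[force], trust[force])
            after = distance(position, GOOD)
            factor = UP if after < before else DOWN
            trust[force] = min(UPPER, max(LOWER, trust[force] * factor))
    return position, trust

final_position, final_trust = simulate()
```

In this toy the first few steps led by the misguiding force also reduce the distance to the desired optimum, so its trust rises at first; once its steps start increasing that distance, its trust decays to the lower bound while the trust in the good force reaches the upper bound.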
3.4 SIMULATION
The dataset used in this project is the CIC-IDS-2017 Dataset [24], an
days from Monday through Friday and has been available to be used by
The Monday CSV file represents the data for the first day of the week
between Tuesday and Friday CSV files. These are Brute-force attacks
Each record of the CSV files is the traffic flow in the network.
What is a traffic flow? It is the sequence of IP packets passing an
comprehensive as the packet data but is more than sufficient for keeping
the help of a mirrored port on the primary switch which was saving all
the data to a PCAP file. The CICFlowMeter tool has been used to create
bidirectional flows and the calculated features from the PCAP files.
Some of the features of the flow data are source IP address, source port,
The attack network and victim network used for creating the datasets
had all the required equipment like router, firewalls, switches, hubs and
Ubuntu.
imbalanced.
The features are scaled using the min-max scaler algorithm from scikit-learn
have been reduced using principal component analysis. The first 20
vectors would have been enough to cover more than 99% of the variance in the data.
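The number of components to keep can be read off the cumulative share of variance. The sketch below assumes that "covering" the data refers to the fraction of total variance explained by the leading components; the function name and the variance values are made up and are not taken from the dataset.

```python
def components_needed(variances, coverage=0.99):
    """Smallest number of leading components whose combined share of the
    total variance reaches `coverage`."""
    total = sum(variances)
    running = 0.0
    for count, variance in enumerate(sorted(variances, reverse=True), start=1):
        running += variance
        if running / total >= coverage:
            return count
    return len(variances)

# Hypothetical variances along each principal component.
example_variances = [50.0, 30.0, 15.0, 4.5, 0.3, 0.2]
kept = components_needed(example_variances)
```

In this made-up example the first four of six components already account for more than 99% of the variance.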
The activation function used in the model is ‘ReLU’ which stands for
After which the model is compiled with Root Mean Square Error and
Adam optimizer.
efficient method used for optimizing the gradient descent even while
RMSProp algorithm.
We use two for loops for training. A single iteration of the outer loop
refers to one global epoch and a single iteration of the inner loop would
train the model for one epoch on a single source of data, compare
the result with the previous result, update the trust value of that source in
a dictionary which contains the trust values of all the sources and change
CHAPTER 4 RESULTS AND DISCUSSION
The following results were generated using 4 sources. The first 2 lakh (200,000) samples
from Wednesday data (which contains DoS attacks) were used as server data
for measuring accuracy. The next 2 lakh samples from Wednesday data were
used as the first source and the next 2 lakh samples were used as the third source. The
first 2 lakh records from Monday data (Benign) were used as the second source,
the next 2 lakh samples were used as the fourth source and the remaining 129918
errors from the data kept for setting threshold was used as the threshold.
Geometric increment and decrement were used with the multiplication factors
The upper bound for trust values was set to 5 and the lower bound was
set to 0.1. Figure 4.1 shows the result graphically. The trust value of
source 2 has reached the upper bound of 5 and stays constant after that.
The trust value of source 4 alternates above and below 1. The trust value
CHAPTER 5 CONCLUSION AND FUTURE WORK
5.1 CONCLUSION
Our focus in this thesis was to study the impact of assigning variable learning
patterns of malicious and benign flows, the model described here is not an
However, our idea of varying the trust parameter depending on the effect each
trained with a mix of benign and attack data, the overlap between the training
data and attack flows indeed increases and makes it tougher to identify the
We have explored only the effect of varying the learning rate based on trust
epochs on each dataset and the batch size can also be explored.
However, in cases where the anomalies occur without such deliberate attempts
and are similar in pattern to other anomalies, this method may give better
If a network simulator is used for the experimentation, instead of a simple
program that we have used, the challenges of a real network would get
analysis which could potentially reveal some information about the data
REFERENCES
[1] Stahl R (2017) Technology reliant society, has it gone too far?
(https://thesnapper.millersville.edu/index.php/2017/04/19/technology-reliant-society-
opinion)
[4] Dongare AD, Kharde RR, Kachare AD. Introduction to artificial neural
network. International Journal of Engineering and Innovative Technology
(IJEIT). 2012 Jul;2(1):189-94.
[5] Fine TL. Feedforward neural network methodology. Springer Science &
Business Media; 2006 Apr 6.
[7] Medsker L, Jain LC, editors. Recurrent neural networks: design and
applications. CRC press; 1999 Dec 20.
[9] Chow JK, Su Z, Wu J, Tan PS, Mao X, Wang YH. Anomaly detection of
defects on concrete structures with the convolutional autoencoder. Advanced
Engineering Informatics. 2020 Aug 1; 45:101105.
[11] Gondara L. Medical image denoising using convolutional denoising
autoencoders. In 2016 IEEE 16th International Conference on Data Mining
Workshops (ICDMW) 2016 Dec 12 (pp. 241-246). IEEE.
[12] Llanasas R (2020) How AI and machine learning are transforming mobile
technology (https://www.greenbook.org/mr/market-research-technology/how-ai-is-
transforming-mobile-technology/)
[13] Akbari Roumani M, Fung CC, Rai S, Xie H. Value analysis of cyber
security based on attack types. ITMSOC: Transactions on Innovation and
Business Engineering. 2016; 1:34-9.
[19] Farukee MB, Shabit MZ, Haque MR, Sattar AS. DDoS Attack Detection
in IoT Networks Using Deep Learning Models Combined with Random
Forest as Feature Selector. In International Conference on Advances in Cyber
Security 2020 Dec 8 (pp. 118-134). Springer, Singapore.
[23] Tosi S. Matplotlib for Python developers. Packt Publishing Ltd; 2009
Nov 9.
Bachelor Thesis Project
ORIGINALITY REPORT
Similarity index: 10%
Internet sources: 6%
Publications: 6%
Student papers: 3%
Publication
52
48
Hanane Azzaoui, Akram Boukhamla. "Two-
Stages Intrusion Detection System Based On
<1 %
Hybrid Methods", Proceedings of the 10th
International Conference on Information
Systems and Technologies, 2020
Publication
49
Mohamed Amine Ferrag, Othmane Friha,
Leandros Maglaras, Helge Janicke, Lei Shu.
<1 %
"Federated Deep Learning for Cyber Security
in the Internet of Things: Concepts,
Applications, and Experimental Analysis", IEEE
Access, 2021
Publication
50
"Computer Security – ESORICS 2017", Springer
Nature, 2017
<1 %
Publication
51
Hanan Hindy, Robert Atkinson, Christos
Tachtatzis, Jean-Noël Colin, Ethan Bayne,
<1 %
Xavier Bellekens. "Utilising Deep Learning
Techniques for Effective Zero-Day Attack
Detection", Electronics, 2020
Publication
53
54
APPENDIX
Sr. No | Feature Name | Description
20 | Flow IAT Min | Minimum time between two packets sent in the flow
21 | Fwd IAT Min | Minimum time between two packets sent in the forward direction
22 | Fwd IAT Max | Maximum time between two packets sent in the forward direction
23 | Fwd IAT Mean | Mean time between two packets sent in the forward direction
24 | Fwd IAT Std | Standard deviation of time between two packets sent in the forward direction
25 | Fwd IAT Total | Total time between two packets sent in the forward direction
26 | Bwd IAT Min | Minimum time between two packets sent in the backward direction
27 | Bwd IAT Max | Maximum time between two packets sent in the backward direction
28 | Bwd IAT Mean | Mean time between two packets sent in the backward direction
29 | Bwd IAT Std | Standard deviation of the time between two packets sent in the backward direction
30 | Bwd IAT Total | Total time between two packets sent in the backward direction
31 | Fwd PSH Flags | Number of times the PSH flag was set in packets travelling forward
32 | Bwd PSH Flags | Number of times the PSH flag was set in packets travelling backwards
33 | Fwd URG Flags | Number of times the URG flag was set in packets travelling forward
34 | Bwd URG Flags | Number of times the URG flag was set in packets travelling backwards
35 | Fwd Header Length | Total bytes used for headers in the forward direction
36 | Bwd Header Length | Total bytes used for headers in the backward direction
37 | Fwd Packets/s | Number of forward packets per second
38 | Bwd Packets/s | Number of backward packets per second
39 | Min Packet Length | Minimum length of a packet
40 | Max Packet Length | Maximum length of a packet
41 | Packet Length Mean | Mean length of a packet
42 | Packet Length Std | Standard deviation of packet length
43 | Packet Length Variance | Variance of packet length
44 | FIN Flag Count | Number of packets with the FIN flag
45 | SYN Flag Count | Number of packets with the SYN flag
46 | RST Flag Count | Number of packets with the RST flag
47 | PSH Flag Count | Number of packets with the PSH (push) flag
48 | ACK Flag Count | Number of packets with the ACK flag
49 | URG Flag Count | Number of packets with the URG flag
50 | CWE Flag Count | Number of packets with the CWE flag (the TCP CWR flag)
51 | ECE Flag Count | Number of packets with the ECE flag
52 | Down/Up Ratio | Download and upload ratio
53 | Average Packet Size | Average size of packets
54 | Avg Fwd Segment Size | Average size observed in the forward direction
55 | Avg Bwd Segment Size | Average size observed in the backward direction
56 | Fwd Avg Bytes/Bulk | Average bytes bulk rate in the forward direction
57 | Fwd Avg Packets/Bulk | Average packet bulk rate in the forward direction
58 | Fwd Avg Bulk Rate | Average bulk rate in the forward direction
59 | Bwd Avg Bytes/Bulk | Average bytes bulk rate in the backward direction
60 | Bwd Avg Packets/Bulk | Average packet bulk rate in the backward direction
61 | Bwd Avg Bulk Rate | Average bulk rate in the backward direction
62 | Subflow Fwd Packets | Average number of packets in a sub flow in the forward direction
63 | Subflow Fwd Bytes | Average bytes in a sub flow in the forward direction
64 | Subflow Bwd Packets | Average number of packets in a sub flow in the backward direction
65 | Subflow Bwd Bytes | Average bytes in a sub flow in the backward direction
66 | Init Win bytes fwd | Number of bytes sent in the initial window in the forward direction
67 | Init Win bytes bwd | Number of bytes sent in the initial window in the backward direction
68 | act data pkt fwd | Count of packets with at least 1 byte of TCP data payload in the forward direction
69 | Min seg size fwd | Minimum segment size observed in the forward direction
70 | Active Mean | Mean time a flow was active before becoming idle
71 | Active Std | Standard deviation of the time a flow was active before becoming idle
72 | Active Max | Maximum time a flow was active before becoming idle
73 | Active Min | Minimum time a flow was active before becoming idle
74 | Idle Min | Minimum time a flow was idle before becoming active
75 | Idle Mean | Mean time a flow was idle before becoming active
76 | Idle Max | Maximum time a flow was idle before becoming active
77 | Idle Std | Standard deviation of the time a flow was idle before becoming active
78 | Label | The target variable, ‘Benign’ or a specific ‘Attack category’
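
To make the inter-arrival time (IAT) features above concrete, the following is a minimal sketch of how the Min, Max, Mean, Std and Total values for one direction of a flow could be derived from packet timestamps. The function name `iat_stats`, the input list and the use of the population standard deviation are assumptions made for illustration; the tool that generated the dataset may differ in details such as time units or the standard deviation formula.

```python
from statistics import mean, pstdev


def iat_stats(timestamps):
    """Return inter-arrival-time statistics for one direction of a flow.

    `timestamps` is a hypothetical list of packet arrival times in seconds.
    """
    ordered = sorted(timestamps)
    # Inter-arrival time: gap between each packet and the one before it.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        # Fewer than two packets means there are no gaps to measure.
        return {"min": 0.0, "max": 0.0, "mean": 0.0, "std": 0.0, "total": 0.0}
    return {
        "min": min(gaps),
        "max": max(gaps),
        "mean": mean(gaps),
        "std": pstdev(gaps),  # assumption: population standard deviation
        "total": sum(gaps),
    }


# Four forward packets, giving gaps of 1, 2 and 3 seconds.
fwd = iat_stats([0.0, 1.0, 3.0, 6.0])
print(fwd)
```

Applying the same function to the backward packets would give the Bwd IAT features, and applying it to all packets of the flow would give the Flow IAT features.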