
9444 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 12, JUNE 15, 2022

A Taxonomy of Machine-Learning-Based Intrusion Detection Systems for the Internet of Things: A Survey
Abbas Jamalipour, Fellow, IEEE, and Sarumathi Murali, Member, IEEE

Abstract—The Internet of Things (IoT) is an emerging technology that has earned a lot of research attention and technical revolution in recent years. Significantly, IoT connects and integrates billions of devices and communication networks around the world for several real-time IoT applications. On the other hand, cybersecurity attacks on the IoT are growing at an alarming rate since these devices are vulnerable because of their limited battery life, global connectivity, resource-constrained nature, and mobility. When attacks on IoT networks go undetected within a speculated period, such security attacks may prompt severe threats and disruptive behavior inside the network and make the network unavailable to the end user. Hence, it is essential to design an intelligent and robust security approach that promptly detects potential attack surfaces in a dynamic IoT network. This article presents a comprehensive survey of machine learning, deep learning, and reinforcement learning-based intelligent intrusion detection techniques for securing IoT. Also, this article thoroughly illustrates the implementation of various categories of security threats in IoT with a neat diagram. Significantly, we classify the threats into two broad categories: 1) wireless sensor networks (WSNs) inherited security attacks and 2) routing protocol for low power and lossy networks (RPL) specific security attacks in IoT. Finally, we present potential research opportunities and challenges in intelligent intrusion detection approaches in future IoT security.
Index Terms—Deep learning (DL), Internet of Things (IoT), intrusion detection system (IDS), machine learning, reinforcement learning (RL), routing protocol for low power and lossy networks (RPL), and security attacks.

Manuscript received May 23, 2021; accepted October 28, 2021. Date of publication November 10, 2021; date of current version June 7, 2022. (Corresponding author: Abbas Jamalipour.)
The authors are with the School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW 2006, Australia (e-mail: a.jamalipour@ieee.org; sarumathi.murali@sydney.edu.au).
Digital Object Identifier 10.1109/JIOT.2021.3126811

I. INTRODUCTION
THE Internet of Things (IoT) is a growing technology that has earned a lot of research attention and a vast technical revolution in recent years [1]. IoT plays a significant role in smart energy-efficient automation in many exciting applications, such as smart home, robotics, wearables, healthcare, environmental surveillance, smart farming, etc. IoT demands low-power, resource-constrained devices that support mobility and global connectivity [2]. IPv6 over Low-power Wireless Personal Area Network (6LoWPAN) is a small-scale IoT network that consists of low-power devices with IPv6 connectivity [3]–[5]. Routing Protocol for Low-Power and Lossy network (RPL) is a standard routing protocol for 6LoWPAN-based IoT networks [4], [6]. RPL’s topology is flexible enough to form a network with an enormous number of IoT nodes with and without mobility. IoT facilitates a wide range of exciting applications that ensure a better human life; however, along with several advantages, it also brings plenty of security vulnerabilities associated with IoT devices. IoT devices are an apparent victim of security attacks because of their global connectivity, limited battery life, ad hoc nature, and mobility. Therefore, the network demands continuous surveillance and analysis to remain secure [7]–[9]. Also, while monitoring and preventing cyberattacks in IoT networks, it is essential to maximize the efforts to provide and continue the services, secure the delicate data, and prepare well in advance for any unexpected situations [10].
Existing security mechanisms assist in detecting some particular attacks, namely, sinkhole and wormhole attacks; however, they do not support much in identifying complex and more destructive large attack surfaces in the real IoT environment. It could be even worse if there is a combination of multiple attacks inside the network surface. Hence, it is essential to design an intelligent security mechanism capable of detecting the attacks even in a dynamic environment. Also, security mechanisms and intrusion detection approaches should be competent to identify attacks that are not yet known. Besides, there is a crucial demand now for classifying the attack category by deeply analyzing its behavior and the nature of its action. Further, while ensuring security, intrusion detection and prevention mechanisms deal with significant challenges because of the vast amount of data traffic exchanged in IoT networks [11], [12].
Such extremely large volumes of data are generally termed big data. Big data is a buzzword that includes methods that can extract extremely vital information from the massive amount of data traffic exchanged in the IoT networks. Moreover, not all the data traffic is necessary for further analysis and learning. The advancements in big data technology are well suited to extracting different patterns of legitimate and malicious packet behavior from the immense data traffic. However, conventional intrusion detection techniques have not been suitable to deal with big data to extract much potentially meaningful information. With its power to learn from the data set, machine learning is especially appropriate for the intricacies
2327-4662 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: EASWARI COLLEGE OF ENGINEERING. Downloaded on January 10,2023 at 04:10:00 UTC from IEEE Xplore. Restrictions apply.
JAMALIPOUR AND MURALI: TAXONOMY OF MACHINE-LEARNING-BASED INTRUSION DETECTION SYSTEMS 9445

that are very complex to be entirely explained or cannot be performed precisely. To analyze the massive data traffic produced in the IoT network, intelligent intrusion detection systems (IDS) could be a promising solution to determine and adapt to any dynamic changes occurring in the network [13]. Prominently, machine learning and deep learning (DL)-based intrusion detection approaches have attracted great attention for improving security in IoT networks. In machine learning, the intrusion problem is typically broken into small sections, these sections are solved one by one, and the overall solution is formed as an aggregate of the individual solutions. In DL, the problem is solved in an end-to-end manner. Machine learning requires a small amount of data for training and testing, but it gives lower accuracy. On the contrary, DL, a subset of machine learning, demands a large amount of data for training the machine, and it needs a long time for training, but it usually provides higher accuracy. Also, reinforcement learning (RL) and deep RL-based security techniques play major roles in enhancing the performances with high accuracy in a wide range of applications [11]. In the survey part of this article, we have thoroughly reviewed potential intelligent intrusion detection approaches for IoT security. Significantly, we present a detailed and comprehensive review of various machine learning, deep learning, and RL-based IDS approaches associated with IoT networks.
We summarize our contributions as follows.
1) This article contributes a detailed review of machine learning-based IDS, DL-based IDS, and RL-based IDS with its working principle, advantages, drawbacks, and application in the platform of IDS with a neat sketch for each technique.
2) This article could be a one-stop point where targeted researchers can find the most advanced and updated knowledge on current intelligent intrusion approaches, wireless sensor network (WSN)-inherited security threats, and RPL specific security attacks in IoT environments.
3) To help the researchers and any new learners better understand, we thoroughly illustrated the implementation of various categories of security attacks in IoT with a neat diagram.
The remainder of this article is organized as follows. Section II reviews the related previous research work. Section III examines the taxonomy of different IDS techniques in IoT networks based on the intrusion data source, detection techniques, and placement strategy. Section IV illustrates machine learning-based IDS for IoT security, which primarily comprises two separate sections: 1) supervised learning-based IDS and 2) unsupervised learning-based IDS techniques. Section V demonstrates the RL-based IDS, which includes deep Q-learning and double Q-learning methods. Section VI provides a thorough study on various DL-based IDS models. Section VII presents potential security threats and attacks on IoT networks. Section VIII discusses significant challenges in IoT security and future directions.

II. RELATED WORK
An IDS is a system used for monitoring network packets for potentially identifying the malicious traffic in a network. The IDS acts as a second line of defense to protect the network from intruders, and it intends to catch the attackers before they can damage the network. The majority of the proposed techniques in the literature are based on IDS types, such as signature based, anomaly based, specification based, and hybrid based. Also, they focus on static RPL with a small number of nodes in the network. The existing works did not consider the massive data traffic, and they aimed to find the IDS for specific security attacks.
The most distinguished surveys and security approaches are outlined in the following paragraphs.
Wallgren et al. [7] analyzed the IoT protocol architecture, and the significant contribution of this work was the implementation and demonstration of various routing attacks against IoT-RPL. The authors also proposed a lightweight heartbeat protocol that mitigated the selective forwarding attack instances and explained the RPL’s self-healing mechanisms to defend against such routing attacks. This article intended to highlight the importance of security in the RPL-based IoT and presented some basic knowledge to the future researchers who plan to design and implement IDSs for the IoT.
Raoof et al. [14] presented a comprehensive survey on various routing attacks and mitigation methods for the IoT networks. Also, they classified the security attacks broadly into two categories, namely, RPL specific and WSN-inherited attacks, and they examined several mitigation techniques in detail. The authors also introduced a technique-based classification scheme for each type of security attack and discussed current trends in IDS approaches. Furthermore, to better understand and differentiate these attacks, a classification based on the origin of the attacks was presented in the paper. This survey contributes a solid background for the potential researchers in this field, aiming to understand the different recent intrusion detection approaches and their new security strategies on IoT networks.
Granjal et al. [15] performed an exhaustive study on the security protocols and tools available to secure communications on the IoT networks. Primarily, they discussed the security requirements at the physical and MAC layers and defense for the IoT application layer. Also, the authors reviewed current open research challenges and issues by providing opportunities for future research work.
Our previous work in [16] proposed a lightweight intrusion detection approach against three types of Sybil attacks in mobile RPL-IoT networks. This algorithm stays lightweight by relying on the computation of a pheromone value and cumulative trust factor. It required low computational complexity and provided high accuracy, which is essential in resource-constrained IoT networks. We also proposed a bio-inspired mathematical model for the Sybil attack in mobile RPL based on the artificial bee colony (ABC) model.
Le et al. [5] proposed an IDS architecture with monitored node enabled with a finite state machine for the topology attacks, namely, rank attack, local repair attack, neighbor attack, and destination-oriented direct acyclic graph (DODAG) Information Solicitation (DIS) attack. This work focused on specification-based IDS for mitigating topology attacks. The primary approach is to determine the states, transitions, and related statistics based on analyzing the packet trace file.


Fig. 1. Taxonomy of IDS for IoT security.

The specification-based IDS system is integrated into the IDS server, connected with many monitoring nodes to secure the IoT network [17], [18].
Raza et al. [19] proposed SVELTE, a hybrid IDS for IP-based IoT where IDS modules are placed in the 6LoWPAN border router (6BR) and in resource-constrained nodes. SVELTE aims at spoofed or modified identity attacks, sinkhole attacks, and selective-forwarding attacks. Moreover, the authors proposed a distributed mini firewall to defend the RPL network from adversaries. The hybrid approach combines signature-based and anomaly-based intrusion detection. As signature-based and anomaly-based approaches each have advantages and disadvantages [20], the hybrid system could be a promising solution to overcome the disadvantages. The anomaly can be identified earlier through the pattern changes and a slight deviation from the normal ones even when the network is highly dynamic [12].
Vinayakumar et al. [21] proposed a DL approach for intelligent IDSs for different network attacks. The authors presented a smart IDS that combines network virtualization and deep belief network (DBN)-based anomaly detection system to determine and locate the abnormal behavior. Security attacks in the IoT networks increase at a startling rate as the number of devices, connected systems, communication networks, and global connectivity grow. It is very challenging to identify the security attacks in the real-time world, as the number of vulnerabilities and their causes are too many to identify. Therefore, it is essential to identify the attacks earlier in the IoT networks to ensure adequate security and protection [13].

III. TAXONOMY OF IDSs FOR IoT
IDS can be broadly divided into three categories based on the intrusion data sources, detection techniques, and placement strategy [11], [14] as shown in Fig. 1.

A. IDS Based on Intrusion Data Sources
IDS has been classified into two categories based on the intrusion data sources, namely, host-based IDSs (HIDSs) and network-based IDSs (NIDSs) [22].
HIDS examines the data that originates from the host system and audit sources, such as operating system, window server logs, firewalls logs, application system audits, or database logs. HIDS can identify insider attacks that do not include network traffic [22], [23].
NIDS monitors the network traffic acquired from a network through packet capture, NetFlow, and other network data sources. Network-based IDS can be used to monitor many computers that are connected to a network. NIDS can observe the external malicious activities that could be launched from an external threat at an earlier stage, before the threats expand to another network system. However, NIDSs have a few constraints while examining massive data which passes through the high bandwidth network [24], [25].

B. IDS Based on Detection Techniques
Based on the type of detection, IDS can be generally divided into four classes. They are signature-based IDS, anomaly-based IDS, specification-based IDS, and hybrid-based IDS [11], [26].
1) Signature-Based IDS: Signature-based IDS approach examines and determines the issues from a network against a predefined attack signature or pattern from a database [11]. This kind of approach requires specific knowledge of any particular attack in a network environment based on previous attempts and awareness. Signatures and patterns need to


be updated frequently to improve the accuracy of the IDS continuously.
Signature-based IDSs are trustworthy and efficient at detecting known threats. Patterns that were already stored in the internal repository can be easily matched, and signature-based IDS will trigger an alert for mitigation action. Though it consumes limited energy and resources, this approach is inefficient in identifying new unknown attacks and modifications of known attacks since matching patterns for these new sets of attacks are unknown to the IDS system. The authors of [28] and [29] proposed a signature-based IDS for 6LoWPAN-based IoT networks, and their primary objective is to identify Denial of Service (DoS) attacks in the IoT networks. The IDS will transmit the alert messages to a DoS protection manager that examines distinct features, namely, channel interference rate and packet dropping rate, to justify the attack. Also, they intended to reduce the false alarm rate in the IDS system. However, they did not produce any proofs for how they updated the attack signatures continuously. The same authors extended their work by modifying the previous signature-based IDS with minor advancements in [29].
Oh et al. [30] intended to minimize the computational cost while examining the packet payloads and attack signatures. They proposed a multipattern-detection approach that skips unnecessary matching operations expected between network traffic overloads and attack signatures through the use of auxiliary shift values.
2) Anomaly-Based IDS: The anomaly-based or event-based approach defines some set of normal and abnormal behavior in the network involving the protocol specifications. When the features and events follow these patterns, they are recognized as normal behavior, and the events that deviate from the regular pattern are identified as abnormal behavior [31], [32]. It can detect an unknown attack quickly, but it cannot clarify what kind of attack exactly it is. However, any scenario that does not match the regular pattern is identified as an intrusion. Also, exploring the entire characteristics of normal behavior is not an effortless task. From the literature, it can be found that they usually have a high false alarm rate, which limits their potential [33], [34].
Farzaneh et al. [35] proposed an anomaly-based lightweight intrusion detection system by setting the threshold values for identifying attacks on the IoT-RPL networks. They also described various IDS categories, and they considered and implemented the neighbor attack and DIS attack. Further, they evaluated the performance of the proposed anomaly-based IDS approach in terms of true positive rate (TPR) and false-positive rate (FPR).
Pongle and Chavan [36] designed an IDS to identify the wormhole attack in IoT-RPL networks. The proposed method works based on the location information and neighbor information of the IoT nodes to determine the wormhole attack and received signal strength to recognize attacker nodes in the network. They concluded that the wormhole attack always leaves its symptoms on the network. For example, many control packets are exchanged between the two endpoints of the tunnel, or many neighbors get connected after a successful attack. They used this logic to identify the wormhole attack and achieved 94% of TPR to distinguish the attack and 87% of TPR for accurately recognizing the attacker. However, this method can identify only one category of attack.
3) Specification-Based IDS: Specification-based IDS uses some set of rules and thresholds determined by the end-user as per the specifications of the network, routing protocol, and other configurations. It differentiates anomalies based on feature deviations and rule exceptions [11], [32]. It is similar to anomaly-based detection, but a cybersecurity expert outlines the rules and values for normal and abnormal behavior classification. It produces considerably lower false-positive and false-negative rates since the human expert sets all the thresholds [20].
Le et al. [17] proposed an IDS architecture with monitored node enabled with a finite state machine for the topology attacks, namely, rank attack and local repair attack. This work focused on specification-based IDS for mitigating topology attacks. This work has been extended in [18], where the authors convert the profiles into a set of rules and validate each data packet exchanged in the RPL network.
Amaral et al. [37] proposed a specification-based IDS in which the network administrator designs a set of rules for intrusion detection. If there is any violation or breach of the rules, IDS conveys a quick alert message to the event management system (EMS). The EMS operates on a particular node without any resource constraints to associate the alerts for the other nodes in the network. However, the efficacy of the proposed system depends on the solid knowledge of the network administrator, which is a significant point of consideration for the specification-based IDS techniques. Incorrect specifications may lead to fake warning messages that reduce the system’s accuracy and are even a substantial risk to network security.
4) Hybrid-Based IDS: Hybrid-based IDS combines two or more IDS approaches, namely, signature-based, specification-based, and anomaly-based approaches in one system to improve their advantages and reduce the influence of their drawbacks [2]. The anomaly can be identified earlier through the pattern changes and a slight deviation from the normal ones even when the network is highly dynamic [38], [39].
SVELTE is one of the popular hybrid IDS proposed by Raza et al. [19]. This hybrid IDS intends to provide a fair tradeoff between the storage cost of the signature-based method and the computing cost of the anomaly-based method. SVELTE aims at sinkhole attacks, spoofed or modified identity attacks, and selective-forwarding attacks. Moreover, they proposed a distributed mini firewall to defend the RPL network from adversaries [12].
Bostan and Sheikhan [40] proposed a hybrid intrusion detection framework that includes anomaly-based and specification-based intrusion detection systems to mitigate sinkhole and selective-forwarding attacks. They placed the specification-based intrusion detection agents in the router nodes and the anomaly-based intrusion detection agent in the root node. The hybrid intrusion detection framework examines the node’s behavior using an unsupervised optimum path forest algorithm.
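The threshold-based anomaly detection and the TPR/FPR metrics mentioned in this subsection can be illustrated with a minimal sketch. It is not the method of any cited work: the feature (DIS control messages per minute), the sample values in `RATES`, the ground-truth `LABELS`, and both thresholds are hypothetical and chosen only to show how moving the threshold trades detection rate against false alarms.

```python
# Hypothetical per-node feature: DIS control messages observed per minute.
RATES = [2, 3, 1, 4, 12, 15, 5, 9]
# Hypothetical ground truth: True means the node was actually attacking.
LABELS = [False, False, False, False, True, True, False, True]


def detect(samples, threshold):
    """Flag a sample as an intrusion when its feature exceeds the threshold."""
    return [value > threshold for value in samples]


def tpr_fpr(predicted, actual):
    """Return (true positive rate, false-positive rate) for boolean lists."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return tp / (tp + fn), fp / (fp + tn)


for threshold in (4, 10):
    tpr, fpr = tpr_fpr(detect(RATES, threshold), LABELS)
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

With these made-up numbers, the lower threshold catches every attacker but raises one false alarm, while the higher threshold raises no false alarms but misses one attacker, which is the tradeoff behind the high false alarm rates reported for anomaly-based IDS.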


Fig. 2. Intelligent IDS for IoT security.

C. IDS Based on Placement Strategies
IDS can be further divided into three categories based on its location, namely, centralized IDS, distributed IDS, and hybrid IDS [14], [41]–[43].
1) Centralized IDS Placement: In this type, IDS is placed either at the root node (e.g., border router) or a preferred node (e.g., cluster head node) and utilizes the data traffic passing through to identify attacks. It is expected that the central node with IDS sends periodic requests for updates from the monitored network [2], [12]. The advantage of centralized IDS is that most of the substantial work occurs inside a powerful node, which provides the ability to perform extensive security checks. Also, it is generally competent in shielding the network from Internet-side attacks and botnets since it is serving as a firewall. However, it would be challenging to monitor the network during the attack itself.
2) Distributed IDS Placement: In this type of IDSs, every node in the network will be configured with full IDS implementation, making it efficient for detecting attacks at any stage. Significantly, there would be a collaboration among the nodes to detect the attacks earlier by sharing threats if any are supposed to be around. But this system results in high energy and resource consumption among all IoT nodes. Therefore, an alternative approach employs distributed monitoring nodes, called watchdogs, within the network that are accountable for monitoring tasks and then communicate with IDS systems inside other nodes to mitigate the attacks. The main advantage of these IDSs is better monitoring and detection of insider attacks. Since significant parts of the IDS, namely, the decision-making, are implemented within each node, this generally results in higher resource consumption at the nodes. It is usually required to optimize the IDS periodically to minimize this effect [42], [43].
3) Hybrid IDS Placement: To get the most beneficial outcome from both the previous placement approaches, a hybrid solution is advisable. Central nodes that have more resources are responsible for computationally intensive IDS responsibilities (such as analyzing data gathered from monitoring nodes, decision making, etc.), and normal nodes are accountable for performing lightweight IDS duties (such as monitoring neighbor nodes, sending data about traffic passing through them, and responding to mitigation control messages from central nodes). This approach benefits from better and faster detection of attacks than the centralized approach and lower resource consumption than the distributed approach [44], [45].

IV. MACHINE LEARNING-BASED IDS FOR IoT SECURITY
Machine learning techniques for IoT security can be broadly classified into four categories as shown in Fig. 2 as follows: 1) supervised learning; 2) unsupervised learning; 3) DL; and 4) RL [46], [47].

A. Supervised Learning
Supervised learning-based IDS uses a labeled data set to examine the features to learn the normal and abnormal behavior in the network. Basically, each input is mapped to an output variable called a class label. Each record is a pair comprising network packets or host-based information records and a correlated output label, precisely intrusion scenario or normal behavior in this IDS. The primary function of a supervised classifier is to predict the intrusion given a set of network flows with class labels (benign/attack) for training. It can identify the output label for the testing data set based on learning


Fig. 3. Illustration of DT algorithm for IDS [48].

the inherent relationship between the input data and the labeled output value [46], [48].
Some of the most efficient supervised machine learning algorithms are the naive Bayes classifier, decision trees (DTs), K-nearest neighbor (KNN), support vector machine (SVM), and ensemble learning (EL) [49].
1) Decision Trees: DTs operate by extracting features of the records in a data set and then creating an organized tree-like structure based on the value of information gain and Gini index from each attribute in the data set. The tree-like design contributes the interpretable knowledge about the data set. A DT comprises three components: 1) decision nodes; 2) branch; and 3) leaf nodes. Each vertex (node) in a tree indicates the features, and each edge (branch) on the tree implies a value that the vertex can have in a particular sample to be classified. A decision node is employed to analyze a test attribute, a branch denotes a desirable choice based on the value of the test attribute, and a leaf contains the class to which the instance belongs [46]. DT can eliminate the irrelevant and redundant features from the data set automatically. DT algorithms consist of two main processes: 1) induction and 2) inference. The induction process builds the tree, and the inference process performs the classification. Fig. 3 illustrates an example of DT-based architecture for IDSs [48].
A DT is built typically by initially having a tree with unoccupied nodes and branches in the building process. Consequently, the feature that best splits the training samples is considered to be the root node of the tree. This feature is selected using different measures, such as information gain and Gini index values [47], [50]. The DT algorithm picks the most appropriate features independently and produces child nodes from the root node. The same procedure is repeated for each sub-DT until leaves are reached, and their related classes are set.
In the classification (inference) process, after the tree is created, the new samples with a set of features and unknown class (testing data set) are sorted by beginning with the root node of the established tree and continuing on the path corresponding to the determined values of the features at the inner nodes of the tree. This procedure proceeds until a leaf has been reached. Finally, the related labels (i.e., predicted classes) of the new samples are obtained. The most popular DT models are CART, C4.5, and ID3 [51]. Several advanced machine learning algorithms, such as random forest (RF) and XGBoost, are obtained from multiple DTs. In [52], the authors proposed a fog computing-based security system called FOCUS, which employs DT to secure the communications and employs a challenge-response authentication to defend the IoT network against DDoS attacks.
2) K-Nearest Neighbor: KNN is a nonparametric and supervised learning method [47], [53]. The principal idea behind the KNN is the manifold hypothesis. If the majority of the neighbors of the data object under classification belong to the same class, then the sample has a higher probability of belonging to the same category. The classification model is trained and classified based on some specific criteria, and incoming data is examined for similarity within k neighbors [54]. Basically, it uses the Euclidean distance calculation to estimate the distance between any two data points. Here, we assign a new data point to one of the previously recognized classes based on its relative distance to the members of those classes, where k is the number of nearest neighbors. The parameter k largely influences the performance of the classifier. When the k value is smaller, the model becomes more intricate, and the risk of overfitting is higher [55], [56]. On the other hand, when the k value is large, the model looks simpler but the fitting ability is weak.
In Fig. 4, green circles represent the normal behavior, and blue squares denote the abnormal behavior in a network. The red triangle indicates an unknown data object to be classified either as normal or abnormal behavior. The KNN classifier classifies


the new data object based on the majority vote of its nearest neighbors [56], [57]. For k = 1, the red triangle is classified as abnormal; however, it is classified as normal for k = 2 and k = 3. Hence, choosing the optimal value of k is essential to achieve high accuracy with this algorithm. Adetunmbi et al. [58] proposed a KNN-based classification method for anomaly and intrusion detection in IoT networks, targeting user-to-root (U2R) and remote-to-local (R2L) attacks. The proposed model reduces the feature dimensionality using two layers of feature reduction to improve overall performance and then applies a two-tier classification approach that combines NB and KNN classifiers [47]. It achieves a reasonable accuracy rate for both R2L and U2R attacks in IoT.

Fig. 4. KNN classification for IDS.

3) Naive Bayes Classifier: The naive Bayes classifier is a supervised machine learning approach based on Bayes' theorem with strong independence assumptions between the predictors [22], [46]. It answers the question, "what is the probability of occurrence of a specific type of attack, given the previous observations of similar instances in the network traffic?" The class of unlabeled traffic can be determined as a normal or abnormal event [59] by applying conditional probabilities, as shown in Fig. 5.

Fig. 5. Naive Bayes classifier for IDS.

The naive Bayes classifier is one of the most popular models in IDS because of its simplicity and ease of computation. An independent set of features of the detected traffic, such as status flags, protocol, and latency, is used to determine the probability of network traffic being normal or not [60], [61]. The naive Bayes classifier performs considerably well when the attribute independence hypothesis is satisfied. However, it does not perform well on data with interdependent attributes [62]. The conditional probability is calculated using (1)

$$P(a|b) = \frac{P(b|a)\,P(a)}{P(b)} \tag{1}$$

where
P(a|b) is the posterior probability of the class (a, target) given the predictor (b, attributes);
P(a) is the prior probability of the class a;
P(b|a) is the probability of "b" being true given that "a" is true;
P(b) is the prior probability of the predictor b.

4) Support Vector Machine: SVM is also a supervised machine learning approach, and it is best suited to data sets with a large number of features but a small number of data samples [63]. SVM maps the input data into an n-dimensional space and constructs an (n − 1)-dimensional max-margin separating hyperplane to divide the feature sets into groups, as shown in Fig. 6. It can work in both binary and multiclass settings and is good at solving linear problems. Also, SVM requires less memory and can learn the extractable information from a smaller training set [64]–[66]. For nonlinear classification, kernel functions are employed. SVM classifiers have been applied to anomaly-based intrusion detection with real-time online learning [48]. Garg et al. [67] proposed "Sec-IoV," a multistage optimized SVM-based model for intrusion detection in vehicle-to-vehicle communications on Internet of Vehicles (IoV) networks. SVM has also been used for anomaly detection of DoS attacks and for malware detection in IoT networks, where it has been reported to outperform other machine learning algorithms in terms of accuracy.

Fig. 6. SVM principle.

5) Ensemble Learning: Multiple machine learning algorithms can be employed at the same time to achieve better predictive performance than any one particular machine learning algorithm [68]. Different classifiers are trained simultaneously to identify various attacks, and their results are then combined by majority vote for classification to
improve the detection rate, as shown in Fig. 7. An ensemble generally performs better than a single classifier, since it can combine weak classifiers to offer better results than any one of them [69]–[71]. Boosting, stacking, and bagging are some of the ensemble methods that work in practice. Boosting denotes the family of algorithms that convert weak learners into strong learners. In bagging, the same classifier is trained on different subsets of the same data set. Stacking combines various classification approaches through a meta-classifier [72].

Fig. 7. EL for IDS.

Jabbar et al. [73] proposed an intrusion detection approach based on an ensemble classifier that combines RF and the average one-dependence estimator (AODE), which addresses the attribute dependence problem in the naive Bayes model. RF improves accuracy and reduces the false alarm rate compared with the individual classifier approaches. Khraisat et al. [74] introduced a stacking ensemble method that couples the C5 DT classifier and a one-class SVM for malware detection in IoT networks. The stacked ensemble achieves 99.97% accuracy.

B. Unsupervised Learning

Unsupervised learning-based IDS uses machine learning algorithms that take input data sets without class labels and extract interesting information and patterns independently [75]–[77].

In supervised learning, the output class labels are provided for training, and the model uses them to determine the class label of the unknown testing data. In contrast, in unsupervised learning, no class labels are given; instead, the data is sorted into various classes through the extraction of features and patterns during the learning process and feature mapping [78]. When developing an IDS, unsupervised learning recognizes intrusions using unlabeled data to train the model.

IoT network traffic is clustered into many groups based on the similarity of the network traffic and features, without the need to predefine these groups [48], [78]–[80]. Once data records are clustered, data objects that appear in small clusters are labeled as intrusions, since normal instances should form dense clusters in contrast to the anomalies. Besides, malicious intrusions and normal instances are dissimilar; hence, they do not fall into the same group. Unsupervised learning-based IDS techniques include K-means clustering, principal component analysis (PCA), singular value decomposition (SVD), and hierarchical clustering [80].

1) K-Means Clustering: K-means is one of the most prominent clustering algorithms and belongs to the unsupervised machine learning methods. It aims to discover clusters in the data objects [22]. Here, k refers to the number of clusters, and "means" refers to the mean of the attributes. The method iteratively assigns each data object to one of the k clusters according to its features. Each cluster comprises data objects with similar characteristics, as illustrated in Fig. 8. The algorithm's inputs are the number of clusters (k) and the data set, which includes a set of features for each record [47].

Fig. 8. K-means clustering algorithm.

First, the k centroids are calculated, and each sample is allocated to its closest cluster centroid according to the squared Euclidean distance. Second, after all the data samples are assigned to a cluster, the cluster centroids are recalculated as the mean of all data objects allocated to that cluster [81]. The algorithm repeats this procedure until the cluster assignments no longer change between iterations. The need to select a suitable value of k and the assumption that the samples are uniformly distributed over the k clusters are limitations of the k-means clustering algorithm [82]. Previous research in [83] supports the suitability of k-means clustering for anomaly detection by calculating the similarity among the features using the Euclidean distance. The researchers in [84] proposed combining k-means clustering with DT for anomaly detection in IoT networks to improve performance. In general, unsupervised machine learning methods are most effective for unlabeled data sets and for data sets that are too large to be assigned class labels [85].

2) Principal Component Analysis: PCA is not an anomaly detection technique; however, it is generally applied as a feature selection or feature reduction method on a large data
set [86]. The selected feature sets can later be used with machine learning classifiers to identify abnormalities and anomalies in an IoT network. The PCA technique converts a large set of features into a reduced set of features without losing valuable information. The authors of [87] proposed a combination of PCA with different classifiers to identify anomalies in the IoT system. The work in [88] offered a model that applies PCA for feature reduction and uses softmax regression and the KNN algorithm as classifiers for real-time intrusion detection.

V. REINFORCEMENT LEARNING-BASED IDS FOR IOT SECURITY

Generally, learning from the surrounding environment is one of the first and best human experiences. Humans start learning by interacting with the environment, and they gradually gain feedback from it. RL consists of two essential elements: 1) a learning agent and 2) an environment, as shown in Fig. 9. An RL-based IDS enables an agent to interact with an environment. The agent learns by trial and error, using the feedback, how to map each state to the action with the highest reward [89], [90]. The objective is to learn the best sequence of actions that maximizes the total cumulative reward of the agent.

Fig. 9. RL between the agent (classifier) and IoT network. (a) Q-learning. (b) Deep Q-learning.

Hence, the agent receives a positive reward for each good action and a penalty for each adverse action. The brain of the agent in Q-learning is a Q-table that contains rows and columns. The rows represent states or observations, and the columns represent the actions to take. Each cell of the Q-table holds a value named the Q-value, which is the value of an action in a given state [91].

Primarily, the Q-value is computed from the reward gained in the current state and the maximum Q-value of the next state. That is, the reward is added to the best Q-value found in the row of the next state in the Q-table. The outcome is stored in the cell at the current state row and current action column, both of which are acquired from experience in the memory. Deep Q-learning network (DQN) and double DQN (DDQN) are some of the powerful RL techniques for security applications [92], [93]. The Q-learning algorithm is explained in detail in this section.

A. Deep Q-Learning

The primary difference between Q-learning and deep Q-learning, shown in Fig. 9, lies in the agent's brain. In Q-learning, the Q-table is the agent's brain, whereas in DQN, a deep neural network (DNN) is the agent's brain. The inputs to the DNN are the states or observations, and the outputs represent the actions taken by the agent, as shown in Fig. 10 [94], [95]. The target used by the DQN is given by

$$Q(s, a) = r(s, a) + \gamma \max_{a} Q(s', a) \tag{2}$$

where s' denotes the next state and γ is the discount factor.

Fig. 10. Schema of DQN and DDQN model during training. (a) Learning Structure of DQN. (b) Learning Structure of DDQN.

B. Double Q-Learning

Double DQN uses two similar DNN models in its architecture, as illustrated in Fig. 10. DDQN also addresses the overestimation of Q-values in DQN [95]–[97]. Here, one of the neural networks learns the state-action pairs during experience replay, as in DQN, and the other NN is a copy of the first model as of the last episode. The new target value used by the DDQN is given by

$$Q(s, a) = r(s, a) + \gamma \max_{a} Q\bigl(s', \arg\max_{a} Q(s', a)\bigr). \tag{3}$$

A DQN- and DDQN-based IDS was proposed in [98], where four different DRL approaches (DQN, DDQN, policy gradient, and actor-critic) were compared. The authors explained the application of these approaches to intrusion detection in IoT.
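
The Q-table update and the targets in (2) and (3) can be sketched as follows. This is a minimal illustration in plain Python, not the implementation evaluated in [98]. The function names, the learning rate `alpha`, the default discount factor, and the toy Q-values are assumptions introduced here. For DDQN, the sketch uses the common formulation in which the first (online) network selects the next action and the second network evaluates it.

```python
def q_table_update(q_table, state, action, reward, next_state, gamma=0.9, alpha=1.0):
    """Move Q[state][action] toward reward + gamma * max Q[next_state], as in (2).

    alpha is a learning rate (an assumption of this sketch); with alpha = 1.0
    the target is stored directly in the cell at [state][action].
    """
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


def dqn_target(reward, next_q_values, gamma=0.9):
    """DQN target: one set of Q-values both selects and evaluates the action."""
    return reward + gamma * max(next_q_values)


def ddqn_target(reward, online_next_q, target_next_q, gamma=0.9):
    """DDQN target: the online network picks the action, the second network scores it."""
    best_action = max(range(len(online_next_q)), key=lambda a: online_next_q[a])
    return reward + gamma * target_next_q[best_action]


# Two states (rows) and two actions (columns), all Q-values initially zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
q_table_update(q_table, state=0, action=1, reward=1.0, next_state=1)

# Hypothetical next-state Q-values produced by the two networks.
online_next_q = [3.0, 1.0]
target_next_q = [2.0, 5.0]
print(dqn_target(1.0, target_next_q))                  # uses max(target_next_q)
print(ddqn_target(1.0, online_next_q, target_next_q))  # uses target_next_q[0]
```

In this toy case the DDQN target is smaller than the DQN target, because the action is chosen by one set of Q-values and scored by the other. This decoupling is what limits the overestimation of Q-values.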

VI. DEEP LEARNING-BASED IDS FOR IOT SECURITY

DL is a subset of machine learning, but it outperforms conventional ML in applications involving large data sets. DL can be defined as a DNN with multiple hidden layers between the input layer and the output layer [99]. Since it has many hidden layers, it can extract the significant features from the data set on its own, without human assistance. Conventional machine learning requires only a small amount of data for training and testing, but it gives lower accuracy. In contrast, DL demands a large amount of data and a long training time, but it usually provides high accuracy [100].

DL can be supervised, unsupervised, or semisupervised. The supervised methods include DNN, convolutional neural network (CNN), and recurrent neural network (RNN). Deep autoencoders (AEs), DBNs, and restricted Boltzmann machines (RBMs) are unsupervised DL methods. Generative adversarial networks (GANs) and ensembles of DL networks (EDLNs) are examples of the hybrid DL approach [101], [102].

DL is further broadly divided into two architectures, namely, generative and discriminative. Generative models are also called graphical models because they represent independence/dependence for a distribution. They are visualized as graphs whose nodes represent random variables, and they can define the relationships between random variables with millions of parameters [103]. The generative model uses the joint distribution P(y, x), whereas the discriminative model makes predictions by estimating the conditional probability P(y|x). Deep AEs and RBMs belong to the generative architecture, and CNNs and RNNs belong to the discriminative DL models.

A. Convolutional Neural Networks

CNN is a type of discriminative DL model that is primarily used to handle large training data sets using hierarchical feature extraction and representation. CNN was proposed to reduce the number of parameters used in a traditional artificial neural network (ANN) [46], [47]. The parameters are reduced by three concepts: 1) sparse interaction; 2) parameter sharing; and 3) equivariant representation [99], [100]. To fully use the 2-D structure of the input data, local connections and shared weights are employed instead of traditional fully connected networks. This results in significantly fewer parameters, which makes the network faster and easier to train [101].

A CNN architecture consists of two alternating types of layers, 1) convolutional layers and 2) pooling layers (subsampling layers), and an activation unit, as represented in Fig. 11. The activation unit applies a nonlinear activation function to every element of the feature set [104]. The nonlinear activation function preferred in this model is the rectified linear unit (ReLU), f(x) = max(0, x). The convolutional layers apply various kernels to convolve the data inputs [105]. The pooling layers perform down-sampling to reduce the sizes of the subsequent layers through max-pooling or average pooling. Max-pooling divides the input into distinct groups (clusters) and takes the maximum value of every cluster in the previous layer. Average pooling, on the other hand, calculates the average value of every cluster in the previous layer [106].

Fig. 11. CNN for IDS. (a) Input Data. (b) Feature Learning. (c) Classification.

CNN is well suited to highly efficient and fast feature extraction from raw data, since it requires less time for training. However, CNN needs high computational power, which is a limitation. Hence, using CNN on resource-constrained IoT networks can be very challenging. This challenge has been addressed through a distributed architecture in which a lighter deep NN is trained and deployed on-board with only a subset of the vital output classes, while a high-computational-power cloud performs the complete training of the algorithm [107], [108].

The authors of [109] proposed a CNN-based Android malware detection system for IoT security, which is capable of simultaneously learning to perform feature extraction and malware classification given only the data records of a vast number of labeled malware samples. The significant features correlated with malware detection are learned automatically from the raw data set by the CNN, which eliminates the need for manual feature engineering.

Garg et al. [110] offered a robust hybrid model for network anomaly detection in cloud environments that employs grey wolf optimization (GWO) and CNN techniques. They improved the GWO and CNN models for better exploration, exploitation, and generation of the initial population. The
proposed model is claimed to achieve better accuracy in terms of detection rate.

B. Recurrent Neural Networks

RNN is a discriminative DL algorithm. It is well suited to applications in which the data has to be processed sequentially (e.g., speech, text, and sensor data) and in which the present state depends on the previous states [46]. In a traditional neural network, on the other hand, there is no interdependency between input and output [47].

As Fig. 12 depicts, an RNN feeds information back rather than only propagating it forward. For instance, to understand a word in a sentence, it is necessary to know the context. Each unit in an RNN receives the current state and the previous state to obtain contextual information. An RNN can be viewed as a set of short-term memory units comprising the input layer x, the hidden layer (state) s, and the output layer y. Information flows one way from the input units to the hidden units. All of the information throughout the RNN is saved in the hidden units, which maintain a "state vector" that holds a memory of the past inputs in sequence order. The length of this memory depends on the type of RNN node selected. The longer the memory, the longer the dependencies an RNN can learn [111].

Fig. 12. RNN for IDS.

The RNN calculates the hidden unit vector sequence $h = (h_1, h_2, h_3, \ldots, h_n)$ in order to compute the output vector sequence $y = (y_1, y_2, y_3, \ldots, y_n)$ by iterating from 1 to n using (4) and (5) [112]

$$h_t = H(W_{xh} x_t + W_{hh} h_{t-1} + b_h) \tag{4}$$
$$y_t = W_{hy} h_t + b_y \tag{5}$$

where W denotes a weight matrix, b is a bias vector, and H denotes the recurrent hidden layer function.

Typically, RNNs are considerably challenging to train because of their sensitivity to vanishing and exploding gradients. The influence of earlier inputs decays over time, which means that the network forgets the initial inputs as new ones are introduced [99], [113]–[115]. Many advanced variants of RNN have been introduced, namely, long short-term memory (LSTM) [116], gated recurrent unit (GRU) [117], and bi-RNN [118], which help resolve the vanishing gradient issue and the long-term dependency problem [102].

LSTM is essentially an advanced RNN model designed to overcome these sensitivity issues. Each LSTM contains three gates: 1) an input gate; 2) a forget gate; and 3) an output gate. The forget gate discards old memory while the input gate accepts new incoming data, and the output gate combines short-term memory with long-term memory to create the current memory state. In the GRU model, the input gate and the forget gate are combined into a single update gate, which makes it more flexible than LSTM [102], [119], [120].

Here, outputs from the neurons are fed back to the neurons of the previous layer. Since IoT environments produce vast volumes of sequential network traffic flows, RNN appears well suited to IoT security applications, particularly network intrusion detection. Torres et al. [121] proposed RNN-based network intrusion detection against time-series-based threats, in which they examined the network traffic behavior and identified the threats. Another work [122] introduces an IDS that uses cascaded filtering stages, with a deep multilayered RNN for each filter. The RNNs are trained to identify common attacks originating in IoT environments, namely, R2L, DoS, U2R, and Probe.

C. Deep Autoencoders

Deep AE is an unsupervised DL model consisting of two symmetrical components: 1) an encoder and 2) a decoder. The encoder function h = f(x) extracts the features (code) from the input vectors, and the decoder function r = g(h) reconstructs the input vector from the extracted features (code) [123]. During progressive training, the deviation between the encoder's input and the decoder's output is gradually reduced [13], [102]. AEs are primarily used for feature extraction and feature compression. If the hidden layers have a lower dimensionality than the input and output layers, the model can be used for feature compression, as shown in Fig. 13.

Fig. 13. Deep AEs for IDS.

It attempts to reduce the dimensionality of the data set without any prior knowledge of it. However, it has the limitation of consuming high computational
time and power. There are many variants of autoencoders, namely, denoising autoencoders, stacked autoencoders, and sparse autoencoders [124], [125].

If an autoencoder is trained to reconstruct the input from noisy input vectors, it is referred to as a denoising autoencoder [126]. A stacked autoencoder uses multiple layers of autoencoders, which are trained with a series of data inputs to compress the information progressively [127]. A sparse autoencoder has more hidden nodes than nodes in the input and output layers, though only a portion of the hidden nodes is activated at any given time [128].
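
The encoder/decoder structure and the denoising variant described above can be illustrated with a deliberately small sketch. The example below is a toy in plain Python built on stated assumptions: the weights `w` are fixed by hand rather than learned, the single hidden unit and all function names are hypothetical, and it does not reproduce the model of any cited work. It shows a code h = f(x) with lower dimensionality than the input, a reconstruction r = g(h), and a denoising-style loss that compares the reconstruction of a corrupted input with the clean input.

```python
import random


def encode(x, w):
    """Encoder h = f(x): compress a 2-D input into a 1-D code."""
    return sum(xi * wi for xi, wi in zip(x, w))


def decode(h, w):
    """Decoder r = g(h): map the 1-D code back to the 2-D input space."""
    return [h * wi for wi in w]


def reconstruction_error(x, r):
    """Squared error between an input vector and its reconstruction."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, r))


def denoising_loss(x_clean, w, noise_std, rng):
    """Corrupt the input, reconstruct it, and compare with the clean input."""
    x_noisy = [xi + rng.gauss(0.0, noise_std) for xi in x_clean]
    return reconstruction_error(x_clean, decode(encode(x_noisy, w), w))


w = [0.6, 0.8]          # hand-picked unit-norm weights (not learned)
typical = [3.0, 4.0]    # lies along w, so the 1-D code preserves it
unusual = [4.0, -3.0]   # orthogonal to w, so the 1-D code loses it

err_typical = reconstruction_error(typical, decode(encode(typical, w), w))
err_unusual = reconstruction_error(unusual, decode(encode(unusual, w), w))
print(err_typical < err_unusual)  # the poorly reconstructed sample stands out

rng = random.Random(0)
print(denoising_loss(typical, w, noise_std=0.1, rng=rng) >= 0.0)
```

A sample that the compressed code cannot represent is reconstructed poorly, so its reconstruction error is large. Using that error as an anomaly score is one common way to apply AEs in an IDS. Any decision threshold would have to be chosen on validation data.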
Yousefi-Azar et al. [129] proposed network-based malware detection using deep AEs. The autoencoders learn the latent features of the input vectors that are significantly associated with cybersystems. This work shows that AEs outperform other conventional ML algorithms, particularly SVM and KNN, in terms of accuracy and detection rate.

In [130], an ensemble of deep autoencoders was used to design an online lightweight IDS for IoT networks; it is primarily an unsupervised learning approach and consumes few resources and little power for processing. The results confirmed that the proposed lightweight technique performs better than other ML and DL techniques.

In another work [131], an AE was coupled with DBNs to build a malware detection method. The AE is applied first for data dimensionality reduction, extracting only the distinctive features by nonlinear mapping. The DBN learning algorithm is then trained to identify the malicious code.

D. Restricted Boltzmann Machines

RBM is a stochastic deep neural model in which each node follows the Boltzmann distribution. It consists of two layers, namely, the visible layer and the hidden layer. RBMs are unsupervised learning models and are trained one layer at a time. RBM improves the learning speed of the algorithm by restricting the connections between the nodes in the same layer [112], [132]–[134]. There are no intralayer connections (i.e., between nodes in the same layer); however, each node in the input layer is connected to each node in the hidden layer (i.e., full connectivity), as shown in Fig. 14 [111].

Fig. 14. RBM for IDS.

The authors of [135] addressed some of the significant challenges in developing network anomaly detection. The first challenge is that adequately training the model requires a high-quality labeled data set, while network traffic is multipart and may sometimes be irregular. The second is that anomalies evolve continually over time. Aldwairi et al. [136] proposed an RBM-based learning model for network anomaly detection to address these challenges. It is trained in an unsupervised fashion, even with an incomplete training data set. The authors observed that the classifier performed inadequately when it was tested on a network data set that differed from the one on which it was trained.

RBMs have been applied to IoT network IDS in many research works [137]. The difficulty of implementing RBMs is that they demand considerable computational resources when executed on resource-constrained IoT devices. Moreover, a single RBM limits the ability of feature representation. This limitation can be overcome by stacking two or more RBMs to create a DBN.

In [138], a smart city IDS based on RBMs was suggested. RBMs are used because of their ability to explore and learn high-level features from unknown data in an unsupervised manner. They can also handle the real-time data produced by smart meters and smart sensors. Several classifiers are trained with these extracted features. The performance of the proposed technique was evaluated and benchmarked using a data set from a smart water distribution plant. The results demonstrated the efficacy of the method in intrusion detection with outstanding accuracy.

E. Deep Belief Networks

Another prominent DL model for cybersecurity is the DBN [139]. A DBN is a generative model that consists of multiple hidden layers, as illustrated in Fig. 15. It is principally built by stacking multiple RBMs [108]. Training a DBN includes two essential stages, namely, unsupervised pretraining and supervised fine-tuning. During the pretraining phase, the primary features are learned layer by layer in a greedy, unsupervised fashion, whereas a softmax layer is employed at the fine-tuning stage (top layer) to fine-tune the features with respect to the labeled records [140].

An intrusion detection approach based on an enhanced DBN was proposed in [141] to alleviate problems such as feature overfitting, poor accuracy, and high FPR when faced with a massive volume of data traffic. The authors demonstrated that the proposed method improves classification accuracy to 96.17%.

In another study [131], an AE was coupled with a DBN to design a malware detection method. Autoencoders are used for data dimensionality reduction to extract the prominent features by
nonlinear mapping. Subsequently, the DBN model is employed to identify the malware in the data. DBNs demand high computational costs because of the extensive initialization procedures needed at the early stage to handle a massive number of features.

Fig. 15. DBNs for IDS.

F. Generative Adversarial Networks

GAN has emerged as a hot research topic and is a promising solution for many DL applications. Fig. 16 depicts the GAN framework for IDS. A GAN framework consists of two subnetworks, namely, a generator and a discriminator. The generator aims to generate a synthetic (falsified) data set that closely resembles the actual data set, and the discriminator aims to distinguish the synthetic data from the original data. Hence, the generator and the discriminator help each other improve. The primary objective of training the generative model is to increase the probability that the discriminative model misclassifies a record [102], [142].

Fig. 16. GANs for IDS.

The generative model is trained to fool the discriminator by producing a few samples from random noise at each stage. The discriminator is fed many real data samples from the training set together with a few fake samples from the generator. The discriminator aims to differentiate the real (training data set) samples from the synthetic (generated) samples. Their performance is estimated based on the number of correctly and incorrectly classified instances. The discriminative model helps the generative model improve the samples produced for the successive
iterations [13], [143]. The research work in [144] examined the efficiency of the GAN algorithm for identifying abnormal behavior in IoT networks. It shows promising results in mitigating zero-day attacks by generating dummy zero-day attack samples, thereby driving the discriminator to learn different attack instances and scenarios. However, it produces inconsistent results initially, and the training is very challenging.

The research in [145] proposed an IDS architecture to secure the cyberspace of IoT networks. The proposed work is intended to classify the network traffic as either normal or abnormal. The proposed GAN-based architecture performs well in detecting abnormalities in the system, showing exceptional accuracy values.

VII. POTENTIAL SECURITY THREATS AND ATTACKS IN INTERNET OF THINGS

In the following sections, we illustrate the potential security threats and attacks in IoT networks. We categorize the attacks into two major groups, namely, WSN-derived IoT security threats and RPL-specific IoT security attacks.

A. WSN-Derived IoT Security Threats

Since a substantial number of standards and techniques have been carried over from WSNs, many of the WSN routing attacks can also occur in IoT networks, with minor variations in their implementation style and their attacking goals. Various categories of WSN-derived IoT attacks are discussed in the following.

1) Sinkhole Attack: In this type of attack [19], [146], the malicious node advertises a very low rank. By doing so, the attacker appears to be significantly closer to the border router (BR) and attracts many neighbor nodes to select it in their routing paths to forward packets toward the destination, as shown in Fig. 17. By themselves, sinkhole attacks are not very dangerous, but when they are coupled with other types of attacks (e.g., blackhole or wormhole attacks), the network traffic passing through this route can be altered, forged, or used for spoofing, and the new combination creates a very potent attacking scenario [147]. Generally, this attack can be implemented in different ways: by advertising control messages with a more beneficial combination of the rank and the objective function (OF), by manipulating the preferences, or by using many adversaries that converge all passing traffic toward another adversary, thereby creating a sinkhole [14], [39].

Fig. 17. Example of sinkhole attack in IoT network.

2) Blackhole Attack: A blackhole attack is a packet-dropping attack [7], [148], in which the attacker drops or discards all the packets, acting like a “blackhole” in the network, instead of relaying them to the next forwarding node, as depicted in Fig. 18. This attack might sometimes cause a DoS in IoT networks [36]. In IoT, the blackhole attacker joins the network, acts as a parent, and then drops the packets sent by its descendant nodes. A blackhole attack is also critical when it is coupled with a sinkhole attack. A more advanced blackhole attack forwards a selected set of control messages (e.g., ICMPv6 packets and control messages) and drops all the actual payloads. It is then difficult to recognize whether the scenario is abnormal, since part of the traffic is still delivered to the right destination. This variant is known as a selective-forward attack or greyhole attack. Because of their characteristics, selective-forward attacks can be neither identified nor mitigated by the self-healing mechanisms inside IoT networks, since the malicious nodes usually forward control messages and actively participate in building the network structure [149]–[151].

Fig. 18. Example of the Blackhole attack in IoT network.

3) Wormhole Attack: Generally, two malicious nodes participate in this category. They create a tunnel between them, as shown in Fig. 19, and transmit traffic (completely or partially) through the tunnel instead of through the regular route [152]–[154]. In this way, two distant malicious nodes of the network advertise themselves as neighbors to attract the neighboring traffic [14].

There are three potential ways to perform a wormhole attack. They are given in the following.

1) Packet Encapsulation [14], [155], in which wormhole nodes employ a legitimate route between each other and build a logical tunnel between the wormhole partners by


encapsulating the original legitimate packets to hide the hop count from other nodes on the tunnel’s routing paths.
2) Packet Relay [14], [148], in which the malicious attackers forward and relay packets between two distant neighbors through the logical tunnel. They attempt to trick the two distant legitimate nodes into believing that they are neighbors. Typically, this is performed by relaying packets without altering the hop count.
3) Out-of-Band Link [7], [148], [156], in which malicious nodes employ an out-of-band link (wired or wireless connection) to communicate with each other. This procedure can even allow an opponent inside the IoT network to interact with an adversary outside the network, and they may circumvent the procedures of border routers and firewalls.

Fig. 19. Example of wormhole attack in IoT network.

4) Selective Forwarding Attack: The selective forwarding attack is a distinct type of blackhole attack, in which the malicious node drops a portion of the packets from the incoming traffic and forwards the remainder on a selective basis, as illustrated in Fig. 20 [157], [158]. It is very tedious for the source node to identify whether a packet has been forwarded successfully to the destination or not. Overall, this kind of attack destroys the integrity of network routing [7], [36].

Fig. 20. Example of selective forwarding attack in IoT network.

5) Sybil Attack and Clone-ID Attack: In the Sybil attack, malicious nodes forge or fabricate multiple peer identities to compromise the IoT network. The Sybil attack is a highly critical threat to RPL, which may worsen the network performance by exponentially multiplying the control traffic overhead [159], [160]. Moreover, it gradually degrades the overall lifetime of the network. It tries to deceive the network by increasing the overhead and, progressively, intends to compromise the primary root node in the architecture [16], [161].

Fig. 21. Example of various categories of the Sybil attack in IoT network [16].

1) SA-1 Sybil Attack: In the SA-1-type Sybil attack [16], [159], illegitimate nodes target one particular region and attempt to compromise the identities of the nearby nodes to accomplish the attack. In SA-1, all the Sybil identities and attackers are static nodes, as in Fig. 21. The motivation for attacking one bounded region is to pose as a legitimate set of nodes and perform the attacks together.
2) SA-2 Sybil Attack: In the SA-2-type Sybil attack [16], [159], Sybil attackers are distributed among the legitimate nodes in the network and are not limited to one particular region, as shown in Fig. 21. Moreover, this class of attack is much more difficult to identify, as these Sybil nodes have established a set of socially normal connections with the legitimate nodes. The principal objective of the SA-2 Sybil attack is to disrupt the routing topology and manipulate the system in favor of the Sybil attacker.


3) SA-3 Sybil Attack: In the SA-3-type Sybil attack [16], [159], Sybil nodes are mobile and scattered across the network. The attack has no fixed structure or topology, since the nodes move dynamically from one position to another. They attempt to perform malicious actions on surrounding nodes along their path of motion, as in Fig. 21. The primary goals of SA-2 and SA-3 are very similar, but the identification of these mobile Sybil identities is challenging in the RPL structure. Trust-based Sybil detection is one of the most reliable ways to identify this type of mobile Sybil attack and secure the network. This type of attack can use fabricated identities and compromised identities, either together or separately.

In addition, Sybil attacks can be further differentiated into two categories: a simultaneous attack, which uses the whole set of compromised identities at the same time, and a nonsimultaneous attack, which uses only a specific subset of identities to perform the attack.

6) Hello Flooding Attack: If a node seeks to join a network, it begins by broadcasting “Hello” packets to the neighbor nodes, as explained in Fig. 22. A node that receives such hello packets will believe that it is inside the transmission range (radio range) of the node from which it received the hello message [7], [10]. In IoT networks in particular, the routing protocol RPL uses DIO (DODAG Information Object) messages in place of Hello packets when it starts building the topology [36], [162].

An intruder can broadcast DIO messages with attractive routing metrics and strong received signal strength indicator (RSSI) values; later, it might disappear or reduce its transmission power to normal. Packets from a legitimate node will fail to reach the correct destination if the node chooses the attacking node as its neighbor based on the fake transmission power and RSSI value in the message [163]. This kind of attack is known as the hello flood attack. Another consequence of the hello flooding attack is resource exhaustion of legitimate nodes and network congestion due to increased control overhead transmission [10].

Fig. 22. Example of the hello flooding attack in IoT network.

B. RPL-Specific IoT Security Threats

This section outlines RPL protocol-specific IoT security threats, such as rank attacks, version number attacks, local repair attacks, DIS attacks, replay attacks, RPL storing mode attacks, and DODAG inconsistency attacks [164], [165]. Because of limited battery life and mobility, RPL is potentially vulnerable to various security attacks.

Overview of RPL and Mobile RPL

RPL is a standardized routing protocol for low-power and lossy networks in IoT. RPL is a distance-vector and source routing protocol that operates over a tree-based topology, namely, the DODAG, in the 6LoWPAN. A DODAG comprises many nodes, including a sink node called the BR, which gathers all sensed information from the remaining nodes in the same DAG [6]. Every DODAG is distinguished by its RPL instance ID, DODAG ID, DODAG version number, and rank. Three types of control messages are exchanged in RPL, namely, DIO, DAO, and DIS [6], [166].
1) DIO: DODAG Information Object.
2) DAO: Destination Advertisement Object.
3) DIS: DODAG Information Solicitation.

The BR starts the DODAG construction process by broadcasting DIO messages to the nearby neighboring nodes for building and renewing the topology. The nodes that receive the DIO message, in turn, return DAO messages toward the BR. RPL can, in principle, support mobility among the nodes. However, if the network runs conventional RPL under mobility without any optimization, the overall lifetime of the network is vastly reduced, and the performance is poor in terms of control traffic overhead and average end-to-end delay [166].

Mobile RPL, or mobility-aware RPL, is an enhanced RPL protocol that supports the random mobility of the nodes in the network. In this article, we considered and simulated RPL under mobility (mobile RPL). To introduce mobility and efficient parent selection, we employed our previous research work [166] on the mobility-aware parent selection process, which supports random mobility of the nodes in RPL and determines the best parent from the preferred parent list under mobility by considering the metrics expected transmission count (ETX), expected lifetime (ELT), and RSSI. Also, a dynamic trickle (D-Trickle) timer is used to optimize the control message transfer under mobility. While examining the various attacks under mobility, the mobility-aware parent selection process actively helps reduce the control overhead transmission and the average energy consumption of each node in RPL [166].

Fig. 23. Various categories of rank attacks in RPL. (a) Decreased Rank Attack. (b) Increased Rank Attack. (c) Worst Parent Attack.

1) Rank Attacks: Every node in RPL forms the DODAG structure by choosing a preferred parent. Nodes select their preferred parent based on two values, 1) rank and 2) OF, which can be read from the DIO messages. The rank of a node increases toward the leaves (terminals) of the DODAG and decreases toward the BR [17], [167], [168]. The adversary can manipulate the rank value, which has drastic effects on the routing topology. The rank attack can be combined with other attacks, such as selective forwarding and IP spoofing attacks, which has a significant impact on the overall structure of the RPL-DODAG [169].
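As a minimal sketch of the rank-based parent selection described above, the following Python fragment models a node that simply prefers the neighbor advertising the lowest rank in its DIO. The `Dio` class, the function names, the node labels, and the rank step of 256 are illustrative assumptions rather than part of the surveyed works, and a real OF would also weigh link metrics such as ETX. Under these assumptions, a single adversary that advertises an artificially low rank becomes the preferred parent.

```python
from dataclasses import dataclass

# Assumed rank increase per hop; chosen only for illustration.
MIN_HOP_RANK_INCREASE = 256


@dataclass
class Dio:
    """Simplified DIO: only the sender and its advertised rank (hypothetical)."""
    sender: str
    rank: int


def choose_preferred_parent(dios):
    """Toy objective function: pick the neighbor advertising the lowest rank."""
    return min(dios, key=lambda d: d.rank)


def own_rank(parent_rank):
    """A node's rank is its parent's rank plus one rank step."""
    return parent_rank + MIN_HOP_RANK_INCREASE


# Honest neighborhood: A is one hop closer to the BR than B.
honest = [Dio("A", 512), Dio("B", 768)]
parent = choose_preferred_parent(honest)

# Adversary M is far from the BR but advertises a rank as low as the BR's.
attacked = honest + [Dio("M", 256)]
parent_attacked = choose_preferred_parent(attacked)

print(parent.sender, own_rank(parent.rank))
print(parent_attacked.sender, own_rank(parent_attacked.rank))
```

In this toy model the victim not only reroutes its traffic through M but also advertises a lower rank itself, which is how a decreased rank value can propagate through the sub-DODAG.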
This manipulation can be performed in two ways. The first is to alter
the adversary’s rank by a particular value based on the ranks of
the adversary’s neighbors. This is sufficient if no protective
measures (such as a monitoring IDS) are implemented. The second
is to manipulate the adversary’s rank by using a modified OF
(at the adversary) to trick the legitimate nodes into giving
the malicious node a better rank. These manipulations make
the malicious node challenging to identify, since it adapts
to any changes in the dynamic topology [17].
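One way a monitoring IDS might flag such manipulations is a pair of rank-consistency rules. The sketch below is a hypothetical illustration, not the method of [17] or of any surveyed IDS: it assumes the monitor sees every advertised rank in time order and knows each node's current parent, and the `max_drop` threshold and the alert labels are arbitrary assumed values.

```python
def rank_alerts(observations, parents, max_drop=256):
    """Return (node, reason) alerts for suspicious rank advertisements.

    observations: list of (node, advertised_rank) tuples in time order.
    parents: dict mapping each node to its parent (None for the root).
    max_drop: assumed largest rank decrease tolerated between two
              consecutive advertisements of the same node.
    """
    last = {}    # most recent rank seen for each node
    alerts = []
    for node, rank in observations:
        parent = parents.get(node)
        # Rule 1: a child should advertise a rank greater than its parent's.
        if parent is not None and parent in last and rank <= last[parent]:
            alerts.append((node, "RANK_LE_PARENT"))
        # Rule 2: a sudden large decrease of a node's own rank is suspicious.
        if node in last and last[node] - rank > max_drop:
            alerts.append((node, "RANK_DROP"))
        last[node] = rank
    return alerts


parents = {"root": None, "A": "root", "M": "A"}
normal = [("root", 256), ("A", 512), ("M", 768)]
attack = normal + [("M", 256)]   # M suddenly advertises the root's rank

print(rank_alerts(normal, parents))
print(rank_alerts(attack, parents))
```

Such fixed rules would miss an adversary that lowers its rank gradually or keeps it just above its parent's, which is one reason the adaptive manipulation described above is hard to detect.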
Rank attacks can be classified into three categories, as illustrated in
Fig. 23: decreased rank attack, increased rank attack, and
worst parent attack [170], [171].
1) Decreased Rank Attacks: The malicious node advertises a lower rank value to its neighboring nodes, resulting in many neighbor nodes selecting the malicious node as their preferred parent. This is primarily a sinkhole attack [14], [171].
2) Increased Rank Attacks: In contrast to the decreased rank attack, the adversary advertises a larger rank even though it is actually near the root node, thereby trying to disrupt the routing topology and introduce more delays. The impact of this attack is relatively low when compared with the other ones [171].
3) Worst Parent Attacks: This class is challenging to recognize, since the malicious node advertises its original rank but chooses the worst parent for itself. It can employ the decreased rank method to attract more legitimate nodes to select it as a parent, and then send their packets along the worst routing path, resulting in increased delays or routing loops [14], [171].

2) Version Number Attacks: The version number attack exploits the global repair mechanism of RPL. Only the BR is allowed to change the version number in the DODAG. However, the RPL standard does not verify the integrity of the version number value transmitted with the DIO message [171]. If a malicious node advertises a DIO message with a higher version number, it triggers the global repair procedure, which can lead to topology inconsistency and routing loops, particularly if the adversary node is far away from the border router [172]–[174]. Fig. 24 illustrates the working of the version number attack in RPL. The incremented version number attack launched by malicious node no. 6 (the adversary node starts broadcasting the manipulated version number in the DIO, plotted in red) is automatically propagated by the legitimate nodes in the network (broadcast by relay nodes, sketched in blue).

Fig. 24. Illustration of version number attack. The incremented version number attack launched by malicious node no. 6 (the adversary node starts broadcasting the manipulated version number in the DIO, plotted in red) is automatically propagated by the legitimate nodes in the network (broadcast by relay nodes, sketched in blue).

3) Local Repair Attacks: In the local repair attack, the adversary node periodically advertises local repair messages to the nearby neighboring nodes when there is actually no problem in the DODAG structure. This induces the adjacent nodes to recompute their routing paths, which may include malicious nodes in their path, leading to the same


DODAG topology again. The primary goal of this attack is to generate more control traffic overhead, which in turn exhausts the battery life and resources of the nodes in the DODAG [36]. This impacts the overall lifetime of each node in the network. Le et al. showed in [169] that this type of attack could critically impact the packet delivery ratio because of increased control overhead transmission. However, it has less effect on the average end-to-end delay, since the optimal route is still available.

4) DIS Attack: If a node wishes to join the DODAG, it has to broadcast a DIS message to its neighboring nodes to get the topology information from them. In the DIS attack, an adversary misuses this feature by sending DIS messages periodically to its neighbors, as depicted in Fig. 25 [14], [175], [176]. This attack can be performed in two ways: broadcasting the DIS message or unicasting the DIS message [164].
1) Broadcast DIS [164]: When the adversary broadcasts the DIS messages, it might trigger a local repair mechanism in neighboring nodes. A few anomaly-based or specification-based IDSs can identify the attack based on its scale and the number of impacted neighboring nodes.
2) Unicast DIS [164]: Unicasting the DIS message causes an unwanted transfer of DIO messages among the neighboring nodes without resetting their trickle timers.

Fig. 25. Illustration of DIS attack in RPL.

5) Replay Attack: In the replay attack, the adversary records legitimate control traffic, namely, the DIO, DIS, and DAO messages exchanged between the legitimate nodes in the network, and later forwards it into the network. The primary purpose of this attack is to degrade the routing performance and reduce the packet delivery ratio in the network [171], [177]. The adversary can perform the attack even when the secure mode of RPL is active, and does not need to seize the cryptographic keys. In a few scenarios, it might separate part of the victim’s DODAG structure [14].

The DIO suppression attack is a special category of replay attack that misuses the trickle-timer algorithm for DODAG construction and maintenance [177].

The adversary replays the DIO messages received from a legitimate node many times at a particular frequency. This can be sufficient to convince the victim that there are no modifications in the network topology, so that it doubles its trickle timer value. The adversary thereby restricts the control traffic transmission of the victim. This behavior may lead to an incomplete routing path and sometimes detach the victim from the complete DODAG [178].

6) DODAG Inconsistency Attack: A DAG inconsistency is detected when there is a contradiction between the direction of flow of a packet and the rank relationship between the transmitting and receiving nodes. The adversary can misuse the global repair mechanism to perform the DAG inconsistency attack. In this attack, a malicious node can transmit inconsistent packets (upward direction with the “O” flag set or downward direction with the “O” flag clear) with the “R” flag also set [179], or it can modify the “O” and “R” flags of any packet it receives before forwarding it, as explained in Fig. 26 [179], [180].

The DODAG inconsistency attack causes increased control traffic overhead, higher resource consumption, and longer end-to-end delay than expected. If it employs packet manipulation to launch the attack, it may produce a blackhole-like scenario in which all the packets from the legitimate nodes passing through are dropped, and it becomes somewhat challenging to identify [179].

Fig. 26. Illustration of DODAG inconsistency attack.

VIII. CHALLENGES IN IOT SECURITY AND FUTURE DIRECTIONS

The number of IoT devices and their applications is growing rapidly and is attracting a lot of attention in many real-time applications. To achieve secure communication between many IoT devices at an affordable cost, it is necessary to develop efficient and effective security solutions to mitigate the problems. Here, we briefly address some of the significant challenges and future directions.


1) Global Connectivity: Since an enormous number of IoT devices are connected to the Internet and are open to extensive connectivity, there is a high potential for threats to emerge via the Internet, with extreme consequences for the complete network. For instance, in an IoT healthcare system, the impacts would be even worse.
2) Scalability: Security solutions should support scalability, since the number of IoT devices in a network keeps expanding at a high rate. Handling a large number of IoT devices demands scalable and intelligent intrusion detection approaches.
3) Resource-Constrained Nature: The limited battery life of IoT devices is another challenge when designing robust and intensive security approaches, which demand more processing power and computational complexity. The resource-constrained nature of the devices restricts the likelihood of developing advanced security solutions.
4) Mobility: One of the most significant advantages of IoT networks is that they support the mobility of the nodes. However, the energy consumption of the nodes is high during mobility, and if an attack is attempted at the same time, the control traffic overhead exchanged between the nodes increases drastically. The overall lifetime of the nodes is then drastically reduced.
5) Combination of Multiple Attacks: A combination of various attacks may be attempted at the same time, and even with an intelligent IDS, there could be undue damage to the network at the initial stage.
6) Overhead Traffic: When there is a security breach in the network, control traffic overhead messages are exchanged at a higher rate, which delays an immediate response to mitigate the attack because of the complexity of processing the tremendous overhead transmission in a real-time application.
7) Big Data: A vast amount of data traffic is exchanged in IoT networks. Such substantial volumes of data are generally termed big data. Advanced machine learning and data mining techniques need to be employed to extract vital information from the massive amount of data traffic exchanged in the IoT network. Moreover, not all the data traffic is necessary for further analysis and learning. Feature selection and extraction could be an additional challenge in IoT networks.
8) Assorted and Heterogeneous Device Connectivity: An IoT network may include heterogeneous devices that work on different platforms and frameworks. This creates operational complexity at the implementation level when we consider security solutions. There is a need to develop a standardized security solution that supports any platform and protocol architecture of IoT networks.
9) Availability of IoT Security Data Set for Training: For advanced IDS solutions that include DL and deep RL, there is a scarcity of IoT data sets for training. High quality and a large data set volume are quintessential for efficient machine learning and DL techniques.

IX. CONCLUSION

This article contributes a comprehensive review of machine learning-based IDS, DL-based IDS, and RL-based IDS, covering their practical working principles, benefits, drawbacks, and application in IDS solutions, with a sketch for each method. This article is also intended as a one-stop survey for researchers exploring advanced and up-to-date knowledge on prevailing intelligent intrusion strategies, WSN-inherited security threats, and RPL-specific security attacks in IoT environments. Further, we illustrated the implementation of various categories of security attacks in IoT with a diagram for each. IoT supports several advanced and exciting applications in many fields; however, IoT devices are an apparent victim of security attacks because of their global connectivity, limited battery life, ad hoc nature, and mobility. Hence, it is essential to have continuous monitoring and intelligent intrusion detection techniques for identifying as yet unknown attacks in the future. For analyzing the massive data traffic in IoT, advanced machine learning-based IDS can be recommended as a promising solution to predict unfamiliar and novel attack instances. However, more research effort should be focused on optimizing the resource consumption of IDSs to support low-power IoT devices in the future.

We trust that this survey could be a prominent reference article for future researchers in this field attempting to investigate and better develop advanced mitigation techniques for enhancing IoT security.

REFERENCES

[1] M. R. Palattella et al., “Standardized protocol stack for the Internet of (Important) Things,” IEEE Commun. Surveys Tuts., vol. 15, no. 3, pp. 1389–1406, 3rd Quart., 2013.
[2] L. Atzori, A. Iera, and G. Morabito, “The Internet of Things: A survey,” Comput. Netw., vol. 54, no. 15, pp. 2787–2805, 2010.
[3] S. Deering and R. Hinden, “Internet protocol, version 6 (IPv6) specification,” Internet Eng. Task Force, RFC 8200, Jul. 2017. [Online]. Available: https://www.rfc-editor.org/info/rfc8200
[4] G. Montenegro, C. Schumacher, and N. Kushalnagar, “IPv6 over low-power wireless personal area networks (6LoWPANs): Overview, assumptions, problem statement, and goals,” Internet Eng. Task Force, RFC 4919, 2007. Accessed: Sep. 2017. [Online]. Available: https://rfc-editor.org/rfc/rfc4919.txt
[5] A. Le, J. Loo, A. Lasebae, M. Aiash, and Y. Luo, “6LoWPAN: A study on QoS security threats and countermeasures using intrusion detection system approach,” Int. J. Commun. Syst., vol. 25, no. 9, pp. 1189–1212, 2012.
[6] T. Winter and P. Thubert, “RPL: IPv6 routing protocol for low power and lossy networks,” IETF, RFC 6550, Mar. 2010. [Online]. Available: https://rfc-editor.org/rfc/rfc6550.txt
[7] L. Wallgren, S. Raza, and T. Voigt, “Routing attacks and countermeasures in the RPL-based Internet of Things,” Int. J. Distrib. Sens. Netw., vol. 9, no. 8, pp. 1–11, 2013, doi: 10.1155/2013/794326.
[8] J. P. Vasseur, M. Kim, K. Pister, N. Dejean, and D. Barthel, “Routing metrics used for path calculation in low-power and lossy networks,” Internet Eng. Task Force, RFC 6551, Mar. 2012. [Online]. Available: https://rfc-editor.org/rfc/rfc6551.txt
[9] F. A. Alaba, M. Othman, I. A. T. Hashem, and F. Alotaibi, “Internet of Things security: A survey,” J. Netw. Comput. Appl., vol. 88, pp. 10–28, Jun. 2017.


[10] D. Airehrour, J. Gutierrez, and S. K. Ray, “Secure routing for Internet [32] R. Mitchell and I.-R. Chen, “A survey of intrusion detection techniques
of Things: A survey,” J. Netw. Comput. Appl., vol. 66, pp. 198–213, for cyber-physical systems,” ACM Comput. Surveys, vol. 46, no. 4,
May 2016. pp. 1–29, 2014.
[11] B. B. Zarpelao, R. S. Miani, C. T. Kawakani, and S. C. de Alvarenga, [33] F. Yihunie, E. Abdelfattah, and A. Regmi, “Applying machine learning
“A survey of intrusion detection in Internet of Things,” J. Netw. to anomaly-based intrusion detection systems,” in Proc. IEEE Long
Comput. Appl., vol. 84, pp. 25–37, Apr. 2017. Island Syst. Appl. Technol. Conf. (LISAT), Farmingdale, NY, USA,
[12] D. Shreenivas, S. Raza, and T. Voigt, “Intrusion detection in the RPL- 2019, pp. 1–5.
connected 6LoWPAN networks,” in Proc. 3rd ACM Int. Workshop IoT [34] D. B. Gothawal and S. V. Nagaraj, “Anomaly-based intrusion detec-
Privacy Trust Security (IoTPTS), 2017, pp. 31–38. tion system in RPL by applying stochastic and evolutionary game
[13] M. Mohammadi, A. Al-Fuqaha, S. Sorour, and M. Guizani, “Deep models over IoT environment,” Wireless Pers. Commun., vol. 110,
learning for IoT big data and streaming analytics: A survey,” IEEE pp. 1323–1344, Sep. 2020.
Commun. Surveys Tuts., vol. 20, no. 4, pp. 2923–2960, 4th Quart., [35] B. Farzaneh, M. A. Montazeri, and S. Jamali, “An anomaly-based IDS
2018, doi: 10.1109/COMST.2018.2844341. for detecting attacks in RPL-based Internet of Things,” in Proc. 5th
[14] A. Raoof, A. Matrawy, and C.-H. Lung, “Routing attacks and mit- Int. Conf. Web Res. (ICWR), Tehran, Iran, 2019, pp. 61–66.
igation methods for RPL-based Internet of Things,” IEEE Commun. [36] P. Pongle and G. Chavan, “A survey: Attacks on RPL and 6LoWPAN in
Surveys Tuts., vol. 21, no. 2, pp. 1582–1606, 2nd Quart., 2019, IoT,” in Proc. Int. Conf. Pervasive Comput. (ICPC), Jan. 2015, pp. 1–6.
doi: 10.1109/COMST.2018.2885894. [37] J. P. Amaral, L. M. Oliveira, J. J. P. C. Rodrigues, G. Han, and L. Shu,
“Policy and network-based intrusion detection system for IPv6-enabled
[15] J. Granjal, E. Monteiro, and J. Sá Silva, “Security for the Internet of
wireless sensor networks,” in Proc. IEEE Int. Conf. Commun. (ICC),
Things: A survey of existing protocols and open research issues,” IEEE
2014, pp. 1796–1801.
Commun. Surveys Tuts., vol. 17, no. 3, pp. 1294–1312, 3rd Quart.,
[38] M. M. Shurman, R. M. Khrais, and A. A. Yateem, “IoT denial-of-
2015.
service attack detection and prevention using hybrid IDS,” in Proc.
[16] S. Murali and A. Jamalipour, “A lightweight intrusion detection for Int. Arab Conf. Inf. Technol. (ACIT), 2019, pp. 252–254.
sybil attack under mobile RPL in the Internet of Things,” IEEE Internet [39] C. Cervantes, D. Poplade, M. Nogueira, and A. Santos, “Detection
Things J., vol. 7, no. 1, pp. 379–388, Jan. 2020. of Sinkhole attacks for supporting secure routing on 6LoWPAN for
[17] A. Le, J. Loo, K. K. Chai, and M. Aiash, “A specification-based IDS Internet of Things,” in Proc. IFIP/IEEE Int. Symp. Integr. Netw. Manag.
for detecting attacks on RPL-based network topology,” Information, (IM), 2015, pp. 606–611.
vol. 7, no. 2, p. 25, 2016. [40] H. Bostani and M. Sheikhan, “Hybrid of anomaly-based and
specification-based IDS for Internet of Things using unsupervised OPF based on mapreduce approach,” Comput. Commun., vol. 98, pp. 52–71, Jan. 2016.
[18] A. Le, J. Loo, L. Yuan, and A. Lasebae, “Specification-based IDS for securing RPL from topology attacks,” in Proc. IFIP Wireless Days (WD), 2011, pp. 1–3.
[19] S. Raza, L. Wallgren, and T. Voigt, “SVELTE: Real-time intrusion detection in the Internet of Things,” Ad Hoc Netw., vol. 11, no. 8, pp. 2661–2674, 2013.
[20] G. Glissa, A. Rachedi, and A. Meddeb, “A secure routing protocol based on RPL for Internet of Things,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), 2016, pp. 1–7.
[21] R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman, “Deep learning approach for intelligent intrusion detection system,” IEEE Access, vol. 7, pp. 41525–41550, 2019.
[22] A. Khraisat and A. Alazab, “A critical review of intrusion detection systems in the Internet of Things: Techniques, deployment strategy, validation strategy, attacks, public datasets and challenges,” Cybersecurity, vol. 4, pp. 1–27, Mar. 2021.
[23] G. Creech and J. Hu, “A semantic approach to host-based intrusion detection systems using contiguous and discontiguous system call patterns,” IEEE Trans. Comput., vol. 63, no. 4, pp. 807–819, Apr. 2014.
[24] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamruzzaman, “Survey of intrusion detection systems: Techniques, datasets and challenges,” Cybersecurity, vol. 2, p. 20, Jul. 2019.
[25] C. Liu, J. Yang, Y. Zhang, R. Chen, and J. Zeng, “Research on immunity-based intrusion detection technology for the Internet of Things,” in Proc. 7th Int. Conf. Natural Comput. (ICNC), vol. 1, 2011, pp. 212–216.
[26] H.-J. Liao, C.-H. R. Lin, Y.-C. Lin, and K.-Y. Tung, “Intrusion detection system: A comprehensive review,” J. Netw. Comput. Appl., vol. 36, no. 1, pp. 16–24, 2013.
[27] J. Krimmling and S. Peter, “Integration and evaluation of intrusion detection for CoAP in smart city applications,” in Proc. IEEE Conf. Commun. Netw. Security, 2014, pp. 73–78.
[28] P. Kasinathan, C. Pastrone, M. A. Spirito, and M. Vinkovits, “Denial-of-service detection in 6LoWPAN based Internet of Things,” in Proc. IEEE 9th Int. Conf. Wireless Mobile Comput. Netw. Commun. (WiMob), 2013, pp. 600–607.
[29] P. Kasinathan, G. Costamagna, H. Khaleel, C. Pastrone, and M. A. Spirito, “DEMO: An IDS framework for Internet of Things empowered by 6LoWPAN,” in Proc. ACM SIGSAC Conf. Comput. Commun. Security (CCS), 2013, pp. 1337–1340.
[30] D. Oh, D. Kim, and W. W. Ro, “A malicious pattern detection engine for embedded security systems in the Internet of Things,” Sensors, vol. 14, no. 12, pp. 24188–24211, 2014.
[31] H. Debar and J. Vinikka, Intrusion Detection: Introduction to Intrusion Detection and Security Information Management (Lecture Notes in Computer Science), vol. 3655. Berlin, Germany: Springer, 2005, pp. 207–236.
[41] A. A. Gendreau and M. Moorman, “Survey of intrusion detection systems towards an end to end secure Internet of Things,” in Proc. IEEE 4th Int. Conf. Future Internet Things Cloud (FiCloud), 2016, pp. 84–90.
[42] A. Tabassum, A. Erbad, A. Mohamed, and M. Guizani, “Privacy-preserving distributed IDS using incremental learning for IoT health systems,” IEEE Access, vol. 9, pp. 14271–14283, 2021, doi: 10.1109/ACCESS.2021.3051530.
[43] I. Butun, S. D. Morgera, and R. Sankar, “A survey of intrusion detection systems in wireless sensor networks,” IEEE Commun. Surveys Tuts., vol. 16, no. 1, pp. 266–282, 1st Quart., 2014.
[44] E. D. Alalade, “Intrusion detection system in smart home network using artificial immune system and extreme learning machine hybrid approach,” in Proc. IEEE 6th World Forum Internet Things (WF-IoT), 2020, pp. 1–2.
[45] A. Althubaity, H. Ji, T. Gong, M. Nixon, R. Ammar, and S. Han, “ARM: A hybrid specification-based intrusion detection system for rank attacks in 6TiSCH networks,” in Proc. 22nd IEEE Int. Conf. Emerg. Technol. Factory Automat. (ETFA), Limassol, Cyprus, 2017, pp. 1–8.
[46] J. Asharf, N. Moustafa, H. Khurshid, E. Debie, W. Haider, and A. Wahab, “A review of intrusion detection systems using machine and deep learning in Internet of Things: Challenges, solutions and future directions,” Electronics, vol. 9, no. 7, pp. 1–45, 2020.
[47] M. A. Al-Garadi, A. Mohamed, A. K. Al-Ali, X. Du, I. Ali, and M. Guizani, “A survey of machine and deep learning methods for Internet of Things (IoT) security,” IEEE Commun. Surveys Tuts., vol. 22, no. 3, pp. 1646–1685, 3rd Quart., 2020.
[48] A. L. Buczak and E. Guven, “A survey of data mining and machine learning methods for cyber security intrusion detection,” IEEE Commun. Surveys Tuts., vol. 18, no. 2, pp. 1153–1176, 2nd Quart., 2016.
[49] F. Hussain, R. Hussain, S. A. Hassan, and E. Hossain, “Machine learning in IoT security: Current solutions and future challenges,” IEEE Commun. Surveys Tuts., vol. 22, no. 3, pp. 1686–1721, 3rd Quart., 2020, doi: 10.1109/COMST.2020.2986444.
[50] K. Rai, M. S. Devi, and A. Guleria, “Decision tree based algorithm for intrusion detection,” Int. J. Adv. Netw. Appl., vol. 7, no. 4, pp. 2828–2834, 2016.
[51] S. B. Kotsiantis, I. Zaharakis, and P. Pintelas, “Supervised machine learning: A review of classification techniques,” Emerg. Artif. Intell. Appl. Comput. Eng., vol. 31, pp. 3–24, Oct. 2007.
[52] S. B. Kotsiantis, “Decision trees: A recent overview,” Artif. Intell. Rev., vol. 39, pp. 261–283, 2013, doi: 10.1007/s10462-011-9272-4.
[53] L. Cui, S. Yang, F. Chen, Z. Ming, N. Lu, and J. Qin, “A survey on application of machine learning for Internet of Things,” Int. J. Mach. Learn. Cybern., vol. 9, no. 8, pp. 1399–1417, 2018.

[54] P. Soucy and G. W. Mineau, “A simple KNN algorithm for text categorization,” in Proc. IEEE Int. Conf. Data Mining, 2001, pp. 647–648.
[55] Y. Liao and V. R. Vemuri, “Use of K-nearest neighbor classifier for intrusion detection,” Comput. Security, vol. 21, no. 5, pp. 439–448, 2002.
[56] W. Li, P. Yi, Y. Wu, L. Pan, and J. Li, “A new intrusion detection system based on KNN classification algorithm in wireless sensor network,” J. Elect. Comput. Eng., vol. 2014, pp. 240217:1–240217:8, Jun. 2014.
[57] M.-Y. Su, “Real-time anomaly detection systems for Denial-of-Service attacks by weighted K-nearest-neighbor classifiers,” Expert Syst. Appl., vol. 38, no. 4, pp. 3492–3498, 2011.
[58] A. O. Adetunmbi, S. O. Falaki, O. S. Adewale, and B. K. Alese, “Network intrusion detection based on rough set and k-nearest neighbour,” Int. J. Comput. ICT Res., vol. 2, pp. 60–66, Jun. 2008.
[59] M. Panda and M. R. Patra, “Network intrusion detection using Naive Bayes,” Int. J. Comput. Sci. Netw. Security, vol. 7, no. 12, pp. 258–263, 2007.
[60] S. Mukherjee and N. Sharma, “Intrusion detection using Naive Bayes classifier with feature reduction,” Procedia Technol., vol. 4, pp. 119–128, May 2012.
[61] M. Swarnkar and N. Hubballi, “OCPAD: One class Naive Bayes classifier for payload based anomaly detection,” Exp. Syst. Appl., vol. 64, pp. 330–339, Dec. 2016.
[62] L. Koc, T. A. Mazzuchi, and S. Sarkani, “A network intrusion detection system based on a hidden Naive Bayes multiclass classifier,” Exp. Syst. Appl., vol. 39, no. 18, pp. 13492–13500, 2012.
[63] S. Tong and D. Koller, “Support vector machine active learning with applications to text classification,” J. Mach. Learn. Res., vol. 2, pp. 45–66, Nov. 2001.
[64] W. Hu, Y. Liao, and V. R. Vemuri, “Robust support vector machines for anomaly detection in computer security,” in Proc. ICMLA, 2003, pp. 168–174.
[65] H.-S. Ham, H.-H. Kim, M.-S. Kim, and M.-J. Choi, “Linear SVM-based android malware detection for reliable IoT services,” J. Appl. Math., vol. 2014, pp. 1–10, Sep. 2014.
[66] M. C. Belavagi and B. Muniyal, “Performance evaluation of supervised machine learning algorithms for intrusion detection,” Procedia Comput. Sci., vol. 89, pp. 117–123, Jun. 2016.
[67] S. Garg, K. Kaur, G. Kaddoum, F. Gagnon, N. Kumar, and Z. Han, “Sec-IoV: A multi-stage anomaly detection scheme for Internet of Vehicles,” in Proc. ACM MobiHoc Workshop Pervasive Syst. IoT Era, Catania, Italy, 2019, pp. 37–42.
[68] R. Boutaba et al., “A comprehensive survey on machine learning for networking: Evolution, applications and research opportunities,” J. Internet Services Appl., vol. 9, p. 16, Jun. 2018.
[69] R. K. S. Gautam and E. A. Doegar, “An ensemble approach for intrusion detection system using machine learning algorithms,” in Proc. 8th Int. Conf. Cloud Comput. Data Sci. Eng. (Confluence), Noida, India, 2018, pp. 14–15.
[70] M. Raihan-Al-Masud and H. A. Mustafa, “Network intrusion detection system using voting ensemble machine learning,” in Proc. IEEE Int. Conf. Telecommun. Photon. (ICTP), Dhaka, Bangladesh, 2019, pp. 1–4.
[71] D. Stiawan et al., “An approach for optimizing ensemble intrusion detection systems,” IEEE Access, vol. 9, pp. 6930–6947, 2018.
[72] Y. Shen, K. Zheng, C. Wu, M. Zhang, X. Niu, and Y. Yang, “An ensemble method based on selection using bat algorithm for intrusion detection,” Comput. J., vol. 61, no. 4, pp. 526–538, 2018.
[73] M. A. Jabbar, R. S. Aluvalu, and S. S. Reddy, “RFAODE: A novel ensemble intrusion detection system,” Procedia Comput. Sci., vol. 115, pp. 226–234, Jan. 2017.
[74] A. Khraisat, I. Gondal, P. Vamplew, J. Kamruzzaman, and A. Alazab, “Hybrid intrusion detection system based on the stacking ensemble of C5 decision tree classifier and one class support vector machine,” Electronics, vol. 9, no. 1, p. 173, 2020.
[75] M. Usama et al., “Unsupervised machine learning for networking: Techniques, applications and research challenges,” IEEE Access, vol. 7, pp. 65579–65615, 2019.
[76] K. R. Dalal, “Analysing the role of supervised and unsupervised machine learning in IoT,” in Proc. Int. Conf. Electron. Sustain. Commun. Syst. (ICESC), Coimbatore, India, 2020, pp. 75–79.
[77] S. Wang, J. Cai, Q. Lin, and W. Guo, “An overview of unsupervised deep feature representation for text categorization,” IEEE Trans. Comput. Soc. Syst., vol. 6, no. 3, pp. 504–517, Jun. 2019.
[78] J. Lin, W. Yu, N. Zhang, X. Yang, H. Zhang, and W. Zhao, “A survey on Internet of Things: Architecture, enabling technologies, security and privacy, and applications,” IEEE Internet Things J., vol. 4, no. 5, pp. 1125–1142, Oct. 2017.
[79] V. Gazis, “A survey of standards for machine-to-machine and the Internet of Things,” IEEE Commun. Surveys Tuts., vol. 19, no. 1, pp. 482–511, 1st Quart., 2017.
[80] Q. Liu, P. Li, W. Zhao, W. Cai, S. Yu, and V. C. M. Leung, “A survey on security threats and defensive techniques of machine learning: A data driven view,” IEEE Access, vol. 6, pp. 12103–12117, 2018.
[81] J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A K-means clustering algorithm,” J. Roy. Stat. Soc. Ser. C, Appl. Stat., vol. 28, no. 1, pp. 100–108, 1979.
[82] S. Kanjanawattana, “A novel outlier detection applied to an adaptive K-means,” Int. Mach. Learn. Comput., vol. 9, pp. 569–574, Oct. 2019.
[83] M. H. Bhuyan, D. K. Bhattacharyya, and J. K. Kalita, “Network anomaly detection: Methods, systems and tools,” IEEE Commun. Surveys Tuts., vol. 16, no. 1, pp. 303–336, 1st Quart., 2014.
[84] A. P. Muniyandi, R. Rajeswari, and R. Rajaram, “Network anomaly detection by cascading K-Means clustering and C4.5 decision tree algorithm,” Procedia Eng., vol. 30, pp. 174–182, Jan. 2012.
[85] J. Qi, Y. Yu, L. Wang, and J. Liu, “K*-Means: An effective and efficient K-means clustering algorithm,” in Proc. IEEE Int. Conf. Big Data Cloud Comput. (BDCloud) Soc. Comput. Netw. (SocialCom) Sustain. Comput. Commun. (SustainCom) (BDCloud-SocialCom-SustainCom), Atlanta, GA, USA, 2016, pp. 242–249.
[86] F. E. Heba, A. Darwish, A. E. Hassanien, and A. Abraham, “Principle components analysis and support vector machine based intrusion detection system,” in Proc. 10th Int. Conf. Intell. Syst. Design Appl., Cairo, Egypt, 2010, pp. 363–367.
[87] B. Subba, S. Biswas, and S. Karmakar, “Enhancing performance of anomaly based intrusion detection systems through dimensionality reduction using principal component analysis,” in Proc. IEEE Int. Conf. Adv. Netw. Telecommun. Syst. (ANTS), Bangalore, India, 2016, pp. 1–6.
[88] S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemom. Intell. Lab. Syst., vol. 2, nos. 1–3, pp. 37–52, 1987.
[89] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press, 1998.
[90] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[91] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Process. Mag., vol. 34, no. 6, pp. 26–38, Nov. 2017.
[92] T. T. Nguyen, N. D. Nguyen, and S. Nahavandi, “Deep reinforcement learning for multiagent systems: A review of challenges, solutions, and applications,” IEEE Trans. Cybern., vol. 50, no. 9, pp. 3826–3839, Sep. 2020.
[93] N. C. Luong et al., “Applications of deep reinforcement learning in communications and networking: A survey,” IEEE Commun. Surveys Tuts., vol. 21, no. 4, pp. 3133–3174, 4th Quart., 2019.
[94] C. J. Watkins and P. Dayan, “Q-learning,” Mach. Learn., vol. 8, no. 3, pp. 279–292, 1992.
[95] H. V. Hasselt, “Double Q-learning,” in Proc. Adv. Neural Inf. Process. Syst., 2010, pp. 2613–2621.
[96] H. V. Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double Q-learning,” in Proc. 30th AAAI Conf. Artif. Intell., Feb. 2016, pp. 2094–2100.
[97] Z. Wang, T. Schaul, M. Hessel, H. Hasselt, M. Lanctot, and N. D. Freitas, “Dueling network architectures for deep reinforcement learning,” in Proc. Int. Conf. Mach. Learn., 2016, pp. 1995–2003.
[98] M. Lopez-Martin, B. Carro, and A. Sanchez-Esguevillas, “Application of deep reinforcement learning to intrusion detection for supervised problems,” Expert Syst. Appl., vol. 141, pp. 1–15, Mar. 2020.
[99] S. Pouyanfar et al., “A survey on deep learning: Algorithms, techniques, and applications,” ACM Comput. Surv., vol. 51, no. 5, pp. 1–36, 2019.
[100] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[101] X.-W. Chen and X. Lin, “Big data deep learning: Challenges and perspectives,” IEEE Access, vol. 2, pp. 514–525, 2014.
[102] H. Liu and B. Lang, “Machine learning and deep learning methods for intrusion detection systems: A survey,” Appl. Sci., vol. 9, no. 20, pp. 1–28, 2019.
[103] E. Hodo, X. Bellekens, A. Hamilton, C. Tachtatzis, and R. Atkinson, “Shallow and deep networks intrusion detection system: A taxonomy and survey,” 2017, arXiv:1701.02145.
[104] D. C. Cireşan, U. Meier, J. Masci, L. M. Gambardella, and J. Schmidhuber, “Flexible, high performance convolutional neural networks for image classification,” in Proc. 22nd Int. Joint Conf. Artif. Intell., Barcelona, Spain, 2011, pp. 1237–1242.

[105] D. Scherer, A. Müller, and S. Behnke, “Evaluation of pooling operations in convolutional architectures for object recognition,” in Proc. Int. Conf. Artif. Neural Netw., 2010, pp. 92–101.
[106] W. Wang, M. Zhu, X. Zeng, X. Ye, and Y. Sheng, “Malware traffic classification using convolutional neural network for representation learning,” in Proc. Int. Conf. Inf. Netw. (ICOIN), Da Nang, Vietnam, 2017, pp. 712–717.
[107] W. Abbass, Z. Bakraouy, A. Baina, and M. Bellafkih, “Classifying IoT security risks using deep learning algorithms,” in Proc. 6th Int. Conf. Wireless Netw. Mobile Commun. (WINCOM), Marrakesh, Morocco, 2018, pp. 1–6.
[108] Y. Chen, Y. Zhang, and S. Maharjan, “Deep learning for secure mobile edge computing,” 2017, arXiv:1709.08025.
[109] N. McLaughlin et al., “Deep android malware detection,” in Proc. 7th ACM Conf. Data Appl. Security Privacy, Scottsdale, USA, 2017, pp. 301–308.
[110] S. Garg, K. Kaur, N. Kumar, G. Kaddoum, A. Y. Zomaya, and R. Ranjan, “A hybrid deep learning-based model for anomaly detection in cloud datacenter networks,” IEEE Trans. Netw. Services Manag., vol. 16, no. 3, pp. 924–935, Sep. 2019.
[111] D. S. Berman, A. L. Buczak, J. S. Chavis, and C. L. Corbett, “A survey of deep learning methods for cyber security,” Information, vol. 10, p. 122, Apr. 2019.
[112] D. Dasgupta, Z. Akhtar, and S. Sen, “Machine learning in cybersecurity: A comprehensive survey,” J. Defense Model. Simul., pp. 1–50, Sep. 2020, doi: 10.1177/1548512920951275.
[113] A. Graves, A.-R. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Vancouver, BC, Canada, 2013, pp. 6645–6649.
[114] A. Graves and N. Jaitly, “Towards end-to-end speech recognition with recurrent neural networks,” in Proc. Int. Conf. Mach. Learn., 2014, pp. 1764–1772.
[115] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Proc. Annu. Conf. Neural Inf. Process. Syst., 2014, pp. 3104–3112.
[116] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, pp. 1735–1780, Nov. 1997.
[117] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” in Proc. NIPS Workshop Deep Learn., 2014, pp. 1–9.
[118] M. Schuster and K. K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Trans. Signal Process., vol. 45, no. 11, pp. 2673–2681, Nov. 1997.
[119] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proc. Int. Conf. Artif. Intell. Stat., vol. 9, 2010, pp. 249–256.
[120] X. Li and X. Wu, “Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), South Brisbane, QLD, Australia, 2015, pp. 4520–4524.
[121] P. Torres, C. Catania, S. Garcia, and C. G. Garino, “An analysis of recurrent neural networks for botnet detection behavior,” in Proc. IEEE Biennial Congr. Argentina (ARGENCON), Buenos Aires, Argentina, 2016, pp. 1–6.
[122] M. Almiani, A. AbuGhazleh, A. Al-Rahayfeh, S. Atiewi, and A. Razaque, “Deep recurrent neural network for IoT intrusion detection system,” Simul. Model. Pract. Theory, vol. 101, May 2020, Art. no. 102031.
[123] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[124] Y. Wang, H. Yao, and S. Zhao, “Auto-encoder based dimensionality reduction,” Neurocomputing, vol. 184, pp. 232–242, Apr. 2016.
[125] J. Chen, B. Xie, H. Zhang, and J. Zhai, “Deep autoencoders in pattern recognition: A survey in bio-inspired computing models and algorithms,” in Bio-Inspired Computing Models and Algorithms. Singapore: World Sci., 2019, pp. 229–255.
[126] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” in Proc. 25th Int. Conf. Mach. Learn., 2008, pp. 1096–1103.
[127] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P. A. Manzagol, “Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion,” J. Mach. Learn. Res., vol. 11, no. 110, pp. 3371–3408, 2010.
[128] J. Deng, Z. Zhang, E. Marchi, and B. Schuller, “Sparse autoencoder-based feature transfer learning for speech emotion recognition,” in Proc. Humaine Asso. Conf. Affect. Comput. Intell. Interact., Geneva, Switzerland, 2013, pp. 511–516.
[129] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupakula, “Autoencoder-based feature learning for cyber security applications,” in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Anchorage, AK, USA, 2017, pp. 3854–3861.
[130] Y. Mirsky, T. Doitshman, Y. Elovici, and A. Shabtai, “Kitsune: An ensemble of autoencoders for online network intrusion detection,” 2018, arXiv:1802.09089.
[131] Y. Li, R. Ma, and R. Jiao, “A hybrid malicious code detection method based on deep learning,” Int. J. Security Appl., vol. 9, no. 5, pp. 205–216, 2015.
[132] G. E. Hinton, “A practical guide to training restricted Boltzmann machines,” in Neural Networks: Tricks of the Trade. Heidelberg, Germany: Springer, 2012, pp. 599–619.
[133] J. Zhang, “Deep transfer learning via restricted Boltzmann machine for document classification,” in Proc. 10th Int. Conf. Mach. Learn. Appl. Workshops, Honolulu, HI, USA, 2011, pp. 323–326.
[134] N. Jaitly and G. E. Hinton, “Learning a better representation of speech soundwaves using restricted Boltzmann machines,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), Prague, Czech Republic, 2011, pp. 5884–5887.
[135] U. Fiore, F. Palmieri, A. Castiglione, and A. De Santis, “Network anomaly detection with the restricted Boltzmann machine,” Neurocomputing, vol. 122, pp. 13–23, Dec. 2013.
[136] T. Aldwairi, D. Perera, and M. A. Novotny, “An evaluation of the performance of Restricted Boltzmann Machines as a model for anomaly network intrusion detection,” Comput. Netw., vol. 144, pp. 111–119, Oct. 2018.
[137] M. Mayuranathan, M. Murugan, and V. Dhanakoti, “Best features based intrusion detection system by RBM model for detecting DDoS in cloud environment,” J. Ambient Intell. Hum. Comput., vol. 12, pp. 3609–3619, Dec. 2019.
[138] A. Elsaeidy, K. S. Munasinghe, D. Sharma, and A. Jamalipour, “Intrusion detection in smart cities using restricted Boltzmann machines,” J. Netw. Comput. Appl., vol. 135, pp. 76–83, Jun. 2019.
[139] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Comput., vol. 18, no. 7, pp. 1527–1554, 2006.
[140] Q. Zhang, L. T. Yang, Z. Chen, and P. Li, “A survey on deep learning for big data,” Inf. Fusion, vol. 42, pp. 146–157, Jul. 2018.
[141] Q. Tian, D. Han, K.-C. Li, X. Liu, L. Duan, and A. Castiglione, “An intrusion detection approach based on improved deep belief network,” Appl. Intell., vol. 50, pp. 3162–3178, May 2020.
[142] I. Goodfellow et al., “Generative adversarial nets,” in Advances in Neural Information Processing Systems. Red Hook, NY, USA: Curran, 2014, pp. 2672–2680.
[143] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, “Improved techniques for training GANs,” in Advances in Neural Information Processing Systems. Red Hook, NY, USA: Curran, 2016, pp. 2234–2242.
[144] J.-Y. Kim, S.-J. Bu, and S.-B. Cho, “Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders,” Inf. Sci., vols. 460–461, pp. 83–102, Sep. 2018.
[145] R. E. Hiromoto, M. Haney, and A. Vakanski, “A secure architecture for IoT with supply chain risk management,” in Proc. 9th IEEE Int. Conf. Intell. Data Acquisit. Adv. Comput. Syst. Technol. Appl. (IDAACS), Bucharest, Romania, 2017, pp. 431–435.
[146] K. Weekly and K. Pister, “Evaluating sinkhole defense techniques in RPL networks,” in Proc. 20th IEEE Int. Conf. Netw. Protocols (ICNP), Austin, TX, USA, 2012, pp. 1–6.
[147] M. Alzubaidi, M. Anbar, S. Al-Saleem, S. Al-Sarawi, and K. Alieyan, “Review on mechanisms for detecting sinkhole attacks on RPLs,” in Proc. 8th Int. Conf. Inf. Technol. (ICIT), Amman, Jordan, 2017, pp. 369–374.
[148] D. Airehrour, J. Gutierrez, and S. K. Ray, “Securing RPL routing protocol from blackhole attacks using a trust-based mechanism,” in Proc. 26th Int. Telecommun. Netw. Appl. Conf. (ITNAC), Dunedin, New Zealand, 2016, pp. 115–120.
[149] R. Sahay, G. Geethakumari, B. Mitra, and V. Thejas, “Exponential smoothing based approach for detection of blackhole attacks in IoT,” in Proc. IEEE Int. Conf. Adv. Netw. Telecommun. Syst. (ANTS), Indore, India, 2018, pp. 1–6.

[150] E. G. Ribera, B. M. Alvarez, C. Samuel, P. P. Ioulianou, and V. G. Vassilakis, “Heartbeat-based detection of blackhole and Greyhole attacks in RPL networks,” in Proc. 12th Int. Symp. Commun. Syst. Netw. Digit. Signal Process. (CSNDSP), Porto, Portugal, 2020, pp. 1–6.
[151] S. Ali, M. A. Khan, J. Ahmad, A. W. Malik, and A. ur Rehman, “Detection and prevention of black hole attacks in IOT & WSN,” in Proc. 3rd Int. Conf. Fog Mobile Edge Comput. (FMEC), Barcelona, Spain, 2018, pp. 217–226.
[152] Y.-C. Hu, A. Perrig, and D. B. Johnson, “Wormhole attacks in wireless networks,” IEEE J. Sel. Areas Commun., vol. 24, no. 2, pp. 370–380, Feb. 2006.
[153] S. Ji, T. Chen, and S. Zhong, “Wormhole attack detection algorithms in wireless network coding systems,” IEEE Trans. Mobile Comput., vol. 14, no. 3, pp. 660–674, Mar. 2015.
[154] M. Goyal and M. Dutta, “Intrusion detection of wormhole attack in IoT: A review,” in Proc. Int. Conf. Circuits Syst. Digit. Enterprise Technol. (ICCSDET), Kottayam, India, 2018, pp. 1–5.
[155] P. Pongle and G. Chavan, “Real time intrusion and wormhole attack detection in Internet of Things,” Int. J. Comput. Appl., vol. 121, no. 9, pp. 1–9, 2015.
[156] R. Mehta and M. M. Parmar, “Trust based mechanism for securing IoT routing protocol RPL against wormhole & Grayhole attack,” in Proc. 3rd Int. Conf. Converg. Technol. (I2CT), Pune, India, 2018, pp. 1–6.
[157] A. Raoof, A. Matrawy, and C.-H. Lung, “Enhancing routing security in IoT: Performance evaluation of RPL’s secure mode under attacks,” IEEE Internet Things J., vol. 7, no. 12, pp. 11536–11546, Dec. 2020.
[158] R. P. Parameswarath, C. Y. Eugene, N. V. Abhishek, T. J. Lim, and B. Sikdar, “Detecting selective forwarding using sentinels in clustered IoT networks,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Taipei, Taiwan, 2020, pp. 1–6.
[159] K. Zhang, X. Liang, R. Lu, and X. Shen, “Sybil attacks and their defenses in the Internet of Things,” IEEE Internet Things J., vol. 1, no. 5, pp. 372–383, Oct. 2014.
[160] F. Medjek, D. Tandjaoui, M. R. Abdmeziem, and N. Djedjig, “Analytical evaluation of the impacts of Sybil attacks against RPL under mobility,” in Proc. Int. Symp. Program. Syst. (ISPS), Algiers, Algeria, 2015, pp. 1–9.
[161] A. K. Mishra, A. K. Tripathy, D. Puthal, and L. T. Yang, “Analytical model for Sybil attack phases in Internet of Things,” IEEE Internet Things J., vol. 6, no. 1, pp. 379–387, Feb. 2019.
[162] W. Yang, Y. Wang, Z. Lai, Y. Wan, and Z. Cheng, “Security vulnerabilities and countermeasures in the RPL-based Internet of Things,” in Proc. Int. Conf. Cyber Enabled Distrib. Comput. Knowl. Discov. (CyberC), Zhengzhou, China, 2018, pp. 49–495.
[163] N. Djedjig, D. Tandjaoui, and F. Medjek, “Trust-based RPL for the Internet of Things,” in Proc. IEEE Symp. Comput. Commun. (ISCC), Larnaca, Cyprus, 2015, pp. 962–967.
[164] A. Mayzaud, R. Badonnel, and I. Chrisment, “A taxonomy of attacks in RPL-based Internet of Things,” Int. J. Netw. Security, vol. 18, no. 3, pp. 459–473, 2016.
[165] I. Butun, P. Österberg, and H. Song, “Security of the Internet of Things: Vulnerabilities, attacks, and countermeasures,” IEEE Commun. Surveys Tuts., vol. 22, no. 1, pp. 616–644, 1st Quart., 2020.
[166] S. Murali and A. Jamalipour, “Mobility-aware energy-efficient parent selection algorithm for low power and lossy networks,” IEEE Internet Things J., vol. 6, no. 2, pp. 2593–2601, Apr. 2019.
[167] S. Mangelkar, S. N. Dhage, and A. V. Nimkar, “A comparative study on RPL attacks and security solutions,” in Proc. Int. Conf. Intell. Comput. Control (I2C2), Coimbatore, India, 2017, pp. 1–6.
[168] S. Kalyani and D. Vydeki, “Survey of rank attack detection algorithms in Internet of Things,” in Proc. Int. Conf. Adv. Comput. Commun. Informat. (ICACCI), Bangalore, India, 2018, pp. 2136–2141.
[169] A. Le, J. Loo, A. Lasebae, A. Vinel, Y. Chen, and M. Chai, “The impact of rank attack on network topology of routing protocol for low-power and lossy networks,” IEEE Sensors J., vol. 13, no. 10, pp. 3685–3692, Oct. 2013.
[170] K. K. Rai and K. Asawa, “Impact analysis of rank attack with spoofed IP on routing in 6LoWPAN network,” in Proc. 10th Int. Conf. Contemporary Comput. (IC3), Noida, India, 2017, pp. 1–5.
[171] A. Kamble, V. S. Malemath, and D. Patil, “Security attacks and secure routing protocols in RPL-based Internet of Things: Survey,” in Proc. Int. Conf. Emerg. Trends Innovat. ICT (ICEI), Pune, India, 2017, pp. 33–39.
[172] A. Mayzaud, R. Badonnel, and I. Chrisment, “A distributed monitoring strategy for detecting version number attacks in RPL-based networks,” IEEE Trans. Netw. Service Manag., vol. 14, no. 2, pp. 472–486, Jun. 2017.
[173] A. Aris, S. F. Oktug, and S. B. O. Yalcin, “RPL version number attacks: In-depth study,” in Proc. IEEE/IFIP Netw. Oper. Manage. Symp. (NOMS), Istanbul, Turkey, 2016, pp. 776–779.
[174] A. Ariş and S. F. Oktuǧ, “Analysis of the RPL version number attack with multiple attackers,” in Proc. Int. Conf. Cyber Situational Awareness Data Anal. Assessment (CyberSA), Dublin, Ireland, 2020, pp. 1–8.
[175] G. Guo, “A lightweight countermeasure to DIS attack in RPL routing protocol,” in Proc. IEEE 11th Annu. Comput. Commun. Workshop Conf. (CCWC), 2021, pp. 753–758.
[176] A. Verma and V. Ranga, “Addressing flooding attacks in IPv6-based low power and lossy networks,” in Proc. IEEE Region 10 Conf. (TENCON), Kochi, India, 2019, pp. 552–557.
[177] A. Raoof, C.-H. Lung, and A. Matrawy, “Introducing network coding to RPL: The chained secure mode (CSM),” in Proc. IEEE 19th Int. Symp. Netw. Comput. Appl. (NCA), Cambridge, MA, USA, 2020, pp. 1–4.
[178] P. Perazzo, C. Vallati, A. Arena, G. Anastasi, and G. Dini, “An implementation and evaluation of the security features of RPL,” in ADHOC-NOW’17. Cham, Switzerland: Springer, 2017, pp. 63–76.
[179] A. Sehgal, A. Mayzaud, R. Badonnel, I. Chrisment, and J. Schonwalder, “Addressing DODAG inconsistency attacks in RPL networks,” in Proc. Global Inf. Infrastruct. Netw. Symp. (GIIS), 2014, pp. 1–8.
[180] A. Mayzaud, A. Sehgal, R. Badonnel, I. Chrisment, and J. Schonwalder, “Mitigation of topological inconsistency attacks in RPL-based low power lossy networks,” Int. J. Netw. Manage., vol. 25, no. 5, pp. 320–339, Sep. 2015.

Abbas Jamalipour (Fellow, IEEE) received the Ph.D. degree in electrical engineering from Nagoya University, Nagoya, Japan, in 1996.
He is a Professor of Ubiquitous Mobile Networking with the University of Sydney, Sydney, NSW, Australia. He has authored nine technical books, 11 book chapters, over 550 technical papers, and five patents, all in the area of wireless communications.
Prof. Jamalipour was the recipient of a number of prestigious awards, such as the 2019 IEEE ComSoc Distinguished Technical Achievement Award in Green Communications, the 2016 IEEE ComSoc Distinguished Technical Achievement Award in Communications Switching and Routing, the 2010 IEEE ComSoc Harold Sobol Award, the 2006 IEEE ComSoc Best Tutorial Paper Award, as well as 15 best paper awards. He is the President of the IEEE Vehicular Technology Society. He held the positions of the Executive Vice-President and the Editor-in-Chief of VTS Mobile World and has been an Elected Member of the Board of Governors of the IEEE Vehicular Technology Society since 2014. He was the Editor-in-Chief of IEEE Wireless Communications, the Vice President-Conferences, and a member of the Board of Governors of the IEEE Communications Society. He sits on the Editorial Board of IEEE Access and is an Editor of IEEE Transactions on Vehicular Technology and several other journals. He has been the General Chair or the Technical Program Chair for a number of conferences, including IEEE ICC, GLOBECOM, WCNC, and PIMRC. He is a Fellow of the Institute of Electrical, Information, and Communication Engineers and the Institution of Engineers Australia, an ACM Professional Member, and an IEEE Distinguished Speaker.

Sarumathi Murali (Member, IEEE) received the B.E. degree in electronics and communication engineering and the M.E. degree (Hons. and Gold Medal) in communication systems from Anna University, Chennai, India, in 2010 and 2012, respectively, and the Ph.D. degree in 2021.
She is a Post Graduate Research Scholar with the Wireless Networking Group under the supervision of Prof. A. Jamalipour with the School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW, Australia. She is currently working on mobility enhancement for low power and lossy networks and security attacks in RPL under mobility. She has published more than 20 scholarly journal articles and 40 technical papers in national and international conferences. Her research interests include routing under low power and lossy networks, Internet of Things routing protocol modeling, security and privacy issues in IoT, signal processing, and mobile ad hoc networks.
