
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2020.3047642, IEEE Internet of Things Journal
IEEE INTERNET OF THINGS JOURNAL, VOL. 1, NO. 1, AUGUST 2020

A detection framework against CPMA attack based on trust evaluation and machine learning in IoT network
Liang Liu, Member, IEEE, Xiangyu Xu, Yulei Liu, Member, IEEE, Zuchao Ma, Jianfei Peng

Abstract: IoT networks are vulnerable to various cyberattacks, especially insider attacks. Most existing studies mainly detect non-targeted insider attackers, who manipulate all packets forwarded by them with a probability. Compared with non-targeted attackers, targeted attackers only manipulate specific packets, which makes them more efficient and covert. In this paper, we propose a targeted insider attack model called conditional packets manipulation attack (CPMA), in which attackers maliciously manipulate, with a probability, the packets whose attribute values meet specific conditions. Most existing detection algorithms are inefficient at finding such malicious behavior. Also, they detect malicious nodes by collecting and analyzing the overall behavior of nodes, which is not appropriate for energy-constrained nodes in IoT networks. To solve these problems, we present CPMAED, a malicious node detection framework against CPMA attack. CPMAED maintains partial trust metrics for each relay node, which indicate the probability that the node launches attacks when forwarding packets with different attribute values. Also, our scheme leverages regression and clustering algorithms to evaluate the trust values of nodes and classify them as benign or malicious. To obtain higher detection accuracy, we optimize the routing of transmitted packets and inject the packets again to collect more information about nodes. The experimental results show that our proposed scheme utilizing SVM and K-means can achieve good detection performance and identify malicious nodes' attack modes with high accuracy.

Index Terms: IoT network, Insider attacks, Trust evaluation, Machine learning.

Liang Liu, Xiangyu Xu, Yulei Liu, Zuchao Ma, Jianfei Peng are with the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, No. 29 Yudao Street, Nanjing, China.
Copyright (c) 20xx IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

I. INTRODUCTION

Internet of Things (IoT) is becoming more and more indispensable in our lives. Through IoT, we can collect raw environmental information such as sound, light, and electricity, and transform it into digital signals for further processing [1]. In recent years, IoT has been widely used in industry, agriculture, medical care, logistics, transportation, smart cities [2] and other fields.
With the increasing demand and growth in IoT networks, the security problems in IoT networks are also becoming more and more serious. Due to the heterogeneity of the devices involved and the characteristic of multihop routing, IoT networks are vulnerable to a variety of cyber attacks, especially internal attacks. Internal attackers can capture or compromise sensor devices in IoT networks to steal, modify [3, 4], or discard data, or to consume network bandwidth. Such attacks can seriously interfere with routing establishment and data transmission, resulting in the failure of data fusion and affecting the normal function of networks.
Motivation: Compared with non-targeted attackers who manipulate all packets forwarded by them with a probability [5, 6], targeted attackers only manipulate specific packets, which makes them more efficient and covert. Therefore, in this paper, we propose a targeted insider attack model called conditional packets manipulation attack (CPMA), in which attackers maliciously manipulate, with a probability, the packets whose attribute values meet specific conditions.
As shown in Fig. 1, the sensor nodes $S_1$, $S_2$ and $S_3$ are deployed in three different areas of a forest. They monitor the temperature of the surrounding environment and send sensing data to the sink through multihop routing. Assume that node $R_f$ is malicious, and that it only tampers with the packets sent by node $S_3$, with a probability of 0.5. If a fire breaks out in the above three areas, the sensing data of $S_1$ and $S_2$ will be sent to the sink safely and trigger the fire alarm. However, $S_3$'s sensing data $packet_1$ will be maliciously manipulated by node $R_f$, so that the position of the fire cannot be located in time or the alarm system even fails.

Fig. 1: An example of CPMA attack in IoT networks

Also, most existing studies generally rely on the overall reputation of a node to determine whether the node is malicious [7, 8], and ignore the partial trust values when forwarding packets with different attribute values. For example, nodes $S_1$, $S_2$ and $S_3$ each forward 20 packets to the sink in a time window.

2327-4662 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: VIT University. Downloaded on July 24,2021 at 02:22:35 UTC from IEEE Xplore. Restrictions apply.

Because node $R_f$ only attacks the packets sent by node $S_3$, the overall reputation of node $R_f$ is $(20 + 20 + 20 \times 0.5) \div (20 + 20 + 20) = 5/6$, which is relatively high. Therefore, node $R_f$ is very likely to be identified as benign. In fact, when forwarding the packets whose source node is $S_3$, the partial trust value of node $R_f$ is $(20 \times 0.5) \div 20 = 1/2$, which is relatively low.
As the above example shows, most existing detection approaches are not efficient at detecting such malicious nodes, which are highly concealed and destructive. Therefore, it is necessary and challenging to design an effective detection mechanism against CPMA attack in IoT networks.
Contributions: Our key contributions are summarized as follows.
1) We propose an advanced insider attack model called CPMA attack, in which the attackers maliciously manipulate, with a probability, the packets whose attribute values meet specific conditions;
2) We present CPMAED, a malicious node detection framework against CPMA attack in IoT networks. In CPMAED, each relay node has at least one partial trust value, which indicates the probability that it maliciously manipulates the packets with a specific attribute value;
3) We leverage regression and clustering algorithms to calculate nodes' trust values and classify them as benign or malicious. To obtain higher detection accuracy, we optimize the routing of transmitted packets and inject the packets again to collect more information about nodes;
4) The experimental results show that the detection scheme we designed can achieve good detection performance and identify malicious nodes' attack modes with high accuracy.
Organization: The remainder of this paper is organized as follows. Section II introduces the existing work on malicious node detection. Section III presents our proposed system model, including the network model, packet model, attack model, node model and path model. Section IV details the detection scheme we designed. Section V presents the experimental environment and experimental results. Finally, Section VI concludes our work.

II. RELATED WORK

To detect malicious nodes, various detection methods based on trust evaluation or machine learning techniques have been proposed.
Trust-based detection: Trust evaluation or reputation evaluation can be used to improve network security [6] [9]. Nodes with higher trust values are more likely to be benign, whereas nodes with lower trust values are more likely to be malicious [10]. Xia Li et al. [11] proposed that the direct trust values could be combined with recommendation trust values from other nodes. In [12], Romman et al. proposed to use the neighbour weight trust determination algorithm (NWTD) to detect malicious nodes in MANETs. In NWTD, the trust of a node was evaluated by its one-hop neighbours. Rikli et al. [13] proposed a lightweight trust-based security mechanism to improve IoT network security, in which each node monitored its one-hop neighbours' behavior and periodically reported to a base station, which utilized this information to evaluate the trust values of nodes. A trust management scheme was designed in [14], which calculated the node's direct trust value by a Bayesian method and periodically updated it based on the combination of effective history records and an adaptive decay factor. Indirect trust was used only if the current trust value was below a threshold. Also, in their scheme, the weights of different trust values could be calculated based on entropy theory.
Machine learning-based detection: Recently, the rapid growth of machine learning has also provided a new perspective for cybersecurity. Kaplantzis et al. [15] used Support Vector Machines to deal with security threats in WSN for the first time and achieved high accuracy. However, this method could not identify which nodes are malicious. Tie Luo et al. [1] were the first to introduce autoencoder neural networks into WSN to solve the anomaly detection problem. Because of deep learning's formidable hunger for computational resources, the method built an autoencoder neural network that consists of three layers. However, the complexity of this method is too high to be suitable for large-scale sensor networks. A kNN-based anomaly detection scheme was introduced in [16]. By redefining the anomaly detection region and converting the hypergrid structure, the computational complexity could be reduced and detection efficiency could be improved. Xin Liu et al. [17] designed a malicious node identification approach using network diversity and a clustering algorithm, which motivates our work. In their work, a contribution metric was formulated for each node in the network based on its behaviour. However, it assumed that the contributions of different nodes in the same path to the path's reputation are the same, which is unrealistic.
Most of the above studies mainly focus on non-targeted attacks rather than destructive and covert targeted attacks. To meet this challenge, we consider a targeted insider attack model named CPMA attack and propose an efficient detection framework (CPMAED) against CPMA attack in IoT networks.

III. SYSTEM MODEL

This section describes the system model, which includes the network model, packet model, attack model, node model and path model. Later, we will use these models to detail our proposed scheme. All symbols used in this paper are presented in TABLE I.

A. Network model
According to their roles, all nodes are divided into three sets: source node set $S$, the sink, and relay node set $R$. The source node set $S$ can be denoted as $S = \{S_1, S_2, S_3, \cdots\}$. Their role is to send probe packets over several routing paths to the sink to assist in identifying malicious nodes [17]. The sink is responsible for collecting packets and evaluating the reputation metric for each routing path. The relay node set $R$ can be denoted as $R = \{R_1, R_2, R_3, \cdots\}$. Their role is to route packets from the


source node to the sink. There may be some malicious nodes in $R$, and our purpose is to detect them and identify their attack modes.

TABLE I: The symbols used in this paper
Symbol          Description
$S$             the set of source nodes
$R$             the set of relay nodes
$packet_m$      the m-th packet injected into the network
$R_i$           the i-th relay node
$R_i.T$         the trust value of node $R_i$
$Path_j$        the j-th routing path
$P_j.T$         the trust value of routing path $Path_j$
$\theta$        the set of attribute values
$\alpha$        the set of attack modes
$\omega$        the number of attack modes
$f_t$           the condition function to launch attack $\alpha_t$
$\alpha_t$      the t-th attack mode
$\phi_t$        the t-th packet group
$\xi_t$         the t-th detection domain
$LTG_{\xi_t}$   the low trust value group in $\xi_t$
$MTG_{\xi_t}$   the medium trust value group in $\xi_t$
$HTG_{\xi_t}$   the high trust value group in $\xi_t$
$BG_{\xi_t}$    the benign group in $\xi_t$
$MG_{\xi_t}$    the malicious group in $\xi_t$
$FBG$           the final benign group
$FMG$           the final malicious group

B. Packet model
Let $packet_m$ be the m-th packet injected into the network. It can be denoted as a tuple:
$$packet_m = \{\theta_{1m}, \theta_{2m}, \theta_{3m}, \cdots, \theta_{km}, flag, pass\} \tag{1}$$
where $\theta_{km}$ represents the attribute value of the k-th field of $packet_m$ and $\theta_{km} \in$ attribute value set $\theta$. When $packet_m$ arrives at the sink, the sink verifies whether $packet_m$ has been compromised by an attacker and updates the packet's flag. That is
$$flag = \begin{cases} 1 & \text{if } packet_m \text{ is not compromised} \\ 0 & \text{otherwise} \end{cases}$$
And pass is the sequence of relay nodes that forward $packet_m$ [18].
For example, in Fig. 1, every packet can be represented as a 5-tuple:
{Source Node, Length, Data Type, flag, pass}.
Then, $S_3$'s sensing data $packet_1$ can be denoted as
$\{S_3, 16\text{ bits}, temperature, 0, (R_3, R_6, R_9, R_b, R_d, R_e, R_f)\}$.

C. Attack model
In this paper, we propose a targeted insider attack model named conditional packets manipulation attack (CPMA), in which the attackers maliciously manipulate, with a probability, only the packets whose attribute values meet specific conditions. Depending on the different targets of malicious nodes, we divide CPMA attack into multiple attack modes. $\omega$ is the number of attack modes, and $\alpha$ is the set of attack modes, which is denoted as $\{\alpha_1, \alpha_2, \cdots, \alpha_\omega\}$.
Such advanced attackers follow IoT protocols to hide as much as possible for most of their lifetime, and launch attacks only when they encounter packets that meet specific conditions. Furthermore, the condition function $f_t$ for launching attack $\alpha_t$, $t \in [1, \omega]$, can be defined as:
$$f_t : packet_m \rightarrow boolean \tag{2}$$
If $f_t(packet_m)$ is true, it indicates that $packet_m$'s attribute values meet the conditions for launching attack $\alpha_t$.
As shown in Fig. 1, attack mode $\alpha_1$ means that an attacker maliciously manipulates the packets whose Source Node is $S_3$, and its condition function is $f_1 : packet_m \rightarrow boolean$, where
$$f_1(packet_m) = \begin{cases} true & \text{if } packet_m\text{'s Source Node is node } S_3 \\ false & \text{if } packet_m\text{'s Source Node is not node } S_3 \end{cases}$$
Malicious node $R_f$ with attack mode $\alpha_1$ can use condition function $f_1$ to determine whether the packets that it forwards are its targets. If $f_1(packet_1) = true$, $R_f$ will launch an attack on $packet_1$ with a probability.

D. Node model
The i-th relay node in the relay node set $R$ can be represented as
$$R_i = \{(\alpha_{i1}, p_{i1}), (\alpha_{i2}, p_{i2}), \cdots, (\alpha_{ik}, p_{ik})\}, \tag{3}$$
where $\alpha_{ik} \in$ attack mode set $\alpha$. $\alpha_{ik}$ is the k-th attack mode of node $R_i$ and $p_{ik}$ is the probability of node $R_i$ launching attack $\alpha_{ik}$. Define $\beta_i$ as a flag that indicates whether node $R_i$ is malicious. If node $R_i$ is benign, then $\beta_i = 1$; otherwise $\beta_i = 0$. That is
$$\beta_i = \begin{cases} 1 & \text{if } R_i \text{ is benign} \\ 0 & \text{otherwise} \end{cases}$$
When forwarding $packet_m$, whether node $R_i$ will launch an attack depends on $\beta_i$ and on the condition function $f_{ik}$ to launch attack $\alpha_{ik}$ (refer to (2)). If $\beta_i = 1$, node $R_i$ will not launch any attacks; if $\beta_i = 0$ and $f_{ik}(packet_m) = false$, node $R_i$ will not launch an attack on $packet_m$; if $\beta_i = 0$ and $f_{ik}(packet_m) = true$, node $R_i$ will launch an attack on $packet_m$ with probability $p_{ik}$.
Moreover, when forwarding the packets with different attribute values, the reliability of a node can be measured by its partial trust values, which can be defined as below [17]:
$$R_{ik}.T = 1 - p_{ik}. \tag{4}$$

E. Path model
Let $Path_j$ be the j-th path that connects a source node to the sink. It can be denoted as:
$$Path_j = \{a_{1j}, a_{2j}, a_{3j}, \cdots, a_{nj}\} \tag{5}$$
where $n$ is the number of relay nodes and $a_{ij}$ indicates whether node $R_i$ is in $Path_j$. That is
$$a_{ij} = \begin{cases} 1 & \text{if } R_i \text{ is in } Path_j \\ 0 & \text{if } R_i \text{ is not in } Path_j \end{cases}$$
For example, in Fig. 2, the path "$R_2 - R_4 - R_5$" can be represented as $Path_0 = \{0, 1, 0, 1, 1\}$, which indicates that nodes $R_2$, $R_4$ and $R_5$ are in $Path_0$, while nodes $R_1$ and $R_3$ are not.
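The packet, attack, node and path models of this section can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the names `Packet`, `RelayNode`, `f1` and `path_vector` are hypothetical, the field `passed` stands in for pass (a reserved word in Python), and the relay sets flag directly as a simulation shortcut, whereas in the model the sink updates flag after checking the packet.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Packet:
    # Attribute fields of the 5-tuple used in the Fig. 1 example.
    source_node: str
    length_bits: int
    data_type: str
    flag: int = 1                 # 1 = not compromised, 0 = compromised
    passed: Tuple[str, ...] = ()  # relay nodes that forwarded the packet


# A condition function f_t maps a packet to a boolean, as in (2).
Condition = Callable[[Packet], bool]


def f1(packet: Packet) -> bool:
    """Condition for attack mode alpha_1: the Source Node is S3."""
    return packet.source_node == "S3"


@dataclass
class RelayNode:
    name: str
    # List of (f_ik, p_ik) pairs as in (3); an empty list models a benign node.
    modes: List[Tuple[Condition, float]]

    def forward(self, packet: Packet, rng: random.Random) -> Packet:
        # Attack only if a condition holds, and then only with probability p_ik.
        for condition, prob in self.modes:
            if condition(packet) and rng.random() < prob:
                packet.flag = 0  # assumption: tampering is recorded immediately
        packet.passed = packet.passed + (self.name,)
        return packet


def path_vector(path_nodes: List[str], relay_names: List[str]) -> List[int]:
    """Build Path_j = {a_1j, ..., a_nj} as in (5)."""
    return [1 if name in path_nodes else 0 for name in relay_names]


relays = ["R1", "R2", "R3", "R4", "R5"]
print(path_vector(["R2", "R4", "R5"], relays))  # [0, 1, 0, 1, 1]
```

With `RelayNode("Rf", [(f1, 0.5)])`, packets from $S_1$ and $S_2$ always pass untouched, while each packet from $S_3$ is tampered with probability 0.5, which mirrors the scenario of Fig. 1.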

Fig. 2: The Send-set and Receive-set

To evaluate a path's reliability, we introduce Send-set and Receive-set. $Send\text{-}set_j$ refers to the set of packets injected by the source node of $Path_j$, and $Receive\text{-}set_j$ refers to the set of packets arriving at the sink through $Path_j$. For example, in Fig. 2, $Send\text{-}set_0 = \{packet_1, packet_2, packet_3\}$, and $Receive\text{-}set_0 = \{packet_1, packet_2', packet_3\}$.
The more packets successfully transmitted by $Path_j$, the higher the trust value of $Path_j$. In this paper, $P_j.T$ denotes the trust value of $Path_j$ and it can be defined as:
$$P_j.T = \frac{\text{the number of packets in } Receive\text{-}set_j \text{ whose } flag = 1}{\text{the number of packets in } Send\text{-}set_j} \tag{6}$$
Also, the reputation of a path depends on the contribution of each node in the path. Inspired by [17], the reputation of $Path_j$ can also be expressed as
$$P_j.T = \prod_{i=1,\; a_{ij}=1}^{n} (1 - p_i) \tag{7}$$
where the product runs over the relay nodes with $a_{ij} = 1$. If $P_j.T \neq 1$, it indicates that there is at least one malicious node in $Path_j$.
As shown in Fig. 2, the reputation of $Path_0$ can be calculated as below:
$$P_0.T = (1 - p_2) \times (1 - p_4) \times (1 - p_5) = R_2.T \times R_4.T \times R_5.T.$$

IV. PROPOSED METHOD

In this section, we first describe the workflow of our proposed detection framework against CPMA attack, as shown in Fig. 3. Without loss of generality, we take the tamper attack, which is a typical internal attack, as an example to show how our proposed detection framework works.

Fig. 3: The workflow of our proposed scheme

A. Overview
The reputation of all nodes can be evaluated based on their behavior. To collect nodes' behavior information, inspired by [17], we inject probe packets into the network by some trusted source nodes and collect statistical information by the sink.
1) Detection domain formation: Most trust-based detection strategies utilize nodes' overall reputation to determine whether they are malicious, which is not efficient when dealing with CPMA attack.
To tackle this problem, we use condition function $f_t$ to filter packets so that all the packets that meet "$f_t(packet_m) = true$" form the packet group $\phi_t$. When all packets in $\phi_t$ are forwarded from source nodes to the sink, their routing paths form a detection domain $\xi_t$. In $\xi_t$, we can formulate a partial trust metric for each node to evaluate its attack probability when forwarding the packets in $\phi_t$. In this way, nodes in multiple detection domains have corresponding partial trust values that represent their probability of launching attacks on the packets in different packet groups.
2) Trust model construction: Considering energy consumption and computing load, typical methods that collect and analyze the statistical communication data [19, 20] are not appropriate for IoT networks. In this paper, by injecting probe packets into the network and collecting them at the sink, the reputation of a path can be calculated based on the obtained analysis results [21, 22]. Moreover, the reputation of a path is the result of how much each node along this path contributes to "not modifying the packets". Therefore, in each detection domain, we can construct the corresponding node-trust model between the reputation of each routing path and the reputation of all nodes along the path [23].
3) Trust value calculation: It is difficult to directly calculate each node's trust value using mathematical methods because of the uncertainty of attacks and the diversity of the network. To tackle the problem, we transform the calculation of nodes' trust values into a multivariable linear regression problem. Considering that the regression method is a powerful tool to solve linear regression problems, we train the optimal regression model by inputting corresponding routing paths'


reputation. When the training is finished, we can obtain all nodes' trust values [24].
4) Malicious node detection: In some detection strategies, clustering algorithms are used to detect malicious nodes, and they cluster the nodes directly into two groups, such as a benign group and a malicious group. However, a node's trust value can be impacted by the behaviour of other nodes along its associated multihop paths, which can degrade the performance of detection schemes. To improve detection accuracy, we cluster the nodes into three groups, which are the low trust value group (LTG), medium trust value group (MTG) and high trust value group (HTG). To determine whether nodes in the MTG are benign or malicious, we optimize the routing of transmitted packets and inject the packets into the network again to collect more information about them, which can enhance the learning of the regression model. Then we apply the clustering algorithm again to the obtained trust values of nodes to classify the nodes as benign or malicious.
5) Detection result aggregation: After the detection is completed in each detection domain, we can obtain the final benign group (FBG) and final malicious group (FMG) by aggregating the detection results of all detection domains.
Also, if the reputation of node $R_i$ is relatively low in the detection domain $\xi_t$, it indicates that node $R_i$ launches attacks on the packets in the packet group $\phi_t$ with a high probability. This kind of attack on the packets in the packet group $\phi_t$ is defined as attack mode $\alpha_t$, so $\alpha_t$ is one of the attack modes of node $R_i$. If node $R_i$ exists in other detection domains, we can identify other attack modes of $R_i$ in this way.

B. Conditional packet manipulation attack detection
As shown in Fig. 1, there are fifteen relay nodes and multiple possible routing paths between three source nodes and the sink. First of all, we inject probe packets by three trusted source nodes and collect statistical information by the sink in the same way as [17].
1) Detection domain formation: To detect malicious nodes that launch attack $\alpha_1$, we use condition function $f_1$ to filter packets so that all the packets whose Source Node is $S_3$ form the packet group $\phi_1$. When the packets in the group $\phi_1$ are forwarded from node $S_3$ to the sink, their routing paths form a domain which is shown by the black arrows in Fig. 1. We define this domain as the detection domain of attack mode $\alpha_1$ and use $\xi_1$ to represent it. Also, $Path_1$ "$R_3 - R_6 - R_9 - R_b - R_d - R_e - R_f$", which can be denoted as $\{0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1\}$, exists in the detection domain $\xi_1$ and its Send-set is $\{packet_1, packet_2, packet_3\}$.
2) Trust model construction: After the probe packets transmitted along $Path_1$ reach the sink, they form $Receive\text{-}set_1$, which can be denoted as $\{packet_1, packet_2', packet_3\}$. Here we assume that $packet_2'$ has been maliciously tampered with by an attacker. Then the sink can check the integrity of all packets in $Receive\text{-}set_1$ by a keyed hash function [25, 26] and update $packet_2$'s flag to 0. Based on the obtained statistical information, the sink can evaluate the reliability of $Path_1$ and the result can be expressed as $P_1.T$ (refer to (6)). That is
$$P_1.T = \frac{|\{packet_1, packet_3\}|}{|\{packet_1, packet_2, packet_3\}|} = \frac{2}{3} \tag{8}$$
In addition, the reputation of a path is also the contribution of all nodes along this path. By referring to equation (7), the reputation of $Path_1$ can also be formalized as:
$$\begin{aligned} P_1.T &= (1 - p_3) \times (1 - p_6) \times (1 - p_9) \times (1 - p_b) \times (1 - p_d) \times (1 - p_e) \times (1 - p_f) \\ &= R_{31}.T \times R_{61}.T \times R_{91}.T \times R_{b1}.T \times R_{d1}.T \times R_{e1}.T \times R_{f1}.T \end{aligned} \tag{9}$$
Taking the logarithm of both sides of equation (9) gives:
$$\ln(P_1.T) = \ln(R_{31}.T) + \ln(R_{61}.T) + \ln(R_{91}.T) + \ln(R_{b1}.T) + \ln(R_{d1}.T) + \ln(R_{e1}.T) + \ln(R_{f1}.T) \tag{10}$$
To generalize, the relationship between $Path_1$'s reputation and all relay nodes' reputation along $Path_1$ can be expressed as below:
$$\ln(P_1.T) = \ln(R_{11}.T) \times a_{11} + \ln(R_{21}.T) \times a_{21} + \ln(R_{31}.T) \times a_{31} + \cdots + \ln(R_{n1}.T) \times a_{n1} \tag{11}$$
where $a_{i1}$ indicates whether node $R_i$ is in $Path_1$ (refer to (5)).
Based on equation (11), the corresponding node-trust model in the detection domain $\xi_1$ can be formalized as below:
$$\begin{cases} \ln(P_1.T) = \ln(R_{11}.T) \times a_{11} + \ln(R_{21}.T) \times a_{21} + \cdots + \ln(R_{n1}.T) \times a_{n1} \\ \ln(P_2.T) = \ln(R_{11}.T) \times a_{12} + \ln(R_{21}.T) \times a_{22} + \cdots + \ln(R_{n1}.T) \times a_{n2} \\ \cdots \\ \ln(P_\sigma.T) = \ln(R_{11}.T) \times a_{1\sigma} + \ln(R_{21}.T) \times a_{2\sigma} + \cdots + \ln(R_{n1}.T) \times a_{n\sigma} \end{cases} \tag{12}$$
where $\sigma$ is the number of available routing paths in the detection domain $\xi_1$.
3) Trust value calculation: Then we can use three matrices to represent the above equation (12):
$$T \times X = Y \tag{13}$$
In equation (13), $T$ is the matrix of nodes' reputation, which is what we want to calculate, and
$$T = [\ln(R_{11}.T), \ln(R_{21}.T), \ln(R_{31}.T), \cdots, \ln(R_{n1}.T)] \tag{14}$$
$X$ is the matrix of nodes' existence, and
$$X = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1\sigma} \\ a_{21} & a_{22} & \cdots & a_{2\sigma} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1} & a_{n2} & \cdots & a_{n\sigma} \end{bmatrix} \tag{15}$$
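To make the relation in (13) concrete, the following Python sketch recovers node trust values from path reputations by ordinary least squares on the log-linear system (12). It is a hedged illustration only: plain least squares via the normal equations is a stand-in chosen here, not the regression model evaluated in the paper, the function names `solve_linear` and `estimate_node_trust` and the three-node example are hypothetical, and each path is stored as one row, i.e. the transpose of $X$ in (15). Every path reputation must be strictly positive, since its logarithm is taken.

```python
import math
from typing import List


def solve_linear(M: List[List[float]], b: List[float]) -> List[float]:
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("the paths do not determine every node's trust")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def estimate_node_trust(paths: List[List[int]],
                        path_trust: List[float]) -> List[float]:
    """paths[j][i] = a_ij (one row per path); path_trust[j] = P_j.T > 0."""
    n = len(paths[0])
    y = [math.log(p) for p in path_trust]          # ln(P_j.T)
    # Normal equations (A^T A) t = A^T y, where t_i = ln(R_i.T).
    ata = [[sum(row[i] * row[k] for row in paths) for k in range(n)]
           for i in range(n)]
    aty = [sum(row[i] * y[j] for j, row in enumerate(paths))
           for i in range(n)]
    t = solve_linear(ata, aty)
    return [math.exp(v) for v in t]                # back to R_i.T


# Hypothetical domain with three relay nodes; R2 tampers with probability 0.5.
paths = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
path_trust = [0.5, 0.5, 1.0, 0.5]
trust = estimate_node_trust(paths, path_trust)
```

In this noise-free example the estimate matches the assumed trust values (1, 0.5, 1) up to rounding. With measured path reputations the system is generally inconsistent, and the least-squares solution is only an approximation.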


$Y$ is the matrix of routing paths' reputation and
$$Y = [\ln(P_1.T), \ln(P_2.T), \ln(P_3.T), \cdots, \ln(P_\sigma.T)] \tag{16}$$
When probe packets reach the sink, the sink can use every packet's pass, which stores the packet's routing information, to construct $X$ [18]. Also, by checking the integrity of collected packets, the sink can calculate every routing path's reputation and construct $Y$.
According to equation (13), the calculation of matrix $T$ can be considered as a multivariable linear regression problem. Therefore, we introduce the regression algorithm to evaluate the reputation of all nodes. Inspired by [27], we take matrices $Y$ and $X$ as the dependent variable and independent variable respectively, and feed them as inputs to train the regression model. In statistics, a regression model describes the relationship between a dependent variable and multiple independent variables. When the training is finished, the matrix of nodes' reputation $T$ can be obtained as the regression coefficients.
4) Malicious nodes detection: Based on the obtained reputation of all nodes in the detection domain $\xi_1$, we detect malicious nodes using a clustering algorithm. To improve detection accuracy, we cluster the nodes in $\xi_1$ into three groups instead of two, namely the low trust value group ($LTG_{\xi_1}$), medium trust value group ($MTG_{\xi_1}$) and high trust value group ($HTG_{\xi_1}$). Considering that a node's trust value can be impacted by the behavior of other nodes along its associated multihop paths, we optimize the routing paths to collect more information about node behavior. The optimization of routing paths follows three principles:
1) each path contains as few nodes in $LTG_{\xi_1}$ as possible;
2) each path contains nodes in $MTG_{\xi_1}$, but contains as few nodes in $MTG_{\xi_1}$ as possible;
3) each path contains as few nodes as possible.
Using the set of optimized routing paths, we inject the packets into the network again to collect more evidence about the nodes at the sink. The additional information obtained can be used to retrain the regression model to output more accurate trust values. Then the clustering method can be applied again to classify the nodes into two groups, namely the benign group ($BG_{\xi_1}$) and the malicious group ($MG_{\xi_1}$).
5) Detection results aggregation: After the detection is completed in each detection domain, we can obtain the final detection result by aggregating the detection results of all detection domains. For example, in Fig. 1, the final benign group can be represented as $FBG = BG_{\xi_1} \cup BG_{\xi_2} \cup \cdots \cup BG_{\xi_\omega}$, and the final malicious group can be represented as $FMG = MG_{\xi_1} \cup MG_{\xi_2} \cup \cdots \cup MG_{\xi_\omega}$, where $\omega$ is the number of detection domains.
In addition, our approach can obtain the attack modes of each node in FMG. In Fig. 1, if node $R_f$ is assigned to $MG_{\xi_1}$, it indicates that the trust value of node $R_f$ is relatively low in the detection domain $\xi_1$ and it is very likely to launch attacks when forwarding the packets in the group $\phi_1$. According to condition function $f_1$, all packets whose Source Node is $S_3$ form the packet group $\phi_1$, and this kind of attack on the packets in the group $\phi_1$ is defined as $\alpha_1$. Therefore, $\alpha_1$ is one of the attack modes of node $R_f$. If node $R_f$ exists in other detection domains besides $\xi_1$, we can use the above method to obtain node $R_f$'s other attack modes.

V. SIMULATION AND ANALYSIS

In this section, we evaluate the detection performance of our proposed CPMAED in two aspects: (1) the combination of regression and clustering algorithms; (2) the influence of experimental parameters. To this end, we first select the combination of regression and clustering algorithms with the best detection performance. Also, we change some key experimental parameters one at a time to analyze the detection performance of CPMAED comprehensively. Meanwhile, we compare our proposed scheme with Hard Detection (HD) [17] and Perception Detection with enhancement (PDE) [24], both of which can detect tamper attacks in IoT networks.

A. Environment setting
In our simulation environment, all IoT nodes are deployed discretely in a 100 × 100 m² rectangular area, as shown in Fig. 1. Each node's communication range is 15 m. We use scikit-learn [28] to implement the regression and clustering algorithms.
To ensure the reliability of the experimental results, we run our simulation for each experiment in ten rounds with ten different networks generated randomly. The average value of the ten rounds' results is taken as the final result of each experiment. Unless otherwise specified, all experimental parameters remain at their defaults, which are set as follows:
1) The utilization of all routing paths in the network is 100%;
2) 30% of nodes in the relay node set $R$ are malicious;
3) The attack modes of each malicious node are randomly selected from attack mode set $\alpha$ and the probability of each attack is 30%;
4) The number of probe packets injected into the network is 2000;
5) The relay node set $R$ contains 15 elements that are deployed between source nodes and the sink;
6) The source node set $S$ contains 2 source nodes.

B. Evaluation metrics
We mainly measure the detection performance of our proposed scheme in terms of detection accuracy of malicious nodes, detection false alarm rate of malicious nodes [29], running time and detection accuracy of attack modes. Based on TABLE II, four measures are defined as follows:

TABLE II: Confusion matrix
                           Predicted result
                           Negative               Positive
Actual result   Negative   True Positive (TP)     False Negative (FN)
                Positive   False Positive (FP)    True Negative (TN)

1) detection accuracy of malicious nodes:
$$A_m = (TN + TP)/n,$$


where n is the number of relay nodes, TN is the number of benign nodes correctly classified as benign and TP is the number of malicious nodes correctly classified as malicious;
2) detection false alarm rate of malicious nodes:
$$F_m = FP/(FP + TN),$$
where FP is the number of benign nodes incorrectly classified as malicious and TN is the number of benign nodes correctly classified as benign;
3) running time t refers to the period from the beginning to the end of the detection of malicious nodes;
4) detection accuracy of attack modes:
$$A_\alpha = \Big( \sum_{t=1}^{\omega} (TP_t + TN_t)/n \Big) / \omega,$$
where ω is the number of attack modes, TP_t is the number of nodes whose actual attack mode is αt and whose attack mode predicted by our scheme is also αt, and TN_t is the number of nodes whose actual attack mode is not αt and whose predicted attack mode is not αt.

Fig. 5: The influence of the percentage of malicious nodes on detection performance. Panels: (a) on Am; (b) on Fm; (c) on t; (d) on Aα.

C. Experimental results
1) The influence of regression and clustering algorithms: First of all, to evaluate the influence of different machine learning algorithms on the detection performance, we choose a variety of typical machine learning algorithms [30, 31], including five regression algorithms: Support Vector Machine (SVM), Gradient Descent Method (GD), Least Square Method (LSM), Perceptron (P) and Ridge Regression (RR), and three clustering algorithms: K-means, Gaussian mixed clustering (GMM) and AGNES hierarchical clustering (AGNES). Our experimental results are shown in Fig. 4.
It is observed that SVM achieves better detection performance than the other regression algorithms. We then combine SVM with the three clustering algorithms to evaluate their detection performance. In Fig. 4, we can see that the combination of SVM and K-means has the best detection performance. Therefore, we choose SVM and K-means for the subsequent detection performance evaluation of CPMAED.

Fig. 4: The selection of machine learning algorithms. Panels: (a) Regression methods; (b) Clustering methods.

2) The influence of percentage of malicious nodes: To evaluate the impact of the percentage of malicious nodes on the detection performance, the percentage of malicious nodes in the relay node set R is set to 0.1, 0.2, 0.3, 0.4 and 0.5 respectively. The results are shown in Fig. 5.
According to the obtained results, as the percentage of malicious nodes increases, Am of HD gradually increases and Am of PDE gradually decreases, while Am of CPMAED always remains high. For PDE, as the percentage of malicious nodes increases, the reputation of a node is more likely to be affected by the behaviour of malicious nodes along its associated multihop paths, which degrades Am of PDE. By contrast, because CPMAED aggregates the detection results of multiple detection domains, its Am always remains high.
In addition, as the percentage of malicious nodes increases, Fm of HD, PDE and CPMAED gradually decreases. Among them, Fm of HD is the highest.
For every proportion of malicious nodes, CPMAED is significantly better than HD and PDE in terms of Am and Fm, and Aα of our proposed scheme always remains above 90%.
3) The influence of probability of attack: To evaluate the impact of the probability of attack on the detection performance, the attack probability of malicious nodes is set to 0.1, 0.3, 0.5, 0.7 and 0.9 respectively. As the probability of attack increases, malicious nodes become more and more active. The results are shown in Fig. 6.
Our results show that when the attack probability of malicious nodes is small, HD and PDE cannot achieve excellent detection performance. This is because a lower attack probability indicates that malicious nodes adopt a covert attack strategy to avoid being detected, which degrades the detection performance of HD and PDE. However, CPMAED can detect such malicious behavior because it adopts the partial trust values of nodes instead of the overall trust values.
Moreover, our proposed CPMAED has the highest Am and the lowest Fm by optimizing the routing paths and aggregating the detection results of multiple detection domains. After detecting malicious nodes, CPMAED can identify the attack modes of the malicious nodes with high


accuracy, even up to 0.95.

Fig. 6: The influence of the probability of attack on detection performance. Panels: (a) on Am; (b) on Fm; (c) on t; (d) on Aα.

4) The influence of count of injected packets: To evaluate the influence of the count of injected packets on the detection performance, the count of injected packets in a time window is set to 500, 1000, 1500, 2000 and 2500 respectively. As the number of injected packets increases, so does the average number of packets forwarded by each relay node. The results are shown in Fig. 7.

Fig. 7: The influence of count of injected packets on detection performance. Panels: (a) on Am; (b) on Fm; (c) on t; (d) on Aα.

It is found that when the number of injected packets is small, Am of HD and PDE is not ideal, but Am of CPMAED always remains high, which can reach 0.88. Also, as the number of injected packets increases, Fm of HD and CPMAED decreases gradually. This is because the larger the number of injected packets, the more behavior information about nodes can be collected and analyzed, which makes the obtained trust values of nodes more accurate. For PDE and CPMAED, calculating the trust values of nodes based on the node-trust model, instead of assuming that the contributions of different nodes in the same path to the path's reputation are the same, improves their detection performance. And by aggregating the detection results of multiple detection domains, CPMAED obtains stable detection performance.
Besides, compared with HD and PDE, CPMAED has an advantage in Am and Fm. Meanwhile, our scheme can also identify malicious nodes' attack modes with high accuracy, which is not available in HD and PDE.
5) The influence of count of relay nodes: To evaluate the influence of the count of relay nodes on the detection performance, the count of relay nodes is set to 5, 10, 15, 20 and 25 respectively. As the count of relay nodes increases in the network, the scale of the IoT network also increases. The results are shown in Fig. 8.

Fig. 8: The influence of the count of relay nodes on detection performance. Panels: (a) on Am; (b) on Fm; (c) on t; (d) on Aα.

It is found that when the number of relay nodes is small, Am of HD, PDE and CPMAED is low. As the number of relay nodes in the network increases, Am of the three schemes gradually increases. This is because as the number of relay nodes increases, the scale of the network becomes larger and larger. Meanwhile, the increase in the number of available routing paths allows us to collect and analyze more statistical information about nodes in the network. In CPMAED, this improvement increases the dimensions of matrices X and Y, and strengthens the learning of the regression model to output more accurate trust values.
Also, the obtained results indicate that CPMAED outperforms HD and PDE in all cases. After the detection of malicious nodes, CPMAED can identify their attack modes with high accuracy, even up to 0.96.
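The results above repeatedly credit the aggregation of per-domain detection results and the identification of attack modes. A minimal Python sketch of that aggregation step follows, assuming each detection domain ξk has already produced its malicious group MGξk. The function name and data layout are hypothetical, and treating every relay node outside FMG as benign is an assumption of this sketch rather than the paper's definition of FBG.

```python
from typing import Dict, Set, Tuple


def aggregate_domains(
    relay_nodes: Set[str],
    malicious_by_domain: Dict[int, Set[str]],
) -> Tuple[Set[str], Set[str], Dict[str, Set[int]]]:
    """Combine per-domain malicious groups into a final result.

    malicious_by_domain maps a domain index k to the malicious group MG of
    detection domain k. Returns (FMG, benign_nodes, attack_modes), where
    attack_modes[node] holds the indices k of the attack modes attributed
    to that node.
    """
    fmg: Set[str] = set()
    attack_modes: Dict[str, Set[int]] = {}
    for k, malicious_group in malicious_by_domain.items():
        fmg |= malicious_group  # FMG is the union of all per-domain MG sets
        for node in malicious_group:
            # A node flagged in domain k is taken to launch attack mode k.
            attack_modes.setdefault(node, set()).add(k)
    # Assumption of this sketch: nodes never flagged in any domain are benign.
    benign_nodes = relay_nodes - fmg
    return fmg, benign_nodes, attack_modes
```

For example, a node flagged in domains 1 and 2 ends up in FMG with attack modes {1, 2}, which is the per-node information that Aα evaluates.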


VI. CONCLUSION
In this paper, we propose an advanced attack model called conditional packets manipulation attack (CPMA), in which the attackers maliciously manipulate, with a probability, the packets whose attribute values meet specific conditions. When dealing with CPMA attack, most existing detection techniques are inefficient and cannot identify malicious nodes' attack modes. To solve these problems, we present CPMAED, a malicious nodes detection framework against CPMA attack in IoT networks. CPMAED maintains partial trust values for each relay node and converts the calculation of nodes' trust values into a multivariable linear regression problem. Also, the clustering algorithm is adopted to classify the nodes as benign or malicious. Our experimental results show that our proposed scheme can achieve good detection performance and identify malicious nodes' attack modes with high accuracy.

VII. ACKNOWLEDGMENTS
This work is supported by the National Natural Science Foundation of China under Grant No.61402225 and the Science and Technology Funds from National State Grid Ltd. (The Research on Key Technologies of Distributed Parallel Database Storage and Processing based on Big Data).

REFERENCES
[1] Tie Luo and Sai G Nagarajan. Distributed anomaly detection using autoencoder neural networks in wsn for iot. In 2018 IEEE International Conference on Communications (ICC), pages 1–6. IEEE, 2018.
[2] Yasir Mehmood, Farhan Ahmad, Ibrar Yaqoob, Asma Adnane, Muhammad Imran, and Sghaier Guizani. Internet-of-things-based smart cities: Recent advances and challenges. IEEE Communications Magazine, 55(9):16–24, 2017.
[3] Chuang Wang, Taiming Feng, Jinsook Kim, Guiling Wang, and Wensheng Zhang. Catching packet droppers and modifiers in wireless sensor networks. In 2009 6th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, pages 1–9. IEEE, 2009.
[4] Nikos Komninos, Eleni Philippou, and Andreas Pitsillides. Survey in smart grid and smart home security: Issues, challenges and countermeasures. IEEE Communications Surveys & Tutorials, 16(4):1933–1954, 2014.
[5] Rwan Mahmoud, Tasneem Yousuf, Fadi Aloul, and Imran Zualkernan. Internet of things (iot) security: Current status, challenges and prospective measures. In 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST), pages 336–341. IEEE, 2015.
[6] KN Ambili and Jimmy Jose. Trust based intrusion detection system to detect insider attacks in iot systems. In Information Science and Applications, pages 631–638. Springer, 2020.
[7] S Kanthimathi and P Jhansi Rani. Defending against packet dropping attacks in wireless adhoc networks using cluster based trust entropy. In 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pages 2447–2452. IEEE, 2018.
[8] Wafa Abdelghani, Corinne Amel Zayani, Ikram Amous, and Florence Sèdes. Trust evaluation model for attack detection in social internet of things. In International Conference on Risks and Security of Internet and Systems, pages 48–64. Springer, 2018.
[9] Abdelmuttlib Ibrahim Abdalla Ahmed, Siti Hafizah Ab Hamid, Abdullah Gani, Muhammad Khurram Khan, et al. Trust and reputation for internet of things: Fundamentals, taxonomy, and open research challenges. Journal of Network and Computer Applications, 145:102409, 2019.
[10] Hansi Mayadunna, Shanen Leen De Silva, Iesha Wedage, Sasanka Pabasara, Lakmal Rupasinghe, Chethena Liyanapathirana, Krishnadeva Kesavan, Chamira Nawarathna, and Kalpa Kalhara Sampath. Improving trusted routing by identifying malicious nodes in a manet using reinforcement learning. In 2017 Seventeenth International Conference on Advances in ICT for Emerging Regions (ICTer), pages 1–8. IEEE, 2017.
[11] Xia Li, Jill Slay, and Shaokai Yu. Evaluating trust in mobile ad hoc networks. In Workshop of International Conference on Computational Intelligence and Security, 2005.
[12] Ali Abu Romman and Hussein Al-Bahadili. Performance analysis of the neighbor weight trust determination algorithm in manets. Int J Netw Secur Appl, 8(4):29–40, 2016.
[13] Nasser-Eddine Rikli and Aljawharah Alnasser. Lightweight trust model for the detection of concealed malicious nodes in sparse wireless ad hoc networks. International Journal of Distributed Sensor Networks, 12(7):1550147716657246, 2016.
[14] Shenyun Che, Renjian Feng, Xuan Liang, and Xiao Wang. A lightweight trust management based on bayesian and entropy for wireless sensor networks. Security and Communication Networks, 8(2):168–175, 2015.
[15] Sophia Kaplantzis, Alistair Shilton, Nallasamy Mani, and Y Ahmet Sekercioglu. Detecting selective forwarding attacks in wireless sensor networks using support vector machines. In 2007 3rd International Conference on Intelligent Sensors, Sensor Networks and Information, pages 335–340. IEEE, 2007.
[16] Miao Xie, Jiankun Hu, Song Han, and Hsiao-Hwa Chen. Scalable hypergrid k-nn-based online anomaly detection in wireless sensor networks. IEEE Transactions on Parallel and Distributed Systems, 24(8):1661–1670, 2012.
[17] Xin Liu, Mai Abdelhakim, Prashant Krishnamurthy, and David Tipper. Identifying malicious nodes in multihop iot networks using diversity and unsupervised learning. In 2018 IEEE International Conference on Communications (ICC), pages 1–6. IEEE, 2018.
[18] Changda Wang, Syed Rafiul Hussain, and Elisa Bertino. Dictionary based secure provenance compression for wireless sensor networks. IEEE Transactions on Parallel


and Distributed Systems, 27(2):405–418, 2015.
[19] Dinesh Kumar Anguraj and S Smys. Trust-based intrusion detection and clustering approach for wireless body area networks. Wireless Personal Communications, 104(1):1–20, 2019.
[20] Wenjuan Li, Weizhi Meng, Lam-For Kwok, and HS Horace. Enhancing collaborative intrusion detection networks against insider attacks using supervised intrusion sensitivity-based trust management model. Journal of Network and Computer Applications, 77:135–145, 2017.
[21] Wolali Ametepe, Changda Wang, Selasi Kwame Ocansey, Xiaowei Li, and Fida Hussain. Data provenance collection and security in a distributed environment: a survey. International Journal of Computers and Applications, pages 1–15, 2018.
[22] Changda Wang and Elisa Bertino. Sensor network provenance compression using dynamic bayesian networks. ACM Transactions on Sensor Networks (TOSN), 13(1):1–32, 2017.
[23] Weizhi Meng. Intrusion detection in the era of iot: Building trust via traffic filtering and sampling. Computer, 51(7):36–43, 2018.
[24] Liang Liu, Zuchao Ma, and Weizhi Meng. Detection of multiple-mix-attack malicious nodes using perceptron-based trust in iot networks. Future Generation Computer Systems, 101:865–879, 2019.
[25] Qiang Zhou, Xiaolin Qin, Guoxiu Liu, Hui Cheng, and Huanhuan Zhao. An efficient privacy and integrity preserving data aggregation scheme for multiple applications in wireless sensor networks. In 2019 IEEE International Conference on Smart Internet of Things (SmartIoT), pages 291–297. IEEE, 2019.
[26] Hongliang Zhu, Ying Yuan, Yuling Chen, Yaxing Zha, Wanying Xi, Bin Jia, and Yang Xin. A secure and efficient data integrity verification scheme for cloud-iot based on short signature. IEEE Access, 7:90036–90044, 2019.
[27] Ali Bou Nassif, Danny Ho, and Luiz Fernando Capretz. Towards an early software estimation using log-linear regression and a multilayer perceptron model. Journal of Systems and Software, 86(1):144–160, 2013.
[28] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[29] Longjie Li, Yang Yu, Shenshen Bai, Ying Hou, and Xiaoyun Chen. An effective two-step intrusion detection approach based on binary classification and k-nn. IEEE Access, 6:12060–12073, 2017.
[30] Daniel Gibert, Carles Mateu, and Jordi Planes. The rise of machine learning for detection and classification of malware: Research developments, trends and challenges. Journal of Network and Computer Applications, 153:102526, 2020.
[31] Amar Meryem and Bouabid EL Ouahidi. Hybrid intrusion detection system using machine learning. Network Security, 2020(5):8–19, 2020.

Liang Liu is currently an associate professor in College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu Province, China. His research interests include distributed computing, big data and system security. He received the B.S. degree in computer science from Northwestern Polytechnical University, Xi'an, Shanxi Province, China in 2005, and the Ph.D. degree in computer science from Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu Province, China in 2012.

Xiangyu Xu received his Bachelor's degree in 2017, from the Nanjing Tech University, China. He is currently a master student in College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. His research interests include System Security and IoT Security.

Yulei Liu is currently an associate research fellow in College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu Province, China. His research interests include distributed computing, big data and system security.

Zuchao Ma received his Bachelor's degree in 2018, from the Nanjing University of Aeronautics and Astronautics, China. He is currently a master student in College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. His research interests include Cloud Security, System Security and IoT Security.

Jianfei Peng received his Bachelor's degree in 2019, from the Nanjing University of Aeronautics and Astronautics, China. He is currently a master student in College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. His research interests include Cloud Security, System Security and Routing.
