A Novel Intrusion Detection System For Vehicular Ad Hoc Networks (VANETs) Based On Differences of Traffic Flow and Position

Accepted Manuscript
Junwei Liang, Jianyong Chen, Yingying Zhu, Richard Yu
PII: S1568-4946(18)30683-5
DOI: https://doi.org/10.1016/j.asoc.2018.12.001
Reference: ASOC 5227
To appear in: Applied Soft Computing Journal
Received date : 27 January 2018

Revised date : 30 October 2018
Accepted date : 3 December 2018
Please cite this article as: J. Liang, J. Chen, Y. Zhu et al., A novel Intrusion Detection System for
Vehicular Ad Hoc Networks (VANETs) based on differences of traffic flow and position, Applied
Soft Computing Journal (2018), https://doi.org/10.1016/j.asoc.2018.12.001
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form.
Please note that during the production process errors may be discovered which could affect the
content, and all legal disclaimers that apply to the journal pertain.
Highlights
• A novel feature extraction algorithm is used to extract distinct features for the IDS.
• An improved growing hierarchical self-organizing map is proposed for IDS in VANETs.
• The proposed IDS can mitigate message congestion and accurately detect attacks.
Graphical abstract
[Figure: the scheme of the proposed IDS, showing the feature extraction, classifier and response center modules together with the position table, the last neighbor table and the current neighbor table.]
A Novel Intrusion Detection System for Vehicular Ad Hoc Networks (VANETs) based on
Differences of Traffic Flow and Position
Junwei Liang (1), Jianyong Chen (1), Yingying Zhu (1), and Richard Yu (2)

(1) School of Computer and Software Engineering, Shenzhen University, Shenzhen, China
(2) School of Information Technology, Carleton University, Ottawa, Canada
Abstract
Vehicular ad hoc networks (VANETs) have attracted great interest from both industry and
academia, but a number of issues, particularly security, have not yet been adequately addressed.
Intrusion detection systems (IDSs), one of the most important approaches to protecting network
security, have been studied extensively in the literature. However, the performance of IDSs
still needs to be improved to suit VANETs, which are fast moving and highly dynamic. In this
paper, we propose a novel IDS suited to wireless and dynamic networks such as VANETs. It
consists mainly of a novel feature extraction algorithm and a classifier based on an improved
growing hierarchical self-organizing map (I-GHSOM). The proposed feature extraction algorithm
quickly extracts distinct features from vehicle messages for training and testing the IDS. Two
key features are extracted: the difference of traffic flow and the difference of position. The
former is calculated according to the range of the distance between vehicles, while a voting
filter mechanism and a semi-cooperative mechanism are designed to obtain the latter.
Furthermore, to attain precise classification results quickly, the I-GHSOM-based classifier
introduces two novel mechanisms, relabeling and recalculating, which relabel the units of the
GHSOM and check whether the balance of the GHSOM structure has been broken. Simulation results
show that the proposed IDS outperforms others in terms of accuracy, stability, processing
efficiency and message scale.
Keywords: Vehicle Ad Hoc Networks (VANETs), Intrusion Detection System (IDS), Feature
Extraction Algorithm, and Improved Growing Hierarchical Self-Organizing Map (I-GHSOM).
1. Introduction
The automotive industry has equipped vehicles with wireless access in vehicular
environments (WAVE) devices since 2015 [1]. Vehicle ad hoc networks (VANETs) will become
a reality in the very near future and remarkably change our lives. The tremendous safety,
convenience and commercial potential of VANETs can make the tasks for drivers easier and
thus improve safety. VANETs as the basic infrastructure can facilitate the applications and
services of connected vehicles and intelligent transportation systems.
Dedicated short range communication (DSRC) provides a two-way, short- to medium-range
wireless communication capability that permits the very high data transmission rates critical to
communication-based active safety applications. Vehicles in VANETs use DSRC to
communicate with each other, i.e., vehicle to vehicle (V2V), and with the infrastructure
(roadside units), i.e., vehicle to infrastructure (V2I) [2]. Safety and entertainment are the
main applications of and incentives for VANETs, which have drawn great interest from both
academia and industry [3].
Because VANETs handle vital traffic information related to human safety, security in
VANETs is crucial. An intrusion detection system (IDS) is one of the most important approaches
to protecting vehicular networks against threats, as it can detect both insider and
external attacks with high accuracy [4], [5]. However, some important issues need to be
addressed before an IDS can be used to prevent attacks in VANETs. (1) Intrusion detection
mechanisms that work well in wired networks are difficult to reuse because of the wireless and
mobile nature of VANETs. (2) There is no well-known database for training an IDS in the
VANETs scenario. (3) Encounters are short-lived, and received messages have to be handled
quickly because VANETs are fast moving and highly dynamic. In other words, the reliability of
information needs to be ascertained both quickly and accurately [6].
The main contributions in this paper are described below.
1. A novel feature extraction algorithm is proposed to extract the differences of traffic flow
and of position. In the proposed extraction mechanism, the difference of traffic flow is
calculated according to the range of the distance between vehicles. A voting filter
mechanism and a semi-cooperative mechanism are proposed to calculate the difference of
position from the positions of the neighboring vehicles at the current and last time
points.
2. An improved growing hierarchical self-organizing map (I-GHSOM) is presented as the
classifier in the proposed IDS, in which two novel mechanisms (relabeling and recalculating)
are used to relabel the units of the GHSOM and to check the balance of the GHSOM
structure.
3. With the proposed IDS, the network message congestion (e.g., broadcast storms) can be
mitigated by the reduction of message scale and the improvement of processing efficiency.
Simulation results show that the performance of the proposed IDS is still remarkable even
when up to 40% of vehicles are rogue vehicles.
The remainder of this paper is organized as follows. Related work is discussed in Section 2.
VANETs model and attack model are presented in Sections 3 and 4, respectively. In Section 5,
the overview of the proposed IDS is provided. Its security performance is demonstrated in detail
and its results are discussed in Section 6. Conclusions and future work are presented in Section
7.
2. Related work
The security of VANETs is a highly important issue that has been the focus of research for
many years [7], [8], [9], [10], [11], [12]. One of the important challenges in this regard
is the existence of rogue vehicles. Several approaches have been reported in the literature to
tackle this issue in VANETs. These approaches are usually classified into three categories:
(1) trust- or reputation-based schemes, (2) data-centric misbehavior detection schemes, and
(3) IDSs.
In a trust- or reputation-based scheme, a trust score is assigned to a vehicle based on
previous or current interactions between that vehicle and others [13]. The trust score represents
the reputation of a vehicle in the network. In [14], [15], a decentralized trust-based
infrastructure has been adopted. These works build a reputation management system for each
vehicle that can quickly adapt to changes in local conditions and establish trust relationships
with other vehicles. In contrast, [16] proposes a centralized infrastructure (an
attack-resistant trust management scheme). It can not only detect and cope with malicious
attacks, but also evaluate the trustworthiness of both data and vehicles in VANETs. The
trust-based scheme is useful, but it cannot be used to detect false emergency messages because
trust needs to be built over a period of time. Moreover, such schemes have no way to detect a
false message coming from a trusted vehicle.
Data-centric misbehavior detection schemes have been proposed in [16], [17], [18]. They have
been applied to shared data to improve the reliability of VANETs [17]. In [16], the authors
proposed a model to detect and correct errors in the data sent out by vehicles. Messages that
conform to the model are accepted; all others are rejected. This kind of scheme has also been
adopted to identify false information in emergency messages on the basis of the message type
and the subsequent behavior of the sending vehicle [18]. However, the technique is difficult to
manage in VANETs and is infeasible for emergency messages, which need to be acted on quickly.
Additionally, this technique incurs an enormous computation cost.
Maintaining and depending on trust or reputation is very expensive, and trust is a complex
concept; centralized trust in particular has long been debated because it is difficult to
maintain, update and use in VANETs [20]. An IDS is a more suitable approach to protecting
VANETs against threats, since it can detect attacks with high accuracy and can protect the
system from unknown attacks [22], [23].
Several studies have addressed IDSs for VANETs. In [24], [25], reputation scores and a
rule-based anomaly detection framework were employed in the IDS. However, as the number of
vehicles increases, performance declines noticeably: detection takes longer, false alarms
become more frequent and overhead grows. Although a rule-based IDS has high detection accuracy
and efficiency, it can detect only known attacks and is ineffective against unknown ones [23],
[26], [27], [34]. In [28], [29], the authors presented watchdogs for intrusion detection in
VANETs. The former monitors all packets to decide whether an attack is in progress, while the
latter supervises both the number of RTS/CTS (request to send/clear to send) requests from the
watchdogs and the detected vehicles at the MAC layer. The main shortcoming of this approach is
that misbehaving vehicles may be rewarded instead of punished: they no longer forward the
packets of other vehicles, yet their own packets are still forwarded. In [19], [20], [21],
statistics-based methods were used for anomaly detection; they can detect attacks accurately
and protect VANETs from unknown attacks. However, these methods require the distribution of the
data to be known in advance, which is usually difficult. In addition, they can handle only one
feature at a time, so they have to be executed multiple times when more than one feature is
involved. In [30], [31], support vector machine (SVM) based IDSs were designed. They use an
SVM-based classifier to monitor vehicles and classify them as cooperative or malicious. These
detection mechanisms are placed in a single resource-constrained vehicle, which may become
overloaded because it has to gather, propagate, store and analyze the training data sent by a
large number of surrounding vehicles. In [32], malicious activity in a VANET is detected by a
Markov chain model constructed from the states of vehicles and their transitions. However, the
model has to collect a series of vehicle states before it can detect attacks, so rogue vehicles
cannot be detected during that period. Moreover, in [33], the proposed IDS framework uses a
Bayesian game-theoretic methodology to switch the IDS between active and idle states to reduce
overhead. Unfortunately, attacks from vehicles cannot be detected while the IDS is idle.
To address the above challenges, we propose a novel IDS based on the mobile and dynamic
nature of VANETs, which mainly consists of a novel feature extraction algorithm and an
I-GHSOM classifier. Unlike wired networks, VANETs have no well-known database, so
feature extraction is necessary for IDSs in VANETs. Compared with other feature extraction
methods, the proposed algorithm obtains more distinct traffic flow features by taking
the range of the distance between vehicles into consideration, and uses a novel
semi-cooperative mechanism to extract the position feature in a short time. In addition, a
neural network (I-GHSOM) is used as the classifier for the IDS. In this neural network, two
novel mechanisms (relabeling and recalculating) are proposed to attain more precise
classification results.
3. VANETs model
[Figure: vehicles on a highway, each carrying vehicle-mounted devices (GPS, Radar, IDS) and three tables: the last neighbor table and the current neighbor table, with entries of the form (vehicle ID, AvgFlow, Xpos, Ypos), and the position information table, with entries of the form (V1, V2, Xpos_neg, Ypos_neg, D_1&2).]
Fig. 1. VANETs model on a highway.
Previous studies on VANETs [20], [21], [27], [33], [34], [35] have provided VANETs models
that specify the message format, the protocol used for message forwarding and the construction
of the VANETs. However, these models differ in places because the proposed methods differ.
Here, we provide a universal VANETs model based on these studies, shown in Fig. 1. In this
model, each vehicle is equipped with several devices, such as GPS, radar and an IDS: GPS
obtains the vehicle position, radar measures signal information and the IDS detects attacks.
There are three roles in vehicular communication: own vehicle, neighboring vehicle and target
vehicle. The own vehicle is the vehicle under consideration. The neighboring vehicles are
nearby vehicles with which the own vehicle can communicate directly. The target vehicle is the
neighboring vehicle whose message is currently being processed by the own vehicle, so any
neighboring vehicle can be a target vehicle. Moreover, each vehicle has several tables that
store messages from neighboring vehicles, and the items in these tables are used in decision
making to improve the driving experience. The following subsections provide the details of the
VANETs model.
3.1. VANETs measurements

As shown in Fig. 1, the own vehicle obtains the density of vehicles (𝐷𝑒𝑛𝑠𝑖𝑡𝑦𝑜𝑤𝑛 ) by counting
the neighboring vehicles on the highway, which it identifies from the IDs in the messages
received at the last time point. In addition, we assume that each vehicle is equipped with a
GPS device, which is assumed to be accurate. Using GPS, the own vehicle acquires its position
𝑃𝑜𝑠𝑜𝑤𝑛 (𝑋𝑝𝑜𝑠𝑜𝑤𝑛 , 𝑌𝑝𝑜𝑠𝑜𝑤𝑛 ). Based on the Greenshields model [20], [35] and the free space
model [36], the own vehicle obtains the traffic flow (𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 ) and the distance between
itself and a neighboring vehicle or the target vehicle (𝐷𝑜&𝑛 or 𝐷𝑜&𝑡 ), respectively.
The details of both models are provided below.
Based on the Greenshields model, Eq. (1) and Eq. (2) can be obtained, where 𝑀𝑎𝑥𝑆𝑝𝑒𝑒𝑑
is the free-flow speed when the density is zero and 𝑀𝑎𝑥𝐷𝑒𝑛𝑠𝑖𝑡𝑦 is the density at which the
speed becomes zero and vehicles are stuck in a traffic jam. As a result, 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 can be
derived as shown in Eq. (3), where 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔′ is the average traffic flow of a neighboring
vehicle at the last time point and 𝑁 − 1 is the number of neighboring vehicles at the same
time point. Readers can refer to [20], [35] for the details of the Greenshields model.
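$\mathit{AvgSpeed}_{own} = \mathit{MaxSpeed} - \dfrac{\mathit{MaxSpeed}}{\mathit{MaxDensity}} \, \mathit{Density}_{own}$ (1)
$\mathit{Flow}_{own} = \mathit{AvgSpeed}_{own} \times \mathit{Density}_{own}$ (2)
$\mathit{AvgFlow}_{own} = \dfrac{1}{N} \left( \sum_{i=1}^{N-1} \mathit{AvgFlow}_{neg'} + \mathit{Flow}_{own} \right)$ (3)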
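As an illustration, Eqs. (1)-(3) can be sketched in a few lines of Python. This is only a minimal sketch: the function names and the units used in the usage note (km/h and vehicles/km) are assumptions made for the example and are not taken from the paper.

```python
def greenshields_speed(density, max_speed, max_density):
    """Eq. (1): average speed decreases linearly with density."""
    return max_speed - (max_speed / max_density) * density


def own_flow(density, max_speed, max_density):
    """Eq. (2): flow is average speed times density."""
    return greenshields_speed(density, max_speed, max_density) * density


def avg_flow_own(neighbor_avg_flows, flow_own):
    """Eq. (3): mean of the N-1 neighbor values (last time point) and the own flow."""
    n = len(neighbor_avg_flows) + 1  # N-1 neighbors plus the own vehicle
    return (sum(neighbor_avg_flows) + flow_own) / n
```

For example, with an assumed free-flow speed of 120, a jam density of 100 and an observed density of 25, `greenshields_speed` gives about 90 and `own_flow` about 2250; averaging that with neighbor values 2000 and 2500 leaves the result at about 2250.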
Based on the free space model, the own vehicle can calculate the distance between itself and
its neighboring vehicle or between itself and its target vehicle to get the vehicle position on
highways by Eq. (4), in which $\mathit{RSS}_j$, $\mathit{WL}_j$ and $\mathit{SP}_j$ are the
received signal strength, the wavelength and the sending power of the neighboring vehicle (or
target vehicle), respectively. For the details of the free space model, readers can refer to [36].
$D_{i\&j} = \sqrt{\dfrac{\mathit{SP}_j \times \mathit{WL}_j^{2}}{(4\pi)^{2} \, \mathit{RSS}_j}}, \quad i = o\,(own), \; j \in \{ n\,(neg), \, t\,(tag) \}$ (4)
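A minimal Python sketch of Eq. (4) follows. It assumes that the sending power and the received signal strength are expressed in the same linear unit (for example watts, not dBm) and the wavelength in metres; the function name is hypothetical.

```python
import math


def free_space_distance(sending_power, wavelength, rss):
    """Eq. (4): D = sqrt(SP * WL^2 / ((4*pi)^2 * RSS)).

    Assumes linear power units for sending_power and rss (an assumption of
    this sketch) and returns the distance in the unit of the wavelength.
    """
    if rss <= 0:
        raise ValueError("received signal strength must be positive")
    return math.sqrt(sending_power * wavelength ** 2 / ((4 * math.pi) ** 2 * rss))
```

Because the received power falls with the square of the distance in this model, quartering `rss` doubles the estimated distance.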
3.2. Message format

To communicate with the neighboring vehicles, the own vehicle continuously broadcasts
a beacon message (𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔) at a fixed time interval (𝐵𝑒𝑎𝑐𝑜𝑛𝑇). The format of the message
is shown in Eq. (5), where 𝐼𝐷𝑜𝑤𝑛 is the identity of the own vehicle. It should be noted that,
at the receiver's end, these measurements are relabeled as ( 𝐼𝐷𝑛𝑒𝑔 , 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔 , 𝑃𝑜𝑠𝑛𝑒𝑔 ) or
( 𝐼𝐷𝑡𝑎𝑔 , 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔 , 𝑃𝑜𝑠𝑡𝑎𝑔 ).
𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔(𝐼𝐷𝑜𝑤𝑛 , 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 , 𝑃𝑜𝑠𝑜𝑤𝑛 ) (5)
When the own vehicle needs to know the position of its target vehicle, it will broadcast a
request message (𝑅𝑒𝑞𝑢𝑒𝑠𝑡𝑀𝑠𝑔) as Eq. (6), where 𝐼𝐷𝑡𝑎𝑔 is the identity of the target vehicle.
Once its neighboring vehicles receive 𝑅𝑒𝑞𝑢𝑒𝑠𝑡𝑀𝑠𝑔, they broadcast their positions and the
distances to the target vehicle. During a certain waiting period, 𝑊𝑎𝑖𝑡𝑖𝑛𝑔𝑇 (it is shorter than
𝐵𝑒𝑎𝑐𝑜𝑛𝑇), the own vehicle can receive response messages (𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔), the format of
which is shown in Eq. (7) with 𝐷𝑛&𝑡 as the distance between the neighboring vehicle and the
target vehicle.
𝑅𝑒𝑞𝑢𝑒𝑠𝑡𝑀𝑠𝑔(𝐼𝐷𝑜𝑤𝑛 , 𝐼𝐷𝑡𝑎𝑔 ) (6)
𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔(𝐼𝐷𝑛𝑒𝑔 , 𝐼𝐷𝑡𝑎𝑔 , 𝑃𝑜𝑠𝑛𝑒𝑔 , 𝐷𝑛&𝑡 ) (7)
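The three message formats in Eqs. (5)-(7) can be pictured as simple records. The sketch below uses Python dataclasses; the field names, the use of string IDs and the helper `make_response` are illustrative assumptions, not part of the paper.

```python
from dataclasses import dataclass
from typing import Tuple

Position = Tuple[float, float]  # (Xpos, Ypos)


@dataclass(frozen=True)
class BeaconMsg:
    """Eq. (5): BeaconMsg(ID_own, AvgFlow_own, Pos_own)."""
    sender_id: str
    avg_flow: float
    pos: Position


@dataclass(frozen=True)
class RequestMsg:
    """Eq. (6): RequestMsg(ID_own, ID_tag)."""
    own_id: str
    target_id: str


@dataclass(frozen=True)
class ResponseMsg:
    """Eq. (7): ResponseMsg(ID_neg, ID_tag, Pos_neg, D_n&t)."""
    neighbor_id: str
    target_id: str
    neighbor_pos: Position
    dist_to_target: float


def make_response(request, neighbor_id, neighbor_pos, dist_to_target):
    """Hypothetical helper: a neighbor answers a request about the target."""
    return ResponseMsg(neighbor_id, request.target_id, neighbor_pos, dist_to_target)
```

A neighbor that receives `RequestMsg("V1", "V7")` would reply with its own position and its measured distance to `V7`, for example `make_response(req, "V2", (10.0, 0.0), 42.5)`.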
3.3. Information tables

To store messages within a communication window, each vehicle has three tables, i.e.,
the current neighbor table, the last neighbor table and the position table, as shown in Fig. 1.
The current neighbor table and the last neighbor table store 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔, and the position table
stores 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔. The lifespan of the items in the three tables is 𝐵𝑒𝑎𝑐𝑜𝑛𝑇.
When this period ends, the items in both the position table and the last neighbor table are
deleted. Then, the items in the current neighbor table are moved to the last neighbor table and
are relabeled as (𝐼𝐷𝑛𝑒𝑔′ , 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔′ , 𝑃𝑜𝑠𝑛𝑒𝑔′ ) or (𝐼𝐷𝑡𝑎𝑔′, 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔′ , 𝑃𝑜𝑠𝑡𝑎𝑔′).
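The table lifecycle described above can be sketched as follows. The class and method names are hypothetical, and keying the neighbor tables by vehicle ID is an assumption made for the example.

```python
class NeighborTables:
    """Sketch of the three per-vehicle tables and their rotation every BeaconT."""

    def __init__(self):
        self.current = {}   # current neighbor table: id -> (avg_flow, pos)
        self.last = {}      # last neighbor table: same layout, previous interval
        self.position = []  # position table: stored response messages

    def accept_beacon(self, vehicle_id, avg_flow, pos):
        self.current[vehicle_id] = (avg_flow, pos)

    def store_response(self, response):
        self.position.append(response)

    def on_beacon_interval_end(self):
        # Old items expire: the position table and last neighbor table are
        # dropped, then the current entries become the "last" entries.
        self.last = self.current
        self.current = {}
        self.position = []
```

Calling `on_beacon_interval_end()` twice without receiving any beacon in between leaves all three tables empty, which matches the one-interval lifespan of the items.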
4. Attack models
[Figure: a highway scene with a rogue vehicle, fake vehicles and legitimate vehicles.]
Fig. 2. An example of false information attack.
[Figure: a highway scene with a rogue vehicle, fake vehicles and legitimate vehicles.]
Fig. 3. An example of Sybil attack.

To examine the performance of the proposed IDS, two important attacks, i.e., the false
information attack and the Sybil attack, are considered. It should be noted that the proposed
IDS can also resist other types of attacks.
False Information Attack [37]: Rogue vehicles can inject false messages into the network,
either deliberately with malicious intent or because of faulty sensors, which can cause serious
damage to the network. Under extreme conditions, the network can even be paralyzed [20]. In
this paper, for selfish purposes or to avoid detection, rogue vehicles can start injecting false
data at any time and can falsify their calculated measurements in 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 or in
𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔. An example of a false information attack is shown in Fig. 2. The red region is
where the brakes have been applied. The orange region is where the information about this
effect is being propagated and vehicles begin to slow down. Unlike in these two regions,
vehicles in the blue region can run smoothly. As shown in the figure, a rogue vehicle can
create a fake car accident by broadcasting falsely low values of traffic flow, slowing down
its neighboring vehicles for selfish purposes.
Sybil Attack [38]: In a Sybil attack, a rogue vehicle illegitimately uses multiple
identities. In VANETs, vehicles usually discover new neighboring vehicles by periodically
broadcasting beacon messages. However, given the invisible nature of wireless communication,
a rogue vehicle can easily claim multiple identities without being detected. Identity
authentication does not help to prevent Sybil attacks because malicious drivers can still acquire
additional identity information by non-technical means such as stealing it or simply
borrowing it from friends [20], [21]. In this paper, rogue vehicles can create many false
𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 and 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔 to launch Sybil attacks. For instance, as shown in Fig. 3,
a rogue vehicle can create a false congestion condition by transmitting many false
𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 to its neighboring vehicles in order to hold up certain vehicles for selfish purposes.
5. The proposed intrusion detection system

To address the issues that existing IDSs for VANETs face (described in Section 1), we
propose a novel IDS, which is designed based on the VANETs model described in Section 3.
The proposed IDS cannot be trained and tested directly on ready-made databases because no
well-known database is available. Therefore, two scenarios are set up to collect data for
training and testing, i.e., a normal scenario (without rogue vehicles) and a rogue scenario
(involving rogue vehicles). First, we gather data under the normal scenario and use the data to
train the proposed IDS. Then, in the rogue scenario, the trained IDS detects anomalies whenever
there are deviations in vehicle messages.
In the following, the scheme of the proposed IDS is provided first. Then, the main modules
of this scheme, i.e., the feature extraction module and the classifier module, are described in
detail. Unlike the preprocessing mechanism of an IDS used in a wired network, the feature
extraction module needs to quickly transform the measurements in messages into features that
reflect the security characteristics of vehicles. The classifier module adopts an improved
neural network algorithm to detect deviations in VANETs both quickly and accurately (outlier
detection).
5.1. Scheme
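[Figure: the proposed IDS receives a beacon message; the feature extraction module, using the last neighbor table and the position table (filled by request and response messages exchanged with neighboring vehicles), passes the beacon message and its feature vector to the classifier; accepted messages enter the current neighbor table, while for rejected messages the vehicle ID is sent to the response center, which issues a response signal. Items in the tables expire.]
Fig. 4. Scheme of the proposed IDS.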
The scheme of the proposed IDS is provided in Fig. 4. It consists of three modules, i.e.,
feature extraction, classifier and response center. They cooperate with each other and use some
facilities of VANETs to detect deviations quickly and accurately. The feature extraction module
extracts features from messages with the help of the last neighbor table and the position table.
The trained classifier module checks whether there are deviations in these messages according
to the features supplied by the feature extraction module. The response center module carries
out the corresponding actions to assure the security of VANETs. The procedures of the training
and testing phases are as follows.
As shown in Fig. 4, in both the training and the testing phases, once the IDS receives
𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 from a target vehicle, the feature extraction module extracts a feature vector from
𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 with the help of the last neighbor table and the position table, and then sends the
feature vector and 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 to the classifier module. Note that the items in
the current neighbor table are moved to the last neighbor table automatically at the end of
each 𝐵𝑒𝑎𝑐𝑜𝑛𝑇. If the IDS needs assistance from neighboring vehicles during feature extraction,
it broadcasts 𝑅𝑒𝑞𝑢𝑒𝑠𝑡𝑀𝑠𝑔 and waits for 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔 from the neighboring vehicles
within 𝑊𝑎𝑖𝑡𝑖𝑛𝑔𝑇. Next, in the training phase, the feature vector is used only to train the
classifier, and 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 is added to the current neighbor table. In the testing phase, the
trained classifier checks whether any deviation exists in 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 when it
receives the feature vector. If the classifier finds no deviation in the feature
vector, 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 is accepted and added to the current neighbor table. Otherwise, it is
rejected and the classifier sends the ID of this target vehicle to the response center, which
takes the corresponding actions.
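The control flow of Fig. 4 can be sketched as below. This is a simplified illustration under explicit assumptions: `ThresholdClassifier` is a toy stand-in that only remembers per-feature maxima and is not the I-GHSOM classifier of the paper, beacons are plain dictionaries, and `extract_features` is any caller-supplied function.

```python
class ThresholdClassifier:
    """Toy stand-in for the trained classifier (NOT the paper's I-GHSOM).

    It remembers the largest value seen per feature during training and
    treats any later vector exceeding one of those maxima as a deviation.
    """

    def __init__(self):
        self.max_seen = None

    def train(self, features):
        if self.max_seen is None:
            self.max_seen = list(features)
        else:
            self.max_seen = [max(m, f) for m, f in zip(self.max_seen, features)]

    def is_normal(self, features):
        if self.max_seen is None:
            raise RuntimeError("classifier has not been trained")
        return all(f <= m for f, m in zip(features, self.max_seen))


def handle_beacon(beacon, extract_features, classifier, current_table,
                  reported_ids, training):
    """Process one beacon; return True if it was accepted."""
    features = extract_features(beacon)
    if training:
        # Training phase: the vector only trains the classifier.
        classifier.train(features)
        current_table[beacon["id"]] = beacon
        return True
    if classifier.is_normal(features):
        current_table[beacon["id"]] = beacon  # accepted
        return True
    reported_ids.append(beacon["id"])  # rejected: ID goes to the response center
    return False
```

Here `reported_ids` plays the role of the response center's input queue; what the response center then does with a reported ID is outside this sketch.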
5.2. Feature extraction module
Algorithm 1
Feature extraction procedure.
Input: 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔, 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔
Output: 𝐷𝑒𝑡𝐹𝑒𝑎(𝐹𝑙𝑜𝑤𝑅, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅) % Feature vector
% Calculate 𝐹𝑙𝑜𝑤𝑅:
1: Calculate 𝐷𝑜&𝑡 as Eq. (4) and 𝐷𝑒𝑛𝑠𝑖𝑡𝑦𝑜𝑤𝑛
2: Obtain 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔 and calculate 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 by Eq. (3)
3: 𝐹𝑙𝑜𝑤𝑅 = (|𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 − 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔 |, ⌊𝐷𝑜&𝑡 / 100⌋)
% Calculate 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 by a semi-cooperative method:
4: if 𝐼𝐷𝑡𝑎𝑔 does not exist in last neighbor table then % Cooperative situation
5:     Broadcast 𝑅𝑒𝑞𝑢𝑒𝑠𝑡𝑀𝑠𝑔 and wait for 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔 in 𝑊𝑎𝑖𝑡𝑖𝑛𝑔𝑇 to fill in position table
6:     Obtain 𝐵𝑖𝑎𝑠𝑜&𝑡 as Eq. (8) and 𝐵𝑖𝑎𝑠𝑛&𝑡 as Eq. (9) % There may be more than one 𝐵𝑖𝑎𝑠𝑛&𝑡
7:     Abandon every 𝐵𝑖𝑎𝑠𝑛&𝑡 whose sender does not exist in last neighbor table or whose 𝐵𝑖𝑎𝑠𝑛&𝑛′ > 𝑀𝑎𝑥𝐵𝑖𝑎𝑠
8:     if 𝑁𝑜. 𝐵𝑖𝑎𝑠𝑛&𝑡 (𝐵𝑖𝑎𝑠𝑛&𝑡 ≥ 𝑀𝑎𝑥𝐵𝑖𝑎𝑠) > (1/2) 𝑇𝑜𝑡𝑎𝑙 𝑁𝑜. 𝐵𝑖𝑎𝑠𝑛&𝑡 then
9:         𝐵𝑖𝑎𝑠𝑛&𝑡 (𝐵𝑖𝑎𝑠𝑛&𝑡 < 𝑀𝑎𝑥𝐵𝑖𝑎𝑠) are abandoned
10:     else
11:         𝐵𝑖𝑎𝑠𝑛&𝑡 (𝐵𝑖𝑎𝑠𝑛&𝑡 ≥ 𝑀𝑎𝑥𝐵𝑖𝑎𝑠) are abandoned
12:     end if
13:     if (𝑁𝑜. 𝐵𝑖𝑎𝑠𝑛&𝑡 ≥ 2) then
14:         𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 is obtained with Eq. (11)
15:     else
16:         𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 cannot be calculated
17:     end if
18: else % Uncooperative situation, i.e., 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 is attained directly
19:     Obtain 𝐷𝑡&𝑡′ by Eq. (12) and 𝐵𝑖𝑎𝑠𝑡&𝑡′ by Eq. (13)
20:     Obtain 𝐷𝑜′&𝑡 by Eq. (14) and 𝐵𝑖𝑎𝑠𝑜′&𝑡 by Eq. (15)
21:     Use free space model to obtain 𝐵𝑖𝑎𝑠𝑜&𝑡 by Eq. (8)
22:     𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 can be attained by Eq. (16)
23: end if
As there is no well-known database for IDSs in the VANETs scenario, the proposed feature
extraction module cannot be the same as the preprocessing module of an IDS in a wired network,
which simply normalizes and selects given features from well-known databases.
The feature extraction module needs to quickly transform the measurements in the messages
into features that reflect the security characteristics of a vehicle. According to the VANETs
model, two features, i.e., 𝐹𝑙𝑜𝑤𝑅 and 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅, can be extracted by the feature extraction
module. The former is the difference of traffic flow between the own vehicle and its target
vehicle. It is based on the principle that traffic flows must be very similar when vehicles are
close to each other under the same traffic conditions [20]. An attack is indicated if the
difference of flows between two adjacent vehicles is beyond the normal range. The latter is the
difference between the claimed position and the detected position of a target vehicle. If this
feature deviates by more than can be explained by inevitable errors, such as calculation error,
a wrong position has been sent by a malicious vehicle. Therefore, flow and position can be
taken as features in an IDS to detect attacks in VANETs, such as the false information attack
and the Sybil attack. For example, a rogue vehicle may try to create a non-existent accident by
lowering its reported flow or by generating many false messages with forged IDs, positions and
so on. In such cases, 𝐹𝑙𝑜𝑤𝑅 or 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 may deviate from the normal scenario. The procedures
of feature extraction are provided in Algorithm 1 and are described in detail in the following
subsections.
5.2.1. Feature extraction of traffic flow

Normally, traffic flows are similar for the vehicles that are close to each other under the same
traffic conditions [20]. It means that 𝐹𝑙𝑜𝑤𝑅 is a variable related to the difference of traffic
flow and the distance between vehicles. The closer the two vehicles are, the more similar their
traffic flows are. An example of the relationship between traffic flow and distance is provided
in Fig. 5. As shown in the figure, the own vehicle divides its communication range into three
distance domains, and then locates each target vehicle in one of the distance domains according
to the distance between itself and the target vehicle. With the increase of domain number, the
traffic flow deviations between the own vehicle and the target vehicles increase. The traffic
flow deviation in distance domain #0 is the smallest, followed by that in distance domain #1
and that in distance domain #2. Thus, 𝐹𝑙𝑜𝑤𝑅 can be taken as the vector that consists of both
the difference of traffic flow and the proximity of vehicle’s distance.
The procedures to extract 𝐹𝑙𝑜𝑤𝑅 are shown in lines 1-3 of Algorithm 1. As described
in Section 3, the own vehicle first calculates 𝐷𝑒𝑛𝑠𝑖𝑡𝑦𝑜𝑤𝑛 by checking the IDs in its
last neighbor table, and then obtains 𝐷𝑜&𝑡 as Eq. (4). Secondly, 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 is calculated
using Eq. (3) and 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔 is obtained from 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔. Finally, the difference
of traffic flow (|𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 − 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔 |) and the proximity of the vehicles' distance
(⌊𝐷𝑜&𝑡 / 100⌋) are combined into a vector to represent 𝐹𝑙𝑜𝑤𝑅, i.e.,
𝐹𝑙𝑜𝑤𝑅 = (|𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 − 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔 |, ⌊𝐷𝑜&𝑡 / 100⌋). It should be noted that, when
extracting the flow feature, both Ref. [20] and this paper use the difference of traffic flow;
we further add the proximity of the vehicles' distance to the flow feature to make it more
distinct.
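The construction of the flow feature (line 3 of Algorithm 1) can be sketched as follows. The function name and the `domain_width` parameter are illustrative; the default of 100 mirrors the divisor in ⌊D/100⌋, and treating it as a distance in the same unit as the measured distance is an assumption of this sketch.

```python
import math


def flow_r(avg_flow_own, avg_flow_tag, dist_own_to_target, domain_width=100.0):
    """FlowR = (|AvgFlow_own - AvgFlow_tag|, floor(D_o&t / 100)).

    The second component is the index of the distance domain (#0, #1, #2, ...)
    in which the target vehicle lies.
    """
    flow_diff = abs(avg_flow_own - avg_flow_tag)
    distance_domain = math.floor(dist_own_to_target / domain_width)
    return (flow_diff, distance_domain)
```

For instance, `flow_r(2250, 2100, 250)` returns `(150, 2)`: a flow difference of 150 observed for a target in distance domain #2.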
[Figure: the own vehicle's communication range divided into distance domains #0, #1 and #2, with target vehicles 1, 2 and 3 located in successive domains. The range of traffic flow deviation: Target Vehicle 1 < Target Vehicle 2 < Target Vehicle 3.]
Fig. 5. Relationship between traffic flow and distance.
5.2.2. Position feature extraction

𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 in Refs. [19] and [21] is extracted only in the cooperative situation, which takes
up much of the IDS processing time because of the delay and congestion involved in exchanging
messages between vehicles. In contrast, we use a semi-cooperative method, which calculates the
position biases between a target vehicle and other vehicles and takes the average of these
biases as 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅. The semi-cooperative method covers cooperative and uncooperative
situations. In the cooperative situation, a novel voting-filter mechanism filters out wrong
𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔. In the uncooperative situation, a novel uncooperative mechanism calculates
𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 from the historical positions stored in the last neighbor table, without any help
from other neighboring vehicles. The procedures to extract 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 are shown in lines 4-23
of Algorithm 1.
5.2.2.1 Voting filter mechanism for cooperative situation

(1) As shown in line 5 of Algorithm 1, the own vehicle broadcasts 𝑅𝑒𝑞𝑢𝑒𝑠𝑡𝑀𝑠𝑔 to its
neighboring vehicles and waits for them to return 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔 during 𝑊𝑎𝑖𝑡𝑖𝑛𝑔𝑇.
(2) The own vehicle calculates 𝐵𝑖𝑎𝑠𝑜&𝑡 and 𝐵𝑖𝑎𝑠𝑛&𝑡 as shown in line 6 of Algorithm 1.
The former is the position bias between the own vehicle and the target vehicle, which is
obtained by Eq. (8). The latter is the bias between a neighboring vehicle and the target
vehicle, which is computed by Eq. (9). Note that there may be more
than one 𝐵𝑖𝑎𝑠𝑛&𝑡 .
(3) The own vehicle executes a novel voting-filter mechanism to identify wrong
𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔 from rogue vehicles, as shown in lines 7-12 of Algorithm 1; an
example is shown in Fig. 6. In step 1, two measures guarantee the credibility of the voting
vehicles: 𝐵𝑖𝑎𝑠𝑛&𝑡 from neighboring vehicles that do not exist in the
last neighbor table (no history) or whose 𝐵𝑖𝑎𝑠𝑛&𝑛′ > 𝑀𝑎𝑥𝐵𝑖𝑎𝑠 (wrong position) are
abandoned. Here, 𝐵𝑖𝑎𝑠𝑛&𝑛′ is the position bias between the last and current positions of a
neighboring vehicle as in Eq. (10), and 𝑀𝑎𝑥𝐵𝑖𝑎𝑠, which is determined in the training process,
is the maximum of 𝐵𝑖𝑎𝑠𝑛&𝑡 under the normal scenario. In step 2, a voting procedure is
carried out. If the 𝐵𝑖𝑎𝑠𝑛&𝑡 that satisfy 𝐵𝑖𝑎𝑠𝑛&𝑡 ≥ 𝑀𝑎𝑥𝐵𝑖𝑎𝑠 (deviation) are not in the
majority, they are abandoned. Otherwise, the remaining 𝐵𝑖𝑎𝑠𝑛&𝑡 (no deviation) are abandoned.
(4) If the number of remaining 𝐵𝑖𝑎𝑠𝑛&𝑡 is at least 2, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 is obtained by Eq. (11),
where 𝑁 = (𝑁𝑜. 𝐵𝑖𝑎𝑠𝑛&𝑡 + 𝑁𝑜. 𝐵𝑖𝑎𝑠𝑜&𝑡 ). Otherwise, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 cannot be obtained. These
operations are shown in lines 13-17 of Algorithm 1.
Fig. 6. Voting-filter mechanism to filter out wrong response messages from neighboring vehicles. Step 1 abandons neighboring vehicles with no history or a wrong position. Step 2 abandons the group (deviation or no deviation) that makes up less than half of the remaining vehicles and keeps the group that makes up more than half. (The blue and red dots could be deviation or no deviation.)
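$$\mathit{Bias}_{o\&t} = \left|\, D_{o\&t} - \left|\mathit{Pos}_{own}\,\mathit{Pos}_{tag}\right| \,\right| \qquad (8)$$
$$\mathit{Bias}_{n\&t} = \left|\, D_{n\&t} - \left|\mathit{Pos}_{neg}\,\mathit{Pos}_{tag}\right| \,\right| \qquad (9)$$
$$\mathit{Bias}_{n\&n'} = \left|\, D_{n\&n'} - \left|\mathit{Pos}_{neg}\,\mathit{Pos}_{neg'}\right| \,\right| \qquad (10)$$
$$\mathit{PositionR} = \frac{1}{N}\left(\sum \mathit{Bias}_{n\&t} + \mathit{Bias}_{o\&t}\right) \qquad (11)$$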
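The voting-filter steps (1)-(4) and Eq. (11) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `NeighborReport`, `voting_filter` and `position_r_cooperative` are hypothetical, the biases are assumed to be already computed with Eqs. (8)-(10), and keeping the deviating group on an exact tie is an assumption that follows the "fewer than half ... otherwise" wording of step (3).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NeighborReport:
    """One ResponseMsg reduced to what the filter needs (hypothetical type)."""
    in_last_table: bool   # neighbor has an entry in the last neighbor table
    bias_nn: float        # Bias_n&n' from Eq. (10)
    bias_nt: float        # Bias_n&t from Eq. (9)


def voting_filter(reports: List[NeighborReport], max_bias: float) -> List[float]:
    """Return the Bias_n&t values that survive Step 1 and Step 2."""
    # Step 1: drop voters with no history or with an implausible own position.
    credible = [r.bias_nt for r in reports
                if r.in_last_table and r.bias_nn <= max_bias]
    # Step 2: split into "deviation" and "no deviation" and keep the majority.
    deviating = [b for b in credible if b >= max_bias]
    normal = [b for b in credible if b < max_bias]
    # Assumption: on an exact tie the deviating group is kept.
    return normal if len(deviating) < len(normal) else deviating


def position_r_cooperative(bias_ot: float, reports: List[NeighborReport],
                           max_bias: float) -> Optional[float]:
    """Eq. (11); returns None when fewer than two Bias_n&t values remain."""
    kept = voting_filter(reports, max_bias)
    if len(kept) < 2:
        return None
    n = len(kept) + 1     # N = No. of Bias_n&t + No. of Bias_o&t
    return (sum(kept) + bias_ot) / n
```

Returning `None` mirrors step (4), where 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 cannot be obtained and the message later counts toward the uncertain rate.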
5.2.2.2 Uncooperative mechanism for uncooperative situation
Fig. 7. Diagram of calculating vehicle position feature in uncooperative situation. The diagram relates the current and last positions of the own vehicle and the target vehicle through the distances Do&t (free space model), Do&o', Dt'&o' and Dt&t' (distance estimation), combined by the cosine law.
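$$D_{t\&t'} = \mathit{AvgSpeed}_{tag'} \times \mathit{BeaconT}, \quad \text{where } \mathit{AvgSpeed}_{tag'} = \frac{\mathit{MaxSpeed}}{2} \pm \sqrt{\frac{\mathit{MaxSpeed}^2}{4} - \frac{\mathit{MaxSpeed}}{\mathit{MaxDensity}}\,\mathit{AvgFlow}_{tag'}} \qquad (12)$$
$$\mathit{Bias}_{t\&t'} = \left|\, D_{t\&t'} - \left|\mathit{Pos}_{tag}\,\mathit{Pos}_{tag'}\right| \,\right| \qquad (13)$$
$$D_{o'\&t} = \sqrt{D_{o\&t}^2 + D_{o\&o'}^2 - 2\,D_{o\&t}\,D_{o\&o'}\cos\angle TOO'} \qquad (14)$$
$$\mathit{Bias}_{o'\&t} = \left|\, D_{o'\&t} - \left|\mathit{Pos}_{own}\,\mathit{Pos}_{tag}\right| \,\right| \qquad (15)$$
$$\mathit{PositionR} = \frac{1}{3}\left(\mathit{Bias}_{o\&t} + \mathit{Bias}_{t\&t'} + \mathit{Bias}_{o'\&t}\right) \qquad (16)$$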
If 𝐼𝐷𝑡𝑎𝑔 exists in the last neighbor table (uncooperative situation), 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 can be quickly extracted from the last position alone. Fig. 7 and lines 18-23 of Algorithm 1 show how to calculate the position biases between the target vehicle and other vehicles. First, the distance between the last position and the current position of the target vehicle (𝐷𝑡&𝑡′) is estimated with Eq. (12). Of the two values produced by the ± in Eq. (12), the one closest to 𝐷𝑜&𝑜′ is taken as 𝐷𝑡&𝑡′. 𝐴𝑣𝑔𝑆𝑝𝑒𝑒𝑑𝑡𝑎𝑔′ is the average speed of the target vehicle, which can be deduced from Eq. (1), Eq. (2) and Eq. (3). Then, the position bias between the last and the current target vehicle (𝐵𝑖𝑎𝑠𝑡&𝑡′) is obtained by Eq. (13). Second, the distance between the last position of the own vehicle and the current position of the target vehicle (𝐷𝑜′&𝑡) is obtained with Eq. (14), where ∠𝑇𝑂𝑂′ = ∠𝑇𝑂𝑇′ + ∠𝑇′𝑂𝑂′ can be calculated by the cosine law from the known positions. Then, the position bias between the own vehicle at the last time point and the current target vehicle (𝐵𝑖𝑎𝑠𝑜′&𝑡) is obtained with Eq. (15). Third, 𝐵𝑖𝑎𝑠𝑜&𝑡 is obtained with Eq. (8), as in the cooperative situation. Finally, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 is obtained with Eq. (16).
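The uncooperative computation of Eqs. (12)-(16) can be sketched in Python as follows. This is only an illustration under stated assumptions: all function names are hypothetical, speeds, flows and densities are assumed to be in consistent units (for example m/s, veh/s and veh/m), and clamping a slightly negative discriminant to zero is an added safeguard that the paper does not describe.

```python
import math
from typing import Tuple


def avg_speed_candidates(avg_flow: float, max_speed: float,
                         max_density: float) -> Tuple[float, float]:
    """Both roots of Eq. (12) for the average speed of the target vehicle."""
    disc = max_speed ** 2 / 4.0 - max_speed / max_density * avg_flow
    root = math.sqrt(max(disc, 0.0))   # assumption: clamp small negative values
    return max_speed / 2.0 - root, max_speed / 2.0 + root


def estimate_d_tt(avg_flow: float, max_speed: float, max_density: float,
                  beacon_t: float, d_oo: float) -> float:
    """D_t&t': of the two candidate distances, keep the one closest to D_o&o'."""
    speeds = avg_speed_candidates(avg_flow, max_speed, max_density)
    candidates = [v * beacon_t for v in speeds]
    return min(candidates, key=lambda d: abs(d - d_oo))


def law_of_cosines(a: float, b: float, angle_rad: float) -> float:
    """Eq. (14): third side from two sides and the included angle."""
    return math.sqrt(a * a + b * b - 2.0 * a * b * math.cos(angle_rad))


def bias(estimated: float, claimed: float) -> float:
    """Eqs. (8), (13), (15): |estimated distance - distance from claimed positions|."""
    return abs(estimated - claimed)


def position_r_uncooperative(bias_ot: float, bias_tt: float,
                             bias_o_prime_t: float) -> float:
    """Eq. (16): average of the three biases."""
    return (bias_ot + bias_tt + bias_o_prime_t) / 3.0
```

No message exchange appears anywhere in this path, which is why the uncooperative situation adds no output messages.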
5.3. Improved GHSOM-based classifier

To the best of our knowledge, we are the first to use GHSOM as the classifier of an IDS to detect attacks in VANET scenarios, because GHSOM achieves both high classification accuracy and low computational overhead [39]. Moreover, two novel mechanisms are proposed for GHSOM to improve its accuracy in detecting deviation; the resulting classifier is named I-GHSOM. In the following, we first introduce the main concepts of the self-organizing map (SOM) and GHSOM. Then, the details of the proposed I-GHSOM are described.
5.3.1. SOM and GHSOM
Fig. 8. The framework of SOM (input layer and output layer).
SOM [41] is one of the best-known artificial neural network models for unsupervised learning and has previously been used to detect anomalous network connections. As shown in Fig. 8, SOM has two layers: the input layer (or input vector) and the output layer (or output map). Each layer includes a number of neurons, which are also called units. The input layer is a one-dimensional vector that consists of several independent units. The output layer can be a two- or three-dimensional lattice, and each unit of the output layer is connected to all units of the input layer. The topology-preserving property of the output layer means that it preserves the relative distances between units, so that units close to each other on the map are treated as neighboring units. The association between an output unit and the input layer is called a "weight", which has the same dimension as the input layer [42]. Like most artificial neural networks, SOM operates in two processes: training and mapping. The training process builds the map from a dataset of known input vectors, while the mapping process automatically classifies a new input vector. Readers can refer to Ref. [41] for the details of SOM.
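The two SOM processes can be illustrated with a short Python sketch. It assumes the common formulation in which the best matching unit and its map neighbors are pulled toward the input with a Gaussian neighborhood function; the learning rate `lr`, the neighborhood `radius` and the function names are illustrative choices, not values or names taken from this paper or from Ref. [41].

```python
import math
from typing import List, Tuple

Vector = List[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_bmu(weights: List[List[Vector]], x: Vector) -> Tuple[int, int]:
    """Mapping: return (row, col) of the unit whose weight is closest to x."""
    cells = [(r, c) for r in range(len(weights)) for c in range(len(weights[0]))]
    return min(cells, key=lambda rc: euclidean(weights[rc[0]][rc[1]], x))


def train_step(weights: List[List[Vector]], x: Vector,
               lr: float, radius: float) -> None:
    """Training: move the winning unit and its map neighbors toward x in place."""
    br, bc = find_bmu(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_dist2 / (2.0 * radius ** 2))  # Gaussian neighborhood
            for k in range(len(w)):
                w[k] += lr * h * (x[k] - w[k])
```

Because neighbors on the lattice are updated together, nearby units end up with similar weights, which is the topology-preserving property described above.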
GHSOM [43] has a hierarchical, non-fixed structure. It consists of multiple layers, each made up of several independent SOMs whose number and size are determined during the training phase. Two parameters, 𝜏1 and 𝜏2, determine the breadth and depth of the neural network, respectively. GHSOM determines which class an input vector belongs to by finding the label of the unit that has the shortest distance to the input vector.
During the training process, in each iteration t, a new SOM output map of size 2 × 2 is created. Then, the new map is trained as an independent SOM. During training, the quantization error 𝑞𝑖 of each unit is calculated as in Eq. (17), where 𝐶𝑖 is the set of input vectors mapped into the unit 𝑖, 𝑥𝑗 is the j-th input vector belonging to 𝐶𝑖 and 𝜔𝑖 is the weight associated with the unit 𝑖.
$$q_i = \sum_{x_j \in C_i} \left\| \omega_i - x_j \right\| \qquad (17)$$
Then, the mean quantization error 𝑀𝑄𝐸𝑚 of the new map is determined and used to check the growth of the map. If 𝑀𝑄𝐸𝑚 ≥ 𝜏1 ∙ 𝑞𝑢, where 𝑞𝑢 is the quantization error of the unit 𝑢 on the upper layer, a row or a column of units is inserted between the unit with the highest quantization error and its most dissimilar neighboring unit, as shown in Fig. 9(a). Thereafter, the updated map is trained and 𝑀𝑄𝐸𝑚 is compared again, until 𝑀𝑄𝐸𝑚 < 𝜏1 ∙ 𝑞𝑢. Then, the expansion in depth begins, as shown in Fig. 9(b). If 𝑞𝑖 ≥ 𝜏2 ∙ 𝑞0, where 𝑞0 is the initial quantization error, neuron 𝑖 is expanded into a new map located on the next level of the hierarchy.
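The two growth tests can be written down compactly. The Python sketch below evaluates Eq. (17) and the 𝜏1 and 𝜏2 criteria for one map; the function names are hypothetical, each unit is represented as a (weight, mapped inputs) pair, and averaging 𝑀𝑄𝐸𝑚 only over units that have at least one mapped input is an assumption, since the text does not specify it.

```python
import math
from typing import List, Sequence, Tuple

Unit = Tuple[Sequence[float], List[Sequence[float]]]  # (weight, mapped inputs)


def quantization_error(weight: Sequence[float],
                       mapped: List[Sequence[float]]) -> float:
    """Eq. (17): sum of distances between a unit's weight and its mapped inputs."""
    return sum(math.dist(weight, x) for x in mapped)


def needs_breadth_growth(units: List[Unit], tau1: float, q_u: float) -> bool:
    """True while MQE_m >= tau1 * q_u, i.e. a row or column must be inserted."""
    errors = [quantization_error(w, xs) for w, xs in units if xs]
    if not errors:          # assumption: an empty map does not grow
        return False
    mqe = sum(errors) / len(errors)
    return mqe >= tau1 * q_u


def units_needing_depth_growth(units: List[Unit], tau2: float,
                               q0: float) -> List[int]:
    """Indices of units with q_i >= tau2 * q0, each expanded into a child map."""
    return [i for i, (w, xs) in enumerate(units)
            if quantization_error(w, xs) >= tau2 * q0]
```

Both tests compare an error against a fraction of a reference error, so smaller values of 𝜏1 and 𝜏2 make the tests harder to pass and the hierarchy larger.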
Fig. 9. The growth process of GHSOM: (a) the process of breadth growth, (b) the process of depth growth.
Fig. 10. BMU calculation on GHSOM hierarchy (in the example shown, the BMU is Unit 6 of Map 6 on Layer 3).
The mapping process of GHSOM is shown in Fig. 10. First, the best matching unit (BMU) on layer 0 is obtained by calculating the Euclidean distance between an input vector and the weight vectors of the layer 0 map. Then, this BMU is checked to see whether it is a parent unit. The check uses the parent vectors, which record the growing path produced by the GHSOM training process. Next, if a new map (the child map) has arisen from this BMU, a new BMU on this child map is calculated. These steps are repeated until no further child map is found. In the end, the label of the BMU with no child map determines the class to which the input vector belongs.
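The descent through the hierarchy can be sketched as a simple loop. In this Python illustration the `SomMap` type and the `children` dictionary are hypothetical stand-ins for the parent vectors mentioned above; they are one possible way to record which unit owns which child map, not the data structure used by the authors.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class SomMap:
    """One map of the hierarchy (hypothetical representation)."""
    weights: List[Sequence[float]]      # one weight vector per unit
    labels: List[Optional[str]]         # label per unit, None if unlabeled
    children: Dict[int, "SomMap"] = field(default_factory=dict)  # unit -> child map


def classify(root: SomMap, x: Sequence[float]) -> Optional[str]:
    """Follow BMUs down the hierarchy and return the label of the final BMU."""
    current = root
    while True:
        bmu = min(range(len(current.weights)),
                  key=lambda i: math.dist(current.weights[i], x))
        child = current.children.get(bmu)
        if child is None:               # no child map: this BMU decides the class
            return current.labels[bmu]
        current = child
```

A return value of `None` corresponds to an unlabeled final BMU.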
5.3.2. I-GHSOM-based classifier

As described in Section 5.3.1, in the original GHSOM the deviation of an input vector (feature vector) is determined solely by whether its BMU is labeled. In other words, if its BMU is labeled, the vector does not deviate; otherwise, it deviates. The values of the parameters 𝜏1 and 𝜏2 therefore have to be set carefully to achieve high detection accuracy. For example, values of 𝜏1 and 𝜏2 that are too small enlarge the GHSOM more than necessary, because growth continues until 𝑀𝑄𝐸𝑚 < 𝜏1 ∙ 𝑞𝑢 and 𝑞𝑖 < 𝜏2 ∙ 𝑞0. In this case, some units remain unlabeled, which leads to low detection accuracy because many feature vectors are clustered into the unlabeled units. On the other hand, GHSOM is likely to cluster feature vectors from unknown classes into known classes. To improve the accuracy of GHSOM, two novel mechanisms, i.e., the relabeling and recalculating mechanisms, are proposed to judge whether a feature vector deviates or not, as shown in Fig. 11. (1) Relabeling mechanism: a BMU without a label is relabeled if more than half of its neighboring units are labeled. (2) Recalculating mechanism: the deviation of a feature vector is determined not only by the label of its BMU, but also by whether the feature vector breaks the balance of the trained GHSOM. Specifically, the feature vector is temporarily inserted into the data set related to its BMU. If 𝑞𝐵𝑀𝑈 < 𝜏2 ∙ 𝑞0 and 𝑀𝑄𝐸𝐵𝑀𝑈 < 𝜏1 ∙ 𝑞𝑢, the feature vector does not deviate. Otherwise, it deviates. Here, 𝑞𝐵𝑀𝑈 is the quantization error of the BMU and 𝑀𝑄𝐸𝐵𝑀𝑈 is the mean quantization error of the map that includes the BMU.
The procedures of the I-GHSOM-based classifier are presented in Algorithm 2. During the training phase, I-GHSOM is trained with 𝐷𝑒𝑡𝐹𝑒𝑎 and then labeled with 𝐷𝑒𝑡𝐹𝑒𝑎 according to the shortest Euclidean distance, as shown in steps 1-4. After the training process, I-GHSOM can be used to judge whether 𝐷𝑒𝑡𝐹𝑒𝑎 deviates from the normal scenario or not. Steps 5-21 form a loop, which keeps running until the IDS is closed. At the beginning of the loop, the BMU of 𝐷𝑒𝑡𝐹𝑒𝑎 is obtained. Then, the classifier checks whether the BMU has been labeled. If it is still unlabeled after the relabeling process, 𝑅𝑒𝑠 = 1 (deviation) and 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 is rejected. Otherwise, 𝐷𝑒𝑡𝐹𝑒𝑎 is temporarily associated with the BMU, and 𝑞𝐵𝑀𝑈 and 𝑀𝑄𝐸𝐵𝑀𝑈 are recalculated. If 𝑞𝐵𝑀𝑈 < 𝜏2 ∙ 𝑞0 and 𝑀𝑄𝐸𝐵𝑀𝑈 < 𝜏1 ∙ 𝑞𝑢, 𝑅𝑒𝑠 = 0 (no deviation) and 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 is accepted and put into the current neighbor table. Otherwise, 𝑅𝑒𝑠 = 1 (deviation) and 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 is rejected.
Fig. 11. The improvement processes of GHSOM: (1) relabeling mechanism, which relabels the BMU; (2) recalculating mechanism, with Step 1 temporarily inserting the feature vector into the related data set and Step 2 recalculating the quantization error. (The white unit is unlabeled; the unit with L is labeled.)
Algorithm 2
I-GHSOM-based classifier procedure.
Input: 𝐷𝑒𝑡𝐹𝑒𝑎(𝐹𝑙𝑜𝑤𝑅, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅), 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔
Output: 𝑅𝑒𝑠 = 0 (no deviation) or 1 (deviation)   % Classification result
1: for each 𝐷𝑒𝑡𝐹𝑒𝑎 in training phase do   % I-GHSOM will be trained
2:     Train I-GHSOM with 𝐷𝑒𝑡𝐹𝑒𝑎
3:     Label I-GHSOM with 𝐷𝑒𝑡𝐹𝑒𝑎
4: end for
5: for each 𝐷𝑒𝑡𝐹𝑒𝑎 in testing phase do   % I-GHSOM has been trained and will be used to classify 𝐷𝑒𝑡𝐹𝑒𝑎 as follows
6:     Find BMU of 𝐷𝑒𝑡𝐹𝑒𝑎 in I-GHSOM
7:     if BMU is not labeled then   % Relabeling mechanism begins
8:         Relabel BMU if more than half of its neighboring units are labeled
9:         if BMU is not relabeled then
10:            Let 𝑅𝑒𝑠 = 1 and reject 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔
11:        end if
12:    end if
13:    if BMU is labeled then   % Recalculating mechanism begins
14:        Recalculate 𝑞𝐵𝑀𝑈 and 𝑀𝑄𝐸𝐵𝑀𝑈 with 𝐷𝑒𝑡𝐹𝑒𝑎
15:        if 𝑞𝐵𝑀𝑈 < 𝜏2 ∙ 𝑞0 and 𝑀𝑄𝐸𝐵𝑀𝑈 < 𝜏1 ∙ 𝑞𝑢 then
16:            Let 𝑅𝑒𝑠 = 0 and accept 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔 to current neighbor table
17:        else
18:            Let 𝑅𝑒𝑠 = 1 and reject 𝐵𝑒𝑎𝑐𝑜𝑛𝑀𝑠𝑔
19:        end if
20:    end if
21: end for
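The testing-phase decision of Algorithm 2 (steps 7-20) can be sketched in Python as follows. The sketch is illustrative only: the function names are hypothetical, the recalculated values of 𝑞𝐵𝑀𝑈 and 𝑀𝑄𝐸𝐵𝑀𝑈 are assumed to be computed by the caller after the temporary insertion, and giving a relabeled BMU the most common label among its neighbors is an assumption, since the text only states that the BMU is relabeled.

```python
from typing import List, Optional


def relabel(bmu_label: Optional[str],
            neighbor_labels: List[Optional[str]]) -> Optional[str]:
    """Relabeling mechanism: label the BMU if more than half its neighbors are labeled."""
    if bmu_label is not None:
        return bmu_label
    labeled = [lab for lab in neighbor_labels if lab is not None]
    if 2 * len(labeled) > len(neighbor_labels):
        # Assumption: adopt the most common label among the labeled neighbors.
        return max(set(labeled), key=labeled.count)
    return None


def classify_detfea(bmu_label: Optional[str],
                    neighbor_labels: List[Optional[str]],
                    q_bmu: float, mqe_bmu: float,
                    tau1: float, tau2: float,
                    q0: float, q_u: float) -> int:
    """Return Res: 0 (no deviation, accept BeaconMsg) or 1 (deviation, reject)."""
    if relabel(bmu_label, neighbor_labels) is None:
        return 1                        # steps 9-10: BMU stays unlabeled
    # Steps 14-18: q_bmu and mqe_bmu are the values recalculated with DetFea.
    if q_bmu < tau2 * q0 and mqe_bmu < tau1 * q_u:
        return 0
    return 1
```

The same two thresholds that controlled growth during training are reused here as a balance test, so no additional parameter is introduced at detection time.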
6. Performance evaluation
6.1. Simulation setup
Our simulation is based on Network Simulator version 2 (NS2) [44] and Simulation of Urban Mobility (SUMO) [45]. NS2, which has been extensively validated by the networking research community, provides many well-developed low-layer protocols with easy programming interfaces. SUMO is a software tool that generates vehicular traffic by specifying the speed, types, behavior and number of vehicles. The two tools complement each other: SUMO is used to generate mobility trace files, and NS2 is then used to load these trace files and run the proposed IDS.
6.2. Simulation environment

The simulation parameters are shown in Table 1. In the simulations, vehicles run on a 2-lane highway with a top speed of 100 km/h, and each lane is 5 km long. Each vehicle can communicate with other vehicles within a transmission range of 500 m according to the communication protocol 802.11p. To avoid generating too much data, the simulation time is set to 165 sec (from 35 sec to 200 sec), the vehicle inter-arrival interval is set to 1 sec and the transmission interval, i.e., 𝐵𝑒𝑎𝑐𝑜𝑛𝑇, is set to 0.5 sec. To leave enough time for 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔 to arrive, the waiting time for responses (𝑊𝑎𝑖𝑡𝑖𝑛𝑔𝑇) is set to 0.2 sec. The two GHSOM parameters 𝜏1 and 𝜏2 are set to 0.1 and 0.01, respectively.
Table 1
Simulation parameters.
Parameter                    Value
Scenario                     2-lane highway
Highway length               5 km
Max vehicle speed            100 km/h
Wireless protocol            802.11p
Transmission range           500 m in each direction
Simulation time              165 sec
Vehicle arrival interval     1 sec
Transmission interval        Every 0.5 sec
Waiting time for response    0.2 sec
𝜏1                           0.1
𝜏2                           0.01
6.3. Evaluation metric

The proposed IDS is evaluated with the true detection rate (TDR), false detection rate (FDR), uncertain rate (UR), average processing time (APT) and average output message number (AOMN). The metrics are described below:
1). TDR: TDR is the rate of correct detection, i.e., the sum of the percentages of the rogue vehicles that are classified as rogue vehicles and the legitimate vehicles that are classified as legitimate vehicles. It is given in Eq. (18).
$$\mathit{TDR} = \frac{\text{No. of vehicles classified correctly}}{\text{Total No. of vehicles}} \qquad (18)$$
2). FDR: FDR is the sum of the percentages of the rogue vehicles that are classified as legitimate vehicles and the legitimate vehicles that are classified as rogue vehicles. It is given in Eq. (19).
$$\mathit{FDR} = \frac{\text{No. of vehicles classified wrongly}}{\text{Total No. of vehicles}} \qquad (19)$$
3). UR: UR is the percentage of the vehicles that cannot be classified because the IDS lacks sufficient information. In this paper, UR is caused by the process of extracting the position feature. When the feature extraction module cannot extract the position feature from messages, the classifier is unable to classify these messages, which produces UR. It is an important metric of the efficiency of an IDS and is given in Eq. (20).
$$\mathit{UR} = \frac{\text{No. of vehicles that cannot be classified}}{\text{Total No. of vehicles}} = 1 - \mathit{TDR} - \mathit{FDR} \qquad (20)$$
4). APT: APT is the average working time of the IDS in the simulation. It is given in Eq. (21).
$$\mathit{APT} = \frac{\sum \text{processing time of each vehicle}}{\text{Total No. of vehicles}} \qquad (21)$$
5). AOMN: AOMN is the average number of output messages from the IDS in the simulation, as shown in Eq. (22). A large AOMN usually indicates serious problems, such as broadcast storms and message collisions.
$$\mathit{AOMN} = \frac{\sum \text{No. of output messages of each vehicle}}{\text{Total No. of vehicles}} \qquad (22)$$
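The five metrics reduce to simple ratios and averages. The following Python sketch computes them from raw counts; the function names and the example counts are made up for illustration and are not results from the paper.

```python
from typing import Dict, Sequence


def detection_rates(n_correct: int, n_wrong: int, n_total: int) -> Dict[str, float]:
    """Eqs. (18)-(20): TDR, FDR and UR as fractions of all vehicles."""
    n_uncertain = n_total - n_correct - n_wrong   # vehicles that cannot be classified
    return {
        "TDR": n_correct / n_total,
        "FDR": n_wrong / n_total,
        "UR": n_uncertain / n_total,
    }


def per_vehicle_average(values: Sequence[float]) -> float:
    """Eqs. (21)-(22): APT or AOMN, averaged over vehicles."""
    return sum(values) / len(values)


# Illustrative counts only: 100 vehicles, 90 correct, 8 wrong, 2 unclassified.
rates = detection_rates(90, 8, 100)
apt = per_vehicle_average([0.10, 0.12, 0.11])     # seconds per vehicle
aomn = per_vehicle_average([330, 330, 330])       # messages per vehicle
```

By construction the three rates sum to 1, which is the identity stated in Eq. (20).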
6.4. Simulation results

To demonstrate the effectiveness of the proposed IDS, we use two scenarios in the experiments: the normal scenario (without rogue vehicles) and the rogue scenario (involving rogue vehicles). First, data are gathered under the normal scenario for the detection of deviation. Then, the rogue scenario is run in two cases, i.e., without any IDS and with the proposed IDS, to examine the proposed IDS.
6.4.1. Simulation results of the proposed IDS
Fig. 12. The message processing of proposed IDS. The feature extraction module receives (IDneg, AvgFlowneg, Xposneg, Yposneg) and produces the feature vector (FlowR, PositionR); the classifier produces Res (deviation or no deviation); the resulting record is (IDneg, AvgFlowown, AvgFlowneg, Xposneg, Yposneg, FlowR, PositionR, Res).

Fig. 13. Scenario: 40% of rogue vehicles and production of false messages beginning at t=50sec (a) Vehicle under normal scenario, (b) Vehicle without IDS under rogue scenario, (c) Vehicle with proposed IDS under rogue scenario. Each panel plots AvgFlowown and AvgFlowneg as flow value (veh/hr) against simulation time (sec).
As described in Section 5, the message processing of the proposed IDS is shown in Fig. 12, where 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛, 𝐹𝑙𝑜𝑤𝑅 and 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 are produced in the feature extraction module and 𝑅𝑒𝑠 = 0 (no deviation) or 1 (deviation) is generated by the classifier module. Plots of 𝑅𝑒𝑠 would show only the binary detection results (0 or 1) of the proposed IDS in different scenarios. In contrast, the variation of 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛, 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔 and 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 clearly reflects the effectiveness of the proposed IDS across the normal scenario, the rogue scenario without IDS and the rogue scenario with the proposed IDS. Thus, in the following, the simulation results on the detection of traffic flow and of position are discussed in turn to demonstrate the effectiveness of the proposed IDS.
In this simulation, the average flow of a randomly selected vehicle (𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛) and the flow it receives (𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔) are shown in Fig. 13. Fig. 13(a) shows the data recorded under the normal scenario, where 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 and 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔 are close to each other. Then, rogue vehicles amounting to 40% of all vehicles are inserted. They send out false messages with low traffic flow, as shown by the blue dots close to the bottom of Fig. 13(b) and Fig. 13(c) after t=50 sec. In the absence of an IDS (Fig. 13(b)), 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 is reduced because all messages, including the false ones, are accepted after t=50 sec. Similarly, 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔 from legitimate vehicles is also reduced, as shown by the blue dots close to the red curve, because the legitimate vehicles are also affected by the rogue vehicles. In contrast, as shown in Fig. 13(c), the proposed IDS evaluates 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑛𝑒𝑔 and rejects the deviating messages with 𝑅𝑒𝑠 = 1, so that the 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 values are similar to those in the normal scenario (Fig. 13(a)). This means the proposed IDS is effective in traffic flow detection.
Here, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 of a randomly selected vehicle is collected and shown in Fig. 14. We first run the simulation under the normal scenario, as shown in Fig. 14(a). In this figure, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 stays within 15, which is relatively small. This small range is caused by unavoidable errors, e.g., device errors and transmission errors. Then, we assume that 40% of the vehicles are rogue vehicles under the rogue scenario. The rogue vehicles change the position carried in their messages. In the absence of an IDS, Fig. 14(b) indicates that many 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 values are larger than those in the normal scenario. Finally, we run the proposed IDS on each vehicle under the rogue scenario, so that the deviating messages are rejected with 𝑅𝑒𝑠 = 1. It is evident in Fig. 14(c) that most 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 values are close to those in the normal scenario (Fig. 14(a)), which means the proposed IDS is effective in the detection of vehicle position.
Fig. 14. Scenario: 40% of rogue vehicles (a) Vehicle under normal scenario, (b) Vehicle without IDS under rogue scenario, and (c) Vehicle with proposed IDS under rogue scenario. Each panel plots frequency against the range of PositionR values.
6.5. Effectiveness of proposed IDS

The IDSs in other papers [19], [20], [21] are based on the detection of either traffic flow or vehicle position, whereas the proposed IDS detects differences in both traffic flow and vehicle position at the same time, as shown in Fig. 12. Thus, the proposed IDS is compared with them on the two aspects separately. The comparison has three parts. First, the proposed IDS is compared with Ref. [20] in a scenario where rogue vehicles send messages with false traffic flow but true position. Second, the proposed IDS is compared with Ref. [21] and Ref. [19] in a scenario where rogue vehicles send messages with false position but true traffic flow. Third, the performance of the proposed IDS is reported for a scenario where rogue vehicles send messages with both false position and false traffic flow. It should be noted that we do not cite the results reported in Ref. [19], [20], [21]. Instead, the IDSs of this paper and of the three references are all run in our simulation environment. Moreover, training time does not reduce the efficiency of the IDSs because they are trained offline.
6.5.1. Detection of traffic flow
In this subsection, the detection of traffic flow is compared between the proposed IDS and the IDS based on statistical methods from Ref. [20]. In Ref. [20], the own vehicle calculates its own traffic flow and receives the traffic flow reported by the target vehicle. Then, the difference between the two traffic flows is obtained. Next, hypothesis testing is used to judge whether this difference constitutes a deviation or not.
Fig. 15. Detection of traffic flow (a) TDR of two IDSs, (b) FDR of two IDSs, and (c) UR of two IDSs. Each panel plots the metric (%) of the proposed IDS and the IDS of Ref. [20] against the rogue vehicle rate (%), from 10% to 40%.
The results of the comparison are shown in Fig. 15, where the ratio of rogue vehicles increases from 10% to 40%. As the figures show, the TDR and FDR of the proposed IDS are better than those of Ref. [20]. In addition, the UR of the proposed IDS is zero. In our proposed IDS, as shown in lines 7-20 of Algorithm 2, I-GHSOM with its two novel mechanisms can accurately judge whether there is a deviation in traffic flow or not. Apart from that, 𝐹𝑙𝑜𝑤𝑅 takes into account the range of the distance between vehicles to make it more distinctive, as shown in line 3 of Algorithm 1, where 𝐹𝑙𝑜𝑤𝑅 = (|𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑜𝑤𝑛 − 𝐴𝑣𝑔𝐹𝑙𝑜𝑤𝑡𝑎𝑔|, ⌊𝐷𝑜&𝑡/100⌋). Therefore, according to these observations, both the accuracy and the stability of the proposed IDS are higher than those of Ref. [20].
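The two-component 𝐹𝑙𝑜𝑤𝑅 feature quoted above from line 3 of Algorithm 1 can be written as a one-line function. In this Python sketch the function name is hypothetical and the distance is assumed to be in metres, so that the second component is a 100 m distance band.

```python
import math
from typing import Tuple


def flow_r(avg_flow_own: float, avg_flow_tag: float, d_ot: float) -> Tuple[float, int]:
    """FlowR = (|AvgFlow_own - AvgFlow_tag|, floor(D_o&t / 100))."""
    return abs(avg_flow_own - avg_flow_tag), math.floor(d_ot / 100)


# Example values are illustrative: flows in veh/hr, distance in metres.
feature = flow_r(200.0, 50.0, 349.0)   # (150.0, 3)
```

Binning the distance means that the same flow difference is judged separately for near and far vehicles, which is what makes the feature more distinctive.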
The proposed IDS is also compared with Ref. [20] in terms of APT and AOMN. The results are shown in Table 2, where the percentage of rogue vehicles increases from 10% to 40%. As the results show, the APT of the proposed IDS is lower than that of Ref. [20]. More specifically, the APT of Ref. [20] grows as the number of rogue vehicles increases, while that of the proposed IDS remains the same. This is because the proposed IDS identifies possible deviation with I-GHSOM, which takes the same time to classify a message whether it is malicious or legitimate. In contrast, the IDS of Ref. [20] uses a hypothesis test to classify VANET messages. When using the hypothesis test, the authors have to decide whether the statistic
$$t_o = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n_1} + \dfrac{s_y^2}{n_2}}}$$
belongs to (−𝑡𝛼/2, 𝑡𝛼/2) or not, where (−𝑡𝛼/2, 𝑡𝛼/2) is the confidence interval, 𝑥̅ and 𝑦̅ are the means of the traffic flow values of the vehicles, 𝑠𝑥2 and 𝑠𝑦2 are the corresponding sample variances, and 𝑛1 and 𝑛2 are the numbers of samples. Obviously, to obtain 𝑡𝑜, enough samples of a suspicious vehicle have to be collected. As the number of rogue vehicles increases, it takes more time to collect the samples. As a result, the overall efficiency of the proposed IDS is higher than that of Ref. [20].
Table 2
Average processing time (APT) and average output message number (AOMN) for detection of traffic flow.
Rogue vehicle    Proposed IDS           IDS of Ref. [20]
rate (%)         APT (sec)    AOMN      APT (sec)    AOMN
10               0.11         330       2.62         330
20               0.11         330       3.35         330
30               0.11         330       3.87         330
40               0.11         330       4.61         330
6.5.2. Detection of vehicle position

The proposed IDS is compared with the two methods of Ref. [19] and Ref. [21]. In Ref. [19], the own vehicle receives the claimed position sent out by the target vehicle. To verify the claimed position, the own vehicle collects the positions of neighboring vehicles from the response messages they send out and uses them to calculate the detected position of the target vehicle. After a waiting time for collection, an information-theoretic method is used to detect whether the claimed position of the target vehicle deviates. In Ref. [21], the own vehicle likewise receives the claimed position sent out by the target vehicle. To verify the claimed position, the own vehicle collects the positions of neighboring vehicles from beacon messages instead of response messages, which means that its neighboring vehicles do not need to broadcast responses. After a beacon time for collection, hypothesis testing is used to detect whether the claimed position of the target vehicle deviates.
Fig. 16. Detection of vehicle position (a) TDR of three IDSs, (b) FDR of three IDSs and (c) UR of three IDSs. Each panel plots the metric (%) of the proposed IDS, the IDS of Ref. [19] and the IDS of Ref. [21] against the rogue vehicle rate (%), from 10% to 40%.
Fig. 16 shows the comparison results when the percentage of rogue vehicles increases from 10% to 40%. The TDR of the proposed IDS is higher than that of both Ref. [19] and Ref. [21]. In addition, both the FDR and the UR of the proposed IDS are lower than those of the two methods. The proposed IDS obtains a higher TDR and a lower FDR and UR because I-GHSOM is used to check whether there are deviations in position. Moreover, when extracting 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅, the voting-filter mechanism is used to filter out wrong 𝑅𝑒𝑠𝑝𝑜𝑛𝑠𝑒𝑀𝑠𝑔. The 𝐵𝑖𝑎𝑠𝑛&𝑡 values belonging to the majority are accepted while those belonging to the minority are rejected, as shown in lines 7-12 of Algorithm 1. Therefore, the proposed IDS has higher accuracy and more stability than the other two statistical methods, i.e., the information-theoretic method and hypothesis testing.
The APT and AOMN of the proposed IDS are compared with those of Ref. [19] and Ref. [21]. The results are shown in Fig. 17. When the ratio of rogue vehicles increases from 10% to 40%, the APT of the proposed IDS is lower than that of both Ref. [19] and Ref. [21]. The AOMN of the proposed IDS is also lower than that of Ref. [19], but slightly higher than that of Ref. [21]. This is because the proposed IDS uses the semi-cooperative mechanism (lines 18-23 of Algorithm 1), which reduces not only the processing time but also the number of output messages. Unlike in other literature, 𝑃𝑜𝑠𝑖𝑡𝑖𝑜𝑛𝑅 in our proposed IDS can be obtained from historical biases, i.e., 𝐵𝑖𝑎𝑠𝑡&𝑡′ and 𝐵𝑖𝑎𝑠𝑜′&𝑡, as shown in Eq. (16). As a result, the proposed IDS has higher processing efficiency than those of both Ref. [19] and Ref. [21].
Fig. 17. Overhead of IDS (a) APT for detection of position, (b) AOMN for detection of position. The bars compare the average processing time (sec) and the average output message number of the proposed IDS, the IDS of Ref. [19] and the IDS of Ref. [21].
6.5.3. Detection of both traffic flow and vehicle position

In this subsection, the performance of the proposed IDS on the detection of both traffic flow and vehicle position is reported. Then, the results are compared with those of the proposed IDS when it detects only one of them, as shown in Sections 6.5.1 and 6.5.2.
As shown in Table 3, the TDR, FDR and UR are better than those of the proposed IDS when it detects only one of the two. This is because the two detections complement each other and improve the detection accuracy. If one of them fails to identify the deviation in a message, the other may still identify it from the same message. Moreover, the APT and AOMN are the same as those of the proposed IDS when it detects only vehicle position. This means that the APT and AOMN are mainly contributed by the detection of vehicle position. In other words, the processing efficiency and the message scale mainly depend on the detection of vehicle position. It should be noted that although the detection of vehicle position performs worse than that of traffic flow, it cannot be removed, because without it the proposed IDS cannot identify wrong positions in messages.
Table 3
All evaluation metrics of the proposed IDS for detection of both traffic flow and vehicle position.
Rogue vehicle    TDR (%)    FDR (%)    UR (%)    APT (sec)    AOMN
rate (%)
10               99.82      0.18       0.00      14.72        989.40
20               99.80      0.20       0.00      14.72        989.40
30               99.77      0.23       0.00      14.72        989.40
40               99.69      0.31       0.00      14.72        989.40
6.6. Comparison among IDSs under different simulation parameters

To further demonstrate the performance of the proposed method, the proposed IDS is compared with other existing work (Ref. [46] and [47]) under different simulation parameters, i.e., arrival interval and transmission range. The arrival interval determines the number of vehicles on the highway, while the transmission range relates to the success ratio and delay of vehicle communication. In the comparison, 40% of the vehicles are rogue vehicles that transmit malicious messages to their neighboring vehicles. In Ref. [46], the authors proposed a novel SVM-based IDS that combines the dolphin swarm algorithm with SVM. They claimed that the dolphin swarm algorithm improves the performance of the SVM-based IDS. In Ref. [47], the authors developed a game-theory-based intrusion detection framework and a novel clustering algorithm for VANETs. By using their specification rules and a lightweight neural-network-based classifier to detect malicious vehicles, the communication overhead of the IDS is reduced.
Table 4
Comparison among IDSs with different arrival intervals.
IDS          Arrival interval    TDR (%)    FDR (%)    UR (%)    APT (sec)    AOMN
Our IDS      1 sec               99.69      0.31       0.00      14.72        989.40
Ref. [46]    1 sec               98.04      1.96       0.00      52.67        1989.87
Ref. [47]    1 sec               96.11      3.89       0.00      18.76        990.54
Our IDS      0.8 sec             99.57      0.43       0.00      21.16        1422.34
Ref. [46]    0.8 sec             97.49      2.51       0.00      75.50        3018.67
Ref. [47]    0.8 sec             94.81      5.19       0.00      26.98        1424.50
Our IDS      0.5 sec             99.40      0.60       0.00      31.87        2142.11
Ref. [46]    0.5 sec             96.97      3.03       0.00      115.26       4165.90
Ref. [47]    0.5 sec             92.26      7.74       0.00      40.59        2143.36
Table 5
Comparison among IDSs with different transmission ranges.
IDS          Transmission range    TDR (%)    FDR (%)    UR (%)    APT (sec)    AOMN
Our IDS      400 m                 99.54      0.46       0.00      18.22        1224.35
Ref. [46]    400 m                 97.85      2.15       0.00      65.54        2231.51
Ref. [47]    400 m                 95.78      4.22       0.00      23.28        1228.94
Our IDS      500 m                 99.69      0.31       0.00      14.72        989.40
Ref. [46]    500 m                 98.04      1.96       0.00      52.67        1989.87
Ref. [47]    500 m                 96.11      3.89       0.00      18.76        990.54
Our IDS      600 m                 99.71      0.29       0.00      14.39        967.28
Ref. [46]    600 m                 98.08      1.92       0.00      51.71        1971.54
Ref. [47]    600 m                 96.13      3.87       0.00      18.36        969.41
The performance of the proposed IDS and of the IDSs in Ref. [46] and [47] is examined with different arrival intervals and transmission ranges, and the results are listed in Table 4 and Table 5. In every setting, the proposed IDS performs best: it achieves the highest classification accuracy together with the lowest overhead (APT and AOMN). Moreover, as the arrival interval decreases, vehicles in VANETs need to deal with many more messages from others, so the overhead of the IDS, represented by APT and AOMN, soars, while the detection accuracy, represented by TDR, FDR and UR, decreases slightly. In addition, when the transmission range is small (400 m), the overhead is higher than under the larger transmission ranges. In contrast, if the transmission range is large enough (no less than 500 m), vehicles can maintain stable communication with each neighboring vehicle, given the maximum vehicle speed of 100 km/h.
7. Conclusion and future work

We proposed a novel IDS suited to VANETs, in which the reliability of information must be ascertained quickly. It mainly consists of a novel feature extraction algorithm and an I-GHSOM-based classifier. To extract distinct features from vehicle messages quickly, the proposed extraction algorithm calculates the traffic flow feature according to the range of the distance between vehicles, and uses both a voting-filter mechanism and a semi-cooperative mechanism to calculate the position feature from the positions of neighboring vehicles at the current and last time points. To detect attacks quickly and accurately, the I-GHSOM-based classifier uses relabeling and recalculating mechanisms to relabel the units of GHSOM and to check the balance of the GHSOM structure. Simulations show that network message congestion (e.g., broadcast storms) can be mitigated by reducing the message scale and improving the processing efficiency. Moreover, the performance of the proposed IDS remains high even when up to 40% of the vehicles are rogue vehicles, with a true detection rate of 99.69% in that case.
Acknowledgements
This work was supported by the Natural Science Foundation of Guangdong Province under
Grant 2017A030313338, and by the Fundamental Research Project in the Science and
Technology Plan of Shenzhen under Grant JCYJ20170817102218122.
References
[1] Pathan, Al-Sakib Khan, ed. Security of self-organizing networks: MANET, WSN, WMN,
VANET. CRC press, 2016.
[2] Mershad, Khaleel, and Hassan Artail. A framework for secure and efficient data
acquisition in vehicular ad hoc networks. IEEE Transactions on vehicular technology 62.2
(2013): 536-551.
[3] Pathan, Al-Sakib Khan, ed. Security of self-organizing networks: MANET, WSN, WMN,
VANET. CRC press, 2016.
[4] Devi, Anu, and Sandeep Garg. Survey of clustering based Detection using IDS Technique.
(2017).
[5] Mishra, Preeti, Emmanuel S. Pilli, Vijay Varadharajan, and Udaya Tupakula. Intrusion
detection techniques in cloud environment: A survey. Journal of Network and Computer
Applications 77 (2017): 18-47.
[6] Tzeng, S. F., Horng, S. J., Li, T., Wang, X., Huang, P. H., & Khan, M. K. Enhancing
Security and Privacy for Identity-Based Batch Verification Scheme in VANETs. IEEE
Transactions on Vehicular Technology, 66(4) (2017): 3235-3248.
[7] Chaubey, Nirbhay Kumar. Security analysis of vehicular ad hoc networks (VANETs): a
comprehensive study. International Journal of Security and Its Applications 10, no. 5
(2016): 261-274.
[8] Li, Wenjia, and Houbing Song. ART: An attack-resistant trust management scheme for
securing vehicular ad hoc networks. IEEE Transactions on Intelligent Transportation
Systems 17, no. 4 (2016): 960-969.
[9] Vinel, Alexey, Xiaomin Ma, and Dijiang Huang. Guest Editors’ Introduction: Special Issue
on Reliable and Secure VANETs. IEEE Transactions on Dependable and Secure
Computing 13, no. 1 (2016): 2-4.
[10] Huang, Dijiang, et al. PACP: An efficient pseudonymous authentication-based conditional
privacy protocol for VANETs. IEEE Transactions on Intelligent Transportation Systems
12.3 (2011): 736-746.
[11] Shen, An-Ni, et al. A lightweight privacy-preserving protocol using chameleon hashing
for secure vehicular communications. Wireless Communications and Networking
Conference (WCNC), 2012 IEEE. IEEE, 2012.
[12] Lu, Rongxing, et al. Pseudonym changing at social spots: An effective strategy for location
privacy in vanets. IEEE Transactions on Vehicular Technology 61.1 (2012): 86-96.
[13] Javed, Muhammad Awais, Sherali Zeadally, and Zara Hamid. Trust-based Security
Adaptation Mechanism for Vehicular Sensor Networks. Computer Networks (2018).
[14] Minhas, Umar Farooq, et al. Towards expanded trust management for agents in vehicular
ad-hoc networks. International Journal of Computational Intelligence: Theory and Practice
(IJCITP) 5.1 (2010): 03-15.
[15] Sugumar, R., A. Rengarajan, and C. Jayakumar. Trust based authentication technique for
cluster based vehicular ad hoc networks (VANET). Wireless Networks 24, no. 2 (2018):
373-382.
[16] Hu, Hao, et al. REPLACE: A reliable trust-based platoon service recommendation scheme
in VANET. IEEE Transactions on Vehicular Technology 66.2 (2017): 1786-1797.
[17] Wahid, Abdul, Munam Ali Shah, Faisal Fayyaz Qureshi, Hafsa Maryam, Rahat Iqbal, and
Victor Chang. Big data analytics for mitigating broadcast storm in vehicular content centric
networks. Future Generation Computer Systems (2017).
[18] Ruj, Sushmita, et al. On data-centric misbehavior detection in VANETs. Vehicular
technology conference (VTC Fall), 2011 IEEE. IEEE, 2011.
[19] Yan, Shihao, et al. Optimal information-theoretic wireless location verification. IEEE
Transactions on Vehicular Technology 63.7 (2014): 3410-3422.
[20] Zaidi, Kamran, et al. Host-based intrusion detection for VANETs: a statistical approach to
rogue node detection. IEEE Transactions on Vehicular Technology 65.8 (2016): 6703-6714.
[21] Yu, Bo, Cheng-Zhong Xu, and Bin Xiao. Detecting sybil attacks in VANETs. Journal of
Parallel and Distributed Computing 73.6 (2013): 746-756.
[22] Camastra, Francesco, Angelo Ciaramella, and Antonino Staiano. Machine learning and
soft computing for ICT security: an overview of current trends. Journal of Ambient
Intelligence and Humanized Computing 4, no. 2 (2013): 235-247.
[23] Sedjelmaci, Hichem, Sidi Mohammed Senouci, and Mohammed Feham. An efficient
intrusion detection framework in cluster-based wireless sensor networks. Security and
Communication Networks 6.10 (2013): 1211-1224.
[24] Sedjelmaci, Hichem, and Sidi Mohammed Senouci. An accurate and efficient
collaborative intrusion detection framework to secure vehicular networks. Computers &
Electrical Engineering 43 (2015): 33-47.
[25] Bouali, Tarek, Sidi-Mohammed Senouci, and Hichem Sedjelmaci. A distributed
detection and prevention scheme from malicious nodes in vehicular
networks. International Journal of Communication Systems 29.10 (2016): 1683-1704.
[26] Sedjelmaci, Hichem, Sidi Mohammed Senouci, and Mosa Ali AbuRgheff. An efficient and
lightweight intrusion detection mechanism for service-oriented vehicular networks. IEEE
Internet of Things Journal 1.6 (2014): 570-577.
[27] Mokdad, Lynda, Jalel Ben-Othman, and Anh Tuan Nguyen. DJAVAN: Detecting jamming
attacks in Vehicle Ad hoc Networks. Performance Evaluation 87 (2015): 47-59.
[28] Hortelano, Jorge, Juan Carlos Ruiz, and Pietro Manzoni. Evaluating the usefulness of
watchdogs for intrusion detection in VANETs. Communications Workshops (ICC), 2010
IEEE International Conference on. IEEE, 2010.
[29] Baiad, Raghad, et al. Cooperative cross layer detection for blackhole attack in VANET-
OLSR. Wireless Communications and Mobile Computing Conference (IWCMC), 2014
International. IEEE, 2014.
[30] Wahab, Omar Abdel, et al. CEAP: SVM-based intelligent detection model for clustered
vehicular ad hoc networks. Expert Systems with Applications 50 (2016): 40-54.
[31] Alheeti, Khattab M. Ali, Anna Gruebler, and Klaus D. McDonald-Maier. On the detection
of grey hole and rushing attacks in self-driving vehicular networks. Computer Science and
Electronic Engineering Conference (CEEC), 2015 7th. IEEE, 2015.
[32] Kumar, Neeraj, and Naveen Chilamkurti. Collaborative trust aware intelligent intrusion
detection in VANETs. Computers & Electrical Engineering 40.6 (2014): 1981-1996.
[33] Sedjelmaci, Hichem, Sidi Mohammed Senouci, and Nirwan Ansari. Intrusion Detection
and Ejection Framework Against Lethal Attacks in UAV-Aided Networks: A Bayesian
Game-Theoretic Methodology. IEEE Transactions on Intelligent Transportation Systems
18.5 (2017): 1143-1153.
[34] Sedjelmaci, Hichem, Tarek Bouali, and Sidi Mohammed Senouci. Detection and
prevention from misbehaving intruders in vehicular networks. Global Communications
Conference (GLOBECOM), 2014 IEEE. IEEE, 2014.
[35] Greenshields model, http://www.webpages.uidaho.edu/niatt_labmanual/chapters/trafficflowtheory/theoryandconcepts/GreenshieldsModel.htm.
[36] Free space Model, http://www.isi.edu/nsnam/ns/doc/node217.html.
[37] False Attack, https://en.wiktionary.org/wiki/false_attack.
[38] Newsome, James, et al. The sybil attack in sensor networks: analysis & defenses.
Proceedings of the 3rd international symposium on Information processing in sensor
networks. ACM, 2004.
[39] Zhu, Yingying, et al. An improved NSGA-III algorithm for feature selection used in
intrusion detection. Knowledge-Based Systems 116 (2017): 74-85.
[40] De la Hoz, Emiro, et al. Feature selection by multi-objective optimisation: Application to
network anomaly detection by hierarchical self-organising maps. Knowledge-Based
Systems 71 (2014): 322-338.
[41] Kohonen, Teuvo. The self-organizing map. Neurocomputing 21, no. 1-3 (1998): 1-6.
[42] SOM: Self-organizing, https://en.wikipedia.org/wiki/Self-organizing_map.
[43] Rauber, Andreas, Dieter Merkl, and Michael Dittenbach. The growing hierarchical self-
organizing map: exploratory analysis of high-dimensional data. IEEE Transactions on
Neural Networks 13.6 (2002): 1331-1341.
[44] NS2: Network Simulator, http://www.isi.edu/nsnam/ns/.
[45] Behrisch, Michael, et al. SUMO - simulation of urban mobility: an overview. Proceedings
of SIMUL 2011, The Third International Conference on Advances in System Simulation.
ThinkMind, 2011.
[46] Sharma, Sparsh, and Ajay Kaul. Hybrid fuzzy multi-criteria decision making based multi
cluster head dolphin swarm optimized IDS for VANET. Vehicular Communications 12
(2018): 23-38.
[47] Subba, Basant, Santosh Biswas, and Sushanta Karmakar. A game theory based multi
layered intrusion detection framework for VANET. Future Generation Computer
Systems 82 (2018): 12-28.
