
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORT SYSTEMS, 2022 1

AI-empowered Trajectory Anomaly Detection in IoV and ITS environments

Gunasekaran Raja, Senior Member, IEEE

Abstract: The Internet of Vehicles (IoV) emerged as a result of rapid advancement in Intelligent Transportation Systems (ITS) over the last few decades, in which smart vehicles are connected with one another through next-generation wireless networks, namely 6G, for information sharing. The exponential rise in the number of vehicles, along with rising data demands from in-vehicle users, has resulted in a massive increase in data in the IoV infrastructure, which exposes it to higher security risks. A common large-scale data analysis task is trajectory data feature analysis for IoV and ITS environments. It is challenging to describe and detect anomalies in urban motion behavior due to the enormous coverage and complexity of IoV and ITS. Some existing systems are ineffective at detecting aberrant urban vehicle trajectories because they rely on a restricted number of single detection strategies, such as determining frequent patterns. In this paper, we provide a framework for urban trajectory modelling and anomaly detection. Our methodology considers the way anomalous behavior emerges in Autonomous Vehicle (AV) trajectories. As a result, four different patterns required for anomaly identification, namely AV driving speed, AV driving direction, AV driving time, and AV driving distance, are determined in this work. A point is characterized as anomalous if these patterns are not within the normal limits. Using these types of patterns, we investigate AV behaviors and propose a method for trajectory anomaly classification and identification. First, we introduce the Euclidean distance metric to evaluate the similarity between any two trajectories. We build an anomaly count based on this distance in order to measure the differences between various types of anomalous and normal trajectories. We further propose reinforcement learning using the Deep Deterministic Policy Gradient (DDPG) algorithm to improve the accuracy and efficiency of trajectory anomaly detection and classification (ETADC). The effectiveness and performance of the proposed method are evaluated on OMNET++ and SUMO using real Chennai trajectory map data. The obtained results show that the proposed approach outperforms existing state-of-the-art schemes by a significant margin.

Index Terms: Trajectory data, trajectory anomaly detection, distance metrics, anomaly classification, DDPG, 6G.

Manuscript received ˙˙˙˙˙˙˙˙; revised ˙˙˙˙˙˙˙˙; accepted ˙˙˙˙˙˙; Date of Publication xxxx xxx xxxxxx; date of current version xxxxx xxxxx xxxxx. The work of Gunasekaran Raja, ˙˙˙˙˙˙ was supported by NGNLab, Department of Computer Technology, Anna University, Chennai 600025, India. The review of this article was coordinated by xxxxxxxxxxxx. Paper no. ˙˙˙˙˙˙˙˙˙. (Corresponding Author: Gunasekaran Raja).
Copyright © 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.
Gunasekaran Raja, ˙˙˙˙˙˙˙˙˙˙˙˙and ˙˙˙˙˙˙˙˙˙˙˙˙ are with NGNLab, Department of Computer Technology, Anna University, Chennai 600025, India. (e-mail: dr.r.gunasekaran@ieee.org).
Digital Object Identifier xxxxxxxxxxxxxx

I. INTRODUCTION

Autonomous terrestrial vehicles have become a major concern, and most car manufacturers are deeply involved in the subject. Many fields are affected by the emergence of autonomous cars: there are law modifications to make, connectivity questions to solve, such as hacking, and social issues to consider, like the fear of handing one's life over to a machine. The autonomous navigation system is often divided into three categories: perception and localization, planning, and control.

Recent years have witnessed the popularity of online ride-hailing services, which have gradually transformed transportation scenarios. At the same time, massive volumes of trajectories have been generated at an unprecedented speed with the advancement of mobile devices and sensors. Trajectory data representing the mobility of vehicles contain rich temporal and spatial information, which provides great opportunities for improving transportation efficiency and ride-hailing service quality.

As a critical module in an autonomous vehicle, trajectory planning identifies a spatiotemporal curve that is free from collisions, easily tracked by the controller, and comfortable for the passengers. Trajectory tracking includes localization, mission planning, motion planning, decision-making and actuation in autonomous driving. Trajectory planners have been widely developed for on-road autonomous driving. Trajectory tracking is an important part of autonomous driving, as tracking accuracy directly affects the safety of autonomous driving. However, various threats can occur that bring down the entire system and lead to fatal accidents.

Many methods have been proposed from various perspectives to detect anomalous trajectories. Some previous studies designed hand-crafted features that describe the normal routes to distinguish abnormal from normal trajectories. Heuristic features based on expertise have a weak ability to capture the diversity of normal routes. The shortest and the fastest trajectories can be considered normal routes, yet some normal routes take a longer distance and time in exchange for better road conditions. Other existing works based on density or isolation detect anomalous trajectories by comparing the given trajectory with a large quantity of historical trajectories.

In the past few decades, researchers have proposed many novel methods to detect abnormal taxi trajectories based on taxi traces. Most of the existing methods use counting-based strategies to detect anomalous taxi trajectories. The basic idea of counting-based methods is that trajectories that were rarely traversed by taxis in the past are considered anomalous. Specifically, when judging a testing trajectory, if its number of appearances in the historical trajectory set is greater than the threshold, it is considered normal; otherwise, it is marked as anomalous.

However, the distribution of trajectories is unbalanced in the real world. Historical datasets have only a few trajectories for most source–destination pairs.

Thus, the normal or abnormal routes are difficult to define. Even worse, the number of trajectories decreases as the route distance increases.

Fig. 1. An illustration of AV trajectories between (S, D).

The identification and classification of anomalous trajectories is the core of this research. Something that deviates from what is normal, typical, or intended is referred to as an anomaly (outlier). When measured by some similarity metric, an anomalous trajectory is one that has local or global variations from most other normal trajectories. As a result, the aim of anomalous trajectory detection is to locate the trajectories that are distinct from the usual trajectories in a dataset and to determine to what level they are distinct. Fig. 1 illustrates a few AV trajectories between (S, D), where S and D represent the source and the destination, respectively. Assume that the trajectories $Tra_4$ and $Tra_5$ are defined as normal trajectories, whereas the other four trajectories $\{Tra_1, Tra_2, Tra_3, Tra_6\}$ are considered anomalous trajectories for various reasons, such as long distance and short distance.

The primary contributions of the paper are listed below:
1) The patterns of anomalous trajectories are first explored, and four main types of anomalies are identified. Euclidean distance metrics are used to analyze the trajectory and evaluate the similarity between any two trajectories.
2) A DDPG method is proposed that can classify trajectory paths and detect unforeseen on-road vehicular anomalies. We evaluate and analyze the effectiveness and performance of DDPG on real-world Autonomous Vehicle (AV) traces.
3) Our proposed strategies improve the running performance of the existing ODMTS algorithm. The proposed method is evaluated on several well-known trajectory databases, and the scalability of the algorithms is investigated in terms of runtime, the number of outliers, and speedup by analyzing the performance under different parameters.

The rest of this article is structured as follows. Section II reviews recent related work. Section III defines the problem, and Section IV presents the system model. The proposed system for trajectory anomaly detection is described in Section V. Section VI presents the simulation results and discusses the performance achieved. Finally, Section VII concludes the article.

II. RELATED WORK

Anomalous trajectory detection approaches can be mainly divided into two categories, namely, metrics-based and learning-based methods. The metrics-based methods use evaluation indicators to measure the similarity between the trajectories and the historical ones. The outlier trajectories are defined as anomalies. The isolation-Based Anomalous Trajectory (iBAT) detection method [1] defines and calculates the isolation on the basis of the locations that the historical trajectories traveled. If a trajectory travels to locations that the other trajectories never pass, then it is more likely to be regarded as an anomaly. The improved method, isolation-Based Online Anomalous Trajectory (iBOAT) [2], considers the location orders of historical trajectories. If a trajectory travels through popular locations in an infrequent order, then it is classified as an outlier. Zhu et al. [3] proposed an outlier detection algorithm that derives the anomalous trajectory by measuring the edit distance between the trajectory under evaluation and the popular routes. The algorithm needs to retrieve popular routes between the same source and destination as the trajectory under evaluation; thus, it suffers from the trajectory sparsity issue. Lv et al. [4] clustered the historical trajectories to generate the representative routes. The trajectories that were grossly inconsistent with the representative routes would be considered outliers. Only iBOAT can identify the abnormal parts of the trajectories. The other methods can only classify whole trajectories.

The methods in the second category are based on learning models. They can capture the dependencies between trajectory sequences and define trajectories with a low probability of occurrence as anomalies. The Sequence Auto Encoder (SAE) [5] utilizes a conventional RNN-based sequence model to learn trajectory features by minimizing the reconstruction error. The anomalous trajectories are defined on the basis of the reconstruction error. Liu et al. [6] proposed a generative model that can identify normal routes and define trajectories with a small regeneration probability as anomalies. Wu et al. [7] proposed a method to model the driving behavior and measure the decision cost of the driving behavior. The trajectories with a higher decision cost are defined as anomalies. However, additional features, such as the structure of road networks, must be collected explicitly. Xiao et al. [7] proposed a supervised model based on RNN that can model the context in trajectories and detect whether the given trajectories are anomalous. The Stacked Denoising Autoencoder (SDA) [8] can represent trajectory features and detect anomalous trajectories simultaneously. The method stacks several autoencoder models to remove trajectory noise and extract abstract features in the form of vectors. The anomalous trajectories can be recognized by outlier vectors. Cheng et al. [9] proposed a Spatial-Temporal RNN (ST-RNN) model to detect anomalous trajectories. The model explicitly feeds the coordinate sequence and the spatial-temporal sequence into RNN layers to extract the vector representation of the spatial-temporal information of a trajectory. The vectors then flow into a multilayer perceptron layer and a SoftMax layer to generate a probability prediction for anomalous trajectories.

Most of the recent abnormal trajectory detection research focuses on classification and distance-based detection methods. One of the traditional methods of abnormal trajectory detection is to cluster the trajectory data [10]. If the trajectory does not belong to a specific class, it will be judged as an abnormal trajectory. Qin et
al. [11] developed a probability-based taxi fraud detection algorithm that considers both the behavior of taxi drivers and the traffic variability. The set of route choices of all taxi drivers is first generated from the taxi trajectory database. The choice probability of each route is then calculated by combining the choices of all taxi drivers taking this route. The anomaly score of each taxi driver is finally determined using the probability values of all routes visited by that taxi driver. Kamoona et al. [12] developed a smart system that integrates image databases in detecting anomalies. A convolutional neural network is applied to extract both the trip and engineering features. The random finite set algorithm is then performed to derive outliers. Zhang et al.

With the rapid advancement of DL, DRL offers a model that can learn complex policies in a high-dimensional state space, such as in the robotics [13] and locomotion [14] domains. The appealing structure of the convolutional network and the properties of function approximation motivate researchers to employ DRL algorithms in solving autonomous driving problems such as lane-merging [15] and lane-changing maneuvers [16], [17]. However, applying DRL to autonomous driving in complex urban areas is still an open problem. One reason is the use of high-dimensional images, which dramatically increases the complexity required for DRL training. In addition, most current studies use DRL for simple driving tasks, while complex urban settings with multiple tasks are still unexplored. Value-based RL algorithms are used to cope with discrete actions with a finite state space, such as in games [18]. Nevertheless, solving AV problems generally requires continuous action commands.

Currently, Deep Q-Network (DQN) is a commonly used value-based RL algorithm for solving AV problems [19], [20]. Toromanoff et al. [21] adapted the value-based Rainbow-IQN Ape-X [22] by discretizing the action space to suit the CARLA driving simulator [23]. However, the dimensionality problem, which causes low-precision results in [24], remains unsolved. Meanwhile, some studies integrate policy-based algorithms to solve issues related to large state space representations.

Current approaches to anomaly detection mostly focus on identifying whether a trajectory is anomalous, ignoring different patterns of anomalous trajectories. In this paper, we look at distinct patterns of anomalous trajectories, namely sudden speed variations, vehicles moving in the wrong directions, variation in vehicle driving time, and violations in lane driving (driving distance), and present a framework for detecting and classifying anomalous trajectories in real-world data.

III. PROBLEM DEFINITION

AV traces are collections of data created on a regular basis by GPS devices installed in AVs. Each record includes the coordinates (longitude and latitude), timestamp, speed, direction, shape, and distance.

Definition 1 (Trajectory): An AV service is represented by a trajectory T, which is a series of GPS coordinates, denoted by $Tra = \{p_1, p_2, \ldots, p_n\}$, where $p_i \in D$ is the geolocation (i.e., latitude and longitude), and $p_1$ and $p_n$ are the source and destination points of the trajectory, produced at time-interval $t_j = \{t_1, t_2, \ldots, t_n\}$.

We can get a huge number of AV trajectories by extracting valid AV trips from all the AV GPS traces based on occupancy status. The normal and anomalous trajectories can be defined when a specific S2D pair is provided. Therefore, a partition-and-detect framework is proposed, where all the trajectories are partitioned, and then outlying trajectory partitions are detected by a distance-based method.

Definition 2 (Sub-Trajectory): A sub-trajectory $STra \subseteq Tra$ is defined as a sub-set or sub-part of the geolocation points $\{p_1, p_2, \ldots, p_n\}$ in the whole trajectory.

Definition 3 (Trajectory Database): A trajectory database is $Tra = \{Tra_1, Tra_2, \ldots, Tra_n\}$, where each trajectory consists of geolocation points $\{p_1, p_2, \ldots, p_n\}$ with time-interval $\{t_1, t_2, \ldots, t_n\}$ obtained by the GPS AV service. All the trajectory points are stored in this database. Each trajectory is assigned a unique ID, and its driving time, driving distance (long and short), and driving speed are calculated.

Definition 4 (Trajectory Distance Metrics): Trajectory distance metrics measure the similarity of different trajectories $Tra_l$ and $Tra_m$. The similarity, denoted as $S(Tra_l, Tra_m)$, is defined by the Euclidean distance metric:

$$d(p_1, p_2) = \sqrt{(p_{1l} - p_{2l})^2 + (p_{1m} - p_{2m})^2}$$

Fig. 2. An illustration of driving distance $dis_i$ and driving time $t_j$ of point $p_i$.

Definition 5 (Driving distance, driving time, driving speed): In a trajectory, the source is denoted as $p_1$. For any point $p_i$ in $Tra$, the distance from $p_i$ to $p_1$, denoted by $dis_i$, is defined as

$$dis_i = distance(p_i, p_1)$$

where $distance(p_i, p_1)$ is a function to compute the geographical distance between $p_i$ and $p_1$.

The summation of all fragment geographical distances starting from the source $p_1$ is computed as

$$sum_i = \sum_{j=1}^{i-1} distance(p_j, p_{j+1})$$

The driving time from $p_i$ to $p_1$, denoted by $t_j$, is defined as

$$t_j = p_i.t_i - p_1.t_i$$

The speed from $p_i$ to $p_1$, denoted by $s_i$, is defined as

$$s_i = \frac{dis_i}{t_j}$$
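As a minimal sketch of Definition 5, the Python snippet below computes $dis_i$, $sum_i$, $t_j$ and $s_i$ for one trajectory. The `Point` type, the `driving_features` helper and the use of planar coordinates in metres are assumptions made for illustration. The paper does not specify how `distance` is implemented, so a Euclidean distance consistent with Definition 4 is used here.

```python
import math
from typing import NamedTuple, Sequence, Tuple


class Point(NamedTuple):
    """One GPS sample; planar coordinates in metres are an assumption."""
    x: float
    y: float
    t: float  # timestamp in seconds


def distance(a: Point, b: Point) -> float:
    """Euclidean distance, used here as a stand-in for the geographical distance."""
    return math.hypot(a.x - b.x, a.y - b.y)


def driving_features(tra: Sequence[Point], i: int) -> Tuple[float, float, float, float]:
    """Return (dis_i, sum_i, t_j, s_i) for the point at 0-based index i."""
    source, point = tra[0], tra[i]
    # Straight-line distance from the point back to the source.
    dis_i = distance(point, source)
    # Length of the path actually travelled: sum over consecutive fragments.
    sum_i = sum(distance(tra[j], tra[j + 1]) for j in range(i))
    # Driving time elapsed since the source point.
    t_j = point.t - source.t
    # Speed as defined in the text; returning 0.0 when no time has elapsed
    # is a convention chosen for this sketch.
    s_i = dis_i / t_j if t_j > 0 else 0.0
    return dis_i, sum_i, t_j, s_i


# Hypothetical three-point trajectory.
tra = [Point(0.0, 0.0, 0.0), Point(3.0, 4.0, 10.0), Point(3.0, 0.0, 20.0)]
print(driving_features(tra, 2))
```

Note that $dis_i$ and $sum_i$ differ as soon as the vehicle leaves the straight line from the source, which is why both quantities are kept.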
Definition 6 (Global Trajectory): A global trajectory is defined as a long trajectory that is larger than and completely different from a normal trajectory.

Definition 7 (Local Trajectory): A local trajectory is defined as a long trajectory that is larger than and partially different from a normal trajectory.

Definition 8 (Global shortcut): It is the straight path or approximately straight path between S and D.

Definition 9 (Local shortcut): It is defined as a trajectory that contains the shortest path in some parts of the trajectory.

Definition 10 (Global Anomaly Detection): It uses the whole trajectory to find anomalies.

Definition 11 (Local Anomaly Detection): It divides trajectories into subparts and finds anomalies in those sub-trajectories.

Problem Statement: In a given trajectory $Tra = \{p_1, p_2, \ldots, p_n\}$, the task is to detect the anomalies and classify them into different classes, represented as $ℂ = \{C_{p_1}, C_{p_2}, \ldots, C_{p_n}\}$, $C_{p_i} \in \{DD, DS, DT, DD\}$.

IV. SYSTEM MODEL

A trajectory is a time-series dataset that includes object locations indexed in chronological order. Current approaches to anomaly detection mostly focus on identifying whether a trajectory is anomalous, ignoring different patterns of anomalous trajectories. In this paper, we look at distinct patterns of anomalous trajectories, namely sudden speed variations, vehicles moving in the wrong directions, variation in vehicle driving time, and violations in lane driving (driving distance), and present a framework for detecting and classifying anomalous trajectories in real-world data. First, we analyze the trajectory using the Euclidean distance metric and then evaluate the similarity between any two trajectories. We further propose an Efficient Trajectory Anomaly Detection and Classification (ETADC) framework that uses the deep deterministic policy gradient algorithm to find anomalies in different patterns. Finally, we evaluate the proposed ETADC method through extensive experiments on real AV trajectory data.

Fig. 3. Proposed trajectory anomaly detection framework.

The proposed framework consists of three modules, namely trajectory data, the data preprocessing stage, and the anomaly detection and classification stage.

1. Trajectory data: We work on real-world Chennai trajectory data, which is retrieved from the SUMO simulator.

2. Data preprocessing stage: This is the process of preparing the raw trajectory data and making it suitable for a machine learning model. Raw data may contain errors and missing values and may be in an unusable format, so this stage involves extracting occupied trajectories and denoising. Preprocessing increases the accuracy and efficiency of a machine learning model. The functions of each component in this module are listed below.
a. Extraction: When gathering AV traces, the GPS records of all AVs are saved together. To extract trajectories, we first group the GPS records of each AV by AV ID, then sort the GPS records in chronological order. The segments of the journey in the occupied status are extracted after sorting the GPS points of each AV.
b. Denoising: In practice, GPS location accuracy is affected by a variety of circumstances. For example, due to the aging of GPS devices and multi-path effects, the gathered data may experience coordinate drift, causing the obtained coordinates to be far from the actual locations. As a result, reducing noise in the collection of retrieved trajectories is critical.
c. Clustering on trajectories: When raw trajectories are obtained from tracking algorithms, they need to be clustered to identify different patterns.
d. Trajectory database: To improve search and detection efficiency, trajectory points and trained models are maintained in a database. This database contains all of the trajectory points. A unique ID is assigned to each trajectory, and its driving time, distance, and speed are determined.

3. Anomaly detection stage: Consider a trajectory $Tra$ formed by a moving object with multi-dimensional points $p_i$ at a time interval $t_j$. The trajectory of a moving object $ob$ is then defined as a sequence of such trajectory points produced at time-intervals $t_j = \{t_1, t_2, \ldots, t_n\}$. In order to measure the similarity between two trajectories, we propose the Euclidean distance metric. Further, the proposed deep deterministic policy gradient method is applied to the dataset to detect anomalies and classify them into different classes. It identifies trajectory anomalies such as sudden speed variations, vehicles moving in the wrong directions, variation in vehicle driving time, and violations in lane driving (driving distance).

Fig. 4. An illustration of normal and anomalous trajectory.
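The extraction step (2a) of the preprocessing stage can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the `GpsRecord` fields and the `extract_trajectories` helper are hypothetical names, and an `occupied` flag is assumed to be available on every GPS record.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class GpsRecord(NamedTuple):
    """One raw GPS record; the field names are assumptions for this sketch."""
    av_id: str
    t: float        # timestamp in seconds
    lon: float
    lat: float
    occupied: bool  # True while the AV is carrying out a trip


def extract_trajectories(records: List[GpsRecord]) -> List[List[GpsRecord]]:
    """Group records by AV ID, sort them by time, and cut out occupied segments."""
    by_av: Dict[str, List[GpsRecord]] = defaultdict(list)
    for record in records:
        by_av[record.av_id].append(record)

    trips: List[List[GpsRecord]] = []
    for av_id in sorted(by_av):
        current: List[GpsRecord] = []
        for record in sorted(by_av[av_id], key=lambda r: r.t):
            if record.occupied:
                current.append(record)
            elif current:
                # An unoccupied record closes the trip that was in progress.
                trips.append(current)
                current = []
        if current:
            trips.append(current)
    return trips


# Hypothetical records, deliberately out of order.
records = [
    GpsRecord("b", 5.0, 80.27, 13.08, True),
    GpsRecord("a", 4.0, 80.25, 13.06, True),
    GpsRecord("a", 1.0, 80.21, 13.02, True),
    GpsRecord("a", 3.0, 80.23, 13.04, False),
    GpsRecord("a", 2.0, 80.22, 13.03, True),
]
trips = extract_trajectories(records)
print([[r.t for r in trip] for trip in trips])
```

Denoising and clustering (steps 2b and 2c) would then operate on each extracted trip; they are left out here because the text does not fix a particular method for them.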

Fig. 4 depicts normal and anomalous trajectories. The normal trajectory is represented by black lines, whereas the anomalous trajectory is represented by red lines. The result was obtained on a real Chennai map using the SUMO simulator.

V. PROPOSED SYSTEM

The proposed system uses the Euclidean distance metric and reinforcement learning with the deep deterministic policy gradient algorithm for motion planning and decision making towards the detection and classification of anomalous trajectories.

A. Euclidean distance metric

Distance metrics are an important part of trajectory analysis, because they can measure the similarity of distinct trajectories. Appropriate distance metrics are critical for any method of detecting outliers in a trajectory.

We begin by analyzing the trajectory and determining the degree of similarity between any two trajectories. The Euclidean distance metric is defined as

$$d(l, m) = \sqrt{\sum_{i=1}^{n} (m_i - l_i)^2}$$

where $d(l, m)$ is the distance between two points $l$ and $m$ in the normal trajectory, and $m_i$, $l_i$ are the components of their Euclidean vectors, starting from the initial point. In many scenarios, especially when comparing distances, Euclidean distance is of central importance.

B. Anomaly Count

Assume that there are many trajectories between an (S, D) pair; we have to measure the Euclidean distance from one trajectory to another. Consider a trajectory $Tra = \{p_1, p_2, \ldots, p_n\}$. The anomaly count of $P_j$ is defined as

$$ꞎ_1(P_j) = \frac{\frac{1}{ℕ-1}\sum_{j,\, j \neq k}^{ℕ-1} \left(|P_j \setminus P_k| - |P_k \setminus P_j|\right)}{\frac{1}{ℕ-1}\sum_{j,\, j \neq k}^{ℕ-1} |P_j \cap P_k|} = \frac{\sum_{j,\, j \neq k}^{ℕ-1} \left(|P_j| - |P_k|\right)}{\sum_{j,\, j \neq k}^{ℕ-1} |P_j \cap P_k|} \quad (2)$$

Thus, the anomaly count in the second phase is represented as

$$ꞎ_2(P_j) = \frac{\sum_{l_j,\, j \in Tra_{ik}} \left(|P_j| - |P_k|\right)}{\sum_{l_j,\, j \in Tra_{ik}} |P_j \cap P_k|} \quad (3)$$

After the two phases are completed, the anomaly count of every trajectory $P_j$ is defined as

$$ꞎ(P_j) = \begin{cases} ꞎ_1(P_j), & P_j \in Tra \\ ꞎ_2(P_j), & P_j \in Tra \setminus ℭ \end{cases} \quad (4)$$

Then, to classify such trajectories, a set of thresholds must be applied. We define a function $ℒ(P_j; Tra; ℑ)$ to express the likely class of $P_j$ in $Tra$, where $ℑ = (ℑ_1, ℑ_2, ℑ_3, ℑ_4)$ is a vector of threshold parameters.

$$ℒ(P_j; Tra; ℑ) = \begin{cases} P_j \text{ is DS}, & ꞎ_2(P_j) \geq ℑ_1 \\ P_j \text{ is DD}, & ℑ_2 \leq ꞎ_2(P_j) < ℑ_1 \\ P_j \text{ is DD}, & ℑ_3 < ꞎ_2(P_j) < ℑ_2 \end{cases} \quad (5)$$

where $ℑ_1$ and $ℑ_2$ are positive while $ℑ_3$ is negative. It is clear that the parameter vector has a significant impact on the classification of anomalous trajectories. Because trajectory labels are not available in the detection and classification mechanism, an unsupervised learning method is used.

Algorithm 1 presents the pseudocode for the anomaly count and anomaly threshold, and Algorithm 2 presents the pseudocode for the deep deterministic policy gradient. The time complexity of the proposed algorithm is $O(n^2)$, where $n$ is the number of trajectories.
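To make the anomaly count and the threshold rule concrete, the sketch below implements the simplified form of Eq. (2), the selection of the normal set $ℭ$ with the interval $[-ℳ, ℳ]$, and the classification of Eq. (5). Several things here are assumptions rather than details from the paper: each trajectory is represented as a set of discretised grid cells (suggested by the set operators in Eq. (2)), the helper names are hypothetical, and the threshold values are invented for the example.

```python
from typing import Hashable, List, Optional, Sequence, Set, Tuple

Trajectory = Set[Hashable]  # assumed: the set of grid cells a trajectory visits


def anomaly_count(p_j: Trajectory, others: Sequence[Trajectory]) -> float:
    """Simplified Eq. (2): sum(|P_j| - |P_k|) / sum(|P_j & P_k|) over the other P_k."""
    numerator = sum(len(p_j) - len(p_k) for p_k in others)
    denominator = sum(len(p_j & p_k) for p_k in others)
    if denominator == 0:
        # The source does not define this case; raising is a choice made here.
        raise ValueError("trajectory shares no cell with the other trajectories")
    return numerator / denominator


def select_normal(trajectories: Sequence[Trajectory], m: float) -> List[int]:
    """Indices of trajectories whose anomaly count lies inside [-m, m]."""
    normal = []
    for j, p_j in enumerate(trajectories):
        others = [p_k for k, p_k in enumerate(trajectories) if k != j]
        if -m <= anomaly_count(p_j, others) <= m:
            normal.append(j)
    return normal


def classify(count: float, thresholds: Tuple[float, float, float]) -> Optional[str]:
    """Eq. (5). The source labels both lower bands DD; that is reproduced as-is."""
    t1, t2, t3 = thresholds
    if count >= t1:
        return "DS"
    if t2 <= count < t1:
        return "DD"
    if t3 < count < t2:
        return "DD"
    return None  # Eq. (5) does not cover counts at or below the third threshold


# Hypothetical trajectories between one (S, D) pair, as sets of cell IDs.
trajectories = [{1, 2, 3, 4}, {1, 2, 3}, {1, 2, 3, 5}, {1, 2, 3, 4, 6, 7, 8, 9}]
detour_count = anomaly_count(trajectories[3], trajectories[:3])
print(select_normal(trajectories, 0.5), classify(detour_count, (1.0, 0.5, -0.5)))
```

The pairwise comparison inside `select_normal` visits every pair of trajectories, which matches the $O(n^2)$ complexity stated in the text.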

C. Reinforcement Learning Using DDPG

In this paper, we propose a deep deterministic policy gradient algorithm based on four distinct patterns for detecting abnormal trajectories effectively and accurately.

The essential notion is that the costs (in terms of driving distance and driving time) incurred for the displacement estimated from any point on the normal trajectory to its source should be within the usual range. If only the driving distance is out of range, the AV may have taken a longer route to save time, which is fairly typical in practice. Likewise, if only the driving time is out of range, the AV route may have met traffic bottlenecks, which are common during rush hours. When the travel distance and time are both out of range, the AV is more likely to have taken a detour, which is the focus of our system.

To address this problem, we propose reinforcement learning using the deep deterministic policy gradient algorithm. DDPG is a deterministic policy gradient algorithm that incorporates deep learning. The algorithm employs a dual network structure, i.e., actor and critic, to ensure a more stable learning process and faster convergence. In the DDPG algorithm, the actor network and the critic network are updated at the same time. Actor-Critic is a typical reinforcement learning structure. The Critic is built as a value function, while the Actor is designed as a policy function. Neural networks are used to represent the policy function (Actor) and the value function (Critic).

We use the DDPG algorithm to solve the problem of detecting and classifying anomalous trajectories. In order to develop the DDPG algorithm, we present the mathematical ideas required for autonomous decision-making on trajectories.

In the first step, we calculate the anomaly count of all the trajectories according to equation (2). Then, as completely normal trajectories, we select a set $ℭ$ of trajectories with an anomaly count inside a limited interval $[-ℳ, ℳ]$ and use $Tra \setminus ℭ$ to represent the other trajectories.

In the second stage, the deep deterministic policy gradient algorithm is applied to detect anomalies and classify them into different classes. Initially, the policy parameters $\vartheta$ and the Q-function parameters $\mu$ are initialized.

Furthermore, while the actor tries to improve the followed policy, the critic samples past experiences from a buffer ∝ to reduce the temporal difference (TD) error.

Based on the distinct pattern, the state $𝓈$ of the AV trajectories is observed and an action $\varphi$ is chosen. Then the anomaly count $ꞎ_1$ is calculated. In the same way, the following step is monitored and the anomaly count is calculated. If the AV's driving time varies dramatically, it is claimed that a driving time anomaly has been detected. Hence, the destination is computed as

$$x(Ʀ, ꞎ_2, \varphi) = Ʀ + \delta (1 - \varphi)\, Q_{\mu_{dest}}\!\left(ꞎ_2, \mu_{\vartheta_{dest}}(ꞎ_2)\right)$$

The Q-function and policy gradient are designed as

$$\nabla_{\mu} \frac{1}{z} \sum_{(ꞎ_1, \varphi, ꞎ_2, Ʀ) \in z} \left(Q(𝓈, \varphi) - x(Ʀ, ꞎ_2, \varphi)\right)$$

$$\nabla_{\vartheta} \frac{1}{z} \sum_{𝓈 \in z} \left(Q(𝓈, \varphi) - x(Ʀ, ꞎ_2, \varphi)\right)^2$$

The weights of the Online Policy Network and the Online Q Value Network are copied after a specific number of iterations to update the Target Policy Network and the Target Q Value Network, respectively. The procedure for updating the destination networks is as follows:

$$\mu_{dest} \leftarrow ℵ \mu_{dest} + (1 - ℵ)\, \mu$$

$$\vartheta_{dest} \leftarrow ℵ \vartheta_{dest} + (1 - ℵ)\, \vartheta$$

1) State Vector Representation: This section describes the approach that lets the AV driving agent generalise across diverse navigation tasks. Four different anomalous trajectory patterns, namely driving speed, driving time, driving direction and driving distance, are represented in the form of a state vector. Consider the illustration in Fig. 1. T1 and T2 are protracted diversions that may correspond to taxi driving scams, because fraudulent drivers have been shown to earn more than non-fraudulent drivers on average. T4 and T6 are short-distance diversions that cab drivers may be forced to take due to heavy traffic or road closures. T5 resembles a straight line, which is widely believed to be the shortest way in most studies; however, it is extremely likely to be the result of a GPS malfunction, because such a road does not always exist. We represent the state vector as follows:

$$𝓈 = [DD, DS, DT, DD]$$

where $𝓈$ is the state vector and DD, DS, DT, DD are the Driving Direction, Driving Speed, Driving Time, and Driving Distance. The DDPG receives input from the state vector and generates the result based on the anomaly count and anomaly threshold.

Fig. 5. A trajectory network of Deep Deterministic Policy Gradient (DDPG).

2) Reward Function: The reward function is the most important aspect in optimising performance, and it is used as a scalar to assess the safety and efficiency of DDPG activities in the environment. Negative reward values are
assigned to inappropriate behaviours in order to encourage the agent to avoid behaviours that result in large punishments. The reward function is estimated once the anomaly count and threshold have been applied to each action. Creating the reward function is difficult because it is sensitive, and it takes many attempts to find the values that help the deep reinforcement learning agent perform better on the required tasks. We arrive at a set of two basic weighted rewards: speed $Ʀ(s)$ and trajectory direction $Ʀ(d)$. Each reward value corresponds to a certain task in relation to the vehicle during the driving period.

The first reward function is the speed reward $Ʀ(s)$, for obtaining the target speed in varied traffic conditions. The speed reward $Ʀ(s)$ is maximised to the value 1 in these cases: (i) the current AV speed $ⱴ_s$ equals the predicted speed limit of the road; (ii) the AV stops ($ⱴ_s = 0$) at the predicted red traffic light ($Ʈ_{true}$); (iii) the AV stops ($ⱴ_s = 0$) at the predicted hazardous situation ($ɧ_{true}$); (iv) the AV speed gradually decreases to $ⱴ_s = 12$ km/h when the predicted distance from the leading vehicle is $ⱱ_d = 12$ meters, to maintain a safe distance.

The second reward function is the trajectory direction reward $Ʀ(d)$, which encourages the agent to continue driving in the correct lane. It is denoted as follows:

$$Ʀ(d) = V_a \cos(\theta) - V_a \sin(\theta) - V_a M_d$$

where $\theta$ is the expected angle between the AV and the center of the road and $V_a$ is the vehicle velocity.

The longitudinal velocity $V_a \cos(\theta)$ is favourably rewarded, the transverse velocity $V_a \sin(\theta)$ is penalised, and the agent is penalised by $V_a M_d$ when it is far away from the lane centre.

At each time step, the overall accumulated reward is:

$$Ʀ = Ʀ(s) + Ʀ(d)$$

For each action, the cumulative reward value $Ʀ$ is calculated.

Algorithm 2: Deep Deterministic Policy Gradient
Input: initialize policy parameters $\vartheta$, Q-function parameters $\mu$, buffer ∝.
Output: Anomalous trajectories in AV
1. Set destination parameters equal to main parameters: $\vartheta_{dest} \leftarrow \vartheta$, $\mu_{dest} \leftarrow \mu$
2. repeat
3.   Observe state $𝓈$ and select action $\varphi$
4.   where $𝓈 = (DD, DS, DT, DD)$
5.   DS ← [Driving Speed, Normal Trajectory]
6.   DT ← [Driving Time, Normal Trajectory]
7.   DD ← [Driving Direction, Normal Trajectory]
8.   DD ← [Driving Distance, Normal Trajectory]
9.   Execute $ꞎ_1$ in the circumstances
10.  Observe next state $ꞎ_2$, reward $Ʀ$
11.  Store $(ꞎ_1, \varphi, ꞎ_2, Ʀ)$
12.  reset circumstances
13.  if time increased then
14.    for t=500 do
15.      driving time anomaly detected
16.      $z = (ꞎ_1, \varphi, ꞎ_2, Ʀ)$ from ∝
17.      Compute destination $x(Ʀ, ꞎ_2, \varphi) = Ʀ + \delta (1 - \varphi)\, Q_{\mu_{dest}}(ꞎ_2, \mu_{\vartheta_{dest}}(ꞎ_2))$
18.      Update Q-function
19.      $\nabla_{\mu} \frac{1}{z} \sum_{(ꞎ_1, \varphi, ꞎ_2, Ʀ) \in z} (Q(𝓈, \varphi) - x(Ʀ, ꞎ_2, \varphi))$
20.      Update policy gradient
21.      $\nabla_{\vartheta} \frac{1}{z} \sum_{𝓈 \in z} (Q(𝓈, \varphi) - x(Ʀ, ꞎ_2, \varphi))^2$
22.      Update destination networks using
         $\mu_{dest} \leftarrow ℵ \mu_{dest} + (1 - ℵ)\, \mu$
         $\vartheta_{dest} \leftarrow ℵ \vartheta_{dest} + (1 - ℵ)\, \vartheta$
23.    end for
24.  end if
25. until convergence

VI. RESULTS AND DISCUSSIONS

This section includes an experimental evaluation and comparison of our proposed method with other state-of-the-art models.

A. Experimental Setup

This section presents the results of the performance assessment of the proposed scheme. In order to attain the desired simulation setup, two simulators were used: OMNET++ and SUMO. Here, the network simulations have been carried out on OMNET++. In addition, vehicular trajectory

Algorithm 1: Anomaly count and anomaly threshold
INPUT: $Tra = \{p_1, p_2, \ldots, p_n\}$, $ℳ$, $ℒ$, and $ℑ$.
OUTPUT: $ℂ = \{C_{p_1}, C_{p_2}, \ldots, C_{p_n}\}$
1. For j in $Tra$:
2.   Compute $d(p_1, p_2)$ according to Euclidean distance metric.
3.   If $-ℳ \leq d(p_1, p_2) \leq ℳ$:
4.     $ℭ \leftarrow P_j$
5.   End
6. End
7. For j in $Tra \setminus ℭ$:
8.   $Tra_{jn}$ ← choose the n-nearest trajectories of $P_j$ from $ℭ$
9.   Compute $ꞎ(P_j)$ according to anomaly threshold
generation is achieved using SUMO. The proposed
10. End method was successfully performed and produced the
11. Obtain anomaly threshold of all trajectories intended results. All the experiments were implemented
12. Classify all trajectories 𝑷𝒋 on a PC with 8 GB memory, an Intel core i7 7th
generation processor and the Windows 10 system.
13. Return ℂ={𝑪𝒑𝟏 , C𝒑𝟐 , …..C𝒑𝒏 }
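As an illustration of the quantities described earlier, the direction reward Ʀ(d), the total reward Ʀ = Ʀ(s) + Ʀ(d), and the destination value and soft update in steps 17 and 22 of Algorithm 2 can be sketched in Python as follows. This is a minimal sketch, not the implementation used in the experiments: all function and parameter names are hypothetical, Ʀ(s) is passed in as a precomputed value, the default discount and ℵ values are placeholders, and the factor written as (1 − 𝜑) in step 17 is assumed to act as a terminal-state indicator, as in standard DDPG.

```python
import math


def direction_reward(v_a, theta, m_d):
    """Trajectory direction reward R(d) = V_a*cos(theta) - V_a*sin(theta) - V_a*M_d.

    v_a   : vehicle velocity
    theta : angle (radians) between the vehicle and the centre of the road
    m_d   : lane-centre penalty factor (assumed here to be a non-negative scalar)
    """
    longitudinal = v_a * math.cos(theta)   # rewarded
    transverse = v_a * math.sin(theta)     # penalised
    off_centre = v_a * m_d                 # penalised
    return longitudinal - transverse - off_centre


def total_reward(r_s, r_d):
    """Overall reward at one time step: R = R(s) + R(d)."""
    return r_s + r_d


def destination_value(reward, next_q, done, discount=0.99):
    """Step 17: reward + discount * (1 - done) * Q_dest(next_state, policy_dest(next_state)).

    next_q is the destination Q-value of the next state, computed elsewhere.
    done is 1.0 for a terminal transition and 0.0 otherwise (an assumption).
    The default discount of 0.99 is a placeholder, not a value from the paper.
    """
    return reward + discount * (1.0 - done) * next_q


def soft_update(dest_params, main_params, aleph=0.995):
    """Step 22: dest <- aleph * dest + (1 - aleph) * main, element by element.

    Parameters are plain lists of floats to keep the sketch dependency-free.
    The default aleph of 0.995 is a placeholder, not a value from the paper.
    """
    return [aleph * d + (1.0 - aleph) * m for d, m in zip(dest_params, main_params)]


if __name__ == "__main__":
    r_d = direction_reward(v_a=10.0, theta=0.0, m_d=0.0)
    r = total_reward(r_s=1.0, r_d=r_d)
    print("R(d) =", r_d, "R =", r)
    print("destination value =", destination_value(r, next_q=2.0, done=0.0, discount=0.5))
    print("updated parameters =", soft_update([0.0, 1.0], [1.0, 1.0], aleph=0.5))
```

Keeping each term of Ʀ(d) in its own variable makes it easy to log which component dominates the reward while tuning, which is the trial-and-error step the reward design requires.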

Fig. 6. Multi-trajectory anomaly detection.

Fig. 7. Intersection anomaly detection.

Fig. 8. Two-trajectory anomaly detection.

B. Performance analysis

VII. CONCLUSION
In this paper, a framework for urban trajectory modelling and anomaly detection is proposed. Four patterns, namely driving speed, driving distance, driving duration, and driving direction, are established to identify anomalous trajectories. With these patterns, the behaviour of autonomous vehicles is examined. The Euclidean distance metric is used to calculate the similarity between any two trajectories, while the anomaly count is used to distinguish the various types of abnormal trajectory from normal trajectories. In addition, we present a reinforcement learning approach based on the deep deterministic policy gradient technique to increase the accuracy and efficiency of trajectory anomaly classification and identification. Finally, OMNET++ and SUMO were used to carry out a thorough evaluation of the proposed anomaly detection model. The experimental results clearly demonstrate the ability of the proposed scheme to outperform existing state-of-the-art systems.
In future work, we plan to investigate techniques for detecting trajectory outliers that differ only slightly from regular trajectories. Outliers that differ strongly from regular trajectories are comparatively simple to spot; the difficulty lies in detecting outliers that closely resemble regular trajectories. Furthermore, we can evaluate other distance metrics, such as Jaccard similarity, to see whether they capture the concept of trajectory dissimilarity in real-world datasets better than the Euclidean distance.
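To make the comparison between distance metrics concrete, the following minimal Python sketch computes a Jaccard similarity next to a mean Euclidean distance for two trajectories. It assumes that a trajectory is a list of (x, y) points and that the Jaccard index is taken over the sets of grid cells each trajectory visits; the grid discretisation, the cell size, and all function names are illustrative assumptions and are not part of the proposed framework.

```python
import math


def to_cells(trajectory, cell_size):
    """Map each (x, y) point to the index of the grid cell that contains it."""
    return {
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in trajectory
    }


def jaccard_similarity(traj_a, traj_b, cell_size=1.0):
    """Jaccard index |A ∩ B| / |A ∪ B| over the grid cells visited by each trajectory."""
    cells_a = to_cells(traj_a, cell_size)
    cells_b = to_cells(traj_b, cell_size)
    union = cells_a | cells_b
    if not union:
        # Two empty trajectories are treated as identical (a convention of this sketch).
        return 1.0
    return len(cells_a & cells_b) / len(union)


def mean_euclidean_distance(traj_a, traj_b):
    """Average point-to-point Euclidean distance for two equal-length trajectories."""
    if len(traj_a) != len(traj_b) or not traj_a:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(traj_a, traj_b)) / len(traj_a)


if __name__ == "__main__":
    a = [(0.2, 0.2), (1.5, 0.5), (2.5, 0.5)]
    b = [(0.8, 0.1), (1.1, 0.9), (2.5, 1.5)]
    print("Jaccard similarity:", jaccard_similarity(a, b))
    print("Mean Euclidean distance:", mean_euclidean_distance(a, b))
```

Unlike the point-wise Euclidean distance, the cell-based Jaccard index ignores the order and number of samples, so it can compare trajectories of different lengths; its sensitivity depends on the chosen cell size.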