You are on page 1of 15

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO.

8, AUGUST 2022 10365

A Unified Framework Integrating Decision Making


and Trajectory Planning Based on Spatio-Temporal
Voxels for Highway Autonomous Driving
Ting Zhang, Wenjie Song , Mengyin Fu, Yi Yang, Xiaohui Tian, and Meiling Wang

Abstract— Intelligent decision making and efficient trajectory


planning are closely related in autonomous driving technology,
especially in highway environment full of dynamic interactive
traffic participants. This work integrates them into a unified hier-
archical framework with long-term behavior planning (LTBP)
and short-term dynamic planning (STDP) running in two par-
allel threads with different horizon, consequently forming a
closed-loop maneuver and trajectory planning system that can
react to the dynamic environment effectively and efficiently.
In LTBP, a novel voxel structure and the ‘voxel expansion’
algorithm are proposed for the generation of driving corridors
in 3D configuration, which involves the prediction states of
surrounding vehicles. By using Dijkstra search, the maneuver
with minimal cost is determined in form of voxel sequences, then
a quadratic programming (QP) problem is constructed for solving
the optimal trajectory. And in STDP, another small-scaled QP
problem is performed to track or adjust the reference trajectory
from LTBP in response to the dynamic obstacles. Meanwhile,
a Responsibility-Sensitive Safety (RSS) Checker keeps running Fig. 1. The role and composition of the proposed framework. The proposed
at high frequency for real-time feedback to ensure security. framework plays the role of decision-making and planning, which makes use
Experiments on real data collected in different highway scenarios of the perception and prediction information, then the generated trajectory will
demonstrate the effectiveness and efficiency of our work. be executed by the controller. And the framework is actually a hierarchical
closed-loop system composed of LTBP, STDP and RSS Checker.
Index Terms— Decision making, trajectory planning,
spatio-temporal voxel, non-linear optimization.

I. I NTRODUCTION that are closely associated. The decision making process


which is also called behavior planning or maneuver planning,
A UTONOMOUS driving is a very promising emerging
technology, which saves human resources and pro-
motes the progress of intelligent cities and industries. The
is actually the selection of the best actions (i.e., lane change,
lane keeping, acceleration and deceleration) based on the
decision-making and trajectory planning have always been surrounding environment and the vehicle’s own states, and
regarded as one of the core techniques for autonomous driving trajectory planning is responsible for finding a collision-free,
comfortable and smooth trajectory which meets the initial and
Manuscript received 23 December 2020; accepted 15 June 2021. Date target conditions.
of publication 9 July 2021; date of current version 9 August 2022. This
work was supported in part by the National Natural Science Founda- Although there have been a lot of researches devoted to the
tion of China under Grant 61903034, Grant U1913203, Grant 61973034, design of decision-making algorithms [1], [2] and the devel-
and Grant 91120003; in part by the Program for Changjiang Scholars opment of planning schemes [3]–[5] separately, it is hard to
and Innovative Research Team in University under Grant IRT-16R06 and
Grant T2014224; in part by the China Postdoctoral Science Foundation isolate the two modules because the planning process requires
Funded Project under Grant 2019TQ0035; and in part by the Beijing local target generated by decision-making steps. Meanwhile,
Institute of Technology Research Fund Program for Young Scholars. The the flexibility and optimality of decision making results are
Associate Editor for this article was P. Kachroo. (Corresponding author:
Wenjie Song.) not feasible guaranteed without consideration of the planning
Ting Zhang, Wenjie Song, Yi Yang, Xiaohui Tian, and Meiling Wang are consequence. In some extreme situations, the following motion
with the State Key Laboratory of Intelligent Control and Decision of Complex planning fails to attach the appropriate movement to the
Systems, Beijing Institute of Technology, Beijing 100081, China (e-mail:
songwj@bit.edu.cn). maneuver, which will result in the conservative or aggressive
Mengyin Fu is with the State Key Laboratory of Intelligent Control and actions.
Decision of Complex Systems, Beijing Institute of Technology, Beijing In the existed decision-making and planning frame-
100081, China, and also with the School of Automation, Nanjing University
of Science and Technology, Nanjing 210014, China. works, two dominant systems prevail: rule-based and
Digital Object Identifier 10.1109/TITS.2021.3093548 learning-based decision making and planning schemes.
1558-0016 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10366 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022

Rule-based decision-making utilizes the driving rules and corridors in 3D configuration, which is the key architecture
expert knowledge to determine the optimal logical action, of our LTBP framework.
with representative methods Finite State Machine (FSM) [6],
[7] and Markov Decision Process (MDP) [8], [9]. And the A. Frameworks Combining Decision Making and Trajectory
learning-based method uses different network structures to Planning
generate the optimal policy with best rewards according to the Decision making and trajectory planning are both crucial
environment. Various learning-based methods like reinforce- techniques for autonomous vehicles, which can be closely
ment learning are popular [10]–[12]. Different frameworks associated in research and applications. For decision making,
place the emphasis on various aspects, but the following issues the maneuver selection could be lane-level or samples in lanes.
are still not clearly addressed: In EM Motion Planner [13], the path-speed EM iteration inside
1) How to achieve safe and collision-free motion planning lane-level planning is proposed to generate optimal trajectory.
for vehicles in a high-speed environment with dynamic In E-step, static and dynamic obstacles are projected on the
obstacles; lane Frenét frame. In two M-steps, path and speed profiles
2) How to balance the relationship between long-term are generated by a combination of dynamic programming
decision-making and short-term planning to improve the and quadratic programming. [14] integrates discrete maneu-
efficiency; vering decisions with a model predictive control-based motion
3) How to make the decision-making and planning modules trajectory-planning scheme. Particularly, the uncertainties of
closely integrated to achieve reactive planning and re- the predicted vehicle’s states are explicitly discussed. These
planning. frameworks separate the path and velocity planning process,
For solving these problems, in this paper, we propose thus are inflexible to deal with the dynamic obstacles, espe-
a hierarchical long-term and short-term integration frame- cially in the re-planning process. In [15], the domain-specific
work that closely incorporates decision-making and planning, closed-loop policy tree (DCP-Tree) is proposed to outline the
as illustrated in Fig. 1. This framework makes use of the possible maneuvers in lane sequence of the ego vehicle, then
information of perception and prediction, making decisions conditional focused branching mechanism is well designed to
and generating trajectories executed by the control module. guide the branching in action and intention space by utilizing
The key characteristics of the framework are: (a) In the the domain-specific expert knowledge. The other kind of
solution space, the voxels are created and expanded in the method is to evenly spread sampling points in the longitudinal
spatio-temporal domain to search the optimal maneuver and and lateral directions, and determine the preliminary behavior
trajectory, thereby transforming the dynamic planning problem path through the interconnection of sampling points [16],
in two-dimensional plane into a static planning problem in [17]. [18] is a typical framework which contains the upper
three-dimensional space, solving issue 1. (b) The architec- behavioral trajectory planner for searching the macro-scale
ture belongs to a hierarchical form, where the long-term trajectory then determining the behavior, and also the lower
spatio-temporal decision module is used as the behavior motion trajectory planner for generating the micro-scale tra-
planning level, and the short-term dynamic dynamic module jectory. However, the adoption of sample-based searching
is used as the trajectory optimization and tracking module, method requires finding the equilibrium between the searching
as illustrated in Fig. 1. The design of different operating resolution and the time efficiency.
frequencies and parallel threads balance the periodic process of
B. Trajectory Planning in 3D Configuration
decision-making and planning, thus conquering the efficiency
problem of issue 2. (c) In the information flow, a closed-loop Trajectory planning in 3D configuration means planning
system with forward and backward feedback mechanism is in the passable plane with additional time dimension. Tra-
constructed among modules in different levels as the yellow ditional works on 3D trajectory planning can be roughly
arrows showed in Fig. 1. With the design of time trigger and classified into three categories: spatially-partitioned trajectory
event trigger mechanism, the decision-making and planning planning [19], temporarily-partitioned trajectory planning [18],
modules are closely combined to solve issue 3. [20] and direct method named spatio-temporal planning. The
spatially partitioned method aims to divide the 3D configu-
II. R ELATED W ORK ration into two 2D planes with respect to lateral and lon-
In this section, the related work are arranged in order of gitudinal direction, which reduces the scale of the planning
framework level, configuration domain level and struc- problem. Nevertheless, too many parameters are involved
ture composition level. We first outline the existing frame- in the searching process and optimization process in each
works that combine decision-making and planning and analyze decoupled sub-problem, which makes it troublesome to deal
their characteristics and shortcomings. Then, several methods with. While temporarily-partitioned method is based on the
are categorized by whether the state space is decoupled. assumption that position and speed can be decoupled, so it
By comparison, the spatio-temporal planning stands out for is inapplicable for those situations that depend on both posi-
its flexibility to cope with dynamic obstacles. And based on tion and speed planning, such as intersections or overtaking.
this conclusion, we finally focus on the structure of corridor In contrast, spatio-temporal planning is able to plan velocity
contained within spatio-temporal planning. Inspired by the and acceleration simultaneously due to the additional time
corridor, we propose to create a novel voxel structure and dimension, realizing obstacles avoidance of the static and
the ‘voxel expansion’ algorithm for generating the driving dynamic vehicles [21], [22]. In [23], the authors propose

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10367

to use the spatio-temporal state lattice to automatically and • Voxels and Expansion Algorithm: We propose to cre-
efficiently explore the states in both space and time domain. ate the voxel structure in spatio-temporal configura-
Furthermore, the direct method with optimization process tion. With the voxel serving as the vertices and the
should be provided with appropriate initial value and heuristic calculated weighted edges, a directed weighted traffic
objective functions for obtaining analytic solutions. graph (DWTG) that describes the passable space is
constructed, translating the complex traffic situation into
C. Generation of Spatio-Temporal Corridors the brief graph structure. Finally the voxel sequences
The spatial temporal corridor provides an idea to create are determined to find the proper maneuver and optimal
space describing the passable area. Compared to the plane trajectories.
cube designed in the static environment [24], [25], the spa- • Effectiveness: We make experiments on the released
tial temporal corridor can contain more dynamic evolutions data-set NGSIM, which describes the real-world driving
about the obstacles and environments. [26] builds a dynamic scenarios in highway. The success of decision-making
corridor that yields velocity obstacle-based (VO) maneuver and planning in different scenarios verifies the effec-
for real-time moving obstacle avoidance. Once the predictions tiveness of our proposed hierarchical framework, and
of the surrounding vehicles are accurate, the VO paradigm the comparison experiments on the ground truth and
will contribute to a set of safe velocities in discrete time, the corridor generation method in [22] demonstrate the
thus the planned speed profile is guaranteed to be collision- intelligence and efficiency of the algorithm.
free. [27] builds the corridors to partition the trajectory space
into discrete solution classes, such that all possible maneuvers
can be enumerated to find the global optimum as the opti- III. F RAMEWORK
mum trajectory among the maneuver variants. [28] proposes The detailed structure of the framework is illustrated
a four collision-free cells partitioning method to design a in Fig. 2, which contains a perception module, the LTBP
spatio-temporal transition graph. The generated corridors are thread, the STDP thread and visualization module.
actually the surrounding space excluded the obstacles. [22]
proposes to build a semantic corridor structure, and then
finds out the optimal trajectory constrained within the corridor A. Functionality of Each Module
by using piecewise Bézier curves. Though these mentioned The perception module is supposed to process the envi-
corridors are designed in spatio-temporal space, they lack the ronmental information obtained by the sensors, which mainly
detailed temporal information. For example, each part of the includes the environment manager to deal with road infrastruc-
corridor is supposed to be topology equivalent, and hardly any ture and static obstacles as well as the social agent manager for
physical character like velocity, acceleration, jerk could be find processing dynamic obstacle information (including real-time
in the corridor, therefore the superiority of the corridor in both and predicted states). Since the perception part is beyond the
time and space is not fully utilized. scope of the work, the relevant details will be omitted. The
Based on the analysis of the above literature, we can draw two threads in the middle are the core components of our
the conclusion that, in framework level, a unified framework framework, named LTBP and STDP.
is required that integrates decision making and trajectory The main task of LTBP is making long-term decision
planning, moreover, it is supposed to deal with the dynamic and optimal trajectory planning based on the approximate
environment and generate safe, efficient smooth and reac- prediction information of the surrounding vehicles, and also
tive trajectory. In configuration domain level, spatial-temporal providing a reference trajectory for STDP. There are total four
decision making and planning hold the advantage over flexi- managers in LTBP, where ego manager is the key part that
bility. And for the constructed corridor structure, the existed dominates each manager. Once the voxels and weighted matrix
works either regard the free space as a whole with equal are constructed based on the perception, they are delivered to
topology status or expand the corridor regularly according to the Dijkstra manager to find the best voxel sequence SV,M
the fixed step length, which neglect the dynamic feasibility of as the path boundary of maneuver M. Then the sequence
the vehicle and cause the impassable space involved. will be transformed into the optimal trajectory T r ∗ through
Motivated by the existed works and towards the above solving a QP problem by the QP optimization manager.
drawbacks, we propose our framework and summarize the Finally, the evaluation manager in LTBP will help to select
contributions as follows: the best maneuver and trajectory as the reference trajectory,
• Framework: We propose a hierarchical long-term and as in Algorithm 2 discussed in Section IV. STDP mainly
short-term integration framework that closely incorpo- aims to track and optimize the reference trajectory based on
rates decision-making and planning. The design of dif- the updated states of surrounding vehicles, which contains
ferent operating frequencies and parallel threads balance the dynamic planning manager, the RSS [29] safety check
the periodic process of each module. A closed-loop manager and the QP optimization manager. Based on this,
system with forward and backward feedback mechanism real-time obstacle avoidance and re-planning are managed in
is constructed among modules in different levels. With a frequency of 10Hz. The integration of these modules ulti-
the design of time trigger and event trigger mechanism, mately forms the maneuver and trajectory planner, enabling the
the decision-making and planning modules are closely autonomous vehicle to accomplish safe and efficient planning
combined. in a dynamic environment.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10368 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022

Fig. 2. The detailed structure of the proposed framework, with two core modules LTBP and STDP running in two parallel threads. LTBP is running every
4s, responsible for expanding voxels in 3D configuration and generating optimal trajectory to each maneuver by solving the QP problem. STDP runs in 1Hz,
tracking and optimizing the reference trajectory based on the short-term prediction results. RRS Checker is the embedded module in STDP, which runs in 10Hz
to check whether there is a potential risk. If an obstacle suddenly appears, the event trigger of STDP will be enabled, and with the failure of STDP, the event
trigger of LTBP is activated, forming the forward and backward closed-loop planning system.

B. Operation Mechanism and the re-planned maneuver will be made until the danger is
In our framework, we use two parallel threads for the eliminated as in Algorithm 1.
execution of the two modules, as shown in Algorithm 1. The above-mentioned operation mechanism is just the high-
Furthermore, the operation frequency f L of the LTBP module light of our framework. Different operation frequencies cause
is 0.25Hz, with planning horizon 5s (in line with the general the module to make effects in combination with its functional-
prediction horizon), while the operation frequency of STDP ities. LTBP runs at a relatively low frequency, which reduces
is 1Hz, with the planning horizon 1s. Both modules adopt the burden of repeating the decision-making and planning
the dual trigger mechanism of time and event trigger, that process, and alleviates the unstable swerve of strategy. STDP
is, once the event trigger is enabled, the module will restart runs at a frequency of 1Hz, which meets the requirements
immediately. The motivation of this mechanism is for the of the planner for obstacle avoidance and tracking reference
safety and efficiency. Because the planning horizon of LTBP is trajectories. At the same time, the RSS safety detection
5s, this means that in such a long planning period, the states of module runs at a higher frequency for real-time and accu-
the surrounding vehicles are generated by prediction. Whether rate obstacle avoidance. When the system detects a danger
the states derive from the analysis of the physical kinemat- in the future, the STDP will restart immediately, and if it
ics model [30] or the usage of neural networks [31], [32], fails, the LTBP decision planner will restart. In this way,
the uncertainty will increase over time, which brings diffi- a hierarchical closed-loop decision-making planning system
culties to accurate decision-making and planning. In order to with information transmitted forward and backward is formed
make the vehicle handle the dynamic obstacles and be reactive as Fig. 1, making it more reactive and efficient to plan and
to the emergencies accurately, we develop the STDP, which re-plan in the highly dynamic environment.
regards the optimal trajectory T r ∗ generated by LTBP as the
reference trajectory and considers the states of social vehicles IV. M ETHODOLOGY
P in a short period. The STDP module is conducted every 1s. In this section, we introduce the voxel structure and also
At the same time, the RSS Checker contained in STDP runs its generation and expansion algorithm in the Frenét [16]
at a frequency of 10Hz to check whether there is a risk of coordinate system (s is the direction along the road that
collision in the next 0.2s. Once a potential collision is detected, originates from the initial position, and d is the lateral position
the event trigger ‘event_S’ of STDP module is enabled, and perpendicular to s direction from right side to the left side).
the STDP module will re-plan immediately with the updated The overall process from the generation of voxels till its expan-
perception. If the planning result of STDP ‘success_S’ turns sion and finally the acquisition of the optimal trajectory are
out to be successful, the trigger will be released, otherwise the illustrated in Algorithm 2, which will be explained thoroughly
event trigger ‘event_L’ of the LTBP module will be activated in this section.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10369

Algorithm 1 Operation Mechanism lane (with the same vd ), it means that the planned trajectory
1: success_L = False, success_S = False derives from the lane keeping behavior. If the voxels in the
2: event_L = False, event_S = False, event_RSS = False sequence are from different lanes, it indicates the lane change
3: f L = 0.25Hz, f S = 1Hz, f R S S = 10Hz, behavior.
4: Thread 1: The generation and expansion of voxels could be illustrated
5: if f L or event_L then: by Fig. 3, which actually describes the situation where the ego
6: success_L = LTBP.running() vehicle (blue) is driving on the middle lane, with two surround-
7: if success_L then: ing vehicles (orange, grey) driving aside. Since human drivers
8: event_L = False do not make frequent lane-change decisions, we assume that
9: end if in a planning cycle, the vehicle will only make lane-change
10: end if decision at most once. Therefore, voxels are only generated in
11: Thread 2: the current lane (blue) and the adjacent lanes (yellow, green).
12: if f S or event_S then: Assuming that the voxel ViL in interval ti = t (with the initial
13: success_S = STDP.running() time t0 = 0 in TL ) has been determined, then the voxel Vi+1L of

14: if success_S then: interval ti+1 = t +t L , t L = 1s is determined as formula (1)


15: event_S = False (superscript L ∈ {C,L,R} represents the current, left, right
16: else lane), where v 0 is the initial velocity in this planning cycle,
17: event_L = True v tmin
i+1
, stmin
i+1
represent the minimum velocity at the beginning of
18: end if interval ti+1 , and v tmax
i+1
, stmax
i+1
represent the maximum velocity
19: end if and longitudinal position at the end of interval ti+1 .
20: if f R S S then: vmin − v 0
21: event_RSS = RSS.running() Tt ovmin = ,
amin
22: if event_RSS then: v tmin = max(vmin , v 0 + amin t),

i+1
23: event_S = True
⎪ 1
24: end if ⎨ s0 + v 0 t + amin t 2 , if t < Tt ovmin ,
25: end if stmin = 2
i+1 ⎪ v 02
⎩ s0 + , if t ≥ Tt ovmin ,
2amin
Algorithm 2 Overall Process of Voxels Generation, Expansion vmax − v 0
Tt ovmax = ,
and the Acquisition of the Optimal Trajectory amax
Input: p0 , v 0 , a0 , P v tmax = min(vmax , v 0 + amax (t + t L )),
i+1

Output: optimal trajectory T r ∗ ⎪ 1

⎪ s0 + v 0 (t + t L ) + amax (t + t L )2 ,
function OVERALL P ROCESS( p0 , v 0 , a0 , P) ⎪
⎪ 2


2: O passable = GetPassableSpace( p0, v 0 , a0 ) ⎪
⎨ if t + t L < Tt ovmax ,
Ooccupied = GetSocialStates(P) stmax = 1 (1)
i+1 ⎪ s0 + v 0 Tt ovmax + amax Tt2ovmax
4: O puri f ied = GetPurifiedSpace(O passable, Ooccupied ) ⎪
⎪ 2


T = GetVoxelTopology(O puri f ied ) ⎪ + (t + t L − Tt ovmax )vmax ,


6: SV = DijFindVoxelSequence(O puri f ied , T ) ⎩ if t + t ≥ T
L t ovmax
{T r } = Optimization(SV, p0 , v 0 , a0 )
8: T r ∗ = EvaluateTrajectory({Tr }, P) It can be analyzed that the starting position of Vi+1 L is
end function the result of deceleration behavior by amin from the initial
planning time till t (minimum 0m/s), while the end position
of voxel is the result of acceleration by amax until reaching
the maximum velocity till t + t L . Meanwhile, the initial
A. Generation and Expansion of Voxels
acceleration at t0 could be considered for better precision. Thus
Voxel V is a tri-dimensional space, including vs , vd , vt , vl , all the positions in voxel are reachable under the maximum
where vt is the unit interval of the long-term planning horizon, velocity and acceleration constraints (The more accurate cor-
vs , vd (in terms of lanes) represent the center of the voxel in ridor should be trapezoidal, however, the tedious space could
longitudinal and lateral direction, and vl is the length of the be excluded by the velocity constraints.).
voxel in s axis (the width of the voxel vd in lateral direction By utilizing formula (1), the maximum free driving space
is the lane width by default). Voxel V actually represents O passable of the ego vehicle in lane level is detected, then
the collection of all possible positions of the vehicle during the predicted states of the N S surrounding vehicles P in long
interval vt . Therefore, arranging the voxel into a sequence term are considered as the occupied space to get the purified
in the order of time constitutes the distribution space of the space as formula (2). According to the predicted positions of
possible trajectories of the vehicle in the planning horizon, the social obstacles, we remove the occupied space Ooccupied,i
which is also the feasible domain of the optimization problem. from voxel ViL , and the voxel whose vl is greater than a certain
In our work, there are total 5 intervals of 1s, t1 , t2 , t3 , t4 , t5 in threshold like (3) is considered as the valid free space and
the planning horizon TL . If all the voxels belong to the same is retained (L min is the defined threshold of minimal voxel

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10370 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022

Fig. 3. A schematic diagram illustrating the process of the proposed method. (a) A traffic scene where the ego vehicle is driving with two social vehicles
(orange vehicle on the left and grey vehicle on the right). (b) The generated voxels in 3D configuration by our proposed algorithm, with blue voxels in the
current lane, yellow voxels in the right lane and green voxels in the left lane. These voxels are actually the set of space without obstacles occupied. (c) The
abstract vertices derive from the voxels. (d) Types of voxels and edges of the formed traffic graph. (e) The topology of the connected voxels. (f) The planned
trajectory of each maneuver.

length, and L v is the length of vehicle.). After that, we finally


get the NV voxels SV = {ViL |i ∈ NV , ti ∈ TL , L ∈ {C,L,R}} in
each considered lane within each planning interval ti , and each
voxel will be regarded as a vertex as Fig. 3 (c) shows. In terms
of the spatio-temporal expansion characteristics, voxels are
categorized into the ego-centric ones in the current lane and
the others in the adjacent lanes as (d) in Fig. 3.
O puri f ied = O passable − Ooccupied , Fig. 4. A schematic diagram depicting the connection topology of the voxels.
With the obstacles colored red, the voxels in planning time t4 , t5 are divided
O passable = {ViL |i ∈ NV , L ∈ {C,L,R}}, into two isolated voxels. Then two topology named (1) and (2) are formed,
leading to two maneuvers: (1) decelerating and following the social vehicle;
Ooccupied = {(sit , dit )|i ∈ N S , t ∈ T, t = 0.2s}. (2) (2) accelerate and driving in front of the social vehicle.
vl ≥ L min + L v , (3)

ra = amax − amin . (5)


B. Directed Weighted Traffic Graph f spatial (ti , t j ) = f lane change + wt t c f t t c . (6)
With NV voxels created, we categorize them by lanes,
thus constituting the set S = SC ∪ S L ∪ S R like in Fig. 3 It is noteworthy that in each interval ti of each lane,
(b), which could also be regarded as vertices as c). Next, the number of voxels could be more than one due to the
an adjacent matrix A ∈ R NV ×NV is generated, which illustrates existence of social obstacles, as illustrated in Fig. 4. In Fig. 4,
the topological relationship between voxels. According to (4), the voxels are created in a specific lane in time sequence,
the (i, j ) element of adjacent matrix A : ai, j = 1 means that hence the d axis is skipped. As in interval t4 , t5 , a social
two voxels Vi , V j are physically connected. obstacle appears, dividing the voxels V4 , V5 into voxels
[V4,1 , V4,2 ], [V5,1 , V5,2 ]. Since both of the temporal topol-
|Vi .vs − V j .vs | ≤ Vi .vl /2 + V j .vl /2 + L v . (4) ogy (1)V3 − V4,1 − V5,1 and (2)V3 − V4,2 − V5,2 exist,
There are two types of topological relationships (edges) it consequently leads to two maneuvers: (1) decelerating and
between different voxels. One kind of topology lies in the following the social obstacle, (2) accelerating and overtaking
voxels in the sequential interval with the same lane, serving the social obstacle. The existence of a spatial edge means
as the temporal edges. The other kind of topology belongs that there is a possibility of lane change in the current
to the voxels in the same interval but different lanes, serving voxel. Here, we evaluate the cost of changing lanes mainly
as the spatial edges as (c) in Fig. 3. If the temporal edge from the perspective of the distance from the obstacle, and
exists in voxel ViL , Vi+1
L , then the weight is calculated as the calculation can be referred as formula (6), where wt t c
formula (5). The effective factor that we considered is the and ft t c are the weight and term of time-to-collision (TTC)
‘transition coefficient’ of the two voxels at the subsequent respectively.
intervals. The larger the transition coefficient is, the higher the However, it is not wise to give up changing lanes due to
overlap ratio of the two voxels is, indicating that the vehicle the cost, because in some cases, lane changes are necessary
can cover the two voxels with velocity changed in a wider for the vehicle to drive more safely at the desired velocity.
range, thereby relaxing the constraints of planning. Therefore, in order to maintain the diversity of maneuvers,
we will generate three maneuvers {MC , M L , M R } including
a(t, t + t L )
f temporal (ti , ti+1 ) = f transition = 1 − , Lane Keeping, Left Lane Change and Right Lane Change.
ra For each maneuver, the voxel at the end of the planning
a(t, t + t L ) = 2[(vi,s + vi,l /2) − (vi+1,s interval of the corresponding lane is regarded as the target
− vi+1,l /2)]/t 2 , vertex. Then the Dijkstra algorithm [33] is employed to search

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10371

for a voxel sequence SMi that starts from the starting voxel trajectory smooth and comfortable enough, we choose jerk
to the destination with the lowest cost as the result of the as the trajectory optimization function (with optimized order
decision, as in Fig. 3 (e). Concatenating the voxel along the m opt = 3). In terms of constraints, to ensure the consistent
time sequence, the corridor that indicates the passable space of continuity of the planning results in sequential planning cycles,
(0) (1)
the ego vehicle is constructed. Note that in each interval, only we take the initial position p0 , velocity v 0 = p0 , and
(2)
one voxel is selected. If the strategy is lane change, two voxels acceleration a0 = p0 as equality constraints. As for the
are involved at the optimal lane-change interval, which are destination, we regard the distance covered with desired speed
then merged into one voxel (the merged voxel means the start as the expected target. So in s direction, with no obstacles
interval of the lane-change action rather than the completion ahead, pe equals the desired spanned distance v desired TL
of the action). within the maximum distance sTmax L
travelled in TL . Otherwise,
the maximum destination is set to keep the safe distance
C. Trajectory Optimization–QP Problem dsafe from the front obstacle by criterion like time-to-collision.
After the voxels are generated, a passable space of the Meanwhile, the connection between each Bézier segment
vehicle trajectory is actually determined. Here, similar to [22], should be consistent in C 2 . The inequality constraints mainly
[34], we use piecewise Bézier curves to generate the trajectory include the velocity p(1) (1)
min , pmax , acceleration range constraints
of the vehicle. The difference is that our voxels are generated p(2) (2) (0) (0)
min , pmax and the upper and lower limits pmin , pmax provided
relatively strictly according to the dynamics model of the vehi- by the voxel boundary. The construction of the QP problem
cle in each time interval, so the redundant space against the is similar to [22], so great details will not be reiterated here,
dynamic feasibility is potentially eliminated, which contributes and the reader interested in this problem can find proofs in
to the convergence of optimization. Besides, the width of the the references.
voxels are lane-leveled, so the optimized trajectory is directly
 TL  3 μ

associated with the corresponding maneuver. TL
d b j (t)
The Bézier curve b(t) is a special spline curve, defined F LT B P = Fj = ,
dt 3
in the fixed interval t ∈ [0, 1], consisting of the control j =1 j =1 μ
j μ (k)
points C = [c0 , c j , · · · , cm ] and the Berstein basis bm (t) [35], b1 (0)k = p0 ,k = 0, 1, 2,
as in (7) where m is the degree of the basis (also the trajectory μ
bTL (TL ) = pe ,
order). In trajectory planning, the Bézier curve is widely used ⎧
to describe the trajectory because of the convex hull property, ⎪
⎪ min, (sTmax , v desired TL )


L
hodograph property and the inherent continuity. The convex if no preceding obstacle,
pe =
hull property implies that the Bézier curve is strictly bounded ⎪
⎪ min, (sTmax , Pahead .s − dsafe − m )

⎩ L
within the convex hull formed by the control points (including if others.,
the boundary), so as long as the convex hull formed by the con- μ μ
trol points is collision-free, the planned trajectory is guaranteed bi (ti,−1 )(k) = bi+1 (ti+1,0 )(k) , i = 1, 2, 3, 4,
μ
to be safe. The hodography property means that the derivative p(k)
min ≤ bi (t)
(k)
≤ p(k)
max , i = 1, 2, 3, 4, 5. (9)
of the curve is still a Bézier curve, and the control points of
the derivation are the linear combination of the control points
of the source curve, which makes the variable constraints
in the optimization problem a linear inequality, simplifying D. Trajectory Evaluation
the complexity of optimization. In this work, we assign a
μ After the above steps, at most three kinds of maneuvers
piecewise Bézier curve bi (t) in dimension μ = [s, d] to each
and the corresponding optimal trajectory are generated. Next,
voxel Vi in the searched voxel sequence {SMC , SM L , SM R }.
we define an evaluation function in the linear combination
Since the voxel covers a range of the unit planning interval
of factors like safety f t t c,Mi , the difference from the desired
(1s in LTBP) in t-axis, only a simple transition is required for
velocity f v d ,Mi , and the behavior cost f Mi as in (10).
meeting the defined interval of Bézier curve. The definition of
The coefficients of the function can be adjusted reason-
the Bézier curve for voxel Vi is as shown in the formula (8).
ably according to the specific traffic environment and actual
m
j decision-making and planning requirements. In this paper,
b(t) = c j bm (t),
under the premise of ensuring safety, we expect the lane
j =0
  change maneuver to make the speed-up, thereby improving
j m j the efficiency. Conversely, the lane change without velocity
bm (t) = t (1 − t)m− j . (7)
j optimized or safety guaranteed are not supported by our plan-
μ
m
μ j ner. Ultimately, we have made a trade-off on the coefficients
bi (t) = c j,i bm (t −ti ), t ∈ [ti , ti +t L ], μ ∈ [s, d]. of the factors and a trajectory with the lowest cost will be used
j =0 as the reference trajectory T r ∗ .
(8)
The objective function and constraints of the QP problem FMi = f t t c,Mi + f v d ,Mi + f Mi ,
are formed as formula (9). In order to make the planned T r ∗ = min{FMi }, i ∈ {C,L,R}. (10)

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10372 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022

E. Short-Term Dynamic Planning algorithm and framework in Python3.6 based on ROS (Robot
For more efficient and long-term decision-making and plan- Operating System) with a computer equipped with an Intel I5-
ning, LTBP runs every 4s with a planning horizon TL = 5s. 3470 CPU and an NVIDIA GeForce GTX 2080-Ti graphics
However, in such a long planning horizon, the uncertainties of card. The decision planning system can run stably at the
the prediction results of the surrounding vehicles may increase. designed frequency, which will be illustrated in subsection F
Therefore, we design an STDP thread for re-planning tasks in detail.
within a short horizon of 1s. Inside STDP, an RSS Checker n(t) ∼ N (0, σ0 (t − t0 )),
is running with a frequency of 10Hz. Based on the relative
Pt,n = Gt,n + n(t). (13)
accurate prediction results of the surrounding obstacles in
the next 0.2s (supposing the prediction is accurate in the The overall experiments contain 4 parts:
short term), the minimum safe distance dsafe is calculated Part 1: The effectiveness of the decision and planning
to determine the potential collision of the ego vehicle as in framework in different traffic scenarios including the scenario
formula (11), where tr is the reaction time, v is the velocity with static obstacles only and the scenarios with congested
of the ego vehicle, and v f is the velocity of the front vehicle. dynamic traffic participants. And in the dynamic environment,
Once the RSS warning ‘event_RSS’ emerges, the STDP event lane keeping and lane change behaviors are tested respectively.
‘event_S’ is triggered. These experiments just verify the applicability and mobility of
In the STDP planner, the re-planning process starts based our proposed framework in diverse scenarios.
on the updated prediction results and the referenced trajectory Part 2: The comparison experiments between our proposed
T r ∗ . Here, we construct a smaller-scaled QP problem. The voxel generation and expansion algorithm and the seed-based
μ
optimized object contains the jt h Bézier segment b j (0)(k) fixed step length corridor expansion algorithm [22]. The exper-
inside voxel V j , with an objective function considering the iment will demonstrate the superiority of the voxel structure
deviation to the reference trajectory. Constraints are the same and also the efficiency of the optimization process with the
as in LTBP (but the boundary is updated with new vehicle redundant space eliminated by the voxel.
states). When STDP is successful, the trigger is disabled and Part 3: Comparison experiments between the results planned
the RSS continues running. If STDP fails, the LTBP event by our proposed framework and the ground truth generated
trigger ‘event_L’ is enabled, and LTBP will be renewed under by human drivers in terms of safety, efficiency, and comfort,
the condition of the updated prediction states. which will prove the reliability and intelligence of our pro-
v 2f posed framework.
1 (v + tr amax )2
dsafe = vtr + amax tr2 + − . (11) Part 4: The time consumption of the proposed LTBP and
2 2amin 2amin STDP. The efficiency indicates the real-time performance of
 d3 bμj (t) the proposed framework. And the parameters used in our
F ST D P = F j = ,
dt 3 experiments will be shown in Table I.
μ
μ (k)
b j (0)(k) = p0 , k = 0, 1, 2,
A. Drive With Static Obstacles
(k) μ
pmin ≤ b j (t)(k) ≤ p(k)
max . (12) The static environment is relatively simple for the ego
vehicles to avoid obstacles. As we pointed out in [22], in the
spatio-temporal domain, static obstacles can be regarded as
V. E XPERIMENTS objects perpendicular to the driving plane through the time
In order to verify the effectiveness of our proposed frame- axis at their locations. Therefore, in each unit time ti , static
work, we adopt the publicly released data-set NGSIM col- obstacles within the perception range should be considered.
lected in the highway environment for testing. The NGSIM As shown in Fig. 5, taking the starting position of the vehicle
US101 database was collected in America, which records the as the origin of the s coordinate, we place the static obstacle
various traffic situation during morning rush hour, including 1 at s = 100m in lane 2 where the vehicle is located,
the traffic congestion and normal highway driving, etc. In the an obstacle 2 at s = 220m lane 3, and an obstacle 3 at
data-set, we arbitrarily select a vehicle in the scene as the s = 320m in lane 1. Additionally, in order to examine the
ego vehicle, and regard the vehicles within a certain range deceleration to stop performance of the framework, we set up
(±100m) around it as the obstacles in the perception field. obstacle 4 across multiple lanes at 500m in the s direction
Since the predicted states of the social vehicles are required (which can also be regarded as stop signs or traffic signals
in the process of voxel generation, we add a noise n(t) to that prevent the vehicle from moving forward, and the safety
the ground truth Gt,n of vehicle n as the predicted states distance is set to be 50m by default) to check whether the
Pt,n . As the research [31], [36] reveal, the uncertainty of vehicle will slow down to stop. Since in sensors manager
prediction is increasing with respect to time, so the variance that we designed, the static and dynamic obstacles are treated
of the gaussian noise σ0 is a function with respect to time separately, the planner stipulates that for static obstacles ahead
in formula (13). In the future, we will employ the sequential of the ego vehicle within the sensing range of the current
neural network or hidden Markov process for prediction in lane, penalties will be imposed on the planning results, thereby
combination with our proposed decision planning framework, prompting the vehicle to keep far away from obstacles or to
making the framework more systematic. We completed our change lanes.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10373

Fig. 5. The top view of the whole planned trajectory with static obstacles. The obstacles are: Obstacle 1: lane 2, 100m ahead, Obstacle 2: lane 3, 220m
ahead, Obstacle 3: lane 1, 320m ahead and Obstacle 4: 500m, covering from lane 1 to 5. The velocity of the trajectory is described in red series color.

TABLE I
PARAMETERS C ONFIGURATION

Based on the planning rules, in the first planning cycle, since


obstacle 1 is detected in the current lane ahead, the LTBP
makes a decision to change lane. As for the left and right
lanes, the penalty item of left lane change is higher because
of obstacle 2 (the locations of static obstacles are known).
Consequently, in this planning cycle, the vehicle makes the
decision to change lane to the right. After that, the vehicle
stays in the lane until it encounters another static obstacle
3 in lane 1. As the right side is out of boundary, voxel
can only be expanded along the current lane and the left
lane. Also, because the current lane is occupied by obstacles,
the spatio-temporal voxel can only be expanded to the left
lane from the current lane, leading to the left-lane-change
decision and the associated smooth trajectory. After that,
in the free environment, the vehicle keeps driving in the lane Fig. 6. The planning results of scenarios with static obstacles by our
until obstacle 4 is detected. Because obstacle 4 occupies all framework, displayed in terms of positions in s direction, positions in d
the accessible lanes, under the premise of keeping a safe direction, velocity and acceleration in longitudinal direction.
distance from static obstacles, the voxel is limited to extend in
the longitudinal direction, forcing the vehicle to slow down.
As approaching obstacle 4, the vehicle decelerates smoothly complex traffic scenarios with dense dynamic obstacles to test
to stop. The whole planned trajectory with velocity profile can our framework. Furthermore, the generation and expansion of
be seen in Fig. 5, and Fig. 6 gives a detailed description to the voxels and the formed traffic graph will be displayed.
velocity and acceleration profile. Seen from Fig. 6, at around Dynamic scenario 1 describes the situation where the ego
t = 16s, the vehicle accelerates to change lane to the free vehicle is driving with lots of surrounding social vehicles
left lane. And when obstacle 4 is detected, the acceleration as in Fig. 7, and the social vehicles are represented by
decreases to negative to make the velocity decrease smoothly. blue rectangles, triangles and ellipses at the beginning time
Finally, the vehicle stops comfortably in front of the obstacle of planning cycle 1, 5, 10. In order to keep pace with
within the safety distance. the social vehicle group, we set the average speed of the
traffic group as the desired velocity. In Fig. 7 a) and Fig. 8,
given the initial velocity v 0 = 12m/s, the vehicle accelerates
B. Lane Keeping in Cluttered Environment at the early age, then drives in an almost constant speed.
Since the generation and expansion of voxels are closely During the planning stage, the lane where the ego vehicle
related to the surrounding dynamic obstacles, we select two is located has always maintained a relatively good driving

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10374 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022

Fig. 7. The top view of traffic scenario 1 with dynamic obstacles and the generated trajectories in 3D configuration. The ego vehicle in red is depicted
in every cycle, whose trajectory and velocity profile are represented in the red-series color. And the distribution of social vehicles at the beginning time of
cycle 1, 5, 10 are depicted by blue rectangles, triangles and ellipses, with their trajectories in blue dash lines.

Fig. 8. The planning results of scenario 1 with dynamic obstacles by our Fig. 9. The planning results of scenario 2 with dynamic obstacles by our
framework, displayed in terms of positions in s direction, positions in d framework, displayed in terms of positions in s direction, positions in d
direction, velocity and acceleration in longitudinal direction. direction, velocity and acceleration in longitudinal direction.

generated on the current lane strictly accords with the range


condition. Therefore, the ego vehicle always makes a behavior
defined by formula (1). Eventually, the planned trajectory
decision to keep lanes. The reasons why the vehicle decides
in 3D configuration are shown in Fig. 7 b), c), d). Seen
to keep lanes might be: (1) The current lane has proper
from Fig. 8, the generated trajectory is smooth in both spatial
driving condition, such as free driving or the obstacles do
domain (path) and temporal domain (velocity and accelera-
not influence driving; (2) There are no proper condition for
tion).
lane change, such as the adjacent lanes are occupied or there
is a potential risk. (3) There is a possibility to make a lane
change, but the cost paid for lane change is high enough. C. Lane Change in Cluttered Environment
Based on this, the planner will make a decision to keep It is relatively difficult to make a lane-change decision
the lane under the comprehensive consideration of maintain- in a complex and highly dynamic environment, because the
ing safety, improving driving efficiency and saving energy social vehicles in the departure and also the target lanes
consumption. should be considered. Seeing the dynamic scenario 2 depicted
Since the vehicle always keeps in the current lane and in Fig. 10 a), since social vehicle 34 is in front of ego
drives freely without obstacle blocked, the size of the voxels vehicle, in order to be more efficient, the ego vehicle in red

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10375

Fig. 10. The top view of traffic scenario 2 with dynamic obstacles and the generated voxels, DWTG and trajectories. The ego vehicle in red is depicted
in every cycle, whose trajectory and velocity profile are represented in the red-series color. And the distribution of social vehicles at the beginning time of
cycle 1, 5, 8 are depicted by blue rectangles, triangles and ellipses, with their trajectories in blue dash lines. The results of our proposed algorithm in particular
cycles are illustrated in b), c), d).

makes decision to change lane to the right at the first cycle. plane into a static planning problem in 3D space. Particularly,
After the lane change, the ego vehicle is driving ahead of cycle 8 displays a situation of emergency where vehicle
social vehicle 47. Particularly, viewed on a two-dimensional 44 behind the ego vehicle accelerates suddenly (since the
plane, the trajectories of the ego vehicle and social vehicle interaction of the vehicles are not considered in the work,
34; 47 overlap, whereas, they are misplaced in spatio-temporal vehicle 44 accelerates with regardless of the ego vehicle,
domain. From the view of spatio-temporal domain, it is clear which can also be regarded as an abnormal acceleration.).
that the space of the planned trajectory and the space occupied At this moment, the ego vehicle cannot successfully avoid the
by obstacles do not overlap, so it will not cause a collision. rear-end collision by accelerating (exceeding the maximum
This is just the advantage of the 3D spatio-temporal decision speed). Therefore, the ego vehicle manages to change lanes
planning, which can provide the intuitive motions in time and seizes the proper opportunity to successfully change lanes
series, then transfer the dynamic planning problem in the to the left lane between vehicle 21 and 25.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10376 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022

Fig. 10 b), c), d) show the generated voxels, the DWTG as


well as the planned trajectories of planning cycle 1, 5, 8 in
dynamic scenario 2. The first cycle demonstrates a situation
where the ego vehicle decides to change lane to the right.
Because the left lane (t1 , t5 ) and the current lane (t5 ) are
occupied by the social vehicles in sub-figure b.1), the possible
maneuver is switching to the right lane. And due to the
existence of the social obstacles, the voxels in the right lane
are smaller in size (s direction), which implies that the space
bounded by the voxels is collision-free. The topology of the
DWTG is shown in b.2), where the cost of spatial edge is Fig. 11. An example of the 3D voxels generated by our algorithm. The voxels
always greater than the temporal edge considering the lane can be generated more efficiently and thoroughly according to the motion
model of the ego vehicle, the lanes and the planned unit time. For each lane
change cost except for infinity. Sub-figure c) shows the results at each unit time, if there is no obstacle, then voxels will be generated using
of planning in cycle 5. In the left lane showed in c.1) and the motion model, and if there are occupied obstacles, the free space will be
c.2), the absence of weighted edge between vertices ‘T1LL’ divided into more than one passable voxels or no voxels.
and ‘T2LL’ indicates the unconnected topology. Moreover,
at t4 , two voxels are generated in the right lane and left
lane respectively. From the content of Section IV B, we can
figure out that if the ego vehicle plans to change lanes to
the right/left at this moment, two strategies can be adopted.
One is following the vehicle after deceleration, and the other
is accelerating to overtake the social vehicle in the right/left
lane and change lanes in front of it. And the same situation
happens in planning cycle 8 in the left lane. As we address
in Section IV B, we will generate voxel sequence for each
maneuver, so the voxel sequence with the destination ‘T5LL1’,
‘T5LL2’ (Left Lane Change) and ‘T5LR1’, ‘T5LR2’ (Right Fig. 12. An example of the 3D corridors generated by [22]. The corridors
Lane Change) as well as the lowest cost will be selected by are generated according to the state seeds originated from the multi-policy
decision making (MPDM). For each seed, the expansion is conducted in 6
Dijkstra algorithm in cycle 5. Fig. 10 d) illustrates the detailed directions until encountering obstacles.
situation of cycle 8. In the current lane, vertices ‘T4LC’ and
‘T5LC2’ have no ingoing vertices, so the edges originate TABLE II
from these vertices weight infinity in case the vertices are P ERFORMANCE C OMPARISON B ETWEEN THE G ROUND T RUTH AND THE
selected by Dijkstra algorithm. In all these cycles, the planner P LANNED T RAJECTORY
succeeds in executing the maneuvers, and the planned tra-
jectories of the ego vehicle (red) as well as the predictions
of social vehicles (grey) in these three cycles are shown in
sub-figure b.3), c.3), d.3) respectively. By constructing the
voxels and calculating the weighted edges, the complex traffic
scenario can be abstracted into a brief graph structure, thereby
transforming the sophisticated dynamic planning problem into
a quantified graph search problem.

The expansion step in each direction and the resolution in each


D. Comparison Between the Generation of Corridors and dimension are all empirically set, consistent with [22].
Our Proposed Voxels Fig. 11 and Fig. 12 respectively show the results of cor-
In order to compare the difference between our method of ridor generation method and the voxel generation method
voxel generation (Hereinafter referred to as VOXEL) and the (red rectangles represent the sequential positions of social
corridor construction proposed in [22] (Hereinafter referred vehicles). By carefully comparing the differences between
to as CORRIDOR), we randomly select some traffic scenes them, the following conclusions can be drawn:
for experiments. The generation of corridor is completed by 1) Expansion mechanism: CORRIDOR adopts a fixed step
ourselves with [22]. And for the sake of comparison, we also length when expanding, causing it necessary to bal-
set the maximum expansion range of each corridor on the time ance the resolution of the map and the step length
axis to 1s, and the maximum expansion range of the corridor to ensure both the accuracy and operating efficiency;
in the positive direction of s is the distance after driving while VOXEL directly determines the range of motion
with maximum acceleration till the last considered moment according to the dynamics model of the vehicle, which
(no greater than vmax ), the maximum expansion range in the is more concise;
negative direction of s is the distance after driving with the 2) Time Axis: There is no strict time limit for the gen-
maximum deceleration till the last moment (no less than vmin ). eration of corridor, so the range on the time axis of

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10377

TABLE III
T IME C ONSUMPTION OF E ACH P HASE

the corridor may vary, resulting in the overlap even the surrounding vehicles. In scenario 2, the average safety
space redundancy. Nevertheless, the advantage is that the distance of the trajectory we planned is also larger. Therefore,
corridor will be more flexible in time axis with precise our planner is more reliable in terms of security. In terms of
resolution. VOXEL is strictly generated bounded by the efficiency, we consider the average speed of the trajectory. The
unit planning time t L , so there is no overlap on the results show that the average speed planned by our framework
time axis. As for the precision of time, we adopt two is higher, which proves the efficiency (the results are calculated
methods to compensate for the accuracy. One is to loose with v desired = 18.0m/s, the average velocity of the vehicle
the boundary of the generated voxel when optimizing, group). In terms of comfort, we consider the average accel-
the relaxed factor m will give tolerance to the uncertainty eration of the vehicle (the sum of acceleration in s direction
risen by prediction. The other is the usage of RSS divided by the total length). Because our framework adopts
Checker, which considers the states of social vehicles in the jerk term as the optimization function, the acceleration is
each 0.1s. ensured to be steady. In the lateral direction, as a result of
3) s-Axis and d-Axis: The expansion order of the corridor the added constraints of the lateral velocity and acceleration,
from seeds is s direction, d direction and finally the the lateral acceleration is small (the lateral acceleration for
time axis, which will cause space omission (the lateral both our framework and the ground truth are very small, which
expansion will be restricted after the longitudinal length is are not listed in the table). In summary, the trajectories planned
determined). VOXEL determines the size in three dimen- by our planner are superior to human driving trajectories in
sions at the same time, and the lateral size of the voxel is terms of safety, efficiency and comfort. On the one hand, it can
equal to the lane width. In this way, every possible space demonstrate the reliability and intelligence of our proposed
of each lane is considered, and the lane-width structure framework, and on the other hand, it also benefits from the
makes it convenient to make maneuvers in lane level. reasonable prediction of the surrounding vehicles.
4) Time Consumption: In terms of time consumption,
VOXEL takes advantage. This is because the size of
the voxel is determined at the same time, while the
corridor requires expansion along 6 directions separately. F. Time Consumption
The average time consumption of VOXEL is 1.5−3.0ms,
and the time consumption of CORRIDOR is more than The experiments on time consumption of the framework is
60ms, which is 20 times as long as our algorithm. (For carried out under the same hardware conditions as the above
both algorithms, the obstacles information are traversed experiments. We tested the time consumption of 4 parts of
for collision detection. Converting the obstacles informa- the framework, which are the time consumption of the traffic
tion into sequential grid maps, the searching process will graph determination in LTBP, the time consumed by the QP
be more time-efficient). optimization problem, the optimization problem in STDP and
the RSS Checker (because the frequency of RSS Checker is
10Hz, the statistical results are the summation of repeat for ten
E. Comparison of the Ground Truth times). The statistical results are shown in Table III, and the
In order to verify the reliability and intelligence of our unit is ‘ms’. For the determination of traffic graph, the time
proposed framework, we conduct experiments compared with consumption is 5ms, including the generation and expansion of
the ground truth derived from human drivers. In this work, voxel, and the determination of topological relations as well
we refer to [15] to evaluate the trajectory from the perspective as the weighted matrix. The optimization process of LTBP
of safety, efficiency and comfort. For safety, we consider takes about 50-80ms. By solving the quadratic programming
the distance between the vehicle and the nearest obstacle problem including equality constraints, inequality constraints,
in the same lane. When the distance is less than the safe and boundary condition constraints, the optimal trajectory
distance threshold (20m), we consider it to be a potential within TL planning horizon is obtained. For STDP, since the
dangerous situation. Finally, we calculate the average distance optimization target is a relatively short trajectory in horizon
for each time the distance is less than the safe distance for TS , the scale of the optimization problem is smaller and the
safety evaluation. The greater value means the better security running time is shorter. Through time-consuming statistics,
performance. It can be seen from Table II that in dynamic it proves that LTBP and STDP are ensured to run stably at
scene 1, the average distance of the ground truth is 18.81m, the set frequencies f L and f S , thereby realizing real-time
and the trajectory we planned keeps at least 20m away from behavioral decision-making and trajectory planning.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10378 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022

VI. C ONCLUSION [13] H. Fan et al., “Baidu Apollo EM motion planner,” 2018,
arXiv:1807.08048. [Online]. Available: https://arxiv.org/abs/1807.08048
In this work, we propose a unified framework integrating [14] Q. Wang, B. Ayalew, and T. Weiskircher, “Predictive maneuver planning
decision making and trajectory planning with long-term behav- for an autonomous vehicle in public highway traffic,” IEEE Trans. Intell.
ior planning (LTBP) and short-term dynamic planning (STDP) Transp. Syst., vol. 20, no. 4, pp. 1303–1315, Apr. 2019.
[15] L. Zhang, W. Ding, J. Chen, and S. Shen, “Efficient uncertainty-aware
running in two parallel threads with different horizon, and decision-making for automated driving using guided branching,” in Proc.
the hierarchical closed-loop system is proved to be capable to IEEE Int. Conf. Robot. Autom. (ICRA), May 2020, pp. 3291–3297.
react to the dynamic environment efficiently and effectively. [16] M. Werling, J. Ziegler, S. Kammel, and S. Thrun, “Optimal trajectory
generation for dynamic street scenarios in a Frenét Frame,” in Proc.
In the framework, a voxel generation and expansion algorithm IEEE Int. Conf. Robot. Autom., May 2010, pp. 987–993.
is proposed to determine the passable space without obstacles, [17] Y. Meng, Y. Wu, Q. Gu, and L. Liu, “A decoupled trajectory planning
therefore, the directed weighted traffic graph is constructed framework based on the integration of lattice searching and convex
optimization,” IEEE Access, vol. 7, pp. 130530–130551, 2019.
to describe the traffic situation intuitively and concisely. The [18] W. Lim, S. Lee, M. Sunwoo, and K. Jo, “Hierarchical trajectory planning
experiments on the public datasets demonstrate the effec- of an autonomous car based on the integration of a sampling and an
tiveness, mobility, efficiency as well as the intelligence of optimization method,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 2,
pp. 613–626, Feb. 2018.
the proposed framework. By the comparison experiments, [19] W. Zhan, J. Chen, C.-Y. Chan, C. Liu, and M. Tomizuka, “Spatially-
we verify the superiority of our proposed framework. Still, partitioned environmental representation and planning architecture for
more experiments on the real-world application should be on-road autonomous driving,” in Proc. IEEE Intell. Vehicles Symp. (IV),
Jun. 2017, pp. 632–639.
conducted, so in the future, we plan to: 1) test our proposed [20] C. Hubmann, M. Aeberhard, and C. Stiller, “A generic driving strategy
framework in the real-world autonomous vehicle; 2) integrate for urban environments,” in Proc. IEEE 19th Int. Conf. Intell. Transp.
the prediction module (employing the neural network or Syst. (ITSC), Nov. 2016, pp. 1010–1016.
[21] J. Ziegler and C. Stiller, “Spatiotemporal state lattices for fast trajectory
the hidden Markov process) and the controller module to planning in dynamic on-road driving scenarios,” in Proc. IEEE/RSJ Int.
our framework, making the framework more systematic and Conf. Intell. Robots Syst., Oct. 2009, pp. 1879–1884.
applicable. For more work about us, please refer to the github1. [22] W. Ding, L. Zhang, J. Chen, and S. Shen, “Safe trajectory genera-
tion for complex urban environments using spatio-temporal semantic
corridor,” IEEE Robot. Autom. Lett., vol. 4, no. 3, pp. 2997–3004,
R EFERENCES Jul. 2019.
[23] M. McNaughton, C. Urmson, J. M. Dolan, and J.-W. Lee, “Motion plan-
[1] R. He, E. Brunskill, and N. Roy, “Efficient planning under uncer- ning for autonomous driving with a conformal spatiotemporal lattice,”
tainty with macro-actions,” J. Artif. Intell. Res., vol. 40, pp. 523–570, in Proc. IEEE Int. Conf. Robot. Autom., May 2011, pp. 4889–4895.
Mar. 2011. [24] Z. Zhu, E. Schmerling, and M. Pavone, “A convex optimization approach
[2] A. G. Cunningham, E. Galceran, R. M. Eustice, and E. Olson, “MPDM: to smooth trajectories for motion planning with car-like robots,” in Proc.
Multipolicy decision-making in dynamic, uncertain environments for 54th IEEE Conf. Decis. Control (CDC), Dec. 2015, pp. 835–842.
autonomous driving,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), [25] S. M. Erlien, S. Fujita, and J. C. Gerdes, “Safe driving envelopes for
May 2015, pp. 1670–1677. shared control of ground vehicles,” IFAC Proc. Volumes, vol. 46, no. 21,
[3] L. Ma, J. Xue, K. Kawabata, J. Zhu, C. Ma, and N. Zheng, “Efficient pp. 831–836, 2013.
sampling-based motion planning for on-road autonomous driving,” IEEE [26] J. Choi, “Kinodynamic motion planning for autonomous vehicles,” Int.
Trans. Intell. Transp. Syst., vol. 16, no. 4, pp. 1961–1976, Aug. 2015. J. Adv. Robotic Syst., vol. 11, no. 6, p. 90, Jun. 2014.
[4] X. Li, Z. Sun, D. Cao, Z. He, and Q. Zhu, “Real-time trajectory planning [27] P. Bender, O. S. Tas, J. Ziegler, and C. Stiller, “The combinatorial aspect
for autonomous urban driving: Framework, algorithms, and verifica- of motion planning: Maneuver variants in structured environments,” in
tions,” IEEE/ASME Trans. Mechatronics, vol. 21, no. 2, pp. 740–753, Proc. IEEE Intell. Vehicles Symp. (IV), Jun. 2015, pp. 1386–1392.
Apr. 2016. [28] F. Altche and A. de La Fortelle, “Partitioning of the free space-time for
[5] D. González, J. Pérez, V. Milanés, and F. Nashashibi, “A review of on-road navigation of autonomous ground vehicles,” in Proc. IEEE 56th
motion planning techniques for automated vehicles,” IEEE Trans. Intell. Annu. Conf. Decis. Control (CDC), Dec. 2017, pp. 2126–2133.
Transp. Syst., vol. 17, no. 4, pp. 1135–1145, Apr. 2016. [29] Responsibility-Sensitive Safety, a Mathematical Model for Auto-
[6] S. Kammel et al., “Team AnnieWAY’s autonomous system for the mated Vehicle Safety. Accessed: Jul. 2, 2021. [Online]. Available:
DARPA urban challenge 2007,” in The DARPA Urban Challenge. Berlin, https://www.mobileye.com/responsibility-sensitive-safety/
Germany: Springer, 2009, pp. 359–391. [30] M. Schreier, V. Willert, and J. Adamy, “Bayesian, maneuver-based, long-
[7] J. Ziegler et al., “Making Bertha drive—An autonomous journey on a term trajectory prediction and criticality assessment for driver assistance
historic route,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 2, pp. 8–20, systems,” in Proc. 17th Int. IEEE Conf. Intell. Transp. Syst. (ITSC),
Apr. 2014. Oct. 2014, pp. 334–341.
[8] H. Mouhagir, R. Talj, V. Cherfaoui, F. Aioun, and F. Guillemard, [31] N. Deo, A. Rangesh, and M. M. Trivedi, “How would surround vehicles
“Integrating safety distances with trajectory planning by modifying the move? A unified framework for maneuver classification and motion
occupancy grid for autonomous vehicle navigation,” in Proc. IEEE 19th prediction,” IEEE Trans. Intell. Vehicles, vol. 3, no. 2, pp. 129–140,
Int. Conf. Intell. Transp. Syst. (ITSC), Nov. 2016, pp. 1114–1119. Jun. 2018.
[9] S. Ulbrich and M. Maurer, “Towards tactical lane change behavior [32] N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. S. Torr, and
planning for automated vehicles,” in Proc. IEEE 18th Int. Conf. Intell. M. Chandraker, “DESIRE: Distant future prediction in dynamic scenes
Transp. Syst., Sep. 2015, pp. 989–995. with interacting agents,” in Proc. IEEE Conf. Comput. Vis. Pattern
[10] Z. Chen and X. Huang, “End-to-end learning for lane keeping of self- Recognit. (CVPR), Jul. 2017, pp. 2165–2174.
driving cars,” in Proc. IEEE Intell. Vehicles Symp. (IV), Jun. 2017, [33] E. W. Dijkstra, “A note on two problems in connexion with graphs,”
pp. 1856–1860. Numerische Math., vol. 1, no. 1, pp. 269–271, Dec. 1959.
[11] N. Li, D. W. Oyler, M. Zhang, Y. Yildiz, I. Kolmanovsky, and [34] F. Gao, W. Wu, Y. Lin, and S. Shen, “Online safe trajectory generation
A. R. Girard, “Game theoretic modeling of driver and vehicle interac- for quadrotors using fast marching method and Bernstein basis poly-
tions for verification and validation of autonomous vehicle control sys- nomial,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), May 2018,
tems,” IEEE Trans. Control Syst. Technol., vol. 26, no. 5, pp. 1782–1797, pp. 344–351.
Sep. 2018. [35] J. A. Preiss, W. Hönig, N. Ayanian, and G. S. Sukhatme, “Downwash-
[12] L. Yang, X. Liang, T. Wang, and E. Xing, “Real-to-virtual domain aware trajectory planning for large quadrotor teams,” in Proc. IEEE/RSJ
unification for end-to-end autonomous driving,” in Proc. Eur. Conf. Int. Conf. Intell. Robots Syst. (IROS), Sep. 2017, pp. 250–257.
Comput. Vis. (ECCV), Sep. 2018, pp. 530–545. [36] C. Hubmann, M. Becker, D. Althoff, D. Lenz, and C. Stiller, “Decision
making for autonomous driving considering interaction and uncertain
1 https://github.com/zt-BIT/Spatio-temporal-Decision-making-and- prediction of surrounding vehicles,” in Proc. IEEE Intell. Vehicles Symp.
Trajectory-Planning-Framework/ (IV), Jun. 2017, pp. 1671–1678.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10379

Ting Zhang received the B.S. degree in automation Yi Yang received the Ph.D. degree in automation
from the Beijing Institute of Technology, China, from the Beijing Institute of Technology, Beijing,
in 2018, where she is currently pursuing the Ph.D. China, in 2010. He is currently a Professor with
degree in navigation guidance and control with the the School of Automation, Beijing Institute of Tech-
School of Automation. Her research interests include nology. He is the author/coauthor of more than
trajectories and behavioral predictions of unmanned 50 conference papers and journal articles in the
vehicles in a variety of traffic environments and area of unmanned ground vehicle. His research
prediction-based decision making and planning. interests include autonomous vehicles, bioinspired
robots, intelligent navigation, semantic mapping,
scene understanding, motion planning and control,
and robot design and development.

Wenjie Song received the B.S. and Ph.D. degrees


from the Beijing Institute of Technology, Beijing,
Xiaohui Tian received the B.S. degree in automa-
China, in 2013 and 2019, respectively. He stud-
tion from the Beijing Institute of Technology, China,
ied at Princeton University as a Visiting Scholar,
in 2016, where he is currently pursuing the Ph.D.
from 2016 to 2017. He is currently an Assistant
degree in navigation guidance and control with the
Professor with the School of Automation, Bei-
School of Automation. His research interests include
jing Institute of Technology. His research interests
motion planning and control for unmanned ground
include autonomous driving, environmental percep-
vehicles.
tion, SLAM, and path planning. He has taken part
in China Intelligent Vehicle Future Challenge for
several times as the Captain or the Core Member.

Mengyin Fu received the B.S. degree from Liaoning


University, China, the M.S. degree from the Beijing Meiling Wang received the B.S. degree in automa-
Institute of Technology, China, and the Ph.D. degree tion and the M.S. and Ph.D. degrees from the Beijing
from the Chinese Academy of Sciences. Institute of Technology, China, in 1992, 1995, and
He is currently the President of the Nanjing Uni- 2007, respectively.
versity of Science and Technology. His research She has been a Teacher with the Beijing Institute
interests include integrated navigation, intelligent of Technology, since 1995, and she worked with the
navigation, image processing, and learning and University of California at San Diego, San Diego,
recognition and their applications. USA, as a Visiting Scholar, in 2004. She is currently
Dr. Fu was elected as a Yangtze River Scholar Dis- the Director of the Integrated Navigation and Intel-
tinguished Professor in 2009 and won the Guanghua ligent Navigation Laboratory, Beijing Institute of
Engineering Science and Technology Award for Youth Award in 2010. Technology. Her research interests include advanced
He received the National Science and Technology Progress Award for several technology of sensing and detecting and vehicle intelligent navigation. She
times in recent years. was elected as a Yangtze River Scholar Distinguished Professor in 2014.

Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.

You might also like