Professional Documents
Culture Documents
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10366 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022
Rule-based decision-making utilizes the driving rules and corridors in 3D configuration, which is the key architecture
expert knowledge to determine the optimal logical action, of our LTBP framework.
with representative methods Finite State Machine (FSM) [6],
[7] and Markov Decision Process (MDP) [8], [9]. And the A. Frameworks Combining Decision Making and Trajectory
learning-based method uses different network structures to Planning
generate the optimal policy with best rewards according to the Decision making and trajectory planning are both crucial
environment. Various learning-based methods like reinforce- techniques for autonomous vehicles, which can be closely
ment learning are popular [10]–[12]. Different frameworks associated in research and applications. For decision making,
place the emphasis on various aspects, but the following issues the maneuver selection could be lane-level or samples in lanes.
are still not clearly addressed: In EM Motion Planner [13], the path-speed EM iteration inside
1) How to achieve safe and collision-free motion planning lane-level planning is proposed to generate optimal trajectory.
for vehicles in a high-speed environment with dynamic In E-step, static and dynamic obstacles are projected on the
obstacles; lane Frenét frame. In two M-steps, path and speed profiles
2) How to balance the relationship between long-term are generated by a combination of dynamic programming
decision-making and short-term planning to improve the and quadratic programming. [14] integrates discrete maneu-
efficiency; vering decisions with a model predictive control-based motion
3) How to make the decision-making and planning modules trajectory-planning scheme. Particularly, the uncertainties of
closely integrated to achieve reactive planning and re- the predicted vehicle’s states are explicitly discussed. These
planning. frameworks separate the path and velocity planning process,
For solving these problems, in this paper, we propose thus are inflexible to deal with the dynamic obstacles, espe-
a hierarchical long-term and short-term integration frame- cially in the re-planning process. In [15], the domain-specific
work that closely incorporates decision-making and planning, closed-loop policy tree (DCP-Tree) is proposed to outline the
as illustrated in Fig. 1. This framework makes use of the possible maneuvers in lane sequence of the ego vehicle, then
information of perception and prediction, making decisions conditional focused branching mechanism is well designed to
and generating trajectories executed by the control module. guide the branching in action and intention space by utilizing
The key characteristics of the framework are: (a) In the the domain-specific expert knowledge. The other kind of
solution space, the voxels are created and expanded in the method is to evenly spread sampling points in the longitudinal
spatio-temporal domain to search the optimal maneuver and and lateral directions, and determine the preliminary behavior
trajectory, thereby transforming the dynamic planning problem path through the interconnection of sampling points [16],
in two-dimensional plane into a static planning problem in [17]. [18] is a typical framework which contains the upper
three-dimensional space, solving issue 1. (b) The architec- behavioral trajectory planner for searching the macro-scale
ture belongs to a hierarchical form, where the long-term trajectory then determining the behavior, and also the lower
spatio-temporal decision module is used as the behavior motion trajectory planner for generating the micro-scale tra-
planning level, and the short-term dynamic dynamic module jectory. However, the adoption of sample-based searching
is used as the trajectory optimization and tracking module, method requires finding the equilibrium between the searching
as illustrated in Fig. 1. The design of different operating resolution and the time efficiency.
frequencies and parallel threads balance the periodic process of
B. Trajectory Planning in 3D Configuration
decision-making and planning, thus conquering the efficiency
problem of issue 2. (c) In the information flow, a closed-loop Trajectory planning in 3D configuration means planning
system with forward and backward feedback mechanism is in the passable plane with additional time dimension. Tra-
constructed among modules in different levels as the yellow ditional works on 3D trajectory planning can be roughly
arrows showed in Fig. 1. With the design of time trigger and classified into three categories: spatially-partitioned trajectory
event trigger mechanism, the decision-making and planning planning [19], temporarily-partitioned trajectory planning [18],
modules are closely combined to solve issue 3. [20] and direct method named spatio-temporal planning. The
spatially partitioned method aims to divide the 3D configu-
II. R ELATED W ORK ration into two 2D planes with respect to lateral and lon-
In this section, the related work are arranged in order of gitudinal direction, which reduces the scale of the planning
framework level, configuration domain level and struc- problem. Nevertheless, too many parameters are involved
ture composition level. We first outline the existing frame- in the searching process and optimization process in each
works that combine decision-making and planning and analyze decoupled sub-problem, which makes it troublesome to deal
their characteristics and shortcomings. Then, several methods with. While temporarily-partitioned method is based on the
are categorized by whether the state space is decoupled. assumption that position and speed can be decoupled, so it
By comparison, the spatio-temporal planning stands out for is inapplicable for those situations that depend on both posi-
its flexibility to cope with dynamic obstacles. And based on tion and speed planning, such as intersections or overtaking.
this conclusion, we finally focus on the structure of corridor In contrast, spatio-temporal planning is able to plan velocity
contained within spatio-temporal planning. Inspired by the and acceleration simultaneously due to the additional time
corridor, we propose to create a novel voxel structure and dimension, realizing obstacles avoidance of the static and
the ‘voxel expansion’ algorithm for generating the driving dynamic vehicles [21], [22]. In [23], the authors propose
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10367
to use the spatio-temporal state lattice to automatically and • Voxels and Expansion Algorithm: We propose to cre-
efficiently explore the states in both space and time domain. ate the voxel structure in spatio-temporal configura-
Furthermore, the direct method with optimization process tion. With the voxel serving as the vertices and the
should be provided with appropriate initial value and heuristic calculated weighted edges, a directed weighted traffic
objective functions for obtaining analytic solutions. graph (DWTG) that describes the passable space is
constructed, translating the complex traffic situation into
C. Generation of Spatio-Temporal Corridors the brief graph structure. Finally the voxel sequences
The spatial temporal corridor provides an idea to create are determined to find the proper maneuver and optimal
space describing the passable area. Compared to the plane trajectories.
cube designed in the static environment [24], [25], the spa- • Effectiveness: We make experiments on the released
tial temporal corridor can contain more dynamic evolutions data-set NGSIM, which describes the real-world driving
about the obstacles and environments. [26] builds a dynamic scenarios in highway. The success of decision-making
corridor that yields velocity obstacle-based (VO) maneuver and planning in different scenarios verifies the effec-
for real-time moving obstacle avoidance. Once the predictions tiveness of our proposed hierarchical framework, and
of the surrounding vehicles are accurate, the VO paradigm the comparison experiments on the ground truth and
will contribute to a set of safe velocities in discrete time, the corridor generation method in [22] demonstrate the
thus the planned speed profile is guaranteed to be collision- intelligence and efficiency of the algorithm.
free. [27] builds the corridors to partition the trajectory space
into discrete solution classes, such that all possible maneuvers
can be enumerated to find the global optimum as the opti- III. F RAMEWORK
mum trajectory among the maneuver variants. [28] proposes The detailed structure of the framework is illustrated
a four collision-free cells partitioning method to design a in Fig. 2, which contains a perception module, the LTBP
spatio-temporal transition graph. The generated corridors are thread, the STDP thread and visualization module.
actually the surrounding space excluded the obstacles. [22]
proposes to build a semantic corridor structure, and then
finds out the optimal trajectory constrained within the corridor A. Functionality of Each Module
by using piecewise Bézier curves. Though these mentioned The perception module is supposed to process the envi-
corridors are designed in spatio-temporal space, they lack the ronmental information obtained by the sensors, which mainly
detailed temporal information. For example, each part of the includes the environment manager to deal with road infrastruc-
corridor is supposed to be topology equivalent, and hardly any ture and static obstacles as well as the social agent manager for
physical character like velocity, acceleration, jerk could be find processing dynamic obstacle information (including real-time
in the corridor, therefore the superiority of the corridor in both and predicted states). Since the perception part is beyond the
time and space is not fully utilized. scope of the work, the relevant details will be omitted. The
Based on the analysis of the above literature, we can draw two threads in the middle are the core components of our
the conclusion that, in framework level, a unified framework framework, named LTBP and STDP.
is required that integrates decision making and trajectory The main task of LTBP is making long-term decision
planning, moreover, it is supposed to deal with the dynamic and optimal trajectory planning based on the approximate
environment and generate safe, efficient smooth and reac- prediction information of the surrounding vehicles, and also
tive trajectory. In configuration domain level, spatial-temporal providing a reference trajectory for STDP. There are total four
decision making and planning hold the advantage over flexi- managers in LTBP, where ego manager is the key part that
bility. And for the constructed corridor structure, the existed dominates each manager. Once the voxels and weighted matrix
works either regard the free space as a whole with equal are constructed based on the perception, they are delivered to
topology status or expand the corridor regularly according to the Dijkstra manager to find the best voxel sequence SV,M
the fixed step length, which neglect the dynamic feasibility of as the path boundary of maneuver M. Then the sequence
the vehicle and cause the impassable space involved. will be transformed into the optimal trajectory T r ∗ through
Motivated by the existed works and towards the above solving a QP problem by the QP optimization manager.
drawbacks, we propose our framework and summarize the Finally, the evaluation manager in LTBP will help to select
contributions as follows: the best maneuver and trajectory as the reference trajectory,
• Framework: We propose a hierarchical long-term and as in Algorithm 2 discussed in Section IV. STDP mainly
short-term integration framework that closely incorpo- aims to track and optimize the reference trajectory based on
rates decision-making and planning. The design of dif- the updated states of surrounding vehicles, which contains
ferent operating frequencies and parallel threads balance the dynamic planning manager, the RSS [29] safety check
the periodic process of each module. A closed-loop manager and the QP optimization manager. Based on this,
system with forward and backward feedback mechanism real-time obstacle avoidance and re-planning are managed in
is constructed among modules in different levels. With a frequency of 10Hz. The integration of these modules ulti-
the design of time trigger and event trigger mechanism, mately forms the maneuver and trajectory planner, enabling the
the decision-making and planning modules are closely autonomous vehicle to accomplish safe and efficient planning
combined. in a dynamic environment.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10368 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022
Fig. 2. The detailed structure of the proposed framework, with two core modules LTBP and STDP running in two parallel threads. LTBP is running every
4s, responsible for expanding voxels in 3D configuration and generating optimal trajectory to each maneuver by solving the QP problem. STDP runs in 1Hz,
tracking and optimizing the reference trajectory based on the short-term prediction results. RRS Checker is the embedded module in STDP, which runs in 10Hz
to check whether there is a potential risk. If an obstacle suddenly appears, the event trigger of STDP will be enabled, and with the failure of STDP, the event
trigger of LTBP is activated, forming the forward and backward closed-loop planning system.
B. Operation Mechanism and the re-planned maneuver will be made until the danger is
In our framework, we use two parallel threads for the eliminated as in Algorithm 1.
execution of the two modules, as shown in Algorithm 1. The above-mentioned operation mechanism is just the high-
Furthermore, the operation frequency f L of the LTBP module light of our framework. Different operation frequencies cause
is 0.25Hz, with planning horizon 5s (in line with the general the module to make effects in combination with its functional-
prediction horizon), while the operation frequency of STDP ities. LTBP runs at a relatively low frequency, which reduces
is 1Hz, with the planning horizon 1s. Both modules adopt the burden of repeating the decision-making and planning
the dual trigger mechanism of time and event trigger, that process, and alleviates the unstable swerve of strategy. STDP
is, once the event trigger is enabled, the module will restart runs at a frequency of 1Hz, which meets the requirements
immediately. The motivation of this mechanism is for the of the planner for obstacle avoidance and tracking reference
safety and efficiency. Because the planning horizon of LTBP is trajectories. At the same time, the RSS safety detection
5s, this means that in such a long planning period, the states of module runs at a higher frequency for real-time and accu-
the surrounding vehicles are generated by prediction. Whether rate obstacle avoidance. When the system detects a danger
the states derive from the analysis of the physical kinemat- in the future, the STDP will restart immediately, and if it
ics model [30] or the usage of neural networks [31], [32], fails, the LTBP decision planner will restart. In this way,
the uncertainty will increase over time, which brings diffi- a hierarchical closed-loop decision-making planning system
culties to accurate decision-making and planning. In order to with information transmitted forward and backward is formed
make the vehicle handle the dynamic obstacles and be reactive as Fig. 1, making it more reactive and efficient to plan and
to the emergencies accurately, we develop the STDP, which re-plan in the highly dynamic environment.
regards the optimal trajectory T r ∗ generated by LTBP as the
reference trajectory and considers the states of social vehicles IV. M ETHODOLOGY
P in a short period. The STDP module is conducted every 1s. In this section, we introduce the voxel structure and also
At the same time, the RSS Checker contained in STDP runs its generation and expansion algorithm in the Frenét [16]
at a frequency of 10Hz to check whether there is a risk of coordinate system (s is the direction along the road that
collision in the next 0.2s. Once a potential collision is detected, originates from the initial position, and d is the lateral position
the event trigger ‘event_S’ of STDP module is enabled, and perpendicular to s direction from right side to the left side).
the STDP module will re-plan immediately with the updated The overall process from the generation of voxels till its expan-
perception. If the planning result of STDP ‘success_S’ turns sion and finally the acquisition of the optimal trajectory are
out to be successful, the trigger will be released, otherwise the illustrated in Algorithm 2, which will be explained thoroughly
event trigger ‘event_L’ of the LTBP module will be activated in this section.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10369
Algorithm 1 Operation Mechanism lane (with the same vd ), it means that the planned trajectory
1: success_L = False, success_S = False derives from the lane keeping behavior. If the voxels in the
2: event_L = False, event_S = False, event_RSS = False sequence are from different lanes, it indicates the lane change
3: f L = 0.25Hz, f S = 1Hz, f R S S = 10Hz, behavior.
4: Thread 1: The generation and expansion of voxels could be illustrated
5: if f L or event_L then: by Fig. 3, which actually describes the situation where the ego
6: success_L = LTBP.running() vehicle (blue) is driving on the middle lane, with two surround-
7: if success_L then: ing vehicles (orange, grey) driving aside. Since human drivers
8: event_L = False do not make frequent lane-change decisions, we assume that
9: end if in a planning cycle, the vehicle will only make lane-change
10: end if decision at most once. Therefore, voxels are only generated in
11: Thread 2: the current lane (blue) and the adjacent lanes (yellow, green).
12: if f S or event_S then: Assuming that the voxel ViL in interval ti = t (with the initial
13: success_S = STDP.running() time t0 = 0 in TL ) has been determined, then the voxel Vi+1L of
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10370 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022
Fig. 3. A schematic diagram illustrating the process of the proposed method. (a) A traffic scene where the ego vehicle is driving with two social vehicles
(orange vehicle on the left and grey vehicle on the right). (b) The generated voxels in 3D configuration by our proposed algorithm, with blue voxels in the
current lane, yellow voxels in the right lane and green voxels in the left lane. These voxels are actually the set of space without obstacles occupied. (c) The
abstract vertices derive from the voxels. (d) Types of voxels and edges of the formed traffic graph. (e) The topology of the connected voxels. (f) The planned
trajectory of each maneuver.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10371
for a voxel sequence SMi that starts from the starting voxel trajectory smooth and comfortable enough, we choose jerk
to the destination with the lowest cost as the result of the as the trajectory optimization function (with optimized order
decision, as in Fig. 3 (e). Concatenating the voxel along the m opt = 3). In terms of constraints, to ensure the consistent
time sequence, the corridor that indicates the passable space of continuity of the planning results in sequential planning cycles,
(0) (1)
the ego vehicle is constructed. Note that in each interval, only we take the initial position p0 , velocity v 0 = p0 , and
(2)
one voxel is selected. If the strategy is lane change, two voxels acceleration a0 = p0 as equality constraints. As for the
are involved at the optimal lane-change interval, which are destination, we regard the distance covered with desired speed
then merged into one voxel (the merged voxel means the start as the expected target. So in s direction, with no obstacles
interval of the lane-change action rather than the completion ahead, pe equals the desired spanned distance v desired TL
of the action). within the maximum distance sTmax L
travelled in TL . Otherwise,
the maximum destination is set to keep the safe distance
C. Trajectory Optimization–QP Problem dsafe from the front obstacle by criterion like time-to-collision.
After the voxels are generated, a passable space of the Meanwhile, the connection between each Bézier segment
vehicle trajectory is actually determined. Here, similar to [22], should be consistent in C 2 . The inequality constraints mainly
[34], we use piecewise Bézier curves to generate the trajectory include the velocity p(1) (1)
min , pmax , acceleration range constraints
of the vehicle. The difference is that our voxels are generated p(2) (2) (0) (0)
min , pmax and the upper and lower limits pmin , pmax provided
relatively strictly according to the dynamics model of the vehi- by the voxel boundary. The construction of the QP problem
cle in each time interval, so the redundant space against the is similar to [22], so great details will not be reiterated here,
dynamic feasibility is potentially eliminated, which contributes and the reader interested in this problem can find proofs in
to the convergence of optimization. Besides, the width of the the references.
voxels are lane-leveled, so the optimized trajectory is directly
TL 3 μ
associated with the corresponding maneuver. TL
d b j (t)
The Bézier curve b(t) is a special spline curve, defined F LT B P = Fj = ,
dt 3
in the fixed interval t ∈ [0, 1], consisting of the control j =1 j =1 μ
j μ (k)
points C = [c0 , c j , · · · , cm ] and the Berstein basis bm (t) [35], b1 (0)k = p0 ,k = 0, 1, 2,
as in (7) where m is the degree of the basis (also the trajectory μ
bTL (TL ) = pe ,
order). In trajectory planning, the Bézier curve is widely used ⎧
to describe the trajectory because of the convex hull property, ⎪
⎪ min, (sTmax , v desired TL )
⎪
⎨
L
hodograph property and the inherent continuity. The convex if no preceding obstacle,
pe =
hull property implies that the Bézier curve is strictly bounded ⎪
⎪ min, (sTmax , Pahead .s − dsafe − m )
⎪
⎩ L
within the convex hull formed by the control points (including if others.,
the boundary), so as long as the convex hull formed by the con- μ μ
trol points is collision-free, the planned trajectory is guaranteed bi (ti,−1 )(k) = bi+1 (ti+1,0 )(k) , i = 1, 2, 3, 4,
μ
to be safe. The hodography property means that the derivative p(k)
min ≤ bi (t)
(k)
≤ p(k)
max , i = 1, 2, 3, 4, 5. (9)
of the curve is still a Bézier curve, and the control points of
the derivation are the linear combination of the control points
of the source curve, which makes the variable constraints
in the optimization problem a linear inequality, simplifying D. Trajectory Evaluation
the complexity of optimization. In this work, we assign a
μ After the above steps, at most three kinds of maneuvers
piecewise Bézier curve bi (t) in dimension μ = [s, d] to each
and the corresponding optimal trajectory are generated. Next,
voxel Vi in the searched voxel sequence {SMC , SM L , SM R }.
we define an evaluation function in the linear combination
Since the voxel covers a range of the unit planning interval
of factors like safety f t t c,Mi , the difference from the desired
(1s in LTBP) in t-axis, only a simple transition is required for
velocity f v d ,Mi , and the behavior cost f Mi as in (10).
meeting the defined interval of Bézier curve. The definition of
The coefficients of the function can be adjusted reason-
the Bézier curve for voxel Vi is as shown in the formula (8).
ably according to the specific traffic environment and actual
m
j decision-making and planning requirements. In this paper,
b(t) = c j bm (t),
under the premise of ensuring safety, we expect the lane
j =0
change maneuver to make the speed-up, thereby improving
j m j the efficiency. Conversely, the lane change without velocity
bm (t) = t (1 − t)m− j . (7)
j optimized or safety guaranteed are not supported by our plan-
μ
m
μ j ner. Ultimately, we have made a trade-off on the coefficients
bi (t) = c j,i bm (t −ti ), t ∈ [ti , ti +t L ], μ ∈ [s, d]. of the factors and a trajectory with the lowest cost will be used
j =0 as the reference trajectory T r ∗ .
(8)
The objective function and constraints of the QP problem FMi = f t t c,Mi + f v d ,Mi + f Mi ,
are formed as formula (9). In order to make the planned T r ∗ = min{FMi }, i ∈ {C,L,R}. (10)
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10372 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022
E. Short-Term Dynamic Planning algorithm and framework in Python3.6 based on ROS (Robot
For more efficient and long-term decision-making and plan- Operating System) with a computer equipped with an Intel I5-
ning, LTBP runs every 4s with a planning horizon TL = 5s. 3470 CPU and an NVIDIA GeForce GTX 2080-Ti graphics
However, in such a long planning horizon, the uncertainties of card. The decision planning system can run stably at the
the prediction results of the surrounding vehicles may increase. designed frequency, which will be illustrated in subsection F
Therefore, we design an STDP thread for re-planning tasks in detail.
within a short horizon of 1s. Inside STDP, an RSS Checker n(t) ∼ N (0, σ0 (t − t0 )),
is running with a frequency of 10Hz. Based on the relative
Pt,n = Gt,n + n(t). (13)
accurate prediction results of the surrounding obstacles in
the next 0.2s (supposing the prediction is accurate in the The overall experiments contain 4 parts:
short term), the minimum safe distance dsafe is calculated Part 1: The effectiveness of the decision and planning
to determine the potential collision of the ego vehicle as in framework in different traffic scenarios including the scenario
formula (11), where tr is the reaction time, v is the velocity with static obstacles only and the scenarios with congested
of the ego vehicle, and v f is the velocity of the front vehicle. dynamic traffic participants. And in the dynamic environment,
Once the RSS warning ‘event_RSS’ emerges, the STDP event lane keeping and lane change behaviors are tested respectively.
‘event_S’ is triggered. These experiments just verify the applicability and mobility of
In the STDP planner, the re-planning process starts based our proposed framework in diverse scenarios.
on the updated prediction results and the referenced trajectory Part 2: The comparison experiments between our proposed
T r ∗ . Here, we construct a smaller-scaled QP problem. The voxel generation and expansion algorithm and the seed-based
μ
optimized object contains the jt h Bézier segment b j (0)(k) fixed step length corridor expansion algorithm [22]. The exper-
inside voxel V j , with an objective function considering the iment will demonstrate the superiority of the voxel structure
deviation to the reference trajectory. Constraints are the same and also the efficiency of the optimization process with the
as in LTBP (but the boundary is updated with new vehicle redundant space eliminated by the voxel.
states). When STDP is successful, the trigger is disabled and Part 3: Comparison experiments between the results planned
the RSS continues running. If STDP fails, the LTBP event by our proposed framework and the ground truth generated
trigger ‘event_L’ is enabled, and LTBP will be renewed under by human drivers in terms of safety, efficiency, and comfort,
the condition of the updated prediction states. which will prove the reliability and intelligence of our pro-
v 2f posed framework.
1 (v + tr amax )2
dsafe = vtr + amax tr2 + − . (11) Part 4: The time consumption of the proposed LTBP and
2 2amin 2amin STDP. The efficiency indicates the real-time performance of
d3 bμj (t) the proposed framework. And the parameters used in our
F ST D P = F j = ,
dt 3 experiments will be shown in Table I.
μ
μ (k)
b j (0)(k) = p0 , k = 0, 1, 2,
A. Drive With Static Obstacles
(k) μ
pmin ≤ b j (t)(k) ≤ p(k)
max . (12) The static environment is relatively simple for the ego
vehicles to avoid obstacles. As we pointed out in [22], in the
spatio-temporal domain, static obstacles can be regarded as
V. E XPERIMENTS objects perpendicular to the driving plane through the time
In order to verify the effectiveness of our proposed frame- axis at their locations. Therefore, in each unit time ti , static
work, we adopt the publicly released data-set NGSIM col- obstacles within the perception range should be considered.
lected in the highway environment for testing. The NGSIM As shown in Fig. 5, taking the starting position of the vehicle
US101 database was collected in America, which records the as the origin of the s coordinate, we place the static obstacle
various traffic situation during morning rush hour, including 1 at s = 100m in lane 2 where the vehicle is located,
the traffic congestion and normal highway driving, etc. In the an obstacle 2 at s = 220m lane 3, and an obstacle 3 at
data-set, we arbitrarily select a vehicle in the scene as the s = 320m in lane 1. Additionally, in order to examine the
ego vehicle, and regard the vehicles within a certain range deceleration to stop performance of the framework, we set up
(±100m) around it as the obstacles in the perception field. obstacle 4 across multiple lanes at 500m in the s direction
Since the predicted states of the social vehicles are required (which can also be regarded as stop signs or traffic signals
in the process of voxel generation, we add a noise n(t) to that prevent the vehicle from moving forward, and the safety
the ground truth Gt,n of vehicle n as the predicted states distance is set to be 50m by default) to check whether the
Pt,n . As the research [31], [36] reveal, the uncertainty of vehicle will slow down to stop. Since in sensors manager
prediction is increasing with respect to time, so the variance that we designed, the static and dynamic obstacles are treated
of the gaussian noise σ0 is a function with respect to time separately, the planner stipulates that for static obstacles ahead
in formula (13). In the future, we will employ the sequential of the ego vehicle within the sensing range of the current
neural network or hidden Markov process for prediction in lane, penalties will be imposed on the planning results, thereby
combination with our proposed decision planning framework, prompting the vehicle to keep far away from obstacles or to
making the framework more systematic. We completed our change lanes.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10373
Fig. 5. The top view of the whole planned trajectory with static obstacles. The obstacles are: Obstacle 1: lane 2, 100m ahead, Obstacle 2: lane 3, 220m
ahead, Obstacle 3: lane 1, 320m ahead and Obstacle 4: 500m, covering from lane 1 to 5. The velocity of the trajectory is described in red series color.
TABLE I
PARAMETERS C ONFIGURATION
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10374 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022
Fig. 7. The top view of traffic scenario 1 with dynamic obstacles and the generated trajectories in 3D configuration. The ego vehicle in red is depicted
in every cycle, whose trajectory and velocity profile are represented in the red-series color. And the distribution of social vehicles at the beginning time of
cycle 1, 5, 10 are depicted by blue rectangles, triangles and ellipses, with their trajectories in blue dash lines.
Fig. 8. The planning results of scenario 1 with dynamic obstacles by our Fig. 9. The planning results of scenario 2 with dynamic obstacles by our
framework, displayed in terms of positions in s direction, positions in d framework, displayed in terms of positions in s direction, positions in d
direction, velocity and acceleration in longitudinal direction. direction, velocity and acceleration in longitudinal direction.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10375
Fig. 10. The top view of traffic scenario 2 with dynamic obstacles and the generated voxels, DWTG and trajectories. The ego vehicle in red is depicted
in every cycle, whose trajectory and velocity profile are represented in the red-series color. And the distribution of social vehicles at the beginning time of
cycle 1, 5, 8 are depicted by blue rectangles, triangles and ellipses, with their trajectories in blue dash lines. The results of our proposed algorithm in particular
cycles are illustrated in b), c), d).
makes decision to change lane to the right at the first cycle. plane into a static planning problem in 3D space. Particularly,
After the lane change, the ego vehicle is driving ahead of cycle 8 displays a situation of emergency where vehicle
social vehicle 47. Particularly, viewed on a two-dimensional 44 behind the ego vehicle accelerates suddenly (since the
plane, the trajectories of the ego vehicle and social vehicle interaction of the vehicles are not considered in the work,
34; 47 overlap, whereas, they are misplaced in spatio-temporal vehicle 44 accelerates with regardless of the ego vehicle,
domain. From the view of spatio-temporal domain, it is clear which can also be regarded as an abnormal acceleration.).
that the space of the planned trajectory and the space occupied At this moment, the ego vehicle cannot successfully avoid the
by obstacles do not overlap, so it will not cause a collision. rear-end collision by accelerating (exceeding the maximum
This is just the advantage of the 3D spatio-temporal decision speed). Therefore, the ego vehicle manages to change lanes
planning, which can provide the intuitive motions in time and seizes the proper opportunity to successfully change lanes
series, then transfer the dynamic planning problem in the to the left lane between vehicle 21 and 25.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10376 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10377
TABLE III
T IME C ONSUMPTION OF E ACH P HASE
the corridor may vary, resulting in the overlap even the surrounding vehicles. In scenario 2, the average safety
space redundancy. Nevertheless, the advantage is that the distance of the trajectory we planned is also larger. Therefore,
corridor will be more flexible in time axis with precise our planner is more reliable in terms of security. In terms of
resolution. VOXEL is strictly generated bounded by the efficiency, we consider the average speed of the trajectory. The
unit planning time t L , so there is no overlap on the results show that the average speed planned by our framework
time axis. As for the precision of time, we adopt two is higher, which proves the efficiency (the results are calculated
methods to compensate for the accuracy. One is to loose with v desired = 18.0m/s, the average velocity of the vehicle
the boundary of the generated voxel when optimizing, group). In terms of comfort, we consider the average accel-
the relaxed factor m will give tolerance to the uncertainty eration of the vehicle (the sum of acceleration in s direction
risen by prediction. The other is the usage of RSS divided by the total length). Because our framework adopts
Checker, which considers the states of social vehicles in the jerk term as the optimization function, the acceleration is
each 0.1s. ensured to be steady. In the lateral direction, as a result of
3) s-Axis and d-Axis: The expansion order of the corridor the added constraints of the lateral velocity and acceleration,
from seeds is s direction, d direction and finally the the lateral acceleration is small (the lateral acceleration for
time axis, which will cause space omission (the lateral both our framework and the ground truth are very small, which
expansion will be restricted after the longitudinal length is are not listed in the table). In summary, the trajectories planned
determined). VOXEL determines the size in three dimen- by our planner are superior to human driving trajectories in
sions at the same time, and the lateral size of the voxel is terms of safety, efficiency and comfort. On the one hand, it can
equal to the lane width. In this way, every possible space demonstrate the reliability and intelligence of our proposed
of each lane is considered, and the lane-width structure framework, and on the other hand, it also benefits from the
makes it convenient to make maneuvers in lane level. reasonable prediction of the surrounding vehicles.
4) Time Consumption: In terms of time consumption,
VOXEL takes advantage. This is because the size of
the voxel is determined at the same time, while the
corridor requires expansion along 6 directions separately. F. Time Consumption
The average time consumption of VOXEL is 1.5−3.0ms,
and the time consumption of CORRIDOR is more than The experiments on time consumption of the framework is
60ms, which is 20 times as long as our algorithm. (For carried out under the same hardware conditions as the above
both algorithms, the obstacles information are traversed experiments. We tested the time consumption of 4 parts of
for collision detection. Converting the obstacles informa- the framework, which are the time consumption of the traffic
tion into sequential grid maps, the searching process will graph determination in LTBP, the time consumed by the QP
be more time-efficient). optimization problem, the optimization problem in STDP and
the RSS Checker (because the frequency of RSS Checker is
10Hz, the statistical results are the summation of repeat for ten
E. Comparison of the Ground Truth times). The statistical results are shown in Table III, and the
In order to verify the reliability and intelligence of our unit is ‘ms’. For the determination of traffic graph, the time
proposed framework, we conduct experiments compared with consumption is 5ms, including the generation and expansion of
the ground truth derived from human drivers. In this work, voxel, and the determination of topological relations as well
we refer to [15] to evaluate the trajectory from the perspective as the weighted matrix. The optimization process of LTBP
of safety, efficiency and comfort. For safety, we consider takes about 50-80ms. By solving the quadratic programming
the distance between the vehicle and the nearest obstacle problem including equality constraints, inequality constraints,
in the same lane. When the distance is less than the safe and boundary condition constraints, the optimal trajectory
distance threshold (20m), we consider it to be a potential within TL planning horizon is obtained. For STDP, since the
dangerous situation. Finally, we calculate the average distance optimization target is a relatively short trajectory in horizon
for each time the distance is less than the safe distance for TS , the scale of the optimization problem is smaller and the
safety evaluation. The greater value means the better security running time is shorter. Through time-consuming statistics,
performance. It can be seen from Table II that in dynamic it proves that LTBP and STDP are ensured to run stably at
scene 1, the average distance of the ground truth is 18.81m, the set frequencies f L and f S , thereby realizing real-time
and the trajectory we planned keeps at least 20m away from behavioral decision-making and trajectory planning.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
10378 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 23, NO. 8, AUGUST 2022
VI. C ONCLUSION [13] H. Fan et al., “Baidu Apollo EM motion planner,” 2018,
arXiv:1807.08048. [Online]. Available: https://arxiv.org/abs/1807.08048
In this work, we propose a unified framework integrating [14] Q. Wang, B. Ayalew, and T. Weiskircher, “Predictive maneuver planning
decision making and trajectory planning with long-term behav- for an autonomous vehicle in public highway traffic,” IEEE Trans. Intell.
ior planning (LTBP) and short-term dynamic planning (STDP) Transp. Syst., vol. 20, no. 4, pp. 1303–1315, Apr. 2019.
[15] L. Zhang, W. Ding, J. Chen, and S. Shen, “Efficient uncertainty-aware
running in two parallel threads with different horizon, and decision-making for automated driving using guided branching,” in Proc.
the hierarchical closed-loop system is proved to be capable to IEEE Int. Conf. Robot. Autom. (ICRA), May 2020, pp. 3291–3297.
react to the dynamic environment efficiently and effectively. [16] M. Werling, J. Ziegler, S. Kammel, and S. Thrun, “Optimal trajectory
generation for dynamic street scenarios in a Frenét Frame,” in Proc.
In the framework, a voxel generation and expansion algorithm IEEE Int. Conf. Robot. Autom., May 2010, pp. 987–993.
is proposed to determine the passable space without obstacles, [17] Y. Meng, Y. Wu, Q. Gu, and L. Liu, “A decoupled trajectory planning
therefore, the directed weighted traffic graph is constructed framework based on the integration of lattice searching and convex
optimization,” IEEE Access, vol. 7, pp. 130530–130551, 2019.
to describe the traffic situation intuitively and concisely. The [18] W. Lim, S. Lee, M. Sunwoo, and K. Jo, “Hierarchical trajectory planning
experiments on the public datasets demonstrate the effec- of an autonomous car based on the integration of a sampling and an
tiveness, mobility, efficiency as well as the intelligence of optimization method,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 2,
pp. 613–626, Feb. 2018.
the proposed framework. By the comparison experiments, [19] W. Zhan, J. Chen, C.-Y. Chan, C. Liu, and M. Tomizuka, “Spatially-
we verify the superiority of our proposed framework. Still, partitioned environmental representation and planning architecture for
more experiments on the real-world application should be on-road autonomous driving,” in Proc. IEEE Intell. Vehicles Symp. (IV),
Jun. 2017, pp. 632–639.
conducted, so in the future, we plan to: 1) test our proposed [20] C. Hubmann, M. Aeberhard, and C. Stiller, “A generic driving strategy
framework in the real-world autonomous vehicle; 2) integrate for urban environments,” in Proc. IEEE 19th Int. Conf. Intell. Transp.
the prediction module (employing the neural network or Syst. (ITSC), Nov. 2016, pp. 1010–1016.
[21] J. Ziegler and C. Stiller, “Spatiotemporal state lattices for fast trajectory
the hidden Markov process) and the controller module to planning in dynamic on-road driving scenarios,” in Proc. IEEE/RSJ Int.
our framework, making the framework more systematic and Conf. Intell. Robots Syst., Oct. 2009, pp. 1879–1884.
applicable. For more work about us, please refer to the github1. [22] W. Ding, L. Zhang, J. Chen, and S. Shen, “Safe trajectory genera-
tion for complex urban environments using spatio-temporal semantic
corridor,” IEEE Robot. Autom. Lett., vol. 4, no. 3, pp. 2997–3004,
R EFERENCES Jul. 2019.
[23] M. McNaughton, C. Urmson, J. M. Dolan, and J.-W. Lee, “Motion plan-
[1] R. He, E. Brunskill, and N. Roy, “Efficient planning under uncer- ning for autonomous driving with a conformal spatiotemporal lattice,”
tainty with macro-actions,” J. Artif. Intell. Res., vol. 40, pp. 523–570, in Proc. IEEE Int. Conf. Robot. Autom., May 2011, pp. 4889–4895.
Mar. 2011. [24] Z. Zhu, E. Schmerling, and M. Pavone, “A convex optimization approach
[2] A. G. Cunningham, E. Galceran, R. M. Eustice, and E. Olson, “MPDM: to smooth trajectories for motion planning with car-like robots,” in Proc.
Multipolicy decision-making in dynamic, uncertain environments for 54th IEEE Conf. Decis. Control (CDC), Dec. 2015, pp. 835–842.
autonomous driving,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), [25] S. M. Erlien, S. Fujita, and J. C. Gerdes, “Safe driving envelopes for
May 2015, pp. 1670–1677. shared control of ground vehicles,” IFAC Proc. Volumes, vol. 46, no. 21,
[3] L. Ma, J. Xue, K. Kawabata, J. Zhu, C. Ma, and N. Zheng, “Efficient pp. 831–836, 2013.
sampling-based motion planning for on-road autonomous driving,” IEEE [26] J. Choi, “Kinodynamic motion planning for autonomous vehicles,” Int.
Trans. Intell. Transp. Syst., vol. 16, no. 4, pp. 1961–1976, Aug. 2015. J. Adv. Robotic Syst., vol. 11, no. 6, p. 90, Jun. 2014.
[4] X. Li, Z. Sun, D. Cao, Z. He, and Q. Zhu, “Real-time trajectory planning [27] P. Bender, O. S. Tas, J. Ziegler, and C. Stiller, “The combinatorial aspect
for autonomous urban driving: Framework, algorithms, and verifica- of motion planning: Maneuver variants in structured environments,” in
tions,” IEEE/ASME Trans. Mechatronics, vol. 21, no. 2, pp. 740–753, Proc. IEEE Intell. Vehicles Symp. (IV), Jun. 2015, pp. 1386–1392.
Apr. 2016. [28] F. Altche and A. de La Fortelle, “Partitioning of the free space-time for
[5] D. González, J. Pérez, V. Milanés, and F. Nashashibi, “A review of on-road navigation of autonomous ground vehicles,” in Proc. IEEE 56th
motion planning techniques for automated vehicles,” IEEE Trans. Intell. Annu. Conf. Decis. Control (CDC), Dec. 2017, pp. 2126–2133.
Transp. Syst., vol. 17, no. 4, pp. 1135–1145, Apr. 2016. [29] Responsibility-Sensitive Safety, a Mathematical Model for Auto-
[6] S. Kammel et al., “Team AnnieWAY’s autonomous system for the mated Vehicle Safety. Accessed: Jul. 2, 2021. [Online]. Available:
DARPA urban challenge 2007,” in The DARPA Urban Challenge. Berlin, https://www.mobileye.com/responsibility-sensitive-safety/
Germany: Springer, 2009, pp. 359–391. [30] M. Schreier, V. Willert, and J. Adamy, “Bayesian, maneuver-based, long-
[7] J. Ziegler et al., “Making Bertha drive—An autonomous journey on a term trajectory prediction and criticality assessment for driver assistance
historic route,” IEEE Intell. Transp. Syst. Mag., vol. 6, no. 2, pp. 8–20, systems,” in Proc. 17th Int. IEEE Conf. Intell. Transp. Syst. (ITSC),
Apr. 2014. Oct. 2014, pp. 334–341.
[8] H. Mouhagir, R. Talj, V. Cherfaoui, F. Aioun, and F. Guillemard, [31] N. Deo, A. Rangesh, and M. M. Trivedi, “How would surround vehicles
“Integrating safety distances with trajectory planning by modifying the move? A unified framework for maneuver classification and motion
occupancy grid for autonomous vehicle navigation,” in Proc. IEEE 19th prediction,” IEEE Trans. Intell. Vehicles, vol. 3, no. 2, pp. 129–140,
Int. Conf. Intell. Transp. Syst. (ITSC), Nov. 2016, pp. 1114–1119. Jun. 2018.
[9] S. Ulbrich and M. Maurer, “Towards tactical lane change behavior [32] N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. S. Torr, and
planning for automated vehicles,” in Proc. IEEE 18th Int. Conf. Intell. M. Chandraker, “DESIRE: Distant future prediction in dynamic scenes
Transp. Syst., Sep. 2015, pp. 989–995. with interacting agents,” in Proc. IEEE Conf. Comput. Vis. Pattern
[10] Z. Chen and X. Huang, “End-to-end learning for lane keeping of self- Recognit. (CVPR), Jul. 2017, pp. 2165–2174.
driving cars,” in Proc. IEEE Intell. Vehicles Symp. (IV), Jun. 2017, [33] E. W. Dijkstra, “A note on two problems in connexion with graphs,”
pp. 1856–1860. Numerische Math., vol. 1, no. 1, pp. 269–271, Dec. 1959.
[11] N. Li, D. W. Oyler, M. Zhang, Y. Yildiz, I. Kolmanovsky, and [34] F. Gao, W. Wu, Y. Lin, and S. Shen, “Online safe trajectory generation
A. R. Girard, “Game theoretic modeling of driver and vehicle interac- for quadrotors using fast marching method and Bernstein basis poly-
tions for verification and validation of autonomous vehicle control sys- nomial,” in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), May 2018,
tems,” IEEE Trans. Control Syst. Technol., vol. 26, no. 5, pp. 1782–1797, pp. 344–351.
Sep. 2018. [35] J. A. Preiss, W. Hönig, N. Ayanian, and G. S. Sukhatme, “Downwash-
[12] L. Yang, X. Liang, T. Wang, and E. Xing, “Real-to-virtual domain aware trajectory planning for large quadrotor teams,” in Proc. IEEE/RSJ
unification for end-to-end autonomous driving,” in Proc. Eur. Conf. Int. Conf. Intell. Robots Syst. (IROS), Sep. 2017, pp. 250–257.
Comput. Vis. (ECCV), Sep. 2018, pp. 530–545. [36] C. Hubmann, M. Becker, D. Althoff, D. Lenz, and C. Stiller, “Decision
making for autonomous driving considering interaction and uncertain
1 https://github.com/zt-BIT/Spatio-temporal-Decision-making-and- prediction of surrounding vehicles,” in Proc. IEEE Intell. Vehicles Symp.
Trajectory-Planning-Framework/ (IV), Jun. 2017, pp. 1671–1678.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: UNIFIED FRAMEWORK INTEGRATING DECISION MAKING AND TRAJECTORY PLANNING 10379
Ting Zhang received the B.S. degree in automation Yi Yang received the Ph.D. degree in automation
from the Beijing Institute of Technology, China, from the Beijing Institute of Technology, Beijing,
in 2018, where she is currently pursuing the Ph.D. China, in 2010. He is currently a Professor with
degree in navigation guidance and control with the the School of Automation, Beijing Institute of Tech-
School of Automation. Her research interests include nology. He is the author/coauthor of more than
trajectories and behavioral predictions of unmanned 50 conference papers and journal articles in the
vehicles in a variety of traffic environments and area of unmanned ground vehicle. His research
prediction-based decision making and planning. interests include autonomous vehicles, bioinspired
robots, intelligent navigation, semantic mapping,
scene understanding, motion planning and control,
and robot design and development.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on November 08,2023 at 03:46:50 UTC from IEEE Xplore. Restrictions apply.