Professional Documents
Culture Documents
Decentralized Spatial-Temporal
Trajectory Planning for Multicopter Swarms
Xin Zhou, Zhepei Wang, Xiangyong Wen, Jiangchao Zhu, Chao Xu and Fei Gao∗
fully autonomous multicpoter swarm system using B-Spline 𝑇𝑇1 𝑇𝑇2 𝑇𝑇𝑀𝑀−1 𝑇𝑇𝑀𝑀
𝐪𝐪1 𝐩𝐩𝟐𝟐,𝟎𝟎
parameterized trajectories. Nevertheless, the difficulty of time
adjustment of B-Spline curves results in a twisty trajectory 𝐩𝐩𝟐𝟐,𝟏𝟏
shape when multiple agents need to pass through the same 𝑝𝑝(𝑡𝑡) 𝐜𝐜𝑀𝑀−1 𝛽𝛽(𝑡𝑡)
𝐜𝐜1 𝛽𝛽(𝑡𝑡) 𝐩𝐩𝟐𝟐,𝟐𝟐
area. However, this limitation is broken in the proposed 𝐜𝐜2 𝛽𝛽(𝑡𝑡) 𝐩𝐩𝟐𝟐,𝟑𝟑 …
𝐩𝐩𝟐𝟐,𝟒𝟒 𝐪𝐪𝑀𝑀−2
𝐪𝐪𝑀𝑀−1 𝐜𝐜𝑀𝑀 𝛽𝛽(𝑡𝑡)
𝐪𝐪2
method.
Fig. 4: This figure illustrates the parameters of MINCO trajectory
III. S PATIAL -T EMPORAL T RAJECTORY O PTIMIZATION representation and its constraint evaluation. The trajectory is directly
network that shares trajectories and performs on-demand time
parameterized by every waypoint qi and time Ti . Red dots represent
F OR S WARMS
constraint points p̊ (introduced in Sec. III-C), where any constraint
A. Overview is evaluated and propagated to every qi and Ti .
The capability of spatial-temporal optimization comes from
a recently developed trajectory representation named TMINCO
[24]. It’s basic definition and features are explained in directly constructs a minimum control trajectory for an m-
Sec.III-B. Then Sec.III-C is a general description of how dimensional s-integrator chain for any specified initial and
to add various continuous-time constraints to TMINCO tra- terminal conditions. An instance in TMINCO is intuitively
jectories: by constraint quadrature. Finally in Sec.III-D, we shown in Fig. 4. Since this report focuses on swarm planning,
formulate an unconstrained nonlinear optimization, which is we only intuitively introduce the features of TMINCO . Detailed
then solved by gradient-decent methods. To solve it in this proofs are presented in [24].
way, costs and gradients of various objectives are defined and Feature 1: Compactly represented by q and T, TMINCO
derived in Sec.III-D as well. is a class of polynomial trajectories satisfying the following
control effort minimization:
B. MINCO Trajectory Class Z tM
Thanks to the differential flatness of multicopters [12], their min p(s) (t)T Wp(s) (t)dt, (3a)
p(t) t0
motion planning is sufficient to be directly performed on a
s.t. p[s−1] (t0 ) = p̄o , p[s−1] (tM ) = p̄f , (3b)
time differentiable curve. In this report, we adopt TMINCO
[24] for our trajectory representation, which is a minimum p(ti ) = qi , 1 ≤ i < M, (3c)
control polynomial trajectory class defined as ti−1 < ti , 1 ≤ i ≤ M, (3d)
n
TMINCO = p(t) : [0, T ] 7→ Rm c = M(q, T),
where W ∈ Rm×m is a diagonal matrix with positive entries,
o p̄o ∈ Rms and p̄f ∈ Rms the specified initial and terminal
q ∈ Rm(M −1) , T ∈ RM >0 , conditions, qi ∈ Rm is the given intermediate waypoints that
the trajectory is enforced to pass at time ti .
where p(t) is an m-dimensional M -piece polynomial trajec- Feature 2: The evaluation and differentiation of the map-
tory with degree N = 2s − 1, s the order of the relevant inte- ping c = M(q, T) enjoys linear complexity. A more specific
grator chain, c = (cT T T
1 , . . . , cM ) ∈ R
2M s×m
the polynomial correspondence can be expressed as the following function
coefficient, q = (q1 , . . . , qM −1 ) the intermediate waypoints
and T = (T1 , T2 , . . . , TM )T the time allocated for all pieces. M(T)c = b(q), (4)
Specifically, the trajectory is evaluated as
where M(T) ∈ R2M s×2M s is a banded matrix with non-
p(t) = pi (t − ti−1 ), ∀t ∈ [ti−1 , ti ]. (1) singularity for any T 0 ensured by Theorem 2 in [24],
b(q) ∈ R2M s×m . The conversion between these two trajectory
The i-th piece pi (t) : [0, Ti ] 7→ Rm is defined by representations, i.e., (c, T) and (q, T) is achieved using
pi (t) = cT Banded PLU Factorization with O(M ) linear time and space
i β(t), ∀t ∈ [0, Ti ], (2)
complexity. The evaluation is extremely fast, which takes
where β(t) := [1, t, · · · , tN ]T is the natural basis, ci ∈
PR
2s×m
about 1µs per piece for minimum-jerk trajectory generation
M
the coefficient matrix, Ti = ti − ti−1 and T = i=1 Ti . on a desktop CPU.
The core of TMINCO is the parameter mapping c = M(q, T) Feature 3: Feature 2 allows any user-defined objective or
constructed from Theorem 2 in [24]. Technically, the mapping penalty function F (c, T) with available gradients applicable
[Technical Report] FAST Lab, Zhejiang University
to TMINCO parameterized in (q, T). Specifically, the corre- D. Optimization Problem Construction
sponding objective of TMINCO is computed as
According to the above definitions of trajectory parameter-
H(q, T) = F (M(q, T), T). (5) ization and constraints imposition, we present the complete
optimization for trajectory planning in this section. The basic
Then the mapping c = M(q, T) gives a linear-complexity requirements on a desired trajectory are smoothness, dynam-
way to compute ∂H/∂q and ∂H/∂T from the corresponding ical feasibility, as well as safety among obstacles and other
∂F/∂c and ∂F/∂T. After that, a high-level optimizer is able agents. Extra objectives such as minimization of control effort
to optimize the objective efficiently. and execution time are also desired. In this report, we adopt
TMINCO to address above all concerns. Since TMINCO is
C. Time Integral Constraints
naturally guaranteed smooth by Theorem 2 in [24], no extra
The way how various constraints are implemented closely effort needs to be paid on trajectory continuity.
relevant to trajectory parameterization methods. Conven- In this work, we formulate the trajectory generation as an
tionally, requirements for dynamical feasibility and colli- unconstrained optimization problem:
sion avoidance are formulized as functional-type constraints
G(p(t), . . . , p(s) (t)) 0, ∀t ∈ [0, T ]. However, this formula-
X
min λx Jx , (12)
tion cannot be directly handled by constrained optimization, q,T
x
since G contains infinitely many inequality constraints. Thus,
we transform G into a finite-dimensional one via integral of where Jx are JΣ instances, λx is the weight for either an
constraint violation. Practically, the integral is transformed objective or penalty, subscripts x = {e, t, d, o, w, u} represent
into weighted sum JΣ (c, T) of the sampled penalty function. control effort (e), execution time (t), dynamical feasibility (d),
In most cases that constraints are decoupled, i.e., constraints obstacle avoidance (o), swarm reciprocal avoidance (w), and
G(p[s] (t)) with ti ≤ t < ti+1 are merely determined by ci and uniform distribution of constraint points (u). Commonly, a
Ti , the penalty for i-th trajectory piece is computed as sufficiently large weight suffices for penalties. The problem
is solved via unconstrained nonlinear optimization.
κi
Ti X j 1) Control Effort Je : Control effort follows the definition in
Ji (ci , Ti , κi ) = ω̄j χT max (G(ci , Ti , ), 0)3 , (6)
κi j=0 κi Eq. 3b. Without loss of generality, consider a single dimension
of the i-th piece pi (t) : [0, Ti ] 7→ R of a polynomial spline,
n g
where κi is the sample number on the i-th piece, χ ∈ R≥0 its derivatives with respect to ci and Ti are
a vector of penalty weights with appropriately large en- Z Ti !
tries, (ω̄0 , ω̄1 , . . . , ω̄κi −1 , ω̄κi ) = (1/2, 1, · · · , 1, 1/2) are the ∂Je (s) (s) T
=2 β (t)β (t) dt ci , (13)
quadrature coefficients following the trapezoidal rule [15]. We ∂ci 0
define the points determined by {ci , Ti , j/κi } as constraint
points p̊i,j = pi ((j/κi )Ti ) with pi (t) the i-th polynomial ∂Je T
piece. Then G(ci , Ti , j/κi ) = G(p̊i,j ). Note that p̊i,κi = = cT
i β
(s)
(Ti )β (s) (Ti ) ci . (14)
∂Ti
p̊i+1,0 . The cubic penalty is used here for its twice continuous
differentiability. JΣ (c, T) is then defined as the summation: 2) Execution Time Jt : A shorter execution time is desirable,
M
so
PM we also minimize the weighted total execution time Jt =
i=1 Ti . Obviously, its gradient ∂Jt /∂c = 0, ∂Jt /∂T = 1.
X
JΣ (c, T) = Ji (ci , Ti , κi ). (7)
i=1 3) Dynamical Feasibility Penalty Jd : For the trajectory
generation of the multicopter, dynamical feasibility is always
Since the gradients w.r.t. (c, T) are constructed from all guaranteed by limiting the trajectory’s derivatives. In our work,
(ci , Ti ), we only give gradient templates to ci and Ti using we limit the amplitude of velocity, acceleration, and jerk. Note
the chain rule, that the dynamical feasibility penalty is acquired using time
∂JΣ ∂JΣ ∂G
= , (8) integral in Sec. III-C. Following the Eq. 6, constraints of
∂ci ∂G ∂ci
velocity, acceleration, and jerk are denoted as
∂JΣ Ji ∂JΣ ∂G ∂t ...
= + , (9) Gv = ṗ(t)2 −vm 2
, Ga = p̈(t)2 −a2m , Gj = p (t)2 −jm 2
, (15)
∂Ti Ti ∂G ∂t ∂Ti
κi
∂JΣ Ti X j where vm , am , jm are maximum allowed velocity, acceleration
=3 ω̄j max (G(ci , Ti , ), 0)2 ◦ χ. (10)
∂G κi j=0 κi and jerk. The corresponding gradients are
∂t j j ∂Gx ∂Gx
= , t = Ti . (11) = 2β (n) (t)p(n) (t)T , = 2β (n+1) (t)T ci p(n) (t),
∂Ti κi κi ∂ci ∂t
(16)
From above equations, once the gradients of constraints G where x = {v, a, j}, n = {1, 2, 3}, and t = jTi /κi + ti−1 , t ∈
relative to polynomial coefficients ci and time t are evaluated, [ti−1 , ti ], respectively. Substituting Eq. 15 into 6 and 16 into
the gradients of JΣ (c, T) are efficiently computed. Eq.8, 9 gets the penalty and the gradient of ci and Ti ;
[Technical Report] FAST Lab, Zhejiang University
Other Agents’
𝐬 𝐯 Trajectories
Enlarging
𝑑 = (𝐩key − 𝐬) T 𝐯
𝐩key
where Nsv is the number of {s, v} pairs recorded by a single ∂Jw /∂ci . However, the gradient to T is more complicated,
p̊i,j and t = jTi /κi the relative time on this piece. For the since the gradient template equation 9 does not hold here.
case that Co ≥ d(p(t), {q, v}k ), 0), the gradient is computed: Considering previous time profile, the proper gradient to the
preceding time Tl for any 1 ≤ l ≤ i should be computed as
∂Gok ∂Gok
= −β(t)vT , = −vT ṗ(t). (20)
∂ci ∂t U U M κ
∂Jw X ∂Jw
k
XXX i
∂Jwi,jk
Otherwise, ∂Gok /∂ci = 02s×m , ∂Gok /∂t = 0. = = , (25)
∂Tl ∂Tl ∂Tl
5) Swarm Reciprocal Avoidance Penalty Jw : In this work, k=1 i=1 j=0 k=1
a non-cooperative swarm framework is adopted, which means ∂Jwi,jk Jwi,jk ∂Jwi,jk ∂Gwi,j
k
that one agent receives other agents’ trajectories as constraints = + i,j
, (26)
∂Tl Tl ∂Gwk ∂Tl
and generates trajectories only for itself. Considering the u- i,j i,j i,j
∂Gw ∂Gw ∂t ∂Gw ∂τ
th agent in a multicpoter swarm containing U agents, swarm k
= k
+ k
, (27)
collision avoidance is guaranteed when the trajectory pu (t) ∂Tl ( ∂t ∂T l ∂τ ∂T l
i,j T
of agent u maintains a distance greater than a safe clearance ∂Gw k
2 (pk (t) − pu (τ )) Eṗu (t) k 6= u,
= (28)
Cw to all the trajectories at the same global timestamps of ∂t 0 k = u,
other agents as depicted in Fig. 6. However, one caution must i,j
(
T
be taken that we have always been using a relative time t = ∂Gwk 2 (pu (t) − pk (τ )) Eṗk (τ ) k 6= u,
= (29)
jTi /κi in our optimization, while the other agents’ trajectories ∂τ 0 k = u,
take an absolute timestamp τ = T1 +· · ·+Ti−1 +jTi /κi . This
( (
j j
∂t l = i, ∂τ l = i,
indeed makes the penalty evaluated on the i-th piece depend = κi = κi (30)
∂Tl 0 l < i, ∂Tl 1 l < i.
on all its preceding piece times. Thus we denote by Gi,j w the
[Technical Report] FAST Lab, Zhejiang University
Skipped
Observation
Camera
Safety Ratio and Distance to Obstacles (in abbreviation Dis. VIO SE3 Controller
Odometry
to Obs.) are the minimal value in all records. Agent 0
Agent 1
1) Comparison of Reciprocal Collision Avoidance without Agent 2
Static Obstacles: Results are given in Tab. II. A bold term IMU Flight controller
Flight
Flightcontroller
controller
indicates a better value in the corresponding category (On-
line/Offline), while an underlined term indicates the best value
Fig. 10: System Architecture.
among all planners. Indicated by Tab. II, SCP achieves a better
flight time at the sacrifice of trajectory safety. RBP achieves
better dynamical performance int(a2 ) and int(j 2 ) with the 3) A Brief Conclusion: From the comparison, the proposed
cost of a relatively long flight time. Both of these methods fail method achieves top-level performance with the shortest com-
to balance various aspects of trajectory quality. Compared with putation time. This achievement is attributed to low complexity
offline methods, three online planners show more balances of TMINCO -based problem formulation and decoupling of
while requiring less computation by orders of magnitude. spatial-temporal parameters. All the programmers run on an
Among them, MADER shows more conservativeness due Intel Core i7-10700KF CPU at 5.1GHz.
to the simplex representation of the trajectory, although the
V. R EAL -W ORLD E XPERIMENTS
simplex enjoys almost the minimum volume.
2) Comparison of Reciprocal Collision Avoidance with A. System Architecture
Static Obstacles: Results are presented in Tab. III for three System architecture is depicted in Fig. 10. A broadcast
online planners. Note that MADER uses a pre-built map network that shares trajectories and performs on-demand time
while EGO-Swarm and the proposed method perform online synchronization is the only connection among all agents.
sensing and mapping. From the data, MADER demands Therefore it is less coupled than [26].
aggressive maneuvers for the vehicle while the other two In Fig. 10, three of the modules may be confusing, here is
are more superior in their trajectory smoothness. Another a simple explanation. The module ”Drone Detection” is for
experiment to intuitively visualize the benefits of temporal detecting other agents witnessed. The module ”Frame Align-
optimization is illustrated in Fig. 9. ment” compensates single agent localization drift using the
[Technical Report] FAST Lab, Zhejiang University
(d) Trajectory and map created during the flight in dense environment.
Fig. 11: Composite images of indoor experiments, in which three drones fly to three reverse placed targets. Dashed lines indicate the relative
positions in the same photographs. As shown in the images, agents maintain a safe distance from each other while avoiding obstacles. Note
that the trajectories are smooth, and there is no detour.
position deviation acquired from ”Drone Detection” module. orders of magnitude and the top-level solution quality. Real-
The module ”Agent Removal” removes pixels of other agents word experiments also demonstrate the wide applicability of
from depth images, since these pixels can interfere the depth- our framework.
based obstacle mapping. A detailed explanation of these three It’s worth noting that the formulation of reciprocal collision
modules is in our previous work [26]. avoidance in Sec. III-D5 enables the planner to avoid moving
obstacles as well. Since we use non-cooperative swarms,
B. Indoor other agents and moving obstacles behave identically from the
We present several indoor experiments at a speed limit planner’s view. However, we exclude this module in the current
of 3.0m/s for empty space and 2.0m/s for narrow gate work due to the immature front-end of moving object detection
and obstacle-rich scenarios, as depicted in Fig.11. The left and prediction, which is taken as the future work to enable
one shows three quadrotors perform a cross flight in empty fully autonomous aerial swarms in dynamical environments.
space where reciprocal collision avoidance is necessary. In the
middle one, quadrotors manage to pass through a narrow gate
R EFERENCES
one after another. In the right figure, we set up a cluttered
environment composed of vertical and horizontal obstacles, [1] Senthil Hariharan Arul and Dinesh Manocha. DCAD:
where the narrowest gate is less than 1m. In such a cluttered Decentralized Collision Avoidance With Dynamics Con-
environment, three quadrotors manage to navigate across this straints for Agile Quadrotor Swarms. IEEE Robotics and
environment sequentially and smoothly. Automation Letters, 5(2):1191–1198, 2020.
C. Outdoor [2] Federico Augugliaro, Angela P Schoellig, and Raffaello
D’Andrea. Generation of Collision-Free Trajectories
Outdoor experiments are presented to validate that the for a Quadrocopter Fleet: A Sequential Convex Pro-
proposed system is capable of field operations. Snap- gramming Approach. In 2012 IEEE/RSJ international
shots of this experiment with the map built during the conference on Intelligent Robots and Systems (IROS),
flight are shown in Fig. 1. Please watch the video at pages 1917–1922. IEEE, 2012.
https://www.youtube.com/watch?v=w5GDMpjAoVQ for more [3] Yufan Chen, Mark Cutler, and Jonathan P How. Decou-
information. The hardware settings are the same as [7]. pled Multiagent Path Planning via Incremental Sequen-
tial Convex Programming. In 2015 IEEE International
VI. C ONCLUSION AND F UTURE W ORK
Conference on Robotics and Automation (ICRA), pages
In this work, we propose a decentralized spatial-temporal 5954–5961. IEEE, 2015.
trajectory planning framework for multicopter swarms. Our [4] Soon-Jo Chung, Aditya Avinash Paranjape, Philip
framework is powered by TMINCO trajectory class capa- Dames, Shaojie Shen, and Vijay Kumar. A Survey on
ble of spatial-temporal deformation, and the time integral Aerial Swarm Robotics. IEEE Transactions on Robotics,
penalty functional based on constraint transcription. All these 34(4):837–855, 2018.
components enjoy efficiency originating from low-complexity. [5] Flaviu Cristian. Probabilistic Clock Synchronization.
Extensive benchmarks are performed to show the speedup by Distributed Computing, 3(3):146–158, 1989.
[Technical Report] FAST Lab, Zhejiang University