You are on page 1of 7

2021 IEEE International Conference on Robotics and Automation (ICRA 2021)

May 31 - June 4, 2021, Xi'an, China

Fast Swing-Up Trajectory Optimization for a Spherical Pendulum


on a 7-DoF Collaborative Robot
Minh Nhat Vu1 , Christian Hartl-Nesic1 , and Andreas Kugi1,2
2021 IEEE International Conference on Robotics and Automation (ICRA) | 978-1-7281-9077-8/21/$31.00 ©2021 IEEE | DOI: 10.1109/ICRA48506.2021.9561093

Abstract— In this paper, the experimental swing-up of a In most works, the swing-up and stabilization of a pen-
spherical pendulum mounted on a collaborative robot is pre- dulum is performed using a custom-designed mechanical
sented. The complete mechanical system consists of nine degrees system which is capable of providing the required forces
of freedom (DoFs). The primary focus of this work is the design
of a fast trajectory planning for the swing-up by systematically and torques. In contrast, the complexity increases signif-
incorporating the kinematic and dynamics constraints. The icantly for implementations on an industrial robot, where
proposed algorithm consists of two steps: First, an offline the joint position, joint velocity, and torque limits have to
trajectory optimization is used to build a database of swing- be considered additionally. Winkler et al. [14] presented
up trajectories, with an average computing time of 10 s for the swing-up and stabilizing of a single inverted pendulum
one trajectory. Second, a fast trajectory replanner based on a
constrained quadratic program is described, which computes on the KUKA KR3 by using an intuitive control scheme
the swing-up trajectory for an arbitrary initial configuration consisting of a feedforward and a balancing controller. Re-
of the system with an average computing time of 0.2 s. Simu- cently, experiments of balancing an inverted pendulum on a
lations and experimental results demonstrate the swing-up of 7 degrees of freedom (DoFs) robotic arm are accomplished
the spherical pendulum using a discrete time-variant linear by using Bayesian optimization [15] and model-based policy
quadratic regulator as a feedback controller.
search [16] for automatic tuning of control strategies. The
I. INTRODUCTION stabilization of a spherical pendulum was first demonstrated
on the robot "DLR LWR II" by Schreiber et al. in [17].
For several years, the inverted pendulum has been a Up to the authors’ knowledge, the swing-up of a spherical
popular benchmark for research in nonlinear control and pendulum on an industrial robot has not been demonstrated
trajectory optimization, see, e.g., [1]. Inverted pendulums so far. In the authors’ previous work [18], a first approach
with increasingly complex kinematics were published, such was proposed which only allows a restricted movement of
as the single pendulum [2], the double pendulum [3], the the robot. Despite these restrictions, due to the complexity
triple pendulum on a cart [4], and the spherical pendulum [5]. of the system with 9 DoFs and the associated kinematic
Despite its simple structure, the inverted pendulum represents and dynamic constraints, the calculation of feasible swing-
an underactuated nonlinear system and exhibits an unstable up trajectories was very time consuming. Thus, the swing-up
equilibrium point and non-minimum phase behaviour. There- demonstration in [18] is limited to the fixed initial and target
fore, it offers exciting challenges for trajectory optimization configuration.
and nonlinear control. In order to accomplish the desired motion of a sys-
In the 1960s, several contributions on the stabilization tem involving inverted pendulums, the desired trajectory is
of the single inverted pendulum were published, see, e.g., first computed through an optimization scheme such as a
[6], [7]. Rather than only to achieve the stabilization task, multiple-shooting optimization [19], and then tracking the
the swing-up control of an inverted pendulum system was desired trajectory through a feedback controller such as
first presented using a simple feedforward control [8]. Later, sliding mode control [20] or a time-variant linear quadratic
more advanced control strategies are demonstrated on this regulator (LQR) [21]. Another common approach is to com-
simple system such as nonlinear model predictive control [9], bine the planning and controlling phase by using a library
reinforcement learning [10], and a Lyapunov-based damping of trajectories in order to directly design the controllers for
controller [11]. As an alternative system to the pendulum the nonlinear underactuated system, see, e.g., [22], [23]. In
on a cart, a rotary pendulum system was introduced in [12], [22], the library of trajectories is built upon a sparse tree of
where a bang-bang type feedback controller for the swing-up LQR-stabilized trajectories, its region of attraction is verified
action together with a least-squares regulator for stabilization using a simulation-based falsification method. Although this
were employed. In 2013, the swing-up of the triple inverted approach shows promising results on generating the tree
pendulum on a cart was experimentally demonstrated in [4] policies for the trajectory stabilization, long computation
by using an output-constrained feedforward controller [13] periods are required for generating and verifying a single
on a real-time hardware. tree policy.
1 Minh Nhat Vu, Christian Hartl-Nesic, and Andreas Kugi are with the In this work, the custom-built spherical pendulum tool
Automation & Control Institute (ACIN), TU Wien, 1040 Vienna, Austria from [18] is mounted on the 7-axis collaborative robot
{vu,hartl,kugi}@acin.tuwien.ac.at KUKA LBR iiwa 14 R820. The main contribution of this
2 A. Kugi is with the Center for Vision, Automation & Control,
AIT Austrian Institute of Technology GmbH, 1210 Vienna, Austria work is a fast algorithm for the computation of swing-
andreas.kugi@ait.ac.at up trajectories, which consists of two steps. In the first

978-1-7281-9077-8/21/$31.00 ©2021 IEEE 10114

Authorized licensed use limited to: Sejong Univ. Downloaded on February 12,2023 at 04:16:59 UTC from IEEE Xplore. Restrictions apply.
step, an offline trajectory planner computes a database of O0
near time-optimal swing-up trajectories with initial states
from an equidistantly discretized joint space. This planner
(O1 , q1 ) d1 = 0.1525 m
systematically accounts for the dynamic constraints on the
system states and control inputs. In the second step, a fast
trajectory replanner calculates a new swing-up trajectory (O2 , q2 ) d2 = 0.2025 m
from an arbitrary initial state in a given range of the joint
space. Subsequently, a discrete LQR is applied to stabilize
the robot on this trajectory. A video of typical swing-up (O3 , q3 )
d3 = 0.2325 m
maneuvers can be found at www.acin.tuwien.ac.at/8c60.
The remainder of this paper is structured as follows: In (O4 , q4 )
Section II, the experimental setup and the mathematical d4 = 0.1825 m
model, consisting of the robot and the spherical pendulum,
are presented in detail. Section III deals with the offline (O5 , q5 )
d5 = 0.2175 m
trajectory planner and builds up a trajectory database for the
complete mechanical system. Section IV is concerned with (O6 , q6 )
the design of a fast replanning algorithm, which chooses the d6 = 0.1825 m
(O7 , q7 )
closest swing-up trajectory from the database and adapts this
trajectory via a constrained quadratic program. Simulation (O8 , q8 ) d7 = 0.081 m
and experimental results are shown in Section VI. The last (O9 , q9 ) d8 = 0.155 m
section concludes this work.
d9 = 0.073 m
II. EXPERIMENTAL SETUP AND MATHEMATICAL
MODEL
Fig. 2. Schematic drawing of the ceiling-mounted robot KUKA LBR iiwa
The experimental setup, shown in Fig. 1, comprises three in the hanging position and its corresponding coordinate frames Oi , i =
main components, i.e. the ceiling-mounted robot KUKA 1, . . . , 9. The x-, y-, and z-axis of each coordinate frame are drawn as red,
green, and blue arrows, respectively.
LBR iiwa, the PC running M ATLAB/S IMULINK modules
inside the real-time automation software BECKHOFF Twin- TABLE I
CAT, and the spherical pendulum tool. The PC communicates C OORDINATE TRANSFORMATION OF THE SYSTEM
with the robot and the spherical pendulum tool via two
network interface cards (NIC) using the EtherCAT protocol. Frame On Frame Om Transformation matrix Tm n
The complete mechanical system includes the 7-axis robot 0 1 Dz (d1 )Rz (q1 )
and the spherical pendulum tool, and consists of 9 DoFs. The 1 2 Dz (d2 )Rx (−π/2)Rz (q2 )
2 3 Dy (d3 )Rx (π/2)Rz (q3 )
coordinate frames Oi , i = 1, . . . , 9, of the complete system 3 4 Dz (d4 )Rx (π/2)Rz (q3 )
are depicted in Fig. 2, where the red, green, and blue arrows 4 5 Dy (d5 )Rz (π)Rx (π/2)Rz (q5 )
represent the x-, y-, and z-axes, respectively. The spherical 5 6 Dz (d6 )Rx (π/2)Rz (q6 )
6 7 Dy (d7 )Rz (π)Rx (π/2)Rz (q7 )
pendulum tool has two intersecting axes, i.e., the z-axis of 7 8 Dy (d8 )Rz (q8 )
the frame O8 and the z-axis of the frame O9 , which can 8 9 Dy (d9 )Rx (π/2)Rz (π)Rz (q9 )
be rotated freely. For more information on the mechanical
design and the integrated sensors of this custom-built tool,
the reader is referred to [18]. The complete system is modeled as a rigid-body system
in the state-space form using the generalized coordinates
EtherCAT
EtherCAT qT = [q1 , q2 , . . . , q9 ], see Fig. 2, which are the rotation
angles qi around the z-axes (blue arrows) of each coordinate
frame Oi , i = 1, . . . , 9. In this work, the homogeneous
KUKA LBR iiwa 14 transformation of a simple translation by the distance d along
the local axis j ∈ {x, y, z} is denoted by Dj (di ), whereas
an elementary rotation around the local axis j ∈ {x, y, z}
NIC 1 NIC 2 by the angle φ is described by Rj (φ). Based on these basic
spherical pendulum TwinCAT transformations, the coordinate transformations Tm
tool n between
MATLAB/Simulink
two adjacent frames On and Om are constructed and listed
PC in Tab. I.
The equations of motion are derived using the Lagrange
formalism, see, e.g., [24],
 
Fig. 1. Overview of the experimental setup. τ
M(q)q̈ + C(q, q̇)q̇ + g(q) = A , (1)
0

10115

Authorized licensed use limited to: Sejong Univ. Downloaded on February 12,2023 at 04:16:59 UTC from IEEE Xplore. Restrictions apply.
TABLE II
In more compact form, (4) is rewritten as
K INEMATIC AND DYNAMIC LIMITS OF THE SYSTEM
d
x = f (x, u). (5)
Joint i Angle limits Velocity limits Torque limits dt
qi (◦ ) q̇i (◦ /s) τA,i (N)
1 170 85 320
It is worth noting that the inversion of MU ∈ R2×2 in (4)
2 120 85 320 is computationally much more efficient than an inversion of
3 170 100 176 M ∈ R9×9 in (1).
4 120 75 176
5 170 130 110 A. Near time-optimal swing-up trajectory
6 120 135 40
7 175 135 40 Next, the optimization scheme is formulated using the
8 180 180 – direct collocation method, see, e.g., [27], by discretizing the
9 45 180 –
swing-up trajectory into N + 1 grid points and solving the
resulting static optimization problem
N
where M(q) ∈ R9×9 denotes the symmetric and positive 1 X T
min J(ξ) = tF + h uk Ruk (6a)
definite mass matrix, C(q, q̇) ∈ R9×9 is the Coriolis matrix, ξ 2
k=0
g(q) ∈ R9×1 is the force vector related to the potential 1
energy, and τA ∈ R7×1 are the motor torque inputs. The s.t. xk+1 − xk = h(fk + fk+1 ) (6b)
2
kinematic and dynamic parameters of the KUKA LBR iiwa x0 = xS , xN = xT (6c)
in (1) are taken from [25]. The kinematic and dynamic limits
x ≤ xk ≤ x, k = 0, . . . , N , (6d)
of the robot [26] and the spherical pendulum tool [18] are
summarized in Tab. II. All limits are symmetric w.r.t zero, i.e. where the index k indicates the discrete time step t = kh,
qi = −qi , q̇i = −q̇i , and τA,i = −τA,i . Note that only 50% fk = f (xk , uk ) is the right-hand side of (5), and the vector
of the limitations of the torque input and the joint velocity in of optimization variables
Tab. II are considered because of the limited working space
in the laboratory and for safety reasons. Since an underlying ξ T = [tF , xT T T T
0 , . . . , xN , u0 , . . . , uN ] (7)
friction compensation is implemented in the experimental is introduced. Note that the final time tF in (6a) denotes
setup, friction is not considered in the system model. the time it takes for the system (5) to transition from
III. OFFLINE TRAJECTORY OPTIMIZATION the starting state xS to the target state xT . Additionally,
h = tF /N is the time step and R is the positive definite
In this section, the offline trajectory planner is derived
weighting matrix for the input u. The target state xT for the
to compute a database of time-optimal swing-up trajectories
spherical pendulum is the upright position, i.e. the unstable
for the system (1). These trajectories take into account the
equilibrium with [q8 , q9 ] = [180◦ , 0]. To avoid collisions with
full system state xT = [qT , q̇T ] and the motor torques τA .
the surroundings, the target state xT is partly constrained
In order to reduce the complexity of the system dynamics
using
and thus enhance the computational speed, (1) is partitioned
into an actuated subsystem qT A = [q1 , q2 , . . . , q7 ] and an xT = [0, q2F , 0, q4F , 0, q6F , 0, 180◦ , 0]T ,
unactuated system qT U = [q 8 9 ]. Thus, (1) reads as
, q
          with
MA MAU q̈A
+
CA CAU q̇A g
+ A = A ,
τ q2F − q4F + q6F = 90◦ , (8)
MT AU MU q̈U CUA CU q̇U gU 0
(2) as shown in Fig. 3. Furthermore, the system dynamics (5) are
where qT = [qT A , q T
U ]. Note that the arguments of the func-
tions are omitted for clarity of presentation. The acceleration
of the unactuated joints q̈U and the motor torques τA can
be calculated from (2), resulting in
q̈U = M−1 T
U (−MAU q̈A − CUA q̇A − CU q̇U − gU ) (3a)
τA = MA q̈A + MAU q̈U + CA q̇A + CAU q̇U + gA (3b) q2
The acceleration of the actuated joints q̈A directly acts on
the unactuated system. Thus, by considering u = q̈A as the
new input to the system, the state-space representation of (1) q4
reads as q6
   
qA q̇A
Fig. 3. Schematic drawing of an example target configuration. The black
d 
qU  = 
  q̇U 
. circle and the cross indicate the local z-axis pointing out of and into the
dt q̇A
   u  plane of the paper, respectively.
−1 T
q̇U MU (−MAU u − CUA q̇A − CU q̇U − gU )
(4) approximated using the trapezoidal rule in (6b). Moreover, x

10116

Authorized licensed use limited to: Sejong Univ. Downloaded on February 12,2023 at 04:16:59 UTC from IEEE Xplore. Restrictions apply.
and x denote the symmetric lower and upper bounds of the two angles q8 and q9 and the joint velocities q̇ of xS are
state, respectively. Since the accelerations of the actuated zero. Thereby, a time-optimal swing-up trajectory is planned
joints are considered as system inputs, i.e. u = q̈A , the offline from every grid point in this discrete joint space using
motor torque limits have to be incorporated as additional (6), (8), and (11). Then, these offline trajectories are stored
constraints. By substituting (3a) into (3b), the motor torque in a database. Note that each trajectory in the database is
limits are reformulated as represented by a label, which contains the actuated joint
angels qA of the starting state xS . In order to efficiently
τA ≤ (M(q
f k )uk + C(q e(qk )) ≤ τA ,
e k , q̇k )q̇k + g (9) search for a suitable trajectory in the database, this labeled
where data is preprocessed with the K-D tree algorithm [30],
f = MA − MAU M−1 MT which is a well-known space-partitioning data structure for
M U AU partitioning and organizing points.
−1
CAU − MAU M−1
 
C = CA − MAU MU CU A
e
U CU IV. FAST TRAJECTORY REPLANNER
e = gA − MAU M−1
g U gU . In the previous section, an offline trajectory optimization
(10)
is used to build a database of swing-up trajectories for a
Note that (9) is a computationally costly inequality con-
discrete set of starting states xS of the system (5). However,
straint, mainly due to the large expressions in the Coriolis
the actual starting state of the system x̄S is almost certainly
matrix C. e In fact, the Coriolis matrix is often neglected
not contained in the database. Therefore, a fast trajectory re-
in industrial applications [28]. To account for the effect
planner is proposed in this section to adapt the precomputed
of the Coriolis matrix C, e the value range of C e q̇ for the
trajectories.
complete system is examined using a Monte-Carlo simula-
In the first step, the nearest-neighbor search algorithm [31]
tion. In this simulation, 108 uniformly distributed random
is applied to find the grid point in the database closest to
state vectors x are selected from the admissible range, see
x̄S . Using the pre-built K-D tree, the search for the closest
Tab. II. This simulation reveals that the values of C e q̇ are
grid point quickly eliminates the large portions of the search
bounded between cT = [6, 8, 3, 4, 1, 1, 0.1] N m and cT =
space. The near time-optimal swing-up trajectory from the
−[6, 7, 3, 4, 1, 1, 0.1] N m, which is significantly smaller than
database corresponding to the grid point closest to x̄S is
the motor torque limits. Although the effect of the Coriolis
denoted as
matrix on the overall system dynamics is not significant,
it is still worth considering the determined bounds in the (ξ ∗ )T = [(t∗F )T , (x∗0 )T , ..., (x∗N )T , (u∗0 )T , ..., (u∗N )T ]. (13)
optimization scheme to avoid exceeding the motor torque
In the second step, ξ ∗ is exploited to calculate the swing-
limits. To this end, the costly inequality constraint (9) is
up trajectory for the actual starting state x̄S which deviates
replaced by
from x∗0 . It can be considered that only small deviations δξ =
τA − c ≤ (M(q e(qk )) ≤ τA − c .
f k )uk + g (11) [δtF , δx0 , ..., δxN , δu0 , ..., δuN ]T are required to obtain the
adapted swing-up trajectory. Thus, the discrete-time system
The swing-up trajectory is found by solving the static dynamics (6b) are linearized around ξ ∗ in the form
optimization problem (6), (8), and (11) using the Interior
t∗

Point OPTimize (IPOPT), which is an open-source package δxk+1 = δxk + F Γxk δxk + Γu k δuk +
based on the interior point method (IPM) for large scale 2N

nonlinear programming, see, e.g., [29]. In order to speed up δtF ∗ ∗
Γxk+1 δxk+1 + Γu k+1 δuk+1 + (f + fk+1 ),
the computation, the gradient of (6a), the Jacobian of the 2N k
constraints (6b-d), (8), and (11) are computed analytically. (14)
After the optimal values are found at the collocation points, with
δxk = xk − x∗k
the trajectory of the state x and the input u need to be
interpolated. Between two adjacent collocation points k and δuk = uk − u∗k
k + 1, the input u(t) and state x(t) for t ∈ [kh, (k + 1)h] δtF = tF − t∗F
are a simple linear and quadratic spline, respectively, i.e. fk∗ = f (x∗k , u∗k ) ,
(t − kh)2 and
x(t) ≈ xk + (t − kh)fk + (fk+1 − fk ) ∂f
2h (12) Γxk =
∂x x∗k ,u∗k

t − kh
u(t) ≈ uk + (uk+1 − uk ) , ∂f
h Γu =
k
∂u x∗k ,u∗k

for k = 0, . . . , N − 1.
for k = 0, . . . , N − 1. In addition, the deviations
B. Trajectory database
δx0 = x̄S − x∗0
After deriving all ingredients for solving a single swing- (15)
up trajectory optimization, an offline database is built from δxN = x̄T − x∗N = 0
a discrete joint space. Assuming that the swing-up trajectory are fixed due to the given actual system state x̄S and the
starts from an equilibrium point of the complete system, the equilibrium final state x̄T from the trajectory ξ ∗ found in

10117

Authorized licensed use limited to: Sejong Univ. Downloaded on February 12,2023 at 04:16:59 UTC from IEEE Xplore. Restrictions apply.
the database. The deviation δξ is obtained as the solution of The LQR control law reads as, see, e.g., [34],
a constrained quadratic program (QP) [32]
∆uk = Kk ∆xk , (22)
1
min Jξ = δξ T Qδξ (16a) with the controller feedback gain matrix Kk . These gain
δξ 2
matrices are computed by iterating the discrete-time Riccati-
s.t. Aδξ = b (16b)
equation
δξ ≤ δξ ≤ δξ , (16c)
−1
Kk = −(Rc + ΓT
k Pk+1 Γk ) (ΓT
k Pk+1 Φk ) (23a)
with the positive definite weighting matrix
Pk = (Qc + ΦT
k Pk+1 Φk )− (23b)
Q = diag(QtF , Qx0 , . . . , QxN , Qu0 , . . . , QuN ) . (17) ΦT P T
Γ (R + ΓT
P Γ )−1 T
Γ P Φ
k k+1 k c k k+1 k k k+1 k
The equality constraints (16b) are constructed by combining
backwards from k = Nc −1, . . . , 1 with PNc = X. With this
the equations of the linearized system dynamics (14) to-
approach, no switching of the controller is required. Finally,
gether with (15). The inequality constraints (16c) correspond
T by substituting (22) into (3b), an expression for the motor
to (6d) and (11), where δξ T = [0, δxT , δuT ], δξ = torque τA is found, reading as, see also (9) and (10)
T T
[δtF , δx , δu ], with δx = x − x∗ , δx = x − x∗ , δu =
τA,k = M(q
f k )(ud + Kk ∆xk + C(q
e k , q̇k )q̇k + g
e(qk )) .
u − u∗ , δu = u − u∗ , and the sufficiently large upper bound
(24)
δtF . The computational efficiency is further improved by
implementing a receding horizon scheme which splits the VI. EXPERIMENTAL RESULTS
full horizon of N grid points into s sub-horizons consisting The offline and online trajectory optimization is imple-
of N/s grid points. Subsequently, the constrained QP (16) is mented with M ATLAB (R2019b) using a computer equipped
applied to each sub-horizon individually while considering with a 1.8 GHz Intel Core i7-8550K and 16 GB RAM. The
that the endpoint of one sub-horizon corresponds to the nonlinear optimization problem (6) is solved using the state-
starting point of the next one. of-the-art interior-point solver IPOPT together with the linear
V. CONTROLLER DESIGN solver MA57, see, e.g., [29], [35]. In order to enhance the
In order to perform the swing-up trajectory in the exper- computation speed, the automatic differentiation (AD) of the
iment, a closed-loop controller is necessary to handle the cost function and the constraint functions are implemented
neglected friction and model uncertainties. Thus, a discrete with the CasADi package [36].
time-variant LQR is employed to stabilize the system (5) Each trajectory is discretized into N = 100 collocation
around the swing-up trajectory from Section IV. points, yielding a total of 2501 optimization variables. For
The system (5) is linearized around the desired tra- safety reasons, the constraints on the joint velocities and
jectory (xd , ud ) = (x∗ + δx, u∗ + δu), with (xd )T = motor torques according to Tab. II are reduced by 50%.
[(qdA )T , (qdU )T , (q̇dA )T , (q̇dU )T ] and (δx, δu) is the solution Without loss of generality and to avoid collisions with the
of the constrained QP (16). Assuming that the system matri- surroundings, the angles of four joints q1 , q2 , q4 , and q6 of the
ces remain constant for each discrete time step k, the discrete starting state xS are discretized equidistantly in the ranges
time-variant system dynamics read as q1 ∈ [−60◦ , 45◦ ], q2 ∈ [−30◦ , 30◦ ], q4 ∈ [−120◦ , −60◦ ],
and q6 ∈ [−60◦ , 60◦ ]. The remaining joint angles are fixed
∆xk+1 = Φk ∆xk + Γk ∆uk , (18) at 0. In each admissible range, four grid points are used.
with Thus, the number of offline planned swing-up trajectories
in the database is 256. The total computing time for the
∆xk = xk − xdk ∆uk = uk − udk offline database is about 43 min, which results in an average
Φk = I18 + Ts Ak Γk = Ts Bk computing time for one trajectory of 10 s. In order to verify
(19)
∂f ∂f the efficiency of the fast trajectory replanner, a Monte
Ak = Bk = ,
∂x xdk ,udk

∂u xdk ,udk
Carlo simulation is performed by randomly selecting 104
initial states xS from a uniform random distribution in the
and the sampling time Ts . The discrete time-variant LQR
admissible ranges. Even though the offline database is built
is applied to stabilize this error system by minimizing the
using a very coarse grid, all 104 test cases in the Monte Carlo
objective function
simulation provide feasible solutions and adhere to the input
c −1
NX
and state constraints. The computing time for a fast trajectory
J= (∆xT T T
k Qc ∆xk + ∆uk Rc ∆uk ) + ∆xN X∆xN , replanning is approximate 200 ms in average, which is 50
k=0
(20) times faster than the offline trajectory optimization.
where Qc and Rc are the positive definite weighting matri- Figure 4 illustrates the simulation results of the
ces. Moreover, the matrix X is chosen as the unique solution fast trajectory replanner from the starting state xT S =
of the algebraic Riccati equation, see [33], [−15◦ , −15◦ , 0, −95◦ , 0, 10◦ , 0, 0, 0] in solid lines, which
deviates from the closest grid point in the database, i.e.
X = ΦT T
Nc XΦNc − (ΦNc XΓNc ) the starting state xT ◦
(21) S = [0, 0, 0, −90 , 0, 0, 0, 0, 0] shown as
−1
(Rc + ΓT
Nc XΓNc ) (ΓT
Nc XΦNc ) + Qc . dashed lines. The trajectories in Fig. 4 are demonstrated in

10118

Authorized licensed use limited to: Sejong Univ. Downloaded on February 12,2023 at 04:16:59 UTC from IEEE Xplore. Restrictions apply.
simulation without the closed-loop control from Section V. around the trajectory in Fig. 5 can be attributed to the
Note that for both, the offline trajectory optimization and the parameter and model uncertainties and to the measurement
fast trajectory replanner, a straight line from the starting state noise present in the velocity signals.
to the target state is used as an initial guess. Both trajectories
are feasible with respect to the dynamical constraints (6b) as qi /q¯i q̇i /q̇¯i τA,i /τA,i
well as the limits of the robot (6d), (11), and (8). 1
joint 1
qi /q¯i q̇i /q̇¯i τA,i /τA,i 0.5

1 0

0.5 −0.5
joint 2 joint 3
0 −1
1
−0.5 joint 6
joint 1 joint 2 joint 3 0.5
−1
1 0
joint 6
0.5 −0.5
joint 4 joint 5
0 −1
−0.5 1
joint 9
joint 4 joint 5
−1 0.5
1
joint 7 joint 9 0
0.5
−0.5
0 joint 7 joint 8
−1
0 1 2 3 0 1 2 3 0 1 2 3
−0.5
joint 8 time (s) time (s) time (s)
−1
0 1 2 0 1 2 0 1 2
Fig. 5. Experimental results for the swing-up and stabilization of the
time (s) time (s) time (s) spherical pendulum with the fast trajectory replanner. The colored solid lines
present the scaled measurement data, and the corresponding black dashed
Fig. 4. Time evolution of the offline and the replanned trajectory lines show the desired trajectory from Section IV.
optimization. The dashed lines are the scaled offline trajectory from the
database in Section III, and the solid lines represent the scaled replanned
trajectory from Section IV. For safety reasons, the limits for the joint VII. CONCLUSIONS
velocities and the motor torques are 50% lower than the limits in Tab.
II. In this work, a fast trajectory replanner based on a database
of precomputed near time-optimal trajectories for the swing-
In the experiment, the joint velocities of the complete up of a spherical pendulum on a 7-axis robot is presented
system cannot be measured directly. Therefore, differential and experimentally validated. First, the swing-up trajectories
filters with a time constant of T1 = 12 ms are used for are computed offline using an optimization approach and
each joint. The parameters Qc and Rc are tuned to obtain a are stored in a K-D tree database. The offline trajectory
good closed-loop performance. The weights of the pendulum optimization takes into account the system dynamics of the
velocities q̇8 and q̇9 in Qc are reduced to avoid oscillations underactuated system as well as the kinematic and dynamic
during the swing-up. Note that the sampling time for the constraints and uses the state-of-the-art nonlinear solver
controller is Ts = 125 µs, while the time step h = tF /N IPOPT to solve each swing-up motion in approximately
of the trajectory is significantly larger. Therefore, the desired 10 s on average. Second, for the actual starting state of
trajectory from the fast replanner has to be interpolated using the system, a nearest neighbour search algorithm is applied
(12). The torque input τA is calculated from (24) with the to find the nearest point in the K-D tree database. This
time-variant feedback matrix Kk from (23). The experimen- trajectory is adapted by solving a constrained quadratic
tal results of the swing-up of the spherical pendulum from optimization, which takes on average 200 ms even if the
◦ ◦ ◦
the initial state xTS = [0, −15 , 0, −95 , 0, 10 , 0, 0, 0] are grid in the database is relatively coarse. Finally, the exper-
presented in Fig. 5. The measured signals are illustrated as imental demonstration shows the successful swing-up and
colored solid lines, and the desired trajectories are drawn as stabilization of a spherical pendulum tool mounted on the 7-
black dashed lines. In addition, a video of the swing-up and axis robot. Overall, the proposed fast trajectory optimization
stabilization is provided at www.acin.tuwien.ac.at/8c60. Note allows to generate an optimal trajectory in real time, which
that the discrete time-variant LQR is able to stabilize the can be utilized in many different applications, such as pick-
swing-up trajectory and the upright position of the spherical and-place tasks of objects using a robotic arm or moving
pendulum using the KUKA LBR iiwa. The small oscillations large loads in a factory using a 3D gantry crane [37].

10119

Authorized licensed use limited to: Sejong Univ. Downloaded on February 12,2023 at 04:16:59 UTC from IEEE Xplore. Restrictions apply.
R EFERENCES [23] C. Liu and C. G. Atkeson, “Standing balance control using a trajec-
tory library,” Proceedings of the IEEE International Conference on
[1] K. H. Lundberg and T. W. Barton, “History of Inverted-pendulum Intelligent Robots and Systems, pp. 3031–3036, 2009.
Systems,” IFAC Proceedings Volumes, vol. 42, no. 24, pp. 131–135, [24] M. Spong, S. Hutchinson, and M. Vidyasagar, Robot Modeling and
2010. Control. Wiley, 2005.
[2] Y. Xu, M. Iwase, and K. Furuta, “Time Optimal Swing-up Control of [25] Y. R. Stürz, L. M. Affolter, and R. S. Smith, “Parameter identification
Single Pendulum,” Journal of Dynamic Systems, Measurement, and of the KUKA LBR iiwa robot including constraints on physical
Control, vol. 123, no. 3, pp. 518–527, 2001. feasibility,” IFAC PapersOnLine, vol. 50, no. 1, pp. 6863–6868, 2017.
[26] KUKA Roboter GmbH, “LBR iiwa 7 R800 and LBR iiwa 14 R820
[3] M. Yamakita, M. Iwashiro, Y. Sugahara, and K. Furuta, “Robust
specification,” 2017.
Swing Up Control of Double Pendulum,” Proceedings of the American
[27] J. T. Betts, Practical methods for optimal control and estimation using
Control Conference, vol. 1, pp. 290–295, 1995.
nonlinear programming. Siam, 2010.
[4] T. Glück, A. Eder, and A. Kugi, “Swing-up control of a triple
[28] E. E. Binder and J. H. Herzog, “Distributed computer architecture and
pendulum on a cart with experimental validation,” Automatica, vol. 49,
fast parallel algorithms in real-time robot control,” IEEE Transactions
no. 3, pp. 801–808, 2013.
on Systems, Man, and Cybernetics, vol. 16, no. 4, pp. 543–549, 1986.
[5] J. Shen, A. K. Sanyal, N. A. Chaturvedi, D. Bernstein, and H. Mc- [29] A. Wächter and L. T. Biegler, “On the implementation of an interior-
Clamroch, “Dynamics and control of a 3D pendulum,” Proceedings of point filter line-search algorithm for large-scale nonlinear program-
the IEEE Conference on Decision and Control, vol. 1, pp. 323–328, ming,” Mathematical Programming, vol. 106, no. 1, pp. 25–57, 2006.
2004. [30] J. L. Bentley, “Multidimensional binary search trees used for asso-
[6] R. H. Cannon, Dynamics of physical systems. McGraw-Hill, 1967. ciative searching,” Communications of the ACM, vol. 18, no. 9, pp.
[7] K. Ogata and Y. Yang, Modern control engineering. Prentice Hall, 509–517, 1975.
1970. [31] M. Soleymani and S. Morgera, “An efficient nearest neighbor search
[8] S. Mori, H. Nishihara, and K. Furuta, “Control of unstable mechanical method,” IEEE Transactions on Communications, vol. 35, no. 6, pp.
system control of pendulum,” International Journal of Control, vol. 23, 677–679, 1987.
no. 5, pp. 673–692, 1976. [32] J. J. Moré and G. Toraldo, “Algorithms for bound constrained quadratic
[9] A. Mills, A. Wills, and B. Ninness, “Nonlinear model predictive con- programming problems,” Numerische Mathematik, vol. 55, no. 4, pp.
trol of an inverted pendulum,” Proceedings of the American Control 377–400, 1989.
Conference, pp. 2335–2340, 2009. [33] S. Bittanti, A. J. Laub, and J. C. Willems, The Riccati Equation.
[10] E. Derner, J. Kubalík, and R. Babuška, “Reinforcement learning with Springer Science & Business Media, 2012.
symbolic input-output models,” Proceedings of the IEEE International [34] G. F. Franklin, J. D. Powell, A. Emami-Naeini, and J. D. Powell,
Conference on Intelligent Robots and Systems, pp. 3004–3009, 2018. Feedback control of dynamic systems. Prentice Hall, 2002.
[11] T. Chen and B. Goodwinel, “A simple approach on global control of [35] I. S. Duff, “MA57—A Code for the Solution of Sparse Symmetric
a class of underactuated mechanical robotic systems,” Proceedings of Definite and Indefinite Systems,” ACM Transactions on Mathematical
the IEEE International Conference on Intelligent Robots and Systems, Software, vol. 30, no. 2, pp. 118–144, 2004.
pp. 5139–5145, 2019. [36] J. A. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl,
[12] K. Furuta, M. Yamakita, and S. Kobayashi, “Swing-up control of “CasADi: a software framework for nonlinear optimization and opti-
inverted pendulum using pseudo-state feedback,” Proceedings of the mal control,” Mathematical Programming Computation, vol. 11, no. 1,
Institution of Mechanical Engineers, Part I: Journal of Systems and pp. 1–36, 2019.
Control Engineering, vol. 206, no. 4, pp. 263–269, 1992. [37] M. Vu, P. Zips, A. Lobe, F. Beck, W. Kemmetmüller, and A. Kugi,
[13] K. Graichen and M. Zeitz, “Feedforward control design for finite- “Fast motion planning for a laboratory 3D gantry crane in the presence
time transition problems of nonlinear systems with input and output of obstacles,” IFAC-PapersOnLine, 2020.
constraints,” IEEE Transactions on Automatic Control, vol. 53, no. 5,
pp. 1273–1278, 2008.
[14] A. Winkler and J. Suchỳ, “Erecting and balancing of the inverted
pendulum by an industrial robot,” IFAC Proceedings Volumes, vol. 42,
no. 16, pp. 323–328, 2009.
[15] A. Marco, P. Hennig, J. Bohg, S. Schaal, and S. Trimpe, “Auto-
matic LQR tuning based on gaussian process global optimization,”
Proceedings of the IEEE International Conference on Robotics and
Automation, pp. 270–277, 2016.
[16] A. Doerr, D. Nguyen-Tuong, A. Marco, S. Schaal, and S. Trimpe,
“Model-based policy search for automatic tuning of multivariate PID
controllers,” Proceedings of the IEEE International Conference on
Robotics and Automation, pp. 5295–5301, 2017.
[17] G. Schreiber, C. Ott, and G. Hirzinger, “Interactive redundant robotics:
Control of the inverted pendulum with nullspace motion,” Proceedings
of the IEEE International Conference on Intelligent Robots and
Systems, vol. 1, pp. 158–164, 2001.
[18] C. Hartl-Nesic, J. Kretschmer, M. Schwegel, T. Glück, and A. Kugi,
“Swing-up of a spherical pendulum on a 7-axis industrial robot,” IFAC-
PapersOnLine, vol. 52, no. 15, pp. 346–351, 2019.
[19] S. Farzan, A.-P. Hu, E. Davies, and J. Rogers, “Modeling and control
of brachiating robots traversing flexible cables,” Proceedings of the
IEEE International Conference on Robotics and Automation, pp.
1645–1652, 2018.
[20] K.-D. Nguyen and D. Liu, “Robust control of a brachiating robot,”
Proceedings of the IEEE International Conference on Intelligent
Robots and Systems, pp. 6555–6560, 2017.
[21] S. Farzan, A.-P. Hu, E. Davies, and J. Rogers, “Feedback motion
planning and control of brachiating robots traversing flexible cables,”
Proceedings of the American Control Conference, pp. 1323–1329,
2019.
[22] P. Reist, P. Preiswerk, and R. Tedrake, “Feedback-motion-planning
with simulation-based LQR-trees,” The International Journal of
Robotics Research, vol. 35, no. 11, pp. 1393–1416, 2016.

10120

Authorized licensed use limited to: Sejong Univ. Downloaded on February 12,2023 at 04:16:59 UTC from IEEE Xplore. Restrictions apply.

You might also like