
Automatica 47 (2011) 99–107

Contents lists available at ScienceDirect

Automatica
journal homepage: www.elsevier.com/locate/automatica

Brief paper

Asymptotic optimal control of uncertain nonlinear Euler–Lagrange systems✩


Keith Dupree, Parag M. Patre, Zachary D. Wilcox, Warren E. Dixon ∗
Department of Mechanical and Aerospace Engineering, University of Florida, Gainesville, FL 32611, USA

Article info

Article history:
Received 5 June 2009
Received in revised form 15 July 2010
Accepted 16 August 2010
Available online 18 November 2010

Keywords: Optimal control; Neural networks; RISE; Nonlinear control; Lyapunov-based control

Abstract

A sufficient condition to solve an optimal control problem is to solve the Hamilton–Jacobi–Bellman (HJB) equation. However, finding a value function that satisfies the HJB equation for a nonlinear system is challenging. For an optimal control problem when a cost function is provided a priori, previous efforts have utilized feedback linearization methods which assume exact model knowledge, or have developed neural network (NN) approximations of the HJB value function. The result in this paper uses the implicit learning capabilities of the RISE control structure to learn the dynamics asymptotically. Specifically, a Lyapunov stability analysis is performed to show that the RISE feedback term asymptotically identifies the unknown dynamics, yielding semi-global asymptotic tracking. In addition, it is shown that the system converges to a state space system that has a quadratic performance index which has been optimized by an additional control element. An extension is included to illustrate how a NN can be combined with the previous results. Experimental results are given to demonstrate the proposed controllers.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Optimal control theory involves the design of controllers that can satisfy some objective while simultaneously minimizing some performance metric. A sufficient condition to solve an optimal control problem is to solve the Hamilton–Jacobi–Bellman (HJB) equation. For the special case of linear time-invariant systems, the HJB equation reduces to an algebraic Riccati equation (ARE); however, for nonlinear systems, finding a value function that satisfies the HJB equation is challenging because it requires the solution of a partial differential equation that may not have an explicit solution. If the nonlinear dynamics are exactly known, then the problem can be reduced to solving an ARE through feedback linearization methods (cf. Freeman & Kokotovic, 1995; Lu, Sun, Xu, & Mochizuki, 1996; Nevistic & Primbs, 1996; Primbs & Nevistic, 1996; Sekoguchi, Konishi, Goto, Yokoyama, & Lu, 2002).

Inverse optimal control (see Freeman & Kokotović, 1996; Krstic & Li, 1998; Krstic & Tsiotras, 1999) is an alternative method to solve the nonlinear optimal control problem by circumventing the need to solve the HJB equation. By finding a control Lyapunov function, which can be shown to also be a value function, an optimal controller can be developed that optimizes a derived cost. In most cases inverse optimal control requires exact knowledge of the nonlinear dynamics; however, inverse optimal adaptive control (cf. Fausz, Chellaboina, & Haddad, 1997; Lewis, Syrmos, Li, & Krstic, 1995; Luo, Chu, & Ling, 2005) techniques have been developed for systems with linear-in-the-constant-parameters (LP) uncertainty.

Motivated by the desire to eliminate the requirement for exact knowledge of the dynamics for a direct optimal controller (i.e., where the cost function is given a priori), Johansson (1990) developed a self-optimizing adaptive controller to yield global asymptotic tracking despite LP uncertainty, provided the parameter estimation error could somehow converge to zero. In the preliminary work in Dupree, Patre, Wilcox, and Dixon (2008), we illustrated how a Robust Integral of the Sign of the Error (RISE) feedback controller could be modified to yield a direct optimal controller that achieves semi-global asymptotic tracking. The result in Dupree et al. (2008) exploits the implicit learning characteristic (Qu & Xu, 2002) of the RISE controller to asymptotically cancel LP and non-LP uncertainties and additive disturbances in the dynamics, so that the overall control structure converges to an optimal controller.

Researchers have also investigated the use of the universal approximation property of neural networks (NNs) to approximate the LP and non-LP unknown dynamics as a means to develop direct optimal controllers. Specifically, results such as Abu-Khalaf and Lewis (2002), Cheng, Li, and Zhang (2006), Cheng and Lewis (2007), Cheng, Lewis, and Abu-Khalaf (2007), Kim and Lewis (2000) and Kim, Lewis, and Dawson (2000) find an optimal controller for a given cost function for a partially feedback linearized system, and then modify the optimal controller with a NN to approximate the unknown dynamics. Specifically, the tracking errors for the

✩ This research is supported in part by the NSF CAREER award number 0547448 and NSF award number 0901491. The material in this paper was partially presented at the 47th IEEE CDC, December 9–11, 2008, Cancun, Mexico. This paper was recommended for publication in revised form by Associate Editor Derong Liu under the direction of Editor Miroslav Krstic.
∗ Corresponding author. Tel.: +1 352 846 1463; fax: +1 352 392 7303.
E-mail addresses: kdupree@gmail.com (K. Dupree), parag.patre@gmail.com (P.M. Patre), zibrus@gmail.com (Z.D. Wilcox), wdixon@ufl.edu (W.E. Dixon).
doi:10.1016/j.automatica.2010.10.007

NN methods are proven to be uniformly ultimately bounded (UUB), and the resulting state space system, for which the HJB optimal controller is developed, is only approximated.

The contribution in this work, and our preliminary efforts in Dupree et al. (2008), arise from incorporating optimal control elements with an implicit learning feedback control strategy coined the Robust Integral of the Sign of the Error (RISE) method in Patre, MacKunis, Makkar, and Dixon (2008). The RISE method is used to identify the system and reject disturbances, while achieving asymptotic tracking and the convergence of a control term to the optimal controller. Inspired by the previous work in Abu-Khalaf and Lewis (2002), Abu-Khalaf, Huang, and Lewis (2006), Cheng et al. (2006), Cheng and Lewis (2007), Cheng et al. (2007), Johansson (1990), Kim and Lewis (2000), Kim et al. (2000) and Lewis (1986), a system in which all terms are assumed known (temporarily) is feedback linearized, and a control law is developed based on the HJB optimization method for a given quadratic performance index. Under the assumption that parametric uncertainty and unknown bounded disturbances are present in the dynamics, the control law is modified to contain the RISE feedback term, which is used to identify the uncertainty. A Lyapunov stability analysis indicates that the RISE feedback term asymptotically identifies the unknown dynamics (yielding semi-global asymptotic tracking) provided upper bounds on the disturbances are known and the control gains are selected appropriately. A distinction in this work is that the uncertain nonlinear disturbances are asymptotically identified (rather than UUB), allowing the developed controller to asymptotically converge to an optimal controller for the residual uncertain nonlinear system.

An extension is included that investigates the amalgam of the robust RISE feedback method with NN methods to yield a direct optimal controller. Combining a NN feedforward controller with the RISE feedback method yields an asymptotic result (Patre, MacKunis, Kaiser, & Dixon, 2008) that can also be proven to converge to an optimal controller through the efforts in this paper. Hence, a modification to the results in Abu-Khalaf and Lewis (2002), Cheng et al. (2006), Cheng and Lewis (2007), Cheng et al. (2007), Kim and Lewis (2000) and Kim et al. (2000) is provided that allows for asymptotic tracking and convergence to the optimal controller.

2. Dynamic model

The class of considered nonlinear dynamic systems is assumed to be modeled by the following Euler–Lagrange formulation:

M(q)q̈ + Vm(q, q̇)q̇ + G(q) + F(q̇) + τd(t) = τ(t).  (1)

In (1), M(q) ∈ R^{n×n} denotes the generalized inertia matrix, Vm(q, q̇) ∈ R^{n×n} denotes the generalized centripetal–Coriolis matrix, G(q) ∈ R^n denotes the generalized gravity vector, F(q̇) ∈ R^n denotes generalized friction, τd(t) ∈ R^n denotes a general disturbance (e.g., unmodeled effects), τ(t) ∈ R^n represents the control input vector, and q(t), q̇(t), q̈(t) ∈ R^n denote the generalized position, velocity, and acceleration vectors, respectively. The subsequent development is based on the assumption that q(t) and q̇(t) are measurable and that M(q), Vm(q, q̇), G(q), F(q̇) and τd(t) are unknown. Moreover, the following assumptions will be exploited in the subsequent development.

Assumption 1. The inertia matrix M(q) is symmetric, positive definite, and satisfies the following inequality ∀ξ(t) ∈ R^n:

m1‖ξ‖² ≤ ξ^T M(q)ξ ≤ m̄(q)‖ξ‖²,  (2)

where m1 ∈ R is a known positive constant, m̄(q) ∈ R is a known positive function, and ‖·‖ denotes the standard Euclidean norm.

Assumption 2. The following skew-symmetric relationship is satisfied:

ξ^T (Ṁ(q) − 2Vm(q, q̇))ξ = 0  ∀ξ ∈ R^n.  (3)

Assumption 3. If q(t), q̇(t) ∈ L∞, then Vm(q, q̇), F(q̇) and G(q) are bounded. Moreover, if q(t), q̇(t) ∈ L∞, then the first and second partial derivatives of the elements of M(q), Vm(q, q̇), G(q) with respect to q(t) exist and are bounded, and the first and second partial derivatives of the elements of Vm(q, q̇), F(q̇) with respect to q̇(t) exist and are bounded.

Assumption 4. The desired trajectory is assumed to be designed such that qd(t), q̇d(t), q̈d(t), and the third and fourth time derivatives of qd(t) ∈ R^n exist and are bounded.

Assumption 5. The nonlinear disturbance term and its first two time derivatives, i.e. τd(t), τ̇d(t), τ̈d(t), are bounded by known constants.

3. Control objective

The control objective is to ensure that the system tracks a desired time-varying trajectory, denoted by qd(t) ∈ R^n, despite uncertainties in the dynamic model, while minimizing a given performance index. To quantify the tracking objective, a position tracking error, denoted by e1(t) ∈ R^n, is defined as

e1 ≜ qd − q.  (4)

To facilitate the subsequent analysis, filtered tracking errors, denoted by e2(t), r(t) ∈ R^n, are also defined as

e2 ≜ ė1 + α1 e1,  (5)
r ≜ ė2 + α2 e2,  (6)

where α1 ∈ R^{n×n} denotes a subsequently defined positive definite, constant gain matrix, and α2 ∈ R is a positive constant. The subsequent development is based on the assumption that q(t) and q̇(t) are measurable, so the filtered tracking error r(t) is not measurable since the expression in (6) depends on q̈(t). The error systems are based on the assumption that the generalized coordinates of the Euler–Lagrange dynamics allow additive and not multiplicative errors.

4. Optimal control design

In this section, a state-space model is developed based on the tracking errors in (4) and (5). Based on this model, a controller is developed that minimizes a quadratic performance index under the (temporary) assumption that the dynamics in (1), including the additive disturbance, are known. The development in this section motivates the control design in Section 5, where a robust controller is developed to identify the unknown dynamics and additive disturbance.

To develop a state-space model for the tracking errors in (4) and (5), the time derivative of (5) is premultiplied by the inertia matrix, and substitutions are made from (1) and (4) to obtain

M ė2 = −Vm e2 − τ + h + τd,  (7)

where the nonlinear function h(q, q̇, qd, q̇d, q̈d) ∈ R^n is defined as

h ≜ M(q̈d + α1 ė1) + Vm(q̇d + α1 e1) + G + F.  (8)

Under the (temporary) assumption that the dynamics in (1) are known, the control input can be designed as

τ ≜ h + τd − u,  (9)
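As a rough numerical illustration (not taken from the paper), the filtered-error construction in (4)–(6) can be exercised for the scalar case n = 1: whenever the controller drives e2 to zero, the definition e2 = ė1 + α1 e1 forces e1 to decay exponentially. The gain, step size, and horizon below are arbitrary choices for the sketch.

```python
import math

# Scalar (n = 1) sketch of the error definitions (4)-(6):
#   e1 = qd - q,   e2 = e1' + a1*e1.
# If e2(t) == 0, then e1' = -a1*e1, so e1 decays like exp(-a1*t).
a1 = 2.0          # stand-in for the gain alpha_1 (arbitrary for this sketch)
dt = 1e-3
e1 = 1.0          # initial position tracking error
for _ in range(int(1.0 / dt)):   # integrate for 1 second with e2 = 0
    e1 += dt * (-a1 * e1)        # Euler step of e1' = e2 - a1*e1 with e2 = 0
expected = math.exp(-a1 * 1.0)   # analytic solution exp(-a1 * t) at t = 1
print(abs(e1 - expected) < 1e-3)  # -> True (Euler error is O(dt))
```

This is why the analysis can work with e2 (and its filtered version r) instead of e1 directly: regulating the filtered error automatically regulates the position error.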

where u(t) ∈ R^n is an auxiliary control input that will be designed to minimize a subsequent performance index. By substituting (9) into (7), the closed-loop error system for e2(t) can be obtained as

M ė2 = −Vm e2 + u.  (10)

A time-varying state-space model for (5) and (10) can now be developed as

ż = A(q, q̇)z + B(q)u,  (11)

where A(q, q̇) ∈ R^{2n×2n}, B(q) ∈ R^{2n×n}, and z(t) ∈ R^{2n} are defined as

A(q, q̇) ≜ [ −α1, I_{n×n} ; 0_{n×n}, −M⁻¹Vm ],  B(q) ≜ [ 0_{n×n} ; M⁻¹ ],
z(t) ≜ [ e1^T  e2^T ]^T,

where I_{n×n} and 0_{n×n} denote an n × n identity matrix and matrix of zeros, respectively. The quadratic performance index J(u) ∈ R to be minimized subject to the constraints in (11) is

J(u) = ∫₀^∞ ( (1/2) z^T Q z + (1/2) u^T R u ) dt.  (12)

In (12), Q ∈ R^{2n×2n} and R ∈ R^{n×n} are positive definite symmetric matrices to weight the influence of the states and (partial) control effort, respectively. Furthermore, the matrix Q can be broken into blocks as

Q = [ Q11, Q12 ; Q12^T, Q22 ].

Given the performance index J(u), the control objective is to find the auxiliary control input u(t) that minimizes (12) subject to the differential constraints imposed by (11). The optimal control that achieves this objective is denoted by u*(t). As stated in Kim and Lewis (2000) and Kim et al. (2000), the fact that the performance index only penalizes the auxiliary control u(t) is practical, since the gravity, Coriolis, and friction compensation terms in (8) cannot be modified by the optimal design phase.

The controller u*(t) minimizes (12) subject to (11) if and only if there exists a value function V(z, t) where

−∂V/∂t = (1/2) z^T Q z + (1/2) u*^T R u* + (∂V/∂z) ż

that satisfies the HJB equation (Lewis, Syrmos, Li, & Krstic, 1995)

∂V/∂t + min_u H(z, u, ∂V/∂z, t) = 0,  (13)

where the Hamiltonian of optimization H(z, u, ∂V/∂z, t) ∈ R is defined as

H(z, u, ∂V/∂z, t) = (1/2) z^T Q z + (1/2) u^T R u + (∂V/∂z) ż.

The minimum of (12) is obtained for the optimal controller u(t) = u*(t), where the respective Hamiltonian is

H* = min_u H(z, u, ∂V/∂z, t) = −∂V/∂t.  (14)

To facilitate the subsequent development, let P(q) ∈ R^{2n×2n} be a positive definite symmetric matrix defined as

P(q) = [ K, 0_{n×n} ; 0_{n×n}, M ],  (15)

where K ∈ R^{n×n} denotes a positive definite symmetric gain matrix. If α1, R, and K, introduced in (5), (12) and (15), satisfy the following algebraic relationships

K = −Q12 = −Q12^T > 0,  (16)
Q11 = α1^T K + K α1,  (17)
R⁻¹ = Q22,  (18)

then Theorem 1 of Kim and Lewis (2000) and Kim et al. (2000) can be invoked to prove that P(q) satisfies

z^T (PA + A^T P − P B R⁻¹ B^T P + Ṗ + Q) z = 0,  (19)

and the value function V(z, t) ∈ R

V = (1/2) z^T P z  (20)

satisfies the HJB equation in (14). Lemma 1 of Kim and Lewis (2000) and Kim et al. (2000) can be used to conclude that the optimal control u*(t) that minimizes (12) subject to (11) is

u*(t) = −R⁻¹ B^T (∂V(z, t)/∂z)^T = −R⁻¹ e2.  (21)

Remark 1. An infinite horizon optimal controller for a linear time-invariant system can be developed by solving an ARE. The constants that make up the ARE are a state weighting matrix Q and a control weighting matrix R, which come from the typical quadratic cost functional and can be independently selected. Solving the infinite horizon optimal control problem for the nonlinear time-varying system in (11) (sufficiently) involves finding the solution to a more general HJB equation. As described in Kim et al. (2000) and Kim and Lewis (2000), a solution to the HJB equation in (13) involves solving the matrix equation in (19). To solve (19), a strategic choice of P(q) is selected and (3) is used to reduce the problem to a set of algebraic constraints/gain conditions. The resulting gain conditions indicate that Q11 can be selected to weight the output state e1(t), Q12 can be independently selected to weight the state vector cross terms, and that R can be independently selected to weight the control vector in the cost functional. However, for this approach, the weighting matrix Q22 for the auxiliary state e2(t) is dependent on the control vector weight matrix (i.e., Q22 = R⁻¹). For alternative methods to solve the infinite horizon optimal control problem for time-varying systems, see Kirk (2004) and Lewis et al. (1995).

5. RISE feedback control development

In general, the bounded disturbance τd(t) and the nonlinear dynamics given in (8) are unknown, so the controller given in (9) cannot be implemented. However, if the control input contains some method to identify and cancel these effects, then z(t) will converge to the state space model in (11) so that u(t) minimizes the respective performance index. As stated in the introduction, several results have explored this strategy using function approximation methods such as neural networks, where the tracking control errors converge to a neighborhood near the state space model, yielding a type of approximate optimal controller. In this section, a control input is developed that exploits RISE feedback to identify the nonlinear effects and bounded disturbances to enable asymptotic convergence to the state space model.

To develop the control input, the error system in (6) is premultiplied by M(q) and the expressions in (1), (4) and (5) are utilized to obtain

Mr = −Vm e2 + h + τd + α2 M e2 − τ.  (22)
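The algebraic claim in (19) can be spot-checked numerically. The sketch below is not code from the paper; all matrix values are arbitrary test data. It builds A, B, P, and Q from the gain conditions (16)–(18), constructs Ṁ so that Ṁ − 2Vm is skew-symmetric as in (3), and verifies that the quadratic form in (19) vanishes.

```python
import random

# Numerical spot check of (19) for the strategic choice P = blkdiag(K, M) in (15)
# and the gain conditions (16)-(18): Q12 = -K, Q11 = a1^T K + K a1, Q22 = R^-1.
# Mdot is chosen as Vm + Vm^T so that Mdot - 2*Vm is skew-symmetric (property (3)).
n = 2
random.seed(1)

def rand_mat():
    return [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def madd(*Ms):
    return [[sum(M[i][j] for M in Ms) for j in range(len(Ms[0][0]))]
            for i in range(len(Ms[0]))]

def mneg(A):
    return [[-x for x in row] for row in A]

def spd_mat():  # random symmetric positive definite matrix: A^T A + 2I
    A = rand_mat()
    return madd(matmul(transpose(A), A),
                [[2.0 * (i == j) for j in range(n)] for i in range(n)])

def inverse(A):  # Gauss-Jordan inverse with partial pivoting
    m = len(A)
    aug = [row[:] + [float(i == j) for j in range(m)] for i, row in enumerate(A)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(m):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[m:] for row in aug]

def block(A11, A12, A21, A22):  # assemble a 2x2 block matrix
    return [r1 + r2 for r1, r2 in zip(A11, A12)] + \
           [r1 + r2 for r1, r2 in zip(A21, A22)]

M_ = spd_mat()                    # inertia matrix (symmetric positive definite)
Vm = rand_mat()                   # Coriolis matrix at some instant
Mdot = madd(Vm, transpose(Vm))    # makes Mdot - 2*Vm skew-symmetric
K, a1, R = spd_mat(), spd_mat(), spd_mat()
Minv, Rinv = inverse(M_), inverse(R)
I = [[float(i == j) for j in range(n)] for i in range(n)]
Z = [[0.0] * n for _ in range(n)]

A    = block(mneg(a1), I, Z, mneg(matmul(Minv, Vm)))   # state matrix in (11)
B    = Z + Minv                                        # input matrix [0; M^-1]
P    = block(K, Z, Z, M_)                              # strategic P(q) from (15)
Pdot = block(Z, Z, Z, Mdot)
Q    = block(madd(matmul(transpose(a1), K), matmul(K, a1)),
             mneg(K), mneg(K), Rinv)                   # gain conditions (16)-(18)

PB = matmul(P, B)
core = madd(matmul(P, A), matmul(transpose(A), P),
            mneg(matmul(matmul(PB, Rinv), transpose(PB))), Pdot, Q)
z = [random.uniform(-1.0, 1.0) for _ in range(2 * n)]
zAz = sum(z[i] * core[i][j] * z[j] for i in range(2 * n) for j in range(2 * n))
print(abs(zAz) < 1e-8)  # -> True: (19) holds for this choice of P and Q
```

Working through the blocks by hand shows why: PB = [0; I], so the PBR⁻¹BᵀP term reduces to Q22 = R⁻¹ in the lower-right block, and the skew-symmetry property cancels the remaining Ṁ − Vm − Vmᵀ contribution.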

Based on the open-loop error system in (22), the control input is composed of the optimal control developed in (21), plus a subsequently designed auxiliary control term μ(t) ∈ R^n, as

τ ≜ μ − u.  (23)

The closed-loop tracking error system can be developed by substituting (23) into (22) as

Mr = −Vm e2 + h + τd + α2 M e2 + u − μ.  (24)

To facilitate the subsequent stability analysis, the auxiliary function fd(qd, q̇d, q̈d) ∈ R^n, defined as

fd ≜ M(qd)q̈d + Vm(qd, q̇d)q̇d + G(qd) + F(q̇d),  (25)

is added and subtracted to (24) to yield

Mr = −Vm e2 + h̄ + fd + τd + u − μ + α2 M e2,  (26)

where h̄(q, q̇, qd, q̇d, q̈d) ∈ R^n is defined as

h̄ ≜ h − fd.  (27)

The time derivative of (26) can be written as

M ṙ = −(1/2)Ṁ r + Ñ + ND − e2 − R⁻¹ r − μ̇  (28)

after strategically grouping specific terms. In (28), the unmeasurable auxiliary terms Ñ(e1, e2, r, t), ND(t) ∈ R^n are defined as

Ñ ≜ −V̇m e2 − Vm ė2 − (1/2)Ṁ r + h̄̇ + α2 Ṁ e2 + α2 M ė2 + e2 + α2 R⁻¹ e2,
ND ≜ ḟd + τ̇d.

Motivation for grouping terms into Ñ(e1, e2, r, t) and ND(t) comes from the subsequent stability analysis and the fact that the Mean Value Theorem and Assumptions 3–5 can be used to upper bound the auxiliary terms as

‖Ñ(t)‖ ≤ ρ(‖y‖)‖y‖,  (29)
‖ND‖ ≤ ζ1,  ‖ṄD‖ ≤ ζ2,  (30)

where y(t) ∈ R^{3n} is defined as

y(t) ≜ [ e1^T  e2^T  r^T ]^T,  (31)

the bounding function ρ(‖y‖) ∈ R is a positive, globally invertible, nondecreasing function, and ζi ∈ R (i = 1, 2) denote known positive constants. Based on (28), the control term μ(t) is designed based on the RISE framework (see Patre et al., 2008; Qu & Xu, 2002; Xian, Dawson, de Queiroz, & Chen, 2004) as

μ(t) ≜ (ks + 1)e2(t) − (ks + 1)e2(0) + ν(t),  (32)

where ν(t) ∈ R^n is the generalized solution to

ν̇ = (ks + 1)α2 e2 + β1 sgn(e2),

ks ∈ R is a positive constant control gain, and β1 ∈ R is a positive control gain selected according to the following sufficient condition:

β1 > ζ1 + (1/α2)ζ2.  (33)

The closed-loop error system for r(t) can now be obtained by substituting the time derivative of (32) into (28) as

M ṙ = −(1/2)Ṁ r + Ñ + ND − e2 − R⁻¹ r − (ks + 1)r − β1 sgn(e2).  (34)

To facilitate the subsequent stability analysis, the control gains α1 and α2 introduced in (5) and (6), respectively, are selected according to the sufficient conditions:

λmin(α1) > 1/2,  α2 > 1,  (35)

where λmin(α1) is the minimum eigenvalue of α1. The control gain α1 cannot be arbitrarily selected; rather, it is calculated using a Lyapunov equation solver, and its value is determined by the values of Q and R. Therefore, Q and R must be chosen such that (35) is satisfied.

6. Stability and optimality analysis

Theorem 1. The controller given in (21) and (23) ensures that all system signals are bounded under closed-loop operation, and the tracking errors are regulated in the sense that

‖y(t)‖ → 0 as t → ∞  ∀y(0) ∈ S,  (36)

where the set S can be made arbitrarily large by selecting ks based on the initial conditions of the system (i.e., a semi-global result). The boundedness of the closed-loop signals and the result in (36) can be obtained provided the sufficient conditions in (33) and (35) are satisfied. Furthermore, u(t) converges to an optimal controller that minimizes (12) subject to (11) provided the gain conditions given in (16)–(18) are satisfied.

Proof. Let D ⊂ R^{3n+1} be a domain containing Φ(t) = 0, where Φ(t) ∈ R^{3n+1} is defined as

Φ(t) ≜ [ y^T(t)  √O(t) ]^T.  (37)

In (37), the auxiliary function O(t) ∈ R is defined as

O(t) ≜ β1 Σ_{i=1}^{n} |e2i(0)| − e2(0)^T ND(0) − L(t),  (38)

where e2i(0) is equal to the ith element of e2(0), and the auxiliary function L(t) ∈ R is the generalized solution to

L̇(t) ≜ r^T (ND(t) − β1 sgn(e2)),  (39)

where β1 ∈ R is a positive constant chosen according to the sufficient condition in (33). Provided the sufficient condition introduced in (33) is satisfied, the following inequality can be obtained (Xian et al., 2004):

L(t) ≤ β1 Σ_{i=1}^{n} |e2i(0)| − e2(0)^T ND(0).  (40)

Hence, (40) can be used to conclude that O(t) ≥ 0.

Let VL(Φ, t): D × [0, ∞) → R be a continuously differentiable positive definite function defined as

VL(Φ, t) ≜ e1^T e1 + (1/2) e2^T e2 + (1/2) r^T M(q) r + O,  (41)

which satisfies the following inequalities:

U1(Φ) ≤ VL(Φ, t) ≤ U2(Φ).  (42)

In (42), the continuous positive definite functions U1(Φ), U2(Φ) ∈ R are defined as U1(Φ) ≜ λ1‖Φ‖² and U2(Φ) ≜ λ2(q)‖Φ‖², where λ1, λ2(q) ∈ R are defined as

λ1 ≜ (1/2) min{1, m1},  λ2(q) ≜ max{ (1/2) m̄(q), 1 },
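A minimal discrete-time sketch of the RISE term in (32) may help fix ideas: μ is a proportional part (ks + 1)e2 plus the integral ν of (ks + 1)α2 e2 + β1 sgn(e2). This is an illustration only, not the authors' implementation; the gains and the stand-in e2(t) signal are arbitrary.

```python
import math

# Discrete-time sketch of the RISE feedback term (32) for n = 1:
#   mu(t) = (ks+1)*e2(t) - (ks+1)*e2(0) + nu(t),
#   nu'   = (ks+1)*a2*e2 + b1*sgn(e2).
# Values below are arbitrary; e2(t) is a placeholder signal, not closed-loop data.
ks, a2, b1 = 5.0, 2.0, 0.5
dt = 1e-3
e2_0 = 1.0
nu = 0.0
mu_hist = []
for k in range(2000):                # simulate 2 seconds
    t = k * dt
    e2 = math.exp(-3.0 * t)          # placeholder decaying error signal
    sgn = (e2 > 0) - (e2 < 0)
    nu += dt * ((ks + 1.0) * a2 * e2 + b1 * sgn)   # Euler step of the nu dynamics
    mu = (ks + 1.0) * e2 - (ks + 1.0) * e2_0 + nu
    mu_hist.append(mu)

# The integral term keeps acting even as e2 -> 0; this accumulated signal is the
# "implicit learning" part that ends up compensating the unknown dynamics.
print(-1.05 < mu_hist[-1] < -0.95)  # -> True for this placeholder signal
```

The sgn(e2) term inside the integral (rather than in the feedback path directly) is what keeps μ continuous while still rejecting bounded disturbances.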

where m1 and m̄(q) are introduced in (2). After taking the time derivative of (41), using (5), (6) and (34), and substituting for the time derivative of O(t),

V̇L(Φ, t) = −2 e1^T α1 e1 + 2 e2^T e1 + r^T Ñ(t) − (ks + 1) r^T r − r^T R⁻¹ r − α2 e2^T e2.  (43)

Based on the fact that

e2^T e1 ≤ (1/2)‖e1‖² + (1/2)‖e2‖²,  (44)

the expression in (43) can be upper bounded as

V̇L(Φ, t) ≤ r^T Ñ(t) − (ks + 1 + λmin(R⁻¹))‖r‖² − (2λmin(α1) − 1)‖e1‖² − (α2 − 1)‖e2‖².  (45)

By using (29), the expression in (45) can be rewritten as

V̇L(Φ, t) ≤ −λ3‖y‖² − [ ks‖r‖² − ρ(‖y‖)‖r‖‖y‖ ],  (46)

where λ3 ≜ min{2λmin(α1) − 1, α2 − 1, 1 + λmin(R⁻¹)}, and α1 and α2 are chosen according to the sufficient condition in (35). After completing the squares for the terms inside the brackets in (46), the following expression can be obtained:

V̇L(Φ, t) ≤ −λ3‖y‖² + ρ²(‖y‖)‖y‖²/(4ks).  (47)

The expression in (47) can be further upper bounded by a continuous, positive semi-definite function

V̇L(Φ, t) ≤ −U(Φ) = −c‖y‖²  ∀Φ ∈ D

for some positive constant c, where

D ≜ { Φ ∈ R^{3n+1} | ‖Φ‖ ≤ ρ⁻¹(2√(λ3 ks)) }.

The inequalities in (42) and (47) can be used to show that VL(Φ, t) ∈ L∞ in D; hence, e1(t), e2(t), and r(t) ∈ L∞ in D. Given that e1(t), e2(t), and r(t) ∈ L∞ in D, standard linear analysis methods can be used to prove that ė1(t), ė2(t) ∈ L∞ in D from (5) and (6). Since e1(t), e2(t), r(t) ∈ L∞ in D, the assumption that qd(t), q̇d(t), q̈d(t) exist and are bounded can be used along with (4)–(6) to conclude that q(t), q̇(t), q̈(t) ∈ L∞ in D. Since q(t), q̇(t) ∈ L∞ in D, Assumption 3 can be used to conclude that M(q), Vm(q, q̇), G(q), and F(q̇) ∈ L∞ in D. Thus, from (1) and Assumption 5, we can show that τ(t) ∈ L∞ in D. Given that r(t) ∈ L∞ in D, it can be shown that μ̇(t) ∈ L∞ in D. Since q̇(t), q̈(t) ∈ L∞ in D, Assumption 3 can be used to show that V̇m(q, q̇), Ġ(q), Ḟ(q̇) and Ṁ(q) ∈ L∞ in D; hence, (34) can be used to show that ṙ(t) ∈ L∞ in D. Since ė1(t), ė2(t), ṙ(t) ∈ L∞ in D, the definitions for U(y) and z(t) can be used to prove that U(y) is uniformly continuous in D.

Let S ⊂ D denote a set defined as

S ≜ { Φ(t) ∈ D | U2(Φ(t)) < λ1 (ρ⁻¹(2√(λ3 ks)))² }.  (48)

The region of attraction in (48) can be made arbitrarily large to include any initial conditions by increasing the control gain ks (i.e., a semi-global type of stability result) (Xian et al., 2004). The LaSalle–Yoshizawa Theorem (see Theorem 8.4 of Khalil, 2002) can now be invoked to state that

c‖y(t)‖² → 0 as t → ∞  ∀y(0) ∈ S.  (49)

Based on the definition of y(t), (49) can be used to conclude the results in (36) ∀y(0) ∈ S. From (21), u(t) → 0 as e2(t) → 0; therefore, (26) can be used to conclude that

μ → h̄ + fd + τd as r(t), e2(t) → 0.  (50)

From (27), (49) and (50),

μ → h + τd as t → ∞.  (51)

Using (51) and comparing (9) to (23) indicates that the dynamics in (1) converge to the state-space system in (11). Hence, u(t) converges to an optimal controller that minimizes (12) subject to (11) provided the gain conditions given in (16)–(18), (33) and (35) are satisfied.

7. Neural network extension

The efforts in this section investigate the amalgam of the robust RISE feedback method with NN methods to yield a direct optimal controller. These efforts provide a modification to the results in Abu-Khalaf and Lewis (2002), Cheng et al. (2006), Cheng and Lewis (2007), Cheng et al. (2007), Kim and Lewis (2000) and Kim et al. (2000) that allows for asymptotic stability and convergence to the optimal controller, rather than approximation of the optimal controller.

7.1. Feedforward NN estimation

The universal approximation property indicates that weights and thresholds exist such that a continuous function f(x) ∈ R^n of an input x ∈ R^{N1+1} can be represented by a three-layer NN as (Ge, Hang, & Zhang, 1999; Lewis, 1999; Lewis, Selmic, & Campos, 2002)

f(x) = W^T σ(V^T x) + ε(x).  (52)

In (52), V ∈ R^{(N1+1)×N2} and W ∈ R^{(N2+1)×n} are bounded constant ideal weight matrices for the first-to-second and second-to-third layers, respectively, where N1 is the number of neurons in the input layer, N2 is the number of neurons in the hidden layer, and n is the number of neurons in the third layer. The activation function in (52) is denoted by σ(·): R^{N1+1} → R^{N2+1}, and ε(x): R^{N1+1} → R^n is the functional reconstruction error. Based on (52), the typical three-layer NN approximation for f(x) is given as (Ge et al., 1999; Lewis, 1999; Lewis et al., 2002)

f̂(x) ≜ Ŵ^T σ(V̂^T x),  (53)

where V̂(t) ∈ R^{(N1+1)×N2} and Ŵ(t) ∈ R^{(N2+1)×n} are subsequently designed estimates of the ideal weight matrices. The estimate mismatches for the ideal weight matrices, denoted by Ṽ(t) ∈ R^{(N1+1)×N2} and W̃(t) ∈ R^{(N2+1)×n}, are defined as

Ṽ ≜ V − V̂,  W̃ ≜ W − Ŵ,

and the mismatch for the hidden-layer output error for a given x(t), denoted by σ̃(x) ∈ R^{N2+1}, is defined as

σ̃ ≜ σ − σ̂ = σ(V^T x) − σ(V̂^T x).  (54)

Assumption 6. The ideal weights are assumed to exist and be bounded by known positive values so that

‖V‖F² = tr(V^T V) ≤ V̄B,  (55)
‖W‖F² = tr(W^T W) ≤ W̄B,  (56)

where ‖·‖F is the Frobenius norm of a matrix and tr(·) is the trace of a matrix.

To develop the control input, the error system in (6) is premultiplied by M(q) and the expressions in (1), (4) and (5) are utilized to obtain

Mr = −Vm e2 + h̄ + fd + τd + α2 M e2 − τ,  (57)
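The three-layer representation in (52)–(53) amounts to a plain forward pass; a sketch follows. The layer sizes, random weights, and the sigmoid activation are arbitrary choices for illustration (the paper leaves σ(·) generic).

```python
import math
import random

# Sketch of the three-layer NN approximation (53): fhat(x) = What^T sigma(Vhat^T x),
# with x in R^(N1+1) (bias entry included), Vhat in R^((N1+1) x N2),
# What in R^((N2+1) x n). Sigmoid activation and random weights are arbitrary.
random.seed(0)
N1, N2, n = 3, 5, 2

def sigma(v):  # activation vector in R^(N2+1): bias unit followed by sigmoids
    return [1.0] + [1.0 / (1.0 + math.exp(-vi)) for vi in v]

Vhat = [[random.uniform(-1, 1) for _ in range(N2)] for _ in range(N1 + 1)]
What = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(N2 + 1)]

def fhat(x):                      # x has N1+1 entries, the first being 1 (bias)
    hidden_in = [sum(Vhat[i][j] * x[i] for i in range(N1 + 1)) for j in range(N2)]
    s = sigma(hidden_in)          # hidden-layer output in R^(N2+1)
    return [sum(What[i][k] * s[i] for i in range(N2 + 1)) for k in range(n)]

x = [1.0, 0.3, -0.7, 0.2]         # a [1, qd, qd', qd'']-style input for N1 = 3
y = fhat(x)
print(len(y) == n)  # -> True: the output lives in R^n
```

In the paper's usage the input is the bounded desired-trajectory vector xd(t), which is why the reconstruction error and its derivatives in (59) admit constant bounds.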

where (25) and (27) were used. The auxiliary function fd (qd , q̇d , q̈d ) where the unmeasurable auxiliary terms Ñ (e1 , e2 , r , t ), N (Ŵ , V̂ ,
∈ Rn in (25) can be represented by a three-layer NN as (Ge et al., xd , t ) ∈ Rn are defined as
1999; Lewis, 1999; Lewis et al., 2002)
·
1
fd = W T σ (V T xd ) + ε(xd ). (58) Ñ , − Ṁr + h̄ +e2 + α2 R−1 e2 − V̇m e2 − Vm ė2
2 ′
In the above equation, the input xd (t ) ∈ R3n+1 is defined as + α2 Ṁe2 + α2 M ė2 − proj(Γ1 σ̂ V̂ T ẋd eT2 )T σ̂
xd (t ) , [ 1 qTd (t ) q̇Td (t ) q̈Td (t ) ]T so that N1 = 3n where N1 ′ ′
− Ŵ T σ̂ proj(Γ2 ẋd (σ̂ T Ŵ e2 )T )T xd (67)
was introduced in (52). Based on the assumption that the desired
trajectory is bounded, the following inequalities hold N , ND + NB . (68)

‖ε‖ ≤ εb1 , ‖ε̇‖ ≤ εb2 , ‖ε̈‖ ≤ εb3 , (59) In (68), ND (t ) ∈ Rn is defined as


where εb1 , εb2 , εb3 ∈ R are known positive constants. ′
ND = W T σ V T ẋd + ε̇ + τ̇d , (69)
7.2. Closed-loop error system
while NB (Ŵ , V̂ , xd ) ∈ Rn is further segregated as
Based on the open-loop error system in (57), the control input NB = NB1 + NB2 , (70)
is composed of the optimal control developed in (21), a three-layer
NN feedforward term, plus the RISE feedback term in (32) as where NB1 (Ŵ , V̂ , xd ) ∈ Rn is defined as
τ , fˆd + µ − u. (60) ′ ′
NB1 = −W T σ̂ V̂ T ẋd − Ŵ T σ̂ Ṽ T ẋd , (71)
The feedforward NN component in (60), denoted by fˆd (t ) ∈ Rn , is
defined as and the term NB2 (Ŵ , V̂ , xd ) ∈ Rn is defined as
fˆd , Ŵ T σ (V̂ T xd ). (61) ′ ′
NB2 = Ŵ T σ̂ Ṽ T ẋd + W̃ T σ̂ V̂ T ẋd . (72)
The estimates for the NN weights in (61) are generated on-line
(there is no off-line learning phase) as Segregating the terms as in (69)–(72) facilitates the development
·
of the NN weight update laws and the subsequent stability analysis.
′ For example, the terms in (69) are grouped together because
Ŵ = proj(Γ1 σ̂ V̂ T ẋd eT2 ) (62)
· the terms and their time derivatives can be upper bounded by a
′T constant and rejected by the RISE feedback, whereas the terms
V̂ = proj(Γ2 ẋd (σ̂ Ŵ e2 )T )
grouped in (70) can be upper bounded by a constant but their
where σ (V̂ T x) ≡ dσ (V T x)/d(V T x)|V T x=V̂ T x ; Γ1 ∈ R(N2 +1)×(N2 +1)

derivatives are state dependent. The terms in (70) are further
and Γ2 ∈ R(3n+1)×(3n+1) are constant, positive definite, symmetric segregated because NB1 (Ŵ , V̂ , xd ) will be rejected by the RISE
matrices. In (62), proj(·) denotes a smooth convex projection algorithm that ensures Ŵ(t) and V̂(t) remain bounded inside known bounded convex regions. See Section 4.3 in Dixon, Fang, Dawson, and Flynn (2003) for further details.

The closed-loop tracking error system is obtained by substituting (60) into (57) as

$$M r = -V_m e_2 + \alpha_2 M e_2 + f_d - \hat{f}_d + \bar{h} + \tau_d + u - \mu. \quad (63)$$

Taking the time derivative of (63) and using (58) and (61) yields

$$M \dot{r} = -\dot{M} r - \dot{V}_m e_2 - V_m \dot{e}_2 + \alpha_2 \dot{M} e_2 + \alpha_2 M \dot{e}_2 + W^T \sigma' V^T \dot{x}_d - \dot{\hat{W}}^T \hat{\sigma} - \hat{W}^T \hat{\sigma}' \dot{\hat{V}}^T x_d - \hat{W}^T \hat{\sigma}' \hat{V}^T \dot{x}_d + \dot{\varepsilon} + \dot{\bar{h}} + \dot{\tau}_d + \dot{u} - \dot{\mu}, \quad (64)$$

where the notations $\hat{\sigma}$ and $\tilde{\sigma}$ are introduced in (54). Adding and subtracting the terms $W^T \hat{\sigma}' \hat{V}^T \dot{x}_d + \hat{W}^T \hat{\sigma}' \tilde{V}^T \dot{x}_d$ to (64) yields

$$M \dot{r} = -\dot{M} r - \dot{V}_m e_2 - V_m \dot{e}_2 + \alpha_2 \dot{M} e_2 + \alpha_2 M \dot{e}_2 + \hat{W}^T \hat{\sigma}' \tilde{V}^T \dot{x}_d + \tilde{W}^T \hat{\sigma}' \hat{V}^T \dot{x}_d - \dot{\hat{W}}^T \hat{\sigma} - \hat{W}^T \hat{\sigma}' \dot{\hat{V}}^T x_d + W^T \sigma' V^T \dot{x}_d - W^T \hat{\sigma}' \hat{V}^T \dot{x}_d - \hat{W}^T \hat{\sigma}' \tilde{V}^T \dot{x}_d + \dot{\varepsilon} + \dot{\bar{h}} + \dot{\tau}_d + \dot{u} - \dot{\mu}. \quad (65)$$

By using (21) and the NN weight tuning laws in (62), the expression in (65) can be rewritten as

$$M \dot{r} = -\frac{1}{2} \dot{M}(q) r + \tilde{N} + N - e_2 - R^{-1} r - (k_s + 1) r - \beta_1 \operatorname{sgn}(e_2). \quad (66)$$

The term $N_{B1}$ will be rejected by the RISE feedback, whereas $N_{B2}(\hat{W}, \hat{V}, x_d)$ will be partially rejected by the RISE feedback and partially canceled by the adaptive update law for the NN weight estimates.

In a similar manner as in Xian et al. (2004), the Mean Value Theorem can be used to develop the following upper bound

$$\|\tilde{N}(t)\| \le \rho(\|y\|) \|y\|, \quad (73)$$

where $y(t) \in \mathbb{R}^{3n}$ was defined in (31), and the bounding function $\rho(\|y\|) \in \mathbb{R}$ is a positive, globally invertible, nondecreasing function. The following inequalities can be developed based on Assumption 5, (55), (56), (59), (62) and (70)–(72):

$$\|N_D\| \le \zeta_1, \qquad \|N_B\| \le \zeta_2, \qquad \|\dot{N}_D\| \le \zeta_3, \quad (74)$$
$$\|\dot{N}_B\| \le \zeta_4 + \zeta_5 \|e_2\|, \quad (75)$$

where $\zeta_i \in \mathbb{R}$ ($i = 1, 2, \ldots, 5$) are known positive constants. To facilitate the subsequent stability analysis, the control gains $\alpha_1$ and $\alpha_2$ introduced in (5) and (6), respectively, are selected according to the sufficient conditions

$$\lambda_{\min}(\alpha_1) > \frac{1}{2}, \qquad \alpha_2 > \zeta_5 + 1, \quad (76)$$

and $\beta_1$ is selected according to the following sufficient condition:

$$\beta_1 > \zeta_1 + \zeta_2 + \frac{1}{\alpha_2} \zeta_3 + \frac{1}{\alpha_2} \zeta_4, \quad (77)$$

where $\zeta_i \in \mathbb{R}$, $i = 1, 2, \ldots, 5$, are introduced in (74) and (75), and $\beta_1$ was introduced in (32).
7.3. Stability and optimality analysis

Theorem 2. The nonlinear optimal controller given in (60)–(62) ensures that all system signals are bounded under closed-loop operation and that the position tracking error is regulated in the sense that

$$\|y(t)\| \to 0 \quad \text{as } t \to \infty \quad \forall\, y(0) \in \mathcal{S}, \quad (78)$$

where the set $\mathcal{S}$ can be made arbitrarily large by selecting $k_s$ based on the initial conditions of the system (i.e., a semi-global result). The boundedness of the closed-loop signals and the result in (78) can be obtained provided the sufficient conditions in (76) and (77) are satisfied. Furthermore, $u(t)$ converges to an optimal controller that minimizes (12) subject to (11) provided the gain conditions given in (16)–(18) are satisfied.

Proof. Let $\mathcal{D} \subset \mathbb{R}^{3n+2}$ be a domain containing $\Phi(t) = 0$, where $\Phi(t) \in \mathbb{R}^{3n+2}$ is defined as

$$\Phi(t) \triangleq \left[\, y^T(t) \quad \sqrt{P(t)} \quad \sqrt{G(t)} \,\right]^T. \quad (79)$$

In (79), the auxiliary function $P(t) \in \mathbb{R}$ is defined as

$$P(t) \triangleq \beta_1 \sum_{i=1}^{n} |e_{2i}(0)| - e_2(0)^T N_B(0) - L(t), \quad (80)$$

where $e_{2i}(0)$ is equal to the $i$th element of $e_2(0)$ and the auxiliary function $L(t) \in \mathbb{R}$ is the generalized solution to

$$\dot{L}(t) \triangleq r^T \left( N_{B1}(t) + N_D(t) - \beta_1 \operatorname{sgn}(e_2) \right) + \dot{e}_2^T(t) N_{B2}(t) - \zeta_5 \|e_2(t)\|^2. \quad (81)$$

Provided the sufficient condition introduced in (77) is satisfied (Xian et al., 2004),

$$L(t) \le \beta_1 \sum_{i=1}^{n} |e_{2i}(0)| - e_2(0)^T N_B(0). \quad (82)$$

Hence, (82) can be used to conclude that $P(t) \ge 0$. The auxiliary function $G(t) \in \mathbb{R}$ in (79) is defined as

$$G(t) = \frac{\alpha_2}{2} \operatorname{tr}(\tilde{W}^T \Gamma_1^{-1} \tilde{W}) + \frac{\alpha_2}{2} \operatorname{tr}(\tilde{V}^T \Gamma_2^{-1} \tilde{V}). \quad (83)$$

Since $\Gamma_1$ and $\Gamma_2$ are constant, symmetric, and positive definite matrices and $\alpha_2 > 0$, it is straightforward that $G(t) \ge 0$.

Let $V_L(\Phi, t) : \mathcal{D} \times [0, \infty) \to \mathbb{R}$ be a continuously differentiable positive definite function defined as

$$V_L(\Phi, t) \triangleq e_1^T e_1 + \frac{1}{2} e_2^T e_2 + \frac{1}{2} r^T M(q) r + P + G, \quad (84)$$

provided the sufficient conditions introduced in (77) are satisfied. After taking the time derivative of (84), utilizing (5), (6) and (66), and substituting for the time derivative of $P(t)$ and $G(t)$, $\dot{V}_L(\Phi, t)$ can be simplified as

$$\dot{V}_L(\Phi, t) = -2 e_1^T \alpha_1 e_1 - (k_s + 1) \|r\|^2 - r^T R^{-1} r + 2 e_2^T e_1 + r^T \tilde{N}(t) - \alpha_2 \|e_2\|^2 + \zeta_5 \|e_2(t)\|^2 + \alpha_2 e_2^T \left[ \hat{W}^T \hat{\sigma}' \tilde{V}^T \dot{x}_d + \tilde{W}^T \hat{\sigma}' \hat{V}^T \dot{x}_d \right] + \operatorname{tr}(\alpha_2 \tilde{W}^T \Gamma_1^{-1} \dot{\tilde{W}}) + \operatorname{tr}(\alpha_2 \tilde{V}^T \Gamma_2^{-1} \dot{\tilde{V}}). \quad (85)$$

Based on (44) and (62), the expression in (85) can be simplified as

$$\dot{V}_L(\Phi, t) \le r^T \tilde{N}(t) - \left( k_s + 1 + \lambda_{\min}(R^{-1}) \right) \|r\|^2 - \left( 2\lambda_{\min}(\alpha_1) - 1 \right) \|e_1\|^2 - \left( \alpha_2 - 1 - \zeta_5 \right) \|e_2\|^2. \quad (86)$$

By using (73), the expression in (86) can be rewritten as

$$\dot{V}_L(\Phi, t) \le -\lambda_3 \|y\|^2 - \left[ k_s \|r\|^2 - \rho(\|y\|) \|r\| \|y\| \right], \quad (87)$$

where $\lambda_3 \triangleq \min\{2\lambda_{\min}(\alpha_1) - 1,\ \alpha_2 - 1 - \zeta_5,\ 1 + \lambda_{\min}(R^{-1})\}$; hence, $\alpha_1$ and $\alpha_2$ must be chosen according to the sufficient condition in (76). After completing the squares for the terms inside the brackets in (87), the following expression can be obtained:

$$\dot{V}_L(\Phi, t) \le -\lambda_3 \|y\|^2 + \frac{\rho^2(\|y\|) \|y\|^2}{4 k_s}. \quad (88)$$

Based on (84) and (88), the same stability arguments from Theorem 1 can be used to conclude the result in (78). Furthermore, the result in (78) can be used along with (63) to indicate that

$$\hat{f}_d + \mu = h + \tau_d \quad \text{as } t \to \infty. \quad (89)$$

Using (89), and comparing (9) to (60), indicates that the dynamics in (7) converge to the state-space system in (11). Hence, $u(t)$ converges to an optimal controller that minimizes (12) subject to (11) provided the gain conditions given in (16)–(18), (76) and (77) are satisfied.

8. Experimental results

To examine the performance of the controllers developed in (23) and (60), an experiment was performed on a two-link robot testbed. The testbed is composed of a two-link direct-drive revolute robot consisting of two aluminum links, mounted on 240.0 [Nm] (base joint) and 20.0 [Nm] (second joint) switched reluctance motors. The control objective is to track the desired time-varying trajectory

$$q_{d1} = q_{d2} = 60 \sin(2t) \left( 1 - \exp(-0.01 t^3) \right). \quad (90)$$

To achieve the control objective, the control gains $\alpha_2$, $k_s$, and $\beta_1$, defined as scalars in (6) and (32), were implemented (with non-consequential implications to the stability result) as diagonal gain matrices. The weighting matrices for the controllers were chosen as

$$Q_{11} = \begin{bmatrix} 40 & 2 \\ 2 & 40 \end{bmatrix}, \qquad Q_{12} = \begin{bmatrix} -4 & 4 \\ 4 & -6 \end{bmatrix}, \qquad Q_{22} = \operatorname{diag}(4, 4),$$

which using (16)–(18) yielded the following values for $K$, $\alpha_1$, and $R$:

$$K = \begin{bmatrix} 4 & -4 \\ -4 & 6 \end{bmatrix}, \qquad \alpha_1 = \begin{bmatrix} 15.6 & 10.6 \\ 10.6 & 10.4 \end{bmatrix}, \qquad R = \operatorname{diag}(0.25, 0.25).$$

The remaining control gains for both controllers were selected as

$$\alpha_2 = \operatorname{diag}(60, 35), \qquad \beta_1 = \operatorname{diag}(5, 0.1), \qquad k_s = \operatorname{diag}(140, 20).$$

The neural network update law weights were selected as

$$\Gamma_1 = 25 I_{11}, \qquad \Gamma_2 = 25 I_7.$$

For all experiments, the rotor velocity signal is obtained by applying a standard backwards difference algorithm to the position signal. The integral structure for the RISE term in (32) was computed on-line via a standard trapezoidal algorithm. In addition, all the states were initialized to zero. Each experiment was performed ten times, and data from the experiments is displayed in Table 1. Figs. 1 and 2 depict the tracking errors and control torques for one experimental trial for the optimal RISE controller. Figs. 3 and 4 depict the tracking errors and control torques for one experimental trial for the optimal NN+RISE controller.
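The implementation notes above (a backwards-difference velocity estimate and an on-line trapezoidal integral for the RISE term) can be sketched in discrete time. This is a schematic single-joint sketch, not the authors' code: the RISE term is assumed to have the standard structure µ(t) = (k_s + 1)(e_2(t) − e_2(0)) + ∫₀ᵗ [(k_s + 1)α_2 e_2(τ) + β_1 sgn(e_2(τ))] dτ, which matches (32) only up to notation, and the function names and scalar gains are placeholders.

```python
def backwards_difference(q_curr, q_prev, dt):
    """Velocity estimate from two successive position samples."""
    return (q_curr - q_prev) / dt

def sgn(x):
    """Signum function used inside the RISE integral."""
    return (x > 0) - (x < 0)

def rise_mu(e2_traj, dt, ks=140.0, alpha2=60.0, beta1=5.0):
    """Discrete RISE feedback term for a single joint, with the integral
    accumulated on-line by the trapezoidal rule (assumed standard form)."""
    f = [(ks + 1.0) * alpha2 * e + beta1 * sgn(e) for e in e2_traj]
    integral = 0.0
    mu = [0.0]  # mu(0) = 0: both e2(t) - e2(0) and the integral vanish at t = 0
    for k in range(1, len(e2_traj)):
        integral += 0.5 * (f[k - 1] + f[k]) * dt  # trapezoidal update
        mu.append((ks + 1.0) * (e2_traj[k] - e2_traj[0]) + integral)
    return mu
```

For a constant error e_2 ≡ 1 the integrand is constant, so the accumulated integral grows linearly in time, which makes the trapezoidal accumulation easy to verify by hand.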
Table 1
Tabulated values for the 10 runs for the developed controllers.

                                     RISE     NN+RISE
Avg. Max SS Error (deg) — Link 1     0.0416   0.0416
Avg. Max SS Error (deg) — Link 2     0.0573   0.0550
Avg. RMS Error (deg) — Link 1        0.0128   0.0139
Avg. RMS Error (deg) — Link 2        0.0139   0.0143
Avg. RMS Torque (Nm) — Link 1        9.4217   9.4000
Avg. RMS Torque (Nm) — Link 2        1.7375   1.6825
Error std dev (deg) — Link 1         0.0016   0.0011
Error std dev (deg) — Link 2         0.0019   0.0015
Torque std dev (Nm) — Link 1         0.2775   0.3092
Torque std dev (Nm) — Link 2         0.0734   0.1746
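The statistics reported in Table 1 (maximum steady-state error over a final window, RMS values, and across-run standard deviations) reduce to a few lines of post-processing on logged signals. The sketch below is illustrative rather than the authors' processing code; the 5 s steady-state window matches the definition used in the discussion, and the helper names are placeholders.

```python
import math

def rms(xs):
    """Root-mean-square of a sampled signal."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def max_ss_error(err, dt, window_s=5.0):
    """Maximum absolute error over the final window_s seconds of a run."""
    n = max(1, int(window_s / dt))
    return max(abs(x) for x in err[-n:])

def std_dev(xs):
    """Population standard deviation of per-run statistics."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))
```

Applied per run and then averaged over the ten runs, these three helpers produce the row entries of Table 1.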
[Fig. 1. Tracking errors for the optimal RISE controller.]
[Fig. 4. Torques for the optimal NN+RISE controller.]
[Fig. 2. Torques for the optimal RISE controller.]

8.1. Discussion

The experiments show that both controllers stabilize the system. Both controllers keep the average maximum steady-state error (where steady state is defined as the last 5 s of the experiment) under 0.05 degrees for the first link and under 0.06 degrees for the second link. The data in Table 1 indicate that the NN+RISE controller resulted in slightly more RMS error for each link, although a reduced or equal maximum steady-state error, with a slightly reduced torque. The reduced error standard deviation of the NN+RISE controller shows that the results from each run were more consistent than with the RISE controller alone, but there was greater variance in the torque.
[Fig. 3. Tracking errors for the optimal NN+RISE controller.]

9. Conclusion

A control scheme is developed for a class of nonlinear Euler–Lagrange systems that enables the generalized coordinates to asymptotically track a desired time-varying trajectory despite general uncertainty in the dynamics, such as additive bounded disturbances and parametric uncertainty that do not have to satisfy an LP assumption. The main contribution of this work is that the RISE feedback method, both with and without a feedforward NN, is augmented with an auxiliary control term that minimizes a quadratic performance index based on a HJB optimization scheme. Like the influential work in Abu-Khalaf and Lewis (2002), Abu-Khalaf et al. (2006), Cheng et al. (2006), Cheng and Lewis (2007), Cheng et al. (2007), Johansson (1990), Kim and Lewis (2000), Kim et al. (2000) and Lewis (1986), the result in this effort initially develops an optimal controller based on a partially feedback linearized state-space model assuming exact knowledge of the dynamics. The optimal controller is then combined with a feedforward NN and
RISE feedback. A Lyapunov stability analysis is included to show that the NN and RISE feedback identify the uncertainties, and therefore the dynamics asymptotically converge to the state-space system that the HJB optimization scheme is based on. Experiments show that both controllers stabilize the system and that the optimal NN+RISE controller yields similar steady-state error compared to the optimal RISE controller while requiring slightly reduced torque.

References

Abu-Khalaf, M., Huang, J., & Lewis, F. L. (2006). Nonlinear H-2/H-infinity constrained feedback control. Springer.
Abu-Khalaf, M., & Lewis, F. L. (2002). Nearly optimal HJB solution for constrained input systems using a neural network least-squares approach. In Proc. IEEE conf. decis. control (pp. 943–948), Las Vegas, NV, 2002.
Cheng, J., Li, H., & Zhang, Y. (2006). Robust low-cost sliding mode overload control for uncertain agile missile model. In Proc. world congr. intell. control autom. (pp. 2185–2188), Dalian, China, June 2006.
Cheng, T., & Lewis, F. (2007). Neural network solution for finite-horizon H-infinity constrained optimal control of nonlinear systems. In Proc. IEEE int. conf. control autom. (pp. 1966–1972), June 2007.
Cheng, T., Lewis, F. L., & Abu-Khalaf, M. (2007). A neural network solution for fixed-final time optimal control of nonlinear systems. Automatica, 43, 482–490.
Dixon, W. E., Fang, Y., Dawson, D. M., & Flynn, T. J. (2003). Range identification for perspective vision systems. IEEE Transactions on Automatic Control, 48(12), 2232–2238.
Dupree, K., Patre, P. M., Wilcox, Z. D., & Dixon, W. E. (2008). Optimal control of uncertain nonlinear systems using RISE feedback. In Proc. IEEE conf. decis. control (pp. 2154–2159), December 2008.
Fausz, J. L., Chellaboina, V.-S., & Haddad, W. M. (1997). Inverse optimal adaptive control for nonlinear uncertain systems with exogenous disturbances. In Proc. IEEE conf. decis. control (pp. 2654–2659), December 1997.
Freeman, R. A., & Kokotović, P. V. (1996). Robust nonlinear control design: state-space and Lyapunov techniques. Boston, MA: Birkhäuser.
Freeman, R. A., & Kokotovic, P. V. (1995). Optimal nonlinear controllers for feedback linearizable systems. In Proc. IEEE am. control conf. (pp. 2722–2726), June 1995.
Ge, S. S., Hang, C. C., & Zhang, T. (1999). Nonlinear adaptive control using neural networks and its application to CSTR systems. Journal of Process Control, 9(4), 313–323.
Johansson, R. (1990). Quadratic optimization of motion coordination and control. IEEE Transactions on Automatic Control, 35(11), 1197–1208.
Khalil, H. K. (2002). Nonlinear systems (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
Kim, Y. H., & Lewis, F. L. (2000). Optimal design of CMAC neural-network controller for robot manipulators. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 30(1), 22–31.
Kim, Y. H., Lewis, F. L., & Dawson, D. M. (2000). Intelligent optimal control of robotic manipulator using neural networks. Automatica, 36, 1355–1364.
Kirk, D. (2004). Optimal control theory: an introduction. Dover Publications.
Krstic, M., & Li, Z.-H. (1998). Inverse optimal design of input-to-state stabilizing nonlinear controllers. IEEE Transactions on Automatic Control, 43(3), 336–350.
Krstic, M., & Tsiotras, P. (1999). Inverse optimal stabilization of a rigid spacecraft. IEEE Transactions on Automatic Control, 44(5), 1042–1049.
Lewis, F. L. (1986). Optimal control. John Wiley & Sons.
Lewis, F. L. (1999). Nonlinear network structures for feedback control. Asian Journal of Control, 1(4), 205–228.
Lewis, F. L., Selmic, R., & Campos, J. (2002). Neuro-fuzzy control of industrial systems with actuator nonlinearities. Philadelphia, PA: Society for Industrial and Applied Mathematics.
Lewis, F. L., & Syrmos, V. L. (1995). Optimal control. Wiley.
Li, Z. H., & Krstic, M. (1997). Optimal design of adaptive tracking controllers for nonlinear systems. Automatica, 33, 1459–1473.
Lu, Q., Sun, Y., Xu, Z., & Mochizuki, T. (1996). Decentralized nonlinear optimal excitation control. IEEE Transactions on Power Systems, 11(4), 1957–1962.
Luo, W., Chu, Y.-C., & Ling, K.-V. (2005). Inverse optimal adaptive control for attitude tracking of spacecraft. IEEE Transactions on Automatic Control, 50(11), 1639–1654.
Nevistic, V., & Primbs, J. A. (1996). Constrained nonlinear optimal control: a converse HJB approach. Technical report CIT-CDS 96-021, California Institute of Technology, Pasadena, CA 91125.
Patre, P. M., MacKunis, W., Kaiser, K., & Dixon, W. E. (2008). Asymptotic tracking for uncertain dynamic systems via a multilayer neural network feedforward and RISE feedback control structure. IEEE Transactions on Automatic Control, 53(9), 2180–2185.
Patre, P. M., MacKunis, W., Makkar, C., & Dixon, W. E. (2008). Asymptotic tracking for systems with structured and unstructured uncertainties. IEEE Transactions on Control Systems Technology, 16(2), 373–379.
Primbs, J. A., & Nevistic, V. (1996). Optimality of nonlinear design techniques: a converse HJB approach. Technical report CIT-CDS 96-022, California Institute of Technology, Pasadena, CA 91125.
Qu, Z., & Xu, J. X. (2002). Model-based learning controls and their comparisons using Lyapunov direct method. Asian Journal of Control, 4(1), 99–110.
Sekoguchi, M., Konishi, H., Goto, M., Yokoyama, A., & Lu, Q. (2002). Nonlinear optimal control applied to STATCOM for power system stabilization. In Proc. IEEE/PES transm. distrib. conf. exhib. (pp. 342–347), October 2002.
Xian, B., Dawson, D. M., de Queiroz, M. S., & Chen, J. (2004). A continuous asymptotic tracking control strategy for uncertain nonlinear systems. IEEE Transactions on Automatic Control, 49(7), 1206–1211.

Keith Dupree received his B.S. degree in Aerospace Engineering with a minor in Physics in May 2005, his M.S. in Mechanical Engineering in May 2007, his M.S. in Electrical and Computer Engineering in August 2008, and his Ph.D. in Mechanical Engineering, all from the University of Florida. He is currently employed as a Systems Engineer by Raytheon. His research interests include nonlinear control, image processing, state estimation, and optimal control.

Parag M. Patre received his B.Tech. degree in mechanical engineering from the Indian Institute of Technology Madras, India, in 2004. Following this he was with Larsen & Toubro Limited, India, until 2005, when he joined the graduate school at the University of Florida. He received his M.S. and Ph.D. degrees in mechanical engineering in 2007 and 2009, respectively, and is currently a NASA postdoctoral fellow at the NASA Langley Research Center, Hampton, VA. His areas of research interest are Lyapunov-based design and analysis of control methods for uncertain nonlinear systems, robust and adaptive control, neural networks, and control of robots and aircraft.

Zachary D. Wilcox received his Ph.D. in 2010 from the Department of Mechanical and Aerospace Engineering at the University of Florida (UF). After completing his doctoral studies, he was selected as a Post-doctoral Research Associate at NASA Dryden, Edwards Air Force Base. Dr. Wilcox's main research interest has been the development and application of Lyapunov-based control techniques for aerospace systems with uncertain dynamic models.

Warren E. Dixon received his Ph.D. in 2000 from the Department of Electrical and Computer Engineering at Clemson University. After completing his doctoral studies he was selected as a Eugene P. Wigner Fellow at Oak Ridge National Laboratory (ORNL), where he worked in the Robotics and Energetic Systems Group. In 2004, Dr. Dixon joined the faculty of the University of Florida in the Mechanical and Aerospace Engineering Department. His main research interest is the development and application of Lyapunov-based control techniques for uncertain nonlinear systems. He is the co-author of four monographs, an edited collection, six chapters, and over 200 refereed journal and conference papers. His work has been recognized by the 2009 American Automatic Control Council (AACC) O. Hugo Schuck Award, the 2006 IEEE Robotics and Automation Society (RAS) Early Academic Career Award, an NSF CAREER Award (2006–2011), the 2004 DOE Outstanding Mentor Award, and the 2001 ORNL Early Career Award for Engineering Achievement. Dr. Dixon is a senior member of IEEE. He is a member of various technical committees, and conference program and organizing committees. He also serves on the conference editorial board for the IEEE CSS and RAS and the ASME DSC. He served as an appointed member of the IEEE CSS Board of Governors for 2008. He is currently an associate editor for Automatica, IEEE Transactions on Systems, Man, and Cybernetics: Part B: Cybernetics, International Journal of Robust and Nonlinear Control, and Journal of Robotics.