
3.3 Optimality, Stability, and Passivity


3.3.1 Optimal stabilizing control
We now introduce optimal control as a design tool which guarantees stability margins. Of the two types of optimality conditions, Pontryagin-type necessary conditions ("Maximum Principle") and Bellman-type sufficient conditions ("Dynamic Programming"), the latter is more suitable for feedback design over infinite time intervals [1]. This will be our approach to the problem of finding a feedback control u(x) for the system

ẋ = f(x) + g(x)u    (3.3.1)

with the following properties:

(i) u(x) achieves asymptotic stability of the equilibrium x = 0

(ii) u(x) minimizes the cost functional


J = ∫₀^∞ (l(x) + uᵀR(x)u) dt    (3.3.2)

where l(x) ≥ 0 and R(x) > 0 for all x.

For a given feedback control u(x), the value of J, if finite, is a function of the initial state x(0): J(x(0)), or simply J(x). When J is at its minimum, J(x) is called the optimal value function. In preparation for our use of the optimal value function J(x) as a Lyapunov function, we denote it by V(x). When we want to stress that u(x) is optimal, we denote it by u∗(x). The functions V(x) and u∗(x) are related to each other via the following optimality condition.

Theorem 3.19 (Optimality and stability)


Suppose that there exists a C¹ positive semidefinite function V(x) which satisfies the Hamilton-Jacobi-Bellman equation

l(x) + Lf V(x) − (1/4) Lg V(x) R⁻¹(x) (Lg V(x))ᵀ = 0,   V(0) = 0    (3.3.3)

such that the feedback control

u∗(x) = −(1/2) R⁻¹(x) (Lg V)ᵀ(x)    (3.3.4)

achieves asymptotic stability of the equilibrium x = 0. Then u∗(x) is the optimal stabilizing control which minimizes the cost (3.3.2) over all u guaranteeing lim_{t→∞} x(t) = 0, and V(x) is the optimal value function.
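
Stated operationally, the theorem turns optimal design into a verification problem: given a candidate V(x), check (3.3.3) and read off u∗(x) from (3.3.4). A small sympy helper makes this concrete in the scalar case; this is our own construction, not part of the text, and the function name is ours:

```python
import sympy as sp

def hjb_check(f, g, l, R, V, x):
    """Scalar-case helper (our construction): returns the HJB residual (3.3.3)
    and the candidate optimal control (3.3.4) for xdot = f(x) + g(x)*u."""
    Vx = sp.diff(V, x)  # dV/dx, so Lf V = Vx*f and Lg V = Vx*g in the scalar case
    residual = l + Vx * f - sp.Rational(1, 4) * (Vx * g)**2 / R
    u_star = -sp.Rational(1, 2) * (Vx * g) / R
    return sp.simplify(residual), sp.simplify(u_star)
```

A zero residual together with stability of the closed loop is exactly the situation of the theorem; Example 3.20 below carries out such a verification by hand.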

Proof: Substituting

v = u + (1/2) R⁻¹(x) (Lg V(x))ᵀ

into (3.3.2) and using the HJB identity we get the following chain of equalities:

J = ∫₀^∞ (l + vᵀRv − vᵀ(Lg V)ᵀ + (1/4) Lg V R⁻¹ (Lg V)ᵀ) dt
  = ∫₀^∞ (−Lf V + (1/2) Lg V R⁻¹ (Lg V)ᵀ − Lg V v) dt + ∫₀^∞ vᵀR(x)v dt
  = −∫₀^∞ (∂V/∂x)(f + gu) dt + ∫₀^∞ vᵀR(x)v dt = −∫₀^∞ (dV/dt) dt + ∫₀^∞ vᵀR(x)v dt
  = V(x(0)) − lim_{T→∞} V(x(T)) + ∫₀^∞ vᵀR(x)v dt

Because we minimize (3.3.2) only over those u which achieve lim_{t→∞} x(t) = 0, the above limit of V(x(T)) is zero and we obtain

J = V(x(0)) + ∫₀^∞ vᵀR(x)v dt

Clearly, the minimum of J is V(x(0)). It is reached for v(t) ≡ 0, which proves that u∗(x) given by (3.3.4) is optimal and that V(x) is the optimal value function.
□
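
The completing-the-square step in this chain can be verified symbolically in the scalar case. Below is a minimal sympy sketch; it is our own check, not part of the text, and the symbols f, g, R, Vx are scalar stand-ins for f(x), g(x), R(x), and ∂V/∂x, so that Lf V = Vx·f and Lg V = Vx·g:

```python
import sympy as sp

# Scalar stand-ins (our names): f ~ f(x), g ~ g(x), R ~ R(x) > 0, Vx ~ dV/dx
u, f, g, R, Vx = sp.symbols('u f g R V_x')

v = u + sp.Rational(1, 2) * Vx * g / R              # the substitution in the proof
l = -Vx * f + sp.Rational(1, 4) * Vx**2 * g**2 / R  # l(x) expressed via the HJB identity (3.3.3)

# First line of the chain, written in terms of v:
line1 = l + R * v**2 - v * Vx * g + sp.Rational(1, 4) * Vx**2 * g**2 / R

print(sp.simplify(line1 - (l + R * u**2)))                  # 0: equals the original integrand
print(sp.simplify(line1 - (-Vx * (f + g * u) + R * v**2)))  # 0: equals -dV/dt + v R v
```

Both printed differences vanish, confirming the two identities on which the chain rests.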

Example 3.20 (Optimal stabilization)


For the optimal stabilization of the system

ẋ = x² + u

with the cost functional

J = ∫₀^∞ (x² + u²) dt    (3.3.5)
we need to find a positive semidefinite solution of the HJB equation

x² + (∂V/∂x) x² − (1/4) (∂V/∂x)² = 0,   V(0) = 0

Solving it first as a quadratic equation in ∂V/∂x we get

∂V/∂x = 2x² + 2x√(x² + 1)

where the positive sign is required for the optimal value function to be positive semidefinite:

V(x) = (2/3)(x³ + (x² + 1)^(3/2) − 1)    (3.3.6)

It can be checked that V(x) is positive definite and radially unbounded. The control law

u∗(x) = −(1/2) ∂V/∂x = −x² − x√(x² + 1)    (3.3.7)

achieves GAS of the resulting feedback system

ẋ = −x√(x² + 1)

and hence is the optimal stabilizing control for (3.3.5). □
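
The computations in this example are easily confirmed by machine. A short sympy check (our script; it assumes only the data of the example):

```python
import sympy as sp

x = sp.symbols('x', real=True)
V = sp.Rational(2, 3) * (x**3 + (x**2 + 1)**sp.Rational(3, 2) - 1)  # candidate (3.3.6)
Vx = sp.diff(V, x)                                                  # equals 2x^2 + 2x*sqrt(x^2 + 1)

# HJB equation for xdot = x^2 + u with l(x) = x^2, R = 1:
print(sp.simplify(x**2 + Vx * x**2 - sp.Rational(1, 4) * Vx**2))    # 0: V solves the HJB equation

u_star = -Vx / 2                                                    # control law (3.3.7)
print(sp.simplify(x**2 + u_star))                                   # -x*sqrt(x**2 + 1): the closed loop
```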

In the statement of Theorem 3.19 we have assumed the existence of a positive semidefinite solution V(x) of the HJB equation. For the LQR problem the HJB equation (3.3.3) can be solved with the help of an algebraic Riccati equation whose properties are well known. For further reference we quote a basic version of this well-known result.

Proposition 3.21 (LQR problem)


For optimal stabilization of the linear system

ẋ = Ax + Bu (3.3.8)

with respect to the cost functional

J = ∫₀^∞ (xᵀCᵀCx + uᵀRu) dt,   R > 0

consider the Riccati equation

PA + AᵀP − PBR⁻¹BᵀP + CᵀC = 0    (3.3.9)

If (A, B) is controllable and (A, C) is observable, then (3.3.9) has a unique positive definite solution P∗, the optimal value function is V(x) = xᵀP∗x, and the optimal stabilizing control is

u∗(x) = −R⁻¹BᵀP∗x

If (A, B) is stabilizable and (A, C) is detectable, then P∗ is positive semidefinite.


□
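
In practice, the Riccati equation (3.3.9) is solved numerically rather than by hand. A minimal SciPy sketch follows; the double-integrator data is our illustration, not from the text:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator (our example data): (A, B) controllable, (A, C) observable
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
R = np.array([[1.0]])

# solve_continuous_are solves A^T P + P A - P B R^{-1} B^T P + Q = 0, here with Q = C^T C
P = solve_continuous_are(A, B, C.T @ C, R)

K = np.linalg.solve(R, B.T @ P)        # u*(x) = -R^{-1} B^T P x = -K x
print(P)                               # unique positive definite solution P*
print(np.linalg.eigvals(A - B @ K))    # closed-loop eigenvalues, in the open left half-plane
```

For this data the closed-loop eigenvalues come out at −√2/2 ± j√2/2, consistent with Proposition 3.21.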

A proof of this result can be found in any standard text, such as [1]. For our further discussion, the semidefiniteness of l(x) = xᵀCᵀCx is of interest because it shows the significance of an observability property. It is intuitive that "detectability in the cost" of the unstable part of the system is necessary for an optimal control to be stabilizing. A scalar example will illustrate some of the issues involved.

Example 3.22 (Optimal control and “detectability in the cost”)


For the linear system

ẋ = x + u

and the cost functional

J = ∫₀^∞ u² dt    (3.3.10)
we have A = 1, B = 1, C = 0, R = 1. The Riccati equation and its solutions P1 and P2 are

2P − P² = 0,   P1 = 0,   P2 = 2    (3.3.11)
It can also be directly checked that the solutions of the HJB equation

(∂V/∂x) x − (1/4) (∂V/∂x)² = 0,   V(0) = 0

are V1(x) = 0 and V2(x) = 2x², that is, V1(x) = P1x², V2(x) = P2x². The smaller of the two, V1(x), gives the minimum of the cost functional, but the control law u(x) = 0 is not stabilizing. The reason is that l(x) = 0 and the instability of ẋ = x is not detected in the cost functional.
According to Theorem 3.19, in which the minimization of J is performed only over the set of stabilizing controls, V2(x) = 2x² is the optimal value function and u(x) = −2x is the optimal stabilizing control.
The assumptions of Theorem 3.19 can be interpreted as incorporating a detectability condition. This can be illustrated by letting the cost functional J in (3.3.10) be the limit, as ε → 0, of the augmented cost functional

Jε = ∫₀^∞ (ε²x² + u²) dt

in which the state is observable. The corresponding Riccati equation, and its solutions P1ε and P2ε, are

2P − P² + ε² = 0,   P1ε = 1 − √(1 + ε²),   P2ε = 1 + √(1 + ε²)

The HJB solutions V1ε(x) = (1 − √(1 + ε²))x² and V2ε(x) = (1 + √(1 + ε²))x² converge, as ε → 0, to V1(x) = 0 and V2(x) = 2x², respectively. This reveals that V1(x) = 0 is the limit of V1ε(x) which, for ε > 0, is negative definite while Jε must be nonnegative. Hence V1ε(x) cannot be a value function, let alone an optimal value function. The optimal value function for Jε is V2ε(x) and Theorem 3.19 identifies its limit V2(x) as the optimal value function for J. □
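
The convergence of the perturbed solutions P1ε and P2ε can also be observed numerically; a small sketch (our script, using the scalar Riccati equation above):

```python
import numpy as np

# Roots of the perturbed scalar Riccati equation 2P - P^2 + eps^2 = 0
for eps in [1.0, 0.1, 0.01, 0.001]:
    P1_eps = 1.0 - np.sqrt(1.0 + eps**2)   # negative for eps > 0; tends to P1 = 0
    P2_eps = 1.0 + np.sqrt(1.0 + eps**2)   # positive; tends to P2 = 2
    print(f"eps = {eps}: P1_eps = {P1_eps:.6f}, P2_eps = {P2_eps:.6f}")
```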

In our presentation thus far we have not stated the most detailed conditions
for optimality, because our approach will be to avoid the often intractable task
of solving the HJB equation (3.3.3). Instead, we will employ Theorem 3.19
only as a test of optimality for an already designed stabilizing control law.
