
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 9, No. 1, 1972

Linear Stochastic Singular Control Problems¹


Yu-Chi Ho²

Abstract. The explicit feedback control law for the singular linear-quadratic-gaussian stochastic control problem is derived. The interesting implication of the control law in terms of information pattern is discussed.

1. Introduction

Optimal control problems in which the Hamiltonian does not have
a unique minimum are called singular problems. A fairly general class
of such problems, arising as the accessory minimization problem of the
second variation or as the classical minimum-error problem in servo theory,
consists of a performance criterion quadratic in the state variable x but
linear in the control variable u, together with a linear dynamic system in x and u.
The deterministic version of such problems has been studied by various
authors (Refs. 1-3). In this paper, we shall solve the stochastic version
of these problems and show that (i) the certainty-equivalence principle
does not hold in this case, but the optimal control law is still linear, and
(ii) the control law involves a dynamic system of order n - r only, where
n = dim(x) and r = dim(u). This latter point is particularly interesting
from the viewpoint of information pattern. For example, if n = r, then
a zero-memory control law is optimal, which implies the possibility
of decentralization.

¹ Paper received May 12, 1971. The research reported in this document was done while
the author was a Guggenheim Fellow at the Imperial College, London, England, and
was supported in part by the U.S. Army Research Office, the U.S. Air Force Office of
Scientific Research, and the U.S. Office of Naval Research under the Joint Services
Electronics Program, Contracts Nos. N00014-67-A-0298-0006, 0005, and 0008. The
author is indebted to various discussions with D. Q. Mayne and J. M. C. Clark of
Imperial College which clarified several points.
² Professor, Division of Engineering and Applied Physics, Harvard University, Cambridge,
Massachusetts.
© 1972 Plenum Publishing Corporation, 227 West 17th Street, New York, N.Y. 10011.

The problem in question can be stated as follows. Consider the
linear constant-coefficient system

\dot{x} = Fx + Gu + w, \qquad (1)

where w(t) is a gaussian white-noise process³ with

E(w(t)) = 0, \quad E(w(t)\,w(\tau)^T) = Q\,\delta(t - \tau), \quad Q \ge 0, \qquad (2)

and x(0) is N(\bar{x}, P_0). Here, F and G are given n × n and n × r matrices,
respectively, and constitute a controllable pair. We are also given the
measurements

z(t) = Hx(t) + v(t), \qquad (3)

where H is p × n and (F, H) is observable, with v(t) another gaussian
white-noise process independent of w and

E(v(t)) = 0, \quad E(v(t)\,v(\tau)^T) = R\,\delta(t - \tau), \quad R > 0. \qquad (4)

For ease of discussion, we define x^T = [x_1^T, x_2^T], where x_1 is (n - r)-dimensional and x_2 is r-dimensional. Equation (1) then becomes

\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} G_1 \\ G_2 \end{bmatrix} u + \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}. \qquad (5)

There is no loss of generality in taking G_1 = 0 and G_2 = I, which
we shall assume henceforth. Equation (3) becomes

z(t) = H_1 x_1(t) + H_2 x_2(t) + v(t). \qquad (6)

The performance criterion is taken to be

J = \tfrac{1}{2}\, E \int_0^T \begin{bmatrix} x_1^T & x_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} dt, \qquad (7)

where the expectation is taken with respect to the basic random variables x(0),
w(t), and v(t), with the control law u taken to be a function of the past
measurement history z(\tau), 0 \le \tau \le t. It can be shown (Ref. 3) that
any linear terms in u in (7) can always be eliminated by appropriate
linear transformations and modifications of the problem. The objective
of the problem is to find the particular control law which minimizes (7).
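For orientation, the setup (1)-(4) can be exercised numerically on a discrete-equivalent version of the problem (the footnote below makes the same simplification). The following sketch simulates the plant and the measurement stream; every dimension and matrix value here is an illustrative assumption, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n = 2 states, r = 1 control, p = 1 measurement.
n, r, p = 2, 1, 1
dt, steps = 0.01, 100                    # horizon T = 1.0

F = np.array([[0.0, 1.0], [0.0, 0.0]])   # system matrix (assumed values)
G = np.array([[0.0], [1.0]])             # G1 = 0, G2 = I, as in (5)
H = np.array([[1.0, 0.0]])               # measurement matrix
Q = 0.1 * np.eye(n)                      # process-noise intensity, Q >= 0
R = np.array([[1.0]])                    # measurement-noise intensity, R > 0

x = rng.normal(size=n)                   # x(0) ~ N(xbar, P0), here with xbar = 0
zs = []
for _ in range(steps):
    u = np.zeros(r)                      # open loop, for illustration only
    w = rng.multivariate_normal(np.zeros(n), Q / dt)  # discrete equivalent of white noise
    x = x + dt * (F @ x + G @ u + w)                  # Euler step of (1)
    v = rng.multivariate_normal(np.zeros(p), R / dt)
    zs.append(H @ x + v)                              # measurement (3)

print(len(zs))
```

The 1/dt scaling of the noise covariances is the usual discrete equivalent of white-noise intensity under Euler sampling.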

³ For notational simplification, we use white noise formally here. However, the development
to follow can also be carried out by considering the discrete equivalent version of the
problem, where no intricacies of the stochastic process need arise. A rigorous treatment
of the continuous-time version of the problem to be stated below can also be carried
out essentially along the lines established by Wonham (Ref. 4, pp. 178-180).

2. Deterministic Solution

Since the solution to the stochastic problem is related to the deterministic
one, we shall give the solution to the latter first (i.e., with
v = 0, w = 0, and H = I). This solution utilizes the concept of
transformation first advanced by Kelley (Ref. 1), and its optimality was
established by Speyer and Jacobson (Ref. 3). We begin by considering
x_2 as the control variable. Then, the minimization problem faced by x_2 is

\dot{x}_1 = F_{11} x_1 + F_{12} x_2, \qquad (8)

J = \tfrac{1}{2} \int_0^T \begin{bmatrix} x_1^T & x_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} dt. \qquad (9)

This is a standard linear-quadratic optimal control problem if

A_{22} > 0. \qquad (10)

If not, then the problem is singular of higher order, and additional
decomposition and transformation of x is necessary.⁴ If one assumes (10),
then it is well known that the optimal control is

x_2(t) = -A_{22}^{-1} F_{12}^T S\, x_1(t) \triangleq C_{21}(t)\, x_1(t), \qquad (11)

where

\dot{S} = -(A_{11} - A_{12} A_{22}^{-1} A_{12}^T) + S F_{12} A_{22}^{-1} F_{12}^T S, \qquad S(T) = 0. \qquad (12)

Note that (11) defines the envelope of tangent hyperplanes to the
singular surface in the state space. Furthermore, existence of a solution
to (12), together with (10), is equivalent to the new necessary and
sufficient conditions for optimality developed by Jacobson and Speyer
(Ref. 3). Now, the only remaining problem is to implement u such
that (11) is satisfied for all t. Since there is no cost on u, this can be
easily accomplished in two steps: (i) at t = 0, use the impulsive control
u(0) = [C_{21}(0)\, x_1(0) - x_2(0)]\, \delta(t) to bring about x_2(0^+) = C_{21}(0)\, x_1(0^+);
and (ii) on the singular surface, let

⁴ Inequality (10) is in fact the generalized Legendre-Clebsch condition (\partial/\partial u)[d^2/dt^2\,(\partial H/\partial u)] \le 0
applied to (8)-(9).

\dot{x}_2(t) = \dot{C}_{21}(t)\, x_1(t) + C_{21}(t)\, \dot{x}_1(t);

that is,

u(t) = [\dot{C}_{21} - F_{21} - F_{22} C_{21} + C_{21}(F_{11} + F_{12} C_{21})]\, x_1 \triangleq B(t)\, x_1(t), \qquad (13)

which is a linear control law that insures the satisfaction of (11) for
all 0 \le t \le T.
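As a numerical sketch of this construction, one can integrate the Riccati equation (12) backward from S(T) = 0 and then form the gains C_{21} of (11) and B of (13). The scalar data below (n - r = 1, r = 1) are illustrative assumptions, not values from the paper.

```python
import math

# Illustrative scalar problem data (assumptions, not from the paper).
F11, F12, F21, F22 = 0.0, 1.0, 0.0, 0.0
A11, A12, A22 = 2.0, 0.0, 1.0            # A22 > 0, so condition (10) holds

T = 1.0
steps = 10_000
dt = T / steps

q = A11 - A12 * (1.0 / A22) * A12        # effective weight A11 - A12 A22^{-1} A12^T
c = F12 * (1.0 / A22) * F12              # F12 A22^{-1} F12^T

# Integrate (12), dS/dt = -q + c S^2, backward from S(T) = 0 by explicit Euler.
S = 0.0
for _ in range(steps):
    S += dt * (q - c * S * S)            # backward step: S(t - dt) = S(t) - dt*(dS/dt)

S0 = S                                   # S at t = 0
C21 = -(1.0 / A22) * F12 * S0            # gain C21 of (11) at t = 0
Sdot0 = -q + c * S0 * S0                 # dS/dt at t = 0, used to form Cdot21 in (13)
Cdot21 = -(1.0 / A22) * F12 * Sdot0
B = Cdot21 - F21 - F22 * C21 + C21 * (F11 + F12 * C21)   # gain B of (13) at t = 0

# Closed form for this scalar case: S(0) = sqrt(q/c) * tanh(sqrt(q*c) * T).
S_exact = math.sqrt(q / c) * math.tanh(math.sqrt(q * c) * T)
print(S0, S_exact, C21, B)
```

The closed-form check is specific to this scalar choice of data; in general (12) is integrated numerically.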

3. Stochastic Solution

The solution of the stochastic problem (5)-(7) hinges on the
observation that (7) can be rewritten as

J = \tfrac{1}{2}\, E \int_0^T \begin{bmatrix} \hat{x}_1^T & \hat{x}_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix} dt + \tfrac{1}{2}\, E \int_0^T \begin{bmatrix} e_1^T & e_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} dt, \qquad (14)

where \hat{x} \triangleq E[x(t) \mid z(\tau),\ 0 \le \tau \le t] and e(t) = x(t) - \hat{x}(t). Furthermore,

\dot{\hat{x}}_1 = F_{11}\hat{x}_1 + F_{12}\hat{x}_2 + K_1(z - H_1\hat{x}_1 - H_2\hat{x}_2), \quad \dot{\hat{x}}_2 = F_{21}\hat{x}_1 + F_{22}\hat{x}_2 + u + K_2(z - H_1\hat{x}_1 - H_2\hat{x}_2), \qquad (15)

and

\dot{e} = (F - KH)\, e + w - Kv, \qquad (16)

where K^T = [K_1^T, K_2^T] is the well-known Kalman-Bucy filter gain.
In particular, E(e(t)\, \hat{x}(t)^T) = 0 for all t, which enables the decomposition
of (14). Now, since the control u cannot affect e, the best that can
be done is to require that

\hat{x}_2(t) = C_{21}(t)\, \hat{x}_1(t) \qquad (17)

by virtue of the optimality of the deterministic trajectory and the fact
that the correction term K_1(z - H_1\hat{x}_1 - H_2\hat{x}_2) is another independent
white noise in the first of Eqs. (15). Note that this does not imply that
u(t) = B(t)\, \hat{x}_1(t), which is what would be expected from the certainty-equivalence
principle. In fact, on the stochastic singular surface we have

u(t) = \dot{\hat{x}}_2 - (F_{21} - K_2 H_1)\, \hat{x}_1 - (F_{22} - K_2 H_2)\, \hat{x}_2 - K_2 z,

which, by a development similar to (13), becomes

u(t) = \Gamma(t)\, \hat{x}_1(t) + \Theta(t)\, z(t), \qquad (18)



with

\Gamma(t) = \dot{C}_{21} + C_{21}[(F_{11} - K_1 H_1) + (F_{12} - K_1 H_2)\, C_{21}] - (F_{21} - K_2 H_1) - (F_{22} - K_2 H_2)\, C_{21} \qquad (19)

and

\Theta(t) = C_{21} K_1 - K_2.
Again, impulsive control is used at t = 0, that is,

u(0) = [C_{21}(0)\, \hat{x}_1(0) - \hat{x}_2(0)]\, \delta(t)

to bring about \hat{x}_2(0^+) = C_{21}(0)\, \hat{x}_1(0^+). The estimate \hat{x}_1(t) can now be
generated via an (n - r)-dimensional linear system

\dot{\hat{x}}_1 = [F_{11} - K_1 H_1 + (F_{12} - K_1 H_2)\, C_{21}]\, \hat{x}_1 + K_1 z, \qquad \hat{x}_1(0) = \bar{x}_1. \qquad (20)

The control law defined by (18) and (20) is realizable and minimizes (14)
by virtue of \hat{x}_2 = C_{21}\hat{x}_1, which is known to be optimal from Section 2.
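The resulting compensator is simply the (n - r)-dimensional system (20) driving the static law (18). A discrete-time sketch follows, with scalar data and placeholder gains; in a real implementation K_1, K_2 would come from the Kalman-Bucy filter and C_{21} from Section 2, and every numerical value here is an assumption for illustration.

```python
# Illustrative scalar data (n = 2, r = 1, p = 1); all values are assumptions.
F11, F12, F21, F22 = 0.0, 1.0, 0.0, 0.0
H1, H2 = 1.0, 0.0
K1, K2 = 0.8, 0.3          # placeholder Kalman-Bucy gains K = [K1, K2]^T
C21, Cdot21 = -1.2, 0.0    # placeholder singular-surface gain (Section 2)

# Gains of (18)-(19): Gamma, and Theta = C21 K1 - K2.
Gamma = (Cdot21
         + C21 * ((F11 - K1 * H1) + (F12 - K1 * H2) * C21)
         - (F21 - K2 * H1)
         - (F22 - K2 * H2) * C21)
Theta = C21 * K1 - K2

def controller_step(x1_hat, z, dt):
    """One Euler step of the (n - r)-dimensional compensator (20), plus law (18)."""
    u = Gamma * x1_hat + Theta * z                                  # control law (18)
    x1_hat_dot = (F11 - K1 * H1 + (F12 - K1 * H2) * C21) * x1_hat + K1 * z
    return x1_hat + dt * x1_hat_dot, u

x1_hat = 0.0                 # x1_hat(0) = x1_bar, taken as 0 here
for _ in range(100):
    z = 1.0                  # stand-in for the measurement stream z(t)
    x1_hat, u = controller_step(x1_hat, z, 0.01)
print(Gamma, Theta, x1_hat, u)
```

Note that the controller state has dimension n - r = 1, not n: this is the order reduction claimed in the Introduction.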

4. Extensions and Remarks

A few obvious extensions and remarks are in order at this point.

4.1. Addition of Terminal Payoff. If we consider

J = \tfrac{1}{2}\, E\Big[\, x(T)^T S_f\, x(T) + \int_0^T x^T A x\; dt \Big], \qquad (21)

then the above solution is still applicable. At t = T, however, a separate
minimization of \hat{x}(T)^T S_f\, \hat{x}(T), using \hat{x}_2(T) as control variable, is required.
This generally implies that

\hat{x}_2(T) = D_{21}\, \hat{x}_1(T), \qquad (22)

which means that \hat{x}_2(T) will be discontinuous at t = T. Thus, a terminal
impulsive control

u(T) = D_{21}\, \hat{x}_1(T^-)\, \delta(t - T) \qquad (23)

will be needed at t = T to bring about (22).

4.2. Terminal Constraints. Since \hat{x}_2(T) can be made arbitrary
via impulsive u(T), we only need to consider terminal constraints of
the type

E[\Psi\, x_1(T)] = 0.

Such constraints can be easily handled in the treatment of standard
linear-quadratic control problems with x_2 considered as a control
variable (see Section 2 and Ref. 3).

4.3. Higher-Order Singular Problems. If in (10) we have
A_{22} \ge 0, then the problem is singular from the viewpoint of x_2 considered
as a control variable. This reduced singular control problem of dimension
n - r can be transformed once again into the standard form (5), with x_1
further decomposed into x_1^T = [x_{11}^T, x_{12}^T]. We now consider x_{12} as a
control variable, and so on. Since the problem is singular in x_2, impulses
in x_2, in certain directions at least, will be allowed. This will imply the
need of doublets in u at the initial time, which will be permissible in
this case. This whole process can be carried out to as high an order of
singularity as necessary.

4.4. Case n = r. If n = r, then the control law (18) drastically
simplifies to

u(t) = -K z(t) \qquad (24)

and

\tfrac{1}{2}\, E \int_0^T \hat{x}^T A\, \hat{x}\; dt = 0, \qquad J = \tfrac{1}{2}\, E \int_0^T e^T A\, e\; dt,

the irreducible minimum in a linear-quadratic-gaussian control problem.
Furthermore, (24) implies that the control law is of zero memory. This
is extremely interesting from the viewpoint of information pattern and
decentralization. If the control actions are carried out by different agents
at different times, then no communication is necessary among the
agents. There is no need for perfect recall or the passing of a sufficient
statistic from one agent to another. Thus, in the special case F = 0,
G = I, P_0 = \sigma^2, R = 1, Q = 0, A = 1, the discrete equivalent of
the problem constitutes a dramatically different version of the counterexample
of Witsenhausen (Ref. 5), where the information pattern is
nonclassical.
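The collapse to (24) can be read off from (18)-(19): when n = r, the component x_1 is empty, so C_{21}, \Gamma, and \hat{x}_1 all disappear and only the feedthrough term of (18) survives. A bookkeeping sketch in the paper's notation (not an additional result):

```latex
% n = r: x_1 is vacuous, so C_{21}, \Gamma, \hat{x}_1 are empty and K = K_2.
u(t) = \Gamma(t)\,\hat{x}_1(t) + \Theta(t)\,z(t)
     \;\longrightarrow\; \Theta(t)\,z(t) = (C_{21}K_1 - K_2)\,z(t) = -K\,z(t).
```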

4.5. Time-Varying Systems. Conceptually, nothing new happens
when the problem is made time-varying. Various technical difficulties
will have to be resolved by making appropriate assumptions on
the nature of the various time-varying matrices, particularly G. For
example, it seems that a minimal requirement is that G(t) be continuously
differentiable, so that the linear transformation required in (5) to make
G_1 = 0 will be continuous and differentiable. We shall not pursue the
details here.

4.6. Case T → ∞. This solution also sheds additional light on the
classical minimum-error servo-regulator problem, where we let T → ∞.
In this case, provided the Riccati equations involved have steady-state
solutions, the linear controller will be constant and of lower
order than the full-scale filter-controller solution called for in conventional
stochastic optimal control theory. The result of this paper is
thus applicable to the regulation of nonlinear systems about an operating
point.

4.7. Certainty-Coincidence Principle. This term was coined
by Willman (Ref. 6) in discussing the solution property of linear-quadratic-gaussian
zero-sum differential games. A similar situation
occurs in (17), where it is the trajectory, rather than the control law,
that is of importance. This trick can obviously be extended; for example,
if we have, instead of (7),

J = E \int_0^T L(x)\; dt. \qquad (25)

For linear-gaussian systems, we can replace (25) by

J = E \int_0^T g(\hat{x})\; dt + \text{terms independent of } u,

where g(\cdot) is a function related to L(\cdot). In fact, if L(\cdot) is convex, then so
is g(\cdot) (Ref. 7). Now, if the deterministic version of \int_0^T g(x)\, dt can be
solved, then by a process similar to (17)-(18), we can obtain the control
law for the stochastic version.
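A one-line sketch of why g inherits convexity from L under the linear-gaussian assumption: writing x = \hat{x} + e, with e independent of \hat{x} and of the control,

```latex
E\int_0^T L(x)\,dt \;=\; E\int_0^T E_e\,L(\hat{x}+e)\,dt
                   \;\triangleq\; E\int_0^T g(\hat{x})\,dt ,
% and g, being an average over e of the convex maps \hat{x} \mapsto L(\hat{x}+e),
% is convex whenever L is.
```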

References

1. KELLEY, H. J., A Transformational Approach to Singular Subarcs in Optimal Trajectory and Control Problems, SIAM Journal on Control, Vol. 2, No. 3, 1964.
2. WONHAM, W. M., and JOHNSON, C. D., Optimal Bang-Bang Control with Quadratic Performance Index, ASME Journal of Basic Engineering, Vol. 86, No. 1, 1964.
3. SPEYER, J. L., and JACOBSON, D. H., Necessary and Sufficient Conditions for Optimality for Singular Control Problems, Journal of Mathematical Analysis and Applications, Vol. 33, No. 1, 1971.
4. WONHAM, W. M., Random Differential Equations in Control Theory, Probabilistic Methods in Applied Mathematics, Vol. 2, Edited by A. T. Bharucha-Reid, Academic Press, New York, 1970.

5. WITSENHAUSEN, H. S., A Counterexample in Stochastic Optimum Control, SIAM Journal on Control, Vol. 6, No. 1, 1968.
6. WILLMAN, W. W., Formal Solutions for a Class of Stochastic Pursuit-Evasion Games, IEEE Transactions on Automatic Control, Vol. AC-14, No. 5, 1969.
7. BRYANT, G. F., and MAYNE, D. Q., A Minimum Principle for a Class of Discrete-Time Stochastic Systems, IEEE Transactions on Automatic Control, Vol. AC-14, No. 4, 1969.
