
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 9, No. 1, 1972

Linear Stochastic Singular Control Problems¹


Yu-Chi Ho²

Abstract. The explicit feedback control law for the singular linear-quadratic-gaussian stochastic control problem is derived. The interesting implication of the control law in terms of information pattern is discussed.

1. Introduction

Optimal control problems in which the Hamiltonian does not have
a unique minimum are called singular problems. A fairly general class
of such problems, arising as the accessory minimization problem of the
second variation or as the classical minimum-error problem in servo theory,
consists of a performance criterion quadratic in the state variable x but
linear in the control variable u, together with a linear dynamic system in x and u.
The deterministic version of such problems has been studied by various
authors (Refs. 1-3). In this paper, we shall solve the stochastic version
of these problems and show that (i) the certainty-equivalence principle
does not hold in this case, but the optimal control law is still linear, and
(ii) the control law involves a dynamic system of order n - r only, where
n = dim(x) and r = dim(u). This latter point is particularly interesting
from the viewpoint of information pattern. For example, if n = r, then
a zero-memory control law is optimal, which implies the possibility
of decentralization.

¹ Paper received May 12, 1971. The research reported in this document was done while
the author was a Guggenheim Fellow at the Imperial College, London, England, and
was supported in part by the U.S. Army Research Office, the U.S. Air Force Office of
Scientific Research, and the U.S. Office of Naval Research under the Joint Services
Electronics Program, Contracts Nos. N00014-67-A-0298-0006, 0005, and 0008. The
author is indebted to various discussions with D. Q. Mayne and J. M. C. Clark of
Imperial College which clarified several points.
² Professor, Division of Engineering and Applied Physics, Harvard University, Cambridge,
Massachusetts.
© 1972 Plenum Publishing Corporation, 227 West 17th Street, New York, N.Y. 10011.

The problem in question can be stated as follows. Consider the
linear constant-coefficient system

\dot{x} = Fx + Gu + w, \qquad (1)

where w(t) is a gaussian white-noise process³ with

E(w(t)) = 0, \quad E(w(t)\,w(\tau)^T) = Q\,\delta(t - \tau), \quad Q \ge 0, \qquad (2)

and x(0) is N(\bar{x}, P_0). Here, F and G are given n × n and n × r matrices,
respectively, and constitute a controllable pair. We are also given the
measurements

z(t) = Hx(t) + v(t), \qquad (3)

where H is p × n and (F, H) is observable, with v(t) another gaussian
white-noise process independent of w and

E(v(t)) = 0, \quad E(v(t)\,v(\tau)^T) = R\,\delta(t - \tau), \quad R > 0. \qquad (4)

For ease of discussion, we define x^T = [x_1^T, x_2^T], where x_1 is (n - r)-dimensional and x_2 is r-dimensional. Equation (1) then becomes

\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} G_1 \\ G_2 \end{bmatrix} u + \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}. \qquad (5)

There is no loss of generality in taking G_1 = 0 and G_2 = I, which
we shall assume henceforth. Equation (3) becomes

z(t) = H_1 x_1(t) + H_2 x_2(t) + v(t). \qquad (6)

The performance criterion is taken to be

J = \tfrac{1}{2}\, E \int_0^T \begin{bmatrix} x_1^T & x_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} dt, \qquad (7)

where the expectation is taken with respect to the basic random variables x(0),
w(t), and v(t), with the control law u taken to be a function of the past
measurement history z(\tau), 0 \le \tau \le t. It can be shown (Ref. 3) that
any linear terms in u in (7) can always be eliminated by appropriate
linear transformations and modifications of the problem. The objective
of the problem is to find the particular control law which minimizes (7).
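For orientation, the setup (1)-(4) can be exercised numerically on a discrete-equivalent version of the problem (the footnote below makes the same simplification). The following sketch simulates the plant and the measurement stream; every dimension and matrix value here is an illustrative assumption, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: n = 2 states, r = 1 control, p = 1 measurement.
n, r, p = 2, 1, 1
dt, steps = 0.01, 100                    # horizon T = 1.0

F = np.array([[0.0, 1.0], [0.0, 0.0]])   # system matrix (assumed values)
G = np.array([[0.0], [1.0]])             # G1 = 0, G2 = I, as in (5)
H = np.array([[1.0, 0.0]])               # measurement matrix
Q = 0.1 * np.eye(n)                      # process-noise intensity, Q >= 0
R = np.array([[1.0]])                    # measurement-noise intensity, R > 0

x = rng.normal(size=n)                   # x(0) ~ N(xbar, P0), here with xbar = 0
zs = []
for _ in range(steps):
    u = np.zeros(r)                      # open loop, for illustration only
    w = rng.multivariate_normal(np.zeros(n), Q / dt)  # discrete equivalent of white noise
    x = x + dt * (F @ x + G @ u + w)                  # Euler step of (1)
    v = rng.multivariate_normal(np.zeros(p), R / dt)
    zs.append(H @ x + v)                              # measurement (3)

print(len(zs))
```

The 1/dt scaling of the noise covariances is the usual discrete equivalent of white-noise intensity under Euler sampling.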

³ For notational simplification, we use white noise formally here. However, the development
to follow can also be carried out by considering the discrete equivalent version of the
problem, where no intricacies of the stochastic process need arise. A rigorous treatment
of the continuous-time version of the problem to be stated below can also be carried
out essentially along the lines established by Wonham (Ref. 4, pp. 178-180).

2. Deterministic Solution

Since the solution to the stochastic problem is related to the deterministic
one, we shall give the solution to the latter first (i.e., with
v = 0, w = 0, and H = I). This solution utilizes the concept of
transformation first advanced by Kelley (Ref. 1), and its optimality was
established by Speyer and Jacobson (Ref. 3). We begin by considering
x_2 as the control variable. Then, the minimization problem faced by x_2 is

\dot{x}_1 = F_{11} x_1 + F_{12} x_2, \qquad (8)

J = \tfrac{1}{2} \int_0^T \begin{bmatrix} x_1^T & x_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} dt. \qquad (9)

This is a standard linear-quadratic optimal control problem if

A_{22} > 0. \qquad (10)

If not, then the problem is singular of higher order, and additional
decomposition and transformation of x is necessary.⁴ If one assumes (10),
then it is well known that the optimal control is

x_2(t) = -A_{22}^{-1} F_{12}^T S\, x_1(t) \triangleq C_{21}(t)\, x_1(t), \qquad (11)

where

\dot{S} = -(A_{11} - A_{12} A_{22}^{-1} A_{12}^T) + S F_{12} A_{22}^{-1} F_{12}^T S, \qquad S(T) = 0. \qquad (12)

Note that (11) defines the envelope of tangent hyperplanes to the
singular surface in the state space. Furthermore, existence of a solution
to (12), together with (10), is equivalent to the new necessary and
sufficient conditions for optimality developed by Jacobson and Speyer
(Ref. 3). Now, the only remaining problem is to implement u such
that (11) is satisfied for all t. Since there is no cost on u, this can be
easily accomplished in two steps: (i) at t = 0, use the impulsive control
u(0) = [C_{21}(0)\, x_1(0) - x_2(0)]\, \delta(t) to bring about x_2(0^+) = C_{21}(0)\, x_1(0^+);
and (ii) on the singular surface, let

⁴ Inequality (10) is in fact the generalized Legendre-Clebsch condition (\partial/\partial u)[d^2/dt^2\,(\partial H/\partial u)] \le 0
applied to (8)-(9).

\dot{x}_2(t) = \dot{C}_{21}(t)\, x_1(t) + C_{21}(t)\, \dot{x}_1(t);

that is,

u(t) = [\dot{C}_{21} - F_{21} - F_{22} C_{21} + C_{21}(F_{11} + F_{12} C_{21})]\, x_1 \triangleq B(t)\, x_1(t), \qquad (13)

which is a linear control law that insures the satisfaction of (11) for
all 0 \le t \le T.
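As a numerical sketch of this construction, one can integrate the Riccati equation (12) backward from S(T) = 0 and then form the gains C_{21} of (11) and B of (13). The scalar data below (n - r = 1, r = 1) are illustrative assumptions, not values from the paper.

```python
import math

# Illustrative scalar problem data (assumptions, not from the paper).
F11, F12, F21, F22 = 0.0, 1.0, 0.0, 0.0
A11, A12, A22 = 2.0, 0.0, 1.0            # A22 > 0, so condition (10) holds

T = 1.0
steps = 10_000
dt = T / steps

q = A11 - A12 * (1.0 / A22) * A12        # effective weight A11 - A12 A22^{-1} A12^T
c = F12 * (1.0 / A22) * F12              # F12 A22^{-1} F12^T

# Integrate (12), dS/dt = -q + c S^2, backward from S(T) = 0 by explicit Euler.
S = 0.0
for _ in range(steps):
    S += dt * (q - c * S * S)            # backward step: S(t - dt) = S(t) - dt*(dS/dt)

S0 = S                                   # S at t = 0
C21 = -(1.0 / A22) * F12 * S0            # gain C21 of (11) at t = 0
Sdot0 = -q + c * S0 * S0                 # dS/dt at t = 0, used to form Cdot21 in (13)
Cdot21 = -(1.0 / A22) * F12 * Sdot0
B = Cdot21 - F21 - F22 * C21 + C21 * (F11 + F12 * C21)   # gain B of (13) at t = 0

# Closed form for this scalar case: S(0) = sqrt(q/c) * tanh(sqrt(q*c) * T).
S_exact = math.sqrt(q / c) * math.tanh(math.sqrt(q * c) * T)
print(S0, S_exact, C21, B)
```

The closed-form check is specific to this scalar choice of data; in general (12) is integrated numerically.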

3. Stochastic Solution

The solution of the stochastic problem (5)-(7) hinges on the
observation that (7) can be rewritten as

J = \tfrac{1}{2}\, E \int_0^T \begin{bmatrix} \hat{x}_1^T & \hat{x}_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix} dt + \tfrac{1}{2}\, E \int_0^T \begin{bmatrix} e_1^T & e_2^T \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{12}^T & A_{22} \end{bmatrix} \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} dt, \qquad (14)

where \hat{x} \triangleq E[x(t) \mid z(\tau),\ 0 \le \tau \le t] and e(t) = x(t) - \hat{x}(t). Furthermore,

\dot{\hat{x}}_1 = F_{11}\hat{x}_1 + F_{12}\hat{x}_2 + K_1(z - H_1\hat{x}_1 - H_2\hat{x}_2), \quad \dot{\hat{x}}_2 = F_{21}\hat{x}_1 + F_{22}\hat{x}_2 + u + K_2(z - H_1\hat{x}_1 - H_2\hat{x}_2), \qquad (15)

and

\dot{e} = (F - KH)\, e + w - Kv, \qquad (16)

where K^T = [K_1^T, K_2^T] is the well-known Kalman-Bucy filter gain.
In particular, E(e(t)\, \hat{x}(t)^T) = 0 for all t, which enables the decomposition
of (14). Now, since the control u cannot affect e, the best that can
be done is to require that

\hat{x}_2(t) = C_{21}(t)\, \hat{x}_1(t) \qquad (17)

by virtue of the optimality of the deterministic trajectory and the fact
that the correction term K_1(z - H_1\hat{x}_1 - H_2\hat{x}_2) is another independent
white noise in the first of Eqs. (15). Note that this does not imply that
u(t) = B(t)\, \hat{x}_1(t), which is what would be expected from the certainty-equivalence
principle. In fact, on the stochastic singular surface we have

u(t) = \dot{\hat{x}}_2 - (F_{21} - K_2 H_1)\, \hat{x}_1 - (F_{22} - K_2 H_2)\, \hat{x}_2 - K_2 z,

which, by a development similar to (13), becomes

u(t) = \Gamma(t)\, \hat{x}_1(t) + \Theta(t)\, z(t), \qquad (18)



with

\Gamma(t) = \dot{C}_{21} + C_{21}[(F_{11} - K_1 H_1) + (F_{12} - K_1 H_2)\, C_{21}] - (F_{21} - K_2 H_1) - (F_{22} - K_2 H_2)\, C_{21} \qquad (19)

and

\Theta(t) = C_{21} K_1 - K_2.
Again, impulsive control is used at t = 0, that is,

u(0) = [C_{21}(0)\, \hat{x}_1(0) - \hat{x}_2(0)]\, \delta(t)

to bring about \hat{x}_2(0^+) = C_{21}(0)\, \hat{x}_1(0^+). The estimate \hat{x}_1(t) can now be
generated via an (n - r)-dimensional linear system

\dot{\hat{x}}_1 = [F_{11} - K_1 H_1 + (F_{12} - K_1 H_2)\, C_{21}]\, \hat{x}_1 + K_1 z, \qquad \hat{x}_1(0) = \bar{x}_1. \qquad (20)

The control law defined by (18) and (20) is realizable and minimizes (14)
by virtue of \hat{x}_2 = C_{21}\hat{x}_1, which is known to be optimal from Section 2.
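The resulting compensator is simply the (n - r)-dimensional system (20) driving the static law (18). A discrete-time sketch follows, with scalar data and placeholder gains; in a real implementation K_1, K_2 would come from the Kalman-Bucy filter and C_{21} from Section 2, and every numerical value here is an assumption for illustration.

```python
# Illustrative scalar data (n = 2, r = 1, p = 1); all values are assumptions.
F11, F12, F21, F22 = 0.0, 1.0, 0.0, 0.0
H1, H2 = 1.0, 0.0
K1, K2 = 0.8, 0.3          # placeholder Kalman-Bucy gains K = [K1, K2]^T
C21, Cdot21 = -1.2, 0.0    # placeholder singular-surface gain (Section 2)

# Gains of (18)-(19): Gamma, and Theta = C21 K1 - K2.
Gamma = (Cdot21
         + C21 * ((F11 - K1 * H1) + (F12 - K1 * H2) * C21)
         - (F21 - K2 * H1)
         - (F22 - K2 * H2) * C21)
Theta = C21 * K1 - K2

def controller_step(x1_hat, z, dt):
    """One Euler step of the (n - r)-dimensional compensator (20), plus law (18)."""
    u = Gamma * x1_hat + Theta * z                                  # control law (18)
    x1_hat_dot = (F11 - K1 * H1 + (F12 - K1 * H2) * C21) * x1_hat + K1 * z
    return x1_hat + dt * x1_hat_dot, u

x1_hat = 0.0                 # x1_hat(0) = x1_bar, taken as 0 here
for _ in range(100):
    z = 1.0                  # stand-in for the measurement stream z(t)
    x1_hat, u = controller_step(x1_hat, z, 0.01)
print(Gamma, Theta, x1_hat, u)
```

Note that the controller state has dimension n - r = 1, not n: this is the order reduction claimed in the Introduction.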

4. Extensions and Remarks

A few obvious extensions and remarks are in order at this point.

4.1. Addition of Terminal Payoff. If we consider

J = \tfrac{1}{2}\, E\Big[\, x(T)^T S_f\, x(T) + \int_0^T x^T A x\; dt \Big], \qquad (21)

then the above solution is still applicable. At t = T, however, a separate
minimization of \hat{x}(T)^T S_f\, \hat{x}(T), using \hat{x}_2(T) as control variable, is required.
This generally implies that

\hat{x}_2(T) = D_{21}\, \hat{x}_1(T), \qquad (22)

which means that \hat{x}_2(T) will be discontinuous at t = T. Thus, a terminal
impulsive control

u(T) = D_{21}\, \hat{x}_1(T^-)\, \delta(t - T) \qquad (23)

will be needed at t = T to bring about (22).

4.2. Terminal Constraints. Since \hat{x}_2(T) can be made arbitrary
via impulsive u(T), we only need to consider terminal constraints of
the type

E[\Psi\, x_1(T)] = 0.

Such constraints can be easily handled in the treatment of standard
linear-quadratic control problems with x_2 considered as a control
variable (see Section 2 and Ref. 3).

4.3. Higher-Order Singular Problems. If in (10) we have
A_{22} \ge 0, then the problem is singular from the viewpoint of x_2 considered
as a control variable. This reduced singular control problem of dimension
n - r can be transformed once again into the standard form (5), with x_1
further decomposed into x_1^T = [x_{11}^T, x_{12}^T]. We now consider x_{12} as a
control variable, and so on. Since the problem is singular in x_2, impulses
in x_2, in certain directions at least, will be allowed. This will imply the
need of doublets in u at the initial time, which will be permissible in
this case. This whole process can be carried out to as high an order of
singularity as necessary.

4.4. Case n = r. If n = r, then the control law (18) drastically
simplifies to

u(t) = -K z(t) \qquad (24)

and

\tfrac{1}{2}\, E \int_0^T \hat{x}^T A\, \hat{x}\; dt = 0, \qquad J = \tfrac{1}{2}\, E \int_0^T e^T A\, e\; dt,

the irreducible minimum in a linear-quadratic-gaussian control problem.
Furthermore, (24) implies that the control law is of zero memory. This
is extremely interesting from the viewpoint of information pattern and
decentralization. If the control actions are carried out by different agents
at different times, then no communication is necessary among the
agents. There is no need for perfect recall or the passing of a sufficient
statistic from one agent to another. Thus, in the special case F = 0,
G = I, P_0 = \sigma^2, R = 1, Q = 0, A = 1, the discrete equivalent of
the problem constitutes a dramatically different version of the counterexample
of Witsenhausen (Ref. 5), where the information pattern is
nonclassical.
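The collapse to (24) can be read off from (18)-(19): when n = r, the component x_1 is empty, so C_{21}, \Gamma, and \hat{x}_1 all disappear and only the feedthrough term of (18) survives. A bookkeeping sketch in the paper's notation (not an additional result):

```latex
% n = r: x_1 is vacuous, so C_{21}, \Gamma, \hat{x}_1 are empty and K = K_2.
u(t) = \Gamma(t)\,\hat{x}_1(t) + \Theta(t)\,z(t)
     \;\longrightarrow\; \Theta(t)\,z(t) = (C_{21}K_1 - K_2)\,z(t) = -K\,z(t).
```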

4.5. Time-Varying Systems. Conceptually, nothing new happens
when the problem is made time-varying. Various technical difficulties
will have to be resolved by making appropriate assumptions on
the nature of the various time-varying matrices, particularly G. For
example, it seems that a minimal requirement is that G(t) be continuously
differentiable, so that the linear transformation required in (5) to make
G_1 = 0 will be continuous and differentiable. We shall not pursue the
details here.

4.6. Case T → ∞. This solution also sheds additional light on the
classical minimum-error servo-regulator problem, where we let T → ∞.
In this case, provided the Riccati equations involved have steady-state
solutions, the linear controller will be constant and of lower
order than the full-scale filter-controller solution called for in conventional
stochastic optimal control theory. The result of this paper is
thus applicable to the regulation of nonlinear systems about an operating
point.

4.7. Certainty-Coincidence Principle. This term was coined
by Willman (Ref. 6) in discussing the solution property of linear-quadratic-gaussian
zero-sum differential games. A similar situation
occurs in (17), where it is the trajectory, rather than the control law,
that is of importance. This trick can obviously be extended; for example,
if we have, instead of (7),

J = E \int_0^T L(x)\; dt. \qquad (25)

For linear-gaussian systems, we can replace (25) by

J = E \int_0^T g(\hat{x})\; dt + \text{terms independent of } u,

where g(\cdot) is a function related to L(\cdot). In fact, if L(\cdot) is convex, then so
is g(\cdot) (Ref. 7). Now, if the deterministic version of \int_0^T g(x)\, dt can be
solved, then by a process similar to (17)-(18), we can obtain the control
law for the stochastic version.
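A one-line sketch of why g inherits convexity from L under the linear-gaussian assumption: writing x = \hat{x} + e, with e independent of \hat{x} and of the control,

```latex
E\int_0^T L(x)\,dt \;=\; E\int_0^T E_e\,L(\hat{x}+e)\,dt
                   \;\triangleq\; E\int_0^T g(\hat{x})\,dt ,
% and g, being an average over e of the convex maps \hat{x} \mapsto L(\hat{x}+e),
% is convex whenever L is.
```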

References

1. KELLEY, H. J., A Transformational Approach to Singular Subarcs in Optimal Trajectory and Control Problems, SIAM Journal on Control, Vol. 2, No. 3, 1964.
2. WONHAM, W. M., and JOHNSON, C. D., Optimal Bang-Bang Control with Quadratic Performance Index, ASME Journal of Basic Engineering, Vol. 86, No. 1, 1964.
3. SPEYER, J. L., and JACOBSON, D. H., Necessary and Sufficient Conditions for Optimality for Singular Control Problems, Journal of Mathematical Analysis and Applications, Vol. 33, No. 1, 1971.
4. WONHAM, W. M., Random Differential Equations in Control Theory, Probabilistic Methods in Applied Mathematics, Vol. 2, Edited by A. T. Bharucha-Reid, Academic Press, New York, 1970.

5. WITSENHAUSEN, H. S., A Counterexample in Stochastic Optimum Control, SIAM Journal on Control, Vol. 6, No. 1, 1968.
6. WILLMAN, W. W., Formal Solutions for a Class of Stochastic Pursuit-Evasion Games, IEEE Transactions on Automatic Control, Vol. AC-14, No. 5, 1969.
7. BRYANT, G. F., and MAYNE, D. Q., A Minimum Principle for a Class of Discrete-Time Stochastic Systems, IEEE Transactions on Automatic Control, Vol. AC-14, No. 4, 1969.
