
CONTRIBUTORS TO THIS VOLUME

MASANAO AOKI
HUBERT HALKIN
FRANCIS H. KISHI
JAMES S. MEDITCH
PETER R. SCHULTZ
P. K. C. WANG
ADVANCES IN

CONTROL SYSTEMS
THEORY AND APPLICATIONS

Edited by
C. T. LEONDES
DEPARTMENT OF ENGINEERING
UNIVERSITY OF CALIFORNIA
LOS ANGELES, CALIFORNIA

VOLUME 1 1964

ACADEMIC PRESS New York and London


COPYRIGHT © 1964, BY ACADEMIC PRESS INC.
ALL RIGHTS RESERVED.
NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM,
BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT
WRITTEN PERMISSION FROM THE PUBLISHERS.

ACADEMIC PRESS INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by
ACADEMIC PRESS INC. (LONDON) LTD.
Berkeley Square House, London W.1

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 64-8027

PRINTED IN THE UNITED STATES OF AMERICA


Contributors

MASANAO AOKI, Department of Engineering, University of California, Los Angeles, California

HUBERT HALKIN, Bell Telephone Laboratories, Whippany, New Jersey

FRANCIS H. KISHI, Electronics Division, TRW Space Technology Laboratories, Redondo Beach, California

JAMES S. MEDITCH, Aerospace Corporation, Los Angeles, California

PETER R. SCHULTZ, Department of Engineering, University of California, Los Angeles, and Guidance Systems Department, Aerospace Corporation, El Segundo, California

P. K. C. WANG, International Business Machines Corporation, San Jose Research Laboratory, San Jose, California

Preface

The first volume of Advances in Control Systems initiates a series which has been developed to disseminate current information from leading researchers in the ever broadening field of automatic control. This material will appear in the form of critical and definitive reviews written at a level between that of the technical journal and the research monograph. The need for such a series is apparent when one considers the overwhelming volume of widely dispersed original literature that is currently appearing. Automatic control is itself becoming so increasingly active that control systems and techniques are being applied not only in engineering, but throughout a variety of other scientific disciplines as well. The large number of practicing engineers, applied mathematicians, and other scientists being drawn into this field from allied areas, together with the increase in student enrollment in systems engineering, ensures a steady flow of new results in the future.
The primary purpose of this series is to bring together this diverse information in a single publication. Persons directly active in developing control theory, as well as those persons for whom the techniques of automatic control are an effective tool, will find it invaluable as a comprehensive and readily accessible compilation of information.
Some of the contributions, as the subtitle indicates, will be of an applied nature, whereas others will be theoretical. In either case the level of mathematical sophistication will usually be well within the grasp of the trained engineer. When more advanced mathematical terminology is introduced, every effort will be made to include a self-contained presentation. With the wealth of concise and readable mathematics textbooks on advanced topics available today, even this should often not be necessary. Scientists who will be applying these newer results directly to their practical problems will naturally seek out these references anyway.
This series will provide an added service for the classroom instructor who is frequently confronted with obsolete material, particularly at the higher level, where advances occur so rapidly. The teacher will find here, as will his research-oriented colleague, a timely and convenient source to which to refer his students.

C. T. LEONDES

September, 1964
On Optimal and Suboptimal
Policies in Control Systems
MASANAO AOKI
Department of Engineering,
University of California,
Los Angeles, California

I. Introduction
II. Stochastic and Adaptive Control Systems
A. Introduction
B. Mathematical Description of Control Systems
C. Stochastic Final Value Control Systems
D. Adaptive Final Value Control Systems
III. Approximate Realization of Desired Trajectories
A. Introduction
B. Upper Bounds
C. Independent Controls
D. One-Dimensional Problem
E. Approximations by Integrator Outputs
IV. Existence of Optimal Controls
A. Introduction
B. Mathematical Formulation of Optimal Controls
C. Sliding Regime

References

I. Introduction

During the last decade, control theory has advanced quite rapidly. This is largely due to the fact that control engineers have been called upon to deal with increasingly complex systems.
As the problems become more complex and demands on system performance become more stringent, the theory of optimal control systems has received increasing attention from both engineers and mathematicians.
Historically, the optimal control problem arose first as the "time optimal" control problem, the problem of bringing some components of the system state vectors to desired states from a given set of initial states as quickly as possible, while satisfying certain constraints on the means of controlling the system (1, 2). Since then, a large number of papers have appeared on various other types of optimal control problems as well as on time optimal control problems, and by now the theory of optimal control has reached a certain level of development (3).
In this chapter the optimal way of controlling a given system with respect to the given criterion of performance will be called the optimal control policy. Optimal control policies may be given as a function of the "state" of the control system or as a function of time. In the former case they represent closed-loop control of the system; in the latter, open-loop control.
A way of controlling the system in some nonoptimal way for the given criterion of performance will be called a suboptimal control policy. There are several reasons why one should discuss suboptimal policies. First of all, because of the scale and complexity of systems, it may not be possible to solve optimal control problems exactly even if optimal policies are assumed to exist, or, for that matter, optimal policies may not exist.
Furthermore, if optimal policies whose existence is assumed are too complex either from the viewpoint of analysis or of engineering implementation, then various approximate solutions of the optimal control problems must be considered.
Original complex problems may, on the other hand, be sufficiently simplified to allow exact solutions. In either case, one wants "good" suboptimal policies to approximate optimal policies.
Another related point to consider is the fact that rarely are criteria of performance designed to include all pertinent factors in optimal system designs. Thus, it pays to consider not only optimal policies but also suboptimal policies to allow for engineering and/or economic considerations in building systems.
In all sections except the last, the discussions assume the existence of optimal controls. In the last section, sufficient conditions for the existence of optimal control and the problem of controlling systems where optimal controls do not exist will be discussed.

II. Stochastic and Adaptive Control Systems (4)

A. Introduction

A problem of optimally controlling a class of linear control systems which are disturbed by random noise is discussed in this section. We will first assume that the probability distribution function of the noise is known, and then we will treat the case where the distribution function is known only as a member of a given class of distribution functions. The first case is referred to as the stochastic control problem and the second as the adaptive control problem. The reason for these names will become clear as the discussion proceeds.

B. Mathematical Description of Control Systems (5, 6)

Let us denote by $S$ a dynamic system of $n$ degrees of freedom. Then a set of $n$ scalar functions of time $x^i(t)$, $i = 1, 2, ..., n$, can describe completely the state of $S$ at each time instant, i.e., given $x^i(t)$, $i = 1, 2, ..., n$, for $t \le t_0$, the behavior of $S$ with time for $t > t_0$ is completely described. These $n$ functions of time can be regarded as $n$ components of a vector function of time $x(t)$, called the state vector of the system $S$ (7). Although in the most general case the value of $x(t)$ for $t > t_0$ depends not only on the current state vector $x(t_0)$ but also on all the past values of the state vector $x(t)$, $t \le t_0$ (8), only systems whose future state vectors $x(t)$ are completely and uniquely determined by their current state vector $x(t_0) = c$ are considered. That is to say, under suitable existence and uniqueness conditions, $x(t)$ becomes a function of time $t$ and of the initial condition vector $c$, $x(t) = x(c, t)$, $x(0) = c$, taking $t_0 = 0$ without loss of generality. In other words, $S$ is taken to be a differential system.
From the uniqueness property which $x(t)$ satisfies, there follows

$$x(c, t_1 + t_2) = x(x(c, t_1), t_2) \tag{1}$$

which is nothing more than a mathematical representation of the principle of causality (5).
Let us take a unit time interval to be $\Delta t$ and set

$$x(c, \Delta t) = T(c) \tag{2}$$

i.e., the initial state vector $c$ is transformed into a new state vector $T(c)$ after the lapse of a unit time interval. Then from Eq. (1),

$$x(c, n\Delta t) = T(T(\cdots T(c) \cdots)) = T^n(c)$$

which is to say, the state vector at $t = n\Delta t$ is given by $T^n(c)$, the $n$th iterate of the function $T(c)$. This means that the time behavior of the deterministic dynamic system $S$ at time $t = n\Delta t$ is determined by the successive iterates of a specific function of $x$, or successive point transformations on $c$. Here, the kind of transformation that is applied to the state vector at $t = n\Delta t$ to get a new state vector at the next time instant $t = (n + 1)\Delta t$ is independent of $x(n\Delta t)$. On the other hand, feedback control systems are designed in such a way as to utilize more than one type of transformation as a function of the state vectors (9, 10).
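As a small illustration of Eqs. (1) and (2), the following sketch takes the scalar system dx/dt = -Kx, chosen here only because its solution x(c, t) = c exp(-Kt) is known in closed form, and compares the state at t = n Δt with the nth iterate of the unit-step map T. The names `flow`, `T`, and `iterate`, and all numerical values, are assumptions made for the illustration and do not come from the text.

```python
import math

K = 0.5    # decay rate of the assumed example system dx/dt = -K x
DT = 0.1   # unit time interval (Delta t)


def flow(c, t):
    """Closed-form solution x(c, t) of dx/dt = -K x with x(0) = c."""
    return c * math.exp(-K * t)


def T(c):
    """Unit-step transformation of Eq. (2): T(c) = x(c, Delta t)."""
    return flow(c, DT)


def iterate(c, n):
    """Apply T n times, giving the n-th iterate T^n(c)."""
    for _ in range(n):
        c = T(c)
    return c


c0 = 2.0
# Both sides of Eq. (1): x(c, t1 + t2) and x(x(c, t1), t2)
lhs = flow(c0, 0.3 + 0.7)
rhs = flow(flow(c0, 0.3), 0.7)
# State at t = 5 * DT computed directly and as the 5th iterate of T
direct = flow(c0, 5 * DT)
iterated = iterate(c0, 5)
print(lhs, rhs, direct, iterated)
```

The same comparison applies to any system with a unique solution through each initial state; only `flow` would change.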
We will consider time-discrete control systems. Let $t_1, t_2, ...$ be a sequence of times where $t_k = k\Delta t$. Let $x(t)$ be the state vector of the control system at time $t$; then the fact that $x(t_{k+1})$ is determined by $x(t_k)$ and the control applied over $[t_k, t_{k+1}]$ is expressed symbolically as

$$x(t_{k+1}) = T(x(t_k), u_k) \tag{3}$$

where $u_k$ stands for the control variable over $[t_k, t_{k+1}]$ and is included in Eq. (3) to indicate that the transformation from $x(t_k)$ to $x(t_{k+1})$ is now dependent on the choice of control variable. The choice is made in such a way as to optimize some performance index assigned to the control system (11).
More generally, the transformation from $x(t_k)$ to $x(t_{k+1})$ depends also on the external and/or internal disturbances that exist and is given by

$$x(t_{k+1}) = T(x(t_k), u_k, r_k) \tag{4}$$

where the random disturbances are expressed symbolically by $r_k$.


For example, consider a discrete-time linear control system given by the difference equation

$$x_{k+1} = A x_k + B u_k + r_k \tag{5}$$

where
$x_k$ is the $n$-dimensional state vector,
$A$ is the $n \times n$ matrix,
$B$ is the $n \times m$ matrix,
$u_k$ is the $m$-dimensional control vector,
$r_k$ is the $n$-dimensional disturbance vector.
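The difference equation (5) can be stepped forward directly. The sketch below is a minimal illustration using plain Python lists; the matrices, the constant control sequence, the Gaussian disturbance, and the function names `step` and `simulate` are all assumptions chosen for the example, not values from the text.

```python
import random


def step(A, B, x, u, r):
    """One step of Eq. (5): x_{k+1} = A x_k + B u_k + r_k."""
    n, m = len(A), len(B[0])
    return [
        sum(A[i][j] * x[j] for j in range(n))
        + sum(B[i][j] * u[j] for j in range(m))
        + r[i]
        for i in range(n)
    ]


def simulate(A, B, x0, controls, noise):
    """Return the state trajectory [x_0, x_1, ..., x_N]."""
    trajectory = [list(x0)]
    for u, r in zip(controls, noise):
        trajectory.append(step(A, B, trajectory[-1], u, r))
    return trajectory


# Hypothetical example with n = 2 states and m = 1 control
A = [[1.0, 0.1], [0.0, 0.9]]
B = [[0.0], [0.1]]
rng = random.Random(0)   # seeded so that the run is repeatable
N = 5
controls = [[1.0] for _ in range(N)]
noise = [[rng.gauss(0.0, 0.01), rng.gauss(0.0, 0.01)] for _ in range(N)]
trajectory = simulate(A, B, [0.0, 0.0], controls, noise)
for k, x in enumerate(trajectory):
    print(k, x)
```

Choosing each control as a function of the current state inside the loop, instead of fixing the sequence in advance, would turn this open-loop run into a closed-loop one.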

Types of transformations to be applied are determined by specifying the control vectors as functions of the present state vector and time. Thus, if the time is assumed to advance in discrete steps, a problem of determining proper sequences of control vectors, given the criterion of performance, can be formulated as a problem of multistage decision processes. The decision at each stage is a proper choice of the control vector from the domain of control vectors.
Functional equation techniques of dynamic programming are well suited to the treatment of control processes as multistage decision processes.

C. Stochastic Final-Value Control Systems

1. FUNCTIONAL EQUATION

Consider an $N$-stage control problem with

$$x_{k+1} = a x_k + u_k + r_k, \qquad k = 0, 1, ..., N - 1 \tag{7}$$

where $0 < a < 1$ and $x_k$, $u_k$, and $r_k$ are scalar. Let us further assume that the disturbances $r_k$ are independently and identically distributed Bernoulli random variables given by

$$r_k = \begin{cases} +c & \text{with probability } p \\ -c & \text{with probability } 1 - p \end{cases} \tag{8}$$

Let us take as the performance index of a control system some function $\phi$ of the state vector $x_N$ at the time instant $t_N$. The time $t_N$ is called the final time. Control systems with such a performance index are called final-value systems (12, 13, 14). The control system $S$ must decide to apply the $u$ which minimizes the expected value of $\phi(x_N)$, given the present state variable $x$, the number of remaining decision stages $n$, and the value $p$.
In order to treat the situation where there is a constraint on the total amount of available control in the form

$$\sum_{i=0}^{N-1} u_i^2 \le \nu$$

the performance index is modified to

$$E(J_N) = E\Big(x_N^2 + \lambda \sum_{i=0}^{N-1} u_i^2\Big) \tag{9}$$

Define $h_n(x; p)$ to be the minimum of the expected value of $J_N$ obtained by employing an optimal policy when $n$ more control stages remain with the state vector $x$. The parameter value $p$ which characterizes the random variable $r$ is included since $p$ is considered to be unknown in the next section, where adaptive final-value systems are discussed.
According to the definition,

$$h_N(x_0; p) = \min_{u_0} \min_{u_1} \cdots \min_{u_{N-1}} E(J_N) \tag{10}$$

where $x_0$ is the initial state of the system of Eq. (5).


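The recurrence relation for $h_n(x; p)$ is given by

$$h_1(x; p) = \min_{u_{N-1}} \big[\lambda u_{N-1}^2 + p(x^+)^2 + (1 - p)(x^-)^2\big]$$

$$h_{n+1}(x; p) = \min_{u_{N-1-n}} \big[\lambda u_{N-1-n}^2 + p\,h_n(x^+; p) + (1 - p)\,h_n(x^-; p)\big], \qquad n = 1, 2, ..., N - 1 \tag{11}$$

where $x^+ = ax + c + u$ and $x^- = ax - c + u$, with $u$ the control over which the minimum is taken at that stage.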
If no other constraint is imposed on $u$, then, as expected, $h_n(x; p)$ is quadratic in $x$ and $u_n(x; p)$ is linear in $x$; they are given by

$$h_n(x; p) = (c^2 - \bar{c}^2)\Big[1 + \frac{\lambda a^2}{\lambda + 1} + \cdots + \frac{\lambda a^{2(n-1)}}{\lambda + 1 + a^2 + \cdots + a^{2(n-2)}}\Big] + \frac{\lambda\big[a^n x + \bar{c}(1 + a + \cdots + a^{n-1})\big]^2}{\lambda + 1 + a^2 + \cdots + a^{2(n-1)}} \tag{12}$$

and

$$u_n(x; p) = -\frac{a^{n-1}\big[a^n x + \bar{c}(1 + a + \cdots + a^{n-1})\big]}{\lambda + 1 + a^2 + \cdots + a^{2(n-1)}}$$

where $\bar{c} = E(r) = (2p - 1)c$. However, if $u_n$ is constrained to be

$$u_n = m \ \text{or} \ -m, \qquad m > c \tag{13}$$

as in a contactor servo system, then explicit expressions for $h_n(x; p)$ and $u_n(x; p)$ are no longer available. Since $\sum_{i=0}^{N-1} u_i^2 = N m^2$, the criterion of performance can be taken simply to be $E(x_N^2)$.
The recurrence relation is now given, using $h$ also for this case, by

$$h_1(x; p) = \min_{u_{N-1}} \big[p(x^+)^2 + (1 - p)(x^-)^2\big]$$

$$h_{n+1}(x; p) = \min_{u_{N-1-n}} \big[p\,h_n(x^+; p) + (1 - p)\,h_n(x^-; p)\big], \qquad n = 1, 2, ..., N - 1 \tag{14}$$

Although explicit expressions for $h_n$ and $u_n$ are not available, it can be shown by inductive arguments that $h_n(x; p) = h_n(-x; 1 - p)$ holds (4). Due to this symmetry, the amount of computation necessary to solve for $h_n(x; p)$ over a given range of $x$ and $0 < p < 1$ is reduced by half.
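The recurrence of Eq. (14) can be evaluated by direct recursion when only a few stages remain. The sketch below is one possible way to do so, not the computational procedure used for the chapter's results: it uses exact rational arithmetic so that the symmetry h_n(x; p) = h_n(-x; 1 - p) can be checked without rounding error. The parameter values a = 7/8, c = 1/16, m = 9/128 are those quoted for Fig. 1; the function names and the choice of five stages are assumptions.

```python
from fractions import Fraction as F
from functools import lru_cache

A = F(7, 8)     # a in Eq. (7)
C = F(1, 16)    # noise amplitude c in Eq. (8)
M = F(9, 128)   # control magnitude m in Eq. (13)


@lru_cache(maxsize=None)
def h(n, x, p):
    """Minimum expected x_N^2 with n stages remaining, Eq. (14)."""
    return min(stage_cost(n, x, p, u) for u in (M, -M))


def stage_cost(n, x, p, u):
    """Expected cost of applying u now and acting optimally afterwards."""
    x_plus, x_minus = A * x + C + u, A * x - C + u
    if n == 1:
        return p * x_plus**2 + (1 - p) * x_minus**2
    return p * h(n - 1, x_plus, p) + (1 - p) * h(n - 1, x_minus, p)


def optimal_u(n, x, p):
    """Control in {+m, -m} attaining the minimum (ties go to +m)."""
    return min((M, -M), key=lambda u: stage_cost(n, x, p, u))


p = F(5, 8)
for x in (F(1, 4), F(1, 8), F(0), F(-1, 8)):
    # The two printed costs agree, illustrating the symmetry relation.
    print(float(x), float(h(5, x, p)), float(h(5, -x, 1 - p)))
```

The number of recursive calls grows roughly as 4^n, so for a horizon such as N = 12 a tabulation of h_n over a grid of x would be the more practical choice.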

2. ONE-STAGE SUBOPTIMAL POLICY

Figure 1 shows an optimal control variable as a function of $x$ and $n$ for $p = 0.625$, $D = 1/4$, $a = 7/8$, $c = 1/16$, $m = 9/128$, $n \le N = 12$.

[Figure 1: the state variable $x_n$ of the system, $|x_n| \le D$, $D = 1/4$, plotted against $n$, marking the points $(x, n)$ where $+m$ is the optimal control force and those where $-m$ is the optimal control force, with $d = -(2p - 1)c/a = -1/56$ for $a = 7/8$, $c = 1/16$, $p = 5/8$, and $m = 9/128$.]

FIG. 1. Optimal control policy as functions of the state vector $x$ and the number of remaining control stages $n$.

Notice that the boundary between $u_n(x; p) = +m$ and $u_n(x; p) = -m$ is not a simple straight line.
Given the total duration $N$ of the control process, the optimal sequence of control vectors $\{u_n(x; p)\}$ is given by solving Eq. (14) sequentially for $n = 1, 2, ..., N$. When $N = 1$, the optimal control vector is given by

$$u_0(x; p) = -m\,\operatorname{sgn}\big(ax + (2p - 1)c\big) \tag{15}$$

If Eq. (15) is used as $u_n(x; p)$ for $n \ge 2$, then it constitutes a suboptimal policy. The switching boundary, then, is independent of $n$ and is represented by a straight line given by

$$x = -(2p - 1)c/a \tag{16}$$

and shown in Fig. 1 as the line $x_n = d$. That is, according to this one-stage suboptimal policy, $u_n = -m$ if $x_n > d$ and $u_n = +m$ if $x_n < d$.
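The one-stage rule of Eqs. (15) and (16) is simple enough to state in a few lines. The sketch below evaluates the switching point d for the parameter values quoted with Fig. 1 and applies the rule on either side of it. The function names are assumptions, and the tie at the boundary itself, which the text does not specify, is broken arbitrarily toward +m.

```python
from fractions import Fraction as F


def one_stage_u(x, a, c, m, p):
    """Eq. (15): u = -m sgn(a x + (2p - 1) c).

    On the switching boundary the argument is zero and both controls give
    the same one-stage cost; +m is returned there by convention.
    """
    return -m if a * x + (2 * p - 1) * c > 0 else m


def switching_point(a, c, p):
    """Eq. (16): the state x = d at which the one-stage rule switches."""
    return -(2 * p - 1) * c / a


a, c, m, p = F(7, 8), F(1, 16), F(9, 128), F(5, 8)
d = switching_point(a, c, p)
print("d =", d)
print("u at x = +1/4:", one_stage_u(F(1, 4), a, c, m, p))
print("u at x = -1/4:", one_stage_u(F(-1, 4), a, c, m, p))
```

Because the rule depends only on the sign of a single linear expression in x, it needs no stored table of switching points, which is the implementation advantage the text refers to.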
From Fig. 1 it is noted that for $x > 0$ the optimal and suboptimal policies agree fairly well, although the agreement is not particularly good for $x < 0$.
It might be expected, therefore, that if the system starts from the initial position $x = D$, then the suboptimal policy is a fairly good approximation to the optimal policy.
To see to what extent the conjecture is correct, the Monte Carlo method (15) was used to simulate the system behavior from the initial positions $x_0 = \pm D$. Forty simulated runs were made using random numbers to generate $r_n = +c$ and $r_n = -c$ with appropriate $p$. A part of the simulation result is listed here.

With $x_0 = D$, $p = 0.625$:
$E(x_N^2)_{\text{optimal}} = 0.00367$
$E(x_N^2)_{\text{suboptimal}} = 0.00374$

With $x_0 = -D$, $p = 0.625$:
$E(x_N^2)_{\text{optimal}}$ = [value illegible in the source]
$E(x_N^2)_{\text{suboptimal}} = 0.00441$

In spite of the relatively similar values of $E(x_N^2)$, the suboptimal policy for $x_0 = D$ appears to be better than that for $x_0 = -D$, as conjectured: out of 40 trials with $x_0 = D$, the optimal and suboptimal policies gave the same $x_N^2$ in 21 trials, but in 40 similar trials with $x_0 = -D$, the optimal and suboptimal policies did not give the same $x_N^2$ in any case.
Although these results are by no means conclusive, they tend to support the conjecture that the one-stage suboptimal policy is a fairly good approximation to the optimal policy for the control system under consideration.
It is to be noted that the adoption of this suboptimal policy simplifies the control policy implementation considerably.
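The comparison above can also be made without sampling by computing the expected final cost of each policy exactly. The sketch below does this for a shorter horizon than the one used in the text (N = 6 is an assumption that keeps the exhaustive recursion cheap), so its numbers are not expected to reproduce the Monte Carlo figures quoted above. It replaces the Monte Carlo method with exact enumeration of the noise outcomes; the function names are hypothetical.

```python
from fractions import Fraction as F
from functools import lru_cache

A, C, M, P = F(7, 8), F(1, 16), F(9, 128), F(5, 8)


def successors(x, u):
    """States reached from x under control u for r = +c and r = -c."""
    return A * x + C + u, A * x - C + u


@lru_cache(maxsize=None)
def cost_optimal(n, x):
    """Expected x_N^2 under the optimal policy, Eq. (14)."""
    costs = []
    for u in (M, -M):
        xp, xm = successors(x, u)
        if n == 1:
            costs.append(P * xp**2 + (1 - P) * xm**2)
        else:
            costs.append(P * cost_optimal(n - 1, xp)
                         + (1 - P) * cost_optimal(n - 1, xm))
    return min(costs)


def one_stage_u(x):
    """One-stage rule of Eq. (15), ties broken toward +m."""
    return -M if A * x + (2 * P - 1) * C > 0 else M


@lru_cache(maxsize=None)
def cost_one_stage(n, x):
    """Expected x_N^2 when Eq. (15) is applied at every stage."""
    xp, xm = successors(x, one_stage_u(x))
    if n == 1:
        return P * xp**2 + (1 - P) * xm**2
    return P * cost_one_stage(n - 1, xp) + (1 - P) * cost_one_stage(n - 1, xm)


N = 6          # assumed horizon, shorter than the N = 12 of the text
D = F(1, 4)
for x0 in (D, -D):
    opt, sub = cost_optimal(N, x0), cost_one_stage(N, x0)
    print(f"x0 = {float(x0):+.2f}  optimal = {float(opt):.5f}  "
          f"one-stage = {float(sub):.5f}")
```

By construction the optimal cost can never exceed the one-stage cost, and the two coincide when a single stage remains; the gap between them at longer horizons is what the simulation in the text estimates.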

D. Adaptive Final-Value Control Systems

1. FUNCTIONAL EQUATION

In the previous section, the parameter value $p$ in Eq. (8) has been assumed known. As soon as $p$ is assumed to be unknown, the problem of controlling the system optimally becomes adaptive, since the control policy depends on the unknown $p$ and must be able to incorporate additional information on $p$ as it becomes available.
Although it is possible to treat the situation where $p$ can be anywhere between 0 and 1, let us discuss in detail the case where $p$ is known to be either $p_1$ or $p_2$, $p_1 < p_2$, with the given a priori probability $\zeta$ that $p$ is $p_1$:

$$r = \begin{cases} +c & \text{with probability } p_j \\ -c & \text{with probability } 1 - p_j \end{cases} \qquad j = 1, 2$$

$$\Pr(p = p_1) = \zeta, \qquad \Pr(p = p_2) = 1 - \zeta, \qquad p_1 < p_2$$

That is, nature is assumed to be in one of two possible states, $H_1$ and $H_2$. The random force $r$ is assumed to be independently and identically distributed for each of the $N$ stages.
Let us define, similar to Eq. (10),

$h_n(x; p_i)$ = the expected value of $\phi(x_N)$ employing the optimal policy, given the present state variable $x$ of the system, the number of remaining decision stages $n$, and given that the state of nature is the $i$th state, $i = 1, 2$.

The functional equations for $h_n(x; p_i)$, $i = 1, 2$, have been discussed in Section II, C, 1.
Define $k_n(x, \zeta)$ to be the minimum of the estimated expected value of the function of the final value, $\phi(x_N)$, when $n$ control stages remain, with the state variable $x$ and the current estimate of the probability that $p$ is $p_1$ given by $\zeta$. Noting that the a posteriori probability at the $n$th stage becomes the a priori probability for the next, $(n + 1)$st, stage, the functional equation for $k_n(x, \zeta)$ is given by

$$k_1(x, \zeta) = \min_{u_{N-1}} \big\{\zeta[p_1\phi(x^+) + (1 - p_1)\phi(x^-)] + (1 - \zeta)[p_2\phi(x^+) + (1 - p_2)\phi(x^-)]\big\}$$

$$k_n(x, \zeta) = \min_{u} \big\{\zeta[p_1 k_{n-1}(x^+, \zeta') + (1 - p_1) k_{n-1}(x^-, \zeta'')] + (1 - \zeta)[p_2 k_{n-1}(x^+, \zeta') + (1 - p_2) k_{n-1}(x^-, \zeta'')]\big\} \tag{19}$$

where $x^+$ and $x^-$ are given by Eq. (11) and where $\zeta'$ and $\zeta''$ are given by Eqs. (20) and (21), respectively.
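If the present estimate of nature being in $H_1$ is $\zeta$, and if $r = c$ is realized, then the a posteriori probability that nature is in $H_1$ will become

$$\zeta' = \frac{\zeta p_1}{\zeta p_1 + (1 - \zeta)p_2} = \frac{1}{1 + \frac{1 - \zeta}{\zeta}\,\alpha_1} \tag{20}$$

When $r = -c$ is observed at this stage, the a posteriori probability becomes

$$\zeta'' = \frac{\zeta(1 - p_1)}{\zeta(1 - p_1) + (1 - \zeta)(1 - p_2)} = \frac{1}{1 + \frac{1 - \zeta}{\zeta}\,\alpha_0} \tag{21}$$

where $\alpha_0$ and $\alpha_1$ are the likelihood ratios of $H_2$ over $H_1$ after $r_n = -c$ and $r_n = +c$ are observed, respectively, namely,

$$\alpha_0 = \frac{1 - p_2}{1 - p_1}, \qquad \alpha_1 = \frac{p_2}{p_1}$$

After $n$ such observations, the a priori probability $\zeta$ becomes

$$\zeta_n = \frac{1}{1 + \frac{1 - \zeta}{\zeta}\,\alpha_n} \tag{22}$$

where $\alpha_n$ is the likelihood ratio of the $n$ observations, i.e., if $r = c$ is observed $n_1$ times and $r = -c$ is observed $n_2$ times, then

$$\alpha_n = \Big(\frac{p_2}{p_1}\Big)^{n_1}\Big(\frac{1 - p_2}{1 - p_1}\Big)^{n_2}, \qquad n_1 + n_2 = n \tag{23}$$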
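The updating of the a priori probability is an application of the Bayes formula, and the stage-by-stage update can be compared with the closed form of Eqs. (22) and (23). In the sketch below the values p_1 = 3/8, p_2 = 5/8, the initial estimate 1/2, and the observation sequence are hypothetical, as are the function names; the batch formula requires the initial estimate to be nonzero.

```python
from fractions import Fraction as F


def posterior(zeta, p1, p2, r_is_plus):
    """Bayes update of zeta = Pr(p = p1) after observing r = +c or r = -c."""
    like1 = p1 if r_is_plus else 1 - p1   # likelihood under H1
    like2 = p2 if r_is_plus else 1 - p2   # likelihood under H2
    return zeta * like1 / (zeta * like1 + (1 - zeta) * like2)


def posterior_batch(zeta, p1, p2, n_plus, n_minus):
    """Eqs. (22)-(23): estimate after n_plus (+c) and n_minus (-c) outcomes."""
    alpha_n = (p2 / p1) ** n_plus * ((1 - p2) / (1 - p1)) ** n_minus
    return 1 / (1 + (1 - zeta) / zeta * alpha_n)


zeta0, p1, p2 = F(1, 2), F(3, 8), F(5, 8)
observations = [True, True, False, True]   # True stands for r = +c

zeta = zeta0
for r_is_plus in observations:
    zeta = posterior(zeta, p1, p2, r_is_plus)

print("sequential:", zeta)
print("batch:     ", posterior_batch(zeta0, p1, p2, 3, 1))
```

The agreement of the two results reflects the fact that only the counts n_1 and n_2 matter, not the order of the observations.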

From Eq. (19),

$$k_1(x, \zeta) = \min_{u} \big\{[\zeta p_1 + (1 - \zeta)p_2]\,\phi(x^+) + [\zeta(1 - p_1) + (1 - \zeta)(1 - p_2)]\,\phi(x^-)\big\}$$

$$k_n(x, \zeta) = \min_{u} \big\{[\zeta p_1 + (1 - \zeta)p_2]\,k_{n-1}(x^+, \zeta') + [\zeta(1 - p_1) + (1 - \zeta)(1 - p_2)]\,k_{n-1}(x^-, \zeta'')\big\}, \qquad n = 2, 3, ..., N \tag{24}$$

Let us note that if $\zeta = 1$ or $\zeta = 0$, then $\zeta' = \zeta'' = 1$ or $\zeta' = \zeta'' = 0$, and Eqs. (19) and (24) reduce to Eq. (11), since $k_n(x, 1) = h_n(x; p_1)$ and $k_n(x, 0) = h_n(x; p_2)$.
If one could obtain, before solving Eq. (24), some information on the functional structure of $k_n(x, \zeta)$ from the knowledge of $h_n(x; p_1)$ and $h_n(x; p_2)$, then one would be in a better position to devise an appropriate computational procedure and/or an analytic approximation for solving Eq. (24).
One such piece of structural information is given by the following proposition.

PROPOSITION. $k_n(x, \zeta)$ of Eqs. (19) and (24) satisfies

$$k_n(x, \zeta) \ge \zeta k_n(x, 1) + (1 - \zeta) k_n(x, 0), \qquad n = 1, 2, ... \tag{25}$$

For the proof, see Appendix I of reference (4).
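The inequality of the proposition can be checked numerically for a small case. The sketch below evaluates the adaptive recurrence of Eq. (24) and the stochastic recurrence of Eq. (14) with exact rational arithmetic, taking the criterion as the square of the final state and the control restricted to +m or -m. The values p_1 = 3/8 and p_2 = 5/8, the four-stage horizon, and the function names are assumptions made for the illustration; this is a numerical check, not the proof referred to above.

```python
from fractions import Fraction as F
from functools import lru_cache

A, C, M = F(7, 8), F(1, 16), F(9, 128)
P1, P2 = F(3, 8), F(5, 8)   # hypothetical states of nature H1 and H2


def successors(x, u):
    """States reached from x under control u for r = +c and r = -c."""
    return A * x + C + u, A * x - C + u


@lru_cache(maxsize=None)
def h(n, x, p):
    """Stochastic case with p known, Eq. (14), criterion x_N^2."""
    costs = []
    for u in (M, -M):
        xp, xm = successors(x, u)
        if n == 1:
            costs.append(p * xp**2 + (1 - p) * xm**2)
        else:
            costs.append(p * h(n - 1, xp, p) + (1 - p) * h(n - 1, xm, p))
    return min(costs)


@lru_cache(maxsize=None)
def k(n, x, zeta):
    """Adaptive case, Eq. (24), with zeta = Pr(p = p1)."""
    q_plus = zeta * P1 + (1 - zeta) * P2    # predictive Pr(r = +c)
    q_minus = 1 - q_plus
    z_plus = zeta * P1 / q_plus             # Eq. (20)
    z_minus = zeta * (1 - P1) / q_minus     # Eq. (21)
    costs = []
    for u in (M, -M):
        xp, xm = successors(x, u)
        if n == 1:
            costs.append(q_plus * xp**2 + q_minus * xm**2)
        else:
            costs.append(q_plus * k(n - 1, xp, z_plus)
                         + q_minus * k(n - 1, xm, z_minus))
    return min(costs)


n = 4
for x in (F(1, 4), F(0), F(-1, 4)):
    for zeta in (F(1, 4), F(1, 2), F(3, 4)):
        bound = zeta * h(n, x, P1) + (1 - zeta) * h(n, x, P2)
        print(float(x), float(zeta), float(k(n, x, zeta)), float(bound))
```

In every printed row the adaptive value is at least as large as the linear combination of the two stochastic values, and the two coincide at zeta = 0 and zeta = 1.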
Equation (25) supplies the lower limit on $k_n(x, \zeta)$, given $k_n(x, 1) = h_n(x; p_1)$ and $k_n(x, 0) = h_n(x; p_2)$.
The upper limit on $k_n(x, \zeta)$ is given from Eq. (19) by

$$k_n(x, \zeta) \le \min_{u} \max\big[p_1 k_{n-1}(x^+, \zeta') + (1 - p_1) k_{n-1}(x^-, \zeta''),\ p_2 k_{n-1}(x^+, \zeta') + (1 - p_2) k_{n-1}(x^-, \zeta'')\big] \le \min_{u} \max\big[k_{n-1}(x^+, \zeta'),\ k_{n-1}(x^-, \zeta'')\big] \tag{26}$$

Observe that ctn(x, ζ) requires less computation than $k_n(x, \zeta)$.
Thus, we know that the minimum of the estimated expected value of $\phi(x_N)$ for this type of adaptive final value system, $k_n(x, \zeta)$, is concave in $\zeta$, and Eq. (25) provides a lower bound on $k_n(x, \zeta)$, given the expected values of the criterion function for the corresponding stochastic case, $h_n(x; p_1)$ and $h_n(x; p_2)$.

Let us note that in order to prove the relation, Eq. (25), the actual forms of the difference equation of the system, Eq. (5), of the distribution function, and of $\phi$ are immaterial. The relation is, therefore, a general characteristic of final-value control systems with a finite number of states of nature. Since the knowledge of a sequence of optimal $u$ is equivalent to that of the criterion functional values (J), the lower bound of $k_n(x, \zeta)$ may be used to derive an initial approximate policy to start a sequence of approximations in policy space (6).
Let us assume for the moment that the recurrence relation Eq. (19) has been solved, and let us consider certain fixed values of $x$ and $n$. Then in Eq. (19) let us define $s_1$ and $s_2$ as

$$s_i = p_i k_{n-1}(x^+, \zeta') + (1 - p_i) k_{n-1}(x^-, \zeta''), \qquad i = 1, 2 \tag{27}$$

thus $s_1$ and $s_2$ will be in general functions of $x$, $n$, $u$, and $\zeta$.
We can now write Eq. (19) as

$$k_n(x, \zeta) = \min_{u}\,[\zeta s_1 + (1 - \zeta)s_2] = \zeta s_1^* + (1 - \zeta)s_2^* \tag{28}$$

where $s_1^*$ and $s_2^*$ are the $s_1$ and $s_2$ which are optimal for the $x$ and $n$ considered.
If one regards $s_1$ and $s_2$ as the loss of the control system with $u$ when nature is $H_1$ and $H_2$ respectively,¹ then the control process may be considered to be the S game (16) where nature has two strategies. When nature employs a mixed strategy with probability distribution $\zeta = (\zeta, 1 - \zeta)$, the expected value of the loss to the control system with $u$ becomes

$$\zeta s_1 + (1 - \zeta)s_2 \tag{29}$$

and $s^* = (s_1^*, s_2^*)$ is the minimum of Eq. (29).

¹ This is exactly true only for $\zeta = 1$ or $\zeta = 0$.
Viewed in the light of game theory, the use of the a priori and the a posteriori probability distributions in Eq. (19) seems to be quite natural. This can be formalized by assuming that $\zeta$ will be transformed by the Bayes formula.
The reason for introducing the concept of the S game here is that there exist mathematical theories on set-theoretical relations among the classes of strategies, and they are useful in discussing optimal strategies or optimal policies. Although it does not seem possible to fit questions in adaptive control processes completely in the existing frame of the theory of statistical decisions and sequential analysis, certain analogies can be used to advantage in the construction of the theory of adaptive control processes, and in deriving approximate solutions for functional equations derived by the application of the principle of optimality.
When $\zeta$, $\zeta'$, and $\zeta''$ are close to each other, it is reasonable to expect that $k_n(x, \zeta)$, $k_n(x, \zeta')$, and $k_n(x, \zeta'')$ are also close to one another, since they are continuous in $x$ and $\zeta$. The recurrence relation defining an approximating function, written here as $\tilde{k}_n(x, \zeta)$, by

$$\tilde{k}_1(x, \zeta) = k_1(x, \zeta)$$

$$\tilde{k}_n(x, \zeta) = \min_{u} \big\{\zeta[p_1 \tilde{k}_{n-1}(x^+, \zeta) + (1 - p_1)\tilde{k}_{n-1}(x^-, \zeta)] + (1 - \zeta)[p_2 \tilde{k}_{n-1}(x^+, \zeta) + (1 - p_2)\tilde{k}_{n-1}(x^-, \zeta)]\big\}, \qquad n = 2, 3, ..., N \tag{30}$$

may be used as an approximation to the exact recurrence relation Eq. (19).
Having computed $h_n(x; p_1)$ and $h_n(x; p_2)$, one can use them to provide a lower bound on the criterion function of the adaptive final value system $k_n(x, \zeta)$ of Eq. (19).
Figure 2 shows $k_n(x, \zeta)$ as a function of $\zeta$ for various $n$ and $x$. It is seen that for $n > 8$, the lower bound given by a linear combination of $h_n(x; p_1)$ and $h_n(x; p_2)$ is a fairly good approximation to $k_n(x, \zeta)$.

III. Approximate Realization of Desired Trajectories (17, 18)

A. Introduction
Sometimes an optimal control problem is posed as the problem of designing a system such that certain components of the system state vector follow some desired (or given) functions of time as accurately as possible during the control period $[t_0, t_0 + T]$, where $T$ is the duration of control. It is convenient to refer to desired functions of time as desired trajectories and actual (or realized) functions of time produced by the system as actual trajectories of the system. Optimization problems consist, therefore, in choosing system control vectors and other system parameters at the disposal of system designers in such a way that some given criterion of closeness of fit is minimized for given desired trajectories and realized system trajectories.

[Figure 2: plots of $k_n(x, \zeta)$ against $\zeta$ for $x = 0.25$, $0.125$, $0$, $-0.125$, and $-0.25$, with $a = 7/8$ and $m = 9/128$.]

FIG. 2. Values of $k_n(x, \zeta)$ as a function of $\zeta$.

In this section, we will be mainly interested in using the $L_1$ norm, i.e., the integral of the absolute deviation, as the criterion of fit. Boltyanskii has treated a class of similar problems in which the $L_2$ norm, i.e., the time integral of the squared deviation, is to be minimized (3, 18). His technique is not readily applicable to problems with the $L_1$ norm since the integrand fails to be differentiable. A more complete treatment of problems with the $L_1$ norm is given in Section III, E for control systems with simple integrator dynamics.
The control system is assumed to be given by

$$\frac{dx(t)}{dt} = A(t)x(t) + Bu(t), \qquad u(t) \in \Omega \tag{31}$$

where
$x$ is an $n$-vector,
$A$ is an $n \times n$ matrix,
$B$ is an $n \times m$ matrix,
$u$ is an $m$-vector, $m \le n$,
$\Omega$ is the admissible set of control variables.

The desired trajectory is given by an $n$-vector $d(t)$. It is assumed that the system can be started exactly at the beginning, i.e., $x_i(t_0) = d_i(t_0)$, $i = 1, 2, ..., n$, where subscripts denote components of vectors. When $m < n$, it is, in general, impossible to realize the desired trajectory exactly. Even when $m = n$, depending on $\Omega$ it may not be possible to realize $d(t)$ exactly. This point is discussed in some detail in Section IV, where the existence of the optimal control is discussed.
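The criterion of the system performance is taken to be

$$C(u) = \int_{t_0}^{t_0+T} \|x(t) - d(t)\|\,dt \tag{32}$$

The norm is taken to be either

[equation missing in the source] (33)

[equation missing in the source] (34)

or

[equation missing in the source] (35)

where $x_i$ is the $i$th component of $x$.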
The question of optimal and suboptimal policies, as well as estimates of the deviation between the desired and the actual system trajectories, are discussed under various assumptions on control systems. The connection with the problems of approximation of functions is also indicated.

B. Upper Bounds

From Eqs. (31) and (32), the deviation $y(t) = x(t) - d(t)$ satisfies

$$\dot{y} = Ay + Bu + Ad - \dot{d}, \qquad y(t_0) = 0 \tag{36}$$

where $\dot{d}$ indicates the time derivative of $d(t)$. Let

$$\psi(t) = -Ad + \dot{d} \tag{37}$$

then

$$\dot{y} = Ay + Bu - \psi(t) \tag{38}$$

The function $y(t)$ can be written from Eq. (38) as

$$y(t) = \int_{t_0}^{t} W(t, s)\{Bu(s) - \psi(s)\}\,ds \tag{39}$$

where $W(t, s)$ is the fundamental matrix of solutions of Eq. (38).
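The representation of Eq. (39) can be checked numerically in the scalar case, where the fundamental matrix reduces to W(t, s) = exp(a(t - s)). The sketch below integrates Eq. (38) with a fourth-order Runge-Kutta scheme and compares the result with a Simpson-rule evaluation of Eq. (39). The coefficients a = -1, b = 2, the constant control u = 1, the forcing term sin t, and all function names are assumptions made for the illustration.

```python
import math

A_SCALAR = -1.0   # assumed scalar system coefficient a
B_SCALAR = 2.0    # assumed scalar input coefficient b


def u(t):
    """Hypothetical control: a constant unit input."""
    return 1.0


def psi(t):
    """Hypothetical forcing term standing in for Eq. (37)."""
    return math.sin(t)


def forcing(t):
    """The bracketed term B u(t) - psi(t) of Eq. (39)."""
    return B_SCALAR * u(t) - psi(t)


def y_rk4(t_end, steps=2000):
    """Integrate Eq. (38) from y(0) = 0 with classical Runge-Kutta."""
    step = t_end / steps
    t, y = 0.0, 0.0

    def f(t, y):
        return A_SCALAR * y + forcing(t)

    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + step / 2, y + step * k1 / 2)
        k3 = f(t + step / 2, y + step * k2 / 2)
        k4 = f(t + step, y + step * k3)
        y += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += step
    return y


def y_formula(t_end, steps=2000):
    """Evaluate Eq. (39) with W(t, s) = exp(a (t - s)) by Simpson's rule."""
    step = t_end / steps   # steps must be even for Simpson's rule

    def g(s):
        return math.exp(A_SCALAR * (t_end - s)) * forcing(s)

    total = g(0.0) + g(t_end)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * step)
    return total * step / 3


print(y_rk4(2.0), y_formula(2.0))
```

For a time-varying A(t) the exponential would be replaced by the solution of the homogeneous matrix equation, but the structure of the comparison stays the same.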


Let us first obtain an upper bound on $C(u)$ by applying the Schwarz inequality to the $i$th component of $y(t)$, $y_i(t)$:

$$|y_i(t)| \le \Big(\int_{t_0}^{t} \sum_{k=1}^{n} W_{ik}^2(t, s)\,ds\Big)^{1/2} \Big(\int_{t_0}^{t} \sum_{k=1}^{n} S_k^2(s)\,ds\Big)^{1/2} \le M(t)\Big(\int_{t_0}^{t_0+T} \sum_{k=1}^{n} S_k^2(t)\,dt\Big)^{1/2}, \qquad i = 1, 2, ..., n \tag{40}$$

where $W_{ik}(t, s)$ is the element of $W(t, s)$ at the $i$th row and the $k$th column, and

$$M(t) = \max_{i} \Big(\int_{t_0}^{t} \sum_{k=1}^{n} W_{ik}^2(t, s)\,ds\Big)^{1/2} \tag{41}$$


and $S_k(t)$ is the $k$th element of the $n$-vector $S = Bu - \psi$. If the norm of Eq. (33) is taken, then

$$C(u) = \int_{t_0}^{t_0+T} \|y(t)\|\,dt \le n\Big(\int_{t_0}^{t_0+T} M(t)\,dt\Big)\Big(\int_{t_0}^{t_0+T} \sum_{k=1}^{n} S_k^2(t)\,dt\Big)^{1/2} = K\Big(\int_{t_0}^{t_0+T} \sum_{k=1}^{n} S_k^2(t)\,dt\Big)^{1/2} \tag{42}$$

where

$$K = n\int_{t_0}^{t_0+T} M(t)\,dt$$

If the norm of Eq. (34) is used, then

$$C(u) \le L\Big(\int_{t_0}^{t_0+T} \sum_{k=1}^{n} S_k^2(t)\,dt\Big)^{1/2} \tag{43}$$

where

$$L = \int_{t_0}^{t_0+T} M(t)\,dt$$

With the norm of Eq. (35),

$$C(u) \le N\Big(\int_{t_0}^{t_0+T} \sum_{k=1}^{n} S_k^2(t)\,dt\Big)^{1/2} \tag{44}$$

where

$$N = \Big(\int_{t_0}^{t_0+T} M^2(t)\,dt\Big)^{1/2}$$

In this case, it is possible to represent $N$ in terms of the eigenvalues of the linear operator, but we will not discuss this here (19).
T h u s , we see that with t h e three n o r m s we are considering, u p p e r
b o u n d s on C(u) are given by E q s . (42)-(44), which are all minimized by

a \J'; ÎSAW\
+T
(45)

If t h e set Ω is such that u(t) can be chosen for each t independently, t h e n


instead of E q . (45) one can consider

min X SM) (46)


for each $t$ in $[t_0, t_0 + T)$. For example, if $\Omega$ is such that

$$\Omega = \{u(t) : u(t) \text{ measurable},\ |u_i(t)| \le \alpha_i,\ i = 1, 2, \ldots, m\}$$

or

Ω = {u(t) : u(t) measurable, £ ( « i ( 0 > u m ( t ) ) < ]3}

then Eq. (45) can be replaced by Eq. (46).

Since

$$S_k(t) = \sum_{j=1}^{m} b_{kj} u_j(t) - \varphi_k(t)$$

where $b_{kj}$ is the $(k, j)$th element of the matrix $B$ in Eq. (40), the original problem is replaced by that of minimizing the Euclidean distance between the given vector $P = (\varphi_1(t), \ldots, \varphi_n(t))$ and the vector $Q$ which is a function of $u$, $Q = (\sum_j b_{1j} u_j, \ldots, \sum_j b_{nj} u_j)$. This is a problem of approximation in $E^n$ (19, 20, 21).
The measurability question will not be discussed here. In the absence of any magnitude constraints on $u$, Eq. (46) has the usual geometric interpretation (19), namely the vector $Q$ should be taken to be the orthogonal projection of the vector $P$ on the subspace $H$ spanned by the $m$ column vectors of the matrix $B$.
In case the magnitude of $u$ is constrained, the vector $Q$ should be taken to be the point which satisfies the constraint and is closest to the projection of $P$ on the subspace $H$ (20, 21, 22).
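As a minimal sketch of this geometric interpretation, consider the special case m = 1, in which B reduces to a single column b. The unconstrained minimizer is then the orthogonal-projection coefficient, and with a magnitude bound on u the minimizer of the one-variable quadratic is that coefficient clipped to the admissible interval. The function names `best_scalar_control` and `residual_sq` and the numerical values are illustrative assumptions, not part of the text.

```python
def best_scalar_control(b, phi, alpha=None):
    """Minimize ||b*u - phi||^2 over a scalar u, optionally with |u| <= alpha.

    b and phi are sequences of equal length (one column of B and the point P).
    """
    bb = sum(bi * bi for bi in b)
    if bb == 0.0:
        raise ValueError("column b must be nonzero")
    # Coefficient of the orthogonal projection of phi on the line spanned by b.
    u = sum(bi * pi for bi, pi in zip(b, phi)) / bb
    if alpha is not None:
        # For a single control the constrained minimizer is the clipped value.
        u = max(-alpha, min(alpha, u))
    return u


def residual_sq(b, phi, u):
    """Squared Euclidean distance between Q = b*u and P = phi."""
    return sum((bi * u - pi) ** 2 for bi, pi in zip(b, phi))


# Hypothetical data: n = 2, m = 1.
b = [1.0, 2.0]
phi = [3.0, 1.0]
u_free = best_scalar_control(b, phi)             # no magnitude constraint
u_sat = best_scalar_control(b, phi, alpha=0.5)   # with |u| <= 0.5
```

In the unconstrained case the residual b*u - phi is orthogonal to b, which is the projection property stated above. For m > 1 the constrained problem no longer separates in this simple way unless the columns of B are orthogonal.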
One can obtain other bounds on $C(u)$.

For example, from Eq. (39)

(47)

where

(48)

then,

(49)
Hence, the control policy which minimizes the right-hand side of Eq. (49) is that which minimizes

J (50)
t0 ΪΞι Ζ*ι

This is the problem of minimizing the distance between two points $P$ and $Q$ in the space where the distance $\rho(x, y)$ between $x$ and $y$ is defined by the norm of Eq. (34). When $m = n = 1$, these two different metrics produce the same $u(t)$ with $|u| \le \alpha$,

$$u(t) = \begin{cases} \varphi(t)/B & \text{when } |\varphi(t)| \le \alpha B \\ \alpha & \text{when } \varphi(t) > \alpha B \\ -\alpha & \text{when } \varphi(t) < -\alpha B \end{cases} \tag{51}$$

In the rest of this section we will use the norm of Eq. (34).

C. Independent Controls
Consider a special case where $m = n$ in Eq. (31). Assume that the constant matrix $A$ has $n$ distinct real eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Although it is possible to treat situations where some of the eigenvalues are complex and/or not all of them are distinct, they do not add any new insight into the problem and hence will not be discussed here.
Letting $U$ be the $n \times n$ matrix whose columns are normalized eigenvectors of $A$, one transforms Eq. (38) by

$$z = Uy$$

into

$$\frac{dy}{dt} = \Lambda y(t) + U^{-1} B u(t) - \varphi(t) \tag{52}$$

with

$$y(t_0) = U^{-1}(x(t_0) - r(t_0)) \tag{53}$$

where

$$U^{-1} A U = \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \tag{54}$$
and where

(55)

Equation (38) reduces to

$$\frac{dy_i}{dt} = \lambda_i y_i + (U^{-1} B u)_i - \varphi_i, \qquad i = 1, 2, \ldots, n \tag{56}$$

where $y_i(t)$ is the $i$th component of $y(t)$.

Note that all components of $u$, $u_1, \ldots, u_n$, enter into the determination of $y_i(t)$.
If in the control system $B$ is chosen such that each column of $B$ is proportional to an eigenvector of $A$, then

$$U^{-1} B = \operatorname{diag}(\beta_1, \beta_2, \ldots, \beta_n)$$

and from Eq. (56)

$$y_i(t) = e^{\lambda_i (t - t_0)} y_i(t_0) + \int_{t_0}^{t} e^{\lambda_i (t - \tau)} (\beta_i u_i(\tau) - \varphi_i(\tau))\,d\tau \tag{57}$$

Therefore, not only is each component of $y(t)$ decoupled, but one has also achieved an independent control of each component of $y(t)$ over $[t_0, t_0 + T]$. Namely, each component can be controlled independently by $u_i(t)$, $i = 1, \ldots, n$. Thus, in this special case, the problem is essentially that of $n$ one-dimensional control systems being controlled independently. Therefore, the subscript $i$ can be dropped from Eq. (57) for the rest of this section.
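The decoupling described above can be checked on a small example. The following sketch uses a hypothetical 2 x 2 matrix A with eigenvalues 1 and 2 and chooses each column of B proportional to an eigenvector of A; both U^{-1}AU and U^{-1}B then come out diagonal. The matrices, the gains in `beta`, and the helper `matmul` are assumptions made for illustration, and the eigenvectors are left unnormalized to keep the arithmetic exact (rescaling the columns of U does not affect the diagonal structure).

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


# Hypothetical system matrix with distinct real eigenvalues 1 and 2.
A = [[1.0, 1.0],
     [0.0, 2.0]]

# Columns of U are eigenvectors of A: (1, 0) for eigenvalue 1, (1, 1) for eigenvalue 2.
U = [[1.0, 1.0],
     [0.0, 1.0]]
U_inv = [[1.0, -1.0],
         [0.0, 1.0]]

# Each column of B is a multiple (beta_i) of the corresponding eigenvector.
beta = [3.0, -0.5]
B = [[beta[0] * U[0][0], beta[1] * U[0][1]],
     [beta[0] * U[1][0], beta[1] * U[1][1]]]

Lambda = matmul(matmul(U_inv, A), U)   # diagonal matrix of eigenvalues
G = matmul(U_inv, B)                   # diagonal matrix of the gains beta_i
```

With G diagonal, the ith transformed component is driven by u_i alone, which is the independent-control property used in the remainder of the section.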
The norms of Eqs. (33) and (34) both give rise to the problem of minimizing

(58)

where

$u \in \Omega$, and $\lambda$, $\beta$, and $\varphi$ are given.

In the next section, we will discuss this one-dimensional case more thoroughly.

D. One-Dimensional Problem
1. INTRODUCTION

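In this section, the one-dimensional case of Eq. (31) will be considered in detail. One possible way in which such one-dimensional problems arise is discussed in Section III,C.
It is convenient to make use of two formulations:

$$\min_u \int_0^T | x(t) - r(t) |\,dt$$

where

$$\frac{dx}{dt} = ax + bu, \qquad a \ne 0,\ b > 0,\ x(0) = r(0),\ 0 \le u \le 1 \tag{59}$$

and

$$\min_u \int_0^T | z(t) |\,dt$$

where

$$z(t) = x(t) - r(t)$$

and

$$\frac{dz}{dt} = az + b(u - v - \varphi), \qquad z(0) = 0,\ 0 \le u \le 1 \tag{60}$$

where

$$v \triangleq v\!\left(\frac{\dot{r} - ar}{b}\right), \qquad \varphi \triangleq \frac{\dot{r} - ar}{b} - v \tag{61}$$

and

$$v(x) = \begin{cases} 0 & \text{when } x < 0 \\ x & \text{when } 0 \le x \le 1 \\ 1 & \text{when } x > 1 \end{cases} \tag{62}$$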

The function $v$ is introduced in such a way that it automatically assumes a correct control value when the derivative of the given trajectory $r(t)$ is capable of being duplicated by the system, i.e., if $r(t)$ satisfies

$$\dot{r} = ar + bv$$

with $0 \le v \le 1$. When this happens, $\varphi$ of Eq. (61) is zero.
Thus, $\varphi(t) \ne 0$ implies that at $t$, the system is incapable of duplicating $r(t)$ exactly by the admissible control variable. Note also that

$$\text{when } \varphi > 0,\ v = 1; \qquad \text{and when } \varphi < 0,\ v = 0 \tag{63}$$
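The roles of v and φ can be made concrete with a short sketch. Subtracting the derivative of r from Eq. (59) shows that the system of Eq. (60) requires v + φ = (dr/dt - a r)/b; v is the part of this demand that an admissible control can supply and φ is the remainder. The function names `saturate` and `split_demand` and the sample values are assumptions for illustration only.

```python
def saturate(x):
    """The function v(x) of Eq. (62): clip x to the admissible range [0, 1]."""
    return min(1.0, max(0.0, x))


def split_demand(r, r_dot, a, b):
    """Split the control demanded by the trajectory into (v, phi).

    The demand w = (r_dot - a*r)/b is the control that would reproduce r
    exactly; v is its admissible part and phi = w - v is what is left over.
    """
    w = (r_dot - a * r) / b
    v = saturate(w)
    return v, w - v


# Hypothetical system with a = -1, b = 1 and three sample points of r.
feasible = split_demand(r=0.5, r_dot=0.0, a=-1.0, b=1.0)    # demand inside [0, 1]
too_fast = split_demand(r=0.0, r_dot=2.0, a=-1.0, b=1.0)    # demand above 1
too_slow = split_demand(r=1.0, r_dot=-2.0, a=-1.0, b=1.0)   # demand below 0
```

The three sample points reproduce the pattern of Eq. (63): φ is zero when the demand is admissible, φ > 0 forces v = 1, and φ < 0 forces v = 0.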
2. SPECIAL CASES 1

There are three cases for which the optimal controls are immediately seen. We assume that $r(t)$ is differentiable almost everywhere in $[t_0, t_0 + T)$. It is easy to verify by direct substitution that

$$u(s) = \frac{\dot{r}(s) - a r(s)}{b}, \qquad 0 \le s \le t \tag{64}$$

satisfies

$$x(t) = e^{at} x(0) + \int_0^t e^{a(t-s)} b u(s)\,ds = r(t) \tag{65}$$

for almost all $t$ in $[0, T)$.

It is convenient to divide the $r(t)$-$t$ plane into three regions by means of two auxiliary functions $\underline{r}(t)$ and $\bar{r}(t)$ defined by

$$\underline{r}(t) = r(0) e^{at}, \qquad \bar{r}(t) \triangleq r(0) e^{at} + \frac{b}{a}(e^{at} - 1) \tag{66}$$

for all $t$ in $[0, T]$.
They correspond to $x(t)$ with $u(t) = 0$ and $u(t) = 1$ in $[0, T]$, respectively.
In the $r(t)$-$t$ plane, $r(t) \ge \bar{r}(t)$ defines Region I, $\underline{r}(t) < r(t) < \bar{r}(t)$, Region II, and $r(t) < \underline{r}(t)$, Region III. Thus:

Case (i)
a.e. in [0, T] (67)

then the optimal policy is given by

Case (ii)
r(t) > r(t) in (0, s) and Eq. (67) holds in (s, T),

where $s$ is any time in $(0, T)$. The optimal policy is given by

Case (iii)

where $s$ is any time in $(0, T)$. The optimal policy is given by

Figures 3(a)-3(c) illustrate these three cases with $T = 1$.


In terms of Eq. (61), in Fig. 3(a), $\varphi(t) = 0$, in Fig. 3(b), $\varphi(t) > 0$, and $\varphi(t) < 0$ in Fig. 3(c).

FIG. 3. Examples of $r(t)$ for which optimal policies are derived simply.
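The division of the plane into three regions can be sketched as follows. The two auxiliary functions are the responses of dx/dt = ax + bu from x(0) = r(0) under u = 0 and u = 1, as in Eq. (66). The function name `region`, the convention of assigning the lower boundary to Region III, and the numerical values are assumptions made for this illustration.

```python
import math


def region(t, r_value, r0, a, b):
    """Classify the point (t, r_value) relative to the responses for u = 0 and u = 1.

    lower(t) = r0*exp(a*t) is the response with u = 0;
    upper(t) = lower(t) + (b/a)*(exp(a*t) - 1) is the response with u = 1.
    """
    lower = r0 * math.exp(a * t)
    upper = lower + (b / a) * (math.exp(a * t) - 1.0)
    if r_value >= upper:
        return "I"
    if r_value > lower:
        return "II"
    # Points on the lower boundary are placed in Region III by convention here.
    return "III"


# Hypothetical system a = -1, b = 1, r(0) = 0, examined at t = 1:
# lower = 0 and upper = 1 - exp(-1), about 0.632.
labels = [region(1.0, value, r0=0.0, a=-1.0, b=1.0) for value in (0.7, 0.3, -0.1)]
```

Since b > 0, the term (b/a)(exp(at) - 1) is positive for t > 0 whatever the sign of a, so the upper curve always lies above the lower one.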


3. SPECIAL CASE 2

Consider a situation where $\varphi(t)$ is known to be

$$\varphi(t) \begin{cases} > 0 & \text{in } (0, t_1) \\ = 0 & \text{at } t_1 \\ < 0 & \text{in } (t_1, T] \end{cases} \tag{68}$$

and $r(t)$ crosses from Region III into Region II at some time in $[0, T]$ and stays in Region II (see Fig. 4). From Eq. (63)

$$v = 1 \text{ in } (0, t_1), \qquad v = 0 \text{ in } (t_1, T) \tag{69}$$

FIG. 4. Example of $\varphi(t)$ behavior for which the optimal control policy is derived.

From Eq. (60),

$$\int_0^T | z(t) |\,dt = \int_0^T dt \left| \int_0^t e^{a(t-\tau)} b(u - v - \varphi)\,d\tau \right| \tag{70}$$

a. Suboptimal Policy. Decompose $u(t)$ into $u_1(t)$ and $u_2(t)$ such that

$$u(t) = u_1(t) + u_2(t), \qquad u_1(t) = 0 \text{ in } [t_1, T], \qquad u_2(t) = 0 \text{ in } [0, t_1) \tag{71}$$

Let us first find a $u(t)$ which gives an upper bound on Eq. (70).

For $t \in [0, t_1]$,

$$| z(t) | = \int_0^t e^{a(t-\tau)} b(\varphi + 1 - u_1)\,d\tau \tag{72}$$

since the integrand is of the same sign in $[0, t_1]$.


For $t \in [t_1, T]$,

$$| z(t) | \le \int_0^{t_1} e^{a(t-\tau)} b(\varphi + 1 - u_1)\,d\tau + \int_{t_1}^{t} e^{a(t-\tau)} b(u_2 - \varphi)\,d\tau \tag{73}$$

since $0 \le u_i \le 1$, $i = 1, 2$. Therefore from Eqs. (72) and (73), and by a change of the order of integration, one obtains

$$\int_0^T | z(t) |\,dt \le \int_0^{t_1} K(T, \tau)\, e^{-a\tau} (\varphi + 1 - u_1)\,d\tau + \int_{t_1}^{T} K(T, \tau)\, e^{-a\tau} (u_2 - \varphi)\,d\tau \tag{74}$$

where

$$K(T, \tau) \triangleq \frac{b}{a}\left(e^{aT} - e^{a\tau}\right) \tag{75}$$

Since the expression $K(T, \tau)$ is positive, Eq. (74) will be minimized by

$$u(t) = \begin{cases} 1 & \text{in } [0, t_1) \\ 0 & \text{in } (t_1, T] \end{cases} \tag{76}$$

This policy gives an upper bound on the minimum of Eq. (70).

b. Exact Solution; Optimal Policy. To obtain an optimal policy, rather than the suboptimal policy derived in the previous Section III,D,3,a, we make use of various extensions of the fundamental min-max theorem of von Neumann in the theory of games (23). We note

$$\int_0^T | z(t) |\,dt = \max_{|k(t)| \le 1} \int_0^T z(t)\,k(t)\,dt \tag{77}$$

with $k$ a measurable function, hence

$$\min_u \int_0^T | z(t) |\,dt = \min_u \max_k \int_0^T z(t)\,k(t)\,dt = \max_k \min_u \int_0^T z(t)\,k(t)\,dt \tag{78}$$

From Eq. (72), one can rewrite (77) to be

$$\int_0^T | z(t) |\,dt = \int_0^{t_1} dt \int_0^t e^{a(t-\tau)} b(\varphi + 1 - u_1)\,d\tau + \max_k \int_{t_1}^{T} z(t)\,k(t)\,dt \tag{79}$$
where $k(t)$ is now defined only on $[t_1, T]$. From Eqs. (72), (73), and (79),

(80)

where

(81)

Equation (80) can be rewritten as

(82)

The function $C$ of Eq. (81) is seen to be negative in the neighborhood of $t_1$, and is monotonically increasing for all admissible $u_2$ in $(t_1, T]$. Therefore the optimal $k(t)$ is equal to $-1$ in the same neighborhood of $t_1$.
Thus, $C$ has at most one zero, at $\sigma$, in $[t_1, T]$. This will be the case if $C(T; u_1, u_2) > 0$; then

$$k(t) = \begin{cases} -1, & t_1 \le t < \sigma \\ \phantom{-}1, & \sigma \le t \le T \end{cases} \tag{83}$$

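Define $A(\tau; \sigma)$ and $B(\tau; \sigma)$ by

$$A(\tau; \sigma) = K(t_1, \tau) - \int_{t_1}^{T} b\,k(t)\,e^{at}\,dt, \qquad 0 \le \tau \le t_1 \tag{84}$$

$$B(\tau; \sigma) = b \int_{\tau}^{T} k(t)\,e^{at}\,dt, \qquad t_1 \le \tau \le T \tag{85}$$

From Eq. (85), $B(T; \sigma)$ is zero and $B(\sigma; \sigma)$ is positive.

$$\frac{dB}{d\tau} = -b\,k(\tau)\,e^{a\tau} \quad \begin{cases} > 0 & \text{in } [t_1, \sigma) \\ < 0 & \text{in } [\sigma, T] \end{cases} \tag{86}$$

Therefore, if $B(t_1; \sigma) > 0$ then

$$B(\tau; \sigma) \ge 0 \quad \text{for all } \tau \in [t_1, T] \tag{87}$$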


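If $B(t_1; \sigma) < 0$, then there exists $t^*$ such that

$$B(t^*; \sigma) = 0, \qquad t_1 < t^* < \sigma \tag{88}$$

From Eqs. (82) and (84)

$$A(\tau; \sigma) = K(t_1, \tau) - B(t_1; \sigma), \qquad 0 \le \tau \le t_1 \tag{89}$$

Equations (80) and (81) can now be rewritten as

$$\min_u \max_k \left( \int_0^{t_1} K(t_1, \tau)\,e^{-a\tau}(\varphi + 1 - u_1)\,d\tau + \int_{t_1}^{T} C(t; u_1, u_2)\,k(t)\,e^{at}\,dt \right)$$
$$= \max_k \min_u \left( \int_0^{t_1} A(\tau; \sigma)\,e^{-a\tau}(\varphi + 1 - u_1)\,d\tau + \int_{t_1}^{T} B(\tau; \sigma)\,e^{-a\tau}(u_2 - \varphi)\,d\tau \right) \tag{90}$$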
Now consider the following four cases separately:
Case (1). $B(t_1; \sigma) > 0$ and $A(t; \sigma) > 0$, $0 \le t \le t_1$, i.e., $K(t_1, \tau) > B(t_1; \sigma)$.² From Eq. (90),

$$u_1^* = 1 \text{ on } [0, t_1]$$

Since $B(t_1; \sigma) > 0$,

$$u_2^* = 0 \text{ on } [t_1, T]$$

The above argument tacitly assumes the existence of $\sigma$, $t_1 < \sigma \le T$. The assumed form of $k$ of Eq. (83) implies that

$$0 < C(T; 1, 0) = -b \left( \int_0^{t_1} e^{-a\tau} \varphi(\tau)\,d\tau + \int_{t_1}^{T} e^{-a\tau} \varphi(\tau)\,d\tau \right) \tag{91}$$

Equation (91) has the geometrical interpretation that the weighted absolute area above the axis $\varphi = 0$, $S_1$, is smaller than the weighted absolute area below $\varphi = 0$, $S_2$, the weight being $e^{-a\tau}$. See Fig. 5.
Case (2). $A(t; \sigma) < 0$, $0 \le t \le t_1$. From Eq. (90),

$$u_1^* = 0 \text{ in } [0, t_1] \text{ is optimal}$$

From Eq. (89), $B(t_1; \sigma) > 0$; therefore $u_2^* = 0$ in $[t_1, T]$. This is the case when $C(T; 0, 0) > 0$ or when

$$S_1 + \int_0^{t_1} e^{-a\tau}\,d\tau < S_2$$

² Actually if $A(t; \sigma) = 0$, then $u_1$ is arbitrary.
Case (3). $B(t_1; \sigma) < 0$, hence from Eq. (89) $A(t; \sigma) > 0$, $0 \le t \le t_1$.
As before, $u_1^* = 1$ on $[0, t_1]$. From Eq. (88), there exists $t^*$ such that $B(t^*; \sigma) = 0$.

FIG. 5. Weighted areas $S_1$ and $S_2$.

Therefore from Eq. (90),

$$u_2^* = \begin{cases} 1 & \text{in } [t_1, t^*) \\ 0 & \text{in } [t^*, T] \end{cases}$$

This is the case when $C(T; u_1^*, u_2^*) > 0$ or when $S_1$ and $S_2$ are such that

$$S_1 < S_2 + \int_{t_1}^{t^*} e^{-a\tau}\,d\tau$$

Since $t_1 < t^* < \sigma < T$, whenever $S_1 < S_2 + \int_{t_1}^{T} e^{-a\tau}\,d\tau$ it belongs to this case.
Case (4). $\sigma = T$ is realized. In this case $B(\tau; T) < 0$ for $t_1 \le \tau < T$ and $A(t; T) > 0$ for $0 \le t \le t_1$. Therefore

$$u_1^* = 1 \text{ on } [0, t_1], \qquad u_2^* = 1 \text{ on } [t_1, T]$$

and

$$C(T; 1, 1) < 0$$

or

$$S_1 > S_2 + \int_{t_1}^{T} e^{-a\tau}\,d\tau$$
This completes the analysis of an optimal policy for this case.

Case (1) takes care of the situation when

$$S_1 < S_2 < S_1 + \int_0^{t_1} e^{-a\tau}\,d\tau$$

Case (2) when

$$S_1 + \int_0^{t_1} e^{-a\tau}\,d\tau < S_2$$

Case (3) when

Case (4) when

$$S_2 + \int_{t_1}^{T} e^{-a\tau}\,d\tau < S_1$$

Thus, by first computing $S_1$ and $S_2$, one can derive an optimal policy from their relative magnitudes.
It should be clear that a reverse situation to that of the present section can be handled quite analogously. The arguments, therefore, will not be repeated.

4. SPECIAL CASE 3

Consider now the situation depicted in Fig. 6. Decompose $u$ into

$$u = u_1 + u_2 + u_3 \tag{92}$$

a. Exact Solution. By introducing auxiliary functions $k_1(t)$ and $k_2(t)$ defined on $[t_1, t_2]$ and $[t_2, T]$ respectively, one can try to use the min-max theorem as in Section III,D,3. However, the behavior of the auxiliary functions, defined similarly to $A$, $B$, and $C$ in Section III, becomes rather complex, and no simple rule for obtaining optimal policies seems to be forthcoming.
A dynamic programming formulation may be employed to obtain optimal policies numerically in these cases.

b. Suboptimal Policy. For $t \le t_1$,

$$| z(t) | = \int_0^t e^{a(t-\tau)} b(\varphi + 1 - u_1)\,d\tau \tag{93}$$
For $t_1 < t < t_2$,

$$| z(t) | \le \int_0^{t_1} e^{a(t-\tau)} b(\varphi + 1 - u_1)\,d\tau + \int_{t_1}^{t} e^{a(t-\tau)} b\,| u_2 - v |\,d\tau \tag{94}$$

FIG. 6. Another example of $\varphi(t)$ behavior for which a suboptimal control policy is derived.

For $t_2 \le t \le T$,

$$| z(t) | \le \int_0^{t_1} e^{a(t-\tau)} b(\varphi + 1 - u_1)\,d\tau + \int_{t_1}^{t_2} e^{a(t-\tau)} b\,| u_2 - v |\,d\tau + \int_{t_2}^{t} e^{a(t-\tau)} b(u_3 - \varphi)\,d\tau \tag{95}$$

A control policy which gives an upper bound on the integral can be obtained by utilizing Eqs. (92), (93), and (95) to evaluate

$$\int_0^T | z(t) |\,dt \le \int_0^{t_1} K(T, \tau)\,e^{-a\tau}(\varphi + 1 - u_1)\,d\tau + \int_{t_1}^{t_2} K(T, \tau)\,e^{-a\tau}\,| u_2 - v |\,d\tau + \int_{t_2}^{T} K(T, \tau)\,e^{-a\tau}(u_3 - \varphi)\,d\tau \tag{96}$$

Since $K(T, \tau)$ is positive for nonzero $a$, Eq. (96) can be minimized by choosing $u$ to be

$$u_1 = 1, \qquad u_2 = v, \qquad u_3 = 0 \tag{97}$$
It can be seen, by similar arguments, that for every possible $r(\cdot)$ an upper bound can be obtained by breaking $\varphi$ up into different segments, each segment being of the same sign, and treating each segment independently. One suboptimal policy, therefore, is given by the rule:

whenever $\varphi > 0$ in a segment, $u = 1$ in the same segment
whenever $\varphi = 0$ in a segment, $u = v$ in the same segment
whenever $\varphi < 0$ in a segment, $u = 0$ in the same segment

This suboptimal policy is valid for all $\varphi(\cdot)$ behavior and is not restricted to that of Fig. 6.
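The sign rule above is simple enough to state as code. The following sketch implements the rule and, purely for illustration, compares its tracking cost with that of the constant control u = 0 by forward-Euler integration of dz/dt = az + b(u - v - φ). The names `suboptimal_control` and `tracking_cost`, the step count, and the test signal (φ = 1 on the first half of the interval and φ = -1 on the second, with v chosen according to Eq. (63)) are all assumptions.

```python
def suboptimal_control(phi, v):
    """Sign rule: u = 1 where phi > 0, u = 0 where phi < 0, u = v where phi = 0."""
    if phi > 0:
        return 1.0
    if phi < 0:
        return 0.0
    return v


def tracking_cost(policy, phi, v, a, b, T, n=2000):
    """Approximate the integral of |z| over [0, T] by forward Euler with n steps."""
    h = T / n
    z = 0.0
    cost = 0.0
    for k in range(n):
        t = k * h
        u = policy(phi(t), v(t))
        cost += abs(z) * h                              # rectangle rule for |z|
        z += h * (a * z + b * (u - v(t) - phi(t)))      # Euler step of Eq. (60)
    return cost


# Hypothetical test signal on [0, 1] with a = -1, b = 1.
phi = lambda t: 1.0 if t < 0.5 else -1.0
v = lambda t: 1.0 if t < 0.5 else 0.0      # consistent with Eq. (63)

cost_rule = tracking_cost(suboptimal_control, phi, v, a=-1.0, b=1.0, T=1.0)
cost_zero = tracking_cost(lambda p, w: 0.0, phi, v, a=-1.0, b=1.0, T=1.0)
```

Because Eq. (63) gives v = 1 when φ > 0 and v = 0 when φ < 0, the rule amounts to applying u = v at every instant; the sketch only illustrates the rule and makes no claim about how far its cost lies from the optimum.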

5. DYNAMIC PROGRAMMING FORMULATION

a. Functional Equations. In this section we will formulate first a functional equation for a one-dimensional control system where $\Omega$ consists of a finite number of points on the real line, and new controls can be exerted only at discrete time instants. Therefore, over each subinterval the same control is assumed to prevail. The $N$ such time subintervals are of duration $t_1, t_2, \ldots, t_N$, $\sum_{i=1}^{N} t_i = T$. We use the formulation of Eq. (60). The solution of Eq. (60) is given by

$$z(t) = z_0 e^{at} + e^{at} \int_0^t e^{-a\tau} b(u - v - \varphi)\,d\tau = -R(t) + z_0 e^{at} + e^{at} \int_0^t e^{-a\tau} b u\,d\tau \tag{98}$$

where³

$$R(t) = e^{at} \int_0^t e^{-a\tau} b(v + \varphi)\,d\tau \tag{99}$$

Define⁴

$$f_N(z_0, T) = \min_u \int_0^T | z(t) |\,dt \tag{100}$$

where there are $N$ subintervals in $[0, T]$ and hence $N$ independent choices of $u$ are permitted during the control interval. Therefore,

$$f_2(z_0, T) = \min_{u^1, u^2} \left\{ \int_0^{t_1} \left| z_0 e^{at} + \frac{b}{a}(e^{at} - 1) u^1 - R(t) \right| dt + \int_{t_1}^{T} \left| z_1 e^{a(t - t_1)} + e^{at} \int_{t_1}^{t} e^{-a\tau} b u^2\,d\tau - R(t) \right| dt \right\} \tag{101}$$

³ Although $a$ is taken to be a constant in this chapter, $a$ can be time-varying with minor complications in notation.
⁴ Another way of defining $f_N$ would be to minimize the right-hand side over possible choices of $t_i$, $i = 1, 2, \ldots, N$. Here it is assumed that the $t_i$'s are given a priori, for example, $t_i = T/N$, $i = 1, 2, \ldots, N$.
where

$$z_1 = z_0 e^{a t_1} + \frac{b}{a}(e^{a t_1} - 1) u^1 \tag{102}$$

Note that $u^i$, $i = 1, 2$, are constants. Equation (101) can be rewritten as

$$f_2(z_0, T) = \min_{u^1, u^2} \left\{ \int_0^{t_1} \left| z_0 e^{at} + \frac{b}{a}(e^{at} - 1) u^1 - R(t) \right| dt + \int_0^{t_2} \left| z_1 e^{at} + \frac{b}{a}(e^{at} - 1) u^2 - R(t_1 + t) \right| dt \right\} \tag{103}$$

The functional equation for general $N$ can now be written as

$$f_N(z_0, T) = \min \left\{ f_{N-1}(z_0, T - t_N) + \int_0^{t_N} \left| z_{N-1} e^{at} + \frac{b}{a}(e^{at} - 1) u^N - R\!\left(\sum_{i=1}^{N-1} t_i + t\right) \right| dt \right\} \tag{104}$$

where

$$z_{N-1} = z_{N-2}\, e^{a t_{N-1}} + \frac{b}{a}\left(e^{a t_{N-1}} - 1\right) u^{N-1} \tag{105}$$
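The stagewise minimization can be sketched numerically for a small finite control set. The code below is a brute-force forward recursion over the N subintervals, not a transcription of the functional equation above: each stage tries every admissible control, integrates dz/dt = az + b(u - w(t)) by forward Euler, and recurses on the remaining stages. Here w(t) stands for the demanded control v + φ = (dr/dt - ar)/b; the names `best_sequence`, `demand`, `controls`, and `durations`, and the step count, are assumptions. The work grows as the number of control values raised to the power N, so this is only practical for small N.

```python
def best_sequence(z0, a, b, demand, controls, durations, steps=200):
    """Minimize the integral of |z| for dz/dt = a*z + b*(u - demand(t)).

    u is held constant on each subinterval; controls is the finite set of
    admissible values and durations lists the subinterval lengths t_1..t_N.
    Returns (minimal cost, tuple of chosen controls).
    """
    def stage(z, t_start, dt, u):
        # Forward-Euler integration of one subinterval with constant u.
        h = dt / steps
        cost = 0.0
        for k in range(steps):
            t = t_start + k * h
            cost += abs(z) * h
            z += h * (a * z + b * (u - demand(t)))
        return cost, z

    def solve(i, z, t_start):
        if i == len(durations):
            return 0.0, ()
        best = None
        for u in controls:
            cost, z_next = stage(z, t_start, durations[i], u)
            tail_cost, tail = solve(i + 1, z_next, t_start + durations[i])
            total = cost + tail_cost
            if best is None or total < best[0]:
                best = (total, (u,) + tail)
        return best

    return solve(0, z0, 0.0)


# Hypothetical example: a = -1, b = 1 and r(t) = 1 - exp(-t), for which the
# demanded control is identically 1, so u = 1 on every subinterval tracks exactly.
cost, sequence = best_sequence(
    z0=0.0, a=-1.0, b=1.0,
    demand=lambda t: 1.0,
    controls=(0.0, 0.5, 1.0),
    durations=[1.0 / 3.0] * 3,
)
```

In this example the search recovers the control sequence (1, 1, 1) with zero cost, as expected when the desired trajectory can be duplicated by an admissible control.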

A functional equation similar to the above can be derived for a general $n$-dimensional case with norms such as

$$\| z \| = \sum_i | z_i |$$

or

$$\| z \| = \max_i | z_i |$$

The function $z(t)$ can be written for this case

$$z(t) = W(t) z^0 + \int_0^t W(t - \tau) B(u - v - \varphi)\,d\tau = -R(t) + W(t) z^0 + \int_0^t W(t - \tau) B u\,d\tau \tag{106}$$

where

$$R(t) \triangleq \int_0^t W(t - \tau) B(v + \varphi)\,d\tau \tag{107}$$

is an $n$-vector.
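If the norm is taken to be

$$\| z \| = \sum_{i=1}^{n} | z_i |$$

then,

(108)

where $u_k$, $k = 1, \ldots, m$, are the $m$ components of the control vector, and

$$f_2(z^0, T) = \min_{u^1, u^2} \left\{ \int_0^{t_1} \left\| W(t) z^0 + \int_0^t W(t - \tau) B u^1\,d\tau - R(t) \right\| dt + \int_{t_1}^{T} \left\| W(t - t_1) z^1 + \int_{t_1}^{t} W(t - \tau) B u^2\,d\tau - R(t) \right\| dt \right\}$$
$$= \min_{u^1, u^2} \left\{ \int_0^{t_1} \left\| W(t) z^0 + \int_0^t W(t - \tau) B u^1\,d\tau - R(t) \right\| dt + \int_0^{t_2} \left\| W(t) z^1 + \int_0^t W(t - \tau) B u^2\,d\tau - R(t_1 + t) \right\| dt \right\} \tag{109}$$

where the superscript on $u$ is used to denote time and

$$z^1 \triangleq W(t_1) z^0 + \int_0^{t_1} W(t_1 - \tau) B u^1\,d\tau \tag{110}$$

$$f_N(z^0, T) = \min \left\{ f_{N-1}(z^0, T - t_N) + \int_0^{t_N} \left\| W(t) z^{N-1} + \int_0^t W(t - \tau) B u^N\,d\tau - R\!\left(\sum_{i=1}^{N-1} t_i + t\right) \right\| dt \right\} \tag{111}$$

where

$$z^{N-1} = W(t_{N-1}) z^{N-2} + \int_0^{t_{N-1}} W(t_{N-1} - \tau) B u^{N-1}\,d\tau \tag{112}$$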

b. Computational Considerations. In Section III,B, it has been pointed out that the problem of minimizing an upper bound on the criterion function reduces to that of finding a best approximation to a given point $\varphi$ in the form of

$$\hat{\varphi} = \sum_{i=1}^{m} u_i C_i \tag{113}$$

where

$$| u_i | \le 1, \qquad i = 1, 2, \ldots, m \tag{114}$$

and where the $C_i$ are linearly independent.
Let us denote by $G$ the region of all possible $\hat{\varphi}$ given by Eqs. (113) and (114). Then, the best approximation of $\varphi$ when $\varphi \notin G$ is given by the projection of $\varphi$ onto the boundary of $G$ if the geometrical distance in $E^m$ is used as the norm.
When the $C_i$'s are orthonormal, this reduces simply to

$$u_i = \begin{cases} (\varphi, C_i) & \text{if } | (\varphi, C_i) | \le 1 \\ \operatorname{sgn}\,(\varphi, C_i) & \text{if } | (\varphi, C_i) | > 1 \end{cases} \tag{115}$$
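For orthonormal vectors the squared distance separates into one term per coefficient, so each inner product can be clipped to [-1, 1] independently. A minimal sketch of that orthonormal case follows; the function names `dot`, `bounded_coefficients`, and `combine`, and the example vectors, are assumptions for illustration.

```python
def dot(x, y):
    """Inner product of two vectors given as sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def bounded_coefficients(phi, basis):
    """Coefficients u_i with |u_i| <= 1 giving the nearest point to phi.

    Valid only when the vectors in basis are orthonormal: each inner
    product (phi, C_i) is clipped to the interval [-1, 1].
    """
    return [max(-1.0, min(1.0, dot(phi, c))) for c in basis]


def combine(coeffs, basis):
    """Form the approximating point sum_i u_i * C_i."""
    n = len(basis[0])
    return [sum(u * c[j] for u, c in zip(coeffs, basis)) for j in range(n)]


# Hypothetical orthonormal basis of the plane and a point outside the region G.
basis = [(1.0, 0.0), (0.0, 1.0)]
phi = (2.0, -0.25)
u = bounded_coefficients(phi, basis)
nearest = combine(u, basis)
```

Here the first coefficient saturates at 1 while the second is left at its unconstrained value, so the nearest point lies on a face of G rather than at a vertex.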
When the $C_i$'s are not orthogonal, finding the $u_i$'s is not as straightforward as the above when $\varphi \notin G$,⁵ since following the procedure of Eq. (114) does not result in the point of $G$ nearest to $\varphi$.
The question of a computational algorithm for finding this nearest point, therefore, arises. The following algorithm is offered as one suggestion:
Step 1: Compute $u_i$. This can be done most simply by using reciprocal bases to the $C_i$'s.
Step 2: If $| u_i | \le 1$, $i = 1, 2, \ldots, m$, this is the answer.
If at least one $u_i$ violates the inequality, then determine the pertinent "vertex" and "surfaces" of $G$ from the signs of the $u_i$'s. For example, if $u_1 < 0$ and $u_2, u_3, \ldots, u_m > 0$, then the vertex is the one given by

$$u_1 = -1, \qquad u_2 = u_3 = \cdots = u_m = 1$$

The surfaces are those whose edges join at the pertinent vertex.
Step 3: Project $\varphi$ onto one of the $m$ pertinent surfaces.
If the projection is inside the surface, then only one of the $u_i$'s is at the boundary.
If not, try the next surface. If both of the projections are unsuccessful (here, a success means that the projection is inside the surface), then the projection is on the edge of these two surfaces. Then, the dimension of the problem can be reduced at least by one.
The rest of the procedure follows in an obvious way.

E. Approximations by Integrator Outputs


1. INTRODUCTION

In this section, control systems are restricted to be simple integrators. For this simple type of control system and for sufficiently well behaved functions $r(t)$, it is possible to give a method of constructing a unique optimal control by means of elementary methods (24).
The problem proposed is to approximate a desired signal for a finite time by the output of an integrator whose input (control signal) is a positive bounded signal, the optimal approximation being that which minimizes the time integral of the absolute value of the difference between the output and the desired signal. We will give a mathematical formulation of the problem, properties of its solutions, and, for certain classes of desired functions, a method of constructing a unique optimal output.
As noted in Section III,A, Boltyanskii has treated a class of similar problems in which the integral of the squared error is to be minimized. More elementary methods seem appropriate, and they give results sufficiently complete to indicate the peculiarities of problems involving the integral of the absolute error. Proofs will usually be omitted or sketched, since their extension to more general linear servomechanisms will require a treatment more abstract than that given here.

⁵ When $\varphi \in G$, no complications arise and Eq. (114) gives a best approximation.

2. FORMULATION

A desired function $r$ will belong to the class $C$ of real continuously differentiable functions on the time interval $[0, 1]$. Corresponding to the output of an integrator whose initial condition is $k$, an output $x$ belongs to $G_k$, the class of real functions on $[0, 1]$ satisfying

$$0 \le x(t) - x(s) \le M(t - s), \qquad x(0) = k \tag{116}$$

for any $s$, $t$ such that $0 \le s \le t \le 1$. Given a particular desired function $r$, the initial value of the output is constrained to be $k = r(0)$; with no loss of generality we shall take $r(0) = 0 = k$, denoting $G_0$ by $G$.
For any $x$ in $G$ there exists a control $u$ in the class of functions $\Omega = \{u : 0 \le u(t) \le M,\ u \text{ Lebesgue measurable on } [0, 1]\}$ such that on $[0, 1]$

$$x(t) = \int_0^t u(s)\,ds \tag{117}$$

We wish to find an $x$ in $G$ which will minimize the absolute area between the graphs of $x(t)$ and $r(t)$, $0 \le t \le 1$, in the $t, x$ plane. This area (it is the $L_1$ distance between the functions) is denoted by

$$\| x - r \| \triangleq \int_0^1 | x(t) - r(t) |\,dt \tag{118}$$
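The two ingredients of this formulation, membership in G and the L1 distance of Eq. (118), can be approximated on a grid. The sketch below uses the midpoint rule for the integral and checks the increment condition of Eq. (116) only between neighboring grid points, which is a necessary test on the grid rather than a proof of membership. The function names `l1_distance` and `admissible_on_grid`, the grid sizes, and the tolerance are assumptions.

```python
def l1_distance(x, r, n=1000):
    """Midpoint-rule approximation of the integral of |x(t) - r(t)| over [0, 1]."""
    h = 1.0 / n
    return sum(abs(x((k + 0.5) * h) - r((k + 0.5) * h)) for k in range(n)) * h


def admissible_on_grid(x, M, n=1000, tol=1e-12):
    """Check x(0) = 0 and 0 <= x(t) - x(s) <= M*(t - s) for neighboring grid points."""
    h = 1.0 / n
    if abs(x(0.0)) > tol:
        return False
    for k in range(n):
        step = x((k + 1) * h) - x(k * h)
        if step < -tol or step > M * h + tol:
            return False
    return True


# Hypothetical example with M = 1: the zero output against r(t) = t.
distance = l1_distance(lambda t: 0.0, lambda t: t)
slow_ok = admissible_on_grid(lambda t: 0.5 * t, M=1.0)   # slope 0.5 is allowed
fast_ok = admissible_on_grid(lambda t: 2.0 * t, M=1.0)   # slope 2 exceeds M
```

For the zero output and r(t) = t the area is that of a triangle with unit base and unit height.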

For fixed $r$ in $C$, $\| x - r \|$ has a greatest lower bound for $x$ in $G$. Then from Ascoli's (Arzela's) Lemma (25) we obtain
THEOREM 1. There exists at least one function in $G$, denoted by $x_0$, such that

$$\| x_0 - r \| = \min_{x \in G} \| x - r \| \tag{119}$$

(Proof omitted.) $x_0$ is called an optimal output and the corresponding control $u_0$ is called an optimal control; $u_0$ is determined except on a set of points of measure zero.

3. NECESSARY CONDITIONS ON x0

THEOREM 2. Almost everywhere on $[0, 1]$ either $u_0(t) = 0$, $u_0(t) = M$, or $x_0(t) = r(t)$ with $0 < u_0(t) < M$. On any time interval $I$ where $x_0(t) \ne r(t)$, $u_0(t) = 0$ on at most one subinterval and $u_0(t) = M$ on at most one subinterval.

Proof. If $0 < r'(t) < M$ on $[0, 1]$ then $x(t) = r(t)$ on $[0, 1]$. Otherwise, there may exist at least one interval $I = (t_1, t_2)$ which is maximal with respect to the property $r(t) > x_0(t)$. Then $r(t_2) \ge r(t_1)$. The optimal output satisfies Condition A: $r(t) > x(t)$ on $I$, $x(t_1) = r(t_1)$, and either $t_2 = 1$ or $x(t_2) = r(t_2)$. For any $x$ in $G$ which satisfies Condition A,

$$\| x - r \| = \int_{t_1}^{t_2} [r(t) - x(t)]\,dt + V(x) \tag{120}$$

where $V(x)$ is independent of the values of $x$ on $I$. Therefore, for any $x$ in $G$ satisfying Condition A,

$$x(t) \le x_0(t) < r(t) \quad \text{on } I \tag{121}$$

since $x_0$ must minimize the integral in Eq. (120). The functional form of $x_0$ which satisfies these requirements is, on $I$,

$$x_0(t) = \begin{cases} r(t_1) + M(t - t_1) & \text{if } t_1 \le t \le t^*, \text{ where } u_0 = M \\ r(t_2) & \text{if } t^* < t \le t_2, \text{ where } u_0 = 0 \end{cases} \tag{122}$$

where

$$t^* = \min \{ t_2,\ t_1 + [r(t_2) - r(t_1)]/M \}$$

If there is an interval $I'$ which is maximal with respect to the property $x_0(t) > r(t)$, the configuration of the graph of $x_0$ on $I'$ is generally a segment of slope zero preceding a segment of slope $M$ (one or the other segment may not occur): if the $t, x$ plane were rotated through 180°, the optimization problem would be essentially unaltered and the configuration of the graphs would be similar to that given in the preceding paragraph.
The number of maximal intervals of type $I$ or $I'$ is countable, so the set of corner points $t^*$ (where $x_0(t^*) \ne r(t^*)$ and $0 < u_0(t^*) < M$) is countable, hence of measure zero, completing the proof.
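The piecewise form used in the proof, a segment of slope M starting at (t1, r(t1)) followed by a flat segment at height r(t2), can be written out directly. In the sketch below the corner time is the smaller of t2 and t1 + [r(t2) - r(t1)]/M; the function name `optimal_piece` and the numerical values are assumptions for illustration.

```python
def optimal_piece(t, t1, t2, r_t1, r_t2, M):
    """Value at time t (t1 <= t <= t2) of the slope-M-then-flat output.

    The output rises from r(t1) with slope M until the corner time t_star,
    then stays constant at the height r(t2).
    """
    t_star = min(t2, t1 + (r_t2 - r_t1) / M)
    if t <= t_star:
        return r_t1 + M * (t - t1)
    return r_t2


# Hypothetical interval: t1 = 0.2, t2 = 0.8, r(t1) = 0.1, r(t2) = 0.4, M = 1,
# so the corner is at t_star = 0.5.
rising = optimal_piece(0.35, 0.2, 0.8, 0.1, 0.4, 1.0)   # on the slope-M segment
flat = optimal_piece(0.70, 0.2, 0.8, 0.1, 0.4, 1.0)     # on the flat segment
corner = optimal_piece(0.50, 0.2, 0.8, 0.1, 0.4, 1.0)   # value at the corner
```

The two branches agree at the corner whenever the corner falls before t2, so the output is continuous there.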
For an example illustrating these relationships, see Fig. 7.

FIG. 7. An optimal output $x(t)$ either (R) follows $r(t)$, (0) has slope zero, or (M) has slope $M$.

Figure 7 also illustrates a significant consequence of the constraint $x(0) = 0$: if at a given time $t$ either $x_0(t) = 0$ or $x_0(t) = Mt$, then on the interval $[0, t)$ the control is constant, with either $u_0 = 0$ or $u_0 = M$ respectively. The maximal initial interval of this type is $[0, T)$,

$$T = \max \{ t \in [0, 1] : x_0(t) = 0 \text{ or } x_0(t) = Mt \} \tag{123}$$

This initial interval will be referred to again in Theorem 4.


If there were more than one output which minimized $\| x - r \|$, convergence troubles might arise in constructing them; however, we have

THEOREM 3. There exists exactly one optimal output $x_0$ for a given $r$ in $C$.
Proof. From Theorem 1, there exists at least one optimal output. Suppose there are two: $x_0$, with control $u_0$, and $y_0$, with control $v_0$. Construct, on $[0, 1]$, their average

$$z(t) = [x_0(t) + y_0(t)]/2$$

with control $w_0 = z'$ defined almost everywhere and satisfying, wherever defined,

$$w_0(t) = [u_0(t) + v_0(t)]/2$$

From Eq. (119) and the definition of $z$,

$$\| z - r \| \le \| x_0 - r \|/2 + \| y_0 - r \|/2 = \| x_0 - r \|$$

so $z$ and $w_0$ must be optimal also. But this cannot be true unless $u_0(t) = v_0(t)$ almost everywhere (the slopes $M/2$, $r'/2$, $(r' + M)/2$ cannot occur on intervals for an optimal output, by Theorem 2). Then by Eq. (117), $x_0$ is identical with $y_0$, which was to be proved.

4. THE EQUAL-TIME CRITERION

Let $F$ be the family of functions $r$ in $C$ such that the equations

$$r'(t) = 0, \qquad r'(t) = M$$

each have only a finite number of roots on $[0, 1]$.

For functions in $F$ one can prove by elementary techniques an additional necessary condition on the optimal output, the "equal-time criterion," given in the next theorem.
First, it is desirable to define the "signum function" by

$$\operatorname{sgn}(y) = \begin{cases} -1 & \text{if } y < 0 \\ \phantom{-}0 & \text{if } y = 0 \\ \phantom{-}1 & \text{if } y > 0 \end{cases} \tag{124}$$

THEOREM 4. Let $x_0$ be the optimal output, in $G$, corresponding to a given $r$ in $F$. Let $T$ be defined by Eq. (123). If the interval $(a, b)$, $a > T$, is maximal for either of the properties $u_0(t) = 0$, $u_0(t) = M$ (such an interval will be called critical) then

$$\int_a^b \operatorname{sgn}\,[x_0(t) - r(t)]\,dt = 0 \tag{125}$$

That is, on any critical interval the total time that $x_0(t) > r(t)$ equals the total time that $x_0(t) < r(t)$. The proof is omitted, except for the illustrative case (see Fig. 8) where $x_0(t) = r(t)$ except on $(a, b)$, where $u_0 = M$, and $(b, c)$, where $u_0 = 0$. The illustrated configuration of $x_0$ satisfies Theorem 2, with $t^* = b$; as will be seen below, the two critical intervals cannot be treated independently; we consider $x_0$ on $(a, c)$.
Let $h = x_0(b)$; if $h$ and $b$ are known, $a$ and $c$ are determined; indeed if $h$ and $b$ vary independently in small neighborhoods of their optimal values $h_0$, $b_0$, then $a(h, b)$ and $c(h, b)$ are continuous functions given by

$$h - r(a) = M(b - a), \qquad r(c) = h$$

Define $W(h, b) = \| x_0 - r \|$; then,

$$W(h, b) = \int_a^b | h - M(b - t) - r(t) |\,dt + \int_b^c | h - r(t) |\,dt \tag{126}$$

Since the integrands are zero at $a$ and at $c$, and the terms in $| h - r(b) |$ cancel,

$$\frac{\partial W}{\partial h} = \int_a^b \operatorname{sgn}\,[h - M(b - t) - r(t)]\,dt + \int_b^c \operatorname{sgn}\,[h - r(t)]\,dt \tag{127}$$

$$\frac{\partial W}{\partial b} = \int_a^b -M \operatorname{sgn}\,[h - M(b - t) - r(t)]\,dt \tag{128}$$

FIG. 8. The optimal values of $h$ and $b$ are determined by the equal-time criterion: $b - t_1 = t_1 - a$, $c - t_2 = t_2 - b$.

Thus Eq. (125) holds for both critical intervals if, at $h_0$, $b_0$, both partial derivatives vanish. Let $t_1$ and $t_2$ be the intersection times of $x_0$ and $r$ occurring in $(a, b)$ and $(b, c)$ respectively, in Fig. 8 (of course there might be many such times). These times are also continuous in $h$, $b$, and since

$$\frac{\partial W}{\partial h} = 2 t_1 - a - 2 t_2 + c \tag{129}$$

$$\frac{\partial W}{\partial b} = -M(2 t_1 - a - b) \tag{130}$$

these two partial derivatives are continuous in $b$ and $h$; therefore at a relative minimum of $W$ each must equal zero, as is well known from the theory of maxima and minima; so the optimal values $h_0$, $b_0$ satisfy

$$t_1 - a = b - t_1 \tag{131}$$

$$t_2 - b = c - t_2 \tag{132}$$
It can also be shown, by calculating the second derivatives of $W$ in terms of $r'(a)$, $r'(t_1)$, and $r'(t_2)$, that the stationary values given by Eqs. (131) and (132) actually give a relative minimum. For such simple configurations as this, we now have enough conditions to construct $x_0$ graphically.
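The equal-time condition of Eq. (125) can be checked numerically for a candidate segment by integrating the signum of x0 - r over the interval. The sketch below does so with the midpoint rule on a toy example, a constant segment of height h against r(t) = t on (0, 1); this is not a full optimal output, and the function name `signed_time` and the grid size are assumptions.

```python
def signed_time(x0, r, a, b, n=10000):
    """Midpoint-rule value of the integral of sgn(x0(t) - r(t)) over (a, b).

    The result is the time spent with x0 above r minus the time spent below.
    """
    h = (b - a) / n
    total = 0
    for k in range(n):
        t = a + (k + 0.5) * h
        d = x0(t) - r(t)
        total += (d > 0) - (d < 0)   # sgn(d) as an integer
    return total * h


# Toy segment: a constant output of height 0.5 or 0.6 against r(t) = t.
balanced = signed_time(lambda t: 0.5, lambda t: t, 0.0, 1.0)
unbalanced = signed_time(lambda t: 0.6, lambda t: t, 0.0, 1.0)
```

The height 0.5 spends equal time above and below r and so satisfies the criterion, while the height 0.6 spends 0.6 of the interval above r and 0.4 below, leaving a signed time of 0.2.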
If $r$ is in $C$ but not in $F$, the conclusion of Theorem 4 may fail to hold; for an example see Fig. 9, where the desired function itself has a segment with zero slope, or consider an analogous situation when a long segment with slope $M$ occurs. In Fig. 9 it can be seen that $x_0(t)$ must coincide with $r(t)$ on the flat segment, and that Eq. (125) is not satisfied (because the partial derivatives are discontinuous). In a "conventional" sense, however, the equal-time criterion still holds: we perturb $r$ slightly to obtain a new desired function in $F$, for which Theorem 4 holds, and approximate the optimal output for $r$ by the optimal output found for the function in $F$. This procedure is justified by

FIG. 9. A case where the equal-time criterion can be interpreted only as a "convention."

THEOREM 5. For any desired function $r$ in $C$ there exists an indexed set $\{r_d\}$ of functions in $F$, each satisfying Theorem 4, such that as $d \to 0$, $\| r - r_d \| \to 0$ and the optimal outputs $x_0$ (for $r$) and $x_d$ (for $r_d$) satisfy

$$| x_0(t) - x_d(t) | \to 0 \quad \text{uniformly on } [0, 1]$$

Consequently, in the remainder of the paper we shall consider only desired functions belonging to the family $F$.
The proof (omitted) of Theorem 5 proceeds along the following lines. By applying Weierstrass' Approximation Theorem (roughly, "within a given error, on a closed interval a given continuous function can be approximated uniformly by a polynomial") we show that for any $d > 0$, there exists $r_d$ in $F$ such that $\| r - r_d \| < d$. Then by applying Ascoli's Lemma and Theorem 3, we obtain a proof by contradiction that for a given $r$ in $C$ and any $c > 0$, there exists $d > 0$ such that if a function $r_d$ is in $C$ and $\| r - r_d \| < d$, then $| x_0(t) - x_d(t) | < c$ on $[0, 1]$; taking a sequence of $c$'s approaching zero and constructing the numbers $d = d(c)$ and the functions $r_d$ in $F$, we establish the theorem.
Given a function $r$ in $F$ whose oscillations are small compared to its overall increase, as in Fig. 7, one can graphically construct a single function in $G$ which satisfies all of the conditions given above; this function is necessarily $x_0$. Unfortunately, if oscillatory behavior predominates, as in Fig. 10, there may be two or more such functions $x_1, x_2, \ldots$. More information about the optimal output is needed to decide which of these candidates is actually $x_0$. The next section supplies this information for a restricted class of desired functions.

FIG. 10. Both $x_0$ and $x_1$ satisfy Theorems 1 through 4. Only $x_0$, which was constructed by Theorem 6, is optimal.

5. CONSTRUCTION OF OPTIMAL OUTPUTS

Let $F_1$ denote the family of desired functions which belong to $F$ and satisfy $r'(t) < M$, $0 < t < 1$.
If $r$ is in $F_1$ there are only a finite number of intervals maximal with respect to the property $0 \le r'(t) < M$; the corresponding arcs of the graph of $r$ will be called R-arcs. The graph of the corresponding $x_0$ will include at most one subarc (R-segment) from each R-arc. An arc of the graph of $x_0$ corresponding to a critical interval where $u_0 = 0$ is called an O-segment. (At no time does $u_0 = M$.) Because of Theorem 4, O-segments satisfy the equal-time criterion. Thus the graph of $x_0$ consists of alternating O-segments and R-segments; there may also be a segment with $u_0 = 0$ on an initial interval $[0, T)$ [see above, Eq. (123)].
If $x_0(t) \equiv h$ on a critical interval, we refer to $h$ as the height of the corresponding O-segment. For simplicity of exposition, we say that an O-segment begins at its left endpoint, and ends at its right endpoint.
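
The equal-time criterion for a segment of constant height can be checked numerically. The sketch below is only an illustration and assumes, as Eq. (134) suggests, that the criterion means the integral of $\operatorname{sgn}[h - r(t)]$ over the segment's time interval vanishes. The function name `equal_time_imbalance`, the midpoint rule, and the sinusoidal desired function are hypothetical choices, not taken from the text.

```python
import math

def equal_time_imbalance(r, h, sigma, tau, n=1000):
    """Midpoint-rule estimate of the integral of sgn(h - r(t)) over [sigma, tau]."""
    dt = (tau - sigma) / n
    total = 0.0
    for i in range(n):
        t = sigma + (i + 0.5) * dt
        diff = h - r(t)
        total += (diff > 0) - (diff < 0)   # sign of diff as -1, 0, or +1
    return total * dt

# Hypothetical desired function oscillating about 0.5 (an assumption for illustration).
def r_example(t):
    return 0.5 + 0.3 * math.sin(2.0 * math.pi * t)

# A segment at height 0.5 on [0, 1] spends equal time above and below r.
balanced = equal_time_imbalance(r_example, 0.5, 0.0, 1.0)
# A segment at height 0.65 lies above r for about two thirds of the interval.
too_high = equal_time_imbalance(r_example, 0.65, 0.0, 1.0)
```

A value near zero indicates that the candidate height balances the time spent above and below $r$; a positive value indicates that the segment sits too high.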
Consider a specific desired function in $F_1$, for example that of Fig. 10. If we construct O-segments between all pairs of R-arcs which can be thus connected, we find that some sequences of O-segments cannot be part of the graph of a function in $G$; after throwing out such abortive O-segments, the two remaining sequences shown in the figure both represent functions in $G$. Only one of them can be optimal.
The following theorem gives a construction and a sufficient condition for an optimal output, thus obviating the difficulties described in the preceding paragraph. First note that for $r$ in $F_1$, $x_0(1) \ge r(1)$. In the $(t, x)$ plane let $R_0$ denote the union of the half-line $\{t = 1,\ x \ge r(1)\}$ with the R-arc (if any) ending at $t = 1$. Let the O-segments of the optimal output be denoted by $O_1, O_2, \ldots$, in order of decreasing time (i.e., from right to left). If the O-segments are known, $x_0$ is determined.

THEOREM 6. If $r$ belongs to $F_1$, for an output $x^*$ to be optimal it is sufficient that the O-segments of the graph of $x^*$ be given by the following construction: in the class of all O-segments ending on $R_0$, $O_1$ has the greatest height $h_1$. The beginning of $O_1$ lies on an R-arc; we denote this arc as $R_1$. Using like notation and definitions for $j = 2, 3, \ldots$, $O_j$ ends on R-arc $R_{j-1}$ and begins on $R_j$; in the class of O-segments ending on $R_{j-1}$, $O_j$ has the greatest height $h_j$. The leftmost R-segment begins at height zero, determining the endpoint $T$ of the initial segment [cf. Eq. (123)].

Proof. Evidently $x^*$ belongs to $G$. If, for each $t$ in $[0, 1]$, either $r(t) \le 0$ or $x^*(t) = r(t)$, there is nothing to prove, so assume that both R- and O-segments occur. The segment $O_1$ corresponds to a time interval $I^* = (\sigma, \tau)$.
Among the functions $x$ on $I^*$ that satisfy Eq. (116), and that have $x(\sigma) = k$ unconstrained, the integral

$$J_1(x) = \int_{\sigma}^{1} | x(t) - r(t) | \, dt$$

is minimized by an optimal function $x$ which satisfies Theorems 1-4, with $[0, 1]$ replaced by $[\sigma, 1]$. In the hypothesis of Theorem 4 take $T = \sigma$, since $k$ is unconstrained. Now consider the last step in the proof of Theorem 3. Equation (117) must be replaced by

(133)
0

Theorem 3 is not used in the proof of Theorem 4, so for functions $r$ in $F$, we choose $k = k_0$ to satisfy the equal-time criterion $dJ_1/dk = 0$, which has a unique solution $k_0$ since $dJ_1/dk$ is continuous and monotonic. Therefore $x$ is unique.
By the construction of $O_1$, if $\tau < 1$ then $x(t) = r(t)$ on $[\tau, 1]$. Suppose $x$ is not identical with $x^*$ on $[\sigma, \tau]$. Then there must be a subinterval of $[\sigma, \tau]$ on which $x(t) < h_1$, and

$$\int_{\sigma}^{1} \operatorname{sgn}[x(t) - r(t)] \, dt < \int_{\sigma}^{1} \operatorname{sgn}[x^*(t) - r(t)] \, dt = 0 \tag{134}$$

contradicting the optimality of $x$; so the supposition is false, and $x$ is identical with $x^*$, which therefore minimizes $J_1$.
If, for $x$ in $G$, we define

$$K_1(x) = \int_{0}^{\sigma} | x(t) - r(t) | \, dt$$

there exists a function $x_1$ in $G$ which minimizes $K_1$ and satisfies Theorems 1-4 with $[0, \sigma]$ replacing $[0, 1]$. If $x_1(\sigma) < h_1$, then $x_1(t) < r(t)$ on a maximal interval $I_1$ ending at $\sigma$, but then by Eq. (122), on $I_1$, $x_1(t) = r(\sigma) = h_1$, a contradiction, so $x_1(\sigma) \ge h_1$. Suppose $x_1(\sigma) > h_1$. Then in the graph of $x_1$ there is an O-segment of height $h' > h_1$ ending at $\sigma$ and beginning on some R-arc $R'$. Consider an O-segment of height $h$ connecting $R'$ and $R_0$ (beginning at time $t'$ and ending at time $t_0$, both continuous functions of $h$). Then if

$$W(h) = \int_{t'}^{t_0} | h - r(t) | \, dt$$

as in the proof of Theorem 4, we have

$$\left. \frac{dW}{dh} \right|_{h_1} < 0 \quad \text{and} \quad \left. \frac{dW}{dh} \right|_{h'} > 0$$

and $dW/dh$ is continuous. Therefore there exists a height $h'' > h_1$ such that the O-segment between $t'(h'')$ and $t_0(h'')$ satisfies the equal-time criterion. This contradicts the definition of $O_1$ and $h_1$; therefore $x_1(\sigma) = h_1 = x^*(\sigma)$, and the function $y_1$ which is defined by

$$y_1(t) = \begin{cases} x_1(t), & 0 \le t \le \sigma \\ x^*(t), & \sigma < t \le 1 \end{cases}$$

is a member of $G$. For all $x$ in $G$,

$$\| x - r \| = K_1(x) + J_1(x) \ge K_1(x_1) + J_1(x^*) = \| y_1 - r \|$$

so $y_1$ is optimal on $[0, 1]$.


Now, replacing the interval $[0, 1]$ with the interval $[0, \sigma]$, we see that the problem of finding $x_1$, to minimize $K_1$, has replaced the original problem of finding the optimal output on $[0, 1]$. Thus $O_2, O_3, \ldots$ are shown to belong to the optimal output, $x_0$, on $[0, 1]$. Finally, since the number of O-segments is finite, we obtain a final R-segment that begins at height zero. Therefore $x^* \equiv x_0$, completing the proof.
In Fig. 10 one can easily use the above procedure to show that the output labelled $x_0$ is actually the (unique) optimal output.

6. EXTENSIONS

When the class $G$ is replaced by $G'$,

$$G' = \left\{ x : \frac{dx}{dt} = ax + u,\ 0 \le u \le M,\ u \text{ measurable in } [0, 1],\ x(0) = 0 \right\}$$

then the equal-time criterion (Theorem 4) must be modified by appropriate exponential weights. The simple geometric construction of Theorem 6 must be considerably modified. The rest of the theorems will be valid. It is hoped that this note is sufficiently indicative of the procedures one needs to adopt in constructing optimal control variables for this new problem.

IV. Existence of Optimal Controls

A. Introduction
In this section, we will discuss questions of suboptimal policies of a somewhat different nature. So far, the existence of optimal policies has been implicitly assumed and then questions of suboptimal policies have been discussed.
It is not always true, however, that optimal policies exist.
As will become evident later, the existence proof of optimal control makes essential use of compactness and convexity of a certain set $R(x(t), t)$ defined by the system differential equation and the set of admissible control vectors (26, 27, 28). The precise definition will be given later.
The development of this section is designed to show the essential nature of these two properties with as few extraneous factors as possible. Roughly speaking, the convexity of $R$ is needed for the existence of the control vector which realizes the optimal curve (trajectory) and the compactness is used for the existence of the minimum of certain continuous functions.
It will also be shown that by enlarging the set $R$ of the original problem to its convex closure, the $R$ set for the "relaxed" problem satisfies the convexity property and the optimal controls exist for "essentially bounded" problems (27, cf. Theorem 3.3).
When this is the case, the optimal curves (trajectories) for the relaxed problems can be shown to be uniformly approximated by curves (trajectories) of the original problems (27, Theorem 2.2). Thus, original curves can be regarded as suboptimal curves (when compared with optimal curves of the relaxed problems) even if optimal curves for the original problems do not exist.
Such suboptimal control is also known as sliding regimes (29). This connection will also become clear in the development.

B. Mathematical Formulation of Optimal Controls


In order to investigate the sufficient conditions for the existence of optimal control, we will begin by formulating optimal control problems mathematically.
The state of the control system is governed by the differential equation

$$\frac{dx(t)}{dt} = f(x(t), t, u(t)) \quad \text{a.e. in the control interval} \tag{135}$$

where $x$ and $f$ are vectors in $E_n$, Euclidean $n$-space,

$$x(t) = (x^1(t), \ldots, x^n(t))$$

$$f(x, t, u) = (f^1(x, t, u), \ldots, f^n(x, t, u))$$

with each $f^i(x, t, u)$, $1 \le i \le n$, continuous in every argument and continuously differentiable in $x$, and $u$ is the control vector (an $r$-vector),

$$u = (u_1, \ldots, u_r)$$

$u(t) \in Q(t, x)$ = the admissible set of control vectors; the set $Q$ is compact in $E_r$ and upper semicontinuous.

The cost function to be minimized is assumed to be given by

$$C(u) = \int_{t_0}^{t_1} g(x(t), t, u(t)) \, dt \tag{136}$$

where $g$ is continuous in every argument.


The control vector is optimal when it minimizes $C(u)$ and the conditions for the initial and final states are met. Let $g$ of Eq. (136) be denoted by $f^0$, and define $x^0$ by

$$\frac{dx^0}{dt} = f^0(x(t), t, u(t)) \tag{137}$$

$$x^0(t_0) = 0$$

The optimization problem is to minimize the $x^0$ component of the augmented state vector $\tilde{x}$ where

$$\frac{d\tilde{x}(t)}{dt} = \tilde{f}(x(t), t, u(t)) \quad \text{a.e. in the control interval} \tag{138}$$

and where

$$\tilde{x} = (x^0, x), \qquad \tilde{f} = (f^0, f)$$

The vectors $\tilde{x}$ and $\tilde{f}$ are, therefore, in $E_{n+1}$, subject to some constraints on the initial and final state vectors and possibly other constraints. For example, in time optimal problems, take $f^0 \equiv 1$.
Although the initial and the final state vectors, when the control is terminated, can be elements from certain closed sets in $E_n$ with more than one element, here they are taken to be single elements $x_i$ and $x_f$, and time optimal problems will be used as a vehicle of discussion. Other optimization problems can be discussed similarly with slight modifications.
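
The augmentation of the state by a cost coordinate can be sketched in a few lines. The code below is a minimal illustration, assuming a forward-Euler discretization and a hypothetical double-integrator system; the names `integrate_augmented`, `double_integrator`, and `unit_cost` are not from the text. With $f^0 \equiv 1$ the cost coordinate simply accumulates elapsed time, as in the time optimal case.

```python
def integrate_augmented(f, f0, x_init, u, t_end, steps=1000):
    """Forward-Euler integration of the augmented state (x0, x) with x0(0) = 0."""
    dt = t_end / steps
    x0, x, t = 0.0, list(x_init), 0.0
    for _ in range(steps):
        ut = u(t)
        dx = f(x, t, ut)
        x0 += f0(x, t, ut) * dt                         # cost coordinate, Eq. (137)
        x = [xi + dxi * dt for xi, dxi in zip(x, dx)]   # state equation, Eq. (135)
        t += dt
    return x0, x

# Hypothetical example system (an assumption): dx1/dt = x2, dx2/dt = u.
def double_integrator(x, t, u):
    return [x[1], u]

# Time optimal problems take f0 = 1, so x0 measures the elapsed time.
def unit_cost(x, t, u):
    return 1.0

cost, final_state = integrate_augmented(
    double_integrator, unit_cost, [0.0, 0.0], lambda t: 1.0, t_end=2.0
)
```

Any other integrand $g$ from Eq. (136) can be passed in place of `unit_cost`; the returned first component then approximates $C(u)$ for the chosen control.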
In order that the problem be not vacuous, it is assumed that at least one absolutely continuous curve $\zeta(t)$ exists with $u(t) \in Q(t, \zeta)$ such that it satisfies Eq. (135) a.e. in the control interval and the initial and final conditions.
For example, in a time optimal control problem, there is a finite time $\bar{T}$ such that $\zeta(0) = x_i$ and $\zeta(\bar{T}) = x_f$. We need, therefore, consider only those $x(t)$ which are absolutely continuous, satisfy Eq. (135) a.e. in $[0, T]$, and satisfy the conditions $x(0) = x_i$, $x(T) = x_f$ with $T \le \bar{T}$.
In a general optimal control problem with control interval $[t_0, t_1]$ we need consider only those $x(t)$ satisfying initial and final conditions and Eq. (135) a.e. in $[t_0, t_1]$ and such that $x^0(t_1) \le \zeta^0(t_1)$. Such curves are called admissible curves.
In what follows, we take $t_0 = 0$ without any loss of generality. If there are only a finite number of admissible curves, then there is an optimal one. Therefore, we will assume there are infinitely many admissible curves.
It is now proved that under the stated assumptions, there exists a measurable admissible control $u(t)$ which minimizes the cost. The proof given here is essentially a paraphrase of the proof by Filippov (26).
To establish the sufficient condition for the existence of optimal control, it is first necessary to show that from the class of all admissible curves one can choose a subsequence converging uniformly to a curve in the class, i.e., the class of admissible curves is sequentially compact in the topology of the uniform norm. For the class to be compact it is necessary and sufficient, by the theorem of Arzelà, to show that the admissible curves are uniformly bounded and equicontinuous^6 (25).
To ensure that admissible curves of Eq. (135) have these properties, we assume that the vector function $f(x(t), t, u(t))$ is continuous in every argument and $f$ has continuous partial derivatives with respect to $x$.
To guarantee that the admissible $x(t)$ stays bounded, it is required that $f$ does not become too large with $x$,^7 for example, by requiring that there exists $C > 0$ such that

$$x \cdot f \le C(\| x \|^2 + 1) \tag{139}$$

where

$$x \cdot f = \sum_{i=1}^{n} x^i f^i$$

Then from Eqs. (135) and (139),

$$\frac{dy(t)}{dt} \le 2Cy \tag{140}$$

where

$$y(t) = \| x \|^2 + 1, \qquad y(0) = \| x_i \|^2 + 1$$

The solution of Eq. (140) is majorized by the solution of

$$\frac{dz}{dt} = 2Cz, \qquad z(0) = y(0) \tag{141}$$

^6 The proof is usually given for real-valued functions, but the same proof applies to vector-valued functions.
^7 There are other ways of imposing conditions on the norm of $f$ to guarantee the boundedness of admissible curves in the control interval (28).
i.e., $y(t) \le z(t)$. The solution of Eq. (141) is

$$z(t) = A^2 e^{2Ct}$$

where

$$A^2 = \| x_i \|^2 + 1$$

hence,

$$\| x(t) \|^2 + 1 \le A^2 e^{2Ct}$$

therefore,

$$\| x(t) \| \le A e^{CT} \quad \text{for all } 0 \le t \le T \tag{142}$$

Thus, $x(t)$ is uniformly bounded.
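
The growth bound can be checked numerically on a simple case. The sketch below assumes a hypothetical scalar system $dx/dt = x + u$ with $|u| \le 1$, for which $x f = x^2 + xu \le x^2 + |x| \le \tfrac{3}{2}(x^2 + 1)$, so that $C = 3/2$ satisfies Eq. (139). The system, the forward-Euler integration, and the name `max_bound_ratio` are illustrative assumptions, not part of the text. The code compares the trajectory with the time-varying bound $A e^{Ct}$, which implies the uniform bound of Eq. (142).

```python
import math

def max_bound_ratio(x_init=1.0, C=1.5, t_end=1.0, steps=10000):
    """Largest value of |x(t)| / (A e^{Ct}) along an Euler solution of dx/dt = x + 1."""
    A = math.sqrt(x_init * x_init + 1.0)   # A^2 = ||x_i||^2 + 1
    dt = t_end / steps
    x, t, worst = x_init, 0.0, 0.0
    for _ in range(steps + 1):
        worst = max(worst, abs(x) / (A * math.exp(C * t)))
        x += (x + 1.0) * dt   # f(x, t, u) = x + u with the admissible choice u = 1
        t += dt
    return worst

ratio = max_bound_ratio()
```

A ratio that never exceeds one is consistent with the majorization argument; it does not replace the proof, since only a single control and initial state are sampled.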


The equicontinuity is ensured, for example, by assuring the existence of a constant $M$ such that

$$\left\| \frac{dx}{dt} \right\| = \| f(x, t, u) \| \le M \tag{143}$$

Since $f$ is continuous and $x$ and $t$ are bounded, such an $M$ exists if $u(t)$ is bounded in $[0, T]$. If the set of admissible controls is not a function of $(t, x)$ but some constant compact set in $E_r$, then this is immediate. When $u(t) \in Q(t, x)$, which is not constant, we assume that $Q(t, x)$ is closed and bounded for all $t$ in $[0, T]$ and $\| x \| \le A e^{CT}$. We also assume that for every $t$ and $x$ and $\epsilon > 0$, there exists a $\delta = \delta(\epsilon, t, x) > 0$ such that

$$Q(t', x') \subset U(Q(t, x), \epsilon)$$

for $| t - t' | < \delta$ and $\| x - x' \| < \delta$, where $U(F, \epsilon)$ is the union of all $r$-dimensional balls in $E_r$ with centers in $F$ and with radius $\epsilon$, i.e., $U(F, \epsilon)$ is an $\epsilon$-neighborhood of $F$. This property of $Q(t, x)$ is referred to as the upper semicontinuity (in $t$ and $x$) with respect to inclusion.
With these assumptions, the set $Q(t, x)$ is uniformly bounded for $0 \le t \le T$, $\| x \| \le A e^{CT}$, for otherwise there would be sequences $\{t_n\}$, $\{x_n\}$, and $\{u_n\}$ such that $t_n \to t$, $x_n \to x$, $u_n \in Q(t_n, x_n)$ and $\| u_n \| \to \infty$. But this is impossible since for every $\epsilon > 0$, there exist a $\delta = \delta(\epsilon, t, x)$ and an $N = N(\epsilon, t, x)$ such that if

$$| t_n - t | < \delta, \qquad \| x_n - x \| < \delta$$

then

$$u_n \in Q(t_n, x_n) \subset U(Q(t, x), \epsilon) \quad \text{for all } n \ge N$$
Since $Q(t, x)$ is bounded by assumption, so is its $\epsilon$-neighborhood. Thus, there is a constant $L$ such that

$$\| u(t) \| \le L \quad \text{for all } u(t) \in Q(t, x) \text{ in } 0 \le t \le T,\ \| x \| \le A e^{CT}.$$

Therefore, an $M$ exists for Eq. (143).
Next, choose a sequence $\{x_n(t)\}$ from the class of admissible curves where $x_n(0) = x_i$, $x_n(T_n) = x_f$, such that $T_n \to T^*$ where $T^*$ is the infimum of $T$ in the class of admissible curves. Note that infinitely many such elements exist by assumption.
Then, there exists a subsequence which converges to $x(t)$ by the compactness. Renumber the subsequence as $\{x_n(t)\}$. Note that $x(0) = x_i$ and $x(T^*) = x_f$.
Note that there is no solution of Eq. (135) which satisfies the boundary condition with $T < T^*$. Therefore, $x(t)$ is the optimal curve.
What now remains to be shown is the existence of a measurable $u(t) \in Q(t, x)$ which realizes $x(t)$. Because of Filippov's Lemma (26), if

$$\frac{dx(t)}{dt} \in R(t, x) \tag{144}$$

a.e. in $[0, T^*]$ where

$$R(t, x) = \{ f(x, t, u) : u \in Q(t, x) \} \tag{145}$$

then there exists a measurable $u(t) \in Q(t, x)$ which realizes the same $x(t)$. In proving Eq. (144), we will see that the convexity assumption on $R(t, x)$ is needed.
Since the curve $x(t)$ is absolutely continuous, the derivative exists a.e. in $[0, T^*]$ and $\| dx(t)/dt \| \le M$ a.e. in $[0, T^*]$.
Let $t_0 \in [0, T^*]$ be a point where $dx/dt$ exists. For every $\epsilon > 0$, there exists a $\delta_1 > 0$ such that

$$\left\| \frac{x(t) - x(t_0)}{t - t_0} - \frac{dx(t_0)}{dt} \right\| < \epsilon \quad \text{for } | t - t_0 | < \delta_1 \tag{146}$$

We will now show that $(x(t) - x(t_0))/(t - t_0)$ is in an $\epsilon$-neighborhood of $R(t_0, x(t_0))$. This will place $dx(t_0)/dt$ in a $2\epsilon$-neighborhood of $R(t_0, x(t_0))$. Since $f$ is continuous and $Q(t, x)$ is upper semicontinuous, $R(t, x)$ is also upper semicontinuous with respect to inclusion.
For the same $\epsilon$ as in Eq. (146), there also exists a $\delta_2 > 0$ such that

$$R(t, x) \subset U(R(t_0, x(t_0)), \epsilon) \quad \text{for } | t - t_0 | < \delta_2,\ \| x - x(t_0) \| < 2M\delta_2 \tag{147}$$

Equations (146) and (147) remain valid if $\delta_1$ and $\delta_2$ are replaced by $\delta = \min(\delta_1, \delta_2)$.
Since

$$\frac{x(t) - x(t_0)}{t - t_0} = \lim_{n \to \infty} \frac{x_n(t) - x_n(t_0)}{t - t_0} = \lim_{n \to \infty} \frac{1}{t - t_0} \int_{t_0}^{t} f(x_n(s), s, u_n(s)) \, ds$$

and, for sufficiently large $n$,

$$\| x_n(t) - x(t_0) \| \le \| x_n(t) - x_n(t_0) \| + \| x_n(t_0) - x(t_0) \| < 2M\delta \quad \text{for } | t - t_0 | < \delta$$

(the first term is at most $M\delta$ by Eq. (143) and the second tends to zero), we have $R(t, x_n(t)) \subset U(R(t_0, x(t_0)), \epsilon)$ and $f(x_n(t), t, u_n(t)) \in U(R(t_0, x(t_0)), \epsilon)$ for $| t - t_0 | < \delta$ and for sufficiently large $n$.
Since $U(R(t_0, x(t_0)), \epsilon)$ is convex from the assumed convexity of $R$, for sufficiently large $n$,

$$\frac{1}{t - t_0} \int_{t_0}^{t} f(x_n(s), s, u_n(s)) \, ds \in U(R(t_0, x(t_0)), \epsilon) \quad \text{for } | t - t_0 | < \delta \tag{148}$$

From Eqs. (148) and (146),

$$\frac{dx(t_0)}{dt} \in U(R(t_0, x(t_0)), 2\epsilon)$$

Since $\epsilon$ is arbitrary and $R(t_0, x(t_0))$ is closed,

$$\frac{dx(t)}{dt} \in R(t, x(t)) \quad \text{a.e. in } [0, T^*]$$

or there exists $u \in Q(t, x(t))$ such that

$$\frac{dx(t)}{dt} = f(x(t), t, u) \quad \text{a.e. in } [0, T^*]$$

A few remarks are now in order. If in Eq. (135), the system equation is linear as in Eq. (149)

$$\frac{dx}{dt} = A(t) x + \phi(t, u(t)), \quad u \in Q(t), \quad Q(t) \text{ bounded and closed} \tag{149}$$

then $R(t, x)$ is clearly convex. Thus, in this case only compactness of the set $\{\phi(t, u(t));\ u \in Q(t)\}$ is needed. For an independent proof of this see reference (30).
Thus, for linear problems the original problems are at the same time relaxed (27).

C. Sliding Regime
As an example to illustrate the essential nature of the convexity assumption and also by way of introducing the new topic of sliding control (29, 31), consider the time optimal control problem

$$\frac{dx}{dt} = -y^2 + u^2, \qquad \frac{dy}{dt} = u \tag{150}$$

$$| u(t) | \le 1$$

$$x(T) = 1, \quad y(T) = 0$$

$$x(0) = y(0) = 0$$

This example is due to Filippov (26).
The set $R$ is not convex. From Eq. (150), $dx/dt \le 1$, and since $y(t) \ne 0$ for a positive interval of time, $T > 1$.
Consider a sequence $| u_n(t) | = 1$ such that $| y_n(t) | \le 1/n$; then

$$\frac{dx_n}{dt} \ge 1 - \frac{1}{n^2}$$

From $x_n(T_n) = 1$, the duration of control is such that

$$1 < T_n \le 1 + \frac{1}{n^2 - 1}$$

The minimizing sequence, however, converges to $x(t) = t$, $y(t) = 0$ and $T^* = 1$, which does not satisfy Eq. (150). Thus, the optimal curve does not exist for this problem.
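
The behavior of the minimizing sequence can be reproduced numerically. The following sketch is a rough illustration under stated assumptions: it uses forward-Euler steps, reverses $u = \pm 1$ whenever the next step would leave the band $| y | \le 1/n$, and stops when $x$ reaches 1 without enforcing the terminal condition $y(T) = 0$. The name `time_to_target` and the step size are hypothetical.

```python
def time_to_target(n, dt=1e-4):
    """Chatter u = +/-1 so that |y| <= 1/n; return the time at which x reaches 1."""
    x, y, u, t = 0.0, 0.0, 1.0, 0.0
    while x < 1.0:
        # Reverse the control whenever the next step would leave the band |y| <= 1/n.
        if abs(y + u * dt) > 1.0 / n:
            u = -u
        x += (-y * y + u * u) * dt   # dx/dt = -y^2 + u^2, Eq. (150)
        y += u * dt                  # dy/dt = u
        t += dt
    return t

# Durations for increasingly fast switching (n >= 2 keeps dx/dt positive).
durations = {n: time_to_target(n) for n in (2, 5, 10)}
```

The computed durations stay above 1 and shrink toward it as $n$ grows, in line with the bound on $T_n$, while no finite switching rate attains $T = 1$.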
Gamkrelidze (29) indicates a way of relaxing the problem by considering

$$\frac{dx}{dt} = \sum_{i=1}^{2} p_i(t) (-y^2 + u_i^2), \qquad \frac{dy}{dt} = \sum_{i=1}^{2} p_i(t) u_i \tag{151}$$

where $p_1(t), p_2(t) \ge 0$, measurable, $p_1(t) + p_2(t) = 1$. Here $u_i$ are different modes of control and $p_i$ indicate the percentage in time in which each mode is utilized (31).
Applying the Maximum Principle (3),

$$H(\psi_1, \psi_2, x, y, u) = \psi_1 (-y^2 + u^2) + \psi_2 u$$

$$\sum_{i=1}^{2} p_i(t) H(\psi_1, \psi_2, x, y, u_i) = M(\psi_1, \psi_2, x, y) \tag{152}$$

Thus, every mode of control $u_i(t)$ must satisfy the same equation

$$H(\psi_1, \psi_2, x, y, u_i) = \max_{| u | \le 1} H(\psi_1, \psi_2, x, y, u) \tag{153}$$

From Eq. (152), if $\psi_2 = 0$, then $H$ will have two modes, $u = 1$, $u = -1$, which satisfy Eq. (153).
Since

$$\frac{d\psi_2}{dt} = -\frac{\partial H}{\partial y} = 2 \psi_1 y$$

$\psi_2 \equiv 0$ implies $y = 0$.

Thus the necessary condition for the two control modes to exist is that $y = 0$, i.e., the optimal trajectory requires two control modes $u = 1$, $u = -1$, switching infinitely often between them, and the trajectory will "slide" along the $x$-axis.
This infinite switching between $u = 1$ and $u = -1$, such that for any measurable subset $T$ of $[0, 1]$ the sets $\{t : t \in T,\ u(t) = 1\}$ and $\{t : t \in T,\ u(t) = -1\}$ each have measure one half of the measure of $T$, realizes the velocity vector $dx/dt = 1$, $dy/dt = 0$, which is not possible in the original problem.
The minimizing sequence $x_n(t)$, $y_n(t)$, however, uniformly approximates the optimal trajectory $(t, 0)$, $0 \le t \le 1$, within any given accuracy.
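
The velocity realized by the relaxed system is easy to evaluate directly. The sketch below assumes two modes with weights $p_1 = p_2 = 1/2$, as in the sliding regime just described; the function name `relaxed_velocity` is a hypothetical helper, not notation from the text.

```python
def relaxed_velocity(y, modes, weights):
    """Velocity (dx/dt, dy/dt) of the relaxed system, Eq. (151)."""
    dx = sum(p * (-y * y + u * u) for p, u in zip(weights, modes))
    dy = sum(p * u for p, u in zip(weights, modes))
    return dx, dy

# Equal-time mixing of u = 1 and u = -1 on the x-axis (y = 0).
sliding = relaxed_velocity(0.0, modes=(1.0, -1.0), weights=(0.5, 0.5))

# A single mode with dy/dt = 0 forces u = 0, which gives dx/dt = -y^2 <= 0.
single = relaxed_velocity(0.0, modes=(0.0,), weights=(1.0,))
```

The mixed pair yields the velocity $(1, 0)$, whereas a single admissible control that keeps $dy/dt = 0$ cannot move $x$ forward at all, which is why the limit trajectory is unattainable in the original problem.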

References

1. D. W. BUSHAW, Optimal discontinuous forcing terms. Ph.D. Thesis, Dept. Math., Princeton Univ., Princeton, New Jersey, 1952.
2. R. BELLMAN, I. GLICKSBERG, and O. GROSS, On the Bang-Bang control problem. Quart. Appl. Math. 14, 11-18 (1956).
3. L. S. PONTRYAGIN, V. G. BOLTYANSKII, R. V. GAMKRELIDZE, and E. F. MISHCHENKO, "The Mathematical Theory of Optimal Processes." Wiley (Interscience), New York, 1962.
4. M. AOKI, Dynamic programming approach to a final-value control system with a random variable having an unknown distribution function. IRE Trans. Autom. Control 5, 270-282 (1960).
5. R. BELLMAN and R. KALABA, Dynamic programming and adaptive processes: mathematical foundation. IRE Trans. Autom. Control 5, 5-10 (1960).
6. R. BELLMAN, "Dynamic Programming." Princeton Univ. Press, Princeton, New Jersey, 1957.
7. L. A. ZADEH, Introductory lectures on state space concepts. Proc. 1962 Joint Autom. Control Conf. pp. 10.1-10.5 (1962). New York University, Am. Inst. Elec. Engrs. Publ., New York.
8. R. BELLMAN and J. M. DANSKIN, Jr., A survey of the mathematical theory of time-lag, retarded control and hereditary processes. Rand Corp., Santa Monica, California, Rept. No. R-256 (1954).
9. R. BELLMAN, On the application of the theory of dynamic programming to the study of control processes. Proc. Symp. Nonlinear Circuit Anal. pp. 199-213. Polytechnic Inst. Brooklyn, New York, 1956.
10. R. BELLMAN, "Dynamic Programming and Stochastic Control Processes," Inform. Control 1, 228-239 (1958).
11. R. E. KALMAN and R. W. KOEPCKE, Optimal syntheses of linear sampling control systems using generalized performance indexes. Trans. ASME 80, 1820-1826 (1958).
12. C. W. STEEG and M. V. MATHEWS, Final-value control synthesis. IRE Trans. Autom. Control 2, 6-16 (1957).
13. R. C. BOOTON, Jr., Optimum design of final-value control systems. Proc. Symp. Nonlinear Circuit Anal. pp. 233-241. Polytechnic Inst. Brooklyn, New York, 1956.
14. R. C. BOOTON, Jr., Final-value systems with Gaussian inputs. IRE Trans. Inform. Theory IT-1, 173-175 (1956).
15. H. A. MEYER, "Symposium on Monte Carlo Methods." Wiley, New York, 1956.
16. D. BLACKWELL and M. A. GIRSHICK, "Theory of Games and Statistical Decision." Wiley, New York, 1954.
17. M. AOKI, On the approximation of trajectories and its applications to control systems optimization problems. Tech. Rept. 62-58, Dept. Eng., Univ. Calif. Los Angeles, California, 1962.
18. V. G. BOLTYANSKII, Application of the theory of optimal processes to problems of approximation of functions (Russian). Tr. Mat. Inst. Steklova 60, 82-95 (1961).
19. E. A. BARBASHIN, On the realization of motion along a given trajectory (English transl.). Autom. Remote Control 24, 507-593 (1961).
20. E. A. BARBASHIN, On a problem of the theory of dynamic programming. J. Appl. Math. and Mech. (English transl. of Priklad. Mat. i Mekh.) 24, 1002-1012 (1960).
21. N. I. ACHIESER, "Theory of Approximation" (English transl.). F. Ungar, New York, 1956.
22. F. H. KISHI, A suboptimal on-line discrete controller with bounded control variables. Trans. Inst. Electrical and Electronics Engineers Paper 63-1202 (1963).
23. R. BELLMAN, I. GLICKSBERG, and O. GROSS, Some nonclassical problems in the calculus of variations. Proc. Am. Math. Soc. 7, 87-94 (1956).
24. M. AOKI, D. L. ELLIOTT, and L. A. LOPES, Correction and addendum to: Minimizing integrals of absolute deviation in linear control systems. Unpublished report, 1963.
25. A. N. KOLMOGOROV and S. V. FOMIN, "Elements of the Theory of Functions and Functional Analysis," Vol. I (English transl.). Graylock Press, Rochester, New York, 1957.
26. A. F. FILIPPOV, On certain questions in the theory of optimal control (English transl.). J. Control Ser. A 1, 76-84 (1962).
27. J. WARGA, Relaxed variational problems. J. Math. Anal. Appl. 4, 111-128 (1962).
28. E. ROXIN, The existence of optimal controls. Mich. Math. J. 9, 109-119 (1962).
29. R. V. GAMKRELIDZE, Optimal sliding states (English transl.). Soviet Math. 3, 559-562 (1962).
30. L. W. NEUSTADT, The existence of optimal controls in the absence of convexity conditions. Aerospace Tech. Rept. A-62-1732.1-16. Aerospace Corp. Los Angeles, California, 1962.
31. R. A. NESBIT, The problem of optimal mode switching. Proc. Optimum System Syn. Conf. Tech. Rept. No. ASD-TDR-63-119 (1963). Wright-Patterson Air Force Base, Ohio.
The Pontryagin Maximum Principle
and Some of Its Applications
J A M E S S. MEDITCH
Aerospace Corporation,
Los Angeles, California

I. The Maximum Principle
   A. Problem Formulation and Fundamental Theorems
   B. Fixed Time Problems
   C. Transversality Conditions
   D. Discussion of Results
II. Properties of Optimal Controls
   A. The Servomechanism Problem
   B. A Class of Minimum Effort Controls
   C. Discussion
III. An Application Study
   A. Problem Formulation
   B. Optimal Thrust Program
   C. Lunar Hovering Mission
   D. Discussion of Results
References

An important result in control theory is the Pontryagin maximum principle (1) which was first introduced in 1956 (2). An especially appealing feature of this principle from the control system designer's viewpoint is its utility in establishing certain properties of optimal controls with a minimum of mathematical manipulation.
In this chapter, we shall summarize some of the fundamental results of the maximum principle and show how they may be exploited in control system studies. Since we shall be concerned only with interpreting and utilizing these results, we shall simply state the fundamental theorems without proof. A detailed mathematical derivation of the maximum principle and an outline of its development are available to the interested reader in the open literature (1, 3, 4).
We shall show how the maximum principle can be used to develop properties of optimal controls and thereby lend insight into system design. In conclusion, we shall present a design study in which the maximum principle is applied to develop an optimal thrust program for a lunar space mission.

I. The Maximum Principle

A. Problem Formulation and Fundamental Theorems

We shall consider physical processes whose behavior is governed by a system of ordinary differential equations:

$$\dot{x}_i = f_i(x_1, \ldots, x_n;\ u_1, \ldots, u_r), \quad i = 1, \ldots, n \tag{1}$$

The $x_i$, $i = 1, \ldots, n$, define the state of the process, and the $u_j$, $j = 1, \ldots, r$, define the state of the control. If we denote the vector $(x_1, \ldots, x_n)$ by $x$, the vector $(u_1, \ldots, u_r)$ by $u$, and the vector $(f_1, \ldots, f_n)$ by $f$, Eq. (1) can be written in the vector form

$$\dot{x} = f(x, u) \tag{2}$$

In Eq. (2), we shall call $x$ the state vector and $u$ the control vector; we shall denote the finite-dimensional vector space of the vector variable $x$ by $X$.
Physical processes for which time $t$ does not appear explicitly in the equations of motion [such as Eq. (2)] are termed autonomous systems. If time $t$ appears explicitly in the equations of motion, the process is called nonautonomous. We shall consider autonomous systems first.
For obvious physical reasons, we shall require that $u(t)$ be piecewise continuous and constrained such that $u(t) \in U$ for all $t$, where $U$ is a set in $r$-dimensional Euclidean space which is independent of $x$ and $t$. Every control $u(t)$ which satisfies these conditions will be called an admissible control.
For every $x \in X$ and $u \in U$, we assume that the $f_i$ are continuous in all of their arguments, and continuously differentiable with respect to the $x_i$.
We assume that we are given a fixed initial time $t_0$, an initial state $x(t_0) = x^0$, and a desired terminal state $x(t_1) = x^1$, $t_1 > t_0$. We assume that the terminal time $t_1$ may be either fixed or free.
Let us suppose that the quality of system performance is to be measured by the integral

$$J = \int_{t_0}^{t_1} f_0(x(t), u(t)) \, dt \tag{3}$$

where $f_0$ satisfies the same conditions as the $f_i$, $i = 1, \ldots, n$. The value of $J$ for a given admissible control is called the cost for that control.
Our optimization problem consists in determining an admissible control $u(t)$ which "transfers" the autonomous system of Eq. (2) from a given state $x(t_0) = x^0$ to another state $x(t_1) = x^1$ in such a manner that the cost is minimized. Such a control will be called an optimal control and the corresponding solution of Eq. (2) an optimal trajectory.
Since Eq. (2) defines an autonomous system, and the initial and terminal states, $x^0$ and $x^1$, respectively, are fixed, we shall refer to the system and its boundary conditions as an autonomous system with fixed endpoints.
We now define an additional state variable

$$x_0(t) = \int_{t_0}^{t} f_0(x(\tau), u(\tau)) \, d\tau \tag{4}$$

where $f_0$ is the integrand in Eq. (3), and $t_0 \le \tau \le t \le t_1$. We observe that $x_0(t_0) = 0$ and that $x_0(t_1) = J$. We also have that

$$\dot{x}_0 = f_0(x, u)$$

If we denote the $(n + 1)$-dimensional vectors $(x_0, x)$ and $(f_0, f)$ by $\mathbf{x}$ and $\mathbf{f}$, respectively, we may express the system defined by Eq. (2) together with the equation for $x_0$ by

$$\dot{\mathbf{x}} = \mathbf{f}(x, u)$$

[We shall use boldfaced letters to indicate $(n + 1)$-dimensional quantities.]
We now introduce a new set of variables $\psi_i$, $i = 0, 1, \ldots, n$, which must satisfy the system of linear, homogeneous differential equations

$$\dot{\psi}_i = -\sum_{j=0}^{n} \frac{\partial f_j(x, u)}{\partial x_i} \psi_j, \quad i = 0, 1, \ldots, n \tag{5}$$

where the partial derivatives are evaluated along an optimal trajectory. The solution of Eq. (5) is an $(n + 1)$-dimensional vector $\boldsymbol{\psi}(t) = (\psi_0(t), \psi_1(t), \ldots, \psi_n(t))$. We also introduce the so-called Hamiltonian $H$ which is defined by the relation

$$H(\boldsymbol{\psi}, x, u) = \boldsymbol{\psi}' \mathbf{f}(x, u) = \sum_{i=0}^{n} \psi_i f_i(x, u) \tag{6}$$

where the prime denotes the transpose.
From Eq. (6), we see that

$$\dot{x}_i = \frac{\partial H}{\partial \psi_i} \quad \text{and} \quad \dot{\psi}_i = -\frac{\partial H}{\partial x_i}, \quad i = 0, 1, \ldots, n \tag{7}$$

This system is termed the Hamiltonian system.

For fixed values of $\boldsymbol{\psi}$ and $x$, we shall denote the maximum of $H$ for $u \in U$ by

$$M(\boldsymbol{\psi}, x) = \sup_{u \in U} H(\boldsymbol{\psi}, x, u)$$
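
As an illustration of how the Hamiltonian and its maximum over the control set are evaluated, the sketch below uses a hypothetical time-optimal double integrator with $f_0 = 1$, $\dot{x}_1 = x_2$, $\dot{x}_2 = u$, and $U = [-1, 1]$. The system, the sample values of $\boldsymbol{\psi}$ and $x$, and the replacement of the supremum by a maximum over a finite grid of controls are all assumptions made for the example.

```python
def hamiltonian(psi, x, u):
    """H = psi' f for the hypothetical system f0 = 1, dx1/dt = x2, dx2/dt = u."""
    f = (1.0, x[1], u)   # (f0, f1, f2); f0 = 1 makes the cost equal to elapsed time
    return sum(p * fi for p, fi in zip(psi, f))

def maximize_hamiltonian(psi, x, controls):
    """Return (M, u_best): the largest H over a finite set of admissible controls."""
    u_best = max(controls, key=lambda u: hamiltonian(psi, x, u))
    return hamiltonian(psi, x, u_best), u_best

grid = [-1.0 + 0.1 * k for k in range(21)]   # samples of U = [-1, 1]
M, u_star = maximize_hamiltonian(psi=(-1.0, 0.5, -2.0), x=(0.0, 3.0), controls=grid)
```

Because $H$ is linear in $u$ for this system, the maximizing control sits on the boundary of $U$ with the sign of $\psi_2$, which is the familiar bang-bang structure.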

W e are now p r e p a r e d to state t h e first t h e o r e m of t h e m a x i m u m p r i n -


ciple (/, p . 19).

THEOREM 1. A necessary condition that u(t) and x(t) be optimal for
an autonomous system with fixed endpoints is that there exist a nonzero
continuous solution $\boldsymbol{\psi}(t)$ of Eq. (5) for which:

$$H(\boldsymbol{\psi}(t), x(t), u(t)) = M(\boldsymbol{\psi}(t), x(t)) \tag{8}$$

$$\psi_0(t) = \text{constant} \le 0 \tag{9}$$

and

$$M(\boldsymbol{\psi}(t), x(t)) = 0 \tag{10}$$

for $t_0 \le t \le t_1$.
For a nonautonomous system with fixed endpoints, the relevant
equations in the optimization problem formulation are:

$$\dot{x} = f(x, u, t)$$

$$J = \int_{t_0}^{t_1} f_0(x, u, t)\,dt$$

$$\dot{x}_0 = f_0(x, u, t)$$

$$\dot{\psi}_j = -\sum_{i=0}^{n} \frac{\partial f_i(x, u, t)}{\partial x_j}\,\psi_i, \qquad j = 0, 1, \ldots, n \tag{11}$$

and

$$M(\boldsymbol{\psi}, x, t) = \sup_{u \in U} H(\boldsymbol{\psi}, x, u, t),$$

where we require $x(t_0) = x^0$ and $x(t_1) = x^1$.
In addition to the conditions imposed on the $f_i$, $i = 0, 1, \ldots, n$,
above, we shall assume that the $f_i$ are continuous in $t$ and continuously
differentiable with respect to $t$. In this case, the maximum principle is
stated in the following theorem (1, pp. 60-61).
THEOREM 2. A necessary condition that u(t) and x(t) be optimal for
a nonautonomous system with fixed endpoints is that there exist a nonzero
continuous solution $\boldsymbol{\psi}(t)$ of Eq. (11) for which

$$H(\boldsymbol{\psi}(t), x(t), u(t), t) = M(\boldsymbol{\psi}(t), x(t), t) \tag{12}$$

$$\psi_0(t) = \text{constant} \le 0 \tag{13}$$

and

$$M(\boldsymbol{\psi}(t), x(t), t) = \int_{t_1}^{t} \sum_{i=0}^{n} \frac{\partial f_i(x(\tau), u(\tau), \tau)}{\partial t}\,\psi_i(\tau)\,d\tau \tag{14}$$

for $t_0 \le t \le t_1$.

B. Fixed Time Problems

In Theorems 1 and 2, the terminal time $t_1$ was assumed to be free,
i.e., not specified a priori. If the terminal time is fixed, the problem
involves one less parameter, viz., $t_1$, and Theorem 2 assumes the
following form (1, pp. 67-68).

THEOREM 3. A necessary condition that u(t) and x(t) be optimal for
the fixed time problem is that there exist a nonzero continuous solution
$\boldsymbol{\psi}(t)$ of Eq. (11) for which

$$H(\boldsymbol{\psi}(t), x(t), u(t), t) = M(\boldsymbol{\psi}(t), x(t), t) \tag{15}$$

and

$$\psi_0(t) = \text{constant} \le 0 \tag{16}$$

for $t_0 \le t \le t_1$.
We observe that there is one less condition in the theorem as a result
of fixing the terminal time. In the fixed time case, Theorem 1 assumes
the same form as Theorem 3 except that $t$ does not appear explicitly
in Eq. (15).

C. Transversality Conditions
Theorems 1-3 are valid for problems wherein the initial and terminal
states are specified a priori. In many problems, it is desirable that some
of the coordinates of the state vector be free at the initial and/or terminal
times. For example, in the launching of a sounding rocket, one may
wish to program the thrust in order to achieve maximum altitude at
burnout without regard for what the rocket's velocity is at burnout.
In this section, we shall consider certain conditions, called transversality
conditions (1, pp. 45-58), which must be satisfied when some
(or all) of the coordinates of the state vector are free at the initial and/or
terminal time(s). In order to expedite the presentation, we shall only
consider the transversality conditions for the case where some (or all)
of the coordinates of the state vector are free at the terminal time for an
autonomous system. In the sequel, we shall refer to the terminal state
as a variable right endpoint.
Before we can state the transversality conditions, we must introduce
some geometric concepts.
Analogous to the definition of surfaces in Euclidean three-space, we
may define hypersurfaces in the $n$-dimensional space $X$ by the equations

$$\begin{aligned} g_1(x_1, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ g_k(x_1, \ldots, x_n) &= 0 \end{aligned} \tag{17}$$

We shall assume that $k < n$ in the sequel. The hypersurfaces are said
to be smooth if all of the

$$\frac{\partial g_i}{\partial x_j}, \qquad i = 1, \ldots, k; \quad j = 1, \ldots, n$$

are continuous and nonzero. The set of all $x \in X$ which simultaneously
satisfy Eqs. (17) is called an $(n - k)$-dimensional smooth manifold in
$X$ if the vectors $\nabla g_i(x)$, $i = 1, \ldots, k$, where $\nabla$ denotes the gradient, are
linearly independent.
Let $S$ be a smooth $(n - k)$-dimensional manifold in $X$, and let
$x \in S$. Let $T_i$ be the tangent hyperplane (a plane in $n$-space) of the
hypersurface $g_i(x_1, \ldots, x_n) = 0$, $i = 1, \ldots, k$, at $x$. The intersection of
the $T_i$ is called the tangent plane $T$ of $S$ at $x$. It is clear that the dimension
of this tangent plane is $(n - k)$. Any $(n - k)$-dimensional vector
which lies in $T$ and emanates from $x$ is called a tangent vector of $S$ at $x$.
We now pose the problem of the variable right endpoint. Let $x^1 \in S$
where $S$ is a smooth $(n - k)$-dimensional manifold. We remark that $k$
of the coordinates of $x^1$ are fixed. Let us assume that we wish to "transfer"
the system of Eq. (2) from a given initial state $x(t_0) = x^0$ to some
state $x(t_1) = x^1 \in S$ in such a manner that the cost, Eq. (3), is minimized.
Now let $x^1$ be a specific point in $S$ and let $T$ be the tangent plane
of $S$ at $x^1$. Recall that $T$ is of dimension $p = n - k$, $k < n$. Now let
$u(t)$, $x(t)$, and $\boldsymbol{\psi}(t)$, $t_0 \le t \le t_1$, be the solution of the optimization
problem in Section I, A for which Theorem 1 is relevant. We say that
$\boldsymbol{\psi}(t)$ satisfies the transversality condition at $x(t_1)$ if the vector $\psi(t_1) =
(\psi_1(t_1), \ldots, \psi_n(t_1))$ is orthogonal to $T$. Equivalently, the transversality
condition is satisfied if $\psi(t_1)$ is orthogonal to every vector $v \in T$, i.e.,
$\psi'(t_1)v = 0$ where the prime denotes the transpose. Since $T$ is of dimension
$p$, we can obtain $p$ independent relations from the transversality
condition by substituting $p$ linearly independent vectors, $v^1, \ldots, v^p$, in
the relation $\psi'(t_1)v = 0$. Along with the $k$ coordinates of $x^1$ which are
known, this gives us $n = p + k$ conditions which must be satisfied at
the terminal time $t_1$. This is "equivalent" to knowing the $n$ conditions
on $x^1$. Hence, we have a "sufficient" set of boundary conditions.
We now state the theorem for autonomous systems with a variable
right endpoint.

THEOREM 4. A necessary condition that u(t) and x(t) be optimal for
an autonomous system with a variable right endpoint is that there exist
a nonzero continuous solution $\boldsymbol{\psi}(t)$ of Eq. (5) which satisfies the conditions
of Theorem 1 and the transversality condition.

D. Discussion of Results

We remark first of all that Theorems 1-4 are necessary conditions
for optimality. That is, if an optimal control exists for a given problem,
it must satisfy the conditions of the theorem relevant to the problem
formulation. On the other hand, satisfaction of the conditions of the
maximum principle by an admissible control does not necessarily imply
that the control is optimal.
We also remark that the question of the existence of optimal controls
is of fundamental importance (5-7). The question of uniqueness
(1, pp. 123-127) is also of interest. In an actual design, however, one may
be satisfied with having one optimal solution without regard for its
uniqueness.
From the problem formulation and the statement of the theorems,
it is clear that application of the maximum principle to a given problem
will, in general, lead to a nonlinear, two-point boundary value problem.
This is immediately clear for the fixed endpoint problems and the variable
right endpoint problem where the free coordinates of $x(t_1)$ lead
to fixed coordinates of $\psi(t_1)$ through the transversality condition. In
its general form then, the problem of synthesizing optimal controls has
no known general solution. However, some results have been obtained
for certain special cases (8-9).
II. Properties of Optimal Controls

In this section we shall illustrate the facility of the maximum principle
in determining certain properties of optimal controls. We shall restrict
ourselves to two rather general classes of optimization problems in order
to expedite the presentation. The approach, which is a straightforward
application of the maximum principle, is also applicable to other classes
of optimization problems.

A. The Servomechanism Problem

Let $y(t)$ be some desired state which we wish the state $x(t)$ of a physical
process to "follow" over a fixed interval $[0, T]$, such that an integral
of the form

$$S = \int_0^T \phi(y(t) - x(t))\,dt \tag{18}$$

is minimized. In Eq. (18), $\phi$ is a scalar-valued function of the difference
between $y(t)$ and $x(t)$. We assume that $\phi$ is continuous and continuously
differentiable in all of its arguments. We note that $S$ is a measure of
integrated system error. For example, if $\phi$ is the square of the Euclidean
norm, $S$ becomes the familiar integral-square-error (10).
Let us consider physical processes whose behavior is governed by
the system of ordinary differential equations

$$\dot{x} = f(x) + Bu \tag{19}$$

where $x$ is an $n$-dimensional vector, $f$ is an $n$-dimensional vector-valued
function of $x$, $B$ is a constant $n \times r$ matrix, and $u$ is an $r$-dimensional
vector. We assume that $f$ is continuous and continuously differentiable
in all of the $x_i$. We also assume that $x(0) = x^0$, the initial state of the
process in Eq. (19), is fixed, but that the terminal state $x(T)$ is free.
Hence, we have a fixed time problem with a variable right endpoint.
In this case, the manifold $S$ corresponds to the entire vector space $X$
of the vector variable $x$. Hence, any vector $v \in X$ is tangent to $S$. The
transversality condition then gives $\psi'(T)v = 0$ where the prime denotes
the transpose. Since this must hold for every $v \neq 0$ in $X$, it means that
$\psi_1(T) = \psi_2(T) = \cdots = \psi_n(T) = 0$. Because $\boldsymbol{\psi}(t)$ is nonzero, it then
follows that $\psi_0(T) \neq 0$, and therefore, that $\psi_0(t) = \text{constant} < 0$. We shall
let $\psi_0(t) = -1$, $0 \le t \le T$.
In minimizing Eq. (18), we shall restrict ourselves to controls $u(t)$
whose components are piecewise continuous and satisfy the constraint
$|u_i| \le 1$, $i = 1, \ldots, r$, for $0 \le t \le T$.
Our cost coordinate $x_0$ is defined by the relation

$$\dot{x}_0 = \phi(y - x)$$

From the definition of the Hamiltonian, we obtain

$$H(\boldsymbol{\psi}, x, u) = \psi' f(x) + \psi' B u - \phi(y - x)$$

where the prime denotes the transpose. It is clear that the Hamiltonian
is maximized if we set

$$u(t) = \operatorname{sgn}[B'\psi(t)], \qquad 0 \le t \le T \tag{20}$$

Hence, if an optimal control exists for our problem, it must assume the
form given in Eq. (20). We observe that if any component of $B'\psi(t) = 0$
on any subinterval in $[0, T]$, the form of the optimal control cannot be
determined by this method.
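The control law of Eq. (20) is simple enough to state as a short routine. The sketch below is illustrative only: the function name, the list-of-rows representation of $B$, and the use of `None` to flag a component for which $B'\psi$ vanishes (where Eq. (20) gives no information) are all assumptions made here, not part of the original treatment.

```python
def bang_bang_control(B, psi, tol=1e-12):
    """Eq. (20): u_i = sgn((B' psi)_i) for an n x r matrix B given as a list of rows.

    A component is returned as None when (B' psi)_i is zero to within tol,
    since the maximum principle does not determine the control there.
    """
    n, r = len(B), len(B[0])
    u = []
    for i in range(r):
        s = sum(B[j][i] * psi[j] for j in range(n))  # i-th component of B' psi
        if abs(s) <= tol:
            u.append(None)
        else:
            u.append(1.0 if s > 0 else -1.0)
    return u


# Hypothetical 3 x 2 input matrix and adjoint vector, for illustration only.
B = [[1.0, 0.0],
     [0.0, 2.0],
     [1.0, -1.0]]
print(bang_bang_control(B, [1.0, -1.0, 1.0]))
print(bang_bang_control(B, [0.0, 0.0, 0.0]))
```

Each component sits on the boundary $|u_i| = 1$ of the constraint set whenever the corresponding component of $B'\psi$ is nonzero, which is the "bang-bang" property.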
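From the second set of equations in Eq. (7), we have that

$$\dot{\psi}_i = -\sum_{j=1}^{n} \frac{\partial f_j(x)}{\partial x_i}\,\psi_j + \frac{\partial \phi(y - x)}{\partial x_i} \tag{21}$$

for $i = 1, \ldots, n$.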
We observe that the systems of Eqs. (19) and (21) constitute $2n$ ordinary
differential equations with boundary conditions $x(0) = x^0$ and
$\psi(T) = 0$. The two systems are "coupled" through Eq. (20), the optimal
control. Hence, we have a nonlinear, two-point boundary value problem
with mixed boundary conditions. That is, the initial conditions are on $x$
and the terminal conditions are on $\psi$. While it is known that the solution
for this problem exists, the computational problems associated with
obtaining a solution are apparent.
On the other hand, we note that the form of the optimal control
was obtained with virtually no effort. The optimal control, if it exists, is
"bang-bang" (providing no component of $B'\psi(t)$ vanishes on any nonzero
subinterval of $[0, T]$) and its switching times are governed by $\psi(t)$.
From Eq. (21), we observe that the form of $\psi(t)$ is, in turn, governed by
the form of both $f(x)$ and $\phi(y - x)$. Hence, changing the form of the cost
function integrand will, in general, change the switching times.
Assuming a solution for $\psi(t)$ can be obtained for $0 \le t \le T$, the
optimal system assumes the form shown in Fig. 1. We observe that the
optimal control is open-loop unless the solution for $\psi(t)$ can be "updated"
and made a function of the instantaneous state $x(t)$ of the physical
process.
[FIG. 1. Block diagram of optimal system for the servomechanism problem.]

B. A Class of Minimum Effort Controls

Let us consider a nonlinear physical process which is characterized
by the system of ordinary differential equations

$$\dot{x} = f(x) + g(x)u_1 \tag{22}$$

In Eq. (22), $x$ is an $n$-dimensional vector, $f$ and $g$ are $n$-dimensional
vector-valued functions of $x$, and $u_1$ is a scalar control variable. We
assume that $f$ and $g$ are continuous and continuously differentiable in all
of the $x_i$, $i = 1, \ldots, n$. We assume that the process of Eq. (22) is initially
in the state $x(0) = x^0$ and that we wish to transfer the process to the state
$x(T) = x^1$ in such a manner that the cost

$$S = \frac{1}{2}\int_0^T [u_1(t)]^2\,dt \tag{23}$$

is minimized. We assume that the terminal time $T$ is fixed. Equation
(23) is a measure of the control effort expended in effecting the transfer.
For this example, we shall allow $u_1(t)$, $0 \le t \le T$, to assume any
real value. Hence, the set $U$ of admissible controls is the entire real line.
We, of course, require $u_1(t)$ to be piecewise continuous.
From Eq. (23), our cost coordinate is defined by the relation

$$\dot{x}_0 = \tfrac{1}{2}(u_1)^2$$

The Hamiltonian for our problem is

$$H(\boldsymbol{\psi}, x, u) = \psi' f(x) + \psi' g(x) u_1 + \tfrac{1}{2}\psi_0 [u_1]^2, \qquad 0 \le t \le T$$

Since $\psi_0(t) = \text{constant} \le 0$, we can set $\psi_0(t) = -1$, $0 \le t \le T$, with
little loss of generality.
Since $-\infty < u_1 < \infty$, $0 \le t \le T$, the value of $u_1$ which maximizes
the Hamiltonian is obtained by setting

$$\frac{\partial H}{\partial u_1} = 0$$

from which we obtain

$$u_1(t) = \psi'(t)\,g(x(t)) = g'(x(t))\,\psi(t), \qquad 0 \le t \le T \tag{24}$$

where the prime denotes the transpose. We may consider $u_1(t)$ to be the
dot product of a time-varying gain $\psi(t)$ and the function $g$ of the process
state $x(t)$.
Here also, we remark that if an optimal control exists for our problem,
it must assume the form given in Eq. (24).
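That Eq. (24) does maximize the Hamiltonian (with $\psi_0 = -1$) can be confirmed by direct evaluation. The following sketch uses hypothetical numerical values for $\psi$, $f(x)$, and $g(x)$ at a single instant; the function names are illustrative assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def hamiltonian(psi, f_x, g_x, u1):
    """H = psi'f(x) + psi'g(x) u1 - (1/2) u1^2, i.e. the text's H with psi_0 = -1."""
    return dot(psi, f_x) + dot(psi, g_x) * u1 - 0.5 * u1 * u1


def minimum_effort_control(psi, g_x):
    """Eq. (24): u1 = psi' g(x)."""
    return dot(psi, g_x)


# Illustrative values only (not from the text).
psi = [1.0, 2.0]
f_x = [0.3, -0.7]
g_x = [3.0, -1.0]

u_star = minimum_effort_control(psi, g_x)
for u in (u_star - 0.1, u_star, u_star + 0.1):
    print(u, hamiltonian(psi, f_x, g_x, u))
```

Because $H$ is a concave quadratic in $u_1$, the stationary point given by Eq. (24) is its global maximum, and the neighboring values of $u_1$ printed above yield a smaller $H$.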
From the second set of equations in Eq. (7), we obtain

$$\dot{\psi}_i = -\sum_{j=1}^{n} \frac{\partial f_j(x)}{\partial x_i}\,\psi_j - \left[\sum_{j=1}^{n} \frac{\partial g_j(x)}{\partial x_i}\,\psi_j\right] u_1 \tag{25}$$

for $i = 1, \ldots, n$. We observe that the optimal control $u_1(t)$ appears in
Eq. (25), a fact which further complicates the problem.
As in the preceding example, we note that the form of the optimal
control was obtained very easily, but that the actual solution of the
problem (synthesis of the optimal control) requires solution of a nonlinear,
two-point boundary value problem.
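To make the two-point boundary value problem concrete, consider a deliberately simple scalar instance, chosen here only for illustration and not taken from the text: $f(x) = -x$ and $g(x) = 1$. Then Eq. (24) gives $u_1 = \psi$ and the adjoint equation reduces to $\dot{\psi} = \psi$. The sketch below finds the unknown initial value $\psi(0)$ by a shooting method (bisection on $\psi(0)$ with Runge-Kutta integration); shooting is one common numerical approach and is an assumption of this example, not a method advocated in the chapter.

```python
import math


def rhs(x, psi):
    """State and adjoint derivatives for the hypothetical case f(x) = -x, g(x) = 1."""
    u1 = psi                 # Eq. (24) with g = 1
    return -x + u1, psi      # Eq. (22) and the adjoint equation for this case


def terminal_state(x0, psi0, T, steps=400):
    """Integrate state and adjoint forward with classical RK4; return x(T)."""
    h = T / steps
    x, psi = x0, psi0
    for _ in range(steps):
        k1 = rhs(x, psi)
        k2 = rhs(x + 0.5 * h * k1[0], psi + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], psi + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], psi + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        psi += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x


def shoot(x0, x1, T, lo=-100.0, hi=100.0, iters=60):
    """Bisection on psi(0); x(T) increases monotonically with psi(0) here."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if terminal_state(x0, mid, T) < x1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


psi0 = shoot(x0=1.0, x1=0.0, T=1.0)
# For this linear case x(T) = x0*exp(-T) + psi(0)*sinh(T), which gives a closed-form check.
print(psi0, (0.0 - 1.0 * math.exp(-1.0)) / math.sinh(1.0))
```

In this linear instance the answer is available in closed form, which is what makes it useful as a check; for a genuinely nonlinear $f$ and $g$ no such formula exists, and the iteration on $\psi(0)$ is the computational burden referred to above.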
Assuming Eq. (25) can be solved, we obtain the structural form of
the optimal system as shown in Fig. 2.

[FIG. 2. Block diagram of optimal system for the minimal effort problem.]

C. Discussion of Results
We remark that the partial derivatives in Eqs. (21) and (25) must be
evaluated along an optimal trajectory. Hence, the problem of synthesizing
an optimal control would be greatly simplified if it were possible to
determine the initial conditions on these systems of equations. A technique
for determining these initial conditions has been obtained for certain
cases involving linear processes (9).
From the two examples considered above, we make two observations
regarding application of the maximum principle. First, it is clear that
the form of the optimal control (providing it exists), and, therefore, the
structure of the optimal control system are obtained essentially by
inspection of the Hamiltonian. Secondly, the maximum principle does not
provide any direct technique for synthesizing the optimal control.
Moreover, if the resulting two-point boundary value problem can be
solved, the corresponding control is open-loop.

III. An Application Study

In this section, we shall consider the problem of specifying the
optimal (minimal fuel) thrust program for the vertical flight of a rocket
in vacuo in a uniform gravitational field. In particular, we shall apply
our results to the problem of performing a lunar hovering mission.

A. Problem Formulation
The physical process which we shall consider is depicted in Fig. 3.
We assume that the rocket's motion is subject to the following conditions:
(1) the only forces acting on the vehicle are its own weight and the thrust,
which can only act in the positive $x$ direction; (2) the thrust is tangent to
the descent trajectory; (3) the acceleration of gravity is constant; (4) the
velocity of the exhaust gases is constant with respect to the vehicle;
and (5) the propulsion system is capable of delivering either zero or a
fixed mass flow rate, i.e., we assume a nonthrottable engine. Under
these assumptions, it is well known (11) that the motion of the rocket is
governed by the second-order ordinary differential equation

$$\ddot{x} = -\frac{k}{m}\,\dot{m} - g \tag{26}$$

[FIG. 3. Diagram of vertical motion of a rocket. Legend: $k$ = velocity of exhaust gases with respect to vehicle = constant; $\dot{m}$ = mass flow rate of propellant; $m$ = total mass; $g$ = acceleration of gravity; thrust = $-k\dot{m}$.]

In Eq. (26), $x$ is the altitude, $m$ is the total mass, $\dot{m}$ is the mass flow rate,
$k$ is the velocity of the exhaust gases with respect to the rocket, and $g$
is the acceleration of gravity. The single and double dots denote the
first and second time derivatives, respectively. Also, $k > 0$ and $\dot{m} \le 0$
in Eq. (26).
We assume that the initial altitude of the rocket is $x(0) > 0$, its initial
velocity is $\dot{x}(0) < 0$, and its initial mass is $m(0) > 0$. We assume that
we wish to transfer the rocket from this initial state to a terminal state
$(x(\tau), \dot{x}(\tau), m(\tau))$, where $x(\tau)$ and $\dot{x}(\tau)$ are specified a priori, and $m(\tau)$
and $\tau$ are free, such that the integral

$$S = -\int_0^{\tau} \dot{m}\,dt \tag{27}$$

is minimized. We observe that $S$ is simply the change in mass during
the transfer, and is, therefore, equal to the fuel consumption. (Our
reason for allowing the terminal time $\tau$ to be free will become apparent
later.)
In minimizing $S$, we shall restrict ourselves to thrust (mass flow rate)
programs for which $\dot{m}(t)$ is piecewise constant and can assume the
values of either $0$ or $-M$ on subintervals of the time interval $0 \le t \le \tau$.
Here, $M$ is a positive constant. (As a result of our assumption of a
nonthrottable engine, the problem consists, essentially, in determining the
switching times for the engine.)
Since

$$\frac{\dot{m}}{m} = \frac{d}{dt}(\ln m)$$

Eq. (26) can also be written as

$$\ddot{x} = -k\,\frac{d}{dt}(\ln m) - g \tag{28}$$

Integrating Eq. (28) between the limits of $0$ and $t$, we obtain

$$\dot{x}(t) = -k \ln \frac{m(t)}{m(0)} - gt + \dot{x}(0)$$

It then follows that we can achieve our desired terminal velocity $\dot{x}(\tau)$
if, and only if,

$$k \ln \frac{m(\tau)}{m(0)} = \dot{x}(0) - \dot{x}(\tau) - g\tau \tag{29}$$

If we denote the velocity difference $\dot{x}(0) - \dot{x}(\tau)$ by $\Delta V$ and solve
Eq. (29) for $m(\tau)$, we obtain

$$m(\tau) = m(0) \exp\left(\frac{\Delta V - g\tau}{k}\right) \tag{30}$$

Substituting Eq. (30) into Eq. (27), we have that

$$S = m(0)\left[1 - \exp\left(\frac{\Delta V - g\tau}{k}\right)\right]$$

Since $m(0)$, $\Delta V$, $g$, and $k$ are constants, we observe that the amount of
fuel required to effect the transfer of the rocket is a monotonic, strictly
increasing function of the terminal time $\tau$. Therefore, the minimal
fuel problem is equivalent to the minimal time problem. Our reason
for allowing $\tau$ to be free in the problem formulation is now clear. We
shall consider the minimal time problem in the sequel. (We note that
$S$ is independent of the altitude change in the transfer.)
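The monotonic dependence of fuel consumption on the terminal time can be seen by evaluating the expression for $S$ directly. In the sketch below, the function name and all numerical values (initial mass, velocity change, lunar gravity, exhaust velocity) are hypothetical and serve only to illustrate the trend.

```python
import math


def fuel_consumed(m0, delta_v, g, k, tau):
    """S = m(0) * [1 - exp((delta_v - g*tau) / k)], from Eqs. (27) and (30)."""
    return m0 * (1.0 - math.exp((delta_v - g * tau) / k))


# Hypothetical descent: the vehicle is falling, so delta_v = xdot(0) - xdot(tau) < 0.
m0 = 1000.0       # initial mass, kg (assumed)
delta_v = -100.0  # m/s (assumed)
g = 1.62          # lunar gravity, m/s^2 (assumed)
k = 3000.0        # exhaust velocity, m/s (assumed)

for tau in (10.0, 20.0, 30.0):
    print(tau, fuel_consumed(m0, delta_v, g, k, tau))
```

Each additional second of flight adds gravity loss that must be made up by the engine, so the shortest admissible transfer is also the cheapest in fuel.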
If we set $x = x_1$, $\dot{x}_1 = x_2$, $x_3 = m$, and $u = \dot{m}$, Eq. (26) can be
represented by the system of first-order ordinary differential equations:

$$\begin{aligned} \dot{x}_1 &= x_2 \\ \dot{x}_2 &= -\frac{k}{x_3}\,u - g \\ \dot{x}_3 &= u \end{aligned} \tag{31}$$

In the system of Eq. (31), $u$ is the control variable and can only assume
the values $0$ or $-M$ on subintervals of $[0, \tau]$. The boundary conditions
on Eq. (31) are $x_1(0) = x(0)$, $x_2(0) = \dot{x}(0)$, $x_3(0) = m(0)$, $x_1(\tau) = x(\tau)$,
and $x_2(\tau) = \dot{x}(\tau)$. The terminal mass $x_3(\tau) = m(\tau)$ is free. Hence, we
have an autonomous system with a variable right endpoint and free
terminal time. (We note that since $x_3$ is the total mass, only those cases
for which $x_3(t) > 0$, $0 \le t \le \tau$, are physically meaningful.)
For the minimal time problem, the cost coordinate is defined by
$\dot{x}_0 = 1$, i.e.,

$$S = \int_0^{\tau} dt = \tau$$

B. Optimal Thrust Program

The Hamiltonian for our problem is

$$H(\boldsymbol{\psi}, x, u) = \psi_1 x_2 - \psi_2\,\frac{k}{x_3}\,u - \psi_2 g + \psi_3 u + \psi_0 \tag{32}$$

From the second set of equations in Eq. (7), it follows that the $\psi_i$, $i = 1,
2, 3$, are solutions of the system of equations:

$$\begin{aligned} \dot{\psi}_1 &= 0 \\ \dot{\psi}_2 &= -\psi_1 \\ \dot{\psi}_3 &= -\frac{k}{x_3^2}\,\psi_2\,u \end{aligned} \tag{33}$$

It is clear from Eq. (32) that the Hamiltonian is maximized if we set

$$u(t) = \begin{cases} 0 & \text{whenever } \dfrac{k}{x_3(t)}\,\psi_2(t) < \psi_3(t) \\[2ex] -M & \text{whenever } \dfrac{k}{x_3(t)}\,\psi_2(t) > \psi_3(t) \end{cases} \tag{34}$$

for $0 \le t \le \tau$. We note that $u(t)$ appears to be indeterminate if

$$\frac{k}{x_3(t)}\,\psi_2(t) - \psi_3(t) = 0 \tag{35}$$

on any nonzero subinterval in $[0, \tau]$. However, using arguments similar
to those given elsewhere (12), it can be shown that the condition given
in Eq. (35) cannot hold on any finite closed interval in $[0, \tau]$.
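The selection rule of Eq. (34) can be checked against a direct comparison of the Hamiltonian, Eq. (32), at the two admissible control values. The sketch below is illustrative: the function names and the numerical values of the state and adjoint variables are assumptions, and `None` is used here to flag the indeterminate case of Eq. (35).

```python
def hamiltonian(psi, x, u, k, g):
    """Eq. (32), with psi = (psi0, psi1, psi2, psi3) and x = (x1, x2, x3)."""
    psi0, psi1, psi2, psi3 = psi
    x1, x2, x3 = x
    return psi1 * x2 - psi2 * (k / x3) * u - psi2 * g + psi3 * u + psi0


def thrust_program(psi2, psi3, x3, k, M):
    """Eq. (34): mass flow rate u is 0 or -M; None marks the case of Eq. (35)."""
    s = (k / x3) * psi2 - psi3
    if s > 0:
        return -M
    if s < 0:
        return 0.0
    return None


# Illustrative values only.
k, g, M = 3000.0, 1.62, 5.0
x = (1000.0, -50.0, 800.0)
psi = (-1.0, 0.5, 2.0, 1.0)

u = thrust_program(psi[2], psi[3], x[2], k, M)
print(u, hamiltonian(psi, x, 0.0, k, g), hamiltonian(psi, x, -M, k, g))
```

Since $H$ is linear in $u$, its maximum over the two-point set $\{0, -M\}$ is decided entirely by the sign of the coefficient of $u$, which is what `thrust_program` tests.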

C. Lunar Hovering Mission

Let us assume that a space vehicle is in the terminal descent phase
of a lunar mission and is descending vertically. Let us further assume
that we wish to program the thrust so that the vehicle will achieve
zero terminal velocity at some specified altitude (say a few hundred feet)
with a minimum expenditure of fuel.
Once this terminal condition is achieved, the vehicle can hover by
applying a thrust acceleration of one lunar $g$. Such a mission permits
inspection of the moon's surface, and, subsequently, choice of a possible
landing site if desired.
It can be shown (12) for our problem that there is at most one switching
during the descent and that this switching is from off to on. That is,
the optimal thrust program consists of either full thrust from the initiation
of the mission until the desired hovering altitude is achieved, or a
period of zero thrust (free-fall) followed by full thrust until the desired
hovering altitude is achieved.
Because of the relative simplicity of the optimal thrust program, it
can be synthesized by developing an appropriate switching function.
Development of a switching function consists in determining a relation
$f(x_1, x_2) = 0$ such that if the thrust is turned on when this relation is
first satisfied (and left on thereafter), the desired terminal conditions
$x_1(\tau) = \text{constant}$ and $x_2(\tau) = 0$ are achieved.
The optimal thrust program is then implemented by sensing the
altitude and velocity during descent, say by a radar altimeter and
Doppler radar, respectively; initiating thrust when $f(x_1, x_2) = 0$; and
continuing thrust until the desired terminal conditions are achieved.
We obtain the switching function by integrating the equations of
motion under the assumption that $u(t) = -M$ and determining the
relation which must exist between the altitude and velocity at the
initiation of thrusting in order to achieve the desired terminal conditions
in a time $\tau$. We let $0 \le t \le \tau$ be the interval over which thrusting
occurs. We let $x_1^*$, $x_2^*$, and $M_0$ be the altitude, velocity, and mass,
respectively, at the initiation of thrusting. We note that $x_3(t) = M_0 - Mt$
for $0 \le t \le \tau$.
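Integrating the equations of motion subject to the above assumptions,
we obtain

$$x_1(t) = \frac{kM_0}{M}\left(1 - \frac{M}{M_0}t\right)\ln\left(1 - \frac{M}{M_0}t\right) + kt - \tfrac{1}{2}gt^2 + x_2^* t + x_1^*$$

and

$$x_2(t) = -k\ln\left(1 - \frac{M}{M_0}t\right) - gt + x_2^*$$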
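The closed-form expressions above can be verified against a direct numerical integration of Eq. (31) with $u = -M$. The sketch below does so with a classical Runge-Kutta scheme; the function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def thrusting_state(t, x1s, x2s, k, g, M, M0):
    """Closed-form altitude x1(t) and velocity x2(t) during full thrust."""
    r = 1.0 - M * t / M0
    x1 = (k * M0 / M) * r * math.log(r) + k * t - 0.5 * g * t * t + x2s * t + x1s
    x2 = -k * math.log(r) - g * t + x2s
    return x1, x2


def integrate_thrust(t_end, x1s, x2s, k, g, M, M0, steps=2000):
    """Independent check: RK4 integration of Eq. (31) with u = -M."""
    h = t_end / steps

    def acc(t):
        # x2' = -(k / x3) * u - g with u = -M and x3 = M0 - M*t
        return k * M / (M0 - M * t) - g

    x1, x2, t = x1s, x2s, 0.0
    for _ in range(steps):
        k1v, k1x = acc(t), x2
        k2v, k2x = acc(t + 0.5 * h), x2 + 0.5 * h * k1v
        k3v, k3x = k2v, x2 + 0.5 * h * k2v
        k4v, k4x = acc(t + h), x2 + h * k3v
        x1 += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        x2 += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        t += h
    return x1, x2


# Hypothetical parameters: exhaust velocity, lunar gravity, flow rate, initial mass.
params = dict(k=3000.0, g=1.62, M=5.0, M0=1000.0)
print(thrusting_state(20.0, 2000.0, -100.0, **params))
print(integrate_thrust(20.0, 2000.0, -100.0, **params))
```

Agreement of the two results, to within integration error, confirms the signs and coefficients of the reconstructed expressions for $x_1(t)$ and $x_2(t)$.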

At $t = \tau$, we require $x_1(\tau) = \text{constant}$ and $x_2(\tau) = 0$. Hence, our
desired terminal conditions are achieved if, and only if,

$$x_1^* = -\frac{kM_0}{M}\ln\left(1 - \frac{M}{M_0}\tau\right) - k\tau - \tfrac{1}{2}g\tau^2 + x_1(\tau) \tag{36}$$

and

$$x_2^* = k\ln\left(1 - \frac{M}{M_0}\tau\right) + g\tau \tag{37}$$

Because of the transcendental nature of Eqs. (36) and (37), it is
difficult to eliminate the parameter $\tau$ and obtain an expression for
$f(x_1^*, x_2^*) = 0$. Instead, we shall obtain an approximate switching function
which is applicable to a number of cases of interest.
We note first that $M\tau/M_0$ is the fraction of the initial mass which
is consumed during thrusting. If we allow no more than 25% of the
initial mass to be fuel which may be used for the above mission, we can
utilize the approximation

$$\ln\left(1 - \frac{M\tau}{M_0}\right) \approx -\frac{M\tau}{M_0} - \frac{M^2\tau^2}{2M_0^2}$$

for which it can be shown that the error is at most 2.23%.
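The quoted error bound is easy to confirm numerically. The sketch below evaluates the relative error of the two-term approximation over the admissible range of the mass fraction $\varepsilon = M\tau/M_0$; the function names are illustrative assumptions.

```python
import math


def log_approx(eps):
    """Two-term approximation: ln(1 - eps) ~ -eps - eps^2 / 2."""
    return -eps - 0.5 * eps * eps


def relative_error(eps):
    """Relative error of the approximation with respect to the exact logarithm."""
    exact = math.log(1.0 - eps)
    return abs(log_approx(eps) - exact) / abs(exact)


for eps in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(eps, relative_error(eps))
```

The relative error grows with $\varepsilon$, so its largest value on $0 < \varepsilon \le 0.25$ occurs at $\varepsilon = 0.25$, where it is about 2.2 percent, in agreement with the figure stated above.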


Substituting the approximation of the ln function into Eqs. (36) and
(37) and simplifying, we obtain

$$x_1^* = a\tau^2 + x_1(\tau) \tag{38}$$

and

$$x_2^* = -2a\tau - b\tau^2 \tag{39}$$

respectively, where

$$a = \frac{1}{2}\,\frac{kM - gM_0}{M_0}$$

and

$$b = \frac{kM^2}{2M_0^2}$$

In order that the engine possess the capability of decelerating the
vehicle during its descent, it is necessary that $kM > gM_0$. Hence,
$a > 0$, and since $x_1^* - x_1(\tau) > 0$ must hold in order that the problem
be meaningful, it follows that the only value of $\tau > 0$ which satisfies
Eq. (38) is

$$\tau = \sqrt{\frac{x_1^* - x_1(\tau)}{a}} \tag{40}$$

Substituting Eq. (40) into Eq. (39) and simplifying, we obtain

$$f(x_1^*, x_2^*) = \frac{b}{a}\left[x_1^* - x_1(\tau)\right] + 2a\sqrt{\frac{x_1^* - x_1(\tau)}{a}} + x_2^* = 0 \tag{41}$$


We remark that the set of all initial states from which it is possible to
reach the desired terminal state $(x_1(\tau), 0)$ using full thrust for $0 \le t \le \tau$
is comprised of those states $(x_1^*, x_2^*)$ which satisfy Eq. (41).
Since the relation $x_1^* > x_1(\tau)$ must hold, and we have assumed that
the vehicle is descending, i.e., the velocity is negative, we are only
interested in the behavior of $f(x_1^*, x_2^*)$ in the fourth quadrant of the $x_1$-$x_2$
plane where $x_1 > x_1(\tau)$ and $x_2 < 0$. Moreover, since $0.25\,M_0/M$ is the
maximum value which $\tau$ can assume if only 25% of the initial mass
is fuel which may be used for the mission, it follows from Eqs. (38)
and (39) that the constraints

$$x_1^* \le 0.0625\,a\left(\frac{M_0}{M}\right)^2 + x_1(\tau)$$

and

$$x_2^* \ge -0.5\,a\left(\frac{M_0}{M}\right) - 0.0625\,b\left(\frac{M_0}{M}\right)^2$$

must also hold.
A plot of $f(x_1^*, x_2^*)$ for the region of the fourth quadrant of the
$x_1$-$x_2$ plane which is of interest is given in Fig. 4.

[FIG. 4. Plot of switching function and example of free-fall trajectory for lunar hovering mission. Axes: $x_1$ (altitude) and $x_2$ (velocity).]

The free-fall trajectory of our space vehicle for a given initial altitude
$\gamma_1$ and initial velocity $\gamma_2$ is given by

$$x_1 = \gamma_1 - \frac{1}{2g}\left[(x_2)^2 - (\gamma_2)^2\right]$$
A typical free-fall trajectory is shown in Fig. 4. It is clear that a given
free-fall trajectory and the switching function cannot intersect more than
once.
Now let $\gamma_1$ and $\gamma_2$ be the altitude and velocity, respectively, at the
initiation of the mission, and assume that $f(\gamma_1, \gamma_2) > 0$, i.e., the point
$(\gamma_1, \gamma_2)$ lies above the switching function curve (see Fig. 4). The optimal
thrust program is clear: as the vehicle is falling, measured values of
altitude $x_1$ and velocity $x_2$ are substituted in the relation

$$f(x_1, x_2) = \frac{b}{a}\left[x_1 - x_1(\tau)\right] + 2a\sqrt{\frac{x_1 - x_1(\tau)}{a}} + x_2$$

As long as $f(x_1, x_2) > 0$, the vehicle is allowed to free-fall. As soon as
$f(x_1, x_2) = 0$, thrusting is initiated and continues until the desired
terminal conditions are achieved. The computation involved is simple
enough that it can be performed in real time by a small special-purpose
digital computer. Thrust cutoff and switching into the hovering mode
occur when the measured velocity becomes equal to zero. We remark
that thrust cutoff could also be programmed to occur when the vehicle
reaches the desired hovering altitude.
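The thrust program just described lends itself to a simple simulation. The sketch below is an illustration under stated assumptions, not the flight implementation: the vehicle parameters and initial conditions are hypothetical, a fixed-step Euler-type integrator stands in for the vehicle dynamics, and the switching test uses $f \le 0$ because a sampled system will generally step past the exact zero crossing.

```python
import math


def switching_function(x1, x2, x1_target, k, g, M, M0):
    """Approximate switching function of Eq. (41)."""
    a = 0.5 * (k * M - g * M0) / M0
    b = k * M * M / (2.0 * M0 * M0)
    d = max(x1 - x1_target, 0.0)   # guard the square root below the target altitude
    return (b / a) * d + 2.0 * a * math.sqrt(d / a) + x2


def simulate_descent(gamma1, gamma2, x1_target, k, g, M, M0,
                     dt=0.01, max_steps=1_000_000):
    """Free-fall until f <= 0, then full thrust until the velocity reaches zero."""
    x1, x2, m = gamma1, gamma2, M0
    thrusting = False
    for _ in range(max_steps):
        if not thrusting and switching_function(x1, x2, x1_target, k, g, M, M0) <= 0.0:
            thrusting = True               # thrust initiation signal
        if thrusting and x2 >= 0.0:
            break                          # thrust cutoff signal
        accel = (k * M / m - g) if thrusting else -g
        x2 += accel * dt
        x1 += x2 * dt
        if thrusting:
            m -= M * dt
    return x1, x2, m


# Hypothetical mission: start at 3000 m descending at 50 m/s, hover at 100 m.
final = simulate_descent(3000.0, -50.0, 100.0, k=3000.0, g=1.62, M=5.0, M0=1000.0)
print(final)
```

Because Eq. (41) rests on the truncated logarithm, the simulated cutoff altitude need not coincide exactly with the target altitude; the discrepancy shrinks as the fraction of mass burned during thrusting decreases.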
If $\gamma_1$ and $\gamma_2$ are such that $f(\gamma_1, \gamma_2) = 0$ at the initiation of the mission,
then thrusting commences immediately. However, if $f(\gamma_1, \gamma_2) < 0$
initially, accomplishment of the mission is beyond the capability of the
propulsion system. That is, a higher thrust level than that assumed is
needed to achieve the desired terminal conditions. We shall assume that
$f(\gamma_1, \gamma_2) \ge 0$ initially, i.e., that the mission and the propulsion system
are compatible.
A block diagram of the optimal system is given in Fig. 5. The functions
to be performed by the computer were discussed above.

[FIG. 5. Block diagram of optimal system for lunar hovering mission. Blocks: radar altimeter, Doppler radar, propulsion system, vehicle. Signals: thrust initiation signal, thrust cutoff signal.]


D. Discussion of Results
We have given a particular example of how the maximum principle
can be used to effect a preliminary design. The actual design of a complete
system for the assumed mission would, of course, involve considerably
more detail than we have presented here. In any event, we have
established the general form of the thrust program and have indicated
the type of hardware needed to implement the optimal system.

References

1. L. S. PONTRYAGIN, V. G. BOLTYANSKII, R. V. GAMKRELIDZE, and E. F. MISHCHENKO,
in "The Mathematical Theory of Optimal Processes" (L. W. Neustadt, ed.).
Wiley, New York, 1962 (translated by K. N. Trirogoff).
2. V. G. BOLTYANSKII, R. V. GAMKRELIDZE, and L. S. PONTRYAGIN, On the theory of
optimal processes. Dokl. Akad. Nauk SSSR 110, 7-10 (1956).
3. V. G. BOLTYANSKII, R. V. GAMKRELIDZE, and L. S. PONTRYAGIN, The theory of
optimal processes. I. The maximum principle. Izv. Akad. Nauk SSSR Ser.
Mat. 24, 3-42 (1960); English transl. in Am. Math. Soc. Transl. [2] 18, 341-382
(1961).
4. J. S. MEDITCH, in "Status of Modern Control System Theory" (C. T. Leondes, ed.),
Chapt. VII. McGraw-Hill, New York, 1964.
5. E. B. LEE and L. MARKUS, Optimal control for nonlinear processes. Arch. Rational
Mech. Anal. 8, 36-58 (1961).
6. E. ROXIN, The existence of optimal controls. Mich. Math. J. 9, 109-119 (1962).
7. L. W. NEUSTADT, The existence of optimal controls in the absence of convexity
conditions. J. Math. Anal. Appl. 7, 110-117 (1963).
8. M. ATHANS, P. L. FALB, and R. T. LACOSS, On optimal control of self-adjoint systems.
Proc. 1963 Joint Automatic Control Conference, Minneapolis, Minnesota pp. 111-120
(June, 1963).
9. L. W. NEUSTADT and B. PAIEWONSKY, On synthesizing optimal controls. Proc.
2nd Intern. Congr. Intern. Fed. Automatic Control, Butterworths, London (to
appear).
10. G. C. NEWTON, Jr., L. A. GOULD, and J. F. KAISER, "Analytical Design of Linear
Feedback Controls." Wiley, New York, 1957.
11. A. MIELE, in "Optimization Techniques: With Applications to Aerospace Systems"
(G. Leitmann, ed.), Chapt. 4. Academic Press, New York, 1962.
12. J. S. MEDITCH, On the problem of optimal thrust programming for a lunar soft
landing. Proc. 1964 Joint Automatic Control Conference, Stanford, California
pp. 233-238 (June, 1964).
Control of Distributed Parameter Systems¹

P. K. C. WANG²
International Business Machines Corporation,
San Jose Research Laboratory,
San Jose, California

I. Introduction
II. System Description
   A. Physical Description
   B. Mathematical Description
III. Intrinsic Properties
   A. Stability
   B. Controllability
   C. Observability
IV. Optimum Control
   A. Problem Formulation
   B. Functional Equations; Maximum Principle
   C. Linear Systems
V. Problems in Approximation and Computation
   A. Approximate Systems
   B. Computational Problems
VI. Practical Aspects of Control
VII. Concluding Remarks
References

I. Introduction

Recent developments in control theory have concentrated primarily
on systems whose dynamic behavior can be adequately described by
ordinary differential equations. In view of the present trend of rapidly
advancing science and technology, it is most likely that future automatic
control systems will call for more stringent design specifications
and more complex control objectives, particularly in industrial processes
and aerospace systems. This generally requires the consideration of a
more accurate mathematical description of the systems to be controlled,
the development of a more sophisticated control theory, and the
exploration of new methods of implementation.

¹ This research was supported in part by the United States Air Force through
Flight Dynamics Laboratory, Research and Technology Division, Wright-Patterson
Air Force Base, under contract No. AF 33(657)-11545.
² Member of Research Staff.

Fundamentally speaking, all physical systems are intrinsically distributed
in nature. However, in many physical situations, the system's
spatial energy distribution is sufficiently concentrated or invariant in
form during the course of motion so that an approximate lumped parameter
description may be adequate. On the other hand, the spatial
energy distributions of many practical systems are widely dispersed,
and it is of interest to maintain precise control of certain spatially
distributed physical variables. This generally requires the direct
consideration of distributed parameter mathematical models which are in the
form of partial differential equations or integral equations. Typical
examples of this class of systems are continuous furnaces, distillation
processes, and nuclear and chemical reactors.
The basic approach underlying almost all the existing works on the
control of distributed parameter systems has been first to approximate
the distributed model by a corresponding spatially discretized
model, and then to design a control system via the established theory for
lumped parameter systems. Such an approach is natural from the practical
standpoint. However, it does not provide deep insight into the
general control problem associated with distributed parameter systems,
since certain salient features of the system behavior may be obscured or
lost by the discrete approximation. Furthermore, it does not yield
any quantitative information on the relationship between the discretization
level and the actual performance of the controlled system. Therefore,
it is desirable to develop a unified control theory within the framework
of distributed parameter systems, and then proceed to establish
rational criteria for approximation. On the other hand, since lumped
parameter systems can be regarded as a particular case of distributed
parameter systems, the development of such a theory would represent
another step in the hierarchy of general control system theory.
The first serious work in this direction was initiated by Butkovskii
and Lerner (1-7). Their work, up to the present time, has been
concentrated on problem formulation and the derivation of a maximum
principle for a certain class of distributed parameter systems governed
by a set of nonlinear integral equations. Subsequently, Egorov (8)
studied in detail the optimum control problem associated with a particular
linear diffusion system with various performance indices. Recently,
Wang and Tung (9) presented a general discussion of various aspects of
optimum control of distributed parameter systems. In particular, the
notions of controllability and observability were extended to distributed
systems. Also, certain functional equations associated with optimization
were derived via the technique of dynamic programming.
In view of the meager results in this area, it is impossible at this time
to present a comprehensive treatment of this topic. Therefore, the
objective here is twofold: the establishment of precise definitions and
concepts, which may serve as starting points for further development
toward a unified control theory; and the presentation of a broad
perspective on various problems in this area. A natural
preliminary approach to the development of such a theory is to extend
as much as possible certain fundamental concepts and known results
associated with lumped parameter systems to distributed parameter
systems. The main portion of the present work will be oriented in
this direction.
In the development of this work, the existing results as well as
recent new results will be delineated. At the same time, various potential
problems along with their difficulties will be discussed in detail. An attempt
will be made to show the relationship between various results for
distributed parameter systems and those for their corresponding lumped
parameter systems.
It is hoped that this work will provide some stimulation for further
investigations in this area, and also inject some of the viewpoints and
ideas in control theory into the field of classical continuum mechanics.

II. System Description

One of the basic prerequisites for the analytical design of control
systems is the establishment of an adequate mathematical model of the
system to be controlled. The derivation of such a model generally
requires considerable physical insight. The model should have the
simplest possible form while preserving the dominant system dynamic
characteristics. It should also reveal the overall physical structure of
the system.
In many practical situations, the system can be described only in
some statistical sense due to measurement errors and random fluctuations
in the system parameters. This generally leads to a set of stochastic
dynamic equations. In this work, the discussion will be restricted
to deterministic mathematical models which are in the form of a set
of partial differential equations or integral equations. Systems of a
stochastic nature will be discussed elsewhere.

A. Physical Description

Most physical systems which are of interest to control can be represented
by the block diagram shown in Fig. 1. The environmental effects
on the system are represented by input variables which consist of a set
of manipulatable quantities (control variables) and a set of nonmanipulatable
quantities (disturbances). The output transformation corresponds
to a set of transducers or measuring instruments which monitor certain
system variables and transform them into a set of output quantities.

[FIG. 1. Block diagram with blocks labeled ENVIRONMENT, SYSTEM, and OUTPUT
TRANSFORMATION, and signals labeled DISTURBANCE and OUTPUT.]

For a distributed parameter system defined on a finite spatial domain,
both the control and the disturbance variables may be distributed
throughout the interior of the spatial domain and/or enter at the domain
boundary. The latter form of input is most common, since only the
system boundary is in direct physical contact with the external world.
The distributed input is more difficult to realize physically. In certain
degenerate cases (e.g., heating of thin rods), spatially nonuniform
boundary inputs can be considered as distributed inputs.
The output of a distributed parameter system may be in the form of
a set of spatially-dependent time functions and/or a set of spatially-independent
time functions. The latter form generally results from the
use of measuring instruments of the spatial-averaging type.
From the physical standpoint, it is convenient to classify
distributed parameter systems according to their spatial domain properties
as follows:
(1) Fixed-Domain Systems. The system's spatial domain is a specified,
connected subset of an M-dimensional Euclidean space whose
boundary is composed of a finite number of continuous surfaces. The
domain boundaries remain time-invariant with respect to a given inertial
coordinate system. However, material flow may occur in and out of the
specified domain.
(2) Variable-Domain Systems. In these systems, the domain boundaries
vary with time. The boundary motion may be either a specified function
of time or dependent upon certain variables defined over the entire
domain or subsets of the domain. These systems arise in physical
situations where the system itself is a deformable body (e.g., an elastic
body), or where the system medium is composed of interacting multiphase
components, such as those resulting from the heating (cooling) of a solid
(liquid) substance beyond its melting (freezing) temperature; the motion of the
liquid-solid domain boundaries depends upon the heat transfer rate in the
system. Also, variable-domain systems can arise when the spatial
domain boundary of the distributed system corresponds to the material
boundary of a lumped parameter system whose motion depends upon
certain physical variables (such as fluid pressure) evaluated at the
boundary.
For fixed-domain systems, the system state at any time can generally be
specified by a set of functions defined on the spatial domain. For
variable-domain systems, additional variables specifying the instantaneous
boundary position, velocity, etc., are usually necessary to specify the system
state completely.

[FIG. 2. Continuous furnace, showing REGION I and REGION II, the strip
velocity v(t), the coordinates x_1 and x_2, and the temperature
distribution f_2(t, x_1).]

In order to clarify the above system classification and to provide some
motivation for the later developments on the control of distributed
systems, a few simple examples of various types of distributed parameter
systems will be discussed. Particular attention will be focused on their
salient physical features and mathematical models. The control problems
associated with each system will be briefly delineated. Some of the
examples will be used for illustrating other ideas in the remaining
portion of this work.
Continuous Furnace (System 1). Consider a continuous furnace as
shown in Fig. 2. A continuous, homogeneous material strip with uniform
thickness is fed into the furnace by means of a variable-speed transport
mechanism. It is assumed that the temperature distributions $f_1$ and $f_2$
are spatially uniform except along the $x_1$-direction in the respective
regions I and II of the furnace, and that they can be varied between the
following constant limits:

$$F_{1l} \le f_1(t, x_1) \le F_{1u}, \qquad F_{2l} \le f_2(t, x_1) \le F_{2u} \qquad \text{for all } t \text{ and all } x_1 \in (0, 1)$$

If the material strip is sufficiently thin and narrow, and $f_1 = f_2 = f_0$,
its temperature distribution $u(t, x_1)$ can be approximately described by
the following one-dimensional, linear heat diffusion equation:

$$\frac{\partial u(t, x_1)}{\partial t} = \mu \frac{\partial^2 u(t, x_1)}{\partial x_1^2} + v(t) \frac{\partial u(t, x_1)}{\partial x_1} + \sigma \left[ u(t, x_1) - f_0(t, x_1) \right] \tag{1}$$

with boundary conditions $u(t, 0) = u(t, 1) = 0$, and where $\mu$ is the
coefficient of diffusivity, $\sigma$ is a constant proportional to the surface
conductivity of the material, and $v$ is the material strip velocity.
In the case where the material strip is thick and stationary, and $f_1$ and
$f_2$ are independent of $x_1$, the equation governing the temperature
distribution within the material strip is

$$\frac{\partial u(t, x_2)}{\partial t} = \mu \frac{\partial^2 u(t, x_2)}{\partial x_2^2} \tag{2}$$

with boundary conditions $u(t, h) = f_2(t)$ and $u(t, 0) = f_1(t)$.


For the above cases, the system state at any time $t$ can be specified
by $u(t, x_1)$ or $u(t, x_2)$. The manipulatable control variables are $v$, $f_1$,
and $f_2$. In practical situations, it may be desirable to maintain close
control of the temperature distribution of the material strip inside the
furnace or the strip temperature at the furnace exit.
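
The behavior of the thick, stationary strip can be explored numerically. The following Python sketch integrates the heat diffusion equation $\partial u/\partial t = \mu\, \partial^2 u/\partial x_2^2$ with $u(t, 0) = f_1(t)$ and $u(t, h) = f_2(t)$ by an explicit finite-difference scheme. It is an illustration only and not part of the original development: the function name `simulate_strip`, the uniform grid, and the choice of an explicit scheme are assumptions, and the boundary temperatures are supplied as ordinary functions of time.

```python
def simulate_strip(mu, h, f1, f2, u0, dt, steps):
    """Explicit finite differences for du/dt = mu * d2u/dx2 on 0 <= x2 <= h.

    u0 holds the initial temperatures on a uniform grid that includes both
    faces of the strip; f1(t) and f2(t) are the face temperatures
    (the control variables). All names here are illustrative assumptions.
    """
    n = len(u0)
    dx = h / (n - 1)
    r = mu * dt / dx ** 2
    if r > 0.5:
        # Standard stability bound for the explicit scheme.
        raise ValueError("explicit scheme needs mu*dt/dx^2 <= 0.5")
    u = list(u0)
    for k in range(steps):
        t = (k + 1) * dt
        new = u[:]
        for j in range(1, n - 1):
            # Second difference approximates d2u/dx2 at interior points.
            new[j] = u[j] + r * (u[j + 1] - 2.0 * u[j] + u[j - 1])
        new[0] = f1(t)    # boundary condition u(t, 0) = f1(t)
        new[-1] = f2(t)   # boundary condition u(t, h) = f2(t)
        u = new
    return u


# Usage: hold the two faces at constant temperatures and let the strip settle.
profile = simulate_strip(mu=1.0, h=1.0, f1=lambda t: 0.0, f2=lambda t: 1.0,
                         u0=[0.0] * 11, dt=0.004, steps=2000)
```

With constant face temperatures the computed profile approaches a straight line between $f_1$ and $f_2$, which is the steady state of the diffusion equation. A control study would replace the constant functions by time-varying ones.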
Multicomponent Ion Exchange Column (10) (System 2). Figure 3
shows an ion exchange column consisting of a packed bed of ion exchange
resin which is brought into contact with an ionic solution flowing through
it. For simplicity, a ternary ionic system will be considered. It is assumed
that the solute at any point in the bed undergoes exchange with the resin
phase at that point and that its diffusion in the axial direction is negligible.
At any distance $x$ from the top of the bed, a material balance equation
may be written for any ion component as follows:

$$v \frac{\partial u_{s(i)}(t, x)}{\partial x} + \eta c_0 \frac{\partial u_{s(i)}(t, x)}{\partial t} + \rho Q \frac{\partial u_{r(i)}(t, x)}{\partial t} = 0 \tag{3}$$

where $u_{s(i)}$ and $u_{r(i)}$ are the solution and resin concentrations of
component "i" respectively, and

$c_0$ is the total ionic concentration of the solution
$v$ is the solution volumetric flow rate
$Q$ is the total resin capacity
$\eta$ is the void fraction of the packed bed
$\rho$ is the density of packing

The three terms in Eq. (3) represent the concentration at any point in
the bed caused by solution flow through that point, liquid holdup in the
voids at that point, and the exchange with the resin phase at that same
point.

[FIG. 3. Ion exchange column, with labels INFLUENT and PACKED BED WITH
VOID FRACTION.]

In addition to Eq. (3), an independent equation for the exchange rate
of each species must be specified,

$$\frac{\partial u_{r(i)}(t, x)}{\partial t} = R_i \tag{4}$$

where $R_i$ represents the rate expression.


From overall material balance considerations, it is only necessary
to write the equations for two of the three ionic components present.
The equations for the complete system are

$$\frac{\partial u_{s(1)}(t, x)}{\partial t} = -\frac{v(t)}{\eta c_0} \frac{\partial u_{s(1)}(t, x)}{\partial x} - \frac{\rho Q}{\eta c_0} R_1(u_{s(1)}, u_{s(2)}, u_{r(1)}, u_{r(2)}) \tag{5}$$

$$\frac{\partial u_{s(2)}(t, x)}{\partial t} = -\frac{v(t)}{\eta c_0} \frac{\partial u_{s(2)}(t, x)}{\partial x} - \frac{\rho Q}{\eta c_0} R_2(u_{s(1)}, u_{s(2)}, u_{r(1)}, u_{r(2)}) \tag{6}$$

$$\frac{\partial u_{r(1)}(t, x)}{\partial t} = R_1(u_{s(1)}, u_{s(2)}, u_{r(1)}, u_{r(2)}) \tag{7}$$

$$\frac{\partial u_{r(2)}(t, x)}{\partial t} = R_2(u_{s(1)}, u_{s(2)}, u_{r(1)}, u_{r(2)}) \tag{8}$$

where $R_1$ and $R_2$ are specified functions of their arguments.


Assuming the entering solution concentration is constant for all time,
the boundary conditions at $x = 0$ are:

$$u_{s(1)}(t, 0) = k_{s1}; \qquad u_{s(2)}(t, 0) = k_{s2} \qquad \text{for } t > 0 \tag{9}$$

Also, no change in solution or resin concentration will occur at any
point in the bed until the entering solution has had time to flow down to
that point; hence:

u l
r(l)\ > )
x

R
r\ »
u
r(2)\ i
l x
) ~ r2 R 1
0
J

Equations (5)-(8) along with (9) and (10) completely define the ion
exchange process for a ternary ionic system.
This system is a typical example of a distributed parameter dynamical
process which is describable by a set of first-order partial differential
equations. The state of the above system at any time $t$ is specified by
$u_{s(1)}(t, x)$, $u_{s(2)}(t, x)$, $u_{r(1)}(t, x)$, and $u_{r(2)}(t, x)$. The manipulatable control
variable is the solution flow rate $v(t)$. It may be of interest to control
the solution concentration at the outlet of the ion exchange column.
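
First-order systems of the type (5)-(8) can be marched forward in time with an upwind difference scheme. The Python sketch below is a deliberately reduced illustration, not the ternary model itself: it treats a single component, lumps the coefficients of the flow and exchange terms into two constants `a` and `b`, and assumes a hypothetical linear rate expression $R = k(u_s - u_r)$. None of these simplifications, nor the name `step_column`, comes from the original text.

```python
def step_column(us, ur, a, b, k, inlet, dx, dt):
    """One explicit upwind step for the reduced single-component model

        dus/dt = -a * dus/dx - b * R,    dur/dt = R,    R = k * (us - ur),

    where us and ur are lists of solution and resin concentrations on a
    uniform grid starting at the top of the bed (x = 0).
    The linear rate R is a hypothetical choice made for illustration.
    """
    n = len(us)
    new_us, new_ur = us[:], ur[:]
    new_us[0] = inlet                      # boundary condition at x = 0
    for j in range(n):
        rate = k * (us[j] - ur[j])         # assumed exchange rate
        new_ur[j] = ur[j] + dt * rate
        if j > 0:
            # Upwind difference: information travels in the +x direction.
            flow = a * (us[j] - us[j - 1]) / dx
            new_us[j] = us[j] + dt * (-flow - b * rate)
    return new_us, new_ur


# Usage: start with a clean bed and feed a constant inlet concentration.
us, ur = [0.0] * 21, [0.0] * 21
for _ in range(5000):
    us, ur = step_column(us, ur, a=1.0, b=1.0, k=1.0, inlet=1.0,
                         dx=0.05, dt=0.02)
```

For the step sizes used here every update is a weighted average of neighboring values, so the computed concentrations stay between the initial and inlet values. The flow coefficient `a` plays the role of the manipulatable variable: varying it from step to step changes how quickly the concentration front reaches the outlet.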
Electrical Power Transmission System (System 3). Consider a simple
electrical power transmission system consisting of a lossy line of finite
length $l$ connected to a generator at one end and terminated by a lumped
load at the other. The line voltage distribution $e(t, x)$ and current
distribution $i(t, x)$ are governed by a wave equation of the form

$$\frac{\partial}{\partial t} \begin{bmatrix} e(t, x) \\ i(t, x) \end{bmatrix} = \begin{bmatrix} -\dfrac{G}{C} & -\dfrac{1}{C} \dfrac{\partial}{\partial x} \\ -\dfrac{1}{L} \dfrac{\partial}{\partial x} & -\dfrac{R}{L} \end{bmatrix} \begin{bmatrix} e(t, x) \\ i(t, x) \end{bmatrix} \tag{11}$$

defined for $0 < x < l$ and $t > 0$, where $L$, $R$, $C$, and $G$ are the line
series inductance, resistance, shunt capacitance, and conductance per
unit length respectively.
At $x = 0$ and $x = l$, the line voltages and currents are related to the
load and generator dynamic variables by a set of ordinary differential
equations:

g , (t, i(t, /), e(t, I), * i , *M- ,...) =0, ί = 1,..., Ν


(12)
q , (*. ,·(*, 0), « 0), * Ά , ,...) = 0 , j = 1,.... N'

The state of the complete system at any time can be specified by $e(t, x)$,
$i(t, x)$, and a set of spatially independent variables specifying the states
of the generator and the load. This system is a simple example of a
fixed-domain distributed parameter system coupled with a lumped parameter
system. In practical situations, the load may be time-varying, and it is
desirable to maintain close regulation of the load voltage $e(t, l)$ and the
generator frequency.
Re-Entry Vehicle with Ablative Surface (System 4). In many aerodynamic
re-entry vehicles, ablative shields are used to protect the
vehicle from structural damage caused by aerodynamic heating. The
velocity and attitude of the vehicle must be closely controlled so that
the ablation rate does not exceed a certain maximum allowable value at
any time during the re-entry flight.
Consider a simplified one-dimensional version of the ablative portion
of a re-entry vehicle as shown in Fig. 4. It consists of an ablative slab
of thickness $l$, insulated at $x = l$, and subjected to a heat input $Q(t)$
at $x = 0$. $Q(t)$ can be expressed as a specified function of the vehicle
velocity $v(t)$:

$$Q(t) = Q(v(t))$$
Let $t = t_0$ be the starting time of the re-entry flight, and $t = t_1$ be the
time at which the temperature at $x = 0$ reaches the melting value $u_m$.
For $t_0 < t \le t_1$, the temperature inside the slab will be denoted by
$u^*(t, x)$, which corresponds to the solution to the slab heat conduction
equation

$$\frac{\partial u^*(t, x)}{\partial t} = \mu \frac{\partial^2 u^*(t, x)}{\partial x^2} \tag{13}$$

with initial condition

$$u^*(t_0, x) = u_0(x) \tag{14}$$

and boundary conditions

$$-\kappa \left. \frac{\partial u^*(t, x)}{\partial x} \right|_{x=0} = Q(v(t)) \tag{15}$$

$$\left. \frac{\partial u^*(t, x)}{\partial x} \right|_{x=l} = 0 \tag{16}$$

where $\kappa$ is the thermal conductivity of the slab.

[FIG. 4. Ablative slab, with labels INSULATION, ABLATIVE SURFACE, s(t), and
Q(v(t)).]


Since one face of the slab is insulated, the accumulation of heat will
cause the slab to melt at time $t > t_1$. Let the position of the solid
boundary and the temperature inside the slab after melting has begun
be denoted by $s(t)$ and $u(t, x)$ respectively. Assuming complete removal
of melt, the governing equations for $t > t_1$ are

$$\frac{\partial u(t, x)}{\partial t} = \mu \frac{\partial^2 u(t, x)}{\partial x^2} \qquad \text{for } s(t) < x < l \tag{17}$$

with initial condition

$$u(t_1, x) = u^*(t_1, x) \tag{18}$$

and boundary conditions

$$u(t, x) \big|_{x=s(t)} = u_m,$$

$$\rho L \frac{ds(t)}{dt} = Q(v(t)) + \kappa \left. \frac{\partial u(t, x)}{\partial x} \right|_{x=s(t)} \tag{19}$$

$$\left. \frac{\partial u(t, x)}{\partial x} \right|_{x=l} = 0$$

where $\rho$ is the density of the slab and $L$ is the latent heat of melting.


The above system is an example of a variable-domain distributed
parameter system in which the domain boundary motion depends upon
certain system variables evaluated at the boundary. The state of this
system at any time $t$ subsequent to melting can be specified by $s(t)$ and
$u(t, x)$ for $x \in (s(t), l)$. The control variable is the vehicle velocity $v$.
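It should be noted that by introducing the transformation (a continuous,
one-to-one mapping of the spatial domain $[s(t), l]$ onto $[0, 1]$),

$$\xi = \frac{l - x}{l - s(t)} \tag{20}$$

Eqs. (17), (18), and (19) can be rewritten in the form:

$$\frac{\partial u(t, \xi)}{\partial t} = \frac{\mu}{(l - s(t))^2} \frac{\partial^2 u(t, \xi)}{\partial \xi^2} - \frac{\xi}{l - s(t)} \frac{ds(t)}{dt} \frac{\partial u(t, \xi)}{\partial \xi}, \qquad \xi \in (0, 1),\ t > t_1 \tag{17'}$$

$$s(t_1) = 0, \qquad u(t_1, \xi) = u^*(t_1, \xi) \qquad \text{for all } \xi \in (0, 1) \tag{18'}$$

$$u(t, 1) = u_m,$$

$$\rho L \frac{ds(t)}{dt} = Q(v(t)) - \frac{\kappa}{l - s(t)} \left. \frac{\partial u(t, \xi)}{\partial \xi} \right|_{\xi=1} \tag{19'}$$

$$\left. \frac{\partial u(t, \xi)}{\partial \xi} \right|_{\xi=0} = 0$$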

The above transformed system can be regarded as a fixed-domain
distributed parameter system which is coupled to a lumped parameter
system whose state at any time $t$ is specified by $s(t)$.
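
The extra first-order term that the transformation (20) introduces into the diffusion equation can be checked by the chain rule. The short derivation below is a sketch under the definition of $\xi$ in Eq. (20); the symbol $\tilde u(t, \xi)$ for the temperature expressed in the new coordinates is introduced here only for clarity and is not part of the original notation.

```latex
\begin{align*}
\frac{\partial \xi}{\partial x} &= -\frac{1}{l - s(t)}, &
\frac{\partial \xi}{\partial t} &= \frac{l - x}{(l - s(t))^{2}}\,\frac{ds}{dt}
  = \frac{\xi}{l - s(t)}\,\frac{ds}{dt}, \\
\frac{\partial u}{\partial x} &= -\frac{1}{l - s(t)}\,
  \frac{\partial \tilde u}{\partial \xi}, &
\frac{\partial^{2} u}{\partial x^{2}} &= \frac{1}{(l - s(t))^{2}}\,
  \frac{\partial^{2} \tilde u}{\partial \xi^{2}}, \\
\frac{\partial u}{\partial t} &= \frac{\partial \tilde u}{\partial t}
  + \frac{\xi}{l - s(t)}\,\frac{ds}{dt}\,
  \frac{\partial \tilde u}{\partial \xi}.
\end{align*}
% Substituting into du/dt = mu * d2u/dx2 (Eq. (17)) gives
\begin{equation*}
\frac{\partial \tilde u}{\partial t}
  = \frac{\mu}{(l - s(t))^{2}}\,\frac{\partial^{2} \tilde u}{\partial \xi^{2}}
  - \frac{\xi}{l - s(t)}\,\frac{ds}{dt}\,
    \frac{\partial \tilde u}{\partial \xi},
  \qquad \xi \in (0, 1).
\end{equation*}
```

The first-order term is the price of fixing the domain: it couples the distributed state to the boundary velocity $ds/dt$, which is why $s(t)$ must be carried along as a lumped state variable.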
Transport System (System 5). The system consists of a fluid-actuated,
free, rigid carrier enclosed in a cylinder as shown in Fig. 5a. By
introducing appropriate pressure signals at both ends of the cylinder, it is
possible to transfer the carrier from one position to another. For
simplicity, the following assumptions are made:

(i) The cylinder is semi-infinite (Fig. 5b).
(ii) The pressure at the right side of the carrier is set at a constant
value $P_0$.
(iii) The fluid is compressible and inviscid.
(iv) There is negligible friction between the carrier and the cylinder.
(v) The pressure disturbances are sufficiently small so that a linear
pressure-density relation holds.

[FIG. 5. (a) Carrier in a cylinder of FLUID MEDIUM with a CONTROL PRESSURE
SOURCE at each end. (b) Semi-infinite cylinder with a CONTROL PRESSURE SOURCE
at x = 0, pressure p(t, x), velocity v(t, x), and carrier position x_c(t).]

The dynamic behavior of the idealized system can be described by a
one-dimensional wave equation and an ordinary differential equation
for the carrier motion,

$$\frac{\partial}{\partial t} \begin{bmatrix} p(t, x) \\ v(t, x) \end{bmatrix} = \begin{bmatrix} 0 & -\beta \\ -\dfrac{1}{\rho_0} & 0 \end{bmatrix} \begin{bmatrix} \dfrac{\partial p(t, x)}{\partial x} \\ \dfrac{\partial v(t, x)}{\partial x} \end{bmatrix} \qquad \text{for } 0 < x < x_c(t) \text{ and } t > 0 \tag{21}$$

and

$$\frac{d^2 x_c}{dt^2} = A \, p(t, x) \big|_{x \to x_c(t)} \qquad \text{for } t > 0 \tag{22}$$

where $p$ is the fluid pressure referenced with respect to $P_0$; $v$ is the fluid
velocity; and $p(t, x)|_{x \to x_c(t)}$ denotes the limit of the function $p$ as $x$
approaches $x_c$ from the left. $A$, $\beta$, and $\rho_0$ are the cylinder cross-sectional area,
bulk modulus, and density of the fluid respectively.
The first boundary condition results from the continuity of the fluid
and carrier velocities at $x = x_c(t)$, i.e.,

$$v(t, x) \big|_{x \to x_c(t)} = \frac{dx_c}{dt} \tag{23}$$

which, in view of Eq. (21), can be rewritten as

$$\frac{d^2 x_c}{dt^2} = -\frac{1}{\rho_0} \left. \frac{\partial p(t, x)}{\partial x} \right|_{x \to x_c(t)} \tag{24}$$

The manipulatable control variable corresponds to a boundary condition
at $x = 0$, in the form of a pressure variation

$$p(t, 0) = P_c(t) \tag{25}$$

where $P_c$ is constrained by $|P_c(t)| \le P_s$ for all $t$.


The above system is a simple example of a variable-domain distributed
parameter system coupled with a lumped parameter system. The state of
this system at any time $t$ is specified by $x_c(t)$, $dx_c/dt$, $p(t, x)$, and $v(t, x)$
defined for all $x \in (0, x_c(t))$. A possible control problem may be to
find the required control pressure $P_c$ as a function of time, satisfying the
given amplitude constraint, such that the carrier can be transferred
from an arbitrary equilibrium position to a desired equilibrium position
in a minimum amount of time.
As in System 4, the moving boundary may be mathematically
eliminated by introducing a transformation of the form

$$\xi = x / x_c(t) \tag{26}$$

The transformed system can be considered as a fixed-domain system
which is coupled to a lumped parameter system whose state at any time
$t$ is specified by $x_c(t)$ and $dx_c/dt$.
Time-Delayed Diffusion System (11-13) (System 6). Consider a time-delayed
diffusion system defined on a one-dimensional spatial domain
$\Omega = (0, 1)$. A possible mathematical description for a simple system is
given by

$$\frac{\partial u(t, x)}{\partial t} = \frac{\partial^2 u(t, x)}{\partial x^2} + \sigma \left[ u(t, x) - h(x, u(t - \tau_d, x), f(t, x)) \right] \tag{27}$$

where $h$ is a specified continuous function of its arguments, $\sigma$ is a
constant, and $f$ is a distributed control function. The boundary conditions
are assumed to be $u(t, 0) = u(t, 1) = 0$.

In physical situations, time delays can be introduced by the presence
of delayed action sources imbedded in the medium or by external feedback
paths with time delays as shown in Fig. 6.

[FIG. 6. Diffusion system with input f(t, x) and output u(t, x), and a feedback
path through a TIME DELAY τ_d producing u(t - τ_d, x).]

The state of such a system at any time $t$ can be specified by a surface
$S_t = u(t', x)$ defined on the rectangle $[t - \tau_d, t] \times (0, 1)$. This system
is directly analogous to an ordinary differential system with time delays.
A control problem may be to determine a control $f$ in terms of the
instantaneous state $S_t$ such that a certain performance index is minimized.
A possible form for the control $f$ (feedback control) is

$$f(t, x) = \int_0^1 w_0(x', x) \, u(t, x') \, dx' + \int_{t - \tau_d}^{t} \int_0^1 w_1(t, t', x, x') \, u(t', x') \, dx' \, dt' \tag{28}$$

where $w_0$ and $w_1$ are specified weighting functions. With this choice of
$f$, Eq. (27) becomes a partial differential-integral equation. This is an
example of a distributed parameter system whose local dynamic behavior
(i.e., at a particular spatial point $x \in \Omega$) depends upon the system
variable $u$ defined on the entire spatial domain as well as its past
history.
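
The feedback law of Eq. (28) can be evaluated numerically once the state surface is stored on a grid. The Python sketch below is one possible discretization and is not taken from the original text: the composite trapezoidal rule, the stored `history` array, and the function names are all assumptions made for illustration.

```python
def trapezoid_weights(n, length):
    """Weights of the composite trapezoidal rule on n >= 2 equally spaced points."""
    h = length / (n - 1)
    return [h / 2 if j in (0, n - 1) else h for j in range(n)]


def feedback_control(history, times, xs, w0, w1):
    """Approximate f(t, x) of Eq. (28) at every grid point in xs.

    history[m][j] approximates u(times[m], xs[j]); times covers the delay
    interval [t - tau_d, t], so times[-1] is the present instant t.
    w0(x', x) and w1(t, t', x, x') are the weighting functions.
    """
    t = times[-1]
    wx = trapezoid_weights(len(xs), xs[-1] - xs[0])
    wt = trapezoid_weights(len(times), times[-1] - times[0])
    f = []
    for x in xs:
        # Instantaneous term: integral over x' of w0(x', x) * u(t, x').
        inst = sum(wx[j] * w0(xp, x) * history[-1][j]
                   for j, xp in enumerate(xs))
        # Hereditary term: double integral over the stored state surface.
        past = sum(wt[m] * wx[j] * w1(t, tp, x, xp) * history[m][j]
                   for m, tp in enumerate(times)
                   for j, xp in enumerate(xs))
        f.append(inst + past)
    return f


# Usage: a uniform state surface with unit weighting functions.
xs = [j / 10 for j in range(11)]
times = [1.0 + 0.05 * m for m in range(11)]      # delay tau_d = 0.5
history = [[1.0] * len(xs) for _ in times]
f = feedback_control(history, times, xs,
                     w0=lambda xp, x: 1.0,
                     w1=lambda t, tp, x, xp: 1.0)
```

The stored array `history` is the discrete counterpart of the state $S_t$: the control at the present instant needs the whole surface over the delay interval, not just the current profile.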

B. Mathematical Description
The mathematical model of a distributed parameter dynamical system
is generally derived from certain fundamental physical laws such as
Cauchy's first and second laws of motion for a continuum, and conservation
of energy. In most situations, the model can be simplified to some
extent by making suitable assumptions based upon physical arguments.
However, the solutions of the equations of the resulting simplified model
may differ considerably from the actual behavior of the physical system.
Therefore, it is necessary to establish certain consistency conditions for
which the properties of a mathematical model coincide with those of
a dynamical system.
In this section, a set of notations and terminologies for the mathematical
description of distributed parameter dynamical systems will be
established. Also, their basic properties, mathematical forms, and
classification will be discussed.

1. NOTATIONS AND TERMINOLOGIES

Consider a distributed parameter system defined on a fixed spatial
domain $\Omega$, an open, connected subset of an M-dimensional Euclidean
space $E_M$. We shall denote the boundary of $\Omega$ by $\partial\Omega$, the closure of $\Omega$
by $\bar\Omega$ ($\bar\Omega = \Omega \cup \partial\Omega$), and the spatial coordinate vector by
$X = (x_1, \ldots, x_M)$.

The state of such a system at any fixed time $t$ can be generally specified
by a set of real-valued functions $\{u_i(t, X),\ i = 1, \ldots, N\}$, defined for
all $X \in \Omega$. The set of all possible functions of $X$ that $u_i$ can take on at
any time $t$ will be called the state component function space $\Gamma_i(\Omega)$, and
the product space $\Gamma(\Omega) = \Gamma_1(\Omega) \times \cdots \times \Gamma_N(\Omega)$ will be called the state
function space. An example of $\Gamma_i(\Omega)$ is $L_2(\Omega)$, the set of all square-integrable
functions of $X$ defined on $\Omega$.
The control action is achieved by introducing a set of manipulatable
input (control) variables which are distributed over all or certain subsets
of $\bar\Omega$. From the physical standpoint, it is convenient to classify the
control variables as follows:
(1) Distributed Control (denoted by $\{f_{\Omega(i)}(t, X),\ i = 1, \ldots, K\}$): a set
of input functions distributed over all or certain subsets of $\Omega$.
(2) Boundary Control (denoted by $\{f_{\partial\Omega(i)}(t, X'),\ i = 1, \ldots, K'\}$ defined
for all $X' \in \partial\Omega$): a set of input functions distributed over all or certain
subsets of $\partial\Omega$ (the boundary control may vary along the boundary of $\Omega$).
In the case where both distributed and boundary controls are present,
certain compatibility conditions may have to be satisfied in the
neighborhood of the domain boundary.
For simplicity, the state, distributed, and boundary control functions
will be denoted by vector-valued functions $U(t, X)$, $F_\Omega(t, X)$, and
$F_{\partial\Omega}(t, X)$ respectively.

The set of all possible functions of $X$ that the control $F_{\bar\Omega}(t, X)$ (a totality
of distributed and boundary controls) can take on at any fixed time $t$
will be denoted by $\mathcal{I}(\bar\Omega)$. Also, the set of all admissible controls $F_{\bar\Omega}(t, X)$
defined for all $X \in \bar\Omega$, and all $t$ on a specified time interval $\tau$, will be
denoted by $\mathcal{I}(\bar\Omega \times \tau)$. The nonmanipulatable inputs
(disturbances) can be classified in a similar manner.
In physical situations, the state functions $u_i$ may not be directly
measurable; instead, only certain prescribed functions of $u_i$ are
actually obtained. These measured variables will be called the output
of the system (denoted by $v_j$, $1 \le j \le N'$), which can be considered
as the result of a zero-memory, continuous transformation $\mathcal{M}$ of the
state functions.³ In general, the transformation $\mathcal{M}$ can be divided into
the following categories:

(1') Spatially-Dependent Output Transformation. $\mathcal{M}$ transforms the
state functions $u_i$ into a set of spatially dependent output functions $v_j$.
In other words, $\mathcal{M}$ defines a continuous mapping from the state
function space $\Gamma(\Omega)$ into $\mathcal{V}(\Omega)$, an output function space. For example,
the output may be obtained by a linear transformation

$$v_j(t, X) = \sum_{i=1}^{N} b_{ji} \, u_i(t, X), \qquad 1 \le j \le N' \tag{29}$$

where $b_{ji}$ are constants.

(2') Spatially-Independent Output Transformation. $\mathcal{M}$ transforms the
state functions $u_i$ into a set of spatially independent output functions $v_j$.
In other words, $\mathcal{M}$ defines a continuous mapping from the state function
space $\Gamma(\Omega)$ into $\mathcal{V}'$, a finite-dimensional Euclidean output space. For
example, a particular transformation may consist of independent
weighted spatial averages of $u_i$ of the form

$$v_i(t) = \int_{\Omega'} u_i(t, X) \, d\Omega, \qquad i = 1, \ldots, N \tag{30}$$

where $\Omega' \subseteq \Omega$. Physically, the above transformation results from the
use of a probe type of measuring device which senses the local spatial
average of the physical variable.
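
The two kinds of output transformation can be contrasted on a sampled state. The Python sketch below evaluates a pointwise linear combination of the form of Eq. (29) and a spatial integral of the form of Eq. (30) for a one-dimensional domain. The grid representation, the interval `[lo, hi]` standing in for $\Omega'$, and the function names are assumptions made for illustration.

```python
def pointwise_output(B, U):
    """Eq. (29): v_j(t, X) = sum_i b_ji * u_i(t, X) at every grid point.

    U[i][k] is the state function u_i at grid point k;
    B[j][i] holds the constants b_ji.
    """
    n_pts = len(U[0])
    return [[sum(B[j][i] * U[i][k] for i in range(len(U)))
             for k in range(n_pts)]
            for j in range(len(B))]


def averaged_output(U, xs, lo, hi):
    """Eq. (30) on a one-dimensional domain: integrate each u_i over [lo, hi].

    Only grid cells lying entirely inside [lo, hi] contribute, so lo and hi
    are expected to coincide with grid points.
    """
    out = []
    for u in U:
        total = 0.0
        for k in range(len(xs) - 1):
            if xs[k] >= lo and xs[k + 1] <= hi:
                # Trapezoidal rule on one grid cell.
                total += 0.5 * (u[k] + u[k + 1]) * (xs[k + 1] - xs[k])
        out.append(total)
    return out


# Usage: two state functions sampled on [0, 1].
xs = [k / 10 for k in range(11)]
U = [xs[:], [1.0] * len(xs)]                   # u_1 = x, u_2 = 1
v_field = pointwise_output([[1.0, 2.0]], U)    # one spatially dependent output
v_probe = averaged_output(U, xs, 0.2, 0.6)     # two spatially independent outputs
```

The first function returns a function of position for each output, while the second collapses each state function to a single number, which is the distinction between the output function space and the finite-dimensional output space drawn in the text.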
For a fixed-domain distributed parameter system coupled with a
lumped parameter system such as System 3, the system state at any time
$t$ can be specified by a set of functions $\{u_i(t, X),\ i = 1, \ldots, N\}$ defined on
$\Omega$, and a set of variables $\{u'_j(t),\ j = 1, \ldots, N'\}$ specifying the state of the
lumped system. The state space is the Cartesian product of a function
space and a subset of a finite-dimensional Euclidean space.

³ Here, it is assumed that the dynamics of the measuring instruments are negligible
and the instruments have negligible effect on the system behavior. Clearly, these
properties are desirable in practical situations. In general, the measuring instruments may be
affected by the environment. Thus, $\mathcal{M}$ may have an algebraic dependence upon time $t$.

For a variable-domain system such as Systems 4 and 5, the entire
spatial domain undergoes deformation in the course of time. To distinguish
this property, we shall denote the spatial domain by $\Omega_t$, its boundary
by $\partial\Omega_t$, and its closure by $\bar\Omega_t$. The domain boundary at any time is
composed of a set of continuous surfaces which can be specified
mathematically by a spatial coordinate vector $X_B$ and a set of surface
parameters $\sigma_j$, each with a specified range.⁴ To be consistent with the
fixed-domain case, we shall denote a point interior to $\Omega_t$ by $X$ and a point
on the boundary by $X_B$. The state of a variable-domain system at any
time can be specified by a set of functions $\{u_i(t, X),\ i = 1, \ldots, N\}$
defined for all $X \in \Omega_t$, and possibly a vector $(X_B, dX_B/dt, \ldots)$ specifying
the instantaneous position, velocity, etc., of the domain boundary. Thus,
the state space is a set of functions defined on a domain $\Omega_t$ depending
on $X_B(t)$. In the case where $X_B$, $dX_B/dt$, ... are state variables, the state
space becomes the Cartesian product of a function space and a subset
of a finite-dimensional Euclidean space.
In a more complex variable-domain system such as a multiphase flow
system, the entire spatial domain $\Omega_t$ may be partitioned into subsets
$\Omega_t^{(i)}$. The size of each $\Omega_t^{(i)}$ may vary with time in such a manner that
$\Omega_t = \bigcup_i \Omega_t^{(i)}$. The state of each distributed subsystem can be specified
by a set of functions defined on $\Omega_t^{(i)}$.
In many cases, it is advantageous to find an appropriate transformation
which maps the variable spatial domain onto a fixed domain. The
resulting transformed system generally reduces to an equivalent
fixed-domain distributed parameter system which is coupled to a lumped
parameter system, as exemplified by System 4.

2. BASIC PROPERTIES OF DISTRIBUTED PARAMETER DYNAMICAL SYSTEMS

Let us recall certain dominant features of lumped parameter dynamical
systems with a finite number of degrees of freedom, whose motions
are governed by the Hamilton canonical equations (14),

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \qquad i = 1, \ldots, N \tag{31}$$

⁴ For example, if $\Omega_t$ is a closed, bounded subset of a Euclidean plane, one may select
$X_B$ to be a radius and $\sigma$ to be an angle with a range of 0 to $2\pi$ radians. For an arbitrary
time-varying region, $X_B$ depends on both $t$ and $\sigma$.

where $H$ is the Hamiltonian, and $q_i$ and $p_i$ are the generalized coordinates
and momenta, respectively. The state of such a system is represented
by a point $S = (p_1, \ldots, p_N, q_1, \ldots, q_N)$ in a $2N$-dimensional Euclidean
state space. For a given initial state $S_0$, the set of subsequent (and prior)
states defines a trajectory in the state space. The state $S_t$ at time $t$ is
related to the initial state $S_0$ by a continuous transformation $\Phi(t)$,
and the set of transformations $\{\Phi(t)\}$ has the group property:

$$\Phi(t + t') = \Phi(t)\,\Phi(t'), \qquad -\infty < t,\ t' < +\infty \tag{32}$$

This is simply an expression of Huygens' principle: "The state of a
dynamical system at time $t + t'$ can be deduced from its state at time 0
by first computing the state at time $t'$ from the initial state and then
computing the state at time $t + t'$ by regarding the state at time $t'$ as
a new initial state."
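
The group property (32) can be checked on a simple example. The sketch below assumes a single harmonic oscillator with Hamiltonian $H = (p^2 + q^2)/2$, for which the transformation $\Phi(t)$ is known in closed form; this choice of system, the function name `flow`, and the numerical values are illustrative assumptions rather than part of the text.

```python
import math

def flow(t, state):
    """Phi(t) for H = (p^2 + q^2)/2, i.e. the exact solution of
    dq/dt = p, dp/dt = -q started from state = (q, p)."""
    q, p = state
    return (q * math.cos(t) + p * math.sin(t),
            -q * math.sin(t) + p * math.cos(t))

s0 = (1.0, 0.5)          # assumed initial state (q, p)
t, tp = 0.7, -1.9        # negative times are allowed for a group

direct = flow(t + tp, s0)            # Phi(t + t') S_0
composed = flow(t, flow(tp, s0))     # Phi(t) Phi(t') S_0
recovered = flow(-t, flow(t, s0))    # Phi(-t) undoes Phi(t)
```

Up to rounding error, `direct` and `composed` coincide, and `recovered` returns the initial state, which reflects the fact that every $\Phi(t)$ has the inverse $\Phi(-t)$.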
For a corresponding fixed-domain distributed parameter dynamical
system, its state at any time $t$ is now specified by a vector-valued function
$U(t, X)$ or a point in the state function space $\Gamma(\Omega)$. Moreover, the
Hamilton canonical ordinary differential equations (31) are now replaced
by partial differential equations (14)

$$\frac{\partial q_i'(t, X)}{\partial t} = \frac{\delta H}{\delta p_i'(t, X)}, \qquad \frac{\partial p_i'(t, X)}{\partial t} = -\frac{\delta H}{\delta q_i'(t, X)}, \qquad i = 1, \ldots, N \tag{33}$$

where $q_i'$ and $p_i'$ are the generalized coordinates and momentum
densities, respectively, and $\delta(\cdot)/\delta(\cdot)$ denotes a functional partial derivative.
The state function $U(t, X)$ corresponds to a vector
$(q_1'(t, X), \ldots, q_N'(t, X), p_1'(t, X), \ldots, p_N'(t, X))$.
The system motion corresponding to a given initial state $S_0$ defines a
trajectory in the state function space $\Gamma(\Omega)$. The transition from one state
to another can be generally defined by a set of continuous
transformations $\{\Phi'(t)\}$ having the semigroup property:

$$\Phi'(t + t') = \Phi'(t)\,\Phi'(t') \qquad \text{for } 0 \le t,\ t' < +\infty \tag{34}$$

In certain distributed dynamical systems, such as those governed by a
hyperbolic partial differential equation, the semigroup can be extended
to a group. This implies that the knowledge of the system state at any
time completely determines the system behavior in the future as well
as the past. The existence of a group of transformations is closely
related to the notion of reversibility in classical mechanics. For example,
a vibrating string is reversible while a heat diffusion process is
nonreversible.
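
The difference between a semigroup and a group can be made concrete for heat diffusion. The sketch below assumes the normalized equation $\partial u/\partial t = \partial^2 u/\partial x^2$ on $(0, 1)$ with $u = 0$ at both ends, so that the $n$th Fourier sine coefficient decays as $\exp(-n^2\pi^2 t)$; the equation, the function names, and the coefficient values are assumptions chosen for illustration.

```python
import math

def heat_step(t, coeffs):
    """Phi'(t) acting on Fourier sine coefficients: mode n is multiplied
    by exp(-(n*pi)^2 * t), valid for t >= 0."""
    return [a * math.exp(-(n * math.pi) ** 2 * t)
            for n, a in enumerate(coeffs, start=1)]

def backward_gain(n, t):
    """Factor by which mode n would have to be amplified to undo heat_step."""
    return math.exp((n * math.pi) ** 2 * t)

a0 = [1.0, 0.5, 0.25, 0.125]   # assumed initial coefficients
t, tp = 0.01, 0.02

lhs = heat_step(t + tp, a0)              # Phi'(t + t')
rhs = heat_step(t, heat_step(tp, a0))    # Phi'(t) Phi'(t')
gain_mode_10 = backward_gain(10, 0.1)
```

The forward maps compose as in Eq. (34), but the backward amplification grows without bound as the mode number increases, so no bounded inverse exists and the semigroup cannot be extended to a group.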

The basic properties of a distributed parameter dynamical system
outlined above can be summarized in precise mathematical terms as
follows:

Given a state function space $\Gamma(\Omega)$, an admissible control function set
$\mathcal{I}(\Omega \times \tau)$ (and possibly a disturbance function set), and a subset $\tau$ of
the real line (values of time $t$), for each admissible control $F_\Omega \in \mathcal{I}(\Omega \times \tau)$,
there exists a continuous mapping $\Phi_{F_\Omega}$ from $\tau \times \Gamma(\Omega) \times \tau \to \Gamma(\Omega)$
with the properties that:

(i) $\Phi_{F_\Omega}(t_2;\ \Phi_{F_\Omega}(t_1; U_0(X), t_0),\ t_1) = \Phi_{F_\Omega}(t_2; U_0(X), t_0)$ for all
$t_0 \le t_1 \le t_2$ in $\tau$, and all $U_0(X) \in \Gamma(\Omega)$,

(ii) $\Phi_{F_\Omega}(t'; U_0(X), t_0) \to U_0(X)$ as $t' \to t_0$ for all $t_0 \in \tau$ and all
$U_0(X) \in \Gamma(\Omega)$,

(iii) The output transformation $\mathcal{M}$ is continuous.

The above properties can also be established for a variable-domain
system and for a system which is coupled to a lumped parameter system.

3. SYSTEM EQUATIONS

The dynamic behavior of a large number of disturbance-free distributed
parameter systems can be described by a set of partial differential
equations (PDE) of the form (9):

$$\frac{\partial u_i(t, X)}{\partial t} = h_i\bigl(u_1(t, X), \ldots, u_N(t, X),\ f_{\Omega(1)}(t, X), \ldots, f_{\Omega(K)}(t, X)\bigr), \qquad i = 1, \ldots, N \tag{35}$$

defined for $t > 0$ on a fixed spatial domain $\Omega$, where $h_i$ are specified
spatial differential operators whose parameters may depend upon $X$
and $t$. For simplicity, Eq. (35) can be rewritten in the following vector
form:

$$\frac{\partial U(t, X)}{\partial t} = \mathcal{H}\bigl(U(t, X),\ F_\Omega(t, X)\bigr) \tag{36}$$

where $\mathcal{H} = \mathrm{Col}(h_1, \ldots, h_N)$. The above equation only describes the
local behavior of the system at any point $X \in \Omega$. Starting from a given
point $X$ and a set of initial data at $X$, the differential equation generally
permits the construction of many possible solutions. In order to choose
the solution which is appropriate to the physical situation, additional
constraints or boundary conditions are introduced. They may be
represented by a vector equation

$$\mathcal{G}\bigl(U(t, X'),\ F_{\partial\Omega}(t, X')\bigr) = 0, \qquad X' \in \partial\Omega \tag{37}$$

where $\mathcal{G} = \mathrm{Col}(g_1, \ldots, g_{N'})$, $g_i$ are specified spatial differential operators
whose parameters may depend upon $X'$ and $t$, and $F_{\partial\Omega}$ is the boundary
control function. For system (36) with boundary condition (37), the
state at any fixed time $t$ can be specified by $U(t, X)$.
For precision, we shall introduce the following definitions.
(1) A fixed-domain distributed parameter system describable by Eqs.
(36) and (37) is said to be free or unforced if $F_\Omega(t, X) \equiv 0$ for all $X \in \Omega$
and $F_{\partial\Omega}(t, X') \equiv 0$ for all $X' \in \partial\Omega$. If the parameters of $\mathcal{H}$ do not
depend on time $t$, then the system is said to be time-invariant.
(2) A vector-valued function $U_{F_\Omega}(t, X; U_0(X), t_0)$ is said to be a
particular solution⁵ of system (35) corresponding to a specified control
function $F_\Omega$, initial data $U_0(X)$ at $t = t_0$, and boundary condition (37),
if it satisfies Eqs. (36) and (37) on some time interval $\tau = (t_0, t_0 + T]$
and $U_{F_\Omega}(t', X; U_0(X), t_0) \to U_0(X)$ as $t' \to t_0$.
(3) An equilibrium state $U_{eq}(X)$ of system (36) is defined to be the
solution of Eq. (36) with $\partial U/\partial t = 0$, $F_\Omega(t, X) = 0$ for all $t \ge t_0$ and all
$X \in \Omega$, and with boundary condition (37), such that the solution
$U_{F_\Omega}(t, X; U_{eq}(X), t_0) = U_{eq}(X)$ for all $t \ge t_0$.

If the PDE (36) with boundary condition (37) is to represent a physical
system, the following basic requirements should be satisfied:

(1') Its solution must exist and should be uniquely determined.

(2') Its solution should depend continuously on the initial data.

The first requirement essentially excludes any ambiguous and
contradictory properties in the physical situation. The second requirement
implies that an arbitrary infinitesimal variation of the given data can
only lead to an infinitesimal change in the corresponding solution. If
both requirements are satisfied, then the problem of determining the
dynamic behavior of the system is said to be well-posed (in the sense of
Hadamard).
In order to manifest some of the mathematical implications of the
above fundamental idea, one must first define:

(1'') the state space $\Gamma(\Omega)$ consisting of all admissible initial state
functions $U_0(X)$ defined on $\Omega$,
(2'') the admissible control function set $\mathcal{I}(\Omega \times \tau)$ consisting of all
admissible distributed and boundary control functions $F_\Omega(t, X)$ defined
for all $(t, X) \in [0, +\infty) \times \Omega$,
(3'') the solution space $\Gamma_s(\Omega)$ consisting of all functions of $X$ defined
on $\Omega$ for each $t$, which are regular solutions (solutions of a well-posed
problem),
(4'') the topologies on the function spaces $\Gamma(\Omega)$, $\Gamma_s(\Omega)$ and $\mathcal{I}(\Omega \times \tau)$,
(5'') the limit: $U_{F_\Omega}(t', X; U_0(X), t_0) \to U_0(X)$ as $t' \to t_0$.

⁵ Here, the solution is defined in the classical sense. In general, equations (such as
hyperbolic equations) with discontinuous initial data may lead to discontinuous (weak)
solutions. In this case, a solution may be defined in a more generalized sense (15, 16).

From the mathematical standpoint, there are usually many function
spaces for $\Gamma(\Omega)$ and definitions of the limit (5'') which all lead to a
well-posed problem. Therefore, the choice should be based upon a
careful scrutiny of the physical properties of the system under
consideration. In many situations, the most natural choice for $\Gamma$ and $\Gamma_s$ leads
to the inclusion $\Gamma_s \subset \Gamma$, which implies that the regular solutions belong
to the same class of functions as the initial data but have certain additional
properties. For example, a possible state space $\Gamma$ may be $L_2(\Omega)$, while
the solution space $\Gamma_s$ is $C^K(\Omega)$, the set of all real-valued functions of
$X$ defined on $\Omega$ having continuous partial derivatives with respect to $x_i$
up to order $K$ (a set of "smoother" functions). It will be seen later that
the relative properties of the state and solution spaces are pertinent
to the observability and controllability of a distributed parameter
system.
Now, we shall give a precise definition for a well-posed problem
corresponding to system (36) in terms of the definitions (1'')-(5''):
The motion of a distributed parameter system is said to be well-posed
if, for every $t \in (0, +\infty)$ and for each admissible control $F_\Omega \in \mathcal{I}(\Omega \times \tau)$,
there exists a one-to-one continuous transformation (continuous in the
topologies of $\Gamma$ and $\Gamma_s$) from the initial state to the regular solutions of
Eq. (36) with boundary condition (37) such that:

$$\lim_{t \to t_0} U_{F_\Omega}(t, X; U_0(X), t_0) = U_0(X) \tag{38}$$

for the chosen definition of limit.
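
Requirement (2'), continuous dependence on the initial data, can be observed on a discretized example. The sketch below is not a proof and not the author's construction: it assumes the heat equation on $(0, 1)$ with zero boundary values, replaces it by the standard explicit finite-difference scheme with mesh ratio $r = \Delta t/\Delta x^2 \le 1/2$, and measures distances in the maximum norm. All names and numerical values are illustrative assumptions.

```python
import math

def heat_explicit(u0, r, steps):
    """Explicit scheme u_k <- u_k + r*(u_{k-1} - 2*u_k + u_{k+1}) with the
    two boundary values held at zero (assumed boundary condition)."""
    u = list(u0)
    for _ in range(steps):
        interior = [u[k] + r * (u[k - 1] - 2.0 * u[k] + u[k + 1])
                    for k in range(1, len(u) - 1)]
        u = [0.0] + interior + [0.0]
    return u

def max_norm(a, b):
    """Distance between two sampled states in the maximum norm."""
    return max(abs(x - y) for x, y in zip(a, b))

# Assumed initial data and a small perturbation of it at the interior points.
base = [math.sin(math.pi * k / 20) for k in range(21)]
perturbed = [base[k] + (1e-3 * (-1) ** k if 0 < k < 20 else 0.0)
             for k in range(21)]

before = max_norm(base, perturbed)
after = max_norm(heat_explicit(base, 0.4, 50),
                 heat_explicit(perturbed, 0.4, 50))
```

For $r \le 1/2$ each new value is a weighted average of old values with nonnegative weights, so the distance between the two solutions cannot grow: a small change in the initial data produces at most an equally small change in the computed solution.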


For a fixed-domain distributed system coupled with a lumped-parameter
system, the governing equations generally consist of a PDE of
the form (36) with boundary condition (37) and an additional first-order
vector ordinary differential equation describing the motion of the
lumped system

$$\frac{dU'(t)}{dt} = \beta\bigl(U'(t),\ U(t, X),\ F_\Omega(t, X)\bigr) \tag{39}$$

where $U'(t) = \mathrm{Col}(u_1'(t), \ldots, u_{N'}'(t))$, and $\beta$ is a specified vector spatial
functional. A typical form for $\beta$ is:

$$\beta = \int_{\Omega'} \mathcal{W}_0(X')\, U(t, X')\, d\Omega' + \int_{\Omega''} \mathcal{W}_1(X')\, F_\Omega(t, X')\, d\Omega'', \qquad \Omega' \subseteq \Omega,\ \Omega'' \subseteq \Omega \tag{40}$$

where $\mathcal{W}_0$ and $\mathcal{W}_1$ are spatial weighting function matrices. In the
particular case where some of the elements of $\mathcal{W}_0$ are Dirac delta
functions, the motion of the lumped system depends on certain state
functions $u_i(t, X)$ evaluated at a set of spatial points in $\Omega$.
For a variable-domain distributed system (possibly coupled with a
lumped system) whose entire domain varies with time, the general
equation of motion usually has the form (36), (37), and (39), except that Eqs. (36)
and (37) are defined for $X \in \Omega_t$ and $X' \in \partial\Omega_t$, respectively. Also, some
of the components of $U'(t)$ in Eq. (39) may correspond to the spatial
coordinates of the instantaneous spatial domain boundary $\partial\Omega_t$.
The notion of a well-posed problem can be directly extended to
fixed-domain distributed systems coupled with a lumped system by selecting
suitable definitions for the state and solution spaces, their topologies, and
the limit (5''). For variable-domain systems, the extension is less direct,
since the domain of the state function space depends on the instantaneous
location of the boundary.⁶ However, by transforming the variable spatial
domain onto a fixed domain, the notion of a well-posed problem can be
extended to the equivalent fixed-domain system.
The foregoing discussions have been restricted to system equations
in differential form. As pointed out earlier, partial differential
equations in the form (36) only describe the local system behavior at
any point $X \in \Omega$, so additional boundary conditions are generally necessary
for a complete system description. Therefore, it would be useful to
formulate the system equations in such a manner that the boundary
conditions are included explicitly. Such a formulation must relate the
system variables not only locally at any point $X \in \Omega$, but also to their
values at all points of $\Omega$ as well as $\partial\Omega$. This approach usually leads to
a system of integral equations of the form

$$U(t, X) = \int_{\Omega} K_0\bigl(t, X, X', U_0(X')\bigr)\, d\Omega' + \int_{t_0}^{t}\!\int_{\Omega} K_1\bigl(t, t', X, X', U(t', X'), F_\Omega(t', X')\bigr)\, d\Omega'\, dt' \tag{41}$$

where $K_0$ and $K_1$ are specified vector-valued functions of their arguments.
Since the above equation contains the boundary conditions, it represents
a very compact system description.

⁶ The problem of determining the dynamic behavior of variable-domain systems is
commonly referred to as a moving-boundary problem in partial differential equations.
For a fixed-domain system coupled with a lumped parameter system,
the integral equation representation takes on the form:

$$U(t, X) = \int_{\Omega} K_0'\bigl(t, X, X', U_0(X')\bigr)\, d\Omega' + \int_{t_0}^{t}\!\int_{\Omega} K_1'\bigl(t, t', X, X', U'(t'), U(t', X'), F_\Omega(t', X')\bigr)\, d\Omega'\, dt' \tag{42a}$$

$$U'(t) = K_0''\bigl(t, U'(0)\bigr) + \int_{t_0}^{t}\!\int_{\Omega} K_1''\bigl(t, t', X', U'(t'), U(t', X'), F_\Omega(t', X')\bigr)\, d\Omega'\, dt' \tag{42b}$$

The corresponding representation for variable-domain systems is
similar to Eq. (42), except that the domain of spatial integration is replaced
by $\Omega_t$, and some of the components of $U'(t)$ may correspond to the
spatial coordinates of $\partial\Omega_t$.
Finally, it should be pointed out that in many physical situations, the
system cannot be represented in terms of differential equations. In other
words, the local system behavior at a point $X \in \Omega$ and time $t$ depends
not only on the system variables evaluated at $X$ and $t$, but also on their
values at other points in $\Omega$ and past times $t' < t$. Systems of this type
usually lead to more complex mathematical descriptions which may be
in the form of Eqs. (36) and (37) but with $\mathcal{H}$ and $\mathcal{G}$ as time and/or
spatial integro-differential operators (e.g., System 6 described in Section
II, A).

4. LINEAR SYSTEMS

Linear systems are of particular importance in practice, since the
dynamic behavior of many physical systems in the neighborhood of
certain prescribed motions can be approximated by that of a linear
system.
For many linear distributed parameter systems defined on a fixed
spatial domain $\Omega$, the governing PDE's can be expressed in the following
general form:

$$\frac{\partial U(t, X)}{\partial t} = \mathcal{L}_0\, U(t, X) + D(t, X)\, F_\Omega(t, X) \tag{43}$$

with boundary conditions:

$$\mathcal{L}_1\, U(t, X') = F_{\partial\Omega}(t, X') \qquad \text{for } X' \in \partial\Omega \tag{44}$$

where $\mathcal{L}_0$ and $\mathcal{L}_1$ are matrix linear spatial differential or
integro-differential operators whose parameters may depend upon $X$ and/or $t$;
$D(t, X)$ is a matrix whose elements are specified functions of $t$ and $X$.
A typical form for an element of $\mathcal{L}_0$ is:

$$\sum_{l,m=1}^{M} a_{lm}(X)\, \frac{\partial^2}{\partial x_l\, \partial x_m} + \sum_{l=1}^{M} b_l(X)\, \frac{\partial}{\partial x_l} + c(X) \tag{45}$$

where the coefficients $a_{lm}(X)$, $b_l(X)$ and $c(X)$ are specified real-valued
functions of $X$ defined on $\Omega$. The system output $V$ is related to the state
vector $U$ by a linear transformation

$$V = \mathcal{M}\, U(t, X) \tag{46}$$

In dealing with linear systems, it is natural to take the state function
space $\Gamma(\Omega)$ as a linear space (i.e., sums and differences of elements in
$\Gamma(\Omega)$ imply sums and differences of their corresponding functions; and
multiplication of an element of $\Gamma(\Omega)$ by a number implies multiplication
of its corresponding functions by the same number). Also, for nonlinear
systems, it is desirable, if possible, to imbed $\Gamma(\Omega)$ in some linear space.
One of the most important types of linear spaces, when dealing with
physical systems, is the complete normed linear (Banach) space (17, 18).
For a linear system whose state function space $\Gamma$ is a Banach space, the
topology of $\Gamma$ is based on a spatial norm $\|U\|$, and $\Gamma$ is complete in this
topology. The distance between two arbitrary states $U_1(t, X)$ and $U_2(t, X)$
in $\Gamma$ at any time $t$ can be specified by⁷

$$\rho\bigl(U_1(t, X),\ U_2(t, X)\bigr) = \|U_1(t, X) - U_2(t, X)\| \tag{47}$$

It can be easily shown that the above distance satisfies all metric
axioms (17). Also, we can introduce the notion of norm (strong)
convergence:

$$U = \lim_{n \to \infty} U_n \quad \text{implies} \quad \|U_n - U\| \to 0 \ \text{ for } n \to \infty.$$

Since Banach spaces are complete metric linear spaces, the familiar
concepts in complete metric spaces such as sphere, bounded set, linear
independence, etc., also apply here in the same sense.
Another noteworthy property of Banach spaces is that most of them
have a countable Schauder basis (18): $(B_1(X), B_2(X), \ldots)$, i.e., for each
element $U(X) \in \Gamma$, there exists a unique sequence of real numbers
$(a_1, a_2, \ldots)$, depending upon $U$, such that the series $\sum a_n B_n(X)$
converges strongly toward $U(X)$. This correspondence implies that the system
state function $U$ can also be represented by an infinite-dimensional vector
$a = (a_1, a_2, \ldots)$, and the state function space $\Gamma$ can be made to be
isomorphic and isometric to a certain subset $\tilde{\Gamma}$ of an infinite-dimensional
Euclidean space, i.e., there exists a one-to-one correspondence among the
elements of $\Gamma$ and $\tilde{\Gamma}$, and the corresponding operations of addition and
multiplication by a constant, as well as the distances in both spaces, are
identical. For example, if the state can be represented by a scalar
function $u$, and $\Gamma = L_2(\Omega)$, then its corresponding infinite-dimensional
state space $\tilde{\Gamma}$ is the Hilbert sequence space $l_2$, in which a state is specified
by an infinite-dimensional vector corresponding to the Fourier
coefficients of a state function in $\Gamma$. The distance between two arbitrary states
$u_1$ and $u_2$ at any time $t$ is given by

$$\rho\bigl(u_1(t, X),\ u_2(t, X)\bigr) = \Bigl[\int_{\Omega} |u_1(t, X) - u_2(t, X)|^2\, d\Omega\Bigr]^{1/2} = \Bigl[\sum_{n=1}^{\infty} |a_n^{(1)}(t) - a_n^{(2)}(t)|^2\Bigr]^{1/2} \tag{48}$$

⁷ We shall regard two states to be identical when their distance is zero or their
corresponding state functions coincide almost everywhere in $\Omega$.
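
The equality of the two distances in Eq. (48) can be verified numerically. The sketch below assumes $\Omega = (0, 1)$ and the orthonormal sine basis $B_n(x) = \sqrt{2}\sin(n\pi x)$ of $L_2(0, 1)$, and uses states with only three nonzero coefficients; the basis, the function names, and the coefficient values are illustrative assumptions.

```python
import math

def basis(n, x):
    """Orthonormal sine basis function B_n of L2(0, 1)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def synthesize(coeffs, x):
    """State function u(x) = sum_n a_n B_n(x) for a finite coefficient list."""
    return sum(a * basis(n, x) for n, a in enumerate(coeffs, start=1))

def l2_distance(c1, c2, m=2000):
    """[integral over (0,1) of |u1 - u2|^2]^(1/2), by the midpoint rule."""
    h = 1.0 / m
    total = sum((synthesize(c1, (k + 0.5) * h)
                 - synthesize(c2, (k + 0.5) * h)) ** 2 for k in range(m))
    return math.sqrt(total * h)

def coeff_distance(c1, c2):
    """Distance between the coefficient vectors in the sequence space l2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

a1 = [1.0, 0.5, 0.0]     # assumed Fourier coefficients of state u1
a2 = [0.2, -0.3, 0.4]    # assumed Fourier coefficients of state u2
d_function = l2_distance(a1, a2)
d_sequence = coeff_distance(a1, a2)
```

Both computations give the same distance up to rounding error, which is the isometry between $L_2(\Omega)$ and $l_2$ stated in the text.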

It should be remarked that the representation of a system state by an
infinite-dimensional vector may not be desirable from the practical
standpoint, since the determination of the system state and the required
control function at any time requires additional data-processors such as
spatial harmonic analyzers and synthesizers.
In what follows, we shall discuss the representation and properties
of the particular solutions of Eq. (43) corresponding to initial data
$U_0(X)$ and boundary condition (44), where the state function space is a
Banach space.
In the present generality, if we assume that Eqs. (43) and (44) describe
a dynamical system, then from the physical standpoint (superposition
principle), the regular solutions can be written in the form

$$\begin{aligned} U_{F_\Omega}(t, X; U_0(X), t_0) ={}& \int_{\Omega} K_0(t, t_0, X, X')\, U_0(X')\, d\Omega' \\ &+ \int_{t_0}^{t}\!\int_{\Omega} K_0(t, t', X, X')\, D(t', X')\, F_\Omega(t', X')\, d\Omega'\, dt' \\ &+ \int_{t_0}^{t}\!\int_{\partial\Omega} K_1(t, t'', X, X'')\, F_{\partial\Omega}(t'', X'')\, d(\partial\Omega)\, dt'' \end{aligned} \tag{49}$$

where $K_0$ and $K_1$ are the Green's function matrices. $K_0$ has the property
that $K_0(t_0, t_0, X, X') = \delta(X - X')\, I$, where $\delta$ is the Dirac delta
function, and $I$ is the identity matrix. However, the mathematical problem of
establishing conditions for which Eqs. (43) and (44) describe a dynamical
system and determining the actual expressions for the Green's functions
$K_0$ and $K_1$ for specific equations may be very difficult. On the other hand,
an abstract approach to this problem has been developed in the
framework of semigroup theory. This approach represents a natural
generalization of many familiar formalisms and concepts associated with linear
dynamical systems governed by ordinary differential equations. Here,
we shall outline the main results which are pertinent to the later
discussions on linear distributed systems.
First, consider the case where Eq. (43) is defined for $t \in [t_0, t_1]$ and
$D(t, X)$ is a specified matrix-valued function of its arguments. We wish
to find the solution of Eq. (43) corresponding to a given initial state
$U_0(X) \in \Gamma_B(\Omega)$ and a specified distributed control function
$F_\Omega(t, X) \in \mathcal{I}(\Omega)$ for any fixed time $t$, without imposing boundary condition (44),
where $\Gamma_B(\Omega)$ is a Banach space.
To clarify later developments, we shall first recall certain elementary
facts pertaining to the foregoing initial-value problem with $F_\Omega(t, X) = 0$,
which is well-posed in the sense discussed in Section II, B, 3.
Let $U_{F_\Omega = 0}(t, X; U_0(X), t_0)$ be the solution to the above problem defined
by:

$$U_{F_\Omega = 0}(t, X; U_0(X), t_0) = \Phi(t, t_0)\, U_0(X) \tag{50}$$

For any fixed $t \in [t_0, t_1]$, $\Phi(t, t_0)$ defines a linear transformation in
$\Gamma_B(\Omega)$, with its domain denoted by $\Gamma_0(\Omega)$, a subset of $\Gamma_B(\Omega)$.
This initial-value problem determined by the linear operator $\mathcal{L}_0$ in
Eq. (43) is well-posed if the following two conditions are satisfied:

(1) the domain $\Gamma_0(\Omega)$ of $\Phi(t, t_0)$ is dense in $\Gamma_B(\Omega)$,

(2) the operators $\Phi(t, t_0)$ for $t \in [t_0, t_1]$ are uniformly bounded.

Condition (1) implies that any element in $\Gamma_B(\Omega)$ can be approximated
arbitrarily closely by an element of $\Gamma_0(\Omega)$; in other words, every element
of $\Gamma_B(\Omega)$ is the limit of some sequence in $\Gamma_0(\Omega)$, or the closure of $\Gamma_0(\Omega)$
is $\Gamma_B(\Omega)$. Thus, even though a solution may not exist for some choice of
$U_0(X) \in \Gamma_B(\Omega)$, we can approximate $U_0(X)$ by one in $\Gamma_0(\Omega)$ for
which a solution does exist. For example, if the initial temperature
distribution $u_0(x_1)$ corresponding to the linear heat diffusion equation
[Eq. (2)] is a discontinuous function of $x_1$ defined on $(0, 1)$, we can
approximate it by a sequence of twice-differentiable functions which
approaches $u_0(x_1)$ in the limit. Thus, the corresponding sequence of
solutions approaches a function which is the solution to the original
problem.
Condition (2) implies that the solution depends continuously
on the initial data, i.e., there exists a positive constant $M$
corresponding to the maximum of the norm of $\Phi(t, t_0)$ for $t \in [t_0, t_1]$ such that

$$\|U_{F_\Omega = 0}(t, X; U_0(X), t_0) - U_{F_\Omega = 0}(t, X; U_0'(X), t_0)\| \le M\, \|U_0(X) - U_0'(X)\| \tag{51}$$

Thus, if the initial states are close in the sense of the norm of the space,
their corresponding solutions are also close.
Let us return to the original problem of determining the solution to the
nonhomogeneous equation (43) without imposing boundary condition
(44). The solution to this problem can be formally represented by

$$U_{F_\Omega}(t, X; U_0(X), t_0) = \Phi(t, t_0)\, U_0(X) + \int_{t_0}^{t} \Phi(t, t')\, D(t', X)\, F_\Omega(t', X)\, dt' \tag{52}$$
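
The structure of Eq. (52) is easiest to see in the scalar, finite-dimensional case. The sketch below assumes that $\mathcal{L}_0$ and $D$ are replaced by constants $a = -2$ and $d = 1$, that $\Phi(t, t') = \exp[a(t - t')]$, and that the control is $\sin t$; these values and all function names are illustrative assumptions. The integral in Eq. (52) is evaluated by the midpoint rule and compared with the closed-form solution of $du/dt = au + d\sin t$.

```python
import math

A, D = -2.0, 1.0   # scalar stand-ins for L_0 and D (assumed values)

def forcing(t):
    """Assumed control function F(t)."""
    return math.sin(t)

def phi(t, tp):
    """Transition operator Phi(t, t') = exp(A * (t - t'))."""
    return math.exp(A * (t - tp))

def solution_by_formula(t, u0, t0=0.0, m=4000):
    """Right-hand side of Eq. (52), with the integral computed by the
    midpoint rule on m cells."""
    h = (t - t0) / m
    integral = h * sum(phi(t, t0 + (k + 0.5) * h) * D * forcing(t0 + (k + 0.5) * h)
                       for k in range(m))
    return phi(t, t0) * u0 + integral

def exact_solution(t, u0):
    """Closed-form solution of du/dt = -2u + sin t with u(0) = u0."""
    return (math.exp(-2.0 * t) * u0
            + (2.0 * math.sin(t) - math.cos(t) + math.exp(-2.0 * t)) / 5.0)

u_formula = solution_by_formula(1.5, 1.0)
u_exact = exact_solution(1.5, 1.0)
```

The first term of the formula propagates the initial state, and the integral accumulates the effect of the control applied at each earlier time $t'$ and propagated from $t'$ to $t$.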

It is of interest to establish explicit forms for $\Phi(t, t')$ in terms of
$\mathcal{L}_0$, and the conditions on $\mathcal{L}_0$ such that Eq. (43) describes a dynamical
system ($\{\Phi(t, t')\}$ has the properties of a semigroup). This problem is rather
simple if the operator $\mathcal{L}_0$ is bounded (i.e., there exists a constant $M$ such
that $\|\mathcal{L}_0 U\| \le M \|U\|$ for all $U \in \Gamma_0$). However, in most cases of
interest, $\mathcal{L}_0$ is unbounded. In such cases, the problem of establishing
conditions for which $\mathcal{L}_0$ generates a semigroup of transition operators
is no longer trivial.
If $\mathcal{L}_0$ is independent of $t$, $\Phi(t, t')$ is given formally by:

$$\Phi(t, t') = \exp\{(t - t')\, \mathcal{L}_0\} \tag{53}$$

A precise definition for the operator-valued function $\exp\{(t - t')\mathcal{L}_0\}$
has been given by Hille and Yosida in the framework of semigroup
theory (19). Note that in the trivial case where $U_0$ and $F_\Omega$ are vectors in a
finite-dimensional Euclidean space, and $\mathcal{L}_0$ and $D$ are constant matrices,
Eq. (53) corresponds to the familiar fundamental (transition) matrix of
a free, linear, time-invariant ordinary differential system.
Now, we shall state the main result due to Hille and Yosida:

THEOREM II-1. A necessary and sufficient condition that the operator
$\mathcal{L}_0$ be the generator⁸ of a semigroup $\{\Phi(t, t')\}$ is that

(i) the domain of $\mathcal{L}_0$ (denoted by $\mathcal{D}(\mathcal{L}_0)$) is dense in $\Gamma_B(\Omega)$,

(ii) all the real numbers $\lambda > k$ are in the resolvent set of $\mathcal{L}_0$ and

$$\|R(\lambda, \mathcal{L}_0)\| \le (\lambda - k)^{-1} \qquad \text{for } \lambda > k$$

where $R(\lambda, \mathcal{L}_0) = (\lambda I - \mathcal{L}_0)^{-1}$ is the resolvent of $\mathcal{L}_0$, and $I$ is the identity
operator.
For each $\mathcal{L}_0$ satisfying the above conditions, there exists a unique
operator-valued function $\exp(t\mathcal{L}_0)$ defined for $t \ge 0$ with the following
properties:

(i') $\exp(t\mathcal{L}_0)$ is bounded and $\|\exp(t\mathcal{L}_0)\| \le \exp(kt)$;

(ii') $\exp(t\mathcal{L}_0)$ is strongly continuous⁹ in $t$ with $\exp(0 \cdot \mathcal{L}_0) = I$;

(iii') $\exp\{(t + t')\mathcal{L}_0\} = \exp(t\mathcal{L}_0)\, \exp(t'\mathcal{L}_0)$;

(iv') $\exp(t\mathcal{L}_0)$ maps $\mathcal{D}(\mathcal{L}_0)$ into itself, and for each $U \in \mathcal{D}(\mathcal{L}_0)$,
$\exp(t\mathcal{L}_0)U$ is strongly differentiable, with
$(d/dt)[\exp(t\mathcal{L}_0)U] = \mathcal{L}_0 \exp(t\mathcal{L}_0)U = \exp(t\mathcal{L}_0)\, \mathcal{L}_0 U$;

(v') $\exp(t\mathcal{L}_0)$ is permutable with the resolvent $(\lambda I - \mathcal{L}_0)^{-1}$.

⁸ An operator $\mathcal{L}_0$ [associated with Eq. (43)] which generates a semigroup $\{\Phi(t, t')\}$
is called an infinitesimal generator of the semigroup $\{\Phi(t, t')\}$.

For the case where $\mathcal{L}_0$ is bounded, the exponential formula $\exp(t\mathcal{L}_0)$
can be represented by

$$\exp(t\mathcal{L}_0) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, \mathcal{L}_0^{\,n} \tag{54}$$

For an unbounded $\mathcal{L}_0$, the definition is less direct. A list of exponential
formulas is given by Hille and Phillips (19, p. 354).
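
For a bounded operator, the series (54) can be summed directly. The sketch below assumes the simplest nontrivial case, a constant $2 \times 2$ matrix $\mathcal{L}_0$ describing a harmonic oscillator, for which $\exp(t\mathcal{L}_0)$ is a rotation matrix known in closed form. The matrix, the number of retained terms, and the function names are illustrative assumptions.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(l0, t, terms=30):
    """Partial sum of Eq. (54): sum over n < terms of t^n / n! * L0^n."""
    result = [[1.0, 0.0], [0.0, 1.0]]      # n = 0 term (identity)
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        # term_n = term_{n-1} * L0 * t / n, i.e. t^n L0^n / n!
        term = [[t * x / n for x in row] for row in mat_mul(term, l0)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

L0 = [[0.0, 1.0], [-1.0, 0.0]]   # assumed generator: dq/dt = p, dp/dt = -q

series = mat_exp(L0, 0.8)
closed_form = [[math.cos(0.8), math.sin(0.8)],
               [-math.sin(0.8), math.cos(0.8)]]
product = mat_mul(mat_exp(L0, 0.3), mat_exp(L0, 0.5))   # property (iii')
```

The partial sum agrees with the closed form, and the product of the exponentials for $t = 0.3$ and $t' = 0.5$ reproduces the exponential for $t + t' = 0.8$.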
The results of Hille and Yosida are in essence an abstract
generalization of the Laplace-transform approach to initial-value problems
associated with linear, time-invariant systems. We shall give a plausible
verification of the conditions in Theorem II-1.
Let us introduce a Laplace transform operator $\mathfrak{L}$ defined by

$$\mathfrak{L} = \int_{0}^{\infty} \exp(-\lambda t)\, (\,\cdot\,)\, dt \tag{55}$$

Taking the Laplace transform of Eq. (43) with $F_\Omega = 0$ and initial condition
$U_0$, and assuming $\mathcal{L}_0$ is independent of $t$, leads to

$$(\lambda I - \mathcal{L}_0)\, U(\lambda) = U_0 \tag{56}$$

⁹ See reference (19) for precise definitions.

In order for Eq. (56) to be solvable for $U(\lambda)$, $\lambda$ must be in the resolvent
set $\rho(\mathcal{L}_0)$ of $\mathcal{L}_0$, and $\rho(\mathcal{L}_0)$ must be nonempty. Thus

$$U(\lambda) = (\lambda I - \mathcal{L}_0)^{-1}\, U_0 = R(\lambda, \mathcal{L}_0)\, U_0 \tag{57}$$

On the other hand, in order that the inverse Laplace transform of
$R(\lambda, \mathcal{L}_0)$ be of the form $\exp(t\mathcal{L}_0)$, we have

$$R(\lambda, \mathcal{L}_0)\, U_0 = \int_{0}^{\infty} \exp(-\lambda t)\, \exp(t\mathcal{L}_0)\, U_0\, dt \tag{58}$$

and

$$\|R(\lambda, \mathcal{L}_0)\, U_0\| \le \int_{0}^{\infty} \exp(-\lambda t)\, \|\exp(t\mathcal{L}_0)\, U_0\|\, dt \tag{59}$$

Clearly, if $\|\exp(t\mathcal{L}_0)\| \le \exp(kt)$, then the above integral converges
for $\lambda > k$ and

$$\|R(\lambda, \mathcal{L}_0)\, U_0\| \le \int_{0}^{\infty} \exp[-(\lambda - k)t]\, \|U_0\|\, dt = (\lambda - k)^{-1}\, \|U_0\| \tag{59'}$$

By definition of the norm of an operator (17), we have

$$\|R(\lambda, \mathcal{L}_0)\| \le (\lambda - k)^{-1} \tag{60}$$

which is precisely condition (ii) in Theorem II-1.
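
The chain (58)-(60) can be checked on a diagonal example. The sketch below assumes that $\mathcal{L}_0$ acts on Fourier coefficients as multiplication by $-(n\pi)^2$ (a truncated model of heat diffusion with fifty modes), so that $\|\exp(t\mathcal{L}_0)\| \le 1$ and $k = 0$. The truncation, the quadrature parameters, and all names are illustrative assumptions.

```python
import math

# Assumed diagonal generator: eigenvalues -(n*pi)^2 for n = 1..50.
EIGENVALUES = [-(n * math.pi) ** 2 for n in range(1, 51)]

def resolvent_norm(lam, eigenvalues=EIGENVALUES):
    """Norm of (lam*I - L_0)^(-1) when L_0 is diagonal with real eigenvalues:
    the largest of the values 1 / |lam - mu|."""
    return max(1.0 / abs(lam - mu) for mu in eigenvalues)

def laplace_of_mode(lam, mu, t_max=5.0, m=20000):
    """Midpoint approximation of the integral of exp(-lam*t) * exp(mu*t)
    over [0, t_max]; the tail beyond t_max is neglected."""
    h = t_max / m
    return h * sum(math.exp((mu - lam) * (k + 0.5) * h) for k in range(m))

# Condition (ii) with k = 0: the resolvent norm must not exceed 1 / lam.
bound_holds = all(resolvent_norm(lam) <= 1.0 / lam for lam in (0.1, 1.0, 10.0))

# Eq. (58) for the first mode: the Laplace transform of exp(mu*t) is 1/(lam - mu).
transform = laplace_of_mode(1.0, EIGENVALUES[0])
resolvent_entry = 1.0 / (1.0 - EIGENVALUES[0])
```

In this example the resolvent norm equals $1/(\lambda + \pi^2)$, which is smaller than the bound $\lambda^{-1}$ required by condition (ii) with $k = 0$.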


So far, we have established the relation between $\Phi(t, t')$ and $\mathcal{L}_0$,
and the conditions on $\mathcal{L}_0$ for which $\{\Phi(t, t')\}$ is to be a semigroup. It
remains to show that Eq. (52) is a solution to the nonhomogeneous
(forced) system (43). Phillips (20) has proved that Eq. (52) is a solution
if $D(t, X)F_\Omega(t, X)$ is strongly continuously differentiable⁹ in $t$. This is
also true if $D(t, X)F_\Omega(t, X) \in \mathcal{D}(\mathcal{L}_0)$ for all $t$, and $\mathcal{L}_0 D(t, X)F_\Omega(t, X)$
and $D(t, X)F_\Omega(t, X)$ are strongly continuous in $t$.
For time-varying systems where the parameters of $\mathcal{L}_0$ depend upon
$t$ and $X$, Kato has shown that if, in addition to the above conditions on
$DF_\Omega$, $\mathcal{L}_0$ satisfies Hille-Yosida's conditions (i)-(ii) with $k = 0$ for each
$t \in [t_0, t_1]$, the domain of $\mathcal{L}_0$ is independent of $t$, and the
operator $[I - \mathcal{L}_0(t)] \cdot [I - \mathcal{L}_0(t')]^{-1}$ satisfies certain regularity
conditions [see (21) for details], then Eq. (43) has a solution in the form of Eq. (52).
It should be mentioned that the existence and uniqueness theorems
for the initial-value problems associated with linear time-varying
systems where the parameters of $\mathcal{L}_0$ depend upon $t$, independent of $X$,
have been established by Friedman using a somewhat different approach
(16).
The foregoing discussions have been concentrated primarily on the
initial-value problems associated with linear system (43). In the more
general case where boundary conditions in the form of Eq. (44) are
imposed, we have the so-called initial boundary-value problems.
Conceptually speaking, many of these problems can be reformulated in the
framework of initial-value problems by imposing appropriate restrictions
on the domain of the operator $\mathcal{L}_0$. For example, if $\mathcal{L}_0$ is an operator in
$C^k(\Omega)$, the set of all real-valued functions defined on $\Omega$ having $k$
continuous derivatives, then, for system (43) with a homogeneous
boundary condition

$$\mathcal{L}_1\, U(t, X') = 0 \qquad \text{for all } X' \in \partial\Omega, \tag{61}$$

where $\mathcal{L}_1$ is a differential operator, we may consider it as an initial-value
problem with the domain of $\mathcal{L}_0$ defined by

$$\mathcal{D}'(\mathcal{L}_0) = \{U : U \in C^k(\Omega) \times \cdots \times C^k(\Omega)\ (N \text{ products of } C^k),\ \mathcal{L}_1 U = 0 \text{ at } \partial\Omega\} \tag{62}$$

or

$$\mathcal{D}''(\mathcal{L}_0) = C_0^k(\Omega) \times \cdots \times C_0^k(\Omega) \quad (N \text{ products of } C_0^k) \tag{63}$$

where $C_0^k(\Omega)$ is the set of all functions with compact support¹⁰ in $\Omega$
having $k$ continuous derivatives. Clearly, every function in $\mathcal{D}'(\mathcal{L}_0)$ or
$\mathcal{D}''(\mathcal{L}_0)$ satisfies the boundary condition (61).
The above interpretation of an initial boundary-value problem is conceptually simple. However, if we wish to apply Theorem II-1 to the problem, $\mathcal{L}_0$ will not in general satisfy the conditions of the theorem, since $\mathcal{D}(\mathcal{L}_0)$ may be too small as a result of the restrictions imposed by the boundary conditions. For some systems, it is possible to circumvent the above difficulty by making suitable extensions of the operator $\mathcal{L}_0$ [for example, Friedrichs' extension of semibounded symmetric operators (22)]. A detailed discussion of this aspect of the problem is beyond the scope of the present work.

III. Intrinsic Properties

Given a dynamical system, it is natural to ask what are its intrinsic properties which are of fundamental importance to control. The identification of most of these properties has been a natural outgrowth of modern control theory. In general, these properties are not independent, in the sense that one property may have certain implications for another, or there may exist certain "dual" relations among them. As the dynamical system and its control objectives become more complex, it is most likely that more intrinsic properties will be identified, and that each of them will have a more refined meaning. Hopefully, future developments in control theory will eventually lead to a general structure consolidating the various intrinsic properties in such a manner that the precise relationships between individual properties can be clearly identified.

$^{10}$ A function is said to have compact support in $\Omega$ if it vanishes outside a compact subset of $\Omega$; thus all its derivatives also vanish at $\partial\Omega$.

Here, we shall focus our attention on three well-known intrinsic properties, namely, stability, controllability, and observability.
Roughly speaking, stability is associated with the boundedness of excursions of the system motions about a certain prescribed regime; hence it can be regarded as a property related to the overall system's energy balance. Controllability is associated with the ability of steering one system state to another in a finite amount of time by means of a certain admissible class of controls; hence it can be regarded as a responsive property of the system with respect to the manipulatable inputs. Finally, observability is associated with the ability to obtain complete knowledge of the system state at any time from a finite amount of observed output data; hence it can be regarded as a structural property relating the system to the external observer.
In the subsequent sections, we shall discuss each of the above properties in detail.

A. Stability

Generally speaking, there are many possible mathematical definitions for stability. Here, we shall confine our discussion to stability in the sense of Lyapunov (23-26). Although the original work of Lyapunov (23) is primarily devoted to dynamical systems having a finite number of degrees of freedom, many of his results have been extended to distributed parameter dynamical systems with denumerably infinite degrees of freedom. The most general extended results have been obtained by Zubov (24). In contrast with Lyapunov functions associated with ordinary differential systems, Zubov's approach involves the introduction of Lyapunov functionals defined on the state space. On the other hand, Massera (27, 28) and Persidskii (29, 30) have extended Lyapunov's stability theorems to systems describable by denumerably infinite systems of ordinary differential equations. Since the state function space of many distributed parameter systems can be made isomorphic and isometric to a certain subset of an infinite-dimensional Euclidean space, the results of Zubov, Massera, and Persidskii are equally applicable in such cases.
In the subsequent discussions, we shall first establish precise definitions for various degrees of stability. Then, the main stability theorems will be stated. Their applications will be illustrated by specific examples.

1. DEFINITIONS

Consider a free, fixed-domain distributed parameter dynamical system defined by a set of transformations from a state space $\Gamma(\Omega)$ (with a specified metric $\rho(U, U')$) into itself. Using the same notations defined in Section II, a particular motion of the free system corresponding to a specified initial state $U_0(X)$ at time $t_0$ is denoted by $\Phi_{F=0}(t; U_0(X), t_0)$. The set $\{\Phi_{F=0}(t; U_0(X), t_0),\ t > t_0\}$ in $\Gamma(\Omega)$ determines a system trajectory in $\Gamma(\Omega)$. A set $\Gamma_{IV}(\Omega)$ is said to be invariant with respect to a given dynamical system if, for any initial state $U_0(X) \in \Gamma_{IV}(\Omega)$, its corresponding trajectory also lies in $\Gamma_{IV}(\Omega)$. Obviously, each trajectory is an invariant set, and the solution function space $\Gamma_S(\Omega)$ is also an invariant set. Since, in general, the solution function space may be a subset of $\Gamma(\Omega)$, we have the relation $\Gamma_{IV}(\Omega) \subseteq \Gamma_S(\Omega) \subseteq \Gamma(\Omega)$.
The distance of a state $U$ from an invariant set $\Gamma_{IV}(\Omega)$ is defined by

    $\rho(U, \Gamma_{IV}(\Omega)) = \inf_{U' \in \Gamma_{IV}(\Omega)} \rho(U, U')$    (64)

Also, the distance of a particular motion $\Phi_{F=0}(t; U_0(X), t_0)$ from an invariant set $\Gamma_{IV}(\Omega)$ is defined by

    $\rho(\Phi_{F=0}(t; U_0(X), t_0), \Gamma_{IV}(\Omega)) = \sup_{U \in \Phi_{F=0}(t; U_0(X), t_0)} \rho(U, \Gamma_{IV}(\Omega))$    (65)

Now, we shall state the precise definitions for various degrees of stability associated with the invariant set of distributed parameter dynamical systems.

(i) An invariant set $\Gamma_{IV}(\Omega)$ of a system in $\Gamma(\Omega)$ is said to be stable (in the sense of the metric of $\Gamma(\Omega)$) if, for every real number $\epsilon > 0$, there exists a real number $\delta(\epsilon, t_0) > 0$ such that

    $\rho(\Phi_{F=0}(t; U_0(X), t_0), \Gamma_{IV}(\Omega)) < \epsilon$ for all $t > t_0$

provided that

    $\rho(U_0, \Gamma_{IV}(\Omega)) < \delta(\epsilon, t_0)$.

In the special case where $\Gamma_{IV}(\Omega)$ is a set consisting of only an equilibrium state $U_{eq}(X)$, we have the definition for the stability of an equilibrium state.
(ii) An invariant set $\Gamma_{IV}(\Omega)$ is said to be asymptotically stable if it is stable and

    $\rho(\Phi_{F=0}(t; U_0(X), t_0), \Gamma_{IV}(\Omega)) \to 0$ as $t \to \infty$

If, in addition, $\delta$ is independent of $t_0$, then the invariant set $\Gamma_{IV}(\Omega)$ is said to be uniformly asymptotically stable.
(iii) If an invariant set $\Gamma_{IV}(\Omega)$ is stable (asymptotically stable, uniformly asymptotically stable) for all $U_0 \in \Gamma(\Omega)$, then $\Gamma_{IV}(\Omega)$ is said to be stable (asymptotically stable, uniformly asymptotically stable) in the large with respect to a specified $\Gamma(\Omega)$.

The above definitions also apply to a fixed-domain distributed parameter system which is coupled to a lumped parameter subsystem. Here, it is required to select a suitable metric for the state space, which consists of the Cartesian product of a function space and a finite-dimensional Euclidean space. For example, if the state functions $u_i(X)$, $i = 1, \ldots, N_d$, of the distributed system are square-integrable functions of $X$, and the state of the lumped system is a vector $U'$ in an $N_l$-dimensional Euclidean space, a possible definition for the distance between two states in the total state space is

    $\rho(U_{T(1)}, U_{T(2)}) = \left[ \sum_{i=1}^{N_l} \left(u'_{i(1)} - u'_{i(2)}\right)^2 + \sum_{i=1}^{N_d} \int_\Omega \left(u_{i(1)}(X) - u_{i(2)}(X)\right)^2 d\Omega \right]^{1/2}$    (66)

where the total state vector $U_T = (u'_1, \ldots, u'_{N_l}, u_1(X), \ldots, u_{N_d}(X))$.
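As a rough numerical illustration of a metric of this kind, the following sketch evaluates the distance between two total states when each distributed state function is sampled on a uniform grid over a one-dimensional domain. The function names, the grid, the trapezoidal quadrature, and the sample states are assumptions made here for illustration only; the square-root-of-sum-of-squares form is our reading of Eq. (66).

```python
import math

def trapezoid(values, dx):
    """Composite trapezoidal rule on a uniform grid."""
    return dx * (sum(values) - 0.5 * (values[0] + values[-1]))

def total_state_distance(lumped_a, lumped_b, fields_a, fields_b, dx):
    """Distance between two total states (lumped vector + sampled fields).

    Assumed form: square root of the summed squared differences of the
    lumped components plus the integrated squared field differences.
    """
    lumped_part = sum((a - b) ** 2 for a, b in zip(lumped_a, lumped_b))
    distributed_part = 0.0
    for field_a, field_b in zip(fields_a, fields_b):
        squared_diff = [(p - q) ** 2 for p, q in zip(field_a, field_b)]
        distributed_part += trapezoid(squared_diff, dx)
    return math.sqrt(lumped_part + distributed_part)

# Hypothetical states: one lumped variable and one field on Omega = [0, 1].
n = 101
dx = 1.0 / (n - 1)
grid = [i * dx for i in range(n)]
lumped_1, fields_1 = [1.0], [[math.sin(math.pi * x) for x in grid]]
lumped_2, fields_2 = [0.0], [[0.0 for _ in grid]]

d12 = total_state_distance(lumped_1, lumped_2, fields_1, fields_2, dx)
d21 = total_state_distance(lumped_2, lumped_1, fields_2, fields_1, dx)
d11 = total_state_distance(lumped_1, lumped_1, fields_1, fields_1, dx)
```

For these sample states the lumped part contributes 1 and the integral of $\sin^2(\pi x)$ over $[0, 1]$ contributes 1/2, so `d12` should be close to $\sqrt{1.5}$.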


In order to apply the above definitions to a variable-domain system, it is necessary to first transform it to an equivalent fixed-domain system, as exemplified by System 4 in Section II, A.

2. STABILITY THEOREMS

The direct method of Lyapunov attempts to make statements on the stability of motions of a dynamical system without any knowledge of the solutions of its governing equations. Fundamental to this method, as applied to a finite-dimensional ordinary differential system, is the selection of a scalar function $V(t, Z)$ which gives some estimate of the distance of the system state $Z$ from a specific invariant set in the state space. Let $Z(t; Z_0, t_0)$ be a system trajectory starting at $t_0$ with state $Z_0$. If it is possible to show that $V(t, Z(t; Z_0, t_0))$ will be small whenever $V(t_0, Z_0)$ is small, then the invariant set is stable. If, in addition, $V(t, Z(t; Z_0, t_0)) \to 0$ as $t \to \infty$, then the invariant set is asymptotically stable.
The above idea can be extended to distributed parameter dynamical systems defined on a fixed spatial domain $\Omega$. Since the system state at any time $t$ is specified by a vector-valued function $U(t, X)$, it is necessary to select a functional $\mathcal{V}(t, U)$ which gives some estimate of the distance of $U$ from a specified invariant set in the state function space $\Gamma(\Omega)$. A detailed discussion of this extended method has been given by Zubov (24). Here, we shall only state his main results.

THEOREM III-1 [Zubov (24)]: A necessary and sufficient condition for an invariant set $\Gamma_{IV}(\Omega)$ of a distributed parameter dynamical system defined on $\Gamma(\Omega)$ to be uniformly asymptotically stable is that there exists a real functional $\mathcal{V}(t, U(X))$ having the following properties:
(i) $\mathcal{V}(t, U(X))$ is defined for all $t \ge 0$ and all $U(X)$ belonging to a certain neighborhood $N(\Gamma_{IV}(\Omega), r)$ of $\Gamma_{IV}(\Omega)$ (i.e., the set of all $U$ for which $0 < \rho(U, \Gamma_{IV}(\Omega)) < r$);
(ii) for each sufficiently small $\eta_1 > 0$, there exists a number $\eta_2 > 0$ such that $\mathcal{V}(t, U(X)) > \eta_2$ for all $t > 0$, provided $\rho(U, \Gamma_{IV}(\Omega)) > \eta_1$;
(iii) $\lim \mathcal{V}(t, U(X)) = 0$ uniformly with respect to $t > t_0$ as $\rho(U, \Gamma_{IV}(\Omega)) \to 0$;
(iv) the function $\mathcal{V}'(t; U(X), t_0)$ defined by

    $\mathcal{V}'(t; U(X), t_0) = \sup_{U'(X) \in \Phi_{F=0}(t; U(X), t_0)} \mathcal{V}(t, U'(X))$    (67)

does not increase for all $t \ge t_0$;
(v) the function $\mathcal{V}'(t; U(X), t_0)$ tends toward zero as $t \to +\infty$ for all $U$ belonging to a certain neighborhood $N(\Gamma_{IV}(\Omega), \delta)$ of $\Gamma_{IV}(\Omega)$;
(vi) $\lim_{t \to +\infty} \mathcal{V}'(t; U(X), t_0) = 0$ uniformly with respect to $t_0$ for all $U$ satisfying $\rho(U, \Gamma_{IV}(\Omega)) < \delta$.

For stability and asymptotic stability, it is necessary and sufficient that conditions (i)-(iv) and (i)-(v) are satisfied, respectively.
For many systems, the state function space $\Gamma(\Omega)$ can be made isomorphic and isometric to an infinite-dimensional Banach space $\Gamma$, and the system's time-domain behavior is describable by a denumerably infinite system of ordinary differential equations defined on $\Gamma$, of the form:

    $\frac{d\hat{U}(t)}{dt} = \omega(\hat{U}(t), t)$    (68)

where $\hat{U}$ is an infinite-dimensional vector and $\omega$ is a specified vector-valued function of its arguments with the property $\omega(0, t) = 0$ for all $t$.
To study the stability of the trivial solution of this class of systems, the following theorem due to Massera is applicable.

THEOREM III-2 [Massera (27)]: The trivial solution of system (68) defined on a Banach space $\Gamma$ is uniformly asymptotically stable in the large if there exists a real-valued scalar function $V(t, \hat{U})$ with continuous first partial derivatives with respect to its arguments$^{11}$ such that $V(t, 0) = 0$ and:
(i') $V(t, \hat{U})$ is positive definite; i.e., there exists a continuous, nondecreasing scalar function $\eta$ such that

    $V(t, \hat{U}) \ge \eta(\|\hat{U}\|) > 0$ for all $t$ and $\hat{U} \ne 0$

and $\eta(0) = 0$;
(ii') the total derivative $dV(t, \hat{U})/dt$ evaluated along the solutions of Eq. (68) is negative definite;$^{11}$
(iii') $V(t, \hat{U})$ admits an infinitesimal upper bound; i.e., there exists a continuous, nondecreasing scalar function $\varphi(\|\hat{U}\|)$ such that

    $V(t, \hat{U}) \le \varphi(\|\hat{U}\|)$

for all $t$ and $\varphi(0) = 0$;
(iv') $\eta(\|\hat{U}\|) \to \infty$ with $\|\hat{U}\| \to \infty$.

For stability and uniform asymptotic stability, it is sufficient that conditions (i')-(ii') and (i')-(iii') are satisfied, respectively.

Remarks.

(i'') In Zubov's theorem, the assumption that the solutions satisfy the properties of a dynamical system (see Section II, B, 2) is needed to prove the necessity part of the theorem. In Massera's theorem, the above assumption has not been made and the conditions are only sufficient for uniform asymptotic stability in the large.

$^{11}$ Massera (27) has shown that the requirement that $V$ be differentiable may be replaced by the weaker condition that $V$ satisfies a local Lipschitz condition in $\hat{U}$, with $dV/dt$ replaced by a generalized derivative:

    $\frac{dV}{dt} = \limsup_{h \to 0^+} \frac{1}{h}\left[V(t + h, \hat{U} + h\,\omega(\hat{U}, t)) - V(t, \hat{U})\right]$

(ii'') In general, asymptotic stability does not imply uniform asymptotic stability in the case of an infinite-dimensional, time-invariant ordinary differential system. This fact can be demonstrated by the following simple example provided by Massera (27):

    $\frac{d\hat{u}_n}{dt} = -\frac{1}{n}\hat{u}_n, \quad n = 1, 2, \ldots$    (69)

defined on the Hilbert sequence space $l_2$.

(iii'') The extension of Lyapunov's direct method to denumerably infinite systems was first made by Persidskii (29, 30). He considered a system in the form of Eq. (68) defined on a normed linear space with a norm given by

    $\|\hat{U}\| = \sup_n \{|\hat{u}_n|,\ n = 1, 2, \ldots\}$    (70)
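The loss of uniformity in Massera's example can be seen numerically. The sketch below assumes that the example reads $d\hat{u}_n/dt = -\hat{u}_n/n$, $n = 1, 2, \ldots$, so that each component decays as $\exp(-t/n)$; the function names, the time step, and the choice of unit initial states are ours. Every solution tends to zero, yet the time needed for the $l_2$ norm to halve grows without bound as the initial state is moved to higher components, so no single decay time serves all initial states of unit norm.

```python
import math

def solution_norm(initial, t):
    """l2 norm at time t of the solution of du_n/dt = -u_n/n.

    `initial` holds the leading components of the initial state; all
    remaining components are taken to be zero.
    """
    return math.sqrt(sum((u0 * math.exp(-t / n)) ** 2
                         for n, u0 in enumerate(initial, start=1)))

def unit_state(n):
    """Initial state with a single unit entry in component n."""
    return [0.0] * (n - 1) + [1.0]

def time_to_reach(initial, level, dt=0.01, max_steps=1_000_000):
    """First grid time k*dt at which the norm is at or below `level`."""
    for k in range(max_steps + 1):
        t = k * dt
        if solution_norm(initial, t) <= level:
            return t
    return None

# Halving times for unit initial states placed in components 1, 10, 100.
half_times = {n: time_to_reach(unit_state(n), 0.5) for n in (1, 10, 100)}
```

Under the stated assumption the halving time for the unit state in component $n$ is $n \ln 2$, which the grid search should reproduce to within one time step.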

3. DISCUSSION OF PARTICULAR SYSTEMS

In the sequel, we shall discuss the application of the theorems stated in the previous section to derive stability conditions for particular classes of distributed parameter dynamical systems.
a. Linear Systems. Consider a free, linear system describable by

    $\frac{\partial U(t, X)}{\partial t} = \mathcal{L}_0 U(t, X)$    (71)

defined for $t > 0$ and $X \in \Omega$, where $\mathcal{L}_0$ is a linear spatial differential or integro-differential operator. Let the state space $\Gamma(\Omega) \subseteq L_2^N(\Omega)$ ($N$ products of $L_2(\Omega)$) with a norm defined by

    $\|U(t, X)\| = \left[\int_\Omega U^T(t, X) U(t, X)\, d\Omega\right]^{1/2}$    (72)

where $(\ )^T$ denotes transpose.
It is of interest to derive the stability conditions for the invariant set $\{U = 0\}$ in the sense of norm (72).
Consider the following positive-definite functional:

    $\mathcal{V} = \int_\Omega U^T(t, X) U(t, X)\, d\Omega = \langle U(t, X), U(t, X) \rangle_\Omega$    (73)

Since $\mathcal{V} = \|U(t, X)\|^2$, the conditions (i)-(iii) of Theorem III-1 are automatically satisfied. From condition (iv) of Theorem III-1, a sufficient condition for stability is that the total derivative of $\mathcal{V}$ with respect to $t$ is $\le 0$ for all $t > 0$:

    $\frac{d\mathcal{V}}{dt} = \langle \mathcal{L}_0 U(t, X), U(t, X) \rangle_\Omega + \langle U(t, X), \mathcal{L}_0 U(t, X) \rangle_\Omega \le 0$ for all $U \in \mathcal{D}(\mathcal{L}_0)$    (74)

In many systems, $\mathcal{V}$ corresponds to a measure of the system energy at any given time; condition (74) then states that the energy is nonincreasing in time. From the above physical motivations, Phillips (31) called an operator $\mathcal{L}_0$ satisfying (74) a dissipative operator, and an operator $\mathcal{L}_0$ satisfying (74) with equality sign a conservative operator.
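A dissipativity condition of the form (74) can be checked numerically for a simple operator. The sketch below is an illustration under our own assumptions, not part of the original development: it takes $\mathcal{L}_0 = \partial^2/\partial x^2$ on $\Omega = (0, 1)$ with zero boundary values, replaces it by the standard second-difference approximation on a uniform grid, and evaluates the discrete analogue of $\langle \mathcal{L}_0 U, U \rangle_\Omega + \langle U, \mathcal{L}_0 U \rangle_\Omega$ for randomly chosen states. The function names and grid size are hypothetical.

```python
import math
import random

def apply_laplacian(u, h):
    """Second-difference approximation of d^2/dx^2 with u = 0 at both ends.

    `u` holds the interior grid values only.
    """
    padded = [0.0] + list(u) + [0.0]
    return [(padded[i - 1] - 2.0 * padded[i] + padded[i + 1]) / h ** 2
            for i in range(1, len(padded) - 1)]

def inner(u, v, h):
    """Discrete L2 inner product on the interior grid."""
    return h * sum(a * b for a, b in zip(u, v))

def lyapunov_rate(u, h):
    """Discrete analogue of <L0 u, u> + <u, L0 u>."""
    lu = apply_laplacian(u, h)
    return inner(lu, u, h) + inner(u, lu, h)

random.seed(0)
m = 50                      # number of interior grid points (arbitrary)
h = 1.0 / (m + 1)

# Rates for random states: each should be nonpositive (dissipative case).
rates = []
for _ in range(20):
    u = [random.uniform(-1.0, 1.0) for _ in range(m)]
    rates.append(lyapunov_rate(u, h))

# Rate for the first sine mode, whose continuous value is -pi**2.
sine_mode = [math.sin(math.pi * (i + 1) * h) for i in range(m)]
sine_rate = lyapunov_rate(sine_mode, h)
```

Summation by parts shows that the discrete rate equals $-(2/h)\sum_i (u_{i+1} - u_i)^2$, so every entry of `rates` is nonpositive, in agreement with the energy interpretation given above.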
For a time-invariant system where the parameters of $\mathcal{L}_0$ are independent of $t$, and $\mathcal{L}_0$ is an infinitesimal generator of a semigroup (i.e., Eq. (71) describes a dynamical system), we have the following theorem for asymptotic stability, which is analogous to that for linear, time-invariant ordinary differential systems:

THEOREM III-3: If the linear operator $\mathcal{L}_0$ in Eq. (71) is time-invariant and is an infinitesimal generator of a semigroup, and its spectrum $\sigma(\mathcal{L}_0)$ satisfies the condition

    $\mathrm{Re}\, \sigma(\mathcal{L}_0) < -\gamma, \quad \gamma = \text{constant} > 0$    (75)

then the trivial solution of Eq. (71) is asymptotically stable in the large. Moreover, the solutions of Eq. (71) satisfy an estimate of the form

    $\|U(t, X; U_0(X), t_0)\| \le \exp[-\gamma(t - t_0)] \cdot \|U_0(X)\|$, for $t > t_0$    (76)
Proof. We recall that the spectrum of $\mathcal{L}_0$ is the set of all complex numbers $\lambda$ for which the operator $(\lambda I - \mathcal{L}_0)$ does not have an inverse,$^{12}$ and the resolvent set $\rho(\mathcal{L}_0)$ is the complement of the spectrum. From condition (75) of the theorem, the real interval $(-\gamma, +\infty)$ is in the resolvent set $\rho(\mathcal{L}_0)$.
Since $\mathcal{L}_0$ is assumed to be an infinitesimal generator of a semigroup, it follows directly from condition (ii) and property (i') of Theorem II-1 due to Hille and Yosida that

    $\|\exp(t\mathcal{L}_0)\| \le \exp(-\gamma t)$ for all $t > 0$    (77)

where $-\gamma$ corresponds to the constant "$k$" in Theorem II-1.

$^{12}$ In the finite-dimensional case where $\mathcal{L}_0$ is a constant matrix $A_0$, this condition is equivalent to the statement that the determinant of the matrix $(\lambda I - A_0)$ is zero.

From the discussions in Section II, B, 4, we can write the homogeneous solutions of Eq. (71) as

    $U(t, X; U_0(X), t_0) = \exp[(t - t_0)\mathcal{L}_0]\, U_0(X)$    (78)

and its norm as

    $\|U(t, X; U_0(X), t_0)\| = \|\exp[(t - t_0)\mathcal{L}_0]\, U_0(X)\|$    (79)

By definition of the norm of an operator (17) and inequality (77), it follows that:

    $\|\exp[(t - t_0)\mathcal{L}_0]\, U_0(X)\| \le \|\exp[(t - t_0)\mathcal{L}_0]\| \, \|U_0(X)\| \le \exp[-\gamma(t - t_0)] \cdot \|U_0(X)\|$    (80)

hence the proof is complete.
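An exponential estimate of this kind is easy to verify when the operator is diagonal in an orthonormal basis. The following sketch is an illustration under assumptions of our own: it takes the heat operator on $(0, 1)$ with zero boundary values, whose eigenvalues are $-(n\pi)^2$, so that any $\gamma$ below $\pi^2$ satisfies a spectral condition of the form (75), and it compares the norm of the solution with the bound $\exp(-\gamma t)\,\|U_0\|$ at a few sample times. The modal coefficients and the value of `gamma` are hypothetical.

```python
import math

def solution_norm(coeffs, t):
    """L2 norm of exp(t*L0) U0 when L0 is diagonal in an orthonormal basis.

    coeffs[n-1] is the n-th modal coefficient of U0; mode n decays with
    eigenvalue -(n*pi)**2 (heat operator on (0, 1), zero boundary values).
    """
    return math.sqrt(sum((c * math.exp(-(n * math.pi) ** 2 * t)) ** 2
                         for n, c in enumerate(coeffs, start=1)))

gamma = 9.0                       # assumed constant with gamma < pi**2
coeffs = [1.0, -0.5, 0.25, 2.0]   # hypothetical modal coefficients of U0
norm0 = solution_norm(coeffs, 0.0)

# Compare the solution norm with exp(-gamma*t)*||U0|| on a time grid.
checks = []
for k in range(21):
    t = 0.05 * k
    bound = math.exp(-gamma * t) * norm0
    checks.append(solution_norm(coeffs, t) <= bound + 1e-15)
```

The diagonal case is chosen because the operator norm of the semigroup is then simply the largest modal decay factor; for operators that are not normal, the spectral condition alone does not yield the estimate, which is why the proof above relies on the Hille-Yosida conditions.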

COROLLARY III-1: If $\mathcal{L}_0$ in Eq. (71) is time-invariant and an infinitesimal generator of a semigroup, and its spectrum $\sigma(\mathcal{L}_0)$ satisfies the condition

    $\mathrm{Re}\, \sigma(\mathcal{L}_0) < 0$    (81)

then the trivial solution of Eq. (71) is stable in the large.

Proof. It follows trivially from the proof of Theorem III-3 that $\exp(t\mathcal{L}_0)$ is a contraction operator (i.e., $\|\exp(t\mathcal{L}_0)\| \le 1$ for all $t \ge 0$). Hence

    $\|U(t, X; U_0(X), t_0)\| \le \|U_0(X)\|$ for all $t > t_0$    (82)

or the trivial solution of Eq. (71) is stable in the large.

Remarks.

(i) In the proofs of Theorem III-3 and Corollary III-1, we did not make explicit use of condition (74) as derived from Theorem III-1. In order to relate the inner products in inequality (74) to the spectrum of $\mathcal{L}_0$, it is necessary to extend first certain results in von Neumann's theory of spectral sets (32) to unbounded linear operators. Here, we make use of the fact that Eq. (71) describes a dynamical system, or that $\mathcal{L}_0$ is an infinitesimal generator of a semigroup, so that Theorem II-1 is applicable.
(ii) In order to apply Theorem III-3 and Corollary III-1 to a practical problem, it is necessary to verify first that $\mathcal{L}_0$ is an infinitesimal generator of a semigroup. Secondly, it is necessary to establish upper bounds for the real part of the spectrum of $\mathcal{L}_0$ in terms of its parameters. Both of these tasks may be very difficult. However, results are available for specific classes of systems.
So far, we have focused our attention primarily on linear systems without imposing boundary conditions. In general, a homogeneous boundary condition of the form

    $\mathcal{L}_1 U(t, X') = 0$ for $X' \in \partial\Omega$    (83)

may be introduced along with Eq. (71), where $\mathcal{L}_1$ is a linear spatial boundary operator.
Although the introduction of boundary conditions does not present any conceptual difficulties, the details become more involved. Here, we shall only illustrate the application of Theorem III-1 to a special class of linear systems with boundary conditions.
Consider a distributed parameter system defined on a bounded spatial domain $\Omega$, and governed by a symmetric hyperbolic PDE of the form (71) with $\mathcal{L}_0$ given by

    $\mathcal{L}_0 = \sum_{i=1}^{M} \frac{\partial}{\partial x_i}(A_i\, \cdot\,) + B$    (84)

where $A_i$ and $B$ are matrix-valued functions of $X$ only. Furthermore, the $A_i$ are symmetric and continuously differentiable in $\Omega$, and $B$ is continuous in $\Omega$.
Substituting the above expression for $\mathcal{L}_0$ directly into Eq. (74) leads to

    $\frac{d\mathcal{V}}{dt} = \left\langle \sum_{i=1}^{M} \frac{\partial}{\partial x_i}(A_i U(t, X)),\, U(t, X) \right\rangle_\Omega + \left\langle U(t, X),\, \sum_{i=1}^{M} \frac{\partial}{\partial x_i}(A_i U(t, X)) \right\rangle_\Omega + \langle U(t, X),\, (B^T + B) U(t, X) \rangle_\Omega$    (85)

From the assumption that the $A_i$ are symmetric and continuously differentiable in $\Omega$, Eq. (85) can be reduced to

    $\frac{d\mathcal{V}}{dt} = \int_\Omega \sum_{i=1}^{M} \frac{\partial}{\partial x_i}\left(U^T(t, X) A_i U(t, X)\right) d\Omega + \left\langle \left(B^T + B + \sum_{i=1}^{M} \frac{\partial A_i}{\partial x_i}\right) U(t, X),\, U(t, X) \right\rangle_\Omega$    (86)

The first integral in Eq. (86), in view of Green's theorem (75), can be rewritten as a surface integral

    $\int_{\partial\Omega} \sum_{i=1}^{M} U^T(t, X) A_i U(t, X)\, n_i\, d(\partial\Omega)$    (87)

where $n_i$ are the components of the outer normal to $\partial\Omega$. Physically speaking, this integral represents the rate at which energy enters the system through the boundary surface. The second inner product term in Eq. (86) represents the rate at which energy enters the system from energy sources in the interior of $\Omega$.
A sufficient condition for asymptotic stability is that $d\mathcal{V}/dt < 0$ for all $t > 0$. This condition may be satisfied in many ways:

(i') The matrix

    $B^T + B + \sum_{i=1}^{M} \frac{\partial A_i}{\partial x_i}$

is negative definite for all $X \in \Omega$, and the integral (87) is made to vanish by imposing a boundary condition

    $U(t, X) = 0$ for all $X \in \partial\Omega$ and all $t$    (88)

or is made nonpositive by imposing a so-called local boundary condition (22)

    $\sum_{i=1}^{M} U^T(t, X) A_i U(t, X)\, n_i \le 0$ for all $X \in \partial\Omega$    (89)

Note that in the latter case, the boundary condition is related intrinsically to the operator $\mathcal{L}_0$, since (89) involves the matrices $A_i$.
(ii') The sum of the boundary integral (87) and the second term in Eq. (86) is $< 0$ for all $t > 0$. Physically, this implies that the energy flow across the domain boundary and the internal energy generation or dissipation must be related in such a manner that the total energy of the system is nonincreasing in time and tends to zero as $t \to \infty$.

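b. Nonlinear Parabolic System (33). Consider a nonlinear distributed parameter dynamical system governed by a scalar PDE of the form

    $\frac{\partial u(t, X)}{\partial t} = \mathcal{L}_0 u(t, X) + g\left(t, X, u(t, X), \frac{\partial u(t, X)}{\partial x_1}, \ldots, \frac{\partial u(t, X)}{\partial x_M}\right)$    (90)

defined for $0 < t < \infty$ and $X \in \Omega$, where $g$ is a specified function of its arguments.
We shall introduce the following assumptions:

(i) $\mathcal{L}_0$ is a linear operator uniformly elliptic in $X$ and $t$, defined for all $X \in \Omega$ and $t > 0$:

    $\mathcal{L}_0 = \sum_{i,j=1}^{M} \frac{\partial}{\partial x_i}\left(a_{ij}(t, X) \frac{\partial}{\partial x_j}\right) + \sum_{i=1}^{M} b_i(t, X) \frac{\partial}{\partial x_i} + c(t, X)$    (91)

By definition, there exists a uniform ellipticity coefficient $\rho(t) > 0$ for all $t \ge 0$ such that for every $(t, X) \in [0, +\infty) \times \Omega$ and any real vector $\xi = (\xi_1, \ldots, \xi_M)$:

    $\sum_{i,j=1}^{M} a_{ij}(t, X)\, \xi_i \xi_j \ge \rho(t) \sum_{i=1}^{M} \xi_i^2$    (92)

(ii) The coefficients $a_{ij}$, $b_i$, and $c$ of $\mathcal{L}_0$ are continuous in $(t, X) \in [0, +\infty) \times \Omega$ and the following limits exist:

    $\lim_{t \to \infty} a_{ij}(t, X) = a_{ij}^{(\infty)}(X'), \quad \lim_{t \to \infty} b_i(t, X) = b_i^{(\infty)}(X'), \quad \lim_{t \to \infty} c(t, X) = c^{(\infty)}(X')$

Also, $a_{ij}$ and $b_i$ have continuous first partial derivatives with respect to $x_i$, and $c(t, X) < 0$.
(iii) $g(t, X, u(t, X) = 0, \partial u(t, X)/\partial x_1 = 0, \ldots, \partial u(t, X)/\partial x_M = 0) = 0$ for all $t > 0$ and $X \in \Omega$, and $g$ satisfies a uniform local Lipschitz condition in $u(t, X)$, $\partial u(t, X)/\partial x_i$, $i = 1, \ldots, M$, for all $t > 0$ and $X \in \Omega$,

    $\left| g\left(t, X, u, \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_M}\right) \right| \le \sigma_0 |u(t, X)| + \sum_{i=1}^{M} \sigma_i \left| \frac{\partial u(t, X)}{\partial x_i} \right|$    (93)

where $\sigma_0, \sigma_1, \ldots, \sigma_M$ are positive constants.
(iv) $u(t, X) = 0$ for all $X \in \partial\Omega$.
(v) The state function space $\Gamma = C_0(\Omega)$, the set of all continuous functions of $X$ defined on $\Omega$ which vanish at $\partial\Omega$ and have bounded first partial derivatives at $\partial\Omega$.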

In view of assumptions (iii) and (iv), it is evident that the trivial solution $u = 0$ is an equilibrium state of system (90). It is of interest to determine the conditions for asymptotic stability of the trivial solution of Eq. (90) [in the sense of an $L_2$ norm defined by Eq. (72)] in terms of the known parameters of the system.

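Consider again a positive definite functional $\mathcal{V}$ in the form of Eq. (73), which satisfies conditions (i)-(iii) of Theorem III-1. The total derivative of $\mathcal{V}$ with respect to $t$ is

    $\frac{d\mathcal{V}}{dt} = 2\left[\langle u(t, X), \mathcal{L}_0 u(t, X) \rangle_\Omega + \langle u(t, X), g \rangle_\Omega\right]$    (94)

For asymptotic stability, we require that $d\mathcal{V}/dt < 0$. Thus, the problem reduces to finding upper bounds for the inner products $\langle u, \mathcal{L}_0 u \rangle_\Omega$ and $\langle u, g \rangle_\Omega$.
The first inner product in Eq. (94) can be rewritten as

    $\langle u, \mathcal{L}_0 u \rangle_\Omega = \int_\Omega \sum_{i,j=1}^{M} \frac{\partial}{\partial x_i}\left(u\, a_{ij} \frac{\partial u}{\partial x_j}\right) d\Omega - \int_\Omega \sum_{i,j=1}^{M} a_{ij} \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j}\, d\Omega + \frac{1}{2} \int_\Omega \sum_{i=1}^{M} \frac{\partial}{\partial x_i}\left(b_i(t, X)\, u^2(t, X)\right) d\Omega + \int_\Omega \left(c - \frac{1}{2} \sum_{i=1}^{M} \frac{\partial b_i}{\partial x_i}\right) u^2\, d\Omega$    (95)

The first and third integrals in Eq. (95) vanish by Green's theorem and boundary condition (iv). The second integral, in view of the uniform ellipticity condition (92), is bounded above by

    $-\rho(t) \sum_{i=1}^{M} \left\| \frac{\partial u(t, X)}{\partial x_i} \right\|^2$    (96)

Now, consider the second inner product in Eq. (94). From the assumed Lipschitzian property of $g$, we have

    $\langle u, g \rangle_\Omega \le \langle |u(t, X)|, \sigma_0 |u(t, X)| \rangle_\Omega + \sum_{i=1}^{M} \left\langle |u(t, X)|, \sigma_i \left| \frac{\partial u(t, X)}{\partial x_i} \right| \right\rangle_\Omega$    (97)

Applying Schwarz inequality to the second inner product in the right-hand side of the above inequality leads to

    $\langle u, g \rangle_\Omega \le \langle |u(t, X)|, \sigma_0 |u(t, X)| \rangle_\Omega + \sum_{i=1}^{M} \sigma_i \|u(t, X)\| \cdot \left\| \frac{\partial u(t, X)}{\partial x_i} \right\|$    (98)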

To proceed further, we shall make use of the following lemma:

LEMMA III-1: Let $\Theta(X)$ be a real-valued function defined on a bounded subset $\Omega$ of an $M$-dimensional Euclidean space. $\Theta(X)$ has continuous partial derivatives with respect to $x_i$, $i = 1, \ldots, M$, up to order $K$. Furthermore, $\Theta(X)$ together with its partial derivatives up to order $K - 1$ vanish at $\partial\Omega$. Then, for all $S < K$ and $S \ge 0$

    $\left\| \frac{\partial^S \Theta}{\partial x_1^{s_1} \cdots \partial x_M^{s_M}} \right\| \le d_0^{K - S} \left\| \frac{\partial^K \Theta}{\partial x_1^{k_1} \cdots \partial x_M^{k_M}} \right\|$    (99)

where $S = \sum_{i=1}^{M} s_i$, $K = \sum_{j=1}^{M} k_j$, $d_0$ is the diameter of $\Omega$, and $\|\cdot\|$ is an $L_2$ norm.
The above lemma can be readily proved by partial integration and applying Schwarz inequality.
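The simplest instance of the lemma can be checked numerically. The sketch below assumes the one-dimensional case $M = 1$, $K = 1$, $S = 0$, for which we read inequality (99) as $\|\Theta\| \le d_0 \|d\Theta/dx\|$ for functions vanishing at both ends of $\Omega = (0, d_0)$. The test functions, the grid, and the trapezoidal quadrature are choices made here for illustration.

```python
import math

def l2_norm(values, dx):
    """Trapezoidal approximation of the L2 norm on a uniform grid."""
    squares = [v * v for v in values]
    return math.sqrt(dx * (sum(squares) - 0.5 * (squares[0] + squares[-1])))

def check_lemma_1d(theta, dtheta, d0, n=2001):
    """Return (||theta||, d0 * ||theta'||) on Omega = (0, d0)."""
    dx = d0 / (n - 1)
    grid = [i * dx for i in range(n)]
    lhs = l2_norm([theta(x) for x in grid], dx)
    rhs = d0 * l2_norm([dtheta(x) for x in grid], dx)
    return lhs, rhs

d0 = 2.0  # hypothetical diameter of Omega

# Two test functions vanishing at x = 0 and x = d0, with exact derivatives.
cases = [
    (lambda x: math.sin(math.pi * x / d0),
     lambda x: (math.pi / d0) * math.cos(math.pi * x / d0)),
    (lambda x: x * (d0 - x),
     lambda x: d0 - 2.0 * x),
]
results = [check_lemma_1d(f, df, d0) for f, df in cases]
```

For the sine function with $d_0 = 2$ the two sides are $1$ and $\pi$, so the inequality holds with room to spare; the constant $d_0$ is convenient rather than sharp.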
Applying the above lemma to inequalities (96) and (98) results in the following upper bound for $d\mathcal{V}/dt$:

    $\frac{d\mathcal{V}}{dt} \le -\alpha(t, X) \langle u(t, X), u(t, X) \rangle_\Omega$    (100)

where

(101)

Clearly, asymptotic stability is achieved if

    $\alpha(t, X) > 0$ for all $t > 0$ and all $X \in \Omega$    (102)

In the special case where the linear operator $-\mathcal{L}_0$ is self-adjoint in $C_0(\Omega)$ for all $t \ge 0$, and has a purely positive discrete spectrum, we can make use of the following well-known inequality

    $\langle u, -\mathcal{L}_0 u \rangle_\Omega \ge \lambda_{\min}(t) \langle u, u \rangle_\Omega$ for all $t > 0$    (103)

where $\lambda_{\min}(t)$ is the minimum eigenvalue. In this case, a sufficient condition for asymptotic stability is simply

    $\lambda_{\min}(t) > \sigma_0 + \frac{1}{d_0} \sum_{i=1}^{M} \sigma_i$ for all $t > 0$    (104)

Similar conditions for the asymptotic stability of the trivial solution can be established for a uniformly parabolic system with time delays, which is governed by a partial differential-difference equation of the form

    $\frac{\partial u(t, X)}{\partial t} = \mathcal{L}_0 u(t, X) + g'\left(t, X, u(t, X), u(t - \tau_d, X), \frac{\partial u(t, X)}{\partial x_i}, \frac{\partial u(t - \tau_d, X)}{\partial x_i}\right), \quad i = 1, \ldots, M$    (105)

where $g'$ is a specified function of its arguments and $\tau_d$ is the delay time. The details are given in reference (72).

B. Controllability
It was stated earlier that controllability is associated with the ability of steering one system state to another in a finite amount of time by means of a certain admissible class of controls. The notion of controllability was first introduced by Kalman (34, 35). He derived precise mathematical conditions for the controllability of finite-dimensional linear dynamical systems. The same conditions were derived independently by Pontryagin et al. (36) in their work on optimum control. Recently, Gilbert (37) has defined controllability of finite-dimensional linear dynamical systems from the standpoint of system structural decomposition (i.e., the system cannot be decomposed in such a manner that one or more of its state variables is unaffected by the controls for all time). However, under certain restrictive conditions, Gilbert's definition is equivalent to that of Kalman.
For dynamical systems governed by nonlinear ordinary differential equations, results pertaining to local controllability in Kalman's sense have been obtained by Lee and Markus (38). Also, they have established a relation between complete controllability and global asymptotic stability.
In this section, we shall extend the definitions for various degrees of controllability in the sense of Kalman to distributed parameter dynamical systems. Since the extent of usefulness and importance of this property in a general dynamical system (in particular, nonlinear systems) is not yet clear at this time, the definitions may appear somewhat superficial. However, since the controllability of a system is closely related to the question of existence of optimum controls, and since many optimum control problems associated with distributed parameter systems can be formulated in a manner which is directly analogous to that for lumped parameter systems, it is most likely that there exists a certain degree of parallelism between the roles of controllability in lumped and distributed parameter control system theories. Furthermore, it is expected that many known results pertaining to the controllability of lumped parameter systems can be generalized to the case of distributed parameter systems. Here, we shall show that Kalman's results for the controllability of finite-dimensional linear dynamical systems can be generalized to a class of linear distributed parameter dynamical systems. Also, we shall discuss the physical meaning of controllability in the cases of variable-domain distributed systems and coupled lumped and distributed systems with the aid of simple specific examples.

1. DEFINITIONS

Since controllability in the sense of Kalman is defined with respect to an admissible class of control functions, it is necessary to establish first the precise definitions for admissible controls.
In general, the mathematical conditions defining an admissible set of controls can be categorized as follows:
(i) a set of weakest conditions which one can impose on the control functions such that the solutions of the forced system equations still exist,
(ii) a set of conditions imposed by physical constraints on the control functions (for example, a possible constraint for the distributed control function may be in the form

    $|f_{\Omega(i)}(t, X)| \le g_i(X)$ almost everywhere in $[0, T] \times \Omega$, $i = 1, \ldots, K$

where $g_i > 0$ for all $X \in \Omega$).
Clearly, a set of control functions satisfying only condition (i) represents the largest set of admissible controls.
For lumped parameter systems governed by finite-dimensional first-order vector ordinary differential equations, the mathematical condition corresponding to (i) is that every component of the vector control function defined on a finite time interval must be Lebesgue measurable, as established by Caratheodory's existence theorem (39). The solutions corresponding to controls satisfying the above condition will be continuous and once differentiable with respect to $t$, except on a set of measure zero, and satisfy the differential equation almost everywhere. For distributed parameter systems governed by a partial differential equation of the form (36) with boundary condition (37), there is no corresponding general mathematical condition for ensuring the existence of forced solutions. Results are available only for particular classes of systems. Most likely, the mathematical conditions which one must impose on the control functions involve more than just Lebesgue measurability.
120 P . K. C. WANG

In the ensuing discussions, the term "set of admissible control functions"
[denoted by $\mathscr{I}(\tau \times \Omega)$] will be used without making explicit
statements on conditions (i) and (ii). The following definitions are
directed primarily to fixed-domain distributed parameter systems defined
on a state function space $\Gamma(\Omega)$ with a specified metric $\rho(U, U')$.
(i') A state $U_0(X) \in \Gamma(\Omega)$ is said to be null-controllable at time $t_0$
if there exists an admissible control function which will transfer $U_0(X)$
to the null state in a finite time $T$, i.e., the solution
$U_{F_\Omega}(t_0 + T, X; U_0(X), t_0) = 0$ almost everywhere in $\Omega$. In general, $T$ depends upon
both $t_0$ and $U_0(X)$. If the condition $U_{F_\Omega}(t_0 + T, X; U_0(X), t_0) = 0$ is
replaced by $\rho(U_{F_\Omega}(t_0 + T, X; U_0(X), t_0), 0) < \delta$, where $\delta$ is a specified
positive number, then the state $U_0(X)$ is said to be null δ-controllable
at time $t_0$.
Obviously, any state defined by the solution $U_{F_\Omega}(t, X; U_0(X), t_0)$ at
any fixed $t \in (t_0, t_0 + T)$ is null (δ) controllable at time $t$.
If a state is null (δ) controllable independent of $t_0$, then the state is
said to be null (δ) controllable.
(ii') The set of all states which are null-controllable at $t_0$ will be
called the domain of null-controllability at $t_0$ (denoted by $\mathscr{C}^0_{t_0}(\Omega)$). Also,
the set of all states which are null-controllable independent of $t_0$ will be
called the domain of null-controllability (denoted by $\mathscr{C}^0(\Omega)$). Clearly,
$\mathscr{C}^0_{t_0}(\Omega)$ and $\mathscr{C}^0(\Omega)$ must be connected subsets of $\Gamma(\Omega)$ containing the
null state.
Similarly, we can define the domain of null δ-controllability at $t_0$,
$\mathscr{C}^\delta_{t_0}(\Omega)$, and the domain of null δ-controllability, $\mathscr{C}^\delta(\Omega)$. Here, the null
state must be an interior point of $\mathscr{C}^\delta_{t_0}(\Omega)$ and of $\mathscr{C}^\delta(\Omega)$.
(iii') If $\mathscr{C}^0(\Omega)$ [$\mathscr{C}^0_{t_0}(\Omega)$] coincides with the entire state space $\Gamma(\Omega)$,
then the system is said to be completely null-controllable [at $t_0$]. Also, the
definition for a completely null δ-controllable system can be established.

Remarks.
(i") In general, the solution space $\Gamma_s(\Omega)$ may be a subset of the state
space $\Gamma(\Omega)$; therefore a completely null-controllable system does not
always imply that any two states in $\Gamma(\Omega)$ are mutually transferable in
finite time by means of admissible controls. However, if $\Gamma(\Omega) = \Gamma_s(\Omega)$,
then the above implication holds. The same arguments apply to domains
$\mathscr{C}^0(\Omega)$ and $\mathscr{C}^0_{t_0}(\Omega)$.
(ii") If the null state of the system is uniformly asymptotically stable
in the sense of Lyapunov, then there exists a set $\mathscr{C}^\delta(\Omega)$ with the null
state as an interior point.
(iii") The notion of δ-controllability is useful when dealing with
approximate systems and in the design of linear distributed control
systems (see Sections IV and V).

The above definitions can be readily extended to fixed-domain distributed
parameter systems coupled with a lumped system by defining
an appropriate state space with a suitable metric and a set of admissible
controls. In order to extend the above definitions to a variable-domain
distributed system, it is necessary to first transform the system into an
equivalent fixed-domain system, as exemplified by System 4 in Section
II, A. Since the boundary position, velocity, etc., may enter as state
variables, the determination of the controllability of such a system
may be related to the question of whether the size, expansion rate, etc.,
of the spatial domain can be transferred from one set of values to another
in finite time by means of certain admissible controls.

2. CONTROLLABILITY OF PARTICULAR SYSTEMS

Having established precise definitions for various degrees of controllability,
it is of interest to examine the conditions under which a particular
system or a class of systems possesses these properties.

a. Linear Systems. A linear system governed by

\[
\frac{\partial U(t, X)}{\partial t} = \mathscr{L}_0 U(t, X) + D(t, X)\,F_\Omega(t, X) \tag{106}
\]

defined for $t > t_0$, will be considered here. It is assumed that the initial
states $U_0(X) \in L_2^N(\Omega)$, the N-fold product of $L_2(\Omega)$. If there are given boundary
conditions, it is assumed that they are linear, homogeneous and of the
form
if, U(t, Χ ' ) = 0 for all X ' Ε dQ (107)

and are taken care of by restricting the domain of $\mathscr{L}_0$ to functions
satisfying Eq. (107).
Since the finite-dimensional linear system is a particular case of the
above system, it is natural to ask what are the controllability conditions
for this system which correspond to those for the finite-dimensional
case.
First, we introduce the following assumptions:

(i) The linear operator $\mathscr{L}_0$ (a spatial differential or integro-differential
operator) is an infinitesimal generator of a semigroup (or a group).

(ii) No constraints are imposed on the magnitude of $F_\Omega$, and $D$ is
a matrix whose elements are specified continuous functions of $t$ and $X$.
Furthermore, $D(t, X)F_\Omega(t, X)$ is strongly differentiable in $t$ so that the
solution to Eq. (106) with an initial state $U_0(X)$ and control $F_\Omega$ can be
written in the form (20, 21)

\[
U_{F_\Omega}(t, X; U_0(X), t_0) = \Phi(t, t_0)\,U_0(X) + \int_{t_0}^{t} \Phi(t, t')\,D(t', X)\,F_\Omega(t', X)\,dt' \tag{108}
\]

In view of the above assumptions, it is evident that if the system (106)
is to be completely null-controllable at $t_0$, there must exist an admissible
control $F_\Omega$ which will transfer an arbitrary initial state $U_0(X) \in L_2^N(\Omega)$
to the null state in a finite amount of time; that is, there exists an admissible
control $F_\Omega$ defined on $[t_0, t_1] \times \Omega$ which satisfies the following integral
equation for any given $U_0(X) \in L_2^N(\Omega)$:

\[
-\Phi(t_1, t_0)\,U_0(X) = \int_{t_0}^{t_1} \Phi(t_1, t')\,D(t', X)\,F_\Omega(t', X)\,dt' \tag{109}
\]

where $t_0 < t_1 = \text{constant} < +\infty$.


Let us rewrite the above equation as

\[
-\Phi(t_1, t_0)\,U_0(X) = \mathscr{L}_f(t_1, t_0)\,F_\Omega(t, X) \tag{110}
\]

where

\[
\mathscr{L}_f(t_1, t_0) = \int_{t_0}^{t_1} \Phi(t_1, t')\,D(t', X)\,(\,\cdot\,)\,dt' \tag{111}
\]

The domain of $\mathscr{L}_f(t_1, t_0)$ is $\mathscr{I}([t_0, t_1] \times \Omega)$, a set consisting of all
admissible control functions $F_\Omega$ defined on $[t_0, t_1] \times \Omega$, which is dense
in $L_2^N([t_0, t_1] \times \Omega)$. The range of $\mathscr{L}_f(t_1, t_0)$ is a subset of $L_2^N(\Omega)$. On
the other hand, $-\Phi(t_1, t_0)$ represents a continuous mapping from
$L_2^N(\Omega)$ into itself. Obviously, in order for some states in $L_2^N(\Omega)$
to be null-controllable at $t_0$, the intersection between the ranges of
$-\Phi(t_1, t_0)$ and $\mathscr{L}_f(t_1, t_0)$ must be nonempty for some finite $t_1 \ge t_0$.
From the assumption that $\mathscr{L}_0$ is an infinitesimal generator of a semigroup,
$\Phi(t_1, t')$ is bounded. Also, since the elements of matrix $D$ are
continuous functions of $t$ and $X$, $\mathscr{L}_f(t_1, t_0)$ is a bounded operator.
Hence, the adjoint of $\mathscr{L}_f(t_1, t_0)$ (denoted by $\mathscr{L}_f^*(t_1, t_0)$) exists and is
defined by the relation
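\[
\langle G(X),\ \mathscr{L}_f(t_1, t_0)\,F(t, X) \rangle_1 = \langle \mathscr{L}_f^*(t_1, t_0)\,G(X),\ F(t, X) \rangle_2 \tag{112}
\]

where $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$ denote inner products in $L_2^N(\Omega)$ and
$L_2^N([t_0, t_1] \times \Omega)$, respectively. The domain of $\mathscr{L}_f^*(t_1, t_0)$ is $L_2^N(\Omega)$ and
its range is in $L_2^N([t_0, t_1] \times \Omega)$.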

Let us consider the following equation obtained by premultiplying
both sides of Eq. (110) by $\mathscr{L}_f^*(t_1, t_0)$, i.e.,

\[
-\mathscr{L}_f^*(t_1, t_0)\,\Phi(t_1, t_0)\,U_0(X) = \mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)\,F_\Omega(t, X) \tag{113}
\]


The operator $\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)$ on the right-hand side of Eq. (113)
is a linear, self-adjoint, non-negative operator which maps $\mathscr{I}([t_0, t_1] \times \Omega)$
into $L_2^N([t_0, t_1] \times \Omega)$. Hence its spectrum is real and nonempty.
Clearly, if zero is not contained in the spectrum of $\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)$
(i.e., $\mathscr{L}_f^*\mathscr{L}_f$ maps no nonzero $F_\Omega$ into zero), then $\mathscr{L}_f^*\mathscr{L}_f$ has an
inverse, and $F_\Omega(t, X)$ has a unique solution given by

\[
F_\Omega(t, X) = -\bigl(\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)\bigr)^{-1}\,\mathscr{L}_f^*(t_1, t_0)\,\Phi(t_1, t_0)\,U_0(X) \tag{114}
\]

The above result can be stated formally as a controllability lemma
which is a generalization of that for finite-dimensional linear dynamical
systems given by Kalman (34, 35).

LEMMA III-2: A necessary and sufficient condition for a linear distributed
parameter dynamical system governed by Eq. (106) [and possibly
boundary condition (107)] satisfying assumptions (i)-(ii) to be completely
null-controllable at $t_0$ is that the linear, self-adjoint operator$^{13}$

\[
\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)
= \Bigl(\int_{t_0}^{t_1} \Phi(t_1, t')\,D(t', X)\,(\,\cdot\,)\,dt'\Bigr)^{*} \Bigl(\int_{t_0}^{t_1} \Phi(t_1, t')\,D(t', X)\,(\,\cdot\,)\,dt'\Bigr) \tag{115}
\]

which maps $\mathscr{I}([t_0, t_1] \times \Omega)$ into $L_2^N([t_0, t_1] \times \Omega)$, has an inverse for
some finite $t_1 > t_0$.
Proof: Sufficiency. Setting $F_\Omega$ in the form of Eq. (114) and substituting
it into Eq. (108) lead directly to the result that $U_{F_\Omega}(t_1, X; U_0(X), t_0) = 0$
almost everywhere in $\Omega$.

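Necessity. Assume $\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)$ does not have an inverse.
Then there exists a nonzero $F_\Omega(t, X) \in \mathscr{I}([t_0, t_1] \times \Omega)$ such that

\[
\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)\,F_\Omega(t, X) = 0. \tag{116}
\]

Thus,

\[
\langle \mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)\,F_\Omega(t, X),\ F_\Omega(t, X) \rangle = \| \mathscr{L}_f(t_1, t_0)\,F_\Omega(t, X) \|^2 = 0 \tag{117}
\]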
or $\mathscr{L}_f F_\Omega(t, X) = 0$ almost everywhere in $\Omega$, where $\langle \cdot, \cdot \rangle$ denotes an
inner product in $\mathscr{I}([t_0, t_1] \times \Omega)$, and $\| \cdot \|$ denotes an $L_2^N([t_0, t_1] \times \Omega)$
norm.

13 The lemma remains valid if we consider the linear self-adjoint operator
$\mathscr{L}_f(t_1, t_0)\,\mathscr{L}_f^*(t_1, t_0)$ instead of $\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)$.
On the other hand, if the system is completely null-controllable at
$t_0$, then for every $U_0(X) \in L_2^N(\Omega)$, there exist a finite time $t_1$ and an
$F_\Omega(t, X) \in \mathscr{I}([t_0, t_1] \times \Omega)$ satisfying

\[
-\Phi(t_1, t_0)\,U_0(X) = \mathscr{L}_f(t_1, t_0)\,F_\Omega(t, X) \tag{118}
\]

which, in view of Eq. (117), implies that

\[
\Phi(t_1, t_0)\,U_0(X) = 0 \quad \text{almost everywhere in } \Omega \tag{119}
\]

The above condition states that the solution to the free system with initial
state $U_0(X)$ is zero at time $t_1$, or $F_\Omega(t, X) = 0$ almost everywhere on
$[t_0, t_1] \times \Omega$, which contradicts the assumption that $F_\Omega(t, X) \ne 0$. The
contradiction completes the proof.$^{14}$
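
The finite-dimensional case that the lemma generalizes can be checked numerically. The sketch below is an illustration only: it assumes a double integrator (a hypothetical example, not a system from the text), uses the operator $\mathscr{L}_f\mathscr{L}_f^*$ (the controllability Gramian, the alternative form mentioned in footnote 13) rather than $\mathscr{L}_f^*\mathscr{L}_f$, and all function names are invented for this example.

```python
import numpy as np

def phi(t, s):
    """Transition matrix Phi(t, s) of the double integrator x1' = x2, x2' = u."""
    return np.array([[1.0, t - s], [0.0, 1.0]])

B = np.array([[0.0], [1.0]])  # control enters the second state only (assumed example)

def gramian(t1, n=2001):
    """W = integral over [0, t1] of Phi(t1,s) B B^T Phi(t1,s)^T ds (trapezoid rule)."""
    s = np.linspace(0.0, t1, n)
    vals = np.array([phi(t1, si) @ B @ B.T @ phi(t1, si).T for si in s])
    h = s[1] - s[0]
    return h * (vals.sum(axis=0) - 0.5 * (vals[0] + vals[-1]))

def null_control(x0, t1):
    """Return u(t) = -B^T Phi(t1,t)^T W^{-1} Phi(t1,0) x0."""
    w_inv_target = np.linalg.solve(gramian(t1), phi(t1, 0.0) @ x0)
    return lambda t: float(-(B.T @ phi(t1, t).T @ w_inv_target)[0])

def simulate(x0, u, t1, steps=2000):
    """Integrate x' = A x + B u(t) with the classical fourth-order Runge-Kutta scheme."""
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    f = lambda t, x: A @ x + B[:, 0] * u(t)
    x, h = np.array(x0, dtype=float), t1 / steps
    for k in range(steps):
        t = k * h
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

x0 = np.array([1.0, -0.5])
x_final = simulate(x0, null_control(x0, t1=2.0), t1=2.0)
print(np.linalg.norm(x_final))  # should be close to zero
```

Invertibility of the Gramian plays the role that invertibility of the operator in Eq. (115) plays in the lemma: when `np.linalg.solve` succeeds, the constructed control transfers the chosen initial state to the null state at $t_1$.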
In the above lemma, the condition for controllability is expressed in
terms of an abstract transition operator $\Phi(t, t')$. In many systems, it is
possible to express $\Phi(t, t')$ explicitly in the form

\[
\Phi(t, t') = \int_\Omega K(t, t', X, X')\,(\,\cdot\,)\,d\Omega', \qquad t \ge t' \tag{120}
\]

with the property that $\Phi(t, t) = I$ and

\[
\Phi(t, t'')\,\Phi(t'', t') = \int_\Omega K(t, t'', X, X'') \Bigl[\int_\Omega K(t'', t', X'', X')\,(\,\cdot\,)\,d\Omega'\Bigr]\,d\Omega'', \qquad t \ge t'' \ge t' \tag{121}
\]

where $K$ is the Green's function matrix of the system.
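In this case, the operator $\mathscr{L}_f^*(t_1, t_0)\,\mathscr{L}_f(t_1, t_0)$ is given by

\[
\mathscr{L}_f^*\mathscr{L}_f = \int_\Omega D^T(t, X)\,K^T(t_1, t, X', X) \int_{t_0}^{t_1}\!\!\int_\Omega K(t_1, t', X', X'')\,D(t', X'')\,(\,\cdot\,)\,d\Omega''\,dt'\,d\Omega' \tag{122}
\]

where $(\,\cdot\,)^T$ denotes transpose.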
14 A similar proof can be given for the case where we consider the linear self-adjoint
operator $\mathscr{L}_f\mathscr{L}_f^*$ instead of $\mathscr{L}_f^*\mathscr{L}_f$. From the mathematical standpoint, these are
well-known results in the theory of linear operators in Hilbert space (52, Chapt. VII).

Remarks.
(i') It will be shown in Section IV, C, 1 that the control law given by
Eq. (114) is precisely the one which will transfer an arbitrary initial
state $U_0(X) \in L_2^N(\Omega)$ to the null state in a specified time with minimum
control energy.
(ii') The condition for controllability established in Lemma III-2 is
expressed in terms of the transition operator $\Phi(t, t')$ of the system. A
more useful form would express the conditions explicitly in terms
of $\mathscr{L}_0$ and $D$ in Eq. (106). However, general results in this form are not
readily derivable even in the case of time-invariant systems.
(iii') It is natural to ask whether Gilbert's definition (57) for complete
controllability of finite-dimensional linear dynamical systems can be
extended to the distributed case. In essence, Gilbert's definition involves
the question whether the system can be transformed into one such that
one or more of its state variables is unaffected by the control for all
time. If the answer is affirmative, then the system is not completely
controllable. This definition seems to be a natural one from the standpoint
of canonical representation of linear dynamical systems. Although
we can adopt Gilbert's definition for linear distributed parameter dynamical
systems, the usefulness of this definition is limited by
the fact that there is no systematic procedure for transforming linear
PDE's from one form to another, since the transformation generally
depends on the spatial coordinate variables. Furthermore, the relation
between Gilbert's and Kalman's definitions is not as clear as in the case of
finite-dimensional linear systems. However, from physical arguments,
it can be deduced that for linear distributed systems, complete controllability
in Kalman's sense implies complete controllability in Gilbert's
sense; but the converse is not true in general. The latter fact can be
clarified by the following example.
Consider a linear distributed system governed by

dux(ty X)
(123)
Ft
\[
\frac{\partial u_2(t, X)}{\partial t} = \mathscr{L}_{21}\,u_1(t, X) + \mathscr{L}_{22}\,u_2(t, X) \tag{124}
\]

defined on a spatial domain $\Omega$, where the $\mathscr{L}_{ij}$ are time-invariant linear
operators. The sets of initial functions for $u_1$ and $u_2$ are denoted by
$\Gamma_1(\Omega)$ and $\Gamma_2(\Omega)$, respectively.
Clearly, this system is completely controllable in Gilbert's sense;
but it is not necessarily completely controllable in the sense of Kalman.
Let us assume that the independent subsystem (123) is completely
null-controllable in Kalman's sense (i.e., every state in $\Gamma_1(\Omega)$ can be
transferred to the null state in a finite time by some admissible control).
Since the term $\mathscr{L}_{21}u_1(t, X)$ can be regarded as a control variable for
subsystem (124), and the domain of $\mathscr{L}_{21}$ is restricted to the solution space
of Eq. (123), it is quite possible that the range of $\mathscr{L}_{21}$ represents a set of
functions which is too "small" to make subsystem (124) completely
null-controllable in the sense of Kalman.
Now we shall illustrate the application of Lemma III-2 by means of
a simple example.
Consider a linear diffusion system governed by

\[
\frac{\partial u(t, x)}{\partial t} = \frac{\partial^2 u(t, x)}{\partial x^2} + f(t, x) \tag{125}
\]

defined on the spatial domain (0, 1). The boundary conditions are
$u(t, 0) = u(t, 1) = 0$ for all $t$.
The solution to Eq. (125) with an initial state $u_0(x)$ at time $t = 0$,
and a given control $f$ has the form

\[
u_f(t, x; u_0(x), 0) = \int_0^1 k(t, 0, x, x')\,u_0(x')\,dx' + \int_0^t\!\!\int_0^1 k(t, t', x, x')\,f(t', x')\,dx'\,dt' \tag{126}
\]

where $k$ is the Green's function given by

\[
k(t, t', x, x') = \sum_{n=1}^{\infty} \exp[-n^2\pi^2(t - t')]\,\sin(n\pi x)\,\sin(n\pi x') \tag{127}
\]

We shall assume that the control function $f$ is square-integrable and
unconstrained in magnitude, and the initial state function $u_0(x) \in L_2(0, 1)$.
It can be readily verified that the set of integral operators
$\{\int_0^1 k(t, t', x, x')\,(\,\cdot\,)\,dx'\}$ has the properties of a semigroup.
From Lemma III-2, the above system will be completely null-controllable
if the integral operator

\[
\mathscr{L}_f^*(t_1, 0)\,\mathscr{L}_f(t_1, 0) = \int_0^1 \sum_{n=1}^{\infty} \exp(-n^2\pi^2(t_1 - t))\,\sin(n\pi x)\,\sin(n\pi x')
\int_0^{t_1}\!\!\int_0^1 \sum_{m=1}^{\infty} \exp(-m^2\pi^2(t_1 - t'))\,\sin(m\pi x')\,\sin(m\pi x'')\,(\,\cdot\,)\,dx''\,dt'\,dx' \tag{128}
\]

has an inverse for some finite $t_1 > 0$.


By the assumption that $f(t, x) \in L_2([0, t_1] \times (0, 1))$, the operator
$\mathscr{L}_f^*\mathscr{L}_f$ maps $L_2([0, t_1] \times (0, 1))$ into itself. Thus, for any
$f(t, x) \in L_2([0, t_1] \times (0, 1))$, the series corresponding to $\mathscr{L}_f^*\mathscr{L}_f f(t, x)$ converges
uniformly, and we can interchange the order of integration and summation
in Eq. (128). Performing the integrations over $x'$, and using
$\int_0^1 \sin(n\pi x')\sin(m\pi x')\,dx' = \tfrac{1}{2}\delta_{nm}$, leads directly to the
following simplified form for $\mathscr{L}_f^*\mathscr{L}_f$:

\[
\mathscr{L}_f^*\mathscr{L}_f = \frac{1}{2}\sum_{n=1}^{\infty} \exp(-n^2\pi^2(t_1 - t))\,\sin(n\pi x) \int_0^{t_1}\!\!\int_0^1 \exp(-n^2\pi^2(t_1 - t'))\,\sin(n\pi x'')\,(\,\cdot\,)\,dx''\,dt' \tag{129}
\]

It can be readily verified that the range of $\mathscr{L}_f^*\mathscr{L}_f$ is a linear subspace of
$L_2([0, t_1] \times (0, 1))$, and $\mathscr{L}_f^*\mathscr{L}_f f(t, x) = 0$ if and only if $f(t, x) = 0$
almost everywhere in $[0, t_1] \times (0, 1)$. Hence $\mathscr{L}_f^*\mathscr{L}_f$ has an inverse, or
the system is completely null-controllable.
The complete null-controllability of the above system can also be
verified more directly without applying Lemma III-2 (5-9).
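
A finite modal truncation gives a quick numerical check of the diffusion example. The following is a minimal sketch under stated assumptions, not the construction used in the text: the state and control are expanded as $u = \sum_n a_n(t)\sin(n\pi x)$ and $f = \sum_n f_n(t)\sin(n\pi x)$, only the first few modes are kept, and each mode is driven to zero with the control built from $\mathscr{L}_f\mathscr{L}_f^*$ (the form mentioned in footnote 13). The function names and the numerical values are hypothetical.

```python
import numpy as np

def trapezoid(y, h):
    """Composite trapezoid rule on a uniform grid with spacing h."""
    return h * (y.sum() - 0.5 * (y[0] + y[-1]))

def modal_null_control(a0, t1, n_grid=20001):
    """Steer the modal amplitudes of u_t = u_xx + f on (0, 1) to zero at time t1.

    a0[n-1] is the initial coefficient of sin(n*pi*x); each mode obeys
    a_n' = -lam_n * a_n + f_n with lam_n = (n*pi)**2.
    Returns the time grid, the modal controls f_n(t), and the amplitudes at t1.
    """
    t = np.linspace(0.0, t1, n_grid)
    h = t[1] - t[0]
    controls, finals = [], []
    for n, a in enumerate(a0, start=1):
        lam = (n * np.pi) ** 2
        kernel = np.exp(-lam * (t1 - t))                      # exp(-lam (t1 - t'))
        gram = (1.0 - np.exp(-2.0 * lam * t1)) / (2.0 * lam)  # integral of kernel**2
        f_n = -kernel * np.exp(-lam * t1) * a / gram          # per-mode null control
        a_t1 = np.exp(-lam * t1) * a + trapezoid(kernel * f_n, h)
        controls.append(f_n)
        finals.append(a_t1)
    return t, np.array(controls), np.array(finals)

a0 = np.array([1.0, -0.7, 0.4, 0.2])   # hypothetical initial modal amplitudes
t, controls, finals = modal_null_control(a0, t1=0.1)
print(np.abs(finals).max())  # residual amplitude at t1, expected to be near zero
```

Each retained mode has a strictly positive scalar Gramian `gram`, so every truncated initial state can be steered to zero; the sketch says nothing about the infinite-dimensional limit.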

b. Variable Domain Systems. In view of the complexity of variable
domain systems, we shall discuss only the controllability of two simple
systems.
System with a free interior boundary. Consider a two-phase (solid-liquid)
system defined on a finite one-dimensional spatial domain (0, 1) as
shown in Fig. 7. The temperature at $x = 1$ is maintained at a specified
constant positive value $f_1 > u_m$, the melting temperature of the solid.
The control variable corresponds to a temperature $f_0(t)$ at $x = 0$.

FIG. 7.

Let $x_B(t)$ be the location of the interface at any time $t$. Assuming
homogeneity of both the liquid and solid media, the temperature
distributions in the liquid and solid regions are governed by

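\[
\frac{\partial u_s}{\partial t} = \mu_s \frac{\partial^2 u_s}{\partial x^2} \quad \text{for } 0 < x < x_B(t) \tag{130}
\]

and

\[
\frac{\partial u_l}{\partial t} = \mu_l \frac{\partial^2 u_l}{\partial x^2} \quad \text{for } x_B(t) < x < 1 \tag{131}
\]

with boundary conditions

\[
u_s(t, 0) = f_0(t), \qquad u_s(t, x_B(t)) = u_l(t, x_B(t)) = u_m, \qquad u_l(t, 1) = f_1 \tag{132}
\]

and

\[
\Bigl(\kappa_s \frac{\partial u_s(t, x)}{\partial x} - \kappa_l \frac{\partial u_l(t, x)}{\partial x}\Bigr)\Big|_{x = x_B(t)} = L\rho\,\frac{dx_B(t)}{dt} \tag{133}
\]

where $L$ is the latent heat of fusion; $\rho$ is the density of the solid; $\mu$ and $\kappa$
are the thermal constants.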
Clearly, if the interface is to remain stationary with time [i.e., $x_B(t) =
x_{Be} = \text{constant}$, $0 < x_{Be} < 1$], a constant temperature $f(t) = f_0$ at
$x = 0$ must be maintained at all times such that

\[
\Bigl[\kappa_s \frac{\partial u_s(t, x)}{\partial x} - \kappa_l \frac{\partial u_l(t, x)}{\partial x}\Bigr]_{x = x_{Be}} = 0 \quad \text{for all } t \tag{134}
\]

Since the equilibrium temperature gradients in both the liquid and
solid regions are uniform, the relation between $f_0$ and $x_{Be}$ can be determined
from Eq. (134):

\[
f_0 = u_m - \frac{\kappa_l\,x_{Be}}{\kappa_s\,(1 - x_{Be})}\,(f_1 - u_m) \tag{135}
\]

The equilibrium temperature distributions in the solid and liquid
corresponding to a specified $f_0$ are

\[
u_{s(eq)}(x) = f_0 + \Bigl(\frac{u_m - f_0}{x_{Be}}\Bigr)x \quad \text{for } x \in [0, x_{Be}] \tag{136}
\]

\[
u_{l(eq)}(x) = u_m + \frac{(f_1 - u_m)(x - x_{Be})}{1 - x_{Be}} \quad \text{for } x \in [x_{Be}, 1] \tag{137}
\]

Now, the controllability of the set of equilibrium states can be defined
as follows.
The set of equilibrium states $\{x_{Be}, u_{s(eq)}(x), u_{l(eq)}(x)\}$ is said to be
completely controllable if every pair of equilibrium states belonging
to the set are mutually transferable in a finite amount of time by means
of certain admissible controls $f(t)$.
The physical implication of the above statement is that the equilibrium
interface can be shifted from one position to another in finite time by
applying appropriate temperature variations at $x = 0$. The verification
of the existence of the above property leads to an interesting mathematical
problem in partial differential equations with moving boundaries.
However, this problem is by no means trivial. A complete solution is
not available at the present time. Here, we shall only note that the interface
can always be transferred to some neighborhood of a desired position
by applying $f(t) = f_0$, constant for all $t \ge t_0$, with $f_0$ set
precisely at the value corresponding to the desired interface equilibrium
position. For the case where the admissible controls are constrained in
magnitude ($|f(t)| \le f_{\max}$, $f_{\max} > u_m$), a simple calculation shows that the
set of equilibrium interface positions $x_{Be}$ is a subset of [0, 1] defined by

\[
0 < x_{Be} \le \Bigl[1 + \frac{\kappa_l\,(f_1 - u_m)}{\kappa_s\,(f_{\max} + u_m)}\Bigr]^{-1} < 1 \tag{138}
\]

Therefore, not all the interface positions in [0, 1] are reachable by
admissible controls.
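
The equilibrium relation and the resulting bound on reachable interface positions can be evaluated numerically. The sketch below rests on two assumptions that should be checked against the original equations: the stationary-interface condition is taken to be the flux balance $\kappa_s\,\partial u_s/\partial x = \kappa_l\,\partial u_l/\partial x$ with linear equilibrium profiles, and the magnitude constraint is taken to act through the lower bound $f_0 \ge -f_{\max}$. The function names and numerical values are hypothetical.

```python
def equilibrium_control(x_be, u_m, f1, kappa_s, kappa_l):
    """Boundary temperature f0 at x = 0 that holds the interface at x_be.

    Derived from the flux balance with linear temperature profiles:
    kappa_s * (u_m - f0) / x_be == kappa_l * (f1 - u_m) / (1 - x_be).
    """
    return u_m - kappa_l * x_be * (f1 - u_m) / (kappa_s * (1.0 - x_be))

def max_interface_position(f_max, u_m, f1, kappa_s, kappa_l):
    """Largest equilibrium position reachable when f0 may not drop below -f_max."""
    return 1.0 / (1.0 + kappa_l * (f1 - u_m) / (kappa_s * (f_max + u_m)))

# Hypothetical illustrative values (not taken from the text).
u_m, f1, kappa_s, kappa_l, f_max = 0.0, 1.0, 2.0, 1.0, 3.0
x_limit = max_interface_position(f_max, u_m, f1, kappa_s, kappa_l)
f0_at_limit = equilibrium_control(x_limit, u_m, f1, kappa_s, kappa_l)
print(x_limit, f0_at_limit)
```

At the limiting position the required boundary temperature equals the lower limit $-f_{\max}$, which is why interface positions beyond `x_limit` cannot be held by an admissible constant control.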

Coupled distributed and lumped system. The transport system (System
5) described in Section II, A will be considered here. The set of all
equilibrium states of this system is defined by $\{x_c(t) = x_{c(eq)},\ dx_c/dt = 0,\
p(t, x) = v(t, x) = 0\}$ for all $t$, all $x_{c(eq)} \in [0, x_{c(\max)}]$ and all $x \in (0, x_{c(eq)})$,
where $x_{c(\max)}$ is a specified finite positive number.
Similar to the system described in the preceding section, the set of
equilibrium states is said to be completely controllable if there exists
an admissible terminal control pressure $P_c(t)$ which will transfer one
arbitrary equilibrium state to another in finite time.
In what follows, we shall give a brief account of a wave-tracing
method (41) for analyzing the dynamic behavior of such a system subjected
to pressure discontinuities introduced at $x = 0$. This method will
be used in the later discussion on controllability.
Consider the situation where a pressure discontinuity emanates
from $x = 0$ and the carrier is initially at rest (i.e., $x_c(0) = x_{c(eq)}$,
$\dot{x}_c(0) = 0$). The pressure wave propagates at a constant velocity
$v_0 = (\beta/\rho_0)^{1/2}$ and strikes the carrier at $t = x_{c(eq)}/v_0$. The impact of
the wave upon the carrier surface imparts a jump in the carrier acceleration.
The wave reflected from the carrier surface propagates backward
toward the left and may be reflected again toward the carrier depending
upon the terminal conditions at $x = 0$ and $t \ge 2x_{c(eq)}/v_0$. It can be
shown (41) that, to a first-order approximation, the jump in the carrier
acceleration is related to the pressure discontinuity $\Delta p$ by

^ = - # ( ^ ) 039)

and the amplitude of the reflected wave from the moving carrier is
equal to that of the incident wave.
An alternative form for the equation of motion of the carrier can be
derived by using the following well-known property of the solution of the
wave equation (21): If $p(t, x)$ is a solution of Eq. (21), then $v_0(\partial p/\partial x) \pm
(\partial p/\partial t)$ is constant on the family of characteristics $t \pm x/v_0 = \text{constant}$.
Applying the above property to two pairs of extreme points on the
characteristic leads to the following relation:

dp - dp + π _ dp dp
(140)
•dx „ „ „ v dx „ J dt x->xc(t)

In view of Eqs. (22) and (24), the above equation can be reduced to

d*xe d*xe A \8p(t,x) + dp(t,x)\+ -ι


+ <x = v ) ( 1 4 1
~W -dW M \—äT ^ - ° —èx- LJ

where $\alpha = v_0\rho_0 A/M$.
Using Eqs. (139) and (141), the carrier trajectory corresponding to
an arbitrary piecewise continuous terminal pressure $P_c(t)$ can be computed
in a piecewise manner (41).
Now, we shall show that, within the framework set by the foregoing
assumptions, the set of equilibrium states is completely controllable.
A constructive approach will be used to verify this fact.
Consider a piecewise constant control pressure sequence of the form

Pit) = for 0 < t < t'

PI*) = for t' <t <ti + t2' = 2 * c ( e q /) © 0


3

Pit) = ±^00 for ti +t '<t<y. t/ (142)


2

Pc(t) = T P C 0 for J ' / < t < 2) t,'

where $P_{c0}$, $x_{c(eq)}$ and $t_1'$, the starting time for the first control pressure
reversal, are undetermined parameters (see Fig. 8). By construction,
the time for the second pressure reversal is made to coincide with the
arrival time for the reflected discontinuity in $P_c(t)$ initiated at $t = 0$
(see Fig. 8). It can be readily verified by superposition that both $p(t, x)$
and $v(t, x)$ resulting from the above control pressure sequence are
identically zero in a region above the terminal characteristic.

FIG. 8.

It remains to be shown that it is possible to select a $t_1'$ and a $P_{c0} \in [0, P_s]$, the
admissible range for j Pc(t) |, such t h a t xc(to) = #c(eq) (a position
d e p e n d i n g on and and
Using Eqs. (139) and (141), the carrier trajectory for $t \in [t_1, t_2)$ can
be found by direct integration with initial conditions $x_c(t_1) = x_{c(eq)}$,
$\dot{x}_c(t_1) = 0$ and $\ddot{x}_c(t_1^+) = \pm 2AP_{c0}/M$, i.e.,

\[
x_c(t) = x_{c(eq)} \pm \frac{2AP_{c0}}{\alpha M}\Bigl[(t - t_1) - \alpha^{-1}\bigl(1 - \exp(-\alpha(t - t_1))\bigr)\Bigr] \tag{143}
\]

where $t_1 = x_{c(eq)}/v_0$. The time $t_2$ is determined by the intersection of
$x_c(t)$ given by Eq. (143) and the characteristic initiated at $t = t_1'$,
$x = 0$, i.e.,

\[
x_{c(eq)} \pm \frac{2AP_{c0}}{\alpha M}\Bigl[(t_2 - t_1) - \alpha^{-1}\bigl(1 - \exp(-\alpha(t_2 - t_1))\bigr)\Bigr] = v_0(t_2 - t_1') \tag{144}
\]

T h e carrier trajectory for t e (t2 , f 3] can be found by integrating E q .


a n x + =
(141) with new initial conditions x(.(t2), xdh) d c(h ) ^dh) Τ
2APC0/M = 0. It can be readily shown that by i m p o s i n g a t e r m i n a l
+
condition x(.(t3) = 0, *,.(ί 3 ) = χ(.(ίό) ± 2APC0/M = 0 that f/, P C o a n d
x
c{eq) can be uniquely d e t e r m i n e d from E q . (144) a n d t h e following
relations:
\[
t_3 = \alpha^{-1}\ln\bigl(2\exp(\alpha t_2) - \exp(\alpha x_{c(eq)}/v_0)\bigr) \tag{145}
\]

OLM Γ2
Pc o = ± 2J«(eq) - *r(eq>) [" l n è { e xP K ^ c ( e q ) + <(eq))/^o)

+ exp(«A?c ( e/ qt ;) 0) } - (3xc ( e) q+ * ; ( e )q) / ü o ] (146)

It can be shown that $P_{c0}$ is a continuous monotone function of $x'_{c(eq)}$;
thus there is a one-to-one correspondence between the elements of the
set $[-P_s, +P_s]$ and the set of reachable equilibrium positions (see
Fig. 9). By repeating the control sequence of the form (142), every point
in the interval $(0, x'_{c(eq)} = x_{c(\max)}]$ can be reached in a finite amount of
time. Hence the set of equilibrium states is completely controllable.
Having established the existence of the above property, it is natural
to ask: what is the required control $P_c(t)$ which will transfer an arbitrary
equilibrium state to another in a minimum amount of time? This question
cannot be readily answered.

C. Observability
The notion of observability of a dynamical system, introduced by
Kalman (34), is associated with the processing of data obtained from
measurements at the output of the system. The basic question is:
Given a mathematical model of a free dynamical system (i.e., the
system equations and the output transformation $\mathscr{M}$), is it possible to
determine the system state at any time $t$ by observing the output over
a finite time interval, say $[t, t + T]$, where $T$ may depend upon $\mathscr{M}$ and
the system properties, and $\mathscr{M}$ may also have an algebraic dependence
upon $t$?

1. DEFINITIONS

Let $\Phi(t, t_0)$ be a given continuous transformation defining the state
transition of a free, distributed parameter dynamical system from time
$t_0$ to $t$. Given an initial state $U_0(X) \in \Gamma(\Omega)$ at time $t_0$, and a continuous
output transformation $\mathscr{M}$, the system output $V_t$ is given by

\[
V_t = \mathscr{M}\,\Phi(t, t_0)\,U_0(X) \tag{147}
\]

where $V_t$, at a fixed time $t$, is a vector with components $v_{t(1)}, \ldots, v_{t(N')}$,
$N' \le N = \dim(U_0)$.

FIG. 9.

If $\mathscr{M}$ is a spatially-dependent output transformation, then the output
variables $v_{t(j)}$ at any fixed time $t$ are functions of $X$, defined on the
spatial domain $\Omega$. On the other hand, if $\mathscr{M}$ is a spatially-independent
output transformation, then $V_t$, at any time $t$, is a point in an $N'$-dimensional
Euclidean space (see Section II, B, 1 for examples). In more
complex situations, $\mathscr{M}$ may be a composite of both types of transformations
mentioned above.
We shall define a distributed parameter dynamical system to be
completely observable at time $t_0$ if it is possible to determine the system
state $U_0(X)$ at time $t_0$ by observing the corresponding system output
$V_t$ over a finite time interval $[t_0, t_0 + T]$, where $U_0(X)$ is an arbitrary
element in $\Gamma(\Omega)$, and $T$ may depend on $\mathscr{M}$ and $\Phi(t, t_0)$. If the system is
completely observable at any time $t$, then the system is said to be completely
observable.
The above definitions can be interpreted mathematically as follows:
Consider again Eq. (147). If we let $t$ take on all values in $[t_0, t_1]$,
then $\Phi(t, t_0)$ maps the state function space $\Gamma(\Omega)$ onto the solution space
$\Gamma_s([t_0, t_1] \times \Omega)$, in which each element corresponds to a segment of a
system trajectory (a function defined on $[t_0, t_1] \times \Omega$). By the assumption
that the solutions are unique, the mapping is one-to-one. The output
transformation $\mathscr{M}$, in turn, maps $\Gamma_s([t_0, t_1] \times \Omega)$ onto the output
space $\mathscr{V}_0$,$^{15}$ in which each element corresponds to a segment of an output
trajectory. In general, $\mathscr{M}$ may be a many-to-one mapping for a particular
choice of $\{t_0, t_1\}$. For example, consider a system whose state is describable
by a single variable $u$, and $\mathscr{M}$ is a spatial-averaging transformation
defined by

\[
\mathscr{M} = \int_\Omega w(X)\,(\,\cdot\,)\,d\Omega \tag{148}
\]

where $w(X)$ is a specified spatial weighting function. Suppose that the
solutions corresponding to all initial states $u_0(X) \in \Gamma(\Omega)$ vary in such a
manner that their spatial averages are equal over some finite time interval,
say $[t_0, t_1]$. Clearly, in this case, it is impossible to recover the
initial states by observing the output over $[t_0, t_1]$. In order that this
system be completely observable at $t_0$, there must exist a finite
time $t_1 \ge t_0$ such that a one-to-one correspondence can be established
between the segments of the output trajectories and the solutions (and
hence the initial states, by assumption of uniqueness of solutions) on
the time interval $[t_0, t_1]$.
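
A small numerical experiment illustrates how a spatial-averaging output can be many-to-one. The sketch below is an assumed example, not one from the text: it takes the free diffusion system $u_t = u_{xx}$ on (0, 1) with zero boundary values, and the hypothetical weighting function $w(x) = \sin(\pi x)$, so that the averaged output responds only to the first spatial mode. All names are invented for the illustration.

```python
import numpy as np

def free_solution(coeffs, t, x):
    """u(t, x) = sum_n coeffs[n-1] * exp(-(n*pi)**2 * t) * sin(n*pi*x)."""
    n = np.arange(1, len(coeffs) + 1)[:, None]
    modes = np.exp(-(n * np.pi) ** 2 * t) * np.sin(n * np.pi * x[None, :])
    return (coeffs[:, None] * modes).sum(axis=0)

def averaged_output(coeffs, times, x):
    """Output y(t) = integral over (0, 1) of w(x) u(t, x) dx with w(x) = sin(pi*x)."""
    w = np.sin(np.pi * x)
    h = x[1] - x[0]
    out = []
    for t in times:
        y = w * free_solution(coeffs, t, x)
        out.append(h * (y.sum() - 0.5 * (y[0] + y[-1])))  # trapezoid rule
    return np.array(out)

x = np.linspace(0.0, 1.0, 401)
times = np.linspace(0.0, 0.5, 6)
state_a = np.array([1.0, 0.0, 0.0])
state_b = np.array([1.0, 2.0, -3.0])   # differs from state_a in modes 2 and 3
y_a = averaged_output(state_a, times, x)
y_b = averaged_output(state_b, times, x)
print(np.allclose(y_a, y_b))
```

The two initial states are different, yet their output trajectories agree at every sampled time, so no observation interval allows them to be told apart with this weighting function.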
In the foregoing discussions, it has been assumed that the output $V_t$
can be precisely measured at all times. In practical situations, aside
from inaccuracies in the system's mathematical model, measurement
errors are unavoidable due to the presence of external noise, imperfections
of the measuring instruments, and interactions between the
system and the measuring instruments. In these situations, we can
consider the output space $\mathscr{V}_0$ to be an enlarged set of functions containing
both the exact and inexact output trajectory segments. Hence the output
transformation $\mathscr{M}$ maps $\Gamma_s([t_0, t_1] \times \Omega)$ into $\mathscr{V}_0$.

15 Depending on the form of $\mathscr{M}$, $\mathscr{V}_0$ may be a function space $\mathscr{V}_0[t_0, t_1]$ or a function
space $\mathscr{V}_0([t_0, t_1] \times \Omega)$.
In order to ensure that our definitions of observability are physically meaningful, we require that the problem of recovering the initial data be well-posed (in the sense of Hadamard). In other words, if the system is completely observable at $t_0$, small errors$^{16}$ in the output function $V_t$ can only lead to small errors in the recovered initial data at $t_0$.
In view of the foregoing considerations, we can restate the previous definitions in mathematical terms as follows:
A distributed parameter dynamical system is completely observable at time $t_0$ if there exist a finite time $t_1 \ge t_0$ and a continuous mapping on $\mathcal{V}_0[t_0, t_1]$ or $\mathcal{V}_0([t_0, t_1] \times \Omega)$ onto $\Gamma(\Omega)$. Moreover, the mapping is one-to-one from the range of $\mathcal{M}$ (the set of all exact output trajectory segments) in $\mathcal{V}_0$ onto $\Gamma(\Omega)$. If the above conditions are satisfied for all $t_0$, then the system is said to be completely observable.
Obviously, if the output transformation $\mathcal{M}$, with its domain equal to the range of $\Phi(t, t_0)$, has a continuous inverse at some finite time $t = t_1 \ge t_0$, then the system is completely observable for all $t \in [t_0, t_1]$.

2. OBSERVABILITY OF LINEAR SYSTEMS

Consider a free linear distributed parameter dynamical system describable by Eq. (106) with $F_\Omega = 0$. It is assumed that the state space $\Gamma(\Omega) \subseteq L_2^N(\Omega)$ and that the linear homogeneous boundary conditions (if present) are taken care of by restricting the domain of $\mathcal{L}_0$. Furthermore, $\mathcal{L}_0$ is an infinitesimal generator of a semigroup $\{\Phi(t, t')\}$. The output of the system at time $t$, starting with initial state $U_0(X)$ at $t_0$, is given by Eq. (147) with a specified linear output transformation $\mathcal{M}$. It is of interest to establish conditions for complete observability of this system.
First we shall consider a question which is pertinent to observability, namely, knowing the system state at any time $t$, is it possible to reconstruct the past history of the system within some finite time interval, say $[t_0, t)$? Obviously, there is no problem if the system state at time $t$ can be precisely measured and the free system trajectories within $[t_0, t)$ are uniquely related to the initial states at $t_0$. In fact, if the family of operators $\{\Phi(t, t')\}$ has the properties of a group, the complete past history of the system can be determined. However, the problem is no longer trivial if errors are induced in the measurement of the state. Here, we require the problem of determining the past states to be well-posed, that is, we seek a continuous (bounded) inverse to the system transition operator $\Phi(t, t')$ for $t' \in [t_0, t)$. In general, $\Phi^{-1}(t, t')$ is an unbounded operator. However, it is possible to pose the above problem properly by restricting the domain of $\Phi^{-1}(t, t')$ to a sufficiently small class of functions.

$^{16}$ Here, the "smallness" is in the sense that the distance between two elements in $\mathcal{V}_0$ is small.
For example, consider again the linear diffusion system governed by Eq. (125) with $f(t, x) = 0$, and defined on the spatial domain $(-\infty, \infty)$. For this system, the solution corresponding to a square-integrable initial state function $u_0(x)$ at $t_0$ has the form

\[ u(t, x) = \Phi(t, t_0) u_0(x) = \int_{-\infty}^{\infty} \bigl(4\pi(t - t_0)\bigr)^{-1/2} \exp\!\left(-\frac{(x - x')^2}{4(t - t_0)}\right) u_0(x')\, dx' \tag{149} \]

Now, consider the problem of determining the previous states of the system, given the system state $u(t, x)$ at time $t$. This problem corresponds to finding solutions to the so-called backward diffusion equation

\[ \frac{\partial \bar{u}(t', x)}{\partial t'} = -\frac{\partial^2 \bar{u}(t', x)}{\partial x^2} \tag{150} \]

defined for $t' > t$. It is well known that this problem is not well-posed for all $u(t, x) \in L_2(-\infty, \infty)$. However, John (40) has shown that the above problem can be posed properly by imposing a positivity condition on $u(t, x)$. Also, Miranker (42) has shown well-posedness by restricting $u(t, x)$ to the class of spatially band-limited functions defined on $(-\infty, \infty)$ (i.e., the spatial Fourier transform of $u(t, x)$ has compact support).
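The role of the band-limiting restriction can be seen from a short numerical sketch. It assumes the normalized equation $u_t = u_{xx}$, for which a Fourier component of wavenumber $\omega$ decays as $\exp(-\omega^2 t)$ forward in time and therefore grows by $\exp(\omega^2 T)$ when the state is propagated backward over a span $T$. The function name and the numerical values are hypothetical.

```python
import numpy as np

def backward_gain(omega, T):
    """Factor by which a Fourier component of wavenumber omega grows when
    u_t = u_xx is integrated backward over a time span T."""
    return np.exp(omega ** 2 * T)

T = 0.1
omegas = np.array([1.0, 5.0, 10.0, 20.0])
gains = backward_gain(omegas, T)          # grows without bound as omega increases

W = 5.0                                   # assumed band limit
bound = backward_gain(W, T)               # worst-case gain over |omega| <= W
```

Without a band limit, an arbitrarily small high-frequency measurement error is amplified without bound. With the spectrum confined to $|\omega| \le W$, the amplification is bounded by `bound`, which is the sense in which the restricted problem becomes well-posed in this sketch.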
We now return to the problem of establishing conditions for complete observability of the free system (106) with a general linear output transformation $\mathcal{M}$.
We shall introduce the following assumptions:

(i) The operator $\mathcal{L}_0$ in Eq. (106) is an infinitesimal generator of a semigroup (or a group). Thus, $\Phi(t, t')$ is a bounded linear operator governing the state transition of a free, linear distributed parameter dynamical system. If we let $t$ take on all values in $[t_0, t_1]$, where $t_1$ is a finite number $> t_0$, then $\Phi(t, t_0)$ defines a continuous mapping on $\Gamma(\Omega)$ onto the solution space $\Gamma_s([t_0, t_1] \times \Omega)$, which is a subset of $L_2^N([t_0, t_1] \times \Omega)$.
(ii) $\mathcal{M}$ is a bounded operator with domain $\Gamma_s([t_0, t_1] \times \Omega)$. Also, the output can be precisely measured.

From the above assumptions, it follows that $\mathcal{M}\Phi(t, t_0)$ is a bounded linear operator on $\Gamma(\Omega)$ onto the output space $\mathcal{V}_0$, a subset of $L_2^M[t_0, t_1]$ (or $L_2^M([t_0, t_1] \times \Omega)$, depending on the form of $\mathcal{M}$), where $M \le N$. If this system is to be completely observable at $t_0$, there must exist a finite time $t_1$ and a continuous one-to-one mapping from $\mathcal{V}_0$ onto $\Gamma(\Omega)$. In view of the similarity between the mathematics of this problem and that of establishing conditions for complete controllability as discussed in Section III, B, 2, a, we can state a lemma for complete observability which is directly analogous to Lemma III-2.

LEMMA III-3: A necessary and sufficient condition for a linear distributed parameter dynamical system governed by Eq. (106) (and possibly boundary condition (107)) satisfying assumptions (i)-(ii) to be completely observable at $t_0$ is that the linear, self-adjoint operator $(\mathcal{M}\Phi(t, t_0))^{*}(\mathcal{M}\Phi(t, t_0))$, which maps $\Gamma(\Omega)$ into $L_2^N(\Omega)$, has a bounded inverse for some finite $t = t_1 \ge t_0$.
The proof is analogous to that of Lemma III-2.
Obviously, if the system is completely observable at $t_0$, its initial state can be found by

\[ U_0(X) = \bigl[(\mathcal{M}\Phi(t_1, t_0))^{*}(\mathcal{M}\Phi(t_1, t_0))\bigr]^{-1}(\mathcal{M}\Phi(t_1, t_0))^{*} V_t \tag{151} \]

where $V_t$ is the output function defined for all $t$ on the observation time interval $[t_0, t_1]$.
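A finite-dimensional analogue of Eq. (151) may help fix ideas. The sketch below is an illustration under stated assumptions, not the construction used in the text: the state is truncated to three sine modes of $u_t = u_{xx}$ on $(0, 1)$, the output is the spatial average with the hypothetical weighting $w(x) = x$, and the output is sampled at discrete times. With these choices $\mathcal{M}\Phi$ becomes a matrix `A`, and Eq. (151) becomes the normal-equation solution of a least-squares problem.

```python
import numpy as np

# Hypothetical finite-dimensional stand-in: three sine modes of u_t = u_xx on (0, 1).
modes = np.arange(1, 4)
decay = (modes * np.pi) ** 2                      # modal decay rates n^2 pi^2
w_n = (-1.0) ** (modes + 1) / (modes * np.pi)     # integral of x sin(n pi x) over (0, 1)

def output_operator(times):
    """Matrix of M Phi(t, 0): row j maps modal coefficients to the output v(t_j)."""
    return np.exp(-np.outer(times, decay)) * w_n

def recover_initial_state(times, v):
    """Discrete counterpart of Eq. (151): solve (A^T A) c = A^T v."""
    A = output_operator(times)
    return np.linalg.solve(A.T @ A, A.T @ v)

times = np.linspace(0.0, 0.2, 60)
c_true = np.array([1.0, -0.5, 0.25])              # modal coefficients of the initial state
v = output_operator(times) @ c_true               # noise-free output record
c_rec = recover_initial_state(times, v)
```

In this truncated setting `A.T @ A` is invertible, so the initial modal coefficients are recovered. As more modes are retained, the matrix becomes increasingly ill-conditioned, which mirrors the requirement of a bounded inverse in Lemma III-3.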

Remarks.
(i') A straightforward statement of a "duality principle" analogous to that established by Kalman (34) for finite-dimensional linear dynamical systems cannot be made here, because care must be exercised in distinguishing the domains and ranges of the various operators. Physically speaking, these complications arise because the constraints imposed by the measuring devices at the output are usually of a considerably different nature than those imposed by the way in which the inputs enter the system.
(ii') Gilbert's definition of complete observability for finite-dimensional linear dynamical systems can be extended to the distributed case. Here, we have the following definition: a linear distributed parameter dynamical system is completely observable (in the sense of Gilbert) if there does not exist a transformation which decomposes the system into two subsystems such that one subsystem affects neither the other subsystem nor the outputs of the system. Again, as mentioned previously in remark (iii') of Section III, B, 2, a on controllability, the usefulness of the above definition is limited by the fact that there are no systematic ways of transforming linear PDE's from one form to another.
(iii') In practical situations, once the complete observability of a system is established, it is important to determine the minimum time required for observation, thereby keeping the required amount of measured output data to a minimum.
In the sequel, we shall discuss the observability of a simple linear distributed parameter dynamical system whose free motion is describable by a scalar integral equation

\[ u(t, X) = \Phi(t, t_0) u_0(X) = \int_\Omega k(t, t_0, X, X')\, u_0(X')\, d\Omega' \tag{152} \]

where $k$ is a continuous function of its arguments, and the initial state $u_0(X) \in \Gamma(\Omega) \subseteq L_2(\Omega)$. The output transformation $\mathcal{M}$ is a spatial averaging operator given by

\[ \mathcal{M} = \int_\Omega w(t, X)(\,\cdot\,)\, d\Omega \tag{153} \]

where $w$ is a specified, continuous, spatial weighting function depending on $t$.
In view of Lemma III-3, the above system is completely observable at time $t_0$ if the following linear, self-adjoint, non-negative integral operator

\[ (\mathcal{M}\Phi(t_1, t_0))^{*}(\mathcal{M}\Phi(t_1, t_0)) = \int_{t_0}^{t_1}\!\int_\Omega k(t, t_0, X', X)\, w(t, X') \int_\Omega w(t, X'') \int_\Omega k(t, t_0, X'', X''')(\,\cdot\,)\, d\Omega'''\, d\Omega''\, d\Omega'\, dt \tag{154} \]

which maps $\Gamma(\Omega)$ into $L_2(\Omega)$, has a bounded inverse for some finite $t_1 \ge t_0$.

Consider the particular case where the system is governed by the linear diffusion equation (125) with its Green's function given by Eq. (127), and the output corresponds to measuring $u(t, x)$ at a fixed point $x_1 \in (0, 1)$. Here, $\mathcal{M}$ can be symbolically written in the form

\[ \mathcal{M} = \int_0^1 \delta(x - x_1)(\,\cdot\,)\, dx \tag{155} \]

where $\delta(x - x_1)$ is the Dirac delta function. The operator corresponding to Eq. (154) reduces to

\[ (\mathcal{M}\Phi(t_1, 0))^{*}(\mathcal{M}\Phi(t_1, 0)) = \int_0^{t_1} k(t, 0, x_1, x) \int_0^1 k(t, 0, x_1, x')(\,\cdot\,)\, dx'\, dt \tag{156} \]

where

\[ k(t, 0, x_1, x) = \sum_{n=1}^{\infty} \exp(-n^2\pi^2 t)\, \sin(n\pi x)\, \sin(n\pi x_1) \tag{157} \]
n=l

It can be readily shown that a bounded inverse for $(\mathcal{M}\Phi(t_1, 0))^{*}(\mathcal{M}\Phi(t_1, 0))$ does not exist for any finite $t_1$. This conclusion can be verified by considering the special case where $u_0(x) = \sin(4\pi x)$. The solution corresponding to this initial condition is simply $\exp(-16\pi^2 t) \sin(4\pi x)$. If we fix the measuring device at $x = 1/4$, then the output $v(t) = 0$ for all $t \ge 0$. Clearly, $u_0(x)$ cannot be recovered by observing $v(t)$. On the other hand, it can be shown that the system is completely observable for a certain restricted class of initial functions if the data is given along the lines $t = t_1 x$ and $x = 1$ (see Fig. 10b) (or along any continuous monotone curve intersecting the lines $x = 0$ and $x = 1$). The physical implication of this result is that, if we are limited to a pointwise measuring device, it is necessary to put the device into a scanning motion so that the value of the state function $u$ can be obtained for all $x \in (0, 1)$ within some finite time interval $[0, t_1]$. Furthermore, if we assume that the scanning velocity $v_s$ is limited (i.e., $|v_s| \le v_{s0}$), the fastest scanning scheme is the one shown in Fig. 10b, where $t_1 = v_{s0}^{-1}$.

FIG. 10.
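The blind spot of a fixed point sensor is easy to check numerically. The sketch below assumes the normalized diffusion equation $u_t = u_{xx}$ on $(0, 1)$ with zero boundary values, so that the initial state $\sin(n\pi x)$ evolves as $\exp(-n^2\pi^2 t)\sin(n\pi x)$. The function name and sample times are hypothetical.

```python
import numpy as np

def point_output(n, x_sensor, times):
    """Output v(t) = u(t, x_sensor) for the initial state u0(x) = sin(n pi x)."""
    return np.exp(-(n * np.pi) ** 2 * times) * np.sin(n * np.pi * x_sensor)

times = np.linspace(0.0, 0.05, 11)
v_blind = point_output(4, 0.25, times)    # sensor on a node of sin(4 pi x)
v_seen = point_output(4, 0.125, times)    # sensor on an antinode: x = 1/8
```

With the sensor at $x = 1/4$ the output vanishes (up to rounding) for all sampled times, whereas moving the sensor to $x = 1/8$ makes the same mode visible. This is the motivation for a scanning measurement.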

IV. Optimum Control

The problem of optimum control of a dynamical system is that of determining the manipulatable inputs (control variables) such that its response will correspond as closely as possible to the desired behavior according to a prescribed performance criterion, while both the control variables and the resulting system response satisfy certain constraints imposed by the system's physical limitations.
Depending on the particular physical system and its interaction with the environment, its behavior can usually be described by a set of either deterministic or stochastic dynamical equations. In this section, the discussion will be focused on the optimum control of deterministic systems. The extension of some of the results to the stochastic case will be discussed in a future paper.

A. Problem Formulation
We assume that the process to be controlled is a fixed-domain distributed parameter dynamical system whose state at any time $t$ can be specified by a vector-valued function $U(t, X)$ belonging to a state function space $\Gamma(\Omega)$ with a specified metric $\rho(U, U')$. For any admissible control $F_\Omega(t, X) \in \mathcal{I}(\tau \times \Omega)$ (see Section III, B, 1 for definitions), there exists a unique continuous transformation $\Phi_{F_\Omega}(t; U_0(X), t_0)$ from $\Gamma(\Omega)$ into itself.
Let the performance index be given by the following functional:

(158)

where $\mathcal{P}_0$ and $\mathcal{P}_1$ are specified scalar functions of their arguments. The parameter $t_1$ is the terminal time, which is defined as the first instant of time $t > t_0$ when the motion enters a specified set $\mathcal{S} \subseteq \Gamma(\Omega) \times \tau$, where $\tau$ is the set of all values of $t \ge t_0$. Note that for each fixed time $t$, $\mathcal{S}$ is a subset of $\Gamma(\Omega)$, which may depend upon $t$. The first integral in Eq. (158) represents a terminal performance index, while the second integral represents the performance index defined over the entire time interval.

In terms of the performance index (158), the optimum control problem can be stated as follows:
(PB-0). Given a distributed parameter dynamical system whose motion is defined by $\Phi_{F_\Omega}(t; U_0(X), t_0)$ for a specified initial state $U_0(X)$ at time $t_0$, find a corresponding admissible control $F_\Omega$ defined on $[t_0, t_1] \times \Omega$ such that the performance index (158) assumes its infimum (or supremum) with respect to the set of all admissible controls $\mathcal{I}(\tau \times \Omega)$.
The above problem can be reformulated as an optimum feedback control problem if we require that the control be found in terms of the instantaneous state $U(t, X)$ (i.e., a control law).
Note that the above problem formulation is a direct extension of those given by Kalman (35) and Pontryagin et al. (36) for dynamical systems governed by ordinary differential equations.
By letting the performance index take on certain specific forms, the above general problem reduces to various special problems:
(PB-1) Time Optimal Control. The problem is to steer the system from an initial state $U_0(X)$ at $t_0$ to a desired state $U_d(X)$ in a minimum amount of time by means of an admissible control $F_\Omega$. Since, in general, the solution space $\Gamma_s(\Omega)$ may be a subset of $\Gamma(\Omega)$, $U_d(X)$ must belong to $\Gamma_s(\Omega)$. For this problem, we set $\mathcal{P}_0 = 0$, $\int_\Omega \mathcal{P}_1(\cdots)\, d\Omega = 1$, and $\mathcal{S} = \{U_d(X)\} \times \tau$.
(PB-2) Optimum Terminal Control. In this problem, it is required to bring the system from an initial state $U_0(X)$ at $t_0$ as close as possible to a desired terminal set $\Gamma_d(\Omega) \subset \Gamma(\Omega)$ at a specified time $t_1$. In this case, $\mathcal{P}_1 = 0$, and $\int_\Omega \mathcal{P}_0(\cdots)\, d\Omega$ may be replaced by a distance $\rho(U(t_1, X), \Gamma_d(\Omega))$ defined by

\[ \rho(U(t_1, X), \Gamma_d(\Omega)) = \inf_{U'(X) \in \Gamma_d(\Omega)} \rho(U(t_1, X), U'(X)) \tag{159} \]

and the set $\mathcal{S} = \Gamma(\Omega) \times \{t_1\}$. In the special case where $\Gamma_d(\Omega) = \{U_d(X)\}$, Eq. (159) reduces to the usual distance between two states.
(PB-3) Optimum Regulator Problem. Here, the desired state $U_d(X)$ is an equilibrium state $U_{eq}(X)$ of the system. If, for some reason, the system is perturbed away from $U_{eq}(X)$, the problem is to find an admissible control (or a control law) which will return the system state to $U_{eq}(X)$ in such a manner that a certain specified performance index (158) is minimized. In particular, if we wish to transfer any perturbed state to $U_{eq}(X)$ in a minimum amount of time, then we have a time-optimal regulator problem.
(PB-4) Optimum Tracking Problem. In this problem, the desired motion is a space- and time-dependent function $U_d(t, X)$ defined for $(t, X) \in \tau \times \Omega$, or a trajectory in $\Gamma(\Omega)$. It is required to keep the instantaneous distance between $U_d(t, X)$ and the controlled motion as small as possible. Here, we may replace $\int_\Omega \mathcal{P}_1(\cdots)\, d\Omega$ by the instantaneous distance $\rho(U_d(t, X), \Phi_{F_\Omega}(t; U_0(X), t_0))$ and set $\mathcal{P}_0 = 0$ and $\mathcal{S} = \Gamma(\Omega) \times \{t_1\}$. Again, if we wish to transfer any initial state to a prescribed neighborhood of the desired motion in a minimum amount of time, then we have a time-optimal tracking problem.
(PB-5) Minimum Energy Control. Here, it is required to transfer an initial state at $t_0$ to a desired state $U_d(X)$ [or a prescribed neighborhood of $U_d(X)$] at a specified time $t_1$ with the expenditure of a minimum amount of energy. For this problem, $\mathcal{P}_1$ is taken to be a non-negative function of $F_\Omega$, independent of $\Phi_{F_\Omega}$.
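The distance to a terminal set used in (PB-2), Eq. (159), can be sketched numerically. The example assumes a scalar state on $\Omega = (0, 1)$, the $L_2$ metric for $\rho$, and a terminal set consisting of finitely many sampled functions; none of these choices comes from the text, and the function names are hypothetical.

```python
import numpy as np

def l2_distance(u, v, dx):
    """Discrete L2(0, 1) distance between two sampled state functions."""
    return np.sqrt(np.sum((u - v) ** 2) * dx)

def distance_to_set(u, terminal_set, dx):
    """Eq. (159) restricted to a finite set: the smallest distance to any member."""
    return min(l2_distance(u, ud, dx) for ud in terminal_set)

n = 1000
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx                      # midpoint grid on (0, 1)
terminal_set = [np.zeros(n), np.sin(np.pi * x)]    # two admissible terminal states
u_t1 = 0.9 * np.sin(np.pi * x)                     # state reached at the terminal time
d = distance_to_set(u_t1, terminal_set, dx)
```

For an infinite terminal set the minimum over a list would have to be replaced by an infimum over the set, as in Eq. (159).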

Remarks.

(i) In many physical situations, constraints other than those imposed on the control variables are introduced as a result of the system's physical limitations and design specifications. Most of these constraints can be put into the following form

\[ g_{l(i)} \le \mathcal{Z}_i(t, X, U(t, X), F_\Omega(t, X)) \le g_{u(i)}, \qquad i = 1, \ldots, N_c \tag{160} \]

where $\mathcal{Z}_i$ are specified functions or functionals of their arguments; $g_{l(i)}$ and $g_{u(i)}$ may be either given functions of $t$ and/or $X$, or constants, depending on the form of $\mathcal{Z}_i$. In particular, inequality (160) may take the form of a set of integral inequality constraints

\[ \int_{t_0}^{t_1}\!\int_\Omega z_i(t, X, \Phi_{F_\Omega}(t; U_0(X), t_0), F_\Omega(t, X))\, d\Omega\, dt \le g_{u(i)} = \text{constant}, \qquad i = 1, \ldots, N_c \tag{161} \]

Another form for (160) may be derived from the fact that only a subset $\Gamma_s'(\Omega)$ of the solution space $\Gamma_s(\Omega)$ of the mathematical model has physical meaning. A possible $\Gamma_s'(\Omega)$ may be defined by

\[ \Gamma_s'(\Omega) = \{U(X) : U(X) \in \Gamma_s(\Omega),\ |u_i(X)| \le g_i(X) \text{ almost everywhere in } \Omega,\ i = 1, \ldots, N\} \tag{162} \]

where $g_i(X)$ are non-negative functions of $X$.
The preceding optimum control problems can be reformulated in the presence of the above constraints. In the case of integral inequality constraints of the form (161), the problem can be reduced directly to the preceding ones by introducing a set of fictitious state variables, as in the case of ordinary differential systems (36).

(ii) In some physical systems, it is of interest to maintain close control of certain state functions defined on a given subset $\Omega'$ of the spatial domain $\Omega$. No constraints are imposed on the state functions defined on $(\Omega - \Omega')$ except that they must remain bounded at all times. This situation can be put into the framework of the preceding problem formulations by incorporating appropriate spatial weighting factors into $\mathcal{P}_0$ and $\mathcal{P}_1$.
(iii) The preceding problem formulations can be generalized to a fixed-domain distributed parameter system which is coupled to a lumped-parameter subsystem, and also to a variable-domain distributed parameter system. In the latter system, it is necessary to introduce an appropriate transformation which maps the variable spatial domain onto a fixed domain, as discussed earlier.

B. Functional Equations; Maximum Principle


In this section, certain functional equations and a maximum principle associated with the optimum control of particular classes of distributed parameter systems will be discussed. The applications of these results to specific problems will be presented in Section IV, C.

1. SYSTEMS IN DIFFERENTIAL FORM

Consider a distributed parameter dynamical system describable by the following partial differential(-integral) equation:

\[ \frac{\partial U(t, X)}{\partial t} = \mathcal{K}(U(t, X), F_\Omega(t, X)) \tag{163} \]

defined for $t > t_0$ on a fixed spatial domain $\Omega$, where $\mathcal{K}$ is a specified vector spatial differential(-integral) operator. In particular, $\mathcal{K}$ may be a composite of a spatial differential operator acting on $U$ and a specified vector-valued function of $F_\Omega$. We shall assume that the initial state $U_0(X) \in \Gamma(\Omega)$, a specified state function space, and $F_\Omega \in \mathcal{I}([t_0, t_1] \times \Omega)$, a given admissible set of control functions defined on $[t_0, t_1] \times \Omega$. If there are boundary conditions, it is assumed that they can be taken care of by restricting the domain of the operator acting on $U$.
Here, we shall use the technique of dynamic programming to derive the functional equation for problem (PB-0) with a performance index of the form

\[ \int_\Omega \mathcal{P}_0(t_1, X, U_{F_\Omega}(t_1, X; U_0(X), t_0))\, d\Omega + \int_{t_0}^{t_1}\!\int_\Omega \mathcal{P}_1(t, X, U_{F_\Omega}(t, X; U_0(X), t_0), F_\Omega(t, X))\, d\Omega\, dt \tag{164} \]

First, we introduce the notation

Π(υ Χ),Τ)=^ ν
ΰ{ ι]χΩ) (165)
where $T = t_1 - t$.
Applying the principle of optimality (43), we have

FQ{t, X)) dQ dt + n(V(tQ + Δ, Χ), Τ - Δ)\ (166)

We assume that the solutions to Eq. (163) on the time interval $(t_0, t_1]$ exist and are unique for any admissible control function $F_\Omega(t, X) \in \mathcal{I}([t_0, t_1] \times \Omega)$, and that the initial state function $U_0(X)$ is sufficiently smooth, so that the solutions corresponding to a sufficiently small time increment $\Delta$ can be written as

\[ U_{F_\Omega}(t_0 + \Delta, X; U_0(X), t_0) = U_0(X) + \Delta\, \mathcal{K}(U_0(X), F_\Omega(t_0, X)) + o(\Delta) \tag{167} \]

where $o(\Delta)$ is an infinitesimal quantity of higher order than $\Delta$.
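The first-order expansion (167) can be checked on a simple special case. The sketch assumes the free diffusion equation $u_t = u_{xx}$ on $(0, 1)$ restricted to the single mode $\sin(\pi x)$, so that the right-hand side operator acts on the modal coefficient as multiplication by $-\pi^2$. The function names are hypothetical.

```python
import numpy as np

lam = np.pi ** 2     # decay rate of the mode sin(pi x) under u_t = u_xx

def exact(c0, delta):
    """Exact modal coefficient after a time increment delta."""
    return c0 * np.exp(-lam * delta)

def first_order(c0, delta):
    """U0 + delta * (right-hand side at U0): expansion (167) without the remainder."""
    return c0 + delta * (-lam * c0)

deltas = np.array([1e-2, 1e-3, 1e-4])
ratio = np.abs(exact(1.0, deltas) - first_order(1.0, deltas)) / deltas
```

The quantity `ratio` is the remainder divided by $\Delta$. It shrinks roughly in proportion to $\Delta$, which is the behavior required of a term of higher order than $\Delta$.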


Using the above relation, and assuming $\Pi$ is sufficiently smooth with respect to $U_0$, we can expand $\Pi(U(t_0 + \Delta, X), T - \Delta)$ about $U_0$ and $T$ as follows:

\[ \Pi(U(t_0 + \Delta, X), T - \Delta) = \Pi(U_0(X), T) + \Delta \int_\Omega \sum_{i=1}^{N} \frac{\delta \Pi(U_0(X), T)}{\delta u_{0(i)}(X)}\, \mathcal{K}_i(U_0(X), F_\Omega(t_0, X))\, d\Omega - \Delta\, \frac{\partial \Pi(U_0(X), T)}{\partial T} + o(\Delta) \tag{168} \]

where $\delta \Pi(U_0(X), T)/\delta u_{0(i)}(X)$ denotes a functional partial (variational) derivative, which is defined as the variation of the functional $\Pi$ with respect to the function $u_{0(i)}$ at a point $X \in \Omega$, or formally (14, 19),

\[ \frac{\delta \Pi(U_0(X), T)}{\delta u_{0(i)}(X)} = \lim_{\Delta\xi \to 0} \frac{\Pi(u_{0(1)}(X), \ldots, u_{0(i)}(X) + h_i(X), \ldots, u_{0(N)}(X), T) - \Pi(U_0(X), T)}{\Delta\xi} \tag{169} \]

where $h_i(X)$ is a continuous function having compact support in a region $\Delta\Omega$ in $\Omega$ surrounding the point $X$, and

\[ \Delta\xi = \int_{\Delta\Omega} h_i(X)\, d\Omega. \]
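The difference quotient in Eq. (169) can be tried on a functional whose variational derivative is known. The sketch below uses the hypothetical example $\Pi[u] = \int_0^1 u(x)^2\, dx$, for which the functional derivative at the point $x$ is $2u(x)$, and takes the perturbation $h$ to be a small bump supported on a single grid cell. The grid and step size are illustrative choices.

```python
import numpy as np

def functional(u, dx):
    """Example functional Pi[u] = integral of u(x)^2 over (0, 1)."""
    return np.sum(u ** 2) * dx

def functional_derivative(u, k, dx, eps=1e-4):
    """Difference quotient of Eq. (169): perturb u on the single cell k and
    divide by the integral of the perturbation."""
    h = np.zeros_like(u)
    h[k] = eps
    d_xi = np.sum(h) * dx
    return (functional(u + h, dx) - functional(u, dx)) / d_xi

n = 1000
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx
u = np.sin(np.pi * x)
k = n // 2
fd = functional_derivative(u, k, dx)      # close to 2 * u[k]
```

The computed quotient differs from $2u(x_k)$ only by a term of the order of the bump height, in line with the limiting definition.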

Hence, the first variation of the functional due to a complete variation of all $u_i(X)$ defined for all $X \in \Omega$ is

Using the approximation

\[ \int_{t_0}^{t_0 + \Delta}\!\int_\Omega \mathcal{P}_1(t, X, U_{F_\Omega}(t, X; U_0(X), t_0), F_\Omega(t, X))\, d\Omega\, dt = \Delta \int_\Omega \mathcal{P}_1(t_0, X, U_0(X), F_\Omega(t_0, X))\, d\Omega + o(\Delta) \tag{171} \]

substituting Eq. (168) into Eq. (166), and taking the limit as $\Delta \to 0$ lead to the following partial differential-integral equation:

+ ^ 0 , ^ , C/oW.^Co.^))]^ (172)

Since Eq. (172) must hold for all $t \in [t_0, t_1]$, it can be rewritten as

+ X, U(t, X), Fa(t, X))] dQ (173)

where $T = t_1 - t$ and the initial condition is given by

\[ \Pi(U(t_1, X), 0) = \int_\Omega \mathcal{P}_0(t_1, X, U(t_1, X))\, d\Omega \tag{174} \]

For the case where an inner product in $\Gamma(\Omega)$ is defined, Eq. (173) can be expressed in the following simplified form:

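where

\[ P = \mathrm{Col}[p_1, \ldots, p_N, p_{N+1}] = \mathrm{Col}\!\left[\frac{\delta \Pi}{\delta U},\ 1\right] \tag{176} \]

\[ Q = \mathrm{Col}[q_1, \ldots, q_N, q_{N+1}] = \mathrm{Col}\bigl[\mathcal{K}(U(t, X), F_\Omega(t, X)),\ \mathcal{P}_1(t, X, U(t, X), F_\Omega(t, X))\bigr] \tag{177} \]

and $\langle P, Q \rangle_\Omega$ denotes an inner product of $P$ and $Q$ in $\Gamma(\Omega)$, i.e.,

\[ \langle P, Q \rangle_\Omega = \int_\Omega \sum_{i=1}^{N+1} p_i q_i\, d\Omega \tag{178} \]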

If we define $\langle P, Q \rangle_\Omega$ as the Hamiltonian $H(U, P, t)$, then the quantity $\sum_i p_i q_i$ corresponds to the Hamiltonian density, and $P$ and $Q$ correspond to the generalized momentum density and coordinate, respectively, in classical continuum mechanics.
By introducing the notation

)
^ ^ • ' i ^ À / ' ^ » ' '

Eq. (175) becomes

\[ \frac{\partial \Pi(U(t, X), T)}{\partial T} = H^{\circ}(U, P, t) \tag{180} \]

The above equation is a partial differential-integral equation, which corresponds to the Hamilton-Jacobi equation in finite-dimensional systems. If the solution is regular, one can go a step further to show that the optimum system motions are solutions of the Hamilton canonical equations, which consist of a set of partial differential equations of the form

\[ \frac{\partial U(t, X)}{\partial t} = \frac{\delta H^{\circ}(U, P, t)}{\delta P(t, X)} \tag{181} \]

\[ \frac{\partial P(t, X)}{\partial t} = -\frac{\delta H^{\circ}(U, P, t)}{\delta U(t, X)} \tag{182} \]

with initial condition

\[ U(t_0, X) = U_0(X) \tag{183} \]

and a terminal condition at time $t_1$, which depends upon whether $U(t_1, X)$ is specified or free:

for specified $U(t_1, X)$, $P(t_1, X)$ is free;
for free $U(t_1, X)$, $P(t_1, X) = \mathrm{Col}[\partial \mathcal{P}_0(t_1, X, U(t_1, X))/\partial U(t_1, X),\ 1]$.

It is evident that the problem of solving Eqs. (181) and (182) is a two-point boundary-value problem in a function space.

Remarks.

(i) In the case where the performance index is defined only on a subset $\Omega'$ of the spatial domain $\Omega$, the functional equation is identical to Eq. (173) except that the domain of integration is replaced by $\Omega'$. Also, for a system with isoperimetric constraints of the form (161), a functional equation corresponding to Eq. (173) can be derived using Lagrange multipliers (9, 44).
(ii) The functional equation derived here is essentially identical to that for finite-dimensional systems, except for the introduction of the notion of a functional partial derivative. For the case where $\Gamma(\Omega)$ is a subset of a Hilbert space $L_2^N(\Omega)$, the functional equation can also be derived using geometric arguments. Here, the functional partial derivative corresponds to a gradient in function space.

2. SYSTEMS IN INTEGRAL FORM

In many distributed parameter systems, it is possible to formulate the dynamic equation directly in the form of a set of integral equations. This representation is desirable since the boundary conditions are included in its description.
Here, we shall consider a system which is describable by a set of nonlinear integral equations of the form

\[ U(t, X) = \int_\Omega K_0(t, t_0, X, X', U_0(X'))\, d\Omega' + \int_{t_0}^{t_1}\!\int_\Omega K(t, t', X, X', U(t', X'), F_\Omega(t', X'))\, d\Omega'\, dt' \tag{184} \]

where $K_0$ and $K$ are specified vector-valued functions of their arguments. $K_0$ has the property that

\[ \int_\Omega K_0(t_0, t_0, X, X', U_0(X'))\, d\Omega' = U_0(X) \tag{185} \]

Furthermore, each component of $K_0$ and $K$ is a square-integrable function defined on the domain $[t_0, t] \times \Omega$, and has continuous first-order partial derivatives with respect to $u_i$, the components of $U$. Without loss of generality, $U_0(X)$ is taken to be zero almost everywhere in $\Omega$.
In addition to the system equation (184), there is given a set of functional constraints of the form

\[ \mathcal{Z}_i[\zeta(U(t, X), F_\Omega(t, X))] = 0, \qquad i = 1, \ldots, N_c \tag{186} \]

where

\[ \zeta = \int_{t_0}^{t_1}\!\int_\Omega Z(t', X', U(t', X'), F_\Omega(t', X'))\, d\Omega'\, dt' \tag{187} \]

and $Z$ is a vector-valued function with components $z_j$, $j = 1, \ldots, N_c'$. It is assumed that $\mathcal{Z}_i$ and $z_j$ have continuous first partial derivatives with respect to $\zeta$ and $U$ respectively. To simplify the notation in the subsequent discussions, we shall denote $(t, X)$ by $S$, and $[t_0, t_1] \times \Omega$ by $\mathcal{E}$.
The optimum control problem can be stated as follows:
Given system equation (184), find an admissible control $F_\Omega(S) \in \mathcal{I}(\mathcal{E})$ such that the performance index

\[ \varphi = \int_{\mathcal{E}} \mathcal{P}_1(S, U(S), F_\Omega(S))\, d\mathcal{E} \tag{188} \]

assumes its minimum with respect to the set of all admissible controls $\mathcal{I}(\mathcal{E})$, and the constraints (186) are satisfied.
The above problem is essentially that formulated by Butkovskii (5, 7), for which he derived the following maximum principle:

THEOREM IV-1: Let $F_\Omega(S)$, $S \in \mathcal{E}$, be an admissible control such that by virtue of Eq. (184), the constraints (186) are satisfied. In order that this control $F_\Omega(S)$ be optimal (i.e., $F_\Omega = F_\Omega^{\circ}$), it is necessary that there exist a non-zero vector $(c_0, c_1, \ldots, c_{N_c})$ with $c_0 = -1$, such that for almost all fixed values of $S \in \mathcal{E}$, the function:

S(S,Fv) = c0#>l(S1U(S)1Fn(S))

- j M(S", S')K(S', S, U(S),Fa(S))d<r\ dS"

(
4 - X c i ^ i ^ j Z ( 5 ) i / ( 5 ) , F ß( 5 ) )

dZ S F
+ \ f ( "<V> ^ [K(S", S, U(S),F0(S))

- j M(S", S')K(S', S, U(S),F^S)) d£"\ (189)

of the variable $F_\Omega \in \mathcal{I}(\mathcal{E})$ attains its maximum, i.e., for almost all $S \in \mathcal{E}$, the following relation holds:

S(S,FDo) = sup S{S,FB) (190)


FöeS(<?)

where the matrix-valued function $M(S'', S')$ satisfies an integral equation of the form:
f
dK{S",S\ U(S')9Fq{S ))
M ( S " , S') +
eu

J£ OU

Proof. We shall only present an outline of the proof here (see references (5) and (7) for details). Assume that there exist an optimal control $F_\Omega^{\circ}(S) \in \mathcal{I}(\mathcal{E})$ and a corresponding optimal motion $U^{\circ}(S)$, which satisfy the constraint equation (186) and minimize the performance index (188). We shall consider the following functional

= Κ I t / ( S ) , F f l( S ) ) dê + WCit/.Ffl)) (192)

where $\mathcal{Z}_i$ are defined by Eqs. (186) and (187).
Let $S$ be a regular point for the control function in the domain $\mathcal{E}$, and let $\Delta_\epsilon$ be a small region surrounding $S$ with volume $\epsilon$ such that $\epsilon \to 0$ as the diameter of $\Delta_\epsilon \to 0$.
We shall introduce a perturbed control $\hat{F}_\Omega$ defined about the optimal $F_\Omega^{\circ}$ in the following manner:

\[ \hat{F}_\Omega(S) = \begin{cases} F_\Omega^{\circ}(S) & \text{for all } S \in \mathcal{E} - \Delta_\epsilon \\ F_\Omega^{*} & \text{for all } S \in \Delta_\epsilon \end{cases} \tag{193} \]

where $F_\Omega^{*} \in \mathcal{I}(\mathcal{E})$. In the sequel, we shall denote the system motions corresponding to the controls $\hat{F}_\Omega$ and $F_\Omega^{\circ}$ by $\hat{U}(S)$ and $U^{\circ}(S)$, respectively.
Assuming $\mathcal{P}_1$ has continuous first partial derivatives with respect to $U(S)$, the value of the functional (192) with the perturbed control $\hat{F}_\Omega$ can be computed by

= λ0 j ^(S, /i,(S)) + £ λ ^ , [ ζ ( # ( £ ) , / f l( S ) ) ]

= λ0 j ^ ( S , U«{S), FAS)) +
d
^ ^IlD-
S
8C7(S)j rf* (194)

+ e\[&>x{5, V\S),Fa*) - ^(8, U%S),Fa\S))]

d Z { S F p 0)
+ g JZ(S, U\S),Fn%S)) + 'd^°0' 8U(S)\ dS

+ e[Z(S, U°(S), Fx*) - Z(S, U°(S), F f l«(S))] + 0 ( e ) ] (195)



where $o(\epsilon)/\epsilon \to 0$ as $\epsilon \to 0$ and

; u
dip ~ L " " ' e« NoJ

and $\partial Z/\partial U$ is a matrix with elements $\partial z_i/\partial u_j$, $i = 1, \ldots, N_c'$, $j = 1, \ldots, N$.
From the system equation (184), the increment $\delta U(S)$ satisfies, with an accuracy up to small quantities of higher order than $\epsilon$, a nonhomogeneous Fredholm integral equation (linear in $\delta U(S)$):

W(S) = «[Jf(S, 5, U<\S),Fa*) - K(S, S, U°(S),Fa°(S))]

F
+ f . J ^ g i f > g l s 1/(5') <//' (197)

where $\partial K/\partial U$ is a matrix with elements $\partial k_i/\partial u_j$, $i, j = 1, \ldots, N$.
The solution to Eq. (197) can be written as (45)

W(S) = e \k(S, 5, - 5, U%S)9Fa%3))

-j M(S, S')[tf(S' f S,
ϋ
*7 (5),/^*) - 5, Ι/°(5),^ο(5))]^Ί
(198)
where the kernel $M(S, S')$ satisfies Eq. (191).
Substituting Eq. (198) into Eq. (195) and expanding $\mathcal{Z}_i$ in powers of $\epsilon$, we can express the difference between the values of the functional (192) corresponding to $\hat{F}_\Omega$ and $F_\Omega^{\circ}$ as follows:

Δψ = ψ - c0 I ^(S, U°(S)9Fa°(S)) dê - ^K(^S),

= € [ 5 ( 5 ^ Λ* ) - 5 ( 5 , ^ ο ) ] ( 1 9) 9

where e > 0, S is defined by E q . (189), and we have set = c i 9 i = 0,


1, Nc.
s
If we set c0 — — 1 , t h e n J $ ' m u s t be nonpositive a b o u t t h e o p t i m u m
<P'. T h u s ,

3(S,Fa»)^S{S,Fa*) (200)

Since t h e above relation is valid for any F a * 6 / ( i ) , t h e n S(S9Fa)


CONTROL OF DISTRIBUTED PARAMETER SYSTEMS 151

attains a m a x i m u m with respect to FQ for fixed 5 , i.e. for almost all


See,
£ 0 W ) = sup S{S,F ) a
(201)
H e n c e , the proof is complete.

Remarks.

(i) In the proof of Theorem IV-1, use was made of the Lagrange multiplier rule. A proof of the validity of this rule for the minimization of a certain class of functionals subject to a set of equality constraints, all defined on a general Banach space, has been given by Butkovskii (7).
(ii) Butkovskii has extended Theorem IV-1 to a more general problem where both the system equation and its constraints are describable by the following single operator equation:

$$ \mathscr{F}\Big( U(S), \int K(S, S', U(S'), F_\Omega(S'))\, dS' \Big) = \phi \tag{202} $$

where both $U$ and $F_\Omega$ are elements from a subset of a Banach space; $\phi$ is
a null element; and the functional to be minimized is of the form:

$$ \mathscr{F}_0\Big( \int K(S_1, S, U(S), F_\Omega(S))\, dS \Big) \tag{203} $$

where $S_1$ is a point in the domain of $S$. Also, he has shown that if
$\mathscr{F}$ and $\mathscr{F}_0$ are linear, the conditions in the extended theorem are
also sufficient for an optimum. The details can be found in reference (7).

C. Linear Systems
To illustrate the applications of the general results derived in the
preceding sections, the optimum control of linear systems with specified
performance indices will be discussed.

1. OPTIMUM CONTROL WITH GENERALIZED QUADRATIC PERFORMANCE INDEX

We assume that the dynamical system to be controlled is describable
by a linear PDE in the form of (106), i.e.,

$$ \frac{\partial U(t, X)}{\partial t} = \mathscr{L}_0 U(t, X) + D(t, X) F_\Omega(t, X) \tag{204} $$

defined for $t > t_0$ on a fixed spatial domain $\Omega$, where $\mathscr{L}_0$ is an
infinitesimal generator of a semigroup. Again, we assume that the state
function space $\Gamma(\Omega)$ is a subset of $L_2^N(\Omega)$, and any given boundary
conditions can be taken care of by restricting the domain of $\mathscr{L}_0$.
Let the performance index be given by

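$$
\begin{aligned}
\mathscr{P} ={}& \int_\Omega \int_\Omega U_{F_\Omega}^T(t_1, X; U_0(X), t_0)\, Q_0(X, X')\, U_{F_\Omega}(t_1, X'; U_0(X), t_0)\, d\Omega\, d\Omega' \\
&+ \int_{t_0}^{t_1} \int_\Omega \int_\Omega U_{F_\Omega}^T(t, X; U_0(X), t_0)\, Q_1(X, X', t)\, U_{F_\Omega}(t, X'; U_0(X), t_0)\, d\Omega\, d\Omega'\, dt \\
&+ \int_{t_0}^{t_1} \int_\Omega \int_\Omega F_\Omega^T(t, X)\, Q_2(X, X', t)\, F_\Omega(t, X')\, d\Omega\, d\Omega'\, dt
\end{aligned}
\tag{205}
$$

where $Q_0$, $Q_1$, and $Q_2$ are positive-definite, symmetric matrix kernels.
In view of Eqs. (173) and (174), the functional equation corresponding
to Eq. (173) is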

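$$
\begin{aligned}
\frac{\partial \Pi(U(t, X), T)}{\partial T} = \min_{F_\Omega} \Big\{ & \int_\Omega \Big( \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \Big)^T \big[ \mathscr{L}_0 U(t, X) + D(t, X) F_\Omega(t, X) \big]\, d\Omega \\
& + \int_\Omega \int_\Omega \big[ U^T(t, X) Q_1(X, X', t) U(t, X') + F_\Omega^T(t, X) Q_2(X, X', t) F_\Omega(t, X') \big]\, d\Omega\, d\Omega' \Big\}
\end{aligned}
\tag{206}
$$

where $T = t_1 - t$, the minimum is taken over the admissible controls $F_\Omega$
on $[t_0, t_1] \times \Omega$, and the initial condition is given by

$$ \Pi(U(t_1, X), 0) = \int_\Omega \int_\Omega U^T(t_1, X)\, Q_0(X, X')\, U(t_1, X')\, d\Omega\, d\Omega' \tag{207} $$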

If we assume that the admissible control $F_\Omega$ is unconstrained in
magnitude, it can be readily deduced by applying elementary variational
calculus that the optimum distributed control $F_\Omega^0$ can be determined
from the following equation:

$$ \int_\Omega F_\Omega^{0T}(t, X')\, Q_2(X, X', t)\, d\Omega' = -\frac{1}{2} \Big( \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \Big)^T D(t, X) \tag{208} $$

In the special case where $Q_2(X, X', t) = \delta(X - X') I$ ($\delta$ is the Dirac
delta function, $I$ is the identity matrix), Eq. (208) reduces to

$$ F_\Omega^0(t, X) = -\frac{1}{2} D^T(t, X)\, \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \tag{209} $$

Substituting Eq. (209) into Eq. (206) leads to a partial differential-
integral equation for $\Pi$:

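$$
\begin{aligned}
\frac{\partial \Pi(U(t, X), T)}{\partial T} ={}& \int_\Omega \Big( \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \Big)^T \Big[ \mathscr{L}_0 U(t, X) - \frac{1}{4} D(t, X) D^T(t, X) \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \Big]\, d\Omega \\
&+ \int_\Omega \int_\Omega U^T(t, X)\, Q_1(X, X', t)\, U(t, X')\, d\Omega\, d\Omega'
\end{aligned}
\tag{210}
$$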

If we assume that the solution to Eq. (210) has the form

$$ \Pi(U(t, X), T) = \int_\Omega \int_\Omega U^T(t, X)\, A_0(X, X', t_1 - t)\, U(t, X')\, d\Omega'\, d\Omega \tag{211} $$

where $A_0$ is an undetermined positive-definite, symmetric matrix kernel,
then it can be readily deduced by computing the functional partial
derivative $\delta \Pi/\delta U$ that $F_\Omega^0(t, X)$ has the form

$$ F_\Omega^0(t, X) = -D^T(t, X) \int_\Omega A_0(X, X', t_1 - t)\, U(t, X')\, d\Omega' \tag{212} $$

The equation for determining $A_0$ can be obtained by computing
$\delta \Pi/\delta U$ and $\partial \Pi/\partial T$, substituting them into Eq. (210), and then equating
the kernels of the resulting integrals,

dA0(X^X'tT)=^MX^ ^ ) T

- Γ A0(X, X'\
J T)D{t% X")AQ(X'\ X\ T) dQ" + QX(X, X\ t) (213)
Ω

where $\mathscr{L}_0^*$ is the adjoint operator corresponding to $\mathscr{L}_0$, which is formally
defined by the relation

$$ \langle P(X), \mathscr{L}_0 U(X) \rangle_\Omega = \langle \mathscr{L}_0^* P(X), U(X) \rangle_\Omega \tag{214} $$

In contrast with the matrix Riccati equation resulting from linear
ordinary differential systems with quadratic performance index (55),
Eq. (213) can be regarded as a "Riccati partial differential equation"
with its initial condition given by

$$ A_0(X, X', 0) = Q_0(X, X') \tag{215} $$
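
As a rough illustration only, consider a hypothetical scalar, lumped analog of the problem above: $du/dt = \lambda u + f$ with cost $q_0 u^2(t_1) + \int (q_1 u^2 + f^2)\, dt$. Trying $\Pi = a(T) u^2$ in the scalar counterpart of the functional equation gives the ordinary Riccati equation $da/dT = 2\lambda a - a^2 + q_1$ with $a(0) = q_0$, which plays the role of Eqs. (213) and (215). The names `lam`, `q0`, `q1` and the Runge-Kutta integrator are assumptions made for this sketch; they are not part of the original development.

```python
import math


def riccati_rhs(a, lam, q1):
    """Right-hand side of the scalar Riccati equation da/dT = 2*lam*a - a**2 + q1."""
    return 2.0 * lam * a - a * a + q1


def integrate_riccati(lam, q0, q1, horizon, steps):
    """Integrate from a(0) = q0 up to T = horizon with classical Runge-Kutta steps."""
    a = q0
    h = horizon / steps
    for _ in range(steps):
        k1 = riccati_rhs(a, lam, q1)
        k2 = riccati_rhs(a + 0.5 * h * k1, lam, q1)
        k3 = riccati_rhs(a + 0.5 * h * k2, lam, q1)
        k4 = riccati_rhs(a + h * k3, lam, q1)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


def steady_state_gain(lam, q1):
    """Nonnegative root of 2*lam*a - a**2 + q1 = 0 (limit of a(T) for large T)."""
    return lam + math.sqrt(lam * lam + q1)


if __name__ == "__main__":
    lam, q0, q1 = -1.0, 0.0, 3.0  # hypothetical values chosen for illustration
    a_long = integrate_riccati(lam, q0, q1, horizon=10.0, steps=2000)
    # For a long horizon the two numbers should nearly agree.
    print(a_long, steady_state_gain(lam, q1))
    # Scalar analog of the feedback law (212): f(t) = -a(t1 - t) * u(t)
```

In the distributed case the scalar $a(T)$ is replaced by the kernel $A_0(X, X', T)$, so a numerical treatment would also require a spatial discretization, which this sketch does not attempt.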

From Eqs. (209) and (210), it can also be deduced that the Hamilton
canonical equations are

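$$ \frac{\partial U(t, X)}{\partial t} = \mathscr{L}_0 U(t, X) - \frac{1}{2} D(t, X) D^T(t, X) P(t, X) \tag{216} $$

$$ \frac{\partial P(t, X)}{\partial t} = -\mathscr{L}_0^* P(t, X) - 2 \int_\Omega Q_1(X, X', t)\, U(t, X')\, d\Omega' \tag{217} $$

where

$$ P(t, X) = \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \tag{218} $$

The initial and terminal conditions for Eqs. (216) and (217) are:

$$ U(t_0, X) = U_0(X) \tag{219} $$

$$ P(t_1, X) = 2 \int_\Omega Q_0(X, X')\, U(t_1, X')\, d\Omega' \tag{220} $$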
Since Eqs. (216) and (217) form a set of linear, homogeneous PDE's,
and Eq. (216) describes a dynamical system, we can assume that their
solution at time $t_1$ corresponding to initial data $U(t, X)$ and $P(t, X)$ at
time $t$ can be expressed in the form

$$
\begin{bmatrix} U(t_1, X) \\ P(t_1, X) \end{bmatrix}
= \int_\Omega
\begin{bmatrix} k_{11}(t_1, t, X, X') & k_{12}(t_1, t, X, X') \\ k_{21}(t_1, t, X, X') & k_{22}(t_1, t, X, X') \end{bmatrix}
\begin{bmatrix} U(t, X') \\ P(t, X') \end{bmatrix} d\Omega'
\tag{221}
$$

where $k_{ij}$, $i, j = 1, 2$, are Green's function matrices having the
properties

$$ k_{ij}(t, t, X, X') = \begin{cases} \delta(X - X') I, & i = j \\ \phi, & i \ne j \end{cases} \tag{222} $$


where $\phi$ is a null matrix. Furthermore, it can be readily verified from
Eqs. (209), (212), and (218) that $P(t, X)$ and $U(t, X)$ are related by

$$ P(t, X) = \mathscr{B}(t_1, t)\, U(t, X) \tag{223} $$

where $\mathscr{B}(t_1, t)$ is a linear operator given by

$$ \mathscr{B}(t_1, t) = 2 \int_\Omega A_0(X, X', t_1 - t)(\,\cdot\,)\, d\Omega' \tag{224} $$


An explicit form for $\mathscr{B}(t_1, t)$ in terms of the Green's function matrices
$k_{ij}$ can be obtained by substituting Eq. (223) into Eq. (221) and taking
into account the terminal condition (220),

# ( Ί , t) = [ f * 2 l( ' i , ty X, X)( -)dQ' + 2\ > t, Χ, X) ·


U Ω Ω

jaQo(X\ X")(-)dQ"dQ'] •
\f Μ Ί . t , x , x ' ) ( - ) d s r + 2( .*,x,*')·
U Ω Ω
1
Q0(X\X")(')dQ" dû'}" (225)
Thus, the optimum control law becomes

$$ F_\Omega^0(t, X) = -\frac{1}{2} D^T(t, X)\, \mathscr{B}(t_1, t)\, U(t, X) \tag{226} $$

Now we shall consider a few special cases of the above problem.

Minimum Energy Control. Here, the problem is to find a control law
which will transfer an arbitrary initial state $U_0(X) \in \Gamma(\Omega)$ at $t_0$ to the
null state at a specified time $t_1$ such that the control energy is minimized.
We shall assume that the control energy is measured by

$$ E_c = \int_{t_0}^{t_1} \int_\Omega F_\Omega^T(t, X)\, F_\Omega(t, X)\, d\Omega\, dt \tag{227} $$

For this problem, the performance index is precisely $E_c$ and the
Hamilton canonical equations take on the form

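$$ \frac{\partial U(t, X)}{\partial t} = \mathscr{L}_0 U(t, X) - \frac{1}{2} D(t, X) D^T(t, X) P(t, X) \tag{228} $$

$$ \frac{\partial P(t, X)}{\partial t} = -\mathscr{L}_0^* P(t, X) \tag{229} $$

with initial and terminal conditions

$$ U(t_0, X) = U_0(X); \qquad U(t_1, X) = 0 \tag{230} $$

Using the notations in Eq. (221), the solutions to Eqs. (228) and
(229) at time $t$ can be written as

$$
\begin{aligned}
U(t, X) ={}& \int_\Omega k_{11}(t, t_0, X, X')\, U_0(X')\, d\Omega' \\
&- \frac{1}{2} \int_{t_0}^{t} \int_\Omega k_{11}(t, t', X, X')\, D(t', X') D^T(t', X')\, P(t', X')\, d\Omega'\, dt'
\end{aligned}
\tag{231}
$$

$$ P(t, X) = \int_\Omega k_{22}(t, t_0, X, X')\, P_0(X')\, d\Omega' \tag{232} $$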

Substituting Eq. (232) into Eq. (231) and setting $U(t_1, X) = 0$ lead
to the following relation between $U_0(X)$ and $P_0(X)$:

$$ 2 \int_\Omega k_{11}(t_1, t_0, X, X')\, U_0(X')\, d\Omega' = \mathscr{A}(t_1, t_0)\, P_0(X) \tag{233} $$

where

$$ \mathscr{A}(t_1, t_0) = \int_{t_0}^{t_1} \int_\Omega k_{11}(t_1, t', X, X')\, D(t', X') D^T(t', X') \cdot \int_\Omega k_{22}(t', t_0, X', X'')(\,\cdot\,)\, d\Omega''\, d\Omega'\, dt' \tag{234} $$

Clearly, if the integral operator $\mathscr{A}(t_1, t_0)$ has an inverse, then

$$ P_0(X) = 2 \mathscr{A}^{-1}(t_1, t_0) \int_\Omega k_{11}(t_1, t_0, X, X')\, U_0(X')\, d\Omega' \tag{235} $$

In view of Eqs. (209) and (218), the optimum control law is

$$
\begin{aligned}
F_\Omega^0(t, X) &= -\frac{1}{2} D^T(t, X)\, P(t, X) \\
&= -D^T(t, X) \Big( \int_\Omega k_{22}(t, t_0, X, X')(\,\cdot\,)\, d\Omega' \Big) \mathscr{A}^{-1}(t_1, t_0) \cdot \Big( \int_\Omega k_{11}(t_1, t_0, X, X')(\,\cdot\,)\, d\Omega' \Big) U_0(X)
\end{aligned}
\tag{236}
$$

I n view of t h e fact that

f M ' > Ό . Χ, Χ'* -)dQ' =[J ktih , t, X', X)( · ) dQ', (237)
Ω

t h e operator ^ ( ^ . , t0) is precisely ^ ( ^ , ί 0 ) ^ , / 0 ) in t h e control-


lability L e m m a I I I - 2 .
For the particular case of a linear diffusion system governed by Eq.
(125) defined on the spatial domain (0, 1), and with the boundary
conditions $u(t, 0) = u(t, 1) = 0$ for all $t$, it can be shown (9) that the
control law for minimum control energy is given by

$$ f(t, x) = -2 \sum_{n=1}^{\infty} 2\pi^2 n^2 \big( \exp(2\pi^2 n^2 (t_1 - t)) - 1 \big)^{-1} \sin(n\pi x) \int_0^1 \sin(n\pi x')\, u(t, x')\, dx' \tag{238} $$
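
The modal gain $2\pi^2 n^2 / (\exp(2\pi^2 n^2 (t_1 - t)) - 1)$ in the control law above can be checked on a single mode. The sketch below assumes that Eq. (125) is the unit-coefficient diffusion equation $\partial u/\partial t = \partial^2 u/\partial x^2 + f$, so that the amplitude $a(t)$ of the mode $\sin(n\pi x)$ obeys $da/dt = -n^2\pi^2 a + g(t)$, where $g$ is the corresponding control amplitude. Under that assumption the open-loop minimum-energy control is proportional to $\exp(n^2\pi^2 t)$, and it should coincide with the feedback form. The function names and the Runge-Kutta integrator are illustrative choices, not part of the original text.

```python
import math


def open_loop_control(decay, a0, t1):
    """Minimum-energy control g(t) = c*exp(decay*t) driving a(0) = a0 to a(t1) = 0
    for da/dt = -decay*a + g, with decay = n**2 * pi**2 and t0 = 0."""
    c = -2.0 * decay * a0 / (math.exp(2.0 * decay * t1) - 1.0)
    return lambda t: c * math.exp(decay * t)


def feedback_gain(decay, t, t1):
    """Modal feedback gain: g(t) = feedback_gain(decay, t, t1) * a(t), valid for t < t1."""
    return -2.0 * decay / (math.exp(2.0 * decay * (t1 - t)) - 1.0)


def simulate_mode(decay, a0, t1, steps, control):
    """Integrate da/dt = -decay*a + control(t) on [0, t1] with classical Runge-Kutta."""
    h = t1 / steps
    a = a0

    def rhs(t, value):
        return -decay * value + control(t)

    for i in range(steps):
        t = i * h
        s1 = rhs(t, a)
        s2 = rhs(t + 0.5 * h, a + 0.5 * h * s1)
        s3 = rhs(t + 0.5 * h, a + 0.5 * h * s2)
        s4 = rhs(t + h, a + h * s3)
        a += h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return a


if __name__ == "__main__":
    decay = math.pi ** 2          # first mode, n = 1
    g = open_loop_control(decay, a0=1.0, t1=0.5)
    print(simulate_mode(decay, 1.0, 0.5, 2000, g))   # should be close to zero
    print(g(0.0), feedback_gain(decay, 0.0, 0.5))    # the two should agree for a(0) = 1
```

The feedback gain grows without bound as $t \to t_1$, which is why the sketch integrates the open-loop form rather than closing the loop numerically.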

Terminal Control Problem. Let the set of admissible control functions
$F_\Omega(t, X)$ be defined by

$$ \mathscr{I}([t_0, t_1] \times \Omega) = \{ F_\Omega(t, X) : F_\Omega(t, X) \in L_2^N([t_0, t_1] \times \Omega),\ |f_{\Omega(i)}(t, X)| \le F_{M(i)},\ i = 1, \ldots, N \} \tag{239} $$

Also, we assume that $D(t, X)$ in Eq. (204) is an identity matrix.


The problem is to find an admissible control which minimizes a
terminal quadratic performance index at time $t_1$ in the form of Eq. (205)
with $Q_1 = Q_2 = 0$.
For this problem, the functional equation corresponding to (206)
becomes

$$ \frac{\partial \Pi(U(t, X), T)}{\partial T} = \min_{F_\Omega} \int_\Omega \Big( \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \Big)^T \big( \mathscr{L}_0 U(t, X) + F_\Omega(t, X) \big)\, d\Omega \tag{240} $$

where the minimum is taken over the admissible controls, with initial
condition given by Eq. (207).
It is evident that if $\delta \Pi/\delta U \ne 0$, the optimum control $F_\Omega^0(t, X)$ must
satisfy

$$ f_{\Omega(i)}^0(t, X) = -F_{M(i)}\, \operatorname{sgn}\Big( \frac{\delta \Pi(U(t, X), T)}{\delta u_i(t, X)} \Big), \quad i = 1, \ldots, N \tag{241} $$

If $\delta \Pi/\delta U = 0$, the nature of $F_\Omega^0(t, X)$ is no longer obvious.
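
The componentwise structure of the bang-bang law can be sketched at a single space-time point as follows. The list `gradient` stands for sampled values of $\delta \Pi/\delta u_i$ and `bounds` for the magnitude limits; both names, and the choice of returning zero where a gradient component vanishes, are assumptions of this sketch (the text leaves that case undetermined).

```python
def bang_bang_control(gradient, bounds):
    """Return controls that minimize sum(g_i * f_i) subject to |f_i| <= bound_i.

    Each component sits on its bound with sign opposite to the gradient
    component; a zero gradient component gives 0.0 as an arbitrary placeholder.
    """
    controls = []
    for g, bound in zip(gradient, bounds):
        if g > 0:
            controls.append(-bound)
        elif g < 0:
            controls.append(bound)
        else:
            controls.append(0.0)
    return controls


def inner_product(a, b):
    """Plain dot product, used to compare candidate controls."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    grad = [0.3, -2.0, 0.0]
    limits = [1.0, 0.5, 2.0]
    f_opt = bang_bang_control(grad, limits)
    print(f_opt, inner_product(grad, f_opt))
```

Any other admissible choice gives an inner product with the gradient that is no smaller, which is the pointwise content of the minimization in the functional equation.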

2. TIME-OPTIMAL CONTROL

Consider again a linear distributed parameter dynamical system in
the form of Eq. (204), where the distributed control function at any
fixed time $t$ is constrained by

$$ \| F_\Omega(t, X) \| = \Big[ \int_\Omega F_\Omega^T(t, X)\, F_\Omega(t, X)\, d\Omega \Big]^{1/2} \le 1 \tag{242} $$

The problem is to find an admissible distributed control $F_\Omega$ which
will transfer an arbitrary initial state $U_0(X) \in \Gamma(\Omega)$ to the null state in
a minimum amount of time. Here, the performance index is simply:

$$ \mathscr{P} = \int_{t_0}^{t_1} dt \tag{243} $$

and the functional equation corresponding to (173) is:

dIJ(U(ty X)y T)
dt

8IJ(U(ty X)y T)\\


= min

(244)

Clearly, the integral in (244) will take on its minimum value with respect
to $F_\Omega$ satisfying constraint (242), if we choose

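$$ F_\Omega^0(t, X) = -\Big[ D^T(t, X)\, \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \Big] \Big/ \Big\| D^T(t, X)\, \frac{\delta \Pi(U(t, X), T)}{\delta U(t, X)} \Big\| \tag{245} $$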

3. LINEAR SYSTEMS WITH BOUNDARY CONTROL

For a linear distributed parameter dynamical system with control
variables introduced at the boundary of the spatial domain, the dynamic
programming approach is not directly applicable. On the other hand,
Butkovskii's results can be used in this case.
We assume that the system is describable by a linear integral equation
of the form

$$ U(t, X) = \int_\Omega K_0(t, t_0, X, X')\, U_0(X')\, d\Omega' + \int_{t_0}^{t} K_1(t, t', X)\, F_{\partial\Omega}(t')\, dt' \tag{246} $$

where $F_{\partial\Omega}$ is a boundary control function which does not vary along the
boundary of $\Omega$, and $K_0$ and $K_1$ are specified Green's function matrices.
Let the performance index be given by

1
φ = ί' ί X, U(t, X),FdQ(t)) dQ dt (247)

Without loss of generality, we assume $U_0 = 0$. In view of Theorem
IV-1, the function corresponding to $\Xi$ in Eq. (189) is

XyFdii) = c&it, X, U(ty X\FdQ(t))

+ ° ilL ™ 'ïù '


c lir U FB0) K
^ > <> άΩ dt
' ( )
248

According to the theorem, we set $c_0 = -1$, and the optimum boundary
control $F_{\partial\Omega}^0(t)$, $t \in [t_0, t_1]$, at any fixed instant of time $t$ must yield
the maximum of $\Xi(t, X, F_{\partial\Omega})$, i.e.,

$$ \Xi(t, X, F_{\partial\Omega}^0) = \sup_{F_{\partial\Omega}} \Xi(t, X, F_{\partial\Omega}) \tag{249} $$

Now, consider a specific case (7) where (246) is a scalar equation, i.e.,

$$ u(t, X) = \int_{t_0}^{t} k_1(t, t', X)\, f_{\partial\Omega}(t')\, dt' \tag{250} $$

and the performance index is given by

$$ \mathscr{P} = \int_\Omega \big( u_d(X) - u(t_1, X) \big)^{\mu}\, d\Omega, \quad \mu = \text{constant} > 0 \tag{251} $$

where $u_d(X)$ is a specified function of $X$. In addition, the boundary
control is constrained by

$$ F_{\min} \le |f_{\partial\Omega}(t)| \le F_{\max} \tag{252} $$

Here, the function $\Xi$ can be expressed explicitly as


U 3X WX
Wit V \ r Γ ( d ( ) - ('l » ) )" fb ( , f\ ( JQ

= - V/arXO J
ί MX)
Ω
-
1
, - Υ ) ) " " * ^ χ , ί, Χ ) (253)

Since $c_0 = -1$, we have $-c_0 \mu > 0$, and the maximum of $\Xi(t, F_{\partial\Omega})$ with
respect to $F_{\partial\Omega}$, subject to constraint (252), is attained when

$$ f_{\partial\Omega}^0(t) = \frac{1}{2}(F_{\min} + F_{\max}) + \frac{1}{2}(F_{\max} - F_{\min})\, \operatorname{sgn}\Big[ \int_\Omega \big( u_d(X) - u(t_1, X) \big)^{\mu - 1} k_1(t_1, t, X)\, d\Omega \Big] \tag{254} $$

For the particular case where $\mu = 2$, and the system is governed by
a linear diffusion equation (125) defined on a spatial domain (0, 1) with
$f(t, x) = 0$ and boundary conditions

$$ u(t, 0) = 0, \qquad u(t, 1) = f_1(t) \tag{255} $$

and constraint $|f_1(t)| \le 1$, Eq. (254) reduces to

$$
\begin{aligned}
f_1^0(t) &= \operatorname{sgn}\Big[ \int_0^1 \big( u_d(x) - u(t_1, x) \big)\, k_1(t_1, t, x)\, dx \Big] \\
&= \operatorname{sgn}\Big[ \int_0^1 u_d(x)\, k_1(t_1, t, x)\, dx - \int_0^1 \int_{t_0}^{t_1} k_1(t_1, t', x)\, f_1^0(t')\, dt'\, k_1(t_1, t, x)\, dx \Big]
\end{aligned}
\tag{256}
$$

where the Green's function $k_1$ is given by

$$ k_1(t_1, t, x) = \sum_{n=1}^{\infty} 2\pi (-1)^{n+1} n \sin(n\pi x) \exp(-n^2 \pi^2 (t_1 - t)) \tag{257} $$



4. NEAR-OPTIMUM CONTROL

For practical systems, the optimum control policies derived from
theory (if obtainable) are usually so complex that they can be implemented
only at prohibitive cost. Therefore, it is desirable to derive near-
optimum control policies by introducing appropriate approximations.
However, in most approximation schemes, there are no direct ways of
assuring stability of the resulting controlled system. A possible approach
to this problem is to incorporate the stability requirement into the
formulation of the optimum control problem. This approach has been
taken by Bass (46) and Krasovskii (47, 48) in the design of near-
optimum control systems whose dynamic members are governed by
ordinary differential or differential-difference equations.
Here, we shall discuss only the extension of Bass' idea to the design
of a class of linear distributed parameter dynamical systems. In essence,
his approach consists of first selecting an appropriate set of parameters
to ensure asymptotic stability of the uncontrolled (free) system, and then
choosing a control function, which depends on the system state variables
and satisfies certain prescribed constraints, to increase the speed of
response. Thus, it can be regarded as an approximate method for
designing time-optimal control systems.
For illustrative purposes, we shall consider a linear system governed by

$$ \frac{\partial U(t, X)}{\partial t} = \mathscr{L}_0 U(t, X) + F_\Omega(t, X) \tag{258} $$

satisfying the same assumptions as Eq. (204). Also, the components of
the distributed control function, $f_{\Omega(i)}(t, X)$, are constrained by

$$ |f_{\Omega(i)}(t, X)| \le F_{M(i)}, \quad i = 1, \ldots, N \tag{259} $$

Again, as in Section III, 3a, we consider the following positive-definite
functional:

$$ \mathscr{V} = \int_\Omega U^T(t, X)\, U(t, X)\, d\Omega = \langle U(t, X), U(t, X) \rangle_\Omega \tag{260} $$

In view of Eq. (258), the total derivative of $\mathscr{V}$ with respect to $t$ is given by

$$ \frac{d\mathscr{V}}{dt} = \langle \mathscr{L}_0 U(t, X), U(t, X) \rangle_\Omega + \langle U(t, X), \mathscr{L}_0 U(t, X) \rangle_\Omega + 2 \langle U(t, X), F_\Omega(t, X) \rangle_\Omega \tag{261} $$

If we assume that the trivial solution of the free system can be made
to be asymptotically stable with respect to the $L_2^N(\Omega)$ norm by choosing
an appropriate set of parameters in $\mathscr{L}_0$, then there exists a positive
constant $M_0$ such that

(262)

Now, the required distributed control function is to be determined
by making $d\mathscr{V}/dt$ as negative as possible with respect to $F_\Omega$. Clearly,
this is achieved when

(263)

where

(264)

To avoid the possibility that the solutions of the controlled system
[with control given by Eq. (263)] may not exist, sgn $y_i$ may be
approximated as closely as desired by a continuous function such as
$\tanh(\gamma y_i)$, where $\gamma$ is a large positive number.
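
A minimal sketch of this smoothing, assuming the continuous replacement is the hyperbolic tangent with a steepness parameter (called `gamma` here purely for illustration):

```python
import math


def smooth_sgn(y, gamma):
    """Continuous approximation tanh(gamma*y) of sgn(y); sharper as gamma grows."""
    return math.tanh(gamma * y)


def sgn(y):
    """Discontinuous sign function, for comparison."""
    return (y > 0) - (y < 0)


if __name__ == "__main__":
    for gamma in (1.0, 10.0, 100.0):
        # The gap between the two functions at a fixed y != 0 shrinks as gamma increases.
        print(gamma, abs(smooth_sgn(0.1, gamma) - sgn(0.1)))
```

The approximation is uniformly close to the sign function outside any fixed neighborhood of zero, while remaining continuous (and zero) at the switching point itself.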
In practical situations, it is desirable to make a comparison between
the response of the controlled and uncontrolled systems on the basis of
some prescribed measure. Specific results of this nature have been
obtained for the case where $\mathscr{L}_0$ is a uniformly, strongly elliptic operator
[see reference (33) for details].

V. Problems in Approximation and Computation

In Section IV, certain general conditions and functional equations
associated with the optimum control of distributed parameter systems
have been derived. In order to obtain solutions to practical problems,
effective approximation schemes and computational procedures must be
devised. This task generally requires discretization of the equations
derived from the optimization theory in one form or another. At this
point, it is natural to ask what and how much advantage can be gained,
from a practical standpoint, in formulating the control problem directly
in the framework of a continuum, since the solution of the problem
ultimately requires discretization. The answer to this question is not
clear at this time. However, the following observations can be made.
In attempting to approximate the distributed mathematical model by
a finite-dimensional model at the start, it is not clear a priori what form
and level of discretization are most suitable for the particular system
under consideration. It is conceivable that the solution of the optimum
control problem based on the approximate model derived from a
particular discretization scheme deviates considerably from the actual
optimum solution. On the other hand, instead of discretizing the
mathematical model at the start, we may try to select discretization
schemes for the equations of optimization which are most suitable from
the computational standpoint. This way, we have a better chance of
obtaining useful solutions to the optimum control problem.
In some systems, discretizing the distributed mathematical models at
the start has the advantage of retaining certain microscopic structure of
a system, since electrical or mechanical analogs of the system may be
set up directly from the discretized model. Thus, a physical "feel"
for the behavior of the system may be acquired.
In what follows, we shall discuss various aspects of the problems in
the derivation of approximate systems and in computation.

A. Approximate Systems

Here, an approximate system is defined to be one which is derived
from the distributed mathematical model by some form of discretization
process.

1. FORMS OF APPROXIMATION

The approximation generally takes on one of the following forms:

(i) Spatial discretization. The discretized mathematical model consists
of a finite-dimensional system of continuous-time ordinary differential
equations. As mentioned earlier, an advantage of this form of
approximation is that some of the basic physical microscopic structure of the
system may be retained, since the derivation of dynamic equations for
many distributed systems usually starts with this discrete form. Also,
electrical or mechanical analogs of the system may be established.
(ii) Time discretization. The discretized model usually consists of a
finite-dimensional system of spatially-continuous ordinary differential
equations. This form of approximation may be used in discrete-time
distributed parameter control systems where the spatial distributions of
the physical variables are sampled in time.
(iii) Space-time discretization. The discretized model consists of a
finite-dimensional system of difference equations. This form is generally
required for digital computation.
(iv) Spatial harmonic truncation. For many physical systems defined
on a finite spatial domain, the state functions can be considered to have
a band-limited, discrete spatial harmonic spectrum. Furthermore, the
systems are essentially of a "low-pass" nature so that the high-frequency
spatial harmonics will be attenuated. In these situations, the system may
be approximated by a finite-dimensional system by truncating the spatial
harmonics at a suitable frequency.
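
To make form (iv) concrete, the sketch below truncates the sine-series solution of an uncontrolled diffusion system on (0, 1) with zero boundary values. It assumes the unit-coefficient equation $\partial u/\partial t = \partial^2 u/\partial x^2$, for which the $n$-th spatial harmonic decays like $\exp(-n^2\pi^2 t)$; the initial profile, the midpoint quadrature, and all function names are illustrative assumptions.

```python
import math


def sine_coefficients(u0, n_modes, n_quad=2000):
    """Coefficients a_n = 2 * integral_0^1 u0(x) sin(n*pi*x) dx by the midpoint rule."""
    h = 1.0 / n_quad
    coeffs = []
    for n in range(1, n_modes + 1):
        total = 0.0
        for j in range(n_quad):
            x = (j + 0.5) * h
            total += u0(x) * math.sin(n * math.pi * x)
        coeffs.append(2.0 * h * total)
    return coeffs


def truncated_state(coeffs, t, x):
    """State at (t, x) reconstructed from the retained harmonics only."""
    return sum(
        a * math.exp(-((n * math.pi) ** 2) * t) * math.sin(n * math.pi * x)
        for n, a in enumerate(coeffs, start=1)
    )


if __name__ == "__main__":
    coeffs = sine_coefficients(lambda x: x * (1.0 - x), 20)
    # Compare a 2-harmonic model with a 20-harmonic model shortly after the start.
    gap = max(
        abs(truncated_state(coeffs[:2], 0.05, j / 50.0)
            - truncated_state(coeffs, 0.05, j / 50.0))
        for j in range(51)
    )
    print(gap)
```

Because of the rapid decay of the higher harmonics, a very small number of retained modes already reproduces the state closely after a short time, which is the "low-pass" behavior referred to above.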

2. VALIDITY OF APPROXIMATION

Aside from spatial harmonic truncation, there are generally numerous
discretization schemes for a specific form of approximation as outlined
above (e.g., explicit and implicit differences). It is natural to ask whether
a given discretization scheme (or a set of schemes) is consistent in the
sense that the approximate equation corresponding to a prescribed
discretization scheme (or set of schemes) approaches the original
continuous equation in some sense as the spatial and/or time increments
$\to 0$. The answer to this question requires the consideration of the
properties of the exact solutions or the function space on which the
partial differential equation is defined. For equations with sufficiently
smooth solutions, the consistency of approximation can usually be
verified by applying Taylor's expansion theorem and examining the
truncation error terms of the difference formulas.
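
The Taylor-expansion argument can be checked numerically on the standard central second difference, whose leading truncation term is $(\Delta x)^2 u''''(x)/12$ for smooth $u$. The sketch below uses $u = \sin$ as an assumed test function; halving $\Delta x$ should reduce the error by a factor of about four.

```python
import math


def second_difference(u, x, dx):
    """Central difference approximation of the second derivative of u at x."""
    return (u(x + dx) - 2.0 * u(x) + u(x - dx)) / (dx * dx)


def truncation_error(dx, x=0.7):
    """Error of the central second difference for u = sin, where u'' = -sin."""
    return abs(second_difference(math.sin, x, dx) - (-math.sin(x)))


if __name__ == "__main__":
    coarse = truncation_error(0.1)
    fine = truncation_error(0.05)
    # Second-order consistency: the ratio should be close to 4.
    print(coarse, fine, coarse / fine)
```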
In addition to the question of consistency of approximation, the
following questions having practical importance may be posed:

(i) Given a particular discretized equation, do the approximate
solutions corresponding to a given set of initial conditions within some
finite time interval converge (in some prescribed sense) to those of the
original equation as the spatial and/or time increments $\to 0$? Furthermore,
if they converge, what is their rate of convergence?
(ii) Given a particular discretized equation with fixed spatial and time
increments, does the difference between the solutions of the exact and
approximate systems remain bounded as time becomes sufficiently
large?

The first question is most difficult and cannot be answered in general
terms. The second question pertains directly to the stability of
approximation.
In what follows, we shall confine our discussions to a general
homogeneous, time-invariant linear system in the form of Eq. (71), where $\mathscr{L}_0$
is a spatial differential or integro-differential operator, and the initial
states $U_0(X)$ are elements of a Banach space $\Gamma_B(\Omega)$. Furthermore, we
shall assume that Eq. (71) describes a dynamical system, hence $\mathscr{L}_0$ is an
infinitesimal generator of a semigroup (or group) of transition operators.
The exact solution to Eq. (71) with initial state $U_0(X)$ can be written as

$$ U(t, X; U_0(X), t_0) = \Phi(t, t_0)\, U_0(X) \tag{265} $$

where the transition operator $\Phi(t, t_0)$ is uniformly bounded for $t \in [t_0, t_1]$
and the domain of $\Phi(t, t_0)$ is dense in $\Gamma_B(\Omega)$.
Here, we are interested in obtaining approximate solutions
corresponding to $\Phi(t, t_0) U_0(X)$. To do this, we construct a family $\mathscr{L}_0(\Delta X)$ of
bounded linear operators which approximate $\mathscr{L}_0$ in the sense that

$$ \lim_{\Delta X \to 0} \| [\mathscr{L}_0 - \mathscr{L}_0(\Delta X)]\, U(X) \| = 0 \tag{266} $$

for all $U(X) \in \mathscr{D}(\mathscr{L}_0)$, the domain of $\mathscr{L}_0$, where $\Delta X$ denotes the spatial
increment $(\Delta x_1, \ldots, \Delta x_M)$. For example, if $\mathscr{L}_0 = \partial^2/\partial x^2$, a possible
form for $\mathscr{L}_0(\Delta x)$ is defined by

$$ \mathscr{L}_0(\Delta x)\, u(x) = \frac{u(x + \Delta x) - 2u(x) + u(x - \Delta x)}{(\Delta x)^2} \tag{267} $$

Following Lax's definition (49), $\mathscr{L}_0(\Delta X)$ is said to be a consistent
approximation to $\mathscr{L}_0$ if

$$ \| [\mathscr{L}_0(\Delta X) - \mathscr{L}_0]\, \Phi(t, t_0)\, U_0(X) \| \to 0 \quad \text{as } \Delta X \to 0 \text{ uniformly in } t,\ t \in [t_0, t_1] \tag{268} $$

for some set of initial states $U_0(X)$, which is dense in $\Gamma_B(\Omega)$.
In the computation of approximate solutions, we are dealing with a
sequence of computations using the approximate state transition
operators. To be specific, let us approximate the time derivative of $\Phi(t, t')$ by the
following forward difference operator:

$$ (\Delta t)^{-1} [\Phi(t + \Delta t, t') - \Phi(t, t')] \tag{269} $$

and consider the approximate equation corresponding to Eq. (71):

$$ \hat{U}(t + \Delta t, X) = [I + \Delta t\, \mathscr{L}_0(\Delta X)]\, \hat{U}(t, X) \tag{270} $$

If we start at $t = t_0$ with initial state $U_0(X)$, then the approximate
solution $\hat{U}$ at $t_1 = n(\Delta t)$ can be written as

$$ \hat{U}(t_1, X; U_0(X), t_0) = [I + \Delta t\, \mathscr{L}_0(\Delta X)]^n\, U_0(X) \tag{271} $$

It is natural to ask whether $[I + \Delta t\, \mathscr{L}_0(\Delta X)]^n U_0(X)$ approaches the
exact solution $\Phi(t_1, t_0) U_0(X)$ as $\Delta t \to 0$ and $n \Delta t \to (t_1 - t_0)$. If the
answer is affirmative, then the approximation is said to be convergent.
Clearly, by the fact that $\Phi(t_1, t_0)$ is a bounded operator, a necessary
condition for convergence is that there exists a constant $K$ such that

$$ \| [I + \Delta t\, \mathscr{L}_0(\Delta X)]^n \| \le K \quad \text{for } n(\Delta t) \le (t_1 - t_0) \tag{272} $$

This uniform boundedness is called the stability of the approximation
scheme. Its implication is that there exists a limit to the extent to
which any component of an initial function $U_0(X)$ can be amplified in
the numerical procedure.
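
The role of the bound (272) can be seen in a small experiment. The sketch below applies the forward-difference step (270) with the second-difference operator (267) on a grid over (0, 1) with zero boundary values, assuming $\mathscr{L}_0 = \partial^2/\partial x^2$. For this particular scheme it is a standard fact that the powers remain bounded when the mesh ratio $r = \Delta t/(\Delta x)^2$ does not exceed 1/2 and grow without bound otherwise; the grid size, initial vector, and function names are assumptions of the sketch.

```python
def euler_step(u, r):
    """Apply I + dt*L0(dx) once to the interior grid values u, with r = dt/dx**2.

    Values outside the list are the zero boundary values.
    """
    n = len(u)
    new = []
    for j in range(n):
        left = u[j - 1] if j > 0 else 0.0
        right = u[j + 1] if j < n - 1 else 0.0
        new.append(u[j] + r * (left - 2.0 * u[j] + right))
    return new


def max_norm_after(r, n_steps, n_points=19):
    """Maximum magnitude after n_steps, starting from an alternating-sign vector
    that excites the highest-frequency grid component."""
    u = [(-1.0) ** j for j in range(n_points)]
    for _ in range(n_steps):
        u = euler_step(u, r)
    return max(abs(v) for v in u)


if __name__ == "__main__":
    print(max_norm_after(0.4, 200))  # stays bounded by the initial magnitude
    print(max_norm_after(0.6, 200))  # grows by many orders of magnitude
```

With $r \le 1/2$ each step is a weighted average with nonnegative weights, so no component can be amplified; with $r > 1/2$ the highest-frequency component is multiplied at each step by a factor of magnitude greater than one.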
Within the above framework, Lax went a step further to show that
under certain restrictions, stability is also a sufficient condition for the
convergence of the approximation. His result can be summarized by the
following equivalence theorem (49):

Given a well-posed initial-value problem and a corresponding discrete
approximation which satisfies the consistency condition (268), stability is
a necessary and sufficient condition for convergence.

Thus, in order to establish a valid discrete mathematical model for a
linear distributed parameter dynamical system (71), both consistency and
stability of the discretized system should be verified. Once the conditions
for convergence of the approximation are established (this task is by no
means trivial), one can proceed to determine the required discretization
levels (sizes of spatial and/or time increments) for a prescribed tolerable
error in the approximate solutions in terms of a prescribed norm.
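
As an illustration of choosing discretization levels for a tolerable error, the sketch below measures the maximum error of the explicit scheme on a case with a known exact solution. It assumes the diffusion equation $\partial u/\partial t = \partial^2 u/\partial x^2$ on (0, 1) with zero boundary values and initial state $\sin(\pi x)$, whose exact solution is $\exp(-\pi^2 t)\sin(\pi x)$. The mesh ratio is held fixed at a stable value, so the error is expected to fall roughly in proportion to $(\Delta x)^2$; all names and parameter values are illustrative.

```python
import math


def explicit_heat_error(n_intervals, r=0.25, t_final=0.1):
    """Maximum grid error of the explicit scheme at (approximately) t_final.

    n_intervals : number of spatial intervals on (0, 1)
    r           : fixed mesh ratio dt/dx**2 (chosen within the stable range)
    """
    dx = 1.0 / n_intervals
    dt = r * dx * dx
    n_steps = round(t_final / dt)
    u = [math.sin(math.pi * j * dx) for j in range(n_intervals + 1)]
    for _ in range(n_steps):
        interior = [
            u[j] + r * (u[j - 1] - 2.0 * u[j] + u[j + 1])
            for j in range(1, n_intervals)
        ]
        u = [0.0] + interior + [0.0]
    t = n_steps * dt
    decay = math.exp(-math.pi ** 2 * t)
    return max(
        abs(u[j] - decay * math.sin(math.pi * j * dx))
        for j in range(n_intervals + 1)
    )


if __name__ == "__main__":
    coarse = explicit_heat_error(10)
    fine = explicit_heat_error(20)
    # Halving dx (and quartering dt) should cut the error by roughly a factor of 4.
    print(coarse, fine, coarse / fine)
```

Repeating such a measurement over a sequence of meshes gives an empirical estimate of the increments needed for a given error tolerance, although it is no substitute for establishing consistency and stability.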

3. PROPERTIES OF EXACT AND APPROXIMATE SYSTEMS

In practical situations, it is desirable to establish some correspondence
between those properties of the exact and approximate systems which
are pertinent to control; in particular, stability, controllability, and
observability. For the correspondence to be meaningful, a consistent
set of definitions for various pertinent quantities in the exact
and approximate systems must be established.
For an approximate system derived by spatial discretization, the
functions describing the state of the approximate system at any fixed time are
specified only at a finite number of spatial points. However, we may still
consider such a specification as represented by a point in the state
function space of the exact system by adopting some rule for specifying
function values between the spatial points (e.g., linear interpolation).
The rule should be chosen so that certain basic properties of the
approximate operators are preserved. On the other hand, Kantorovitch (50)
prefers to represent the states of the approximate system by elements
in a different space and then to establish a suitable homomorphism
between the state spaces of the exact and approximate systems.
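
The linear-interpolation rule mentioned above can be sketched as follows for a scalar state on the interval [0, 1] sampled at equally spaced points. The uniform grid and the function name are assumptions made for illustration.

```python
def interpolate_state(grid_values, x):
    """Piecewise-linear function through equally spaced samples on [0, 1].

    grid_values[j] is the state at x_j = j / (len(grid_values) - 1).
    """
    if len(grid_values) < 2:
        raise ValueError("at least two grid values are required")
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    n = len(grid_values) - 1            # number of intervals
    j = min(int(x * n), n - 1)          # index of the interval containing x
    w = x * n - j                       # local coordinate in [0, 1]
    return (1.0 - w) * grid_values[j] + w * grid_values[j + 1]


if __name__ == "__main__":
    samples = [0.0, 1.0, 4.0]
    print([interpolate_state(samples, x) for x in (0.0, 0.25, 0.5, 0.75, 1.0)])
```

With such a rule every grid state corresponds to a continuous function, so the norm of the exact state space can be applied directly to approximate states.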
For systems with distributed control functions, we can define a set of
approximate distributed control functions in a similar manner.
In order to form a common ground for comparison between the exact
and approximate systems, it is logical to restrict ourselves only to those
which are convergent approximations to the original system. Here, of
course, we must first define what we mean by a convergent
approximation for the case of a general system. A possible definition may be
established by extending that of Lax as discussed earlier. Having
established such a framework, the following questions may be posed:

(i) Do there exist any intrinsic properties of the system which remain
invariant under the given approximation?
(ii) Given an approximate system possessing certain known properties,
how much can one infer about the corresponding properties of the exact
system?

These questions are open for future investigation. Some preliminary
results have been discussed in reference (9).

B. Computational Problems
The solution of most of the optimum control problems considered
here can be reduced to the task of solving a set of partial differential or
integro-differential equations satisfying certain auxiliary conditions.
In general, the solution of these equations in closed form cannot be
obtained except for a few simple cases. Therefore, numerical computation
(usually iterative) procedures must be devised for solving these equations.
In order to indicate some of the inherent difficulties and complexities
of these computation problems, we shall consider the situation of
attempting to obtain numerical solutions to a two-point boundary-value
problem in a function space, where the exact solutions satisfy a set of
linear PDE's such as the Hamilton canonical equations (216) and (217).
A possible first step to this problem is to approximate the PDE's by a
finite-dimensional system of algebraic equations. Here, the convergence
of the approximation (in the sense of Lax) should be established. This
is generally a difficult task. Having reduced the problem to a two-point
boundary-value problem in a finite-dimensional space, the next step
is to set up some form of iterative procedure for solving it. Here, the
convergence of the iterative procedure must be ensured. Thus, the
numerical solution to the overall problem requires the consideration
of two different types of convergence. Of course, this situation also exists
when one attempts to solve numerically the corresponding two-point
boundary-value problem for lumped-parameter systems. However, the
problem of determining the convergence of an approximation for
ordinary differential equations is considerably simpler than that for PDE's.
For the case where the PDE's are nonlinear, the problem of
determining the convergence of an approximation becomes extremely
difficult. Furthermore, the approximate algebraic equations become
nonlinear and their solution requires an additional iterative procedure.
Aside from establishing suitable approximations and iterative
procedures, an important factor which strongly influences the practical
applicability of the chosen computation procedure is the rounding
errors. Although the difference formulas may exhibit a high order of
accuracy, their use in practical computation can lead to disappointing
results. For the two-point boundary-value problem considered here,
the difference equation for one of the PDE systems is generally unstable.
The accumulation of rounding errors can render the computation
scheme completely useless in practice.
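
A rough feeling for this error growth can be obtained from a modal estimate. The sketch below assumes, purely for illustration, that $\mathscr{L}_0$ is the self-adjoint diffusion operator $\partial^2/\partial x^2$ on (0, 1) with zero boundary values, so that an adjoint equation of the form (229), integrated forward in time, multiplies the $n$-th spatial harmonic by $\exp(n^2\pi^2 \tau)$ over an elapsed time $\tau$. The function names and the sample numbers are hypothetical.

```python
import math


def forward_growth(n, elapsed):
    """Growth factor of the n-th spatial harmonic of dP/dt = -d2P/dx2 over `elapsed`."""
    return math.exp((n * math.pi) ** 2 * elapsed)


def amplified_error(initial_error, n, elapsed):
    """Size reached by an error of magnitude initial_error placed in harmonic n."""
    return initial_error * forward_growth(n, elapsed)


if __name__ == "__main__":
    # A perturbation at the level of double-precision rounding in the fifth harmonic
    # is amplified far beyond order one within a short integration interval.
    print(amplified_error(1e-16, 5, 0.2))
```

Higher harmonics grow faster still, so refining the spatial mesh introduces components whose rounding errors are amplified even more strongly.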
Finally, the usefulness of the computation scheme is limited by the
dimensionality of the problem and the rate of convergence of the iteration
procedures.
Here, we have only discussed briefly some of the obvious difficulties
in the computational aspects of the optimum control problem for
distributed parameter systems. These difficulties represent a major
obstacle to the successful solution of practical problems. Their removal
would require considerable effort in developing new techniques for the
numerical solution of partial differential equations and variational
problems.

VI. Practical Aspects of Control

The main portion of the previous discussions has been devoted to
various theoretical aspects of controlling distributed parameter dynamical
systems. In order to bridge the gap between theory and practice,
developments should be made along the following directions:

Model Building. The starting point of almost all the theoretical
developments has been the assumption of a mathematical
model of the dynamical system to be controlled. Obviously, this analytical
approach to the design of control systems for physical processes can
lead to useful results only when the dynamic behavior of the processes
can be "adequately" described by their corresponding mathematical
models.
Unfortunately, the development of mathematical models for a large
number of physical processes, particularly distributed parameter
industrial processes, is a major task in itself, in which physical insight
plays a dominant role. Moreover, the "adequateness" of the model
based on some rational criteria cannot generally be determined in a
straightforward manner but, rather, by a complicated trial-and-error
process.
Since the complexity of physical processes and the control objectives
vary from one process to another, it is impossible to speak of a general
model building procedure. But for particular processes it may be
possible to develop systematic, recursive procedures for achieving adequate
models in the form of a unified program of analysis, simulation, and
experimental tests. The analysis should be based primarily on physical
understanding rather than pure mathematical reasoning. The experimental
tests should be oriented toward achieving effective measurement
of the dynamic response and parameters of the process, whereas simulation
should represent a bilateral tie between analysis and experiments.

Control System Design. No doubt, high-speed computers will play
a major role in almost all parts of the control system for a distributed
parameter dynamical process. In the present state of the art, it is evident
that the application of modern control theory to the successful design
of a control system for many physical processes has been greatly
impeded by the rapid growth in the required amount of computation and
data processing with the increase in the dimensionality of the process.
For a distributed parameter process, this difficulty is
intensified further, and it is most unlikely to be removed simply by
increasing the size and speed of the control computers. More efficient
computational techniques and effective approximation procedures must
be developed in parallel with the control theory. Another difficulty which
one may encounter is the selection of a set of realistic performance
indices. Since the complexity of the control policies depends to a certain
extent on the form of the performance index, it may be desirable to
compare the control policies for a reasonably wide class of performance
indices, and to select the final design on the basis of ease of
implementation and economic factors.
Instrumentation. Present-day control system instrumentation consists
primarily of transducers which either sense or actuate in some spatially
averaged manner. The averaging process may be performed either locally,
as in the case of a probe type of sensing instrument, or globally over the
entire system domain. Since these types of instruments were developed
primarily for lumped parameter systems, they may not be suitable
for use in distributed parameter control systems. Therefore, an attempt
should be made to develop a new line of instruments which are
especially designed for distributed parameter control systems. This
may be accomplished by utilizing advantageously certain special physical
properties of the distributed systems. Also, many existing instruments
may be used in distributed systems by modifying their mode of operation.
For example, a probe sensor may be put into a scanning motion to
measure spatial distributions of certain physical quantities. In many
physical situations where the spatial distributions of certain physical
variables are to be closely controlled, it is desirable to introduce control
variables which are distributed over the spatial domain. Methods for
generating distributed control signals for various physical processes
should be explored.

VII. Concluding Remarks

An attempt has been made to present a general, unified discussion
of various aspects of the problems associated with the control of
distributed parameter dynamical systems. Portions of this work may seem
to be somewhat abstract from the control engineering standpoint.
However, it is felt that because of the extreme complexities of these
problems, an abstract approach at this preliminary stage can provide
a better perspective and understanding of the basic problems and their
difficulties in this area, without being overshadowed by the enormity
of details. Also, a large portion of the present work has been based on
extending certain known results in lumped parameter control system
theory to the distributed case. This approach is a natural one, since
lumped parameter systems are special cases of distributed parameter
systems. However, in view of the "dimensionality" of the problems
associated with controlling distributed parameter systems, a fresh
approach to problem formulation, taking into account the associated
computational difficulties, is certainly needed in the long run. On the
other hand, because of the intricacies associated with partial differential
equations, and the fact that the theory of partial differential equations
is not fully developed at the present time as compared to that of ordinary
differential equations, investigations in this area in the immediate future
should be directed toward establishing theories for particular classes of
distributed systems and, at the same time, trying to obtain feasible
solutions to the practical design of distributed parameter control systems.
ACKNOWLEDGMENTS

The author is grateful to M. L. Bandy for numerous helpful discussions. He also
benefited from discussions with C. B. Mehr and W. A. Michael.

Optimal Control for Systems
Described by Difference Equations
HUBERT HALKIN
Bell Telephone Laboratories,
Whippany, New Jersey

I. Introduction 173
II. Statement of the Problem 174
III. Set of Reachable Events and Huygens' Construction 179
IV. Principle of Optimal Evolution 181
V. A First Approach to the Maximum Principle 182
VI. Comoving Space Along a Trajectory 188
VII. Closure and Convexity of the Sets W{i) and W(i, Ό . . 191
VIII. A General Proof of the Maximum Principle 193
IX. Existence Theorem 195

References 196

I. Introduction

The aim of this chapter is to give a simple and rigorous presentation
of some important concepts in the theory of optimal control. We have
found that the study of systems described by difference equations is the
most appropriate framework for a first introduction to these concepts.
Problems of optimal control for systems described by difference
equations have been considered by many authors: Wing and Desoer (1),
Rozonoer (2), Chang (3), Katz (4), etc.¹ In this work we shall study this
problem from a completely different point of view and with a different
motivation.
In particular we derive a Maximum Principle for the optimal control
of systems described by difference equations which is the analog of the
well-known Maximum Principle of Pontryagin for the optimal control
of differential equations.
The main purpose of this chapter is to introduce the reader to the
geometrical and topological method in the theory of optimal control
[see Halkin (5)]. This geometrical and topological method is, in our
opinion, much more satisfactory than the purely algebraic and formal
treatment currently found in the literature. This method not only has
decisive advantages from a purely theoretical point of view but also
leads to a deeper comprehension of the problem, which is particularly
helpful in devising efficient computational schemes [Halkin (6)].
Unfortunately the simplicity of the geometrical and topological
method is not always evident to a great majority of control scientists,
since they are not familiar with the particular type of mathematical
language and technique used in its development. In the study of the
relatively simple problem treated in this paper we shall put a special
emphasis on the content of the method and we shall use as few unfamiliar
notations as possible. For a more complete and concise development of
more general problems using the same methods we refer the reader to
our previous publications [Halkin (7, 8, 9)].

¹ To these papers we should add the recent work of Jordan and Polak (12).

II. Statement of the Problem

We shall now introduce the basic elements of the problem.

Evolution variable. The evolution variable, $i$, which is usually the
time, will assume one of the values $0, 1, 2, \ldots, k$ where $k$ is a positive
integer given beforehand. We shall denote by $T$ the set $\{0, 1, 2, \ldots, k\}$.

State variable. We shall represent the state variable by the letter $x$.
We shall assume that $x$ is an element of an $n$-dimensional Euclidean
space $X$, called the state space, and we shall write $x = (x^1, x^2, \ldots, x^n)$ to
indicate the $n$ components. The Euclidean norm of $x$ will be denoted
by $|x|$. The letter $\mathscr{X}$ will always represent a trajectory. This means
that $\mathscr{X}$ will be a set of ordered pairs

$$\mathscr{X} = \{(x(i), i) : i = 0, 1, \ldots, k\} \tag{1}$$

In other words $\mathscr{X}$ represents the trajectory for which the system has the
state $x(0)$ at the time 0, the state $x(1)$ at the time 1, etc. The pair formed
by a state variable and a time will be called an event.

Control variable. We shall represent the control variable by the
letter $u$. We shall assume that $u$ is an element of an $r$-dimensional
Euclidean space $U$, called the control space, and we shall write
$u = (u^1, u^2, \ldots, u^r)$ to indicate the $r$ components. We shall assume that a
particular region $\Omega$ of the space $U$ is given. The region $\Omega$ is called the set
of admissible controls (see Fig. 1). The letter $\mathscr{U}$ will represent a control
function, called also a strategy. This means that $\mathscr{U}$ is a set of ordered
pairs

$$\mathscr{U} = \{(u(i), i) : i = 0, 1, \ldots, k-1\} \tag{2}$$

FIG. 1. Example of a two-dimensional control space $U$ and of a set $\Omega$ of admissible
controls (here $\Omega = \{u : |u| \le 1\}$).

In other words $\mathscr{U}$ represents the strategy for which the control
variable has the value $u(0)$ at the time 0, the value $u(1)$ at the time 1, etc.
A strategy like that of Eq. (2) is admissible if the control has an admissible
value for every time $0, 1, \ldots, k-1$. Formally $\mathscr{U}$ is admissible iff
("iff" means "if and only if")

$$u(i) \in \Omega \quad \text{for } i = 0, 1, \ldots, k-1 \tag{3}$$

The letter $F$ will represent the set of all admissible strategies.

Initial and terminal condition. We assume that we are given an
initial event $S = (x_s, 0) = (x_s^1, x_s^2, \ldots, x_s^n, 0)$ and a terminal line of
events $E$ parallel to the $n$th state axis. The set $E$, called the terminal
target line, is characterized by its projection $(x_e^1, x_e^2, \ldots, x_e^{n-1})$ on the
other state axes and its projection $k$ on the time axis. More formally

$$E = \{(x_e^1, x_e^2, \ldots, x_e^{n-1}, x^n, k) : x^n \in R\} \tag{4}$$

where $R$ is the set of real numbers (see Fig. 2).


Difference equation for the dynamical system. The difference equation
for the dynamical system is a rule which enables us to compute the state
of the system at the time $i + 1$ if we know the state of the system and
the value of the control at the time $i$.
In this paper we shall assume that the difference equation takes the
particularly simple form

$$x(i+1) - x(i) = A(i)x(i) + g(i, u(i)) \tag{5}$$

FIG. 2. Initial and terminal conditions in the case of a two-dimensional state space.

In Eq. (5), $A$ is an $n \times n$ matrix defined for every $i = 0, 1, 2, \ldots, k-1$
and $g$ is an $n$-vector defined for every $i = 0, 1, 2, \ldots, k-1$ and
every $u$ in $\Omega$.
When it is more convenient, especially when there is no need to
make explicit the linear structure of the problem, we shall write

$$f(x(i), u(i), i) \tag{6}$$

instead of

$$A(i)x(i) + g(i, u(i)) \tag{7}$$

We shall often refer to

$$f(x, u, i) \tag{8}$$

or equivalently to

$$A(i)x + g(i, u) \tag{9}$$

as the velocity vector at the point $x$ and time $i$ for the control $u$.


We shall denote by $x(j, \mathscr{U})$ the value of the state variable at the time $j$
corresponding to the solution of the system of Eq. (5) with the strategy
of Eq. (2) and the initial condition

$$(x(0, \mathscr{U}), 0) = S \tag{10}$$

By $\mathscr{X}(\mathscr{U})$ we shall denote the entire trajectory corresponding to the
strategy of Eq. (2) and the initial condition (10).
In other words, if $\mathscr{U}$ is a set of ordered pairs as described by Eq. (2)
then $\mathscr{X}(\mathscr{U})$ is a set of ordered pairs

$$\mathscr{X}(\mathscr{U}) = \{(x(i, \mathscr{U}), i) : i = 0, 1, 2, \ldots, k\} \tag{11}$$

determined uniquely by the following two conditions:

$$(x(0, \mathscr{U}), 0) = S \tag{12}$$

$$x(i+1, \mathscr{U}) - x(i, \mathscr{U}) = f(x(i, \mathscr{U}), u(i), i) \tag{13}$$

for $i = 0, 1, 2, \ldots, k-1$.
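The two conditions (12) and (13) amount to a simple recursion, which may be
sketched in a few lines of code. The following is a minimal illustration,
not part of the original development: the function names, the particular
matrix A, the function g, and the strategy are hypothetical data chosen only
so that the arithmetic can be followed by hand.

```python
def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def trajectory(x_s, strategy, A, g):
    """Return [x(0), ..., x(k)] for x(i+1) - x(i) = A(i) x(i) + g(i, u(i)).

    x_s      -- initial state (condition (12))
    strategy -- the control values u(0), ..., u(k-1)
    A, g     -- callables giving the matrix A(i) and the vector g(i, u)
    """
    xs = [list(x_s)]
    for i, u in enumerate(strategy):
        x = xs[-1]
        velocity = [ax + gi for ax, gi in zip(mat_vec(A(i), x), g(i, u))]
        xs.append([xj + vj for xj, vj in zip(x, velocity)])  # condition (13)
    return xs

# Hypothetical data: two-dimensional state, scalar control, k = 3.
A = lambda i: [[0.0, 0.5], [0.0, 0.0]]
g = lambda i, u: [0.0, u]
path = trajectory([0.0, 0.0], [1.0, 1.0, 1.0], A, g)
print(path)
```

With these data the states are x(0) = (0, 0), x(1) = (0, 1), x(2) = (0.5, 2),
and x(3) = (1.5, 3), as can be checked directly from Eq. (5).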
Optimization problem. We want to find a strategy $\mathscr{V} \in F$ such that
the corresponding trajectory $\mathscr{X}(\mathscr{V})$ intersects the terminal target line $E$
as far as possible in the positive direction along the $n$th state axis, i.e.,
such that

(α) $$(x(k, \mathscr{V}), k) \in E \tag{14}$$

(β) for all $\mathscr{U} \in F$ for which

$$(x(k, \mathscr{U}), k) \in E \tag{15}$$

the relation

$$x^n(k, \mathscr{U}) \le x^n(k, \mathscr{V}) \tag{16}$$

shall hold.

The strategy $\mathscr{V}$ satisfying the conditions (α) and (β) will be called an
optimal strategy and the corresponding trajectory $\mathscr{X}(\mathscr{V})$ will be called
an optimal trajectory. The pair formed by an optimal strategy and the
corresponding optimal trajectory will be called a solution of the problem.

Remarks. In this paper we shall try to give an acceptable answer
to the problem stated above. More precisely we shall study the conditions
under which this problem has at least one solution (existence
theorem). We shall also analyze the characteristics of a solution in order
to derive necessary conditions which are strong enough to help us
significantly in our search for this solution.

Two fundamental assumptions:

(1) $$|A(i)| < 1 \quad \text{for all } i = 0, 1, \ldots, k-1 \tag{17}$$

By $|A(i)|$ we mean the norm of the linear transformation induced by
the matrix $A(i)$, i.e.,

$$|A(i)| = \max_{|x| = 1} |A(i)x| \tag{18}$$

(2) $$\{g(i, u) : u \in \Omega\} \text{ is closed, convex and bounded for each } i = 0, 1, 2, \ldots, k-1 \tag{19}$$

These two assumptions will enable us to give more structure to our
theory.
At this point we should remark that the two preceding assumptions
are not only useful in order to derive elaborate results but correspond
also to very natural characteristics of the physical system under
consideration, as we show below. Usually a difference equation is the
discretization of a differential equation. This discretization is a necessary step
toward computational solution via digital devices. The first of the above
assumptions expresses that the elementary time increment in the
discretization has been chosen small enough. The second assumption could
be intuitively motivated as follows. Let us assume that we are given two
functions $f(t)$ and $g(t)$ and that we want to consider the class $K$ of all
continuous functions whose derivatives have, for almost every time $t$,
one of the two values $f(t)$ and $g(t)$. Then a continuous function whose
derivative is for almost every time $t$ an element of the set
$\{\alpha f(t) + (1 - \alpha) g(t) : \alpha \in [0, 1]\}$ can be approximated arbitrarily closely by a
function of the class $K$ whose derivative jumps with a sufficiently high
frequency and with an appropriate mean value between the two functions
$f(t)$ and $g(t)$. A rigorous form of the previous statement requires the
study of some consequences of Lyapunov's theorem on the range of a
vector integral over Borel sets and the consideration of sliding states,
generalized curves, relaxed variational problems, etc. The reader
interested in this matter should consult the papers of Warga (10),
Gamkrelidze (11), and the author (5, 9).
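The connection between the first assumption and the size of the time
increment can be made concrete with a short computation. The sketch below
rests on assumptions that are ours and not the text's: the difference
equation is taken to be the Euler discretization x(i+1) - x(i) = hFx(i) of a
linear differential equation dx/dt = Fx, so that A = hF; the matrix F is
hypothetical; and the norm (18) is estimated by power iteration, which
presumes that the starting vector is not orthogonal to the dominant direction.

```python
import math

def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def operator_norm(A, iterations=200):
    """Estimate max |A x| over |x| = 1 by power iteration on (A transposed) A."""
    At = transpose(A)
    n = len(A[0])
    x = [1.0 / math.sqrt(n)] * n          # unit starting vector
    for _ in range(iterations):
        y = mat_vec(At, mat_vec(A, x))
        length = math.sqrt(sum(v * v for v in y))
        if length == 0.0:                 # A maps the iterate to zero
            return 0.0
        x = [v / length for v in y]
    return math.sqrt(sum(v * v for v in mat_vec(A, x)))

def euler_matrix(F, h):
    """Matrix A = h F of the Euler discretization with time increment h (assumed form)."""
    return [[h * f for f in row] for row in F]

F = [[0.0, 1.0], [-4.0, 0.0]]             # hypothetical continuous-time system
for h in (0.5, 0.1):
    norm = operator_norm(euler_matrix(F, h))
    print(h, norm, norm < 1.0)            # assumption (17) holds only when h |F| < 1
```

For this F the norm is 4h, so the increment h = 0.5 violates assumption (17)
while h = 0.1 satisfies it.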
III. Set of Reachable Events and Huygens' Construction

In the previous section we have defined an event as a pair of elements,
the first being a state, the second being a time. The pair $S = (x_s, 0)$
is an example of an event which we called the initial event. We shall now
consider the set $H$ of all events reachable from $S$, i.e., the set of all
events which belong to at least one trajectory issued from $S$ and
corresponding to a strategy in the class $F$. Formally

$$H = \{(x, i) : (x, i) \in \mathscr{X}(\mathscr{U}) \text{ for some } \mathscr{U} \in F\} \tag{20}$$

or equivalently

$$H = \bigcup_{\mathscr{U} \in F} \mathscr{X}(\mathscr{U}) \tag{21}$$

The concept of set of reachable events is of prime importance in the
modern development of calculus of variations and optimal control
theory.
It is very convenient to consider the following partition of the set $H$:
let $W(i)$ be the projection on $X$ of the intersection of $H$ with the hyperplane
of time $i$. Formally let

$$W(i) = \{x : (x, i) \in H\} \tag{22}$$

or equivalently

$$H = \bigcup_{i=0}^{k} W(i) \times \{i\} \tag{23}$$

It is easy to prove that we then have

(24)

Example. We assume that the state space and the control space are
both two-dimensional, that $S = (0, 0)$, that $k = 3$, that
$\Omega = \{u : |u^1| \le 1 \text{ and } |u^2| \le 1\}$ and that the difference equation reduces to

(25)

In that case we have (see Fig. 3)

(26)
$$W(1) = \{(x, 1) : |x^1| \le 1,\ |x^2| \le 1\} \tag{27}$$

$$W(2) = \{(x, 2) : |x^1| \le 2,\ |x^2| \le 2\} \tag{28}$$

$$W(3) = \{(x, 3) : |x^1| \le 3,\ |x^2| \le 3\} \tag{29}$$
In the actual construction of the set $H$ it is much more convenient
to apply formula (23) than the definition (20) or its equivalent (21).
The intuitive construction scheme runs as follows: we know $W(0)$ already,
since $W(0)$ is the projection on $X$ of the given initial event $S$; to construct
$W(1)$ we integrate one step of the basic difference equation from $W(0)$
and with all possible $u$ in $\Omega$; to construct $W(2)$ we integrate one step
of the basic difference equation from all possible points in $W(1)$ and all
possible $u$ in $\Omega$, etc.

FIG. 3. Set of reachable events for the example treated.

Formally,

$$W(0) = \{x_s\} \tag{30}$$

$$W(i+1) = \{x + f(x, u, i) : x \in W(i),\ u \in \Omega\}, \quad i = 0, 1, 2, \ldots, k-1 \tag{31}$$

This method of constructing successively $W(0)$, $W(1)$, $W(2)$, etc. is
very similar to Huygens' Construction in geometrical optics.
Let

$$w(x, i) = \{f(x, u, i) : u \in \Omega\} \tag{32}$$

then

$$W(i+1) = \bigcup_{x \in W(i)} \{x + \alpha : \alpha \in w(x, i)\} \tag{33}$$
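The step-by-step construction of the sets W(i) can be sketched on a computer
when the set of admissible controls is replaced by a finite sample. The code
below is an illustration under stated assumptions only: the velocity vector is
taken to be f(x, u, i) = u, a choice of ours that is consistent with the sets
(27) to (29) of the example but is not taken from the text, and the square of
admissible controls is sampled at the nine points with coordinates -1, 0, 1.
The function names are hypothetical.

```python
from itertools import product

def next_reachable(W, controls, f, i):
    """One step of the construction: {x + f(x, u, i) : x in W, u in controls}."""
    return {tuple(xj + vj for xj, vj in zip(x, f(x, u, i)))
            for x in W for u in controls}

def reachable_sets(x_s, controls, f, k):
    """Return the list [W(0), W(1), ..., W(k)] starting from W(0) = {x_s}."""
    W = [{tuple(x_s)}]
    for i in range(k):
        W.append(next_reachable(W[i], controls, f, i))
    return W

controls = list(product((-1, 0, 1), repeat=2))  # finite sample of the control square
f = lambda x, u, i: u                           # assumed velocity vector
W = reachable_sets((0, 0), controls, f, 3)
for i, Wi in enumerate(W):
    print(i, len(Wi), max(max(abs(c) for c in x) for x in Wi))
```

With this sample each computed set consists of the points with integer
coordinates of the square of half-width i, so the successive "wavefronts"
are the perimeters of squares growing by one unit per step.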

We may consider the succession of the sets $W(0)$, $W(1)$, $W(2)$, etc.
as the result of some propagation in the space $X$. In this analogy $\partial W(i)$,
the boundary of $W(i)$, is the "wavefront" at the time $i$ of the
perturbation initiated at $x_s$ at the time 0. The set $w(x, i)$ is the "wavelet"
at the point $x$ and the time $i$ characterizing this propagation. The set
$W(i)$ is also called the "zone of influence" at the time $i$ of the
perturbation initiated at $x_s$ at the time 0. The set $w(x, i)$ is the mapping of the set
$\Omega$ by the transformation given by the basic difference equation [see
definition (32)].
In the optics of isotropic media the set $w(x, i)$ is always a sphere
centered at the origin; in the optics of crystals the set $w(x, i)$ is an
ellipsoid centered at the origin, etc. It should be remarked immediately
that we have no such limitation in our particular type of propagation:
from the second assumption of Section II we require only the wavelet
$w(x, i)$ to be a bounded, closed, and convex set.

IV. Principle of Optimal Evolution

In a previous publication (7), devoted to the theory of optimal control
for differential equations, we have introduced the following statement:
"Every event of an optimal trajectory belongs to the boundary of the set
of possible events". We called this statement "The Principle of Optimal
Evolution". We shall prove in the next proposition that the same
result holds for the optimal control of difference equations.

PROPOSITION IV-1. If $\mathscr{U}$ is an optimal strategy then $x(j, \mathscr{U}) \in \partial W(j)$
for all $j = 1, 2, \ldots, k$.
Proof. We shall prove Proposition IV-1 by induction. First we shall
show that $x(k, \mathscr{U}) \in \partial W(k)$ and secondly we shall show that
$x(i+1, \mathscr{U}) \in \partial W(i+1)$ implies $x(i, \mathscr{U}) \in \partial W(i)$ for $i = 0, 1, 2, \ldots, k-1$.
The first part is easy: if $x(k, \mathscr{U})$ is an interior point of $W(k)$ then there
exists a $\mathscr{U}^*$ and an $\epsilon > 0$ such that

$$x^n(k, \mathscr{U}^*) = x^n(k, \mathscr{U}) + \epsilon \tag{34}$$

$$(x(k, \mathscr{U}^*), k) \in E \tag{35}$$

These two relations contradict the optimality of $\mathscr{U}$, which concludes
the first part of the proof.
In the second part of the proof we shall use the first fundamental
assumption of Section II. Let us assume that $x(i+1, \mathscr{U}) \in \partial W(i+1)$
and $x(i, \mathscr{U}) \in \operatorname{int} W(i)$; then there exists an $\epsilon > 0$ such that

$$N(x(i, \mathscr{U}), \epsilon) \subset W(i) \tag{36}$$

By $N(x(i, \mathscr{U}), \epsilon)$ we mean the $\epsilon$ neighborhood around the point $x(i, \mathscr{U})$,
i.e., the open set

$$N(x(i, \mathscr{U}), \epsilon) = \{x : |x - x(i, \mathscr{U})| < \epsilon\} \tag{37}$$

The relation (36) implies that

$$x(i+1, \mathscr{U}) \in \{x + f(x, u(i), i) : x \in N(x(i, \mathscr{U}), \epsilon)\} \subset W(i+1) \tag{38}$$

But the set

$$\{x + f(x, u(i), i) : x \in N(x(i, \mathscr{U}), \epsilon)\} \tag{39}$$

is open since the set $N(x(i, \mathscr{U}), \epsilon)$ is open and the mapping $x + f(x, u(i), i)$
has a continuous inverse. We remind the reader that

$$f(x, u(i), i) = A(i)x + g(i, u(i)) \tag{40}$$

and

$$|A(i)| < 1 \tag{41}$$

so that $|x + A(i)x| \ge (1 - |A(i)|)\,|x| > 0$ for every $x \ne 0$; the matrix
$I + A(i)$ is therefore nonsingular and the mapping is invertible.
It follows that $x(i+1, \mathscr{U})$ is an interior point of $W(i+1)$. This
contradiction concludes the second and last part of the proof of Proposition
IV-1.
Proposition IV-1 is of fundamental importance in all the following
developments. Indeed in Proposition IV-1 we have associated two
concepts: optimality of a trajectory and a topological property of the
same trajectory. In all the remaining parts of this paper there will be
no more explicit reference to optimality in itself: all our results will be
derived from the topological property introduced in Proposition IV-1.
Proposition IV-1 tells us that the predecessor on an optimal trajectory
of a boundary point $x(i+1, \mathcal{U})$ of the set $W(i+1)$ is always a boundary
point $x(i, \mathcal{U})$ of the set $W(i)$. We can go even further and state that the
transition between these two points $x(i, \mathcal{U})$ and $x(i+1, \mathcal{U})$ corresponds
always to a boundary point of the associated wavelet $w(x(i, \mathcal{U}), i)$ (see
Figure 4). This statement is made precise in the following proposition.

PROPOSITION IV-2. If $\mathcal{U}$ is an optimal strategy then

$$ x(i+1, \mathcal{U}) - x(i, \mathcal{U}) \in \partial w(x(i, \mathcal{U}), i) \tag{42} $$

Proposition IV-2 is an immediate consequence of Proposition IV-1
if we recall the relation (33):

$$ W(i+1) = \bigcup_{x \in W(i)} \{x + \alpha : \alpha \in w(x, i)\} \tag{43} $$

V. A First Approach to the Maximum Principle

In this section we shall derive an important necessary condition for
the optimality of a strategy. This necessary condition is called the
Maximum Principle. It is the equivalent for difference equations of the
Maximum Principle of Pontryagin in the theory of optimal control for
differential equations. The derivation given in this section is rigorous
and simple but depends on certain assumptions which are not always
satisfied. In Section VIII we shall give a more elaborate proof of this
Maximum Principle without making any of these supplementary
assumptions.

FIG. 4. Transition from $x(i, \mathcal{U})$ to $x(i+1, \mathcal{U})$.
Let $\mathcal{U}$ be an optimal strategy, and let $x(i, \mathcal{U})$ and $x(i+1, \mathcal{U})$ be two
consecutive points on the corresponding trajectory $\mathcal{X}(\mathcal{U})$. We shall assume,
in this section only, that $\partial W(i)$ has a tangent hyperplane $P(i, \mathcal{U})$ at
$x(i, \mathcal{U})$ and that $\partial W(i+1)$ has a tangent hyperplane $P(i+1, \mathcal{U})$ at
$x(i+1, \mathcal{U})$. Let $p(i, \mathcal{U})$ and $p(i+1, \mathcal{U})$ be nonzero vectors respectively
normal to $P(i, \mathcal{U})$ at $x(i, \mathcal{U})$ and to $P(i+1, \mathcal{U})$ at $x(i+1, \mathcal{U})$ (see
Fig. 5). Formally we have then

$$ P(i, \mathcal{U}) = \{x : \langle x - x(i, \mathcal{U}) \mid p(i, \mathcal{U}) \rangle = 0\} \tag{44} $$

and similarly

$$ P(i+1, \mathcal{U}) = \{x : \langle x - x(i+1, \mathcal{U}) \mid p(i+1, \mathcal{U}) \rangle = 0\} \tag{45} $$

We use $\langle \alpha \mid \beta \rangle$ to indicate the scalar product of $\alpha$ and $\beta$. We assume
that $p(i, \mathcal{U})$ and $p(i+1, \mathcal{U})$ are positively oriented toward the outside
of the sets $W(i)$ and $W(i+1)$ respectively (Fig. 5). The lengths of the
vectors $p(i, \mathcal{U})$ and $p(i+1, \mathcal{U})$ are not yet determined. In Proposition
V-2 we shall fix these lengths up to a unique multiplicative factor. Let

$$ H(i, x, u, p) = \langle f(x, u, i) \mid p \rangle \tag{46} $$

FIG. 5. Example of two successive smooth wavefronts.

In other words $H(i, x, u, p)$ is the projection on the vector $p$ of the
velocity vector $f(x, u, i)$ at the point $x$ and the time $i$ when the control
has the value $u$. A quick look at Fig. 5 suggests the following rule:
along an optimal trajectory $\mathcal{X}(\mathcal{U})$ the control at the point $x(i, \mathcal{U})$ is
always chosen in order to maximize the projection of the velocity vector
at $x(i, \mathcal{U})$ in the direction of $p(i+1, \mathcal{U})$, the normal to the wavefront
at the next point $x(i+1, \mathcal{U})$. The preceding statement is made precise
in the following proposition.

PROPOSITION V-1. If $\mathcal{U}$ is an optimal strategy and if $\partial W(i)$ has a
tangent hyperplane with nonzero outward normal $p(i, \mathcal{U})$ at $x(i, \mathcal{U})$ and
if $\partial W(i+1)$ has a tangent hyperplane with outward normal $p(i+1, \mathcal{U})$
at $x(i+1, \mathcal{U})$ then

$$ H(i, x(i, \mathcal{U}), u(i), p(i+1, \mathcal{U})) \geq H(i, x(i, \mathcal{U}), v, p(i+1, \mathcal{U})) \quad \text{for all } v \in \Omega \tag{47} $$

Proof. Let us assume that there is a $v \in \Omega$ such that

$$ H(i, x(i, \mathcal{U}), v, p(i+1, \mathcal{U})) > H(i, x(i, \mathcal{U}), u(i), p(i+1, \mathcal{U})) \tag{48} $$

and show that this leads to a contradiction. For every $\alpha \in [0, 1]$ let $u_\alpha$
be an element of $\Omega$ such that

$$ g(i, u_\alpha) = \alpha g(i, v) + (1 - \alpha) g(i, u(i)) \tag{49} $$

Such an element exists since we have assumed that the set

$$ \{g(i, u) : u \in \Omega\} \tag{50} $$

is convex (see Fig. 6).

FIG. 6. Convexity of $\{g(i, u) : u \in \Omega\}$.

Let $\mathcal{U}_\alpha$ be the strategy identical to $\mathcal{U}$ but for the value of the control
at the time $i$, which is now $u_\alpha$; formally $\mathcal{U}_\alpha$ is then defined by the two
following relations

$$ u_\alpha(j) = u(j) \quad \text{for all } j \neq i \tag{51} $$
$$ u_\alpha(i) = u_\alpha \tag{52} $$

By definition we have then

$$ \mathcal{U}_0 = \mathcal{U} \tag{53} $$

By construction the points $x(i+1, \mathcal{U}_\alpha)$ have the following two
properties (see Fig. 7),

$$ x(i+1, \mathcal{U}_\alpha) = \alpha x(i+1, \mathcal{U}_1) + (1 - \alpha) x(i+1, \mathcal{U}) \tag{54} $$
$$ \langle x(i+1, \mathcal{U}_\alpha) - x(i+1, \mathcal{U}) \mid p(i+1, \mathcal{U}) \rangle > 0 \quad \text{for all } \alpha > 0 \tag{55} $$

Since $\mathcal{U}_\alpha \in F$ we have also

$$ \{x(i+1, \mathcal{U}_\alpha) : \alpha \in [0, 1]\} \subset W(i+1) \tag{56} $$

But the relations (54), (55), and (56) are contradictory since $p(i+1, \mathcal{U})$
was defined as the outward normal to $\partial W(i+1)$, see Fig. 5. This
contradiction concludes the proof of Proposition V-1.

FIG. 7. Construction of the set $\{x(i+1, \mathcal{U}_\alpha) : \alpha \in [0, 1]\}$.

We shall now derive a difference equation for the normal vector
$p(i, \mathcal{U})$.

PROPOSITION V-2. If $\mathcal{U}$ is an optimal strategy, if $\partial W(i)$ has a tangent
hyperplane with a nonzero outward normal $p(i, \mathcal{U})$ at $x(i, \mathcal{U})$ and if
$\partial W(i+1)$ has a tangent hyperplane with a nonzero outward normal
$p(i+1, \mathcal{U})$ at $x(i+1, \mathcal{U})$, then it is possible to choose the lengths of
$p(i, \mathcal{U})$ and $p(i+1, \mathcal{U})$ such that

$$ p(i, \mathcal{U}) - p(i+1, \mathcal{U}) = A^T(i) p(i+1, \mathcal{U}) \tag{57} $$

where $A^T(i)$ is the transpose of the matrix $A(i)$.

Proof. Let $\tilde{W}(i+1)$ be the set obtained by integration of the basic
difference equation from the set $W(i)$ when we choose the same control
$u(i)$ for each integration (see Fig. 8). Formally, let

$$ \tilde{W}(i+1) = \{x + f(x, u(i), i) : x \in W(i)\} \tag{58} $$


The set $\tilde{W}(i+1)$ has, by construction, the following two properties

$$ \tilde{W}(i+1) \subset W(i+1) \tag{59} $$
$$ x(i+1, \mathcal{U}) \in \partial \tilde{W}(i+1) \tag{60} $$

Moreover $\partial \tilde{W}(i+1)$ has, by continuity, a tangent hyperplane at
$x(i+1, \mathcal{U})$ since $\partial W(i)$ has, by assumption, a tangent hyperplane at
$x(i, \mathcal{U})$.

FIG. 8. Construction of the set $\tilde{W}(i+1)$.

Let $\tilde{p}(i+1, \mathcal{U})$ be a nonzero outward normal to $\tilde{W}(i+1)$ at
$x(i+1, \mathcal{U})$. From the relations (58) and (59) we conclude that we may
choose the length of $\tilde{p}(i+1, \mathcal{U})$ such that

$$ \tilde{p}(i+1, \mathcal{U}) = p(i+1, \mathcal{U}) \tag{61} $$


We have

$$ \langle x - x(i, \mathcal{U}) \mid p(i, \mathcal{U}) \rangle = 0 \tag{62} $$

iff

$$ \langle x + A(i)x + g(i, u(i)) - x(i, \mathcal{U}) - A(i)x(i, \mathcal{U}) - g(i, u(i)) \mid p(i+1, \mathcal{U}) \rangle = 0 \tag{63} $$

i.e., iff

$$ \langle (I + A(i))(x - x(i, \mathcal{U})) \mid p(i+1, \mathcal{U}) \rangle = 0 \tag{64} $$

i.e., iff

$$ \langle x - x(i, \mathcal{U}) \mid (I + A^T(i)) p(i+1, \mathcal{U}) \rangle = 0 \tag{65} $$

(We use $I$ to denote the identity $n \times n$ matrix.) From (62) and (65) we
conclude that the two vectors $p(i, \mathcal{U})$ and $(I + A^T(i)) p(i+1, \mathcal{U})$ are
parallel and we choose the lengths of these two vectors such that

$$ p(i, \mathcal{U}) = (I + A^T(i)) p(i+1, \mathcal{U}) \tag{66} $$

i.e.,

$$ p(i, \mathcal{U}) - p(i+1, \mathcal{U}) = A^T(i) p(i+1, \mathcal{U}) \tag{67} $$

This concludes the proof of Proposition V-2.


Propositions V-1 and V-2 constitute the Maximum Principle for the
Optimal Control of Difference Equations. In Section VIII we shall prove
once more Propositions V-1 and V-2, but that time without assuming
that the sets $\partial W(i)$ have tangent hyperplanes along the optimal trajectory.
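
As an illustration of how Propositions V-1 and V-2 can be used together, the following Python sketch runs the backward recursion (66) for the normal vector and then picks each control by maximizing $\langle g(i, u) \mid p(i+1) \rangle$. Everything numerical here is an assumption made for the example and is not taken from the text: a 2-state system, a constant matrix `A`, a control term $g(i, u) = Bu$ with $u \in [-1, 1]$, a free terminal state, and the goal of maximizing the last coordinate of $x(k)$, which motivates the terminal normal $p(k) = (0, 1)$.

```python
from itertools import product

# Hypothetical data (not from the text): constant A(i) and g(i, u) = B*u,
# u in [-1, 1], so that {g(i, u) : u in Omega} is a convex segment.
A = [[0.1, 0.2], [-0.3, 0.05]]
B = [1.0, 0.5]
K = 4                      # number of steps k
X0 = [0.0, 0.0]            # initial state


def step(x, u):
    """One step of x(i+1) = x(i) + A x(i) + g(i, u)."""
    return [x[r] + A[r][0] * x[0] + A[r][1] * x[1] + B[r] * u for r in range(2)]


def simulate(controls):
    """Return x(k) for the controls u(0), ..., u(k-1)."""
    x = list(X0)
    for u in controls:
        x = step(x, u)
    return x


def costates():
    """Backward recursion p(i) = (I + A^T) p(i+1), cf. (66)."""
    p = [None] * (K + 1)
    p[K] = [0.0, 1.0]      # assumed terminal normal: maximize the last coordinate
    for i in range(K - 1, -1, -1):
        q = p[i + 1]
        p[i] = [q[0] + A[0][0] * q[0] + A[1][0] * q[1],
                q[1] + A[0][1] * q[0] + A[1][1] * q[1]]
    return p


def max_principle_controls():
    """Choose u(i) maximizing <g(i, u) | p(i+1)>, hence H of (46), over [-1, 1]."""
    p = costates()
    controls = []
    for i in range(K):
        slope = B[0] * p[i + 1][0] + B[1] * p[i + 1][1]
        controls.append(1.0 if slope >= 0 else -1.0)
    return controls


def brute_force_best():
    """Largest last coordinate of x(k) over a coarse grid of strategies."""
    return max(simulate(c)[1] for c in product((-1.0, 0.0, 1.0), repeat=K))
```

Comparing `simulate(max_principle_controls())[1]` with `brute_force_best()` gives a quick consistency check on such a small example; the grid search is only a sanity test and does not replace the proof.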

VI. Comoving Space Along a Trajectory

The aim of this section is to introduce a new space $Y$ and a mapping
of $X \times T$ into $Y \times T$ such that the difference equation for the space $Y$,
which corresponds to the basic difference equations for the space $X$, takes
a much simpler form.
Let $\mathcal{G} = \{(G(i), i) : i = 0, 1, 2, \ldots, k\}$ be the set of ordered pairs,
whose first elements are $n \times n$ matrices and second elements are times,
which are defined by the following two relations:

$$ G(k) = I \tag{68} $$
$$ G(i) - G(i+1) = G(i+1)A(i), \quad \text{all } i = 0, 1, \ldots, k-1 \tag{69} $$
PROPOSITION VI-1. The matrix $G^{-1}(i)$, inverse of the matrix $G(i)$,
exists for all $i = 0, 1, \ldots, k-1$.

Proof. From the definitions (68) and (69) we may write

$$ G(i) = (I + A(k-1))(I + A(k-2)) \cdots (I + A(i)) \tag{70} $$

But $(I + A(i))^{-1}$ exists since we have assumed that

$$ |A(i)| < 1 \tag{71} $$

hence $G^{-1}(i)$ exists also and

$$ G^{-1}(i) = (I + A(i))^{-1}(I + A(i+1))^{-1} \cdots (I + A(k-2))^{-1}(I + A(k-1))^{-1} \tag{72} $$

This concludes the proof of Proposition VI-1.
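
The recursion (68)-(69) and the product formula (72) can be checked numerically. The Python sketch below is only an illustration: the dimension ($n = 2$), the horizon `K`, and the time-varying matrix returned by the hypothetical function `A(i)` are assumptions chosen so that $I + A(i)$ is invertible, and the helper names are not from the text.

```python
K = 4  # number of steps k (hypothetical)


def A(i):
    """Hypothetical time-varying 2x2 matrix A(i) with small norm."""
    return [[0.1, 0.05 * i], [-0.05 * i, 0.2]]


def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[r][m] * Q[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]


def inverse(M):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def identity():
    """The 2x2 identity matrix I."""
    return [[1.0, 0.0], [0.0, 1.0]]


def i_plus_a(i):
    """The factor I + A(i)."""
    a = A(i)
    return [[1.0 + a[0][0], a[0][1]], [a[1][0], 1.0 + a[1][1]]]


def comoving_matrices():
    """G(k) = I and G(i) = G(i+1)(I + A(i)), i.e. relations (68) and (69)."""
    G = [None] * (K + 1)
    G[K] = identity()
    for i in range(K - 1, -1, -1):
        G[i] = matmul(G[i + 1], i_plus_a(i))
    return G


def inverse_by_product(i):
    """Right-hand side of (72): (I + A(i))^-1 ... (I + A(k-1))^-1."""
    result = identity()
    for j in range(i, K):
        result = matmul(result, inverse(i_plus_a(j)))
    return result
```

Multiplying `comoving_matrices()[i]` by `inverse_by_product(i)` should return the identity up to rounding for every `i`, which is the content of (72).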
Let $Y$ be an $n$-dimensional Euclidean space with elements $y = (y^1, \ldots, y^n)$.
We shall consider the mapping from $X \times T$ into $Y \times T$ defined
by the relation

$$ y = G(i)(x - x(i, \mathcal{V})) \tag{73} $$

From Proposition VI-1 we know that the mapping from $X \times T$
into $Y \times T$ defined by the relation (73) is one to one and onto.
We shall also consider the set of ordered pairs

$$ \mathcal{Y}(\mathcal{U}, \mathcal{V}) = \{(y(i, \mathcal{U}, \mathcal{V}), i) : i = 0, 1, 2, \ldots, k\} \tag{74} $$

where $y(i, \mathcal{U}, \mathcal{V})$ is defined by

$$ y(i, \mathcal{U}, \mathcal{V}) = G(i)(x(i, \mathcal{U}) - x(i, \mathcal{V})) \tag{75} $$

In other words $\mathcal{Y}(\mathcal{U}, \mathcal{V})$ is the trajectory in $Y \times T$ which corresponds
to the trajectory $\mathcal{X}(\mathcal{U})$ in $X \times T$.

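PROPOSITION VI-2. If $\mathcal{U}$ and $\mathcal{V} \in F$ then

(1) $$ y(i+1, \mathcal{U}, \mathcal{V}) - y(i, \mathcal{U}, \mathcal{V}) = G(i+1)(g(i, u(i)) - g(i, v(i))) \tag{76} $$

(2) $$ y(k, \mathcal{U}, \mathcal{V}) = x(k, \mathcal{U}) - x(k, \mathcal{V}) \tag{77} $$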

Proof. Part (2) is trivial since we assumed that

$$ G(k) = I \tag{78} $$

Part (1) is the result of the following straightforward computation:

$$
\begin{aligned}
y(i+1, \mathcal{U}, \mathcal{V}) - y(i, \mathcal{U}, \mathcal{V})
&= G(i+1)(x(i+1, \mathcal{U}) - x(i+1, \mathcal{V})) - G(i)(x(i, \mathcal{U}) - x(i, \mathcal{V})) \\
&= G(i+1)(x(i+1, \mathcal{U}) - x(i+1, \mathcal{V})) - G(i+1)(I + A(i))(x(i, \mathcal{U}) - x(i, \mathcal{V})) \\
&= G(i+1)(x(i+1, \mathcal{U}) - x(i, \mathcal{U}) - x(i+1, \mathcal{V}) + x(i, \mathcal{V})) - G(i+1)A(i)(x(i, \mathcal{U}) - x(i, \mathcal{V})) \\
&= G(i+1)(A(i)x(i, \mathcal{U}) + g(i, u(i)) - A(i)x(i, \mathcal{V}) - g(i, v(i))) - G(i+1)A(i)(x(i, \mathcal{U}) - x(i, \mathcal{V})) \\
&= G(i+1)(g(i, u(i)) - g(i, v(i)))
\end{aligned}
\tag{79}
$$

This concludes the proof of Proposition VI-2.
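
The identities (76) and (77) are easy to check on a scalar system, where $G(i)$ is just a product of numbers. The sketch below is a minimal illustration under stated assumptions: the state is one-dimensional, and the functions `a(i)` and `g(i, u)` are hypothetical choices (with $|a(i)| < 1$) that do not come from the text.

```python
K = 5  # number of steps k (hypothetical)


def a(i):
    """Hypothetical scalar A(i), chosen so that |A(i)| < 1."""
    return 0.1 * i - 0.2


def g(i, u):
    """Hypothetical control term g(i, u); it may be nonlinear in u."""
    return (1.0 + 0.5 * i) * u + 0.2 * u * u


def trajectory(x0, controls):
    """States x(0), ..., x(k) of x(i+1) - x(i) = A(i) x(i) + g(i, u(i))."""
    xs = [x0]
    for i, u in enumerate(controls):
        xs.append(xs[-1] + a(i) * xs[-1] + g(i, u))
    return xs


def comoving_factors():
    """Scalar version of (68)-(69): G(k) = 1, G(i) = G(i+1) (1 + A(i))."""
    G = [1.0] * (K + 1)
    for i in range(K - 1, -1, -1):
        G[i] = G[i + 1] * (1.0 + a(i))
    return G


def comoving(xs, xs_ref, G):
    """y(i) = G(i) (x(i) - x_ref(i)), cf. (75)."""
    return [G[i] * (xs[i] - xs_ref[i]) for i in range(K + 1)]
```

With two control sequences `U` and `V` of length `K`, the increments of `comoving(trajectory(x0, U), trajectory(x0, V), comoving_factors())` should match the right-hand side of (76) step by step. Feeding the same controls from two different initial states should give a constant sequence, which is the "parallel trajectories" picture described later in this section.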
At this point it is convenient to introduce for the space $Y$ certain
concepts which correspond to the concepts introduced in Section III
for the space $X$. Let

$$ W(i, \mathcal{V}) = \{y(i, \mathcal{U}, \mathcal{V}) : \mathcal{U} \in F\} \tag{80} $$

$$ \mathcal{W}(\mathcal{V}) = \bigcup_{i=0}^{k} W(i, \mathcal{V}) \times \{i\} \tag{81} $$

$$ w(i, \mathcal{V}) = \{G(i+1)(g(i, u) - g(i, v(i))) : u \in \Omega\} \tag{82} $$

The relation (82) exhibits clearly the reasons which have motivated the
transformation from the space $X \times T$ into the space $Y \times T$. Indeed
if we compare the formulas (32) and (82) we see that the set $w(x, i)$
depends explicitly on the state variable $x$ but that the set $w(i, \mathcal{V})$ is
independent of the state variable $y$: this property of $w(i, \mathcal{V})$ will enable us to
derive very interesting results in the following sections.
The space $Y$ is called the comoving space along the trajectory $\mathcal{X}(\mathcal{V})$.
This appellation comes from the fact that the trajectory $\mathcal{X}(\mathcal{V})$ in the
space $X$ is transformed into the trajectory

$$ y = 0 \tag{83} $$

of the space $Y$ and that the trajectories of the space $X$ corresponding
to the same strategy $\mathcal{V}$ but with different initial conditions are
transformed into the trajectories

$$ y = \text{constant} \tag{84} $$

In other words the space $Y \times T$ is obtained from the space $X \times T$
by a transformation which stretches and twists the field of all trajectories
with the same strategy $\mathcal{V}$ into a nice field of parallel trajectories (see
Fig. 9).

VII. Closure and Convexity of the Sets $W(i)$ and $W(i, \mathcal{V})$

In this section we shall prove that the sets $W(i)$ and $W(i, \mathcal{V})$ are
closed and convex. The convexity of the sets $W(i, \mathcal{V})$ will be needed
in Section VIII to derive the Maximum Principle in the general case
and the closure of the sets $W(i)$ will be used in Section IX to prove the
existence theorem.

FIG. 9. Transformation of trajectories from the space $X \times T$ into the space $Y \times T$.
From the relations (24), (73), and (76) we may write

$$ W(i, \mathcal{V}) = \{G(i)(x - x(i, \mathcal{V})) : x \in W(i)\} \tag{85} $$

or equivalently

$$ W(i) = \{x(i, \mathcal{V}) + G^{-1}(i)y : y \in W(i, \mathcal{V})\} \tag{86} $$

Hence we see immediately that $W(i, \mathcal{V})$ is convex if and only if $W(i)$
is convex and that $W(i, \mathcal{V})$ is closed iff $W(i)$ is closed.

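PROPOSITION VII-1. The sets $W(i, \mathcal{V})$, $i = 0, 1, \ldots, k$, are convex.

Proof. By definition we have

$$ W(i, \mathcal{V}) = \{y(i, \mathcal{U}, \mathcal{V}) : \mathcal{U} \in F\} \tag{87} $$

$$ = \Big\{ \sum_{j=0}^{i-1} G(j+1)(g(j, u(j)) - g(j, v(j))) : \mathcal{U} \in F \Big\} \tag{88} $$

Let $\mathcal{U}_0$ and $\mathcal{U}_1$ be two elements of $F$. We shall prove that for every
$\alpha \in [0, 1]$ there exists a $\mathcal{U}_\alpha \in F$ such that

$$ y(i, \mathcal{U}_\alpha, \mathcal{V}) = \alpha y(i, \mathcal{U}_1, \mathcal{V}) + (1 - \alpha) y(i, \mathcal{U}_0, \mathcal{V}) \tag{89} $$

Indeed let $\mathcal{U}_\alpha$ be defined by the relation

$$ g(j, u_\alpha(j)) = \alpha g(j, u_1(j)) + (1 - \alpha) g(j, u_0(j)) \quad \text{all } j = 0, 1, 2, \ldots, k-1 \tag{90} $$

Such a $\mathcal{U}_\alpha$ exists since we have assumed that the set

$$ \{g(j, u) : u \in \Omega\} \tag{91} $$

is convex. It is now a trivial matter to verify, by substituting (90) into the
sum (88), that $y(i, \mathcal{U}_\alpha, \mathcal{V})$ satisfies relation (89). This concludes the
proof of Proposition VII-1.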
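
The convex-combination step in this proof can be traced numerically. The following Python sketch is an illustration under assumptions that are not in the text: a scalar state, hypothetical coefficients `a(i)` and `b(i)`, and a control term $g(i, u) = b(i)u$ with $u \in [-1, 1]$. Because this $g$ is linear in $u$, the control $u_\alpha(j)$ of (90) can be taken as the same convex combination of $u_1(j)$ and $u_0(j)$; for a general convex set $\{g(j, u)\}$ one would instead have to solve (90) for $u_\alpha(j)$.

```python
K = 4  # number of steps k (hypothetical)


def a(i):
    """Hypothetical scalar A(i) with |A(i)| < 1."""
    return 0.15 - 0.1 * i


def b(i):
    """Hypothetical gain: g(i, u) = b(i) * u with u in [-1, 1]."""
    return 1.0 + 0.25 * i


def comoving_factors():
    """Scalar version of (68)-(69): G(k) = 1, G(i) = G(i+1) (1 + A(i))."""
    G = [1.0] * (K + 1)
    for i in range(K - 1, -1, -1):
        G[i] = G[i + 1] * (1.0 + a(i))
    return G


def y(i, U, V, G):
    """Sum (88): y(i, U, V) = sum over j < i of G(j+1) (g(j, u(j)) - g(j, v(j)))."""
    return sum(G[j + 1] * (b(j) * U[j] - b(j) * V[j]) for j in range(i))


def blend(alpha, U1, U0):
    """Strategy U_alpha of (90); valid here because g is linear in u."""
    return [alpha * u1 + (1.0 - alpha) * u0 for u1, u0 in zip(U1, U0)]
```

For any `alpha` in $[0, 1]$, `y(i, blend(alpha, U1, U0), V, G)` should equal `alpha * y(i, U1, V, G) + (1 - alpha) * y(i, U0, V, G)` up to rounding, which is relation (89).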

PROPOSITION VII-2. The sets $W(i, \mathcal{V})$, $i = 0, 1, \ldots, k$, are closed.

Proof. Let $\{\mathcal{U}_m : m = 1, 2, 3, \ldots\}$ be a sequence of elements of $F$
such that the sequence

$$ \{y(i, \mathcal{U}_m, \mathcal{V}) : m = 1, 2, 3, \ldots\} \tag{92} $$

converges. We have to prove that there exists an element $\mathcal{U}$ in $F$ such
that the sequence (92) converges to $y(i, \mathcal{U}, \mathcal{V})$.
For every $j$ such that $0 \leq j < i$ we shall consider the sequence

$$ \{G(j+1)(g(j, u_m(j)) - g(j, v(j))) : m = 1, 2, 3, \ldots\} \tag{93} $$

All the sequences (93) are bounded hence, by the Theorem of
Weierstrass-Bolzano, there exists a subsequence $\{\mathcal{U}'_m : m = 1, 2, 3, \ldots\}$ of
the sequence $\{\mathcal{U}_m : m = 1, 2, 3, \ldots\}$ such that the sequence

$$ \{G(j+1)(g(j, u'_m(j)) - g(j, v(j))) : m = 1, 2, 3, \ldots\} \tag{94} $$

converges for all $j$ such that $0 \leq j < i$.
We have assumed that the set

$$ \{g(j, u) : u \in \Omega\} \tag{95} $$

is closed, hence the set

$$ \{G(j+1)(g(j, u) - g(j, v(j))) : u \in \Omega\} \tag{96} $$

will also be closed and, for every $j$ such that $0 \leq j < i$, there exists
a $\bar{u}(j) \in \Omega$ such that the sequence (94) converges to
$G(j+1)(g(j, \bar{u}(j)) - g(j, v(j)))$.
Let $\mathcal{U} \in F$ be defined by the following relations:

$$ u(j) = \bar{u}(j) \quad \text{for } 0 \leq j < i \tag{97} $$
$$ u(j) \text{ arbitrary in } \Omega \quad \text{for } i \leq j < k \tag{98} $$

It is now a trivial matter to verify that the sequence $\{y(i, \mathcal{U}_m, \mathcal{V}) : m = 1, 2, 3, \ldots\}$
converges to $y(i, \mathcal{U}, \mathcal{V})$. This concludes the proof of
Proposition VII-2.

VIII. A General Proof of the Maximum Principle

In this section we shall obtain the same result as in Propositions
V-1 and V-2 but without assuming that the $W(i)$ have tangent
hyperplanes along an optimal trajectory. Moreover the present derivation
with its clearly apparent geometrical motivation is a more powerful
tool when it is necessary to design efficient computational schemes.

THEOREM VIII-1. If $\mathcal{V}$ is an optimal strategy then $x(k, \mathcal{V})$ is a
boundary point of $W(k)$.

Proof. This theorem is a straightforward application of the
Principle of Optimal Evolution (see Proposition IV-1).

THEOREM VIII-2. If $x(k, \mathcal{V})$ is a boundary point of $W(k)$ then $y = 0$
is a boundary point of $W(k, \mathcal{V})$.

Proof. By definition we have

$$ W(k, \mathcal{V}) = \{x - x(k, \mathcal{V}) : x \in W(k)\} \tag{99} $$

and Theorem VIII-2 follows immediately.

THEOREM VIII-3. If $y = 0$ is a boundary point of $W(k, \mathcal{V})$ then there
is a nonzero vector $\varphi(\mathcal{V})$ such that

$$ \langle G(i+1)(g(i, u) - g(i, v(i))) \mid \varphi(\mathcal{V}) \rangle \leq 0 \tag{100} $$

for all $i = 0, 1, \ldots, k-1$ and all $u \in \Omega$.

Proof. The set $W(k, \mathcal{V})$ is convex, see Proposition VII-1, hence
there exists a supporting hyperplane passing through the point $y = 0$.
In other words there exists a vector $\varphi(\mathcal{V})$, normal to the supporting
hyperplane, such that

$$ \langle y \mid \varphi(\mathcal{V}) \rangle \leq 0 \quad \text{for all } y \in W(k, \mathcal{V}) \tag{101} $$

If $\varphi(\mathcal{V})$ does not satisfy the condition (100), then there exists a
$j \in \{0, 1, \ldots, k-1\}$ and a $u \in \Omega$ such that

$$ \langle G(j+1)(g(j, u) - g(j, v(j))) \mid \varphi(\mathcal{V}) \rangle > 0 \tag{102} $$

Let $\hat{\mathcal{V}} \in F$ be constructed by the two following relations

$$ \hat{v}(i) = v(i) \quad \text{if } i \neq j \tag{103} $$
$$ \hat{v}(j) = u \tag{104} $$

It is a trivial matter to verify that

$$ \langle y(k, \hat{\mathcal{V}}, \mathcal{V}) \mid \varphi(\mathcal{V}) \rangle > 0 \tag{105} $$

The relations (101) and (105) are contradictory. This contradiction
concludes the proof of Theorem VIII-3.

THEOREM VIII-4. If there exists a nonzero vector $\varphi(\mathcal{V})$ satisfying
condition (100) then there exists a nonzero vector $p(i, \mathcal{V})$ defined for
$i = 0, 1, \ldots, k$ such that

(1) $$ p(i, \mathcal{V}) = G^T(i) \varphi(\mathcal{V}) \tag{106} $$

(2) $$ H(i, x(i, \mathcal{V}), u, p(i+1, \mathcal{V})) \leq H(i, x(i, \mathcal{V}), v(i), p(i+1, \mathcal{V})) \tag{107} $$

for all $i = 0, 1, 2, \ldots, k-1$ and all $u \in \Omega$, and,

(3) $$ p(i, \mathcal{V}) - p(i+1, \mathcal{V}) = A^T(i) p(i+1, \mathcal{V}) \tag{108} $$

for all $i = 0, 1, 2, \ldots, k-1$.

Proof. By a well known property of matrix theory the relation (100)
may be written in the form

$$ \langle g(i, u) - g(i, v(i)) \mid G^T(i+1) \varphi(\mathcal{V}) \rangle \leq 0 \tag{109} $$

for all $i = 0, 1, \ldots, k-1$ and all $u \in \Omega$.
If we consider the relation (106) as the definition of $p(i, \mathcal{V})$ then the
relation (109) may be written in the form

$$ \langle g(i, u) - g(i, v(i)) \mid p(i+1, \mathcal{V}) \rangle \leq 0 \tag{110} $$

for all $i = 0, 1, \ldots, k-1$ and all $u \in \Omega$, which is equivalent to

$$ \langle A(i)x(i, \mathcal{V}) + g(i, v(i)) \mid p(i+1, \mathcal{V}) \rangle \geq \langle A(i)x(i, \mathcal{V}) + g(i, u) \mid p(i+1, \mathcal{V}) \rangle \tag{111} $$

for all $i = 0, 1, \ldots, k-1$ and all $u \in \Omega$.
If we recall the definition of $H(i, x, u, p)$ given in Section V we see
that the relations (107) and (111) are equivalent.
It remains to prove relation (108). Indeed from the relation (106) we
have

$$ p(i, \mathcal{V}) - p(i+1, \mathcal{V}) = (G^T(i) - G^T(i+1)) \varphi(\mathcal{V}) \tag{112} $$

and from the definition of $G(i)$, see relation (69), we have

$$ G^T(i) - G^T(i+1) = A^T(i) G^T(i+1) \tag{113} $$

hence

$$ p(i, \mathcal{V}) - p(i+1, \mathcal{V}) = A^T(i) G^T(i+1) \varphi(\mathcal{V}) \tag{114} $$

Using again the definition (106) we transform immediately the relation
(114) into the relation (108). This concludes the proof of Theorem
VIII-4.
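
The passage from (106) to (108) can be traced on a small numerical example. In the Python sketch below the dimension ($n = 2$), the horizon `K`, the hypothetical matrix function `A(i)`, and the vector `phi` are all assumptions made for illustration; they are not data from the text.

```python
K = 3  # number of steps k (hypothetical)


def A(i):
    """Hypothetical time-varying 2x2 matrix A(i)."""
    return [[0.2, -0.1 * i], [0.3, 0.1]]


def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[r][m] * Q[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]


def transpose(M):
    """Transpose of a 2x2 matrix."""
    return [[M[c][r] for c in range(2)] for r in range(2)]


def matvec(M, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [M[r][0] * v[0] + M[r][1] * v[1] for r in range(2)]


def comoving_matrices():
    """G(k) = I and G(i) = G(i+1)(I + A(i)), relations (68) and (69)."""
    G = [None] * (K + 1)
    G[K] = [[1.0, 0.0], [0.0, 1.0]]
    for i in range(K - 1, -1, -1):
        a = A(i)
        factor = [[1.0 + a[0][0], a[0][1]], [a[1][0], 1.0 + a[1][1]]]
        G[i] = matmul(G[i + 1], factor)
    return G


def costates(phi):
    """p(i) = G^T(i) phi for i = 0, ..., k, relation (106)."""
    G = comoving_matrices()
    return [matvec(transpose(G[i]), phi) for i in range(K + 1)]
```

For any `phi`, the list `p = costates(phi)` should satisfy `p[i] - p[i + 1] == A(i)^T p[i + 1]` up to rounding for every `i < K`, which is relation (108); in particular `p[K]` equals `phi` because $G(k) = I$.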
THEOREM VIII-5. If $x = x(k, \mathcal{V})$ is a boundary point of the set $W(k)$
then there exists a nonzero vector $p(i, \mathcal{V})$ defined for $i = 0, 1, \ldots, k$ and
satisfying the relations (106), (107), and (108).

Proof. Theorem VIII-5 is a direct consequence of Theorems
VIII-2, VIII-3, and VIII-4. Indeed if $x = x(k, \mathcal{V})$ is a boundary
point of the set $W(k)$ then $y = 0$ is a boundary point of the set $W(k, \mathcal{V})$
(see Theorem VIII-2), then there exists a vector $\varphi(\mathcal{V})$ satisfying the
condition (100) (see Theorem VIII-3) and there exists a vector $p(i, \mathcal{V})$
satisfying the relations (106), (107), and (108) (see Theorem VIII-4).
This concludes the proof of Theorem VIII-5.

THEOREM VIII-6. If $\mathcal{V}$ is an optimal strategy then there exists a
nonzero vector $p(i, \mathcal{V})$ defined for $i = 0, 1, \ldots, k$ and satisfying the relations
(106), (107), and (108).

Proof. Theorem VIII-6 is a direct consequence of Theorems
VIII-1 and VIII-5. Indeed if $\mathcal{V}$ is an optimal strategy then $x(k, \mathcal{V})$
is a boundary point of the set $W(k)$, see Theorem VIII-1, then there
exists a vector $p(i, \mathcal{V})$ defined for $i = 0, 1, \ldots, k$ and satisfying the
relations (106), (107), and (108) (see Theorem VIII-5). This concludes
the proof of Theorem VIII-6.

IX. Existence Theorem

In this section we shall prove that if there is a trajectory satisfying
the initial and terminal conditions then there exists an optimal trajectory.
This result seems trivial at first and is indeed very easy to prove in the
case of the particular type of problems treated in this paper. However the
question of existence is very often an extremely complicated one.

PROPOSITION IX-1. If there exists a strategy $\mathcal{U} \in F$ such that
$(x(k, \mathcal{U}), k) \in E$ then there exists an optimal strategy.

Proof. We have proved in Section VII that the set $W(k)$ is closed.
By assumption the set

$$ (W(k) \times \{k\}) \cap E \tag{115} $$

is not empty, since it contains the element $(x(k, \mathcal{U}), k)$, is closed, since
the set $W(k)$ is closed by Proposition VII-2, and bounded, by construction.
Hence there is a $\mathcal{V} \in F$ such that $(x(k, \mathcal{V}), k)$ is the farthest point
of $(W(k) \times \{k\}) \cap E$ in the positive direction along the $x^n$ axis. By
definition this $\mathcal{V}$ is then an optimal strategy. This concludes the proof
of Proposition IX-1.

ACKNOWLEDGMENTS

I am very grateful to Professor C. A. Desoer and to Drs. F. T. Geyling and A. G. Lubowe
for their valuable comments on this paper.

References

1. J. WING and C. A. DESOER, The multiple-input minimal time regulator problem
(general theory). IEEE Trans. Autom. Control AC-8, 125 (1963).
2. L. I. ROZONOER, The maximum principle of L. S. Pontryagin in optimal-system
theory, Part III. Automation Remote Control 20, 1519-1532 (1959).
3. S. S. L. CHANG, Digitized maximum principle. Proc. IRE, pp. 2030-2031 (1960).
4. S. KATZ, A discrete version of Pontryagin's maximum principle. J. Electron. Control
13, 179-184 (1962).
5. H. HALKIN, Topological aspects of optimal control of dynamical polysystems.
Contrib. Differential Equations (in press) (1964).
6. H. HALKIN, Method of convex ascent. In "Computing Methods in Optimization
Problems." Academic Press, New York, 1964.
7. H. HALKIN, The principle of optimal evolution. In "Nonlinear Differential Equations
and Nonlinear Mechanics" (J. P. LaSalle and S. Lefschetz, eds.), pp. 284-302.
Academic Press, New York, 1963.
8. H. HALKIN, On the necessary condition for optimal control of nonlinear systems.
Tech. Rept. No. 116, Dept. Math., Stanford Univ., 1963. J. Analyse Math.
(in press).
9. H. HALKIN, Lyapounov's theorem on the range of a vector measure and Pontryagin's
maximum principle. Arch. Rational Mech. Anal. 10, 296-304 (1962).
10. J. WARGA, Relaxed variational problems. J. Math. Anal. Appl. 4, 111-128 (1962).
11. R. V. GAMKRELIDZE, Optimal sliding states. Dokl. Akad. Nauk SSSR 143, 1243-1245
(1962). (In Russian.)
12. B. W. JORDAN and E. POLAK, Theory of a class of discrete optimal control systems.
J. Electr. Control (in press).
An Optimal Control Problem with
State Vector Measurement Errors

PETER R. SCHULTZ (1)
Department of Engineering, University of California, Los Angeles,
and Guidance Systems Department, Aerospace Corporation, El Segundo, California

I. Introduction and Preliminaries
   A. Definition of the Problem
   B. Continuous Stochastic Processes with Independent Increments
   C. Discontinuous Stochastic Processes with Independent Increments
   D. Stochastic Integrals
   E. The Pay-Off Associated with a Linear Policy
II. Optimal Control in the Presence of State Vector Measurement Errors
   A. Derivation of the Partial Differential Equations Satisfied by the Optimal Pay-Off
   B. The Solution of the Partial Differential Equation for the Optimal Pay-Off
   C. Estimation in the Optimal Policy
   D. The Sufficiency Question and Local Optimality
III. Optimal Control in the Presence of Measurement Errors and Continuous Random Disturbances
   A. The Optimal Policy and Pay-Off for Disturbances which are Independent of the State Vector
   B. The Sufficiency and Local Optimality Questions in the Presence of Continuous Random
      Disturbances which are Independent of the State Vector
   C. The Optimal Policy and Optimal Pay-Off with Disturbances which are Linearly Proportional
      to the State Vector
   D. The Sufficiency and Local Optimality Question in the Presence of Continuous Disturbances
      which are Linearly Proportional to the State Vector
IV. Optimal Control in the Presence of Measurement Errors and Random Disturbances Derived
    from the Generalized Poisson Process
   A. The Optimal Policy and Pay-Off when the Disturbances are Independent of the State Vector
   B. The Sufficiency Question when the Disturbances are Independent of the State Vector
   C. The Optimal Policy and Pay-Off when the Disturbances are Linearly Proportional to the
      State Vector
   D. The Sufficiency Question with Disturbances which are Linearly Proportional to the
      State Vector
V. Conclusions
VI. Appendices
   A. The Solution to a Matrix Differential Equation
   B. A Demonstration that the Solution to a Matrix Differential Equation is Non-Negative Definite
   C. The Minimization of the Expected Value of a Quadratic Form
References

(1) Now Guidance Systems Department, Aerospace Corporation. The work discussed
in this chapter was supported by the Adaptive Control Project at UCLA under AFOSR
Grant 62-68 and by a fellowship of the Aerospace Corporation.

I. Introduction and Preliminaries

A. Definition of the Problem


This chapter is devoted to the discussion of a specific stochastic
optimal control problem which will now be defined. Consider a
continuous linear system whose behavior is described by the following vector
matrix differential equation:

$$ \frac{dx}{dt} = A(t)x(t) + B(t)u(t) + n(t) \tag{1} $$

where $x(t)$ is an $n \times 1$ matrix ($n$-vector) which will be called the state
vector; $u(t)$ is an $r \times 1$ matrix ($r$-vector) which will be called the control
vector or policy. $n(t)$ is an $n \times 1$ matrix whose elements represent
external random disturbances acting on the system. $A(t)$ and $B(t)$ are
$n \times n$ and $n \times r$ matrices, respectively, whose elements are assumed
to be continuous functions of time.
The control vector $u(t)$ is to be chosen so that the system behaves
in some desired fashion. $u(t)$ may be a function of the state vector $x(t)$
as well as the independent variable time. In many practical problems,
however, the controller's knowledge of $x(t)$ is rendered imperfect by
the presence of noise or measurement errors. In other words, $u$ must
be chosen in terms of $x(t) + z(t)$, where $z(t)$ is a vector-valued stochastic
process which represents the measurement error, rather than the actual
value of the state vector $x(t)$. Let $E[\ ]$ or $\overline{[\ ]}$ denote the mathematical
expectation of a quantity $[\ ]$. The stochastic process $z(t)$ is assumed to
satisfy the following conditions in this chapter:
(i) $z(t)$ is a Markov process uncorrelated with $x(s)$, $s > t$.
(ii) $E[z(t)] = 0$ over the time interval of control, $t_0 \leq t \leq T$.
(iii) The matrix $E[z(t)z(t)']$ is defined and its elements are continuous
almost everywhere in $t_0 \leq t \leq T$. (In this chapter a prime (')
denotes the transpose of the appropriate matrix.)
Furthermore, in this chapter it will be assumed that $u$ is a linear function
of $x(t) + z(t)$. Note that in the notation used here and in the succeeding
parts of this chapter, $z(t)$ is a vector-valued random variable defined over
a suitable sample space for each $t$ such that $t_0 \leq t \leq T$, i.e., $t_0 \leq t \leq T$
is the index set over which the different random variables in the stochastic
process are catalogued. This differs slightly from the conventional
notation in the theory of stochastic processes.
In this chapter the following generalized quadratic performance
criterion or pay-off will be considered:

$$ \text{performance criterion} = E\Big\{ \int_{t_0}^{T} \big[ x'(t)P(t)x(t) + u'(t)R(t)u(t) \big] \, dt + x'(T) P x(T) \Big\} \tag{2} $$

$R(t)$ is assumed to be a continuous, positive definite matrix. $P(t)$ is
assumed to be a continuous non-negative definite matrix. $T$, the terminal
time of the control interval, is assumed to be specified. The mathematical
expectation in Eq. (2) is taken with respect to both the measurement
errors and the external disturbances. In Eq. (2), $u(x(t) + z(t), t)$ is
denoted by $u(t)$ for brevity.
In order to make better use of the theory of stochastic processes it
is desirable to change Eq. (1) from a differential equation into a stochastic
differential equation. This step amounts to rewriting Eq. (1) in the
following form:

$$ dx(t, t+h) = A(t)x(t)h + B(t)u(t, x(t) + z(t))h + dN(t, t+h) + o(h) \tag{3} $$

In Eq. (3) $dx(t, t+h)$ represents the total increment in the state
vector $x$ that occurs during the interval between $t$ and $t + h$,
$dN(t, t+h)$ represents the increment in the state vector due to external
disturbances in this interval and $o(h)$ represents higher order terms in $h$.
Consequently, the problem that is considered in this chapter can be
stated as follows: given a system whose behavior is described by Eq. (1)
or, equivalently, by Eq. (3), find the policy $u(t, x(t) + z(t))$ from all
piecewise continuous (in $t$) linear functions of $x(t) + z(t)$ which
minimizes the performance criterion defined in Eq. (2). The method of
dynamic programming developed by Bellman (1, 2) will be used to
obtain the solution.
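
To make the incremental form (3) and the pay-off (2) concrete, the following Python sketch simulates a scalar version of the system with a fixed step $h$ and estimates the expected pay-off by averaging sample paths. It is only a sketch under assumptions that are not in the text: constant scalar coefficients, a linear policy $u = -\text{gain} \cdot (x + z)$ with a hypothetical `gain`, measurement errors drawn independently at every step, Gaussian disturbance increments with variance proportional to $h$, and a scalar `terminal_weight` standing in for the terminal quadratic term.

```python
import random


def simulate_payoff(a, b, gain, p, r, terminal_weight, x0, t0, T, steps,
                    meas_std, dist_std, rng):
    """Pay-off of one sample path of a scalar version of Eq. (3).

    Assumptions of this sketch (not from the text): the policy is
    u = -gain * (x + z); z is drawn independently at every step; the
    disturbance increment dN is Gaussian with variance dist_std**2 * h.
    """
    h = (T - t0) / steps
    x = x0
    payoff = 0.0
    for _ in range(steps):
        z = rng.gauss(0.0, meas_std)               # measurement error z(t)
        u = -gain * (x + z)                        # linear function of x + z
        payoff += (p * x * x + r * u * u) * h      # integrand of Eq. (2)
        dN = rng.gauss(0.0, dist_std * h ** 0.5)   # independent increment
        x = x + a * x * h + b * u * h + dN         # Eq. (3), o(h) dropped
    return payoff + terminal_weight * x * x        # terminal quadratic term


def expected_payoff(samples, seed, **kwargs):
    """Monte Carlo estimate of the expectation in Eq. (2)."""
    rng = random.Random(seed)
    return sum(simulate_payoff(rng=rng, **kwargs) for _ in range(samples)) / samples
```

Calling `expected_payoff` for several values of `gain` gives a crude way to compare linear policies; it is a numerical experiment, not the dynamic programming solution developed in this chapter.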
It is true that both the assumptions regarding the system dynamics in
Eqs. (1) and (3) and the definition of the performance criterion in Eq. (2)
are less general than one might desire. However, these assumptions do
enable one to obtain more information about the characteristics of the
optimal policy and the optimal pay-off (or value of the performance
criterion that results from the use of the optimal policy) than would
otherwise be possible. For this reason, many research workers in the
field of control theory have studied versions of it.
Bellman (1), Bellman et al. (2) and Beckwith (3) discussed continuous
deterministic versions of this problem when both measurement errors
and external random disturbances are absent. Kalman (4, 5) has recently
discussed this problem by using an approach based on the classical
calculus of variations. Letov (6) has also discussed the deterministic
version using dynamic programming for the case where the time
interval of operation is infinite. Collina and Dorato (7) have utilized the
maximum principle (8) to obtain a solution to the deterministic problem.
Merriam (9, 10) also solved a deterministic version of this problem and
in addition he incorporated an element of randomness by considering
the problem of tracking a randomly moving point. Florentin (11, 12)
treated the problem with random external disturbances generated by
stochastic processes with independent increments. The author of this
chapter (13) considered the problem where both measurement errors
and external disturbances are present. Kushner (14) also considers
similar problems. Denham and Speyer (15) have obtained results
pertinent to a midcourse guidance problem.
The discrete version of this problem has also been considered
extensively by many authors. In this version of the problem the differential
equation in Eq. (1) is replaced by an appropriate difference equation
and the integral in Eq. (2) is replaced by a summation. Here the
independent variable (time) is "discretized." Kramer (16), Kalman and Koepcke
(17), and others have studied deterministic versions of this problem.
Kramer (16), Adorno (18, 19), and Florentin (20, 21) have considered
stochastic versions of this problem and external disturbances and have
also introduced aspects of adaptiveness into the problem. One such
aspect involves considering a case where the disturbances are
characterized by a Bernoulli distribution where the parameter is unknown initially.
The adaptive aspect then enters into the problem through an
estimation scheme which provides increasingly better knowledge of the
unknown parameter which characterizes the process as time passes and
uses this knowledge to improve the controller's characteristics. Gunckel
(22), Gunckel and Franklin (23), Joseph and Tou (24), and Pottle (25)
have considered versions of this problem where both measurement
errors and random disturbances are present. The results in this chapter
are in many ways "continuous analogs" of their results concerning the
optimal estimation and control policies. In a recent article Eaton (26)
discusses an "interrupted stochastic control process" where there is a
nonzero probability that no measurements of the state vector will be
available for each stage of the stochastic process. While Eaton's
fundamental approach is different from those of the above authors (22-25),
there are many similarities in the results.
It should not be inferred from the preceding discussion that this is the
only stochastic optimal control problem which has been considered.
Many other stochastic optimal control problems have been discussed
in the literature. In many cases their solution, when it can be obtained,
requires involved numerical computations. Some of the interesting
papers in this area are those of Katz (27), Kalman (28), Bellman (29),
Aoki (30, 31), Krasovskii (32), Kipiniak (33), Mishchenko (8), Booton
(34), Krasovskii and Lidskii (35), and Bryson (36).

B. Continuous Stochastic Processes with Independent Increments

Stochastic processes with independent increments are quite useful in
the study of certain stochastic optimal control problems. Their utility
in this respect appears to have been recognized first by Florentin (11, 12).
Since some concepts associated with these stochastic processes will be
used in the following sections of this chapter, it is desirable to present
some of the facts concerning them which are well known to mathematicians
but not to engineers and scientists. More detailed and sophisticated
discussions of processes with independent increments are available
in Doob (37), Loeve (38), Gnedenko (39), and Bartlett (40). Two
kinds of processes with independent increments will be discussed in this
chapter. This subsection will be devoted to those whose sample functions
are continuous. The next subsection will be devoted to those processes
whose sample functions are discontinuous.
A continuous parameter stochastic process (37, 38) $\xi_t(\omega)$ ($\omega \in \Omega$,
where $\Omega$ is a sample space, $F$ the sigma field in this sample space, and $P$ the
probability measure defined over the sample space) is said to be a process
with independent increments if, for any ordered set $\{t_i\}$,
$t_1 \le \cdots < t_i < \cdots \le t_m$, the random variables

$$ \xi_{t_1},\ \xi_{t_2} - \xi_{t_1},\ \ldots,\ \xi_{t_i} - \xi_{t_{i-1}},\ \ldots,\ \xi_{t_m} - \xi_{t_{m-1}} $$

are mutually independent. One interesting general result discussed in
Loeve (38) is worth noting here. Let $\zeta_t(\omega) = \xi_t(\omega) - f(t)$, where $f(t)$ is a
function of $t$ (and not $\omega$) chosen so that $P(\zeta_{t+0} - \zeta_{t-0} = 0) = 1$, i.e.,
chosen so that (in a loose sense) all fixed discontinuities are removed
from the process. Then $\zeta_t$ is called a centered process with independent
increments defined on the closed interval $t_1 \le t \le t_n$ and the following
is true:

(a) $\zeta_t$ is a normal (Gaussian) process if and only if almost all of its
sample functions are continuous.
(b) $\zeta_t$ is a Poisson process if and only if almost all of its sample
functions are step functions of constant height.

Thus one can say that for a centered process with independent increments,
continuity of the sample functions implies that the process is
normal. Likewise, when its sample functions consist of steps of constant
height, a centered process with independent increments must be a
Poisson process.
One example of a continuous, centered process with independent
increments is the Wiener process. It is defined as follows: $\xi_t(\omega)$ is said
to be a Wiener process if $\xi_t(\omega)$ has independent increments, is Gaussian
for all $t$, and satisfies $E(\xi_t - \xi_s) = 0$ and $E(\xi_t - \xi_s)^2 = \sigma_x^2 |t - s|$ for all
$t$ and $s$ such that $\xi_t(\omega)$ is defined. Here $\sigma_x^2$ is a specified positive
parameter which is quite often chosen to be unity. Since the random
variables $\xi_t - \xi_s$ depend, in a statistical sense, only on $t - s$ and not on
$t$ and $s$ separately, the increments in the Wiener process are stationary.
Also, almost all sample functions of a Wiener process are not differentiable
for any $t$ and do not have bounded variation (37). However, almost all sample
functions are continuous, i.e., the sample functions are continuous
with probability one.
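
The defining moment conditions of the Wiener process can be checked numerically. The following Python sketch is an illustration only and is not part of the original development: it builds sample paths on a uniform grid from independent Gaussian increments and estimates the first two moments of $\xi_t - \xi_s$. The function names, grid size, number of paths, and seed are assumptions chosen for the example.

```python
import math
import random


def wiener_path(t_end, n_steps, sigma, rng):
    """Sample one path xi(t_k), k = 0..n_steps, with xi(0) = 0.

    Each increment over a step of length h is drawn as N(0, sigma**2 * h),
    independently of all earlier increments.
    """
    h = t_end / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sigma * math.sqrt(h)))
    return path


def increment_moments(s, t, sigma=1.0, n_paths=4000, n_steps=100,
                      t_end=1.0, seed=0):
    """Monte Carlo estimates of E(xi_t - xi_s) and E(xi_t - xi_s)**2."""
    rng = random.Random(seed)
    h = t_end / n_steps
    i_s, i_t = round(s / h), round(t / h)   # grid indices of s and t
    total, total_sq = 0.0, 0.0
    for _ in range(n_paths):
        path = wiener_path(t_end, n_steps, sigma, rng)
        d = path[i_t] - path[i_s]
        total += d
        total_sq += d * d
    return total / n_paths, total_sq / n_paths
```

Calling `increment_moments(0.2, 0.7)` should return a mean near zero and a second moment near $\sigma_x^2 |t - s| = 0.5$, up to sampling error.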
Define $\eta$ as an $r_1$-vector valued random process whose elements are
composed of $r_1$ independent Wiener processes with $\sigma_x^2 = 1$. Then we
can form a vector-valued process with independent increments as
follows:

$$ dL(t, t+h) = a(t)h + \sigma(t)\, d\eta(t, t+h) + o(h) \tag{4} $$

$a(t)$ is an $r_1$-vector whose components are continuous in $t$. $\sigma(t)$ is an
$r_1 \times r_1$ diagonal matrix whose elements are continuous. Consequently,

$$ E\, dL(t, t+h) = a(t)h + o(h) \tag{5} $$

$$ E[dL(t, t+h)\, dL'(t, t+h)] = \sigma^2(t)h + o(h) \tag{6} $$

The random process defined in Eqs. (4), (5), and (6) is a multidimensional
Gaussian process with independent increments.
There are many cases where a process with independent increments is
not a suitable model for a physical problem but a Markov process derived
by operating on a continuous random process with independent increments
by a shaping filter is suitable. Some aspects of shaping filters are
described by Kalman (41, 51) and Stear (42). For the purposes of this
chapter, the shaping filter can be assumed to be described by the following
stochastic differential equation:

$$ dV(t, t+h) = C(t)V(t)h + D(t)\, dL(t, t+h) + o(h) \tag{7} $$

$V(t)$ is an $n_1 \times 1$ vector valued process which is the output of the shaping
filter. The $L(t)$ process and its increments $dL(t, t+h)$ are defined by
Eqs. (4), (5), and (6). $C(t)$ and $D(t)$ are $n_1 \times n_1$ and $n_1 \times r_1$ matrices
whose elements are continuous in $t$. $V(t)$ will be a Gaussian Markov
process if $V(t_0)$ is Gaussian for some $t_0 \le t$. It will be used in Section III
to discuss the effects of external random disturbances which are independent
of the state vector.
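
A scalar version of the shaping filter in Eq. (7) can be simulated by replacing $h$ with a small finite step and dropping the $o(h)$ terms. The Python sketch below is an illustration under stated assumptions, not the author's procedure: the coefficients are taken constant, and the function names, step size, and seed are hypothetical. For constant $C < 0$, $a = 0$, and $V(0) = 0$, the variance of the output at time $t$ is $D^2 \sigma^2 (1 - e^{2Ct}) / (2|C|)$, which the simulation should approximately reproduce.

```python
import math
import random


def shaping_filter_sample(t_end, h, c, d, rng, a=0.0, sigma=1.0, v0=0.0):
    """Euler-type sample of V(t_end) for a scalar case of Eq. (7).

    dV = c * V * h + d * dL, with dL = a * h + sigma * d_eta as in Eq. (4),
    where d_eta ~ N(0, h) is a Wiener increment. Constant coefficients are
    an assumption made only to keep the sketch short.
    """
    v = v0
    for _ in range(int(round(t_end / h))):
        d_eta = rng.gauss(0.0, math.sqrt(h))
        d_l = a * h + sigma * d_eta
        v += c * v * h + d * d_l
    return v


def output_variance(n_paths=2000, t_end=3.0, h=0.01, c=-1.0, d=1.0, seed=1):
    """Sample variance of the filter output V(t_end) over many paths."""
    rng = random.Random(seed)
    samples = [shaping_filter_sample(t_end, h, c, d, rng)
               for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    return sum((s - mean) ** 2 for s in samples) / (n_paths - 1)
```

With the default arguments the result of `output_variance()` should lie close to $0.5\,(1 - e^{-6})$, up to sampling and discretization error.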
In Section III it will also be desirable to use an $n \times n$ matrix analog of
the shaping filter described by Eq. (7) to discuss the effect of disturbances
on the plant which are linearly proportional to the state vector. To do
this, $\Xi(t)$ will represent an $n \times n$ diagonal matrix whose diagonal
elements are Wiener processes with $\sigma_x^2 = 1$. Then the appropriate
matrix analog of Eq. (4) is

$$ dM(t, t+h) = A(t)h + \nu(t)\, d\Xi(t, t+h) + o(h) \tag{8} $$

$A(t)$ is a diagonal $n \times n$ matrix with continuous elements while $\nu(t)$ is
a diagonal $n \times n$ matrix whose diagonal elements are continuous and
positive. The definition of a Wiener process implies that

$$ E\, dM(t, t+h) = A(t)h + o(h) \tag{9} $$

$$ E[dM(t, t+h)\, dM'(t, t+h)] = \nu^2(t)h + o(h) \tag{10} $$

Let $\varphi(t)$ and $\psi(t)$ be $n \times n$ matrices whose elements are continuous. Then
the following stochastic matrix differential equation defines a Markov
process which is an analog of the one defined in Eq. (7):

$$ dA_r(t, t+h) = \varphi(t)A_r(t)h + \psi(t)\, dM(t, t+h) + o(h) \tag{11} $$

When $A_r(t_0)$ is an $n \times n$ matrix whose elements are normal random
variables for some $t_0 \le t$, the $A_r(t)$ matrix process will represent Gaussian
Markov random processes, i.e., the elements of the $A_r(t)$ matrix will be
Gaussian processes and the matrix $A_r(t)$ constitutes what might be called
a matrix valued random process. Both Eqs. (7) and (11) show how
Markov processes can be generated by processes with independent
increments.

C. Discontinuous Stochastic Processes with Independent Increments
Here the so-called generalized Poisson process (11, 12) and stochastic
processes which are derived from it by shaping filters will be discussed.
First, the vector-valued generalized Poisson process will be defined.
Let $W_t(\omega)$ be an $r_1$-vector valued random process (hereafter $W_t(\omega)$
will be denoted by $W(t)$, where $\omega$ denotes the point in the sample space)
which is defined so that $EW(t)$ and $E[W(t)W(t)']$ exist. Then $L_1(t)$
will be called a generalized Poisson process if its increments are independent
and defined statistically as follows:

$$ dL_1(t, t+h) = L_1(t+h) - L_1(t) = \begin{cases} 0 & \text{with probability } 1 - q(t)h + o(h) \\ W(t) & \text{with probability } q(t)h + o(h) \end{cases} \qquad \infty > q(t) > 0 \tag{12} $$

$dL_1(t, t+h)$ is statistically independent of $dL_1(t_1 - h, t_1)$ for
all $t_1 < t$ or $t_1 > t + 2h$.

The case where $dL_1(t, t+h) = W(t)$ might be called the occurrence
of an "event" in the interval between $t$ and $t + h$ while the case where
$dL_1(t, t+h) = 0$ could be called the absence of an event in this interval.
The generalized Poisson process differs from the usual Poisson
process (37) in two important ways. First, the event is characterized
by a vector rather than a scalar. Second, the results of the occurrence
of an event are random, i.e., the effect when an event does occur may
have a nondegenerate distribution.
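
The increment law of Eq. (12) translates directly into a simulation on a fine grid. The Python sketch below is an illustration only: it treats a scalar $W$ rather than a vector, holds $q$ constant, drops the $o(h)$ terms, and uses a hypothetical Gaussian jump law with $EW = 1$. Under these assumptions the expected value of $L_1(t)$ is approximately $q\,t\,EW$.

```python
import random


def generalized_poisson_path(t_end, h, q, draw_w, rng):
    """Return samples of L1 on a grid of step h, following Eq. (12).

    In each step an "event" occurs with probability q * h, in which case
    the increment is a fresh draw of W; otherwise the increment is zero.
    """
    level, path = 0.0, [0.0]
    for _ in range(int(round(t_end / h))):
        if rng.random() < q * h:
            level += draw_w(rng)
        path.append(level)
    return path


def mean_terminal_value(n_paths=3000, t_end=2.0, h=0.01, q=1.5, seed=2):
    """Monte Carlo estimate of E L1(t_end) for a hypothetical jump law."""
    rng = random.Random(seed)

    def draw_w(r):
        # Hypothetical nondegenerate effect of an event: W ~ N(1, 0.5**2).
        return r.gauss(1.0, 0.5)

    total = 0.0
    for _ in range(n_paths):
        total += generalized_poisson_path(t_end, h, q, draw_w, rng)[-1]
    return total / n_paths
```

With the defaults, `mean_terminal_value()` should be near $q\,t\,EW = 3$. Replacing `draw_w` with a function that returns a constant recovers the usual Poisson process with steps of constant height.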
A Markov process can be generated from a generalized Poisson process
by means of a shaping filter in the same manner that was used in
connection with the continuous random process. Let $C(t)$ and $D(t)$ be
$n_1 \times n_1$ and $n_1 \times r_1$ matrices, respectively, whose elements are
continuous. Then the behavior of a shaping filter which acts on the
generalized Poisson process can be described by the following stochastic
differential equation:

$$ dV_1(t, t+h) = C(t)V_1(t)h + D(t)\, dL_1(t, t+h) + o(h) \tag{13} $$

This Markov process will be useful as a model for external disturbances
which are independent of the plant's state vector.
A matrix analog of Eqs. (12) and (13) can be derived easily. Let
$Y(t)$ be an $n \times n$ diagonal matrix whose diagonal elements consist of
a stochastic process. The mean and covariance matrices of the elements
of $Y(t)$ are taken to be piecewise continuous. Then a so-called matrix
valued generalized Poisson process $M_p(t)$ can be defined by the following
stochastic differential equation:

$$ dM_p(t, t+h) = \begin{cases} 0 & \text{with probability } 1 - q(t)h + o(h) \\ Y(t) & \text{with probability } q(t)h + o(h) \end{cases} \tag{14} $$

$dM_p(t, t+h)$ is statistically independent of $dM_p(t_1 - h, t_1)$ for all
$t_1 < t$ or $t_1 > t + 2h$, and $\infty > q(t) > 0$.
Then the following stochastic matrix differential equation will define
a so-called matrix Markov process:

$$ dA_p(t, t+h) = \varphi(t)A_p(t)h + \psi(t)\, dM_p(t, t+h) + o(h) \tag{15} $$

$\varphi(t)$ and $\psi(t)$ are assumed to be $n \times n$ matrices whose elements are
continuous in $t$. The $A_p(t)$ matrix random process will be useful in discussing
the effects of disturbances which are linearly proportional to the state
vector.

D. Stochastic Integrals
In the two previous parts of this section the increments in the $V(t)$,
$V_1(t)$, $A_r(t)$, and $A_p(t)$ processes were defined. However, the integrals
or sums of these increments were not defined. Appropriate integrals
can be defined by use of the theory of stochastic integrals (37). Consider
a partition $t_0 < t_1 < \cdots < t_m = t$ of the interval $t_0 \le s < t$. If, as
$m \to \infty$, the quantity $\max(t_{i+1} - t_i)$ becomes arbitrarily small for
$0 \le i \le m - 1$, then suitable definitions of these processes in terms of
stochastic integrals are

$$ V(t) = \int_{t_0}^{t} dV(s, s+0) = \lim_{m \to \infty} \sum_{i=0}^{m-1} dV(t_i, t_{i+1}) \tag{16} $$

$$ V_1(t) = \int_{t_0}^{t} dV_1(s, s+0) = \lim_{m \to \infty} \sum_{i=0}^{m-1} dV_1(t_i, t_{i+1}) \tag{17} $$

$$ A_r(t) = \int_{t_0}^{t} dA_r(s, s+0) = \lim_{m \to \infty} \sum_{i=0}^{m-1} dA_r(t_i, t_{i+1}) \tag{18} $$

$$ A_p(t) = \int_{t_0}^{t} dA_p(s, s+0) = \lim_{m \to \infty} \sum_{i=0}^{m-1} dA_p(t_i, t_{i+1}) \tag{19} $$

Now some well known results from probability theory (37) will be used
to show that the limits (and hence the integrals) exist in the sense of
convergence in the mean square for each component of the matrices
(and vectors) in Eqs. (16)-(19).
Let $\chi(t, \tau)$ be a fundamental solution to the following matrix differential
equation:

$$ \frac{\partial \chi(t, \tau)}{\partial t} = C(t)\chi(t, \tau), \qquad \chi(\tau, \tau) = I \tag{20} $$

where $I$ is the identity matrix. Also, let $\chi_1(t, \tau)$ be a fundamental
solution of the matrix differential equation

$$ \frac{\partial \chi_1(t, \tau)}{\partial t} = \varphi(t)\chi_1(t, \tau), \qquad \chi_1(\tau, \tau) = I \tag{21} $$

Now consider the following stochastic integrals (37):

$$ V(t) = \chi(t, t_0)\Big[V(t_0) + \int_{t_0}^{t} \chi^{-1}(s, t_0)D(s)\, dL(s, s+0)\Big] \tag{22} $$

$$ V_1(t) = \chi(t, t_0)\Big[V_1(t_0) + \int_{t_0}^{t} \chi^{-1}(s, t_0)D(s)\, dL_1(s, s+0)\Big] \tag{23} $$

$$ A_r(t) = \chi_1(t, t_0)\Big[A_r(t_0) + \int_{t_0}^{t} \chi_1^{-1}(s, t_0)\psi(s)\, dM(s, s+0)\Big] \tag{24} $$

$$ A_p(t) = \chi_1(t, t_0)\Big[A_p(t_0) + \int_{t_0}^{t} \chi_1^{-1}(s, t_0)\psi(s)\, dM_p(s, s+0)\Big] \tag{25} $$

These stochastic integrals define random processes whose increments
satisfy Eqs. (16)-(19). Since the integrands $\chi^{-1}(s, t_0)D(s)$ and
$\chi_1^{-1}(s, t_0)\psi(s)$ are matrices whose elements are continuous, the theory
of stochastic integrals (37) can be used to define these stochastic processes.
In particular, the stochastic processes $L_1(t)$, $L(t)$, $M(t)$, and $M_p(t)$ have
independent and, therefore, uncorrelated increments. Consequently, each
component of the $V(t)$, $V_1(t)$, $A_r(t)$, and $A_p(t)$ processes can be shown to
exist in the mean square sense, i.e., the vector and matrix Eqs. (22)-(25)
are true for each component in the mean square sense. Consequently,
the limits on the right-hand sides of Eqs. (16)-(19) exist in the mean
square sense. Further discussions of stochastic integrals and additional
references are given in references (38, 39, and 40).

E. The Pay-Off Associated with a Linear Policy


Here it will be shown that the pay-off defined in Eq. (2) is a quadratic
function of the state vector's components when the plant's behavior is
governed by the stochastic differential equation given in Eq. (3). The
policy $u$ is assumed to be of the form

$$ u = J_1(t) + J_2(t)x(t) \tag{26} $$

Some or all of the elements of $J_1$ and $J_2$ may be random processes which
are continuous in the mean. In view of the discussions associated with
Eqs. (7), (11), (13), and (15), the stochastic differential equation which
describes the plant's behavior is assumed to be

$$ dx(t, t+h) = [A(t)h + dA^*(t, t+h)]x(t) + B(t)u(t)h + dV^*(t, t+h) + o(h), \qquad x(t_0) = x_0 \tag{27} $$

$dA^*$ refers to random processes defined in Eqs. (11) and (15), while
$dV^*$ refers to the type of random processes defined by Eqs. (7) and (13).
The $A(t)$ and $B(t)$ matrices are assumed to possess continuous elements
as before. Some of these elements may be random processes whose sample
functions are continuous in the mean. $x(t)$ is defined in terms of the
following stochastic integral:

$$ x(t) = x(t_0) + \int_{t_0}^{t} dx(s, s+0) \tag{28} $$

It will be desirable to decompose $x(t)$ into two components, $x_p(t)$ and
$x_h(t)$. In the course of an approach analogous to the one used in the
theory of ordinary linear differential equations, $x_h(t)$ will be called the
homogeneous component of the solution to Eq. (27) while $x_p(t)$ will be
called the particular component of the solution. These quantities are
defined by the following equations:

$$ dx_h(t, t+h) = [A(t)h + dA^*(t, t+h)]x_h(t) + o(h) \tag{29} $$

$$ x_h(t_0) = x(t_0) = x_0 \tag{30} $$

$$ dx_p(t, t+h) = [B(t)u(t)h + dV^*(t, t+h)] + [A(t)h + dA^*(t, t+h)]x_p(t) + o(h) \tag{31} $$

$$ x_p(t_0) = 0 \tag{32} $$

$$ x(t) = x_p(t) + x_h(t) \tag{33} $$

Equations (31) and (32) imply that $x_p(t)$ is independent of $x_0$ while
Eqs. (29) and (30) imply that $x_h(t)$ is a linear (and possibly random)
function of $x_0$. Therefore $x(t)$ is composed of two additive components.
One is a linear function of $x_0$ and the other is independent of $x_0$.
Consequently, the quantity defined in Eq. (2) will be a quadratic function of
the components of $x_0$.
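
The decomposition in Eqs. (29)-(33) can be illustrated numerically for a scalar plant. The Python sketch below is an assumption-laden illustration rather than part of the original argument: it uses constant coefficients, an open-loop control $u(t) = \cos t$ (that is, $J_2 = 0$ in Eq. (26)), hypothetical noise intensities, and a finite step in place of $h$. The same noise increments drive all three recursions, so the sum $x_h + x_p$ should reproduce $x$ and $x_h$ should scale linearly with $x_0$.

```python
import math
import random


def simulate_decomposition(x0, t_end=1.0, h=0.001, a=-0.5, b=1.0, seed=3):
    """Finite-step scheme for scalar versions of Eqs. (27), (29), and (31).

    The same increments dA* and dV* drive all three recursions. The noise
    intensities (0.2 and 0.3) and the control u(t) = cos(t) are hypothetical
    choices made for illustration.
    """
    rng = random.Random(seed)
    x, x_h, x_p = x0, x0, 0.0      # initial conditions from (27), (30), (32)
    for k in range(int(round(t_end / h))):
        t = k * h
        d_a = 0.2 * rng.gauss(0.0, math.sqrt(h))   # state-proportional disturbance
        d_v = 0.3 * rng.gauss(0.0, math.sqrt(h))   # additive disturbance
        u = math.cos(t)                            # open-loop control
        x += (a * h + d_a) * x + b * u * h + d_v   # Eq. (27)
        x_h += (a * h + d_a) * x_h                 # Eq. (29)
        x_p += (a * h + d_a) * x_p + b * u * h + d_v   # Eq. (31)
    return x, x_h, x_p
```

For example, `simulate_decomposition(2.0)` and `simulate_decomposition(1.0)` share the same noise realization, so their particular components coincide and the homogeneous component of the first is twice that of the second.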

II. Optimal Control in the Presence of State Vector Measurement Errors

A. Derivation of the Partial Differential Equation Satisfied by the Optimal Pay-Off
Here the partial differential equation which the optimal pay-off
function satisfies will be derived. The solution to this partial differential
equation determines both the optimal policy and the resulting pay-off.
This partial differential equation will be derived by the method of
invariant imbedding (1, 2, 43, 44). In this section the external random
disturbances are assumed to be absent. Therefore $N(t) = 0$ in Eq. (1)
and $dN(t, t+h) \equiv 0$ in Eq. (3). Different kinds of nonzero random
external disturbances will be considered in the following sections of
this chapter. The performance criterion or pay-off is, of course, defined
in Eq. (2).
Let $f(x, t_1, T)$ be the optimal pay-off when the state vector at time $t_1$
is $x$ and an optimal policy is used in the interval $t_1 \le t \le T$. That is,

$$ f(x, t_1, T) = \min_{u(x(t)+z(t),\,t)} E\Big\{ \frac{1}{2}\int_{t_1}^{T} [x'(t)P(t)x(t) + \lambda u'(t)R(t)u(t)]\, dt + \frac{1}{2}x'(T)Qx(T) \Big\} \tag{34} $$

Equation (34) can be rewritten as follows:

$$ f(x, t_1, T) = \min_{u(x(t)+z(t),\,t)} E\Big\{ \frac{1}{2}\int_{t_1}^{t_1+h} [x'(t)P(t)x(t) + \lambda u'(t)R(t)u(t)]\, dt + \frac{1}{2}\int_{t_1+h}^{T} [x'(t)P(t)x(t) + \lambda u'(t)R(t)u(t)]\, dt + \frac{1}{2}x'(T)Qx(T) \Big\} \tag{35} $$

Now, for $h$ suitably small, the state vector at time $t + h$ is [see Eq. (3)]

$$ x(t+h) = x(t) + dx(t, t+h) = x(t) + A(t)x(t)h + B(t)u(x(t) + z(t), t)h + o(h) \tag{36} $$


Here it is desirable to make use of Bellman's Principle of Optimality
(1, 43, 44). This principle is the following observation: "An optimal
policy has the property that, whatever the initial state and the initial
decision are, the remaining decisions must constitute an optimal policy
with regard to the state resulting from the first decision." In this discussion
$t_1 \le t \le t_1 + h$ can be considered to constitute the first stage of
a "multistage decision process." Likewise, $u(x(t) + z(t), t)$, $t_1 \le t \le
t_1 + h$, can be considered as the initial decision or control effort in this
multistage decision process. Thus the choice of $u$ in Eq. (35) which
minimizes the quantity in brackets on the right-hand side can be
considered to involve two stages. The first during the interval $t_1 \le t \le
t_1 + h$ affects the first term and the value of $x(t_1 + h)$ [given by Eq. (36)].
The remaining stages during the interval $t_1 + h \le t \le T$ are affected
by $x(t_1 + h)$ and the policy that is used during this interval. Consequently,
the principle of optimality and Eq. (36) can be used to rewrite
Eq. (35) in the following form:

$$ \begin{aligned} f(x, t_1, T) &= \min_{\substack{u(x(t)+z(t),\,t) \\ t_1 \le t \le t_1+h}} E\Big\{ \frac{1}{2}\int_{t_1}^{t_1+h} [x'(t)P(t)x(t) + \lambda u'(t)R(t)u(t)]\, dt \\ &\qquad + \min_{\substack{u(x(t)+z(t),\,t) \\ t_1+h < t \le T}} E\Big( \frac{1}{2}\int_{t_1+h}^{T} [x'(t)P(t)x(t) + \lambda u'(t)R(t)u(t)]\, dt + \frac{1}{2}x'(T)Qx(T) \Big) \Big\} \\ &= \min_{u} E\Big\{ \frac{1}{2}[x'P(t_1)x + \lambda u'R(t_1)u]h + o(h) + f\big(x + A(t_1)xh + B(t_1)u(t_1)h + o(h),\ t_1 + h,\ T\big) \Big\} \end{aligned} \tag{37} $$

Since the $u$ that minimizes the quantity in brackets on the right-hand
side of Eq. (37) is to be chosen from the linear functions of $x + z$,
the optimal pay-off $f$ will, in cases where it exists, be a quadratic function
of the components of $x$. Hence the right-hand side of Eq. (37) can be
expanded in a Taylor's series to yield

$$ f(x, t_1, T) = \min_{u} E\Big\{ \frac{1}{2}[x'Px + \lambda u'Ru]h + f(x, t_1, T) + \frac{\partial f}{\partial t_1}h + \nabla f'[Ax + Bu]h + o(h) \Big\} \tag{38} $$

The functional dependence of $u$ on $x + z$ and $t$, $A$ on $t_1$, etc., is suppressed
here for brevity. The symbol $\nabla f$ denotes an $n \times 1$ matrix whose
$i$th component is $\partial f/\partial x_i$. The partial derivatives with respect to the
components of $x$ and $t_1$ are evaluated at $x$ and $t = t_1$. Since $f(x, t_1, T)$ and
$\partial f/\partial t_1$ are independent of $u$, these quantities do not affect the operation
of minimization and therefore can be removed from the inside of the
brackets. Doing this and letting $h \to 0$ yields

$$ \frac{\partial f}{\partial t_1} = -\min_{u} E\Big\{ \frac{1}{2}[x'Px + \lambda u'Ru] + \frac{1}{2}\nabla f'[Ax + Bu] + \frac{1}{2}[u'B' + x'A']\nabla f \Big\} \tag{39} $$

Equation (39) can in turn be rewritten as follows by "completing the
square":

$$ \frac{\partial f}{\partial t_1} = -\min_{u} \Big\{ \overline{\frac{\lambda}{2}\Big(u' + \frac{1}{\lambda}\nabla f'BR^{-1}\Big) R \Big(u + \frac{1}{\lambda}R^{-1}B'\nabla f\Big)} - \frac{1}{2\lambda}\nabla f'BR^{-1}B'\nabla f + \frac{1}{2}x'A'\nabla f + \frac{1}{2}\nabla f'Ax + \frac{1}{2}x'Px \Big\} \tag{40} $$

The bar over the first term on the right-hand side denotes the expectation
$E$. The optimal policy is found by choosing $u$ in terms of $x(t_1) + z(t_1)$
which minimizes the quadratic form that is the first term in the brackets
on the right-hand side of Eq. (40). Now it is well known (41) that the
choice of $u$ which minimizes this quadratic form is

$$ u(x + z, t_1) = -\frac{1}{\lambda}R^{-1}(t_1)B'(t_1)\widehat{\nabla f}(x, t_1, T) = -\frac{1}{\lambda}R^{-1}(t_1)B'(t_1)E[\nabla f(x, t_1, T) \mid x + z] \tag{41} $$

$E[\nabla f(x, t_1, T) \mid x + z]$ denotes the conditional expectation of $\nabla f(x, t_1, T)$
given the observation of $x + z$. This result is derived in Appendix A
for reference. The substitution of Eq. (41) into Eq. (40) gives

$$ \frac{\partial f}{\partial t_1} = -\Big\{ \frac{1}{2}x'Px + \frac{1}{2\lambda}\overline{(\nabla f' - \widehat{\nabla f}')BR^{-1}B'(\nabla f - \widehat{\nabla f})} - \frac{1}{2\lambda}\nabla f'BR^{-1}B'\nabla f + \frac{1}{2}\nabla f'Ax + \frac{1}{2}x'A'\nabla f \Big\} \tag{42} $$


Equation (42) constitutes the partial differential equation which the
optimal pay-off must satisfy. The boundary condition which this solution
is required to satisfy is

$$ f(x, T, T) = \frac{1}{2}x'Qx \tag{43} $$

The optimal pay-off is, of course, obtained by letting $x = x_0$ (the initial
value of the state vector) and $t_1 = t_0$.

B. The Solution of the Partial Differential Equation for the Optimal Pay-Off
Here the solution to Eq. (42) which satisfies the boundary condition
given in Eq. (43) will be obtained. The approach that is used makes use
of the fact that $f(x, t_1, T)$ will be a quadratic function of the components
of the state vector since the optimal policy is to be obtained from linear
policies. This approach has been used by Merriam (9, 10) and Florentin
(11, 12), among others, in problems where measurement errors are
absent.
The approach involves assuming a solution of the form given below in
Eq. (44):

$$ f(x, t_1, T) = k_0(t_1, T) + x'K_1(t_1, T) + \frac{1}{2}x'K_2(t_1, T)x \tag{44} $$

The coefficients of the different powers of $x$ will be determined in terms
of solutions to certain ordinary differential equations. These equations
will be derived in the sequel. $k_0(t_1, T)$ is a scalar, $K_1(t_1, T)$ is an $n \times 1$
matrix and $K_2(t_1, T)$ is assumed to be a symmetric $n \times n$ matrix.
Equation (44) implies that

$$ \nabla f(x, t_1, T) = K_1(t_1, T) + K_2(t_1, T)x \tag{45} $$

$$ \widehat{\nabla f} = E[\nabla f(x, t_1, T) \mid x + z] = K_1(t_1, T) + K_2(t_1, T)(x + z) \tag{46} $$

Equation (46) is due to the fact that $Ez = 0$. Equations (44), (45), and (46)
can be substituted into Eq. (42) to yield

$$ \frac{dk_0}{dt_1} + x'\frac{dK_1}{dt_1} + \frac{1}{2}x'\frac{dK_2}{dt_1}x = -\Big\{ \frac{1}{2}x'Px + \frac{1}{2\lambda}\overline{z'K_2BR^{-1}B'K_2z} - \frac{1}{2\lambda}(K_1' + x'K_2)BR^{-1}B'(K_1 + K_2x) + \frac{1}{2}(K_1' + x'K_2)Ax + \frac{1}{2}x'A'(K_1 + K_2x) \Big\} \tag{47} $$

In Eq. (47), as before, the bar over the second term denotes the mathematical
expectation with respect to $z(t_1)$ and the independent variable $t_1$
has been suppressed. Equating coefficients of each power of $x$ on both
sides of Eq. (47) yields

$$ \frac{dk_0}{dt_1} = -\frac{1}{2\lambda}\overline{z'(t_1)K_2(t_1, T)B(t_1)R^{-1}(t_1)B'(t_1)K_2(t_1, T)z(t_1)} + \frac{1}{2\lambda}K_1'(t_1, T)B(t_1)R^{-1}(t_1)B'(t_1)K_1(t_1, T) \tag{48} $$

$$ \frac{dK_1}{dt_1} = -\Big(A'(t_1) - \frac{1}{\lambda}K_2(t_1, T)B(t_1)R^{-1}(t_1)B'(t_1)\Big)K_1(t_1, T) \tag{49} $$

$$ \frac{dK_2}{dt_1} = -P(t_1) + \frac{1}{\lambda}K_2(t_1, T)B(t_1)R^{-1}(t_1)B'(t_1)K_2(t_1, T) - K_2(t_1, T)A(t_1) - A'(t_1)K_2(t_1, T) \tag{50} $$

Equation (50) is a nonlinear ordinary matrix differential equation called
the matrix Riccati equation (45-48). Its properties have been discussed
in the literature and its solution can be expressed in terms of the fundamental
solutions of a $2n \times 2n$ matrix linear ordinary differential equation
(45). Once $K_2(t_1, T)$ is obtained by this means or by numerical
integration, then Eq. (49) becomes a linear vector matrix differential
equation whose solution can be expressed in terms of fundamental
solutions (49, 50). When $K_1(t_1, T)$ and $K_2(t_1, T)$ have been obtained,
then Eq. (48) can be integrated directly.
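
The numerical integration of the Riccati equation mentioned above can be sketched for a scalar plant with constant coefficients. The Python code below is a minimal illustration, not the method of the references: it integrates the scalar form of Eq. (50) backward from the terminal condition $K_2(T, T) = Q$ with a classical fourth-order Runge-Kutta step. The function names and the choice of integrator are assumptions made here. For a long interval the solution at $t_0$ should approach the positive root of the algebraic equation obtained by setting $dK_2/dt_1 = 0$.

```python
import math


def riccati_rhs(k, a, b, p, r, lam):
    """Right-hand side of the scalar form of Eq. (50): dK2/dt1."""
    return -p + (b * b / (lam * r)) * k * k - 2.0 * a * k


def solve_riccati_backward(t0, t_end, q, a, b, p, r, lam, n_steps=2000):
    """Integrate Eq. (50) from K2(T, T) = q back to t0 with classical RK4.

    Returns the list [K2(t0, T), ..., K2(T, T)] on a uniform grid.
    """
    h = (t_end - t0) / n_steps
    k = q
    values = [k]
    for _ in range(n_steps):
        # Advance from t to t - h, i.e. apply RK4 with a step of -h.
        s1 = riccati_rhs(k, a, b, p, r, lam)
        s2 = riccati_rhs(k - 0.5 * h * s1, a, b, p, r, lam)
        s3 = riccati_rhs(k - 0.5 * h * s2, a, b, p, r, lam)
        s4 = riccati_rhs(k - h * s3, a, b, p, r, lam)
        k -= (h / 6.0) * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
        values.append(k)
    values.reverse()
    return values


def steady_state_root(a, b, p, r, lam):
    """Positive root of 0 = -p + (b^2 / (lam r)) K^2 - 2 a K."""
    c = b * b / (lam * r)
    return (a + math.sqrt(a * a + c * p)) / c
```

For instance, with $a = b = p = r = \lambda = 1$, $Q = 0$, and an interval of length 10, the first element of `solve_riccati_backward(0.0, 10.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0)` should agree closely with `steady_state_root(1.0, 1.0, 1.0, 1.0, 1.0)`, which equals $1 + \sqrt{2}$.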
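The boundary condition specified in Eq. (43) requires that

$$ k_0(T, T) = 0 \tag{51} $$

$$ K_1(T, T) = 0 \tag{52} $$

$$ K_2(T, T) = Q \tag{53} $$

Equations (44) and (48) through (53) imply that

$$ K_1(t_1, T) = 0, \qquad t_0 \le t_1 \le T \tag{54} $$

$$ f(x_0, t_0, T) = \frac{1}{2}x_0'K_2(t_0, T)x_0 + \frac{1}{2\lambda}\int_{t_0}^{T} \overline{z'(t)K_2(t, T)B(t)R^{-1}(t)B'(t)K_2(t, T)z(t)}\, dt \tag{55} $$

Also, Eqs. (41) and (46) imply that the optimal policy is given by

$$ \bar{u}(x(t_1) + z(t_1), t_1) = -\frac{1}{\lambda}R^{-1}(t_1)B'(t_1)K_2(t_1, T)(x(t_1) + z(t_1)) \tag{56} $$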

Two items are worth noting at this point. One is that the pay-off function
given by Eq. (55) is composed of two terms. One term depends on the
measurement errors and is independent of the initial value of the state
vector. The other term is, conversely, independent of the noise and a
function of the state vector. In the absence of measurement errors, the
pay-off is given by the first term on the right-hand side of Eq. (55).
The other item is that the optimal policy when measurement errors are
present consists of using $x(t_1) + z(t_1)$ in place of $x(t_1)$ in the optimal
policy when measurement errors are absent. This latter phenomenon
is called a "certainty equivalence principle" by some authors (20). It
will be seen in later parts of this chapter that there are versions of
this problem for which the certainty equivalence principle does not
hold.
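
The two-term structure of the pay-off can be examined numerically in the scalar, constant-coefficient case. The Python sketch below is an illustration under stated assumptions and not part of the original derivation. It solves the scalar form of Eq. (50) by a simple backward Euler-type recursion, applies the policy of Eq. (56) with $z = 0$, and accumulates the pay-off of Eq. (2) along the resulting trajectory; the result should agree with $\frac{1}{2}K_2(t_0, T)x_0^2$ up to discretization error. The function `noise_penalty` evaluates the measurement-error term of the pay-off under the additional hypothetical assumption that $E[z^2(t)]$ is a known constant.

```python
def riccati_grid(t_end, n, q, a, b, p, r, lam):
    """Euler-type solution of the scalar Eq. (50) on t = 0, h, ..., T."""
    h = t_end / n
    k = [0.0] * (n + 1)
    k[n] = q                                   # boundary condition (53)
    for i in range(n, 0, -1):
        dk = -p + (b * b / (lam * r)) * k[i] ** 2 - 2.0 * a * k[i]
        k[i - 1] = k[i] - h * dk               # step backward in time
    return k


def closed_loop_cost(x0, t_end, n, q, a, b, p, r, lam, k):
    """Pay-off of Eq. (2) along the noise-free trajectory under Eq. (56)."""
    h = t_end / n
    x, cost = x0, 0.0
    for i in range(n):
        u = -(1.0 / lam) * (b / r) * k[i] * x          # Eq. (56) with z = 0
        cost += 0.5 * (p * x * x + lam * r * u * u) * h
        x += (a * x + b * u) * h                       # Eq. (36) without o(h)
    return cost + 0.5 * q * x * x


def noise_penalty(t_end, n, b, r, lam, k, z_var):
    """Measurement-error term of the pay-off when E[z(t)^2] = z_var.

    A constant error variance is an assumption made for illustration.
    """
    h = t_end / n
    return sum((k[i] * b) ** 2 / r * z_var * h for i in range(n)) / (2.0 * lam)
```

A typical use is to compute `k = riccati_grid(2.0, 4000, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0)` and compare `closed_loop_cost(1.0, 2.0, 4000, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, k)` with `0.5 * k[0]`. The value returned by `noise_penalty` does not depend on $x_0$, in line with the first item noted above.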

C. Estimation in the Optimal Policy

Aside from the assumptions regarding $z(t_1)$ that were stated at the
beginning of the chapter, little has been said concerning the statistical
characteristics of the random process $z(t_1)$ which represents the measurement
errors. The part of the pay-off which depends on the measurement
errors is given by the second term on the right-hand side of Eq. (55).
Now in most practical situations the measurement errors $z(t_1)$ are the
errors that exist in the result of a physically realizable filtering operation
on noisy observations of some or all of the components of the state
vector. At this point it is appropriate to question whether or not it is
possible to optimize the pay-off by varying the estimation procedure and
hence $z(t_1)$.
Since $R^{-1}(t_1)$ is positive definite, the integrand in the second term
on the right-hand side of Eq. (55) is positive for all $z(t_1)$ such that
$E[z(t_1)z'(t_1)]$ is a non-null matrix. Kalman (41) has shown that a filter
which will minimize this integrand will also minimize $E[z'(t_1)z(t_1)]$.
Consequently, the filtering operation which minimizes the pay-off
is one which minimizes $E[z'(t_1)z(t_1)]$ for $t_0 \le t_1 \le T$, i.e., the
optimal least squares filter for this class of measurement errors.
Thus, the optimal policy in the presence of noise involves using the
optimal least squares estimate of $x(t_1)$ in place of $x(t_1)$ in the optimal
policy where measurement errors are absent for this case. In other
words, when measurement errors are introduced into the problem,
the optimal policy is modified by replacing $x(t_1)$ with the optimal least
squares estimate of $x(t_1)$.
Kalman and Koepcke (17) conjectured that this is true for the discrete
case but did not prove it. Gunckel (22), Gunckel and Franklin (23),
and Joseph and Tou (24) proved that this is true for discrete versions
of this problem. The above discussion proves the result for the continuous
version which is considered in this chapter.

D. The Sufficiency Question and Local Optimality


In the previous part of this section a partial differential equation
(together with the associated boundary conditions) which the optimal
pay-off must satisfy was derived. The solution to this partial differential
equation was obtained. These results provide the optimal policy and
constitute necessary conditions for optimality. Here the difference
between the pay-off resulting from any given suboptimal linear policy
and the optimal pay-off will be examined. This difference in the pay-offs
will show the conditions under which the necessary conditions (which are
not sufficient to guarantee optimality) do yield a policy that is optimal with
respect to other linear policies. In addition, the question of whether the
policy is optimal in a "local" or "global" sense is also of interest. That
is, a policy may be optimal with respect to only those linear policies
which are close (in some sense) to it or with respect to all linear policies.
This question will be considered here too.
For computational convenience, the suboptimal linear policies will be
written in the form of perturbations from the optimal policy [see Eq.
(56)], i.e.,

$$ u^*(x + z, t) = \bar{u} + \delta\bar{u} = -\frac{1}{\lambda}R^{-1}(t)B'(t)\big[E_1(t) + K_1(t, T) + (K_2(t, T) + E_2(t))(x + z)\big] \tag{57} $$
$E_1(t)$ is assumed to be an $n \times 1$ matrix whose elements are continuous
functions of $t$ and $E_2(t)$ is an $n \times n$ matrix which is not necessarily symmetric
whose elements are continuous functions of $t$. The results in the
first section show that the pay-off that will result from the use of this
suboptimal policy is a quadratic function of the initial state vector
$x(t_0) = x_0$. Let the pay-off associated with the suboptimal policy be
designated by $g(x_0, t_0, T)$. Then we have

$$ g(x_0, t_0, T) = \frac{1}{2}E\Big\{ \int_{t_0}^{T} [x'(t)P(t)x(t) + \lambda u^{*\prime}(t)R(t)u^*(t)]\, dt + x'(T)Qx(T) \Big\} \tag{58} $$
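
The comparison between the optimal and a perturbed linear policy can be illustrated for a scalar plant without measurement errors. The Python sketch below is an illustration only, with constant coefficients, $E_1 = 0$, $z = 0$, and a constant gain offset standing in for $E_2(t)$; all names and numerical values are assumptions. It evaluates the pay-off of Eq. (58) for the policy of Eq. (57) by direct simulation, so that a zero offset recovers the optimal pay-off.

```python
def cost_with_gain_offset(e2, x0=1.0, t_end=2.0, n=4000, q=0.5,
                          a=1.0, b=1.0, p=1.0, r=1.0, lam=1.0):
    """Pay-off of Eq. (58) for the scalar policy of Eq. (57) with E1 = 0, z = 0.

    The constant e2 plays the role of E2(t); e2 = 0 gives the optimal policy.
    """
    h = t_end / n
    # Backward Euler-type recursion for the scalar form of Eq. (50).
    k = [q] * (n + 1)
    for i in range(n, 0, -1):
        dk = -p + (b * b / (lam * r)) * k[i] ** 2 - 2.0 * a * k[i]
        k[i - 1] = k[i] - h * dk
    # Forward simulation of the plant under the perturbed policy.
    x, cost = x0, 0.0
    for i in range(n):
        u = -(1.0 / lam) * (b / r) * (k[i] + e2) * x
        cost += 0.5 * (p * x * x + lam * r * u * u) * h
        x += (a * x + b * u) * h
    return cost + 0.5 * q * x * x
```

In this example both `cost_with_gain_offset(0.5)` and `cost_with_gain_offset(-0.5)` should exceed `cost_with_gain_offset(0.0)`, which is consistent with the optimal policy being best among the linear policies considered.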
Now the same procedure of invariant imbedding which was used in
Eqs. (34)-(38) to derive Eq. (39) can be used to derive the partial
differential equation which $g(x_0, t_0, T)$ must satisfy and its boundary
conditions. This derivation will be sketched briefly here since it is very
similar to that given in Eqs. (34)-(38).
Equation (58) can be modified to read

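$$g(x, t_1, T) = E\Bigl\{ \tfrac{1}{2} \int_{t_1}^{t_1+h} [x'Px + \lambda u^{*\prime}Ru^*]\, dt + \tfrac{1}{2} \int_{t_1+h}^{T} [x'Px + \lambda u^{*\prime}Ru^*]\, dt + \tfrac{1}{2} x'(T)Qx(T) \Bigr\} \tag{59}$$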

As before, $E$ denotes the mathematical expectation with respect to the
measurement errors and the dependence on time has been suppressed
for brevity. Then

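$$\begin{aligned}
g(x, t_1, T) &= E\Bigl\{ \tfrac{1}{2} \int_{t_1}^{t_1+h} [x'Px + \lambda u^{*\prime}Ru^*]\, dt + g\bigl(x + dx(t_1, t_1+h),\, t_1+h,\, T\bigr) \Bigr\} \\
&= E\Bigl\{ \tfrac{1}{2} [x'Px + \lambda u^{*\prime}Ru^*]h + g(x, t_1, T) + \frac{\partial g}{\partial t_1} h + \nabla g'[Ax + Bu^*]h + o(h) \Bigr\}
\end{aligned} \tag{60}$$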

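The manipulations indicated in Eq. (60) make use of Eq. (36). Rearranging
Eq. (60) and using Eq. (57) yields

$$\frac{\partial g}{\partial t_1} = -E\Bigl\{ \tfrac{1}{2} x'Px + \tfrac{\lambda}{2} (\hat{u}' + \delta u')R(\hat{u} + \delta u) + \tfrac{1}{2} \nabla g'[Ax + B(\hat{u} + \delta u)] + \tfrac{1}{2} [(\hat{u}' + \delta u')B' + x'A']\nabla g \Bigr\} \tag{61}$$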

As before, the boundary condition on $g$ is

$$g(x, T, T) = \tfrac{1}{2} x'Qx \tag{62}$$

Equation (41) yields the following result:

$$\lambda\, \delta u' R \hat{u} = -\delta u' B' \overline{\nabla f} \tag{63}$$

The use of Eq. (63) and the subtraction of Eq. (39) from Eq. (61) (after
the minimizing value $\hat{u}$ has been substituted in the brackets on the
right-hand side of Eq. (39)) yields

$$\frac{\partial p}{\partial t_1} = -E\Bigl\{ \tfrac{\lambda}{2} \delta u' R\, \delta u + \tfrac{1}{2} \delta u' B'(\nabla f - \overline{\nabla f}) + \tfrac{1}{2} (\nabla f' - \overline{\nabla f}')B\, \delta u + \tfrac{1}{2} \nabla p'[Ax + B(\hat{u} + \delta u)] + \tfrac{1}{2} [x'A' + (\hat{u}' + \delta u')B']\nabla p \Bigr\} \tag{64}$$

where

$$p(x, t_1, T) = g(x, t_1, T) - f(x, t_1, T)$$

Equations (43) and (62) imply that the boundary condition for $p$ is

$$p(x, T, T) = 0 \tag{65}$$

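It will be convenient to use Eq. (41) to rewrite Eq. (64) in the following
form:

$$\begin{aligned}
\frac{\partial p}{\partial t_1} = E\Bigl\{ &-\tfrac{\lambda}{2} \Bigl( \delta u' - \tfrac{1}{\lambda} (\overline{\nabla f}' - \nabla f')BR^{-1} \Bigr) R \Bigl( \delta u - \tfrac{1}{\lambda} R^{-1}B'(\overline{\nabla f} - \nabla f) \Bigr) \\
&+ \tfrac{1}{2\lambda} (\nabla f' - \overline{\nabla f}')BR^{-1}B'(\nabla f - \overline{\nabla f}) \\
&- \tfrac{1}{2} \nabla p' \Bigl[ Ax + B\Bigl( \delta u - \tfrac{1}{\lambda} R^{-1}B'\overline{\nabla f} \Bigr) \Bigr] \\
&- \tfrac{1}{2} \Bigl[ x'A' + \Bigl( \delta u' - \tfrac{1}{\lambda} \overline{\nabla f}'BR^{-1} \Bigr)B' \Bigr] \nabla p \Bigr\}
\end{aligned} \tag{66}$$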

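The same technique that was used to solve Eq. (42) will be used to
solve Eq. (64). A solution of the following form is assumed to exist:

$$p(x, t_1, T) = i_0(t_1, T) + x'I_1(t_1, T) + \tfrac{1}{2} x'I_2(t_1, T)x \tag{67}$$

As before, $i_0$ is a scalar, $I_1$ an $n \times 1$ matrix and $I_2$ an $n \times n$ symmetric
matrix. Substituting Eq. (67) into Eq. (66) yields the following result:

$$\begin{aligned}
\frac{di_0}{dt_1} + x'\frac{dI_1}{dt_1} + \tfrac{1}{2} x'\frac{dI_2}{dt_1}x = {} &\tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} \\
&- \tfrac{1}{2\lambda} E\bigl\{ (E_1' + (x' + z')E_2' + z'K_2)BR^{-1}B'(E_1 + E_2(x + z) + K_2z) \bigr\} \\
&- \tfrac{1}{2} (I_1' + x'I_2)\bigl[ Ax - \tfrac{1}{\lambda} BR^{-1}B'(E_1 + K_1 + (E_2 + K_2)x) \bigr] \\
&- \tfrac{1}{2} \bigl[ x'A' - \tfrac{1}{\lambda} (E_1' + K_1' + x'(E_2' + K_2))BR^{-1}B' \bigr](I_1 + I_2x)
\end{aligned} \tag{68}$$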


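It is necessary to use the fact that $E z(t_1) = 0$ to derive Eq. (68). Equating
terms involving identical powers of $x$ on both sides of Eq. (68) yields

$$\begin{aligned}
\frac{di_0}{dt_1} = {} &-\tfrac{1}{2\lambda} E\bigl\{ (E_1' + z'(K_2 + E_2'))BR^{-1}B'(E_1 + (K_2 + E_2)z) \bigr\} + \tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} \\
&+ \tfrac{1}{2\lambda} I_1'BR^{-1}B'(E_1 + K_1) + \tfrac{1}{2\lambda} (E_1' + K_1')BR^{-1}B'I_1
\end{aligned} \tag{69}$$

$$\frac{dI_1}{dt_1} = -\bigl[ A' - \tfrac{1}{\lambda} (E_2' + K_2)BR^{-1}B' \bigr] I_1 + \tfrac{1}{\lambda} I_2BR^{-1}B'(E_1 + K_1) - \tfrac{1}{\lambda} E_2'BR^{-1}B'E_1 \tag{70}$$

$$\frac{dI_2}{dt_1} = -\tfrac{1}{\lambda} E_2'BR^{-1}B'E_2 - I_2 \bigl[ A - \tfrac{1}{\lambda} BR^{-1}B'(E_2 + K_2) \bigr] - \bigl[ A' - \tfrac{1}{\lambda} (E_2' + K_2)BR^{-1}B' \bigr] I_2 \tag{71}$$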

Equations (69)-(71) are ordinary linear differential equations. The
boundary conditions for these equations are

$$I_2(T, T) = 0 \tag{72}$$

$$I_1(T, T) = 0 \tag{73}$$

$$i_0(T, T) = 0 \tag{74}$$

Equation (67) can be rewritten in the following form by "completing
the square":

$$\begin{aligned}
p(x, t_1, T) &= \tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1) + i_0 - \tfrac{1}{2} I_1'I_2^{-1}I_1 \\
&= \tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1) + \theta(t_1, T)
\end{aligned} \tag{75}$$

The results given in Section VI, A imply that $I_2(t_1, T)$ is positive definite
since $R^{-1}(t_1)$ is positive definite. Hence the first term on the right-hand
side in the second form of Eq. (75) is non-negative. The question of
interest now becomes under what conditions $\theta(t_1, T) \ge 0$ and, if
$\theta(t_1, T) < 0$, under what conditions $p(x, t_1, T) \ge 0$. Note that the
first term on the right-hand side of Eq. (75) is independent of the
measurement errors $z(t)$.
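For a scalar state the completed-square form of Eq. (75) shows directly where $p$ can be negative: $p(x) = \tfrac{1}{2} I_2 (x + I_1/I_2)^2 + \theta$ with $\theta = i_0 - I_1^2/(2I_2)$. The following sketch computes $\theta$ and the interval of $x$ on which $p < 0$. The function names and sample coefficients are assumptions made here for illustration, and $I_2 > 0$ is assumed.

```python
import math


def complete_square(i0, I1, I2):
    """Return (center, theta) such that
    i0 + I1*x + 0.5*I2*x**2 == 0.5*I2*(x - center)**2 + theta, for I2 > 0."""
    if I2 <= 0:
        raise ValueError("I2 must be positive in this scalar sketch")
    center = -I1 / I2
    theta = i0 - 0.5 * I1 * I1 / I2
    return center, theta


def negative_region(i0, I1, I2):
    """Open interval of x on which p(x) < 0, or None when theta >= 0."""
    center, theta = complete_square(i0, I1, I2)
    if theta >= 0:
        return None
    half_width = math.sqrt(-2.0 * theta / I2)
    return (center - half_width, center + half_width)


if __name__ == "__main__":
    print(complete_square(-1.0, 2.0, 4.0))
    print(negative_region(-1.0, 2.0, 4.0))
    print(negative_region(1.0, 2.0, 4.0))
```

When $\theta \ge 0$ the perturbed policy is never better than the optimal one. When $\theta < 0$ it is better only for states inside the interval, which is the scalar version of the "local" behavior discussed in the text.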
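Differentiating $\theta$ with respect to $t_1$ yields

$$\frac{d\theta}{dt_1} = \frac{di_0}{dt_1} - \tfrac{1}{2} \frac{dI_1'}{dt_1} I_2^{-1}I_1 - \tfrac{1}{2} I_1' \frac{dI_2^{-1}}{dt_1} I_1 - \tfrac{1}{2} I_1'I_2^{-1} \frac{dI_1}{dt_1} \tag{76}$$

But, from Eq. (71)

$$\frac{dI_2^{-1}}{dt_1} = -I_2^{-1} \frac{dI_2}{dt_1} I_2^{-1} = \tfrac{1}{\lambda} I_2^{-1}E_2'BR^{-1}B'E_2I_2^{-1} + \bigl( A - \tfrac{1}{\lambda} BR^{-1}B'(E_2 + K_2) \bigr) I_2^{-1} + I_2^{-1} \bigl( A' - \tfrac{1}{\lambda} (E_2' + K_2)BR^{-1}B' \bigr) \tag{77}$$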

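The substitution of Eqs. (77), (69), and (70) into Eq. (76) yields after
some algebraic manipulation

$$\begin{aligned}
\frac{d\theta}{dt_1} = {} &-\tfrac{1}{2\lambda} (E_1' - I_1'I_2^{-1}E_2')BR^{-1}B'(E_1 - E_2I_2^{-1}I_1) \\
&+ \tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} - \tfrac{1}{2\lambda} E\{ z'(K_2 + E_2')BR^{-1}B'(K_2 + E_2)z \}
\end{aligned} \tag{78}$$

Equation (78) can be integrated to yield

$$\begin{aligned}
\theta(t_1, T) = {} &\tfrac{1}{2\lambda} \int_{t_1}^{T} E\{ z'(K_2 + E_2')BR^{-1}B'(K_2 + E_2)z \}\, dt \\
&+ \tfrac{1}{2\lambda} \int_{t_1}^{T} (E_1' - I_1'I_2^{-1}E_2')BR^{-1}B'(E_1 - E_2I_2^{-1}I_1)\, dt \\
&- \tfrac{1}{2\lambda} \int_{t_1}^{T} E\{ z'K_2BR^{-1}B'K_2z \}\, dt
\end{aligned} \tag{79}$$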

The fact that $R^{-1}$ is positive definite (by assumption) implies that the
integrand in the third term on the right-hand side of Eq. (79) is
non-negative. Consequently, this term, which is independent of the control
perturbations $E_1$ and $E_2$, will always make a negative contribution to
$\theta(t_1, T)$ when measurement errors are present. When measurement
errors are absent, this term vanishes. Under these conditions (where
$z = 0$), since the integrands in the first two terms are always positive
and since $I_2$ is non-negative definite, we have that $p > 0$ for all $E_2$ and
$E_1$. Hence the policy given by Eq. (56) is indeed optimal when $z(t) = 0$,
$t_1 \le t \le T$. Similarly, if $E_2 = 0$, the first and third terms on the
right-hand side of Eq. (79) cancel and we have $\theta(t_1, T) > 0$ for all nonzero
$E_1$ and hence, when measurement errors are present ($z(t) \ne 0$) and
$E_2 = 0$, the policy given by Eq. (56) is optimal (compared with all
policies for which $E_1 \ne 0$ and $E_2 = 0$). However, if $E_2 \ne 0$ and
measurement errors are present, then Eq. (79) indicates that there are certain
combinations of $E_2$ and levels of measurement errors (as defined in
terms of the third term on the right-hand side of Eq. (79)) for which
$\theta(t_1, T) < 0$ and certain values of $x_0$ for which

$$p(x_0, t_0, T) = \tfrac{1}{2} (x_0' + I_1'I_2^{-1}) I_2 (x_0 + I_2^{-1}I_1) + \theta(t_0, T) < 0$$

Thus, as the levels of measurement errors increase there will be more
matrices $E_1(t)$ and $E_2(t)$ and values of $x_0$ for which $p(x_0, t_0, T)$ is
negative and hence the pay-off can be lowered by using a policy different from
that stated in Eq. (56). Thus it seems appropriate at this point to state
that the policy given by Eq. (56) is optimal in a local rather than a global
sense. In other words, the class of linear policies in which Eq. (56)
provides the optimal policy varies with both $x_0$ and the level of
measurement errors (as specified by the third term on the right-hand side of
Eq. (79)).
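The sign argument above can be checked in the scalar case. With $s = b^2/r$ and $E[z^2] = \sigma^2$, the rate of change of $\theta$ reduces to $d\theta/dt_1 = (s/2\lambda)\,[\,-(E_1 - E_2 I_1/I_2)^2 + \sigma^2 (K_2^2 - (K_2 + E_2)^2)\,]$. Since $\theta(T, T) = 0$, a positive rate over the interval produces a negative $\theta(t_1, T)$. The sketch below only evaluates this rate. The function name and the sample values are assumptions, and $I_1$, $I_2$ are treated as given numbers rather than being integrated from their own differential equations.

```python
def theta_rate(E1, E2, I1, I2, K2, sigma2, s, lam):
    """Scalar d(theta)/dt1 for perturbations E1, E2 and noise variance sigma2."""
    gain_term = -(E1 - E2 * I1 / I2) ** 2
    noise_term = sigma2 * (K2 ** 2 - (K2 + E2) ** 2)
    return s * (gain_term + noise_term) / (2.0 * lam)


if __name__ == "__main__":
    # E2 = 0: the two noise terms cancel, so the rate is never positive.
    print(theta_rate(E1=0.3, E2=0.0, I1=0.1, I2=1.0, K2=1.0, sigma2=100.0, s=1.0, lam=1.0))
    # E2 between -2*K2 and 0 lowers the gain applied to the noisy measurement;
    # with enough noise the rate becomes positive, so theta goes negative.
    print(theta_rate(E1=0.0, E2=-0.5, I1=0.0, I2=1.0, K2=1.0, sigma2=4.0, s=1.0, lam=1.0))
```

In this scalar picture the perturbations that can beat the nominal policy are those that reduce the magnitude of the feedback gain, which agrees with the heuristic that a noisy measurement should be trusted less.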
An illustration of this phenomenon is provided by the following
heuristic example. Consider a stable linear system whose performance
criterion is defined by Eq. (2). In the case where large measurement
errors are present and $x_0$ is arbitrarily close to the origin, it is
heuristically obvious that the best policy is to make $u = 0$ to minimize the
performance criterion. This is due to the fact that when large measurement
errors are present, the policy specified by Eq. (56) (or any other policy
which is a linear function of the noisy measurements of the state vector)
will displace the state from the origin. When this occurs the pay-off will
obviously be increased above that which would be obtained if a policy
of $u = 0$ were used. Hence, in this case a policy of $u = 0$ during the
interval $t_0 < t < T$ yields a lower pay-off than does the policy given
by Eq. (56). This conclusion can be substantiated by straightforward
calculations.
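One way to carry out such a calculation for a scalar system is sketched below. Several assumptions are made here and are not taken from the text: the system is $\dot{x} = ax + bu$ with constant coefficients and $a \ne 0$; the measurement error has a constant variance `sigma2`; the gain $K_2$ obeys the scalar form of the Riccati equation (90) with $K_2(T, T) = q$; and the noise-dependent part of the pay-off obeys $dk_0/dt_1 = -(1/2\lambda)\,\sigma^2 K_2^2 b^2/r$ with $k_0(T, T) = 0$, which is the form of Eq. (109) with $K_1 = 0$. All function names are hypothetical.

```python
import math


def riccati_k2(a, b, p, q, r, lam, T, n=2000):
    """Integrate dK2/dt = -p + (b*b/r)*K2**2/lam - 2*a*K2 backward from
    K2(T) = q with classical RK4. Returns K2 on the grid t_i = i*T/n."""
    h = T / n
    s = b * b / r

    def rhs(K):
        return -p + s * K * K / lam - 2.0 * a * K

    k2 = [0.0] * (n + 1)
    k2[n] = q
    for i in range(n, 0, -1):
        K = k2[i]
        s1 = rhs(K)
        s2 = rhs(K - 0.5 * h * s1)
        s3 = rhs(K - 0.5 * h * s2)
        s4 = rhs(K - h * s3)
        k2[i - 1] = K - h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return k2


def linear_policy_payoff(x0, sigma2, a, b, p, q, r, lam, T, n=2000):
    """Pay-off k0 + 0.5*K2*x0**2 of the noisy linear feedback policy."""
    k2 = riccati_k2(a, b, p, q, r, lam, T, n)
    h = T / n
    s = b * b / r
    integrand = [sigma2 * s * K * K / (2.0 * lam) for K in k2]
    k0 = h * (sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))  # trapezoid rule
    return k0 + 0.5 * k2[0] * x0 * x0


def zero_control_payoff(x0, a, p, q, T):
    """Pay-off of u = 0, for which x(t) = x0*exp(a*t); requires a != 0."""
    running = p * (math.exp(2.0 * a * T) - 1.0) / (2.0 * a)
    return 0.5 * x0 * x0 * (running + q * math.exp(2.0 * a * T))


if __name__ == "__main__":
    params = dict(a=-1.0, b=1.0, p=1.0, q=1.0, r=1.0, lam=1.0, T=1.0)
    for x0 in (0.01, 10.0):
        lin = linear_policy_payoff(x0, 1.0, **params)
        zero = zero_control_payoff(x0, params["a"], params["p"], params["q"], params["T"])
        print(x0, lin, zero)
```

Under these assumptions the zero-control pay-off is the smaller one when $x_0$ is close to the origin, while the linear feedback policy is better for large $x_0$, where the quadratic term outweighs the noise penalty $k_0$.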

III. Optimal Control in the Presence of Measurement
Errors and Continuous Random Disturbances

A. The Optimal Policy and Pay-Off for Disturbances which are
Independent of the State Vector

In the previous section of this chapter the optimal control problem
under consideration was discussed without external random disturbances.
In this section external random disturbances are considered which are
generated by a continuous random process with independent increments.
In the next section disturbances generated from random processes with
independent increments whose sample functions are discontinuous will
be discussed. Continuous disturbances which are independent of the
state vector will be considered first.
Equation (3) is assumed to describe the behavior of the state vector
with $dN(t, t + h)$ defined by Eq. (7). Consequently, we have
$$dx(t, t + h) = A(t)x(t)h + B(t)u(x(t) + z(t), t)h + C(t)V(t)h + D(t)\, dL(t, t + h) + o(h) \tag{80}$$

where the quantities in Eq. (80) have been defined in the discussions
associated with Eqs. (7) and (3); $V(t)$ is defined by Eq. (16).
Now the partial differential equation which the optimal pay-off
$f(x_0, t_0, T)$ satisfies will be derived. Since the derivation is quite similar
to the one given in the last section, only the highlights of this derivation
will be given to clarify the differences between this section and the last
section. The similarities will be omitted here to the greatest extent
possible. Thus,

$$\begin{aligned}
f(x, t_1, T) &= \min_{u} E\Bigl\{ \tfrac{1}{2} \int_{t_1}^{T} [x'Px + \lambda u'Ru]\, dt + \tfrac{1}{2} x'(T)Qx(T) \Bigr\} \\
&= \min_{u} E\Bigl\{ \tfrac{1}{2} [x'Px + \lambda u'Ru]h + o(h) + f\bigl(x + dx(t_1, t_1 + h),\, t_1 + h,\, T\bigr) \Bigr\}
\end{aligned} \tag{81}$$

Expanding the last term in the brackets on the right-hand side of Eq.
(81), using Eqs. (80), (5), and (6), and letting $h \to 0$ yields

= - min \hx'Px -^Vf'BR-iB'Vf


u
ctl ( 2Λ

+ ^ ( u ' + I Vf'BR-ή R ( V L + \ R-'B'Vf)

+ Vf'[Ax + CV + D a ] + 1 V V V DjkDik aÜ (82)

clk(t) is the &th diagonal element of t h e matrix defined in E q . (6). L e t


t h e rj-vector μ(ί) be defined by t h e following e q u a t i o n :

(83)

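Thus Eq. (82) becomes

$$\begin{aligned}
\frac{\partial f}{\partial t_1} = -\min_{u} E\Bigl\{ &\tfrac{1}{2} x'Px - \tfrac{1}{2\lambda} \nabla f'BR^{-1}B'\nabla f + \tfrac{\lambda}{2} \Bigl( u' + \tfrac{1}{\lambda} \nabla f'BR^{-1} \Bigr) R \Bigl( u + \tfrac{1}{\lambda} R^{-1}B'\nabla f \Bigr) \\
&+ \tfrac{1}{2} \nabla f'[Ax + C\bar{V} + D\alpha] + \tfrac{1}{2} [\alpha'D' + \bar{V}'C' + x'A']\nabla f + \tfrac{1}{2} \mu'D'[(\nabla\nabla')f]D\mu \Bigr\}
\end{aligned} \tag{84}$$

$[(\nabla\nabla')f]$ represents the matrix of second partial derivatives of $f$ with
respect to the components of the state vector. $\bar{V}(t_1)$ is the expected value
of $V(t_1)$ and is defined by

$$\bar{V}(t_1) = \Phi(t_1, t_0) \Bigl[ \bar{V}(t_0) + \int_{t_0}^{t_1} \Phi^{-1}(t, t_0)D(t)\alpha(t)\, dt \Bigr] \tag{85}$$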

$\Phi(t, t_0)$ is the fundamental solution to the following matrix differential
equation:

$$\frac{d\Phi(t_1, t_0)}{dt_1} = C(t_1)\Phi(t_1, t_0), \qquad \Phi(t_0, t_0) = I \tag{86}$$

$I$ refers to the identity matrix.
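The fundamental solution of Eq. (86) can be approximated numerically by integrating the matrix equation from the identity. The sketch below uses a classical fourth-order Runge-Kutta step. The function names and the constant test matrix are assumptions made here for illustration only.

```python
import math


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def axpy(A, B, c):
    """Return A + c*B elementwise."""
    return [[A[i][j] + c * B[i][j] for j in range(len(A[0]))] for i in range(len(A))]


def fundamental_matrix(C, t0, t1, n=1000):
    """Integrate dPhi/dt = C(t) Phi with Phi(t0) = I from t0 to t1.

    C is a function returning the coefficient matrix at time t.
    """
    dim = len(C(t0))
    phi = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = mat_mul(C(t), phi)
        k2 = mat_mul(C(t + 0.5 * h), axpy(phi, k1, 0.5 * h))
        k3 = mat_mul(C(t + 0.5 * h), axpy(phi, k2, 0.5 * h))
        k4 = mat_mul(C(t + h), axpy(phi, k3, h))
        for i in range(dim):
            for j in range(dim):
                phi[i][j] += h * (k1[i][j] + 2.0 * k2[i][j] + 2.0 * k3[i][j] + k4[i][j]) / 6.0
        t += h
    return phi


if __name__ == "__main__":
    # For this constant C the exact solution is a rotation matrix.
    C = lambda t: [[0.0, 1.0], [-1.0, 0.0]]
    print(fundamental_matrix(C, 0.0, 1.0))
    print([[math.cos(1.0), math.sin(1.0)], [-math.sin(1.0), math.cos(1.0)]])
```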


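As with Eq. (40), the $u$ that minimizes the expression in brackets on
the right-hand side of Eq. (84) is given by [see Eq. (41)]

$$\hat{u} = -\tfrac{1}{\lambda} R^{-1}(t_1)B'(t_1)\overline{\nabla f} \tag{87}$$

The substitution into Eq. (84) yields

$$\begin{aligned}
\frac{\partial f}{\partial t_1} = -E\Bigl\{ &\tfrac{1}{2} x'Px - \tfrac{1}{2\lambda} \nabla f'BR^{-1}B'\nabla f + \tfrac{1}{2} \mu'D'[(\nabla\nabla')f]D\mu \\
&+ \tfrac{1}{2\lambda} (\nabla f' - \overline{\nabla f}')BR^{-1}B'(\nabla f - \overline{\nabla f}) \\
&+ \tfrac{1}{2} \nabla f'[Ax + C\bar{V} + D\alpha] + \tfrac{1}{2} [\alpha'D' + \bar{V}'C' + x'A']\nabla f \Bigr\}
\end{aligned} \tag{88}$$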

The solution to Eq. (88) determines the optimal pay-off as before. This
solution must satisfy the same boundary conditions as before [see Eq.
(43)].
This solution is obtained by exactly the same procedure that was used
in the previous section. First, the following trial solution is assumed:

$$f(x, t_1, T) = k_0(t_1, T) + x'K_1(t_1, T) + \tfrac{1}{2} x'K_2(t_1, T)x \tag{89}$$

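Substituting Eq. (89) into Eq. (88) and equating terms involving identical
powers of $x$ on both sides of the result yields the following ordinary
differential equations for $k_0$, $K_1$, and $K_2$:

$$\frac{dK_2}{dt_1} = -P + \tfrac{1}{\lambda} K_2BR^{-1}B'K_2 - K_2A - A'K_2 \tag{90}$$

$$\frac{dK_1}{dt_1} = -\bigl( A' - \tfrac{1}{\lambda} K_2BR^{-1}B' \bigr) K_1 - K_2(C\bar{V} + D\alpha) \tag{91}$$

$$\frac{dk_0}{dt_1} = -\tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} + \tfrac{1}{2\lambda} K_1'BR^{-1}B'K_1 - \tfrac{1}{2} K_1'(C\bar{V} + D\alpha) - \tfrac{1}{2} (\alpha'D' + \bar{V}'C')K_1 - \tfrac{1}{2} \mu'D'K_2D\mu \tag{92}$$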

The boundary conditions are the same as before [see Eqs. (51)-(53)].
In this case the optimal policy is given by

$$\hat{u}(x(t_1) + z(t_1), t_1) = -\tfrac{1}{\lambda} R^{-1}(t_1)B'(t_1) \bigl\{ K_1(t_1, T) + K_2(t_1, T)\bigl(x(t_1) + z(t_1)\bigr) \bigr\} \tag{93}$$

Here $K_1(t_1, T)$ is not identically zero. As before, Eq. (92) indicates that
the optimal filtering procedure for minimizing the pay-off involves
least squares filtering (to minimize $E(z'z)$ for all $t$) for the $z(t)$
discussed previously in Section II, C. This is the same situation that existed
in the absence of external random disturbances.
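The effect of a nonzero mean disturbance on the policy offset $K_1$ can be seen in a scalar sketch of Eqs. (90) and (91). The assumptions here are: constant scalar coefficients; terminal values $K_2(T, T) = q$ and $K_1(T, T) = 0$; and a constant `mean_drive` standing in for the combination $C\bar{V} + D\alpha$. The function name is hypothetical.

```python
def gains(a, b, p, q, r, lam, T, mean_drive, n=2000):
    """Integrate the scalar forms of Eqs. (90)-(91) backward from t = T.

    Returns (K2, K1) at t = 0. mean_drive plays the role of C*Vbar + D*alpha.
    """
    s = b * b / r
    h = T / n

    def rhs(K2, K1):
        dK2 = -p + s * K2 * K2 / lam - 2.0 * a * K2
        dK1 = -(a - s * K2 / lam) * K1 - K2 * mean_drive
        return dK2, dK1

    K2, K1 = q, 0.0
    for _ in range(n):
        a1 = rhs(K2, K1)
        a2 = rhs(K2 - 0.5 * h * a1[0], K1 - 0.5 * h * a1[1])
        a3 = rhs(K2 - 0.5 * h * a2[0], K1 - 0.5 * h * a2[1])
        a4 = rhs(K2 - h * a3[0], K1 - h * a3[1])
        K2 -= h * (a1[0] + 2.0 * a2[0] + 2.0 * a3[0] + a4[0]) / 6.0
        K1 -= h * (a1[1] + 2.0 * a2[1] + 2.0 * a3[1] + a4[1]) / 6.0
    return K2, K1


if __name__ == "__main__":
    print(gains(-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, mean_drive=0.0))
    print(gains(-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, mean_drive=1.0))
```

In this sketch the feedback gain $K_2$ does not depend on the disturbance at all. Only the offset $K_1$ changes, and it vanishes when the mean drive is zero.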

B. The Sufficiency and Local Optimality Questions in the Presence
of Continuous Disturbances which are Independent of the
State Vector

Now the question regarding when and if the necessary conditions
for optimality which were derived previously are sufficient to guarantee
optimality will be examined. The question will be examined in the same
manner that was used in Section II, D; i.e., when do perturbations in the
optimal policy of the type given by Eq. (57) cause an increase in the
pay-off? The details of the development are quite similar to those of
Section II, D. Consequently, only the highlights will be given.
Consider the suboptimal policy given by Eq. (57) with $K_1(t, T)$ and
$K_2(t, T)$ given by solutions to Eqs. (90) and (91). Let $g(x, t_1, T)$ be the
pay-off that results from using this policy. Then the use of a development
which is similar to the ones in Sections III, A and II, D shows that
$g(x, t_1, T)$ must satisfy the following partial differential equation:

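$$\begin{aligned}
\frac{\partial g}{\partial t_1} = -E\Bigl\{ &\tfrac{1}{2} x'Px + \tfrac{\lambda}{2} (\hat{u}' + \delta u')R(\hat{u} + \delta u) + \tfrac{1}{2} \nabla g'[Ax + B(\hat{u} + \delta u) + C\bar{V} + D\alpha] \\
&+ \tfrac{1}{2} [\alpha'D' + \bar{V}'C' + (\hat{u}' + \delta u')B' + x'A']\nabla g + \tfrac{1}{2} \mu'D'[(\nabla\nabla')g]D\mu \Bigr\}
\end{aligned} \tag{94}$$

If $p(x, t_1, T) = g(x, t_1, T) - f(x, t_1, T)$, then the subtraction of Eq. (88)
from Eq. (94) and the use of Eq. (63) yields

$$\begin{aligned}
\frac{\partial p}{\partial t_1} = E\Bigl\{ &-\tfrac{\lambda}{2} \delta u'R\, \delta u - \tfrac{1}{2} \delta u'B'(\nabla f - \overline{\nabla f}) - \tfrac{1}{2} (\nabla f' - \overline{\nabla f}')B\, \delta u - \tfrac{1}{2} \mu'D'[(\nabla\nabla')p]D\mu \\
&- \tfrac{1}{2} \nabla p'[Ax + B(\hat{u} + \delta u) + C\bar{V} + D\alpha] - \tfrac{1}{2} [\alpha'D' + \bar{V}'C' + (\hat{u}' + \delta u')B' + x'A']\nabla p \Bigr\}
\end{aligned} \tag{95}$$

$$p(x, T, T) = 0 \tag{96}$$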
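The assumption that

$$\begin{aligned}
p(x, t_1, T) &= i_0(t_1, T) + x'I_1(t_1, T) + \tfrac{1}{2} x'I_2(t_1, T)x \\
&= \tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1) + i_0 - \tfrac{1}{2} I_1'I_2^{-1}I_1 \\
&= \tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1) + \theta(t_1, T)
\end{aligned} \tag{97}$$

yields the following results for the case under consideration here:

$$\begin{aligned}
\frac{di_0}{dt_1} = {} &-\tfrac{1}{2} \mu'D'I_2D\mu - \tfrac{1}{2\lambda} E\bigl\{ (E_1' + z'(K_2 + E_2'))BR^{-1}B'(E_1 + (K_2 + E_2)z) \bigr\} \\
&+ \tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} - \tfrac{1}{2} I_1' \bigl( C\bar{V} + D\alpha - \tfrac{1}{\lambda} BR^{-1}B'(E_1 + K_1) \bigr) \\
&- \tfrac{1}{2} \bigl( \bar{V}'C' + \alpha'D' - \tfrac{1}{\lambda} (E_1' + K_1')BR^{-1}B' \bigr) I_1
\end{aligned} \tag{98}$$

$$\frac{dI_1}{dt_1} = -\bigl[ A' - \tfrac{1}{\lambda} (E_2' + K_2)BR^{-1}B' \bigr] I_1 - I_2 \bigl( C\bar{V} + D\alpha - \tfrac{1}{\lambda} BR^{-1}B'(K_1 + E_1) \bigr) - \tfrac{1}{\lambda} E_2'BR^{-1}B'E_1 \tag{99}$$

$$\frac{dI_2}{dt_1} = -\tfrac{1}{\lambda} E_2'BR^{-1}B'E_2 - I_2 \bigl( A - \tfrac{1}{\lambda} BR^{-1}B'(E_2 + K_2) \bigr) - \bigl( A' - \tfrac{1}{\lambda} (E_2' + K_2)BR^{-1}B' \bigr) I_2 \tag{100}$$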

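The boundary conditions are given by Eqs. (72)-(74). The use of Eq.
(76) yields

$$\begin{aligned}
\frac{d\theta}{dt_1} &= \frac{di_0}{dt_1} - \tfrac{1}{2} \frac{dI_1'}{dt_1} I_2^{-1}I_1 - \tfrac{1}{2} I_1' \frac{dI_2^{-1}}{dt_1} I_1 - \tfrac{1}{2} I_1'I_2^{-1} \frac{dI_1}{dt_1} \\
&= -\tfrac{1}{2} \mu'D'I_2D\mu - \tfrac{1}{2\lambda} E\{ z'(K_2 + E_2')BR^{-1}B'(E_2 + K_2)z \} \\
&\quad + \tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} - \tfrac{1}{2\lambda} (E_1' - I_1'I_2^{-1}E_2')BR^{-1}B'(E_1 - E_2I_2^{-1}I_1)
\end{aligned} \tag{101}$$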

As before, the results in Section VI, A imply that $I_2$ is non-negative
definite. Also, $R^{-1}$ is positive definite by assumption. Consequently, the
term $\tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1)$ is non-negative. As before, the question
of when $p(x, t_1, T) > 0$ involves a comparison of $\theta(t_1, T)$ with
$\tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1)$. When the measurement errors are absent
($E z'z = 0$), then an integration of Eq. (101) (similar to the integration
of Eq. (78) in Section II, D) shows that $p \ge 0$ for all continuous $E_1$
and $E_2$. Also, if $E_2 = 0$ and measurement errors are present, then
$p \ge 0$ for all continuous $E_1$. However, if measurement errors are
present and $E_1$ and $E_2$ are nonzero, then for any given $E_1$ and $E_2$ there
will exist a region of values of $x$ for which $p < 0$. In this region the
pay-off can be decreased from the pay-off that will result from using the
policy given by Eq. (93). Likewise, for a given region of values of $x$,
there exists a class of perturbations $E_1$ and $E_2$ for which $p \ge 0$. The
policy given by Eq. (93) is optimal with respect to this class of
perturbations. Thus, the conclusion here is the same as the conclusion that was
reached in Section II, D, namely that in the presence of measurement
errors the policy given by Eq. (93) is optimal only in the above "local"
sense. This conclusion has not been affected by the introduction of
external random disturbances which are independent of the state vector.
C. The Optimal Policy and Optimal Pay-Off with Disturbances
which are Linearly Proportional to the State Vector

In the remainder of this section the optimal policy and pay-off will be
considered for the case where the external random disturbances are
linearly proportional to the state vector. For this case Eq. (3) is replaced
by the following expression for the increment in the state vector:

$$dx(t, t + h) = A(t)x(t)h + B(t)u(x(t) + z(t), t)h + dA_r(t, t + h)x(t) + o(h) \tag{102}$$

$dA_r(t, t + h)$ is the increment in an $n \times n$ matrix random process which
is defined by Eq. (11). The $A_r(t)$ process is defined by Eq. (18). Equations
(9) and (10) are also pertinent to the following discussion. Using the
invariant imbedding procedure gives

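$$\begin{aligned}
f(x, t_1, T) &= \min_{u} E\Bigl\{ \tfrac{1}{2} \int_{t_1}^{T} [x'Px + \lambda u'Ru]\, dt + \tfrac{1}{2} x'(T)Qx(T) \Bigr\} \\
&= \min_{u} E\Bigl\{ \tfrac{1}{2} (x'Px + \lambda u'Ru)h + o(h) + f\bigl(x + dx(t_1, t_1 + h),\, t_1 + h,\, T\bigr) \Bigr\}
\end{aligned} \tag{103}$$

Using the Taylor's series expansion as before and letting $h \to 0$ yields

$$\begin{aligned}
\frac{\partial f}{\partial t_1} = -\min_{u} E\Bigl\{ &\tfrac{1}{2} x'Px - \tfrac{1}{2\lambda} \nabla f'BR^{-1}B'\nabla f + \tfrac{\lambda}{2} \Bigl( u' + \tfrac{1}{\lambda} \nabla f'BR^{-1} \Bigr) R \Bigl( u + \tfrac{1}{\lambda} R^{-1}B'\nabla f \Bigr) \\
&+ \tfrac{1}{2} \nabla f'[A + \varphi\bar{A}_r + \varphi\Delta]x + \tfrac{1}{2} x'[A' + \bar{A}_r'\varphi' + \Delta'\varphi']\nabla f \\
&+ \tfrac{1}{2} x'\nu'\varphi'[(\nabla\nabla')f]\varphi\nu x \Bigr\}
\end{aligned} \tag{104}$$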

As before, the independent variables (such as $t_1$) have been suppressed
in Eq. (104). $\bar{A}_r(t_1)$ is defined by the following expression [see Eq. (24)]:

$$\bar{A}_r(t_1) = X(t_1, t_0) \Bigl[ \bar{A}_r(t_0) + \int_{t_0}^{t_1} X^{-1}(t, t_0)\varphi(t)\Delta(t)\, dt \Bigr] \tag{105}$$

The $u$ which minimizes the right-hand side of Eq. (104) is, as in Eq. (41)

$$\hat{u} = -\tfrac{1}{\lambda} R^{-1}(t_1)B'(t_1)\overline{\nabla f} \tag{106}$$

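The substitution of Eq. (106) into Eq. (104) gives

$$\begin{aligned}
\frac{\partial f}{\partial t_1} = -E\Bigl\{ &\tfrac{1}{2} x'Px + \tfrac{1}{2\lambda} (\nabla f' - \overline{\nabla f}')BR^{-1}B'(\nabla f - \overline{\nabla f}) \\
&+ \tfrac{1}{2} \nabla f'[A + \varphi\bar{A}_r + \varphi\Delta]x + \tfrac{1}{2} x'[A' + \bar{A}_r'\varphi' + \Delta'\varphi']\nabla f \\
&+ \tfrac{1}{2} x'\nu'\varphi'[(\nabla\nabla')f]\varphi\nu x - \tfrac{1}{2\lambda} \nabla f'BR^{-1}B'\nabla f \Bigr\}
\end{aligned} \tag{107}$$

The boundary conditions which Eq. (107) must satisfy are the same as
before. The usual trial solution will be used, viz.,

$$f(x, t_1, T) = k_0(t_1, T) + x'K_1(t_1, T) + \tfrac{1}{2} x'K_2(t_1, T)x \tag{108}$$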

Substituting Eq. (108) into (107) yields the following ordinary
differential equations which $k_0$, $K_1$, and $K_2$ must satisfy:

$$\frac{dk_0}{dt_1} = -\tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} + \tfrac{1}{2\lambda} K_1'BR^{-1}B'K_1 \tag{109}$$

$$\frac{dK_1}{dt_1} = -\bigl( A' + \bar{A}_r'\varphi' + \Delta'\varphi' - \tfrac{1}{\lambda} K_2BR^{-1}B' \bigr) K_1 \tag{110}$$

$$\frac{dK_2}{dt_1} = -P + \tfrac{1}{\lambda} K_2BR^{-1}B'K_2 - K_2(A + \varphi\bar{A}_r + \varphi\Delta) - (A' + \bar{A}_r'\varphi' + \Delta'\varphi')K_2 - \nu'\varphi'K_2\varphi\nu \tag{111}$$

Equation (110) and the boundary conditions [which are the same as those
in Eqs. (51), (52), and (53)] imply that

$$K_1(t, T) = 0 \quad \text{for } t_0 \le t \le T \tag{112}$$

Consequently, $f(x, t_0, T)$ has the form given in Eq. (55) and the optimal
policy is given by

$$\hat{u}(x(t) + z(t), t) = -\tfrac{1}{\lambda} R^{-1}(t)B'(t)K_2(t, T)[x(t) + z(t)] \tag{113}$$

The differences between Eq. (111) and Eqs. (90) and (50) are worth
noting. These differences in $K_2$ imply that for the case where the
external disturbances involved are linearly proportional to the state vector
there is no simple relationship in the form of a "certainty equivalence
principle" between this optimal policy and the optimal policy for the
corresponding deterministic problem, because $K_2$ is different in this case
from the $K_2$ associated with the deterministic problem. Again, note that
the integral of Eq. (109) implies that the optimal estimation procedure
which minimizes the pay-off is least squares filtering which minimizes
$E z'(t)z(t)$ for the restrictions on $z(t)$ mentioned in Section II, C. This
was also true in the previous cases.
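The change in $K_2$ caused by state-proportional disturbances can be illustrated with a scalar version of Eq. (111). In the sketch below the mean terms $\bar{A}_r$ and $\Delta$ are set to zero, and the nonnegative constant `c` is a hypothetical stand-in for the scalar value of $\nu'\varphi'(\cdot)\varphi\nu$; setting `c = 0` gives the scalar form of Eq. (90). The function name and the numbers are assumptions.

```python
def k2_at_start(a, b, p, q, r, lam, T, c=0.0, n=2000):
    """Integrate dK2/dt = -p + (b*b/r)*K2**2/lam - 2*a*K2 - c*K2
    backward from K2(T) = q and return K2 at t = 0."""
    s = b * b / r
    h = T / n

    def rhs(K):
        return -p + s * K * K / lam - 2.0 * a * K - c * K

    K = q
    for _ in range(n):
        s1 = rhs(K)
        s2 = rhs(K - 0.5 * h * s1)
        s3 = rhs(K - 0.5 * h * s2)
        s4 = rhs(K - h * s3)
        K -= h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return K


if __name__ == "__main__":
    deterministic = k2_at_start(-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, c=0.0)
    proportional = k2_at_start(-1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, c=0.5)
    print(deterministic, proportional)
```

With `c > 0` the gain at the initial time is larger than in the deterministic case, so the feedback law itself changes and not merely the expected pay-off.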

D. The Sufficiency and Local Optimality Question in the Presence
of Continuous Disturbances which are Linearly Proportional
to the State Vector

Here the pertinent results will be presented regarding when the
previous necessary conditions for optimality (with external random
disturbances which are linearly proportional to the state vector) are
sufficient to yield the optimal policy and pay-off. The derivation of these
results is quite similar to comparable derivations in previous sections
and hence it will not be given in detail. As before, suboptimal policies
of the form given by Eq. (57) will be considered.
Let $g(x_0, t_0, T)$ be the pay-off that results when the system is in state
$x_0$ at time $t_0$ and the suboptimal policy given in Eq. (57) is used. Then
the previous approach by means of the invariant imbedding procedure
yields the following partial differential equation which $g(x, t_1, T)$ must
satisfy:

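$$\begin{aligned}
\frac{\partial g}{\partial t_1} = -E\Bigl\{ &\tfrac{1}{2} x'Px + \tfrac{\lambda}{2} (\hat{u}' + \delta u')R(\hat{u} + \delta u) \\
&+ \tfrac{1}{2} \nabla g'B(\hat{u} + \delta u) + \tfrac{1}{2} (\delta u' + \hat{u}')B'\nabla g + \tfrac{1}{2} x'\nu'\varphi'[(\nabla\nabla')g]\varphi\nu x \\
&+ \tfrac{1}{2} \nabla g'[A + \varphi\bar{A}_r + \varphi\Delta]x + \tfrac{1}{2} x'[A' + \bar{A}_r'\varphi' + \Delta'\varphi']\nabla g \Bigr\}
\end{aligned} \tag{114}$$

If $p(x, t_1, T) = g(x, t_1, T) - f(x, t_1, T)$, then the subtraction of Eq.
(107) from Eq. (114) and the use of Eq. (63) yields

$$\begin{aligned}
\frac{\partial p}{\partial t_1} = -E\Bigl\{ &\tfrac{\lambda}{2} \delta u'R\, \delta u + \tfrac{1}{2} \delta u'B'(\nabla f - \overline{\nabla f}) + \tfrac{1}{2} (\nabla f' - \overline{\nabla f}')B\, \delta u \\
&+ \tfrac{1}{2} \nabla p'B(\hat{u} + \delta u) + \tfrac{1}{2} (\hat{u}' + \delta u')B'\nabla p \\
&+ \tfrac{1}{2} \nabla p'[A + \varphi\bar{A}_r + \varphi\Delta]x + \tfrac{1}{2} x'[A' + \bar{A}_r'\varphi' + \Delta'\varphi']\nabla p \\
&+ \tfrac{1}{2} x'\nu'\varphi'[(\nabla\nabla')p]\varphi\nu x \Bigr\}
\end{aligned} \tag{115}$$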
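The usual trial solution is used here, viz.,

$$\begin{aligned}
p(x, t_1, T) &= i_0(t_1, T) + x'I_1(t_1, T) + \tfrac{1}{2} x'I_2(t_1, T)x \\
&= \tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1) + i_0 - \tfrac{1}{2} I_1'I_2^{-1}I_1 \\
&= \tfrac{1}{2} (x' + I_1'I_2^{-1}) I_2 (x + I_2^{-1}I_1) + \theta(t_1, T)
\end{aligned} \tag{116}$$

This yields the following counterparts of Eqs. (98)-(100):

$$\begin{aligned}
\frac{di_0}{dt_1} = {} &-\tfrac{1}{2\lambda} E\bigl\{ (E_1' + z'(K_2 + E_2'))BR^{-1}B'(E_1 + (K_2 + E_2)z) \bigr\} \\
&+ \tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} + \tfrac{1}{2\lambda} I_1'BR^{-1}B'E_1 + \tfrac{1}{2\lambda} E_1'BR^{-1}B'I_1
\end{aligned} \tag{117}$$

$$\frac{dI_1}{dt_1} = -\bigl[ A' + \bar{A}_r'\varphi' + \Delta'\varphi' - \tfrac{1}{\lambda} (K_2 + E_2')BR^{-1}B' \bigr] I_1 - \tfrac{1}{\lambda} (E_2' - I_2)BR^{-1}B'E_1 \tag{118}$$

$$\begin{aligned}
\frac{dI_2}{dt_1} = {} &-I_2 \bigl[ A + \varphi\bar{A}_r + \varphi\Delta - \tfrac{1}{\lambda} BR^{-1}B'(K_2 + E_2) \bigr] \\
&- \bigl[ A' + \bar{A}_r'\varphi' + \Delta'\varphi' - \tfrac{1}{\lambda} (K_2 + E_2')BR^{-1}B' \bigr] I_2 \\
&- \tfrac{1}{\lambda} E_2'BR^{-1}B'E_2 - \nu'\varphi'I_2\varphi\nu
\end{aligned} \tag{119}$$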

The terminal conditions which Eqs. (117)-(119) must satisfy are given
by Eqs. (72)-(74). The results in Section VI, B imply that $I_2$ is
non-negative definite. Consequently, the question regarding when and if
$p(x, t, T) > 0$ can be reduced to the question of when $\theta(t, T) < 0$.
The use of Eq. (76) yields

$$\begin{aligned}
\frac{d\theta}{dt_1} = {} &-\tfrac{1}{2\lambda} E[ z'(K_2 + E_2')BR^{-1}B'(K_2 + E_2)z ] + \tfrac{1}{2\lambda} E[ z'K_2BR^{-1}B'K_2z ] \\
&- \tfrac{1}{2\lambda} (I_1'I_2^{-1}E_2' - E_1')BR^{-1}B'(E_2I_2^{-1}I_1 - E_1) \\
&- \tfrac{1}{2} I_1'I_2^{-1}\nu'\varphi'I_2\varphi\nu I_2^{-1}I_1
\end{aligned} \tag{120}$$

Integrating both sides of Eq. (120) shows that the same comments
regarding the conditions under which $\theta(t, T) > 0$ that were made in
Sections II, D and III, B are also true in this case where the external
random disturbances are linearly proportional to the state vector.
Consequently the optimal policies that have been derived for the three
cases considered up to now in this chapter are optimal with respect to
all linear policies of the form given in Eq. (57) only when $E z'z = 0$.
This fact appears to be independent of the kinds of external disturbances
considered thus far. So do the facts that these optimal policies are optimal
only in the local sense when $E z'z \ne 0$ and $E_2 \ne 0$ and that the class
of linear policies for which these policies are optimal varies with the
initial value of the state vector.

IV. Optimal Control in the Presence of
Measurement Errors and Random Disturbances
Derived from the Generalized Poisson Process

A. The Optimal Policy and Pay-Off when the Disturbances are
Independent of the State Vector

The optimal policies and pay-offs for the cases where external random
disturbances were absent and where the external random disturbances
were generated by operating with shaping filters on continuous random
processes with independent increments were discussed in previous
sections. In this section analogous problems will be considered for the
case where the disturbances are generated by operating on a generalized
Poisson process with a shaping filter. As in Section III, the major
emphasis here will be on the differences between the results in the previous
sections and those in this section. Consequently, detailed derivations
will not be given here since they are similar in many respects to the
derivations in Sections II and III. First, the case where the external
disturbances are independent of the state vector will be considered. For
this case, Eq. (3) becomes

$$dx(t, t + h) = A(t)x(t)h + B(t)u(x(t) + z(t), t)h + dV_1(t, t + h) + o(h) \tag{121}$$

$dV_1(t, t + h)$ is defined by Eq. (13). Equation (12) is also pertinent to
this discussion.
If $f(x, t_1, T)$ is the optimal pay-off that results when an optimal
policy is used during $t_1 \le t \le T$ when the state vector is $x$ at time $t_1$,
then the invariant imbedding approach that was used in Section II, A
will yield the following result when Eq. (121) replaces Eq. (3):

$$\begin{aligned}
f(x, t_1, T) &= \min_{u} E\Bigl\{ \tfrac{1}{2} x'Pxh + \tfrac{\lambda}{2} u'Ruh + o(h) + f\bigl(x + dx(t_1, t_1 + h),\, t_1 + h,\, T\bigr) \Bigr\} \\
&= \min_{u} E\Bigl\{ \tfrac{1}{2} x'Pxh + \tfrac{\lambda}{2} u'Ruh + o(h) \\
&\qquad + (1 - q(t)h) f(x + Axh + Buh + C\bar{V}_1h,\, t_1 + h,\, T) \\
&\qquad + q(t)h\, f(x + Axh + Buh + C\bar{V}_1h + DW,\, t_1 + h,\, T) \Bigr\}
\end{aligned} \tag{122}$$

Expanding the last two terms in a Taylor's series and letting $h \to 0$ gives

$$\begin{aligned}
\frac{\partial f}{\partial t_1} &= -\min_{u} E\Bigl\{ \tfrac{1}{2} x'Px + \tfrac{\lambda}{2} u'Ru + \nabla f'[Ax + Bu + C\bar{V}_1] + q[f(x + DW, t_1, T) - f(x, t_1, T)] \Bigr\} \\
&= -\min_{u} E\Bigl\{ \tfrac{1}{2} x'Px + \tfrac{\lambda}{2} \Bigl( u' + \tfrac{1}{\lambda} \nabla f'BR^{-1} \Bigr) R \Bigl( u + \tfrac{1}{\lambda} R^{-1}B'\nabla f \Bigr) \\
&\qquad - \tfrac{1}{2\lambda} \nabla f'BR^{-1}B'\nabla f + \tfrac{1}{2} \nabla f'[Ax + C\bar{V}_1] \\
&\qquad + \tfrac{1}{2} [x'A' + \bar{V}_1'C']\nabla f + q[f(x + DW, t_1, T) - f(x, t_1, T)] \Bigr\}
\end{aligned} \tag{123}$$
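The limiting step from Eq. (122) to Eq. (123) can be checked numerically for a scalar state. Over a short interval $h$ a jump occurs with probability $qh$, so the one-step expected change of $f$, divided by $h$, should approach $f'(x)\,m + q\,[f(x + w) - f(x)]$, where $m$ is the drift and $w$ the jump size. The sketch below makes the simplifying assumptions of a fixed (non-random) jump size, a time-independent test function, and hypothetical function names.

```python
def one_step_rate(f, x, drift, jump, q, h):
    """(E f(x + dx) - f(x)) / h for one interval of length h, with a jump of
    size `jump` occurring with probability q*h."""
    no_jump = (1.0 - q * h) * f(x + drift * h)
    with_jump = q * h * f(x + drift * h + jump)
    return (no_jump + with_jump - f(x)) / h


def limiting_rate(f, df, x, drift, jump, q):
    """Limit of one_step_rate as h -> 0: drift term plus jump term."""
    return df(x) * drift + q * (f(x + jump) - f(x))


if __name__ == "__main__":
    f = lambda x: x * x      # quadratic test function
    df = lambda x: 2.0 * x   # its derivative
    for h in (1e-2, 1e-4, 1e-6):
        print(h, one_step_rate(f, 1.0, -0.5, 0.3, 2.0, h))
    print(limiting_rate(f, df, 1.0, -0.5, 0.3, 2.0))
```

The jump term is a finite difference of $f$ rather than a derivative, which is why the term $q[f(x + DW, t_1, T) - f(x, t_1, T)]$ survives in the limit.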

Equations (12), (13), and (17) imply that $\bar{V}_1$ is given by

$$\bar{V}_1(t_1) = X(t_1, t_0) \Bigl[ \bar{V}_1(t_0) + \int_{t_0}^{t_1} q(t)X^{-1}(t, t_0)D(t)\bar{W}(t)\, dt \Bigr] \tag{124}$$

As before the optimal policy is [see Eq. (41)]

$$\hat{u} = -\tfrac{1}{\lambda} R^{-1}(t_1)B'(t_1)\overline{\nabla f} \tag{125}$$

Substituting Eq. (125) into Eq. (123) yields

$$\begin{aligned}
\frac{\partial f}{\partial t_1} = -E\Bigl\{ &\tfrac{1}{2} x'Px + \tfrac{1}{2\lambda} (\nabla f' - \overline{\nabla f}')BR^{-1}B'(\nabla f - \overline{\nabla f}) \\
&- \tfrac{1}{2\lambda} \nabla f'BR^{-1}B'\nabla f + \tfrac{1}{2} \nabla f'[Ax + C\bar{V}_1] \\
&+ \tfrac{1}{2} [\bar{V}_1'C' + x'A']\nabla f + q[f(x + DW, t_1, T) - f(x, t_1, T)] \Bigr\}
\end{aligned} \tag{126}$$

The boundary conditions are given by Eq. (43). Assume

$$f(x, t_1, T) = k_0(t_1, T) + x'K_1(t_1, T) + \tfrac{1}{2} x'K_2(t_1, T)x \tag{127}$$

Substituting Eq. (127) into Eq. (126) yields the following set of ordinary
differential equations:

$$\frac{dK_2}{dt_1} = -P + \tfrac{1}{\lambda} K_2BR^{-1}B'K_2 - K_2A - A'K_2 \tag{128}$$

$$\frac{dK_1}{dt_1} = -\bigl( A' - \tfrac{1}{\lambda} K_2BR^{-1}B' \bigr) K_1 - K_2[C\bar{V}_1 + qD\bar{W}] \tag{129}$$

$$\begin{aligned}
\frac{dk_0}{dt_1} = {} &-\tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} + \tfrac{1}{2\lambda} K_1'BR^{-1}B'K_1 \\
&- \tfrac{q}{2} E[ W'D'K_1 + K_1'DW + W'D'K_2DW ] - \tfrac{1}{2} \bar{V}_1'C'K_1 - \tfrac{1}{2} K_1'C\bar{V}_1
\end{aligned} \tag{130}$$

These ordinary differential equations are quite similar to ones derived
for previous cases. The terminal conditions on $k_0$, $K_1$, and $K_2$ are given
by Eqs. (51), (52), and (53). Consequently, the optimal policy given by
Eq. (125) becomes

$$\hat{u}(x + z, t) = -\tfrac{1}{\lambda} R^{-1}(t)B'(t)[K_1(t, T) + K_2(t, T)(x + z)] \tag{131}$$

The results concerning the form of the optimal estimation scheme which
minimizes the pay-offs that were obtained for previous cases are also
true for this case. This is demonstrated by the integration of Eq. (130).

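B. The Sufficiency Question when the Disturbances are
Independent of the State Vector

Consider perturbations in the optimal policy of the form given in
Eq. (57) as in the previous cases. Let $g(x, t_1, T)$ be the pay-off associated
with these perturbed policies when the state vector is $x$ at time $t_1$. Then
it can be shown by using procedures similar to those that have been
used before that $g(x, t_1, T)$ satisfies the following partial differential
equation:

$$\begin{aligned}
\frac{\partial g}{\partial t_1} = -E\Bigl\{ &\tfrac{1}{2} x'Px + \tfrac{\lambda}{2} (\hat{u}' + \delta u')R(\hat{u} + \delta u) + \tfrac{1}{2} \nabla g'B(\hat{u} + \delta u) \\
&+ \tfrac{1}{2} (\delta u' + \hat{u}')B'\nabla g + \tfrac{1}{2} \nabla g'[Ax + C\bar{V}_1] + \tfrac{1}{2} [x'A' + \bar{V}_1'C']\nabla g \\
&+ q[g(x + DW, t_1, T) - g(x, t_1, T)] \Bigr\}
\end{aligned} \tag{132}$$

If $p(x, t_1, T) = g(x, t_1, T) - f(x, t_1, T)$, then $p(x, t_1, T)$ can be shown
to satisfy the following partial differential equation by subtracting
Eq. (126) from Eq. (132) and using Eq. (63):

$$\begin{aligned}
\frac{\partial p}{\partial t_1} = E\Bigl\{ &-\tfrac{\lambda}{2} \Bigl( \delta u' - \tfrac{1}{\lambda} (\overline{\nabla f}' - \nabla f')BR^{-1} \Bigr) R \Bigl( \delta u - \tfrac{1}{\lambda} R^{-1}B'(\overline{\nabla f} - \nabla f) \Bigr) \\
&+ \tfrac{1}{2\lambda} (\nabla f' - \overline{\nabla f}')BR^{-1}B'(\nabla f - \overline{\nabla f}) \\
&- q[p(x + DW, t_1, T) - p(x, t_1, T)] \\
&- \tfrac{1}{2} \nabla p'[Ax + B(\delta u + \hat{u}) + C\bar{V}_1] \\
&- \tfrac{1}{2} [x'A' + (\delta u' + \hat{u}')B' + \bar{V}_1'C']\nabla p \Bigr\}
\end{aligned} \tag{133}$$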

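The same form for the trial solution that has been used before is used here,
viz., Eq. (116). For this case, the matrix coefficients of the state vector
in Eq. (116) are determined by the following ordinary differential
equations:

$$\frac{dI_2}{dt_1} = -\tfrac{1}{\lambda} E_2'BR^{-1}B'E_2 - I_2 \bigl[ A - \tfrac{1}{\lambda} BR^{-1}B'(E_2 + K_2) \bigr] - \bigl[ A' - \tfrac{1}{\lambda} (E_2' + K_2)BR^{-1}B' \bigr] I_2 \tag{134}$$

$$\frac{dI_1}{dt_1} = -\tfrac{1}{\lambda} E_2'BR^{-1}B'E_1 - qI_2D\bar{W} - \bigl[ A' - \tfrac{1}{\lambda} (E_2' + K_2)BR^{-1}B' \bigr] I_1 - I_2 \bigl[ C\bar{V}_1 - \tfrac{1}{\lambda} BR^{-1}B'(E_1 + K_1) \bigr] \tag{135}$$

$$\begin{aligned}
\frac{di_0}{dt_1} = {} &-\tfrac{1}{2\lambda} E\bigl\{ (E_1' + z'(E_2' + K_2))BR^{-1}B'(E_1 + (E_2 + K_2)z) \bigr\} \\
&+ \tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} - qE\bigl[ W'D'I_1 + \tfrac{1}{2} W'D'I_2DW \bigr] \\
&- \bigl[ \bar{V}_1'C' - \tfrac{1}{\lambda} (E_1' + K_1')BR^{-1}B' \bigr] I_1
\end{aligned} \tag{136}$$

In this case Section VI, A shows that $I_2$ is a non-negative definite matrix.
Consequently, whether or not $p(x, t_1, T) \ge 0$ is directly connected with
whether or not $\theta(t_1, T) < 0$. The same steps that were used in the
consideration of previous problems yield the following result here:

$$\begin{aligned}
\frac{d\theta}{dt_1} = {} &-\tfrac{1}{2\lambda} E\{ z'(E_2' + K_2)BR^{-1}B'(E_2 + K_2)z \} + \tfrac{1}{2\lambda} E\{ z'K_2BR^{-1}B'K_2z \} \\
&- \tfrac{1}{2\lambda} (E_1' - I_1'I_2^{-1}E_2')BR^{-1}B'(E_1 - E_2I_2^{-1}I_1) - \tfrac{q}{2} E\{ W'D'I_2DW \}
\end{aligned} \tag{137}$$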

Integration of Eq. (137) yields results that are identical to those that
have been obtained in previous cases, i.e., the policy given by Eq. (131)
is optimal in a "global sense" only when the state vector measurement
errors are absent or when $E_2 = 0$ and the state vector measurement
errors are present. When the measurement errors are present and
$E_2 \ne 0$, then the policy given by Eq. (131) is optimal only within a
subclass of linear perturbations [of the kind given by Eq. (57)] and this
subclass depends on the initial value of the state vector and the matrix
$E[z(t)z'(t)]$.

C. The Optimal Policy and Pay-Off when the Disturbances are
Linearly Proportional to the State Vector

Here expressions are derived for the optimal policy and the resulting
pay-off when the external random disturbances are linearly proportional
to the state vector and derived from a generalized Poisson process by
means of a shaping filter. For this case the stochastic differential equation
which describes the behavior of the state vector is

$$dx(t, t + h) = \begin{cases}
A(t)x(t)h + B(t)u(x(t) + z(t), t)h + \varphi(t)A_p(t)x(t)h + o(h) \\
\qquad \text{with probability } 1 - q(t)h + o(h) \\[4pt]
A(t)x(t)h + B(t)u(x(t) + z(t), t)h + \varphi(t)A_p(t)x(t)h + \varphi(t)Y(t)x(t) + o(h) \\
\qquad \text{with probability } q(t)h + o(h)
\end{cases} \tag{138}$$

T h e Ap(t) matrix is a matrix r a n d o m process which is defined by E q s .


(14), (15), and (19). I n this case t h e invariant i m b e d d i n g p r o c e d u r e
yields t h e following relation for t h e optimal pay-off / ( x , tx , T):

/ ( x , t,, T) = min j^x'Px +^ÏUÎu + 0(h)

+ (1 - q(t)h)f(x + (A+ cpAv)xh + Bxxh +~Ö(A), tx + Ä, T)

+ q(t)hf(^+ (A + <pAp)xh + Bxxh + φΥχ + 0(h\t~+h7T)\


(139)
234 PETER R. SCHULTZ

E x p a n d i n g t h e r i g h t - h a n d side of t h e equation in a T a y l o r ' s series as


before yields

| - = - min j i x ' P x + \ («' + \ Vf'BR-ή R (u + { R^B'Vf)

+ \Vi\A + φΑν]χ + \x'[Ä + iî„V]Vf

1
^ VÎ'BR^B'VÎ + ?[ / ( χ + 0 Υχ, ί , , Τ) - / ( χ , < χ , Τ)] j (140)

is defined as follows [see E q s . (14) a n d (25)]

AM = EAM = Xl(tt, t0) [AM + f't q(t)Xl(t, tM(t)Y(t) dt] (141)

As before, t h e policy which minimizes t h e expression in brackets on t h e


r i g h t - h a n d side is

û = - i mty*B'(t)E\yf I x(<) + z(i)]


1
= -^Rr B'Vt (142)

S u b s t i t u t i n g this entity into E q . (140) gives

= - jix'Px + 1 · ( V f - Vf')BR-iB'(VÎ -Vf)

- 1 Vf'BR-WVi + %[A + <pA,]Vt

+ lx'[A' + Av'<p']Vi + ?[ / ( x + 0 F x , , Γ) - / ( χ , ί, Γ)]j (143)

The same trial solution that was used before is used again:

$$
f(x, t_1, T) = k_0(t_1, T) + x' K_1(t_1, T) + \tfrac{1}{2} x' K_2(t_1, T) x
\tag{144}
$$

Substituting Eq. (144) into Eq. (143) and equating coefficients of the
same powers in components of $x$ on both sides of the resulting equation
yields the following ordinary differential equations for $k_0(t_1, T)$,
$K_1(t_1, T)$, and $K_2(t_1, T)$:

dk l
° = - ^ z'K2BR~
2 B'K2z 2 + ^ K^BR-WK^ (145)
dt, 2 λ " " " " " " " ' 2λ

f)
(ä + Apy + ςΫ'φ' - i K2BR-*B ) K, (146)

dU Ρ + 1 K2BR'B K2 - K2(A + ΨΑΡ + ςψΫ)

{Α' + Ψ Α,; '


+ q?W)K2 - ςΥ'φ'Κ2φΥ (147)
The terminal conditions which the above equations must satisfy are given
by Eqs. (51), (52), and (53). Equations (52) and (145) imply that
$K_1(t_1, T) = 0$ for $t_0 \le t_1 \le T$. Thus the optimal policy becomes

$$
\hat{u}(x(t) + z(t), t) = -\tfrac{1}{2} R^{-1}(t) B'(t) K_2(t, T) [x(t) + z(t)]
\tag{148}
$$

As before, $k_0(t, T)$ is the only term that involves the estimation errors
$z(t)$. Again the optimal estimation procedure (that yields minimum
pay-off) is one which minimizes $E[z'(t)z(t)]$ for the case mentioned in
Section II, C.

D. The Sufficiency Question with Disturbances which are
Linearly Proportional to the State Vector
Here the pay-off that results from using suboptimal policies of the
form given by Eq. (57) will be examined and compared with the optimal
pay-off. Let the pay-off associated with the suboptimal policy be denoted
by $g(x, t_1, T)$. Then the application of the techniques of invariant
imbedding and a Taylor's series expansion to $g(x, t_1, T)$ yields the
following partial differential equation:

•Je- = - j i x ' P x + j (û' + 8u')R(û + Su)

+ \Vg'B{u + Su) + 1(ΰ' + Su')ß'Vg + \Vg'Ax

+ \x'A'Vg + q[i(x + φΥζ, h, T)- g{x, t l , Γ)] (149)

Now let $\rho(x, t_1, T) = g - f$ as before. The subtraction of Eq. (143)
from Eq. (149) and the use of Eq. (63) yields

^bu'RBu + |Su'ß'(Vf - V f ) + l ( V f - V{')BSu


ι
+ | V P ' ß ( ü + Su) + \{ü' + Su')ß'Vp + %Vf>'Ax + \x'A'V ρ

+ q[P(x + φΥχ, h, T)- p(x, t,, Τ)] (150)


As before, a trial solution of the form given in Eq. (116) is used here.
A substitution of this trial solution into Eq. (150) yields the following
ordinary differential equations for $l_0$, $l_1$, and $l_2$:

- ^ K 2 B ^ ~ B % z + 2^ Ι, ΒR^B'E, + ^ E 1 ' B R - \ B ' l 1 (151)

1
^ = - [A - \ (K2 + E2>)BR^B' + ςΫ'φ] Ιχ

- \ E^BR^B'E, + I IJBR-^B'E, 152)


Λ Λ

Yfx= - \ E2'BR-*B'E2 - I2 [A - I BR-*B\K2 + E2) + ςφ?)

1
- (A' - I ( K 2 + A V ) ^ " + ςΥ'φ') I 2 - ςΎψ7$Ϋ (153)

The terminal conditions which Eqs. (151)-(153) must satisfy are given
in Eqs. (72)-(74).
Now the results in Section VI, B imply that $l_2$ is positive definite.
As before, the question of whether $\rho$ is positive is thus reduced to whether
$\theta$ is negative. In this case

^ = - ^ ^(K^^W^R^VE^z +^ YK2BR^^WK^z

- ^ ( I i ' W - YLi)BR-iB\E2I2% - E a)

- hWTjÜ^n ( 154)
The implications of Eq. (154) are the same as those of Section IV, B.

V. Conclusions

The problem of optimal control in the presence of state vector measurement
errors and random disturbances, which was defined in Section I, A,
has been discussed. The optimal policy and the associated pay-off are
specified in terms of solutions to a matrix Riccati differential equation
and a linear vector matrix differential equation. The terminal conditions
are specified independently of the state vector in terms of the performance
criterion, and these differential equations do not involve the state
vector. Hence there is no two-point boundary value problem with
unspecified boundary conditions, which is a characteristic of some
optimization problems. It was also shown that in certain cases where
it is possible to choose between different kinds of estimates of the state
vector (which result in different measurement errors) the estimate that
will optimize the system's performance is the one with minimum mean
square error. For certain cases, a "certainty equivalence principle" holds
in that the optimal policy in the presence of measurement errors is
obtained by using the estimate of the state vector (which involves
measurement errors) in place of the actual value of the state vector in
the optimal policy for the deterministic problem (where random
disturbances and measurement errors are absent).
Suboptimal policies involving perturbations from the optimal policies
were also examined. The results of this examination indicate that the
optimal policies that were obtained for the different versions of this
problem are optimum only in a certain "local" sense when measurement
errors are present. With no measurement errors, the optimal policy
yields a lower pay-off than any other linear policy. The optimal policy
also yields a lower pay-off when measurement errors are present than any
other policy which involves perturbations that are independent of the
state vector, i.e., when $E_2 = 0$ in terms of Eq. (57). However, when the
perturbations from the optimal policy involve terms that are proportional
to the measurements of the state vector, there exist perturbations which
will yield a lower pay-off than the so-called optimal policy for certain
values of the state vector. The values of the state vector for which this
is true depend on the specific definition of the problem [in terms of
the matrices $A(t)$, $B(t)$, $R(t)$, $P(t)$, $Q$, etc.] and on the second moment
matrix of the components of the state vector measurement errors,
$E[z(t)z'(t)]$.

VI. Appendices

A. The Solution to a Matrix Differential Equation

In Sections II, III, and IV of this chapter it is necessary to prove
that the solutions to Eqs. (71), (100), and (134) are non-negative definite.
These equations can be written in the following canonical form:

$$
\frac{dJ(t_1, T)}{dt_1} = -J(t_1, T)A(t_1) - A'(t_1)J(t_1, T) - R(t_1)
\tag{A-1}
$$


$R$ is taken to be a non-negative definite matrix whose elements are
piecewise continuous. $J$ and $A$ are $n \times n$ matrices. The pertinent
boundary condition is $J(T, T) = 0$.
Let $\Gamma(t_1, T)$ be defined as the solution to the following matrix
differential equation:

$$
\frac{d\Gamma(t_1, T)}{dt_1} = -\Gamma(t_1, T)A(t_1), \qquad
\Gamma(T, T) = \text{identity matrix}
\tag{A-2}
$$

Then the direct substitution of the following expression into Eq. (A-1)
shows that it is a solution of this equation and satisfies the desired
terminal conditions:

$$
J(t_1, T) = \Gamma'(t_1, T) \int_{t_1}^{T} [\Gamma^{-1}(t, T)]' R(t)\, \Gamma^{-1}(t, T)\, dt\;
\Gamma(t_1, T), \qquad T > t_1
\tag{A-3}
$$

Since $R(t)$ is non-negative definite, $J(t_1, T)$ must be non-negative
definite.
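
The claim of this appendix can be checked numerically. The following is only a minimal sketch, assuming constant $A$ and $R$ (the text allows time-varying matrices); the function name `solve_backward`, the test matrices, and the step count are illustrative choices and do not come from the chapter. It integrates Eq. (A-1) backward from $J(T, T) = 0$ and inspects the smallest characteristic root of the result.

```python
import numpy as np

def solve_backward(A, R, T, steps=2000):
    """Integrate dJ/dt1 = -J A - A' J - R backward from J(T, T) = 0 to t1 = 0 (RK4)."""
    J = np.zeros_like(A, dtype=float)
    h = T / steps

    def rhs(J):
        return -J @ A - A.T @ J - R

    for _ in range(steps):
        # One RK4 step from t1 to t1 - h (time runs backward).
        k1 = rhs(J)
        k2 = rhs(J - 0.5 * h * k1)
        k3 = rhs(J - 0.5 * h * k2)
        k4 = rhs(J - h * k3)
        J = J - (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return J

# Illustrative data (assumptions): a constant A and a singular, non-negative definite R.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
R = np.array([[1.0, 0.0], [0.0, 0.0]])
J0 = solve_backward(A, R, T=1.0)
min_eig = np.linalg.eigvalsh(0.5 * (J0 + J0.T)).min()
```

In the scalar case $A = a$, $R = r$, the same routine can be compared with the closed form $J(0, T) = r(e^{2aT} - 1)/(2a)$, which is non-negative whenever $r \ge 0$.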

B. A Demonstration that the Solution to a
Matrix Differential Equation is Non-Negative Definite
In Sections III, D and IV, D the following form of matrix differential
equation arises [see Eqs. (119) and (153)] in connection with the
sufficiency proofs that are associated with cases where random disturbances
are linearly proportional to the state vector:

$$
\frac{dJ(t_1, T)}{dt_1} = -R(t_1) - C'(t_1)J(t_1, T)C(t_1) - J(t_1, T)A(t_1) - A'(t_1)J(t_1, T)
\tag{B-1}
$$

$R(t_1)$ is a non-negative definite $n \times n$ matrix whose elements are
piecewise continuous, and $C(t_1)$ and $A(t_1)$ are $n \times n$ matrices whose
elements are continuous. In this appendix it will be shown that $J(t_1, T)$ is
non-negative definite if $J(T, T) = 0$. A method of successive approximations
will be used to do this.
First, let $\Gamma(t_1, T)$ be the fundamental solution to the
matrix differential equation given below:

$$
\frac{d\Gamma(t_1, T)}{dt_1} = -\Gamma(t_1, T)A(t_1), \qquad
\Gamma(T, T) = \text{identity matrix}
\tag{B-2}
$$

Now define the sequence $\{J^{(n)}\}$ as follows:

$$
\begin{aligned}
J^{(1)}(t_1, T) &= \Gamma'(t_1, T) \int_{t_1}^{T} [\Gamma^{-1}(t, T)]' R(t)\, \Gamma^{-1}(t, T)\, dt\; \Gamma(t_1, T) \\
J^{(n)}(t_1, T) &= \Gamma'(t_1, T) \int_{t_1}^{T} [\Gamma^{-1}(t, T)]'
   \left[ R(t) + C'(t) J^{(n-1)}(t, T) C(t) \right] \Gamma^{-1}(t, T)\, dt\; \Gamma(t_1, T),
   \qquad n = 2, 3, \ldots
\end{aligned}
\tag{B-3}
$$


Since $R(t)$ is non-negative definite, clearly $J^{(n)}(t_1, T)$ is non-negative
definite for all finite $n$. In addition,

$$
\begin{aligned}
J^{(n)} - J^{(n-1)} &= \Gamma'(t_1, T) \int_{t_1}^{T} [\Gamma^{-1}(t, T)]' C'(t)
   \left[ J^{(n-1)}(t, T) - J^{(n-2)}(t, T) \right] C(t)\, \Gamma^{-1}(t, T)\, dt\; \Gamma(t_1, T) \\
 &= \int_{t_1}^{T} \Gamma'(t_1, t)\, C'(t) \left[ J^{(n-1)}(t, T) - J^{(n-2)}(t, T) \right] C(t)\, \Gamma(t_1, t)\, dt
\end{aligned}
\tag{B-4}
$$
t

Here the norm of a matrix $J$ will be defined in the following manner:

$$
\| J \| = \sqrt{\operatorname{trace} JJ'}
\tag{B-5}
$$

This is a perfectly acceptable definition of the norm of a matrix (49).
Therefore, the application of the Schwarz inequality to Eq. (B-4) yields
the following result:

$$
\| J^{(n)} - J^{(n-1)} \| \le \int_{t_1}^{T} \| C(t)\Gamma(t_1, t) \|^2\,
   \| J^{(n-1)}(t, T) - J^{(n-2)}(t, T) \|\, dt
\tag{B-6}
$$

T h e following definition will prove useful:

m = max J i q o r ^ O I I (B-7)

mR= max r||i?(OII (B-7)

Consequently,
_ ;d)|| ^ mRm\T - tx) (B-9)
By induction

; ( n +l ) _ ;(n)|| ^ J _ _ (mRtn*)n{T _ nh ) . 1 )0

Therefore, the sequence $J^{(n+1)} - J^{(n)}$ converges to zero in the norm
defined in Eq. (B-5), and the application of the ratio test shows that the
series

$$
J^{(1)} + \sum_{n=1}^{\infty} \left( J^{(n+1)} - J^{(n)} \right) = J^{(\infty)}
\tag{B-11}
$$

converges uniformly to a limit which will be called $J^{(\infty)}$.
Since all the matrices in the sequence $\{J^{(n)}\}$ are non-negative definite,
their characteristic roots are non-negative. The characteristic roots of the
matrices in the sequence $\{J^{(n)}\}$ must converge to a limiting set of
characteristic roots since $\{J^{(n)}\}$ converges. But it is easy to show that if this
limiting set of characteristic roots had any negative members, then some
of the characteristic roots of some members of the sequence would have
to be negative (49). This contradicts the fact that all members
of the sequence $\{J^{(n)}\}$ are non-negative definite. Consequently, $J^{(\infty)}$
must be non-negative definite.
In addition, $\{J^{(n)}\}$ converges and each matrix in the sequence
satisfies Eq. (B-3). Thus $J^{(\infty)}$ must satisfy Eq. (B-3). This implies
that $J^{(\infty)}$ satisfies Eq. (B-1).
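
The method of successive approximations used in this appendix can be tried on a small example. The sketch below is illustrative only and rests on several assumptions that are not in the text: $A$, $C$, and $R$ are constant, so that $\Gamma(t_1, t) = e^{A(t - t_1)}$; the integrals are replaced by the trapezoidal rule on a uniform grid; and the helper names `expm_taylor` and `successive_approximations` are hypothetical. The iteration starts from $J^{(0)} = 0$, so the first pass produces $J^{(1)}$.

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Matrix exponential by a truncated Taylor series (adequate for small ||M||)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def successive_approximations(A, C, R, T, n_grid=100, n_iter=10):
    """Iterate J^(n) on a grid over [0, T]; return the last iterate and the step sizes."""
    n = A.shape[0]
    h = T / n_grid
    step = expm_taylor(A * h)
    # Phi[k] = exp(A k h) plays the role of Gamma(t1, t) with t - t1 = k h.
    Phi = [np.eye(n)]
    for _ in range(n_grid):
        Phi.append(Phi[-1] @ step)
    J_prev = np.zeros((n_grid + 1, n, n))   # J^(0) = 0
    diffs = []
    for _ in range(n_iter):
        J_next = np.zeros_like(J_prev)
        for i in range(n_grid):
            # Trapezoidal rule for the integral from t_i to T.
            g = [Phi[j - i].T @ (R + C.T @ J_prev[j] @ C) @ Phi[j - i]
                 for j in range(i, n_grid + 1)]
            J_next[i] = h * (0.5 * g[0] + sum(g[1:-1]) + 0.5 * g[-1])
        # Largest Frobenius norm of the change, i.e. the norm of Eq. (B-5).
        diffs.append(max(np.linalg.norm(J_next[i] - J_prev[i])
                         for i in range(n_grid + 1)))
        J_prev = J_next
    return J_prev, diffs

# Illustrative data (assumptions, not from the chapter).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = 0.5 * np.eye(2)
R = np.eye(2)
J_grid, diffs = successive_approximations(A, C, R, T=1.0)
min_eig = np.linalg.eigvalsh(0.5 * (J_grid[0] + J_grid[0].T)).min()
```

Under these assumptions the entries of `diffs` shrink rapidly, in line with the factorial bound obtained by induction, and every iterate is a sum of congruence transformations of non-negative definite matrices, so `min_eig` should not be negative beyond rounding error.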

C. The Minimization of the Expected Value of a Quadratic Form

In Sections II, III, and IV the problem of choosing $u(x(t) + z(t), t)$
[where $u$ is to be a linear function of $x + z$, $x(t)$ is the current value of
the state vector, and $z(t)$ is a random process which represents the
measurement error] so that the following quadratic form is minimized
occurs:

V f = Vf(x, tY, T)

The entities in the above expression are defined in Section II, A. Let
$F(x + z)$ be the $n$-dimensional distribution of the state vector
measurement. $F_1(x \mid x + z)$ is the conditional distribution of $x$ given the
measurement $x + z$. Also, $F(x + z, x)$ is assumed to be the joint distribution
of $x + z$ and $x$. Then the previous expression can be written in the
following form:

(C-l)

Let $u$ be defined by Eq. (40). Then Eq. (C-1) can be rewritten as

j dF(x + z) j J dFx(x I χ + z) [(u + i VÎ'BR-ή R (u + 1 R^B'Vf)

- JL VÎ'BR^B'VÎ + -L Vf'BR-WVf] J (((C - 2 )


$R$ is assumed to be positive definite. Therefore, the obvious choice of
$u$ in terms of $x + z$ which minimizes the left-hand side of Eq. (C-2) is

$$
u(x + z, t) = -\tfrac{1}{2} R^{-1} B' \widehat{\nabla f}
\tag{C-3}
$$

This proof is similar to one published by Kalman (41).
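
The argument can be illustrated with sample averages. The sketch below is not taken from the chapter: it assumes that the random vector $v$ stands for $\tfrac{1}{2}R^{-1}B'\nabla f$ and replaces the conditional expectation by a sample mean over synthetic draws; the function name `expected_cost` and all numerical values are illustrative. For positive definite $R$, the average of $(u + v)'R(u + v)$ is smallest at $u = -\bar{v}$.

```python
import numpy as np

def expected_cost(u, v_samples, R):
    """Sample average of (u + v)' R (u + v) over the rows v of v_samples."""
    d = v_samples + u
    return np.mean(np.einsum('ij,jk,ik->i', d, R, d))

rng = np.random.default_rng(0)
R = np.array([[2.0, 0.5], [0.5, 1.0]])                 # positive definite (assumed)
mix = np.array([[1.0, 0.3], [0.0, 0.7]])
v = rng.normal(size=(5000, 2)) @ mix + np.array([0.4, -1.2])   # synthetic draws of v

u_star = -v.mean(axis=0)                               # minus the (sample) mean of v
cost_star = expected_cost(u_star, v, R)
cost_shifted = expected_cost(u_star + np.array([0.1, -0.1]), v, R)
```

Any shift $\delta$ away from `u_star` raises the sample cost by exactly $\delta' R \delta$, which is the completing-the-square step behind Eq. (C-2).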

References

1. R. E. BELLMAN, "Dynamic Programming." Princeton Univ. Press, Princeton, New Jersey, 1957.
2. R. E. BELLMAN, I. GLICKSBERG, and O. GROSS, Some aspects of the mathematical theory of control processes. RAND Corp. Rept. R-313. Santa Monica, California, 1958.
3. R. E. BECKWITH, Analytic and computational aspects of dynamic programming processes of high dimension. Jet Propulsion Lab. Memo. 30-11. Pasadena, California, 1959.
4. R. E. KALMAN, The theory of optimal control and the calculus of variations. Res. Inst. Advan. Study Tech. Rept. 61-3. Baltimore, Maryland, 1961.
5. R. E. KALMAN, Contributions to the theory of optimal control. Bol. Soc. Mat. Mex. pp. 102-119 (1960).
6. A. M. LETOV, Analytical controller design. Automation Remote Control (4, 5, 6), 303-307, 389-393, 458-461 (1960); 22 (4), 363-372 (1961).
7. G. C. COLLINA and P. DORATO, Application of Pontryagin's maximum principle: linear control systems. Polytech. Inst. Brooklyn Rept. PIBMRI-1015-62. New York, 1962.
8. L. S. PONTRYAGIN, V. G. BOLTYANSKII, R. V. GAMKRELIDZE, and E. F. MISHCHENKO, "The Mathematical Theory of Optimal Processes" (translated from Russian by K. N. Trirogoff and edited by L. W. Neustadt). Wiley (Interscience), New York, 1962.
9. C. W. MERRIAM, III, An optimization theory for feedback control system design. J. Inform. Control 3, 32-59 (1960).
10. C. W. MERRIAM, III, A class of optimum control systems. J. Franklin Inst. 267, 267-281 (1959).
11. J. J. FLORENTIN, Optimal control of continuous time, Markov, stochastic systems. J. Electron. Control 10, 473-488 (1961).
12. J. J. FLORENTIN, Optimal control of systems with generalized Poisson inputs. Proc. 1962 Joint Autom. Control Conf. N.Y. Univ., N.Y. Paper No. 14-2 (1962).
13. P. R. SCHULTZ, Optimal control in the presence of measurement errors and random disturbances. Ph.D. Dissertation, Dept. Eng., Univ. Calif., Los Angeles, California, March, 1963.
14. H. J. KUSHNER, Optimal stochastic control. IRE Trans. Autom. Control 7 (5) (1962).
15. W. F. DENHAM and J. F. SPEYER, Optimal measurement and velocity correction programs for mid-course guidance. Raytheon Co. Space and Inform. Systems Div. Rept. BR-2386. 1963.
16. J. D. R. KRAMER, On the control of linear systems with time lags. J. Inform. Control 4, 299-326 (1960).
17. R. E. KALMAN and R. W. KOEPCKE, Optimal synthesis of linear sampling control systems using generalized performance indexes. Trans. ASME 80 (8), 1820-1826 (1958).
18. D. S. ADORNO, Studies in the asymptotic theory of control systems: I. Stochastic and deterministic N-stage processes. Jet Propulsion Lab. Tech. Rept. 32-21. Pasadena, California, 1960.
19. D. S. ADORNO, Studies in the asymptotic theory of control systems: II. Jet Propulsion Lab. Tech. Rept. 32-99. Pasadena, California, 1961.
20. J. J. FLORENTIN, Partial observability and optimal control. J. Electron. Control 13 (3), 263-379 (1962).
21. J. J. FLORENTIN, Optimal probing, adaptive control of a simple Bayesian system. J. Electron. Control 13 (2), 165-177 (1962).
22. T. L. GUNCKEL, II, Optimum design of sampled data systems with random parameters. Stanford Electron. Lab. Tech. Rept. 2102-2. Stanford, California, 1961.
23. T. L. GUNCKEL, II and G. F. FRANKLIN, A general solution for linear sampled data control. Proc. 1962 Joint Autom. Control Conf. N.Y. Univ., N.Y. Paper No. 15-1 (1962).
24. P. D. JOSEPH and J. T. TOU, On linear control theory. Trans. Am. Inst. Elec. Engr. Appl. Ind. 80 (Pt. II), 193-196 (1961).
25. C. POTTLE, The digital adaptive control of a linear process modulated by random noise. Proc. 1962 Joint Autom. Control Conf. N.Y. Univ., N.Y. Paper No. 15-3 (1962).
26. J. H. EATON, Discrete-time interrupted stochastic control processes. J. Math. Anal. Appl. 5 (2), 287-305 (1962).
27. S. KATZ, Best endpoint control of noisy systems. J. Electron. Control 11, 323-343 (1962).
28. R. E. KALMAN, Control of randomly varying linear systems. Proc. Symp. Appl. Math. 1962, Vol. 13 (1962). Am. Math. Soc., Providence, Rhode Island.
29. R. E. BELLMAN, On the foundations of a theory of stochastic variational processes. Proc. Symp. Appl. Math. 1962, Vol. 13 (1962). Am. Math. Soc., Providence, Rhode Island.
30. M. AOKI, Stochastic time optimal control. Trans. AIEE Pt. II 54, 41-46 (1961).
31. M. AOKI, Some properties of stochastic time optimal control systems. Dept. Eng., Rept. 60-100. Univ. Calif., Los Angeles, California, 1960.
32. N. N. KRASOVSKII, On optimum control in the presence of random disturbances. Appl. Math. Mech. 24 (1), 64-79 (1960).
33. W. KIPINIAK, "Dynamic Optimization and Control." M.I.T. Press, Cambridge, Massachusetts, and Wiley, New York, 1961.
34. R. C. BOOTON, Jr., Final value control systems with Gaussian inputs. IRE Trans. Inform. Theory 2 (3), 173-175 (1956).
35. N. N. KRASOVSKII and E. A. LIDSKII, Analytical design of controllers in systems with random properties: 1, 2, and 3. Avtomatika i Telemekhan. 22 (9, 10, 11), 1145-1150, 1273-1278, 1426-1431 (1961).
36. A. E. BRYSON, Jr., Optimum programming of multivariable control systems in the presence of noise. Proc. Optimum Systems Synthesis Conf. Wright-Patterson Air Force Base, Ohio, 1962, Air Force Systems Command Rept. ASD-TDR-63-119.
37. J. L. DOOB, "Stochastic Processes." Wiley, New York, 1953.
38. M. LOEVE, "Probability Theory," 2nd ed. Van Nostrand, Princeton, New Jersey, 1960.
39. B. V. GNEDENKO, "Theory of Probability." Chelsea Publ., New York, 1962.
40. M. S. BARTLETT, "An Introduction to the Theory of Stochastic Processes." Cambridge Univ. Press, London and New York, 1955.
41. R. E. KALMAN, New methods and results in linear prediction and filtering theory. Res. Inst. Advan. Study Tech. Rept. 61-1. Baltimore, Maryland, 1961.
42. E. B. STEAR, Synthesis of shaping filters for non-stationary stochastic processes and their uses. Ph.D. Dissertation, Dept. Eng., Univ. of Calif., Los Angeles, California, August, 1961.
43. R. E. BELLMAN, "Adaptive Control Processes: A Guided Tour." Princeton Univ. Press, Princeton, New Jersey, 1961.
44. R. E. BELLMAN and S. E. DREYFUS, "Applied Dynamic Programming." Princeton Univ. Press, Princeton, New Jersey, 1962.
45. J. J. LEVIN, On the matrix Riccati equation. Trans. Am. Math. Soc. 10 (4), 519-524 (1959).
46. W. T. REID, A matrix differential equation of the Riccati type. Am. J. Math. 68 (2), 237-246 (1946); Addendum to a matrix differential equation of the Riccati type. 70 (3), 460 (1948).
47. W. T. REID, Solutions of a Riccati matrix differential equation as functions of initial values. J. Math. Mech. 8 (2), 221-230 (1959).
48. R. REDHEFFER, Inequalities for a matrix Riccati equation. J. Math. Mech. 8 (3), 349-367 (1959).
49. E. A. CODDINGTON and N. LEVINSON, "Theory of Ordinary Differential Equations." McGraw-Hill, New York, 1955.
50. R. E. BELLMAN, "Introduction to Matrix Analysis." McGraw-Hill, New York, 1960.
51. R. E. KALMAN and R. S. BUCY, New results in linear filtering and prediction theory. J. Basic Eng. 83D, 95-108 (1961).
52. H. C. HSIEH, On the synthesis of adaptive controls by Hilbert space approach. Dept. Eng. Rept. 62-19. Univ. Calif., Los Angeles, California, 1962.

On Line Computer Control
Techniques and Their Application
to Re-entry Aerospace Vehicle Control
FRANCIS H. KISHI¹
Electronics Division,
TRW Space Technology Laboratories,
Redondo Beach, California

I. Introduction 247
   A. General Statement 247
   B. A Definition 247
   C. Background 248
   D. Objectives of the Study 251
   E. Organization of Chapter 252
II. Optimum Linear Discrete Control 253
   A. Introduction 253
   B. General Philosophy 253
   C. Control of Continuous Linear Process by Digital Computer 255
   D. Discussion of the Maximum Principle and the Calculus of Variations Approach 258
   E. Dynamic Programming Approach 260
   F. An Extension to the Stochastic Case 262
   G. Stability of the Closed-Loop System 263
   H. How Good is Suboptimal? 265
   I. Additional Remarks 267
III. Synthesis of Control Forces with Inequality Constraints 267
   A. Introduction 267
   B. Problem Formulation 268
   C. Coordinatewise Gradient Method 270
   D. Remarks on Convergence 273
   E. A Remark on the Initial Trial 273
   F. Example for One Optimization Interval 274

¹ Formerly with the Department of Engineering, University of California, Los Angeles
and Hughes Aircraft Company, Culver City, California. The work discussed in this
chapter was supported by the Adaptive Control Project at UCLA under AFOSR Grant
62-68 and by AF Contract AF 33(657)-7154 under Task No. 82181 of Project No. 500
administered under the direction of the Flight Control Laboratory, Aeronautics System
Division, AF Systems Command.

   G. Simulation 276
   H. Effect of Uncertainties in Process Parameters 281
   I. Bounds on the Rate of Change of the Control Variable 282
   J. Weighting between Error and Control Energy 283
   K. Bounds on Both Control Force and the Rate of Change of Control Force 284
   L. A Compromise Procedure for the Multiple Constraint Case 287
   M. Between Sample Considerations 288
   N. A Hybrid Computational Procedure 288
IV. Identification of Process Parameters: Explicit Mathematical Relation Method 292
   A. Introduction 292
   B. Description of the Mathematical Relation Method 293
   C. Additional Filtering 296
   D. Block Processing of Data 296
   E. Exponential Weighting 297
   F. Uniform Weighting: Observable Case Only 299
   G. Confidence Interval 301
   H. Determination of Pulse Response 303
V. Identification of Process Parameters: Learning Model Method 305
   A. Introduction 305
   B. Margolis' Sensitivity Function Approach 306
   C. Modified Newton's Approach 308
   D. Algorithm and Convergence 310
   E. Simulation 312
   F. A Possible Alternative 322
VI. State Variable Estimation 323
   A. Introduction 323
   B. Outline of Estimation Problem 323
VII. Application to the Re-entry Flight Control Problem 324
   A. Introduction 324
   B. Flight-Path Control Problem 324
   C. Constant Altitude Controller 326
VIII. Summary and Suggested Extensions 330
   A. Summary 330
   B. Suggestions for Further Research 331
Appendix 1. Notation and Concise Statement of Problems 332
Appendix 2. A Brute Force Method for the Quadratic Programming Problems 336
Appendix 3. Quadratic Programming Theorems 339
Appendix 4. A Recursive Method to Obtain the Best Estimate 343
Appendix 5. Correspondence Between Greville's and Kalman's Recursive Procedures 352
List of Symbols 355
References 355

I. Introduction

A. General Statement
This chapter is concerned with the problem of controlling processes
under the condition of uncertain changes in the process to be controlled.
Of course, the feedback principle solves this problem to some extent.
Larger process variations and increased accuracy requirements, however,
dictate the need for more sophistication in the control. Control systems
designed specifically to consider these problems have been called
"adaptive" control systems.
Independent of the study on adaptive controls, engineers and
mathematicians have been concerned with optimal controls, i.e., the
computation of controlling forces for a known process which minimize some
performance criterion. One can use results from optimal controls for
the adaptive control problem if identification is first made on the
process. Such a philosophy has been taken by Kalman (1), Merriam (2),
Braun (3), Meditch (4), and Hsieh (5). This problem area will also be
the concern of our investigations.
Section I is divided into four parts. First, a definition of adaptive
controls will be given to set the general framework for our discussions.
Second, background material will be given which is pertinent to the
present study. Third, the purpose and goal of our endeavor will be stated.
Last, the organization of this chapter will be given.

B. A Definition
Many definitions of adaptive control systems have been set forth in
the past, with each definition generally chosen to fit the needs of the
investigator. This situation has resulted in much confusion. It is not the
intent here to set forth another definition but to choose a definition
given by Zadeh (6) which sets the framework for most of the adaptive
controls given in the past. Before stating this definition it will be necessary
to explain some terminology.
Let us consider an adaptive system $\alpha$, which will have access to
various inputs as shown in Fig. 1. These inputs are reference inputs,
known inputs to the process, disturbances, and measurable outputs.
These inputs $a(t)$, defined for $0 \le t \le T_1$, form a set defined for
some $\gamma \in \Gamma$, i.e., $S_\gamma \triangleq \{a(t)\}$. For a particular $\gamma$ any member from
a family of time functions is possible. For another $\gamma$ the time functions
will be members from another family, etc. The performance of $\alpha$ is
assumed to be measured by a performance criterion, $\mathcal{P}$; when $S_\gamma$
is the input to $\alpha$, the performance criterion is $\mathcal{P}(\gamma)$. We define a criterion
of acceptability by the relation $\mathcal{P}(\gamma) \in W$, i.e., if the performance criterion
is maintained in the set $W$, then the system is acceptable. These notions
lead to the following definition.

[Figure 1: block diagram. Recoverable labels: known inputs to process; parameters of known disturbances; known disturbances; unknown disturbances (additive and parametric).]

FIG. 1. Adaptive control systems.

DEFINITION: A system, $\alpha$, is adaptive with respect to $\{S_\gamma\}$ and $W$ if
it satisfies the criterion of acceptability [i.e., $\mathcal{P}(\gamma) \in W$] with every source
in the family $\{S_\gamma\}$, $\gamma \in \Gamma$. More compactly, $\alpha$ is adaptive with respect to $\Gamma$
if it maps $\Gamma$ into $W$.
With this definition even ordinary feedback systems may be adaptive.
The problem arises, however, when $\mathcal{P}(\gamma)$ cannot be maintained in $W$.
Here, it becomes necessary to consider more complex mechanisms
within $\alpha$ to satisfy the criterion of acceptability. Therefore, one is led
to many possible alternatives for the construction of $\alpha$, each with an
attempt to satisfy the criterion of acceptability.

C. Background
Although the definition given above was stated recently, engineers in
the past knew intuitively what was desired, i.e., to achieve acceptable
control in the presence of large variations in the process. Even before
the term "adaptive" was attached to control systems, engineers used,
for example, air data measurements to vary the controller. This situation
can certainly be adaptive by the definition given above. No attempt
will be made in this section to survey all the different schemes devised
in the past because several good surveys are available (7, 8, 9).
It will be more the intent to delineate the three categories into which
the different schemes seem to fall. These are (1) the high-gain scheme,
(2) the model-referenced scheme, and (3) the optimum-adaptive scheme.
From the practical standpoint, the high-gain scheme, first proposed
by Minneapolis-Honeywell Company (10), has been widely discussed
and tested. It has been proven to be of wide applicability. The gain in
the feedback loop around the changing process is kept as high as possible
in order that the input-output transference is close to unity. Because
stability problems arise at high gain, the signal in the loop is monitored
to check for oscillations. With this information the loop gain is adjusted
to keep the system on the verge of instability. A response close to that
of a particular model is obtained regardless of the process parameters
by placing a model in front of the feedback loop. A schematic diagram
of the high-gain scheme is shown in Fig. 2. One of the objections to this
approach is that the designer must have considerable a priori information
about the process, i.e., he must know the general vicinity where the roots
of the system go into the right half plane. Of course, a frequency-insensitive
unity gain can only be approached, implying that the output response
will differ to some extent from the model response. Also, small oscillations
are always present in the loop. (This oscillation has been reported to be
unobjectionable in aerospace applications.) A variation of the same
philosophy has been recently given by Horton (11).
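
The reason a high loop gain brings the input-output transference close to unity can be seen from static gains alone. The sketch below is an illustration under simplifying assumptions that are not in the text: the process is reduced to a static gain $p$, the loop is a unity-feedback loop with gain $k$, and dynamics (and hence the stability limit discussed above) are ignored. The function name `closed_loop_gain` and the numbers are hypothetical.

```python
def closed_loop_gain(k, p):
    """Static input-output transference of a unity-feedback loop: k p / (1 + k p)."""
    return k * p / (1.0 + k * p)

process_gains = [0.5, 1.0, 5.0]          # assumed tenfold variation of the process gain
high = [closed_loop_gain(100.0, p) for p in process_gains]
low = [closed_loop_gain(1.0, p) for p in process_gains]
spread_high = max(high) - min(high)      # variation of the transference at high gain
spread_low = max(low) - min(low)         # variation of the transference at low gain
```

With $k = 100$ the transference stays within two percent of unity over the whole range of $p$, whereas with $k = 1$ it varies by a factor of more than two. In a real loop the usable gain is limited by the onset of oscillation, which is why the scheme monitors the loop signal.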

FIG. 2. High gain scheme.

If one is willing to accept more complexity, the model-referenced
scheme can provide better response. This scheme has been tested
successfully in experimental flight tests by a group at M.I.T. (12).
Stability problems associated with this type of scheme have been studied
by Donalson (13). A schematic diagram of this method is shown in
Fig. 3. The method simply adjusts the controller parameters so that the
process output is kept close to the model output. The stability problem
ensues in the parameter adjustment loop. With this scheme unstable
processes and nonlinear processes can be handled. The method, however,
requires a good knowledge of the form of the process.

[Figure 3: block diagram. Recoverable labels: disturbances, controller, process, parameter adjustment, reference model.]

FIG. 3. Model referenced scheme.
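
The idea of adjusting a controller parameter until the process output follows the model output can be sketched in a few lines. The text does not state the adjustment law, so the gradient-type rule below is an assumption chosen for illustration, as are the static process and model gains, the constant reference input, and the function name `adapt_gain`.

```python
def adapt_gain(k_process, k_model, rate, steps=200, theta0=0.0, r=1.0):
    """Adjust a feedforward gain theta so that k_process * theta * r tracks k_model * r."""
    theta = theta0
    history = []
    for _ in range(steps):
        error = k_process * theta * r - k_model * r   # process output minus model output
        theta -= rate * error * r                     # assumed gradient-type adjustment
        history.append(abs(error))
    return theta, history

# A small adjustment rate converges to theta = k_model / k_process = 0.5.
theta_ok, hist_ok = adapt_gain(k_process=2.0, k_model=1.0, rate=0.1)
# A large adjustment rate makes the parameter adjustment loop itself unstable.
theta_bad, hist_bad = adapt_gain(k_process=2.0, k_model=1.0, rate=1.5, steps=20)
```

In this toy setting the error is multiplied by $1 - \text{rate} \cdot k_{\text{process}} r^2$ at every step, so the second run diverges. This is one simple way to see why the stability question moves into the parameter adjustment loop.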
As the state of the computer art advances, one asks if there are still
better methods which can improve upon the accuracy of the system.
With regard to this, optimum-adaptive schemes are investigated. This
area is still primarily in the exploratory stage with no applications
reported. Experimental verification has been made to a limited extent
via analog and digital simulation. Some of the contributors in this
area are Kalman (1), Merriam (2), Braun (3), Meditch (4), and Hsieh (5).
A schematic diagram for this scheme is given in Fig. 4. Basically the
technique solves an optimization problem on the assumption that the
process and the states are known. Since the process state and parameters
are unknown to some extent in an adaptive task, both state estimation
and process identification must be performed.
The identification problem has been investigated by many investigators
independent of the adaptation scheme. Some background material on
identification will be given in Sections IV and V.

[Figure 4: block diagram. Recoverable labels: disturbances, computation of optimum controls, reference input, process, identification, state estimation.]

FIG. 4. Optimum-adaptive scheme.

D. Objectives of the Study


The major objective is to investigate unexplored areas of the optimum-
adaptive approach to adaptation. We will look into both the area of
synthesis of optimum controls and the area of identification. Application
will then be sought in the area of re-entry of aerospace vehicles.
A large amount of background material is available for the optimum
control problem. In fact, several alternative approaches are
available. These are (1) maximum principle, (2) dynamic programming,
(3) functional analysis, and (4) steepest descent methods. The on-line
computation of optimal controls, however, is not in a satisfactory state
of affairs except possibly for the quadratic criterion-linear process case.
The previous investigators for the most part have remained in this
latter case. In our investigation we impose an added constraint of bounds
on the control force. We will also stay, however, in the quadratic criterion-
linear process case. The nonlinear (quadratic) programming approach is
used, as it seems to be the most suitable method when we have this
additional constraint. In our problem we postulate a digital computer
to compute the control forces. This postulation reduces the problem
to the discrete case.
In the area of identification, we will investigate two principal areas.
First, the statistical aspects of the estimated parameters will be studied.
Here, we study the concept of confidence interval primarily for the case
with unknown variances. Secondly, we re-examine the learning model
approach of Margolis (14) from another viewpoint, i.e., we will study
an integral error-square criterion previously unexplored by Margolis.
A modified Newton's procedure will be employed.
Generally, the identification problem is coupled with the state estima-
tion problem. Identification of process parameters can be made if the
states are known, or estimation of the states can be made if the process is
known. We will attack this problem by investigating identification
methods which depend only on partial knowledge of the states. Then,
estimation of the states will be made with the identified parameters.

E. Organization of Chapter

This chapter is organized into eight sections with five appendices.
This first section has provided an introduction to the subject via
definitions, background material, and objectives.
Section II gives algorithms for on-line discrete control of linear
processes with quadratic criterion without inequality constraints on the
control force. This is primarily review material and is included as a
preliminary to Section III.
Section III gives algorithms for on-line discrete control of linear
processes with quadratic criterion with inequality constraints on the
control force. Here, quadratic programming methods will be applied
and suitability of on-line computation will be verified by experimentation
through digital simulation.
Section IV explores the statistical aspects of the explicit mathematical
relation method of identification. Also, the recursive method of
Greville (15)-Kalman (16) will be applied to identification.
Section V explores the learning model approach with integral error-
square criterion. The application of Newton's method to the learning
model approach will be verified through experimentation via digital
simulation.
Section VI gives a method for state variable estimation. This is
again primarily review material, but it is an integral part of the overall
adaptive system.
Section VII explores possible application areas for the proposed
method of adaptation. The area of re-entry of aerospace vehicles is
chosen.
Section VIII concludes the chapter with suggestions for future studies.
Appendix 1 describes the notation used in the control system and
states concisely the problems attacked in this chapter.
Appendix 2 gives a brute-force method to solve the quadratic pro-
gramming problem of Section III. Although the method is cumbersome,
it is included because it gives added insight into the problem.
Appendix 3 reviews the pertinent quadratic programming theorems.
Several routines described in Section III will draw heavily from these
theorems, which are based on the Kuhn and Tucker (17) theorems.
In Appendix 4 the recursive method of Greville is adopted for the
identification problem. The algorithms are rederived from the postulates
given by Penrose (18, 19).
Appendix 5 gives the correspondence between Greville's and Kalman's
recursive procedures.

II. Optimum Linear Discrete Control

A. Introduction

This section gives algorithms for optimum linear discrete controls
with a quadratic performance criterion. No inequality constraints will
be considered here. This is primarily review material and is included
to set the stage for Section III. As previously stated, we confine ourselves
to the discrete control case, as we postulate a digital computer to perform
the synthesis.
This section first gives a philosophical basis for our adaptive control
before proceeding to give algorithms.

B. General Philosophy

We envision using the optimum-adaptive control to keep the process
output close to some desired trajectory. This operation is to be main-
tained over some time interval which we will designate as the operation
interval. In other words, we desire to minimize the performance criterion

\[ \mathcal{P} = \sum_{k=1}^{N_1} \| y_d(k) - y(k) \|_Q^2 \tag{1} \]

where

y_d(k) is the desired trajectory
y(k) is the actual trajectory
N_1 is the number of sampling intervals in the operation interval
k = 0 is the beginning of the operation interval
Q is a non-negative weighting matrix

The desired trajectory will be assumed known throughout the operation
interval. A controller designed to minimize $\mathcal{P}$ is termed a follower
and an example will be given in Section VII.
The optimization of Eq. (1) is not practical primarily for three reasons.
First of all, open-loop control ensues and it is more desirable to recom-
pute periodically the optimal control. Secondly, the process is uncertain
for time into the future. Thirdly, the on-line numerical computation
required may be too large. As a result, it is more practical to perform
periodically the following optimization. We choose a fixed time interval
into the future from the present time, designated the optimization interval,
and perform a minimization over this interval. Therefore, instead of
Eq. (1) we minimize periodically

\[ J = \sum_{j=k+1}^{k+N} \| y_d(j) - y(j) \|_Q^2 \tag{2} \]

where
N is the number of sampling intervals in the optimization interval
k is the present time
The time relation of the intervals under consideration is given in Fig. 5.

[Figure: time axis marked k = 0 and k = N, showing the operation interval, the sampling interval T, and the optimization interval from t to t + NT.]

FIG. 5. Time relation of intervals under consideration.


The idea of adaptive controls originated from a desire to emulate
the desirable human characteristics. Therefore, as the general philosophy,
we give a human analogy discussion. A similar discussion was first
presented by Merriam (2).
A human faced with a control problem, such as driving an automobile,
has the problem of selecting optimally the next decision in a multistage
decision process. This decision will be based on the present state and
the knowledge (which may be intuitive) of the process response (automobile
behavior). A human will decide on a particular control on the basis of
considerations given over a relatively short time into the future. For
example, the road conditions may change and the human will not
apply the same control on a rough road as on an icy road. With knowledge
of the desired path over a short time into the future (optimization interval)
and the knowledge of the vehicle response, the human can apply proper
control effort on the steering wheel. The criterion given by Eq. (2) will
then replace the subjective evaluation performed by a human.
Repeating ourselves to some extent, although Eq. (2) may lead to
suboptimal policies it may be the only proper criterion to apply in any
given circumstance. State estimation and process identification were
inherent in the above discussion. These functions are performed by the
human through observation and testing of the vehicle response. As a human
can adapt to different vehicles (different responses) and also to changes
in the same vehicle (road conditions, tire blow-out, etc.), an adaptive
control must be able to perform these tasks if it is to have the finer
human capabilities.

C. Control of Continuous Linear Process by Digital Computer


We will attempt to control processes which are describable by linear
ordinary differential equations. We immediately make the following
assumption.
Assumption. Changes occurring in the process during an optimization
interval will be assumed to be small.
This assumption allows us to use constant coefficient differential
equations which in turn will relieve the computational requirements.
With more complexity, considerations can be carried over to the variable
coefficient case.
The process is then described by

\[ \dot{x}(t) = A x(t) + B u(t) \tag{3} \]

where
A is an n × n matrix
B is an n × r matrix
The solution of Eq. (3) is given by

\[ x(t) = X(t) \Big[ x(0) + \int_0^t X^{-1}(\tau) B u(\tau)\, d\tau \Big] \tag{4} \]

where X(t) is the matrix solution of

\[ \dot{X}(t) = A X(t) \quad \text{with } X(0) = I \text{ (identity matrix)} \]

When digital computers are employed as controllers, the control
signal will have the appearance of the staircase signal shown in Fig. 6.

[Figure: staircase control signal plotted against t, with the time axis marked in multiples of T up to 5T.]

FIG. 6. Staircase signal.

Mathematically, it is formed by a sample-hold combination. In math-
ematical notation,

\[ u(\tau) = u(k), \qquad (k-1)T < \tau \le kT, \quad k = 1, 2, \ldots, N \tag{5} \]

For this staircase situation, Eq. (4) can be solved at discrete instants
of time.

\[ x(k) = X(k) \Big[ x(0) + \int_0^{kT} X^{-1}(\tau) B u(\tau)\, d\tau \Big]
        = X(k) \Big[ x(0) + \sum_{i=1}^{k} \int_{(i-1)T}^{iT} X^{-1}(\tau) B u(\tau)\, d\tau \Big] \tag{6} \]

Since

\[ x(k-1) = X(k-1) \Big[ x(0) + \sum_{i=1}^{k-1} \int_{(i-1)T}^{iT} X^{-1}(\tau) B u(\tau)\, d\tau \Big] \]

we can write x(k) in terms of x(k − 1).

\[ x(k) = X(k) X(k-1)^{-1} x(k-1) + X(k) \int_{(k-1)T}^{kT} X^{-1}(\tau) B\, d\tau \; u(k) \]
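Let

\[ \Phi = X(k) X(k-1)^{-1}, \qquad \Gamma = X(k) \int_{(k-1)T}^{kT} X^{-1}(\tau) B\, d\tau \]

we note that X(k) is the solution of

\[ \dot{X}(t) = A X(t), \quad X(0) = I, \quad \text{evaluated at } t = kT \]

Therefore,

\[ x(k) = \Phi x(k-1) + \Gamma u(k) \tag{7} \]

In terms of Φ and Γ, Eq. (6) becomes

\[ x(k) = \Phi^k x(0) + \sum_{i=1}^{k} \Phi^{k-i} \Gamma u(i) \tag{8} \]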
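
The passage from the continuous process of Eq. (3) to the discrete process of Eq. (7) can be illustrated numerically. The following is a minimal sketch, assuming constant A and B over the sampling interval, so that Φ = exp(AT) and Γ is the integral of exp(As) over 0 ≤ s ≤ T, multiplied by B. The function names and the truncated power series are illustrative choices, not part of the chapter.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def discretize(A, B, T, terms=30):
    """Return (Phi, Gamma) for x' = Ax + Bu under a sample-and-hold input.

    Phi   = sum_k (A T)^k / k!
    Gamma = (sum_k A^k T^(k+1) / (k+1)!) B
    Both series are truncated after `terms` terms (an assumption that is
    adequate when the norm of A*T is moderate).
    """
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    phi = [row[:] for row in identity]
    integral = [[T * v for v in row] for row in identity]
    term = [row[:] for row in identity]          # holds (A T)^k / k!
    for k in range(1, terms):
        term = [[v * T / k for v in row] for row in mat_mul(term, A)]
        phi = [[p + t for p, t in zip(pr, tr)] for pr, tr in zip(phi, term)]
        integral = [[s + t * T / (k + 1) for s, t in zip(sr, tr)]
                    for sr, tr in zip(integral, term)]
    return phi, mat_mul(integral, B)


# Double integrator (hypothetical example): position driven by acceleration.
phi, gamma = discretize([[0.0, 1.0], [0.0, 0.0]], [[0.0], [1.0]], T=0.5)
```

For the double integrator the series terminates after two terms, so the result can be checked by hand: Φ = [[1, T], [0, 1]] and Γ = [T²/2, T].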

It may not be possible to measure all the state variables. The measured
output, y(k), is usually some linear combination of the state variables.

\[ y(k) = H x(k) \tag{9} \]

The basic deterministic model is shown in Fig. 7.

[Figure: block diagram: PROCESS block with state x, followed by block H giving output y.]

FIG. 7. Deterministic model.

To the deterministic model we can add stochastic disturbances:
(1) load disturbances and (2) measurement errors. The distinction
should be carefully noted. Load disturbances generally cause the state
variables to become stochastic, and these can be incorporated into the
deterministic model by including in addition to the control forces,
u(k), other inputs, w(k), which are white noise. Measurement error can,
without too much loss of generality, be considered as additive white
noise on the output variable. The model with stochastic disturbances
is shown in Fig. 8.

[Figure: block diagram of the PROCESS with stochastic disturbance inputs.]

FIG. 8. Model with stochastic disturbances.


In the discussion on optimization, it is desired to restrict the amplitude
of the control force. This is accomplished in this chapter indirectly
by adding terms to the performance criterion, Eq. (2).

\[ J = \sum_{j=k+1}^{k+N} \Big( \| y_d(j) - y(j) \|_Q^2 + \| u(j) \|_R^2 \Big) \tag{10} \]

where R is a non-negative weighting matrix.

D. Discussion of the Maximum Principle and the Calculus of Variations Approach
The maximum principle and the calculus of variations approach can
be applied to the discrete version of the linear-process, quadratic-criterion
case. Chang (20) and Katz (21) investigated the maximum principle for
the discrete case giving necessary conditions. As the calculus of variations
approach yields the same algorithm, this latter point of view will be
discussed in this section. This approach was taken by Kipiniak (22). It
will be observed that this approach leads to a feedback control law,
i.e., the control is given as a function of the state variables.
It should be pointed out that although only necessary conditions are
satisfied, the solution to the necessary condition could be necessary
and sufficient. That is, if we know the existence and uniqueness of the
minimum to the problem and if only a unique solution is provided by
the necessary condition, that solution is the minimum. From the argu-
ments given in Appendix 3 we can show existence and uniqueness of
the minimum. It is noted that the infinite domain is a convex set.
Let us consider the linear process

\[ x(j) = \Phi x(j-1) + \Gamma u(j) \tag{7} \]

with x(k) = x⁰, and the criterion²

\[ J = \sum_{j=k+1}^{k+N} \Big( \tfrac{1}{2} \| x(j) \|_Q^2 + \tfrac{1}{2} \| u(j) \|_R^2 \Big) \tag{11} \]

Although Eq. (11) differs from Eq. (10), the derivation follows the same
lines. Equation (11) is applicable directly to the regulator problem
which is important in itself. Using Lagrange multipliers, the constrained
functional to be minimized becomes

\[ J_1 = \sum_{j=k+1}^{k+N} \Big( \tfrac{1}{2} \| x(j) \|_Q^2 + \tfrac{1}{2} \| u(j) \|_R^2
       + \langle p(j),\; x(j) - \Phi x(j-1) - \Gamma u(j) \rangle \Big) \]

² The use of x instead of y implies H = I. If the criterion does not contain every state,
x_i, then Q can appropriately be chosen with zero elements.
The necessary condition states that the total differential of J_1 vanishes
for independent differentials of x(j), u(j), and p(j). Taking the differen-
tial, we get

Therefore, t h e following relations m u s t be satisfied.

(7)
(12)3

(13)

with transversality condition

(14)
Or,

1)

(15)

Or,

(16)

where

Thus,

(17)

³ It is noted that (Φ*)⁻¹ exists since Φ is a fundamental matrix.

where

\[ \Psi = \Theta^N = \begin{bmatrix} \Psi_{11} & \Psi_{12} \\ \Psi_{21} & \Psi_{22} \end{bmatrix} \]

Since p(k + N) = −Q x(k + N), we can eliminate p(k + N) and
x(k + N) from Eq. (17). Thus,

\[ p(k) = -(\Psi_{22} + Q \Psi_{12})^{-1} (\Psi_{21} + Q \Psi_{11})\, x(k) \tag{18} \]

and

\[ p(k+1) = \big[ \Theta_{21} - \Theta_{22} (\Psi_{22} + Q \Psi_{12})^{-1} (\Psi_{21} + Q \Psi_{11}) \big]\, x(k) \]

Thus, the feedback solution is given by Eq. (13). The inverse here is
assumed to exist. It seems that Q should be chosen so that the inverse
exists even for large parameter variations. It is noted that existence is
required if the solution of the necessary conditions is to be both
necessary and sufficient.

\[ u(k+1) = -\Lambda x(k) \tag{19} \]

where

\[ \Lambda = -R^{-1} \Gamma^* \big( \Theta_{21} - \Theta_{22} (\Psi_{22} + Q \Psi_{12})^{-1} (\Psi_{21} + Q \Psi_{11}) \big) \]

If the process does not change, Λ will be a constant and the feedback
problem is easy. In an adaptive task, Θ, Ψ, and Λ must be updated
as Φ and Γ change.
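
As a cross-check on this section, the necessary conditions can be rederived directly from the functional J_1. The block below is a sketch of that derivation, not a transcription of Eqs. (12) through (17). The arrangement of the blocks of Θ is an assumption, chosen so that it agrees with the use of Θ₂₁ and Θ₂₂ in the expression for p(k + 1) and with the numbers quoted in Example II-1.

```latex
\begin{align*}
\frac{\partial J_1}{\partial x(j)} = 0 &:\quad
  p(j+1) = (\Phi^*)^{-1} p(j) + (\Phi^*)^{-1} Q\, x(j), \\
\frac{\partial J_1}{\partial u(j)} = 0 &:\quad
  u(j) = R^{-1} \Gamma^* p(j), \\
\text{last stage, } j = k+N &:\quad
  p(k+N) = -Q\, x(k+N), \\
\begin{bmatrix} x(j+1) \\ p(j+1) \end{bmatrix}
  &= \Theta \begin{bmatrix} x(j) \\ p(j) \end{bmatrix},
  \qquad
  \Theta = \begin{bmatrix}
    \Phi + \Gamma R^{-1} \Gamma^* (\Phi^*)^{-1} Q & \Gamma R^{-1} \Gamma^* (\Phi^*)^{-1} \\
    (\Phi^*)^{-1} Q & (\Phi^*)^{-1}
  \end{bmatrix}, \\
\begin{bmatrix} x(k+N) \\ p(k+N) \end{bmatrix}
  &= \Theta^N \begin{bmatrix} x(k) \\ p(k) \end{bmatrix}.
\end{align*}
```

With Φ = 0.9 and Γ = Q = R = 1, this Θ has the entries 2.01111, 1.11111, 1.11111, 1.11111, which is the matrix appearing in Example II-1.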

E. Dynamic Programming Approach


The derivation and algorithm given in this section are due to
Kalman (16). Again, necessary conditions are used to arrive at the
solution. As before, if a unique solution is provided, the solution is
necessary and sufficient.
We start with the process

\[ x(j) = \Phi x(j-1) + \Gamma u(j) \tag{7} \]

with x(0) = x⁰, and the criterion⁴

\[ J = \sum_{j=1}^{N} \Big( \tfrac{1}{2} \| x(j) \|_Q^2 + \tfrac{1}{2} \| u(j) \|_R^2 \Big) \tag{11} \]

Let

\[ f_{\bar{N}} = \min J \]

(the bar above the time index N indicates time-to-go)

⁴ To simplify the notation in this section and Section II, F, the time index k has been
dropped (j = 1 ⇒ j = k + 1, j = N ⇒ j = k + N).
or,

\[ f_{\bar{N}} = \min_{u(1)} \big\{ l(x(0), u(1)) + f_{\overline{N-1}}(x(1)) \big\} \tag{20} \]

\[ f_{\bar{1}} = \min_{u(N)} \big\{ l(x(N-1), u(N)) \big\} \]

For the problem we are considering, it can be shown by induction
that $f_{\bar{m}}$ is a quadratic form, or

\[ f_{\bar{m}} = x^*(j)\, M(\bar{m})\, x(j) \tag{21} \]

where

m + j = N
m is the time-to-go
j is the running time

It is noted that

\[ f_{\bar{0}} = 0 \]

and therefore,

\[ M(\bar{0}) = 0. \]

Also,

\[ l = \| x(j) \|_Q^2 + \| u(j) \|_R^2 \tag{22} \]

Upon substituting Eqs. (21) and (22) into Eq. (20),

\[ f_{\bar{N}} = \min_{u(1)} \big\{ \| x(1) \|_Q^2 + \| u(1) \|_R^2 + \| x(1) \|_{M(\overline{N-1})}^2 \big\} \]

Since x(1) is a function of u(1) and x(0),

\[ f_{\bar{N}} = \min_{u(1)} \big\{ \| \Phi x(0) + \Gamma u(1) \|_{Q + M(\overline{N-1})}^2 + \| u(1) \|_R^2 \big\} \]

Or

\[ f_{\bar{N}} = \min_{u(1)} \big\{ \| x(0) \|_{\Phi^* (Q + M(\overline{N-1})) \Phi}^2
   + \| u(1) \|_{\Gamma^* (Q + M(\overline{N-1})) \Gamma + R}^2
   + 2 u^*(1) \Gamma^* (Q + M(\overline{N-1})) \Phi x(0) \big\} \tag{23} \]

Differentiating the quantity in the bracket with respect to u(1) and
setting the derivative equal to zero, we get⁵

\[ u(1) = -\big( \Gamma^* (Q + M(\overline{N-1})) \Gamma + R \big)^{-1} \Gamma^* (Q + M(\overline{N-1})) \Phi x(0) \]

⁵ It is easily seen that the inverse here exists, since the first term in the parenthesis
is positive semidefinite and the second term is positive definite.
or, the feedback solution is given by [equivalent to Eq. (19)]

\[ u(1) = -\big( \Gamma^* P(\overline{N-1}) \Gamma + R \big)^{-1} \Gamma^* P(\overline{N-1}) \Phi x(0) \tag{24} \]

where

\[ P(\overline{N-1}) = Q + M(\overline{N-1}) \]

A recursive relation can be derived for $P(\overline{N-1})$. We note that

\[ f_{\bar{N}} = \| x(0) \|_{M(\bar{N})}^2 = \| x(0) \|_{P(\bar{N}) - Q}^2 \]

Upon substituting Eq. (24) into Eq. (23), we also have

\[ f_{\bar{N}} = \| x(0) \|_{\Phi^* P(\overline{N-1}) \Phi}^2
   + \| x(0) \|_{\Phi^* P(\overline{N-1}) \Gamma (\Gamma^* P(\overline{N-1}) \Gamma + R)^{-1} \Gamma^* P(\overline{N-1}) \Phi}^2
   + \| x(0) \|_{-2 \Phi^* P(\overline{N-1}) \Gamma (\Gamma^* P(\overline{N-1}) \Gamma + R)^{-1} \Gamma^* P(\overline{N-1}) \Phi}^2 \]

Therefore,

\[ P(\bar{N}) = \Phi^* \big\{ P(\overline{N-1}) - P(\overline{N-1}) \Gamma \big( \Gamma^* P(\overline{N-1}) \Gamma + R \big)^{-1} \Gamma^* P(\overline{N-1}) \big\} \Phi + Q \tag{25} \]

with $P(\bar{0}) = Q$, which follows from $M(\bar{0}) = 0$. This is a nonlinear
Riccati equation. Equations (25)
and (24) give the optimal control force. At each sampling interval these
equations are reused. The quantities Φ and Γ can be changed as new
information is available. Whether the algorithm given in this section is
better than that given in the previous section is debatable. The two may
well be computationally equivalent. One difference which is evident is that
for the second algorithm we are assured of a unique solution. In the
first algorithm an inverse was assumed to exist.
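
The recursion of Eqs. (24) and (25) is short enough to sketch for a scalar process. The following is an illustrative sketch only: the function name is hypothetical, the process is scalar so that every transpose and inverse becomes ordinary arithmetic, and the recursion is assumed to start from P = Q at zero time-to-go, which is what P = Q + M with M = 0 implies.

```python
def riccati_feedback(phi, gamma, q, r, horizon):
    """Scalar version of Eqs. (24) and (25).

    Returns (gain, pole) where u(1) = -gain * x(0) and
    x(1) = pole * x(0) for an optimization interval of `horizon` samples.
    """
    p = q  # assumed starting value: P at zero time-to-go equals Q
    for _ in range(horizon - 1):
        # Eq. (25): P <- Phi* {P - P G (G* P G + R)^-1 G* P} Phi + Q
        p = phi * (p - p * gamma * gamma * p / (gamma * p * gamma + r)) * phi + q
    # Eq. (24), evaluated with P at time-to-go (horizon - 1)
    gain = gamma * p * phi / (gamma * p * gamma + r)
    return gain, phi - gamma * gain


for n in (1, 2, 4, 8):
    gain, pole = riccati_feedback(0.9, 1.0, 1.0, 1.0, n)
    print(n, round(gain, 5), round(pole, 5))
```

The values Φ = 0.9 and Γ = Q = R = 1 are those of the scalar examples later in this section. Recomputing the gain at every sampling instant with updated Φ and Γ is what makes the scheme usable in the adaptive setting described above.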

F. An Extension to the Stochastic Case

With stochastic disturbances, the algorithms derived for the deter-
ministic case can still be applied if a particular criterion function is
chosen. This situation was first shown for the white noise case by Joseph
and Tou (23). Extensions to the more general case were given by
Gunckel and Franklin (24), Florentin (25), and Schultz (26). Apparently,
this situation was known previously to statisticians under the name
"Uncertainty Equivalence Principle." A result of their studies is pre-
sented in this section.
The stochastic model is given by

\[ x(k) = \Phi x(k-1) + \Gamma u(k) + S w(k) \]

\[ z(k) = H x(k) + v(k) \]

where w(k) and v(k) are sequences of independent Gaussian noise. We
choose the following performance criterion.

The optimal control is then given by

\[ u(1) = -\big( \Gamma^* P(\overline{N-1}) \Gamma + R \big)^{-1} \Gamma^* P(\overline{N-1}) \Phi \hat{x}(0) \]

\[ P(\bar{N}) = \Phi^* \big\{ P(\overline{N-1}) - P(\overline{N-1}) \Gamma \big( \Gamma^* P(\overline{N-1}) \Gamma + R \big)^{-1} \Gamma^* P(\overline{N-1}) \big\} \Phi + Q \]

with $P(\bar{0}) = Q$. The equations are exactly the same except x(0) is replaced
by the best least squares estimate, $\hat{x}(0)$.
Of course, the results in this section do not reflect changes in Φ and Γ
which can occur in an adaptive problem. At least, the above results
give assurance that proper action is being taken in a stationary situation.

G. Stability of the Closed-Loop System

There may be some question whether the implementation of the
optimal on-line controller in a closed-loop manner gives a stable system.
For the case discussed in this chapter we can give sufficient conditions
for stability. We employ the discrete version of Lyapunov's direct
method. Let us state first Lyapunov's theorem (27):

STABILITY THEOREM: If for the process

\[ x(k+1) = f(x(k)) \]

there exists a scalar function of the state variables, V(x(k)), such that
V(0) = 0, and

(i) V(x) > 0 when x ≠ 0
(ii) V(x(k + 1)) < V(x(k)) for k > K, K finite
(iii) V(x) is continuous in x
(iv) V(x) → ∞ when ‖x‖ → ∞,

then the equilibrium solution x = 0 is globally stable and V(x) is a Lyapunov
function for the system.
For the application of this theorem, let us choose the following
Lyapunov function.

\[ V(x(k)) = \| x(k) \|^2 \tag{26} \]
The problem is to determine x(k + 1) when the optimal controller is
used. Let us consider the formulation of Section II, D. From Eq. (17)
we have

\[ x(k) = \phi_{11} x(k+N) - \phi_{12} Q x(k+N) \]

\[ p(k) = \phi_{21} x(k+N) - \phi_{22} Q x(k+N) \]

Eliminating x(k + N), we have

\[ p(k) = (\phi_{21} - \phi_{22} Q)(\phi_{11} - \phi_{12} Q)^{-1} x(k) \]

Therefore,

\[ x(k+1) = \big( \Theta_{11} + \Theta_{12} (\phi_{21} - \phi_{22} Q)(\phi_{11} - \phi_{12} Q)^{-1} \big) x(k) \]

From this follows a sufficient condition for the stability of the optimal
on-line controller.

THEOREM II-1: If

\[ \big( \Theta_{11} + \Theta_{12} (\phi_{21} - \phi_{22} Q)(\phi_{11} - \phi_{12} Q)^{-1} \big)^*
   \big( \Theta_{11} + \Theta_{12} (\phi_{21} - \phi_{22} Q)(\phi_{11} - \phi_{12} Q)^{-1} \big) - I \]

is negative definite, then the system employing the on-line controller (without
inequality constraints) is stable.
Of course, the choice of the Lyapunov function, Eq. (26), may be
overly restrictive. In this case some other choice will have to be inves-
tigated.
It seems that the stability problem will become more severe as the
optimization interval is shortened. Other problem areas may include
time lag in computation and process parameter errors. These problems
will be left as future research topics. Let us look at an example to demon-
strate the theorem given above.

Example II-1. For the process

\[ x(k) = 0.9\, x(k-1) + u(k), \qquad x(0) = 1 \]

we will use a u(k) which minimizes

j=k

Equations (15) and (16) become

\[ \begin{bmatrix} x(k+1) \\ p(k+1) \end{bmatrix}
   = \begin{bmatrix} 2.01111 & 1.11111 \\ 1.11111 & 1.11111 \end{bmatrix}
     \begin{bmatrix} x(k) \\ p(k) \end{bmatrix} \]

and Eq. (17) becomes

\[ \begin{bmatrix} x(k+4) \\ -x(k+4) \end{bmatrix}
   = \begin{bmatrix} 39.90408 & 26.87973 \\ 26.87973 & 18.13147 \end{bmatrix}
     \begin{bmatrix} x(k) \\ p(k) \end{bmatrix} \]

Eliminating x(k + 4) we get

\[ p(k) = -1.48371\, x(k) \]

Therefore,

\[ x(k+1) = 0.36255\, x(k) \]

Applying the theorem, (0.36255)² − 1 < 0. Therefore we have stability.
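
The arithmetic of Example II-1 can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: the process is scalar, the one-step matrix is the one quoted in the example, the four-step matrix is taken to be its fourth power, and the terminal condition is p(k + 4) = −Q x(k + 4). The function names are hypothetical.

```python
def mat2_mul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def closed_loop_from_two_point(phi, gamma, q, r, n):
    """Return (p(k)/x(k), x(k+1)/x(k)) for a scalar process.

    theta maps [x(j), p(j)] to [x(j+1), p(j+1)]; psi = theta**n maps the
    pair at time k to the pair at time k + n.
    """
    inv = 1.0 / phi
    s = gamma * gamma / r
    theta = [[phi + s * inv * q, s * inv], [inv * q, inv]]
    psi = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        psi = mat2_mul(theta, psi)
    # Impose p(k+n) = -q x(k+n) and solve for p(k) in terms of x(k).
    costate_gain = -(psi[1][0] + q * psi[0][0]) / (psi[1][1] + q * psi[0][1])
    pole = theta[0][0] + theta[0][1] * costate_gain
    return costate_gain, pole


costate_gain, pole = closed_loop_from_two_point(0.9, 1.0, 1.0, 1.0, 4)
stable = pole * pole - 1.0 < 0.0   # scalar form of the test in Theorem II-1
print(round(costate_gain, 5), round(pole, 5), stable)
```

Full-precision arithmetic gives values that agree with the quoted −1.48371 and 0.36255 to within about one unit in the fifth decimal place, and the stability test is satisfied.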

H. How Good is Suboptimal?

For the general philosophy, the controller was based on performing
optimization at every sampling instant over a finite interval into the
future. Several reasons were given for doing this. One of the reasons
was the uncertainty in the process into the future. The question then
arises: How good is the controller based on a fixed optimization interval,
if the process is known into the future? (We reiterate that the controller
based on a fixed optimization interval may be the best one could do in
the face of uncertainty.) As a comparison, we can make the following
two computations. First, we will solve the problem which minimizes

\[ \sum_{k=0}^{\infty} \Big( \tfrac{1}{2} \| x(k) \|_Q^2 + \tfrac{1}{2} \| u(k) \|_R^2 \Big) \qquad \text{(Situation 1)} \]

with the process and initial conditions given. This solution is strictly
open-loop. Secondly, we solve the problem which minimizes at every
sampling instant the following criterion.

\[ \sum_{j=k}^{k+N} \Big( \tfrac{1}{2} \| x(j) \|_Q^2 + \tfrac{1}{2} \| u(j) \|_R^2 \Big) \qquad \text{(Situation 2)} \]

The second philosophy is the basis for our on-line controller. For the
comparison we will assume that the process is known for all times. We
will illustrate the comparison with two examples.

Example II-2. Let us consider the scalar process

\[ x(k) = 0.9\, x(k-1) + u(k), \qquad x(0) = 1 \]

For Situation 1, we use

\[ \sum_{k=0}^{\infty} \Big( \tfrac{1}{2} x(k)^2 + \tfrac{1}{2} u(k)^2 \Big) \]

For Situation 2, we use

\[ \sum_{j=k}^{k+N} \Big( \tfrac{1}{2} x(j)^2 + \tfrac{1}{2} u(j)^2 \Big) \]

We will use the calculus of variations approach. The Euler equations are

\[ x(k) = 0.9\, x(k-1) + u(k) \]

\[ p(k+1) = 1.11111\, p(k) + 1.11111\, x(k) \]

\[ u(k) = p(k) \]

Eliminating u(k), we get

\[ \begin{bmatrix} x(k+1) \\ p(k+1) \end{bmatrix}
   = \begin{bmatrix} 2.01111 & 1.11111 \\ 1.11111 & 1.11111 \end{bmatrix}
     \begin{bmatrix} x(k) \\ p(k) \end{bmatrix} \]

For Situation 1, we can eliminate p(k) and obtain

\[ x(k+2) - 3.12222\, x(k+1) + x(k) = 0 \]

This has the general solution

\[ x(k) = A (0.36234)^k + B (2.75988)^k \]

To satisfy the boundary conditions x(0) = 1 and x(∞) = 0, we obtain

\[ x(k) = (0.36234)^k \qquad \text{(Situation 1)} \]

For Situation 2, the solution is given in the example of the previous
section, or

\[ x(k) = (0.36255)^k \]

When 8T was considered as the optimization interval the response was

\[ x(k) = (0.36235)^k \]

Example II-3. The conditions are the same as the last example
except we take an unstable process given by

\[ x(k) = 1.11111\, x(k-1) + u(k) \]

The Euler equations after eliminating u(k) become

\[ \begin{bmatrix} x(k+1) \\ p(k+1) \end{bmatrix}
   = \begin{bmatrix} 2.01111 & 0.9 \\ 0.9 & 0.9 \end{bmatrix}
     \begin{bmatrix} x(k) \\ p(k) \end{bmatrix} \]

For Situation 1, the solution is

\[ x(k) = (0.39789)^k \]

For Situation 2 with 4T as the optimization interval, the solution is

\[ x(k) = (0.39858)^k \]

For Situation 2 with 8T as the optimization interval, the solution is

\[ x(k) = (0.39791)^k \]

The striking result of these examples is that only a short finite
time into the future is required for the optimization interval. Of course,
more complicated processes may require a longer optimization interval.
Example II-3 reveals that unstable processes can be controlled using
the above procedure.
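
The Situation 1 responses quoted in Examples II-2 and II-3 can be checked from the second-order difference equation obtained by eliminating p(k). The sketch below assumes a scalar process and uses the fact that the one-step matrix of the Euler equations has unit determinant, so that its characteristic equation is λ² − (trace)λ + 1 = 0 and the decaying root gives the response. The function name is hypothetical.

```python
import math


def stationary_pole(phi, gamma=1.0, q=1.0, r=1.0):
    """Decaying root of the Euler-equation recursion for a scalar process.

    The trace of the one-step matrix is phi + 1/phi + gamma^2 q / (r phi);
    its determinant is 1, so the two roots are reciprocals of each other.
    """
    trace = phi + 1.0 / phi + gamma * gamma * q / (r * phi)
    return (trace - math.sqrt(trace * trace - 4.0)) / 2.0


print(round(stationary_pole(0.9), 5))        # stable process of Example II-2
print(round(stationary_pole(1.0 / 0.9), 5))  # unstable process of Example II-3
```

Comparing these roots with the finite-interval responses quoted in the examples shows how quickly the fixed-interval controller approaches the infinite-interval solution as the optimization interval grows.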

I. Additional Remarks
This section provides background for the extension given in this
chapter. It provides review material for the discrete version of the
linear process and quadratic criterion case. No inequality constraints
are considered in this section. The extension in Section III considers
inequality constraints on the control variable.
Two algorithms were presented for computing the optimal control
based on two different approaches to the optimal control problem. A
third possible approach is the use of the steepest descent method. It is
not discussed here because it is presented in the dissertation by
Hsieh (28).
A philosophy for the adaptive scheme (perform optimization over
a fixed interval into the future) is given in this section. This approach
will be verified in Section III for the case with inequality constraints
through experimentation.

III. Synthesis of Control Forces with Inequality Constraints

A. Introduction
In this section, we extend considerations given in Section II to the
case when we impose inequality constraints on the control variable.
The problem of on-line synthesis of control forces is no different from
the optimization problem. The difficult requirement is that it must be
rapidly performed. Also for the adaptive task, it must be performed
in terms of easily measured parameters.
Höring (29) has considered an on-line controller, calling it a predictive
controller. He solves the same problem by using concepts from pattern
recognition and he synthesizes the controller by adders and logical
elements. Complexity arises in his method if one wishes to lengthen the
optimization interval.
Ho and Brentani (30) have extensively studied quadratic programming
methods applied to the control problem. The problem of minimizing the
quadratic error over an optimization interval falls in their nonlinear
class requiring additional calculations. It is shown subsequently that
the quadratic error problem can be attacked directly using the formula-
tion by Ho (31) from an earlier paper. Ho and Brentani explore a method
which projects the gradient on the feasible region, R. Although this
method can also be applied to our problem, an alternate method used
by Hildreth (32), called a coordinatewise gradient method, will be ex-
plored. Both methods can be applied to the particular control problem
with ease (in comparison to some general quadratic programming
problem). It is to be emphasized that this chapter is exploring a follower-
type controller in comparison to the more difficult (computationally)
trajectory optimization problem.
In this section, the coordinatewise gradient method will be described.
Secondly, some simulation results will be presented showing responses
to several different inputs. A comparison is made with responses of
conventional sampled-data systems. Effects of parameter errors on the
responses are experimentally observed. Extensions are then made to
bounds on the rate of change of the control variable. One extension
gives a hybrid computational procedure.
In Appendix 2, a brute-force method is described. The dimensionality
problem of this method is indicated, thus recommending the gradient
method. Although of little practical value, a study of the brute-force
method is important in that it gives geometrical insight into the problem.

B. Problem Formulation
The philosophy for the determination of the control force was stated
in Section II. In addition to the consideration given to the formulation
of the problem posed in Section II, we require the control variables to
be bounded, i.e.,

\[ | u(k) | \le M \tag{27} \]

For the sake of ease in presentation, the single control force and single
output case will be considered. Generalization can be made to the
multipole case (31). The input-output relationship of the process is
given by⁶

\[ y(i) = \sum_{j=1}^{i} g(i + 1 - j)\, u(j) + y_0(i) \tag{28} \]

where

g(l) is the response to a unit pulse of width T at lT seconds from the initiation
of the pulse

y_0(l) is the initial condition response

The g(l) are to be estimated by methods in Sections IV and V. Equation
(28) is rewritten in matrix form.

\[ y = G u + y_0 \tag{29} \]

where

\[ y = \begin{bmatrix} y(1) \\ y(2) \\ \vdots \\ y(N) \end{bmatrix}, \quad
   u = \begin{bmatrix} u(1) \\ u(2) \\ \vdots \\ u(N) \end{bmatrix}, \quad
   y_0 = \begin{bmatrix} y_0(1) \\ y_0(2) \\ \vdots \\ y_0(N) \end{bmatrix} \]

\[ G = \begin{bmatrix}
     g(1) & 0 & \cdots & 0 \\
     g(2) & g(1) & \cdots & 0 \\
     \vdots & & \ddots & \vdots \\
     g(N) & g(N-1) & \cdots & g(1)
   \end{bmatrix}
   = [\, g_1 \mid g_2 \mid \cdots \mid g_N \,] \]

This G matrix is triangular because of physical realizability. Also, the g_i
are linearly independent if g(1) ≠ 0. For this discrete case the G matrix
has rank N if and only if the process is controllable (31). It is observed
that if g(1) ≠ 0 then the system is controllable.
In terms of the above notation the criterion becomes

\[ J = \tfrac{1}{2} \| y_d - y \|^2 \tag{30} \]

where y_d is the desired trajectory.

⁶ As was done in Section II, E, the index k has been dropped (j = 1 ⇒ j = k + 1,
j = N ⇒ j = k + N).

Let

\[ d' = y_d - y_0 \tag{31} \]

and

\[ d = \sum_{j=1}^{N} u(j)\, g_j \tag{32} \]

The d will not in general be made equal to d′ because of Eq. (27).
The problem which can now be stated is: Determine u(i) which
minimizes

\[ J = \tfrac{1}{2} \| d' - d \|^2 \tag{33} \]

Each column vector, g_i, can be viewed as a basis vector; together they
span a linear manifold of E_N (output-space). Without bounds the
problem can be solved readily because the G matrix is triangular.⁷ With
bounds the problem is to determine a point in a closed convex region
which is nearest to the desired point, d′. The closed convex region is
in particular a parallelotope in E_N.

C. Coordinatewise Gradient Method

In this section we look at a gradient method to iteratively approach
the optimum point. We modify the method of steepest descent to
consider limitations on the movement of the trial point. Because of the
simplicity of the boundaries (parallelotope) compared to some general
quadratic programming problem, we anticipate that some easy gradient
method will apply. Ho (30) also utilizes the simplicity of the boundaries
in his method. A simple-minded method is to adjust each component
one at a time. In this way the bounds on the components can easily be
applied.

FIG. 9. Path of descent.

⁷ This statement is especially true if the criterion does not include a penalty for
control energy.
Let us look at a two-dimensional problem as shown in Fig. 9. In the
method we can start from any point in $R$. For the sake of discussion,
let us begin at the origin, $O$. First, we move in the $u(1)$ direction. In
the $u(1)$ direction, we seek the minimum, which is located at point $a'$.
Since we cannot reach that point we stop at point $a$. Next, we seek
a minimum in the $u(2)$ direction starting from point $a$. The minimum
in the $u(2)$ direction is found at point $b$. Since this is the optimum
point in $R$, we have reached the optimum in two iterations. (For higher
dimensions the optimum will usually not be reached so rapidly.)
Next, the equations which will be programmed will be derived. The
point in $R$ is given by

$$d = \sum_{j=1}^{N} u(j)\,g_j = Gu \tag{32}$$

We seek the minimum of

$$J = \tfrac{1}{2}\,\|d' - Gu\|^2$$

The gradient along a component is

$$\frac{\partial J}{\partial u(j)} = g_j^{*}Gu - g_j^{*}d' = \nabla_j \tag{34}$$

It is noted that the gradient along a component is a scalar. The corrected
value for the $u(j)$ component is

$$u(j)^{(n+1)} = u(j)^{(n)} + \epsilon_n \nabla_j^{(n)}$$

The $\epsilon_n$ is found by seeking the minimum along the direction of the
$j$th component $g_j$. Expanding $J$,

$$2J = \langle G^{*}Gu, u\rangle - 2\langle G^{*}d', u\rangle + \|d'\|^2$$

Let us work with the terms which depend on $u$.

$$Q(u) \triangleq \langle G^{*}Gu, u\rangle - 2\langle G^{*}d', u\rangle \tag{35}$$

Also,

$$u^{(n+1)} = u^{(n)} + \epsilon_n m^{(n)}$$

where $m^{(n)}$ is zero except for the $j$th element, which is equal to $\nabla_j^{(n)}$.
Substituting $u^{(n+1)}$ into Eq. (35) we obtain

$$Q(u^{(n)} + \epsilon_n m^{(n)}) = Q(u^{(n)}) + 2\epsilon_n \langle G^{*}Gu^{(n)} - G^{*}d', m^{(n)}\rangle + \epsilon_n^{2} \langle G^{*}Gm^{(n)}, m^{(n)}\rangle$$

The minimum along a particular direction is then given by

$$\frac{\partial Q}{\partial \epsilon_n} = 2\langle G^{*}Gu^{(n)} - G^{*}d', m^{(n)}\rangle + 2\epsilon_n \langle G^{*}Gm^{(n)}, m^{(n)}\rangle = 0$$

Or,

$$\epsilon_n = -\frac{\langle G^{*}Gu^{(n)} - G^{*}d', m^{(n)}\rangle}{\langle G^{*}Gm^{(n)}, m^{(n)}\rangle}$$

The vector $m^{(n)}$ is zero except for the $j$th element. Therefore, $\epsilon_n$ in
the $j$th direction is

$$\epsilon_n = -\frac{1}{\|g_j\|^2}$$

Therefore, at the $n$th step we get the $(n + 1)$th approximation by

$$u(j)^{(n+1)} = u(j)^{(n)} - \frac{\nabla_j^{(n)}}{\|g_j\|^2} \tag{36}$$

As $u(j)^{(n+1)}$ could possibly exceed a bound we must limit its amplitude, or

$$u(j)^{(n+1)} = \operatorname{sat}_{-M,\,M}\left[u(j)^{(n+1)}\right] \tag{37}$$

The quantity on the left is used for the next iteration. Therefore, the
vital equations are Eqs. (34), (36), and (37). The simplicity of
the equations to be solved is noted. Every iteration requires only
$N^2/2 + 5N/2 - 1$ additions, $N^2/2 + 5N/2$ multiplications, and 1 division.
An iteration for the coordinatewise gradient method should not be
compared with one iteration for Ho's method. The computation time
for $N$ iterations of the coordinatewise gradient method should more
closely correspond with one iteration of the other method.
The procedure described above can be modified to possibly improve
the rate of convergence. Before each iteration, the gradient in each
coordinate direction is evaluated. The direction of the largest gradient
is then chosen for the descent. If no motion is possible in that direction,
the direction for the next highest gradient is chosen, etc. Of course,
such a procedure will demand more from the computer; however, it may
still be much simpler than other methods.
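
The three vital equations can be collected into a short routine. What follows is a minimal sketch, assuming a cyclic sweep over the components; the function name `coordinate_descent` and the two-dimensional test problem are hypothetical. For clarity the residual is recomputed in full at every step, so the sketch does not attain the operation count quoted above.

```python
def coordinate_descent(G, d, M, u, sweeps=50):
    """Coordinatewise gradient method for min 0.5*||d - G u||^2 with |u(j)| <= M."""
    n = len(u)
    u = list(u)
    # Squared norm of each column g_j; nonzero whenever g(1) != 0.
    col_norm2 = [sum(G[i][j] ** 2 for i in range(n)) for j in range(n)]
    for _ in range(sweeps):
        for j in range(n):
            # Residual G u - d' at the current trial point.
            r = [sum(G[i][k] * u[k] for k in range(n)) - d[i] for i in range(n)]
            grad_j = sum(G[i][j] * r[i] for i in range(n))   # Eq. (34)
            u_new = u[j] - grad_j / col_norm2[j]             # Eq. (36)
            u[j] = max(-M, min(M, u_new))                    # Eq. (37)
    return u


# Hypothetical two-dimensional problem: the unbounded optimum (2, -2)
# lies outside the square |u(j)| <= 1.
G2 = [[1.0, 0.0], [1.0, 1.0]]
u_opt = coordinate_descent(G2, d=[2.0, 0.0], M=1.0, u=[0.0, 0.0])
```

In this small case the trial point reaches the corner (1, -1) of the admissible square within two sweeps and then stays there.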

D. Remarks on Convergence

Comments in this section will be largely heuristic, appealing to the
geometrical picture. A discussion on existence and uniqueness is
given in Appendix 3.
The proof of convergence has been given by Hildreth (32) and
D'Esopo (33) for the parallelotope region that we have (rectangular in
$u$-space). It should be emphasized that convergence of the coordinatewise
gradient method is assured only for this particular type of constraint.
Geometrically, the convergence can be visualized for the two-dimensional
case. The criterion function, $J$, defines a surface in $d$-space which is a
circular paraboloid (noncircular paraboloid in $u$-space). In the
parallelotope region, $R$, we are to converge upon the lowest point on this surface.
At each iteration (although we select the direction of the coordinates)
we measure the slope and we choose to go in the negative slope direction.
Along any direction the slope is either positive, negative, or zero. If zero,
we temporarily do nothing because if the point is nonminimal some
other coordinate will have nonzero slope. The procedure stops when
either we arrive at a point where all the gradient components are zero
(minimum in $R$), or motion of the trial point is restrained by the boundaries
of $R$. If restrained by a single boundary, the gradient will be normal
to that boundary.

E. A Remark on the Initial Trial

Of course, the success of the gradient method will depend upon the
closeness of the initial guess or trial to the answer. This section describes
a technique whereby a good initial guess can be obtained. As previously
described we envision repeating the same optimization procedure every
$T$ seconds. Although the optimization yields the control force for the entire
optimization interval, $NT$, only the first component is ever used.
However, the other components can be used as an initial approximation for
the following interval of consideration. If the changes caused by
disturbances and process and input variations are small during $T$, one should
be able to compute the optimal controls rapidly since the initial
approximations will be very close to the optimal point. In Fig. 10, $u(2)$ in
interval 1 becomes the first guess for $u(1)$ in interval 2. Only an initial
approximation for the last $T$ seconds is missing. For this reason, the
iteration is initiated from the last $T$ interval, working forward, and
repeating this process. In this way the first iteration will not disturb the
initial good approximation of the other intervals. For the reason that
only one component may be initially indeterminate, it is felt that the
coordinatewise gradient method may be the most suitable in this
application.

FIG. 10. Translation of optimization intervals (optimization interval 1
and optimization interval 2).

If no initial approximation is available, the unbounded solution can
be computed. By simply passing the unbounded solution through a
limiter operation we have a possible initial guess.
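
Both ways of obtaining an initial trial can be sketched briefly. This is an illustration only: the function names are hypothetical, and the value used to fill the missing last component (`fill`) is an assumption, since the text leaves that component open. The numbers in the example are those of the four-dimensional example in Section III, F.

```python
def shift_warm_start(u_prev, fill=0.0):
    """Use u(2)..u(N) of the previous interval as u(1)..u(N-1) of the next one.

    The last component has no previous estimate; `fill` is a placeholder.
    """
    return list(u_prev[1:]) + [fill]


def limited_unbounded_solution(g, d, M):
    """Solve the triangular system G u = d' by forward substitution, then limit."""
    u = []
    for i in range(len(d)):
        # Contribution of the controls already found, from Eq. (28).
        acc = sum(g[i - j] * u[j] for j in range(i))
        u.append((d[i] - acc) / g[0])
    limited = [max(-M, min(M, x)) for x in u]
    return u, limited


g = [0.21306, 0.52270, 0.71050, 0.82442]
unbounded, first_trial = limited_unbounded_solution(g, [1.0] * 4, 5.5)
next_trial = shift_warm_start(first_trial)
```

The forward substitution works on the unlimited values throughout; the limiter is applied only once, at the end, as described above.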

F. Example for One Optimization Interval

Before proceeding to the simulation of the controller in a closed loop,
let us examine in detail the iteration procedure for one optimization
interval. We take a four-dimensional example. Let us consider the
following linear process described in terms of the Laplace transfer
function

$$\frac{Y(s)}{U(s)} = \frac{0.5}{s(s + 0.5)}$$

with a sampling period in the controller of $T = 1$ sec. The unit pulse
response is given by the succession of the following values

$$g(j) = (0.21306,\ 0.52270,\ 0.71050,\ 0.82442)$$

The $G$ matrix is

$$G = \begin{bmatrix}
0.21306 & 0 & 0 & 0\\
0.52270 & 0.21306 & 0 & 0\\
0.71050 & 0.52270 & 0.21306 & 0\\
0.82442 & 0.71050 & 0.52270 & 0.21306
\end{bmatrix}$$
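
The pulse-response values above can be checked from the transfer function. The sketch below assumes the pulse is applied through a hold, so that $g(l)$ is the difference of two unit-step responses one sampling period apart; the step response is obtained from a partial-fraction expansion of $k/[s^2(s + a)]$. The function names are hypothetical.

```python
import math


def step_response(t, a=0.5, k=0.5):
    """Unit-step response of k / (s (s + a)) at time t >= 0."""
    # k / (s^2 (s + a)) = (k/a)/s^2 - (k/a^2)/s + (k/a^2)/(s + a)
    return (k / a) * t - (k / a ** 2) * (1.0 - math.exp(-a * t))


def pulse_response(n, T=1.0):
    """g(l), l = 1..n: response at l*T to a unit pulse of width T starting at t = 0."""
    return [step_response(l * T) - step_response((l - 1) * T)
            for l in range(1, n + 1)]


g_check = pulse_response(4)
```

With $T = 1$ sec the four computed values agree with the tabulated ones to about five decimal places.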

We will let $y_d(j) = 1$ and assume that the initial condition is equal
to zero. Therefore, $d'(j) = 1$. We restrict $u(j)$ such that $|u(j)| \le 5.5$.
Using the gradient method we assume as a first approximation the
set $u(j)$ obtained by limiting the unbounded solution. The unbounded
solution for the problem is

$$u(j) = (4.69,\ -6.82,\ 5.77,\ -4.89)$$

Therefore, the first approximation is

$$u(j)^{(1)} = (4.69,\ -5.5,\ 5.5,\ -4.89)$$


Figure 11 shows the optimum bounded-control sequence obtained
from the gradient method and the brute-force method as described in
Appendix 2. The nongradient solution was possible because the example
chosen was one of the special cases (see Appendix 2). (The brute-force
method was not programmed in general terms.) Also, Fig. 11 shows the
unbounded solution. It is noted that although the unbounded solution
exceeds the bounds twice, the bounded solution has only one component
at the boundary.

FIG. 11. Bounded control sequence (control variable $u(j)$ between the
bounds; $u(2) = -5.50$ lies on the bound and $u(4) = -3.34$).

Figure 12 shows the corresponding output of the linear process.
Actually, the output will be continuous rather than the staircase signal
shown. The staircase response is plotted for convenience and the
response at the sampling instants will correspond exactly with the
actual response.

FIG. 12. Output of process for optimum control sequence (desired output
and unbounded solution compared with the optimum response $y$, for
$t = 0$ to $4T$).
Table I shows how the optimum point is approached by the gradient
method. The gradient method has the characteristic that errors are
initially rapidly reduced and the finer accuracy is obtainable only
after many iterations. Table I shows that good approximations are
obtained after 16 iterations. In an adaptive control task the solutions
should be approached even sooner because, as discussed in the previous
section, we generally have a good initial approximation.

TABLE I

CONTROL SEQUENCE AND OUTPUT VS NUMBER OF ITERATIONS

Iteration   u(1)    u(2)    u(3)    u(4)    y(1)    y(2)    y(3)    y(4)    J
    0       4.693   -5.50   5.500   -4.890  1.000   1.281   1.632   1.795   0.5549
    1       4.693   -5.50   5.500   -5.500  1.000   1.281   1.632   1.665   0.4600
    2       4.693   -5.50   3.987   -5.500  1.000   1.281   1.309   0.874   0.0954
    3       4.693   -5.50   3.987   -5.500  1.000   1.281   1.309   0.874   0.0954
    4       4.519   -5.50   3.987   -5.500  0.963   1.190   1.185   0.730   0.0724
    8       4.418   -5.50   3.863   -4.231  0.941   1.138   1.087   0.853   0.0259
   12       4.360   -5.50   3.805   -3.539  0.929   1.107   1.034   0.921   0.0119
   16       4.326   -5.50   3.782   -3.171  0.922   1.089   1.004   0.960   0.0079
   20       4.305   -5.50   3.779   -2.983  0.917   1.078   0.989   0.983   0.0067
   24       4.292   -5.50   3.787   -2.895  0.914   1.071   0.981   0.993   0.0064
   28       4.283   -5.50   3.799   -2.862  0.913   1.067   0.978   0.999   0.0063
   32       4.277   -5.50   3.814   -2.859  0.911   1.064   0.977   1.003   0.0062
   36       4.272   -5.50   3.830   -2.872  0.910   1.061   0.977   1.004   0.0062
   44       4.266   -5.50   3.861   -2.917  0.909   1.058   0.978   1.005   0.0061
   52       4.261   -5.50   3.889   -2.968  0.908   1.055   0.981   1.005   0.0060
   60       4.257   -5.50   3.913   -3.015  0.907   1.053   0.983   1.005   0.0059
   68       4.253   -5.50   3.935   -3.058  0.906   1.051   0.986   1.004   0.0058
   79       4.251   -5.50   3.962   -3.111  0.906   1.050   0.989   1.004   0.0058
  Ans.      4.231   -5.50   4.075   -3.337  0.902   1.040   1.000   1.000   0.0056
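
The first rows of Table I can be retraced with a short program. The sketch below applies Eqs. (34), (36), and (37) one component per iteration, starting from the last component and working toward the first, as suggested in Section III, E; that ordering is an assumption read off the tabulated values, and the names `update`, `cost`, and `history` are hypothetical.

```python
g = [0.21306, 0.52270, 0.71050, 0.82442]      # pulse response from the text
N, M = 4, 5.5
G = [[g[i - j] if j <= i else 0.0 for j in range(N)] for i in range(N)]
d = [1.0] * N                                  # d'(j) = 1


def update(u, j):
    """One coordinatewise step on component j: Eqs. (34), (36), (37)."""
    r = [sum(G[i][k] * u[k] for k in range(N)) - d[i] for i in range(N)]
    grad = sum(G[i][j] * r[i] for i in range(N))
    norm2 = sum(G[i][j] ** 2 for i in range(N))
    u[j] = max(-M, min(M, u[j] - grad / norm2))


def cost(u):
    """Criterion J = 0.5 * ||d' - G u||^2."""
    return 0.5 * sum((sum(G[i][k] * u[k] for k in range(N)) - d[i]) ** 2
                     for i in range(N))


u = [4.693, -5.5, 5.5, -4.890]                 # iteration 0 of Table I
history = [list(u)]
for n in range(8000):
    update(u, N - 1 - (n % N))                 # last component first
    history.append(list(u))
```

Here `history[4]` can be compared with the row for iteration 4 and the final entry with the row marked "Ans."; small differences in the third decimal place are to be expected from rounding in the tabulated pulse response.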

G. Simulation

A digital simulation was performed on an IBM 7090 to operate the
controller in a feedback loop. The flow chart is shown in Fig. 13.
First, the controller was required to cause the process to follow
a triangular wave. The process used previously (as an example) was
again considered. Optimization intervals of $4T$ and $8T$ were considered
with $T = 1$ sec. A comparison is made with a conventional controller
shown in Fig. 14 for which the $K$ was chosen so that the damping ratio
was 0.5. For a comparison, the bounds on the on-line controller were
selected from the maximum and minimum control forces experienced
by the conventional controller. The numbers of iterations per sampling
interval were respectively 20 and 40 for the $4T$ and $8T$ cases. (This
means that each component was iterated 5 times.) Simulation was
performed over 100 sampling intervals.

FIG. 13. Simulation flow chart. The blocks, in order, are:
(1) Input: trajectory, initial state of process, bound on control,
    pulse response of process, number of iterations per sampling
    interval, initial trial for first optimization interval,
    initialization.
(2) Computation of response to initial state for the optimization
    interval.
(3) Compute desired response to control force for the optimization
    interval.
(4) Compute the optimal control for the optimization interval.
(5) Process: compute the response at end of one sampling interval.
(6) Stop test.
(7) Provide initial approximation for next optimization interval, then
    go to next sampling interval.

FIG. 14. Conventional controller (sampler with $T = 1$ sec, hold
$(1 - e^{-Ts})/s$, gain $K$, and process $0.5/[s(s + 0.5)]$ in a feedback loop).
A portion of the results is shown in Fig. 15. A marked improvement
in the response is noted. The conventional controller response shows
the characteristic lag which is not present for the on-line controller
response. For the example chosen, no appreciable difference
is seen in the responses of the $4T$ and $8T$ cases. The number of iterations
was increased by a factor of two with no appreciable difference in the
response.

FIG. 15. Comparison of on-line vs conventional, for $y/u = 0.5/[s(s + 0.5)]$,
triangular input (4-step on-line controller shown).

The control forces for the conventional controller and the on-line
controller are shown respectively in Figs. 16 and 17. The on-line
controller's controls are more jumpy, but constraints on the rate of
change were not considered in the optimization.
An estimate can be made of the computation time per sampling interval
using the formulas previously stated. [No. (Add) $= N^2/2 + 5N/2 - 1$,
No. (Multiply) $= N^2/2 + 5N/2$, No. (Divide) $= 1$ per iteration.] Let
us assume that we have an on-line digital computer with an add-time of
35 μsec. (The add-time for the IBM 7090 is 2 μsec.) Considering that
we have 10-digit multiplication and that the transfer time is one-half
the add time, the estimate is 0.0085 sec per sampling interval for the
$4T$ and 20 iterations case (0.0005 sec for the IBM 7090). Therefore,
compared to the 1-sec sampling period the computation time is only a
fraction.
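
The operation counts entering this estimate are easily tabulated. The sketch below only counts operations for a given $N$ and number of iterations; it does not reproduce the quoted times, since converting counts to seconds requires multiply, divide, and transfer times that are assumptions about a particular machine. The function name is hypothetical.

```python
def operations(N, iterations=1):
    """Additions, multiplications, and divisions for the stated number of iterations."""
    adds = N * N / 2 + 5 * N / 2 - 1       # No. (Add) per iteration
    mults = N * N / 2 + 5 * N / 2          # No. (Multiply) per iteration
    divs = 1                               # No. (Divide) per iteration
    return iterations * adds, iterations * mults, iterations * divs


per_iteration = operations(4)
per_sampling_interval = operations(4, iterations=20)   # the 4T, 20-iteration case
```

For $N = 4$ each iteration needs 17 additions, 18 multiplications, and 1 division, so the 20 iterations of one sampling interval amount to 340, 360, and 20 operations respectively.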

FIG. 16. Control force for on-line controller.

FIG. 17. Control force for conventional controller.

FIG. 18. Comparison of on-line vs conventional, $y/u = 0.25/(s + 0.5)^2$, triangular
input.

FIG. 19. Comparison of on-line vs conventional, $y/u = 0.25/(s + 0.5)^2$, sine wave
input.

As the pulse response of the previous example did not tail off (because
of the integrator) another process was selected with

$$\frac{Y(s)}{U(s)} = \frac{0.25}{(s + 0.5)^2}$$

and with $T = 1$ sec. The results of the simulation are shown in Fig. 18
for the $8T$ case. Again, an improvement is noticed over the conventional
controller. No appreciable improvement was noticed when the number
of iterations was increased by a factor of two. A sine wave was also
tried and the results are shown in Fig. 19.
It is felt that the results reveal that some new types of responses can
be obtained by using an on-line controller. It should be noted that if
the conventional controller must operate in the linear range, a simulation
must be performed with all the possible inputs that the feedback process
will encounter. On the other hand, the on-line controller can do its best
at all times with the available control forces.

H. Effect of Uncertainties in Process Parameters

The optimal controls are computed assuming that the process is
known accurately. In an adaptive task, one is not so fortunate as to have
accurate knowledge of the process. It is very desirable then to know
whether suitable control action is obtained even with inaccuracies of,
say, 10% in the process parameters. If we have this condition, then
assurance is given that if the process parameters are known to within
10%, the overall system will behave satisfactorily. Therefore,
optimal controls and trajectories should be experimentally studied with
errors in process parameters. A few experimental results are reported
in this section.
The situation of Fig. 18 was studied further. The process was

$$\frac{Y(s)}{U(s)} = \frac{0.25}{(s + 0.5)^2}$$

The pole position ($a$) of 0.5 was uncertain to the controller and values
of 0.45, 0.5, and 0.55 were respectively used. Optimization intervals
of $8T$ and $4T$ iterations per sampling period were used. The differences
in the responses were hardly noticeable when plotted on a graph. Therefore,
the initial parts of the runs are tabulated in Table II for the $8T$ case for
comparison purposes. The output of the conventional controller is also
tabulated.
One should not draw sweeping conclusions from a single example.
However, the results indicate that possibilities are present, and that any
individual problem should be analyzed by simulation. The close tracking
capability in spite of errors in the process information can possibly be
attributed to the feedback which is present in the on-line controller.

TABLE II

EFFECT OF UNCERTAINTIES IN PARAMETERS

        Desired
  k     path      a = 0.45   a = 0.5    a = 0.55   Conv.
  1      1.0       0.546      0.559      0.521     0.0
  2      2.0       1.986      1.838      1.771     0.271
  3      3.0       3.733      3.471      3.302     0.990
  4      4.0       4.847      4.783      4.628     1.981
  5      5.0       5.628      5.590      5.568     2.974
  6      6.0       6.665      6.517      6.500     3.819
  7      7.0       7.801      7.635      7.573     4.521
  8      8.0       8.798      8.651      8.582     5.173
  9      9.0       9.589      9.472      9.411     5.858
 10     10.0      10.182     10.094     10.046     6.600
 11     11.0      10.611     10.547     10.510     7.381
 12     12.0      10.912     10.867     10.841     8.163
 13     11.0      11.120     11.089     11.071     8.927
 14     10.0      10.499     10.640     10.763     9.131
 15      9.0       9.440      9.760     10.004     8.429
 16      8.0       8.181      8.460      8.738     7.189
 17      7.0       7.093      7.266      7.473     5.950
 18      6.0       6.180      6.263      6.386     5.014
 19      5.0       5.170      5.274      5.369     4.364
 20      4.0       4.033      4.193      4.320     3.812
 21      3.0       2.935      3.108      3.247     3.193
 22      2.0       1.789      1.938      2.097     2.455
 23      1.0       0.898      0.966      1.056     1.643
 24      0.0      -0.306     -0.205     -0.098     0.828
 25     -1.0      -1.359     -1.190     -1.096     0.051

I. Bounds on the Rate of Change of the Control Variable

Instead of having bounds on the amplitude, we can place bounds on
the rate of change of the control variable. Let us look at the
four-dimensional case as an example.

$$d = u(1)g_1 + u(2)g_2 + u(3)g_3 + u(4)g_4 \tag{38}$$

We wish to bound the difference between succeeding control forces.

$$|u(k) - u(k - 1)| \le M_2$$

We put no constraints on the range of $u(i)$ itself. Rewriting Eq. (38)
we get

$$d = u(1)(g_1 + g_2 + g_3 + g_4) + (u(2) - u(1))(g_2 + g_3 + g_4) + (u(3) - u(2))(g_3 + g_4) + (u(4) - u(3))\,g_4$$

Let

$$h_i = \sum_{j=0}^{4-i} g_{4-j}, \qquad l(i) = u(i) - u(i - 1)$$

(with $u(0) = 0$, so that $l(1) = u(1)$); then,

$$d = l(1)h_1 + l(2)h_2 + l(3)h_3 + l(4)h_4 \tag{39}$$

where

$$|l(i)| \le M_2$$

Now, we can use the same method as discussed previously and solve
for $l(i)$, which in turn can be solved for $u(i)$.
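
The change of variables from $u(i)$ to the increments $l(i)$ can be checked numerically. The sketch below is an illustration under the assumption $u(0) = 0$; the function names `rate_basis`, `increments`, and `controls` and the test sequence are hypothetical, while the pulse response is that of the example in Section III, F.

```python
def rate_basis(G):
    """Matrix H whose column i is h_i = g_i + g_(i+1) + ... + g_N, as in Eq. (39)."""
    n = len(G)
    return [[sum(G[r][k] for k in range(i, n)) for i in range(n)]
            for r in range(n)]


def increments(u, u0=0.0):
    """l(i) = u(i) - u(i - 1), with u(0) = u0."""
    previous = [u0] + list(u[:-1])
    return [a - b for a, b in zip(u, previous)]


def controls(l, u0=0.0):
    """Recover u(i) from the increments by a running sum."""
    u, acc = [], u0
    for step in l:
        acc += step
        u.append(acc)
    return u


g = [0.21306, 0.52270, 0.71050, 0.82442]
G = [[g[i - j] if j <= i else 0.0 for j in range(4)] for i in range(4)]
H = rate_basis(G)
u = [1.0, -2.0, 0.5, 3.0]                      # hypothetical control sequence
l = increments(u)
d_from_u = [sum(G[r][k] * u[k] for k in range(4)) for r in range(4)]
d_from_l = [sum(H[r][k] * l[k] for k in range(4)) for r in range(4)]
```

Both expressions give the same point $d$, so the bounded problem in $l(i)$ has the same form as the amplitude-bounded problem, with $H$ in place of $G$.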

J. Weighting between Error and Control Energy

In place of Eq. (30) it may be desirable to use instead the following
criterion which also penalizes control energy,

$$J = \tfrac{1}{2}\,\|y_d - y\|^2 + \tfrac{1}{2}\,\|u\|^2$$

Now, distances in state-space or $y$-space no longer have the same
significance as before. With less geometrical significance, however,
the problem can be viewed, as done by Ho, in the solution space or
control space. If we still desire to limit the control force, a point is then
desired in a hypercube, $R$. The two-dimensional problem is shown in
Fig. 20.

FIG. 20. Solution space ($u$-space).

In Fig. 20, the lines of constant $J$ are no longer circular, but the
$J$ hypersurface defined at every point of the solution space can be
shown to be convex. It can be assumed here that $J$ is continuous with
bounded second partial derivatives with respect to $u(k)$. Then, $J$ is a
convex function of $u(k)$ if the symmetric matrix of the second partial
derivatives is positive semidefinite at all points of $R$ (34, p. 51). It is
noted in passing that the sum of convex functions is convex. This follows
simply from the fact that the sum of semidefinite matrices is semidefinite.
Writing $J$ in terms of $u$, we have

$$J = \tfrac{1}{2}\,\|d' - Gu\|^2 + \tfrac{1}{2}\,\|u\|^2 \tag{40}$$

The second partial derivative matrix of the first term is $G^{*}G$, which is
symmetric and positive definite (columns of $G$ are linearly independent).
The second partial derivative matrix of the second term is simply $I$.
Therefore, the coordinatewise gradient method is still applicable for
this case.

K. Bounds on Both Control Force and the Rate of Change of Control Force

Most practical systems have limitations both on the magnitude of
the control force and on the rate of change of control force. Let us
restate the problem with the added constraint.

Problem. Given
(a) Process:

$$y(i) = \sum_{j=1}^{i} g(i+1-j)\,u(j) + y_0(i)$$

(b) Constraints:

$$|u(j)| \le M_1$$

$$|u(j) - u(j - 1)| \le M_2$$

Determine: $u(j)$, $j = 1, 2, \ldots, N$, which minimizes

$$J = \tfrac{1}{2}\sum_{j=1}^{N}\bigl(y_d(j) - y(j)\bigr)^2$$

FIG. 21. Two-dimensional case with multiple constraints.

The problem is again a quadratic programming problem but with more
constraints. The region from which a solution is to be chosen will no
longer be a parallelotope. The region for the two-dimensional case is
shown in Fig. 21 in $u$-space.
The problem is to find a point in $u$-space and in the shaded region
which has the smallest $J$. For such regions, the coordinatewise gradient
method or Ho's simplified gradient projection method is not directly
applicable. Therefore, a more involved method is required. Rosen's (35)
gradient projection method is applicable but the use of such a scheme
on-line is questionable. Thus, we look for a simpler scheme to apply to
our particular problem.
The procedure to be described will transform the above problem so
that the constraints will be rectangular. Such a scheme has been described
by Hildreth (32). The constraints being rectangular, we can apply the
coordinatewise gradient method or Ho's simplified gradient projection
method. It should be noted that the following procedure can also be
used for control problems with state variable constraints by converting
to equivalent statements on $u$.
As before,

$$J(u) = \tfrac{1}{2}\,\|d' - Gu\|^2$$

Taking the parts of $J(u)$ which depend on $u$,

$$Q(u) = \tfrac{1}{2}\,u^{*}Cu + h^{*}u \tag{41}$$

where

$C = G^{*}G$ is an $N \times N$ matrix
$h = -G^{*}d'$ is an $N \times 1$ vector

The constraints can be placed in the form

$$Du - b \ge 0$$

To illustrate that problems with amplitude and rate-of-change
constraints can be put into this form, let us look at the two-dimensional
example. In this case,

$$D = \begin{bmatrix} -1 & 0\\ 1 & 0\\ 0 & -1\\ 0 & 1\\ -1 & 0\\ 1 & 0\\ 1 & -1\\ -1 & 1 \end{bmatrix},\qquad
b = \begin{bmatrix} -M_1\\ -M_1\\ -M_1\\ -M_1\\ -M_2 - u(0)\\ -M_2 + u(0)\\ -M_2\\ -M_2 \end{bmatrix}$$

Returning to the general formulation, we form the Lagrangian

$$\phi(u, \lambda) = Q(u) - \lambda^{*}(Du - b)$$

From the theorems given in Appendix 3 [Kuhn-Tucker (17) theorems],
the task is to find the saddle point of $\phi(u, \lambda)$, or solve the following
max-min problem.

$$\max_{\lambda \ge 0}\ \min_{u}\ \bigl(\tfrac{1}{2}\,u^{*}Cu + h^{*}u - \lambda^{*}(Du - b)\bigr) \tag{42}$$

The following is an equivalent problem.

$$\min_{\lambda \ge 0}\ -\Bigl[\min_{u}\ \bigl(\tfrac{1}{2}\,u^{*}Cu + h^{*}u - \lambda^{*}(Du - b)\bigr)\Bigr] \tag{43}$$

We can differentiate $\phi(u, \lambda)$ with respect to $u$ to solve the first minimum.

$$u = C^{-1}(D^{*}\lambda - h) \tag{44}$$

Upon substituting Eq. (44) into Eq. (43), we have the following problem.
The terms which do not depend on $\lambda$ have been left out.

$$\min_{\lambda \ge 0}\ \bigl[\lambda^{*}A\lambda + \gamma\lambda\bigr] \tag{45}$$

where

$$A = \tfrac{1}{2}\,DC^{-1}D^{*}$$

$$\gamma = -h^{*}C^{-1}D^{*} - b^{*}$$

Now, the coordinatewise gradient method or Ho's simplified gradient
projection method can be used to solve this new problem. Upon
determining $\lambda$, Eq. (44) yields the optimum $u$. We note that the $\lambda$ obtained
need not be unique.
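
A minimal sketch of this transformation follows, assuming the coordinatewise gradient method is applied to the multipliers with the single bound $\lambda_j \ge 0$. The function names, the number of sweeps, and the numerical values ($G = I$, $d' = (2, 0)$, $M_1 = 1$, $M_2 = 0.5$, $u(0) = 0$) are hypothetical choices for illustration; the slack $Du - b$ evaluated through Eq. (44) is used as the gradient with respect to $\lambda$.

```python
def invert(A):
    """Gauss-Jordan inverse of a small, well-conditioned square matrix."""
    n = len(A)
    aug = [list(map(float, A[i])) + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))   # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def solve_dual(C, h, D, b, sweeps=200):
    """Minimize 0.5*u'Cu + h'u subject to Du - b >= 0 through the multipliers."""
    n, q = len(C), len(D)
    Cinv = invert(C)
    lam = [0.0] * q

    def u_of(lam):                                           # Eq. (44)
        v = [sum(D[j][i] * lam[j] for j in range(q)) - h[i] for i in range(n)]
        return [sum(Cinv[i][k] * v[k] for k in range(n)) for i in range(n)]

    # Diagonal of D C^-1 D*, the curvature along each multiplier.
    diag = [sum(D[j][i] * Cinv[i][k] * D[j][k]
                for i in range(n) for k in range(n)) for j in range(q)]
    for _ in range(sweeps):
        for j in range(q):
            u = u_of(lam)
            w_j = sum(D[j][i] * u[i] for i in range(n)) - b[j]   # slack of row j
            lam[j] = max(0.0, lam[j] - w_j / diag[j])            # keep lam_j >= 0
    return u_of(lam), lam


M1, M2, u0 = 1.0, 0.5, 0.0
D = [[-1, 0], [1, 0], [0, -1], [0, 1], [-1, 0], [1, 0], [1, -1], [-1, 1]]
b = [-M1, -M1, -M1, -M1, -M2 - u0, -M2 + u0, -M2, -M2]
C = [[1.0, 0.0], [0.0, 1.0]]           # C = G*G with G = I
h = [-2.0, 0.0]                        # h = -G*d' with d' = (2, 0)
u_opt, lam_opt = solve_dual(C, h, D, b)
```

In this example the rate constraint on $u(1)$ is the binding one, and the computed point is $u = (0.5, 0)$ with a single positive multiplier.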

L. A Compromise Procedure for the Multiple Constraint Case

If the procedure outlined in Section III, K is not computationally
feasible, then the following compromise procedure can be tried.
A method is proposed which attacks directly the magnitude of the
control force and which indirectly constrains the rate of change by
using a penalty function.
We attack the problem in $u$-space with the criterion,

$$J = \tfrac{1}{2}\,\|y_d - y\|^2 + \sum_{j=1}^{N}\left(\frac{u(j) - u(j - 1)}{M_2}\right)^{\alpha}$$

where $u(0) = 0$ (or, the control used in the previous interval), and $\alpha$ is
an even integer (2, 4, etc.). The larger the value of $\alpha$, the closer will
the solution approximate the solution to the original problem. For
$\alpha > 2$, the problem is slightly more complicated by the fact that $J$ is
no longer quadratic.
With this formulation, the coordinatewise gradient method or Ho's
simplified gradient projection method will apply directly in $u$-space.

M. Between Sample Considerations

Besides the errors at the sampling instants, consideration can be
given to the output of the process between sampling instants. Instead
of Eq. (29), we use

$$y = Gu + y_0$$

where

m 0 0 ... 0 "

S(2)
0
£(1)
0
0 m
v(2)
g(2)
y = m
g(N) SO) v(N)
g(N-l)
where

$\bar{y}(j)$ is the output $T/2$ sec after $y(j)$

$\bar{g}(j)$ is the response to a unit pulse of width $T$ at $(j + \tfrac{1}{2})T$ sec from the
initiation of the pulse

The criterion becomes

$$J = \tfrac{1}{2}\,\|y_d - y\|^2$$

From here, the procedure is exactly the same as before. If desired, the
procedure can be extended to more in-between points.

N. A Hybrid Computational Procedure

In this section a method will be proposed which exploits the particular
features of the analog and digital computers. As shown in Appendix 3,
$\lambda_j > 0$ can be used as a test to determine whether the minimum is on
a particular bounding hyperplane. Upon determination of the
hyperplanes upon which the minimum lies we can determine the minimum
point by projection. The analog computer will be employed for the
zero-nonzero determination, while the digital computer will be employed
for the projection operation.
To each constraint

$$\sum_i d_{ji}u_i - b_j \ge 0$$

is associated a $\lambda_j$. For those inequalities satisfied by the equality we
have $\lambda_j > 0$. For those inequalities satisfied by a strict inequality we
have $\lambda_j = 0$. We are interested in determining those $\lambda_j$ which are positive.
From Theorem A3.2 we have to satisfy

$$u = C^{-1}(D^{*}\lambda - h) \tag{46}$$

$$Du - b \ge 0 \tag{47}$$

$$\lambda^{*}(Du - b) = 0 \tag{48}$$

$$\lambda \ge 0 \tag{49}$$

Let us substitute Eq. (46) into Eqs. (47) and (48), eliminating $u$. We
obtain the set

$$DC^{-1}(D^{*}\lambda - h) - b \ge 0$$

$$\lambda^{*}\bigl(DC^{-1}D^{*}\lambda - (DC^{-1}h + b)\bigr) = 0$$

$$\lambda \ge 0$$

Let

$$w = DC^{-1}(D^{*}\lambda - h) - b \tag{50}$$

Then, we have the symmetrical set of relations to satisfy.

$$w \ge 0$$

$$\lambda \ge 0$$

$$\langle w, \lambda\rangle = 0 \tag{51}$$

The last relation requires that $w_j = 0$ when $\lambda_j > 0$ and $\lambda_j = 0$ when
$w_j > 0$.
Instead of using $\lambda_j$, we can use $w_j$ to determine whether the optimum
point is on a particular hyperplane. The magnitude of $w_j$ gives the
distance from the optimum point to the hyperplane, $H_j$. Therefore,
we are interested in those $w_j$ which are zero. As we are interested only
in the zero-nonzero aspect, an analog computer with limited accuracy
can be employed. If a $w_j$ is close to zero there will be little harm in
calling it zero.
Upon determining those $w_j$'s which are zero we collect the
corresponding inequalities which are to be satisfied by equalities

$$H_j:\ \sum_i d_{ji}u_i - b_j = 0, \qquad j = 1, \ldots, q \tag{52}$$

The equations may not necessarily be linearly independent. There is
no loss of generality in assuming that the $d_j$ vectors, which are normal to
the hyperplanes, $H_j$, have unit norm.
To perform the projection it will be convenient to find a point which
is common to all of the hyperplanes. Let us write Eq. (52) in vector form

$$\langle d_j, u\rangle - b_j = 0$$

or

$$Du - b = 0$$

where

$D$ = $q \times n$ matrix

$b$ = $q \times 1$ vector

A point $u^{\dagger}$ which is common to all of the hyperplanes in Eq. (52) is
given by the pseudo-inverse.

$$u^{\dagger} = D^{\dagger}b \tag{53}$$

Before proceeding, we describe the projection operator as described
by Rosen (35). (We extend Rosen's work by employing the
pseudo-inverse.) Let us consider the linear subspaces (including the origin)
corresponding to the hyperplanes, $H_j$.

$$Du = 0$$

The normals to the subspaces ($d_j$) span the $q$-dimensional subspace $\tilde{Q}$.
The subspace obtained by the intersection of the hyperplanes translated
to the origin we designate as $Q$. Now, the total space consists of the
product space of $Q$ and $\tilde{Q}$, or $E_n = Q \oplus \tilde{Q}$.
Now, the projection of a vector in $E_n$ onto $\tilde{Q}$ is given by

$$P_{\tilde{Q}} = D^{\dagger}D$$

The projection of a vector in $E_n$ onto $Q$ is given by the $n \times n$ matrix

$$P_{Q} = I - D^{\dagger}D$$
Since we are interested in the intersection of hyperplanes translated
from the origin, we form the vector from $u^{\dagger}$ to the desired point $u'$,
or $u' - u^{\dagger}$. Performing the projection we obtain

$$(I - D^{\dagger}D)(u' - u^{\dagger})$$

Now, the optimum point is obtained by

$$u^{\circ} = (I - D^{\dagger}D)(u' - u^{\dagger}) + u^{\dagger} \tag{54}$$

The computation of Eq. (54) will be performed on a digital computer
with the pseudo-inverse subroutine described in Appendix 4. It should
be pointed out that the technique is directly applicable when the criterion
is given by Eq. (33). Otherwise, the gradient vector must be projected in
an iterative manner.
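
The projection step can be sketched for the simpler case in which the active hyperplanes are linearly independent, so that the pseudo-inverse reduces to $D^{\dagger} = D^{*}(DD^{*})^{-1}$; this restriction is an assumption of the sketch and is not required by the pseudo-inverse subroutine of Appendix 4. Under it, the projected point $(I - D^{\dagger}D)(u' - u^{\dagger}) + u^{\dagger}$ equals $u' - D^{\dagger}(Du' - b)$, which is the form computed below. The function names and numerical values are hypothetical.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, A[i])) + [float(rhs[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def project_onto_hyperplanes(D, b, u_desired):
    """Nearest point to u_desired satisfying D u = b (rows of D independent)."""
    q, n = len(D), len(u_desired)
    # Amount by which the desired point violates each equality.
    defect = [sum(D[j][i] * u_desired[i] for i in range(n)) - b[j]
              for j in range(q)]
    gram = [[sum(D[j][i] * D[k][i] for i in range(n)) for k in range(q)]
            for j in range(q)]                               # D D*
    z = solve(gram, defect)                                  # (D D*)^-1 (D u' - b)
    return [u_desired[i] - sum(D[j][i] * z[j] for j in range(q))
            for i in range(n)]


D_active = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
b_active = [1.0, 2.0]
u_star = project_onto_hyperplanes(D_active, b_active, [3.0, 0.0, 0.0])
```

For the values chosen the desired point (3, 0, 0) is carried to (1, 1, 1), which lies on both hyperplanes.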
The reasons for employing analog computation are: (1) speed of
response and (2) minimal accuracy requirements. The implicit function
technique does not seem to have a counterpart in digital computation
except by using analogous techniques such as DDA. Let us describe
the analog circuit requirements by looking at a simple example. Although
simple constraints are considered in the example, multiple constraints
can be considered without modification of the method. It is not difficult
to envision special purpose computers for on-line application.

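Example. Find $u(1)$ and $u(2)$ which minimize

$$\| d' - Gu \|^2 = u^*G^*Gu - 2d'^*Gu + d'^*d'$$

where

$$G = \begin{bmatrix} g(1) & 0 \\ g(2) & g(1) \end{bmatrix}$$

subject to

$$| u(i) | \le M, \quad i = 1, 2$$

The constraints in vector form are

$$Du - b = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u(1) \\ u(2) \end{bmatrix} + \begin{bmatrix} M \\ M \\ M \\ M \end{bmatrix} \ge 0$$

In Eq. (50)

$$C = G^*G, \quad (2 \times 2)$$
$$h^* = -2d'^*G, \quad h \ (2 \times 1)$$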

Let

$$DC^{-1}D^* = A$$

$$DC^{-1}h - b = \eta, \quad (4 \times 1)$$
A schematic for the implicit function method for solving Eq. (51) is shown in Fig. 22. Only one channel is shown. For the two-dimensional example there will be 4 similar channels. In general, a channel is required per constraint. The circuit employs integrators, summers, diodes, and relays.

FIG. 22. Zero-non-zero determination of $w_i$ on analog computer.

IV. Identification of Process Parameters: Explicit Mathematical Relation Method

A. Introduction
Many methods have been proposed for identification (more precisely, parameter estimation) of physical processes. The method to be used in a particular application may depend upon, among other conditions: (1) the manner in which the estimated information is used and (2) the amount of a priori information available. The methods sought here must fit the control signal synthesis method discussed in Section III. As the identification is to be performed on-line, there are requirements on the speed and amount of computation. The more a priori information is available, the simpler is the identification problem. To have methods which can be readily performed on-line we usually require a certain amount of knowledge about the process.
Our discussion will be restricted to those methods which have the following characteristics. First of all, the process is assumed linear and stationary. The stationarity is assumed for the time interval of the data from which an identification is made. Secondly, the identification should be performed without inserting externally generated test signals. It should depend only on the normal signals present in the system. Lastly, because noise is inevitable in the systems, smoothing should be provided.
For linear processes either the weighting function or the coefficients of the difference equation (discrete case) are identified. We confine ourselves to the determination of the coefficients. Discussions on the determination of the weighting function are given by Levin (36), Kerr and Surber (37), Balakrishnan (38), and Hsieh (28).
Restricting ourselves to the determination of the coefficients of the difference equation, essentially two different approaches are available: (1) the explicit mathematical relation method and (2) the learning model method. The explicit mathematical relation method requires knowledge of the exact form of the difference equation. This restriction is somewhat relaxed for the learning model method in the sense that a lower order model can be made to approximate a higher order process. This section will discuss the explicit mathematical relation method. Section V will discuss the learning model method.
The explicit mathematical relation method was used by Kalman (7) but the basic philosophy dates as far back as 1951 when Greenberg (39) discussed methods for determining stability derivatives of an airplane. Subsequent work on this method was performed by Bigelow and Ruge (40). The method will be generalized by bringing in the concept of the pseudo-inverse. Furthermore, statistical analysis has been lacking in the previous studies on this particular method. Therefore, statistical considerations will be given in terms of the confidence interval.
In accordance with considerations given in Section I, the explicit mathematical relation method does not rely on the exact knowledge of the state variables.
A thorough survey of identification methods is provided in a report by Eykhoff (41).

B. Description of the Mathematical Relation Method

Briefly, the method reconstructs the equation of the process by measuring the output and input, and enough of their previous values that all of the terms in the equation are accounted for. By taking redundant measurements, filtering is provided. Additional filtering can also be obtained by inserting filters (this can be done without sacrificing the identification process).
The method can best be described by taking an example. Let us determine the coefficients of the following difference equation.

$$y(k) = \alpha_1 y(k-1) + \alpha_2 u(k) \tag{55}$$

The problem is to determine $\alpha_1$ and $\alpha_2$. These parameters can be constant but unknown or changing due to changes in environment. Usually, $y(k)$ will not be directly observed but with a contaminating noise quantity as depicted in Fig. 49. Thus,

$$z(k) = y(k) + v(k) \tag{56}$$

The values of $z(k)$ and $u(k)$ will be stored for some interval of time into the past; and throughout this interval the parameters $\alpha_1$ and $\alpha_2$ are assumed to be constant. Since $y(k)$ cannot be directly measured, Eq. (55) is rewritten in terms of $z(k)$.

$$z(k) - v(k) = \alpha_1 [z(k-1) - v(k-1)] + \alpha_2 u(k) \tag{57}$$

Or,

$$z(k) = \alpha_1 z(k-1) + \alpha_2 u(k) + v_1(k) \tag{58}$$

where

$$v_1(k) = v(k) - \alpha_1 v(k-1)$$

Taking a set of measurements, Eq. (58) can be rewritten in vector form (see footnote 8).

$$\mathbf{z}_k = \alpha_1 \mathbf{z}_{k-1} + \alpha_2 \mathbf{u}_k + \mathbf{v}_{1k} \tag{59}$$

where

$$\mathbf{z}_k = \begin{bmatrix} z(k-N+1) \\ \vdots \\ z(k) \end{bmatrix}, \text{ etc.}$$

In matrix form

$$\mathbf{z}_k = A\alpha + \mathbf{v}_{1k} \tag{60}$$

where

$$A = [\, \mathbf{z}_{k-1} \mid \mathbf{u}_k \,]$$

8 The $k$ signifies that $N$ data points into the past from time $k$ are considered.

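Let

$$\hat{\mathbf{z}}_k = A\hat{\alpha} \tag{61}$$

The $\hat{\mathbf{z}}_k$ is in the manifold of $\mathbf{z}_{k-1}$ and $\mathbf{u}_k$. The quantity $\mathbf{z}_k$ is not necessarily in the linear manifold because of $\mathbf{v}_{1k}$. Since $\mathbf{v}_{1k}$ is unknown, a reasonable estimate of the parameters would be those values which result from the projection of $\mathbf{z}_k$ on the manifold of $\mathbf{z}_{k-1}$ and $\mathbf{u}_k$. The projection yields

$$\langle \mathbf{z}_k - \hat{\mathbf{z}}_k, \mathbf{z}_{k-1} \rangle = 0$$

$$\langle \mathbf{z}_k - \hat{\mathbf{z}}_k, \mathbf{u}_k \rangle = 0$$

or,

$$\hat{\alpha}_1 \langle \mathbf{z}_{k-1}, \mathbf{z}_{k-1} \rangle + \hat{\alpha}_2 \langle \mathbf{u}_k, \mathbf{z}_{k-1} \rangle = \langle \mathbf{z}_k, \mathbf{z}_{k-1} \rangle$$
$$\hat{\alpha}_1 \langle \mathbf{z}_{k-1}, \mathbf{u}_k \rangle + \hat{\alpha}_2 \langle \mathbf{u}_k, \mathbf{u}_k \rangle = \langle \mathbf{z}_k, \mathbf{u}_k \rangle \tag{62}$$

In terms of the matrix equation, Eq. (62) is

$$A^*A\hat{\alpha} = A^*\mathbf{z}_k \tag{63}$$

Equations (62) and (63) are known as normal equations, and if $\mathbf{z}_{k-1}$ and $\mathbf{u}_k$ are linearly independent, then the solution is given by

$$\hat{\alpha} = (A^*A)^{-1}A^*\mathbf{z}_k \tag{64}$$

If $\mathbf{z}_{k-1}$ and $\mathbf{u}_k$ are not necessarily linearly independent, Eq. (64) can be generalized to

$$\hat{\alpha} = A^\dagger \mathbf{z}_k \tag{65}$$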

The pseudo-inverse, extensively discussed by Penrose (75, 79), provides a unique solution even if the inverse in Eq. (64) does not exist. It provides the solution with Min $\| \hat{\alpha} \|$. It should be noted that the minimum norm solution may not be the actual values of the process parameters. However, a solution is provided to the problem formulation instead of some nonsensical solution. A recursive method of evaluating the pseudo-inverse is presented in Appendix 4 essentially following the derivation given by Greville (75). It is rederived starting with the axioms given by Penrose. The relation of Greville's routine with Kalman's recursive filtering technique (76) is given in Appendix 5.
During the first few steps of the recursive procedure we always have a singular situation. The advantage of Greville's procedure is that a unique solution is provided even for these first few steps; and eventually as the nonsingular situation is reached the solution is obtained without error.
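For the two-parameter example, the normal equations (62) can be solved directly. The sketch below is a batch illustration, not the recursive Greville routine of Appendix 4, and it assumes the nonsingular case of Eq. (64); the function name and the test data are hypothetical.

```python
def estimate_parameters(z, u):
    """Least-squares estimate of (alpha1, alpha2) in
    z(k) = alpha1*z(k-1) + alpha2*u(k) + noise, via the normal equations (62)."""
    s_zz = s_zu = s_uu = r_z = r_u = 0.0
    for k in range(1, len(z)):
        zp, uk, zk = z[k - 1], u[k], z[k]
        s_zz += zp * zp          # <z_{k-1}, z_{k-1}>
        s_zu += zp * uk          # <u_k, z_{k-1}>
        s_uu += uk * uk          # <u_k, u_k>
        r_z += zk * zp           # <z_k, z_{k-1}>
        r_u += zk * uk           # <z_k, u_k>
    det = s_zz * s_uu - s_zu * s_zu
    if abs(det) < 1e-12:
        # Singular case: Eq. (64) does not apply; a pseudo-inverse would be needed.
        raise ValueError("z(k-1) and u(k) are linearly dependent")
    alpha1 = (r_z * s_uu - r_u * s_zu) / det
    alpha2 = (r_u * s_zz - r_z * s_zu) / det
    return alpha1, alpha2

# Hypothetical noise-free data generated with alpha1 = 0.8, alpha2 = 0.5.
u = [1.0, 0.0, -1.0, 0.5, 2.0, -1.5, 0.0, 1.0]
z = [0.0]
for k in range(1, len(u)):
    z.append(0.8 * z[-1] + 0.5 * u[k])
alpha1_hat, alpha2_hat = estimate_parameters(z, u)
```

With noise-free data the estimates reproduce the generating coefficients; with noisy data the redundant rows provide the smoothing described above.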

C. Additional Filtering
In conjunction with the use of redundant data, it is possible to incorporate additional filtering. This filtering should be provided without compromising the identification process. Let us describe this filtering process on the same example. We designate $F(\ )$ as a linear discrete filter and operate on both sides of Eq. (58).

$$F(z(k)) = \alpha_1 F(z(k-1)) + \alpha_2 F(u(k)) + v_2(k) \tag{66}$$

Now, the quantities $F(z(k))$ and $F(z(k-1))$ are respectively closer to $y(k)$ and $y(k-1)$. Therefore, we have in vector form the counterpart of Eq. (59) with each element replaced by its filtered value, (67), where the vector replacing $\mathbf{z}_k$ is

$$\begin{bmatrix} F(z(k-N+1)) \\ \vdots \\ F(z(k)) \end{bmatrix}, \text{ etc.}$$

The identification configuration will appear as in Fig. 23.

FIG. 23. Configuration for additional filtering. (Labels: $u(k)$, $v(k)$, $z(k)$, one-period delay, $F(z(k-1))$.)

D. Block Processing of Data


The Greville-Kalman recursive method can process the data as it arrives. However, there is one difficulty. In an adaptive task in which the process is changing it becomes necessary to lop off the effect of old data. Of course, in an adaptive task in which the process is unknown but constant, there is no problem because the recursive method can start at time $t = 0$ and continue up to the present time. A possible solution to the former case is block processing, depicted in Fig. 24. The recursive method is initiated at the start of each observation interval. This, of course, is simple-minded. The estimated values are changed every $NT$ seconds. If parameters are changing continually this procedure may not be satisfactory.

FIG. 24. Block processing. (Labels: observation interval, sampling interval.)

E. Exponential Weighting
The lopping off of old data can be provided by exponential weighting. This weighting can be incorporated into the recursive method previously described by determining the solution to (see footnote 9)

$$W_k A_k \hat{\alpha}_k \doteq W_k \mathbf{z}_k \tag{68}$$

The dot above the equal sign signifies that the $\alpha$'s are to be chosen so that the left-hand side best approximates the right-hand side in the sense that we have

$$\min \| W_k A_k \hat{\alpha}_k - W_k \mathbf{z}_k \|$$

The $W_k$ is equal to

$$W_k = \operatorname{diag}\left( \dots, (\sqrt{w})^3, (\sqrt{w})^2, \sqrt{w} \right)$$

with $0 \le w \le 1$, $w$ being the staleness factor.

9 The subscript $k$ again refers to the present time.

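The solution is given by

$$\hat{\alpha}_k = A_k^\dagger W_k \mathbf{z}_k \tag{69}$$

where $A_k^\dagger$ is the pseudo-inverse of $W_k A_k$.
It is observed that

$$W_k A_k = \begin{bmatrix} \sqrt{w}\, W_{k-1} A_{k-1} \\ \sqrt{w}\, a_k^* \end{bmatrix}, \qquad W_k \mathbf{z}_k = \begin{bmatrix} \sqrt{w}\, W_{k-1} \mathbf{z}_{k-1} \\ \sqrt{w}\, z(k) \end{bmatrix} \tag{70}$$

The recursive equations can be derived in the same way as in Appendix 4. The important equations are

$$\hat{\alpha}_k = \hat{\alpha}_{k-1} - \sqrt{w}\, b_k a_k^* \hat{\alpha}_{k-1} + \sqrt{w}\, b_k z(k) \tag{71}$$

$$c_k^* = \sqrt{w}\, a_k^* - \sqrt{w}\, a_k^* A_{k-1}^\dagger W_{k-1} A_{k-1} \tag{72}$$

Case 1: $c_k \ne 0$.

$$b_k = (c_k^* c_k)^{-1} c_k \tag{73}$$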
Case 2: ck = 0.

$$A_k^\dagger W_k A_k = A_{k-1}^\dagger W_{k-1} A_{k-1} + b_k c_k^* \tag{75}$$

1 1
AkWkAt* = [V5 - b A ^ t ^ ^ j y V F i _ aA*] + Vwbkbk*

(76)
The exponential weighting is depicted in Fig. 25. The point $k$ represents the present time. Recent data are given larger weights than older data. As $w \to 1$, these equations will revert to the growing memory case of Appendix 4.

FIG. 25. Exponential weighting.
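The effect of the staleness factor can be sketched without the pseudo-inverse recursion. The code below is a swapped-in, simpler technique under stated assumptions: it assumes the nonsingular case and accumulates the weighted normal equations, discounting the running sums by $w$ at every new sample. It gives the newest sample weight 1 rather than $w$; this common scale factor does not change the minimizing parameters. The function name and data are hypothetical.

```python
def weighted_estimate(z, u, w):
    """Minimize sum_k w**(age of sample k) * (z(k) - a1*z(k-1) - a2*u(k))**2."""
    s_zz = s_zu = s_uu = r_z = r_u = 0.0
    for k in range(1, len(z)):
        zp, uk, zk = z[k - 1], u[k], z[k]
        # Discount everything accumulated so far by w, then add the new row.
        s_zz = w * s_zz + zp * zp
        s_zu = w * s_zu + zp * uk
        s_uu = w * s_uu + uk * uk
        r_z = w * r_z + zk * zp
        r_u = w * r_u + zk * uk
    det = s_zz * s_uu - s_zu * s_zu
    if abs(det) < 1e-12:
        raise ValueError("weighted normal equations are singular")
    return ((r_z * s_uu - r_u * s_zu) / det, (r_u * s_zz - r_z * s_zu) / det)

# Hypothetical noise-free data from a constant process (alpha1 = 0.8, alpha2 = 0.5).
u = [1.0, 0.0, -1.0, 0.5, 2.0, -1.5, 0.0, 1.0]
z = [0.0]
for k in range(1, len(u)):
    z.append(0.8 * z[-1] + 0.5 * u[k])
fading = weighted_estimate(z, u, 0.5)   # old data discounted
growing = weighted_estimate(z, u, 1.0)  # w = 1: growing-memory case
```

For a constant process both settings agree; the weighting matters only when the parameters drift, in which case a smaller $w$ lets recent rows dominate.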



F. Uniform Weighting—Observable Case Only

If adequate computer storage space is provided, at any sampling instant a finite amount of data into the past can be analyzed. The recursive equations for this uniform weighting have been worked out by Gainer (42) for the observable case ($(A^*A)^{-1}$ exists). The procedure will be outlined in this section. Pictorially, the uniform weighting slides forward in time as depicted in Fig. 26.

FIG. 26. Uniform weighting. (Label: observation interval.)

For adding the effect of new data, $\hat{\alpha}_{k,N}$ was determined in terms of $\hat{\alpha}_{k-1,N-1}$ and the new set of data $a_k$, $z(k)$ (see footnote 10). This time, the set of data to be deleted ($a_{k-N}$, $z(k-N)$) and $\hat{\alpha}_{k,N}$ are given, and it is desired to determine $\hat{\alpha}_{k,N-1}$. From Eq. (64)

$$\hat{\alpha}_{k,N} = P_{k,N} A_{k,N}^* \mathbf{z}_{k,N} \tag{77}$$

where

$$P_{k,N} = (A_{k,N}^* A_{k,N})^{-1} \tag{78}$$

or, (the subscript $k$ is dropped when unambiguous)

$$\hat{\alpha}_N = P_N [\, a_{k-N} \mid A_{N-1}^* \,] \begin{bmatrix} z(k-N) \\ \mathbf{z}_{N-1} \end{bmatrix} = P_N a_{k-N} z(k-N) + P_N A_{N-1}^* \mathbf{z}_{N-1} \tag{79}$$

Also,

$$\hat{\alpha}_{N-1} = P_{N-1} A_{N-1}^* \mathbf{z}_{N-1} \tag{80}$$

We note that

$$P_N^{-1} = a_{k-N} a_{k-N}^* + A_{N-1}^* A_{N-1} = a_{k-N} a_{k-N}^* + P_{N-1}^{-1} \tag{81}$$

10 In $\hat{\alpha}_{k,N}$ the $N$ signifies that $N$ data points are taken, and $k$ signifies time. This notation is adopted primarily for this section.

Therefore

$$\hat{\alpha}_{N-1} = [P_N^{-1} - a_{k-N} a_{k-N}^*]^{-1} A_{N-1}^* \mathbf{z}_{N-1} \tag{82}$$

or,

$$A_{N-1}^* \mathbf{z}_{N-1} = [P_N^{-1} - a_{k-N} a_{k-N}^*]\, \hat{\alpha}_{N-1} \tag{83}$$

Substituting Eq. (83) into Eq. (79), we have

$$\hat{\alpha}_N = [I - P_N a_{k-N} a_{k-N}^*]\, \hat{\alpha}_{N-1} + P_N a_{k-N} z(k-N)$$

or,

$$\hat{\alpha}_{N-1} = [I - P_N a_{k-N} a_{k-N}^*]^{-1} (\hat{\alpha}_N - P_N a_{k-N} z(k-N)) \tag{84}$$

We can eliminate the inverse by noting the following

$$[I - P_N a_{k-N} a_{k-N}^*]^{-1} [I - P_N a_{k-N} a_{k-N}^*] = I$$

$$[I - P_N a_{k-N} a_{k-N}^*]^{-1} - [I - P_N a_{k-N} a_{k-N}^*]^{-1} P_N a_{k-N} a_{k-N}^* = I \tag{85}$$

Post multiply by $P_N a_{k-N}$

$$[I - P_N a_{k-N} a_{k-N}^*]^{-1} P_N a_{k-N} - [I - P_N a_{k-N} a_{k-N}^*]^{-1} P_N a_{k-N} \beta = P_N a_{k-N}$$

where

$$\beta = a_{k-N}^* P_N a_{k-N} \quad \text{(scalar)}$$

Therefore,

$$[I - P_N a_{k-N} a_{k-N}^*]^{-1} P_N a_{k-N} = \frac{1}{1 - \beta} P_N a_{k-N} \tag{86}$$

Post multiplying Eq. (85) by $\hat{\alpha}_N$

$$[I - P_N a_{k-N} a_{k-N}^*]^{-1} \hat{\alpha}_N = \hat{\alpha}_N + \frac{1}{1 - \beta} P_N a_{k-N} a_{k-N}^* \hat{\alpha}_N \tag{87}$$

Substituting Eqs. (86) and (87) into Eq. (84), we have

$$\hat{\alpha}_{N-1} = \hat{\alpha}_N + \frac{1}{1 - \beta} P_N a_{k-N} (a_{k-N}^* \hat{\alpha}_N - z(k-N)) \tag{88}$$

Now, $P_{N-1}$ will be derived in terms of $P_N$. From Eq. (81)

$$P_{N-1}^{-1} = P_N^{-1} [I - P_N a_{k-N} a_{k-N}^*]$$

or,

$$P_{N-1} = [I - P_N a_{k-N} a_{k-N}^*]^{-1} P_N$$

To eliminate the inverse, we post multiply Eq. (85) by $P_N$. Therefore,

$$P_{N-1} = P_N + \frac{1}{1 - \beta} P_N a_{k-N} a_{k-N}^* P_N \tag{89}$$

Equations (88) and (89) are to be used with the recursive equations of Appendix 4 to perform uniform weighting. A sequence of add, delete, add, delete, add, ... alternately using the above equations for the oldest data and equations of Appendix 4 for the new data is required. It is noted that $1 - \beta$ may be equal to zero. When such a situation arises, the elimination of that particular row of data can be deferred, of course, with attendant increase in programming complexity.
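The delete step of Eqs. (88) and (89) can be checked numerically against a direct refit. The following is a minimal two-parameter sketch under the observable-case assumption; the helper `batch_fit` and the data are hypothetical and serve only as a reference solution.

```python
def batch_fit(rows, zs):
    """Direct solution alpha = (A*A)^-1 A* z and P = (A*A)^-1 for two parameters."""
    s00 = sum(r[0] * r[0] for r in rows)
    s01 = sum(r[0] * r[1] for r in rows)
    s11 = sum(r[1] * r[1] for r in rows)
    r0 = sum(r[0] * z for r, z in zip(rows, zs))
    r1 = sum(r[1] * z for r, z in zip(rows, zs))
    det = s00 * s11 - s01 * s01
    P = [[s11 / det, -s01 / det], [-s01 / det, s00 / det]]
    alpha = [P[0][0] * r0 + P[0][1] * r1, P[1][0] * r0 + P[1][1] * r1]
    return alpha, P

def delete_oldest(alpha, P, a, z_old):
    """Remove one data row (a, z_old) from a least-squares fit, per Eqs. (88)-(89)."""
    Pa = [P[0][0] * a[0] + P[0][1] * a[1], P[1][0] * a[0] + P[1][1] * a[1]]
    beta = a[0] * Pa[0] + a[1] * Pa[1]            # beta = a* P a (scalar)
    if abs(1.0 - beta) < 1e-12:
        raise ValueError("1 - beta is zero; defer deleting this row")
    gain = 1.0 / (1.0 - beta)
    resid = a[0] * alpha[0] + a[1] * alpha[1] - z_old   # a* alpha_N - z(k-N)
    alpha_new = [alpha[i] + gain * Pa[i] * resid for i in range(2)]   # Eq. (88)
    # P is symmetric, so a* P equals (P a)*.
    P_new = [[P[i][j] + gain * Pa[i] * Pa[j] for j in range(2)] for i in range(2)]  # Eq. (89)
    return alpha_new, P_new

# Hypothetical data: four rows, the first being the oldest.
rows = [[1.0, 0.0], [0.5, 1.0], [-1.0, 2.0], [2.0, 1.0]]
zs = [1.1, 0.9, 0.2, 2.3]
alpha_N, P_N = batch_fit(rows, zs)
alpha_del, P_del = delete_oldest(alpha_N, P_N, rows[0], zs[0])
alpha_ref, P_ref = batch_fit(rows[1:], zs[1:])
```

The downdated pair agrees with the refit over the remaining rows to rounding error, which is the property the add/delete sequence relies on.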

G. Confidence Interval
The determination of the accuracy with which parameters can be estimated requires statistical analysis. An extensive study in the area of least squares has been made by Linnik (43). The particular results which are useful for our purposes will be presented here.
Let us refer to Fig. 49 and consider the case when $v(k)$ is a sequence of uncorrelated Gaussian random variables. As $v_1(k)$ is a function of $v(k)$ and $v(k-1)$ in Eq. (58), if we consider the data points at every other sampling interval, $v_1(k)$ would be an uncorrelated sequence of noise. Therefore, our samples are taken so that we consider the white noise case. Of course, one would do better to consider every data point even if they are correlated. However, the white noise case is more convenient for the determination of confidence intervals and it will provide a conservative determination.
We will consider the case when the variance ($\sigma^2$) of $v_1(k)$ is unknown. It is observed that even if the variance of $v(k)$ is known, the variance of $v_1(k)$ is unknown because $v_1(k)$ is a function of the parameters to be determined.
First, let us discuss the properties of the optimum estimate, $\hat{\alpha}$. We state the significant properties as lemmas. The proofs can be found in Linnik (43).

LEMMA IV-1: The estimators from the least squares analysis are unbiased, i.e.,

$$E\,\hat{\alpha} = \alpha$$

LEMMA IV-2: The unbiased estimators, $\hat{\alpha}$, form a Gaussian, n-dimensional vector with the correlation matrix

$$R_{\hat{\alpha}} = \sigma^2 (A^*A)^{-1}$$

or,

$$\operatorname{Var} \hat{\alpha}_i = \sigma^2 \{(A^*A)^{-1}\}_{ii}$$

Therefore,

$$\frac{\hat{\alpha}_i - \alpha_i}{\sigma \sqrt{\{(A^*A)^{-1}\}_{ii}}} \in N(0, 1)$$

where $N(0, 1)$ represents Gaussian distribution with zero mean and standard deviation of one.
Next, we consider the properties of $\hat{v}$, given by

$$\hat{v} = A\hat{\alpha} - \mathbf{z}$$

LEMMA IV-3: The minimum variance unbiased estimator also satisfies the condition

$$\| \hat{v} \|^2 = \min$$

LEMMA IV-4: The error vector, $\hat{v}$, is an $(N-n)$ dimensional Gaussian vector and it is independent of $\hat{\alpha}$.

LEMMA IV-5: The random variable $\hat{v}^*\hat{v}$ is distributed as $\chi^2$ with $N - n$ degrees of freedom and it is independent of $\hat{\alpha}$.
Now, we have the quantities which can form the $t$-distribution. If $\xi$ and $\sum \xi_i^2$ are statistically independent, with $\xi$ and the $\xi_i$ Gaussian random variables and the sum having $n'$ degrees of freedom, the $t$-distribution is formed by the following ratio.

$$t = \frac{\xi}{\sqrt{(1/n') \sum \xi_i^2}} = \frac{\xi}{\sqrt{\chi^2 / n'}}$$

Let

$$\xi = \frac{\hat{\alpha}_i - \alpha_i}{\sigma \sqrt{\{(A^*A)^{-1}\}_{ii}}}$$

$$\chi^2 = \frac{1}{\sigma^2} \hat{v}^*\hat{v}$$

$$n' = N - n$$

then,

$$t = \frac{\hat{\alpha}_i - \alpha_i}{\sqrt{\{(A^*A)^{-1}\}_{ii}\, \dfrac{\hat{v}^*\hat{v}}{N - n}}}$$

It is observed that the unknown variance cancels when the ratio is formed. Using the $t$-distribution we can determine the interval about $\hat{\alpha}_i$ which will include $\alpha_i$ with a certain probability. For example, let us use Pr. = 0.90; then,

$$\Pr \{ | t_{N-n} | < \gamma \} = 0.90$$

The $\gamma$ is found from well tabulated tables. Therefore,

$$| \hat{\alpha}_i - \alpha_i | < \gamma \sqrt{\{(A^*A)^{-1}\}_{ii}\, \frac{\hat{v}^*\hat{v}}{N - n}}$$

Thus, the range $2\Delta$ of the 0.90 confidence interval is

$$2\Delta = 2\gamma \sqrt{\{(A^*A)^{-1}\}_{ii}\, \frac{\hat{v}^*\hat{v}}{N - n}} \tag{90}$$

The difficulty in the use of the confidence interval lies in the fact that $A^*A$ and $\hat{v}^*\hat{v}$ change as the interval of consideration changes. Possibly one could use the conservative (larger) estimate of these quantities to get an estimate of $2\Delta$. The important point to observe is that to decrease $2\Delta$, $N - n$ must be increased.
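The range of Eq. (90) is straightforward to evaluate once the fit is available. The sketch below handles the two-parameter, nonsingular case; the value of $\gamma$ is assumed to be read from a $t$-table by the caller (for example, 1.860 for probability 0.90 and 8 degrees of freedom). The function name and the data, including the small perturbations standing in for noise, are hypothetical.

```python
import math

def fit_with_confidence(rows, zs, gamma):
    """Two-parameter least-squares fit plus the range 2*Delta of Eq. (90)."""
    n, N = 2, len(rows)
    s00 = sum(r[0] * r[0] for r in rows)
    s01 = sum(r[0] * r[1] for r in rows)
    s11 = sum(r[1] * r[1] for r in rows)
    r0 = sum(r[0] * z for r, z in zip(rows, zs))
    r1 = sum(r[1] * z for r, z in zip(rows, zs))
    det = s00 * s11 - s01 * s01
    inv = [[s11 / det, -s01 / det], [-s01 / det, s00 / det]]   # (A*A)^-1
    alpha = [inv[0][0] * r0 + inv[0][1] * r1, inv[1][0] * r0 + inv[1][1] * r1]
    # v*v, the squared norm of the residual vector v = A alpha - z.
    vv = sum((r[0] * alpha[0] + r[1] * alpha[1] - z) ** 2 for r, z in zip(rows, zs))
    ranges = [2.0 * gamma * math.sqrt(inv[i][i] * vv / (N - n)) for i in range(n)]
    return alpha, ranges

# Hypothetical data: 10 rows, true parameters (0.8, 0.5), small perturbations added.
rows = [[1, 0], [0, 1], [1, 1], [-1, 1], [2, 0], [0, 2], [1, -1], [-2, 1], [1, 2], [2, -1]]
noise = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01, 0.0, 0.01, -0.015, 0.005]
zs = [0.8 * r[0] + 0.5 * r[1] + e for r, e in zip(rows, noise)]
alpha_hat, two_delta = fit_with_confidence(rows, zs, gamma=1.860)
```

Because $\hat{v}^*\hat{v}$ and $(A^*A)^{-1}$ are recomputed from the current rows, the ranges change whenever the observation interval does, as noted above.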
The above results can be extended to the case when exponential weighting is used. The range $2\Delta$ is then given by

(91)

H. Determination of Pulse Response


In the type of adaptive controller studied in Section III, the elements of the pulse response are desired along with the coefficients of the difference equation. However, the pulse response and the coefficients of the difference equations are closely related; and two methods are available for determining the pulse response.
First, there is the well-known method of deriving the pulse response from the coefficients via long division. Although it is relatively simple to perform the calculations, there may be uncertainty in the propagation of errors through the division process.
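The long-division route can be sketched by driving the difference equation with a unit pulse, which produces the same sequence term by term. The example assumes a second-order equation with a single forcing term, $x(k) = \phi_{11} x(k-1) + \phi_{12} x(k-2) + g(1)u(k)$; the function name and coefficient values are illustrative.

```python
def pulse_response(phi11, phi12, g1, count):
    """First `count` pulse-response elements g(1), g(2), ... of
    x(k) = phi11*x(k-1) + phi12*x(k-2) + g1*u(k)."""
    g = []
    x_prev, x_prev2 = 0.0, 0.0
    for step in range(count):
        u = 1.0 if step == 0 else 0.0      # unit pulse applied at the first instant
        x = phi11 * x_prev + phi12 * x_prev2 + g1 * u
        g.append(x)
        x_prev2, x_prev = x_prev, x
    return g

# Hypothetical coefficients.
g = pulse_response(0.5, 0.25, 2.0, 4)
```

Any error in the estimated coefficients is carried forward through every later element, which is the propagation concern raised above.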
In the other method, the pulse response coefficients can be measured directly. Let us first look at difference equations which have only a single forcing term. In this case the states can simply be chosen as $x(k)$, $x(k-1)$, $x(k-2)$, etc. The second-order example has the form

$$\mathbf{x}(k) = \Phi \mathbf{x}(k-1) + \gamma u(k)$$

$$z(k) = M\mathbf{x}(k) + v(k) \tag{92}$$

where

$$M = [1 \quad 0]$$

$$\mathbf{x}(k) = \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} = \begin{bmatrix} x(k) \\ x(k-1) \end{bmatrix}$$

The equations are

$$z(k) - v(k) = \phi_{11}(z(k-1) - v(k-1)) + \phi_{12}(z(k-2) - v(k-2)) + g(1)u(k)$$

or,

$$\mathbf{z}_k = \phi_{11} \mathbf{z}_{k-1} + \phi_{12} \mathbf{z}_{k-2} + g(1)\mathbf{u}_k + \mathbf{v}_{1k} \tag{93}$$

where

$$\mathbf{z}_k^* = (z(k-N+1), \dots, z(k)), \quad (N \text{ samples})$$

For the state variables chosen, the above equations apply to the case when there is only a single forcing term. From Eq. (93), the pulse response at the end of one sampling interval, $g(1)$, can be determined along with estimates of $\phi_{11}$ and $\phi_{12}$. The least-squares procedure is again used.
In order to obtain $g(2)$, we need an equation for $\mathbf{x}(k)$ in terms of $\mathbf{x}(k-2)$. From

$$\mathbf{x}(k-1) = \Phi \mathbf{x}(k-2) + \gamma u(k-1)$$

we obtain

$$\mathbf{x}(k) = \Phi^2 \mathbf{x}(k-2) + \Phi\gamma u(k-1) + \gamma u(k) \tag{94}$$

Therefore,

$$\mathbf{z}_k = \phi_{11}^{(2)} \mathbf{z}_{k-2} + \phi_{12}^{(2)} \mathbf{z}_{k-3} + g(2)\mathbf{u}_{k-1} + g(1)\mathbf{u}_k + \mathbf{v}_{2k} \tag{95}$$

where

$$\Phi^2 = \Phi\Phi = \begin{bmatrix} \phi_{11}^{(2)} & \phi_{12}^{(2)} \\ \phi_{21}^{(2)} & \phi_{22}^{(2)} \end{bmatrix}$$

From Eq. (95), $g(1)$ and $g(2)$ can be found along with $\phi_{11}^{(2)}$ and $\phi_{12}^{(2)}$. If more elements of the pulse response are desired, the above procedure is repeated. The pattern is now, however, familiar. For example, if $g(1)$ to $g(4)$ are desired, the following equation would be used.

$$\mathbf{z}_k = \phi_{11}^{(4)} \mathbf{z}_{k-4} + \phi_{12}^{(4)} \mathbf{z}_{k-5} + g(4)\mathbf{u}_{k-3} + g(3)\mathbf{u}_{k-2} + g(2)\mathbf{u}_{k-1} + g(1)\mathbf{u}_k + \mathbf{v}_{4k}$$

Although the procedure requires larger equations, the advantage in using this method is that the unknown coefficients are determined directly.
In the case where there is more than one forcing term, the above procedure can be used but with a little more difficulty. There are two alternatives. First, if $x_1(k)$, $x_1(k-1)$, etc., are used as state variables, the problem can be treated as a multiple control input problem. The second approach is to use a different set of state variables so that the single difference equation can be put into the form of Eq. (92). The procedure will be briefly illustrated.
Let us look at the example given by

$$x(k) + \alpha_1 x(k-1) + \alpha_2 x(k-2) = \beta_1 u(k) + \beta_2 u(k-1)$$

Let

$$x_1(k) = x(k)$$

$$x_2(k-1) = \alpha_2 x_1(k-2) - \beta_2 u(k-1)$$

then

$$x_1(k) = -\alpha_1 x_1(k-1) - x_2(k-1) + \beta_1 u(k)$$
$$x_2(k) = \alpha_2 x_1(k-1) - \beta_2 u(k)$$

The equations are now in the form of Eq. (92). Let us see what is involved if we desire $g(1)$ and $g(2)$. The top row of the vector equation, Eq. (94), is

$$x_1(k) = \phi_{11}^{(2)} x_1(k-2) + \phi_{12}^{(2)} x_2(k-2) + g(2)u(k-1) + g(1)u(k)$$

In terms of the measured quantities we have

$$\mathbf{z}_k = \phi_{11}^{(2)} \mathbf{z}_{k-2} + \phi_{12}^{(2)} \alpha_2 \mathbf{z}_{k-3} - \phi_{12}^{(2)} \beta_2 \mathbf{u}_{k-2} + g(2)\mathbf{u}_{k-1} + g(1)\mathbf{u}_k + \text{noise}$$

Along with $g(1)$ and $g(2)$ other coefficients are determined. Although this method requires more manipulations, it gives the required coefficients directly.

V. Identification of Process Parameters: Learning Model Method

A. Introduction

The other approach available for estimation of coefficients of a difference equation is the learning model method. It is felt that if some a priori estimate of the unknown parameters is available then we should be able to use this information to advantage. This is probably the motivation for the learning model method. This method was originally studied by Margolis (14) using the sensitivity function. The sensitivity function is also used by Staffanson (44) who was concerned with parameter determination from flight test data. Several characteristics are apparent in Margolis' work:
(1) One is constantly worried about the stability problem.
(2) Noise considerations were not given.
(3) One must choose the gain in the steepest descent procedure.
(4) The use of sensitivity functions is generally valid for small regions about a trial point.
To overcome some of the above problem areas, this section will give an alternative procedure primarily patterned after Newton's method but with the extensive use of the digital computer to give assurance of monotone convergence. Newton's method is chosen because it is known for its rapid rate of convergence. By considering blocks of data at a time, smoothing is performed. We will first briefly describe Margolis' approach through an example so that it will provide a basis for comparison. Again, we restrict ourselves to the discrete case.
Two other possibilities for performing the learning model method should be mentioned. First, the quasi-linearization approach described by Bellman et al. (45). This method was found to be very cumbersome for the discrete case. The other method is the orthogonal function approach used by Elkind et al. (46). Fixing the model time constants a priori seems to be a crude method.

B. Margolis' Sensitivity Function Approach


Margolis' learning model approach is shown in Fig. 27. Margolis used the error-squared as the criterion. Integrals of error-squared led to stability problems. Even though Margolis may have had success in many situations for the continuous case, the discrete case may lead to other conclusions. Therefore, we will look at the discrete case. The procedure will be described here with the results given later.

FIG. 27. Margolis' learning model approach. (Blocks: process, learning model, adjusting mechanism; signals: $u(k)$, $z(k)$, $y_m(k)$.)

Let us choose to discuss the first-order process with two unknown parameters $\alpha_1$ and $\alpha_2$.

$$y(k) = \alpha_1 y(k-1) + \alpha_2 u(k)$$

$$z(k) = y(k) + v(k) \tag{96}$$

The equation for the model is given by

$$y_m(k) = a_1 y_m(k-1) + a_2 u(k) \tag{97}$$

The coefficients $a_1$ and $a_2$ are to be adjusted to minimize

$$J = (z(k) - y_m(k))^2 \tag{98}$$

We take the gradient of $J$ with respect to $a_1$ and $a_2$.

$$\frac{\partial J}{\partial a_1} = -2(z(k) - y_m(k))u_1(k) \tag{99}$$

$$\frac{\partial J}{\partial a_2} = -2(z(k) - y_m(k))u_2(k) \tag{100}$$

where

$$u_1(k) = \frac{\partial y_m(k)}{\partial a_1}$$

$$u_2(k) = \frac{\partial y_m(k)}{\partial a_2}$$

The $u_1(k)$ and $u_2(k)$ are called sensitivity functions and they are determined from equations obtained by differentiating Eq. (97) with respect to the parameters. Therefore,

$$u_1(k) = a_1 u_1(k-1) + y_m(k-1) \tag{101}$$

$$u_2(k) = a_1 u_2(k-1) + u(k) \tag{102}$$

The corrections on the parameters $a_1$ and $a_2$ are taken in the direction of steepest descent.

$$a_1(k+1) = a_1(k) - 2K(z(k) - y_m(k))u_1(k) \tag{103}$$

$$a_2(k+1) = a_2(k) - 2K(z(k) - y_m(k))u_2(k) \tag{104}$$

where $K$ is the gain in the steepest descent procedure. The $K$ is to be chosen from stability and noise considerations.
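One pass of the sensitivity-function adjustment can be written compactly. The sketch below assumes a positive gain $K$ and steps against the gradients of Eqs. (99) and (100), that is, $a_i \leftarrow a_i - K\,\partial J/\partial a_i = a_i + 2K(z - y_m)u_i$. Eqs. (103) and (104) as printed carry the opposite sign on the correction term, which corresponds to taking $K$ negative. The function name and the numbers are illustrative.

```python
def margolis_step(a1, a2, ym_prev, u1_prev, u2_prev, u_k, z_k, K):
    """One discrete steepest-descent update of the model coefficients."""
    ym = a1 * ym_prev + a2 * u_k         # model output, Eq. (97)
    u1 = a1 * u1_prev + ym_prev          # sensitivity to a1, Eq. (101)
    u2 = a1 * u2_prev + u_k              # sensitivity to a2, Eq. (102)
    err = z_k - ym
    dJ_da1 = -2.0 * err * u1             # Eq. (99)
    dJ_da2 = -2.0 * err * u2             # Eq. (100)
    # Assumption: K > 0, so moving downhill means subtracting K times the gradient.
    a1_new = a1 - K * dJ_da1
    a2_new = a2 - K * dJ_da2
    return a1_new, a2_new, ym, u1, u2

# Hypothetical single step starting from rest.
a1_new, a2_new, ym, u1, u2 = margolis_step(0.5, 1.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.1)
```

In a running loop the returned `ym`, `u1`, and `u2` are fed back as the previous values at the next sample; the size of `K` then governs both the stability of the adjustment and its sensitivity to noise.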

C. Modified Newton's Approach

We next describe a method which will be extensively studied. Again we will use an example to illustrate the procedure.
Instead of operating on the error as shown in Fig. 27, the stability problem can possibly be alleviated by solving instead the following problem.

Problem V-1. Find the parameters ($a_i$) of the model which minimize

$$J = \sum_{j=0}^{N} (z(j) - y_m(j))^2 \tag{105}$$

where $y_m(j)$ is subject to the dynamical constraint

$$y_m(j) = a_1 y_m(j-1) + a_2 u(j) \tag{106}$$

The time indices are shown in Fig. 28 (see footnote 11). In our case, the model, Eq. (106), could be of lower order than the actual process (model fitting problem).

FIG. 28. Observation interval.

We start from an initial trial or estimate of the parameters, $a_i^{(1)}$, and the initial conditions for the interval of observation, $y_m^{(1)}(0)$. With these initial trials Eq. (106) is solved to obtain a nominal solution, $y_m^{(1)}(j)$, $j = 0, 1, \dots, N$. Next, the perturbation equations of Eq. (106) are written, evaluated along the nominal $y_m^{(1)}(j)$.

$$\delta y_m(j) = a_1^{(1)} \delta y_m(j-1) + y_m^{(1)}(j-1)\, \delta a_1(j-1) + u(j)\, \delta a_2(j-1) \tag{107}$$

We adjoin to Eq. (107) other equations which maintain the parameters constant. This trick was used by Bellman et al. (45).

$$\delta a_1(j) = \delta a_1(j-1)$$
$$\delta a_2(j) = \delta a_2(j-1) \tag{108}$$

Let

$$\zeta(j) = \begin{bmatrix} \delta y_m(j) \\ \delta a_1(j) \\ \delta a_2(j) \end{bmatrix} \tag{109}$$

then

$$\zeta(j) = \Phi(j-1)\zeta(j-1) \tag{110}$$

where

$$\Phi(j-1) = \begin{bmatrix} a_1^{(1)} & y_m^{(1)}(j-1) & u(j) \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{111}$$

11 To simplify the notation, the index $k$ is dropped. Thus, at the time of computation, $j = 0 \Rightarrow j = k - N$ and $j = N \Rightarrow j = k$.
At this stage, instead of solving the optimization problem stated in Problem V-1, the following problem is solved.

Problem V-2. Find the initial conditions of Eq. (110) which minimize

$$J = \sum_{j=0}^{N} (z(j) - y_m^{(1)}(j) - \delta y_m^{(1)}(j))^2 \tag{112}$$

where $\delta y_m^{(1)}(j)$ is subject to the constraint Eq. (110).
We have converted a nonlinear problem into a linear problem. By repeatedly solving this last problem we hope to approach the solution to the first problem.
Problem V-2 is solved by using the least-squares curve fitting procedure. It is noted that

$$y_m^{(1)}(j) + \delta y_m^{(1)}(j) = z(j) + n(j) \tag{113}$$

where $n(j)$ is the discrepancy caused by noise and error in the parameter adjustment. Thus

$$\delta y_m^{(1)}(j) \doteq z(j) - y_m^{(1)}(j) \tag{114}$$

The right-hand side of Eq. (114) is known and it is desired to determine $\delta y_m^{(1)}(j)$, subject to Eq. (110), which best approximates $z(j) - y_m^{(1)}(j)$. Equation (114) can be rewritten as

$$h^*\zeta(j) \doteq z(j) - y_m^{(1)}(j) \tag{115}$$

where

$$h^* = (1 \quad 0 \quad 0)$$

The $N + 1$ equations represented by Eq. (115) can all be rewritten in terms of $\zeta(0)$ by using Eq. (111), with $\Phi(j, 0) = \Phi(j-1)\Phi(j-2) \cdots \Phi(0)$.

$$h^*\zeta(0) \doteq z(0) - y_m^{(1)}(0)$$

$$h^*\Phi(1, 0)\zeta(0) \doteq z(1) - y_m^{(1)}(1)$$

$$\vdots$$

$$h^*\Phi(N, 0)\zeta(0) \doteq z(N) - y_m^{(1)}(N)$$

Or, in matrix form

$$A\zeta(0) \doteq \xi \tag{116}$$

where

$$A = \begin{bmatrix} h^* \\ h^*\Phi(1, 0) \\ \vdots \\ h^*\Phi(N, 0) \end{bmatrix}, \quad (N + 1) \times 3 \text{ matrix}$$

$$\xi = \begin{bmatrix} z(0) - y_m^{(1)}(0) \\ \vdots \\ z(N) - y_m^{(1)}(N) \end{bmatrix}, \quad (N + 1) \times 1 \text{ vector}$$

The pseudo-inverse routine is used to solve Eq. (116).

$$\zeta^{(1)}(0) = A^\dagger \xi^{(1)} \tag{117}$$

From Eq. (117) we can make corrections to the initial trial of the parameters and initial conditions.

$$a_i^{(2)} = a_i^{(1)} + \delta a_i^{(1)}(0)$$
$$y_m^{(2)}(0) = y_m^{(1)}(0) + \delta y_m^{(1)}(0) \tag{118}$$

The procedure can now be repeated.

D. Algorithm and Convergence


The procedure outlined in the last section may well be divergent. Procedures using the digital computer can, however, be used to give monotone convergence. This section will give the algorithm which assures this important property.
From the initial trial and solution we can compute the error index.

$$J_1 = \sum_j (z(j) - y_m^{(1)}(j))^2 = \| \mathbf{z} - \mathbf{y}_m^{(1)} \|^2$$

The problem is to find a $\delta y_m(k)$ such that $J_2$ given by

$$J_2 = \sum_j (z(j) - y_m^{(1)}(j) - \delta y_m(j))^2$$

is less than $J_1$.
The difference $J_1 - J_2$ must be greater than zero.

$$J_1 - J_2 = \| \mathbf{z} - \mathbf{y}_m^{(1)} \|^2 - \| \mathbf{z} - \mathbf{y}_m^{(1)} \|^2 + 2\langle \delta \mathbf{y}_m, \mathbf{z} - \mathbf{y}_m^{(1)} \rangle - \| \delta \mathbf{y}_m \|^2 > 0$$

Or,

$$2\langle \delta \mathbf{y}_m, \mathbf{z} - \mathbf{y}_m^{(1)} \rangle - \| \delta \mathbf{y}_m \|^2 > 0 \tag{119}$$

Equation (119) is the condition for convergence. If

$$\langle \delta \mathbf{y}_m, \mathbf{z} - \mathbf{y}_m^{(1)} \rangle \ne 0$$

then for $\delta \mathbf{y}_m$ sufficiently small Eq. (119) can be satisfied since the first term is linear in $\delta \mathbf{y}_m$ while the second term is quadratic. It is noted that the first term in Eq. (119) is positive since it is the scalar product between the error and the projection of the error on the linear manifold.
The condition

$$\langle \delta \mathbf{y}_m, \mathbf{z} - \mathbf{y}_m^{(1)} \rangle = 0 \tag{120}$$

requires that $\mathbf{y}_m^{(1)}$ is closer to $\mathbf{z}$ than any nearby point obtained through linear perturbation. In other words, the gradient is zero and we have a local minimum.
The situation is shown in Figure 29. The first linear correction is $1'$. Upon solving Eq. (106) point 1 is obtained, which may well give a $J$ which is greater than $J_1$. If $J_2 > J_1$, then we cut the correction, $\delta y_m(k)$, by a half, yielding a second trial point. If the $J$ at this point is less than $J_1$ then we keep the correction given by $\delta y_m^{(1)}(k)/2$. If not, we cut $\delta y_m^{(1)}(k)/2$ by a half and repeat this process. By using this cutting procedure we have monotone convergence until the condition of Eq. (120) is reached.
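The linearize, solve, and cut-by-half cycle can be sketched for the first-order model. This is an illustration under stated assumptions rather than the simulated program: the correction is obtained from the 3 x 3 normal equations (assumed nonsingular) instead of the pseudo-inverse routine, the halving is applied to the parameter correction $\zeta(0)$, and all names and data are hypothetical.

```python
def simulate(theta, u):
    """Solve Eq. (106) for ym(0..N) given theta = [ym(0), a1, a2]."""
    y0, a1, a2 = theta
    ym = [y0]
    for j in range(1, len(u)):
        ym.append(a1 * ym[-1] + a2 * u[j])
    return ym

def cost(theta, z, u):
    """Error index J of Eq. (105)."""
    return sum((zj - yj) ** 2 for zj, yj in zip(z, simulate(theta, u)))

def solve3(M, rhs):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    aug = [row[:] + [b] for row, b in zip(M, rhs)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, 3):
            f = aug[r][c] / aug[c][c]
            for k in range(c, 4):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (aug[r][3] - sum(aug[r][k] * x[k] for k in range(r + 1, 3))) / aug[r][r]
    return x

def newton_iteration(theta, z, u, max_cuts=4):
    """One correction of (ym(0), a1, a2) with step halving until J decreases."""
    a1 = theta[1]
    ym = simulate(theta, u)
    # Rows of A are h* Phi(j, 0): sensitivities of ym(j) to (ym(0), a1, a2).
    rows = [[1.0, 0.0, 0.0]]
    for j in range(1, len(u)):
        p = rows[-1]
        rows.append([a1 * p[0], a1 * p[1] + ym[j - 1], a1 * p[2] + u[j]])
    xi = [zj - yj for zj, yj in zip(z, ym)]
    M = [[sum(r[i] * r[k] for r in rows) for k in range(3)] for i in range(3)]
    rhs = [sum(r[i] * e for r, e in zip(rows, xi)) for i in range(3)]
    delta = solve3(M, rhs)
    j1 = cost(theta, z, u)
    for _ in range(max_cuts + 1):
        trial = [t + d for t, d in zip(theta, delta)]
        if cost(trial, z, u) < j1:
            return trial
        delta = [0.5 * d for d in delta]      # cut the correction by a half
    return list(theta)   # no improving correction found; keep the old estimate

# Hypothetical noise-free measurements from a known process.
u = [0.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, -1.0]
z = simulate([0.2, 0.8, 0.5], u)
theta = [0.0, 0.6, 0.7]                       # initial trial
costs = [cost(theta, z, u)]
for _ in range(6):
    theta = newton_iteration(theta, z, u)
    costs.append(cost(theta, z, u))
```

Because a trial is accepted only when the recomputed $J$ is smaller, the recorded error index can never increase from one iteration to the next; when no cut succeeds, the estimate is left unchanged.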

FIG. 29. Two-dimensional picture of correction scheme.



In an on-line task, we are limited in the number of iterations we can make at a given time. The requirement is not as stringent, however, as in the control synthesis problem because the estimation can be made at wider time intervals for slowly varying processes. If we limit the number of cuts described in the last paragraph, we may never find the correction which will give a smaller $J$. In this case no corrections will be made and we go on to the next interval. Here again, no interval may give corrections, in which case the method fails. It is felt, however, that for a class of problems in which the estimates are within a certain range from the true values the routine will be applicable. This problem seems no worse than the instability problem associated with Margolis' procedure.

E. Simulation

A digital simulation of the modified Newton's procedure was made
on an IBM 7090. As a comparison, the discrete version of Margolis'
procedure was also simulated. The experimental set-up and results will
be discussed in this section.
Let us first describe the experimental set-up for the modified Newton's
procedure. The first-order process with two unknown coefficients was
taken as an example. This process has the form
$$y(k) = \alpha_1\, y(k-1) + \alpha_2\, u(k)$$
$$z(k) = y(k) + v(k)$$

The noise, $v(k)$, was an uncorrelated noise with a uniform distribution,
chosen because it was readily available. It is believed that this distribution is
more severe than the usual Gaussian noise if the variances of the two
are the same. Many runs were made, however, without noise.
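A minimal sketch of the simulated process follows, assuming a zero initial state and a symmetric uniform noise. The function names, the `noise_halfwidth` parameter, and the wave amplitude are hypothetical choices for illustration; the original program is not reproduced in the text.

```python
import random
from typing import List, Sequence, Tuple


def triangular_wave(n_points: int, period: int = 24, amplitude: float = 1.0) -> List[float]:
    """Symmetric triangular wave sampled at integer instants."""
    half = period / 2.0
    out = []
    for k in range(n_points):
        phase = k % period
        if phase < half:
            value = 2.0 * phase / half - 1.0   # rising from -1 to +1
        else:
            value = 3.0 - 2.0 * phase / half   # falling from +1 to -1
        out.append(amplitude * value)
    return out


def simulate(
    alpha1: float,
    alpha2: float,
    u: Sequence[float],
    noise_halfwidth: float = 0.0,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Run y(k) = alpha1*y(k-1) + alpha2*u(k), z(k) = y(k) + v(k)."""
    rng = random.Random(seed)
    y_prev = 0.0  # assumed zero initial state
    y, z = [], []
    for u_k in u:
        y_k = alpha1 * y_prev + alpha2 * u_k
        v_k = rng.uniform(-noise_halfwidth, noise_halfwidth)  # uncorrelated, uniform
        y.append(y_k)
        z.append(y_k + v_k)
        y_prev = y_k
    return y, z


if __name__ == "__main__":
    u = triangular_wave(100)
    y, z = simulate(0.9, 0.1, u, noise_halfwidth=0.05)
    print(y[:3], z[:3])
```

To approximate a noise level stated as a percentage of the peak output, one would set `noise_halfwidth` to that fraction of `max(abs(v) for v in y)` from a noise-free run.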
The flow chart for the simulation is shown in Fig. 30. Over 100 points
of $u(k)$ were inserted. Either a triangular wave with a period of 24
sampling instants or a square wave with a period of 20 sampling instants
was used. The observation interval was taken as 10 sampling instants
and the intervals were taken in a block processing manner. (In an actual
application probably more points will be taken.) Four iterations were
taken per observation interval. If needed, the cutting-by-two procedure
was counted as an iteration. The method requires initial conditions for
the model equations at the beginning of every observation interval.
These were supplied in either of two ways. First, if the previous interval
revealed an improvement in the criterion $J$, then the state values at the
last sampling instant of the nominal solution of the previous interval were
used as the initial conditions. Otherwise, the measured outputs were
used as the initial conditions.
For Margolis' procedure, essentially the same conditions prevailed
to permit a comparison. The procedure provides adjustment after every

[Flow chart; the boxes read, in order:]

PROCESS WITH ACTUAL VALUES
INITIALIZATION: n = 0
  1. MEASURE z(n)
  2. SET n = n + 1
  3. STORE PAST 10 SAMPLES OF u(n) AND z(n)
  4. INITIALIZATION: i = 0
  5. COMPUTE NOMINAL SOLUTION USING ESTIMATED VALUES
  6. COMPUTE CRITERION J1, J1 = ||z - ym||^2

  1. INSERT PERTURBATION EQUATION COEFFICIENTS VIA NOMINAL SOLUTION
  2. ADJOIN EQUATIONS α(n+1) = α(n) FOR EVERY UNKNOWN PARAMETER
  3. CALL SUBROUTINE PSEUDO TO DETERMINE CORRECTION ON INITIAL
     CONDITIONS OF PERTURB. EQS. AND ADJOINED EQS.

  1. COMPUTE NEW NOMINAL SOLUTION
  2. SET i = i + 1
  3. COMPUTE NEW CRITERION, J2

IF J2 WAS LESS THAN J1, SET INITIAL COND. FOR NEXT OBSERVATION
INTERVAL EQUAL TO LAST VALUE IN PREVIOUS OBSERVATION INTERVAL.
OTHERWISE USE MEASURED VALUE.

FIG. 30. Flow chart for modified Newton's procedure.

sampling instant as described in Section V, B. This procedure requires
insertion of a gain, K, for the steepest descent procedure.
For the first series of runs, the process parameters were taken as
constant but unknown. The estimates were initially displaced from
the true value. A representative no-noise case is shown in Fig. 31.
After three observation intervals the true values are obtained. It was
found that large displacements of the initial estimates can still provide
convergence. Even unstable roots were identified. From this series of
runs, it is felt that any root near and within the unit circle can be identified
for the first-order process regardless of the initial uncertainty.
FIG. 31. Modified Newton's procedure, constant unknown parameter. [Plot: true values α1 = 0.95, α2 = 0.10; initial estimates α1 = 0.0, α2 = 0.0; triangular wave input; no noise; horizontal axis: time, 0 to 100.]

FIG. 32. Modified Newton's procedure, 5% noise, square wave. [Plot: true values α1 = 0.9, α2 = 0.1; initial estimates α1 = 0.8, α2 = 0.15; square wave input; 5% noise; horizontal axis: time, 0 to 100.]


FIG. 33. Modified Newton's procedure, 5% noise, triangular wave. [Plot: true value α1 = 0.9; initial estimate α1 = 0.8; triangular wave input; 5% noise; horizontal axis: time, 0 to 100.]

FIG. 34. Modified Newton's procedure, 10% noise, triangular wave. [Plot: true values α1 = 0.9, α2 = 0.1; initial estimates α1 = 0.8, α2 = 0.15; triangular wave input; 10% noise; horizontal axis: time, 0 to 100.]


FIG. 35. Modified Newton's procedure, changing parameters (0.0025/T), no noise.

FIG. 36. Modified Newton's procedure, changing parameters (0.0025/T), 5% noise.


FIG. 37. Modified Newton's procedure, changing parameters (0.00125/T), square wave. [Plot: square wave input; 5% noise; horizontal axis: time, 0 to 100.]

FIG. 38. Modified Newton's procedure, changing parameters (0.0025/T), triangular wave.


FIG. 39. Margolis' procedure, constant unknown parameters, square wave. [Plot: true value α1 = 0.9; initial estimates α1 = 0.85, α2 = 0.15; square wave input; no noise; K = 0.0006.]

FIG. 40. Margolis' procedure, constant unknown parameters, triangular wave. [Plot: true values α1 = 0.9, α2 = 0.1; initial estimates α1 = 0.85, α2 = 0.15; triangular wave input; no noise; K = 0.0006.]


FIG. 41. Margolis' procedure, constant unknown parameters, α1 = 0.85. [Plot: true values α1 = 0.9, α2 = 0.1; initial estimates α1 = 0.85, α2 = 0.15; triangular wave input; no noise; K = 0.00024; horizontal axis: time.]

FIG. 42. Margolis' procedure, constant unknown parameters, α1 = 0.95. [Plot: true values α1 = 0.9, α2 = 0.1; initial estimates α1 = 0.95, α2 = 0.15; triangular wave input; no noise; K = 0.00024; horizontal axis: time, 0 to 100.]


FIG. 43. Margolis' procedure, changing parameters.

FIG. 44. Margolis' procedure, 5% noise. [Plot: true values α1 = 0.9, α2 = 0.1; initial estimates α1 = 0.85, α2 = 0.15; triangular wave input; 5% noise; K = 0.00024; horizontal axis: time.]



For the second series of runs, noise was added to the output of the
process. Noises with 5% and 10% of the peak output were inserted
along with initial displacements of the estimated values. Several results
are shown in Figs. 32, 33, and 34. The results show convergence from
the displacements but an error in the estimated values. The 10% noise
case reveals that possibly more than 10 points are required for the
averaging. Essentially, there is no significant difference between the
triangular and square wave inputs.
For the third series of runs, the true values were continually changed
as a ramp. Both noise and no-noise cases were taken. Some results are
shown in Figs. 35, 36, 37, and 38. First, even without noise, the
tracking capability is rather poor if the parameters are changing as much
as 0.0025 per sampling instant. With 5% noise, the situation is even
worse. Close analysis of Fig. 36 showed that up to 50T the signal-to-noise
ratio was much worse than 5%. As the signal portion increased the
tracking capability improved. Figures 37 and 38 show that the method
is able to track changes in parameter of 0.00125 per sampling instant
even with noise. Essentially, there is no significant difference between
the triangular and square wave inputs.
Results using the discrete version of Margolis' procedure are shown
in Figs. 39, 40, 41, 42, 43, and 44. The adjustment of K is very critical.
Many runs were made before a satisfactory K was obtained. (This
adjustment was very troublesome on the digital computer.) This gain
was dependent upon the input signal. When the gain was adjusted to
give a satisfactory response to square waves, it was unsatisfactory for
the triangular wave (Figs. 39 and 40). Different types of behavior were
obtained depending upon the direction of the initial offset. It seems that
the best adjustment for K is when the behavior is slightly overdamped.
Otherwise, oscillations appear to persist for a long time. With K adjusted
to this seemingly suitable value, it takes a long time before the true
values are obtained. The method is also not applicable for large
displacements of the initial guesses, and the K seems to depend upon the values
of the parameters which are being estimated. With the gain set so that
the behavior is slightly overdamped, noise did not appreciably affect
the response. (This fact was conjectured by Margolis.) In fact, with
noise the gain should be even smaller.

Let us summarize the difficulties of the discrete version of Margolis'
procedure.

(1) The gain depends upon the input signal.
(2) The response is slow when K is properly adjusted.
(3) The behavior differs depending upon the direction of the initial
offset.
(4) The method is applicable only for small initial displacements between
the estimated and true values.
(5) The gain depends upon the true parameter values of the process.

Because of the critical nature of K, the modified Newton's procedure
appears to be more practical even with the added complexity in
computation. Even for the well-monitored experiments the adjustment of K
was difficult. In an on-line application where the parameters are
uncertain, the problems would be almost insurmountable.

F. A Possible Alternative

If the pseudo-inverse routine is computationally demanding, an
alternative would be to use the steepest-descent method to perform the
inversion of the rectangular matrix, Eq. (116). We can choose the criterion
$$J = \| A\,\zeta(0) - \xi \|^2 \tag{121}$$

Or equivalently, we can minimize
$$Q(\zeta(0)) = \zeta^*(0)\, A^* A\, \zeta(0) - 2\, \xi^* A\, \zeta(0)$$

We assume here that sufficient data points are processed so that $A^* A$
is positive definite.
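The gradient is given, apart from a factor of 2 which can be absorbed into the step size, by
$$\nabla_{\zeta(0)} Q = A^* A\, \zeta(0) - A^* \xi$$

The next approximation is given by
$$\zeta(0)^{(n+1)} = \zeta(0)^{(n)} - \epsilon_n \nabla_{\zeta(0)} Q^{(n)}$$

where $\epsilon_n$ is determined so that the minimum point in the direction of
the gradient is obtained, or
$$\epsilon_n = \frac{\| \nabla_{\zeta(0)} Q^{(n)} \|^2}{\langle A^* A\, \nabla_{\zeta(0)} Q^{(n)},\ \nabla_{\zeta(0)} Q^{(n)} \rangle}$$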

As before, one can check to see whether the $J$ in Eq. (105) is actually
decreasing, and if not perform the cutting-by-two procedure. It is noted
here that even if $J$ in Eq. (121) is continually decreasing it does not
imply that the $J$ in Eq. (105) is decreasing.
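The steepest-descent alternative can be sketched as follows for a real rectangular matrix, so that $A^*$ is simply the transpose. This is an illustrative sketch only: the helper names, the iteration count, and the stopping tolerance are assumptions, and the check against Eq. (105) with cutting by two is not included.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Compute A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def rmatvec(a: Matrix, y: Vector) -> Vector:
    """Compute A* y (transpose times vector) for a real matrix A."""
    return [sum(a[i][j] * y[i] for i in range(len(a))) for j in range(len(a[0]))]


def dot(x: Vector, y: Vector) -> float:
    return sum(p * q for p, q in zip(x, y))


def steepest_descent_lsq(a: Matrix, xi: Vector, zeta0: Vector,
                         n_iter: int = 50, tol: float = 1e-24) -> Vector:
    """Minimize ||A zeta - xi||^2 by steepest descent with exact line search."""
    zeta = list(zeta0)
    for _ in range(n_iter):
        residual = [r - t for r, t in zip(matvec(a, zeta), xi)]
        grad = rmatvec(a, residual)        # A*A zeta - A* xi
        g2 = dot(grad, grad)
        if g2 <= tol:                      # gradient has vanished
            break
        a_grad = matvec(a, grad)
        eps = g2 / dot(a_grad, a_grad)     # <A*A g, g> equals ||A g||^2
        zeta = [z - eps * g for z, g in zip(zeta, grad)]
    return zeta


if __name__ == "__main__":
    # Hypothetical overdetermined system whose exact solution is (1, 2).
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    xi = [1.0, 2.0, 3.0]
    print(steepest_descent_lsq(A, xi, [0.0, 0.0]))
```

Only matrix-vector products are needed per iteration, which is the computational attraction over forming a pseudo-inverse; the price is a convergence rate that depends on the conditioning of $A^* A$.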

VI. State Variable Estimation

A. Introduction
To use the adaptive controller discussed in Section III we must know
the state variable at every sampling instant. This section will discuss
a method of estimating these variables. The content of this section
draws heavily from the work of Kalman (16). Joseph and Tou (23)
have also made studies along this line.
The state variables can be estimated if the process is known. Also,
it is known that the process can be determined if the state variables are
known accurately. The task in adaptive controls is one step more difficult
because neither the state variables nor the process is known accurately
at any time. However, we can employ the following philosophy. If
identification methods are available which can operate with inaccurate
knowledge of the state variables, then the identified process can be used
in the state-variable estimation. A possible reason for taking this route
is that the state variables generally change faster than the process
parameters. As the identification methods of Sections IV and V were
applicable even with imprecise knowledge of the state variables, those
results can be used to update the process parameters in the state variable
estimation. Therefore, the state variable estimation part can employ
Kalman's recursive technique. The procedure will be outlined mainly
to complete the total picture.

B. Outline of Estimation Problem


Let us refer to the process configuration shown in Fig. 45. From the
knowledge of $\mathbf{z}(k)$ and $\mathbf{u}(k)$, it is required to estimate the state, $\mathbf{x}(k)$, at
FIG. 45. Process configuration. [Block diagram labels: load disturbance, white noise, measurement noise.]

the present time. The past values of $\mathbf{z}(k)$ and $\mathbf{u}(k)$ are known from some
initial start time. The process characteristics, $G_1$ and $G_2$, are known, the
former through identification. In an adaptive task the transfer
characteristics are time varying. As new parameter values are obtained, the
corresponding values used in the estimation will be changed. The
covariance matrices of $\mathbf{v}(k)$ and $\mathbf{w}(k)$ are also known. These noise
sources can be taken to be white noise. It is noted that because of $G_2$ the
load disturbances can have a nonwhite spectrum.
We note
$$\mathbf{x}(k) = \mathbf{x}_1(k) + \mathbf{x}_2(k)$$

where $\mathbf{x}_1(k)$ is known. Let
$$\mathbf{v}(k) = \mathbf{z}(k) - \mathbf{x}_1(k)$$
$$\mathbf{x}_2(k) = \mathbf{x}(k) - \mathbf{x}_1(k)$$

The problem is now simply the determination of $\hat{\mathbf{x}}_2(k)$, which is the
conditional expectation given $\mathbf{v}(j)$, $j = 0, 1, \ldots, k$. From $\hat{\mathbf{x}}_2(k)$ the
estimate of the state is
$$\hat{\mathbf{x}}(k) = \hat{\mathbf{x}}_2(k) + \mathbf{x}_1(k)$$

Therefore, it can be seen that Kalman's filtering algorithm, which can
treat time-varying processes, is applicable here.
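As an illustration of how such a recursive estimate might be organized, the sketch below filters a scalar disturbance state. Everything specific in it is an assumption made for illustration and is not taken from the text: the scalar model $x_2(k+1) = a(k)\,x_2(k) + w(k)$, the observation $x_2(k)$ plus white noise, and the names `kalman_scalar`, `q`, `r`, `x0`, and `p0`.

```python
from typing import Iterable, List, Tuple


def kalman_scalar(
    observations: Iterable[float],
    a_values: Iterable[float],
    q: float,
    r: float,
    x0: float = 0.0,
    p0: float = 1.0,
) -> List[Tuple[float, float]]:
    """Filter x2(k+1) = a(k) x2(k) + w(k), observed as obs(k) = x2(k) + n(k).

    q and r are the variances of w and n. Returns (estimate, variance)
    pairs, one per observation.
    """
    x_hat, p = x0, p0
    out = []
    for obs, a in zip(observations, a_values):
        # Measurement update with the current observation.
        gain = p / (p + r)
        x_hat = x_hat + gain * (obs - x_hat)
        p = (1.0 - gain) * p
        out.append((x_hat, p))
        # Time update, using the most recently identified coefficient a(k).
        x_hat = a * x_hat
        p = a * a * p + q
    return out


if __name__ == "__main__":
    # Hypothetical data: observations already have the known part removed.
    obs = [0.9, 0.5, 0.2, 0.15]
    a_identified = [0.5, 0.5, 0.6, 0.6]  # coefficient updated as identification proceeds
    for estimate, variance in kalman_scalar(obs, a_identified, q=0.01, r=0.04):
        print(round(estimate, 4), round(variance, 4))
```

Passing a sequence of coefficients rather than a constant reflects the adaptive setting, in which newly identified parameter values replace the old ones as they become available. The full state estimate is then the filtered value plus the known part $\mathbf{x}_1(k)$.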

VII. Application to the Re-entry Flight Control Problem

A. Introduction
The control of an aerospace vehicle entering the Earth's atmosphere
is one of the more challenging problems facing engineers at the present
time (47, 48). Large variations and uncertainties in the process dynamics,
primarily due to variations in air density, make feedback control
mandatory. Furthermore, accuracy requirements may dictate using some
sophisticated form of adaptive controls. This section will outline how
the scheme discussed in this chapter can be applied to the re-entry
problem.

B. Flight-Path Control Problem


Probably the ideal method for the re-entry problem would compute
optimum controls depending upon the present state and the desired
destination. As time progresses the controls are recomputed. This task,
using the nonlinear equations of motion, is however very difficult,
requiring an enormous (IBM 7090) computer. Even if a computer is
available the computation time will be an appreciable portion of the
re-entry time. Therefore, some other procedure is required.
Several alternative schemes have been suggested in the literature (49,
50). One scheme performs re-entry by following a previously computed,
stored optimal trajectory. The adaptive control philosophy discussed
in this chapter can be applied for such a scheme. Linear dynamical
equations are obtained by writing perturbation equations evaluated
along the nominal optimal trajectory.
Another scheme is to approximate the optimal path by segments of
shorter paths which are easier to solve. This scheme is illustrated in
Fig. 46. As an example, the optimal path is approximated by three

FIG. 46. Approximation of optimal path. [Horizontal axis: range.]

segments: (1) constant lift-to-drag ratio path, (2) constant altitude path,
and (3) constant lift-to-drag ratio path. The adaptive control philosophy
discussed in the previous sections can be applied to each segment
separately. The procedure will be illustrated for the constant altitude
segment.

C. Constant Altitude Controller


First, the two-dimensional equations of motion will be derived. Let
us refer to Fig. 47. Summing the forces in the direction of $V$ we obtain
$$\dot{V} = g \sin\gamma - \frac{D}{m} \tag{122}$$

FIG. 47. Geometry of re-entry.

Summing the forces in the direction perpendicular to $V$ we obtain
$$m V \dot{\theta} = m g \cos\gamma - L$$
Since
$$\phi + \gamma = \theta, \qquad \dot{\phi} = \frac{V \cos\gamma}{R + h}$$
we obtain
$$\dot{\gamma} = -\frac{V \cos\gamma}{R + h} + \frac{g \cos\gamma}{V} - \frac{L}{mV} \tag{123}$$

In addition, the altitude rate is given by
$$\dot{h} = -V \sin\gamma \tag{124}$$
The names attached to the above symbols are:

γ    flight path angle measured from local horizontal
V    velocity
R    radius of Earth
h    altitude
L    lift force
D    drag force
m    vehicle mass
g    acceleration of gravity
In control terms, $V$, $\gamma$, and $h$ are the state variables, and $L$ and $D$ are
control forces. The amount of lift and drag being applied at any time
can be measured by accelerometers because
$$a_D = \frac{D}{m}, \qquad a_L = \frac{L}{m}$$
where
$a_D$ is the magnitude of deceleration measured by an accelerometer
oriented along the velocity vector
$a_L$ is the magnitude of deceleration measured by an accelerometer oriented
perpendicular to the velocity vector.

Since independent control of lift and drag would be very difficult
physically, we will assume that lift is a control force and $D$ is a function
of $L$.
$$D = f(L)$$

Next, we write perturbation equations of Eqs. (122), (123), and (124)
about the constant altitude condition. It is noted that $\gamma_0 = 0$ and
$\dot{h}_0 = 0$. Therefore,
$$\dot{V}_0 = -\frac{1}{m} f(L_0(t)) \tag{125}$$

Or, the velocity must decrease along a constant altitude path. Also,
$$\frac{L_0}{m} = -\frac{V_0^2}{R + h_0} + g \tag{126}$$
Along a constant altitude path, $L_0$, which is a function of time, must
satisfy Eqs. (125) and (126). Writing perturbation equations about $V_0$,
$\gamma_0 = 0$, $h_0$, $L_0$, we obtain

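$$\delta\dot{V} = g\,\delta\gamma - \frac{1}{m} \left.\frac{\partial f(L)}{\partial L}\right|_0 \delta L \tag{127}$$

$$\delta\dot{\gamma} = \frac{-mV_0^2 + (-mg + L_0)(R + h_0)}{m(R + h_0)V_0^2}\,\delta V + \frac{V_0}{(R + h_0)^2}\,\delta h - \frac{1}{mV_0}\,\delta L \tag{128}$$

$$\delta\dot{h} = -V_0\,\delta\gamma \tag{129}$$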

The uncertainties in $g$, $m$, $R$, and $df(L)/dL\,|_0$ require us to use an adaptive
controller. Of course, if approximate values are known they should
be used as an initial trial in any iterative identification process. In
matrix form
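$$\dot{\mathbf{x}} = A\mathbf{x} + \mathbf{b}u \tag{130}$$

where
$$u = \delta L, \qquad
\mathbf{x} = \begin{bmatrix} \delta V \\ \delta\gamma \\ \delta h \end{bmatrix}, \qquad
A = \begin{bmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ 0 & a_{32} & 0 \end{bmatrix}$$

and
$$a_{12} = g$$
$$a_{21} = \frac{-mV_0^2 + (-mg + L_0)(R + h_0)}{m(R + h_0)V_0^2}$$
$$a_{23} = \frac{V_0}{(R + h_0)^2}$$
$$a_{32} = -V_0$$
$$b_1 = -\frac{1}{m} \frac{\partial f(L)}{\partial L}$$
$$b_2 = -\frac{1}{mV_0}$$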
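The linearization can be cross-checked numerically. The sketch below assumes the equations of motion have the form coded in `dynamics` (velocity, flight path angle, and altitude rates with drag a function of lift) and compares the coefficients of Eq. (130) with central differences. The numerical values, the linear drag law, and all function names are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def dynamics(x: Sequence[float], m: float, g: float, radius: float,
             drag_of_lift: Callable[[float], float]) -> List[float]:
    """Rates (V_dot, gamma_dot, h_dot) for x = (V, gamma, h, L)."""
    v, gamma, h, lift = x
    v_dot = g * math.sin(gamma) - drag_of_lift(lift) / m
    gamma_dot = (-v * math.cos(gamma) / (radius + h)
                 + g * math.cos(gamma) / v
                 - lift / (m * v))
    h_dot = -v * math.sin(gamma)
    return [v_dot, gamma_dot, h_dot]


def linearized(v0: float, h0: float, lift0: float, m: float, g: float,
               radius: float, dfdl0: float) -> Tuple[List[List[float]], List[float]]:
    """Matrix A and vector b of the perturbation equations about gamma0 = 0."""
    a21 = (-m * v0 ** 2 + (-m * g + lift0) * (radius + h0)) / (m * (radius + h0) * v0 ** 2)
    a23 = v0 / (radius + h0) ** 2
    a = [[0.0, g, 0.0], [a21, 0.0, a23], [0.0, -v0, 0.0]]
    b = [-dfdl0 / m, -1.0 / (m * v0), 0.0]
    return a, b


def central_differences(f, x0: Sequence[float], steps: Sequence[float]) -> List[List[float]]:
    """Jacobian estimate; entry [i][j] approximates d f_i / d x_j."""
    columns = []
    for j, d in enumerate(steps):
        hi, lo = list(x0), list(x0)
        hi[j] += d
        lo[j] -= d
        columns.append([(p - q) / (2.0 * d) for p, q in zip(f(hi), f(lo))])
    return [list(row) for row in zip(*columns)]


# Illustrative numbers only (SI units); none of them come from the text.
M, G, RADIUS = 1000.0, 9.81, 6.371e6
V0, H0 = 7000.0, 60000.0
L0 = M * (G - V0 ** 2 / (RADIUS + H0))      # lift that keeps gamma_dot = 0
drag = lambda lift: 500.0 + 0.5 * lift      # hypothetical f(L), slope 0.5

A, b = linearized(V0, H0, L0, M, G, RADIUS, dfdl0=0.5)
J = central_differences(lambda x: dynamics(x, M, G, RADIUS, drag),
                        [V0, 0.0, H0, L0], [1.0, 1e-5, 10.0, 1.0])

if __name__ == "__main__":
    for i in range(3):
        print([f"{J[i][j]:.4e}" for j in range(4)], "vs", A[i], b[i])
```

The first three columns of `J` should agree with `A` and the last column with `b`, which gives a quick guard against sign slips when the coefficients are rederived.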

At any time instant, $a_{ij}$ and $b_i$ are treated as constants over a short time
interval. Such an assumption is valid if the coefficients are changing
slowly. The signal flow graph for Eq. (130) is shown in Fig. 48.

FIG. 48. Signal flow graph for linearized process.

Reducing the signal flow graph we obtain
$$\frac{\delta h}{\delta L} = \frac{b_2 a_{32}\, s + b_1 a_{21} a_{32}}{s\,(s^2 - a_{12} a_{21} - a_{32} a_{23})}$$
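The reduction can also be checked algebraically. The following is a sketch, assuming zero initial conditions and the constant coefficients of Eq. (130); the capital letters $\Delta V$, $\Gamma$, $H$, and $U$ for the Laplace transforms of $\delta V$, $\delta\gamma$, $\delta h$, and $\delta L$ are notation introduced here for illustration.

```latex
\begin{aligned}
s\,\Delta V &= a_{12}\,\Gamma + b_1 U, \\
s\,\Gamma   &= a_{21}\,\Delta V + a_{23} H + b_2 U, \\
s\,H        &= a_{32}\,\Gamma .
\end{aligned}
% Eliminate \Delta V = (a_{12}\Gamma + b_1 U)/s in the second equation:
(s^2 - a_{12} a_{21})\,\Gamma - a_{23}\, s\, H = (b_2 s + b_1 a_{21})\, U .
% Substitute \Gamma = s H / a_{32} from the third equation:
\frac{s\,(s^2 - a_{12} a_{21} - a_{32} a_{23})}{a_{32}}\, H = (b_2 s + b_1 a_{21})\, U
\quad\Longrightarrow\quad
\frac{H}{U} = \frac{b_2 a_{32}\, s + b_1 a_{21} a_{32}}{s\,(s^2 - a_{12} a_{21} - a_{32} a_{23})} .
```

The free factor $s$ in the denominator reflects the pure integration from flight path angle to altitude.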

Taking $\delta L$ to be a staircase signal, as produced by a digital controller,
we obtain the discrete input-output transfer function.
2
M(g) = 3
ßi* +2 ft* + ft
8L(z) Z + Z