
Kurt Marti

Stochastic Optimization Methods

Applications in Engineering and Operations Research

Third Edition
Kurt Marti
Institute for Mathematics and Computer
Science
Federal Armed Forces University Munich
Neubiberg/Munich
Germany

ISBN 978-3-662-46213-3
ISBN 978-3-662-46214-0 (eBook)
DOI 10.1007/978-3-662-46214-0

Library of Congress Control Number: 2015933010

Springer Heidelberg New York Dordrecht London


© Springer-Verlag Berlin Heidelberg 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made.

Printed on acid-free paper

Springer-Verlag GmbH Berlin Heidelberg is part of Springer Science+Business Media (www.springer.com)
Preface

Optimization problems in practice depend mostly on several model parameters,
noise factors, uncontrollable parameters, etc., which are not given fixed quantities
at the planning stage. Typical examples from engineering and economics/operations
research are: material parameters (e.g., elasticity moduli, yield stresses, allowable
stresses, moment capacities, specific gravity), external loadings, friction coefficients, moments of inertia, length of links, mass of links, location of center of
gravity of links, manufacturing errors, tolerances, noise terms, demand parameters,
technological coefficients in input–output functions, cost factors, interest rates,
exchange rates, etc. Due to the several types of stochastic uncertainties (physical
uncertainty, economic uncertainty, statistical uncertainty, model uncertainty), these
parameters must be modeled by random variables having a certain probability
distribution. In most cases, at least certain moments of this distribution are known.
In order to cope with these uncertainties, a basic procedure in the engineering/economic practice is to replace first the unknown parameters by some
chosen nominal values, e.g., estimates, guesses, of the parameters. Then, the
resulting and mostly increasing deviation of the performance (output, behavior)
of the structure/system from the prescribed performance (output, behavior), i.e.,
the “tracking error”, is compensated by (online) input corrections. However, the
online correction of a system/structure is often time-consuming and causes mostly
increasing expenses (correction or recourse costs). Very large recourse costs may
arise in case of damages or failures of the plant. This can be avoided to a large extent
by taking into account already at the planning stage the possible consequences
of the tracking errors and the known prior and sample information about the
random data of the problem. Hence, instead of relying on ordinary deterministic
parameter optimization methods—based on some nominal parameter values—and
applying then just some correction actions, stochastic optimization methods should
be applied: Incorporating stochastic parameter variations into the optimization
process, expensive and increasing online correction expenses can be avoided or at
least reduced to a large extent.
Consequently, for the computation of robust optimal decisions/designs, i.e.,
optimal decisions which are insensitive with respect to random parameter variations,

appropriate deterministic substitute problems must be formulated first. Based on the
decision theoretical principles, these substitute problems depend on probabilities of
failure/success and/or on more general expected cost/loss terms. Since probabilities
and expectations are defined by multiple integrals in general, the resulting often
nonlinear and also nonconvex deterministic substitute problems can be solved by
approximative methods only. Two basic types of deterministic substitute problems
occur mostly in practice:
• Expected Primary Cost Minimization Subject to Expected Recourse (Correction) Cost Constraints:
Minimization of the expected primary costs subject to expected recourse
cost constraints (reliability constraints) and remaining deterministic constraints,
e.g., box constraints. In case of piecewise constant cost functions, probabilistic
objective functions and/or probabilistic constraints occur;
• Expected Total Cost Minimization Problems:
Minimization of the expected total costs (costs of construction, design,
recourse costs, etc.) subject to the remaining deterministic constraints.
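The two substitute problem types can be illustrated by a small sample-average sketch; all cost functions, distributions, and the recourse-cost bound 0.01 below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Sample-average sketch of the two deterministic substitute problem types.
rng = np.random.default_rng(0)
a = rng.normal(loc=1.0, scale=0.2, size=1000)   # samples of the random parameter a(omega)

def primary_cost(x):
    # hypothetical primary cost (e.g., cost of construction)
    return x ** 2

def recourse_cost(x, a):
    # hypothetical recourse (correction) cost for each realization of a
    return np.maximum(a - x, 0.0) ** 2

xs = np.linspace(0.0, 2.0, 201)                  # candidate decisions

# Type 1: minimize expected primary cost subject to an expected
# recourse cost (reliability) constraint, here bounded by 0.01.
feasible = [x for x in xs if recourse_cost(x, a).mean() <= 0.01]
x_reliability = min(feasible, key=primary_cost)

# Type 2: minimize the expected total (primary + recourse) cost.
x_total = min(xs, key=lambda x: primary_cost(x) + recourse_cost(x, a).mean())
print(x_reliability, x_total)
```

As one would expect, the reliability-constrained decision is more conservative (larger) than the total-cost minimizer, since the tight recourse bound forces a larger safety margin.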
The main analytical properties of the substitute problems have been examined in
the first two editions of the book, where also appropriate deterministic and stochastic
solution procedures can be found.
After an overview of basic methods for handling optimization problems with random data, this third edition describes transformation methods for converting optimization problems with random parameters into appropriate deterministic substitute problems for basic technical and economic problems. Hence, the aim of this third edition is to provide analytical and numerical tools, including their mathematical foundations, for the approximate computation of robust optimal decisions/designs/controls as needed in concrete engineering/economic applications:
stochastic Hamiltonian method for optimal control problems with random data,
the H-minimal control and the Hamiltonian two-point boundary value problem,
stochastic optimal open-loop feedback control as an efficient method for the
construction of optimal feedback controls in case of random parameters, adaptive
optimal stochastic trajectory planning and control for dynamic control systems
under stochastic uncertainty, optimal design of regulators in case of random
parameters, optimal design of structures under random external load and with
random model parameters.
Finally, a new tool is presented for the evaluation of the entropy of a probability
distribution and the divergence of two probability distributions with respect to
the use in an optimal decision problem with random parameters. Applications to
statistics are given.
In realizing this monograph, the author was supported by several collaborators
from the Institute for Mathematics and Computer Sciences. Especially, I would like
to thank Ms Elisabeth Lößl for the excellent support in the LaTeX typesetting of
the manuscript. Without her very precise and careful work, the completion of this
project could not have been realized. Moreover, I thank Ms Ina Stein for providing
several figures. Last but not least, I am indebted to Springer-Verlag for inclusion
of the book into the Springer-program. I would like to thank especially the Senior
Editor for Business/Economics of Springer-Verlag, Christian Rauscher, for his very
long patience until the completion of this book.

Munich, Germany Kurt Marti


October 2014
Preface to the First Edition

Optimization problems in practice depend mostly on several model parameters,
noise factors, uncontrollable parameters, etc., which are not given fixed quantities
at the planning stage. Typical examples from engineering and economics/operations
research are: material parameters (e.g., elasticity moduli, yield stresses, allowable
stresses, moment capacities, specific gravity), external loadings, friction coefficients, moments of inertia, length of links, mass of links, location of center of
gravity of links, manufacturing errors, tolerances, noise terms, demand parameters,
technological coefficients in input–output functions, cost factors, etc. Due to the
several types of stochastic uncertainties (physical uncertainty, economic uncertainty,
statistical uncertainty, model uncertainty), these parameters must be modeled by
random variables having a certain probability distribution. In most cases, at least
certain moments of this distribution are known.
In order to cope with these uncertainties, a basic procedure in the engineering/economic practice is to replace first the unknown parameters by some
chosen nominal values, e.g., estimates, guesses, of the parameters. Then, the
resulting and mostly increasing deviation of the performance (output, behavior)
of the structure/system from the prescribed performance (output, behavior), i.e.,
the “tracking error,” is compensated by (online) input corrections. However, the
online correction of a system/structure is often time-consuming and causes mostly
increasing expenses (correction or recourse costs). Very large recourse costs may
arise in case of damages or failures of the plant. This can be avoided to a large extent
by taking into account already at the planning stage the possible consequences
of the tracking errors and the known prior and sample information about the
random data of the problem. Hence, instead of relying on ordinary deterministic
parameter optimization methods—based on some nominal parameter values—and
applying then just some correction actions, stochastic optimization methods should
be applied: incorporating the stochastic parameter variations into the optimization
process, expensive and increasing online correction expenses can be avoided or at
least reduced to a large extent.
Consequently, for the computation of robust optimal decisions/designs, i.e.,
optimal decisions which are insensitive with respect to random parameter variations,

appropriate deterministic substitute problems must be formulated first. Based on the
decision theoretical principles, these substitute problems depend on probabilities of
failure/success and/or on more general expected cost/loss terms. Since probabilities
and expectations are defined by multiple integrals in general, the resulting often
nonlinear and also nonconvex deterministic substitute problems can be solved by
approximative methods only. Hence, the analytical properties of the substitute
problems are examined, and appropriate deterministic and stochastic solution
procedures are presented. The aim of this present book is therefore to provide
analytical and numerical tools—including their mathematical foundations—for the
approximate computation of robust optimal decisions/designs as needed in concrete
engineering/economic applications.
Two basic types of deterministic substitute problems occur mostly in practice:
• Reliability-Based Optimization Problems:
Minimization of the expected primary costs subject to expected recourse
cost constraints (reliability constraints) and remaining deterministic constraints,
e.g., box constraints. In case of piecewise constant cost functions, probabilistic
objective functions and/or probabilistic constraints occur;
• Expected Total Cost Minimization Problems:
Minimization of the expected total costs (costs of construction, design,
recourse costs, etc.) subject to the remaining deterministic constraints.
Basic methods and tools for the construction of appropriate deterministic substitute
problems of different complexity and for various practical applications are presented
in Chap. 1 and part of Chap. 2 together with fundamental analytical properties of
reliability-based optimization/expected cost minimization problems. Here, a main
problem is the analytical treatment and numerical computation of the occurring
expectations and probabilities depending on certain input variables.
For this purpose deterministic and stochastic approximation methods are provided which are based on Taylor expansion methods, regression and response surface methods (RSM), probability inequalities, differentiation formulas for probability and expectation value functions. Moreover, first order reliability methods
(FORM), Laplace expansions of integrals, convex approximation/deterministic
descent directions/efficient points, (hybrid) stochastic gradient methods, stochastic
approximation procedures are considered. The mathematical properties of the
approximative problems and iterative solution procedures are described.
After the presentation of the basic approximation techniques in Chap. 2, in
Chap. 3 derivatives of probability and mean value functions are obtained by the
following methods: transformation method, stochastic completion and transformation, orthogonal function series expansion. Chapter 4 shows how nonconvex
deterministic substitute problems can be approximated by convex mean value
minimization problems. Depending on the type of loss functions and the parameter
distribution, feasible descent directions can be constructed at non-efficient points.
Here, efficient points, determined often by linear/quadratic conditions involving first
and second order moments of the parameters, are candidates for optimal solutions
to be obtained with much larger effort. The numerical/iterative solution techniques
presented in Chap. 5 are based on hybrid stochastic approximation, hence, methods
using not only simple stochastic (sub)gradients, but also deterministic descent
directions and/or more exact gradient estimators. More exact gradient estimators
based on Response Surface Methods (RSM) are described in the first part of
Chap. 5. The convergence in the mean square sense is considered, and results on the
speed up of the convergence by using improved step directions at certain iteration
points are given. Various extensions of hybrid stochastic approximation methods
are treated in Chap. 6 on stochastic approximation methods with changing variance
of the estimation error. Here, second order search directions, based, e.g., on the
approximation of the Hessian, and asymptotic optimal scalar and matrix valued
step sizes are constructed. Moreover, using different types of probabilistic convergence concepts, related convergence properties and convergence rates are given.
Mathematical definitions and theorems needed here are given in the Appendix.
Applications to the approximate computation of survival/failure probabilities for
technical and economical systems/structures are given in the last Chap. 7. The reader
of this book needs some basic knowledge in linear algebra, multivariate analysis and
stochastics.
In realizing this monograph, the author was supported by several collaborators:
First of all I would like to thank Ms Elisabeth Lößl from the Institute for
Mathematics and Computer Sciences for the excellent LaTeX typesetting of the whole
manuscript. Without her very precise and careful work, this project could not
have been realized. I also owe thanks to my former collaborators, Dr. Andreas
Aurnhammer, for further LaTeX support and for providing English translation of
some German parts of the original manuscript, and to Thomas Platzer for proof
reading of some parts of the book. Last but not least, I am indebted to Springer-
Verlag for inclusion of the book into the Springer-program. I would like to
thank especially the Publishing Director Economics and Management Science of
Springer-Verlag, Dr. Werner A. Müller, for his really very long patience until the
completion of this book.

Munich, Germany Kurt Marti


August 2004
Preface to the Second Edition

The major change in the second edition of this book is the inclusion of a new,
extended version of Chap. 7 on the approximate computation of probabilities of
survival and failure of technical, economic structures and systems. Based on
a representation of the state function of the structure/system by the minimum
value of a certain optimization problem, approximations of the probability of
survival/failure are obtained by means of polyhedral approximation of the so-called survival domain, certain probability inequalities and discretizations of the
underlying probability distribution of the random model parameters. All the other
chapters have been updated by including some more recent material and by
correcting some typing errors.
The author thanks again Ms. Elisabeth Lößl for the excellent LaTeX-typesetting
of all revisions and completions of the text. Moreover, the author thanks Dr. Müller,
Vice President Business/Economics & Statistics, Springer-Verlag, for the possibility
to publish a second edition of “Stochastic Optimization Methods.”

Munich, Germany Kurt Marti


March 2008

Outline of the 3rd Edition

A short overview on stochastic optimization methods, including the derivation of appropriate deterministic substitute problems, is given in Chap. 1.
Basic solution techniques for optimal control problems under stochastic uncertainty are presented in Chap. 2: optimal control problems as arising in different technical (mechanical, electrical, thermodynamic, chemical, etc.) plants and economic systems are modeled mathematically by a system of first order nonlinear differential equations for the plant state vector z = z(t) involving, e.g., displacements, stresses, voltages, currents, pressures, concentration of chemicals, demands, etc. This system of differential equations depends on the vector u(t) of input or control variables and a vector a = a(ω) of certain random model parameters. Moreover, also the vector z_0 of initial values of the plant state vector z = z(t) at the initial time t = t_0 may be subject to random variations. While the actual realizations of the
random parameters and initial values are not known at the planning stage, we may
assume that the probability distribution or at least the occurring moments, such as
expectations and variances, are known. Moreover, we suppose that the costs along
the trajectory and the terminal costs G are convex functions with respect to the pair (u, z) of control and state variables u, z, the final state z(t_f), respectively. The
problem is then to determine an open-loop, closed-loop, or an intermediate open-loop feedback control law minimizing the expected total costs consisting of the
sum of the costs along the trajectory and the terminal costs. For the computation of
stochastic optimal open-loop controls at each starting time point t_b, the stochastic Hamilton function of the control problem is introduced first. Then, an H-minimal
control can be determined by solving a finite-dimensional stochastic optimization
problem for minimizing the conditional expectation of the stochastic Hamiltonian
subject to the remaining deterministic control constraints at each time point t.
Having an H-minimal control, the related Hamiltonian two-point boundary value problem with random parameters is formulated for the computation of the stochastic
optimal state and adjoint state trajectory. In case of a linear-quadratic control
problem, the state and adjoint state trajectory can be determined analytically to a
large extent. Inserting then these trajectories into the H-minimal control, stochastic


optimal open-loop controls are found. For approximate solutions of the stochastic
two-point boundary problem, cf. [112].
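The H-minimal control step described above can be sketched for a hypothetical scalar plant with dynamics f(z, u, a) = a·z + u and running cost u²/2; all names and data here are illustrative assumptions, not taken from the book:

```python
import numpy as np

# Sketch of the H-minimal control step at one fixed time point.
rng = np.random.default_rng(1)
a = rng.normal(0.5, 0.1, size=2000)    # samples of the random model parameter a(omega)
z, y = 1.0, 0.8                        # current state and adjoint state values

def hamiltonian(u, a):
    # running cost 0.5*u**2 plus adjoint y times the dynamics f(z, u, a) = a*z + u
    return 0.5 * u ** 2 + y * (a * z + u)

# minimize the (sample) expectation of the Hamiltonian over the
# admissible control set D = [-1, 1]
us = np.linspace(-1.0, 1.0, 401)
expected_H = [hamiltonian(u, a).mean() for u in us]
u_star = us[int(np.argmin(expected_H))]
print(u_star)                          # for this quadratic H, u* = -y
```

In this simple quadratic case the H-minimal control does not depend on a(ω) at all; in general, the conditional expectation must be minimized numerically at each time point.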
Stochastic optimal regulators or optimal feedback control under stochastic
uncertainty can be obtained in general by approximative methods only. In Chap. 3,
stochastic optimal feedback controls are determined by means of the fundamental
stochastic open-loop feedback method. This very efficient approximation method is
also the basis of model predictive control procedures. Using the methods described
in Chap. 2, stochastic optimal open-loop feedback controls are constructed by computing next stochastic optimal open-loop controls on the remaining time intervals t_b ≤ t ≤ t_f with t_0 ≤ t_b ≤ t_f. Having a stochastic optimal open-loop control on each remaining time interval t_b ≤ t ≤ t_f, a stochastic optimal open-loop feedback control law follows then immediately by evaluating each of the stochastic optimal open-loop controls on t_b ≤ t ≤ t_f at the corresponding initial time point t = t_b. The efficiency of this method has
been proved already by applications to the stochastic optimization of regulators for
robots.
Adaptive Optimal Stochastic Trajectory Planning and Control (AOSTPC) are considered in Chap. 4: in optimal control of dynamic systems, the standard procedure is to determine first off-line an optimal open-loop control, using some nominal
or estimated values of the model parameters, and to correct then the resulting
deviation of the actual trajectory or system performance from the prescribed
trajectory (prescribed system performance) by on-line measurement and control
actions. However, on-line measurement and control actions are very expensive and
time-consuming. By adaptive optimal stochastic trajectory planning and control
(AOSTPC), based on stochastic optimization methods, the available a priori and statistical information about the unknown model parameters is incorporated into the optimal control design. Consequently, the mean absolute deviation between the
actual and prescribed trajectory can be reduced considerably, and robust controls are
obtained. Using only some necessary stability conditions, by means of stochastic
optimization methods also sufficient stability properties of the corresponding feedforward, feedback (PD-, PID-)controls, resp., are obtained. Moreover, analytical
estimates are given for the reduction of the tracking error, hence, for the reduction
of the on-line correction expenses by applying (AOSTPC).
Special methods for the optimal design of regulators are described in Chap. 5:
the optimal design of regulators is often based on given, fixed nominal values of
initial conditions, external loads, and other model parameters. However, due to the
variations of material properties, measurement and observational errors, modeling
errors, uncertainty of the working environment, the task to be executed, etc., the
true external loadings, initial conditions, and model parameters are not known
exactly in practice. Hence, a predetermined optimal regulator should be “robust,”
i.e., the regulator should guarantee satisfying results also in case of observational
errors and variations of the different parameters occurring in the model. Robust
regulators have been considered mainly for uncertainty models based on given fixed
sets of parameters, especially certain multiple intervals containing the unknown,
true parameter. However, since in many cases observational errors and parameter
uncertainties can be described more adequately by means of stochastic models,
in this chapter we suppose that the parameters involved in the regulator design
problem are realizations of random variables having a known joint probability
distribution. For the consideration of stochastic parameter variations within an
optimal design process one has to introduce, as for any other optimization problem under stochastic uncertainty, an appropriate deterministic substitute problem.
Based on the (optimal) reference trajectory and the related feedforward control, for
the computation of stochastic optimal regulators, deterministic substitute control
problems of the following type are considered: minimize the expected total costs
arising from the deviation between the (stochastic optimal) reference trajectory
and the effective trajectory of the dynamic system and the costs for the control
correction subject to the following constraints: (i) dynamic equation of the stochastic
system with the total control input being the sum of the feedforward control
and the control correction, (ii) stochastic initial conditions, and (iii) additional
conditions for the feedback law. The main problem is then the computation
of the (conditional) expectation arising in the objective function. The occurring
expectations are evaluated by Taylor expansions with respect to the stochastic
parameter vector at its conditional mean. Applying stochastic optimization methods, the problem of regulator optimization under stochastic uncertainty is converted into a deterministic optimal control problem involving the gain matrices as the unknown
matrix parameters to be determined. This method has been validated already for the
stochastic optimal design of regulators for industrial robots.
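The Taylor-expansion step for the occurring expectations can be sketched as follows; the cost function exp(a) and the moments are hypothetical placeholders, and the second-order expansion E[f(a)] ≈ f(ā) + ½ f''(ā) Var(a) at the (conditional) mean is the standard device:

```python
import numpy as np

# Sketch: evaluating an expectation by a second-order Taylor expansion
# at the mean of the stochastic parameter.
rng = np.random.default_rng(2)
mean, sigma = 1.0, 0.1
a = rng.normal(mean, sigma, size=200000)   # random parameter with known moments

def cost(a):
    # hypothetical cost term depending on the stochastic parameter
    return np.exp(a)

# E[f(a)] ~ f(mean) + 0.5 * f''(mean) * Var(a); here f'' = exp as well
taylor = np.exp(mean) + 0.5 * np.exp(mean) * sigma ** 2
monte_carlo = cost(a).mean()               # reference value by sampling
print(taylor, monte_carlo)
```

For small parameter variance the two values agree closely, which is what makes the Taylor approach attractive inside an optimization loop where the expectation must be evaluated many times.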
Stochastic optimization methods for the optimization of mechanical structures
with random parameters are considered in Chap. 6: in the optimal design of
mechanical structures, one has to minimize the weight, volume, or a more general
cost function under yield, strength, or more general safety constraints. Moreover,
for the design variables, e.g., structural dimensions, sizing variables, there are
always some additional constraints, like box constraints. A basic problem is that
the material parameters of the structure, manufacturing and modeling errors, as
well as the external loads are not given, fixed quantities, but random variables
having a certain probability distribution. In order to avoid, as far as possible, very expensive observation and repair actions, or even severe damages, hence, in order to
get robust designs with respect to random parameter variations, the always available
information (e.g., certain moments) about the random variations of the unknown
technical parameters should be taken into account already at the project definition
stage. Hence, the original structural optimization problem with random parameters
has to be replaced by an appropriate deterministic substitute problem. Starting from
the linear yield, strength, or safety conditions and the linear equilibrium equation
involving the member forces and bending moments, the problem can be described
in the framework of a two-stage stochastic linear program (SLP) “with complete
fixed recourse.” For the evaluation of the violation of the yield, strength or safety
condition, different types of linear and sublinear cost models are introduced by
taking into account, e.g., the displacements of the elements of the structure caused
by the member forces or bending moments, or by considering certain strength or
safety “reserves” of the structure. The main properties of the resulting substitute
problems are discussed, e.g., (i) the convexity of the problems and (ii) the “dual
block angular” structure of the resulting (large-scale) deterministic linear program
in case of discrete probability distributions. In this case, several special purpose LP-solvers are available. In case of a continuous parameter distribution, the resulting
substitute problems can be solved approximatively by means of a discretization of
the underlying probability distribution and using then the SLP-method as described
above.
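The "dual block angular" structure mentioned above can be sketched by assembling the constraint matrix of the deterministic equivalent for a discrete distribution with R realizations; all dimensions and matrix entries below are random placeholders, not structural data:

```python
import numpy as np

# Sketch: constraint matrix of the deterministic equivalent of a two-stage
# SLP with complete fixed recourse and R discrete parameter realizations.
n1, n2, m1, m2, R = 3, 4, 2, 3, 5
rng = np.random.default_rng(3)
A = rng.random((m1, n1))                       # first-stage constraint matrix
W = rng.random((m2, n2))                       # fixed recourse matrix (same in every block)
T = [rng.random((m2, n1)) for _ in range(R)]   # technology matrix per realization

# dual block angular structure:
#   [ A            ]
#   [ T_1  W       ]
#   [ T_2     W    ]
#   [ ...       .. ]
rows = [np.hstack([A, np.zeros((m1, R * n2))])]
for r in range(R):
    left = np.zeros((m2, r * n2))
    right = np.zeros((m2, (R - r - 1) * n2))
    rows.append(np.hstack([T[r], left, W, right]))
M = np.vstack(rows)
print(M.shape)                                 # (m1 + R*m2, n1 + R*n2)
```

The scenario blocks couple only through the first-stage columns, which is exactly the sparsity pattern the special purpose LP-solvers mentioned above exploit.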
Structural Analysis and Optimal Structural Design under Stochastic Uncertainty
using Quadratic Cost Functions are treated in Chap. 7: problems from plastic
analysis and optimal plastic design are based on the convex, linear, or linearized
yield/strength condition and the linear equilibrium equation for the stress (state)
vector. In practice, one has to take into account stochastic variations of the vector a = a(ω) of model parameters (e.g., yield stresses, plastic capacities, external
load factors, cost factors, etc.). Hence, in order to get robust optimal load factors
x, robust optimal designs x, resp., the basic plastic analysis or optimal plastic
design problem with random parameters has to be replaced by an appropriate
deterministic substitute problem. As a basic tool in the analysis and optimal design
of mechanical structures under uncertainty, a state function s* = s*(a, x) of the underlying structure is introduced. The survival of the structure can be described then by the condition s* ≤ 0. Interpreting the state function s* as a cost function, several relations of s* to other cost functions, especially quadratic cost functions, are derived. Bounds for the probability of survival p_s are obtained then by means of the Tschebyscheff inequality. In order to obtain robust optimal decisions x*,
i.e., maximum load factors, optimal designs insensitive with respect to variations
of the model parameters a = a(ω), a direct approach is presented then based
on the primary costs (weight, volume, costs of construction, costs for missing
carrying capacity, etc.) and the recourse costs (e.g., costs for repair, compensation
for weakness within the structure, damage, failure, etc.), where the abovementioned
quadratic cost criterion is used. The minimum recourse costs can be determined
then by solving an optimization problem having a quadratic objective function
and linear constraints. For each vector a = a(ω) of model parameters and each design vector x, one obtains then an explicit representation of the “best” internal load distribution F*. Moreover, also the expected recourse costs can be determined
explicitly. The expected recourse function may be represented by means of a
“generalized stiffness matrix.” Hence, corresponding to an elastic approach, the
expected recourse function can be interpreted here as a “generalized expected
compliance function”, which depends on a generalized “stiffness matrix.” Based
on the minimization of the expected primary costs subject to constraints for the
expected recourse costs (“generalized compliance”) or the minimization of the
expected total primary and recourse costs, explicit finite dimensional parameter
optimization problems are achieved for finding a robust optimal design x* or a maximal load factor x*. The analytical properties of the resulting programming
problem are discussed, and applications, such as limit load/shakedown analysis,
are considered. Furthermore, based on the expected “compliance function,” explicit
upper and lower bounds for the probability ps of survival can be derived using
linearization methods.
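A Tschebyscheff-type bound on the probability of survival can be sketched with the one-sided (Cantelli) form of the inequality, assuming survival means s* ≤ 0 and using only the first two moments of s*; the distribution of s* below is a hypothetical stand-in:

```python
import numpy as np

# Sketch: Tschebyscheff-type (Cantelli) lower bound for the probability of
# survival p_s = P(s* <= 0), using only the first two moments of s*.
rng = np.random.default_rng(4)
s = rng.normal(-2.0, 1.0, size=100000)   # hypothetical samples of the state function s*(a, x)

mu, var = s.mean(), s.var()
# one-sided Tschebyscheff inequality: for mu < 0,
#   P(s* <= 0) >= mu**2 / (mu**2 + var)
lower_bound = mu ** 2 / (mu ** 2 + var)
p_s = (s <= 0).mean()                    # empirical survival probability
print(lower_bound, p_s)
```

The bound is distribution-free and therefore conservative; its appeal is that it needs only the mean and variance of s*, which is precisely what the expected-cost representations above make available.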
Finally, in Chap. 8 the inference and decision strategies applied in stochastic
optimization methods are considered in more detail:
A large number of optimization problems arising in engineering, control, and
economics can be described by the minimization of a certain (cost) function
v = v(a, x) depending on a random parameter vector a = a(ω) and a decision vector x ∈ D lying in a given set D of feasible decision, design, or control
variables. Hence, in order to get robust optimal decisions, i.e., optimal decisions
being most insensitive with respect to variations of the random parameter vector
a = a(ω), the original optimization problem is replaced by the deterministic substitute problem which consists in the minimization of the expected objective function Ev = Ev(a(ω), x) subject to x ∈ D. Since the true probability distribution γ of a = a(ω) is not exactly known in practice, one has to replace γ by a certain estimate or guess β. Consequently, one has the following inference and decision
problem:
• Inference/estimation step:
Determine an estimate β of the true probability distribution γ of a = a(ω);
• Decision step:
Determine an optimal solution x* of min ∫ v(a(ω), x) β(dω) s.t. x ∈ D.
When computing an approximation/estimate β of the true distribution γ, the criterion for judging the approximation should be based on its utility for the decision-making process, i.e., one
should weight the approximation error according to its influence on decision errors,
and the decision errors should be weighted in turn according to the loss caused by
an incorrect decision.
Based on inferential ideas developed among others by Kerridge, Kullback, in
this chapter generalized decision-oriented inaccuracy and divergence functions for
probability distributions  , ˇ are developed, taking into account that the outcome ˇ
of the inferential stage is used in a subsequent (ultimate) decision-making problem
modeled by the above-mentioned stochastic optimization problem. In addition,
stability properties of the inference and decision process

 ! ˇ ! x 2 D .ˇ/;

are studied, where D .ˇ/ denotes the set of -optimal decisions with respect to
probability distribution Pa./ D ˇ of the random parameter vector a D a.!/.
Contents

1  Stochastic Optimization Methods  1
   1.1  Introduction  1
   1.2  Deterministic Substitute Problems: Basic Formulation  3
        1.2.1  Minimum or Bounded Expected Costs  4
        1.2.2  Minimum or Bounded Maximum Costs (Worst Case)  6
   1.3  Optimal Decision/Design Problems with Random Parameters  7
   1.4  Deterministic Substitute Problems in Optimal Decision/Design  12
        1.4.1  Expected Cost or Loss Functions  13
   1.5  Basic Properties of Deterministic Substitute Problems  16
   1.6  Approximations of Deterministic Substitute Problems in Optimal Design/Decision  18
        1.6.1  Approximation of the Loss Function  18
        1.6.2  Approximation of State (Performance) Functions  21
        1.6.3  Taylor Expansion Methods  24
   1.7  Approximation of Probabilities: Probability Inequalities  28
        1.7.1  Bonferroni-Type Inequalities  28
        1.7.2  Tschebyscheff-Type Inequalities  30

2  Optimal Control Under Stochastic Uncertainty  37
   2.1  Stochastic Control Systems  37
        2.1.1  Random Differential and Integral Equations  39
        2.1.2  Objective Function  45
   2.2  Control Laws  49
   2.3  Convex Approximation by Inner Linearization  52
   2.4  Computation of Directional Derivatives  57
   2.5  Canonical (Hamiltonian) System of Differential Equations/Two-Point Boundary Value Problem  67
   2.6  Stationary Controls  69
   2.7  Canonical (Hamiltonian) System of Differential Equations  71
   2.8  Computation of Expectations by Means of Taylor Expansions  73
        2.8.1  Complete Taylor Expansion  74
        2.8.2  Inner or Partial Taylor Expansion  75

3  Stochastic Optimal Open-Loop Feedback Control  79
   3.1  Dynamic Structural Systems Under Stochastic Uncertainty  79
        3.1.1  Stochastic Optimal Structural Control: Active Control  79
        3.1.2  Stochastic Optimal Design of Regulators  81
        3.1.3  Robust (Optimal) Open-Loop Feedback Control  82
        3.1.4  Stochastic Optimal Open-Loop Feedback Control  82
   3.2  Expected Total Cost Function  84
   3.3  Open-Loop Control Problem on the Remaining Time Interval [t_b, t_f]  85
   3.4  The Stochastic Hamiltonian of (3.7a–d)  85
        3.4.1  Expected Hamiltonian (with Respect to the Time Interval [t_b, t_f] and Information A_{t_b})  86
        3.4.2  H-Minimal Control on [t_b, t_f]  86
   3.5  Canonical (Hamiltonian) System  87
   3.6  Minimum Energy Control  88
        3.6.1  Endpoint Control  89
        3.6.2  Endpoint Control with Different Cost Functions  93
        3.6.3  Weighted Quadratic Terminal Costs  95
   3.7  Nonzero Costs for Displacements  98
        3.7.1  Quadratic Control and Terminal Costs  100
   3.8  Stochastic Weight Matrix Q = Q(t, ω)  103
   3.9  Uniformly Bounded Sets of Controls D_t, t_0 ≤ t ≤ t_f  107
   3.10 Approximate Solution of the Two-Point Boundary Value Problem (BVP)  112
   3.11 Example  115

4  Adaptive Optimal Stochastic Trajectory Planning and Control (AOSTPC)  119
   4.1  Introduction  119
   4.2  Optimal Trajectory Planning for Robots  121
   4.3  Problem Transformation  124
        4.3.1  Transformation of the Dynamic Equation  126
        4.3.2  Transformation of the Control Constraints  127
        4.3.3  Transformation of the State Constraints  128
        4.3.4  Transformation of the Objective Function  129
   4.4  OSTP: Optimal Stochastic Trajectory Planning  129
        4.4.1  Computational Aspects  137
        4.4.2  Optimal Reference Trajectory, Optimal Feedforward Control  141
   4.5  AOSTP: Adaptive Optimal Stochastic Trajectory Planning  142
        4.5.1  (OSTP)-Transformation  146
        4.5.2  The Reference Variational Problem  148
        4.5.3  Numerical Solutions of (OSTP) in Real-Time  150
   4.6  Online Control Corrections: PD-Controller  157
        4.6.1  Basic Properties of the Embedding q(t, ε)  158
        4.6.2  The 1st Order Differential dq  161
        4.6.3  The 2nd Order Differential d²q  168
        4.6.4  Third and Higher Order Differentials  172
   4.7  Online Control Corrections: PID Controllers  173
        4.7.1  Basic Properties of the Embedding q(t, ε)  176
        4.7.2  Taylor Expansion with Respect to ε  177
        4.7.3  The 1st Order Differential dq  178

5  Optimal Design of Regulators  195
   5.1  Tracking Error  197
        5.1.1  Optimal PD-Regulator  198
   5.2  Parametric Regulator Models  200
        5.2.1  Linear Regulator  200
        5.2.2  Explicit Representation of Polynomial Regulators  201
        5.2.3  Remarks to the Linear Regulator  202
   5.3  Computation of the Expected Total Costs of the Optimal Regulator Design  203
        5.3.1  Computation of Conditional Expectations by Taylor Expansion  204
        5.3.2  Quadratic Cost Functions  206
   5.4  Approximation of the Stochastic Regulator Optimization Problem  207
        5.4.1  Approximation of the Expected Costs: Expansions of 1st Order  209
   5.5  Computation of the Derivatives of the Tracking Error  216
        5.5.1  Derivatives with Respect to Dynamic Parameters at Stage j  217
        5.5.2  Derivatives with Respect to the Initial Values at Stage j  221
        5.5.3  Solution of the Perturbation Equation  224
   5.6  Computation of the Objective Function  230
   5.7  Optimal PID-Regulator  233
        5.7.1  Quadratic Cost Functions  235
        5.7.2  The Approximate Regulator Optimization Problem  250

6  Expected Total Cost Minimum Design of Plane Frames  253
   6.1  Introduction  253
   6.2  Stochastic Linear Programming Techniques  254
        6.2.1  Limit (Collapse) Load Analysis of Structures as a Linear Programming Problem  254
        6.2.2  Plane Frames  257
        6.2.3  Yield Condition in Case of M–N-Interaction  263
        6.2.4  Approximation of the Yield Condition by Using Reference Capacities  270
        6.2.5  Asymmetric Yield Stresses  273
        6.2.6  Violation of the Yield Condition  279
        6.2.7  Cost Function  280

7  Stochastic Structural Optimization with Quadratic Loss Functions  289
   7.1  Introduction  289
   7.2  State and Cost Functions  292
        7.2.1  Cost Functions  296
   7.3  Minimum Expected Quadratic Costs  299
   7.4  Deterministic Substitute Problems  304
        7.4.1  Weight (Volume)-Minimization Subject to Expected Cost Constraints  304
        7.4.2  Minimum Expected Total Costs  306
   7.5  Stochastic Nonlinear Programming  308
        7.5.1  Symmetric, Non-Uniform Yield Stresses  311
        7.5.2  Non-Symmetric Yield Stresses  312
   7.6  Reliability Analysis  315
   7.7  Numerical Example: 12-Bar Truss  318
        7.7.1  Numerical Results: MEC  320
        7.7.2  Numerical Results: ECBO  322

8  Maximum Entropy Techniques  323
   8.1  Uncertainty Functions Based on Decision Problems  323
        8.1.1  Optimal Decisions Based on the Two-Stage Hypothesis Finding (Estimation) and Decision Making Procedure  323
        8.1.2  Stability/Instability Properties  328
   8.2  The Generalized Inaccuracy Function H(·, β)  331
        8.2.1  Special Loss Sets V  334
        8.2.2  Representation of H_ε(·, β) and H(·, β) by Means of Lagrange Duality  342
   8.3  Generalized Divergence and Generalized Minimum Discrimination Information  346
        8.3.1  Generalized Divergence  346
        8.3.2  I-, J-Projections  352
        8.3.3  Minimum Discrimination Information  353

References  357

Index  365
Chapter 1
Stochastic Optimization Methods

1.1 Introduction

Many concrete problems from engineering, economics, operations research, etc., can be formulated as an optimization problem of the type

    min  f_0(a, x)                                          (1.1a)
    s.t.
         f_i(a, x) ≤ 0,  i = 1, …, m_f                      (1.1b)
         g_i(a, x) = 0,  i = 1, …, m_g                      (1.1c)
         x ∈ D_0.                                           (1.1d)

Here, the objective (goal) function f_0 = f_0(a, x) and the constraint functions f_i = f_i(a, x), i = 1, …, m_f, and g_i = g_i(a, x), i = 1, …, m_g, defined on a joint subset of ℝ^ν × ℝ^r, depend on a decision, design, control or input vector x = (x_1, x_2, …, x_r)^T and a vector a = (a_1, a_2, …, a_ν)^T of model parameters. Typical model parameters in technical applications, operations research and economics are material parameters, external load parameters, cost factors, technological parameters in input–output operators, and demand factors. Furthermore, manufacturing and modeling errors, disturbances or noise factors, etc., may occur. Frequent decision, control or input variables are material, topological, geometrical and cross-sectional design variables in structural optimization [77], forces and moments in optimal control of dynamic systems, and factors of production in operations research and economic design.

The objective function (1.1a) to be optimized describes the aim, the goal of the modeled optimal decision/design problem, or the performance of a technical or economic system or process to be controlled optimally. Furthermore, the constraints

© Springer-Verlag Berlin Heidelberg 2015
K. Marti, Stochastic Optimization Methods, DOI 10.1007/978-3-662-46214-0_1
(1.1b–d) represent the operating conditions guaranteeing a safe structure, a correct functioning of the underlying system, process, etc. Note that the constraint (1.1d), with a given, fixed convex subset D_0 ⊂ ℝ^r, summarizes all (deterministic) constraints that are independent of the unknown model parameters a, e.g. box constraints:

    x^L ≤ x ≤ x^U                                           (1.1d')

with given bounds x^L, x^U.

Important concrete optimization problems which may be formulated, at least approximately, in this way are problems from optimal design of mechanical structures and structural systems [7, 77, 147, 160], adaptive trajectory planning for robots [9, 11, 32, 93, 125, 148], adaptive control of dynamic systems [150, 159], optimal design of economic systems, production planning, manufacturing [86, 129], sequential decision processes [106], etc.

In optimal control, cf. Chap. 2, the input vector x := u(·) is interpreted as a function, a control or input function u = u(t), t_0 ≤ t ≤ t_f, on a certain given time interval [t_0, t_f]. Moreover, see Chap. 2, the objective function f_0 = f_0(a, u(·)) is defined by a certain integral over the time interval [t_0, t_f]. In addition, the constraint functions f_j = f_j(a, u(·)) are defined by integrals over [t_0, t_f], or f_j = f_j(t, a, u(t)) may be functions of time t and the control input u(t) at time t.

A basic problem in practice is that the vector of model parameters a = (a_1, …, a_ν)^T is not a given, fixed quantity. Model parameters are often unknown, only partly known, and/or may vary randomly to some extent.

Several techniques have been developed in recent years in order to cope with uncertainty with respect to the model parameters a. A well-known basic method, often used in engineering practice, is the following two-step procedure [11, 32, 125, 148, 150]:

I) Parameter estimation and approximation: First replace the ν-vector a of the unknown or stochastically varying model parameters a_1, …, a_ν by some estimated/chosen fixed vector a^0 of so-called nominal values a^0_l, l = 1, …, ν. Then apply an optimal decision (control) x* = x*(a^0) with respect to the resulting approximate optimization problem

    min  f_0(a^0, x)                                        (1.2a)
    s.t.
         f_i(a^0, x) ≤ 0,  i = 1, …, m_f                    (1.2b)
         g_i(a^0, x) = 0,  i = 1, …, m_g                    (1.2c)
         x ∈ D_0.                                           (1.2d)

Due to the deviation of the actual parameter vector a from the nominal vector a^0 of model parameters, deviations of the actual state, trajectory or performance of the system from the prescribed state, trajectory or goal values occur.
II) Compensation or correction: The deviation of the actual state, trajectory or performance of the system from the prescribed values/functions is then compensated by online measurement and correction actions (decisions or controls). Consequently, in general, increasing measurement and correction expenses result in the course of time.
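A minimal numerical sketch of this two-step procedure may help: step I optimizes for a fixed nominal parameter vector a^0, and the deviation of the actual parameters then shows up as a constraint violation that step II would have to correct online. All cost and constraint functions and all numbers below are invented for illustration; they are not taken from the book.

```python
def f0(a, x):
    # toy primary cost f0(a, x); a = (a1, a2) are the model parameters
    return a[0] * x**2 - 2.0 * x

def f1(a, x):
    # toy inequality constraint f1(a, x) <= 0  (here: x <= a2)
    return x - a[1]

def solve_nominal(a0, grid_n=10001):
    # step I: minimize f0(a0, .) over D0 = [0, 10] by brute-force grid search
    xs = (i * 10.0 / (grid_n - 1) for i in range(grid_n))
    feasible = [x for x in xs if f1(a0, x) <= 0.0]
    return min(feasible, key=lambda x: f0(a0, x))

a0 = (1.0, 1.5)             # nominal parameter vector a^0
x_star = solve_nominal(a0)  # optimal design x*(a^0) for the nominal parameters

a_actual = (0.8, 0.9)       # actual (deviating) parameter realization
violation = f1(a_actual, x_star)
print(x_star, violation)    # x* = 1.0; constraint violated by about 0.1
```

The violated constraint at the actual parameters is exactly the kind of deviation that step II must compensate by (costly) online corrections.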
Considerable improvements of this standard procedure can be obtained by taking into account already at the planning stage, i.e., offline, the mostly available a priori information (e.g. the type of random variability) and sample information about the parameter vector a. Indeed, based e.g. on some structural insight, or by parameter identification methods, regression techniques, calibration methods, etc., in most cases information about the vector a of model parameters can be extracted. Repeating this information-gathering procedure at some later time points t_j > t_0 (= initial time point), j = 1, 2, …, adaptive decision/control procedures arise [106].

Based on the inherent random nature of the parameter vector a, on the observation or measurement mechanism, resp., or adopting a Bayesian approach concerning unknown parameter values [14], here we make the following basic assumption:

Stochastic (probabilistic) uncertainty: The unknown parameter vector a is a realization

    a = a(ω),  ω ∈ Ω,                                       (1.3)

of a random ν-vector a(ω) on a certain probability space (Ω, A_0, P), where the probability distribution P_{a(·)} of a(ω) is known, or it is known that P_{a(·)} lies within a given range W of probability measures on ℝ^ν. Using a Bayesian approach, the probability distribution P_{a(·)} of a(ω) may also describe the subjective or personal probability of the decision maker, the designer.

Hence, in order to take into account the stochastic variations of the parameter vector a, and to incorporate the a priori and/or sample information about the unknown vector a, resp., the standard approach "insert a certain nominal parameter vector a^0, and then correct the resulting error" must be replaced by a more appropriate deterministic substitute problem for the basic optimization problem (1.1a–d) under stochastic uncertainty.
1.2 Deterministic Substitute Problems: Basic Formulation

The proper selection of a deterministic substitute problem is a decision-theoretical task, see [90]. Hence, for (1.1a–d) we first have to consider the outcome map

    e = e(a, x) := (f_0(a, x), f_1(a, x), …, f_{m_f}(a, x), g_1(a, x), …, g_{m_g}(a, x))^T,
    a ∈ ℝ^ν, x ∈ ℝ^r (x ∈ D_0),                             (1.4a)
and then to evaluate the outcomes e ∈ E ⊂ ℝ^{1+m_0}, m_0 := m_f + m_g, by means of certain loss or cost functions

    γ_i : E → ℝ,  i = 0, 1, …, m,                           (1.4b)

with an integer m ≥ 0. For the processing of the numerical outcomes γ_i(e(a, x)), i = 0, 1, …, m, there are two basic concepts:
1.2.1 Minimum or Bounded Expected Costs

Consider the vector of (conditional) expected losses or costs

    F(x) = (F_0(x), F_1(x), …, F_m(x))^T
         := (Eγ_0(e(a(ω), x)), Eγ_1(e(a(ω), x)), …, Eγ_m(e(a(ω), x)))^T,  x ∈ ℝ^r,   (1.5)

where the (conditional) expectation "E" is taken with respect to the time history A = A_t, (A_j) ⊂ A_0, up to a certain time point t or stage j. A short definition of expectations is given in Sect. 2.1; for more details, see e.g. [13, 47, 135].
Having different expected cost or performance functions F_0, F_1, …, F_m to be minimized or bounded, as a basic deterministic substitute problem for (1.1a–d) with a random parameter vector a = a(ω) we may consider the multi-objective expected cost minimization problem

    "min"  F(x)                                             (1.6a)
    s.t.
         x ∈ D_0.                                           (1.6b)

Obviously, a good compromise solution x* of this vector optimization problem should have at least one of the following properties [27, 138]:

Definition 1.1
a) A vector x^0 ∈ D_0 is called a functional-efficient or Pareto-optimal solution of the vector optimization problem (1.6a, b) if there is no x ∈ D_0 such that

    F_i(x) ≤ F_i(x^0),  i = 0, 1, …, m                      (1.7a)

and

    F_{i_0}(x) < F_{i_0}(x^0)  for at least one i_0, 0 ≤ i_0 ≤ m.   (1.7b)

b) A vector x^0 ∈ D_0 is called a weak functional-efficient or weak Pareto-optimal solution of (1.6a, b) if there is no x ∈ D_0 such that

    F_i(x) < F_i(x^0),  i = 0, 1, …, m.                     (1.8)
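For a finite set of candidate decisions whose expected cost vectors F(x) have been evaluated (e.g. by simulation), Definition 1.1a can be checked directly by pairwise comparison. The following sketch filters the Pareto-optimal points among a few hypothetical cost vectors (the numbers are invented for illustration):

```python
def dominates(F_a, F_b):
    # F_a dominates F_b in the sense of (1.7a, b):
    # componentwise <=, and strictly smaller in at least one component
    return all(u <= v for u, v in zip(F_a, F_b)) and \
           any(u < v for u, v in zip(F_a, F_b))

def pareto_optimal(costs):
    # keep exactly those cost vectors not dominated by any other candidate
    return [F for F in costs if not any(dominates(G, F) for G in costs if G != F)]

# hypothetical expected cost vectors (F_0(x), F_1(x)) for four candidate designs
candidates = [(1.0, 5.0), (2.0, 3.0), (2.5, 3.5), (4.0, 1.0)]
print(pareto_optimal(candidates))  # (2.5, 3.5) drops out: dominated by (2.0, 3.0)
```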
(Weak) Pareto-optimal solutions of (1.6a, b) may now be obtained by means of scalarizations of the vector optimization problem (1.6a, b). Three main versions are stated in the following.

I) Minimization of the primary expected cost/loss under expected cost constraints:

    min  F_0(x)                                             (1.9a)
    s.t.
         F_i(x) ≤ F_i^max,  i = 1, …, m                     (1.9b)
         x ∈ D_0.                                           (1.9c)

Here, F_0 = F_0(x) is assumed to describe the primary goal of the design/decision-making problem, while F_i = F_i(x), i = 1, …, m, describe secondary goals. Moreover, F_i^max, i = 1, …, m, denote given upper cost/loss bounds.

Remark 1.1 An optimal solution x* of (1.9a–c) is a weak Pareto-optimal solution of (1.6a, b).
II) Minimization of the total weighted expected costs
Selecting certain positive weight factors c_0, c_1, …, c_m, the expected weighted total costs are defined by

    F̃(x) := Σ_{i=0}^m c_i F_i(x) = E f(a(ω), x),            (1.10a)

where

    f(a, x) := Σ_{i=0}^m c_i γ_i(e(a, x)).                  (1.10b)

Consequently, minimizing the expected weighted total costs F̃ = F̃(x) subject to the remaining deterministic constraint (1.1d), the following deterministic substitute problem for (1.1a–d) arises:

    min  Σ_{i=0}^m c_i F_i(x)                               (1.11a)
    s.t.
         x ∈ D_0.                                           (1.11b)

Remark 1.2 Let c_i > 0, i = 0, 1, …, m, be any positive weight factors. Then an optimal solution x* of (1.11a, b) is a Pareto-optimal solution of (1.6a, b).
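The expected objective in (1.11a) is usually not available in closed form. One common computational approach (a sample-average sketch, not the book's own algorithm) replaces E f(a(ω), x) by a Monte Carlo average over parameter samples and minimizes that instead. The cost function below is a hypothetical one-dimensional example: primary cost x² plus a quadratic recourse penalty for violating a − x ≤ 0, with invented weights and distribution:

```python
import random

def f(a, x):
    # weighted total cost f(a, x) = c_0*gamma_0 + c_1*gamma_1, cf. (1.10b), for
    # a toy example: primary cost x^2, quadratic penalty for violating a - x <= 0
    c0, c1 = 1.0, 10.0
    return c0 * x**2 + c1 * max(a - x, 0.0)**2

def F_tilde(x, samples):
    # Monte Carlo estimate of the expected weighted total cost E f(a(omega), x)
    return sum(f(a, x) for a in samples) / len(samples)

random.seed(0)
samples = [random.gauss(1.0, 0.2) for _ in range(1000)]  # a(omega) ~ N(1, 0.2^2)

# minimize the sample-average objective by grid search over D0 = [0, 2]
xs = [i / 100 for i in range(201)]
x_best = min(xs, key=lambda x: F_tilde(x, samples))
print(x_best)  # compromise between primary cost and expected penalty
```

The robust optimum lies well above 0 (the minimizer of the primary cost alone), reflecting the expected penalty for realizations with large a.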
III) Minimization of the maximum weighted expected costs
Instead of adding weighted expected costs, we may consider the maximum of the weighted expected costs:

    F̃(x) := max_{0≤i≤m} c_i F_i(x) = max_{0≤i≤m} c_i Eγ_i(e(a(ω), x)).   (1.12)

Here c_0, c_1, …, c_m are again positive weight factors. Thus, minimizing F̃ = F̃(x) we have the deterministic substitute problem

    min  max_{0≤i≤m} c_i F_i(x)                             (1.13a)
    s.t.
         x ∈ D_0.                                           (1.13b)

Remark 1.3 Let c_i, i = 0, 1, …, m, be any positive weight factors. An optimal solution x* of (1.13a, b) is a weak Pareto-optimal solution of (1.6a, b).
1.2.2 Minimum or Bounded Maximum Costs (Worst Case)

Instead of taking expectations, we may consider the worst case with respect to the cost variations caused by the random parameter vector a = a(ω). Hence, the random cost function

    ω → γ_i(e(a(ω), x))                                     (1.14a)

is evaluated by means of

    F_i^sup(x) := ess sup γ_i(e(a(ω), x)),  i = 0, 1, …, m.   (1.14b)

Here, ess sup(…) denotes the (conditional) essential supremum with respect to the random vector a = a(ω), given the information A, i.e., the infimum of the supremum of (1.14a) on sets A ∈ A_0 of (conditional) probability one, see e.g. [135]. Consequently, the vector function F^sup = F^sup(x) is then defined by

    F^sup(x) = (F_0^sup(x), F_1^sup(x), …, F_m^sup(x))^T
             := (ess sup γ_0(e(a(ω), x)), …, ess sup γ_m(e(a(ω), x)))^T.   (1.15)

Working with the vector function F^sup = F^sup(x), we then have the vector minimization problem

    "min"  F^sup(x)                                         (1.16a)
    s.t.
         x ∈ D_0.                                           (1.16b)

By scalarization of (1.16a, b) we again obtain deterministic substitute problems for (1.1a–d), related to the substitute problem (1.6a, b) introduced in Sect. 1.2.1.

More details on the selection and solution of appropriate deterministic substitute problems for (1.1a–d) are given in the next sections. Deterministic substitute problems for optimal control problems under stochastic uncertainty are considered in Chap. 2.
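When the random parameter has finite support (a discrete distribution over a few scenarios), the essential supremum in (1.14b) reduces to a plain maximum over the scenario set, and the scalarized worst-case substitute problem becomes an ordinary minimax problem. A small sketch with a hypothetical quadratic loss and three invented scenarios:

```python
def cost(a, x):
    # toy loss gamma_0(e(a, x)): quadratic tracking cost with uncertain target a
    return (x - a)**2

scenarios = [0.0, 0.5, 2.0]  # finite support of a(omega): ess sup becomes a max

def F_sup(x):
    # worst-case cost (1.14b) over the scenario set
    return max(cost(a, x) for a in scenarios)

# minimax substitute problem: minimize F_sup over D0 = [0, 2] by grid search
xs = [i / 100 for i in range(201)]
x_minimax = min(xs, key=F_sup)
print(x_minimax, F_sup(x_minimax))  # midpoint of the extreme scenarios: 1.0 1.0
```

Note that the worst-case optimum depends only on the extreme scenarios 0.0 and 2.0; the intermediate scenario never attains the maximum near the solution.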
1.3 Optimal Decision/Design Problems with Random Parameters

In the optimal design of technical or economic structures/systems, and in optimal decision problems arising in technical or economic systems, resp., two basic classes of criteria appear:

First there is a primary cost function

    G_0 = G_0(a, x).                                        (1.17a)
Important examples are the total weight or volume of a mechanical structure, the costs of construction or design of a certain technical or economic structure/system, or the negative utility or reward in a general decision situation. Basic examples in optimal control, cf. Chap. 2, are the total run time, the total energy consumption of the process, or a weighted mean of these two cost functions.

For the representation of the structural/system safety or failure, for the representation of the admissibility of the state, or for the formulation of the basic operating conditions of the underlying plant, structure or system, certain state, performance or response functions

    y_i = y_i(a, x),  i = 1, …, m_y,                        (1.17b)

are chosen. In structural design these functions are also called "limit state functions" or "safety margins". Frequent examples are displacement, stress, or load (force and moment) components in structural design, or more general system output functions in engineering design. Furthermore, production functions and several cost functions are possible performance functions in production planning problems, optimal mix problems, transportation problems, allocation problems and other problems of economic decision.

In (1.17a, b), the design or input vector x denotes the r-vector of design or input variables x_1, x_2, …, x_r, e.g. structural dimensions, sizing variables such as cross-sectional areas or thicknesses in structural design, or factors of production, actions in economic decision problems. For the decision, design or input vector x one mostly has some basic deterministic constraints, e.g. nonnegativity constraints or box constraints, represented by

    x ∈ D,                                                  (1.17c)

where D is a given convex subset of ℝ^r. Moreover, a is the ν-vector of model parameters. In optimal structural/engineering design,

    a = (p, R)                                              (1.17d)

is composed of the following two subvectors: R is the m-vector of the acting external loads or structural/system inputs, e.g. wave loads, wind loads, payload, etc. Moreover, p denotes the (ν − m)-vector of the further model parameters, e.g. material parameters, like strength parameters, yield/allowable stresses, elastic moduli, plastic capacities, etc., of the members of a mechanical structure; parameters of an electric circuit, such as resistances, inductances, capacitances; the manufacturing tolerances; and weight or more general cost coefficients.

In linear programming, as e.g. in production planning problems,

    a = (A, b, c)                                           (1.17e)
is composed of the m × r matrix A of technological coefficients, the demand m-vector b and the r-vector c of unit costs.

Based on the m_y-vector of state functions

    y(a, x) := (y_1(a, x), y_2(a, x), …, y_{m_y}(a, x))^T,   (1.17f)

the admissible or safe states of the structure/system can be characterized by the condition

    y(a, x) ∈ B,                                            (1.17g)

where B is a certain subset of ℝ^{m_y}; B = B(a) may also depend on some model parameters.

In production planning problems, typical operating conditions are given, cf. (1.17e), by

    y(a, x) := Ax − b ≥ 0  or  y(a, x) = 0,  x ≥ 0.         (1.18a)

In mechanical structures/structural systems, the safety (survival) of the structure/system is described by the operating conditions

    y_i(a, x) > 0  for all  i = 1, …, m_y                   (1.18b)

with state functions y_i = y_i(a, x), i = 1, …, m_y, depending on certain response components of the structure/system, such as displacement, stress, force or moment components. Hence, a failure occurs if and only if the structure/system is in the i-th failure mode (failure domain)

    y_i(a, x) ≤ 0                                           (1.18c)

for at least one index i, 1 ≤ i ≤ m_y.

Note. The number m_y of safety margins or limit state functions y_i = y_i(a, x), i = 1, …, m_y, may be very large. For example, in optimal plastic design the limit state functions are determined by the extreme points of the admissible domain of the dual pair of static/kinematic LPs related to the equilibrium and the linearized convex yield condition, see [99, 100].
Basic problems in optimal decision/design are:
I) Primary (construction, planning, investment, etc.) cost minimization under
operating or safety conditions

min G_0(a, x)   (1.19a)

s.t.

y(a, x) ∈ B   (1.19b)
x ∈ D.   (1.19c)

Obviously we have B = (0, +∞)^{m_y} in (1.18b) and B = [0, +∞)^{m_y} or B = {0}
in (1.18a).
II) Failure or recourse cost minimization under primary cost constraints

"min" γ(y(a, x))   (1.20a)

s.t.

G_0(a, x) ≤ G^max   (1.20b)
x ∈ D.   (1.20c)

In (1.20a) γ = γ(y) is a scalar or vector valued cost/loss function evaluating
violations of the operating conditions (1.19b). Depending on the application, these
costs are called "failure" or "recourse" costs [70, 71, 97, 132, 146, 147]. As already
discussed in Sect. 1.1, in solving problems of the above type a basic difficulty is the
uncertainty about the true value of the vector a of model parameters or the (random)
variability of a.
In practice, due to several types of uncertainties, such as, see [163],
• physical uncertainty (variability of physical quantities, like material, loads,
dimensions, etc.)
• economic uncertainty (trade, demand, costs, etc.)
• statistical uncertainty (e.g. estimation errors of parameters due to limited sample
data)
• model uncertainty (model errors),
the ν-vector a of model parameters must be modeled by a random vector

a = a(ω), ω ∈ Ω,   (1.21a)

on a certain probability space (Ω, A_0, P) with sample space Ω having elements ω,
see (1.3). For the mathematical representation of the corresponding (conditional)
probability distribution P_{a(·)} = P^A_{a(·)} of the random vector a = a(ω) (given the
time history or information A ⊂ A_0), two main distribution models are taken into
account in practice:
i) Discrete probability distributions,
ii) Continuous probability distributions.
In the first case there is a finite or countably infinite number l_0 ∈ ℕ ∪ {∞} of
realizations or scenarios a^l ∈ R^ν, l = 1, …, l_0,

P(a(ω) = a^l) = α_l, l = 1, …, l_0,   (1.21b)

taken with probabilities α_l, l = 1, …, l_0. In the second case, the probability that the
realization a(ω) = a lies in a certain (measurable) subset B ⊂ R^ν is described by
the multiple integral

P(a(ω) ∈ B) = ∫_B φ(a) da   (1.21c)

with a certain probability density function φ = φ(a) ≥ 0, a ∈ R^ν, ∫_{R^ν} φ(a) da = 1.
The properties of the probability distribution P_{a(·)} may be described—fully or
in part—by certain numerical characteristics, called parameters of P_{a(·)}. These
distribution parameters λ = λ_h are obtained by considering expectations

λ_h := E h(a(ω))   (1.22a)

of some (measurable) functions

(h ∘ a)(ω) := h(a(ω))   (1.22b)

composed of the random vector a = a(ω) with certain (measurable) mappings

h: R^ν → R^{s_h}, s_h ≥ 1.   (1.22c)

According to the type of the probability distribution P_{a(·)} of a = a(ω), the
expectation E h(a(ω)) is defined, cf. [12, 13], by

E h(a(ω)) = Σ_{l=1}^{l_0} h(a^l) α_l in the discrete case (1.21b),
E h(a(ω)) = ∫_{R^ν} h(a) φ(a) da in the continuous case (1.21c).   (1.22d)

Further distribution parameters λ are functions

λ = λ(λ_{h_1}, …, λ_{h_s})   (1.23)

of certain "h-moments" λ_{h_1}, …, λ_{h_s} of the type (1.22a). Important examples of the
type (1.22a), (1.23), resp., are the expectation

ā = E a(ω) (for h(a) := a, a ∈ R^ν)   (1.24a)

and the covariance matrix

Q := E(a(ω) − ā)(a(ω) − ā)^T = E a(ω) a(ω)^T − ā ā^T   (1.24b)

of the random vector a = a(ω).
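As a small numerical illustration of (1.22d) and (1.24a, b), the discrete-case expectation and covariance matrix can be computed directly from the scenarios and their probabilities. The scenario data below are hypothetical, not taken from the text:

```python
import numpy as np

# Hypothetical discrete model (1.21b): l_0 = 3 scenarios a^l with probabilities alpha_l
scenarios = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 4.0]])  # a^1, a^2, a^3 in R^2
alphas = np.array([0.5, 0.3, 0.2])                          # alpha_1, ..., alpha_3

# Expectation a_bar = E a(omega) = sum_l a^l * alpha_l, cf. (1.24a) and (1.22d)
a_bar = alphas @ scenarios

# Covariance matrix Q = E a a^T - a_bar a_bar^T, cf. (1.24b):
# scale each scenario (column of scenarios.T) by its probability, then sum outer products
second_moment = (scenarios.T * alphas) @ scenarios
Q = second_moment - np.outer(a_bar, a_bar)
```

Note that Q comes out symmetric and positive semidefinite, as a covariance matrix must be.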


Due to the stochastic variability of the random vector a(·) of model parameters,
and since the realization a(ω) = a is not available at the decision making stage, the
optimal design problem (1.19a–c) or (1.20a–c) under stochastic uncertainty cannot
be solved directly.
Hence, appropriate deterministic substitute problems must be chosen, taking into
account the randomness of a = a(ω), cf. Sect. 1.2.

1.4 Deterministic Substitute Problems in Optimal Decision/Design

According to Sect. 1.2, a basic deterministic substitute problem in optimal design
under stochastic uncertainty is the minimization of the total expected costs, including
the expected costs of failure:

min c_G E G_0(a(ω), x) + c_f p_f(x)   (1.25a)

s.t.

x ∈ D.   (1.25b)

Here,

p_f = p_f(x) := P(y(a(ω), x) ∉ B)   (1.25c)

is the probability of failure, i.e., the probability that safe functioning of the structure
or system is not guaranteed. Furthermore, c_G is a certain weight factor, and c_f > 0
describes the failure or recourse costs. In the present definition of expected failure
costs, constant costs for each realization a = a(ω) of a(·) are assumed. Obviously,
we have

p_f(x) = 1 − p_s(x)   (1.25d)



with the probability of safety or survival

p_s(x) := P(y(a(ω), x) ∈ B).   (1.25e)

In case (1.18b) we have

p_f(x) = P(y_i(a(ω), x) ≤ 0 for at least one index i, 1 ≤ i ≤ m_y).   (1.25f)

The objective function (1.25a) may be interpreted as the Lagrangian (with given
cost multiplier c_f) of the following reliability based optimization (RBO) problem,
cf. [7, 92, 132, 147, 163]:

min E G_0(a(ω), x)   (1.26a)

s.t.

p_f(x) ≤ α^max   (1.26b)
x ∈ D,   (1.26c)

where α^max > 0 is a prescribed maximum failure probability, e.g. α^max = 0.001, cf.
(1.19a–c).
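In practice p_f(x) in (1.26b) rarely has a closed form and is often estimated by Monte Carlo sampling. The following sketch uses a hypothetical scalar state function y(a, x) = x − a with a ~ N(0, 1), for which the exact value is p_f(x) = 1 − Φ(x); none of this setup comes from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_f_estimate(x, n=200_000):
    """Monte Carlo estimate of p_f(x) = P(y(a(omega), x) <= 0), cf. (1.25f),
    for the hypothetical state function y(a, x) = x - a with a ~ N(0, 1)."""
    a = rng.standard_normal(n)      # n independent realizations of a(omega)
    y = x - a                       # state function values
    return np.mean(y <= 0.0)        # fraction of failed realizations

pf_hat = p_f_estimate(2.0)          # exact value: 1 - Phi(2) = 0.0228 approx.
```

The estimator's standard error decays like 1/sqrt(n), which is why rare-event probabilities (very small α^max) usually require variance reduction techniques on top of plain sampling.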
The "dual" version of (1.26a–c) reads

min p_f(x)   (1.27a)

s.t.

E G_0(a(ω), x) ≤ G^max   (1.27b)
x ∈ D   (1.27c)

with a maximal (upper) cost bound G^max, see (1.20a–c).

1.4.1 Expected Cost or Loss Functions

Further substitute problems are obtained by considering more general expected
failure or recourse cost functions

Γ(x) = E γ(y(a(ω), x))   (1.28a)

arising from weakness or failure of the structure/system, or from false operation.

Fig. 1.1 Loss function γ

Here,

y(a(ω), x) := (y_1(a(ω), x), …, y_{m_y}(a(ω), x))^T   (1.28b)

is again the random vector of state or performance functions, and

γ: R^{m_y} → R^{m_γ}   (1.28c)

is a scalar or vector valued cost or loss function (Fig. 1.1). In case B = (0, +∞)^{m_y}
or B = [0, +∞)^{m_y} it is often assumed that γ = γ(y) is a nonincreasing function,
hence,

γ(y) ≤ γ(z), if y ≥ z,   (1.28d)

where inequalities between vectors are defined component-by-component.

Example 1.1 If γ(y) = 1 for y ∈ B^c (complement of B) and γ(y) = 0 for y ∈ B,
then Γ(x) = p_f(x).
Example 1.2 Suppose that γ = γ(y) is a nonnegative measurable scalar function
on R^{m_y} such that

γ(y) ≥ γ_0 > 0 for all y ∉ B   (1.29a)

with a constant γ_0 > 0. Then for the probability of failure we find the following
upper bound

p_f(x) = P(y(a(ω), x) ∉ B) ≤ (1/γ_0) E γ(y(a(ω), x)),   (1.29b)

where the right hand side of (1.29b) is obviously an expected cost function of type
(1.28a–c). Hence, the condition (1.26b) can be guaranteed by the expected cost
constraint

E γ(y(a(ω), x)) ≤ γ_0 α^max.   (1.29c)

Example 1.3 If the loss function γ(y) is defined by a vector of individual loss
functions γ_i for each state function y_i = y_i(a, x), i = 1, …, m_y, hence,

γ(y) = (γ_1(y_1), …, γ_{m_y}(y_{m_y}))^T,   (1.30a)

then

Γ(x) = (Γ_1(x), …, Γ_{m_y}(x))^T, Γ_i(x) := E γ_i(y_i(a(ω), x)), 1 ≤ i ≤ m_y,   (1.30b)

i.e., the m_y state functions y_i, i = 1, …, m_y, will be treated separately.


Working with the more general expected failure or recourse cost functions Γ =
Γ(x), instead of (1.25a–c), (1.26a–c) and (1.27a–c) we have the related substitute
problems:
I) Expected total cost minimization

min c_G E G_0(a(ω), x) + c_f^T Γ(x),   (1.31a)

s.t.

x ∈ D   (1.31b)

II) Expected primary cost minimization under expected failure or recourse cost
constraints

min E G_0(a(ω), x)   (1.32a)

s.t.

Γ(x) ≤ Γ^max   (1.32b)
x ∈ D,   (1.32c)

III) Expected failure or recourse cost minimization under expected primary cost
constraints

"min" Γ(x)   (1.33a)

s.t.

E G_0(a(ω), x) ≤ G^max   (1.33b)
x ∈ D.   (1.33c)

Here, c_G, c_f are (vectorial) weight coefficients, Γ^max is the vector of upper loss
bounds, and "min" indicates again that Γ(x) may be a vector valued function.

1.5 Basic Properties of Deterministic Substitute Problems

As can be seen from the conversion of an optimization problem with random
parameters into a deterministic substitute problem, cf. Sects. 2.1.1 and 2.1.2, a
central role is played by expectation or mean value functions of the type

Γ(x) = E γ(y(a(ω), x)), x ∈ D_0,   (1.34a)

or, more generally,

Γ(x) = E g(a(ω), x), x ∈ D_0.   (1.34b)

Here, a = a(ω) is a random ν-vector, y = y(a, x) is an m_y-vector valued function
on a certain subset of R^ν × R^r, and γ = γ(z) is a real valued function on a certain
subset of R^{m_y}.
Furthermore, g = g(a, x) denotes a real valued function on a certain subset of
R^ν × R^r. In the following we suppose that the expectation in (1.34a, b) exists and
is finite for all input vectors x lying in an appropriate set D_0 ⊂ R^r, cf. [16].
The following basic properties of the mean value functions are needed again and
again below.

Lemma 1.1 (Convexity) Suppose that x → g(a(ω), x) is convex a.s. (almost
surely) on a fixed convex domain D_0 ⊂ R^r. If E g(a(ω), x) exists and is finite for
each x ∈ D_0, then Γ = Γ(x) is convex on D_0.

Proof This property follows [70, 71, 90] directly from the linearity of the expecta-
tion operator.

If g = g(a, x) is defined by g(a, x) := γ(y(a, x)), see (1.34a), then the above
theorem yields the following result:

Corollary 1.1 Suppose that γ is convex and E γ(y(a(ω), x)) exists and is finite
for each x ∈ D_0. a) If x → y(a(ω), x) is linear a.s., then Γ = Γ(x) is convex. b)
If x → y(a(ω), x) is convex a.s., and γ is a convex, monotonically nondecreasing
function, then Γ = Γ(x) is convex.
It is well known [85] that a convex function is continuous on each open subset of
its domain. A general sufficient condition for the continuity of Γ is given next.

Lemma 1.2 (Continuity) Suppose that E g(a(ω), x) exists and is finite for each
x ∈ D_0, and assume that x → g(a(ω), x) is continuous at x_0 ∈ D_0 a.s. If there is
a function ψ = ψ(a(ω)) having finite expectation such that

|g(a(ω), x)| ≤ ψ(a(ω)) a.s. for all x ∈ U(x_0) ∩ D_0,   (1.35)

where U(x_0) is a neighborhood of x_0, then Γ = Γ(x) is continuous at x_0.

Proof The assertion can be shown by using Lebesgue's dominated convergence
theorem, see e.g. [90].
For the consideration of the differentiability of Γ = Γ(x), let D denote an open
subset of the domain D_0 of Γ.

Lemma 1.3 (Differentiability) Suppose that
i) E g(a(ω), x) exists and is finite for each x ∈ D_0,
ii) x → g(a(ω), x) is differentiable on the open subset D of D_0 a.s., and
iii)

‖∇_x g(a(ω), x)‖ ≤ ψ(a(ω)), x ∈ D, a.s.,   (1.36a)

where ψ = ψ(a(ω)) is a function having finite expectation. Then the expectation
of ∇_x g(a(ω), x) exists and is finite, Γ = Γ(x) is differentiable on D and

∇Γ(x) = ∇_x E g(a(ω), x) = E ∇_x g(a(ω), x), x ∈ D.   (1.36b)

Proof Considering the difference quotients ΔΓ/Δx_k, k = 1, …, r, of Γ at a fixed
point x_0 ∈ D, the assertion follows by means of the mean value theorem, inequality
(1.36a) and Lebesgue's dominated convergence theorem, cf. [70, 71, 90].

Example 1.4 In case (1.34a), under obvious differentiability assumptions concern-
ing γ and y we have ∇_x g(a, x) = ∇_x y(a, x)^T ∇γ(y(a, x)), where ∇_x y(a, x)
denotes the Jacobian of y = y(a, x) with respect to x. Hence, if (1.36b) holds,
then

∇Γ(x) = E ∇_x y(a(ω), x)^T ∇γ(y(a(ω), x)).   (1.36c)

1.6 Approximations of Deterministic Substitute Problems in Optimal Design/Decision

The main problem in solving the deterministic substitute problems defined above
is that the arising probability and expected cost functions p_f = p_f(x), Γ =
Γ(x), x ∈ R^r, are defined by means of multiple integrals over a ν-dimensional
space.
Thus, the substitute problems may be solved, in practice, only by some approx-
imative analytical and numerical methods [44, 70, 90, 100]. In the following we
consider possible approximations for substitute problems based on general expected
recourse cost functions Γ = Γ(x) according to (1.34a) having a real valued convex
loss function γ(z). Note that the probability of failure function p_f = p_f(x) may
be approximated from above, see (1.29a, b), by expected cost functions Γ = Γ(x)
having a nonnegative loss function γ = γ(z) that is bounded from below on the failure
domain B^c. In the following, several basic approximation methods are presented.

1.6.1 Approximation of the Loss Function

Suppose here that γ = γ(y) is a continuously differentiable, convex loss function
on R^{m_y}. Let then

ȳ(x) := E y(a(ω), x) = (E y_1(a(ω), x), …, E y_{m_y}(a(ω), x))^T   (1.37)

denote the expectation of the vector y = y(a(ω), x) of state functions y_i = y_i(a(ω), x),
i = 1, …, m_y.
For an arbitrary continuously differentiable, convex loss function γ we have

γ(y(a(ω), x)) ≥ γ(ȳ(x)) + ∇γ(ȳ(x))^T (y(a(ω), x) − ȳ(x)).   (1.38a)

Thus, taking expectations in (1.38a), we find Jensen's inequality

Γ(x) = E γ(y(a(ω), x)) ≥ γ(ȳ(x)),   (1.38b)

which holds for any convex function γ. Using the mean value theorem, we have

γ(y) = γ(ȳ) + ∇γ(ŷ)^T (y − ȳ),   (1.38c)

where ŷ is a point on the line segment between y and ȳ. By means of (1.38b, c)
we get

0 ≤ Γ(x) − γ(ȳ(x)) ≤ E ‖∇γ(ŷ(a(ω), x))‖ ‖y(a(ω), x) − ȳ(x)‖.   (1.38d)

a) Bounded gradient
If the gradient ∇γ is bounded on the range of y = y(a(ω), x), x ∈ D, i.e.,
if

‖∇γ(y(a(ω), x))‖ ≤ ϑ^max a.s. for each x ∈ D,   (1.39a)

with a constant ϑ^max > 0, then

0 ≤ Γ(x) − γ(ȳ(x)) ≤ ϑ^max E ‖y(a(ω), x) − ȳ(x)‖, x ∈ D.   (1.39b)

Since t → √t, t ≥ 0, is a concave function, we get

0 ≤ Γ(x) − γ(ȳ(x)) ≤ ϑ^max √(q(x)),   (1.39c)

where

q(x) := E ‖y(a(ω), x) − ȳ(x)‖² = tr Q(x)   (1.39d)

is the generalized variance, and

Q(x) := cov(y(a(·), x))   (1.39e)

denotes the covariance matrix of the random vector y = y(a(ω), x). Conse-
quently, the expected loss function Γ(x) can be approximated from above by

Γ(x) ≤ γ(ȳ(x)) + ϑ^max √(q(x)) for x ∈ D.   (1.39f)

b) Bounded eigenvalues of the Hessian
Considering second order expansions of γ, with a vector ỹ on the line segment
between y and ȳ we find

γ(y) − γ(ȳ) = ∇γ(ȳ)^T (y − ȳ) + ½ (y − ȳ)^T ∇²γ(ỹ)(y − ȳ).   (1.40a)
2

Suppose that the eigenvalues of ∇²γ(y) are bounded from below and above
on the range of y = y(a(ω), x) for each x ∈ D, i.e.,

0 < λ_min ≤ λ(∇²γ(y(a(ω), x))) ≤ λ_max < +∞, a.s., x ∈ D   (1.40b)

with constants 0 < λ_min ≤ λ_max. Taking expectations in (1.40a), we get

γ(ȳ(x)) + (λ_min/2) q(x) ≤ Γ(x) ≤ γ(ȳ(x)) + (λ_max/2) q(x), x ∈ D.   (1.40c)

Consequently, using (1.39f) or (1.40c), various approximations for the determin-
istic substitute problems (1.31a, b), (1.32a–c), (1.33a–c) may be obtained.
Based on the above approximations of expected cost functions, we state the
following two approximates to (1.32a–c), (1.33a–c), resp., which are well known
in robust optimal design:
i) Expected primary cost minimization under approximate expected failure or
recourse cost constraints

min E G_0(a(ω), x)   (1.41a)

s.t.

γ(ȳ(x)) + c_0 q(x) ≤ Γ^max   (1.41b)
x ∈ D,   (1.41c)

where c_0 is a scale factor, cf. (1.39f) and (1.40c);
ii) Approximate expected failure or recourse cost minimization under expected
primary cost constraints

min γ(ȳ(x)) + c_0 q(x)   (1.42a)

s.t.

E G_0(a(ω), x) ≤ G^max   (1.42b)
x ∈ D.   (1.42c)

Obviously, by means of (1.41a–c) or (1.42a–c) optimal designs x* are achieved
which

• yield a high mean performance of the structure/structural system


• are minimally sensitive or have a limited sensitivity with respect to random
parameter variations (material, load, manufacturing, process, etc.) and
• cause only limited costs for design, construction, maintenance, etc.
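The bounds (1.38b) and (1.40c) behind these surrogates can be checked numerically. In the hypothetical setting below (linear state function y(a, x) = a·x with scalar a, quadratic loss γ(y) = y², none of it from the text), the Hessian of γ is constant, so both sides of (1.40c) coincide and Γ(x) = γ(ȳ(x)) + q(x) holds exactly, while Jensen's inequality (1.38b) appears as γ(ȳ(x)) ≤ Γ(x):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setting: state y(a, x) = a * x, a ~ N(1, sigma^2), loss gamma(y) = y^2.
sigma, x = 0.3, 2.0
a = 1.0 + sigma * rng.standard_normal(500_000)
y = a * x

phi = np.mean(y**2)            # expected loss Gamma(x) = E gamma(y(a(omega), x))
jensen_lb = np.mean(y)**2      # Jensen lower bound gamma(y_bar(x)), cf. (1.38b)
q = np.mean((y - y.mean())**2) # generalized variance q(x), cf. (1.39d)

# For gamma(y) = y^2 we have lambda_min = lambda_max = 2 in (1.40c),
# so the two-sided bound collapses to an identity: Gamma(x) = gamma(y_bar) + q(x).
two_sided = jensen_lb + q
```

This degenerate case is instructive: the gap in Jensen's inequality is exactly the variance term that the robust-design surrogates (1.41b), (1.42a) penalize.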

1.6.2 Approximation of State (Performance) Functions

The numerical solution is simplified considerably if one can work with one single
state function y = y(a, x). Formally, this is possible by defining the function

y_min(a, x) := min_{1 ≤ i ≤ m_y} y_i(a, x).   (1.43a)

Indeed, according to (1.18b, c) the failure of the structure/system can be
represented by the condition

y_min(a, x) ≤ 0.   (1.43b)

Thus, the weakness or failure of the technical or economic device can be evaluated
numerically by the function

Γ(x) := E γ(y_min(a(ω), x))   (1.43c)

with a nonincreasing loss function γ: R → R_+, see Fig. 1.1.
However, the "min"-operator in (1.43a) yields a nonsmooth function y_min =
y_min(a, x) in general, and the computation of the mean and variance function

ȳ_min(x) := E y_min(a(ω), x)   (1.43d)
σ²_{y_min}(x) := V(y_min(a(·), x))   (1.43e)

by means of Taylor expansion with respect to the model parameter vector a at ā =
E a(ω) is not possible directly, cf. Sect. 1.6.3.
According to the definition (1.43a), an upper bound for ȳ_min(x) is given by

ȳ_min(x) ≤ min_{1 ≤ i ≤ m_y} ȳ_i(x) = min_{1 ≤ i ≤ m_y} E y_i(a(ω), x).

Further approximations of y_min(a, x) and its moments can be found by using the
representation

min(a, b) = ½ (a + b − |a − b|)

of the minimum of two numbers a, b ∈ R. For example, for an even index m_y we
have

y_min(a, x) = min_{i=1,3,…,m_y−1} min(y_i(a, x), y_{i+1}(a, x))
            = min_{i=1,3,…,m_y−1} ½ (y_i(a, x) + y_{i+1}(a, x) − |y_i(a, x) − y_{i+1}(a, x)|).

In many cases we may suppose that the state (performance) functions y_i =
y_i(a, x), i = 1, …, m_y, are bounded from below, hence,

y_i(a, x) > −A, i = 1, …, m_y,

for all (a, x) under consideration with a positive constant A > 0. Thus, defining

ỹ_i(a, x) := y_i(a, x) + A, i = 1, …, m_y,

and therefore

ỹ_min(a, x) := min_{1 ≤ i ≤ m_y} ỹ_i(a, x) = y_min(a, x) + A,

we have

y_min(a, x) ≤ 0 if and only if ỹ_min(a, x) ≤ A.

Hence, the survival/failure of the system or structure can be studied also by means
of the positive function ỹ_min = ỹ_min(a, x). Using now the theory of power or Hölder
means [25], the minimum ỹ_min(a, x) of positive functions can be represented also
by the limit

ỹ_min(a, x) = lim_{κ → −∞} ( (1/m_y) Σ_{i=1}^{m_y} ỹ_i(a, x)^κ )^{1/κ}

of the decreasing family of power means M^[κ](ỹ) := ( (1/m_y) Σ_{i=1}^{m_y} ỹ_i^κ )^{1/κ}, κ < 0.
Consequently, for each fixed p > 0 we also have

ỹ_min(a, x)^p = lim_{κ → −∞} ( (1/m_y) Σ_{i=1}^{m_y} ỹ_i(a, x)^κ )^{p/κ}.

Assuming that the expectation E (M^[κ_0](ỹ(a(ω), x)))^p exists for an exponent κ =
κ_0 < 0, by means of Lebesgue's bounded convergence theorem we get the moment
representation

E ỹ_min(a(ω), x)^p = lim_{κ → −∞} E ( (1/m_y) Σ_{i=1}^{m_y} ỹ_i(a(ω), x)^κ )^{p/κ}.

Since t → t^{p/κ}, t > 0, is convex for each fixed p > 0 and κ < 0, by Jensen's
inequality we have the lower moment bound

E ỹ_min(a(ω), x)^p ≥ lim_{κ → −∞} ( (1/m_y) Σ_{i=1}^{m_y} E ỹ_i(a(ω), x)^κ )^{p/κ}.

Hence, for the p-th order moment of ỹ_min(a(·), x) we get the approximations

E ( (1/m_y) Σ_{i=1}^{m_y} ỹ_i(a(ω), x)^κ )^{p/κ} ≈ ( (1/m_y) Σ_{i=1}^{m_y} E ỹ_i(a(ω), x)^κ )^{p/κ}

for some κ < 0.
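The convergence of the power means M^[κ] to the minimum as κ → −∞ is easy to observe numerically; the values below are hypothetical shifted state values ỹ_i > 0, chosen only for illustration:

```python
import numpy as np

def power_mean(y, kappa):
    """Hoelder/power mean M^[kappa](y) = ((1/m) * sum_i y_i^kappa)^(1/kappa), y_i > 0."""
    y = np.asarray(y, dtype=float)
    return (np.mean(y**kappa))**(1.0 / kappa)

y_tilde = np.array([1.5, 2.0, 4.0])   # hypothetical shifted state values y_i + A > 0

# For kappa < 0 the power means decrease monotonically toward min_i y_i = 1.5.
approx = [power_mean(y_tilde, k) for k in (-1, -5, -20, -100)]
```

For moderate negative κ the power mean is a smooth, differentiable surrogate for the nonsmooth minimum in (1.43a), which is the practical appeal of the representation.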


Using regression techniques, Response Surface Methods (RSM), etc., for given
vector x, the function a ! y min .a; x/ can be approximated [17, 24, 66, 78, 144]
by functions yQ D y.a;
Q x/ being sufficiently smooth with respect to the parameter
vector a.
In many important cases, for each i D 1; : : : ; my , the state functions

.a; x/ ! yi .a; x/

are bilinear functions. Thus, in this case y min D y min .a; x/ is a piecewise linear
function with respect to a. Fitting a linear or quadratic Response Surface Model
[22, 23, 115, 116]

Q x/ WD c.x/ C q.x/T .a  a/ C .a  a/T Q.x/.a  a/


y.a; (1.43f)

to a ! y min .a; x/, after the selection of appropriate reference points

a.j / WD a C da.j / ; j D 1; : : : ; p; (1.43g)

.j /
with “design” points da 2 R ; j D 1; : : : ; p, the unknown coefficients c D
c.x/; q D q.x/ and Q D Q.x/ are obtained by minimizing the mean square error

X
p
 2
.c; q; Q/ WD Q .j / ; x/  y min .a.j / ; x/
y.a (1.43h)
j D1

Fig. 1.2 Approximation ỹ(a, x) of y_min(a, x) for given x

with respect to (c, q, Q) (Fig. 1.2). Since the model (1.43f) depends linearly on the
function parameters (c, q, Q), explicit formulas for the optimal coefficients

c* = c*(x), q* = q*(x), Q* = Q*(x)   (1.43i)

are obtained from this least squares estimation method, cf. [100].
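A minimal sketch of the least squares fit (1.43f)–(1.43i) for ν = 1: a quadratic response surface is fitted to a hypothetical piecewise linear y_min(a, x) at design points around ā; since the model is linear in (c, q, Q), ordinary least squares yields the coefficients directly. The state functions and design points below are invented for illustration only:

```python
import numpy as np

# Hypothetical piecewise-linear state minimum y_min(a) = min(3 - a, 1 + a), nu = 1.
def y_min(a):
    return np.minimum(3.0 - a, 1.0 + a)

a_bar = 1.0
da = np.linspace(-2.0, 2.0, 9)          # design points da^(j), cf. (1.43g)
a_pts = a_bar + da                      # reference points a^(j) = a_bar + da^(j)

# Quadratic response surface model (1.43f): c + q*(a - a_bar) + Q*(a - a_bar)^2.
# The model is linear in (c, q, Q), so the mean square error (1.43h) is
# minimized by ordinary least squares.
X = np.column_stack([np.ones_like(da), da, da**2])
coef, *_ = np.linalg.lstsq(X, y_min(a_pts), rcond=None)
c_star, q_star, Q_star = coef           # least squares estimates, cf. (1.43i)
```

Because the chosen y_min has its kink at da = 0 and the design is symmetric, the fitted linear coefficient vanishes and the quadratic term is negative, mimicking the concave cap in Fig. 1.2.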

Approximation of Expected Loss Functions

Corresponding to the approximation (1.43f) of y_min = y_min(a, x), using again least
squares techniques, a mean value function Γ(x) = E γ(y(a(ω), x)), cf. (1.28a),
can be approximated at a given point x_0 ∈ R^r by a linear or quadratic Response
Surface Function

Γ̃(x) := β_0 + β_I^T (x − x_0) + (x − x_0)^T B(x − x_0),   (1.43j)

with scalar, vector and matrix parameters β_0, β_I, B. In this case estimates Γ^(i) =
Γ̂^(i) of Γ(x) are needed at some reference points x^(i) = x_0 + d^(i), i = 1, …, p.
Details are given in [100].

1.6.3 Taylor Expansion Methods

As can be seen above, cf. (1.34a, b), in the objective and/or in the constraints
of substitute problems for optimization problems with random data, mean value
functions of the type

Γ(x) := E g(a(ω), x)

occur. Here, g = g(a, x) is a real valued function on a subset of R^ν × R^r, and
a = a(ω) is a ν-random vector.

(Complete) Expansion with Respect to a

Suppose that on its domain the function g = g(a, x) has partial derivatives
∇_a^l g(a, x), l = 0, 1, …, l_g + 1, up to order l_g + 1. Note that the gradient ∇_a g(a, x)
contains the so-called sensitivities ∂g/∂a_j (a, x), j = 1, …, ν, of g with respect to the
parameter vector a at (a, x). In the same way, the higher order partial derivatives
∇_a^l g(a, x), l > 1, represent the higher order sensitivities of g with respect to a at
(a, x). Taylor expansion of g = g(a, x) with respect to a at ā := E a(ω) yields

g(a, x) = Σ_{l=0}^{l_g} (1/l!) ∇_a^l g(ā, x) · (a − ā)^l + (1/(l_g + 1)!) ∇_a^{l_g+1} g(â, x) · (a − ā)^{l_g+1},   (1.44a)

where â := ā + ϑ(a − ā), 0 < ϑ < 1, and (a − ā)^l denotes the system of l-th order
products

Π_{j=1}^{ν} (a_j − ā_j)^{l_j}

with l_j ∈ ℕ ∪ {0}, j = 1, …, ν, l_1 + l_2 + … + l_ν = l. If g = g(a, x) is defined
by

g(a, x) := γ(y(a, x)),

see (1.34a), then the partial derivatives ∇_a^l g of g up to the second order read:

∇_a g(a, x) = ∇_a y(a, x)^T ∇γ(y(a, x))   (1.44b)
∇_a² g(a, x) = ∇_a y(a, x)^T ∇²γ(y(a, x)) ∇_a y(a, x) + ∇γ(y(a, x)) · ∇_a² y(a, x),   (1.44c)

where

∇γ^T · ∇_a² y := ( ∇γ · ∂²y/∂a_k ∂a_l )_{k,l=1,…,ν}.   (1.44d)

Taking expectations in (1.44a), Γ(x) can be approximated, cf. Sect. 1.6.1, by

Γ̃(x) := g(ā, x) + Σ_{l=2}^{l_g} (1/l!) ∇_a^l g(ā, x) · E(a(ω) − ā)^l,   (1.45a)

where E(a(ω) − ā)^l denotes the system of mixed l-th central moments of the
random vector a(ω) = (a_1(ω), …, a_ν(ω))^T. Assuming that the domain of g =
g(a, x) is convex with respect to a, we get the error estimate

|Γ(x) − Γ̃(x)| ≤ (1/(l_g + 1)!) E ( sup_{0≤ϑ≤1} ‖∇_a^{l_g+1} g(ā + ϑ(a(ω) − ā), x)‖ · ‖a(ω) − ā‖^{l_g+1} ).   (1.45b)

In many practical cases the random parameter ν-vector a = a(ω) has a convex,
bounded support, and ∇_a^{l_g+1} g is continuous. Then the L∞-norm

r(x) := (1/(l_g + 1)!) ess sup ‖∇_a^{l_g+1} g(a(ω), x)‖   (1.45c)

is finite for all x under consideration, and (1.45b, c) yield the error bound

|Γ(x) − Γ̃(x)| ≤ r(x) E ‖a(ω) − ā‖^{l_g+1}.   (1.45d)

Remark 1.1 The above described method can be extended to the case of vector
valued loss functions γ(z) = (γ_1(z), …, γ_{m_γ}(z))^T.
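A second order instance of the Taylor approximation (1.45a) for a hypothetical scalar example (ν = 1, g(a, x) = e^{ax}, a ~ N(0, σ²), all assumed for illustration): since E(a(ω) − ā) = 0, the first order term drops out and Γ̃(x) = g(ā, x) + ½ (∂²g/∂a²)(ā, x) σ². A Monte Carlo estimate of Γ(x) serves as a check:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical g(a, x) = exp(a * x), a ~ N(a_bar, sigma^2) with a_bar = 0, nu = 1.
x, sigma = 0.5, 0.2
a = sigma * rng.standard_normal(400_000)

phi_mc = np.mean(np.exp(a * x))        # Monte Carlo estimate of Gamma(x) = E g(a(omega), x)

# Second order Taylor approximation (1.45a), l_g = 2: the first order term
# vanishes since E(a(omega) - a_bar) = 0.
g0 = np.exp(0.0 * x)                   # g(a_bar, x) = 1
g2 = x**2 * np.exp(0.0 * x)            # d^2 g / d a^2 evaluated at a_bar
phi_taylor = g0 + 0.5 * g2 * sigma**2
```

Here the exact value is exp(x²σ²/2), so the second order approximation is accurate to O(σ⁴), consistent with the error bound (1.45d).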

Inner (Partial) Expansions with Respect to a

In generalization of (1.34a), in many cases Γ(x) is defined by

Γ(x) = E γ(a(ω), y(a(ω), x)),   (1.46a)

hence, the loss function γ = γ(a, y) depends also explicitly on the parameter vector
a. This may occur e.g. in case of randomly varying cost factors.
Linearizing now the vector function y = y(a, x) with respect to a at ā, thus,

y(a, x) ≈ y^(1)(a, x) := y(ā, x) + ∇_a y(ā, x)(a − ā),   (1.46b)

the mean value function Γ(x) is approximated by

Γ̃(x) := E γ(a(ω), y(ā, x) + ∇_a y(ā, x)(a(ω) − ā)).   (1.46c)

This approximation is very advantageous in case the cost function γ = γ(a, y)
is a quadratic function in y. In case of a cost function γ = γ(a, y) being linear in
the vector y, also quadratic expansions of y = y(a, x) with respect to a may be
taken into account.
Corresponding to (1.37), (1.39e), define

ȳ^(1)(x) := E y^(1)(a(ω), x) = y(ā, x)   (1.46d)
Q^(1)(x) := cov(y^(1)(a(·), x)) = ∇_a y(ā, x) cov(a(·)) ∇_a y(ā, x)^T.   (1.46e)

In case of convex loss functions γ, approximates of Γ̃ and the corresponding
substitute problems based on Γ̃ may be obtained now by applying the methods
described in Sect. 1.6.1. Explicit representations for Γ̃ are obtained in case of
quadratic loss functions γ.
Error estimates can be derived easily for Lipschitz(L)-continuous or convex loss
functions γ. In case of a Lipschitz-continuous loss function γ(a, ·) with Lipschitz
constant L = L(a) > 0, e.g. for sublinear [90, 91] loss functions, using (1.46d) we
have

|Γ(x) − Γ̃(x)| ≤ L_0 E ‖y(a(ω), x) − y^(1)(a(ω), x)‖,   (1.46f)

provided that L_0 denotes a finite upper bound of the L-constants L = L(a).
Applying the mean value theorem [37], under appropriate 2nd order differentia-
bility assumptions, for the right hand side of (1.46f) we find the following stochastic
version of the mean value theorem:

E ‖y(a(ω), x) − y^(1)(a(ω), x)‖ ≤ E ( ‖a(ω) − ā‖² sup_{0≤ϑ≤1} ‖∇_a² y(ā + ϑ(a(ω) − ā), x)‖ ).   (1.46g)

1.7 Approximation of Probabilities: Probability Inequalities

In reliability analysis of engineering/economic structures or systems, a main
problem is the computation of probabilities

P( ∪_{i=1}^{N} V_i ) := P( a(ω) ∈ ∪_{i=1}^{N} V_i )   (1.47a)

or

P( ∩_{j=1}^{N} S_j ) := P( a(ω) ∈ ∩_{j=1}^{N} S_j )   (1.47b)

of unions and intersections of certain failure/survival domains (events) V_j, S_j,
j = 1, …, N. These domains (events) arise from the representation of the structure
or system by a combination of certain series and/or parallel substructures/systems.
Due to the high complexity of the basic physical relations, several approximation
techniques are needed for the evaluation of (1.47a, b).

1.7.1 Bonferroni-Type Inequalities

In the following, V_1, V_2, …, V_N denote arbitrary (Borel-)measurable subsets of the
parameter space R^ν, and the abbreviation

P(V) := P(a(ω) ∈ V)   (1.47c)

is used for any measurable subset V of R^ν.
Starting from the representation of the probability of a union of N events,

P( ∪_{j=1}^{N} V_j ) = Σ_{k=1}^{N} (−1)^{k−1} s_{k,N},   (1.48a)

where

s_{k,N} := Σ_{1≤i_1<i_2<…<i_k≤N} P( ∩_{l=1}^{k} V_{i_l} ),   (1.48b)

we obtain [48] the well known basic Bonferroni bounds

P( ∪_{j=1}^{N} V_j ) ≤ Σ_{k=1}^{κ} (−1)^{k−1} s_{k,N} for κ ≥ 1, κ odd   (1.48c)
P( ∪_{j=1}^{N} V_j ) ≥ Σ_{k=1}^{κ} (−1)^{k−1} s_{k,N} for κ ≥ 1, κ even.   (1.48d)

Besides (1.48c, d), a large number of related bounds of different complexity are
available, cf. [48, 166]. Important bounds of first and second degree are given below:

max_{1≤j≤N} q_j ≤ P( ∪_{j=1}^{N} V_j ) ≤ Q_1   (1.49a)
Q_1 − Q_2 ≤ P( ∪_{j=1}^{N} V_j ) ≤ Q_1 − max_{1≤l≤N} Σ_{i≠l} q_{il}   (1.49b)
Q_1² / (Q_1 + 2Q_2) ≤ P( ∪_{j=1}^{N} V_j ) ≤ Q_1.   (1.49c)

The above quantities q_j, q_{ij}, Q_1, Q_2 are defined as follows:

Q_1 := Σ_{j=1}^{N} q_j with q_j := P(V_j)   (1.49d)
Q_2 := Σ_{j=2}^{N} Σ_{i=1}^{j−1} q_{ij} with q_{ij} := P(V_i ∩ V_j).   (1.49e)

Moreover, defining

q := (q_1, …, q_N)^T, Q := (q_{ij})_{1≤i,j≤N},   (1.49f)

we have

P( ∪_{j=1}^{N} V_j ) ≥ q^T Q^− q,   (1.49g)

where Q^− denotes the generalized inverse of Q, cf. [166].
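The first and second degree bounds (1.49a, b) can be checked on a small hypothetical discrete model where every probability is computable exactly (the sample space and events below are invented for illustration):

```python
import numpy as np
from itertools import combinations

# Hypothetical discrete model: 4 equally likely outcomes, three events V_1, V_2, V_3
# given as index sets over the sample space {0, 1, 2, 3}.
prob = np.full(4, 0.25)
V = [{0, 1}, {1, 2}, {2, 3}]

def P(event):
    return sum(prob[i] for i in event)

q = [P(v) for v in V]                                            # q_j = P(V_j), cf. (1.49d)
Q1 = sum(q)                                                      # first degree sum
Q2 = sum(P(V[i] & V[j]) for i, j in combinations(range(3), 2))   # pairwise sum, cf. (1.49e)

p_union = P(set().union(*V))     # exact P(V_1 u V_2 u V_3)
lower, upper = Q1 - Q2, Q1       # second degree lower / first degree upper bound
```

Note that the upper bound Q_1 may exceed 1 (here Q_1 = 1.5), so it is only informative when the events are nearly disjoint; the second degree lower bound Q_1 − Q_2 is exact in this example.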



1.7.2 Tschebyscheff-Type Inequalities

In many cases the survival or feasible domain (event) S = ∩_{i=1}^{m} S_i is represented by
a certain number m of inequality constraints of the type

y_{li} < (≤) y_i(a, x) < (≤) y_{ui}, i = 1, …, m,   (1.50a)

e.g. operating conditions or behavioral constraints. Hence, for a fixed input, design
or control vector x, the event S = S(x) is given by

S := {a ∈ R^ν : y_{li} < (≤) y_i(a, x) < (≤) y_{ui}, i = 1, …, m}.   (1.50b)

Here,

y_i = y_i(a, x), i = 1, …, m,   (1.50c)

are certain functions, e.g. response, output or performance functions of the structure
or system, defined on (a subset of) R^ν × R^r.
Moreover, y_{li} < y_{ui}, i = 1, …, m, are lower and upper bounds for the variables
y_i, i = 1, …, m. In case of one-sided constraints some bounds y_{li}, y_{ui} are infinite.

Two-Sided Constraints

If y_{li} < y_{ui}, i = 1, …, m, are finite bounds, (1.50a) can be represented by

|y_i(a, x) − y_{ic}| < (≤) Δy_i, i = 1, …, m,   (1.50d)

where the quantities y_{ic}, Δy_i, i = 1, …, m, are defined by

y_{ic} := (y_{li} + y_{ui})/2, Δy_i := (y_{ui} − y_{li})/2.   (1.50e)

Consequently, for the probability P(S) of the event S, defined by (1.50b), we have

P(S) = P( |y_i(a(ω), x) − y_{ic}| < (≤) Δy_i, i = 1, …, m ).   (1.50f)

Introducing the random variables

ỹ_i(a(ω), x) := (y_i(a(ω), x) − y_{ic}) / Δy_i, i = 1, …, m,   (1.51a)

Fig. 1.3 Function φ = φ(y)

and the set

B := {y ∈ R^m : |y_i| < (≤) 1, i = 1, …, m},   (1.51b)

with ỹ = (ỹ_i)_{1≤i≤m}, we get

P(S) = P( ỹ(a(ω), x) ∈ B ).   (1.51c)

Considering any (measurable) function φ: R^m → R such that

i) φ(y) ≥ 0, y ∈ R^m   (1.51d)
ii) φ(y) ≥ φ_0 > 0, if y ∉ B,   (1.51e)

with a positive constant φ_0 (Fig. 1.3), we find the following result:

Theorem 1.1 For any (measurable) function φ: R^m → R fulfilling conditions
(1.51d, e), the following Tschebyscheff-type inequality holds:

P( y_{li} < (≤) y_i(a(ω), x) < (≤) y_{ui}, i = 1, …, m )
≥ 1 − (1/φ_0) E φ( ỹ(a(ω), x) ),   (1.52)

provided that the expectation in (1.52) exists and is finite.



Proof If P_{ỹ(a(·),x)} denotes the probability distribution of the random m-vector ỹ =
ỹ(a(ω), x), then

E φ( ỹ(a(ω), x) ) = ∫_{y∈B} φ(y) P_{ỹ(a(·),x)}(dy) + ∫_{y∈B^c} φ(y) P_{ỹ(a(·),x)}(dy)
≥ ∫_{y∈B^c} φ(y) P_{ỹ(a(·),x)}(dy) ≥ φ_0 ∫_{y∈B^c} P_{ỹ(a(·),x)}(dy)
= φ_0 P( ỹ(a(ω), x) ∉ B ) = φ_0 ( 1 − P( ỹ(a(ω), x) ∈ B ) ),

which yields the assertion, cf. (1.51c).

Remark 1.2 Note that P(S) ≥ α_s with a given minimum reliability α_s ∈ (0, 1] can
be guaranteed by the expected cost constraint

E φ( ỹ(a(ω), x) ) ≤ (1 − α_s) φ_0.

Example 1.5 If φ = 1_{B^c} is the indicator function of the complement B^c of B, then
φ_0 = 1 and (1.52) holds with the equality sign.
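For m = 1 and φ(y) = y² with φ_0 = 1, Theorem 1.1 reduces to the classical Tschebyscheff inequality P(|ỹ| < 1) ≥ 1 − E ỹ². A quick Monte Carlo check on a hypothetical normalized state variable (the distribution is assumed, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical scalar case m = 1, phi(y) = y^2, phi_0 = 1 on {|y| >= 1}:
# Theorem 1.1 gives the classical bound P(|y~| < 1) >= 1 - E y~^2.
y_tilde = 0.5 * rng.standard_normal(300_000)   # normalized state (1.51a), E y~^2 = 0.25

p_safe = np.mean(np.abs(y_tilde) < 1.0)        # left hand side of (1.52)
bound = 1.0 - np.mean(y_tilde**2)              # right hand side of (1.52)
```

For this normal case the true value is about 0.954 against the bound 0.75, which illustrates that Tschebyscheff-type bounds are conservative but distribution-free.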
Example 1.6 For a given positive definite m × m matrix C, define φ(y) :=
y^T C y, y ∈ R^m. Then, cf. (1.51b, d, e),

φ_0 = min_{y ∉ B} φ(y) = min_{1≤i≤m} min{ min_{y_i ≤ −1} y^T C y, min_{y_i ≥ 1} y^T C y }.   (1.53a)

Thus, the lower bound φ_0 follows by considering the convex optimization
problems arising on the right hand side of (1.53a). Moreover, the expectation E φ(ỹ)
needed in (1.52) is given, see (1.51a), by

E φ(ỹ) = E ỹ^T C ỹ = E tr C ỹ ỹ^T
= tr C (diag Δy)^{-1} ( cov(y(a(·), x)) + (ȳ(x) − y_c)(ȳ(x) − y_c)^T ) (diag Δy)^{-1},   (1.53b)

where "tr" denotes the trace of a matrix, diag Δy is the diagonal matrix diag Δy :=
(Δy_i δ_ij), y_c := (y_{ic}), see (1.50e), and ȳ = ȳ(x) := (E y_i(a(ω), x)). Since ‖Q‖ ≤
tr Q ≤ m ‖Q‖ for any positive definite m × m matrix Q, an upper bound of E φ(ỹ)

reads

E φ(ỹ) ≤ m ‖C‖ ‖(diag Δy)^{-1}‖² ‖cov(y(a(·), x)) + (ȳ(x) − y_c)(ȳ(x) − y_c)^T‖.   (1.53c)

Example 1.7 Assuming in the above Example 1.6 that C = diag(c_ii) is a diagonal
matrix with positive elements c_ii > 0, i = 1, …, m, then

φ_0 = min_{y ∉ B} φ(y) = min_{1≤i≤m} c_ii > 0,   (1.53d)

and E φ(ỹ) is given by

E φ(ỹ) = Σ_{i=1}^{m} c_ii E( y_i(a(ω), x) − y_{ic} )² / Δy_i²
= Σ_{i=1}^{m} c_ii ( σ²_{y_i}(a(·), x) + (ȳ_i(x) − y_{ic})² ) / Δy_i².   (1.53e)

One-Sided Inequalities

Suppose that exactly one of the two bounds y_{li} < y_{ui} is infinite for each
i = 1, …, m. Multiplying the corresponding constraints in (1.50a) by −1, the
admissible domain S = S(x), cf. (1.50b), can always be represented by

S(x) = {a ∈ R^ν : ỹ_i(a, x) < (≤) 0, i = 1, …, m},   (1.54a)

where ỹ_i := y_i − y_{ui}, if y_{li} = −∞, and ỹ_i := y_{li} − y_i, if y_{ui} = +∞. If we set
ỹ(a, x) := (ỹ_i(a, x)) and

B̃ := {y ∈ R^m : y_i < (≤) 0, i = 1, …, m},   (1.54b)

then

P(S(x)) = P( ỹ(a(ω), x) ∈ B̃ ).   (1.54c)

Consider also in this case, cf. (1.51d, e), a function φ: R^m → R such that

i) φ(y) ≥ 0, y ∈ R^m   (1.55a)
ii) φ(y) ≥ φ_0 > 0, if y ∉ B̃.   (1.55b)

Then, corresponding to Theorem 1.1, we have this result:

Theorem 1.2 (Markov-Type Inequality) If φ: R^m → R is any (measurable)
function fulfilling conditions (1.55a, b), then

  P(ỹ(a(ω), x) < (≤) 0) ≥ 1 − (1/φ_0) E φ(ỹ(a(ω), x)),   (1.56)

provided that the expectation in (1.56) exists and is finite.


Remark 1.3 Note that a related inequality was already used in Example 1.2.

Example 1.8 If φ(y) := Σ_{i=1}^m w_i e^{α_i y_i} with positive constants w_i, α_i, i = 1, …, m,
then

  inf_{y ∉ B̃} φ(y) = min_{1 ≤ i ≤ m} w_i > 0   (1.57a)

and

  E φ(ỹ(a(ω), x)) = Σ_{i=1}^m w_i E e^{α_i ỹ_i(a(ω), x)},   (1.57b)

where the expectation in (1.57b) can be computed approximately by Taylor
expansion:

  E e^{α_i ỹ_i} = e^{α_i E ỹ_i} E e^{α_i (ỹ_i − E ỹ_i)}
              ≈ e^{α_i E ỹ_i} (1 + (α_i²/2) E(y_i(a(ω), x) − ȳ_i(x))²)
              = e^{α_i E ỹ_i} (1 + (α_i²/2) σ²_{y_i(a(·),x)}).   (1.57c)

Supposing that y_i = y_i(a(ω), x) is a normally distributed random variable, then

  E e^{α_i ỹ_i} = e^{α_i E ỹ_i} e^{(1/2) α_i² σ²_{y_i(a(·),x)}}.   (1.57d)
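Formula (1.57d) is the standard lognormal-mean identity and can be verified against sampling. The sketch below uses a hypothetical one-dimensional example (only an upper bound y_u is active, so ỹ = y − y_u) with Gaussian y; it computes Eφ exactly via (1.57d), cross-checks it by Monte Carlo, and evaluates the one-sided bound (1.56):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-d illustration: y ~ N(mu, sd^2), only the upper bound y_u
# is finite, so y~ := y - y_u and B~ = {y~ < 0}.
mu, sd, y_u = 0.0, 0.3, 1.0
w, alpha = 1.0, 2.0                     # weights w_i, alpha_i of (1.57a)

mu_t = mu - y_u                         # mean of y~ (here: -1.0)
# Exact expectation (1.57d) in the normal case:
E_phi = w * np.exp(alpha * mu_t + 0.5 * alpha**2 * sd**2)
bound = 1.0 - E_phi / w                 # (1.56) with phi_0 = min_i w_i = w

# Monte Carlo reference values:
yt = rng.normal(mu_t, sd, 200_000)
E_phi_mc = np.mean(w * np.exp(alpha * yt))
p_mc = np.mean(yt < 0.0)
print(f"E phi: exact {E_phi:.4f}, MC {E_phi_mc:.4f}; bound {bound:.4f} <= P {p_mc:.4f}")
```

Larger α_i sharpens the bound for well-separated bounds but inflates Eφ when the variance is large, which is the usual trade-off when tuning the exponential weights.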

Example 1.9 Consider φ(y) := (y − b)^T C (y − b), where, cf. Example 1.6, C is
a positive definite m × m matrix and b < 0 a fixed m-vector. In this case we again
have

  min_{y ∉ B̃} φ(y) = min_{1 ≤ i ≤ m} min_{y_i ≥ 0} φ(y)

and

  E φ(ỹ) = E (ỹ(a(ω), x) − b)^T C (ỹ(a(ω), x) − b)
         = tr C E (ỹ(a(ω), x) − b)(ỹ(a(ω), x) − b)^T.

Note that

  ỹ_i(a, x) − b_i = y_i(a, x) − (y_ui + b_i),  if y_li = −∞,
  ỹ_i(a, x) − b_i = (y_li − b_i) − y_i(a, x),  if y_ui = +∞,

where y_ui + b_i < y_ui and y_li < y_li − b_i.


Remark 1.4 The one-sided case can also be reduced approximately to the two-sided
case by selecting a sufficiently large, but finite upper bound ỹ_ui ∈ R, lower
bound ỹ_li ∈ R, resp., if y_ui = +∞, y_li = −∞.
Chapter 2
Optimal Control Under Stochastic Uncertainty

2.1 Stochastic Control Systems

Optimal control and regulator problems arising in many concrete applications
(mechanical, electrical, thermodynamic, chemical, etc.) are modeled [5, 63,
136, 155] by dynamical control systems obtained from physical measurements
and/or known physical laws. The basic control system (input–output system) is
mathematically represented [73, 159] by a system of first order random differential
equations:

  ż(t) = g(t, ω, z(t), u(t)), t_0 ≤ t ≤ t_f, ω ∈ Ω,   (2.1a)

  z(t_0) = z_0(ω).   (2.1b)

Here, ω is the basic random element taking values in a probability space (Ω, A, P)
and describing the present random variations of model parameters or the influence
of noise terms. The probability space (Ω, A, P) consists of the sample space
or set of elementary events Ω, the σ-algebra A of events and the probability
measure P. The plant state vector z = z(t, ω) is an m-vector involving directly
or indirectly measurable/observable quantities like displacements, stresses, voltage,
current, pressure, concentrations, etc., and their time derivatives (velocities); z_0(ω)
is the random initial state. The plant control or control input u(t) is a deterministic
or stochastic n-vector denoting system inputs like external forces or moments,
voltages, field current, thrust program, fuel consumption, production rate, etc.
Furthermore, ż denotes the derivative with respect to the time t. We assume that
an input u = u(t) is chosen such that u(·) ∈ U, where U is a suitable linear
space of input functions u(·): [t_0, t_f] → R^n on the time interval [t_0, t_f]. Examples
for U are subspaces of the space PC_0^n[t_0, t_f] of piecewise continuous functions
© Springer-Verlag Berlin Heidelberg 2015 37


K. Marti, Stochastic Optimization Methods, DOI 10.1007/978-3-662-46214-0_2

u(·): [t_0, t_f] → R^n, normed by the supremum norm

  ‖u(·)‖_∞ = sup { ‖u(t)‖ : t_0 ≤ t ≤ t_f }.

Note that a function on a closed, bounded interval is called piecewise continuous
if it is continuous up to at most a finite number of points, where the one-sided
limits of the function exist. Other important examples for U are the Banach
spaces of integrable, essentially bounded measurable or regulated [37] functions
L_p^n([t_0, t_f], B^1, λ^1), p ≥ 1, L_∞^n([t_0, t_f], B^1, λ^1), Reg([t_0, t_f]; R^n), resp., on [t_0, t_f].
Here, ([t_0, t_f], B^1, λ^1) denotes the measure space on [t_0, t_f] with the σ-algebra B^1
of Borel sets and the Lebesgue measure λ^1 on [t_0, t_f]. Obviously, PC_0^n[t_0, t_f] ⊂
L_∞^n([t_0, t_f], B^1, λ^1). If u = u(t, ω), t_0 ≤ t ≤ t_f, is a random input function,
then correspondingly we suppose that u(·, ω) ∈ U a.s. (almost sure or with
probability 1). Moreover, we suppose that the function g = g(t, ω, z, u) of the
plant differential equation (2.1a) and its partial derivatives (Jacobians) D_z g, D_u g
with respect to z and u are at least measurable on the space [t_0, t_f] × Ω × R^m × R^n.
The possible trajectories of the plant, hence absolutely continuous [118] m-vector
functions, are contained in the linear space Z = C_0^m[t_0, t_f] of continuous
functions z(·): [t_0, t_f] → R^m on [t_0, t_f]. The space Z contains the set PC_1^m[t_0, t_f]
of continuous, piecewise differentiable functions on the interval [t_0, t_f]. A function
on a closed, bounded interval is called piecewise differentiable if the function is
differentiable up to at most a finite number of points, where the function and
its derivative have existing one-sided limits. The space Z is also normed by the
supremum norm. D (⊆ U) denotes the convex set of admissible controls u(·),
defined e.g. by some box constraints. Using the available information A_t up to a
certain time t, the problem is then to find an optimal control function u = u*(t)
being most insensitive with respect to random parameter variations. This can be
obtained by minimizing the total (conditional) expected costs arising along the
trajectory z = z(t) and/or at the terminal state z_f = z(t_f) subject to the plant
differential equation (2.1a, b) and the required control and state constraints. Optimal
controls being most insensitive with respect to random parameter variations are also
called robust controls. Such controls can be obtained by stochastic optimization
methods [100].
Since feedback (FB) control laws can be approximated very efficiently, cf. [2,
80, 136], by means of open-loop feedback (OLF) control laws, see Sect. 2.2, for
practical purposes we may confine ourselves to the computation of deterministic,
stochastic optimal open-loop (OL) controls u = u(·; t_b), t_b ≤ t ≤ t_f, on arbitrary
"remaining time intervals" [t_b, t_f] of [t_0, t_f]. Here, u = u(·; t_b) is stochastic optimal
with respect to the information A_{t_b} at the "initial" time point t_b.
2.1 Stochastic Control Systems 39

2.1.1 Random Differential and Integral Equations

In many technical applications the random variations are not caused by an additive
white noise term, but by possibly time-dependent random parameters.
Hence, in the following the dynamics of the control system is represented by a random
differential equation, i.e. a system of ordinary differential equations (2.1a, b) with
random parameters. Furthermore, solutions of random differential equations are
defined here in the parameter-(point)wise sense, cf. [10, 26].
In case of a discrete or discretized probability distribution of the random
elements, model parameters, i.e., Ω = {ω_1, ω_2, …, ω_ρ}, P(ω = ω_j) = α_j > 0,
j = 1, …, ρ, Σ_{j=1}^ρ α_j = 1, we can redefine (2.1a, b) by

  ż(t) = g(t, z(t), u(t)), t_0 ≤ t ≤ t_f,   (2.1c)

  z(t_0) = z_0,   (2.1d)

with the vectors and vector functions

  z(t) := (z(t, ω_j))_{j=1,…,ρ},  z_0 := (z_0(ω_j))_{j=1,…,ρ},
  g(t, z, u) := (g(t, ω_j, z^{(j)}, u))_{j=1,…,ρ},  z := (z^{(j)})_{j=1,…,ρ} ∈ R^{ρm}.

Hence, in this case (2.1c, d) represents again an ordinary system of first order
differential equations for the ρm unknown functions

  z_ij = z_i(t, ω_j), i = 1, …, m, j = 1, …, ρ.

Results on the existence and uniqueness of solutions of the systems (2.1a, b) and
(2.1c, d) and on their dependence on the inputs can be found in [37].
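The scenario reformulation (2.1c, d) is directly computable: the ρ copies of the plant are stacked into one ODE system and integrated together. The sketch below does this for a hypothetical scalar plant ż = −a(ω_j) z + u(t) with three scenarios, using a plain fourth-order Runge–Kutta scheme; all numerical data are illustrative:

```python
import numpy as np

# Discrete parameter distribution: a(w_j) with probabilities alpha_j,
# hypothetical scalar plant zdot = -a(w) z + u(t), illustrating (2.1c, d).
a = np.array([0.5, 1.0, 2.0])          # parameter realizations a(w_j)
alpha = np.array([0.3, 0.5, 0.2])      # probabilities alpha_j
u = lambda t: np.sin(t)                # deterministic control input
z0 = np.ones(3)                        # stacked initial state, one per scenario

def g(t, z):
    # Stacked right-hand side: rho copies of the scalar dynamics at once.
    return -a * z + u(t)

def rk4(g, z0, t0, tf, n=1000):
    t, z, h = t0, z0.copy(), (tf - t0) / n
    for _ in range(n):
        k1 = g(t, z); k2 = g(t + h/2, z + h/2*k1)
        k3 = g(t + h/2, z + h/2*k2); k4 = g(t + h, z + h*k3)
        z += h/6 * (k1 + 2*k2 + 2*k3 + k4); t += h
    return z

z_tf = rk4(g, z0, 0.0, 2.0)            # all scenario states z(t_f, w_j)
Ez_tf = alpha @ z_tf                   # expectation over the scenarios
print("scenario states:", z_tf, "E z(tf):", Ez_tf)
```

Because the scenarios are decoupled in the right-hand side, the stacked system costs only ρ times the single-scenario integration, and expectations of any functional of the trajectory reduce to α-weighted sums.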
Also in the general case we consider a solution in the point-wise sense. This
means that for each random element ω ∈ Ω, (2.1a, b) is interpreted as a system
of ordinary first order differential equations with the initial values z_0 = z_0(ω) and
control input u = u(t). Hence, we assume that for each deterministic control u(·) ∈
U and each random element ω ∈ Ω there exists a unique solution

  z(·, ω) = S(ω, u(·)),   (2.2a)

z(·, ω) ∈ C_0^m[t_0, t_f], of the integral equation

  z(t) = z_0(ω) + ∫_{t_0}^t g(s, ω, z(s), u(s)) ds, t_0 ≤ t ≤ t_f,   (2.2b)

 
such that (t, ω) → S(ω, u(·))(t) is measurable. This solution is also denoted by

  z(t, ω) = z^u(t, ω) = z(t, ω; u(·)), t_0 ≤ t ≤ t_f.   (2.2c)

Obviously, the integral equation (2.2b) is the integral version of the initial value
problem (2.1a, b): Indeed, if, for given ω ∈ Ω, z = z(t, ω) is a solution of (2.1a,
b), i.e., z(·, ω) is absolutely continuous, satisfies (2.1a) for almost all t ∈ [t_0, t_f]
and fulfills (2.1b), then z = z(t, ω) is also a solution of (2.2b). Conversely, if, for
given ω ∈ Ω, z = z(t, ω) is a solution of (2.2b) such that the integral on the
right-hand side exists in the Lebesgue sense for each t ∈ [t_0, t_f], then this integral
as a function of its upper limit t, and therefore also the function z = z(t, ω), is
absolutely continuous. Hence, by taking t = t_0 and by differentiating (2.2b) with
respect to t, cf. [118], we see that z = z(t, ω) is also a solution of (2.1a, b).
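For a fixed realization ω, the integral equation (2.2b) can be solved by Picard (fixed-point) iteration on a grid, which also illustrates why the point-wise solution concept is constructive. The sketch below does this for a hypothetical scalar plant ż = −a(ω) z + u(t) whose closed-form solution is available for comparison:

```python
import numpy as np

# Picard iteration for the integral equation (2.2b), sketched on a fixed
# grid for one realization a(w) of the hypothetical scalar plant
# zdot = -a(w) z + u(t), z(t0) = 1.
a_w = 1.0                              # one parameter realization a(w)
t = np.linspace(0.0, 2.0, 2001)
u = np.sin(t)
z0 = 1.0

z = np.full_like(t, z0)                # start the iteration at z(t) = z0
for _ in range(40):
    integrand = -a_w * z + u           # g(s, w, z(s), u(s)) on the grid
    # cumulative trapezoidal integral from t0 up to each grid point t
    integral = np.concatenate(([0.0], np.cumsum(
        0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))))
    z = z0 + integral                  # one Picard step of (2.2b)

# Closed-form solution of zdot = -z + sin t, z(0) = 1, for comparison:
exact = np.exp(-a_w*t) + (a_w*np.sin(t) - np.cos(t) + np.exp(-a_w*t)) / (1 + a_w**2)
dev = np.abs(z - exact).max()
print("max deviation from closed form:", dev)
```

The iteration converges here because the right-hand side is Lipschitz in z, exactly the situation covered by the generalized Lipschitz condition of Lemma 2.2 below.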

Parametric Representation of the Random Differential/Integral Equation

In the following we want to justify the above assumption that the initial value
problem (2.1a, b), the equivalent integral equation (2.2b), resp., has a unique
solution z = z(t, ω). For this purpose, let ξ = ξ(t, ω) be an r-dimensional
stochastic process, as e.g. time-varying disturbances, random parameters, etc., of
the system, such that the sample functions ξ(·, ω) are continuous with probability
one. Furthermore, let

  g̃: [t_0, t_f] × R^r × R^m × R^n → R^m

be a continuous function having continuous Jacobians D_ξ g̃, D_z g̃, D_u g̃ with respect
to ξ, z, u. Now consider the case that the function g of the process differential
equation (2.1a, b) is given by

  g(t, ω, z, u) := g̃(t, ξ(t, ω), z, u),

(t, ω, z, u) ∈ [t_0, t_f] × Ω × R^m × R^n. The spaces U, Z of possible inputs, trajectories,
resp., of the plant are chosen as follows: U := Reg([t_0, t_f]; R^n) is the Banach space
of all regulated functions u(·): [t_0, t_f] → R^n, normed by the supremum norm ‖·‖_∞.
Furthermore, we set Z := C_0^m[t_0, t_f] and Ξ := C_0^r[t_0, t_f]. Here, for an integer
ν, C_0^ν[t_0, t_f] denotes the Banach space of all continuous functions of [t_0, t_f] into R^ν
normed by the supremum norm ‖·‖_∞. By our assumption we have ξ(·, ω) ∈ Ξ a.s.
(almost sure). Define

  Λ = R^m × Ξ × U;

Λ is the space of possible initial values, time-varying model/environmental parameters
and inputs of the dynamic system. Hence, Λ may be considered as the total
space of inputs

  λ := (z_0, ξ(·), u(·))

into the plant, consisting of the random initial state z_0, the random input function
ξ = ξ(t, ω) and the control function u = u(t). Let now the mapping Φ: Λ × Z → Z
related to the plant equation (2.1a, b) or (2.2b) be given by

  Φ(λ, z(·))(t) = z(t) − ( z_0 + ∫_{t_0}^t g̃(s, ξ(s), z(s), u(s)) ds ), t_0 ≤ t ≤ t_f.   (2.2d)

Note that for each input vector λ ∈ Λ and function z(·) ∈ Z the integrand in
(2.2d) is piecewise continuous, bounded or at least essentially bounded on [t_0, t_f].
Hence, the integral in (2.2d) as a function of its upper limit t yields again a
continuous function on the interval [t_0, t_f], and therefore an element of Z. This
shows that Φ maps Λ × Z into Z.

Obviously, the initial value problem (2.1a, b) or its integral form (2.2b) can be
represented by the operator equation

  Φ(λ, z(·)) = 0.   (2.2e)

Operators of the type (2.2d) are well studied, see e.g. [37, 85]: It is known that
Φ is continuously Fréchet (F-)differentiable [37, 85]. Note that the F-differential
is a generalization of the derivatives (Jacobians) of mappings between finite-dimensional
spaces to mappings between arbitrary normed spaces. Thus, the
F-derivative DΦ of Φ at a certain point (λ̄, z̄(·)) is given by

  DΦ(λ̄, z̄(·))(λ, z(·))(t)
  = z(t) − ( z_0 + ∫_{t_0}^t D_z g̃(s, ξ̄(s), z̄(s), ū(s)) z(s) ds
           + ∫_{t_0}^t D_ξ g̃(s, ξ̄(s), z̄(s), ū(s)) ξ(s) ds
           + ∫_{t_0}^t D_u g̃(s, ξ̄(s), z̄(s), ū(s)) u(s) ds ), t_0 ≤ t ≤ t_f,   (2.2f)

   
where λ̄ = (z̄_0, ξ̄(·), ū(·)) and λ = (z_0, ξ(·), u(·)). Especially, for the derivative of
Φ with respect to z(·) we find

  D_z Φ(λ̄, z̄(·))(z(·))(t) = z(t) − ∫_{t_0}^t D_z g̃(s, ξ̄(s), z̄(s), ū(s)) z(s) ds, t_0 ≤ t ≤ t_f.   (2.2g)

The related equation

  D_z Φ(λ̄, z̄(·))(z(·)) = y(·), y(·) ∈ Z,   (2.2h)

is a linear vectorial Volterra integral equation. By our assumptions this equation has
a unique solution z(·) ∈ Z. Note that the corresponding result for scalar Volterra
equations, see e.g. [151], can be transferred to the present vectorial case. Therefore,
D_z Φ(λ̄, z̄(·)) is a linear, continuous one-to-one map from Z onto Z. Hence, its
inverse (D_z Φ(λ̄, z̄(·)))^{−1} exists. Using the implicit function theorem [37, 85], we
now obtain the following result:


   
Lemma 2.1 For given λ̄ = (z̄_0, ξ̄(·), ū(·)), let (λ̄, z̄(·)) ∈ Λ × Z be selected such
that Φ(λ̄, z̄(·)) = 0, hence, z̄(·) ∈ Z is supposed to be the solution of

  ż(t) = g̃(t, ξ̄(t), z(t), ū(t)), t_0 ≤ t ≤ t_f,   (2.3a)

  z(t_0) = z̄_0   (2.3b)

in the integral sense (2.2b). Then there is an open neighborhood of λ̄, denoted by
V^0(λ̄), such that for each open connected neighborhood V(λ̄) of λ̄ contained in
V^0(λ̄) there exists a unique continuous mapping S: V(λ̄) → Z such that a) S(λ̄) =
z̄(·); b) Φ(λ, S(λ)) = 0 for each λ ∈ V(λ̄), i.e. S(λ) = S(λ)(t), t_0 ≤ t ≤ t_f, is the
solution of

  z(t) = z_0 + ∫_{t_0}^t g̃(s, ξ(s), z(s), u(s)) ds, t_0 ≤ t ≤ t_f,   (2.3c)

where λ = (z_0, ξ(·), u(·)); c) S is continuously differentiable on V(λ̄), and it holds

  D_u S(λ) = −(D_z Φ(λ, S(λ)))^{−1} D_u Φ(λ, S(λ)), λ ∈ V(λ̄).   (2.3d)

An immediate consequence is given next:

Corollary 2.1 The directional derivative ζ(·) = ζ_{u,h}(·) = D_u S(λ)h(·) (∈ Z),
h(·) ∈ U, satisfies the integral equation

  ζ(t) − ∫_{t_0}^t D_z g̃(s, ξ(s), S(λ)(s), u(s)) ζ(s) ds
       = ∫_{t_0}^t D_u g̃(s, ξ(s), S(λ)(s), u(s)) h(s) ds,   (2.3e)

where t_0 ≤ t ≤ t_f and λ = (z_0, ξ(·), u(·)).

Remark 2.1 Taking the time derivative of Eq. (2.3e) shows that this integral
equation is equivalent to the so-called perturbation equation, see e.g. [73].

For an arbitrary h(·) ∈ U the mappings

  (t, λ) → S(λ)(t),  (t, λ) → (D_u S(λ)h(·))(t),  (t, λ) ∈ [t_0, t_f] × V(λ̄),   (2.3f)

are continuous and therefore also measurable.
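By Remark 2.1, the directional derivative ζ of Corollary 2.1 can be computed by integrating the perturbation ODE ζ̇ = D_z g̃ · ζ + D_u g̃ · h, ζ(t_0) = 0, alongside the state. The sketch below does this for a hypothetical scalar plant ż = −a z + u(t) (so D_z g̃ = −a, D_u g̃ = 1) and checks the result against a finite difference of the solution map S:

```python
import numpy as np

# Directional derivative zeta(.) = DuS(lambda)h(.) of Corollary 2.1, via the
# perturbation ODE zeta' = Dz g * zeta + Du g * h, zeta(t0) = 0, for the
# hypothetical scalar plant zdot = -a z + u(t). All data are illustrative.
a = 1.0
t = np.linspace(0.0, 2.0, 2001)

def rk4(f, y0):
    # classical Runge-Kutta on the fixed grid t
    y = np.empty_like(t); y[0] = y0
    for i in range(len(t) - 1):
        h = t[i+1] - t[i]; s, yi = t[i], y[i]
        k1 = f(s, yi);              k2 = f(s + h/2, yi + h/2*k1)
        k3 = f(s + h/2, yi + h/2*k2); k4 = f(s + h, yi + h*k3)
        y[i+1] = yi + h/6*(k1 + 2*k2 + 2*k3 + k4)
    return y

u  = lambda s: np.sin(s)               # reference control
hd = lambda s: np.cos(s)               # direction h(.) in U

z    = rk4(lambda s, y: -a*y + u(s), 1.0)    # state z = S(lambda)
zeta = rk4(lambda s, y: -a*y + hd(s), 0.0)   # Dz g = -a, Du g = 1

eps = 1e-6                              # finite-difference check of DuS
z_eps = rk4(lambda s, y: -a*y + u(s) + eps*hd(s), 1.0)
err = np.abs(zeta - (z_eps - z)/eps).max()
print("max |zeta - finite difference|:", err)
```

For linear plants the agreement is essentially exact; for nonlinear g̃ the same construction gives the first-order sensitivity used later for the gradient of the cost functional.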


The existence of a unique solution z̄ = z̄(t), t_0 ≤ t ≤ t_f, of the reference
differential equation (2.3a, b) can be guaranteed as follows, where solution is
interpreted in the integral sense, i.e., z̄ = z̄(t), t_0 ≤ t ≤ t_f, is absolutely continuous,
satisfies equation (2.3a) almost everywhere in the time interval [t_0, t_f] and the initial
condition (2.3b), cf. [29] and [168].

Lemma 2.2 Consider an arbitrary input vector λ̄ = (z̄_0, ξ̄(·), ū(·)) ∈ Λ, and
define, see (2.3a, b), the function g̃_{ξ̄(·),ū(·)} = g̃_{ξ̄(·),ū(·)}(t, z) := g̃(t, ξ̄(t), z, ū(t)).
Suppose that i) z → g̃_{ξ̄(·),ū(·)}(t, z) is continuous for each time t ∈ [t_0, t_f], ii) t →
g̃_{ξ̄(·),ū(·)}(t, z) is measurable for each vector z, iii) (generalized Lipschitz condition)
for each closed sphere K ⊂ R^m there exists a nonnegative, integrable function
L_K(·) on [t_0, t_f] such that ‖g̃_{ξ̄(·),ū(·)}(t, 0)‖ ≤ L_K(t) and ‖g̃_{ξ̄(·),ū(·)}(t, z) −
g̃_{ξ̄(·),ū(·)}(t, w)‖ ≤ L_K(t)‖z − w‖ on [t_0, t_f] × K. Then the initial value problem
(2.3a, b) has a unique solution z̄ = z̄(t; λ̄).

Proof Proofs can be found in [29] and [168].


We observe that the controlled stochastic process z = z(t, ω) defined by the
plant differential equation (2.1a, b) may be a non-Markovian stochastic process,
see [5, 63]. Moreover, note that the random input function ξ = ξ(t, ω) is not just
an additive noise term, but may also describe a disturbance which is part of the
nonlinear dynamics of the plant, randomly varying model parameters such as material,
load or cost parameters, etc.

Stochastic Differential Equations

In some applications [5], instead of the system (2.1a, b) of ordinary differential
equations with (time-varying) random parameters, a so-called stochastic differential
equation [121] is taken into account:

  dz(t, ω) = g̃(t, z(t, ω), u(t)) dt + h̃(t, z(t, ω), u(t)) dξ(t, ω).   (2.4a)

Here, the "noise" term ξ = ξ(t, ω) is a certain stochastic process, as e.g. the
Brownian motion, having continuous paths, and g̃ = g̃(t, z, u), h̃ = h̃(t, z, u) are
given, sufficiently smooth vector/matrix functions.
Corresponding to the integral equation (2.2a, b), the above stochastic differential
equation is replaced by the stochastic integral equation

  z(t, ω) = z_0(ω) + ∫_{t_0}^t g̃(s, z(s, ω), u(s)) ds + ∫_{t_0}^t h̃(s, z(s, ω), u(s)) dξ(s, ω).   (2.4b)

The meaning of this equation depends essentially on the definition (interpretation)
of the "stochastic integral"

  I(ξ(·,·), z(·,·))(t) := ∫_{t_0}^t h̃(s, z(s, ω), u(s)) dξ(s, ω).   (2.4c)

Note that in case of closed-loop and open-loop feedback controls, see the next
Sect. 2.2, the control function u = u(s, ω) is random. If

  h̃(s, z, u) = h̃(s, u(s))   (2.5a)

with a deterministic control u = u(t) and a matrix function h̃ = h̃(s, u), such that
s → h̃(s, u(s)) is differentiable, by partial integration we get, cf. [121],

  I(ξ, z(·))(t) = I(ξ(·), u(·))(t) = h̃(t, u(t)) ξ(t) − h̃(t_0, u(t_0)) ξ(t_0)
                − ∫_{t_0}^t ξ(s) (d/ds) h̃(s, u(s)) ds,   (2.5b)

where ξ = ξ(s) denotes a sample function of the stochastic process ξ = ξ(t, ω).
Hence, in this case the operator Φ = Φ(λ, z(·)), cf. (2.2d), is defined by

  Φ(λ, z(·))(t) := z(t) − ( z_0 + ∫_{t_0}^t g̃(s, z(s), u(s)) ds + h̃(t, u(t)) ξ(t)
                − h̃(t_0, u(t_0)) ξ(t_0) − ∫_{t_0}^t ξ(s) (d/ds) h̃(s, u(s)) ds ).   (2.5c)

Obviously, for the consideration of the existence and differentiability of solutions
z(·) = z(·, λ) of the operator equation Φ(λ, z(·)) = 0, the same procedure as in
Sect. 2.1.1 may be applied.
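The partial-integration formula (2.5b) can be verified numerically for a fixed sample path: along a smooth path ξ(s) the stochastic integral is an ordinary Riemann–Stieltjes integral, and both sides of (2.5b) can be evaluated on a grid. The path ξ and the function h(s, u(s)) below are hypothetical:

```python
import numpy as np

# Check of the partial-integration formula (2.5b): for deterministic u and
# differentiable s -> h(s, u(s)), the stochastic integral along one smooth
# (hypothetical) sample path xi(s) reduces to an ordinary integral.
t = np.linspace(0.0, 1.0, 4001)
xi = np.sin(3*t) + 0.5*t**2            # one sample path xi(s, w)
h = np.exp(-t)                          # h(s, u(s)) along the path
dh = -np.exp(-t)                        # its derivative d/ds h(s, u(s))

# Left-hand side: Riemann-Stieltjes sum  sum_k h(s_k)(xi(s_{k+1}) - xi(s_k))
stieltjes = np.sum(h[:-1] * np.diff(xi))

# Right-hand side of (2.5b): h(t)xi(t) - h(t0)xi(t0) - int_{t0}^t xi(s) h'(s) ds
f = xi * dh
parts = h[-1]*xi[-1] - h[0]*xi[0] - np.sum(0.5*(f[1:] + f[:-1]) * np.diff(t))
print(f"Stieltjes sum {stieltjes:.6f} vs partial integration {parts:.6f}")
```

For nowhere-differentiable paths such as Brownian motion the left-hand sum depends on the evaluation point within each subinterval (Itô vs. Stratonovich), which is exactly the interpretation issue mentioned after (2.4c); the partial-integration form (2.5b) sidesteps it when h̃ does not depend on z.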

2.1.2 Objective Function

The aim is to obtain optimal controls being robust, i.e., most insensitive with
respect to stochastic variations of the model/environmental parameters and initial
values of the process. Hence, incorporating stochastic parameter variations into the
optimization process, for a deterministic control function u = u(t), t_0 ≤ t ≤ t_f,
the objective function F = F(u(·)) of the controlled process z = z(t, ω; u(·)) is
defined, cf. [100], by the conditional expectation of the total costs arising along the
whole control process:

  F(u(·)) := E f(ω, S(ω, u(·)), u(·)).   (2.6a)

Here, E = E(·|A_{t_0}) denotes the conditional expectation given the information
A_{t_0} about the control process up to the considered starting time point t_0. Moreover,
f = f(ω, z(·), u(·)) denotes the stochastic total costs arising along the trajectory
z = z(t, ω) and at the terminal point z_f = z(t_f, ω), cf. [5, 159]. Hence,

  f(ω, z(·), u(·)) := ∫_{t_0}^{t_f} L(t, ω, z(t), u(t)) dt + G(t_f, ω, z(t_f)),   (2.6b)

z(·) ∈ Z, u(·) ∈ U. Here,

  L: [t_0, t_f] × Ω × R^m × R^n → R,
  G: [t_0, t_f] × Ω × R^m → R

are given measurable cost functions. We suppose that L(t, ω, ·, ·) and G(t, ω, ·) are
convex functions for each (t, ω) ∈ [t_0, t_f] × Ω, having continuous partial derivatives
∇_z L(·, ω, ·, ·), ∇_u L(·, ω, ·, ·), ∇_z G(·, ω, ·). Note that in this case

  (z(·), u(·)) → ∫_{t_0}^{t_f} L(t, ω, z(t), u(t)) dt + G(t_f, ω, z(t_f))   (2.6c)

is a convex function on Z × U for each ω ∈ Ω. Moreover, assume that the
expectation F(u(·)) exists and is finite for each admissible control u(·) ∈ D.
In the case of random inputs u = u(t, ω), t_0 ≤ t ≤ t_f, ω ∈ Ω, with definition
(2.6b), the objective function F = F(u(·,·)) reads

  F(u(·,·)) := E f(ω, S(ω, u(·, ω)), u(·, ω)).   (2.6d)

Example 2.1 (Tracking Problems) If a trajectory z_f = z_f(t, ω), e.g., the trajectory
of a moving target, known up to a certain stochastic uncertainty, must be followed
or reached during the control process, then the cost function L along the trajectory
can be defined by

  L(t, ω, z(t), u) := ‖Γ_z (z(t) − z_f(t, ω))‖² + φ(u).   (2.6e)

In (2.6e) Γ_z is a weight matrix, and φ = φ(u) denotes the control costs, as e.g.

  φ(u) = ‖Γ_u u‖²   (2.6f)

with a further weight matrix Γ_u.

If a random target z_f = z_f(ω) has to be reached at the terminal time point t_f
only, then the terminal cost function G may be defined e.g. by

  G(t_f, ω, z(t_f)) := ‖Γ_{G_f} (z(t_f) − z_f(ω))‖²   (2.6g)

with a weight matrix Γ_{G_f}.


Example 2.2 (Active Structural Control, Control of Robots) In case of active
structural control or for optimal regulator design of robots, cf. [98, 155], the total
cost function f is given by defining the individual cost functions L and G as
follows:

  L(t, ω, z, u) := z^T Q(t, ω) z + u^T R(t, ω) u,   (2.6h)

  G(t_f, ω, z) := G(ω, z).   (2.6i)

Here, Q = Q(t, ω) and R = R(t, ω), resp., are certain positive (semi)definite
m × m, n × n matrices which may also depend on (t, ω). Moreover, the terminal
cost function G then depends on (ω, z). For example, in case of endpoint control,
the cost function G is given by

  G(ω, z) = (z − z_f)^T G_f(ω) (z − z_f)   (2.6j)

with a certain desired, possibly random terminal point z_f = z_f(ω) and a positive
(semi)definite, possibly random weight matrix G_f = G_f(ω).
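The expected total cost (2.6a) with quadratic running and terminal costs (2.6h–j) can be estimated by plain Monte Carlo: sample the random parameter, integrate the trajectory for each sample, accumulate the costs, and average. The sketch below does this for a hypothetical scalar plant ż = −a(ω) z + u(t) with lognormal a(ω) and a fixed open-loop control; all model choices are illustrative:

```python
import numpy as np

# Monte Carlo sketch of the expected-cost objective (2.6a) with quadratic
# costs (2.6h-j), for the hypothetical scalar plant zdot = -a(w) z + u(t).
rng = np.random.default_rng(2)
t = np.linspace(0.0, 2.0, 401)
u = np.sin(t)                          # fixed open-loop control
Q, R, Gf, zf = 1.0, 0.1, 5.0, 0.0      # cost weights and target

def trajectory(a_w):
    # forward Euler is sufficient for a cost estimate on this fine grid
    z = np.empty_like(t); z[0] = 1.0
    for i in range(len(t) - 1):
        z[i+1] = z[i] + (t[i+1] - t[i]) * (-a_w*z[i] + u[i])
    return z

def total_cost(a_w):
    # f(w, z(.), u(.)) of (2.6b) with L, G from (2.6h), (2.6j)
    z = trajectory(a_w)
    integrand = Q*z**2 + R*u**2
    running = np.sum(0.5*(integrand[1:] + integrand[:-1]) * np.diff(t))
    return running + Gf*(z[-1] - zf)**2

a_samples = rng.lognormal(mean=0.0, sigma=0.3, size=2000)
F = np.mean([total_cost(a) for a in a_samples])
print(f"estimated F(u(.)) = {F:.3f}")
```

Such a sampling estimator is what the stochastic optimization methods of this chapter ultimately minimize over u(·), either directly or through the Taylor-type approximations discussed below.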

Optimal Control Under Stochastic Uncertainty

For finding optimal controls u*(·), u*(·,·), resp., being robust with respect to
stochastic parameter variations, in this chapter we now present several methods
for the approximation of the following minimum expected total cost problems:

  min F(u(·)) s.t. u(·) ∈ D,   (2.7)

  min F(u(·,·)) s.t. u(·, ω) ∈ D a.s. (almost sure), u(t, ·) A_t-measurable,   (2.7~)

where A_t ⊆ A, t ≥ t_0, denotes the σ-algebra of events A ∈ A until time t.

Information Set A_t at Time t: In many cases, as e.g. for PD- and PID-controllers,
see Chap. 4, the information σ-algebra A_t is given by A_t = A(y(t, ·)), where y =
y(t, ω) denotes the m̄-vector function of state measurements or observations at
time t. Then, an A_t-measurable control u = u(t, ω) has the representation, cf. [12],

  u(t, ω) = π(t, y(t, ω))   (2.8)

with a measurable function π(t, ·): R^m̄ → R^n.


Since parameter-insensitive optimal controls can be obtained by stochastic optimization
methods incorporating random parameter variations into the optimization
procedure, see [100], the aim is to determine stochastic optimal controls:

Definition 2.1 An optimal solution of the expected total cost minimization problem
(2.7), (2.7~), resp., providing robust optimal controls, is called a stochastic optimal
control.

Note. For controlled processes working on a time range t_b ≤ t ≤ t_f with an
intermediate starting time point t_b, the objective function F = F(u(·)) is defined
also by (2.6a), but with the conditional expectation operator E = E(·|A_{t_b}), where
A_{t_b} denotes the information about the controlled process available up to time t_b.

 
Problem (2.7) is of course equivalent (E = E(·|A_{t_0})) to the optimal control
problem under stochastic uncertainty:

  min E( ∫_{t_0}^{t_f} L(t, ω, z(t), u(t)) dt + G(t_f, ω, z(t_f)) | A_{t_0} )   (2.9a)

s.t.

  ż(t) = g(t, ω, z(t), u(t)), t_0 ≤ t ≤ t_f, a.s.,   (2.9b)

  z(t_0, ω) = z_0(ω), a.s.,   (2.9c)

  u(·) ∈ D,   (2.9d)

cf. [89, 90].


Remark 2.2 Similar representations can be obtained also for the second type of
stochastic control problem (2.7~).

Remark 2.3 (State Constraints) In addition to the plant differential equation
(dynamic equation) (2.9b, c) and the control constraints (2.9d), we may still have
some stochastic state constraints

  h_I(t, ω, z(t, ω)) ≤ (=) 0 a.s.   (2.9e)

as well as state constraints involving (conditional) expectations

  E h_II(t, ω, z(t, ω)) = E( h_II(t, ω, z(t, ω)) | A_{t_0} ) ≤ (=) 0.   (2.9f)

Here, h_I = h_I(t, ω, z), h_II = h_II(t, ω, z) are given vector functions of (t, ω, z). By
means of (penalty) cost functions, the random condition (2.9e) can be incorporated
into the objective function (2.9a). As explained in Sect. 2.8, the expectations arising
in the mean value constraints (2.9f) and in the objective function (2.9a) can be
computed approximately by means of Taylor expansion with respect to the vector
θ = θ(ω) := (z_0(ω), ξ(ω)) of random initial values and model parameters at
the conditional mean θ̄ = θ̄^{(t_0)} := E(θ(ω)|A_{t_0}). This yields then ordinary
deterministic constraints for the extended deterministic trajectory (nominal state and
sensitivity)

  t → (z(t, θ̄), D_θ z(t, θ̄)), t ≥ t_0.
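The Taylor-expansion idea of Remark 2.3 replaces an expectation E h(θ) by the second-order approximation h(θ̄) + ½ tr(∇²h(θ̄) · Cov(θ)). The sketch below, with an illustrative two-dimensional θ, Gaussian distribution, and hand-computed Hessian (all hypothetical choices), compares this approximation with a Monte Carlo estimate:

```python
import numpy as np

# Second-order Taylor approximation of an expectation around the
# conditional mean, as used in Remark 2.3:
#   E h(theta) ~ h(tbar) + 0.5 * tr(Hess h(tbar) @ Cov(theta))
rng = np.random.default_rng(3)
tbar = np.array([1.0, 0.5])            # conditional mean of theta
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])         # covariance of theta

def hfun(th):
    # illustrative constraint/cost function h(theta) = exp(-theta1) theta2^2
    return np.exp(-th[..., 0]) * th[..., 1]**2

# Hessian of h at tbar, computed by hand for this particular h:
e = np.exp(-tbar[0])
hess = np.array([[ e*tbar[1]**2, -2*e*tbar[1]],
                 [-2*e*tbar[1],   2*e        ]])
approx = hfun(tbar) + 0.5 * np.trace(hess @ cov)

theta = rng.multivariate_normal(tbar, cov, 200_000)
mc = hfun(theta).mean()
print(f"Taylor approx {approx:.4f}, Monte Carlo {mc:.4f}")
```

The approximation is cheap (one nominal evaluation plus sensitivities) and deterministic, which is what turns the mean-value constraints (2.9f) into ordinary constraints on the extended trajectory (z(t, θ̄), D_θ z(t, θ̄)).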

2.2 Control Laws

Control or guidance usually refers [5, 73, 85] to direct influence on a dynamic
system to achieve desired performance. In optimal control of dynamic systems
mostly the following types of control laws or control policies are considered:

I) Open-Loop Control (OL)
Here, the control function u = u(t) is a deterministic function depending
only on the (a priori) information I_{t_0} about the system, the model parameters,
resp., available at the starting time point t_0. Hence, for the selection
of optimal (OL) controls

  u(t) = u(t; t_0, I_{t_0}), t ≥ t_0,   (2.10a)

we get optimal control problems of type (2.7).

II) Closed-Loop Control (CL) or Feedback Control
In this case the control function u = u(t) is a stochastic function

  u = u(t, ω) = u(t, I_t), t ≥ t_0,   (2.10b)

depending on time t and the total information I_t about the system available up
to time t. Especially, I_t may contain information about the state z(t) = z(t, ω)
up to time t. Optimal (CL) or feedback controls are obtained by solving
problems of type (2.7~).
Remark 2.4 (Information Set A_t at Time t) Often the information I_t available
up to time t is described by the information set or σ-algebra A_t ⊆ A
of events A occurred up to time t. In the important case A_t = A(y(t, ·)),
where y = y(t, ω) denotes the m̄-vector function of state measurements or
observations at time t, an A_t-measurable control u = u(t, ω), see
problem (2.7~), has the representation, cf. [12],

  u(t, ω) = π_t(y(t, ω))   (2.10c)

with a measurable function π_t: R^m̄ → R^n. Important examples of this type
are the PD- and PID-controllers, see Chap. 4.
III) Open-Loop Feedback (OLF) Control/Stochastic Open-Loop Feedback
(SOLF) Control
Due to their large complexity, in general, optimal feedback control laws
can be determined only approximately. A very efficient approximation
procedure for optimal feedback controls, being functions of the information
I_t, is the approximation by means of optimal open-loop controls. In this
combination of (OL) and (CL) control, at each intermediate time point
t_b := t, t_0 ≤ t ≤ t_f, given the information I_t up to time t, first the open-loop

Fig. 2.1 Remaining time interval for intermediate time points t

control function for the remaining time interval t ≤ s ≤ t_f, see Fig. 2.1, is
computed, hence,

  u_{[t,t_f]}(s) = u(s; t, I_t), s ≥ t.   (2.10d)

Then, an approximate feedback control policy, originally proposed by
Dreyfus (1964), cf. [40], can be defined as follows:

Definition 2.2 The hybrid control law, defined by

  φ(t, I_t) := u(t; t, I_t), t ≥ t_0,   (2.10e)

is called open-loop feedback (OLF) control law.

Thus, the OL control u_{[t,t_f]}(s), s ≥ t, for the remaining time interval [t, t_f]
is used only at time s = t, see also [2, 40–42, 51, 80, 167]. Optimal (OLF)
controls are obtained therefore by solving again control problems of the type
(2.7) at each intermediate starting time point t_b := t, t ∈ [t_0, t_f].
A major issue in optimal control is the robustness, cf. [43], i.e., the
insensitivity of an optimal control with respect to parameter variations. In
case of random parameter variations robust optimal controls can be obtained
by means of stochastic optimization methods, cf. [100], incorporating the
probability distribution, i.e., the random characteristics, of the random param-
eter variation into the optimization process, cf. Definition 2.1.
Thus, constructing stochastic optimal open-loop feedback controls, hence,
optimal open-loop feedback control laws being insensitive as far as possible
with respect to random parameter variations, means that besides the opti-
mality of the control policy also its insensitivity with respect to stochastic
parameter variations should be guaranteed. Hence, in the following sections
we also develop a stochastic version of the optimal open-loop feedback
control method, cf. [102–105]. A short overview on this novel stochastic
optimal open-loop feedback control concept is given below:
At each intermediate time point t_b = t ∈ [t_0, t_f], based on the given
process observation I_t, e.g. the observed state z_t = z(t) at t_b = t, a
stochastic optimal open-loop control u* = u*(s) = u*(s; t, I_t), t ≤ s ≤ t_f,
is determined first on the remaining time interval [t, t_f], see Fig. 2.1, by
stochastic optimization methods, cf. [100].
Having a stochastic optimal open-loop control u* = u*(s; t, I_t), t ≤ s ≤
t_f, on each remaining time interval [t, t_f] with an arbitrary starting time point

t, t_0 ≤ t ≤ t_f, a stochastic optimal open-loop feedback (SOLF) control law
is then defined, corresponding to Definition 2.2, as follows:

Definition 2.3 The hybrid control law, defined by

  φ*(t, I_t) := u*(t; t, I_t), t ≥ t_0,   (2.10f)

is called the stochastic optimal open-loop feedback (SOLF) control law.

Thus, at time t_b = t only the "first" control value u*(t) = u*(t; t, I_t) of
u* = u*(·; t, I_t) is used.


For finding stochastic optimal open-loop controls on the remaining time
intervals t_b ≤ t ≤ t_f with t_0 ≤ t_b ≤ t_f, the stochastic Hamilton
function of the control problem is introduced. Then, the class of H-minimal
controls, cf. [73], can be determined in case of stochastic uncertainty by
solving a finite-dimensional stochastic optimization problem for minimizing
the conditional expectation of the stochastic Hamiltonian subject to the
remaining deterministic control constraints at each time point t. Having
an H-minimal control, the related two-point boundary value problem with
random parameters will be formulated for the computation of a stochastic
optimal state- and costate-trajectory. In the important case of a linear-quadratic
structure of the underlying control problem, the state and costate
trajectory can be determined analytically to a large extent. Inserting then
these trajectories into the H-minimal control, stochastic optimal open-loop
controls are found on an arbitrary remaining time interval. According to
Definition 2.3, these controls then immediately yield a stochastic optimal
open-loop feedback control law. Moreover, the obtained controls can be
realized in real-time, which has already been shown for applications in optimal
control of industrial robots, cf. [139].
III.1) Nonlinear Model Predictive Control (NMPC)/Stochastic Nonlinear Model
Predictive Control (SNMPC)
Optimal open-loop feedback (OLF) control is the basic tool in Nonlinear
Model Predictive Control (NMPC). Corresponding to the approximation
technique for feedback controls described above, (NMPC) is a method
to solve complicated feedback control problems by means of stepwise
computations of open-loop controls. Hence, in (NMPC), see [1, 45, 49, 136],
optimal open-loop controls

$$u^* = u^*_{[t,\,t+T_p]}(s), \quad t \le s \le t + T_p, \qquad (2.10\text{g})$$

cf. (2.10c), are determined first on the time interval $[t, t+T_p]$ with a certain
so-called prediction time horizon $T_p > 0$. In sampled-data MPC, cf. [45],
optimal open-loop controls $u^* = u^*_{[t_i,\,t_i+T_p]}$ are determined at certain sampling
instants $t_i$, $i = 0, 1, \dots$, using the information $\mathcal{A}_{t_i}$ about the control process
and its neighborhood up to time $t_i$, $i = 0, 1, \dots$; see also [98].

52 2 Optimal Control Under Stochastic Uncertainty

The optimal open-loop control at stage $i$ is then applied,

$$u^* = u^*_{[t_i,\,t_i+T_p]}(t), \quad t_i \le t \le t_{i+1}, \qquad (2.10\text{h})$$

until the next sampling instant $t_{i+1}$. This method is closely related to the
Adaptive Optimal Stochastic Trajectory Planning and Control (AOSTPC)
procedure described in [95, 98].
Corresponding to the extension of (OLF) control to (SOLF) control,
(NMPC) can be extended to Stochastic Nonlinear Model Predictive Control
(SNMPC). For control policies of this type, robust (NMPC) controls with
respect to stochastic variations of model parameters and initial values are
determined in the following way:
• Use the a posteriori distribution $P(d\omega \mid \mathcal{A}_t)$ of the basic random element
$\omega \in \Omega$, given the process information $\mathcal{A}_t$ up to time $t$, and
• apply stochastic optimization methods to incorporate random parameter
variations into the optimal (NMPC) control design.
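The sampled-data scheme in (2.10g, h) can be sketched as a receding-horizon loop: at each sampling instant an open-loop problem is solved on the prediction horizon and only its first control is applied. The scalar linear plant, quadratic cost, and Riccati-based open-loop solver below are illustrative assumptions, not part of the text:

```python
import numpy as np

def open_loop_lqr(a, b, q, r, x0, N):
    """Solve the open-loop problem min sum_{k<N}(q x_k^2 + r u_k^2) + q x_N^2
    for the scalar plant x_{k+1} = a x_k + b u_k via a backward Riccati sweep."""
    p, gains = q, []
    for _ in range(N):                       # backward pass: Riccati recursion
        k = (b * p * a) / (r + b * p * b)
        gains.append(k)
        p = q + a * p * a - k * (b * p * a)
    gains.reverse()
    u, x = [], x0                            # forward pass: open-loop sequence
    for k in gains:
        u.append(-k * x)
        x = a * x + b * u[-1]
    return u

def nmpc_trajectory(a, b, q, r, x0, horizon, steps):
    """Receding-horizon (MPC) loop: at each sampling instant solve the
    open-loop problem on [t_i, t_i + T_p] and apply only its first control."""
    xs, x = [x0], x0
    for _ in range(steps):
        u = open_loop_lqr(a, b, q, r, x, horizon)[0]
        x = a * x + b * u
        xs.append(x)
    return xs

traj = nmpc_trajectory(a=1.2, b=1.0, q=1.0, r=0.1, x0=5.0, horizon=10, steps=30)
print(abs(traj[-1]) < 1e-3)   # the unstable plant is driven toward the origin
```

The stochastic extension (SNMPC) would replace the deterministic open-loop subproblem by a stochastic optimization over the a posteriori distribution of the model parameters.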

2.3 Convex Approximation by Inner Linearization

We observe first that (2.7), resp. $(\widetilde{2.7})$, is in general a nonconvex optimization
problem, cf. [85]. Since for convex (deterministic) optimization problems there is
a well-established theory, we approximate the original problems (2.7), $(\widetilde{2.7})$ by
a sequence of suitable convex problems (Fig. 2.2). In the following we describe
first a single step of this procedure. Due to the considerations in Sect. 2.2, we may
concentrate here on problem (2.7) or (2.9a–d) for deterministic controls $u(\cdot)$, as
needed in the computation of optimal (OL), (OLF), (SOLF) as well as (NMP),
(SNMP) controls, these being most important for practical problems.

Fig. 2.2 Convex approximation

Let $v(\cdot) \in D$ be an arbitrary, but fixed admissible initial or reference control and
assume, see Lemma 2.1, for the input–output map $z(\cdot) = S(\omega, u(\cdot))$:

Assumption 2.1 $S(\omega, \cdot)$ is F-differentiable at $v(\cdot)$ for each $\omega \in \Omega$.

Denote by $DS\bigl(\omega, v(\cdot)\bigr)$ the F-derivative of $S(\omega, \cdot)$ at $v(\cdot)$, and replace now the cost
function $F = F\bigl(u(\cdot)\bigr)$ by

$$F_{v(\cdot)}\bigl(u(\cdot)\bigr) := E\,f\Bigl(\omega,\; S\bigl(\omega, v(\cdot)\bigr) + DS\bigl(\omega, v(\cdot)\bigr)\bigl(u(\cdot) - v(\cdot)\bigr),\; u(\cdot)\Bigr) \qquad (2.11)$$

where $u(\cdot) \in U$. Assume that $F_{v(\cdot)}\bigl(u(\cdot)\bigr) \in \mathbb{R}$ for all pairs $\bigl(u(\cdot), v(\cdot)\bigr) \in D \times D$.
Then, replace the optimization problem (2.7), see [89], by

$$\min\; F_{v(\cdot)}\bigl(u(\cdot)\bigr) \quad \text{s.t.} \quad u(\cdot) \in D. \qquad (2.7)_{v(\cdot)}$$

Lemma 2.3 $(2.7)_{v(\cdot)}$ is a convex optimization problem.


Proof According to Sect. 1.1, function (2.6c) is convex. The assertion now follows
from the linearity of the F-differential of $S(\omega, \cdot)$.

Remark 2.5 Note that the approximate $F_{v(\cdot)}$ of $F$ is obtained from (2.6a, b) by
means of linearization of the input–output map $S = S\bigl(\omega, u(\cdot)\bigr)$ with respect to
the control $u(\cdot)$ at $v(\cdot)$, hence by inner linearization of the control problem with
respect to the control $u(\cdot)$ at $v(\cdot)$.

Remark 2.6 (Linear Input–Output Map) In case the map $S = S(\omega, u(\cdot)) := S(\omega)u(\cdot)$
is linear with respect to the control $u(\cdot)$, we have $DS\bigl(\omega, v(\cdot)\bigr) = S(\omega)$ and therefore
$F_{v(\cdot)}\bigl(u(\cdot)\bigr) = F\bigl(u(\cdot)\bigr)$. In this case the problems (2.7) and $(2.7)_{v(\cdot)}$ coincide for each
input $v(\cdot)$.
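A finite-dimensional illustration of the inner linearization (2.11): the nonlinear input–output map is replaced by its linearization at a reference control $v$, which makes the approximate cost convex. The concrete scalar map $S(\omega,u) = \sin u + \omega$, the cost, and the discrete scenarios below are illustrative assumptions:

```python
import numpy as np

def inner_linearization(f, S, DS, v, omegas):
    """Return the convex approximate F_v(u) = E f(omega, S(omega,v)
    + DS(omega,v)*(u - v), u), i.e. the cost with the input-output map
    linearized at the reference control v (cf. (2.11))."""
    def F_v(u):
        return np.mean([f(w, S(w, v) + DS(w, v) * (u - v), u) for w in omegas])
    return F_v

# illustrative scalar example: nonlinear map and convex cost in (z, u)
S = lambda w, u: np.sin(u) + w          # input-output map S(omega, u)
DS = lambda w, u: np.cos(u)             # its derivative with respect to u
f = lambda w, z, u: z ** 2 + 0.1 * u ** 2

omegas = [-0.2, 0.0, 0.2]               # discrete scenarios, equal weights
F_v = inner_linearization(f, S, DS, v=0.5, omegas=omegas)

# F_v is a convex quadratic in u, although u -> E f(omega, S(omega,u), u)
# is nonconvex here; check midpoint convexity on a grid
us = np.linspace(-2.0, 2.0, 401)
vals = np.array([F_v(u) for u in us])
midpoint_gap = 0.5 * (vals[0] + vals[-1]) - F_v(0.0)
print(midpoint_gap >= 0)
```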
For a real-valued convex function $\varphi : X \to \mathbb{R}$ on a linear space $X$ the directional
derivative $\varphi'_+(x; y)$ exists, see e.g. [60], at each point $x \in X$ and in each direction
$y \in X$. According to Lemma 2.3 the objective function $F_{v(\cdot)}$ of the approximate
problem $(2.7)_{v(\cdot)}$ is convex. Using the monotone convergence theorem, [12],
for all $u(\cdot), v(\cdot) \in D$ and $h(\cdot) \in U$ the directional derivative of $F_{v(\cdot)}$ is given, see
[90], Satz 1.4, by

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E\,f'_+\Bigl(\omega,\; S\bigl(\omega, v(\cdot)\bigr) + DS\bigl(\omega, v(\cdot)\bigr)\bigl(u(\cdot) - v(\cdot)\bigr),\, u(\cdot);\; DS\bigl(\omega, v(\cdot)\bigr)h(\cdot),\, h(\cdot)\Bigr). \qquad (2.12\text{a})$$
In the special case $u(\cdot) = v(\cdot)$ we get

$$F'_{v(\cdot)+}\bigl(v(\cdot); h(\cdot)\bigr) = E\,f'_+\Bigl(\omega,\; S\bigl(\omega, v(\cdot)\bigr),\, v(\cdot);\; DS\bigl(\omega, v(\cdot)\bigr)h(\cdot),\, h(\cdot)\Bigr). \qquad (2.12\text{b})$$

A solution $\bar u(\cdot) \in D$ of the convex problem $(2.7)_{v(\cdot)}$ is then characterized, cf. [94],
by

$$F'_{v(\cdot)+}\bigl(\bar u(\cdot); u(\cdot) - \bar u(\cdot)\bigr) \ge 0 \quad \text{for all } u(\cdot) \in D. \qquad (2.13)$$

Definition 2.4 For each $v(\cdot) \in D$, let $M\bigl(v(\cdot)\bigr)$ be the set of solutions of problem
$(2.7)_{v(\cdot)}$, i.e.,

$$M\bigl(v(\cdot)\bigr) := \Bigl\{ u^0(\cdot) \in D : F'_{v(\cdot)+}\bigl(u^0(\cdot); u(\cdot) - u^0(\cdot)\bigr) \ge 0,\; u(\cdot) \in D \Bigr\}.$$

Note. If the input–output operator $S = S(\omega, u(\cdot)) := S(\omega)u(\cdot)$ is linear, then
$M\bigl(v(\cdot)\bigr) = M$ for each input $v(\cdot)$, where $M$ denotes the set of solutions of problem
(2.7).
In the following we suppose that optimal solutions of $(2.7)_{v(\cdot)}$ exist for each $v(\cdot)$:

Assumption 2.2 $M\bigl(v(\cdot)\bigr) \ne \emptyset$ for each $v(\cdot) \in D$.

A first relation between our original problem (2.7) and the family of its
approximates $(2.7)_{v(\cdot)}$, $v(\cdot) \in D$, is shown in the following.

Theorem 2.1 Suppose that the directional derivative $F'_+ = F'_+\bigl(v(\cdot); h(\cdot)\bigr)$ exists
and

$$F'_+\bigl(v(\cdot); h(\cdot)\bigr) = F'_{v(\cdot)+}\bigl(v(\cdot); h(\cdot)\bigr) \qquad (2.14)$$

for each $v(\cdot) \in D$ and $h(\cdot) \in D - D$. Then:

I) If $\bar u(\cdot)$ is an optimal control, then $\bar u(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$, i.e. $\bar u(\cdot)$ is a solution of
$(2.7)_{\bar u(\cdot)}$.
II) If (2.7) is convex, then $\bar u(\cdot)$ is an optimal control if and only if $\bar u(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$.

Proof Because of the convexity of the approximate control problem $(2.7)_{v(\cdot)}$, the
condition $v(\cdot) \in M\bigl(v(\cdot)\bigr)$ holds, cf. (2.13), if and only if $F'_{v(\cdot)+}\bigl(v(\cdot); u(\cdot) - v(\cdot)\bigr) \ge
0$ for all $u(\cdot) \in D$. Because of (2.14), this is equivalent to $F'_+\bigl(v(\cdot); u(\cdot) - v(\cdot)\bigr) \ge
0$ for all $u(\cdot) \in D$. However, since the admissible control domain $D$ is convex, for
an optimal solution $v(\cdot) := \bar u(\cdot)$ of (2.7) this condition is necessary, and also
sufficient in case (2.7) is convex.

Assuming that $f = f(\omega, \cdot, \cdot)$ is F-differentiable for each $\omega$, by means of (2.12b)
and the chain rule we have

$$F'_{v(\cdot)+}\bigl(v(\cdot); h(\cdot)\bigr) = E\,f'_+\Bigl(\omega,\; S\bigl(\omega, v(\cdot)\bigr),\, v(\cdot);\; DS\bigl(\omega, v(\cdot)\bigr)h(\cdot),\, h(\cdot)\Bigr)$$
$$= E \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \Bigl( f\Bigl(\omega,\, S\bigl(\omega, v(\cdot) + \varepsilon h(\cdot)\bigr),\, v(\cdot) + \varepsilon h(\cdot)\Bigr) - f\Bigl(\omega,\, S\bigl(\omega, v(\cdot)\bigr),\, v(\cdot)\Bigr) \Bigr). \qquad (2.15)$$

Note. Because of the properties of the operator $S = S\bigl(\omega, u(\cdot)\bigr)$, the above equation
also holds for arbitrary convex functions $f$ such that the expectations under
consideration exist, see [90].

Due to the definition (2.6a) of the objective function $F = F\bigl(u(\cdot)\bigr)$, for condition
(2.14) the following criterion holds:

Lemma 2.4 a) Condition (2.14) in Theorem 2.1 holds if and only if the expectation
operator "$E$" and the limit "$\lim_{\varepsilon \downarrow 0}$" in (2.15) may be interchanged. b) This
interchangeability holds, e.g., if $\sup\Bigl\{ \bigl\| DS\bigl(\omega, v(\cdot) + \varepsilon h(\cdot)\bigr) \bigr\| : 0 \le \varepsilon \le 1 \Bigr\}$ is
bounded with probability one, and the convex function $f(\omega, \cdot, \cdot)$ satisfies a Lipschitz
condition

$$\Bigl| f\bigl(\omega, z(\cdot), u(\cdot)\bigr) - f\bigl(\omega, \bar z(\cdot), \bar u(\cdot)\bigr) \Bigr| \le \Lambda(\omega) \Bigl\| \bigl(z(\cdot), u(\cdot)\bigr) - \bigl(\bar z(\cdot), \bar u(\cdot)\bigr) \Bigr\|_{Z \times U}$$

on a set $Q \subset Z \times U$ containing all vectors $\Bigl(S\bigl(\omega, v(\cdot) + \varepsilon h(\cdot)\bigr),\, v(\cdot) + \varepsilon h(\cdot)\Bigr)$, $0 \le
\varepsilon \le 1$, where $E\Lambda(\omega) < +\infty$, and $\|\cdot\|_{Z \times U}$ denotes the norm on $Z \times U$.

Proof The first assertion a) follows immediately from (2.15) and the definition of
the objective function $F$. Assertion b) can be obtained by means of the generalized
mean value theorem for vector functions and Lebesgue's bounded convergence
theorem.
Remark 2.7 Further conditions are given in [90].

A second relation between our original problem (2.7) and the family of its
approximates $(2.7)_{v(\cdot)}$, $v(\cdot) \in D$, is shown next.

Lemma 2.5 a) If $\bar u(\cdot) \notin M\bigl(\bar u(\cdot)\bigr)$ for a control $\bar u(\cdot) \in D$, then

$$F_{\bar u(\cdot)}\bigl(u(\cdot)\bigr) < F_{\bar u(\cdot)}\bigl(\bar u(\cdot)\bigr) = F\bigl(\bar u(\cdot)\bigr) \quad \text{for each } u(\cdot) \in M\bigl(\bar u(\cdot)\bigr).$$

b) Let the controls $u(\cdot), v(\cdot) \in D$ be related such that

$$F_{u(\cdot)}\bigl(v(\cdot)\bigr) < F_{u(\cdot)}\bigl(u(\cdot)\bigr).$$

If (2.14) holds for the pair $\bigl(u(\cdot), h(\cdot)\bigr)$, $h(\cdot) = v(\cdot) - u(\cdot)$, then $h(\cdot)$ is an
admissible direction of decrease for $F$ at $u(\cdot)$, i.e. we have $F\bigl(u(\cdot) + \varepsilon h(\cdot)\bigr) <
F\bigl(u(\cdot)\bigr)$ and $u(\cdot) + \varepsilon h(\cdot) \in D$ on a suitable interval $0 < \varepsilon < \bar\varepsilon$.

Proof According to Definition 2.4, for $u(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$ we have $F_{\bar u(\cdot)}\bigl(u(\cdot)\bigr) \le
F_{\bar u(\cdot)}\bigl(v(\cdot)\bigr)$ for all $v(\cdot) \in D$ and therefore also $F_{\bar u(\cdot)}\bigl(u(\cdot)\bigr) \le F_{\bar u(\cdot)}\bigl(\bar u(\cdot)\bigr)$. a) In
case $F_{\bar u(\cdot)}\bigl(u(\cdot)\bigr) = F_{\bar u(\cdot)}\bigl(\bar u(\cdot)\bigr)$ we get $F_{\bar u(\cdot)}\bigl(\bar u(\cdot)\bigr) \le F_{\bar u(\cdot)}\bigl(v(\cdot)\bigr)$ for all $v(\cdot) \in D$,
hence $\bar u(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$. Since this contradicts the assumption, it follows that
$F_{\bar u(\cdot)}\bigl(u(\cdot)\bigr) < F_{\bar u(\cdot)}\bigl(\bar u(\cdot)\bigr)$. b) If the controls $u(\cdot), v(\cdot) \in D$ are related by $F_{u(\cdot)}\bigl(v(\cdot)\bigr) <
F_{u(\cdot)}\bigl(u(\cdot)\bigr)$, then due to the convexity of $F_{u(\cdot)}$ we have $F'_{u(\cdot)+}\bigl(u(\cdot); v(\cdot) - u(\cdot)\bigr) \le
F_{u(\cdot)}\bigl(v(\cdot)\bigr) - F_{u(\cdot)}\bigl(u(\cdot)\bigr) < 0$. With (2.14) we then get $F'_+\bigl(u(\cdot); v(\cdot) - u(\cdot)\bigr) =
F'_{u(\cdot)+}\bigl(u(\cdot); v(\cdot) - u(\cdot)\bigr) < 0$. This now yields that $h(\cdot) := v(\cdot) - u(\cdot)$ is a feasible
descent direction for $F$ at $u(\cdot)$.
 
If $\bar u(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$, then the convex approximate $F_{\bar u(\cdot)}$ of $F$ at $\bar u(\cdot)$ cannot be decreased
further on $D$. Thus, the above results suggest the following definition:

Definition 2.5 A control $\bar u(\cdot) \in D$ such that $\bar u(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$ is called a stationary
control of the optimal control problem (2.7).

Under the rather weak assumptions of Theorem 2.1 an optimal control is also
stationary, and in the case of a convex problem (2.7) the two concepts coincide.
Hence, stationary controls are candidates for optimal controls. As an appropriate
substitute/approximation for an optimal control we may therefore determine
stationary controls. For this purpose, algorithms of the following conditional
gradient type may be applied:

Algorithm 2.1
I) Choose $u^1(\cdot) \in D$, put $j = 1$.
II) If $u^j(\cdot) \in M\bigl(u^j(\cdot)\bigr)$, then $u^j(\cdot)$ is stationary and the algorithm stops;
otherwise find a control $v^j(\cdot) \in M\bigl(u^j(\cdot)\bigr)$.
III) Set $u^{j+1}(\cdot) = v^j(\cdot)$ and go to II), putting $j \to j+1$.

Algorithm 2.2
I) Choose $u^1(\cdot) \in D$, put $j = 1$.
II) If $u^j(\cdot) \in M\bigl(u^j(\cdot)\bigr)$, then $u^j(\cdot)$ is stationary and the algorithm stops;
otherwise find a $v^j(\cdot) \in M\bigl(u^j(\cdot)\bigr)$ and define $h^j(\cdot) := v^j(\cdot) - u^j(\cdot)$.
III) Calculate $\bar u(\cdot) \in m\bigl(u^j(\cdot); h^j(\cdot)\bigr)$, set $u^{j+1}(\cdot) := \bar u(\cdot)$ and go to II), putting
$j \to j+1$.

Here, based on line search, $m\bigl(u(\cdot); h(\cdot)\bigr)$ is defined by

$$m\bigl(u(\cdot); h(\cdot)\bigr) := \Bigl\{ u(\cdot) + \varepsilon^* h(\cdot) : F\bigl(u(\cdot) + \varepsilon^* h(\cdot)\bigr) = \min_{0 \le \varepsilon \le 1} F\bigl(u(\cdot) + \varepsilon h(\cdot)\bigr),\; \varepsilon^* \in [0,1] \Bigr\}, \quad u(\cdot), h(\cdot) \in U.$$

Concerning Algorithm 2.1 we have the following result.

Theorem 2.2 Let the set-valued mapping $u(\cdot) \to M\bigl(u(\cdot)\bigr)$ be closed at each
$\bar u(\cdot) \in D$ (i.e., the relations $u^j(\cdot) \to \bar u(\cdot)$, $v^j(\cdot) \in M\bigl(u^j(\cdot)\bigr)$, $j = 1, 2, \dots$, and
$v^j(\cdot) \to \bar v(\cdot)$ imply that also $\bar v(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$). If a sequence $u^1(\cdot), u^2(\cdot), \dots$ of
controls generated by Algorithm 2.1 converges to an element $\bar u(\cdot) \in D$, then $\bar u(\cdot)$ is
a stationary control.

A sufficient condition for the closedness of the algorithmic map $u(\cdot) \to M\bigl(u(\cdot)\bigr)$ is
given next:

Lemma 2.6 Let $D$ be a closed set of admissible controls, and let

$$\bigl(u(\cdot), v(\cdot)\bigr) \to F'_{u(\cdot)+}\bigl(v(\cdot); w(\cdot) - v(\cdot)\bigr)$$

be continuous on $D \times D$ for each $w(\cdot) \in D$. Then $u(\cdot) \to M\bigl(u(\cdot)\bigr)$ is closed at each
element of $D$.

While the convergence assumption for a sequence $u^1(\cdot), u^2(\cdot), \dots$ generated
by Algorithm 2.1 is rather strong, only the existence of accumulation points of
$u^j(\cdot)$, $j = 1, 2, \dots$, has to be required in Algorithm 2.2.
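The conditional gradient scheme of Algorithm 2.2 can be sketched in finite dimensions as a Frank–Wolfe type iteration: linearize at the current iterate, minimize the linearization over the admissible set, then do a line search along the resulting direction. The quadratic objective, box-shaped admissible set, and step rule below are illustrative assumptions:

```python
import numpy as np

def frank_wolfe_box(Q, c, lo, hi, u0, iters=200):
    """Conditional gradient iteration in the spirit of Algorithm 2.2:
    minimize the linearized objective over the box D = [lo, hi]^n
    (giving v^j in M(u^j)), then an exact line search along h^j = v^j - u^j
    for the convex quadratic F(u) = 0.5 u'Qu + c'u."""
    u = u0.astype(float)
    for _ in range(iters):
        grad = Q @ u + c
        v = np.where(grad > 0, lo, hi)      # box vertex minimizing <grad, v>
        h = v - u
        if grad @ h >= -1e-12:              # stationarity: F'(u; v-u) >= 0
            break
        denom = h @ Q @ h
        step = 1.0 if denom <= 0 else min(1.0, -(grad @ h) / denom)
        u = u + step * h                    # step III): line-searched update
    return u

Q = np.array([[2.0, 0.0], [0.0, 1.0]])
c = np.array([-1.0, 3.0])
u_star = frank_wolfe_box(Q, c, lo=-1.0, hi=1.0, u0=np.zeros(2))
print(np.round(u_star, 3))   # coordinatewise minimizer (0.5, -3) clipped to (0.5, -1)
```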

2.4 Computation of Directional Derivatives

Suppose here again that $U := L^n_\infty\bigl([t_0, t_f], \mathcal{B}^1, \mu^1\bigr)$ is the Banach space of all
essentially bounded measurable functions $u(\cdot) : [t_0, t_f] \to \mathbb{R}^n$, normed by the
essential supremum norm. According to Definitions 2.4, 2.5 of a stationary control
and the characterization (2.13) of an optimal solution of $(2.7)_{v(\cdot)}$, we first have to
determine the directional derivative $F'_{v(\cdot)+}$. Based on the justification in Sect. 2.1,
we assume again that the solution $z(t,\omega) = S\bigl(\omega, u(\cdot)\bigr)(t)$ of (2.2b) is measurable
in $(t,\omega) \in [t_0, t_f] \times \Omega$ for each $u(\cdot) \in D$, and that $u(\cdot) \to S\bigl(\omega, u(\cdot)\bigr)$ is continuously
differentiable on $D$ for each $\omega \in \Omega$. Furthermore, we suppose that the F-differential
$\zeta(t) = \zeta(t,\omega) = \bigl(D_u S(\omega, u(\cdot))\,h(\cdot)\bigr)(t)$, $h(\cdot) \in U$, is measurable
and essentially bounded in $(t,\omega)$, and is given according to (2.3a–f) by the linear
integral equation

$$\zeta(t) - \int_{t_0}^{t} A\bigl(s, \omega, u(\cdot)\bigr)\,\zeta(s)\,ds = \int_{t_0}^{t} B\bigl(s, \omega, u(\cdot)\bigr)\,h(s)\,ds, \quad t_0 \le t \le t_f, \qquad (2.16\text{a})$$

with the Jacobians

$$A\bigl(t, \omega, u(\cdot)\bigr) := D_z g\bigl(t, \omega, z_u(t,\omega), u(t)\bigr) \qquad (2.16\text{b})$$
$$B\bigl(t, \omega, u(\cdot)\bigr) := D_u g\bigl(t, \omega, z_u(t,\omega), u(t)\bigr) \qquad (2.16\text{c})$$

and $z_u = z_u(t,\omega)$ defined, cf. (2.3f), by

$$z_u(t,\omega) := S\bigl(\omega, u(\cdot)\bigr)(t), \qquad (2.16\text{d})$$

$\bigl(t, \omega, u(\cdot)\bigr) \in [t_0, t_f] \times \Omega \times U$. Here, the random element "$\omega$" is also used, cf.
Sect. 2.1.1, to denote the realization

$$\omega := \begin{pmatrix} z_0 \\ \theta(\cdot) \end{pmatrix}$$

of the random inputs $z_0 = z_0(\omega)$, $\theta(\cdot) = \theta(\cdot, \omega)$.

Remark 2.8 Due to the measurability of the functions $z_u = z_u(t,\omega)$ and $u = u(t)$ on
$[t_0, t_f] \times \Omega$, $[t_0, t_f]$, resp., and the assumptions on the function $g$ and its Jacobians
$D_z g$, $D_u g$, see Sect. 2.1, the matrix-valued functions $(t,\omega) \to A\bigl(t,\omega,u(\cdot)\bigr)$,
$(t,\omega) \to B\bigl(t,\omega,u(\cdot)\bigr)$ are also measurable and essentially bounded on $[t_0, t_f] \times \Omega$.
Equation (2.16a) is again a vectorial Volterra integral equation, and the existence
of a unique measurable solution $\zeta(t) = \zeta(t,\omega)$ can be shown as for the Volterra
integral equation (2.2g, h).

The differential form of (2.16a) is then the linear perturbation equation

$$\dot\zeta(t) = A\bigl(t, \omega, u(\cdot)\bigr)\,\zeta(t) + B\bigl(t, \omega, u(\cdot)\bigr)\,h(t), \quad t_0 \le t \le t_f,\; \omega \in \Omega \qquad (2.16\text{e})$$
$$\zeta(t_0) = 0. \qquad (2.16\text{f})$$
The solution $\zeta = \zeta(t,\omega)$ of (2.16a), (2.16e, f), resp., is also denoted, cf. (2.3f), by

$$\zeta(t,\omega) = \zeta_{u,h}(t,\omega) := \bigl(D_u S(\omega, u(\cdot))\,h(\cdot)\bigr)(t), \quad h(\cdot) \in U. \qquad (2.16\text{g})$$

This means that the approximate $(2.7)_{v(\cdot)}$ of (2.7) has the following explicit form:

$$\min\; E\Bigl( \int_{t_0}^{t_f} L\bigl(t, \omega, z_v(t,\omega) + \zeta(t,\omega), u(t)\bigr)\,dt + G\bigl(t_f, \omega, z_v(t_f,\omega) + \zeta(t_f,\omega)\bigr) \Bigr) \qquad (2.17\text{a})$$

s.t.

$$\dot\zeta(t,\omega) = A\bigl(t, \omega, v(\cdot)\bigr)\,\zeta(t,\omega) + B\bigl(t, \omega, v(\cdot)\bigr)\bigl(u(t) - v(t)\bigr) \quad \text{a.s.} \qquad (2.17\text{b})$$
$$\zeta(t_0,\omega) = 0 \quad \text{a.s.} \qquad (2.17\text{c})$$
$$u(\cdot) \in D. \qquad (2.17\text{d})$$

With the convexity assumptions in Sect. 2.1.2, Lemma 2.3 yields that (2.17a–d) is a
convex stochastic control problem with a linear plant differential equation.

For the subsequent analysis of the stochastic control problem we now need a
representation of the directional derivative $F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr)$ by a scalar product

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = \int_{t_0}^{t_f} q(t)^T h(t)\,dt$$

with a certain deterministic vector function $q = q(t)$. From representation (2.12a)
of the directional derivative $F'_{v(\cdot)+}$ of the convex approximate $F_{v(\cdot)}$ of $F$, definition
(2.6b) of $f = f\bigl(\omega, z(\cdot), u(\cdot)\bigr)$ by an integral over $[t_0, t_f]$ and [90], Satz 1.4, with
(2.16g) we obtain

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E\Bigl( \int_{t_0}^{t_f} \Bigl( \nabla_z L\bigl(t, \omega, z_v(t,\omega) + \zeta_{v,u-v}(t,\omega), u(t)\bigr)^T \zeta_{v,h}(t,\omega)$$
$$+\; \nabla_u L\bigl(t, \omega, z_v(t,\omega) + \zeta_{v,u-v}(t,\omega), u(t)\bigr)^T h(t) \Bigr)\,dt$$
$$+\; \nabla_z G\bigl(t_f, \omega, z_v(t_f,\omega) + \zeta_{v,u-v}(t_f,\omega)\bigr)^T \zeta_{v,h}(t_f,\omega) \Bigr). \qquad (2.18)$$
Defining the gradients

$$a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) := \nabla_z L\bigl(t, \omega, z_v(t,\omega) + \zeta_{v,u-v}(t,\omega), u(t)\bigr) \qquad (2.19\text{a})$$
$$b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) := \nabla_u L\bigl(t, \omega, z_v(t,\omega) + \zeta_{v,u-v}(t,\omega), u(t)\bigr) \qquad (2.19\text{b})$$
$$c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr) := \nabla_z G\bigl(t_f, \omega, z_v(t_f,\omega) + \zeta_{v,u-v}(t_f,\omega)\bigr), \qquad (2.19\text{c})$$

measurable with respect to $(t,\omega)$, the directional derivative $F'_{v(\cdot)+}$ can be repre-
sented by

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E\Bigl( \int_{t_0}^{t_f} \Bigl( a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)^T \zeta_{v,h}(t,\omega)$$
$$+\; b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)^T h(t) \Bigr)\,dt + c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr)^T \zeta_{v,h}(t_f,\omega) \Bigr). \qquad (2.20\text{a})$$

According to (2.16a), for $\zeta_{v,h} = \zeta_{v,h}(t,\omega)$ we have

$$\zeta_{v,h}(t_f,\omega) = \int_{t_0}^{t_f} \Bigl( A\bigl(t, \omega, v(\cdot)\bigr)\,\zeta_{v,h}(t,\omega) + B\bigl(t, \omega, v(\cdot)\bigr)\,h(t) \Bigr)\,dt. \qquad (2.20\text{b})$$

Putting (2.20b) into (2.20a), we find

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E\Bigl( \int_{t_0}^{t_f} \tilde a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)^T \zeta_{v,h}(t,\omega)\,dt$$
$$+\; \int_{t_0}^{t_f} \tilde b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)^T h(t)\,dt \Bigr), \qquad (2.20\text{c})$$

where

$$\tilde a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) := a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) + A\bigl(t, \omega, v(\cdot)\bigr)^T c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr) \qquad (2.20\text{d})$$
$$\tilde b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) := b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) + B\bigl(t, \omega, v(\cdot)\bigr)^T c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr). \qquad (2.20\text{e})$$
Remark 2.9 According to Remark 2.8, the functions $(t,\omega) \to \tilde a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)$,
$(t,\omega) \to \tilde b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)$ are also measurable on $[t_0, t_f] \times \Omega$.

In order to transform the first integral in (2.20c) into the form of the second
integral in (2.20c), we introduce the $m$-vector function

$$\lambda = \lambda_{v,u}(t,\omega)$$

defined by the following integral equation depending on the random parameter $\omega$:

$$\lambda(t) - A\bigl(t, \omega, v(\cdot)\bigr)^T \int_{t}^{t_f} \lambda(s)\,ds = \tilde a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr). \qquad (2.21)$$

Under the present assumptions, this Volterra integral equation has [90] a unique
measurable solution $(t,\omega) \to \lambda_{v,u}(t,\omega)$, see also Remark 2.8. By means of (2.21)
we obtain

$$\int_{t_0}^{t_f} \tilde a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)^T \zeta_{v,h}(t,\omega)\,dt$$
$$= \int_{t_0}^{t_f} \Bigl( \lambda(t) - A\bigl(t, \omega, v(\cdot)\bigr)^T \int_{t}^{t_f} \lambda(s)\,ds \Bigr)^T \zeta_{v,h}(t,\omega)\,dt$$
$$= \int_{t_0}^{t_f} \lambda(t)^T \zeta_{v,h}(t,\omega)\,dt - \int_{t_0}^{t_f} dt \int_{t_0}^{t_f} ds\, J(s,t)\,\lambda(s)^T A\bigl(t, \omega, v(\cdot)\bigr)\,\zeta_{v,h}(t,\omega)$$
$$= \int_{t_0}^{t_f} \lambda(s)^T \zeta_{v,h}(s,\omega)\,ds - \int_{t_0}^{t_f} ds \int_{t_0}^{t_f} dt\, J(s,t)\,\lambda(s)^T A\bigl(t, \omega, v(\cdot)\bigr)\,\zeta_{v,h}(t,\omega)$$
$$= \int_{t_0}^{t_f} \lambda(s)^T \Bigl( \zeta_{v,h}(s,\omega) - \int_{t_0}^{s} A\bigl(t, \omega, v(\cdot)\bigr)\,\zeta_{v,h}(t,\omega)\,dt \Bigr)\,ds, \qquad (2.22\text{a})$$

where $J = J(s,t)$ is defined by

$$J(s,t) := \begin{cases} 0, & t_0 \le s \le t \\ 1, & t < s \le t_f. \end{cases}$$

Using now again the perturbation equation (2.16a, b), from (2.22a) we get

$$\int_{t_0}^{t_f} \tilde a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)^T \zeta_{v,h}(t,\omega)\,dt = \int_{t_0}^{t_f} \lambda(s)^T \Bigl( \int_{t_0}^{s} B\bigl(t, \omega, v(\cdot)\bigr)\,h(t)\,dt \Bigr)\,ds$$
$$= \int_{t_0}^{t_f} ds\, \lambda(s)^T \int_{t_0}^{t_f} dt\, J(s,t)\, B\bigl(t, \omega, v(\cdot)\bigr)\,h(t)$$
$$= \int_{t_0}^{t_f} dt \int_{t_0}^{t_f} ds\, J(s,t)\,\lambda(s)^T B\bigl(t, \omega, v(\cdot)\bigr)\,h(t)$$
$$= \int_{t_0}^{t_f} \Bigl( \int_{t}^{t_f} \lambda(s)\,ds \Bigr)^T B\bigl(t, \omega, v(\cdot)\bigr)\,h(t)\,dt = \int_{t_0}^{t_f} \Bigl( B\bigl(t, \omega, v(\cdot)\bigr)^T \int_{t}^{t_f} \lambda(s)\,ds \Bigr)^T h(t)\,dt. \qquad (2.22\text{b})$$

Inserting (2.22b) into (2.20c), we have

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E \int_{t_0}^{t_f} \Bigl( B\bigl(t, \omega, v(\cdot)\bigr)^T \int_{t}^{t_f} \lambda(s)\,ds + \tilde b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) \Bigr)^T h(t)\,dt. \qquad (2.23)$$

By means of (2.20d), the integral equation (2.21) may be written as

$$\lambda(t) - A\bigl(t, \omega, v(\cdot)\bigr)^T \Bigl( c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr) + \int_{t}^{t_f} \lambda(s)\,ds \Bigr) = a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr). \qquad (2.24)$$

According to (2.24), defining the $m$-vector function

$$y = y_{v,u}(t,\omega) := c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr) + \int_{t}^{t_f} \lambda_{v,u}(s,\omega)\,ds, \qquad (2.25\text{a})$$

we get

$$\lambda(t) - A\bigl(t, \omega, v(\cdot)\bigr)^T y_{v,u}(t,\omega) = a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr). \qquad (2.25\text{b})$$

Replacing in (2.25b) the variable $t$ by $s$ and then integrating equation (2.25b)
over the time interval $[t, t_f]$ yields

$$\int_{t}^{t_f} \lambda(s)\,ds = \int_{t}^{t_f} \Bigl( A\bigl(s, \omega, v(\cdot)\bigr)^T y_{v,u}(s,\omega) + a\bigl(s, \omega, v(\cdot), u(\cdot)\bigr) \Bigr)\,ds. \qquad (2.25\text{c})$$

Finally, using again (2.25a), from (2.25c) we get

$$y_{v,u}(t,\omega) = c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr) + \int_{t}^{t_f} \Bigl( A\bigl(s, \omega, v(\cdot)\bigr)^T y_{v,u}(s,\omega) + a\bigl(s, \omega, v(\cdot), u(\cdot)\bigr) \Bigr)\,ds. \qquad (2.25\text{d})$$

Obviously, the differential form of the Volterra integral equation (2.25d) for $y =
y_{v,u}(t,\omega)$ reads:

$$\dot y(t) = -A\bigl(t, \omega, v(\cdot)\bigr)^T y(t) - a\bigl(t, \omega, v(\cdot), u(\cdot)\bigr), \quad t_0 \le t \le t_f, \qquad (2.26\text{a})$$
$$y(t_f) = c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr). \qquad (2.26\text{b})$$

System (2.25d), (2.26a, b), resp., is called the adjoint integral (differential)
equation related to the perturbation equation (2.16a, b).

By means of (2.25a), from (2.20e) and (2.23) we now obtain

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E \int_{t_0}^{t_f} \Bigl( B\bigl(t, \omega, v(\cdot)\bigr)^T \int_{t}^{t_f} \lambda(s)\,ds + b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr)$$
$$+\; B\bigl(t, \omega, v(\cdot)\bigr)^T c\bigl(t_f, \omega, v(\cdot), u(\cdot)\bigr) \Bigr)^T h(t)\,dt$$
$$= E \int_{t_0}^{t_f} \Bigl( B\bigl(t, \omega, v(\cdot)\bigr)^T y_{v,u}(t,\omega) + b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) \Bigr)^T h(t)\,dt.$$
Summarizing the above transformations, we have the following result:

Theorem 2.3 Let $(t,\omega) \to y_{v,u}(t,\omega)$ be the unique measurable solution of the
adjoint integral (differential) equation (2.25d), (2.26a, b), respectively. Then

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E \int_{t_0}^{t_f} \Bigl( B\bigl(t, \omega, v(\cdot)\bigr)^T y_{v,u}(t,\omega) + b\bigl(t, \omega, v(\cdot), u(\cdot)\bigr) \Bigr)^T h(t)\,dt. \qquad (2.27)$$

Note that $F'_{v(\cdot)+}\bigl(u(\cdot); \cdot\bigr)$ is also the Gâteaux differential of $F_{v(\cdot)}$ at $u(\cdot)$.

For a further discussion of formula (2.27) for $F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr)$, in generalization
of the Hamiltonian of a deterministic control problem, see e.g. [73], we now introduce
the stochastic Hamiltonian related to the partly linearized control problem
(2.17a–d) based on a reference control $v(\cdot)$:

$$H_{v(\cdot)}(t, \omega, \zeta, y, u) := L\bigl(t, \omega, z_v(t,\omega) + \zeta, u\bigr) + y^T \Bigl( A\bigl(t, \omega, v(\cdot)\bigr)\,\zeta + B\bigl(t, \omega, v(\cdot)\bigr)\bigl(u - v(t)\bigr) \Bigr), \qquad (2.28\text{a})$$

$(t, \omega, \zeta, y, u) \in [t_0, t_f] \times \Omega \times \mathbb{R}^m \times \mathbb{R}^m \times \mathbb{R}^n$. Using $H_{v(\cdot)}$, for $F'_{v(\cdot)+}$ we find the
representation

$$F'_{v(\cdot)+}\bigl(u(\cdot); h(\cdot)\bigr) = E \int_{t_0}^{t_f} \nabla_u H_{v(\cdot)}\bigl(t, \omega, \zeta_{v,u-v}(t,\omega), y_{v,u}(t,\omega), u(t)\bigr)^T h(t)\,dt, \qquad (2.28\text{b})$$

where $\zeta_{v,u-v} = \zeta_{v,u-v}(t,\omega)$ is the solution of the perturbation differential (integral)
equation (2.17b, c), (2.20b), resp., and $y_{v,u} = y_{v,u}(t,\omega)$ denotes the solution of the
adjoint differential (integral) equation (2.26a, b), (2.25d).
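For a fixed scenario $\omega$, the gradient representation (2.27) can be evaluated numerically by integrating the state forward and the adjoint equation (2.26a, b) backward. The scalar linear-quadratic test problem and the explicit Euler discretization below are illustrative assumptions, not taken from the text:

```python
import numpy as np

def cost_and_gradient(u, a, b, q, r, g, z0, dt):
    """Forward Euler for the state z' = a z + b u, then a backward sweep for
    the adjoint y' = -a y - q z with y(t_f) = g z(t_f) (cf. (2.26a, b));
    returns the cost F(u) and the gradient integrand b*y + r*u on the grid."""
    N = len(u)
    z = np.empty(N + 1); z[0] = z0
    for k in range(N):                       # state equation, forward in time
        z[k + 1] = z[k] + dt * (a * z[k] + b * u[k])
    J = dt * 0.5 * np.sum(q * z[:-1] ** 2 + r * u ** 2) + 0.5 * g * z[-1] ** 2
    y = np.empty(N + 1); y[-1] = g * z[-1]   # terminal condition y(t_f)
    for k in range(N - 1, -1, -1):           # adjoint equation, backward
        y[k] = y[k + 1] + dt * (a * y[k + 1] + q * z[k])
    grad = dt * (b * y[1:] + r * u)          # grad_u H = B^T y + grad_u L
    return J, grad

dt, N = 0.01, 100
u = 0.3 * np.ones(N)
J, grad = cost_and_gradient(u, a=-0.5, b=1.0, q=1.0, r=0.1, g=1.0, z0=1.0, dt=dt)
# check one component of the adjoint gradient against a finite difference
e = np.zeros(N); e[40] = 1e-6
J2, _ = cost_and_gradient(u + e, -0.5, 1.0, 1.0, 0.1, 1.0, 1.0, dt)
print(abs((J2 - J) / 1e-6 - grad[40]) < 1e-4)
```

Averaging such per-scenario gradients over samples of $\omega$ yields a Monte Carlo estimate of the expectation in (2.27).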
Let $u^0(\cdot) \in U$ denote a given initial control. By means of (2.28a, b), the necessary
and sufficient condition for a control $u^1(\cdot)$ to be an element of the set $M\bigl(u^0(\cdot)\bigr)$, i.e.
a solution of the approximate convex problem $(2.7)_{u^0(\cdot)}$, reads, see Definition 2.4
and (2.13):

$$E \int_{t_0}^{t_f} \nabla_u H_{u^0(\cdot)}\bigl(t, \omega, \zeta_{u^0,u^1-u^0}(t,\omega), y_{u^0,u^1}(t,\omega), u^1(t)\bigr)^T \bigl(u(t) - u^1(t)\bigr)\,dt \ge 0, \quad u(\cdot) \in D. \qquad (2.29)$$

Introducing, for given controls $u^0(\cdot), u^1(\cdot)$, the convex mean value function
$\bar H_{u^0(\cdot),u^1(\cdot)} = \bar H_{u^0(\cdot),u^1(\cdot)}\bigl(u(\cdot)\bigr)$ defined by

$$\bar H_{u^0(\cdot),u^1(\cdot)}\bigl(u(\cdot)\bigr) := E \int_{t_0}^{t_f} H_{u^0(\cdot)}\bigl(t, \omega, \zeta_{u^0,u^1-u^0}(t,\omega), y_{u^0,u^1}(t,\omega), u(t)\bigr)\,dt, \qquad (2.30\text{a})$$

corresponding to the representation (2.12a) and (2.18) of the directional derivative
of $F$, it is seen that the left-hand side of (2.29) is the directional derivative of the
function $\bar H_{u^0(\cdot),u^1(\cdot)} = \bar H_{u^0(\cdot),u^1(\cdot)}\bigl(u(\cdot)\bigr)$ at $u^1(\cdot)$ with increment $h(\cdot) := u(\cdot) - u^1(\cdot)$.
Consequently, (2.29) is equivalent to the condition:

$$\bar H'_{u^0(\cdot),u^1(\cdot)+}\bigl(u^1(\cdot); u(\cdot) - u^1(\cdot)\bigr) \ge 0, \quad u(\cdot) \in D. \qquad (2.30\text{b})$$

Due to the equivalence of the conditions (2.29) and (2.30b), for a control $u^1(\cdot) \in
M\bigl(u^0(\cdot)\bigr)$, i.e. a solution of $(2.7)_{u^0(\cdot)}$, we have the following characterization:

Theorem 2.4 Let $u^0(\cdot) \in D$ be a given initial control. A control $u^1(\cdot) \in U$ is a
solution of $(2.7)_{u^0(\cdot)}$, i.e., $u^1(\cdot) \in M\bigl(u^0(\cdot)\bigr)$, if and only if $u^1(\cdot)$ is an optimal solution
of the convex stochastic optimization problem

$$\min\; \bar H_{u^0(\cdot),u^1(\cdot)}\bigl(u(\cdot)\bigr) \quad \text{s.t.} \quad u(\cdot) \in D. \qquad (2.31\text{a})$$

In the following we therefore study the convex optimization problem (2.31a),
where we next replace the yet unknown functions $\zeta = \zeta_{u^0,u^1-u^0}(t,\omega)$, $y =
y_{u^0,u^1}(t,\omega)$ by arbitrary stochastic functions $\zeta = \zeta(t,\omega)$, $y = y(t,\omega)$. Hence, we
consider the mean value function $\bar H_{u^0(\cdot),\zeta(\cdot,\cdot),y(\cdot,\cdot)} = \bar H_{u^0(\cdot),\zeta(\cdot,\cdot),y(\cdot,\cdot)}\bigl(u(\cdot)\bigr)$ defined, see
(2.30a), by

$$\bar H_{u^0(\cdot),\zeta(\cdot,\cdot),y(\cdot,\cdot)}\bigl(u(\cdot)\bigr) = E \int_{t_0}^{t_f} H_{u^0(\cdot)}\bigl(t, \omega, \zeta(t,\omega), y(t,\omega), u(t)\bigr)\,dt. \qquad (2.31\text{b})$$
In practice, the admissible domain $D$ is often given by

$$D = \bigl\{ u(\cdot) \in U : u(t) \in D_t,\; t_0 \le t \le t_f \bigr\}, \qquad (2.32)$$

where $D_t \subset \mathbb{R}^n$ is a given convex subset of $\mathbb{R}^n$ for each time $t$, $t_0 \le t \le
t_f$. Since $\bar H_{u^0(\cdot),\zeta(\cdot,\cdot),y(\cdot,\cdot)}\bigl(u(\cdot)\bigr)$ has an integral form, the minimum value of
$\bar H_{u^0(\cdot),\zeta(\cdot,\cdot),y(\cdot,\cdot)}\bigl(u(\cdot)\bigr)$ on $D$ can be obtained—in case (2.32)—by solving the finite-
dimensional stochastic optimization problem

$$\min\; E\,H_{u^0(\cdot)}\bigl(t, \omega, \zeta(t,\omega), y(t,\omega), u\bigr) \quad \text{s.t.} \quad u \in D_t \qquad (P)^t_{u^0(\cdot),\zeta,y}$$

for each $t \in [t_0, t_f]$. Let then

$$u^e = u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr), \quad t_0 \le t \le t_f, \qquad (2.33\text{a})$$

denote a solution of $(P)^t_{u^0(\cdot),\zeta,y}$ for each $t_0 \le t \le t_f$. Obviously, if

$$u^e_{u^0(\cdot)}\bigl(\cdot, \zeta(\cdot,\cdot), y(\cdot,\cdot)\bigr) \in U \quad \Bigl(\text{and therefore } u^e_{u^0(\cdot)}\bigl(\cdot, \zeta(\cdot,\cdot), y(\cdot,\cdot)\bigr) \in D\Bigr), \qquad (2.33\text{b})$$

then

$$u^e_{u^0(\cdot)}\bigl(\cdot, \zeta(\cdot,\cdot), y(\cdot,\cdot)\bigr) \in \operatorname*{argmin}_{u(\cdot) \in D} \bar H_{u^0(\cdot),\zeta(\cdot,\cdot),y(\cdot,\cdot)}\bigl(u(\cdot)\bigr). \qquad (2.33\text{c})$$

Because of Theorem 2.4, problems $(P)^t_{u^0(\cdot),\zeta,y}$ and (2.33a–c), we introduce, cf.
[73], the following definition:

Definition 2.6 Let $u^0(\cdot) \in D$ be a given initial control. For measurable functions
$\zeta = \zeta(t,\omega)$, $y = y(t,\omega)$ on $(\Omega, \mathcal{A}, P)$, let

$$u^e = u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr), \quad t_0 \le t \le t_f,$$

denote a solution of $(P)^t_{u^0(\cdot),\zeta,y}$, $t_0 \le t \le t_f$. The function $u^e = u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr)$
is called an $H_{u^0(\cdot)}$-minimal control of (2.17a–d). The stochastic Hamiltonian $H_{u^0(\cdot)}$
is called regular (strictly regular, resp.) if an $H_{u^0(\cdot)}$-minimal control exists (exists
and is uniquely determined).

Remark 2.10 Obviously, by means of Definition 2.6, for an optimal solution $u^1(\cdot)$
of $(2.7)_{u^0(\cdot)}$ we then have the "model control law" $u^e = u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr)$
depending on the still unknown state and costate functions $\zeta(t,\cdot)$, $y(t,\cdot)$, respectively.
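In the frequent special case of a quadratic control cost and a box-shaped $D_t$, the pointwise problem $(P)^t_{u^0(\cdot),\zeta,y}$ reduces to clipping an unconstrained minimizer to the admissible interval. The concrete Hamiltonian form, the sampled scenarios for the costate, and all parameter values below are illustrative assumptions:

```python
import numpy as np

def h_minimal_control(y_samples, b_samples, r, u_lo, u_hi):
    """Pointwise H-minimal control for a Hamiltonian of the assumed form
    H = 0.5*r*u**2 + (b*y)*u + (terms independent of u): minimizing E[H]
    over the box [u_lo, u_hi] clips the unconstrained minimizer
    u = -E[b*y]/r to the admissible interval."""
    grad_coeff = np.mean(b_samples * y_samples)   # E[B^T y] over scenarios
    return float(np.clip(-grad_coeff / r, u_lo, u_hi))

# sampled scenarios for the costate y(t, omega_j) at a fixed time t
rng = np.random.default_rng(0)
y_s = rng.normal(2.0, 0.5, size=1000)
b_s = np.ones(1000)                 # here B(t, omega, v(.)) = 1 for all omega
u_e = h_minimal_control(y_s, b_s, r=1.0, u_lo=-1.0, u_hi=1.0)
print(u_e)   # unconstrained minimizer near -2.0, clipped to the bound -1.0
```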
2.5 Canonical (Hamiltonian) System of Differential Equations/Two-Point Boundary Value Problem

For a given initial control $u^0(\cdot) \in D$, let $u^e = u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr)$, $t_0 \le t \le
t_f$, denote an $H_{u^0(\cdot)}$-minimal control of (2.17a–d) according to Definition 2.6.
Due to (2.17b, c) and (2.26a, b) we consider, cf. [73], the following so-called
canonical or Hamiltonian system of differential equations, hence a two-point
boundary value problem with random parameters for the vector functions $(\zeta, y) =
\bigl(\zeta(t,\omega), y(t,\omega)\bigr)$, $t_0 \le t \le t_f$, $\omega \in \Omega$:

$$\dot\zeta(t,\omega) = A\bigl(t, \omega, u^0(\cdot)\bigr)\,\zeta(t,\omega) + B\bigl(t, \omega, u^0(\cdot)\bigr)\Bigl( u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr) - u^0(t) \Bigr),$$
$$t_0 \le t \le t_f, \qquad (2.34\text{a})$$
$$\zeta(t_0,\omega) = 0 \quad \text{a.s.} \qquad (2.34\text{b})$$
$$\dot y(t,\omega) = -A\bigl(t, \omega, u^0(\cdot)\bigr)^T y(t,\omega) - \nabla_z L\Bigl(t, \omega, z_{u^0}(t,\omega) + \zeta(t,\omega),\, u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr)\Bigr),$$
$$t_0 \le t \le t_f, \qquad (2.34\text{c})$$
$$y(t_f,\omega) = \nabla_z G\bigl(t_f, \omega, z_{u^0}(t_f,\omega) + \zeta(t_f,\omega)\bigr). \qquad (2.34\text{d})$$

Remark 2.11 Note that the (deterministic) control law $u^e = u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\cdot), y(t,\cdot)\bigr)$
depends on the whole random variables $\bigl(\zeta(t,\omega), y(t,\omega)\bigr)$, $\omega \in \Omega$, or on the occurring
moments. In case of a discrete parameter distribution $\Omega = \{\omega_1, \dots, \omega_\rho\}$, see
Sect. 2.1, the control law $u^e_{u^0(\cdot)}$ depends,

$$u^e_{u^0(\cdot)} = u^e_{u^0(\cdot)}\bigl(t, \zeta(t,\omega_1), \dots, \zeta(t,\omega_\rho), y(t,\omega_1), \dots, y(t,\omega_\rho)\bigr),$$

on the $2\rho$ unknown functions $\zeta(t,\omega_i)$, $y(t,\omega_i)$, $i = 1, \dots, \rho$.

Suppose now that $(\zeta^1, y^1) = \bigl(\zeta^1(t,\omega), y^1(t,\omega)\bigr)$, $t_0 \le t \le t_f$, $\omega \in (\Omega, \mathcal{A}, P)$,
is the unique measurable solution of the canonical stochastic system (2.34a–d), and
define

$$u^1(t) := u^e_{u^0(\cdot)}\bigl(t, \zeta^1(t,\cdot), y^1(t,\cdot)\bigr), \quad t_0 \le t \le t_f. \qquad (2.35)$$
System (2.34a–d) then takes the following form:

$$\dot\zeta^1(t,\omega) = A\bigl(t, \omega, u^0(\cdot)\bigr)\,\zeta^1(t,\omega) + B\bigl(t, \omega, u^0(\cdot)\bigr)\bigl(u^1(t) - u^0(t)\bigr),$$
$$t_0 \le t \le t_f, \qquad (2.34\text{a}')$$
$$\zeta^1(t_0,\omega) = 0 \quad \text{a.s.} \qquad (2.34\text{b}')$$
$$\dot y^1(t,\omega) = -A\bigl(t, \omega, u^0(\cdot)\bigr)^T y^1(t,\omega) - \nabla_z L\bigl(t, \omega, z_{u^0}(t,\omega) + \zeta^1(t,\omega), u^1(t)\bigr), \quad t_0 \le t \le t_f, \qquad (2.34\text{c}')$$
$$y^1(t_f,\omega) = \nabla_z G\bigl(t_f, \omega, z_{u^0}(t_f,\omega) + \zeta^1(t_f,\omega)\bigr). \qquad (2.34\text{d}')$$

Assuming that

$$u^1(\cdot) \in U, \qquad (2.36\text{a})$$

due to the definition of an $H_{u^0(\cdot)}$-minimal control we also have

$$u^1(\cdot) \in D. \qquad (2.36\text{b})$$

According to the notation introduced in (2.16a–g), (2.25a–d)/(2.26a, b), resp.,
and the above assumed uniqueness of the solution $(\zeta^1, y^1)$ of (2.34a–d) we have

$$\zeta^1(t,\omega) = \zeta_{u^0,u^1-u^0}(t,\omega) \qquad (2.37\text{a})$$
$$y^1(t,\omega) = y_{u^0,u^1}(t,\omega) \qquad (2.37\text{b})$$

with the control $u^1(\cdot)$ given by (2.35).

Due to the above construction, we know that $u^1(t)$ solves $(P)^t_{u^0(\cdot),\zeta,y}$ for

$$\zeta = \zeta(t,\omega) := \zeta^1(t,\omega) = \zeta_{u^0,u^1-u^0}(t,\omega)$$
$$y = y(t,\omega) := y^1(t,\omega) = y_{u^0,u^1}(t,\omega)$$

for each $t_0 \le t \le t_f$. Hence, the control $u^1(\cdot)$, given by (2.35), is a solution of (2.31a).
Summarizing the above construction, from Theorem 2.4 we obtain this result:

Theorem 2.5 Suppose that $D$ is given by (2.32), $M\bigl(u^0(\cdot)\bigr) \ne \emptyset$, and $(P)^t_{u^0(\cdot),\zeta,y}$
has an optimal solution for each $t$, $t_0 \le t \le t_f$, and all measurable functions
$\zeta(t,\cdot)$, $y(t,\cdot)$. Moreover, suppose that the canonical system (2.34a–d) has a unique
measurable solution $\bigl(\zeta^1(t,\omega), y^1(t,\omega)\bigr)$, $t_0 \le t \le t_f$, $\omega \in \Omega$, such that $u^1(\cdot) \in U$,
where $u^1(\cdot)$ is defined by (2.35). Then $u^1(\cdot)$ is a solution of $(2.7)_{u^0(\cdot)}$.

2.6 Stationary Controls

Suppose here that condition (2.14) holds for all controls $v(\cdot)$ under consideration,
cf. Lemma 2.4. Having a method for the construction of improved approximate
controls $u^1(\cdot) \in M\bigl(u^0(\cdot)\bigr)$ related to an initial control $u^0(\cdot) \in D$, we now consider
the construction of stationary controls of the control problem (2.7), i.e. elements
$\bar u(\cdot) \in D$ such that $\bar u(\cdot) \in M\bigl(\bar u(\cdot)\bigr)$, see Definition 2.5.

Starting again with formula (2.27), by means of (2.14), for an element $v(\cdot) \in D$
we have

$$F'_+\bigl(v(\cdot); h(\cdot)\bigr) = F'_{v(\cdot)+}\bigl(v(\cdot); h(\cdot)\bigr)$$
$$= E \int_{t_0}^{t_f} \Bigl( B\bigl(t, \omega, v(\cdot)\bigr)^T y_v(t,\omega) + b\bigl(t, \omega, v(\cdot), v(\cdot)\bigr) \Bigr)^T h(t)\,dt, \qquad (2.38\text{a})$$

where

$$y_v(t,\omega) := y_{v,v}(t,\omega) \qquad (2.38\text{b})$$

fulfills, cf. (2.26a, b), the adjoint differential equation

$$\dot y(t,\omega) = -A\bigl(t, \omega, v(\cdot)\bigr)^T y(t,\omega) - \nabla_z L\bigl(t, \omega, z_v(t,\omega), v(t)\bigr) \qquad (2.39\text{a})$$
$$y(t_f,\omega) = \nabla_z G\bigl(t_f, \omega, z_v(t_f,\omega)\bigr). \qquad (2.39\text{b})$$

Moreover, cf. (2.16b, c),

$$A\bigl(t, \omega, v(\cdot)\bigr) = D_z g\bigl(t, \omega, z_v(t,\omega), v(t)\bigr) \qquad (2.39\text{c})$$
$$B\bigl(t, \omega, v(\cdot)\bigr) = D_u g\bigl(t, \omega, z_v(t,\omega), v(t)\bigr) \qquad (2.39\text{d})$$

and, see (2.19b),

$$b\bigl(t, \omega, v(\cdot), v(\cdot)\bigr) = \nabla_u L\bigl(t, \omega, z_v(t,\omega), v(t)\bigr), \qquad (2.39\text{e})$$

where $z_v = z_v(t,\omega)$ solves the dynamic equation

$$\dot z(t,\omega) = g\bigl(t, \omega, z(t,\omega), v(t)\bigr), \quad t_0 \le t \le t_f, \qquad (2.39\text{f})$$
$$z(t_0,\omega) = z_0(\omega). \qquad (2.39\text{g})$$
Using now the stochastic Hamiltonian, cf. (2.28a),

$$H(t, \omega, z, y, u) := L(t, \omega, z, u) + y^T g(t, \omega, z, u) \qquad (2.40\text{a})$$

related to the basic control problem (2.9a–d), from (2.38a, b), (2.39a–g) we get the
representation, cf. (2.28a, b),

$$F'_+\bigl(v(\cdot); h(\cdot)\bigr) = E \int_{t_0}^{t_f} \nabla_u H\bigl(t, \omega, z_v(t,\omega), y_v(t,\omega), v(t)\bigr)^T h(t)\,dt. \qquad (2.40\text{b})$$

According to condition (2.13), a stationary control of (2.7), hence an element
$\bar u(\cdot) \in D$ such that $\bar u(\cdot)$ is an optimal solution of $(2.7)_{\bar u(\cdot)}$, is characterized, see (2.14),
by

$$F'_+\bigl(\bar u(\cdot); u(\cdot) - \bar u(\cdot)\bigr) \ge 0 \quad \text{for all } u(\cdot) \in D.$$

Thus, for stationary controls $\bar u(\cdot) \in D$ of problem (2.7) we have the characterization

$$E \int_{t_0}^{t_f} \nabla_u H\bigl(t, \omega, z_{\bar u}(t,\omega), y_{\bar u}(t,\omega), \bar u(t)\bigr)^T \bigl(u(t) - \bar u(t)\bigr)\,dt \ge 0, \quad u(\cdot) \in D. \qquad (2.41)$$

Comparing (2.29) and (2.41), corresponding to (2.30a, b), for given $w(\cdot) \in D$ we
introduce here the function

$$\bar H_w\bigl(u(\cdot)\bigr) := E \int_{t_0}^{t_f} H\bigl(t, \omega, z_w(t,\omega), y_w(t,\omega), u(t)\bigr)\,dt, \qquad (2.42\text{a})$$

and we consider the optimization problem

$$\min\; \bar H_w\bigl(u(\cdot)\bigr) \quad \text{s.t.} \quad u(\cdot) \in D. \qquad (2.42\text{b})$$

Remark 2.12 Because of the assumptions in Sect. 2.1.2, problem (2.42b) is (strictly)
convex, provided that the process differential equation (2.1a) is affine-linear with
respect to $u$, hence

$$\dot z(t,\omega) = g(t, \omega, z, u) = \hat g(t, \omega, z) + \hat B(t, \omega, z)\,u \qquad (2.43)$$

with given vector- and matrix-valued functions $\hat g = \hat g(t, \omega, z)$, $\hat B = \hat B(t, \omega, z)$.
If differentiation and integration/expectation in (2.42a) may be interchanged,
which is assumed in the following, then (2.41) is a necessary condition for

$$\bar u(\cdot) \in \operatorname*{argmin}_{u(\cdot) \in D} \bar H_{\bar u}\bigl(u(\cdot)\bigr), \qquad (2.44)$$

cf. (2.30a, b), (2.31a, b). Corresponding to Theorem 2.4, here we have this result:

Theorem 2.6 (Optimality Condition for Stationary Controls) Suppose that a
control $\bar u(\cdot) \in D$ fulfills (2.44). Then $\bar u(\cdot)$ is a stationary control of (2.7).
2.7 Canonical (Hamiltonian) System of Differential Equations

Assume now again that the feasible domain $D$ is given by (2.32). In order to solve
the optimization problem (2.42a, b), corresponding to $(P)^t_{u^0(\cdot),\zeta,y}$, here we consider
the finite-dimensional optimization problem

$$\min\; E\,H\bigl(t, \omega, z(\omega), y(\omega), u\bigr) \quad \text{s.t.} \quad u \in D_t \qquad (P)^t_{z,y}$$

for each $t$, $t_0 \le t \le t_f$. Furthermore, we use again the following definition, cf.
Definition 2.6:

Definition 2.7 For measurable functions $z(\cdot)$, $y(\cdot)$ on $(\Omega, \mathcal{A}, P)$, let

$$u^e = u^e\bigl(t, z(\cdot), y(\cdot)\bigr), \quad t_0 \le t \le t_f,$$

denote a solution of $(P)^t_{z,y}$. The function $u^e = u^e\bigl(t, z(\cdot), y(\cdot)\bigr)$, $t_0 \le t \le t_f$, is called
an $H$-minimal control of (2.9a–d). The stochastic Hamiltonian $H$ is called regular
(strictly regular, resp.) if an $H$-minimal control exists (exists and is uniquely
determined).

For a given $H$-minimal control $u^e = u^e\bigl(t, z(\cdot), y(\cdot)\bigr)$ we now consider,
see (2.34a–d), the following canonical (Hamiltonian) two-point boundary value
problem with random parameters:

$$\dot z(t,\omega) = g\Bigl(t, \omega, z(t,\omega), u^e\bigl(t, z(t,\cdot), y(t,\cdot)\bigr)\Bigr), \quad t_0 \le t \le t_f \qquad (2.45\text{a})$$
$$z(t_0,\omega) = z_0(\omega) \qquad (2.45\text{b})$$
$$\dot y(t,\omega) = -D_z g\Bigl(t, \omega, z(t,\omega), u^e\bigl(t, z(t,\cdot), y(t,\cdot)\bigr)\Bigr)^T y(t,\omega)$$
$$\quad - \nabla_z L\Bigl(t, \omega, z(t,\omega), u^e\bigl(t, z(t,\cdot), y(t,\cdot)\bigr)\Bigr), \quad t_0 \le t \le t_f \qquad (2.45\text{c})$$
$$y(t_f,\omega) = \nabla_z G\bigl(t_f, \omega, z(t_f,\omega)\bigr). \qquad (2.45\text{d})$$

Remark 2.13 In case of a discrete distribution $\Omega = \{\omega_1, \dots, \omega_\rho\}$, $P(\omega = \omega_j)$, $j =
1, \dots, \rho$, corresponding to Sect. 3.1, for the $H$-minimal control we have

$$u^e = u^e\bigl(t, z(t,\omega_1), \dots, z(t,\omega_\rho), y(t,\omega_1), \dots, y(t,\omega_\rho)\bigr).$$

Thus, (2.45a–d) is then an ordinary two-point boundary value problem for the $2\rho$
unknown functions $z = z(t,\omega_j)$, $y = y(t,\omega_j)$, $j = 1, \dots, \rho$.
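In the discrete-distribution case, the resulting ordinary two-point boundary value problem can be attacked by shooting on the unknown initial costates. The sketch below does this for a scalar linear-quadratic problem with two scenarios for an uncertain drift parameter; the plant, cost coefficients, Euler discretization, and the assumption that the terminal defect is affine in the initial costates (true here because the coupled system is linear) are all illustrative choices, not from the text:

```python
import numpy as np

def integrate(y0, a, p, b, q, r, g, z0, dt, N):
    """Euler-integrate the coupled two-scenario system (cf. (2.45a-d)) forward
    from guessed initial costates y0; the H-minimal control for the assumed
    Hamiltonian sum_j p_j (0.5 q z_j^2 + 0.5 r u^2 + y_j (a_j z_j + b u))
    is u = -(b/r) * E[y]. Returns the terminal defect y_j(T) - g z_j(T)."""
    z = np.full(2, z0)
    y = np.array(y0, dtype=float)
    for _ in range(N):
        u = -(b / r) * (p @ y)              # H-minimal control u^e(t, y(t,.))
        z_new = z + dt * (a * z + b * u)    # state equations, one per scenario
        y = y + dt * (-(a * y) - q * z)     # costate equations
        z = z_new
    return y - g * z

def shoot(a, p, b, q, r, g, z0, dt, N):
    """Linear shooting: the defect is affine in y0, so two basis integrations
    plus one at y0 = 0 determine the correct initial costates exactly."""
    d0 = integrate([0.0, 0.0], a, p, b, q, r, g, z0, dt, N)
    M = np.column_stack(
        [integrate(e, a, p, b, q, r, g, z0, dt, N) - d0 for e in np.eye(2)]
    )
    return np.linalg.solve(M, -d0)

a = np.array([-1.0, 0.5])      # two scenarios for the uncertain drift a(omega)
p = np.array([0.5, 0.5])       # their probabilities
y0 = shoot(a, p, b=1.0, q=1.0, r=0.5, g=1.0, z0=1.0, dt=0.001, N=1000)
defect = integrate(y0, a, p, 1.0, 1.0, 0.5, 1.0, 1.0, 0.001, 1000)
print(np.max(np.abs(defect)) < 1e-8)   # boundary conditions met at t_f
```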
 
Let $(\bar z, \bar y) = \bigl(\bar z(t,\omega), \bar y(t,\omega)\bigr)$, $t_0 \le t \le t_f$, $\omega \in (\Omega, \mathcal{A}, P)$, denote the unique
measurable solution of (2.45a–d) and define:

$$\bar u(t) := u^e\bigl(t, \bar z(t,\cdot), \bar y(t,\cdot)\bigr), \quad t_0 \le t \le t_f. \qquad (2.46)$$

Due to (2.16e) and (2.38b), (2.39a, b) we have

$$\bar z(t,\omega) = z_{\bar u}(t,\omega), \quad t_0 \le t \le t_f,\; \omega \in (\Omega, \mathcal{A}, P) \qquad (2.47\text{a})$$
$$\bar y(t,\omega) = y_{\bar u}(t,\omega), \quad t_0 \le t \le t_f,\; \omega \in (\Omega, \mathcal{A}, P), \qquad (2.47\text{b})$$

hence,

$$\bar u(t) = u^e\bigl(t, z_{\bar u}(t,\cdot), y_{\bar u}(t,\cdot)\bigr), \quad t_0 \le t \le t_f. \qquad (2.47\text{c})$$

Assuming that

$$\bar u(\cdot) \in U \;(\text{and therefore } \bar u(\cdot) \in D), \qquad (2.48)$$

we get this result:

Theorem 2.7 Suppose that the Hamiltonian system (2.45a–d) has a unique measurable
solution $(\bar z, \bar y) = \bigl(\bar z(t,\omega), \bar y(t,\omega)\bigr)$, and define $\bar u(\cdot)$ by (2.46) with an
$H$-minimal control $u^e = u^e(t, \cdot, \cdot)$. If $\bar u(\cdot) \in U$, then $\bar u(\cdot)$ is a stationary control.
Proof According to the construction of $(\bar z, \bar y, \bar u)$, the control $\bar u(\cdot) \in D$ minimizes
$\bar H_{\bar u}\bigl(u(\cdot)\bigr)$ on $D$. Hence,

$$\bar u(\cdot) \in \operatorname*{argmin}_{u(\cdot) \in D} \bar H_{\bar u}\bigl(u(\cdot)\bigr).$$

Theorem 2.6 then yields that $\bar u(\cdot)$ is a stationary control.

2.8 Computation of Expectations by Means of Taylor Expansions

Corresponding to the assumptions in Sect. 2.1, based on a parametric representation of the random differential equation with a finite dimensional random parameter vector $\theta = \theta(\omega)$, we suppose that
$$g(t,\omega,z,u) = \tilde g(t,\theta,z,u) \tag{2.49a}$$
$$z_0(\omega) = \tilde z_0(\theta) \tag{2.49b}$$
$$L(t,\omega,z,u) = \tilde L(t,\theta,z,u) \tag{2.49c}$$
$$G(t,\omega,z) = \tilde G(t,\theta,z). \tag{2.49d}$$
Here,
$$\theta = \theta(\omega),\quad \omega \in (\Omega,\mathcal A,P), \tag{2.49e}$$
denotes the time-independent $r$-vector of random model parameters and random initial values, and $\tilde g, \tilde z_0, \tilde L, \tilde G$ are sufficiently smooth functions of the variables indicated in (2.49a–d). For simplification of notation we omit the symbol "~" and write
$$g(t,\omega,z,u) := g\bigl(t,\theta(\omega),z,u\bigr) \tag{2.49a'}$$
$$z_0(\omega) := z_0\bigl(\theta(\omega)\bigr) \tag{2.49b'}$$
$$L(t,\omega,z,u) := L\bigl(t,\theta(\omega),z,u\bigr) \tag{2.49c'}$$
$$G(t,\omega,z) := G\bigl(t,\theta(\omega),z\bigr). \tag{2.49d'}$$

Since the approximate problem (2.17a–d), obtained by the above described inner linearization, has the same basic structure as the original problem (2.9a–d), it is sufficient to describe the procedure for problem (2.9a–d). Again, for simplification, the conditional expectation $E(\dots\mid\mathcal A_{t_0})$ given the information $\mathcal A_{t_0}$ up to the considered starting time $t_0$ is denoted by "$E$". Thus, let
$$\bar\theta = \bar\theta^{(t_0)} := E\theta(\omega) = E\bigl(\theta(\omega)\mid\mathcal A_{t_0}\bigr) \tag{2.50a}$$
denote the conditional expectation of the random vector $\theta(\omega)$ given the information $\mathcal A_{t_0}$ at time point $t_0$. Taking into account the properties of the solution
$$z = z(t,\theta) = S\bigl(z_0(\theta),\theta,u(\cdot)\bigr)(t),\quad t \ge t_0, \tag{2.50b}$$
of the dynamic equation (2.3a–d), see Lemma 2.1, the expectations arising in the objective function (2.9a) can be computed approximately by means of Taylor expansion with respect to $\theta$ at $\bar\theta$.

2.8.1 Complete Taylor Expansion

Considering first the costs $L$ along the trajectory we obtain, cf. [100],
$$L\bigl(t,\theta,z(t,\theta),u(t)\bigr) = L\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr) + \Bigl(\nabla_\theta L\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr) + D_\theta z(t,\bar\theta)^T \nabla_z L\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr)\Bigr)^T (\theta-\bar\theta) + \frac12 (\theta-\bar\theta)^T Q_L\bigl(t,\bar\theta,z(t,\bar\theta),D_\theta z(t,\bar\theta),u(t)\bigr)(\theta-\bar\theta) + \dots \tag{2.51a}$$
Retaining only 1st order derivatives of $z = z(t,\theta)$ with respect to $\theta$, the approximate Hessian $Q_L$ of $\theta \mapsto L\bigl(t,\theta,z(t,\theta),u\bigr)$ at $\theta = \bar\theta$ is given by
$$Q_L\bigl(t,\bar\theta,z(t,\bar\theta),D_\theta z(t,\bar\theta),u(t)\bigr) := \nabla_\theta^2 L + \nabla_{\theta z}^2 L\; D_\theta z(t,\bar\theta) + \bigl(\nabla_{\theta z}^2 L\; D_\theta z(t,\bar\theta)\bigr)^T + D_\theta z(t,\bar\theta)^T \nabla_z^2 L\; D_\theta z(t,\bar\theta), \tag{2.51b}$$
where all derivatives of $L$ are evaluated at $\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr)$. Here, $\nabla_\theta L$, $\nabla_z L$ denote the gradient of $L$ with respect to $\theta$, $z$, resp., $D_\theta z$ is the Jacobian of $z = z(t,\theta)$ with respect to $\theta$, and $\nabla_\theta^2 L$, $\nabla_z^2 L$, resp., denote the Hessian of $L$ with respect to $\theta$, $z$. Moreover, $\nabla_{\theta z}^2 L$ is the $r \times m$ matrix of partial derivatives of $L$ with respect to $\theta_i$ and $z_k$, in this order.

Taking expectations in (2.51a), from (2.51b) we obtain the expansion
$$EL\bigl(t,\theta(\omega),z(t,\theta(\omega)),u(t)\bigr) = L\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr) + \frac12 E\bigl(\theta(\omega)-\bar\theta\bigr)^T Q_L\bigl(t,\bar\theta,z(t,\bar\theta),D_\theta z(t,\bar\theta),u(t)\bigr)\bigl(\theta(\omega)-\bar\theta\bigr) + \dots = L\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr) + \frac12 \operatorname{tr}\Bigl(Q_L\bigl(t,\bar\theta,z(t,\bar\theta),D_\theta z(t,\bar\theta),u(t)\bigr)\operatorname{cov}\bigl(\theta(\cdot)\bigr)\Bigr) + \dots \tag{2.52}$$

For the terminal costs $G$, corresponding to the above expansion we find
$$G\bigl(t_f,\theta,z(t_f,\theta)\bigr) = G\bigl(t_f,\bar\theta,z(t_f,\bar\theta)\bigr) + \Bigl(\nabla_\theta G\bigl(t_f,\bar\theta,z(t_f,\bar\theta)\bigr) + D_\theta z(t_f,\bar\theta)^T \nabla_z G\bigl(t_f,\bar\theta,z(t_f,\bar\theta)\bigr)\Bigr)^T (\theta-\bar\theta) + \frac12 (\theta-\bar\theta)^T Q_G\bigl(t_f,\bar\theta,z(t_f,\bar\theta),D_\theta z(t_f,\bar\theta)\bigr)(\theta-\bar\theta) + \dots, \tag{2.53a}$$
where $Q_G$ is defined in the same way as $Q_L$, see (2.51b). Taking expectations with respect to $\theta(\omega)$, we get
$$EG\bigl(t_f,\theta(\omega),z(t_f,\theta(\omega))\bigr) = G\bigl(t_f,\bar\theta,z(t_f,\bar\theta)\bigr) + \frac12 \operatorname{tr}\Bigl(Q_G\bigl(t_f,\bar\theta,z(t_f,\bar\theta),D_\theta z(t_f,\bar\theta)\bigr)\operatorname{cov}\bigl(\theta(\cdot)\bigr)\Bigr) + \dots \tag{2.53b}$$
Note. Corresponding to Definition 2.1 and (2.50a), for the mean and covariance matrix of the random parameter vector $\theta = \theta(\omega)$ we have
$$\bar\theta = \bar\theta^{(t_0)} := E\bigl(\theta(\omega)\mid\mathcal A_{t_0}\bigr)$$
$$\operatorname{cov}\bigl(\theta(\cdot)\bigr) = \operatorname{cov}^{(t_0)}\bigl(\theta(\cdot)\bigr) := E\Bigl(\bigl(\theta(\omega)-\bar\theta^{(t_0)}\bigr)\bigl(\theta(\omega)-\bar\theta^{(t_0)}\bigr)^T \Bigm| \mathcal A_{t_0}\Bigr).$$
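The trace formula (2.52) can be illustrated numerically in the scalar case, where $\operatorname{tr}(Q_L\operatorname{cov}\theta)$ reduces to $f''(\bar\theta)\operatorname{var}(\theta)$. The following Python sketch is not part of the book; the test function $f(\theta) = e^\theta$ and the Gaussian parameter distribution are illustrative assumptions chosen because the exact expectation is known in closed form:

```python
import math

def taylor_expectation(f, d2f, mu, var):
    """Scalar instance of (2.52): E[f(theta)] ~ f(mu) + 0.5*f''(mu)*var(theta);
    the trace term tr(Q_L cov) reduces to f''(mu)*var in one dimension."""
    return f(mu) + 0.5 * d2f(mu) * var

# f(theta) = exp(theta), theta ~ N(0, 0.04): exact mean is exp(var/2)
approx = taylor_expectation(math.exp, math.exp, 0.0, 0.04)
exact = math.exp(0.02)
```

For small parameter variance the approximation error is of higher order in $\operatorname{var}(\theta)$, which is the rationale for truncating the expansion after the quadratic term.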

2.8.2 Inner or Partial Taylor Expansion

Instead of a complete expansion of $L$, $G$ with respect to $\theta$, appropriate approximations of the expected costs $EL$, $EG$, resp., may be obtained by the inner 1st order approximation of the trajectory, hence,
$$L\bigl(t,\theta,z(t,\theta),u(t)\bigr) \approx L\bigl(t,\theta,z(t,\bar\theta) + D_\theta z(t,\bar\theta)(\theta-\bar\theta),u(t)\bigr). \tag{2.54a}$$
Taking expectations in (2.54a), for the expected cost function we get the approximation
$$EL\bigl(t,\theta,z(t,\theta),u(t)\bigr) \approx EL\bigl(t,\theta(\omega),z(t,\bar\theta) + D_\theta z(t,\bar\theta)(\theta(\omega)-\bar\theta),u(t)\bigr). \tag{2.54b}$$
In many important cases, as e.g. for cost functions $L$ being quadratic with respect to the state variable $z$, the above expectation can be computed analytically. Moreover, if the cost function $L$ is convex with respect to $z$, then the expected cost function $EL$ is convex with respect to both the state vector $z(t,\bar\theta)$ and the Jacobian matrix of sensitivities $D_\theta z(t,\bar\theta)$ evaluated at the mean parameter vector $\bar\theta$.
Having the approximate representations (2.52), (2.53b), (2.54b), resp., of the expectations occurring in the objective function (2.9a), we still have to compute the trajectory $t \mapsto z(t,\bar\theta)$, $t \ge t_0$, related to the mean parameter vector $\theta = \bar\theta$, and the sensitivities $t \mapsto \frac{\partial z}{\partial\theta_i}(t,\bar\theta)$, $i = 1,\dots,r$, $t \ge t_0$, of the state $z = z(t,\theta)$ with respect to the parameters $\theta_i$, $i = 1,\dots,r$, at $\theta = \bar\theta$. According to (2.3a, b) or (2.9b, c), for $z = z(t,\bar\theta)$ we have the system of differential equations
$$\dot z(t,\bar\theta) = g\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr),\quad t \ge t_0, \tag{2.55a}$$
$$z(t_0,\bar\theta) = z_0(\bar\theta). \tag{2.55b}$$
Moreover, assuming that differentiation with respect to $\theta_i$, $i = 1,\dots,r$, and integration with respect to time $t$ can be interchanged, see Lemma 2.1, from (2.3c) we obtain the following system of linear perturbation differential equations for the Jacobian $D_\theta z(t,\bar\theta) = \Bigl(\frac{\partial z}{\partial\theta_1}(t,\bar\theta), \frac{\partial z}{\partial\theta_2}(t,\bar\theta), \dots, \frac{\partial z}{\partial\theta_r}(t,\bar\theta)\Bigr)$, $t \ge t_0$:
$$\frac{d}{dt} D_\theta z(t,\bar\theta) = D_z g\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr)\, D_\theta z(t,\bar\theta) + D_\theta g\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr),\quad t \ge t_0, \tag{2.56a}$$
$$D_\theta z(t_0,\bar\theta) = D_\theta z_0(\bar\theta). \tag{2.56b}$$

Note. Equation (2.56a, b) is closely related to the perturbation equation (2.16a, b) for representing the derivative $D_u z$ of $z$ with respect to the control $u$. Moreover, the matrix differential equation (2.56a) can be decomposed into the following $r$ differential equations for the columns $\frac{\partial z}{\partial\theta_j}(t,\bar\theta)$, $j = 1,\dots,r$:
$$\frac{d}{dt}\Bigl(\frac{\partial z}{\partial\theta_j}\Bigr)(t,\bar\theta) = D_z g\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr)\frac{\partial z}{\partial\theta_j}(t,\bar\theta) + \frac{\partial g}{\partial\theta_j}\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr),\quad t \ge t_0,\ j = 1,\dots,r. \tag{2.56c}$$
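As a minimal numerical sketch (not from the book), consider the scalar system $\dot z = \theta z$, $z(t_0) = z_0$: by (2.56c) its sensitivity $s = \partial z/\partial\theta$ obeys $\dot s = \theta s + z$, and both equations can be integrated jointly, e.g. with the explicit Euler scheme:

```python
import math

def nominal_and_sensitivity(theta, z0, t_end, n=20000):
    """Explicit Euler for the nominal state z' = theta*z together with
    its parameter sensitivity s = dz/dtheta, s' = theta*s + z
    (the scalar instance of (2.56c))."""
    dt = t_end / n
    z, s = z0, 0.0
    for _ in range(n):
        z, s = z + dt * theta * z, s + dt * (theta * s + z)
    return z, s

# analytic solution: z = z0*exp(theta*t), s = z0*t*exp(theta*t)
z, s = nominal_and_sensitivity(0.5, 1.0, 1.0)
```

A single pass over the time grid thus delivers both $z(t,\bar\theta)$ and the sensitivity needed in the Taylor approximations above.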
@j

Denoting by
$$\tilde L = \tilde L\bigl(t,\theta,z(t,\bar\theta),D_\theta z(t,\bar\theta),u(t)\bigr), \tag{2.57a}$$
$$\tilde G = \tilde G\bigl(t_f,\theta,z(t_f,\bar\theta),D_\theta z(t_f,\bar\theta)\bigr), \tag{2.57b}$$
the approximation of the cost functions $L$, $G$ by complete or partial Taylor expansion, for the optimal control problem under stochastic uncertainty (2.9a–d) we now obtain the following approximation:

Theorem 2.8 Suppose that differentiation with respect to the parameters $\theta_i$, $i = 1,\dots,r$, and integration with respect to time $t$ can be interchanged in (2.3c). Retaining only 1st order derivatives of $z = z(t,\theta)$ with respect to $\theta$, the optimal control problem under stochastic uncertainty (2.9a–d) can be approximated by the following ordinary deterministic control problem:

$$\min\; E\int_{t_0}^{t_f} \tilde L\bigl(t,\theta(\omega),z(t,\bar\theta),D_\theta z(t,\bar\theta),u(t)\bigr)\,dt + E\,\tilde G\bigl(t_f,\theta(\omega),z(t_f,\bar\theta),D_\theta z(t_f,\bar\theta)\bigr) \tag{2.58a}$$
subject to
$$\dot z(t,\bar\theta) = g\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr),\quad t \ge t_0, \tag{2.58b}$$
$$z(t_0,\bar\theta) = z_0(\bar\theta) \tag{2.58c}$$
$$\frac{d}{dt} D_\theta z(t,\bar\theta) = D_z g\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr)\, D_\theta z(t,\bar\theta) + D_\theta g\bigl(t,\bar\theta,z(t,\bar\theta),u(t)\bigr),\quad t \ge t_0, \tag{2.58d}$$
$$D_\theta z(t_0,\bar\theta) = D_\theta z_0(\bar\theta) \tag{2.58e}$$
$$u(\cdot) \in D. \tag{2.58f}$$

Remark 2.14 Obviously, the trajectory of the above deterministic substitute control problem (2.58a–f) of the original optimal control problem under stochastic uncertainty (2.9a–d) can be represented by the $m(r+1)$-vector function:
$$t \mapsto \begin{pmatrix} z(t,\bar\theta) \\ \frac{\partial z}{\partial\theta_1}(t,\bar\theta) \\ \vdots \\ \frac{\partial z}{\partial\theta_r}(t,\bar\theta) \end{pmatrix},\quad t_0 \le t \le t_f. \tag{2.59}$$

Remark 2.15 Constraints of the expectation type (2.9f), i.e.,
$$E\,h^{II}\bigl(t,\theta(\omega),z(t,\theta(\omega))\bigr) \le (=)\; 0,$$
can be evaluated as in (2.52) and (2.53b). This then yields deterministic constraints for the unknown functions $t \mapsto z(t,\bar\theta)$ and $t \mapsto D_\theta z(t,\bar\theta)$, $t \ge t_0$.

Remark 2.16 The expectations $EH_{u^\delta(\cdot)}$, $EH$ arising in $(P)^t_{u^\delta(\cdot)}$, $(P)^t$, resp., can be determined approximately as described above.
Chapter 3
Stochastic Optimal Open-Loop Feedback Control

3.1 Dynamic Structural Systems Under Stochastic Uncertainty

3.1.1 Stochastic Optimal Structural Control: Active Control

In order to avoid structural damage and therefore high compensation (recourse) costs, active control techniques are used in structural engineering. The structures are usually stationary, safe and stable without considerable external dynamic disturbances. Thus, in case of heavy dynamic external loads, such as earthquakes, wind turbulence, water waves, etc., which cause large vibrations with possible damage, additional control elements can be installed in order to counteract the applied dynamic loads, see [18, 154, 155].
The structural dynamics is modeled mathematically by means of a linear system
of second order differential equations for the m-vector q D q.t/ of displacements.
The system of differential equations involves random dynamic parameters, random
initial values, the random dynamic load vector and a control force vector depending
on an input control function u D u.t/. Robust, i.e. parameter-insensitive optimal
feedback controls u are determined in order to cope with the stochastic uncertainty
involved in the dynamic parameters, the initial values and the applied loadings. In
practice, the design of controls is often directed at reducing the mean square response (displacements and their time derivatives) of the system to a desired level within a reasonable span of time.
The performance of the resulting structural control problem under stochastic
uncertainty is evaluated therefore by means of a convex quadratic cost function
L D L.t; z; u/ of the state vector z D z.t/ and the control input vector u D u.t/.
While the actual time path of the random external load is not known at the planning
stage, we may assume that the probability distribution or at least the moments
under consideration of the applied load and other random parameters are known.
The problem is then to determine a robust, i.e. parameter-insensitive (open-loop)

© Springer-Verlag Berlin Heidelberg 2015 79


K. Marti, Stochastic Optimization Methods, DOI 10.1007/978-3-662-46214-0_3

feedback control law by minimization of the expected total costs, hence, a stochastic
optimal control law.
As mentioned above, in active control of dynamic structures, cf. [18, 117, 154–157, 171], the behavior of the $m$-vector $q = q(t)$ of displacements with respect to time $t$ is described by a system of second order linear differential equations for $q(t)$ having a right hand side being the sum of the stochastic applied load process and the control force depending on a control $n$-vector function $u(t)$:
$$M\ddot q + D\dot q + Kq(t) = f\bigl(t,\omega,u(t)\bigr),\quad t_0 \le t \le t_f. \tag{3.1a}$$
Hence, the force vector $f = f\bigl(t,\omega,u(t)\bigr)$ on the right hand side of the dynamic equation (3.1a) is given by the sum
$$f(t,\omega,u) = f_0(t,\omega) + f_a(t,\omega,u) \tag{3.1b}$$

of the applied load $f_0 = f_0(t,\omega)$, being a vector-valued stochastic process describing e.g. external loads or excitation of the structure caused by earthquakes, wind turbulence, water waves, etc., and the actuator or control force vector $f_a = f_a(t,\omega,u)$ depending on an input or control $n$-vector function $u = u(t)$, $t_0 \le t \le t_f$. Here, $\omega$ denotes the random element, lying in a certain probability space $(\Omega,\mathcal A,P)$, used to represent random variations. Furthermore, $M$, $D$, $K$, resp., denote the $m \times m$ mass, damping and stiffness matrix. In many cases the actuator or control force $f_a$ is linear, i.e.
$$f_a = \Gamma_u u \tag{3.1c}$$
with a certain $m \times n$ matrix $\Gamma_u$.


By introducing appropriate matrices, the linear system of second order differential equations (3.1a, b) can be represented by a system of first order differential equations as follows:
$$\dot z = g\bigl(t,\omega,z(t,\omega),u\bigr) := Az(t,\omega) + Bu + b(t,\omega) \tag{3.2a}$$
with
$$A := \begin{pmatrix} 0 & I \\ -M^{-1}K & -M^{-1}D \end{pmatrix},\qquad B := \begin{pmatrix} 0 \\ M^{-1}\Gamma_u \end{pmatrix}, \tag{3.2b}$$
$$b(t,\omega) := \begin{pmatrix} 0 \\ M^{-1}f_0(t,\omega) \end{pmatrix}. \tag{3.2c}$$

Moreover, $z = z(t)$ is the $2m$-state vector defined by
$$z = \begin{pmatrix} q \\ \dot q \end{pmatrix} \tag{3.2d}$$
fulfilling a certain initial condition
$$z(t_0) = \begin{pmatrix} q(t_0) \\ \dot q(t_0) \end{pmatrix} := \begin{pmatrix} q_0 \\ \dot q_0 \end{pmatrix} \tag{3.2e}$$
with given or stochastic initial values $q_0 = q_0(\omega)$, $\dot q_0 = \dot q_0(\omega)$.
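For a single degree of freedom ($m = 1$) the block matrices in (3.2b, c) reduce to plain $2\times 2$ and $2$-vector arrays. The following Python sketch (not from the book; the numerical values and the scalar control-influence coefficient standing in for $\Gamma_u$ are illustrative assumptions) assembles $A$ and $B$:

```python
def second_to_first_order(m_inv, d, k, gamma_u):
    """Assemble A, B of (3.2a-c) for one degree of freedom (m = 1):
    A = [[0, 1], [-K/M, -D/M]], B = [0, Gamma_u/M].
    gamma_u plays the role of the control-influence matrix of (3.1c)."""
    A = [[0.0, 1.0],
         [-m_inv * k, -m_inv * d]]
    B = [0.0, m_inv * gamma_u]
    return A, B

# illustrative data: mass M = 2, damping D = 0.5, stiffness K = 8, Gamma_u = 1
A, B = second_to_first_order(0.5, 0.5, 8.0, 1.0)
```

The same block pattern carries over verbatim to $m > 1$, with the scalars replaced by $m \times m$ blocks.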

3.1.2 Stochastic Optimal Design of Regulators

In the optimal design of regulators for dynamic systems, see also Chap. 4, the ($m$-)vector $q = q(t)$ of tracking errors is described by a system of 2nd order linear differential equations:
$$M(t)\ddot q + D(t)\dot q + K(t)q(t) = Y(t)\Delta p_D(\omega) + \Delta u(t,\omega),\quad t_0 \le t \le t_f. \tag{3.3}$$
Here, $M(t)$, $D(t)$, $K(t)$, $Y(t)$ denote certain time-dependent Jacobians arising from the linearization of the dynamic equation around the stochastic optimal reference trajectory and the conditional expectation $\bar p_D$ of the vector of dynamic parameters $p_D(\omega)$. The deviation between the vector of dynamic parameters $p_D(\omega)$ and its conditional expectation $\bar p_D$ is denoted by $\Delta p_D(\omega) := p_D(\omega) - \bar p_D$. Furthermore, $\Delta u(t)$ denotes the correction of the feedforward control $u^0 = u^0(t)$. By introducing appropriate matrices, system (3.3) can be represented by the 1st order system of linear differential equations:
$$\dot z = A(t)z(t,\omega) + B\Delta u + b(t,\omega) \tag{3.4a}$$
with
$$A(t) := \begin{pmatrix} 0 & I \\ -M(t)^{-1}K(t) & -M(t)^{-1}D(t) \end{pmatrix},\qquad B := \begin{pmatrix} 0 \\ M(t)^{-1} \end{pmatrix}, \tag{3.4b}$$
$$b(t,\omega) := \begin{pmatrix} 0 \\ M(t)^{-1}Y(t)\Delta p_D(\omega) \end{pmatrix}. \tag{3.4c}$$
Again, the ($2m$-)state vector $z = z(t)$ is defined by
$$z = \begin{pmatrix} q \\ \dot q \end{pmatrix}. \tag{3.4d}$$

3.1.3 Robust (Optimal) Open-Loop Feedback Control

According to the description in Sect. 2.2, a feedback control is defined, cf. (2.10b), by
$$u(t) := \varphi(t,I_t),\quad t \ge t_0, \tag{3.5a}$$
where $I_t$ denotes again the total information about the control system up to time $t$ and $\varphi(\cdot,\cdot)$ designates the feedback control law. If the state $z_t := z(t)$ is available at each time point $t$, the control input $n$-vector function $u = u(t)$, $\Delta u = \Delta u(t)$, resp., can be generated by means of a PD-controller, hence,
$$u(t)\ \bigl(\Delta u(t)\bigr) := \varphi\bigl(t,z(t)\bigr),\quad t \ge t_0, \tag{3.5b}$$
with a feedback control law $\varphi = \varphi(t,q,\dot q) = \varphi\bigl(t,z(t)\bigr)$. Efficient approximate feedback control laws are constructed here by using the concept of open-loop feedback control. Open-loop feedback control is the main tool in model predictive control, cf. [1, 101, 136], which is very often used to solve optimal control problems in practice. The idea of open-loop feedback control is to construct a feedback control law quasi argument-wise, cf. [2, 80].
A major issue in optimal control is robustness, cf. [43], i.e. the insensitivity of the optimal control with respect to parameter variations. In case of random parameter variations, robust optimal controls can be obtained by means of stochastic optimization methods, cf. [100]. Thus, we introduce the following concept of a stochastic optimal (open-loop) feedback control.

Definition 3.1 In case of stochastic parameter variations, robust, hence, parameter-insensitive optimal (open-loop) feedback controls obtained by stochastic optimization methods are also called stochastic optimal (open-loop) feedback controls.

3.1.4 Stochastic Optimal Open-Loop Feedback Control

Finding a stochastic optimal open-loop feedback control, hence, an optimal (open-loop) feedback control law, see Sect. 2.2, being insensitive as far as possible with respect to random parameter variations, means that besides optimality of the control law also its insensitivity with respect to stochastic parameter variations should be guaranteed. Hence, in the following sections we develop a stochastic version of the (optimal) open-loop feedback control method, cf. [102–105]. A short overview of this novel stochastic optimal open-loop feedback control concept is given below:
At each intermediate time point $t_b \in [t_0,t_f]$, based on the information $I_{t_b}$ available at time $t_b$, a stochastic optimal open-loop control $u^* = u^*\bigl(t; t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$, is determined first on the remaining time interval $[t_b,t_f]$, see Fig. 3.1, by stochastic optimization methods, cf. [100].

Fig. 3.1 Remaining time interval

Having a stochastic optimal open-loop control $u^* = u^*\bigl(t; t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$, on each remaining time interval $[t_b,t_f]$ with an arbitrary starting time $t_b$, $t_0 \le t_b \le t_f$, a stochastic optimal open-loop feedback control law is then defined, see Definition 2.3, as follows:

Definition 3.2
$$\varphi^* = \varphi^*\bigl(t_b,I_{t_b}\bigr) := u^*(t_b) = u^*\bigl(t_b; t_b, I_{t_b}\bigr),\quad t_0 \le t_b \le t_f. \tag{3.5c}$$
Hence, at time $t = t_b$ just the "first" control value $u^*(t_b) = u^*\bigl(t_b; t_b, I_{t_b}\bigr)$ of $u^*\bigl(\cdot\,; t_b, I_{t_b}\bigr)$ is used only. For each other argument $(t,I_t)$ the same construction is applied.
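The construction above can be sketched for a deterministic scalar toy problem (an illustrative reduction, not from the book): minimize $\tfrac12\int u^2\,dt + \tfrac12\bigl(z(t_f)-z_f\bigr)^2$ subject to $\dot z = u$. From any $(t_b, z_b)$ the open-loop minimizer is the constant control $u = (z_f - z_b)/(1 + t_f - t_b)$, and the open-loop feedback law applies only its first value before re-solving at the next grid point:

```python
def open_loop_control(t_b, z_b, t_f, z_f):
    """Open-loop optimum of the scalar toy problem
    min 0.5*int u^2 dt + 0.5*(z(t_f)-z_f)^2, z' = u, z(t_b) = z_b:
    the minimizer is constant, u = (z_f - z_b)/(1 + (t_f - t_b))."""
    T = t_f - t_b
    return (z_f - z_b) / (1.0 + T)

def olfc_simulation(z0, z_f, t_f, n=100):
    """Open-loop feedback: at every grid point t_b re-solve the open-loop
    problem and apply only the first control value (cf. Definition 3.2)."""
    dt = t_f / n
    z, t = z0, 0.0
    for _ in range(n):
        u = open_loop_control(t, z, t_f, z_f)  # phi(t_b, I_tb)
        z += dt * u
        t += dt
    return z

z_end = olfc_simulation(0.0, 1.0, 5.0)
```

Since each step re-solves the optimization on the remaining horizon, the loop is exactly the deterministic analogue of $\varphi(t_b, I_{t_b}) := u^*(t_b; t_b, I_{t_b})$.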
For finding stochastic optimal open-loop controls, based on the methods developed in Chap. 2, on the remaining time intervals $t_b \le t \le t_f$ with $t_0 \le t_b \le t_f$, the stochastic Hamilton function of the control problem is introduced. Then, the class of $H$-minimal controls, cf. Definitions 2.6 and 2.7, can be determined in case of stochastic uncertainty by solving a finite-dimensional stochastic optimization problem for minimizing the conditional expectation of the stochastic Hamiltonian subject to the remaining deterministic control constraints at each time point $t$. Having an $H$-minimal control, the related two-point boundary value problem with random parameters can be formulated for the computation of a stochastic optimal state- and costate-trajectory. Due to the linear-quadratic structure of the underlying control problem, the state and costate trajectory can be determined analytically to a large extent. Inserting these trajectories into the $H$-minimal control, stochastic optimal open-loop controls are found on an arbitrary remaining time interval. According to Definition 3.2, these controls then immediately yield a stochastic optimal open-loop feedback control law. Moreover, the obtained controls can be realized in real-time, as already shown for applications in optimal control of industrial robots, cf. [139]. Summarizing, we get optimal (open-loop) feedback controls under stochastic uncertainty minimizing the effects of external influences on system behavior, subject to the constraints of not having a complete representation of the system, cf. [43]. Hence, robust or stochastic optimal active controls are obtained by new techniques from stochastic optimization, see [100]. Of course, the construction can also be applied to PD- and PID-controllers.

3.2 Expected Total Cost Function

The performance function $F$ for active structural control systems is defined, cf. [95, 98, 101], by the conditional expectation of the total costs, being the sum of the costs $L$ along the trajectory, arising from the displacements $z = z(t,\omega)$ and the control input $u = u(t,\omega)$, and possible terminal costs $G$ arising at the final state $z_f$. Hence, on the remaining time interval $t_b \le t \le t_f$ we have the following conditional expectation of the total cost function with respect to the information $\mathcal A_{t_b}$ available up to time $t_b$:
$$F := E\Biggl(\,\int_{t_b}^{t_f} L\bigl(t,\omega,z(t,\omega),u(t,\omega)\bigr)\,dt + G\bigl(t_f,\omega,z(t_f,\omega)\bigr) \Bigm| \mathcal A_{t_b}\Biggr). \tag{3.6a}$$

Supposing quadratic costs along the trajectory, the function $L$ is given by
$$L(t,\omega,z,u) := \frac12 z^T Q(t,\omega)z + \frac12 u^T R(t,\omega)u \tag{3.6b}$$
with positive (semi)definite $2m \times 2m$, $n \times n$, resp., matrix functions $Q = Q(t,\omega)$, $R = R(t,\omega)$. In the simplest case the weight matrices $Q$, $R$ are fixed. A special selection for $Q$ reads
$$Q = \begin{pmatrix} Q_q & 0 \\ 0 & Q_{\dot q} \end{pmatrix} \tag{3.6c}$$
with positive (semi)definite weight matrices $Q_q$, $Q_{\dot q}$, resp., for $q$, $\dot q$. Furthermore, $G = G\bigl(t_f,\omega,z(t_f,\omega)\bigr)$ describes possible terminal costs. In case of endpoint control, $G$ is defined by
$$G\bigl(t_f,\omega,z(t_f,\omega)\bigr) := \frac12 \bigl(z(t_f,\omega)-z_f(\omega)\bigr)^T G_f \bigl(z(t_f,\omega)-z_f(\omega)\bigr), \tag{3.6d}$$
where $G_f = G_f(\omega)$ is a positive (semi)definite, possibly random weight matrix, and $z_f = z_f(\omega)$ denotes the (possibly random) final state.

Remark 3.1 Instead of $\frac12 u^T Ru$, in the following we also use a more general convex control cost function $C = C(u)$.

3.3 Open-Loop Control Problem on the Remaining Time Interval $[t_b,t_f]$

In the following we suppose that the $2m \times 2m$ matrix $A$ and the $2m \times n$ matrix $B$ are given, fixed matrices. Having the differential equation with random coefficients derived above, describing the behavior of the dynamic mechanical structure/system under stochastic uncertainty, and the costs arising from displacements and at the terminal state, on a given remaining time interval $[t_b,t_f]$ a stochastic optimal open-loop control $u^* = u^*\bigl(t; t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$, is a solution of the following optimal control problem under stochastic uncertainty:
$$\min\; E\Biggl(\,\int_{t_b}^{t_f} \frac12\Bigl(z(t,\omega)^T Qz(t,\omega) + u(t)^T Ru(t)\Bigr)\,dt + G\bigl(t_f,\omega,z(t_f,\omega)\bigr) \Bigm| \mathcal A_{t_b}\Biggr) \tag{3.7a}$$
s.t.
$$\dot z(t,\omega) = Az(t,\omega) + Bu(t) + b(t,\omega),\ \text{a.s.},\quad t_b \le t \le t_f \tag{3.7b}$$
$$z(t_b,\omega) = z_b^{(b)}\quad\text{(estimated state at time } t_b\text{)} \tag{3.7c}$$
$$u(t) \in D_t,\quad t_b \le t \le t_f. \tag{3.7d}$$

An important property of (3.7a–d) is stated next:

Lemma 3.1 If the terminal cost function $G = G(t_f,\omega,z)$ is convex in $z$, and the feasible domain $D_t$ is convex for each time point $t$, $t_0 \le t \le t_f$, then the stochastic optimal control problem (3.7a–d) is a convex optimization problem.

3.4 The Stochastic Hamiltonian of (3.7a–d)

According to (2.28a), (2.40a), see also [101], the stochastic Hamiltonian $H$ related to the stochastic optimal control problem (3.7a–d) reads:
$$H(t,\omega,z,y,u) := L(t,\omega,z,u) + y^T g(t,\omega,z,u) = \frac12 z^T Qz + C(u) + y^T\bigl(Az + Bu + b(t,\omega)\bigr). \tag{3.8a}$$

3.4.1 Expected Hamiltonian (with Respect to the Time Interval $[t_b,t_f]$ and Information $\mathcal A_{t_b}$)

For the definition of an $H$-minimal control the conditional expectation of the stochastic Hamiltonian is needed:
$$H^{(b)} := E\bigl(H(t,\omega,z,y,u)\mid\mathcal A_{t_b}\bigr) = E\Bigl(\frac12 z^T Qz + y^T\bigl(Az + b(t,\omega)\bigr) \Bigm| \mathcal A_{t_b}\Bigr) + C(u) + E\bigl(y^T Bu\mid\mathcal A_{t_b}\bigr) = C(u) + E\bigl(B^T y(t,\omega)\mid\mathcal A_{t_b}\bigr)^T u + \dots = C(u) + h(t)^T u + \dots \tag{3.8b}$$
with
$$h(t) = h\bigl(t; t_b, I_{t_b}\bigr) := E\bigl(B(\omega)^T y(t,\omega)\mid\mathcal A_{t_b}\bigr),\quad t \ge t_b. \tag{3.8c}$$

3.4.2 $H$-Minimal Control on $[t_b,t_f]$

In order to formulate the two-point boundary value problem for a stochastic optimal open-loop control $u^* = u^*\bigl(t; t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$, we first need an $H$-minimal control
$$u^e = u^e\bigl(t, z(t,\cdot), y(t,\cdot); t_b, I_{t_b}\bigr),\quad t_b \le t \le t_f,$$
defined, see Definitions 2.6 and 2.7 and cf. also [101], for $t_b \le t \le t_f$ as a solution of the following convex stochastic optimization problem, cf. [100]:
$$\min\; E\bigl(H(t,\omega,z(t,\omega),y(t,\omega),u)\mid\mathcal A_{t_b}\bigr) \tag{3.9a}$$
s.t.
$$u \in D_t, \tag{3.9b}$$
where $z = z(t,\omega)$, $y = y(t,\omega)$ are certain trajectories. According to (3.9a, b) the $H$-minimal control
$$u^e = u^e\bigl(t, z(t,\cdot), y(t,\cdot); t_b, I_{t_b}\bigr) = u^e\bigl(t, h; t_b, I_{t_b}\bigr) \tag{3.10a}$$
is defined by
$$u^e\bigl(t, h; t_b, I_{t_b}\bigr) := \operatorname*{argmin}_{u \in D_t}\Bigl(C(u) + h\bigl(t; t_b, I_{t_b}\bigr)^T u\Bigr)\quad\text{for } t \ge t_b. \tag{3.10b}$$

Strictly Convex Cost Function, No Control Constraints

For strictly convex, differentiable cost functions $C = C(u)$, as e.g. $C(u) = \frac12 u^T Ru$ with positive definite matrix $R$, the necessary and sufficient condition for $u^e$ reads in case $D_t = \mathbb R^n$:
$$\nabla C(u) + h\bigl(t; t_b, I_{t_b}\bigr) = 0. \tag{3.11a}$$
If $u \mapsto \nabla C(u)$ is a 1-1-operator, then the solution of (3.11a) reads
$$u = v\bigl(h(t; t_b, I_{t_b})\bigr) := \nabla C^{-1}\bigl(-h(t; t_b, I_{t_b})\bigr). \tag{3.11b}$$
With (3.8c) and (3.10b) we then have
$$u^e(t,h) = u^e\bigl(h(t; t_b, I_{t_b})\bigr) := \nabla C^{-1}\Bigl(-E\bigl(B(\omega)^T y(t,\omega)\mid\mathcal A_{t_b}\bigr)\Bigr). \tag{3.11c}$$
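For the quadratic cost $C(u) = \frac12 u^T Ru$, condition (3.11a) gives $u = -R^{-1}h$. A small Python sketch (not from the book; the $2\times 2$ matrix and the vector $h$ are illustrative values, solved with the explicit inverse):

```python
def h_minimal_control(R, h):
    """Unconstrained H-minimal control for quadratic cost C(u) = 0.5*u^T R u:
    grad C(u) + h = 0  =>  u = -R^{-1} h (cf. (3.11a,b)); 2x2 explicit inverse."""
    (a, b), (c, d) = R
    det = a * d - b * c
    return [-(d * h[0] - b * h[1]) / det,
            -(-c * h[0] + a * h[1]) / det]

u = h_minimal_control([[2.0, 0.0], [0.0, 4.0]], [1.0, -2.0])
```

For larger $n$ one would of course solve the linear system $Ru = -h$ by a factorization instead of forming the inverse.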

3.5 Canonical (Hamiltonian) System

We suppose here that an $H$-minimal control $u^e = u^e\bigl(t, z(t,\cdot), y(t,\cdot); t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$, i.e., a solution $u^e = u^e(t,h) = v\bigl(h(t)\bigr)$ of the stochastic optimization problem (3.9a, b), is available. Moreover, the conditional expectation $E(\,\cdot\mid\mathcal A_{t_b})$ of a random variable is also denoted by $\overline{(\,\cdot\,)}^{(b)}$, cf. (3.8b). According to (2.46), Theorem 2.7, a stochastic optimal open-loop control $u^* = u^*\bigl(t; t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$,
$$u^*\bigl(t; t_b, I_{t_b}\bigr) = u^e\bigl(t, z^*(t,\cdot), y^*(t,\cdot); t_b, I_{t_b}\bigr),\quad t_b \le t \le t_f, \tag{3.12}$$
of the stochastic optimal control problem (3.7a–d) can be obtained, see also [101], by solving the following stochastic two-point boundary value problem related to (3.7a–d):

Theorem 3.1 If $z^* = z^*(t,\omega)$, $y^* = y^*(t,\omega)$, $t_0 \le t \le t_f$, is a solution of
$$\dot z(t,\omega) = Az(t,\omega) + B\,\nabla C^{-1}\Bigl(-\overline{B(\cdot)^T y(t,\cdot)}^{(b)}\Bigr) + b(t,\omega),\quad t_b \le t \le t_f \tag{3.13a}$$
$$z(t_b,\omega) = z_b^{(b)} \tag{3.13b}$$

$$\dot y(t,\omega) = -A^T y(t,\omega) - Qz(t,\omega) \tag{3.13c}$$
$$y(t_f,\omega) = \nabla_z G\bigl(t_f,\omega,z(t_f,\omega)\bigr), \tag{3.13d}$$
then the function $u^* = u^*\bigl(t; t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$, defined by (3.12) is a stochastic optimal open-loop control for the remaining time interval $t_b \le t \le t_f$.

3.6 Minimum Energy Control

In this case we have $Q = 0$, i.e., there are no costs for the displacements $z = \begin{pmatrix} q \\ \dot q \end{pmatrix}$. In this case the solution of (3.13c, d) reads
$$y(t,\omega) = e^{A^T(t_f-t)}\,\nabla_z G\bigl(t_f,\omega,z(t_f,\omega)\bigr),\quad t_b \le t \le t_f. \tag{3.14a}$$
This yields, for a fixed matrix $B$,
$$u^e\bigl(t,h(t)\bigr) = v\bigl(h(t)\bigr) = \nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-t)}\,\overline{\nabla_z G\bigl(t_f,\cdot,z(t_f,\cdot)\bigr)}^{(b)}\Bigr),\quad t_b \le t \le t_f. \tag{3.14b}$$
Having (3.14a, b), for the state trajectory $z = z(t,\omega)$ we get, see (3.13a, b), the following system of ordinary differential equations
$$\dot z(t,\omega) = Az(t,\omega) + B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-t)}\,\overline{\nabla_z G\bigl(t_f,\cdot,z(t_f,\cdot)\bigr)}^{(b)}\Bigr) + b(t,\omega),\quad t_b \le t \le t_f, \tag{3.15a}$$
$$z(t_b,\omega) = z_b^{(b)}. \tag{3.15b}$$
The solution of system (3.15a, b) reads
$$z(t,\omega) = e^{A(t-t_b)} z_b^{(b)} + \int_{t_b}^{t} e^{A(t-s)}\Bigl(b(s,\omega) + B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-s)}\,\overline{\nabla_z G\bigl(t_f,\cdot,z(t_f,\cdot)\bigr)}^{(b)}\Bigr)\Bigr)\,ds,\quad t_b \le t \le t_f. \tag{3.16}$$

For the final state $z = z(t_f,\omega)$ we get the relation:
$$z(t_f,\omega) = e^{A(t_f-t_b)} z_b^{(b)} + \int_{t_b}^{t_f} e^{A(t_f-s)}\Bigl(b(s,\omega) + B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-s)}\,\overline{\nabla_z G\bigl(t_f,\cdot,z(t_f,\cdot)\bigr)}^{(b)}\Bigr)\Bigr)\,ds. \tag{3.17}$$

3.6.1 Endpoint Control

In the case of endpoint control, the terminal cost function is given by the following definition (3.18a), where $z_f = z_f(\omega)$ denotes the desired, possibly random, final state:
$$G\bigl(t_f,\omega,z(t_f,\omega)\bigr) := \frac12 \bigl\| z(t_f,\omega) - z_f(\omega)\bigr\|^2. \tag{3.18a}$$
Hence,
$$\nabla_z G\bigl(t_f,\omega,z(t_f,\omega)\bigr) = z(t_f,\omega) - z_f(\omega) \tag{3.18b}$$
and therefore
$$\overline{\nabla_z G\bigl(t_f,\cdot,z(t_f,\cdot)\bigr)}^{(b)} = \overline{z(t_f,\cdot)}^{(b)} - \bar z_f^{(b)} = E\bigl(z(t_f,\omega)\mid\mathcal A_{t_b}\bigr) - E\bigl(z_f\mid\mathcal A_{t_b}\bigr). \tag{3.18c}$$
Thus,
$$z(t_f,\omega) = e^{A(t_f-t_b)} z_b^{(b)} + \int_{t_b}^{t_f} e^{A(t_f-s)}\Bigl(b(s,\omega) + B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-s)}\bigl(\overline{z(t_f,\cdot)}^{(b)} - \bar z_f^{(b)}\bigr)\Bigr)\Bigr)\,ds. \tag{3.19a}$$

Taking expectations $E(\dots\mid\mathcal A_{t_b})$ in (3.19a), we get the following condition for $\overline{z(t_f,\cdot)}^{(b)}$:
$$\overline{z(t_f,\cdot)}^{(b)} = e^{A(t_f-t_b)} z_b^{(b)} + \int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds + \int_{t_b}^{t_f} e^{A(t_f-s)} B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-s)}\bigl(\overline{z(t_f,\cdot)}^{(b)} - \bar z_f^{(b)}\bigr)\Bigr)\,ds. \tag{3.19b}$$

Quadratic Control Costs

Here, the control cost function $C = C(u)$ reads
$$C(u) = \frac12 u^T Ru, \tag{3.20a}$$
hence,
$$\nabla C(u) = Ru \tag{3.20b}$$
and therefore
$$\nabla C^{-1}(w) = R^{-1}w. \tag{3.20c}$$
Consequently, (3.19b) reads
$$\overline{z(t_f,\cdot)}^{(b)} = e^{A(t_f-t_b)} z_b^{(b)} + \int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds - \int_{t_b}^{t_f} e^{A(t_f-s)} BR^{-1}B^T e^{A^T(t_f-s)}\,ds\;\overline{z(t_f,\cdot)}^{(b)} + \int_{t_b}^{t_f} e^{A(t_f-s)} BR^{-1}B^T e^{A^T(t_f-s)}\,ds\;\bar z_f^{(b)}. \tag{3.21}$$

Define now
$$U := \int_{t_b}^{t_f} e^{A(t_f-s)} BR^{-1}B^T e^{A^T(t_f-s)}\,ds. \tag{3.22}$$
Lemma 3.2 $I + U$ is regular.

Proof Due to the previous considerations, $U$ is a positive semidefinite $2m \times 2m$ matrix. Hence, $U$ has only nonnegative eigenvalues. Assuming that the matrix $I + U$ is singular, there is a $2m$-vector $w \ne 0$ such that
$$(I + U)w = 0.$$
However, this yields
$$Uw = -Iw = -w = (-1)w,$$
which means that $\lambda = -1$ is an eigenvalue of $U$. Since this contradicts the above mentioned property of $U$, the matrix $I + U$ must be regular.
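In the scalar case ($2m = n = 1$) the matrix $U$ of (3.22) has the closed form $(b^2/r)\,(e^{2aT}-1)/(2a)$ with $T = t_f - t_b$. The following Python sketch (illustrative parameter values, not from the book) verifies this against a trapezoidal quadrature; since $U \ge 0$, $1 + U > 0$, the scalar statement of Lemma 3.2:

```python
import math

def controllability_like_integral(a, b, r, t_b, t_f, n=2000):
    """Trapezoidal approximation of the scalar analogue of U in (3.22):
    U = int_{t_b}^{t_f} e^{a(t_f-s)} * b * (1/r) * b * e^{a(t_f-s)} ds >= 0."""
    h = (t_f - t_b) / n
    def integrand(s):
        return (b * math.exp(a * (t_f - s))) ** 2 / r
    total = 0.5 * (integrand(t_b) + integrand(t_f))
    total += sum(integrand(t_b + k * h) for k in range(1, n))
    return h * total

U = controllability_like_integral(a=0.3, b=1.0, r=2.0, t_b=0.0, t_f=1.0)
U_closed_form = 0.5 * math.expm1(2 * 0.3) / (2 * 0.3)  # (b^2/r)(e^{2aT}-1)/(2a)
```

The nonnegativity of the integrand is exactly what makes $I + U$ invertible in the matrix case.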
From (3.21) we get
$$(I + U)\,\overline{z(t_f,\cdot)}^{(b)} = e^{A(t_f-t_b)} z_b^{(b)} + \int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds + U\bar z_f^{(b)}, \tag{3.23a}$$
hence,
$$\overline{z(t_f,\cdot)}^{(b)} = (I+U)^{-1} e^{A(t_f-t_b)} z_b^{(b)} + (I+U)^{-1}\int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds + (I+U)^{-1}U\bar z_f^{(b)}. \tag{3.23b}$$
Now, (3.23b) and (3.18b) yield
$$\overline{\nabla_z G\bigl(t_f,\cdot,z(t_f,\cdot)\bigr)}^{(b)} = \overline{z(t_f,\cdot)}^{(b)} - \bar z_f^{(b)} = (I+U)^{-1} e^{A(t_f-t_b)} z_b^{(b)} + (I+U)^{-1}\int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds + \bigl((I+U)^{-1}U - I\bigr)\bar z_f^{(b)}. \tag{3.24}$$

Thus, a stochastic optimal open-loop control $u^* = u^*\bigl(t; t_b, I_{t_b}\bigr)$, $t_b \le t \le t_f$, on $[t_b,t_f]$ is given by, cf. (3.11b),
$$u^*\bigl(t; t_b, I_{t_b}\bigr) = -R^{-1}B^T e^{A^T(t_f-t)}\Biggl((I+U)^{-1} e^{A(t_f-t_b)} z_b^{(b)} + (I+U)^{-1}\int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds + \bigl((I+U)^{-1}U - I\bigr)\bar z_f^{(b)}\Biggr),\quad t_b \le t \le t_f. \tag{3.25}$$
Finally, the stochastic optimal open-loop feedback control law $\varphi = \varphi(t,I_t)$ is then given by
$$\varphi\bigl(t_b,I_{t_b}\bigr) := u^*\bigl(t_b; t_b, I_{t_b}\bigr) = -R^{-1}B^T e^{A^T(t_f-t_b)}(I+U)^{-1} e^{A(t_f-t_b)} z_b^{(b)} - R^{-1}B^T e^{A^T(t_f-t_b)}(I+U)^{-1}\int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds - R^{-1}B^T e^{A^T(t_f-t_b)}\bigl((I+U)^{-1}U - I\bigr)\bar z_f^{(b)} \tag{3.26}$$
with $I_{t_b} := \Bigl(z_b^{(b)} := \overline{z(t_b)}^{(b)},\ \overline{b(\cdot,\cdot)}^{(b)},\ \bar z_f^{(b)}\Bigr)$.
Replacing $t_b \to t$, we find this result:

Theorem 3.2 The stochastic optimal open-loop feedback control law $\varphi = \varphi(t,I_t)$ is given by
$$\varphi(t,I_t) = \underbrace{-R^{-1}B^T e^{A^T(t_f-t)}(I+U)^{-1} e^{A(t_f-t)}}_{\gamma_0(t)}\,\overline{z(t)}^{(t)}\ \underbrace{-\ R^{-1}B^T e^{A^T(t_f-t)}(I+U)^{-1}\int_{t}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(t)}\,ds}_{\gamma_1\bigl(t,\,\overline{b(\cdot,\cdot)}^{(t)}\bigr)}\ \underbrace{-\ R^{-1}B^T e^{A^T(t_f-t)}\bigl((I+U)^{-1}U - I\bigr)}_{\gamma_2(t)}\,\bar z_f^{(t)}, \tag{3.27a}$$
hence,
$$\varphi(t,I_t) = \gamma_0(t)\,\overline{z(t)}^{(t)} + \gamma_1\bigl(t,\overline{b(\cdot,\cdot)}^{(t)}\bigr) + \gamma_2(t)\,\bar z_f^{(t)},\qquad I_t := \Bigl(\overline{z(t)}^{(t)},\ \overline{b(\cdot,\cdot)}^{(t)},\ \bar z_f^{(t)}\Bigr). \tag{3.27b}$$

Remark 3.2 Note that the stochastic optimal open-loop feedback law $\overline{z(t)}^{(t)} \mapsto \varphi(t,I_t)$ is in general not linear, but affine-linear.
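Specializing (3.27a, b) to scalar data $A = a$, $B = 1$, $R = r$ and vanishing load $b \equiv 0$ (an illustrative reduction, not in the book) gives $U = \int_t^{t_f} e^{2a(t_f-s)}/r\,ds$ and the affine coefficients $\gamma_0(t) = -e^{2a(t_f-t)}/\bigl(r(1+U)\bigr)$, $\gamma_2(t) = e^{a(t_f-t)}/\bigl(r(1+U)\bigr)$, which can be evaluated directly:

```python
import math

def minimum_energy_feedback(a, t, z_mean, z_f_mean, t_f, r=1.0):
    """Scalar (2m = n = 1) sketch of the affine law (3.27a,b) with A = a,
    B = 1, R = r and no load term: phi = gamma0*z + gamma2*z_f."""
    T = t_f - t
    U = T / r if a == 0.0 else math.expm1(2.0 * a * T) / (2.0 * a * r)
    gamma0 = -math.exp(2.0 * a * T) / (r * (1.0 + U))
    gamma2 = math.exp(a * T) / (r * (1.0 + U))
    return gamma0 * z_mean + gamma2 * z_f_mean
```

At $t = t_f$ one has $U = 0$ and the law reduces to $\varphi = \bar z_f - \bar z$, a direct correction toward the target; for $a = 0$ it becomes $(\bar z_f - \bar z)/(1 + t_f - t)$, the control shrinking with the remaining horizon.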

3.6.2 Endpoint Control with Different Cost Functions

In this section we consider more general terminal cost functions $G$. Hence, suppose
$$G\bigl(t_f,\omega,z(t_f,\omega)\bigr) := \gamma\bigl(z(t_f,\omega) - z_f(\omega)\bigr), \tag{3.28a}$$
$$\nabla_z G\bigl(t_f,\omega,z(t_f,\omega)\bigr) = \nabla\gamma\bigl(z(t_f,\omega) - z_f(\omega)\bigr). \tag{3.28b}$$
Consequently,
$$u^e\bigl(t,h(t)\bigr) = v\bigl(h(t)\bigr) = \nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-t)}\,\overline{\nabla\gamma\bigl(z(t_f,\cdot) - z_f(\cdot)\bigr)}^{(b)}\Bigr) \tag{3.29a}$$
and therefore, see (3.17),
$$z(t_f,\omega) = e^{A(t_f-t_b)} z_b + \int_{t_b}^{t_f} e^{A(t_f-s)}\Bigl(b(s,\omega) + B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-s)}\,\overline{\nabla\gamma\bigl(z(t_f,\cdot) - z_f(\cdot)\bigr)}^{(b)}\Bigr)\Bigr)\,ds,\quad t_b \le t \le t_f. \tag{3.29b}$$
Special Case: Now a special terminal cost function is considered in more detail:
$$\gamma(z - z_f) := \sum_{i=1}^{2m} (z_i - z_{fi})^4 \tag{3.30a}$$
$$\nabla\gamma(z - z_f) = 4\bigl((z_1 - z_{f1})^3, \dots, (z_{2m} - z_{f,2m})^3\bigr)^T. \tag{3.30b}$$

Here,
$$\overline{\nabla\gamma(z - z_f)}^{(b)} = 4\Bigl(E\bigl((z_1 - z_{f1})^3\mid\mathcal A_{t_b}\bigr), \dots, E\bigl((z_{2m} - z_{f,2m})^3\mid\mathcal A_{t_b}\bigr)\Bigr)^T = 4\Bigl(m_3^{(b)}\bigl(z_1(t_f,\cdot); z_{f1}(\cdot)\bigr), \dots, m_3^{(b)}\bigl(z_{2m}(t_f,\cdot); z_{f,2m}(\cdot)\bigr)\Bigr)^T =: 4\,m_3^{(b)}\bigl(z(t_f,\cdot); z_f(\cdot)\bigr). \tag{3.31}$$
Thus,
$$z(t_f,\omega) = e^{A(t_f-t_b)} z_b + \int_{t_b}^{t_f} e^{A(t_f-s)} b(s,\omega)\,ds + \underbrace{\int_{t_b}^{t_f} e^{A(t_f-s)} B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-s)}\,4\,m_3^{(b)}\bigl(z(t_f,\cdot); z_f(\cdot)\bigr)\Bigr)\,ds}_{J\bigl(m_3^{(b)}(z(t_f,\cdot);\,z_f(\cdot))\bigr)}. \tag{3.32}$$

Equation (3.32) then yields
$$\Bigl.\bigl(z(t_f,\omega) - z_f(\omega)\bigr)^3\Bigr|_{c\text{-by-}c} = \Biggl.\Biggl(e^{A(t_f-t_b)} z_b - z_f + \int_{t_b}^{t_f} e^{A(t_f-s)} b(s,\omega)\,ds + J\bigl(m_3^{(b)}(z(t_f,\cdot); z_f(\cdot))\bigr)\Biggr)^3\Biggr|_{c\text{-by-}c}, \tag{3.33a}$$
where "c-by-c" means "component-by-component". Taking expectations in (3.33a), we get the following fixed-point relation for the moment vector $m_3^{(b)}$:
$$m_3^{(b)}\bigl(z(t_f,\cdot); z_f(\cdot)\bigr) = \Psi\Bigl(m_3^{(b)}\bigl(z(t_f,\cdot); z_f(\cdot)\bigr)\Bigr), \tag{3.33b}$$
with the mapping $\Psi$ given by the conditional expectation of the right hand side of (3.33a).

Remark 3.3
$$E\Bigl(\bigl(z(t_f,\omega) - z_f(\omega)\bigr)^3\Bigm|\mathcal A_{t_b}\Bigr)\Bigr|_{c\text{-by-}c} = E^{(b)}\Bigl(\bigl(z(t_f,\omega) - \bar z^{(b)}(t_f)\bigr) + \bigl(\bar z^{(b)}(t_f) - z_f(\omega)\bigr)\Bigr)^3 = E^{(b)}\Bigl(\bigl(z(t_f,\omega) - \bar z^{(b)}(t_f)\bigr)^3 + 3\bigl(z(t_f,\omega) - \bar z^{(b)}(t_f)\bigr)^2\bigl(\bar z^{(b)}(t_f) - z_f(\omega)\bigr) + 3\bigl(z(t_f,\omega) - \bar z^{(b)}(t_f)\bigr)\bigl(\bar z^{(b)}(t_f) - z_f(\omega)\bigr)^2 + \bigl(\bar z^{(b)}(t_f) - z_f(\omega)\bigr)^3\Bigr). \tag{3.33c}$$
Assuming that $z(t_f,\omega)$ and $z_f(\omega)$ are stochastically independent, then
$$E\Bigl(\bigl(z(t_f,\omega) - z_f(\omega)\bigr)^3\Bigm|\mathcal A_{t_b}\Bigr) = m_3^{(b)}\bigl(z(t_f,\cdot)\bigr) + 3\,\sigma^{2\,(b)}\bigl(z(t_f,\cdot)\bigr)\bigl(\bar z^{(b)}(t_f) - \bar z_f^{(b)}\bigr) + E^{(b)}\bigl(\bar z^{(b)}(t_f) - z_f(\omega)\bigr)^3, \tag{3.33d}$$
where $\sigma^{2\,(b)}\bigl(z(t_f,\cdot)\bigr)$ denotes the conditional variance of the state reached at the final time point $t_f$, given the information at time $t_b$.
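The decomposition (3.33d) can be checked directly for a discrete distribution with a deterministic target $c$ (a sketch, not from the book; the two-point distribution is an illustrative choice):

```python
def third_moment_about(xs, ps, c):
    """Direct E[(X - c)^3] for a discrete distribution (values xs, probs ps)."""
    return sum(p * (x - c) ** 3 for x, p in zip(xs, ps))

def third_moment_decomposed(xs, ps, c):
    """Decomposition in the spirit of (3.33c,d) with a deterministic target c:
    E[(X - c)^3] = m3 + 3*var*(mu - c) + (mu - c)^3,
    where mu, var, m3 are the mean, variance and central third moment of X."""
    mu = sum(p * x for x, p in zip(xs, ps))
    var = sum(p * (x - mu) ** 2 for x, p in zip(xs, ps))
    m3 = sum(p * (x - mu) ** 3 for x, p in zip(xs, ps))
    return m3 + 3.0 * var * (mu - c) + (mu - c) ** 3

xs, ps = [0.0, 3.0], [2.0 / 3.0, 1.0 / 3.0]  # mu = 1, var = 2, m3 = 2
```

The cross term with the first central moment vanishes, which is why only the variance and the central third moment survive in (3.33d).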

3.6.3 Weighted Quadratic Terminal Costs

With a certain (possibly random) weight matrix $\Gamma = \Gamma(\omega)$, we consider the following terminal cost function:
$$G\bigl(t_f,\omega,z(t_f,\omega)\bigr) := \frac12 \bigl\|\Gamma(\omega)\bigl(z(t_f,\omega) - z_f(\omega)\bigr)\bigr\|^2. \tag{3.34a}$$
This yields
$$\nabla_z G\bigl(t_f,\omega,z(t_f,\omega)\bigr) = \Gamma(\omega)^T\Gamma(\omega)\bigl(z(t_f,\omega) - z_f(\omega)\bigr), \tag{3.34b}$$
and from (3.14a) we get
$$y(t,\omega) = e^{A^T(t_f-t)}\,\nabla_z G\bigl(t_f,\omega,z(t_f,\omega)\bigr) = e^{A^T(t_f-t)}\,\Gamma(\omega)^T\Gamma(\omega)\bigl(z(t_f,\omega) - z_f(\omega)\bigr), \tag{3.35a}$$
hence,
$$\bar y^{(b)}(t) = e^{A^T(t_f-t)}\Bigl(\overline{\Gamma(\cdot)^T\Gamma(\cdot)z(t_f,\cdot)}^{(b)} - \overline{\Gamma(\cdot)^T\Gamma(\cdot)z_f(\cdot)}^{(b)}\Bigr). \tag{3.35b}$$

Thus, for the $H$-minimal control we find
$$u^e(t,h) = v\bigl(h(t)\bigr) = \nabla C^{-1}\bigl(-B^T \bar y^{(b)}(t)\bigr) = \nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-t)}\Bigl(\overline{\Gamma^T\Gamma z(t_f,\cdot)}^{(b)} - \overline{\Gamma^T\Gamma z_f}^{(b)}\Bigr)\Bigr). \tag{3.36}$$
We obtain therefore, see (3.16),
$$z(t,\omega) = e^{A(t-t_b)} z_b + \int_{t_b}^{t} e^{A(t-s)}\Bigl(b(s,\omega) + B\,\nabla C^{-1}\Bigl(-B^T e^{A^T(t_f-s)}\Bigl(\overline{\Gamma^T\Gamma z(t_f,\cdot)}^{(b)} - \overline{\Gamma^T\Gamma z_f}^{(b)}\Bigr)\Bigr)\Bigr)\,ds. \tag{3.37a}$$

Quadratic Control Costs

Assume that the control costs and their gradient are given by
$$C(u) = \frac12 u^T Ru,\qquad \nabla C(u) = Ru. \tag{3.37b}$$
Here, (3.37a) yields
$$z(t_f,\omega) = e^{A(t_f-t_b)} z_b + \int_{t_b}^{t_f} e^{A(t_f-s)}\Bigl(b(s,\omega) - BR^{-1}B^T e^{A^T(t_f-s)}\Bigl(\overline{\Gamma^T\Gamma z(t_f,\cdot)}^{(b)} - \overline{\Gamma^T\Gamma z_f}^{(b)}\Bigr)\Bigr)\,ds. \tag{3.37c}$$

Multiplying with $\Gamma(\omega)^T\Gamma(\omega)$ and taking expectations, from (3.37c) we get
$$\overline{\Gamma^T\Gamma z(t_f,\cdot)}^{(b)} = \overline{\Gamma^T\Gamma}^{(b)} e^{A(t_f-t_b)} z_b^{(b)} + \overline{\Gamma^T\Gamma}^{(b)}\int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds - \overline{\Gamma^T\Gamma}^{(b)}\int_{t_b}^{t_f} e^{A(t_f-s)} BR^{-1}B^T e^{A^T(t_f-s)}\,ds\,\Bigl(\overline{\Gamma^T\Gamma z(t_f,\cdot)}^{(b)} - \overline{\Gamma^T\Gamma z_f}^{(b)}\Bigr). \tag{3.38a}$$
According to a former lemma, we define the matrix
$$U = \int_{t_b}^{t_f} e^{A(t_f-s)} BR^{-1}B^T e^{A^T(t_f-s)}\,ds.$$
From (3.38a) we then get
$$\Bigl(I + \overline{\Gamma^T\Gamma}^{(b)}\, U\Bigr)\,\overline{\Gamma^T\Gamma z(t_f,\cdot)}^{(b)} = \overline{\Gamma^T\Gamma}^{(b)} e^{A(t_f-t_b)} z_b^{(b)} + \overline{\Gamma^T\Gamma}^{(b)}\int_{t_b}^{t_f} e^{A(t_f-s)}\,\overline{b(s,\cdot)}^{(b)}\,ds + \overline{\Gamma^T\Gamma}^{(b)}\, U\,\overline{\Gamma^T\Gamma z_f}^{(b)}. \tag{3.38b}$$

Lemma 3.3 $I+\overline{\Gamma^T\Gamma}^{(b)}U$ is regular.

Proof First notice that not only $U$, but also $\overline{\Gamma^T\Gamma}^{(b)}$ is positive semidefinite:
$$
v^T\,\overline{\Gamma^T\Gamma}^{(b)}\,v=\overline{v^T\Gamma^T\Gamma\,v}^{(b)}
=\overline{(\Gamma v)^T\Gamma v}^{(b)}=\overline{\|\Gamma v\|_2^2}^{(b)}\ge 0.
$$
Then their product $\overline{\Gamma^T\Gamma}^{(b)}U$ is positive semidefinite as well. This follows immediately from [122], as $\Gamma(\omega)^T\Gamma(\omega)$ is symmetric.

Since the matrix $I+\overline{\Gamma^T\Gamma}^{(b)}U$ is regular, we get, cf. (3.23a, b),
$$
\overline{\Gamma^T\Gamma\,z(t_f,\cdot)}^{(b)}
=\Big(I+\overline{\Gamma^T\Gamma}^{(b)}U\Big)^{-1}\overline{\Gamma^T\Gamma}^{(b)}e^{A(t_f-t_b)}\bar z_b^{(b)}
$$
$$
+\Big(I+\overline{\Gamma^T\Gamma}^{(b)}U\Big)^{-1}\overline{\Gamma^T\Gamma}^{(b)}\int_{t_b}^{t_f}e^{A(t_f-s)}\bar b^{(b)}(s)\,ds
+\Big(I+\overline{\Gamma^T\Gamma}^{(b)}U\Big)^{-1}\overline{\Gamma^T\Gamma}^{(b)}U\,\overline{\Gamma^T\Gamma\,z_f}^{(b)}.\tag{3.38c}
$$
Putting (3.38c) into (3.36), corresponding to (3.25) we get the stochastic optimal open-loop control
$$
u^*(t;t_b,\mathcal{I}_{t_b})=-R^{-1}B^T e^{A^T(t_f-t)}
\Big(\overline{\Gamma^T\Gamma\,z(t_f,\cdot)}^{(b)}-\overline{\Gamma^T\Gamma\,z_f}^{(b)}\Big)
=\ldots,\qquad t_b\le t\le t_f,\tag{3.39}
$$
which then yields the related stochastic optimal open-loop feedback control law $\varphi=\varphi(t,\mathcal{I}_t)$ corresponding to Theorem 3.2.
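The quantities entering (3.38b) and (3.39) can be approximated numerically. The following sketch (function names are my own; trapezoidal quadrature and $\bar b^{(b)}\equiv 0$ are simplifying assumptions, not part of the text) computes the matrix $U$, solves the linear system for the conditional mean weighted terminal state, and evaluates the open-loop control:

```python
import numpy as np
from scipy.linalg import expm

def gain_matrix_U(A, B, R_inv, t_b, t_f, n=200):
    # Trapezoidal quadrature of U = int e^{A(t_f-s)} B R^{-1} B^T e^{A^T(t_f-s)} ds
    s = np.linspace(t_b, t_f, n + 1)
    w = np.full(n + 1, (t_f - t_b) / n)
    w[0] *= 0.5
    w[-1] *= 0.5
    return sum(wk * expm(A * (t_f - sk)) @ B @ R_inv @ B.T @ expm(A.T * (t_f - sk))
               for sk, wk in zip(s, w))

def weighted_terminal_mean(A, U, GtG, z_b, GtG_zf, t_b, t_f):
    # Solve (I + GtG U) x = GtG e^{A(t_f-t_b)} z_b + GtG U GtG_zf, cf. (3.38b) with b = 0
    rhs = GtG @ (expm(A * (t_f - t_b)) @ z_b) + GtG @ U @ GtG_zf
    return np.linalg.solve(np.eye(len(z_b)) + GtG @ U, rhs)

def open_loop_control(t, A, B, R_inv, x_mean, GtG_zf, t_f):
    # u*(t) = -R^{-1} B^T e^{A^T(t_f-t)} (x_mean - GtG_zf), cf. (3.39)
    return -R_inv @ B.T @ expm(A.T * (t_f - t)) @ (x_mean - GtG_zf)
```

Here `GtG` stands for the conditional mean $\overline{\Gamma^T\Gamma}^{(b)}$ and `GtG_zf` for $\overline{\Gamma^T\Gamma z_f}^{(b)}$; by Lemma 3.3 the linear system is always solvable.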

3.7 Nonzero Costs for Displacements

Suppose here that $Q\ne 0$. According to (3.13a–d), for the adjoint trajectory $y=y(t,\omega)$ we have the system of differential equations
$$
\dot y(t,\omega)=-A^T y(t,\omega)-Qz(t,\omega),\qquad
y(t_f,\omega)=\nabla G\big(t_f,\omega,z(t_f,\omega)\big),
$$
which has the following solution for given $z(t,\omega)$ and $\nabla G(t_f,\omega,z(t_f,\omega))$:
$$
y(t,\omega)=\int_t^{t_f}e^{A^T(s-t)}Qz(s,\omega)\,ds
+e^{A^T(t_f-t)}\nabla G\big(t_f,\omega,z(t_f,\omega)\big).\tag{3.40}
$$
Indeed, we get
$$
y(t_f,\omega)=0+I\,\nabla_z G\big(t_f,\omega,z(t_f,\omega)\big)=\nabla_z G\big(t_f,\omega,z(t_f,\omega)\big),
$$
$$
\dot y(t,\omega)=-e^{A^T\cdot 0}Qz(t,\omega)
-A^T\int_t^{t_f}e^{A^T(s-t)}Qz(s,\omega)\,ds
-A^T e^{A^T(t_f-t)}\nabla G\big(t_f,\omega,z(t_f,\omega)\big)
$$
$$
=-e^{A^T\cdot 0}Qz(t,\omega)
-A^T\Bigg(\int_t^{t_f}e^{A^T(s-t)}Qz(s,\omega)\,ds+e^{A^T(t_f-t)}\nabla G\big(t_f,\omega,z(t_f,\omega)\big)\Bigg)
=-A^T y(t,\omega)-Qz(t,\omega).
$$

From (3.40) we then get
$$
y^{(b)}(t)=E^{(b)}\big(y(t,\omega)\big)=E\big(y(t,\omega)\,\big|\,\mathcal{A}_{t_b}\big)
=\int_t^{t_f}e^{A^T(s-t)}Q\bar z^{(b)}(s)\,ds
+e^{A^T(t_f-t)}\,\overline{\nabla G\big(t_f,\omega,z(t_f,\omega)\big)}^{(b)}.\tag{3.41}
$$
The unknown function $\bar z^{(b)}(t)$ and the vector $z(t_f,\omega)$ in this equation are both given, based on $y^{(b)}(t)$, by the initial value problem, see (3.13a, b),
$$
\dot z(t,\omega)=Az(t,\omega)+B\,\nabla C^{-1}\big(-B^T y^{(b)}(t)\big)+b(t,\omega),\tag{3.42a}
$$
$$
z(t_b,\omega)=z_b.\tag{3.42b}
$$
Taking expectations, considering the state vector at the final time point $t_f$, resp., yields the expressions:
$$
\bar z^{(b)}(t)=e^{A(t-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{t}e^{A(t-s)}\Big(\bar b^{(b)}(s)+B\,\nabla C^{-1}\big(-B^T y^{(b)}(s)\big)\Big)\,ds,\tag{3.43a}
$$
$$
z(t_f,\omega)=e^{A(t_f-t_b)}z_b
+\int_{t_b}^{t_f}e^{A(t_f-s)}\Big(b(s,\omega)+B\,\nabla C^{-1}\big(-B^T y^{(b)}(s)\big)\Big)\,ds.\tag{3.43b}
$$

3.7.1 Quadratic Control and Terminal Costs

Corresponding to (3.18a, b) and (3.20a, b), suppose
$$
\nabla G\big(t_f,\omega,z(t_f,\omega)\big)=z(t_f,\omega)-z_f(\omega),\qquad
\nabla C^{-1}(w)=R^{-1}w.
$$
According to (3.12) and (3.11c), in the present case the stochastic optimal open-loop control is given by
$$
u^*(t;t_b,\mathcal{I}_{t_b})=u^e\big(t,h(t)\big)
=-R^{-1}E\big(B^T y(t,\omega)\,\big|\,\mathcal{A}_{t_b}\big)=-R^{-1}B^T y^{(b)}(t).\tag{3.44a}
$$
Hence, we need the function $y^{(b)}=y^{(b)}(t)$. From (3.41) and (3.18a, b) we have
$$
y^{(b)}(t)=e^{A^T(t_f-t)}\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)
+\int_t^{t_f}e^{A^T(s-t)}Q\bar z^{(b)}(s)\,ds.\tag{3.44b}
$$
Inserting (3.43a, b) into (3.44b), we have
$$
y^{(b)}(t)=e^{A^T(t_f-t)}\Bigg(e^{A(t_f-t_b)}\bar z_b^{(b)}-\bar z_f^{(b)}
+\int_{t_b}^{t_f}e^{A(t_f-s)}\Big(\bar b^{(b)}(s)-BR^{-1}B^T y^{(b)}(s)\Big)\,ds\Bigg)
$$
$$
+\int_t^{t_f}e^{A^T(s-t)}Q\Bigg(e^{A(s-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{s}e^{A(s-\tau)}\Big(\bar b^{(b)}(\tau)-BR^{-1}B^T y^{(b)}(\tau)\Big)\,d\tau\Bigg)\,ds.\tag{3.44c}
$$
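The fixed-point equation (3.44c) suggests a Picard iteration: alternately integrate the state equation (3.42a) with the current adjoint mean and re-evaluate the adjoint map (3.44b). The following discretized sketch (my own naming; explicit Euler for the state and a simple Riemann sum for the adjoint integral are simplifying choices) illustrates this:

```python
import numpy as np
from scipy.linalg import expm

def picard_adjoint(A, B, R_inv, Q, z_b, z_f, b_mean, t_b, t_f, n=50, iters=60):
    # Picard iteration for (3.44c) on a uniform grid; it converges when the
    # contraction condition of Theorem 3.3 holds.
    ts = np.linspace(t_b, t_f, n + 1)
    dt = ts[1] - ts[0]
    m = len(z_b)
    BRB = B @ R_inv @ B.T
    y = np.zeros((n + 1, m))
    for _ in range(iters):
        drift = np.array([b_mean(t) for t in ts]) - y @ BRB.T   # \bar b(s) - BR^{-1}B^T y(s)
        z = np.empty_like(y)
        z[0] = z_b
        for k in range(n):                                      # explicit Euler for (3.42a)
            z[k + 1] = z[k] + dt * (A @ z[k] + drift[k])
        y_new = np.empty_like(y)
        for i, t in enumerate(ts):                              # adjoint map (3.44b)
            tail = sum(expm(A.T * (ts[k] - t)) @ (Q @ z[k]) * dt for k in range(i, n))
            y_new[i] = expm(A.T * (t_f - t)) @ (z[-1] - z_f) + tail
        y = y_new
    return ts, y
```

For $Q=0$, a scalar system $A=0$, $B=R=1$, $z_b=1$, $z_f=0$ on $[0,0.5]$ the exact fixed point is $y\equiv(z_b-z_f)/(1+(t_f-t_b))=2/3$, which the iteration reproduces.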

In the following we develop a condition that guarantees the existence and uniqueness of a solution $y^{(b)}=y^{(b)}(t)$ of Eq. (3.44c):

Theorem 3.3 In the space of continuous functions, the above Eq. (3.44c) has a unique solution if
$$
c_B<\frac{1}{\sqrt{c_A^2\,c_{R^{-1}}(t_f-t_0)\Big(1+\dfrac{(t_f-t_0)\,c_Q}{2}\Big)}}.\tag{3.45}
$$
Here,
$$
c_A:=\sup_{t_b\le t,s\le t_f}\big\|e^{A(t-s)}\big\|_F,\qquad
c_B:=\|B\|_F,\qquad c_{R^{-1}}:=\|R^{-1}\|_F,\qquad c_Q:=\|Q\|_F,
$$
and the index $F$ denotes the Frobenius norm.

Proof The proof of the existence and uniqueness of such a solution is based on the Banach fixed point theorem. For applying this theorem, we consider the Banach space
$$
X=\big\{f:[t_b,t_f]\to\mathbb{R}^{2m}\,:\,f\ \text{continuous}\big\}\tag{3.46a}
$$
equipped with the supremum norm
$$
\|f\|_\infty:=\sup_{t_b\le t\le t_f}\|f(t)\|_2,\tag{3.46b}
$$
where $\|\cdot\|_2$ denotes the Euclidean norm on $\mathbb{R}^{2m}$.


Now we study the operator $\mathcal{T}:X\to X$ defined by
$$
(\mathcal{T}f)(t)=e^{A^T(t_f-t)}\Bigg(e^{A(t_f-t_b)}\bar z_b^{(b)}-\bar z_f^{(b)}
+\int_{t_b}^{t_f}e^{A(t_f-s)}\Big(\bar b^{(b)}(s)-BR^{-1}B^T f(s)\Big)\,ds\Bigg)
$$
$$
+\int_t^{t_f}e^{A^T(s-t)}Q\Bigg(e^{A(s-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{s}e^{A(s-\tau)}\Big(\bar b^{(b)}(\tau)-BR^{-1}B^T f(\tau)\Big)\,d\tau\Bigg)\,ds.\tag{3.47}
$$

The norm of the difference $\mathcal{T}f-\mathcal{T}g$ of the images of two different elements $f,g\in X$ with respect to $\mathcal{T}$ may be estimated as follows:
$$
\|\mathcal{T}f-\mathcal{T}g\|
=\sup_{t_b\le t\le t_f}\Bigg\|e^{A^T(t_f-t)}\int_{t_b}^{t_f}e^{A(t_f-s)}BR^{-1}B^T\big(g(s)-f(s)\big)\,ds
$$
$$
+\int_t^{t_f}e^{A^T(s-t)}Q\int_{t_b}^{s}e^{A(s-\tau)}BR^{-1}B^T\big(g(\tau)-f(\tau)\big)\,d\tau\,ds\Bigg\|_2.\tag{3.48a}
$$

Note that the Frobenius norm is submultiplicative and compatible with the Euclidean norm. Using these properties, we get
$$
\|\mathcal{T}f-\mathcal{T}g\|
\le\sup_{t_b\le t\le t_f}\Bigg\{c_A\int_{t_b}^{t_f}c_A c_B c_{R^{-1}}c_B\,\|f(s)-g(s)\|_2\,ds
+\int_t^{t_f}c_A c_Q\int_{t_b}^{s}c_A c_B c_{R^{-1}}c_B\,\|f(\tau)-g(\tau)\|_2\,d\tau\,ds\Bigg\}
$$
$$
=\|f-g\|\,c_A^2 c_B^2 c_{R^{-1}}\sup_{t_b\le t\le t_f}\Big\{(t_f-t_b)+\frac{c_Q}{2}\big((t_f-t_b)^2-(t-t_b)^2\big)\Big\}
\le\|f-g\|\,c_A^2 c_B^2 c_{R^{-1}}(t_f-t_b)\Big(1+\frac{c_Q}{2}(t_f-t_b)\Big).\tag{3.48b}
$$
Thus, $\mathcal{T}$ is a contraction if
$$
c_B^2<\frac{1}{c_A^2 c_{R^{-1}}(t_f-t_b)\Big(1+\dfrac{c_Q}{2}(t_f-t_b)\Big)}\tag{3.48c}
$$
and therefore
$$
c_B<\frac{1}{\sqrt{c_A^2 c_{R^{-1}}(t_f-t_b)\Big(1+\dfrac{c_Q}{2}(t_f-t_b)\Big)}}.\tag{3.48d}
$$
In order to get a condition that is independent of $t_b$, we take the worst case $t_b=t_0$, hence,
$$
c_B<\frac{1}{\sqrt{c_A^2 c_{R^{-1}}(t_f-t_0)\Big(1+\dfrac{(t_f-t_0)\,c_Q}{2}\Big)}}.\tag{3.48e}
$$
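The constants in condition (3.48e) can be evaluated numerically for a concrete system. The following sketch (my own naming; $c_A$ is approximated by sampling $t-s$ on a grid, which only bounds the supremum from below) returns the threshold that $c_B=\|B\|_F$ must stay below:

```python
import numpy as np
from scipy.linalg import expm

def cB_threshold(A, R_inv, Q, t0, tf, n=100):
    # Right-hand side of (3.48e): the fixed-point map is a contraction
    # whenever c_B = ||B||_F is below the returned value.
    hs = np.linspace(-(tf - t0), tf - t0, 2 * n + 1)   # t - s ranges over [-(tf-t0), tf-t0]
    c_A = max(np.linalg.norm(expm(A * h), 'fro') for h in hs)
    c_R = np.linalg.norm(R_inv, 'fro')
    c_Q = np.linalg.norm(Q, 'fro')
    T = tf - t0
    return 1.0 / np.sqrt(c_A**2 * c_R * T * (1.0 + 0.5 * T * c_Q))
```

For $A=0$, $Q=0$ and scalar $R^{-1}=1$ on $[0,1]$ this reduces to $c_B<1/(c_A\sqrt{t_f-t_0})=1/\sqrt2$ (with $c_A=\|I_2\|_F=\sqrt2$ for a $2\times2$ state).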

Remark 3.4 Condition (3.48e) holds if the matrix $\Gamma$ in (3.1c) has a sufficiently small Frobenius norm. Indeed, according to (3.2b) we have
$$
B=\begin{pmatrix}0\\ M^{-1}\Gamma\end{pmatrix}
$$
and therefore
$$
c_B=\|B\|_F=\|M^{-1}\Gamma\|_F\le\|M^{-1}\|_F\,\|\Gamma\|_F.
$$

Having $y^{(b)}(t)$, according to (3.44a) a stochastic optimal open-loop control $u^*(t)=u^*(t;t_b,\mathcal{I}_{t_b})$, $t_b\le t\le t_f$, reads:
$$
u^*(t;t_b,\mathcal{I}_{t_b})=-R^{-1}B^T y^{(b)}(t).\tag{3.49a}
$$
Moreover,
$$
\varphi(t_b,\mathcal{I}_{t_b}):=u^*(t_b),\qquad t_0\le t_b\le t_f,\tag{3.49b}
$$
is then a stochastic optimal open-loop feedback control law.

Remark 3.5 Putting $Q=0$ in (3.40), we again obtain the stochastic optimal open-loop feedback control law (3.26) in Sect. 3.6.1.

3.8 Stochastic Weight Matrix $Q=Q(t,\omega)$

In the following we consider the case that, cf. (2.6h, i), the weight matrix for the evaluation of the displacements $z=z(t,\omega)$ is stochastic and may also depend on time $t$, $t_0\le t\le t_f$. In order to take into account especially the size of the additive disturbance term $b=b(t,\omega)$, cf. (3.7b), in the following we consider the stochastic weight matrix
$$
Q(t,\omega):=\|b(t,\omega)\|^2\,Q,\tag{3.50a}
$$
where $Q$ is again a positive (semi)definite $2m\times 2m$ matrix, and $\|\cdot\|$ denotes the Euclidean norm. According to (3.13c, d), for the adjoint variable $y=y(t,\omega)$ we then have the system of differential equations
$$
\dot y(t,\omega)=-A^T y(t,\omega)-\beta(t,\omega)\,Qz(t,\omega),\qquad
y(t_f,\omega)=\nabla G\big(t_f,\omega,z(t_f,\omega)\big),
$$

where the stochastic function $\beta=\beta(t,\omega)$ is defined by
$$
\beta(t,\omega):=\|b(t,\omega)\|^2.\tag{3.50b}
$$
Assuming that we have also the weighted terminal costs
$$
G\big(t_f,\omega,z(t_f,\omega)\big):=\frac12\,\beta(t_f,\omega)\,\|z(t_f,\omega)-z_f(\omega)\|^2,\tag{3.50c}
$$
for the adjoint variable $y=y(t,\omega)$ we have the boundary value problem
$$
\dot y(t,\omega)=-A^T y(t,\omega)-\beta(t,\omega)\,Qz(t,\omega),\tag{3.51a}
$$
$$
y(t_f,\omega)=\beta(t_f,\omega)\big(z(t_f,\omega)-z_f(\omega)\big)
=\beta(t_f,\omega)z(t_f,\omega)-\beta(t_f,\omega)z_f(\omega).\tag{3.51b}
$$
Corresponding to (3.40), from (3.51a, b) we then get the solution
$$
y(t,\omega)=\int_t^{t_f}e^{A^T(s-t)}Q\,\beta(s,\omega)z(s,\omega)\,ds
+e^{A^T(t_f-t)}\big(\beta(t_f,\omega)z(t_f,\omega)-\beta(t_f,\omega)z_f(\omega)\big),\qquad t_b\le t\le t_f.\tag{3.52a}
$$
Taking conditional expectations of (3.52a) with respect to $\mathcal{A}_{t_b}$, corresponding to (3.41) we obtain
$$
y^{(b)}(t)=e^{A^T(t_f-t)}\Big(\overline{\beta(t_f,\cdot)z(t_f,\cdot)}^{(b)}-\overline{\beta(t_f,\cdot)z_f(\cdot)}^{(b)}\Big)
+\int_t^{t_f}e^{A^T(s-t)}Q\,\overline{\beta(s,\cdot)z(s,\cdot)}^{(b)}\,ds,\qquad t\ge t_b.\tag{3.52b}
$$

Since the matrices $A,B$ are assumed to be fixed, see (3.7b), from (3.8c) and (3.52b) we get
$$
h(t)=E\big(B^T y(t,\omega)\,\big|\,\mathcal{A}_{t_b}\big)=B^T y^{(b)}(t),\qquad t\ge t_b.\tag{3.53a}
$$
Consequently, corresponding to (3.11c) and (3.12), with (3.53a) the optimal open-loop control $u^*=u^*(t)$ is then given by
$$
u^*(t;\mathcal{I}_{t_b})=-R^{-1}h(t).\tag{3.53b}
$$
Moreover, the weighted conditional mean trajectory
$$
t\mapsto\overline{\beta(t,\cdot)z(t,\cdot)}^{(b)}=E\big(\beta(t,\omega)z(t,\omega)\,\big|\,\mathcal{A}_{t_b}\big),\qquad t\ge t_b,\tag{3.54a}
$$

is determined in the present open-loop feedback approach as follows. We first remember that the optimal trajectory is defined by the initial value problem (3.13a, b). Approximating the weighted conditional mean trajectory (3.54a) by
$$
t\mapsto E\big(\beta(t_b,\omega)z(t,\omega)\,\big|\,\mathcal{A}_{t_b}\big),\qquad t_b\le t\le t_f,\tag{3.54b}
$$
we multiply (3.13a, b) by $\beta(t_b,\omega)$. Thus, the trajectory $t\mapsto\beta(t_b,\omega)z(t,\omega)$, $t\ge t_b$, is the solution of the initial value problem
$$
\frac{d}{dt}\,\beta(t_b,\omega)z(t,\omega)=A\,\beta(t_b,\omega)z(t,\omega)
-BR^{-1}B^T\beta(t_b,\omega)y^{(b)}(t)+\beta(t_b,\omega)b(t,\omega),\tag{3.55a}
$$
$$
\beta(t_b,\omega)z(t_b,\omega)=\beta(t_b,\omega)z_b.\tag{3.55b}
$$
Taking conditional expectations of (3.55a, b) with respect to $\mathcal{A}_{t_b}$, for the approximate weighted conditional mean trajectory (3.54b) we obtain the initial value problem
$$
\frac{d}{dt}\,\overline{\beta(t_b,\cdot)z(t,\cdot)}^{(b)}
=A\,\overline{\beta(t_b,\cdot)z(t,\cdot)}^{(b)}
-BR^{-1}B^T\bar\beta^{(b)}(t_b)\,y^{(b)}(t)
+\overline{\beta(t_b,\cdot)b(t,\cdot)}^{(b)},\tag{3.56a}
$$
$$
\overline{\beta(t_b,\cdot)z(t_b,\cdot)}^{(b)}=\overline{\beta(t_b,\cdot)z_b(\cdot)}^{(b)},\tag{3.56b}
$$
where $\bar\beta^{(b)}(t):=E\big(\beta(t,\omega)\,\big|\,\mathcal{A}_{t_b}\big)$, $t\ge t_b$. Consequently, the approximate weighted conditional mean trajectory (3.54b) can be represented, cf. (3.43a, b), by
$$
\overline{\beta(t_b,\cdot)z(t,\cdot)}^{(b)}
=e^{A(t-t_b)}\,\overline{\beta(t_b,\cdot)z_b}^{(b)}
+\int_{t_b}^{t}e^{A(t-s)}\Big(\overline{\beta(t_b,\cdot)b(s,\cdot)}^{(b)}
-BR^{-1}B^T\bar\beta^{(b)}(t_b)\,y^{(b)}(s)\Big)\,ds,\qquad t_b\le t\le t_f.\tag{3.57}
$$
Obviously, a corresponding approximate representation for
$$
t\mapsto\overline{\beta(t_f,\cdot)z(t_f,\cdot)}^{(b)}
$$
can be obtained, cf. (3.43b), by using (3.57) for $t=t_f$.


Inserting now (3.57) into (3.52b), corresponding to (3.44c) we find the following approximate fixed point condition for the conditional mean adjoint trajectory $t\mapsto y^{(b)}(t)$, $t_b\le t\le t_f$, needed in the representation (3.53b) of the stochastic optimal open-loop control $u^*=u^*(t)$, $t_b\le t\le t_f$:
$$
y^{(b)}(t)=e^{A^T(t_f-t)}\Big(\overline{\beta(t_f,\cdot)z(t_f,\cdot)}^{(b)}-\overline{\beta(t_f,\cdot)z_f(\cdot)}^{(b)}\Big)
+\int_t^{t_f}e^{A^T(s-t)}Q\,\overline{\beta(s,\cdot)z(s,\cdot)}^{(b)}\,ds
$$
$$
\approx e^{A^T(t_f-t)}\Bigg(e^{A(t_f-t_b)}\,\overline{\beta(t_b,\cdot)z_b(\cdot)}^{(b)}
-\overline{\beta(t_f,\cdot)z_f(\cdot)}^{(b)}
+\int_{t_b}^{t_f}e^{A(t_f-s)}\Big(\overline{\beta(t_b,\cdot)b(s,\cdot)}^{(b)}
-BR^{-1}B^T\bar\beta^{(b)}(t_b)\,y^{(b)}(s)\Big)\,ds\Bigg)
$$
$$
+\int_t^{t_f}e^{A^T(s-t)}Q\Bigg(e^{A(s-t_b)}\,\overline{\beta(t_b,\cdot)z_b(\cdot)}^{(b)}
+\int_{t_b}^{s}e^{A(s-\tau)}\Big(\overline{\beta(t_b,\cdot)b(\tau,\cdot)}^{(b)}
-BR^{-1}B^T\bar\beta^{(b)}(t_b)\,y^{(b)}(\tau)\Big)\,d\tau\Bigg)\,ds.\tag{3.58}
$$
Corresponding to Theorem 3.3 we can also develop a condition that guarantees the existence and uniqueness of a solution $y^{(b)}=y^{(b)}(t)$ of Eq. (3.58):

Theorem 3.4 In the space of continuous functions, Eq. (3.58) has a unique solution if
$$
c_B<\frac{1}{\sqrt{c_A^2\,c_{R^{-1}}\,\bar\beta^{(b)}(t_b)\,(t_f-t_0)\Big(1+\dfrac{(t_f-t_0)\,c_Q}{2}\Big)}}.\tag{3.59}
$$

Here again,
$$
c_A:=\sup_{t_b\le t,s\le t_f}\big\|e^{A(t-s)}\big\|_F,\qquad
c_B:=\|B\|_F,\qquad c_{R^{-1}}:=\|R^{-1}\|_F,\qquad c_Q:=\|Q\|_F,
$$
and the index $F$ denotes the Frobenius norm.

According to (3.53a, b), the stochastic optimal open-loop control $u^*(t)$, $t_b\le t\le t_f$, can be obtained as follows:

Theorem 3.5 With a solution $y^{(b)}(t)$ of the fixed point condition (3.58), the stochastic optimal open-loop control $u^*(t)$, $t_b\le t\le t_f$, reads:
$$
u^*(t;\mathcal{I}_{t_b})=-R^{-1}B^T y^{(b)}(t).\tag{3.60a}
$$
Moreover,
$$
\varphi(t_b,\mathcal{I}_{t_b}):=u^*(t_b;\mathcal{I}_{t_b}),\qquad t_0\le t_b\le t_f,\tag{3.60b}
$$
is then the stochastic optimal open-loop feedback control law.

3.9 Uniformly Bounded Sets of Controls $D_t$, $t_0\le t\le t_f$

The above Theorem 3.4, guaranteeing the existence of a solution of the fixed point condition (3.58), can be generalized considerably if we suppose that the sets $D_t$ of feasible controls $u=u(t)$ are uniformly bounded with respect to the time $t$, $t_0\le t\le t_f$. Hence, in the following we suppose again:
• time-independent and deterministic matrices of coefficients, hence
$$
A(t,\omega)=A,\qquad B(t,\omega)=B;\tag{3.61a}
$$
• quadratic cost functions,
$$
C(u)=\frac12\,u^T Ru,\qquad Q(z)=\frac12\,z^T Qz,\qquad
G\big(t_f,\omega,z(t_f,\omega)\big)=\frac12\,\big\|z(t_f,\omega)-z_f(\omega)\big\|^2;\tag{3.61b}
$$
• uniformly bounded sets of feasible controls; hence, we assume that there exists a constant $c_D>0$ such that
$$
\|u\|_2\le c_D\quad\text{for all }u\in\bigcup_{t\in T}D_t.\tag{3.61c}
$$

According to the above assumed deterministic coefficient matrices $A,B$, the $H$-minimal control depends only on the conditional expectation of the adjoint trajectory, hence,
$$
u^e(t)=u^e\big(t,\bar y^{(b)}(t)\big).\tag{3.62}
$$
Thus, the integral form of the related two-point boundary value problem reads:
$$
z(\omega,t)=z_b+\int_{t_b}^{t}\Big(Az(s,\omega)+b(s,\omega)+B\,u^e\big(s,\bar y^{(b)}(s)\big)\Big)\,ds,\tag{3.63a}
$$
$$
y(\omega,t)=\big(z(t_f,\omega)-z_f(\omega)\big)+\int_t^{t_f}\Big(A^T y(\omega,s)+Qz(s,\omega)\Big)\,ds.\tag{3.63b}
$$

Consequently, for the conditional expectations of the trajectories we get
$$
\bar z^{(b)}(t)=\bar z_b^{(b)}+\int_{t_b}^{t}\Big(A\bar z^{(b)}(s)+\bar b^{(b)}(s)+B\,u^e\big(s,\bar y^{(b)}(s)\big)\Big)\,ds,\tag{3.64a}
$$
$$
\bar y^{(b)}(t)=\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)+\int_t^{t_f}\Big(A^T\bar y^{(b)}(s)+Q\bar z^{(b)}(s)\Big)\,ds.\tag{3.64b}
$$
Using the matrix exponential function with respect to $A$, we have
$$
\bar z^{(b)}(t)=e^{A(t-t_b)}\bar z_b^{(b)}+\int_{t_b}^{t}e^{A(t-s)}\Big(\bar b^{(b)}(s)+B\,u^e\big(s,\bar y^{(b)}(s)\big)\Big)\,ds,\tag{3.65a}
$$
$$
\bar y^{(b)}(t)=e^{A^T(t_f-t)}\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)+\int_t^{t_f}e^{A^T(s-t)}Q\bar z^{(b)}(s)\,ds.\tag{3.65b}
$$
Putting (3.65a) into (3.65b), for $\bar y^{(b)}(t)$ we then get the following fixed point condition:
$$
\bar y^{(b)}(t)=e^{A^T(t_f-t)}\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)
+\int_t^{t_f}e^{A^T(s-t)}Q\Bigg(e^{A(s-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{s}e^{A(s-\tau)}\Big(\bar b^{(b)}(\tau)+B\,u^e\big(\tau,\bar y^{(b)}(\tau)\big)\Big)\,d\tau\Bigg)\,ds.\tag{3.66}
$$

For the consideration of the existence of a solution of the above fixed point equation (3.66) we need several auxiliary tools. According to the assumption (3.61c), next we have the following lemma:

Lemma 3.4 There exist constants $c_z,c_G>0$ such that
$$
\big\|\bar z^{(b)}(f(\cdot),t)\big\|\le c_z\quad\text{and}\quad
\big\|\nabla_z G\big(\bar z^{(b)}(f(\cdot),t_f)\big)\big\|\le c_G\tag{3.67a}
$$
for each time $t$, $t_b\le t\le t_f$, and all $f\in C(T;\mathbb{R}^m)$, where
$$
\bar z^{(b)}(f(\cdot),t):=e^{A(t-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{t}e^{A(t-s)}\Big(\bar b^{(b)}(s)+B\,u^e\big(f(s),s\big)\Big)\,ds.\tag{3.67b}
$$
Proof With $c_A:=e^{\|A\|_F(t_f-t_b)}$, $c_B:=\|B\|_F$ and $c_b^{(b)}:=\|\bar b^{(b)}(\cdot)\|_\infty$ the following inequalities hold:
$$
\big\|\bar z^{(b)}(f(\cdot),t)\big\|_2\le c_A\Big(\|\bar z_b^{(b)}\|_2+\big(c_b^{(b)}+c_B c_D\big)(t_f-t_b)\Big)\le c_z,\tag{3.68a}
$$
$$
\big\|\nabla_z G\big(\bar z^{(b)}(f(\cdot),t_f)\big)\big\|_2
=\big\|\bar z^{(b)}(f(\cdot),t_f)-\bar z_f^{(b)}\big\|\le c_z+\|\bar z_f^{(b)}\|_2\le c_G,\tag{3.68b}
$$
where $c_z,c_G$ are arbitrary upper bounds of the corresponding left-hand quantities. □
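The bound (3.68a) can be checked numerically against a quadrature evaluation of (3.67b). The sketch below (names and the left-endpoint rule are my own simplifying choices) compares the two for an arbitrary continuous $f$ and a bounded control map $u^e$:

```python
import numpy as np
from scipy.linalg import expm

def cz_bound(A, B, z_b_mean, c_b, c_D, t_b, t_f):
    # (3.68a): c_A (||z_b||_2 + (c_b + c_B c_D)(t_f - t_b)), c_A = exp(||A||_F (t_f - t_b))
    c_A = np.exp(np.linalg.norm(A, 'fro') * (t_f - t_b))
    c_B = np.linalg.norm(B, 'fro')
    return c_A * (np.linalg.norm(z_b_mean) + (c_b + c_B * c_D) * (t_f - t_b))

def z_of_f(A, B, z_b_mean, b_mean, u_e, f, t_b, t_f, n=200):
    # Left-endpoint quadrature of (3.67b) at t = t_f for a given continuous f
    s = np.linspace(t_b, t_f, n + 1)
    ds = (t_f - t_b) / n
    z = expm(A * (t_f - t_b)) @ z_b_mean
    for sk in s[:-1]:
        z = z + ds * expm(A * (t_f - sk)) @ (b_mean(sk) + B @ u_e(f(sk), sk))
    return z
```

The point of Lemma 3.4 is that the bound holds uniformly in $f$, which is what the Schauder argument below exploits.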
In the next lemma the operator defined by the right-hand side of (3.66) is studied:

Lemma 3.5 Let again $X:=C(T;\mathbb{R}^m)$ denote the space of continuous functions $f$ on $T$ equipped with the supremum norm. If $\tilde{\mathcal{T}}:X\to X$ denotes the operator defined by
$$
(\tilde{\mathcal{T}}f)(t):=e^{A^T(t_f-t)}\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)
+\int_t^{t_f}e^{A^T(s-t)}Q\Bigg(e^{A(s-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{s}e^{A(s-\tau)}\Big(\bar b^{(b)}(\tau)+B\,u^e\big(\tau,f(\tau)\big)\Big)\,d\tau\Bigg)\,ds,\tag{3.69}
$$
then the image of $\tilde{\mathcal{T}}$ is relatively compact.

Proof Let $c_Q:=\|Q\|_F$. We have to show that $\tilde{\mathcal{T}}(X)$ is bounded and equicontinuous.

• $\tilde{\mathcal{T}}(X)$ is bounded:
$$
\Bigg\|e^{A^T(t_f-t)}\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)
+\int_t^{t_f}e^{A^T(s-t)}Q\Bigg(e^{A(s-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{s}e^{A(s-\tau)}\Big(\bar b^{(b)}(\tau)+B\,u^e\big(\tau,f(\tau)\big)\Big)\,d\tau\Bigg)\,ds\Bigg\|
$$
$$
\le c_A c_G+c_A c_Q c_A\Bigg(\big\|\bar z_b^{(b)}\big\|_2\,(t_f-t_b)
+\big(c_A c_b^{(b)}+c_B c_D\big)\,\frac{t_f^2-t_b^2}{2}\Bigg).\tag{3.70}
$$
• $\tilde{\mathcal{T}}(X)$ is equicontinuous: We have to show that for each $\varepsilon>0$ there exists a $\delta>0$ such that, independent of the mapped function $f$, the following implication holds:
$$
|t-s|<\delta\ \Longrightarrow\ \big\|\tilde{\mathcal{T}}f(t)-\tilde{\mathcal{T}}f(s)\big\|_2\le\varepsilon.
$$
Defining the function
$$
\varrho(t)=e^{A(t-t_b)}\bar z_b^{(b)}+\int_{t_b}^{t}e^{A(t-\tau)}\bar b^{(b)}(\tau)\,d\tau,\tag{3.71}
$$
the following inequalities hold:
$$
\big\|\tilde{\mathcal{T}}f(t)-\tilde{\mathcal{T}}f(s)\big\|_2
=\Bigg\|e^{A^T(t_f-t)}\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)-e^{A^T(t_f-s)}\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)
$$
$$
+\int_t^{t_f}e^{A^T(\sigma-t)}Q\Bigg(e^{A(\sigma-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{\sigma}e^{A(\sigma-\tau)}\Big(\bar b^{(b)}(\tau)+B\,u^e\big(\tau,f(\tau)\big)\Big)\,d\tau\Bigg)\,d\sigma
$$
$$
-\int_s^{t_f}e^{A^T(\sigma-s)}Q\Bigg(e^{A(\sigma-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{\sigma}e^{A(\sigma-\tau)}\Big(\bar b^{(b)}(\tau)+B\,u^e\big(\tau,f(\tau)\big)\Big)\,d\tau\Bigg)\,d\sigma\Bigg\|_2.\tag{3.72a}
$$
From Eq. (3.72a) we then get
$$
\big\|\tilde{\mathcal{T}}f(t)-\tilde{\mathcal{T}}f(s)\big\|_2
=\Bigg\|\Big(e^{A^T(t_f-t)}-e^{A^T(t_f-s)}\Big)\Big(\bar z^{(b)}(t_f)-\bar z_f^{(b)}\Big)
$$
$$
+\int_t^{t_f}e^{A^T(\sigma-t)}Q\Bigg(\varrho(\sigma)+\int_{t_b}^{\sigma}e^{A(\sigma-\tau)}B\,u^e\big(\tau,f(\tau)\big)\,d\tau\Bigg)\,d\sigma
-\int_s^{t_f}e^{A^T(\sigma-s)}Q\Bigg(\varrho(\sigma)+\int_{t_b}^{\sigma}e^{A(\sigma-\tau)}B\,u^e\big(\tau,f(\tau)\big)\,d\tau\Bigg)\,d\sigma\Bigg\|_2
$$
$$
\le\Big\|e^{A^T(t_f-t)}-e^{A^T(t_f-s)}\Big\|_2\,c_G
+\Big\|e^{A^T(-t)}-e^{A^T(-s)}\Big\|_2\,e^{\|A\|_F t_f}\,c_Q\big(c_\varrho+c_B c_D(t_f-t_b)\big)(t_f-t_b)
$$
$$
+c_A c_Q\big(c_\varrho+c_B c_D(t_f-t_b)\big)\,|t-s|,\tag{3.72b}
$$
where $c_\varrho:=\sup_{t_b\le t\le t_f}\|\varrho(t)\|_2$. Obviously, the final expression in (3.72b) is independent of $f(\cdot)$. Hence, due to the continuity of the matrix exponential function and the function $\varrho(\cdot)$, the assertion follows. □

From the above Lemma 3.5 we now obtain this result:

Theorem 3.6 The fixed point equation (3.66) has a continuous, bounded solution.

Proof Define again $X:=C(T;\mathbb{R}^m)$ and consider the set $\mathcal{M}\subset X$,
$$
\mathcal{M}:=\Big\{f(\cdot)\in X\ \Big|\ \sup_{t\in T}\|f(t)\|_2\le C\Big\},\tag{3.73a}
$$
where
$$
C:=c_A c_G+c_A c_Q c_A\Bigg(\|\bar z_b^{(b)}\|_2\,(t_f-t_b)+\big(c_A c_b^{(b)}+c_B c_D\big)\,\frac{t_f^2-t_b^2}{2}\Bigg).\tag{3.73b}
$$
Moreover, let $\mathcal{T}$ denote the restriction of $\tilde{\mathcal{T}}$ to $\mathcal{M}$, hence,
$$
\mathcal{T}:\mathcal{M}\to\mathcal{M},\qquad f\mapsto\tilde{\mathcal{T}}f.\tag{3.74}
$$
Obviously, the operator $\mathcal{T}$ is continuous and, according to Lemma 3.5, the image of $\mathcal{T}$ is relatively compact. Moreover, the set $\mathcal{M}$ is closed and convex. Hence, according to the fixed point theorem of Schauder, $\mathcal{T}$ has a fixed point in $\mathcal{M}$. □

3.10 Approximate Solution of the Two-Point Boundary Value Problem (BVP)

According to the previous sections, the remaining problem is then to solve the fixed point equation (3.44c) or (3.58). In the first case, the corresponding equation reads:
$$
y^{(b)}(t)=e^{A^T(t_f-t)}G_f\Bigg(e^{A(t_f-t_b)}\bar z_b^{(b)}-\bar z_f^{(b)}
+\int_{t_b}^{t_f}e^{A(t_f-s)}\Big(\bar b^{(b)}(s)-BR^{-1}B^T y^{(b)}(s)\Big)\,ds\Bigg)
$$
$$
+\int_t^{t_f}e^{A^T(s-t)}Q\Bigg(e^{A(s-t_b)}\bar z_b^{(b)}
+\int_{t_b}^{s}e^{A(s-\tau)}\Big(\bar b^{(b)}(\tau)-BR^{-1}B^T y^{(b)}(\tau)\Big)\,d\tau\Bigg)\,ds.\tag{3.75}
$$

Based on the given stochastic open-loop feedback control approach, the fixed point equation (3.75) can be solved by the following approximation method:

Step I
According to Eqs. (3.12) and (3.44a) of the stochastic optimal OLFC, for each remaining time interval $[t_b,t_f]$ the value of the stochastic optimal open-loop control $u^*=u^*(t;t_b,\mathcal{I}_{t_b})$, $t\ge t_b$, is needed at the left time point $t_b$ only. Thus, putting first $t=t_b$ in (3.75), we get:
$$
y^{(b)}(t_b)=e^{A^T(t_f-t_b)}G_f e^{A(t_f-t_b)}\bar z_b^{(b)}
-e^{A^T(t_f-t_b)}G_f\bar z_f^{(b)}
+e^{A^T(t_f-t_b)}G_f\int_{t_b}^{t_f}e^{A(t_f-s)}\bar b^{(b)}(s)\,ds
$$
$$
-e^{A^T(t_f-t_b)}G_f\int_{t_b}^{t_f}e^{A(t_f-s)}BR^{-1}B^T y^{(b)}(s)\,ds
+\int_{t_b}^{t_f}e^{A^T(s-t_b)}Qe^{A(s-t_b)}\,ds\;\bar z_b^{(b)}
$$
$$
+\int_{t_b}^{t_f}e^{A^T(s-t_b)}Q\Bigg(\int_{t_b}^{s}e^{A(s-\tau)}\bar b^{(b)}(\tau)\,d\tau\Bigg)ds
-\int_{t_b}^{t_f}e^{A^T(s-t_b)}Q\Bigg(\int_{t_b}^{s}e^{A(s-\tau)}BR^{-1}B^T y^{(b)}(\tau)\,d\tau\Bigg)ds.\tag{3.76a}
$$

Step II
Due to the representation (3.44a) of the stochastic optimal open-loop control $u^*$ and the stochastic OLFC construction principle (2.10d, e), the value of the conditional mean adjoint variable $y^{(b)}(t)$ is needed at the left boundary point $t=t_b$ only. Consequently, $y^{(b)}=y^{(b)}(s)$ is approximated on $[t_b,t_f]$ by the constant function
$$
y^{(b)}(s)\approx y^{(b)}(t_b),\qquad t_b\le s\le t_f.\tag{3.76b}
$$
In addition, the related matrix exponential function $s\mapsto e^{A(t_f-s)}$ is approximated on $[t_b,t_f]$ in the same way. This approach is justified especially if one works with a receding or moving time horizon
$$
t_f:=t_b+\Delta
$$
with a short prediction time horizon $\Delta$. We then obtain:
$$
y^{(b)}(t_b)\approx e^{A^T(t_f-t_b)}G_f e^{A(t_f-t_b)}\bar z_b^{(b)}
-e^{A^T(t_f-t_b)}G_f\bar z_f^{(b)}
+e^{A^T(t_f-t_b)}G_f\int_{t_b}^{t_f}e^{A(t_f-s)}\bar b^{(b)}(s)\,ds
$$
$$
-(t_f-t_b)\,e^{A^T(t_f-t_b)}G_f e^{A(t_f-t_b)}BR^{-1}B^T y^{(b)}(t_b)
+\int_{t_b}^{t_f}e^{A^T(s-t_b)}Qe^{A(s-t_b)}\,ds\;\bar z_b^{(b)}
$$
$$
+\int_{t_b}^{t_f}e^{A^T(s-t_b)}Q\Bigg(\int_{t_b}^{s}e^{A(s-\tau)}\bar b^{(b)}(\tau)\,d\tau\Bigg)ds
-\int_{t_b}^{t_f}(s-t_b)\,e^{A^T(s-t_b)}Qe^{A(s-t_b)}\,ds\;BR^{-1}B^T y^{(b)}(t_b).\tag{3.76c}
$$

Step III
Rearranging terms, (3.76c) yields a system of linear equations for $y^{(b)}(t_b)$:
$$
y^{(b)}(t_b)\approx A_0(t_b,t_f,G_f,Q)\,\bar z_b^{(b)}-e^{A^T(t_f-t_b)}G_f\bar z_f^{(b)}
+A_1(t_b,t_f,G_f,Q)\cdot\bar b^{(b)}_{[t_b,t_f]}(\cdot)
-A_{23}(t_b,t_f,G_f,Q)\,BR^{-1}B^T y^{(b)}(t_b),\tag{3.76d}
$$
where the matrices, linear operator and function, resp., $A_0$, $A_1$, $A_{23}$, $\bar b^{(b)}_{[t_b,t_f]}$ can easily be read off from relation (3.76c). Consequently, (3.76d) yields
$$
\Big(I+A_{23}(t_b,t_f,G_f,Q)\,BR^{-1}B^T\Big)y^{(b)}(t_b)
\approx A_0(t_b,t_f,G_f,Q)\,\bar z_b^{(b)}
-e^{A^T(t_f-t_b)}G_f\bar z_f^{(b)}
+A_1(t_b,t_f,G_f,Q)\cdot\bar b^{(b)}_{[t_b,t_f]}(\cdot).\tag{3.76e}
$$

For the matrix occurring in (3.76e) we have this result:

Lemma 3.6 The matrix $I+A_{23}(t_b,t_f,G_f,Q)\,BR^{-1}B^T$ is regular.

Proof According to (3.76c, d) we have
$$
A_{23}(t_b,t_f,G_f,Q)=(t_f-t_b)\,e^{A^T(t_f-t_b)}G_f e^{A(t_f-t_b)}
+\int_{t_b}^{t_f}(s-t_b)\,e^{A^T(s-t_b)}Qe^{A(s-t_b)}\,ds.
$$
Hence, $A_{23}=A_{23}(t_b,t_f,G_f,Q)$ is a positive definite matrix. Moreover, $U:=BR^{-1}B^T$ is at least positive semidefinite. Consider now the equation $(I+A_{23}U)w=0$. We get $A_{23}Uw=-w$ and therefore $Uw=-A_{23}^{-1}w$, hence, $(U+A_{23}^{-1})w=0$. However, since the matrix $U+A_{23}^{-1}$ is positive definite, we have $w=0$, which now proves the assertion.

The above lemma and (3.76e) now yield
$$
y^{(b)}(t_b)\approx\Big(I+A_{23}(t_b,t_f,G_f,Q)\,BR^{-1}B^T\Big)^{-1}
\Big(A_0(t_b,t_f,G_f,Q)\,\bar z_b^{(b)}
-e^{A^T(t_f-t_b)}G_f\bar z_f^{(b)}+A_1(t_b,t_f,G_f,Q)\cdot\bar b^{(b)}_{[t_b,t_f]}(\cdot)\Big).\tag{3.76f}
$$
According to (3.75) and (2.10d, e), the stochastic optimal open-loop feedback control $\varphi^*=\varphi^*(t_b,\mathcal{I}_{t_b})$ at $t=t_b$ is obtained as follows:

Theorem 3.7 With the approximate solution $y^{(b)}(t_b)$ of the fixed point condition (3.75) at $t=t_b$, represented by (3.76f), the stochastic optimal open-loop feedback control law at $t=t_b$ is given by
$$
\varphi^*(t_b,\mathcal{I}_{t_b}):=-R^{-1}B^T y^{(b)}(t_b).\tag{3.76g}
$$
Moreover, the whole approximate stochastic optimal open-loop feedback control law $\varphi^*=\varphi^*(t,\mathcal{I}_t)$ is obtained from (3.76g) by replacing $t_b\to t$ for arbitrary $t$, $t_0\le t\le t_f$.
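Steps I–III can be carried out numerically once the coefficient operators of (3.76c) are assembled by quadrature. The following sketch (my own naming; a simple left-endpoint rule replaces the exact integrals, $\bar b^{(b)}$ is a user-supplied callable) builds $A_0$, $A_{23}$ and the $\bar b$-terms, solves (3.76e), and returns $\varphi^*(t_b)$ via (3.76g):

```python
import numpy as np
from scipy.linalg import expm

def olfc_at_tb(A, B, G_f, Q, R_inv, z_b, z_f, b_mean, t_b, t_f, n=40):
    # Assemble the terms of (3.76c), solve (3.76e), return phi*(t_b) from (3.76g).
    m = len(z_b)
    ds = (t_f - t_b) / n
    s = t_b + ds * np.arange(n)                         # left-endpoint grid
    E_f = expm(A.T * (t_f - t_b)) @ G_f                 # e^{A^T(t_f-t_b)} G_f
    EA = [expm(A * (sk - t_b)) for sk in s]
    EAT = [expm(A.T * (sk - t_b)) for sk in s]
    A0 = E_f @ expm(A * (t_f - t_b)) + ds * sum(EAT[k] @ Q @ EA[k] for k in range(n))
    A23 = ((t_f - t_b) * E_f @ expm(A * (t_f - t_b))
           + ds * sum((s[k] - t_b) * EAT[k] @ Q @ EA[k] for k in range(n)))
    b1 = E_f @ sum((ds * expm(A * (t_f - s[k])) @ b_mean(s[k]) for k in range(n)),
                   np.zeros(m))
    b2 = np.zeros(m)
    for k in range(n):                                  # double integral of (3.76a)
        inner = sum((expm(A * (s[k] - s[j])) @ b_mean(s[j]) for j in range(k)),
                    np.zeros(m))
        b2 = b2 + ds * EAT[k] @ Q @ (ds * inner)
    rhs = A0 @ z_b - E_f @ z_f + b1 + b2
    y_tb = np.linalg.solve(np.eye(m) + A23 @ B @ R_inv @ B.T, rhs)
    return -R_inv @ B.T @ y_tb
```

By Lemma 3.6 the linear system is regular whenever $G_f$ is positive definite, so `np.linalg.solve` is safe there; the quadrature error is the price of the Step II constant-function approximation anyway.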

3.11 Example

We consider the structure according to Fig. 3.2, see [18], where we want to control the supplementary active system while minimizing the expected total costs for the control and the terminal costs.

Fig. 3.2 Principle of active structural control

The behavior of the vector of displacements $q(t,\omega)$ can be described by a system of differential equations of second order:
$$
M\begin{pmatrix}\ddot q_0(t,\omega)\\ \ddot q_z(t,\omega)\end{pmatrix}
+D\begin{pmatrix}\dot q_0(t,\omega)\\ \dot q_z(t,\omega)\end{pmatrix}
+K\begin{pmatrix}q_0(t,\omega)\\ q_z(t,\omega)\end{pmatrix}
=f_0(t,\omega)+f_a(t)\tag{3.77}
$$
with
$$
M=\begin{pmatrix}m_0&0\\ 0&m_z\end{pmatrix}\quad\text{mass matrix},\tag{3.78a}
$$
$$
D=\begin{pmatrix}d_0+d_z&-d_z\\ -d_z&d_z\end{pmatrix}\quad\text{damping matrix},\tag{3.78b}
$$
$$
K=\begin{pmatrix}k_0+k_z&-k_z\\ -k_z&k_z\end{pmatrix}\quad\text{stiffness matrix},\tag{3.78c}
$$
$$
f_a(t)=\begin{pmatrix}-1\\ +1\end{pmatrix}u(t)\quad\text{actuator force},\tag{3.78d}
$$
$$
f_0(t,\omega)=\begin{pmatrix}f_{01}(t,\omega)\\ 0\end{pmatrix}\quad\text{applied load}.\tag{3.78e}
$$
Here we have $n=1$, i.e. $u(\cdot)\in C(T,\mathbb{R})$, and the weight matrix $R$ becomes a positive real number.

To represent the equation of motion (3.77) as a first order differential equation we set
$$
z(t,\omega):=\big(q(t,\omega),\dot q(t,\omega)\big)^T
=\begin{pmatrix}q_0(t,\omega)\\ q_z(t,\omega)\\ \dot q_0(t,\omega)\\ \dot q_z(t,\omega)\end{pmatrix}.
$$
This yields the dynamical equation
$$
\dot z(t,\omega)=\begin{pmatrix}0&I_2\\ -M^{-1}K&-M^{-1}D\end{pmatrix}z(t,\omega)
+\begin{pmatrix}0\\ M^{-1}f_a(s)\end{pmatrix}
+\begin{pmatrix}0\\ M^{-1}f_0(s,\omega)\end{pmatrix}
$$
$$
=\underbrace{\begin{pmatrix}
0&0&1&0\\
0&0&0&1\\
-\frac{k_0+k_z}{m_0}&\frac{k_z}{m_0}&-\frac{d_0+d_z}{m_0}&\frac{d_z}{m_0}\\
\frac{k_z}{m_z}&-\frac{k_z}{m_z}&\frac{d_z}{m_z}&-\frac{d_z}{m_z}
\end{pmatrix}}_{=:A}z(t,\omega)
+\underbrace{\begin{pmatrix}0\\ 0\\ -\frac{1}{m_0}\\ \frac{1}{m_z}\end{pmatrix}}_{=:B}u(s)
+\underbrace{\begin{pmatrix}0\\ 0\\ \frac{f_{01}(s,\omega)}{m_0}\\ 0\end{pmatrix}}_{=:b(t,\omega)},\tag{3.79}
$$

where $I_p$ denotes the $p\times p$ identity matrix. Furthermore, we have the optimal control problem under stochastic uncertainty:
$$
\min\ F\big(u(\cdot)\big):=\frac12\,E\Bigg(\int_{t_b}^{t_f}R\big(u(s)\big)^2\,ds
+z(t_f,\omega)^T G\,z(t_f,\omega)\ \Bigg|\ \mathcal{A}_{t_b}\Bigg)\tag{3.80a}
$$
$$
\text{s.t.}\quad z(t,\omega)=z_b+\int_{t_b}^{t}\Big(Az(s,\omega)+Bu(s)+b(s,\omega)\Big)\,ds,\tag{3.80b}
$$
$$
u(\cdot)\in C(T,\mathbb{R}).\tag{3.80c}
$$
Note that this problem is of the "Minimum-Energy Control" type, if we apply no extra costs for the displacements, i.e. $Q\equiv 0$. The two-point boundary value problem to be solved then reads, cf. (3.13a–d),
$$
\dot z(t,\omega)=Az(t,\omega)-\frac1R\,BB^T y^{(b)}(t)+b(t,\omega),\tag{3.81a}
$$
$$
\dot y(t,\omega)=-A^T y(t,\omega),\tag{3.81b}
$$
$$
z(t_b,\omega)=z_b,\tag{3.81c}
$$
$$
y(t_f,\omega)=Gz(t_f,\omega).\tag{3.81d}
$$

Hence, the solution of (3.81a)–(3.81d), i.e. the optimal trajectories, reads, cf. (3.14a), (3.37a),
$$
y(t,\omega)=e^{A^T(t_f-t)}Gz(t_f,\omega),\tag{3.82a}
$$
$$
z(t,\omega)=e^{A(t-t_b)}z_b+\int_{t_b}^{t}e^{A(t-s)}\Bigg(b(s,\omega)
-\frac1R\,BB^T e^{A^T(t_f-s)}G\,\bar z^{(b)}(t_f)\Bigg)\,ds.\tag{3.82b}
$$
Finally, we get the optimal control, see (3.38c) and (3.39):
$$
u^*(t)=-\frac1R\,B^T e^{A^T(t_f-t)}\big(I_4+GU\big)^{-1}G\,e^{At_f}
\Bigg(e^{-At_b}z_b+\int_{t_b}^{t_f}e^{-As}\,\bar b^{(b)}(s)\,ds\Bigg)\tag{3.83}
$$
with
$$
U=\frac1R\int_{t_b}^{t_f}e^{A(t_f-s)}BB^T e^{A^T(t_f-s)}\,ds.\tag{3.84}
$$
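The state-space matrices of (3.79) are easy to assemble for concrete structural parameters. The following sketch (function name is my own; the load term $b(t,\omega)$ would be added separately) builds $A$ and $B$ for the two-mass example:

```python
import numpy as np

def structure_state_space(m0, mz, d0, dz, k0, kz):
    # State-space matrices A, B of (3.79) for the two-mass active structural
    # control example; the load enters as b(t,w) = (0, 0, f01(t,w)/m0, 0)^T.
    A = np.array([
        [0.0,            0.0,     1.0,            0.0],
        [0.0,            0.0,     0.0,            1.0],
        [-(k0 + kz)/m0,  kz/m0,  -(d0 + dz)/m0,   dz/m0],
        [ kz/mz,        -kz/mz,   dz/mz,         -dz/mz]])
    B = np.array([[0.0], [0.0], [-1.0/m0], [1.0/mz]])
    return A, B
```

The lower blocks reproduce $-M^{-1}K$ and $-M^{-1}D$, and $B$ is $\big(0,\,M^{-1}(-1,+1)^T\big)$, matching (3.78a–d).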
Chapter 4
Adaptive Optimal Stochastic Trajectory Planning and Control (AOSTPC)

4.1 Introduction

An industrial, service or field robot is modeled mathematically by its dynamic equation, being a system of second order differential equations for the robot or configuration coordinates $q=(q_1,\ldots,q_n)'$ (rotation angles in case of revolute links, length of translations in case of prismatic links), and the kinematic equation, relating the space $\{q\}$ of robot coordinates to the work space $\{x\}$ of the robot. Thereby one meets [11, 126, 140, 148, 150, 161] several model parameters, such as length of links $l_i$ (m), location of center of gravity of links $l_{ci}$ (m), mass of links $m_i$ (kg), payload (N), moments of inertia about centroid $I_i$ (kg m²), (Coulomb-)friction coefficients $R_{ij0}$ (N), etc. Let $p_D$, $p_K$ denote the vector of model parameters contained in the dynamic, kinematic equation, respectively. A further vector $p_C$ of model parameters occurs in the formulation of several constraints, especially initial and terminal conditions, control and state constraints of the robot, as e.g. maximum, minimum torques or forces in the links, bounds for the position, maximum joint, path velocities. Moreover, certain parameters $p_J$, e.g. cost factors, may also occur in the objective (performance, goal) functional $J$. Due to stochastic variations of the material, manufacturing errors, measurement (identification) errors, stochastic variations of the workspace environment, as e.g. stochastic uncertainty of the payload, randomly changing obstacles, errors in the selection of appropriate bounds for the moments, forces, resp., in the links, for the position and path velocities, errors in the selection of random cost factors, modeling errors, disturbances, etc., the total vector
$$
p=\begin{pmatrix}p_D\\ p_K\\ p_C\\ p_J\end{pmatrix}\tag{4.1a}
$$


 
Fig. 4.1 Control of dynamic systems (robots). Here, $z(t):=\big(q(t),\dot q(t)\big)$, $z^{(0)}(t):=\big(q^{(0)}(t),\dot q^{(0)}(t)\big)$

of model parameters is not a given fixed quantity. The vector $p$ must therefore be represented by a random vector
$$
p=p(\omega),\qquad \omega\in(\Omega,\mathcal{A},P),\tag{4.1b}
$$
on a certain probability space $(\Omega,\mathcal{A},P)$, see [8, 58, 123, 148, 169].

Having to control a robotic or more general dynamical system, the control law $u=u(t)$ is usually represented by the sum
$$
u(t):=u^{(0)}(t)+\Delta u(t),\qquad t_0\le t\le t_f,\tag{4.2}
$$
of a feedforward control (open-loop control) $u^{(0)}(t)$, $t_0\le t\le t_f$, and an on-line local control correction (feedback control) $\Delta u(t)$ (Fig. 4.1).

In actual engineering practice [68, 125, 127, 170], the feedforward control $u^{(0)}(t)$ is determined off-line based on a certain reference trajectory $q^{(0)}(t)$, $t_0\le t\le t_f$, in configuration space, where the unknown parameter vector $p$ is replaced by a certain vector $p^{(0)}$ of nominal parameter values, as e.g. the expectation $p^{(0)}:=\bar p=Ep(\omega)$. The increasing deviation of the actual position and velocity of the robot from the prescribed values, caused by the deviation of the actual parameter values $p(\omega)$ from the chosen nominal values $p^{(0)}$, must be compensated by on-line control corrections $\Delta u(t)$, $t>t_0$. This usually requires extensive on-line state observations (measurements) and feedback control actions.
In order to determine a more reliable reference path $q=q(t)$, $t_0\le t\le t_f$, in configuration space, being robust with respect to stochastic parameter variations, the a priori information (e.g. certain moments or parameters of the probability distribution of $p(\cdot)$) about the random variations of the vector $p(\omega)$ of model parameters of the robot and its working environment is taken into account already at the planning phase. Thus, instead of solving a deterministic trajectory planning problem with a fixed nominal parameter vector $p^{(0)}$, here an optimal velocity profile $\beta^{(0)}(s)$, $s_0\le s\le s_f$, and—in case of point-to-point control problems—also an optimal geometric path $q_e^{(0)}(s)$, $s_0\le s\le s_f$, in configuration space is determined by using a stochastic optimization approach [93, 109–111, 130]. By means of $\beta^{(0)}(s)$ and $q_e^{(0)}(s)$, $s_0\le s\le s_f$, we then find a more reliable, robust reference trajectory $q^{(0)}(t)$, $t_0\le t\le t_f$, in configuration space. Applying now the so-called "inverse dynamics approach" [4, 11, 61], more reliable, robust open-loop controls $u^{(0)}(t)$, $t_0\le t\le t_f$, are obtained. Moreover, by linearization of the dynamic equation of the robot in a neighborhood of $\big(u^{(0)}(t),q^{(0)}(t),E(p(\omega)\,|\,\mathcal{A}_{t_0})\big)$, $t\ge t_0$, where $\mathcal{A}_{t_0}$ denotes the $\sigma$-algebra of information up to the initial time point $t_0$, a control correction $\Delta u^{(0)}(t)$, $t\ge t_0$, is obtained which is related to the so-called feedback linearization of a system [11, 61, 131, 150].

At later moments (main correction time points) $t_j$,
$$
t_0<t_1<t_2<\ldots<t_{j-1}<t_j<\ldots,\tag{4.3}
$$
further information on the parameters of the control system and its environment is available, e.g., by process observation, identification, calibration procedures, etc. Improvements $q^{(j)}(t)$, $u^{(j)}(t)$, $\Delta u^{(j)}(t)$, $t\ge t_j$, $j=1,2,\ldots$, of the preceding reference trajectory $q^{(j-1)}(t)$, open-loop control $u^{(j-1)}(t)$, and local control correction (feedback control) $\Delta u^{(j-1)}(t)$ can be determined by replanning, i.e., by optimal stochastic trajectory planning (OSTP) for the remaining time interval $t\ge t_j$, $j=1,2,\ldots$, and by using the information $\mathcal{A}_{t_j}$ on the robot and its working environment available up to the time point $t_j>t_0$, $j=1,2,\ldots$, see [62, 141, 142].
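The replanning scheme just described can be summarized as a loop over the main correction times. The following sketch is purely structural: all callables (`info`, `ostp_replan`, `feedforward`, `feedback`, `apply_until`) are hypothetical placeholders for the OSTP solver, the inverse-dynamics feedforward, and the local feedback correction, not interfaces from the text:

```python
def aostpc_loop(t_corrections, info, ostp_replan, feedforward, feedback, apply_until):
    # Sketch of the AOSTPC scheme: at each main correction time t_j the reference
    # trajectory, feedforward control and local feedback correction are recomputed
    # from the information gathered up to t_j.
    plan = None
    for j, t_j in enumerate(t_corrections):
        A_tj = info(t_j)                    # information available up to t_j
        plan = ostp_replan(t_j, A_tj)       # OSTP on the remaining horizon [t_j, t_f]
        u_ff = feedforward(plan)            # open-loop control u^{(j)}(t), t >= t_j
        du = feedback(plan)                 # local correction Delta u^{(j)}(t)
        t_end = t_corrections[j + 1] if j + 1 < len(t_corrections) else None
        apply_until(lambda t, u_ff=u_ff, du=du: u_ff(t) + du(t), t_end)
    return plan
```

The total control applied on each interval is the sum $u^{(j)}(t)+\Delta u^{(j)}(t)$, mirroring (4.2).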

4.2 Optimal Trajectory Planning for Robots

According to [11, 126, 148], the dynamic equation for a robot is given by the following system of second order differential equations
$$
M\big(p_D,q(t)\big)\ddot q(t)+h\big(p_D,q(t),\dot q(t)\big)=u(t),\qquad t\ge t_0,\tag{4.4a}
$$
for the $n$-vector $q=q(t)$ of the robot or configuration coordinates $q_1,q_2,\ldots,q_n$. Here, $M=M(p_D,q)$ denotes the $n\times n$ inertia (or mass) matrix, and the vector function $h=h(p_D,q,\dot q)$ is given by
$$
h(p_D,q,\dot q):=C(p_D,q,\dot q)\dot q+F_R(p_D,q,\dot q)+G(p_D,q),\tag{4.4b}
$$
where $C(p_D,q,\dot q)=C(p_D,q)\dot q$, and $C(p_D,q)=\big(C_{ijk}(p_D,q)\big)_{1\le i,j,k\le n}$ is the tensor of Coriolis and centrifugal terms, $F_R=F_R(p_D,q,\dot q)$ denotes the vector of frictional forces and $G=G(p_D,q)$ is the vector of gravitational forces. Moreover, $u=u(t)$ is the vector of controls, i.e., the vector of torques/forces in the joints of the robot. Standard representations of the friction term $F_R$ are given [11, 68, 148] by
$$
F_R(p_D,q,\dot q):=R_v(p_D,q)\dot q,\tag{4.4c}
$$
$$
F_R(p_D,q,\dot q):=R(p_D,q)\,\mathrm{sgn}(\dot q),\tag{4.4d}
$$
where $\mathrm{sgn}(\dot q):=\big(\mathrm{sgn}(\dot q_1),\ldots,\mathrm{sgn}(\dot q_n)\big)^T$. In the first case (4.4c), $R_v=R_v(p_D,q)$ is the viscous friction matrix, and in the Coulomb approach (4.4d), $R=R(p_D,q)=\big(R_i(p,q)\delta_{ij}\big)$ is a diagonal matrix.

Remark 4.1 (Inverse Dynamics) Reading the dynamic equation (4.4a) from the left to the right hand side, hence, by inverse dynamics [4, 11, 61], the control function $u=u(t)$ may be described in terms of the trajectory $q=q(t)$ in configuration space.
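Remark 4.1 amounts to evaluating (4.4a, b) along a given trajectory. A minimal sketch (the model callables `M`, `C`, `F_R`, `G` are user-supplied placeholders for the robot model, not functions from the text):

```python
import numpy as np

def inverse_dynamics(M, C, F_R, G, p_D, q, qd, qdd):
    # Remark 4.1 as a formula: u(t) = M(p_D,q) qdd + C(p_D,q,qd) qd
    #                                 + F_R(p_D,q,qd) + G(p_D,q), cf. (4.4a, b).
    return M(p_D, q) @ qdd + C(p_D, q, qd) @ qd + F_R(p_D, q, qd) + G(p_D, q)
```

Given a planned trajectory $t\mapsto\big(q(t),\dot q(t),\ddot q(t)\big)$, this yields the feedforward control $u^{(0)}(t)$ pointwise.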
The relationship between the so-called configuration space fqg of robot coordi-
nates q D .q1 ; : : : ; qn /0 and the work space fxg of world coordinates (position and
orientation of the end-effector) x D .x1 ; : : : ; xn /0 is represented by the kinematic
equation

x D T .pK ; q/: (4.5)

As mentioned already in the introduction, pD ; pK , denote the vectors of dynamic,


kinematic parameters arising in the dynamic and kinematic equation (4.4a–d), (4.5).
Remark 4.2 (Linear Parameterization of Robots) Note that the parameterization of
a robot can be chosen, cf. [4, 11, 61], so that the dynamic and kinematic equation
depend linearly on the parameter vectors pD ; pK .
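As a concrete illustration of the inverse dynamics reading of (4.4a) in Remark 4.1, the following minimal sketch evaluates u(t) = M(p_D, q) q̈ + h(p_D, q, q̇) for a hypothetical single-link robot with gravity and viscous friction of the form (4.4c). The parameter values and the trajectory q(t) = sin t are invented for the example and are not taken from the text.

```python
import math

# Toy single-link robot: q is scalar.  Dynamic parameters p_D = (I, mgl, r_v):
# inertia I, gravity coefficient m*g*l, viscous friction coefficient r_v.
# All numbers below are illustrative assumptions, not values from the text.
p_D = {"I": 0.9, "mgl": 4.5, "r_v": 0.3}

def M(p, q):                 # inertia "matrix" (scalar in this toy case)
    return p["I"]

def h(p, q, qdot):           # friction (4.4c) + gravity term
    return p["r_v"] * qdot + p["mgl"] * math.sin(q)

def inverse_dynamics(p, q, qdot, qddot):
    # Reading (4.4a) from left to right: u(t) = M(p_D, q) q'' + h(p_D, q, q')
    return M(p, q) * qddot + h(p, q, qdot)

# Prescribed trajectory q(t) = sin t, so qdot = cos t and qddot = -sin t:
def u_of_t(t):
    return inverse_dynamics(p_D, math.sin(t), math.cos(t), -math.sin(t))
```

At t = 0 the arm passes the hanging position with unit velocity, so only the friction term contributes and u(0) = 0.3.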
The objective of optimal trajectory planning is to determine [19, 20, 68, 127, 149]
a control function u = u(t), t ≥ t_0, so that the cost functional

  J(u(·)) := ∫_{t_0}^{t_f} L( t, p_J, q(t), q̇(t), u(t) ) dt + G( t_f, p_J, q(t_f), q̇(t_f) )  (4.6)

is minimized, where the terminal time t_f may be given explicitly or implicitly,
as e.g. in minimum-time problems. Standard examples are, see e.g. [126]:
a) G = 0, L = 1 (minimum time), b) G = 0, L = sum of potential, translatory and
rotational energy of the robot (minimum energy), c) G = 0, L = Σ_{i=1}^n ( q̇_i(t) u_i(t) )²
(minimum fuel consumption), d) G = 0, L = Σ_{i=1}^n u_i(t)² (minimum force and
moment).

4.2 Optimal Trajectory Planning for Robots 123

Furthermore, an optimal control function u* = u*(t) and the related optimal
trajectory q* = q*(t), t ≥ t_0, in configuration space must satisfy the dynamic
equation (4.4a–d) and the following constraints [19, 20, 28]:
i) The initial conditions

  q(t_0) = q_0(ω),  q̇(t_0) = q̇_0(ω).  (4.7a)
Note that by means of the kinematic equation (4.5), the initial state
( q_0(ω), q̇_0(ω) ) in configuration space can be represented by the initial state
( x_0(ω), ẋ_0(ω) ) in work space.
ii) The terminal conditions

  ψ( t_f, p, q(t_f), q̇(t_f) ) = 0,  (4.7b)

e.g.

  q(t_f) = q_f(ω),  q̇(t_f) = q̇_f(ω).  (4.7c)

Again, by means of (4.5), ( q_f, q̇_f ) may be described in terms of the final state
( x_f, ẋ_f ) in work space.
Note that more general boundary conditions of this type may occur at some
intermediate time points t_0 < τ_1 < τ_2 < … < τ_r < t_f.
iii) Control constraints

  u_min(t, p) ≤ u(t) ≤ u_max(t, p),  t_0 ≤ t ≤ t_f  (4.8a)
  g_I( t, p, q(t), q̇(t), u(t) ) ≤ 0,  t_0 ≤ t ≤ t_f  (4.8b)
  g_II( t, p, q(t), q̇(t), u(t) ) = 0,  t_0 ≤ t ≤ t_f.  (4.8c)

iv) State constraints

  S_I( t, p, q(t), q̇(t) ) ≤ 0,  t_0 ≤ t ≤ t_f  (4.9a)
  S_II( t, p, q(t), q̇(t) ) = 0,  t_0 ≤ t ≤ t_f.  (4.9b)

Using the kinematic equation (4.5), different types of obstacles in the work
space can be described by (time-invariant) state constraints of the type (4.9a, b).
In robotics [127] often the following state constraints are used:

  q_min(p_C) ≤ q(t) ≤ q_max(p_C),  t_0 ≤ t ≤ t_f  (4.9c)
  q̇_min(p_C) ≤ q̇(t) ≤ q̇_max(p_C),  t_0 ≤ t ≤ t_f,  (4.9d)

with certain vectors q_min, q_max, q̇_min, q̇_max of (random) bounds.
A special constraint of the type (4.9b) occurs if the trajectory in work space

  x(t) := T( p_K, q(t) )  (4.10)

should follow as precisely as possible a geometric path in work space

  x_e = x_e(p_x, s),  s_0 ≤ s ≤ s_f,  (4.11)

being known up to a certain random parameter vector p_x = p_x(ω), which is then
added to the total vector p of model parameters, cf. (4.4a, b).

Remark 4.3 In the following we suppose that the functions M, h, L, G and T arising
in (4.4a–d), (4.5), (4.6) as well as the functions ψ, g_I, g_II, S_I, S_II arising in the
constraints (4.7b)–(4.9b) are sufficiently smooth.

4.3 Problem Transformation

Since the terminal time t_f may be given explicitly or implicitly, the trajectory q(·)
in configuration space may have a varying domain [t_0, t_f]. Hence, in order to work
with a given fixed domain of the unknown functions, the reference trajectory q =
q(t), t ≥ t_0, in configuration space is represented, cf. [68], by

  q(t) := q_e( s(t) ),  t ≥ t_0.  (4.12a)

Here,

  s = s(t),  t_0 ≤ t ≤ t_f,  (4.12b)

is a strictly monotonically increasing transformation from the possibly varying time
domain [t_0, t_f] onto a given fixed parameter interval [s_0, s_f]. For example, s ∈
[s_0, s_f] may be the path parameter of a given path in work space, cf. (4.11).
Moreover,

  q_e = q_e(s),  s_0 ≤ s ≤ s_f,  (4.12c)

denotes the so-called geometric path in configuration space.

Remark 4.4 In many more complicated industrial robot tasks such as grinding,
welding, driving around difficult obstacles, complex assembly, etc., the geometric
path q_e(·) in configuration space is predetermined off-line [21, 62, 64] by a separate
path planning procedure for q_e = q_e(s), s_0 ≤ s ≤ s_f, only. Hence, the
trajectory planning/replanning is then reduced to the computation/adaptation of the
transformation s = s(t) along a given fixed path q_e(·) = q_e^(0)(·).

Assuming that the transformation s = s(t) is differentiable on [t_0, t_f] with
the exception of at most a finite number of points, we introduce now the so-
called velocity profile β = β(s), s_0 ≤ s ≤ s_f, along the geometric path q_e(·) in
configuration space by

  β(s) := ṡ( t(s) )² = ( (ds/dt)( t(s) ) )²,  (4.13)

where t = t(s), s_0 ≤ s ≤ s_f, is the inverse of s = s(t), t_0 ≤ t ≤ t_f. Thus, we have
that

  dt = (1/√β(s)) ds,  (4.14a)

and the time t ≥ t_0 can be represented by the integral

  t = t(s) := t_0 + ∫_{s_0}^{s} dσ / √β(σ).  (4.14b)

By using the integral transformation σ := s_0 + (s - s_0)τ, 0 ≤ τ ≤ 1, t = t(s)
may be represented also by

  t(s) = t_0 + (s - s_0) ∫_0^1 dτ / √( β( s_0 + (s - s_0)τ ) ),  s ≥ s_0.  (4.15a)

By numerical quadrature, i.e., by applying a certain numerical integration formula
of order ν with weights a_0, a_1, a_2, …, a_ν to the integral in (4.15a), the time
function t = t(s) can be represented approximately (with an ε_0 > 0) by

  t̃(s) := t_0 + (s - s_0) Σ_{k=0}^{ν} a_k / √( β( s_0 + ε_0 + (s - s_0 - 2ε_0) k/ν ) ),  s ≥ s_0.  (4.15b)

In the case of Simpson's rule (ν = 2) we have that

  t̃(s) := t_0 + ((s - s_0)/6) ( 1/√β(s_0 + ε_0) + 4/√β( (s + s_0)/2 ) + 1/√β(s - ε_0) ).  (4.15c)
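The Simpson approximation (4.15c) of the travel time can be sketched in a few lines. The velocity profile below is an invented constant profile used only to check the formula against the exact travel time; ε_0 keeps the integrand away from path points where β vanishes (e.g. β(s_0) = 0 at rest).

```python
import math

def t_tilde(s, beta, t0=0.0, s0=0.0, eps0=1e-3):
    """Approximate t(s) = t0 + int_{s0}^{s} dsigma / sqrt(beta(sigma))
    by Simpson's rule as in (4.15c)."""
    return t0 + (s - s0) / 6.0 * (
        1.0 / math.sqrt(beta(s0 + eps0))
        + 4.0 / math.sqrt(beta((s + s0) / 2.0))
        + 1.0 / math.sqrt(beta(s - eps0)))

# A constant velocity profile beta(s) = v^2 gives t(s) = (s - s0)/v exactly,
# so Simpson's rule must reproduce it: here v = 2.
beta_const = lambda s: 4.0
print(t_tilde(1.0, beta_const))     # exact value: 0.5
```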

As long as the basic mechanical equations, the cost and constraint functions do not
depend explicitly on time t, the transformation of the robot control problem from
the time onto the s-parameter domain causes no difficulties. In the more general case
one has to use the time representation (4.14b), (4.15a) or its approximations (4.15b, c).
Obviously, the terminal time t_f is given, cf. (4.14b), (4.15a), by

  t_f = t(s_f) = t_0 + ∫_{s_0}^{s_f} dσ / √β(σ)  (4.16)
      = t_0 + (s_f - s_0) ∫_0^1 dτ / √( β( s_0 + (s_f - s_0)τ ) ).

4.3.1 Transformation of the Dynamic Equation

Because of (4.12a, b), we find

  q̇(t) = q_e′(s) ṡ,   where ṡ := ds/dt, q_e′(s) := dq_e/ds,  (4.17a)
  q̈(t) = q_e′(s) s̈ + q_e″(s) ṡ².  (4.17b)

Moreover, according to (4.13) we have that

  ṡ² = β(s),  ṡ = √β(s),  (4.17c)

and the differentiation of (4.17c) with respect to time t yields

  s̈ = (1/2) β′(s).  (4.17d)

Hence, (4.17a–d) yields the following representation

  q̇(t) = q_e′(s) √β(s),  (4.18a)
  q̈(t) = q_e′(s) (1/2) β′(s) + q_e″(s) β(s)  (4.18b)

of q̇(t), q̈(t) in terms of the new unknown functions q_e(·), β(·).
Inserting now (4.18a, b) into the dynamic equation (4.4a), we find the equivalent
relation

  u_e( p_D, s; q_e(·), β(·) ) = u(t)  with s = s(t), t = t(s),  (4.19a)

where the function u_e is defined by

  u_e( p_D, s; q_e(·), β(·) ) := M( p_D, q_e(s) ) ( (1/2) q_e′(s) β′(s) + q_e″(s) β(s) )
      + h( p_D, q_e(s), q_e′(s) √β(s) ).  (4.19b)
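The representation (4.18a, b) can be checked numerically against the chain rule. In the sketch below the path q_e(s) = sin s and the profile β(s) = 4s are illustrative choices; β(s) = 4s corresponds to the time parameterization s(t) = t², for which q(t) = sin(t²) has closed-form derivatives.

```python
import math

# Illustrative geometric path and velocity profile (not from the text):
q_e   = lambda s: math.sin(s)
q_e1  = lambda s: math.cos(s)       # q_e'(s)
q_e2  = lambda s: -math.sin(s)      # q_e''(s)
beta  = lambda s: 4.0 * s           # corresponds to s(t) = t^2
beta1 = lambda s: 4.0               # beta'(s)

def qdot(s):    # (4.18a): qdot(t) = q_e'(s) sqrt(beta(s))
    return q_e1(s) * math.sqrt(beta(s))

def qddot(s):   # (4.18b): qddot(t) = q_e'(s) beta'(s)/2 + q_e''(s) beta(s)
    return q_e1(s) * 0.5 * beta1(s) + q_e2(s) * beta(s)

# Chain-rule check for s(t) = t^2, where q(t) = sin(t^2):
t = 0.7; s = t * t
assert abs(qdot(s)  - 2 * t * math.cos(t * t)) < 1e-12
assert abs(qddot(s) - (2 * math.cos(t*t) - 4 * t*t * math.sin(t*t))) < 1e-12
```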

The initial and terminal conditions (4.7a–c) are transformed, see (4.12a, b)
and (4.18a), as follows:

  q_e(s_0) = q_0(ω),  q_e′(s_0) √β(s_0) = q̇_0(ω)  (4.20a)
  ψ( t(s_f), p, q_e(s_f), q_e′(s_f) √β(s_f) ) = 0  (4.20b)

or

  q_e(s_f) = q_f(ω),  q_e′(s_f) √β(s_f) = q̇_f(ω).  (4.20c)

Remark 4.5 In most cases the robot is at rest at time t = t_0 and t = t_f, i.e.
q̇(t_0) = q̇(t_f) = 0, hence,

  β(s_0) = β(s_f) = 0.  (4.20d)

4.3.2 Transformation of the Control Constraints

Using (4.12a, b), the control constraints (4.8a–c) read in s-form as follows:

  u_min( t(s), p_C ) ≤ u_e( p_D, s; q_e(·), β(·) ) ≤ u_max( t(s), p_C ),  s_0 ≤ s ≤ s_f  (4.21a)
  g_I( t(s), p_C, q_e(s), q_e′(s)√β(s), u_e( p_D, s; q_e(·), β(·) ) ) ≤ 0,  s_0 ≤ s ≤ s_f  (4.21b)
  g_II( t(s), p_C, q_e(s), q_e′(s)√β(s), u_e( p_D, s; q_e(·), β(·) ) ) = 0,  s_0 ≤ s ≤ s_f,  (4.21c)

where t = t(s) = t( s; β(·) ) or its approximation t = t̃(s) = t̃( s; β(·) ) is defined
by (4.14b), (4.15a–c).

Remark 4.6 I) In the important case

  u_min(t, p_C) := u_min( p_C, q(t), q̇(t) ),  u_max(t, p_C) := u_max( p_C, q(t), q̇(t) ),  (4.22a)

where the bounds for u = u(t) depend on the system state ( q(t), q̇(t) ) in
configuration space, condition (4.21a) reduces to

  u_min( p_C, q_e(s), q_e′(s)√β(s) ) ≤ u_e( p_D, s; q_e(·), β(·) )  (4.22b)
      ≤ u_max( p_C, q_e(s), q_e′(s)√β(s) ),  s_0 ≤ s ≤ s_f.

II) If the bounds for u(t) in (4.22a) do not depend on the velocity q̇(t) in
configuration space, and the geometric path q_e(s) = q̄_e(s), s_0 ≤ s ≤ s_f, in
configuration space is known in advance, then the bounds

  u_min( p_C, q̄_e(s) ) = ũ_min(p_C, s)  (4.22c)
  u_max( p_C, q̄_e(s) ) = ũ_max(p_C, s),  s_0 ≤ s ≤ s_f,

depend on (p_C, s) only.

Bounds of the type (4.22c) for the control function u(t) may be taken into
account as an approximation of the more general bounds in (4.21a).

4.3.3 Transformation of the State Constraints

Applying the transformations (4.12a, b), (4.17a) and (4.14b) to the state con-
straints (4.9a, b), we find the following s-form of the state constraints:

  S_I( t(s), p_C, q_e(s), q_e′(s)√β(s) ) ≤ 0,  s_0 ≤ s ≤ s_f  (4.23a)
  S_II( t(s), p_C, q_e(s), q_e′(s)√β(s) ) = 0,  s_0 ≤ s ≤ s_f.  (4.23b)

Obviously, the s-form of the special state constraints (4.9c, d) reads

  q_min(p_C) ≤ q_e(s) ≤ q_max(p_C),  s_0 ≤ s ≤ s_f,  (4.23c)
  q̇_min(p_C) ≤ q_e′(s)√β(s) ≤ q̇_max(p_C),  s_0 ≤ s ≤ s_f.  (4.23d)

In the case that the end-effector of the robot has to follow a given path (4.11) in
work space, Eq. (4.23b) reads

  T( p_K, q_e(s) ) - x_e(p_x, s) = 0,  s_0 ≤ s ≤ s_f,  (4.23e)

with the parameter vector p_x describing possible uncertainties in the selection of the
path to be followed by the robot in work space.
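For symmetric joint velocity bounds q̇_min = -q̇_max, the pointwise constraint (4.23d) is equivalent to an upper bound on the velocity profile, β(s) ≤ min_i ( q̇_max,i / |q_e,i′(s)| )², a standard observation in path-parameterized planning. The following minimal sketch computes this bound at one path point; the bound values are invented for the example.

```python
# Symmetric joint velocity bounds qdot_max (illustrative values).  With
# qdot_min = -qdot_max, (4.23d) becomes the pointwise upper bound
#   beta(s) <= min_i ( qdot_max_i / |q_e_i'(s)| )^2 .
qdot_max = [1.0, 2.0]

def beta_upper(qe_prime):
    """Largest beta(s) compatible with (4.23d) at a path point whose
    tangent vector is qe_prime = q_e'(s)."""
    return min((vmax / abs(d)) ** 2
               for vmax, d in zip(qdot_max, qe_prime) if d != 0.0)

print(beta_upper([0.5, 0.5]))   # joint 1 is binding: (1.0/0.5)^2 = 4.0
```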

4.3.4 Transformation of the Objective Function

Applying the integral transformation t = t(s), dt = ds/√β(s), to the integral in the
representation (4.6) of the objective function J = J(u(·)), and transforming also
the terminal costs, we find the following s-form of the objective function:

  J(u(·)) = ∫_{s_0}^{s_f} L( t(s), p_J, q_e(s), q_e′(s)√β(s), u_e( p_D, s; q_e(·), β(·) ) ) ds/√β(s)
      + G( t(s_f), p_J, q_e(s_f), q_e′(s_f)√β(s_f) ).  (4.24a)

Note that β(s_f) = 0 holds in many practical situations.
For the class of minimum-time problems we have that

  J(u(·)) := t_f - t_0 = ∫_{t_0}^{t_f} dt = ∫_{s_0}^{s_f} ds/√β(s).  (4.24b)

Optimal Deterministic Trajectory Planning (OSTP) By means of the t ↔ s
transformation onto the fixed s-parameter domain [s_0, s_f], the optimal control
problem (4.4a–d), (4.6)–(4.11) is transformed into a variational problem for finding,
see (4.12a–c) and (4.13), an optimal velocity profile β(s) and an optimal geometric
path q_e(s), s_0 ≤ s ≤ s_f. In the deterministic case, i.e. if the parameter vector p
is assumed to be known, several efficient solution techniques are available for the
numerical solution of the resulting optimal deterministic trajectory planning
problem, cf. [19, 20, 28, 68, 93, 111, 149].

4.4 OSTP: Optimal Stochastic Trajectory Planning

In the following we suppose that the initial and terminal conditions (4.20d) hold, i.e.

  β_0 = β(s_0) = β_f = β(s_f) = 0  or  q̇(t_0) = q̇(t_f) = 0.

Based on the (t ↔ s)-transformation described in Sect. 4.3, and relying
on the inverse dynamics approach, the robot control problem (4.6), (4.7a–c),
(4.8a–c), (4.9a–c) can now be represented by a variational problem for
( q_e(·), β(·) ), β(·), resp., given in the following. Having ( q_e(·), β(·) ), β(·), resp., a
reference trajectory and a feedforward control can then be constructed.

A) Time-invariant case (autonomous systems)

If the objective function and the constraint functions do not depend explicitly on
time t, then the optimal control problem takes the following equivalent s-form:

  min ∫_{s_0}^{s_f} L^J( p_J, q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) ds + G^J( p_J, q_e(s_f) )  (4.25a)

s.t.

  f_I^u( p, q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) ≤ 0,  s_0 ≤ s ≤ s_f  (4.25b)
  f_II^u( p, q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) = 0,  s_0 ≤ s ≤ s_f  (4.25c)
  f_I^S( p, q_e(s), q_e′(s), β(s) ) ≤ 0,  s_0 ≤ s ≤ s_f  (4.25d)
  f_II^S( p, q_e(s), q_e′(s), β(s) ) = 0,  s_0 ≤ s ≤ s_f  (4.25e)
  β(s) ≥ 0,  s_0 ≤ s ≤ s_f  (4.25f)
  q_e(s_0) = q_0(ω),  q_e′(s_0)√β(s_0) = q̇_0(ω)  (4.25g)
  q_e(s_f) = q_f(ω),  β(s_f) = β_f.  (4.25h)

Under condition (4.20d), a more general version of the terminal condi-
tion (4.25h) reads, cf. (4.20b),

  ψ( p, q_e(s_f) ) = 0,  β(s_f) = β_f := 0.  (4.25h′)

Here,

  L^J = L^J( p_J, q_e, q_e′, q_e″, β, β′ ),  G^J = G^J( p_J, q_e )  (4.26a)
  f_I^u = f_I^u( p, q_e, q_e′, q_e″, β, β′ ),  f_II^u = f_II^u( p, q_e, q_e′, q_e″, β, β′ )  (4.26b)
  f_I^S = f_I^S( p, q_e, q_e′, β ),  f_II^S = f_II^S( p, q_e, q_e′, β )  (4.26c)

are the functions representing the s-form of the objective function (4.24a),
the constraint functions in the control constraints (4.21a–c), and in the state
constraints (4.23a–e), respectively. Define then the stacked vectors f^u and f^S by

  f^u := ( f_I^u ; f_II^u ),  f^S := ( f_I^S ; f_II^S ).  (4.26d)

B) Time-varying case (non-autonomous systems)

If the time t occurs explicitly in the objective and/or in some of the constraints
of the robot control problem, then, using (4.14a, b), (4.15a–c), we have that
t = t( s; t_0, s_0, β(·) ), and the functions (4.26a–d) and ψ may then depend also
on ( s; t_0, s_0, β(·) ), ( s_f; t_0, s_0, β(·) ), resp., hence,

  L^J = L^J( s, t_0, s_0, β(·); p_J, q_e, q_e′, q_e″, β, β′ )
  G^J = G^J( s_f, t_0, s_0, β(·); p_J, q_e )
  f^u = f^u( s, t_0, s_0, β(·); p, q_e, q_e′, q_e″, β, β′ )
  f^S = f^S( s, t_0, s_0, β(·); p, q_e, q_e′, β )
  ψ = ψ( s_f, t_0, s_0, β(·); p, q_e ).

In order to get a reliable optimal geometric path q_e* = q_e*(s) in configuration
space and a reliable optimal velocity profile β* = β*(s), s_0 ≤ s ≤ s_f,
being robust with respect to random parameter variations of p = p(ω), the
variational problem (4.25a–h) under stochastic uncertainty must be replaced by
an appropriate deterministic substitute problem which is defined according to the
following principles [71, 76, 84, 90, 111], cf. also [70, 71, 90, 107, 108].
Assume first that the a priori information about the robot and its environment up
to time t_0 is described by means of a σ-algebra A_{t_0}, and let then

  P^(0)_{p(·)} = P_{p(·)}( · | A_{t_0} )  (4.27)

denote the a priori distribution of the random vector p = p(ω) given A_{t_0}.
Depending on the decision theoretical point of view, different approaches are
possible, e.g. reliability-based substitute problems, belonging essentially to one of
the following two basic classes of substitute problems:
I) Risk(recourse)-constrained minimum expected cost problems
II) Expected total cost-minimum problems.
Substitute problems are constructed by selecting certain scalar or vectorial loss or
cost functions

  γ_I^u, γ_II^u, γ_I^S, γ_II^S, γ_ψ, …  (4.28a)

evaluating the violation of the random constraints (4.25b, c), (4.25d, e), (4.25h′),
respectively.

In the following all expectations are conditional expectations with respect to
the a priori distribution P^(0)_{p(·)} of the random parameter vector p(ω). Moreover, the
following compositions are introduced:

  f_γ^u := ( γ_I^u ∘ f_I^u ; γ_II^u ∘ f_II^u ),  f_γ^S := ( γ_I^S ∘ f_I^S ; γ_II^S ∘ f_II^S )  (4.28b)
  ψ_γ := γ_ψ ∘ ψ.  (4.28c)

Now the two basic types of substitute problems are described.


I) Risk(recourse)-based minimum expected cost  problems 
Minimizing the expected (primal) costs E J.u.//jAt0 , and demanding that
the risk, i.e. the expected (recourse) costs arising from the violation of the
constraints of the variational problem (4.25a–h) do not exceed given upper
bounds, in the time-invariant case we find the following substitute problem:

Zsf    
min E LJ pJ ; qe .s/; qe0 .s/; qe00 .s/; ˇ.s/; ˇ 0 .s/ jAt0 ds (4.29a)
s0
   
CE  J pJ ; qe .sf / jAt0

s.t.
   
E fu p; qe .s/; qe0 .s/; qe00 .s/; ˇ.s/; ˇ 0 .s/ jAt0  u ;

s0  s  sf (4.29b)
   
E fS p; qe .s/; qe0 .s/; ˇ.s/ jAt0  S ; s0  s  sf (4.29c)

ˇ.s/  0; s0  s  sf (4.29d)
p
qe .s0 / D q 0 ; qe0 .s0 / ˇ.s0 / D qP 0 (4.29e)
qe .sf / D q f .if  J D 0/; ˇ.sf / D ˇf ; (4.29f)

and the more general terminal condition (4.25h’) is replaced by


   
ˇ.sf / D ˇf WD 0; E  p; qe .sf / jAt0  : (4.29f’)

Here,

u D u .s/; S D S .s/; D .s/ (4.29g)



denote scalar or vectorial upper risk bounds which may depend on the path
parameter s ∈ [s_0, s_f]. Furthermore, the initial and terminal values q̄_0, q̄̇_0, q̄_f
in (4.29e, f) are determined according to one of the following relations:
a)

  q̄_0 := q̂(t_0),  q̄̇_0 := q̂̇(t_0),  q̄_f := q̂(t_f),  (4.29h)

where ( q̂(t), q̂̇(t) ) denotes an estimate, observation, etc., of the state in
configuration space at time t;
b)

  q̄_0 := E( q_0(ω) | A_{t_0} ),  q̄̇_0 := E( q̇_0(ω) | A_{t_0} ),
  q̄_f = q_f^(0) := E( q_f(ω) | A_{t_0} ),  (4.29i)

where ( q_0(ω), q̇_0(ω) ) is a random initial state, and q_f(ω) is a random
terminal position.
Having corresponding information about the initial and terminal values x_0, ẋ_0, x_f
in work space, related equations for q_0, q̇_0, q_f may be obtained by means of the
kinematic equation (4.5).
Remark 4.7 (Average Constraints)
Taking the average of the pointwise constraints (4.29b, c) with respect to the
path parameter s, s_0 ≤ s ≤ s_f, we get the simplified integrated constraints

  ∫_{s_0}^{s_f} E( f_γ^u( p, q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) | A_{t_0} ) ds ≤ Γ̃^u  (4.29b′)
  ∫_{s_0}^{s_f} E( f_γ^S( p, q_e(s), q_e′(s), β(s) ) | A_{t_0} ) ds ≤ Γ̃^S.  (4.29c′)

Remark 4.8 (Generalized Area of Admissible Motion)
In generalization of the admissible area of motion [68, 93, 125] for path
planning problems with a prescribed geometric path q_e(·) = q̄_e(·) in
configuration space, for point-to-point problems the constraints (4.29c–i) define
for each path point s, s_0 ≤ s ≤ s_f, a generalized admissible area of motion for
the vector

  ζ(s) := ( q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ),  s_0 ≤ s ≤ s_f,  (4.29j)

including information about the magnitude ( β(s), β′(s) ) of the motion as well
as information about the direction ( q_e(s), q_e′(s), q_e″(s) ) of the motion.

Remark 4.9 (Problems with Chance Constraints)
Substitute problems having chance constraints are obtained if the loss
functions γ^u, γ^S for evaluating the violation of the inequality constraints
in (4.25a–h, h′) are 0–1 functions, cf. [93].
To give a characteristic example, we demand that the control and state con-
straints (4.21a), (4.23c), (4.23d), resp., have to be fulfilled at least with
probability α_u, α_q, α_q̇, hence,

  P( u_min(p_C) ≤ u_e( p_D, s; q_e(·), β(·) ) ≤ u_max(p_C) | A_{t_0} ) ≥ α_u,  s_0 ≤ s ≤ s_f,  (4.30a)
  P( q_min(p_C) ≤ q_e(s) ≤ q_max(p_C) | A_{t_0} ) ≥ α_q,  s_0 ≤ s ≤ s_f,  (4.30b)
  P( q̇_min(p_C) ≤ q_e′(s)√β(s) ≤ q̇_max(p_C) | A_{t_0} ) ≥ α_q̇,  s_0 ≤ s ≤ s_f.  (4.30c)

Sufficient conditions for the chance constraints (4.30a–c) can be obtained by
applying certain probability inequalities, see [93]. Defining

  u^c(p_C) := ( u_max(p_C) + u_min(p_C) )/2,  ρ_u(p_C) := ( u_max(p_C) - u_min(p_C) )/2,  (4.30d)

a sufficient condition for (4.30a) reads, cf. [93],

  E( tr B ρ_u(p_C)_d^{-1} ( u_e - u^c(p_C) )( u_e - u^c(p_C) )^T ρ_u(p_C)_d^{-1} | A_{t_0} )
      ≤ 1 - α_u,  s_0 ≤ s ≤ s_f,  (4.30e)

where u_e = u_e( p_D, s; q_e(·), β(·) ) and ρ_u(p_C)_d denotes the diagonal matrix
containing the elements of ρ_u(p_C) on its diagonal. Moreover, B denotes a
positive definite matrix such that z^T B z ≥ 1 for all vectors z with ‖z‖_∞ ≥ 1.
Taking e.g. B = I, (4.30e) reads

  E( ‖ ρ_u(p_C)_d^{-1} ( u_e( p_D, s; q_e(·), β(·) ) - u^c(p_C) ) ‖² | A_{t_0} ) ≤ 1 - α_u,
      s_0 ≤ s ≤ s_f.  (4.30f)

Obviously, similar sufficient conditions may be derived for (4.30b, c).
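The left side of the sufficient condition (4.30f) is easy to estimate by Monte Carlo sampling from the a priori distribution. The sketch below is a scalar illustration with invented numbers: fixed control bounds [-1, 1] (so u^c = 0 and half-width 1) and a control u_e = p·g that is linear in one random parameter p.

```python
import random

random.seed(0)

# Scalar illustration of (4.30f) with B = I; all numbers are assumptions:
u_c, half_width, g, alpha_u = 0.0, 1.0, 0.6, 0.9

samples = [random.gauss(1.0, 0.2) for _ in range(100_000)]   # p(omega)
lhs = sum(((p * g - u_c) / half_width) ** 2 for p in samples) / len(samples)

# Exactly, E[(p g)^2] = g^2 (mu^2 + sigma^2) = 0.36 * 1.04 = 0.3744, which
# exceeds 1 - alpha_u = 0.1: the sufficient condition fails here, so (4.30a)
# is NOT guaranteed (it may still hold, the condition being only sufficient).
print(lhs, lhs <= 1.0 - alpha_u)
```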



We observe that the above class of risk-based minimum expected cost prob-
lems for the computation of ( q_e(·), β(·) ), β(·), resp., is represented completely
by the following set of

  initial parameters ℐ_0 : t_0, s_0, q̄_0, q̄̇_0, P^(0)_{p(·)} or m_0  (4.31a)

and

  terminal parameters ℐ_f : t_f, s_f, β_f, q̄_f.  (4.31b)

In case of problems with a given geometric path q_e = q̄_e(s) in configura-
tion space, the values q̄_0, q̄_f may be deleted. Moreover, approximating the
expectations in (4.29a–f, f′) by means of Taylor expansions with respect to the
parameter vector p at the conditional mean

  p̄^(0) := E( p(ω) | A_{t_0} ),  (4.31c)

the a priori distribution P^(0)_{p(·)} may be replaced by a certain vector

  m_0 := ( E( ∏_{k=1}^{r} p_{l_k}(ω) | A_{t_0} ) )_{(l_1,…,l_r) ∈ Λ}  (4.31d)

of a priori moments of p(ω) with respect to A_{t_0}.
Here, Λ denotes a certain finite set of multiple indices (l_1, …, l_r), r ≥ 1.
Of course, in the time-variant case the functions L^J, G^J, f^u, f^S, ψ as described
in item B) have to be used. Thus, t_0, t_f then occur explicitly in the parameter
list (4.31a, b).
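The vector of a priori moments in (4.31d) can be approximated from a sample of the a priori distribution. The sketch below does this for a hypothetical two-dimensional parameter vector and an illustrative index set; distribution parameters and the index set are assumptions, not values from the text.

```python
import random

random.seed(1)

# Monte Carlo approximation of the a priori moment vector (4.31d) for a
# two-dimensional parameter p = (p1, p2) and the (illustrative) index set
# Lambda = {(1,), (2,), (1,1), (1,2)}.
N = 200_000
samples = [(random.uniform(0.8, 1.2), random.gauss(2.0, 0.1)) for _ in range(N)]

def moment(indices):
    """Estimate E( prod_k p_{l_k}(omega) | A_t0 ) from the a priori sample."""
    total = 0.0
    for p in samples:
        prod = 1.0
        for l in indices:
            prod *= p[l - 1]
        total += prod
    return total / N

m0 = {idx: moment(idx) for idx in [(1,), (2,), (1, 1), (1, 2)]}
# Exact values: E p1 = 1, E p2 = 2, E p1^2 = 1 + 0.4^2/12, E p1 p2 = 2.
```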
II) Expected total cost-minimum problems
Here, the total costs arising from violations of the constraints in the variational
problem (4.29a–f, f′) are added to the (primary) costs arising along the
trajectory, to the terminal costs, respectively. Of course, corresponding weight
factors may be included in the cost functions (4.28a). Taking expectations with
respect to A_{t_0}, in the time-invariant case the following substitute problem is
obtained:

  min ∫_{s_0}^{s_f} E( L̃^J( p, q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) | A_{t_0} ) ds
      + E( G̃^J( p, q_e(s_f) ) | A_{t_0} )  (4.32a)

s.t.

  β(s) ≥ 0,  s_0 ≤ s ≤ s_f  (4.32b)
  q_e(s_0) = q̄_0,  q_e′(s_0)√β(s_0) = q̄̇_0  (4.32c)
  q_e(s_f) = q̄_f  (if G̃^J = 0),  β(s_f) = β_f,  (4.32d)

where L̃^J, G̃^J are defined by

  L̃^J := L^J + c_u^T f_γ^u + c_S^T f_γ^S  (4.32e)
  G̃^J := G^J  or  G̃^J := G^J + c_ψ^T ψ_γ,  (4.32f)

and c_I^u, c_II^u, c_I^S, c_II^S, c_ψ ≥ 0 are certain nonnegative (vectorial) scale factors
which may depend on the path parameter s.
We observe that also in this case the initial/terminal parameters characterizing
the second class of substitute problems (4.32a–f) are given again by (4.31a, b).
In the time-varying case the present substitute problem of class II reads:

  min ∫_{s_0}^{s_f} E( L̃^J( s, t_0, s_0, β(·); p, q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) | A_{t_0} ) ds
      + E( G̃^J( s_f, t_0, s_0, β(·); p, q_e(s_f) ) | A_{t_0} )  (4.33a)

s.t.

  β(s) ≥ 0,  s_0 ≤ s ≤ s_f  (4.33b)
  q_e(s_0) = q̄_0,  q_e′(s_0)√β(s_0) = q̄̇_0  (4.33c)
  q_e(s_f) = q̄_f  (if G̃^J = 0),  β(s_f) = β_f.  (4.33d)

Remark 4.10 (Mixtures of (I), (II))
Several mixtures of the classes (I) and (II) of substitute problems are possible.

4.4.1 Computational Aspects

The following techniques are available for solving substitute problems of type (I),
(II):
a) Reduction to a finite dimensional parameter optimization problem
Here, the unknown functions ( q_e(·), β(·) ) or β(·) are approximated by linear
combinations

  q_e(s) := Σ_{l=1}^{l_q} q̂_l B_l^q(s),  s_0 ≤ s ≤ s_f  (4.34a)
  β(s) := Σ_{l=1}^{l_β} β̂_l B_l^β(s),  s_0 ≤ s ≤ s_f,  (4.34b)

where B_l^q = B_l^q(s), B_l^β = B_l^β(s), s_0 ≤ s ≤ s_f, l = 1, …, l_q (l_β), are given
basis functions, e.g. B-splines, and q̂_l, β̂_l, l = 1, …, l_q (l_β), are vectorial, scalar
coefficients, respectively. Putting (4.34a, b) into (4.29a–f, f′), (4.32a–f), resp., a
semi-infinite optimization problem is obtained. If the inequalities involving
explicitly the path parameter s, s_0 ≤ s ≤ s_f, are required to hold for a finite
number N of parameter values s_1, s_2, …, s_N only, then this problem is finally
reduced to a finite dimensional parameter optimization problem which can be
solved numerically by standard mathematical programming routines or search
techniques. Of course, a major problem is the approximate computation of the
conditional expectations, which is done essentially by means of Taylor expansion
with respect to the parameter vector p at p̄^(0). Consequently, several conditional
moments have to be determined (on-line, for stage j ≥ 1). For details, see
[109–111, 130] and the program package “OSTP” [8].
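The parameterization (4.34b) and the discretization of the semi-infinite constraints can be sketched with the simplest B-splines, piecewise linear "hat" functions; knot count, coefficients and grid size below are illustrative assumptions.

```python
# Parameterization (4.34b) of the velocity profile with piecewise linear
# "hat" basis functions (degree-1 B-splines) on a uniform knot grid.
s0, sf, L = 0.0, 1.0, 6                     # illustrative knot count
knots = [s0 + l * (sf - s0) / (L - 1) for l in range(L)]

def hat(l, s):
    """Hat basis function B_l(s) centered at knots[l]."""
    hgrid = (sf - s0) / (L - 1)
    d = abs(s - knots[l]) / hgrid
    return max(0.0, 1.0 - d)

def beta(coeffs, s):                        # beta(s) = sum_l beta_l B_l(s)
    return sum(c * hat(l, s) for l, c in enumerate(coeffs))

# Semi-infinite constraints such as (4.29d) are then enforced only on a
# finite grid s_1, ..., s_N, leaving finitely many inequalities:
coeffs = [0.0, 1.0, 2.5, 2.5, 1.0, 0.0]    # beta(s0) = beta(sf) = 0, cf. (4.20d)
grid = [s0 + k * (sf - s0) / 50 for k in range(51)]
assert all(beta(coeffs, s) >= 0.0 for s in grid)
```

With hat functions, the coefficients β̂_l are the values of β at the knots, which makes the nonnegativity constraint particularly transparent.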
b) Variational techniques
Using methods from calculus of variations, necessary and, in some cases, also
sufficient conditions in terms of certain differential equations may be derived for
the optimal solutions ( q_e^(0), β^(0) ), β^(0), resp., of the variational problems (4.29a–
f, f′), (4.32a–f). For more details, see [130].
c) Linearization methods
Here, we assume that we already have an approximate optimal solution
( q̄_e(s), β̄(s) ), s_0 ≤ s ≤ s_f, of the substitute problem (4.29a–f, f′) or
(4.32a–f) under consideration. For example, an approximate optimal solution
( q̄_e(·), β̄(·) ) can be obtained by starting from the deterministic substitute
problem obtained by replacing the random parameter vector p(ω) just by its
conditional mean p̄^(0) := E( p(ω) | A_{t_0} ).

 
Given an approximate optimal solution ( q̄_e(·), β̄(·) ) of substitute problem
(I) or (II), the unknown optimal solution ( q_e^(0)(·), β^(0)(·) ) to be determined is
represented by

  q_e^(0)(s) := q̄_e(s) + Δq_e(s),  s_0 ≤ s ≤ s_f  (4.35a)
  β^(0)(s) := β̄(s) + Δβ(s),  s_0 ≤ s ≤ s_f,  (4.35b)

where ( Δq_e(s), Δβ(s) ), s_0 ≤ s ≤ s_f, are certain (small) correction terms. In
the following we assume that the changes Δq_e(s), Δβ(s) and their first and
second order derivatives Δq_e′(s), Δq_e″(s), Δβ′(s) are small.
We observe that the functions arising in the constraints and in the objective
of (4.29a–f, f′), (4.32a–f), resp., are of the following type:

  g^(0)( q_e(s), q_e′(s), q_e″(s), β(s), β′(s) )
      := E( g( p(ω), q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) | A_{t_0} ),  (4.36a)
  Γ^(0)( q_e(s_f) ) := E( Γ( p(ω), q_e(s_f) ) | A_{t_0} )  (4.36b)

and

  F^(0)( q_e(·), β(·) ) := ∫_{s_0}^{s_f} g^(0)( q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) ds,  (4.36c)

with certain functions g, Γ. Moreover, if for simplification the pointwise (cost-)
constraints (4.29b, c) are averaged with respect to the path parameter s, s_0 ≤ s ≤ s_f,
then constraint functions of the type (4.36c) also arise, see (4.29b′, c′).
By means of first order Taylor expansion of g, Γ with respect to
( q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) at ( q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) ),
s_0 ≤ s ≤ s_f, we then find the following approximations

  g^(0)( q_e(s), q_e′(s), q_e″(s), β(s), β′(s) ) ≈ g^(0)( q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) )
      + A^(0)_{g,q̄_e,β̄}(s)^T Δq_e(s) + B^(0)_{g,q̄_e,β̄}(s)^T Δq_e′(s) + C^(0)_{g,q̄_e,β̄}(s)^T Δq_e″(s)
      + D^(0)_{g,q̄_e,β̄}(s) Δβ(s) + E^(0)_{g,q̄_e,β̄}(s) Δβ′(s)  (4.37a)

and

  Γ^(0)( q_e(s_f) ) ≈ Γ^(0)( q̄_e(s_f) ) + a^(0)_{Γ,q̄_e}(s_f)^T Δq_e(s_f),  (4.37b)

where the expected sensitivities of g, Γ with respect to q, q′, q″, β and β′ are
given by

  A^(0)_{g,q̄_e,β̄}(s) := E( ∇_q g( p(ω), q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) ) | A_{t_0} )  (4.37c)
  B^(0)_{g,q̄_e,β̄}(s) := E( ∇_{q′} g( p(ω), q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) ) | A_{t_0} )  (4.37d)
  C^(0)_{g,q̄_e,β̄}(s) := E( ∇_{q″} g( p(ω), q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) ) | A_{t_0} )  (4.37e)
  D^(0)_{g,q̄_e,β̄}(s) := E( (∂g/∂β)( p(ω), q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) ) | A_{t_0} )  (4.37f)
  E^(0)_{g,q̄_e,β̄}(s) := E( (∂g/∂β′)( p(ω), q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) ) | A_{t_0} )  (4.37g)
  a^(0)_{Γ,q̄_e}(s_f) := E( ∇_q Γ( p(ω), q̄_e(s) ) | A_{t_0} ),  s_0 ≤ s ≤ s_f.  (4.37h)

As mentioned before, cf. (4.31c, d), the expected values g^(0), Γ^(0) in (4.37a, b)
and the expected sensitivities defined by (4.37c–h) can be computed approxima-
tively by means of Taylor expansion with respect to p at p̄^(0) = E( p(ω) | A_{t_0} ).
According to (4.37a), and using partial integration, for the total costs F^(0)
along the path we get the following approximation:

  F^(0) ≈ ∫_{s_0}^{s_f} g^(0)( q̄_e(s), q̄_e′(s), q̄_e″(s), β̄(s), β̄′(s) ) ds  (4.38a)
      + ∫_{s_0}^{s_f} G^(0)_{g,q̄_e,β̄}(s)^T Δq_e(s) ds + ∫_{s_0}^{s_f} H^(0)_{g,q̄_e,β̄}(s) Δβ(s) ds
      + ( B^(0)_{g,q̄_e,β̄}(s_f) - C^(0)′_{g,q̄_e,β̄}(s_f) )^T Δq_e(s_f)
      + ( -B^(0)_{g,q̄_e,β̄}(s_0) + C^(0)′_{g,q̄_e,β̄}(s_0) )^T Δq_e(s_0)
      + C^(0)_{g,q̄_e,β̄}(s_f)^T Δq_e′(s_f) - C^(0)_{g,q̄_e,β̄}(s_0)^T Δq_e′(s_0)
      + E^(0)_{g,q̄_e,β̄}(s_f) Δβ(s_f) - E^(0)_{g,q̄_e,β̄}(s_0) Δβ(s_0),

where

  G^(0)_{g,q̄_e,β̄}(s) := A^(0)_{g,q̄_e,β̄}(s) - B^(0)′_{g,q̄_e,β̄}(s) + C^(0)″_{g,q̄_e,β̄}(s)  (4.38b)
  H^(0)_{g,q̄_e,β̄}(s) := D^(0)_{g,q̄_e,β̄}(s) - E^(0)′_{g,q̄_e,β̄}(s).  (4.38c)

Conditions (4.29e–f, f′), (4.32c, d), resp., yield the following initial and terminal
conditions for the changes Δq_e(s), Δβ(s):

  Δβ(s_0) = 0,  Δq_e(s_0) = q̄_0 - q̄_e(s_0)  (4.38d)
  Δβ(s_f) = 0,  Δq_e(s_f) = q̄_f - q̄_e(s_f), if G^J = 0.  (4.38e)

Moreover, if q̄̇_0 ≠ 0 (as in later correction stages, cf. Sect. 4.5), according
to (4.29e) or (4.32c), the condition Δβ(s_0) = 0 must be replaced by the more general
one

  ( q̄_e′(s_0) + Δq_e′(s_0) ) √( β̄(s_0) + Δβ(s_0) ) = q̄̇_0,  (4.38f)

which can be approximated by

  √β̄(s_0) Δq_e′(s_0) + ( Δβ(s_0) / ( 2√β̄(s_0) ) ) q̄_e′(s_0) ≈ q̄̇_0 - √β̄(s_0) q̄_e′(s_0).  (4.38f′)

Applying the above described linearization (4.37a–h) to (4.29a–e), or to the
constraints (4.29b, c) only, problem (4.29a–f, f′) is approximated by a linear vari-
ational problem, or by a variational problem having linear constraints, for the
changes Δq_e(·), Δβ(·). On the other hand, using linearizations of the type (4.37a–h)
in the variational problem (4.32a–f), in the average constraints (4.29b′, c′), resp.,
an optimization problem for Δq_e(·), Δβ(·) is obtained which is linear, which has
linear constraints, respectively.
d) Separated computation of q_e(·) and β(·)
In order to reduce the computational complexity, the given trajectory planning
problem is often split up [62] into the following two separated problems for q_e(·)
and β(·):
i) Optimal path planning: find the shortest collision-free geometric path q_e^(0) =
q_e^(0)(s), s_0 ≤ s ≤ s_f, in configuration space from a given initial point q_0
to a prescribed terminal point q_f. Alternatively, with a given initial velocity
profile β(·) = β̄(·), see (4.35b), the substitute problem (4.29a–f, f′), (4.32a–
f), resp., may be solved for an approximate geometric path q_e(·) = q_e^(0)(·)
only.
ii) Velocity planning: determine then an optimal velocity profile β^(0) =
β^(0)(s), s_0 ≤ s ≤ s_f, along the predetermined path q_e^(0)(·).

Having a certain collection { q_e,ϑ(·) : ϑ ∈ Θ } of admissible paths in
configuration space, a variant of the above procedure (i), (ii) is to determine,
in an inner optimization loop, the optimal velocity profile β_ϑ(·) with respect
to a given path q_e,ϑ(·), and then to optimize the parameter ϑ in an outer
optimization loop, see [68].

4.4.2 Optimal Reference Trajectory, Optimal Feedforward Control
Having, at least approximately, the optimal geometric path q_e^(0) = q_e^(0)(s) and
the optimal velocity profile β^(0) = β^(0)(s), s_0 ≤ s ≤ s_f, i.e. the optimal
solution ( q_e^(0), β^(0) ) = ( q_e^(0)(s), β^(0)(s) ), s_0 ≤ s ≤ s_f, of one of the stochastic
path planning problems (4.29a–f, f′), (4.32a–f), (4.33a–d), resp., then, according
to (4.12a, b), (4.13), the optimal reference trajectory in configuration space q^(0) =
q^(0)(t), t ≥ t_0, is defined by

  q^(0)(t) := q_e^(0)( s^(0)(t) ),  t ≥ t_0.  (4.39a)

Here, the optimal (t ↔ s)-transformation s^(0) = s^(0)(t), t ≥ t_0, is determined by
the initial value problem

  ṡ(t) = √( β^(0)(s) ),  t ≥ t_0,  s(t_0) := s_0.  (4.39b)

By means of the kinematic equation (4.5), the corresponding reference trajectory
x^(0) = x^(0)(t), t ≥ t_0, in work space may be defined by

  x^(0)(t) := E( T( p_K(ω), q^(0)(t) ) | A_{t_0} ) = T( p̄_K^(0), q^(0)(t) ),  t ≥ t_0,  (4.39c)

where

  p̄_K^(0) := E( p_K(ω) | A_{t_0} ).  (4.39d)

Based on the inverse dynamics approach, see Remark 4.1, the optimal reference
trajectory q^(0) = q^(0)(t), t ≥ t_0, is now inserted into the left hand side of the
dynamic equation (4.4a). This then yields the random optimal control function

  v^(0)( t, p_D(ω) ) := M( p_D(ω), q^(0)(t) ) q̈^(0)(t)
      + h( p_D(ω), q^(0)(t), q̇^(0)(t) ),  t ≥ t_0.  (4.40)
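The initial value problem (4.39b) determines the optimal time parameterization and can be integrated numerically. In the sketch below, β^(0)(s) = 4s and q_e^(0)(s) = sin s are invented stand-ins for an optimal solution; the small positive starting value avoids the rest point β^(0)(s_0) = 0, where the square-root right hand side is not Lipschitz.

```python
import math

beta0 = lambda s: 4.0 * s           # stand-in optimal velocity profile
q_e0  = lambda s: math.sin(s)       # stand-in optimal geometric path

def s_of_t(t, t0=0.0, s_init=0.01, n=1000):
    """Integrate the IVP (4.39b)  s'(t) = sqrt(beta0(s)),  s(t0) = s_init
    with classical Runge-Kutta (RK4)."""
    f = lambda s: math.sqrt(beta0(s))
    h = (t - t0) / n
    s = s_init
    for _ in range(n):
        k1 = f(s)
        k2 = f(s + 0.5 * h * k1)
        k3 = f(s + 0.5 * h * k2)
        k4 = f(s + h * k3)
        s += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return s

# Reference trajectory (4.39a): q0(t) = q_e0(s0(t)).
q_ref = lambda t: q_e0(s_of_t(t))

# For beta0(s) = 4 s the exact solution is s(t) = (sqrt(s_init) + t)^2:
assert abs(s_of_t(1.0) - (0.1 + 1.0) ** 2) < 1e-9
```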

 
Starting at the initial state $(q_0, \dot q_0) := \bigl(q^{(0)}(t_0), \dot q^{(0)}(t_0)\bigr)$, this control obviously keeps the robot exactly on the optimal trajectory $q^{(0)}(t)$, $t \ge t_0$, provided that $p_D(\omega)$ is the true vector of dynamic parameters.

An optimal feedforward control law $u^{(0)} = u^{(0)}(t)$, $t \ge t_0$, related to the optimal reference trajectory $q^{(0)} = q^{(0)}(t)$, $t \ge t_0$, can then be obtained by applying a certain averaging or estimation operator $\Gamma = \Gamma(\,\cdot\,|\mathcal{A}_{t_0})$ to (4.40), hence

$$u^{(0)}(t) := \Gamma\Bigl(v^{(0)}\bigl(t, p_D(\cdot)\bigr)\,\big|\,\mathcal{A}_{t_0}\Bigr), \quad t \ge t_0. \qquad (4.41)$$

If $\Gamma(\,\cdot\,|\mathcal{A}_{t_0})$ is the conditional expectation, then we find the optimal feedforward control law

$$u^{(0)}(t) := E\Bigl(M\bigl(p_D(\omega), q^{(0)}(t)\bigr)\ddot q^{(0)}(t) + h\bigl(p_D(\omega), q^{(0)}(t), \dot q^{(0)}(t)\bigr)\,\big|\,\mathcal{A}_{t_0}\Bigr) = M\bigl(\bar p_D^{(0)}, q^{(0)}(t)\bigr)\ddot q^{(0)}(t) + h\bigl(\bar p_D^{(0)}, q^{(0)}(t), \dot q^{(0)}(t)\bigr), \quad t \ge t_0, \qquad (4.42a)$$

where $\bar p_D^{(0)}$ denotes the conditional mean of $p_D(\omega)$ defined by (4.31c); the second equation in formula (4.42a) holds since the dynamic equation of a robot depends linearly on the parameter vector $p_D$, see Remark 4.2.

Inserting into the dynamic equation (4.4a), instead of the conditional mean $\bar p_D^{(0)}$ of $p_D(\omega)$ given $\mathcal{A}_{t_0}$, another estimator $p_D^{(0)}$ of the true parameter vector $p_D$, or a certain realization $p_D^{(0)}$ of $p_D(\omega)$ at the time instant $t_0$, we obtain the optimal feedforward control law

$$u^{(0)}(t) := M\bigl(p_D^{(0)}, q^{(0)}(t)\bigr)\ddot q^{(0)}(t) + h\bigl(p_D^{(0)}, q^{(0)}(t), \dot q^{(0)}(t)\bigr), \quad t \ge t_0. \qquad (4.42b)$$
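The structure of (4.42b), i.e. evaluating the dynamic model at a fixed parameter estimate along the reference trajectory, can be sketched for a hypothetical single-link pendulum. The model below ($M = m l^2$, $h = d\dot q + m g l \sin q$) is purely illustrative and not the book's robot model:

```python
import numpy as np

def feedforward_control(p_D, q, dq, ddq):
    # u = M(p_D, q)*ddq + h(p_D, q, dq) for an assumed single-link
    # pendulum with parameter vector p_D = (mass m, length l, damping d)
    m, l, d = p_D
    return m * l**2 * ddq + d * dq + m * 9.81 * l * np.sin(q)
```

Replacing `p_D` by the conditional mean of the parameters reproduces the choice (4.42a); any other estimate gives a law of type (4.42b).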

4.5 AOSTP: Adaptive Optimal Stochastic Trajectory Planning

As already mentioned in the introduction, by means of direct or indirect measurements or observations of the robot and its environment, e.g. observations of the state $(x, \dot x)$, $(q, \dot q)$, resp., of the mechanical system in work or configuration space, further information about the unknown parameter vector $p = p(\omega)$ is available at each moment $t > t_0$. Let, cf. Sects. 4.1, 4.4,

$$\mathcal{A}_t\ (\subset \mathcal{A}), \quad t \ge t_0, \qquad (4.43a)$$

denote the $\sigma$-algebra of all information about the random parameter vector $p = p(\omega)$ up to time $t$. Hence, $(\mathcal{A}_t)$ is an increasing family of $\sigma$-algebras. Note that the flow of information in this control process can also be described by means of the stochastic process

$$p_t(\omega) := E\bigl(p(\omega)\,\big|\,\mathcal{A}_t\bigr), \quad t \ge t_0, \qquad (4.43b)$$

see [12].
By parameter identification [65, 143] or robot calibration techniques [15, 145] we may then determine the conditional distribution

$$P_{p(\cdot)}^{(t)} = P_{p(\cdot)|\mathcal{A}_t} \qquad (4.43c)$$

of $p(\omega)$ given $\mathcal{A}_t$. Alternatively, we may determine the vector of conditional moments

$$\mu^{(t)} := \left( E\Bigl( \prod_{k=1}^{r} p_{l_k}(\omega) \,\Big|\, \mathcal{A}_t \Bigr) \right)_{(l_1,\ldots,l_r)} \qquad (4.43d)$$

arising in the approximate computation of conditional expectations in (OSTP) with respect to $\mathcal{A}_t$, cf. (4.31c, d).
The increase of information about the unknown parameter vector $p(\omega)$ from one moment $t$ to the next $t + dt$ may be rather small, and the determination of $P_{p(\cdot)}^{(t)}$ or $\mu^{(t)}$ at each time point $t$ may be very expensive, though real-time identification methods exist [143]. Hence, as already mentioned briefly in Sect. 4.1, the conditional distribution $P_{p(\cdot)}^{(t)}$ or the vector of conditional moments $\mu^{(t)}$ is determined/updated at discrete moments $(t_j)$:

$$t_0 < t_1 < t_2 < \ldots < t_j < t_{j+1} < \ldots \qquad (4.44a)$$

The optimal functions $q_e^{(0)}(s), \beta^{(0)}(s)$, $s_0 \le s \le s_f$, based on the a priori information $\mathcal{A}_{t_0}$, lose in the course of time more or less their qualification to provide a satisfactory pair of guiding functions $\bigl(q^{(0)}(t), u^{(0)}(t)\bigr)$, $t \ge t_0$.

However, having at the main correction time points $t_j$, $j = 1, 2, \ldots$, the updated information $\sigma$-algebras $\mathcal{A}_{t_j}$ and then the a posteriori probability distributions $P_{p(\cdot)}^{(t_j)}$ or the updated conditional moments $\mu^{(t_j)}$ of $p(\omega)$, $j = 1, 2, \ldots$, the pair of guiding functions $\bigl(q^{(0)}(t), u^{(0)}(t)\bigr)$, $t \ge t_0$, is replaced by a sequence of renewed pairs $\bigl(q^{(j)}(t), u^{(j)}(t)\bigr)$, $t \ge t_j$, $j = 1, 2, \ldots$, of guiding functions determined by replanning, i.e. by repeated (OSTP) for the remaining time intervals $[t_j, t_f^{(j)}]$ and by using the new information given by $\mathcal{A}_{t_j}$. Since replanning at a later main correction time point $t_j$, $j \ge 1$, hence on-line, may be very time-consuming, in order to maintain the real-time capability of the method, we may start the replanning
procedure for an update of the guiding functions at time $t_j$ already at some earlier time $\tilde t_j$ with $t_{j-1} < \tilde t_j < t_j$, $j \ge 1$ (Fig. 4.2: start of replanning). Of course, in this case

$$\mathcal{A}_{t_j} := \mathcal{A}_{\tilde t_j} \qquad (4.44b)$$

is defined to contain only the information about the control process up to the time $\tilde t_j$ at which replanning for time $t_j$ starts.
The resulting substitute problem at a stage $j \ge 1$ follows from the corresponding substitute problem for the previous stage $j-1$ just by updating the initial and terminal parameters, $\zeta_{j-1} \to \zeta_j$, $\zeta_f^{(j-1)} \to \zeta_f^{(j)}$, see (4.31a, b). The renewed

$$\text{initial parameters}\quad \zeta_j:\ t_j,\ s_j,\ \bar q_j,\ \dot{\bar q}_j,\ P_{p(\cdot)}^{(j)}\ \text{or}\ \mu_j \qquad (4.45a)$$

for the $j$-th stage, $j \ge 1$, are determined recursively as follows:

$$s_j := s^{(j-1)}(t_j) \quad (1\text{--}1\ \text{transformation}\ s = s(t)) \qquad (4.45b)$$

$$\bar q_j := \hat q(t_j),\quad \bar q_j := q^{(j-1)}(t_j)\quad \text{or}\quad \bar q_j := E\bigl(q(t_j)\,|\,\mathcal{A}_{t_j}\bigr) \qquad (4.45c)$$

$$\dot{\bar q}_j := \hat{\dot q}(t_j),\quad \dot{\bar q}_j := \dot q^{(j-1)}(t_j)\quad \text{or}\quad \dot{\bar q}_j := E\bigl(\dot q(t_j)\,|\,\mathcal{A}_{t_j}\bigr) \qquad (4.45d)$$

(observation or estimate of $q(t_j), \dot q(t_j)$)

$$P_{p(\cdot)}^{(j)} := P_{p(\cdot)}^{(t_j)} = P_{p(\cdot)|\mathcal{A}_{t_j}} \qquad (4.45e)$$

$$\mu_j := \mu^{(t_j)}. \qquad (4.45f)$$

The renewed

$$\text{terminal parameters}\quad \zeta_f^{(j)}:\ t_f^{(j)},\ s_f,\ \bar q_f^{(j)},\ \beta_f \qquad (4.46a)$$

for the $j$-th stage, $j \ge 1$, are defined by

$$s_f\ \text{(given)} \qquad (4.46b)$$

$$\bar q_f^{(j)} := \hat q(t_f)\quad \text{or}\quad \bar q_f^{(j)} := E\bigl(q_f(\omega)\,|\,\mathcal{A}_{t_j}\bigr) \qquad (4.46c)$$

$$\beta_f = 0 \qquad (4.46d)$$

$$s^{(j)}\bigl(t_f^{(j)}\bigr) = s_f. \qquad (4.46e)$$

As already mentioned above, the (OSTP) for the $j$-th stage, $j \ge 1$, is obtained from the substitute problems (4.29a–f, f'), (4.32a–f), (4.33a–d), resp., formulated for the 0-th stage, $j = 0$, just by substituting

$$\zeta_0 \to \zeta_j \quad \text{and} \quad \zeta_f \to \zeta_f^{(j)}. \qquad (4.47)$$

Let then

$$\bigl(q_e^{(j)}, \beta^{(j)}\bigr) = \bigl(q_e^{(j)}(s), \beta^{(j)}(s)\bigr), \quad s_j \le s \le s_f, \qquad (4.48)$$

denote

the corresponding pair of optimal solutions of the resulting substitute problem for the $j$-th stage, $j \ge 1$.

The pair of guiding functions $\bigl(q^{(j)}(t), u^{(j)}(t)\bigr)$, $t \ge t_j$, for the $j$-th stage, $j \ge 1$, is then defined as described in Sect. 4.4.2 for the 0-th stage. Hence, for the $j$-th stage, the reference trajectory in configuration space $q^{(j)}(t)$, $t \ge t_j$, reads, cf. (4.39a),

$$q^{(j)}(t) := q_e^{(j)}\bigl(s^{(j)}(t)\bigr), \quad t \ge t_j, \qquad (4.49a)$$

where the transformation $s^{(j)}: \bigl[t_j, t_f^{(j)}\bigr] \to [s_j, s_f]$ is defined by the initial value problem

$$\dot s(t) = \sqrt{\beta^{(j)}\bigl(s(t)\bigr)}, \quad t \ge t_j, \quad s(t_j) = s_j. \qquad (4.49b)$$

The terminal time $t_f^{(j)}$ for the $j$-th stage is defined by the equation

$$s^{(j)}\bigl(t_f^{(j)}\bigr) = s_f. \qquad (4.49c)$$

Moreover, again by the inverse dynamics approach, the feedforward control $u^{(j)} = u^{(j)}(t)$, $t \ge t_j$, for the $j$-th stage is defined, see (4.40), (4.41), (4.42a, b), by

$$u^{(j)}(t) := \Gamma\Bigl(v^{(j)}\bigl(t, p_D(\omega)\bigr)\,\big|\,\mathcal{A}_{t_j}\Bigr), \qquad (4.50a)$$

where

$$v^{(j)}(t, p_D) := M\bigl(p_D, q^{(j)}(t)\bigr)\ddot q^{(j)}(t) + h\bigl(p_D, q^{(j)}(t), \dot q^{(j)}(t)\bigr), \quad t \ge t_j. \qquad (4.50b)$$

Using the conditional expectation $\Gamma(\,\cdot\,|\mathcal{A}_{t_j}) := E(\,\cdot\,|\mathcal{A}_{t_j})$, we find the feedforward control

$$u^{(j)}(t) := M\bigl(\bar p_D^{(j)}, q^{(j)}(t)\bigr)\ddot q^{(j)}(t) + h\bigl(\bar p_D^{(j)}, q^{(j)}(t), \dot q^{(j)}(t)\bigr), \quad t \ge t_j, \qquad (4.50c)$$

where, cf. (4.31c),


.j /  
p D WD E pD .!/jAtj : (4.50d)

Corresponding to (4.39c, d), the reference trajectory in work space x .j / D


.j /
x .j / .t/; t  tj , for the remaining time interval tj  t  tf , is defined by
     
.j / .j /
x .j / .t/ WD E T pK .!/; q .j / .t/ jAtj D T p K ; q .j / .t/ ; tj  t  tf ;

(4.51a)

where
 
.j /
p K WD E pK .!/jAtj : (4.51b)

4.5.1 (OSTP)-Transformation

The variational problems (OSTP) at the different stages $j = 0, 1, 2, \ldots$ are determined uniquely by the set of initial and terminal parameters $(\zeta_j, \zeta_f^{(j)})$, cf. (4.45a–f), (4.46a–e). Thus, these problems can be transformed to a reference problem depending on $(\zeta_j, \zeta_f^{(j)})$ and having a certain fixed reference $s$-interval.

Theorem 4.5.1 Let $[\tilde s_0, \tilde s_f]$, $\tilde s_0 < \tilde s_f := s_f$, be a given, fixed reference $s$-interval, and consider for a certain stage $j$, $j = 0, 1, \ldots$, the transformation

$$\tilde s = \tilde s(s) := \tilde s_0 + \frac{\tilde s_f - \tilde s_0}{s_f - s_j}\,(s - s_j), \quad s_j \le s \le s_f, \qquad (4.52a)$$

from $[s_j, s_f]$ onto $[\tilde s_0, \tilde s_f]$ having the inverse

$$s = s(\tilde s) = s_j + \frac{s_f - s_j}{\tilde s_f - \tilde s_0}\,(\tilde s - \tilde s_0), \quad \tilde s_0 \le \tilde s \le \tilde s_f. \qquad (4.52b)$$

Represent then the geometric path $q_e = q_e(s)$ and the velocity profile $\beta = \beta(s)$, $s_j \le s \le s_f$, for the $j$-th stage by

$$q_e(s) := \tilde q_e\bigl(\tilde s(s)\bigr), \quad s_j \le s \le s_f, \qquad (4.53a)$$

$$\beta(s) := \tilde\beta\bigl(\tilde s(s)\bigr), \quad s_j \le s \le s_f, \qquad (4.53b)$$

where $\tilde q_e = \tilde q_e(\tilde s)$, $\tilde\beta = \tilde\beta(\tilde s)$, $\tilde s_0 \le \tilde s \le \tilde s_f$, denote the corresponding functions on $[\tilde s_0, \tilde s_f]$. Then the (OSTP) for the $j$-th stage is transformed into a reference variational problem (stated in the following) for $(\tilde q_e, \tilde\beta)$, depending on the parameters

$$(\zeta, \zeta_f) = (\zeta_j, \zeta_f^{(j)}) \in Z \times Z_f \qquad (4.54)$$

and having the fixed reference $s$-interval $[\tilde s_0, \tilde s_f]$. Moreover, the optimal solution $\bigl(q_e^{(j)}, \beta^{(j)}\bigr) = \bigl(q_e^{(j)}(s), \beta^{(j)}(s)\bigr)$, $s_j \le s \le s_f$, may be represented by the optimal adaptive law

$$q_e^{(j)}(s) = \tilde q_e^{*}\bigl(\tilde s(s); \zeta_j, \zeta_f^{(j)}\bigr), \quad s_j \le s \le s_f, \qquad (4.55a)$$

$$\beta^{(j)}(s) = \tilde\beta^{*}\bigl(\tilde s(s); \zeta_j, \zeta_f^{(j)}\bigr), \quad s_j \le s \le s_f, \qquad (4.55b)$$

where

$$\tilde q_e^{*} = \tilde q_e^{*}(\tilde s; \zeta, \zeta_f), \quad \tilde\beta^{*} = \tilde\beta^{*}(\tilde s; \zeta, \zeta_f), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.55c)$$

denotes the optimal solution of the above-mentioned reference variational problem.
Proof According to (4.53a, b) and (4.52a, b), the derivatives of the functions $q_e(s), \beta(s)$, $s_j \le s \le s_f$, are given by

$$q_e'(s) = \tilde q_e'\bigl(\tilde s(s)\bigr)\,\frac{\tilde s_f - \tilde s_0}{s_f - s_j}, \quad s_j \le s \le s_f, \qquad (4.56a)$$

$$q_e''(s) = \tilde q_e''\bigl(\tilde s(s)\bigr)\left(\frac{\tilde s_f - \tilde s_0}{s_f - s_j}\right)^{2}, \quad s_j \le s \le s_f, \qquad (4.56b)$$

$$\beta'(s) = \tilde\beta'\bigl(\tilde s(s)\bigr)\,\frac{\tilde s_f - \tilde s_0}{s_f - s_j}, \quad s_j \le s \le s_f. \qquad (4.56c)$$

Now, inserting the transformation (4.52a, b) and the representation (4.53a, b), (4.56a–c) of $q_e(s), \beta(s)$, $s_j \le s \le s_f$, and their derivatives into one of the substitute problems (4.29a–f, f'), (4.32a–f) or their time-variant versions, the chosen substitute problem is transformed into a corresponding reference variational problem (stated in the following) having the fixed reference interval $[\tilde s_0, \tilde s_f]$ and depending on the parameter vectors $\zeta_j, \zeta_f^{(j)}$. Moreover, according to (4.53a, b), the optimal solution $\bigl(q_e^{(j)}, \beta^{(j)}\bigr)$ of the substitute problem for the $j$-th stage may then be represented by (4.55a–c).
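The affine change of variables (4.52a, b) is elementary but central to the real-time scheme, since all stages share one reference interval. A minimal sketch (names are not from the text):

```python
def to_reference(s, s_j, s_f, st_0, st_f):
    # affine map (4.52a) from the stage interval [s_j, s_f]
    # onto the fixed reference interval [st_0, st_f]
    return st_0 + (st_f - st_0) / (s_f - s_j) * (s - s_j)

def from_reference(st, s_j, s_f, st_0, st_f):
    # inverse map (4.52b) back onto [s_j, s_f]
    return s_j + (s_f - s_j) / (st_f - st_0) * (st - st_0)
```

Derivatives of functions composed with this map pick up the constant factor $(\tilde s_f - \tilde s_0)/(s_f - s_j)$ per differentiation order, exactly as in (4.56a–c).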
Remark 4.11 Based on the above theorem, the stage-independent functions $\tilde q_e^{*}, \tilde\beta^{*}$ can now be determined off-line by using an appropriate numerical procedure.

4.5.2 The Reference Variational Problem

After the (OSTP)-transformation described in Sect. 4.5.1, in the time-invariant case we find for the problems of type (4.29a–f, f')

$$\min\ E\left(\int_{\tilde s_0}^{\tilde s_f} L_J\Bigl(p_J,\ \tilde q_e(\tilde s),\ \tilde q_e'(\tilde s)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j},\ \tilde q_e''(\tilde s)\Bigl(\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\Bigr)^{2},\ \tilde\beta(\tilde s),\ \tilde\beta'(\tilde s)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\Bigr)\,\frac{s_f-s_j}{\tilde s_f-\tilde s_0}\,d\tilde s \ \Big|\ \mathcal{A}_{t_j}\right) + E\Bigl(\gamma_J\bigl(p_J, \tilde q_e(\tilde s_f)\bigr)\,\Big|\,\mathcal{A}_{t_j}\Bigr) \qquad (4.57a)$$

s.t.

$$E\left(f\Bigl(p,\ \tilde q_e(\tilde s),\ \tilde q_e'(\tilde s)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j},\ \tilde q_e''(\tilde s)\Bigl(\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\Bigr)^{2},\ \tilde\beta(\tilde s),\ \tilde\beta'(\tilde s)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\Bigr)\ \Big|\ \mathcal{A}_{t_j}\right) \le \bar f, \qquad (4.57b)$$

$$\tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.57c)$$

$$\tilde\beta(\tilde s) \ge 0, \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.57d)$$

$$\tilde q_e(\tilde s_0) = \bar q_j, \qquad \tilde q_e'(\tilde s_0)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\,\sqrt{\tilde\beta(\tilde s_0)} = \dot{\bar q}_j, \qquad (4.57e)$$

$$\tilde q_e(\tilde s_f) = \bar q_f^{(j)}\ \ (\text{if}\ \gamma_J = 0), \qquad \tilde\beta(\tilde s_f) = 0, \qquad (4.57f)$$

$$\tilde\beta(\tilde s_f) = 0, \qquad E\Bigl(\gamma\bigl(p, \tilde q_e(\tilde s_f)\bigr)\,\big|\,\mathcal{A}_{t_j}\Bigr) \le \bar\gamma, \qquad (4.57e')$$

where $f, \bar f$ are defined by

$$f := \begin{pmatrix} f_u \\ f_S \end{pmatrix}, \qquad \bar f := \begin{pmatrix} \bar f_u \\ \bar f_S \end{pmatrix}. \qquad (4.57g)$$

Moreover, for the problem type (4.32a–f) we get

$$\min\ E\left(\int_{\tilde s_0}^{\tilde s_f} \bar L_J\Bigl(p,\ \tilde q_e(\tilde s),\ \tilde q_e'(\tilde s)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j},\ \tilde q_e''(\tilde s)\Bigl(\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\Bigr)^{2},\ \tilde\beta(\tilde s),\ \tilde\beta'(\tilde s)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\Bigr)\,\frac{s_f-s_j}{\tilde s_f-\tilde s_0}\,d\tilde s \ \Big|\ \mathcal{A}_{t_j}\right) + E\Bigl(\bar\gamma_J\bigl(p, \tilde q_e(\tilde s_f)\bigr)\,\Big|\,\mathcal{A}_{t_j}\Bigr) \qquad (4.58a)$$

s.t.

$$\tilde\beta(\tilde s) \ge 0, \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.58b)$$

$$\tilde q_e(\tilde s_0) = \bar q_j, \qquad \tilde q_e'(\tilde s_0)\,\frac{\tilde s_f-\tilde s_0}{s_f-s_j}\,\sqrt{\tilde\beta(\tilde s_0)} = \dot{\bar q}_j, \qquad (4.58c)$$

$$\tilde q_e(\tilde s_f) = \bar q_f^{(j)}\ \ (\text{if}\ \bar\gamma_J = 0), \qquad \tilde\beta(\tilde s_f) = 0. \qquad (4.58d)$$

For the consideration of the time-variant case we first note that, by using the transformation (4.52a, b) and (4.53b), the time $t \ge t_j$ can be represented, cf. (4.14a, b) and (4.15a), also by

$$t = t\bigl(\tilde s; t_j, s_j, \tilde\beta(\cdot)\bigr) := t_j + \frac{s_f - s_j}{\tilde s_f - \tilde s_0} \int_{\tilde s_0}^{\tilde s} \frac{d\tilde\sigma}{\sqrt{\tilde\beta(\tilde\sigma)}}. \qquad (4.59a)$$

Hence, if the variational problems (4.57a–f) and (4.58a–d) for the $j$-th stage depend explicitly on time $t \ge t_j$, then, corresponding to Sect. 4.4, item B), for the constituting functions $L_J, \gamma_J, \bar L_J, \bar\gamma_J$ of the variational problems we have that

$$L_J = L_J\bigl(\tilde s; t_j, s_j, \tilde\beta(\cdot);\ p_J, q_e, q_e', q_e'', \beta, \beta'\bigr), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.59b)$$

$$\gamma_J = \gamma_J\bigl(\tilde s_f; t_j, s_j, \tilde\beta(\cdot);\ p_J, q_e\bigr), \qquad (4.59c)$$

$$f = f\bigl(\tilde s; t_j, s_j, \tilde\beta(\cdot);\ p, q_e, q_e', q_e'', \beta, \beta'\bigr), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.59d)$$

$$\bar L_J = \bar L_J\bigl(\tilde s; t_j, s_j, \tilde\beta(\cdot);\ p, q_e, q_e'', \beta, \beta'\bigr), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.59e)$$

$$\bar\gamma_J = \bar\gamma_J\bigl(\tilde s_f; t_j, s_j, \tilde\beta(\cdot);\ p, q_e\bigr). \qquad (4.59f)$$

Transformation of the Initial State Values

Suppose here that $\gamma_J \ne 0$, $\bar\gamma_J \ne 0$, resp., and the terminal state condition (4.57f, e'), (4.58d), resp., is reduced to

$$\tilde\beta(\tilde s_f) = 0. \qquad (4.60a)$$

Representing then the unknown functions $\tilde\beta(\cdot), \tilde q_e(\cdot)$ on $[\tilde s_0, \tilde s_f]$ by

$$\tilde\beta(\tilde s) := \beta_j\,\tilde\beta_a(\tilde s), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.60b)$$

$$\tilde q_e(\tilde s) := \bar q_{jd}\,\tilde q_{ea}(\tilde s), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.60c)$$

where $\bar q_{jd}$ denotes the diagonal matrix with the components of $\bar q_j$ on its main diagonal, then in terms of the new unknowns $\bigl(\tilde\beta_a(\cdot), \tilde q_{ea}(\cdot)\bigr)$ on $[\tilde s_0, \tilde s_f]$ we have the nonnegativity and fixed initial/terminal conditions

$$\tilde\beta_a(\tilde s) \ge 0, \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.61a)$$

$$\tilde\beta_a(\tilde s_0) = 1, \qquad \tilde q_{ea}(\tilde s_0) = \mathbf{1}, \qquad (4.61b)$$

$$\tilde\beta_a(\tilde s_f) = 0, \qquad (4.61c)$$

where $\mathbf{1} := (1, \ldots, 1)^{T}$.

4.5.3 Numerical Solutions of (OSTP) in Real-Time

With the exception of field robots (e.g. Mars rovers) and service robots [62], which are becoming increasingly important, standard industrial robots move very fast. Hence, for industrial robots the optimal solution $\bigl(q_e^{(j)}(s), \beta^{(j)}(s)\bigr)$, $\beta^{(j)}(s)$, resp., $s_j \le s \le s_f$, generating the renewed pair of guiding functions $\bigl(q^{(j)}(t), u^{(j)}(t)\bigr)$, $t \ge t_j$, on each stage $j = 1, 2, \ldots$, should be provided in real-time. This means that the optimal solutions $\bigl(q_e^{(j)}, \beta^{(j)}\bigr)$, $\beta^{(j)}$, resp., must be prepared off-line as far as possible, such that only relatively simple numerical operations are left on-line.

Numerical methods capable of generating approximate optimal solutions in real-time are mostly based on discretization techniques, neural network (NN) approximation [8, 111, 114, 133], and linearization techniques (sensitivity analysis) [159].

Discretization Techniques

Partitioning the space $Z \times Z_f$ of initial/terminal parameters $(\zeta, \zeta_f)$ into a certain (small) number $l_0$ of subdomains

$$Z \times Z_f = \bigcup_{l=1}^{l_0} Z^l \times Z_f^l, \qquad (4.62a)$$

and then selecting a reference parameter vector

$$(\zeta^l, \zeta_f^l) \in Z^l \times Z_f^l, \quad l = 1, \ldots, l_0, \qquad (4.62b)$$

in each subdomain $Z^l \times Z_f^l$, the optimal adaptive law (4.55c) can be approximated, cf. [150], by

$$\left.\begin{aligned} \hat{\tilde q}_e^{*}(\tilde s; \zeta, \zeta_f) &:= \tilde q_e^{*}(\tilde s; \zeta^l, \zeta_f^l), \quad \tilde s_0 \le \tilde s \le \tilde s_f\\ \hat{\tilde\beta}^{*}(\tilde s; \zeta, \zeta_f) &:= \tilde\beta^{*}(\tilde s; \zeta^l, \zeta_f^l), \quad \tilde s_0 \le \tilde s \le \tilde s_f \end{aligned}\right\}\ \text{for}\ (\zeta, \zeta_f) \in Z^l \times Z_f^l. \qquad (4.62c)$$
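On-line, this piecewise-constant scheme amounts to a simple table lookup: find the subdomain containing the actual parameters and return its precomputed law. A minimal sketch, where a nearest-neighbour search over the reference vectors stands in for the explicit subdomain partition of (4.62a) (names are hypothetical):

```python
import numpy as np

def nearest_reference_law(zeta, ref_params, ref_laws):
    # approximation in the spirit of (4.62c): return the precomputed
    # optimal law whose reference parameter vector is closest to the
    # actual initial/terminal parameters zeta
    d = np.linalg.norm(np.asarray(ref_params, float) - np.asarray(zeta, float), axis=1)
    return ref_laws[int(np.argmin(d))]
```

With Voronoi-like subdomains the nearest-neighbour rule and the partition-based rule coincide; for general partitions a containment test per subdomain would be used instead.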

NN-Approximation

For the determination of the optimal adaptive law (4.55a–c) in real-time, according to (4.34a, b), the reference variational problem (4.57a–f) or (4.58a–d) is first reduced to a finite-dimensional parameter optimization problem by

i) representing the unknown functions $\tilde q_e = \tilde q_e(\tilde s)$, $\tilde\beta = \tilde\beta(\tilde s)$, $\tilde s_0 \le \tilde s \le \tilde s_f$, as linear combinations

$$\tilde q_e(\tilde s) := \sum_{l=1}^{l_q} \hat q_l\, B_l^{q}(\tilde s), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.63a)$$

$$\tilde\beta(\tilde s) := \sum_{l=1}^{l_\beta} \hat\beta_l\, B_l^{\beta}(\tilde s), \quad \tilde s_0 \le \tilde s \le \tilde s_f, \qquad (4.63b)$$

of certain basis functions, e.g. cubic B-splines, $B_l^{q} = B_l^{q}(\tilde s)$, $B_l^{\beta} = B_l^{\beta}(\tilde s)$, $\tilde s_0 \le \tilde s \le \tilde s_f$, $l = 1, \ldots, l_q\ (l_\beta)$, with unknown vectorial (scalar) coefficients $\hat q_l, \hat\beta_l$, $l = 1, \ldots, l_q\ (l_\beta)$, and

ii) demanding the inequalities in (4.57b, c), (4.58b), resp., only for a finite set of $\tilde s$-parameter values $\tilde s_0 < \tilde s_1 < \ldots < \tilde s_k < \tilde s_{k+1} < \ldots < \tilde s_f$.
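The expansion (4.63b) can be sketched with piecewise-linear hat functions standing in for the cubic B-splines of the text; for a hat basis the linear combination collapses to ordinary linear interpolation between the coefficients (a simplifying assumption, kept here only to keep the sketch short):

```python
import numpy as np

def beta_tilde(s, knots, coeffs):
    # finite-dimensional representation in the spirit of (4.63b):
    # sum_l beta_hat_l * B_l(s) with hat functions B_l centred at the
    # knots, which equals linear interpolation of the coefficients
    return np.interp(s, knots, coeffs)
```

In practice one would use a genuine cubic B-spline basis so that the derivatives $\tilde\beta', \tilde q_e', \tilde q_e''$ entering (4.57a–f) are smooth.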
By means of the above described procedure (i), (ii), the optimal coefficients

$$\hat q_l^{*} = \hat q_l^{*}(\zeta, \zeta_f), \quad l = 1, \ldots, l_q, \qquad (4.63c)$$

$$\hat\beta_l^{*} = \hat\beta_l^{*}(\zeta, \zeta_f), \quad l = 1, \ldots, l_\beta, \qquad (4.63d)$$

become functions of the initial/terminal parameters $\zeta, \zeta_f$, cf. (4.55c). Now, for the numerical realization of the optimal parameter functions (4.63c, d), a Neural Network (NN) is employed, generating an approximative representation

$$\hat q_l^{*}(\zeta, \zeta_f) \approx \hat q_l^{NN}(\zeta, \zeta_f; w^q), \quad l = 1, \ldots, l_q, \qquad (4.64a)$$

$$\hat\beta_l^{*}(\zeta, \zeta_f) \approx \hat\beta_l^{NN}(\zeta, \zeta_f; w^\beta), \quad l = 1, \ldots, l_\beta, \qquad (4.64b)$$

where the vectors $w^q, w^\beta$ of NN-weights are determined optimally,

$$w^q = w^{q*}(\text{data}), \qquad w^\beta = w^{\beta*}(\text{data}), \qquad (4.64c)$$

in an off-line training procedure [8, 111, 133]. Here, the model (4.64a, b) is fitted in the LSQ-sense to given data

$$\begin{pmatrix} \zeta^\tau, \zeta_f^\tau \\ \hat q_l^{*\tau},\ l = 1, \ldots, l_q \end{pmatrix}, \quad \begin{pmatrix} \zeta^\tau, \zeta_f^\tau \\ \hat\beta_l^{*\tau},\ l = 1, \ldots, l_\beta \end{pmatrix}, \quad \tau = 1, \ldots, \tau_0, \qquad (4.64d)$$

where

$$(\zeta^\tau, \zeta_f^\tau), \quad \tau = 1, \ldots, \tau_0, \qquad (4.64e)$$

is a certain collection of initial/terminal parameter vectors, and

$$\hat q_l^{*\tau} := \hat q_l^{*}(\zeta^\tau, \zeta_f^\tau), \quad l = 1, \ldots, l_q, \quad \tau = 1, \ldots, \tau_0, \qquad (4.64f)$$

$$\hat\beta_l^{*\tau} := \hat\beta_l^{*}(\zeta^\tau, \zeta_f^\tau), \quad l = 1, \ldots, l_\beta, \quad \tau = 1, \ldots, \tau_0, \qquad (4.64g)$$

are the related optimal coefficients in (4.63a, b), determined off-line by an appropriate parameter optimization procedure.

Having the vectors $w^{q*}, w^{\beta*}$ of optimal NN-weights, by means of (4.64a–c), for given actual initial/terminal parameters $(\zeta, \zeta_f) = (\zeta_j, \zeta_f^{(j)})$ at stage $j \ge 0$, the NN then yields the optimal parameters

$$\hat q_l^{*}(\zeta_j, \zeta_f^{(j)}), \quad \hat\beta_l^{*}(\zeta_j, \zeta_f^{(j)}), \quad l = 1, \ldots, l_q\ (l_\beta),$$

in real-time; consequently, by means of (4.63a, b), the optimal functions $\tilde q_e^{*}(\tilde s), \tilde\beta^{*}(\tilde s)$, $\tilde s_0 \le \tilde s \le \tilde s_f$, are then also available very fast. For more details, see [8, 111].
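The off-line fit (4.64a–d) maps initial/terminal parameters to optimal coefficients. As a minimal stand-in for the neural network, the sketch below fits an affine model in the LSQ-sense to hypothetical training pairs; the interface (parameters in, coefficient vector out) is the point, not the model class:

```python
import numpy as np

def fit_coefficient_map(Z, C):
    # LSQ fit of the parameter-to-coefficient map (4.64a-c);
    # an affine model W stands in here for the neural network.
    # Z: (tau_0, dim) training parameters, C: (tau_0, n_coeff) targets
    A = np.hstack([Z, np.ones((Z.shape[0], 1))])   # affine features
    W, *_ = np.linalg.lstsq(A, C, rcond=None)
    return lambda z: np.append(np.asarray(z, float), 1.0) @ W
```

At stage $j$ the returned callable plays the role of the trained network: evaluating it is a single matrix-vector product, which is what makes the scheme real-time capable.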

Linearization Methods

I) Linearization of the optimal feedforward control law

Expanding the optimal control laws (4.55c) with respect to the initial/terminal parameter vector $\hat\zeta = (\zeta, \zeta_f)$ at its value $\hat\zeta_0 = (\zeta_0, \zeta_f^{(0)})$ for stage $j = 0$, we have approximately that

$$\tilde q_e^{*}(\tilde s; \zeta, \zeta_f) = \tilde q_e^{*}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr) + \frac{\partial \tilde q_e^{*}}{\partial\hat\zeta}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr)\bigl(\zeta - \zeta_0, \zeta_f - \zeta_f^{(0)}\bigr) + \ldots, \qquad (4.65a)$$

$$\tilde\beta^{*}(\tilde s; \zeta, \zeta_f) = \tilde\beta^{*}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr) + \frac{\partial \tilde\beta^{*}}{\partial\hat\zeta}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr)\bigl(\zeta - \zeta_0, \zeta_f - \zeta_f^{(0)}\bigr) + \ldots, \qquad (4.65b)$$

where the optimal starting functions $\tilde q_e^{*}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr), \tilde\beta^{*}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr)$ and the derivatives $\frac{\partial \tilde q_e^{*}}{\partial\hat\zeta}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr), \frac{\partial \tilde\beta^{*}}{\partial\hat\zeta}\bigl(\tilde s; \zeta_0, \zeta_f^{(0)}\bigr), \ldots$ can be determined off-line, on a certain grid of $[\tilde s_0, \tilde s_f]$, by using sensitivity analysis [159]. The actual values of $\tilde q_e^{*}, \tilde\beta^{*}$ at later stages can then be obtained very rapidly by means of simple matrix operations. If necessary, the derivatives can be updated later on by a numerical procedure running in parallel to the control process.
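The on-line part of (4.65a, b) is just a first-order Taylor update: stored stage-0 value plus stored sensitivity times the parameter shift. A minimal sketch (names are hypothetical):

```python
import numpy as np

def linearized_law(value_0, jac_0, zeta_hat_0, zeta_hat):
    # first-order expansion (4.65a, b): stage-0 optimal value plus
    # the off-line sensitivity (Jacobian) times the parameter shift
    d = np.asarray(zeta_hat, float) - np.asarray(zeta_hat_0, float)
    return np.asarray(value_0, float) + np.asarray(jac_0, float) @ d
```

Evaluated per grid point of $[\tilde s_0, \tilde s_f]$, this costs one matrix-vector product per point, which is the "simple matrix operations" claim above.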
II) Sequential linearization of the (AOSTP) process

Given the optimal guiding functions $q_e^{(j)} = q_e^{(j)}(s)$, $\beta^{(j)} = \beta^{(j)}(s)$, $s_j \le s \le s_f$, for the $j$-th stage (Fig. 4.3), corresponding to the representation (4.35a, b), the optimal guiding functions $q_e^{(j+1)}(s), \beta^{(j+1)}(s)$, $s_{j+1} \le s \le s_f$, are represented by

$$q_e^{(j+1)}(s) := q_e^{(j)}(s) + \Delta q_e(s), \quad s_{j+1} \le s \le s_f, \qquad (4.66a)$$

$$\beta^{(j+1)}(s) := \beta^{(j)}(s) + \Delta\beta(s), \quad s_{j+1} \le s \le s_f, \qquad (4.66b)$$

where $s_j < s_{j+1} < s_f$, and $\bigl(\Delta q_e(s), \Delta\beta(s)\bigr)$, $s_{j+1} \le s \le s_f$, are certain (small) changes of the $j$-th stage optimal guiding functions $\bigl(q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\bigr)$.
Fig. 4.3 Sequential linearization of (AOSTP)

Obviously, the linearization technique described in Sect. 4.4.1c can now also be applied to the approximate computation of the optimal changes $\Delta q_e(s), \Delta\beta(s)$, $s_{j+1} \le s \le s_f$, if the following replacements are made in the formulas (4.35a, b), (4.36a–c), (4.37a–h) and (4.38a–f):
formulas (4.35a, b), (4.36a–c), (4.37a–h) and (4.38a–f):
9
s0  s  sf ! sj C1  s  sf >
>
>
>
qe ; ˇ
.j /
! qe ; ˇ .j / >
>
>
>
At0 ! Atj C1 >
>
>
>
.0/ .0/ .j C1/ .j C1/ >
>
g .0/ ;  ; F ! g .j C1/;  ;F >
=
.0/ .0/ .j C1/ .j C1/
Ag;q ;ˇ ; : : : ; H g;q ! Ag;q .j / ;ˇ.j / ; : : : ; H g;q .j / ;ˇ.j /
e e ;ˇ e e >
>

ˇ.s0 / D 0 !
ˇ.sj C1 / D ˇj C1  ˇ .j / .sj C1 / >
>
>
>
.j / >
>

qe .s0 / D q0  q e .s0 / !
qe .sj C1 / D q j C1  qe .sj C1 / >
>
>
>

ˇ.sf / D 0 !
ˇ.sf / D 0 >
>
.0/ .j C1/ .j / >
;

qe .sf / D qf  q e .sf / !
qe .sf / D q f  q f ; if  D 0;
J

(4.66c)
.j C1/
where X denotes the conditional expectation of a random variable X with
respect to Atj C1 , cf. (4.50d), (4.51b). Furthermore, (4.37f) yields
 0
 q
qe.j / .sj C1 / C
qe0 .sj C1 /  ˇj C1 D qP j C1 (4.66d)

which can be approximated, cf. (4.38f’), by


p
ˇ.s / .j /0
ˇ .j / .sj C1 /
qe0 .sj C1 / C 12 p .j /j C1 qe .sj C1 /
ˇ .sj C1 /
p .j /0
(4.66d’)
 qP j C1  ˇ .j / .sj C1 /qe .sj C1 /:

Depending on the chosen substitute problem, this linearization method then yields a variational problem, an optimization problem, resp., for the changes $\bigl(\Delta q_e(s), \Delta\beta(s)\bigr)$, $s_{j+1} \le s \le s_f$, having a linear objective function and/or linear constraints.

To give a typical example, we now consider (AOSTP) on the $(j+1)$-th stage with the substitute problem (4.32a–f). Hence, the functions $g, \gamma$ in (4.39a–c), (4.37a–h), and (4.38a–f) are given by

$$g := \bar L_J, \qquad \gamma := \bar\gamma_J.$$

Applying the linearization techniques developed in Sect. 4.4.1c to (4.32a–f), according to (4.38a–c), (4.37b), for the correction terms $\Delta q_e(s), \Delta\beta(s)$,

sj C1  s  sf , we find the following linear optimization


problem:

Zsf Zsf
.j C1/ .j C1/
min G g;q .j / ;ˇ.j / .s/
qe .s/ds C
T
H g;q .j / ;ˇ.j / .s/
.s/ds (4.67a)
e e
sj C1 sj C1

CRjT
qe .sf / C SjT
qe0 .sf / C Tj
ˇ.sj C1 /

s.t.


qe .sj C1 / D q j C1  qe.j / .sj C1 / (4.67b)
.j C1/ .j /

qe .sf / D q f  q f ; if  J D 0 (4.67c)

ˇ.sf / D 0

ˇ.s/  ˇ .j / .s/; sj C1  s  sf ; (4.67d)

where
.j C1/ .j C1/ .j C1/
Rj WD B g;q .j / ;ˇ.j / .sf /  C g;q .j / ;ˇ.j / .sf / C a .j / .sf / (4.67e)
e e ;qe
.j C1/
Sj WD C g;q .j / ;ˇ.j / .sf / (4.67f)
e

.j /0
1 .j C1/ qe .sj C1 / .j C1/
Tj WD C g;q .j / ;ˇ.j / .sj C1 /T .j /  E g;q .j / ;ˇ.j / .sj C1 /: (4.67g)
2 e ˇ .sj C1 / e

The linear optimization problem (4.67a–g) can now be solved by the methods developed e.g. in [52], where for the correction terms $\Delta q_e(s), \Delta\beta(s)$, $s_{j+1} \le s \le s_f$, some box constraints or norm bounds have to be added to (4.67a–g). In the case of $\Delta\beta(\cdot)$ we may replace, e.g., (4.67d) by the condition

$$-\beta^{(j)}(s) \le \Delta\beta(s) \le \Delta\beta_{\max}, \quad s_{j+1} \le s \le s_f, \qquad (4.67d')$$

with some upper bound $\Delta\beta_{\max}$.

It is easy to see that (4.67a–g) can be split up into two separate linear optimization problems for $\Delta q_e(\cdot), \Delta\beta(\cdot)$, respectively. Hence, according to the simple structure of the objective function (4.67a), we observe that

$$\operatorname{sign}\Bigl(\bar H^{(j+1)}_{g,q_e^{(j)},\beta^{(j)}}(s)\Bigr), \quad s_{j+1} \le s \le s_f, \quad \text{and} \quad \operatorname{sign}(T_j)$$

indicate the points $s$ in the interval $[s_{j+1}, s_f]$ with $\Delta\beta(s) < 0$ or $\Delta\beta(s) > 0$, hence, the points $s$, $s_{j+1} \le s \le s_f$, where the velocity profile should be

decreased/increased. Moreover, using (4.67d’), the optimal correction


ˇ.s/ is
equal to the lower/upper bound in (4.67d’) depending on the above mentioned signs.
Obviously, the correction vectors
qe .s/; sj C1  s  sf , for the geometric
path in configuration space can be determined in the same way. Similar results are
obtained also if we use L2 -norm bounds for the correction terms.
If the pointwise constraints (4.29b, c) in (4.29a–f, f’) are averaged with respect
to s; sj C1  s  sf , then functions of the type (4.36c) arise, cf. (4.29b’, c’), which
can be linearized again by the same techniques as discussed above. In this case,
linear constraints are obtained for
ˇ.s/;
qe .s/; sj C1  s  sf , with constraint
functions of the type (4.38a–c), cf. also (4.67a).
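Since the objective is linear and the constraints are boxes, the pointwise sign rule above is a bang-bang rule. A minimal sketch of that rule, under the assumed box bounds of type (4.67d') (the cost coefficient H is taken pointwise; names are hypothetical):

```python
import numpy as np

def optimal_beta_correction(H, beta_j, dbeta_max):
    # bang-bang sign rule for a linear objective sum H(s)*dbeta(s)
    # under box bounds -beta_j(s) <= dbeta(s) <= dbeta_max: minimize
    # pointwise, so dbeta sits at the lower bound where H(s) > 0
    # (decrease the velocity profile) and at the upper bound otherwise
    H = np.asarray(H, float)
    return np.where(H > 0.0, -np.asarray(beta_j, float), dbeta_max)
```

Any coupling through boundary conditions such as (4.67b, c) would of course restrict this pointwise rule; the sketch only illustrates where the sign information comes from.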

Combination of Discretization and Linearization

Obviously, the methods briefly described in Sect. 4.5.3 can be combined in the following way:

First, by means of discretization (finite element methods), an approximate optimal control law $(\tilde q_e^{*}, \tilde\beta^{*})$ is sought in a class of finitely generated functions of the type (4.63a, b). Corresponding to (4.65a, b), by means of Taylor expansion the optimal coefficients $\hat q_l^{*}, \hat\beta_l^{*}$, $l = 1, \ldots, l_q\ (l_\beta)$, in the corresponding linear combination of type (4.63a, b) are here represented, cf. (4.63c, d), by

$$\hat q_l^{*}(\zeta, \zeta_f) = \hat q_l^{*}\bigl(\zeta_0, \zeta_f^{(0)}\bigr) + \frac{\partial \hat q_l^{*}}{\partial\hat\zeta}\bigl(\zeta_0, \zeta_f^{(0)}\bigr)\bigl(\zeta - \zeta_0, \zeta_f - \zeta_f^{(0)}\bigr) + \ldots, \qquad (4.68a)$$

$$\hat\beta_l^{*}(\zeta, \zeta_f) = \hat\beta_l^{*}\bigl(\zeta_0, \zeta_f^{(0)}\bigr) + \frac{\partial \hat\beta_l^{*}}{\partial\hat\zeta}\bigl(\zeta_0, \zeta_f^{(0)}\bigr)\bigl(\zeta - \zeta_0, \zeta_f - \zeta_f^{(0)}\bigr) + \ldots, \qquad (4.68b)$$

$l = 1, \ldots, l_q\ (l_\beta)$. Here, the derivatives

$$\frac{\partial \hat q_l^{*}}{\partial\hat\zeta}\bigl(\zeta_0, \zeta_f^{(0)}\bigr), \quad \frac{\partial \hat\beta_l^{*}}{\partial\hat\zeta}\bigl(\zeta_0, \zeta_f^{(0)}\bigr), \quad \ldots, \quad l = 1, \ldots, l_q\ (l_\beta), \qquad (4.68c)$$

can again be determined by sensitivity analysis [159] of a finite-dimensional parameter-dependent optimization problem, which may be much simpler than the sensitivity analysis of the parameter-dependent variational problem (4.57a–f) or (4.58a–d).

Stating the necessary (and, under additional conditions, also sufficient) Kuhn–Tucker conditions for the optimal coefficients $\hat q_l^{*}, \hat\beta_l^{*}$, $l = 1, \ldots, l_q\ (l_\beta)$, formulas for the derivatives (4.68c) may be obtained by partial differentiation with respect to the complete vector $z = (\zeta, \zeta_f)$ of initial/terminal parameters.

4.6 Online Control Corrections: PD-Controller

We now consider the control of the robot at the $j$-th stage, i.e., for time $t \ge t_j$, see [4, 6, 11, 20, 58, 61]. In practice we have random variations of the vector $p$ of the model parameters of the robot and its environment; moreover, there are possible deviations of the true initial state $(q_j, \dot q_j) := \bigl(q(t_j), \dot q(t_j)\bigr)$ in configuration space from the corresponding initial values $(\bar q_j, \dot{\bar q}_j) = \Bigl(\bar q_j,\ q_e^{(j)\prime}(s_j)\sqrt{\beta_j}\Bigr)$ of the (OSTP) at stage $j$. Thus, the actual trajectory

$$q(t) = q\bigl(t; p_D, q_j, \dot q_j, u(\cdot)\bigr), \quad t \ge t_j, \qquad (4.69a)$$

in configuration space of the robot will deviate more or less from the optimal reference trajectory

$$q^{(j)}(t) = q_e^{(j)}\bigl(s^{(j)}(t)\bigr) = q\bigl(t; \bar p_D^{(j)}, \bar q_j, \dot{\bar q}_j, u^{(j)}(\cdot)\bigr), \qquad (4.69b)$$

see (4.44a, b), (4.45a–f), (4.49a, b) and (4.50c). In the following we assume that the state $\bigl(q(t), \dot q(t)\bigr)$ in configuration space may be observed for $t > t_j$. Now, in order to define an appropriate control correction (feedback control law), see (4.2) and Fig. 4.1,

$$\Delta u^{(j)}(t) = u(t) - u^{(j)}(t) := \varphi^{(j)}\bigl(t, \Delta z^{(j)}(t)\bigr), \quad t \ge t_j, \qquad (4.70a)$$

for the compensation of the tracking error

$$\Delta z^{(j)}(t) := z(t) - z^{(j)}(t), \quad z(t) := \begin{pmatrix} q(t) \\ \dot q(t) \end{pmatrix}, \quad z^{(j)}(t) := \begin{pmatrix} q^{(j)}(t) \\ \dot q^{(j)}(t) \end{pmatrix}, \qquad (4.70b)$$

where $\varphi^{(j)} = \varphi^{(j)}(t, \Delta q, \Delta\dot q)$ is a function such that

$$\varphi^{(j)}(t, 0, 0) = 0 \quad \text{for all}\ t \ge t_j, \qquad (4.70c)$$

the trajectories $q(t)$ and $q^{(j)}(t)$, $t \ge t_j$, are embedded into a one-parameter family of trajectories $q = q(t, \epsilon)$, $t \ge t_j$, $0 \le \epsilon \le 1$, in configuration space, which are defined as follows:

Consider first the following initial data for stage $j$:

$$q_j(\epsilon) := \bar q_j + \epsilon\,\Delta q_j, \quad \Delta q_j := q_j - \bar q_j, \qquad (4.71a)$$

$$\dot q_j(\epsilon) := \dot{\bar q}_j + \epsilon\,\Delta\dot q_j, \quad \Delta\dot q_j := \dot q_j - \dot{\bar q}_j, \qquad (4.71b)$$

$$p_D(\epsilon) := \bar p_D^{(j)} + \epsilon\,\Delta p_D, \quad \Delta p_D := p_D - \bar p_D^{(j)}, \quad 0 \le \epsilon \le 1. \qquad (4.71c)$$

Moreover, define the control input $u(t)$, $t \ge t_j$, by (4.70a), hence,

$$u(t) = u^{(j)}(t) + \Delta u^{(j)}(t) = u^{(j)}(t) + \varphi^{(j)}\bigl(t, q(t) - q^{(j)}(t), \dot q(t) - \dot q^{(j)}(t)\bigr), \quad t \ge t_j. \qquad (4.71d)$$

Let then

$$q(t, \epsilon) = q\bigl(t; p_D(\epsilon), q_j(\epsilon), \dot q_j(\epsilon), u(\cdot)\bigr), \quad 0 \le \epsilon \le 1, \quad t \ge t_j, \qquad (4.72)$$

denote the solution of the following initial value problem, consisting of the dynamic equation (4.4a) with the initial values, the vector of dynamic parameters and the total control input $u(t)$ given by (4.71a–d):

$$F\bigl(p_D(\epsilon), q(t, \epsilon), \dot q(t, \epsilon), \ddot q(t, \epsilon)\bigr) = u(t, \epsilon), \quad 0 \le \epsilon \le 1, \quad t \ge t_j, \qquad (4.73a)$$

where

$$q(t_j, \epsilon) = q_j(\epsilon), \qquad \dot q(t_j, \epsilon) = \dot q_j(\epsilon), \qquad (4.73b)$$

$$u(t, \epsilon) := u^{(j)}(t) + \varphi^{(j)}\bigl(t, q(t, \epsilon) - q^{(j)}(t), \dot q(t, \epsilon) - \dot q^{(j)}(t)\bigr), \qquad (4.73c)$$

and $F = F(p_D, q, \dot q, \ddot q)$ is defined, cf. (2.6a), by

$$F(p_D, q, \dot q, \ddot q) := M(p_D, q)\,\ddot q + h(p_D, q, \dot q). \qquad (4.73d)$$

In the following we suppose [76] that the initial value problem (4.73a–d) has a unique solution $q = q(t, \epsilon)$, $t \ge t_j$, for each parameter value $\epsilon$, $0 \le \epsilon \le 1$ (Fig. 4.4).

4.6.1 Basic Properties of the Embedding $q(t, \epsilon)$

$\epsilon = \epsilon_0 := 0$

Because of condition (4.70c) on the feedback control law $\varphi^{(j)}$ to be determined, and due to the unique solvability assumption for the initial value problem (4.73a–d) at the $j$-th stage, for $\epsilon = 0$ we have that

$$q(t, 0) = q^{(j)}(t), \quad t \ge t_j. \qquad (4.74a)$$



Fig. 4.4 Adaptive optimal stochastic trajectory planning and control (AOSTPC)

$\epsilon = \epsilon_1 := 1$

According to (4.69a), (4.70a–c) and (4.71a–d),

$$q(t, 1) = q(t) = q\bigl(t; p_D, q_j, \dot q_j, u(\cdot)\bigr), \quad t \ge t_j, \qquad (4.74b)$$

is the actual trajectory in configuration space under the total control input $u(t) = u^{(j)}(t) + \Delta u^{(j)}(t)$, $t \ge t_j$, given by (4.71d).

Taylor Expansion with Respect to $\epsilon$

Let $\Delta\epsilon = \epsilon_1 - \epsilon_0 = 1$, and suppose that the following property, known from parameter-dependent differential equations, cf. [76], holds:

Assumption 4.1 The solution $q = q(t, \epsilon)$, $t \ge t_j$, $0 \le \epsilon \le 1$, of the initial value problem (4.73a–d) has continuous derivatives with respect to $\epsilon$ up to order $\nu > 1$ for all $t_j \le t \le t_j + \Delta t_j$, $0 \le \epsilon \le 1$, with a certain $\Delta t_j > 0$.

Note that $(t, \epsilon) \to q(t, \epsilon)$, $t \ge t_j$, $0 \le \epsilon \le 1$, can be interpreted as a homotopy from the reference trajectory $q^{(j)}(t)$ to the actual trajectory $q(t)$, $t \ge t_j$, cf. [134].

Based on the above assumption and (4.74a, b), by Taylor expansion with respect to $\epsilon$ at $\epsilon = \epsilon_0 = 0$, the actual trajectory of the robot can be represented by

$$q(t) = q\bigl(t; p_D, q_j, \dot q_j, u(\cdot)\bigr) = q(t, 1) = q(t, \epsilon_0 + \Delta\epsilon) = q(t, 0) + \Delta q(t) = q^{(j)}(t) + \Delta q(t), \qquad (4.75a)$$

where the expansion of the tracking error $\Delta q(t)$, $t \ge t_j$, is given by

$$\Delta q(t) = \sum_{l=1}^{\nu-1} \frac{1}{l!}\, d^l q(t)\,(\Delta\epsilon)^l + \frac{1}{\nu!}\,\frac{\partial^\nu q}{\partial\epsilon^\nu}(t, \vartheta)\,(\Delta\epsilon)^\nu = \sum_{l=1}^{\nu-1} \frac{1}{l!}\, d^l q(t) + \frac{1}{\nu!}\,\frac{\partial^\nu q}{\partial\epsilon^\nu}(t, \vartheta), \quad t \ge t_j. \qquad (4.75b)$$

Here, $\vartheta = \vartheta(t, \Delta\epsilon)$, $0 < \vartheta < 1$, and

$$d^l q(t) := \frac{\partial^l q}{\partial\epsilon^l}(t, 0), \quad t \ge t_j, \quad l = 1, 2, \ldots, \qquad (4.75c)$$

denote the $l$-th order differentials of $q = q(t, \epsilon)$ with respect to $\epsilon$ at $\epsilon = \epsilon_0 = 0$. Obviously, differential equations for the differentials $d^l q(t)$, $l = 1, 2, \ldots$, may be obtained, cf. [76], by successive differentiation of the initial value problem (4.73a–d) with respect to $\epsilon$ at $\epsilon_0 = 0$.

Furthermore, based on the Taylor expansion of the tracking error $\Delta q(t)$, $t \ge t_j$, and using some stability requirements, the tensorial coefficients $D_z^l \varphi^{(j)}(t, 0)$, $l = 1, 2, \ldots$, of the Taylor expansion

$$\varphi^{(j)}(t, \Delta z) = \sum_{l=1}^{\infty} D_z^l \varphi^{(j)}(t, 0) \cdot (\Delta z)^l \qquad (4.75d)$$

of the feedback control law $\varphi^{(j)} = \varphi^{(j)}(t, \Delta z)$ can be determined at the same time.

4.6.2 The 1st Order Differential $dq$

Next we have to introduce some definitions. Corresponding to (4.70b) and (4.72) we put

$$z(t, \epsilon) := \begin{pmatrix} q(t, \epsilon) \\ \dot q(t, \epsilon) \end{pmatrix}, \quad t \ge t_j, \quad 0 \le \epsilon \le 1; \qquad (4.76a)$$

then, we define the following Jacobians of the function $F$ given by (4.73d):

$$K(p_D, q, \dot q, \ddot q) := F_q(p_D, q, \dot q, \ddot q) = D_q F(p_D, q, \dot q, \ddot q), \qquad (4.76b)$$

$$D(p_D, q, \dot q) := F_{\dot q}(p_D, q, \dot q, \ddot q) = h_{\dot q}(p_D, q, \dot q). \qquad (4.76c)$$

Moreover, it is

$$M(p_D, q) = F_{\ddot q}(p_D, q, \dot q, \ddot q), \qquad (4.76d)$$

and, due to the linear parameterization property of robots, see Remark 4.2, $F$ may be represented by

$$F(p_D, q, \dot q, \ddot q) = Y(q, \dot q, \ddot q)\,p_D \qquad (4.76e)$$

with a certain matrix function $Y = Y(q, \dot q, \ddot q)$.

By differentiation of (4.73a–d) with respect to $\epsilon$, for the partial derivative $\frac{\partial q}{\partial\epsilon}(t, \epsilon)$ of $q = q(t, \epsilon)$ with respect to $\epsilon$ we find, cf. (4.70b), the following linear initial value problem (error differential equation)

$$Y\bigl(q(t,\epsilon), \dot q(t,\epsilon), \ddot q(t,\epsilon)\bigr)\,\Delta p_D + K\bigl(p_D(\epsilon), q(t,\epsilon), \dot q(t,\epsilon), \ddot q(t,\epsilon)\bigr)\,\frac{\partial q}{\partial\epsilon}(t,\epsilon) + D\bigl(p_D(\epsilon), q(t,\epsilon), \dot q(t,\epsilon)\bigr)\,\frac{d}{dt}\frac{\partial q}{\partial\epsilon}(t,\epsilon) + M\bigl(p_D(\epsilon), q(t,\epsilon)\bigr)\,\frac{d^2}{dt^2}\frac{\partial q}{\partial\epsilon}(t,\epsilon) = \frac{\partial u}{\partial\epsilon}(t,\epsilon) = \frac{\partial\varphi^{(j)}}{\partial z}\bigl(t, \Delta z^{(j)}(t)\bigr)\,\frac{\partial z}{\partial\epsilon}(t,\epsilon) \qquad (4.77a)$$

with the initial values, see (4.71a, b),

$$\frac{\partial q}{\partial\epsilon}(t_j, \epsilon) = \Delta q_j, \qquad \frac{d}{dt}\frac{\partial q}{\partial\epsilon}(t_j, \epsilon) = \Delta\dot q_j. \qquad (4.77b)$$

Putting now $\varepsilon = \varepsilon_0 = 0$, because of (4.70a, b) and (4.74a), system (4.77a, b) then yields this system of 2nd order differential equations for the 1st order differential $dq(t) = \frac{\partial q}{\partial\varepsilon}(t,0)$:
$$\begin{aligned} &Y^{(j)}(t)\Delta p_D + K^{(j)}(t)\,dq(t) + D^{(j)}(t)\,d\dot q(t) + M^{(j)}(t)\,d\ddot q(t)\\ &= du(t) = \varphi_z^{(j)}(t,0)\,dz(t) = \varphi_q^{(j)}(t,0)\,dq(t) + \varphi_{\dot q}^{(j)}(t,0)\,d\dot q(t),\quad t\ge t_j,\end{aligned}\tag{4.78a}$$
with the initial values
$$dq(t_j) = \Delta q_j,\qquad d\dot q(t_j) = \Delta\dot q_j.\tag{4.78b}$$
Here,
$$du(t) := \frac{\partial u}{\partial\varepsilon}(t,0),\tag{4.78c}$$
$$dz(t) := \begin{pmatrix} dq(t)\\ d\dot q(t)\end{pmatrix},\qquad d\dot q := \frac{d}{dt}dq,\qquad d\ddot q := \frac{d^2}{dt^2}dq,\tag{4.78d}$$
and the matrices $Y^{(j)}(t), K^{(j)}(t), D^{(j)}(t)$ and $M^{(j)}(t)$ are defined, cf. (4.76b–e), by
$$Y^{(j)}(t) := Y\bigl(q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr)\tag{4.78e}$$
$$K^{(j)}(t) := K\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr)\tag{4.78f}$$
$$D^{(j)}(t) := D\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t)\bigr),\qquad M^{(j)}(t) := M\bigl(\bar p_D^{(j)},q^{(j)}(t)\bigr).\tag{4.78g}$$

Local (PD-)control corrections $du = du(t)$ stabilizing system (4.78a, b) can now be obtained by the following definition of the Jacobian of $\varphi^{(j)}(t,\Delta z)$ with respect to $\Delta z$ at $\Delta z = 0$:
$$\begin{aligned} \varphi_z^{(j)}(t,0) &:= F_z\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr) - M^{(j)}(t)(K_p,K_d)\\ &= \bigl(K^{(j)}(t)-M^{(j)}(t)K_p,\ D^{(j)}(t)-M^{(j)}(t)K_d\bigr),\end{aligned}\tag{4.79}$$
where $K_p = (p_k\delta_{kl}),\ K_d = (d_k\delta_{kl})$ are positive definite diagonal matrices with positive diagonal elements $p_k, d_k > 0,\ k = 1,\ldots,n$.
Inserting (4.79) into (4.78a), and assuming that $M^{(j)} = M^{(j)}(t)$ is regular [11] for $t\ge t_j$, we find the following linear system of 2nd order differential equations for $dq = dq(t)$:
$$d\ddot q(t) + K_d\,d\dot q(t) + K_p\,dq(t) = -M^{(j)}(t)^{-1}Y^{(j)}(t)\Delta p_D,\quad t\ge t_j,\tag{4.80a}$$
$$dq(t_j) = \Delta q_j,\qquad d\dot q(t_j) = \Delta\dot q_j.\tag{4.80b}$$
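The decay behavior described by (4.80a, b) can be illustrated numerically. The sketch below integrates the scalar analogue $d\ddot q + k_d\,d\dot q + k_p\,dq = \gamma$ for a single joint with an explicit Euler scheme; the gain values, the constant disturbance level $\gamma$ (standing in for one component of $-M^{(j)}(t)^{-1}Y^{(j)}(t)\Delta p_D$) and the step size are illustrative assumptions, not taken from the text.

```python
# Scalar sketch of the error equation (4.80a):
#   dq'' + k_d*dq' + k_p*dq = gamma,
# with gamma standing in for one component of -M^{-1} Y * Delta p_D.
# Gains, disturbance level and step size are illustrative assumptions.

def simulate_pd_error(kp, kd, gamma, dq0, dqdot0, dt=1e-3, t_end=10.0):
    """Explicit Euler integration of dq'' + kd*dq' + kp*dq = gamma."""
    dq, dqdot = dq0, dqdot0
    for _ in range(int(t_end / dt)):
        dqddot = gamma - kd * dqdot - kp * dq
        dq, dqdot = dq + dt * dqdot, dqdot + dt * dqddot
    return dq, dqdot

# With positive gains the transient decays exponentially; a constant
# disturbance leaves only the static offset gamma / k_p.
dq_final, dqdot_final = simulate_pd_error(kp=25.0, kd=10.0, gamma=0.5,
                                          dq0=0.2, dqdot0=-0.1)
```

With positive definite (here scalar) gains the homogeneous part decays and only the disturbance-induced offset $\gamma/k_p$ remains; this is the effect bounded in the mean by Theorem 4.6.1.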

Considering the right hand side of (4.80a), according to (4.78e), (4.76e) and (4.73d) we have that
$$\begin{aligned} Y^{(j)}(t)\Delta p_D &= Y\bigl(q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr)\Delta p_D = F\bigl(\Delta p_D,q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr)\\ &= M\bigl(\Delta p_D,q^{(j)}(t)\bigr)\ddot q^{(j)}(t) + h\bigl(\Delta p_D,q^{(j)}(t),\dot q^{(j)}(t)\bigr).\end{aligned}\tag{4.81a}$$
Using then definition (4.49a, b) of $q^{(j)}(t)$ and the representation (4.18a, b) of $\dot q^{(j)}(t),\ \ddot q^{(j)}(t)$, we get
$$\begin{aligned} Y^{(j)}(t)\Delta p_D &= M\Bigl(\Delta p_D,q_e^{(j)}\bigl(s^{(j)}(t)\bigr)\Bigr)\biggl(q_e^{(j)\prime}\bigl(s^{(j)}(t)\bigr)\frac{\beta^{(j)\prime}\bigl(s^{(j)}(t)\bigr)}{2} + q_e^{(j)\prime\prime}\bigl(s^{(j)}(t)\bigr)\beta^{(j)}\bigl(s^{(j)}(t)\bigr)\biggr)\\ &\quad + h\Bigl(\Delta p_D,\,q_e^{(j)}\bigl(s^{(j)}(t)\bigr),\,q_e^{(j)\prime}\bigl(s^{(j)}(t)\bigr)\sqrt{\beta^{(j)}\bigl(s^{(j)}(t)\bigr)}\Bigr).\end{aligned}\tag{4.81b}$$

From (4.19b) we now obtain the following important representations, where we suppose that the feedforward control $u^{(j)}(t),\ t\ge t_j$, is given by (4.50c, d).

Lemma 4.1 The following representations hold:
a)
$$Y^{(j)}(t)\Delta p_D = u_e\bigl(\Delta p_D,s^{(j)}(t);q_e^{(j)}(\cdot),\beta^{(j)}(\cdot)\bigr),\quad t\ge t_j;\tag{4.82a}$$
b)
$$u^{(j)}(t) = u_e\bigl(\bar p_D^{(j)},s^{(j)}(t);q_e^{(j)}(\cdot),\beta^{(j)}(\cdot)\bigr),\quad t\ge t_j;\tag{4.82b}$$
c)
$$u^{(j)}(t) + Y^{(j)}(t)\Delta p_D = u_e\bigl(p_D,s^{(j)}(t);q_e^{(j)}(\cdot),\beta^{(j)}(\cdot)\bigr),\quad t\ge t_j.\tag{4.82c}$$
Proof The first equation follows from (4.81b) and (4.19b). Equations (4.50c), (4.18a, b) and (4.19b) yield (4.82b). Finally, (4.82c) follows from (4.82a, b) and the linear parameterization of robots, cf. Remark 4.2.
Remark 4.12 Note that according to the transformation (4.19a) of the dynamic equation onto the $s$-domain, for the control input $u(t)$ we have the representation
$$\begin{aligned} u(t) &= u_e\bigl(p_D,s;q_e(\cdot),\beta(\cdot)\bigr)\\ &= u_e\bigl(\bar p_D^{(j)},s;q_e(\cdot),\beta(\cdot)\bigr) + u_e\bigl(\Delta p_D,s;q_e(\cdot),\beta(\cdot)\bigr)\end{aligned}\tag{4.82d}$$
with $s = s(t)$.

Using (4.78d), it is easy to see that (4.80a, b) can also be described by the 1st order initial value problem
$$d\dot z(t) = A\,dz(t) + \begin{pmatrix} 0\\ \varrho^{(j,1)}(t)\end{pmatrix},\quad t\ge t_j\tag{4.83a}$$
$$dz(t_j) = \Delta z_j = \begin{pmatrix} q_j-\bar q_j\\ \dot q_j-\bar{\dot q}_j\end{pmatrix},\tag{4.83b}$$
where $A$ is the stability or Hurwitz matrix
$$A := \begin{pmatrix} 0 & I\\ -K_p & -K_d\end{pmatrix},\tag{4.83c}$$
and $\varrho^{(j,1)}(t)$ is defined, cf. (4.82a), by
$$\varrho^{(j,1)}(t) := -M^{(j)}(t)^{-1}Y^{(j)}(t)\Delta p_D = -M^{(j)}(t)^{-1}u_e\bigl(\Delta p_D,s^{(j)}(t);q_e^{(j)}(\cdot),\beta^{(j)}(\cdot)\bigr).\tag{4.83d}$$
Consequently, for the first order expansion term $dz(t)$ of the deviation $\Delta z^{(j)}(t)$ between the actual state $z(t) = \bigl(q(t),\dot q(t)\bigr)^\top$ and the prescribed state $z^{(j)}(t) = \bigl(q^{(j)}(t),\dot q^{(j)}(t)\bigr)^\top,\ t\ge t_j$, we have the representation [32, 76]
$$dz(t) = dz^{(j)}(t) = e^{A(t-t_j)}\Delta z_j + \int_{t_j}^t e^{A(t-\tau)}\begin{pmatrix} 0\\ \varrho^{(j,1)}(\tau)\end{pmatrix}d\tau.\tag{4.84a}$$
Because of $E\bigl(\Delta p_D(\omega)\,|\,\mathcal A_{t_j}\bigr) = 0$, we have that
$$E\bigl(\varrho^{(j,1)}(t)\,\big|\,\mathcal A_{t_j}\bigr) = 0,\tag{4.84b}$$
$$E\bigl(dz(t)\,\big|\,\mathcal A_{t_j}\bigr) = e^{A(t-t_j)}E\bigl(\Delta z_j\,\big|\,\mathcal A_{t_j}\bigr),\quad t\ge t_j,\tag{4.84c}$$
where, see (4.83a, b),
$$E\bigl(\Delta z_j\,\big|\,\mathcal A_{t_j}\bigr) = E\bigl(z(t_j)\,\big|\,\mathcal A_{t_j}\bigr) - \bar z_j.\tag{4.84d}$$

It is easy to see that the diagonal elements $d_k, p_k > 0,\ k = 1,\ldots,n$, of the positive definite diagonal matrices $K_d, K_p$, resp., can be chosen so that the fundamental matrix $\Phi(t,\tau) = e^{A(t-\tau)},\ t\ge\tau$, is exponentially stable, i.e.
$$\|\Phi(t,\tau)\| \le a_0e^{-\gamma_0(t-\tau)},\quad t\ge\tau,\tag{4.85a}$$
with positive constants $a_0, \gamma_0$. A sufficient condition for (4.85a) reads
$$d_k, p_k > 0,\ k = 1,\ldots,n,\ \text{and}\ d_k \ne 2\sqrt{p_k},\ \text{excluding double eigenvalues of}\ A.\tag{4.85b}$$
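For diagonal $K_p = \operatorname{diag}(p_k),\ K_d = \operatorname{diag}(d_k)$ the Hurwitz matrix (4.83c) decouples joint by joint, and its eigenvalues are the roots of $\lambda^2 + d_k\lambda + p_k = 0$, so a stability check needs only the quadratic formula. The gain values in the following sketch are illustrative assumptions:

```python
# For diagonal K_p = diag(p_k), K_d = diag(d_k), the eigenvalues of the
# Hurwitz matrix A = [[0, I], [-K_p, -K_d]] in (4.83c) decouple into the
# per-joint roots of lambda^2 + d_k*lambda + p_k = 0.
import cmath

def pd_eigenvalues(p_gains, d_gains):
    """Per-joint eigenvalues of the block matrix A for diagonal gains."""
    eigs = []
    for p, d in zip(p_gains, d_gains):
        disc = cmath.sqrt(d * d - 4.0 * p)
        eigs.extend([(-d + disc) / 2.0, (-d - disc) / 2.0])
    return eigs

eigs = pd_eigenvalues(p_gains=[25.0, 4.0, 100.0], d_gains=[10.0, 1.0, 2.0])
stable = all(lam.real < 0 for lam in eigs)  # negative real parts => (4.85a)
```

All real parts being negative is exactly what makes the exponential bound (4.85a) available.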
Define the generalized variance $\operatorname{var}(Z\,|\,\mathcal A_{t_j})$ of a random vector $Z = Z(\omega)$ given $\mathcal A_{t_j}$ by $\operatorname{var}(Z\,|\,\mathcal A_{t_j}) := E\bigl(\|Z-E(Z\,|\,\mathcal A_{t_j})\|^2\,\big|\,\mathcal A_{t_j}\bigr)$, and let $\sigma_Z^{(j)} := \sqrt{\operatorname{var}(Z\,|\,\mathcal A_{t_j})}$. Then, for the behavior of the 1st order error term $dz(t),\ t\ge t_j$, we have the following result:
Theorem 4.6.1 Suppose that the diagonal matrices $K_d, K_p$ are selected such that (4.85a) holds. Moreover, apply the local (i.e. first order) control correction (PD-controller)
$$du(t) := \varphi_z^{(j)}(t,0)\,dz(t),\tag{4.86a}$$
where $\varphi_z^{(j)}(t,0)$ is defined by (4.79). Then, the following relations hold:
a) Asymptotic local stability in the mean:
$$E\bigl(dz(t)\,\big|\,\mathcal A_{t_j}\bigr) \to 0,\quad t\to\infty;\tag{4.86b}$$
b) Mean absolute 1st order tracking error:
$$\begin{aligned} E\bigl(\|dz(t)\|\,\big|\,\mathcal A_{t_j}\bigr) &\le a_0e^{-\gamma_0(t-t_j)}\sqrt{\sigma_{z(t_j)}^{(j)2} + \bigl\|E\bigl(z(t_j)\,|\,\mathcal A_{t_j}\bigr)-\bar z_j\bigr\|^2}\\ &\quad + \int_{t_j}^t a_0e^{-\gamma_0(t-\tau)}\sqrt{E\bigl(\|\varrho^{(j,1)}(\tau)\|^2\,\big|\,\mathcal A_{t_j}\bigr)}\,d\tau,\quad t\ge t_j,\end{aligned}\tag{4.86c}$$
where
$$E\bigl(\|\varrho^{(j,1)}(t)\|^2\,\big|\,\mathcal A_{t_j}\bigr) \le \|M^{(j)}(t)^{-1}\|^2\,\sigma_{u_e}^{(j)2}\bigl(s^{(j)}(t)\bigr),\tag{4.86d}$$
$$\sigma_{u_e}^{(j)2}\bigl(s^{(j)}(t)\bigr) \le \|Y^{(j)}(t)\|^2\operatorname{var}\bigl(p_D(\cdot)\,|\,\mathcal A_{t_j}\bigr)\tag{4.86e}$$
with
$$\sigma_{u_e}^{(j)}(s) := \sqrt{\operatorname{var}\Bigl(u_e\bigl(p_D(\cdot),s;q_e^{(j)}(\cdot),\beta^{(j)}(\cdot)\bigr)\,\Big|\,\mathcal A_{t_j}\Bigr)},\quad s_j\le s\le s_f.\tag{4.86f}$$

Proof Follows from (4.84a), (4.82a–d) and the fact that by Jensen's inequality $E\sqrt{X(\omega)} \le \sqrt{EX(\omega)}$ for a nonnegative random variable $X = X(\omega)$.
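The Jensen step used in the proof can be checked on sample data; the uniform sample below is an arbitrary illustrative choice, and the inequality holds for any nonnegative sample by concavity of the square root:

```python
# Sample check of the Jensen step E[sqrt(X)] <= sqrt(E[X]) for a
# nonnegative random variable; the uniform sample is an arbitrary choice.
import random

random.seed(0)
xs = [random.uniform(0.0, 4.0) for _ in range(10000)]
mean_sqrt = sum(x ** 0.5 for x in xs) / len(xs)   # E[sqrt(X)]
sqrt_mean = (sum(xs) / len(xs)) ** 0.5            # sqrt(E[X])
```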
 
Note that $\sigma_{u_e}^{(j)2}\bigl(s^{(j)}(t)\bigr)$ can be interpreted as the risk of the feedforward control $u^{(j)}(t),\ t\ge t_j$. Using (4.69b), (4.78g), (4.86d, e) and then changing variables $\tau\to s$ in the integral in (4.86c), we obtain the following result:
Theorem 4.6.2 Let $t^{(j)} = t^{(j)}(s),\ s\ge s_j$, denote the inverse of the parameter transformation $s^{(j)} = s^{(j)}(t),\ t\ge t_j$. Under the assumptions of Theorem 4.6.1, the following inequality holds for $t_j\le t\le t_f^{(j)}$:
$$\begin{aligned} E\bigl(\|dz(t)\|\,\big|\,\mathcal A_{t_j}\bigr) &\le a_0e^{-\gamma_0(t-t_j)}\sqrt{\sigma_{z(t_j)}^{(j)2} + \bigl\|E\bigl(z(t_j)\,|\,\mathcal A_{t_j}\bigr)-\bar z_j\bigr\|^2}\\ &\quad + \int_{s_j}^{s^{(j)}(t)}\frac{a_0e^{-\gamma_0\left(t-t^{(j)}(s)\right)}\,\bigl\|M\bigl(\bar p_D^{(j)},q_e^{(j)}(s)\bigr)^{-1}\bigr\|}{\sqrt{\beta^{(j)}(s)}}\,\sigma_{u_e}^{(j)}(s)\,ds.\end{aligned}\tag{4.87a}$$
The minimality or boundedness of the right hand side of (4.87a), hence the robustness [43] of the present control scheme, is shown next:

Corollary 4.6.1 The meaning of the above inequality (4.87a) follows from the following important minimality/boundedness properties depending on the chosen substitute problem in (OSTP) for the trajectory planning problem under stochastic uncertainty:
i) The error contribution of the initial value $\Delta z_j$ takes a minimum for $\bar z_j := E\bigl(z(t_j)\,|\,\mathcal A_{t_j}\bigr)$, cf. (4.44a, b).
ii) The factor $\gamma_0$ can be increased by an appropriate selection of the matrices $K_p, K_d$;
iii)
$$\underline c_M \le \bigl\|M\bigl(\bar p_D^{(j)},q_e^{(j)}(s)\bigr)^{-1}\bigr\| \le \bar c_M,\quad s_j\le s\le s_f,\tag{4.87b}$$
with positive constants $\underline c_M, \bar c_M > 0$. This follows from the fact that the mass matrix is always positive definite [11].

iv)
$$\int_{s_j}^{s^{(j)}(t)}\frac{ds}{\sqrt{\beta^{(j)}(s)}} \le \int_{s_j}^{s_f}\frac{ds}{\sqrt{\beta^{(j)}(s)}} = t_f^{(j)} - t_j,\tag{4.87c}$$
where according to (OSTP), for minimum-time and related substitute problems, the right hand side is a minimum.
v) Depending on the chosen substitute problem in (OSTP), the generalized variance $\sigma_{u_e}^{(j)}(s),\ s_j\le s\le s_f$, is bounded pointwise by an appropriate upper risk level, or $\sigma_{u_e}^{(j)}(\cdot)$ is minimized in a certain weighted mean sense.
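The time identity in (4.87c), namely that the traversal time of $[s_j,s_f]$ under the speed profile $\beta = \dot s^2$ equals $\int ds/\sqrt{\beta(s)}$, can be verified numerically. The profile $\beta(s)$ below is an arbitrary strictly positive example, not one from the text:

```python
# Check of (4.87c): with beta(s) = sdot^2, the time to traverse [s_j, s_f]
# equals the integral of ds / sqrt(beta(s)).  The profile beta is an
# arbitrary strictly positive example.
import math

def beta(s):
    return 1.0 + 0.5 * math.sin(s)

s_j, s_f, n = 0.0, 2.0, 20000

# Midpoint quadrature of ds / sqrt(beta(s)).
h = (s_f - s_j) / n
t_quad = sum(h / math.sqrt(beta(s_j + (i + 0.5) * h)) for i in range(n))

# Traversal time from integrating sdot = sqrt(beta(s)) directly (Euler).
dt, s, t_ode = 1e-4, s_j, 0.0
while s < s_f:
    s += dt * math.sqrt(beta(s))
    t_ode += dt
```

Both computations agree up to discretization error, confirming the change of variables behind (4.87a) and (4.87c).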
For the minimality or boundedness of the generalized variance $\sigma_{u_e}^{(j)2}(s),\ s_j\le s\le s_f$, mentioned above, we give the following examples:
Working with the probabilistic control constraints (4.30a) and assuming that the vectors $u^c$ and $\rho_u$ are fixed, see (4.30d), according to (4.30f) we find that (4.30a) can be guaranteed by
$$\sigma_{u_e}^{(j)2}(s) + \bigl\|\bar u_e^{(j)}(s)-u^c\bigr\|^2 \le (1-\alpha_u)\min_{1\le k\le n}\rho_{u_k}^2,\quad s_j\le s\le s_f,\tag{4.87d}$$
where $\bar u_e^{(j)}(s) := u_e\bigl(\bar p_D^{(j)},s;q_e^{(j)}(\cdot),\beta^{(j)}(\cdot)\bigr)$. Hence, with (4.87d) we then have the condition
$$\sigma_{u_e}^{(j)2}(s) \le (1-\alpha_u)\min_{1\le k\le n}\rho_{u_k}^2,\quad s_j\le s\le s_f,\tag{4.87d'}$$
cf. (4.87a). Under special distribution assumptions for $p_D(\omega)$, more exact explicit deterministic conditions for (4.30a) may be derived, see Remark 4.2.
If minimum force and moment should be achieved along the trajectory, hence, if in (4.6) the terminal cost term vanishes and $L = \|u(t)\|^2$, then, according to substitute problem (4.29a–f, f') we have the following minimality property:
$$\int_{s_j}^{s_f}\Bigl(\sigma_{u_e}^{(j)2}(s) + \bigl\|\bar u_e^{(j)}(s)\bigr\|^2\Bigr)\frac{ds}{\sqrt{\beta^{(j)}(s)}} = \min_{q_e(\cdot),\beta(\cdot)} E\Bigl(\int_{t_j}^{t_f}\|u(t)\|^2\,dt\,\Big|\,\mathcal A_{t_j}\Bigr).\tag{4.87e}$$
Mean/variance condition for $u_e$: Condition (4.29c) in substitute problem (4.29a–f, f') may read in case of fixed bounds $u^{\min}, u^{\max}$ for the control $u(t)$ as follows:
$$u^{\min} \le u_e\bigl(\bar p_D^{(j)},s;q_e(\cdot),\beta(\cdot)\bigr) \le u^{\max},\quad s_j\le s\le s_f,\tag{4.87f}$$
$$\sigma_{u_e}^{(j)}(s) \le \sigma_{u_e}^{\max},\quad s_j\le s\le s_f,\tag{4.87g}$$
with a given upper bound $\sigma_{u_e}^{\max}$, cf. (4.87d').

According to the above Theorem 4.6.2, further stability results, especially the convergence
$$E\bigl(\|dz(t)\|\,\big|\,\mathcal A_{t_j}\bigr) \to 0\quad\text{for } j\to\infty,\ t\to\infty,\tag{4.88a}$$
of the mean absolute first order tracking error, can be obtained if, by using a suitable update law [4, 6, 11, 32] for the parameter estimates, hence for the a posteriori distribution $P(\cdot\,|\,\mathcal A_{t_j})$, we have that, see (4.86f),
$$\operatorname{var}\bigl(p_D(\cdot)\,|\,\mathcal A_{t_j}\bigr) = \operatorname{var}\bigl(\Delta p_D(\cdot)\,|\,\mathcal A_{t_j}\bigr) \to 0\quad\text{for } j\to\infty.\tag{4.88b}$$
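A minimal sketch of how (4.88b) can come about: for a scalar parameter with a conjugate normal prior and i.i.d. normal measurements of known noise variance, the posterior variance at stage $j$ is $1/(1/\sigma_0^2 + j/\sigma^2)$ and tends to zero. The numerical values are illustrative assumptions, and the update laws [4, 6, 11, 32] referenced in the text are more general:

```python
# Posterior variance of a scalar parameter under a conjugate normal prior
# with j i.i.d. normal measurements of known noise variance:
#   var_j = 1 / (1/var0 + j/var_noise)  ->  0  as  j -> infinity,
# the premise behind the convergence (4.88a).  Numbers are illustrative.

def posterior_variance(var0, var_noise, j):
    return 1.0 / (1.0 / var0 + j / var_noise)

variances = [posterior_variance(var0=4.0, var_noise=1.0, j=j) for j in range(50)]
monotone = all(a > b for a, b in zip(variances, variances[1:]))
```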

4.6.3 The 2nd Order Differential $d^2q$

In order to derive a representation of the second order differential $d^2q$, Eq. (4.77a) for $\frac{\partial q}{\partial\varepsilon}(t,\varepsilon)$ is represented as follows:
$$F_{p_D}\Delta p_D + F_z\frac{\partial z}{\partial\varepsilon} + F_{\ddot q}\frac{\partial\ddot q}{\partial\varepsilon} = \frac{\partial u}{\partial\varepsilon} = \varphi_z^{(j)}\frac{\partial z}{\partial\varepsilon},\tag{4.89a}$$
where $F = F(p_D,z,\ddot q),\ z = (q,\dot q)^\top$, is given by (4.73d), see also (4.76e), and therefore
$$F_{p_D} = F_{p_D}(q,\dot q,\ddot q) = Y(q,\dot q,\ddot q),\qquad F_{\ddot q} = F_{\ddot q}(p_D,q) = M(p_D,q),\tag{4.89b}$$
$$F_z = F_z(p_D,q,\dot q,\ddot q) = \bigl(F_q,F_{\dot q}\bigr) = \Bigl(K(p_D,q,\dot q,\ddot q),\,D(p_D,q,\dot q)\Bigr).\tag{4.89c}$$

Moreover, we have that
$$\varphi^{(j)} = \varphi^{(j)}\bigl(t,z-z^{(j)}(t)\bigr),\qquad z = \begin{pmatrix} q(t,\varepsilon)\\ \dot q(t,\varepsilon)\end{pmatrix},\qquad p_D = p_D(\varepsilon).\tag{4.89d}$$
By differentiation of (4.89a) with respect to $\varepsilon$, we obtain
$$\begin{aligned} &2F_{p_Dz}\cdot\Bigl(\Delta p_D,\frac{\partial z}{\partial\varepsilon}\Bigr) + 2F_{p_D\ddot q}\cdot\Bigl(\Delta p_D,\frac{\partial\ddot q}{\partial\varepsilon}\Bigr) + F_{zz}\cdot\Bigl(\frac{\partial z}{\partial\varepsilon},\frac{\partial z}{\partial\varepsilon}\Bigr)\\ &\quad + 2F_{z\ddot q}\cdot\Bigl(\frac{\partial z}{\partial\varepsilon},\frac{\partial\ddot q}{\partial\varepsilon}\Bigr) + F_z\frac{\partial^2z}{\partial\varepsilon^2} + F_{\ddot q}\frac{\partial^2\ddot q}{\partial\varepsilon^2}\\ &= \varphi_{zz}^{(j)}\cdot\Bigl(\frac{\partial z}{\partial\varepsilon},\frac{\partial z}{\partial\varepsilon}\Bigr) + \varphi_z^{(j)}\frac{\partial^2z}{\partial\varepsilon^2},\end{aligned}\tag{4.90a}$$
@ @ @

with the 2nd order partial derivatives
$$F_{p_Dz} = F_{p_Dz}(z,\ddot q),\qquad F_{p_D\ddot q} = F_{p_D\ddot q}(q) = M_{p_D}(q),\tag{4.90b}$$
$$F_{zz} = F_{zz}(p_D,z,\ddot q),\qquad F_{z\ddot q} = F_{z\ddot q}(p_D,z) = \bigl(M_q(p_D,q),\,0\bigr).\tag{4.90c}$$
Moreover, differentiation of (4.77b) with respect to $\varepsilon$ yields the initial values
$$\frac{\partial^2q}{\partial\varepsilon^2}(t_j,\varepsilon) = 0,\qquad \frac{d}{dt}\frac{\partial^2q}{\partial\varepsilon^2}(t_j,\varepsilon) = \frac{\partial^2\dot q}{\partial\varepsilon^2}(t_j,\varepsilon) = 0\tag{4.90d}$$
for $\frac{\partial^2z}{\partial\varepsilon^2} = \frac{\partial^2z}{\partial\varepsilon^2}(t,\varepsilon) = \Bigl(\frac{\partial^2q}{\partial\varepsilon^2}(t,\varepsilon),\frac{\partial^2\dot q}{\partial\varepsilon^2}(t,\varepsilon)\Bigr),\ t\ge t_j,\ 0\le\varepsilon\le 1$.
Putting now $\varepsilon = 0$, from (4.90a) we obtain the following differential equation for the 2nd order differential $d^2q(t) = \frac{\partial^2q}{\partial\varepsilon^2}(t,0)$ of $q = q(t,\varepsilon)$:
$$\begin{aligned} &K^{(j)}(t)\,d^2q(t) + D^{(j)}(t)\frac{d}{dt}d^2q(t) + M^{(j)}(t)\frac{d^2}{dt^2}d^2q(t)\\ &\quad + \Bigl(F_{zz}\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr) - \varphi_{zz}^{(j)}(t,0)\Bigr)\cdot\bigl(dz(t),dz(t)\bigr) - \varphi_z^{(j)}(t,0)\,d^2z(t)\\ &= -2F_{p_Dz}^{(j)}(t)\cdot\bigl(\Delta p_D,dz(t)\bigr) - 2F_{p_D\ddot q}^{(j)}(t)\cdot\bigl(\Delta p_D,d\ddot q(t)\bigr) - 2F_{z\ddot q}^{(j)}(t)\cdot\bigl(dz(t),d\ddot q(t)\bigr).\end{aligned}\tag{4.91a}$$
Here, we set
$$d^2z(t) := \begin{pmatrix} d^2q(t)\\ \frac{d}{dt}d^2q(t)\end{pmatrix},\tag{4.91b}$$
and the vectorial Hessians $F_{p_Dz}^{(j)}(t), F_{p_D\ddot q}^{(j)}(t), F_{z\ddot q}^{(j)}(t)$ follow from (4.90b, c) by inserting there the argument $(p_D,q,\dot q,\ddot q) := \bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr)$. Furthermore, (4.90d) yields the following initial conditions for $d^2q(t)$:
$$d^2q(t_j) = 0,\qquad \frac{d}{dt}d^2q(t_j) = 0.\tag{4.91c}$$
According to (4.91a) we now define, cf. (4.79), the second order derivative of $\varphi^{(j)}$ with respect to $\Delta z$ at $\Delta z = 0$ by
$$\varphi_{zz}^{(j)}(t,0) := F_{zz}\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr),\quad t\ge t_j.\tag{4.92}$$

Using then definition (4.79) of the Jacobian of $\varphi^{(j)}$ with respect to $\Delta z$ at $\Delta z = 0$, for $d^2q = d^2q(t),\ t\ge t_j$, we find the following initial value problem:
$$\frac{d^2}{dt^2}d^2q(t) + K_d\frac{d}{dt}d^2q(t) + K_p\,d^2q(t) = -M^{(j)}(t)^{-1}\,\widetilde D^2F^{(j)}(t)\cdot\bigl(\Delta p_D,dz(t),d\ddot q(t)\bigr)^2,\quad t\ge t_j,\tag{4.93a}$$
$$d^2q(t_j) = 0,\qquad \frac{d}{dt}d^2q(t_j) = 0,\tag{4.93b}$$
where the sub-Hessian
$$\widetilde D^2F^{(j)}(t) := \widetilde D^2F\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr)\tag{4.93c}$$
of $F$ results from the Hessian of $F$ by replacing the diagonal block $F_{zz}$ by zero. Of course, we have that, cf. (4.91a),
$$\begin{aligned} \widetilde D^2F^{(j)}(t)\cdot\bigl(\Delta p_D,dz(t),d\ddot q(t)\bigr)^2 = 2\Bigl(&F_{p_Dz}^{(j)}(t)\cdot\bigl(\Delta p_D,dz(t)\bigr)\\ &+ F_{p_D\ddot q}^{(j)}(t)\cdot\bigl(\Delta p_D,d\ddot q(t)\bigr) + F_{z\ddot q}^{(j)}(t)\cdot\bigl(d\ddot q(t),dz(t)\bigr)\Bigr).\end{aligned}\tag{4.93d}$$

Comparing now the initial value problems (4.80a, b) and (4.93a, b) for the first and second order differential of $q = q(t,\varepsilon)$, we recognize that the linear 2nd order differential equations have, up to the right hand side, exactly the same form. According to (4.83a–d) and (4.84a) we know that the 1st order expansion terms
$$\bigl(dz(t),d\ddot q(t)\bigr) = \bigl(dq(t),d\dot q(t),d\ddot q(t)\bigr),\quad t\ge t_j,$$
in the tracking error depend linearly on the error term
$$\Delta\zeta^{(j)} = \Delta\zeta^{(j)}(t) := \begin{pmatrix} e^{A(t-t_j)}\Delta z_j\\ \Delta p_D\end{pmatrix} \to \begin{pmatrix} 0\\ \Delta p_D\end{pmatrix},\quad t\to\infty,\tag{4.94}$$
corresponding to the variations/disturbances of the initial values $(q_j,\dot q_j)$ and dynamic parameters $p_D$. Consequently, we have this observation:

Lemma 4.2 The right hand side
$$\varrho^{(j,2)}(t) := -M^{(j)}(t)^{-1}\,\widetilde D^2F^{(j)}(t)\cdot\bigl(\Delta p_D,dz(t),d\ddot q(t)\bigr)^2,\quad t\ge t_j,\tag{4.95}$$
of the error differential equation (4.93a) for $d^2q(t)$ is quadratic in the error term $\Delta\zeta^{(j)}(\cdot)$.
According to (4.93a–c), the second order expansion term
$$d^2z(t) = \Bigl(d^2q(t),\frac{d}{dt}d^2q(t)\Bigr),\quad t\ge t_j,$$
of the Taylor expansion of the tracking error can be represented again by the solution of the system of linear differential equations (4.83a–c), where now $\varrho^{(j,1)}(t)$ is replaced by $\varrho^{(j,2)}(t)$ defined by (4.95), and the initial values are given by $d^2z(t_j) = 0$. Thus, applying again solution formula (4.84a), we find
$$d^2z(t) = \int_{t_j}^t e^{A(t-\tau)}\begin{pmatrix} 0\\ \varrho^{(j,2)}(\tau)\end{pmatrix}d\tau.\tag{4.96}$$
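The quadratic dependence stated in Lemma 4.2 can be seen in a toy scalar model: doubling the underlying error (a scalar $c$ standing in for $\Delta p_D$) doubles the first order term and quadruples the second order term. Gains, the coupling coefficient and the Euler scheme below are illustrative assumptions:

```python
# Toy scalar check of Lemma 4.2: the first order term dq is linear and the
# second order term d2q quadratic in the underlying error c (standing in
# for Delta p_D).  Gains, the coupling coefficient 'hess' and the Euler
# scheme are illustrative assumptions.

def error_expansion_terms(c, kp=25.0, kd=10.0, hess=0.8, dt=1e-3, t_end=4.0):
    dq, dqd = 0.0, 0.0      # first order term, forced by rho1 = -c
    d2q, d2qd = 0.0, 0.0    # second order term, zero initial values (4.93b)
    for _ in range(int(t_end / dt)):
        dqdd = -c - kd * dqd - kp * dq
        d2qdd = -hess * c * dq - kd * d2qd - kp * d2q  # quadratic remainder
        dq, dqd = dq + dt * dqd, dqd + dt * dqdd
        d2q, d2qd = d2q + dt * d2qd, d2qd + dt * d2qdd
    return dq, d2q

dq1, d2q1 = error_expansion_terms(c=0.1)
dq2, d2q2 = error_expansion_terms(c=0.2)
ratio_first = dq2 / dq1      # linear scaling: about 2
ratio_second = d2q2 / d2q1   # quadratic scaling: about 4
```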

From (4.94)–(4.96) and Lemma 4.2 we now get the following result:

Theorem 4.6.3 The second order tracking error expansion terms $\Bigl(d^2z(t),\frac{d^2}{dt^2}d^2q(t)\Bigr) = \Bigl(d^2q(t),\frac{d}{dt}d^2q(t),\frac{d^2}{dt^2}d^2q(t)\Bigr),\ t\ge t_j$, depend
i) quadratically on the 1st order error terms $\bigl(\Delta p_D,dz(t),d\ddot q(t)\bigr)$ and
ii) quadratically on the error term $\Delta\zeta^{(j)}(\cdot)$ corresponding to the variations/disturbances of the initial values and dynamics parameters.

Because of (4.96), the stability properties of the second order tracking error expansion term $d^2q(t),\ t\ge t_j$, are determined again by the matrix exponential function $\Phi(t,\tau) = e^{A(t-\tau)}$ and the remainder $\varrho^{(j,2)}(t)$ given by (4.95).
According to Theorem 4.6.2 and Corollary 4.6.1 we know that the disturbance term $\varrho^{(j,1)}(t)$ of (4.80a), (4.83a) is reduced directly by the control constraints (for $u_e$) present in (OSTP). Concerning the next disturbance term $\varrho^{(j,2)}(t)$ of (4.93a), by (4.95) we note first that a reduction of the 1st order error terms $\bigl(\Delta p_D,dz(\cdot),d\ddot q(\cdot)\bigr)$ yields a reduction of $\varrho^{(j,2)}$ and, by (4.96), also a reduction of $d^2z(t),\ \frac{d^2}{dt^2}d^2q(t),\ t\ge t_j$. Comparing then definitions (4.83d) and (4.95) of the disturbances $\varrho^{(j,1)}, \varrho^{(j,2)}$, we observe that, corresponding to $\varrho^{(j,1)}$, certain terms in $\varrho^{(j,2)}$ depend only on the reference trajectory $q^{(j)}(t),\ t\ge t_j$, of stage $j$. Hence, this observation yields the following result.

Theorem 4.6.4 The disturbance $\varrho^{(j,2)}$ of (4.93a), and consequently also the second order tracking error expansion terms $d^2q(t), \frac{d}{dt}d^2q(t), \frac{d^2}{dt^2}d^2q(t),\ t\ge t_j$, can be diminished by
i) reducing the 1st order error terms $\bigl(\Delta p_D,dz(\cdot),d\ddot q(\cdot)\bigr)$, and by
ii) taking into (OSTP) additional conditions for the unknown functions $q_e(s), \beta(s),\ s_j\le s\le s_f$, guaranteeing that (the norm of) the sub-Hessian $\widetilde D^2F^{(j)}(t),\ t\ge t_j$, fulfills a certain minimality or boundedness condition.

Proof Follows from definition (4.95) of $\varrho^{(j,2)}$ and representation (4.96) of the second order tracking error expansion term $d^2z$.

4.6.4 Third and Higher Order Differentials

By further differentiation of Eqs. (4.90a, d) with respect to $\varepsilon$ and by putting $\varepsilon = 0$, also the 3rd and higher order differentials $d^lq(t),\ t\ge t_j,\ l\ge 3$, can be obtained. We observe that the basic structure of the differential equations for the differentials $d^lq,\ l\ge 1$, remains the same. Hence, by induction for the differentials $d^lz,\ l\ge 1$, we have the following representation:

Theorem 4.6.5 Defining the tensorial coefficients of the Taylor expansion (4.75d) for the feedback control law $\varphi^{(j)} = \varphi^{(j)}(t,\Delta z)$ by
$$D_z^l\varphi^{(j)}(t,0) := D_z^lF\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr),\quad t\ge t_j,\ l = 1,2,\ldots,\tag{4.97}$$
the differentials $d^lz(t) = \Bigl(d^lq(t),\frac{d}{dt}d^lq(t)\Bigr),\ t\ge t_j$, may be represented by the systems of linear differential equations
$$\frac{d}{dt}d^lz(t) = A\,d^lz(t) + \begin{pmatrix} 0\\ \varrho^{(j,l)}(t)\end{pmatrix},\quad t\ge t_j,\ l = 1,2,\ldots,\tag{4.98a}$$
with the same system matrix $A$ and the disturbance terms $\varrho^{(j,l)}(t),\ t\ge t_j$, given by
$$\varrho^{(j,l)}(t) = -M^{(j)}(t)^{-1}\,\Pi\Bigl(\widetilde D^\nu F^{(j)}(t),\ 2\le\nu\le l;\ \Delta p_D,\ d^jz(t),\ \frac{d^2}{dt^2}d^kq(t),\ 1\le j,k\le l-1\Bigr),\quad l\ge 2,\tag{4.98b}$$

where $\Pi$ is a polynomial in the variables $\Delta p_D$ and $d^jz(t),\ \frac{d^2}{dt^2}d^kq(t),\ 1\le j,k\le l-1$, having coefficients from the sub-operators $\widetilde D^\nu F^{(j)}(t)$ of $D^\nu F^{(j)}(t)$ containing mixed partial derivatives of $F$ with respect to $p_D, z, \ddot q$ of order $\nu = 2,3,\ldots,l-1$ at $(p_D,q,\dot q,\ddot q) = \bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr)$, such that the disturbance $\varrho^{(j,l)}$ is a polynomial of order $l$ with respect to the error term $\Delta\zeta^{(j)}(\cdot)$.
According to (4.73d), (4.4a–d) and Remark 4.2 we know that the vector function $F = F(p_D,q,\dot q,\ddot q)$ is linear in $p_D$, linear in $\ddot q$ and quadratic in $\dot q$ (supposing case 4.4c)) and analytical with respect to $q$. Hence, working with a polynomial approximation with respect to $q$, we may assume that
$$D^lF^{(j)}(t) = D^lF\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr) \equiv 0,\quad t\ge t_j,\ l\ge l_0,\tag{4.99}$$
for some index $l_0$.


According to the expansion (4.75d) of the feedback control law $\varphi^{(j)} = \varphi^{(j)}(t,\Delta z)$, definition (4.97) of the corresponding coefficients and Theorem 4.6.4, we now have this robustness [43] result:

Theorem 4.6.6 The Taylor expansion (4.75d) of the feedback control law $\varphi^{(j)} = \varphi^{(j)}(t,\Delta z)$ stops after a finite number ($\le l_0$) of terms. Besides the conditions for $u_e$ contained automatically in (OSTP) via the control constraints, the mean absolute tracking error $E\bigl(\|\Delta z(t)\|\,\big|\,\mathcal A_{t_j}\bigr)$ can be diminished further by including additional conditions for the functions $\bigl(q_e(s),\beta(s)\bigr),\ s_j\le s\le s_f$, in (OSTP) which guarantee a minimality or boundedness condition for (the norm of) the sub-operators of mixed partial derivatives $\widetilde D^\nu F^{(j)}(t),\ t\ge t_j,\ \nu = 2,3,\ldots,l_0$.

4.7 Online Control Corrections: PID Controllers

Corresponding to Sect. 4.5, Eq. (4.70a–c), at stage $j$ we consider here control corrections, hence feedback control laws, of the type
$$\Delta u^{(j)}(t) := u(t) - u^{(j)}(t) = \varphi^{(j)}\bigl(t,\Delta z^{(j)}(t)\bigr),\quad t\ge t_j,\tag{4.100a}$$
where
$$\Delta z^{(j)}(t) := z(t) - z^{(j)}(t),\quad t\ge t_j,$$

is the tracking error related now to the state vector
$$z(t) := \begin{pmatrix} z_1(t)\\ z_2(t)\\ z_3(t)\end{pmatrix} = \begin{pmatrix} q(t)\\ \int_{t_j}^t q(s)\,ds\\ \dot q(t)\end{pmatrix},\quad t\ge t_j,\tag{4.100b}$$
$$z^{(j)}(t) := \begin{pmatrix} z_1^{(j)}(t)\\ z_2^{(j)}(t)\\ z_3^{(j)}(t)\end{pmatrix} = \begin{pmatrix} q^{(j)}(t)\\ \int_{t_j}^t q^{(j)}(s)\,ds\\ \dot q^{(j)}(t)\end{pmatrix},\quad t\ge t_j.\tag{4.100c}$$
Furthermore,
$$\varphi^{(j)} = \varphi^{(j)}\bigl(t,\Delta z(t)\bigr) = \varphi^{(j)}\bigl(t,\Delta z_1(t),\Delta z_2(t),\Delta z_3(t)\bigr),\quad t\ge t_j,\tag{4.100d}$$
is a feedback control law such that
$$\varphi^{(j)}(t,0,0,0) = 0,\quad t\ge t_j.\tag{4.100e}$$
Of course, we have
$$\Delta z(t) = \Delta z^{(j)}(t) = \begin{pmatrix} q(t)-q^{(j)}(t)\\ \int_{t_j}^t\bigl(q(s)-q^{(j)}(s)\bigr)ds\\ \dot q(t)-\dot q^{(j)}(t)\end{pmatrix},\quad t\ge t_j.\tag{4.101}$$

Corresponding to Sect. 4.5, the trajectories $q = q(t)$ and $q^{(j)} = q^{(j)}(t),\ t\ge t_j$, are embedded into a one-parameter family of trajectories $q = q(t,\varepsilon),\ t\ge t_j,\ 0\le\varepsilon\le 1$, in configuration space which is defined as follows.
At stage $j$, here we have the following initial data:
$$q_j(\varepsilon) := \bar q_j + \varepsilon\Delta q_j,\qquad \Delta q_j := q_j - \bar q_j,\tag{4.102a}$$
$$\dot q_j(\varepsilon) := \bar{\dot q}_j + \varepsilon\Delta\dot q_j,\qquad \Delta\dot q_j := \dot q_j - \bar{\dot q}_j,\tag{4.102b}$$
$$p_D(\varepsilon) := \bar p_D^{(j)} + \varepsilon\Delta p_D,\qquad \Delta p_D := p_D - \bar p_D^{(j)},\tag{4.102c}$$
pD WD pD  p D ; (4.102c)

$0\le\varepsilon\le 1$. Moreover, the control input $u = u(t),\ t\ge t_j$, is defined by
$$\begin{aligned} u(t) &= u^{(j)}(t) + \Delta u^{(j)}(t)\\ &= u^{(j)}(t) + \varphi^{(j)}\Bigl(t,\ q(t)-q^{(j)}(t),\ \int_{t_j}^t\bigl(q(s)-q^{(j)}(s)\bigr)ds,\ \dot q(t)-\dot q^{(j)}(t)\Bigr),\quad t\ge t_j.\end{aligned}\tag{4.102d}$$

Let now
$$q(t,\varepsilon) = q\bigl(t;p_D(\varepsilon),q_j(\varepsilon),\dot q_j(\varepsilon),u(\cdot)\bigr),\quad t\ge t_j,\ 0\le\varepsilon\le 1,\tag{4.103}$$
denote the solution of the following initial value problem based on the dynamic equation (4.4a), with the initial values and total control input $u(t)$ given by (4.102a–d):
$$F\bigl(p_D(\varepsilon),q(t,\varepsilon),\dot q(t,\varepsilon),\ddot q(t,\varepsilon)\bigr) = u(t,\varepsilon),\quad t\ge t_j,\ 0\le\varepsilon\le 1,\tag{4.104a}$$
where
$$q(t_j,\varepsilon) := q_j(\varepsilon),\qquad \dot q(t_j,\varepsilon) = \dot q_j(\varepsilon),\tag{4.104b}$$
$$u(t,\varepsilon) := u^{(j)}(t) + \varphi^{(j)}\Bigl(t,\ q(t,\varepsilon)-q^{(j)}(t),\ \int_{t_j}^t\bigl(q(s,\varepsilon)-q^{(j)}(s)\bigr)ds,\ \dot q(t,\varepsilon)-\dot q^{(j)}(t)\Bigr),\tag{4.104c}$$
and the vector function $F = F(p_D,q,\dot q,\ddot q)$ is again defined by
$$F(p_D,q,\dot q,\ddot q) = M(p_D,q)\ddot q + h(p_D,q,\dot q),\tag{4.104d}$$
cf. (4.104a).
In the following we assume that problem (4.104a–d) has a unique solution $q = q(t,\varepsilon),\ t\ge t_j$, for each parameter $\varepsilon,\ 0\le\varepsilon\le 1$.

4.7.1 Basic Properties of the Embedding $q(t,\varepsilon)$

According to (4.102a–c), for $\varepsilon = 0$ system (4.104a–d) reads
$$\begin{aligned} &F\bigl(\bar p_D^{(j)},q(t,0),\dot q(t,0),\ddot q(t,0)\bigr)\\ &= u^{(j)}(t) + \varphi^{(j)}\Bigl(t,\ q(t,0)-q^{(j)}(t),\ \int_{t_j}^t\bigl(q(s,0)-q^{(j)}(s)\bigr)ds,\ \dot q(t,0)-\dot q^{(j)}(t)\Bigr),\quad t\ge t_j,\end{aligned}\tag{4.105a}$$
where
$$q(t_j,0) = \bar q_j,\qquad \dot q(t_j,0) = \bar{\dot q}_j.\tag{4.105b}$$
Since, due to (OSTP),
$$q^{(j)}(t_j) = \bar q_j,\qquad \dot q^{(j)}(t_j) = \bar{\dot q}_j$$
and
$$F\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr) = u^{(j)}(t),\quad t\ge t_j,$$
according to the unique solvability assumption for (4.104a–d) and condition (4.100e) we have
$$q(t,0) = q^{(j)}(t),\quad t\ge t_j.\tag{4.106}$$

For $\varepsilon = 1$, from (4.102a–c) and (4.104a–d) we obtain the system
$$\begin{aligned} &F\bigl(p_D,q(t,1),\dot q(t,1),\ddot q(t,1)\bigr)\\ &= u^{(j)}(t) + \varphi^{(j)}\Bigl(t,\ q(t,1)-q^{(j)}(t),\ \int_{t_j}^t\bigl(q(s,1)-q^{(j)}(s)\bigr)ds,\ \dot q(t,1)-\dot q^{(j)}(t)\Bigr),\quad t\ge t_j,\end{aligned}\tag{4.107a}$$
with
$$q(t_j,1) = q_j,\qquad \dot q(t_j,1) = \dot q_j.\tag{4.107b}$$
However, since the control input is defined by (4.102d), again due to the unique solvability property of (4.104a–d), (4.107a, b) yields
$$q(t,1) = q(t),\quad t\ge t_j,\tag{4.108}$$
where $q = q(t)$ denotes the actual trajectory.



Remark 4.13 The integro-differential equation (4.104a–d) can easily be converted into an ordinary initial value problem. Indeed, using the state variables
$$z(t,\varepsilon) = \begin{pmatrix} q(t,\varepsilon)\\ q_I(t,\varepsilon)\\ \dot q(t,\varepsilon)\end{pmatrix},\quad t\ge t_j,$$
where $q_I = q_I(t,\varepsilon),\ t\ge t_j$, is defined, see (4.100b), by
$$q_I(t,\varepsilon) := \int_{t_j}^t q(s,\varepsilon)\,ds,\tag{4.109}$$
problem (4.104a–d) can be represented by the equivalent 2nd order initial value problem
$$F\bigl(p_D(\varepsilon),q(t,\varepsilon),\dot q(t,\varepsilon),\ddot q(t,\varepsilon)\bigr) = u^{(j)}(t) + \varphi^{(j)}\Bigl(t,\ q(t,\varepsilon)-q^{(j)}(t),\ q_I(t,\varepsilon)-q_I^{(j)}(t),\ \dot q(t,\varepsilon)-\dot q^{(j)}(t)\Bigr),\tag{4.110a}$$
$$\dot q_I(t,\varepsilon) := q(t,\varepsilon),\tag{4.110b}$$
with
$$q(t_j,\varepsilon) = q_j(\varepsilon),\tag{4.110c}$$
$$\dot q(t_j,\varepsilon) = \dot q_j(\varepsilon),\tag{4.110d}$$
$$q_I(t_j,\varepsilon) = 0,\tag{4.110e}$$
and
$$q_I^{(j)}(t) := \int_{t_j}^t q^{(j)}(s)\,ds.\tag{4.111}$$
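A sketch of the state augmentation in Remark 4.13: the integral term becomes an ordinary state $q_I$, so a standard ODE solver applies. Below, a unit-mass joint $\ddot q = u$ is driven by a PID law with a constant disturbance; the reference, the gains (chosen with $d\,p > i$, cf. the Hurwitz conditions derived in Sect. 4.7.3) and the disturbance level are illustrative assumptions:

```python
# Sketch of Remark 4.13: replacing the integral term by the extra state
# q_I turns the PID integro-differential equation into an ODE system.
# Unit-mass joint q'' = u with PID law u = -kp*e - ki*int(e) - kd*q' + w;
# reference, gains and constant disturbance w are illustrative assumptions.

def simulate_pid(kp, ki, kd, q_ref, w, dt=1e-3, t_end=20.0):
    q, qI_err, qd = 0.0, 0.0, 0.0   # augmented state (q, integrated error, qdot)
    for _ in range(int(t_end / dt)):
        e = q - q_ref
        u = -kp * e - ki * qI_err - kd * qd + w
        q, qI_err, qd = q + dt * qd, qI_err + dt * e, qd + dt * u
    return q

# Integral action removes the steady-state offset caused by w.
q_final = simulate_pid(kp=30.0, ki=20.0, kd=11.0, q_ref=1.0, w=0.5)
```

In contrast to the PD case, the constant disturbance leaves no static offset: the integrator state absorbs it.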

4.7.2 Taylor Expansion with Respect to $\varepsilon$

Based on representation (4.110a–e) of problem (4.104a–d), we may again assume, cf. Assumption 1.1, that the solution $q = q(t,\varepsilon),\ t\ge t_j,\ 0\le\varepsilon\le 1$, has continuous derivatives with respect to $\varepsilon$ up to a certain order $\nu\ge 1$ for all $t\in[t_j,t_j+\Delta t_j],\ 0\le\varepsilon\le 1$, with a certain $\Delta t_j > 0$.

Corresponding to (4.75a–c), the actual trajectory of the robot can then be represented, see (4.108), (4.106), (4.103), by
$$q(t) = q\bigl(t;p_D,q_j,\dot q_j,u(\cdot)\bigr) = q(t,1) = q(t,\varepsilon_0+\Delta\varepsilon) = q(t,\varepsilon_0) + \Delta q(t) = q^{(j)}(t) + \Delta q(t),\tag{4.112a}$$
with $\varepsilon_0 = 0,\ \Delta\varepsilon = 1$. Moreover, the expansion of the tracking error $\Delta q = \Delta q(t),\ t\ge t_j$, is given by
$$\begin{aligned} \Delta q(t) &= \sum_{l=1}^{\nu-1}\frac{1}{l!}d^lq(t)(\Delta\varepsilon)^l + \frac{1}{\nu!}\frac{\partial^\nu q}{\partial\varepsilon^\nu}(t,\vartheta)(\Delta\varepsilon)^\nu\\ &= \sum_{l=1}^{\nu-1}\frac{1}{l!}d^lq(t) + \frac{1}{\nu!}\frac{\partial^\nu q}{\partial\varepsilon^\nu}(t,\vartheta),\quad t\ge t_j.\end{aligned}\tag{4.112b}$$
Here, $\vartheta = \vartheta(t,\nu),\ 0 < \vartheta < 1$, and
$$d^lq(t) := \frac{\partial^lq}{\partial\varepsilon^l}(t,0),\quad t\ge t_j,\ l = 1,2,\ldots,\tag{4.112c}$$
denote the $l$-th order differentials of $q = q(t,\varepsilon)$ with respect to $\varepsilon$ at $\varepsilon = \varepsilon_0 = 0$. Differential equations for the differentials $d^lq(t),\ l = 1,2,\ldots$, may be obtained by successive differentiation of the initial value problem (4.104a–d) with respect to $\varepsilon$ at $\varepsilon = 0$.

4.7.3 The 1st Order Differential dq

Corresponding to Sect. 4.6.2, we now consider the partial derivative of Eqs. (4.110a–e) with respect to $\varepsilon$. Let
$$K(p_D,q,\dot q,\ddot q) := F_q(p_D,q,\dot q,\ddot q),\tag{4.113a}$$
$$D(p_D,q,\dot q) := F_{\dot q}(p_D,q,\dot q,\ddot q) = h_{\dot q}(p_D,q,\dot q),\tag{4.113b}$$
$$Y(q,\dot q,\ddot q) := F_{p_D}(p_D,q,\dot q,\ddot q),\tag{4.113c}$$
$$M(p_D,q) := F_{\ddot q}(p_D,q,\dot q,\ddot q)\tag{4.113d}$$
denote again the Jacobians of the vector function $F = F(p_D,q,\dot q,\ddot q)$ with respect to $q, \dot q, \ddot q$ and $p_D$. According to the linear parametrization property of robots we have, see (4.76e),
$$F(p_D,q,\dot q,\ddot q) = Y(q,\dot q,\ddot q)\,p_D.\tag{4.113e}$$

Taking the partial derivative with respect to $\varepsilon$, from (4.110a–e) we obtain the following equations, cf. (4.77a, b):
$$\begin{aligned} &Y\bigl(q(t,\varepsilon),\dot q(t,\varepsilon),\ddot q(t,\varepsilon)\bigr)\Delta p_D + K\bigl(p_D(\varepsilon),q(t,\varepsilon),\dot q(t,\varepsilon),\ddot q(t,\varepsilon)\bigr)\frac{\partial q}{\partial\varepsilon}(t,\varepsilon)\\ &\quad + D\bigl(p_D(\varepsilon),q(t,\varepsilon),\dot q(t,\varepsilon)\bigr)\frac{d}{dt}\frac{\partial q}{\partial\varepsilon}(t,\varepsilon) + M\bigl(p_D(\varepsilon),q(t,\varepsilon)\bigr)\frac{d^2}{dt^2}\frac{\partial q}{\partial\varepsilon}(t,\varepsilon)\\ &= \varphi_q^{(j)}\Bigl(t,\ q(t,\varepsilon)-q^{(j)}(t),\ q_I(t,\varepsilon)-q_I^{(j)}(t),\ \dot q(t,\varepsilon)-\dot q^{(j)}(t)\Bigr)\frac{\partial q}{\partial\varepsilon}(t,\varepsilon)\\ &\quad + \varphi_{q_I}^{(j)}\Bigl(t,\ q(t,\varepsilon)-q^{(j)}(t),\ q_I(t,\varepsilon)-q_I^{(j)}(t),\ \dot q(t,\varepsilon)-\dot q^{(j)}(t)\Bigr)\frac{\partial q_I}{\partial\varepsilon}(t,\varepsilon)\\ &\quad + \varphi_{\dot q}^{(j)}\Bigl(t,\ q(t,\varepsilon)-q^{(j)}(t),\ q_I(t,\varepsilon)-q_I^{(j)}(t),\ \dot q(t,\varepsilon)-\dot q^{(j)}(t)\Bigr)\frac{d}{dt}\frac{\partial q}{\partial\varepsilon}(t,\varepsilon),\end{aligned}\tag{4.114a}$$
$$\frac{d}{dt}\frac{\partial q_I}{\partial\varepsilon}(t,\varepsilon) = \frac{\partial q}{\partial\varepsilon}(t,\varepsilon),\tag{4.114b}$$
$$\frac{\partial q}{\partial\varepsilon}(t_j,\varepsilon) = \Delta q_j,\tag{4.114c}$$
$$\frac{d}{dt}\frac{\partial q}{\partial\varepsilon}(t_j,\varepsilon) = \Delta\dot q_j,\tag{4.114d}$$
$$\frac{\partial q_I}{\partial\varepsilon}(t_j,\varepsilon) = 0.\tag{4.114e}$$
Putting now $\varepsilon = \varepsilon_0 = 0$, due to (4.106), (4.109), (4.112c) we obtain
$$\begin{aligned} &Y^{(j)}(t)\Delta p_D + K^{(j)}(t)\,dq(t) + D^{(j)}(t)\,d\dot q(t) + M^{(j)}(t)\,d\ddot q(t)\\ &= \varphi_q^{(j)}(t,0,0,0)\,dq(t) + \varphi_{q_I}^{(j)}(t,0,0,0)\int_{t_j}^t dq(s)\,ds + \varphi_{\dot q}^{(j)}(t,0,0,0)\,d\dot q(t),\quad t\ge t_j,\end{aligned}\tag{4.115a}$$
$$dq(t_j) = \Delta q_j,\tag{4.115b}$$
$$d\dot q(t_j) = \Delta\dot q_j,\tag{4.115c}$$
where
$$d\dot q := \frac{d}{dt}dq,\qquad d\ddot q := \frac{d^2}{dt^2}dq.\tag{4.116a}$$

From (4.114b, e) we obtain
$$\frac{\partial q_I}{\partial\varepsilon}(t,0) = \int_{t_j}^t dq(s)\,ds,\tag{4.116b}$$
which is already used in (4.115a). Moreover, the matrices $Y^{(j)}, K^{(j)}, D^{(j)}$ and $M^{(j)}$ are defined as in Eqs. (4.78e–g) of Sect. 4.6.2, hence,
$$Y^{(j)}(t) := Y\bigl(q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr),\tag{4.117a}$$
$$K^{(j)}(t) := K\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t),\ddot q^{(j)}(t)\bigr),\tag{4.117b}$$
$$D^{(j)}(t) := D\bigl(\bar p_D^{(j)},q^{(j)}(t),\dot q^{(j)}(t)\bigr),\tag{4.117c}$$
$$M^{(j)}(t) := M\bigl(\bar p_D^{(j)},q^{(j)}(t)\bigr),\quad t\ge t_j,\tag{4.117d}$$
see also (4.113a–d).
Multiplying now (4.115a) with the inverse $M^{(j)}(t)^{-1}$ of $M^{(j)}(t)$ and rearranging terms, we get
$$\begin{aligned} &d\ddot q(t) + M^{(j)}(t)^{-1}\Bigl(D^{(j)}(t)-\varphi_{\dot q}^{(j)}(t,0,0,0)\Bigr)d\dot q(t) + M^{(j)}(t)^{-1}\Bigl(K^{(j)}(t)-\varphi_q^{(j)}(t,0,0,0)\Bigr)dq(t)\\ &\quad + M^{(j)}(t)^{-1}\Bigl(-\varphi_{q_I}^{(j)}(t,0,0,0)\Bigr)\int_{t_j}^t dq(s)\,ds = -M^{(j)}(t)^{-1}Y^{(j)}(t)\Delta p_D,\quad t\ge t_j,\end{aligned}\tag{4.118}$$
with the initial conditions (4.115b, c).
For given matrices $K_d, K_p, K_i$ to be selected later on, the unknown Jacobians $\varphi_{\dot q}^{(j)}, \varphi_q^{(j)}$ and $\varphi_{q_I}^{(j)}$ are now defined by the equations
$$M^{(j)}(t)^{-1}\Bigl(D^{(j)}(t)-\varphi_{\dot q}^{(j)}(t,0,0,0)\Bigr) = K_d,\tag{4.119a}$$
$$M^{(j)}(t)^{-1}\Bigl(K^{(j)}(t)-\varphi_q^{(j)}(t,0,0,0)\Bigr) = K_p,\tag{4.119b}$$
$$M^{(j)}(t)^{-1}\Bigl(-\varphi_{q_I}^{(j)}(t,0,0,0)\Bigr) = K_i.\tag{4.119c}$$
Thus, we have
$$\varphi_q^{(j)}(t,0,0,0) = K^{(j)}(t) - M^{(j)}(t)K_p,\tag{4.120a}$$
$$\varphi_{q_I}^{(j)}(t,0,0,0) = -M^{(j)}(t)K_i,\tag{4.120b}$$
$$\varphi_{\dot q}^{(j)}(t,0,0,0) = D^{(j)}(t) - M^{(j)}(t)K_d.\tag{4.120c}$$

Putting (4.119a–c) into system (4.118), we find
$$d\ddot q(t) + K_d\,d\dot q(t) + K_p\,dq(t) + K_i\int_{t_j}^t dq(s)\,ds = \varrho^{(j,1)}(t),\quad t\ge t_j,\tag{4.121a}$$
with
$$dq(t_j) = \Delta q_j,\tag{4.121b}$$
$$d\dot q(t_j) = \Delta\dot q_j,\tag{4.121c}$$
where $\varrho^{(j,1)}(t)$ is given, cf. (4.83d), by
$$\varrho^{(j,1)}(t) := -M^{(j)}(t)^{-1}Y^{(j)}(t)\Delta p_D.\tag{4.121d}$$
If $K_p, K_d$ and $K_i$ are fixed matrices, then by differentiation of the integro-differential equation (4.121a) with respect to time $t$ we obtain the following 3rd order system of linear differential equations
$$d\dddot q(t) + K_d\,d\ddot q(t) + K_p\,d\dot q(t) + K_i\,dq(t) = \dot\varrho^{(j,1)}(t),\quad t\ge t_j,\tag{4.122a}$$
for the 1st order tracking error term $dq = dq(t),\ t\ge t_j$. Moreover, the initial conditions read, see (4.121b, c),
$$dq(t_j) = \Delta q_j,\tag{4.122b}$$
$$d\dot q(t_j) = \Delta\dot q_j,\tag{4.122c}$$
$$d\ddot q(t_j) = \Delta\ddot q_j := \ddot q_j - \ddot q^{(j)}(t_j),\tag{4.122d}$$
where $\ddot q_j := \ddot q(t_j)$, see (4.102a–c).
Corresponding to (4.83a–c), the system (4.122a) of 3rd order linear differential equations can easily be converted into the following system of 1st order differential equations
$$d\dot z(t) = A\,dz(t) + \begin{pmatrix} 0\\ 0\\ \dot\varrho^{(j,1)}(t)\end{pmatrix},\quad t\ge t_j,\tag{4.123a}$$

where
$$A := \begin{pmatrix} 0 & I & 0\\ 0 & 0 & I\\ -K_i & -K_p & -K_d\end{pmatrix},\tag{4.123b}$$
$$dz(t_j) := \begin{pmatrix} \Delta q_j\\ \Delta\dot q_j\\ \Delta\ddot q_j\end{pmatrix} = \Delta z_j,\tag{4.123c}$$
and
$$dz(t) := \begin{pmatrix} dq(t)\\ d\dot q(t)\\ d\ddot q(t)\end{pmatrix}.\tag{4.123d}$$
With the fundamental matrix $\Phi(t,\tau) := e^{A(t-\tau)},\ t\ge\tau$, the solution of (4.123a–d) reads
$$dz(t) = e^{A(t-t_j)}\Delta z_j + \int_{t_j}^t e^{A(t-\tau)}\begin{pmatrix} 0\\ 0\\ \dot\varrho^{(j,1)}(\tau)\end{pmatrix}d\tau,\quad t\ge t_j,\tag{4.124a}$$
where, see (4.121d),
$$\dot\varrho^{(j,1)}(t) = -\frac{d}{dt}\Bigl(M^{(j)}(t)^{-1}Y^{(j)}(t)\Bigr)\Delta p_D.\tag{4.124b}$$
Because of
$$\Delta p_D = p_D(\omega) - \bar p_D^{(j)} = p_D(\omega) - E\bigl(p_D(\omega)\,|\,\mathcal A_{t_j}\bigr),$$
see (4.102c), for the conditional mean 1st order error term $E\bigl(dz(t)\,|\,\mathcal A_{t_j}\bigr)$ from (4.124a, b) we get
$$E\bigl(dz(t)\,\big|\,\mathcal A_{t_j}\bigr) = e^{A(t-t_j)}E\bigl(\Delta z_j\,\big|\,\mathcal A_{t_j}\bigr),\quad t\ge t_j.\tag{4.124c}$$
Obviously, the properties of the 1st order error terms $dz(t),\ E\bigl(dz(t)\,|\,\mathcal A_{t_j}\bigr)$, resp., $t\ge t_j$, and the stability properties of the 1st order system (4.123a–d) depend on the eigenvalues of the matrix $A$.

Diagonal Matrices $K_p, K_d, K_i$

Supposing here, cf. Sect. 4.5, that $K_p, K_d, K_i$ are diagonal matrices
$$K_p = (p_k\delta_{kl}),\qquad K_d = (d_k\delta_{kl}),\qquad K_i = (i_k\delta_{kl}),\tag{4.125}$$
with diagonal elements $p_k, d_k, i_k$, resp., $k = 1,\ldots,n$, system (4.122a) decomposes into the separated ordinary differential equations
$$d\dddot q_k(t) + d_k\,d\ddot q_k(t) + p_k\,d\dot q_k(t) + i_k\,dq_k(t) = \dot\varrho_k^{(j,1)}(t),\quad t\ge t_j,\tag{4.126a}$$
$k = 1,2,\ldots,n$. The related homogeneous differential equations read
$$d\dddot q_k + d_k\,d\ddot q_k + p_k\,d\dot q_k + i_k\,dq_k = 0,\tag{4.126b}$$
$k = 1,\ldots,n$, which have the characteristic equations
$$\lambda^3 + d_k\lambda^2 + p_k\lambda + i_k =: p_k(\lambda) = 0,\quad k = 1,\ldots,n,\tag{4.126c}$$
with the polynomials $p_k = p_k(\lambda),\ k = 1,\ldots,n$, of degree 3.
A system described by the homogeneous differential equation (4.126b) is called uniformly (asymptotically) stable if
$$\lim_{t\to\infty}dq_k(t) = 0\tag{4.127a}$$
for arbitrary initial values (4.122b–d). It is well known that property (4.127a) holds if
$$\operatorname{Re}(\lambda_{kl}) < 0,\quad l = 1,2,3,\tag{4.127b}$$
where $\operatorname{Re}(\lambda_{kl})$ denotes the real part of the zeros $\lambda_{k1},\lambda_{k2},\lambda_{k3}$ of the characteristic equation (4.126c). According to the Hurwitz criterion, a necessary and sufficient condition for (4.127b) is the set of inequalities
$$\det(d_k) > 0,\tag{4.128a}$$
$$\det\begin{pmatrix} d_k & 1\\ i_k & p_k\end{pmatrix} = d_kp_k - i_k > 0,\tag{4.128b}$$
$$\det\begin{pmatrix} d_k & 1 & 0\\ i_k & p_k & d_k\\ 0 & 0 & i_k\end{pmatrix} = i_k(d_kp_k - i_k) > 0.\tag{4.128c}$$

Note that
$$H_3^k := \begin{pmatrix} d_k & 1 & 0\\ i_k & p_k & d_k\\ 0 & 0 & i_k\end{pmatrix}\tag{4.128d}$$
is the so-called Hurwitz matrix of (4.126b). Obviously, from (4.128a–c) we now obtain this result:

Theorem 4.7.1 The system represented by the homogeneous 3rd order linear differential equation (4.126b) is uniformly (asymptotically) stable if the (feedback) coefficients $p_k, d_k, i_k$ are selected such that
$$p_k > 0,\qquad d_k > 0,\qquad i_k > 0,\tag{4.129a}$$
$$d_kp_k > i_k.\tag{4.129b}$$
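The Hurwitz test (4.129a, b) is a few comparisons and is easy to cross-check against hand-verifiable cases; the two test polynomials below are illustrative:

```python
# Hurwitz stability test (4.129a, b) for lambda^3 + d*lambda^2 + p*lambda + i,
# cross-checked on two hand-verifiable polynomials.

def hurwitz_stable_cubic(d, p, i):
    """All roots in the open left half-plane iff d, p, i > 0 and d*p > i."""
    return d > 0 and p > 0 and i > 0 and d * p > i

stable_case = hurwitz_stable_cubic(3.0, 3.0, 1.0)    # (lambda + 1)^3, roots at -1
unstable_case = hurwitz_stable_cubic(1.0, 1.0, 2.0)  # d*p = 1 < i = 2
```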

Mean Absolute 1st Order Tracking Error

Because of $E\bigl(\Delta p_D(\omega)\,|\,\mathcal A_{t_j}\bigr) = 0$ and the representation (4.124b) of $\dot\varrho^{(j,1)} = \dot\varrho^{(j,1)}(t)$, corresponding to (4.84b) we have
$$E\bigl(\dot\varrho^{(j,1)}(t)\,\big|\,\mathcal A_{t_j}\bigr) = 0,\quad t\ge t_j.\tag{4.130a}$$
Hence, (4.124a) yields
$$E\bigl(dz(t)\,\big|\,\mathcal A_{t_j}\bigr) = e^{A(t-t_j)}E\bigl(\Delta z_j\,\big|\,\mathcal A_{t_j}\bigr),\tag{4.130b}$$
where
$$E\bigl(\Delta z_j\,\big|\,\mathcal A_{t_j}\bigr) = E\bigl(z(t_j)\,\big|\,\mathcal A_{t_j}\bigr) - \bar z_j\tag{4.130c}$$
with, cf. (4.105a), (4.122d),
$$\bar z_j = \begin{pmatrix} \bar q_j\\ \bar{\dot q}_j\\ \ddot q^{(j)}(t_j)\end{pmatrix},\qquad z(t_j) = \begin{pmatrix} q_j\\ \dot q_j\\ \ddot q_j\end{pmatrix}.\tag{4.130d}$$
The matrices $K_p, K_i, K_d$ in the definition (4.119a–c) or (4.120a–c) of the Jacobians $\varphi_q^{(j)}(t,0,0,0), \varphi_{q_I}^{(j)}(t,0,0,0), \varphi_{\dot q}^{(j)}(t,0,0,0)$ of the feedback control law $\Delta u^{(j)}(t) = \varphi^{(j)}(t,\Delta q,\Delta q_I,\Delta\dot q)$ can now be chosen, see Theorem 4.7.1, such that the fundamental matrix $\Phi(t,\tau) = e^{A(t-\tau)},\ t\ge\tau$, is exponentially stable, hence,
$$\|\Phi(t,\tau)\| \le a_0e^{-\gamma_0(t-\tau)},\quad t\ge\tau,\tag{4.131}$$
with constants $a_0 > 0,\ \gamma_0 > 0$, see also (4.85a).


Considering the Euclidean norm $\|dz(t)\|$ of the 1st order error term $dz(t)$, from (4.124a, b) and with (4.131) we obtain

$$\|dz(t)\| \le \big\|e^{A(t-t_j)}\,\Delta z_j\big\| + \left\|\int_{t_j}^{t} e^{A(t-\tau)} \begin{pmatrix} 0\\ 0\\ P^{(j,1)}(\tau) \end{pmatrix} d\tau\right\|$$
$$\le a_0\, e^{-\gamma_0 (t-t_j)}\, \|\Delta z_j\| + \int_{t_j}^{t} \left\| e^{A(t-\tau)} \begin{pmatrix} 0\\ 0\\ P^{(j,1)}(\tau) \end{pmatrix}\right\| d\tau$$
$$\le a_0\, e^{-\gamma_0 (t-t_j)}\, \|\Delta z_j\| + \int_{t_j}^{t} a_0\, e^{-\gamma_0 (t-\tau)}\, \big\|P^{(j,1)}(\tau)\big\|\, d\tau. \qquad (4.132)$$

Taking the conditional expectation in (4.132), we find

$$E\big(\|dz(t)\|\,\big|\,\mathcal{A}_{t_j}\big) \le a_0\, e^{-\gamma_0 (t-t_j)}\, E\big(\|\Delta z_j\|\,\big|\,\mathcal{A}_{t_j}\big) + \int_{t_j}^{t} a_0\, e^{-\gamma_0 (t-\tau)}\, E\big(\|P^{(j,1)}(\tau)\|\,\big|\,\mathcal{A}_{t_j}\big)\, d\tau. \qquad (4.133)$$

Applying Jensen’s inequality

$$E\sqrt{X(\omega)} \le \sqrt{E\,X(\omega)},$$

where $X = X(\omega)$ is a nonnegative random variable, with (4.124b) we get, cf. (4.86d, e),

$$E\big(\|P^{(j,1)}(\tau)\|\,\big|\,\mathcal{A}_{t_j}\big) = E\Big(\sqrt{\|P^{(j,1)}(\tau)\|^2}\,\Big|\,\mathcal{A}_{t_j}\Big) \le \sqrt{E\big(\|P^{(j,1)}(\tau)\|^2\,\big|\,\mathcal{A}_{t_j}\big)} \le \left\|\Big(\frac{d}{dt}\big(M^{(j)}(t)^{-1}\, Y^{(j)}(t)\big)\Big)(\tau)\right\|\, \sqrt{\operatorname{var}\big(p_D(\cdot)\,\big|\,\mathcal{A}_{t_j}\big)}, \qquad (4.134a)$$

where

$$\operatorname{var}\big(p_D(\cdot)\,\big|\,\mathcal{A}_{t_j}\big) := E\big(\|\Delta p_D(\omega)\|^2\,\big|\,\mathcal{A}_{t_j}\big). \qquad (4.134b)$$

In the following we study the inhomogeneous term $P^{(j,1)}(t)$ of the 3rd order linear differential equation (4.122a) in more detail. According to (4.121d) we have

$$P^{(j,1)}(t) = -\frac{d}{dt}\Big(M^{(j)}(t)^{-1}\, Y^{(j)}(t)\Big)\,\Delta p_D = -\Big(\frac{d}{dt}\, M^{(j)}(t)^{-1}\Big)\, Y^{(j)}(t)\,\Delta p_D - M^{(j)}(t)^{-1}\,\Big(\frac{d}{dt}\, Y^{(j)}(t)\Big)\,\Delta p_D, \qquad (4.135a)$$

and therefore

$$\big\|P^{(j,1)}(t)\big\| \le \left\|\frac{d}{dt}\, M^{(j)}(t)^{-1}\right\| \cdot \big\|Y^{(j)}(t)\,\Delta p_D\big\| + \big\|M^{(j)}(t)^{-1}\big\| \cdot \big\|\dot Y^{(j)}(t)\,\Delta p_D\big\|. \qquad (4.135b)$$

Now, according to (4.19a, b), (4.82a–c) it holds that

$$\Delta u^{(j)}(t; \Delta p_D) := Y^{(j)}(t)\,\Delta p_D = \Delta u_e\big(\Delta p_D, s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big),\quad t \ge t_j. \qquad (4.136a)$$

Thus, we find

$$\Delta\dot u^{(j)}(t; \Delta p_D) = \dot Y^{(j)}(t)\,\Delta p_D = \frac{d}{dt}\,\Delta u_e\big(\Delta p_D, s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big) = \frac{\partial \Delta u_e}{\partial s}\big(\Delta p_D, s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\frac{d}{dt}\, s^{(j)}(t) = \frac{\partial \Delta u_e}{\partial s}\big(\Delta p_D, s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\, \sqrt{\beta^{(j)}\big(s^{(j)}(t)\big)}, \qquad (4.136b)$$

see (4.17c), (4.49b). Note that $\Delta u^{(j)}(t; \Delta p_D)$, $\Delta\dot u^{(j)}(t; \Delta p_D)$ are linear with respect to $\Delta p_D = p_D - E\big(p_D(\omega)\,\big|\,\mathcal{A}_{t_j}\big)$.

From (4.135a, b) and (4.136a, b) we get

$$E\big(\|P^{(j,1)}(t)\|\,\big|\,\mathcal{A}_{t_j}\big) \le \left\|\frac{d}{dt}\, M^{(j)}(t)^{-1}\right\|\, E\big(\|Y^{(j)}(t)\,\Delta p_D\|\,\big|\,\mathcal{A}_{t_j}\big) + \big\|M^{(j)}(t)^{-1}\big\|\, E\big(\|\dot Y^{(j)}(t)\,\Delta p_D\|\,\big|\,\mathcal{A}_{t_j}\big)$$
$$= \left\|\frac{d}{dt}\, M^{(j)}(t)^{-1}\right\|\, E\big(\|\Delta u^{(j)}(t; \Delta p_D)\|\,\big|\,\mathcal{A}_{t_j}\big) + \big\|M^{(j)}(t)^{-1}\big\|\, E\big(\|\Delta\dot u^{(j)}(t; \Delta p_D)\|\,\big|\,\mathcal{A}_{t_j}\big)$$
$$= \left\|\frac{d}{dt}\, M^{(j)}(t)^{-1}\right\|\, E\Big(\big\|\Delta u_e\big(\Delta p_D, s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\big\|\,\Big|\,\mathcal{A}_{t_j}\Big) + \big\|M^{(j)}(t)^{-1}\big\|\, E\Big(\Big\|\frac{\partial \Delta u_e}{\partial s}\big(\Delta p_D, s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\Big\|\, \sqrt{\beta^{(j)}\big(s^{(j)}(t)\big)}\,\Big|\,\mathcal{A}_{t_j}\Big). \qquad (4.137a)$$

Using again Jensen’s inequality, from (4.137a) we obtain

$$E\big(\|P^{(j,1)}(t)\|\,\big|\,\mathcal{A}_{t_j}\big) \le \left\|\frac{d}{dt}\, M^{(j)}(t)^{-1}\right\| \sqrt{\operatorname{var}\Big(\Delta u_e\big(\Delta p_D(\cdot), s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big)}$$
$$+ \big\|M^{(j)}(t)^{-1}\big\| \sqrt{\operatorname{var}\Big(\frac{\partial \Delta u_e}{\partial s}\big(\Delta p_D(\cdot), s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big)\, \beta^{(j)}\big(s^{(j)}(t)\big)}, \qquad (4.137b)$$

where

$$\operatorname{var}\Big(\Delta u_e\big(\Delta p_D(\cdot), s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) := E\Big(\big\|\Delta u_e\big(\Delta p_D(\omega), s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big), \qquad (4.137c)$$

$$\operatorname{var}\Big(\frac{\partial \Delta u_e}{\partial s}\big(\Delta p_D(\cdot), s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big)\, \beta^{(j)}\big(s^{(j)}(t)\big) = E\Big(\Big\|\frac{\partial \Delta u_e}{\partial s}\big(\Delta p_D(\omega), s^{(j)}(t);\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\Big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big)\, \beta^{(j)}\big(s^{(j)}(t)\big). \qquad (4.137d)$$

According to the representation (4.124a, b) of the first order tracking error dz = dz(t), t ≥ t_j, the behavior of dz(t) is determined mainly by the system matrix A and the “inhomogeneous term” $P^{(j,1)}(t)$, $t \ge t_j$. Obviously, this term plays the same role as the corresponding inhomogeneous term in the representation (4.84a) for the first order error term in the case of PD-controllers.

However, in the present case of PID-control the error estimates (4.137a–d) show that, for a satisfactory behavior of the first order error term dz = dz(t), t ≥ t_j, besides the control constraints (4.8a–c), (4.21a–c), (4.30a–f) for

$$u(t) = u_e\big(p_D, s;\, q_e(\cdot), \beta(\cdot)\big) \quad \text{with } s = s(t), \qquad (4.138a)$$

here also corresponding constraints for the input rate, i.e. the time derivative of the control,

$$\dot u(t) = \frac{\partial u_e}{\partial s}\big(p_D, s;\, q_e(\cdot), \beta(\cdot)\big)\, \sqrt{\beta(s)} \quad \text{with } s = s(t), \qquad (4.138b)$$

are needed!
The above results can be summarized by the following theorem:
Theorem 4.7.2 Suppose that the matrices $K_p, K_i, K_d$ are selected such that the fundamental matrix $\Phi(t,\tau) = e^{A(t-\tau)}$, $t \ge \tau$, is exponentially stable (cf. Theorem 4.7.1). Then, based on the definition (4.120a–c) of the linear approximation of the PID-controller, the following properties hold:

a) Asymptotic local stability in the mean

$$E\big(dz(t)\,\big|\,\mathcal{A}_{t_j}\big) \to 0,\quad t \to \infty. \qquad (4.139a)$$

b) Mean absolute 1st order tracking error

$$E\big(\|dz(t)\|\,\big|\,\mathcal{A}_{t_j}\big) \le a_0\, e^{-\gamma_0 (t-t_j)}\, E\big(\|\Delta z_j\|\,\big|\,\mathcal{A}_{t_j}\big) + \int_{t_j}^{t} a_0\, e^{-\gamma_0 (t-\tau)} \left\|\Big(\frac{d}{dt}\, M^{(j)}(t)^{-1}\Big)(\tau)\right\|\, \sigma_{u_e}^{(j)}\big(s^{(j)}(\tau)\big)\, d\tau$$
$$+ \int_{t_j}^{t} a_0\, e^{-\gamma_0 (t-\tau)}\, \big\|M^{(j)}(\tau)^{-1}\big\|\, \sigma_{\partial u_e/\partial s}^{(j)}\big(s^{(j)}(\tau)\big)\, \sqrt{\beta^{(j)}\big(s^{(j)}(\tau)\big)}\, d\tau,\quad t \ge t_j, \qquad (4.139b)$$

with

$$\sigma_{u_e}^{(j)}(s) := \sqrt{\operatorname{var}\Big(\Delta u_e\big(\Delta p_D(\cdot), s;\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big)}, \qquad (4.139c)$$

$$\sigma_{\partial u_e/\partial s}^{(j)}(s) := \sqrt{\operatorname{var}\Big(\frac{\partial \Delta u_e}{\partial s}\big(\Delta p_D(\cdot), s;\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big)}, \qquad (4.139d)$$

$s \ge s_j$. Moreover,

$$E\big(\|dz(t)\|\,\big|\,\mathcal{A}_{t_j}\big) \le a_0\, e^{-\gamma_0 (t-t_j)}\, E\big(\|\Delta z_j\|\,\big|\,\mathcal{A}_{t_j}\big) + \left(\int_{t_j}^{t} a_0\, e^{-\gamma_0 (t-\tau)} \left\|\Big(\frac{d}{dt}\big(M^{(j)}(t)^{-1}\, Y^{(j)}(t)\big)\Big)(\tau)\right\| d\tau\right) \sigma_{p_D}^{(j)},\quad t \ge t_j, \qquad (4.139e)$$

where

$$\sigma_{p_D}^{(j)} := \sqrt{\operatorname{var}\big(p_D(\cdot)\,\big|\,\mathcal{A}_{t_j}\big)}. \qquad (4.139f)$$

Using the t-s-transformation $s = s^{(j)}(\tau)$, $\tau \ge t_j$, the time integrals in (4.139b) can also be represented in the following form:

$$\int_{t_j}^{t} e^{-\gamma_0 (t-\tau)} \left\|\Big(\frac{d}{dt}\, M^{(j)}(t)^{-1}\Big)(\tau)\right\|\, \sigma_{u_e}^{(j)}\big(s^{(j)}(\tau)\big)\, d\tau = \int_{s_j}^{s^{(j)}(t)} e^{-\gamma_0 \left(t - t^{(j)}(s)\right)} \left\|\Big(\frac{d}{dt}\, M^{(j)}(t)^{-1}\Big)\big(t^{(j)}(s)\big)\right\|\, \frac{\sigma_{u_e}^{(j)}(s)}{\sqrt{\beta^{(j)}(s)}}\, ds, \qquad (4.140a)$$

$$\int_{t_j}^{t} e^{-\gamma_0 (t-\tau)}\, \big\|M^{(j)}(\tau)^{-1}\big\|\, \sigma_{\partial u_e/\partial s}^{(j)}\big(s^{(j)}(\tau)\big)\, \sqrt{\beta^{(j)}\big(s^{(j)}(\tau)\big)}\, d\tau = \int_{s_j}^{s^{(j)}(t)} e^{-\gamma_0 \left(t - t^{(j)}(s)\right)}\, \big\|M^{(j)}\big(t^{(j)}(s)\big)^{-1}\big\|\, \sigma_{\partial u_e/\partial s}^{(j)}(s)\, ds, \qquad (4.140b)$$

where $\tau = t^{(j)}(s)$, $s \ge s_j$, denotes the inverse of $s = s^{(j)}(\tau)$, $\tau \ge t_j$.


Minimality or Boundedness Properties

Several terms in the above estimates of the 1st order tracking error dz = dz(t), t ≥ t_j, are influenced by means of (OSTP), as shown in the following:

i) Optimal velocity profile $\beta^{(j)}$

For minimum-time and related substitute problems the total runtime

$$\int_{s_j}^{s^{(j)}(t)} \frac{1}{\sqrt{\beta^{(j)}(s)}}\, ds \le \int_{s_j}^{s_f} \frac{1}{\sqrt{\beta^{(j)}(s)}}\, ds = t_f^{(j)} - t_j \qquad (4.141)$$

is minimized by (OSTP).
ii) Properties of the coefficient $\gamma_0$

According to Theorems 4.7.1 and 4.7.2 the matrices $K_p, K_i, K_d$ can be selected such that the real parts $\operatorname{Re}(\lambda_{kl})$, $k = 1,\dots,n$, $l = 1,2,3$, of the eigenvalues $\lambda_{kl}$, $k = 1,\dots,n$, $l = 1,2,3$, of the matrix A, cf. (4.123b), are negative, see (4.127b), (4.128a–d). Then the decisive coefficient $\gamma_0 > 0$ in the norm estimate (4.131) of the fundamental matrix $\Phi(t,\tau)$, $t \ge \tau$, can be selected such that

$$0 < \gamma_0 < -\max_{\substack{1 \le k \le n\\ l = 1,2,3}} \operatorname{Re}(\lambda_{kl}). \qquad (4.142)$$

iii) Chance constraints for the input

With certain lower and upper vector bounds $u^{\min} \le u^{\max}$ one usually has the input or control constraints, cf. (2.10a),

$$u^{\min} \le u(t) \le u^{\max},\quad t \ge t_j.$$

In the following we suppose that the bounds $u^{\min}, u^{\max}$ are given deterministic vectors. After the transformation $s = s^{(j)}(t)$, $t \ge t_j$, from the time domain $[t_j, t_f^{(j)}]$ to the s-domain $[s_j, s_f]$, see (4.49b), due to (4.21a) we have the stochastic constraint for $\big(q_e(\cdot), \beta(\cdot)\big)$:

$$u^{\min} \le u_e\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big) \le u^{\max},\quad s_j \le s \le s_f.$$

Demanding that the above constraint holds at least with the probability $\alpha_u$, we get, cf. (4.30a), the probabilistic constraint

$$P\Big(u^{\min} \le u_e\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big) \le u^{\max}\,\Big|\,\mathcal{A}_{t_j}\Big) \ge \alpha_u,\quad s_j \le s \le s_f. \qquad (4.143a)$$

Defining again, see (4.30d),

$$u^c := \frac{u^{\min} + u^{\max}}{2},\qquad \Delta u := \frac{u^{\max} - u^{\min}}{2},$$

by means of Tschebyscheff-type inequalities the chance constraint (4.143a) can be guaranteed, see (4.30e, f), (4.87d), by the condition

$$E\Big(\big\|u_e\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big) - u^c\big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big) \le (1 - \alpha_u)\, \min_{1 \le k \le n} \Delta u_k^2,\quad s_j \le s \le s_f. \qquad (4.143b)$$

According to the definition (4.139c) of $\sigma_{u_e}^{(j)}(s)$, inequality (4.143b) is equivalent to

$$\sigma_{u_e}^{(j)}(s)^2 + \big\|\bar u_e^{(j)}(s) - u^c\big\|^2 \le (1 - \alpha_u)\, \min_{1 \le k \le n} \Delta u_k^2,\quad s_j \le s \le s_f, \qquad (4.144a)$$

where

$$\bar u_e^{(j)}(s) := E\Big(u_e\big(p_D(\omega), s;\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) = u_e\big(\bar p_D^{(j)}, s;\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big). \qquad (4.144b)$$

Hence, the sufficient condition (4.144a) for the reliability constraint (4.143a) yields the variance constraint

$$\sigma_{u_e}^{(j)}(s)^2 \le (1 - \alpha_u)\, \min_{1 \le k \le n} \Delta u_k^2,\quad s_j \le s \le s_f. \qquad (4.144c)$$
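The Tschebyscheff-type step from the moment condition (4.143b) to the probability level (4.143a) rests on the implication: if some component violates its bound, then $\|u - u^c\|^2 > \min_k \Delta u_k^2$, so the violation probability is at most $E\|u-u^c\|^2 / \min_k \Delta u_k^2 \le 1-\alpha_u$. A small Monte Carlo sketch (not from the book; bounds, distribution, and constants are all invented) illustrates this:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha_u = 0.9
u_min, u_max = np.array([-1.0, -2.0]), np.array([1.0, 2.0])
u_c = (u_min + u_max) / 2          # center of the bounds, cf. (4.30d)
du  = (u_max - u_min) / 2          # half-widths,          cf. (4.30d)

# Hypothetical random input u(omega) at a fixed path point s, with the
# spread chosen small enough that the moment condition (4.143b) holds:
sigma = np.sqrt(0.9 * (1 - alpha_u) * (du**2).min() / 2)
u = u_c + rng.normal(0.0, sigma, size=(200_000, 2))

# Second-moment (sufficient) condition (4.143b):
assert np.sum((u - u_c) ** 2, axis=1).mean() <= (1 - alpha_u) * (du**2).min()

# ... then the chance constraint (4.143a) holds a fortiori:
inside = np.all((u_min <= u) & (u <= u_max), axis=1)
assert inside.mean() >= alpha_u
```

Note how conservative the bound is: the moment condition here delivers a containment probability far above the required level $\alpha_u$.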

iv) Minimum force and moment

According to the different performance functions $J\big(u(\cdot)\big)$ mentioned after Definition 4.6, in the case of minimum expected force and moment we have, using the transformation formula (4.19a),

$$E\Big(J\big(u(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) = E\left(\int_{t_j}^{t_f^{(j)}} \|u(t)\|^2\, dt\,\Bigg|\,\mathcal{A}_{t_j}\right) = \int_{s_j}^{s_f} E\Big(\big\|u_e\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big)\big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big)\, \frac{ds}{\sqrt{\beta(s)}}$$
$$= \int_{s_j}^{s_f} \Big( E\Big(\big\|u_e\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big) - u_e\big(\bar p_D^{(j)}, s;\, q_e(\cdot), \beta(\cdot)\big)\big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big) + \big\|u_e\big(\bar p_D^{(j)}, s;\, q_e(\cdot), \beta(\cdot)\big)\big\|^2 \Big)\, \frac{ds}{\sqrt{\beta(s)}}$$
$$= \int_{s_j}^{s_f} \Big( \sigma_{u_e}^2(s) + \big\|u_e\big(\bar p_D^{(j)}, s;\, q_e(\cdot), \beta(\cdot)\big)\big\|^2 \Big)\, \frac{ds}{\sqrt{\beta(s)}}, \qquad (4.145a)$$

where

$$\sigma_{u_e}^2(s) := E\Big(\big\|u_e\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big) - u_e\big(\bar p_D^{(j)}, s;\, q_e(\cdot), \beta(\cdot)\big)\big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big) = E\Big(\big\|\Delta u_e\big(\Delta p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big)\big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big). \qquad (4.145b)$$

Hence, (OSTP) yields the following minimum property:

$$\int_{s_j}^{s_f} \Big( \sigma_{u_e}^{(j)}(s)^2 + \big\|\bar u_e^{(j)}(s)\big\|^2 \Big)\, \frac{ds}{\sqrt{\beta^{(j)}(s)}} = \min_{\substack{q_e(\cdot),\, \beta(\cdot)\ \text{s.t.}\\ (4.8a\text{–}c),\ (4.9a\text{–}d)}} E\left(\int_{t_j}^{t_f^{(j)}} \|u(t)\|^2\, dt\,\Bigg|\,\mathcal{A}_{t_j}\right), \qquad (4.146)$$

where $\bar u_e^{(j)}(s)$ is defined by (4.144b).
v) Decreasing stochastic uncertainty

According to (4.132), (4.133) and (4.134a, b) we have

$$E\big(\|dz(t)\|\,\big|\,\mathcal{A}_{t_j}\big) \le a_0\, e^{-\gamma_0 (t-t_j)}\, E\big(\|\Delta z_j\|\,\big|\,\mathcal{A}_{t_j}\big) + \left(\int_{t_j}^{t} a_0\, e^{-\gamma_0 (t-\tau)} \left\|\Big(\frac{d}{dt}\big(M^{(j)}(t)^{-1}\, Y^{(j)}(t)\big)\Big)(\tau)\right\| d\tau\right) \sqrt{\operatorname{var}\big(p_D(\cdot)\,\big|\,\mathcal{A}_{t_j}\big)}. \qquad (4.147)$$

Thus, the mean absolute 1st order tracking error can be decreased further by removing step by step the uncertainty about the vector $p_D = p_D(\omega)$ of dynamic parameters. In practice this is done by a parameter identification procedure running parallel to the control process of the robot.
vi) Chance constraints for the input rate

According to the representation (4.138b) of the input or control rate we have

$$\dot u(t) = \frac{\partial u_e}{\partial s}\big(p_D, s;\, q_e(\cdot), \beta(\cdot)\big)\, \sqrt{\beta(s)},\quad s = s(t).$$

From the input rate condition

$$\dot u^{\min} \le \dot u(t) \le \dot u^{\max},\quad t \ge t_j, \qquad (4.148a)$$

with given, fixed vector bounds $\dot u^{\min} \le \dot u^{\max}$, for the input rate $\frac{\partial u_e}{\partial s}$ with respect to the path parameter $s \ge s_j$ we obtain the constraint

$$\frac{\dot u^{\min}}{\sqrt{\beta(s)}} \le \frac{\partial u_e}{\partial s}\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big) \le \frac{\dot u^{\max}}{\sqrt{\beta(s)}},\quad s_j \le s \le s_f. \qquad (4.148b)$$

If we require that the input rate condition (4.148a) holds at least with probability $\alpha_{\dot u}$, then corresponding to (4.143a) we get the chance constraint

$$P\left(\frac{\dot u^{\min}}{\sqrt{\beta(s)}} \le \frac{\partial u_e}{\partial s}\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big) \le \frac{\dot u^{\max}}{\sqrt{\beta(s)}}\,\Bigg|\,\mathcal{A}_{t_j}\right) \ge \alpha_{\dot u}. \qquad (4.149)$$

In the same way as in (iii), condition (4.149) can be guaranteed, cf. (4.139d), by

$$\sigma_{\partial u_e/\partial s}^{(j)}(s)^2 + \left\|\overline{\frac{\partial u_e}{\partial s}}^{(j)}(s) - \frac{\dot u^c}{\sqrt{\beta^{(j)}(s)}}\right\|^2 \le (1 - \alpha_{\dot u})\, \frac{1}{\beta^{(j)}(s)}\, \min_{1 \le k \le n} \Delta\dot u_k^2,\quad s_j \le s \le s_f, \qquad (4.150a)$$

where

$$\dot u^c := \frac{\dot u^{\min} + \dot u^{\max}}{2},\qquad \Delta\dot u := \frac{\dot u^{\max} - \dot u^{\min}}{2}, \qquad (4.150b)$$

$$\overline{\frac{\partial u_e}{\partial s}}^{(j)}(s) := \frac{\partial u_e}{\partial s}\big(\bar p_D^{(j)}, s;\, q_e^{(j)}(\cdot), \beta^{(j)}(\cdot)\big),\quad s \ge s_j. \qquad (4.150c)$$

Hence, corresponding to (4.144c), here we get the following variance constraint:

$$\sigma_{\partial u_e/\partial s}^{(j)}(s)^2 \le (1 - \alpha_{\dot u})\, \frac{1}{\beta^{(j)}(s)}\, \min_{1 \le k \le n} \Delta\dot u_k^2,\quad s_j \le s \le s_f. \qquad (4.150d)$$

vii) Minimum force and moment rate

Corresponding to (4.145a) we may consider the integral

$$E\Big(J\big(\dot u(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) := E\left(\int_{t_j}^{t_f^{(j)}} \|\dot u(t)\|^2\, dt\,\Bigg|\,\mathcal{A}_{t_j}\right). \qquad (4.151a)$$

Again with (4.138b) we find

$$E\Big(J\big(\dot u(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) = E\left(\int_{s_j}^{s_f} \Big\|\frac{\partial u_e}{\partial s}\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big)\, \sqrt{\beta(s)}\Big\|^2\, \frac{ds}{\sqrt{\beta(s)}}\,\Bigg|\,\mathcal{A}_{t_j}\right)$$
$$= \int_{s_j}^{s_f} E\Big(\Big\|\frac{\partial u_e}{\partial s}\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big)\Big\|^2\,\Big|\,\mathcal{A}_{t_j}\Big)\, \sqrt{\beta(s)}\, ds$$
$$= \int_{s_j}^{s_f} \left( \operatorname{var}\Big(\frac{\partial u_e}{\partial s}\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) + \Big\|\frac{\partial u_e}{\partial s}\big(\bar p_D^{(j)}, s;\, q_e(\cdot), \beta(\cdot)\big)\Big\|^2 \right) \sqrt{\beta(s)}\, ds. \qquad (4.151b)$$

If we consider

$$E\Big(\tilde J\big(\dot u(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) := E\left(\int_{t_j}^{t_f^{(j)}} \|\dot u(t)\|\, dt\,\Bigg|\,\mathcal{A}_{t_j}\right), \qquad (4.152a)$$

then we get

$$E\Big(\tilde J\big(\dot u(\cdot)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) = \int_{s_j}^{s_f} E\Big(\Big\|\frac{\partial u_e}{\partial s}\big(p_D(\omega), s;\, q_e(\cdot), \beta(\cdot)\big)\Big\|\,\Big|\,\mathcal{A}_{t_j}\Big)\, ds. \qquad (4.152b)$$
Chapter 5
Optimal Design of Regulators

The optimal design of regulators is often based on the use of given, fixed nominal values of initial conditions, payloads and other model parameters. However, due to

• variations of the material properties,
• measurement errors (e.g. in case of parameter identification),
• modeling errors (complexity of real systems),
• uncertainty about the working environment, the task to be executed, etc.,

the true payload, initial conditions, and model parameters, like sizing parameters, mass values, gravity centers, moments of inertia, friction, tolerances, adjustment setting errors, etc., are not known exactly in practice. Hence, a predetermined (optimal) regulator should be “robust”, i.e., the controller should guarantee satisfactory results also in case of variations of the initial conditions, load parameters and further model parameters.

Robust controls have been considered up to now mainly for uncertainty models based on given fixed sets of parameters, e.g. certain multiple intervals, assumed to contain the unknown, true parameter. In this case one then often requires that the controlled system fulfills certain properties, as e.g. certain stability properties, for all parameter vectors in the given parameter domain. If the required property can be described by a scalar criterion, then the controller design is based on a minimax criterion, such as the $H^\infty$-criterion.
Since in many cases parameter uncertainty can be modeled more adequately
by means of stochastic parameter models, in the following we suppose that the
parameters involved in the regulator design problem are realizations of a random
vector having a known or at least partly known joint probability distribution.
The determination of an optimal controller under uncertainty with respect to
model parameters, working neighborhood, modeling assumptions, etc., is a decision
theoretical problem. Criteria of the type “the required property holds for all parameter vectors in a given set” and the minimax criterion are very pessimistic and often too strong. Indeed,
in many cases the available a priori and empirical information on the dynamic

© Springer-Verlag Berlin Heidelberg 2015 195


K. Marti, Stochastic Optimization Methods, DOI 10.1007/978-3-662-46214-0_5
196 5 Optimal Design of Regulators

system and its working neighborhood allows a more adequate, flexible description
of the uncertainty situation by means of stochastic approaches. Thus, it is often
more appropriate to model unknown and varying initial values, external loads and other model parameters, as well as modeling errors (e.g. the incomplete representation of the dynamic system), by means of realizations of a random vector or a random function with a given or at least partly known probability distribution. Consequently,
the optimal design of robust regulators is based on an optimization problem under
stochastic uncertainty.

Stochastic Optimal Design of Regulators

For the consideration of stochastic parameter variations as well as other uncer-


tainties within the optimal design process of a regulator one has to introduce—as
for any other optimization problem under (stochastic) uncertainty—an appropriate
deterministic substitute problem. In the present case of stochastic optimal design
of a regulator, hence, a map from the state or observation space into the space
of control corrections, one has a control problem under stochastic uncertainty.
For the solution of the occurring deterministic substitute problems the methods of
stochastic optimization, cf. [100], are available.
As shown in the preceding sections, the optimization of a regulator presupposes an optimal reference trajectory $q^R(t)$ in the configuration space and a corresponding feedforward control $u^R(t)$. The guiding functions $\big(q^R(t), u^R(t)\big)$ can also be determined by stochastic optimization methods.
For the computation of stochastic optimal regulators, deterministic substitute
control problems of the following type are considered:
Minimize the expected total costs composed of (i) the costs arising from the deviation $\Delta z(t)$ between the (stochastic optimal) reference trajectory and the effective trajectory of the dynamic system and (ii) the costs for the regulation $\Delta u(t)$.
Subject to the following constraints:
– the dynamic equation of the robot or of a more general dynamic stochastic system, with the total control input $u(t) = u^R(t) + \Delta u(t)$ being the sum of the feedforward control $u^R(t)$ and the control correction $\Delta u(t) = \varphi\big(t, \Delta z(t)\big)$,
– stochastic initial conditions $q(t_0) = q_0(\omega)$, $\dot q(t_0) = \dot q_0(\omega)$ for the state of the system at the starting time point $t_0$,
– conditions for the feedback law $\varphi = \varphi\big(t, \Delta z(t)\big)$, such as $\varphi(t, 0) = 0$ (if the effective state is equal to the state prescribed by the reference trajectory, then no control correction is needed).
Here, as often in practice, we use quadratic cost functions. The above described
deterministic substitute problem can be interpreted again as a control problem for
the unknown feedback control law. A basic problem is then the computation of the
(conditional) expectation arising in the objective function. Since the expectations
are defined here by multiple integrals, the expectations can be determined in general
5.1 Tracking Error 197

only approximately. In the following, approximations based on Taylor expansions with respect to the stochastic parameter vector $a(\omega)$ at its conditional expectation $\bar a_{t_0} = E\big(a(\omega)\,\big|\,\mathcal{A}_{t_0}\big)$ are used.

5.1 Tracking Error

According to Sects. 4.6 and 4.7, at each stage j = 0, 1, 2, … the control input u(t) is defined by

$$u(t) := \text{feedforward control plus control correction} = u^{(j)}(t) + \Delta u^{(j)}(t),\quad t \ge t_j, \qquad (5.1a)$$

where

$$\Delta u^{(j)}(t) := \varphi^{(j)}\big(t, \Delta z^{(j)}(t)\big) = \varphi^{(j)}\big(t, z(t) - z^{(j)}(t)\big) = \varphi^{(j)}\big(t,\, q(t) - q^{(j)}(t),\, \dot q(t) - \dot q^{(j)}(t)\big) \qquad (5.1b)$$

with a feedback function or feedback control law

$$\varphi^{(j)} = \varphi^{(j)}(t, \Delta z). \qquad (5.1c)$$

If $z(t) = z^{(j)}(t)$, hence $\Delta z^{(j)}(t) = 0$, at the time point $t \ge t_j$ no control correction is needed. Thus, without further global information one often requires that

$$\varphi^{(j)}(t, 0) = 0 \quad \text{for all } t \ge t_j. \qquad (5.1d)$$

Inserting the total control input u(t), t ≥ t_j, into the dynamic equation (4.4a), the effective further course of the trajectory q(t), t ≥ t_j, is described by the system of ordinary differential equations (ODE):

$$F\big(p_D, q(t), \dot q(t), \ddot q(t)\big) = u^{(j)}(t) + \Delta u^{(j)}(t) = u^{(j)}(t) + \varphi^{(j)}\big(t,\, q(t) - q^{(j)}(t),\, \dot q(t) - \dot q^{(j)}(t)\big),\quad t \ge t_j, \qquad (5.2a)$$

$$q(t_j) = q_j,\quad \dot q(t_j) = \dot q_j, \qquad (5.2b)$$

where $z_j := (q_j, \dot q_j)$ denotes the effective state at the time point $t_j$.


Because of


q.t/ D
q .j /.t/ WD q.t/  q .j / .t/; t  tj (5.3a)
198 5 Optimal Design of Regulators

P

q.t/ P .j /.t/ WD q.t/
D
q P  qP .j / .t/; t  tj (5.3b)
R

q.t/ R .j /.t/ WD q.t/
D
q R  qR .j / .t/; t  tj (5.3c)

and


qj WD q.tj /  q .j / .tj / D qj  qj (5.3d)
P j WD q.t

q P j /  qP .j / .tj / D qPj  qPj (5.3e)
.j /

pD WD pD  p D ; (5.3f)

system (5.2a, b) can be rewritten as the following system of differential equations for the tracking error:

$$F\big(\bar p_D^{(j)} + \Delta p_D,\; q^{(j)}(t) + \Delta q(t),\; \dot q^{(j)}(t) + \Delta\dot q(t),\; \ddot q^{(j)}(t) + \Delta\ddot q(t)\big) = u^{(j)}(t) + \varphi^{(j)}\big(t, \Delta q(t), \Delta\dot q(t)\big),\quad t \ge t_j, \qquad (5.4a)$$

$$\Delta q(t_j) = \Delta q_j,\quad \Delta\dot q(t_j) = \Delta\dot q_j. \qquad (5.4b)$$

The solution of system (5.4a, b) is a function

$$\Delta q(t) = \Delta q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big) = \Delta q^{(j)}\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big),\quad t \ge t_j. \qquad (5.4c)$$

5.1.1 Optimal PD-Regulator

For simplification, we first consider the optimal design of a PD-regulator. Of course, the aim of the control correction $\Delta u^{(j)}(t)$, $t \ge t_j$, at the j-th stage reads:

i) $z(t) = \begin{pmatrix} q(t)\\ \dot q(t) \end{pmatrix} \to z^{(j)}(t) = \begin{pmatrix} q^{(j)}(t)\\ \dot q^{(j)}(t) \end{pmatrix}$, $t > t_j$, $t \to \infty$, (5.5a)

or

ii) $z(t) \approx z^{(j)}(t)$ on $[t_j, t_f^{(j)}]$.

In general, the control corrections $\Delta u^{(j)}(t)$, $t \ge t_j$, cause considerable expenses for measurements and control. Hence, choosing a feedback function $\varphi^{(j)} = \varphi^{(j)}(t, \Delta z)$, one also has to take into account certain control costs

$$\gamma = \gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u(t)\big), \qquad (5.5b)$$

depending on the whole feedback function $\varphi(t,\cdot,\cdot)$ and/or on the individual control inputs $\Delta u(t)$.

For given model parameters and initial conditions, the quality or performance of a feedback function $\varphi^{(j)} = \varphi^{(j)}(t, \Delta z)$ can be evaluated by the functional

$$J = J^{(j)}\big(\varphi^{(j)}; \Delta p_D, \Delta z_j\big) = \int_{t_j}^{t_f^{(j)}} \Big( \underbrace{c\big(t, \Delta q(t), \Delta\dot q(t)\big)}_{\text{costs of the tracking error } \Delta z(t)} + \underbrace{\gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u(t)\big)}_{\text{regulator costs}} \Big)\, dt \qquad (5.6a)$$

with a cost function

$$c = c\big(t, \Delta q, \Delta\dot q\big) = c(t, \Delta z) \qquad (5.6b)$$

for evaluating the tracking error $\Delta z$, as e.g.

$$c(t, \Delta z) = \Delta q(t)^T C_q\, \Delta q(t) + \Delta\dot q(t)^T C_{\dot q}\, \Delta\dot q(t), \qquad (5.6c)$$

involving positive (semi-)definite weight matrices $C_q = C_q(t)$, $C_{\dot q} = C_{\dot q}(t)$. Moreover, the regulator costs can be defined by

$$\gamma = \gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u(t)\big) := \Delta u^T C_u\, \Delta u \qquad (5.6d)$$

with a positive (semi-)definite matrix $C_u = C_u(t)$.


Minimizing the conditional expected total costs EJ , for the selection of an
optimal feedback-function ' D '.t;
z/, according to (5.3a)–(5.6d) at the j-th stage
we have the following regulator optimization problem:
0 .j /
1
Ztf ˇ
B      ˇ C
min E B
@
P
c t;
q.t/;
q.t/ C  t; '.t; ; /;
u.t/ dt ˇ Atj C
A (5.7a)
tj

subject to
 
P
F pD .!/; q .j / .t/ C
q.t/; qP .j / .t/ C
q.t/; R
qR .j / .t/ C
q.t/
 
P
D u.j /.t/ C ' t;
q.t/;
q.t/ ; t  tj ; a.s. (almost sure) (5.7b)
P j / D
q

q.tj / D
qj .!/;
q.t P j .!/; a.s.; (5.7c)

where
u.t/ is defined by (5.1b–d).

Obviously, (5.7a–c) is a stochastic optimization problem, i.e., an optimal control problem under stochastic uncertainty. Transforming the second order dynamic equation (5.7b) into a first order system of differential equations, the above regulator optimization problem can also be represented, cf. Sect. 3.1.2, in the following form:

$$\min\; E\left(\int_{t_j}^{t_f^{(j)}} \Big( c\big(t, \Delta z(t)\big) + \gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u(t)\big) \Big)\, dt\,\Bigg|\,\mathcal{A}_{t_j}\right) \qquad (5.8a)$$

subject to

$$\Delta\dot z(t) = f_0\big(p_D(\omega),\, z^{(j)}(t) + \Delta z(t)\big) - \dot z^{(j)}(t) + B\big(p_D(\omega),\, z^{(j)}(t) + \Delta z(t)\big)\Big(u^{(j)}(t) + \varphi\big(t, \Delta z(t)\big)\Big),\quad t \ge t_j,\ \text{a.s.,} \qquad (5.8b)$$

$$\Delta z(t_j) = \Delta z_j(\omega),\ \text{a.s.} \qquad (5.8c)$$

Remark 5.1 Note that state and control constraints can also be considered within
the stochastic optimal regulator optimization problem (5.7a–c).

5.2 Parametric Regulator Models

5.2.1 Linear Regulator

Linear regulators are defined by

$$\varphi(t, \Delta z) := G(t)\,\Delta z = \big(G_p(t),\, G_d(t)\big) \begin{pmatrix} \Delta q\\ \Delta\dot q \end{pmatrix} = G_p(t)\,\Delta q + G_d(t)\,\Delta\dot q \qquad (5.9a)$$

with unknown matrix functions $G_p = G_p(t)$, $G_d = G_d(t)$.
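As a toy illustration of the linear regulator (5.9a) (not from the book; the system and gain values are invented), constant gains $G_p, G_d$ applied to a scalar double-integrator tracking error $\Delta\ddot q = \Delta u$ drive the error state to zero:

```python
# Hypothetical constant gains G_p, G_d for a scalar double integrator:
# d^2(dq)/dt^2 = du,  with du = G_p * dq + G_d * d(dq)/dt  (cf. (5.9a)).
G_p, G_d = -4.0, -2.0

dt, n_steps = 1e-3, 20_000
dq, dq_dot = 1.0, 0.0                 # initial tracking errors dq_j, d(dq)/dt at t_j
for _ in range(n_steps):              # explicit Euler integration over 20 s
    du = G_p * dq + G_d * dq_dot
    dq, dq_dot = dq + dt * dq_dot, dq_dot + dt * du

# Closed loop lambda^2 + 2 lambda + 4 = 0 has Re(lambda) = -1, so after
# 20 s the tracking error has numerically died out:
assert abs(dq) < 1e-3 and abs(dq_dot) < 1e-3
```

In the stochastic setting of (5.7a–c) the gains are of course time-dependent matrix functions to be optimized, not fixed scalars as in this sketch.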

Nonlinear (Polynomial) Regulator

Considering the m-th order Taylor expansion of the feedback function $\varphi = \varphi(t, \Delta z)$ with respect to $\Delta z$, we obtain the representation

$$\varphi(t, \Delta z) := \sum_{l=1}^{m} G_l(t) \cdot (\Delta z)^l = \sum_{l=1}^{m} G_l(t) \cdot \underbrace{(\Delta z, \Delta z, \dots, \Delta z)}_{l}, \qquad (5.9b)$$

where each $G_l(t)\,\cdot$ denotes a multilinear form of order $l$ with values in $\mathbb{R}^n$. This equation can also be represented by

$$\varphi(t, \Delta z) = G(t)\, h(\Delta z) \qquad (5.9c)$$

involving a matrix function G = G(t) with elements to be determined, and a known vector function $h = h(\Delta z)$ containing the weighted different products of order 1 up to m of the components of $\Delta z$.

5.2.2 Explicit Representation of Polynomial Regulators

With the unit vectors $e_1, \dots, e_{2n}$ of $\mathbb{R}^{2n}$ we put

$$\Delta z = \sum_{k=1}^{2n} \Delta z_k\, e_k, \qquad (5.10a)$$

where

$$\Delta z_k = \Delta q_k,\quad k = 1, \dots, n, \qquad (5.10b)$$
$$\Delta z_{n+l} = \Delta\dot q_l,\quad l = 1, \dots, n. \qquad (5.10c)$$

Thereby the summands in (5.9b) can be represented by

$$G_l(t) \cdot \underbrace{(\Delta z, \dots, \Delta z)}_{l} = G_l(t) \cdot \left( \sum_{k_1=1}^{2n} \Delta z_{k_1} e_{k_1}, \dots, \sum_{k_l=1}^{2n} \Delta z_{k_l} e_{k_l} \right) = \sum_{k_1, k_2, \dots, k_l = 1}^{2n} G_l(t) \cdot (e_{k_1}, \dots, e_{k_l})\; \Delta z_{k_1} \cdots \Delta z_{k_l}. \qquad (5.11a)$$

Defining the n-vector functions $g_{k_1 \dots k_l} = g_{k_1 \dots k_l}(t)$ by

$$g_{k_1 \dots k_l}(t) := G_l(t) \cdot (e_{k_1}, \dots, e_{k_l}), \qquad (5.11b)$$

for (5.11a) we also obtain the representation

$$G_l(t) \cdot (\Delta z)^l = \sum_{k_1, k_2, \dots, k_l = 1}^{2n} g_{k_1 \dots k_l}(t)\; \Delta z_{k_1} \cdots \Delta z_{k_l}. \qquad (5.11c)$$

Obviously, (5.11c) depends linearly on the functions $g_{k_1 \dots k_l}(t)$. According to (5.9c) and (5.11c), the matrix function G = G(t) has the following columns:

$$G(t) = \big( \underbrace{g_1(t), \dots, g_{2n}(t)}_{\text{linear regulator}},\; \underbrace{g_{11}(t), \dots, g_{2n\,2n}(t)}_{\text{quadratic regulator}},\; \dots,\; \underbrace{g_{1\dots1}(t), \dots, g_{2n\dots2n}(t)}_{\text{polynomial regulator of order } m} \big). \qquad (5.12)$$
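The basis vector $h(\Delta z)$ of (5.9c) with the column ordering of (5.12) can be made concrete in a few lines. The sketch below is an illustration (the exact tuple ordering is our own choice, matching (5.12) only up to permutation within each order): it enumerates all products $\Delta z_{k_1}\cdots\Delta z_{k_l}$ for $l = 1,\dots,m$, so that $\varphi(t,\Delta z) = G(t)\,h(\Delta z)$:

```python
import itertools
import numpy as np

def h(dz: np.ndarray, m: int) -> np.ndarray:
    """All products dz_{k1} * ... * dz_{kl}, l = 1..m, ordered by degree."""
    dim = len(dz)                                    # dim = 2n
    feats = []
    for l in range(1, m + 1):
        for ks in itertools.product(range(dim), repeat=l):
            feats.append(np.prod(dz[list(ks)]))
    return np.array(feats)

dz = np.array([0.5, -1.0])                           # n = 1: (dq, dq_dot)
basis = h(dz, m=2)
assert len(basis) == 2 + 4                           # 2n linear + (2n)^2 quadratic terms

# A made-up gain matrix G maps the basis to the n-vector control correction:
G = np.ones((1, 6))
phi = G @ basis                                      # phi(t, dz) = G h(dz), cf. (5.9c)
assert np.isclose(phi[0], -0.25)
```

For a symmetric multilinear form $G_l(t)$ the tuples $(k_1,\dots,k_l)$ that are permutations of each other could be merged, which is what the "weighted different products" in the text alludes to.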

5.2.3 Remarks to the Linear Regulator

The representation of the feedback function $\varphi(t, \Delta z)$ discussed in the previous Sect. 5.2.2 is now illustrated for the case of linear regulators. With (5.9a), (5.11b) and (5.12) we get:

$$\varphi(t, \Delta z) \overset{(5.9a)}{=} G_p(t)\,\Delta q + G_d(t)\,\Delta\dot q = G_p(t) \sum_{k=1}^{n} \Delta q_k\, e_k + G_d(t) \sum_{l=1}^{n} \Delta\dot q_l\, e_l = \sum_{k=1}^{n} \big(G_p(t)\, e_k\big)\, \Delta q_k + \sum_{l=1}^{n} \big(G_d(t)\, e_l\big)\, \Delta\dot q_l$$
$$\overset{(5.11b)}{=} \sum_{k=1}^{n} g_{pk}(t)\,\Delta q_k + \sum_{l=1}^{n} g_{dl}(t)\,\Delta\dot q_l \overset{(5.12)}{=} G(t) \begin{pmatrix} \Delta q(t)\\ \Delta\dot q(t) \end{pmatrix}$$

with $G(t) := \big(g_{p1}(t), \dots, g_{pn}(t),\, g_{d1}(t), \dots, g_{dn}(t)\big)$.

Remark to the Regulator Costs

Consider once more (5.9c):

$$\varphi(t, \Delta z) = \underbrace{G(t)}_{\text{unknown}}\; \underbrace{h(\Delta z)}_{\text{given}}.$$

Thus, the cost function $\gamma$ can be defined by

$$\gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u(t)\big) =: \tilde\gamma\big(t, G(t)\big) \qquad (5.13)$$

with an appropriate function $\tilde\gamma$. Consequently, the optimal regulator problem (5.7a–c) under stochastic uncertainty can also be represented by:

$$\min\; E\left(\int_{t_j}^{t_f^{(j)}} \Big( c\big(t, \Delta q(t), \Delta\dot q(t)\big) + \tilde\gamma\big(t, G(t)\big) \Big)\, dt\,\Bigg|\,\mathcal{A}_{t_j}\right) \qquad (5.14a)$$

subject to

$$F\big(p_D(\omega),\, q^{(j)}(t) + \Delta q(t),\, \dot q^{(j)}(t) + \Delta\dot q(t),\, \ddot q^{(j)}(t) + \Delta\ddot q(t)\big) = u^{(j)}(t) + G(t)\, h\big(\Delta q(t), \Delta\dot q(t)\big),\quad t \ge t_j,\ \text{a.s. (almost surely)}, \qquad (5.14b)$$

$$\Delta q(t_j) = \Delta q_j(\omega),\quad \Delta\dot q(t_j) = \Delta\dot q_j(\omega),\ \text{a.s.} \qquad (5.14c)$$

5.3 Computation of the Expected Total Costs of the Optimal Regulator Design

Working with the expected total costs, in this section we now have to determine the conditional expectation of the functional (5.6a):

$$E\big(J^{(j)}\,\big|\,\mathcal{A}_{t_j}\big) = \int_{t_j}^{t_f^{(j)}} E\Big( c\big(t, \Delta q(t), \Delta\dot q(t)\big) + \gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u(t)\big)\,\Big|\,\mathcal{A}_{t_j}\Big)\, dt. \qquad (5.15)$$

The system of differential equations (5.4a, b) yields the trajectory

$$\Delta q(t) = \Delta q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big),\quad t_j \le t \le t_f^{(j)}, \qquad (5.16a)$$
$$\Delta\dot q(t) = \Delta\dot q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big),\quad t_j \le t \le t_f^{(j)}, \qquad (5.16b)$$

where

$$\Delta p_D = \Delta p_D(\omega), \qquad (5.16c)$$
$$\Delta q_j = \Delta q_j(\omega), \qquad (5.16d)$$
$$\Delta\dot q_j = \Delta\dot q_j(\omega). \qquad (5.16e)$$

According to (5.15), we first consider the computation of the expectations

$$E\Big( c\Big(t,\; \Delta q\big(t, \Delta p_D(\omega), \Delta q_j(\omega), \Delta\dot q_j(\omega)\big),\; \Delta\dot q\big(t, \Delta p_D(\omega), \Delta q_j(\omega), \Delta\dot q_j(\omega)\big)\Big)\,\Big|\,\mathcal{A}_{t_j}\Big)$$

and

$$E\Big( \gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u(t,\omega)\big)\,\Big|\,\mathcal{A}_{t_j}\Big) \qquad (5.17a)$$

with

$$\Delta u(t,\omega) = \varphi\Big(t,\; \Delta z\big(t, \Delta p_D(\omega), \Delta z_j(\omega)\big)\Big), \qquad (5.17b)$$
$$\Delta z(t) = \begin{pmatrix} \Delta q(t)\\ \Delta\dot q(t) \end{pmatrix}, \qquad (5.17c)$$
$$\Delta z_j = \begin{pmatrix} \Delta q_j\\ \Delta\dot q_j \end{pmatrix}. \qquad (5.17d)$$

5.3.1 Computation of Conditional Expectations by Taylor Expansion

For abbreviation, we introduce the random parameter vector and its conditional expectation

$$a(\omega) := \begin{pmatrix} \Delta p_D(\omega)\\ \Delta q_j(\omega)\\ \Delta\dot q_j(\omega) \end{pmatrix},\qquad \bar a = \bar a^{(j)} := \begin{pmatrix} E\big(\Delta p_D(\omega)\,\big|\,\mathcal{A}_{t_j}\big)\\ E\big(\Delta q_j(\omega)\,\big|\,\mathcal{A}_{t_j}\big)\\ E\big(\Delta\dot q_j(\omega)\,\big|\,\mathcal{A}_{t_j}\big) \end{pmatrix}. \qquad (5.18a)$$

Here, we may readily assume that the following property holds:

The random vectors $p_D(\omega)$ and $\Delta z_j(\omega) = \begin{pmatrix} \Delta q_j(\omega)\\ \Delta\dot q_j(\omega) \end{pmatrix}$ are stochastically independent. (5.18b)

Then, the first conditional expectation (5.17a) can be represented by

$$E\Big( c\big(t, \Delta z(t, a(\omega))\big)\,\Big|\,\mathcal{A}_{t_j}\Big) = E\big( \tilde c(t, a(\omega))\,\big|\,\mathcal{A}_{t_j}\big), \qquad (5.19a)$$

where

$$\tilde c(t, a) := c\big(t, \Delta z(t, a)\big). \qquad (5.19b)$$

For the computation of the expectation $E\big(\tilde c(t, a(\omega))\,\big|\,\mathcal{A}_{t_j}\big)$ we consider the Taylor expansion of $a \to \tilde c(t, a)$ at $a = \bar a$ with

$$\bar a := E\big(a(\omega)\,\big|\,\mathcal{A}_{t_j}\big) = \Big( E\big(a_1(\omega)\,\big|\,\mathcal{A}_{t_j}\big), \dots, E\big(a_\nu(\omega)\,\big|\,\mathcal{A}_{t_j}\big) \Big)^T,$$

where $\nu$ denotes the dimension of $a$; see Sect. 1.6.3. Hence,

$$\tilde c(t, a) = \tilde c(t, \bar a) + \nabla_a \tilde c(t, \bar a)^T (a - \bar a) + \frac{1}{2}\,(a - \bar a)^T\, \nabla_a^2 \tilde c(t, \bar a)\,(a - \bar a) + \frac{1}{3!}\, \nabla_a^3 \tilde c(t, \bar a) \cdot (a - \bar a)^3 + \dots, \qquad (5.19c)$$

and therefore

$$E\big(\tilde c(t, a(\omega))\,\big|\,\mathcal{A}_{t_j}\big) = \tilde c(t, \bar a) + \frac{1}{2}\, E\Big( (a - \bar a)^T\, \nabla_a^2 \tilde c(t, \bar a)\,(a - \bar a)\,\Big|\,\mathcal{A}_{t_j}\Big) + \frac{1}{3!}\, E\Big( \nabla_a^3 \tilde c(t, \bar a) \cdot (a(\omega) - \bar a)^3\,\Big|\,\mathcal{A}_{t_j}\Big) + \dots. \qquad (5.19d)$$

Here, (5.19d) is obtained from (5.19c) due to the following equation:

$$E\Big( \nabla_a \tilde c(t, \bar a)^T (a(\omega) - \bar a)\,\Big|\,\mathcal{A}_{t_j}\Big) = \nabla_a \tilde c(t, \bar a)^T \Big( \underbrace{E\big(a(\omega)\,\big|\,\mathcal{A}_{t_j}\big)}_{=\,\bar a^{(j)}} - \underbrace{\bar a}_{=\,\bar a^{(j)}} \Big) = 0.$$

In practice, mostly Taylor expansions up to second or possibly third order are taken into account. Thus, in the following we consider Taylor expansions of $a \to \tilde c(t, a)$ up to third order. Consequently, for the expansion (5.19d) up to the 3rd order we need the derivatives $\nabla_a^2 \tilde c(t, \bar a)$, $\nabla_a^3 \tilde c(t, \bar a)$ at $\bar a$.
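Truncating (5.19d) after the second-order term gives $E(\tilde c(t,a(\omega))\,|\,\mathcal{A}_{t_j}) \approx \tilde c(t,\bar a) + \tfrac12\operatorname{tr}\big(\nabla_a^2\tilde c(t,\bar a)\operatorname{Cov}(a)\big)$. The following sketch (illustrative only; the cost function and parameter distribution are made up) compares this approximation with a Monte Carlo estimate:

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up smooth cost c~(a) = exp(a1) + a2^2 and Gaussian parameter vector.
c_tilde = lambda a: np.exp(a[..., 0]) + a[..., 1] ** 2
a_bar = np.array([0.1, -0.2])
cov   = np.array([[0.04, 0.01], [0.01, 0.09]])

# Second-order Taylor approximation: c~(a_bar) + 0.5 * tr(Hessian @ Cov).
hess = np.array([[np.exp(a_bar[0]), 0.0], [0.0, 2.0]])  # Hessian of c~ at a_bar
taylor = c_tilde(a_bar) + 0.5 * np.trace(hess @ cov)

# Monte Carlo reference value of E c~(a(omega)).
a = rng.multivariate_normal(a_bar, cov, size=400_000)
mc = c_tilde(a).mean()
assert abs(taylor - mc) < 5e-3
```

The first-order term drops out exactly as in the displayed equation above, since $E(a-\bar a) = 0$; the residual error here stems only from the neglected third- and higher-order moments.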

Computation of $\nabla_a \tilde c(t, a)$

According to (5.19b) we have

$$\tilde c(t, a) = c\big(t, \Delta z(t, a)\big).$$

Differentiation with respect to the components of a yields

$$\frac{\partial \tilde c}{\partial a_j}(t, a) = \sum_{k=1}^{2n} \frac{\partial c}{\partial \Delta z_k}\big(t, \Delta z(t, a)\big)\, \frac{\partial \Delta z_k}{\partial a_j}(t, a),\quad j = 1, \dots, \nu. \qquad (5.20)$$

Writing this in vector-matrix form, we get, cf. Sect. 1.6.3,

$$\nabla_a \tilde c(t, a) = \big(\nabla_a \Delta z(t, a)\big)^T\, \nabla_{\Delta z} c\big(t, \Delta z(t, a)\big). \qquad (5.21)$$

Higher derivatives with respect to a follow by further differentiation of (5.21). However, in the following we consider quadratic cost functions, which simplifies the computation of the expected cost function (5.15).

5.3.2 Quadratic Cost Functions

For the regulator optimization problem under stochastic uncertainty (5.7a–c) we now consider the following class of quadratic cost functions:

(i) Let the cost function $c = c(t, \Delta q, \Delta\dot q)$ for the evaluation of the deviation $\Delta z = \begin{pmatrix} \Delta q\\ \Delta\dot q \end{pmatrix}$ between the effective trajectory $z = z(t) = \begin{pmatrix} q(t)\\ \dot q(t) \end{pmatrix}$ and the optimal reference trajectory $z^{(j)} = z^{(j)}(t) = \begin{pmatrix} q^{(j)}(t)\\ \dot q^{(j)}(t) \end{pmatrix}$, $t \ge t_j$, be given by (5.6c), i.e.

$$c(t, \Delta q, \Delta\dot q) := \Delta q^T C_q\, \Delta q + \Delta\dot q^T C_{\dot q}\, \Delta\dot q,\quad t \ge t_j,$$

with symmetric, positive (semi-)definite matrices $C_q = C_q(t)$, $C_{\dot q} = C_{\dot q}(t)$, $t \ge t_j$.

(ii) Suppose that the cost function for the evaluation of the regulator expenses is given by (5.6d), hence

$$\gamma\big(t, \varphi(t,\cdot,\cdot), \Delta u\big) := \Delta u^T C_u\, \Delta u,\quad t \ge t_j,$$

with a symmetric, positive (semi-)definite matrix $C_u = C_u(t)$, $t \ge t_j$.
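Under assumptions (i) and (ii), the stage cost is a pure quadratic form in $(\Delta q, \Delta\dot q, \Delta u)$, hence nonnegative for positive semidefinite weights. The sketch below (weight matrices and deviations are invented for illustration) evaluates it once:

```python
import numpy as np

def stage_cost(dq, dq_dot, du, C_q, C_qdot, C_u):
    """c(t, dq, dq_dot) + gamma = dq'C_q dq + dq_dot'C_qdot dq_dot + du'C_u du."""
    return dq @ C_q @ dq + dq_dot @ C_qdot @ dq_dot + du @ C_u @ du

n = 2
C_q, C_qdot, C_u = np.eye(n), 0.1 * np.eye(n), 0.01 * np.eye(n)   # made-up weights
dq     = np.array([0.2, -0.1])
dq_dot = np.array([0.0, 0.3])
du     = np.array([1.5, -2.0])

cost = stage_cost(dq, dq_dot, du, C_q, C_qdot, C_u)
assert cost >= 0.0        # quadratic form with positive semidefinite weights
assert np.isclose(cost, 0.2**2 + 0.1**2 + 0.1 * 0.3**2 + 0.01 * (1.5**2 + 2.0**2))
```

Integrating this stage cost over $[t_j, t_f^{(j)}]$ and taking the conditional expectation yields exactly the objective (5.22a) of the following section.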

5.4 Approximation of the Stochastic Regulator Optimization Problem

Under the cost assumptions (i), (ii) in Sect. 5.3.2 and because of (5.1b), thus

$$\Delta u := \varphi\big(t, \Delta q(t), \Delta\dot q(t)\big),$$

from (5.7a–c) we now obtain (at stage j) the following optimal regulator problem under stochastic uncertainty:

$$\min\; E\left(\int_{t_j}^{t_f^{(j)}} \Big( \Delta q(t)^T C_q\, \Delta q(t) + \Delta\dot q(t)^T C_{\dot q}\, \Delta\dot q(t) + \varphi\big(t, \Delta q(t), \Delta\dot q(t)\big)^T C_u\, \varphi\big(t, \Delta q(t), \Delta\dot q(t)\big) \Big)\, dt\,\Bigg|\,\mathcal{A}_{t_j}\right) \qquad (5.22a)$$

s.t.

$$F\big(\bar p_D^{(j)} + \Delta p_D(\omega),\, q^{(j)}(t) + \Delta q(t),\, \dot q^{(j)}(t) + \Delta\dot q(t),\, \ddot q^{(j)}(t) + \Delta\ddot q(t)\big) = u^{(j)}(t) + \varphi\big(t, \Delta q(t), \Delta\dot q(t)\big),\ \text{a.s.,}\quad t \ge t_j, \qquad (5.22b)$$

$$\Delta q(t_j) = \Delta q_j(\omega),\quad \Delta\dot q(t_j) = \Delta\dot q_j(\omega),\ \text{a.s.} \qquad (5.22c)$$

As already mentioned in Sect. 5.3, (5.16a, b), for the tracking error we obtain from (5.22b, c) the representation

$$\Delta q(t) = \Delta q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big),\quad t \ge t_j,$$
$$\Delta\dot q(t) = \Delta\dot q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big),\quad t \ge t_j,$$

where $\Delta p_D = \Delta p_D(\omega)$, $\Delta q_j = \Delta q_j(\omega)$, $\Delta\dot q_j = \Delta\dot q_j(\omega)$. Under appropriate regularity assumptions on the functions F, $\varphi$ as well as on $q^{(j)}$ and $u^{(j)}$, the solution

$$\Delta z = \begin{pmatrix} \Delta q\\ \Delta\dot q \end{pmatrix} = \begin{pmatrix} \Delta q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big)\\ \Delta\dot q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big) \end{pmatrix},\quad t \ge t_j,$$

of (5.22b, c) has continuous partial derivatives (of a prescribed order) with respect to the parameter deviations $\Delta p_D, \Delta q_j, \Delta\dot q_j$. By Taylor expansion of $\Delta z$ with respect to the vector of deviations $\Delta a = \big(\Delta p_D^T, \Delta q_j^T, \Delta\dot q_j^T\big)^T$ at $\Delta a = 0$ we find

$$\Delta q(t) = \Delta q(t, 0) + \frac{\partial \Delta q}{\partial \Delta p_D}(t, 0)\,\Delta p_D + \frac{\partial \Delta q}{\partial \Delta q_j}(t, 0)\,\Delta q_j + \frac{\partial \Delta q}{\partial \Delta\dot q_j}(t, 0)\,\Delta\dot q_j + \dots \qquad (5.23a)$$

$$\Delta\dot q(t) = \Delta\dot q(t, 0) + \frac{\partial \Delta\dot q}{\partial \Delta p_D}(t, 0)\,\Delta p_D + \frac{\partial \Delta\dot q}{\partial \Delta q_j}(t, 0)\,\Delta q_j + \frac{\partial \Delta\dot q}{\partial \Delta\dot q_j}(t, 0)\,\Delta\dot q_j + \dots \qquad (5.23b)$$

with $0 := \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}$ and the Jacobians

$$\frac{\partial \Delta q}{\partial \Delta p_D}\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big) := \begin{pmatrix} \dfrac{\partial \Delta q_1}{\partial \Delta p_{D_1}} & \dfrac{\partial \Delta q_1}{\partial \Delta p_{D_2}} & \cdots & \dfrac{\partial \Delta q_1}{\partial \Delta p_{D_\nu}}\\ \vdots & \vdots & & \vdots\\ \dfrac{\partial \Delta q_n}{\partial \Delta p_{D_1}} & \dfrac{\partial \Delta q_n}{\partial \Delta p_{D_2}} & \cdots & \dfrac{\partial \Delta q_n}{\partial \Delta p_{D_\nu}} \end{pmatrix}\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big),$$

etc. For the choice of parameters $\Delta a = 0$, i.e., $\Delta p_D = 0$, $\Delta q_j = 0$, $\Delta\dot q_j = 0$, according to (5.22b, c) the function $\Delta z = \Delta z(t, 0) = \begin{pmatrix} \Delta q(t, 0)\\ \Delta\dot q(t, 0) \end{pmatrix}$ is the solution of the system of differential equations

$$F\big(\bar p_D^{(j)},\, q^{(j)}(t) + \Delta q(t),\, \dot q^{(j)}(t) + \Delta\dot q(t),\, \ddot q^{(j)}(t) + \Delta\ddot q(t)\big) = u^{(j)}(t) + \varphi\big(t, \Delta q(t), \Delta\dot q(t)\big),\quad t \ge t_j, \qquad (5.24a)$$

$$\Delta q(t_j) = 0,\quad \Delta\dot q(t_j) = 0. \qquad (5.24b)$$

Because of (4.50c), i.e.,

$$F\big(\bar p_D^{(j)},\, q^{(j)}(t),\, \dot q^{(j)}(t),\, \ddot q^{(j)}(t)\big) = u^{(j)}(t),\quad t \ge t_j,$$

and (5.1d), thus $\varphi(t, 0, 0) = 0$, system (5.24a, b) has the unique solution

$$\Delta z(t, 0) = \begin{pmatrix} \Delta q(t, 0)\\ \Delta\dot q(t, 0) \end{pmatrix} = 0 = \begin{pmatrix} 0\\ 0 \end{pmatrix},\quad t \ge t_j.$$

Consequently, the expansion (5.23a, b) reads

$$\Delta q(t) = \Delta q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big) = \frac{\partial \Delta q}{\partial \Delta p_D}(t, 0)\,\Delta p_D + \frac{\partial \Delta q}{\partial \Delta q_j}(t, 0)\,\Delta q_j + \frac{\partial \Delta q}{\partial \Delta\dot q_j}(t, 0)\,\Delta\dot q_j + \dots \qquad (5.25a)$$

$$\Delta\dot q(t) = \Delta\dot q\big(t, \Delta p_D, \Delta q_j, \Delta\dot q_j\big) = \frac{\partial \Delta\dot q}{\partial \Delta p_D}(t, 0)\,\Delta p_D + \frac{\partial \Delta\dot q}{\partial \Delta q_j}(t, 0)\,\Delta q_j + \frac{\partial \Delta\dot q}{\partial \Delta\dot q_j}(t, 0)\,\Delta\dot q_j + \dots \qquad (5.25b)$$

The Jacobians $\frac{\partial \Delta q}{\partial \Delta p_D}, \frac{\partial \Delta q}{\partial \Delta q_j}, \dots$ arising in (5.25a, b) can be obtained by partial differentiation of the system of differential equations (5.22b, c) with respect to $\Delta p_D, \Delta q_j, \Delta\dot q_j$; cf. Sect. 4.6.2.

5.4.1 Approximation of the Expected Costs: Expansions of 1st


Order

According to (5.22a), we have to determine the following conditional expectation:


  
E f .j / jAj WD E
q.t/T Cq
q.t/ C
q.t/P T P
CqP
q.t/
 T  ˇˇ 
P
C' t;
q.t/;
q.t/ P
Cu ' t;
q.t/;
q.t/ ˇAtj : (5.26)

For the computation of the expectation in (5.26) also the feedback function ' D
P is approximated by means of Taylor expansion at
q D 0;
q
'.t;
q;
q/ P D 0.
Because of (5.1d), hence, '.t; 0; 0/ D 0; we get

P D @' @' P C :::


'.t;
q;
q/ .t; 0; 0/
q C .t; 0; 0/
q (5.27a)
@
q P
@
q
with unknown Jacobians

∂φ/∂Δq(t, 0, 0),   ∂φ/∂Δq̇(t, 0, 0),   ...                                    (5.27b)

to be determined. Using in the Taylor expansion (5.27a) of the feedback function
φ = φ(t, ·, ·) only the linear terms, we find, cf. (5.26),

E( f^{(j)} | A_{t_j} ) ≈ E( f̃^{(j)} | A_{t_j} ),                              (5.28a)

where

E( f̃^{(j)} | A_{t_j} )
 := E( Δq(t)^T C_q Δq(t) + Δq̇(t)^T C_q̇ Δq̇(t)
    + ( ∂φ/∂Δq(t, 0, 0) Δq(t) + ∂φ/∂Δq̇(t, 0, 0) Δq̇(t) )^T
      C_u ( ∂φ/∂Δq(t, 0, 0) Δq(t) + ∂φ/∂Δq̇(t, 0, 0) Δq̇(t) ) | A_{t_j} )
  = E( Δq(t)^T C_q Δq(t) + Δq̇(t)^T C_q̇ Δq̇(t)
    + ( Δq(t)^T ∂φ/∂Δq(t, 0, 0)^T C_u + Δq̇(t)^T ∂φ/∂Δq̇(t, 0, 0)^T C_u )
      ( ∂φ/∂Δq(t, 0, 0) Δq(t) + ∂φ/∂Δq̇(t, 0, 0) Δq̇(t) ) | A_{t_j} ).          (5.28b)

This yields

 
E fQ.j /jAj WD E
q.t/T Cq
q.t/ C
q.t/
P T P
CqP
q.t/

@' @'
C
q.t/T .t; 0; 0/T Cu .t; 0; 0/
q.t/
@
q @
q

P @' @' P
C
q.t/ T
.t; 0; 0/T Cu .t; 0; 0/
q.t/
P
@
q P
@
q
@' @' P
C
q.t/T .t; 0; 0/T Cu .t; 0; 0/
q.t/
@
q P
@
q
!
ˇ
P T @' @' ˇ
C
q.t/ .t; 0; 0/T Cu .t; 0; 0/
q.t/ˇAtj
P
@
q @
q
(5.28c)
and therefore

E( f̃^{(j)} | A_{t_j} ) = E( Δq(t)^T ( C_q + ∂φ/∂Δq(t, 0, 0)^T C_u ∂φ/∂Δq(t, 0, 0) ) Δq(t)
   + Δq̇(t)^T ( C_q̇ + ∂φ/∂Δq̇(t, 0, 0)^T C_u ∂φ/∂Δq̇(t, 0, 0) ) Δq̇(t)
   + 2 Δq(t)^T ∂φ/∂Δq(t, 0, 0)^T C_u ∂φ/∂Δq̇(t, 0, 0) Δq̇(t) | A_{t_j} ).        (5.28d)

Putting

Q_q = Q_q(t, φ) := C_q + ∂φ/∂Δq(t, 0, 0)^T C_u ∂φ/∂Δq(t, 0, 0)                 (5.29a)
Q_q̇ = Q_q̇(t, φ) := C_q̇ + ∂φ/∂Δq̇(t, 0, 0)^T C_u ∂φ/∂Δq̇(t, 0, 0)                 (5.29b)
Q_u = Q_u(t, φ) := ∂φ/∂Δq(t, 0, 0)^T C_u ∂φ/∂Δq̇(t, 0, 0),                      (5.29c)

we obtain

f̃^{(j)} := Δq(t)^T ( C_q + ∂φ/∂Δq(t, 0, 0)^T C_u ∂φ/∂Δq(t, 0, 0) ) Δq(t)
   + Δq̇(t)^T ( C_q̇ + ∂φ/∂Δq̇(t, 0, 0)^T C_u ∂φ/∂Δq̇(t, 0, 0) ) Δq̇(t)
   + 2 Δq(t)^T ∂φ/∂Δq(t, 0, 0)^T C_u ∂φ/∂Δq̇(t, 0, 0) Δq̇(t)
  = ( Δq(t)^T, Δq̇(t)^T ) [ Q_q  Q_u ; Q_u^T  Q_q̇ ] ( Δq(t) ; Δq̇(t) )
  = Δz(t)^T Q(t, φ) Δz(t).                                                      (5.30a)

Here,

Δz(t) = ( Δq(t) ; Δq̇(t) ),                                                     (5.30b)

Q(t, φ) := [ Q_q  Q_u ; Q_u^T  Q_q̇ ]
  = [ C_q + ∂φ/∂Δq(t,0,0)^T C_u ∂φ/∂Δq(t,0,0)     ∂φ/∂Δq(t,0,0)^T C_u ∂φ/∂Δq̇(t,0,0) ;
      ∂φ/∂Δq̇(t,0,0)^T C_u ∂φ/∂Δq(t,0,0)           C_q̇ + ∂φ/∂Δq̇(t,0,0)^T C_u ∂φ/∂Δq̇(t,0,0) ].   (5.30c)
Obviously, Q_q, Q_q̇ and also Q = Q(t, φ) are symmetric matrices. Since
C_q, C_q̇, C_u are positive (semi-)definite matrices, according to the definition (5.28b)
of the approximate total cost function we get

f̃^{(j)}(t, Δz) > (≥) 0   for all Δz ≠ 0.

Thus, we have the following result on the approximate total cost function
f̃^{(j)} = f̃^{(j)}(t, Δz):

Lemma 5.1  The quadratic approximation f̃^{(j)} = f̃^{(j)}(t, Δz) of f^{(j)} = f^{(j)}(t, Δz) is
a positive semidefinite quadratic form in Δz = ( Δq ; Δq̇ ). If C_q = C_q(t), C_q̇ = C_q̇(t)
are positive definite weight matrices, then the matrix Q = Q(t, φ) is also positive
definite, and f̃^{(j)} is then a positive definite quadratic form in Δz.

Proof  The first part of the assertion is clear. Suppose now that C_q, C_q̇ are positive
definite weight matrices. Then, according to the definition of f̃^{(j)} = f̃^{(j)}(t, Δz)
by (5.28b, c) and since C_u is positive (semi-)definite, we have f̃^{(j)}(t, Δz) = 0 if
and only if Δz = 0.
 
For the approximate computation of the expectation E( f̃^{(j)} | A_{t_j} ) we now use
only the linear terms of (5.25a, b), hence

Δz(t) = ( Δq(t) ; Δq̇(t) )
  ≅ [ ∂Δq/∂Δp_D(t, 0) ; ∂Δq̇/∂Δp_D(t, 0) ] Δp_D
    + [ ∂Δq/∂Δq_j(t, 0) ; ∂Δq̇/∂Δq_j(t, 0) ] Δq_j
    + [ ∂Δq/∂Δq̇_j(t, 0) ; ∂Δq̇/∂Δq̇_j(t, 0) ] Δq̇_j
  = ∂Δz/∂Δp_D(t, 0) Δp_D + ∂Δz/∂Δq_j(t, 0) Δq_j + ∂Δz/∂Δq̇_j(t, 0) Δq̇_j.        (5.31)

Inserting (5.31) into (5.30a) yields

f̃^{(j)} = Δz^T Q(t, φ) Δz
  = ( Δp_D^T ∂Δz/∂Δp_D(t, 0)^T + Δq_j^T ∂Δz/∂Δq_j(t, 0)^T + Δq̇_j^T ∂Δz/∂Δq̇_j(t, 0)^T )
    Q ( ∂Δz/∂Δp_D(t, 0) Δp_D + ∂Δz/∂Δq_j(t, 0) Δq_j + ∂Δz/∂Δq̇_j(t, 0) Δq̇_j )
  = Δp_D^T ( ∂Δz/∂Δp_D(t, 0) )^T Q ( ∂Δz/∂Δp_D(t, 0) ) Δp_D
  + Δq_j^T ( ∂Δz/∂Δq_j(t, 0) )^T Q ( ∂Δz/∂Δq_j(t, 0) ) Δq_j
  + Δq̇_j^T ( ∂Δz/∂Δq̇_j(t, 0) )^T Q ( ∂Δz/∂Δq̇_j(t, 0) ) Δq̇_j
  + 2 Δq_j^T ( ∂Δz/∂Δq_j(t, 0) )^T Q ( ∂Δz/∂Δp_D(t, 0) ) Δp_D
  + 2 Δq̇_j^T ( ∂Δz/∂Δq̇_j(t, 0) )^T Q ( ∂Δz/∂Δp_D(t, 0) ) Δp_D
  + 2 Δq̇_j^T ( ∂Δz/∂Δq̇_j(t, 0) )^T Q ( ∂Δz/∂Δq_j(t, 0) ) Δq_j.                 (5.32)

We may readily assume that the random vectors

Δp_D(ω), Δq_j(ω)   as well as   Δp_D(ω), Δq̇_j(ω)

are stochastically independent. Because of

E( Δp_D(ω) | A_{t_j} ) = 0,

(5.32) yields

E( f̃^{(j)} | A_{t_j} ) = E( Δp_D(ω)^T P_{Δp_D} Δp_D(ω) | A_{t_j} )
  + E( Δq_j(ω)^T P_{Δq_j} Δq_j(ω) | A_{t_j} ) + E( Δq̇_j(ω)^T P_{Δq̇_j} Δq̇_j(ω) | A_{t_j} )
  + 2 E( Δq̇_j(ω)^T P_{Δq̇_j, Δq_j} Δq_j(ω) | A_{t_j} )                           (5.33a)

with the matrices

P_{Δp_D} := ( ∂Δz/∂Δp_D(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δp_D(t, 0) ),                  (5.33b)
P_{Δq_j} := ( ∂Δz/∂Δq_j(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δq_j(t, 0) ),                  (5.33c)
P_{Δq̇_j} := ( ∂Δz/∂Δq̇_j(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δq̇_j(t, 0) ),                  (5.33d)
P_{Δq̇_j, Δq_j} := ( ∂Δz/∂Δq̇_j(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δq_j(t, 0) ).            (5.33e)

The P-matrices defined by (5.33b–e) are deterministic and depend on the
feedback law φ = φ(t, Δz).

The conditional expectations in (5.33a) can be determined as follows: Let P =
(p_{kl}) denote one of the P-matrices, where the first three matrices P_{Δp_D}, P_{Δq_j}, P_{Δq̇_j}
are symmetric. Now, for P = P_{Δp_D} we get

E( Δp_D(ω)^T P_{Δp_D} Δp_D(ω) | A_{t_j} ) = E( Σ_{k,l=1}^{ν} p_{kl} Δp_{D_k}(ω) Δp_{D_l}(ω) | A_{t_j} )
  = Σ_{k,l=1}^{ν} p_{kl} E( Δp_{D_k}(ω) Δp_{D_l}(ω) | A_{t_j} )
  = Σ_{k,l=1}^{ν} p_{kl} cov^{(j)}( p_{D_k}, p_{D_l} ).                           (5.34a)

Here, the last equation follows, cf. (5.16c), from the following equations:

cov^{(j)}( p_{D_k}, p_{D_l} ) := E( ( p_{D_k}(ω) − p̄_{D_k}^{(j)} )( p_{D_l}(ω) − p̄_{D_l}^{(j)} ) | A_{t_j} )
  = E( Δp_{D_k}(ω) Δp_{D_l}(ω) | A_{t_j} ),   k, l = 1, ..., ν.                   (5.34b)

Because of the symmetry of the covariance matrix

cov^{(j)}( p_D(·) ) := ( cov^{(j)}( p_{D_k}(·), p_{D_l}(·) ) )_{k,l=1,...,ν}
  = E( ( p_D(ω) − p̄_D^{(j)} )( p_D(ω) − p̄_D^{(j)} )^T | A_{t_j} )                 (5.34c)

of p_D = p_D(ω), from (5.34a) we obtain

E( Δp_D(ω)^T P_{Δp_D} Δp_D(ω) | A_{t_j} ) = Σ_{k,l=1}^{ν} p_{kl} cov^{(j)}( p_{D_k}, p_{D_l} )
  = Σ_{k,l=1}^{ν} p_{kl} cov^{(j)}( p_{D_l}, p_{D_k} )
  = Σ_{k=1}^{ν} ( Σ_{l=1}^{ν} p_{kl} cov^{(j)}( p_{D_l}, p_{D_k} ) )
5.4 Approximation of the Stochastic Regulator Optimization Problem 215

   
X  
D k-th row of P  k-th column of cov.j / pD ./
kD1
 
D tr P
pD cov.j / pD ./ : (5.35a)

In the same way we obtain


 ˇ   
ˇ
E
qj .!/T P
qj
qj .!/ˇAtj D tr P
qj cov.j /
qj ./ : (5.35b)
 ˇ   
E
q P j .!/ˇˇAtj D tr P P cov.j /
q
P j .!/T P P
q P j ./ : (5.35c)

qj
qj

For the last expectation in (5.33a) with P WD P


q
P j ;
qj we find

 ˇ  X
 ˇ 
P j .!/T P P ˇ P j k .!/
qj l .!/ˇˇAtj
E
q
qj ;
qj
qj .!/ ˇAt j D E p kl
q
k;lD1

X

 
D P j k ./;
qj l ./
pkl cov.j /
q
k;lD1

X

 
D P j k ./
pkl cov.j /
qj l ./;
q
k;lD1
 X
X 
 
D P j k ./
pkl cov.j /
qj l ./;
q
kD1 lD1
   
X  
D P j ./ ;
k-th row of P  k-th column of cov.j /
qj ./;
q (5.35d)
kD1

where
    .j / ˇˇ 
P j ./ WD E
qj .!/ 
qj .j /
q
cov.j /
qj ./;
q P j .!/ 
q
P j ˇAtj
(5.35e)

According to (5.35d) we have


 ˇ   
P j .!/T P P ˇ P j ./
E
q
qj ;
qj
qj .!/ ˇAtj D tr P
q
P j ;
qj cov
.j /

qj ./;
q
(5.35f)
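The trace reduction (5.35a), E(Δp^T P Δp | A) = tr(P cov(p_D(·))), can be confirmed by simulation; the covariance and weight matrix below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
nu = 3

# Symmetric weight matrix P and a positive definite covariance Sigma (illustrative)
P = np.diag([1.0, 2.0, 3.0])
B = rng.standard_normal((nu, nu))
Sigma = B @ B.T + nu * np.eye(nu)

# Exact value tr(P Sigma) versus a Monte Carlo estimate of E[x^T P x], x ~ N(0, Sigma)
exact = np.trace(P @ Sigma)
x = rng.multivariate_normal(np.zeros(nu), Sigma, size=200_000)
mc = np.mean(np.einsum('ij,jk,ik->i', x, P, x))   # per-sample quadratic forms

assert abs(mc - exact) / exact < 0.05             # agreement within Monte Carlo error
```

The identity needs only that x has zero mean and covariance Σ; normality of the samples is used here merely for convenience.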
Inserting (5.35a–c) and (5.35f) into (5.33a), we finally get

E( f̃^{(j)} | A_{t_j} ) = tr( P_{Δp_D} cov^{(j)}( p_D(·) ) ) + tr( P_{Δq_j} cov^{(j)}( Δq_j(·) ) )
  + tr( P_{Δq̇_j} cov^{(j)}( Δq̇_j(·) ) )
  + 2 tr( P_{Δq̇_j, Δq_j} cov^{(j)}( Δq_j(·), Δq̇_j(·) ) ).                        (5.36)
5.5 Computation of the Derivatives of the Tracking Error

As can be seen from (5.33a–e) and (5.36), for the approximate computation of the
conditional expectation of the objective function (5.22a) we need the Jacobians or
"sensitivities":

∂Δq/∂Δp_D(t, 0),   ∂Δq̇/∂Δp_D(t, 0),   ∂Δq/∂Δq_j(t, 0),
∂Δq̇/∂Δq_j(t, 0),   ∂Δq/∂Δq̇_j(t, 0),   ∂Δq̇/∂Δq̇_j(t, 0)

of the functions

Δq  = Δq(t, Δp_D, Δq_j, Δq̇_j),   t ≥ t_j,
Δq̇ = Δq̇(t, Δp_D, Δq_j, Δq̇_j),   t ≥ t_j.

As already mentioned in Sect. 5.4, conditions in the form of initial value problems
can be obtained by differentiation of the dynamic equation with respect to the
parameter deviations Δp_D, Δq_j, Δq̇_j. According to (5.22b, c), for the actual
trajectory Δz(t) = ( Δq(t) ; Δq̇(t) ), t ≥ t_j, we have the second order initial value
problem

F( p̄_D^{(j)} + Δp_D, q^{(j)}(t) + Δq(t), q̇^{(j)}(t) + Δq̇(t), q̈^{(j)}(t) + Δq̈(t) )
  = u^{(j)}(t) + φ(t, Δq(t), Δq̇(t)),   t ≥ t_j,                                 (5.37a)

Δq(t_j)  = Δq_j,                                                                 (5.37b)
Δq̇(t_j) = Δq̇_j.                                                                 (5.37c)

The needed derivatives follow from (5.37a–c) by implicit differentiation with
respect to Δp_D, Δq_j, Δq̇_j. Here, the following equations for the functional matrices
have to be taken into account:

∂Δq/∂Δp_D = ( grad_{Δp_D} Δq_1^T ; ... ; grad_{Δp_D} Δq_n^T )
          = ( ∂Δq/∂Δp_{D_1}, ∂Δq/∂Δp_{D_2}, ..., ∂Δq/∂Δp_{D_ν} )                 (5.38a)

∂Δq/∂Δq_j = ( grad_{Δq_j} Δq_1^T ; ... ; grad_{Δq_j} Δq_n^T )
          = ( ∂Δq/∂Δq_{j1}, ∂Δq/∂Δq_{j2}, ..., ∂Δq/∂Δq_{jn} )                    (5.38b)

∂Δq/∂Δq̇_j = ( grad_{Δq̇_j} Δq_1^T ; ... ; grad_{Δq̇_j} Δq_n^T )
          = ( ∂Δq/∂Δq̇_{j1}, ∂Δq/∂Δq̇_{j2}, ..., ∂Δq/∂Δq̇_{jn} ).                  (5.38c)

Because of

∂Δq̇/∂Δp_D = ∂/∂Δp_D ( ∂Δq/∂t ) = ∂/∂t ( ∂Δq/∂Δp_D )                             (5.38d)
∂Δq̇/∂Δq_j = ∂/∂Δq_j ( ∂Δq/∂t ) = ∂/∂t ( ∂Δq/∂Δq_j )                             (5.38e)
∂Δq̇/∂Δq̇_j = ∂/∂Δq̇_j ( ∂Δq/∂t ) = ∂/∂t ( ∂Δq/∂Δq̇_j ),                           (5.38f)

the derivatives ∂Δq̇/∂Δp_D, ∂Δq̇/∂Δq_j, ∂Δq̇/∂Δq̇_j follow, as shown below, together
with the derivatives defined by (5.38a–c).

5.5.1 Derivatives with Respect to Dynamic Parameters at Stage j

According to (5.38a) and (5.38d), for each vector function

t ↦ ∂Δq/∂Δp_{D_i}(t, Δp_D, Δq_j, Δq̇_j),   t ≥ t_j,   i = 1, ..., ν,
we obtain a system of differential equations by differentiation of the system (5.37a–c)
with respect to Δp_{D_i}, i = 1, ..., ν. From (5.37a) we first find

∂F/∂p_D ∂Δp_D/∂Δp_{D_i} + ∂F/∂q ∂Δq/∂Δp_{D_i} + ∂F/∂q̇ ∂Δq̇/∂Δp_{D_i} + ∂F/∂q̈ ∂Δq̈/∂Δp_{D_i}
  = ∂φ/∂Δq ∂Δq/∂Δp_{D_i} + ∂φ/∂Δq̇ ∂Δq̇/∂Δp_{D_i},   i = 1, ..., ν,               (5.39a)

where ∂Δp_D/∂Δp_{D_i} = e_i (= i-th unit vector). Differentiation of the initial
conditions (5.37b, c) with respect to Δp_{D_i} yields

∂Δq/∂Δp_{D_i}(t_j, Δp_D, Δq_j, Δq̇_j)  = 0,   i = 1, ..., ν,                      (5.39b)
∂Δq̇/∂Δp_{D_i}(t_j, Δp_D, Δq_j, Δq̇_j) = 0,   i = 1, ..., ν,                      (5.39c)

since the initial values Δq_j, Δq̇_j are independent of the dynamic parameters p_D,
Δp_D, respectively.

Then, the initial value problems for the sensitivities

t ↦ ∂Δq/∂Δp_{D_i}(t, 0),   t ↦ ∂Δq̇/∂Δp_{D_i}(t, 0) = d/dt ∂Δq/∂Δp_{D_i}(t, 0)

follow by insertion of Δa = ( Δp_D ; Δq_j ; Δq̇_j ) = 0 into (5.39a–c). Defining also
the expressions

∂F/∂p_D |_{(j)} := ∂F/∂p_D( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j,   (5.40a)
∂F/∂q  |_{(j)} := ∂F/∂q ( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j,   (5.40b)
∂F/∂q̇  |_{(j)} := ∂F/∂q̇ ( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j,   (5.40c)
∂F/∂q̈  |_{(j)} := ∂F/∂q̈ ( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j,   (5.40d)
for i = 1, ..., ν, the initial value problems read:

∂F/∂p_{D_i} |_{(j)} + ∂F/∂q |_{(j)} ∂Δq/∂Δp_{D_i}(t, 0) + ∂F/∂q̇ |_{(j)} ∂Δq̇/∂Δp_{D_i}(t, 0)
  + ∂F/∂q̈ |_{(j)} ∂Δq̈/∂Δp_{D_i}(t, 0)
  = ∂φ/∂Δq(t, 0) ∂Δq/∂Δp_{D_i}(t, 0) + ∂φ/∂Δq̇(t, 0) ∂Δq̇/∂Δp_{D_i}(t, 0),  t ≥ t_j   (5.41a)

∂Δq/∂Δp_{D_i}(t_j, 0)  = 0                                                        (5.41b)
∂Δq̇/∂Δp_{D_i}(t_j, 0) = 0.                                                        (5.41c)

Subsuming the above ν systems, we get a second order system of differential
equations for the functional matrix t ↦ ∂Δq/∂Δp_D(t, 0):

∂F/∂p_D |_{(j)} + ∂F/∂q |_{(j)} ∂Δq/∂Δp_D(t, 0) + ∂F/∂q̇ |_{(j)} ∂Δq̇/∂Δp_D(t, 0)
  + ∂F/∂q̈ |_{(j)} ∂Δq̈/∂Δp_D(t, 0)
  = ∂φ/∂Δq(t, 0) ∂Δq/∂Δp_D(t, 0) + ∂φ/∂Δq̇(t, 0) ∂Δq̇/∂Δp_D(t, 0),  t ≥ t_j        (5.42a)

∂Δq/∂Δp_D(t_j, 0)  = 0                                                            (5.42b)
∂Δq̇/∂Δp_D(t_j, 0) = 0.                                                            (5.42c)

Using the definition

η(t) := ∂Δq/∂Δp_{D_i}(t, 0),   t ≥ t_j,                                           (5.43)

with i = 1, ..., ν, the separated systems (5.41a–c) can be represented by:

( ∂F/∂q |_{(j)} − ∂φ/∂Δq(t, 0) ) η(t) + ( ∂F/∂q̇ |_{(j)} − ∂φ/∂Δq̇(t, 0) ) η̇(t)
  + ∂F/∂q̈ |_{(j)} η̈(t) = − ∂F/∂p_{D_i} |_{(j)},   t ≥ t_j                         (5.44a)

η(t_j)  = 0                                                                        (5.44b)
η̇(t_j) = 0.                                                                        (5.44c)
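Sensitivity IVPs of this kind can be validated against finite differences on a toy example. Here we take the scalar closed-loop error dynamics Δq̈ = −k_p Δq − k_d Δq̇ (a hypothetical one-degree-of-freedom case, not from the text) and compare the sensitivity with respect to the initial position against a central difference of two simulated trajectories; by linearity both must coincide:

```python
import numpy as np

k_p, k_d, dt, N = 4.0, 1.0, 1e-3, 5000
A = np.array([[0.0, 1.0], [-k_p, -k_d]])     # first-order form of the toy dynamics

def simulate(dq0):
    """RK4 integration of Dq'' = -k_p*Dq - k_d*Dq'; returns Dq at t = N*dt."""
    x = np.array([dq0, 0.0])
    for _ in range(N):
        s1 = A @ x
        s2 = A @ (x + 0.5 * dt * s1)
        s3 = A @ (x + 0.5 * dt * s2)
        s4 = A @ (x + dt * s3)
        x = x + (dt / 6.0) * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
    return x[0]

eps = 1e-6
fd_sens = (simulate(1.0 + eps) - simulate(1.0 - eps)) / (2.0 * eps)
zeta = simulate(1.0)   # by linearity, this IS the sensitivity w.r.t. the initial value
assert abs(fd_sens - zeta) < 1e-8
```

For a linear system the sensitivity IVP has the same coefficients as the dynamics itself, which is why a single simulation with unit initial value already delivers the sensitivity; in the nonlinear case the linear IVPs above have to be integrated separately.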
Obviously, the following property holds:

Lemma 5.2  The sensitivities η(t) = ∂Δq/∂Δp_{D_i}(t, 0), t ≥ t_j, with respect to the
deviations of the dynamic parameters Δp_{D_i}, i = 1, ..., ν, are determined by linear
initial value problems of second order.

The following abbreviations are often used in control theory:

M^{(j)}(t) := ∂F/∂q̈ |_{(j)} = ∂F/∂q̈( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j,   (5.45a)
D^{(j)}(t) := ∂F/∂q̇ |_{(j)} = ∂F/∂q̇( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j,   (5.45b)
K^{(j)}(t) := ∂F/∂q  |_{(j)} = ∂F/∂q ( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j,   (5.45c)
Y^{(j)}(t) := ∂F/∂p_D |_{(j)} = ∂F/∂p_D( p̄_D^{(j)}, q^{(j)}(t), q̇^{(j)}(t), q̈^{(j)}(t) ),  t ≥ t_j.  (5.45d)

Putting

Y^{(j)}(t) := ( y_1^{(j)}, y_2^{(j)}, ..., y_ν^{(j)} )                             (5.45e)

with the columns

y_i^{(j)}(t) := ∂F/∂p_{D_i} |_{(j)} = i-th column of Y^{(j)},                      (5.45f)

then (5.44a–c) can also be represented by

( K^{(j)}(t) − ∂φ/∂Δq(t, 0) ) η(t) + ( D^{(j)}(t) − ∂φ/∂Δq̇(t, 0) ) η̇(t) + M^{(j)}(t) η̈(t)
  = − y_i^{(j)}(t),   t ≥ t_j                                                      (5.46a)
η(t_j)  = 0                                                                        (5.46b)
η̇(t_j) = 0.                                                                        (5.46c)

In the present case the matrix M^{(j)}(t) is regular. Thus, the above system can also
be represented by the following initial value problem of 2nd order for η = η(t):

η̈(t) + ( M^{(j)}(t) )^{-1} ( D^{(j)}(t) − ∂φ/∂Δq̇(t, 0) ) η̇(t)
  + ( M^{(j)}(t) )^{-1} ( K^{(j)}(t) − ∂φ/∂Δq(t, 0) ) η(t)
  = − ( M^{(j)}(t) )^{-1} y_i^{(j)}(t),   t ≥ t_j                                  (5.47a)
η(t_j)  = 0                                                                        (5.47b)
η̇(t_j) = 0,                                                                        (5.47c)

for each i = 1, ..., ν.

5.5.2 Derivatives with Respect to the Initial Values at Stage j

Differentiation of the 2nd order system (5.37a–c) with respect to the initial values
Δq_{jk}, k = 1, ..., n, at stage j yields, cf. (5.39a–c),

∂F/∂q ∂Δq/∂Δq_{jk} + ∂F/∂q̇ ∂Δq̇/∂Δq_{jk} + ∂F/∂q̈ ∂Δq̈/∂Δq_{jk}
  = ∂φ/∂Δq ∂Δq/∂Δq_{jk} + ∂φ/∂Δq̇ ∂Δq̇/∂Δq_{jk},   k = 1, ..., n,                  (5.48a)

∂Δq/∂Δq_{jk}(t_j)  = e_k   (= k-th unit vector)                                    (5.48b)
∂Δq̇/∂Δq_{jk}(t_j) = 0.                                                             (5.48c)

Inserting again Δa = ( Δp_D ; Δq_j ; Δq̇_j ) = 0 in (5.48a–c), for

ζ(t) := ∂Δq/∂Δq_{jk}(t, 0),   t ≥ t_j,                                             (5.49)

with a fixed index k, 1 ≤ k ≤ n, cf. (5.43), we get the following linear initial value
problem of 2nd order:

( K^{(j)}(t) − ∂φ/∂Δq(t, 0) ) ζ(t) + ( D^{(j)}(t) − ∂φ/∂Δq̇(t, 0) ) ζ̇(t)
  + M^{(j)}(t) ζ̈(t) = 0,   t ≥ t_j                                                 (5.50a)
ζ(t_j)  = e_k                                                                       (5.50b)
ζ̇(t_j) = 0.                                                                         (5.50c)

In the same way, for the partial derivative

θ(t) := ∂Δq/∂Δq̇_{jk}(t, 0),   t ≥ t_j,                                             (5.51)
of Δq with respect to the deviation of the initial velocities Δq̇_{jk}, k = 1, ..., n, at
stage j we have the linear 2nd order initial value problem

( K^{(j)}(t) − ∂φ/∂Δq(t, 0) ) θ(t) + ( D^{(j)}(t) − ∂φ/∂Δq̇(t, 0) ) θ̇(t)
  + M^{(j)}(t) θ̈(t) = 0,   t ≥ t_j                                                 (5.52a)
θ(t_j)  = 0                                                                         (5.52b)
θ̇(t_j) = e_k,                                                                       (5.52c)

where e_k denotes the k-th unit vector.

Corresponding to (5.47a–c), the systems (5.50a–c) and (5.52a–c) can also be
represented as follows:

ζ̈(t) + ( M^{(j)}(t) )^{-1} ( D^{(j)}(t) − ∂φ/∂Δq̇(t, 0) ) ζ̇(t)
  + ( M^{(j)}(t) )^{-1} ( K^{(j)}(t) − ∂φ/∂Δq(t, 0) ) ζ(t) = 0,   t ≥ t_j           (5.53a)
ζ(t_j)  = e_k                                                                        (5.53b)
ζ̇(t_j) = 0,                                                                          (5.53c)

θ̈(t) + ( M^{(j)}(t) )^{-1} ( D^{(j)}(t) − ∂φ/∂Δq̇(t, 0) ) θ̇(t)
  + ( M^{(j)}(t) )^{-1} ( K^{(j)}(t) − ∂φ/∂Δq(t, 0) ) θ(t) = 0,   t ≥ t_j           (5.54a)
θ(t_j)  = 0                                                                          (5.54b)
θ̇(t_j) = e_k                                                                         (5.54c)

for each fixed k = 1, ..., n.

Definition 5.1  The 2nd order linear initial value problems (5.47a–c), (5.50a–c),
(5.52a–c) are also called perturbation equations.

The coefficients of the perturbation equations (5.47a–c), (5.50a–c) and (5.52a–c)
are often defined by the approach:

( M^{(j)}(t) )^{-1} ( K^{(j)}(t) − ∂φ/∂Δq(t, 0) )  = K_p                            (5.55a)
( M^{(j)}(t) )^{-1} ( D^{(j)}(t) − ∂φ/∂Δq̇(t, 0) ) = K_d                            (5.55b)
with certain fixed matrices (often diagonal matrices) still to be determined,

K_p = ( p_{kl} ),   K_d = ( d_{kl} ).                                               (5.55c)

For the functional matrices ∂φ/∂Δq, ∂φ/∂Δq̇ to be determined, the approach (5.55a–c)
yields the following representation:

∂φ/∂Δq(t, 0)  = K^{(j)}(t) − M^{(j)}(t) K_p,                                       (5.56a)
∂φ/∂Δq̇(t, 0) = D^{(j)}(t) − M^{(j)}(t) K_d.                                       (5.56b)

Using only the linear terms ∂φ/∂Δq(t, 0), ∂φ/∂Δq̇(t, 0) of the Taylor expansion
(5.27a, b) of the feedback function φ = φ(t, Δq, Δq̇), and using the definition
(5.55a–c), the optimal regulator problem (5.22a–c) is then reduced to the optimal
selection of the regulator parameters in the matrices K_p, K_d.

Defining the Jacobians ∂φ/∂Δq(t, 0), ∂φ/∂Δq̇(t, 0) by (5.56a, b), the 2nd order
linear initial value problems described above read as follows:

With an index i = 1, ..., ν, for η(t) = ∂Δq/∂Δp_{D_i}(t, 0), t ≥ t_j, according
to (5.47a–c) we have

η̈(t) + K_d η̇(t) + K_p η(t) = − ( M^{(j)}(t) )^{-1} y_i^{(j)}(t),   t ≥ t_j        (5.57a)
η(t_j)  = 0                                                                          (5.57b)
η̇(t_j) = 0.                                                                          (5.57c)

For ζ(t) = ∂Δq/∂Δq_{jk}(t, 0), t ≥ t_j, with an index k = 1, ..., n we get,
cf. (5.53a–c),

ζ̈(t) + K_d ζ̇(t) + K_p ζ(t) = 0,   t ≥ t_j                                          (5.58a)
ζ(t_j)  = e_k                                                                        (5.58b)
ζ̇(t_j) = 0.                                                                          (5.58c)

Finally, for θ(t) := ∂Δq/∂Δq̇_{jk}(t, 0), t ≥ t_j, with an index k = 1, ..., n we have,
see (5.54a–c),

θ̈(t) + K_d θ̇(t) + K_p θ(t) = 0,   t ≥ t_j                                          (5.59a)
θ(t_j)  = 0                                                                          (5.59b)
θ̇(t_j) = e_k.                                                                        (5.59c)
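The design step from (5.55a, b) to (5.56a, b) can be checked mechanically: if the feedback Jacobians are chosen as in (5.56a, b), then the coefficient matrices of the perturbation equations reduce to the constant gains K_p, K_d for every t. A small sketch with hypothetical time-varying M(t), D(t), K(t):

```python
import numpy as np

n = 2
Kp = np.diag([4.0, 9.0])
Kd = np.diag([2.0, 3.0])

def system_matrices(t):
    """Hypothetical time-varying M(t), D(t), K(t) of a linearized dynamic equation."""
    M = np.eye(n) + 0.1 * np.sin(t) * np.array([[0.0, 1.0], [1.0, 0.0]])  # invertible
    D = 0.5 * (1.0 + 0.2 * np.cos(t)) * np.eye(n)
    K = (2.0 + np.sin(t)) * np.eye(n)
    return M, D, K

for t in np.linspace(0.0, 5.0, 11):
    M, D, K = system_matrices(t)
    phi_q    = K - M @ Kp                            # (5.56a)
    phi_qdot = D - M @ Kd                            # (5.56b)
    Minv = np.linalg.inv(M)
    # with this choice the perturbation equations get constant coefficients:
    assert np.allclose(Minv @ (K - phi_q), Kp)       # (5.55a)
    assert np.allclose(Minv @ (D - phi_qdot), Kd)    # (5.55b)
```

The point of the construction is visible here: however M, D, K vary along the reference trajectory, the closed-loop perturbation dynamics are governed by the fixed design matrices K_p, K_d alone.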
5.5.3 Solution of the Perturbation Equation

For given fixed indices i = 1, ..., ν, k = 1, ..., n, we put

γ^{(j)}(t) :=  − ( M^{(j)}(t) )^{-1} y_i^{(j)}(t)   for (5.57a–c),
               0                                    for (5.58a–c),      t ≥ t_j,    (5.60a)
               0                                    for (5.59a–c),

and

dq_j  :=  0     for (5.57a–c),
          e_k   for (5.58a–c),                                                       (5.60b)
          0     for (5.59a–c),

dq'_j :=  0     for (5.57a–c),
          0     for (5.58a–c),                                                       (5.60c)
          e_k   for (5.59a–c).

Then the 2nd order linear initial value problems (5.57a–c), (5.58a–c), (5.59a–c),
resp., for the computation of the sensitivities

dq(t) :=  η(t)   for (5.57a–c),
          ζ(t)   for (5.58a–c),      t ≥ t_j,                                        (5.60d)
          θ(t)   for (5.59a–c),

can be represented uniformly as follows:

dq''(t) + K_d dq'(t) + K_p dq(t) = γ^{(j)}(t),   t ≥ t_j                             (5.61a)
dq(t_j)  = dq_j                                                                      (5.61b)
dq'(t_j) = dq'_j.                                                                    (5.61c)

Defining now

dz(t) := ( dq(t) ; dq'(t) ),   t ≥ t_j,                                              (5.62)

the 2nd order linear initial value problem (5.61a–c) can be transformed into an
equivalent linear 1st order initial value problem, see (1.2a, b), (1.3):

dz'(t) = A dz(t) + ( 0 ; γ^{(j)}(t) ),   t ≥ t_j                                     (5.63a)
dz(t_j) = dz_j.                                                                      (5.63b)
Here, the matrix A and the vector dz_j of initial values are defined as follows:

A := [  0     I   ;
       −K_p  −K_d ]                                                                  (5.63c)

dz_j := ( dq_j ; dq'_j ).                                                            (5.63d)

For the associated homogeneous linear 1st order system of differential equations

dz'(t) = A dz(t),   t ≥ t_j                                                          (5.64a)
dz(t_j) = dz_j,                                                                      (5.64b)

we have the following stability criterion:

Lemma 5.3 (Routh–Hurwitz Criterion)  The homogeneous linear system (5.64a, b)
is asymptotically stable for each initial state dz_j if the real parts of all eigenvalues
of A are negative. In this case we especially have

dz(t) → 0,   t → +∞,                                                                 (5.65)

for arbitrary dz_j.

According to Lemma 5.3, the definition (5.60d) of dq = dq(t), t ≥ t_j, and the
definitions (5.43), (5.49) and (5.51) of η = η(t), ζ = ζ(t), θ = θ(t), resp., for the
sensitivities we have the following asymptotic properties:

Corollary 5.5.1  Suppose that the derivatives (5.43), (5.49) and (5.51) are defined
for all t ≥ t_j. If the matrix A has only eigenvalues with negative real parts, then

lim_{t→+∞} ( ∂Δq/∂Δp_D(t, 0) )|_{homogeneous part of solution} = 0                   (5.66a)
lim_{t→+∞} ∂Δq/∂Δq_j(t, 0)  = 0                                                      (5.66b)
lim_{t→+∞} ∂Δq/∂Δq̇_j(t, 0) = 0.                                                      (5.66c)

Proof  Since the systems of differential equations for ∂Δq/∂Δq_j(t, 0) and
∂Δq/∂Δq̇_j(t, 0) are homogeneous, the assertion follows immediately from Lemma 5.3.
By means of the matrix exponential function e^{At} of A, the solution dz = dz(t),
t ≥ t_j, of (5.63a–d) can be represented explicitly as follows:

dz(t) = e^{A(t−t_j)} dz_j + ∫_{t_j}^{t} e^{A(t−τ)} ( 0 ; γ^{(j)}(τ) ) dτ,   t ≥ t_j.   (5.67)

Consequently, because of (5.60a–c), the sensitivities read:

( ∂Δq/∂Δp_{D_i} ; ∂Δq̇/∂Δp_{D_i} )(t, 0)
  = − ∫_{t_j}^{t} e^{A(t−τ)} ( 0 ; ( M^{(j)}(τ) )^{-1} y_i^{(j)}(τ) ) dτ,   t ≥ t_j,   (5.68a)

( ∂Δq/∂Δq_{jk} ; ∂Δq̇/∂Δq_{jk} )(t, 0)  = e^{A(t−t_j)} ( e_k ; 0 ),   t ≥ t_j,          (5.68b)

( ∂Δq/∂Δq̇_{jk} ; ∂Δq̇/∂Δq̇_{jk} )(t, 0) = e^{A(t−t_j)} ( 0 ; e_k ),   t ≥ t_j.          (5.68c)
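The representation (5.67), (5.68b, c) can be exercised numerically with a small matrix exponential; the gains below are illustrative, and e^{At} is computed via eigendecomposition since the chosen A has distinct eigenvalues:

```python
import numpy as np

n = 2
Kp = np.diag([4.0, 9.0])
Kd = np.diag([2.0, 3.0])
A = np.block([[np.zeros((n, n)), np.eye(n)],
              [-Kp, -Kd]])                    # system matrix (5.63c)

def mat_exp(M, t):
    """e^{Mt} via eigendecomposition (valid here: A has distinct eigenvalues)."""
    w, V = np.linalg.eig(M)
    return (V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V)).real

t, h = 1.3, 1e-6
# d/dt e^{At} = A e^{At}, checked with a central finite difference
deriv = (mat_exp(A, t + h) - mat_exp(A, t - h)) / (2.0 * h)
assert np.allclose(deriv, A @ mat_exp(A, t), atol=1e-5)

# (5.68b): the k-th initial-position sensitivity is the k-th column of e^{A(t - t_j)}
e1 = np.zeros(2 * n); e1[0] = 1.0
assert np.allclose(mat_exp(A, t) @ e1, mat_exp(A, t)[:, 0])

# asymptotic decay of the homogeneous solution, cf. Lemma 5.3
assert np.linalg.norm(mat_exp(A, 10.0)) < 1e-3
```

In the general case (5.68a) additionally requires a quadrature of the convolution integral, but the homogeneous sensitivities (5.68b, c) are literally columns of the matrix exponential.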

Selection of the Matrices K_p, K_d

An explicit solution of (5.61a–c), and therefore also of the 2nd order linear systems
of differential equations (5.57a–c), (5.58a–c) and (5.59a–c) as well as of the 1st
order system (5.63a–d), can be obtained under the assumption that K_p and K_d are
diagonal matrices:

K_p = ( p_l δ_{kl} ),   K_d = ( d_l δ_{kl} ).                                        (5.69)

Hence, p_l and d_l, l = 1, ..., n, resp., denote the diagonal elements of K_p, K_d. Under
this assumption the system (5.61a–c) is decomposed into n separated equations for
the components dq_k = dq_k(t), t ≥ t_j, k = 1, ..., n, of dq = dq(t), t ≥ t_j:

dq_k''(t) + d_k dq_k'(t) + p_k dq_k(t) = γ_k^{(j)}(t),   t ≥ t_j                     (5.70a)
dq_k(t_j)  = dq_{jk}                                                                 (5.70b)
dq_k'(t_j) = dq'_{jk}                                                                (5.70c)

with k = 1, ..., n. Obviously, (5.70a–c) are ordinary linear differential equations
of 2nd order with constant coefficients. Thus, up to possible integrals, the solution
can be given explicitly.

For this purpose we first consider the characteristic polynomial

p_k(λ) := λ² + d_k λ + p_k.                                                          (5.71)
The zeros of p_k = p_k(λ) read

λ_{k1,2} := − d_k/2 ± sqrt( (d_k/2)² − p_k )
          = (1/2) ( − d_k ± sqrt( d_k² − 4 p_k ) ),   k = 1, ..., n.                 (5.72)

As shown in the following, the roots λ_{k1,2}, k = 1, ..., n, are just the eigenvalues
of the system matrix A, provided that K_d, K_p are defined by (5.69):

If λ is an eigenvalue of A and w = ( u ; v ) an associated eigenvector, then by (5.63c)

λ ( u ; v ) = [ 0  I ; −K_p  −K_d ] ( u ; v ) = ( v ; −K_p u − K_d v ).              (5.73)

Thus, λu = v, and the second equality in (5.73) yields

λ² u = λ (λu) = λ v = − K_p u − K_d v = − K_p u − λ K_d u                            (5.74a)

and therefore

( λ² I + λ K_d + K_p ) u = 0.                                                        (5.74b)
Because of v = λu, each eigenvector w = ( u ; v ) of A related to the eigenvalue λ
of A has the form

w = ( u ; λu ).                                                                      (5.75a)

However, this means that

w ≠ 0  ⟺  u ≠ 0.                                                                     (5.75b)

Hence, Eq. (5.74b) holds if and only if

det( λ² I + λ K_d + K_p ) = 0.                                                       (5.76a)

An eigenvalue λ of A is therefore a root of the polynomial

q(λ) = det( λ² I + λ K_d + K_p )                                                     (5.76b)

with degree 2n. If the matrices K_p and K_d have the diagonal form (5.69), then
λ² I + λ K_d + K_p is also a diagonal matrix. Thus, we have

q(λ) = ∏_{k=1}^{n} ( λ² + d_k λ + p_k ).                                             (5.76c)

Obviously, the eigenvalues λ of A are the quantities λ_{k1,2}, k = 1, ..., n, defined
by (5.72).

According to Lemma 5.3 we now consider the real parts of the eigenvalues
λ_{k1,2}, k = 1, ..., n, of A, where we assume that

p_k > 0,   d_k > 0,   k = 1, ..., n.                                                 (5.77)

Case 1:   d_k² > 4 p_k                                                               (5.78a)

Here, the eigenvalues λ_{k1,2} are real valued, and because of

sqrt( d_k² − 4 p_k ) < d_k

we have

λ_{k1} = (1/2) ( − d_k + sqrt( d_k² − 4 p_k ) ) < 0,                                 (5.78b)
λ_{k2} = (1/2) ( − d_k − sqrt( d_k² − 4 p_k ) ) < 0.                                 (5.78c)

Thus, in this case both eigenvalues take negative real values.

Case 2:   d_k² = 4 p_k                                                               (5.79a)

In this case

λ_{k1,2} = − d_k/2                                                                   (5.79b)

is (at least) a double negative eigenvalue of A.

Case 3:   d_k² < 4 p_k                                                               (5.80a)
Here, we have

λ_{k1,2} = (1/2) ( − d_k ± sqrt( (−1)(4 p_k − d_k²) ) )
         = (1/2) ( − d_k ± i sqrt( 4 p_k − d_k² ) )
         = − d_k/2 ± i ω_k                                                           (5.80b)

with

ω_k := (1/2) sqrt( 4 p_k − d_k² ).                                                   (5.80c)

In the present case λ_{k1,2} are two different complex eigenvalues of A with negative
real part − d_k/2.

Consequently, we have the following result:

Lemma 5.4  If K_p, K_d are diagonal matrices having positive diagonal elements
p_k, d_k, k = 1, ..., n, then all eigenvalues λ_{k1,2}, k = 1, ..., n, of A have negative
real parts.

According to the Routh–Hurwitz criterion, under the assumption (5.77) of
positive diagonal elements, the system of differential equations (5.64a, b) is
asymptotically stable.

With respect to the eigenvalues λ_{k1,2} described in Cases 1–3 we have the
following contributions to the fundamental system of the differential equation
(5.70a, b):

Case 1:

e^{λ_{k1}(t−t_j)},   e^{λ_{k2}(t−t_j)},   t ≥ t_j,                                   (5.81a)

with the negative real eigenvalues λ_{k1,2}, see (5.78b, c);

Case 2:

e^{λ_{k1}(t−t_j)},   (t − t_j) e^{λ_{k1}(t−t_j)},   t ≥ t_j,                         (5.81b)

with λ_{k1} = − d_k/2, see (5.79b);

Case 3:

e^{−(d_k/2)(t−t_j)} cos ω_k(t−t_j),   e^{−(d_k/2)(t−t_j)} sin ω_k(t−t_j),   t ≥ t_j,   (5.81c)

with ω_k given by (5.80c).
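The identity between the spectrum of A and the roots of the factors in (5.76c), as well as Lemma 5.4, can be verified directly; the diagonal entries below are illustrative and chosen so that both a real pair (Case 1) and complex pairs (Case 3) occur:

```python
import numpy as np

n = 3
p = np.array([4.0, 9.0, 2.0])        # diagonal of K_p
d = np.array([2.0, 3.0, 4.0])        # diagonal of K_d; d[2]**2 > 4*p[2] gives Case 1
A = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.diag(p), -np.diag(d)]])

eig_A = np.sort_complex(np.linalg.eigvals(A))
roots = np.sort_complex(np.concatenate(
    [np.roots([1.0, d[k], p[k]]) for k in range(n)]))  # zeros of lambda^2 + d_k*lambda + p_k

assert np.allclose(eig_A, roots)     # (5.76c): spec(A) = union of the factor roots
assert np.all(eig_A.real < 0)        # Lemma 5.4: all real parts negative
```

This is also a convenient practical check when tuning K_p, K_d: the closed-loop poles per axis are read off directly from the scalar quadratics (5.71).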

Consequently, the solution of system (5.61a–c), (5.63a, b), resp., can be given
explicitly, possibly up to certain integrals occurring in the computation of a
particular solution of the inhomogeneous linear system of differential equations.

5.6 Computation of the Objective Function

For the optimal regulator problem under stochastic uncertainty (5.22a–c), we have
to determine approximately the conditional expectation

E( J^{(j)} | A_{t_j} ) = ∫_{t_j}^{t_f^{(j)}} E( f^{(j)}(t) | A_{t_j} ) dt
                       ≈ ∫_{t_j}^{t_f^{(j)}} E( f̃^{(j)}(t) | A_{t_j} ) dt,           (5.82a)

see (5.6a–d), (5.15), (5.26) and (5.28a). According to (5.36) we have

E( f̃^{(j)} | A_{t_j} ) = tr( P_{Δp_D} cov^{(j)}( p_D(·) ) ) + tr( P_{Δq_j} cov^{(j)}( Δq_j(·) ) )
  + tr( P_{Δq̇_j} cov^{(j)}( Δq̇_j(·) ) )
  + 2 tr( P_{Δq̇_j, Δq_j} cov^{(j)}( Δq_j(·), Δq̇_j(·) ) ).                            (5.82b)

Here, the P-matrices are defined, see (5.33b–e), by

P_{Δp_D} := ( ∂Δz/∂Δp_D(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δp_D(t, 0) ),                       (5.83a)
P_{Δq_j} := ( ∂Δz/∂Δq_j(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δq_j(t, 0) ),                       (5.83b)
P_{Δq̇_j} := ( ∂Δz/∂Δq̇_j(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δq̇_j(t, 0) ),                       (5.83c)
P_{Δq̇_j, Δq_j} := ( ∂Δz/∂Δq̇_j(t, 0) )^T Q(t, φ) ( ∂Δz/∂Δq_j(t, 0) ),                 (5.83d)
where, see (5.30c),

Q(t, φ) = [ C_q + ∂φ/∂Δq(t,0)^T C_u ∂φ/∂Δq(t,0)     ∂φ/∂Δq(t,0)^T C_u ∂φ/∂Δq̇(t,0) ;
            ∂φ/∂Δq̇(t,0)^T C_u ∂φ/∂Δq(t,0)           C_q̇ + ∂φ/∂Δq̇(t,0)^T C_u ∂φ/∂Δq̇(t,0) ]   (5.84)

with, see (5.56a, b) and (5.45a–c),

∂φ/∂Δq(t, 0)  = K^{(j)}(t) − M^{(j)}(t) K_p,                                         (5.85a)
∂φ/∂Δq̇(t, 0) = D^{(j)}(t) − M^{(j)}(t) K_d.                                         (5.85b)

By (5.31) and (5.38a–f) we further have

∂Δz/∂Δp_D(t, 0) = ( ∂Δq/∂Δp_D(t, 0) ; ∂Δq̇/∂Δp_D(t, 0) )
  = [ ∂Δq/∂Δp_{D_1}(t, 0)   ∂Δq/∂Δp_{D_2}(t, 0)   ...   ∂Δq/∂Δp_{D_ν}(t, 0) ;
      ∂Δq̇/∂Δp_{D_1}(t, 0)   ∂Δq̇/∂Δp_{D_2}(t, 0)   ...   ∂Δq̇/∂Δp_{D_ν}(t, 0) ],     (5.86a)

∂Δz/∂Δq_j(t, 0) = ( ∂Δq/∂Δq_j(t, 0) ; ∂Δq̇/∂Δq_j(t, 0) )
  = [ ∂Δq/∂Δq_{j1}(t, 0)   ∂Δq/∂Δq_{j2}(t, 0)   ...   ∂Δq/∂Δq_{jn}(t, 0) ;
      ∂Δq̇/∂Δq_{j1}(t, 0)   ∂Δq̇/∂Δq_{j2}(t, 0)   ...   ∂Δq̇/∂Δq_{jn}(t, 0) ],         (5.86b)

∂Δz/∂Δq̇_j(t, 0) = ( ∂Δq/∂Δq̇_j(t, 0) ; ∂Δq̇/∂Δq̇_j(t, 0) )
  = [ ∂Δq/∂Δq̇_{j1}(t, 0)   ∂Δq/∂Δq̇_{j2}(t, 0)   ...   ∂Δq/∂Δq̇_{jn}(t, 0) ;
      ∂Δq̇/∂Δq̇_{j1}(t, 0)   ∂Δq̇/∂Δq̇_{j2}(t, 0)   ...   ∂Δq̇/∂Δq̇_{jn}(t, 0) ].        (5.86c)

Using (5.85a, b), according to (5.42a–c) and (5.45a–d), the functional matrix

t ↦ ∂Δq/∂Δp_D(t, 0),   t ≥ t_j,
satisfies the 2nd order linear initial value problem

∂Δq̈/∂Δp_D(t, 0) + K_d ∂Δq̇/∂Δp_D(t, 0) + K_p ∂Δq/∂Δp_D(t, 0)
  = − ( M^{(j)}(t) )^{-1} Y^{(j)}(t),   t ≥ t_j                                      (5.87a)
∂Δq/∂Δp_D(t_j, 0)  = 0                                                               (5.87b)
∂Δq̇/∂Δp_D(t_j, 0) = 0.                                                               (5.87c)

Moreover, using (5.85a, b), by (5.48a–c), (5.49) and (5.50a–c) the functional matrix

t ↦ ∂Δq/∂Δq_j(t, 0),   t ≥ t_j,

satisfies the 2nd order linear initial value problem

∂Δq̈/∂Δq_j(t, 0) + K_d ∂Δq̇/∂Δq_j(t, 0) + K_p ∂Δq/∂Δq_j(t, 0) = 0,   t ≥ t_j          (5.88a)
∂Δq/∂Δq_j(t_j, 0)  = I                                                                (5.88b)
∂Δq̇/∂Δq_j(t_j, 0) = 0.                                                                (5.88c)

Finally, using (5.85a, b), according to (5.51) and (5.52a–c) the functional matrix

t ↦ ∂Δq/∂Δq̇_j(t, 0),   t ≥ t_j,

is the solution of the 2nd order linear initial value problem

∂Δq̈/∂Δq̇_j(t, 0) + K_d ∂Δq̇/∂Δq̇_j(t, 0) + K_p ∂Δq/∂Δq̇_j(t, 0) = 0,   t ≥ t_j         (5.89a)
∂Δq/∂Δq̇_j(t_j, 0)  = 0                                                                (5.89b)
∂Δq̇/∂Δq̇_j(t_j, 0) = I.                                                                (5.89c)
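Given the sensitivity blocks, the trace representation (5.82b), (5.83a–d) is straightforward to evaluate. The sketch below uses random placeholder sensitivities and covariances (illustrative assumptions only) and cross-checks the trace formula against a Monte Carlo estimate of E(Δz^T Q Δz):

```python
import numpy as np

rng = np.random.default_rng(3)
n, nu = 2, 2

# Placeholder sensitivity blocks (2n x .) at one fixed time t, and Q = identity
Z_p  = rng.standard_normal((2 * n, nu))    # stands for dDz/dDp_D(t,0)
Z_q  = rng.standard_normal((2 * n, n))     # stands for dDz/dDq_j(t,0)
Z_qd = rng.standard_normal((2 * n, n))     # stands for dDz/dDqdot_j(t,0)
Q = np.eye(2 * n)

def cov(m):
    B = rng.standard_normal((m, m))
    return B @ B.T + m * np.eye(m)         # positive definite covariance

Cov_p, Cov_q, Cov_qd = cov(nu), cov(n), cov(n)
Cov_cross = 0.1 * rng.standard_normal((n, n))   # cov(Dq_j, Dqdot_j), kept small

# P-matrices (5.83a-d) and the trace formula (5.82b)
P_p, P_q, P_qd = (Z.T @ Q @ Z for Z in (Z_p, Z_q, Z_qd))
P_qd_q = Z_qd.T @ Q @ Z_q
val = (np.trace(P_p @ Cov_p) + np.trace(P_q @ Cov_q)
       + np.trace(P_qd @ Cov_qd) + 2.0 * np.trace(P_qd_q @ Cov_cross))

# Monte Carlo cross-check of E[Dz^T Q Dz] under the linearization (5.31)
S = 200_000
dp = rng.multivariate_normal(np.zeros(nu), Cov_p, size=S)
big = np.block([[Cov_q, Cov_cross], [Cov_cross.T, Cov_qd]])
qq = rng.multivariate_normal(np.zeros(2 * n), big, size=S)
dz = dp @ Z_p.T + qq[:, :n] @ Z_q.T + qq[:, n:] @ Z_qd.T
mc = np.mean(np.einsum('ij,jk,ik->i', dz, Q, dz))
assert abs(mc - val) / abs(val) < 0.05
```

In an actual application the blocks Z would come from the solutions of (5.87)–(5.89), and the resulting scalar is integrated over [t_j, t_f^{(j)}] as in (5.82a).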
5.7 Optimal PID-Regulator

For simplification of the notation, in the following we consider only the 0-th stage
j := 0. Moreover, we suppose again that the state z(t) = ( q(t), q̇(t) ) is available
exactly. Corresponding to Sect. 5.1, the tracking error, i.e., the deviation of the
actual position q = q(t) and velocity q̇ = q̇(t) from the reference position
q^{(0)} = q^{(0)}(t) and reference velocity q̇^{(0)} = q̇^{(0)}(t), t_0 ≤ t ≤ t_f, is defined by

Δq(t)  := q(t) − q^{(0)}(t),    t_0 ≤ t ≤ t_f,                                       (5.90a)
Δq̇(t) := q̇(t) − q̇^{(0)}(t),   t_0 ≤ t ≤ t_f.                                       (5.90b)

Hence, for the difference between the actual state z = z(t) and the reference state
z^{(0)} = z^{(0)}(t) we have

Δz(t) := z(t) − z^{(0)}(t) = ( q(t) − q^{(0)}(t) ; q̇(t) − q̇^{(0)}(t) ),  t_0 ≤ t ≤ t_f.   (5.91)

As in Sect. 5.1, the actual control input u = u(t), t_0 ≤ t ≤ t_f, is defined by

u(t) := u^{(0)}(t) + Δu(t),   t_0 ≤ t ≤ t_f.                                         (5.92)

Here u^{(0)} = u^{(0)}(t), t_0 ≤ t ≤ t_f, denotes the feedforward control based on the
reference trajectory z^{(0)} = z^{(0)}(t), t_0 ≤ t ≤ t_f, obtained e.g. by inverse dynamics,
cf. [4, 11, 61], and Δu = Δu(t), t_0 ≤ t ≤ t_f, is the optimal control correction to be
determined.

Generalizing the PD-regulator, which depends on the state z(t) = ( q(t), q̇(t) ), for
the PID-regulator, cf. Sect. 4.7, also the integrated position

q_I(t) := ∫_{t_0}^{t} q(τ) dτ,   t_0 ≤ t ≤ t_f,                                      (5.93)

is taken into account. Hence, the PID-regulator depends also on the integrated
position error

Δq_I(t) := ∫_{t_0}^{t} Δq(τ) dτ = ∫_{t_0}^{t} ( q(τ) − q^{(0)}(τ) ) dτ
         = q_I(t) − q_I^{(0)}(t).                                                    (5.94)
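In a sampled-data implementation the integral error (5.94) is accumulated numerically; a minimal sketch with a linear PID correction, where the gains and the example error signal are illustrative assumptions, not values from the text:

```python
import numpy as np

# Sampled tracking error Dq(t) and its derivative (example signals)
t = np.linspace(0.0, np.pi, 2001)
dq  = np.sin(t)
dqd = np.cos(t)

# Integral error (5.94), accumulated by the trapezoid rule
dt = t[1] - t[0]
dq_I = np.concatenate(([0.0], np.cumsum(0.5 * dt * (dq[1:] + dq[:-1]))))

# Linear PID control correction, cf. (5.95) with a linear feedback law
Kp, Ki, Kd = 4.0, 1.0, 2.0
du = -(Kp * dq + Ki * dq_I + Kd * dqd)

assert abs(dq_I[-1] - 2.0) < 1e-5     # integral of sin over [0, pi] is 2
assert du.shape == t.shape
```

The added integral channel is what distinguishes the PID-regulator from the PD-regulator of the previous sections: a persistent offset in Δq accumulates in Δq_I and is therefore eventually counteracted.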
Thus, working with the PID-regulator, the control correction is defined by

Δu(t) := φ( t, Δq(t), Δq_I(t), Δq̇(t) ),   t_0 ≤ t ≤ t_f.                             (5.95)

Without further global information on the stochastic parameters, also in the present
case we suppose for the feedback law φ that

φ(t, 0, 0, 0) = 0,   t_0 ≤ t ≤ t_f.                                                   (5.96)

Corresponding to the optimization problem (5.7a–c) for PD-regulators described
in Sect. 5.1.1, for PID-regulators φ* = φ*(t, Δq, Δq_I, Δq̇) we have the following
optimization problem:

Definition 5.2  A stochastic optimal feedback law φ* = φ*( t, Δq(t), Δq_I(t), Δq̇(t) )
is an optimal solution of

min E( ∫_{t_0}^{t_f} ( c( t, Δq(t), Δq_I(t), Δq̇(t) )
       + γ( t, φ( t, Δq(t), Δq_I(t), Δq̇(t) ) ) ) dt | A_{t_0} )                       (5.97a)

subject to

F( p_D(ω), q(t), q̇(t), q̈(t) )
  = u^{(0)}(t) + φ( t, Δq(t), Δq_I(t), Δq̇(t) ),   t_0 ≤ t ≤ t_f,   a.s.,              (5.97b)
q(t_0, ω)  = q_0(ω),                                                                   (5.97c)
q̇(t_0, ω) = q̇_0(ω),                                                                   (5.97d)
φ(t, 0) = 0.                                                                            (5.97e)

Here,

F( p_D(ω), q(t), q̇(t), q̈(t) ) := M( p_D, q(t) ) q̈(t) + h( p_D, q(t), q̇(t) )           (5.98)

is again the left hand side of the dynamic equation as in (4.4a).

Corresponding to Sect. 5.1.1, the term c( t, Δq(t), Δq_I(t), Δq̇(t) ) in the objective
function (5.97a) describes the costs resulting from the deviation of the actual
trajectory from the reference trajectory, and the costs for the control corrections are
represented by γ( t, φ( t, Δq(t), Δq_I(t), Δq̇(t) ) ). Moreover, cf. (5.3f),

Δp_D(ω) := p_D(ω) − p̄_D                                                               (5.99)

denotes again the deviation of the effective dynamic parameter vector p_D = p_D(ω)
from its nominal value or expectation p̄_D. In the present section, p̄_D :=
E( p_D(ω) | A_{t_0} ) denotes the (conditional) expectation of the random vector
p_D = p_D(ω).

Having the stochastic optimal feedback control φ* = φ*(t, Δq, Δq_I, Δq̇), the
effective trajectory q = q(t), t ≥ t_0, of the stochastic optimally controlled dynamic
system is then given by the solution of the initial value problem (5.97b–d). Using
the definitions, cf. (4.45c, d), (5.3d, e),

Δq_0  := q(t_0) − q^{(0)}(t_0)  = q_0 − E( q_0 | A_{t_0} ),                            (5.100a)
Δq̇_0 := q̇(t_0) − q̇^{(0)}(t_0) = q̇_0 − E( q̇_0 | A_{t_0} ),                            (5.100b)

the constraints of the regulator optimization problem (5.97a–e) can also be
represented by

F( p̄_D + Δp_D(ω), q^{(0)}(t) + Δq(t), q̇^{(0)}(t) + Δq̇(t), q̈^{(0)}(t) + Δq̈(t) )
  = u^{(0)}(t) + φ( t, Δq(t), Δq_I(t), Δq̇(t) ),   t_0 ≤ t ≤ t_f,   a.s.,               (5.101a)
Δq(t_0, ω)  = Δq_0(ω),                                                                  (5.101b)
Δq̇(t_0, ω) = Δq̇_0(ω),                                                                  (5.101c)
φ(t, 0) = 0.                                                                             (5.101d)

Remark 5.2  For a given control law φ = φ( t, Δq(t), Δq_I(t), Δq̇(t) ), the solution of
the initial value problem (5.101a–d) yields the position error function

Δq := Δq( t, a(ω) ),   t_0 ≤ t ≤ t_f,                                                   (5.102a)

depending on the random parameter vector

a = a(ω) := ( Δp_D(ω) ; Δq_0(ω) ; Δq̇_0(ω) ).                                            (5.102b)

5.7.1 Quadratic Cost Functions

Corresponding to Sect. 5.3.2, here we again use quadratic cost functions. Hence,
with symmetric, positive (semi-)definite matrices C_q, C_{q_I}, C_q̇ the costs c arising
from the tracking error are defined by

c( t, Δq(t), Δq_I(t), Δq̇(t) )
  := Δq(t)^T C_q Δq(t) + Δq_I(t)^T C_{q_I} Δq_I(t) + Δq̇(t)^T C_q̇ Δq̇(t).                 (5.103a)
236 5 Optimal Design of Regulators

Moreover, with a positive (semi-)definite matrix Cu the regulator costs  are defined
by
 
P
 t; '.t;
q.t/;
qI .t/;
q.t/
 T  
P
WD ' t;
q.t/;
qI .t/;
q.t/ P
Cu ' t;
q.t/;
qI .t/;
q.t/ : (5.103b)

Corresponding to Sect. 5.4.1 the total cost function is defined by

f .t/ WD
q.t/T Cq
q.t/ C
qI .t/T CqI
qI .t/
C
q.t/
P T CqP
q.t/
P C
u.t/T Cu
u.t/: (5.104)

Thus, for the objective function of the regulator optimization problem (5.97a-e) we obtain

    E\Big( \int_{t_0}^{t_f} f(t)\,dt \,\Big|\, \mathcal{A}_{t_0} \Big) = \int_{t_0}^{t_f} E\big( f(t) \mid \mathcal{A}_{t_0} \big)\,dt
        = \int_{t_0}^{t_f} E\Big( \Delta q(t)^T C_q \Delta q(t) + \Delta q_I(t)^T C_{q_I} \Delta q_I(t) + \Delta\dot q(t)^T C_{\dot q} \Delta\dot q(t)
        + \varphi(t, \Delta q(t), \Delta q_I(t), \Delta\dot q(t))^T C_u\, \varphi(t, \Delta q(t), \Delta q_I(t), \Delta\dot q(t)) \,\Big|\, \mathcal{A}_{t_0} \Big)\,dt.    (5.105)

Computation of the Expectation by Taylor Expansion

Corresponding to Sect. 5.4, we also use the Taylor expansion method here to compute the arising conditional expectations approximately. Thus, by Taylor expansion of the feedback function \varphi at \Delta q = 0, \Delta q_I = 0, \Delta\dot q = 0, we get

    \varphi(t, \Delta q(t), \Delta q_I(t), \Delta\dot q(t)) = \varphi(t, 0) + D_{\Delta q}\varphi(t, 0)\,\Delta q(t)
        + D_{\Delta q_I}\varphi(t, 0)\,\Delta q_I(t) + D_{\Delta\dot q}\varphi(t, 0)\,\Delta\dot q(t) + \dots    (5.106)

Here,

    D_{\Delta q}\varphi = D_{\Delta q}\varphi(t, 0) = \frac{\partial\varphi}{\partial\Delta q}(t, 0)

denotes the Jacobian, i.e., the matrix of partial derivatives of \varphi with respect to \Delta q at (t, 0).
Taking into account only the linear terms in (5.106), because of \varphi(t, 0) = 0 we get the approximation

    E(f \mid \mathcal{A}_{t_0}) \approx E(\tilde f \mid \mathcal{A}_{t_0}),    (5.107)

where

    E(\tilde f \mid \mathcal{A}_{t_0}) := E\Big( \Delta q(t)^T C_q \Delta q(t) + \Delta q_I(t)^T C_{q_I} \Delta q_I(t) + \Delta\dot q(t)^T C_{\dot q} \Delta\dot q(t)
        + \big( D_{\Delta q}\varphi(t,0)\Delta q(t) + D_{\Delta q_I}\varphi(t,0)\Delta q_I(t) + D_{\Delta\dot q}\varphi(t,0)\Delta\dot q(t) \big)^T
          C_u \big( D_{\Delta q}\varphi(t,0)\Delta q(t) + D_{\Delta q_I}\varphi(t,0)\Delta q_I(t) + D_{\Delta\dot q}\varphi(t,0)\Delta\dot q(t) \big) \,\Big|\, \mathcal{A}_{t_0} \Big).

Multiplying out the quadratic form and collecting the symmetric cross terms yields

    E(\tilde f \mid \mathcal{A}_{t_0}) = E\Big( \Delta q(t)^T \big( C_q + D_{\Delta q}\varphi(t,0)^T C_u D_{\Delta q}\varphi(t,0) \big) \Delta q(t)
        + \Delta q_I(t)^T \big( C_{q_I} + D_{\Delta q_I}\varphi(t,0)^T C_u D_{\Delta q_I}\varphi(t,0) \big) \Delta q_I(t)
        + \Delta\dot q(t)^T \big( C_{\dot q} + D_{\Delta\dot q}\varphi(t,0)^T C_u D_{\Delta\dot q}\varphi(t,0) \big) \Delta\dot q(t)
        + 2\,\Delta q(t)^T D_{\Delta q}\varphi(t,0)^T C_u D_{\Delta q_I}\varphi(t,0)\,\Delta q_I(t)
        + 2\,\Delta q(t)^T D_{\Delta q}\varphi(t,0)^T C_u D_{\Delta\dot q}\varphi(t,0)\,\Delta\dot q(t)
        + 2\,\Delta q_I(t)^T D_{\Delta q_I}\varphi(t,0)^T C_u D_{\Delta\dot q}\varphi(t,0)\,\Delta\dot q(t) \,\Big|\, \mathcal{A}_{t_0} \Big).    (5.108)

In order to compute the above expectation, we introduce the following definitions:

    Q_q := C_q + D_{\Delta q}\varphi(t,0)^T C_u D_{\Delta q}\varphi(t,0),    (5.109a)
    Q_{q_I} := C_{q_I} + D_{\Delta q_I}\varphi(t,0)^T C_u D_{\Delta q_I}\varphi(t,0),    (5.109b)
    Q_{\dot q} := C_{\dot q} + D_{\Delta\dot q}\varphi(t,0)^T C_u D_{\Delta\dot q}\varphi(t,0),    (5.109c)
    Q_{q q_I} := D_{\Delta q}\varphi(t,0)^T C_u D_{\Delta q_I}\varphi(t,0),    (5.109d)
    Q_{q\dot q} := D_{\Delta q}\varphi(t,0)^T C_u D_{\Delta\dot q}\varphi(t,0),    (5.109e)
    Q_{q_I\dot q} := D_{\Delta q_I}\varphi(t,0)^T C_u D_{\Delta\dot q}\varphi(t,0).    (5.109f)

This then yields for \tilde f the representation

    \tilde f = \Delta q(t)^T Q_q \Delta q(t) + \Delta q_I(t)^T Q_{q_I} \Delta q_I(t) + \Delta\dot q(t)^T Q_{\dot q} \Delta\dot q(t)
        + 2\,\Delta q(t)^T Q_{q q_I} \Delta q_I(t) + 2\,\Delta q(t)^T Q_{q\dot q} \Delta\dot q(t) + 2\,\Delta q_I(t)^T Q_{q_I\dot q} \Delta\dot q(t)
      = \big( \Delta q(t)^T, \Delta q_I(t)^T, \Delta\dot q(t)^T \big)
        \begin{pmatrix} Q_q & Q_{q q_I} & Q_{q\dot q} \\ Q_{q q_I}^T & Q_{q_I} & Q_{q_I\dot q} \\ Q_{q\dot q}^T & Q_{q_I\dot q}^T & Q_{\dot q} \end{pmatrix}
        \begin{pmatrix} \Delta q(t) \\ \Delta q_I(t) \\ \Delta\dot q(t) \end{pmatrix}
      = \Delta z_I(t)^T Q(t, \varphi)\, \Delta z_I(t),    (5.110)
where the extended state vector \Delta z_I and the matrix Q are defined by

    \Delta z_I(t) := \big( \Delta q(t)^T, \Delta q_I(t)^T, \Delta\dot q(t)^T \big)^T,    (5.111a)
    Q(t, \varphi) := \begin{pmatrix} Q_q & Q_{q q_I} & Q_{q\dot q} \\ Q_{q q_I}^T & Q_{q_I} & Q_{q_I\dot q} \\ Q_{q\dot q}^T & Q_{q_I\dot q}^T & Q_{\dot q} \end{pmatrix}.    (5.111b)

Since C_q, C_{q_I}, C_{\dot q} and C_u are positive (semi-)definite matrices, we have \tilde f \ge 0.

In order to determine the expectation of \tilde f, in addition to the Taylor approximation of the feedback law \varphi = \varphi(t, \Delta q(t), \Delta q_I(t), \Delta\dot q(t)), we also need the Taylor expansion of \Delta z_I with respect to the parameter vector a at

    a = \big( \Delta p_D(\omega)^T, \Delta q_0(\omega)^T, \Delta\dot q_0(\omega)^T \big)^T = 0.    (5.112)

Hence, we get

    \Delta q(t) = \Delta q(t, 0) + D_{\Delta p_D}\Delta q(t, 0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta q(t, 0)\,\Delta q_0(\omega)
        + D_{\Delta\dot q_0}\Delta q(t, 0)\,\Delta\dot q_0(\omega) + \dots,    (5.113a)
    \Delta\dot q(t) = \Delta\dot q(t, 0) + D_{\Delta p_D}\Delta\dot q(t, 0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta\dot q(t, 0)\,\Delta q_0(\omega)
        + D_{\Delta\dot q_0}\Delta\dot q(t, 0)\,\Delta\dot q_0(\omega) + \dots,    (5.113b)

as well as

    \Delta q_I(t) = \int_{t_0}^{t} \Delta q(\tau)\,d\tau = \int_{t_0}^{t} \Big( \Delta q(\tau, 0) + D_{\Delta p_D}\Delta q(\tau, 0)\,\Delta p_D(\omega)
        + D_{\Delta q_0}\Delta q(\tau, 0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta q(\tau, 0)\,\Delta\dot q_0(\omega) + \dots \Big)\,d\tau.    (5.113c)

Using only the linear terms, and taking into account the conditions (5.101a-d),

    \Delta q(t, 0) = 0,  t \ge t_0,    (5.114a)
    \Delta\dot q(t, 0) = 0,  t \ge t_0,    (5.114b)

(5.113a, b) can be approximated by

    \Delta q(t) \approx D_{\Delta p_D}\Delta q(t, 0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta q(t, 0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta q(t, 0)\,\Delta\dot q_0(\omega),    (5.115a)
    \Delta\dot q(t) \approx D_{\Delta p_D}\Delta\dot q(t, 0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta\dot q(t, 0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta\dot q(t, 0)\,\Delta\dot q_0(\omega).    (5.115b)

For (5.113c) this then yields

    \Delta q_I(t) = \int_{t_0}^{t} \Delta q(\tau)\,d\tau
        \approx \int_{t_0}^{t} \Big( D_{\Delta p_D}\Delta q(\tau, 0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta q(\tau, 0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta q(\tau, 0)\,\Delta\dot q_0(\omega) \Big)\,d\tau
        = D_{\Delta p_D}\Delta q_I(t, 0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta q_I(t, 0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta q_I(t, 0)\,\Delta\dot q_0(\omega),    (5.115c)

where in the last step the differentiation with respect to the parameters and the integration over \tau are interchanged, so that e.g. \int_{t_0}^{t} D_{\Delta p_D}\Delta q(\tau, 0)\,d\tau = D_{\Delta p_D}\Delta q_I(t, 0).

Consequently, we find

    \Delta z_I(t) = \begin{pmatrix} \Delta q(t) \\ \Delta q_I(t) \\ \Delta\dot q(t) \end{pmatrix}
        \approx \begin{pmatrix} D_{\Delta p_D}\Delta q(t,0) \\ D_{\Delta p_D}\Delta q_I(t,0) \\ D_{\Delta p_D}\Delta\dot q(t,0) \end{pmatrix} \Delta p_D(\omega)
        + \begin{pmatrix} D_{\Delta q_0}\Delta q(t,0) \\ D_{\Delta q_0}\Delta q_I(t,0) \\ D_{\Delta q_0}\Delta\dot q(t,0) \end{pmatrix} \Delta q_0(\omega)
        + \begin{pmatrix} D_{\Delta\dot q_0}\Delta q(t,0) \\ D_{\Delta\dot q_0}\Delta q_I(t,0) \\ D_{\Delta\dot q_0}\Delta\dot q(t,0) \end{pmatrix} \Delta\dot q_0(\omega)
        = D_{\Delta p_D}\Delta z_I(t,0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta z_I(t,0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta z_I(t,0)\,\Delta\dot q_0(\omega).    (5.116)

Inserting this into equation (5.110), we get the approximation

    \tilde f = \Delta z_I(t)^T Q\, \Delta z_I(t)
        \approx \big( D_{\Delta p_D}\Delta z_I(t,0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta z_I(t,0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta z_I(t,0)\,\Delta\dot q_0(\omega) \big)^T
          Q\, \big( D_{\Delta p_D}\Delta z_I(t,0)\,\Delta p_D(\omega) + D_{\Delta q_0}\Delta z_I(t,0)\,\Delta q_0(\omega) + D_{\Delta\dot q_0}\Delta z_I(t,0)\,\Delta\dot q_0(\omega) \big)
        = \Delta p_D(\omega)^T \big( D_{\Delta p_D}\Delta z_I(t,0) \big)^T Q\, D_{\Delta p_D}\Delta z_I(t,0)\,\Delta p_D(\omega)
        + \Delta q_0(\omega)^T \big( D_{\Delta q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta q_0}\Delta z_I(t,0)\,\Delta q_0(\omega)
        + \Delta\dot q_0(\omega)^T \big( D_{\Delta\dot q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta\dot q_0}\Delta z_I(t,0)\,\Delta\dot q_0(\omega)
        + 2\,\Delta q_0(\omega)^T \big( D_{\Delta q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta p_D}\Delta z_I(t,0)\,\Delta p_D(\omega)
        + 2\,\Delta\dot q_0(\omega)^T \big( D_{\Delta\dot q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta q_0}\Delta z_I(t,0)\,\Delta q_0(\omega)
        + 2\,\Delta p_D(\omega)^T \big( D_{\Delta p_D}\Delta z_I(t,0) \big)^T Q\, D_{\Delta\dot q_0}\Delta z_I(t,0)\,\Delta\dot q_0(\omega).    (5.117)

Assumption 1 We may suppose that the random vectors \Delta q_0(\omega), \Delta p_D(\omega), as well as \Delta p_D(\omega), \Delta\dot q_0(\omega), are stochastically independent.

Since

    E\big( \Delta p_D(\omega) \mid \mathcal{A}_{t_0} \big) = \overline{\Delta p}_D = 0,    (5.118)

cf. (5.99), we have

    E\Big( \Delta q_0(\omega)^T \big( D_{\Delta q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta p_D}\Delta z_I(t,0)\,\Delta p_D(\omega) \Big) = 0,    (5.119a)
    E\Big( \Delta p_D(\omega)^T \big( D_{\Delta p_D}\Delta z_I(t,0) \big)^T Q\, D_{\Delta\dot q_0}\Delta z_I(t,0)\,\Delta\dot q_0(\omega) \Big) = 0.    (5.119b)

Defining the deterministic matrices

    P_{\Delta p_D} := \big( D_{\Delta p_D}\Delta z_I(t,0) \big)^T Q\, D_{\Delta p_D}\Delta z_I(t,0),    (5.120a)
    P_{\Delta q_0} := \big( D_{\Delta q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta q_0}\Delta z_I(t,0),    (5.120b)
    P_{\Delta\dot q_0} := \big( D_{\Delta\dot q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta\dot q_0}\Delta z_I(t,0),    (5.120c)
    P_{\Delta\dot q_0, \Delta q_0} := \big( D_{\Delta\dot q_0}\Delta z_I(t,0) \big)^T Q\, D_{\Delta q_0}\Delta z_I(t,0),    (5.120d)

for the conditional expectation of \tilde f we obtain the approximation

    E(\tilde f \mid \mathcal{A}_{t_0}) \approx E(\hat f \mid \mathcal{A}_{t_0}),    (5.121a)

where

    E(\hat f \mid \mathcal{A}_{t_0}) := E\big( \Delta p_D(\omega)^T P_{\Delta p_D} \Delta p_D(\omega) \mid \mathcal{A}_{t_0} \big)
        + E\big( \Delta q_0(\omega)^T P_{\Delta q_0} \Delta q_0(\omega) \mid \mathcal{A}_{t_0} \big)
        + E\big( \Delta\dot q_0(\omega)^T P_{\Delta\dot q_0} \Delta\dot q_0(\omega) \mid \mathcal{A}_{t_0} \big)
        + E\big( \Delta\dot q_0(\omega)^T P_{\Delta\dot q_0, \Delta q_0} \Delta q_0(\omega) \mid \mathcal{A}_{t_0} \big).    (5.121b)

Thus, we find

    E\big( \Delta p_D(\omega)^T P_{\Delta p_D} \Delta p_D(\omega) \mid \mathcal{A}_{t_0} \big)
        = E\Big( \sum_{k,l} [P_{\Delta p_D}]_{kl}\, \Delta p_{Dk}(\omega)\,\Delta p_{Dl}(\omega) \,\Big|\, \mathcal{A}_{t_0} \Big)
        = \sum_{k,l} [P_{\Delta p_D}]_{kl}\, E\big( \Delta p_{Dk}(\omega)\,\Delta p_{Dl}(\omega) \mid \mathcal{A}_{t_0} \big)
        = \sum_{k,l} [P_{\Delta p_D}]_{kl}\, E\big( (p_{Dk}(\omega) - \bar p_{Dk})(p_{Dl}(\omega) - \bar p_{Dl}) \mid \mathcal{A}_{t_0} \big)
        = \sum_{k,l} [P_{\Delta p_D}]_{kl}\, \operatorname{cov}\big( p_{Dk}, p_{Dl} \big),    (5.122a)

where the sums run over all components of p_D. With the covariance matrix

    \operatorname{cov}\big( p_D(\cdot) \big) := \Big( \operatorname{cov}\big( p_{Dk}(\cdot), p_{Dl}(\cdot) \big) \Big)_{k,l}
        = E\big( (p_D(\omega) - \bar p_D)(p_D(\omega) - \bar p_D)^T \mid \mathcal{A}_{t_0} \big)    (5.122b)

of p_D = p_D(\omega), from (5.122a) we obtain the representation

    E\big( \Delta p_D(\omega)^T P_{\Delta p_D} \Delta p_D(\omega) \mid \mathcal{A}_{t_0} \big)
        = \sum_{k,l} [P_{\Delta p_D}]_{kl}\, \operatorname{cov}\big( p_{Dl}, p_{Dk} \big)
        = \sum_{k} \big( k\text{-th row of } P_{\Delta p_D} \big) \cdot \big( k\text{-th column of } \operatorname{cov}(p_D(\cdot)) \big)
        = \operatorname{tr}\big( P_{\Delta p_D}\, \operatorname{cov}(p_D(\cdot)) \big).    (5.123a)
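The trace identity derived in (5.122a)-(5.123a), E(\Delta p_D^T P \Delta p_D) = tr(P cov(p_D)), can be checked numerically by Monte Carlo sampling. The sketch below is an illustration only; the matrices P and cov are arbitrary example data, not taken from the book:

```python
import numpy as np

rng = np.random.default_rng(0)

# arbitrary example data: a symmetric weight matrix P and a covariance matrix
P = np.array([[2.0, 0.5], [0.5, 1.0]])
L = np.array([[1.0, 0.0], [0.3, 0.8]])
cov = L @ L.T                      # covariance of the centered parameter vector

# sample centered deviations Delta p_D with mean 0 and covariance cov
dp = rng.multivariate_normal(np.zeros(2), cov, size=200_000)

# sample mean of the quadratic form Delta p_D^T P Delta p_D
mc_estimate = np.mean(np.einsum("ni,ij,nj->n", dp, P, dp))
trace_value = np.trace(P @ cov)    # tr(P cov(p_D)), cf. (5.123a)

print(mc_estimate, trace_value)    # the two values agree up to sampling error
```

The agreement improves with the sample size at the usual Monte Carlo rate; the trace formula itself is exact.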

Here, "tr" denotes the trace of a quadratic matrix.

In the same way we have, cf. (5.100a, b),

    E\big( \Delta q_0(\omega)^T P_{\Delta q_0} \Delta q_0(\omega) \mid \mathcal{A}_{t_0} \big) = \operatorname{tr}\big( P_{\Delta q_0}\, \operatorname{cov}(\Delta q_0(\cdot)) \big),    (5.123b)
    E\big( \Delta\dot q_0(\omega)^T P_{\Delta\dot q_0} \Delta\dot q_0(\omega) \mid \mathcal{A}_{t_0} \big) = \operatorname{tr}\big( P_{\Delta\dot q_0}\, \operatorname{cov}(\Delta\dot q_0(\cdot)) \big).    (5.123c)

According to (5.100a, b), see also (5.3d, e), (5.118), we have

    E\big( \Delta q_0(\omega) \mid \mathcal{A}_{t_0} \big) = 0,  E\big( \Delta\dot q_0(\omega) \mid \mathcal{A}_{t_0} \big) = 0.    (5.124)

Hence, for the fourth term in (5.121b) we get

    E\big( \Delta\dot q_0(\omega)^T P_{\Delta\dot q_0, \Delta q_0} \Delta q_0(\omega) \mid \mathcal{A}_{t_0} \big)
        = E\Big( \sum_{k,l} [P_{\Delta\dot q_0, \Delta q_0}]_{kl}\, \Delta\dot q_{0k}(\omega)\,\Delta q_{0l}(\omega) \,\Big|\, \mathcal{A}_{t_0} \Big)
        = \sum_{k,l} [P_{\Delta\dot q_0, \Delta q_0}]_{kl}\, \operatorname{cov}\big( \Delta\dot q_{0k}(\cdot), \Delta q_{0l}(\cdot) \big)
        = \sum_{k,l} [P_{\Delta\dot q_0, \Delta q_0}]_{kl}\, \operatorname{cov}\big( \Delta q_{0l}(\cdot), \Delta\dot q_{0k}(\cdot) \big)
        = \sum_{k} \big( k\text{-th row of } P_{\Delta\dot q_0, \Delta q_0} \big) \cdot \big( k\text{-th column of } \operatorname{cov}(\Delta q_0(\cdot), \Delta\dot q_0(\cdot)) \big)
        = \operatorname{tr}\big( P_{\Delta\dot q_0, \Delta q_0}\, \operatorname{cov}(\Delta q_0(\cdot), \Delta\dot q_0(\cdot)) \big).    (5.125)

Inserting now (5.123a-c) and (5.125) into (5.121b), for the objective function (5.97a) we find the approximation

    E(\tilde f \mid \mathcal{A}_{t_0}) \approx \operatorname{tr}\big( P_{\Delta p_D}\, \operatorname{cov}(p_D(\cdot)) \big)
        + \operatorname{tr}\big( P_{\Delta q_0}\, \operatorname{cov}(\Delta q_0(\cdot)) \big)
        + \operatorname{tr}\big( P_{\Delta\dot q_0}\, \operatorname{cov}(\Delta\dot q_0(\cdot)) \big)
        + \operatorname{tr}\big( P_{\Delta\dot q_0, \Delta q_0}\, \operatorname{cov}(\Delta q_0(\cdot), \Delta\dot q_0(\cdot)) \big).    (5.126)

Calculation of the Sensitivities

Because of (5.126) and (5.120a-d), respectively, we need the derivatives of \Delta z_I(t, p) with respect to \Delta p_D, \Delta q_0 and \Delta\dot q_0. The calculation of the derivatives is based on the initial value problem (5.97b-e), where, see (5.99), (5.100a, b), (5.118) and (5.124),

    \Delta p_D(\omega) := p_D(\omega) - E\big( p_D(\omega) \mid \mathcal{A}_{t_0} \big),    (5.127a)
    \Delta q_0(\omega) := q_0(\omega) - E\big( q_0(\omega) \mid \mathcal{A}_{t_0} \big),    (5.127b)
    \Delta\dot q_0(\omega) := \dot q_0(\omega) - E\big( \dot q_0(\omega) \mid \mathcal{A}_{t_0} \big).    (5.127c)

Due to Eqs. (5.127a-c), the constraints of the regulator optimization problem (5.97a-f) will now be replaced by the equivalent ones

    F(\bar p_D + \Delta p_D(\omega), q^R(t) + \Delta q(t), \dot q^R(t) + \Delta\dot q(t), \ddot q^R(t) + \Delta\ddot q(t))
        = F(p_D, q(t), \dot q(t), \ddot q(t)) = u^R(t) + \varphi(t, \Delta q(t), \Delta q_I(t), \Delta\dot q(t)),  a.s.,    (5.128a)
    \Delta q(t_0, \omega) = \Delta q_0(\omega),    (5.128b)
    \Delta\dot q(t_0, \omega) = \Delta\dot q_0(\omega),    (5.128c)
    \varphi(t_0, 0) = 0,    (5.128d)

while the objective is given by the integrated trace approximation (5.126), cf. the approximate problem (5.146a).

By partial differentiation of equation (5.128a) with respect to the above-mentioned parameter vectors we obtain three systems of differential equations for the sensitivities, i.e., the partial derivatives of \Delta z_I = \Delta z_I(t, p) with respect to the total parameter vector p, needed in (5.117), (5.120a-d) and (5.126).

For the representation of these sensitivities we introduce the following definition:

Definition 5.3 The Jacobians of the vector function F in (5.128a), taken at (\bar p_D, q^R(t), \dot q^R(t), \ddot q^R(t)), are denoted by

    K^R(t) := D_q F(\bar p_D, q^R(t), \dot q^R(t), \ddot q^R(t)),    (5.129a)
    D^R(t) := D_{\dot q} F(\bar p_D, q^R(t), \dot q^R(t), \ddot q^R(t)),    (5.129b)
    Y^R(t) := D_{p_D} F(\bar p_D, q^R(t), \dot q^R(t), \ddot q^R(t)),    (5.129c)
    M^R(t) := M(\bar p_D, q^R(t)) = D_{\ddot q} F(\bar p_D, q^R(t), \dot q^R(t), \ddot q^R(t)).    (5.129d)

Partial Derivative with Respect to p_D

By differentiation of (5.128a) with respect to the parameter subvector \Delta p_D, using the property \Delta q(t) = \Delta q(t, a) = \Delta q(t, \Delta p_D, \Delta q_0, \Delta\dot q_0) and writing F = F(\bar p_D + \Delta p_D, q^R(t) + \Delta q(t), \dot q^R(t) + \Delta\dot q(t), \ddot q^R(t) + \Delta\ddot q(t)), we obtain

    \frac{\partial F}{\partial p_D} + \frac{\partial F}{\partial q}\frac{\partial\Delta q(t,a)}{\partial\Delta p_D}
        + \frac{\partial F}{\partial\dot q}\frac{\partial\Delta\dot q(t,a)}{\partial\Delta p_D}
        + \frac{\partial F}{\partial\ddot q}\frac{\partial\Delta\ddot q(t,a)}{\partial\Delta p_D}
        = \underbrace{\frac{\partial u^R}{\partial\Delta p_D}(t)}_{=0}
        + \frac{\partial\varphi}{\partial\Delta q}(t, \Delta z_I)\frac{\partial\Delta q(t,a)}{\partial\Delta p_D}
        + \frac{\partial\varphi}{\partial\Delta q_I}(t, \Delta z_I)\frac{\partial\Delta q_I(t,a)}{\partial\Delta p_D}
        + \frac{\partial\varphi}{\partial\Delta\dot q}(t, \Delta z_I)\frac{\partial\Delta\dot q(t,a)}{\partial\Delta p_D}.    (5.130a)

Moreover, by differentiation of Eqs. (5.128b, c) for the initial values with respect to \Delta p_D we further find

    \frac{\partial\Delta q}{\partial\Delta p_D}(t_0, a) = 0,    (5.130b)
    \frac{\partial\Delta\dot q}{\partial\Delta p_D}(t_0, a) = 0.    (5.130c)

Here, we have

    \frac{\partial\Delta q_I}{\partial\Delta p_D}(t, a) = \frac{\partial}{\partial\Delta p_D}\int_{t_0}^{t} \Delta q(\tau, a)\,d\tau
        = \int_{t_0}^{t} \frac{\partial\Delta q}{\partial\Delta p_D}(\tau, a)\,d\tau.    (5.131)

Evaluating the initial value problem (5.130a-c) at a = (\Delta p_D, \Delta q_0, \Delta\dot q_0) = 0 yields, using the definitions (5.129a-d),

    Y^R(t) + K^R(t)\frac{\partial\Delta q(t,0)}{\partial\Delta p_D} + D^R(t)\frac{\partial\Delta\dot q(t,0)}{\partial\Delta p_D} + M^R(t)\frac{\partial\Delta\ddot q(t,0)}{\partial\Delta p_D}
        = \frac{\partial\varphi}{\partial\Delta q}(t,0)\frac{\partial\Delta q(t,0)}{\partial\Delta p_D}
        + \frac{\partial\varphi}{\partial\Delta q_I}(t,0)\frac{\partial\Delta q_I(t,0)}{\partial\Delta p_D}
        + \frac{\partial\varphi}{\partial\Delta\dot q}(t,0)\frac{\partial\Delta\dot q(t,0)}{\partial\Delta p_D},    (5.132a)
    \frac{\partial\Delta q}{\partial\Delta p_D}(t_0, 0) = 0,    (5.132b)
    \frac{\partial\Delta\dot q}{\partial\Delta p_D}(t_0, 0) = 0.    (5.132c)

Defining the sensitivity function

    \eta(t) := \frac{\partial\Delta q}{\partial\Delta p_D}(t, 0),  t_0 \le t \le t_f,    (5.133)
the initial value problem (5.132a-c) can also be represented by

    Y^R(t) + K^R(t)\,\eta(t) + D^R(t)\,\dot\eta(t) + M^R(t)\,\ddot\eta(t)
        = \frac{\partial\varphi}{\partial\Delta q}(t,0)\,\eta(t)
        + \frac{\partial\varphi}{\partial\Delta q_I}(t,0)\int_{t_0}^{t}\eta(\tau)\,d\tau
        + \frac{\partial\varphi}{\partial\Delta\dot q}(t,0)\,\dot\eta(t),    (5.134a)
    \eta(t_0) = 0,    (5.134b)
    \dot\eta(t_0) = 0.    (5.134c)

Since the matrix M^R = M^R(t) is regular, see e.g. [11], Eq. (5.134a) can be transformed into

    \ddot\eta(t) + M^R(t)^{-1}\Big( D^R(t) - \frac{\partial\varphi}{\partial\Delta\dot q}(t,0) \Big)\dot\eta(t)
        + M^R(t)^{-1}\Big( K^R(t) - \frac{\partial\varphi}{\partial\Delta q}(t,0) \Big)\eta(t)
        + M^R(t)^{-1}\Big( -\frac{\partial\varphi}{\partial\Delta q_I}(t,0) \Big)\int_{t_0}^{t}\eta(s)\,ds = -M^R(t)^{-1}Y^R(t).    (5.135)
Hence, for the unknown Jacobians of \varphi we use the following representation:

Definition 5.4 With given fixed, positive definite matrices K_p, K_i, K_d, define

    K_d =: M^R(t)^{-1}\Big( D^R(t) - \frac{\partial\varphi}{\partial\Delta\dot q}(t,0) \Big),    (5.136a)
    K_p =: M^R(t)^{-1}\Big( K^R(t) - \frac{\partial\varphi}{\partial\Delta q}(t,0) \Big),    (5.136b)
    K_i =: M^R(t)^{-1}\Big( -\frac{\partial\varphi}{\partial\Delta q_I}(t,0) \Big).    (5.136c)

Remark 5.3 In many cases the matrices K_p, K_i, K_d in (5.136a-c) are supposed to be diagonal matrices.

With the definitions (5.136a-c), the initial value problem (5.134a-c) can now be represented as follows:

    \ddot\eta(t) + K_d\,\dot\eta(t) + K_p\,\eta(t) + K_i\int_{t_0}^{t}\eta(\tau)\,d\tau = -M^R(t)^{-1}Y^R(t),  t \ge t_0,    (5.137a)
    \eta(t_0) = 0,    (5.137b)
    \dot\eta(t_0) = 0.    (5.137c)
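A scalar illustration of the sensitivity dynamics of the form (5.137a-c) can be obtained by augmenting the state with the running integral and integrating by explicit Euler. The gains and the constant right-hand side below are arbitrary example values (not from the book), chosen so that s^3 + k_d s^2 + k_p s + k_i is Hurwitz; with stable gains the integral term drives the parameter sensitivity back to zero, which is the characteristic benefit of the integrating part of a PID regulator:

```python
# explicit-Euler integration of  eta'' + kd*eta' + kp*eta + ki*Integral(eta) = b
# (scalar sketch of a (5.137a-c)-type system; gains and b are example values)
kd, kp, ki, b = 2.0, 2.0, 1.0, 1.0
dt, T = 1e-3, 30.0

eta, deta, integral = 0.0, 0.0, 0.0   # eta(t0) = 0, eta'(t0) = 0
t = 0.0
while t < T:
    ddeta = b - kd * deta - kp * eta - ki * integral
    eta += dt * deta
    deta += dt * ddeta
    integral += dt * eta
    t += dt

# with integral action the sensitivity decays: eta(T) -> 0, Integral -> b/ki
print(round(eta, 4), round(integral, 4))
```

The equilibrium of the augmented system is eta = 0 with the accumulated integral equal to b/k_i, i.e. a constant disturbance on the right-hand side is absorbed entirely by the I-term.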
Partial Derivative with Respect to q_0

Following the above procedure, we can also (i) determine the partial derivative of \Delta z_I with respect to the deviation \Delta q_0 in the initial position q_0, and (ii) evaluate the derivatives at a = (\Delta p_D, \Delta q_0, \Delta\dot q_0) = 0. This way, we get the next initial value problem:

    Y^R(t)\underbrace{\frac{\partial\Delta p_D}{\partial\Delta q_0}}_{=0} + K^R(t)\frac{\partial\Delta q(t,0)}{\partial\Delta q_0}
        + D^R(t)\frac{\partial\Delta\dot q(t,0)}{\partial\Delta q_0} + M^R(t)\frac{\partial\Delta\ddot q(t,0)}{\partial\Delta q_0}
        = \frac{\partial\varphi}{\partial\Delta q}(t,0)\frac{\partial\Delta q(t,0)}{\partial\Delta q_0}
        + \frac{\partial\varphi}{\partial\Delta q_I}(t,0)\frac{\partial\Delta q_I(t,0)}{\partial\Delta q_0}
        + \frac{\partial\varphi}{\partial\Delta\dot q}(t,0)\frac{\partial\Delta\dot q(t,0)}{\partial\Delta q_0},    (5.138a)
    \frac{\partial\Delta q}{\partial\Delta q_0}(t_0, 0) = I,    (5.138b)
    \frac{\partial\Delta\dot q}{\partial\Delta q_0}(t_0, 0) = 0,    (5.138c)

where I denotes the unit matrix. Defining then

    \zeta(t) := \frac{\partial\Delta q}{\partial\Delta q_0}(t, 0),  t \ge t_0,    (5.139)

and observing \partial\Delta p_D / \partial\Delta q_0 = 0, the differential equation (5.138a) reads

    \ddot\zeta(t) + M^R(t)^{-1}\Big( D^R(t) - \frac{\partial\varphi}{\partial\Delta\dot q}(t,0) \Big)\dot\zeta(t)
        + M^R(t)^{-1}\Big( K^R(t) - \frac{\partial\varphi}{\partial\Delta q}(t,0) \Big)\zeta(t)
        + M^R(t)^{-1}\Big( -\frac{\partial\varphi}{\partial\Delta q_I}(t,0) \Big)\int_{t_0}^{t}\zeta(\tau)\,d\tau = 0.    (5.140)

Using again the definitions (5.136a-c), for \zeta = \zeta(t) we get the initial value problem

    \ddot\zeta(t) + K_d\,\dot\zeta(t) + K_p\,\zeta(t) + K_i\int_{t_0}^{t}\zeta(\tau)\,d\tau = 0,  t \ge t_0,    (5.141a)
    \zeta(t_0) = I,    (5.141b)
    \dot\zeta(t_0) = 0.    (5.141c)
Partial Derivative with Respect to \dot q_0

Finally, we consider the partial derivative with respect to the deviations \Delta\dot q_0 in the initial velocities \dot q_0. With the corresponding differentiation and evaluation at a = (\Delta p_D, \Delta q_0, \Delta\dot q_0) = 0, we get the initial value problem

    Y^R(t)\underbrace{\frac{\partial\Delta p_D}{\partial\Delta\dot q_0}}_{=0} + K^R(t)\frac{\partial\Delta q(t,0)}{\partial\Delta\dot q_0}
        + D^R(t)\frac{\partial\Delta\dot q(t,0)}{\partial\Delta\dot q_0} + M^R(t)\frac{\partial\Delta\ddot q(t,0)}{\partial\Delta\dot q_0}
        = \frac{\partial\varphi}{\partial\Delta q}(t,0)\frac{\partial\Delta q(t,0)}{\partial\Delta\dot q_0}
        + \frac{\partial\varphi}{\partial\Delta q_I}(t,0)\frac{\partial\Delta q_I(t,0)}{\partial\Delta\dot q_0}
        + \frac{\partial\varphi}{\partial\Delta\dot q}(t,0)\frac{\partial\Delta\dot q(t,0)}{\partial\Delta\dot q_0},    (5.142a)
    \frac{\partial\Delta q}{\partial\Delta\dot q_0}(t_0, 0) = 0,    (5.142b)
    \frac{\partial\Delta\dot q}{\partial\Delta\dot q_0}(t_0, 0) = I.    (5.142c)

Also here I denotes the unit matrix, and \partial\Delta p_D / \partial\Delta\dot q_0 = 0 holds. Defining here

    \xi(t) := \frac{\partial\Delta q}{\partial\Delta\dot q_0}(t, 0),  t \ge t_0,    (5.143)

for (5.142a) we find the representation

    \ddot\xi(t) + M^R(t)^{-1}\Big( D^R(t) - \frac{\partial\varphi}{\partial\Delta\dot q}(t,0) \Big)\dot\xi(t)
        + M^R(t)^{-1}\Big( K^R(t) - \frac{\partial\varphi}{\partial\Delta q}(t,0) \Big)\xi(t)
        + M^R(t)^{-1}\Big( -\frac{\partial\varphi}{\partial\Delta q_I}(t,0) \Big)\int_{t_0}^{t}\xi(\tau)\,d\tau = 0.    (5.144)

Taking into account (5.136a-c), one finally gets the initial value problem

    \ddot\xi(t) + K_d\,\dot\xi(t) + K_p\,\xi(t) + K_i\int_{t_0}^{t}\xi(\tau)\,d\tau = 0,  t \ge t_0,    (5.145a)
    \xi(t_0) = 0,    (5.145b)
    \dot\xi(t_0) = I.    (5.145c)
5.7.2 The Approximate Regulator Optimization Problem

Based on the above approximations, the regulator optimization problem under stochastic uncertainty can be represented by

    min \int_{t_0}^{t_f} \Big( \operatorname{tr}\big( P_{\Delta p_D}\, \operatorname{cov}(p_D(\cdot)) \big)
        + \operatorname{tr}\big( P_{\Delta q_0}\, \operatorname{cov}(\Delta q_0(\cdot)) \big)
        + \operatorname{tr}\big( P_{\Delta\dot q_0}\, \operatorname{cov}(\Delta\dot q_0(\cdot)) \big)
        + \operatorname{tr}\big( P_{\Delta\dot q_0, \Delta q_0}\, \operatorname{cov}(\Delta q_0(\cdot), \Delta\dot q_0(\cdot)) \big) \Big)\,dt    (5.146a)

subject to the integro-differential systems

    \ddot\eta(t) + K_d\,\dot\eta(t) + K_p\,\eta(t) + K_i\int_{t_0}^{t}\eta(\tau)\,d\tau = -M^R(t)^{-1}Y^R(t),  t \ge t_0,    (5.146b)
    \eta(t_0) = 0,    (5.146c)
    \dot\eta(t_0) = 0,    (5.146d)
    \ddot\zeta(t) + K_d\,\dot\zeta(t) + K_p\,\zeta(t) + K_i\int_{t_0}^{t}\zeta(\tau)\,d\tau = 0,  t \ge t_0,    (5.146e)
    \zeta(t_0) = I,    (5.146f)
    \dot\zeta(t_0) = 0,    (5.146g)
    \ddot\xi(t) + K_d\,\dot\xi(t) + K_p\,\xi(t) + K_i\int_{t_0}^{t}\xi(\tau)\,d\tau = 0,  t \ge t_0,    (5.146h)
    \xi(t_0) = 0,    (5.146i)
    \dot\xi(t_0) = I.    (5.146j)

Introducing appropriate vectors and vector functions, the above three systems of differential equations (5.146b-j) can be represented uniformly. Hence, we set

    \beta(t) := \begin{cases} -M^R(t)^{-1}Y^R(t), & \text{for (5.146b-d)} \\ 0, & \text{for (5.146e-g)} \\ 0, & \text{for (5.146h-j)} \end{cases},  t \ge t_0,    (5.147a)

and

    dq_0 := \begin{cases} 0, & \text{for (5.146b-d)} \\ I, & \text{for (5.146e-g)} \\ 0, & \text{for (5.146h-j)} \end{cases}    (5.147b)

    d\dot q_0 := \begin{cases} 0, & \text{for (5.146b-d)} \\ 0, & \text{for (5.146e-g)} \\ I, & \text{for (5.146h-j)} \end{cases}    (5.147c)

The three systems of differential equations (5.146b-d), (5.146e-g), (5.146h-j), respectively, for the computation of the partial derivatives (sensitivities)

    dq(t) := \begin{cases} \eta(t), & \text{for (5.146b-d)} \\ \zeta(t), & \text{for (5.146e-g)} \\ \xi(t), & \text{for (5.146h-j)} \end{cases},  t \ge t_0,    (5.147d)

can then be represented by

    d\ddot q(t) + K_d\,d\dot q(t) + K_p\,dq(t) + K_i\int_{t_0}^{t} dq(\tau)\,d\tau = \beta(t),  t \ge t_0,    (5.148a)
    dq(t_0) = dq_0,    (5.148b)
    d\dot q(t_0) = d\dot q_0.    (5.148c)

Defining

    dz_I(t) := \begin{pmatrix} dq(t) \\ \int_{t_0}^{t} dq(\tau)\,d\tau \\ d\dot q(t) \end{pmatrix},  t \ge t_0,    (5.149)

the integro-differential equations (5.148a-c) can be transformed into linear systems of first order of the following type:

    d\dot z_I(t) = A\, dz_I(t) + \begin{pmatrix} 0 \\ 0 \\ \beta(t) \end{pmatrix},  t \ge t_0,    (5.150a)
    dz_I(t_0) = dz_{I0}.    (5.150b)

Here, the matrix A and the vector of initial conditions dz_{I0} are defined by

    A := \begin{pmatrix} 0 & 0 & I \\ I & 0 & 0 \\ -K_p & -K_i & -K_d \end{pmatrix},    (5.150c)
    dz_{I0} := \begin{pmatrix} dq_0 \\ 0 \\ d\dot q_0 \end{pmatrix}.    (5.150d)
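The uniform first-order form (5.150a-d) is convenient for numerical integration. The following sketch (scalar case; the gains are arbitrary example values chosen so that s^3 + k_d s^2 + k_p s + k_i is Hurwitz) builds the matrix A from (5.150c) and integrates the \zeta-system (5.146e-g), i.e. dq_0 = I = 1, d\dot q_0 = 0, by explicit Euler; for stable gains the position sensitivity dq(t) decays to zero:

```python
import numpy as np

# scalar example gains (arbitrary, chosen stable)
kp, ki, kd = 2.0, 1.0, 2.0

# A from (5.150c) for the state dz_I = (dq, integral of dq, dq')
A = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [-kp, -ki, -kd]])

# zeta-system (5.146e-g): dq(t0) = 1, dq'(t0) = 0, homogeneous right-hand side
z = np.array([1.0, 0.0, 0.0])
dt, T = 1e-3, 30.0
for _ in range(int(T / dt)):
    z = z + dt * (A @ z)          # beta(t) = 0 here, cf. (5.147a)

print(np.round(z, 3))             # all components have decayed towards 0
```

Characteristic roots of this system solve s^3 + k_d s^2 + k_p s + k_i = 0, so stability of the chosen PID gains is exactly stability of the sensitivity dynamics.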
Chapter 6
Expected Total Cost Minimum Design of Plane
Frames

6.1 Introduction

Yield stresses, allowable stresses, moment capacities (plastic moments with respect
to compression, tension and rotation), applied loadings, cost factors, manufacturing
errors, etc., are not given fixed quantities in structural analysis and optimal design
problems, but must be modeled as random variables with a certain joint probability
distribution. Problems from plastic analysis and optimal plastic design are based on
the convex yield (feasibility) criterion and the linear equilibrium equation for the
stress (state) vector.
After the formulation of the basic mechanical conditions including the relevant
material strength parameters and load components as well as the involved design
variables (as e.g. sizing variables) for plane frames, several approximations are
considered: (i) approximation of the convex yield (feasible) domain by means of
convex polyhedrons (piecewise linearization of the yield domain); (ii) treatment
of symmetric and non-symmetric yield stresses with respect to compression
and tension; (iii) approximation of the yield condition by using given reference
capacities.
As a result, for the survival of a plane frame a certain system of necessary and/or
sufficient linear equalities and inequalities is obtained. Evaluating the recourse
costs, i.e., the costs for violations of the survival condition by means of piecewise
linear convex cost functions, a linear program is obtained for the minimization of
the total costs including weight-, volume- or more general initial (construction)
costs. Appropriate cost factors are derived. Considering then the expected total
costs from construction as well as from possible structural damages or failures, a
stochastic linear optimization problem (SLP) is obtained. Finally, discretization of
the probability distribution of the random parameter vector yields a (large scale)
linear program (LP) having a special structure. For LPs of this type, numerous very efficient LP solvers are available, also for the solution of very large scale problems.


6.2 Stochastic Linear Programming Techniques

6.2.1 Limit (Collapse) Load Analysis of Structures as a Linear Programming Problem

Assuming that the material behaves as an elastic-perfectly plastic material [59, 79, 119], a conservative estimate of the collapse load factor \mu^* is based [30, 31, 36, 39, 54, 55, 77, 120, 164] on the following linear program in the load factor \mu:

    maximize  \mu    (6.1a)
    s.t.
    F^L \le F \le F^U,    (6.1b)
    CF = \mu R_0.    (6.1c)

Here, (6.1c) is the equilibrium equation of a loaded, statically indeterminate structure involving an m x n matrix C = (c_ij), m < n, of given coefficients c_ij, 1 <= i <= m, 1 <= j <= n, depending on the undeformed geometry of the structure having n_0 members (elements); we suppose that rank C = m. Furthermore, R_0 is an external load m-vector, and F denotes the n-vector of internal forces and bending moments in the relevant points (sections, nodes or elements), with lower and upper bounds F^L, F^U.

For a plane or spatial truss [83, 158] we have n = n_0, the matrix C contains the direction cosines of the members, and F involves only the normal (axial) forces; moreover,

    F_j^L := \sigma_{yj}^L A_j,  F_j^U := \sigma_{yj}^U A_j,  j = 1, \dots, n (= n_0),    (6.2)

where A_j is the (given) cross-sectional area, and \sigma_{yj}^L, \sigma_{yj}^U, respectively, denote the yield stress in compression (negative values) and tension (positive values) of the j-th member of the truss. In case of a plane frame, F is composed of subvectors [158]

    F^{(k)} = \big( F_1^{(k)}, F_2^{(k)}, F_3^{(k)} \big)^T = \big( t_k, m_k^+, m_k^- \big)^T,    (6.3a)

where F_1^{(k)} = t_k denotes the normal (axial) force, and F_2^{(k)} = m_k^+, F_3^{(k)} = m_k^- are the bending moments at the positive, negative end of the k-th member. In this case F^L, F^U contain, for each member k, the subvectors

    F^{(k)L} = \big( \sigma_{yk}^L A_k, -M_{kpl}, -M_{kpl} \big)^T,  F^{(k)U} = \big( \sigma_{yk}^U A_k, M_{kpl}, M_{kpl} \big)^T,    (6.3b)

respectively, where M_{kpl}, k = 1, \dots, n_0, denotes the plastic moments (moment capacities) [59, 119] given by

    M_{kpl} = \sigma_{yk}^U W_{kpl},    (6.3c)

and W_{kpl} = W_{kpl}(A_k) is the plastic section modulus of the cross-section of the k-th member (beam) with respect to the local z-axis.

For a spatial frame [83, 158], corresponding to the k-th member (beam), F contains the subvector

    F^{(k)} := \big( t_k, m_{kT}, m^+_{k\bar y}, m^+_{k\bar z}, m^-_{k\bar y}, m^-_{k\bar z} \big)^T,    (6.4a)

where t_k is the normal (axial) force, m_{kT} the twisting moment, and m^+_{k\bar y}, m^+_{k\bar z}, m^-_{k\bar y}, m^-_{k\bar z} denote the four bending moments with respect to the local \bar y-, \bar z-axis at the positive, negative end of the beam, respectively. Finally, the bounds F^L, F^U for F are given by

    F^{(k)L} = \big( \sigma_{yk}^L A_k, -M^{\bar p}_{kpl}, -M^{\bar y}_{kpl}, -M^{\bar z}_{kpl}, -M^{\bar y}_{kpl}, -M^{\bar z}_{kpl} \big)^T,    (6.4b)
    F^{(k)U} = \big( \sigma_{yk}^U A_k, M^{\bar p}_{kpl}, M^{\bar y}_{kpl}, M^{\bar z}_{kpl}, M^{\bar y}_{kpl}, M^{\bar z}_{kpl} \big)^T,    (6.4c)

where [59, 119]

    M^{\bar p}_{kpl} := \tau_{yk} W^{\bar p}_{kpl},  M^{\bar y}_{kpl} := \sigma_{yk}^U W^{\bar y}_{kpl},  M^{\bar z}_{kpl} := \sigma_{yk}^U W^{\bar z}_{kpl}    (6.4d)

are the plastic moments of the cross-section of the k-th element with respect to the local twisting axis and the local \bar y-, \bar z-axis, respectively. In (6.4d), W^{\bar p}_{kpl} = W^{\bar p}_{kpl}(x) and W^{\bar y}_{kpl} = W^{\bar y}_{kpl}(x), W^{\bar z}_{kpl} = W^{\bar z}_{kpl}(x), respectively, denote the polar, axial modulus of the cross-sectional area of the k-th beam, and \tau_{yk} denotes the yield stress with respect to torsion; we suppose that \tau_{yk} = \sigma_{yk}^U.

Remark 6.1 Possible plastic hinges [59, 82, 119] are taken into account by inserting appropriate eccentricities e_{kl} > 0, e_{kr} > 0, k = 1, \dots, n_0, with e_{kl}, e_{kr} \ll L_k, where L_k is the length of the k-th beam.
Remark 6.2 Working with more general yield polygons, see e.g. [164], the stress condition (6.1b) is replaced by the more general system of inequalities

    H (F_d^U)^{-1} F \le h.    (6.5a)

Here, (H, h) is a given \nu x (n + 1) matrix, and F_d^U := (F_j^U \delta_{ij}) denotes the n x n diagonal matrix of principal axial and bending plastic capacities

    F_j^U := \sigma_{y k_j}^U A_{k_j},  F_j^U := \sigma_{y k_j}^U W_{k_j pl},    (6.5b)

where k_j, j are indices as arising in (6.3b)-(6.4d). The more general case (6.5a) can be treated by similar methods as the case (6.1b) considered here.

Plastic and Elastic Design of Structures

In the plastic design of trusses and frames [77, 87, 92, 97, 128, 152] having n_0 members, the n-vectors F^L, F^U of lower and upper bounds

    F^L = F^L(\sigma_y^L, \sigma_y^U, x),  F^U = F^U(\sigma_y^L, \sigma_y^U, x)    (6.6)

for the n-vector F of internal member forces and bending moments F_j, j = 1, \dots, n, are determined [54, 77] by the yield stresses, i.e. the compressive limiting stresses (negative values) \sigma_y^L = (\sigma_{y1}^L, \dots, \sigma_{y n_0}^L)^T, the tensile yield stresses \sigma_y^U = (\sigma_{y1}^U, \dots, \sigma_{y n_0}^U)^T, and the r-vector

    x = (x_1, x_2, \dots, x_r)^T    (6.7)

of design variables of the structure. In case of trusses we have, cf. (6.2),

    F^L = \sigma_{yd}^L A(x) = A(x)_d\, \sigma_y^L,  F^U = \sigma_{yd}^U A(x) = A(x)_d\, \sigma_y^U,    (6.8)

where n = n_0, and \sigma_{yd}^L, \sigma_{yd}^U denote the n x n diagonal matrices having the diagonal elements \sigma_{yj}^L, \sigma_{yj}^U, respectively, j = 1, \dots, n; moreover,

    A(x) = \big( A_1(x), \dots, A_n(x) \big)^T    (6.9)

is the n-vector of cross-sectional areas A_j = A_j(x), j = 1, \dots, n, depending on the r-vector x of design variables x_\nu, \nu = 1, \dots, r, and A(x)_d denotes the n x n diagonal matrix having the diagonal elements A_j = A_j(x), 1 \le j \le n.

Corresponding to (6.1c), here the equilibrium equation reads

    CF = R_u,    (6.10)

where R_u describes [77] the ultimate load (representing constant external loads or self-weight expressed in linear terms of A(x)).
The plastic design of structures can then be represented, cf. [7], by the optimization problem

    min G(x)    (6.11a)
    s.t.
    F^L(\sigma_y^L, \sigma_y^U, x) \le F \le F^U(\sigma_y^L, \sigma_y^U, x),    (6.11b)
    CF = R_u,    (6.11c)

where G = G(x) is a certain objective function, e.g. the volume or weight of the structure.

Remark 6.3 As mentioned in Remark 6.2, working with more general yield polygons, (6.11b) is replaced by the condition

    H \big[ F^U(\sigma_y^U, x)_d \big]^{-1} F \le h.    (6.11d)

For the elastic design we must replace the yield stresses \sigma_y^L, \sigma_y^U by the allowable stresses \sigma_a^L, \sigma_a^U, and instead of ultimate loads we consider service loads R_s. Hence, instead of (6.11a-d) we have the related program

    min G(x)    (6.12a)
    s.t.
    F^L(\sigma_a^L, \sigma_a^U, x) \le F \le F^U(\sigma_a^L, \sigma_a^U, x),    (6.12b)
    CF = R_s,    (6.12c)
    x^L \le x \le x^U,    (6.12d)

where x^L, x^U denote lower and upper bounds for x.
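For trusses, the bound vectors (6.8) and the design constraints (6.11b, c) are straightforward to evaluate for a candidate design. A small numerical sketch follows; all data (stresses, areas, the matrix C and the load R_u) are invented example values, not taken from the book:

```python
import numpy as np

# example 2-bar truss data (invented): yield stresses and cross-sectional areas
sig_L = np.array([-200.0, -200.0])   # compressive yield stresses (negative)
sig_U = np.array([ 240.0,  240.0])   # tensile yield stresses (positive)
A_x   = np.array([ 10.0,   15.0])    # areas A_j(x) for the current design x

# bounds (6.8): F^L = diag(sig_L) A(x), F^U = diag(sig_U) A(x)
F_L = sig_L * A_x
F_U = sig_U * A_x

# equilibrium (6.11c): C F = R_u, with an invented 1x2 matrix C and load R_u
C = np.array([[0.6, 0.8]])
R_u = np.array([2000.0])

def feasible(F, tol=1e-8):
    """Check the plastic-design constraints (6.11b, c) for a member-force vector F."""
    return (np.all(F_L - tol <= F) and np.all(F <= F_U + tol)
            and np.allclose(C @ F, R_u))

F_try = np.array([1000.0, 1750.0])   # candidate internal forces
print(feasible(F_try))
```

Embedding this feasibility check in a minimization of G(x) over the design vector x gives the plastic design problem (6.11a-c).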

6.2.2 Plane Frames

For each bar i = 1, \dots, B of a plane frame with member load vector F_i = (t_i, m_i^+, m_i^-)^T we consider [153, 165] the load at the negative, positive end,

    F_i^- := (t_i, m_i^-)^T,  F_i^+ := (t_i, m_i^+)^T,    (6.13)

respectively.

Furthermore, for each bar/beam with rigid joints we have several plastic capacities: The plastic capacity N_{ipl}^L of the bar with respect to axial compression, hence the maximum axial force under compression, is given by

    N_{ipl}^L = |\sigma_{yi}^L| \cdot A_i,    (6.14a)

where \sigma_{yi}^L < 0 denotes the (negative) yield stress with respect to compression and A_i is the cross-sectional area of the i-th element. Correspondingly, the plastic capacity with respect to (axial) tension reads

    N_{ipl}^U = \sigma_{yi}^U \cdot A_i,    (6.14b)

where \sigma_{yi}^U > 0 is the yield stress with respect to tension. Besides the plastic capacities with respect to the normal force, we have the moment capacity

    M_{ipl} = \sigma_{yi}^U \cdot W_{ipl}    (6.14c)

with respect to the bending moments at the ends of bar i.
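The plastic capacities (6.14a-c) for one bar can be computed directly. The section data below are invented example values for a rectangular cross-section b x h, for which the plastic section modulus is W_pl = b h^2 / 4:

```python
# invented example data for one bar: rectangular section b x h
b, h = 0.10, 0.20                  # cross-section dimensions [m]
A_i = b * h                        # cross-sectional area
W_ipl = b * h**2 / 4.0             # plastic section modulus of a rectangle

sig_yL, sig_yU = -235e6, 235e6     # yield stresses in compression/tension [Pa]

N_ipl_L = abs(sig_yL) * A_i        # (6.14a): capacity under axial compression
N_ipl_U = sig_yU * A_i             # (6.14b): capacity under axial tension
M_ipl   = sig_yU * W_ipl           # (6.14c): moment capacity

print(N_ipl_L, N_ipl_U, M_ipl)    # all capacities are nonnegative, cf. Remark 6.1
```

With symmetric yield stresses, as here, the two axial capacities coincide, cf. (6.16b).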


Remark 6.1 Note that all plastic capacities have nonnegative values.

Using the plastic capacities (6.14a-c), the load vectors F_i^-, F_i^+ given by (6.13) can be replaced by the dimensionless quantities

    F_i^{L-} := \Big( \frac{t_i}{N_{ipl}^L}, \frac{m_i^-}{M_{ipl}} \Big)^T,  F_i^{U-} := \Big( \frac{t_i}{N_{ipl}^U}, \frac{m_i^-}{M_{ipl}} \Big)^T,    (6.15a,b)
    F_i^{L+} := \Big( \frac{t_i}{N_{ipl}^L}, \frac{m_i^+}{M_{ipl}} \Big)^T,  F_i^{U+} := \Big( \frac{t_i}{N_{ipl}^U}, \frac{m_i^+}{M_{ipl}} \Big)^T    (6.15c,d)

for the negative, positive end, respectively, of the i-th bar.

Remark 6.2 (Symmetric Yield Stresses Under Compression and Tension) In the important special case that the absolute values of the yield stresses under compression (< 0) and tension (> 0) are equal, hence

    \sigma_{yi}^L = -\sigma_{yi}^U,    (6.16a)

we get

    N_{ipl}^L = N_{ipl}^U =: N_{ipl}.    (6.16b)

The limit between the elastic and the plastic state of the elements is described by the feasibility or yield condition: At the negative end we have the condition

    F_i^{L-} \in K_i,  F_i^{U-} \in K_i,    (6.17a,b)


[Fig. 6.1 Domain K_{0,sym} with possible approximations (axes: N/N_{pl}, M/M_{pl}; an approximation from outside and one from inside of the domain K are indicated)]

and at the positive end the condition reads

    F_i^{L+} \in K_i,  F_i^{U+} \in K_i.    (6.17c,d)

Here, K_i \subset R^2 denotes the feasible domain of bar i, having the following properties:
• K_i is a closed, convex subset of R^2;
• the origin 0 of R^2 is an interior point of K_i;
• the interior of K_i represents the elastic states;
• at the boundary \partial K_i yielding of the material starts.

Considering e.g. bars with rectangular cross-sectional areas and symmetric yield stresses, cf. Remark 6.2, K_i is given by K_i = K_{0,sym}, where [72, 74]

    K_{0,sym} = \{ (x, y)^T : x^2 + |y| \le 1 \},    (6.18)

with x = N/N_{pl} and y = M/M_{pl} (see Fig. 6.1).
In case (6.18), and supposing symmetric yield stresses, the yield condition (6.17a-d) reads

    \Big( \frac{t_i}{N_{ipl}} \Big)^2 + \Big| \frac{m_i^-}{M_{ipl}} \Big| \le 1,    (6.19a)
    \Big( \frac{t_i}{N_{ipl}} \Big)^2 + \Big| \frac{m_i^+}{M_{ipl}} \Big| \le 1.    (6.19b)

Remark 6.3 Because of the connection between the normal force t_i and the bending moments, (6.19a, b) are also called the "M-N interaction".

If the M-N interaction is not taken into account, K_{0,sym} is approximated from outside, see Fig. 6.1, by

    K_{0,sym}^u := \{ (x, y)^T : |x| \le 1, |y| \le 1 \}.    (6.20)

Hence, (6.17a-d) are replaced, cf. (6.19a, b), by the simpler conditions

    |t_i| \le N_{ipl},    (6.21a)
    |m_i^-| \le M_{ipl},    (6.21b)
    |m_i^+| \le M_{ipl}.    (6.21c)

Since the symmetry condition (6.16a) does not hold in general, some modifications of the basic conditions (6.19a, b) are needed. In the non-symmetric case K_{0,sym} must be replaced by the intersection

    K_0 = K_0^U \cap K_0^L    (6.22)

of two convex sets K_0^U and K_0^L. For a rectangular cross-sectional area we have

    K_0^U = \{ (x, y)^T : x \le \sqrt{1 - |y|},\ |y| \le 1 \}    (6.23a)

for tension and

    K_0^L = \{ (x, y)^T : x \ge -\sqrt{1 - |y|},\ |y| \le 1 \}    (6.23b)

for compression, where again x = N/N_{pl} and y = M/M_{pl} (see Fig. 6.2).


In case of tension, see (6.17b, d), from (6.23a) we then obtain the feasibility condition

    \frac{t_i}{N_{ipl}^U} \le \sqrt{1 - \Big| \frac{m_i^-}{M_{ipl}} \Big|},    (6.24a)
[Fig. 6.2 Feasible domain K_0 as intersection of K_0^U and K_0^L (axes: N/N_{pl}, M/M_{pl})]

    \frac{t_i}{N_{ipl}^U} \le \sqrt{1 - \Big| \frac{m_i^+}{M_{ipl}} \Big|},    (6.24b)
    \Big| \frac{m_i^-}{M_{ipl}} \Big| \le 1,    (6.24c)
    \Big| \frac{m_i^+}{M_{ipl}} \Big| \le 1.    (6.24d)

For compression, with (6.17a, c) and (6.23b) we get the feasibility condition

    -\frac{t_i}{N_{ipl}^L} \le \sqrt{1 - \Big| \frac{m_i^-}{M_{ipl}} \Big|},    (6.24e)
    -\frac{t_i}{N_{ipl}^L} \le \sqrt{1 - \Big| \frac{m_i^+}{M_{ipl}} \Big|},    (6.24f)
    \Big| \frac{m_i^-}{M_{ipl}} \Big| \le 1,    (6.24g)
    \Big| \frac{m_i^+}{M_{ipl}} \Big| \le 1.    (6.24h)

From (6.24a), (6.24e) we get

    -N_{ipl}^L \sqrt{1 - \Big| \frac{m_i^-}{M_{ipl}} \Big|} \le t_i \le N_{ipl}^U \sqrt{1 - \Big| \frac{m_i^-}{M_{ipl}} \Big|},    (6.25a)

and (6.24b), (6.24f) yield

    -N_{ipl}^L \sqrt{1 - \Big| \frac{m_i^+}{M_{ipl}} \Big|} \le t_i \le N_{ipl}^U \sqrt{1 - \Big| \frac{m_i^+}{M_{ipl}} \Big|}.    (6.25b)

Furthermore, (6.24c), (6.24g) and (6.24d), (6.24h) yield

    |m_i^-| \le M_{ipl},    (6.25c)
    |m_i^+| \le M_{ipl}.    (6.25d)
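For non-symmetric yield stresses, (6.25a) gives, for a given bending moment with |m^-/M_pl| <= 1, the admissible interval for the normal force t_i. A short sketch with invented example capacities:

```python
import math

# invented example capacities: compression and tension capacities differ
N_pl_L, N_pl_U, M_pl = 120.0, 100.0, 50.0

def axial_force_interval(m, N_pl_L, N_pl_U, M_pl):
    """Admissible t_i-interval from (6.25a) for bending moment m at the negative end."""
    ratio = abs(m / M_pl)
    if ratio > 1.0:                      # (6.25c) violated: no admissible t_i
        return None
    root = math.sqrt(1.0 - ratio)
    return (-N_pl_L * root, N_pl_U * root)

lo, hi = axial_force_interval(37.5, N_pl_L, N_pl_U, M_pl)
print(lo, hi)    # ratio 0.75 -> sqrt(0.25) = 0.5, interval [-60.0, 50.0]
```

The interval shrinks to the single point t_i = 0 as |m^-| approaches M_pl, reflecting the M-N interaction.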

For computational purposes, piecewise linearizations are applied [99] to the nonlinear conditions (6.25a, b). A basic approximation of K_0^L and K_0^U is given by

    K_0^{Uu} := \{ (x, y)^T : x \le 1, |y| \le 1 \} = (-\infty, 1] \times [-1, 1]    (6.26a)

and

    K_0^{Lu} := \{ (x, y)^T : x \ge -1, |y| \le 1 \} = [-1, \infty) \times [-1, 1],    (6.26b)

with x = N/N_{pl} and y = M/M_{pl} (see Fig. 6.3).

Since in this approximation the M-N interaction is not taken into account, condition (6.17a-d) reduces to

    -N_{ipl}^L \le t_i \le N_{ipl}^U,    (6.27a)
    |m_i^-| \le M_{ipl},    (6.27b)
    |m_i^+| \le M_{ipl}.    (6.27c)
6.2 Stochastic Linear Programming Techniques 263

Fig. 6.3 Approximation of $K_0$ by $K_0^{Lu}$ and $K_0^{Uu}$ (axes $N/N_{pl}$ and $M/M_{pl}$)

6.2.3 Yield Condition in Case of M–N-Interaction

Symmetric Yield Stresses

Consider first the case

$$\sigma_{yi}^U = -\sigma_{yi}^L =: \sigma_{yi}, \quad i = 1, \ldots, B. \qquad (6.28a)$$

Then,

$$N_{ipl}^{L} := |\sigma_{yi}^L| A_i = \sigma_{yi}^U A_i =: N_{ipl}^{U}, \qquad (6.28b)$$

hence,

$$N_{ipl} := N_{ipl}^{L} = N_{ipl}^{U} = \sigma_{yi} A_i. \qquad (6.28c)$$

Moreover,

$$M_{ipl} = \sigma_{yi}^U W_{ipl} = \sigma_{yi} W_{ipl} = \sigma_{yi} A_i \bar{y}_{ic}, \quad i = 1, \ldots, B, \qquad (6.28d)$$



where $\bar{y}_{ic}$ denotes the arithmetic mean of the centroids of the two half areas of the cross-sectional area $A_i$ of bar $i$.

Depending on the geometric form of the cross-sectional area (rectangle, circle, etc.), for the element load vectors

$$F_i = \begin{pmatrix} t_i \\ m_i^+ \\ m_i^- \end{pmatrix}, \quad i = 1, \ldots, B, \qquad (6.29)$$

of the bars we have the yield condition:

$$\left|\frac{t_i}{N_{ipl}}\right|^{\alpha} + \left|\frac{m_i^-}{M_{ipl}}\right| \le 1 \quad \text{(negative end)} \qquad (6.30a)$$

$$\left|\frac{t_i}{N_{ipl}}\right|^{\alpha} + \left|\frac{m_i^+}{M_{ipl}}\right| \le 1 \quad \text{(positive end)}. \qquad (6.30b)$$

Here, $\alpha > 1$ is a constant depending on the type of the cross-sectional area of the $i$th bar. Defining the convex set

$$K_0^{\alpha} := \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 : |x|^{\alpha} + |y| \le 1 \right\}, \qquad (6.31)$$

for (6.30a, b) we also have the representation

$$\begin{pmatrix} t_i / N_{ipl} \\ m_i^- / M_{ipl} \end{pmatrix} \in K_0^{\alpha} \quad \text{(negative end)} \qquad (6.32a)$$

$$\begin{pmatrix} t_i / N_{ipl} \\ m_i^+ / M_{ipl} \end{pmatrix} \in K_0^{\alpha} \quad \text{(positive end)}. \qquad (6.32b)$$
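The set membership (6.31)/(6.32a, b) translates directly into a one-line test; the following sketch (function names hypothetical) expresses the yield condition for one bar end.

```python
def in_K0_alpha(x, y, alpha):
    """Membership test for K0^alpha = {(x, y): |x|**alpha + |y| <= 1},
    cf. (6.31); alpha > 1 depends on the cross-section type."""
    return abs(x) ** alpha + abs(y) <= 1.0

def yield_ok(t, m, N_pl, M_pl, alpha):
    """Yield condition (6.30a, b) / (6.32a, b) for one bar end."""
    return in_K0_alpha(t / N_pl, m / M_pl, alpha)
```

For $\alpha = 2$ the boundary is a parabola in $|x|$; e.g. the scaled load $(0.9, 0.2)$ violates the condition since $0.81 + 0.2 > 1$.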

Piecewise Linearization of $K_0^{\alpha}$

Due to the symmetry of $K_0^{\alpha}$ with respect to the transformation

$$x \to -x, \quad y \to -y,$$

$K_0^{\alpha}$ is piecewise linearized as follows: Starting from a boundary point of $K_0^{\alpha}$, hence,

$$\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \in \partial K_0^{\alpha} \quad \text{with } u_1 \ge 0,\ u_2 \ge 0, \qquad (6.33a)$$

we consider the gradient of the boundary curve

$$f(x,y) := |x|^{\alpha} + |y| - 1 = 0$$

of $K_0^{\alpha}$ in the four points

$$\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \quad \begin{pmatrix} -u_1 \\ u_2 \end{pmatrix}, \quad \begin{pmatrix} -u_1 \\ -u_2 \end{pmatrix}, \quad \begin{pmatrix} u_1 \\ -u_2 \end{pmatrix}. \qquad (6.33b)$$

We have

$$\nabla f \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} \alpha u_1^{\alpha-1} \\ 1 \end{pmatrix} \qquad (6.34a)$$

$$\nabla f \begin{pmatrix} -u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} -\alpha u_1^{\alpha-1} \\ 1 \end{pmatrix} \qquad (6.34b)$$

$$\nabla f \begin{pmatrix} -u_1 \\ -u_2 \end{pmatrix} = \begin{pmatrix} -\alpha u_1^{\alpha-1} \\ -1 \end{pmatrix} \qquad (6.34c)$$

$$\nabla f \begin{pmatrix} u_1 \\ -u_2 \end{pmatrix} = \begin{pmatrix} \alpha u_1^{\alpha-1} \\ -1 \end{pmatrix}, \qquad (6.34d)$$

where

$$\nabla f \begin{pmatrix} -u_1 \\ -u_2 \end{pmatrix} = -\nabla f \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \qquad (6.35a)$$

$$\nabla f \begin{pmatrix} -u_1 \\ u_2 \end{pmatrix} = -\nabla f \begin{pmatrix} u_1 \\ -u_2 \end{pmatrix}. \qquad (6.35b)$$

Furthermore, in the two special points

$$\begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ -1 \end{pmatrix}$$

of $\partial K_0^{\alpha}$ we have, cf. (6.34a), (6.34d), resp., the gradients

$$\nabla f \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (6.36a)$$

$$\nabla f \begin{pmatrix} 0 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix}. \qquad (6.36b)$$
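The sign symmetries (6.35a, b) of the gradients can be verified numerically; this short sketch (names hypothetical) evaluates $\nabla f$ away from the axes.

```python
def grad_f(x, y, alpha):
    """Gradient of f(x, y) = |x|**alpha + |y| - 1 away from the
    coordinate axes, cf. (6.34a-d)."""
    sgn = lambda v: 1.0 if v > 0 else -1.0
    return (alpha * abs(x) ** (alpha - 1) * sgn(x), sgn(y))

# boundary point of K0^alpha for alpha = 2: u1**alpha + u2 = 1
alpha, u1 = 2.0, 0.6
u2 = 1 - u1 ** alpha
g = grad_f(u1, u2, alpha)
# (6.35a): the gradient at (-u1, -u2) is the negative of that at (u1, u2)
assert grad_f(-u1, -u2, alpha) == (-g[0], -g[1])
# (6.35b): the gradient at (-u1, u2) is the negative of that at (u1, -u2)
gm = grad_f(u1, -u2, alpha)
assert grad_f(-u1, u2, alpha) == (-gm[0], -gm[1])
```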

Though $f(x,y) = |x|^{\alpha} + |y| - 1$ is not differentiable at

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 \\ 0 \end{pmatrix},$$

we define

$$\nabla f \begin{pmatrix} 1 \\ 0 \end{pmatrix} := \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad (6.36c)$$

$$\nabla f \begin{pmatrix} -1 \\ 0 \end{pmatrix} := \begin{pmatrix} -1 \\ 0 \end{pmatrix}. \qquad (6.36d)$$

Using a boundary point $\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}$ of $K_0^{\alpha}$ with $u_1, u_2 > 0$, the feasible domain $K_0^{\alpha}$ can be approximated from outside by the convex polyhedron defined as follows: From the gradients (6.36a–d) we obtain, next to the already known conditions (no $M$–$N$-interaction):

$$\nabla f \begin{pmatrix} 0 \\ 1 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right] = \begin{pmatrix} 0 \\ 1 \end{pmatrix}^T \begin{pmatrix} x \\ y-1 \end{pmatrix} \le 0$$

$$\nabla f \begin{pmatrix} 0 \\ -1 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} 0 \\ -1 \end{pmatrix} \right] = \begin{pmatrix} 0 \\ -1 \end{pmatrix}^T \begin{pmatrix} x \\ y+1 \end{pmatrix} \le 0$$

$$\nabla f \begin{pmatrix} 1 \\ 0 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right] = \begin{pmatrix} 1 \\ 0 \end{pmatrix}^T \begin{pmatrix} x-1 \\ y \end{pmatrix} \le 0$$

$$\nabla f \begin{pmatrix} -1 \\ 0 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} -1 \\ 0 \end{pmatrix} \right] = \begin{pmatrix} -1 \\ 0 \end{pmatrix}^T \begin{pmatrix} x+1 \\ y \end{pmatrix} \le 0.$$

This yields

$$y - 1 \le 0, \quad -(y+1) \le 0 \quad \text{and} \quad x - 1 \le 0, \quad -(x+1) \le 0,$$

hence,

$$|x| \le 1 \qquad (6.37a)$$

$$|y| \le 1. \qquad (6.37b)$$

Moreover, with the gradients (6.34a–d), cf. (6.35a, b), we get the additional conditions

$$\nabla f \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \right] = \begin{pmatrix} \alpha u_1^{\alpha-1} \\ 1 \end{pmatrix}^T \begin{pmatrix} x-u_1 \\ y-u_2 \end{pmatrix} \le 0 \quad \text{(1st quadrant)} \qquad (6.38a)$$

$$\nabla f \begin{pmatrix} -u_1 \\ -u_2 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} -u_1 \\ -u_2 \end{pmatrix} \right] = \begin{pmatrix} -\alpha u_1^{\alpha-1} \\ -1 \end{pmatrix}^T \begin{pmatrix} x+u_1 \\ y+u_2 \end{pmatrix} \le 0 \quad \text{(3rd quadrant)} \qquad (6.38b)$$

$$\nabla f \begin{pmatrix} -u_1 \\ u_2 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} -u_1 \\ u_2 \end{pmatrix} \right] = \begin{pmatrix} -\alpha u_1^{\alpha-1} \\ 1 \end{pmatrix}^T \begin{pmatrix} x+u_1 \\ y-u_2 \end{pmatrix} \le 0 \quad \text{(2nd quadrant)} \qquad (6.38c)$$

$$\nabla f \begin{pmatrix} u_1 \\ -u_2 \end{pmatrix}^T \left[ \begin{pmatrix} x \\ y \end{pmatrix} - \begin{pmatrix} u_1 \\ -u_2 \end{pmatrix} \right] = \begin{pmatrix} \alpha u_1^{\alpha-1} \\ -1 \end{pmatrix}^T \begin{pmatrix} x-u_1 \\ y+u_2 \end{pmatrix} \le 0 \quad \text{(4th quadrant)}. \qquad (6.38d)$$

This means

$$\alpha u_1^{\alpha-1} x - \alpha u_1^{\alpha} + y - u_2 = \alpha u_1^{\alpha-1} x + y - (\alpha u_1^{\alpha} + u_2) \le 0 \qquad (6.39a)$$

$$-\alpha u_1^{\alpha-1} x - \alpha u_1^{\alpha} - y - u_2 = -\left(\alpha u_1^{\alpha-1} x + y\right) - (\alpha u_1^{\alpha} + u_2) \le 0 \qquad (6.39b)$$

$$-\alpha u_1^{\alpha-1} x - \alpha u_1^{\alpha} + y - u_2 = -\alpha u_1^{\alpha-1} x + y - (\alpha u_1^{\alpha} + u_2) \le 0 \qquad (6.39c)$$

$$\alpha u_1^{\alpha-1} x - \alpha u_1^{\alpha} - y - u_2 = \alpha u_1^{\alpha-1} x - y - (\alpha u_1^{\alpha} + u_2) \le 0. \qquad (6.39d)$$

With

$$\alpha u_1^{\alpha} + u_2 = \nabla f \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}^T \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} =: \beta(u_1, u_2) \qquad (6.40)$$

we get the equivalent constraints

$$\alpha u_1^{\alpha-1} x + y - \beta(u_1, u_2) \le 0$$
$$\alpha u_1^{\alpha-1} x + y + \beta(u_1, u_2) \ge 0$$
$$-\alpha u_1^{\alpha-1} x + y - \beta(u_1, u_2) \le 0$$
$$\alpha u_1^{\alpha-1} x - y - \beta(u_1, u_2) \le 0.$$

This yields the double inequalities

$$\left|\alpha u_1^{\alpha-1} x + y\right| \le \beta(u_1, u_2) \qquad (6.41a)$$

$$\left|\alpha u_1^{\alpha-1} x - y\right| \le \beta(u_1, u_2). \qquad (6.41b)$$
 
Thus, a point $u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \in \partial K_0^{\alpha}$, $u_1 > 0$, $u_2 > 0$, therefore generates the inequalities

$$-1 \le x \le 1 \qquad (6.42a)$$
$$-1 \le y \le 1 \qquad (6.42b)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} x + y \le \beta(u_1, u_2) \qquad (6.42c)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} x - y \le \beta(u_1, u_2). \qquad (6.42d)$$

Obviously, each further point $\hat{u} \in \partial K_0^{\alpha}$ with $\hat{u}_1 > 0$, $\hat{u}_2 > 0$ yields additional inequalities of the type (6.42c, d).

Condition (6.42a–d) can be represented in the following vectorial form:

$$-\begin{pmatrix} 1 \\ 1 \end{pmatrix} \le I \begin{pmatrix} x \\ y \end{pmatrix} \le \begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad (6.43a)$$

$$-\beta(u_1, u_2) \begin{pmatrix} 1 \\ 1 \end{pmatrix} \le H(u_1, u_2) \begin{pmatrix} x \\ y \end{pmatrix} \le \beta(u_1, u_2) \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad (6.43b)$$

with the matrices

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad H(u_1, u_2) = \begin{pmatrix} \alpha u_1^{\alpha-1} & 1 \\ \alpha u_1^{\alpha-1} & -1 \end{pmatrix}. \qquad (6.44)$$
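Since $f$ is convex and the cuts (6.42a–d) are supporting tangents, the resulting polyhedron contains $K_0^{\alpha}$. A brief sampling sketch (hypothetical names, illustrative choice of $u$) checks this outer-approximation property.

```python
def beta(u1, u2, alpha):
    """beta(u1, u2) = alpha*u1**alpha + u2, cf. (6.40)."""
    return alpha * u1 ** alpha + u2

def in_outer_polyhedron(x, y, u1, u2, alpha):
    """Conditions (6.42a-d): box constraints plus the two double
    inequalities generated by the boundary point (u1, u2)."""
    b = beta(u1, u2, alpha)
    g = alpha * u1 ** (alpha - 1)
    return (abs(x) <= 1 and abs(y) <= 1
            and abs(g * x + y) <= b and abs(g * x - y) <= b)

# sampling check: every point of K0^alpha satisfies the outer constraints
alpha, u1 = 2.0, 0.5
u2 = 1 - u1 ** alpha          # boundary point: u1**alpha + u2 = 1
for i in range(-20, 21):
    for j in range(-20, 21):
        x, y = i / 20, j / 20
        if abs(x) ** alpha + abs(y) <= 1:      # (x, y) in K0^alpha
            assert in_outer_polyhedron(x, y, u1, u2, alpha)
```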

Choosing a further boundary point $\hat{u}$ of $K_0^{\alpha}$ with $\hat{u}_1 > 0$, $\hat{u}_2 > 0$, we get additional conditions of the type (6.43b).

Using (6.42a–d), for the original yield condition (6.32a, b) we then get the approximative feasibility condition:

i) Negative end of the bar

$$-N_{ipl} \le t_i \le N_{ipl} \qquad (6.45a)$$
$$-M_{ipl} \le m_i^- \le M_{ipl} \qquad (6.45b)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}} + \frac{m_i^-}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.45c)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}} - \frac{m_i^-}{M_{ipl}} \le \beta(u_1, u_2). \qquad (6.45d)$$

ii) Positive end of the bar

$$-N_{ipl} \le t_i \le N_{ipl} \qquad (6.45e)$$
$$-M_{ipl} \le m_i^+ \le M_{ipl} \qquad (6.45f)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}} + \frac{m_i^+}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.45g)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}} - \frac{m_i^+}{M_{ipl}} \le \beta(u_1, u_2). \qquad (6.45h)$$

Defining

$$\Pi^{(i)}\bigl(a(\omega), x\bigr) := \begin{pmatrix} \dfrac{1}{N_{ipl}} & 0 & 0 \\ 0 & \dfrac{1}{M_{ipl}} & 0 \\ 0 & 0 & \dfrac{1}{M_{ipl}} \end{pmatrix}, \quad F_i = \begin{pmatrix} t_i \\ m_i^+ \\ m_i^- \end{pmatrix}, \qquad (6.46)$$

conditions (6.45a–h) can also be represented by

$$-\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \le \Pi^{(i)}\bigl(a(\omega), x\bigr) F_i \le \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \qquad (6.47a)$$

$$-\beta(u_1, u_2) \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \le \begin{pmatrix} \alpha u_1^{\alpha-1} & 1 & 0 \\ \alpha u_1^{\alpha-1} & -1 & 0 \\ \alpha u_1^{\alpha-1} & 0 & 1 \\ \alpha u_1^{\alpha-1} & 0 & -1 \end{pmatrix} \Pi^{(i)}\bigl(a(\omega), x\bigr) F_i \le \beta(u_1, u_2) \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}. \qquad (6.47b)$$

Multiplying (6.45a, c, d, g, h) with $N_{ipl}$, due to

$$\frac{N_{ipl}}{M_{ipl}} = \frac{\sigma_{yi} A_i}{\sigma_{yi} W_{ipl}} = \frac{\sigma_{yi} A_i}{\sigma_{yi} A_i \bar{y}_{ic}} = \frac{1}{\bar{y}_{ic}}, \qquad (6.48)$$

for (6.45a, c, d, g, h) we also have

$$-\beta(u_1, u_2) N_{ipl} \le \alpha u_1^{\alpha-1} t_i + \frac{m_i^-}{\bar{y}_{ic}} \le \beta(u_1, u_2) N_{ipl} \qquad (6.49a)$$

$$-\beta(u_1, u_2) N_{ipl} \le \alpha u_1^{\alpha-1} t_i - \frac{m_i^-}{\bar{y}_{ic}} \le \beta(u_1, u_2) N_{ipl} \qquad (6.49b)$$

$$-\beta(u_1, u_2) N_{ipl} \le \alpha u_1^{\alpha-1} t_i + \frac{m_i^+}{\bar{y}_{ic}} \le \beta(u_1, u_2) N_{ipl} \qquad (6.49c)$$

$$-\beta(u_1, u_2) N_{ipl} \le \alpha u_1^{\alpha-1} t_i - \frac{m_i^+}{\bar{y}_{ic}} \le \beta(u_1, u_2) N_{ipl}. \qquad (6.49d)$$
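The vector form (6.47a, b) can be sketched directly; the following illustration (hypothetical names, pure-Python matrix products) scales the element load vector $F_i$ by the plastic capacities and applies the linearization rows.

```python
def yield_linearized(F, N_pl, M_pl, u1, u2, alpha):
    """Approximate feasibility (6.45a-h) in the vector form (6.47a, b);
    F = (t, m_plus, m_minus) is the element load vector (6.29)/(6.46)."""
    t, m_p, m_m = F
    # capacity scaling of the loads, cf. the diagonal matrix in (6.46)
    z = (t / N_pl, m_p / M_pl, m_m / M_pl)
    # (6.47a): all scaled loads lie in [-1, 1]
    if not all(-1 <= v <= 1 for v in z):
        return False
    g = alpha * u1 ** (alpha - 1)
    b = alpha * u1 ** alpha + u2        # beta(u1, u2), cf. (6.40)
    H = [(g, 1, 0), (g, -1, 0), (g, 0, 1), (g, 0, -1)]   # cf. (6.47b)
    rows = (sum(h * v for h, v in zip(row, z)) for row in H)
    return all(-b <= r <= b for r in rows)
```

With the illustrative boundary point $u = (0.5, 0.75)$ for $\alpha = 2$, the cuts reject load states such as $t_i = N_{ipl}$, $m_i^+ = M_{ipl}$ that the plain box (6.47a) would still accept.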

6.2.4 Approximation of the Yield Condition by Using Reference Capacities

According to (6.31), (6.32a, b) for each bar $i = 1, \ldots, B$ we have the condition

$$\begin{pmatrix} t / N_{pl} \\ m / M_{pl} \end{pmatrix} \in K_0^{\alpha} = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 : |x|^{\alpha} + |y| \le 1 \right\}$$

with $(t, m) = (t_i, m_i^{\pm})$, $(N_{pl}, M_{pl}) = (N_{ipl}, M_{ipl})$.

Selecting fixed reference capacities

$$N_{i0} > 0, \quad M_{i0} > 0, \quad i = 1, \ldots, B,$$

related to the plastic capacities $N_{ipl}$, $M_{ipl}$, we get

$$\left|\frac{t_i}{N_{ipl}}\right|^{\alpha} + \left|\frac{m_i^{\pm}}{M_{ipl}}\right| = \left|\frac{t_i}{N_{i0}} \cdot \frac{1}{N_{ipl}/N_{i0}}\right|^{\alpha} + \left|\frac{m_i^{\pm}}{M_{i0}} \cdot \frac{1}{M_{ipl}/M_{i0}}\right|.$$

Putting

$$\vartheta_i = \vartheta_i\bigl(a(\omega), x\bigr) := \min\left\{ \frac{N_{ipl}}{N_{i0}}, \frac{M_{ipl}}{M_{i0}} \right\}, \qquad (6.50)$$

we have

$$\frac{\vartheta_i}{N_{ipl}/N_{i0}} \le 1, \quad \frac{\vartheta_i}{M_{ipl}/M_{i0}} \le 1,$$

and therefore

$$\left|\frac{t_i}{N_{ipl}}\right|^{\alpha} + \left|\frac{m_i^{\pm}}{M_{ipl}}\right| = \left|\frac{t_i}{\vartheta_i N_{i0}}\right|^{\alpha} \left|\frac{\vartheta_i}{N_{ipl}/N_{i0}}\right|^{\alpha} + \left|\frac{m_i^{\pm}}{\vartheta_i M_{i0}}\right| \left|\frac{\vartheta_i}{M_{ipl}/M_{i0}}\right| \le \left|\frac{t_i}{\vartheta_i N_{i0}}\right|^{\alpha} + \left|\frac{m_i^{\pm}}{\vartheta_i M_{i0}}\right|. \qquad (6.51)$$

Thus, the yield condition (6.30a, b) or (6.32a, b) is guaranteed by

$$\left|\frac{t_i}{\vartheta_i N_{i0}}\right|^{\alpha} + \left|\frac{m_i^{\pm}}{\vartheta_i M_{i0}}\right| \le 1$$

or

$$\begin{pmatrix} \dfrac{t_i}{\vartheta_i N_{i0}} \\[6pt] \dfrac{m_i^{\pm}}{\vartheta_i M_{i0}} \end{pmatrix} \in K_0^{\alpha}. \qquad (6.52)$$
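The sufficiency argument (6.51) can be checked numerically: any load state accepted by the reference-capacity condition (6.52) must also satisfy the exact yield condition. The sketch below (hypothetical names and capacity values) samples a grid of load states.

```python
def theta(N_pl, M_pl, N0, M0):
    """Relative capacity (6.50)."""
    return min(N_pl / N0, M_pl / M0)

def scaled_ok(t, m, N0, M0, th, alpha):
    """Sufficient condition (6.52) with relative capacity th."""
    return abs(t / (th * N0)) ** alpha + abs(m / (th * M0)) <= 1

def exact_ok(t, m, N_pl, M_pl, alpha):
    """Original yield condition (6.30a, b)."""
    return abs(t / N_pl) ** alpha + abs(m / M_pl) <= 1

# sampling check of the implication (6.51): scaled_ok => exact_ok
alpha, N_pl, M_pl, N0, M0 = 2.0, 120.0, 40.0, 100.0, 50.0
th = theta(N_pl, M_pl, N0, M0)    # = min(1.2, 0.8) = 0.8
for i in range(-10, 11):
    for j in range(-10, 11):
        t, m = 12.0 * i, 4.0 * j
        if scaled_ok(t, m, N0, M0, th, alpha):
            assert exact_ok(t, m, N_pl, M_pl, alpha)
```

The implication holds because $\vartheta_i N_{i0} \le N_{ipl}$ and $\vartheta_i M_{i0} \le M_{ipl}$, so the scaled quotients dominate the exact ones term by term.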

Applying the piecewise linearization described in Sect. 6.2.3 to condition (6.52), we obtain, cf. (6.45a–h), the approximation stated below. Of course, conditions (6.45a, b, e, f) are not influenced by this procedure. Thus, we find

$$-N_{ipl} \le t_i \le N_{ipl} \qquad (6.53a)$$
$$-M_{ipl} \le m_i^- \le M_{ipl} \qquad (6.53b)$$
$$-M_{ipl} \le m_i^+ \le M_{ipl} \qquad (6.53c)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{\vartheta_i N_{i0}} + \frac{m_i^-}{\vartheta_i M_{i0}} \le \beta(u_1, u_2) \qquad (6.53d)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{\vartheta_i N_{i0}} - \frac{m_i^-}{\vartheta_i M_{i0}} \le \beta(u_1, u_2) \qquad (6.53e)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{\vartheta_i N_{i0}} + \frac{m_i^+}{\vartheta_i M_{i0}} \le \beta(u_1, u_2) \qquad (6.53f)$$
$$-\beta(u_1, u_2) \le \alpha u_1^{\alpha-1} \frac{t_i}{\vartheta_i N_{i0}} - \frac{m_i^+}{\vartheta_i M_{i0}} \le \beta(u_1, u_2). \qquad (6.53g)$$

Remark 6.4 Multiplying with $\vartheta_i$ we get quotients $\frac{t_i}{N_{i0}}$, $\frac{m_i^{\pm}}{M_{i0}}$ with fixed denominators. Hence, multiplying (6.53d–g) with $\vartheta_i$, we get the equivalent system

$$-N_{ipl} \le t_i \le N_{ipl} \qquad (6.54a)$$
$$-M_{ipl} \le m_i^- \le M_{ipl} \qquad (6.54b)$$
$$-M_{ipl} \le m_i^+ \le M_{ipl} \qquad (6.54c)$$
$$-\beta(u_1, u_2)\,\vartheta_i \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^-}{M_{i0}} \le \beta(u_1, u_2)\,\vartheta_i \qquad (6.54d)$$
$$-\beta(u_1, u_2)\,\vartheta_i \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^-}{M_{i0}} \le \beta(u_1, u_2)\,\vartheta_i \qquad (6.54e)$$
$$-\beta(u_1, u_2)\,\vartheta_i \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^+}{M_{i0}} \le \beta(u_1, u_2)\,\vartheta_i \qquad (6.54f)$$
$$-\beta(u_1, u_2)\,\vartheta_i \le \alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^+}{M_{i0}} \le \beta(u_1, u_2)\,\vartheta_i. \qquad (6.54g)$$

Obviously, (6.54a–g) can also be represented in the following form:

$$|t_i| \le N_{ipl} \qquad (6.55a)$$
$$|m_i^-| \le M_{ipl} \qquad (6.55b)$$
$$|m_i^+| \le M_{ipl} \qquad (6.55c)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^-}{M_{i0}}\right| \le \beta(u_1, u_2)\,\vartheta_i \qquad (6.55d)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^-}{M_{i0}}\right| \le \beta(u_1, u_2)\,\vartheta_i \qquad (6.55e)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^+}{M_{i0}}\right| \le \beta(u_1, u_2)\,\vartheta_i \qquad (6.55f)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^+}{M_{i0}}\right| \le \beta(u_1, u_2)\,\vartheta_i. \qquad (6.55g)$$

By means of the definition (6.50) of $\vartheta_i$, system (6.55a–g) reads

$$|t_i| \le N_{ipl} \qquad (6.56a)$$
$$|m_i^-| \le M_{ipl} \qquad (6.56b)$$
$$|m_i^+| \le M_{ipl} \qquad (6.56c)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^-}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{N_{ipl}}{N_{i0}} \qquad (6.56d)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^-}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{M_{ipl}}{M_{i0}} \qquad (6.56e)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^-}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{N_{ipl}}{N_{i0}} \qquad (6.56f)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^-}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{M_{ipl}}{M_{i0}} \qquad (6.56g)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^+}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{N_{ipl}}{N_{i0}} \qquad (6.56h)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} + \frac{m_i^+}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{M_{ipl}}{M_{i0}} \qquad (6.56i)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^+}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{N_{ipl}}{N_{i0}} \qquad (6.56j)$$
$$\left|\alpha u_1^{\alpha-1} \frac{t_i}{N_{i0}} - \frac{m_i^+}{M_{i0}}\right| \le \beta(u_1, u_2) \frac{M_{ipl}}{M_{i0}}. \qquad (6.56k)$$
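The passage from (6.55d–g) to (6.56d–k) rests on the elementary fact that a bound by $\beta \min\{a, b\}$ is the same as the pair of bounds $\beta a$ and $\beta b$. A short sketch (hypothetical names, illustrative values) confirms this equivalence.

```python
def within_min(z, beta_, a, b):
    """|z| <= beta * min(a, b), the form of (6.55d-g)."""
    return abs(z) <= beta_ * min(a, b)

def within_both(z, beta_, a, b):
    """|z| <= beta*a and |z| <= beta*b, the split form (6.56d-k)."""
    return abs(z) <= beta_ * a and abs(z) <= beta_ * b

# the two formulations agree for arbitrary samples
for z in (-2.0, -0.5, 0.0, 0.7, 1.3, 2.5):
    for a, b in ((1.2, 0.8), (0.8, 1.2), (1.0, 1.0)):
        assert within_min(z, 1.25, a, b) == within_both(z, 1.25, a, b)
```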

Corresponding to Remark 6.4, the variables

$$t_i,\ m_i^+,\ m_i^-,\ A_i \ \text{or} \ x$$

enter linearly. Increasing the accuracy of the approximation by taking a further point $\hat{u} = (\hat{u}_1, \hat{u}_2)$ with the related points $(-\hat{u}_1, \hat{u}_2)$, $(-\hat{u}_1, -\hat{u}_2)$, $(\hat{u}_1, -\hat{u}_2)$, we obtain further inequalities of the type (6.55d–g), (6.56d–k), respectively.

6.2.5 Asymmetric Yield Stresses

Symmetric yield stresses with respect to compression and tension, cf. (6.28a), do not hold for all materials. Thus, we still have to consider the case

$$\sigma_{yi}^L \ne -\sigma_{yi}^U \quad \text{for some } i \in \{1, \ldots, B\}. \qquad (6.57)$$

In case (6.57) we have different plastic capacities $N_{ipl}^{L} \ne N_{ipl}^{U}$ with respect to compression and tension. This must be taken into account in the yield condition (6.30a, b), hence,

$$\left|\frac{t_i}{N_{ipl}}\right|^{\alpha} + \left|\frac{m_i^{\pm}}{M_{ipl}}\right| \le 1.$$

Corresponding to (6.20), (6.23a, b) for $\alpha = 2$, the feasible domain $K_0^{\alpha}$ is represented by the intersection of convex sets $K_0^{\alpha L}$, $K_0^{\alpha U}$ defined in the same way as $K_0^{2L}$, $K_0^{2U}$ in (6.23a, b).

Corresponding to Sect. 6.2.4 we then select a vector

$$\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \in \partial K_0^{\alpha} \quad \text{with } u_1 > 0,\ u_2 > 0, \qquad (6.58a)$$

hence,

$$u_1^{\alpha} + u_2 = 1, \quad u_1 > 0,\ u_2 > 0, \qquad (6.58b)$$

see (6.33a). More accurate approximations follow by choosing additional points $\hat{u} = (\hat{u}_1, \hat{u}_2)$.

The set $K_0^{\alpha U}$ is then approximated, cf. (6.39a–d), by

$$x \le 1 \qquad (6.59a)$$
$$-1 \le y \le 1 \qquad (6.59b)$$
$$\alpha u_1^{\alpha-1} x + y \le \beta(u_1, u_2) \qquad (6.59c)$$
$$\alpha u_1^{\alpha-1} x - y \le \beta(u_1, u_2), \qquad (6.59d)$$

where, cf. (6.40),

$$\beta(u_1, u_2) = \alpha u_1^{\alpha} + u_2. \qquad (6.59e)$$

Consider now the points of $K_0^{\alpha U}$ such that

$$\begin{pmatrix} x \\ y \end{pmatrix}, \quad x \le 0,\ -1 \le y \le 1. \qquad (6.60)$$

Because of

$$\beta(u_1, u_2) = \alpha u_1^{\alpha} + u_2 > u_1^{\alpha} + u_2 = 1 \qquad (6.61)$$

and $\alpha > 1$, for the points of $K_0^{\alpha U}$ with (6.60) we have

$$\alpha u_1^{\alpha-1} x + y \le y \le 1 < \beta(u_1, u_2) \qquad (6.62a)$$
$$\alpha u_1^{\alpha-1} x - y \le -y \le 1 < \beta(u_1, u_2). \qquad (6.62b)$$

Obviously, the piecewise linearization (6.59a–d) of $K_0^{\alpha U}$ affects only the case $x > 0$.

Corresponding to (6.59a–e), the set $K_0^{\alpha L}$ is piecewise linearized by means of the conditions

$$-1 \le x \qquad (6.63a)$$
$$-1 \le y \le 1 \qquad (6.63b)$$
$$-\alpha u_1^{\alpha-1} x + y \le \beta(u_1, u_2) \qquad (6.63c)$$
$$-\alpha u_1^{\alpha-1} x - y \le \beta(u_1, u_2). \qquad (6.63d)$$

Considering here the points

$$\begin{pmatrix} x \\ y \end{pmatrix}, \quad x \ge 0,\ -1 \le y \le 1, \qquad (6.64)$$

of $K_0^{\alpha L}$, then corresponding to (6.60) we have

$$-\alpha u_1^{\alpha-1} x + y \le y \le 1 < \beta(u_1, u_2) \qquad (6.65a)$$
$$-\alpha u_1^{\alpha-1} x - y \le -y \le 1 < \beta(u_1, u_2). \qquad (6.65b)$$

Hence, the piecewise linearization (6.63a–d) of $K_0^{\alpha L}$ affects the case $x < 0$ only.
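The one-sidedness argument (6.60)–(6.62) can be checked by sampling: for $x \le 0$ the cuts (6.59c, d) of $K_0^{\alpha U}$ are never active, since $\beta(u_1, u_2) > 1$ by (6.61). The sketch below (hypothetical names, illustrative boundary point) does exactly this.

```python
def beta(u1, u2, alpha):
    """beta(u1, u2) = alpha*u1**alpha + u2, cf. (6.59e)."""
    return alpha * u1 ** alpha + u2

def upper_cuts_ok(x, y, u1, u2, alpha):
    """The two cuts (6.59c, d) linearizing K0^{alpha U}."""
    g, b = alpha * u1 ** (alpha - 1), beta(u1, u2, alpha)
    return g * x + y <= b and g * x - y <= b

alpha, u1 = 2.0, 0.5
u2 = 1 - u1 ** alpha            # boundary point: u1**alpha + u2 = 1
assert beta(u1, u2, alpha) > 1  # (6.61)
# per (6.60)-(6.62): for x <= 0, |y| <= 1 the cuts never bind,
# so the linearization of K0^{alpha U} restricts only x > 0
for i in range(0, 21):
    for j in range(-10, 11):
        x, y = -i / 20, j / 10
        assert upper_cuts_ok(x, y, u1, u2, alpha)
```

By the sign symmetry $x \to -x$, the analogous sampling with the cuts (6.63c, d) shows that the linearization of $K_0^{\alpha L}$ binds only for $x < 0$.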

Formulation of the Yield Condition in the Non-symmetric Case

$$\sigma_{yi}^L \ne -\sigma_{yi}^U.$$

Since the plastic capacities $N_{ipl}^{U} = \sigma_{yi}^U A_i$ and $N_{ipl}^{L} = |\sigma_{yi}^L| A_i$ with respect to tension and compression are different, we have to treat the cases $t_i > 0$ (tension) and $t_i < 0$ (compression) separately.

Because of $K_0^{\alpha} = K_0^{\alpha L} \cap K_0^{\alpha U}$, we obtain the following conditions: From (6.59a–d), hence, for $K_0^{\alpha U}$ we get

$$\frac{t_i}{N_{ipl}^{U}} \le 1 \qquad (6.66a)$$
$$\left|\frac{m_i^{\pm}}{M_{ipl}}\right| \le 1 \qquad (6.66b)$$
$$\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{U}} + \frac{m_i^{\pm}}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.66c)$$
$$\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{U}} - \frac{m_i^{\pm}}{M_{ipl}} \le \beta(u_1, u_2). \qquad (6.66d)$$

Moreover, from (6.63a–d), hence, for $K_0^{\alpha L}$ we get

$$-1 \le \frac{t_i}{N_{ipl}^{L}} \qquad (6.67a)$$
$$\left|\frac{m_i^{\pm}}{M_{ipl}}\right| \le 1 \qquad (6.67b)$$
$$-\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{L}} + \frac{m_i^{\pm}}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.67c)$$
$$-\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{L}} - \frac{m_i^{\pm}}{M_{ipl}} \le \beta(u_1, u_2). \qquad (6.67d)$$

Summarizing the above conditions, we have

$$-N_{ipl}^{L} \le t_i \le N_{ipl}^{U} \qquad (6.68a)$$
$$-M_{ipl} \le m_i^- \le M_{ipl} \qquad (6.68b)$$
$$-M_{ipl} \le m_i^+ \le M_{ipl} \qquad (6.68c)$$
$$\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{U}} + \frac{m_i^+}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.68d)$$
$$\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{U}} - \frac{m_i^+}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.68e)$$
$$-\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{L}} + \frac{m_i^+}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.68f)$$
$$-\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{L}} - \frac{m_i^+}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.68g)$$
$$\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{U}} + \frac{m_i^-}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.68h)$$
$$\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{U}} - \frac{m_i^-}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.68i)$$
$$-\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{L}} + \frac{m_i^-}{M_{ipl}} \le \beta(u_1, u_2) \qquad (6.68j)$$
$$-\alpha u_1^{\alpha-1} \frac{t_i}{N_{ipl}^{L}} - \frac{m_i^-}{M_{ipl}} \le \beta(u_1, u_2). \qquad (6.68k)$$

Corresponding to (6.49a–d), the inequalities (6.68d–k) may be multiplied by $N_{ipl}^{U}$, $N_{ipl}^{L}$, resp., in order to get denominators independent of $A_i$, $x$, respectively.

Two-sided constraints are obtained by introducing in (6.68d–k) the lower bounds

$$-\gamma\,\beta(u_1, u_2) \qquad (6.69)$$

with a sufficiently large $\gamma > 0$.
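For illustration (not from the text), the full asymmetric system (6.68a–k) for one bar can be sketched as follows; the function name and the numeric capacities are hypothetical.

```python
def asym_feasible(t, m_p, m_m, N_U, N_L, M_pl, u1, u2, alpha):
    """Conditions (6.68a-k): different tensile (N_U) and compressive
    (N_L) plastic capacities; cuts from both K0^{alpha U}, cf. (6.66),
    and K0^{alpha L}, cf. (6.67)."""
    g = alpha * u1 ** (alpha - 1)
    b = alpha * u1 ** alpha + u2
    if not (-N_L <= t <= N_U):                   # (6.68a)
        return False
    for m in (m_m, m_p):                         # (6.68b, c)
        if abs(m) > M_pl:
            return False
        for s in (+1, -1):                       # (6.68d-k)
            if g * t / N_U + s * m / M_pl > b:   # cuts on the tension side
                return False
            if -g * t / N_L + s * m / M_pl > b:  # cuts on the compression side
                return False
    return True
```

Note that both families of cuts are imposed simultaneously, exactly as in the intersection $K_0^{\alpha} = K_0^{\alpha L} \cap K_0^{\alpha U}$; by (6.60)–(6.65) only one family binds for each sign of $t_i$.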

Use of Reference Capacities

Corresponding to Sect. 6.2.4 we introduce here reference capacities

$$N_{i0}^{L},\ N_{i0}^{U},\ M_{i0} > 0, \quad i = 1, \ldots, B,$$

for the plastic capacities $N_{ipl}^{L}$, $N_{ipl}^{U}$, $M_{ipl}$. Consider then the relative capacities $\vartheta_i^L$, $\vartheta_i^U$, cf. (6.50),

$$\vartheta_i^L := \min\left\{ \frac{N_{ipl}^{L}}{N_{i0}^{L}}, \frac{M_{ipl}}{M_{i0}} \right\}, \qquad (6.70a)$$

$$\vartheta_i^U := \min\left\{ \frac{N_{ipl}^{U}}{N_{i0}^{U}}, \frac{M_{ipl}}{M_{i0}} \right\}. \qquad (6.70b)$$

Based again on (6.52) and the representation $K_0^{\alpha} = K_0^{\alpha L} \cap K_0^{\alpha U}$, corresponding to (6.53d–g) we get

$$\alpha u_1^{\alpha-1} \frac{t_i}{\vartheta_i^U N_{i0}^{U}} + \frac{m_i^{\pm}}{\vartheta_i^U M_{i0}} \le \beta(u_1, u_2) \qquad (6.71a)$$