
Applications to Regular

and Bang-Bang Control



Advances in Design and Control
SIAM’s Advances in Design and Control series consists of texts and monographs dealing with all areas of
design and control and their applications. Topics of interest include shape optimization, multidisciplinary
design, trajectory optimization, feedback, and optimal control. The series focuses on the mathematical and
computational aspects of engineering design and control that are usable in a wide variety of scientific and
engineering disciplines.

Editor-in-Chief
Ralph C. Smith, North Carolina State University

Editorial Board
Athanasios C. Antoulas, Rice University
Siva Banda, Air Force Research Laboratory
Belinda A. Batten, Oregon State University
John Betts, The Boeing Company (retired)
Stephen L. Campbell, North Carolina State University
Michel C. Delfour, University of Montreal
Max D. Gunzburger, Florida State University
J. William Helton, University of California, San Diego
Arthur J. Krener, University of California, Davis
Kirsten Morris, University of Waterloo

Series Volumes
Osmolovskii, Nikolai P. and Maurer, Helmut, Applications to Regular and Bang-Bang Control: Second-Order
Necessary and Sufficient Optimality Conditions in Calculus of Variations and Optimal Control
Biegler, Lorenz T., Campbell, Stephen L., and Mehrmann, Volker, eds., Control and Optimization with
Differential-Algebraic Constraints
Delfour, M. C. and Zolésio, J.-P., Shapes and Geometries: Metrics, Analysis, Differential Calculus, and
Optimization, Second Edition
Hovakimyan, Naira and Cao, Chengyu, L1 Adaptive Control Theory: Guaranteed Robustness with Fast Adaptation
Speyer, Jason L. and Jacobson, David H., Primer on Optimal Control Theory
Betts, John T., Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, Second
Edition
Shima, Tal and Rasmussen, Steven, eds., UAV Cooperative Decision and Control: Challenges and Practical
Approaches
Speyer, Jason L. and Chung, Walter H., Stochastic Processes, Estimation, and Control
Krstic, Miroslav and Smyshlyaev, Andrey, Boundary Control of PDEs: A Course on Backstepping Designs
Ito, Kazufumi and Kunisch, Karl, Lagrange Multiplier Approach to Variational Problems and Applications
Xue, Dingyü, Chen, YangQuan, and Atherton, Derek P., Linear Feedback Control: Analysis and Design
with MATLAB
Hanson, Floyd B., Applied Stochastic Processes and Control for Jump-Diffusions: Modeling, Analysis, and
Computation
Michiels, Wim and Niculescu, Silviu-Iulian, Stability and Stabilization of Time-Delay Systems: An Eigenvalue-
Based Approach
Ioannou, Petros and Fidan, Barış, Adaptive Control Tutorial
Bhaya, Amit and Kaszkurewicz, Eugenius, Control Perspectives on Numerical Algorithms and Matrix Problems
Robinett III, Rush D., Wilson, David G., Eisler, G. Richard, and Hurtado, John E., Applied Dynamic Programming
for Optimization of Dynamical Systems
Huang, J., Nonlinear Output Regulation: Theory and Applications
Haslinger, J. and Mäkinen, R. A. E., Introduction to Shape Optimization: Theory, Approximation, and Computation
Antoulas, Athanasios C., Approximation of Large-Scale Dynamical Systems
Gunzburger, Max D., Perspectives in Flow Control and Optimization
Delfour, M. C. and Zolésio, J.-P., Shapes and Geometries: Analysis, Differential Calculus, and Optimization
Betts, John T., Practical Methods for Optimal Control Using Nonlinear Programming
El Ghaoui, Laurent and Niculescu, Silviu-Iulian, eds., Advances in Linear Matrix Inequality Methods in Control
Helton, J. William and James, Matthew R., Extending H∞ Control to Nonlinear Systems: Control of Nonlinear
Systems to Achieve Performance Objectives



Applications to Regular
and Bang-Bang Control
Second-Order Necessary and Sufficient
Optimality Conditions in Calculus
of Variations and Optimal Control

Nikolai P. Osmolovskii
Systems Research Institute
Warszawa, Poland
University of Technology and Humanities in Radom
Radom, Poland
University of Natural Sciences and Humanities in Siedlce
Siedlce, Poland
Moscow State University
Moscow, Russia

Helmut Maurer
Institute of Computational and Applied Mathematics
Westfälische Wilhelms-Universität Münster
Münster, Germany

Society for Industrial and Applied Mathematics


Philadelphia



Copyright © 2012 by the Society for Industrial and Applied Mathematics

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced,
stored, or transmitted in any manner without the written permission of the publisher. For information,
write to the Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia,
PA 19104-2688 USA.

No warranties, express or implied, are made by the publisher, authors, and their employers that the
programs contained in this volume are free of error. They should not be relied on as the sole basis to
solve a problem whose incorrect solution could result in injury to person or property. If the programs are
employed in such a manner, it is at the user’s own risk and the publisher, authors, and their employers
disclaim all liability for such misuse.

Trademarked names may be used in this book without the inclusion of a trademark symbol. These names
are used in an editorial context only; no infringement of trademark is intended.

GNUPLOT Copyright © 1986–1993, 1998, 2004 Thomas Williams, Colin Kelley.

Figure 8.3 reprinted with permission from John Wiley & Sons, Ltd.
Figure 8.5 reprinted with permission from Springer Science+Business Media.
Figure 8.8 reprinted with permission from Elsevier.

Library of Congress Cataloging-in-Publication Data

Osmolovskii, N. P. (Nikolai Pavlovich), 1948-


Applications to regular and bang-bang control : second-order necessary and sufficient optimality condi-
tions in calculus of variations and optimal control / Nikolai P. Osmolovskii, Helmut Maurer.
p. cm. -- (Advances in design and control ; 24)
Includes bibliographical references and index.
ISBN 978-1-611972-35-1
1. Calculus of variations. 2. Control theory. 3. Mathematical optimization. 4. Switching theory. I.
Maurer, Helmut. II. Title.
QA315.O86 2012
515’.64--dc23 2012025629




For our wives, Alla and Gisela



Contents

List of Figures xi
Notation xiii
Preface xvii
Introduction 1

I Second-Order Optimality Conditions for Broken Extremals in the Calculus of Variations 7

1 Abstract Scheme for Obtaining Higher-Order Conditions in Smooth Extremal Problems with Constraints 9
1.1 Main Concepts and Main Theorem 9
1.2 Proof of the Main Theorem 15
1.3 Simple Applications of the Abstract Scheme 21

2 Quadratic Conditions in the General Problem of the Calculus of Variations 27
2.1 Statements of Quadratic Conditions for a Pontryagin Minimum 27
2.2 Basic Constant and the Problem of Its Decoding 34
2.3 Local Sequences, Higher Order γ, Representation of the Lagrange Function on Local Sequences with Accuracy up to o(γ) 39
2.4 Estimation of the Basic Constant from Above 54
2.5 Estimation of the Basic Constant from Below 75
2.6 Completing the Proof of Theorem 2.4 102
2.7 Sufficient Conditions for Bounded Strong and Strong Minima in the Problem on a Fixed Time Interval 115

3 Quadratic Conditions for Optimal Control Problems with Mixed Control-State Constraints 127
3.1 Quadratic Necessary Conditions in the Problem with Mixed Control-State Equality Constraints on a Fixed Time Interval 127
3.2 Quadratic Sufficient Conditions in the Problem with Mixed Control-State Equality Constraints on a Fixed Time Interval 138
3.3 Quadratic Conditions in the Problem with Mixed Control-State Equality Constraints on a Variable Time Interval 150
3.4 Quadratic Conditions for Optimal Control Problems with Mixed Control-State Equality and Inequality Constraints 164

4 Jacobi-Type Conditions and Riccati Equation for Broken Extremals 183
4.1 Jacobi-Type Conditions and Riccati Equation for Broken Extremals in the Simplest Problem of the Calculus of Variations 183
4.2 Riccati Equation for Broken Extremal in the General Problem of the Calculus of Variations 214

II Second-Order Optimality Conditions in Optimal Bang-Bang Control Problems 221

5 Second-Order Optimality Conditions in Optimal Control Problems Linear in a Part of Controls 223
5.1 Quadratic Optimality Conditions in the Problem on a Fixed Time Interval 223
5.2 Quadratic Optimality Conditions in the Problem on a Variable Time Interval 237
5.3 Riccati Approach 245
5.4 Numerical Example: Optimal Control of Production and Maintenance 248

6 Second-Order Optimality Conditions for Bang-Bang Control 255
6.1 Bang-Bang Control Problems on Nonfixed Time Intervals 255
6.2 Quadratic Necessary and Sufficient Optimality Conditions 259
6.3 Sufficient Conditions for Positive Definiteness of the Quadratic Form Ω on the Critical Cone K 266
6.4 Example: Minimal Fuel Consumption of a Car 272
6.5 Quadratic Optimality Conditions in Time-Optimal Bang-Bang Control Problems 274
6.6 Sufficient Conditions for Positive Definiteness of the Quadratic Form Ω on the Critical Subspace K for Time-Optimal Control Problems 281
6.7 Numerical Examples of Time-Optimal Control Problems 286
6.8 Time-Optimal Control Problems for Linear Systems with Constant Entries 293

7 Bang-Bang Control Problem and Its Induced Optimization Problem 299
7.1 Main Results 299
7.2 First-Order Derivatives of x(tf; t0, x0, θ) with Respect to t0, tf, x0, and θ. Lagrange Multipliers and Critical Cones 305
7.3 Second-Order Derivatives of x(tf; t0, x0, θ) with Respect to t0, tf, x0, and θ 310
7.4 Explicit Representation of the Quadratic Form for the Induced Optimization Problem 319
7.5 Equivalence of the Quadratic Forms in the Basic and Induced Optimization Problem 333

8 Numerical Methods for Solving the Induced Optimization Problem and Applications 339
8.1 The Arc-Parametrization Method 339
8.2 Time-Optimal Control of the Rayleigh Equation Revisited 344
8.3 Time-Optimal Control of a Two-Link Robot 346
8.4 Time-Optimal Control of a Single Mode Semiconductor Laser 353
8.5 Optimal Control of a Batch Reactor 357
8.6 Optimal Production and Maintenance with L1-Functional 361
8.7 Van der Pol Oscillator with Bang-Singular Control 365

Bibliography 367

Index 377
List of Figures

2.1 Neighborhoods of the control at a point t1 of discontinuity. 40
2.2 Definition of functions (t, v) on neighborhoods of discontinuity points. 51
4.1 Tunnel-diode oscillator. L denotes inductivity, C capacity, R resistance, I electric current, and D diode. 203
4.2 Rayleigh problem with regular control. (a) State variables. (b) Control. (c) Adjoint variables. (d) Solutions of the Riccati equation (4.140). 204
4.3 Top left: Extremals x(1), x(2) (lower graph). Top right: Variational solutions y(1) and y(2) (lower graph) to (4.145). Bottom: Envelope of neighboring extremals illustrating the conjugate point tc = 0.674437. 206
5.1 Optimal production and maintenance, final time tf = 0.9. (a) State variables x1, x2. (b) Regular production control v and bang-bang maintenance control m. (c) Adjoint variables ψ1, ψ2. (d) Maintenance control m with switching function φm. 250
5.2 Optimal production and maintenance, final time tf = 1.1. (a) State variables x1, x2. (b) Regular production control v and bang-singular-bang maintenance control m. (c) Adjoint variables ψ1, ψ2. (d) Maintenance control m with switching function φm. 252
6.1 Minimal fuel consumption of a car. (a) State variables x1, x2. (b) Bang-bang control u. (c) Adjoint variable ψ2. (d) Switching function φ. 274
6.2 Time-optimal solution of the van der Pol oscillator, fixed terminal state (6.120). (a) State variables x1 and x2 (dashed line). (b) Control u and switching function ψ2 (dashed line). (c) Phase portrait (x1, x2). (d) Adjoint variables ψ1 and ψ2 (dashed line). 287
6.3 Time-optimal solution of the van der Pol oscillator, nonlinear boundary condition (6.129). (a) State variables x1 and x2 (dashed line). (b) Control u and switching function ψ2 (dashed line). (c) Phase portrait (x1, x2). (d) Adjoint variables ψ1 and ψ2 (dashed line). 289
6.4 Time-optimal control of the Rayleigh equation. (a) State variables x1 and x2 (dashed line). (b) Control u and switching function φ (dashed line). (c) Phase portrait (x1, x2). (d) Adjoint variables ψ1 and ψ2 (dashed line). 292
8.1 Time-optimal control of the Rayleigh equation with boundary conditions (8.31). (a) Bang-bang control and scaled switching function (×4). (b) State variables x1 and x2. 345
8.2 Time-optimal control of the Rayleigh equation with boundary condition (8.40). (a) Bang-bang control u and scaled switching function φ (dashed line). (b) State variables x1, x2. 346
8.3 Two-link robot [67]: upper arm OQ, lower arm OP, and angles q1 and q2. 347
8.4 Control of the two-link robot (8.44)–(8.47). (a) Control u1 and scaled switching function φ1 (dashed line). (b) Control u2 and scaled switching function φ2 (dashed line). (c) Angle q1 and velocity ω1. (d) Angle q2 and velocity ω2. 349
8.5 Control of the two-link robot (8.53)–(8.57). (a) Control u1. (b) Control u2. (c) Angle q1 and velocity ω1. (d) Angle q2 and velocity ω2 [17]. 351
8.6 Control of the two-link robot (8.53)–(8.57): Second solution. (a) Control u1. (b) Control u2. (c) Angle q1 and velocity ω1. (d) Angle q2 and velocity ω2. 352
8.7 Time-optimal control of a semiconductor laser. (a) Normalized photon density S(t) × 10⁻⁵. (b) Normalized carrier density N(t) × 10⁻⁸. (c) Electric current (control) I(t) with I(t) = I0 = 20.5 for t < 0 and I(t) = I∞ = 42.5 for t > tf. (d) Adjoint variables ψS(t), ψN(t). 356
8.8 Normalized photon number S(t) for I(t) ≡ 42.5 mA and optimal I(t) [46]. 356
8.9 Schematic of a batch reactor with two control variables. 357
8.10 Control of a batch reactor with functional (8.79). Top row: Control u = (FB, Q) and scaled switching functions. Middle row: Molar concentrations MA and MB. Bottom row: Molar concentrations (MC, MD) and energy holdup H. 360
8.11 Control of a batch reactor with functional (8.89). Top row: Control u = (FB, Q) and scaled switching functions. Middle row: Molar concentrations MA, MB. Bottom row: Molar concentrations (MC, MD) and energy holdup H. 362
8.12 Optimal production and maintenance with L1-functional (8.95). (a) State variables x and y. (b) Control variables v and m. (c), (d) Control variables and switching functions. 364
8.13 Control of the van der Pol oscillator with regulator functional. (a) Bang-singular control u. (b) State variables x1 and x2. 366
Notation

{x | P(x)} : set of elements x with the property P.
R : set of real numbers.
ξ+ := max{ξ, 0} : positive part of ξ ∈ R.
x ∈ Rn : column vector with components x1, . . . , xn ∈ R.
y ∈ Rn∗ : ⇔ y = (y1, . . . , yn), row vector with yi ∈ R.
x∗ ∈ Rn∗ : x∗ = (x1, . . . , xn) for x ∈ Rn.
yx = Σ_{i=1}^n yi xi : x ∈ Rn, y ∈ Rn∗.
⟨x, y⟩ = Σ_{i=1}^n xi yi : x, y ∈ Rn.
d(a) : dimension of the vector a.
‖·‖_X : norm in the normed space X.
[a, b] ⊂ X : [a, b] := {x ∈ X | x = αa + (1 − α)b, where α ∈ [0, 1]}, closed interval in the linear space X with endpoints a, b.
Ā : closure of the set A.
X∗ : dual space to the normed space X.
⟨x∗, x⟩ : value of the linear functional x∗ ∈ X∗ at the point x ∈ X.
K∗ : K∗ := {x∗ ∈ X∗ | ⟨x∗, x⟩ ≥ 0 for all x ∈ K}, dual cone of K.
g′(x0) : Fréchet derivative of the mapping g : X → Y at x0.
J(x) → min : J(x) → min, fi(x) ≤ 0, i = 1, . . . , k, g(x) = 0 (g : X → Y); abstract optimization problem in the space X.
σ(δx) : σ(δx) = max{m(δx), ‖g(x0 + δx)‖}, m(δx) = max_{i=0,...,k} fi(x0 + δx), f0(x) := J(x) − J(x0); violation function in the abstract optimization problem at x0.
λ = (α0, . . . , αk, y∗) : Lagrange multipliers in the abstract optimization problem.
L(λ, x) : L(λ, x) = α0 J(x) + Σ_{i=1}^k αi fi(x) + ⟨y∗, g(x)⟩, Lagrange function in the abstract optimization problem.
Λ0 : Λ0 = {λ | αi ≥ 0 (i = 0, . . . , k), αi fi(x̂) = 0 (i = 1, . . . , k), Σ_{i=0}^k αi + ‖y∗‖ = 1, Lx(λ, x̂) = 0}, Lx = ∂L/∂x; set of the normed tuples of Lagrange multipliers at x̂.
K : K = {x̄ ∈ X | ⟨fi′(x0), x̄⟩ ≤ 0, i ∈ I ∪ {0}; g′(x0)x̄ = 0}, cone of critical directions (critical cone) at the point x0; I = {i ∈ {1, . . . , k} | fi(x0) = 0}, set of active indices at x0.
z(·) : element of a function space; z(t) is the value of z(·) at t.
x(t) : state variable, x ∈ R^{d(x)}.
u(t) : control variable, u ∈ R^{d(u)}.
Δ = [t0, tf] : time interval.
w := (x, u) : state x and control u.
p = (x0, xf) : ∈ R^{2d(x)}, if Δ is a fixed time interval; x0 := x(t0), xf := x(tf).
p := (t0, x0, tf, xf) : ∈ R^{2+2d(x)}, if Δ is a variable time interval.
ψ(t) : adjoint variable, ψ ∈ R^{d(x)∗}.
ẋ = f(t, x, u) : control system, where ẋ := dx/dt.
H(t, x, u, ψ) : = ψf(t, x, u), Pontryagin function or Hamiltonian.
Hx, Hu : partial derivatives of H with respect to x and u; e.g., Hx := ∂H/∂x = (∂H/∂x1, . . . , ∂H/∂xn) ∈ Rn∗, n = d(x).
Hux : second partial derivative of H with respect to x and u, the m × n matrix with entries (Hux)ij = ∂²H/(∂ui ∂xj), m = d(u), n = d(x).
w0(t) = (x0(t), u0(t)) : pair satisfying the constraints of an optimal control problem.
δx(t) or x̄(t) : variation of the state x0(t).
δu(t) or ū(t) : variation of the control u0(t).
⟨Hux x̄, ū⟩ : = Σ_{i=1}^m Σ_{j=1}^n (∂²H/(∂ui ∂xj)) ūi x̄j, m = d(u), n = d(x).
C([t0, tf], Rn) : space of continuous vector functions x : [t0, tf] → Rn with norm ‖x(·)‖∞ = ‖x(·)‖_C = max_{t∈[t0,tf]} |x(t)|.
PC([t0, tf], Rn) : class of piecewise continuous functions u : [t0, tf] → Rn.
Θ = {t1, . . . , ts} : set of discontinuity points of u0(·) ∈ PC([t0, tf], Rn); u0k− = u0(tk−), u0k+ = u0(tk+), [u0]k = u0k+ − u0k−.
C¹([t0, tf], Rn) : space of continuously differentiable functions x : [t0, tf] → Rn endowed with the norm ‖x(·)‖_{C¹} = max{‖x(·)‖∞, ‖ẋ(·)‖∞}.
PC¹([t0, tf], Rn) : class of continuous functions x : [t0, tf] → Rn with a piecewise continuous derivative.
L¹([t0, tf], Rm) : space of Lebesgue integrable functions u : [t0, tf] → Rm endowed with the norm ‖u(·)‖₁ = ∫_{t0}^{tf} |u(t)| dt.
L²([t0, tf], Rm) : Hilbert space of square Lebesgue integrable functions u : [t0, tf] → Rm with the inner product ⟨u(·), v(·)⟩ = ∫_{t0}^{tf} ⟨u(t), v(t)⟩ dt.
L∞([t0, tf], Rm) : space of bounded measurable functions u : [t0, tf] → Rm endowed with the norm ‖u(·)‖∞ = ess sup_{t∈[t0,tf]} |u(t)|.
W^{1,1}([t0, tf], Rn) : space of absolutely continuous functions x : [t0, tf] → Rn endowed with the norm ‖x(·)‖_{1,1} = |x(t0)| + ∫_{t0}^{tf} |ẋ(t)| dt.
W^{1,2}([t0, tf], Rn) : Hilbert space of absolutely continuous functions x : [t0, tf] → Rn with square integrable derivative and inner product ⟨x(·), y(·)⟩ = ⟨x(t0), y(t0)⟩ + ∫_{t0}^{tf} ⟨ẋ(t), ẏ(t)⟩ dt.
W^{1,∞}([t0, tf], Rn) : space of Lipschitz continuous functions x : [t0, tf] → Rn endowed with the norm ‖x(·)‖_{1,∞} = |x(t0)| + ‖ẋ(·)‖∞.
PW^{1,1}([t0, tf], R^{d(x)}) : space of piecewise continuous functions x̄ : [t0, tf] → R^{d(x)} that are absolutely continuous on each of the intervals of the set (t0, tf) \ Θ.
PW^{1,2}([t0, tf], R^{d(x)}) : Hilbert space of functions x̄(·) ∈ PW^{1,1}([t0, tf], R^{d(x)}) whose first derivative is Lebesgue square integrable; with [x̄]k = x̄k+ − x̄k− = x̄(tk+) − x̄(tk−), the inner product is ⟨x̄, ȳ⟩ = ⟨x̄(t0), ȳ(t0)⟩ + Σ_{k=1}^s ⟨[x̄]k, [ȳ]k⟩ + ∫_{t0}^{tf} ⟨dx̄/dt, dȳ/dt⟩ dt.
Preface

The book is devoted to the theory and application of second-order necessary and sufficient
optimality conditions in the Calculus of Variations and Optimal Control. The theory is
developed for control problems with ordinary differential equations subject to boundary
conditions of equality and inequality type and mixed control-state constraints of equality
type. The book exhibits two distinctive features: (a) necessary and sufficient conditions
are given in the form of no-gap conditions, and (b) the theory covers broken extremals,
where the control has finitely many points of discontinuity. Sufficient conditions for regular
controls that satisfy the strict Legendre condition can be checked either via the classical
Jacobi condition or through the existence of solutions to an associated Riccati equation.
Particular emphasis is given to the study of bang-bang control problems. Bang-bang
controls induce an optimization problem with respect to the switching times of the control.
It is shown that the classical second-order sufficient condition for the Induced Optimization
Problem (IOP), together with the so-called strict bang-bang property, ensures second-order
sufficient conditions (SSC) for the bang-bang control problem. Numerical examples in
different areas of application illustrate the verification of SSC for both regular controls and
bang-bang controls.
SSC are crucial for the sensitivity analysis of parametric optimal control problems. It is well known in the literature that, for regular controls satisfying the strict Legendre condition, SSC allow one to prove the differentiability of optimal solutions with respect to parameters and to compute parametric sensitivity derivatives. This property has led to efficient real-time control techniques. Recently, similar results have been obtained for bang-bang controls via SSC for the IOP. Though the discussion of sensitivity analysis and the ensuing real-time control techniques is an immediate consequence of the material presented in this book, a systematic treatment of these issues is beyond its scope.
The results of Sections 1.1–1.3 are due to Levitin, Milyutin, and Osmolovskii. The
results of Section 6.8 were obtained by Milyutin and Osmolovskii. The results of Sections
2.1–3.4, 5.1, 5.2, 6.1, 6.2, and 6.5 were obtained by Osmolovskii; some important ideas
used in these sections are due to Milyutin. The results of Sections 4.1 and 4.2 (except for
Section 4.1.5) are due to Lempio and Osmolovskii. The results of Sections 5.3, 6.3, 6.6,
and 7.1–7.5 were obtained by Maurer and Osmolovskii. All numerical examples in Sections
4.1, 5.4, 6.4, and Chapter 8 were collected and investigated by Maurer, who is grateful for
the numerical assistance provided by Christof Büskens, Laurenz Göllmann, Jang-Ho Robert
Kim, and Georg Vossen. Together we solved a lot more bang-bang and singular control
problems than could be included in this book. H. Maurer is indebted to Yalçin Kaya for
drawing his attention to the arc-parametrization method presented in Section 8.1.2.

Acknowledgments. We are thankful to our colleagues at Moscow State University and the University of Münster for their support. A considerable part of the book was written by the first author during his work in Poland (Systems Research Institute of the Polish Academy of Sciences in Warszawa, Politechnika Radomska in Radom, Siedlce University of Natural Sciences and Humanities) and during his stay in France (Ecole Polytechnique, INRIA Futurs, Palaiseau). We are grateful to J. Frédéric Bonnans from the laboratory CMAP (INRIA) at Ecole Polytechnique for his support and for fruitful discussions. Many important ideas
used in this book are due to A.A. Milyutin. N.P. Osmolovskii was supported by the grant
RFBR 11-01-00795; H. Maurer was supported by the grant MA 691/18 of the Deutsche
Forschungsgemeinschaft.
Finally, we wish to thank Elizabeth Greenspan, Lisa Briggeman, and the SIAM com-
positors for their helpful, patient, and excellent work on our book.
Introduction

By quadratic conditions, we mean second-order extremality conditions formulated for a given extremal in the form of positive (semi)definiteness of the corresponding quadratic form. So, in the simplest problem of the calculus of variations

$$\int_{t_0}^{t_f} F(t, x, \dot x)\,dt \to \min, \qquad x(t_0) = a, \quad x(t_f) = b,$$

considered on the space C¹ of continuously differentiable vector-valued functions x(·) on the given closed interval [t0, tf], the quadratic form is as follows:

$$\omega = \int_{t_0}^{t_f} \big( \langle F_{xx}\bar x, \bar x\rangle + 2\langle F_{\dot x x}\bar x, \dot{\bar x}\rangle + \langle F_{\dot x\dot x}\dot{\bar x}, \dot{\bar x}\rangle \big)\,dt, \qquad \bar x(t_0) = \bar x(t_f) = 0,$$

where the second derivatives Fxx, Fẋx, and Fẋẋ are calculated along the extremal x(·) that is of interest to us. The form is considered on the space W^{1,2} of absolutely continuous functions x̄ with square integrable derivative. By using ω for a given extremal, one formulates a necessary second-order condition for a weak minimum (the positive semidefiniteness of the form), as well as a sufficient second-order condition for a weak minimum (the positive definiteness of the form). As is well known, these quadratic conditions are equivalent (under the strengthened Legendre condition) to the corresponding Jacobi conditions.
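To orient the reader, we recall the classical statement behind this equivalence (a standard fact, added here for convenience): the Jacobi equation is the Euler equation of the quadratic form ω,

$$\frac{d}{dt}\big(F_{\dot x\dot x}\,\dot{\bar x} + F_{\dot x x}\,\bar x\big) = F_{x\dot x}\,\dot{\bar x} + F_{xx}\,\bar x,$$

and, under the strengthened Legendre condition F_ẋẋ > 0, positive definiteness of ω is equivalent to the absence of points conjugate to t0 in the half-open interval (t0, tf].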
The simplest problem of the calculus of variations can also be considered in the
space W 1,∞ of Lipschitz-continuous functions x(·), and then, in particular, there arises
the problem of studying the extremality of broken extremals x(·), i.e., extremals such that
the derivative ẋ(·) has finitely many points of discontinuity of the first kind. What are
second-order conditions for broken extremals, and what is the corresponding quadratic
form for them? A detailed study of this problem for the simplest problem was performed by
the first author in the book [79], where quadratic necessary and sufficient conditions for the
so-called “Pontryagin minimum” (corresponding to L1 -small variations of the control u = ẋ
under the condition of their uniform L∞ -boundedness) were obtained, and also the relation
between the obtained conditions and the conditions for the strong minimum (and also the
so-called “-weak” and “bounded strong” minima) was established. For an extremal x(·)
with one break at a point t∗ , the corresponding form becomes (cf. [39, 85, 86, 107]):
$$\Omega = a\,\bar\xi^{\,2} + 2[F_x]\,\bar x_{\mathrm{av}}\,\bar\xi + \int_{t_0}^{t_f} \big( \langle F_{xx}\bar x, \bar x\rangle + 2\langle F_{\dot x x}\bar x, \dot{\bar x}\rangle + \langle F_{\dot x\dot x}\dot{\bar x}, \dot{\bar x}\rangle \big)\,dt,$$

where ξ̄ is a numerical parameter and x̄(·) is a function that can have a nonzero jump [x̄] := x̄(t∗+) − x̄(t∗−) at the point t∗, is absolutely continuous on the semiopen intervals [t0, t∗) and (t∗, tf], and has a square integrable derivative; moreover, the following conditions hold:

$$[\bar x] = [\dot x]\,\bar\xi, \qquad \bar x(t_0) = \bar x(t_f) = 0.$$

Here, [Fx ] and [ẋ] denote the jumps of the gradient Fx (t, x(t), ẋ(t)) and the derivative ẋ(t)
of the extremal at the point t∗ , respectively (e.g., [ẋ] = ẋ(t∗ +) − ẋ(t∗ −)); a is the derivative
in t of the function

$$F_{\dot x}(t, x(t), \dot x(t))\,[\dot x] - F(t, x(t), \dot x(t_*+)) + F(t, x(t), \dot x(t_*-))$$

at the same point (its existence is proved); and x̄av is the average value of the left-hand and right-hand values of x̄ at t∗, i.e., x̄av = (x̄(t∗−) + x̄(t∗+))/2. The Weierstrass condition (the minimum principle) implies the inequality a ≥ 0, which complements the well-known Weierstrass–Erdmann conditions for broken extremals.
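For the reader's convenience we recall these classical corner conditions (a standard fact, restated here rather than taken from the surrounding text): at a corner point t∗ of an extremal, the quantities F_ẋ and F − ẋF_ẋ must be continuous, i.e.,

$$[F_{\dot x}](t_*) = 0, \qquad [F - \dot x F_{\dot x}](t_*) = 0,$$

so the inequality a ≥ 0 is a second-order complement to these first-order conditions.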
In [96] and [97] it was shown (see also [90]) that the problem of the “sign” of the quadratic form Ω can be studied by methods analogous to the classical ones. The Jacobi conditions, and the criteria formulated by using the corresponding Riccati equation, are extended to the case of a broken extremal. The only new aspect of all these conditions is that the solutions of the corresponding differential equations must have completely determined jumps at the break point of the extremal. Moreover, it was shown in [97] that, as in the classical case, the quadratic form Ω reduces to a sum of squares by solving the corresponding Riccati equation subject to (and this is the difference from the classical case) a definite jump condition at the point t∗; this also yields a criterion for positive definiteness of the form Ω and, therefore, a sufficient extremality condition for a given extremal.
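To make the classical ingredient of this Riccati test concrete, the following minimal Python sketch (our illustration; it treats only the smooth, no-corner case and does not implement the jump condition of [97]) checks positive definiteness of ω for F(t, x, ẋ) = ẋ² − x² on [0, T], where R = F_ẋẋ = 2, C = F_ẋx = 0, P = F_xx = −2: the form is positive definite iff the scalar Riccati equation Q̇ = (Q + C)²/R − P has a solution on all of [0, T], and blow-up of every solution before T signals a conjugate point (here at length π).

import numpy as np
from scipy.integrate import solve_ivp

# Classical Riccati test for the second variation (illustrative sketch).
# For F(t, x, xdot) = xdot**2 - x**2 on [0, T] we have R = 2, C = 0, P = -2
# in  omega = int (P*xbar**2 + 2*C*xbar*xbar' + R*xbar'**2) dt,  and omega is
# positive definite iff the Riccati equation  Q' = (Q + C)**2/R - P  has a
# solution on all of [0, T]; blow-up of Q before T signals a conjugate point.

R, C, P = 2.0, 0.0, -2.0

def riccati(t, q):
    return (q + C) ** 2 / R - P

def omega_positive_definite(T, q0_grid=np.linspace(-50.0, 50.0, 201)):
    """Return True if some initial value Q(0) yields a solution on [0, T]."""
    escape = lambda t, q: abs(q[0]) - 1.0e6   # event: Q escapes to infinity
    escape.terminal = True
    for q0 in q0_grid:
        sol = solve_ivp(riccati, (0.0, T), [q0], events=escape, rtol=1.0e-8)
        if sol.t[-1] >= T - 1.0e-9:           # reached T without blow-up
            return True
    return False

# The conjugate point for this example lies at length pi, so the test
# succeeds for T < pi and fails for T > pi.
for T in (1.0, 3.0, 3.5):
    print(T, omega_positive_definite(T))

Completing the square with such a Q transforms ω into ∫ R(dx̄/dt + R⁻¹(C + Q)x̄)² dt; it is exactly this mechanism that the jump conditions at t∗ extend to broken extremals.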
In the book [79], quadratic extremality conditions for discontinuous controls were also presented for the following problem on a fixed time interval [t0, tf]:

$$\mathcal{J}(x(\cdot), u(\cdot)) = J(x(t_0), x(t_f)) \to \min,$$
$$F(x(t_0), x(t_f)) \le 0, \qquad K(x(t_0), x(t_f)) = 0,$$
$$\dot x = f(t, x, u), \qquad g(t, x, u) = 0, \qquad (t, x, u) \in Q,$$

where Q is an open set; x, u, F, K, and g are vector-valued functions; and J is a scalar-valued function. The functions J, F, K, f, and g belong to the class C², and, moreover, the derivative g_u has maximal rank on the surface g = 0 (the nondegeneracy condition for the relation g = 0). We seek the minimum among pairs (x(·), u(·)) admissible for the constraints such that the function x(·) is absolutely continuous and u(·) is bounded and measurable. This statement corresponds to the general canonical optimal control problem in the Dubovitskii–Milyutin form but, in contrast to the latter, it is considered on a fixed interval of time and, which is of special importance, it does not contain pointwise (or local, in the Dubovitskii–Milyutin terminology) mixed constraints of inequality type ϕ(t, x, u) ≤ 0. It is precisely these constraints that caused the major difficulties in the study of quadratic conditions [85, 88]. Also, due to the absence of local inequalities, we refer this problem to the calculus of variations (rather than to optimal control) and call it the general problem of the calculus of variations (with mixed equality-type constraints, on a fixed time interval). Its statement is close to the Mayer problem, but the presence of endpoint inequality-type constraints determines its specifics.

On the other hand, this problem, even though we refer it to the calculus of variations, is sufficiently general, and its statement is close to optimal control problems, especially owing to the local relation g(t, x, u) = 0. In [79], it was shown how, by using quadratic conditions for the general problem of the calculus of variations, one can obtain quadratic (necessary and sufficient) conditions in optimal control problems in which the controls enter linearly and the constraint on the control is given in the form of a convex polyhedron, under the assumption that the optimal control is piecewise constant and (outside the switching points) takes values in the vertices of the polyhedron (the so-called bang-bang control). To show this, in [79], we first used the property that the set V of vertices of a polyhedron U can be given by a nondegenerate relation g(u) = 0 on an open set Q consisting of disjoint open neighborhoods of the vertices. This allows us to write quadratic necessary conditions for bang-bang controls. Further, in [79], it was shown that a sufficient minimality condition on V guarantees (when the control enters linearly) the minimum on its convexification U = co V. In this way, quadratic sufficient conditions were obtained for bang-bang controls.
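For instance (our illustration of this device), for the polyhedron U = [−1, 1] ⊂ R the vertex set V = {−1, 1} is given by the nondegenerate relation

$$g(u) = u^2 - 1 = 0, \qquad u \in Q = (-2, 0) \cup (0, 2),$$

where Q consists of disjoint open neighborhoods of the two vertices and g′(u) = 2u ≠ 0 on Q; the equality-constrained theory then applies on Q, while U = co V recovers the original control constraint.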
However, in [79], there is a substantial gap stemming from the fact that, to avoid
making the book too long, the authors decided to omit the proofs of quadratic conditions
for the general problem of the calculus of variations and restricted themselves to their
formulation and the presentation of proofs only for the simplest problem. Although the
latter gives the idea of the proofs in the general case, there are no formal proofs of quadratic
conditions for the general problem of the calculus of variations in [79]. Part I of the
present book is devoted to removing this gap. Therefore, Part I can be considered as a
necessary supplement to the book [79]. At the same time, the material contained in Part I
is independent and is a complete theory of quadratic conditions for smooth problems of
the calculus of variations and optimal control that are covered by the statement presented
above.
Part I is organized as follows. First, in Chapter 1, we present a fragment of the abstract theory of higher-order conditions of Levitin, Milyutin, and Osmolovskii [54, 55, 56], more precisely, a modification of this theory for smooth problems on a set of sequences Π determining one or another concept of minimum. In this theory, by a higher order we mean a nonnegative functional determining a growth estimate of the objective function on admissible sequences of variations from Π. The main result of the abstract theory is that, for a given class of problems, a given higher order γ, and a given set of sequences Π, one defines a constant Cγ (by using the Lagrange function) such that Cγ ≥ 0 is a necessary minimality condition (corresponding to Π), and Cγ > 0 is a sufficient condition. The constant Cγ is said to be basic. In each concrete class of problems (for given Π and γ), there arises the problem of “decoding” the basic constant. By decoding we mean the simplest method for calculating its sign. We illustrate the decoding of the basic constant by two simple examples, obtaining conditions of the order γ, where γ is a certain quadratic functional.

In Chapter 2, on the basis of the results of Chapter 1, we create a quadratic theory of conditions for a Pontryagin minimum in the general problem of the calculus of variations without the local mixed constraint g(t, x, u) = 0. We perform the decoding of the basic constant for the set of so-called “Pontryagin sequences” and a special higher order γ, which is characteristic for extremals with finitely many discontinuities of the first kind of the control. We first estimate the basic constant from above, thus obtaining quadratic necessary conditions for a Pontryagin minimum, and then we estimate it from below, thus obtaining sufficient conditions. After that (in Section 2.7), we establish the relation of the obtained sufficient conditions for a Pontryagin minimum with conditions for a strong minimum.

In Chapter 3, we extend the quadratic conditions obtained in Chapter 2 to the general


problem with the local relation g(t, x, u) = 0 using a special method of projection contained
in [79]. Moreover, we extend these conditions to the problem on a variable interval of
time using a simple change of time variable. We also formulate, without proofs, quadratic
conditions in an optimal control problem with local relations g(t, x, u) = 0 and ϕ(t, x, u) ≤ 0.
The proofs are set forth in [94, 95].
In Chapter 4, following the results of the papers [96] and [97], we derive tests for positive semidefiniteness and for positive definiteness of the quadratic form Ω on the critical cone K (obtained in Chapter 2 for extremals with jumps of the control); these are necessary and sufficient conditions for a local minimum, respectively. First, we derive such tests for the simplest problem of the calculus of variations and for an extremal with only one corner point. In these tests we exploit the classical Jacobi and Riccati equations but, as was said above, we use discontinuous solutions to these equations satisfying specific jump conditions at the corner point of the extremal. In the proofs we use the ideas of [25] and [26]. Namely, we consider a one-parameter family of auxiliary minimization problems and reduce the question of the “sign” of our quadratic form on the critical subspace to the condition of the existence of a nonzero point of minimum in the auxiliary problem for a certain value of the parameter. This condition was called in [25] the “passage of the quadratic form through zero.” Then we obtain a dual test for a minimum in the auxiliary problem. As a result, we arrive at a generalization of the concept of conjugate point. Such a point is called Θ-conjugate, where Θ = {t∗} is the singleton consisting of the one corner point of the extremal at hand. This generalization allows us to formulate both necessary and sufficient second-order optimality conditions for a broken extremal. Next, we concentrate only on sufficient conditions for positive definiteness of the quadratic form Ω in the auxiliary problem. Following [97], we show that if there exists a solution to the Riccati matrix equation satisfying a definite jump condition, then the quadratic form Ω can be transformed into a perfect square, just as in the classical case. This gives the possibility of proving a sufficient condition for positive definiteness of the quadratic form in the auxiliary problem and thus of obtaining one more sufficient condition for optimality of a broken extremal. We first prove this result for an extremal with one corner point in the simplest problem of the calculus of variations, and then for an extremal with finitely many points of discontinuity of the control in the general problem of the calculus of variations.
Part II is devoted to optimal control problems. In Chapter 5, we derive quadratic
optimality conditions for optimal control problems with a vector control variable having
two components: a continuous unconstrained control appearing nonlinearly in the control
system and a bang-bang control appearing linearly and belonging to a convex polyhedron.
This type of control problem arises in many applications. The proofs of the quadratic optimality conditions for the mixed continuous-bang case are very similar to the proofs given in [79] for the pure bang-bang case, but some modifications were inevitable. We demonstrate these modifications. In the proofs we use the optimality conditions obtained in Chapter 3 for extremals with jumps of the control in the general problem of the calculus of variations. Further, we show that, also in the mixed case, there is a technique for checking positive definiteness of the quadratic form on the critical cone via a discontinuous solution of an associated Riccati equation with corresponding jump conditions (for this solution) at the points of discontinuity of the bang-bang control. This technique is applied to an economic control problem in optimal production and maintenance which was introduced by Cho, Abad, and Parlar [22]. We show that the numerical solution obtained in Maurer, Kim, and Vossen [67] satisfies the second-order test derived in Chapter 5, while existing sufficiency results fail to hold.
In Chapter 6, we investigate the pure bang-bang case. We obtain second-order nec-
essary and sufficient optimality conditions for this case as a consequence of the conditions
obtained in Chapter 5. In the pure bang-bang case, the conditions amount to testing the
positive (semi)definiteness of a quadratic form on a finite-dimensional critical cone. Nev-
ertheless, the assumptions are appropriate for numerical verification only in some special
cases. Therefore, again we study various transformations of the quadratic form and the
critical cone which will be tailored to different types of control problems in practice. In
particular, by means of a solution to a linear matrix differential equation, the quadratic form
can be converted to perfect squares. We demonstrate by practical examples that the obtained
conditions can be verified numerically.
We also study second-order optimality conditions for time-optimal control problems with the control appearing linearly. More specifically, we consider the special case of time-optimal bang-bang controls with given initial and terminal states. We aim at showing that an approach similar to the above-mentioned Riccati equation approach works as well for such problems. Again, the test requires us to find a solution of a linear matrix differential equation which satisfies certain jump conditions at the switching points. We discuss three numerical examples that illustrate the numerical procedure of verifying positive definiteness of the corresponding quadratic forms. Finally, following [79], we study second-order optimality conditions in a simple, but important, class of time-optimal control problems for linear systems with constant entries.
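To fix the setting of Section 6.8 (a standard formulation; our schematic summary), such problems read

$$\dot x = Ax + Bu, \quad |u_i| \le 1 \ (i = 1, \ldots, m), \quad x(t_0) = x_0, \quad x(t_f) = x_f, \quad t_f \to \min,$$

with constant matrices A and B. Along an extremal, each control component is determined by the sign of its switching function σi(t) = ψ(t)Bi (Bi the ith column of B), where the adjoint ψ solves ψ̇ = −ψA; the controls are bang-bang whenever the σi have only isolated zeros.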
Second-order optimality conditions in bang-bang control problems have been derived
in the literature in two different forms. The first form was discussed above. The second
form belongs to Agrachev, Stefani, and Zezza [1], who first reduce the bang-bang control
problem to a finite-dimensional Induced Optimization Problem (IOP) and then show that
well-known sufficient optimality conditions for the induced problem supplemented by the
strict bang-bang property furnish sufficient conditions for the bang-bang control problem.
In Chapter 7, we establish the equivalence of both forms of sufficient conditions. The proof of this equivalence makes extensive use of explicit formulas for the first- and second-order derivatives of the trajectory with respect to variations of the optimization variable ζ comprising the switching times, the free initial and final time, and the free initial state. We formulate the IOP with optimization variable ζ which is associated with the bang-bang control problem. We give formulas for the first- and second-order derivatives of trajectories with respect to ζ which follow from elementary properties of ordinary differential equations (ODEs). The formulas are used to establish the explicit relations between the multipliers of Pontryagin’s minimum principle and the Lagrange multipliers, critical cones, and quadratic forms of the original problem and the IOP. In our opinion, the resulting formulas seem to have been mostly unknown in the literature. These formulas provide the main technical tools to obtain explicit representations of the second-order derivatives of the Lagrangian. The remarkable fact to be noted here is that, by using a suitable transformation, these derivatives are seen to involve only first-order variations of the trajectory with respect to ζ. This property considerably facilitates the numerical computation of the Hessian of the Lagrangian. Thus, we arrive at a representation of the quadratic form associated with the Hessian of the Lagrangian.
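Schematically (our condensed restatement of the construction just described), with ζ = (t0, tf, x0, θ) and θ = (t1, . . . , ts) the vector of switching times, the IOP is the finite-dimensional problem

$$\min_{\zeta}\; J\big(t_0, x_0, t_f, x(t_f; t_0, x_0, \theta)\big) \quad \text{subject to the boundary constraints on } \big(t_0, x_0, t_f, x(t_f; t_0, x_0, \theta)\big),$$

where x(·; t0, x0, θ) denotes the response of the control system to the bang-bang control with the given switching structure, in the notation of Sections 7.2 and 7.3.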
Finally, Chapter 8 is devoted to numerical methods for solving the IOP and testing
the second-order sufficient conditions in Theorem 7.10. After a brief survey of numerical methods for solving optimal control problems, we present in Section 8.1.2 the arc-
parametrization method for computing bang-bang controls [44, 45, 66] and its extension to
piecewise feedback controls [111, 112, 113]. Arc parametrization can be efficiently imple-
mented using the code NUDOCCCS developed by Büskens [13, 14]. Several numerical
examples illustrate the arc-parametrization method and the verification of second-order
conditions.
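As a concrete illustration of the arc-parametrization idea (our sketch, independent of the code NUDOCCCS), consider the time-optimal double integrator ẋ1 = x2, ẋ2 = u, |u| ≤ 1, steered from (1, 0) to the origin. The optimal control is known to be bang-bang with one switch, u = −1 followed by u = +1, so the unknowns are the two arc durations; mapping each arc to the unit interval multiplies the right-hand side by the (unknown) duration, and the terminal condition becomes a two-dimensional root-finding problem.

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

# Arc-parametrization sketch for the time-optimal double integrator
# x1' = x2, x2' = u, |u| <= 1, from (1, 0) to (0, 0).  The bang-bang
# structure u = -1 on arc 1, u = +1 on arc 2 is assumed known; the
# unknowns are the arc durations xi = (xi1, xi2).  Each arc is mapped to
# the unit interval, so the right-hand side is scaled by the (unknown)
# duration -- this is the arc parametrization.

U = [-1.0, +1.0]                  # fixed control values on the two arcs
x_init = np.array([1.0, 0.0])     # initial state
x_target = np.array([0.0, 0.0])   # prescribed terminal state

def propagate(xi):
    """Integrate across both arcs for arc durations xi; return x(tf)."""
    x = x_init
    for dur, u in zip(xi, U):
        rhs = lambda tau, x, dur=dur, u=u: dur * np.array([x[1], u])
        sol = solve_ivp(rhs, (0.0, 1.0), x, rtol=1.0e-10, atol=1.0e-12)
        x = sol.y[:, -1]
    return x

xi_opt = fsolve(lambda xi: propagate(xi) - x_target, x0=[0.5, 0.5])
print("arc durations:", xi_opt)   # expected (1, 1), minimal time tf = 2

The computed durations (1, 1) give the minimal time tf = 2. In the general method of Section 8.1, the same parametrization feeds the arc durations into a nonlinear programming problem, at whose solution the second-order test of Chapter 7 is then checked.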
Chapter 1

Abstract Scheme for Obtaining Higher-Order Conditions in Smooth Extremal Problems with Constraints

Here, we present the general theory of higher-order conditions [78], which will be used in what follows to obtain quadratic optimality conditions in the canonical problem of the calculus of variations. In Section 1.1, we formulate the main result of the general theory in the smooth case. Section 1.2 is devoted to its proof. In Section 1.3, we present two simple applications of the general theory.

1.1 Main Concepts and Main Theorem

1.1.1 Minimum on a Set of Sequences

Let X and Y be Banach spaces. Let a set 𝒟 ⊂ X, functionals J : 𝒟 → R¹, fi : 𝒟 → R¹, i = 1, . . . , k, and an operator g : 𝒟 → Y be given. Consider the problem

$$J(x) \to \min; \quad f_i(x) \le 0, \ i = 1, \ldots, k; \quad g(x) = 0; \quad x \in \mathcal{D}. \tag{1.1}$$

Let a point x0 ∈ 𝒟 satisfy the constraints, and let us study its optimality. By {δxn}, and also by {x̄n}, we denote countable sequences in X. Denote by Π0 the set of sequences {x̄n} converging in norm to zero in X. Let us introduce the set of sequences determining the type of minimum at the point x0 which will be used. Let Π be an arbitrary set of sequences {δxn} satisfying the following conditions:

(a) Π is closed with respect to passing to a subsequence;

(b) Π + Π0 ⊂ Π; i.e., the conditions {δxn} ∈ Π and {x̄n} ∈ Π0 imply {δxn + x̄n} ∈ Π (in this case, we say that Π sustains a Π0-extension).

Moreover, it is assumed that the following condition holds for Π, 𝒟, and x0:

(c) for any sequence {δxn} ∈ Π, we have x0 + δxn ∈ 𝒟 for all sufficiently large n (in this case, we say that the set 𝒟 is absorbing for Π at the point x0).

We give the following definition for problem (1.1).

Definition 1.1. We say that the minimum is attained at a point x0 on Π (or x0 is a point of Π-minimum) if there is no sequence {δxn} ∈ Π such that, for all n,

$$J(x_0 + \delta x_n) - J(x_0) < 0, \quad f_i(x_0 + \delta x_n) \le 0 \ (i = 1, \ldots, k), \quad g(x_0 + \delta x_n) = 0.$$


In a similar way, the strict minimum on Π at x0 is defined. We need only replace the strict inequality J(x0 + δxn) − J(x0) < 0 in the previous definition with the nonstrict one and require additionally that the sequence {δxn} contain nonzero terms.

Obviously, the minimum on Π0 is a local minimum. If Π0 ⊂ Π, then the minimum on Π is not weaker than a local minimum. The inclusion Π0 ⊂ Π holds iff Π contains the zero sequence. This condition holds in all applications of the general theory.

In what follows, the point x0 is fixed and, therefore, as a rule, it will be omitted in the definitions and notation. By δ𝒟, we denote the set of variations δx ∈ X such that x0 + δx ∈ 𝒟. Note that 0 ∈ δ𝒟. We set f0(x) = J(x) − J(x0) for x ∈ 𝒟. Denote by S the system consisting of the functionals f0, f1, . . . , fk and the operator g. The concepts of minimum and strict minimum on Π are naturally extended to the system S. The concepts introduced below can be related to problem (1.1) as well as to the system S. Sometimes it is more convenient to speak about the system rather than about the problem.

For δx ∈ δ𝒟, we set m(δx) = max_{0≤i≤k} fi(x0 + δx). Let Πg be the set of sequences {δxn} ∈ Π such that g(x0 + δxn) = 0 for all sufficiently large n. Consider the condition m ≥ 0 | Πg. By definition, this condition means that for any sequence {δxn} ∈ Πg, there exists a number starting from which m(δxn) ≥ 0. In what follows, such notation will be used without additional explanation. The following proposition follows directly from the definitions.

Proposition 1.2. If the minimum is attained on Π, then m ≥ 0 | Πg.

Therefore, the condition m ≥ 0 | Πg is necessary for the minimum on Π. It will serve as a source of other, coarser necessary conditions.

Now let us consider two obvious sufficient conditions. Let Π⁺ be the set of all sequences from Π that do not vanish. Define Πg⁺ analogously. The following proposition follows directly from the definitions.

Proposition 1.3. The condition m > 0 | Πg⁺ is equivalent to the strict minimum on Π.

For δx ∈ δ𝒟, we set σ(δx) = max{m(δx), ‖g(x0 + δx)‖}. We say that σ is the violation function. If, in problem (1.1), there is no equality-type constraint g(x) = 0, then we set σ(δx) = m+(δx), where m+ = max{m, 0}. The following proposition is elementary.

Proposition 1.4. The condition σ > 0 | Π⁺ is equivalent to the strict minimum on Π.

1.1.2 Smooth Problem

Let us formulate the assumptions in problem (1.1) that define it as a smooth problem. These assumptions are related not only to the functionals J, f1, . . . , fk, the operator g, and the point x0, but also to the set of sequences Π. First let us give several definitions.

Let Z be a Banach space. A mapping h : 𝒟 → Z is said to be Π-continuous at x0 if the condition ‖h(x0 + δxn) − h(x0)‖ → 0 (n → ∞) holds for any {δxn} ∈ Π. A Π-continuous mapping is said to be strictly Π-differentiable at x0 if there exists a linear operator H : X → Z such that, for any sequences {δxn} ∈ Π and {x̄n} ∈ Π0, there exists a sequence {zn} in Z such that ‖zn‖ → 0 and, for all sufficiently large n, we have the relation

$$h(x_0 + \delta x_n + \bar x_n) = h(x_0 + \delta x_n) + H\bar x_n + z_n \|\bar x_n\|.$$

The definition easily implies the uniqueness of the operator H. If Π contains the zero sequence (and hence Π0 ⊂ Π), then H is the Fréchet derivative of the operator h at the point x0, and strict Π-differentiability implies strict differentiability. In what follows, we set H = h′(x0). The function of two variables, (δx, x̄) ↦ h(x0 + δx) + h′(x0)x̄, that maps δ𝒟 × X into Z is called a fine linear approximation of the operator h at the point x0 on Π. These concepts are used for operators as well as for functionals.

It is assumed that all the functionals J, f1, . . . , fk and the operator g in problem (1.1) are Π-continuous at x0. Introduce the set of active indices:

$$I = \{\, i \in \{0, 1, \ldots, k\} \mid f_i(x_0) = 0 \,\}, \quad \text{where } f_0(x) = J(x) - J(x_0). \tag{1.2}$$

Obviously, 0 ∈ I. It is assumed that the functionals fi, i ∈ I, and the operator g are strictly Π-differentiable at x0. Also, it is assumed that either g′(x0)X = Y (in this case, we say that for g at x0 on Π the Lyusternik condition holds) or the image g′(x0)X is closed in Y and has a direct complement which is a closed subspace in Y. Precisely these assumptions define a smooth problem (smooth system) on Π at the point x0. In this chapter we consider only this type of problem.
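As a simple illustration (ours, not from the original text): in any Banach space X the functional h(x) = ‖x‖² is strictly Π0-differentiable at x0 = 0 with h′(0) = 0, since for {δxn} ∈ Π0 and {x̄n} ∈ Π0 with x̄n ≠ 0,

$$\big|\,\|\delta x_n + \bar x_n\|^2 - \|\delta x_n\|^2\,\big| \;\le\; \big(2\|\delta x_n\| + \|\bar x_n\|\big)\|\bar x_n\| \;=\; z_n \|\bar x_n\|, \qquad z_n \to 0.$$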

1.1.3 Conditions of Order γ

As usual, by the order of an extremality condition one means the order of the highest derivative entering this condition. We give another definition of the order. A functional γ : δ𝒟 → R¹ is called an order on Π if it is nonnegative on δ𝒟, Π-continuous at zero, and γ(0) = 0. An order γ on Π is said to be higher if it is strictly Π-differentiable at zero, and hence γ′(0) = 0. An order γ on Π is said to be strict if γ > 0 | Π⁺.
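For example (our illustration), with Π = Π0 the functional γ(δx) = ‖δx‖² is a strict higher order on Π: it is nonnegative with γ(0) = 0, it is strictly Π0-differentiable at zero with γ′(0) = 0 (by the estimate at the end of Section 1.1.2 above), and γ(δxn) > 0 whenever δxn ≠ 0, so that γ > 0 | Π⁺.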
Let γ be a strict higher order on Π. Define the following two conditions on Π: the γ-necessity and the γ-sufficiency. To this end, we set

$$C_\gamma(m, \Pi_g) = \inf_{\Pi_g} \liminf \frac{m}{\gamma}.$$

Let us explain that, in calculating this quantity, for each sequence {δxn} ∈ Πg that does not vanish, we first calculate the limit inferior (lim inf) of the ratio m(δxn)/γ(δxn) as n → ∞, and then we take the greatest lower bound of these limits inferior over the whole set of sequences from Πg that do not vanish. Analogous notation will be used for other functions and other sets of sequences. Proposition 1.2 implies the following assertion.

Proposition 1.5. If we have the minimum on Π, then Cγ(m, Πg) ≥ 0.

The condition Cγ(m, Πg) ≥ 0 is called the γ-necessity on Π. It is easy to see that the γ-necessity on Π is equivalent to the following condition: there are no ε > 0 and sequence {δxn} ∈ Π⁺ such that

$$f_i(x_0 + \delta x_n) \le -\varepsilon\gamma(\delta x_n) \ (i = 0, \ldots, k), \qquad g(x_0 + \delta x_n) = 0.$$



It is convenient to compare the concept of γ-necessity in this form with the concept of minimum on Π. Further, we set

$$C_\gamma(\sigma, \Pi) = \inf_{\Pi} \liminf \frac{\sigma}{\gamma}.$$

Propositions 1.3 and 1.4 imply the following assertion.

Proposition 1.6. Each of the two conditions Cγ(m, Πg) > 0 and Cγ(σ, Π) > 0 is sufficient for the strict minimum on Π.

The condition Cγ(σ, Π) > 0 is called the γ-sufficiency on Π. It is equivalent to the following condition: there exists C > 0 such that σ ≥ Cγ | Π⁺. Since, obviously,

$$C_\gamma(m, \Pi_g)_+ = C_\gamma(m_+, \Pi_g) = C_\gamma(\sigma, \Pi_g) \ge C_\gamma(\sigma, \Pi),$$

the inequality Cγ(σ, Π) > 0 implies the inequality Cγ(m, Πg) > 0. Therefore, the inequality Cγ(σ, Π) > 0 is a no weaker sufficient condition for the strict minimum on Π than the inequality Cγ(m, Πg) > 0. In what follows, we will show that if the Lyusternik condition holds, then these two sufficient conditions are equivalent. As the main sufficient condition we will consider the inequality Cγ(σ, Π) > 0, i.e., the γ-sufficiency on Π.

Therefore, the γ-necessity and the γ-sufficiency on Π are an obvious weakening and an obvious strengthening of the concept of minimum on Π, respectively. We aim at obtaining criteria for the γ-conditions, i.e., the γ-necessity and the γ-sufficiency, formulated by using the Lagrange function.
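As a toy example (ours): let X = R, 𝒟 = X, J(x) = x² with no further constraints, x0 = 0, Π = Π0, and γ(δx) = δx². Then m(δx) = f0(δx) = δx² = γ(δx) and σ = m+ = γ, so Cγ(σ, Π) = 1 > 0: the γ-sufficiency holds, and x0 is a strict minimum on Π0, in agreement with Proposition 1.6.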

1.1.4 Lagrange Function. Main Result

By λ = (α, y∗), we denote an arbitrary tuple of multipliers, where α = (α0, α1, . . . , αk) ∈ R^{k+1} and y∗ ∈ Y∗. Denote by Λ0 the set of tuples λ such that

$$\alpha_i \ge 0 \ (i = 0, \ldots, k), \quad \sum_{i=0}^k \alpha_i + \|y^*\| = 1, \quad \alpha_i f_i(x_0) = 0 \ (i = 1, \ldots, k), \quad \sum_{i=0}^k \alpha_i f_i'(x_0) + y^* g'(x_0) = 0, \tag{1.3}$$

where ⟨y∗g′(x0), x⟩ = ⟨y∗, g′(x0)x⟩ for all x ∈ X by definition (here, we prefer not to use the notation g′(x0)∗ for the adjoint operator). Therefore, Λ0 is the set of normalized tuples of Lagrange multipliers. The relation Σαi + ‖y∗‖ = 1 is the normalization condition here. Such a normalization is said to be standard. Introduce the Lagrange function

$$L(\lambda, x) = \sum_{i=0}^k \alpha_i f_i(x) + \langle y^*, g(x)\rangle, \quad x \in \mathcal{D},$$

and the functions

$$\Phi(\lambda, \delta x) = L(\lambda, x_0 + \delta x) = \sum_{i=0}^k \alpha_i f_i(x_0 + \delta x) + \langle y^*, g(x_0 + \delta x)\rangle, \qquad \Phi_0(\delta x) = \max_{\lambda \in \Lambda_0} \Phi(\lambda, \delta x), \quad \delta x \in \delta\mathcal{D}.$$

Here and in what follows, we set max∅(·) = −∞.



Denote by Πσγ the set of sequences {δxn} ∈ Π satisfying the condition σ(δxn) ≤ O(γ(δxn)). The latter means that there exists C > 0, depending on the sequence, such that σ(δxn) ≤ Cγ(δxn) for all n. We set

$$C_\gamma(\Phi_0, \Pi_{\sigma\gamma}) = \inf_{\Pi_{\sigma\gamma}} \liminf \frac{\Phi_0}{\gamma}.$$

The constant Cγ(Φ0, Πσγ) is said to be basic. It turns out that, for an arbitrary strict higher order γ on Π, the constant Cγ(Φ0, Πσγ) allows us to formulate the following pair of adjacent conditions for Π: the inequality Cγ(Φ0, Πσγ) ≥ 0 is necessary for the minimum on Π, and the strict inequality Cγ(Φ0, Πσγ) > 0 is sufficient for the strict minimum on Π. Moreover, the following assertion holds.

Theorem 1.7. (a) If g′(x0)X = Y, then the inequality Cγ(Φ0, Πσγ) ≥ 0 is equivalent to the inequality Cγ(m, Πg) ≥ 0. If g′(x0)X ≠ Y, then Φ0 ≥ 0 and, therefore, Cγ(Φ0, Πσγ) ≥ 0.
(b) The inequality Cγ(Φ0, Πσγ) > 0 is always equivalent to the inequality Cγ(σ, Π) > 0. In the case where g′(x0)X = Y, the following three inequalities are pairwise equivalent:

$$C_\gamma(\Phi_0, \Pi_{\sigma\gamma}) > 0, \qquad C_\gamma(\sigma, \Pi) > 0, \qquad C_\gamma(m, \Pi_g) > 0.$$

Therefore, the γ-necessity on Π always implies the inequality Cγ(Φ0, Πσγ) ≥ 0, and the γ-sufficiency on Π is always equivalent to the inequality Cγ(Φ0, Πσγ) > 0. Theorem 1.7 is the main result of the abstract theory of higher-order conditions for smooth problems. Note that it remains valid if, in the definition of the set Λ0, we replace the standard normalization Σαi + ‖y∗‖ = 1 by any equivalent normalization. Let us make more precise what we mean by an equivalent normalization.
what we mean by an equivalent normalization.

1.1.5 Equivalent Normalizations

Let ν(λ) be a positively homogeneous function of the first degree. The normalization ν(λ) = 1 is said to be equivalent to the standard normalization if the condition that Λ0 is nonempty implies the inequalities

$$0 < \inf_{\Lambda_0} \nu(\lambda) \le \sup_{\Lambda_0} \nu(\lambda) < +\infty.$$

The following assertion holds.

Proposition 1.8. Let g′(x0)X = Y. Then the condition Σ_{i=0}^k αi = 1 defines an equivalent normalization.

Proof. Let Λ0 be nonempty, and let λ ∈ Λ0. Then Σ_{i∈I} αi fi′(x0) + y∗g′(x0) = 0, which implies ‖y∗g′(x0)‖ ≤ (Σ αi) max_{i∈I} ‖fi′(x0)‖. Therefore,

$$\sum \alpha_i \;\le\; \sum \alpha_i + \|y^* g'(x_0)\| \;\le\; \sum \alpha_i \Big(1 + \max_{i \in I} \|f_i'(x_0)\|\Big).$$

It remains to note that ‖y∗g′(x0)‖ and ‖y∗‖ are two equivalent norms on Y∗, since g′(x0)X = Y.

Therefore, in the case where the Lyusternik condition g′(x0)X = Y holds, we can
use the normalization Σ αi = 1 in the definition of Λ0. This normalization is called the
Lyusternik normalization. We need to compare the functions Ω0 for the standard and
Lyusternik normalizations. Therefore, in the case of the Lyusternik normalization, let us agree
to equip the set Λ0 and the function Ω0 with the superscript L; i.e., we write Λ0^L and Ω0^L. The
following assertion holds.

Proposition 1.9. Let g′(x0)X = Y. Then there exists a number a, 0 < a ≤ 1, such that

    Ω0 ≤ max{a Ω0^L, Ω0^L},        (1.4)

    Ω0^L ≤ max{Ω0, (1/a) Ω0}.        (1.5)

Proof. We first prove inequality (1.4). If Λ0 is empty, then Ω0 = −∞, and hence
inequality (1.4) holds. Suppose that Λ0 is not empty. By Proposition 1.8, there exists
a, 0 < a ≤ 1, such that for any λ ∈ Λ0, the inequality a ≤ Σ αi holds. Moreover, the
condition Σ αi + ‖y∗‖ = 1 implies Σ αi ≤ 1. Let λ = (α, y∗) ∈ Λ0. We set ν := Σ αi.
Then λ̂ := λ/ν ∈ Λ0^L, a ≤ ν ≤ 1. Therefore, for any δx ∈ δ𝒪,

    Ω(λ, δx) = Ω(νλ̂, δx) = ν Ω(λ̂, δx) ≤ max{a Ω0^L(δx), Ω0^L(δx)}.

This implies estimate (1.4).
Now let us prove (1.5). If Λ0^L is empty, then Ω0^L = −∞, and hence (1.5) holds.
Now let Λ0^L be nonempty, and let λ̂ = (α̂, ŷ∗) ∈ Λ0^L. We set μ := 1 + ‖ŷ∗‖, λ = (α, y∗) :=
λ̂/μ. Then λ ∈ Λ0. Moreover, a ≤ Σ αi = (1/μ) Σ α̂i = 1/μ ≤ 1, which implies 1 ≤ μ ≤ 1/a.
Therefore, for any δx ∈ δ𝒪, we have

    Ω(λ̂, δx) = μ Ω(λ, δx) ≤ max{Ω0(δx), (1/a) Ω0(δx)}.

This implies estimate (1.5).

Proposition 1.9 immediately implies the following assertion.

Proposition 1.10. Let g′(x0)X = Y. Then there exists a number a, 0 < a ≤ 1, such that

    Cγ(Ω0, Πσγ) ≤ max{a Cγ(Ω0^L, Πσγ), Cγ(Ω0^L, Πσγ)},        (1.6)

    Cγ(Ω0^L, Πσγ) ≤ max{Cγ(Ω0, Πσγ), (1/a) Cγ(Ω0, Πσγ)}.        (1.7)

Therefore, if the Lyusternik condition holds, the constants Cγ(Ω0, Πσγ) and Cγ(Ω0^L,
Πσγ) have the same signs, which in this case allows us to replace the first constant by the
second in Theorem 1.7.

1.1.6 Sufficient Conditions


As was already noted, the inequalities Cγ(Ω0, Πσγ) ≥ 0 and Cγ(Ω0, Πσγ) > 0 are a pair
of adjacent conditions for a minimum on Π at the point x0. The nonstrict inequality is a
necessary condition, and the strict inequality is a sufficient condition. This is implied by
Theorem 1.7. The necessary condition is not trivial, as we will verify below in proving
Theorem 1.7. As for the sufficient condition Cγ(Ω0, Πσγ) > 0, its sufficiency for the
minimum on Π is simple (this is characteristic of sufficient conditions in general: their
sources are simple as a rule). Let us prove the sufficiency of this condition. The following
estimate easily follows from the definitions of the functions Ω0 and σ: Ω0 ≤ σ. Hence

    Cγ(Ω0, Πσγ) ≤ Cγ(σ, Πσγ).        (1.8)

Let us show that

    Cγ(σ, Πσγ) = Cγ(σ, Π).        (1.9)

Indeed, the inclusion Πσγ ⊂ Π implies the inequality Cγ(σ, Πσγ) ≥ Cγ(σ, Π). Let us prove
the converse inequality

    Cγ(σ, Πσγ) ≤ Cγ(σ, Π).        (1.10)

If Cγ(σ, Π) = ∞, then inequality (1.10) holds. Let Cγ(σ, Π) < ∞, and let a number C be
such that Cγ(σ, Π) < C, i.e.,

    inf_Π lim inf (σ/γ) < C.

Then there exists a sequence {δxn} ∈ Π such that σ(δxn)/γ(δxn) < C for all n (passing,
if necessary, to a subsequence, which again belongs to Π). This implies {δxn} ∈ Πσγ and

    Cγ(σ, Πσγ) := inf_{Πσγ} lim inf (σ/γ) ≤ C.

We have shown that the inequality Cγ(σ, Π) < C always implies the inequality Cγ(σ, Πσγ) ≤ C.
This implies inequality (1.10) and, therefore, relation (1.9). From (1.8) and (1.9) we obtain

    Cγ(Ω0, Πσγ) ≤ Cγ(σ, Π).        (1.11)

Thus, the inequality Cγ(Ω0, Πσγ) > 0 implies the inequality Cγ(σ, Π) > 0, i.e., the γ-
sufficiency on Π. Therefore, the inequality Cγ(Ω0, Πσγ) > 0 is sufficient for the strict
minimum on Π.
The latter assertion contained in Theorem 1.7 turns out to be very simple. However,
a complete proof of Theorem 1.7 requires considerably greater effort. Before passing
directly to its proof, we present all necessary auxiliary assertions. The next section is
devoted to this.

1.2 Proof of the Main Theorem


We will need the main lemma for the proof of Theorem 1.7. In turn, the proof of the
main lemma is based on the following three important properties used in the extremum
theory: the compatibility condition of a set of linear inequalities and equations, the Hoffman
lemma, and the Lyusternik theorem. These three properties compose the basis of the abstract

theory of higher order for smooth problems, and for the reader’s convenience they are
formulated in this section.

1.2.1 Basis of the Abstract Theory


As above, let X and Y be Banach spaces, and let X∗ and Y∗ be their duals. Let a tuple
l = {l1, ..., lm}, li ∈ X∗, i = 1, ..., m, and a linear surjective operator A : X → Y be given.
With the tuple l and the operator A we associate the set Λ = Λ(l, A) consisting of the tuples
of multipliers λ = (α, y∗), α = (α1, ..., αm) ∈ Rm∗, y∗ ∈ Y∗, satisfying the conditions

    αi ≥ 0 (i = 1, ..., m),   Σ_{i=1}^{m} αi = 1,   Σ_{i=1}^{m} αi li + y∗A = 0.

The compatibility criterion for a set of linear inequalities and equations has the following
form.

Lemma 1.11. Let ξ = (ξ1, ..., ξm)∗ ∈ Rm, y ∈ Y. The set of conditions

    ⟨li, x⟩ + ξi < 0 (i = 1, ..., m),   Ax + y = 0

is compatible iff

    sup_{λ∈Λ} ( Σ_{i=1}^{m} αi ξi + ⟨y∗, y⟩ ) < 0.

(By definition, sup_∅ = −∞.) Along with this lemma, in studying problems with
inequality-type constraints, an important role is played by the estimate of the distance to
the solution set of a system of linear inequalities and equations [41], which is presented
below.

Lemma 1.12 (Hoffman). There exists a constant C = C(l, A) with the following property:
if for certain ξ ∈ Rm and y ∈ Y the system

    ⟨li, x⟩ + ξi ≤ 0 (i = 1, ..., m),   Ax + y = 0

is compatible, then it has a solution x satisfying the estimate

    ‖x‖ ≤ C max{ξ1, ..., ξm, ‖y‖}.

In the case where there is no equation Ax + y = 0, Lemma 1.12 holds with the
estimate ‖x‖ ≤ C max{ξ1⁺, ..., ξm⁺}.
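Hoffman's estimate is easy to experiment with numerically. The sketch below is an
illustration only: the data l, A, ξ, y are an arbitrary choice (not taken from the text), and
scipy.optimize.linprog is used to compute a solution of minimal sup-norm of a compatible
system, which is then printed next to the right-hand side max{ξ1, ..., ξm, ‖y‖} of the estimate.

    import numpy as np
    from scipy.optimize import linprog

    # Illustrative data (arbitrary): two inequalities <l_i, x> + xi_i <= 0
    # and one equation A x + y = 0 in R^2.
    l = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
    xi = np.array([-1.0, 0.5])
    A = np.array([[1.0, 1.0]])
    y = np.array([-0.2])

    m, n = l.shape
    # Variables z = (x, t): minimize t subject to the system and |x_j| <= t.
    c = np.zeros(n + 1); c[-1] = 1.0
    A_ub, b_ub = [], []
    for i in range(m):                                 # <l_i, x> <= -xi_i
        A_ub.append(np.append(l[i], 0.0)); b_ub.append(-xi[i])
    for j in range(n):                                 # x_j <= t and -x_j <= t
        for s in (1.0, -1.0):
            e = np.zeros(n + 1); e[j], e[-1] = s, -1.0
            A_ub.append(e); b_ub.append(0.0)
    A_eq = np.hstack([A, np.zeros((len(A), 1))])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), A_eq=A_eq, b_eq=-y,
                  bounds=[(None, None)] * n + [(0.0, None)])
    print("minimal sup-norm of a solution:", res.x[-1])            # 0.7 here
    print("Hoffman right-hand side:", max(xi.max(), np.abs(y).max()))  # 0.5 here

The ratio of the two printed numbers gives a lower bound for the constant C(l, A); the
lemma asserts only that some finite C, depending on l and A but not on ξ and y, works.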
Finally, we present the Lyusternik-type theorem on the estimate of the distance to the
level set of the equality operator in the form convenient for us (see [78, Theorem 2]).
Let a set of sequences Π satisfy the same conditions as in Section 1.1. The following
theorem holds for an operator g : X → Y strictly Π-differentiable at a point x0.

Theorem 1.13. Let g(x0) = 0 and let g′(x0)X = Y. Then there exists C > 0 such that for
any {δxn} ∈ Π, there exists {x̄n} ∈ Π0 satisfying the following conditions for all sufficiently
large n:

    g(x0 + δxn + x̄n) = 0,   ‖x̄n‖ ≤ C ‖g(x0 + δxn)‖.

1.2.2 Main Lemma



We now turn to problem (1.1). Let all assumptions of Section 1.1 hold. Let g′(x0)X = Y.
In the definition of the set Λ0, we choose the normalization Σ αi = 1. According to
Proposition 1.8, it is equivalent to the standard normalization. The following assertion
holds.

Lemma 1.14 (Main Lemma). Let a sequence {δxn} and a sequence of numbers {ζn} be
such that δxn ∈ δ𝒪 for all n, ζn⁺ → 0, and Ω0^L(δxn) + ζn < 0 for all n. Then there exists a
sequence {x̄n} ∈ Π0 such that the following conditions hold:
(1) ‖x̄n‖ ≤ O(σ(δxn) + ζn⁺);
(2) fi(x0 + δxn + x̄n) + ζn ≤ o(‖x̄n‖), i ∈ I;
(3) g(x0 + δxn + x̄n) = 0 for all sufficiently large n.

Proof. For an arbitrary n, let us consider the following set of conditions on x̄:

    ⟨fi′(x0), x̄⟩ + fi(x0 + δxn) + ζn < 0, i ∈ I;   g′(x0)x̄ + g(x0 + δxn) = 0.        (1.12)

Let (αi)_{i∈I} and y∗ be a tuple from the set Λ of system (1.12). We set αi = 0 for i ∉ I.
Then λ = (α0, ..., αk, y∗) ∈ Λ0^L. The converse is also true: if λ = (α0, ..., αk, y∗) ∈ Λ0^L,
then αi = 0 for i ∉ I and the tuple ((αi)_{i∈I}, y∗) belongs to the set Λ of system (1.12).
Therefore,

    max_Λ ( Σ_{i∈I} αi (fi(x0 + δxn) + ζn) + ⟨y∗, g(x0 + δxn)⟩ )
      = max_{Λ0^L} ( Σ_{i=0}^{k} αi (fi(x0 + δxn) + ζn) + ⟨y∗, g(x0 + δxn)⟩ )
      = Ω0^L(δxn) + ζn < 0.

The latter relation is implied by the definition of the function Ω0^L and the normalization
Σ αi = 1. According to Lemma 1.11, system (1.12) is compatible. Then by Hoffman's
lemma (Lemma 1.12), there exist C > 0 and a sequence {x̄n′} such that for all n,

    ⟨fi′(x0), x̄n′⟩ + fi(x0 + δxn) + ζn ≤ 0, i ∈ I;        (1.13)
    g′(x0)x̄n′ + g(x0 + δxn) = 0;        (1.14)
    ‖x̄n′‖ ≤ C max{ max_{i∈I}(fi(x0 + δxn) + ζn), ‖g(x0 + δxn)‖ }.        (1.15)

It follows from (1.15) that for all sufficiently large n,

    ‖x̄n′‖ ≤ C(σ(δxn) + ζn⁺) → 0,        (1.16)

since fi(x0 + δxn) → fi(x0) < 0 for i ∉ I. Therefore, {x̄n′} ∈ Π0. Since g is strictly Π-
differentiable at the point x0, condition (1.14) implies g(x0 + δxn + x̄n′) = o(‖x̄n′‖). Then
by the Lyusternik theorem (see Theorem 1.13), there exists {x̄n″} ∈ Π0 such that for all
sufficiently large n, we have

    g(x0 + δxn + x̄n′ + x̄n″) = 0,        (1.17)
    ‖x̄n″‖ = o(‖x̄n′‖).        (1.18)

We set {x̄n} = {x̄n′ + x̄n″}. Condition (1.17) implies g(x0 + δxn + x̄n) = 0 for all sufficiently
large n, and conditions (1.16) and (1.18) imply ‖x̄n‖ ≤ O(σ(δxn) + ζn⁺). Further, we obtain
from conditions (1.13) and the property of strict Π-differentiability of the functionals fi
at the point x0 that

    fi(x0 + δxn + x̄n) + ζn
      = fi(x0 + δxn) + ⟨fi′(x0), x̄n′⟩ + ⟨fi′(x0), x̄n″⟩ + ζn + o(‖x̄n‖)
      ≤ ⟨fi′(x0), x̄n″⟩ + o(‖x̄n‖) = o1(‖x̄n‖), i ∈ I.

The latter relation holds because of (1.18). The lemma is proved.

Now, we prove a number of assertions from which the main result (Theorem 1.7) will
follow. Below we assume that the order γ is strict and higher on Π, and all assumptions of
Section 1.1 hold for the set of sequences Π and problem (1.1) at the point x0.

1.2.3 Case Where the Lyusternik Condition Holds


We have the following theorem.

Theorem 1.15. Let g′(x0)X = Y. Then Cγ(Ω0^L, Πσγ) = Cγ(m, Πg).

Proof. We first show that Cγ(Ω0^L, Πσγ) ≤ Cγ(m, Πg). Indeed,

    Cγ(Ω0^L, Πσγ) ≤ Cγ(Ω0^L, Πσγ ∩ Πg) ≤ Cγ(m, Πσγ ∩ Πg) = Cγ(m, Πg).

Here, the first inequality is obvious, and the second inequality follows from the obvious
estimate Ω0^L ≤ m on Πg. The equality is proved in the same way as relation (1.9).
Now let us prove the inequality Cγ(m, Πg) ≤ Cγ(Ω0^L, Πσγ), which will finish the proof
of the theorem. If Cγ(Ω0^L, Πσγ) = +∞, the inequality holds. Let Cγ(Ω0^L, Πσγ) < +∞,
and let C be such that

    Cγ(Ω0^L, Πσγ) := inf_{Πσγ} lim inf (Ω0^L/γ) < −C.

Then there exists a sequence {δxn} ∈ Πσγ at which Ω0^L(δxn) + Cγ(δxn) < 0 and, moreover,
δxn ∈ δ𝒪 for all n. We set ζn = Cγ(δxn). According to the main lemma, there exists a
sequence {x̄n} such that the following conditions hold:
(α) ‖x̄n‖ ≤ O(σ(δxn) + C⁺γ(δxn));
(β) fi(x0 + δxn + x̄n) + Cγ(δxn) ≤ o(‖x̄n‖), i ∈ I;
(γ) g(x0 + δxn + x̄n) = 0 for all sufficiently large n.
Since {δxn} ∈ Πσγ, the first condition implies ‖x̄n‖ ≤ O(γ(δxn)). We set {δxn′} = {δxn + x̄n}.
Then condition (γ) implies {δxn′} ∈ Πg, and condition (β) implies

    fi(x0 + δxn′) + Cγ(δxn) ≤ o(γ(δxn)), i ∈ I.

From this we obtain

    lim inf ( m(δxn′)/γ(δxn) ) ≤ −C.

Since γ is a higher order, we have

    γ(δxn′) = γ(δxn + x̄n) = γ(δxn) + o(‖x̄n‖) = γ(δxn) + o(γ(δxn)).

Therefore,

    lim inf ( m(δxn′)/γ(δxn′) ) ≤ −C.

Taking into account that {δxn′} ∈ Πg, we obtain from this that Cγ(m, Πg) ≤ −C. Therefore,
we have proved that the inequality Cγ(Ω0^L, Πσγ) < −C always implies the inequality
Cγ(m, Πg) ≤ −C. Therefore, Cγ(m, Πg) ≤ Cγ(Ω0^L, Πσγ). The theorem is completely
proved.

Theorem 1.15 and Proposition 1.10 imply the following theorem.

Theorem 1.16. Let g′(x0)X = Y. Then the following three inequalities are pairwise equivalent:

    Cγ(m, Πg) ≥ 0,   Cγ(Ω0^L, Πσγ) ≥ 0,   and   Cγ(Ω0, Πσγ) ≥ 0.
Now, consider the sequence of relations

    Cγ(Ω0, Πσγ) ≤ Cγ(σ, Π) ≤ Cγ(σ, Πg) = Cγ(m⁺, Πg) = Cγ(m, Πg)⁺.        (1.19)

The first of these relations was proved in Section 1.1 (inequality (1.11)), and the other
relations are obvious. The following assertion follows from (1.19), Theorem 1.15, and
inequality (1.7).

Corollary 1.17. Let g′(x0)X = Y. Then, for some a, 0 < a ≤ 1, the following inequalities hold:

    Cγ(Ω0, Πσγ) ≤ Cγ(σ, Π) ≤ Cγ(m, Πg)⁺ ≤ Cγ(Ω0^L, Πσγ)⁺ ≤ (1/a) Cγ(Ω0, Πσγ)⁺.

This implies the following theorem.

Theorem 1.18. Let g′(x0)X = Y. Then the following four inequalities are pairwise equivalent:

    Cγ(σ, Π) > 0,   Cγ(m, Πg) > 0,   Cγ(Ω0, Πσγ) > 0,   and   Cγ(Ω0^L, Πσγ) > 0.

Therefore, in the case where the Lyusternik condition holds, we have proved all the
assertions of Theorem 1.7.

1.2.4 Case Where the Lyusternik Condition Is Violated


To complete the proof of Theorem 1.7, we need to prove the following: if
g′(x0)X ≠ Y, then (a) Ω0 ≥ 0, and (b) the inequality Cγ(σ, Π) > 0 is equivalent to the
inequality Cγ(Ω0, Πσγ) > 0. We begin with the proof of (a).

Proposition 1.19. If g′(x0)X ≠ Y, then Ω0(δx) ≥ 0 for all δx ∈ δ𝒪.



Proof. Since the image Y1 := g′(x0)X is closed in Y, the condition Y1 ≠ Y implies the
existence of y∗ ∈ Y∗, ‖y∗‖ = 1, such that ⟨y∗, y⟩ = 0 for all y ∈ Y1, and hence λ′ = (0, y∗) ∈ Λ0
and λ″ = (0, −y∗) ∈ Λ0. From this we obtain that for any δx ∈ δ𝒪,

    max_{Λ0} Ω(λ, δx) ≥ max{Ω(λ′, δx), Ω(λ″, δx)} = |⟨y∗, g(x0 + δx)⟩| ≥ 0.

The proposition is proved.

Now let us prove assertion (b). Since the inequality Cγ(Ω0, Πσγ) ≤ Cγ(σ, Π) always
holds by (1.11), in order to prove (b), we need to prove the following lemma.

Lemma 1.20. Let g′(x0)X ≠ Y. Then there exists a constant b = b(g′(x0)) > 0 such that

    Cγ(σ, Π) ≤ b Cγ(Ω0, Πσγ)⁺.        (1.20)

Proof. The proof uses a special method for passing from the system S = {f0, ..., fk, g} to a
certain auxiliary system Ŝ. We set Y1 = g′(x0)X. According to the definition of the smooth
problem, Y = Y1 ⊕ Y2, where Y2 is a closed subspace in Y. Then Y∗ = W1 ⊕ W2, where W1
and W2 are such that any functional from W1 annihilates Y2, and any functional from
W2 annihilates Y1. Without loss of generality, we assume that if y = y1 + y2, y1 ∈ Y1,
and y2 ∈ Y2, then ‖y‖ = max{‖y1‖, ‖y2‖}. Then for y∗ = y1∗ + y2∗, y1∗ ∈ W1, y2∗ ∈ W2, we
have ‖y∗‖ = ‖y1∗‖ + ‖y2∗‖.
Let P1 : Y → Y1 and P2 : Y → Y2 be the projections compatible with the decomposition
of Y into the direct sum Y = Y1 ⊕ Y2. Then P1 + P2 = I, P1P2 = 0, and P2P1 = 0. We set
g1 = P1g and g2 = P2g. Then g = g1 + g2, g1′(x0)X = Y1, and g2′(x0)X = {0}. Introduce
the functional fg(x) = ‖g2(x)‖. The condition g2′(x0)X = {0} implies that fg is strictly
Π-differentiable at the point x0 and fg′(x0) = 0. Consider the system Ŝ consisting of the
functionals f0, ..., fk, fg and the operator g1. All objects related to this system will be
endowed with the sign ∧. Since g = g1 + g2, we have ‖g‖ = max{‖g1‖, ‖g2‖}. Therefore,

    σ(δx) := max{f0(x0 + δx), ..., fk(x0 + δx), ‖g(x0 + δx)‖}
           = max{f0(x0 + δx), ..., fk(x0 + δx), fg(x0 + δx), ‖g1(x0 + δx)‖} =: σ̂(δx).

This implies

    Cγ(σ, Π) = Cγ(σ̂, Π).        (1.21)

Further, since the Lyusternik condition g1′(x0)X = Y1 holds for the system Ŝ, by Corollary
1.17, there exists â > 0 such that

    Cγ(σ̂, Π) ≤ (1/â) Cγ(Ω̂0, Πσ̂γ)⁺.        (1.22)

Now let us show that Ω̂0 ≤ Ω0. If Λ̂0 is empty, then Ω̂0 = −∞, and hence the inequality
holds. Let Λ̂0 be nonempty, and let λ̂ = (α0, ..., αk, αg, y1∗) be an arbitrary element of the
set Λ̂0. Then

    αi ≥ 0 (i = 0, ..., k),   αi fi(x0) = 0 (i = 1, ..., k),   αg ≥ 0;   y1∗ ∈ W1,
    Σ_{i=0}^{k} αi + αg + ‖y1∗‖ = 1,   Σ_{i=0}^{k} αi fi′(x0) + αg fg′(x0) + y1∗ g1′(x0) = 0.

Moreover, fg′(x0) = 0. Let δx ∈ δ𝒪 be an arbitrary element. Choose y2∗ ∈ W2 so that the
following conditions hold:

    ‖y2∗‖ = αg,   ⟨y2∗, g2(x0 + δx)⟩ = αg ‖g2(x0 + δx)‖.

We set y∗ = y1∗ + y2∗ and λ = (α0, ..., αk, y∗). As is easily seen, we then have λ ∈ Λ0 and
Ω̂(λ̂, δx) = Ω(λ, δx). Therefore, for arbitrary δx and λ̂ ∈ Λ̂0, there exists λ ∈ Λ0 such that
the indicated relation holds. This implies Ω̂0(δx) ≤ Ω0(δx). Also, taking into account that
σ̂ = σ, we obtain

    Cγ(Ω̂0, Πσ̂γ) ≤ Cγ(Ω0, Πσγ).        (1.23)

It follows from (1.21)–(1.23) that

    Cγ(σ, Π) ≤ (1/â) Cγ(Ω0, Πσγ)⁺.        (1.24)

It remains to set b = 1/â. The lemma is proved.

Therefore, we have shown that in the case where the Lyusternik condition is violated, the
inequalities Cγ(σ, Π) > 0 and Cγ(Ω0, Πσγ) > 0 are equivalent. Thus, we have completed
the proof of Theorem 1.7.

1.3 Simple Applications of the Abstract Scheme


In this section, following [55], we shall obtain quadratic conditions in a smooth problem in
Rn and in the problem of Bliss with endpoint inequalities.

1.3.1 A Smooth Problem in Rn


Let X = Rn, Y = Rm, and 𝒪 = X. Consider the problem

    J(x) → min;   fi(x) ≤ 0 (i = 1, ..., k),   g(x) = 0.        (1.25)

We assume that the functions J : Rn → R1, fi : Rn → R1, i = 1, ..., k, and the operator
g : Rn → Rm are twice differentiable at each point. Let a point x0 satisfy the constraints, and
let us study its optimality. We define f0 and I as in relation (1.2). Let Π = Π0 := {{δxn} |
‖δxn‖ → 0 (n → ∞)}. Obviously, (1.25) is a smooth problem on Π0 at the point x0, and the
minimum on Π0 is a local minimum. For an order we take the positive definite quadratic
functional γ(δx) = ⟨δx, δx⟩ in Rn (which is unique up to a nonsingular transformation).
Obviously, γ is a strict higher order (see the definition in Section 1.1.3). Let us define the
violation function σ as in Section 1.1.1. Denote by Πσγ the set of sequences {δxn} ∈ Π
satisfying the condition σ(δxn) ≤ O(γ(δxn)). Define the following necessary condition for
a local minimum at a point x0.
Condition ℵγ: g′(x0)X ≠ Y or γ-necessity holds.
By the results of Section 1.1.4, the inequality Cγ(Ω0, Πσγ) ≥ 0 is a necessary condition for
a local minimum and is equivalent to Condition ℵγ, and Cγ(Ω0, Πσγ) > 0 is a sufficient
condition for a local minimum and is equivalent to γ-sufficiency. Below we transform the
expression for

    Cγ(Ω0, Πσγ) := inf_{{δxn}∈Πσγ} lim inf_{n→∞} Ω0(δxn)/γ(δxn)

into an equivalent simpler form. We recall that

    Ω(λ, δx) := Σ_{i=0}^{k} αi fi(x0 + δx) + ⟨y∗, g(x0 + δx)⟩ = L(λ, x0 + δx),
    Ω0(δx) := max_{λ∈Λ0} Ω(λ, δx),

where the set Λ0 is defined by (1.3). We set

    Ω̄λ(x̄) := (1/2) ⟨Lxx(λ, x0)x̄, x̄⟩,   Ω̄0(x̄) := max_{λ∈Λ0} Ω̄λ(x̄),
    K := {x̄ ∈ X | ⟨fi′(x0), x̄⟩ ≤ 0, i ∈ I;  g′(x0)x̄ = 0},
    σ̄(x̄) := Σ_{i∈I} ⟨fi′(x0), x̄⟩⁺ + ‖g′(x0)x̄‖,
    Cγ(Ω̄0, K) := inf{Ω̄0(x̄) | x̄ ∈ K, γ(x̄) = 1}.

If K = {0}, then as usual we set Cγ(Ω̄0, K) = +∞. Obviously, K = {x̄ ∈ X | σ̄(x̄) = 0}.
We call K the critical cone.

Theorem 1.21. The following equality holds: Cγ(Ω0, Πσγ) = Cγ(Ω̄0, K).

Proof. Evidently,

    Ω0(δx) = Ω̄0(δx) + o(γ(δx)) as δx → 0,        (1.26)
    σ(δx) = σ̄(δx) + O(γ(δx)) as δx → 0.        (1.27)

We set

    Πσ̄γ := { {δxn} ∈ Π0 | σ̄(δxn) ≤ O(γ(δxn)) }.

It follows from (1.27) that Πσ̄γ = Πσγ; hence, by taking into account (1.26), we immediately
obtain Cγ(Ω0, Πσγ) = Cγ(Ω̄0, Πσ̄γ), where

    Cγ(Ω̄0, Πσ̄γ) := inf_{{δxn}∈Πσ̄γ} lim inf_{n→∞} Ω̄0(δxn)/γ(δxn).

Then we set

    Πσ̄ := { {δxn} ∈ Π0 | σ̄(δxn) = 0 },   Cγ(Ω̄0, Πσ̄) := inf_{{δxn}∈Πσ̄} lim inf_{n→∞} Ω̄0(δxn)/γ(δxn).

Since, obviously, Πσ̄ ⊂ Πσ̄γ, we have Cγ(Ω̄0, Πσ̄) ≥ Cγ(Ω̄0, Πσ̄γ). We show that, in
fact, equality holds. Suppose that Cγ(Ω̄0, Πσ̄γ) ≠ +∞. Take any ε > 0. Let {δxn} ∈ Πσ̄γ
be a nonvanishing sequence such that

    lim_{n→∞} Ω̄0(δxn)/γ(δxn) ≤ Cγ(Ω̄0, Πσ̄γ) + ε.

By applying the Hoffman lemma (see Lemma 1.12) for each δxn (n = 1, 2, ...) to the system

    ⟨fi′(x0), δxn + x̄⟩ ≤ 0, i ∈ I,   g′(x0)(δxn + x̄) = 0,

regarded as a system in the unknown x̄, and by bearing in mind that σ̄(δxn) ≤ O(γ(δxn)),
we obtain the following assertion: we can find {x̄n} such that σ̄(δxn + x̄n) = 0 and
‖x̄n‖ ≤ O(γ(δxn)). Consequently,

    lim_{n→∞} Ω̄0(δxn + x̄n)/γ(δxn + x̄n) = lim_{n→∞} Ω̄0(δxn)/γ(δxn) ≤ Cγ(Ω̄0, Πσ̄γ) + ε,

and {δxn + x̄n} ∈ Πσ̄. This implies that Cγ(Ω̄0, Πσ̄) ≤ Cγ(Ω̄0, Πσ̄γ). Consequently, the
equality Cγ(Ω̄0, Πσ̄) = Cγ(Ω̄0, Πσ̄γ) holds, from which we also obtain Cγ(Ω0, Πσγ) =
Cγ(Ω̄0, Πσ̄). But since Ω̄0 and γ are positively homogeneous of degree 2, by applying
the definition of the cone K, in an obvious way we obtain that, in turn, Cγ(Ω̄0, Πσ̄) =
Cγ(Ω̄0, K). Thus, Cγ(Ω0, Πσγ) = Cγ(Ω̄0, K), and the theorem is proved.

Corollary 1.22. The condition Cγ(Ω̄0, K) ≥ 0 is equivalent to Condition ℵγ and so is
necessary for a local minimum. The condition Cγ(Ω̄0, K) > 0 is equivalent to γ-sufficiency
and so is sufficient for a strict local minimum. In particular, K = {0} is sufficient for a strict
local minimum.
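Corollary 1.22 lends itself to rough numerical exploration: sample the unit sphere, keep
the points that lie in K, and take the worst value of max_λ Ω̄λ. The sketch below is only an
illustration (the toy problem and the Monte Carlo scheme are our own choices, not from
the text), and sampling yields an upper estimate of the infimum, not a certified bound.

    import numpy as np

    def basic_constant(Lxx_list, active_grads, samples=100000, seed=0):
        """Monte Carlo estimate (an upper bound) of C_gamma(Omega0bar, K):
        the infimum over K intersected with the unit sphere of
        max over multipliers of (1/2) <Lxx(lambda) x, x>."""
        rng = np.random.default_rng(seed)
        n = Lxx_list[0].shape[0]
        best = np.inf
        for _ in range(samples):
            x = rng.normal(size=n)
            x /= np.linalg.norm(x)                        # gamma(x) = <x, x> = 1
            if any(g @ x > 1e-12 for g in active_grads):  # outside the cone K
                continue
            best = min(best, max(0.5 * x @ L @ x for L in Lxx_list))
        return best

    # Toy problem (illustrative): J(x) = x1^2 + x2^2, f1(x) = -x1 <= 0, x0 = 0.
    # Here the only normalized multiplier is (alpha0, alpha1) = (1, 0), so
    # Lxx = 2I, and K = {x : x1 >= 0}; the printed value is 1.0 > 0, so the
    # corollary certifies a strict local minimum at x0.
    print(basic_constant([2.0 * np.eye(2)], [np.array([-1.0, 0.0])]))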

It is obvious that Cγ(Ω̄0, K) ≥ 0 is equivalent to the condition Ω̄0 ≥ 0 on K, and,
since K is finite-dimensional, Cγ(Ω̄0, K) > 0 is equivalent to Ω̄0 > 0 on K \ {0}. We also
remark that these conditions are stated by means of the maximum of a family of quadratic forms,
and they cannot be reduced in an equivalent way to a condition on one of these forms. Here
is a relevant example.
Example 1.23 (Milyutin). Let X = R2, and let ϕ and ρ be polar coordinates in R2. Let the
four quadratic forms Qi, i = 1, 2, 3, 4, be defined by their traces qi on the circle of unit radius:

    q1(ϕ) = sin 2ϕ − ε,   q2(ϕ) = −sin 2ϕ − ε,
    q3(ϕ) = cos 2ϕ − ε,   q4(ϕ) = −cos 2ϕ − ε.

We choose the constant ε > 0 so that max_{1≤i≤4} qi(ϕ) > 0, 0 ≤ ϕ < 2π. We consider the
system S formed from the functionals fi(x) = Qi(x), i = 1, 2, 3, 4, in a neighborhood of
x0 = 0. For γ(x) = ⟨x, x⟩ this system has γ-sufficiency. In fact, since fi(0) = 0, i = 1, 2, 3, 4,
we have K = R2, Λ0 = {α ∈ R4 | αi ≥ 0, Σ αi = 1}, and Ω̄0(x) = max_{1≤i≤4} Qi(x), and
so Ω̄0(x) ≥ εγ(x) for all x ∈ X. But no form Σ_{i=1}^{4} αi Qi, where αi ≥ 0 and Σ αi = 1,
is nonnegative, since its trace q on the circle of unit radius has the form q(ϕ) = A sin 2ϕ +
B cos 2ϕ − ε.
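Milyutin's example is easy to verify numerically. The sketch below takes ε = 1/4 (one
admissible value; the text only requires ε small enough) and checks on a grid that the
pointwise maximum of the four traces stays positive, while every sampled convex
combination dips below zero somewhere on the circle.

    import numpy as np

    eps = 0.25                                   # one admissible choice of eps
    phi = np.linspace(0.0, 2.0 * np.pi, 2001)
    q = np.stack([ np.sin(2 * phi) - eps, -np.sin(2 * phi) - eps,
                   np.cos(2 * phi) - eps, -np.cos(2 * phi) - eps])

    # max of the four traces is bounded away from zero on the whole circle:
    print(q.max(axis=0).min())                   # about sqrt(2)/2 - eps > 0

    # ... but every convex combination A sin(2 phi) + B cos(2 phi) - eps has
    # minimum -sqrt(A^2 + B^2) - eps < 0; spot-check random combinations:
    rng = np.random.default_rng(0)
    for _ in range(3):
        alpha = rng.dirichlet(np.ones(4))        # alpha_i >= 0, sum alpha = 1
        print((alpha @ q).min())                 # always negative

The first printed number is exactly the constant Cγ(Ω̄0, K) for this system, which makes
the γ-sufficiency claimed in the example visible at a glance.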

1.3.2 The Problem of Bliss with Endpoint Inequalities


We consider the following problem. It is required to minimize the function of the initial and
final states
J (x(t0 ), x(tf )) → min (1.28)
under the constraints
Fi (x(t0 ), x(tf )) ≤ 0 (i = 1, . . . , k), K(x(t0 ), x(tf )) = 0, ẋ = f (t, x, u), (1.29)

where x ∈ Rn , u ∈ Rr , J ∈ R, Fi ∈ R, K ∈ Rs , f ∈ Rn , and the interval [t0 , tf ] is fixed.


Strictly speaking, the problem of Bliss [6] includes the local equation g(t, x, u) = 0, where
g ∈ Rq , and it is traditional to require that the matrix gu has maximal rank at points (t, x, u)
such that g(t, x, u) = 0. This statement will be considered later. Here, for simplicity, we
consider the problem without the local equation.
We set W = W1,1([t0, tf], Rn) × L∞([t0, tf], Rr), where W1,1([t0, tf], Rn) is the space
of n-dimensional absolutely continuous functions, and L∞([t0, tf], Rr) is the space of
r-dimensional bounded measurable functions. We consider problem (1.28)–(1.29) in W.
We denote the pair (x, u) by w, and we define the norm in W by

    ‖w‖ = ‖x‖1,1 + ‖u‖∞ = |x(t0)| + ∫_{t0}^{tf} |ẋ(t)| dt + ess sup_{t∈[t0,tf]} |u(t)|.

Clearly, a local minimum in this space is weak. This is the minimum on the set of sequences
Π0 := {{δwn} | ‖δwn‖ → 0 (n → ∞)}. Again we set Π = Π0.
We denote the argument of the functions J , Fi , and K by p = (x0 , xf ), where x0 ∈ Rn
and xf ∈ Rn . All relations containing measurable sets and functions are understood with
accuracy up to a set of measure zero. We assume the following.

Assumption 1.24. The functions J , Fi , and K are twice continuously differentiable with
respect to p; the function f is twice differentiable with respect to w; the function f and its
second derivative fww are uniformly bounded and equicontinuous with respect to w on any
bounded set of values (t, w) and are measurable in t for any fixed w.

Evidently, f satisfies these conditions if f and fww are continuous jointly in both
variables. Let w0(·) ∈ W be a trajectory satisfying all the constraints, whose optimality is
to be investigated. We set p0 = (x0(t0), x0(tf)), F0(p) = J(p) − J(p0),
I = {i ∈ {0, 1, ..., k} | Fi(p0) = 0}. Obviously, problem (1.28)–(1.29) is smooth on Π0
at the point w0. The set Λ0 consists of tuples λ = (α, β, ψ) for which the local form of
the Pontryagin minimum principle holds:
    α ∈ R^{(k+1)∗},   β ∈ Rs∗,   ψ(·) ∈ W1,1([t0, tf], Rn∗),        (1.30)

    α ≥ 0,   αi Fi(p0) = 0 (i = 1, ..., k),   Σ_{i=0}^{k} αi + Σ_{j=1}^{s} |βj| = 1,        (1.31)

    ψ̇ = −Hx(t, w0, ψ),   ψ(t0) = −lx0(p0, α, β),   ψ(tf) = lxf(p0, α, β),        (1.32)

    Hu(t, w0, ψ) = 0,        (1.33)

where

    H = ψf,   l = αF + βK.        (1.34)
The notation Rn∗ stands for the space of n-dimensional row vectors. We emphasize the
dependence of the Pontryagin function H and the endpoint Lagrange function l on λ, which
is defined by (1.34), by writing H = Hλ(t, w) and l = lλ(p). Under our assumptions, Λ0 is
a finite-dimensional compact set, each point of which is uniquely determined by its
projection (α, β).

For any δw ∈ W and λ ∈ Λ0 the Lagrange function Ω has the form

    Ω(λ, δw) = lλ(p0 + δp) + ∫_{t0}^{tf} [ Hλ(t, w0 + δw) − Hλ(t, w0) ] dt − ∫_{t0}^{tf} ψ δẋ dt,        (1.35)

where δp = (δx(t0), δx(tf)). For arbitrary w̄ ∈ W and λ ∈ Λ0 we set

    ωλ(w̄) := (1/2) ⟨lλ_pp(p0)p̄, p̄⟩ + (1/2) ∫_{t0}^{tf} ⟨Hλ_ww(t, w0)w̄, w̄⟩ dt,   ω0(w̄) = max_{λ∈Λ0} ωλ(w̄),        (1.36)

where p̄ = (x̄(t0), x̄(tf)). Set

    γ(δw) = ⟨δx(t0), δx(t0)⟩ + ∫_{t0}^{tf} ⟨δẋ(t), δẋ(t)⟩ dt + ∫_{t0}^{tf} ⟨δu(t), δu(t)⟩ dt.        (1.37)

Obviously, γ is a strict higher order. We define the cone of critical variations

    K = { w̄ ∈ W | Fi′(p0)p̄ ≤ 0, i ∈ I;  K′(p0)p̄ = 0;  x̄˙ = fw(t, w0)w̄ }        (1.38)

and the constant

    Cγ(ω0, K) = inf { ω0(w̄) | w̄ ∈ K, γ(w̄) = 1 }.        (1.39)

(Let us note that the sign of the constant Cγ(ω0, K) will not change if we replace, in its
definition, the functional γ by the functional γ̄(w̄) = ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt.) We
define Condition ℵγ as in Section 1.3.1.

Theorem 1.25. The condition Cγ(ω0, K) ≥ 0 is equivalent to Condition ℵγ and so is
necessary for a local minimum; the condition Cγ(ω0, K) > 0 is equivalent to γ-sufficiency
and so is sufficient for a strict local minimum.

This result, among others, is given in [84]. But it is not difficult to derive it directly by
following the scheme indicated in our discussion of the finite-dimensional case. Because
the problem is smooth, Theorem 1.25 follows from Cγ(Ω0, Πσγ) = Cγ(ω0, K), which is
established in the same way as in Section 1.3.1.
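For numerical experiments with Theorem 1.25 it is convenient to generate elements of the
critical cone (1.38) by integrating the variational equation: given ū, solve x̄˙ = fx(t, w0)x̄ +
fu(t, w0)ū and then check the endpoint conditions and the sign of ω0. The sketch below is
a crude illustration (forward Euler; the data fx, fu and the control ū are freely chosen and
are not taken from the text), together with a Riemann-sum evaluation of the order (1.37).

    import numpy as np

    def linearized_state(fx, fu, ubar, xbar0, t):
        """Integrate  d/dt xbar = fx(t) xbar + fu(t) ubar(t)  by forward Euler
        (any ODE solver would do; this is only a sketch)."""
        xbar = np.zeros((len(t), len(xbar0)))
        xbar[0] = xbar0
        for i in range(len(t) - 1):
            h = t[i + 1] - t[i]
            xbar[i + 1] = xbar[i] + h * (fx(t[i]) @ xbar[i] + fu(t[i]) @ ubar[i])
        return xbar

    # Illustrative data: double integrator x1' = x2, x2' = u on [0, 1].
    t = np.linspace(0.0, 1.0, 1001)
    fx = lambda s: np.array([[0.0, 1.0], [0.0, 0.0]])
    fu = lambda s: np.array([[0.0], [1.0]])
    ubar = np.sin(2.0 * np.pi * t)[:, None]
    xbar = linearized_state(fx, fu, ubar, np.zeros(2), t)

    # gamma(wbar) from (1.37), by a simple Riemann sum:
    h = t[1] - t[0]
    dxbar = np.vstack([np.diff(xbar, axis=0) / h, np.zeros((1, 2))])
    gamma = xbar[0] @ xbar[0] + h * ((dxbar ** 2).sum() + (ubar ** 2).sum())
    print(gamma)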
Notes on SSC for abstract optimization problems. Maurer and Zowe [75] consid-
ered optimization problems in Banach spaces with fully infinite-dimensional equality and
inequality constraints defined by cone constraints and derived SSC for quadratic function-
als γ . Maurer [62] showed that the SSC in [75] can be applied to optimal control problems
by taking into account the so-called “two-norm discrepancy.”
Chapter 2

Quadratic Conditions in the General Problem of the Calculus of Variations

In this chapter, on the basis of the results of Chapter 1, we create the quadratic theory of
conditions for a Pontryagin minimum in the general problem of the calculus of variations
without local mixed constraint g(t, x, u) = 0. Following [92], we perform the decoding
of the basic constant for the set of the so-called “Pontryagin sequences” and a special
higher order γ, which is characteristic for extremals whose control has finitely many
discontinuities of the first kind. In Section 2.1, we formulate both necessary and sufficient quadratic
conditions for a Pontryagin minimum, which will be obtained as a result of the decoding.
In Sections 2.2 and 2.3, we make some preparations for the decoding. In Section 2.4,
we estimate the basic constant from above, thus obtaining quadratic necessary conditions
for a Pontryagin minimum, and in Section 2.5, we estimate it from below, thus obtaining
sufficient conditions. In Section 2.7, we establish the relation of the obtained sufficient
conditions for a Pontryagin minimum to conditions for a strong minimum.

2.1 Statements of Quadratic Conditions for a Pontryagin Minimum

2.1.1 Statement of the Problem and Assumptions
In this chapter, we consider the following general problem of the calculus of variations on
a fixed time interval Δ := [t0, tf]:
J (x(t0 ), x(tf )) → min, (2.1)
F (x(t0 ), x(tf )) ≤ 0, K(x(t0 ), x(tf )) = 0, (2.2)
ẋ = f (t, x, u), (2.3)
(x(t0 ), x(tf )) ∈ P , (t, x, u) ∈ Q, (2.4)

where P ⊂ R2d(x) and Q ⊂ R1+d(x)+d(u) are open sets. By d(a) we denote the dimension
of vector a. Problem (2.1)–(2.4) also will be called the canonical problem. For the sake of
brevity, we set
x(t0 ) = x0 , x(tf ) = xf , (x0 , xf ) = p, (x, u) = w.


We seek the minimum among pairs of functions w = (x, u) such that x(t) is an absolutely
continuous function on Δ = [t0, tf] and u(t) is a bounded measurable function on Δ. Recall
that by W1,1(Δ, Rd(x)) we denote the space of absolutely continuous functions x : Δ →
Rd(x), endowed with the norm ‖x‖1,1 := |x(t0)| + ∫_{t0}^{tf} |ẋ(t)| dt, and L∞(Δ, Rd(u)) denotes
the space of bounded measurable functions u : Δ → Rd(u), endowed with the norm ‖u‖∞ :=
ess sup_{t∈[t0,tf]} |u(t)|. We set

    W = W1,1(Δ, Rd(x)) × L∞(Δ, Rd(u)).

We define the norm in the space W as the sum of the norms in the spaces W1,1(Δ, Rd(x))
and L∞(Δ, Rd(u)): ‖w‖ = ‖x‖1,1 + ‖u‖∞. The space W with this norm is a Banach space.
Therefore, we seek the minimum in the space W. A pair w = (x, u) is said to be admissible
if w ∈ W and constraints (2.2)–(2.4) are satisfied by w.
We assume that the functions J (p), F (p), and K(p) are defined and twice contin-
uously differentiable on the open set P , and the function f (t, w) is defined and twice
continuously differentiable on the open set Q. These are the assumptions on the functions
of the problem. Before formulating the assumptions on the point w 0 ∈ W being studied, we
give the following definition.

Definition 2.1. We say that t∗ ∈ (t0, tf) is an L-point (or Lipschitz point) of a function
ϕ : [t0, tf] → Rn if at t∗ there exist the left and right limit values

    ϕ(t∗−) = lim_{t→t∗, t<t∗} ϕ(t),   ϕ(t∗+) = lim_{t→t∗, t>t∗} ϕ(t),

and there exist L > 0 and ε > 0 such that

    |ϕ(t) − ϕ(t∗−)| ≤ L|t − t∗|  ∀ t ∈ (t∗ − ε, t∗) ∩ [t0, tf],
    |ϕ(t) − ϕ(t∗+)| ≤ L|t − t∗|  ∀ t ∈ (t∗, t∗ + ε) ∩ [t0, tf].

A point of discontinuity of the first kind that is an L-point will be called a point of
L-discontinuity.

Let w0 = (x0, u0) be a pair satisfying the constraints of the problem, whose optimality
is studied. We assume that the control u0(·) is piecewise continuous. Denote by Θ =
{t1, ..., ts} the set of points of discontinuity of the control u0, where t0 < t1 < ··· < ts < tf.
We assume that Θ is nonempty (in the case where Θ is empty, all the results remain valid
and are obviously simplified). We assume that each tk ∈ Θ is a point of L-discontinuity.
By u0k− = u0(tk−) and u0k+ = u0(tk+) we denote the left and right limit values of the
function u0(t) at the point tk ∈ Θ, respectively. For a piecewise continuous function u0(t),
the condition (t, x0, u0) ∈ Q means that (t, x0(t), u0(t)) ∈ Q for all t ∈ [t0, tf]\Θ. We also
assume that (tk, x0(tk), u0k−) ∈ Q and (tk, x0(tk), u0k+) ∈ Q for all tk ∈ Θ. As above, all
relations and conditions involving measurable functions are assumed to be valid with accuracy
up to a set of zero measure even if this is not specified.

2.1.2 Minimum on the Set of Sequences


Let S be an arbitrary set of sequences {δwn } in the space W invariant with respect to the
operation of passing to a subsequence. According to Definition 1.1, w0 is a minimum point

on S if there is no sequence {δwn } ∈ S such that the following conditions hold for all its
members:
J (p 0 + δpn ) < J (p0 ), F (p0 + δpn ) ≤ 0, K(p 0 + δpn ) = 0,
ẋ 0 + δ ẋn = f (t, w0 + δwn ), (p0 + δpn ) ∈ P , (t, w0 + δwn ) ∈ Q,
where p 0 = (x 0 (t0 ), x 0 (tf )), δwn = (δxn , δun ), and δpn = (δxn (t0 ), δxn (tf )). In a similar
way, we define the strict minimum on S: one need only replace the strict inequality
J(p0 + δpn) < J(p0) by the nonstrict one in the previous definition and additionally assume that
the sequence {δwn} contains only nonzero members.
We can define any local (in the sense of a certain topology) minimum as a minimum
on the corresponding set of sequences. For example, a weak minimum is a minimum
on the set of sequences {δwn} in W such that ‖δxn‖C + ‖δun‖∞ → 0, where ‖x‖C =
max_{t∈[t0,tf]} |x(t)| is the norm in the space of continuous functions. Let Π0 be the set of
sequences {δwn} in W such that ‖δwn‖ = ‖δxn‖1,1 + ‖δun‖∞ → 0. (We note that Π0
corresponds to the set of sequences Π0 introduced in Section 1.1.) It is easy to see that
a minimum on Π0 is also a weak minimum. Therefore, we can define the same type of
minimum using various sets of sequences. We often use this property when choosing, for
the type of minimum considered, the set of sequences most convenient for the study.
In particular, this refers to the definition of the Pontryagin minimum studied in this chapter.

2.1.3 Pontryagin Minimum


Let ‖u‖1 := ∫_{t0}^{tf} |u(t)| dt be the norm in the space L1(Δ, Rd(u)) of functions u : [t0, tf] →
Rd(u) that are Lebesgue integrable to the first power. Denote by Π the set of sequences {δwn} in W
satisfying the following two conditions:
(a) ‖δxn‖1,1 + ‖δun‖1 → 0;
(b) there exists a compact set C ⊂ Q (for each sequence) such that, starting from
a certain number, the following condition holds: (t, w0(t) + δwn(t)) ∈ C a.e. on
[t0, tf].
A minimum on Π is called a Pontryagin minimum. For convenience, let us formulate an
equivalent definition of the Pontryagin minimum.
The pair w0 = (x0, u0) is a point of Pontryagin minimum iff for each compact set
C ⊂ Q there exists ε > 0 such that J(p) ≥ J(p0) (where p0 = (x0(t0), x0(tf)),
p = (x(t0), x(tf))) for all admissible pairs w = (x, u) such that
(a) max_{t∈Δ} |x(t) − x0(t)| < ε,
(b) ∫_Δ |u(t) − u0(t)| dt < ε,
(c) (t, x(t), u(t)) ∈ C a.e. on [t0, tf].
We can show that it is impossible to define a Pontryagin minimum as a local minimum with
respect to a certain topology. Therefore, the concept of a minimum on a set of sequences is
more general than the concept of a local minimum. Since Π ⊃ Π0, a Pontryagin minimum
implies a weak minimum.
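For instance, for a fixed control value v, the standard needle variations δun =
(v − u0(t)) χ_{[τ, τ+εn]} with εn → 0+ satisfy ‖δun‖1 → 0 (and the corresponding state
variations satisfy ‖δxn‖1,1 → 0), while ‖δun‖∞ does not tend to zero; provided the pairs
(t, w0(t) + δwn(t)) remain in a common compact subset of Q, such sequences belong to Π
but not to Π0. It is precisely this richness of Π that makes the Pontryagin minimum yield
the pointwise minimum principle (2.7) below, whereas a weak minimum yields only the
local condition Hu = 0.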

2.1.4 Pontryagin Minimum Principle


Define two sets Λ0 and M0 of tuples of Lagrange multipliers. They are related to the first-
order necessary conditions for the weak and the Pontryagin minimum, respectively. We set

l = α0J + αF + βK, H = ψf, where α0 is a number and α, β, and ψ are row vectors
of the same dimensions as F, K, and f, respectively (note that x, u, w, F, K, and f
are column vectors). Denote by (Rn)∗ the space of row vectors of dimension n. The
functions l and H depend on the following variables: l = l(p, α0, α, β), H = H(t, w, ψ).
Denote by λ an arbitrary tuple (α0, α, β, ψ(·)) such that α0 ∈ R1, α ∈ (Rd(F))∗, β ∈ (Rd(K))∗,
ψ(·) ∈ W1,∞(Δ, (Rd(x))∗), where W1,∞(Δ, (Rd(x))∗) is the space of Lipschitz continuous
functions mapping [t0, tf] into (Rd(x))∗. For arbitrary λ, w, and p, we set

    lλ(p) = l(p, α0, α, β),   Hλ(t, w) = H(t, w, ψ(t)).

We introduce an analogous notation for partial derivatives (except for the derivative with
respect to t): lλ_x0 = ∂l/∂x0 (p, α0, α, β), Hλ_x(t, w) = ∂H/∂x (t, w, ψ(t)), etc. Denote by Λ0 the
set of tuples λ such that

    α0 ≥ 0,   α ≥ 0,   αF(p0) = 0,   α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| = 1,        (2.5)

    ψ̇ = −Hλ_x(t, w0),   ψ(t0) = −lλ_x0(p0),   ψ(tf) = lλ_xf(p0),   Hλ_u(t, w0) = 0.        (2.6)

Here, αi are the components of the row vector α and βj are the components of the row vector β.
If a point w0 yields a weak minimum, then Λ0 is nonempty. This was shown in [79, Part 1].
We set U(t, x) = {u ∈ Rd(u) | (t, x, u) ∈ Q}. Denote by M0 the set of tuples λ ∈ Λ0
such that for all t ∈ [t0, tf]\Θ, the condition u ∈ U(t, x0(t)) implies the inequality

    H(t, x0(t), u, ψ(t)) ≥ H(t, x0(t), u0(t), ψ(t)).        (2.7)

If w0 is a point of Pontryagin minimum, then M0 is nonempty; i.e., the Pontryagin minimum
principle holds. This also was shown in [79, Part 1].
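In computations, membership λ ∈ M0 is typically verified pointwise: on a grid in t one
minimizes H(t, x0(t), u, ψ(t)) over a grid in the admissible control set and compares the
result with H(t, x0(t), u0(t), ψ(t)). The sketch below is only an illustration; the Hamiltonian
and data in the usage example are hypothetical, not taken from the text.

    import numpy as np

    def check_minimum_principle(H, x0, u0, psi, t_grid, u_grid, tol=1e-9):
        """Grid check of inequality (2.7):  H(t, x0(t), u, psi(t)) >=
        H(t, x0(t), u0(t), psi(t))  for all t in t_grid and u in u_grid."""
        for t in t_grid:
            h_ref = H(t, x0(t), u0(t), psi(t))
            if any(H(t, x0(t), u, psi(t)) < h_ref - tol for u in u_grid):
                return False
        return True

    # Hypothetical data: H = psi * f with f(t, x, u) = x + u^2, psi = 1, u0 = 0.
    H = lambda t, x, u, psi: psi * (x + u * u)
    print(check_minimum_principle(H,
                                  x0=lambda t: 0.0, u0=lambda t: 0.0,
                                  psi=lambda t: 1.0,
                                  t_grid=np.linspace(0.0, 1.0, 101),
                                  u_grid=np.linspace(-1.0, 1.0, 201)))  # True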
The sets Λ0 and M0 are finite-dimensional compact sets, and, moreover, the projection
λ ↦ (α0, α, β) is injective on the larger set Λ0 and, therefore, on M0. Denote by co Λ0
the convex hull of the set Λ0, and let M0^co be the set of all λ ∈ co Λ0 such that for all
t ∈ [t0, tf]\Θ, the condition u ∈ U(t, x0(t)) implies inequality (2.7).
We now formulate a quadratic necessary condition for the Pontryagin minimum. For
this purpose, along with the set M0, we need to define a critical cone K and a quadratic
form on it.

2.1.5 Critical Cone


Denote by PW1,2(Θ, Rd(x)) the space of piecewise continuous functions x̄(t) : [t0, tf] →
Rd(x) that are absolutely continuous on each of the intervals of the set (t0, tf)\Θ and whose first
derivative is square Lebesgue integrable. We note that all points of discontinuity of
functions in PW1,2(Θ, Rd(x)) are contained in Θ. Below, for tk ∈ Θ and x̄ ∈ PW1,2(Θ, Rd(x)),
we set x̄k− = x̄(tk−), x̄k+ = x̄(tk+), and [x̄]k = x̄k+ − x̄k−. Let Z2(Θ) be the space of
triples z̄ = (ξ̄, x̄, ū) such that

    ξ̄ = (ξ̄1, ..., ξ̄s) ∈ Rs,   x̄ ∈ PW1,2(Θ, Rd(x)),   ū ∈ L2(Δ, Rd(u)),

where L2(Δ, Rd(u)) is the space of Lebesgue square integrable functions ū(t) : [t0, tf] →
Rd(u). Let IF(w0) = {i ∈ {1, ..., d(F)} | Fi(p0) = 0} be the set of active indices of the
constraints Fi(p) ≤ 0 at the point w0. Denote by K the set of z̄ = (ξ̄, x̄, ū) ∈ Z2(Θ) such
that

    Jp(p0)p̄ ≤ 0,   Fip(p0)p̄ ≤ 0 ∀ i ∈ IF(w0),   Kp(p0)p̄ = 0,        (2.8)

    x̄˙ = fw(t, w0)w̄,   [x̄]k = [ẋ0]k ξ̄k ∀ tk ∈ Θ,        (2.9)

where p̄ = (x̄(t0), x̄(tf)), w̄ = (x̄, ū), and [ẋ0]k is the jump of the function ẋ0(t) at the point
tk, i.e.,

    [ẋ0]k = ẋ0k+ − ẋ0k− = ẋ0(tk+) − ẋ0(tk−).

Clearly, K is a convex polyhedral cone. It will be called the critical cone.


The following question is of interest: which inequalities in the definition of K can
be replaced by equalities without changing K? An answer to this question follows from the
next proposition. For λ = (α0, α, β, ψ) ∈ Λ0, denote by [Hλ]k the jump of the function
H(t, x0(t), u0(t), ψ(t)) at the point tk ∈ Θ, i.e.,

    [Hλ]k = Hλk+ − Hλk−,

where

    Hλk+ = H(tk, x0(tk), u0k+, ψ(tk)),   Hλk− = H(tk, x0(tk), u0k−, ψ(tk)).

Let Λ0^Θ be the subset of λ ∈ Λ0 such that [Hλ]k = 0 for all tk ∈ Θ. Then Λ0^Θ is a finite-
dimensional compact set, and, moreover, M0 ⊂ Λ0^Θ ⊂ Λ0.

Proposition 2.2. The following conditions hold for any λ ∈ Λ0^Θ and any z̄ ∈ K:

    α0 Jp(p0)p̄ = 0,   αi Fip(p0)p̄ = 0 ∀ i ∈ IF(w0).        (2.10)

Also, the following question is of interest: in which case can one of the inequalities
in the definition of K be omitted without changing K? For example, in which case can we omit
the inequality Jp(p0)p̄ ≤ 0?

Proposition 2.3. If there exists λ ∈ Λ0^Θ such that α0 > 0, then the conditions

    Fip(p0)p̄ ≤ 0,   αi Fip(p0)p̄ = 0 ∀ i ∈ IF(w0),   Kp(p0)p̄ = 0,        (2.11)

    x̄˙ = fw(t, w0)w̄,   [x̄]k = [ẋ0]k ξ̄k ∀ tk ∈ Θ        (2.12)

imply Jp(p0)p̄ = 0; i.e., conditions (2.11) and (2.12) determine K as before.

An analogous assertion holds for any other inequality Fip(p0)p̄ ≤ 0, i ∈ IF(w0), in
the definition of K.

2.1.6 Quadratic Form


For λ ∈ Λ0 and tk ∈ Θ, we set

    (Δk Hλ)(t) := H(t, x0(t), u0k+, ψ(t)) − H(t, x0(t), u0k−, ψ(t)).

The following assertion holds: for any λ ∈ Λ0 and tk ∈ Θ, the function (Δk Hλ)(t) has a
derivative at the point tk. We set¹

    Dk(Hλ) = − (d/dt)(Δk Hλ)(t) |_{t=tk}.

The quantity Dk(Hλ) can be calculated by the formula

    Dk(Hλ) = −Hλk+_x Hk−_ψ + Hλk−_x Hk+_ψ − [Hλ_t]k,

where

    Hλk+_x = ψ(tk) fx(tk, x0(tk), u0k+),   Hλk−_x = ψ(tk) fx(tk, x0(tk), u0k−),
    Hk+_ψ = f(tk, x0(tk), u0k+) = ẋ0k+,   Hk−_ψ = f(tk, x0(tk), u0k−) = ẋ0k−,
    ẋ0k+ = ẋ0(tk+),   ẋ0k− = ẋ0(tk−),

and [Hλ_t]k is the jump of the function Hλ_t = ψ(t) ft(t, x0(t), u0(t)) at the point tk, i.e.,

    [Hλ_t]k = Hλk+_t − Hλk−_t = ψ(tk) ft(tk, x0(tk), u0k+) − ψ(tk) ft(tk, x0(tk), u0k−).
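To see where this formula comes from, here is a brief sketch of the computation: differentiate
(Δk Hλ)(t) and use the adjoint equation ψ̇ = −Hλ_x together with ẋ0 = H_ψ along u0,

\[
\frac{d}{dt}(\Delta_k H^\lambda)(t)
  = \Delta_k H_t^\lambda(t) + \Delta_k H_x^\lambda(t)\,\dot x^0(t)
    + \dot\psi(t)\,\Delta_k H_\psi(t),
\]
\[
\frac{d}{dt}(\Delta_k H^\lambda)(t)\Big|_{t=t_k\pm}
  = [H_t^\lambda]^k + \bigl(H_x^{\lambda k+}-H_x^{\lambda k-}\bigr)H_\psi^{k\pm}
    - H_x^{\lambda k\pm}\bigl(H_\psi^{k+}-H_\psi^{k-}\bigr)
  = [H_t^\lambda]^k + H_x^{\lambda k+}H_\psi^{k-} - H_x^{\lambda k-}H_\psi^{k+}.
\]

Both one-sided values coincide, which explains why (Δk Hλ)(t) is differentiable at tk;
reversing the sign gives the displayed expression for Dk(Hλ).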
We note that Dk(Hλ) depends on λ linearly, and Dk(Hλ) ≥ 0 for any λ ∈ M0 and any tk ∈ Θ.
Let [Hλ_x]k be the jump of the function Hλ_x = ψ(t) fx(t, x0(t), u0(t)) = −ψ̇(t) at the point
tk ∈ Θ, i.e., [Hλ_x]k = Hλk+_x − Hλk−_x. For λ ∈ M0, z̄ ∈ Z2(Θ), we set (see also Henrion [39])

    Ωλ(z̄) = (1/2) Σ_{k=1}^{s} ( Dk(Hλ) ξ̄k² + 2[Hλ_x]k x̄av^k ξ̄k )
           + (1/2) ( ⟨lλ_pp(p0)p̄, p̄⟩ + ∫_{t0}^{tf} ⟨Hλ_ww(t, w0)w̄, w̄⟩ dt ),        (2.13)

where x̄av^k = (1/2)(x̄k− + x̄k+), p̄ = (x̄(t0), x̄(tf)). Obviously, Ωλ(z̄) is a quadratic form in z̄
and a linear form in λ.
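When the extremal and the multipliers are available numerically, the form (2.13) can be
evaluated by simple quadrature. A minimal sketch, assuming a uniform time grid and
user-supplied arrays (Dk[k] = Dk(Hλ), jumpHx[k] = [Hλ_x]k as a row vector, lpp = lλ_pp(p0),
Hww[i] = Hλ_ww(ti, w0(ti)), and discretized ξ̄, x̄av, p̄, w̄):

    import numpy as np

    def omega_lambda(Dk, jumpHx, lpp, Hww, t, xi, x_av, pbar, wbar):
        """Discrete evaluation of the quadratic form (2.13), term by term:
        jump terms at t_k, endpoint term, and the integral term (rectangle
        rule on the uniform grid t).  Illustrative sketch only."""
        jumps = sum(Dk[k] * xi[k] ** 2 + 2.0 * (jumpHx[k] @ x_av[k]) * xi[k]
                    for k in range(len(xi)))
        h = t[1] - t[0]
        integral = h * sum(wbar[i] @ Hww[i] @ wbar[i] for i in range(len(t)))
        return 0.5 * (jumps + pbar @ lpp @ pbar + integral)

Maximizing such values over a (finite) set of multipliers then gives a discrete counterpart
of the function Ω0 defined in (2.14) below.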

2.1.7 Necessary Quadratic Condition for a Pontryagin Minimum


In the case where the set M0 is nonempty, we set

    Ω0(z̄) = max_{λ∈M0} Ωλ(z̄).        (2.14)

If M0 is empty, we set Ω0(·) ≡ −∞.

Theorem 2.4. If w0 is a Pontryagin minimum point, then M0 is nonempty and the function
Ω0(·) is nonnegative on the cone K.

¹In the book [79], the value Dk(Hλ) was defined by the formula Dk(Hλ) = (d/dt)(Δk Hλ)(tk), since there
the Pontryagin maximum principle was used.

We say that Condition A holds for the point w0 if the set M0 is nonempty and the
function Ω0 is nonnegative on the cone K. According to Theorem 2.4, Condition A is
necessary for a Pontryagin minimum.

2.1.8 Sufficient Quadratic Condition for a Pontryagin Minimum


A natural strengthening of Condition A is sufficient for the strict Pontryagin minimum.
(We will show in Section 2.7 that it is also sufficient for the so-called "bounded strong
minimum.") We now formulate this strengthening. For this purpose, we define the set
M0+ of elements λ ∈ M0 satisfying the so-called "strict minimum principle" and its subset
Leg+(M0+) of "strictly Legendre elements." We begin with the definition of the concept of
a strictly Legendre element. An element λ = (α0, α, β, ψ) ∈ Λ0 is said to be strictly Legendre
if the following conditions hold:
(a) [Hλ]k = 0, Dk(Hλ) > 0 for all tk ∈ Θ;
(b) for any t ∈ [t0, tf]\Θ, the inequality ⟨Huu(t, x0(t), u0(t), ψ(t))ū, ū⟩ > 0 holds
for all ū ∈ Rd(u), ū ≠ 0;
(c) for each point tk ∈ Θ, the inequalities ⟨Huu(tk, x0(tk), u0k−, ψ(tk))ū, ū⟩ > 0
and ⟨Huu(tk, x0(tk), u0k+, ψ(tk))ū, ū⟩ > 0 hold for all ū ∈ Rd(u), ū ≠ 0.
We note that each element λ ∈ M0 is Legendre in the nonstrict sense; i.e., [Hλ]k = 0,
Dk(Hλ) ≥ 0 for all tk ∈ Θ, and in conditions (b) and (c) nonstrict inequalities hold for the
quadratic form ⟨Huu ū, ū⟩. In other words, Leg(M0) = M0, where Leg(M) is the subset of
Legendre (in the nonstrict sense) elements of the set M ⊂ Λ0. Further, denote by M0+ the set
of λ ∈ M0 such that
(a) H(t, x0(t), u, ψ(t)) > H(t, x0(t), u0(t), ψ(t)) if t ∈ [t0, tf]\Θ, u ∈ U(t, x0(t)),
u ≠ u0(t);
(b) H(tk, x0(tk), u, ψ(tk)) > Hλk− = Hλk+ if tk ∈ Θ, u ∈ U(tk, x0(tk)),
u ∉ {u0k−, u0k+}.
Denote by Leg+(M0+) the subset of all strictly Legendre elements λ ∈ M0+. We set

    γ̄(z̄) = ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt.        (2.15)

We say that Condition B holds for the point w0 if there exist a nonempty compact set
M ⊂ Leg+(M0+) and a constant C > 0 such that

    max_{λ∈M} Ωλ(z̄) ≥ C γ̄(z̄)  ∀ z̄ ∈ K.

Theorem 2.5. If Condition B holds, then w0 is a strict Pontryagin minimum point.

In conclusion, we note that the space PW1,2(Θ, Rd(x)) with the inner product

    ⟨x̄, x̃⟩ = ⟨x̄(t0), x̃(t0)⟩ + Σ_{k=1}^{s} ⟨[x̄]k, [x̃]k⟩ + ∫_{t0}^{tf} ⟨x̄˙(t), x̃˙(t)⟩ dt,

and also the space Z2(Θ) with the inner product

    ⟨z̄, z̃⟩ = ⟨ξ̄, ξ̃⟩ + ⟨x̄, x̃⟩ + ∫_{t0}^{tf} ⟨ū(t), ũ(t)⟩ dt,

are Hilbert spaces. Moreover, the functional γ̄(z̄) is equivalent to the squared norm
‖z̄‖²_{Z2(Θ)} = ⟨z̄, z̄⟩ on the subspace {z̄ ∈ Z2(Θ) | x̄˙ = fw w̄, [x̄]k = [ẋ0]k ξ̄k, k ∈ I∗}, where I∗ =
{1, ..., s}.

2.2 Basic Constant and the Problem of Its Decoding


2.2.1 Verification of Assumptions in the Abstract Scheme
The derivation of quadratic conditions for the Pontryagin minimum in problem (2.1)–
(2.4) is based on the abstract scheme presented in Chapter 1. We begin by verifying
that problem (2.1)–(2.4), the set of sequences Π, and the point w0 satisfy
all the assumptions of the abstract scheme. As the Banach space X, we consider the space
W = W1,1(Δ, Rd(x)) × L∞(Δ, Rd(u)) of pairs of functions w = (x, u). As the set 𝒪 entering
the statement of the abstract problem (1.1), we consider the set 𝒲 of pairs w = (x, u) ∈ W
such that (x(t0), x(tf)) ∈ P and there exists a compact set C ⊂ Q such that (t, x(t), u(t)) ∈ C
a.e. on [t0, tf].
Let w0 = (x0, u0) ∈ 𝒲 be a point satisfying the constraints of the canonical problem
(2.1)–(2.4) and the assumptions of Section 2.1. As the set of sequences defining the type
of minimum at this point, we take the set Π of Pontryagin sequences in the space W.
Obviously, Π is invariant with respect to the operation of passing to a subsequence. Also,
it is elementary to verify that Π is invariant with respect to the Π0-extension, where Π0 =
{{w̄n} | ‖w̄n‖ = ‖x̄n‖1,1 + ‖ūn‖∞ → 0}. As was already mentioned, Π0 corresponds to the
set of sequences introduced in Section 1.1 (the present notation is slightly more
convenient here). Finally, it is easy to see that the set 𝒲 is absorbing for Π at the point w0.
Therefore, all the assumptions on w0, 𝒲, and Π from Section 1.1 hold.
Furthermore, consider the functional

    w(·) = (x(·), u(·)) ↦ J(x(t0), x(tf)).        (2.16)

Since the function J(p) is defined on P, we can consider this functional as a functional given
on 𝒲. We denote it by Ĵ. Therefore, the functional Ĵ : 𝒲 → R1 is given by formula (2.16).
Analogously, we define the following functionals on 𝒲:

    F̂i : w(·) = (x(·), u(·)) ∈ 𝒲 ↦ Fi(x(t0), x(tf)), i = 1, ..., d(F),        (2.17)

and the operator

    ĝ : w(·) = (x(·), u(·)) ∈ 𝒲 ↦ (ẋ − f(t, w), K(p)) ∈ Y,        (2.18)

where Y = L1(Δ, Rd(x)) × Rd(K). We omit the verification of the fact that all the
functionals Ĵ(w), F̂i(w), i = 1, ..., d(F), and the operator ĝ(w) : 𝒲 → Y are Π-continuous
and strictly Π-differentiable at the point w0 and, moreover, that their Fréchet derivatives at this
point are the linear functionals

    Ĵ′(w0) : w̄ = (x̄, ū) ∈ W ↦ Jp(p0)p̄,
    F̂i′(w0) : w̄ = (x̄, ū) ∈ W ↦ Fip(p0)p̄, i = 1, ..., d(F),

and the linear operator

    ĝ′(w0) : w̄ = (x̄, ū) ∈ W ↦ (x̄˙ − fw(t, w0)w̄, Kp(p0)p̄) ∈ Y,        (2.19)

respectively; moreover, here p̄ = (x̄(t0), x̄(tf)). The verification of this assertion is also
quite elementary.
We prove only that if ĝ′(w0)W ≠ Y, then the image ĝ′(w0)W is closed in Y and has
a direct complement that is a closed subspace in Y. Note that the operator

    w̄ = (x̄, ū) ∈ W ↦ x̄˙ − fw(t, w0)w̄ ∈ L1(Δ, Rd(x))        (2.20)

is surjective. Indeed, for an arbitrary function v̄ ∈ L1(Δ, Rd(x)), there exists a function x̄
satisfying x̄˙ − fx(t, w0)x̄ = v̄ and x̄(t0) = 0. Then under the mapping given by this operator,
the pair w̄ = (x̄, 0) yields v̄ in the image. Further, we note that the operator

    w̄ = (x̄, ū) ∈ W ↦ Kp(p0)p̄ ∈ Rd(K)        (2.21)

is finite-dimensional. The surjectivity of operator (2.20) and the finite-dimensionality
of operator (2.21) imply the closedness of the image of operator (2.19). The following
more general assertion holds.

Lemma 2.6. Let X, Y, and Z be Banach spaces, let A : X → Y be a linear operator with
closed range, and let B : X → Z be a linear operator such that the image B(Ker A) of the
kernel of the operator A under the mapping B is closed in Z. Then the
operator T : X → Y × Z defined by the relation Tx = (Ax, Bx) for all x ∈ X has a closed
range.

Proof. Let a sequence {xn} in X be such that Txn = (Axn, Bxn) → (y, z) ∈ Y × Z. Since
the range AX is closed in Y, by the Banach theorem on the inverse operator there exists
a convergent sequence xn′ → x′ in X such that Axn′ = Axn → y and, therefore, Ax′ = y.
Then A(xn − xn′) = 0 for all n and B(xn − xn′) → z − Bx′. By the closedness of the range
B(Ker A), there exists x″ ∈ X such that Ax″ = 0 and Bx″ = z − Bx′. Then B(x′ + x″) = z
and A(x′ + x″) = y, as required. The lemma is proved.

Since the image of any subspace under the mapping defined by a finite-dimensional
operator is a finite-dimensional and, therefore, closed subspace, Lemma 2.6 implies the
following assertion.

Corollary 2.7. Let X, Y, and Z be Banach spaces, let A : X → Y be a surjective linear
operator, and let B : X → Z be a finite-dimensional linear operator. Then the operator
T : X → Y × Z defined by the relation Tx = (Ax, Bx) for all x ∈ X has a closed range.

This implies that operator (2.19) has a closed range. Since the range of operator
(2.19) is of finite codimension, this range has a direct complement, which is finite-dimensional
and, therefore, a closed subspace.
Therefore, all the conditions defining a smooth problem on Π at the point w0 hold
for problem (2.1)–(2.4). Hence we can apply the main result of the abstract theory of
higher-order conditions, Theorem 1.7.

2.2.2 The Set Λ0
In what follows, for the sake of brevity we make the following agreement: if the point at which
the derivative of a given function is taken is not indicated, then for the functions J, F, K,
and lλ this point is p0, and for the functions f and Hλ, it is the point (t, w0(t)). Using
the definition given by relations (1.3), we denote by Λ0 the set of tuples λ = (α0, α, β, ψ)
such that

    α0 ∈ R,   α ∈ (Rd(F))∗,   β ∈ (Rd(K))∗,   ψ(·) ∈ L∞(Δ, (Rd(x))∗)

and the following conditions hold:

    α0 ≥ 0,   αi ≥ 0 (i = 1, ..., d(F)),   αi Fi(p0) = 0 (i = 1, ..., d(F)),        (2.22)

    α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| + ‖ψ‖ = 1,        (2.23)

    α0 Jp p̄ + Σ_{i=1}^{d(F)} αi Fip p̄ + βKp p̄ − ∫_{t0}^{tf} ψ(x̄˙ − fw w̄) dt = 0        (2.24)

for all w̄ = (x̄, ū) ∈ W. Here, p̄ = (x̄(t0), x̄(tf)).


Let us show that the normalization (2.23) is equivalent to the normalization

    α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| = 1        (2.25)

(the definition of equivalent normalizations was given in Section 1.1.5). The upper estimate
is obvious: α0 + Σ αi + Σ |βj| ≤ 1 for any λ ∈ Λ0. It is required to establish the lower
estimate: there exists ε > 0 such that

    α0 + Σ αi + Σ |βj| ≥ ε        (2.26)

for any λ ∈ Λ0. Suppose that this is not true. Then there exists a sequence

    λn = (α0n, αn, βn, ψn) ∈ Λ0

such that α0n → 0, αn → 0, and βn → 0. By (2.24), this implies that the sequence of functionals
w̄ ∈ W ↦ ∫_{t0}^{tf} ψn(x̄˙ − fw w̄) dt is norm-convergent to zero. Since the operator w̄ ∈ W ↦
(x̄˙ − fw w̄) ∈ L1(Δ, Rd(x)) is surjective, this implies ‖ψn‖∞ → 0. Therefore, α0n + |αn| +
|βn| + ‖ψn‖ → 0, which contradicts the condition α0n + Σ αin + Σ |βjn| + ‖ψn‖ = 1,
which follows from λn ∈ Λ0. Therefore, estimate (2.26) holds with some ε > 0. The
equivalence of normalizations (2.23) and (2.25) is proved. In what follows, we will use
normalization (2.25), preserving the old notation Λ0 for the new set.
Let us show that the set Λ0 with normalization (2.25) coincides with the set Λ0
defined by (2.5) and (2.6) in Section 2.1.4. For λ = (α0, α, β, ψ), let conditions (2.22), (2.24), and
(2.25) hold. We rewrite condition (2.24) in the form

    lp p̄ − ∫_{t0}^{tf} ψ(x̄˙ − fw w̄) dt = 0  ∀ w̄ ∈ W,        (2.27)

where l = α0J + αF + βK. We set x̄ = 0 in (2.27). Then ∫_{t0}^{tf} ψ fu ū dt = 0 for all ū ∈ L∞.
Therefore, ψfu = 0, or Hu = 0, where H = ψf. Then we obtain from (2.27) that

    lp p̄ − ∫_{t0}^{tf} ψ x̄˙ dt + ∫_{t0}^{tf} ψ fx x̄ dt = 0        (2.28)
for all x̄ ∈ W1,1. Let us show that this implies the conditions

    ψ ∈ W1,∞,   −ψ̇ = Hx,   ψ(t0) = −lx0,   ψ(tf) = lxf.        (2.29)

Let ψ′ satisfy the conditions ψ′ ∈ W1,∞, −ψ̇′ = ψ′fx, and ψ′(t0) = −lx0. Integrating by
parts, we obtain

    ∫_{t0}^{tf} ψ′ x̄˙ dt = ψ′x̄ |_{t0}^{tf} − ∫_{t0}^{tf} ψ̇′ x̄ dt = lx0 x̄0 + ψ′(tf) x̄f + ∫_{t0}^{tf} ψ′ fx x̄ dt

for any x̄ ∈ W1,1. (Hereafter, x̄0 := x̄(t0), x̄f := x̄(tf).) Therefore,

    ∫_{t0}^{tf} ψ′ x̄˙ dt − ∫_{t0}^{tf} ψ′ fx x̄ dt − lx0 x̄0 − ψ′(tf) x̄f = 0.        (2.30)

Adding (2.30) to (2.28), we obtain

    (lxf − ψ′(tf)) x̄f + ∫_{t0}^{tf} (ψ′ − ψ) x̄˙ dt + ∫_{t0}^{tf} (ψ − ψ′) fx x̄ dt = 0

for all x̄ ∈ W1,1. Let c̄ = lxf − ψ′(tf) and ψ̄ = ψ′ − ψ. Then

    c̄ x̄f + ∫_{t0}^{tf} ψ̄ (x̄˙ − fx x̄) dt = 0.        (2.31)

This is true for any x̄ ∈ W1,1. Let ā ∈ Rn, v̄ ∈ L1. Let us find x̄ ∈ W1,1 such that x̄˙ − fx x̄ = v̄
and x̄(tf) = ā. Then c̄ā + ∫_{t0}^{tf} ψ̄ v̄ dt = 0. Since this relation holds for arbitrary ā ∈ Rn and
v̄ ∈ L1, we obtain that c̄ = 0 and ψ̄ = 0; i.e., ψ = ψ′ and ψ′(tf) = lxf. Therefore, conditions
(2.29) hold. Conversely, if conditions (2.29) hold for ψ, then, applying the integration-
by-parts formula ∫_{t0}^{tf} ψ x̄˙ dt = ψx̄ |_{t0}^{tf} − ∫_{t0}^{tf} ψ̇ x̄ dt, we obtain condition (2.28). From (2.28)
and the condition Hu = 0, condition (2.27), and therefore condition (2.24), follow. Therefore,
we have shown that the set Λ0 defined in this section coincides with the set Λ0 of
Section 2.1.4.
We note that the set Λ0 is a finite-dimensional compact set, and the projection λ =
(α0, α, β, ψ) ↦ (α0, α, β) is injective on Λ0. Indeed, the vector lλ_x0 is uniquely determined by
(α0, α, β), and from the conditions −ψ̇ = ψfx and ψ(t0) = −lx0 the function ψ is
uniquely found. The same is also true for the set co Λ0.

2.2.3 Lagrange Function


We set F0(p) = J(p) − J(p0). Introduce the Lagrange function of problem (2.1)–(2.4).
Let δ𝒲 be the set of variations δw ∈ W such that (w0 + δw) ∈ 𝒲, i.e., (p0 + δp) ∈ P, where
δp = (δx(t0), δx(tf)), and there exists a compact set C ⊂ Q such that (t, w0(t) + δw(t)) ∈ C
a.e. on [t0, tf]. For λ = (α0, α, β, ψ) ∈ co Λ0 and δw = (δx, δu) ∈ δ𝒲, we set

    Ω(λ, δw) = α0 F0(p0 + δp) + αF(p0 + δp) + βK(p0 + δp)
             − ∫_{t0}^{tf} ψ(ẋ0 + δẋ − f(t, w0 + δw)) dt.

We set

    δFi = Fi(p0 + δp) − Fi(p0), i = 1, ..., d(F),
    δK = K(p0 + δp) − K(p0) = K(p0 + δp),   δf = f(t, w0 + δw) − f(t, w0).

Then

    Ω(λ, δw) = Σ_{i=0}^{d(F)} αi δFi + βδK − ∫_{t0}^{tf} ψ(δẋ − δf) dt
             = δlλ − ∫_{t0}^{tf} ψ δẋ dt + ∫_{t0}^{tf} δHλ dt,

where

    lλ = Σ_{i=0}^{d(F)} αi Fi + βK,   Hλ = ψf,   δlλ = lλ(p0 + δp) − lλ(p0),
    δHλ = H(t, w0 + δw, ψ) − H(t, w0, ψ) = ψ δf.

Note that, in contrast to the classical calculus of variations, where δJ stands for the first
variation of the functional, we denote by δJ, δf, etc., the full increments corresponding to
the variation δw.

2.2.4 Basic Constant


For δw ∈ δ𝒲, we set

    Ω0(δw) = max_{λ∈Λ0} Ω(λ, δw).

Let γ = γ(δw) : δ𝒲 → R1 be an arbitrary strict higher order on Π, whose definition was
given in Section 1.1.3. We set

    Πσγ = {{δwn} ∈ Π | σ(δwn) ≤ O(γ(δwn))},

where σ(δw) = max{F0(p0 + δp), ..., Fd(F)(p0 + δp), |δK|, ‖δẋ − δf‖1} is the violation
function of problem (2.1)–(2.4). In what follows, we shall use the shorter notation lim
for the limit inferior instead of lim inf. We set

    Cγ(Ω0, Πσγ) = inf_{Πσγ} lim (Ω0/γ).
Theorem 1.7 implies the following theorem.

Theorem 2.8. The condition Cγ(Ω0, Πσγ) ≥ 0 is necessary for a Pontryagin minimum
at the point w0, and the condition Cγ(Ω0, Πσγ) > 0 is sufficient for a strict Pontryagin
minimum at this point.

Further, there arises the problem of the choice of a higher order γ and of decoding the
constant Cγ(Ω0, Πσγ) corresponding to the chosen order. The constant Cγ(Ω0, Πσγ) is
said to be basic, and by decoding of the basic constant we mean the simplest method for
calculating its sign. In what follows, we will deal with the choice of γ and the decoding of
the basic constant. As a result, we will obtain the theorems on quadratic conditions for the
Pontryagin minimum formulated in Section 2.1.

2.3 Local Sequences, Higher Order γ, Representation of the Lagrange Function on Local Sequences with Accuracy up to o(γ)

2.3.1 Local Sequences and Their Structure
As before, let w0 ∈ 𝒲 be a point satisfying the constraints of the canonical problem and
the assumptions of Section 2.1. For convenience, we assume that u0 is left continuous at
each point of discontinuity tk ∈ Θ. Denote by Γ(u0) the closure of the graph of u0(t). Let Πloc
be the set of sequences {δwn} in the space W satisfying the following two conditions:
(a) ‖δxn‖1,1 → 0;
(b) for any neighborhood V of the compact set Γ(u0) there exists n0 ∈ N with

    (t, u0(t) + δun(t)) ∈ V a.e. on [t0, tf]  ∀ n ≥ n0.        (2.32)

Sequences from Πloc are said to be local. Obviously, Π0 ⊂ Πloc ⊂ Π. Although the set
of local sequences Πloc is only a part of the set Π of Pontryagin sequences, all the main
considerations in obtaining quadratic conditions for the Pontryagin minimum are related
precisely to the set Πloc.
Let us consider the structure of local sequences. Denote by Πu^loc the set of
sequences {δun} in L∞(Δ, Rd(u)) such that for any neighborhood V of the compact set Γ(u0)
there exists a number starting from which condition (2.32) holds. We briefly write the
condition defining the sequences from Πu^loc in the form (t, u0 + δun) → Γ(u0). Therefore,
Πloc = {{δwn} | ‖δxn‖1,1 → 0, (t, u0 + δun) → Γ(u0)}. In what follows, in order not to abuse
notation, we will omit the index n in sequences.
Let Qtu be the projection of Q under the mapping (t, x, u) → (t, u). Then Qtu is an open
set in Rd(u)+1 containing the compact set u0 . Denote by u0 (tk−1 , tk ) the closure in Rd(u)+1
of the intersection of the compact set u0 with the layer {(t, u) | u ∈ Rd(u) , t ∈ (tk−1 , tk )},
where k = 1, . . . , s + 1, and ts+1 = tf . In other words, u0 (tk−1 , tk ) is the closure of the graph
of the restriction of the function u0 (· ) to the interval (tk−1 , tk ). Obviously, u0 is the union
k
of disjoint compact sets u0 (tk−1 , tk ). For brevity, we set u0 (tk−1 , tk ) = u0 .
k
Let Vk ⊂ Qut be fixed disjoint bounded neighborhoods of the compact sets u0 ,
k = 1, . . . , s + 1. We set

s+1
V= Vk . (2.33)
k=1

Then V is a neighborhood of Without loss of generality, we assume that V, together


u0 .
with its closure, is contained in Qut . Recall that for brevity, we set I ∗ = {1, . . . , s}. By
the superscript “star” we denote the functions and sets related to the set  of points of
discontinuity of the function u0 . Define the following subsets of the neighborhood V; cf.
the illustration in Figure 2.1 for k = 1:
∗ ∗
Vk− = {(t, u) ∈ Vk+1 | t < tk }; Vk+ = {(t, u) ∈ Vk | t > tk };

Vk∗ = Vk−
∗ ∗
∪ Vk+ , k ∈ I ∗; ∗
V = Vk∗ ; V 0 = V\V ∗ .
k∈I ∗
40 Chapter 2. Quadratic Conditions in the Calculus of Variations

6u '

∗ u1+ u(t)
V1−
V2
&

$
u(t) u1−
- V∗
1+
V1
%

t0 t1 tf t
-

Figure 2.1. Neighborhoods of the control at a point t1 of discontinuity.

While the superscripts k−, k+, and k were used for designation of left and right limit values
and ordinary values of functions at the point tk ∈ , the same subscripts will be used for
enumeration of sets and functions related to the point tk ∈ . The notation vraimaxt∈M u(t)
will be often used to denote the essential supremum (earlier denoted also by ess sup) of a
function u(·) on a set M.
Further, let {δu} ∈ loc ∗
u , i.e., (t, u + δu) → u , and let k ∈ I . For a sequence {δu},
0 0
∗ ∗
introduce the sequence of sets Mk− = {t | (t, u (t) + δu(t)) ∈ Vk− } and assume that χk−
0 ∗ is
∗ ∗ ∗
the characteristic function of the set Mk− . We set {δuk− } = {δuχk− }. Then

vraimax

|u0 (t) + δu∗k− (t) − u0k+ | → 0, where u0k+ = u0 (tk +).
t∈Mk−

In short, we write this fact as (u0 + δu∗k− ) |Mk−


∗ → u0k+ . Analogously, for the sequence {δu},

we define
∗ ∗
Mk+ = {t | (t, u0 (t) + δu(t)) ∈ Vk+ } and {δu∗k+ } = {δuχk+

},
∗ is the characteristic function of the set M ∗ . Then (u0 +δu∗ ) | ∗ → u0k− , i.e.,
where χk+ k+ k+ Mk+

vraimax

|u0 (t) + δu∗k+ (t) − u0k− | → 0, where u0k− = u0 (tk −).
t∈Mk+

The sequence {δu∗k− } belongs to the set of sequences ∗uk− defined as follows: ∗uk− consists
of sequences {δu} in L∞ (, Rd(u) ) such that
(a) vraimaxt∈M t ≤ tk , where M = {t | δu(t) = 0}, i.e., the support M of each member
δu of the sequence {δu} is located to the left from tk (here and in what follows, by the
support of a measurable function, we mean the set of points at which it is different from
zero); for brevity, we write this fact in the form M ≤ tk ;
(b) vraimaxt∈M |t − tk | → 0, i.e., the support M tends to tk ; for brevity, we write this
fact as M → tk ;
(c) vraimaxt∈M |u0 (t) + δu(t) − u0k+ | → 0, i.e., the values of the function u0 + δu on
the support M tend to u0k+ ; in short, we write this fact in the form
(u0 + δu) |M → u0k+ .
2.3. Local Sequences, Representation of the Lagrange Function 41

Analogously, we define the set ∗uk+ consisting of sequences {δu} such that

M ≥ tk , M → tk , (u0 + δu) |M → u0k− ,


where M = {t | δu(t)  = 0}. Clearly, {δu∗k+ } ∈ ∗uk+ . We set

∗uk− + ∗uk+ = ∗uk , ∗uk = ∗u .
k∈I ∗

By definition, the sum of sets of sequences consists of all sums of sequences belonging to
these sets.
As before, let {δu} ∈ loc ∗ ∗ ∗ ∗
u . Let the sets Mk− , Mk+ and the functions χk− , χk+
correspond to the members of this sequence. We set

Mk∗ = Mk− ∗ ∪ M∗ , M∗ =
k+ Mk∗ ,
k∈I ∗
∗ ∗ ∗
χk = χk− + χk+ , χ = ∗ χk∗ ,

k∈I 
δu∗k = δuχk∗ = δu∗k− + δu∗k+ , δu∗ = δuχ ∗ = δu∗k .
k∈I ∗

Then the sequence {δu∗ } belongs to the set ∗u .


Further, for members δu of the sequence {δu}, we set δu0 = δu − δu∗ . Then starting
from a certain number, we have for the sequence {δu0 } that (t, u0 (t) + δu0 (t)) ∈ V 0 a.e.
on [t0 , tf ]. Moreover, δu0 χ ∗ = 0. Here, we assume that members of the sequences {δu0 }
and {χ ∗ } having the same number are multiplied. This remark refers to all relations which
contain members of distinct sequences. In what follows, we do not make such stipulations.
Clearly,
δu0
∞ → 0. We denote the set of sequence in L∞ (, Rd(u) ) having this prop-
erty by 0u . Therefore, we have shown that an arbitrary sequence {δu} ∈ loc u admits the
representation
{δu} = {δu0 } + {δu∗ }, {δu0 } ∈ 0u , {δu∗ } ∈ ∗u , {δu0 χ ∗ } = {0},
where χ ∗ is the characteristic function of the set M ∗ = {t | δu∗ (t)  = 0}. Such a representation
is said to be canonical. It is easy to see that the canonical representation is unique. In

particular, the existence of the canonical representation implies loc u ⊂ u +u . Obviously,
0

the converse inclusion holds. Therefore, the relation u = u + u holds.


loc 0 ∗

We now introduce the set ∗ consisting of sequences {δw ∗ } = {(0, δu∗ )} in the space
W such that {δu∗ } ∈ ∗u (the component δx of these sequence vanishes identically).

Proposition 2.9. We have the relation loc = 0 + ∗ . Moreover, we can represent any
sequence {δw} = {(δx, δu)} ∈ loc in the form {δw} = {δw0 } + {δw∗ }, where {δw0 } =
{(δx, δu0 )} ∈ 0 , {δw∗ } = {(0, δu∗ )} ∈ ∗ , and , moreover, for all members of sequences
with the same numbers, the condition δu0 χ ∗ = 0 holds; here, χ ∗ is the characteristic
function of the set {t | δu∗ (t) = 0}.

Recall that by definition, 0 = {{δw} |


δx
1,1 +
δu
∞ → 0}. Proposition 2.9 fol-

u = u + u and the existence of the canonical representation
lows from the relation loc 0

for sequences in loc


u .
42 Chapter 2. Quadratic Conditions in the Calculus of Variations

The representation of a sequence from loc in the form of a sum of sequences from
0 and ∗ with the condition {δu0 χ ∗ } = {0}, which was indicated in Proposition 2.9, will
also be called canonical. Obviously, the canonical representation is unique. It will play an
important role in what follows.
We introduce one more notation related to an arbitrary sequence {δw} ∈ loc . For
such a sequence, we set
δvk− = (u0 + δu − u0k+ )χk−∗ = (u0 + δu∗ − u0k+ )χ ∗ ,
k− k−
δvk+ = (u + δu − u )χk+
0 0k− ∗ = (u0 + δu∗ − u0k− )χ ∗ ,
 k+ k+
{δvk } = {δvk− } + {δvk+ }, {δv} = {δvk }.
k∈I ∗

Then the supports of δvk− , δvk+ , δvk , and δv are the sets Mk− ∗ , M ∗ , M ∗ , and M ∗ ,
k+ k
∗ ∗
respectively. Moreover, it is obvious that
δvk−
∞ → 0,
δvk+
∞ → 0,
δvk
∞ → 0, and

δv
∞ → 0, i.e., the sequences {δvk−∗ }, {δv ∗ }, {δv }, and {δv} belong to 0 .
k+ k u

2.3.2 Representation of the Function f (t, x, u) on Local Sequences


Let {δw} ∈ loc . Recall that
0 (δw) := max (λ, δw) = max (λ, δw),
λ∈0 λ∈co 0
tf tf
where (λ, δw) = δl λ − t0 ψδ ẋ dt + t0 δH λ dt, δH λ = ψδf , and co 0 is the convex hull
of 0 .
Consider δf on the sequences {δw}. Represent {δw} in the canonical form:
{δw} = {δw0 } + {δw ∗ }, {δw 0 } ∈ 0 , {δw∗ } ∈ ∗ , {δu0 χ ∗ } = {0}.
Then
δf = f (t, w 0 + δw) − f (t, w0 )
= f (t, w 0 + δw) − f (t, w0 + δw ∗ ) + f (t, w 0 + δw ∗ ) − f (t, w 0 )
= f (t, w 0 + δw ∗ + δw 0 ) − f (t, w0 + δw ∗ ) + δ ∗ f (2.34)
1
= fw (t, w0 + δw ∗ )δw0 + fww (t, w0 + δw ∗ )δw0 , δw 0 + r + δ ∗ f ,
2
where δ ∗ f = f (t, w0 + δw∗ ) − f (t, w0 ), and the residue term r with the components ri ,
i = 1, . . . , d(f ), is defined by the mean value theorem as follows:
1
ri = (fiww (t, w 0 + δw ∗ + ζi δw 0 ) − fiww (t, w 0 + δw∗ ))δw 0 , δw0 ,
2
ζi = ζi (t), 0 ≤ ζi (t) ≤ 1, i = 1, . . . , d(f ).
Therefore,   
tf

r
1 = o
δx
2C + |δu | dt .
0 2
(2.35)
t0

Further, fw (t, w0 + δw ∗ )δw0 = fw (t, w 0 )δw0 + (δ ∗ fw )δw0 , where δ ∗ fw = fw (t, w0 + δw ∗ )


− fw (t, w0 ). Since (δ ∗ fw )δw0 = (δ ∗ fx )δx + (δ ∗ fu )δu0 = (δ ∗ fx )δx (because δu0 χ ∗ = 0),
we have
fw (t, w0 + δw ∗ )δw0 = fw δw 0 + (δ ∗ fx )δx. (2.36)
2.3. Local Sequences, Representation of the Lagrange Function 43

Here and in what follows, we set fw = fw (t, w0 ) for brevity. Analogously,

fww (t, w 0 + δw ∗ )δw 0 , δw0 = fww (t, w0 )δw0 , δw0 + (δ ∗ fww )δw0 , δw0
= fww (t, w0 )δw0 , δw0 + (δ ∗ fxx )δx, δx .

Setting fww (t, w0 ) = fww for brevity, we obtain

fww (t, w0 + δw ∗ )δw0 , δw 0 = fww δw 0 , δw0 + (δ ∗ fxx )δx, δx , (2.37)

where δ ∗ fxx = fxx (t, w0 + δw∗ ) − fxx (t, w0 ). Moreover,


δ ∗ fxx
1 ≤ const· meas M∗ → 0.
This implies

(δ ∗ fxx )δx, δx
1 = o(
δx
2C ). (2.38)
Substituting (2.36) and (2.37) in (2.34) and taking into account estimates (2.35) and (2.38),
we obtain the following assertion.

Proposition 2.10. Let {δw} ∈ loc . Then the following formula holds for the canonical
representation {δw} = {δw 0 } + {δw∗ }:
1
δf = fw δw 0 + fww δw 0 , δw0 + δ ∗ f + (δ ∗ fx )δx + r̃,
2
where
  tf 


1 = o
δx
2C + |δu | dt ,
0 2
δ ∗ f = f (t, w 0 + δw ∗ ) − f (t, w0 ),
t0

δ ∗ fx = fx (t, w0 + δw ∗ ) − fx (t, w0 ), fw = fw (t, w 0 ), fww = fww (t, w0 ).

Therefore, for any λ ∈ co 0 , we have


 tf  tf 
1 tf λ
δH dt =
λ
Hx δx dt +
λ
Hww δw0 , δw0 dt
t0 t0 2 t0
 tf  tf (2.39)
+ δ ∗ H λ dt + (δ ∗ Hxλ δx) dt + ρ λ ,
t0 t0
tf
where supλ∈co 0 |ρ λ | = o(
δx
2C + t0 |δu0 |2 dt), δ ∗ H λ = ψδ ∗ f , and δ ∗ Hxλ = ψδ ∗ fx .

Here, we have used the relation Huλ = 0 for all λ ∈ co 0 . As above, all derivatives
whose argument is not indicated are taken for w = w0 (t).

 tf
2.3.3 Representation of the Integral t0 (δ ∗ Hxλ )δx dt on Local
Sequences

Proposition 2.11. Let two sequences {δx} and {δu∗ } such that
δx
C → 0 and {δu} ∈ ∗u
be given. Let λ ∈ co 0 . Then
 tf   tf
(δ ∗ Hxλ )δx dt = ∗
[Hxλ ]k δx(χk− ∗
− χk+ ) dt + ρ ∗λ , (2.40)
t0 k∈I ∗ t0
44 Chapter 2. Quadratic Conditions in the Calculus of Variations


where supλ∈co 0 |ρ ∗λ | = o(
δx
2C + I ∗ Mk∗ |δtk | dt), δtk = tk −t, [Hx ]
λ k = Hxλk+ −Hxλk− =
ψ(tk )(fxk+ − fxk− ) = ψ(tk )(fx (tk , x 0 (tk ), u0k+ ) − fx (tk , x 0 (tk ), u0k− )).
 
Proof. Since χ ∗ = χk∗ , we have δ ∗ Hxλ = δk∗ Hxλ , and, therefore,
 tf  tf
δ ∗ Hxλ δx dt = δk∗ Hxλ δx dt, δk∗ Hxλ = ψδk∗ f = ψ(f (t, w 0 + δwk∗ ) − f (t, w0 )).
t0 I∗ t0

Further, for ψ = ψ(t) we have ψ(t) = ψ(tk ) + k ψ, where k ψ = ψ(tk + δtk ) − ψ(tk ),
and, moreover, supco 0 |k ψ| ≤ const |δtk |, since −ψ̇ = ψfx , and supco 0
ψ
∞ < +∞.
Consider δk∗ fx . Since χk∗ = χk−
∗ + χ ∗ , we have δ ∗ f = δ ∗ f + δ ∗ f , where the in-
k+ k x k− x k+ x
crements δk− fx and δk+ fx correspond to the variations δu∗k− and δu∗k+ , respectively. For
∗ ∗
∗ f , we have
δk− x
∗ ∗ ∗ ∗
δk− fx = [fx ]k χk− + (δk− fx − [fx ]k )χk− ,
where

[fx ]k = fxk+ − fxk− , fxk− = fx (tk , x 0 (tk ), u0k− ), fxk+ = fx (tk , x 0 (tk ), u0k+ ).
∗ f −[f ]k )χ ∗ = (f (t, x 0 , u0 +δu∗ )−f k+ )χ ∗ −(f (t, x 0 , u0 )−
Further, let ηk− := (δk− x x k− x k− x k− x
k− ∗ ∗
fx )χk− . Then
ηk−
∞ → 0, since u + δuk− |Mk− → u
0 ∗ 0k+ and u |Mk− → u0k− . There-
0 ∗
∗ ∗ ∗
fore, δk− fx = [fx ] χk− + ηk− , ηk− χk− = ηk− ,
ηk−
∞ → 0. Therefore,
k

∗ Hλ
δk− = ∗ f = (ψ(t ) +  ψ)([f ]k + η )χ ∗
ψδk−
x x k k x k− k−
= ∗ + ηλ = [H λ ]k χ ∗ + ηλ ,
ψ(tk )[fx ]k χk− k− x k− k−

λ
→ 0, ηλ χ ∗ = ηλ . This implies
where supco 0
ηk− ∞ k− k− k−
 tf  tf
∗ ∗
δk− Hxλ δx dt = [Hxλ ]k δxχk− dt + ρk−
λ
, (2.41)
t0 t0

λ =
tf λ δx dt. Let us estimate ρ λ . We have
where ρk− t0 ηk− k−

1
∗ ∗ 2
|ρk−
λ
| ≤ sup
ηk−
λ


δx
C meas Mk− ≤ sup
ηk−
λ


δx
2C + (meas Mk− ) .
co 0 2 co 0

∗ )2 ≤ 2

We use the obvious estimate (meas Mk− ∗ |δtk | dt.
Mk− Then
  
sup |ρk−
λ
| = o
δx
2C + |δtk | dt . (2.42)
co 0 ∗
Mk−

Analogously, we prove that


 tf  tf
∗ ∗
δk+ Hxλ δx dt = − [Hxλ ]k δxχk+ dt + ρk+
λ
, (2.43)
t0 t0
2.3. Local Sequences, Representation of the Lagrange Function 45

where   
sup |ρk+
λ
|=o
δx
2C + |δtk | dt . (2.44)
co 0 ∗
Mk+

Taking into account that δk∗ Hxλ = δk−∗ H λ + δ ∗ H λ , we obtain from (2.41)–(2.43) that
x k+ x
 tf  tf
δk∗ Hxλ δx dt = ∗
[Hxλ ]k δx(χk− ∗
− χk+ ) dt + ρkλ ,
t0 t0

where supco 0 |ρkλ | = o
δx
2C + Mk∗ |δtk | dt . This and the relation δ ∗ Hxλ = δk∗ Hxλ im-
ply (2.40). The proposition is proved.

 tf
2.3.4 Representation of the Integral t0 δ ∗ H λ dt on Local Sequences
t 
We consider the term t0f δ ∗ H λ dt on an arbitrary sequence {δu∗ } ∈ ∗u . Since χ ∗ = χk∗ ,

we have δ ∗ H λ = δk∗ H λ , where δk∗ H λ = ψδk∗ f . In turn, δk∗ H λ = δk−
∗ H λ + δ ∗ H λ , which
k+
∗ ∗ ∗
corresponds to the representation δuk = δuk− + δuk+ . Consider the increment δk− ∗ Hλ =
∗ ∗ ∗
ψδk− f . By definition, δk− f = f (t, x , u + δuk− ) − f (t, x , u ), and, moreover, δk−
0 0 0 0 ∗ f =
∗ ∗
δk− f χk− . Recall that we have introduced the function

δvk− = (u0 + δu∗k− − u0k+ )χk−

.

On Mk− ∗ , we have t = t + δt , u0 + δu∗ = u0k+ + δv , x 0 (t) = x 0 (t + δt ) = x 0 (t ) +


k k k− k− k k k
k x , where k x 0 = x 0 (tk + δtk ) − x 0 (tk ). But k x 0 = ẋ 0k− δtk + o(δtk ), where ẋ 0k− =
0

ẋ 0 (tk −). Therefore, |k x 0 | = O(δtk ). This implies that on Mk−∗ ,

f (t, x 0 , u0 + δu∗k− ) = f (tk + δtk , x 0 (tk ) + k x 0 , u0k+ + δvk− )


= f k+ + ftk+ δtk + fxk+ k x 0 + fuk+ δvk−
1
+ (f )k+ (δtk , k x 0 , δvk− ), (δtk , k x 0 , δvk− )
2
+ o |δtk |2 + |k x 0 |2 + |δvk− |2 ,

where f k+ , ftk+ , fxk+ , fuk+ , and (f )k+ are values of the function f and its derivatives
at the point (tk , x 0 (tk ), u0k+ ). Taking into account that k x 0 = ẋ 0k− δtk + o(δtk ), we obtain
from this that on Mk− ∗ ,

f (t, x 0 , u0 + δu∗k− ) = f k+ + ftk+ δtk + fxk+ ẋ 0k− δtk + fuk+ δvk−


1 k+ (2.45)
+ fuu δvk− , δvk− + o(|δtk | + |δvk− |2 ).
2
∗ . For t < t , we set  u0 = u0 (t) − u0k− . Note that
Further, consider f (t, x o , u0 ) on Mk− k k
|k u | = O(δtk ) by the assumption that tk is an L-point of the function u0 . Moreover,
0

k x 0 = ẋ 0k− δtk + o(δtk ). Therefore,


f (t, x 0 , u0 ) = f (tk + δtk , x 0 (tk ) + k x 0 , u0k− + k u0 )
(2.46)
= f k− + ftk− δtk + fxk− ẋ 0k− δtk + fuk− k u0 + o(δtk ),
46 Chapter 2. Quadratic Conditions in the Calculus of Variations

where f k− , ftk− , fxk− , and fuk− are the values of the function f and its derivatives at the
point (tk , x 0 (tk ), u0k− ). Subtracting (2.46) from (2.45), we obtain the following on Mk− :

δ ∗ k− f = [f ]k + [ft ]k δtk + [fx ]k ẋ 0k− δtk + fuk+ δvk− − fuk− k u0


1 k+ (2.47)
+ fuu δvk− , δvk− + o(|δtk | + |δvk− |2 ).
2
∗ f and δv ∗ ∗
Taking into account that δk− k− are concentrated on Mk− and δtk = −|δtk | on Mk− ,
we obtain from this that
 
∗ − [f ]k + [f ]k ẋ 0k− |δt |χ ∗ + f k+ δv
δ ∗ k− f = [f ]k χk− t x k k− u k−
1
∗ + f k+ δv , δv + o(|δt | + |δv |2 )χ ∗ .
(2.48)
−fuk− k u0 χk− uu k− k− k k− k−
2
∗ f , we have the formula
Analogously, for δk+
 
∗ − [f ]k + [f ]k ẋ 0k+ |δt |χ ∗ + f k− δv
δ ∗ k+ f = −[f ]k χk+ t x k k+ u k+
1
∗ + f k− δv , δv + o(|δt | + |δv |2 )χ ∗ ,
(2.49)
−fuk+ k u0 χk+ k+ k+ k k+
2 uu k+

where k u0 = u0 (t) − u0k+ for t > tk , and ẋ 0k+ = ẋ 0 (tk +).


We mention a consequence of formulas (2.48) and (2.49), which will be needed in
what follows. Adding (2.48) and (2.49) and taking into account that |k u0 | = O(δtk ), we
obtain

δk∗ f = [f ]k (χk−
∗ ∗
− χk+ ) + fuk+ δvk− + fuk− δvk+ + O(|δtk | + |δvk |2 )χk∗ . (2.50)

This formula holds for an arbitrary sequence {δu∗k } ∈ ∗uk .


∗ Hλ =
Let us return to formula (2.48) and use it to obtain the expression for δk−
∗ f . On M ∗ , we have
ψδk− k−

ψ
ψ(t) = ψ(tk + δtk ) = ψ(tk ) + ψ̇ k− δtk + ηk− |δtk |
ψ (2.51)
= ψ(tk ) − ψ̇ k− |δtk | + ηk− |δtk |,
ψ
where ψ̇ k− = ψ̇(tk −) and supco 0 |ηk− | → 0 as |δtk | → 0, since ψ̇ = −Hxλ is left continuous
at the point tk . We obtain from (2.48) and (2.51) that

∗ ∗ ∗
δk− H λ = [H λ ]k χk− − [Htλ ]k + [Hxλ ]k ẋ 0k− + ψ̇ k− [Hψ ]k |δtk |χk−
1 λk+
+ Huu δvk− , δvk− + η̃k−
λ
(|δtk | + |δvk− |2 ), (2.52)
2
λ χ ∗ = η̃λ and sup
λ∈co 0
η̃k−
∞ → 0. Here, we have taken into account that
where η̃k− λ
k− k−
Hu = Hu = 0 for all λ ∈ co 0 and [f ]k = [Hψ ]k .
λk+ λk−
∗ Hλ =
We now turn to formula (2.49) and use it to obtain the expression for δk+
∗ ∗
ψδk+ f . On Mk+ , we have
ψ
ψ(t) = ψ(tk + δtk ) = ψ(tk ) + ψ̇ k+ δtk + ηk+ δtk
ψ (2.53)
= ψ(tk ) + ψ̇ k+ |δtk | + ηk+ |δtk |,
2.3. Local Sequences, Representation of the Lagrange Function 47

ψ
where ψ̇ k+ = ψ̇(tk +), and supco 0 |ηk+ | → 0. Analogously to formula (2.52), we obtain
from (2.49) and (2.53) that

∗ ∗ ∗
δk+ H λ = −[H λ ]k χk+ − [Htλ ]k + [Hxλ ]k ẋ 0k+ + ψ̇ k+ [Hψ ]k |δtk |χk+
1 λk−
+ Huu δvk+ , δvk+ + η̃k+
λ
(|δtk | + |δvk+ |2 ), (2.54)
2
λ χ ∗ = η̃λ and sup
λ∈co 0
η̃k+
∞ → 0.
where η̃k+ λ
k+ k+
A remarkable fact is that the coefficients of |δtk | in formulas (2.52) and (2.54)
coincide and are the derivative at the point t = tk of the function {k H λ }(t) introduced
in Section 2.1.6. Let us show this. Let k ∈ I ∗ and λ ∈ co 0 . Recall that by definition,

(k H λ )(t) = H (t, x 0 (t), u0k+ , ψ(t)) − H (t, x 0 (t), u0k− , ψ(t)) = ψ(t)(k f )(t),

where (k f )(t) = f (t, x 0 (t), u0k+ )−f (t, x 0 (t), u0k− ). In what follows, we will omit the su-
perscript λ of H . The conditions ψ̇(t) = −ψ(t)fx (t, x 0 (t), u0 (t)), ẋ 0 (t) = f (t, x 0 (t), u0 (t)),
and the property of the function u0 (t) implies the existence of the left and right derivatives
of the functions ψ(t) and x 0 (t) at the point tk , and, moreover, the left derivative is the left
limit of the derivatives, and the right derivative is the right limit of the derivatives. On
each of the intervals of the set [t0 , tf ]\, the derivatives ψ̇, ẋ 0 are continuous. Analogous
assertions hold for the function (k H )(t). Its left derivative at the point tk can be calculated
by the formula

d 
d 
(k H )(tk ) = H (t, x 0 (t), u0k+ , ψ(t)) − H (t, x 0 (t), u0k− , ψ(t)) 
dt − dt − 
t=tk
= [Ht ]k + [Hx ]k ẋ 0k− + ψ̇ k− [Hψ ]k . (2.55)
∗ H . The right derivative
This derivative is the coefficient of |δtk | in expression (2.52) for δk−
of the function {H } (t) at the point tk can be calculated by the formula
k


d 
d 
(k H )(tk ) = H (t, x (t), u , ψ(t)) − H (t, x (t), u , ψ(t)) 
0 0k+ 0 0k−
dt + dt + 
t=tk
= [Ht ]k + [Hx ]k ẋ 0k+ + ψ̇ k+ [Hψ ]k . (2.56)
∗ H . We show that
This derivative is the coefficient of |δtk | in expression (2.54) for δk+

d d
(k H )(tk ) = (k H )(tk ); (2.57)
dt − dt +
i.e., the function (k H )(tk ) is differentiable at the point tk . Indeed,
d
(k H )(tk ) = [Ht ]k + [Hx ]k ẋ 0k− + ψ̇ k− [Hψ ]k
dt −
= [Ht ]k + (Hxk+ − Hxk− )Hψk− − Hxk− (Hψk+ − Hψk− )

= [Ht ]k + Hxk+ Hψk− − Hxk− Hψk+ . (2.58)


48 Chapter 2. Quadratic Conditions in the Calculus of Variations

But the right derivative has the same form:


d
(k H )(tk ) = [Ht ]k + [Hx ]k ẋ 0k+ + ψ̇ k+ [Hψ ]k
dt +
= [Ht ]k + (Hxk+ − Hxk− )Hψk+ − Hxk+ (Hψk+ − Hψk− )
= [Ht ]k + Hxk+ Hψk− − Hxk− Hψk+ .

We denote the derivative of the function −(k H λ )(t) at the point tk by D k (H λ ). Therefore,
we have proved the following assertion.

Lemma 2.12. The function (k H )(t) is differentiable at each point tk ∈ . Its derivative
at this point can be calculated by the formulas
d
(k H )(tk ) = [Htλ ]k + Hxλk+ Hψλk− − Hxλk− Hψλk+
dt
= [Htλ ]k + [Hxλ ]k ẋ k− + ψ̇ k− [Hψλ ]k = [Htλ ]k + [Hxλ ]k ẋ k+ + ψ̇ k+ [Hψλ ]k .

In particular, these formulas imply


d d
(k H )(tk −) = (k H )(tk +),
dt dt
and, therefore, (k H )(t) is continuously differentiable at each point tk ∈ . By definition
d
− (k H )(tk ) = D k (H ).
dt
We now turn to formulas (2.52) and (2.54). Since δk∗ H λ = δk−
∗ H λ + δ ∗ H λ and
 ∗ λ k+
δ∗H λ = k δk H , the following assertions follows from these formulas and Lemma 2.12.

Proposition 2.13. Let {δw∗ } = {(0, δu∗ )} ∈ ∗ and λ = (α0 , α, β, ψ) ∈ co 0 . Then,


 1 λk+
∗ ∗
δ∗H λ = [H λ ]k (χk− − χk+ ) + D k (H λ )|δtk |χk∗ + Huu δvk− , δvk−
2
k∈I ∗ 
1 λk−
+ Huu δvk+ , δvk+ + η̃kλ (|δtk | + |δvk |2 ) ,
2
where supλ∈co 0
η̃kλ
∞ → 0, η̃kλ χk∗ = η̃kλ , k ∈ I ∗ . Therefore,
 tf  

∗ λ ∗ ∗
δ H dt = [H λ ]k (meas Mk− − meas Mk+ ) + D k (H λ ) | δtk | dt
t0 k∈I ∗ Mk∗
 
1 tf λk+
+ Huu δvk− , δvk− + Huu λk−
δvk+ , δvk+ dt
2 t0
 
+ εkλ (|δtk | + |δvk |2 ) dt,
k∈I ∗ Mk∗

where supλ∈co 0 |εkλ | → 0, k ∈ I ∗ .


2.3. Local Sequences, Representation of the Lagrange Function 49

 tf
2.3.5 Representation of the Integral t0 δH λ dt on Local Sequences
Propositions 2.10, 2.11, and 2.13 imply the following assertion.

Proposition 2.14. Let {δw} ∈ loc be represented in the canonical form: {δw} = {δw 0 } +
{δw∗ }, where {δw 0 } = {(δx, δu0 )}, {δw∗ } = {(0, δu∗ )}, {δu0 χ ∗ } = {0}. Then the following
formula holds for any λ ∈ co 0 :
 tf  tf 
1 tf λ
δH dt =
λ
Hx δx dt +
λ
Hww δw 0 , δw0 dt
t0 t0  2 t0
 
∗ ∗
+ [H ] (meas Mk− − meas Mk+ ) + D (H )
λ k k λ
|δtk | dt
k∈I ∗ Mk∗
 tf
1
+ Huu
λk+
δvk− , δvk− + Huu
λk−
δvk+ , δvk+ dt
2 t0 
 tf
∗ ∗
+[Hxλ ]k δx(χk− − χk+ ) dt + ρH λ,
t0

λ = ελ (

2 ) dt + tf
k∈I ∗ Mk∗ (|δtk |+|δvk | |δu0 |2 dt +
δx
2C ), supλ∈co 0 |εH
where ρH λ | → 0.
H t0

2.3.6 Expansion of the Lagrange Function on Local Sequences


tf tf
Now recall that (λ, δw) = δl λ − t0 ψδ ẋ dt + t0 δH λ dt for {δw} ∈ loc . For δl λ , we have
the decomposition
1 λ
δl λ = l λ (p0 + δp) = lpλ δp + lpp δp, δp + εlλ |δp|2 ,
2
t
where supλ∈co 0 |εlλ | → 0. We integrate the term t0f ψδ ẋ dt by parts and use the transver-
sality conditions and also the adjoint equation:
 tf  tf
ψδ ẋ dt = −ψ(t0 )δx(t0 ) + ψ(tf )δx(tf ) − ψ̇δx dt
t0 t0
 tf  tf
= lxλ0 δx(t0 ) + lxλf δx(tf ) + Hxλ δx dt = lpλ δp + Hxλ δx dt.
t0 t0

From this we obtain


 tf  tf
1 λ
δl λ − ψδ ẋ dt = lpp δp, δp − Hxλ δx dt + εlλ |δp|2 . (2.59)
t0 2 t0

For the sequence {δw} ∈ loc represented in the canonical form, we set
 tf s 
  tf
γ (δw) =
δx
2C + |δu | dt + 2
0 2
|δtk | dt + |δv|2 dt. (2.60)

t0 k=1 Mk t0

Formula (2.59) and Proposition 2.14 imply the following assertion.


50 Chapter 2. Quadratic Conditions in the Calculus of Variations

Proposition 2.15. For any sequence {δw} ∈ loc represented in the canonical form,
{δw} = {δw 0 } + {δw∗ }, {δw0 } = {(δx, δu0 )} ∈ 0 , {δw ∗ } = {(0, δu∗ )} ∈ ∗ , {δu0 χ ∗ } = {0},
and for any λ ∈ co 0 , we have the formula

(λ, δw) = 1λ (δw) + ε


λ
γ (δw),

where

1 λ 1 tf λ
 (δw) = lpp δp, δp +

Hww δw 0 , δw0 dt
2 2 t0
s
∗ ∗
+ [H λ ]k (meas Mk− − meas Mk+ )
k=1
  tf
∗ ∗
+D (H ) k λ
|δtk | dt + [Hxλ ]k δx(χk− − χk+ ) dt

Mk t0
 tf
1
+ Huu
λk+
δvk− , δvk− + Huu
λk−
δvk+ , δvk+ dt (2.61)
2 t0

and supλ∈co 0 |ε


λ | → 0.

 ∗ −
In expression (2.61) for 1λ (δw), all terms, except for sk=1 [H λ ]k (meas Mk−
meas Mk+ ∗ ), are estimated through γ on any sequence {δw} ∈ loc starting from a certain

number. For example,


 tf 
 ∗ ∗  1
 δx(χk− − χk+ ) dt  ≤
δx
C meas Mk∗ ≤ (
δx
2C + (meas Mk∗ )2 ) ≤ γ (δw),
t0 2

since (meas Mk∗ )2 ≤ 4 M∗ |δtk | dt. (This estimate follows from the estimates 12 (meas Mk−∗ )2
k
∗ )2 ≤
∗ ∗
≤ M∗ |δtk | dt, 12 (meas Mk+ ∗ |δtk | dt, and the equality meas Mk = meas Mk− +
Mk+
k−
meas Mk+ ∗ .)

Recall that by 0 we denote the set consisting of those λ ∈ 0 for which the condi-
tions [H λ ]k = 0 for all k ∈ I ∗ hold. Proposition 2.15 implies the following assertion.

Proposition 2.16. Let the set  0 be nonempty. Then there exists a constant C > 0 such
that the following estimate holds at any sequence {δw} ∈ loc represented in the canonical
form, starting from a certain number:

max |(λ, δw)| ≤ C γ (δw).


λ∈co 
0

We will need this estimate later, in Section 2.5.


We have made an important step in the way of distinguishing the quadratic form.
Also, we have defined the functional γ on local sequences. We must extend this functional
to Pontryagin sequences. Precisely, this functional will define the higher order which we
will use to obtain quadratic conditions in the problem considered. Note that  (λ, δw 0 ) :=
1 tf
2 lpp δp, δp + 2 t0 Hww δw , δw dt is the second variation of the Lagrange functional.
1 λ λ 0 0
2.3. Local Sequences, Representation of the Lagrange Function 51
u
6

(t, v) > 0 (t, v) = |v − u(t)|2

'
?
(t, v) = |v − u1+ |2 + 2|t − t1 | - u1+
 u(t)

V2
&
$

u(t) u1−
- (t, v) > 0
6
6 V1
%

(t, v) = |v − u(t)|2 (t, v) = |v − u1− |2 + 2|t − t1 |


t
-
t0 t1 tf

Figure 2.2. Definition of functions (t, v) on neighborhoods of discontinuity points.

2.3.7 Higher Order γ


We first define the concept of admissible function (t, u); cf. the illustration in Figure 2.2
for k = 1.

Definition 2.17. A function (t, u) : Qtu → R is said to be admissible (or an order function)
if it is continuous on Qtu and there exist disjoint neighborhoods Vk ⊂ Qtu of the compact
sets u0 (tk−1 , tk ) such that the following five conditions hold:
(1) (t, u) = |u − u0 (t)|2 if (t, u) ∈ Vk , t ∈ (tk−1 , tk ), k = 1, . . . , s + 1;
(2) (t, u) = 2|t − tk | + |u − u0k− |2 if (t, u) ∈ Vk , t > tk , k = 1, . . . , s;
(3) (t, u) = 2|t − tk | + |u − u0k+ |2 if (t, u) ∈ Vk+1 , t < tk , k = 1, . . . , s;

(4) (t, u) > 0 on Qtu \V, where V = s+1 k=1 Vk ;
(5) for any compact set F ⊂ Qtu \V, there exists a constant L > 0 such that |(t, u ) −
(t, u )| ≤ L|u − u | if (t, u ) and (t, u ) belong to F .

Let us show that there exists at least one admissible function . Fix arbitrary disjoint

neighborhoods Vk ⊂ Qtu of the compact sets u0 (tk−1 , tk ) and define  on V = Vk by
conditions (1)–(3). We set Vε = {(t, u) ∈ V | (t, u) < ε}. For a sufficiently small ε = ε0 > 0,
the set Vε0 is a neighborhood of u0 contained in V together with its closure. For the above ε0 ,
we set 
(t, u) if (t, u) ∈ Vε0 ,
0 (t, u) =
ε0 if (t, u) ∈ Qtu \Vε0 .
Then the function 0 is admissible. An admissible function  is not uniquely defined, but
any two of them coincide in a sufficiently small neighborhood of the compact set u0 .
52 Chapter 2. Quadratic Conditions in the Calculus of Variations

Let (t, u) be a certain admissible function. We set


 tf
γ (δw) =
δx
2C + (t, u0 + δu) dt. (2.62)
t0

This functional is defined for pairs δw = (δx, δu) ∈ W such that (t, u0 + δu) ∈ Qtu a.e. on
[t0 , tf ]. Such pairs are said to be admissible with respect to Qtu (also, in this case, the
variation δu is said to be admissible with respect to Qtu ). It is easy to see that for any local
sequence {δw} ∈ loc , the values of γ (δw) can be calculated by formula (2.60) starting
from a certain number, and, therefore in the definition of γ and in formula (2.60), we have
used the same notation.
Let us verify that γ is a strict higher order on , where  is the set of Pontryagin
sequences. Obviously, γ ≥ 0, γ (0) = 0, and for any variation δw admissible on Qtu , the
condition γ (δw) = 0 implies δw = 0.
Let us show that the functional γ is -continuous at zero. It is required to show
that γ (δw) → 0 for any Pontryagin sequence {δw}. Since the condition
δx
1,1 → 0 holds
for {δw} ∈  and
δx
C ≤
δx
1,1 , it suffices to show that for {δw} ∈ , the condition
tf
t0 (t, u + δu) dt → 0 holds. Let Uε (u ) be an ε-neighborhood of the set u in R
0 0 0 1+d(u) .

Assume that ε > 0 is chosen so that Uε (u0 ) ⊂ Qtu . Represent δu in the form δu = δuε + δuε ,
where

δu(t) if (t, u0 (t) + δu(t)) ∈ Uε (u0 ),
δuε (t) =
0 otherwise.

We set M ε = {t | δuε (t)  = 0}, Mε = [t0 , tf ]\Mε . Since |δuε | ≥ ε on M ε , we have


 tf  tf
ε meas M ≤ ε
|δu | dt ≤
ε
|δu| dt → 0.
t0 t0

Therefore, for any fixed ε, we have meas Mε → 0. But then we can choose a subsequence
ε → +0 such that meas M ε → 0. (Recall that M ε is defined by a member of the sequence
{δu} and the corresponding member of the sequence {ε}; when defining Mε , we take the
members of the sequences {δu} and {ε} with the same numbers.) Fix such a sequence
t
{ε}. Since
δuε
∞ ≤ O(1), we have
(t, u0 + δuε )
∞ ≤ O(1). Therefore, t0f (t, u0 +
δuε ) dt ≤
(t, u0 +δuε )
∞ meas Mε → 0. Moreover, the condition ε → +0 implies {δuε } ∈
loc , and therefore,
(t, u 0 + δu )
→ 0, which implies tf (t, u0 + δu ) dt → 0. Then
u ε ∞ t0 ε
 tf  tf  tf
(t, u0 + δu) dt = (t, u0 + δuε ) dt + (t, u0 + δuε ) dt → 0;
t0 t0 t0

this is what was required to be proved. Therefore, we have shown that the functional γ is
-continuous at zero. Therefore, γ is an order. Moreover, γ is a strict order.
Let us verify that γ is a higher order. Let {δw} ∈ , and let {w̄} ∈ 0 . We need to
show that γ (δw + w̄) = γ (δw) + o(

), where

=

1,1 +

∞ . Since
δx + x̄
2C =

δx
2C + o(

C ) and

C ≤

1,1 , it suffices to show that
 tf  tf
(t, u0 + δu + ū) dt = (t, u0 + δu) dt + o(

∞ ).
t0 t0

As above, represent {δu} in the form {δu} = {δuε } + {δuε }, where ε → 0, meas M ε → 0.
2.3. Local Sequences, Representation of the Lagrange Function 53

Then
 tf  
(t, u0 + δu + ū) dt = (t, u0 + δuε + ū) dt + (t, u0 + δuε + ū) dt
t0 Mε Mε
 tf 
= (t, u0 + δu) dt + ((t, u0 + δuε + ū)
t0 Mε

− (t, u0 + δuε )) dt

+ ((t, u0 + δuε + ū) − (t, u0 + δuε )) dt.

Here, we have used the relations
(t, u0 + δu) = (t, u0 + δu)(χε + χ ε ) = (t, u0 + δuε ) + (t, u0 + δuε ),
where χε and χ ε are the characteristic functions of the sets Mε and M ε , respectively. By
property (5) of Definition 2.17, we have
   

 (t, u0 + δuε + ū) − (t, u0 + δuε ) dt  ≤ const(meas Mε )

∞ = o(

∞ ).

Therefore, it suffices to show that

((t, u0 + δuε + ū) − (t, u0 + δuε )) dt = o(

∞ ) (2.63)

or, which is the same,
 tf
((t, u0 + δuε + ūε ) − (t, u0 + δuε )) dt = o(

∞ ),
t0

where ūε = ūχε . As was already noted, {δuε } ∈ locu . Moreover,


ūε
∞ → 0, i.e., {ūε } ∈
0u . Therefore, it suffices to prove the following assertion.

Proposition 2.18. The following estimate holds for any {δu} ∈ loc u and {ū} ∈ u :
0
 tf  tf
(t, u0 + δu + ū) dt = (t, u0 + δu) dt + o(

∞ ).
t0 t0

Proof. Represent {δu} in the canonical form {δu} = {δu0 } + {δu∗ }, {δu0 } ∈ 0u , {δu∗ } ∈ ∗u ,
|δu0 |· |δu∗ | = 0. The latter property holds for all members of the sequences {δu0 } and {δu∗ }
with the same numbers. According to the definition of the function (t, u), we have
 tf  tf 
(t, u + δu) dt =
0
|δu | dt +
0 2
(2|δtk | + |δvk∗ |2 ) dt.
t0 t0 k Mk∗

Let M = [t0 , tf ]\M∗ . Then


 tf  
(t, u0 + δu + ū) dt = |δu + ū| dt +
0 2
(2|δtk | + |δvk∗ + ū|2 ) dt
t0 M Mk∗
 tf k

= (t, u + δu) dt + o(

∞ ).
0
t0

The proposition is proved.


54 Chapter 2. Quadratic Conditions in the Calculus of Variations

According to Proposition 2.18, formula (2.63) holds. Therefore, γ is a higher order


on . Obviously, γ is a strict order on . We will perform decoding of the constant Cγ :=
Cγ (0 , σ γ ) precisely with this order. According to Theorem 2.8, the inequality Cγ ≥ 0
is necessary for the Pontryagin minimum at the point w0 , and the strict inequality Cγ > 0
is sufficient for the strict Pontryagin minimum at this point. As was already mentioned, the
decoding of the basic constant Cγ consists of the following two stages: estimating Cγ from
above and estimating Cγ from below.

2.4 Estimation of the Basic Constant from Above


2.4.1 Passing to Local Sequences and Needles
Recall that
0
Cγ = Cγ (0 , σ γ ) = inf lim , σ γ = {{δw} ∈  | σ ≤ O(γ )},
σ γ γ
σ = max{F0 (p0 + δp), . . . , Fd(F ) (p0 + δp), |δK|,
δ ẋ − δf
1 }.

We will estimate Cγ from above. Since Cγ ≥ 0 is a necessary condition for the Pontryagin
minimum, the nonnegativity of any upper estimate for Cγ is also a necessary condition for
the Pontryagin minimum. Therefore, this stage of decoding can be considered as obtaining
a necessary condition for the Pontryagin minimum.
Let t ∈ (t0 , tf )\, ε > 0, and let [t − ε, t + ε] be entirely contained in one of the
intervals of the set (t0 , tf )\. Let a point u ∈ Rd(u) be such that (t , x 0 (t ), u ) ∈ Q, u  =
u0 (t ). Define the needle-shaped variation

u − u0 (t), t ∈ [t − ε, t + ε],
δu = δu (t; t , ε, u ) =
0 otherwise.

Consider a sequence of needle-shaped variations {δu } := {δu (· , t , n1 , u )}, enumerated by


the parameter ε so that ε = ε n = 1/n. Clearly, {δw } := {(0, δu )} is a Pontryagin sequence.
Obviously, γ := γ (δw ) = [t −ε,t +ε] (t, u ) dt is of order ε. We denote by  the set of
sequences {δw } = {(0, δu )} such that {δu } is a sequence of needle-shaped variations. We
σ γ =  ∩ σ γ . Therefore,
set loc loc

σ γ = {{δw} ∈ 
loc | σ ≤ O(γ )}.
loc

We have the following assertion.

Lemma 2.19. Let {δw loc } ∈ loc


σ γ and {δw } ∈  be such that γ ≤ O(γ ), where γ =
loc

γ (δw ) and γ = γ (δw ). Then
loc loc

tf
(λ, δw loc ) + t0 δ H λ dt
lim max ≥ Cγ ,
0 γ loc + γ

where δ H λ = H (t, x 0 , u0 + δu , ψ) − H (t, x 0 , u0 , ψ).


2.4. Estimation of the Basic Constant from Above 55

To prove Lemma 2.19, we need the following proposition.

Proposition 2.20. Let ϕ(t, w) : Q → Rd(ϕ) be a continuous function. Let {δwloc } ∈ loc ,
{δw } ∈  , {δw} = {δw loc + δw }. Then δϕ = δ loc ϕ + δ ϕ + rϕ , where

1 = o(γ ),


∞ → 0, δϕ = ϕ(t, w 0 + δw) − ϕ(t, w0 ), δ loc ϕ = ϕ(t, w0 + δwloc ) − ϕ(t, w 0 ), δ ϕ =
ϕ(t, w0 + δw ) − ϕ(t, w0 ).

Proof. The following relations hold:


δϕ = ϕ(t, w0 + δwloc + δw ) − ϕ(t, w 0 + δw loc ) + δ loc ϕ
= δ̄ ϕ + δ loc ϕ = δ̄ ϕχ + δ loc ϕ,
where δ̄ ϕ = ϕ(t, w 0 + δwloc + δw ) − ϕ(t, w0 + δw loc ) and χ is the characteristic func-
tion of the set M = {t | δu  = 0}. Further, let {δwloc } = {δw 0 } + {δw∗ } be the canonical
representation, where {δw 0 } ∈ 0 , {δw∗ } ∈ ∗ , and |δu0 |· |δu∗ | = 0. It follows from the
definitions of sequences {δw } and {δw∗ } that |δu |· |δu∗ | = 0 starting from a certain number.
Therefore,

(δ̄ ϕ)χ = ϕ(t, w0 + δw 0 + δw ) − ϕ(t, w0 + δw 0 ) χ

= ϕ(t, w0 + δw 0 + δw ) − ϕ(t, w0 + δw ) − δ 0 ϕ + δ ϕ χ = rϕ + δ ϕ,

where
rϕ = (δ̄ 0 ϕ − δ 0 ϕ)χ ,
δ̄ 0 ϕ = ϕ(t, w 0 + δw + δw 0 ) − ϕ(t, w 0 + δw ),
δ0ϕ = ϕ(t, w 0 + δw 0 ) − ϕ(t, w0 ).
Therefore, δϕ = δ loc ϕ + δ ϕ + rϕ . Since
δ̄ 0 ϕ
∞ → 0,
δ 0 ϕ
∞ → 0, meas M = O(γ ), we
have

∞ ≤
δ̄ 0 ϕ
∞ +
δ 0 ϕ
∞ → 0,

1 ≤

∞ meas M = o(γ ). The proposition
is proved.

Proposition 2.20 implies the following assertion.

Proposition 2.21. Let {δwloc } ∈ loc , {δw } ∈  , {δw} = {δwloc + δw }. Then for any
λ ∈ 0 , we have (λ, δw) = (λ, δw loc ) + t0f δ H λ dt + ρ λ , where sup0 |ρ λ | = o(γ ).
t

Moreover, γ = γ loc + γ + o(γ ), where γ = γ (δw), γ loc = γ (δwloc ), and γ = γ (δw ).


Finally,
δ ẋ − δf
1 =
δ ẋ − δ loc f
1 + O(γ ).

Proof. By Proposition 2.20, we have


 tf  tf  tf  tf
(λ, δw) = δl λ − ψδ ẋ dt + ψδf dt = δ loc l λ − ψδ ẋ dt + ψδ loc f dt
t0 t0 t0 t0
 tf  tf  tf

+ ψδ f dt + ψrf dt = (λ, δw loc
)+ δ H λ dt + ρ λ ,
t0 t0 t0

where sup0 |ρ λ | ≤ sup0


ψ

rf
1 = o(γ ). Further,
 tf  tf  tf
γ (δw) =
δx
2C + δ dt =
δx
2C + δ loc  dt + δ  dt + o(γ )
t0 t0 t0

= γ loc + γ + o(γ ).
56 Chapter 2. Quadratic Conditions in the Calculus of Variations

Finally,
δ ẋ − δf
1 =
δ ẋ − δ loc f − δ f − rf
1 =
δ ẋ − δ loc f
1 + O(γ ), since
δ f
1 ≤
const meas M = O(γ ) and
rf
1 = o(γ ). The proposition is proved.

Proof of Lemma 2.19. Let {δw} = {δw loc } + {δw }, where {δw loc } ∈ loc
σ γ , {δw } ∈ 

and γ ≤ O(γ loc ). Then, according to Proposition 2.21, γ = γ loc + γ + o(γ ). However,
γ ≤ O(γ loc ). Therefore, γ ≤ O(γ loc ). On the other hand, since γ = γ loc + (1 + o(1))γ ,
γ ≥ 0, we have γ loc ≤ O(γ ). Therefore, γ and γ loc are of the same order of smallness.
Obviously, {δw} ∈ . Let us show that {δw} ∈ σ γ . Indeed, by Proposition 2.21,

δ ẋ − δf
1 =
δ ẋ − δ loc f
1 + O(γ ). Therefore,
σ (δw) = max{Fi (p 0 + δp), |δK|,
δ ẋ − δf
1 }
≤ σ (δw loc ) + O(γ ) ≤ O1 (γ loc ) ≤ O2 (γ ).
Thus, {δw} ∈ σ γ . Further, according to Proposition 2.21,
 tf
(λ, δwloc ) + δ H λ dt + ρ λ = (λ, δw),
t0

where sup0 |ρ λ | = o(γ ) = o1 (γ ). Therefore,


t
(λ, δw loc ) + t0f δ H λ dt (λ, δw) 0 0
lim max = lim max = lim ≥ inf lim = Cγ .
0 γ loc + γ 0 γ γ σ γ γ
The inequality holds, since {δw} ∈ σ γ . The lemma is proved.

We can now use the results of Section 2.3. Lemma 2.19 and Proposition 2.15 imply
the following assertion.

Lemma 2.22. Let {δw} ∈ loc


σ γ and {δw } ∈  be such that γ ≤ O(γ ), where γ = γ (δw )
and γ = γ (δw). Then
t
1λ (δw) + t0f δ H λ dt
lim max ≥ Cγ ,
0 γ +γ
where δ H λ = H (t, x 0 , u0 + δu , ψ) − H (t, x 0 , u0 , ψ) and the function 1λ (δw) is defined
by formula (2.61).

2.4.2 Replacement of the Functions in the Definition of the Set loc


σγ
by their Decompositions up to First-Order Terms
We represent the sequences {δw} ∈ loc in the canonical form: {δw} = {δw0 } + {δw ∗ },
where {δw0 } = {(δx, δu0 )} ∈ 0 , {δw ∗ } = {0, δu∗ } ∈ ∗ , and, moreover |δu0 |· |δu∗ | = 0.
We set I = {i ∈ {0, 1, . . . , d(F )} : Fi (p0 ) = 0} = IF (w 0 ) ∪ {0}. Then loc
σ γ consists of se-
quences {δw} ∈  such that
loc

δFi ≤ O(γ ), i ∈ I ; (2.64)


|δK| ≤ O(γ ); (2.65)

δ ẋ − δf
1 ≤ O(γ ). (2.66)
2.4. Estimation of the Basic Constant from Above 57

Since

δFi = Fip δp + O(|δp|2 ), i ∈ I ; δK = Kp δp + O(|δp|2 ); |δp|2 ≤ 2


δx
2C ≤ 2γ ,

conditions (2.64) and (2.65) are equivalent to the conditions

Fip δp ≤ O(γ ), i ∈ I ; (2.67)


|Kp δp| ≤ O(γ ), (2.68)

respectively. Further, consider condition (2.66). By Proposition 2.10,


1
δf = fw δw0 + fww δw0 , δw0 + δ ∗ f + δ ∗ fx δx + r̃,
2
 tf 
where

1 = o
δx
C + t0 |δu |2 dt = o1 (γ ). Here,
2 0

 tf

fww δw 0 , δw0
1 ≤ O(
δx
C
2 + |δu0 |2 dt) ≤ O1 (γ ),
t0
1

(δ ∗ fx )δx
1 ≤ const
δx
C meas M∗ ≤ const(
δx
2C + (meas M∗ )2 ).
2

But, as was already mentioned, (meas Mk∗ )2 ≤ 4 M∗ |δtk | dt ≤ 2γ . Therefore,
(δ ∗ fx )δx
1 ≤
 k
O(γ ). Further, δ ∗ f = k δk∗ f . According to formula (2.50), we have

δk∗ f = [f ]k (χk−
∗ ∗
− χk+ ) + fuk+ δvk− + fuk− δvk+ + O(|δtk | + |δvk |2 )χk∗ .

Therefore, condition (2.66) is equivalent to the following condition:



∗ ∗
δ ẋ − fw δw 0 − ([f ]k (χk− − χk+ ) − fuk+ δvk− − fuk− δvk+ ) ≤ O(γ ). (2.69)
k 1

We have shown that loc σ γ consists of sequences {δw} ∈ 


loc such that conditions (2.67)–

(2.69) hold for their canonical representations.

2.4.3 Narrowing the Set of Sequences loc


σγ

In what follows, in the formulation of Lemma 2.22, we narrow the set loc
σ γ up to its subset
defined by the following conditions:
(a) δv = 0;
(b) for any λ ∈ 0 ,

∗ ∗
[H λ ]k (meas Mk− − meas Mk+ ) ≤ 0. (2.70)
k

These conditions should hold for each member δw of the sequence {δw}. We denote by
σ γ the set of sequences {δw} ∈ σ γ satisfying these conditions.
loc 1 loc

For any sequence {δw} ∈ loc 1


σ γ , we obviously have

1λ (δw) ≤ 2λ (δw) ∀ λ ∈ 0 , (2.71)


58 Chapter 2. Quadratic Conditions in the Calculus of Variations

where 1λ is as defined in (2.61) and


1
2λ (δw) =  (λ, δw 0 )
2    tf 

∗ ∗
+ D k (H λ ) |δtk | dt + [Hxλ ]k δx(χk− − χk+ ) dt ,
M ∗ t
k  tf k 0

 (λ, δw0 ) = lpp


λ
δp, δp + Hww λ
δw0 , δw0 dt.
t0

Moreover, for any sequence {δw} ∈ loc1


σγ , we have γ (δw) = γ1 (δw), where
 tf 
γ1 (δw) =
δx
2C + |δu0 |2 dt + 2 |δtk | dt.
t0 k Mk∗

Finally, condition (2.69) passes to the following condition on these sequences:



∗ ∗
δ ẋ − fw δw0 − [f ]k (χk− − χk+ ) ≤ O(γ1 ).
k 1


σ γ , only δw and Mk participate and,
We note that in the definitions of 2 , γ1 , and loc1 0

moreover, the variation δw is uniquely reconstructed by Mk∗ and δw0 by using the conditions
δu0 χ ∗ = 0, δv = 0. We denote the pairs (δw, M ∗ ) by b. Introduce the set of sequences of
pairs {b} = {(δw, M ∗ )} such that

{δw} = {(δx, δu)} ∈ 0 , M∗ = Mk∗ , Mk∗ → tk (k ∈ I ∗ ), δuχ ∗ = 0,
k∈I ∗
Fip δp ≤ O(γ1 ) (i ∈ I ), |Kp δp| ≤ O(γ1 ),

∗ ∗
δ ẋ − fw δw − [f ]k (χk− − χk+ ) ≤ O(γ1 ),
1
 k
∗ ∗
[H ] (meas Mk− − meas Mk+ ) ≤ 0 ∀ λ ∈ 0 .
λ k

σ γ . In what follows, we denote by {δw}


As above, we denote this set of sequences by loc1
the sequences from 0 . Lemma 2.22 and inequality (2.71) imply the following assertion.

Lemma 2.23. Let {b} = {(δw, M ∗ )} ∈ loc1


σ γ and {δw } ∈  be such that γ ≤ O(γ1 ), where
 tf  
γ1 = γ1 (b) :=
δx
2C + |δu|2 dt + 2|δtk | dt, γ = γ (δw ).
t0 k Mk∗

Then, tf
2λ + t0 δ H λ dt
lim max ≥ Cγ ,
0 γ1 + γ
where
1
2λ = 2λ (b) :=  (λ, δw)
 2   
 tf
∗ ∗
+ k
D (H )λ
|δtk | dt + [Hx ]
λ k
δx(χk− − χk+ ) dt ,
k Mk∗ t0

δ H λ := H (t, x 0 , u0 + δu , ψ) − H (t, x 0 , u0 , ψ).


2.4. Estimation of the Basic Constant from Above 59

2.4.4 Replacement of δx2C by |δx(t0 )|2 in the Definition of


Functional γ1
We set  tf 
γ2 = |δx(t0 )|2 + |δu|2 dt + 2|δtk | dt.
t0 k Mk∗

Since |δx(t0 )| ≤
δx
C , we have γ2 ≤ γ1 . Let us show that the following estimate also holds
σ γ : γ1 ≤ const γ2 , where const > 0 is independent of the sequence.
on the sequence from loc1
For this purpose, it suffices to show that
δx
2C ≤ const γ2 on a sequence from loc1 σ γ . Let
us prove the following assertion.

Proposition 2.24. There exists a const > 0 such that for any sequence {b} = {δw, M ∗ }
satisfying the conditions

{δw} ∈ 0 , M∗ = ∪k Mk∗ , Mk∗ → tk (k ∈ I ∗ ),


 √
∗ ∗ (2.72)
δ ẋ − fw δw − [f ]k (χk− − χk+ ) = o( γ1 )
k 1

starting from a certain number the following inequality holds:


δx
2C ≤ const γ2 .

Proof. Let {b} satisfy conditions (2.72). Then



∗ ∗
δ ẋ = fx δx + fu δu + [f ]k (χk− − χk+ ) + r,
k

where
r
1 = o( γ1 ). As is known, this implies the estimate

∗ ∗

δx
1,1 ≤ |δx(t0 )| + const fu δu + [f ]k (χk− − χk+ )+r .
k 1

Since √

δu
1 ≤ ∗ − χ ∗
≤ meas M ∗ ,
tf − t0
δu
2 ,

χk−
 k+ 1 k
∗ )2 + (meas M ∗ )2 ≤ 2
(meas Mk− k+ |δtk | dt ≤ γ2 , k ∈ I ∗ ,
Mk∗

we have
δx
2C ≤
δx
21,1 ≤ const γ2 + o(γ1 ). This implies what was required. The propo-
sition is proved.

Therefore, γ1 and γ2 are equivalent on the set of sequences satisfying conditions


(2.72), i.e., on each such sequence, they estimate one another from above and from below:
γ2 ≤ γ1 ≤ γ2 ( > 1) (2.73)
(the constant  is independent of the sequence). In particular, inequalities (2.73) hold on
sequences from loc1 loc1
σ γ . First, this implies that the set σ γ does not change if we replace γ1
by γ2 in its definition. Further, inequality (2.73) implies the inequalities γ2 + γ ≤ γ1 + γ ≤
(γ2 + γ ), whence
γ1 + γ
1≤ ≤ . (2.74)
γ2 + γ
60 Chapter 2. Quadratic Conditions in the Calculus of Variations

Let {b} ∈ loc1


σ γ , {δw } ∈  , and let γ ≤ O(γ2 ). Then, we obtain from (2.74) and Lemma 2.23
that
t t
2λ + t0f δ H λ dt γ1 + γ 2λ + t0f δ H λ dt
lim max = lim max
0 γ2 + γ γ2 + γ 0 γ1 + γ

Cγ , Cγ ≥ 0,
≥ min{Cγ , Cγ } =
Cγ , Cγ < 0.

We have proved the following assertion.

Lemma 2.25. The following inequality holds for any {b} ∈ loc1
σ γ and {δw } ∈  such that

γ ≤ O(γ2 ): t
2λ (b) + t0f δ H λ dt
lim max ≥ min{Cγ , Cγ }.
0 γ2 (b) + γ

2.4.5 Passing to Sequences with Discontinuous State Variables


Denote by P W 1,1 (, Rd(x) ) the space of functions x̄ : [t0 , tf ]  → Rd(x) piecewise continuous
on [t0 , tf ] and absolutely continuous on each of the intervals of the set (t0 , tf ) \  (points of
discontinuity of such functions are possible only at points of the set ). The differential
constraint in the set loc1
σ γ is represented by the condition


∗ ∗
δ ẋ − fw δw − [f ]k (χk− − χk+ ) ≤ O(γ2 ). (2.75)
k 1
 ∗ − χ ∗ ) on δx in this condition? We
What is the influence of the terms k [f ]k (χk− k+
show below that the variations δx ∈ W (, Rd(x) ) can be replaced by variations
1,1
∗ − meas M ∗ , k ∈ I ∗ ,
x̄ ∈ P W 1,1 (, Rd(x) ) such that [x̄]k = [f ]k ξk , where ξk = meas Mk− k+
and, moreover, (2.75) passes to the condition


x̄˙ − fx x̄ − fu δu
1 ≤ O(γ2 ).

We will prove a slightly more general assertion, which will be used later in estimating Cγ
from below.
Therefore, we assume that there is a sequence {b} = {(δw, M ∗ )} such that {δw} ∈ 0 ,
M = ∪Mk∗ , Mk∗ → tk , k ∈ I ∗ . Moreover, let the following condition (which is weaker

than (2.75)) hold:


 √
∗ ∗
δ ẋ = fx δx + fu δu + [f ]k (χk− − χk+ ) + r,
r
1 = o( γ2 ). (2.76)
k

For each member b = (δx, δu, M∗ ) of the sequence {b}, let us define the functions δxk∗ and
x̄k∗ by the following conditions:
∗ − χ ∗ ), δx ∗ (t ) = 0, x̄˙ ∗ = 0,
δ ẋk∗ = [f ]k (χk− x̄k∗ (t0 ) = 0,
k+ k 0 k (2.77)

[x̄k ] = [f ] ξk , ξk = meas Mk−
k k ∗ − meas M ∗ .
k+
2.4. Estimation of the Basic Constant from Above 61

Therefore, x̄k∗ is the jump function: x̄k∗ (t) = 0 if t < tk and x̄k∗ (t) = [f ]k ξk if t > tk , and,
moreover, the value of the jump is equal to [f ]k ξk . We set
∗ ∗  ∗ ∗ 
δx k = x̄k∗ − δxk∗ , k ∈ I ∗ , δx = δx k , x̄ = δx + δx = δx + (x̄k∗ − δxk∗ ).
k k

Note that x̄ ∈ P W 1,1 (, Rd(x) ). Since the functions x̄k∗ and δxk∗ coincide outside Mk∗ , we
∗ ∗ ∗ ∗ ∗ ∗
have δx k χk∗ = δx k for all k. Hence δx χ ∗ = δx . Let us estimate
δx
∞ and
δx
1 .
We have      


δx k
∞ ≤
x̄k∗
∞ +
δxk∗
∞ ≤ [f ]k  · ξk  + [f ]k  meas Mk∗ .
Moreover,
  1
    2 
ξk  ≤ meas M ∗ ≤ 4 δtk  dt ≤ 2γ2 .
k
Mk∗
∗ √ ∗ ∗ ∗ ∗
Hence
δx
∞ ≤ const γ2 . Since δx χ ∗
= δx , we have
δx
1 ≤
δx
∞ meas M∗ ≤

const γ2 . What equation does x̄ satisfy? We obtain from (2.76) and (2.77) that

x̄˙ = fx δx + fu δu + r = fx x̄ + fu δu − fx δx + r.

Since
δx
1 ≤ O(γ2 ), we have
x̄˙ − fx x̄ − fu δu
1 ≤ O(γ2 ) +
r
1 . Note that the replace-
ment of δx by x̄ does not influence the value of γ2 , since x̄(t0 ) = δx(t0 ) and M ∗ and δu
are preserved. Now let us show that
 tf
∗ ∗
δx(χk− − χk+ ) dt = x̄av
k
ξk + o(γ2 ), (2.78)
t0
∗ − meas M ∗
where ξk = meas Mk− k = 1 (x̄ k− + x̄ k+ ) = 1 (x̄(t −) + x̄(t +)). Recall
and x̄av
k+ 2 2 k k
that δx satisfies equation (2.76). Represent δx in the form δx = δx 0 + δx ∗ + x , where
 r
ẋr = r, xr (t0 ) = 0, and δx ∗ = δxk∗ . Then δ ẋ 0 = fx δx + fu δu and δx 0 (t0 ) = δx(t0 ). This
and the conditions
δx
C → tf 0,
δu
∞ → 0 imply
δ ẋ
∞ → 0.
0
∗ ∗
Now let us consider t0 δx(χk− − χk+ ) dt. We set δxr0 := δx 0 + xr . Since

δx = (δx − δx ∗ ) + δx ∗ = δx 0 + xr + δx ∗ = δxr0 + δx ∗ ,
we have
 tf  tf  tf
∗ ∗ ∗ ∗
δx(χk− − χk+ ) dt = δxr0 (χk− − χk+ ) dt + δx ∗ (χk−
∗ ∗
− χk+ ) dt. (2.79)
t0 t0 t0

Let us estimate each summand separately. We have


 tf  tf
0 ∗ ∗ ∗ ∗
δxr (χk− − χk+ ) dt = δxr0 (tk )(χk− − χk+ ) dt
t0 t0
 tf
∗ ∗
+ (δx 0 (t) − δx 0 (tk ))(χk− − χk+ ) dt
t0 (2.80)
 tf
∗ ∗
+ (xr (t) − xr (tk ))(χk− − χk+ ) dt
t0

= δxr0 (tk )ξk + o(γ2 ).


62 Chapter 2. Quadratic Conditions in the Calculus of Variations

Here, we have used the following estimates: √



(a)
xr
∞ ≤
xr
1,1 =
ẋr
1 =
r
1 = o( γ2 ), meas Mk∗ ≤ 2γ2 , whence
 tf
∗ ∗
(xr (t) − xr (tk ))(χk− − χk+ ) dt = o(γ2 ); (2.81)
t0
   
(b) δx 0 (t) − δx 0 (tk ) ≤
δx˙ 0
∞ δtk , and hence
 tf  
 
 ∗ ∗  δtk  dt = o(γ2 ),
 (δx 0 (t) − δx 0 (tk ))(χk− − χk+ ) dt  ≤
δ ẋ 0
∞ (2.82)
t0 Mk∗

since
δ ẋ 0
∞ → 0. Relation (2.80) follows from (2.81) and (2.82). Further, the conditions
 
δ ẋr0 = fx δx + fu δu + r, x̄˙ = fx δx + fu δu + r,
δxr (t0 ) = δx(t0 ),
0 x̄(t0 ) = δx(t0 )
 
imply δxr0 = x̄ − k x̄k∗ outside , and hence δxr0 (tk ) = x̄ k− − j <k [x̄]j . We obtain from
this and (2.80) that
 tf   
∗ ∗
δxr0 (χk− − χk+ ) dt = x̄ k− − [x̄]j ξk + o(γ2 ). (2.83)
t0 j <k

Further, let yk∗ (t) be defined by the conditions ẏk∗ = χk−


∗ − χ ∗ , y ∗ (t ) = 0. Then
k+ k 0
 tf 1 1
yk∗ ẏk∗ dt = (meas Mk−
∗ ∗ 2
− meas Mk+ ) = ξk2 .
t0 2 2

Obviously, δxk∗ = [f ]k yk∗ . Hence


 tf  tf 1 1 1
δxk∗ (χk−
∗ ∗
− χk+ ) dt = [f ]k yk∗ ẏk∗ dt = [f ]k ξk2 = [x̄k∗ ]k ξk = [x̄]k ξk .
t0 t0 2 2 2

We obtain from this that


 tf
δx ∗ (χk−

− χk+∗
) dt
t0  
tf  tf
= δxj∗ (χk−
∗ ∗
− χk+ ) dt + δxk∗ (χk−
∗ ∗
− χk+ ) dt (2.84)
t0 j <k t0
 1
= [x̄]j ξk + [x̄]k ξk ,
2
j <k

since we have the following for j < k:

δxj∗ χk∗ = x̄j∗ χk∗ = [x̄j∗ ]j χk∗ = [x̄]j χk∗ .


2.4. Estimation of the Basic Constant from Above 63

We obtain from (2.79), (2.83), and (2.84) that


 tf  
∗ ∗ 1
δx(χk− − χk+ ) dt = x̄ k− + [x̄]k ξk + o(γ2 ) = x̄av
k
ξk + o(γ2 ),
t0 2

as required.
Finally, let us show that
 tf  tf
Hww
λ
w̄, w̄ dt = Hww
λ
δw, δw dt + ρ λ , (2.85)
t0 t0
  ∗ ∗
where w̄ = (x̄, ū), ū = δu, and sup0 ρ λ  = o(γ2 ). Indeed, x̄ = δx + δx , where
δx
∞ →

0,
δx
1 = O(γ2 ). Hence
 
tf tf  
Hww
λ
w̄, w̄ dt = Hxx
λ
x̄, x̄ + 2Hux
λ
x̄, ū + Huu
λ
ū, ū dt
t0 t0
 tf  
= Hxx
λ
δx, δx + 2Hux
λ
δx, δu + Huu
λ
δu, δu dt
t0
 tf
∗ ∗ ∗ ∗
+ λ
2Hxx δx, δx + 2Hux
λ
δx , δu + Hxx
λ
δx , δx dt
t0
 tf  
= Hww
λ
δw, δw dt + ρ λ , supρ λ  = o(γ2 ),
t0 0

since
 t 
 f λ ∗  ∗
 Hxx δx, δx dt  ≤ sup
Hxx
λ


δx
C
δx
1 = o(γ2 ),

t0 0
 t 
 f λ ∗  ∗
 Hux δx , δu dt  ≤ sup
Hux
λ


δu

δx
1 = o(γ2 ),

t0 0
 t 
 f λ ∗ ∗  ∗ ∗
 Hxx δx , δx dt  ≤ sup
Hxx
λ


δx

δx
1 = o(γ2 ).

t0 0

Therefore, formula (2.85) holds. We set

b̄ = (w̄, M ∗ ); ξk = meas Mk− ∗ − meas M ∗ ,



tf λ k+
 (λ, w̄) = lpp p̄, p̄ + t0 Hww w̄, w̄ dt,
λ

where p̄ = (x̄(t0 ), x̄(tf ));


 
1  
3λ (b̄) =  (λ, w̄) + D k (H λ ) δtk  dt + [H λ ]k x̄ k ξk ; (2.86)
x av
2 Mk∗
k
 tf 
γ2 (b̄) = |x̄(t0 )|2 + |ū|2 dt + 2 |δtk | dt.
t0 k Mk∗
64 Chapter 2. Quadratic Conditions in the Calculus of Variations

Since p̄ = δp for the entire sequence {b}, we obtain from (2.78) and (2.85) that 2λ (b) =
3λ (b̄) + ρ λ , where sup0 |ρ λ | = o(γ2 ). We have proved the following assertion.

Lemma 2.26. Let a sequence {b} = {(δw, M∗ )} be such that {δw} = {(δx, δu)} ∈ 0 ,
M∗ = ∪Mk∗ , Mk∗ → tk , k ∈ I ∗ , and , moreover, let

∗ ∗
δ ẋ = fx δx + fu δu + [f ]k (χk− − χk+ ) + r,
k


where
r
1 = o( γ2 ). Let a sequence {b̄} = {(w̄, M∗ )} be such that w̄ = (x̄, ū), x̄ =
∗ ∗  
δx + δx , and ū = δu, where δx = x̄ ∗ − δx ∗ = x̄k∗ − δxk∗ , and x̄k∗ and δxk∗ are defined
by formulas (2.77). Let δp = (δx(t0 ), δx(tf )) and p̄ = (x̄(t0 ), x̄(tf )). Then

{δp} = {p̄}, γ2 (b) = γ2 (b̄), x̄˙ = fx x̄ + fu ū − fx δx + r,
[x̄]k = [f ]k ξk ∀ k, ξk = meas Mk− ∗ − meas M ∗ ;
δx ∗
≤ O(γ );
k+ 1 2
2λ (b) = 3λ (b̄) + ρ λ , sup0 |ρ λ | = o(γ2 ),

where 3λ (b̄) is defined by formula (2.86).

We will need this lemma in estimating Cγ from below. We now use the corollary of
Lemma 2.26, which is formulated below.

Corollary 2.27. Let a sequence {b̄} = {(w̄, M ∗ )} be such that

w̄ = (x̄,ū), x̄ ∈ P W 1,1 (, Rd(x) ), ū ∈ L∞ (, Rd(u) ),



∞ +

∞ → 0,
M∗ = Mk∗ , Mk∗ → tk (k ∈ I ∗ ),
x̄˙ − fw w̄
1 ≤ O(γ2 ).

Let a sequence {b} = {(δw, M ∗ )} be such that δw = (δx, δu), δx = x̄ − δx , and δu = ū,
∗  
where δx = x̄ ∗ − δx ∗ = x̄k∗ − δxk∗ , and x̄k∗ and δxk∗ are defined by formulas (2.77).
Then
{δw} ∈ 0 , {δp} = {p̄}, γ2 (b) = γ2 (b̄),

∗ ∗
δ ẋ − fw δw − [f ]k (χk− − χk+ ) ≤ O(γ2 ),
k 1
2λ (b) = 3λ (b̄) + ρ λ , sup0 |ρ λ | = o(γ2 ).

Proof. Indeed, by the condition of Corollary 2.27, it follows that x̄˙ = fx x̄ + fu ū + r̃,
∗ ∗


1 ≤ O(γ2 ). Then x̄˙ = fx x̄ + fu ū − fx δx + r, where r = r̃ + fx δx , and, moreover,

r
1 ≤

1 +
fx δx
1 ≤ O(γ2 ). We obtain from this that

∗ ∗
δ ẋ = fx δx + fu δu + [f ]k (χk− − χk+ ) + r.
k
 ∗ −χ ∗ )
≤ O(γ ). The other assertions of Corol-
Consequently,
δ ẋ −fw δw − k [f ]k (χk− k+ 1 2
lary 2.27 follow from Lemma 2.26 directly.
2.4. Estimation of the Basic Constant from Above 65

Denote by S 2 the set of sequences {b̄} = {(w̄, M ∗ )} such that

w̄ = (x̄, ū), ū ∈ L∞ (, Rd(u) ), x̄ ∈ P W 1,1 (, Rd(x) ),




∞ → 0,

∞ → 0,

M∗ = Mk∗ , Mk∗ → tk (k ∈ I ∗ ), ūχ ∗ = 0,
k
Fip p̄ ≤ O(γ2 ) (i ∈ I ), |Kp p̄| ≤ O(γ2 ),
x̄˙ − fw w̄
1 ≤ O(γ2 ),

∗ ∗
[H λ ]k (meas Mk− − meas Mk+ ) ≤ 0 ∀ λ ∈ 0 .
k

Then Lemma 2.25 and also Corollary 2.27 imply the following assertion.

Lemma 2.28. The following inequality holds for any sequences {b̄} ∈ S 2 and {δw } ∈ 
such that γ ≤ O(γ2 (b̄)):
t
3λ (b̄) + t0f δ H λ dt
lim max ≥ min{Cγ , Cγ }.
0 γ2 (b̄) + γ

2.4.6 Passing to the Quadratic Form 


We remove the condition ūχ ∗ = 0 from the definition of S 2 . We denote by S 3 the resulting
new set of sequences. Let us show that Lemma 2.28 remains valid under the replacement
of S 2 by S 3 . Indeed, assume that there is a sequence {b̄} = {(w̄, M∗ )} ∈ S 3 , where w̄ = (x̄, ū).
We set ū¯ = ū(1−χ ∗ ) = ū− ū∗ , where ū∗ = ūχ ∗ . Then
ū∗
∞ → 0 and
ū∗
1 = M∗ |ū| dt ≤

meas M∗

2 ≤ O(γ2 ). Consequently,
 tf
|ū∗ |2 dt ≤
ū∗

ū∗
1 = o(γ2 ),
t0
 tf  tf  tf  tf
λ ¯ ¯ λ ∗ ∗ λ ¯ ¯
Huu
λ
ū, ū dt = Huu ū, ū dt + Huu ū , ū dt = Huu ū, ū dt + ρ1λ ,
t0 t0 t0 t0
 tf  tf  tf  tf
Hux
λ
x̄, ū dt = Hux
λ ¯ dt +
x̄, ū Hux
λ
x̄, ū∗ dt = Hux
λ ¯ dt + ρ2λ ,
x̄, ū
t0 t0 t0 t0

¯ = {(x̄, ū,
where sup0 |ρiλ | = o(γ2 ), i = 1, 2. We set {b̄} ¯ M∗ )}. Then

¯ = γ (b̄) + o(γ ),
γ2 (b̄) ¯ = 3λ (b̄) + ρ λ ,
3λ (b̄) sup |ρ λ | = o(γ2 ). (2.87)
2 2
0

¯ ∈ S 2 . Indeed,
Moreover, it is easy to see that {b̄}
¯ 1

x̄˙ − fx x̄ − fu ū
=
x̄˙ − fx x̄ − fu ū − fu ū∗
1

x̄˙ − fw w̄
1 +
fu ū∗
1 ≤ O(γ2 ).

¯ belonging to S 2 obviously hold. Conditions (2.87) and {b̄}


The other conditions of {b̄} ¯ ∈ S2
imply that Lemma 2.28 remains valid under the replacement of S 2 by S 3 .
66 Chapter 2. Quadratic Conditions in the Calculus of Variations

Further, we narrow the set of sequences S 3 up to the set S 4 by adding the following
conditions to the definition of the set S 3 :
(i) Each set Mk∗ is a segment adjusting to tk , i.e., Mk∗ = [tk − ε, tk ] or Mk∗ = [tk , tk + ε],
where ε → +0. In this case (see formula (2.13)),

2 |δtk | dt = ξk2 , where ξk = meas Mk− ∗ − meas M ∗
k+
Mk∗
  tf
γ2 (b̄) = ξk2 + |x̄(t0 )|2 + |ū|2 dt = γ̄ ;
t0
1 1 
3λ (b̄) =  (λ, w̄) + D k (H λ )ξk2 + [Hxλ ]k x̄av
k
ξk =: λ .
2 2
k

(ii) Also, the following relations hold:

Fip p̄ ≤ 0, i ∈ I , Kp p̄ = 0, x̄˙ = fw w̄.


∗ −meas M ∗
We note that for sequences from S 4 , each of the quantities ξk = meas Mk− k+
uniquely defines the set Mk := [tk − ξk , tk + ξk ]. Here, ξk = max{0, −ξk }, and ξk+ =
∗ − + −

max{0, ξk }. Moreover, we note that  and γ̄ depend on ξ̄ , x̄, and ū. Therefore, S 4 can be
identified with the set of sequences {z̄} = {(ξ̄ , w̄)} such that

ξ̄ ∈ Rs , w̄ = (x̄, ū), x̄ ∈ P W 1,1 (, Rd(x) ), ū ∈ L∞ (, Rd(u) ),


|ξ̄ | → 0,

∞ → 0,

∞ → 0,
Fip p̄ ≤ 0 (i ∈ I ), Kp p̄ = 0,
(2.88)
x̄˙ = fw w̄, [x̄]k = [f ]k ξ̄k (k ∈ I ∗ ),

[H λ ]k ξ̄k ≤ 0 ∀ λ ∈ 0 .
k

Therefore, the following assertion holds.

Lemma 2.29. The following inequality holds for any sequences {z̄} ∈ S 4 and {δw } ∈ 
such that γ̄ (z̄) ≤ O(γ ):
t
λ (z̄) + t0f δ H λ dt
lim max ≥ min{Cγ , Cγ }.
0 γ̄ (z̄) + γ

Now let us show that the condition k [H λ ]k ξ̄k ≤ 0 for all λ ∈ 0 holds automatically
for the elements of the critical cone K, and therefore, it is extra in the definition of the set of
sequences S 4 , i.e., it can be removed. We thus will prove that S 4 consists of the sequences
of elements of the critical cone K that satisfy the condition |ξ̄ | +

∞ +

∞ → 0.

2.4.7 Properties of Elements of the Critical Cone


Proposition 2.30. Let λ = (α0 , α, β, ψ) ∈ 0 , (ξ̄ , x̄, ū) ∈ K. Then the function ψ x̄ is constant
on each of the intervals of the set (t0 , tf )\ and hence is piecewise constant on [t0 , tf ].
2.4. Estimation of the Basic Constant from Above 67

Proof. The conditions x̄˙ = fw w̄, ψ̇ = −ψfx , and ψfu = 0 imply


d
0 = ψ(x̄˙ − fx x̄ − fu ū) = ψ x̄˙ − ψfx x̄ = ψ x̄˙ + ψ̇ x̄ = (ψ x̄).
dt
Moreover, ψ ∈ W 1,∞ (, Rd(x) ) and x̄ ∈ P W 1,2 (, Rd(x) ). Therefore, ψ x̄ = const on any
interval of the set (t0 , tf )\. The proposition is proved.

Proposition 2.31. Let λ = (α0 , α, β, ψ) ∈ 0 , and let (ξ̄ , x̄, ū) ∈ K. Then

[H λ ]k ξ̄k = lpλ p̄ ≤ 0.
k

Proof. By Proposition 2.30, dtd (ψ x̄) = 0 a.e. on [t0 , tf ]. Hence


 tf 
d t
0 = (ψ x̄) dt = ψ x̄ tf − [ψ x̄]k
t0 dt 0
 k
= ψ(tf )x̄(tf ) − ψ(t0 )x̄(t0 ) − ψ(tk )[x̄]k
 k 
= lxλf x̄(tf ) + lxλ0 x̄(t0 ) − ψ(tk )[f ]k ξ̄k = lpλ p̄ − [H λ ]k ξ̄k .
k k

We obtain from this that λ k
k [H ] ξ̄k = lpλ p̄. Moreover, the conditions
αi ≥ 0 ∀ i ∈ I , αi = 0 ∀ i ∈
/ I, Fip p̄ ≤ 0 ∀ i ∈ I, Kp p̄ = 0
imply lpλ p̄ ≤ 0. The proposition is proved.

Proposition 2.31 implies the following assertion.

Proposition 2.32. Let λ = (α0 , α, β, ψ) ∈ 0 be such that [H λ ]k = 0 for all k ∈ I ∗ . Let


z̄ = (ξ̄ , x̄, ū) ∈ K. Then α0 (Jp p̄) = 0, αi (Fip p̄) = 0, i = 1, . . . , d(F ).

z̄ ∈ K, ∗
 λ ∈ 0 , [H ] = 0 for all k ∈ I imply
Proof. By Proposition 2.31, the conditions λ k

lp p̄ = 0, where lp p̄ = α0 (Jp p̄) + αi (Fip p̄) + βj (Kjp p̄). This and the conditions
λ

α0 ≥ 0, Jp p̄ ≤ 0, αi ≥ 0, Fip p̄ ≤ 0 ∀ i ∈ IF (w 0 ),
αi = 0 ∀ i ∈
/ I := IF (w 0 ) ∪ {0}, Kp p̄ = 0
imply what was required. The proposition is proved.

In fact, Proposition 2.32 is equivalent to Proposition 2.2. Therefore, using Proposi-


tion 2.31, we have proved Proposition 2.2. Proposition 2.3 is proved analogously (we leave
this proof to the reader). Further, we use Proposition 2.31.
We set
Z() = Rs × P W 1,1 (, Rd(x) ) × L∞ (, Rd(u) ). (2.89)
Consider sequences of the form {ε z̄}, where ε → +0 and z̄ ∈ K ∩ Z() is a fixed element.
According to Proposition 2.31, such a sequence belongs to S 4 . Therefore, Lemma 2.29
implies the following assertion.
68 Chapter 2. Quadratic Conditions in the Calculus of Variations

Lemma 2.33. Let z̄ ∈ K ∩ Z(), ε → +0, {δw } ∈  , and γ ≤ O(ε 2 ). Then


t
ε2 λ (z̄) + t0f δ H λ dt
lim max ≥ min{Cγ , Cγ }.
0 ε2 γ̄ (z̄) + γ

Below, we set K ∩ Z() = KZ .

2.4.8 Cone C in the Space of Affine Functions on the Compact Set


co 0
To obtain the next upper estimate of the basic constant Cγ , in the estimate of Lemma 2.33,
we need to replace the set 0 by the set M co (C) of tuples satisfying the minimum principle
of “strictness C” (the meaning of this will be explained later) and, simultaneously, to
remove the sequence of needle-shaped variations. We will attain this using the method of
Milyutin, which has become a standard method [27, 86, 92, 95] for this stage of decoding
higher-order conditions in problems of the calculus of variations and optimal control. The
method of Milyutin is called the “cone technique.” The meaning of such a term will become
clear after its description. We now give several definitions.
Denote by L0 = Lin(co 0 ) the linear span of the set co 0 . Since 0 is a finite-
dimensional compact set, L0 is also of finite dimension. We denote by l(λ) an arbitrary
linear function l : L0 → R1 , and by a(λ) an arbitrary affine function a : L0 → R1 , i.e.,
a function of the form a(λ) = l(λ) + c, where c is a number. Since L0 is of finite dimension,
the affine functions a(λ) : L0 → R1 compose a finite-dimensional space, and each of these
functions is uniquely defined by its values on co 0 . In what follows, we consider affine
functions a(λ) on the compact set co 0 (of course they can be defined directly on co 0 ,
not passing to the linear span).
With each sequence {δw } ∈  , we associate the following sequence of linear func-
tions on co 0 : tf λ
t δ H dt
l(λ) = 0 ,
γ

where δ H λ = H λ (t, w 0 + δw ) − H λ (t, w0 ), γ = t0f  dt, and  = (t, u0 + δu ). It
t

follows from the definition of the sequence {δw } ∈  that γ > 0 on it, and, therefore, this
definition is correct. Let A0 be the set of all limit points of the sequences {l(λ)} obtained by
the above method from all sequences {δw } ∈  . Clearly, A0 is a closed subset in the finite-
dimensional space of linear functions l(λ) : co 0 → R1 . We note that each convergent
sequence {l(λ)} converges to its limit uniformly on co 0 . Further, to each number C, we
associate the set of affine functions AC obtained from A0 by means of the shift by (−C):

AC = {a(λ) = l(λ) − C  l(· ) ∈ A0 }.
Denote by C := con AC the cone spanned by the set AC .
Fix an arbitrary element z̄ ∈ KZ , z̄  = 0 (recall that KZ := K ∩ Z()).

Proposition 2.34. Let a number C be such that


t
ε 2 λ (z̄) + t0f δ H λ dt
lim max ≥C (2.90)
0 ε2 γ̄ (z̄) + γ
2.4. Estimation of the Basic Constant from Above 69

for any pair of sequences {ε}, {δw } such that ε → +0, {δw } ∈  , γ ≤ O(ε 2 ). Then

inf max{λ (z̄) + a(λ)} ≥ C γ̄ (z̄). (2.91)


a∈C co 0

Proof. Let a(· ) ∈ C , i.e., a(λ) = ρ(l(λ) − C), where ρ > 0, l(· ) ∈ A0 . The latter means
that there exists {δw } ∈  such that
tf λ
t0 δ H dt
→ l(λ) ∀ λ ∈ co 0 . (2.92)
γ

To the sequence {δw }, we write the sequence {ε} = {ε(δw )} of positive numbers such that
 tf
γ
ε 2 = , where γ = (t, u0 + δu ) dt. (2.93)
ρ t0

Then γ = O(ε2 ), and, therefore, inequality (2.90) holds. This inequality and conditions
(2.92) and (2.93) easily imply
λ (z̄)
ρ + l(λ)
max γ̄ (z̄)
≥ C.
ρ +1
0

Clearly, the maximum over 0 in this inequality can be replaced by the maximum over
co 0 , since λ and l(λ) are linear in λ. Therefore, multiplying the inequality by γ̄ (z̄) + ρ,
we obtain
max(λ (z̄) + ρl(λ)) ≥ C(γ̄ (z̄)) + ρ).
co 0

Since ρ(l(λ) − C) = a(λ), this implies

max(λ (z̄) + a(λ)) ≥ C γ̄ (z̄). (2.94)


co 0

It remains to recall that a(· ) is an arbitrary element of C , and hence (2.94) implies (2.91).
The proposition is proved.

Lemma 2.33 and Proposition 2.34 imply the following assertion.

Lemma 2.35. Let Cγ > −∞. Then for any C ≤ min{Cγ , Cγ } and for any z̄ ∈ KZ , we
have the inequality
inf max {λ (z̄) + a(λ)} ≥ C γ̄ (z̄). (2.95)
a∈C λ∈co 0

In what follows, we will need the convexity property of the cone C . It is implied by
the following assertion.

Proposition 2.36. The set A0 is convex.

Proof. Let l1 (· ) ∈ A0 , l2 (· ) ∈ A0 , p > 0, q > 0, p + q = 1. It is required to show that


l (· ) = pl1 (· ) + ql2 (· ) ∈ A0 . Let {δwi } ∈  , i = 1, 2, be two sequences of needle-shaped
70 Chapter 2. Quadratic Conditions in the Calculus of Variations

variations such that tf


t0 δi H λ dt
→ li (λ), i = 1, 2,
γi
t
where δi H λ and γi = t0f i dt correspond to the sequences {δwi }, i = 1, 2. Using the
sequences {δwi }, i = 1, 2, we construct a sequence {δw } ∈  such that
tf λ
t0 δ H dt
→ l(λ). (2.96)
γ
t
Thus, the convexity of A0 will be proved. For brevity, we set t0f δi H λ dt = ξi (λ), i = 1, 2.
Then
ξi (λ)
→ li (λ), i = 1, 2.
γi
Moreover, we set
pγ2 qγ1
= α , = β .
qγ1 + pγ2
qγ1 + pγ2

Then α > 0, β > 0, α + β = 1, and


α γ1 β γ2
= p, = q.
α γ1 + β γ2 α γ1 + β γ2
Consequently,
α ξ1 (λ) + β ξ2 (λ) ξ1 (λ) ξ2 (λ)
= p + q → pl1 (λ) + ql2 (λ) = l(λ).
α γ1 + β γ2 γ1 γ2
Therefore, tf tf λ
α λ
t0 δ1 H dt + (1 − α ) t0 δ2 H dt
t t → l(λ). (2.97)
α t0f 1 dt + (1 − α ) t0f 2 dt
Assume now that there is a sequence of functions {α(t)} in L∞ , each member of
which satisfies the following conditions:
(i) α(t) tassumes two values, 0 or 1, only; t t
(ii) α t0f δ1 H λ dt = t0f α(t)δ1 H λ dt, α t0f δ2 H λ dt = t0f α(t)δ2 H λ dt for all λ ∈ co 0 ;
t
t t t t
(iii) α t0f 1 dt = t0f α(t)1 dt and α t0f 2 dt = t0f α(t)2 dt.
We note that conditions (ii) hold for all elements of λ ∈ co 0 whenever they hold for finitely
many linearly independent elements of co 0 that compose a basis in L0 = Lin(co 0 ).
Therefore, in conditions (ii) and (iii) we in essence speak about the preservation of finitely
many integrals.
We set δw = α(t)δw1 + (1 − α(t))δw2 . Then, obviously, {δw } ∈  , and, moreover,
 tf  tf  tf
λ λ
α δ1 H dt + (1 − α ) δ2 H dt = α(t)δ1 H λ + (1 − α(t))δ2 H λ dt
t0 t0 t0
 tf
= δ H λ dt ∀ λ ∈ co 0 ,
t0
2.4. Estimation of the Basic Constant from Above 71

where δ H λ corresponds to the sequence {δw }. Analogously,


 tf  tf  tf
α 1 dt + (1 − α ) 2 dt =  dt,
t0 t0 t0

where  = (δw , t). This and (2.97) imply (2.96). Therefore, the convexity of A0 will be
proved if we ensure the existence of a sequence {α(t)} satisfying conditions (ii) and (iii).
The existence of such a sequence is implied by the Blackwell lemma, which is well known
in optimal control theory and is contiguous to the Lyapunov theorem on the convexity of
the range of a vector-valued measure. However, we note that we need not satisfy conditions
(ii) and (iii) exactly: it suffice to do this with an arbitrary accuracy; i.e., given an arbitrary
sequence ε → +0 in advance, we need to ensure the fulfillment of conditions (ii) and (iii)
with accuracy up to ε for each serial number of the sequence. Also, in this case, condition
(2.96) certainly holds. With such a weakening of conditions (ii) and (iii), we can refer to
Theorem 16.1 in [79, Part 2]. The proposition is proved.

2.4.9 Narrowing of the Set co 0 up to the Set M co (C )


In what follows, we will deal with the transformation of the left-hand side of inequality
(2.95). For this purpose, we need the following abstract assertion.

Lemma 2.37. Let X be a Banach space, F : X → R1 be a sublinear (i.e., convex and


positively homogeneous) functional, K ⊂ X be a nonempty convex cone, and x0 ∈ X be a
fixed point. Then the following formula holds:
inf F (x0 + x) = sup x ∗ , x0 ,
x∈K x ∗ ∈∂F ∩K ∗

where ∂F = {x ∗ ∈ X∗ | x ∗ , x ≤ F (x) for all x ∈ X} is the set of support functionals of F


and K ∗ = {x ∗ ∈ X∗ | x ∗ , x ≥ 0 for all x ∈ K} is the dual cone of K.

We use this lemma in order to transform the expression


inf max {λ (z̄) + a(λ)}
a∈C λ∈co 0

in the left-hand side of inequality (2.91). Denote by A the set of all affine functions
a(λ) : co 0 → R1 . As was already noted, A is a finite-dimensional space. In this space,
we consider the sublinear functional
F : a(· ) ∈ A → max a(λ).
λ∈co 0

Since co 0 is a convex compact set, we can identify it with the set of support functionals
of F ; more precisely, there is a one-to-one correspondence between each support functional
a∗ ∈ ∂F and the element λ ∈ co 0 such that a∗ , a = a(λ) for all a ∈ A. Moreover,
according to this formula, a certain support functional a∗ ∈ ∂F corresponds to every element
λ ∈ co 0 .
Further, let C be a number such that the cone C defined above is nonempty. Then
what was said above implies that ∂F ∩ ∗C can be identified with the set
def 

M(C ; co 0 ) = λ ∈ co 0  a(λ) ≥ 0 ∀ a ∈ C .
72 Chapter 2. Quadratic Conditions in the Calculus of Variations

By Lemma 2.37, we obtain from this that


inf max {λ (z̄) + a(λ)} = max λ (z̄). (2.98)
a∈C λ∈co 0 λ∈M(C ;co 0 )

It suffices to make more precise what the set M(C ; co 0 ) means.


For an arbitrary C, denote by M co (C) the set of λ ∈ co 0 such that
H (t, x 0 (t), u, ψ(t)) − H (t, x 0 (t), u0 (t), ψ(t)) ≥ C(t, u) (2.99)

if t ∈ [t0 , tf ] \ , u ∈ U(t, x (t)), where U(t, x) = {u ∈ R
0 d(u)  (t, x, u) ∈ Q}. Namely in this
case, we say that the minimum principle “of strictness C” holds for λ. For a positive C
and λ ∈ 0 it is a strengthening of the usual minimum principle. Also, we note that the
set M co (C) for C = 0 coincides with the set M0co defined in the same way as the set M0
from Section 2.1.4, with the only difference being that in the definition of the latter, it is
necessary to replace the set 0 by its convex hull co 0 .

Proposition 2.38. For any real C, we have


M(C ; co 0 ) ⊂ M co (C ). (2.100)

Proof. Let C ∈ R. Let λ̂ ∈ M(C ; co 0 ), i.e., λ̂ ∈ co 0 and a(λ̂) ≥ 0 for all a ∈ C .


Hence l(λ̂) ≥ C for all l ∈ A0 . Using this inequality, we show that λ̂ ∈ M co (C). Fix an
arbitrary point t and a vector u such that
t ∈ (t0 , tf ) \ , u ∈ Rd(u) , (t , x 0 (t ), u ) ∈ Q, u  = u0 (t ). (2.101)
Let ε > 0. Define the needle-shaped variation

u − u0 (t), t ∈ [t − ε, t + ε],
δu (t) =
0 otherwise.
For ε → +0, we have the sequence {δw } ∈  , where δw = (0, δu ). For each λ ∈ co 0 ,
there exists the limit tf λ 
t0 δ H dt δ H λ 
lim tf = 
ε→+0  dt  
t0 t=t
for this sequence. According to the definition of the set A0 , this limit is l(λ), where l(· ) ∈ A0 .
Since l(λ̂) ≥ C, we have

δ H λ̂ 
 ≥ C.
 
t=t

In other words, for λ = λ̂, inequality (2.99) holds for arbitrary u and t satisfying conditions
(2.101). This implies λ̂ ∈ M co (C). The proposition is proved.

(In fact, we have the relation M(C ; co 0 ) = M co (C), but we need only the inclu-
sion for decoding the constant Cγ .) We obtain from relation (2.98) and inclusion (2.100)
that
inf max {λ (z̄) + a(λ)} ≤ max λ (z̄).
a∈C λ∈co 0 λ∈M co (C)

This and Lemma 2.35 imply the following assertion.


2.4. Estimation of the Basic Constant from Above 73

Lemma 2.39. Let min{Cγ , Cγ } ≥ C > −∞. Then

max λ (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ KZ . (2.102)


λ∈M co (C)

The distinguishing of the cone C by using the set  of sequences of needle-shaped


variations and the “narrowing” of the set co 0 up to the set M co (C) by using formula
(2.98) referred to the duality theory represents Milyutin’s method which is called the “cone
technique” bearing in mind the cone C of affine functions on the convex compact set
co 0 . Also, we note that the use of this method necessarily leads to the convexification of
the compact set 0 in all the corresponding formulas.

2.4.10 Closure of the Cone KZ in the Space Z2 (


)
It remains to prove that Lemma 2.39 remains valid if we replace KZ by K in it. This is
implied by the following assertion.

Proposition 2.40. The closure of the cone KZ in the space Z2 () coincides with the
cone K.

The proof of this proposition uses the Hoffman lemma on the estimation of the distance
to the solution set of a system of linear inequalities, i.e., Lemma 1.12. More precisely, we
use the following consequence of the Hoffman lemma.

Lemma 2.41. Let X and Y be two Banach spaces, let li : X → R1 , i = 1, . . . , k, be linear


functionals on X, and let A : X → Y be a linear operator with closed range. Then there
exists a constant N = N(l1 , . . . , lk , A) > 0 such that for any point x0 ∈ X, there exists x̄ ∈ X
such that
li , x0 + x̄ ≤ 0, i = 1, . . . , k; A(x0 + x̄) = 0, (2.103)
 k 

+


≤ N li , x0 +
Ax0
. (2.104)
i=1

Indeed, system (2.103) is compatible, since it admits the solution x̄ = −x0 , and then
by Lemma 1.12, there exists a solution satisfying estimate (2.104) (clearly, the surjectivity
condition for the operator A : X → Y in Lemma 1.12 can be replaced by the closedness of
the range of this operator considering the image AX as Y ).

Proof of Proposition 2.40. Since KZ ⊂ K and K is closed in Z2 (), it suffices to only


show that K ⊂ [KZ ]2 , where [· ]2 is the closure in Z2 (). Let z̄ = (ξ̄ , x̄, ū) ∈ K. We show
that there exists a sequence in KZ that converges to z̄ in Z2 (). Take a sequence N → ∞.
For each member of this sequence, we set

ū(t), |ū(t)| ≥ N ,
ū (t) =
N
0, |ū(t)| < N .
t
The sequence {ūN } satisfies the condition t0f ūN , ūN dt → 0. Let x̄ N be defined from the
conditions x̄˙ N = fx x̄ N + fu ūN , x̄ N (t0 ) = 0, x̄ N ∈ W 1,2 (, Rd(x) ), where W 1,2 (, Rd(x) )
74 Chapter 2. Quadratic Conditions in the Calculus of Variations

is the space of absolutely continuous functions x(t) : [t0 , tf ] → Rd(x) having the Lebesgue
square integrable derivatives; it is endowed with the norm
 tf 1/2

x
1,2 = x(t0 ), x(t0 ) + ẋ, ẋ dt .
t0

Then
x̄ N
1,2 → 0, and hence
x̄ N
C → 0. We set z̄N = (0, x̄ N , ūN ) and z̄N = z̄ − z̄N .
Then
z − z̄N
Z2 () =
z̄N
Z2 () → 0. The conditions z̄ ∈ K and
x̄ N
C → 0 imply

(Fip p̄N )+ + |Kp p̄N | → 0, (2.105)
i∈I

where p̄N corresponds to the sequence z̄N . Moreover, {z̄N } belongs to the subspace T2 ⊂
Z2 () defined by the conditions

x̄˙ = fx x̄ + fu ū, [x̄]k = [f ]k ξ̄k , k = 1, . . . , s.

Applying Lemma 2.41 on T2 , we obtain that for a sequence {z̄N } in T2 satisfying condition
} in K such that
z̄ − z̄

(2.105), there exists a sequence {z̄N Z N N Z2 () → 0. But the following


condition also holds for {z̄N }:
z̄ − z̄

N Z2 () → 0. Therefore, z̄ ∈ [KZ ]2 . Since z̄ is an


arbitrary element from K, we have K ⊂ [KZ ]2 , and then K = [KZ ]2 . The proposition is
proved.

Lemma 2.39 and Proposition 2.40 imply the following assertion.

Lemma 2.42. Let min{Cγ , Cγ } ≥ C > −∞. Then

max λ (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K. (2.106)


λ∈M co (C)

We now recall that Cγ ≥ 0 is a necessary condition for the Pontryagin minimum.


Assume that it holds. Then, setting C = 0 in (2.106), we obtain the following result.

Theorem 2.43. Let w 0 = (x 0 , u0 ) be a Pontryagin minimum point. Then

max λ (z̄) ≥ 0 ∀ z̄ ∈ K.
λ∈M0co

Therefore, we have obtained the quadratic necessary condition for the Pontryagin
minimum, which is slightly weaker than Condition A of Theorem 2.4. It is called Condition
Aco . Using Condition Aco , we will show below that Condition A is also necessary for
the Pontryagin minimum. We thus will complete the proof of Theorem 2.4. Section 2.6
is devoted to this purpose. But first we complete (in Section 2.5) the decoding of the
constant Cγ .
Denote by CK the least upper bound of C such that M co (C) is nonempty and
condition (2.106) holds. Then Lemma 2.42 implies

CK ≥ min{Cγ , Cγ }, (2.107)

i.e., the constant CK estimates the constant Cγ from above with accuracy up to a constant
multiplier. We will prove that the constant CK estimates the constant Cγ from below with
2.5. Estimation of the Basic Constant from Below 75

accuracy up to a constant multiplier. This will allow us to obtain a sufficient condition for
the Pontryagin minimum.

Remark 2.44. Lemma 2.42 also implies the following assertion: if Cγ ≥ 0, then M0co is
nonempty and
maxco λ (z̄) ≥ 0 ∀ z̄ ∈ K.
λ∈M0

This assertion serves as an important supplement to inequality (2.107).

2.5 Estimation of the Basic Constant from Below


2.5.1 Method for Obtaining Sufficient Condition for a Pontryagin
Minimum
Since Cγ > 0 is a sufficient condition for the Pontryagin minimum, the positivity requirement
for any quantity that estimates Cγ from below is also a sufficient condition. Therefore, the
second stage of decoding, the estimation of Cγ from below, can also be considered as a
method for obtaining a sufficient condition for the Pontryagin minimum. As was already
noted, the sufficiency of the condition Cγ > 0 for the Pontryagin minimum is a sufficiently
elementary fact not requiring the constructions of Chapter 1 for its proof. Therefore a source
for obtaining a sufficient condition for the Pontryagin minimum is very simple, which is
characteristic for sufficient conditions in general. However, in contrast to other sources
and methods for obtaining sufficient conditions, which often are of arbitrary character, in
this case, we are familiar with connection of the sufficient condition being used with the
necessary condition; it consists of the passage from the strict inequality Cγ > 0 to the
nonstrict inequality. This fact is already not so obvious, it is nontrivial, and it is guaranteed
by the abstract scheme.

2.5.2 Extension of the Set σ γ


For convenience, we recall the following main definitions:
0
Cγ = inf lim , 0 (δw) = max (λ, δw) = max (λ, δw),
σ γ γ λ∈0 λ∈co 0
 tf
(λ, δw) = δl λ − (ψδ ẋ − δH λ ) dt, σ γ = {{δw} ∈  | σ ≤ O(γ )},
t0

σ = max{Fi (p0 + δp) (i ∈ I ), |δK|,


δ ẋ − δf
1 }.
Let M ⊂ co 0 be an arbitrary nonempty compact set. Then we have the following for any
variation δw ∈ δW :
def
0 (δw) = max (λ, δw) ≥ max (λ, δw) = M (δw).
co 0 M

Consequently,
0 M
Cγ := inf lim ≥ inf lim . (2.108)
σ γ γ σ γ γ
76 Chapter 2. Quadratic Conditions in the Calculus of Variations

We set

o(√γ ) = {{δw} ∈  | σ = o( γ )}.

Since σ γ ⊂ o(√γ ) , we have

M M
inf lim ≥ inf√ lim . (2.109)
σ γ γ o( γ ) γ

Inequalities (2.108) and (2.109) imply


M
Cγ ≥ inf√ lim . (2.110)
o( γ) γ

Let C ∈ R1 be such that M co (C) is nonempty. We set


C
C = max (λ, δw), Cγ C ; o(√γ ) = inf√ lim .
coM (C) o( γ ) γ

Then (2.110) implies the following assertion.

Lemma 2.45. The following inequality holds for an arbitrary C such that
M co (C) is nonempty:
Cγ ≥ Cγ (C ; o(√γ ) ). (2.111)

In what follows, we will fix an arbitrary C such that the set M co (C) is nonempty.

2.5.3 Passing to Local Sequences


We set √
o( γ ) = {{δw} ∈ 
loc | σ = o( γ )}.
√ loc
(2.112)
In other words,

o( γ ) = o( γ ) ∩  .
loc
√ loc

Our further goal consists of passing from the constant Cγ (C , o(√γ ) ) defined by the
set of sequences o(√γ ) to a constant defined by the set of sequences loc

o( γ ) . For such a
passage, we need the estimate

|C | ≤ O(γ ) | loc , (2.113)

i.e., |C (δw)| ≤ O(γ (δw)) for any sequence {δw} ∈ loc . To prove estimate (2.113), we
need a certain property of the set M co (C) analogous to one of the Weierstrass–Erdmann
conditions of the classical calculus of variations. Let us formulate this analogue.

Proposition 2.46. The following conditions hold for any λ ∈ M co (C):

[H λ ]k = 0 ∀ tk ∈ .
2.5. Estimation of the Basic Constant from Below 77

Proof. Fix arbitrary λ ∈ M co (C) and tk ∈ . We set tε = tk − ε, ε > 0, and

δε H λ = H (tε , x 0 (tε ), u0 (tε ) + [u0 ]k , ψ(tε )) − H (tε , x 0 (tε ), u0 (tε ), ψ(tε )).

Then for a small ε > 0, the condition λ ∈ M co (C) implies δε H λ ≥ Cδε , where δε  =
(tε , u0 (tε ) + [u0 ]k ) − (tε , u0 (tε )) = (tε , u0 (tε ) + [u0 ]k ). Taking into account that
δε H λ → [H λ ]k and δε  → 0 as ε → +0, we obtain [H λ ]k ≥ 0. Constructing an analo-
gous sequence t ε = tk + ε to the right from the point tk , we obtain −[H λ ]k ≥ 0. Therefore,
[H λ ]k = 0. The proposition is proved.

In Section 2.1.5, we have defined 0 as the set of tuples λ ∈ 0 such that [H ] = 0
λ k

for all tk ∈ . Proposition 2.46 means that M (C) ⊂ co 0 . This and Proposition 2.16
co 

imply estimate (2.113). By the way, we note that under the replacement of M co (C) by
co 0 , estimate (2.113) does not hold in general.
Recall that by u0 we have denoted the closure of the graph of the function u0 (t)
assuming that u0 (t) is left continuous. By Qtu we have denoted the projection of the set
Q under the mapping (t, x, u)  → (t, u). Denote by {V } an arbitrary sequence of neighbor-
hoods of the compact set u0 contained in Qtu such that V → u0 . The latter means that for
any neighborhood V ⊂ Qtu , of the compact set u0 , there exists a number starting from
which V ⊂ V .
Let {δw} ∈  be an arbitrary sequence. For members δw = (δx, δu) and V of the
sequences {δw} and {V }, respectively, which have the same numbers,
we set 
δu(t) if (t, u0 (t) + δu(t)) ∈ V ,
δuV (t) =
0 otherwise,

δuV = δu(t) − δuV (t), δw loc = (δx, δuV ), δw V = (0, δuV ).

Then {δw} = {δwloc } + {δw V }, {δw loc } ∈ loc , and {δw V } ∈ .

Proposition 2.47. The following relation holds for the above representation of the se-
quence {δw}:
δf = δ loc f + δ Vf + rfV ,

where

δf = f (t, w 0 + δw) − f (t, w 0 ), δ locf = f (t, w 0 + δwloc ) − f (t, w0 )


δ Vf = f (t, w 0 + δw V ) − f (t, w0 ), rfV = (δ̄x f − δx f )χ V ;

the function χ V is the characteristic function of the set M V = {t | δuV  = 0}; and

δx f = f (t, x 0 + δx, u0 ) − f (t, x 0 , u0 ),


δ̄x f = f (t, x 0 + δx, u0 + δu) − f (t, x 0 , u0 + δu).

def
Moreover,
rfV
∞ ≤
δ̄x f
∞ +
δx f
∞ = εf → 0 and
rfV
1 ≤ εf meas M V .
78 Chapter 2. Quadratic Conditions in the Calculus of Variations

Proof. We have
δf = δf (1 − χ V ) + δf χ V
= δ loc f (1 − χ V ) + (f (t, x 0 + δx, u0 + δu) − f (t, x 0 , u0 + δu)
+ f (t, x 0 , u0 + δu) − f (t, x 0 , u0 ))χ V
= δ loc f − δ loc f χ V + δ̄x f χ V + δ Vf
= δ loc f + δ Vf + (δ̄x f − δx f )χ V
= δ loc f + δ Vf + rfV .

The estimates for rfV are obvious. The proposition is proved.

Also, it is obvious that the representation γ = γ loc + γ V corresponds to the repre-


sentation {δw} = {δw loc } + {δwV }, where γ = γ (δw), γ loc = γ (δw loc ), and γ V = γ (δwV ).
This is implied by the relation (t, u0 + δu) = (t, u0 + δuV ) + (t, u0 + δuV ).
Now let us define a special sequence {V } = {V (ε)} of neighborhoods of the compact
0
set u that converges to this compact set. Namely, for each ε > 0, we set

V = V (ε) = Oε (t, u),
(u,t) ∈ u0

where Oε (t, u) = {(t , u ) | |t − t | < ε 2 , |u − u | < ε }.


Fix a sequence {δw} ∈  and define a sequence {ε} for it such that ε → +0 suffi-
ciently slowly. The following proposition explains what “sufficiently slowly” means.

Proposition 2.48. Let ε → +0 so that



εf γ
→ 0 and → 0,
ε2 ε 2

where εf is the same as in Proposition 2.47. Then


rfV
1 ≤ εf meas MV = o(γ V ) and

meas MV = o( γ V ).

Proof. Let there exist the sequences {δw}, {ε}, and {V } = {V (ε)} defined as above. Since
V → u0 , we have that, starting from a certain number, V ⊂ V, where V ⊂ Qtu is the
neighborhood of the compact set u0 from the definition of the function (t, u) given in
Section 2.3. Condition V ⊂ V implies (δuV )V = δuV (the meaning of this notation is
the same as above). Hence δuV = δuVV + δuV , where (t, u0 + δuVV ) ∈ V \ V on MVV =
{t | δuVV  = 0} and (t, u0 + δuV ) ∈
/ V on M V = {t | δuV  = 0}. The definitions of V, , and
V = V (ε) imply (t, u + δuV ) ≥ ε 2 on MV
0 V V . This implies γ V ≥ ε 2 meas M V , where γ V :=
V V V
tf V
0 + δuV ) dt. Since γ V ≤ γ V , we have meas M V ≤ γ . Further, it follows from the
(t, u V V V
t0 ε 2
t
definitions of  and δuV that meas M V ≤ const γ V , where γ V := t0f (t, u0 + δuV ) dt, and
const depends on the entire sequence {δuV } but not on its specific member. Since γ V ≤ γ V
(because MV ⊂ MV ), we have meas M V ≤ const γ V . Therefore,
 
γ V 1
meas M V = meas MV V
+ meas MV ≤ 2 + const γ V = + const γ V . (2.114)
ε ε2
2.5. Estimation of the Basic Constant from Below 79

Taking into account that εf /ε 2 → 0, we obtain from (2.114) that εf meas MV = o(γ V ).

Moreover, since γ /ε 2 → 0 and γ V ≤ γ , (2.114) also implies
 ! ! ! 
γV
meas M ≤ V
+ const γ V γ V =o γV .
ε2

The proposition is proved.

In what follows, for brevity we set


δ = (λ, δw),
λ
δ loc λ = (λ, δw loc ), δC = C (δw), δ loc C = C (δwloc ).
Assume that there is an arbitrary sequence {δw} ∈ . We define a sequence {ε} for
it such that the conditions of Proposition 2.48 hold. Also, we define the corresponding
sequences {V } = {V (ε)}, {δwloc }, and {δw V }. Then
 tf  tf
δ := δl −
λ λ
ψδ ẋ dt + δH λ dt
t t
 0tf  0tf
= δl λ − ψδ ẋ dt + ψ(δ loc f + δ Vf + rfV ) dt
t0 t0
 tf  tf  tf  tf
= δl −λ
ψδ ẋ dt + δ H dt +
loc λ
δ H dt +
V λ
ψrfV dt
t0 t0 t0 t0
 tf
= δ  +
loc λ
δ V H λ dt + ρ V λ ,
t0
tf
where ρV λ = t0 ψrf
V dt. Since supco 0
ψ
∞ < +∞ and
rfV
1 = o(γ V ), we have

sup |ρ V λ | = o(γ V ). (2.115)


co 0

Finally, we obtain  tf
δλ = δ loc λ + δ V H λ dt + ρ V λ . (2.116)
t0

Moreover, the conditions δf = δ loc f + δ Vf + rfV ,
δ Vf
1 ≤ const meas MV = o γV ,
and
rfV
1 = o(γ V ) imply
!

δ ẋ − δ loc f
1 ≤
δ ẋ − δf
1 + o γV . (2.117)

Therefore, the following assertion holds.

Proposition 2.49. Let {δw} ∈  be an arbitrary sequence, and let a sequence {ε} satisfy
the conditions of Proposition 2.48. Then conditions (2.115)–(2.117) hold for the corre-
sponding sequences {V (ε)}, {δwloc }, and {δwV }.

We now are ready to pass to local sequences. We set


C
Cγ C ; loco( γ ) = inf
√ lim .
loc

o( γ ) γ

We have the following assertion.


80 Chapter 2. Quadratic Conditions in the Calculus of Variations

Lemma 2.50. The following inequality holds:


 
Cγ C ; o(√γ ) ≥ min C, Cγ C ; loc

o( γ ) . (2.118)

Proof. Let {δw} ∈ o(√γ ) be a sequence with γ > 0. Choose the sequence {ε} and the
corresponding sequence {V } as in Proposition 2.48. With the sequences {δw} and {V },
we associate the splitting {δw} = {δwloc } + {δwV }. Let us turn to formula (2.116). The
definition of the set M co (C) implies

δ V H λ ≥ C(t, u0 + δuV ) ∀ λ ∈ M co (C)

for all sufficiently large numbers. Then (2.115) and (2.116) imply

δC ≥ δ loc C + Cγ V + o(γ V ), (2.119)

where γ V = γ (δw V ). We set γ = γ (δw) and γ loc = γ (δwloc ). Then γ = γ loc + γ V . The
following two cases are possible:
loc
Case (a): lim γγ = 0.
loc
Case (b): lim γγ > 0.
Let us consider each of them.
loc
Case (a): Let lim γγ = 0. Choose subsequences such that γ loc /γ → 0 for them, and,
therefore,
γV
γ loc = o(γ ) and → 1. (2.120)
γ
We preserve the above notation for the subsequences. It follows from (2.119), (2.120), and
estimate (2.113) that δC ≥ Cγ + o(γ ). Hence
δC
lim ≥ C,
γ
where the lower limit is taken over the chosen subsequence.
Case (b): Now let limγ loc /γ > 0, and, therefore, γ ≤ O(γ loc ) and γ V ≤ O(γ

loc ).

Since
δ ẋ − δf
1 = o( γ ), it follows from (2.117) that
δ ẋ − δ loc f
1 = o( γ loc ).
Moreover,
!  !
√ √
δ loc Fi = δFi ≤ o( γ ) = o1 γ loc , i ∈ I ; |δ loc K| = |δK| = o( γ ) = o γ loc .

Hence {δw loc } ∈ loc



o( γ ) . Further, we obtain from (2.119) that
 
δC δ loc C + Cγ V γ loc δ loc C γ V
lim ≥ lim = lim + C
γ γ γ γ loc γ
   
δ loc C δ loc C
≥ lim min , C = min lim ,C .
γ loc γ loc
2.5. Estimation of the Basic Constant from Below 81

The second equation follows from the condition

γ loc γ V
+ = 1.
γ γ
Furthermore,
δ loc C
lim ≥ C γ  C ; loc√
o( γ ) ,
γ loc
since {δwloc } ∈ loc

o( γ ) . Hence

δC  
lim ≥ min Cγ C ; loc

o( γ ) , C .
γ
Therefore, we have shown that for every sequence {δw} ∈ o(√γ ) on which γ > 0, there
exists a subsequence such that lim δC /γ is not less than the right-hand side of inequality
(2.118). This implies inequality (2.118). The lemma is proved.

The method used in this section is very characteristic for decoding higher-order
conditions in optimal control in order to obtain sufficient conditions. Now this method is
related to the representation of the Pontryagin sequence as the sum {δw} = {δw loc } + {δw V }
and the use of the maximum principle of strictness C for {δwV }:
δ V H λ ≥ C(t, u0 + δuV ), λ ∈ M co (C).
In what follows, analogous consideration will be given to various consequences of the
minimum principle of strictness C, the Legendre conditions, and conditions related to
them. Lemmas 2.45 and 2.50 imply the following assertion.

Lemma 2.51. Let the set M co (C) be nonempty. Then


 
Cγ ≥ min Cγ C ; loc √
o( γ ) , C .

 
2.5.4 Simplifications in the Definition of Cγ C ; loc

o( γ )

o( γ ) consists of sequences {δw} ∈ 


By definition, loc
√ loc satisfying the conditions

√ √
δFi ≤ o( γ ) ∀ i ∈ I , |δK| = o( γ ), (2.121)


δ ẋ − δf
1 = o( γ ). (2.122)
Obviously, conditions (2.121) are equivalent to the conditions
√ √
Fip δp ≤ o( γ ) ∀ i ∈ I , |Kp δp| = o( γ ).
Let us consider condition (2.122). Assume that the sequence {δw} is represented in the
canonical form (see Proposition 2.9)

{δw} = {δw 0 } + {δw ∗ }, {δw 0 } = {(δx, δu0 )} ∈ 0 ,


{δw∗ } = {(0, δu∗ )} ∈ ∗ , |δu0 | · |δu∗ | = 0.
82 Chapter 2. Quadratic Conditions in the Calculus of Variations

By Proposition 2.10,
1
δf = fw δw 0 + fww δw0 , δw0 + δ ∗ f + δ ∗ fx δx + r̃,
2
t
where
r̃||1 = o(γ ) and γ :=
δx
2C + t0f |δu0 |2 dt. According to formula (2.50),
0 0

 
δ∗f = δk∗ f = ∗
[f ]k (χk− ∗
− χk+ ) + fuk+ δvk− + fuk− δvk+ + O(|δtk | + |δvk |2 ).

As was shown in Section 2.4 in proving the equivalence of conditions (2.66) and (2.69), we
have the estimates
  tf 
 

fww δw , δw
1 ≤ o(γ ), 
0 0 0
δ ∗ fx δx dt  ≤ o(γ ).
t0

Moreover,
 

fuk+ δvk−
1 ≤ |fuk+ | meas Mk−
δvk−
2 = o( γ ∗ ),
 

fuk− δvk+
1 ≤ |fuk− | meas Mk+
δvk+
2 = o( γ ∗ ),
 t t
where γ ∗ := k ( M∗ |δt| dt + t0f |δvk |2 dt) and
v
2 = ( t0f v(t), v(t) dt)1/2 is the norm
k
of the space L2 (, Rd(u) ) of Lebesgue square integrable functions v(t) : [t0 , tf ] → Rd(u) .
Therefore, condition (2.122) is equivalent to the condition
 √
∗ ∗
δ ẋ − fw δw 0 − [f ]k (χk− − χk+ ) = o( γ ).
1
Finally, we obtain
 √ √
loc
√ = {δw} ∈ loc | Fip δp ≤ o( γ ) ∀ i ∈ I ; |Kp δp| = o( γ );
o( γ )
 √ 
∗ ∗
δ ẋ − fw δw0 − [f ]k (χk− − χk+ ) = o( γ ) .
1

In this relation, we use the canonical representation of the sequence {δw} = {δw 0 } + {δw ∗ },
where {δw0 } ∈ 0 , {δw∗ } ∈ ∗ , and |δu0 | · |δu∗ | = 0.
By Proposition 2.15, in calculating Cγ (C ; loc √
o( γ ) ), we can use the function

1C (δw) := max


co
1λ (δw)
M (C)

(see definition (2.61) for 1λ ) instead of the function C = maxM co (C) (λ, δw), and
since by Proposition 2.46the conditions [H λ ]k = 0 for all tk ∈  hold for any λ ∈ M co (C),
we can omit the terms k [H λ ]k (meas Mk− ∗ − meas M ∗ ) in the definition of 1λ , thus
k+
passing to the function
1
˜ 1λ (δw)
 :=  (λ, δw 0 )
2 s   
 tf
∗ ∗
+ k
D (H ) λ
|δtk | dt + [Hxλ ]k δx(χk− − χk+ ) dt
Mk∗ t0
k=1
 
1 tf
+ Huu
λk+
δvk− , δvk− + Huu
λk−
δvk+ , δvk+ dt ,
2 t0
2.5. Estimation of the Basic Constant from Below 83

where  tf
 (λ, δw) = lpp
λ
δp, δp + Hww
λ
δw, δw dt.
t0
 
Therefore, the constant Cγ C ; loc

o( γ ) does not change, if we replace the function C
˜ 1 , where
by the function  C

˜ 1C (δw) = max 


 ˜ 1λ (δw).
co
M (C)

Finally, we recall that the functional γ has the following form on local sequences represented
in the canonical form:
 tf  
   tf
γ (δw) =
δx
2C + |δu0 |2 dt + 2 |δtk | dt + |δvk |2 dt
t0 k Mk∗ t0

or, in short, γ = γ 0 + γ ∗ , where


    
tf  tf
γ 0 =
δx
2C + |δu0 |2 dt, γ∗ = 2 |δtk | dt + |δvk |2 dt .
t0 k Mk∗ t0

We have proved that



C C ; loc
√ = Cγ ˜ 1C ; loc√ ,
 (2.123)
o( γ ) o( γ )

where
˜1

˜ 1C ; loc√
Cγ  o( γ ) = inf lim C .
loc

o( γ )
γ

˜ 1λ
We call attention to the fact that in the definition of this constant, the functions 

and γ are defined indeed on the set of triples (δw , M , δv) such that
0

δw0 = (δx, δu0 ) ∈ W , M∗ = ∪Mk∗ , Mk∗ = Mk− ∗ ∪ M∗ ,


k+
∗ ∗ ∗
Mk− ⊂ (tk − ε, tk ), Mk+ ⊂ (tk , tk + ε), k ∈ I , ε > 0;

δv = δvk , δvk = δvk− + δvk+ , k ∈ I ∗ ;
∗ ,
{t | δvk−  = 0} ⊂ Mk− ∗ ,
{t | δvk+  = 0} ⊂ Mk+ k ∈ I ∗.

Denote by a these triples, and denote by {a} an arbitrary sequences of triples a such that

{δw 0 } ∈ 0 ; Mk∗ → tk , k ∈ I ∗;
δv
∞ → 0.

To each sequence {δw} from loc represented in canonical form, we naturally associate the
sequence of triples {a} = {a(δw)} = {(δw 0 , M ∗ , δv)}. However, note that for an arbitrary
sequence {a}, we do not require the condition δu0 χ ∗ = 0, which holds for the sequences {a}
corresponding to the sequences {δw} ∈ loc in their canonical representations (as before,
χ ∗ is the characteristic function of the set M∗ ). Therefore, on the set of sequences {a}, we
84 Chapter 2. Quadratic Conditions in the Calculus of Variations

have defined the functions  ˜ 1λ = ˜ 1λ (a), λ ∈ M co (C) and γ = γ (a) = γ 0 + γ ∗ . We set


 √ √
S1 = {a} = {(δw0 , M ∗ , δv)} | Fip δp ≤ o( γ ) ∀ i ∈ I , |Kp δp| = o( γ ),

∗ ∗ √ 
δ ẋ − fw δw0 − [f ]k (χk− − χk+ ) = o( γ ) ,
1

where χk− ∗
and χk+ ∗ and M ∗ , respectively.
are the characteristic functions of the sets Mk− k+
Also, we set
˜1

˜ 1C (a) = max 
 ˜ 1λ (a), ˜ 1C ; S1 ) = inf lim
Cγ (  C
.
co
M (C) S1 γ
The following inequality holds:

Cγ  ˜ 1C ; loc√ ≥ Cγ ˜ 1C ; S1 .
 (2.124)
o( γ )

Indeed, the sequence {a} = {a(δw)} ∈ S1 corresponds to every sequence [δw} ∈ loc √
o( γ ) ,
˜ 1
and, moreover, the values of  and γ are preserved under this correspondence. There is
C
no converse correspondence, since we omit the condition δu0 χ ∗ = 0 in the definition of S1 .
Lemma 2.51 and formulas (2.123) and (2.124) imply the following assertion.

˜ 1 ; S1 ), C}.
Lemma 2.52. Let the set M co (C) be nonempty. Then Cγ ≥ min{Cγ ( C

˜ 1 ; S1 ) from below.
In what follows, we estimate the constant Cγ ( C

2.5.5 Use of Legendre Conditions


Now our goal is to pass to sequences with δv = 0. This will be done by using the Legendre
conditions for points tk ∈ . Let us formulate the following general assertion on the
Legendre conditions.

Proposition 2.53. Let λ ∈ M co (C). Then we have that


(a) the following condition holds for any t ∈ [t0 , tf ] \ :
1
Huu (t, x 0 (t), u0 (t), ψ(t))ū, ū ≥ Cū, ū ∀ ū ∈ Rd(u) ;
2
(b) the following conditions hold for any tk ∈ :
1 λk+
H ū, ū ≥ Cū, ū ∀ ū ∈ Rd(u) ;
2 uu
1 λk−
H ū, ū ≥ Cū, ū ∀ ū ∈ Rd(u) .
2 uu

Proof. Let λ ∈ M co (C), t ∈ [t0 , tf ]. Choose ε > 0 so small that the conditions ũ ∈ Rd(u)
and |ũ| < ε imply (t, x 0 (t), u0 (t) + ũ) ∈ Q and (t, u0 (t) + ũ) = |ũ|2 , and hence
H (t, x 0 (t), u0 (t) + ũ, ψ(t)) − H (t, x 0 (t), u0 (t), ψ(t)) ≥ C|ũ|2 .
2.5. Estimation of the Basic Constant from Below 85

In other words, the function ϕ(ũ) := H (t, x 0 (t), u0 (t) + ũ, ψ(t)) − C|ũ|2 defined on a neigh-
borhood of the origin of the space Rd(u) has a local minimum at zero. This implies
ϕ (0) = 0 and ϕ (0)ū, ū ≥ 0 for all ū ∈ Rd(u) . The first condition is equivalent to
Hu (t, x 0 (t), u0 (t), ψ(t)) = 0, and the second is equivalent to

Huu (t, x 0 (t), u0 (t), ψ(t))ū, ū − 2Cū, ū ≥ 0 ∀ ū ∈ Rd(u) .

This implies assertion (a) of the proposition. Assertion (b) is obtained from assertion (a) by
passing to the limit as t → tk + 0 and t → tk − 0, k ∈ I ∗ . The proposition is proved.

We now use assertion (b) only. Denote by b the pair (δw0 , M ∗ ) and by {b} the
sequence of pairs such that {δw0 } ∈ 0 , Mk∗ → tk for all k, M∗ = ∪Mk∗ . On each such
sequence, we define the functions
   tf 
1 
∗ ∗
2λ (b) =  (λ, δw ) +
0 k
D (H )λ
|δtk | dt + [Hx ]
λ k
δx(χk− − χk+ ) dt ,
2 Mk∗ t0
k
2C (b) = max 2λ (b),
M co (C)
 tf 
γ1 (b) =
δx
2C + |δu0 |2 dt + 2|δtk | dt.
t0 k Mk∗

We set
 √ √
S2 = {b} = {(δw0 , M ∗ )} | Fip δp ≤ o( γ1 ), i ∈ I ; |Kp δp| = o( γ1 );
 

δ ẋ − fw δw 0 − [f ]k (χk− ∗ − χ ∗ )
= o(√γ ) .
k+ 1 1

˜ 1 , γ , and
We obtain the definitions of 2 , γ1 , and S2 from the corresponding definitions of 
S1 setting δv = 0 everywhere. We set

2C
Cγ1 (2C ; S2 ) = inf lim .
S2 γ1

Lemma 2.54. The following inequality holds:


 
Cγ (˜ 1C ; S1 ) ≥ min Cγ1 (2C ; S2 ), C . (2.125)

Proof. Let {a} ∈ S1 be an arbitrary sequence such that γ > 0 for all its members. For this
sequence, we set
 tf

{b} = {(δw , M )}, γ̂ (δv) =
0
|δv|2 dt.
t0

Then γ (a) = γ1 (b)+ γ̂ (δv), or, for short, γ = γ1 + γ̂ . Let λ ∈ M co (C). Proposition 2.53(b),
implies ˜ 1λ (δw0 ) ≥ 2λ (δw0 )+C γ̂ (δv). Consequently,  ˜ 1 (δw0 ) ≥ 2 (δw 0 )+C γ̂ (δv),
C C
or, briefly, ˜ 1 ≥ 2 + C γ̂ . We consider the following two possible cases for the se-
C C
quence {a}.
86 Chapter 2. Quadratic Conditions in the Calculus of Variations

Case (a): Let lim(γ1 /γ ) = 0. Extract a subsequence such that γ1 = o(γ ) on it.
Let this condition hold for the sequence {a} itself. Then we obtain from the inequality
˜ 1 ≥ 2 + C γ̂ and the obvious estimate |2 | ≤ O(γ1 ) that
 C C C

˜1
 2 + C γ̂ o(γ ) + C γ̂ γ̂
lim C
≥ lim C = lim = C lim = C,
γ γ γ γ

since γ = γ1 + γ̂ = o(γ ) + γ̂ .
Case (b): Assume now that lim(γ1 /γ ) > 0, and hence γ ≤ const γ1 on the subse-
quence. The inequality ˜ 1 ≥ 2 + C γ̂ implies
C C
 
˜1
 2C + C γ̂ γ1 2C γ̂ 2C
C
≥ = + C ≥ min ,C ,
γ γ γ γ1 γ γ1

since γ1 + γ̂ = γ , γ1 ≥ 0, and γ̂ ≥ 0. Consequently,


 
˜1
 2C
lim C
≥ min lim ,C .
γ γ1

But the conditions {a} ∈ S and γ ≤ const γ1 immediately imply {b} ∈ S2 . We obtain from
this that
˜1

lim C ≥ Cγ1 (2C , S2 ).
γ1
Consequently,
˜1
  
lim C
≥ min Cγ1 (2C ; S2 ), C .
γ
Therefore, we have shown that from any sequence {a} ∈ S1 at which γ > 0, it is possible to
˜ 1 /γ on it is not less than the right-
extract a subsequence such that the lower limit lim  C
hand side of inequality (2.125). This obviously implies inequality (2.125). The lemma is
proved.

Lemmas 2.52 and 2.54 imply the following assertion.


Lemma 2.55. Let the set M co (C) be nonempty. Then Cγ ≥ min Cγ1 (2C ; S2 ), C .

In what follows, we will estimate the constant Cγ1 (2C ; S2 ) from below.

2.5.6 Replacement of δx2C by |δx(t0 )|2 in the Definition of the


Functional γ1
We set  tf 
γ2 = γ2 (b) = |δx(t0 )|2 + |δu0 |2 dt + 2|δtk | dt.
t0 k Mk∗
2.5. Estimation of the Basic Constant from Below 87

Therefore, γ2 differs from γ1 by the fact that the term


δx
2C is replaced by |δx(t0 )|2 . By
definition, the sequences {b} = {(δw, M∗ )} from S2 satisfy the condition
 √
∗ ∗
δ ẋ − fw δw − [f ]k (χk− − χk+ ) = o( γ1 ).
1

Therefore, according to Proposition 2.24, there exists a constant 0 < q ≤ 1, q = 1/ (see
(2.73)), such that for any sequence {b} ∈ S2 , there exists a number starting from which we
have qγ1 (b) ≤ γ2 (b), or, briefly, qγ1 ≤ γ2 . Moreover, since |δx(t0 )| ≤
δx
C , we have
γ2 ≤ γ1 . Hence
γ2
q≤ ≤ 1.
γ1
This implies that the following relations hold for any sequence {b} ∈ S2 :

2C 2 γ2  
lim = lim C · ≥ min Cγ2 (2C ; S2 ), qCγ2 (2C ; S2 ) .
γ1 γ2 γ1
Consequently,
 
Cγ1 (2C ; S2 ) ≥ min Cγ2 (2C ; S2 ), qCγ2 (2C ; S2 ) . (2.126)

Here,
2C
Cγ2 (2C ; S2 ) = inf lim .
S2 γ2
Inequality (2.126) and Lemma 2.55 imply the following assertion.

Lemma 2.56. Let the set M co (C) be nonempty. Then


 
Cγ ≥ min Cγ2 (2C ; S2 ), qCγ2 (2C ; S2 ), C , 0 < q ≤ 1.

In what follows, we will estimate the constant Cγ2 (2C ; S2 ) from below.

2.5.7 Passing to Sequences with Discontinuous State Variables


As in Section 2.4, denote by {b̄} = {(w̄, M ∗ )} a sequence such that
M ∗ = ∪Mk∗ , Mk∗ → tk (k ∈ I ∗ ),
w̄ = (x̄, ū), x̄ ∈ P W 1,1 (, Rd(x) ), ū ∈ L∞ (, Rd(u) ),

∞ → 0.

Also, we set
 
1
3λ (b̄) =  (λ, w̄) + D k (H λ ) |δtk | dt + [Hxλ ]k x̄av
k
ξk ,
2 Mk∗
k

∗ − meas M ∗ , 3 (b̄) = max co


where ξk = meas Mk− 3λ
k+ C M (C)  (b̄), and
 tf 
γ2 (b̄) = |x̄(t0 )|2 + |ū|2 dt + 2 |δtk | dt.
t0 k Mk∗
88 Chapter 2. Quadratic Conditions in the Calculus of Variations

Let S3 be the set of sequences {b̄} such that


√ √
Fip p̄ ≤ o( γ2 ) (i ∈ I ), |Kp p̄| = o( γ2 ),


x̄˙ − fw w̄
1 = o( γ2 ), [x̄]k = [f ]k ξk ∀ tk ∈ ,

where ξk = meas Mk− ∗ − meas M ∗ . According to Lemma 2.26, for any sequence {b} ∈ S
k+ 2
for which γ2 > 0, there exists a sequence {b̄} such that

δp = p̄,
x̄˙ − fw w̄
1 = o( γ2 ), [x̄]k = [f ]k ξk ∀ tk ∈ ;
γ2 (b) = γ2 (b̄), 2C (b) = 3C (b̄) + o(γ2 ).

Consequently, {b̄} ∈ S3 and

2C (b) 3 (b̄) 3 def


lim = lim C ≥ inf lim C = Cγ2 (3C ; S3 ).
γ2 (b) γ2 (b̄) S3 γ2

Since {b} ∈ S2 is an arbitrary sequence on which γ2 > 0, we have

Cγ2 (2C ; S2 ) ≥ Cγ2 (3C ; S3 ). (2.127)

This inequality and Lemma 2.56 imply the following assertion.

Lemma 2.57. Let the set M co (C) be nonempty. Then


 
Cγ ≥ min Cγ2 (3C ; S3 ), qCγ2 (3C ; S3 ), C .

In what follows, we will estimate the constant Cγ2 (3C ; S3 ).

2.5.8 Additional Condition of Legendre, Weierstrass–Erdmann Type


Related to Varying Discontinuity Points of the Control
Now our goal is to pass from the sequences {(w̄, M ∗ )} ∈ S3 to the sequences {(ξ̄ , w̄)}, where
∗ − meas M ∗ , k = 1, . . . , s.
ξ̄ = (ξ̄1 , . . . , ξ̄s ), ξ̄k = meas Mk− k+

Proposition 2.58. The following inequality holds for any λ ∈ M co (C):

D k (H λ ) ≥ 2C, k ∈ I ∗.

Proof. Fix λ ∈ M co (C), k ∈ I ∗ . Take a small ε > 0 and construct a variation δu(t) in the
left neighborhood (tk − ε, tk ) of the point tk such that

u0k+ − u0 (t), t ∈ (tk − ε, tk ),
δu(t) =
0, t∈/ (tk − ε, tk ).

For a sufficiently small ε > 0, we have (t, x 0 (t), u0 (t) + δu(t)) ∈ Q. Consequently,

δH λ ≥ C(t, u0 (t) + δu(t)),


2.5. Estimation of the Basic Constant from Below 89

where δH λ = H λ (t, x 0 , u0 + δu) − H λ (t, x 0 , u0 ). It follows from the definition of (t, u)


that the following relation holds for a sufficiently small ε > 0:

(t, u0 (t) + δu(t)) = 2|δtk |χk− ,

where χk− ∗ is the characteristic function of the set M ∗ = {t | δu(t)  = 0} = (t − ε, t ).


k− k k
Moreover, according Proposition 2.13, with the conditions [H λ ]k = 0, δvk− = (u0 + δu −
u0k+ )χk−∗ = 0 taken into account, we have the following for δu: δH λ = (D k (H λ )|δt | +
k
∗ . Consequently, D k (H λ )|δt | + o(|δt |) ≥ 2C|δt | on (t − ε, t ). This implies
o(|δtk |))χk− k k k k k
D k (H λ ) ≥ 2C. The lemma is proved.

Denote by S4 the set of sequences {z̄} = {(ξ̄ , x̄, ū)} of elements of the space Z()
(defined by (2.89)) such that |ξ̄ | + ||w̄
∞ → 0 and, moreover,
√ √
Fip p̄ ≤ o( γ̄ ) (i ∈ I ), |Kp p̄| = o( γ̄ ),


x̄˙ − fw w̄
1 = o( γ̄ ), [x̄]k = [f ]k ξ̄k ∀ tk ∈ .
tf
Recall that γ̄ = γ̄ (z̄) := |x̄(t0 )|2 + t0 |ū|2 dt + |ξ̄ |2 . Also, recall that (see formula (2.13))

1 k λ 2 1
λ (z̄) := (D (H )ξ̄k + 2[Hxλ ]k x̄av
k
ξ̄k ) +  (λ, w̄),
2 2
k
tf
where  (λ, w̄) = lpp
λ p̄, p̄ +
t0 Hww
λ w̄, w̄ dt. We set

C (z̄) = max


co
λ (z̄)
M (C)

and
C
Cγ̄ (C ; S4 ) = inf lim .
S4 γ̄
Using Proposition 2.58, we prove the following estimate.

Lemma 2.59. The following inequality holds:

Cγ2 (3C ; S3 ) ≥ min{Cγ̄ (C ; S4 ), C}. (2.128)

Proof. Let {b̄} = {(w̄, M ∗ )} ∈ S3 be a sequence on which γ2 > 0. The following inequalities
hold for each member of this sequence:

2 |δtk | dt ≥ ξ̄k2 , k ∈ I ∗ ,
Mk∗

∗ − meas M ∗ . We set
where ξ̄k = meas Mk− k+
 
μk = 2 |δtk | dt − ξ̄k2 , k ∈ I ∗, μ= μk .
Mk∗
90 Chapter 2. Quadratic Conditions in the Calculus of Variations

Then μk ≥ 0 for all k, and, therefore, μ ≥ 0. To the sequence {b̄} = {(w̄, M ∗ )}, we
naturally associate the sequence {z̄} = {ξ̄ , w̄)} with the same components w̄ and with
ξ̄k = meas Mk−∗ − meas M ∗ , k ∈ I ∗ . Then γ (b̄) = γ̄ (z̄) + μ, or, for short, γ = γ̄ + μ.
k+ 2 2
Moreover,
1 1 
 (b̄) =
3λ  (λ, w̄) + D (H )(ξ̄k + μk ) + [Hx ] x̄av ξ̄k
k λ 2 λ k k
2 2
k
1 k λ
= λ (z̄) + D (H )μk .
2
k

According to Proposition 2.58, D k (H λ ) ≥ 2C for all k for any λ ∈ M co (C). Consequently,


3C (b̄) ≥ C (z̄) + Cμ

or, briefly, 3C ≥ C + Cμ. Therefore, we have the following for the sequence {b̄}:

3C C + Cμ
≥ , (2.129)
γ2 γ2
where γ2 = γ̄ + μ. Consider the following two cases.
Case (a). Let lim γ̄ /γ2 = 0. Choose a subsequence such that γ̄ = o(γ2 ) on it. Let this
condition hold on the sequence {b̄} itself. Since γ2 = γ̄ + μ, we have μ/γ2 → 1. Let us
show that the following estimate holds in this case:
|C | = o(γ2 ). (2.130)
Indeed, the definition of the functional λ and the boundedness of the set M co (C) imply
the existence of a constant C > 0 such that
|C | ≤ C (

2∞ +

22 + |ξ̄ |2 ). (2.131)

Further, since {b̄} ∈ S3 , we have



x̄˙ = fx x̄ + fu ū + r̄;

1 = o( γ2 ); [x̄]k = [f ]k ξ̄k , k ∈ I ∗.
Consequently,

∞ ≤ O(|x̄(t0 )| + |ξ̄ | +

1 +

1 ). Since
 √
|x̄(t0 )| + |ξ̄ | +

1 ≤ O( γ̄ ),

1 = o( γ2 ),
we have

2∞ ≤ O(γ̄ ) + o(γ2 ). This and estimate (2.131) imply |C | ≤ O(γ̄ ) + o(γ2 ).
But γ̄ = o(γ2 ). Consequently, estimate (2.130) holds.
Taking into account estimate (2.130), we obtain from inequality (2.129) that
3C C + Cμ Cμ
lim ≥ lim = lim = C.
γ2 γ2 γ2
Case (b). Let lim γ̄ /γ2 > 0. Then γ2 = O(γ̄ ). This and the condition {b̄} ∈ S3 easily imply
that {z̄} ∈ S4 . We obtain from inequality (2.129) that
 
3C γ̄ C μ C
≥ · + C ≥ min ,C ,
γ2 γ2 γ̄ γ2 γ̄
2.5. Estimation of the Basic Constant from Below 91

since γ̄ ≥ 0, μ ≥ 0, and γ2 = γ̄ + μ. Consequently,


 
3 C

lim C ≥ min lim ; C ≥ min Cγ̄ (C , S4 ), C .


γ2 γ̄
The latter inequality holds, since {z̄} ∈ S4 .
Therefore, we have proved that it is possible to extract a subsequence from any
sequence {b̄} ∈ S3 on which γ2 > 0 such that the following inequality holds on it:
3C
lim ≥ min{Cγ̄ (C ; S4 ), C}.
γ2
This implies inequality (2.128). The lemma is proved.

Lemmas 2.57 and 2.59 imply the following assertion.

Lemma 2.60. Let the set M co (C) be nonempty. Then


Cγ ≥ min{Cγ̄ (C , S4 ), qCγ̄ (C , S4 ), C, qC}.
In what follows, we will estimate the constant Cγ̄ (C ; S4 ) from below.

2.5.9 Passing to Equality in the Differential Relation


We restrict the set of sequences S4 to the set of sequences in which each member satisfies
the conditions x̄˙ = fw w̄, [x̄]k = [f ]k ξ̄k , k ∈ I ∗ . We denote by S5 this new set of sequences.
Let us show that under such a restriction, the constant Cγ̄ does not increase. In other words,
let us prove the following lemma.

Lemma 2.61. The following relation holds: Cγ̄ (C ; S4 ) = Cγ̄ (C ; S5 ).

Proof. Let {z̄} ∈ S4 be such that γ̄ > 0 on it. Then


√ √
Fip p̄ ≤ o( γ̄ ) (i ∈ I ), |Kp p̄| = o( γ̄ ),
√ (2.132)
x̄˙ = fx x̄ + fu ū + r̄,

= o( γ̄ ); [x̄]k = [f ]k ξ̄k (k ∈ I ∗ ).
Let a sequence {δx} satisfy the conditions δ ẋ = fx δx + r̄ and δx(t0 ) = 0. Then


δx
C ≤
δx
1,1 ≤ const ·

1 = o( γ̄ ).
We set δu = 0, δw = (δx, 0), w̄ = w̄ − δw, ξ̄ = ξ̄ , z̄ = (ξ̄ , w̄ ). For the sequence {z̄ }, we
have γ̄ (z̄ ) = γ̄ (z̄), or, for short, γ̄ = γ̄ . Moreover,
 
Fip p̄ ≤ o( γ̄ ) (i ∈ I ), |Kp p̄ | = o( γ̄ ),
x̄˙ = fw w̄ , [x̄ ]k = [f ]k ξ (k ∈ I ∗ ),

∞ + |ξ̄ | → 0.
k
Hence {z̄ } ∈ S 5. Moreover,
1
λ (z̄) = λ (z̄ ) + lpp p̄, δp + lpp δp, δp
 2

1 tf  λ  (2.133)
+ Hxx δx, δx + 2Hxx λ
δx, x̄ dt + [Hx ]k δx(tk )ξ̄k .
2 t0
k
92 Chapter 2. Quadratic Conditions in the Calculus of Variations


Conditions (2.132) easily imply

2∞ ≤ O(γ̄ ). Moreover,
δx
C = o( γ̄ ). This and con-
ditions (2.133) imply C (z̄) = C (z̄ ) + o(γ̄ ). Consequently,
C (z̄) C (z̄ )
lim = lim ≥ Cγ̄ (C ; S5 ).
γ̄ (z̄) γ̄ (z̄ )
The inequality holds, since {z̄ } ∈ S5 . Since {z̄} is an arbitrary sequence from S4 on which
γ̄ > 0, we obtain from this that Cγ̄ (C ; S4 ) ≥ Cγ̄ (C ; S5 ). The inclusion S5 ⊂ S4 implies
the converse inequality. Therefore, we have an equality here. The lemma is proved.

Lemmas 2.60 and 2.61 imply the following assertion.

Lemma 2.62. Let the set M co (C) be nonempty. Then

Cγ ≥ min{Cγ̄ (C ; S5 ), qCγ̄ (C ; S5 ), C, qC}.

We now estimate the constant Cγ̄ (C ; S5 ) from below.

2.5.10 Passing to the Critical Cone


Introduce the constant
  

C (z̄) 
Cγ̄ (C ; S5 ) = inf  z̄ ∈ K \ {0} .
γ̄ (z̄) 

Lemma 2.63. The following inequality holds: Cγ̄ (C ; S5 ) ≥ Cγ̄ (C ; K).

Proof. Let {z̄} be an arbitrary nonvanishing sequence from S5 . For this sequence, we have
 
Fip p̄ ≤ o( γ̄ ) (i ∈ I ), |Kp p̄| = o( γ̄ ),

∞ + |ξ̄ | → 0, (2.134)
˙x̄ = fw w̄, [x̄]k = [f ]k ξ̄k (k ∈ I ∗ ). (2.135)

Moreover, γ̄ (z̄) > 0 on the whole sequence, since it contains nonzero members. Let T
be the subspace in Z() defined by conditions (2.135). According to Lemma 2.41 (which
¯ =
follows from the Hoffman lemma), for the sequence {z̄}, there exists a sequence {z̄}
¯ ¯ ū)}
{(ξ̄ , x̄, ¯ in the subspace T such that

¯ ≤0
Fip (p̄ + p̄) (i ∈ I ), ¯ = 0,
Kp (p̄ + p̄)

and, moreover,

¯ ∞ + |ξ̄¯ | = o( γ̄ ). We set {z̄ } = {z̄ + z̄}.
¯ ∞ +

¯ As in the proof of
Lemma 2.61, for the sequence {z̄ }, we have

C (z̄) = C (z̄ ) + o(γ̄ ). (2.136)

Moreover,
γ̄ (z̄) = γ̄ (z̄ ) + o(γ̄ ). (2.137)
¯ √
¯ ¯
√ follow from the estimate

∞ +

∞ + |ξ̄ | = o( γ̄ ) and the estimate
These conditions


∞ ≤ O( γ̄ ), which, in turn, follows from (2.135). It follows from (2.137) and the
2.5. Estimation of the Basic Constant from Below 93

condition γ̄ (z̄) > 0 that γ̄ (z̄ ) > 0 starting from a certain number. According to (2.136) and
(2.137), we have
C (z̄) C (z̄ )
lim = lim ≥ Cγ̄ (C ; K). (2.138)
γ̄ (z̄) γ̄ (z̄ )
The inequality holds, since z̄ ∈ K \ {0}. Since this inequality holds for an arbitrary nonva-
nishing sequence {z̄} ∈ S5 , this implies Cγ̄ (C ; S5 ) ≥ Cγ̄ (C ; K).

Lemmas 2.62 and 2.63 imply the following assertion.

Lemma 2.64. The following inequality holds for any real C such that the set M co (C) is
nonempty:
Cγ ≥ min{Cγ̄ (C ; K), qCγ̄ (C ; K), C, qC}. (2.139)
Let C be such that M co (C) is nonempty and

C (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K. (2.140)

Then it follows from the definition of the constant Cγ̄ (C ; K) that Cγ̄ (C ; K) ≥ C.
This and (2.139) imply Cγ ≥ min{C, qC}. This inequality holds for all C such that the set
M co (C) is nonempty and condition (2.140) holds. Therefore, it also holds for the least
upper bound of these C. At the end of Section 2.4, we have denoted this upper bound by CK .
Therefore, we have proved the following theorem.

Theorem 2.65. The following inequality holds:

Cγ ≥ min{CK , qCK }, 0 < q ≤ 1. (2.141)

Earlier, at the end of Section 2.4, we obtained the following estimate


(see (7.43)):
1
CK ≥ min{Cγ , Cγ },  = ≥ 1,
q
which can be written in the equivalent form:
 
1
max CK , CK ≥ Cγ ,  ≥ 1. (2.142)


2.5.11 Decoding Result


Combining inequalities (2.141) and (2.142), we obtain the following decoding result.

Theorem 2.66. The following inequalities hold :

min{CK , qCK } ≤ Cγ ≤ max{CK , qCK }. (2.143)

These inequalities are equivalent to

min{Cγ , Cγ } ≤ CK ≤ max{Cγ , Cγ } (2.144)


94 Chapter 2. Quadratic Conditions in the Calculus of Variations

Recall that  ≥ 1, 0 < q ≤ 1, and  = 1/q. Bearing in mind inequalities (2.143) and (2.144),
we write
const
C γ = CK . (2.145)
This is the main result of decoding.
We now obtain the following important consequences of this result.
Case (a). Let CK ≥ 0. Then Cγ ≥ 0 by (2.143); according to Remark 2.44, this implies
the condition
M0co  = ∅; max
co
λ (z̄) ≥ 0 ∀ z̄ ∈ K. (2.146)
M0

Conversely, if condition (2.146) holds, then CK ≥ 0. Therefore, condition (2.146) is equiv-


alent to the inequality CK ≥ 0; by Theorem 2.66, the latter is equivalent to the inequality
Cγ ≥ 0.
Case (b). Let CK > 0. Let the constant C be such that CK > C > 0. Then, by the definition
of CK , we have
M co (C)  = ∅; max λ (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K. (2.147)
M co (C)

Conversely, if for a certain C > 0, condition (2.147) holds, then CK ≥ C, and hence
CK > 0. Therefore, the existence of C > 0 such that (2.147) holds is equivalent to the
inequality CK > 0. By Theorem 2.66, the latter is equivalent to the inequality Cγ > 0.
The following theorem summarizes what was said above.

Theorem 2.67. (a) The inequality Cγ ≥ 0 is equivalent to condition (2.146). (b) The
inequality Cγ > 0 is equivalent to the existence of C > 0 such that condition (2.147) holds.

2.5.12 Sufficient Conditions for the Pontryagin Minimum


We give the following definition.

Definition 2.68. Let (t, u) be an admissible function. We say that Condition Bco () holds
at the point w 0 if there exists C > 0 such that condition (2.147) holds.

As we already know, the condition Cγ > 0 is sufficient for the strict Pontryagin
minimum at the point w0 , and, according to Theorem 2.67, it is equivalent to Condition
Bco (). Therefore, Condition Bco () is also sufficient for the strict Pontryagin minimum.
Let us consider Condition Bco (). First of all, we show that it is equivalent to Condi-
tion B() in whose definition we have the set M(C) instead of the set M co (C). By defini-
tion, the set M(C) consists of tuples λ = (α0 , α, β, ψ) ∈ 0 such that ψ(t)(f (t, x 0 (t), u) −
f (t, x 0 (t), u0 (t))) ≥ C(t, u) if t ∈ [t0 , tf ]\ and (t, x 0 (t), u) ∈ Q. Therefore, the set M(C)
differs from the set M co (C) by that co 0 is replaced by 0 in the definition of the latter.
Condition B() means that the set M(C) is nonempty for a certain C > 0 and
max λ (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K. (2.148)
M(C)

Since 0 ⊂ co 0 , and hence M(C) ⊂ M co (C), Condition B() implies Condition


Bco (). It is required to prove the converse statement: Condition Bco () implies Condition
B(). For this purpose, we prove the following lemma.
2.5. Estimation of the Basic Constant from Below 95

Lemma 2.69. For any C > 0 such that M co (C) is nonempty, there exists 0 < ε < 1 such
that  
C
M (C) ⊂ [ε, 1] ◦ M
co
 ,
ε
where [ε, 1] ◦ M is the set of tuples λ = ρ λ̃ such that ρ ∈ [ε, 1] and λ̃ ∈ M.

Proof. For an arbitrary λ = (α0 , α, β, ψ) ∈ co 0 , we set ν(λ) = α0 + |α| + |β|. Since the
function ν(λ) is convex and equals 1 on 0 , we have ν(λ) ≤ 1 for all λ ∈ co 0 . Further, let
C > 0 be such that M co (C) is nonempty. Then as is easily seen, the compact set M co (C)
does not contain zero. This implies
ε= min ν(λ) > 0,
λ∈M co (C)

since the conditions λ ∈ co 0 and ν(λ) = 0 imply λ = 0.


Therefore, the inequalities ε ≤ ν(λ) ≤ 1 hold for an arbitrary λ ∈ M co (C). We set
λ̃ = λ/ν(λ). Then ν(λ̃) = 1, and hence λ̃ ∈ 0 . Moreover, the condition ψδu f ≥ C
implies
C C
ψ̃δu f ≥  ≥ ,
ν(λ) ε
where ψ̃ is a component of λ̃, δu f = f (t, x 0 , u) − f (t, x 0 , u0 ), and  = (t, u). Hence,
λ̃ ∈ M( Cε ). Therefore, for an arbitrary λ ∈ M co (C), we have found the representation
λ = ν(λ)λ̃, where λ̃ ∈ M( Cε ), ε ≤ ν(λ) ≤ 1. This implies
 
C
M co (C) ⊂ [ε, 1] ◦ M  .
ε
The lemma is proved.

Lemma 2.69 implies the following theorem.

Theorem 2.70. Condition Bco () is equivalent to Condition B().

Proof. Let Condition Bco () hold, i.e., there exists C > 0 such that M co (C) is non-
empty and
max
co
λ (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K.
M (C)

Then, by Lemma 2.69, there exist ε > 0 such that

 C   (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K.
λ
max
[ε,1]◦M ε 

By the linearity of λ in λ and the positivity of C, we obtain from this that

 C   (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K.
λ
max
M ε 

Therefore, Condition B() holds. We have shown that Condition Bco () implies Condition
B(). As mentioned above, the converse is also true. Therefore, Conditions Bco () and
B() are equivalent. The theorem is proved.
96 Chapter 2. Quadratic Conditions in the Calculus of Variations

Theorems 2.67(b) and 2.70 imply the following theorem.

Theorem 2.71. The inequality Cγ > 0 is equivalent to Condition B().

There is a certain inconvenience in Condition B(), in that it is difficult to verify that


λ ∈ M(C). Therefore, it is desirable to pass to the sufficient condition, which can be more
easily verified. Recall that in Section 2.1.8 we introduced the set M0+ consisting of those
λ ∈ M0 for which the strict minimum principle holds outside the discontinuity points of the
control. In the same section, we have introduced the set Leg+ (M0+ ) consisting of all strictly
Legendrian elements λ ∈ M0+ . The definition of M(C) ⊂ M co (C), its compactness, and
Propositions 2.53 and 2.58 imply the following assertion.

Lemma 2.72. For any admissible function (t, u) and any C > 0, the set M(C) is a
compact set contained in the set Leg+ (M0+ ).

Also, the following assertion holds.

Lemma 2.73. For any nonempty compact set M ⊂ Leg+ (M0+ ), there exist an admissible
function (t, u) and a constant C > 0 such that M ⊂ M(C).

Before proving Lemma 2.73, we prove a slightly simpler property. Let U be an


arbitrary neighborhood of the compact set u0 containing in Qtu . With the subscript U,
we denote all objects referring to the canonical problem complemented by the constraint
(t, u) ∈ U. For example, we write M0U , M U (C), etc. Denote by Leg+ (0 ) the subset of
all strictly Legendre elements λ ∈ 0 .

Lemma 2.74. Let M ⊂ Leg+ (0 ) be a nonempty compact set, and let (t, u) be an admis-
sible function. Then there exist a neighborhood U of the compact set u0 and a constant
C > 0 such that M ⊂ M U (C).

To prove Lemma 2.74, we need several auxiliary assertions.

Proposition 2.75. Assume that there is a nonempty compact set M ⊂ 0 such that the
following conditions hold for each of its elements λ:
(a) for any t ∈ [t0 , tf ] \ ,

1
Huu (t, x 0 (t), u0 (t), ψ(t))ū, ū > 0 ∀ ū ∈ Rd(u) \ {0};
2

(b) for any tk ∈ ,


1 λk+
H ū, ū > 0 ∀ ū ∈ Rd(u) \ {0} (2.149)
2 uu
and
1 λk−
H ū, ū > 0 ∀ ū ∈ Rd(u) \ {0}. (2.150)
2 uu
2.5. Estimation of the Basic Constant from Below 97

Then there exist C > 0 and ε > 0 such that for any λ ∈ M, the conditions
t ∈ [t0 , tf ] \  and |u − u0 (t)| < ε (2.151)
imply
H (t, x 0 (t), u, ψ(t)) − H (t, x 0 (t), u0 (t), ψ(t)) ≥ C|u − u0 (t)|2 . (2.152)

Proof. Assume the contrary. Let the compact set M ⊂ 0 be such that conditions (a) and
(b) of the proposition hold for each of its element, but there are no C > 0 and ε > 0 such that
conditions (2.151) imply inequality (2.152). Then there exist sequences {Cn }, {tn }, {λn },
and {ūn } such that
Cn → +0, tn ∈ [t0 , tf ] \ , λn ∈ M, ūn ∈ Rd(u) , |ūn | → 0,
(2.153)
H λn (tn , xn0 , u0n + ūn ) − H λn (tn , xn0 , u0n ) < Cn |ūn |2 ,

where xn0 = x 0 (tn ) and u0n = u0 (tn ). Without loss of generality, we assume that
ūn
tn → tˆ ∈ [t0 , tf ], λn → λ̂ ∈ M, → ū.
|un |
Then ūn = εn (ū + ũn ), where εn = |ūn | → 0, |ūn | → 0. In this case, we obtain from (2.153)
that
1 λn
H (tn , xn0 , u0n )ūn , ūn + o(εn2 ) < Cn εn2 . (2.154)
2 uu
Here, we have taken into account that Huλn (tn , xn0 , u0n ) = 0.
We first assume that tˆ ∈
/ . Dividing (2.154) by εn2 and passing to the limit, we obtain
1 λ̂
H (tˆ, x 0 (tˆ), u0 (tˆ))ū, ū ≤ 0.
2 uu
But this contradicts condition (a), since ū  = 0. Analogously, in the case where tˆ ∈ , we
arrive at a contradiction to one of the conditions in (b). The proposition is proved.

In what follows, we need to use the assumption that each tk ∈  is an L-point of the
control u0 .

Proposition 2.76. Let M ⊂ 0 be a nonempty compact set such that the following conditions
hold for a fixed point tk ∈  and any λ ∈ M:
[H λ ]k = 0, D k (H λ ) > 0, (2.155)
1 λk−
H ū, ū > 0 ∀ ū ∈ Rd(u) \ {0}. (2.156)
2 uu
Then there exist C > 0 and ε > 0 such that for any λ ∈ M, the conditions
tk < t < tk + ε and |u − u0k− | < ε (2.157)
imply the inequality
 
1
H (t, x 0 (t), u, ψ(t)) − H (t, x 0 (t), u0 (t), ψ(t)) ≥ C |t − tk | + |u − u0k− |2 . (2.158)
2
98 Chapter 2. Quadratic Conditions in the Calculus of Variations

Proof. Let a compact set M ⊂ 0 satisfy the condition of the proposition, and let there be
no C > 0 and ε > 0 such that for any λ ∈ M, conditions (2.157) imply inequality (2.158).
Then there exist a sequence {C} and a sequence of triples {(t, u, λ)} such that

C → +0, t → tk + 0, u → u0k+ ,  
1 (2.159)
H λ (t, x 0 (t), u) − H λ (t, x 0 (t), u0 (t)) < C |t − tk | + |u − u0k− |2
2

(we omit the serial numbers of members). We set t − tk = δt > 0, u − u0k− = δv, u0 (t) −
u0k+ = δu0 , x 0 (tk ) = x 0k , x 0 − x 0k = δx 0 , etc. Then we get

H λ (t, x 0 , u) = H (t, x 0 , u, ψ) = H (tk + δt, x 0k + δx 0 , u0k− + δv, ψ k + δψ)


 
= H λk− + Htλk− + Hxλk+ ẋ 0k+ + ψ̇ k+ Hψλk− δt + Huλk− δv
1 λk−
+ Huu δv, δv + o(|δt| + |δv|2 ),
2
and taking into account that |δu0 | ≤ L|δt| (L > 0) by assumption, we obtain

H λ (t, x 0 , u0 ) = H (t, x 0 , u0 , ψ) = H (tk + δt, x 0k + δx 0 , u0k+ + δu0 , ψ k + δψ)


 
= H λk+ + Htλk+ + Hxλk+ ẋ 0k+ + ψ̇ k+ Hψλk+ δt + Huλk+ δu0 + o(δt).

Subtracting the latter relation from the previous one, taking into account that Huλk+ =
Huλk− = 0 and [H λ ]k = 0, and also taking into account inequality (2.159), we obtain
  1 λk−
− [Htλ ]k + [Hxλ ]k ẋ 0k+ + ψ̇ k+ [Hψλ ]k δt + Huu δv, δv + o(|δt| + |δv|2 )
  2
1
< C |δt| + |δv| . 2
2

But [Htλ ]k + [Hxλ ]k ẋ 0k+ + ψ̇ k+ [Hψλ ]k = −D k (H λ ) and C → +0. Hence

1 λk−
D k (H λ )δt + Huu δv, δv < o1 (|δt| + |δv|2 ). (2.160)
2

Without loss of generality, we assume that λ → λ̂ ∈ M. We consider the following two


possible cases for the sequences {δt} and {δv}.
Case (a). Assume that there exists a subsequence such that the following relation holds on
it: |δv|2 = o(|δt|). Then it follows from (2.160) that D k (H λ )δt < o(δt). Since δt → +0 and
λ → λ̂, we obtain from this that D k (H λ̂ ) ≤ 0, which contradicts the condition λ̂ ∈ M.
Case (b). Assume now that
|δv|2
lim > 0,
|δt|
i.e., |δt| ≤ O(|δv|2 ). In this case, from (2.160) and the conditions δt > 0 and D k (H λ ) > 0,
we obtain
1 λk−
H δv, δv < o(|δv|2 ). (2.161)
2 uu
2.5. Estimation of the Basic Constant from Below 99

Without loss of generality, we assume that


δv
→ v̄, |v̄| = 1.
|δv|

Dividing inequality (2.161) by |δv|2 and passing to the limit, we obtain


1 λ̂k−
H v̄, v̄ ≤ 0.
2 uu
Since |v̄| = 1, this also contradicts the condition λ̂ ∈ M. Therefore, our assumption that
there exist the sequences {C} and {(t, u, λ)} with the above property is not true. We thus
have proved the proposition.

The following assertion is proved analogously.

Proposition 2.77. Let M ⊂ 0 be a nonempty compact set such that the following condi-
tions hold for a fixed point tk ∈  and any λ ∈ M:

[H λ ]k = 0, D k (H λ ) > 0,

and
1 λk+
H ū, ū > 0 ∀ ū ∈ Rd(u) \ {0}.
2 uu
Then there exist C > 0 and ε > 0 such that for any λ ∈ M, the conditions tk − ε < t < tk ,
and |u − u0k+ | < ε imply
 
1
H (t, x (t), u, ψ(t)) − H (t, x (t), u (t), ψ(t)) ≥ C |t − tk | + |u − u | .
0 0 0 0k+ 2
2
Propositions 2.75, 2.76, and 2.77 directly imply Lemma 2.74.

Proof of Lemma 2.73. Assume that there exists a nonempty compact set M ⊂ Leg+ (M0+ ).
Let 1 (t, u) be a certain admissible function (as was shown in Section 2.3.7, there exists at
least one such function). According to Lemma 2.73, there exist a neighborhood U ⊂ Qtu
of the compact set u0 and a constant C > 0 such that M ⊂ M U (C1 ), i.e., M ⊂ 0 and the
conditions t ∈ [t0 , tf ] \ , (t, u) ∈ U, and (t, x 0 (t), u) ∈ Q imply

H λ (t, x 0 (t), u) − H λ (t, x 0 (t), u0 (t)) ≥ C1 (t, u).

We set
1
h(t, u) = min {H λ (t, x 0 (t), u) − H λ (t, x 0 (t), u0 (t))},
C λ∈M (2.162)
(t, u) = min{h(t, u), 1 (t, u)}.
It is easy to see that the function (t, u) (defined by (2.162)) is admissible, and, moreover,
M ⊂ M(C). The lemma is proved.

We now recall the definition given in Section 2.1.8. We say that Condition B holds
for the point w0 if there exist a nonempty compact set M ⊂ Leg+ (M0+ ) and a constant C > 0
100 Chapter 2. Quadratic Conditions in the Calculus of Variations

such that
max λ (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K. (2.163)
M
The following assertion holds.

Theorem 2.78. Condition B is equivalent to the existence of an admissible function (t, u)


such that Condition B() holds.

Proof. Let Condition B hold; i.e., there exist a nonempty compact set M ⊂ Leg+ (M0+ ) and
a constant C > 0 such that condition (2.163) holds. Then, according to Lemma 2.73, there
exist an admissible function (t, u) and a constant C1 > 0 such that M ⊂ M(C1 ). We set
C2 = min{C, C1 }. Then M(C1 ) ⊂ M(C2 ). Consequently, M ⊂ M(C2 ) and
max λ (z̄) ≥ max λ (z̄) ≥ C γ̄ (z̄) ≥ C2 γ̄ (z̄) ∀ z̄ ∈ K
M(C2 ) M

(the second inequality holds by (2.163)). Therefore, Condition B() holds.


Conversely, let there exist an admissible function  such that Condition B() holds.
Then there exists C > 0 such that M(C) is nonempty and condition (2.148) holds. By
Lemma 2.72, M(C) ⊂ Leg+ (M0+ ), and M(C) is a compact set. Therefore, Condition B
also holds. The theorem is proved.

According to Theorem 2.71, the inequality Cγ > 0 is equivalent to Condition B().


This is true for every admissible function . This and Theorem 2.78 imply the following
theorem.

Theorem 2.79. Condition B is equivalent to the existence of an admissible function  such


that the condition Cγ > 0 holds for the order γ corresponding to it.

In Section 2.2, we have verified all the assumptions of the abstract scheme for the
canonical problem, the point w 0 , the sets P and Q (the absorbing set  = W corresponds to
them in the space W ), and the set  of Pontryagin sequences in the space W . In Section 2.3,
the corresponding assumptions of the abstract scheme were also verified for a higher order
γ on . Therefore, Theorem 1.7 is applicable. According to this theorem, the condition
Cγ > 0 is not only sufficient for the Pontryagin minimum at the point w 0 but is equivalent
to the γ -sufficiency on . The latter will be also called the Pontryagin γ -sufficiency. For
convenience, we define this concept here. Let  be an admissible function, and let γ be the
higher order corresponding to it.

Definition 2.80. We say that the point w 0 yields the Pontryagin γ -sufficiency if there exists
ε > 0 such that for any sequence {δw} ∈ , there exists a number, starting from which the
condition σ (δw) ≥ εγ (δw) holds.

The equivalent condition for the Pontryagin γ -sufficiency consists of the following:
There is no sequence {δw} ∈  such that σ = o(γ ) on it.
The violation function σ was already defined in Section 2.2.4. In what follows, it is
convenient to use the following expression for σ :
  tf
σ = (δJ )+ + Fi+ (p0 + δp) + |δK| + |δ ẋ − δf | dt, (2.164)
t0
2.5. Estimation of the Basic Constant from Below 101

where δJ = J (p0 +δp)−J (p 0 ), δK = K(p0 +δp)−K(p 0 ), δf = f (t, w0 +δw)−f (t, w 0 ),


and a + = max{a, 0}. This expression differs from the expression from Section 2.2 by
only a constant multiplier (more precisely, they estimate each other from above and from
below with constant multipliers), and, therefore, it also can be used in all the formulations.
Therefore, Theorem 1.7(b) implies the following theorem.

Theorem 2.81. The condition Cγ > 0 is equivalent to the Pontryagin γ -sufficiency at the
point w 0 .

Theorems 2.79 and 2.81 imply the following theorem.

Theorem 2.82. Condition B is equivalent to the existence of an admissible function  such


that the Pontryagin γ -sufficiency holds at the point w0 for the order γ corresponding to it.

Since the Pontryagin γ -sufficiency implies the strict Pontryagin minimum, Condition
B is also sufficient for the latter. Therefore, Theorem 2.5 is proved.
However, Theorem 2.82 is a considerably stronger result than Theorem 2.5. It allows
us to proceed more efficiently in analyzing sufficient conditions. We will see this in what
follows.

2.5.13 An Important Estimate


We devote this section to a certain estimate, which will be needed in Section 2.7 for obtain-
ing the sufficient conditions for the strong minimum.
Let an admissible function  and a constant C be such that the set M(C) is nonempty.
Let M be a nonempty compact set in M(C). According to (2.111), we have

Cγ ≥ Cγ (M ; o(√γ ) ), (2.165)

where
M
Cγ (M ; o(√γ ) ) = inf√ lim
o( γ) γ
and M (δw) = maxλ∈M (λ, δw). We can further estimate Cγ (M ; o(√γ ) ) from below
exactly in the same way as was done for the constant Cγ (C ; o(√γ ) ) when M = M co (C).
All the arguments are repeated literally (see relations (2.118), (2.121), (2.124)–(2.128) and
Lemmas 2.61, 2.63). As a result, we arrive at the following estimate:

Cγ (M ; o(√γ ) ) ≥ min{Cγ̄ (M ; K), qCγ̄ (M ; K), C, qC)}, (2.166)

where 0 < q ≤ 1, M (z̄) = maxλ∈M λ (z̄), and


  

M (z̄) 
Cγ̄ (M ; K) = inf  z̄ ∈ K \ {0} .
γ̄ (z̄) 

Now let M ⊂ Leg+ (M0+ ) be a nonempty compact set, and let there exist a constant
C > 0 such that
max λ (z̄) ≥ C γ̄ (z̄) ∀ z̄ ∈ K (2.167)
M
102 Chapter 2. Quadratic Conditions in the Calculus of Variations

(i.e., Condition B holds). Then by Lemma 2.73, there exist an admissible function 
and a constant C1 such that M ⊂ M(C1 ). Condition (2.167) implies Cγ̄ (M ; K) ≥ C.
Then (2.166) implies Cγ (M ; o(√γ ) ) ≥ q min{C, C1 }. We set CM = 12 q min{C, C1 }. Then
Cγ (M ; o(√γ ) ) > CM . Therefore,
M ≥ CM · γ | o(√γ ) , (2.168)
i.e., for any sequence {δw} ∈ o(√γ ) , there exists a number starting from which we have
M ≥ CM γ . We have obtained the following result.

Lemma 2.83. Let M ⊂ Leg+ (M0+ ) be a nonempty compact set, and let there exist C > 0
such that condition (2.167) holds. Then there exists a constant CM > 0 such that condition
(2.168) holds.

2.6 Completing the Proof of Theorem 2.4


2.6.1 Replacement of M0co by M0 in the Necessary Conditions
The purpose of this section is to complete the proof of Theorem 2.4, which we began in
Section 2.4. Here we will not use the results of Section 2.5. Instead, we shall need some
constructions from [79, Part 1, Chapter 2, Section 7]. Let us note that the proofs in this
section are rather technical and could be omitted in a first reading of the book.
We now turn to the following quadratic necessary Condition Aco for the Pontryagin
minimum obtained in Section 2.4 (see Theorem 2.43):
max
co
λ (z̄) ≥ 0 ∀ z̄ ∈ K.
M0

As was already noted, it is slightly weaker than the necessary condition of Theorem 2.4,
since in Condition A, we have the set M0 , which is more narrow than the set M0co . However,
we will show that the obtained necessary condition remains valid under the replacement
of M0co by M0 , i.e., the necessary Condition A holds. We thus will complete the proof of
Theorem 2.4.
The passage to the auxiliary problem in [79, Part 1, Chapter 2, Section 7] and the
trajectory of this problem corresponding to the index ζ chosen in a special way allows us to
do this. For this trajectory, we write the necessary Condition Aco in the auxiliary problem
with the subsequent transform of this condition into the initial problem. Such a method
was already used in [79, Section 7, Part 1] in proving the maximum principle. We use the
notation, the concepts, and the results of [79, Section 7, Part 1], briefly mentioning the main
constructions. We stress that in contrast to [79, Section 7, Part 1], all the constructions here
refer to the problem on a fixed closed interval of time [t0 , tf ]. We write the condition that
the endpoints of the closed interval of time [t0 , tf ] are fixed as follows: t0 = t00 , tf = tf0 .
Therefore, let us consider the problem (2.1)–(2.4) in the form which corresponds to the
general problem considered in [79, Section 7, Part 1],
J (x0 , xf ) → min, F (x0 , xf ) ≤ 0, K(x0 , xf ) = 0,
(2.169)
(x0 , xf ) ∈ P , t0 − t00 = 0, tf − tf0 = 0;
dx
= f (t, x, u), (t, x, u) ∈ Q, (2.170)
dt
2.6. Completing the Proof of Theorem 2.4 103

where x0 = x(t0 ), xf = x(tf ), and let

ŵ0 = (x̂ 0 (t), û0 (t) | t ∈ [t0 , tf ]) (2.171)


be a Pontryagin minimum point in this problem. (Here the components x 0 and u0 of the
pair w0 are denoted by x̂ 0 and û0 , respectively, as in [79, Section 7, Part 1].) Then the
minimum principle holds, and hence the set M0 is nonempty.

2.6.2 Two Cases


We have the following two possibilities:
(a) There exists λ ∈ M0 such that −λ ∈ M0 .
(b) There is no λ ∈ M0 such that −λ ∈ M0 , i.e., M0 ∩ (−M0 ) = ∅.
In case (a), the necessary Condition A holds trivially, since for any z̄, at least one of the
quadratic forms λ (z̄) and (−λ) (z̄) is nonnegative. Therefore, we consider case (b).
As in [79, Section 7, Part 1], for a given number N , we denote by ζ = (t i , uik ) a vector
2
in R × RN d(u) with components t i ∈ (t00 , tf0 ), i = 1, . . . , N , and uik ∈ Rd(u) , i, k = 1, . . . , N ,
N

such that
t i < t i+1 , i = 1, . . . , N − 1; (t i , x̂ 0 (t i ), uik ) ∈ Q, i, k = 1, . . . , N .
Here and below in this section, the fixed time interval is denoted by [t00 , tf0 ], while all t i ,
i = 1, . . . , N are internal points of this interval. Denote by D() the set of all ζ satisfying
the condition t i ∈ / , i = 1, . . . , N .
Further, recall the definition of the set ζ in [79, Section 7, Part 1]. For the problem
(2.169), (2.170), it consists of tuples μ = (α0 , α, β) such that

α0 ≥ 0, α ≥ 0, αF (x̂00 , x̂f0 ) = 0, α0 + αi + |β| = 1, (2.172)

and, moreover, there exist absolutely continuous functions ψ̂x (t) and ψ̂t (t) such that
ψ̂x (t00 ) = −lx 0 , ψ̂x (tf0 ) = lx f , (2.173)
d ψ̂x
− = ψ̂x (t)fx (t, x̂ 0 (t), û0 (t)),
dt (2.174)
d ψ̂t
− = ψ̂x (t)ft (t, x̂ 0 (t), û0 (t)),
dt
 t0
f  
ψ̂x (t)f (t, x̂ 0 (t), û0 (t)) + ψ̂t (t) dt = 0, (2.175)
t00

ψ̂x (t i )f (t i , x̂ 0 (t i ), uik ) + ψ̂t (t i ) ≥ 0, i, k = 1, . . . , N . (2.176)


Here,
l = l(x0 , xf , α0 , α, β) = α0 J (x0 , xf ) + αF (x0 , xf ) + βK(x0 , xf ) (2.177)
and the gradients lx 0 and lx f in the transversality conditions (2.173) are taken at the point
(x̂00 , x̂10 , α0 , α, β) (note that the components ψ of the tuple λ are denoted by ψx here). Let
"
= ζ ,
ζ
104 Chapter 2. Quadratic Conditions in the Calculus of Variations

where the intersection is taken over all subscripts ζ . At the end of Section 7 in [79, Part 1],
we have shown that elements μ ∈  satisfy the following minimum principle:

ψ̂x (t)f (t, x̂ 0 (t), u) + ψ̂t (t) ≥ 0 if t ∈ [t00 , tf0 ], u ∈ Rd(u) , (t, x 0 (t), u) ∈ Q; (2.178)

ψ̂x (t)f (t, x̂ 0 (t), û0 (t)) + ψ̂t (t) = 0 a.e. on [t00 , tf0 ]. (2.179)
By continuity, the latter condition extends to all points of the set [t00 , tf0 ] \ . This implies
 ⊂ N0 , where N0 is the projection of the set M0 under the injective mapping λ = (α0 , α,
β, ψx )  → μ = (α0 , α, β). We now consider the set
"
() = ζ .
ζ ∈D()

Clearly, elements μ ∈ () satisfy condition (2.178) at all points t ∈ [t00 , tf0 ] \ ; however,
by continuity, this condition extends to all points of the interval [t00 , tf0 ]. Consequently,
() = , and, therefore, "
ζ ⊂ N0
ζ ∈D()

(in fact, we have an equality here). In case (b) the following assertion holds.

Proposition 2.84. There exists a subscript ζ such that ζ ∩ (−ζ ) = ∅, and, moreover, the
following condition holds for all instants of time t i , i = 1, . . . , N , entering the definition of
the subscript ζ : t i ∈
/ , i = 1, . . . , N .

Proof. Assume that the proposition does not hold. Then each of the sets

Fζ := ζ ∩ (−ζ ), ζ ∈ D(),

is not empty. These sets compose a centered system of nonempty compact sets, and hence
their intersection "
F := Fζ
ζ ∈D()

is nonempty. Moreover, F ⊂ () ⊂ N0 , and the condition Fζ = −Fζ for all ζ ∈ D()
implies F = −F . Let μ = (α0 , α, β) ∈ F . Then μ ∈ N0 and (−μ) ∈ N0 ; therefore, we
have λ ∈ M0 and (−λ) ∈ M0 for the corresponding element λ. But this contradicts case (b)
considered. The proposition is proved.

2.6.3 Problem ZN and Trajectory κN


Fix the subscript ζ from Proposition 2.84. For a given N , consider Problem ZN on a fixed
closed interval of time [τ0 , τf ], where τ0 = 0, τf = tf − t0 + N 2 . Problem ZN has the form

J (x(τ0 ), x(τf )) → inf ;


F (x(τ0 ), x(τf )) ≤ 0, K(x(τ0 ), x(τf )) = 0, (x(τ0 ), x(τf )) ∈ P , (2.180)
t(τ0 ) − t00 = 0, t(τf ) − tf0 = 0, −z(τ0 ) ≤ 0,
2.6. Completing the Proof of Theorem 2.4 105

dx dt dz
= (ϕ(η)z)f (t, x, u); = ϕ(η)z; = 0;
dτ dτ dτ (2.181)
(t, x, u) ∈ Q, η ∈ Q1 .
Here, z and η are of dimension N 2 + 1 and have the components

zθ , zik i, k = 1, . . . , N , and ηθ , ηik , i, k = 1, . . . , N ,

respectively.2 The open set Q1 is the union of disjoint neighborhoods Qθ and Qik of the
points eθ and eik , i, k = 1, . . . , N , respectively, which are the standard basis of RN +1 (eθ has
2

θ ik
the unit component eθ , and other components of e are zero, while e has the unit component
eik , and other components of eik are zero), and φ(η) : Q1 → Rd(η) is a function mapping
each of the mentioned neighborhoods into the element of the basis whose neighborhood
it is. Note that the functions u(τ ) and η(τ ) are controls, while the functions z(τ ), x(τ ),
and t(τ ) are state variables in Problem ZN .
Recall the definition of the point κ ζ = (z0 (τ ), x 0 (τ ), t 0 (τ ), u0 (τ ), η0 (τ )) (in Problem
ZN on the closed interval [τ0 , τf ]) corresponding to the subscript ζ and the trajectory
(x̂ 0 (t), û0 (t)), t ∈ [t00 , tf0 ].
We “insert” N closed intervals of unit length adjusting to each other into each point t i
of the closed interval [t00 , tf0 ]; thus, we enlarge the length of the closed interval by N 2 . Place
the left endpoint of the new closed interval at zero. We obtain the closed interval [τ0 , τf ]
(τ0 = 0) with N 2 closed intervals (denoted by ik , i, k = 1, . . . , N ) placed on it; moreover,
i1 , . . . , iN are closed intervals adjusted to each other, located in the same order, and
corresponding to the point t i , i = 1, . . . , N . We set


n
E = (τ0 , τf ) \ ij .
i,j =1

Let χE and χij be the characteristic functions of the sets E and ij , respectively. We set

zθ0 = 1, 0
zij = 0, i, j = 1, . . . , N ,

i.e., z0 = eθ . Further, we set



η0 (τ ) = eθ χE (τ ) + eij χij (τ ).
i,j

Since ϕ(η0 ) = η0 , we have ϕ(η0 )z0 = η0 z0 = η0 eθ = χE . Define t 0 (τ ) by the conditions

dt 0
= ϕ(η0 )z0 , t 0 (τ0 ) = t00 .

Then t 0 (τf ) = tf0 , since meas E = tf0 − t00 . We set x 0 (τ ) = x̂ 0 (t 0 (τ )), u0 (τ ) = û0 (t 0 (τ )). As
was shown in [79, Part 1, Section 7, Proposition 7.1], the point κ ζ defined in such a way
2 We preserve notation accepted in [79, Section 7, Part 1], where z and η were used to denote “zero-
θ θ
components” of vectors z and η, respectively.
106 Chapter 2. Quadratic Conditions in the Calculus of Variations

that it is a Pontryagin minimum point in Problem ZN on the fixed closed interval [τ0 , τf ].
In what follows, all the functions and sets related to N or ζ are endowed with the indices
N or ζ , respectively.
Since κ ζ yields the Pontryagin minimum in Problem ZN , the necessary Condition
A co ζ holds for it in this problem. We show that the necessary Condition Aζ also holds for
ζ ζ
the chosen index ζ . For this purpose, in Problem ZN , we consider the sets ζ , 0 , co 0 ,
ζ co ζ
M0 , and M0 for the trajectory κ ζ and find the relations between them. The definition
of the set ζ was given in [79, Section 7, Part 1], and the other sets were defined in [79,
Part 2].

2.6.4 Condition Aζ
The function l N has the form
lN = α0 J + αF + βK − αz z0 + βt0 (t0 − t00 ) + βtf (tf − tf0 )
= l − αz z0 + βt0 (t0 − t00 ) + βtf (tf − tf0 ).

The Pontryagin function H N has the form

H N = ψx (ϕ(η)z)f (t, x, u) + ψt (ϕ(η)z) + ψz · 0 = (ϕ(η)z)(H + ψt ) + ψz · 0,


ζ
where H = ψx f (t, x, u). The set 0 consists of tuples

λN = (α0 , α, β, αz , βt0 , βtf , ψx (τ ), ψt (τ ), ψz (τ )) (2.182)

such that

α0 ≥ 0, α ≥ 0, αz ≥ 0, αF (x00 , xf0 ) = 0, αz z0 (τ0 ) = 0, (2.183)


 
α0 + αi + αzi + |β| + |βt0 | + |βtf | = 1, (2.184)
ψx (τ0 ) = −lx 0 , ψx (τf ) = lx f , (2.185)
ψt (τ0 ) = −βt0 , ψt (τf ) = βtf , (2.186)
ψz (τ0 ) = αz , ψz (τf ) = 0, (2.187)
dψx  0 
− = ϕ(η (τ ))z0 ψx (τ )fx (t 0 (τ ), x 0 (τ ), u0 (τ )), (2.188)

dψt  0 
− = ϕ(η (τ ))z0 ψx (τ )ft (t 0 (τ ), x 0 (τ ), u0 (τ )), (2.189)

dψz  
− = ϕ(η0 (τ )) ψx (τ )f (t 0 (τ ), x 0 (τ ), u0 (τ )) + ψt (τ ) , (2.190)

 0 
ϕ(η (τ ))z0 ψx (τ )fu (t 0 (τ ), x 0 (τ ), u0 (τ )) = 0. (2.191)

The gradients lx 0 and lx f are taken at the point (x00 , xf0 , α0 , α, β).
In [79, Section 7, Part 1], we have shown that there is the following equivalent
ζ
normalization for the set 0 :

α0 + αi + |β| = 1 (2.192)
2.6. Completing the Proof of Theorem 2.4 107

(the conditions α0 = 0, α = 0, and β = 0, and also conditions (2.183), (2.185)–(2.191)


ζ
imply αz = 0, βt0 = 0, and βtf = 0). Therefore, in the definition of 0 , we can replace
normalization (2.184) by the equivalent normalization (2.192). In this case, the quadratic
Condition Aco ζ remains valid. Assume that we have made this replacement. The new set
ζ
is denoted by 0 as before. In [79, Section 7, Part 1], it was also shown that the element
ζ
(α0 , α, β) ∈ ζ corresponds to an element λN ∈ 0 and has the same components α0 , α,
ζ
and β, i.e., the projection λN  → (α0 , α, β) maps 0 into ζ .

ζ
Proposition 2.85. The convex hull co 0 does not contain zero.

ζ
Proof. Assume that this is not true. Then there exist an element λN ∈ 0 and a number
ζ
ρ > 0 such that −ρλN ∈ 0 . This implies that all nonnegative components α0 , α, and αz of
ζ ζ
the element λN (see (2.182)) vanish. But then the condition λN ∈ 0 implies −λN ∈ 0 ,
i.e., we may set ρ = 1.
Let μN = (α0 , α, β) = (0, 0, β) be the projection of the element λN . Then μN and
−μ belong to ζ . But the existence of such a μN contradicts the choice of the index ζ .
N

Therefore, the assumption that 0 ∈ co ζ is wrong. The proposition is proved.

Proposition 2.85 implies the following assertion.

ζ ζ
Corollary 2.86. For any λN ∈ co 0 , there exists ρ > 0 such that ρλN ∈ 0 .

ζ ζ
Proof. Let λN ∈ co 0 . Then by Proposition 2.85, λN  = 0. Obviously, co 0 is contained
ζ ζ ζ def
in the cone con 0 spanned by 0 . The conditions λN ∈ con 0 , λN  = 0 imply ν(λN ) =
α0 + |α| + |β| > 0 (since ν = 1 is a normalization condition). We set
λN
λ̃N = .
ν(λN )
ζ ζ
Then λ̃N ∈ con 0 and ν(λ̃N ) = 1. Therefore, λ̃N ∈ 0 . It remains to set ρ = 1/ν(λN ).
The proposition is proved.
co ζ ζ
Corollary 2.86 and the definitions of the sets M0 and M0 imply the following
assertion.
co ζ ζ
Corollary 2.87. Let λN ∈ M0 . Then there exists ρ > 0 such that ρλN ∈ M0 .

The condition Aco ζ for the point κ ζ in Problem ZN has the form
max ζ (λN ; z̄N ) ≥ 0 ∀ z̄N ∈ K ζ .
co ζ
M0

Here, K ζ is the critical cone and ζ is the quadratic form of Problem ZN at the point κ ζ .
Let us show that this implies Condition Aζ :
max ζ (λN ; z̄N ) ≥ 0 ∀ z̄N ∈ K ζ .
ζ
M0
108 Chapter 2. Quadratic Conditions in the Calculus of Variations

co ζ
Indeed, let z̄N ∈ K ζ . Condition Aco ζ implies the existence of λN ∈ M0 such that

ζ (λN ; z̄N ) ≥ 0. (2.193)


ζ
According to Corollary 2.87, there exists ρ > 0 such that λ̃N = ρλN ∈ M0 . Multiplying
(2.193) by ρ > 0, we obtain ζ (λ̃N ; z̄N ) ≥ 0. Hence, maxM ζ ζ (·, z̄N ) ≥ 0. Since z̄N is an
0
arbitrary element in K ζ , this implies Condition Aζ . Thus, we have proved the following
lemma.

Lemma 2.88. Let ŵ0 be a Pontryagin minimum point in the problem (2.169), (2.170), and
let M0 ∩ (−M0 ) = ∅. Then there exists a superscript ζ ∈ D() such that Condition Aζ
holds.

In what follows, we fix a superscript ζ ∈ D() such that Condition Aζ holds. Now
our goal is to reveal which information about the trajectory ŵ 0 can be extracted from
Condition Aζ of superscript ζ . We show that Condition Aζ implies Condition A at the point
ŵ0 in the initial problem. For this purpose, consider in more detail the definitions of the
ζ
set M0 , cone K ζ , and quadratic form ζ at the point κ ζ in Problem ZN .

ζ
2.6.5 Relation Between the Sets M0 and M0
ζ ζ ζ
Consider the conditions defining the set M0 . By definition, M0 is the set of λN ∈ 0 such
that the following inequality holds for all τ in the closed interval [τ0 , τf ], except for a finite
set of discontinuity points of the controls u0 (τ ) and η0 (τ ):
 
(ϕ(η)z0 ) ψx (τ )f (t 0 (τ ), x 0 (τ ), u) + ψt (τ )
  (2.194)
≥ (ϕ(η0 )z0 ) ψx (τ )f (t 0 (τ ), x 0 (τ ), u0 (τ )) + ψt (τ )

for all u ∈ Rd(u) such that (t 0 (τ ), x 0 (τ ), u) ∈ Q and all η ∈ Q1 . Let us analyze condition
(2.194). Choose a function η = η(τ ) ∈ Q1 so that the following condition holds:

ϕ(η(τ ))z0 = 0. (2.195)

Such a choice is possible, since the condition z0 = eθ and the definition of the function ϕ
imply

1, η ∈ Qθ ,
ϕ(η)z = ϕθ (η) =
0
0, η ∈ / Qθ ,

and, therefore, we may set η(τ ) = η∗ , where η∗ is an arbitrary point in Q1 \ Qθ , for example,
η∗ = e11 . Therefore, condition (2.195) holds for η(τ ) ≡ e11 . It follows from (2.194) and
(2.195) that the right-hand side of inequality (2.194) is nonpositive. But the integral of it
over the interval [τ0 , τf ] vanishes (this was shown in [79, Section 7, Part 1]; moreover, this
follows from the adjoint equation (2.190), conditions (2.183), and the transversality condi-
tions (2.187) considered for the component ψzθ only). But if the integral of a nonpositive
2.6. Completing the Proof of Theorem 2.4 109

function over a closed interval vanishes, then this function equals zero almost everywhere
on this closed interval. Hence
 
(ϕ(η0 (τ ))z0 ) ψx (τ )f (t 0 (τ ), x 0 (τ ), u0 (τ )) + ψt (τ ) = 0 (2.196)

a.e. on [τ0 , τf ]. Further, setting η = η0 (τ ) in (2.194) and taking into account (2.196), we
obtain
 
(ϕ(η0 (τ ))z0 ) ψx (τ )f (t 0 (τ ), x 0 (τ ), u) + ψt (τ ) ≥ 0 if (t 0 (τ ), x 0 (τ ), u) ∈ Q. (2.197)

This condition also holds for almost all τ ∈ [τ0 , τf ].


We may rewrite conditions (2.196) and (2.197) for the independent variable t. Recall
that in [79, Section 7, Part 1], we have denoted by Ê the image of the set E under the
mapping t 0 (τ ). Also, we note that t 0 (τ ) defines a one-to-one and bi-absolutely continuous
correspondence between E and Ê , and, moreover, [t00 , tf0 ] \ Ê is a finite set of points {t i }N
i=1 ,
and hence Ê is of full measure in [t00 , tf0 ]. We have denoted by τ 0 (t) the inverse function
mapping Ê onto E . The function τ 0 (t) monotonically increases on Ê . Let us extend it to
the whole closed interval [t00 , tf0 ] so that the extended function is left continuous. As before,
this function is denoted by τ 0 (t). We set

ψ̂x (t) = ψx (τ 0 (t)), ψ̂t (t) = ψt (τ 0 (t)). (2.198)

We note that
x̂ 0 (t) = x 0 (τ 0 (t)), û0 (t) = u0 (τ 0 (t)). (2.199)
The first equation holds on [t00 , tf0 ], and the second holds at every continuity point of the
function û0 (t), i.e., on the set [t00 , tf0 ] \ . Also, recall that ϕ(η0 (τ ))z0 = χE (τ ), and hence

ϕ(η0 (τ 0 (t)))z0 = χE (τ 0 (t)) = 1 (2.200)

a.e. on [t00 , tf0 ]. Setting τ = τ 0 (t) in conditions (2.196) and (2.197) and taking into account
(2.198)–(2.200), for almost all t ∈ [t00 , tf0 ], we obtain

ψ̂x (t)f (t, x̂ 0 (t), û0 (t)) + ψ̂t (t) = 0, (2.201)


ψ̂x (t)f (t, x̂ (t), u) + ψ̂t (t) ≥ 0
0
(2.202)

if (t, x̂ 0 (t), u) ∈ Q, u ∈ Rd(u) . Condition (2.201), which holds a.e. on [t00 , tf0 ], also holds at
every continuity point of the function û0 (t), i.e., on [t00 , tf0 ] \ ; condition (2.202) holds for
all t ∈ [t00 , tf0 ], since all functions entering this condition are continuous.
In [79, Section 7, Part 1], we have proved that equations (2.188) and (2.189) imply
the equations

d ψ̂x
− = ψ̂x (t)fx (t, x̂ 0 (t), û0 (t)), (2.203)
dt
d ψ̂t
− = ψ̂t (t)ft (t, x̂ 0 (t), û0 (t)). (2.204)
dt
110 Chapter 2. Quadratic Conditions in the Calculus of Variations

In proving this, we use the change τ = τ 0 (t) and the condition

dt 0
= ϕ(η0 (τ ))z0 .

Finally, the transversality conditions (2.185) imply the following transversality conditions:

ψ̂x (t00 ) = −lx 0 (x̂00 , x̂10 ), ψ̂x (tf0 ) = lx f (x̂00 , x̂10 ), (2.205)

since t 0 (τ0 ) = t00 and t 0 (τf ) = tf0 . Conditions (2.201)–(2.205) and conditions (2.183)
and (2.192), which hold for a tuple (α0 , α, β, ψx (t), ψt (t)), imply that its projection
(α0 , α, β, ψx (t)) belongs to the set M0 of the problem (2.169), (2.170) at the point ŵ 0 (t).
Therefore, for the superscript ζ indicated in Lemma 2.88 and corresponding function τ 0 (t)
(defined above), we have proved the following assertion.

Lemma 2.89. Let a tuple λN = (α0 , α, β, αz , βt0 , βtf , ψx (τ ), ψt (τ ), ψz (τ )), belong to the
ζ
set M0 of Problem ZN at the point κ ζ . We set ψ̂x (t) = ψx (τ 0 (t)). Then the tuple λ =
(α0 , α, β, ψ̂x (t)) belongs to the set M0 of the problem (2.169), (2.170) at the point ŵ0 .

2.6.6 Critical Cone K ζ and Its Relation to the Critical Cone K


The discontinuity points τk = τ 0 (tk ), k = 1, . . . , s, of the function u0 (τ ) = û0 (t 0 (τ )) cor-
respond to the discontinuity points tk ∈ , k = 1, . . . , s, of the function û0 (t). We set
ζ = {τk }sk=1 . The condition ζ ∈ D() implies ζ ⊂ E . Further, let  ˜ ζ = {τ̃i }s̃ be the
i=1
set of discontinuity points of the control η (τ ). The definition of the function η0 (τ ) implies
0

that ˜ ζ does not intersect the open set E . Therefore, the sets ζ and  ˜ ζ are disjoint. We
denote their union by (ζ ).
By definition, the critical cone K ζ for the trajectory κ ζ in Problem ZN consists of
the tuples
z̄N = (ξ̄ , ξ̃ , t¯(τ ), x̄(τ ), z̄(τ ), ū(τ ), η̄(τ )) (2.206)
such that

x̄ ∈ P(ζ ) W 1,2 ([τ0 , τf ], Rd(x) ), t¯ ∈ P(ζ ) W 1,2 ([τ0 , τf ], R1 ),


z̄ ∈ P(ζ ) W 1,2 ([τ0 , τf ], Rd(z) ), ū ∈ L2 ([τ0 , τf ], Rd(u) ),
η̄ ∈ L2 ([τ0 , τf ], Rd(η) ), ξ̄ ∈ Rs , ξ̃ ∈ Rs̃ , (2.207)
Jp p̄ ≤ 0, Fip p̄ ≤ 0 (i ∈ I ), Kp p̄ = 0, (2.208)

where p̄ = (x̄(τ0 ), x̄(τf )) and the gradients Jp , Fip , and Kp are taken at the point
(x̂00 , x̂10 ) = p̂ 0 ,
t¯(τ0 ) = 0, t¯(τf ) = 0, (2.209)
d x̄  
= (ϕ(η0 (τ ))z0 ) ft t¯(τ ) + fx x̄(τ ) + fu ū(τ )

+ ((ϕη (η0 (τ ))η̄(τ ))z̄0 )f (t 0 (τ ), x 0 (τ ), u0 (τ )) (2.210)
+ (ϕ(η0 (τ ))z̄(τ ))f (t 0 (τ ), x 0 (τ ), u0 (τ )),
2.6. Completing the Proof of Theorem 2.4 111

where the gradients fx , fu , and ft are taken at the trajectory (t 0 (τ ), x 0 (τ ), u0 (τ )),


d t¯ d z̄
= (ϕη (η0 (τ ))η̄(τ ))z0 + ϕ(η0 (τ ))z̄, = 0, (2.211)
dτ dτ
[x̄](τk ) = [(ϕ(η0 )z0 )f (t 0 , x 0 , u0 )](τk )ξ̄k , [t¯](τk ) = [ϕ(η0 )z0 ](τk )ξ̄k ,
(2.212)
z̄(τk ) = 0, k = 1, . . . , s,
[x̄](τ̃i ) = [(ϕ(η0 )z0 )f (t 0 , x 0 , u0 )](τ̃i )ξ̃i , [t¯](τ̃i ) = [ϕ(η0 )z0 ](τ̃i )ξ̃i ,
[z̄](τ̃i ) = 0, i = 1, . . . , s̃. (2.213)
Here, [ · ](τk ) is the jump at the point τk , and [·](τ̃i ) is the jump at the point τ̃i . We set
t¯(τ ) = 0, z̄(τ ) = 0, η̄(τ ) = 0, ξ̃ = 0. (2.214)
ζ
These conditions define the subcone K0 of the cone K ζ such that the following conditions
hold:
x̄(τ ) ∈ Pζ W 1,2 ([τ0 , τf ], Rd(x) ), ū(τ ) ∈ L2 ([τ0 , τf ], Rd(u) ), ξ̄ ∈ Rs , (2.215)
Jp p̄ ≤ 0, Fip p̄ ≤ 0 (i ∈ I ), Kp p̄ = 0, (2.216)
d x̄ 
= (ϕ(η0 (τ ))z0 ) fx (t 0 (τ ), x 0 (τ ), u0 (τ ))x̄(τ )


+fu (t 0 (τ ), x 0 (τ ), u0 (τ ))ū(τ ) ; (2.217)
# $
[x̄](τk ) = (ϕ(η0 )z0 )f (t 0 , x 0 , u0 ) (τk )ξ̄k , k = 1, . . . , s. (2.218)
The following assertion holds.

Lemma 2.90. Let  


z̄ˆ = ξ̄ˆ , x̄(t),
ˆ ū(t)
ˆ (2.219)
be an arbitrary element of the critical cone K of the problem (2.169), (2.170) at the
point ŵ 0 . We set
ξ̄ = ξ̄ˆ , x̄(τ ) = x̄(t
ˆ 0 (τ )), ū(τ ) = ū(t
ˆ 0 (τ )),
(2.220)
t¯(τ ) = 0, z̄(τ ) = 0, η̄(τ ) = 0, ξ̃ = 0.
Then
z̄N = (ξ̄ , ξ̃ , t¯(τ ), x̄(τ ), z̄(τ ), ū(τ ), η̄(τ )) (2.221)
ζ
is an element of the cone ⊂ K0 where Kζ , Kζ
is the critical cone of Problem ZN at the
ζ
point κ , and K0 is defined by conditions (2.214)–(2.218).
ζ

Proof. Let z̄ˆ be an arbitrary element of the critical cone K of the problem (2.169), (2.170)
at the point ŵ0 having the form (2.219). Then by the definition of the cone K, we have

ξ̄ˆ ∈ Rs , ˆ ∈ P W 1,2 ([t00 , tf0 ], Rd(x) ),


x̄(t) ˆ ∈ L2 ([t00 , tf0 ], Rd(u) ),
ū(t) (2.222)
Jp (p̂0 )p̄ˆ ≤ 0, Fip (p̂0 )p̄ˆ ≤ 0 (i ∈ I ), Kp (p̂0 )p̄ˆ = 0, (2.223)
ˆ
d x̄(t) ˆ + fu (t, x̂ 0 (t), û0 (t))ū(t),
= fx (t, x̂ 0 (t), û0 (t))x̄(t) ˆ (2.224)
dt
ˆ k ) = [f (·, x 0 , u0 )](tk )ξ̄ˆk , k = 1, . . . , s.
[x̄](t (2.225)
112 Chapter 2. Quadratic Conditions in the Calculus of Variations

Let conditions (2.220) hold. We show that all conditions (2.214)–(2.218) defining the cone
ζ
K0 hold for the element z̄N (having form (2.221)).
Conditions (2.214) follow from (2.220). Conditions (2.215) follow from (2.222).
ˆ
Indeed, the function x̄(t) is piecewise absolutely continuous, and the function t 0 (τ ) is
ˆ 0 (τ )) is a piecewise absolutely continuous function
Lipschitz continuous. Hence x̄(τ ) = x̄(t
whose set of discontinuity points is contained in ζ . Further,
 
ˆ 0 (τ )) d x̄ˆ 
d x̄(τ ) d x̄(t dt 0 (τ ) d x̄ˆ 
= =  · = χE (τ )  . (2.226)
dτ dτ dt  0 dτ dt  0
t=t (τ ) t=t (τ )

dt 0
Since χE2 = χE = dτ , we have
  2   2   2
τf d x̄(τ ) τf d x̄ˆ 0 dt 0 (τ ) tf0 ˆ
d x̄(t)
dτ = (t (τ )) dτ = dt < +∞.
τ0 dτ τ0 dt dτ t00 dt

Hence the derivative d x̄/dτ is square Lebesgue integrable. Therefore,


x̄(·) ∈ Pζ W 1,2 ([τ0 , τf ], Rd(x) ).
Further, consider the integral
 τf  
ū(τ )2 dτ = ū(τ )2 dτ + ū(τ )2 dτ .
τ0 E [τ0 ,τf ]\E

The function t 0 (τ ), and hence the function ū(t ˆ 0 (τ )) = ū(τ ), assumes finitely many values
on [τ0 , τf ] \ E ; hence the second integral in the sum is finite. For the first integral, we have
   
dt 0 (τ ) 0
ˆ 0 (τ ))2 dt (τ ) dτ
ū(τ )2 dτ = ū(τ )2 χE dτ = ū(τ )2 dτ = ū(t
E E E dτ E dτ
 
= ˆ 2 dt =
ū(t) ˆ 2 dt < +∞,
ū(t)
Ê [t00 ,tf0 ]
τ
since ūˆ is Lebesgue square integrable. Hence, τ0f ū(τ )2 dτ < +∞, i.e., ū(·) ∈
L2 ([τ0 , tf ], Rd(u) ). Further, condition (2.223) implies condition (2.216), since t 0 (τ0 ) = t00 ,
ˆ 0 ), x̄(τf ) = x̄(t
t 0 (τf ) = tf0 , and, therefore, x̄(τ0 ) = x̄(t ˆ 0 ). Consider the variational equation
0 f
(2.224). Making the change t = t 0 (τ ) in it, multiplying by χE (τ ), and taking into account
(2.226), we obtain
d x̄
= χE (τ )(fx (t 0 (τ ), x 0 (τ ), u0 (τ ))x̄(τ ) + fu (t 0 (τ ), x 0 (τ ), u0 (τ ))ū(τ )).

But χE (τ ) = ϕ(η0 (τ ))z0 . Therefore, the variational equation (2.217) holds for x̄ and ū.
Finally, we show that the jump conditions (2.218) hold. Note that
tk = t 0 (τk ), τk ∈ E , k = 1, . . . , s. (2.227)
Consequently, each τk is a continuity point of the function η0 (τ ) and
ϕ(η0 (τk ))z0 = 1, k = 1, . . . , s. (2.228)
2.6. Completing the Proof of Theorem 2.4 113

It follows from (2.227) and (2.228) that

[(ϕ(η0 )z0 )f (t 0 , x 0 , u0 )](τk ) = [f (·, x̂ 0 , û0 )](tk ), k = 1, . . . , s. (2.229)

Analogously,
ˆ k ),
[x̄](τk ) = [x̄](t k = 1, . . . , s. (2.230)

Conditions (2.218) follow from (2.225), (2.229), and (2.230) and the relation ξ̄ = ξ̄ˆ . There-
ζ
fore, all the conditions defining the cone K0 hold for the tuple z̄N .

2.6.7 Quadratic Form ζ and its Relation to the Quadratic Form 


ζ ζ
Let λN ∈ M0 and z̄N ∈ K0 ; hence let condition (2.214) hold for z̄N . The value of the
quadratic form ζ (corresponding to the tuple of Lagrange multipliers λN , at the point κ ζ
in Problem ZN ) at the element z̄N is denoted by ζ (λN ; z̄N ). Taking into account conditions
(2.214), by definition, we obtain


s
2ζ (λN , z̄N ) = (D k (H N )ξ̄k2 + 2[HxN ]k x̄av
k
ξ̄k )
k=1  τf (2.231)

+lpp p̄, p̄ + Hww
N
w̄(τ ), w̄(τ ) dτ .
τ0

Here,

p̄ = (x̄(τ0 ), x̄(τf )), lpp = lpp (x̂00 , x̂f0 ; α0 , α, β), (2.232)


N
Hww = (ϕ(η0 (τ ))z0 )Hww (t 0 (τ ), x 0 (τ ), u0 (τ ), ψx (τ ))

= χE (τ )Hww (t 0 (τ ), x 0 (τ ), u0 (τ ), ψx (τ )). (2.233)

Further, [HxN ]k = [HxN ](τk ), k = 1, . . . , s, where HxN = χE (τ )Hx (t 0 (τ ), x 0 (τ ), u0 (τ ), ψx (τ )).


Let ψ̂x (t) = ψx (τ 0 (t)), Hx = Hx (t, x̂ 0 (t), û0 (t), ψ̂x (t)), [Hx ]k = [Hx ](tk ), k = 1, . . . , s. Tak-
ing into account that

χE (τk ) = 1, ψx (τk ) = ψ̂x (tk ),


x 0 (τk ) = x̂ 0 (tk ), u0 (τk −) = û0 (tk −), u0 (τk +) = û0 (tk +),
t 0 (τk ) = tk , k = 1, . . . , s,

we obtain [HxN ](τk ) = [Hx ](tk ) = [Hx ]k , k = 1, . . . , s. Thus,

[HxN ]k = [Hx ]k , k = 1, . . . , s. (2.234)

Finally, by definition,


d N 
D (H ) = − (k H )
k N
, k = 1, . . . , s.
dτ 
τ =τk
114 Chapter 2. Quadratic Conditions in the Calculus of Variations

Since τk is a continuity point of the function η0 (τ ) and ϕ(η0 (τk ))z0 = 1, we have

(k H N )(τ )  
= (ϕ(η0 (τk ))z 
0 )ψ (τ ) f (t 0 (τ ), x 0 (τ ), u0 (τ +)) − f (t 0 (τ ), x 0 (τ ), u0 (τ −))
x k k 
= ψ̂x (t 0 (τ )) f (t 0 (τ ), x̂ 0 (t 0 (τ )), û0 (tk +)) − f (t 0 (τ )), x̂ 0 (t 0 (τ )), û0 (tk −))
= (k H )(t 0 (τ )).

Consequently,
   
d N  d  dt 0 
D (H ) = − (k H )
k N
=− 
(k H )  ·
dτ τ =τk dt 
t=tk dτ τ =τk
= D k (H )χE (τk ) = D k (H ). (2.235)

Let z̄ˆ = (ξ̄ˆ , w̄) ˆ = (ξ̄ˆ , x̄(t),


ˆ ū(t))
ˆ be an arbitrary element of the critical cone K. Let
z̄N= (ξ̄ , ξ̃ , t¯, x̄, z̄, ū, η̄) be the tuple defined according to z̄ˆ by using formulas (2.220). Then
ζ
by Lemma 2.90, z̄N ∈ K0 . According to (2.233),
 tf0
ˆ
Hww (t, x̂ 0 (t), û0 (t), ψ̂(t))w̄(t), ˆ
w̄(t) dt
t00
 τf
ˆ 0 (τ )), w̄(t
ˆ 0 (τ )) dt 0 (τ )
= Hww (t 0 (τ ), x̂ 0 (t 0 (τ )), û0 (t 0 (τ )), ψ̂x (t 0 (τ )))w̄(t dτ
τ0 dτ
 τf
= Hww (t 0 (τ ), x 0 (τ ), u0 (τ ), ψx (τ ))w̄(τ ), w̄(τ ) χE (τ ) dτ
τ
 0τf
= Hww
N
w̄(τ ), w̄(τ ) dτ . (2.236)
τ0

ζ
Let λ = (α0 , α, β, ψ̂x (t)) be the element corresponding to the tuple λN ∈ M0 , where
ψ̂x (t) = ψx (τ 0 (t)). Then λ ∈ M0 according to Lemma 2.89. Recall that by definition, the
quadratic form λ (z̄) ˆ for the problem (2.169), (2.170) at the point ŵ0 corresponding to the
tuple λ of Lagrange multipliers and calculated at the element z̄ˆ has the form


s  tf0
ˆ =
2λ (z̄) k ˆ
(D k (H )ξ̄ˆk2 + 2[Hx ]k x̄ˆav ˆ p̄
ξ̄k ) + lpp p̄, ˆ + ˆ w̄
Hww w̄, ˆ dt. (2.237)
k=1 t00

Here, p̄ˆ = (x̄(t


ˆ 0 ), x̄(t
0
ˆ 0 )). Note that
f

ˆ ˆ
lpp p̄, p̄ = lpp p̄, p̄ , (2.238)

ˆ 0 ) = x̄(t
since x̄(t ˆ 0 (τ0 )) = x̄(τ0 ) and x̄(t
ˆ 0 ) = x̄(t
ˆ 0 (τf )) = x̄(τf ). Formulas (2.231) and
0 f
(2.234)–(2.238), imply
ˆ = ζ (λN ; z̄N ).
λ (z̄) (2.239)

Therefore, we have proved the following assertion.


2.7. Sufficient Conditions for the Strong Minimum 115

Lemma 2.91. Let z̄ˆ be an arbitrary element (2.219) of the critical cone K, and let z̄N
ζ ζ
be element (2.221) of the cone K0 ⊂ K ζ obtained by formulas (2.220). Let λN ∈ M0 be
an arbitrary tuple (2.182), and let λ = (α0 , α, β, ψ̂x (t)) ∈ M0 be the tuple with the same
components α0 , α, β and with ψ̂x (t) = ψx (τ 0 (t)) corresponding to it by Lemma 2.89. Then
relation (2.239) holds for the quadratic forms.

2.6.8 Proof of Theorem 2.4


Thus, let ŵ0 be a Pontryagin minimum point in the problem (2.169), (2.170). Then the
set M0 is nonempty. If, moreover, M0 ∩ (−M0 )  = ∅ (case (a) in Section 2.6.2), then as
already mentioned, Condition A holds trivially. Otherwise (case (b) in Section 2.6.2), by
ζ
Lemma 2.88, there exists a superscript ζ such that Condition Aζ holds, i.e., the set M0 is
nonempty and

max ζ (λN , z̄N ) ≥ 0 ∀ z̄N ∈ K ζ . (2.240)


ζ
M0

Let us show that Condition A holds: the set M0 is nonempty and

max λ (z̄) ≥ 0 ∀ z̄ ∈ K. (2.241)


M0

Take an arbitrary element z̄ˆ ∈ K. According to Lemma 2.90, the element z̄N ∈ K0 ⊂ K ζ
ζ
ζ
corresponds to it by formulas (2.220). By (2.240), for z̄N , there exists λN ∈ M0 such that

ζ (λN , z̄N ) ≥ 0. (2.242)


ζ
The element λ ∈ M0 corresponds to λN ∈ M0 by Lemma 2.89, and, moreover, by Lemma
2.91, we have relation (2.239). It follows from (2.239) and (2.242) that
ˆ ≥ 0,
λ (z̄) λ ∈ M0 .

Since z̄ˆ is an arbitrary element of K, this implies condition (2.241). Therefore, Condition
A also holds in case (b). The theorem is completely proved.

2.7 Sufficient Conditions for Bounded Strong and Strong


Minima in the Problem on a Fixed Time Interval
2.7.1 Strong Minimum
In [79, Part 1], we considered the strong minimum conditions related to the solutions of
the Hamilton-Jacobi equation. We can say that they were obtained as a result of devel-
opment of the traditional approach to sufficient strong minimum conditions accepted in
the calculus of variations. However, it is remarkable that there exists another, nontradi-
tional approach to the strong minimum sufficient conditions using the strengthening of the
quadratic sufficient conditions for a Pontryagin minimum. Roughly speaking, the strength-
ening consists of assuming certain conditions on the behavior of the function H at infinity.
116 Chapter 2. Quadratic Conditions in the Calculus of Variations

This fact, which had been previously absent in the classical calculus of variations, was
first discovered by Milyutin when studying problems of the calculus variations and optimal
control. We use this fact in this section.
We first define the concept of strong minimum, which will be considered here. It is
slightly different from the usual concept from the viewpoint of strengthening. The usual
concept used in the calculus of variations corresponds to the concept of minimum on the set
of sequences {δw} in the space W such that
δx
C → 0. It is not fully correct to extend it
to the canonical problem without any changes. Indeed, in the classical calculus of variations,
t
it is customary to minimize an integral functional of the form J = t0f F (t, x, u) dt, where
u = ẋ. In passing to the canonical problem, we write the integral functional as the terminal
functional: J = y(tf ) − y(t0 ), but there arises a new state variable y such that ẏ = F (t, x, u).
Clearly, the requirement
δy
C → 0 must be absent in the canonical problem if we do not
want to distort the original concept of strong minimum in rewriting the problem.
How can this be taken into account if we have the canonical form in advance, and
it is not known from which problem it originates? It is easy to note that the new state
variables y arising in rewriting the integral functionals are characterized by the property
that they affinely enter the terminal functionals of the canonical form and are completely
absent in the control system of the canonical form. These variables are said to be unessential
and the other variables are said to be essential. In defining the strong minimum, we take
into account only the essential variables.
Let us give the precise definition. As before, we consider the canonical problem
(2.1)–(2.4) on a fixed closed interval of time [t0 , tf ].

Definition 2.92. A state variable xi (the component xi of a vector x) is said to be unessential


if the function f is independent of it and the functions J , F , and K affinely depend on xi0 :=
xi (t0 ), xif = xi (tf ). The state variables xi without these properties are said to be essential.
(One can also use the terms “main” (or “basic”) and “complementary” (or “auxiliary”)
variables.) Respectively, we speak about the essential components of the vector x.

Denote by x the vector composed of the essential components of the vector x. Simi-
larly, denote by δx the vector-valued function composed of the essential components of the
variation δx.
Denote by S the set of sequences {δw} in the space W such that |δx(t0 )|+
δx
C → 0.
Let us give the following definition for problem (2.1)–(2.4).

Definition 2.93. We say that w0 is a strong minimum point (with respect to the essential
state variables) if it is a minimum point on S .

In what follows, the strong minimum with respect to the essential variables will be
called the strong minimum, for brevity. By the strict strong minimum we mean the strict
minimum on S . Since  ⊂ S , the strong minimum implies the Pontryagin minimum.

2.7.2 Bounded Strong Minimum, Sufficient Conditions


We now define the concept of bounded strong minimum, which occupies an intermediate
place between the strong and Pontryagin minima.
2.7. Sufficient Conditions for the Strong Minimum 117

Definition 2.94. We say that w0 is a bounded strong minimum point if it is a minimum


point on the set of sequences {δw} in W satisfying the following conditions:
(a) |δx(t0 )| +
δx
C → 0.
(b) For each sequence there exists a compact set C ⊂ Q such that the following
condition holds starting from a certain number: (t, x 0 (t), u0 (t) + δu(t)) ∈ C a.e. on [t0 , tf ].

S
Denote by  the set of sequences {δw} in W satisfying conditions (a) and (b) and
also the following additional conditions:
(c) starting from a certain number, (p0 + δp) ∈ P , (t, w 0 + δw) ∈ Q.
(d) σ (δw) → 0, where σ is the violation function (2.164).
Conditions (c) and (d) hold on every sequence “violating the minimum.” Therefore, we
S
may treat the bounded strong minimum as a minimum on  . We will proceed in this way
in what follows, since we will need conditions (c) and (d). By the strict bounded strong
S S
minimum, we mean the strict minimum on  . Since  ⊂  ⊂ S , the strong minimum
implies the bounded strong minimum, and the latter implies the Pontryagin minimum.
A remarkable property is that the sufficient conditions obtained in Section 2.5 guaran-
tee not only the Pontryagin minimum but also the bounded strong minimum. This follows
from the theorem, which now will be proved. In what follows, w 0 is an admissible point
satisfying the standard assumptions of Section 2.1.

Theorem 2.95. For a point w 0 , let there exist an admissible function (t, u) and a constant
S
C > 0 such that the set M(C) is nonempty. Then  =  , and hence the Pontryagin
minimum is equivalent to the bounded strong minimum.

To prove this, we need several auxiliary assertions.

S t
Proposition 2.96. Let λ ∈ 0 , {δw} ∈  . Then ( t0f δH λ dt)+ → 0, where a+ = max{a, 0}
and δH λ = H (t, x 0 + δx, u0 + δu, ψ) − H (t, x 0 , u0 , ψ).

Proof. Let λ ∈ 0 , and let δw be an admissible variation with respect to Q, i.e., (t, w 0 +
δw) ∈ Q. Then
 tf
δl −
λ
ψ(δ ẋ − δf ) dt ≤ const σ (δw). (2.243)
t0

On the other hand, we have shown earlier that the conditions −ψ̇ = Hxλ , ψ(t0 ) = −lxλ0 , and
ψ(tf ) = lxλf imply
  
tf t tf tf
ψδ ẋ dt = ψδx tf − ψ̇δx dt = lp δp + Hxλ δx dt.
0
t0 t0 t0

Taking into account that ψδf = δH λ , we obtain from inequality (2.243) that
 tf  tf
δl λ − lp δp − Hxλ δx dt + δH λ dt ≤ const σ (δw). (2.244)
t0 t0
118 Chapter 2. Quadratic Conditions in the Calculus of Variations

S tf
Let {δw} ∈  . The condition
δx
C → 0 implies Hxλ δx dt → 0 and (δl λ − lp δp) → 0.
t
t0
Moreover, σ (δw) → 0. Therefore, condition (2.244) implies ( t0f δH λ dt)+ → 0. The
proposition is proved.

Using Proposition 2.96, we prove the following assertion.

Proposition 2.97. Let there exist an admissible function (t, u) and a constant C > 0 such
that the set M(C) is nonempty. Then the following condition holds for any sequence
S t
{δw} ∈  : t0f (t, u0 + δu) dt → 0.
tf
Proof. Let C > 0, and let λ ∈ M(C). According to Proposition 2.96, ( t0 δH λ dt)+ → 0.
Represent δH λ as δH λ = δ̄x H λ + δu H λ , where
δ̄x H λ = H λ (t, x 0 + δx, u0 + δu) − H λ (t, x 0 , u0 + δu),
δu H λ = H λ (t, x 0 , u0 + δu) − H λ (t, x 0 , u0 ).

The conditions
δx
C → 0, (t, x 0 , u0 + δu) ∈ C, where C ⊂ Q is a compact set, imply
t

δ̄x H λ
∞ → 0. Hence ( t0f δu H λ dt)+ → 0. Further, the condition λ ∈ M(C) implies
t t
δu H λ ≥ C(t, u0 + δu) ≥ 0. Consequently, t0f δu H λ dt ≥ C t0f (t, u0 + δu) dt ≥ 0. This
tf t
and the condition ( t0 δu H λ dt)+ → 0 imply t0f (t, u0 + δu) dt → 0. The proposition is
proved.

In what follows, (t, u) is a function admissible for the point w 0 .

Proposition 2.98. Let C ⊂ Q be a compact set, and let the variation δu ∈ L∞ (, Rd(u) ) be
such that (t, x 0 , u0 + δu) ∈ C a.e. on [t0 , tf ]. Then we have the estimate
δu
1 ≤
t
const( t0f (t, u0 + δu) dt)1/2 , where const depends only on C.

Proof. Let V be the neighborhood of the compact set u0 from the definition of the function
(t, u) in Section 2.3.7, and let V 0 and V ∗ be subsets of the neighborhood V defined in
Section 2.3.1. Represent δu as δu = δuv + δuv , where

δu if (t, u0 + δu) ∈ V,
δuv = δuv = δu − δuv .
0 otherwise,

Further, let the representation δuv = δu0 + δu∗ correspond to the partition V = V 0 ∪ V ∗ :
 
δuv if (t, u0 + δuv ) ∈ V 0 , ∗ δuv if (t, u0 + δuv ) ∈ V ∗ ,
δu =0
δu =
0 otherwise, 0 otherwise.

Then

δu
1 =
δuv
1 +
δu0
1 +
δu∗
1 . (2.245)
Let us estimate each of the summands separately.
(1) Since (t, x 0 , u0 + δuv ) ∈ C and (t, u0 + δuv ) ∈
/ V for δuv  = 0, by the definition
of the function (t, u), there exists ε = ε(C) > 0 such that (t, u0 + δuv ) ≥ ε if δuv  = 0.
2.7. Sufficient Conditions for the Strong Minimum 119

Moreover, there exists a constant N = N (C) > 0 such that


δuv
∞ ≤
δu
∞ ≤ N .
Consequently,

 N tf

δuv
1 ≤
δuv
∞ · meas{t  δuv  = 0} ≤ (t, u0 + δuv ) dt.
ε t0
Also, taking into account that
δuv
1 ≤
δuv
∞ (tf − t0 ) ≤ N (tf − t0 ), we obtain from this
that  tf

δu
1 ≤
δu
∞ (tf − t0 )
δu
1 ≤ const
v 2 v v
(t, u0 + δuv ) dt, (2.246)
t0
where const > 0 depends on C only. t
(2) Since (t, u0 + δu0 ) ∈ V 0 , we have
δu0
22 = t0f (t, u0 + δu0 ) dt. Consequently,
 tf 1/2
 

δu0
1 ≤ tf − t0
δu0
2 = tf − t0 (t, u0 + δu0 ) dt . (2.247)
t0
tf
(3) Obviously,
δu∗
1 = t0 |δu∗ | dt ≤
δu∗
∞ · meas M ∗ ≤ N · meas M ∗ , where M ∗ =
 ∗

{t δu  = 0}. Further, as in Section 2.3.1, represent M∗ = ∪ Mk∗ , Mk∗ = Mk− ∗ ∪ M∗ ,
k+
k = 1, . . . , s. Then
  tf
∗ 2
(meas Mk− ) ≤2 |δtk | dt ≤ (t, u0 + δu∗ ) dt;

Mk− t0
  tf
∗ 2
(meas Mk+ ) ≤ 2 |δtk | dt ≤ (t, u0 + δu∗ ) dt;

Mk+
  t0
meas M∗ = meas Mk−∗
+ meas Mk+∗
.
k k
Consequently,
 tf 1/2

δu∗
1 ≤ const (t, u0 + δu∗ ) dt , (2.248)
t0
where const depends only on C. It follows from (2.245)–(2.248) that
 1/2  tf 1/2
tf

δu
1 ≤ const (t, u0 + δuv ) dt + (t, u0 + δu0 ) dt
t0 t0
 tf 1/2 
+ (t, u0 + δu∗ ) dt .
t0

Now, to obtain the required estimate, it remains to use the inequality


!
a + b + c ≤ 3(a 2 + b2 + c2 ),
which holds for any numbers a, b, and c, and also the relation
 tf  tf  tf
(t, u + δu ) dt +
0 v
(t, u + δu ) dt +
0 0
(t, u0 + δu∗ ) dt
t0 t0 t0
 tf
= (t, u0 + δu) dt.
t0

The proposition is proved.


120 Chapter 2. Quadratic Conditions in the Calculus of Variations

Proof of Theorem 2.95. Assume that the conditions of the theorem hold. Let us prove
the inclusion  ¯ S ⊂ . Let {δw} ∈ 
¯ S . Then it follows from Propositions 2.97 and 2.98
that
δu
1 → 0. Further, the condition σ (δw) → 0 implies
δ ẋ − δf
1 → 0. But δf =
δ̄x f + δu f , where
δ̄x f = f (t, x 0 + δx, u0 + δu) − f (t, x 0 , u0 + δu),
δu f = f (t, x 0 , u0 + δu) − f (t, x 0 , u0 ).
Since
δx
C → 0 and there exists a compact set C ⊂ Q such that (t, x 0 , u0 + δu) ∈ C
starting from a certain number, we have
δ̄x f
∞ → 0. The conditions
δu
1 → 0 and
(t, x 0 , u0 + δu) ∈ C imply
δu f
1 → 0. Consequently,

δ ẋ
1 ≤
δ ẋ − δf
1 +
δf
1 ≤
δ ẋ − δf
1 +
δ̄x f
∞ (tf − t0 ) +
δu f
1 → 0.
The conditions
δ ẋ
1 → 0 and |δx(t0 )| → 0 imply
δx
1,1 → 0. Therefore, {δw} ∈ . The
¯ S ⊂  is proved. The converse inclusion always holds. Therefore, 
inclusion  ¯ S = .
The theorem is proved.

Lemma 2.73 and Theorem 2.95 imply the following theorem.

Theorem 2.99. Let the set Leg+ (M0+ ) be nonempty. Then  = ¯ S , and hence the Pon-
tryagin minimum is equivalent to the bounded strong minimum.

Proof. Assume that Leg+ (M0+ ) is nonempty. Choose an arbitrary compact set M ⊂
Leg+ (M0+ ), e.g., a singleton. According to Lemma 2.73, there exist an admissible function
(t, u) and a constant C > 0 such that M ⊂ M(C). Therefore, M(C) is nonempty. Then
= ¯ S by Theorem 2.95. The theorem is proved.

For a point w 0 and the higher order γ corresponding to an admissible function , we


give the following definition.

Definition 2.100. We say that the point w 0 is a point of bounded strong γ -sufficiency if
¯ S such that σ = o(γ ) on it.
there is no sequence {δw} ∈ 

The condition that the set Leg+ (M0+ ) is nonempty is a counterpart of Condition B.
Therefore, Theorems 2.82 and 2.99 imply the following theorem.

Theorem 2.101. Condition B is equivalent to the existence of an admissible function 


such that the bounded strong γ -sufficiency holds at the point w0 for the higher order γ
corresponding to it.

The bounded strong γ -sufficiency implies the strict bounded strong minimum. There-
fore, Theorem 2.101 implies the following theorem.

Theorem 2.102. Condition B is sufficient for the strict bounded strong minimum at the
point w0 .

At this point, we complete the consideration of conditions for the bounded strong
minimum. Before passing to sufficient conditions for the strong minimum, we prove some
estimate for the function , which will be needed in what follows.
2.7. Sufficient Conditions for the Strong Minimum 121

2.7.3 Estimate for the Function on Pontryagin Sequences


Recall that in Section 2.1, we introduced the set  0 consisting of those λ ∈ 0 for which
[H λ ]k = 0 for all tk ∈ , and in Section 2.3, we showed that there exists a constant C > 0
such that the following estimate holds for any sequence {δw} ∈ loc starting from a certain
number:
max |(λ, δw)| ≤ C γ (δw)
co 
0

(see Proposition 2.16). Let us show that the same estimate also holds on any Pontryagin
sequence but with a constant depending on the order γ and the sequence.

Lemma 2.103. Let the set  0 be nonempty. Let (t, u) be an admissible function, and let
γ be the higher order corresponding to it. Then for any sequence {δw} ∈ , there exists a
constant C > 0 such that
max |(λ, δw)| ≤ Cγ (δw).
co 
0

Briefly, this property will be written as maxco  || ≤ O(γ )  .
0

Proof. Proposition 2.16 implies the following assertion: there exist constants C > 0 and
ε > 0 and a neighborhood V of the compact set u0 such that the conditions δw ∈ W ,

δx
C ≤ ε, and (t, u0 + δu) ∈ V imply the estimate maxco  |(λ, δw)| ≤ Cγ (δw). Let us
0
use this estimate. Let {δw} be an arbitrary sequence from . For each member δw = (δx, δu)
of the sequence {δw}, represent δu as δu = δuV + δuV , where

δu if (t, u0 + δu) ∈ V ,
δuV = δuV = δu − δuV .
0 otherwise,

We set δwV = (δx, δuV ) and δwV = (0, δuV ). Owing to a possible decrease of V , we
can assume that both sequences {δwV } and {δwV } are admissible with respect to Q; i.e.,
the conditions (t, x 0 + δx, u0 + δuV ) ∈ Q and (t, x 0 , u0 + δuV ) ∈ Q hold starting from a
certain number (such a possibility follows from the definition of Pontryagin sequence). We
assume that this condition holds for all numbers and
δx
C ≤ ε holds for all numbers.
We set MV = {t | δuV  = 0}. The definitions of admissible function (t, u) and Pontryagin
sequence imply the existence of constants 0 < a < b such that a ≤ (t, u0 + δuV ) ≤ b | M V
for all members of the sequence. This implies a· meas MV ≤ γ V ≤ b· meas M V , where
t
γ V = t0f (t, u0 + δuV ) dt = γ (δw V ). Therefore, γ V and meas MV are of the same order
of smallness. Moreover, the definitions of γ , δwV , and δwV imply γ (δw) = γ (δwV ) +
γ (δwV ), or, briefly, γ = γV + γ V . In what follows, we will need the formula
δf = δV f + δ̄ Vf , (2.249)
where δf = f (t, w 0 + δw) − f (t, w 0 ), δV f = f (t, w 0 + δwV ) − f (t, w 0 ), and δ̄ Vf =
f (t, x 0 + δx, u0 + δuV ) − f (t, x 0 + δx, u0 ). The fulfillment of this formula is proved by
the following calculation:
δf = f (t, x 0 + δx, u0 + δu) − f (t, x 0 + δx, u0 + δuV ) + δV f
 
= f (t, x 0 + δx, u0 + δu) − f (t, x 0 + δx, u0 + δuV ) χ V + δV f
= δ̄ Vf χ V + δV f = δ̄ Vf + δV f ,
122 Chapter 2. Quadratic Conditions in the Calculus of Variations

where χ V is the characteristic function of the set M V . Formula (2.249) implies the following
representation for (λ, δw) on the sequence {δw}:
 tf  tf
(λ, δw) := δl − λ
ψδ ẋ dt + ψδf dt
t0 t0
 tf  tf  tf
= δl λ − ψδ ẋ dt + ψδV f dt + ψ δ̄ Vf dt
t0 t0 t0
 tf
= (λ, δwV ) + δ̄ V H λ dt,
t0

where δ̄ V H λ = ψ δ̄ Vf and λ ∈ co 0 . This implies the estimate


 tf
max |(λ, δw)| ≤ max |(λ, δwV )| + max
ψ
∞ |δ̄ Vf | dt.
co 
0 co 
0 co 
0 t0

According to the choice of V and ε, the first term of the sum on the right-hand side of the
inequality is estimated through γ (δwV ). The second term of this sum is estimated through
meas MV and hence through γ V . Since γV ≤ γ and γ V ≤ γ , the total sum is estimated
through γ with a certain positive constant as a multiplier. The lemma is proved.

2.7.4 Sufficient Conditions for the Strong Minimum


In this section, we assume that the set Q has the form Q = Qt × Qx × Qu , where Qt ⊂ R,
Qx ⊂ Rd(x) , and Qu ⊂ Rd(u) are open sets. Set Qtu = Qt × Qu . We now give those
additional requirements which, together with Condition B, turn out to be sufficient for the
strong minimum whose definition was given in Section 2.7.1. For (t, x, u) ∈ Q, we set
δ¯u f = f (t, x, u) − f (t, x, u0 (t)). Further, for (t, x, u) ∈ Q and λ = (α0 , α, β, ψ) ∈ 0 , we set
δ̄u H λ = ψ δ̄u f . The following theorem holds.

Theorem 2.104. Let the following conditions hold for the point w⁰:

(1) There exists a nonempty compact set M ⊂ Leg₊(M₀⁺) such that
(a) for a certain C > 0, max_{λ∈M} Ω^λ(z̄) ≥ C γ̄(z̄) for all z̄ ∈ K, i.e., Condition B holds;
(b) for any ε > 0, there exist δ > 0 and a compact set C ⊂ Q_{tu} such that for all t ∈ [t₀, t_f] \ Θ, the conditions (t, w) ∈ Q, |x − x⁰(t)| ≤ δ, (t, u) ∉ C imply min_{λ∈M} δ̄_u H^λ ≥ −ε|δ̄_u f|;

(2) there exist δ₀ > 0, ε₀ > 0, a compact set C₀ ⊂ Q_{tu}, and an element λ₀ ∈ M₀ such that for all t ∈ [t₀, t_f] \ Θ the conditions (t, w) ∈ Q, |x − x⁰(t)| < δ₀, (t, u) ∉ C₀ imply δ̄_u H^{λ₀} ≥ ε₀ |δ̄_u f| > 0.

Then w⁰ is a strict strong minimum point.

Remark 2.105. For the point w⁰, let there exist δ₀, ε₀, C₀, and λ₀ satisfying condition (2) of Theorem 2.104. Moreover, let λ₀ ∈ Leg₊(M₀⁺), and for a certain C > 0, let Ω^{λ₀}(z̄) ≥ C γ̄(z̄) for all z̄ ∈ K. Then, as is easily seen, all the conditions of Theorem 2.104 hold, and, therefore, w⁰ is a strict strong minimum point.

2.7.5 Proof of Theorem 2.104


Assume that for a subset M ⊂ Leg₊(M₀⁺), C > 0, δ₀ > 0, ε₀ > 0, C₀ ⊂ Q_{tu}, and λ₀ ∈ M₀, all conditions of the theorem hold, but there is no strict strong minimum at the point w⁰. Let us show that this leads to a contradiction. Since w⁰ is not a strict strong minimum point, there exists a sequence {δw} such that |δx(t₀)| + ‖δx‖_C → 0 (i.e., {δw} ∈ 𝒮), and the following conditions hold for all members of this sequence: (p⁰ + δp) ∈ P, (t, w⁰ + δw) ∈ Q, σ(δw) = 0, δw ≠ 0. The condition σ(δw) = 0 implies

δẋ − δf = 0, δK = 0, δJ ≤ 0, F(p⁰ + δp) ≤ 0.

Hence, for any λ ∈ Λ₀, we have the following on the sequence {δw}:

Ω(λ, δw) = δl^λ ≤ σ(δw) = 0.    (2.250)

Let C be an arbitrary compact set satisfying the condition

C₀ ⊂ C ⊂ Q_{tu}.    (2.251)
For each member δw = (δx, δu) of the sequence {δw}, we set

δu^C = δu if (t, u⁰ + δu) ∉ C, δu^C = 0 otherwise;
δw^C = (0, δu^C), δu_C = δu − δu^C, δw_C = (δx, δu_C).

Then {δw} = {δw_C} + {δw^C}. The relation δH^λ = δ_C H^λ + δ̄^C H^λ, λ ∈ Λ₀, where

δ_C H^λ = H^λ(t, w⁰ + δw_C) − H^λ(t, w⁰),
δ̄^C H^λ = H^λ(t, w⁰ + δw) − H^λ(t, w⁰ + δw_C)
        = H^λ(t, x⁰ + δx, u⁰ + δu) − H^λ(t, x⁰ + δx, u⁰ + δu_C)
        = H^λ(t, x⁰ + δx, u⁰ + δu^C) − H^λ(t, x⁰ + δx, u⁰),

corresponds to this representation of the sequence {δw}. This and condition (2.250) imply that for any λ ∈ Λ₀, we have the following inequality on the sequence {δw}:

Ω^λ(δw_C) + ∫_{t₀}^{t_f} δ̄^C H^λ dt ≤ 0.    (2.252)

We set γ_C = γ(δw_C), δ̄^C f = f(t, x⁰ + δx, u⁰ + δu^C) − f(t, x⁰ + δx, u⁰), and φ^C = ∫_{t₀}^{t_f} |δ̄^C f| dt. Since C ⊃ C₀ and ‖δx‖_C → 0, condition (2) of the theorem implies φ^C > 0 for all nonzero members of the sequence {δw^C} with sufficiently large numbers.

Proposition 2.106. The following conditions hold: (a) φ^C → 0; (b) {δw_C} ∈ Π, and hence γ_C → 0.

Proof. By (2.252), we have the following for the sequence {δw} and the element λ = λ₀:

δl − ∫_{t₀}^{t_f} ψ δẋ dt + ∫_{t₀}^{t_f} δ_C H dt + ∫_{t₀}^{t_f} δ̄^C H dt ≤ 0    (2.253)

(we omit λ = λ₀ in this proof). Represent δ_C H in the form

δ_C H = H(t, w⁰ + δw_C) − H(t, w⁰) = δ̂_{Cx} H + δ_{Cu} H,

where

δ̂_{Cx} H = H(t, x⁰ + δx, u⁰ + δu_C) − H(t, x⁰, u⁰ + δu_C),
δ_{Cu} H = H(t, x⁰, u⁰ + δu_C) − H(t, x⁰, u⁰).

Then we obtain from (2.253) that

δl − ∫_{t₀}^{t_f} ψ δẋ dt + ∫_{t₀}^{t_f} δ̂_{Cx} H dt + ∫_{t₀}^{t_f} δ_{Cu} H dt + ∫_{t₀}^{t_f} δ̄^C H dt ≤ 0.    (2.254)

The conditions ‖δx‖_C → 0 and (t, u⁰ + δu_C) ∈ C imply

∫_{t₀}^{t_f} δ̂_{Cx} H dt → 0.    (2.255)

Further, the condition ‖δx‖_C → 0 also implies

δl − ∫_{t₀}^{t_f} ψ δẋ dt = δl − ψδx |_{t₀}^{t_f} + ∫_{t₀}^{t_f} ψ̇ δx dt
                        = δl − l_p δp − ∫_{t₀}^{t_f} H_x δx dt → 0.    (2.256)

Conditions (2.254)–(2.256) imply

( ∫_{t₀}^{t_f} δ_{Cu} H dt + ∫_{t₀}^{t_f} δ̄^C H dt )₊ → 0.    (2.257)

Since λ = λ₀ and C ⊃ C₀, according to assumption (2) of the theorem, the following inequalities hold for all members of the sequence {δw^C} with sufficiently large numbers:

∫_{t₀}^{t_f} δ̄^C H dt ≥ ε₀ φ^C > 0.    (2.258)

Conditions (2.257) and (2.258) imply

( ∫_{t₀}^{t_f} δ_{Cu} H dt + ε₀ φ^C )₊ → 0.    (2.259)

Since λ = λ₀ ∈ M₀ and (t, x⁰, u⁰ + δu_C) ∈ Q, we have δ_{Cu} H ≥ 0. Therefore, both terms in (2.259) are nonnegative for all sufficiently large numbers of the sequence {δw}. But then (2.259) implies ∫_{t₀}^{t_f} δ_{Cu} H dt → 0 and φ^C → 0.

We now show that {δw_C} ∈ Π̄_S. For this purpose, we prove that σ(δw_C) → 0 (the other conditions of {δw_C} belonging to the set of sequences Π̄_S obviously hold). Since σ(δw) = 0, we need show only that ‖δẋ − δ_C f‖₁ → 0. We have the following for the sequence {δw}:

δẋ − δf = 0, δf = δ_C f + δ̄^C f.    (2.260)

The condition φ^C → 0 means that ‖δ̄^C f‖₁ → 0. This and (2.260) imply ‖δẋ − δ_C f‖₁ → 0. Therefore, {δw_C} ∈ Π̄_S.

We now recall that, by condition (1) of the theorem, the set Leg₊(M₀⁺) is nonempty. By Theorem 2.99, it follows from this that Π̄_S = Π. Therefore, {δw_C} ∈ Π. The proposition is proved.

We continue the proof of the theorem. Consider the following two possible cases for the sequence {δw}.

Case (a). Assume that there exist a compact set C satisfying condition (2.251) and a subsequence of the sequence {δw} such that the following conditions hold on this subsequence:

φ^C > 0, γ_C = o(φ^C).    (2.261)

Assume that these conditions hold for the sequence {δw} itself. Inequality (2.252) and the condition λ₀ ∈ M₀ ⊂ Λ₀ imply that the following inequality holds on the sequence {δw}:

∫_{t₀}^{t_f} δ̄^C H^{λ₀} dt ≤ −Ω^{λ₀}(δw_C).    (2.262)

As was already mentioned in the proof of Proposition 2.106, the following inequalities hold for all members of the sequence {δw} having sufficiently large numbers:

ε₀ φ^C ≤ ∫_{t₀}^{t_f} δ̄^C H^{λ₀} dt.    (2.263)

On the other hand, according to Lemma 2.103, the conditions {δw_C} ∈ Π and λ₀ ∈ M₀ ⊂ Λ₀ imply the estimate

−Ω^{λ₀}(δw_C) ≤ O(γ_C).    (2.264)

We obtain from (2.261)–(2.264) that 0 < φ^C ≤ O(γ_C) = o(φ^C). This is a contradiction.

Case (b). Consider the second possibility. Assume that for any compact set C satisfying condition (2.251), there exists a constant N > 0 such that the following estimate holds on the sequence {δw}:

φ^C ≤ N γ_C.    (2.265)

We show that this also leads to a contradiction. We will thus prove the theorem. First of all, we note that the constant N in (2.265) can be chosen common for all compact sets C satisfying condition (2.251). Indeed, let N₀ correspond to the compact set C₀; i.e., the estimate φ^{C₀} ≤ N₀ γ_{C₀} holds on the sequence {δw}. Let C be an arbitrary compact set such that (2.251) holds. Then we have the following on the sequence {δw} for all sufficiently large numbers: φ^C ≤ φ^{C₀} ≤ N₀ γ_{C₀} ≤ N₀ γ_C. Therefore, N = N₀ is also appropriate for C. Also, we note that for any C satisfying (2.251), there exists a (serial) number of the sequence starting from which γ_C > 0. Indeed, otherwise, there exist a compact set C satisfying (2.251) and a subsequence of the sequence {δw} such that γ_C = 0 on the subsequence, and then φ^C = 0 by (2.265). By assumption (2) of the theorem, this implies that all members of the subsequence vanish. The latter is impossible, since the sequence {δw} contains nonzero members by assumption.

Now let the compact set C satisfy condition (2.251). Inequality (2.252) and the inclusion M ⊂ Λ₀ imply that the following inequality holds on the sequence {δw}:

max_{λ∈M} Ω(λ, δw_C) ≤ − min_{λ∈M} ∫_{t₀}^{t_f} δ̄^C H^λ dt.    (2.266)

Obviously,

min_{λ∈M} ∫_{t₀}^{t_f} δ̄^C H^λ dt ≥ ∫_{t₀}^{t_f} min_{λ∈M} δ̄^C H^λ dt.    (2.267)

Condition 1(b) of the theorem implies that, for any ε > 0, there exists a compact set C satisfying (2.251) such that

∫_{t₀}^{t_f} min_{λ∈M} δ̄^C H^λ dt ≥ −ε · φ^C    (2.268)

for all sufficiently large numbers of the sequence. We obtain from (2.265)–(2.268) that for any ε > 0, there exists a compact set C satisfying (2.251) such that the following estimate holds starting from a certain number:

max_{λ∈M} Ω(λ, δw_C) ≤ εN γ_C,    (2.269)

where N = N₀ is independent of C.

We now estimate the left-hand side of inequality (2.269) from below. For this purpose, we show that for any compact set C satisfying (2.251), the sequence {δw_C} belongs to Π_{o(√γ)}. Let C be an arbitrary compact set satisfying (2.251). According to Proposition 2.106, {δw_C} ∈ Π. Since σ(δw) = 0, we have (δJ)₊ + Σᵢ (Fᵢ(p⁰ + δp))₊ + |δK| = 0 on the whole sequence. Moreover, the conditions δẋ = δf, δf = δ_C f + δ̄^C f, and ‖δ̄^C f‖₁ = φ^C ≤ O(γ_C) imply ‖δẋ − δ_C f‖₁ ≤ O(γ_C). Therefore, {δw_C} ∈ Π_{σ≤O(γ)} ⊂ Π_{o(√γ)}. But then, by Lemma 2.83, condition 1(a) of the theorem implies that there exists a constant C_M > 0 such that, starting from a certain number,

max_{λ∈M} Ω(λ, δw_C) ≥ C_M γ_C > 0.    (2.270)

Moreover, the constant C_M is independent of the sequence from Π_{o(√γ)} and hence is independent of C. Comparing estimates (2.269) and (2.270), we obtain the following result: for any ε > 0, there exists a compact set C satisfying (2.251) such that, starting from a certain number, 0 < C_M γ_C ≤ εN γ_C. Choosing 0 < ε < C_M/N, we obtain a contradiction. The theorem is proved.
Chapter 3

Quadratic Conditions for Optimal Control Problems with Mixed Control-State Constraints

In Sections 3.1 and 3.2 of this chapter, following [92], we extend the quadratic conditions
obtained in Chapter 2 to the general problem with the local relation g(t, x, u) = 0 using
a special method of projection contained in [79]. In Section 3.3, we extend these condi-
tions to the problem on a variable interval of time using a simple change of time variable.
In Section 3.4, we formulate (without proofs) quadratic conditions in an optimal control
problem with the local relations g(t, x, u) = 0 and ϕ(t, x, u) ≤ 0.
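As a preview of the change of time variable invoked for Section 3.3, we sketch its standard form (the symbols τ and v below are only illustrative here; the precise construction appears in Section 3.3): the variable interval is pulled back to a fixed one, and t itself becomes a state variable.

\[
% Reduction of a variable time interval to a fixed one (illustrative sketch):
t = t(\tau), \qquad \frac{dt}{d\tau} = v(\tau) > 0, \qquad \tau \in [0,1],
\]
\[
\frac{dx}{d\tau} = v(\tau)\, f\bigl(t(\tau), x(\tau), u(\tau)\bigr),
\qquad
g\bigl(t(\tau), x(\tau), u(\tau)\bigr) = 0,
\]

so that a problem on [t₀, t_f] with t₀ and t_f free becomes a problem on the fixed interval [0, 1] with the additional state t and the additional control v.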

3.1 Quadratic Necessary Conditions in the Problem with Mixed Control-State Equality Constraints on a Fixed Time Interval
3.1.1 Statement of the Problem with a Local Equality and Passage to
an Auxiliary Problem without Local Constraints
We consider the following problem on a fixed interval [t0 , tf ] with a local equality-type
constraint:
J (x(t0 ), x(tf )) → min,
(3.1)
F (x(t0 ), x(tf )) ≤ 0, K(x(t0 ), x(tf )) = 0, (x(t0 ), x(tf )) ∈ P ,

ẋ = f (t, x, u), g(t, x, u) = 0, (t, x, u) ∈ Q. (3.2)


It is assumed that the functions J , F , and K are twice continuously differentiable on the
open set P ⊂ R2d(x) , and f and g are twice continuously differentiable on the open set
Q ⊂ R1+d(x)+d(u) . Moreover, the following full-rank condition is assumed for the local
equality:
rank gu (t, x, u) = d(g) (3.3)
for all (t, x, u) ∈ Q such that g(t, x, u) = 0.
As in Section 2.1.2, we define a (strict) minimum on a set of sequences 𝒮: w⁰ is a (strict) minimum point on 𝒮 in problem (3.1), (3.2) if there is no sequence {δw} ∈ 𝒮 such that the following conditions hold for all its members:

J(p⁰ + δp) < J(p⁰)   (respectively, J(p⁰ + δp) ≤ J(p⁰), δw ≠ 0),
F(p⁰ + δp) ≤ 0, K(p⁰ + δp) = 0, ẋ⁰ + δẋ = f(t, w⁰ + δw),
g(t, w⁰ + δw) = 0, (p⁰ + δp) ∈ P, (t, w⁰ + δw) ∈ Q,

where p⁰ = (x⁰(t₀), x⁰(t_f)), δw = (δx, δu), and δp = (δx(t₀), δx(t_f)). A (strict) minimum on Π (see Section 2.1.3) is said to be a (strict) Pontryagin minimum.
Our goal is to obtain quadratic conditions in problem (3.1), (3.2) using the quadratic
conditions obtained in problem (2.1)–(2.4). In this case, we will use the same method for
passing to a problem without local constraints, which was already used in [79, Section 17,
Part 1] for obtaining first-order conditions. Recall that in [79, Section 17, Part 1], we have
introduced the set
G = {(t, x, u) ∈ Q | g(t, x, u) = 0}.
We have shown that there exist a neighborhood Q₁ ⊂ Q of the set G and a continuously differentiable function U(t, x, u) : Q₁ → R^{d(u)} such that

(i) (t, x, U(t, x, u)) ∈ G ∀ (t, x, u) ∈ Q₁,
(ii) U(t, x, u) = u ∀ (t, x, u) ∈ G.    (3.4)

Owing to these properties, U is called a projection. Since g is a twice continuously differentiable function on Q, we can choose the function U, together with the neighborhood Q₁, so that U is a twice continuously differentiable function on Q₁. This can be easily verified by analyzing the scheme for proving the existence of a projection presented in [79, Section 17, Part 1]. We fix certain Q₁ and U with the above properties. Instead of system (3.2) with the local equality-type constraint g(t, x, u) = 0, consider the following system without local constraints:

ẋ = f(t, x, U(t, x, u)), (t, x, u) ∈ Q₁.    (3.5)
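To make the projection concrete, consider a minimal example (ours, for illustration only; it is not used in the sequel): let d(u) = 2, d(g) = 1, and suppose the constraint can be solved explicitly for the second control component.

\[
% Explicit projection for a solvable scalar constraint, u = (u_1, u_2):
g(t,x,u) = u_2 - h(t,x,u_1),
\qquad
U(t,x,u) = \bigl(u_1,\; h(t,x,u_1)\bigr).
% Property (i): g(t,x,U(t,x,u)) = h(t,x,u_1) - h(t,x,u_1) = 0, so (t,x,U) \in G.
% Property (ii): on G we have u_2 = h(t,x,u_1), hence U(t,x,u) = u.
\]

If h is twice continuously differentiable, so is U; moreover, rank g_u = rank(−h_{u₁}, 1) = 1 = d(g), so the full-rank condition (3.3) holds, and in this example one may take Q₁ = Q.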
We find a connection between the necessary conditions in the problem (3.1), (3.2) and those
in the problem (3.1), (3.5). Preparatorily, we prove the following assertion.

Proposition 3.1. Let (x⁰, u⁰) = w⁰ be a Pontryagin minimum point in the problem (3.1), (3.2). Then w⁰ is a Pontryagin minimum point in the problem (3.1), (3.5).

Proof. The property that w⁰ is a Pontryagin minimum point in the problem (3.1), (3.2) implies that w⁰ is an admissible point in the problem (3.1), (3.2), and then by the second property (3.4) of the projection, w⁰ is also admissible in the problem (3.1), (3.5). Suppose that w⁰ is not a Pontryagin minimum point in the problem (3.1), (3.5). Then there exist a compact set C ⊂ Q₁ and a sequence {δw} = {(δx, δu)} such that

‖δx‖_{1,1} → 0, ‖δu‖₁ → 0,    (3.6)

and the following conditions hold for all members of the sequence:

(t, w⁰ + δw) ∈ C,    (3.7)
δJ < 0, F(p⁰ + δp) ≤ 0, K(p⁰ + δp) = 0,    (3.8)
ẋ⁰ + δẋ = f(t, x⁰ + δx, U(t, x⁰ + δx, u⁰ + δu)),    (3.9)

where p⁰ = (x⁰(t₀), x⁰(t_f)) and δp = (δx(t₀), δx(t_f)). Therefore, {δw} is a Pontryagin sequence "violating the Pontryagin minimum" at the point w⁰ in the problem (3.1), (3.5). We set {δw₁} = {(δx, δu₁)}, where

δu₁ = U(t, x⁰ + δx, u⁰ + δu) − U(t, x⁰, u⁰) = U(t, x⁰ + δx, u⁰ + δu) − u⁰.

We show that {δw₁} is a Pontryagin sequence "violating the Pontryagin minimum" at w⁰ in the problem (3.1), (3.2).

First of all, we show that {δw₁} is a Pontryagin sequence for system (3.2) at w⁰. For this purpose, we represent δu₁ in the form

δu₁ = U(t, x⁰ + δx, u⁰ + δu) − U(t, x⁰, u⁰ + δu)
    + U(t, x⁰, u⁰ + δu) − U(t, x⁰, u⁰).    (3.10)

This representation is correct, since the conditions

‖δx‖_C → 0 and (t, x⁰ + δx, u⁰ + δu) ∈ C ⊂ Q₁    (3.11)

imply (t, x⁰, u⁰ + δu) ∈ Q₁ for all sufficiently large serial numbers (here, we use the compactness of C and the openness of Q₁). Representation (3.10) implies

‖δu₁‖₁ ≤ ‖U(t, x⁰ + δx, u⁰ + δu) − U(t, x⁰, u⁰ + δu)‖_∞ (t_f − t₀)
        + ‖U(t, x⁰, u⁰ + δu) − U(t, x⁰, u⁰)‖₁.

This, condition (3.11), and also the condition ‖δu‖₁ → 0 imply ‖δu₁‖₁ → 0. Further, denote by C₁ the image of the compact set C under the mapping (t, x, u) ↦ (t, x, U(t, x, u)). Then C₁ ⊂ G, and hence C₁ ⊂ Q₁. Moreover, C₁ is a compact set and

(t, x⁰ + δx, u⁰ + δu₁) = (t, x⁰ + δx, U(t, x⁰ + δx, u⁰ + δu)) ∈ C₁.

This implies g(t, x⁰ + δx, u⁰ + δu₁) = 0. Finally, we note that the sequence {w⁰ + δw₁} satisfies the differential equation of system (3.2):

ẋ⁰ + δẋ = f(t, x⁰ + δx, U(t, x⁰ + δx, u⁰ + δu)) = f(t, x⁰ + δx, u⁰ + δu₁)

and the local equality constraint

g(t, x⁰ + δx, u⁰ + δu₁) = g(t, x⁰ + δx, U(t, x⁰ + δx, u⁰ + δu)) = 0.

Therefore, we have shown that the mapping

(δx, δu) ↦ (δx, U(t, x⁰ + δx, u⁰ + δu) − u⁰)

transforms the Pontryagin sequence {δw} of the system (3.1), (3.5) at the point w⁰ into the Pontryagin sequence {δw₁} of the system (3.1), (3.2) at the same point. Since the sequence {δw} "violates" the Pontryagin minimum in the problem (3.1), (3.5) at the point w⁰, conditions (3.8) hold. But these conditions can also be referred to the sequence {δw₁}, since the members δx of these two sequences coincide. Therefore, {δw₁} "violates" the Pontryagin minimum at the point w⁰ in the problem (3.1), (3.2). Therefore, the absence of the Pontryagin minimum at the point w⁰ in the problem (3.1), (3.5) implies the same in the problem (3.1), (3.2). This implies what was required.

Let w⁰ = (x⁰, u⁰) be a fixed Pontryagin minimum point in the problem (3.1), (3.2). Then w⁰ is a Pontryagin minimum point in the problem (3.1), (3.5). We now write the quadratic necessary conditions for the Pontryagin minimum at the point w⁰ in the problem (3.1), (3.5) (which contains no local constraints) so that the projection U can be excluded from these conditions. Then we will obtain the quadratic necessary conditions for the Pontryagin minimum in the problem (3.1), (3.2).

3.1.2 Set M₀

For the problem (3.1), (3.2), we set l = α₀J + αF + βK, H = ψf, and H̄ = H + νg, where ν ∈ (R^{d(g)})^*. Therefore, H̄ = H̄(t, x, u, ψ, ν). We also set

𝓗(t, x, ψ) = min_{u: (t,x,u)∈G} ψf(t, x, u).

For problem (3.1), (3.2) and the point w⁰, we introduce the set M₀ consisting of tuples λ = (α₀, α, β, ψ(t), ν(t)) such that

α₀ ≥ 0, α ≥ 0, αF(p⁰) = 0, α₀ + |α| + |β| = 1,    (3.12)
ψ(t₀) = −l_{x₀}, ψ(t_f) = l_{x_f},    (3.13)
−ψ̇ = H̄_x, H̄_u = 0,    (3.14)
H(t, x⁰, u⁰, ψ) = 𝓗(t, x⁰, ψ).    (3.15)

Here, ψ(t) is an absolutely continuous function and ν(t) is a bounded measurable function. All the derivatives are taken for p = p⁰ and w = w⁰(t). The results of [79, Section 17, Part 1] imply that the set M₀ of problem (3.1), (3.5) at the point w⁰ can be represented in this form. More precisely, the linear projection (α₀, α, β, ψ, ν) ↦ (α₀, α, β, ψ) yields a one-to-one correspondence between the elements of the set M₀ of the problem (3.1), (3.2) at the point w⁰ and the elements of the set M₀ of the problem (3.1), (3.5) at the same point. To distinguish these two sets from one another, we denote the latter set by M₀^U. We will equip all objects referring to the problem (3.1), (3.5) with the superscript U.

In what follows, we will assume that all assumptions of Section 2.1 hold for the point w⁰; i.e., u⁰(t) is a piecewise continuous function whose set of discontinuity points is Θ = {t₁, ..., t_s} ⊂ (t₀, t_f), and each point of the set Θ is an L-point. The condition

−f_u^*(t, x⁰(t), u⁰(t)) ψ^*(t) = g_u^*(t, x⁰(t), u⁰(t)) ν^*(t),    (3.16)

which is equivalent to the condition H̄_u = 0, together with the full-rank condition (3.3), implies that ν(t) has the same properties as u⁰(t): the function ν(t) is piecewise continuous, and each of its points of discontinuity is an L-point belonging to Θ. To verify this, it suffices to premultiply the above relation by the matrix g_u(t, x⁰(t), u⁰(t)),

−g_u(t, x⁰(t), u⁰(t)) f_u^*(t, x⁰(t), u⁰(t)) ψ^*(t) = g_u(t, x⁰(t), u⁰(t)) g_u^*(t, x⁰(t), u⁰(t)) ν^*(t),

and use the properties of the functions g, f, x⁰, and u⁰; in particular, the property

|det g_u(t, x⁰, u⁰) g_u^*(t, x⁰, u⁰)| ≥ const > 0,

which is implied by the full-rank condition (3.3). Therefore, the basic properties of the function ν(t) are proved in exactly the same way as in [79, Section 17, Part 1] for the bounded measurable control u⁰(t).
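Since g_u g_u^* is invertible along the trajectory, the premultiplied relation can be solved for ν explicitly; we record the resulting formula (a direct rearrangement, written out here for convenience):

\[
% Solving the premultiplied form of (3.16) along (t, x^0(t), u^0(t)):
\nu^*(t) = -\bigl(g_u g_u^*\bigr)^{-1} g_u\, f_u^*\, \psi^*(t),
\qquad
g_u = g_u(t, x^0(t), u^0(t)),\quad f_u = f_u(t, x^0(t), u^0(t)).
\]

The right-hand side visibly inherits the piecewise continuity of u⁰(t), which is the property of ν(t) claimed above.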

3.1.3 Critical Cone


We now consider the conditions defining the critical cone at the point w⁰ in the problem (3.1), (3.5). The variational equation has the form

x̄˙ = (fx + fu Ux )x̄ + fu Uu ū. (3.17)

All the derivatives are taken for w = w0 (t). Setting ũ = Ux x̄ + Uu ū, we obtain

x̄˙ = fx x̄ + fu ũ. (3.18)

This is the usual variational equation, but for the pair w̃ = (x̄, ũ). Let us show that the pair
(x̄, ũ) also satisfies the condition
gx x̄ + gu ũ = 0. (3.19)
By the first condition in (3.4), we have

g(t, x, U (t, x, u)) = 0 ∀ (t, x, u) ∈ Q1 . (3.20)

Differentiating this relation in x, u, and t, as in [79, Section 17, Part 1], we obtain

gx + gu Ux = 0, (3.21)
gu Uu = 0, (3.22)
gt + gu Ut = 0. (3.23)

These relations hold on Q1 , but it suffices to consider them only on the trajectory
(t, x 0 (t), u0 (t)), and, moreover, we now need only the first two conditions. By (3.21) and
(3.22), we have

gx x̄ + gu ũ = gx x̄ + gu (Ux x̄ + Uu ū) = (gx + gu Ux )x̄ + gu Uu ū = 0.

Therefore, we have proved the following proposition.

Proposition 3.2. Let a pair of functions (x̄, ū) satisfy the variational equation (3.17) of
system (3.5). We set ũ = Ux x̄ + Uu ū. Then conditions (3.18) and (3.19) hold for (x̄, ũ).

In this proposition, x̄ ∈ PW^{1,2}(Δ, R^{d(x)}), ū ∈ L²(Δ, R^{d(u)}), and ũ ∈ L²(Δ, R^{d(u)}), where Δ = [t₀, t_f]. Also, we are interested in the possibility of the converse passage from conditions (3.18) and (3.19) to condition (3.17). For this purpose, we prove the following proposition.

Proposition 3.3. Let a pair of functions (x̄, ũ) be such that gx x̄ + gu ũ = 0. Then, setting ū = ũ − Ux x̄, we obtain Uu ū = ū, and hence ũ = Ux x̄ + Uu ū. Here, as above, x̄ ∈ PW^{1,2}(Δ, R^{d(x)}), ū ∈ L²(Δ, R^{d(u)}), and ũ ∈ L²(Δ, R^{d(u)}); all the derivatives are taken for x = x⁰(t) and u = u⁰(t).

Proof. First of all, we note that properties (3.4) of the function U (t, x, u) imply the follow-
ing assertion: at each point (t, x, u) ∈ G, the finite-dimensional linear operator

ū ∈ Rd(u)  → Uu (t, x, u)ū ∈ Rd(u) (3.24)

is the linear projection of the space Rd(u) on the subspace

Lg (t, x, u) = {ū ∈ Rd(u) | gu (t, x, u)ū = 0}. (3.25)

Indeed, let (t, x, u) ∈ G, and then condition (3.22) implies that the image of operator (3.24)
is contained in subspace (3.25). The condition

Uu (t, x, u)ū = ū ∀ ū ∈ Lg (t, x, u) (3.26)

is easily proved by using the Lyusternik theorem [28]. Indeed, let ū ∈ Lg (t, x, u), i.e.,
gu (t, x, u)ū = 0. Let ε → +0. Then

g(t, x, u + εū) = g(t, x, u) + gu (t, x, u)εū + rg (ε) = rg (ε),

where |rg (ε)| = o(ε). By the Lyusternik theorem [28], there exists a “sequence” {ug (ε)}
such that ug (ε) ∈ Rd(u) , |ug (ε)| = o(ε), and, moreover, g(t, x, u + ε ū + ug (ε)) = 0, i.e.,
(t, x, u + εū + ug (ε)) ∈ G. Hence U (t, x, u + εū + ug (ε)) = u + ε ū + ug (ε). But U (t, x, u +
ε ū + ug (ε)) = U (t, x, u) + Uu (t, x, u)(ε ū + ug (ε)) + rU (ε), where |rU (ε)| = o(ε). We ob-
tain from the latter two conditions and the condition U (t, x, u) = u that ε ū + ug (ε) =
Uu (t, x, u)(ε ū + ug (ε)) + rU (ε). Dividing this relation by ε and passing to the limit as
ε → +0, we obtain ū = Uu(t, x, u)ū. Condition (3.26) is proved. Condition (3.26) holds at each point (t, x, u) ∈ G, but we use it only on the trajectory (t, x⁰(t), u⁰(t)), t ∈ [t₀, t_f].
If a pair of functions x̄(t), ũ(t) satisfies the condition gx x̄ + gu ũ = 0, then by (3.21),
we have −gu Ux x̄ + gu ũ = 0, i.e., −Ux x̄ + ũ ∈ Lg (t, x 0 , u0 ). Then the condition Uu ū = ū
also holds for ū = −Ux x̄ + ũ, and hence ũ = Ux x̄ + Uu ū. The proposition is proved.
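In the illustrative example g(t, x, u) = u₂ − h(t, x, u₁) from Section 3.1.1, the projection property of Uu can be verified by direct computation (again an illustration of ours, not part of the general argument):

\[
% For U(t,x,u) = (u_1, h(t,x,u_1)):
U_u = \begin{pmatrix} 1 & 0 \\ h_{u_1} & 0 \end{pmatrix},
\qquad
g_u = \bigl(-h_{u_1},\ 1\bigr).
% Then U_u^2 = U_u (idempotence) and g_u U_u = (-h_{u_1} + h_{u_1},\, 0) = 0,
% so the image of U_u lies in L_g; and if g_u \bar u = 0, i.e. \bar u_2 = h_{u_1}\bar u_1,
% then U_u \bar u = (\bar u_1,\, h_{u_1}\bar u_1) = \bar u.
\]

Hence Uu is exactly the projection of R^{d(u)} onto the subspace (3.25).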

Proposition 3.3 implies the following assertion.

Proposition 3.4. Let a pair of functions (x̄, ũ) be such that conditions (3.18) and (3.19) hold :
x̄˙ = fx x̄ +fu ũ and gx x̄ +gu ũ = 0. We set ū = −Ux x̄ + ũ. Then Uu ū = ū, and the variational
equation (3.17) of system (3.5) holds for the pair of functions (x̄, ū) at the point w 0 .

Proof. Indeed,

x̄˙ = fx x̄ + fu ũ = fx x̄ + fu (Ux x̄ + ū) = fx x̄ + fu (Ux x̄ + Uu ū) = (fx + fu Ux )x̄ + fu Uu ū

as required. The proposition is proved.

We now give the following definition.

Definition 3.5. The critical cone K of problem (3.1), (3.2) at the point w⁰ is the set of triples z̄ = (ξ̄, x̄, ū) satisfying the following conditions:

ξ̄ ∈ R^s, x̄ ∈ PW^{1,2}(Δ, R^{d(x)}), ū ∈ L²(Δ, R^{d(u)}),    (3.27)
Jp p̄ ≤ 0, F_{ip} p̄ ≤ 0 ∀ i ∈ I, Kp p̄ = 0,    (3.28)
x̄˙ = fx x̄ + fu ū,    (3.29)
[x̄]^k = [f]^k ξ̄k ∀ tk ∈ Θ,    (3.30)
gx x̄ + gu ū = 0.    (3.31)
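In the same illustrative example g(t, x, u) = u₂ − h(t, x, u₁), the local condition (3.31) takes a fully explicit form:

\[
g_x \bar x + g_u \bar u = -h_x \bar x - h_{u_1} \bar u_1 + \bar u_2 = 0
\quad\Longleftrightarrow\quad
\bar u_2 = h_x \bar x + h_{u_1} \bar u_1;
\]

that is, ξ̄, x̄, and ū₁ are restricted only by (3.27)–(3.30), while ū₂ is determined by them pointwise.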

Let us compare this definition with the definition of the critical cone K U of the
problem (3.1), (3.5) at the point w0 . According to Section 2.1, the latter is defined by the
same conditions (3.27) and (3.28), the variational equation (3.17), and the jump condition
(3.30). Thus, all the conditions referring to the components ξ̄ and x̄ in the definitions of
two critical cones coincide. This and Proposition 3.2 imply the following assertion.

Lemma 3.6. Let z̄ = (ξ̄ , x̄, ū) be an arbitrary element of the critical cone K U of the problem
(3.1), (3.5) at the point w0 . We set ũ = Ux x̄ + Uu ū. Then z̃ = (ξ̄ , x̄, ũ) is an element of the
critical cone K of the problem (3.1), (3.2) at the point w0 .

Respectively, Proposition 3.4 implies the following lemma.

Lemma 3.7. Let z̃ = (ξ̄ , x̄, ũ) be an arbitrary element of the critical cone K of the problem
(3.1), (3.2) at the point w0 . We set ū = ũ − Ux x̄. Then z̄ = (ξ̄ , x̄, ū) is an element of the
critical cone K U of the problem (3.1), (3.5) at the point w0 , and, moreover, Uu ū = ū, which
implies ũ = Ux x̄ + Uu ū.

This is the connection between the critical cones at the point w 0 in the problems
(3.1), (3.2) and (3.1), (3.5). We will need Lemma 3.6 later in deducing quadratic sufficient
conditions in the problem with a local equality; now, in deducing necessary conditions, we
use Lemma 3.7. Preparatorily, we find the connection between the corresponding quadratic
forms.

3.1.4 Quadratic Form


We write the quadratic form Ω^U(z̄) for the point w⁰ in the problem (3.1), (3.5) in accordance with its definition in Section 2.1. Let λ^U = (α₀, α, β, ψ) be an element of the set M₀^U of the problem (3.1), (3.5) at the point w⁰, and let λ = (α₀, α, β, ψ, ν) be the corresponding element of the set M₀ of the problem (3.1), (3.2) at the same point. As above, we set

H(t, x, u, ψ) = ψf(t, x, u);  H̄(t, x, u, ψ, ν) = ψf(t, x, u) + νg(t, x, u).

Also, we introduce the notation

f^U(t, x, u) = f(t, x, U(t, x, u)),    (3.32)
H^U(t, x, u, ψ) = ψf^U(t, x, u) = H(t, x, U(t, x, u), ψ).    (3.33)

We omit the superscripts λ and λ^U in the notation. For each t ∈ [t₀, t_f], let us calculate the quadratic form

⟨H^U_ww w̄, w̄⟩ = ⟨H^U_xx x̄, x̄⟩ + 2⟨H^U_xu ū, x̄⟩ + ⟨H^U_uu ū, ū⟩,    (3.34)

where x̄ ∈ R^{d(x)}, ū ∈ R^{d(u)}, and all the second derivatives are taken at the point (t, x, u) = (t, x⁰(t), u⁰(t)).
Let us calculate ⟨H^U_xx x̄, x̄⟩. It follows from (3.33) that

H^U_x = Hx + Hu Ux.    (3.35)

Differentiating this equation in x and twice multiplying it by x̄, we obtain

⟨H^U_xx x̄, x̄⟩ = ⟨Hxx x̄, x̄⟩ + 2⟨Hxu (Ux x̄), x̄⟩ + ⟨Huu (Ux x̄), (Ux x̄)⟩ + ⟨(Hu Uxx)x̄, x̄⟩,    (3.36)

where ⟨(Hu Uxx)x̄, x̄⟩ = ⟨(Σ_{i=1}^{d(u)} H_{u_i} U_{i,xx})x̄, x̄⟩ by definition. Further, let us calculate ⟨H^U_xu ū, x̄⟩. Differentiating (3.35) in u and multiplying it by ū and x̄, we obtain

⟨H^U_xu ū, x̄⟩ = ⟨Hxu (Uu ū), x̄⟩ + ⟨Huu (Uu ū), (Ux x̄)⟩ + ⟨(Hu Uxu)ū, x̄⟩,    (3.37)

where ⟨(Hu Uxu)ū, x̄⟩ = ⟨(Σ_{i=1}^{d(u)} H_{u_i} U_{i,xu})ū, x̄⟩. Finally, let us calculate ⟨H^U_uu ū, ū⟩. It follows from (3.33) that H^U_u = Hu Uu. Differentiating this equation in u and twice multiplying it by ū, we obtain

⟨H^U_uu ū, ū⟩ = ⟨Huu (Uu ū), (Uu ū)⟩ + ⟨(Hu Uuu)ū, ū⟩,    (3.38)

where ⟨(Hu Uuu)ū, ū⟩ = ⟨(Σ_{i=1}^{d(u)} H_{u_i} U_{i,uu})ū, ū⟩. Formulas (3.34) and (3.36)–(3.38) imply

⟨H^U_ww w̄, w̄⟩ = ⟨Hxx x̄, x̄⟩ + 2⟨Hxu (Ux x̄), x̄⟩ + ⟨Huu (Ux x̄), (Ux x̄)⟩
  + ⟨(Hu Uxx)x̄, x̄⟩ + 2⟨Hxu (Uu ū), x̄⟩ + 2⟨Huu (Uu ū), (Ux x̄)⟩
  + 2⟨(Hu Uxu)ū, x̄⟩ + ⟨Huu (Uu ū), (Uu ū)⟩ + ⟨(Hu Uuu)ū, ū⟩    (3.39)
  = ⟨Hxx x̄, x̄⟩ + 2⟨Hxu ũ, x̄⟩ + ⟨Huu ũ, ũ⟩
  + ⟨(Hu Uxx)x̄, x̄⟩ + 2⟨(Hu Uxu)ū, x̄⟩ + ⟨(Hu Uuu)ū, ū⟩,

where ũ = Ux x̄ + Uu ū. Further, differentiating in x the relations

g_{ix} + g_{iu} Ux = 0, i = 1, ..., d(g),    (3.40)

which hold on Q₁, and twice multiplying the result by x̄, we obtain

⟨g_{ixx} x̄, x̄⟩ + 2⟨g_{ixu}(Ux x̄), x̄⟩ + ⟨g_{iuu}(Ux x̄), (Ux x̄)⟩ + ⟨(g_{iu} Uxx)x̄, x̄⟩ = 0,

i = 1, ..., d(g). Multiplying each of these relations by the ith component νi of the vector-valued function ν(t), summing with respect to i, and using the relation Hu + νgu = 0, we obtain

⟨νg_{xx} x̄, x̄⟩ + 2⟨νg_{xu}(Ux x̄), x̄⟩ + ⟨νg_{uu}(Ux x̄), (Ux x̄)⟩ − ⟨(Hu Uxx)x̄, x̄⟩ = 0,    (3.41)

where νg_{xx} = Σ νi g_{ixx}, νg_{xu} = Σ νi g_{ixu}, and νg_{uu} = Σ νi g_{iuu}. Differentiating the same relations (3.40) in u and multiplying by ū and x̄, we obtain

⟨g_{ixu}(Uu ū), x̄⟩ + ⟨g_{iuu}(Uu ū), (Ux x̄)⟩ + ⟨(g_{iu} Uxu)ū, x̄⟩ = 0.
Multiplying each of these relations by 2νi, summing with respect to i, and using the property that Hu + νgu = 0, we obtain

2⟨νg_{xu}(Uu ū), x̄⟩ + 2⟨νg_{uu}(Uu ū), (Ux x̄)⟩ − 2⟨(Hu Uxu)ū, x̄⟩ = 0,    (3.42)

where νg_{xu} = Σ νi g_{ixu} and νg_{uu} = Σ νi g_{iuu}. Finally, differentiating in u the relations g_{iu} Uu = 0, i = 1, ..., d(g), which hold on Q₁, and twice multiplying the result by ū, we obtain ⟨g_{iuu}(Uu ū), (Uu ū)⟩ + ⟨(g_{iu} Uuu)ū, ū⟩ = 0. Multiplying each of these equations by νi, summing with respect to i, and using the property that Hu + νgu = 0, we obtain

⟨νg_{uu}(Uu ū), (Uu ū)⟩ − ⟨(Hu Uuu)ū, ū⟩ = 0.    (3.43)

This and (3.38) imply

⟨H^U_uu ū, ū⟩ = ⟨Huu (Uu ū), (Uu ū)⟩ + ⟨νg_{uu}(Uu ū), (Uu ū)⟩.    (3.44)

Using the notation H̄ = H + νg, we present this relation in the form

⟨H^U_uu ū, ū⟩ = ⟨H̄uu (Uu ū), (Uu ū)⟩.    (3.45)

We will use this relation later in Section 3.2. Summing relations (3.41)–(3.43), we obtain

⟨νg_{xx} x̄, x̄⟩ + 2⟨νg_{xu} ũ, x̄⟩ + ⟨νg_{uu} ũ, ũ⟩
  − ⟨(Hu Uxx)x̄, x̄⟩ − 2⟨(Hu Uxu)ū, x̄⟩ − ⟨(Hu Uuu)ū, ū⟩ = 0,    (3.46)

where ũ = Ux x̄ + Uu ū. It follows from (3.39) and (3.46) that ⟨H^U_ww w̄, w̄⟩ = ⟨Hww w̃, w̃⟩ + ⟨νg_{ww} w̃, w̃⟩, or

⟨H^U_ww w̄, w̄⟩ = ⟨H̄ww w̃, w̃⟩,    (3.47)

where w̃ = (x̄, ũ) = (x̄, Ux x̄ + Uu ū).
We now consider the terms referring to the points of discontinuity of the control u⁰. We set

D^k(H̄) = −H̄x^{k+} H̄ψ^{k−} + H̄x^{k−} H̄ψ^{k+} − [H̄t]^k, k = 1, ..., s,    (3.48)

where

H̄x^{k+} = H̄x(tk, x⁰(tk), u^{0k+}, ψ(tk), ν^{k+}), H̄x^{k−} = H̄x(tk, x⁰(tk), u^{0k−}, ψ(tk), ν^{k−}),
H̄ψ^{k+} = f(tk, x⁰(tk), u^{0k+}) = Hψ^{k+}, H̄ψ^{k−} = f(tk, x⁰(tk), u^{0k−}) = Hψ^{k−},
[H̄t]^k = H̄t^{k+} − H̄t^{k−} = ψ(tk)[ft]^k + [νgt]^k
  = ψ(tk)(ft(tk, x⁰(tk), u^{0k+}) − ft(tk, x⁰(tk), u^{0k−}))
  + ν^{k+} gt(tk, x⁰(tk), u^{0k+}) − ν^{k−} gt(tk, x⁰(tk), u^{0k−}),
ν^{k+} = ν(tk+), ν^{k−} = ν(tk−).

Therefore, the definition of D^k(H̄) is analogous to that of D^k(H).

We can define D^k(H̄) using another method, namely, as the derivative of the "jump of H̄" at the point tk. Introduce the function

(Δ_k H̄)(t) = (Δ_k H)(t) + (Δ_k(νg))(t)
  = ψ(t)( f(t, x⁰(t), u^{0k+}) − f(t, x⁰(t), u^{0k−}) )    (3.49)
  + ν^{k+} g(t, x⁰(t), u^{0k+}) − ν^{k−} g(t, x⁰(t), u^{0k−}).
Similarly to what was done for (Δ_k H)(t) in Section 2.3 (see Lemma 2.12), we can show that the function (Δ_k H̄)(t) is continuously differentiable at the point tk ∈ Θ, and its derivative at this point coincides with −D^k(H̄). Therefore, we can obtain the value of D^k(H̄) by calculating the left or right limit of the derivatives of the function (Δ_k H̄)(t) defined by formula (3.49):

D^k(H̄) = −(d/dt)(Δ_k H̄)(tk).
We now show that

D^k(H̄) = D^k(H^U), k = 1, ..., s.    (3.50)

Indeed, by definition,

−D^k(H^U) = Hx^{Uk+} Hψ^{Uk−} − Hx^{Uk−} Hψ^{Uk+} + [Ht^U]^k.    (3.51)

Furthermore,

H^U_x = Hx + Hu Ux = Hx − νgu Ux = Hx + νgx = H̄x.    (3.52)

Here, we have used the formulas Hu + νgu = 0, gx + gu Ux = 0, and H̄ = H + νg. Also, it is obvious that for (t, w) = (t, w⁰(t)),

H^U_ψ = f^U = f = Hψ = H̄ψ.    (3.53)

Finally,

H^U_t = Ht + Hu Ut = Ht − νgu Ut = Ht + νgt = H̄t,    (3.54)
since gt + gu Ut = 0. The formulas (3.48) and (3.51)–(3.54) imply relation (3.50). Further, note that relations (3.52) imply

[Hx^U]^k = [H̄x]^k, k = 1, ..., s,    (3.55)

where

[H̄x]^k = H̄x(tk, x⁰(tk), u^{0k+}, ψ(tk), ν^{k+}) − H̄x(tk, x⁰(tk), u^{0k−}, ψ(tk), ν^{k−})    (3.56)

is the jump of the function H̄x(t, x⁰(t), u⁰(t), ψ(t), ν(t)) at the point tk ∈ Θ.
For the problem (3.1), (3.2) and the point w⁰, we define the following quadratic form in z̄ = (ξ̄, x̄, ū) for each λ = (α₀, α, β, ψ, ν) ∈ M₀:

2Ω^λ(z̄) = Σ_{k=1}^s ( D^k(H̄^λ) ξ̄k² + 2[H̄x^λ]^k x̄av^k ξ̄k ) + ⟨lpp^λ p̄, p̄⟩ + ∫_{t₀}^{t_f} ⟨H̄ww^λ w̄, w̄⟩ dt.    (3.57)

Therefore, the quadratic form in the problem with local equality-type constraints is defined in the same way as in the problem without local constraints; the only difference is that instead of the function H = ψf in the definition of the new quadratic form, we must use the function H̄ = H + νg.

According to Section 2.1, the quadratic form takes the following form for the problem (3.1), (3.5) and the point w⁰:

2Ω^{Uλ^U}(z̄) = Σ_{k=1}^s ( D^k(H^{Uλ^U}) ξ̄k² + 2[Hx^{Uλ^U}]^k x̄av^k ξ̄k )
  + ⟨lpp^{λ^U} p̄, p̄⟩ + ∫_{t₀}^{t_f} ⟨Hww^{Uλ^U} w̄, w̄⟩ dt.    (3.58)

Here, as above, we have used the superscript U in the notation Ω^U of the quadratic form in order to stress that this quadratic form corresponds to the problem (3.1), (3.5) being considered for a given projection U(t, x, u). We have denoted by λ^U the tuple (α₀, α, β, ψ), which uniquely defines the tuple λ = (α₀, α, β, ψ, ν) by the condition ψfu + νgu = 0. In what follows, these tuples correspond to one another and belong to the sets M₀^U and M₀ of the problems (3.1), (3.5) and (3.1), (3.2), respectively. Formulas (3.47), (3.50), (3.55), (3.57), and (3.58) imply the following assertion.

Lemma 3.8. Let z̄ = (ξ̄, x̄, ū) be an arbitrary element of the space Z₂(Θ), and let z̃ = (ξ̄, x̄, ũ) = (ξ̄, w̃), where ũ = Ux x̄ + Uu ū. Let λ^U be an arbitrary element of M₀^U, and let λ be the corresponding element of M₀. Then Ω^λ(z̃) = Ω^{Uλ^U}(z̄).

3.1.5 Necessary Quadratic Conditions


The following theorem holds.

Theorem 3.9. If w⁰ is a Pontryagin minimum point in the problem (3.1), (3.2), then the following Condition A holds: the set M₀ is nonempty and

max_{λ∈M₀} Ω^λ(z̄) ≥ 0 ∀ z̄ ∈ K,    (3.59)

where K is the critical cone at the point w⁰ defined by conditions (3.27)–(3.31), Ω^λ(z̄) is the quadratic form at the same point defined by (3.57), and M₀ is the set of tuples of Lagrange multipliers satisfying the minimum principle defined by (3.12)–(3.15).

Proof. Let w⁰ be a Pontryagin minimum point in the problem (3.1), (3.2). Then according to Proposition 3.1, w⁰ is a Pontryagin minimum point in the problem (3.1), (3.5). Hence, by Theorem 2.4, the following necessary Condition A^U holds at the point w⁰ in the problem (3.1), (3.5): the set M₀^U is nonempty and

max_{λ^U ∈ M₀^U} Ω^{Uλ^U}(z̄) ≥ 0 ∀ z̄ ∈ K^U.    (3.60)

Let us show that the necessary Condition A holds at the point w⁰ in the problem (3.1), (3.2). Let z̃ = (ξ̄, x̄, ũ) be an arbitrary element of the critical cone K at the point w⁰ in the problem (3.1), (3.2). According to Lemma 3.7, there exists a function ū(t) such that ũ = Ux x̄ + Uu ū, and, moreover, z̄ = (ξ̄, x̄, ū) is an element of the critical cone K^U at the point w⁰ in the problem (3.1), (3.5) (we can set ū = ũ − Ux x̄). Since the necessary Condition A^U holds, there exists an element λ^U ∈ M₀^U such that Ω^{Uλ^U}(z̄) ≥ 0. According to Section 3.1.2, an element λ ∈ M₀ corresponds to the element λ^U ∈ M₀^U. Moreover, by Lemma 3.8, Ω^λ(z̃) = Ω^{Uλ^U}(z̄). Therefore, Ω^λ(z̃) ≥ 0. Since z̃ is an arbitrary element of K, Condition A holds. The theorem is proved.

Therefore, we have obtained the final form of the quadratic necessary condition for
the Pontryagin minimum in the problem with local equality-type constraints, Condition A,
in which there is no projection U . This condition is a natural generalization of the quadratic
necessary Condition A in the problem without local constraints.

3.2 Quadratic Sufficient Conditions in the Problem with Mixed Control-State Equality Constraints on a Fixed Time Interval
3.2.1 Auxiliary Problem V
Because of the projection U(t, x, u), the problem (3.1), (3.5) has the property that a strict minimum is not attained at any point. For this reason, the problem (3.1), (3.5) cannot be directly used for obtaining quadratic sufficient conditions that guarantee a strict minimum. To overcome this difficulty, we consider a new auxiliary problem obtained by adding the additional constraint

∫_{t₀}^{t_f} (u − U(t, x, u))² dt ≤ 0,

where (u − U)² = ⟨u − U, u − U⟩. Representing this constraint as an endpoint constraint by introducing a new state variable y, we arrive at the following problem on a fixed interval [t₀, t_f]:
J(x₀, x_f) → min,
F(x₀, x_f) ≤ 0, K(x₀, x_f) = 0, (x₀, x_f) ∈ P,    (3.61)
y₀ = 0, y_f ≤ 0,    (3.62)
ẏ = ½ (u − U(t, x, u))²,    (3.63)
ẋ = f(t, x, U(t, x, u)), (t, x, u) ∈ Q₁,    (3.64)

where x₀ = x(t₀), x_f = x(t_f), y₀ = y(t₀), and y_f = y(t_f).
Problem (3.61)–(3.64) is called the auxiliary problem V. By the superscript V we denote all objects referring to this problem. Therefore, the auxiliary problem V differs from the auxiliary problem (3.1), (3.5), or problem U, by the presence of the additional constraints (3.62) and (3.63).

If (y, x, u) is an admissible triple in problem (3.61)–(3.64), then y ≡ 0, and the pair w = (x, u) satisfies the constraints of problem (3.1), (3.2). Indeed, (3.62) and (3.63) imply

y(t) ≡ 0, U(t, x(t), u(t)) = u(t),    (3.65)

and then

ẋ(t) − f(t, x(t), u(t)) = ẋ(t) − f(t, x(t), U(t, x(t), u(t))) = 0,
g(t, x(t), u(t)) = g(t, x(t), U(t, x(t), u(t))) = 0,

since (t, x(t), u(t)) ∈ Q₁. The converse is also true: if w = (x, u) is an admissible pair in problem (3.1), (3.2) and y = 0, then the triple (y, x, u) is admissible in Problem V, since conditions (3.65) hold for it.
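The passage from (3.62)–(3.63) to (3.65), used above, rests on a one-line monotonicity argument, which we spell out for convenience:

\[
\dot y = \tfrac12\,\bigl(u - U(t,x,u)\bigr)^2 \ge 0,
\quad y(t_0) = 0,
\quad y(t_f) \le 0
\;\Longrightarrow\;
y \equiv 0
\;\Longrightarrow\;
U(t, x(t), u(t)) = u(t) \ \text{a.e. on } [t_0, t_f].
\]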

3.2.2 Bounded Strong γ -Sufficiency


Fix an admissible point w⁰ = (x⁰, u⁰) ∈ W in problem (3.1), (3.2) satisfying the assumptions of Section 2.1. Let y⁰ = 0. Then (y⁰, w⁰) = (0, w⁰) is an admissible point in Problem V. Let Φ(t, u) be the admissible function defined in Section 2.3 (see Definition 2.17). The higher order at the point (0, w⁰) in Problem V is defined by the relation

γ^V(δy, δw) = ‖δy‖²_C + ‖δx‖²_C + ∫_{t₀}^{t_f} Φ(t, u⁰ + δu) dt = ‖δy‖²_C + γ(δw).    (3.66)

The violation function is defined by the relation

σ^V(δy, δw) = (δJ)₊ + |F(p⁰ + δp)₊| + |δK| + ‖δẋ − δf^U‖₁
  + ‖δẏ − ½(u⁰ + δu − U(t, x⁰ + δx, u⁰ + δu))²‖₁ + (δy_f)₊ + |δy₀|,    (3.67)

where

δf^U = f^U(t, w⁰ + δw) − f^U(t, w⁰)
     = f(t, x⁰ + δx, U(t, x⁰ + δx, u⁰ + δu)) − f(t, x⁰, u⁰).    (3.68)

We will use the concept of a bounded strong minimum and also that of a bounded
strong γ V -sufficiency at the point (0, w 0 ) in the auxiliary problem V . Let us introduce
analogous concepts for problem (3.1), (3.2) at the point w 0 .

Definition 3.10. We say that a point w⁰ = (x⁰, u⁰) is a point of strict bounded strong minimum in problem (3.1), (3.2) if there exist no sequence {δw} in the space W without zero members and no compact set C ⊂ Q such that |δx(t₀)| → 0, ‖δx‖_C → 0, and the following conditions hold for all members of the sequence {δw}:

(t, w⁰ + δw) ∈ C, δJ = J(p⁰ + δp) − J(p⁰) ≤ 0,
F(p⁰ + δp) ≤ 0, K(p⁰ + δp) = 0,
ẋ⁰ + δẋ = f(t, w⁰ + δw), g(t, w⁰ + δw) = 0.

As in Section 2.7 (see Definition 2.92), we denote by δx the tuple of essential components of the variation δx. (The definition of unessential components in problem (3.1), (3.2) is the same as in problem (2.1)–(2.4), but now neither the function f nor the function g depends on these components.)

Definition 3.11. We say that w⁰ = (x⁰, u⁰) is a point of bounded strong γ-sufficiency in problem (3.1), (3.2) if there exist no sequence {δw} in the space W without zero members and no compact set C ⊂ Q such that

|δx(t₀)| → 0, ‖δx‖_C → 0, σ(δw) = o(γ(δw)),    (3.69)

and the following conditions hold for all members of the sequence {δw}:

g(t, w⁰ + δw) = 0 and (t, w⁰ + δw) ∈ C.    (3.70)

Here,

σ(δw) = (δJ)₊ + |F(p⁰ + δp)₊| + |δK| + ‖δẋ − δf‖₁.    (3.71)

Therefore, the violation function σ(δw) in problem (3.1), (3.2) contains no term related to the local constraint g(t, x, u) = 0; however, in the definition of the bounded strong γ-sufficiency for this problem, it is required that the sequence {w⁰ + δw} satisfy this constraint. The local constraint thus does not have the same rights as the other constraints. Obviously, the following assertion holds.

Proposition 3.12. The bounded strong γ -sufficiency at the point w0 in problem (3.1), (3.2)
implies the strict bounded strong minimum.

Our goal is to obtain a sufficient condition for the bounded strong γ-sufficiency in problem (3.1), (3.2) using the sufficient condition for the bounded strong γ^V-sufficiency in the auxiliary problem V without local constraints. For this purpose, we prove the following assertion.

Proposition 3.13. Let the point (0, w⁰) be a point of bounded strong γ^V-sufficiency in Problem V. Then w⁰ is a point of bounded strong γ-sufficiency in problem (3.1), (3.2).

Proof. Suppose that w⁰ is not a point of the bounded strong γ-sufficiency in problem (3.1), (3.2). Then there exist a sequence {δw} containing nonzero members and a compact set C ⊂ Q such that conditions (3.69) and (3.70) hold. We show that in this case, (0, w⁰) is not a point of bounded strong γ^V-sufficiency in Problem V. Consider the sequence {(0, δw)} with δy = 0. Condition (3.70) implies

U(t, x⁰ + δx, u⁰ + δu) = u⁰ + δu,    (3.72)

and then

δf = f(t, x⁰ + δx, u⁰ + δu) − f(t, x⁰, u⁰)
   = f(t, x⁰ + δx, U(t, x⁰ + δx, u⁰ + δu)) − f(t, x⁰, u⁰) = δf^U.    (3.73)

It follows from (3.66)–(3.73) that

σ^V(0, δw) = σ(δw) = o(γ(δw)) = o(γ^V(0, δw)).

Therefore, (0, w⁰) is not a point of the bounded strong γ^V-sufficiency in Problem V. The proposition is proved.

Next, we formulate the main result of this section: the quadratic sufficient conditions for the bounded strong γ-sufficiency at the point w⁰ in problem (3.1), (3.2). Then we show that these conditions guarantee the bounded strong γ^V-sufficiency in Problem V, and hence, by Proposition 3.13, this implies the bounded strong γ-sufficiency at the point w⁰ in problem (3.1), (3.2). This is our program. We now formulate the main result.

3.2.3 Quadratic Sufficient Conditions


For an admissible point w0 = (x 0 , u0 ) in problem (3.1), (3.2), we give the following
definition.

Definition 3.14. An element λ = (α₀, α, β, ψ, ν) ∈ M₀ is said to be strictly Legendre if the following conditions hold:

(1) D^k(H̄^λ) > 0 for all tk ∈ Θ;

(2) for any t ∈ [t₀, t_f] \ Θ, the form

⟨H̄uu(t, w⁰(t), ψ(t), ν(t)) ū, ū⟩    (3.74)

quadratic in ū is positive definite on the subspace of vectors ū ∈ R^{d(u)} such that

gu(t, w⁰(t)) ū = 0;    (3.75)

(3) the following Condition C^{k−} holds for each point tk ∈ Θ: the form

⟨H̄uu(tk, x⁰(tk), u^{0k−}, ψ(tk), ν^{k−}) ū, ū⟩    (3.76)

quadratic in ū is positive definite on the subspace of vectors ū ∈ R^{d(u)} such that

gu(tk, x⁰(tk), u^{0k−}) ū = 0;    (3.77)

(4) the following Condition C^{k+} holds for each point tk ∈ Θ: the quadratic form

⟨H̄uu(tk, x⁰(tk), u^{0k+}, ψ(tk), ν^{k+}) ū, ū⟩    (3.78)

is positive definite on the subspace of vectors ū ∈ R^{d(u)} such that

gu(tk, x⁰(tk), u^{0k+}) ū = 0.    (3.79)

Further, denote by M₀⁺ the set of λ ∈ M₀ such that the following conditions hold:

H(t, x⁰(t), u, ψ(t)) > H(t, x⁰(t), u⁰(t), ψ(t))    (3.80)
if t ∈ [t₀, t_f] \ Θ, u ∈ 𝒰(t, x⁰(t)), u ≠ u⁰(t),    (3.81)

where 𝒰(t, x) = {u ∈ R^{d(u)} | (t, x, u) ∈ Q, g(t, x, u) = 0};

H(tk, x⁰(tk), u, ψ(tk)) > H(tk, x⁰(tk), u^{0k−}, ψ(tk)) = H(tk, x⁰(tk), u^{0k+}, ψ(tk))    (3.82)
if tk ∈ Θ, u ∈ 𝒰(tk, x⁰(tk)), u ∉ {u^{0k−}, u^{0k+}}.    (3.83)

Denote by Leg₊(M₀⁺) the set of all strictly Legendre elements λ ∈ M₀⁺.

Definition 3.15. We say that Condition B holds at the point w⁰ in problem (3.1), (3.2) if the set Leg₊(M₀⁺) is nonempty and there exist a compact set M ⊂ Leg₊(M₀⁺) and a constant C > 0 such that

max_{λ∈M} Ω^λ(z̄) ≥ C γ̄(z̄) ∀ z̄ ∈ K,    (3.84)

where the quadratic form Ω^λ(z̄) and the critical cone K for problem (3.1), (3.2) and the point w⁰ were defined by relations (3.57) and (3.27)–(3.31), respectively, and

γ̄(z̄) = ⟨ξ̄, ξ̄⟩ + ⟨x̄(t₀), x̄(t₀)⟩ + ∫_{t₀}^{t_f} ⟨ū(t), ū(t)⟩ dt    (3.85)

(as in (2.15)). We have the following theorem.

Theorem 3.16. If Condition B holds for the point w⁰ in the problem (3.1), (3.2), then we have the bounded strong γ-sufficiency at this point.

We now prove this theorem. As was said, by Proposition 3.13, it suffices to show that Condition B guarantees the bounded strong γ^V-sufficiency at the point (0, w⁰) in the problem without local constraints. For this purpose, we write the quadratic sufficient condition of Section 2.1 for the auxiliary Problem V at the point (0, w⁰).

3.2.4 Proofs of Sufficient Quadratic Conditions


We first write the set M₀^V of normalized Lagrange multipliers satisfying the minimum principle for the point (0, w⁰) in Problem V. The Pontryagin function in Problem V has the form

H^V(t, y, x, u, ψ_y, ψ_x) = ψ_x f(t, x, U(t, x, u)) + ½ ψ_y (u − U(t, x, u))²
                          = H^U(t, x, u, ψ_x) + ½ ψ_y (u − U(t, x, u))²,    (3.86)

and the endpoint Lagrange function is defined by the relation

l^V(y₀, x₀, y_f, x_f, α₀, α_y, α, β_y, β)
  = α₀ J(x₀, x_f) + αF(x₀, x_f) + βK(x₀, x_f) + α_y y_f + β_y y₀    (3.87)
  = l(x₀, x_f, α₀, α, β) + α_y y_f + β_y y₀.

The set M₀^V consists of tuples (α₀, α_y, α, β_y, β, ψ_y, ψ_x) such that

α₀ ≥ 0, α ≥ 0, α_y ≥ 0,    (3.88)
αF(p⁰) = 0,    (3.89)
α_y y_f⁰ = 0,    (3.90)
α₀ + |α| + α_y + |β_y| + |β| = 1,    (3.91)
ψ̇_y = 0, ψ_y(t₀) = −β_y, ψ_y(t_f) = α_y,    (3.92)
ψ̇_x = −H^V_x,    (3.93)
ψ_x(t₀) = −l_{x₀}, ψ_x(t_f) = l_{x_f},    (3.94)
H^V(t, y⁰(t), x⁰(t), u, ψ_y(t), ψ_x(t)) ≥ H^V(t, y⁰(t), x⁰(t), u⁰(t), ψ_y(t), ψ_x(t))    (3.95)

if t ∈ [t₀, t_f] \ Θ and (t, x⁰(t), u) ∈ Q₁. Let us analyze these conditions. Since y_f⁰ = y⁰(t_f) = 0, condition (3.90) holds automatically, and we can exclude it from consideration. Further, since H^V_x = H^U_x − ψ_y (u⁰ − U(t, w⁰)) U_x(t, w⁰) = H^U_x, condition (3.93) is equivalent to the condition

ψ̇_x = −H^U_x.    (3.96)

It follows from (3.92) that

ψ_y ≡ const = −β_y = α_y.    (3.97)

Therefore, the normalization condition (3.91) is equivalent to the condition

α₀ + |α| + |β| + α_y = 1.    (3.98)

We now turn to the minimum condition (3.95). It follows from (3.86) and (3.97) that it is equivalent to the condition

H^U(t, x⁰(t), u, ψ_x(t)) + ½ α_y (u − U(t, x⁰(t), u))² ≥ H^U(t, x⁰(t), u⁰(t), ψ_x(t))    (3.99)

whenever t ∈ [t₀, t_f] \ Θ and (t, x⁰(t), u) ∈ Q₁. Therefore, we can identify the set M₀^V with the set of tuples λ^V = (α₀, α, β, ψ_x, α_y) such that conditions (3.88), (3.89), (3.98), (3.96), (3.94), and (3.99) hold.
Let there exist an element λ = (α₀, α, β, ψ, ν) of the set M₀ of problem (3.1), (3.2) at the point w⁰. Then its projection λ^U = (α₀, α, β, ψ) is an element of the set M₀^U of problem (3.1), (3.5) (of problem U) at the same point. Let 0 ≤ α_y ≤ 1. We set

λ^V = ((1 − α_y) λ^U, α_y).    (3.100)

Let us show that λ^V ∈ M₀^V. Indeed,

(1 − α_y) α₀ + (1 − α_y)|α| + (1 − α_y)|β| + α_y = (1 − α_y)(α₀ + |α| + |β|) + α_y = 1;

i.e., the normalization condition (3.98) holds for λ^V. Also conditions (3.88), (3.89), (3.96), and (3.94) hold.

Let us verify the minimum condition (3.99). Since λ^U ∈ M₀^U, the conditions

t ∈ [t₀, t_f] \ Θ, (t, x⁰(t), u) ∈ Q₁    (3.101)

imply

H^U(t, x⁰(t), u, ψ_x(t)) ≥ H^U(t, x⁰(t), u⁰(t), ψ_x(t)).    (3.102)

Moreover, the condition α_y ≥ 0 implies

½ α_y (u − U(t, x⁰(t), u))² ≥ 0.    (3.103)

Adding inequalities (3.102) and (3.103), we obtain the minimum condition (3.99). Therefore, we have proved the following assertion.

Proposition 3.17. The conditions λ = (α₀, α, β, ψ, ν) ∈ M₀, 0 ≤ α_y ≤ 1, λ^U = (α₀, α, β, ψ), and λ^V = ((1 − α_y) λ^U, α_y) imply λ^V ∈ M₀^V.

Further, let λ = (α₀, α, β, ψ, ν) ∈ M₀⁺; i.e., λ ∈ M₀, and the following conditions of the strict minimum principle hold: (3.81) implies (3.80), and (3.83) implies (3.82). We set λ^U = (α₀, α, β, ψ). It follows from the condition λ ∈ M₀ that λ^U ∈ M₀^U. Let

0 < α_y < 1 and λ^V = ((1 − α_y) λ^U, α_y).    (3.104)

Then according to Proposition 3.17, λ^V ∈ M₀^V. Let us show that λ^V ∈ M₀^{V+}; i.e., the strict minimum principle at the point (0, w⁰) in Problem V holds for λ^V. First, let t ∈ [t₀, t_f] \ Θ, and let

u ∈ R^{d(u)}, (t, x⁰(t), u) ∈ Q₁, u ≠ u⁰(t).    (3.105)
If g(t, x⁰(t), u) = 0 in this case, then

U(t, x⁰(t), u) = u    (3.106)

and the strict inequality (3.80) holds. It follows from (3.106) and (3.80) that

H^U(t, x⁰(t), u, ψ_x(t)) > H^U(t, x⁰(t), u⁰(t), ψ_x(t)).    (3.107)

Taking into account that α_y > 0 and (u − U)² ≥ 0, we obtain

H^U(t, x⁰(t), u, ψ_x(t)) + ½ α_y (u − U(t, x⁰(t), u))² > H^U(t, x⁰(t), u⁰(t), ψ_x(t)),    (3.108)

i.e., the strict minimum condition holds for the function H^V defined by relation (3.86). If, along with conditions (3.105), the condition g(t, x⁰(t), u) ≠ 0 holds, then this implies

U(t, x⁰(t), u) ≠ u,    (3.109)

and then

½ α_y (u − U(t, x⁰(t), u))² > 0,    (3.110)

since α_y > 0. Since λ^U ∈ M₀^U, (3.105) implies the nonstrict inequality

H^U(t, x⁰(t), u, ψ_x(t)) ≥ H^U(t, x⁰(t), u⁰(t), ψ_x(t)).    (3.111)

Again, inequalities (3.110) and (3.111) imply the strict inequality (3.108). Therefore, for t ∉ Θ, we have the strict minimum in u at the point u⁰(t) for H^V(t, 0, x⁰(t), u, ψ_y, ψ_x(t)). The case t = tk ∈ Θ is considered analogously. Therefore, we have proved the following assertion.

Proposition 3.18. The conditions λ = (α₀, α, β, ψ, ν) ∈ M₀⁺, 0 < α_y < 1, λ^U = (α₀, α, β, ψ), and λ^V = ((1 − α_y) λ^U, α_y) imply λ^V ∈ M₀^{V+}.

Let Leg+ (M0 ) be the set of all strictly Legendre elements λ ∈ M0 in the problem (3.1),
(3.2) at the point w0 . The definition of these elements was given in Section 3.2.3. Also,
denote by Leg+ (M0V ) the set of all strictly Legendre elements λV ∈ M0V of Problem V at
the point (0, w0 ). The definition of these elements was given in Section 2.1. Let us prove
the following assertion.

Proposition 3.19. The conditions λ = (α₀, α, β, ψ, ν) ∈ Leg₊(M₀), 0 < α_y < 1, λ^U = (α₀, α, β, ψ), and λ^V = ((1 − α_y) λ^U, α_y) imply λ^V ∈ Leg₊(M₀^V).

Proof. The definitions of the element λ^V and the function H^V imply

H^V = (1 − α_y) H^U + ½ α_y (u − U)²,    (3.112)

where H^V corresponds to the element λ^V and H^U corresponds to the element λ^U. It follows from (3.112) that the relation H^V_x = (1 − α_y) H^U_x holds on the trajectory (t, w⁰(t)). But, according to (3.52), H^U_x = H̄x, where H̄x = H̄x^λ corresponds to the element λ. Therefore,

H^V_x = (1 − α_y) H̄x.    (3.113)

Further, since H^V is independent of y, we have

H^V_y = 0.    (3.114)

Also, (3.112) implies that H^V_t = (1 − α_y) H^U_t on the trajectory (t, w⁰(t)). But, according to (3.54), H^U_t = H̄t. Hence

H^V_t = (1 − α_y) H̄t.    (3.115)

Finally, the definitions of the functions H^V and H̄ imply that the following relations hold on the trajectory (t, w⁰(t)):

H^V_{ψx} = f^U(t, w⁰) = f(t, w⁰) = H̄ψ,    (3.116)
H^V_{ψy} = ½ (u⁰ − U(t, w⁰))² = 0.    (3.117)

We obtain from the definitions of D k (H V ) and D k (H̄ ) and also from conditions (3.113)–
(3.117) that

−D k (H V ) = HxV k+ HψVxk− + HyV k+ HψVyk−


− (HxV k− HψVxk+ + HyV k− HψVyk+ ) + [HtV ]k
= HxV k+ HψVxk− − HxV k− HψVxk+ + [HtV ]k
= (1 − αy )H̄xk+ H̄ψk−
x
− (1 − αy )H̄xk− H̄ψk+
x
+ (1 − αy )[H̄t ]k
= −(1 − αy )D k (H̄ ) ∀ tk ∈ . (3.118)

Since λ ∈ Leg+ (M0 ) by condition, D k (H̄ ) > 0 for all tk ∈ . This and (3.118) together with
the inequality 1 − αy > 0 imply

D k (H V ) > 0 ∀ tk ∈ . (3.119)

Let us verify the conditions for the strict Legendre property of the element λ^V. For this purpose, we calculate the quadratic form ⟨H^V_uu ū, ū⟩, where ū ∈ R^{d(u)}, for this element on the trajectory (t, w⁰(t)). Differentiating relation (3.112) in u and multiplying it by ū, we obtain

H^V_u ū = (1 − α_y) H^U_u ū + α_y ⟨(u − U), (ū − Uu ū)⟩.

The repeated differentiation in u and the multiplication by ū yield

⟨H^V_uu ū, ū⟩ = (1 − α_y)⟨H^U_uu ū, ū⟩ + α_y (ū − Uu ū)² − α_y ⟨((u − U) Uuu) ū, ū⟩.

Substituting (t, w) = (t, w⁰(t)) and, moreover, taking into account that u⁰ = U(t, w⁰), we obtain ⟨H^V_uu ū, ū⟩ = (1 − α_y)⟨H^U_uu ū, ū⟩ + α_y (ū − Uu ū)². Finally, according to (3.45), ⟨H^U_uu ū, ū⟩ = ⟨H̄uu (Uu ū), (Uu ū)⟩, where H̄uu = H̄uu^λ corresponds to the element λ. Hence

⟨H^V_uu ū, ū⟩ = (1 − α_y)⟨H̄uu (Uu ū), (Uu ū)⟩ + α_y (ū − Uu ū)².    (3.120)

The values of the derivatives are taken for (t, w) = (t, w⁰(t)).

It is easy to verify that for each t ∈ [t₀, t_f], form (3.120) quadratic in ū is positive definite on R^{d(u)}. Indeed, suppose first that t ∈ [t₀, t_f] \ Θ. Recall that the mapping ū ∈ R^{d(u)} ↦ Uu(t, w⁰(t)) ū ∈ R^{d(u)} is the projection on the subspace

{ū ∈ R^{d(u)} | gu(t, w⁰(t)) ū = 0}.    (3.121)

This and the condition λ ∈ Leg₊(M₀) imply that the quadratic form ⟨H̄uu (Uu ū), (Uu ū)⟩ is positive semidefinite on R^{d(u)} and positive definite on subspace (3.121). Furthermore, the quadratic form (ū − Uu ū)² is positive semidefinite on R^{d(u)} and positive outside subspace (3.121). This and the conditions 0 < α_y < 1 imply the positivity of the quadratic form ⟨H^V_uu ū, ū⟩ outside the origin of the space R^{d(u)} and, therefore, the positive definiteness on R^{d(u)}. The case t ∈ [t₀, t_f] \ Θ has been considered. The case t = tk ∈ Θ is considered similarly. Therefore, all the conditions needed for the element λ^V to belong to the set Leg₊(M₀^V) hold. The proposition is proved.

Propositions 3.18 and 3.19 imply the following assertion.

Lemma 3.20. The conditions λ = (α₀, α, β, ψ, ν) ∈ Leg₊(M₀⁺), 0 < α_y < 1, λ^U = (α₀, α, β, ψ), and λ^V = ((1 − α_y) λ^U, α_y) imply λ^V ∈ Leg₊(M₀^{V+}).

Fix a number α_y such that 0 < α_y < 1, e.g., α_y = 1/2, and consider the linear operator

λ = (α₀, α, β, ψ, ν) ↦ λ^V = ((1 − α_y) λ^U, α_y),    (3.122)

where λ^U = (α₀, α, β, ψ) is the projection of the element λ. Lemma 3.20 implies the following assertion.

Lemma 3.21. Operator (3.122) transforms an arbitrary nonempty compact set M ⊂ Leg₊(M₀⁺) into a nonempty compact set M^V ⊂ Leg₊(M₀^{V+}).

Now let us consider the critical cone K^V of Problem V at the point (0, w⁰). According to the definition given in Section 2.1, it consists of elements z̄^V = (ξ̄, ȳ, w̄) = (ξ̄, ȳ, x̄, ū) such that the following conditions hold:

z̄ = (ξ̄, x̄, ū) = (ξ̄, w̄) ∈ Z₂(Θ), ȳ ∈ PW^{1,2}(Δ, R),
Jp p̄ ≤ 0, F_{ip} p̄ ≤ 0 ∀ i ∈ I, Kp p̄ = 0,
ȳ₀ = 0, ȳ_f ≤ 0,
ȳ˙ = 0, [ȳ]^k = 0 ∀ tk ∈ Θ,
x̄˙ = fx x̄ + fu (Uw w̄),
[x̄]^k = [f]^k ξ̄k ∀ tk ∈ Θ.

These conditions imply ȳ = 0 and z̄ = (ξ̄, x̄, ū) ∈ K^U, where K^U is the critical cone of problem U, i.e., the problem (3.1), (3.5), at the point w⁰. Then, according to Lemma 3.6, z̃ = (ξ̄, x̄, Uw w̄) is an element of the critical cone K of the problem (3.1), (3.2) at the point w⁰. Therefore, we have proved the following assertion.

Lemma 3.22. Let z̄^V = (ξ̄, ȳ, x̄, ū) = (ξ̄, ȳ, w̄) ∈ K^V. Then ȳ = 0 and z̃ = (ξ̄, x̄, Uw w̄) ∈ K. Therefore, the linear operator

(ξ̄, ȳ, x̄, ū) ↦ (ξ̄, x̄, Ux x̄ + Uu ū)    (3.123)

transforms the critical cone K^V into the critical cone K.

We now consider the quadratic forms. Let λ ∈ M₀ be an arbitrary element, and let λ^V be its image under mapping (3.122). According to Proposition 3.17, λ^V ∈ M₀^V. For Problem V and the point (0, w⁰), let us consider the quadratic form Ω^{Vλ^V} (corresponding to the element λ^V), which was defined in Section 2.1, and let us study its relation with the quadratic form Ω^λ (which corresponds to the element λ) at the point w⁰ of the problem (3.1), (3.2). We have already shown that

D^k(H^V) = (1 − α_y) D^k(H̄), tk ∈ Θ    (3.124)

(we omit the superscripts λ^V and λ of H^V and H̄, respectively). It follows from (3.113) that

[Hx^V]^k = (1 − α_y)[H̄x]^k, tk ∈ Θ,    (3.125)

and (3.114) implies

[Hy^V]^k = 0, tk ∈ Θ.    (3.126)

We obtain from (3.112) that

⟨H^V_ww w̄, w̄⟩ = (1 − α_y)⟨H^U_ww w̄, w̄⟩ + α_y ⟨( ½ ∂²/∂w² (u − U)² ) w̄, w̄⟩,    (3.127)

where w̄ = (x̄, ū), x̄ ∈ PW^{1,2}(Δ, R^{d(x)}), ū ∈ L²(Δ, R^{d(u)}). According to (3.47),

⟨H^U_ww w̄, w̄⟩ = ⟨H̄ww w̃, w̃⟩,    (3.128)

where w̃ = (x̄, ũ) = (x̄, Uw w̄). Let us calculate the second summand in formula (3.127). We have

( ½ ∂/∂w (u − U)² ) w̄ = ⟨(u − U), (ū − Uw w̄)⟩.

Therefore, for w = w⁰(t), we have

⟨( ½ ∂²/∂w² (u − U)² ) w̄, w̄⟩ = (ū − Uw w̄)² − ⟨((u⁰ − U(t, w⁰)) Uww) w̄, w̄⟩
  = (ū − Uw w̄)² = (ū − ũ)².    (3.129)

We obtain from (3.127)–(3.129) that

⟨H^V_ww w̄, w̄⟩ = (1 − α_y)⟨H̄ww w̃, w̃⟩ + α_y (ū − ũ)²,    (3.130)

where ũ = Uw w̄ and w̃ = (x̄, ũ). Since H^V is independent of y, we have

H^V_yy = 0 and H^V_yw = 0.    (3.131)

Let z̄^V = (ξ̄, ȳ, x̄, ū) be an arbitrary tuple such that

z̄ = (ξ̄, x̄, ū) = (ξ̄, w̄) ∈ Z₂(Θ), ȳ ∈ PW^{1,2}.    (3.132)

The definitions of the quadratic forms Ω^{Vλ^V} and Ω^λ and also relations (3.124)–(3.126), (3.130), and (3.131) imply

Ω^{Vλ^V}(z̄^V) = (1 − α_y) Ω^λ(z̃) + α_y ∫_{t₀}^{t_f} (ū − ũ)² dt,    (3.133)

where ũ = Ux x̄ + Uu ū and z̃ = (ξ̄, x̄, ũ). Therefore, we have proved the following assertion.

Lemma 3.23. Let an element λ ∈ M0 and a tuple z̄V = (ξ̄, ȳ, x̄, ū) satisfying conditions (3.132) be given, and let λV be the image of λ under mapping (3.122). Then formula (3.133) holds, where ΩV^{λV} is the quadratic form calculated for Problem V, the point (0, w0), and the element λV, and Ω^λ is the quadratic form calculated for the problem (3.1), (3.2), the point w0, and the element λ.
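The cancellation behind (3.129) is easy to confirm symbolically. The following sketch is a sanity check only, written under the simplifying assumption of scalar x and u (so U, standing in for the feedback control of Problem V with t suppressed, is a plain two-argument function; all names are illustrative):

```python
import sympy as sp

# Sanity check of the Hessian identity used in (3.129), for scalar x and u.
x, u, xb, ub = sp.symbols('x u xbar ubar')
U = sp.Function('U')(x, u)                     # stands in for U(w, t), t suppressed
phi = sp.Rational(1, 2) * (u - U)**2
vec = sp.Matrix([xb, ub])
quad = (vec.T * sp.hessian(phi, (x, u)) * vec)[0, 0]
Uw_wbar = sp.diff(U, x)*xb + sp.diff(U, u)*ub  # U_w wbar
expected = (ub - Uw_wbar)**2 - (u - U) * (vec.T * sp.hessian(U, (x, u)) * vec)[0, 0]
print(sp.simplify(quad - expected))            # 0; at w = w0(t) one has u0 = U, giving (3.129)
```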

We now assume that the following sufficient Condition B holds at the point w0 in problem (3.1), (3.2): there exist a nonempty compact set M ⊂ Leg+(M0+) and a constant C > 0 such that

max_{λ∈M} Ω^λ(z̄) ≥ C γ̄(z̄) ∀ z̄ ∈ K. (3.134)

Let us show that in this case, the sufficient condition of Section 2.1 (denoted by BV) holds at the point (0, w0) in Problem V.
Let z̄V = (ξ̄, ȳ, x̄, ū) be an arbitrary element of the critical cone KV of Problem V at the point (0, w0). Then ȳ = 0, and, by Lemma 3.22, z̃ = (ξ̄, x̄, ũ) ∈ K, where ũ = Uw w̄. Condition (3.134) implies the existence of λ ∈ M such that

Ω^λ(z̃) ≥ C γ̄(z̃). (3.135)

Let λV be the image of λ under the mapping defined by operator (3.122). Then, by Lemma 3.23, formula (3.133) holds. It follows from (3.133) and (3.135) that

ΩV^{λV}(z̄V) ≥ (1 − αy)C γ̄(z̃) + αy ∫_{t0}^{tf} (ū − ũ)² dt. (3.136)

Therefore,

max_{λV∈MV} ΩV^{λV}(z̄V) ≥ (1 − αy)C γ̄(z̃) + αy ∫_{t0}^{tf} (ū − ũ)² dt, (3.137)
where M V is the image of the compact set M under the mapping defined by operator (3.122).
By Lemma 3.21,
M V ⊂ Leg+ (M0V + ). (3.138)
Conditions (3.137) and (3.138) imply that the sufficient Condition BV holds at the point (0, w0) in Problem V. To verify this, it suffices to show that the right-hand side of inequality (3.137) on the cone KV is estimated from below by the functional

γ̄V(z̄V) := ξ̄² + ȳ0² + x̄0² + ∫_{t0}^{tf} ū² dt = γ̄(z̄) + ȳ0²

with a small coefficient ε > 0. Since ȳ = 0 for all z̄V ∈ KV, it suffices to prove that there exists ε > 0 such that (1 − αy)C γ̄(z̃) + αy ∫(ū − ũ)² dt ≥ ε γ̄(z̄), or

(1 − αy)C ( ξ̄² + x̄0² + ∫_{t0}^{tf} ũ² dt ) + αy ∫_{t0}^{tf} (ū − ũ)² dt ≥ ε ( ξ̄² + x̄0² + ∫_{t0}^{tf} ū² dt ). (3.139)

We set ū − ũ = û. Then ū² = û² + 2ûũ + ũ² ≤ 2û² + 2ũ². Hence the right-hand side of (3.139) does not exceed 2ε( ξ̄² + x̄0² + ∫(ũ² + û²) dt ), while the left-hand side is at least min{(1 − αy)C, αy}( ξ̄² + x̄0² + ∫(ũ² + û²) dt ), so ε = (1/2) min{(1 − αy)C, αy} gives the estimate required. Therefore, we have proved the following assertion.

Lemma 3.24. Let the sufficient Condition B hold at the point w 0 in problem (3.1), (3.2).
Then the sufficient Condition BV of Section 2.1 holds at the point (0, w 0 ) in Problem V .

By Theorem 2.101, Condition BV implies the existence of the bounded strong γV-sufficiency at the point (0, w0) in Problem V; by Proposition 3.13, this implies the existence of the bounded strong γ-sufficiency at the point w0 in the problem (3.1), (3.2). Also, taking into account Lemma 3.24, we obtain that Condition B is sufficient for the bounded strong γ-sufficiency at the point w0 in problem (3.1), (3.2). Therefore, we have proved Theorem 3.16, to which this section is devoted.

3.3 Quadratic Conditions in the Problem with Mixed Control-State Equality Constraints on a Variable Time Interval
3.3.1 Statement of the Problem
Here, quadratic optimality conditions, both necessary and sufficient, are presented (as in
[93]) in the following canonical Dubovitskii–Milyutin problem on a variable time interval.
Let T denote a trajectory (x(t), u(t) | t ∈ [t0, tf]), where the state variable x(·) is a Lipschitz-continuous function, and the control variable u(·) is a bounded measurable function on a time interval Δ = [t0, tf]. The interval Δ is not fixed. For each trajectory T we denote by p = (t0, x(t0), tf, x(tf)) the vector of the endpoints of the time-state variable (t, x). It is required to find T minimizing the functional
to find T minimizing the functional

J(T ) := J (p) → min (3.140)

subject to the constraints

F (p) ≤ 0, K(p) = 0, (3.141)


ẋ(t) = f (t, x(t), u(t)), (3.142)
g(t, x(t), u(t)) = 0, (3.143)
p ∈ P , (t, x(t), u(t)) ∈ Q, (3.144)

where P and Q are open sets, and x, u, F , K, f , and g are vector functions.
We assume that the functions J , F , and K are defined and twice continuously differen-
tiable on P , and that the functions f and g are defined and twice continuously differentiable
on Q. It is also assumed that the gradients with respect to the control giu(t, x, u), i = 1, . . . , d(g), are linearly independent at each point (t, x, u) ∈ Q such that g(t, x, u) = 0. Here d(g) is the dimension of the vector g.

3.3.2 Necessary Conditions for a Pontryagin Minimum


Let T be a fixed admissible trajectory such that the control u(·) is a piecewise Lipschitz-continuous function on the interval Δ with the set of discontinuity points Θ = {t1, . . . , ts}, t0 < t1 < · · · < ts < tf. Let us formulate a first-order necessary condition for optimality of the trajectory T. We introduce the Pontryagin function
the trajectory T . We introduce the Pontryagin function

H (t, x, u, ψ) = ψf (t, x, u) (3.145)

and the augmented Pontryagin function

H̄ (t, x, u, ψ, ν) = H (t, x, u, ψ) + νg(t, x, u), (3.146)

where ψ and ν are row vectors of the dimensions d(x) and d(g), respectively. Let us define
the endpoint Lagrange function

l(p, α0 , α, β) = α0 J (p) + αF (p) + βK(p), (3.147)



where p = (t0 , x0 , tf , xf ), x0 = x(t0 ), xf = x(tf ), α0 ∈ R, α ∈ (Rd(F ) )∗ , β ∈ (Rd(K) )∗ . Also


we introduce a tuple of Lagrange multipliers

λ = (α0 , α, β, ψ(·), ψ0 (·), ν(·)) (3.148)

such that ψ(·) : Δ → (R^{d(x)})∗ and ψ0(·) : Δ → R¹ are piecewise smooth functions, continuously differentiable on each interval of the set Δ \ Θ, and ν(·) : Δ → (R^{d(g)})∗ is a piecewise continuous function, Lipschitz continuous on each interval of the set Δ \ Θ.
Denote by M0 the set of the normed tuples λ satisfying the conditions of the minimum principle for the trajectory T:

α0 ≥ 0, α ≥ 0, αF(p) = 0, α0 + Σ αi + Σ |βj| = 1,
ψ̇ = −H̄x, ψ̇0 = −H̄t, H̄u = 0, t ∈ Δ \ Θ,
ψ(t0) = −lx0, ψ(tf) = lxf, ψ0(t0) = −lt0, ψ0(tf) = ltf, (3.149)
min_{u∈U(t,x(t))} H(t, x(t), u, ψ(t)) = H(t, x(t), u(t), ψ(t)), t ∈ Δ \ Θ,
H(t, x(t), u(t), ψ(t)) + ψ0(t) = 0, t ∈ Δ \ Θ,

where U(t, x) = {u ∈ R^{d(u)} | g(t, x, u) = 0, (t, x, u) ∈ Q}. The derivatives lx0 and lxf are taken at (p, α0, α, β), where p = (t0, x(t0), tf, x(tf)), and the derivatives H̄x, H̄u, and H̄t are taken at (t, x(t), u(t), ψ(t), ν(t)), where t ∈ Δ \ Θ. (Condition H̄u = 0 follows from the other conditions in this definition, and therefore could be excluded; yet we need to use it later.)
Let us give the definition of Pontryagin minimum in problem (3.140)–(3.144) on a
variable interval [t0 , tf ].

Definition 3.25. The trajectory T affords a Pontryagin minimum if there is no sequence of admissible trajectories T^n = (x^n(t), u^n(t) | t ∈ [t0^n, tf^n]), n = 1, 2, . . . such that
(a) J(T^n) < J(T) for all n;
(b) t0^n → t0, tf^n → tf (n → ∞);
(c) max_{Δ^n ∩ Δ} |x^n(t) − x(t)| → 0 (n → ∞), where Δ^n = [t0^n, tf^n];
(d) ∫_{Δ^n ∩ Δ} |u^n(t) − u(t)| dt → 0 (n → ∞);
(e) there exists a compact set C ⊂ Q such that (t, x^n(t), u^n(t)) ∈ C a.e. on Δ^n for all n.

For convenience, let us give an equivalent definition of the Pontryagin minimum.

Definition 3.26. The trajectory T affords a Pontryagin minimum if for each compact set C ⊂ Q there exists ε > 0 such that J(T̃) ≥ J(T) for all admissible trajectories T̃ = (x̃(t), ũ(t) | t ∈ [t̃0, t̃f]) satisfying the following conditions:
(a) |t̃0 − t0| < ε, |t̃f − tf| < ε;
(b) max_{Δ ∩ Δ̃} |x̃(t) − x(t)| < ε, where Δ̃ = [t̃0, t̃f];
(c) ∫_{Δ ∩ Δ̃} |ũ(t) − u(t)| dt < ε;
(d) (t, x̃(t), ũ(t)) ∈ C a.e. on Δ̃.

The condition M0 ≠ ∅ is equivalent to the Pontryagin minimum principle. It is a first-order necessary condition of Pontryagin minimum for the trajectory T. Thus, the following theorem holds.

Theorem 3.27. If the trajectory T affords a Pontryagin minimum, then the set M0 is
nonempty.

Assume that M0 is nonempty. Using the definition of the set M0 and the full rank
condition of the matrix gu on the surface g = 0, one can easily prove the following statement.

Proposition 3.28. The set M0 is a finite-dimensional compact set, and the mapping λ ↦ (α0, α, β) is injective on M0.

As in Section 3.1, for each λ ∈ M0, tk ∈ Θ, we set

D^k(H̄) = −H̄x^{k+} H̄ψ^{k−} + H̄x^{k−} H̄ψ^{k+} − [H̄t]^k, (3.150)

where H̄x^{k−} = H̄x(tk, x(tk), u(tk−), ψ(tk), ν(tk−)), H̄x^{k+} = H̄x(tk, x(tk), u(tk+), ψ(tk), ν(tk+)), [H̄t]^k = H̄t^{k+} − H̄t^{k−}, etc.

Theorem 3.29. For each λ ∈ M0 the following conditions hold:

D k (H̄ ) ≥ 0, k = 1, . . . , s. (3.151)

Thus, conditions (3.151) follow from the minimum principle conditions (3.149). The following is an alternative method for calculating D^k(H̄): for λ ∈ M0, tk ∈ Θ, consider the function

(Δk H̄)(t) = H̄(t, x(t), u(tk+), ψ(t), ν(tk+)) − H̄(t, x(t), u(tk−), ψ(t), ν(tk−)).

Proposition 3.30. For each λ ∈ M0 the following equalities hold:

(d/dt)(Δk H̄)|_{t=tk−} = (d/dt)(Δk H̄)|_{t=tk+} = −D^k(H̄), k = 1, . . . , s. (3.152)
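Both routes to D^k(H̄) are easy to compute. The sketch below (the callable delta_k_Hbar and all inputs are illustrative assumptions, not part of the text) evaluates (3.150) directly, using H̄ψ = f, and approximates the one-sided derivatives of Proposition 3.30 by finite differences:

```python
import numpy as np

def Dk_direct(Hx_minus, f_minus, Hx_plus, f_plus, Ht_jump):
    """(3.150): D^k(Hbar) = -Hx^{k+}.f^{k-} + Hx^{k-}.f^{k+} - [Hbar_t]^k,
    with Hbar_psi = f; the one-sided vectors are assumed precomputed."""
    return float(-np.dot(Hx_plus, f_minus) + np.dot(Hx_minus, f_plus) - Ht_jump)

def Dk_from_derivative(delta_k_Hbar, tk, h=1e-6):
    """Proposition 3.30: -d/dt (Delta_k Hbar)(t) from either side equals D^k(Hbar);
    delta_k_Hbar is a user-supplied callable (an assumption of this sketch)."""
    left = (delta_k_Hbar(tk - h) - delta_k_Hbar(tk - 2*h)) / h
    right = (delta_k_Hbar(tk + 2*h) - delta_k_Hbar(tk + h)) / h
    return -left, -right   # both should approximate D^k(Hbar)
```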

Hence, by Proposition 3.30, for λ ∈ M0 the function (Δk H̄)(t) has a derivative at the point tk ∈ Θ equal to −D^k(H̄), k = 1, . . . , s. Let us formulate a quadratic necessary condition of a Pontryagin minimum for the trajectory T. First, for this trajectory, we introduce a Hilbert space Z2(Θ) and the critical cone K ⊂ Z2(Θ). We denote by PW^{1,2}(Δ, R^{d(x)}) the Hilbert space of piecewise continuous functions x̄(·) : Δ → R^{d(x)}, absolutely continuous on each interval of the set Δ \ Θ and such that their first derivative is square integrable. For each x̄ ∈ PW^{1,2}(Δ, R^{d(x)}), tk ∈ Θ, we set

x̄^{k−} = x̄(tk−), x̄^{k+} = x̄(tk+), [x̄]^k = x̄^{k+} − x̄^{k−}.

Further, we denote z̄ = (t¯0 , t¯f , ξ̄ , x̄, ū), where

t̄0 ∈ R¹, t̄f ∈ R¹, ξ̄ ∈ R^s, x̄ ∈ PW^{1,2}(Δ, R^{d(x)}), ū ∈ L2(Δ, R^{d(u)}).



Thus,
z̄ ∈ Z2(Θ) := R² × R^s × PW^{1,2}(Δ, R^{d(x)}) × L2(Δ, R^{d(u)}).
Moreover, for given z̄ we set

w̄ = (x̄, ū), x̄0 = x̄(t0 ), x̄f = x̄(tf ), (3.153)


x̄¯0 = x̄(t0 ) + t¯0 ẋ(t0 ), x̄¯f = x̄(tf ) + t¯f ẋ(tf ), p̄¯ = (t¯0 , x̄¯0 , t¯f , x̄¯f ). (3.154)

By IF (p) = {i ∈ {1, . . . , d(F )} | Fi (p) = 0}, we denote the set of active indices of the
constraints Fi (p) ≤ 0.
Let K be the set of all z̄ ∈ Z2(Θ) satisfying the following conditions:

J′(p)p̄¯ ≤ 0, Fi′(p)p̄¯ ≤ 0 ∀ i ∈ IF(p), K′(p)p̄¯ = 0,
x̄˙(t) = fw(t, w(t))w̄(t) for a.a. t ∈ [t0, tf],
[x̄]^k = [ẋ]^k ξ̄k, k = 1, . . . , s, (3.155)
gw(t, w(t))w̄(t) = 0 for a.a. t ∈ [t0, tf],
where p = (t0, x(t0), tf, x(tf)), w = (x, u). It is obvious that K is a convex cone in the Hilbert space Z2(Θ), and we call it the critical cone. If the interval Δ is fixed, then we set p := (x0, xf) = (x(t0), x(tf)), and in the definition of K we have t̄0 = t̄f = 0, x̄¯0 = x̄0, x̄¯f = x̄f, and p̄¯ = p̄ := (x̄0, x̄f).
Let us introduce a quadratic form on Z2(Θ). For λ ∈ M0 and z̄ ∈ K, we set

ωe(λ, z̄) = ⟨lpp p̄¯, p̄¯⟩ − 2ψ̇(tf)x̄(tf)t̄f − ( ψ̇(tf)ẋ(tf) + ψ̇0(tf) ) t̄f² + 2ψ̇(t0)x̄(t0)t̄0 + ( ψ̇(t0)ẋ(t0) + ψ̇0(t0) ) t̄0², (3.156)

where lpp = lpp(p, α0, α, β), p = (t0, x(t0), tf, x(tf)). We also set

ω(λ, z̄) = ωe(λ, z̄) + ∫_{t0}^{tf} ⟨H̄ww w̄(t), w̄(t)⟩ dt, (3.157)

where H̄ww = H̄ww(t, x(t), u(t), ψ(t), ν(t)). Finally, we set

Ω(λ, z̄) = ω(λ, z̄) + Σ_{k=1}^{s} ( D^k(H̄)ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k ), (3.158)

where x̄av^k = (1/2)(x̄^{k−} + x̄^{k+}) and [ψ̇]^k = ψ̇^{k+} − ψ̇^{k−}.
Now, we formulate the main necessary quadratic condition of Pontryagin minimum
in the problem on a variable time interval.

Theorem 3.31. If the trajectory T yields a Pontryagin minimum, then the following Condition A holds: the set M0 is nonempty and

max_{λ∈M0} Ω(λ, z̄) ≥ 0 ∀ z̄ ∈ K.

3.3.3 Sufficient Conditions for a Bounded Strong Minimum


Next, we give the definition of a bounded strong minimum in problem (3.140)–(3.144) on a variable interval [t0, tf]. To this end, let us give the definition of an essential component of the vector x in this problem: the ith component xi of x is called unessential if the functions f and g do not depend on this component and the functions J, F, and K are affine in xi0 = xi(t0) and xif = xi(tf); otherwise the component xi is called essential. We denote by x̲ a vector composed of all essential components of the vector x.

Definition 3.32. We say that the trajectory T affords a bounded strong minimum if there is no sequence of admissible trajectories T^n = (x^n(t), u^n(t) | t ∈ [t0^n, tf^n]), n = 1, 2, . . . such that
(a) J(T^n) < J(T);
(b) t0^n → t0, tf^n → tf, x^n(t0^n) → x(t0) (n → ∞);
(c) max_{Δ^n ∩ Δ} |x̲^n(t) − x̲(t)| → 0 (n → ∞), where Δ^n = [t0^n, tf^n];
(d) there exists a compact set C ⊂ Q such that (t, x^n(t), u^n(t)) ∈ C a.e. on Δ^n for all n.

An equivalent definition has the following form.

Definition 3.33. The trajectory T affords a bounded strong minimum if for each compact set C ⊂ Q there exists ε > 0 such that J(T̃) ≥ J(T) for all admissible trajectories T̃ = (x̃(t), ũ(t) | t ∈ [t̃0, t̃f]) satisfying the following conditions:
(a) |t̃0 − t0| < ε, |t̃f − tf| < ε, |x̃(t̃0) − x(t0)| < ε;
(b) max_{Δ ∩ Δ̃} |x̲̃(t) − x̲(t)| < ε, where Δ̃ = [t̃0, t̃f];
(c) (t, x̃(t), ũ(t)) ∈ C a.e. on Δ̃.

The strict bounded strong minimum is defined in a similar way, with the nonstrict
inequality J(T˜ ) ≥ J(T ) replaced by the strict one and the trajectory T˜ required to be
different from T . Finally, we define a (strict) strong minimum in the same way but omit
condition (c) in the last definition. The following statement is quite obvious.

Proposition 3.34. If there exists a compact set C ⊂ Q such that {(t, x, u) ∈ Q | g(t, x, u) =
0} ⊂ C, then a (strict) strong minimum is equivalent to a (strict) bounded strong minimum.

Let us formulate a sufficient optimality Condition B, which is a natural strengthening of the necessary Condition A. Condition B is sufficient not only for a Pontryagin minimum, but also for a strict bounded strong minimum.
To formulate Condition B, we introduce, for λ ∈ M0, the following conditions of the strict minimum principle:
(MP+_{Δ\Θ}) H(t, x(t), u, ψ(t)) > H(t, x(t), u(t), ψ(t)) for all t ∈ Δ \ Θ, u ≠ u(t), u ∈ U(t, x(t)), and
(MP+_Θ) H(tk, x(tk), u, ψ(tk)) > H^k for all tk ∈ Θ, u ∈ U(tk, x(tk)), u ≠ u(tk−), u ≠ u(tk+), where H^k := H^{k−} = H^{k+}, H^{k−} = H(tk, x(tk), u(tk−), ψ(tk)), H^{k+} = H(tk, x(tk), u(tk+), ψ(tk)).
We denote by M0+ the set of all λ ∈ M0 satisfying conditions (MP+_{Δ\Θ}) and (MP+_Θ).

For λ ∈ M0 we also introduce the strengthened Legendre–Clebsch conditions:
(SLC_{Δ\Θ}): For each t ∈ Δ \ Θ, the quadratic form

⟨H̄uu(t, x(t), u(t), ψ(t), ν(t))ū, ū⟩

is positive definite on the subspace of vectors ū ∈ R^{d(u)} such that

gu(t, x(t), u(t))ū = 0.

(SLCΘ^{k−}): For each tk ∈ Θ, the quadratic form

⟨H̄uu(tk, x(tk), u(tk−), ψ(tk), ν(tk−))ū, ū⟩

is positive definite on the subspace of vectors ū ∈ R^{d(u)} such that

gu(tk, x(tk), u(tk−))ū = 0.

(SLCΘ^{k+}): this condition is obtained from condition (SLCΘ^{k−}) by replacing (tk−) everywhere by (tk+).
Note that for each λ ∈ M0 the nonstrengthened Legendre–Clebsch conditions hold; i.e., the same quadratic forms are nonnegative on the corresponding subspaces.
We denote by Leg+(M0+) the set of all λ ∈ M0+ satisfying the strengthened Legendre–Clebsch conditions (SLC_{Δ\Θ}), (SLCΘ^{k−}), (SLCΘ^{k+}), k = 1, . . . , s, and also the conditions

D^k(H̄) > 0 ∀ k = 1, . . . , s. (3.159)
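Numerically, each of the conditions (SLC) amounts to positive definiteness of H̄uu restricted to the null space of gu at the corresponding time instant. A minimal sketch, assuming the two matrices at a fixed t are available as NumPy arrays (the function name and tolerance are illustrative):

```python
import numpy as np

def slc_holds(Huu, gu, tol=1e-10):
    """Check <Huu u, u> > 0 for all nonzero u with gu @ u = 0.
    Huu: (m, m) symmetric; gu: (d(g), m) Jacobian of g w.r.t. the control."""
    # Orthonormal basis Z of the null space of gu via SVD.
    _, s, Vt = np.linalg.svd(gu)
    rank = int((s > tol * s.max()).sum()) if s.size else 0
    Z = Vt[rank:].T                  # columns span {u : gu u = 0}
    if Z.shape[1] == 0:
        return True                  # trivial null space: nothing to check
    reduced = Z.T @ Huu @ Z          # restriction of the form to the subspace
    return np.linalg.eigvalsh(reduced).min() > tol
```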

Let us introduce the functional

γ̄(z̄) = t̄0² + t̄f² + ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt, (3.160)

which is equivalent to the norm squared on the subspace

x̄˙ = fw(t, x(t), u(t))w̄; [x̄]^k = [ẋ]^k ξ̄k, k = 1, . . . , s, (3.161)

of the Hilbert space Z2(Θ). Recall that the critical cone K is contained in the subspace (3.161).
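In numerical experiments the functional (3.160) is evaluated on a time grid; a small sketch (the trapezoidal rule and all argument names are illustrative choices):

```python
import numpy as np

def gamma_bar(t0_bar, tf_bar, xi_bar, x0_bar, u_bar, t_grid):
    """Evaluate (3.160): t0_bar, tf_bar are scalars, xi_bar has shape (s,),
    x0_bar = xbar(t0) has shape (d(x),), u_bar has shape (N, d(u)) on t_grid."""
    integral = np.trapz((u_bar**2).sum(axis=1), t_grid)
    return t0_bar**2 + tf_bar**2 + xi_bar @ xi_bar + x0_bar @ x0_bar + integral
```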

Theorem 3.35. For the trajectory T, assume that the following Condition B holds: the set Leg+(M0+) is nonempty and there exist a nonempty compact set M ⊂ Leg+(M0+) and a number C > 0 such that

max_{λ∈M} Ω(λ, z̄) ≥ C γ̄(z̄) (3.162)

for all z̄ ∈ K. Then the trajectory T affords a strict bounded strong minimum.

3.3.4 Proofs
The proofs are based on the quadratic optimality conditions, obtained in this chapter for
problems on a fixed interval of time. We will give the proofs but omit some details. In
order to extend the proofs to the case of a variable interval [t0 , tf ] we use a simple change
of the time variable. Namely, we associate the fixed admissible trajectory T = (x(t), u(t) |
t ∈ [t0 , tf ]) in the problem on a variable time interval (3.140)–(3.144) with a trajectory
T τ = (v(τ ), t(τ ), x(τ ), u(τ ) | τ ∈ [τ0 , τf ]), considered on a fixed interval [τ0 , τf ], where
τ0 = t0 , τf = tf , t(τ ) ≡ τ , v(τ ) ≡ 1. This is an admissible trajectory in the following
problem on a fixed interval [τ0 , τf ]: Minimize the cost function

J(T τ ) := J (t(τ0 ), x(τ0 ), t(τf ), x(τf )) → min (3.163)

subject to the constraints

F(t(τ0), x(τ0), t(τf), x(τf)) ≤ 0, K(t(τ0), x(τ0), t(τf), x(τf)) = 0, (3.164)
dx(τ)/dτ = v(τ)f(t(τ), x(τ), u(τ)), dt(τ)/dτ = v(τ), dv(τ)/dτ = 0, (3.165)
g(t(τ), x(τ), u(τ)) = 0, (3.166)
(t(τ0), x(τ0), t(τf), x(τf)) ∈ P, (t(τ), x(τ), u(τ)) ∈ Q. (3.167)

In this problem, x(τ), t(τ), and v(τ) are state variables, and u(τ) is a control variable. For brevity, we refer to problem (3.140)–(3.144) as problem P (on a variable interval Δ = [t0, tf]) and to problem (3.163)–(3.167) as problem P^τ (on a fixed interval [τ0, τf]). We denote by A^τ the necessary quadratic Condition A for problem P^τ on a fixed interval [τ0, τf]. Similarly, we denote by B^τ the sufficient quadratic Condition B for problem P^τ on a fixed interval [τ0, τf].
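The change of time variable is also convenient computationally. The sketch below builds the right-hand side of the augmented system (3.165) from a user-supplied callable f(t, x, u) of problem P (the callable and its signature are assumptions of this sketch):

```python
import numpy as np

def augmented_rhs(f):
    """Return the right-hand side of (3.165): state (x, t, v), control u,
    on the fixed interval [tau0, tauf]; f(t, x, u) is the dynamics of problem P."""
    def rhs(tau, state, u):
        x, t, v = state[:-2], state[-2], state[-1]
        return np.concatenate([v * f(t, x, u), [v, 0.0]])
    return rhs
```

Along the reference trajectory T^τ one has v(τ) ≡ 1 and t(τ) ≡ τ, so the augmented system reproduces the original dynamics of problem P.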
Recall that the control u(·) is a piecewise Lipschitz continuous function on the interval Δ = [t0, tf] with the set of discontinuity points Θ = {t1, . . . , ts}, where t0 < t1 < · · · < ts < tf. Hence, for each λ ∈ M0, the function ν(t) is also piecewise Lipschitz continuous on the interval Δ, and, moreover, all discontinuity points of ν belong to Θ. This easily follows from the equation H̄u = 0 and the full-rank condition for the matrix gu. Consequently, u̇ and ν̇ are bounded measurable functions on Δ. The proof of Theorem 3.31 is composed of the following chain of implications:
following chain of implications:
(i) A Pontryagin minimum is attained on the trajectory T in problem P =⇒
(ii) A Pontryagin minimum is attained on the trajectory T τ in problem P τ =⇒
(iii) Condition Aτ holds for the trajectory T τ in problem P τ =⇒
(iv) Condition A holds for the trajectory T in problem P .
The first implication is readily verified; the second follows from Theorem 3.9. The verification of the third implication (iii) ⇒ (iv) is not short and rather technical: we have to compare the sets of Lagrange multipliers, the critical cones, and the quadratic forms in both problems. This will be done below.
In order to prove the sufficient conditions in problem P , given by Theorem 3.35, we
have to check the following chain of implications:
(v) Condition B holds for the trajectory T in problem P =⇒
(vi) Condition Bτ holds for the trajectory T τ in problem P τ =⇒

(vii) A bounded strong minimum is attained on the trajectory T τ in problem P τ =⇒


(viii) A bounded strong minimum is attained on the trajectory T in problem P .
The verification of the first implication (v) ⇒ (vi) is similar to the verification of the third
implication (iii) ⇒ (iv) in the proof of the necessary conditions, the second implication
(vi) ⇒ (vii) follows from Theorem 3.16, and the third (vii) ⇒ (viii) is readily verified.
Thus, it remains to compare the sets of Lagrange multipliers, the critical cones, and
the quadratic forms in problems P and P τ for the trajectories T and T τ , respectively.

Comparison of the sets of Lagrange multipliers. Let us formulate the Pontryagin minimum principle in problem P^τ for the trajectory T^τ. The endpoint Lagrange function l, the Pontryagin function H, and the augmented Pontryagin function H̄ (all of them are equipped with the superscript τ) have the form

l^τ = α0 J + αF + βK = l,
H^τ = ψvf + ψ0 v + ψv · 0 = v(ψf + ψ0), H̄^τ = H^τ + νg.

The set M0^τ in problem P^τ for the trajectory T^τ consists of all tuples of Lagrange multipliers λ^τ = (α0, α, β, ψ, ψ0, ψv, ν) such that the following conditions hold:

α0 + |α| + |β| = 1,
−dψ/dτ = vψfx + νgx, −dψ0/dτ = vψft + νgt, −dψv/dτ = ψf + ψ0,
ψ(τ0) = −lx0, ψ(τf) = lxf, ψ0(τ0) = −lt0, ψ0(τf) = ltf, (3.168)
ψv(τ0) = ψv(τf) = 0, vψfu + νgu = 0,
v(τ)( ψ(τ)f(t(τ), x(τ), u) + ψ0(τ) ) ≥ v(τ)( ψ(τ)f(t(τ), x(τ), u(τ)) + ψ0(τ) ).

The last inequality holds for all u ∈ R^{d(u)} such that g(t(τ), x(τ), u) = 0, (t(τ), x(τ), u) ∈ Q. Recall that v(τ) ≡ 1, t(τ) ≡ τ, τ0 = t0, and τf = tf. In (3.168), the function f and its derivatives fx, fu, ft, gx, gu, and gt are taken at (t(τ), x(τ), u(τ)), τ ∈ [τ0, τf] \ Θ, while the derivatives lt0, lx0, ltf, and lxf are calculated at (t(τ0), x(τ0), t(τf), x(τf)) = (t0, x(t0), tf, x(tf)).
Conditions −dψv/dτ = ψf + ψ0 and ψv(τ0) = ψv(τf) = 0 imply that ∫_{τ0}^{τf} (ψf + ψ0) dτ = 0. As is well known, conditions (3.168) of the minimum principle also imply that ψf + ψ0 = const, whence ψf + ψ0 = 0 and ψv = 0. Taking this fact into account and comparing the definitions of the sets M0^τ (3.168) and M0 (3.149), we see that the projector

(α0, α, β, ψ, ψ0, ψv, ν) ↦ (α0, α, β, ψ, ψ0, ν) (3.169)

realizes a one-to-one correspondence between these two sets. (Moreover, in the definition of the set M0^τ one could replace the relations −dψv/dτ = ψf + ψ0 and ψv(τ0) = ψv(τf) = 0 with ψf + ψ0 = 0, and thus identify M0^τ with M0.)
We say that an element λτ ∈ M0τ corresponds to an element λ ∈ M0 if λ is the projec-
tion of λτ under the mapping (3.169).

Comparison of the critical cones. For brevity, we set η = (v, t, x, u) = (v, t, w). Let us define the critical cone K^τ in problem P^τ for the trajectory T^τ. It consists of all tuples (ξ̄, v̄, t̄, x̄, ū) = (ξ̄, η̄) satisfying the relations

Jt0 t̄(τ0) + Jx0 x̄(τ0) + Jtf t̄(τf) + Jxf x̄(τf) ≤ 0, (3.170)
Fit0 t̄(τ0) + Fix0 x̄(τ0) + Fitf t̄(τf) + Fixf x̄(τf) ≤ 0, i ∈ IF(p), (3.171)
Kt0 t̄(τ0) + Kx0 x̄(τ0) + Ktf t̄(τf) + Kxf x̄(τf) = 0, (3.172)
dx̄/dτ = v̄f + v( ft t̄ + fx x̄ + fu ū ), [x̄]^k = [ẋ]^k ξ̄k, k = 1, . . . , s, (3.173)
dt̄/dτ = v̄, [t̄]^k = 0, k = 1, . . . , s; dv̄/dτ = 0, [v̄]^k = 0, k = 1, . . . , s, (3.174)
gt t̄ + gx x̄ + gu ū = 0, (3.175)

where the derivatives Jt0, Jx0, Jtf, Jxf, etc. are calculated at (t(τ0), x(τ0), t(τf), x(τf)) = (t0, x(t0), tf, x(tf)), while f, ft, fx, fu, gt, gx, and gu are taken at (t(τ), x(τ), u(τ)), τ ∈ [τ0, τf] \ Θ.
[τ0 , τf ] \ . Let (ξ̄ , v̄, t¯, x̄, ū) be an element of the critical cone K τ . We can use the following
change of variables:
x̃ = x̄ − t¯ẋ, ũ = ū − t¯u̇, (3.176)
or, briefly,
w̃ = w̄ − t¯ẇ. (3.177)
Since v = 1, ẋ = f, and t = τ, equation (3.173) is equivalent to

dx̄/dt = v̄ẋ + ft t̄ + fw w̄. (3.178)

Using the relation x̄ = x̃ + t̄ẋ in (3.178) along with dt̄/dt = v̄, we get

x̃˙ + t̄ẍ = t̄ ft + fw w̄. (3.179)

By differentiating the equation ẋ(t) = f (t, w(t)), we obtain

ẍ = ft + fw ẇ. (3.180)

Using this relation in (3.179), we get

x̃˙ = fw w̃. (3.181)

The relations
[x̄]k = [ẋ]k ξ̄k , x̄ = x̃ + t¯ẋ
imply
[x̃]k = [ẋ]k ξ̃k , (3.182)
where
ξ̃k = ξ̄k − t¯k , t¯k = t¯(tk ), k = 1, . . . , s. (3.183)
Further, relation (3.175) may be written as gt t¯ + gw w̄ = 0. Differentiating the relation
g(t, w(t)) = 0, we obtain
gt + gw ẇ = 0. (3.184)

These relations along with (3.177) imply that

gw w̃ = 0. (3.185)

Finally, note that since x̄ = x̃ + t̄ẋ and τ0 = t0, τf = tf, we have

p̄ = ( t̄0, x̄(t0), t̄f, x̄(tf) ) = ( t̄0, x̃(t0) + t̄0 ẋ(t0), t̄f, x̃(tf) + t̄f ẋ(tf) ), (3.186)

where t̄0 = t̄(t0) and t̄f = t̄(tf). The vector on the right-hand side of the last equality has the same form as the vector p̄¯ in definition (3.154). Consequently, all relations in definition (3.155) of the critical cone K in problem P are satisfied for the element z̃ = (t̄0, t̄f, ξ̃, w̃). We have proved that the obtained element z̃ belongs to the critical cone K in problem P.
Conversely, if (t̄0, t̄f, ξ̃, w̃) is an element of the critical cone in problem P, then by setting

v̄ = (t̄f − t̄0)/(tf − t0), t̄ = v̄(τ − τ0) + t̄0, w̄ = w̃ + t̄ẇ, ξ̄k = ξ̃k + t̄(τk), k = 1, . . . , s,

we obtain an element (ξ̄, v̄, t̄, w̄) of the critical cone (3.170)–(3.175) in problem P^τ. Thus, we have proved the following lemma.

Lemma 3.36. If (ξ̄ , v̄, t¯, w̄) is an element of the critical cone (3.170)–(3.175) in problem P τ
for the trajectory T τ and

t¯0 = t¯(t0 ), t¯f = t¯(tf ), w̃ = w̄ − t¯ẇ, ξ̃k = ξ̄k − t¯(tk ), k = 1, . . . , s, (3.187)

then (t¯0 , t¯f , ξ̃ , w̃) is an element of the critical cone (3.155) in problem P for the trajectory
T . Moreover, relations (3.187) define a one-to-one correspondence between elements of
the critical cones in problems P τ and P .

We say that an element (ξ̄ , v̄, t¯, w̄) of the critical cone in problem P τ corresponds to
an element (t¯0 , t¯f , ξ̃ , w̃) of the critical cone in problem P if relations (3.187) hold.
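The correspondence (3.187) and its inverse described above are straightforward to implement; a hedged sketch (the callables for t̄, w̄, ẇ and all names are illustrative assumptions):

```python
import numpy as np

def to_problem_P(tbar, w_bar, xi_bar, w_dot, t_knots):
    """Forward map (3.187): from an element of the critical cone K^tau to one of K."""
    w_tilde = lambda t: w_bar(t) - tbar(t) * w_dot(t)
    xi_tilde = np.array([xi_bar[k] - tbar(tk) for k, tk in enumerate(t_knots)])
    return w_tilde, xi_tilde

def to_problem_P_tau(t0b, tfb, w_tilde, xi_tilde, w_dot, t_knots, t0, tf):
    """Inverse map: vbar constant, tbar affine, wbar = wtilde + tbar*wdot."""
    vbar = (tfb - t0b) / (tf - t0)
    tbar = lambda t: vbar * (t - t0) + t0b
    w_bar = lambda t: w_tilde(t) + tbar(t) * w_dot(t)
    xi_bar = np.array([xi_tilde[k] + tbar(tk) for k, tk in enumerate(t_knots)])
    return vbar, tbar, w_bar, xi_bar
```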

Comparison of the quadratic forms. Assume that the element λ^τ ∈ M0^τ corresponds to the element λ ∈ M0. Let us show that the quadratic form Ω^τ(λ^τ, ·), calculated on the element (ξ̄, v̄, t̄, w̄) of the critical cone in problem P^τ for the trajectory T^τ, can be transformed into the quadratic form Ω(λ, ·) calculated on the corresponding element (t̄0, t̄f, ξ̃, w̃) of the critical cone in problem P for the trajectory T.
(i) The relations H̄^τ = v(H + ψ0) + νg, H̄ = H + νg, v = 1 imply

⟨H̄ηη^τ η̄, η̄⟩ = ⟨H̄ww w̄, w̄⟩ + 2H̄tw w̄ t̄ + H̄tt t̄² + 2v̄( Hw w̄ + Ht t̄ ), (3.188)

where η = (v, t, w), η̄ = (v̄, t̄, w̄). Since w̄ = w̃ + t̄ẇ, we have

⟨H̄ww w̄, w̄⟩ = ⟨H̄ww w̃, w̃⟩ + 2⟨H̄ww ẇ, w̄⟩ t̄ − ⟨H̄ww ẇ, ẇ⟩ t̄². (3.189)

Moreover, using the relations

Hw = H̄w − νgw, Ht = H̄t − νgt, gw w̄ + gt t̄ = 0,
−ψ̇ = H̄x, −ψ̇0 = H̄t, H̄u = 0,

we obtain

Hw w̄ + Ht t̄ = H̄w w̄ + H̄t t̄ − ν( gw w̄ + gt t̄ ) = H̄w w̄ + H̄t t̄ = H̄x x̄ + H̄t t̄ = −ψ̇ x̄ − ψ̇0 t̄. (3.190)

Relations (3.188)–(3.190) imply

⟨H̄ηη^τ η̄, η̄⟩ = ⟨H̄ww w̃, w̃⟩ + 2⟨H̄ww ẇ, w̄⟩ t̄ + 2H̄tw w̄ t̄ − ⟨H̄ww ẇ, ẇ⟩ t̄² + H̄tt t̄² − 2v̄( ψ̇ x̄ + ψ̇0 t̄ ). (3.191)

(ii) Let us transform the terms 2⟨H̄ww ẇ, w̄⟩ t̄ + 2H̄tw w̄ t̄ in (3.191). By differentiating −ψ̇ = H̄x with respect to t, we obtain

−ψ̈ = H̄tx + (ẇ)∗ H̄wx + ψ̇ H̄ψx + ν̇ H̄νx.

Here we have H̄ψx = fx and H̄νx = gx. Therefore

−ψ̈ = H̄tx + (ẇ)∗ H̄wx + ψ̇ fx + ν̇ gx. (3.192)

Similarly, by differentiating H̄u = 0 with respect to t, we obtain

0 = H̄tu + (ẇ)∗ H̄wu + ψ̇ fu + ν̇ gu. (3.193)

Multiplying (3.192) by x̄ and (3.193) by ū and summing the results, we get

−ψ̈ x̄ = H̄tw w̄ + ⟨H̄ww ẇ, w̄⟩ + ψ̇ fw w̄ + ν̇ gw w̄. (3.194)

Since (ξ̄, v̄, t̄, w̄) is an element of the critical cone in problem P^τ, from (3.173) and (3.175) we get fw w̄ = x̄˙ − v̄ẋ − ft t̄, gw w̄ = −gt t̄. Therefore, equation (3.194) can be represented in the form

H̄tw w̄ + ⟨H̄ww ẇ, w̄⟩ = v̄(ψ̇ ẋ) − (d/dt)(ψ̇ x̄) + ( ψ̇ ft + ν̇ gt ) t̄, (3.195)

which implies

2⟨H̄ww ẇ, w̄⟩ t̄ + 2H̄tw w̄ t̄ = 2t̄ v̄(ψ̇ ẋ) − 2t̄ (d/dt)(ψ̇ x̄) + 2( ψ̇ ft + ν̇ gt ) t̄². (3.196)
(iii) Let us transform the term −⟨H̄ww ẇ, ẇ⟩ t̄² in (3.191). Multiplying (3.192) by ẋ and (3.193) by u̇ and summing the results, we obtain

−ψ̈ ẋ = H̄tw ẇ + ⟨H̄ww ẇ, ẇ⟩ + ψ̇ fw ẇ + ν̇ gw ẇ. (3.197)

From (3.180) and (3.184), we get fw ẇ = ẍ − ft and gw ẇ = −gt, respectively. Then (3.197) implies

H̄tw ẇ + ⟨H̄ww ẇ, ẇ⟩ = −(d/dt)(ψ̇ ẋ) + ψ̇ ft + ν̇ gt. (3.198)

Multiplying this relation by −t̄², we get

−⟨H̄ww ẇ, ẇ⟩ t̄² = H̄tw ẇ t̄² + t̄² (d/dt)(ψ̇ ẋ) − ( ψ̇ ft + ν̇ gt ) t̄². (3.199)

(iv) Finally, let us transform the term H̄tt t̄² in (3.191). Differentiating −ψ̇0 = H̄t with respect to t and using the relations H̄ψt = ft and H̄νt = gt, we get

−ψ̈0 = H̄tt + H̄tw ẇ + ψ̇ ft + ν̇ gt. (3.200)

Consequently,

H̄tt t̄² = −ψ̈0 t̄² − H̄tw ẇ t̄² − ( ψ̇ ft + ν̇ gt ) t̄². (3.201)

(v) Summing (3.199) and (3.201), we obtain

−⟨H̄ww ẇ, ẇ⟩ t̄² + H̄tt t̄² = −ψ̈0 t̄² − 2( ψ̇ ft + ν̇ gt ) t̄² + t̄² (d/dt)(ψ̇ ẋ). (3.202)

Using relations (3.196) and (3.202) in (3.191), we get

⟨H̄ηη^τ η̄, η̄⟩ = ⟨H̄ww w̃, w̃⟩ + 2t̄ v̄(ψ̇ ẋ) − 2t̄ (d/dt)(ψ̇ x̄) − ψ̈0 t̄² + t̄² (d/dt)(ψ̇ ẋ) − 2v̄( ψ̇ x̄ + ψ̇0 t̄ ). (3.203)

But

ψ̈0 t̄² + 2v̄ t̄ ψ̇0 = (d/dt)( ψ̇0 t̄² ), t̄ (d/dt)(ψ̇ x̄) + v̄(ψ̇ x̄) = (d/dt)( t̄ ψ̇ x̄ ), 2t̄ v̄(ψ̇ ẋ) + t̄² (d/dt)(ψ̇ ẋ) = (d/dt)( ψ̇ ẋ t̄² ).

Therefore,

⟨H̄ηη^τ η̄, η̄⟩ = ⟨H̄ww w̃, w̃⟩ + (d/dt)( ψ̇ ẋ t̄² − ψ̇0 t̄² − 2ψ̇ x̄ t̄ ). (3.204)

Finally, using the change of the variable x̄ = x̃ + t̄ẋ in the right-hand side of this relation, we obtain

⟨H̄ηη^τ η̄, η̄⟩ = ⟨H̄ww w̃, w̃⟩ − (d/dt)( (ψ̇0 + ψ̇ ẋ) t̄² + 2ψ̇ x̃ t̄ ). (3.205)
We have proved the following lemma.

Lemma 3.37. Let (ξ̄, v̄, t̄, w̄) = (ξ̄, η̄) be an element of the critical cone K^τ in problem P^τ for the trajectory T^τ. Set w̃ = w̄ − t̄ẇ. Then formula (3.205) holds.

(vi) Recall that λ^τ is an arbitrary element of the set M0^τ (consequently, ψv = 0) and λ is the corresponding element of the set M0; i.e., λ is the projection of λ^τ under the mapping (3.169). The quadratic form Ω^τ(λ^τ, ·) in problem P^τ for the trajectory T^τ has the following representation:

Ω^τ(λ^τ; ξ̄, η̄) = Σ_{k=1}^{s} ( D^k(H̄^τ)ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k − 2[ψ̇0]^k t̄av^k ξ̄k ) + ⟨lpp p̄, p̄⟩ + ∫_{τ0}^{τf} ⟨H̄ηη^τ η̄, η̄⟩ dτ. (3.206)

Comparing the definitions of D^k(H̄^τ) and D^k(H̄) (see (3.152)) and taking into account that H̄^τ = v(ψf + ψ0) + νg and v = 1, we get

D^k(H̄^τ) = D^k(H̄). (3.207)

Let z̄^τ = (ξ̄, η̄) = (ξ̄, v̄, t̄, x̄, ū) be an element of the critical cone K^τ in the problem P^τ for the trajectory T^τ, and let z̃ = (t̄0, t̄f, ξ̃, x̃, ũ) be the corresponding element of the critical cone K in the problem P for the trajectory T; i.e., relations (3.187) hold. Since [t̄]^k = 0, k = 1, . . . , s, we have

t̄av^k = t̄^k, k = 1, . . . , s, (3.208)

where t̄^k = t̄(tk), k = 1, . . . , s. Also recall that τ0 = t0, τf = tf, t(τ) = τ, dt = dτ. Since the functions ψ̇0, ψ̇, ẋ, and x̃ may have discontinuities only at the points of the set Θ, the following formula holds:

∫_{t0}^{tf} (d/dt)( (ψ̇0 + ψ̇ ẋ) t̄² + 2ψ̇ x̃ t̄ ) dt = ( (ψ̇0 + ψ̇ ẋ) t̄² + 2ψ̇ x̃ t̄ )|_{t0}^{tf} − Σ_{k=1}^{s} ( [ψ̇0 + ψ̇ ẋ]^k t̄(tk)² + 2[ψ̇ x̃]^k t̄(tk) ). (3.209)

Relations (3.205)–(3.209) imply the following representation of the quadratic form Ω^τ on the element (ξ̄, η̄) of the critical cone K^τ:

Ω^τ(λ^τ; ξ̄, η̄) = Σ_{k=1}^{s} ( D^k(H̄)ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k − 2[ψ̇0]^k t̄(tk)ξ̄k + [ψ̇0 + ψ̇ ẋ]^k t̄(tk)² + 2[ψ̇ x̃]^k t̄(tk) ) + ⟨lpp p̄, p̄⟩ − ( (ψ̇0 + ψ̇ ẋ) t̄² + 2ψ̇ x̃ t̄ )|_{t0}^{tf} + ∫_{t0}^{tf} ⟨H̄ww w̃, w̃⟩ dτ. (3.210)

Let us transform the terms related to the discontinuity points tk of the control u(·), k = 1, . . . , s. For any λ ∈ M0, the following lemma holds.

Lemma 3.38. Let z̄ = (ξ̄, η̄) = (ξ̄, v̄, t̄, w̄) be an element of the critical cone K^τ in the problem P^τ for the trajectory T^τ. Let the pair (ξ̃, x̃) be defined by the relations

ξ̃k = ξ̄k − t̄(tk), k = 1, . . . , s, x̃ = x̄ − t̄ẋ. (3.211)

Then for any k = 1, . . . , s the following formula holds:

D^k(H̄)ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k − 2[ψ̇0]^k t̄(tk)ξ̄k + [ψ̇0 + ψ̇ ẋ]^k t̄(tk)² + 2[ψ̇ x̃]^k t̄(tk)
= D^k(H̄)ξ̃k² − 2[ψ̇]^k x̃av^k ξ̃k. (3.212)

Proof. In this proof, we omit the subscript and superscript k. We also write t̄ instead of t̄(tk). Set a = D(H̄). Using the relations

ξ̄ = ξ̃ + t̄, x̄av = x̃av + t̄ ẋav, (3.213)

we obtain

a ξ̄² − 2[ψ̇]x̄av ξ̄ − 2[ψ̇0]t̄ξ̄ + [ψ̇0 + ψ̇ ẋ]t̄² + 2[ψ̇ x̃]t̄
= a ξ̃² + 2a ξ̃ t̄ + a t̄² − 2[ψ̇]x̃av ξ̄ − 2[ψ̇]ẋav t̄ξ̄ − 2[ψ̇0]t̄ξ̄ + [ψ̇0 + ψ̇ ẋ]t̄² + 2[ψ̇ x̃]t̄
= a ξ̃² − 2[ψ̇]x̃av ξ̃ + r, (3.214)

where

r = 2a ξ̃ t̄ + a t̄² − 2[ψ̇]x̃av t̄ − 2[ψ̇]ẋav t̄ξ̄ − 2[ψ̇0]t̄ξ̄ + [ψ̇0 + ψ̇ ẋ]t̄² + 2[ψ̇ x̃]t̄. (3.215)

It suffices to show that r = 0. Using the relations (3.213) in formula (3.215), we get

r = 2a(ξ̄ − t̄)t̄ + a t̄² − 2[ψ̇](x̄av − t̄ ẋav)t̄ − 2[ψ̇]ẋav t̄ξ̄ − 2[ψ̇0]t̄ξ̄ + [ψ̇0 + ψ̇ ẋ]t̄² + 2[ψ̇(x̄ − t̄ẋ)]t̄
= t̄²( −a + 2[ψ̇]ẋav + [ψ̇0] − [ψ̇ ẋ] ) + 2t̄ξ̄( a − [ψ̇]ẋav − [ψ̇0] ) + 2t̄( −[ψ̇]x̄av + [ψ̇ x̄] ).

The coefficient of t̄² in the right-hand side of the last equality vanishes:

−a + 2[ψ̇]ẋav + [ψ̇0] − [ψ̇ ẋ] = −( ψ̇^+ ẋ^− − ψ̇^− ẋ^+ ) − [ψ̇0] + ( ψ̇^+ − ψ̇^− )( ẋ^+ + ẋ^− ) + [ψ̇0] − ψ̇^+ ẋ^+ + ψ̇^− ẋ^− = 0.

The coefficient of 2t̄ξ̄ is equal to

a − [ψ̇]ẋav − [ψ̇0] = ψ̇^+ ẋ^− − ψ̇^− ẋ^+ + [ψ̇0] − (1/2)( ψ̇^+ − ψ̇^− )( ẋ^− + ẋ^+ ) − [ψ̇0]
= (1/2)( ψ̇^+ ẋ^− − ψ̇^− ẋ^+ ) − (1/2)[ψ̇ ẋ].

The coefficient of 2t̄ is equal to

−[ψ̇]x̄av + [ψ̇ x̄] = −(1/2)( ψ̇^+ − ψ̇^− )( x̄^− + x̄^+ ) + ( ψ̇^+ x̄^+ − ψ̇^− x̄^− ) = (1/2)ψ̇^+[x̄] + (1/2)ψ̇^−[x̄] = ψ̇av[ẋ]ξ̄,

since [x̄] = [ẋ]ξ̄. Consequently,

r = 2t̄ξ̄( (1/2)( ψ̇^+ ẋ^− − ψ̇^− ẋ^+ ) − (1/2)[ψ̇ ẋ] + ψ̇av[ẋ] )
= t̄ξ̄( ( ψ̇^+ ẋ^− − ψ̇^− ẋ^+ ) − ( ψ̇^+ ẋ^+ − ψ̇^− ẋ^− ) + ( ψ̇^− + ψ̇^+ )( ẋ^+ − ẋ^− ) ) = 0.

In view of (3.214) the equality r = 0 proves the lemma.
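The vanishing of r, and hence identity (3.212), can also be verified symbolically. A sympy sketch, checking the identity componentwise with scalar symbols and the jump condition [x̄] = [ẋ]ξ̄ substituted (all variable names are illustrative):

```python
import sympy as sp

pdp, pdm, xdp, xdm, dpsi0, tb, xib, xbm = sp.symbols(
    'pdp pdm xdp xdm dpsi0 tb xib xbm')   # one-sided limits of psidot, xdot, etc.
xbp = xbm + (xdp - xdm) * xib             # [xbar] = [xdot] * xibar on the critical cone
a = pdp*xdm - pdm*xdp + dpsi0             # a = D(Hbar), cf. (3.150) with psidot = -Hbar_x
xtm, xtp = xbm - tb*xdm, xbp - tb*xdp     # xtilde = xbar - tbar * xdot
xit = xib - tb                            # xitilde = xibar - tbar
lhs = (a*xib**2 - (pdp - pdm)*(xbp + xbm)*xib - 2*dpsi0*tb*xib
       + (dpsi0 + pdp*xdp - pdm*xdm)*tb**2 + 2*(pdp*xtp - pdm*xtm)*tb)
rhs = a*xit**2 - (pdp - pdm)*(xtp + xtm)*xit
print(sp.expand(lhs - rhs))               # prints 0, confirming (3.212)
```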

Relation (3.210) along with equality (3.212) gives the following transformation of the quadratic form Ω^τ (see (3.206)) on the element z̄^τ = (ξ̄, η̄) of the critical cone K^τ:

Ω^τ(λ^τ; ξ̄, η̄) = Σ_{k=1}^{s} ( D^k(H̄)ξ̃k² − 2[ψ̇]^k x̃av^k ξ̃k ) + ⟨lpp p̄, p̄⟩ − ( (ψ̇0 + ψ̇ ẋ) t̄² + 2ψ̇ x̃ t̄ )|_{t0}^{tf} + ∫_{t0}^{tf} ⟨H̄ww w̃, w̃⟩ dτ. (3.216)

Taking into account (3.186) and definitions (3.156)–(3.158) of the quadratic forms ωe, ω, and Ω, we see that the right-hand side of (3.216) is the quadratic form Ω(λ, z̃) (see (3.158)) in problem P for the trajectory T, where z̃ = (t̄0, t̄f, ξ̃, w̃) is the corresponding element of the critical cone K. Thus we have proved the following theorem.

Theorem 3.39. Let z̄^τ = (ξ̄, v̄, t̄, w̄) be an element of the critical cone K^τ in problem P^τ for the trajectory T^τ. Let z̃ = (t̄0, t̄f, ξ̃, w̃) be the corresponding element of the critical cone K in problem P for the trajectory T, i.e., relations (3.187) hold. Then for any λ^τ ∈ M0^τ and the corresponding projection λ ∈ M0 (under the mapping (3.169)) the following equality holds: Ω^τ(λ^τ, z̄^τ) = Ω(λ, z̃).

This theorem proves the implications (iii) ⇒ (iv) and (v) ⇒ (vi) (see the beginning
of this section), and thus completes the proofs of Theorems 3.31 and 3.35.

3.4 Quadratic Conditions for Optimal Control Problems with Mixed Control-State Equality and Inequality Constraints
In this section, we give a statement of the general optimal control problem with mixed
control-state equality and inequality constraints on fixed and variable time intervals, recall
different concepts of minimum, and formulate optimality conditions.

3.4.1 General Optimal Control Problem on a Fixed Time Interval


Statement of the problem. We consider the following optimal control problem on
a fixed interval of time [t0 , tf ]:
 
Minimize J(x, u) = J( x(t0), x(tf) ) (3.217)

subject to the constraints

F( x(t0), x(tf) ) ≤ 0, K( x(t0), x(tf) ) = 0, ( x(t0), x(tf) ) ∈ P, (3.218)
ẋ = f(t, x, u), g(t, x, u) = 0, ϕ(t, x, u) ≤ 0, (t, x, u) ∈ Q, (3.219)
where P and Q are open sets and x, u, F , K, f , g, and ϕ are vector functions, d(g) ≤ d(u).
We use the notation x(t0) = x0, x(tf) = xf, (x0, xf) = p, and (x, u) = w. We seek the minimum among the pairs of functions w(·) = (x(·), u(·)) such that x(·) ∈ W^{1,1}([t0, tf], R^{d(x)}), u(·) ∈ L∞([t0, tf], R^{d(u)}); i.e., we seek the minimum in the space

W := W^{1,1}([t0, tf], R^{d(x)}) × L∞([t0, tf], R^{d(u)}).

A pair w = (x, u) ∈ W is said to be admissible in problem (3.217)–(3.219) if constraints (3.218)–(3.219) are satisfied by w.

Assumption 3.40. (a) The functions J (p), F (p), and K(p) are defined and twice contin-
uously differentiable on the open set P ⊂ R2d(x) , and the functions f (t, w), g(t, w), and
ϕ(t, w) are defined and twice continuously differentiable on the open set Q ⊂ Rd(x)+d(u)+1 .
(b) The gradients with respect to the control giu(t, w), i = 1, . . . , d(g), ϕju(t, w), j ∈ Iϕ(t, w), are linearly independent at all points (t, w) ∈ Q such that g(t, w) = 0 and ϕ(t, w) ≤ 0. Here gi and ϕj are the components of the vector functions g and ϕ, respectively, and

Iϕ(t, w) = {j ∈ {1, . . . , d(ϕ)} | ϕj(t, w) = 0} (3.220)

is the set of indices of active inequality constraints ϕj(t, w) ≤ 0 at (t, w) ∈ Q.

We refer to (b) as the linear independence assumption for the gradients of the active
mixed constraints with respect to the control. Let a pair w0 (·) = (x 0 (·), u0 (·)) ∈ W satisfying
constraints (3.218)–(3.219) of the problem be the point tested for optimality.

Assumption 3.41. The control u0(·) is a piecewise continuous function such that all its discontinuity points are L-points (see Definition 2.1). Let Θ = {t1, . . . , ts}, t0 < t1 < · · · < ts < tf, be the set of all discontinuity points of u0(·). It is also assumed that (tk, x0(tk), u0^{k−}) ∈ Q, (tk, x0(tk), u0^{k+}) ∈ Q, k = 1, . . . , s, where u0^{k−} = u0(tk − 0), u0^{k+} = u0(tk + 0).

In what follows, we assume for definiteness that the set Θ of discontinuity points of u0 is nonempty. Whenever this set is empty, all statements admit obvious simplifications.

Minimum on a set of sequences. Weak and Pontryagin minimum. Let S be an arbitrary set of sequences {δwn} in the space W closed with respect to the operation of taking subsequences. For problem (3.217)–(3.219), let us define a concept of minimum on S at the admissible point w0 = (x0, u0). Set p0 = (x0(t0), x0(tf)).

Definition 3.42. We say that w0 is a (strict) minimum on S if there exists no sequence {δwn} ∈ S such that the following conditions hold for all its members:

J(p0 + δpn) < J(p0) (J(p0 + δpn) ≤ J(p0), δwn ≠ 0), (3.221)
F(p0 + δpn) ≤ 0, K(p0 + δpn) = 0, (3.222)
ẋ0 + δẋn = f(t, w0 + δwn), g(t, w0 + δwn) = 0, (3.223)
ϕ(t, w0 + δwn) ≤ 0, (p0 + δpn) ∈ P, (t, w0 + δwn) ∈ Q, (3.224)

where δpn = (δxn(t0), δxn(tf)) and δwn = (δxn, δun). Any sequence from S which satisfies conditions (3.221)–(3.224) is said to violate the (strict) minimality on S.

Let S0 be the set of sequences {δwn} in W such that

‖δwn‖ = ‖δxn‖_{1,1} + ‖δun‖_∞ → 0.

A weak minimum is a minimum on S0.

Definition 3.43. We say that w0 is a point of Pontryagin minimum if this point is a minimum on the set of sequences {δwn} in W satisfying the following two conditions:
(a) ‖δxn‖_{1,1} + ‖δun‖_1 → 0, where ‖δun‖_1 = ∫_{t0}^{tf} |δun| dt;
(b) there exists a compact set C ⊂ Q (which depends on the choice of the sequence) such that for all sufficiently large n, we have (t, w0(t) + δwn(t)) ∈ C a.e. on [t0, tf].
Any sequence satisfying conditions (a) and (b) will be referred to as a Pontryagin sequence on Q.

Obviously, every Pontryagin minimum is a weak minimum.



Minimum principle. Let us state the well-known first-order necessary conditions both for a weak and for a Pontryagin minimum. These conditions are often referred to as the local and the integral minimum principle, respectively. The local minimum principle, which we give first, is conveniently identified with the nonemptiness of the set Λ0 defined below. Let

l = α0 J + αF + βK, H = ψf, H̄ = H + νg + μϕ, (3.225)

where α0 is a scalar, and α, β, ψ, ν, and μ are row vectors of the same dimensions as F, K, f, g, and ϕ, respectively. The dependence of the functions l, H, and H̄ on the variables is as follows: l = l(p, α0, α, β), H = H(t, w, ψ), H̄ = H̄(t, w, ψ, ν, μ). The function l is said to be the endpoint Lagrange function, H is the Pontryagin function (or the Hamiltonian), and H̄ is the augmented Pontryagin function (or the augmented Hamiltonian).
Denote by R^{n∗} the space of n-dimensional row vectors. Set

λ = (α0, α, β, ψ(·), ν(·), μ(·)), (3.226)

where α0 ∈ R¹, α ∈ R^{d(F)∗}, β ∈ R^{d(K)∗}, ψ(·) ∈ W^{1,1}([t0, tf], R^{d(x)∗}), ν(·) ∈ L∞([t0, tf], R^{d(g)∗}), μ(·) ∈ L∞([t0, tf], R^{d(ϕ)∗}). Denote by Λ0 the set of all tuples λ satisfying the conditions

α0 ≥ 0, α ≥ 0, αF(p0) = 0, (3.227)
α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| = 1, (3.228)
μ(t) ≥ 0, μ(t)ϕ(t, w0(t)) = 0, (3.229)
ψ̇ = −H̄x(t, w0(t), ψ(t), ν(t), μ(t)), (3.230)
ψ(t0) = −lx0(p0, α0, α, β), ψ(tf) = lxf(p0, α0, α, β), (3.231)
H̄u(t, w0(t), ψ(t), ν(t), μ(t)) = 0, (3.232)

where αi and βj are components of the row vectors α and β, respectively, and H̄x, H̄u, lx0, and lxf are gradients with respect to the corresponding variables.
It is well known that if w0 is a weak minimum, then Λ0 is nonempty (see, e.g., [30]). The latter condition is just the local minimum principle. Note that Λ0 can consist of more than one element. The following result pertains to this possibility.

Proposition 3.44. The set Λ0 is a finite-dimensional compact set, and the projection λ = (α0, α, β, ψ, ν, μ) ↦ (α0, α, β) is injective on Λ0.

This property of Λ0 follows from the linear independence assumption for the gradients of the active mixed constraints with respect to the control. This assumption also guarantees the following property.

Proposition 3.45. Let λ ∈ Λ0 be an arbitrary tuple. Then its components ν(t) and μ(t) are continuous at each point of continuity of the control u0(t). Consequently, ν(t) and μ(t) are piecewise continuous functions such that all their discontinuity points belong to the set Θ. The adjoint variable ψ(t) is a piecewise smooth function such that all its break points belong to the set Θ.

In a similar way, the integral minimum principle, which is a first-order necessary condition for a Pontryagin minimum at w0, can be stated as the nonemptiness of the set M0 defined below. Let

U(t, x) = { u ∈ R^{d(u)} | (t, x, u) ∈ Q, g(t, x, u) = 0, ϕ(t, x, u) ≤ 0 }. (3.233)

Denote by M0 the set of all tuples λ ∈ Λ0 such that for all t ∈ [t0, tf] \ Θ, the inclusion u ∈ U(t, x0(t)) implies the inequality

H(t, x0(t), u, ψ(t)) ≥ H(t, x0(t), u0(t), ψ(t)). (3.234)

It is known [30] that if w0 is a Pontryagin minimum, then M0 is nonempty. The latter condition is just the integral (or Pontryagin) minimum principle. Inequality (3.234), satisfied for all t ∈ [t0, tf] \ Θ, is called the minimum condition for Pontryagin's function H with respect to u. (In the case of a measurable control u0(t), inequality (3.234) is fulfilled for a.a. t ∈ [t0, tf].)
Note that, just like Λ0, the set M0 can contain more than one element. Since this set is closed and M0 ⊂ Λ0, it follows from Proposition 3.44 that M0 is also a finite-dimensional compact set. Let us note one more important property of the set M0 (see [30]).

Proposition 3.46. Let λ ∈ M0 be an arbitrary tuple. Then there exists an absolutely continuous function ψt(t) from [t0, tf] into R¹ such that

ψ̇t = −H̄t(t, w0(t), ψ(t), ν(t), μ(t)), (3.235)
H(t, w0(t), ψ(t)) + ψt(t) = 0. (3.236)

Consequently, ψt(t) is a piecewise smooth function whose break points belong to Θ.

Particularly, this implies the following assertion. Let λ ∈ M0 be an arbitrary tuple. Then the function H(t, w0(t), ψ(t)) satisfies the following condition:

[H^λ]^k = 0 ∀ tk ∈ Θ, (3.237)

where [H^λ]^k is the jump of the function H(t, x0(t), u0(t), ψ(t)) at the point tk ∈ Θ, defined by the relations [H^λ]^k = H^{λk+} − H^{λk−}, H^{λk−} = H(tk, x0(tk), u0^{k−}, ψ(tk)), and H^{λk+} = H(tk, x0(tk), u0^{k+}, ψ(tk)). Here, by definition, u0^{k−} = u0(tk − 0) and u0^{k+} = u0(tk + 0).
Let Λ̃0 be the set of all tuples λ ∈ Λ0 satisfying condition (3.237). From Propositions 3.45 and 3.46 we have the following.

Proposition 3.47. The set Λ̃0 is a finite-dimensional compact set such that M0 ⊂ Λ̃0 ⊂ Λ0.

Note that from minimum condition (3.234), the inequality

H(tk, x0(tk), u, ψ(tk)) ≥ H^{λk} ∀ u ∈ U(tk, x0(tk)) (3.238)

follows by continuity, where by definition H^{λk} := H^{λk−} = H^{λk+}. Condition (3.238) holds for any tk ∈ Θ and any λ ∈ M0.
Now we will state two properties of elements of the set M0 which follow from the
minimum principle. The first is a necessary optimality condition related to each disconti-
nuity point of the control u0 . The second is a generalization of the Legendre condition.

The value D^k(H̄^λ). Let λ ∈ Λ0 and tk ∈ Θ. According to Proposition 3.45 the quantities μ^{k−} = μ(tk − 0), μ^{k+} = μ(tk + 0), ν^{k−} = ν(tk − 0), ν^{k+} = ν(tk + 0) are well defined. Set

H̄x^{λk−} := H̄x(tk, x0(tk), u0^{k−}, ψ(tk), ν^{k−}, μ^{k−}),
H̄x^{λk+} := H̄x(tk, x0(tk), u0^{k+}, ψ(tk), ν^{k+}, μ^{k+}). (3.239)

Similarly, set

H̄ψ^{λk−} = f^{k−} := f(tk, x0(tk), u0^{k−}),
H̄ψ^{λk+} = f^{k+} := f(tk, x0(tk), u0^{k+}), (3.240)

H̄t^{λk−} := H̄t(tk, x0(tk), u0^{k−}, ψ(tk), ν^{k−}, μ^{k−}),
H̄t^{λk+} := H̄t(tk, x0(tk), u0^{k+}, ψ(tk), ν^{k+}, μ^{k+}). (3.241)

Finally, set

D^k(H̄^λ) := −H̄x^{λk+} H̄ψ^{λk−} + H̄x^{λk−} H̄ψ^{λk+} − [H̄t^λ]^k, (3.242)

where [H̄t^λ]^k = H̄t^{λk+} − H̄t^{λk−} is the jump of H̄t(t, x0(t), u0(t), ψ(t), ν(t), μ(t)) at tk. Note that D^k(H̄^λ) is linear in λ.

Theorem 3.48. Let λ ∈ M0. Then D^k(H̄^λ) ≥ 0 for all tk ∈ Θ.

Since the conditions D^k(H̄^λ) ≥ 0 for all tk ∈ Θ follow from the minimum principle, they are necessary conditions for the Pontryagin minimum at the point w0.
As in previous problems, there is another way, convenient for practical use, to calculate the quantities D^k(H̄^λ). Given any λ ∈ Λ0 and tk ∈ Θ, we set

(Δk H̄^λ)(t) = H̄(t, x0(t), u0^{k+}, ψ(t), ν^{k+}, μ^{k+}) − H̄(t, x0(t), u0^{k−}, ψ(t), ν^{k−}, μ^{k−}). (3.243)

The function (Δk H̄^λ)(t) is continuously differentiable at each point of the set [t0, tf] \ Θ, since this property holds for x0(t) and ψ(t). The latter follows from the equation

ẋ0(t) = f(t, x0(t), u0(t)), (3.244)

the adjoint equation (3.230), and Assumption 3.41.

Proposition 3.49. For any λ ∈ Λ0 and any tk ∈ Θ, the following equalities hold:

D^k(H̄^λ) = −(d/dt)(Δk H̄^λ)(tk − 0) = −(d/dt)(Δk H̄^λ)(tk + 0). (3.245)

Finally, note that equation (3.244) can be written in the form

ẋ0(t) = H̄ψ(t, x0(t), u0(t), ψ(t), ν(t), μ(t)). (3.246)

Relations (3.244), (3.230), (3.235), (3.236), and formula (3.242) can be used for obtaining one more representation of the value D^k(H̄^λ).

Proposition 3.50. For any λ ∈ Λ0 and tk ∈ Θ, the following equality holds:

D^k(H̄^λ) = ψ̇^{k+} ẋ^{0k−} − ψ̇^{k−} ẋ^{0k+} + [ψ̇t]^k, (3.247)

where the function ψt(t) is defined by ψt(t) = −H(t, x0(t), u0(t), ψ(t)), the value [ψ̇t]^k = ψ̇t^{k+} − ψ̇t^{k−} is the jump of the derivative ψ̇t(t) at the point tk, and the vectors ẋ^{0k−}, ψ̇^{k−}, ψ̇t^{k−} and ẋ^{0k+}, ψ̇^{k+}, ψ̇t^{k+} are the left and the right limit values of the derivatives ẋ0(t), ψ̇(t), and ψ̇t(t) at tk, respectively.

Legendre–Clebsch condition. For any λ = (α0, α, β, ψ, ν, μ) ∈ Λ0, let us define the following three conditions:
(LC) For any t ∈ [t0, tf] \ Θ, the quadratic form

⟨H̄uu(t, x0(t), u0(t), ψ(t), ν(t), μ(t))ū, ū⟩ (3.248)

of the variable ū is positive semidefinite on the cone formed by the vectors ū ∈ R^{d(u)} such that

gu(t, x0(t), u0(t))ū = 0,
ϕju(t, x0(t), u0(t))ū ≤ 0 ∀ j ∈ Iϕ(t, x0(t), u0(t)), (3.249)
μj(t)ϕju(t, x0(t), u0(t))ū = 0 ∀ j ∈ Iϕ(t, x0(t), u0(t)),

where H̄uu is the matrix of second derivatives with respect to u of the function H̄, and Iϕ(t, x, u) is the set of indices of active inequality constraints ϕj(t, x, u) ≤ 0 at (t, x, u), defined by (3.220).
(LCΘ^−) For any tk ∈ Θ, the quadratic form

⟨H̄uu(tk, x0(tk), u0^{k−}, ψ(tk), ν^{k−}, μ^{k−})ū, ū⟩ (3.250)

of the variable ū is positive semidefinite on the cone formed by the vectors ū ∈ R^{d(u)} such that

gu(tk, x0(tk), u0^{k−})ū = 0,
ϕju(tk, x0(tk), u0^{k−})ū ≤ 0 ∀ j ∈ Iϕ(tk, x0(tk), u0^{k−}), (3.251)
μj^{k−} ϕju(tk, x0(tk), u0^{k−})ū = 0 ∀ j ∈ Iϕ(tk, x0(tk), u0^{k−}).

(LCΘ^+) For any tk ∈ Θ, the quadratic form

⟨H̄uu(tk, x0(tk), u0^{k+}, ψ(tk), ν^{k+}, μ^{k+})ū, ū⟩ (3.252)

of the variable ū is positive semidefinite on the cone formed by the vectors ū ∈ R^{d(u)} such that

gu(tk, x0(tk), u0^{k+})ū = 0,
ϕju(tk, x0(tk), u0^{k+})ū ≤ 0 ∀ j ∈ Iϕ(tk, x0(tk), u0^{k+}), (3.253)
μj^{k+} ϕju(tk, x0(tk), u0^{k+})ū = 0 ∀ j ∈ Iϕ(tk, x0(tk), u0^{k+}).

We say that an element λ ∈ Λ0 satisfies the Legendre–Clebsch condition if conditions (LC), (LCΘ^−), and (LCΘ^+) hold. Clearly, these conditions are not independent: conditions (LCΘ^−) and (LCΘ^+) follow from condition (LC) by continuity.

Theorem 3.51. For any λ ∈ M0 , the Legendre–Clebsch condition holds.

Thus, the Legendre–Clebsch condition is also a consequence of the minimum principle.

Legendrian elements. An element λ ∈ Λ0 is said to be Legendrian if, for this element, the Legendre–Clebsch condition is satisfied and also the following conditions hold:

[H^λ]^k = 0, D^k(H̄^λ) ≥ 0 ∀ tk ∈ Θ. (3.254)

Let M be an arbitrary subset of the compact set Λ0. Denote by Leg(M) the subset of all Legendrian elements λ ∈ M. It follows from Theorems 3.48 and 3.51 and Proposition 3.46 that

Leg(M0) = M0. (3.255)

Now we introduce the critical cone K and the quadratic form Ω^λ(·) which will be used for the statement of the quadratic optimality condition.

Critical cone. As above, we denote by PW^{1,2}([t0, tf], R^{d(x)}) the space of piecewise continuous functions x̄(·) : [t0, tf] → R^{d(x)} that are absolutely continuous on each interval of the set [t0, tf] \ Θ and have a square integrable first derivative. Given tk ∈ Θ and x̄(·) ∈ PW^{1,2}([t0, tf], R^{d(x)}), we use the notation x̄^{k−} = x̄(tk − 0), x̄^{k+} = x̄(tk + 0), [x̄]^k = x̄^{k+} − x̄^{k−}. Denote by Z2(Θ) the space of triples z̄ = (ξ̄, x̄, ū) such that

ξ̄ = (ξ̄1, . . . , ξ̄s) ∈ R^s, x̄ ∈ PW^{1,2}([t0, tf], R^{d(x)}), ū ∈ L2([t0, tf], R^{d(u)}).

Thus, Z2(Θ) = R^s × PW^{1,2}([t0, tf], R^{d(x)}) × L2([t0, tf], R^{d(u)}). Denote by

IF(p0) = {i ∈ {1, . . . , d(F)} | Fi(p0) = 0}

the set of indices of active inequality constraints Fi(p) ≤ 0 at the point p0, where Fi are the components of the vector function F.
Let K denote the set of z̄ = (ξ̄, x̄, ū) ∈ Z2(Θ) such that

J′(p0)p̄ ≤ 0, Fi′(p0)p̄ ≤ 0, i ∈ IF(p0), K′(p0)p̄ = 0, (3.256)
x̄˙(t) = fw(t, w0(t))w̄(t), [x̄]^k = [ẋ0]^k ξ̄k, tk ∈ Θ, (3.257)
gw(t, w0(t))w̄(t) = 0, (3.258)
ϕjw(t, w0(t))w̄(t) ≤ 0 a.e. on M0(ϕj^0), j = 1, . . . , d(ϕ), (3.259)

where M0(ϕj^0) = {t ∈ [t0, tf] | ϕj(t, w0(t)) = 0}, p̄ = (x̄(t0), x̄(tf)), w̄ = (x̄, ū), and [ẋ0]^k is the jump of the function ẋ0(t) at the point tk ∈ Θ, i.e., [ẋ0]^k = ẋ^{0k+} − ẋ^{0k−} = ẋ0(tk + 0) − ẋ0(tk − 0), tk ∈ Θ. Obviously, K is a closed convex cone in the space Z2(Θ). We call it the critical cone of problem (3.217)–(3.219) at the point w0.
The following question is of interest: Which inequalities in the definition of K can
be replaced by equalities without affecting K? This question is answered below.

Proposition 3.52. For any λ = (α0, α, β, ψ, ν, μ) ∈ Λ̃0 and z̄ = (ξ̄, x̄, ū) ∈ K, we have

α0 J′(p0)p̄ = 0, αi Fi′(p0)p̄ = 0, i ∈ IF(p0), (3.260)
μj(t)ϕjw(t, w0(t))w̄(t) = 0, j = 1, . . . , d(ϕ), (3.261)

where αi and μj are the components of the vectors α and μ, respectively.

Note that conditions (3.260) and (3.261) can be written in brief as α0 J′(p0)p̄ = 0, αF′(p0)p̄ = 0, and μ(t)ϕw(t, w0(t))w̄(t) = 0. Proposition 3.52 gives an answer to the question posed above. According to this proposition, for any λ ∈ Λ̃0, conditions (3.256)–(3.261) also define K. It follows that if, for some λ = (α0, α, β, ψ, ν, μ) ∈ Λ̃0, the condition α0 > 0 holds, then, in the definition of K, the inequality J′(p0)p̄ ≤ 0 can be replaced by the equality J′(p0)p̄ = 0. If, for some λ ∈ Λ̃0 and i0 ∈ {1, . . . , d(F)}, the condition αi0 > 0 holds, then the inequality Fi0′(p0)p̄ ≤ 0 can be replaced by the equality Fi0′(p0)p̄ = 0. Finally, for any j ∈ {1, . . . , d(ϕ)} and λ ∈ Λ̃0, the inequality ϕjw(t, w0(t))w̄(t) ≤ 0 can be replaced by the equality ϕjw(t, w0(t))w̄(t) = 0 a.e. on the set {t ∈ [t0, tf] | μj(t) > 0} ⊂ M0(ϕj^0). Every such change gives an equivalent system of conditions still defining K.
The following question is also of interest: Under what conditions can one of the endpoint inequalities in the definition of K be omitted without affecting K? In particular, when can the inequality J′(p0)p̄ ≤ 0 be omitted?

Proposition 3.53. Suppose that there exists λ ∈ Λ̃0 such that α0 > 0. Then the relations

Fi′(p0)p̄ ≤ 0, i ∈ IF(p0), αi Fi′(p0)p̄ = 0, i ∈ IF(p0), K′(p0)p̄ = 0, (3.262)

combined with (3.257)–(3.259) and (3.261) imply that J′(p0)p̄ = 0; i.e., K can be defined by conditions (3.257)–(3.259), (3.261), and (3.262) as well.

Therefore, if for some λ ∈ Λ̃0 all inequalities (3.256) corresponding to positive αi are replaced by the equalities and each inequality ϕjw(t, w0(t))w̄(t) ≤ 0 is replaced by the equality on the set {t ∈ [t0, tf] | μj(t) > 0} ⊂ M0(ϕj^0), then, after all such changes, the equality J′(p0)p̄ = 0 corresponding to positive α0 can be excluded, and the obtained new system of conditions still defines K.

Quadratic form. Let us introduce the following notation. Given any

λ = (α0, α, β, ψ, ν, μ) ∈ Λ0,

we set

[H̄x^λ]^k = H̄x^{λk+} − H̄x^{λk−}, k = 1, . . . , s, (3.263)

where H̄x^{λk−} and H̄x^{λk+} are defined by (3.239). Thus, [H̄x^λ]^k denotes the jump of the function H̄x(t, x0(t), u0(t), ψ(t), ν(t), μ(t)) at t = tk ∈ Θ. It follows from the adjoint equation (3.230) that

[H̄x^λ]^k = −[ψ̇]^k, k = 1, . . . , s, (3.264)

where the row vector

[ψ̇]^k = ψ̇^{k+} − ψ̇^{k−} = ψ̇(tk + 0) − ψ̇(tk − 0)

is the jump of the derivative ψ̇(t) at tk ∈ Θ. Furthermore, for brevity we set

lpp^λ(p0) = (∂²l/∂p²)(p0, α0, α, β), H̄ww^λ(w0) = (∂²H̄/∂w²)(t, x0(t), u0(t), ψ(t), ν(t), μ(t)). (3.265)

Finally, for x̄ ∈ PW^{1,2}([t0, tf], R^{d(x)}), we set

x̄av^k = (1/2)(x̄^{k−} + x̄^{k+}), k = 1, . . . , s. (3.266)

Here x̄av^k is the average value of the function x̄ at tk ∈ Θ.
We are now ready to introduce the quadratic form, which takes into account the discontinuities of the control u0. For any λ ∈ Λ0 and z̄ = (ξ̄, x̄, ū) ∈ Z2(Θ), we set

Ω^λ(z̄) = (1/2) Σ_{k=1}^{s} ( D^k(H̄^λ)ξ̄k² + 2[H̄x^λ]^k x̄av^k ξ̄k ) + (1/2)⟨lpp^λ(p0)p̄, p̄⟩ + (1/2) ∫_{t0}^{tf} ⟨H̄ww^λ(t, w0)w̄, w̄⟩ dt, (3.267)

where w̄ = (x̄, ū), p̄ = (x̄(t0), x̄(tf)). Recall that the value D^k(H̄^λ) is defined by (3.242), and it is nonnegative for any λ ∈ M0. Obviously, Ω^λ is quadratic in z̄ and linear in λ. Set

ωΘ^λ(ξ̄, x̄) = (1/2) Σ_{k=1}^{s} ( D^k(H̄^λ)ξ̄k² + 2[H̄x^λ]^k x̄av^k ξ̄k ). (3.268)

This quadratic form is related to the discontinuities of the control u0, and we call it the internal form. According to (3.247) and (3.264) it can be written as follows:

ωΘ^λ(ξ̄, x̄) = (1/2) Σ_{k=1}^{s} ( ( ψ̇^{k+} ẋ^{0k−} − ψ̇^{k−} ẋ^{0k+} + [ψ̇t]^k ) ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k ). (3.269)

Furthermore, we set

ω^λ(w̄) = (1/2)⟨lpp^λ(p0)p̄, p̄⟩ + (1/2) ∫_{t0}^{tf} ⟨H̄ww^λ(t, w0)w̄, w̄⟩ dt. (3.270)

This quadratic form is the second variation of the Lagrangian of problem (3.217)–(3.219) at the point w0. We call it the external form. Thus, the quadratic form Ω^λ(z̄) is the sum of the internal and external forms:

Ω^λ(z̄) = ωΘ^λ(ξ̄, x̄) + ω^λ(w̄). (3.271)
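For numerical verification of the conditions below, the form (3.267) can be assembled on a time grid once the trajectory-dependent data are available. A hedged sketch (array layout, names, and the trapezoidal rule are illustrative choices, not part of the text):

```python
import numpy as np

def Omega_lambda(D, Hx_jump, xi, x_av, lpp, p_bar, Hww, w_bar, t_grid):
    """Assemble (3.267). D: (s,) values D^k(Hbar^lam); Hx_jump: (s, d(x)) jumps
    [Hbar_x^lam]^k; xi: (s,); x_av: (s, d(x)) averages (3.266); lpp: (2d(x), 2d(x));
    p_bar: (2d(x),); Hww: (N, m, m) and w_bar: (N, m) with m = d(x) + d(u)."""
    internal = 0.5 * np.sum(D * xi**2 + 2 * np.einsum('kj,kj->k', Hx_jump, x_av) * xi)
    endpoint = 0.5 * p_bar @ lpp @ p_bar
    quadratic = np.einsum('nij,ni,nj->n', Hww, w_bar, w_bar)
    return internal + endpoint + 0.5 * np.trapz(quadratic, t_grid)
```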

Quadratic necessary condition for a Pontryagin minimum. Now, we formulate the main necessary quadratic condition for a Pontryagin minimum in problem (3.217)–(3.219) at the point w0.

Theorem 3.54. If w0 is a Pontryagin minimum, then the following Condition A holds: The set M0 is nonempty and

max_{λ∈M0} Ω^λ(z̄) ≥ 0 ∀ z̄ ∈ K.

The proof of this theorem was given in [86] and published in [95].
Next we formulate the basic sufficient condition in problem (3.217)–(3.219). We call it briefly Condition B(Γ). It is sufficient not only for a Pontryagin minimum, but also for a bounded strong minimum defined below. We give a preliminary definition of a strong minimum which is slightly different from the commonly used definition.

Strong minimum. A state variable xi (the ith component of x) is said to be unessential if the functions f, g, and ϕ do not depend on this variable and the functions J, F, and K are affine in pi = (xi(t0), xi(tf)). A state variable which does not possess this property is said to be essential. The vector comprised of the essential components xi of x is denoted by x̲. Similarly, δx̲ denotes the vector consisting of the essential components of a variation δx. Denote by S the set of all sequences {δwn} in W such that |δx̲n(t0)| + ‖δx̲n‖C → 0. A minimum on S is called strong.


Bounded strong minimum. A sequence {δwn} in W is said to be bounded strong
on Q if {δwn} ∈ S and there exists a compact set C ⊂ Q such that, for all sufficiently
large n, (t, x⁰(t), u⁰(t) + δun(t)) ∈ C a.e. on [t0, tf]. By a (strict) bounded strong
minimum we mean a (strict) minimum on the set of all sequences that are bounded strong on Q.
Every strong minimum is a bounded strong minimum, and every bounded strong minimum
is a Pontryagin minimum.
We know that the bounded strong minimum is equivalent to the strong minimum if
there exists a compact set C ⊂ Q such that the conditions t ∈ [t0, tf] and u ∈ U(t, x⁰(t))
imply (t, x⁰(t), u) ∈ C, where U(t, x) is the set defined by (3.233).
Let us state a quadratic sufficient condition for a point w⁰ = (x⁰, u⁰) to be a bounded
strong minimum. Again we assume that u⁰ is a piecewise continuous control and that Θ =
{t1, . . . , ts} is the set of its discontinuity points, every element of Θ being an L-point.


Set M(CΓ). Let Γ be an order function (see Definition 2.17). For any C > 0, we
denote by M(CΓ) the set of all λ ∈ M0 such that the following condition holds:

H(t, x⁰(t), u, ψ(t)) − H(t, x⁰(t), u⁰(t), ψ(t)) ≥ CΓ(t, u)
∀ t ∈ [t0, tf] \ Θ, u ∈ U(t, x⁰(t)). (3.272)

Condition (3.272) strengthens the minimum condition (3.234), and we call (3.272) the
minimum condition of strictness CΓ (or the CΓ-growth condition for H). For any C > 0,
M(CΓ) is a closed subset of M0 and, therefore, a finite-dimensional compact set.


Basic sufficient condition. Let

γ̄(z̄) = ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt.

On the subspace (3.257)–(3.258) of the space Z2(Θ), the value γ̄(z̄) is equivalent to the
squared norm of the space Z2(Θ). Let Γ be an order function.

Definition 3.55. We say that a point w⁰ satisfies Condition B(Γ) if there exists C > 0 such
that the set M(CΓ) is nonempty and

max_{λ∈M(CΓ)} Ω^λ(z̄) ≥ C γ̄(z̄) ∀ z̄ ∈ K.

Theorem 3.56. If there exists an order function Γ(t, u) such that Condition B(Γ) holds,
then w⁰ is a strict bounded strong minimum.

Condition B(Γ) obviously holds if, for some C > 0, the set M(CΓ) is nonempty and
the cone K consists only of zero. Therefore, Theorem 3.56 implies the following.

Corollary 3.57. If for some C > 0 the set M(CΓ) is nonempty, and if K = {0}, then w⁰ is
a strict bounded strong minimum.

Corollary 3.57 states a first-order sufficient condition for a bounded strong minimum.

γ-sufficiency. Quadratic Condition B(Γ) implies not only a bounded strong mini-
mum, but also a certain strengthening of this concept, called γ-sufficiency on
the set of bounded strong sequences. Below, we introduce the (higher) order γ and formu-
late two concepts: γ-sufficiency for a Pontryagin minimum and γ-sufficiency for a bounded
strong minimum. Regarding the point w⁰ = (x⁰, u⁰), tested for optimality, we again use
Assumption 3.41. Let Γ(t, u) be an order function. Set

γ(δw) = ‖δx‖²_C + ∫_{t0}^{tf} Γ(t, u⁰(t) + δu(t)) dt.

The functional γ is defined on the set of all variations δw = (δx, δu) ∈ W such that

(t, x⁰(t), u⁰(t) + δu(t)) ∈ Q a.e. on [t0, tf].

The functional γ is the order associated with the order function Γ(t, u). Following the
general theory [55], we also call γ the higher order. Thus, with the point w⁰ we associate
the family of order functions Γ and the family of corresponding orders γ. Let us denote the
latter family by Ord(w⁰).
Let us introduce the violation function of problem (3.217)–(3.219) at the point w⁰:

σ(δw) = max{ σ_JFK(δp), σ_f(δw), σ_gϕ(δw) }, (3.273)


where

σ_JFK(δp) = max{ J(p⁰ + δp) − J(p⁰), max_{i=1,...,d(F)} Fi(p⁰ + δp), |K(p⁰ + δp)| },
σ_f(δw) = ∫_{t0}^{tf} |ẋ⁰ + δẋ − f(t, w⁰ + δw)| dt,
σ_gϕ(δw) = ess sup_{[t0,tf]} max{ |g(t, w⁰ + δw)|, max_{i=1,...,d(ϕ)} ϕi(t, w⁰ + δw) },
δw = (δx, δu) ∈ W, δp = (δx(t0), δx(tf)), p⁰ + δp ∈ P, (t, w⁰(t) + δw(t)) ∈ Q.

Obviously, σ(δw) ≥ 0 and σ(0) = 0.
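
For concreteness, a minimal sketch of evaluating σ(δw) on a time grid follows; the callables and array names are hypothetical stand-ins for the problem data, and the integral and the essential supremum are replaced by their obvious discretizations.

# A minimal sketch (hypothetical names) of the violation function (3.273).
import numpy as np

def sigma(J, F, K, f, g, phi, t, x0, u0, xdot0, dx, du, dxdot):
    """J, K, F act on endpoint vectors; f, g, phi are vectorized in t."""
    p0 = np.concatenate([x0[0], x0[-1]])
    p  = np.concatenate([x0[0] + dx[0], x0[-1] + dx[-1]])
    s_JFK = max(J(p) - J(p0), F(p).max(), np.abs(K(p)).max())
    resid = np.linalg.norm(xdot0 + dxdot - f(t, x0 + dx, u0 + du), axis=1)
    s_f = 0.5 * np.sum((resid[1:] + resid[:-1]) * np.diff(t))   # trapezoid rule
    s_gphi = max(np.abs(g(t, x0 + dx, u0 + du)).max(),
                 phi(t, x0 + dx, u0 + du).max())
    return max(s_JFK, s_f, s_gphi)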


Let S be a set of sequences {δwn} in W, closed with respect to the operation of
taking subsequences. Evidently, w⁰ is a point of a strict minimum on S iff, for any
sequence {δwn} ∈ S containing nonzero terms, one has σ(δwn) > 0 for all sufficiently large n.
A strengthened version of the last condition is suggested by the following definition.

Definition 3.58. We say that w⁰ is a point of γ-sufficiency on S if there exists an ε > 0 such
that, for any sequence {δwn} ∈ S, we have σ(δwn) ≥ εγ(δwn) for all sufficiently large n.

Let us now introduce a set of sequences related to a bounded strong minimum. Denote
by Sbs the set of all sequences {δwn} which are bounded strong on Q and satisfy the following
conditions:
(a) (p⁰ + δpn) ∈ P for all sufficiently large n,
(b) σ(δwn) → 0 as n → ∞.
Conditions (a) and (b) hold for every sequence {δwn} that violates minimality, so a bounded
strong minimum can be treated as a minimum on Sbs (the subscript bs means “bounded
strong”).

Theorem 3.59. Condition B(Γ) is equivalent to γ-sufficiency on Sbs.

The proof of this theorem was given in [86] and published in [94, 95]. Theorem 3.59
shows, in particular, the nontrivial character of the minimum guaranteed by Condition B(Γ).
Let us explain this in more detail. A sequence {δwn} is said to be admissible if the sequence
{w⁰ + δwn} satisfies all constraints of the canonical problem. We say that w⁰ is a point of
γ-minimum on Sbs (or that the γ-growth condition for the cost function holds on Sbs) if there
exists ε > 0 such that, for any admissible sequence {δwn} ∈ Sbs, we have J(p⁰ + δpn) −
J(p⁰) ≥ εγ(δwn) for all sufficiently large n. Clearly, γ-sufficiency on Sbs implies γ-
minimum on Sbs. In fact, it is the sufficient Condition B(Γ) that ensures the γ-minimum on
Sbs. The nontrivial character of the γ-minimum on Sbs is caused by the nontrivial definition
of the order function Γ which specifies the higher order γ.
Now let us discuss an important question concerning the characterization of the condition
λ ∈ M(CΓ).

Local quadratic growth condition of the Hamiltonian. Fix an arbitrary tuple
λ ∈ Λ0. We set

δH[t, v] := H(t, x⁰(t), u⁰(t) + v, ψ(t)) − H(t, x⁰(t), u⁰(t), ψ(t)). (3.274)


Definition 3.60. We say that, at the point w⁰, the Hamiltonian satisfies a local quadratic
growth condition if there exist ε > 0 and α > 0 such that, for all t ∈ [t0, tf] \ Θ, the following
inequality holds:

δH[t, v] ≥ α|v|² if v ∈ R^{d(u)}, g(t, x⁰(t), u⁰(t) + v) = 0,
ϕ(t, x⁰(t), u⁰(t) + v) ≤ 0, |v| < ε. (3.275)


Recall the definition of H̄ in (3.225). Let us denote by

H̄u(t) := H̄u(t, x⁰(t), u⁰(t), ψ(t), ν(t), μ(t)),
H̄uu(t) := H̄uu(t, x⁰(t), u⁰(t), ψ(t), ν(t), μ(t))

the first and second derivatives with respect to u of the augmented Hamiltonian, and adopt a
similar notation for the Hamiltonian function H. Similarly, we denote gu(t) :=
gu(t, x⁰(t), u⁰(t)), ϕi(t) := ϕi(t, x⁰(t), u⁰(t)), ϕiu(t) := ϕiu(t, x⁰(t), u⁰(t)), i = 1, . . . , d(ϕ).
We shall formulate a generalization of the strengthened Legendre condition using the quad-
ratic form ⟨H̄uu(t)v, v⟩ complemented by a special nonnegative term ρ(t, v), which is
homogeneous (but not quadratic) of the second degree with respect to v. Let us define this
additional term.
For any number a, we set a⁺ = max{a, 0} and a⁻ = max{−a, 0}, so that a⁺ ≥ 0,
a⁻ ≥ 0, and a = a⁺ − a⁻. Denote by

χi(t) := χ_{ {ϕi(τ) < 0} }(t) (3.276)

the characteristic function of the set {τ | ϕi(τ) < 0}, i = 1, . . . , d(ϕ). If d(ϕ) > 1, then, for
any t ∈ [t0, tf] and any v ∈ R^{d(u)}, we set

ρ(t, v) = max_{1≤i≤d(ϕ)} χi(t) Σ_{j=1}^{d(ϕ)} ( μj(t) / |ϕi(t)| ) (ϕju(t)v)⁻ (ϕiu(t)v)⁺. (3.277)

Here, by definition,

( μj(t) / |ϕi(t)| ) χi(t) = 0 if ϕi(t) = 0, i, j = 1, . . . , d(ϕ).

In particular, for d(ϕ) = 2 the function ρ has the form

ρ(t, v) = χ2(t) ( μ1(t) / |ϕ2(t)| ) (ϕ1u(t)v)⁻ (ϕ2u(t)v)⁺
        + χ1(t) ( μ2(t) / |ϕ1(t)| ) (ϕ2u(t)v)⁻ (ϕ1u(t)v)⁺. (3.278)

In the case d(ϕ) = 1, we set ρ(t, v) ≡ 0.
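
A minimal sketch of the evaluation of ρ(t, v) at a fixed time t follows (argument names are hypothetical); for d(ϕ) = 2 it reproduces (3.278), since the term with j = i always vanishes because (a)⁻(a)⁺ = 0.

# A minimal sketch of rho(t, v) from (3.277) at a fixed t.  Inputs:
# mu: (d_phi,) multipliers mu_j(t), phi: (d_phi,) values phi_i(t),
# phi_u: (d_phi, d_u) rows phi_iu(t), v: (d_u,).  Names are hypothetical.
import numpy as np

def rho(mu, phi, phi_u, v):
    if len(mu) <= 1:
        return 0.0                      # by definition, rho == 0 when d(phi) = 1
    dv = phi_u @ v                      # the values phi_iu(t) v
    plus, minus = np.maximum(dv, 0.0), np.maximum(-dv, 0.0)
    best = 0.0
    for i in range(len(mu)):
        if phi[i] >= 0.0:               # chi_i(t) = 0, or the phi_i(t) = 0 convention
            continue
        best = max(best, np.dot(mu, minus) * plus[i] / abs(phi[i]))
    return best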


For any ϑ > 0 and any t ∈ [t0, tf] \ Θ, denote by Ct(ϑ) the set of all vectors v ∈ R^{d(u)}
satisfying
gu(t)v = 0, ϕju(t)v ≤ 0 if ϕj(t) = 0,
ϕju(t)v = 0 if μj(t) > ϑ, j = 1, . . . , d(ϕ). (3.279)


Definition 3.61. We say that the Hamiltonian satisfies the generalized strengthened Legen-
dre condition if

∃ α > 0, ϑ > 0 such that ∀ t ∈ [t0, tf] \ Θ:
½ ⟨H̄uu(t)v, v⟩ + ρ(t, v) ≥ α|v|² ∀ v ∈ Ct(ϑ). (3.280)

Theorem 3.62. A local quadratic growth condition for the Hamiltonian is equivalent to the
generalized strengthened Legendre condition.


This theorem was proved in [8] for the control constrained problem (without mixed
constraints).
We note that Ct(ϑ) is in general a larger set than the local cone Ct of critical direc-
tions for the Hamiltonian, i.e., the set of directions v ∈ R^{d(u)} such that
gu(t)v = 0, ϕju(t)v ≤ 0 if ϕj(t) = 0,
ϕju(t)v = 0 if μj(t) > 0, j = 1, . . . , d(ϕ). (3.281)


A simple sufficient condition for local quadratic growth of the Hamiltonian.
Consider the following second-order condition for the Hamiltonian:

∃ α > 0, ϑ > 0 such that ∀ t ∈ [t0, tf] \ Θ:
½ ⟨H̄uu(t)v, v⟩ ≥ α|v|² ∀ v ∈ Ct(ϑ). (3.282)

Let us note that this inequality is stronger than (3.280), since the function ρ(t, v) is nonneg-
ative.

Theorem 3.63. Condition (3.282) implies local quadratic growth of the Hamiltonian.
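
A conservative numerical check of (3.282) at a fixed time t is sketched below (function name and inputs are hypothetical, not from the book): dropping the inequality directions, we test ½⟨H̄uu(t)v, v⟩ ≥ α|v|² on the subspace {v | gu(t)v = 0, ϕju(t)v = 0 for the stacked active rows}, which contains Ct(ϑ), so a positive margin at every grid point implies (3.282).

# A minimal sketch: smallest eigenvalue of 0.5*Huu restricted to the nullspace
# of the matrix A stacking g_u(t) and the rows phi_ju(t) with mu_j(t) > theta.
# Positive definiteness there is sufficient for (3.282) at this t, since this
# nullspace contains the cone C_t(theta).
import numpy as np
from scipy.linalg import null_space

def legendre_margin(Huu, A):
    Z = null_space(A) if A.shape[0] > 0 else np.eye(Huu.shape[0])
    if Z.shape[1] == 0:
        return np.inf                    # the subspace is {0}: nothing to check
    return np.linalg.eigvalsh(0.5 * Z.T @ Huu @ Z).min()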


Characterization of the condition λ ∈ M(CΓ). An element λ ∈ Λ0 is said to be
strictly Legendrian if, for this element, the generalized strengthened Legendre condition
(3.280) is satisfied and also the following conditions hold:

[H^λ]^k = 0, D^k(H̄^λ) > 0 ∀ tk ∈ Θ. (3.283)

Denote by M0⁺ the set of λ ∈ M0 such that the following conditions hold:
(a) H(t, x⁰(t), u, ψ(t)) > H(t, x⁰(t), u⁰(t), ψ(t))
if t ∈ [t0, tf] \ Θ, u ∈ U(t, x⁰(t)), u ≠ u⁰(t), where
U(t, x) = {u ∈ R^{d(u)} | (t, x, u) ∈ Q, g(t, x, u) = 0, ϕ(t, x, u) ≤ 0};
(b) H(tk, x⁰(tk), u, ψ(tk)) > H^k
if tk ∈ Θ, u ∈ U(tk, x⁰(tk)), u ∉ {u^{0k−}, u^{0k+}}, where
H^k := H(tk, x⁰(tk), u^{0k−}, ψ(tk)) = H(tk, x⁰(tk), u^{0k+}, ψ(tk)).

Denote by Leg⁺(M0⁺) the set of all strictly Legendrian elements λ ∈ M0⁺.

Theorem 3.64. An element λ belongs to Leg⁺(M0⁺) iff there exists C > 0 such that λ ∈ M(CΓ).

The proof will be published elsewhere.



3.4.2 General Optimal Control Problem on a Variable Time Interval


Statement of the problem. Here, quadratic optimality conditions, both necessary
and sufficient, are presented for the following optimal control problem on a variable time
interval. Let T denote a trajectory (x(t), u(t) | t ∈ [t0, tf]), where the state variable x(·)
is a Lipschitz continuous function and the control variable u(·) is a bounded measurable
function on a time interval Δ = [t0, tf]. The interval Δ is not fixed. For each trajectory T,
we denote by p = (t0, x(t0), tf, x(tf)) the vector of endpoints of the time-state variable (t, x).
It is required to find T minimizing the functional

J(T ) := J (t0 , x(t0 ), tf , x(tf )) → min (3.284)

subject to the constraints

F (t0 , x(t0 ), tf , x(tf )) ≤ 0, K(t0 , x(t0 ), tf , x(tf )) = 0, (3.285)


ẋ(t) = f (t, x(t), u(t)), (3.286)
g(t, x(t), u(t)) = 0, ϕ(t, x(t), u(t)) ≤ 0, (3.287)
p ∈ P , (t, x(t), u(t)) ∈ Q, (3.288)

where P and Q are open sets, and x, u, F , K, f , g, and ϕ are vector functions.
We assume that the functions J , F , and K are defined and twice continuously
differentiable on P , and the functions f , g, and ϕ are defined and twice contin-
uously differentiable on Q; moreover, g and ϕ satisfy the linear independence assump-
tion (see Assumption 3.40).

Necessary conditions for a Pontryagin minimum. Let T be a fixed admissible
trajectory such that the control u(·) is a piecewise Lipschitz continuous function on the
interval Δ with the set of discontinuity points Θ = {t1, . . . , ts}, t0 < t1 < · · · < ts < tf. Let us
formulate a first-order necessary condition for optimality of the trajectory T . We introduce
the Pontryagin function
H (t, x, u, ψ) = ψf (t, x, u) (3.289)
and the augmented Pontryagin function

H̄ (t, x, u, ψ, ν, μ) = H (t, x, u, ψ) + νg(t, x, u) + μϕ(t, x, u), (3.290)

where ψ, ν, and μ are row vectors of the dimensions d(x), d(g), and d(ϕ), respectively.
Let us define the endpoint Lagrange function

l(p, α0 , α, β) = α0 J (p) + αF (p) + βK(p), (3.291)

where p = (t0, x0, tf, xf), x0 = x(t0), xf = x(tf), α0 ∈ R, α ∈ (R^{d(F)})*, β ∈ (R^{d(K)})*. Also
we introduce a tuple of Lagrange multipliers

λ = (α0, α, β, ψ(·), ψ0(·), ν(·), μ(·)) (3.292)

such that ψ(·) : Δ → (R^{d(x)})* and ψ0(·) : Δ → R¹ are piecewise smooth functions, con-
tinuously differentiable on each interval of the set Δ \ Θ, and ν(·) : Δ → (R^{d(g)})* and

μ(·) : Δ → (R^{d(ϕ)})* are piecewise continuous functions, Lipschitz continuous on each in-
terval of the set Δ \ Θ.
Denote by M0 the set of all normed tuples λ satisfying the conditions of the minimum
principle for the trajectory T:

α0 ≥ 0, α ≥ 0, αF(p) = 0, α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| = 1,
ψ̇ = −H̄x, ψ̇0 = −H̄t, H̄u = 0, t ∈ Δ \ Θ,
ψ(t0) = −l_{x0}, ψ(tf) = l_{xf}, ψ0(t0) = −l_{t0}, ψ0(tf) = l_{tf}, (3.293)
min_{u∈U(t,x(t))} H(t, x(t), u, ψ(t)) = H(t, x(t), u(t), ψ(t)), t ∈ Δ \ Θ,
H(t, x(t), u(t), ψ(t)) + ψ0(t) = 0, t ∈ Δ \ Θ,

where U(t, x) = {u ∈ R^{d(u)} | g(t, x, u) = 0, ϕ(t, x, u) ≤ 0, (t, x, u) ∈ Q}. The derivatives l_{x0}
and l_{xf} are taken at (p, α0, α, β), where p = (t0, x(t0), tf, x(tf)), and the derivatives H̄x, H̄u, and
H̄t are taken at (t, x(t), u(t), ψ(t), ν(t), μ(t)), where t ∈ Δ \ Θ. (The condition H̄u = 0 follows
from the other conditions in this definition and therefore could be excluded; yet we will need
to use it later.)
We define the Pontryagin minimum in problem (3.284)–(3.288) on a variable interval
[t0, tf] as in Section 3.3.2 (see Definition 3.25). The condition M0 ≠ ∅ is equivalent to
Pontryagin's minimum principle. It is a first-order necessary condition of a Pontryagin
minimum for the trajectory T. Thus, the following theorem holds (see, e.g., [76]).

Theorem 3.65. If the trajectory T affords a Pontryagin minimum, then the set M0 is
nonempty.

Assume that M0 is nonempty. Using the definition of the set M0 and the linear
independence assumption for g and ϕ, one can easily prove the following statement.

Proposition 3.66. The set M0 is a finite-dimensional compact set, and the mapping λ ↦
(α0, α, β) is injective on M0.

As in Section 3.3, for each λ ∈ M0 and tk ∈ Θ, we set

D^k(H̄) = −H̄x^{k+} H̄ψ^{k−} + H̄x^{k−} H̄ψ^{k+} − [H̄t]^k, (3.294)

where H̄x^{k−} = H̄x(tk, x(tk), u(tk−), ψ(tk), ν(tk−), μ(tk−)), H̄x^{k+} = H̄x(tk, x(tk), u(tk+), ψ(tk),
ν(tk+), μ(tk+)), [H̄t]^k = H̄t^{k+} − H̄t^{k−}, etc.

Theorem 3.67. For each λ ∈ M0, the following conditions hold:

D^k(H̄) ≥ 0, k = 1, . . . , s. (3.295)

Thus, conditions (3.295) follow from the minimum principle conditions (3.293).
Let us formulate a quadratic necessary condition of a Pontryagin minimum for the
trajectory T. First, for this trajectory, we introduce a Hilbert space Z2(Θ) and the critical
cone K ⊂ Z2(Θ). We denote by P W^{1,2}(Δ, R^{d(x)}) the Hilbert space of piecewise contin-
uous functions x̄(·) : Δ → R^{d(x)}, absolutely continuous on each interval of the set Δ \ Θ

and such that their first derivative is square integrable. For each x̄ ∈ P W^{1,2}(Δ, R^{d(x)}) and
tk ∈ Θ, we set x̄^{k−} = x̄(tk−), x̄^{k+} = x̄(tk+), [x̄]^k = x̄^{k+} − x̄^{k−}. Further, we let
z̄ = (t̄0, t̄f, ξ̄, x̄, ū), where

t̄0 ∈ R¹, t̄f ∈ R¹, ξ̄ ∈ R^s, x̄ ∈ P W^{1,2}(Δ, R^{d(x)}), ū ∈ L2(Δ, R^{d(u)}).

Thus,
z̄ ∈ Z2(Θ) := R² × R^s × P W^{1,2}(Δ, R^{d(x)}) × L2(Δ, R^{d(u)}).
Moreover, for given z̄, we set

w̄ = (x̄, ū), x̄0 = x̄(t0), x̄f = x̄(tf), (3.296)

x̄̄0 = x̄(t0) + t̄0 ẋ(t0), x̄̄f = x̄(tf) + t̄f ẋ(tf), p̄̄ = (x̄̄0, t̄0, x̄̄f, t̄f). (3.297)

By IF(p) = {i ∈ {1, . . . , d(F)} | Fi(p) = 0} we denote the set of active indices of the con-
straints Fi(p) ≤ 0.
Let K be the set of all z̄ ∈ Z2(Θ) satisfying the following conditions:

J′(p)p̄̄ ≤ 0, F′i(p)p̄̄ ≤ 0 ∀ i ∈ IF(p), K′(p)p̄̄ = 0,
x̄̇(t) = fw(t, w(t))w̄(t) for a.a. t ∈ [t0, tf],
[x̄]^k = [ẋ]^k ξ̄k, k = 1, . . . , s, (3.298)
gw(t, w(t))w̄(t) = 0 for a.a. t ∈ [t0, tf],
ϕjw(t, w(t))w̄(t) ≤ 0 a.e. on M0(ϕj), j = 1, . . . , d(ϕ),

where M0(ϕj) = {t ∈ [t0, tf] | ϕj(t, w(t)) = 0}, p = (t0, x(t0), tf, x(tf)), w = (x, u). It is
obvious that K is a convex cone in the Hilbert space Z2(Θ), and we call it the critical cone.
If the interval Δ is fixed, then we set p := (x0, xf) = (x(t0), x(tf)), and in the definition of K
we have t̄0 = t̄f = 0, x̄̄0 = x̄0, x̄̄f = x̄f, and p̄̄ = p̄ := (x̄0, x̄f). Define the quadratic forms ωe,
ω, and Ω by formulas (3.156), (3.157), and (3.158), respectively. Now we formulate the
main necessary quadratic condition of a Pontryagin minimum in the problem on a variable
time interval.


Theorem 3.68. If the trajectory T yields a Pontryagin minimum, then the following Con-
dition A holds: The set M0 is nonempty and

max_{λ∈M0} Ω(λ, z̄) ≥ 0 ∀ z̄ ∈ K.


Sufficient conditions for a bounded strong minimum. The ith component xi of the
vector x is called unessential if the functions f, g, and ϕ do not depend on this component
and the functions J, F, and K are affine in xi0 = xi(t0), xif = xi(tf); otherwise the compo-
nent xi is called essential. We denote by x̲ the vector composed of all essential components
of the vector x, and we define the (strict) bounded strong minimum as in Definition 3.32. Let Γ
be an order function (see Definition 2.17). We formulate a sufficient optimality Condition
B(Γ), which is a natural strengthening of the necessary Condition A. Let us introduce the
functional

γ̄(z̄) = t̄0² + t̄f² + ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt, (3.299)


which is equivalent to the norm squared on the subspace

x̄̇ = fw(t, x(t), u(t))w̄, [x̄]^k = [ẋ]^k ξ̄k, k = 1, . . . , s, (3.300)

of the Hilbert space Z2(Θ). Recall that the critical cone K is contained in the subspace (3.300).
For any C > 0, we denote by M(CΓ) the set of all λ ∈ M0 such that condition (3.272) holds.


Theorem 3.69. For the trajectory T, assume that the following Condition B(Γ) holds:
There exists C > 0 such that the set M(CΓ) is nonempty and

max_{λ∈M(CΓ)} Ω(λ, z̄) ≥ C γ̄(z̄) (3.301)

for all z̄ ∈ K. Then the trajectory T affords a strict bounded strong minimum.

Again we can use the characterization of the condition λ ∈ M(CΓ) formulated in the
previous section.
Chapter 4

Jacobi-Type Conditions and Riccati Equation for Broken Extremals

Here we derive tests for the positive semidefiniteness, respectively, positive definiteness of
the quadratic form Ω on the critical cone K (introduced in Chapter 2 for extremals with
jumps of the control). In Section 4.1, we derive such tests for the simplest problem of
the calculus of variations and for an extremal with only one corner point. We arrive at a
generalization of the concept of conjugate point which allows us to formulate both necessary
and sufficient second-order optimality conditions for broken extremals. Three numerical
examples illustrate this generalization. Further, we concentrate on sufficient conditions for
positive definiteness of the quadratic form Ω in the auxiliary problem. We show that if there
exists a solution to the Riccati matrix equation satisfying a certain jump condition, then
the quadratic form Ω can be transformed into a perfect square. This makes it possible to
prove a sufficient condition for positive definiteness of the quadratic form in the auxiliary
problem and thus to obtain one more sufficient condition for optimality of broken extremals.
At the end of Section 4.1, we obtain such a condition for the simplest problem of the calculus
of variations, and then, in Section 4.2, we prove it for the general problem (without the
constraint g(t, x, u) = 0).

4.1 Jacobi-Type Conditions and Riccati Equation for Broken Extremals
in the Simplest Problem of the Calculus of Variations

4.1.1 An Auxiliary Problem for a Broken Extremal
Let a closed interval [t0, tf], two points b0, bf ∈ R^m, an open set Q ⊂ R^{2m+1}, and a function
F : Q → R of class C² be fixed. The simplest problem of the calculus of variations can be
formulated as follows:

(SP) Minimize J(x(·), u(·)) := ∫_{t0}^{tf} F(t, x(t), u(t)) dt (4.1)

under the constraints
ẋ(t) = u(t), x(t0) = b0, x(tf) = bf, (4.2)
(t, x(t), u(t)) ∈ Q. (4.3)


Here, x(·) : [t0, tf] → R^m is absolutely continuous, and u(·) : [t0, tf] → R^m is bounded and
measurable. Set w(·) = (x(·), u(·)). Then w(·) is an element of the space

W := W^{1,1}([t0, tf], R^m) × L^∞([t0, tf], R^m).

We say that w(·) = (x(·), u(·)) is an admissible pair if w(·) ∈ W and the constraints (4.2),
(4.3) hold for it. Let an admissible pair w(·) = (x(·), u(·)) be given. Assume that u(·) is a
piecewise continuous function with a unique point of discontinuity t∗ ∈ (t0, tf). Denote by
Θ the singleton {t∗}. For t∗, we set u⁻ = u(t∗−), u⁺ = u(t∗+), and [u] = u⁺ − u⁻. Thus,
[u] is the jump of the function u(·) at the point t∗.
In correspondence with (4.3), assume that (t, x(t), u(t)) ∈ Q for all t ∈ [t0, tf] \ Θ, and
(t∗, x(t∗), u⁻) ∈ Q, (t∗, x(t∗), u⁺) ∈ Q. Also assume that there exist a constant C > 0 and a
small number ε > 0 such that

|u(t) − u⁻| ≤ C|t − t∗| ∀ t ∈ (t∗ − ε, t∗) ∩ [t0, tf],
|u(t) − u⁺| ≤ C|t − t∗| ∀ t ∈ (t∗, t∗ + ε) ∩ [t0, tf].


The pair w(·) = (x(·), u(·)) is called an extremal if

ψ(t) := −Fu(t, x(t), u(t)) (4.4)

is a Lipschitz continuous function and a condition equivalent to the Euler equation is
satisfied:
−ψ̇(t) = Fx(t, x(t), u(t)) ∀ t ∈ [t0, tf] \ Θ. (4.5)

Here ψ ∈ (R^m)* is a row vector, while x, u ∈ R^m are column vectors. For the Pontryagin
function
H(t, x, u, ψ) = ψu + F(t, x, u), (4.6)

we set H⁻ = H(t∗, x(t∗), u⁻, ψ(t∗)), H⁺ = H(t∗, x(t∗), u⁺, ψ(t∗)), and [H] = H⁺ − H⁻.
The equalities
[H] = 0, [ψ] = 0 (4.7)

are the Weierstrass–Erdmann conditions. They are known as necessary conditions for the
strong minimum. However, they are also necessary for the Pontryagin minimum introduced
in Chapter 2. For convenience, let us recall the definitions of the Pontryagin and bounded
strong minima in the simplest problem.
We say that the pair of functions w = (x, u) is a point of Pontryagin minimum in
problem (4.1)–(4.3) if for each compact set C ⊂ Q there exists ε > 0 such that J(w̃) ≥ J(w)
for all admissible pairs w̃ = (x̃, ũ) such that
(a) max_{[t0,tf]} |x̃(t) − x(t)| < ε,
(b) ∫_{t0}^{tf} |ũ(t) − u(t)| dt < ε,
(c) (t, x̃(t), ũ(t)) ∈ C.


We say that the pair w = (x, u) is a point of bounded strong minimum in problem (4.1)–(4.3)
if for each compact set C ⊂ Q there exists ε > 0 such that J(w̃) ≥ J(w) for all admissible
pairs w̃ = (x̃, ũ) such that
(a) max_{[t0,tf]} |x̃(t) − x(t)| < ε,
(b) (t, x̃(t), ũ(t)) ∈ C.

Clearly, the following implications hold: strong minimum ⟹ bounded strong minimum ⟹
Pontryagin minimum ⟹ weak minimum.
As already mentioned, the Weierstrass–Erdmann conditions are necessary for the
Pontryagin minimum. As shown in Chapter 2, they can be supplemented by an additional
condition of the same type. We set

a = D(H) := ψ̇⁺ẋ⁻ − ψ̇⁻ẋ⁺ − [Ft], (4.8)

where ψ̇⁻ = ψ̇(t∗−), ψ̇⁺ = ψ̇(t∗+), ẋ⁻ = ẋ(t∗−), ẋ⁺ = ẋ(t∗+), and [Ft] = Ft(t∗, x(t∗), u⁺) −
Ft(t∗, x(t∗), u⁻). Then a ≥ 0 is a necessary condition for the Pontryagin minimum.
As we know, the value D(H) can be computed in a different way. Consider the
function
(ΔH)(t) := H(t, x(t), u⁺, ψ(t)) − H(t, x(t), u⁻, ψ(t))
         = ψ(t)[ẋ] + ( F(t, x(t), u⁺) − F(t, x(t), u⁻) ). (4.9)

Using (4.5), we obtain

d/dt (ΔH)|_{t∗+0} = ψ̇⁺[ẋ] − [ψ̇]ẋ⁺ + [Ft], (4.10)
d/dt (ΔH)|_{t∗−0} = ψ̇⁻[ẋ] − [ψ̇]ẋ⁻ + [Ft]. (4.11)

Since ψ̇⁺[ẋ] − [ψ̇]ẋ⁺ = ψ̇⁻ẋ⁺ − ψ̇⁺ẋ⁻ = ψ̇⁻[ẋ] − [ψ̇]ẋ⁻, we obtain from this that

d/dt (ΔH)|_{t∗−0} = d/dt (ΔH)|_{t∗+0} = −D(H). (4.12)


Note that the inequality a ≥ 0 and the Weierstrass–Erdmann conditions are implied
by the conditions of the minimum principle, which is equivalent to the Weierstrass con-
dition in this problem. Here, the minimum principle has the form: H(t, x(t), u, ψ(t)) ≥
H(t, x(t), u(t), ψ(t)) if t ∈ [t0, tf] \ Θ, u ∈ R^m, (t, x(t), u) ∈ Q. Let us also formulate the
strict minimum principle:
(a) H(t, x(t), u, ψ(t)) > H(t, x(t), u(t), ψ(t))
for all t ∈ [t0, tf] \ Θ, u ∈ R^m, (t, x(t), u) ∈ Q, u ≠ u(t);
(b) H(t∗, x(t∗), u, ψ(t∗)) > H(t∗, x(t∗), u⁻, ψ(t∗)) = H(t∗, x(t∗), u⁺, ψ(t∗))
for all u ∈ R^m, (t∗, x(t∗), u) ∈ Q, u ≠ u⁻, u ≠ u⁺.
Now we define a quadratic form that corresponds to an extremal w(·) with a cor-
ner point. As in Section 2.1.5, denote by P W^{1,2}([t0, tf], R^m) the space of all piecewise
continuous functions x̄(·) : [t0, tf] → R^m that are absolutely continuous on each of the
intervals in [t0, tf] \ Θ and whose derivatives are square Lebesgue integrable. For t∗, we set

x̄⁻ = x̄(t∗−), x̄⁺ = x̄(t∗+), [x̄] = x̄⁺ − x̄⁻. Recall that the space P W^{1,2}([t0, tf], R^m)
with the inner product (x̄, ȳ) = ⟨x̄(t0), ȳ(t0)⟩ + ⟨[x̄], [ȳ]⟩ + ∫_{t0}^{tf} ⟨x̄̇(t), ȳ̇(t)⟩ dt is a Hilbert space.
We set

Z2(Θ) = R¹ × P W^{1,2}([t0, tf], R^m) × L2([t0, tf], R^m)

and denote by z̄ = (ξ̄, x̄, ū) an element of the space Z2(Θ), where

ξ̄ ∈ R¹, x̄(·) ∈ P W^{1,2}([t0, tf], R^m), ū(·) ∈ L2([t0, tf], R^m).


In the Hilbert space Z2(Θ), we define a subspace and a quadratic form by setting

K = { z̄ ∈ Z2(Θ) | x̄̇(t) = ū(t), x̄(t0) = x̄(tf) = 0, [x̄] = [ẋ]ξ̄ } (4.13)

and
Ω(z̄) = ½ ( a ξ̄² − 2[ψ̇]x̄av ξ̄ ) + ½ ∫_{t0}^{tf} ⟨Fww w̄(t), w̄(t)⟩ dt, (4.14)

respectively, where

x̄av = ½ (x̄⁻ + x̄⁺), Fww = Fww(t, x(t), u(t)), w = (x, u),

Fww = ( Fxx  Fxu
        Fux  Fuu ),   w̄(·) = (x̄(·), ū(·)).


The condition of positive semidefiniteness of Ω on K implies the following Legendre
condition:

(L) for any v ∈ R^m,

⟨Fuu(t, x(t), u(t))v, v⟩ ≥ 0 ∀ t ∈ [t0, tf] \ Θ, (4.15)
⟨F⁻uu v, v⟩ ≥ 0, ⟨F⁺uu v, v⟩ ≥ 0, (4.16)

which is a necessary condition for a weak minimum. Here F⁻uu = Fuu(t∗, x(t∗), u⁻),
F⁺uu = Fuu(t∗, x(t∗), u⁺). The condition of positive definiteness of Ω on K implies the
strengthened Legendre condition:

(SL) for any v ∈ R^m \ {0},

⟨Fuu(t, x(t), u(t))v, v⟩ > 0 ∀ t ∈ [t0, tf] \ Θ, (4.17)
⟨F⁻uu v, v⟩ > 0, ⟨F⁺uu v, v⟩ > 0. (4.18)


The auxiliary minimization problem for an extremal w(·) with a corner point is for-
mulated as follows:

(AP) minimize Ω(z̄) under the constraint z̄ ∈ K.

This setting is motivated by the following two theorems.

Theorem 4.1. If w(·) is a Pontryagin minimum, then the following conditions hold:
(a) the Euler equation,
(b) the minimum principle (the Weierstrass condition),

(c) the Legendre condition (L),
(d) a ≥ 0,
(e) Ω(z̄) ≥ 0 for all z̄ ∈ K.

Theorem 4.2. If the following conditions hold, then w(·) is a point of a strict bounded
strong minimum:
(a) the Euler equation,
(b) the strict minimum principle (the strict Weierstrass condition),
(c) the strengthened Legendre condition (SL),
(d) a > 0,
(e) there exists ε > 0 such that Ω(z̄) ≥ ε( ξ̄² + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt ) for all z̄ ∈ K.

Theorems 4.1 and 4.2 follow from Theorems 2.4 and 2.102, respectively. Note that
the functional

γ̄(z̄) = ξ̄² + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt (4.19)

on the subspace K is equivalent to the squared norm: γ̄ ∼ (z̄, z̄).
Now our goal is to derive tests for positive semidefiniteness and positive definite-
ness of the quadratic form Ω on the subspace K in the case of an extremal with a single
corner point. We give such tests in the form of Jacobi-type conditions and in terms of
solutions to the Riccati equations.

4.1.2 Jacobi-Type Conditions and the Riccati Equation

In this section, we assume that, for an extremal w(·) = (x(·), u(·)) with a single corner point
t∗ ∈ (t0, tf), condition (SL) holds and that a > 0, where a is defined by (4.8). It follows
from condition (SL) that at each point t ∈ [t0, tf] \ Θ the matrix Fuu = Fuu(t, w(t)) has an
inverse matrix Fuu⁻¹. We set

A = −Fuu⁻¹ Fux,  B = −Fuu⁻¹,
C = −Fxu Fuu⁻¹ Fux + Fxx,  A* = −Fxu Fuu⁻¹. (4.20)

All derivatives are computed along the trajectory (t, w(t)). Note that

B* = B, C* = C, |det B| ≥ const > 0 on [t0, tf], (4.21)

and A, B, and C are matrices with piecewise continuous entries on [t0, tf] that are continu-
ous on each of the intervals of the set (t0, tf) \ Θ.
Further, we formulate Jacobi-type conditions for an extremal w(·) with a single corner
point t∗. Denote by X(t) and Ψ(t) two square matrices of order m, where t ∈ [t0, tf]. For
X(t) and Ψ(t), we consider the system of differential equations

Ẋ = AX + BΨ,
−Ψ̇ = CX + A*Ψ (4.22)

with the initial conditions
X(t0) = O, Ψ(t0) = −I, (4.23)

where O and I are the zero matrix and the identity matrix, respectively.


Recall that a continuous (and hence piecewise smooth) solution X(t), Ψ(t) to the
Cauchy problem (4.22), (4.23) allows one to formulate the classical concept of the conju-
gate point. Namely, a point τ ∈ (t0, tf] is called conjugate (to the point t0) if det X(τ) = 0.
The absence of a conjugate point in (t0, tf) is equivalent to the positive semidefiniteness of
the quadratic form

ω = ∫_{t0}^{tf} ⟨Fww w̄, w̄⟩ dt

on the subspace K0 consisting of pairs w̄ = (x̄, ū) such that

x̄̇ = ū, x̄(t0) = x̄(tf) = 0, x̄ ∈ W^{1,2}([t0, tf], R^m), ū ∈ L2([t0, tf], R^m).

The latter condition is necessary for the weak minimum. The absence of a conjugate point in
(t0, tf] is equivalent to the condition of positive definiteness of ω on K0, which is a sufficient
condition for the strict weak minimum. This is the classical Jacobi condition.
We note that Ω and K pass into ω and K0, respectively, if we set ξ̄ = 0 in the
definition of the first pair. In our tests of positive semidefiniteness and positive definiteness
of the quadratic form Ω on the subspace K, we use a discontinuous solution X, Ψ to the
Cauchy problem (4.22), (4.23) with certain jump conditions at the point t∗. Namely, let a
pair X(t), Ψ(t) be a continuous solution to the problem (4.22), (4.23) on the half-interval
[t0, t∗). We set X⁻ = X(t∗−) and Ψ⁻ = Ψ(t∗−). The jumps [X] and [Ψ] of the matrix-
valued functions X(t) and Ψ(t) at the point t∗ are uniquely defined by the relations

a[X] = [ẋ] ( −[ẋ]*Ψ⁻ + [ψ̇]X⁻ ), (4.24)
a[Ψ] = [ψ̇]* ( −[ẋ]*Ψ⁻ + [ψ̇]X⁻ ), (4.25)

where [ẋ] and [ψ̇]* are column matrices, while [ẋ]* and [ψ̇] are row matrices. Let us define
the right limits X⁺ and Ψ⁺ of the functions X and Ψ at t∗ in the following way:

X⁺ = X⁻ + [X], Ψ⁺ = Ψ⁻ + [Ψ].

Then we continue the process of solution of system (4.22) on (t∗, tf] by using the initial
conditions for X and Ψ at t∗ given by

X(t∗+) = X⁺, Ψ(t∗+) = Ψ⁺.

Thus, on [t0, tf], we obtain a piecewise continuous solution X(t), Ψ(t) to system (4.22) with
the initial conditions (4.23) at t0 and the jump conditions (4.24) and (4.25) at t∗. On each
of the intervals of the set [t0, tf] \ Θ, the matrix-valued functions X(t) and Ψ(t) are smooth.
Briefly, this pair of functions will be called a solution to the problem (4.22)–(4.25) on [t0, tf].


Theorem 4.3. The form Ω is positive semidefinite on K iff the solution X(t), Ψ(t) to the
problem (4.22)–(4.25) on [t0, tf] satisfies the conditions

det X(t) ≠ 0 ∀ t ∈ (t0, tf) \ Θ, (4.26)
det X⁻ ≠ 0, det X⁺ ≠ 0, (4.27)
a − ( [ẋ]*Q⁻ − [ψ̇] ) [ẋ] > 0, (4.28)

where Q(t) = Ψ(t)X⁻¹(t), Q⁻ = Q(t∗−), and X⁻¹(t) is the inverse matrix of X(t).

Theorem 4.4. The form Ω is positive definite on K iff the solution X(t), Ψ(t) to the
problem (4.22)–(4.25) on [t0, tf] satisfies conditions (4.26)–(4.28), together with the addi-
tional condition
det X(tf) ≠ 0. (4.29)
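
As a numerical illustration (not from the book), the test of Theorems 4.3 and 4.4 can be organized as in the following sketch: integrate (4.22) with (4.23) up to t∗, apply the jumps (4.24)–(4.25), continue to tf, and inspect det X(t) together with the scalar in (4.28). All names are hypothetical; A, B, C are assumed to be supplied by (4.20).

# A minimal sketch of the conjugate-point test of Theorems 4.3-4.4:
# integrate (4.22)-(4.23), jump by (4.24)-(4.25) at t_star, watch det X(t).
import numpy as np
from scipy.integrate import solve_ivp

def conjugate_point_data(A, B, C, t0, tf, t_star, a, dx, dpsi, m, N=400):
    """A, B, C: callables t -> (m, m) matrices from (4.20);
    dx = [xdot] (column), dpsi = [psidot] (row), both passed as 1-D arrays."""
    def rhs(t, y):
        X, Psi = y[:m*m].reshape(m, m), y[m*m:].reshape(m, m)
        return np.concatenate([(A(t) @ X + B(t) @ Psi).ravel(),
                               (-C(t) @ X - A(t).T @ Psi).ravel()])

    y0 = np.concatenate([np.zeros(m*m), -np.eye(m).ravel()])      # (4.23)
    left = solve_ivp(rhs, (t0, t_star), y0,
                     t_eval=np.linspace(t0, t_star, N), rtol=1e-10, atol=1e-12)
    Xm = left.y[:m*m, -1].reshape(m, m)                           # X^-
    Pm = left.y[m*m:, -1].reshape(m, m)                           # Psi^-
    row = -dx @ Pm + dpsi @ Xm          # common row factor in (4.24)-(4.25)
    Xp = Xm + np.outer(dx, row) / a                               # (4.24)
    Pp = Pm + np.outer(dpsi, row) / a                             # (4.25)
    right = solve_ivp(rhs, (t_star, tf),
                      np.concatenate([Xp.ravel(), Pp.ravel()]),
                      t_eval=np.linspace(t_star, tf, N), rtol=1e-10, atol=1e-12)
    dets = [np.linalg.det(Y.reshape(m, m))                        # for (4.26), (4.29)
            for Y in np.concatenate([left.y[:m*m].T, right.y[:m*m].T])]
    q_minus = dx @ (Pm @ np.linalg.inv(Xm)) - dpsi                # (4.38)
    return np.array(dets), a - q_minus @ dx                       # sign data, (4.28)

A zero of the returned determinant samples inside (t0, tf) violates (4.26), a zero at tf violates (4.29), and a nonpositive second return value violates (4.28).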


The conditions for positive semidefiniteness, and those for positive definiteness, of
Ω on K given by these two theorems can easily be reformulated in terms of a
solution to the corresponding matrix Riccati equation. Indeed, if X(t), Ψ(t) is a solution
to system (4.22) on a certain interval Δ ⊂ [t0, tf] with det X(t) ≠ 0 on Δ, then, as is well
known, the matrix-valued function Q(t) = Ψ(t)X⁻¹(t) satisfies the Riccati equation

Q̇ + QA + A*Q + QBQ + C = 0 (4.30)

on Δ. Let us prove this assertion. Differentiating the equality Ψ = QX and using (4.22),
we obtain
−CX − A*Ψ = Ψ̇ = Q̇X + QẊ = Q̇X + Q(AX + BΨ).
Consequently,
CX + A*Ψ + Q̇X + QAX + QBΨ = 0.
Multiplying this equation by X⁻¹ from the right, we obtain (4.30). Using (4.20), we can
also represent (4.30) as

Q̇ − (Q + Fxu) Fuu⁻¹ (Q + Fux) + Fxx = 0. (4.31)


The solution Q = ΨX⁻¹ has a singularity at the point t0, since X(t0) = O. The
question is: How do we correctly assign the initial condition for Q? We can do this in
the following way. In a small half-neighborhood [t0, t0 + ε), ε > 0, we find a solution to the
Riccati equation for R = Q⁻¹ = XΨ⁻¹ with the initial condition

R(t0) = O, (4.32)

which is implied by (4.23). This Riccati equation for R can easily be obtained. Namely,
differentiating the equality
X = RΨ (4.33)
and using (4.22), we obtain, for small ε > 0,

Ṙ = AR + RA* + RCR + B, t ∈ [t0, t0 + ε]. (4.34)

Using (4.20), we can transform this Riccati equation into the form

Ṙ + (RFxu + I) Fuu⁻¹ (Fux R + I) − RFxx R = 0, t ∈ [t0, t0 + ε]. (4.35)

Thus, we solve the Riccati equation (4.34) or (4.35) with the initial condition (4.32) in a
certain half-neighborhood [t0, t0 + ε) of t0. Recall that the matrices B and C are symmetric
on [t0, tf]. Consequently, R is also symmetric, and therefore

Q(t0 + ε) = R⁻¹(t0 + ε) (4.36)

is symmetric.


Let Q(t) be the continuous solution of the Riccati equation (4.30) with the initial condi-
tion (4.36) on the interval (t0, t∗). The existence of such a solution is a necessary condition
for positive semidefiniteness of Ω on K. Since B(t) and C(t) are symmetric on [t0, tf]
and Q(t0 + ε) is also symmetric, we have that Q(t) is symmetric on (t0, t∗). Consequently,
Q⁻ = Q(t∗−) is symmetric. Further, we define a jump condition for Q at t∗ that corre-
sponds to the jump conditions (4.24) and (4.25). This condition has the form

( a − (q⁻)[ẋ] ) [Q] = (q⁻)*(q⁻), (4.37)

where
q⁻ = [ẋ]*Q⁻ − [ψ̇]. (4.38)

Note that q⁻ and [ẋ]* are row vectors, while (q⁻)* and [ẋ] are column vectors. The jump
[Q] of the matrix Q at the point t∗ is uniquely defined by (4.37) since, according to
Theorem 4.3, the condition
a − (q⁻)[ẋ] > 0
is necessary for positive semidefiniteness of Ω on K. Note that (q⁻)*(q⁻) is a symmetric
positive semidefinite matrix. Hence, [Q] is a symmetric positive semidefinite matrix.
The right limit Q⁺ = Q(t∗+) is defined by the relation

Q⁺ = Q⁻ + [Q]. (4.39)

It follows that Q⁺ is symmetric. Using Q⁺ as the initial condition for Q at the point t∗,
we continue the solution of the Riccati equation (4.30) for t > t∗. The matrix Q(t) is also
symmetric for t > t∗. Assume that this symmetric solution Q(t) is extended to a certain
interval (t0, τ) or half-interval (t0, τ], where t∗ < τ ≤ tf. It will be called a solution to
problem (4.30), (4.36), (4.37) on (t0, τ) or on (t0, τ], respectively.


Theorem 4.5. The form Ω is positive semidefinite on K iff there exists a solution Q to the
problem (4.30), (4.36), (4.37) on (t0, tf) that satisfies

a − (q⁻)[ẋ] > 0, (4.40)

where q⁻ is defined by (4.38).

Theorem 4.6. The form Ω is positive definite on K iff there exists a solution Q to the
problem (4.30), (4.36), (4.37) on (t0, tf] that satisfies inequality (4.40).

We note that inequality (4.40) is equivalent to condition (4.28). Moreover, set
b⁻ = a − (q⁻)[ẋ]. Then the inequality (4.40) and the jump condition (4.37) take the form

b⁻ > 0, b⁻[Q] = (q⁻)*(q⁻),

respectively.
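
The Riccati form of the test admits an analogous numerical sketch (hypothetical names, assumed data, not from the book): one starts from (4.34) with (4.32) on a short initial interval, passes to Q by (4.36), integrates (4.30) up to t∗, applies the jump (4.37), (4.39), and continues; b⁻ ≤ 0 or a finite escape of Q before tf then signals failure of the sufficient condition of Theorem 4.6.

# A minimal sketch of the Riccati test (Theorems 4.5-4.6): R-equation (4.34)
# near t0, inversion (4.36), Q-equation (4.30), jump (4.37)/(4.39) at t_star.
import numpy as np
from scipy.integrate import solve_ivp

def riccati_test(A, B, C, t0, tf, t_star, a, dx, dpsi, m, eps=1e-4):
    mat = lambda y: y.reshape(m, m)
    R_rhs = lambda t, y: (A(t) @ mat(y) + mat(y) @ A(t).T
                          + mat(y) @ C(t) @ mat(y) + B(t)).ravel()   # (4.34)
    Q_rhs = lambda t, y: -(mat(y) @ A(t) + A(t).T @ mat(y)
                           + mat(y) @ B(t) @ mat(y) + C(t)).ravel()  # (4.30)

    r = solve_ivp(R_rhs, (t0, t0 + eps), np.zeros(m*m),              # (4.32)
                  rtol=1e-10, atol=1e-12)
    Q0 = np.linalg.inv(mat(r.y[:, -1]))                              # (4.36)
    q1 = solve_ivp(Q_rhs, (t0 + eps, t_star), Q0.ravel(),
                   rtol=1e-10, atol=1e-12)
    Qm = mat(q1.y[:, -1])                                            # Q^-
    q_minus = dx @ Qm - dpsi                                         # (4.38)
    b_minus = a - q_minus @ dx
    if b_minus <= 0.0:
        return False                                                 # (4.40) fails
    Qp = Qm + np.outer(q_minus, q_minus) / b_minus                   # (4.37), (4.39)
    q2 = solve_ivp(Q_rhs, (t_star, tf), Qp.ravel(), rtol=1e-10, atol=1e-12)
    # Boundedness of Q up to tf, together with b_minus > 0, is the
    # sufficient condition of Theorem 4.6 (finite escape <=> conjugate point).
    return bool(q2.success) and np.isfinite(q2.y[:, -1]).all()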

4.1.3 Passage of the Quadratic Form through Zero


Now our goal consists of obtaining the tests for positive semidefiniteness and positive
definiteness of Ω on K which were stated in Section 4.1.2. In the present section we shall

use some ideas from [25, 26]. Everywhere below, we assume that condition (SL) holds. It
follows from condition (SL) that Ω is a Legendre form on K (cf., e.g., [42]) in the abstract
sense; i.e., it is a weakly lower semicontinuous functional on K, and the conditions

z̄n ∈ K ∀ n, z̄ ∈ K, (4.41)
ξ̄n → ξ̄, ūn → ū weakly in L2, (4.42)
Ω(z̄n) → Ω(z̄) (4.43)

imply
‖ūn − ū‖2 → 0, (4.44)

and hence z̄n → z̄ strongly in Z2(Θ). Here ‖v‖2 = ( ∫_{t0}^{tf} ⟨v(t), v(t)⟩ dt )^{1/2} is the norm in L2.

Further, consider a monotonically increasing one-parameter family of subspaces K(τ) in
K, each of which is defined by the relation

K(τ) = { z̄ = (ξ̄, x̄, ū) ∈ K | ū(t) = 0 on [τ, tf] }, (4.45)

where τ ∈ [t0, tf]. It is clear that

K(t0) = {0}, K(tf) = K. (4.46)
We now study the problem of how the property of positive definiteness of Ω on K(τ)
depends on the change of the parameter τ. This will allow us to obtain Jacobi-type conditions
for Ω on K.
The form Ω is positive on K(τ) if Ω(z̄) > 0 for all z̄ ∈ K(τ) \ {0}. As is well known,
for Legendre forms, the positivity of Ω on K(τ) is equivalent to its positive definiteness
on K(τ), i.e., to the following property: there exists ε > 0 such that Ω(z̄) ≥ ε γ̄(z̄) for all
z̄ ∈ K(τ), where γ̄ is defined by relation (4.19).
Obviously, for τ = t0 the form Ω is positive definite on K(t0) = {0}. Below, we
will prove that, due to condition (SL), Ω is also positive definite on K(τ) for all sufficiently
small τ − t0 > 0. Let us increase the value of τ. For τ = tf, we have three possibilities:
Case (1). Ω is positive definite on K.
Case (2). Ω is positive semidefinite on K, but is not positive definite on K.
Case (3). Ω is not positive semidefinite on K.
In Cases (2) and (3) we define

τ0 := sup{ τ ∈ [t0, tf] | Ω is positive definite on K(τ) }. (4.47)

We will show that Ω(·) ≥ 0 on K(τ0), but Ω is not positive on K(τ0). Consequently,
there exists z̄ ∈ K(τ0) \ {0} such that Ω(z̄) = 0. This fact was called in [25] “the passage
of the quadratic form through zero.” This property of the form Ω follows from condition
(SL) and plays a crucial role in our study of the problem of the definiteness of Ω on K.
(Note that another possibility is that Ω is still positive on K(τ0). In this case, Ω “does not
pass through zero.” Such examples are presented in [25] for a quadratic form that does
not satisfy the strengthened Legendre condition.) Now, our goal consists of proving the
following theorem.

Theorem 4.7. If Ω is not positive definite on K, then there exists τ0 ∈ (t0, tf] such that
(a) Ω(·) is positive definite on K(τ) for all τ ∈ (t0, τ0), and
(b) Ω(·) ≥ 0 on K(τ0) and there exists z̄ ∈ K(τ0) \ {0} such that Ω(z̄) = 0.

Using this theorem, we can define τ0 as the minimal value among all τ ∈ (t0, tf]
such that the quadratic form Ω has a nontrivial zero z̄ on the subspace K(τ). To prove this
theorem, we need two auxiliary assertions.


Proposition 4.8. There exists τ ∈ (t0, tf] such that Ω is positive definite on K(τ).

Proof. Let c_L > 0 be such that, for each v ∈ R^m,

⟨Fuu(t, x(t), u(t))v, v⟩ ≥ c_L |v|² ∀ t ∈ [t0, tf] \ Θ. (4.48)

Choose a certain τ ∈ (t0, t∗) and let z̄ = (ξ̄, x̄, ū) ∈ K(τ) \ {0} be an arbitrary element. Then
ū ≠ 0 and
ū χ_[t0,τ] = ū, x̄̇ = ū, x̄(t0) = 0,
x̄ χ_[t0,t∗] = x̄, x̄(τ) = x̄⁻, x̄⁻ + [ẋ]ξ̄ = 0,
where χ_M is the characteristic function of a set M. Consequently,

‖x̄‖∞ ≤ ‖ū‖1 ≤ √(τ − t0) ‖ū‖2, |ξ̄| = |x̄⁻| / |[ẋ]| ≤ ( √(τ − t0) / |[ẋ]| ) ‖ū‖2,
|2x̄av| = |x̄⁻| ≤ √(τ − t0) ‖ū‖2.

Therefore,

| ∫_{t0}^{tf} ⟨Fxx x̄, x̄⟩ dt | ≤ ‖Fxx‖∞ ‖x̄‖∞² (t∗ − t0) ≤ ‖Fxx‖∞ (t∗ − t0)(τ − t0) ‖ū‖2²,
| ∫_{t0}^{tf} 2⟨Fxu ū, x̄⟩ dt | ≤ 2 ‖Fxu‖∞ ‖x̄‖∞ ‖ū‖1 ≤ 2 ‖Fxu‖∞ (τ − t0) ‖ū‖2²,
|a ξ̄²| ≤ ( |a| / |[ẋ]|² ) (τ − t0) ‖ū‖2², |2[Fx] x̄av ξ̄| ≤ ( |[Fx]| / |[ẋ]| ) (τ − t0) ‖ū‖2².

Consequently,

Ω(z̄) ≥ ∫_{t0}^{tf} ⟨Fuu ū, ū⟩ dt − M(τ − t0) ‖ū‖2² ≥ c_L ‖ū‖2² − M(τ − t0) ‖ū‖2², (4.49)

where
M = ‖Fxx‖∞ (t∗ − t0) + 2 ‖Fxu‖∞ + |a| / |[ẋ]|² + |[Fx]| / |[ẋ]|.

Let τ be such that t0 < τ < t0 + c_L / M. Then (4.49) implies that Ω is positive definite on
K(τ).


Proposition 4.9. Assume that Ω is not positive definite on K. Then Ω is positive semi-
definite on K(τ0), where τ0 is defined by (4.47).

We omit the simple proof of this proposition. Now we are ready to prove the theorem.


Proof of Theorem 4.7. We have already proved that Ω is positive definite on K(τ) for all
τ > t0 sufficiently close to t0 (Proposition 4.8). Consequently, τ0 > t0. Further, we consider

only the nontrivial case where τ0 < tf. We know that Ω(·) ≥ 0 on K(τ0) (Proposition 4.9).
We have to show that Ω is not positive on K(τ0), i.e., there exists z̄ ∈ K(τ0) \ {0} such that
Ω(z̄) = 0 (the passage through zero). Now we follow [26]. For any τ > τ0 (τ ≤ tf), Ω is
not positive on K(τ). Therefore, for each

τn = τ0 + 1/n < tf (4.50)

there exists
z̄n ∈ K(τn) (4.51)
such that

Ω(z̄n) ≤ 0, (4.52)
γ̄(z̄n) = 1. (4.53)

The sequence {z̄n} is bounded in K. Therefore, without loss of generality, we assume that

z̄n → z̄0 weakly. (4.54)

Since each subspace K(τn) ⊂ K is weakly closed, we have

z̄0 ∈ K(τn) ∀ n (4.55)

and, therefore,
z̄0 ∈ ∩_n K(τn) = K(τ0). (4.56)

By Proposition 4.9, it follows from (4.56) that

Ω(z̄0) ≥ 0. (4.57)

On the other hand, Ω is weakly lower semicontinuous on K. Thus, (4.54) implies

lim inf Ω(z̄n) ≥ Ω(z̄0). (4.58)

We obtain from (4.52), (4.57), and (4.58) that

Ω(z̄n) → Ω(z̄0) = 0, (4.59)

and then
z̄n → z̄0 strongly (4.60)
since Ω is a Legendre form. Conditions (4.53) and (4.60) imply

γ̄(z̄0) = 1. (4.61)

Consequently, z̄0 ≠ 0 and Ω(z̄0) = 0; i.e., Ω is not positive on K(τ0).


For τ ∈ [t0, tf], let us consider the problem

(APτ) minimize Ω(z̄) under the constraint z̄ ∈ K(τ).

Assume that a nonzero element

z̄ = (ξ̄, x̄, ū) = (ξ̄, w̄) ∈ K(τ) (4.62)

yields the minimum in this problem. Then the following first-order necessary optimality
condition holds:
(Ω′(z̄), z̃) = 0 ∀ z̃ ∈ K(τ), (4.63)

where z̃ = (ξ̃, x̃, ũ) = (ξ̃, w̃), Ω′(z̄) is the Fréchet derivative of the functional Ω at the point
z̄, and (·, ·) is the inner product in Z2(Θ); in more detail,

(Ω′(z̄), z̃) = a ξ̄ ξ̃ − [ψ̇]x̄av ξ̃ − [ψ̇]x̃av ξ̄ + ∫_{t0}^{tf} ⟨Fww w̄, w̃⟩ dt. (4.64)

Thus, Theorem 4.7 implies the following.

Corollary 4.10. Assume that Ω is not positive definite on K. Then, for τ = τ0 ∈ (t0, tf]
(given by (4.47)), there exists a nonzero element z̄ that satisfies (4.62) and (4.63).

On the other hand, we obviously have

½ (Ω′(z̄), z̄) = Ω(z̄). (4.65)

This implies the following.

Proposition 4.11. If, for a certain τ ∈ [t0, tf], there exists a nonzero element z̄ satisfying
(4.62) and (4.63), then Ω(z̄) = 0, and hence Ω is not positive definite on K(τ).

Corollary 4.10 and Proposition 4.11 imply the following.

Theorem 4.12. Assume that Ω is not positive definite on K. Then τ0, given by (4.47), is
minimal among all τ ∈ (t0, tf] such that there exists a nonzero element z̄ satisfying condi-
tions (4.62) and (4.63).

4.1.4 Ω-Conjugate Point

In this section, we obtain a dual test for condition (4.63), and then we use it to obtain an
analogue of a conjugate point for a broken extremal. The most important role is played by
the following lemma.

Lemma 4.13. Let τ ∈ (t0, tf]. A triple z̄ = (ξ̄, x̄, ū) satisfies conditions (4.62) and (4.63)
iff there exists a function ψ̄(·) : [t0, tf] → (R^m)* such that, for the tuple (ξ̄, x̄, ū, ψ̄),

the following conditions hold:

ξ̄ ∈ R¹, x̄ ∈ P W^{1,2}, ū ∈ L2, ψ̄ ∈ P W^{1,2}, (4.66)
x̄(t0) = x̄(tf) = 0, (4.67)
x̄̇ = ū, ū χ_[t0,τ] = ū, (4.68)
−ψ̄ = x̄*Fxu + ū*Fuu a.e. on [t0, τ], (4.69)
−ψ̄̇ = x̄*Fxx + ū*Fux a.e. on [t0, tf], (4.70)
[x̄] = [ẋ]ξ̄, (4.71)
[ψ̄] = [ψ̇]ξ̄, (4.72)
a ξ̄ = −ψ̄av[ẋ] + [ψ̇]x̄av. (4.73)


Proof. Let z̄ = (ξ̄, x̄, ū) satisfy conditions (4.62) and (4.63). Consider condition (4.63).
Define the following subspace:
L̃2([τ, tf], R^m) := { ṽ ∈ L2([t0, tf], R^m) | ṽ = 0 a.e. on [t0, τ] }.

The operator z̃ ↦ (x̃̇ − ũ, ũ χ_[τ,tf]) maps the space Z2(Θ) onto the space L2([t0, tf], R^m) ×
L̃2([τ, tf], R^m). The operator z̃ ↦ ([x̃] − [ẋ]ξ̃, x̃(t0), x̃(tf)) is finite dimensional. Conse-
quently, the image of the operator

z̃ ↦ ( [x̃] − [ẋ]ξ̃, x̃̇ − ũ, ũ χ_[τ,tf], x̃(t0), x̃(tf) ),

which maps from Z2(Θ) into

R^m × L2([t0, tf], R^m) × L̃2([τ, tf], R^m) × R^m × R^m,

is closed. The kernel of this operator is equal to K(τ). Consequently, an arbitrary linear
functional z* that vanishes on the kernel of this operator admits the following representation:

(z*, z̃) = ζ̄ ([x̃] − [ẋ]ξ̃) − ∫_{t0}^{tf} ψ̄ (x̃̇ − ũ) dt + ∫_{t0}^{tf} ν̄ ũ dt + c̄0 x̃(t0) + c̄f x̃(tf), (4.74)

where
ζ̄ ∈ (R^m)*, ψ̄ ∈ L2([t0, tf], (R^m)*), ν̄ ∈ L̃2([τ, tf], (R^m)*), c̄0, c̄f ∈ (R^m)*. (4.75)

Consequently, condition (4.63) is equivalent to the existence of ζ̄, ψ̄, ν̄, c̄0, and c̄f that
satisfy (4.75) and are such that, for z* defined by formula (4.74), we have

(Ω′(z̄), z̃) + (z*, z̃) = 0 ∀ z̃ ∈ Z2(Θ).

The exact representation of the latter condition has the form

a ξ̄ ξ̃ − [ψ̇]x̄av ξ̃ − [ψ̇]x̃av ξ̄
+ ∫_{t0}^{tf} ( ⟨Fxx x̄, x̃⟩ + ⟨Fxu ū, x̃⟩ + ⟨Fux x̄, ũ⟩ + ⟨Fuu ū, ũ⟩ ) dt
+ ζ̄ ([x̃] − [ẋ]ξ̃) − ∫_{t0}^{tf} ψ̄ (x̃̇ − ũ) dt + ∫_{t0}^{tf} ν̄ ũ dt + c̄0 x̃(t0) + c̄f x̃(tf)
= 0 ∀ z̃ = (ξ̃, x̃, ũ) ∈ Z2(Θ). (4.76)


Let us examine this condition.
(a) We set ξ̃ = 0 and x̃ = 0 in (4.76). Then

∫_{t0}^{tf} ( ⟨Fux x̄, ũ⟩ + ⟨Fuu ū, ũ⟩ ) dt + ∫_{t0}^{tf} ψ̄ ũ dt + ∫_{t0}^{tf} ν̄ ũ dt = 0 ∀ ũ ∈ L2. (4.77)

Consequently,
x̄*Fxu + ū*Fuu + ψ̄ = −ν̄. (4.78)

Since ν̄ = 0 a.e. on [t0, τ], the latter equation is equivalent to condition (4.69).
(b) We set ξ̃ = 0 and ũ = 0 in (4.76). Then

−[ψ̇]x̃av ξ̄ + ∫_{t0}^{tf} ( ⟨Fxx x̄, x̃⟩ + ⟨Fxu ū, x̃⟩ ) dt + ζ̄ [x̃] − ∫_{t0}^{tf} ψ̄ x̃̇ dt
+ c̄0 x̃(t0) + c̄f x̃(tf) = 0 ∀ x̃ ∈ P W^{1,2}. (4.79)

Using (4.79), it is easy to show that ψ̄ ∈ P W^{1,2}([t0, tf], (R^m)*). Consequently,

∫_{t0}^{tf} ψ̄ x̃̇ dt = ψ̄ x̃ |_{t0}^{tf} − [ψ̄ x̃] − ∫_{t0}^{tf} ψ̄̇ x̃ dt. (4.80)

Using (4.80) and the definitions x̃av = ½(x̃⁻ + x̃⁺), [x̃] = x̃⁺ − x̃⁻ in (4.79), we obtain

−½ [ψ̇](x̃⁻ + x̃⁺) ξ̄ + ∫_{t0}^{tf} ⟨Fxx x̄ + Fxu ū, x̃⟩ dt
+ ζ̄ (x̃⁺ − x̃⁻) − ψ̄(tf)x̃(tf) + ψ̄(t0)x̃(t0) + ψ̄⁺x̃⁺ − ψ̄⁻x̃⁻
+ ∫_{t0}^{tf} ψ̄̇ x̃ dt + c̄0 x̃(t0) + c̄f x̃(tf) = 0 ∀ x̃ ∈ P W^{1,2}. (4.81)

Equation (4.81) implies that the coefficients of x̃⁻, x̃⁺, x̃(t0), x̃(tf) and the coefficient of x̃
in the integral vanish:

−½ [ψ̇]ξ̄ − ψ̄⁻ − ζ̄ = 0, (4.82)
−½ [ψ̇]ξ̄ + ψ̄⁺ + ζ̄ = 0, (4.83)
ψ̄(t0) + c̄0 = 0, (4.84)
−ψ̄(tf) + c̄f = 0, (4.85)
x̄*Fxx + ū*Fux + ψ̄̇ = 0. (4.86)

Adding (4.82) and (4.83), we obtain

−[ψ̇]ξ̄ + [ψ̄] = 0. (4.87)

Thus, (4.70) and (4.72) hold. Subtracting (4.82) from (4.83) and dividing the result by two,
we obtain
ζ̄ = −ψ̄av. (4.88)


(c) We set x̃ = 0 and ũ = 0 in (4.76). Then

a ξ̄ ξ̃ − [ψ̇]x̄av ξ̃ − ζ̄ [ẋ]ξ̃ = 0 ∀ ξ̃ ∈ R¹. (4.89)

Consequently, the coefficient of ξ̃ vanishes:

a ξ̄ − [ψ̇]x̄av − ζ̄ [ẋ] = 0. (4.90)

Using (4.88) in (4.90), we obtain (4.73). Conditions (4.67), (4.68), and (4.71) and the first
three conditions in (4.66) are implied by (4.62). Thus, all conditions (4.66)–(4.73) hold.
Conversely, if a tuple (ξ̄, x̄, ū, ψ̄) satisfies conditions (4.66)–(4.73), then one easily verifies
that z̄ = (ξ̄, x̄, ū) satisfies (4.62) and (4.63).


Let z̄ = (ξ̄, x̄, ū) ∈ K. Obviously, the condition z̄ ≠ 0 is equivalent to x̄(·) ≠ 0. Thus,
we obtain the following theorem from Theorem 4.12 and Lemma 4.13 under the above
condition (SL).

Theorem 4.14. Assume that Ω is not positive definite on K. Then τ0 (given by equa-
tion (4.47)) is minimal among all τ ∈ (t0, tf] such that there exists a tuple (ξ̄, x̄, ū, ψ̄) that
satisfies (4.66)–(4.73) and the condition x̄(·) ≠ 0.


In what follows, we assume that a > 0 and, as above, that condition (SL) holds.
Assume that τ0 < tf. We know that Ω ≥ 0 on K(τ0). Theoretically, it is possible that
Ω ≥ 0 on K(τ1) for a certain τ1 > τ0. In this case, the closed interval [τ0, τ1] is called a
table [25]. Tables occur in optimal control problems [25], but, for a smooth extremal, they
never arise in the calculus of variations. We now show that, for the simplest problem of
the calculus of variations, the closed interval [τ0, tf] cannot serve as a table in the case of a
broken extremal either. To this end, we supplement Lemma 4.13 with the following two
propositions.


Proposition 4.15. If the functions x̄, ψ̄ ∈ P W^{1,2} and ū ∈ L2 satisfy the system

x̄̇ = ū, −ψ̄̇ = x̄*Fxx + ū*Fux, −ψ̄ = x̄*Fxu + ū*Fuu (4.91)

on a certain closed interval Δ ⊂ [t0, tf], then the functions x̄ and ψ̄ satisfy the system

x̄̇ = Ax̄ + Bψ̄*, −ψ̄̇ = x̄*C + ψ̄A (4.92)

on the same closed interval Δ, where A, B, and C are the same as in (4.20).

Proof. The third equation in (4.91) implies

ū = −Fuu⁻¹ ψ̄* − Fuu⁻¹ Fux x̄, ū* = −ψ̄ Fuu⁻¹ − x̄* Fxu Fuu⁻¹.

Substituting these expressions for ū and ū* into the first and second equations in (4.91),
respectively, we obtain (4.92).


Proposition 4.16. Conditions (4.71)–(4.73) imply

a ξ̄ = −ψ̄⁺[ẋ] + [ψ̇]x̄⁺. (4.93)

Proof. Using the expressions ψ̄av = ψ̄⁺ − ½[ψ̄] and x̄av = x̄⁺ − ½[x̄], together with
(4.71) and (4.72), in (4.73), we obtain (4.93).

Theorem 4.17. If Ω ≥ 0 on K, then Ω is positive definite on K(τ) for all τ < tf.

Proof. Assume the contrary, i.e., Ω ≥ 0 on K and τ0 < tf. Choose an element z̄ =
(ξ̄, x̄, ū) ∈ K(τ0) \ {0} such that Ω(z̄) = 0. Then z̄ yields the minimum in problem (APτ) for
τ = tf, and hence (4.63) holds for K(τ) = K. By Lemma 4.13, there exists a function ψ̄,
defined on [t0, tf], such that conditions (4.66)–(4.73) hold for τ = tf. Then, according
to Propositions 4.15 and 4.16, x̄ and ψ̄ also satisfy system (4.92) on [t0, tf], and relation
(4.93) holds for them.
The conditions z̄ ∈ K(τ0), τ0 < tf, together with (4.69) and the condition τ = tf, imply
x̄(tf) = 0, ψ̄(tf) = 0 and hence x̄(t) = 0, ψ̄(t) = 0 on [t∗, tf], since x̄, ψ̄ satisfy (4.92) on
this closed interval. Therefore, x̄⁺ = 0, ψ̄⁺ = 0, and then (4.93) and the condition
a > 0 imply ξ̄ = 0. By (4.71) and (4.72), we have [x̄] = 0, [ψ̄] = 0. Consequently, x̄⁻ = 0,
ψ̄⁻ = 0, and then x̄(t) = 0, ψ̄(t) = 0 on [t0, t∗], since x̄ and ψ̄ satisfy (4.92) on this
closed interval. Thus, x̄(t) = 0 on [t0, tf], and hence ū(t) = 0 on [t0, tf]. We arrive at a
contradiction with the condition z̄ ≠ 0; this proves the theorem.


Further, we examine conditions (4.66)–(4.73). Using Proposition 4.15, we exclude ū
from these conditions. We obtain the following system:

ξ̄ ∈ R¹, x̄, ψ̄ ∈ P W^{1,2}, (4.94)
x̄(t0) = x̄(tf) = 0, (4.95)
x̄̇ = Ax̄ + Bψ̄*, −ψ̄̇ = x̄*C + ψ̄A on [t0, τ], (4.96)
x̄̇ = 0, −ψ̄̇ = x̄*Fxx on [τ, tf], (4.97)
[x̄] = [ẋ]ξ̄, [ψ̄] = [ψ̇]ξ̄, (4.98)
a ξ̄ = −ψ̄⁻[ẋ] + [ψ̇]x̄⁻. (4.99)

The latter condition is obtained from (4.71)–(4.73) similarly to condition (4.93).
If ψ̄(t0) = 0, then (4.94)–(4.99) imply x̄(·) = 0. Consequently, we can assign the
nontriviality condition in the form
ψ̄(t0) ≠ 0. (4.100)

Definition 4.18. A point τ ∈ (t0, tf] is called Ω-conjugate (to t0) if there exists a triple
(ξ̄, x̄, ψ̄) that satisfies conditions (4.94)–(4.100).


Obviously, a point τ ∈ (t0, tf] is Ω-conjugate to t0 iff, for the given τ, there exists
a quadruple (ξ̄, x̄, ū, ψ̄) that satisfies conditions (4.66)–(4.73) and the condition x̄(·) ≠ 0.
Consequently, Theorems 4.14 and 4.17 imply the following.

Theorem 4.19. The form Ω is positive semidefinite on K iff there is no point that is
Ω-conjugate to t0 in the interval (t0, tf). The form Ω is positive definite on K iff there is no
point that is Ω-conjugate to t0 in the half-interval (t0, tf].


Now let us examine the condition for positive definiteness of Ω on K(t∗). This
condition implies the positive definiteness of ω on K0(t∗), which is defined as the subspace
of pairs w̄ = (x̄, ū) ∈ K0 (see Section 4.1.2) such that ū(t) = 0 on [t∗, tf]. Let X(t), Ψ(t) be a
matrix-valued solution of the Cauchy problem (4.22)–(4.23) on [t0, t∗]. As is well known,
the positive definiteness of ω on K0(t∗) is equivalent to the condition

det X(t) ≠ 0 ∀ t ∈ (t0, t∗]. (4.101)

Assume that this condition holds. We set Q(t) = Ψ(t)X⁻¹(t), t ∈ (t0, t∗]. Consider condi-
tions (4.94)–(4.100) for τ ≤ t∗. These conditions imply that

x̄(τ) = x̄⁻ = x̄(t) ∀ t ∈ (τ, t∗), x̄⁻ + [ẋ]ξ̄ = x̄⁺ = 0,
ψ̄⁻ = ψ̄(τ) − x̄(τ)* Φ(τ),

where Φ(τ) = ∫_τ^{t∗} Fxx(t, w(t)) dt. Hence, for τ ≤ t∗ the system (4.94)–(4.100) is equivalent
to the system

ξ̄ ∈ R¹, x̄, ψ̄ ∈ P W^{1,2}, (4.102)
x̄(t0) = 0, ψ̄(t0) ≠ 0, (4.103)
x̄̇ = Ax̄ + Bψ̄*, −ψ̄̇ = x̄*C + ψ̄A on [t0, τ], (4.104)
x̄(τ) + [ẋ]ξ̄ = 0, (4.105)
a ξ̄ = −( ψ̄(τ) − x̄(τ)* Φ(τ) ) [ẋ] + [ψ̇]x̄(τ). (4.106)


Let (ξ̄, x̄, ψ̄) be a solution of this system on the closed interval [t0, τ], where τ ∈ (t0, t∗].
Then there exists c̄ ∈ R^m such that

x̄(t) = X(t)c̄, ψ̄(t) = c̄* Ψ*(t). (4.107)

Consequently, ψ̄(t)* = Ψ(t)c̄ = Ψ(t)X⁻¹(t)(X(t)c̄) = Q(t)x̄(t). Using this relation, together
with (4.105), in (4.106), we obtain

a ξ̄ − [ẋ]* ( Q(τ) − Φ(τ) ) [ẋ]ξ̄ + [ψ̇][ẋ]ξ̄ = 0. (4.108)

If ξ̄ = 0, then from (4.105) and (4.107) we obtain x̄(·) = 0, ψ̄(·) = 0; this contradicts
(4.103). Therefore ξ̄ ≠ 0, and then (4.108) implies

a − [ẋ]* ( Q(τ) − Φ(τ) ) [ẋ] + [ψ̇][ẋ] = 0. (4.109)

We have obtained this relation from the system (4.102)–(4.106). Conversely, if (4.109)
holds, then, setting ξ̄ = −1 and c̄ = X⁻¹(τ)[ẋ] and defining x̄, ψ̄ by formulas (4.107), we
obtain a solution of the system (4.102)–(4.106). We have proved the following lemma.

Lemma 4.20. Assume that condition (4.101) holds. Then τ ∈ (t0, t∗] is a point that is
Ω-conjugate to t0 iff condition (4.109) holds.


We set
μ(t) = a − [ẋ]* ( Q(t) − Φ(t) ) [ẋ] + [ψ̇][ẋ]. (4.110)

Then, by Lemma 4.20, the absence of an Ω-conjugate point in (t0, t∗] is equivalent to the
condition
μ(t) ≠ 0 ∀ t ∈ (t0, t∗]. (4.111)

We show further that the function μ(t) does not increase on (t0, t∗] and that μ(t0 + 0) = +∞.
Consequently, condition (4.111) is equivalent to μ(t∗) > 0, which is another form of condi-
tion (4.28) or condition (4.40).


Proposition 4.21. Assume that a symmetric matrix Q(t) satisfies the Riccati equation (4.31)
on (t0, τ), where τ ≤ t∗. Then

μ̇(t) ≤ 0 ∀ t ∈ (t0, τ). (4.112)

Moreover, if Q satisfies the initial condition (4.36), then

μ(t0 + 0) = +∞. (4.113)

Proof. For t ∈ (t0, τ), we have

μ̇ = −⟨ (Q̇ + Fxx)[ẋ], [ẋ] ⟩. (4.114)

Using (4.31) in (4.114), we obtain

μ̇ = −⟨ (Q + Fxu) Fuu⁻¹ (Q + Fux)[ẋ], [ẋ] ⟩
   = −⟨ Fuu⁻¹ (Q + Fux)[ẋ], (Q + Fux)[ẋ] ⟩ ≤ 0 on (t0, τ), (4.115)

since Fuu⁻¹ is positive definite. Assume additionally that Q satisfies (4.36). Then Q = ΨX⁻¹,
where X and Ψ satisfy (4.22) and (4.23). It follows from (4.22) and (4.23) that

X(t0) = O, Ẋ(t0) = −B(t0) = Fuu⁻¹(t0, w(t0)), Ψ(t0) = −I. (4.116)

Consequently,

X(t) = (t − t0) Fuu⁻¹(t0, w(t0)) + o(t − t0), Ψ(t) = −I + o(1) as t → t0 + 0. (4.117)

Thus,
Q(t) = −(1/(t − t0)) ( Fuu(t0, w(t0)) + o(1) ) as t → t0 + 0; (4.118)

this implies

⟨−Q(t)[ẋ], [ẋ]⟩ = (1/(t − t0)) ( ⟨Fuu(t0, w(t0))[ẋ], [ẋ]⟩ + o(1) ) → +∞ as t → t0 + 0. (4.119)

Now (4.110) and (4.119) imply (4.113).


Lemma 4.20 and Proposition 4.21 imply the following.

Theorem 4.22. Assume that condition (4.101) holds. Then the absence of a point τ that is
Ω-conjugate to t0 in (t0, t∗] is equivalent to condition (4.28).

Consequently, the positive definiteness of Ω on K(t∗) is equivalent to the validity of
conditions (4.101) and (4.28).
We examine further the system (4.94)–(4.100) for τ > t∗. Using (4.99) and the con-
dition a > 0, we can exclude ξ̄ from this system. Moreover, for x̄, ψ̄, we can formulate

all conditions on [t0 , τ ] only. As a result, we arrive at the following equivalent system on
[0, τ ]:
x̄, ψ̄ ∈ P W 1,2 [t0 , τ ], (4.120)
x̄(0) = x̄(τ ) = 0, ψ̄(0)  = 0, (4.121)
x̄˙ = Ax̄ + B ψ̄ ∗ , −ψ̄˙ = x̄ ∗ C + ψ̄A on [t0 , τ ], (4.122)
 
a[x̄] = [ẋ] − ψ̄ − [ẋ] + [ψ̇]x̄ − , (4.123)
 
a[ψ̄] = [ψ̇] − ψ̄ − [ẋ] + [ψ̇]x̄ − . (4.124)
We have proved the following lemma.

Lemma 4.23. A point τ ∈ (t∗, tf] is Θ-conjugate to t0 iff there exists a pair of functions x̄, ψ̄
that satisfies conditions (4.120)–(4.124).

Now we can prove the following theorem.

Theorem 4.24. A point τ ∈ (t∗, tf] is Θ-conjugate to t0 iff

det X(τ) = 0,   (4.125)

where (X(t), Ψ(t)) is a solution to the problem (4.22)–(4.25).

Proof. Assume that condition (4.125) holds, in which (X, Ψ) is a solution to the problem
(4.22)–(4.25). Then there exists c̄ ∈ Rm such that
X(τ)c̄ = 0,  c̄ ≠ 0.   (4.126)
We set
x̄ = Xc̄,   ψ̄ = c̄∗Ψ∗.   (4.127)
Then x̄ and ψ̄ satisfy (4.120)–(4.124). Conversely, let x̄ and ψ̄ satisfy (4.120)–(4.124).
We set c̄ = −ψ̄(t0)∗. Then conditions (4.126) and (4.127) hold. Conditions (4.126) imply
condition (4.125).

Note that Theorems 4.19, 4.22, and 4.24 imply Theorems 4.3 and 4.4.
To complete the proof of the results of Section 4.1.2, we have to consider a jump
condition for a solution Q to the Riccati equation (4.30) with initial condition (4.36). In
what follows, we assume that Ω is nonnegative on K and that the pair (X, Ψ) is a solution
to the problem (4.22)–(4.25). Then conditions (4.26) and (4.27) hold. We set Q = ΨX⁻¹.
Then Ψ = QX. Using the relations q− = [ẋ]∗Q− − [ψ̇] and Ψ− = Q−X− in the jump
conditions (4.24) and (4.25) for X and Ψ, we obtain
a[X] = −[ẋ](q−)X−,   (4.128)
a[Ψ] = −[ψ̇]∗(q−)X−.   (4.129)

Proposition 4.25. Condition

det( aI − [ẋ](q−) ) ≠ 0   (4.130)

holds.

Proof. Relation (4.128) implies

aX+ = ( aI − [ẋ](q−) ) X−.   (4.131)

Now (4.130) follows from this relation considered together with (4.27) and the inequality
a > 0.

Proposition 4.26. The following equality holds:

(q−)∗(q−)( aI − [ẋ](q−) )⁻¹ = (q−)∗(q−)( a − (q−)[ẋ] )⁻¹,   (4.132)

where a − (q−)[ẋ] > 0.

Proof. Equality (4.132) is equivalent to

(q−)∗(q−)( aI − [ẋ](q−) ) = (q−)∗(q−)( a − (q−)[ẋ] ).

Let us show that this equality holds. Indeed, we have

(q−)∗(q−)( aI − [ẋ](q−) ) = a(q−)∗(q−) − (q−)∗(q−)[ẋ](q−)
   = a(q−)∗(q−) − (q−)∗( (q−)[ẋ] )(q−)
   = a(q−)∗(q−) − ( (q−)[ẋ] )(q−)∗(q−)
   = (q−)∗(q−)( a − (q−)[ẋ] ),

since (q−)[ẋ] is a number.

Proposition 4.27. The jump condition (4.37) holds.

Proof. The relation Ψ = QX implies [Ψ] = Q+X+ − Q−X−. Multiplying it by a and
using relations (4.129) and (4.131), we obtain

−[ψ̇]∗(q−)X− = Q+( aI − [ẋ](q−) )X− − aQ−X−.

Since det X− ≠ 0 and Q+ = Q− + [Q], we get

−[ψ̇]∗(q−) = a[Q] − Q−[ẋ](q−) − [Q][ẋ](q−).

This relation and the formula (q−)∗ = Q−[ẋ] − [ψ̇]∗ imply the equality

[Q]( aI − [ẋ](q−) ) = (q−)∗(q−).

By virtue of (4.130) this equality is equivalent to

[Q] = (q−)∗(q−)( aI − [ẋ](q−) )⁻¹.

According to (4.132) this relation can be rewritten as

[Q] = (q−)∗(q−)( a − (q−)[ẋ] )⁻¹.

This implies jump condition (4.37).



4.1.5 Numerical Examples


Rayleigh problem with regulator functional. The following optimal control prob-
lem (Rayleigh problem) was thoroughly investigated in [65]. Consider the electric circuit
(tunnel-diode oscillator) shown in Figure 4.1. The state variable x1 (t) is taken as the elec-
tric current I at time t, and the control variable u(t) is induced by the voltage V0 at the
generator. After a suitable transformation of the voltage V0 (t), we arrive at the following
specific Rayleigh equation with a scalar control u(t):
ẍ(t) = −x(t) + ẋ(t) ( 1.4 − 0.14 ẋ(t)2 ) + 4 u(t).
A numerical analysis reveals that the Rayleigh equation with zero control u(t) ≡ 0 has
a limit cycle in the (x, ẋ)-plane. The goal of the control process is to avoid the strong
oscillations on the limit cycle by steering the system toward a small neighborhood of the
origin (x, ẋ) = (0, 0) using a suitable control function u(t). With the state variables x1 = x
and x2 = ẋ we arrive at the following control problem with fixed final time tf > 0: Minimize
the functional

F(x, u) = ∫_0^{tf} ( u(t)² + x1(t)² ) dt   (4.133)
subject to
ẋ1(t) = x2(t),   ẋ2(t) = −x1(t) + x2(t)(1.4 − 0.14 x2(t)²) + 4 u(t),   (4.134)
x1 (0) = x2 (0) = −5, x1 (tf ) = x2 (tf ) = 0. (4.135)
As in [65], we choose the final time tf = 4.5 . The Pontryagin function (Hamiltonian), which
corresponds to the minimum principle, becomes
H(x1, x2, ψ1, ψ2, u) = u² + x1² + ψ1 x2 + ψ2 (−x1 + x2(1.4 − 0.14 x2²) + 4 u),   (4.136)
where ψ1 and ψ2 are the adjoint variables associated with x1 and x2 . The optimal control
that minimizes the Pontryagin function is determined by
Hu = 2 u + 4 ψ2 = 0, i.e., u(t) = −2 ψ2 (t). (4.137)

Figure 4.1. Tunnel-diode oscillator: L denotes the inductivity, C the capacity, R the
resistance, I the electric current, and D the diode.

The strict Legendre–Clebsch condition holds in view of Huu(t) ≡ 2 > 0. The adjoint
equation ψ̇ = −Hx yields

ψ̇1 = ψ2 − 2x1,   ψ̇2 = 0.42 ψ2 x2² − 1.4 ψ2 − ψ1.   (4.138)

Since the final state is specified, there are no boundary conditions for the adjoint
variables. The boundary value problem (4.134), (4.135), and (4.138) with control u = −2ψ2
was solved using the multiple shooting code BNDSCO developed by Oberle and Grimm
[82]. The optimal state, control, and adjoint variables are shown in Figure 4.2. We get the
following initial and final values for the adjoint variables:

ψ1(0) = −9.00247067,    ψ2(0) = −2.67303084,
ψ1(4.5) = −0.04456054,  ψ2(4.5) = −0.00010636.   (4.139)

Nearly identical numerical results can be obtained by solving the discretized control problem
with a high number of gridpoints.
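The same extremal can be reproduced with standard software. The following sketch (an illustration, not the BNDSCO computation from [82]) solves the boundary value problem (4.134), (4.135), (4.138) with the control law (4.137) by collocation; the mesh, tolerances, and initial guess are ad hoc choices.

```python
import numpy as np
from scipy.integrate import solve_bvp

tf = 4.5

def rhs(t, y):
    x1, x2, p1, p2 = y
    u = -2.0 * p2                        # minimum condition (4.137)
    return np.vstack([x2,
                      -x1 + x2 * (1.4 - 0.14 * x2**2) + 4.0 * u,
                      p2 - 2.0 * x1,     # adjoint equations (4.138)
                      0.42 * p2 * x2**2 - 1.4 * p2 - p1])

def bc(ya, yb):
    # fixed initial and final states (4.135); the adjoints are free
    return np.array([ya[0] + 5.0, ya[1] + 5.0, yb[0], yb[1]])

t = np.linspace(0.0, tf, 200)
y0 = np.zeros((4, t.size))
y0[0] = y0[1] = -5.0 * (1.0 - t / tf)    # crude linear guess for the states
sol = solve_bvp(rhs, bc, t, y0, tol=1e-8)
print(sol.y[2, 0], sol.y[3, 0])          # compare with psi1(0), psi2(0) in (4.139)
```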
The optimality of this extremal solution may be checked by producing a bounded solu-
tion of the Riccati equation (4.30). Since the Rayleigh problem has state dimension n = 2, we


Figure 4.2. Rayleigh problem with regular control. (a) State variables. (b) Con-
trol. (c) Adjoint variables. (d) Solutions of the Riccati equation (4.140).
4.1. Jacobi-Type Conditions and Riccati Equation for Broken Extremals 205

consider a symmetric 2 × 2 matrix Q(t) of the form

         ( Q11(t)   Q12(t) )
Q(t)  =  (                 ) .
         ( Q12(t)   Q22(t) )

The Riccati equation (4.30) then reads explicitly as

Q̇11 = 2 Q12 + 8 Q12² − 2,
Q̇12 = −Q11 + (0.42 x2² − 1.4) Q12 + Q22 + 8 Q12 Q22,   (4.140)
Q̇22 = −2 Q12 + 2 (0.42 x2² − 1.4) Q22 + 8 Q22² + 0.84 ψ2 x2.

Since the final state is fixed, no boundary condition is prescribed for Q(tf). Choosing for
convenience the boundary condition Q(tf) = 0, we obtain the bounded solution of the
Riccati equation shown in Figure 4.2. The initial values are computed as

Q11(0) = 2.006324,   Q12(0) = 0.4705135,   Q22(0) = −0.3516654.

Thus the unconstrained solution shown in Figure 4.2 provides a local minimum.
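The Riccati test itself is a routine backward integration. A sketch, assuming the BVP solution `sol` from the previous snippet is available to evaluate x2(t) and ψ2(t) along the extremal:

```python
from scipy.integrate import solve_ivp

tf = 4.5
x2 = lambda t: sol.sol(t)[1]             # interpolants from the solve_bvp result
p2 = lambda t: sol.sol(t)[3]

def riccati(t, Q):
    Q11, Q12, Q22 = Q
    c = 0.42 * x2(t)**2 - 1.4
    return [2*Q12 + 8*Q12**2 - 2,                            # system (4.140)
            -Q11 + c*Q12 + Q22 + 8*Q12*Q22,
            -2*Q12 + 2*c*Q22 + 8*Q22**2 + 0.84 * p2(t) * x2(t)]

# integrate backward from Q(tf) = 0; a solution that stays bounded down to
# t = 0 confirms the second-order test
res = solve_ivp(riccati, [tf, 0.0], [0.0, 0.0, 0.0], rtol=1e-9)
print(res.y[:, -1])                      # compare with Q11(0), Q12(0), Q22(0) above
```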
To conclude the discussion of the Rayleigh problem, we consider the control problem
with free final state x(tf). The solution is quite similar to that shown in Figure 4.2, with
the only difference being that the boundary inequality Q(tf) > 0 should hold. However,
it suffices to find a solution Q(t) satisfying the boundary condition Q(tf) = 0 which was
imposed earlier. Due to the continuous dependence of solutions on initial or terminal
conditions, equation (4.140) then has a solution with Q(tf) = ε · I2 > 0 for ε > 0 small. We
obtain
x1(4.5) = −0.0957105,   x2(4.5) = −0.204377,
ψ1(0) = −9.00126,       ψ2(0) = −2.67259,   (4.141)
Q11(0) = 2.00607,       Q12(0) = 0.470491,
Q22(0) = −0.351606.

Variational problem with a conjugate point. The following variational problem
was studied in Maurer and Pesch [71]:

Minimize F(x, u) = (1/2) ∫_0^1 ( x(t)³ + ẋ(t)² ) dt subject to x(0) = 4, x(1) = 1.   (4.142)
Defining the control variable by u = ẋ as usual, the Pontryagin function (Hamiltonian),
corresponding to the minimum principle, becomes

H(x, u, ψ) = (1/2)(x³ + u²) + ψ u.

The strict Legendre–Clebsch condition holds in view of Huu = 1 > 0. The minimizing
control satisfies Hu = 0, which gives u = −ψ. Using the adjoint equation ψ̇ = −Hx =
−3x²/2, we get the boundary value problem (Euler–Lagrange equation)

ẍ = (3/2) x²,   x(0) = 4,   x(1) = 1.   (4.143)

The unknown initial value ẋ(0) can be determined by shooting methods; cf. Stoer and
Bulirsch [106]. The boundary value problem (4.143) has the explicit solution x (1) (t) =
4/(1 + t)2 and a second solution x (2) (t) with initial values

ẋ (1) (0) = −8, ẋ (2) (0) = −35.858549. (4.144)

Both solutions x(k)(t), k = 1, 2, may be tested for optimality by the classical Jacobi condition.
The variational system (4.22), (4.23), along the two extremals, yields for k = 1, 2

ẍ(k)(t) = (3/2) x(k)(t)²,   x(k)(0) = 4,   ẋ(k)(0) as in (4.144),
                                                                   (4.145)
ÿ(k)(t) = 3 x(k)(t) y(k)(t),   y(k)(0) = 0,   ẏ(k)(0) = 1.

The extremals x(k) and the variational solutions y(k), k = 1, 2, are displayed
in Figure 4.3. The extremal x(1)(t) = 4/(1 + t)² is optimal in view of

y (1) (t) > 0 ∀ 0 < t ≤ 1,

whereas the second extremal x (2) is not optimal, since it exhibits the conjugate point tc =
0.674437 with y (2) (tc ) = 0. The conjugate point tc has the property that the envelope of


Figure 4.3. Top left: Extremals x (1) , x (2) (lower graph). Top right: Variational
solutions y (1) and y (2) (lower graph) to (4.145). Bottom: Envelope of neighboring extremals
illustrating the conjugate point tc = 0.674437.

extremals corresponding to neighboring initial slopes ẋ(0) touches the extremal x (2) (t) at
tc ; cf. Figure 4.3.
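Numerically, the conjugate point can be located by integrating the system (4.145) along x(2) and detecting the first zero of y(2)(t) with an event function; a minimal sketch (initial slope taken from (4.144)):

```python
from scipy.integrate import solve_ivp

def rhs(t, z):
    x, xd, y, yd = z
    return [xd, 1.5 * x**2, yd, 3.0 * x * y]   # system (4.145)

def hit(t, z):                  # event function: y(t) = 0
    return z[2]
hit.terminal = True
hit.direction = -1              # skip the trivial zero y(0) = 0 (upcrossing)

sol = solve_ivp(rhs, [0.0, 1.0], [4.0, -35.858549, 0.0, 1.0],
                events=hit, rtol=1e-10, atol=1e-12)
print(sol.t_events[0])          # approximately [0.674437]
```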

Example with a Broken Extremal


Consider the following problem:
Minimize J(x(·), u(·)) := (1/2) ∫_0^{tf} ( f(u(t)) − x²(t) ) dt   (4.146)
under the constraints
ẋ(t) = u(t),   x(0) = x(tf) = 0,   (4.147)

where f(u) = min{(u + 1)², (u − 1)²}, tf > 0 is fixed, and u ≠ 0.


This example was examined in [79]. Here, we will study it using the results of this
section. In [79] it was shown that for each closed interval [0, tf] we have an infinite number
of extremals. For a given closed interval [0, tf], each extremal x(t) is a periodic piecewise
smooth function with a finite number of switching points of the control tk = (2k − 1)ϕ,
k = 1, . . . , s, where 0 < ϕ < π/2, tf = 2sϕ < sπ. In particular, for s = 1 we have tf = 2ϕ < π.
Set Θ = {t1, . . . , ts}. For each s ≥ 1 the function x(t) is defined by the conditions
ẍ = −x for t ∈ [0, tf] \ Θ,   x(0) = x(tf) = 0,   [ẋ]k = (−1)^k · 2,   k = 1, . . . , s
(we identify the extremals differing only by sign). Consequently, x(t) = ρ sin t for t ∈ [0, ϕ)
and x(t) = ρ sin(t − 2ϕ) for t ∈ (ϕ, 2ϕ), etc., where ρ = (cos ϕ)⁻¹. Further, in [79] it was
shown that for tf > π each extremal does not satisfy the necessary second-order condition.
The same is true for the case tf ≤ π and s > 1. Consequently, the only possibility for the
quadratic form Ω to be nonnegative on the critical subspace K is the case s = 1. In this
case we have tf = 2ϕ < π and

Ω = (tan ϕ) ξ̄² + (1/2) ∫_0^{tf} ( ū² − x̄² ) dt.
The critical subspace K is defined by the conditions
x̄˙ = ū, x̄(0) = x̄(tf ) = 0, [x̄] = −2ξ̄ .
The Riccati equation (4.34) has the form (for a small ε > 0)
−Ṙ = 1 + R 2 , t ∈ [0, ε], R(0) = 0.
The solution of this Cauchy problem is R(t) = − tan t. Consequently, Q(t) = R −1 (t) =
− cot t is the solution of the Cauchy problem
−Q̇ + Q2 + 1 = 0, Q(ε) = R −1 (ε)
on the interval (0, ϕ). It follows that Q− = − cot ϕ. Using the equalities [ẋ] = [u] = −2,
[ψ̇] = [x] = 0, and a = D(H ) = 2 tan ϕ, we obtain
q− = [ẋ]Q− − [ψ̇] = −2Q− = 2 cot ϕ,
b− = a − q− [ẋ] = 2 tan ϕ + 4 cot ϕ > 0,

and then
[Q] = (b−)⁻¹ (q−)² = (b−)⁻¹ · 4 cot²ϕ,
Q+ = Q− + [Q] = −cot ϕ + (4 cot²ϕ)/b− = −2(b−)⁻¹ < 0.
Now, for t ∈ (ϕ, tf ], we have to solve the Cauchy problem

−Q̇ + Q2 + 1 = 0, Q(ϕ+) = −2(b− )−1 .

Since Q(ϕ+) < 0, the solution has the form Q(t) = −cot(t + τ) for some τ > 0. It is
clear that this solution is defined at least on the closed interval [ϕ, ϕ + 0.5π], which contains
the closed interval [ϕ, tf]. By Theorem 4.6 the quadratic form Ω is positive definite on the
subspace K. This means that the extremal with a single switching point satisfies the sufficient
second-order optimality condition. By Theorem 4.2, such an extremal is a point of a strict
bounded strong minimum in the problem.
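The jump computation above is easily checked numerically; a small sketch for a sample value of ϕ (the value ϕ = 1.2 is an arbitrary choice):

```python
import math

phi = 1.2                               # any 0 < phi < pi/2, so tf = 2*phi < pi
Qm = -1.0 / math.tan(phi)               # Q- = -cot(phi)
a = 2.0 * math.tan(phi)                 # a = D(H)
dx, dpsi = -2.0, 0.0                    # [xdot] = -2, [psidot] = 0
qm = dx * Qm - dpsi                     # q- = 2*cot(phi)
bm = a - qm * dx                        # b- = 2*tan(phi) + 4*cot(phi)
Qp = Qm + qm**2 / bm                    # jump condition (4.152)
print(bm > 0, abs(Qp - (-2.0 / bm)) < 1e-12)   # both True: Q+ = -2/b- as derived
```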

4.1.6 Transformation of the Quadratic Form to Perfect Squares for


Broken Extremals in the Simplest Problem of the Calculus of
Variations
Assume again that w(·) = (x(·), u(·)) is an extremal in the simplest problem of the calculus
of variations (4.1)–(4.3), with a single corner point t∗ ∈ (t0, tf). Assume that condition (SL)
is satisfied, and a > 0, where a is defined by (4.8). Set Θ = {t∗}.
Let Q(t) be a symmetric matrix function on [t0, tf] with piecewise continuous entries
which are continuously differentiable on each of the two intervals of the set [t0, tf] \ Θ (hence,
a jump of Q is possible only at the point t∗). We continue to use the following notation:

Q− = Q(t∗ − 0),   Q+ = Q(t∗ + 0),   [Q] = Q+ − Q−,
q− = [ẋ]∗Q− − [ψ̇],   b− = a − q−[ẋ],   (4.148)

A(t) = −Fuu⁻¹(t, w(t)) Fux(t, w(t)),   B(t) = −Fuu⁻¹(t, w(t)),
C(t) = −Fxu(t, w(t)) Fuu⁻¹(t, w(t)) Fux(t, w(t)) + Fxx(t, w(t)).   (4.149)

Note that B(t) is negative definite. Assume that Q satisfies


(a) the Riccati equation

Q̇(t) + Q(t)A(t) + A(t)∗Q(t) + Q(t)B(t)Q(t) + C(t) = 0   (4.150)

for all t ∈ [t0, tf] \ Θ,

(b) the inequality
b− > 0,   (4.151)

(c) the jump condition

[Q] = (b−)⁻¹ (q−)∗(q−),   (4.152)



where q− is a row vector and (q− )∗ is a column vector. (Recall that (q− )∗ (q− ) is a sym-
metric positive semidefinite matrix.) In this case, we will say that the matrix Q(t) is the
symmetric solution of the problem (4.150)–(4.152) on [t0 , tf ].
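Computationally, the jump condition (4.152) is a rank-one update of Q at the corner point, guarded by the inequality (4.151). A sketch of such an update step (the function name and the toy data are illustrative):

```python
import numpy as np

def riccati_jump(Qm, dx, dpsi, a):
    """Q+ = Q- + (b-)^{-1} (q-)^T (q-), cf. (4.148), (4.151), (4.152).

    Qm:   left limit Q(t*-0), symmetric n x n
    dx:   jump [xdot], column vector (n, 1)
    dpsi: jump [psidot], row vector (1, n)
    a:    the number a > 0
    """
    qm = dx.T @ Qm - dpsi              # row vector q- = [xdot]^T Q- - [psidot]
    bm = float(a - qm @ dx)            # b- = a - q- [xdot]
    if bm <= 0:
        raise ValueError("b- <= 0: inequality (4.151) violated")
    return Qm + (qm.T @ qm) / bm       # symmetric, rank-one correction

# toy 2 x 2 usage
Qm = np.array([[1.0, 0.2], [0.2, 0.5]])
dx = np.array([[1.0], [-1.0]]); dpsi = np.array([[0.0, 0.3]])
print(riccati_jump(Qm, dx, dpsi, a=3.0))
```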
For the quadratic form Ω on the subspace K, defined by relations (4.14) and (4.13),
respectively, the following theorem holds.

Theorem 4.28. If there exists a symmetric solution Q(t) of the problem (4.150)–(4.152)
on [t0, tf], then the quadratic form Ω has the following transformation into a perfect
square on K:

2Ω(z̄) = (b−)⁻¹ ( a ξ̄ + q− x̄− )² − ∫_{t0}^{tf} ⟨B(t)v̄(t), v̄(t)⟩ dt,   (4.153)

where
v̄(t) = ( Q(t) + Fux(t, w(t)) ) x̄(t) + Fuu(t, w(t)) ū(t).   (4.154)

The proof of this theorem consists of the following four propositions.

Proposition 4.29. Let Q(t) be a symmetric matrix on [t0, tf] with piecewise continuous
entries, which are absolutely continuous on each interval of the set [t0, tf] \ Θ. Then

2Ω(z̄) = b− ξ̄² + 2 q− x̄+ ξ̄ + ⟨[Q]x̄+, x̄+⟩
       + ∫_{t0}^{tf} ( ⟨(Fxx + Q̇)x̄, x̄⟩ + 2⟨(Fxu + Q)ū, x̄⟩ + ⟨Fuu ū, ū⟩ ) dt   (4.155)

for all z̄ = (ξ̄, x̄, ū) ∈ K.

Proof. For z̄ ∈ K, we obviously have

∫_{t0}^{tf} (d/dt)⟨Qx̄, x̄⟩ dt = ⟨Qx̄, x̄⟩ |_{t0}^{tf} − [⟨Qx̄, x̄⟩].   (4.156)

Using the conditions x̄(t0) = x̄(tf) = 0 and x̄˙ = ū in (4.156), we obtain

0 = ⟨Q+x̄+, x̄+⟩ − ⟨Q−x̄−, x̄−⟩ + ∫_{t0}^{tf} ( ⟨Q̇x̄, x̄⟩ + 2⟨Qx̄, ū⟩ ) dt.   (4.157)

Adding this zero form to the form

2Ω(z̄) = a ξ̄² − [ψ̇](x̄− + x̄+)ξ̄ + ∫_{t0}^{tf} ( ⟨Fxx x̄, x̄⟩ + 2⟨Fxu ū, x̄⟩ + ⟨Fuu ū, ū⟩ ) dt,

we obtain

2Ω(z̄) = a ξ̄² − [ψ̇](x̄− + x̄+)ξ̄ − ⟨Q−x̄−, x̄−⟩ + ⟨Q+x̄+, x̄+⟩
       + ∫_{t0}^{tf} ( ⟨(Fxx + Q̇)x̄, x̄⟩ + 2⟨(Fxu + Q)ū, x̄⟩ + ⟨Fuu ū, ū⟩ ) dt.   (4.158)

Consider the form

ω∗(ξ̄, x̄) := a ξ̄² − [ψ̇](x̄− + x̄+)ξ̄ − ⟨Q−x̄−, x̄−⟩ + ⟨Q+x̄+, x̄+⟩   (4.159)

connected to the jump point t∗ ∈ Θ of the control u(·), and let us represent it as a function
of x̄+, ξ̄:

ω∗(ξ̄, x̄) = a ξ̄² − [ψ̇](2x̄+ − [x̄])ξ̄ − ⟨Q−(x̄+ − [x̄]), x̄+ − [x̄]⟩ + ⟨Q+x̄+, x̄+⟩
   = a ξ̄² − 2[ψ̇]x̄+ ξ̄ + [ψ̇][x̄]ξ̄ − ⟨Q−x̄+, x̄+⟩ + 2⟨Q−[x̄], x̄+⟩
     − ⟨Q−[x̄], [x̄]⟩ + ⟨Q+x̄+, x̄+⟩.

Using the relations [x̄] = [ẋ]ξ̄ and Q+ − Q− = [Q], we obtain

ω∗(ξ̄, x̄) = ( a − ⟨Q−[ẋ] − [ψ̇]∗, [ẋ]⟩ )ξ̄² + 2⟨Q−[ẋ] − [ψ̇]∗, x̄+⟩ξ̄ + ⟨[Q]x̄+, x̄+⟩.

Now using the definitions (4.148) of q− and b−, we obtain

ω∗(ξ̄, x̄) = b− ξ̄² + 2 q− x̄+ ξ̄ + ⟨[Q]x̄+, x̄+⟩.   (4.160)

Formulas (4.158), (4.159), and (4.160) imply formula (4.155).

Proposition 4.30. If Q satisfies the jump condition b−[Q] = (q−)∗(q−), then

b− ξ̄² + 2 q− x̄+ ξ̄ + ⟨[Q]x̄+, x̄+⟩ = (b−)⁻¹ ( a ξ̄ + q− x̄− )².   (4.161)

Proof. Using the jump conditions for [Q] and [x̄] and the definition (4.148) of b−, we
obtain

b− ξ̄² + 2 q− x̄+ ξ̄ + ⟨[Q]x̄+, x̄+⟩ = (b−)⁻¹ ( (b−)² ξ̄² + 2(q− x̄+) b− ξ̄ + (q− x̄+)² )
   = (b−)⁻¹ ( b− ξ̄ + q− x̄+ )²
   = (b−)⁻¹ ( (a − q−[ẋ])ξ̄ + q− x̄+ )²
   = (b−)⁻¹ ( a ξ̄ − q−[x̄] + q− x̄+ )²
   = (b−)⁻¹ ( a ξ̄ + q− x̄− )².

Proposition 4.31. The Riccati equation (4.150) is equivalent to the equation

Q̇ − (Q + Fxu) Fuu⁻¹ (Q + Fux) + Fxx = 0 on [t0, tf] \ Θ.   (4.162)

Proof. Using formulas (4.149), we obtain

QA + A∗Q + QBQ + C
   = −QFuu⁻¹Fux − FxuFuu⁻¹Q − QFuu⁻¹Q − FxuFuu⁻¹Fux + Fxx
   = −QFuu⁻¹(Q + Fux) − FxuFuu⁻¹(Q + Fux) + Fxx
   = −(Q + Fxu)Fuu⁻¹(Q + Fux) + Fxx.

Hence, (4.150) is equivalent to (4.162).



Proposition 4.32. If Q satisfies the Riccati equation (4.162), then

⟨(Fxx + Q̇)x̄, x̄⟩ + 2⟨(Fxu + Q)ū, x̄⟩ + ⟨Fuu ū, ū⟩ = ⟨Fuu⁻¹ v̄, v̄⟩,   (4.163)

where v̄ is defined by (4.154).

Proof. From (4.154), it follows that

⟨Fuu⁻¹ v̄, v̄⟩ = ⟨Fuu⁻¹((Q + Fux)x̄ + Fuu ū), (Q + Fux)x̄ + Fuu ū⟩
   = ⟨Fuu⁻¹(Q + Fux)x̄, (Q + Fux)x̄⟩ + 2⟨Fuu⁻¹(Q + Fux)x̄, Fuu ū⟩   (4.164)
     + ⟨Fuu⁻¹ Fuu ū, Fuu ū⟩
   = ⟨(Q + Fxu)Fuu⁻¹(Q + Fux)x̄, x̄⟩ + 2⟨(Q + Fux)x̄, ū⟩ + ⟨Fuu ū, ū⟩.

Using (4.162) and (4.164), we obtain (4.163).

Now, let Q be a symmetric solution of problem (4.150)–(4.152) on [t0, tf]. Then, Propo-
sitions 4.29–4.32 yield the transformation (4.153) of Ω into a perfect square on K. This
completes the proof of Theorem 4.28.
Now we can easily prove the following sufficient condition for the positive definite-
ness of the quadratic form Ω on the subspace K, which is the main result of this section.

Theorem 4.33. If there exists a symmetric solution Q(t) of problem (4.150)–(4.152) on
[t0, tf], then Ω(z̄) is positive definite on K; i.e., there exists ε > 0 such that Ω(z̄) ≥ ε γ̄(z̄)
for all z̄ ∈ K, where γ̄(z̄) = ξ̄² + ∫_{t0}^{tf} ⟨ū(t), ū(t)⟩ dt.

Proof. By Theorem 4.28, formula (4.153) holds for Ω on K. Hence, the form Ω is
nonnegative on K. Let us show that Ω is positive on K. Let z̄ ∈ K and Ω(z̄) = 0. Then
by (4.153) and (4.154), we have

a ξ̄ + q− x̄− = 0,   (Q + Fux)x̄ + Fuu ū = 0 on [t0, tf] \ Θ.   (4.165)

Since z̄ ∈ K, we have x̄˙ = ū, and consequently

x̄˙ = −Fuu⁻¹(Q + Fux)x̄ on [t0, tf] \ Θ.   (4.166)

Since x̄(t0) = 0, we have x̄(t) = 0 on [t0, t∗). Hence x̄− = 0. Then, by (4.165), we have
ξ̄ = 0. Since [x̄] = [ẋ]ξ̄ = 0, we also have x̄+ = 0. Then, by (4.166), x̄(t) = 0 on (t∗, tf].
Thus, we obtain that ξ̄ = 0, x̄(t) ≡ 0 on [t0, tf], and then ū(t) = 0 a.e. on [t0, tf]. Thus z̄ = 0,
and therefore Ω is positive on K. But Ω is a Legendre form, and therefore positiveness of
Ω on K implies positive definiteness of Ω on K.

Remark. Theorem 4.33 was proved for a single point of discontinuity of the control u(·).
A similar result could be proved for finitely many points of discontinuity of the control u(·).

Now, using Theorem 4.33, we can prove, for s = 1, that the existence of a symmetric
solution Q(t) of the problem (4.150)–(4.152) on [t0, tf] is not only sufficient (as was
stated by Theorem 4.33) but also necessary for the positive definiteness of Ω on K. This can
be done in different ways. We choose the following one. Let Ω be positive definite on K,
take a small ε > 0, and put u(t) = u(t0) for t ∈ [t0 − ε, t0].
Thus, u(t) is continued to the left of the point t0 by a constant value equal to
its value at t0. Now we have w(t) defined on [t0 − ε, tf] with the same single discontinuity
point t∗, and therefore the "continued" quadratic form

Ω_ε(z̄) = (1/2) ( a ξ̄² − 2[ψ̇] x̄av ξ̄ ) + (1/2) ∫_{t0−ε}^{tf} ⟨Fww(t, w(t)) w̄(t), w̄(t)⟩ dt

is well defined on the subspace K_ε:

x̄˙ = ū a.e. on [t0 − ε, tf],   x̄(t0 − ε) = x̄(tf) = 0,   [x̄] = [ẋ]ξ̄.

Using the same technique as in [96], one can easily prove that there is ε > 0 such that Ω_ε(z̄)
is positive definite on K_ε (note that condition (SL) is satisfied for Ω_ε on [t0 − ε, tf], which
is important for this proof). Then by Theorem 4.33 applied for [t0 − ε, tf] there exists a
solution Q of the Riccati equation (4.150) on (t0 − ε, tf] satisfying inequality (4.151) and
jump condition (4.152), and hence we have this solution on the segment [t0, tf]. Thus, for
s = 1 the following theorem holds.

Theorem 4.34. The existence of a symmetric solution Q(t) of problem (4.150)–(4.152)
together with condition (SL) is equivalent to the positive definiteness of Ω on K.

The formulation of the jump condition (4.152) allows us to solve problem (4.150)–
(4.152) in a forward time direction. Certainly, it is also possible to move in the opposite
direction. Let us prove the following.

Lemma 4.35. Conditions (4.151) and (4.152) imply that

b+ := a + q+[ẋ] > 0,   (4.167)
b+ [Q] = (q+)∗(q+),   (4.168)

where
q+ = [ẋ]∗Q+ − [ψ̇].   (4.169)

Proof. By Proposition 4.26, jump condition (4.152) is equivalent to [Q](aI − [ẋ](q−)) =
(q−)∗(q−). This equality implies that a[Q] = (q−)∗(q−) + [Q][ẋ](q−). But

[Q][ẋ] = (q+ − q−)∗.   (4.170)

Hence
a[Q] = (q+)∗(q−).   (4.171)

Using again (4.170) in (4.171), we obtain a[Q] = (q+)∗(q+ − [ẋ]∗[Q]). It follows from this
formula that a[Q] = (q+)∗(q+) − (q+)∗[ẋ]∗[Q]. Consequently,

( aI + (q+)∗[ẋ]∗ )[Q] = (q+)∗(q+).   (4.172)

Let us transform the left-hand side of this equation, using (4.171). Since

(q+)∗[ẋ]∗[Q] = (q+)∗[ẋ]∗ (1/a)(q+)∗(q−) = (1/a)(q+)∗( [ẋ]∗(q+)∗ )(q−)
             = (1/a)( q+[ẋ] )(q+)∗(q−) = [Q]( q+[ẋ] ),

we have

( aI + (q+)∗[ẋ]∗ )[Q] = a[Q] + (q+)∗[ẋ]∗[Q] = a[Q] + ( q+[ẋ] )[Q] = ( a + q+[ẋ] )[Q].

Thus, (4.172) is equivalent to

b+ [Q] = (q+)∗(q+).   (4.173)

Let us show that b+ > 0. From (4.173) it follows that

b+ q+[Q](q+)∗ = ( q+(q+)∗ )².   (4.174)

By (4.151) and (4.152), we have

q+[Q](q+)∗ ≥ 0.   (4.175)

If q+ = 0, then b+ = a + q+[ẋ] = a > 0. Assume that q+ ≠ 0. Then

( q+(q+)∗ )² > 0.   (4.176)

Conditions (4.174), (4.175), and (4.176) imply that b+ > 0.

Lemma 4.35 yields the transformation of Ω on K to a perfect square expressed in terms of
q+ and b+.

Lemma 4.36. The following equality holds:

ω∗(ξ̄, x̄) := a ξ̄² − [ψ̇](x̄− + x̄+)ξ̄ − ⟨Q−x̄−, x̄−⟩ + ⟨Q+x̄+, x̄+⟩ = (1/b+)( a ξ̄ + q+ x̄+ )².

Proof. We have

ω∗(ξ̄, x̄) = a ξ̄² − [ψ̇](x̄− + x̄+)ξ̄ + ⟨Q+x̄+, x̄+⟩ − ⟨Q−x̄−, x̄−⟩
   = a ξ̄² − [ψ̇](2x̄− + [x̄])ξ̄ + ⟨Q+(x̄− + [x̄]), x̄− + [x̄]⟩ − ⟨Q−x̄−, x̄−⟩
   = a ξ̄² − 2[ψ̇]x̄− ξ̄ − [ψ̇][ẋ]ξ̄² + ⟨Q+x̄−, x̄−⟩
     + 2⟨Q+[x̄], x̄−⟩ + ⟨Q+[x̄], [x̄]⟩ − ⟨Q−x̄−, x̄−⟩
   = ( a − [ψ̇][ẋ] + ⟨Q+[ẋ], [ẋ]⟩ )ξ̄² + ⟨[Q]x̄−, x̄−⟩ + 2( [ẋ]∗Q+ − [ψ̇] )x̄− ξ̄
   = ( a + q+[ẋ] )ξ̄² + 2 q+ x̄− ξ̄ + ⟨[Q]x̄−, x̄−⟩.

Hence ω∗(ξ̄, x̄) = b+ ξ̄² + 2 q+ x̄− ξ̄ + ⟨[Q]x̄−, x̄−⟩. Using (4.173), we obtain

ω∗(ξ̄, x̄) = (1/b+)( b+² ξ̄² + 2(q+ x̄−) b+ ξ̄ + (q+ x̄−)² )
          = (1/b+)( b+ ξ̄ + q+ x̄− )² = (1/b+)( a ξ̄ + q+[x̄] + q+ x̄− )²
          = (1/b+)( a ξ̄ + q+ x̄+ )².

Thus, we obtain the following.

Theorem 4.37. If the assumptions of Theorem 4.28 are satisfied, then the quadratic form
Ω has the following transformation to a perfect square on K:

2Ω(z̄) = (b+)⁻¹ ( a ξ̄ + q+ x̄+ )² − ∫_{t0}^{tf} ⟨B(t)v̄(t), v̄(t)⟩ dt,   (4.177)

where v̄ is defined by (4.154).

4.2 Riccati Equation for Broken Extremal in the General
Problem of the Calculus of Variations
Now we show that, by analogy with the simplest problem of the calculus of variations,
the transformation of the quadratic form to a perfect square may be done in the auxiliary
problem connected with the general problem of the calculus of variations (without local
equality-type constraints and endpoint inequality-type constraints), and in this way we
obtain, for this problem, sufficient optimality conditions for broken extremals in terms of
discontinuous solutions of the corresponding Riccati equation. Consider the problem

J(x, u) = J(x0, xf) −→ min,   (4.178)
K(x0, xf) = 0,   (4.179)
ẋ = f(t, x, u),   (4.180)
(x0, xf) ∈ P,   (t, x, u) ∈ Q,   (4.181)
where x0 = x(t0), xf = x(tf), the interval [t0, tf] is fixed, P and Q are open sets, x ∈
Rn, u ∈ Rr, J and K are twice continuously differentiable on P, and f is twice continu-
ously differentiable on Q. We set (x0, xf) = p, (x, u) = w. The problem is considered in
the space W 1,1([t0, tf], Rn) × L∞([t0, tf], Rr). Let an admissible pair w(·) = (x(·), u(·)) be
given. As in Section 2.1.1, we assume that u(·) is piecewise continuous and each point of
discontinuity is a Lipschitz point (see Definition 2.1). We denote by Θ = {t1, . . . , ts} the set
of all discontinuity points of the control u(·).
For the problem (4.178)–(4.181), let us recall briefly the formulations of the sufficient
optimality conditions at the point w(·), given in Chapter 2 for the general problem. Let
us introduce the Pontryagin function H(t, x, u, ψ) = ψf(t, x, u), where ψ ∈ (Rn)∗. Denote
the endpoint Lagrange function by l(p, α0, β) = α0 J(p) + βK(p), where β is a row vector
of the same dimension as the vector K. Further, we introduce a collection of Lagrange
multipliers λ = (α0, β, ψ(·)) such that ψ(·) : [t0, tf] −→ (Rn)∗ is an absolutely continuous
function, continuously differentiable on each interval of the set [t0, tf] \ Θ.

We denote by Λ0 the set of the normed collections λ satisfying the conditions of the
local minimum principle for an admissible trajectory w(·) = (x(·), u(·)):

α0 ≥ 0,  α0 + Σ_j |βj| = 1,  ψ̇ = −Hx,  ψ(t0) = −lx0,  ψ(tf) = lxf,  Hu = 0,   (4.182)

where all derivatives are calculated on the trajectory w(·), respectively, at the endpoints
(x(t0), x(tf)) of this trajectory. The condition Λ0 ≠ ∅ is necessary for a weak minimum at
the point w(·).
We set U(t, x) = {u ∈ Rd(u) | (t, x, u) ∈ Q}. Denote by M0 the set of tuples λ ∈ Λ0
such that for all t ∈ [t0, tf] \ Θ, the condition u ∈ U(t, x0(t)) implies the inequality

H(t, x0(t), u, ψ(t)) ≥ H(t, x0(t), u0(t), ψ(t)).   (4.183)

If w0 is a point of Pontryagin minimum, then M0 is nonempty; i.e., the minimum principle
holds. Further, denote by M0+ the set of λ ∈ M0 such that
(a) H(t, x0(t), u, ψ(t)) > H(t, x0(t), u0(t), ψ(t))
for all t ∈ [t0, tf] \ Θ, u ∈ U(t, x0(t)), u ≠ u0(t);
(b) H(tk, x0(tk), u, ψ(tk)) > H k− = H k+
for all tk ∈ Θ, u ∈ U(tk, x0(tk)), u ∉ {u0k−, u0k+}.
According to Definition 2.94, w 0 is a bounded strong minimum point if for any compact
set C ⊂ Q there exists ε > 0 such that J(w) ≥ J(w 0 ) for all admissible trajectories w
such that

|x(t0 ) − x 0 (t0 )| < ε, max |x(t) − x 0 (t)| ≤ ε, (t, w(t)) ∈ C a.e. on [t0 , tf ],
t∈[t0 ,tf ]

where x is the vector composed of the essential components of the vector x. Recall that
the component xi of a vector x = (x1 , . . . , xd(x) ) is said to be unessential if the function f is
independent of xi and the functions J and K affinely depend on xi0 = xi (t0 ), xif = xi (tf ).
Now we shall formulate quadratic sufficient conditions for a strict bounded strong
minimum at the point w(·), which follow from the corresponding conditions in Section
2.7.2; see Theorem 2.102. For λ ∈ Λ0 and tk ∈ Θ, we put (Δk H)(t) = H(t, x(t), uk+, ψ(t)) −
H(t, x(t), uk−, ψ(t)), where uk− = u(tk − 0), uk+ = u(tk + 0), and

Dk(H) := −(d/dt)(Δk H)|_{tk−0} = −(d/dt)(Δk H)|_{tk+0}.   (4.184)

The second equality is a consequence of the minimum principle. As we know, the following
formula holds:
Dk(H) = −Hx k+ Hψ k− + Hx k− Hψ k+ − [Ht]k.   (4.185)

Recall that the conditions [H]k = 0, Dk(H) ≥ 0, k = 1, . . . , s, also follow from the minimum
principle. The same is true for the Legendre condition ⟨Huu(t, w(t), ψ(t))v̄, v̄⟩ ≥ 0 a.e. on
[t0, tf] for all v̄ ∈ Rr. Let us recall the strengthened Legendre (SL) condition for λ ∈ Λ0:
(a) for any t ∈ [t0, tf] \ Θ the quadratic form ⟨Huu(t, w(t), ψ(t))v̄, v̄⟩ is positive definite;
(b) for any tk ∈ Θ the quadratic forms ⟨Huu(tk, x(tk), uk−, ψ(tk))v̄, v̄⟩ and
⟨Huu(tk, x(tk), uk+, ψ(tk))v̄, v̄⟩ are both positive definite.

An element λ = (α0, β, ψ(·)) ∈ Λ0 is said to be strictly Legendrian if the following condi-
tions are satisfied:
(i) [H]k = 0, k = 1, . . . , s;
(ii) Dk(H) > 0, k = 1, . . . , s;
(iii) the strengthened Legendre (SL) condition.
Assume that there exists a strictly Legendrian element λ ∈ Λ0 such that α0 > 0. In this case
we put α0 = 1. Denote by Z2(Θ) the space of triples z̄ = (ξ̄, x̄(·), ū(·)) such that
ξ̄ ∈ Rs,  x̄(·) ∈ P W 1,2([t0, tf], Rn),  ū ∈ L2([t0, tf], Rr).

Let K denote the subspace of all z̄ = (ξ̄, x̄(·), ū(·)) ∈ Z2(Θ) such that

Kp p̄ = 0,  x̄˙ = fw w̄,  [x̄]k = [ẋ]k ξ̄k,  k = 1, . . . , s,   (4.186)

where fw = fw(t, w(t)), Kp = Kp(x(t0), x(tf)), [ẋ]k is the jump of ẋ(·) at tk, and [x̄]k is the
jump of x̄(·) at tk. We call K the critical subspace.
As in (2.13), we define the quadratic form, which corresponds to the element λ ∈ Λ0, as

Ωλ(z̄) = (1/2) Σ_{k=1}^{s} ( Dk(H) ξ̄k² − 2[ψ̇]k x̄av^k ξ̄k ) + (1/2) ∫_{t0}^{tf} ⟨Hww w̄, w̄⟩ dt + (1/2) ⟨lpp p̄, p̄⟩,   (4.187)

where lpp = lpp(x(t0), x(tf), α0, β), Hww = Hww(t, w(t), ψ(t)). We set

γ̄(z̄) = ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨ū, ū⟩ dt.

Theorem 2.102 implies the following assertion.

Theorem 4.38. Assume that there exists a strictly Legendrian element λ ∈ M0+ such that
α0 > 0 and the quadratic form Ωλ(·) is positive definite on K; i.e., there exists ε > 0 such
that
Ωλ(z̄) ≥ ε γ̄(z̄) ∀ z̄ ∈ K.   (4.188)
Then w(·) is a strict bounded strong minimum.

Now, let us show that the quadratic form Ω can be transformed into a perfect square
if the corresponding Riccati equation has a solution Q(t) defined on [t0, tf], satisfying certain
jump conditions at each point of the set Θ. Define the Riccati equation along x(t), u(t), and
ψ(t) by

Q̇ + Qfx + fx∗Q + Hxx − (Hxu + Qfu) Huu⁻¹ (Hux + fu∗Q) = 0,  t ∈ [t0, tf] \ Θ,   (4.189)

for a piecewise continuous function Q(t) which is continuously differentiable on each
interval of the set [t0, tf] \ Θ. Set qk− = ([ẋ]k)∗ Qk− − [ψ̇]k and define the conditions

bk− := ak − qk− [ẋ]k > 0,  k = 1, . . . , s,   (4.190)

where ak = Dk(H), k = 1, . . . , s. Define the jump conditions for Q at the points tk ∈ Θ as

bk− [Q]k = (qk−)∗(qk−),  k = 1, . . . , s.   (4.191)

Theorem 4.39. Assume that there exists a symmetric solution Q(t) (piecewise continuous
on [t0, tf] and continuously differentiable on each interval of the set [t0, tf] \ Θ) of Riccati
equation (4.189) which satisfies at each point tk ∈ Θ conditions (4.190) and jump condi-
tions (4.191). Then the quadratic form Ωλ(z̄) (see (4.187)) has the following transformation
into a perfect square on the subspace K (see (4.186)):

2Ωλ(z̄) = Σ_{k=1}^{s} (bk−)⁻¹ ( ak ξ̄k + qk− x̄k− )² + ∫_{t0}^{tf} ⟨Huu⁻¹ v̄, v̄⟩ dt + ⟨M p̄, p̄⟩,   (4.192)

where
v̄ = (Hux + fu∗Q) x̄ + Huu ū,   (4.193)

M = ( lx0x0 + Q(t0)   lx0xf
      lxfx0           lxfxf − Q(tf) ).   (4.194)

Proof. In perfect analogy with Theorem 4.28 we have to prove statements similar to
Propositions 4.29–4.32. They need only small changes. Below we point out these changes.
Take a symmetric matrix Q(t) on [t0, tf] with piecewise continuous entries, which are
absolutely continuous on each interval of the set [t0, tf] \ Θ. Using, for z̄ ∈ K, formula
(4.156) and the equalities

2⟨Qx̄˙, x̄⟩ = 2⟨Qfx x̄, x̄⟩ + 2⟨Qfu ū, x̄⟩ = ⟨(Qfx + fx∗Q)x̄, x̄⟩ + ⟨Qfu ū, x̄⟩ + ⟨fu∗Qx̄, ū⟩,

we obtain (similar to (4.157)) the following zero form on K:

0 = Σ_{k=1}^{s} ( ⟨Qk+ x̄k+, x̄k+⟩ − ⟨Qk− x̄k−, x̄k−⟩ )
  + ∫_{t0}^{tf} ( ⟨(Q̇ + fx∗Q + Qfx)x̄, x̄⟩ + ⟨fu∗Qx̄, ū⟩ + ⟨Qfu ū, x̄⟩ ) dt
  − ⟨Q(tf)x̄f, x̄f⟩ + ⟨Q(t0)x̄0, x̄0⟩.   (4.195)

Adding this zero form (4.195) to the form 2Ωλ(z̄) (see (4.187)) considered for arbitrary
z̄ ∈ K, we obtain

2Ωλ(z̄) = ⟨M p̄, p̄⟩ + ∫_{t0}^{tf} ( ⟨(Hxx + Q̇ + Qfx + fx∗Q)x̄, x̄⟩
        + ⟨(Hxu + Qfu)ū, x̄⟩ + ⟨(Hux + fu∗Q)x̄, ū⟩ + ⟨Huu ū, ū⟩ ) dt
        + Σ_{k=1}^{s} ωk(ξ̄, x̄),   (4.196)

where M is defined by (4.194) and

ωk(ξ̄, x̄) = ak ξ̄k² − 2[ψ̇]k x̄av^k ξ̄k + ⟨Qk+ x̄k+, x̄k+⟩ − ⟨Qk− x̄k−, x̄k−⟩,  k = 1, . . . , s.   (4.197)

According to formula (4.160),

ωk(ξ̄, x̄) = bk− ξ̄k² + 2 qk− x̄k+ ξ̄k + ⟨[Q]k x̄k+, x̄k+⟩,  k = 1, . . . , s.   (4.198)

Further, by Proposition 4.30,

ωk = (bk−)⁻¹ ( ak ξ̄k + qk− x̄k− )².   (4.199)

Using the Riccati equation (4.189) and definition (4.193) for v̄, we obtain for the integral
part of (4.196)

∫_{t0}^{tf} ( ⟨(Hxx + Q̇ + Qfx + fx∗Q)x̄, x̄⟩ + ⟨(Hxu + Qfu)ū, x̄⟩
            + ⟨(Hux + fu∗Q)x̄, ū⟩ + ⟨Huu ū, ū⟩ ) dt = ∫_{t0}^{tf} ⟨Huu⁻¹ v̄, v̄⟩ dt,   (4.200)

namely,

∫_{t0}^{tf} ⟨Huu⁻¹ v̄, v̄⟩ dt = ∫_{t0}^{tf} ( ⟨Huu⁻¹(Hux + fu∗Q)x̄, (Hux + fu∗Q)x̄⟩
      + 2⟨Huu⁻¹(Hux + fu∗Q)x̄, Huu ū⟩ + ⟨Huu⁻¹ Huu ū, Huu ū⟩ ) dt
   = ∫_{t0}^{tf} ( ⟨(Hxu + Qfu)Huu⁻¹(Hux + fu∗Q)x̄, x̄⟩
      + 2⟨(Hux + fu∗Q)x̄, ū⟩ + ⟨Huu ū, ū⟩ ) dt
   = ∫_{t0}^{tf} ( ⟨(Q̇ + Qfx + fx∗Q + Hxx)x̄, x̄⟩
      + ⟨(Hux + fu∗Q)x̄, ū⟩ + ⟨(Hxu + Qfu)ū, x̄⟩ + ⟨Huu ū, ū⟩ ) dt.

Equalities (4.196)–(4.200) imply the representation (4.192).

Now we can easily prove the following theorem.

Theorem 4.40. Assume that (a) there exists a symmetric solution Q(t), piecewise continu-
ous on [t0, tf] and continuously differentiable on each interval of the set [t0, tf] \ Θ, of Riccati
equation (4.189) which satisfies at each point tk ∈ Θ conditions (4.190) and jump conditions
(4.191). In addition, assume that (b) ⟨M p̄, p̄⟩ ≥ 0 for all p̄ ∈ R2n such that Kp p̄ = 0. Also,
assume that (c) the conditions Kp p̄ = 0 and ⟨M p̄, p̄⟩ = 0 imply that x̄0 = 0 or x̄f = 0. Then
the quadratic form Ωλ(z̄) (see (4.187)) is positive definite on the subspace K (see (4.186));
i.e., condition (4.188) holds with some ε > 0.

Proof. From Theorem 4.39 it follows that Ωλ(·) is nonnegative on K. Let us show that
Ωλ(·) is positive on K. Assume that Ωλ(z̄) = 0 for some z̄ ∈ K. Then by formula (4.192),
we have

⟨M p̄, p̄⟩ = 0,   (4.201)
v̄ = 0,   (4.202)
ak ξ̄k = −qk− x̄k−,  k = 1, . . . , s.   (4.203)

By assumption (c), we have x̄0 = 0 or x̄f = 0, since Kp p̄ = 0.

(i) Let x̄0 = 0. By (4.202) and (4.193), we have

ū = −Huu⁻¹ (Hux + fu∗Q) x̄.   (4.204)

Using this formula in the equality x̄˙ = fx x̄ + fu ū, we obtain

x̄˙ = ( fx − fu Huu⁻¹ (Hux + fu∗Q) ) x̄.   (4.205)

Together with the initial condition x̄(t0) = 0 this implies that x̄(t) = 0 for all t ∈ [t0, t1).
Hence x̄1− = 0. Then by (4.203), we obtain ξ̄1 = 0. Then [x̄]1 = 0 by the condition
[x̄]1 = [ẋ]1 ξ̄1. Hence x̄1+ = x̄1− + [x̄]1 = 0, and then again by (4.205) x̄(t) = 0 for all
t ∈ (t1, t2), etc. By induction, we get x̄(t) = 0 on [t0, tf], ξ̄ = 0, and then by (4.204), we get
ū(t) = 0 a.e. on [t0, tf]. Thus z̄ = (ξ̄, x̄, ū) = 0. Consequently, Ωλ(·) is positive on K.
(ii) Consider the case x̄f = 0. Then equation (4.205) and the condition x̄(tf) = 0 imply that
x̄(t) = 0 for all t ∈ (ts, tf]. Hence x̄s+ = 0. Then [x̄]s = −x̄s−. Using this condition in
(4.203), we obtain as ξ̄s − qs− [x̄]s = 0, or

( as − qs− [ẋ]s ) ξ̄s = 0,   (4.206)

because [x̄]s = [ẋ]s ξ̄s. Since bs− = as − qs− [ẋ]s > 0, condition (4.206) implies that ξ̄s = 0.
Hence [x̄]s = 0 and then x̄s− = 0. Then, by virtue of (4.205), x̄(t) = 0 on (ts−1, ts), etc.
By induction we get x̄(·) = 0, ξ̄ = 0, and then, by (4.204), ū = 0, whence z̄ = 0. Thus, we
have proved that Ωλ is positive on K. It follows that Ωλ is positive definite on K, since Ωλ is
a Legendre form.

Notes on SSC, Riccati equations, and sensitivity analysis. For regular controls, sev-
eral authors have used the Riccati equation approach to verify SSC. Maurer and Pickenhain
[73] considered optimal control problems with mixed control-state inequality constraints
and derived SSC and the associated Riccati matrix equation on the basis of Klötzler's
duality theory. Similar results were obtained by Zeidan [116, 117]. Extensions of these
results to control problems with free final time can be found in Maurer and Oberle [68].
It is well known that SSC are fundamental for the stability and sensitivity analysis
of parametric optimal control problems; cf. Malanowski and Maurer [58, 59], Augustin
and Maurer [3], Maurer and Augustin [64, 65], and Maurer and Pesch [71, 72]. SSC also
provide a firm theoretical basis for the method of determining neighboring extremals (Bryson
and Ho [12] and Pesch [101]) and for real-time control techniques (see Büskens [13] and
Büskens and Maurer [14, 15, 16]).
Chapter 5

Second-Order Optimality
Conditions in Optimal Control
Problems Linear in a Part of
Controls

In this chapter, we derive quadratic optimality conditions for optimal control problems
with a vector control variable having two components: a continuous unconstrained control
appearing nonlinearly in the control system and a control appearing linearly and belonging
to a convex polyhedron. It is assumed that the control components appearing linearly
are of bang-bang type. In Section 5.1, we obtain quadratic conditions in the problem with
continuous and bang-bang control components on a fixed time interval (that we call the main
problem). The case of a nonfixed time interval is considered in Section 5.2. In Section 5.3,
we show that, also for the mixed continuous-bang case, there exists a technique to check the
positive definiteness of the quadratic form on the critical cone via a discontinuous solution of
an associated Riccati equation with appropriate jump conditions at the discontinuity points
of the bang-bang control [98]. In Section 5.4, this technique is applied to an economic
control problem in optimal production and maintenance. We show that the numerical
solution obtained by Maurer, Kim, and Vossen [67] satisfies the second-order test derived
in this chapter, while existing sufficiency results fail to hold.

5.1 Quadratic Optimality Conditions in the Problem on a


Fixed Time Interval
In this section, we obtain a necessary quadratic condition of a Pontryagin minimum and
then show that a strengthening of this condition yields a sufficient condition of a bounded
strong minimum.

5.1.1 The Main Problem


Let x(t) ∈ Rd(x) denote the state variable, and let u(t) ∈ Rd(u) , v(t) ∈ Rd(v) denote the
control variables in the time interval t ∈ [t0 , tf ] with fixed initial time t0 and fixed final time
tf . We shall refer to the following optimal control problem (5.1)–(5.4) as the main problem:

Minimize J(x(·), u(·), v(·)) = J (x(t0 ), x(tf )) (5.1)


subject to the constraints

F (x(t0 ), x(tf )) ≤ 0, K(x(t0 ), x(tf )) = 0, (x(t0 ), x(tf )) ∈ P , (5.2)


ẋ(t) = f (t, x(t), u(t), v(t)), u(t) ∈ U , (t, x(t), v(t)) ∈ Q, (5.3)

where the control variable u appears linearly in the system dynamics,

f (t, x, u, v) = a(t, x, v) + B(t, x, v)u. (5.4)

Here, F , K, a are column vector functions, B is a d(x) × d(u) matrix function, P ⊂ R2d(x) ,
Q ⊂ R1+d(x)+d(v) are open sets, and U ⊂ Rd(u) is a convex polyhedron. The functions
J , F , K are assumed to be twice continuously differentiable on P , and the functions a, B
are twice continuously differentiable on Q. The dimensions of F , K are denoted by d(F ),
d(K). By Δ = [t0, tf] we denote the interval of control and use the abbreviations x0 = x(t0),
xf = x(tf ), p = (x0 , xf ).
A process T = {(x(t), u(t), v(t)) | t ∈ [t0 , tf ] } is said to be admissible if x(·) is ab-
solutely continuous, u(·), v(·) are measurable and bounded on Δ, and the triple of functions
(x(t), u(t), v(t)), together with the endpoints p = (x(t0 ), x(tf )), satisfies the constraints (5.2)
and (5.3). Thus, the main problem is considered in the space

W := W 1,1 ([t0 , tf ], Rd(x) ) × L∞ ([t0 , tf ], Rd(u) ) × L∞ ([t0 , tf ], Rd(v) ).

Definition 5.1. An admissible process T affords a Pontryagin minimum if for each com-
pact set C ⊂ Q there exists ε > 0 such that J(T̃) ≥ J(T) for all admissible processes
T̃ = {(x̃(t), ũ(t), ṽ(t)) | t ∈ [t0, tf]} such that (a) max_Δ |x̃(t) − x(t)| < ε, (b) ∫_Δ |ũ(t) −
u(t)| dt < ε, (c) (t, x̃(t), ṽ(t)) ∈ C a.e. on Δ.

5.1.2 First-Order Necessary Optimality Conditions


Let T = {(x(t), u(t), v(t)) | t ∈ [t0, tf]} be a fixed admissible process such that the control
u(t) is a piecewise constant function taking all its values in the vertices of the polyhedron U,
and the control v(t) is a continuous function on the interval Δ = [t0, tf]. Denote by Θ =
{t1, . . . , ts}, t0 < t1 < · · · < ts < tf, the finite set of all discontinuity points (jump points)
of the control u(t). Then ẋ(t) is a piecewise continuous function whose discontinuity
points belong to Θ, and hence x(t) is a piecewise smooth function on Δ. We use the
notation [u]k = uk+ − uk− to denote the jump of the function u(t) at the point tk ∈ Θ, where
uk− = u(tk−), uk+ = u(tk+) are the left- and right-hand values of the control u(t) at tk,
respectively. Similarly, we denote by [ẋ]k the jump of the function ẋ(t) at the point tk.
In addition, assume that the control v(t) satisfies the following Condition LΘ: there exist
constants C > 0 and ε > 0 such that for each point tk ∈ Θ we have |v(t) − v(tk)| ≤ C|t − tk|
for all t ∈ (tk − ε, tk + ε).
In the literature, one often uses the “Pontryagin minimum principle” instead of the
“Pontryagin maximum principle.” In order to pass from one principle to the other, one has
to change only the signs of the adjoint variables ψ and ψ0 . It leads to obvious changes in
the signs of the Hamiltonian, transversality conditions, Legendre conditions, and quadratic
forms. Let us introduce the Pontryagin function (or Hamiltonian)

H (t, x, u, v, ψ) = ψf (t, x, u, v) = ψa(t, x, v) + ψB(t, x, v)u, (5.5)



where ψ is a row vector of dimension d(ψ) = d(x), while x, u, f , F , and K are column
vectors. The row vector of dimension d(u),

φ(t, x, v, ψ) = ψB(t, x, v), (5.6)

will be called the switching function for the u-component of the control. Denote by l the
endpoint Lagrange function

l(p, α0 , α, β) = α0 J (p) + αF (p) + βK(p),

where α and β are row vectors with d(α) = d(F ) and d(β) = d(K), and α0 is a number. We
introduce a tuple of Lagrange multipliers λ = (α0, α, β, ψ(·)) such that ψ(·) : Δ → Rd(x) is
continuous on Δ and continuously differentiable on each interval of the set Δ \ Θ. In what
follows, we will denote first- or second-order partial derivatives by subscripts referring to
the variables.
Denote by M0 the set of the normalized tuples λ satisfying the minimum principle
conditions for the process T:

α0 ≥ 0,  α ≥ 0,  αF(p) = 0,  α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| = 1,   (5.7)
ψ̇ = −Hx  ∀ t ∈ Δ \ Θ,   (5.8)
ψ(t0) = −lx0,  ψ(tf) = lxf,   (5.9)
H(t, x(t), u, v, ψ(t)) ≥ H(t, x(t), u(t), v(t), ψ(t))
    ∀ t ∈ Δ \ Θ, u ∈ U, v ∈ Rd(v) such that (t, x(t), v) ∈ Q.   (5.10)

The derivatives lx0 and lxf are taken at the point (p, α0, α, β), where p = (x(t0), x(tf)), and
the derivative Hx is evaluated along the trajectory (t, x(t), u(t), v(t), ψ(t)), t ∈ Δ \ Θ. The
condition M0 ≠ ∅ constitutes the first-order necessary condition of a Pontryagin minimum
for the process T, which is called the Pontryagin minimum principle; cf. Pontryagin et al.
[103], Hestenes [40], and Milyutin and Osmolovskii [79]. The set M0 is a finite-dimensional
compact set, and the projector λ → (α0, α, β) is injective on M0.
In the following, it will be convenient to use the simple abbreviation (t) for indicating
all arguments (t, x(t), u(t), v(t), ψ(t)), e.g.,

H (t) = H (t, x(t), u(t), v(t), ψ(t)), φ(t) = φ(t, x(t), v(t), ψ(t)).

Let λ = (α0, α, β, ψ(·)) ∈ M0. It is well known that H(t) is a continuous function. In
particular, [H]k = H k+ − H k− = 0 holds for each tk ∈ Θ, where H k− := H(tk − 0) and
H k+ := H(tk + 0). We denote by H k the common value of H k− and H k+. For λ ∈ M0 and
tk ∈ Θ consider the function

(Δk H)(t) = H(t, x(t), uk+, v(tk), ψ(t)) − H(t, x(t), uk−, v(tk), ψ(t))
          = φ(t, x(t), v(tk), ψ(t)) [u]k.   (5.11)

Proposition 5.2. For each λ ∈ M0, the following equalities hold:

(d/dt)(Δk H)|_{t=tk−0} = (d/dt)(Δk H)|_{t=tk+0},  k = 1, . . . , s.

Consequently, for each λ ∈ M0, the function (Δk H)(t) has a derivative at each point
tk ∈ Θ. In what follows, we will consider the quantities

Dk(H) = −(d/dt)(Δk H)(tk) = −φ̇(tk±)[u]k,  k = 1, . . . , s.   (5.12)
Then the minimum condition (5.10) implies the following property.

Proposition 5.3. For each λ ∈ M0, the following conditions hold:

Dk(H) ≥ 0,  k = 1, . . . , s.

Note that the value Dk(H) can also be written in the form

Dk(H) = −Hx k+ Hψ k− + Hx k− Hψ k+ − [Ht]k = ψ̇ k+ ẋ k− − ψ̇ k− ẋ k+ + [ψ̇0]k,

where Hx k− and Hx k+ are the left-hand and right-hand values of the function Hx(t) at tk,
respectively, [Ht]k is the jump of the function Ht(t) := Ht(t, x(t), u(t), v(t), ψ(t)) at tk, etc.,
and ψ0(t) = −H(t).
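When the switching function φ(t) is available only numerically, the quantity Dk(H) = −φ̇(tk±)[u]k can be estimated by a finite difference; a small sketch with a toy switching function (all data are illustrative):

```python
def Dk_from_switching(phi, tk, du, h=1e-5):
    """Estimate D^k(H) = -phi_dot(t_k) * [u]^k by a central difference."""
    phidot = (phi(tk + h) - phi(tk - h)) / (2.0 * h)
    return -phidot * du

# toy switching function vanishing at tk = 1 with slope phi'(1) = -2
phi = lambda t: (t - 1.0) * (t - 3.0)
print(Dk_from_switching(phi, tk=1.0, du=2.0))   # -(-2)*2 = 4 > 0
```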

5.1.3 Bounded Strong Minimum


As in Section 10.1, we define essential (or main) and unessential (or complementary) state
variables in the problem. The state variable xi , i.e., the ith component of the state vector x,
is called unessential if the function f does not depend on xi and the functions F , J , and K
are affine in xi0 = xi (t0 ) and xi1 = xi (tf ); otherwise the variable xi is called essential. Let
x denote the vector of all essential components of state vector x.

Definition 5.4. The process T affords a bounded strong minimum if for each compact set
C ⊂ Q there exists ε > 0 such that J(T̃) ≥ J(T) for all admissible processes T̃ =
{(x̃(t), ũ(t), ṽ(t)) | t ∈ [t0, tf]} such that (a) |x̃(t0) − x(t0)| < ε, (b) max_Δ |x̃(t) − x(t)| < ε,
(c) (t, x̃(t), ṽ(t)) ∈ C a.e. on Δ.

The strict bounded strong minimum is defined in a similar way, with the nonstrict
inequality J(T̃) ≥ J(T) replaced by the strict one and the process T̃ required to be different
from T.

5.1.4 Critical Cone


For a given process T, we introduce the space Z2(Θ) and the critical cone K ⊂ Z2(Θ).
As in Section 2.1.5, we denote by P W 1,2(Δ, Rd(x)) the space of piecewise continuous
functions x̄(·) : Δ → Rd(x), which are absolutely continuous on each interval of the set
Δ \ Θ and have a square integrable first derivative. For each x̄ ∈ P W 1,2(Δ, Rd(x)) and for
tk ∈ Θ, we set
x̄k− = x̄(tk−),  x̄k+ = x̄(tk+),  [x̄]k = x̄k+ − x̄k−.
Let z̄ = (ξ̄, x̄, v̄), where ξ̄ ∈ Rs, x̄ ∈ P W 1,2(Δ, Rd(x)), v̄ ∈ L2(Δ, Rd(v)). Thus,
z̄ ∈ Z2(Θ) := Rs × P W 1,2(Δ, Rd(x)) × L2(Δ, Rd(v)).

For each z̄, we set


x̄0 = x̄(t0 ), x̄f = x̄(tf ), p̄ = (x̄0 , x̄f ). (5.13)
The vector p̄ is considered a column vector. Denote by

IF(p) = {i ∈ {1, . . . , d(F)} | Fi(p) = 0}

the set of indices of all active endpoint inequalities Fi(p) ≤ 0 at the point p = (x(t0), x(tf)).
Denote by K the set of all z̄ ∈ Z2(Θ) satisfying the following conditions:

J′(p)p̄ ≤ 0,  Fi′(p)p̄ ≤ 0 ∀ i ∈ IF(p),  K′(p)p̄ = 0,   (5.14)
x̄˙(t) = fx(t)x̄(t) + fv(t)v̄(t),   (5.15)
[x̄]k = [ẋ]k ξ̄k,  k = 1, . . . , s,   (5.16)

where p = (x(t0), x(tf)) and [ẋ]k = ẋ(tk + 0) − ẋ(tk − 0).


It is obvious that K is a convex cone in the Hilbert space Z2(Θ) with finitely many
faces. We call K the critical cone. Note that the variation ū(t) of the bang-bang control
u(t) vanishes in the critical cone.

5.1.5 Necessary Quadratic Optimality Conditions


Let us introduce a quadratic form on the critical cone K defined by the conditions (5.14)–
(5.16). For each λ ∈ M0 and z̄ ∈ K, we set³

Ω(λ, z̄) = ⟨lpp(p)p̄, p̄⟩ + Σ_{k=1}^{s} ( Dk(H) ξ̄k² − 2[ψ̇]k x̄av^k ξ̄k )
        + ∫_{t0}^{tf} ( ⟨Hxx(t)x̄(t), x̄(t)⟩ + 2⟨Hxv(t)v̄(t), x̄(t)⟩   (5.17)
        + ⟨Hvv(t)v̄(t), v̄(t)⟩ ) dt,

where

lpp(p) = lpp(α0, α, β, p),  p = (x(t0), x(tf)),
x̄av^k = (1/2)(x̄k− + x̄k+),
Hxx(t) = Hxx(t, x(t), u(t), v(t), ψ(t)),

etc. Note that the functional Ω(λ, z̄) is linear in λ and quadratic in z̄. The following theorem
gives the main second-order necessary condition of optimality.

Theorem 5.5. If the process T affords a Pontryagin minimum, then the following Condition
A holds: the set M0 is nonempty and max_{λ∈M0} Ω(λ, z̄) ≥ 0 for all z̄ ∈ K.

We call Condition A the necessary quadratic condition, although it is truly quadratic
only if M0 is a singleton.
3 In Part II of the book, we will not use the factor 1/2 in the definition of the quadratic form .

5.1.6 Sufficient Quadratic Optimality Conditions


A natural strengthening of the necessary Condition A turns out to be a sufficient optimality
condition not only for a Pontryagin minimum, but also for a bounded strong minimum; cf.
Definition 5.4. Denote by M0+ the set of all λ ∈ M0 satisfying the following conditions:
(a) H(t, x(t), u, v, ψ(t)) > H(t, x(t), u(t), v(t), ψ(t)) for all t ∈ Δ \ Θ, u ∈ U, v ∈ Rd(v)
such that (t, x(t), v) ∈ Q and (u, v) ≠ (u(t), v(t));
(b) H(tk, x(tk), u, v, ψ(tk)) > H k for all tk ∈ Θ, u ∈ U, v ∈ Rd(v) such that
(tk, x(tk), v) ∈ Q, (u, v) ≠ (u(tk−), v(tk)), (u, v) ≠ (u(tk+), v(tk)),
where H k := H k− = H k+.
Let Arg minũ∈U φ ũ be the set of points u ∈ U where the minimum of the linear function φ ũ
is attained.

Definition 5.6. For a given admissible process T with a piecewise constant control u(t)
and continuous control v(t), we say that u(t) is a strict bang-bang control if the set M0 is
nonempty and there exists λ ∈ M0 such that

Arg min_{ũ∈U} φ(t)ũ = [u(t−), u(t+)]  ∀ t ∈ [t0, tf],

where [u(t−), u(t+)] denotes the line segment spanned by the vectors u(t−) and u(t+).

If dim(u) = 1, then the strict bang-bang property is equivalent to

φ(t) ≠ 0  ∀ t ∈ Δ \ Θ.

It is easy to show that if the set M0+ is nonempty, then u(t) is a strict bang-bang control.

Definition 5.7. An element λ ∈ M0 is said to be strictly Legendre if the following conditions
are satisfied:
(a) for each t ∈ Δ \ Θ the quadratic form ⟨Hvv(t, x(t), u(t), v(t), ψ(t))v̄, v̄⟩
is positive definite on Rd(v);
(b) for each tk ∈ Θ the quadratic form ⟨Hvv(tk, x(tk), u(tk−), v(tk), ψ(tk))v̄, v̄⟩
is positive definite on Rd(v);
(c) for each tk ∈ Θ the quadratic form ⟨Hvv(tk, x(tk), u(tk+), v(tk), ψ(tk))v̄, v̄⟩
is positive definite on Rd(v);
(d) Dk(H) > 0 for all tk ∈ Θ.

Denote by Leg+(M0+) the set of all strictly Legendrian elements λ ∈ M0+, and set

γ̄(z̄) = ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨v̄(t), v̄(t)⟩ dt.

Theorem 5.8. Let the following Condition B be fulfilled for the process T:
(a) the set Leg+(M0+) is nonempty;
(b) there exist a nonempty compact set M ⊂ Leg+(M0+) and a number C > 0
such that max_{λ∈M} Ω(λ, z̄) ≥ C γ̄(z̄) for all z̄ ∈ K.
Then T is a strict bounded strong minimum.

Remark 5.9. If the set Leg+(M0+) is nonempty and K = {0}, then condition (b) is fulfilled
automatically. This case can be considered as a first-order sufficient optimality condition
for a strict bounded strong minimum.

As mentioned in the introduction, the proof of Theorem 5.8 is very similar to the
proof of the sufficient quadratic optimality condition for the pure bang-bang case given in
Milyutin and Osmolovskii [79, Theorem 12.4, p. 302], and based on the sufficient quadratic
optimality condition for broken extremals in the general problem of calculus of variations;
see Part I of the present book. The proofs of Theorems 5.5 and 5.8 will be given below.

5.1.7 Proofs of Quadratic Conditions in the Problem on a Fixed Time


Interval
Problem Z and its convexification with respect to bang-bang control com-
ponents. Consider the following optimal control problem, which is similar to the main
problem (5.1)–(5.4):

Minimize J(x(·), u(·), v(·)) = J (x(t0 ), x(tf )) (5.18)

subject to the constraints

F (x(t0 ), x(tf )) ≤ 0, K(x(t0 ), x(tf )) = 0, (x(t0 ), x(tf )) ∈ P , (5.19)


ẋ = a(t, x, v) + B(t, x, v)u, u ∈ U, (t, x, v) ∈ Q, (5.20)

where P ⊂ R2d(x), Q ⊂ R1+d(x)+d(v) are open sets, U ⊂ Rd(u) is a compact set, and
Δ = [t0, tf] is a fixed time interval. The functions J, F, and K are assumed to be twice
continuously differentiable on P , and the functions a and B are twice continuously differ-
entiable on Q. The compact set U is specified by

U = {u ∈ Qg | g(u) = 0}, (5.21)

where Qg ⊂ Rd(u) is an open set and g : Qg → Rd(g) is a twice continuously differentiable


function satisfying the full-rank condition

rank gu (u) = d(g) (5.22)

for all u ∈ Qg such that g(u) = 0. It follows from (5.22) that d(g) ≤ d(u), but it is possible,
in particular, that d(g) = d(u). In this latter case for U we can take any finite set of points
in Rd(u) , for example, the set of vertices of a convex polyhedron.
For brevity, we will refer to problem (5.18)–(5.20) as the problem Z. Thus, the
only difference between problem Z and the main problem (5.1)–(5.4) is that a convex
polyhedron U is replaced by an arbitrary compact set U specified by (5.21). The definitions
of the Pontryagin minimum, the bounded strong minimum, and the strict bounded strong
minimum in the problem Z are the same as in the main problem.

We will consider the problem (5.18), (5.19) not only for the control system (5.20), but
also for its convexification with respect to u:
ẋ = a(t, x, v) + B(t, x, v)u,  u ∈ co U,  (t, x, v) ∈ Q,   (5.23)
where co U is the convex hull of the compact set U. We will refer to the problem (5.18),
(5.19), (5.23) as the problem co Z.
We will be interested in the relationships between conditions for a minimum in the
problems Z and co Z. Naturally, these conditions concern a process satisfying the con-
straints of the problem Z. Let w0(·) = (x0(·), u0(·), v0(·)) be such a process. Then it
satisfies the constraints of the problem co Z as well. We assume that w0 satisfies the follow-
ing conditions: the function u0(t) is piecewise continuous with the set Θ = {t1, . . . , ts} of
discontinuity points, the control v0(t) is continuous, and each point tk ∈ Θ is an L-point of
the controls u0(t) and v0(t). The latter means that there exist constants C > 0 and ε > 0 such
that for each point tk ∈ Θ, we have
|u0(t) − u0 k−| ≤ C|t − tk| for t ∈ (tk − ε, tk),
|u0(t) − u0 k+| ≤ C|t − tk| for t ∈ (tk, tk + ε),
|v0(t) − v0(tk)| ≤ C|t − tk| for t ∈ (tk − ε, tk + ε).
We can formulate quadratic conditions for the point w0 in the problem Z. To what extent
can they be carried over to the problem co Z?
Regarding the necessary quadratic conditions (see Condition A in Theorem 3.9) this is
a simple question. If a point w 0 yields a Pontryagin minimum in the problem co Z, this point,
a fortiori, affords a Pontryagin minimum in the problem Z. Hence any necessary condition
for a Pontryagin minimum at the point w0 in the problem Z is a necessary condition for a
Pontryagin minimum at this point in the problem co Z as well. Thus we have the following
theorem.

Theorem 5.10. Condition A (given by Theorem 3.9) for the point w 0 in the problem Z is a
necessary condition for a Pontryagin minimum at this point in the problem co Z.

Bounded strong γ1 -sufficiency in problem Z and bounded strong minimum in


problem co Z. We now turn to derivation of quadratic sufficient conditions in the problem
co Z, which is a more complicated task. As we know, Condition B for the point w 0 in the
problem Z ensures a Pontryagin, and even a bounded strong, minimum in this problem.
Does it ensure a bounded strong or at least a Pontryagin minimum at the point w 0 in the
convexified problem co Z?
There are examples where the convexification results in the loss of the minimum.
The bounded strong minimum is not stable with respect to the convexification operation.
However, some stronger property in the problem Z, which is called bounded strong γ1 -
sufficiency, ensures a bounded strong minimum in the problem co Z. This property will be
defined below. For the problem Z and the point w0 (·) define the violation function
σ(w) = (J(p) − J(p0))⁺ + Σ_{i=1}^{d(F)} (Fi(p))⁺ + |K(p)| + ∫_{t0}^{tf} |ẋ(t) − f(t, x(t), u(t), v(t))| dt,    (5.24)
where f(t, x, u, v) = a(t, x, v) + B(t, x, v)u and a⁺ = max{a, 0} for a ∈ R1. For an arbitrary
variation δw = (δx, δu, δv) ∈ W , let
γ1(δw) = (‖δu‖1)²,    (5.25)

where ‖δu‖1 = ∫_{t0}^{tf} |δu(t)| dt.
Let {wn} = {(xn, un, vn)} be a bounded sequence in W. In what follows, the notation σ(wn) = o(γ1(wn − w0)) means that there exists a sequence of numbers εn → 0 such that σ(wn) = εn γ1(wn − w0) (even in the case where γ1(wn − w0) does not tend to zero).
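For a sampled process these quantities are straightforward to evaluate. The following sketch (our own illustration, not from the text; the problem data J, F, K, f are assumed to be supplied as Python callables and the process is given on a time grid t) computes the violation function (5.24) and the order (5.25):

    import numpy as np

    def sigma(t, x, u, v, J, F, K, f, J0):
        # sigma(w) = (J(p) - J(p0))^+ + sum_i (F_i(p))^+ + |K(p)|
        #            + int |xdot - f(t, x, u, v)| dt,  cf. (5.24)
        p = np.concatenate([x[0], x[-1]])              # endpoints (x(t0), x(tf))
        xdot = np.gradient(x, t, axis=0)               # slope of the sampled state
        fvals = np.array([f(ti, xi, ui, vi)
                          for ti, xi, ui, vi in zip(t, x, u, v)])
        resid = np.linalg.norm(xdot - fvals, axis=1)
        return (max(J(p) - J0, 0.0) + np.maximum(F(p), 0.0).sum()
                + np.abs(K(p)).sum() + np.trapz(resid, t))

    def gamma1(t, du):
        # gamma_1(dw) = (||du||_1)^2 with ||du||_1 = int |du(t)| dt,  cf. (5.25)
        return np.trapz(np.linalg.norm(du, axis=1), t) ** 2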

Definition 5.11. We say that the bounded strong γ1 -sufficiency holds at the point w0 in
the problem Z if there is no compact set C ⊂ Q and no sequence {wn} = {(xn, un, vn)} in W such that
σ(wn) = o(γ1(wn − w0)),   max_{t∈Δ} |x̲n(t) − x̲0(t)| → 0,   |xn(t0) − x0(t0)| → 0,

and for all n, wn ≠ w0, (t, xn(t), vn(t)) ∈ C, un(t) ∈ U a.e. on Δ, where x̲n is composed of the essential components of the vector xn.

The following proposition holds for the problem Z.

Proposition 5.12. A bounded strong γ1 -sufficiency at the point w0 implies a strict bounded
strong minimum at this point.

Proof. Let w 0 be an admissible point in the problem Z. Assume that w0 is not a point of
a strict bounded strong minimum. Then there exist a compact set C ⊂ Q and a sequence
of admissible points {wn} = {(xn, un, vn)} such that max_{t∈Δ} |x̲n(t) − x̲0(t)| → 0, |xn(t0) − x0(t0)| → 0 and, for all n,

J(pn) − J(p0) ≤ 0,   (t, xn(t), vn(t)) ∈ C,   un(t) ∈ U a.e. on Δ,   wn ≠ w0.
It follows that σ (w n ) = 0 for all n. Hence w0 is not a point of a bounded strong γ1 -sufficiency
in the problem Z.

A remarkable fact is that the convexification of the constraint u ∈ U turns a bounded


strong γ1 -sufficiency into at least a bounded strong minimum.

Theorem 5.13. Suppose that for an admissible point w 0 in the problem Z the bounded
strong γ1 -sufficiency holds. Then w 0 is a point of the strict bounded strong minimum in the
problem co Z.

The proof of Theorem 5.13 is based on the following lemma.

Lemma 5.14. Let U ⊂ Rd(u) be a bounded set, and let u(t) be a measurable function on Δ such that u(t) ∈ co U a.e. on Δ. Then there exists a sequence un(t) of measurable functions on Δ such that for every n we have un(t) ∈ U, t ∈ Δ, and

∫_{t0}^{t} un(τ) dτ → ∫_{t0}^{t} u(τ) dτ   uniformly in t ∈ Δ.    (5.26)
Moreover, if u0(t) is a bounded measurable function on Δ such that

meas{t ∈ Δ | u(t) ≠ u0(t)} > 0,    (5.27)

then (5.26) implies

lim inf_n ∫_Δ |un(t) − u0(t)| dt > 0.    (5.28)

This lemma is a consequence of Theorem 16.1 in [79, Appendix, p. 361]. In the proof
of Theorem 5.13 we will also use the following theorem, which is similar to Theorem 16.2
in [79, Appendix, p. 366].
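The approximation asserted by Lemma 5.14 is the familiar "chattering" construction: fast switching between the points of U reproduces, in the integrated sense (5.26), any control with values in co U. A minimal numerical illustration (our own, with U = {0, 1} and u ≡ 1/2) is:

    import numpy as np

    t0, tf = 0.0, 1.0
    t = np.linspace(t0, tf, 20001)
    u = np.full_like(t, 0.5)                  # u(t) in co U = [0, 1]

    def u_n(n):
        # switch between the vertices 0 and 1 on 2n subintervals of equal length
        phase = np.floor(2 * n * (t - t0) / (tf - t0)).astype(int) % 2
        return phase.astype(float)            # values in U = {0, 1}

    for n in (5, 50, 500):
        dev = np.cumsum(u_n(n) - u) * (t[1] - t[0])   # int_{t0}^t (u^n - u) dtau
        print(n, np.max(np.abs(dev)))                 # tends to 0, cf. (5.26)

Note that ∫|un − u| dt = 1/2 for every n in this example, in accordance with (5.28).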

Theorem 5.15. Let w ∗ = (x ∗ , u∗ , v ∗ ) be a triple in W satisfying the control system

ẋ = a(t, x, v) + B(t, x, v)u, (t, x, v) ∈ Q, x(t0 ) = c0 ,

where Q, a, and B are the same as in the problem Z, and c0 ∈ Rd(x) . Suppose that there is
a sequence un ∈ L∞ ([t0 , tf ], Rd(u) ) such that

sup_n ‖un‖∞ < +∞    (5.29)

and, for each t ∈ [t0, tf],

∫_{t0}^{t} un(τ) dτ → ∫_{t0}^{t} u*(τ) dτ    (n → ∞).    (5.30)

Then, for all sufficiently large n, the system

ẋ = a(t, x, v ∗ (t)) + B(t, x, v ∗ (t))un (t), (t, x, v ∗ (t)) ∈ Q, x(t0 ) = c0 (5.31)

has a unique solution xn(t) on [t0, tf] and

max_{[t0,tf]} |xn(t) − x*(t)| → 0    (n → ∞).    (5.32)

Proof. Consider the equations

ẋ n = a(t, x n , v ∗ ) + B(t, x n , v ∗ )un ,


ẋ ∗ = a(t, x ∗ , v ∗ ) + B(t, x ∗ , v ∗ )u∗ .

Their difference can be written as

δ ẋ = δa + (δB)un + B(t, x ∗ , v ∗ )δu, (5.33)

where
δx = x n − x ∗ , δu = un − u∗ ,
δa = a(t, x n , v ∗ ) − a(t, x ∗ , v ∗ ) = a(t, x ∗ + δx, v ∗ ) − a(t, x ∗ , v ∗ ),
δB = B(t, x n , v ∗ ) − B(t, x ∗ , v ∗ ) = B(t, x ∗ + δx, v ∗ ) − B(t, x ∗ , v ∗ ).
We have here
δx(t0 ) = 0. (5.34)
It follows from (5.29) and (5.30) that

lim sup_n ‖δu‖∞ < +∞,    ∫_{t0}^{t} δu(τ) dτ → 0   ∀ t ∈ [t0, tf].    (5.35)

These properties mean that the sequence {δu} converges to zero ∗-weakly (L1 -weakly) in
L∞. Therefore the functions

δy(t) := ∫_{t0}^{t} B(τ, x*(τ), v*(τ)) δu(τ) dτ

converge to zero pointwise on [t0 , tf ], and hence uniformly on [t0 , tf ] because they possess
the common Lipschitz constant. Thus


‖δy‖C → 0.    (5.36)

Using the relations


δ ẏ = B(t, x ∗ , v ∗ )δu, δy(t0 ) = 0, (5.37)
rewrite (5.33) as
δ ẋ − δ ẏ = δa + (δB)un . (5.38)
Set δz = δx − δy. Then (5.38) can be represented in the form

δ ż = δa + (δB)un ,

where

δa = a(t, x ∗ + δy + δz, v ∗ ) − a(t, x ∗ , v ∗ ), δB = B(t, x ∗ + δy + δz, v ∗ ) − B(t, x ∗ , v ∗ ).

Thus
δ ż = a(t, x ∗ + δy + δz, v ∗ ) − a(t, x ∗ , v ∗ )
+ (B(t, x ∗ + δy + δz, v ∗ ) − B(t, x ∗ , v ∗ )) un , (5.39)
δz(t0 ) = 0, (t, x ∗ + δy + δz, v ∗ ) ∈ Q.
This system is equivalent to the system (5.31) in the following sense: x n solves (5.31) iff

x n = x ∗ + δy + δz, (5.40)

where δy satisfies the conditions

δ ẏ = B(t, x ∗ , v ∗ )(un − u∗ ), δy(t0 ) = 0, (5.41)

and δz solves the system (5.39).


Hence it suffices to show that for all sufficiently large n the system (5.39) has a
unique solution, where δy is determined by (5.41). This is so because for δy ≡ 0 the system
(5.39) has a unique solution δz ≡ 0 on [t0, tf] and moreover (5.36) holds. From (5.36) and representation (5.39) it follows also that ‖δz‖C → 0 and hence ‖δx‖C ≤ ‖δy‖C + ‖δz‖C → 0. Therefore (5.32) holds.
Proof of Theorem 5.13. Suppose that w0 is not a strict bounded strong minimum point in the problem co Z. Then there exist a compact set C ⊂ Q and a sequence {wn} = {(xn, un, vn)} such that for all n one has wn ≠ w0 and

(xn(t0), xn(tf)) ∈ P,   (t, xn(t), vn(t)) ∈ C a.e. on Δ,    (5.42)

un(t) ∈ co U a.e. on Δ,    (5.43)
σ(wn) = 0,    (5.44)
max_{t∈Δ} |x̲n(t) − x̲0(t)| → 0,   |xn(t0) − x0(t0)| → 0.    (5.45)

We will show that w0 is not a point of the bounded strong γ1-sufficiency in the problem Z. If un = u0 for infinitely many terms, w0 is not even a strict bounded strong minimum point in the problem Z. Hence, we assume that un ≠ u0 for all n. Apply Lemma 5.14 to each function un. By virtue of this lemma there exists a sequence of measurable functions {unk}_{k=1}^∞ such that, for all k,

unk(t) ∈ U   ∀ t ∈ Δ,    (5.46)

∫_{t0}^{t} unk(τ) dτ → ∫_{t0}^{t} un(τ) dτ   uniformly in t ∈ Δ (as k → ∞),    (5.47)

lim inf_{k→∞} ∫_{t0}^{tf} |unk(t) − u0(t)| dt > 0.    (5.48)

For each k define x nk as the solution of the system

ẋ nk = a(t, x nk , v n ) + B(t, x nk , v n )unk , x nk (t0 ) = x n (t0 ). (5.49)

According to Theorem 5.15 this system has a solution for all sufficiently large k and

‖xnk − xn‖C → 0 as k → ∞.    (5.50)

It follows from (5.44) and (5.50) that

(J(pnk) − J(p0))⁺ + Σ_i (Fi(pnk))⁺ + |K(pnk)| → 0 as k → ∞,    (5.51)

where p nk = (x nk (t0 ), x nk (tf )) = (x n (t0 ), x nk (tf )). Combined with (5.49) this implies that

σ (wnk ) → 0 as k → ∞, (5.52)

where wnk = (xnk, unk, vn). It follows from (5.48) and (5.52) that there is a number k = k(n) such that

σ(wnk) ≤ (1/n) ( ∫_{t0}^{tf} |unk(t) − u0(t)| dt )²   ∀ k ≥ k(n).    (5.53)
By (5.50), k(n) can also be chosen so that

‖xnk(n) − xn‖C ≤ 1/n.    (5.54)
Then by virtue of (5.53), for the sequence {w nk(n) } one has

σ (wnk(n) ) = o(γ1 (w nk(n) − w0 )). (5.55)

Moreover, we have for this sequence

unk(n)(t) ∈ U   ∀ t ∈ Δ,    (5.56)

‖xnk(n) − x0‖C ≤ ‖xnk(n) − xn‖C + ‖xn − x0‖C ≤ 1/n + ‖xn − x0‖C → 0   (n → ∞),    (5.57)

|xnk(n)(t0) − x0(t0)| = |xn(t0) − x0(t0)| → 0.    (5.58)

Finally, from (5.42) and (5.54) it follows that there exists a compact set C1 such that
C ⊂ C1 ⊂ Q, and for all sufficiently large n we have

(xnk(n)(t0), xnk(n)(tf)) ∈ P,   (t, xnk(n)(t), vn(t)) ∈ C1 a.e. on Δ.    (5.59)

The existence of a compact set C1 ⊂ Q and a sequence {w nk(n) } in the space W satisfying
(5.55)–(5.59) means that the bounded strong γ1 -sufficiency fails at the point w0 in the
problem Z.

Now we can prove the following theorem.

Theorem 5.16. Condition B (given in Definition 3.15) for the point w0 in the problem Z is
a sufficient condition for a strict bounded strong minimum at this point in the problem co Z.

Proof. Assume that Condition B for the point w0 in the problem Z is satisfied. Then
by Theorem 3.16 a bounded strong γ -sufficiency holds in problem Z at the point w 0 .
The latter means (see Definition 3.11) that there is no compact set C ⊂ Q and no sequence {wn} = {(xn, un, vn)} in W such that

σ(wn) = o(γ(wn − w0)),   max_{t∈Δ} |x̲n(t) − x̲0(t)| → 0,   |xn(t0) − x0(t0)| → 0,

and for all n we have wn ≠ w0, (t, xn(t), vn(t)) ∈ C, g(un(t)) = 0, un(t) ∈ Qg a.e. on Δ,
where γ is the higher order defined in Definition 2.17.
Note that for each n the conditions g(un (t)) = 0, un (t) ∈ Qg mean that un (t) ∈ U,
where U is a compact set. It follows from Proposition 2.98 that for any compact set C ⊂ Q
there exists a constant C > 0 such that for any w = (x, u, v) ∈ W satisfying the conditions
u(t) ∈ U and (t, x(t), v(t)) ∈ C a.e. on Δ, we have γ1(w − w0) ≤ Cγ(w − w0). By virtue
of this inequality, a bounded strong γ -sufficiency at the point w 0 in problem Z implies a
bounded strong γ1 -sufficiency at w 0 in the same problem. Then by Theorem 5.13, w0 is a
point of a strict bounded strong minimum in problem co Z.

Proofs of Theorems 5.5 and 5.8. Consider the main problem again:

J (p) → min, F (p) ≤ 0, K(p) = 0, p := (x(t0 ), x(tf )) ∈ P ,


ẋ = a(t, x, v) + B(t, x, v)u, u ∈ U , (t, x, v) ∈ Q,
where U is a convex polyhedron. Let U be the set of vertices of U . Consider an admis-


sible process w 0 = (x 0 , u0 , v 0 ) ∈ W in the main problem. Assume that the control u0 is a
piecewise constant function on Δ taking all its values in the vertices of U, i.e.,

u0(t) ∈ U,   t ∈ Δ.    (5.60)

As usual we denote by Θ = {t1, . . . , ts} the set of discontinuity points (switching points) of the control u0. Assume that the control v0(t) is a continuous function on Δ satisfying the Condition Lθ (see Section 5.1.2). By virtue of condition (5.60) the process w0 = (x0, u0, v0)
is also admissible in the problem
J (p) → min, F (p) ≤ 0, K(p) = 0, p ∈ P , (5.61)
ẋ(t) = a(t, x, v) + B(t, x, v)u, u(t) ∈ U, (t, x(t), v(t)) ∈ Q. (5.62)
Since U = co U, the main problem can be viewed as a convexification of problem (5.61),
(5.62) with respect to u.
It is easy to see that, in problem (5.61), (5.62), we can use the results of Sections
3.1.5 and 3.2.3 and formulate both necessary and sufficient optimality conditions for w 0
(see Theorems 3.9 and 3.16, respectively). Indeed, let ui , i = 1, . . . , m, be the vertices
of the polyhedron U , i.e., U = {u1 , . . . , um }. Let Q(ui ), i = 1, . . . , m, be disjoint open
neighborhoods of the vertices ui ∈ U. Set

Qg = ∪_{i=1}^{m} Q(ui).

Define the function g(u) : Qg → Rd(u) as follows: on each set Q(ui) ⊂ Qg, i = 1, . . . , m, let g(u) = u − ui. Then g(u) is a function of class C∞ on Qg specifying the set of vertices of U, i.e., U = {u ∈ Qg | g(u) = 0}. Moreover, g′(u) = I for all u ∈ Qg, where I is the identity matrix of order d(u). Hence, the full-rank condition (3.3) is fulfilled. (This very simple but somewhat unexpected way of using the equality constraint g(u) = 0 in a problem with a control constraint specified by a polyhedron U is due to Milyutin.) Thus, the
problem (5.61), (5.62) can be represented as
J (p) → min, F (p) ≤ 0, K(p) = 0, p ∈ P , (5.63)
ẋ(t) = a(t, x, v) + B(t, x, v)u, (5.64)
g(u) = 0, u ∈ Qg , (t, x(t), v(t)) ∈ Q. (5.65)
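Milyutin's construction is easy to realize computationally; a small sketch (our own illustration, with the neighborhoods Q(ui) taken as balls of a fixed radius around the vertices) is:

    import numpy as np

    def g(u, vertices, radius):
        # g(u) = u - u^i on the neighborhood Q(u^i) of the nearest vertex u^i
        i = int(np.argmin([np.linalg.norm(u - vi) for vi in vertices]))
        assert np.linalg.norm(u - vertices[i]) < radius   # u must lie in Q_g
        return u - vertices[i]                            # g'(u) = I on Q(u^i)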
Let us formulate the quadratic conditions of Sections 3.1.5 and 3.2.3 for the point w 0 in this
problem. Put l = α0 J + αF + βK, H = ψf, H̄ = H + νg, where f = a + Bu. The set M0
(cf. (3.12)–(3.15)) consists of tuples
λ = (α0 , α, β, ψ, ν) (5.66)
such that
α0 ∈ R1,   α ∈ Rd(F),   β ∈ Rd(K),
ψ ∈ W1,1(Δ, Rd(x)),   ν ∈ L∞(Δ, Rd(u)),
α0 ≥ 0,   α ≥ 0,   αF(p0) = 0,   α0 + |α| + |β| = 1,    (5.67)
−ψ̇ = H̄x,   ψ(t0) = −lx0,   ψ(tf) = lxf,
H(t, x0(t), u, v, ψ(t)) ≥ H(t, x0(t), u0(t), v0(t), ψ(t))
∀ t ∈ Δ \ Θ, u ∈ U, v ∈ Rd(v) such that (t, x0(t), v) ∈ Q.
The last inequality implies the conditions of a local minimum principle with respect to u
and v:

ψfu(t, x0(t), u0(t), v0(t)) + ν(t)g′(u0(t)) = 0,   ψfv(t, x0(t), u0(t), v0(t)) = 0.

Since g′(u) = I, the first equality uniquely determines the multiplier ν(t).
Note that H̄x = Hx . Therefore conditions (5.67) are equivalent to the minimum
principle conditions (5.7)–(5.10). More precisely, the linear projection

(α0 , α, β, ψ, ν) → (α0 , α, β, ψ)

yields a one-to-one correspondence between the elements of the set (5.67) and the elements
of the set (5.7)–(5.10).
Consider the critical cone K for the point w0 (see Definition 3.5). To the constraint
g(u) = 0, u ∈ Qg there corresponds the condition g′(u0)ū = 0. But g′ = I. Hence ū = 0. This condition implies that the critical cone K can be identified with the set of triples z̄ = (ξ̄, x̄, v̄) ∈ Z2(Θ) such that conditions (5.14)–(5.16) are fulfilled.
The definition of the set M0+ in Section 5.1.6 corresponds to the definition of this set
in Section 3.2.3. The same is true for the set Leg+(M0+) (again, due to the equality g′ = I).
Further, for the point w0 we write the quadratic form Ω (see formula (3.57)), where we set ū = 0. Since H̄x = Hx and H̄ψ = Hψ, we have [H̄x]k = [Hx]k, Dk(H̄) = Dk(H), k = 1, . . . , s. Moreover, H̄xx = Hxx, H̄xv = Hxv, H̄vv = Hvv. Combined with the condition ū = 0 this implies

⟨H̄ww w̄, w̄⟩ = ⟨Hxx x̄, x̄⟩ + 2⟨Hxv v̄, x̄⟩ + ⟨Hvv v̄, v̄⟩.    (5.68)
Thus, in view of the condition ū = 0 the quadratic form Ω becomes defined by formula (5.17). Now, Theorem 5.5 easily follows from Theorem 5.10, and similarly Theorem 5.8 becomes a simple consequence of Theorem 5.16.

5.2 Quadratic Optimality Conditions in the Problem on a


Variable Time Interval
5.2.1 Optimal Control Problem on a Variable Time Interval
Let x(t) ∈ Rd(x) denote the state variable, and let u(t) ∈ Rd(u) , v(t) ∈ Rd(v) be the two types
of control variables in the time interval t ∈ [t0 , tf ] with a nonfixed initial time t0 and final
time tf . The following optimal control problem (5.69)–(5.72) will be referred to as the
general problem linear in a part of controls:

Minimize J(t0 , tf , x(·), u(·), v(·)) = J (t0 , x(t0 ), tf , x(tf )) (5.69)

subject to the constraints

ẋ(t) = f (t, x(t), u(t), v(t)), u(t) ∈ U , (t, x(t), v(t)) ∈ Q, (5.70)
F (t0 , x(t0 ), tf , x(tf )) ≤ 0, K(t0 , x(t0 ), tf , x(tf )) = 0,
(5.71)
(t0 , x(t0 ), tf , x(tf )) ∈ P .
The control variable u appears linearly in the system dynamics,

f (t, x, u, v) = a(t, x, v) + B(t, x, v)u, (5.72)

whereas the control variable v appears nonlinearly in the dynamics. Here, F , K, and a
are column vector functions, B is a d(x) × d(u) matrix function, P ⊂ R2+2d(x) and Q ⊂
R1+d(x)+d(v) are open sets, and U ⊂ Rd(u) is a convex polyhedron. The functions J , F , and
K are assumed to be twice continuously differentiable on P and the functions a and B are
twice continuously differentiable on Q. The dimensions of F and K are denoted by d(F )
and d(K). By Δ = [t0, tf] we denote the interval of control. We shall use the abbreviations x0 = x(t0), xf = x(tf), and p = (t0, x0, tf, xf). A process T = {(x(t), u(t), v(t)) | t ∈ [t0, tf]} is said to be admissible if x(·) is absolutely continuous, u(·), v(·) are measurable and bounded on Δ = [t0, tf], and the triple of functions (x(t), u(t), v(t)) together with the endpoints p =
(t0 , x(t0 ), tf , x(tf )) satisfies the constraints (5.70) and (5.71).

Definition 5.17. The process T affords a Pontryagin minimum if there is no sequence of


admissible processes T n = {(x n (t), un (t), v n (t)) | t ∈ [t0n , tfn ] }, n = 1, 2, . . . , such that the
following properties hold with Δn = [t0n, tfn]:
(a) J(T n) < J(T ) for all n and t0n → t0, tfn → tf for n → ∞;
(b) max_{Δn∩Δ} |xn(t) − x(t)| → 0 for n → ∞;
(c) ∫_{Δn∩Δ} |un(t) − u(t)| dt → 0, ∫_{Δn∩Δ} |vn(t) − v(t)| dt → 0 for n → ∞;
(d) there exists a compact set C ⊂ Q (which depends on the choice of the sequence) such that for all sufficiently large n, we have (t, xn(t), vn(t)) ∈ C a.e. on Δn.

For convenience, let us formulate an equivalent definition of the Pontryagin minimum.

Definition 5.18. The process T affords a Pontryagin minimum if for each compact set
C ⊂ Q there exists ε > 0 such that J(T˜ ) ≥ J(T ) for all admissible processes T˜ =
{(x̃(t), ũ(t), ṽ(t)) | t ∈ [t˜0 , t˜f ] } such that
(a) |t˜0 − t0 | < ε, |t˜f − tf | < ε;
(b) max_{Δ∩Δ̃} |x̃(t) − x(t)| < ε, where Δ̃ = [t̃0, t̃f];
(c) ∫_{Δ∩Δ̃} |ũ(t) − u(t)| dt < ε, ∫_{Δ∩Δ̃} |ṽ(t) − v(t)| dt < ε;
(d) (t, x̃(t), ṽ(t)) ∈ C a.e. on Δ̃.

5.2.2 First-Order Necessary Optimality Conditions


Let T = {(x(t), u(t), v(t)) | t ∈ [t0, tf]} be a fixed admissible process such that the control u(t) is a piecewise constant function taking all its values in the vertices of the polyhedron U, and the control v(t) is a Lipschitz continuous function on the interval Δ = [t0, tf]. Denote by Θ = {t1, . . . , ts}, t0 < t1 < · · · < ts < tf, the finite set of all discontinuity points (jump points) of the control u(t). Then ẋ(t) is a piecewise continuous function whose points of discontinuity belong to Θ, and hence x(t) is a piecewise smooth function on Δ.
Let us formulate a first-order necessary condition for optimality of the process T
in the form of the Pontryagin minimum principle. As in Section 5.1.2, we introduce the
Pontryagin function (or Hamiltonian)


H (t, x, u, v, ψ) = ψf (t, x, u, v) = ψa(t, x, v) + ψB(t, x, v)u, (5.73)
where ψ is a row vector of dimension d(ψ) = d(x), while x, u, f , F , and K are column
vectors; the factor of the control u in the Pontryagin function is called the switching func-
tion for the u-component
φ(t, x, v, ψ) = ψB(t, x, v) (5.74)
which is a row vector of dimension d(u). We also introduce the endpoint Lagrange function
l(p, α0 , α, β) = α0 J (p) + αF (p) + βK(p), p = (t0 , x0 , tf , xf ),
where α and β are row vectors with d(α) = d(F ) and d(β) = d(K), and α0 is a number. We
introduce a tuple of Lagrange multipliers λ = (α0, α, β, ψ(·), ψ0(·)) such that ψ(·) : Δ → (Rd(x))∗, ψ0(·) : Δ → R1 are continuous on Δ and continuously differentiable on each interval of the set Δ \ Θ. As usual, we denote first- or second-order partial derivatives by
subscripts referring to the variables.
Denote by M0 the set of the normalized tuples λ satisfying the minimum principle
conditions for the process T :

α0 ≥ 0,   α ≥ 0,   αF(p) = 0,   α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| = 1,    (5.75)

ψ̇ = −Hx,   ψ̇0 = −Ht   ∀ t ∈ Δ \ Θ,    (5.76)


ψ(t0) = −lx0,   ψ(tf) = lxf,   ψ0(t0) = −lt0,   ψ0(tf) = ltf,    (5.77)
H(t, x(t), u, v, ψ(t)) ≥ H(t, x(t), u(t), v(t), ψ(t))
∀ t ∈ Δ \ Θ, u ∈ U, v ∈ Rd(v) such that (t, x(t), v) ∈ Q,    (5.78)
H(t, x(t), u(t), v(t), ψ(t)) + ψ0(t) = 0   ∀ t ∈ Δ \ Θ.    (5.79)
The derivatives lx0 and lxf are taken at the point (p, α0 , α, β), where
p = (t0 , x(t0 ), tf , x(tf )),
while the derivatives Hx, Ht are evaluated at the point (t, x(t), u(t), v(t), ψ(t)) for t ∈ Δ \ Θ. The condition M0 ≠ ∅ constitutes the first-order necessary condition for a Pontryagin minimum of the process T, which is the Pontryagin minimum principle.

Theorem 5.19. If the process T affords a Pontryagin minimum, then the set M0 is
nonempty. The set M0 is a finite-dimensional compact set and the projector λ → (α0 , α, β)
is injective on M0 .

Again we use the simple abbreviation (t) for indicating all arguments (t, x(t), u(t), v(t), ψ(t)). Let λ = (α0, α, β, ψ(·), ψ0(·)) ∈ M0. From condition (5.79) it follows that H(t) is a continuous function. In particular, we have [H]k = Hk+ − Hk− = 0 for each tk ∈ Θ, where

Hk− := H(tk, x(tk), u(tk−), v(tk), ψ(tk)),   Hk+ := H(tk, x(tk), u(tk+), v(tk), ψ(tk)).
We denote by Hk the common value of Hk− and Hk+. For λ ∈ M0 and tk ∈ Θ we consider the function

(Δk H)(t) = H(t, x(t), uk+, v(tk), ψ(t)) − H(t, x(t), uk−, v(tk), ψ(t))
          = φ(t, x(t), v(tk), ψ(t)) [u]k.    (5.80)

For this function, Propositions 5.2 and 5.3 hold, so that for each λ ∈ M0 we have Dk(H) ≥ 0, k = 1, . . . , s, where

Dk(H) := −(d/dt)(Δk H)(tk) = −φ̇(tk±)[u]k
       = −Hxk+ Hψk− + Hxk− Hψk+ − [Ht]k = ψ̇k+ ẋk− − ψ̇k− ẋk+ + [ψ̇0]k,

where Hxk− and Hxk+ are the left- and right-hand values of the function Hx (t) at tk , respec-
tively, [Ht ]k is the jump of the function Ht (t) at tk , etc.

5.2.3 Bounded Strong Minimum


As in the case of a fixed time interval we give the following definitions. The state variable
xi , i.e., the ith component of the state vector x, is called unessential if the function f does
not depend on xi and if the functions F , J , and K are affine in xi0 = xi (t0 ) and xi1 = xi (tf ).
We denote by x̲ the vector of all essential components of the state vector x.

Definition 5.20. We say that the process T affords a bounded strong minimum if there is
no sequence of admissible processes T n = {(x n (t), un (t), v n (t)) | t ∈ [t0n , tfn ] }, n = 1, 2, . . .,
such that
(a) J(T n ) < J(T );
(b) t0n → t0 , tfn → tf , x n (t0 ) → x(t0 ) (n → ∞);
(c) max_{Δn∩Δ} |x̲n(t) − x̲(t)| → 0 (n → ∞), where Δn = [t0n, tfn];
(d) there exists a compact set C ⊂ Q (which depends on the choice of the sequence) such that for all sufficiently large n we have (t, xn(t), vn(t)) ∈ C a.e. on Δn.

An equivalent definition has the following form.

Definition 5.21. The process T affords a bounded strong minimum if for each compact
set C ⊂ Q there exists ε > 0 such that J(T˜ ) ≥ J(T ) for all admissible processes T˜ =
{(x̃(t), ũ(t), ṽ(t)) | t ∈ [t˜0 , t˜f ] } such that
(a) |t˜0 − t0 | < ε, |t˜f − tf | < ε, |x̃(t0 ) − x(t0 )| < ε;
(b) max_{Δ∩Δ̃} |x̃(t) − x(t)| < ε, where Δ̃ = [t̃0, t̃f];
(c) (t, x̃(t), ṽ(t)) ∈ C a.e. on Δ̃.

The strict bounded strong minimum is defined in a similar way, with the nonstrict
inequality J(T˜ ) ≥ J(T ) replaced by the strict one and the process T˜ required to be dif-
ferent from T . Below, we shall formulate a quadratic necessary optimality condition of a
Pontryagin minimum (Definition 5.17) for given control process T . A strengthening of this
quadratic condition yields a quadratic sufficient condition of a bounded strong minimum
(Definition 5.20).
5.2.4 Critical Cone


For a given process T we introduce the space Z2(Θ) and the critical cone K ⊂ Z2(Θ). Let z̄ = (t̄0, t̄f, ξ̄, x̄, v̄), where t̄0, t̄f ∈ R1, ξ̄ ∈ Rs, x̄ ∈ PW1,2(Δ, Rd(x)), v̄ ∈ L2(Δ, Rd(v)). Thus,

z̄ ∈ Z2(Θ) := R2 × Rs × PW1,2(Δ, Rd(x)) × L2(Δ, Rd(v)).
For each z̄, we set

x̄¯0 = x̄(t0 ) + t¯0 ẋ(t0 ), x̄¯f = x̄(tf ) + t¯f ẋ(tf ), p̄¯ = (t¯0 , x̄¯0 , t¯f , x̄¯f ). (5.81)

The vector p̄¯ is considered a column vector. Note that t¯0 = 0, respectively, t¯f = 0, holds
for a fixed initial time t0 , respectively, final time tf . Denote by IF (p) = {i ∈ {1, . . . , d(F )} |
Fi (p) = 0} the set of indices of all active endpoint inequalities Fi (p) ≤ 0 at the point
p = (t0, x(t0), tf, x(tf)). Denote by K the set of all z̄ ∈ Z2(Θ) satisfying the following
conditions:

J′(p)p̄¯ ≤ 0,   Fi′(p)p̄¯ ≤ 0 ∀ i ∈ IF(p),   K′(p)p̄¯ = 0,    (5.82)

x̄˙(t) = fx(t, x(t), u(t), v(t))x̄(t) + fv(t, x(t), u(t), v(t))v̄(t),    (5.83)
[x̄]k = [ẋ]k ξ̄k,   k = 1, . . . , s,    (5.84)

where p = (t0, x(t0), tf, x(tf)), [ẋ]k = ẋ(tk+) − ẋ(tk−). It is obvious that K is a convex cone with finitely many faces in the space Z2(Θ). The cone K is called the critical cone.

5.2.5 Necessary Quadratic Optimality Conditions


Let us introduce a quadratic form on the critical cone K defined by the conditions (5.82)–
(5.84). For each λ ∈ M0 and z̄ ∈ K, we set

Ω(λ, z̄) = ωe(λ, z̄) + Σ_{k=1}^{s} ( Dk(H)ξ̄k² − 2[ψ̇]k x̄avk ξ̄k )
         + ∫_{t0}^{tf} ( ⟨Hxx(t)x̄(t), x̄(t)⟩ + 2⟨Hxv(t)v̄(t), x̄(t)⟩ + ⟨Hvv(t)v̄(t), v̄(t)⟩ ) dt,    (5.85)

where

ωe(λ, z̄) = ⟨lpp p̄¯, p̄¯⟩ − 2ψ̇(tf)x̄(tf)t̄f − (ψ̇(tf)ẋ(tf) + ψ̇0(tf))t̄f²
         + 2ψ̇(t0)x̄(t0)t̄0 + (ψ̇(t0)ẋ(t0) + ψ̇0(t0))t̄0²,    (5.86)

lpp = lpp(p, α0, α, β),   p = (t0, x(t0), tf, x(tf)),   x̄avk = ½(x̄k− + x̄k+),
Hxx(t) = Hxx(t, x(t), u(t), v(t), ψ(t)), etc.
Note that for a problem on a fixed time interval [t0, tf] we have t̄0 = t̄f = 0 and, hence, the quadratic form (5.86) reduces to ⟨lpp p̄¯, p̄¯⟩. The following theorem gives the main
second-order necessary condition of optimality.
Theorem 5.22. If the process T affords a Pontryagin minimum, then the following Condi-
tion A holds: The set M0 is nonempty and

max_{λ∈M0} Ω(λ, z̄) ≥ 0   ∀ z̄ ∈ K.

We call Condition A the necessary quadratic condition.

5.2.6 Sufficient Quadratic Optimality Conditions


A natural strengthening of the necessary Condition A turns out to be a sufficient optimality
condition not only for a Pontryagin minimum, but also for a bounded strong minimum; cf.
Definition 5.20. Denote by M0+ the set of all λ ∈ M0 satisfying the following conditions:
(a) H(t, x(t), u, v, ψ(t)) > H(t, x(t), u(t), v(t), ψ(t)) for all t ∈ Δ \ Θ, u ∈ U, v ∈ Rd(v) such that (t, x(t), v) ∈ Q and (u, v) ≠ (u(t), v(t));
(b) H(tk, x(tk), u, v, ψ(tk)) > Hk for all tk ∈ Θ, u ∈ U, v ∈ Rd(v) such that (tk, x(tk), v) ∈ Q, (u, v) ≠ (u(tk−), v(tk)), (u, v) ≠ (u(tk+), v(tk)), where Hk := Hk− = Hk+.

Definition 5.23. An element λ ∈ M0 is said to be strictly Legendre if the following conditions


are satisfied:
(a) For each t ∈ Δ \ Θ the quadratic form ⟨Hvv(t, x(t), u(t), v(t), ψ(t))v̄, v̄⟩ is positive definite in Rd(v);
(b) for each tk ∈ Θ the quadratic form ⟨Hvv(tk, x(tk), u(tk−), v(tk), ψ(tk))v̄, v̄⟩ is positive definite in Rd(v);
(c) for each tk ∈ Θ the quadratic form ⟨Hvv(tk, x(tk), u(tk+), v(tk), ψ(tk))v̄, v̄⟩ is positive definite in Rd(v);
(d) Dk(H) > 0 for all tk ∈ Θ.

Denote by Leg+(M0+) the set of all strictly Legendrian elements λ ∈ M0+ and set

γ̄(z̄) = t̄0² + t̄f² + ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩ + ∫_{t0}^{tf} ⟨v̄(t), v̄(t)⟩ dt.

Theorem 5.24. Let the following Condition B be fulfilled for the process T :
(a) The set Leg+(M0+) is nonempty;
(b) there exist a nonempty compact set M ⊂ Leg+(M0+) and a number C > 0 such that max_{λ∈M} Ω(λ, z̄) ≥ C γ̄(z̄) for all z̄ ∈ K.
Then T affords a strict bounded strong minimum.

If the set Leg+ (M0+ ) is nonempty and K = {0}, then (b) is fulfilled automatically.
This is a first-order sufficient optimality condition of a strict bounded strong minimum. Let
us emphasize that there is no gap between the necessary Condition A and the sufficient
Condition B.
5.2.7 Proofs
Let us consider the following problem on a variable time interval which is similar to the
general problem (5.69)–(5.71):

Minimize J(t0 , tf , x(·), u(·), v(·)) = J (t0 , x(t0 ), tf , x(tf )) (5.87)

subject to the constraints

ẋ(t) = f (t, x(t), u(t), v(t)), u(t) ∈ U, (t, x(t), v(t)) ∈ Q, (5.88)
F (t0 , x(t0 ), tf , x(tf )) ≤ 0, K(t0 , x(t0 ), tf , x(tf )) = 0,
(5.89)
(t0 , x(t0 ), tf , x(tf )) ∈ P .

The control variable u appears linearly in the system dynamics,

f (t, x, u, v) = a(t, x, v) + B(t, x, v)u, (5.90)

whereas the control variable v appears nonlinearly. Here, F , K, and a are column vector
functions, B is a d(x) × d(u) matrix function, P ⊂ R2+2d(x) and Q ⊂ R1+d(x)+d(v) are
open sets, and U ⊂ Rd(u) is a compact set. The functions J , F , and K are assumed to be
twice continuously differentiable on P and the functions a and B are twice continuously
differentiable on Q. The dimensions of F and K are denoted by d(F ) and d(K). By
Δ = [t0, tf] we denote the interval of control. The compact set U is specified by

U = {u ∈ Qg | g(u) = 0}, (5.91)

where Qg ⊂ Rd(u) is an open set and g : Qg → Rd(g) is a twice continuously differentiable


function satisfying the full-rank condition

rank gu (u) = d(g) (5.92)

for all u ∈ Qg such that g(u) = 0.


We refer to the problem (5.87)–(5.89) as the problem A. Along with this problem we
treat its convexification co A with respect to u in which the constraint u ∈ U is replaced by
the constraint u ∈ co U:

ẋ = a(t, x, v) + B(t, x, v)u, u ∈ co U, (t, x, v) ∈ Q. (5.93)

Thus co A is the problem (5.87), (5.89), (5.93).


Let T = (x(t), u(t), v(t) | t ∈ [t0, tf]) be an admissible trajectory in the problem A such that u(t) is a piecewise Lipschitz continuous function on the interval Δ = [t0, tf], with the set of discontinuity points Θ = {t1, . . . , ts}, and the control v(t) is Lipschitz con-
tinuous on the same interval. For the trajectory T we deal with the same question as in
Section 5.1.7: What is the relationship between quadratic conditions in the problems A
and co A? As in Section 5.1.7, for necessary quadratic conditions this question is simple to
answer: A Pontryagin minimum in the problem co A implies a Pontryagin minimum in the
problem A, and hence Theorem 3.31 implies the following assertion.

Theorem 5.25. Condition A for a trajectory T in the problem A is a necessary condition


for a Pontryagin minimum at this trajectory in the problem co A.
Consider now the same question for sufficient quadratic conditions. It can be solved
with the aid of Theorem 5.16 obtained for the problem Z on a fixed time interval, but first
we have to make a simple time change. Namely, with the admissible trajectory T in the
problem A we associate the trajectory
 
T τ = z(τ ), t(τ ), x(τ ), u(τ ), v(τ ) | τ ∈ [τ0 , τf ] ,

where τ0 = t0 , τf = tf , t(τ ) ≡ τ , z(τ ) ≡ 1. This is an admissible trajectory in the problem


Aτ specified by conditions

J(T τ ) := J (t(τ0 ), x(τ0 ), t(τf ), x(τf )) → min (5.94)

subject to the constraints

F (t(τ0 ), x(τ0 ), t(τf ), x(τf )) ≤ 0, K(t(τ0 ), x(τ0 ), t(τf ), x(τf )) = 0, (5.95)


dx(τ)/dτ = z(τ)( a(t(τ), x(τ), v(τ)) + B(t(τ), x(τ), v(τ))u(τ) ),    (5.96)
dt(τ)/dτ = z(τ),    (5.97)
dz(τ)/dτ = 0,⁴    (5.98)

(t(τ0), x(τ0), t(τf), x(τf)) ∈ P,   (t(τ), x(τ), v(τ)) ∈ Q,   u(τ) ∈ U.    (5.99)

The interval [τ0 , τf ] in the problem Aτ is fixed.


Consider also the problem co Aτ differing from Aτ by the constraint u ∈ U replaced
with the constraint u ∈ co U. We have the following chain of implications:
Condition B for the trajectory T in the problem A.
=⇒ Condition B for the trajectory T τ in the problem Aτ.
=⇒ A strict bounded strong minimum is attained on the trajectory T τ in the
problem co Aτ.
=⇒ A strict bounded strong minimum is attained on the trajectory T in the
problem co A.
The first implication was proved in Section 3.3.4 for the problems P and P τ, which are more general than A and Aτ, respectively; the second follows from Theorem 5.16, and the third is readily verified. Thus we obtain the following theorem.

Theorem 5.26. Condition B for an admissible trajectory T in the problem A is a sufficient condition for a strict bounded strong minimum in the problem co A.

Now, recall that the representation of the set of vertices of polyhedron U in the form of
equality-type constraint g(u) = 0, u ∈ Qg allowed us to consider the main problem (5.1)–
(5.4) as a special case of the problem co Z (5.18), (5.19), (5.23) and thus to obtain both
necessary Condition A and sufficient Condition B in the main problem as the consequences
of these conditions in problem co Z; more precisely, we have shown that Theorems 5.5 and
4 Note that the function z(τ ) in problem Aτ corresponds to the function v(τ ) in problem P τ .
5.8 follow from Theorems 5.10 and 5.16, respectively (see the proofs of Theorems 5.5 and
5.8). Similarly, this representation allows us to consider the general problem on a variable
time interval (5.69)–(5.72) as a special case of the problem co A and thus to obtain Theorems
5.22 and 5.24 as the consequences of Theorems 5.25 and 5.26, respectively.

5.3 Riccati Approach


The following question suggests itself from a numerical point of view: How does a numerical
check of the quadratic sufficient optimality conditions in Theorem 5.8 look? For simplicity,
we shall assume that
(a) the initial value x(t0 ) is fixed,
(b) there are no endpoint constraints of inequality type, and
(c) the time interval Δ = [t0, tf] is fixed.
Thus, we consider the following control problem:
Minimize J (x(tf ))
under the constraints
x(t0 ) = x0 , K(x(tf )) = 0, ẋ = f (t, x, u, v), u ∈ U,
where
f (t, x, u, v) = a(t, x, v) + B(t, x, v)u,
U ⊂ Rd(u) is a convex polyhedron, and J , K, a, and B are C 2 -functions.
Let w = (x, u, v) be a fixed admissible process satisfying the assumptions of Sec-
tion 5.1.2 (consequently, the function u(t) is piecewise constant and the function v(t) is
continuous). We also assume, for this process, that the set M0 is nonempty and there exists
λ ∈ M0 such that α0 > 0; let us fix this element λ. Here we set again Θ = {t1, t2, . . . , ts}, where the tk denote the discontinuity points of the bang-bang control u(t). Let n = d(x).

5.3.1 Critical Cone K and Quadratic Form Ω


In the considered case, the critical cone is a subspace defined by the relations
x̄(t0) = 0,   K′(x(tf))x̄(tf) = 0,
x̄˙(t) = fx(t)x̄(t) + fv(t)v̄(t),   [x̄]k = [ẋ]k ξ̄k,   k = 1, . . . , s.

These relations imply that J′(x(tf))x̄(tf) = 0 since α0 > 0. The quadratic form is given by

Ω(λ, z̄) = ⟨lxf xf (x(tf))x̄f, x̄f⟩ + Σ_{k=1}^{s} ( Dk(H)ξ̄k² − 2[ψ̇]k x̄avk ξ̄k )
         + ∫_{t0}^{tf} ( ⟨Hxx(t)x̄(t), x̄(t)⟩ + 2⟨Hxv(t)v̄(t), x̄(t)⟩ + ⟨Hvv(t)v̄(t), v̄(t)⟩ ) dt,

where, by definition, x̄f = x̄(tf). We assume that Dk(H) > 0, k = 1, . . . , s, and that the strengthened Legendre (SL) condition with respect to v is satisfied:

⟨Hvv(t)v̄, v̄⟩ ≥ c⟨v̄, v̄⟩   ∀ v̄ ∈ Rd(v), ∀ t ∈ [t0, tf] \ Θ   (c > 0).
5.3.2 Q-Transformation of Ω on K
Let Q(t) be a symmetric n × n matrix on [t0, tf] with piecewise continuous entries which are absolutely continuous on each interval of the set [t0, tf] \ Θ. For each z̄ ∈ K we obvi-
ously have
∫_{t0}^{tf} (d/dt)⟨Qx̄, x̄⟩ dt = ⟨Qx̄, x̄⟩ |_{t0}^{tf} − Σ_{k=1}^{s} [⟨Qx̄, x̄⟩]k,    (5.100)

where [⟨Qx̄, x̄⟩]k is the jump of the function ⟨Qx̄, x̄⟩ at the point tk ∈ Θ. Using the equation x̄˙ = fx x̄ + fv v̄ and the initial condition x̄(t0) = 0, we obtain



−⟨Q(tf)x̄f, x̄f⟩ + Σ_{k=1}^{s} [⟨Qx̄, x̄⟩]k + ∫_{t0}^{tf} ( ⟨Q̇x̄, x̄⟩ + ⟨Q(fx x̄ + fv v̄), x̄⟩ + ⟨Qx̄, fx x̄ + fv v̄⟩ ) dt = 0.

Adding this zero term to the form Ω(λ, z̄), we get

Ω(λ, z̄) = ⟨(lxf xf − Q(tf))x̄f, x̄f⟩ + Σ_{k=1}^{s} ( Dk(H)ξ̄k² − 2[ψ̇]k x̄avk ξ̄k + [⟨Qx̄, x̄⟩]k )
         + ∫_{t0}^{tf} ( ⟨(Hxx + Q̇ + Qfx + fx∗Q)x̄, x̄⟩ + ⟨(Hxv + Qfv)v̄, x̄⟩
         + ⟨(Hvx + fv∗Q)x̄, v̄⟩ + ⟨Hvv(t)v̄(t), v̄(t)⟩ ) dt.

We call this formula the Q-transformation of Ω on K.

5.3.3 Transformation of Ω on K to Perfect Squares


In order to transform the integral term in Ω(λ, z̄) into a perfect square, we assume that Q(t) satisfies the following matrix Riccati equation (cf. equation (4.189)):

Q̇ + Qfx + fx∗Q + Hxx − (Hxv + Qfv)Hvv⁻¹(Hvx + fv∗Q) = 0.
Then the integral term in Ω can be written as

∫_{t0}^{tf} ⟨Hvv⁻¹h̄, h̄⟩ dt,   where h̄ = (Hvx + fv∗Q)x̄ + Hvv v̄.

As we know, the terms


ωk := Dk(H)ξ̄k² − 2[ψ̇]k x̄avk ξ̄k + [⟨Qx̄, x̄⟩]k
can also be transformed into perfect squares if the matrix Q(t) satisfies a special jump
condition at each point tk ∈ Θ. This jump condition was obtained in Chapter 4. Namely,
for each k = 1, . . . , s put
Qk− = Q(tk −), Qk+ = Q(tk +), [Q]k = Qk+ − Qk− , (5.101)
qk− = ([ẋ]k )∗ Qk− − [ψ̇]k , bk− = D k (H ) − (qk− )[ẋ]k , (5.102)
where [ẋ]k is a column vector, while qk− , ([ẋ]k )∗ and [ψ̇]k are row vectors, and bk− is a
number. We shall assume that

bk− > 0, k = 1, . . . , s, (5.103)

holds and that Q satisfies the jump conditions

[Q]k = (bk− )−1 (qk− )∗ (qk− ), (5.104)

where (qk− ) is a row vector, (qk− )∗ is a column vector, and hence (qk− )∗ (qk− ) is a symmetric
n × n matrix. Then, as was shown in Section 4.2,

ωk = (bk− )−1 ((bk− )ξ̄k + (qk− )(x̄ k+ ))2 = (bk− )−1 (D k (H )ξ̄k + (qk− )(x̄ k− ))2 . (5.105)

Thus, we obtain the following transformation of the quadratic form Ω = Ω(λ, z̄) to perfect squares on the critical cone K:

Ω = ⟨(lxf xf − Q(tf))x̄f, x̄f⟩ + Σ_{k=1}^{s} (bk−)⁻¹( Dk(H)ξ̄k + (qk−)(x̄k−) )² + ∫_{t0}^{tf} ⟨Hvv⁻¹h̄, h̄⟩ dt,

where h̄ = (Hvx + fv∗Q)x̄ + Hvv v̄. In addition, let us assume that

⟨(lxf xf − Q(tf))x̄f, x̄f⟩ ≥ 0

for all x̄f ∈ Rd(x) \ {0} such that Kxf(x(tf))x̄f = 0. Then, obviously, Ω(λ, z̄) ≥ 0 on K. Now let us show that Ω(λ, z̄) > 0 for each nonzero element z̄ ∈ K. This will imply that Ω(λ, z̄) is positive definite on the critical cone K, since Ω(λ, z̄) is a Legendre quadratic form. Assume that Ω(λ, z̄) = 0 for some element z̄ ∈ K. Then, for this element, the following
equations hold:

x̄(t0 ) = 0, (5.106)
D k (H )ξ̄k + (qk− )(x̄ k− ) = 0, k = 1, . . . , s, (5.107)
h̄(t) = 0 a.e. in Δ.    (5.108)

From the last equation, we get

v̄ = −Hvv⁻¹(Hvx + fv∗Q)x̄.    (5.109)

Using this formula in the equation x̄˙ = fx x̄ + fv v̄, we see that x̄ is a solution of the linear equation

x̄˙ = ( fx − fv Hvv⁻¹(Hvx + fv∗Q) ) x̄.    (5.110)
This equation together with initial condition x̄(t0 ) = 0 implies that x̄(t) = 0 for all t ∈ [t0 , t1 ).
Consequently, x̄ 1− = 0, and then, by virtue of (5.107), ξ̄1 = 0. This equality, together
with the jump condition [x̄]1 = [ẋ]1 ξ̄1 , implies that [x̄]1 = 0, i.e., x̄ is continuous at t1 .
Consequently, x̄ 1+ = 0. From the last condition and equation (5.110) it follows that x̄(t) = 0
for all t ∈ (t1 , t2 ). Repeating this argument, we obtain ξ̄1 = ξ̄2 = · · · = ξ̄s = 0, x̄(t) = 0 for
all t ∈ [t0 , tf ]. Then from (5.109) it follows that v̄ = 0. Consequently, we have z̄ = 0 and
thus have proved the following theorem; cf. [98].
Theorem 5.27. Assume that there exists a symmetric matrix Q(t), defined on [t0 , tf ], such
that
(a) Q(t) is piecewise continuous on [t0 , tf ] and continuously differentiable on each
interval of the set [t0, tf] \ Θ;
(b) Q(t) satisfies the Riccati equation
Q̇ + Qfx + fx∗Q + Hxx − (Hxv + Qfv)Hvv⁻¹(Hvx + fv∗Q) = 0    (5.111)
on each interval of the set [t0, tf] \ Θ;
(c) at each point tk ∈ Θ the matrix Q(t) satisfies the jump condition
[Q]k = (bk− )−1 (qk− )∗ (qk− ),
where qk− = ([ẋ]k )∗ Qk− − [ψ̇]k , bk− = D k (H ) − (qk− )[ẋ]k > 0;
 
(d) ⟨(lxf xf − Q(tf))x̄f, x̄f⟩ ≥ 0 for all x̄f ∈ Rd(x) \ {0} such that Kxf(x(tf))x̄f = 0.
Then Ω(λ, z̄) is positive definite on the subspace K.

In some problems, it is more convenient to integrate the Riccati equation (5.111) backwards from t = tf. A similar proof shows that we can replace condition (c) in Theorem 5.27 by the following condition:
(c+) at each point tk ∈ Θ, the matrix Q(t) satisfies the jump condition

[Q]k = (bk+)⁻¹(qk+)∗(qk+),   where
qk+ = ([ẋ]k)∗Qk+ − [ψ̇]k,   bk+ = Dk(H) + (qk+)[ẋ]k > 0.    (5.112)
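In computational terms, this backward variant suggests a simple verification routine: integrate (5.111) from tf towards t0, apply the jump (5.112) at every switching point, and check bk+ > 0 along the way. The following sketch (a schematic outline of ours, assuming the coefficient matrices of (5.111) along the reference trajectory and the jump data at the tk are supplied by the user; all function names are hypothetical) indicates the structure of such a test:

    import numpy as np
    from scipy.integrate import solve_ivp

    def riccati_rhs(t, q, n, fx, fv, Hxx, Hxv, Hvv):
        # matrix Riccati equation (5.111), flattened for solve_ivp
        Q = q.reshape(n, n)
        S = Hxv(t) + Q @ fv(t)
        dQ = -(Q @ fx(t) + fx(t).T @ Q + Hxx(t)
               - S @ np.linalg.solve(Hvv(t), S.T))
        return dQ.ravel()

    def riccati_test(t0, tf, Qf, n, coeffs, switches):
        # switches: list of (tk, [xdot]^k, [psidot]^k, D^k(H)), increasing tk
        Q, t_right = Qf.copy(), tf
        for tk, dx_k, dpsi_k, DkH in reversed(switches):
            sol = solve_ivp(riccati_rhs, (t_right, tk), Q.ravel(),
                            args=(n, *coeffs))
            Q = sol.y[:, -1].reshape(n, n)        # Q(tk+)
            qk = dx_k @ Q - dpsi_k                # q_k^+, a row vector
            bk = DkH + qk @ dx_k                  # b_k^+
            assert bk > 0, "condition b_k^+ > 0 of (5.112) fails"
            Q = Q - np.outer(qk, qk) / bk         # Q(tk-) = Q(tk+) - [Q]^k
            t_right = tk
        sol = solve_ivp(riccati_rhs, (t_right, t0), Q.ravel(), args=(n, *coeffs))
        return sol.y[:, -1].reshape(n, n)         # existence on [t0, tf] = success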

5.4 Numerical Example: Optimal Control of Production


and Maintenance
Cho, Abad, and Parlar [22] introduced an optimal control model where a dynamic mainte-
nance problem is incorporated into a production control problem so as to simultaneously
compute optimal production and maintenance policies. In this model, the dynamics is lin-
ear with respect to both production and maintenance control, whereas the cost functional is
quadratic with respect to production control and linear with respect to maintenance control.
Hence, the model fits into the type of control problems considered in (5.1)–(5.4). A de-
tailed numerical analysis of solutions for different final times may be found in Maurer, Kim,
and Vossen [67]. For a certain range of final times the maintenance control is bang-bang.
We will show that the sufficient conditions in Theorems 5.24 and 5.27 are satisfied for the
computed solutions. The notation for the state variables is slightly different from that in
[22, 67]. The state and control variables and parameters have the following meaning:
x1 (t) : inventory level at time t ∈ [0, tf ] with fixed final time tf > 0,
x2 (t) : proportion of good units of end items produced at time t ∈ [0, tf ],
v(t) : scheduled production rate (control),
m(t) : preventive maintenance rate to reduce the proportion of defective
units produced (control),
α(t) : obsolescence rate of the process performance in the absence of
maintenance,
s(t) : demand rate,
ρ>0 : discount rate.
The dynamics of the process is given by


ẋ1 (t) = x2 (t)v(t) − s(t), x1 (0) = x10 > 0,
(5.113)
ẋ2 (t) = −α(t)x2 (t) + (1 − x2 (t))m(t), x2 (0) = x20 > 0,
with the following bounds on the control variables:
0 ≤ v(t) ≤ vmax , 0 ≤ m(t) ≤ mmax for 0 ≤ t ≤ tf . (5.114)
Since all demands must be satisfied, the following state constraint is imposed:
x1 (t) ≥ 0 for 0 ≤ t ≤ tf .
Computations show that this state constraint is automatically satisfied if we impose the
boundary condition
x1 (tf ) = 0. (5.115)
The optimal control problem then consists in maximizing the total discounted profit plus the
salvage value of x2 (tf ),
J(x1, x2, m, v) = ∫_{0}^{tf} [ws − hx1(t) − rv(t)² − cm(t)] e−ρt dt + b x2(tf) e−ρtf,    (5.116)
under the constraints (5.113)–(5.115). For later computations, the values of constants are
chosen as in [22]:
α ≡ 2, w = 8, s(t) ≡ 4, ρ = 0.1, h = 1, c = 2.5,
(5.117)
r = 2, b = 10, vmax = 3, mmax = 4, x10 = 3, x20 = 1.
The time horizon tf will be specified below. In the discussion of the minimum principle,
we consider the usual Pontryagin function (Hamiltonian), see (5.73), instead of the current
value Hamiltonian in [22, 67],

H (t, x1 , x2 , ψ1 , ψ2 , m, v) = e−ρt (−ws + hx1 + rv 2 + cm)


(5.118)
+ψ1 (x2 v − s) + ψ2 (−αx2 + (1 − x2 )m),
where ψ1 , ψ2 denote the adjoint variables. The adjoint equations and transversality condi-
tions yield, in view of x1 (tf ) = 0 and the salvage term in the cost functional,

ψ̇1 = −he−ρt , ψ1 (tf ) = ν,


(5.119)
ψ̇2 = −ψ1 v + ψ2 (α + m), ψ2 (tf ) = −be−ρtf .
The multiplier ν is not known a priori and will be computed later. We will choose a time horizon for which the control constraint 0 ≤ v(t) ≤ vmax = 3 will not become active. Hence, the minimum condition in the minimum principle leads to the equation 0 = Hv = 2re−ρt v + ψ1 x2, which yields the control

v = −ψ1 x2 eρt /2r.    (5.120)
Since the maintenance control enters the Hamiltonian linearly, the control m is determined
by the sign of the switching function
φm (t) = Hm = e−ρt c + ψ2 (t)(1 − x2 (t)) (5.121)
according to

           ⎧ mmax      if φm(t) > 0,
    m(t) = ⎨ 0         if φm(t) < 0,                    (5.122)
           ⎩ singular  if φm(t) ≡ 0 for t ∈ Ising ⊂ [0, tf].

For the final time tf = 1 which was considered in [22] and [67], the maintenance control
contains a singular arc. However, the computations in [67] show that for final times tf ∈
[0.15, 0.98] the maintenance control is bang-bang and has only one switching time:
 
    m(t) = ⎧ 0         for 0 ≤ t < t1,
           ⎩ mmax = 4  for t1 ≤ t ≤ tf.                 (5.123)

Let us study the control problem with final time tf = 0.9 in more detail. To compute a
solution candidate, we apply nonlinear programming methods to the discretized control
problem with a large number N of grid points τi = i · tf /N , i = 0, 1, . . . , N ; cf. [5, 14]. Both
the method of Euler and the method of Heun are employed for integrating the differential
equation. We use the programming language AMPL developed by Fourer et al. [33] and
the interior point optimization code IPOPT of Wächter and Biegler [114]. For N = 5000
grid points, the computed state, control and adjoint functions are displayed in Figure 5.1.
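A minimal stand-in for this direct approach can be written in a few lines; the following single-shooting sketch (our own simplification, using Euler integration and SciPy's SLSQP solver in place of AMPL/IPOPT) reproduces the qualitative solution structure and, for fine grids, roughly approximates the profit J = 26.705 reported below:

    import numpy as np
    from scipy.optimize import minimize

    # data (5.117), final time tf = 0.9
    alpha, w, s, rho, h, c, r, b = 2.0, 8.0, 4.0, 0.1, 1.0, 2.5, 2.0, 10.0
    vmax, mmax, x10, x20, tf, N = 3.0, 4.0, 3.0, 1.0, 0.9, 150
    dt = tf / N
    tgrid = np.linspace(0.0, tf - dt, N)          # left endpoints for Euler

    def simulate(z):
        v, m = z[:N], z[N:]
        x1, x2, J = x10, x20, 0.0
        for i in range(N):
            J += (w*s - h*x1 - r*v[i]**2 - c*m[i]) * np.exp(-rho*tgrid[i]) * dt
            x1 += dt * (x2 * v[i] - s)                     # Euler step of (5.113)
            x2 += dt * (-alpha * x2 + (1.0 - x2) * m[i])
        return x1, J + b * x2 * np.exp(-rho * tf)          # salvage term of (5.116)

    z0 = np.concatenate([np.full(N, 2.0), np.full(N, 1.0)])
    res = minimize(lambda z: -simulate(z)[1], z0, method="SLSQP",
                   bounds=[(0, vmax)]*N + [(0, mmax)]*N,
                   constraints=[{"type": "eq", "fun": lambda z: simulate(z)[0]}])
    print(-res.fun)   # approximates the discounted profit J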
Figure 5.1. Optimal production and maintenance, final time tf = 0.9. (a) State variables x1, x2. (b) Regular production control v and bang-bang maintenance control m. (c) Adjoint variables ψ1, ψ2. (d) Maintenance control m with switching function φm.

The following values for the switching time, functional value, and selected state and adjoint variables are obtained:


t1 = 0.65691, J = 26.705,
x1 (t1 ) = 0.84924, x2 (t1 ) = 0.226879,
x1 (tf ) = 0.0, x2 (tf ) = 0.574104,
(5.124)
ψ1 (0) = −7.8617, ψ2 (0) = −4.70437,
ψ1 (t1 ) = −8.4975, ψ2 (t1 ) = −3.2016,
ψ1 (tf ) = −8.72313, ψ2 (tf ) = −9.13931.
Now we evaluate the Riccati equation (5.111),

Q̇ = −Qfx − fx∗Q − Hxx + (Hxv + Qfv)(Hvv)⁻¹(Hvx + fv∗Q),    (5.125)
for the symmetric 2 × 2 matrix

    Q = ( Q11  Q12
          Q12  Q22 ).
Computing the expressions

    fx = ( 0   v
           0  −(α + m) ),   fv = ( x2
                                   0 ),   Hxx = 0,   Hxv = (0, ψ1),
the matrix Riccati equation (5.125) yields the following ODE system:

Q̇11 = Q11² x2² eρt /2r,    (5.126)
Q̇12 = −Q11 v + Q12 (α + m) + eρt Q11 x2 (ψ1 + Q12 x2 )/2r,    (5.127)
Q̇22 = −2Q12 v + 2Q22 (α + m) + eρt (ψ1 + Q12 x2 )² /2r.    (5.128)
The equations (5.126) and (5.127) are homogeneous in the variables Q11 and Q12 . Hence,
we can try to find a solution to the Riccati system with Q11 (t) = Q12 (t) ≡ 0 on [0, tf ]. Then
(5.128) reduces to the linear equation
Q̇22 = 2Q22 (α + m) + eρt ψ1² /2r.    (5.129)
This linear equation always has a solution. The remaining task is to satisfy the jump and
boundary conditions in Theorem 5.27 for the matrix Q. Since we shall integrate equation
(5.128) backwards, it is more convenient to evaluate the jump (5.112). Moreover, the
boundary conditions for Q in Theorem 5.27(d) show that the initial value Q(0) can be
chosen arbitrarily, while the terminal condition imposes the sign condition Q22 (tf ) ≤ 0,
since x2 (tf ) is free. Therefore, we can choose the terminal value
Q22 (tf ) = 0. (5.130)
Hence, using the computed values in (5.124), we solve the linear equation (5.129) with
terminal condition (5.130). At the switching time t1 , we obtain the value
Q22 (t1 ) = −1.5599.
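This backward sweep is easy to reproduce. A small script of the following kind (our own check; it uses the closed-form solution ψ1(t) = ψ1(tf) + (h/ρ)(e−ρt − e−ρtf) of the first adjoint equation in (5.119) together with the computed value ψ1(tf) = −8.72313 from (5.124)) recovers this value:

    import numpy as np
    from scipy.integrate import solve_ivp

    alpha, m, rho, h, r = 2.0, 4.0, 0.1, 1.0, 2.0    # m = mmax on [t1, tf]
    tf, t1, psi1_tf = 0.9, 0.65691, -8.72313

    def psi1(t):
        # psi1' = -h e^{-rho t}, psi1(tf) given, integrated in closed form
        return psi1_tf + (h / rho) * (np.exp(-rho * t) - np.exp(-rho * tf))

    def rhs(t, q):
        # linear equation (5.129): Q22' = 2 Q22 (alpha+m) + e^{rho t} psi1^2 / 2r
        return 2.0 * q * (alpha + m) + np.exp(rho * t) * psi1(t) ** 2 / (2.0 * r)

    sol = solve_ivp(rhs, (tf, t1), [0.0], rtol=1e-10)   # terminal value (5.130)
    print(sol.y[0, -1])    # about -1.56, matching Q22(t1) = -1.5599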
Next, we evaluate the jump in the state and adjoint variables and check conditions (5.112). With M := mmax = 4 denoting the jump of the maintenance control at t1, we get

([ẋ]1)∗ = (0, M(1 − x2(t1))),   [ψ̇]1 = (0, Mψ2(t1)),
which yield the quantities

q1+ = ([ẋ]1)∗Q1+ − [ψ̇]1 = (0, M(1 − x2(t1))Q22(t1+) − Mψ2(t1)) = (0, 8.2439),
b1+ = D1(H) + (q1+)[ẋ]1 = D1(H) + M²((1 − x2(t1))Q22(t1+) − ψ2(t1))ψ2(t1)
    = 27.028 + 133.55 = 165.58 > 0.

Then the jump condition in (5.112),

    [Q]1 = (b1+)⁻¹(q1+)∗(q1+) = ( 0   0
                                  0  [Q22]1 ),

reduces to a jump condition for Q22 at t1. However, we do not need to evaluate this jump
condition explicitly because the linear equation (5.129) has a solution regardless of the value
Q22 (t1 −). Hence, we conclude from Theorem 5.27 that the numerical solution characterized
by (5.124) and displayed in Figure 5.1 provides a strict bounded strong minimum.
We may hope to improve on the benefit by choosing a larger time horizon. For the
final time tf = 1.1, we get a bang-singular-bang maintenance control m(t) as shown in
Figure 5.2.

Figure 5.2. Optimal production and maintenance, final time tf = 1.1. (a) State variables x1, x2. (b) Regular production control v and bang-singular-bang maintenance control m. (c) Adjoint variables ψ1, ψ2. (d) Maintenance control m with switching function φm.
The solution characteristics for this solution are given by

J = 23.3567,
x1 (tf ) = 0, x2 (tf ) = 0.64926,
(5.131)
ψ1 (0) = −11.93, ψ2 (0) = −9.332,
ψ1 (tf ) = −12.97, ψ2 (tf ) = −8.956.

Apparently, the larger time horizon tf = 1.1 results in a smaller gain J = 23.357 compared to
J = 26.705 for the final time tf = 0.9. We are not aware of any type of sufficient optimality
conditions that would apply to the extremal for tf = 1.1, where one control component has
a bang-singular-bang structure. Thus one is led to ask: What is the optimal lifetime of
the machine to give maximal gain? This amounts to solving the control problem (5.113)–
(5.116) with free final time tf . The solution is very similar to that shown in Figure 5.1. The
maintenance control m(t) is bang-bang with one switching time t1 = 0.6523. The optimal
final time tf = 0.8633 gives the gain J = 26.833 which slightly improves on J = 26.705
for the final time tf = 0.9.
Chapter 6

Second-Order Optimality
Conditions for Bang-Bang
Control

In this chapter, we investigate the pure bang-bang case, where the second-order necessary
or sufficient optimality conditions amount to testing the positive (semi)definiteness of a
quadratic form on a finite-dimensional critical cone. In Section 6.2, we deduce these con-
ditions from the results obtained in the previous chapter. Although the quadratic conditions
turned out to be finite-dimensional, the direct numerical test works only in some special
cases. Therefore, in Section 6.3, we study various transformations of the quadratic form
and the critical cone which are tailored to different types of control problems in practice.
In particular, by a solution to a linear matrix differential equation, the quadratic form can be
converted to perfect squares. In Section 6.5, we study second-order optimality conditions
for time-optimal control problems with control appearing linearly. In Section 6.6, we show
that an approach similar to the above mentioned Riccati equation approach is applicable for
such problems. Again, the test requires us to find a solution of a linear matrix differential
equation which satisfies certain jump conditions at the switching points. In Section 6.7, we
discuss two numerical examples that illustrate the numerical procedure of verifying positive
definiteness of the corresponding quadratic forms. In Section 6.8, following [79], we study
second-order optimality conditions in a simple, but important class of time-optimal control
problems for linear autonomous systems.

6.1 Bang-Bang Control Problems on Nonfixed Time


Intervals
6.1.1 Optimal Control Problems with Control Appearing Linearly
We consider optimal control problems with control appearing linearly. Let x(t) ∈ Rd(x)
denote the state variable and u(t) ∈ Rd(u) the control variable in the time interval t ∈ Δ = [t0, tf] with a nonfixed initial time t0 and final time tf. We shall refer to the following control
problem (6.1)–(6.3) as the basic bang-bang control problem, or briefly, the basic problem:

Minimize J := J (t0 , x(t0 ), tf , x(tf )) (6.1)

subject to the constraints

ẋ(t) = f (t, x(t), u(t)), u(t) ∈ U , (t, x(t)) ∈ Q, t0 ≤ t ≤ tf , (6.2)


F (t0 , x(t0 ), tf , x(tf )) ≤ 0, K(t0 , x(t0 ), tf , x(tf )) = 0,
(6.3)
(t0 , x(t0 ), tf , x(tf )) ∈ P ,

where the control variable appears linearly in the system dynamics,

f (t, x, u) = a(t, x) + B(t, x)u. (6.4)

Here, F , K, and a are vector functions, B is a d(x) × d(u) matrix function, P ⊂ R2+2d(x) ,
Q ⊂ R1+d(x) are open sets, and U ⊂ Rd(u) is a convex polyhedron. The functions J , F ,
and K are assumed to be twice continuously differentiable on P , and the functions a, B are
twice continuously differentiable on Q. The dimensions of F and K are denoted by d(F )
and d(K). We shall use the abbreviations x0 = x(t0 ), xf = x(tf ), p = (t0 , x0 , tf , xf ). A tra-
jectory T = (x(t), u(t) | t ∈ [t0, tf]) is said to be admissible if x(·) is absolutely continuous, u(·) is measurable and bounded, and the pair of functions (x(t), u(t)) on the interval Δ = [t0, tf] with the endpoints p = (t0, x(t0), tf, x(tf)) satisfies the constraints (6.2) and (6.3). We set
J(T ) := J (t0 , x(t0 ), tf , x(tf )).
Obviously, the basic problem (6.1)–(6.3) is a special case of the general problem
(5.69)–(5.72) studied in the previous chapter. This special case corresponds to the assump-
tion that the function f does not depend on the variable v, i.e., f = f (t, x, u). Under this
assumption it turned out to be possible to obtain certain deeper results than in the general
problem. More precisely, we formulate the necessary quadratic Condition A in the prob-
lem (6.1)–(6.3), which is a simple consequence of the Condition A in the general problem,
whereas the corresponding sufficient quadratic Condition B will be slightly simplified.
Let us give the definition of Pontryagin minimum for the basic problem. The tra-
jectory T affords a Pontryagin minimum if there is no sequence of admissible trajectories
T n = (x n (t), un (t) | t ∈ [t0n , tfn ]), n = 1, 2, . . . , such that the following properties hold with
Δn = [t0n, tfn]:
(a) J(T n) < J(T ) for all n and t0n → t0, tfn → tf for n → ∞;
(b) max_{Δn∩Δ} |xn(t) − x(t)| → 0 for n → ∞;
(c) ∫_{Δn∩Δ} |un(t) − u(t)| dt → 0 for n → ∞.

Note that for a fixed time interval Δ, a Pontryagin minimum corresponds to an L1-local minimum with respect to the control variable.

6.1.2 First-Order Necessary Optimality Conditions


Let T = (x(t), u(t) | t ∈ [t0, tf]) be a fixed admissible trajectory such that the control u(·) is a piecewise constant function on the interval Δ = [t0, tf]. Denote by Θ = {t1, . . . , ts}, t0 < t1 < · · · < ts < tf, the finite set of all discontinuity points (jump points) of the control u(t). Then ẋ(t) is a piecewise continuous function whose discontinuity points belong to Θ, and hence x(t) is a piecewise smooth function on Δ. We continue to use the notation [u]k = uk+ − uk− for the jump of the function u(t) at the point tk ∈ Θ, where uk− = u(tk−), uk+ = u(tk+) are the left- and right-hand values of the control u(t) at tk, respectively.
Let us formulate a first-order necessary condition for optimality of the trajectory T ,


that is, the Pontryagin minimum principle. The Pontryagin function has the form
H (t, x, u, ψ) = ψf (t, x, u) = ψa(t, x) + ψB(t, x)u, (6.5)
where ψ is a row vector of dimension d(ψ) = d(x), while x, u, f , F , and K are column
vectors. The factor of the control u in the Pontryagin function is the switching function
φ(t, x, ψ) = Hu (t, x, u, ψ) = ψB(t, x) (6.6)
which is a row vector of dimension d(u). The endpoint Lagrange function is
l(α0 , α, β, p) = α0 J (p) + αF (p) + βK(p),
where α and β are row vectors with d(α) = d(F ) and d(β) = d(K), and α0 is a number.
By λ = (α0, α, β, ψ(·), ψ0(·)) we denote a tuple of Lagrange multipliers such that ψ(·) : Δ → (Rd(x))∗, ψ0(·) : Δ → R1 are continuous on Δ and continuously differentiable on each interval of the set Δ \ Θ.
Let M0 be the set of the normalized tuples λ satisfying the minimum principle conditions for the trajectory T :

α0 ≥ 0,   α ≥ 0,   αF(p) = 0,   α0 + Σ_{i=1}^{d(F)} αi + Σ_{j=1}^{d(K)} |βj| = 1,    (6.7)

ψ̇ = −Hx,   ψ̇0 = −Ht   ∀ t ∈ Δ \ Θ,    (6.8)

ψ(t0) = −lx0,   ψ(tf) = lxf,   ψ0(t0) = −lt0,   ψ0(tf) = ltf,    (6.9)
min_{u∈U} H(t, x(t), u, ψ(t)) = H(t, x(t), u(t), ψ(t))   ∀ t ∈ Δ \ Θ,    (6.10)
H(t, x(t), u(t), ψ(t)) + ψ0(t) = 0   ∀ t ∈ Δ \ Θ.    (6.11)
The derivatives lx0 and lxf are taken at the point (α0 , α, β, p), where p = (t0 , x(t0 ), tf , x(tf )),
and the derivatives Hx, Ht are evaluated at the point (t, x(t), u(t), ψ(t)). Again, we use the simple abbreviation (t) for indicating all arguments (t, x(t), u(t), ψ(t)), t ∈ Δ \ Θ.

Theorem 6.1. If the trajectory T affords a Pontryagin minimum, then the set M0 is
nonempty. The set M0 is a finite-dimensional compact set, and the projector λ  → (α0 , α, β)
is injective on M0 .

For each λ ∈ M0 and tk ∈ Θ, let us define the quantity Dk(H). Set

(Δk H)(t) = H(t, x(t), uk+, ψ(t)) − H(t, x(t), uk−, ψ(t)) = φ(t) [u]k,    (6.12)

where φ(t) = φ(t, x(t), ψ(t)). For each λ ∈ M0 the following equalities hold:

(d/dt)(Δk H)|_{t=tk−} = (d/dt)(Δk H)|_{t=tk+},   k = 1, . . . , s.

Consequently, for each λ ∈ M0 the function (Δk H)(t) has a derivative at the point tk ∈ Θ. Set

Dk(H) = −(d/dt)(Δk H)(tk).

Then, for each λ ∈ M0 , the minimum condition (6.10) implies the inequality

D k (H ) ≥ 0, k = 1, . . . , s. (6.13)

As we know, the value Dk(H) can be written in the form

Dk(H) = −Hx^{k+} Hψ^{k−} + Hx^{k−} Hψ^{k+} − [Ht]^k = ψ̇^{k+} ẋ^{k−} − ψ̇^{k−} ẋ^{k+} + [ψ̇0]^k,

where Hx^{k−} and Hx^{k+} are the left- and right-hand values of the function Hx(t, x(t), u(t), ψ(t)) at tk, respectively, [Ht]^k is the jump of the function Ht(t) at tk, etc. It also follows from the above representation that we have

Dk(H) = −φ̇(tk±)[u]^k, (6.14)

where the right-hand side takes the same value whether one uses the right-hand derivative φ̇(tk+) or the left-hand derivative φ̇(tk−). In the case of a scalar control u, the total derivative φt + φx ẋ + φψ ψ̇ does not contain the control variable explicitly, and hence the derivative φ̇(t) is continuous at tk.

Proposition 6.2. For any λ ∈ M0 , we have

lx0 ẋ(t0 ) + lt0 = 0, lxf ẋ(tf ) + ltf = 0. (6.15)

Proof. The equalities (6.15) follow from the equality ψ(t)ẋ(t) + ψ0 (t) = 0 evaluated for
t = t0 and t = tf together with the transversality conditions

ψ(t0 ) = −lx0 , ψ0 (t0 ) = −lt0 , ψ(tf ) = lxf , ψ0 (tf ) = ltf .

6.1.3 Strong Minimum


As already mentioned in Section 2.7, any control problem with a cost functional in integral form J = ∫_{t0}^{tf} f0(t, x(t), u(t)) dt can be brought to the canonical form (6.1) by introducing a new state variable y defined by the state equation ẏ = f0(t, x, u), y(t0) = 0. This yields the cost function J = y(tf). The control variable is assumed to appear linearly in the function f0,

f0(t, x, u) = a0(t, x) + B0(t, x)u. (6.16)
It follows that the adjoint variable ψ_y associated with the new state variable y is given by ψ_y(t) ≡ α0, which yields the Pontryagin function (6.5) in the form

H(t, x, ψ, u) = α0 f0(t, x, u) + ψf(t, x, u) = α0 a0(t, x) + ψa(t, x) + (α0 B0(t, x) + ψB(t, x))u. (6.17)

Hence, the switching function is given by

φ(t, x, ψ) = α0 B0 (t, x) + ψB(t, x), φ(t) = φ(t, x(t), ψ(t)). (6.18)



The component y was called an unessential component in the augmented problem. The general definition was given in Section 5.2.3: the state variable xi is called unessential if the function f does not depend on xi and if the functions F, J, and K are affine in xi0 = xi(t0) and xi1 = xi(tf). Let x̃ denote the vector of all essential components of the state vector x. Now we can define a strong minimum in the basic problem.
We say that the trajectory T affords a strong minimum if there is no sequence of admissible trajectories T^n = (x^n(t), u^n(t) | t ∈ [t0^n, tf^n]), n = 1, 2, . . . , such that
(a) J(T^n) < J(T);
(b) t0^n → t0, tf^n → tf, x^n(t0) → x(t0) (n → ∞);
(c) max_{Δ ∩ Δ^n} |x̃^n(t) − x̃(t)| → 0 (n → ∞), where Δ^n = [t0^n, tf^n].
The strict strong minimum is defined in a similar way, with the strict inequality (a) replaced
by the nonstrict one and the trajectory T n required to be different from T for each n.

6.1.4 Bang-Bang Control


For a given extremal trajectory T = {(x(t), u(t)) | t ∈ Δ} with a piecewise constant control
u(t) we say that u(t) is a strict bang-bang control if there exists λ = (α0 , α, β, ψ, ψ0 ) ∈ M0
such that
Arg min_{u∈U} φ(t)u = [u(t−), u(t+)], t ∈ [t0, tf], (6.19)

where [u(t−), u(t+)] denotes the line segment spanned by the vectors u(t−) and u(t+) in
Rd(u) and φ(t) := φ(t, x(t), ψ(t)) = ψ(t)B(t, x(t)). Note that [u(t−), u(t+)] is a singleton
{u(t)} at each continuity point of the control u(t) with u(t) being a vertex of the polyhe-
dron U. Only at the points tk ∈ Θ does the line segment [u^{k−}, u^{k+}] coincide with an edge of the polyhedron.
As it was already mentioned in Section 5.1.7, if the control is scalar, d(u) = 1, and U = [umin, umax], then the strict bang-bang property is equivalent to φ(t) ≠ 0 for all t ∈ Δ \ Θ, which yields the control law

u(t) = umin if φ(t) > 0, u(t) = umax if φ(t) < 0, ∀ t ∈ Δ \ Θ. (6.20)

For vector-valued control inputs, condition (6.19) imposes further restrictions. For exam-
ple, if U is the unit cube in Rd(u) , condition (6.19) precludes simultaneous switching of
the control components; the case of simultaneous switching was studied in Felgenhauer
[31]. This property holds in many examples. The condition (6.19) is indispensable in the
sensitivity analysis of optimal bang-bang controls.
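For a scalar control, the law (6.20) is straightforward to evaluate numerically once the switching function has been computed along an extremal. The following minimal Python sketch (the switching-function values here are purely hypothetical) illustrates (6.20) pointwise on a time grid:

```python
import numpy as np

def bang_bang_control(phi, u_min, u_max):
    """Control law (6.20): u = u_min where phi > 0, u = u_max where phi < 0.
    For a strict bang-bang control, phi vanishes only at the isolated
    switching times t_k, so the choice at phi == 0 is immaterial."""
    return np.where(phi > 0, u_min, u_max)

# Hypothetical switching function with a single sign change at t = 1:
t = np.linspace(0.0, 2.0, 9)
phi = 1.0 - t
print(bang_bang_control(phi, u_min=0.0, u_max=1.0))
```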

6.2 Quadratic Necessary and Sufficient Optimality Conditions
In this section, we shall formulate a quadratic necessary condition for a Pontryagin minimum for a given bang-bang control. A strengthening of this quadratic condition
yields a quadratic sufficient condition for a strong minimum. These quadratic conditions
are based on the properties of a quadratic form on the critical cone.

6.2.1 Critical Cone


For a given trajectory T , we introduce the space Z(Δ) and the critical cone K ⊂ Z(Δ). Denote by PC^1(Δ, R^{d(x)}) the space of piecewise continuous functions x̄(·) : Δ → R^{d(x)}, continuously differentiable on each interval of the set Δ \ Θ. For each x̄ ∈ PC^1(Δ, R^{d(x)}) and for tk ∈ Θ, we set x̄^{k−} = x̄(tk−), x̄^{k+} = x̄(tk+), [x̄]^k = x̄^{k+} − x̄^{k−}. Set z̄ = (t̄0, t̄f, ξ̄, x̄), where t̄0, t̄f ∈ R^1, ξ̄ ∈ R^s, x̄ ∈ PC^1(Δ, R^{d(x)}). Thus,

z̄ ∈ Z(Δ) := R^2 × R^s × PC^1(Δ, R^{d(x)}).

For each z̄, we set

x̿0 = x̄(t0) + t̄0 ẋ(t0), x̿f = x̄(tf) + t̄f ẋ(tf), p̿ = (t̄0, x̿0, t̄f, x̿f). (6.21)

The vector p̿ is considered as a column vector. Note that t̄0 = 0, respectively, t̄f = 0, for fixed initial time t0, respectively, final time tf. Let IF(p) = {i ∈ {1, . . . , d(F)} | Fi(p) = 0} be the set of indices of all active endpoint inequalities Fi(p) ≤ 0 at the point p = (t0, x(t0), tf, x(tf)). Denote by K the set of all z̄ ∈ Z(Δ) satisfying the following conditions:

J′(p)p̿ ≤ 0, Fi′(p)p̿ ≤ 0 ∀ i ∈ IF(p), K′(p)p̿ = 0, (6.22)

x̄˙(t) = fx(t, x(t), u(t))x̄(t), [x̄]^k = [ẋ]^k ξ̄k, k = 1, . . . , s, (6.23)

where p = (t0, x(t0), tf, x(tf)). It is obvious that K is a convex finite-dimensional and finite-faced cone in the space Z(Δ). We call it the critical cone. Each element z̄ ∈ K is uniquely defined by the numbers t̄0, t̄f, the vector ξ̄, and the initial value x̄(t0) of the function x̄(t).

Proposition 6.3. For any λ ∈ M0 and z̄ ∈ K, we have

lx0 x̄(t0 ) + lxf x̄(tf ) = 0. (6.24)

Proof. Integrating the equality ψ(x̄˙ − fx x̄) = 0 on [t0, tf] and using the adjoint equation ψ̇ = −ψfx, we obtain ∫_{t0}^{tf} (d/dt)(ψx̄) dt = 0, whence (ψx̄)|_{t0}^{tf} − Σ_{k=1}^{s} [ψx̄]^k = 0. From the jump conditions [x̄]^k = [ẋ]^k ξ̄k and the equality ψ(t)ẋ(t) + ψ0(t) = 0 it follows that [ψx̄]^k = ψ(tk)[ẋ]^k ξ̄k = [ψẋ]^k ξ̄k = −[ψ0]^k ξ̄k = 0 for all k. Then the equation (ψx̄)|_{t0}^{tf} = 0, together with the transversality conditions ψ(t0) = −lx0 and ψ(tf) = lxf, implies (6.24).

Proposition 6.4. For any λ ∈ M0 and z̄ ∈ K we have

α0 J′(p)p̿ + Σ_{i=1}^{d(F)} αi Fi′(p)p̿ + βK′(p)p̿ = 0. (6.25)

Proof. For λ ∈ M0 and z̄ ∈ K, we have, by Propositions 6.2 and 6.3,

t̄0(lx0 ẋ(t0) + lt0) + t̄f(lxf ẋ(tf) + ltf) + lx0 x̄(t0) + lxf x̄(tf) = 0.

Now using the equalities x̿0 = x̄(t0) + t̄0 ẋ(t0), x̿f = x̄(tf) + t̄f ẋ(tf), and p̿ = (t̄0, x̿0, t̄f, x̿f), we get lp p̿ = 0, which is equivalent to condition (6.25).

Two important properties of the critical cone follow from Proposition 6.4.

Proposition 6.5. For any λ ∈ M0 and z̄ ∈ K, we have α0 J′(p)p̿ = 0 and αi Fi′(p)p̿ = 0 for all i ∈ IF(p).

Proposition 6.6. Suppose that there exists λ ∈ M0 with α0 > 0. Then, adding the equalities αi Fi′(p)p̿ = 0 for all i ∈ IF(p) to the system (6.22), (6.23) defining K, one can omit the inequality J′(p)p̿ ≤ 0 in that system without affecting K.

Thus, K is defined by condition (6.23) and by the condition p̿ ∈ K0, where K0 is the cone in R^{2d(x)+2} given by (6.22). But if there exists λ ∈ M0 with α0 > 0, then we can put

K0 = {p̿ ∈ R^{2d(x)+2} | Fi′(p)p̿ ≤ 0, αi Fi′(p)p̿ = 0 ∀ i ∈ IF(p), K′(p)p̿ = 0}. (6.26)

If, in addition, αi > 0 holds for all i ∈ IF(p), then K0 is a subspace of R^{2d(x)+2}.
An explicit representation of the variations x̄(t) in (6.23) is obtained as follows. For
each k = 1, . . . , s, define the vector functions y k (t) as the solutions to the system
ẏ = fx (t)y, y(tk ) = [ẋ]k , t ∈ [tk , tf ].

For t < tk we put y k (t) = 0 which yields the jump [y k ]k = [ẋ]k . Moreover, define y 0 (t) as
the solution to the system
ẏ = fx (t)y, y(t0 ) = x̄(t0 ) =: x̄0 .
By the superposition principle for linear ODEs it is obvious that we have

x̄(t) = Σ_{k=1}^{s} y^k(t) ξ̄k + y^0(t),

from which we obtain the representation

x̿f = Σ_{k=1}^{s} y^k(tf) ξ̄k + y^0(tf) + ẋ(tf) t̄f. (6.27)

Furthermore, denote by x(t; t1, . . . , ts) the solution of the state equation (6.2) using the values of the optimal bang-bang control with switching points t1, . . . , ts. It easily follows from elementary properties of ODEs that the partial derivatives of state trajectories with respect to the switching points are given by

(∂x/∂tk)(t; t1, . . . , ts) = −y^k(t) for t ≥ tk, k = 1, . . . , s. (6.28)

This gives the following expression for x̄(t):

x̄(t) = −Σ_{k=1}^{s} (∂x/∂tk)(t) ξ̄k + y^0(t). (6.29)
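Numerically, each y^k is obtained by one forward integration of the variational system along the reference trajectory, and (6.28) then gives the sensitivity of the state with respect to the switching time tk. A minimal sketch in Python, assuming SciPy is available and that the Jacobian fx(t) along the extremal is supplied as a callback:

```python
import numpy as np
from scipy.integrate import solve_ivp

def y_k(fx, t_k, jump_xdot_k, t_f):
    """Solve the variational system y' = f_x(t) y, y(t_k) = [xdot]^k
    forward on [t_k, t_f]; by (6.28), -y^k(t) is the sensitivity
    dx/dt_k of the trajectory with respect to the switching time t_k.
    fx(t) must return the Jacobian f_x along the reference trajectory."""
    return solve_ivp(lambda t, y: fx(t) @ y, (t_k, t_f), jump_xdot_k,
                     dense_output=True, rtol=1e-10, atol=1e-12)

# Hypothetical illustration: a 2x2 time-invariant Jacobian and a unit jump.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
sol = y_k(lambda t: A, t_k=1.0, jump_xdot_k=np.array([1.0, 0.0]), t_f=2.0)
print(sol.y[:, -1])   # y^k(t_f)
```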

In a special case that frequently arises in practice, we can use these formulas to show that
K = {0}. This property then yields a first-order sufficient condition in view of Theorem 6.10.

Namely, consider the following problem with an integral cost functional, where the initial
time t0 = tˆ0 is fixed, while the final time tf is free and where the initial and final values of
the state variables are given: Minimize
J = ∫_{t0}^{tf} f0(t, x, u) dt (6.30)

subject to
ẋ = f (t, x, u), x(t0 ) = x̂0 , x(tf ) = x̂f , u(t) ∈ U , (6.31)
where f is defined by (6.4), and f0 is defined by (6.16). In the definition of K we then have t̄0 = 0, x̄(t0) = 0, x̿f = 0. The condition x̄(t0) = 0 implies that y^0(t) ≡ 0, whereas the condition x̿f = 0 yields, in view of the representation (6.27),

Σ_{k=1}^{s} y^k(tf) ξ̄k + ẋ(tf) t̄f = 0.

This equation leads to the following statement.

Proposition 6.7. In problem (6.30), (6.31), assume that the s + 1 vectors

y^k(tf) = −(∂x/∂tk)(tf) (k = 1, . . . , s), ẋ(tf)

are linearly independent. Then the critical cone is K = {0}.

We conclude this subsection with a special property of the critical cone for time-
optimal control problems with fixed initial time and state,
tf → min, ẋ = f (t, x, u), u ∈ U , t0 = tˆ0 , x(t0 ) = x̂0 , K(x(tf )) = 0, (6.32)
where f is defined by (6.4). The following result will be used in Section 6.2.3; cf. Propo-
sition 6.11.

Proposition 6.8. Suppose that there exists (ψ0 , ψ) ∈ M0 such that α0 > 0. Then t¯f = 0
holds for each z̄ = (t¯f , ξ̄ , x̄) ∈ K.

Proof. For arbitrary λ ∈ M0 and z̄ = (t¯f , ξ̄ , x̄) ∈ K we infer from the proof of Proposition
6.3 that ψ(t)x̄(t) is a constant function on [t0 , tf ]. In view of the relations
ψ(tf) = βKxf(x(tf)), Kxf(x(tf))x̿f = 0, x̿f = x̄(tf) + ẋ(tf)t̄f,

we get

0 = (ψx̄)(t0) = (ψx̄)(tf) = ψ(tf)(x̿f − ẋ(tf)t̄f) = −ψ(tf)ẋ(tf)t̄f = ψ0(tf)t̄f.
Since ψ0 (tf ) = α0 > 0, this relation yields t¯f = 0.

In the case α0 > 0, we note as a consequence that the critical cone is a subspace defined by
the conditions
x̄˙ = fx(t)x̄, [x̄]^k = [ẋ]^k ξ̄k (k = 1, . . . , s), t̄0 = t̄f = 0, x̄(t0) = 0, Kxf(x(tf))x̄(tf) = 0. (6.33)

6.2.2 Quadratic Necessary Optimality Conditions


Let us introduce a quadratic form on the critical cone K defined by the conditions (6.22),
(6.23). For each λ ∈ M0 and z̄ ∈ K, we set
Ω(λ, z̄) = ⟨Ap̿, p̿⟩ + Σ_{k=1}^{s} ( Dk(H) ξ̄k² + 2[Hx]^k x̄av^k ξ̄k ) + ∫_{t0}^{tf} ⟨Hxx x̄(t), x̄(t)⟩ dt, (6.34)

where

⟨Ap̿, p̿⟩ = ⟨lpp p̿, p̿⟩ + 2ψ̇(t0)x̿0 t̄0 + (ψ̇0(t0) − ψ̇(t0)ẋ(t0)) t̄0² − 2ψ̇(tf)x̿f t̄f − (ψ̇0(tf) − ψ̇(tf)ẋ(tf)) t̄f², (6.35)

lpp = lpp(α0, α, β, p), p = (t0, x(t0), tf, x(tf)), Hxx = Hxx(t, x(t), u(t), ψ(t)),

x̄av^k = ½ (x̄^{k−} + x̄^{k+}).

Note that for a problem on a fixed time interval [t0, tf] we have t̄0 = t̄f = 0 and, hence, the quadratic form (6.35) reduces to ⟨Ap̿, p̿⟩ = ⟨lpp p̿, p̿⟩. The following theorem gives the
main second-order necessary condition of optimality.

Theorem 6.9. If the trajectory T affords a Pontryagin minimum, then the following Condition A holds: The set M0 is nonempty and max_{λ∈M0} Ω(λ, z̄) ≥ 0 for all z̄ ∈ K.

6.2.3 Quadratic Sufficient Optimality Conditions


A natural strengthening of the necessary Condition A turns out to be a sufficient optimality
condition not only for a Pontryagin minimum, but also for a strong minimum.

Theorem 6.10. Let the following Condition B be fulfilled for the trajectory T :
(a) there exists λ ∈ M0 such that Dk(H) > 0, k = 1, . . . , s, and condition (6.19) holds (i.e., u(t) is a strict bang-bang control);
(b) max_{λ∈M0} Ω(λ, z̄) > 0 for all z̄ ∈ K \ {0}.
Then T is a strict strong minimum.

Note that condition (b) is automatically fulfilled if K = {0}, which gives a first-order sufficient condition for a strong minimum in the problem. A specific situation where K = {0} holds was described in Proposition 6.7. Also note that condition (b) is automatically fulfilled if there exists λ ∈ M0 such that

Ω(λ, z̄) > 0 ∀ z̄ ∈ K \ {0}. (6.36)

Example: Resource allocation problem. Let x(t) be the stock of a resource and let the control u(t) be the investment rate. The control problem is to maximize the overall consumption

∫_0^{tf} x(t)(1 − u(t)) dt

on a fixed time interval [0, tf ] subject to


ẋ(t) = x(t) u(t), x(0) = x0 > 0, 0 ≤ u(t) ≤ 1.
The Pontryagin function (6.5) for the equivalent minimization problem and the switching
function are given by
H = α0 x(u − 1) + ψ xu = −α0 x + φ u, φ(x, ψ) = x(α0 + ψ).
We can put α0 = 1, since the terminal state x(tf ) is free. A straightforward discussion of the
minimum principle then shows that the optimal control has exactly one switching point at
t1 = tf − 1 for tf > 1:

u(t) = 1 for 0 ≤ t < t1, u(t) = 0 for t1 ≤ t ≤ tf,

(x(t), ψ(t)) = (x0 e^t, −e^{−(t−t1)}) for 0 ≤ t ≤ t1, (x(t), ψ(t)) = (x0 e^{t1}, t − tf) for t1 ≤ t ≤ tf.
The switching function is φ(t) = x(t)(1 + ψ(t)), which yields φ̇(t1) = x0 e^{t1} ≠ 0. Here we have k = 1, [u]^1 = −1, and thus obtain D1(H) = −φ̇(t1)[u]^1 = φ̇(t1) > 0 in view of (6.12) and (6.14). Hence, condition (a) of Theorem 6.10 holds. Checking condition (b) is rather simple, since the quadratic form (6.34) reduces here to Ω(λ, z̄) = D1(H) ξ̄1². This relation follows from Hxx ≡ 0 and [Hx]^1 = (1 + ψ(t1))[u]^1 = 0 and the fact that the quadratic form (6.35) vanishes. Note that the above control problem cannot be handled in the class of convex optimization problems. This means that the necessary conditions do not automatically imply optimality of the computed solution.
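Since the extremal of this example is known in closed form, its key quantities are easy to verify numerically. A small Python check, with x0 and tf chosen arbitrarily (subject to tf > 1):

```python
import numpy as np

# Resource allocation example: x0 > 0, fixed horizon tf > 1, switch at t1 = tf - 1.
x0, tf = 2.0, 3.0
t1 = tf - 1.0

def x(t):    # state: growth with u = 1 until t1, then constant
    return x0 * np.exp(np.minimum(t, t1))

def psi(t):  # adjoint of the minimization problem (alpha_0 = 1)
    return np.where(t < t1, -np.exp(-(t - t1)), t - tf)

def phi(t):  # switching function phi = x (1 + psi)
    return x(t) * (1.0 + psi(t))

# phi vanishes at t1, and D_1(H) = -phi'(t1)[u]^1 = phi'(t1) since [u]^1 = -1:
h = 1e-6
D1 = (phi(t1 + h) - phi(t1 - h)) / (2 * h)   # central difference for phi'(t1)
print(phi(t1), D1, x0 * np.exp(t1))          # ~0, and D1 approx x0*e^{t1} > 0
```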

We conclude this subsection with the case of a time-optimal control problem (6.32)
with a single switching point, i.e., s = 1. Assume that α0 > 0 for a given λ ∈ M0 . Then by
Proposition 6.8, we have t¯f = 0, and thus the critical cone is the subspace defined by (6.33).
In this case, the quadratic form Ω can be computed explicitly as follows. Denote by y(t), t ∈ [t1, tf], the solution to the Cauchy problem
ẏ = fx y, y(t1 ) = [ẋ]1 .
The following assertion is obvious: If (ξ̄, x̄) ∈ K, then x̄(t) = 0 for t ∈ [t0, t1) and x̄(t) = y(t)ξ̄ for t ∈ (t1, tf]. Therefore, the inequality Kxf(x(tf))y(tf) ≠ 0 would imply K = {0}. Consider now the case Kxf(x(tf))y(tf) = 0. This condition always holds for time-optimal problems with a scalar function K and α0 > 0. Indeed, the condition (d/dt)(ψy) = 0 implies (ψy)(t) = const in [t1, tf], whence

(ψy)(tf) = (ψy)(t1) = ψ(t1)[ẋ]^1 = φ(t1)[u]^1 = 0.

Using the transversality condition ψ(tf) = βKxf(x(tf)) and the inequality β ≠ 0 (if β = 0, then ψ(tf) = 0, and hence ψ(t) = 0 and ψ0(t) = 0 in [t0, tf]), we see that the equality (ψy)(tf) = 0 implies the equality Kxf(x(tf))y(tf) = 0.
Observe now that the cone K is a one-dimensional subspace, on which the quadratic form has the representation Ω = ρ ξ̄², where

ρ := D1(H) − [ψ̇]^1 [ẋ]^1 + ∫_{t1}^{tf} (y(t))^∗ Hxx(t) y(t) dt + (y(tf))^∗ (βK)_{xf xf} y(tf). (6.37)

This gives the following result.

Proposition 6.11. Suppose that we have found an extremal for the time-optimal control
problem (6.32) that has one switching point and satisfies α0 > 0 and Kxf (x(tf ))y(tf ) = 0.
Then the inequality ρ > 0 with ρ defined in (6.37) is equivalent to the positive definiteness of Ω on K.
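For a computed extremal, ρ in (6.37) requires only one integration of the Cauchy problem for y and one quadrature. A sketch under these assumptions (Python with SciPy; all callbacks and jump values are inputs read off the extremal):

```python
import numpy as np
from scipy.integrate import solve_ivp, quad

def rho(D1, jump_psidot, jump_xdot, fx, Hxx, t1, tf, bKff):
    """Evaluate (6.37):
    rho = D_1(H) - [psidot]^1 [xdot]^1 + int_{t1}^{tf} y* Hxx y dt
          + y(tf)* (beta K)_{xf xf} y(tf),
    where y solves y' = f_x y, y(t1) = [xdot]^1; fx(t), Hxx(t), the
    jumps, and bKff are taken from the computed extremal."""
    sol = solve_ivp(lambda t, y: fx(t) @ y, (t1, tf), jump_xdot,
                    dense_output=True, rtol=1e-10, atol=1e-12)
    integral, _ = quad(lambda t: sol.sol(t) @ Hxx(t) @ sol.sol(t), t1, tf)
    y_f = sol.sol(tf)
    return D1 - jump_psidot @ jump_xdot + integral + y_f @ bKff @ y_f
```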

6.2.4 Proofs of Quadratic Conditions


It was already mentioned that problem (6.1)–(6.3) is a special case of the general problem
(5.69)–(5.72). It is easy to check that in problem (6.1)–(6.3) we obtain the set M0 and the
critical cone K as the special cases of these sets in the general problem.
Let us compare the quadratic forms. It suffices to show that the endpoint quadratic form ⟨Ap̿, p̿⟩ (see (6.35)) can be transformed into the endpoint quadratic form ωe(λ, z̄) in (5.86) if relations (6.21) hold. Indeed, we have

⟨Ap̿, p̿⟩ := ⟨lpp p̿, p̿⟩ + 2ψ̇(t0)x̿0 t̄0 + (ψ̇0(t0) − ψ̇(t0)ẋ(t0)) t̄0² − 2ψ̇(tf)x̿f t̄f − (ψ̇0(tf) − ψ̇(tf)ẋ(tf)) t̄f²
= ⟨lpp p̿, p̿⟩ + 2ψ̇(t0)(x̄(t0) + t̄0 ẋ(t0)) t̄0 + (ψ̇0(t0) − ψ̇(t0)ẋ(t0)) t̄0² − 2ψ̇(tf)(x̄(tf) + t̄f ẋ(tf)) t̄f − (ψ̇0(tf) − ψ̇(tf)ẋ(tf)) t̄f²
= ⟨lpp p̿, p̿⟩ − 2ψ̇(tf)x̄(tf) t̄f − (ψ̇(tf)ẋ(tf) + ψ̇0(tf)) t̄f² + 2ψ̇(t0)x̄(t0) t̄0 + (ψ̇(t0)ẋ(t0) + ψ̇0(t0)) t̄0² =: ωe(λ, z̄).

Thus, Theorem 6.9, which gives necessary quadratic conditions in the problem (6.1)–(6.3),
is a consequence of Theorem 5.22.
Now let us proceed to sufficient quadratic conditions in the same problem. Here the
set M0+ consists of all those elements λ ∈ M0 for which condition (6.19) is fulfilled, and
the set Leg+ (M0+ ) consists of all those elements λ ∈ M0+ for which

D k (H ) > 0, k = 1, . . . , s. (6.38)

Denote for brevity Leg+(M0+) = M. Thus the set M consists of all those elements λ ∈ M0 for which (6.19) and (6.38) are fulfilled. Let us also note that the strict bounded strong minimum in the problem (6.1)–(6.3) is equivalent to the strict strong minimum, since U is a compact set. Thus Theorem 5.24 implies the following result.

Theorem 6.12. For a trajectory T in the problem (6.1)–(6.3) let the following Condition B be fulfilled: The set M is nonempty and there exist a nonempty compact set M′ ⊂ M and ε > 0 such that

max_{λ∈M′} Ω(λ, z̄) ≥ ε γ̄(z̄) ∀ z̄ ∈ K, (6.39)

where γ̄(z̄) = t̄0² + t̄f² + ⟨ξ̄, ξ̄⟩ + ⟨x̄(t0), x̄(t0)⟩. Then the trajectory T affords a strict strong minimum in this problem.

Remarkably, the fact that the critical cone K in the problem (6.1)–(6.3) is finite-
dimensional (since each element z̄ = (t¯0 , t¯f , ξ̄ , x̄) is uniquely defined by the parameters

t̄0, t̄f, ξ̄, x̄(t0)) implies that Condition B in Theorem 6.12 is equivalent to the following,
generally weaker condition.
Condition B0 . The set M is nonempty and

max_{λ∈M0} Ω(λ, z̄) > 0 ∀ z̄ ∈ K \ {0}. (6.40)

Lemma 6.13. For a trajectory T in the problem (6.1)–(6.3), Condition B is equivalent to Condition B0.

Proof. As we pointed out, Condition B always implies Condition B0. We will show that, in our case, the inverse assertion also holds. Let Condition B0 be fulfilled. Let S1(K) be the set of elements z̄ ∈ K satisfying the condition γ̄(z̄) = 1. Then

max_{λ∈M0} Ω(λ, z̄) > 0 ∀ z̄ ∈ S1(K). (6.41)

Recall that M0 is a finite-dimensional compact set. It is readily verified that the relative interior of the cone con M0 generated by M0 is contained in the cone con M generated by M, i.e., reint(con M0) ⊂ con M. Combined with (6.41) this implies that for any element z̄ ∈ S1(K) there exist λ ∈ M and a neighborhood Oz̄ ⊂ S1(K) of the element z̄ such that Ω(λ, ·) > 0 on Oz̄. The family of neighborhoods {Oz̄} forms an open covering of the compact set S1(K). Select a finite subcovering. To this subcovering there corresponds a finite compact set M′ = {λ1, . . . , λr} ⊂ M such that

max_{λ∈M′} Ω(λ, z̄) > 0 ∀ z̄ ∈ S1(K),

and consequently, due to compactness of the cross-section S1(K),

max_{λ∈M′} Ω(λ, z̄) > ε ∀ z̄ ∈ S1(K)

for some ε > 0. Hence Condition B follows.

Theorem 6.12 and Lemma 6.13 imply Theorem 6.10, where B = B0. In what follows, for problems of the type of the basic problem (6.1)–(6.3), by Condition B we will mean Condition B0.

6.3 Sufficient Conditions for Positive Definiteness of the Quadratic Form Ω on the Critical Cone K
Assume that condition (a) of Theorem 6.10 is fulfilled for the trajectory T . Let λ ∈ M0 be a fixed element (possibly different from that in condition (a)) and let Ω = Ω(λ, ·) be the quadratic form (6.34) for this element. According to Theorem 6.10, the positive definiteness of Ω on the critical cone K is a sufficient condition for a strict strong minimum. Recall that K is defined by (6.23) and the condition p̿ ∈ K0, where p̿ = (t̄0, x̿0, t̄f, x̿f), x̿0 = x̄(t0) + t̄0 ẋ(t0), x̿f = x̄(tf) + t̄f ẋ(tf). The cone K0 is defined by (6.26) in the case α0 > 0 and by (6.22) in the general case.

Now our aim is to find sufficient conditions for the positive definiteness of the quadratic form Ω on the cone K. In what follows, we shall use the ideas and results presented in Chapter 4 (see also [69]).

6.3.1 Q-Transformation of Ω on K
Let Q(t) be a symmetric matrix on [t0, tf] with piecewise continuous entries which are absolutely continuous on each interval of the set [t0, tf] \ Θ. Therefore, Q may have a jump at each point tk ∈ Θ. For z̄ ∈ K, formula (5.100) holds:

∫_{t0}^{tf} (d/dt)⟨Qx̄, x̄⟩ dt = ⟨Qx̄, x̄⟩|_{t0}^{tf} − Σ_{k=1}^{s} [⟨Qx̄, x̄⟩]^k,

where [⟨Qx̄, x̄⟩]^k is the jump of the function ⟨Qx̄, x̄⟩ at the point tk ∈ Θ. Using the equation x̄˙ = fx x̄ with fx = fx(t, x(t), u(t)), we obtain

Σ_{k=1}^{s} [⟨Qx̄, x̄⟩]^k + ∫_{t0}^{tf} ⟨(Q̇ + fx^∗ Q + Qfx)x̄, x̄⟩ dt − ⟨Qx̄, x̄⟩(tf) + ⟨Qx̄, x̄⟩(t0) = 0,

where the asterisk denotes transposition. Adding this zero form to Ω and using the equality [Hx]^k = −[ψ̇]^k, we get

Ω = ⟨Ap̿, p̿⟩ − ⟨Qx̄, x̄⟩(tf) + ⟨Qx̄, x̄⟩(t0) + Σ_{k=1}^{s} ( Dk(H) ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k + [⟨Qx̄, x̄⟩]^k ) + ∫_{t0}^{tf} ⟨(Hxx + Q̇ + fx^∗ Q + Qfx)x̄, x̄⟩ dt. (6.42)

We shall call this formula the Q-transformation of Ω on K.


In order to eliminate the integral term in Ω, we assume that Q(t) satisfies the following linear matrix differential equation:

Q̇ + fx^∗ Q + Qfx + Hxx = 0 on [t0, tf] \ Θ. (6.43)

It is interesting to note that the same equation is obtained from the modified Riccati equa-
tion in Maurer and Pickenhain [73, Equation (47)], when all control variables are on the
boundary of the control constraints. Using (6.43) the quadratic form (6.42) reduces to


Ω = ω0 + Σ_{k=1}^{s} ωk, (6.44)

ωk := Dk(H) ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k + [⟨Qx̄, x̄⟩]^k, k = 1, . . . , s, (6.45)

ω0 := ⟨Ap̿, p̿⟩ − ⟨Qx̄, x̄⟩(tf) + ⟨Qx̄, x̄⟩(t0). (6.46)

Thus, we have proved the following statement.

Proposition 6.14. Let Q(t) satisfy the linear differential equation (6.43) on [t0, tf] \ Θ. Then for each z̄ ∈ K the representation (6.44) holds.
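Between switching points, (6.43) is a linear matrix ODE and can be integrated by a standard solver, typically backward from a chosen terminal value; the jumps of Q at the points tk are imposed separately (cf. Section 6.3.3). A minimal Python sketch (SciPy assumed; fx and Hxx are callbacks along the extremal):

```python
import numpy as np
from scipy.integrate import solve_ivp

def solve_Q(fx, Hxx, t_end, t_start, Q_end, n):
    """Integrate the linear matrix equation (6.43),
        Qdot + fx^T Q + Q fx + Hxx = 0,
    backward from t_end to t_start on one interval between switching
    times (solve_ivp accepts a decreasing time span)."""
    def rhs(t, q):
        Q = q.reshape(n, n)
        dQ = -(fx(t).T @ Q + Q @ fx(t) + Hxx(t))
        return dQ.ravel()
    sol = solve_ivp(rhs, (t_end, t_start), Q_end.ravel(),
                    dense_output=True, rtol=1e-10, atol=1e-12)
    return lambda t: sol.sol(t).reshape(n, n)
```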

Now our goal is to derive conditions such that ωk > 0, k = 0, . . . , s, holds on K \ {0}.
To this end, as in Section 4.1.6, we shall express ωk via the vector (ξ̄k , x̄ k− ). We use the
formula
x̄ k+ = x̄ k− + [ẋ]k ξ̄k , (6.47)
which implies

⟨Qk+ x̄^{k+}, x̄^{k+}⟩ = ⟨Qk+ x̄^{k−}, x̄^{k−}⟩ + 2⟨Qk+[ẋ]^k, x̄^{k−}⟩ ξ̄k + ⟨Qk+[ẋ]^k, [ẋ]^k⟩ ξ̄k².

Consequently,

[⟨Qx̄, x̄⟩]^k = ⟨[Q]^k x̄^{k−}, x̄^{k−}⟩ + 2⟨Qk+[ẋ]^k, x̄^{k−}⟩ ξ̄k + ⟨Qk+[ẋ]^k, [ẋ]^k⟩ ξ̄k².

Using this relation together with x̄av^k = x̄^{k−} + ½[ẋ]^k ξ̄k in the definition (6.45) of ωk, we obtain

ωk = ( Dk(H) + (([ẋ]^k)^∗ Qk+ − [ψ̇]^k)[ẋ]^k ) ξ̄k² + 2( ([ẋ]^k)^∗ Qk+ − [ψ̇]^k ) x̄^{k−} ξ̄k + (x̄^{k−})^∗ [Q]^k x̄^{k−}. (6.48)

Here [ẋ]k and x̄ k− are column vectors while ([ẋ]k )∗ , (x̄ k− )∗ , and [ψ̇]k are row vectors. By
putting
qk+ = ([ẋ]^k)^∗ Qk+ − [ψ̇]^k, bk+ = Dk(H) + qk+[ẋ]^k, (6.49)
we get
ωk = bk+ ξ̄k² + 2 qk+ x̄^{k−} ξ̄k + (x̄^{k−})^∗ [Q]^k x̄^{k−}. (6.50)

Note that ωk is a quadratic form in the variables (ξ̄k, x̄^{k−}) with the matrix

Mk+ = ( bk+  qk+ ; (qk+)^∗  [Q]^k ), (6.51)

where qk+ is a row vector and (qk+)^∗ is a column vector. Similarly, using the relation x̄^{k−} = x̄^{k+} − [ẋ]^k ξ̄k, we obtain

[⟨Qx̄, x̄⟩]^k = ⟨[Q]^k x̄^{k+}, x̄^{k+}⟩ + 2⟨Qk−[ẋ]^k, x̄^{k+}⟩ ξ̄k − ⟨Qk−[ẋ]^k, [ẋ]^k⟩ ξ̄k².

This formula together with the relation x̄av^k = x̄^{k+} − ½[ẋ]^k ξ̄k leads to the representation (cf. formula (4.160))

ωk = bk− ξ̄k² + 2 qk− x̄^{k+} ξ̄k + (x̄^{k+})^∗ [Q]^k x̄^{k+}, (6.52)

where
qk− = ([ẋ]k )∗ Qk− − [ψ̇]k , bk− = D k (H ) − (qk− )[ẋ]k . (6.53)

We consider (6.52) as a quadratic form in the variables (ξ̄k, x̄^{k+}) with the matrix

Mk− = ( bk−  qk− ; (qk−)^∗  [Q]^k ). (6.54)

Since the right-hand sides of equalities (6.50) and (6.52) are connected by relation (6.47),
the following statement obviously holds.

Proposition 6.15. For each k = 1, . . . , s, the positive (semi)definiteness of the matrix Mk−
is equivalent to the positive (semi)definiteness of the matrix Mk+ .
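Conditions of the form "Mk± is positive semidefinite" reduce to an eigenvalue test on a small symmetric matrix. A sketch of the assembly (6.49)–(6.51) and the test (Python/NumPy; bk+, qk+, and [Q]^k are assumed to come from the computed multiplier and matrix Q):

```python
import numpy as np

def M_plus(b_plus, q_plus, jump_Q):
    """Assemble M_{k+} of (6.51): b_{k+} = D_k(H) + q_{k+}[xdot]^k in the
    (1,1) corner, the row vector q_{k+} of (6.49), and the jump [Q]^k."""
    q = np.atleast_2d(q_plus)                  # 1 x d(x)
    return np.block([[np.array([[b_plus]]), q],
                     [q.T, jump_Q]])

def is_psd(M, tol=1e-10):
    """Positive semidefiniteness via eigenvalues of the symmetric part."""
    return np.linalg.eigvalsh((M + M.T) / 2).min() >= -tol
```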

Now we can prove two theorems.

Theorem 6.16. Assume that s = 1. Let Q(t) be a solution to the linear differential equation (6.43) on [t0, tf] \ Θ which satisfies two conditions:
(i) The matrix M1+ is positive semidefinite;
(ii) the quadratic form ω0 is positive on the cone K0 \ {0}.
Then Ω is positive on K \ {0}.

Proof. Take an arbitrary element z̄ ∈ K. Conditions (i) and (ii) imply that ωk ≥ 0 for k = 0, 1, and hence Ω = ω0 + ω1 ≥ 0 for this element. Assume now that Ω = 0. Then ωk = 0 for k = 0, 1. By virtue of (ii) the equality ω0 = 0 implies that t̄0 = t̄f = 0 and x̄(t0) = x̄(tf) = 0. The last two equalities and the equation x̄˙ = fx x̄ show that x̄(t) = 0 in [t0, t1) ∪ (t1, tf]. Now using formula (6.45) for ω1 = 0, as well as the conditions D1(H) > 0 and x̄^{1−} = x̄^{1+} = 0, we obtain that ξ̄1 = 0. Consequently, we have z̄ = 0, which means that Ω is positive on K \ {0}.

Theorem 6.17. Assume that s ≥ 2. Let Q(t) be a solution to the linear differential equation (6.43) on [t0, tf] \ Θ which satisfies the following conditions:
(a) The matrix Mk+ is positive semidefinite for each k = 1, . . . , s;
(b) bk+ := Dk(H) + qk+[ẋ]^k > 0 for each k = 1, . . . , s − 1;
(c) the quadratic form ω0 is positive on the cone K0 \ {0}.
Then Ω is positive on K \ {0}.

Proof. Take an arbitrary element z̄ ∈ K. Conditions (a) and (c) imply that ωk ≥ 0 for k = 0, 1, . . . , s, and then Ω ≥ 0 for this element. Assume that Ω = 0. Then ωk = 0 for k = 0, 1, . . . , s. By virtue of (c) the equality ω0 = 0 implies that t̄0 = t̄f = 0 and x̄(t0) = x̄(tf) = 0. The last two equalities and the equation x̄˙ = fx x̄ show that x̄(t) = 0 in [t0, t1) ∪ (ts, tf] and hence x̄^{1−} = x̄^{s+} = 0. The conditions ω1 = 0, x̄^{1−} = 0, and b1+ > 0 by formula (6.50) (with k = 1) yield ξ̄1 = 0. Then [x̄]^1 = 0 and hence x̄^{1+} = 0. The last equality and the equation x̄˙ = fx x̄ show that x̄(t) = 0 in (t1, t2) and hence x̄^{2−} = 0. Similarly, the conditions ω2 = 0, x̄^{2−} = 0, and b2+ > 0 by formula (6.50) (with k = 2) imply that ξ̄2 = 0 and x̄(t) = 0 in (t2, t3). Therefore, x̄^{3−} = 0, etc. Continuing this process we get that x̄ ≡ 0 and ξ̄k = 0 for k = 1, . . . , s − 1. Now using formula (6.45) for ωs = 0, as well as the conditions Ds(H) > 0 and x̄ ≡ 0, we obtain that ξ̄s = 0. Consequently, z̄ = 0, and hence Ω is positive on K \ {0}.

Similarly, using representation (6.52) for ωk we can prove the following statement.

Theorem 6.18. Let Q(t) be a solution to the linear differential equation (6.43) on [t0, tf] \ Θ which satisfies the following conditions:
(a′) The matrix Mk− is positive semidefinite for each k = 1, . . . , s;
(b′) bk− := Dk(H) − qk−[ẋ]^k > 0 for each k = 2, . . . , s (if s = 1, then this condition is not required);
(c) the quadratic form ω0 is positive on the cone K0 \ {0}.
Then Ω is positive on K \ {0}.

Remark. Noble and Schättler [80] and Ledzewicz and Schättler [53] also use the linear ODE (6.43) for deriving sufficient conditions. It would be of interest to relate their approach to the results in Theorem 6.18.

6.3.2 Case of Fixed Initial Values t0 and x(t0 )


Consider the problem (6.1)–(6.3) with additional constraints t0 = t̂0 and x(t0) = x̂0. In this case we have additional equalities in the definition of the critical cone K: t̄0 = 0 and x̿0 := x̄(t0) + t̄0 ẋ(t0) = 0, whence x̄(t0) = 0. The last equality and the equation x̄˙ = fx x̄ show that x̄(t) = 0 in [t0, t1), whence x̄^{1−} = 0. From the definitions (6.46) and (6.35) of ω0 and ⟨Ap̿, p̿⟩, respectively, it follows that for each z̄ ∈ K we have

ω0 = ⟨A1 p̿, p̿⟩ − ⟨Q(tf)(x̿f − t̄f ẋ(tf)), (x̿f − t̄f ẋ(tf))⟩, (6.55)

where

⟨A1 p̿, p̿⟩ = l_{tf tf} t̄f² + 2 l_{tf xf} x̿f t̄f + ⟨l_{xf xf} x̿f, x̿f⟩ − 2ψ̇(tf)x̿f t̄f − (ψ̇0(tf) − ψ̇(tf)ẋ(tf)) t̄f². (6.56)
The equalities t̄0 = 0 and x̿0 = 0 hold also for each element p̿ of the finite-dimensional and finite-faced cone K0, given by (6.26) for α0 > 0 and by (6.22) in the general case. Rewriting the terms of ω0, we get the quadratic form in the variables (t̄f, x̿f) generated by the matrix

B := ( B11  B12 ; B12^∗  B22 ),
where
B11 = ltf tf + ψ̇(tf )ẋ(tf ) − ψ̇0 (tf ) − ẋ(tf )∗ Q(tf )ẋ(tf ),
B12 = ltf xf − ψ̇(tf ) + ẋ(tf )∗ Q(tf ), (6.57)
B22 = lxf xf − Q(tf ).
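For a concrete problem, B is a small symmetric matrix assembled from computed data; positivity of ω0 on K0 \ {0} can then be tested by restricting B to K0 (for a subspace K0, via a basis matrix N and the reduced matrix N^T B N). A sketch of the assembly (6.57) in Python, with all arguments assumed inputs:

```python
import numpy as np

def omega0_matrix(l_tftf, l_tfxf, l_xfxf, psidot_f, psidot0_f, xdot_f, Q_f):
    """Assemble the matrix B of (6.57) representing omega_0 as a quadratic
    form in (tf_bar, x_f double-bar) when t0 and x(t0) are fixed."""
    B11 = l_tftf + psidot_f @ xdot_f - psidot0_f - xdot_f @ Q_f @ xdot_f
    B12 = l_tfxf - psidot_f + xdot_f @ Q_f
    B22 = l_xfxf - Q_f
    return np.block([[np.array([[B11]]), np.atleast_2d(B12)],
                     [np.atleast_2d(B12).T, B22]])
```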
The property x̄(t) = 0 in [t0 , t1 ) for z̄ ∈ K allows us to refine Theorems 6.16 and 6.17.

Theorem 6.19. Assume that the initial values t0 = t̂0 and x(t0) = x̂0 are fixed in the problem (6.1)–(6.3), and let s = 1. Let Q(t) be a continuous solution of the linear differential equation (6.43) on [t1, tf] which satisfies two conditions:
(i) b1 := D1(H) + (([ẋ]^1)^∗ Q(t1) − [ψ̇]^1)[ẋ]^1 ≥ 0;
(ii) the quadratic form ω0 is positive on the cone K0 \ {0}.
Then Ω is positive on K \ {0}.

Proof. Continue Q(t) arbitrarily as a solution of the differential equation (6.43) to the whole interval [t0, tf] with a possible jump at the point t1. Note that the value b1 in condition (i) is the same as the value b1+ for the continued solution, and hence b1+ ≥ 0. Let z̄ ∈ K, and hence x̄^{1−} = 0. Then by (6.50) with k = 1 the condition b1+ ≥ 0 implies the inequality ω1 ≥ 0. Condition (ii) implies the inequality ω0 ≥ 0. Consequently Ω = ω0 + ω1 ≥ 0. Further arguments are the same as in the proof of Theorem 6.16.

Theorem 6.20. Assume that the initial values t0 = t̂0 and x(t0) = x̂0 are fixed in the problem (6.1)–(6.3) and s ≥ 2. Let Q(t) be a solution of the linear differential equation (6.43) on (t1, tf] \ Θ which satisfies the following conditions:
(a) The matrix Mk+ is positive semidefinite for each k = 2, . . . , s;
(b) bk+ := Dk(H) + qk+[ẋ]^k > 0 for each k = 1, . . . , s − 1;
(c) the quadratic form ω0 is positive on the cone K0 \ {0}.
Then Ω is positive on K \ {0}.

Proof. Again, without loss of generality, we can consider Q(t) as a discontinuous solution
of equation (6.43) on the whole interval [t0 , tf ]. Let z̄ ∈ K. Then by (6.50) with k = 1 the
conditions b1+ > 0 and x̄ 1− = 0 imply the inequality ω1 ≥ 0. Further arguments are the
same as in the proof of Theorem 6.17.

6.3.3 Q-Transformation of Ω to Perfect Squares

As in Section 5.3.3, we shall formulate special jump conditions for the matrix Q at each point tk ∈ Θ. This will make it possible to transform Ω into perfect squares and thus to prove its positive definiteness on K.

Proposition 6.21. Suppose that

bk+ := D k (H ) + (qk+ )[ẋ]k > 0 (6.58)

and that Q satisfies the jump condition at tk ,

bk+ [Q]k = (qk+ )∗ (qk+ ), (6.59)

where (qk+)^∗ is a column vector while qk+ is a row vector (defined as in (6.49)). Then ωk can be written as the perfect square

ωk = (bk+)^{−1} ( bk+ ξ̄k + qk+ x̄^{k−} )² = (bk+)^{−1} ( Dk(H) ξ̄k + qk+ x̄^{k+} )². (6.60)

Proof. These formulas were proved in Section 4.1.6.
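Given qk+ and bk+ > 0, the jump prescribed by (6.59) is a rank-one update of Q and is trivial to form numerically; a one-function Python sketch:

```python
import numpy as np

def Q_jump(b_plus, q_plus):
    """Rank-one jump of Q at t_k prescribed by (6.59):
    [Q]^k = (q_{k+})^T (q_{k+}) / b_{k+}, which turns omega_k into the
    perfect square (6.60); requires b_{k+} > 0 as in (6.58)."""
    assert b_plus > 0, "condition (6.58) violated"
    q = np.atleast_2d(q_plus)
    return (q.T @ q) / b_plus
```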

Theorem 6.22. Let Q(t) satisfy the linear differential equation (6.43) on [t0, tf] \ Θ, and let conditions (6.58) and (6.59) hold for each k = 1, . . . , s. Let ω0 be positive on K0 \ {0}. Then Ω is positive on K \ {0}.

Proof. By Proposition 6.21 and formulae (6.50), (6.51) the matrix Mk+ is positive semidefinite for each k = 1, . . . , s. Now using Theorem 6.16 for s = 1 and Theorem 6.17 for s ≥ 2, we obtain that Ω is positive on K \ {0}.

Similar assertions hold for the jump conditions that use left-hand values of Q at each point tk ∈ Θ. Suppose that

bk− := D k (H ) − (qk− )[ẋ]k > 0 (6.61)

and that Q satisfies the jump condition at tk

bk− [Q]k = (qk− )∗ (qk− ). (6.62)

Then, according to Proposition 4.30, we have

ωk = (bk−)^{−1} ( bk− ξ̄k + qk− x̄^{k+} )² = (bk−)^{−1} ( Dk(H) ξ̄k + qk− x̄^{k−} )². (6.63)

Using these formulas we deduce the following theorem.

Theorem 6.23. Let Q(t) satisfy the linear differential equation (6.43) on [t0, tf] \ Θ, and let conditions (6.61) and (6.62) hold for each k = 1, . . . , s. Let ω0 be positive on K0 \ {0}. Then Ω is positive on K \ {0}.

6.4 Example: Minimal Fuel Consumption of a Car


The following optimal control problem has been treated by Oberle and Pesch [83] as an exercise in applying the minimum principle. Consider a car whose dynamics (position x1 and velocity x2) are subject to friction and gravitational forces. The acceleration u(t) is
proportional to the fuel consumption. Thus the control problem is to minimize the total fuel
consumption
J = ∫_0^{tf} u(t) dt (6.64)
in a time interval [0, tf ] subject to the dynamic constraints, boundary conditions, and the
control constraints
ẋ1 = x2, ẋ2 = u/(m x2) − αg − (c/m) x2², (6.65)
x1 (0) = 0, x2 (0) = 1, x1 (tf ) = 10, x2 (tf ) = 3, (6.66)
umin ≤ u(t) ≤ umax , 0 ≤ t ≤ tf . (6.67)

The final time tf is unspecified. The following values of the constants will be used in
computations:

m = 4, α = 1, g = 10, c = 0.4, umin = 100, umax = 140.



In view of the integral cost criterion (6.64), we consider the Pontryagin function (Hamiltonian) (cf. (6.17)) in normalized form taking α0 = 1,

H(x1, x2, ψ1, ψ2, u) = u + ψ1 x2 + ψ2 ( u/(m x2) − αg − (c/m) x2² ). (6.68)
The adjoint equations ψ̇ = −Hx are

ψ̇1 = 0, ψ̇2 = −ψ1 + ψ2 ( u/(m x2²) + (2c/m) x2 ). (6.69)

The transversality condition (6.11) evaluated at the free final time tf yields the additional boundary condition

u(tf) + 3ψ1(tf) + ψ2(tf) ( u(tf)/(3m) − αg − 9c/m ) = 0. (6.70)
The switching function

φ(x, ψ) = Du H = 1 + ψ2/(m x2), φ(t) = φ(x(t), ψ(t)),

determines the control law

u(t) = umin if φ(t) > 0, u(t) = umax if φ(t) < 0.
Computations give evidence that the optimal control is bang-bang with one switching time t1:

u(t) = umin for 0 ≤ t < t1, u(t) = umax for t1 ≤ t ≤ tf.
We compute an extremal using the code BNDSCO of Oberle and Grimm [82] or the code
NUDOCCCS of Büskens [13]. The solution is displayed in Figure 6.1. Results for the
switching time t1 , final time tf , and adjoint variables ψ(t) are
t1 = 3.924284, tf = 4.254074,
ψ1 (0) = −42.24170, ψ2 (0) = −3.876396,
x1 (t1 ) = 9.086464, x2 (t1 ) = 2.367329,
ψ1 (tf ) = −42.24170, ψ2 (tf ) = −17.31509.

We will show that this trajectory satisfies the assumptions of Proposition 6.7. The critical
cone is K = {0}, since the computed vectors

(∂x/∂t1)(tf) = (−0.6326710, −0.7666666)^∗, ẋ(tf) = (3.0, 0.7666666)^∗

are linearly independent. Moreover, we find in view of (6.14) that

D1(H) = −φ̇(t1)[u]^1 = 0.472397 · 40 > 0.
Theorem 6.10 then asserts that the computed bang-bang control provides a strict strong
minimum.
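The reported numbers can be checked independently by integrating the state and variational equations with the given switching structure. A Python sketch (SciPy assumed; the switching and final times are taken from the values above, which were obtained with BNDSCO/NUDOCCCS):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Data of problem (6.64)-(6.67); t1, tf are the values reported above.
m, alpha, g, c = 4.0, 1.0, 10.0, 0.4
u_min, u_max = 100.0, 140.0
t1, tf = 3.924284, 4.254074

def f(x, u):                       # right-hand side of (6.65)
    return np.array([x[1], u / (m * x[1]) - alpha * g - (c / m) * x[1] ** 2])

def fx(x, u):                      # Jacobian of f with respect to x
    return np.array([[0.0, 1.0],
                     [0.0, -u / (m * x[1] ** 2) - 2.0 * c * x[1] / m]])

# integrate the state piecewise (u = u_min on [0, t1], u = u_max on [t1, tf])
opts = dict(rtol=1e-10, atol=1e-12, dense_output=True)
arc1 = solve_ivp(lambda t, x: f(x, u_min), (0.0, t1), [0.0, 1.0], **opts)
x_t1 = arc1.y[:, -1]
arc2 = solve_ivp(lambda t, x: f(x, u_max), (t1, tf), x_t1, **opts)
x_tf = arc2.y[:, -1]
print("x(tf) =", x_tf)             # should be close to (10, 3) from (6.66)

# variational equation y' = f_x y on [t1, tf], y(t1) = [xdot]^1
jump = f(x_t1, u_max) - f(x_t1, u_min)
var = solve_ivp(lambda t, y: fx(arc2.sol(t), u_max) @ y, (t1, tf), jump, **opts)
y_tf = var.y[:, -1]
print("dx/dt1(tf) =", -y_tf)       # compare with (-0.6326710, -0.7666666)

# Proposition 6.7: K = {0} if these s + 1 = 2 vectors are independent
V = np.column_stack([-y_tf, f(x_tf, u_max)])
print("rank =", np.linalg.matrix_rank(V))   # expected: 2
```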


Figure 6.1. Minimal fuel consumption of a car. (a) State variables x1 , x2 . (b) Bang-
bang control u. (c) Adjoint variable ψ2 . (d) Switching function φ.

6.5 Quadratic Optimality Conditions in Time-Optimal Bang-Bang Control Problems
In this section, we present the results of our paper [70].

6.5.1 Statement of the Problem; Pontryagin and Strong Minimum


We consider time-optimal control problems with control appearing linearly. Let x(t) ∈ R^{d(x)} denote the state variable and u(t) ∈ R^{d(u)} the control variable in the time interval t ∈ Δ = [0, tf] with a nonfixed final time tf > 0. For simplicity, the initial and terminal states are fixed in the following control problem:

Minimize the final time tf (6.71)

subject to the constraints on the interval Δ = [0, tf],

dx/dt = ẋ = f (t, x, u) = a(t, x) + B(t, x)u, (6.72)


x(0) = a0 , x(tf ) = a1 , (6.73)
u(t) ∈ U , (t, x(t)) ∈ Q. (6.74)

Here, a0 and a1 are given points in Rd(x) , Q ⊂ R1+d(x) is an open set, and U ⊂ Rd(u) is a con-
vex polyhedron. The functions a, B are twice continuously differentiable on Q, with B being
a d(x) × d(u) matrix function. A trajectory or control process T = { (x(t), u(t)) | t ∈ [0, tf ] }
is said to be admissible if x(·) is absolutely continuous, u(·) is measurable and essen-
tially bounded, and the pair of functions (x(t), u(t)) satisfies the constraints (6.72)–(6.74)

on the interval Δ = [0, tf]. Let us define the Pontryagin and the strong minimum in the problem.
An admissible trajectory T is said to be a Pontryagin minimum if there is no sequence
of admissible trajectories T n = {(x n (t), un (t)) | t ∈ [0, tfn ]}, n = 1, 2, . . . , with
(a) tfn < tf for n = 1, 2, . . . ;
(b) tf^n → tf for n → ∞;
(c) max_{Δ^n} |x^n(t) − x(t)| → 0 for n → ∞, where Δ^n = [0, tf^n];
(d) ∫_{Δ^n} |u^n(t) − u(t)| dt → 0 for n → ∞.


An admissible trajectory T is said to be a strong minimum (respectively, a strict strong minimum) if there is no sequence of admissible trajectories T^n, n = 1, 2, . . . , such that
(a) tf^n < tf (respectively, tf^n ≤ tf and T^n ≠ T ) for n = 1, 2, . . .;
(b) tf^n → tf for n → ∞;
(c) max_{Δ^n} |x^n(t) − x(t)| → 0 for n → ∞, where Δ^n = [0, tf^n].

6.5.2 Minimum Principle


Let T = (x(t), u(t) | t ∈ [0, tf]) be a fixed admissible trajectory such that the control u(·) is a piecewise constant function on the interval Δ = [0, tf] with finitely many points of discontinuity. Denote by Θ = {t1, . . . , ts}, 0 < t1 < · · · < ts < tf, the finite set of all discontinuity points (jump points) of the control u(t). Then ẋ(t) is a piecewise continuous function whose discontinuity points belong to the set Θ and, thus, x(t) is a piecewise smooth function on Δ. We use the notation [u]^k = u^{k+} − u^{k−} for the jump of the function u(t) at the point tk ∈ Θ, where u^{k−} = u(tk−), u^{k+} = u(tk+) are, respectively, the left- and right-hand values of the control u(t) at tk. Similarly, we denote by [ẋ]^k the jump of the function ẋ(t) at the same point.
Let us formulate the first-order necessary conditions of optimality for the trajec-
tory T , the Pontryagin minimum principle. To this end we introduce the Pontryagin function
or Hamiltonian function
H (t, x, u, ψ) = ψf (t, x, u) = ψa(t, x) + ψB(t, x)u, (6.75)
where ψ is a row vector of dimension d(x), while x, u, and f are column vectors. The factor
of the control u in the Pontryagin function is the switching function φ(t, x, ψ) = ψB(t, x).
Consider the pair of functions ψ0(·) : Δ → R^1, ψ(·) : Δ → R^{d(x)}, which are continuous on Δ and continuously differentiable on each interval of the set Δ \ Θ. Denote by M0 the set of normed pairs of functions (ψ0(·), ψ(·)) satisfying the conditions

ψ0(tf) ≥ 0, |ψ(0)| = 1, (6.76)
ψ̇(t) = −Hx(t, x(t), u(t), ψ(t)) ∀ t ∈ Δ \ Θ, (6.77)
ψ̇0(t) = −Ht(t, x(t), u(t), ψ(t)) ∀ t ∈ Δ \ Θ, (6.78)
min_{u∈U} H(t, x(t), u, ψ(t)) = H(t, x(t), u(t), ψ(t)) ∀ t ∈ Δ \ Θ, (6.79)
H(t, x(t), u(t), ψ(t)) + ψ0(t) = 0 ∀ t ∈ Δ \ Θ. (6.80)
Then the condition M0 ≠ ∅ is equivalent to the Pontryagin minimum principle. This is the
first-order necessary condition for a Pontryagin minimum. We assume that this condition
is satisfied for the trajectory T . We say in this case that T is an extremal trajectory for
the problem. The set M0 is a finite-dimensional compact set, since in (6.76) the initial

values ψ(0) are assumed to belong to the unit ball of Rd(x) . The case that there exists
a multiplier (ψ0 , ψ) ∈ M0 with ψ0 (tf ) > 0 will be called the nondegenerate or normal
case. Again we use the simple abbreviation (t) for all arguments (t, x(t), u(t), ψ(t)), e.g.,
φ(t) = φ(t, x(t), ψ(t)).
Let us introduce the quantity Dk(H). For (ψ0, ψ) ∈ M0 and tk ∈ Θ consider the function

(Δk H)(t) = H(t, x(t), u^{k+}, ψ(t)) − H(t, x(t), u^{k−}, ψ(t)) = φ(t, x(t), ψ(t))[u]^k.

This function has a derivative at the point tk ∈ Θ. We set

Dk(H) = −(d/dt)(Δk H)(tk) = −φ̇(tk±)[u]^k.
We know that for each (ψ0 , ψ) ∈ M0

D k (H ) ≥ 0 for k = 1, . . . , s. (6.81)

We need the definition of a strict bang-bang control (see Section 6.1.4) to obtain the suffi-
cient conditions in Theorem 6.27. For a given extremal trajectory T = {(x(t), u(t)) | t ∈ Δ}
with piecewise constant control u(t) we say that u(t) is a strict bang-bang control if there
exists (ψ0 , ψ) ∈ M0 such that

Arg min_{u∈U} φ(t, x(t), ψ(t))u = [u(t−), u(t+)] ∀ t ∈ [0, tf], (6.82)

where

[u(t−), u(t+)] = {αu(t−) + (1 − α)u(t+) | 0 ≤ α ≤ 1}
denotes the line segment in Rd(u) . As it was mentioned already in Section 6.1.4, if U is
the unit cube in Rd(u) , condition (6.82) precludes simultaneous switching of the control
components. However, this property holds for all numerical examples in Chapter 8.
In order to formulate quadratic optimality conditions for a given extremal T with bang-bang control u(·), we shall introduce the space Z(Δ), the critical subspace K ⊂ Z(Δ), and the quadratic form Ω defined on Z(Δ).

6.5.3 Critical Subspace


As in Section 6.2.1, we denote by PC^1(Δ, R^n) the space of piecewise continuous functions x̄(·) : Δ → R^n that are continuously differentiable on each interval of the set Δ \ Θ. For each x̄ ∈ PC^1(Δ, R^n) and for tk ∈ Θ, we use the abbreviation [x̄]^k = x̄^{k+} − x̄^{k−}, where x̄^{k−} = x̄(tk−), x̄^{k+} = x̄(tk+). Putting

z̄ = (t̄f, ξ̄, x̄) with t̄f ∈ R^1, ξ̄ ∈ R^s, x̄ ∈ PC^1(Δ, R^n),

we have z̄ ∈ Z(Δ) := R^1 × R^s × PC^1(Δ, R^n). Denote by K the set of all z̄ ∈ Z(Δ) satisfying the following conditions:
x̄˙(t) = fx(t, x(t), u(t))x̄(t), [x̄]^k = [ẋ]^k ξ̄k, k = 1, . . . , s, (6.83)
x̄(0) = 0, x̄(tf) + ẋ(tf)t̄f = 0. (6.84)

Then K is a subspace of the space Z(Δ) which we call the critical subspace. Each element z̄ ∈ K is uniquely defined by the number t̄f and the vector ξ̄. Consequently, the subspace K is finite-dimensional.
An explicit representation of the variations x̄(t) in (6.83) is obtained as in Section
6.2.1. For each k = 1, . . . , s, define the vector functions y k (t) as the solutions to the system
ẏ = fx (t)y, y(tk ) = [ẋ]k , t ∈ [tk , tf ]. (6.85)

For t < tk we put y^k(t) = 0, which yields the jump [y^k]^k = [ẋ]^k. Then

x̄(t) = Σ_{k=1}^{s} y^k(t) ξ̄k, (6.86)

from which we obtain the representation

x̄(tf) + ẋ(tf)t̄f = Σ_{k=1}^{s} y^k(tf) ξ̄k + ẋ(tf)t̄f. (6.87)

Furthermore, denote by x(t; t1, . . . , ts) the solution of the state equation (6.72) using the optimal bang-bang control with switching points t1, . . . , ts. Then the partial derivatives of state trajectories with respect to the switching points are given by

(∂x/∂tk)(t; t1, . . . , ts) = −y^k(t) for t ≥ tk, k = 1, . . . , s. (6.88)

This relation holds for all t ∈ [0, tf] \ {tk}, because for t < tk we have (∂x/∂tk)(t) = 0 and y^k(t) = 0. Hence, (6.86) yields

x̄(t) = −Σ_{k=1}^{s} (∂x/∂tk)(t) ξ̄k. (6.89)
k=1
In the nondegenerate case ψ0 (tf ) > 0, the critical subspace is simplified as follows.

Proposition 6.24. If there exists (ψ0 , ψ) ∈ M0 such that ψ0 (tf ) > 0, then t¯f = 0 holds for
each z̄ = (t¯f , ξ̄ , x̄) ∈ K.

This proposition is a straightforward consequence of Proposition 6.8. In Section 6.5.4, we shall conclude from Theorem 6.27 that the property K = {0} essentially represents a first-order sufficient condition. Since x̄(tf) + ẋ(tf)t̄f = 0 by (6.84), the representations (6.86) and (6.87) and Proposition 6.24 induce the following conditions for K = {0}.

Proposition 6.25. Assume that one of the following conditions is satisfied:
(a) The s + 1 vectors y^k(tf) = −(∂x/∂tk)(tf), k = 1, . . . , s, and ẋ(tf) are linearly independent;
(b) there exists (ψ0, ψ) ∈ M0 with ψ0(tf) > 0, and the s vectors y^k(tf) = −(∂x/∂tk)(tf), k = 1, . . . , s, are linearly independent;
(c) there exists (ψ0, ψ) ∈ M0 with ψ0(tf) > 0, and the bang-bang control has exactly one switching point, i.e., s = 1.
Then the critical subspace is K = {0}.

Now we discuss the case of two switching points, i.e., s = 2, to prepare the numerical example in Section 6.5.4. Let us assume that ψ0(tf) > 0 (for some (ψ0, ψ) ∈ M0) and [ẋ]^1 ≠ 0, [ẋ]^2 ≠ 0. By virtue of Proposition 6.24, we have t̄f = 0 and hence x̄(tf) = 0 for
each element z̄ ∈ K. Then the relations (6.84) and (6.86) yield
0 = x̄(tf ) = y 1 (tf )ξ̄1 + y 2 (tf )ξ̄2 . (6.90)

The conditions [ẋ]^1 ≠ 0 and [ẋ]^2 ≠ 0 imply that y^1(tf) ≠ 0 and y^2(tf) ≠ 0, respectively. Furthermore, assume that K ≠ {0}. Then (6.90) shows that the nonzero vectors y^1(tf) and y^2(tf) are collinear, i.e.,

y^2(tf) = αy^1(tf) (6.91)

with some factor α ≠ 0. As a consequence, the relation y^2(t) = αy^1(t) is valid for all t ∈ (t2, tf]. In particular, we have y^2(t2+) = αy^1(t2) and thus

[ẋ]^2 = αy^1(t2), (6.92)


which is equivalent to (6.91). In addition, the equalities (6.90) and (6.91) imply that
ξ̄2 = −(1/α) ξ̄1. (6.93)
We shall use this formula in the next section.

6.5.4 Quadratic Form


For (ψ0, ψ) ∈ M0 and z̄ ∈ K, we define the functional (see formulas (6.34) and (6.35))

Ω(ψ0, ψ, z̄) = Σ_{k=1}^{s} ( Dk(H) ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k ) + ∫_0^{tf} ⟨Hxx(t)x̄(t), x̄(t)⟩ dt − (ψ̇0(tf) − ψ̇(tf)ẋ(tf)) t̄f², (6.94)

where x̄av^k := ½ (x̄^{k−} + x̄^{k+}). Now we introduce second-order optimality conditions for
bang-bang control in the problem (6.71)–(6.74). From Theorem 6.9 we easily deduce the
following result.

Theorem 6.26. Let a trajectory T afford a Pontryagin minimum. Then the following Condition A holds for the trajectory T : The set M0 is nonempty and

max_{(ψ0,ψ)∈M0} Ω(ψ0, ψ, z̄) ≥ 0 ∀ z̄ ∈ K \ {0}.

Similarly, from Theorem 6.10 we obtain the following theorem.

Theorem 6.27. Let the following Condition B be fulfilled for the trajectory T :
(a) there exists λ ∈ M0 such that Dk(H) > 0, k = 1, . . . , s, and condition (6.82) holds (i.e., u(t) is a strict bang-bang control);
(b) max_{(ψ0,ψ)∈M0} Ω(ψ0, ψ, z̄) > 0 for all z̄ ∈ K \ {0}.
Then T is a strict strong minimum.

Remarks.
1. The sufficient Condition B is a natural strengthening of the necessary Condition A.
2. Condition (b) is automatically fulfilled if K = {0} holds (cf. Proposition 6.25) which
gives a first-order sufficient condition for a strong minimum.
3. If there exists (ψ0, ψ) ∈ M0 such that Ω(ψ0, ψ, z̄) > 0 for all z̄ ∈ K \ {0}, then condition (b) is obviously fulfilled.
4. For boxes U = {u = (u1, . . . , u_{d(u)}) ∈ R^{d(u)} : u_i^{min} ≤ u_i ≤ u_i^{max}, i = 1, . . . , d(u)}, the condition Dk(H) > 0, k = 1, . . . , s, is equivalent to the property φ̇i(tk) ≠ 0 if tk is a switching point of the ith control component ui(t). Note again that condition (6.82) precludes the simultaneous switching of two or more control components.
5. A further remark concerns the case that the set M0 of Pontryagin multipliers is not a
singleton. This case was illustrated in [89] by the following time-optimal control problem
for a linear system:

ẋ1 = x2 , ẋ2 = x3 , ẋ3 = x4 , ẋ4 = u, |u| ≤ 1, x(0) = a, x(tf ) = b,

where x = (x1 , x2 , x3 , x4 ). It was shown in [89] that for some a and b there exists an extremal
in this problem with two switching points of the control such that, under appropriate nor-
malization, the set M0 is a segment. For this extremal, the maximum of the quadratic forms Ω is positive on each nonzero element of the critical subspace, and hence the sufficient conditions of Theorem 6.27 are satisfied. But this is not true for any single quadratic form Ω (corresponding to an element of the set M0).

6.5.5 Nondegenerate Case


Let us assume the nondegenerate or normal case that there exists (ψ0 , ψ) ∈ M0 such that
the cost function multiplier ψ0 (tf ) is positive. By virtue of Proposition 6.24 we have in this
case that t¯f = 0 for all z̄ ∈ K. Thus the critical subspace K is defined by the conditions

x̄˙(t) = fx(t)x̄(t), [x̄]^k = [ẋ]^k ξ̄k (k = 1, . . . , s), x̄(0) = 0, x̄(tf) = 0. (6.95)

In particular, these conditions imply x̄(t) ≡ 0 on [0, t1 ) and (ts , tf ]. Hence we have x̄ 1− =
x̄ s+ = 0 for all z̄ ∈ K. Then the quadratic form (6.94) is equal to
Ω(ψ, z̄) = Σ_{k=1}^{s} ( Dk(H) ξ̄k² + 2[Hx]^k x̄av^k ξ̄k ) + ∫_0^{tf} ⟨Hxx(t)x̄(t), x̄(t)⟩ dt. (6.96)

This case of a time-optimal (autonomous) control problem was studied by Sarychev [104]. He used a special transformation of the problem and obtained a sufficient optimality condition for the transformed problem. It is not easy, but it is possible, to reformulate his results in terms of the original problem. The comparison of both types of conditions reveals that Sarychev used the same critical subspace, but his quadratic form is a lower bound for Ω. Namely, in his quadratic form the positive term Dk(H) ξ̄k² has the factor 1/4 instead of the factor 1 for the same term in Ω. Therefore, the sufficient Condition B is always fulfilled whenever Sarychev's condition is fulfilled. However, there is an example of a control problem where the optimal solution satisfies Condition B but does not satisfy Sarychev's

condition. Finally, Sarychev proved that his condition is sufficient for an L1 -minimum with
respect to the control (which is a Pontryagin minimum in this problem). In fact, it could be
proved that his condition is sufficient for a strong minimum.

6.5.6 Cases of One or Two Switching Times of the Control


From Theorem 6.27 and Proposition 6.25(c), we immediately deduce sufficient conditions
for a bang-bang control with one switching point. The result is used for the example
in Section 6.7.1 and is also applicable to the time-optimal control of an image converter
discussed by Kim et al. [47].

Theorem 6.28. Let the following conditions be fulfilled for the trajectory T :
(a) u(t) is a bang-bang control with one switching point, i.e., s = 1,
(b) there exists (ψ0 , ψ) ∈ M0 such that D 1 (H ) > 0 and condition (6.82) holds
(i.e., u(t) is a strict bang-bang control),
(c) there exists (ψ0 , ψ) ∈ M0 with ψ0 (tf ) > 0.
Then T is a strict strong minimum.

Now we turn our attention to the case of two switching points where s = 2. Assume the nondegenerate case ψ0(tf) > 0 and suppose that [ẋ]^1 ≠ 0, [ẋ]^2 ≠ 0 and y^2(tf) = αy^1(tf) as in (6.91). Otherwise, K = {0} holds and, hence, the first-order sufficient condition for a strong minimum is satisfied. For any element z̄ ∈ K, we have t̄f = 0, x̄^{1−} = 0, x̄^{2+} = 0.
Consequently,

x̄av^1 = ½[x̄]^1 = ½[ẋ]^1 ξ̄1, x̄av^2 = ½x̄^{2−} = ½ y^1(t2) ξ̄1 = (1/(2α))[ẋ]^2 ξ̄1
in view of x̄(t) = y^1(t)ξ̄1 + y^2(t)ξ̄2, y^2(t2−) = 0, and (6.92). Using these relations in the quadratic form (6.96) together with (6.93) and the conditions y^2(t) = 0 for all t < t2 and [Hx]^k = −[ψ̇]^k, k = 1, 2, we compute the quadratic form for an element of the critical
subspace as
Ω = D1(H) ξ̄1² + D2(H) ξ̄2² − 2[ψ̇]^1 x̄av^1 ξ̄1 − 2[ψ̇]^2 x̄av^2 ξ̄2 + ∫_{t1}^{t2} ⟨Hxx x̄, x̄⟩ dt
= ( D1(H) − [ψ̇]^1[ẋ]^1 + (1/α²)( D2(H) + [ψ̇]^2[ẋ]^2 ) + ∫_{t1}^{t2} ⟨Hxx y^1, y^1⟩ dt ) ξ̄1²
= ρ ξ̄1²,

where

ρ = D1(H) − [ψ̇]^1[ẋ]^1 + (1/α²)( D2(H) + [ψ̇]^2[ẋ]^2 ) + ∫_{t1}^{t2} ⟨Hxx y^1, y^1⟩ dt. (6.97)

Thus, we obtain the following proposition.

Proposition 6.29. Assume that ψ0(tf) > 0, s = 2, [ẋ]^1 ≠ 0, [ẋ]^2 ≠ 0, and y^2(tf) = αy^1(tf) (which is equivalent to (6.91)) with some factor α. Then the positive definiteness of Ω on K is equivalent to the inequality ρ > 0, where ρ is defined as in (6.97).
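As with (6.37), ρ in (6.97) is computable by one integration of the variational equation on [t1, t2] and a quadrature. A hedged Python sketch (SciPy; all arguments are quantities read off the computed extremal):

```python
import numpy as np
from scipy.integrate import solve_ivp, quad

def rho_two_switches(D1, D2, jp1, jx1, jp2, jx2, alpha, fx, Hxx, t1, t2):
    """Evaluate (6.97) for s = 2:
    rho = D_1(H) - [psidot]^1 [xdot]^1
          + (1/alpha^2) (D_2(H) + [psidot]^2 [xdot]^2)
          + int_{t1}^{t2} <Hxx y^1, y^1> dt,
    with y^1 the solution of y' = f_x y, y(t1) = [xdot]^1."""
    sol = solve_ivp(lambda t, y: fx(t) @ y, (t1, t2), jx1,
                    dense_output=True, rtol=1e-10, atol=1e-12)
    integral, _ = quad(lambda t: sol.sol(t) @ Hxx(t) @ sol.sol(t), t1, t2)
    return D1 - jp1 @ jx1 + (D2 + jp2 @ jx2) / alpha**2 + integral
```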

6.6 Sufficient Conditions for Positive Definiteness of the Quadratic Form Ω on the Critical Subspace K for Time-Optimal Control Problems
In this section, we consider the nondegenerate case as in Section 6.5.5 and assume
(i) u(t) is a bang-bang control with s > 1 switching points,
(ii) there exists (ψ0 , ψ) ∈ M0 such that ψ0 (tf ) > 0 and D k (H ) > 0, k = 1, . . . , s.
Under these assumptions the critical subspace K is defined as in (6.95). Let (ψ0, ψ) ∈ M0 be a fixed element (possibly different from that in assumption (ii)), and denote by Ω = Ω(ψ0, ψ, ·) the quadratic form for this element. Recall that Ω is given by (6.96). According to Theorem 6.27 the positive definiteness of the quadratic form (6.96) on the subspace K in (6.95) is a sufficient condition for a strict strong minimum of the trajectory. Now our aim is to find conditions that guarantee the positive definiteness of Ω on K.

6.6.1 Q-Transformation of Ω on K
Here we shall use the same arguments as in Sections 5.3.2 and 6.3.1. Let Q(t) be a symmetric matrix on [t1, ts] with piecewise continuous entries which are absolutely continuous on each interval of the set [t1, ts] \ Θ. Therefore, Q may have a jump at each point tk ∈ Θ including t1, ts, and thus the symmetric matrices Q1− and Qs+ are also defined. For z̄ ∈ K, we obviously have

∫_{t1}^{ts} (d/dt)⟨Qx̄, x̄⟩ dt = ⟨Qx̄, x̄⟩|_{t1−}^{ts+} − Σ_{k=1}^{s} [⟨Qx̄, x̄⟩]^k,

where [⟨Qx̄, x̄⟩]^k is the jump of the function ⟨Qx̄, x̄⟩ at the point tk ∈ Θ. Using the conditions x̄˙ = fx(t)x̄ and x̄^{1−} = x̄^{s+} = 0, we obtain

Σ_{k=1}^{s} [⟨Qx̄, x̄⟩]^k + ∫_{t1}^{ts} ⟨(Q̇ + fx^∗ Q + Qfx)x̄, x̄⟩ dt = 0, (6.98)

where the asterisk denotes transposition. Adding this zero form to Ω, we get

Ω = Σ_{k=1}^{s} ( Dk(H) ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k + [⟨Qx̄, x̄⟩]^k ) + ∫_{t1}^{ts} ⟨(Hxx + Q̇ + fx^∗ Q + Qfx)x̄, x̄⟩ dt. (6.99)

We call this formula the Q-transformation of Ω on K.
To eliminate the integral term in Ω, we assume that Q(t) satisfies the following linear matrix differential equation:

Q̇ + fx^∗ Q + Qfx + Hxx = 0 on [t1, ts] \ Θ. (6.100)

Using (6.100), the quadratic form (6.99) reduces to

Ω = Σ_{k=1}^{s} ωk, ωk := Dk(H) ξ̄k² − 2[ψ̇]^k x̄av^k ξ̄k + [⟨Qx̄, x̄⟩]^k. (6.101)

Thus, we have proved the following statement.

Proposition 6.30. Let Q(t) satisfy the linear differential equation (6.100) on [t1, ts] \ Θ. Then for each z̄ ∈ K the representation (6.101) holds.

Now our goal is to derive conditions such that ωk > 0 holds on K \ {0} for k = 1, . . . , s.
We shall use the representations of ωk given in Section 6.3.1. According to (6.50),

ωk = D k (H ) + (qk+ )[ẋ]k ξ̄k2 + 2(qk+ )x̄ k− ξ̄k + (x̄ k− )∗ [Q]k x̄ k− , (6.102)

where qk+ = ([ẋ]k )∗ Qk+ − [ψ̇]k . We immediately see from this representation that one
way to enforce ωk > 0 is to impose the following conditions:

D k (H ) > 0, qk+ = ([ẋ]k )∗ Qk+ − [ψ̇]k = 0, [Q]k ≥ 0. (6.103)

In practice, however, it might be difficult to check this condition since it is necessary to satisfy the d(x) equality constraints qk+ = ([ẋ]^k)^∗ Qk+ − [ψ̇]^k = 0 and the inequality constraints [Q]^k ≥ 0. It is more convenient to express ωk as a quadratic form in the variables (ξ̄k, x̄^{k−}) with the matrix

Mk+ = ( Dk(H) + qk+[ẋ]^k  qk+ ; (qk+)^∗  [Q]^k ), (6.104)
where qk+ is a row vector and (qk+ )∗ is a column vector.
Similarly, according to (6.52), the following representation holds:

ωk = D k (H ) − (qk− )[ẋ]k ξ̄k2 + 2(qk− )x̄ k+ ξ̄k + (x̄ k+ )∗ [Q]k x̄ k+ , (6.105)

where q_k^− = ([ẋ]_k)^* Q^{k−} − [ψ̇]_k. Again, we see that ω_k > 0 holds if we require the conditions

D^k(H) > 0,   q_k^− = ([ẋ]_k)^* Q^{k−} − [ψ̇]_k = 0,   [Q]_k ≥ 0.   (6.106)

To find a more general condition for ω_k > 0, we consider (6.105) as a quadratic form in the variables (ξ̄_k, x̄^{k+}) with the matrix

M_k^− = [ D^k(H) − (q_k^−)[ẋ]_k    q_k^−
          (q_k^−)^*                [Q]_k ].   (6.107)
Since the right-hand sides of equalities (6.102) and (6.105) are connected by the relation x̄^{k+} = x̄^{k−} + [ẋ]_k ξ̄_k, the following statement obviously holds.
Proposition 6.31. For each k = 1, . . . , s, the positive (semi)definiteness of the matrix Mk−
is equivalent to the positive (semi)definiteness of the matrix Mk+ .
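Indeed, the substitution x̄^{k+} = x̄^{k−} + [ẋ]_k ξ̄_k is an invertible linear change of variables, so the two matrices are congruent; a minimal sketch of the identity behind Proposition 6.31:

$$M_k^{+} \;=\; T^{*} M_k^{-}\, T, \qquad T=\begin{pmatrix} 1 & 0\\ [\dot x]_k & I \end{pmatrix},$$

and a congruence transformation with an invertible T preserves positive (semi)definiteness.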

Now we can prove the following theorem.

Theorem 6.32. Let Q(t) be a solution of the linear differential equation (6.100) on [t1, ts] \ Θ which satisfies the following conditions:
(a) the matrix M_k^+ is positive semidefinite for each k = 2, . . . , s;
(b) b_k^+ := D^k(H) + (q_k^+)[ẋ]_k > 0 for each k = 1, . . . , s − 1.
Then Ω is positive on K \ {0}.
Proof. Take an arbitrary element z̄ = (ξ̄, x̄) ∈ K. Let us show that Ω ≥ 0 for this element. Condition (a) implies that ω_k ≥ 0 for k = 2, . . . , s. Condition (b) for k = 1 together with the condition x̄^{1−} = 0 implies that ω_1 ≥ 0. Consequently, Ω ≥ 0. Assume that Ω = 0. Then ω_k = 0, k = 1, . . . , s. The conditions ω_1 = 0, x̄^{1−} = 0, and b_1^+ > 0 by formula (6.102) (with k = 1) yield ξ̄_1 = 0. Then [x̄]_1 = 0 and hence x̄^{1+} = 0. The last equality and the equation x̄˙ = fx(t)x̄ show that x̄(t) = 0 in (t1, t2) and hence x̄^{2−} = 0. Similarly, the conditions ω_2 = 0, x̄^{2−} = 0, and b_2^+ > 0 by formula (6.102) (with k = 2) imply that ξ̄_2 = 0 and x̄(t) = 0 in (t2, t3). Therefore, x̄^{3−} = 0, etc. Continuing this process we get x̄ ≡ 0 and ξ̄_k = 0 for k = 1, . . . , s − 1. Now using formula (6.101) for ω_s = 0, as well as the conditions D^s(H) > 0 and x̄ ≡ 0, we obtain ξ̄_s = 0. Consequently, we have z̄ = 0, which means that Ω is positive on K \ {0}.
Similarly, using representation (6.105) for ω_k we can prove the following statement.

Theorem 6.33. Let Q(t) be a solution of the linear differential equation (6.100) on [t1, ts] \ Θ which satisfies the following conditions:
(a) the matrix M_k^− is positive semidefinite for each k = 1, . . . , s − 1;
(b) b_k^− := D^k(H) − (q_k^−)[ẋ]_k > 0 for each k = 2, . . . , s.
Then Ω is positive on K \ {0}.
6.6.2 Q-Transformation of Ω to Perfect Squares

Here, as in Section 6.3.3, we formulate special jump conditions for the matrix Q at each point tk ∈ Θ, which will make it possible to transform Ω into perfect squares and thus to prove its positive definiteness on K. Suppose that

b_k^− := D^k(H) − (q_k^−)[ẋ]_k > 0   (6.108)
and that Q satisfies the jump condition at tk,

b_k^− [Q]_k = (q_k^−)^* (q_k^−),   (6.109)

where (q_k^−)^* is a column vector while q_k^− is a row vector. Then according to (6.63),

ω_k = (b_k^−)^{−1} ( (b_k^−) ξ̄_k + (q_k^−) x̄^{k+} )² = (b_k^−)^{−1} ( D^k(H) ξ̄_k + (q_k^−) x̄^{k−} )².   (6.110)
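For completeness, the first equality in (6.110) follows by substituting the rank-one jump (6.109) into the representation (6.105):

$$\omega_k = b_k^{-}\,\bar\xi_k^{\,2} + 2\,(q_k^{-}\bar x^{k+})\,\bar\xi_k + \frac{(q_k^{-}\bar x^{k+})^2}{b_k^{-}} = \frac{\bigl(b_k^{-}\bar\xi_k + q_k^{-}\bar x^{k+}\bigr)^2}{b_k^{-}},$$

and the second equality uses x̄^{k+} = x̄^{k−} + [ẋ]_k ξ̄_k together with b_k^− + (q_k^−)[ẋ]_k = D^k(H).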

Theorem 6.34. Let Q(t) satisfy the linear differential equation (6.100) on [t1, ts] \ Θ. Let condition (6.108) hold for each k = 1, . . . , s and condition (6.109) hold for each k = 1, . . . , s − 1. Then Ω is positive on K \ {0}.

Proof. According to (6.110), the matrix M_k^− is positive semidefinite for each k = 1, . . . , s − 1 (cf. (6.105) and (6.107)), and hence both conditions (a) and (b) of Theorem 6.33 are fulfilled. Then by Theorem 6.33, Ω is positive on K \ {0}.
Similar assertions hold for the jump conditions that use right-hand values of Q at each point tk ∈ Θ. Suppose that

b_k^+ := D^k(H) + (q_k^+)[ẋ]_k > 0   (6.111)

and that Q satisfies the jump condition at the point tk,

b_k^+ [Q]_k = (q_k^+)^* (q_k^+).   (6.112)

Then

ω_k = (b_k^+)^{−1} ( (b_k^+) ξ̄_k + (q_k^+) x̄^{k−} )² = (b_k^+)^{−1} ( D^k(H) ξ̄_k + (q_k^+) x̄^{k+} )².   (6.113)

Theorem 6.35. Let Q(t) satisfy the linear differential equation (6.100) on [t1, ts] \ Θ. Let condition (6.111) hold for each k = 1, . . . , s and condition (6.112) hold for each k = 2, . . . , s. Then Ω is positive on K \ {0}.
6.6.3 Case of Two Switching Times of the Control

Let s = 2, i.e., Θ = {t1, t2}, and let Q(t) be a symmetric matrix with absolutely continuous entries on [t1, t2]. Put

Q_k = Q(t_k),   q_k = ([ẋ]_k)^* Q_k − [ψ̇]_k,   k = 1, 2.
Theorem 6.36. Let Q(t) satisfy the linear differential equation (6.100) on (t1, t2) and let the following inequalities hold at t1, t2:

D¹(H) + q_1[ẋ]_1 > 0,   D²(H) − q_2[ẋ]_2 > 0.   (6.114)

Then Ω is positive on K \ {0}.
Proof. In the case considered we have

Q^{1+} = Q_1,   q_1^+ = q_1,   Q^{2−} = Q_2,   q_2^− = q_2,

and

b_1^+ := D¹(H) + q_1[ẋ]_1 > 0,   b_2^− := D²(H) − q_2[ẋ]_2 > 0.   (6.115)

Define the jumps [Q]_1 and [Q]_2 by the conditions

b_1^+ [Q]_1 = (q_1^+)^* (q_1^+),   b_2^− [Q]_2 = (q_2^−)^* (q_2^−).   (6.116)

Then [Q]_1 and [Q]_2 are symmetric matrices. Put

Q^{1−} = Q^{1+} − [Q]_1,   Q^{2+} = Q^{2−} + [Q]_2.

Then Q^{1−} and Q^{2+} are also symmetric matrices. Thus, we obtain a symmetric matrix Q(t) satisfying (6.100) on (t1, t2), the inequalities (6.115), and the jump conditions (6.116). By formulas (6.110) and (6.113) the terms ω_1 and ω_2 are nonnegative. In view of (6.101),
we see that Ω = ω_1 + ω_2 is nonnegative on K. Suppose that Ω = 0 for some z̄ = (ξ̄, x̄) ∈ K. Then ω_k = 0 for k = 1, 2, and thus formulas (6.110) and (6.113) give

b_1^+ ξ̄_1 + (q_1^+) x̄^{1−} = 0,   b_2^− ξ̄_2 + (q_2^−) x̄^{2+} = 0.

But x̄^{1−} = 0 and x̄^{2+} = 0. Consequently, ξ̄_1 = ξ̄_2 = 0, and then the conditions x̄^{1−} = 0 and [x̄]_1 = 0 imply that x̄^{1+} = 0. The last equality and the equation x̄˙ = fx(t)x̄ imply that x̄(t) = 0 on (t1, t2). Thus x̄ ≡ 0 and then z̄ = 0. We have proved that Ω is positive on K \ {0}.
6.6.4 Control System with a Constant Matrix B

In the case that B(t, x) = B is a constant matrix, the adjoint equation has the form

ψ̇ = −ψ a_x,

which implies that

[ψ̇]_k = 0,   k = 1, . . . , s.

Therefore,

q_k^− = ([ẋ]_k)^* Q^{k−},   q_k^+ = ([ẋ]_k)^* Q^{k+},
(q_k^−)^* q_k^− = Q^{k−} [ẋ]_k ([ẋ]_k)^* Q^{k−},   (q_k^+)^* q_k^+ = Q^{k+} [ẋ]_k ([ẋ]_k)^* Q^{k+},
b_k^− = D^k(H) − ([ẋ]_k)^* Q^{k−} [ẋ]_k,   b_k^+ = D^k(H) + ([ẋ]_k)^* Q^{k+} [ẋ]_k,
where

D^k(H) = −ψ̇(t_k) B [u]_k,   k = 1, . . . , s.
In the case of two switching points with s = 2, the conditions (6.114) take the form

D¹(H) + ⟨Q_1[ẋ]_1, [ẋ]_1⟩ > 0,   D²(H) − ⟨Q_2[ẋ]_2, [ẋ]_2⟩ > 0.   (6.117)
Now assume, in addition, that u is one-dimensional and that

B = β e_n := (0, . . . , 0, β)^*,   β > 0,   U = [−c, c],   c > 0.

In this case we get

[ẋ]_k = B[u]_k = β e_n [u]_k,   k = 1, . . . , s,

and thus

⟨Q_k [ẋ]_k, [ẋ]_k⟩ = β² ⟨Q_k e_n, e_n⟩ |[u]_k|² = 4β²c² Q_nn(t_k),

where Q_nn is the entry in the nth row and nth column of the matrix Q = (Q_ij)_{i,j=1,...,n}.
Moreover, in the last case we obviously have

D^k(H) = 2βc |ψ̇_n(t_k)|,   k = 1, . . . , s.

For s = 2, conditions (6.114) are thus equivalent to the estimates

Q_nn(t1) > −|ψ̇_n(t1)| / (2βc),   Q_nn(t2) < |ψ̇_n(t2)| / (2βc).   (6.118)
6.7 Numerical Examples of Time-Optimal Control Problems

6.7.1 Time-Optimal Control of a Van der Pol Oscillator
Consider again the tunnel-diode oscillator displayed in Figure 4.1. In the control problem
of the van der Pol oscillator, the state variable x1 represents the voltage, whereas the control
u is the voltage at the generator. Time-optimal solutions will be computed in two cases.
First, we consider a fixed terminal state x(tf ) = xf . The second case treats the nonlinear
terminal constraint x1 (tf )2 + x2 (tf )2 = r 2 for a small r > 0, by which the oscillator is steered
only to a neighborhood of the origin.
In the first case we consider the control problem of minimizing the final time tf subject
to the constraints
ẋ1 (t) = x2 (t), ẋ2 (t) = −x1 (t) + x2 (t)(1 − x12 (t)) + u(t), (6.119)
x1 (0) = −0.4, x2 (0) = 0.6, x1 (tf ) = 0.6, x2 (tf ) = 0.4, (6.120)
| u(t) | ≤ 1 for t ∈ [0, tf ]. (6.121)
The Pontryagin function (Hamiltonian) is given by
H (x, u, ψ) = ψ1 x2 + ψ2 (−x1 + x2 (1 − x12 ) + u). (6.122)
The adjoint equations ψ̇ = −Hx are
ψ̇1 = ψ2 (1 + 2x1 x2 ), ψ̇2 = −ψ1 + ψ2 (x12 − 1). (6.123)
In view of the free final time we get the additional boundary condition
H (tf ) + ψ0 (tf ) = 0.4ψ1 (tf ) + ψ2 (tf )(−0.344 + u(tf )) + 1 = 0. (6.124)
The sign of the switching function φ(t) = ψ2(t) determines the optimal control according to

u(t) = { 1 if ψ2(t) < 0,   −1 if ψ2(t) > 0 }.   (6.125)

Evaluating the derivatives of the switching function, it can easily be seen that there are no singular arcs with ψ2(t) ≡ 0 holding on a time interval [t1, t2]. Nonlinear programming methods applied to the discretized control problem show that the optimal bang-bang control has two bang-bang arcs,

u(t) = { 1 for 0 ≤ t < t1,   −1 for t1 ≤ t ≤ tf },   (6.126)
Figure 6.2. Time-optimal solution of the van der Pol oscillator, fixed terminal state
(6.120). (a) State variables x1 and x2 (dashed line). (b) Control u and switching function
ψ2 (dashed line). (c) Phase portrait (x1 , x2 ). (d) Adjoint variables ψ1 and ψ2 (dashed line).

with switching time t1. This implies the switching condition

φ(t1) = ψ2(t1) = 0.   (6.127)
Hence, we must solve the boundary value problem (6.120)–(6.127). Using the code
BNDSCO [82] or NUDOCCCS [13, 14], we obtain the extremal solution displayed in
Figure 6.2. The optimal final time, the switching point, and some values for the adjoint
variables are
tf = 1.2540747,   t1 = 0.158320138,
ψ1(0) = −1.0816056,   ψ2(0) = −0.18436798,
ψ1(t1) = −1.0886321,   ψ2(t1) = 0.0,   (6.128)
ψ1(tf) = −0.47781383,   ψ2(tf) = 0.60184112.
Since the bang-bang control has only one switching time, we are in the position to apply
Theorem 6.27. For checking the assumptions of this theorem it remains to verify the
condition D 1 (H ) = |φ̇(t1 )[u]1 | > 0. Indeed, in view of the adjoint equation (6.123) and the
switching condition ψ2 (t1 ) = 0 we get
D¹(H) = |φ̇(t1)[u]_1| = 2|ψ1(t1)| = 2 · 1.08863205 ≠ 0.
Then Theorem 6.27 ensures that the computed solution is a strict strong minimum.
Now we treat the second case, where the two boundary conditions (6.120) are re-
placed by the single nonlinear condition
x1(tf)² + x2(tf)² = r²,   r = 0.2.   (6.129)
Imposing this boundary condition, we aim at steering the van der Pol oscillator to a small neighborhood of the origin. The adjoint equations (6.123) remain valid. The transversality condition for the adjoint variable gives

ψ1 (tf ) = 2βx1 (tf ), ψ2 (tf ) = 2βx2 (tf ), β ∈ R. (6.130)

The boundary condition (6.11) associated with the free final time tf yields

ψ1 (tf )x2 (tf ) + ψ2 (tf ) (−x1 (tf ) + x2 (tf )(1 − x1 (tf )2 ) + u(tf )) + 1 = 0. (6.131)

Again, the switching function (6.18) is given by φ(t) = Hu(t) = ψ2(t). The structure of the bang-bang control agrees with that in (6.126),

u(t) = { 1 for 0 ≤ t < t1,   −1 for t1 ≤ t ≤ tf },   (6.132)

which yields the switching condition

φ(t1) = ψ2(t1) = 0.   (6.133)
Using either the boundary value solver BNDSCO of Oberle and Grimm [82] or the direct
optimization routine NUDOCCCS of Büskens [13, 14], we obtain the extremal solution
depicted in Figure 6.3 and the following values for the switching, final time, state, and
adjoint variables:

t1 = 0.7139356, tf = 2.864192,
ψ1 (0) = 0.9890682, ψ2 (0) = 0.9945782,
x1 (t1 ) = 1.143759, x2 (t1 ) = −0.5687884,
ψ1 (t1 ) = 1.758128, ψ2 (t1 ) = 0.0, (6.134)
x1 (tf ) = 0.06985245, x2 (tf ) = −0.1874050,
ψ1 (tf ) = 0.4581826, ψ2 (tf ) = −1.229244,
β = 3.279646.

There are two alternative ways to check sufficient conditions. We may either use
Theorem 6.19 and solve the linear equation (6.43) or evaluate directly the quadratic form
in Proposition 6.11. We begin by testing the assumptions of Theorem 6.19 and consider the
symmetric 2 × 2 matrix

Q(t) = [ Q11(t)   Q12(t)
         Q12(t)   Q22(t) ].
The linear equations Q̇ = −Q fx − fx^* Q − Hxx in (6.100) yield the following ODEs:

Q̇11 = 2 Q12 (1 + 2x1x2) + 2ψ2x2,
Q̇12 = −Q11 − Q12 (1 − x1²) + Q22 (1 + 2x1x2) + 2ψ2x1,   (6.135)
Q̇22 = −2 (Q12 + Q22 (1 − x1²)).
In view of Theorem 6.19 we must find a solution Q(t) only in the interval [t1, tf] such that

D¹(H) + (q_1^+)[ẋ]_1 > 0,   q_1^+ = ([ẋ]_1)^* Q(t1) − [ψ̇]_1,
Figure 6.3. Time-optimal solution of the van der Pol oscillator, nonlinear boundary
condition (6.129). (a) State variables x1 and x2 (dashed line). (b) Control u and switching
function ψ2 (dashed line). (c) Phase portrait (x1 , x2 ). (d) Adjoint variables ψ1 and ψ2
(dashed line).

holds and the quadratic form ω0 in (6.55)–(6.57) is positive definite on the cone K0 defined in (6.26). Since ψ2(t1) = 0 we get from (6.123)

D¹(H) = −φ̇(t1)[u]_1 = 2 · ψ1(t1) = 2 · 1.758128 > 0.
Furthermore, from [ψ̇]1 = 0 we obtain the condition

D 1 (H ) + ([ẋ]1 )∗ Q(t1 )[ẋ]1 = 2 · 1.758128 + 4Q22 (t1 ) > 0,

which is satisfied by any initial value Q22(t1) > −0.879064. By Proposition 6.8, we have t̄f = 0 for every element z̄ = (t̄f, ξ̄, x̄) ∈ K. Therefore, by (6.57) we must check that the matrix B22 = 2β I2 − Q(tf) is positive definite on the critical cone K0 defined in (6.26), i.e., on the cone

K0 = { (t̄f, v1, v2) | t̄f = 0,  x1(tf)v1 + x2(tf)v2 = 0 }.
Thus the variations (v1, v2) are related by v2 = −v1 x1(tf)/x2(tf). Evaluating the quadratic form ⟨(2β I2 − Q(tf))(v1, v2), (v1, v2)⟩ with v2 = −v1 x1(tf)/x2(tf), we arrive at the test

c = 2β ( 1 + (x1/x2)² ) − ( Q11 − 2 Q12 (x1/x2) + Q22 (x1/x2)² ) (tf) > 0.
A straightforward integration of the ODEs (6.135) using the solution data (6.134) and the
initial values Q11 (t1 ) = Q12 (t1 ) = Q22 (t1 ) = 0 gives the numerical results

Q11 (tf ) = 0.241897, Q12 (tf ) = −0.706142, Q22 (tf ) = 1.163448,

which yield the positive value c = 7.593456 > 0. Thus Theorem 6.19 asserts that the bang-
bang control characterized by (6.134) provides a strict strong minimum.
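For reproducibility, a minimal numerical sketch of this test (assuming SciPy is available; the initial data are taken from (6.134), with u ≡ −1 and Q(t1) = 0 on the arc [t1, tf]):

```python
import numpy as np
from scipy.integrate import solve_ivp

t1, tf, beta = 0.7139356, 2.864192, 3.279646

def rhs(t, z):
    # z = (x1, x2, psi1, psi2, Q11, Q12, Q22); control u = -1 on [t1, tf]
    x1, x2, p1, p2, Q11, Q12, Q22 = z
    u = -1.0
    return [
        x2,                                  # (6.119)
        -x1 + x2 * (1 - x1**2) + u,          # (6.119)
        p2 * (1 + 2 * x1 * x2),              # (6.123)
        -p1 + p2 * (x1**2 - 1),              # (6.123)
        2 * Q12 * (1 + 2 * x1 * x2) + 2 * p2 * x2,                         # (6.135)
        -Q11 - Q12 * (1 - x1**2) + Q22 * (1 + 2 * x1 * x2) + 2 * p2 * x1,  # (6.135)
        -2 * (Q12 + Q22 * (1 - x1**2)),                                    # (6.135)
    ]

# data at the switching time t1 from (6.134); Q(t1) = 0
z1 = [1.143759, -0.5687884, 1.758128, 0.0, 0.0, 0.0, 0.0]
sol = solve_ivp(rhs, (t1, tf), z1, rtol=1e-10, atol=1e-12)
x1, x2, _, _, Q11, Q12, Q22 = sol.y[:, -1]
r = x1 / x2
c = 2 * beta * (1 + r**2) - (Q11 - 2 * Q12 * r + Q22 * r**2)
print(c)  # approx. 7.5935 > 0, confirming the test above
```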
The alternative test for second-order sufficient conditions (SSC) is based on Proposition 6.11. The variational system ẏ(t) = fx(t)y(t), y(t1) = [ẋ]_1, for the variation y = (y1, y2) leads to

ẏ1 = y2,   y1(t1) = 0,
ẏ2 = −(1 + 2x1x2) y1 + (1 − x1²) y2,   y2(t1) = 2,
for which we compute

y1(tf) = 4.929925,   y2(tf) = 1.837486.

Note that the relation K_{xf}(x(tf)) y(tf) = 2( x1(tf)y1(tf) + x2(tf)y2(tf) ) = 0 holds. By Proposition 6.11 we have to show that the quantity ρ in (6.37) is positive,

ρ = D¹(H) − [ψ̇]_1 [ẋ]_1 + ∫_{t1}^{tf} (y(t))^* Hxx(t) y(t) dt + (y(tf))^* (βK)_{xf xf} y(tf) > 0.

Using [ψ̇]_1 = 0 and (y(tf))^* (βK)_{xf xf} y(tf) = 2β( y1(tf)² + y2(tf)² ), we finally obtain

ρ = D¹(H) + 184.550 > 0.
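Again a small sketch of this computation (assuming SciPy; it augments the previous integration with the variational system and the running integral ∫ y^* Hxx y dt):

```python
import numpy as np
from scipy.integrate import solve_ivp

t1, tf, beta = 0.7139356, 2.864192, 3.279646

def rhs(t, z):
    # z = (x1, x2, psi1, psi2, y1, y2, I); I(t) = integral of y^T Hxx y; u = -1
    x1, x2, p1, p2, y1, y2, I = z
    # Hxx of (6.122): [[-2 psi2 x2, -2 psi2 x1], [-2 psi2 x1, 0]]
    yHy = -2 * p2 * x2 * y1**2 - 4 * p2 * x1 * y1 * y2
    return [x2, -x1 + x2 * (1 - x1**2) - 1.0,
            p2 * (1 + 2 * x1 * x2), -p1 + p2 * (x1**2 - 1),
            y2, -(1 + 2 * x1 * x2) * y1 + (1 - x1**2) * y2,
            yHy]

z1 = [1.143759, -0.5687884, 1.758128, 0.0, 0.0, 2.0, 0.0]
sol = solve_ivp(rhs, (t1, tf), z1, rtol=1e-10, atol=1e-12)
y1, y2, I = sol.y[4, -1], sol.y[5, -1], sol.y[6, -1]
rho = 2 * 1.758128 + I + 2 * beta * (y1**2 + y2**2)  # [psidot]_1 = 0
print(rho)  # approx. 188.07, i.e., D^1(H) + 184.55 > 0
```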

6.7.2 Time-Optimal Control of the Rayleigh Equation

In Section 4.1, the Rayleigh problem of controlling oscillations in a tunnel-diode oscillator (Figure 4.1) was considered with a regulator functional. In this section, we treat the time-optimal case of steering a given initial state to the origin in minimal time. Recall that the state variable x1(t) = I(t) denotes the electric current. The optimal control problem is to minimize the final time tf subject to the dynamics and control constraints

ẋ1(t) = x2(t),   ẋ2(t) = −x1(t) + x2(t)(1.4 − 0.14 x2(t)²) + u(t),   (6.136)
x1(0) = x2(0) = −5,   x1(tf) = x2(tf) = 0,   (6.137)
| u(t) | ≤ 4 for t ∈ [0, tf].   (6.138)

Note that we have shifted the factor 4 from the control variable in the dynamics (4.134) to the control constraint (6.138). The Pontryagin function (Hamiltonian) (see (6.75)) for this problem is

H(x, u, ψ) = ψ1 x2 + ψ2 (−x1 + x2(1.4 − 0.14 x2²) + u).   (6.139)
The transversality condition (6.11) yields, in view of (6.139),

H(tf) + 1 = ψ2(tf) u(tf) + 1 = 0.   (6.140)

The switching function φ(x, ψ) = ψ2 determines the optimal control

u(t) = { 4 if ψ2(t) < 0,   −4 if ψ2(t) > 0 }.   (6.141)
As for the van der Pol oscillator, it is easy to show that there are no singular arcs with ψ2(t) ≡ 0 holding on a time interval Is ⊂ [0, tf]. Hence, the optimal control is bang-bang. Applying nonlinear programming methods to the suitably discretized Rayleigh problem, one realizes that the optimal control comprises the following three bang-bang arcs with two switching times t1, t2:

u(t) = {  4 for 0 ≤ t < t1,   −4 for t1 ≤ t < t2,   4 for t2 ≤ t ≤ tf }.   (6.142)

This control structure yields two switching conditions

ψ2(t1) = 0,   ψ2(t2) = 0.   (6.143)
Thus, we have to solve the multipoint boundary value problem comprising equations (6.136)–(6.143). The codes BNDSCO [82] and NUDOCCCS [13, 14] yield the extremal depicted in Figure 6.4. The final time, the switching points, and some values for the adjoint variables are computed as

tf = 3.66817339,
t1 = 1.12050658,   t2 = 3.31004698,
ψ1(0) = −0.12234128,   ψ2(0) = −0.08265161,   (6.144)
ψ1(t1) = −0.21521225,   ψ1(t2) = 0.89199176,
ψ1(tf) = 0.84276186,   ψ2(tf) = −0.25.
Now we are going to show that the computed control satisfies the assumptions of Theorem 6.19 and thus provides a strict strong minimum. Consider the symmetric 2 × 2 matrix

Q(t) = [ Q11(t)   Q12(t)
         Q12(t)   Q22(t) ].

The linear equation (6.100), dQ/dt = −Q fx − fx^* Q − Hxx, leads to the following three equations:

Q̇11 = 2 Q12,
Q̇12 = −Q11 − Q12 (1.4 − 0.42 x2²) + Q22,   (6.145)
Q̇22 = −2 (Q12 + Q22 (1.4 − 0.42 x2²)) + 0.84 ψ2 x2.
We must find a solution of these equations satisfying the estimates (6.114) at the switching times t1 and t2. From

D^k(H) = | φ̇(t_k)[u]_k | = 8 |ψ1(t_k)|,   k = 1, 2,
Figure 6.4. Time-optimal control of the Rayleigh equation. (a) State variables
x1 and x2 (dashed line). (b) Control u and switching function φ (dashed line). (c) Phase
portrait (x1 , x2 ). (d) Adjoint variables ψ1 and ψ2 (dashed line).

we get

D¹(H) = 8 · 0.21521225 = 1.7269800 > 0,   D²(H) = 8 · 0.89199176 = 7.1359341 > 0.

Furthermore, we have [ẋ]_k = [u]_k (0, 1)^* = (0, 8)^* and thus obtain ⟨Q_k[ẋ]_k, [ẋ]_k⟩ = 64 Q22(t_k) for k = 1, 2. The next step is to find a solution of the equations (6.145) in the interval [t1, t2] that satisfies the inequalities

D¹(H) + ⟨Q_1[ẋ]_1, [ẋ]_1⟩ = 1.7269800 + 64 Q22(t1) > 0,
D²(H) − ⟨Q_2[ẋ]_2, [ẋ]_2⟩ = 7.1359341 − 64 Q22(t2) > 0.
This requires the estimates

Q22(t1) > −0.0269841,   Q22(t2) < 0.11149897.   (6.146)

These conditions can be satisfied by choosing, e.g., the following initial values at the switching time t1:

Q11(t1) = 0,   Q12(t1) = 0.25,   Q22(t1) = 0.

Integration yields the value Q22(t2) = −0.1677185, which shows that the estimates (6.146) hold. Note that these estimates do not hold for the choice Q(t1) = 0, since this initial value would give Q22(t2) = 0.70592. In summary, Theorem 6.36 asserts that the computed solution provides a strict strong minimum.
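A minimal sketch of this integration (assuming SciPy; since x(t1), ψ(t1) are not listed in (6.144), the states and adjoints are propagated from t = 0 with the data (6.137), (6.144), and the Q-equations (6.145) are attached on [t1, t2] with the initial values above):

```python
import numpy as np
from scipy.integrate import solve_ivp

t1, t2 = 1.12050658, 3.31004698

def rhs(u):
    def f(t, z):
        # z = (x1, x2, psi1, psi2, Q11, Q12, Q22)
        x1, x2, p1, p2, Q11, Q12, Q22 = z
        a = 1.4 - 0.42 * x2**2
        return [x2, -x1 + x2 * (1.4 - 0.14 * x2**2) + u,  # (6.136)
                p2, -p1 - p2 * a,                         # adjoint equations
                2 * Q12,                                  # (6.145)
                -Q11 - Q12 * a + Q22,
                -2 * (Q12 + Q22 * a) + 0.84 * p2 * x2]
    return f

# arc u = +4 on [0, t1]: x(0), psi(0) from (6.137), (6.144); Q is irrelevant here
z0 = [-5.0, -5.0, -0.12234128, -0.08265161, 0.0, 0.0, 0.0]
a1 = solve_ivp(rhs(4.0), (0.0, t1), z0, rtol=1e-10, atol=1e-12)

# arc u = -4 on [t1, t2]: reset Q(t1) = [[0, 0.25], [0.25, 0]]
z1 = list(a1.y[:4, -1]) + [0.0, 0.25, 0.0]
a2 = solve_ivp(rhs(-4.0), (t1, t2), z1, rtol=1e-10, atol=1e-12)
print(a2.y[6, -1])  # Q22(t2) approx. -0.1677 < 0.11149897, so (6.146) holds
```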
6.8 Time-Optimal Control Problems for Linear Systems with Constant Entries

In this section, we shall review some results which were obtained in [77], [79, Part 2, Section 13], and [87].

6.8.1 Statement of the Problem, Minimum Principle, and Simple Sufficient Optimality Condition

Consider the following problem:

tf → min,   ẋ = Ax + Bu,   x(0) = a,   x(tf) = b,   u ∈ U,   (6.147)
where A and B are constant matrices, a and b are fixed vectors in R^{d(x)}, and U is a convex polyhedron in R^{d(u)}. A triple (tf, x, u) is said to be admissible if x(t) is a Lipschitz continuous function and u(t) is a measurable bounded function on the interval Δ = [0, tf] and the pair (x, u) satisfies on Δ the constraints of the problem (6.147).

Definition 6.37. We say that the admissible triple (tf, x, u) affords an almost global minimum if there is no sequence of admissible triples (tf^n, x^n, u^n), n = 1, 2, . . . , such that tf^n < tf for n = 1, 2, . . . , and tf^n → tf.
Proposition 6.38. Suppose that, in the problem (6.147), there exists a u∗ ∈ U such that

Aa + Bu∗ = 0 or Ab + Bu∗ = 0

(for example, let b = 0, 0 ∈ U). Then the almost global minimum is equivalent to the global one.

Proof. If, for example, Ab + Bu∗ = 0, u∗ ∈ U, then any pair (x, u) admissible on [0, tf] can be extended to the right of tf by putting x(t) = b, u(t) = u∗.
Let (tf, x, u) be an admissible triple for which the conditions of the minimum principle are fulfilled: there exists a smooth function ψ : [0, tf] → R^{d(x)} such that

−ψ̇ = ψA,   u(t) ∈ Arg min_{u′∈U} (ψ(t)Bu′),
ψẋ = const ≤ 0,   |ψ(0)| = 1.   (6.148)

These conditions follow from (6.76)–(6.80), because here H = ψ(Ax + Bu) and Ht = 0; hence −ψẋ = ψ0 = const =: α0 ≥ 0. Thus M0 can be identified with the set of infinitely differentiable functions ψ(t) on [0, tf] satisfying conditions (6.148).
The condition that M0 is nonempty is necessary for a Pontryagin minimum to hold for the triple (tf, x, u). We will refer to a triple (tf, x, u) with nonempty M0 as an extremal triple. Recall a simple sufficient first-order condition for an almost global minimum obtained in [79, Part 1].

Theorem 6.39 (Milyutin). Suppose there exists ψ ∈ M0 such that α0 := −ψẋ > 0. Then (tf, x, u) affords an almost global minimum.
In what follows, we assume that (tf, x, u) is an extremal triple such that u(t) is a piecewise constant function taking values in the vertices of the polyhedron U. We denote by Θ = {t1, . . . , ts} the set of discontinuity points of the control u(t). Let ψ ∈ M0. An important role in formulations of optimality conditions in problem (6.147) is played by the product (ψ̇ẋ)(t).
Proposition 6.40. The product (ψ̇ẋ)(t) is a monotone nonincreasing step function with discontinuities only at the discontinuity points of the control u(t).

We now formulate yet another simple sufficient condition for the almost global minimum, obtained in [79, Part 2].

Theorem 6.41. Suppose that there exists ψ ∈ M0 such that the product (ψ̇ẋ) fulfills one of the following two conditions: (ψ̇ẋ)(0) < 0 or (ψ̇ẋ)(tf) > 0; i.e., (ψ̇ẋ) strictly retains its sign on [0, tf]. Then (tf, x, u) affords an almost global minimum.
Theorem 6.41 implies a sufficient condition of a geometric nature.

Corollary 6.42. Let (tf, x, u) be an extremal triple such that for any point tk ∈ Θ the vectors

ẋ^{k−} = ẋ(tk−),   ẋ^{k+} = ẋ(tk+)

are different from zero and equally directed (so that ẋ^{k−} = c_k ẋ^{k+} for some c_k > 0). Suppose that there exists ψ ∈ M0 such that (ψ̇ẋ) is not identically zero. Then (tf, x, u) affords an almost global minimum.
6.8.2 Quadratic Optimality Condition

For an extremal triple (tf, x, u) in the problem (6.147) satisfying the assumptions of Section 6.8.1 we will write the quadratic necessary Condition A and the quadratic sufficient Condition B using the results in Section 6.5.4.
Necessary Condition A. This time we begin by writing the quadratic form Ω as in (6.94). Let us show that it is completely determined by the left and right limits of the step function ψ̇ẋ at points tk ∈ Θ. Since H = ψ(Ax + Bu) and ψ̇ = −ψA, we have [ψ̇]_k = 0, k = 1, . . . , s, and Hxx = 0. Moreover,

D^k(H) = −ψ̇(t_k)B[u]_k = −ψ̇(t_k)[ẋ]_k = −[ψ̇ẋ]_k,   k = 1, . . . , s.

It follows from the condition Ht = 0 that ψ̇0 = 0. Further, the condition that (ψ̇ẋ) is constant on (ts, tf) implies (ψ̇ẋ)(tf) = (ψ̇ẋ)^{s+}. Thus according to (6.94),

Ω(ψ, z̄) = −Σ_{k=1}^{s} [ψ̇ẋ]_k ξ̄_k² + t̄_f² (ψ̇ẋ)^{s+}.   (6.149)
This formula holds for any ψ ∈ M0 and any z̄ = (t̄f, ξ̄, x̄) which belongs to the subspace K as in (6.83) and (6.84).
Let us see what is the form of K in the present case. Since x̄(t) satisfies the linear system x̄˙ = Ax̄ on each interval of Δ \ Θ, the condition x̄(0) = 0 in (6.84) can be replaced by x̄(t1−) = 0. Since ẋ satisfies the same system, ẍ = Aẋ, the condition x̄(tf) + ẋ(tf)t̄f = 0 in (6.84) can be replaced by x̄(ts+) + ẋ(ts+)t̄f = 0. For brevity, put

x̄(t1−) = x̄^{1−},   x̄(ts+) = x̄^{s+},   ẋ(ts+) = ẋ^{s+}.

Then by (6.83) and (6.84) the subspace K consists of triples z̄ = (t̄f, ξ̄, x̄) satisfying the conditions

t̄f ∈ R¹,   ξ̄ ∈ R^s,   x̄(·) ∈ PC^∞(Δ, R^n),
x̄˙ = Ax̄,   [x̄]_k = [ẋ]_k ξ̄_k,   k = 1, . . . , s,
x̄^{1−} = 0,   x̄^{s+} + t̄f ẋ^{s+} = 0,

where PC^∞(Δ, R^n) is the space of piecewise continuous functions x̄(t) : Δ → R^{d(x)} infinitely differentiable on each interval of the set Δ \ Θ. This property of x̄ follows from the fact that on each interval of the set Δ \ Θ the function x̄ satisfies the linear system x̄˙ = Ax̄ with constant entries.
Consider the cross-section of K specified by the condition t̄f = −1. The passage to the cross-section does not weaken the quadratic necessary Condition A because the functional max_{ψ∈M0} Ω(ψ, z̄) involved in it is homogeneous of degree 2 and nonnegative on any element z̄ ∈ K with t̄f = 0 (since for any ψ ∈ M0 the inequalities D^k(H) = −[ψ̇ẋ]_k ≥ 0, k = 1, . . . , s, hold).
Denote by R the cross-section of the subspace K specified by the condition t̄f = −1. We omit the coordinate t̄f in the definition of R. Thus R is a set of pairs (ξ̄, x̄) such that the following conditions are fulfilled:

ξ̄ ∈ R^s,   x̄(·) ∈ PC^∞(Δ, R^n),
x̄˙ = Ax̄,   [x̄]_k = [ẋ]_k ξ̄_k,   k = 1, . . . , s,
x̄^{1−} = 0,   x̄^{s+} = ẋ^{s+}.
For ψ ∈ M0, ξ̄ ∈ R^s, let

Q(ψ, ξ̄) = −Σ_{k=1}^{s} [ψ̇ẋ]_k ξ̄_k² + (ψ̇ẋ)^{s+},

and set

Q0(ξ̄) = max_{ψ∈M0} Q(ψ, ξ̄).   (6.150)

Then Theorem 6.26 implies the following theorem.
Theorem 6.43. Let a triple (tf, x, u) afford a Pontryagin minimum in the problem (6.147). Then the set M0 as in (6.148) is nonempty and

Q0(ξ̄) ≥ 0   ∀ (ξ̄, x̄) ∈ R.   (6.151)

It is clear that the set R, in this necessary condition, can be replaced by its projection under the mapping (ξ̄, x̄) → ξ̄. Denote this projection by Ξ and find out what conditions specify it. The conditions x̄˙ = Ax̄ and ẍ = Aẋ imply

x̄(t) = e^{At} c̄(t),   ẋ(t) = e^{At} c(t),
where c̄(t) and c(t) are step functions whose discontinuity points are contained in Θ. The conditions

x̄^{1−} = 0,   x̄^{s+} = ẋ^{s+},   [x̄]_k = [ẋ]_k ξ̄_k,   k = 1, . . . , s,

imply

c̄^{1−} = 0,   c̄^{s+} = c^{s+},   [c̄]_k = [c]_k ξ̄_k,   k = 1, . . . , s.

Therefore

c^{s+} = c̄^{s+} = Σ_{k=1}^{s} [c̄]_k = Σ_{k=1}^{s} [c]_k ξ̄_k.

It is easily seen that the conditions

Σ_{k=1}^{s} [c]_k ξ̄_k = c^{s+}

determine the projection of R under the mapping (ξ̄, x̄) → ξ̄, which we denote by Ξ. Since

[c]_k = e^{−At_k} [ẋ]_k ∀ k,   c^{s+} = e^{−At_s} ẋ^{s+},

the set Ξ is determined by the condition

Σ_{k=1}^{s} e^{−At_k} [ẋ]_k ξ̄_k = e^{−At_s} ẋ^{s+},

which after multiplication by e^{At_s} from the left takes the final form

Σ_{k=1}^{s} e^{A(t_s − t_k)} [ẋ]_k ξ̄_k = ẋ^{s+}.   (6.152)

Hence Ξ is the set of vectors ξ̄ ∈ R^s satisfying the system of algebraic equations (6.152). Thus Theorem 6.43 implies the following theorem.
Theorem 6.44. Let a triple (tf, x, u) afford a Pontryagin minimum in the problem (6.147). Then the set M0 as in (6.148) is nonempty and

Q0(ξ̄) ≥ 0   ∀ ξ̄ ∈ Ξ.   (6.153)

It makes sense to use the necessary condition (6.153) for the investigation of only those extremals (tf, x, u) for which

α0 := −ψẋ = 0   ∀ ψ ∈ M0,   (6.154)

because otherwise, by Theorem 6.39, (tf, x, u) affords an almost global minimum in the problem. Condition (6.154) guarantees that the set Ξ is nonempty (see Theorem 13.7 in [79, p. 322]). Theorem 6.44 implies a simple consequence of a geometric nature.
Corollary 6.45. Suppose that (tf, x, u) affords a Pontryagin minimum in the problem (6.147). Let the vectors ẋ^{k−} and ẋ^{k+} be different from zero and collinear for some tk ∈ Θ, and let the jump [ψ̇ẋ]_k of the product ψ̇ẋ at the point tk be different from zero for any ψ ∈ M0. Then the vectors ẋ^{k−} and ẋ^{k+} are equally directed.

The proof is given in [79].
Sufficient Condition B. Let (tf, x, u) be an admissible triple in the problem (6.147) satisfying the assumptions of Section 6.8.1. As in Section 6.2.4, denote by M the set of functions ψ ∈ M0 satisfying the two conditions

−[ψ̇ẋ]_k > 0,   k = 1, . . . , s,   (6.155)

Arg min_{u′∈U} ψ(t)Bu′ = [u(t−), u(t+)]   ∀ t ∈ [0, tf].   (6.156)

Obviously, the interval [u(t−), u(t+)] is a singleton for all t ∈ [0, tf] \ Θ.
Theorem 6.46. For an admissible triple (tf, x, u) satisfying the assumptions of Section 6.8.1, let the set M be nonempty and let

Q0(ξ̄) > 0   ∀ ξ̄ ∈ Ξ.   (6.157)

Then (tf, x, u) affords a strict almost global minimum in the problem (6.147).

Proof. Condition (6.157) implies that

max_{ψ∈M0} Ω(ψ, z̄) > 0   (6.158)

for all z̄ ∈ K such that t̄f ≠ 0. Consider an element z̄ ∈ K \ {0} such that t̄f = 0. For this element, ξ̄ ≠ 0, since otherwise z̄ = 0. Take an arbitrary element ψ ∈ M and put q = −ψ̇ẋ. Then [q]_k > 0, k = 1, . . . , s. Hence

Ω(ψ, z̄) = Σ_{k=1}^{s} [q]_k ξ̄_k² > 0.

Thus the inequality (6.158) holds for all z̄ ∈ K \ {0}. Therefore, by Theorem 6.27, (tf, x, u) affords a strict strong minimum in the problem (6.147). Moreover, the condition M ≠ ∅ in the problem (6.147) implies that the strict strong minimum is equivalent to the strict almost global minimum. The last assertion is a consequence of Proposition 13.4 and Lemma 13.1 in [79].
6.8.3 Example

Consider the problem

tf → min,   ẋ1 = x2,   ẋ2 = u,   |u| ≤ 1,   x(0) = a,   x(tf) = b,   (6.159)

where x = (x1, x2) ∈ R², u ∈ R¹. The minimum principle conditions for this problem are as follows:

ψ̇1 = 0,   ψ̇2 = −ψ1,   u = −sgn ψ2,
ψ1x2 + ψ2u = −α0 ≤ 0,   |ψ(0)| = 1,   (6.160)

where ψ = (ψ1, ψ2). This implies that ψ2(t) is a linear function, and u(t) is a step function having at most one switching and taking values ±1.
The system corresponding to u = 1 is ẋ1 = x2, ẋ2 = 1, and omitting t we have

x1 = (1/2) x2² + C.   (6.161)

The system corresponding to u = −1 is ẋ1 = x2, ẋ2 = −1, whence

x1 = −(1/2) x2² + C.   (6.162)
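For the record, both parabola families follow by eliminating t along the trajectories:

$$\frac{dx_1}{dx_2}=\frac{\dot x_1}{\dot x_2}=\frac{x_2}{u}=\pm x_2 \quad (u=\pm 1), \qquad\text{hence}\qquad x_1=\pm\tfrac12 x_2^2+C.$$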
Through each point x = (x1, x2) of the state plane there pass one curve of the family (6.161) and one curve of the family (6.162). The condition ψ1x2 + ψ2u = −α0 ≤ 0 implies that the passage from a curve of the family (6.161) to a curve of the family (6.162) is only possible in the upper half-plane x2 ≥ 0, and the passage from a curve of the family (6.162) to a curve of the family (6.161) only in the lower half-plane x2 ≤ 0. The direction of movement is such that along the curves of the family (6.161) the point can go to (+∞, +∞), and along the curves of the family (6.162) to (−∞, −∞). This means that any two points can be joined by the extremals in no more than two ways. By Theorem 6.39 each extremal with

α0 := −ψẋ = −ψ1x2 − ψ2u > 0

yields an almost global minimum.
Consider extremals with switching and α0 = 0. These are extremals with x2 = 0 at the switching time and ẋ^{1−} = (0, u^{1−}), ẋ^{1+} = (0, u^{1+}). Since u^{1−} = −u^{1+} ≠ 0, the vectors ẋ^{1−} and ẋ^{1+} are different from zero and directed in an opposite way. The set M0 consists of a single pair ψ = (ψ1, ψ2), and

ψ̇ẋ = −ψ1u,   [ψ̇ẋ]_1 = −ψ1(t1)[u]_1 ≠ 0.   (6.163)

According to Corollary 6.45 this extremal does not yield a Pontryagin minimum, because it is necessary for a Pontryagin minimum that the collinear vectors ẋ^{1−} and ẋ^{1+} be equally directed. In this case the necessary second-order condition fails.
Thus in this problem an extremal with a switching affords an almost global minimum
iff α0 > 0 for this extremal; i.e., the point x = (x1 , x2 ) at the switching time does not lie on
the x1 axis in the state plane.
For an extremal without switchings there always exists ψ ∈ M0 such that α0 > 0.
Indeed, for this extremal one can set ψ1 = 0, ψ2 = −u. Then α0 = −ψ2 u = 1, and all
conditions of the minimum principle are fulfilled.
More interesting examples of investigation of extremals in time-optimal control prob-
lems for linear systems with constant entries are given in [79, Part 2, Section 14]. The time-
optimal control in a simplified model of a container crane (ore unloader) was discussed
in [63, Section 5]. The optimality of the time-optimal control with three switching times
follows from Theorem 6.39.
Chapter 7

Bang-Bang Control Problem and Its Induced Optimization Problem
We continue our investigation of the pure bang-bang case. As mentioned in the introduction, second-order sufficient optimality conditions for bang-bang controls have been derived in the literature in two different forms. The first form was discussed in the last chapter. The second one is due to Agrachev, Stefani, and Zezza [1], who first reduced the bang-bang control problem to a finite-dimensional optimization problem and then showed that well-known sufficient optimality conditions for this optimization problem, supplemented by the strict bang-bang property, furnish sufficient conditions for the bang-bang control problem. The bang-bang control problem considered in this chapter is more general than that in [1]. Following [99, 100], we establish the equivalence of both forms of second-order conditions for this problem.
7.1 Main Results

7.1.1 Induced Optimization Problem

Again, let T̂ = (x̂(t), û(t) | t ∈ [t̂0, t̂f]) be an admissible trajectory for the basic problem (6.1)–(6.3). We denote by V = ex U the set of vertices of the polyhedron U. Assume that û(t) is a bang-bang control in Δ̂ = [t̂0, t̂f] taking values in the set V,

û(t) = u^k ∈ V for t ∈ (t̂_{k−1}, t̂_k),   k = 1, . . . , s + 1,

where t̂_{s+1} = t̂f. Thus, Θ̂ = {t̂1, . . . , t̂s} is the set of switching points of the control û(·) with t̂_k < t̂_{k+1} for k = 0, 1, . . . , s. Assume now that the set M0 of multipliers is nonempty for the trajectory T̂. Put

x̂(t̂0) = x̂0,   θ̂ = (t̂1, . . . , t̂s),   ζ̂ = (t̂0, t̂f, x̂0, θ̂).   (7.1)

Then θ̂ ∈ R^s, ζ̂ ∈ R² × R^n × R^s, where n = d(x).
Take a small neighborhood V of the point ζ̂ in R² × R^n × R^s, and let

ζ = (t0, tf, x0, θ) ∈ V,
where θ = (t1, . . . , ts) satisfies t0 < t1 < t2 < · · · < ts < tf. Define the function u(t; θ) by the condition

u(t; θ) = u^k for t ∈ (t_{k−1}, t_k),   k = 1, . . . , s + 1,   (7.2)

where t_{s+1} = tf. The values u(t_k; θ), k = 1, . . . , s, may be chosen in U arbitrarily. For definiteness, define them by the condition of continuity of the control from the left: u(t_k; θ) = u(t_k−; θ), k = 1, . . . , s.
Let x(t; t0, x0, θ) be the solution of the initial value problem (IVP)

ẋ = f(t, x, u(t; θ)),   t ∈ [t0, tf],   x(t0) = x0.   (7.3)
For each ζ ∈ V this solution exists if the neighborhood V of the point ζ̂ is sufficiently small. We obviously have

x(t; t̂0, x̂0, θ̂) = x̂(t), t ∈ Δ̂,   u(t; θ̂) = û(t), t ∈ Δ̂ \ Θ̂.
Consider now the following finite-dimensional optimization problem in the space R² × R^n × R^s of the variables ζ = (t0, tf, x0, θ):

F0(ζ) := J(t0, x0, tf, x(tf; t0, x0, θ)) → min,
F(ζ) := F(t0, x0, tf, x(tf; t0, x0, θ)) ≤ 0,   (7.4)
G(ζ) := K(t0, x0, tf, x(tf; t0, x0, θ)) = 0.
We call (7.4) the Induced Optimization Problem (IOP) or simply Induced Problem, which
represents an extension of the IOP introduced in Agrachev, Stefani, and Zezza [1]. The
following assertion is almost obvious.

Theorem 7.1. Let the trajectory Tˆ be a Pontryagin local minimum for the basic control
problem (6.1)–(6.3). Then the point ζ̂ is a local minimum for the IOP (7.4), and hence it
satisfies first- and second-order necessary conditions for this problem.

Proof. Assume that ζ̂ is not a local minimum in problem (7.4). Then there exists a sequence of admissible points ζ^ν = (t_0^ν, t_f^ν, x_0^ν, θ^ν) in problem (7.4) such that ζ^ν → ζ̂ for ν → ∞ and F0(ζ^ν) < F0(ζ̂). Take the corresponding sequence of admissible trajectories

T^ν = { x(t; t_0^ν, x_0^ν, θ^ν), u(t; θ^ν) | t ∈ [t_0^ν, t_f^ν] }

in problem (6.1)–(6.3). Then the conditions t_0^ν → t̂0, t_f^ν → t̂f, x_0^ν → x̂0, θ^ν → θ̂ imply that

∫_{Δ^ν ∩ Δ̂} |u(t; θ^ν) − û(t)| dt → 0,   max_{Δ^ν ∩ Δ̂} |x(t; t_0^ν, x_0^ν, θ^ν) − x̂(t)| → 0,

where Δ^ν = [t_0^ν, t_f^ν]. Moreover, J(T^ν) = F0(ζ^ν) < F0(ζ̂) = J(T̂). This means that the trajectory T̂ is not a Pontryagin local minimum for the basic problem (6.1)–(6.3).

We shall clarify a relationship between the second-order conditions for the IOP (7.4) at
the point ζ̂ and those in the basic bang-bang control problem (6.1)–(6.3) for the trajectory Tˆ .
We shall state that there is a one-to-one correspondence between Lagrange multipliers in these problems and a one-to-one correspondence between elements of the critical cones. Moreover, for corresponding Lagrange multipliers, the quadratic forms in these problems take equal values on the corresponding elements of the critical cones. This will allow us to express the necessary and sufficient quadratic optimality conditions for bang-bang control, formulated in Theorems 6.9 and 6.10, in terms of the IOP (7.4). In particular, we thus establish the equivalence between our quadratic sufficient conditions and those due to Agrachev, Stefani, and Zezza [1].
First, for convenience we recall second-order necessary and sufficient conditions for a smooth finite-dimensional optimization problem with inequality- and equality-type constraints (see Section 1.3.1). Consider a problem in R^n,

f0(x) → min,   fi(x) ≤ 0 (i = 1, . . . , k),   gj(x) = 0 (j = 1, . . . , m),   (7.5)

where f0, . . . , fk, g1, . . . , gm are C²-functions in R^n. Let x̂ be an admissible point in this problem. Define, at this point, the set of normalized vectors μ = (α0, . . . , αk, β1, . . . , βm) of Lagrange multipliers

Λ0 = { μ ∈ R^{k+1+m} | αi ≥ 0 (i = 0, . . . , k),  αi fi(x̂) = 0 (i = 1, . . . , k),
       Σ_{i=0}^{k} αi + Σ_{j=1}^{m} |βj| = 1,  Lx(μ, x̂) = 0 },

where L(μ, x) = Σ_{i=0}^{k} αi fi(x) + Σ_{j=1}^{m} βj gj(x) is the Lagrange function. Define the set of active indices I = { i ∈ {1, . . . , k} | fi(x̂) = 0 } and the critical cone

K0 = { x̄ | f0′(x̂)x̄ ≤ 0,  fi′(x̂)x̄ ≤ 0, i ∈ I,  gj′(x̂)x̄ = 0, j = 1, . . . , m }.
Theorem 7.2. Let x̂ be a local minimum in problem (7.5). Then, at this point, the set Λ0 is nonempty and the following inequality holds:

max_{μ∈Λ0} ⟨Lxx(μ, x̂)x̄, x̄⟩ ≥ 0   ∀ x̄ ∈ K0.

Theorem 7.3. Let the set Λ0 be nonempty at the point x̂ and let

max_{μ∈Λ0} ⟨Lxx(μ, x̂)x̄, x̄⟩ > 0   ∀ x̄ ∈ K0 \ {0}.

Then x̂ is a local minimum in problem (7.5).
These conditions were obtained by Levitin, Milyutin, and Osmolovskii [54, 55]; cf.
also Ben-Tal and Zowe [4].

7.1.2 Relationship between Second-Order Conditions for the Basic and the Induced Optimization Problem

Let T̂ = (x̂(t), û(t) | t ∈ [t̂0, t̂f]) be an admissible trajectory in the basic problem with the properties assumed in Section 7.1.1, and let ζ̂ = (t̂0, t̂f, x̂0, θ̂) be the corresponding admissible point in the Induced Problem.

Lagrange multipliers. Let us define the set Λ0 ⊂ R^{1+d(F)+d(K)} of the triples μ = (α0, α, β) of normalized Lagrange multipliers at the point ζ̂ for the Induced Problem. The Lagrange function for the Induced Problem is

L(μ, ζ) = L(μ, t0, tf, x0, θ) = α0 J(t0, x0, tf, x(tf; t0, x0, θ))
  + αF(t0, x0, tf, x(tf; t0, x0, θ)) + βK(t0, x0, tf, x(tf; t0, x0, θ))
  = l(μ, t0, x0, tf, x(tf; t0, x0, θ)),   (7.6)

where l = α0 J + αF + βK. By definition, Λ0 is the set of multipliers μ = (α0, α, β) such that

α0 ≥ 0,   α ≥ 0,   α0 + |α| + |β| = 1,   αF(p̂) = 0,   Lζ(μ, ζ̂) = 0,   (7.7)

where p̂ = (t̂0, x̂0, t̂f, x̂f), x̂0 = x̂(t̂0), x̂f = x̂(t̂f) = x(t̂f; t̂0, x̂0, θ̂). Now, let us define the corresponding set of normalized Lagrange multipliers for the trajectory T̂ in the basic problem. Denote by Λ the set of multipliers λ = (α0, α, β, ψ, ψ0) such that

α0 ≥ 0,   α ≥ 0,   α0 + |α| + |β| = 1,   αF(p̂) = 0,
−ψ̇(t) = ψ(t) fx(t, x̂(t), û(t)),   −ψ̇0(t) = ψ(t) ft(t, x̂(t), û(t)),
ψ(t̂0) = −l_{x0}(μ, p̂),   ψ(t̂f) = l_{xf}(μ, p̂),   (7.8)
ψ0(t̂0) = −l_{t0}(μ, p̂),   ψ0(t̂f) = l_{tf}(μ, p̂),
ψ(t) f(t, x̂(t), û(t)) + ψ0(t) = 0   ∀ t ∈ Δ̂ \ Θ̂,

where Δ̂ = [t̂0, t̂f] and Θ̂ = {t̂1, . . . , t̂s}.

Proposition 7.4. The projector

π0 : (α0, α, β, ψ, ψ0) → (α0, α, β)   (7.9)

maps the set Λ one-to-one onto the set Λ0.

Let us define the inverse mapping. Take an arbitrary multiplier μ = (α0, α, β) ∈ Λ0. This tuple defines the gradient l_{xf}(μ, p̂), and hence the system

−ψ̇ = ψ fx(t, x̂(t), û(t)),   ψ(t̂f) = l_{xf}(μ, p̂)   (7.10)

defines ψ(t). Define ψ0(t) by the equality

ψ(t) f(t, x̂(t), û(t)) + ψ0(t) = 0.   (7.11)

Proposition 7.5. The inverse mapping

π0^{−1} : (α0, α, β) ∈ Λ0 → (α0, α, β, ψ, ψ0) ∈ Λ   (7.12)

is defined by formulas (7.10) and (7.11).

We note that M0 ⊂ Λ holds, because the system of conditions (6.7)–(6.9) and (6.11) is equivalent to system (7.8). But it may happen that M0 ≠ Λ, since in the definition of Λ there is no requirement that its elements satisfy the minimum condition (6.10). Let us denote Λ0^MP := π0(M0), where MP stands for Minimum Principle.

We say that multipliers μ = (α0, α, β) and λ = (α0, α, β, ψ, ψ0) correspond to each other if they have the same components α0, α, and β, i.e.,

π0(α0, α, β, ψ, ψ0) = (α0, α, β).

Critical cones. We denote by K0 the critical cone at the point ζ̂ in the Induced Problem. Thus, K0 is the set of collections ζ̄ = (t̄0, t̄f, x̄0, θ̄) such that

F0′(ζ̂)ζ̄ ≤ 0,   Fi′(ζ̂)ζ̄ ≤ 0, i ∈ I,   G′(ζ̂)ζ̄ = 0,   (7.13)

where I is the set of indices of the inequality constraints active at the point ζ̂. Let K be the critical cone for the trajectory T̂ in the basic problem, i.e., the set of all tuples z̄ = (t̄0, t̄f, ξ̄, x̄) ∈ Z(Θ̂) satisfying conditions (6.21)–(6.23).

Proposition 7.6. The operator π1 : (t̄0, t̄f, ξ̄, x̄) → (t̄0, t̄f, x̄0, θ̄) defined by

θ̄ = −ξ̄,   x̄0 = x̄(t̂0)   (7.14)

is a one-to-one mapping of the critical cone K (for the trajectory T̂ in the basic problem) onto the critical cone K0 (at the point ζ̂ in the Induced Problem).

We say that the elements ζ̄ = (t̄0, t̄f, x̄0, θ̄) ∈ K0 and z̄ = (t̄0, t̄f, ξ̄, x̄) ∈ K correspond to each other if θ̄ = −ξ̄ and x̄0 = x̄(t̂0), i.e., π1(t̄0, t̄f, ξ̄, x̄) = (t̄0, t̄f, x̄0, θ̄).
Now we give explicit formulas for the inverse mapping for π1. Let V(t) be an n × n matrix-valued function (n = d(x)) which is absolutely continuous in Δ̂ = [t̂0, t̂f] and satisfies the system

V̇(t) = fx(t, x̂(t), û(t)) V(t),   V(t̂0) = E,   (7.15)

where E is the identity matrix. For each k = 1, . . . , s denote by y^k(t) the n-dimensional vector function which is equal to zero in [t̂0, t̂k), and in [t̂k, t̂f] it is the solution to the IVP

ẏ^k = fx(t, x̂(t), û(t)) y^k,   y^k(t̂k) = −[x̂˙]_k.   (7.16)

Hence y^k is a piecewise continuous function with one jump [y^k]_k = −[x̂˙]_k at t̂k.

Proposition 7.7. The inverse mapping π1^{−1} : (t̄0, t̄f, x̄0, θ̄) ∈ K0 → (t̄0, t̄f, ξ̄, x̄) ∈ K is given by the formulas

ξ̄ = −θ̄,   x̄(t) = V(t)( x̄0 − x̂˙(t̂0) t̄0 ) + Σ_{k=1}^{s} y^k(t) t̄_k,   (7.17)

where t̄_k is the kth component of the vector θ̄.

Quadratic forms. For μ ∈ Λ0 the quadratic form of the IOP is equal to

⟨Lζζ(μ, ζ̂)ζ̄, ζ̄⟩.

The main result of this section is the following.
Theorem 7.8. Let the Lagrange multipliers

μ = (α0, α, β) ∈ Λ0^MP and λ = (α0, α, β, ψ, ψ0) ∈ M0

correspond to each other, i.e., π0λ = μ, and let the elements of the critical cones ζ̄ = (t̄0, t̄f, x̄0, θ̄) ∈ K0 and z̄ = (t̄0, t̄f, ξ̄, x̄) ∈ K correspond to each other, i.e., π1z̄ = ζ̄. Then the quadratic forms in the basic and induced problems take equal values: ⟨Lζζ(μ, ζ̂)ζ̄, ζ̄⟩ = Ω(λ, z̄). Consequently,

max_{μ∈Λ0^MP} ⟨Lζζ(μ, ζ̂)ζ̄, ζ̄⟩ = max_{λ∈M0} Ω(λ, z̄)

for each pair of elements of the critical cones ζ̄ ∈ K0 and z̄ ∈ K such that π1z̄ = ζ̄.

Theorems 6.9 and 7.8 and Proposition 7.6 imply the following second-order necessary optimality condition for the basic problem.

Theorem 7.9. If the trajectory T̂ affords a Pontryagin minimum in the basic problem, then the following Condition A0 holds: the set M0 is nonempty and

max_{μ∈Λ0^MP} ⟨Lζζ(μ, ζ̂)ζ̄, ζ̄⟩ ≥ 0   ∀ ζ̄ ∈ K0.

Theorems 6.10 and 7.8 and Proposition 7.6 imply the following second-order sufficient optimality condition for the basic control problem.

Theorem 7.10. Let the following Condition B0 be fulfilled for an admissible trajectory T̂ in the basic problem:
(a) û(t) is a bang-bang control taking values in the set V = ex U;
(b) the set M0 is nonempty, and there exists λ ∈ M0 such that D^k(H) > 0, k = 1, . . . , s, and condition (6.19) holds (hence, û(t) is a strict bang-bang control);
(c) max_{μ∈Λ0^MP} ⟨Lζζ(μ, ζ̂)ζ̄, ζ̄⟩ > 0 for all ζ̄ ∈ K0 \ {0}.
Then T̂ is a strict strong minimum.

Theorem 7.10 is a generalization of sufficient optimality conditions for bang-bang controls obtained in Agrachev, Stefani, and Zezza [1]. The detailed proofs of the preceding theorems will be given in the following sections. Let us point out that the proofs reveal the useful fact that all elements of the Hessian Lζζ(μ, ζ̂) can be computed explicitly on the basis of the transition matrix V(t̂f) in (7.15) and of the first-order variations y^k defined by (7.16). We shall need formulas for all first-order partial derivatives of the function x(tf; t0, x0, θ). We shall make extensive use of the variational system

V̇ = fx(t, x(t; t0, x0, θ), u(t; θ)) V,   V(t0) = E,   (7.18)

where E is the identity matrix. The solution V(t) is an n × n matrix-valued function (n = d(x)) which is absolutely continuous in Δ = [t0, tf]. The solution of (7.18) is denoted by V(t; t0, x0, θ). Along the reference trajectory x̂(t), û(t), i.e., for ζ = ζ̂, we shall use the notation V(t) for simplicity.
7.2 First-Order Derivatives of x(tf; t0, x0, θ) with Respect to t0, tf, x0, and θ. Lagrange Multipliers and Critical Cones
Let x(t; t0, x0, θ) be the solution of the IVP (7.3) and put

g(ζ) = g(t0, tf, x0, θ) := x(tf; t0, x0, θ).   (7.19)

Under our assumptions, the operator g : V → R^n is well defined and C²-smooth if the neighborhood V of the point ζ̂ is sufficiently small. In this section, we shall derive the first-order partial derivatives of g(t0, tf, x0, θ) with respect to t0, tf, x0, and θ at the point ζ̂. We shall use well-known results from the theory of ODEs about differentiation of solutions to ODEs with respect to parameters and initial values. In what follows, it will be convenient to drop those arguments in x(t; t0, x0, θ), u(t; θ), V(t; t0, x0, θ), etc., that are kept fixed.

7.2.1 Derivative ∂x/∂x0

Let us fix θ and t0. The following result is well known in the theory of ODEs.

Proposition 7.11. We have

∂x(t; x0)/∂x0 = V(t; x0),   (7.20)

where the matrix-valued function V(t; x0) is the solution to the IVP (7.18), i.e.,

V̇ = fx(t, x(t), u(t)) V,   V|_{t=t0} = E,   (7.21)

where x(t) = x(t; x0), V̇ = ∂V/∂t.

Consequently, we have

g_{x0}(ζ̂) := ∂x(t̂f; t̂0, x̂0, θ̂)/∂x0 = V(t̂f),   (7.22)

where V(t) satisfies the IVP (7.21) along the trajectory (x̂(t), û(t)), t ∈ [t̂0, t̂f].

7.2.2 Derivatives ∂x/∂t0 and ∂x/∂tf

Fix x0 and θ and put

w(t; t0) = ∂x(t; t0)/∂t0.

Proposition 7.12. The vector function w(t; t0) is the solution to the IVP

ẇ = fx(t, x(t), u(t)) w,   w|_{t=t0} = −ẋ(t0),   (7.23)

where x(t) = x(t; t0), ẇ = ∂w/∂t. Therefore, we have w(t; t0) = −V(t; t0) ẋ(t0), where the matrix-valued function V(t; t0) is the solution to the IVP (7.18).
Hence, we obtain

g_{t0}(ζ̂) := ∂x(t̂f; t̂0, x̂0, θ̂)/∂t0 = −V(t̂f) x̂˙(t̂0).   (7.24)

Obviously, we have

g_{tf}(ζ̂) := ∂x(t̂f; t̂0, x̂0, θ̂)/∂tf = x̂˙(t̂f).   (7.25)

7.2.3 Derivatives ∂x/∂tk

Fix t0 and x0. Take some k and fix tj for all j ≠ k. Put

y^k(t; tk) = ∂x(t; tk)/∂tk

and denote by ẏ^k the derivative of y^k with respect to t.

Proposition 7.13. For t ≥ tk the function y^k(t; tk) is the solution to the IVP

ẏ^k = fx(t, x(t; tk), u(t; tk)) y^k,   y^k|_{t=tk} = −[f]_k,   (7.26)

where [f]_k = f(tk, x(tk; tk), u^{k+1}) − f(tk, x(tk; tk), u^k) is the jump of the function f(t, x(t; tk), u(t; tk)) at the point tk. For t < tk, we have y^k(t; tk) = 0. Thus, [y^k]_k = −[f]_k, where [y^k]_k = y^k(tk+; tk) − y^k(tk−; tk) is the jump of the function y^k(t; tk) at the point tk.

Proof. Let us sketch how to obtain the representation (7.26). For t ≥ tk the trajectory x(t; tk) satisfies the integral equation

x(t; tk) = x(tk−; tk) + ∫_{tk+}^{t} f(h, x(h; tk), u(h; tk)) dh.

By differentiating this equation with respect to tk, we obtain

y^k(t; tk) = ẋ(tk−; tk) − ẋ(tk+; tk) + ∫_{tk+}^{t} fx(h, x(h; tk), u(h; tk)) y^k(h; tk) dh,

from which we get y^k|_{t=tk} = −[f]_k and the variational equation in (7.26).

In particular, we obtain

g_{tk}(ζ̂) := ∂x(t̂f; t̂0, x̂0, θ̂)/∂tk = y^k(t̂f).   (7.27)
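These derivative formulas are easy to check numerically. A minimal sketch for a hypothetical double-integrator example ẋ1 = x2, ẋ2 = u with u = 1 for t < t1 and u = −1 afterwards (not taken from the text; here both x(tf; t1) and y^1(t) are available in closed form):

```python
# x(tf; t1) for the arcs u = +1 on [0, t1], u = -1 on [t1, tf], x(0) = (0, 0)
def x_tf(t1, tf=1.0):
    x1, x2 = 0.5 * t1**2, t1          # state at the switching time
    d = tf - t1
    return (x1 + x2 * d - 0.5 * d**2, x2 - d)

t1, tf, h = 0.4, 1.0, 1e-6

# finite-difference approximation of dx(tf; t1)/dt1
fd = [(a - b) / (2 * h) for a, b in zip(x_tf(t1 + h), x_tf(t1 - h))]

# y^1 from (7.26): y^1(t1) = -[f]_1 = -((0,-1) - (0,1)) = (0, 2);
# with fx = [[0, 1], [0, 0]] this gives y2 = 2 and y1(t) = 2 (t - t1)
y_tf = (2 * (tf - t1), 2.0)

print(fd, y_tf)  # both approx. (1.2, 2.0), confirming (7.27)
```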

7.2.4 Comparison of Lagrange Multipliers

Here, we prove Propositions 7.4 and 7.5. Consider the Lagrangian (7.6) with a multiplier μ = (α0, α, β) ∈ Λ0, where Λ0 is the set (7.7) of normalized Lagrange multipliers at the point ζ̂ in the Induced Problem (7.4). Define the absolutely continuous function ψ(t) and the function ψ0(t) by equations (7.10) and (7.11), respectively. We will show that the function
ψ0(t) is absolutely continuous and the collection λ = (α0, α, β, ψ, ψ0) satisfies all conditions in (7.8) and hence belongs to the set Λ. The conditions

α0 ≥ 0,   α ≥ 0,   α0 + |α| + |β| = 1,   αF(p̂) = 0

in the definitions of Λ0 and Λ are identical. Hence, we must analyze the equation Lζ(μ, ζ̂) = 0 in the definition of Λ0, which is equivalent to the system

L_{t0}(μ, ζ̂) = l_{t0}(p̂) + l_{xf}(p̂) g_{t0}(ζ̂) = 0,
L_{tf}(μ, ζ̂) = l_{tf}(p̂) + l_{xf}(p̂) g_{tf}(ζ̂) = 0,
L_{x0}(μ, ζ̂) = l_{x0}(p̂) + l_{xf}(p̂) g_{x0}(ζ̂) = 0,
L_{tk}(μ, ζ̂) = l_{xf}(p̂) g_{tk}(ζ̂) = 0,   k = 1, . . . , s.

Using the equality l_{xf}(p̂) = ψ(t̂f) and formulas (7.24), (7.25), (7.22), (7.27) for the derivatives of g with respect to t0, tf, x0, tk at the point ζ̂, we get

L_{t0}(μ, ζ̂) = l_{t0}(p̂) − ψ(t̂f) V(t̂f) x̂˙(t̂0) = 0,   (7.28)
L_{tf}(μ, ζ̂) = l_{tf}(p̂) + ψ(t̂f) x̂˙(t̂f) = 0,   (7.29)
L_{x0}(μ, ζ̂) = l_{x0}(p̂) + ψ(t̂f) V(t̂f) = 0,   (7.30)
L_{tk}(μ, ζ̂) = ψ(t̂f) y^k(t̂f) = 0,   k = 1, . . . , s.   (7.31)

Analysis of (7.28). The n × n matrix-valued function V(t) satisfies the equation

V̇ = fx V,   V(t̂0) = E

with fx = fx(t, x̂(t), û(t)). Then Ψ(t) := V^{−1}(t) is the solution to the adjoint equation

−Ψ̇ = Ψ fx,   Ψ(t̂0) = E.

Consequently, ψ(t̂f) = ψ(t̂0) Ψ(t̂f) = ψ(t̂0) V^{−1}(t̂f). Using these relations in (7.28), we get

l_{t0}(p̂) − ψ(t̂0) x̂˙(t̂0) = 0.

By virtue of (7.11), we have ψ(t̂0) x̂˙(t̂0) = −ψ0(t̂0). Hence, (7.28) is equivalent to the transversality condition for ψ0 at the point t̂0:

l_{t0}(p̂) + ψ0(t̂0) = 0.
Analysis of (7.29). Since ψ(t̂f) x̂˙(t̂f) = −ψ0(t̂f) holds, (7.29) is equivalent to the transversality condition for ψ0 at the point t̂f:

l_{tf}(p̂) − ψ0(t̂f) = 0.
Analysis of (7.30). Since ψ(t̂f) = ψ(t̂0) V^{−1}(t̂f), equality (7.30) is equivalent to the transversality condition for ψ at the point t̂0:

l_{x0}(p̂) + ψ(t̂0) = 0.
Analysis of (7.31). We need the following result.

Proposition 7.14. Let the absolutely continuous function y be a solution to the system ẏ = fx y on an interval Δ, and let the absolutely continuous function ψ be a solution to the adjoint system −ψ̇ = ψ fx on the same interval, where fx = fx(t, x̂(t), û(t)). Then ψ(t)y(t) ≡ const on Δ.

Proof. We have (d/dt)(ψy) = ψ̇y + ψẏ = −ψfx y + ψfx y = 0.

It follows from this proposition and (7.26) that, for k = 1, . . . , s,

ψ(t̂f) y^k(t̂f) = ψ(t̂k) y^k(t̂k+) = ψ(t̂k)[y^k]_k = −ψ(t̂k)[x̂˙]_k = −[ψ x̂˙]_k = [ψ0]_k.

Therefore, (7.31) is equivalent to the conditions

[ψ0]_k = 0,   k = 1, . . . , s,

which means that ψ0 is continuous at each point t̂k, k = 1, . . . , s, and hence absolutely continuous on Δ̂ = [t̂0, t̂f]. Moreover, it follows from 0 = [ψ0]_k = −ψ(t̂k)[x̂˙]_k that

φ(t̂k)[û]_k = 0,   (7.32)

where φ(t) = ψ(t)B(t, x̂(t)) denotes the switching function. Finally, differentiating (7.11) with respect to t, we get

−ψ fx x̂˙ + ψ ft + ψ fx x̂˙ + ψ̇0 = 0,   i.e.,   −ψ̇0 = ψ ft.
Thus, we have proved that λ = (α0, α, β, ψ, ψ0) ∈ Λ. Conversely, if (α0, α, β, ψ, ψ0) ∈ Λ, then one can show similarly that (α0, α, β) ∈ Λ0. Moreover, it is obvious that the projector (7.9) is injective on Λ, because ψ and ψ0 are defined uniquely by conditions (7.10) and (7.11), respectively.

7.2.5 Comparison of the Critical Cones

Take an element ζ̄ = (t̄0, t̄f, x̄0, θ̄) of the critical cone K0 (see (7.13)) at the point ζ̂ in the Induced Problem:

F0′(ζ̂)ζ̄ ≤ 0,   Fi′(ζ̂)ζ̄ ≤ 0, i ∈ I,   G′(ζ̂)ζ̄ = 0.
Define ξ̄ and x̄ by formulas (7.17),

ξ̄ = −θ̄,   x̄(t) = V(t)( x̄0 − x̂˙(t̂0) t̄0 ) + Σ_{k=1}^{s} y^k(t) t̄_k,

where t̄_k is the kth component of the vector θ̄, and put z̄ = (t̄0, t̄f, ξ̄, x̄). We shall show that z̄ is an element of the critical cone K (equations (6.22) and (6.23)) for the trajectory
T̂ = {(x̂(t), û(t)) | t ∈ [t̂0, t̂f]} in the basic problem. Consider the first inequality F0′(ζ̂)ζ̄ ≤ 0, where F0(ζ) := J(t0, x0, tf, x(tf; t0, x0, θ)). We obviously have

F0′(ζ̂)ζ̄ = ( J_{t0}(p̂) + J_{xf}(p̂) g_{t0}(ζ̂) ) t̄0 + ( J_{tf}(p̂) + J_{xf}(p̂) g_{tf}(ζ̂) ) t̄f
  + ( J_{x0}(p̂) + J_{xf}(p̂) g_{x0}(ζ̂) ) x̄0 + Σ_{k=1}^{s} J_{xf}(p̂) g_{tk}(ζ̂) θ̄_k,

where θ̄_k = t̄_k is the kth component of the vector θ̄. Using formulas (7.24), (7.25), (7.22), (7.27) for the derivatives of g with respect to t0, tf, x0, tk at the point ζ̂, we get

F0′(ζ̂)ζ̄ = ( J_{t0}(p̂) − J_{xf}(p̂) V(t̂f) x̂˙(t̂0) ) t̄0 + ( J_{tf}(p̂) + J_{xf}(p̂) x̂˙(t̂f) ) t̄f
  + ( J_{x0}(p̂) + J_{xf}(p̂) V(t̂f) ) x̄0 + Σ_{k=1}^{s} J_{xf}(p̂) y^k(t̂f) θ̄_k.
Hence, the inequality F0′(ζ̂)ζ̄ ≤ 0 is equivalent to the inequality

J_{t0}(p̂) t̄0 + J_{tf}(p̂) t̄f + J_{x0}(p̂) x̄0
  + J_{xf}(p̂) ( V(t̂f)( x̄0 − x̂˙(t̂0) t̄0 ) + Σ_{k=1}^{s} y^k(t̂f) θ̄_k + x̂˙(t̂f) t̄f ) ≤ 0.
It follows from the definition (7.17) of x̄ that

x̄̄0 := x̄(t̂0) + x̂˙(t̂0) t̄0 = x̄0,   (7.33)

since V(t̂0) = E and y^k(t̂0) = 0, k = 1, . . . , s. Moreover, using the same definition, we get

x̄̄f := x̄(t̂f) + x̂˙(t̂f) t̄f = V(t̂f)( x̄0 − x̂˙(t̂0) t̄0 ) + Σ_{k=1}^{s} y^k(t̂f) t̄_k + x̂˙(t̂f) t̄f.   (7.34)
Thus, the inequality F0′(ζ̂)ζ̄ ≤ 0 is equivalent to the inequality

J_{t0}(p̂) t̄0 + J_{tf}(p̂) t̄f + J_{x0}(p̂) x̄̄0 + J_{xf}(p̂) x̄̄f ≤ 0,

or briefly,

J′(p̂) p̄̄ ≤ 0,

where p̄̄ = (t̄0, x̄̄0, t̄f, x̄̄f); see equation (6.21).
Similarly, the inequalities Fi′(ζ̂)ζ̄ ≤ 0 for all i ∈ I and the equality G′(ζ̂)ζ̄ = 0 in the definition of K0 are equivalent to the inequalities (respectively, equalities)

Fi′(p̂) p̄̄ ≤ 0, i ∈ I,   K′(p̂) p̄̄ = 0,

in the definition of K; cf. (6.22).


Since V̇ = fx(t, x̂(t), û(t)) V and ẏ^k = fx(t, x̂(t), û(t)) y^k, k = 1, . . . , s, it follows from definition (7.17) that x̄ is a solution to the same linear system

x̄˙ = fx(t, x̂(t), û(t)) x̄.


Finally, recall from (7.26) that for each k = 1, . . . , s the function y^k(t) is piecewise continuous with only one jump [y^k]_k = −[x̂˙]_k at the point t̂k and is absolutely continuous on each of the half-open intervals [t̂0, t̂k) and (t̂k, t̂f]. Moreover, the function V(t) is absolutely continuous in [t̂0, t̂f]. Hence, x̄(t) is a piecewise continuous function which is absolutely continuous on each interval of the set [t̂0, t̂f] \ Θ̂ and satisfies the jump conditions

[x̄]_k = [x̂˙]_k ξ̄_k,   ξ̄_k = −t̄_k,   k = 1, . . . , s.

Thus, we have proved that z̄ = (t̄0, t̄f, ξ̄, x̄) is an element of the critical cone K. Similarly, one can show that if z̄ = (t̄0, t̄f, ξ̄, x̄) ∈ K, then putting x̄0 = x̄(t̂0) and θ̄ = −ξ̄, we obtain the element ζ̄ = (t̄0, t̄f, x̄0, θ̄) of the critical cone K0.

7.3 Second-Order Derivatives of x(tf ; t0 , x0 , θ ) with


Respect to t0 , tf , x0 , and θ
In this section we shall give formulas for all second-order partial derivatives of the functions
x(t; t0 , x0 , θ) and g(ζ ) = g(t0 , tf , x0 , θ ) := x(tf ; t0 , x0 , θ )

at the point ζ̂ . We are not sure whether all of them are known; therefore we shall also sketch
the proofs. Here x(t; t0 , x0 , θ) is the solution to IVP (7.3). Denote by gk (ζ ) := xk (tf ; t0 , x0 , θ )
the kth component of the function g.

7.3.1 Derivatives (gk )x0 x0


Let x(t; x0 ) be the solution to the IVP (7.3) with fixed t0 and θ , and let xk (t; x0 ) be its kth
component. For k = 1, . . . , n, we define the n × n matrix
∂ 2 xk (t; x0 ) ∂ 2 xk (t; x0 )
W k (t; x0 ) := with entries wijk (t; x0 ) = ,
∂x0 ∂x0 ∂x0i ∂x0j
where x0i is the ith component of the column vector x0 ∈ Rn .

Proposition 7.15. The matrix-valued functions W k (t; x0 ), k = 1, . . . , n, satisfy the IVPs



n
Ẇ k = V ∗fkxx V + fkxr W r , W k |t=t0 = O, k = 1, . . . , n, (7.35)
r=1

∂W k
where Ẇ k = ∂t , O is the zero matrix, fk is the kth component of the vector function f , and

∂fk (t, x(t; x0 ), u(t)) ∂ 2 fk (t, x(t; x0 ), u(t))


fkxr = , fkxx =
∂xr ∂x∂x
are its partial derivatives at the point (t, x(t; x0 ), u(t)) for t ∈ [t0 , tf ].

Proof. For notational convenience, we use the function ϕ(t, x) := f (t, x, u(t)). By Propo-
0) ∂xi (t;x0 )
sition 7.11, the matrix-valued function V (t; x0 ) = ∂x(t;x
∂x0 with entries vij (t; x0 ) = ∂x0j is
7.3. Second-Order Derivatives of x(tf ; t0 , x0 , θ ) with Respect to t0 , tf , x0 , and θ 311

the solution to the IVP (7.18). Consequently, its entries satisfy the equations
∂ ẋk (t; x0 )  ∂xr (t; x0 )
= ϕkxr (t, x(t; x0 ))
∂x0i r
∂x0i
∂xk (t0 ; x0 )
= eki , k, i = 1, . . . , n,
∂x0i
where eki are the elements of the identity matrix E. By differentiating these equations with
respect to x0j , we get

∂ 2 ẋk (t; x0 )    ∂xr (t; x0 )


= ϕkxr (t, x(t; x0 )) x
∂x0i ∂x0j r
0j ∂x0i
 ∂ 2 xr (t; x0 )
+ ϕkxr (t, x(t; x0 )) , (7.36)
r
∂x0i ∂x0j
∂ 2 xk (t0 ; x0 )
= 0, k, i, j = 1, . . . , n. (7.37)
∂x0i ∂x0j
Transforming the first sum in the right-hand side of (7.36), we get
  ∂xr (t; x0 )
ϕkxr (t, x(t; x0 )) x0j
r
∂x0i
 ∂xs (t; x0 ) ∂xr (t; x0 )
= ϕkxr xs (t, x(t; x0 )) ·
r s
∂x0j ∂x0i
 
= V ∗ ϕkxx (t, x(t; x0 ))V ij , k, i, j = 1, . . . , n,

where (A)ij denotes the element aij of a matrix A, and A∗ denotes the transposed matrix.
Thus, (7.36) and (7.37) imply (7.35).

It follows from Proposition 7.15 that

∂ 2 xk (tˆf ; tˆ0 , x̂0 , θ̂ )


(gk )x0 x0 (ζ̂ ) := = W k (tˆf ), k = 1, . . . , n, (7.38)
∂x0 ∂x0
where the matrix-valued functions W k (t), k = 1, . . . , n, satisfy the IVPs (7.35) along the
reference trajectory (x̂(t), û(t)).

7.3.2 Mixed Derivatives gx0 tk


Let s = 1 for notational convenience, and thus θ = t1 . Fix t0 and consider the functions
∂x(t; x0 , θ ) ∂x(t; x0 , θ )
V (t; x0 , θ) = , y(t; x0 , θ ) = ,
∂x0 ∂θ
∂V (t; x0 , θ ) ∂ 2 x(t; x0 , θ )
R(t; x0 , θ) = = ,
∂θ ∂x0 ∂θ
∂V (t; x0 , θ ) ∂R(t; x0 , θ )
V̇ (t; x0 , θ) = , Ṙ(t; x0 , θ ) = .
∂t ∂t
312 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Then V , V̇ and R, Ṙ are n × n matrix-valued functions, and y is a vector function of


dimension n.

Proposition 7.16. For t ≥ θ, the function R(t; x0 , θ ) is the solution to the IVP
Ṙ = (y ∗fxx )V + fx R, R(θ; x0 , θ ) = −[fx ]V (θ ; x0 , θ ), (7.39)
where fx and fxx are taken along the trajectory (t, x(t; x0 , θ ), u(t, θ )), t ∈ [t0 , tf ]. Here, by
definition, (y ∗fxx ) is an n × n matrix with entries

n
∂ 2 fk

(y fxx )k,j = yi (7.40)
∂xi ∂xj
i=1

in the kth row and j th column, and


[fx ] = fx (θ , x(θ ; x0 , θ ), u2 ) − fx (θ , x(θ; x0 , θ ), u1 )
is the jump of the function fx (·, x(·; x0 , θ ), u(·, θ )) at the point θ . For t < θ, we have
R(t; x0 , θ) = 0.

Proof. According to Proposition 7.11 the matrix-valued function V is the solution to the
system
V̇ (t; x0 , θ) = fx (t, x(t; x0 , θ ), u(t; θ))V (t; x0 , θ ). (7.41)
By differentiating this equality with respect to θ , we get the equation

∂ V̇  ∂xi ∂V
= (fx V ) xi + fx ,
∂θ ∂θ ∂θ
i

which is equivalent to 
Ṙ = (fx V )xi yi + fx R. (7.42)
i
Upon defining 
A= (fx V )xi yi ,
i
the element in the rth row and sth column of the matrix A is equal to
⎛ ⎞
  
ars = ((fx V )rs )xi yi = ⎝ frxj vj s ⎠ yi
i i j x
 i
  
= yi frxi xj vj s = yi frxi xj vj s
i j j i
   
∗ ∗
= y fxx v
rj j s
= (y fxx )V rs
,
j

where vj s is the element in the j th row and sth column of the matrix V . Hence we have
A = (y ∗fxx )V and see that (7.42) is equivalent to (7.39). The initial condition in (7.39),
7.3. Second-Order Derivatives of x(tf ; t0 , x0 , θ ) with Respect to t0 , tf , x0 , and θ 313

which is similar to the initial condition (7.26) in Proposition 7.13, follows from (7.41) (see
the proof of Proposition 7.13). The condition R(t; x0 , θ ) = 0 for t < θ is obvious.

Proposition 7.16 yields

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gx0 tk (ζ̂ ) := = R k (tˆf ), (7.43)
∂x0 ∂tk

where the matrix-valued function R k (t) satisfies the IVP



Ṙ k (t) = y k (t)∗ fxx (t, x̂(t), û(t)) V (t) + fx (t, x̂(t), û(t))R k (t), t ∈ [tˆk , tˆf ],
R k (tˆk ) = −[fx ]k V (tˆk ). (7.44)

Here, V (t) is the solution to the IVP (7.18), y k (t) is the solution to the IVP (7.26) (for
t0 = tˆ0 , x0 = x̂0 , θ = θ̂), and [fx ]k = f (tˆk , x̂(tˆk ), û(tˆk +)) − f (tˆk , x̂(tˆk ), û(tˆk −)), k = 1, . . . , s.

7.3.3 Derivatives gtk tk


Again, for simplicity let s = 1. Fix t0 and x0 and put

∂x(t; θ ) ∂y(t; θ ) ∂ 2 x(t; θ)


y(t; θ) = , z(t; θ ) = = ,
∂θ ∂θ ∂θ 2
∂y(t; θ ) ∂z(t; θ)
ẏ(t; θ ) = , ż(t; θ ) = .
∂t ∂t
Then y, ẏ and z, ż are vector functions of dimension n.

Proposition 7.17. For t ≥ θ the function z(t; θ ) is the solution to the system

ż = fx z + y ∗fxx y (7.45)

with the initial condition at the point t = θ ,

z(θ; θ ) + ẏ(θ +; θ ) = −[ft ] − [fx ](ẋ(θ +; θ ) + y(θ ; θ)). (7.46)

In (7.45), fx and fxx are taken along the trajectory (t, x(t; θ ), u(t; θ )), t ∈ [t0 , tf ], and y ∗fxx y
is a vector with elements

n
∂ 2 fk
(y ∗fxx y)k = y ∗fkxx y = yi yj , k = 1, . . . , n.
∂xi ∂xj
i,j =1

In (7.46), the expressions

[ft ] = ft (θ, x(θ ; θ ), u2 ) − ft (θ , x(θ; θ ), u1 ),


[fx ] = fx (θ , x(θ; θ ), u2 ) − fx (θ , x(θ; θ ), u1 )

are the jumps of the derivatives ft (t, x(t; θ), u(t; θ )) and fx (t, x(t; θ), u(t; θ )) at the point θ
(u2 = u(θ +; θ), u1 = u(θ−; θ )). For t < θ , we have z(t; θ) = 0.
314 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Proof. By Proposition 7.13, for t ≥ θ the function y(t; θ ) is the solution to the IVP

ẏ(t; θ ) = fx (t, x(t; θ ), u(t; θ))y(t; θ),


y(θ ; θ ) = −(f (θ, x(θ ; θ ), u2 ) − f (θ , x(θ; θ ), u1 )).

By differentiating these equalities with respect to θ at the points θ and θ +, we obtain (7.45)
and (7.46). For t < θ we have y = 0 and hence z = 0.

For the solution x(t; t0 , x0 , θ ) to the IVP (7.3) with an arbitrary s, it follows from
Proposition 7.17 that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gtk tk (ζ̂ ) := = zkk (tˆf ), k = 1, . . . , s, (7.47)
∂tk ∂tk

where for t ≥ tˆk the vector function zkk (t) satisfies the equation

żkk (t) = fx (t, x̂(t), û(t))zkk (t) + y k (t)∗ fxx (t, x̂(t), û(t))y k (t) (7.48)

with the initial condition at the point t = tˆk :

˙ tˆk +) + y k (tˆk )).


zkk (tˆk ) + ẏ k (tˆk +) = −[ft ]k − [fx ]k (x̂( (7.49)

Here, for t ≥ tˆk , the function y k (t) is the solution to the IVP (7.26), and y k (t) = 0
for t < tˆk , k = 1, . . . , s. Furthermore, by definition, [ft ]k = ft (tˆk , x̂(tˆk ), û(tˆk +)) −
ft (tˆk , x̂(tˆk ), û(tˆk −)) and [fx ]k = fx (tˆk , x̂(tˆk ), û(tˆk +)) − fx (tˆk , x̂(tˆk ), û(tˆk −)) are the jumps
of the derivatives ft (t, x̂(t), û(t)) and fx (t, x̂(t), û(t)) at the point tˆk . For t < tˆk we put
zkk (t) = 0, k = 1, . . . , s.

7.3.4 Mixed Derivatives gtk tj


For simplicity, let s = 2, θ = (t1 , t2 ), and t0 < t1 < t2 < tf . Fix x0 and t0 and put

∂x(t; θ ) ∂y 1 (t; θ ) ∂ 2 x(t; θ )


y k (t; θ ) = , k = 1, 2, z12 (t; θ ) = = ,
∂tk ∂t2 ∂t1 ∂t2
∂y k (t; θ ) ∂z12 (t; θ )
ẏ k (t; θ ) = , k = 1, 2, ż12 (t; θ ) = .
∂t ∂t

Then y k , ẏ k , k = 1, 2, and z12 , ż12 are vector functions of dimension n.

Proposition 7.18. For t ≥ t2 the function z12 (t; θ ) is the solution to the system

ż12 = fx z12 + (y 1 )∗ fxx y 2 (7.50)

with the initial condition at the point t = t2 ,

z12 (t2 ; θ ) = −[ẏ 1 ]2 . (7.51)


7.3. Second-Order Derivatives of x(tf ; t0 , x0 , θ ) with Respect to t0 , tf , x0 , and θ 315

In (7.50), fx and fxx are taken along the trajectory (t, x(t; θ ), u(t; θ )), t ∈ [t0 , tf ], and
(y 1 )∗ fxx y 2 is a vector with elements

n
∂ 2 fk 1 2
((y 1 )∗ fxx y 2 )k = (y 1 )∗ fkxx y 2 = y y , k = 1, . . . , n.
∂xi ∂xj i j
i,j =1

In (7.51) we have [ẏ 1 ]2 = [fx ]2 y 1 (t2 ; θ ), where


[fx ]2 = fx (t2 , x(t2 ; θ), u3 ) − fx (t2 , x(t2 ; θ ), u2 ).
For t < t2 we have z12 (t; θ ) = 0.

Proof. By Proposition 7.13, for t ≥ t1 the function y 1 (t; θ ) is a solution to the equation
ẏ 1 (t; θ ) = fx (t, x(t; θ ), u(t; θ))y 1 (t; θ ),
where y 1 (t; θ ) = 0 for t < t1 . Differentiating this equation with respect to t2 , we see that for
1
t ≥ t2 , the function z12 (t; θ ) = ∂y ∂t(t;θ)
2
is a solution to system (7.50). The initial condition
(7.51) is similar to the initial condition (7.26) in Proposition 7.13. For t < t2 , we obviously
have z12 (t; θ ) = 0.

For the solution x(t; t0 , x0 , θ ) of IVP (7.3) and for tk < tj (k, j = 1, . . . , s), it follows
from Proposition 7.18 that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gtk tj (ζ̂ ) := = zkj (tˆf ), (7.52)
∂tk ∂tj

where for t ≥ tˆj the vector function zkj (t) is the solution to the equation

żkj (t) = fx (t, x̂(t), û(t))zkj (t) + y k (t)∗ fxx (t, x̂(t), û(t))y j (t) (7.53)
satisfying the initial condition
zkj (tˆj ) = −[ẏ k ]j = −[fx ]j y k (tˆj ). (7.54)

Here, for t ≥ tˆk , the function y k (t) is the solution to the IVP (7.26), while
y k (t)= 0 holds for t < tˆk , k = 1, . . . , s. By definition, [ẏ k ]j = ẏ k (tˆj +) − ẏ k (tˆj −) and
[fx ]j = fx (tˆj , x̂(tˆj ), û(tˆj +)) − fx (tˆj , x̂(tˆj ), û(tˆj −)) are the jumps of the derivatives ẏ k (t)
and fx (t, x̂(t), û(t)), respectively, at the point tˆj . For t < tˆj we put zkj (t) = 0.

7.3.5 Derivatives gt0 t0 , gt0 tf , and gtf tf


Here, we fix x0 and θ and study the functions

∂x(t; t0 ) ∂w(t; t0 ) ∂ 2 x(t; t0 )


w(t; t0 ) = , q(t; t0 ) = = ,
∂t0 ∂t0 ∂t02
∂w(t; t0 ) ∂q(t; t0 ) ∂ 2 x(t; t0 , )
ẇ(t; t0 ) = , q̇(t; t0 ) = , ẍ(t; t0 ) = .
∂t ∂t ∂t 2
316 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Proposition 7.19. The function q(t; t0 ) is the solution to the system


q̇ = fx q + w ∗fxx w, t ∈ [t0 , tf ] (7.55)
satisfying the initial condition at the point t = t0 ,
ẍ(t0 ; t0 ) + 2ẇ(t0 ; t0 ) + q(t0 ; t0 ) = 0. (7.56)
In (7.55), fx and fxx are taken along the trajectory (t, x(t; t0 ), u(t)), t ∈ [t0 , tf ], and w ∗fxx w
is a vector with elements

n
∂ 2 fk
(w ∗fxx w)k = w∗fkxx w = wi wj , k = 1, . . . , n.
∂xi ∂xj
i,j =1

Proof. By Proposition 7.12 we have


ẇ(t; t0 ) = fx (t, x(t; t0 ))w(t; t0 ), ẋ(t0 ; t0 ) + w(t0 ; t0 ) = 0.
Differentiating these equalities with respect to t0 , we obtain (7.55) and (7.56).

From Proposition 7.19 it follows that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gt0 t0 (ζ̂ ) := = q(tˆf ), (7.57)
∂t02
where the vector function q(t) is the solution to the equation
q̇(t) = fx (t, x̂(t), û(t))q(t) + w ∗ (t)fxx (t, x̂(t), û(t))w(t) (7.58)
satisfying the initial condition
¨ tˆ0 ) + 2ẇ(tˆ0 ) + q(tˆ0 ) = 0.
x̂( (7.59)
˙ tˆ0 ) in view of Proposition 7.12, V̇ = fx V , and V (tˆ0 ) = E, we obtain
Since w(t) = −V (t)x̂(
˙ tˆ0 ) = −fx (tˆ0 , x̂(tˆ0 ), û(tˆ0 ))x̂(
ẇ(tˆ0 ) = −V̇ (tˆ0 )x̂( ˙ tˆ0 ).

Thus, the initial condition (7.59) is equivalent to


¨ tˆ0 ) − 2fx (tˆ0 , x̂(tˆ0 ), û(tˆ0 ))x̂(
x̂( ˙ tˆ0 ) + q(tˆ0 ) = 0. (7.60)
From (7.24) it follows that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gt0 tf (ζ̂ ) :=
∂t0 ∂tf
˙ tˆ0 ) = −fx (tˆf , x̂(tˆf ), û(tˆf ))V (tˆf )x̂(
= −V̇ (tˆf )x̂( ˙ tˆ0 ). (7.61)
Formula (7.25) implies that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ ) ¨ tˆf ).


gtf tf (ζ̂ ) := = x̂( (7.62)
∂tf2
7.3. Second-Order Derivatives of x(tf ; t0 , x0 , θ ) with Respect to t0 , tf , x0 , and θ 317

7.3.6 Derivatives gx0 tf and gtk tf


Formula (7.22) implies that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gx0 tf (ζ̂ ) := = V̇ (tˆf ), (7.63)
∂x0 ∂tf

where V (t) is the solution to the IVP (7.18). From (7.27) it follows that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gtk tf (ζ̂ ) := = ẏ k (tˆf ), k = 1, . . . , s, (7.64)
∂tk ∂tf
where y k (t) is the solution to the IVP (7.26).

7.3.7 Derivative gx0 t0


Let us fix θ and consider
∂x(t; t0 , x0 ) ∂V (t; t0 , x0 ) ∂ 2 x(t; t0 , x0 )
V (t; t0 , x0 ) = , S(t; t0 , x0 ) = = ,
∂x0 ∂t0 ∂x0 ∂t0
∂V (t; t0 , x0 ) ∂S(t; t0 , x0 )
V̇ (t; t0 , x0 ) = , Ṡ(t; t0 , x0 ) = .
∂t ∂t

Proposition 7.20. The elements sij (t; t0 , x0 ) of the matrix S(t; t0 , x0 ) satisfy the system

ṡij = −ej∗ V ∗ (fi )xx V ẋ(t0 ) + fix Sej , i, j = 1, . . . , n, (7.65)

and the matrix S itself satisfies the initial condition at the point t = t0 ,

S(t0 ; t0 , x0 ) + V̇ (t0 ; t0 , x0 ) = 0. (7.66)

In (7.65), the derivatives fx and fxx are taken along the trajectory (t, x(t; t0 , x0 ), u(t)),
t ∈ [t0 , tf ], ej is the j th column of the identity matrix E, and, by definition, ẋ(t0 ) = ẋ(t0 ; t0 , x0 ).

Proof. By Proposition 7.11,

V̇ (t; t0 , x0 ) = fx (t, x(t; t0 , x0 ), u(t))V (t; t0 , x0 ), V (t0 ; t0 , x0 ) = E. (7.67)

The first equality in (7.67) is equivalent to

v̇ij (t; t0 , x0 ) = fix (t, x(t; t0 , x0 ), u(t))V (t, t0 )ej , i, j = 1, . . . , n.

By differentiating these equalities with respect to t0 and using Proposition 7.12, we obtain
(7.65). Differentiating the second equality in (7.67) with respect to t0 yields (7.66).

Proposition 7.20 implies that

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gx0 t0 (ζ̂ ) := = S(tˆf ), (7.68)
∂x0 ∂t0
318 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

where the elements sij (t) of the matrix S(t) satisfy the system

˙ tˆ0 ) + fix (t, x̂(t), û(t))S(t)ej ,


ṡij (t) = ej∗ V ∗ (t)(fi )xx (t, x̂(t), û(t))V (t)x̂(
i, j = 1, . . . , n. (7.69)

Here, V (t) is the solution to the IVP (7.18), and the matrix S(t) itself satisfies the initial
condition at the point t = tˆ0 ,
S(tˆ0 ) + V̇ (tˆ0 ) = 0. (7.70)
.
7.3.8 Derivative gtk t0
Consider again the case s = 1 with θ = t1 and define

∂x(t; t0 , θ ) ∂y(t; t0 , θ ) ∂ 2 x(t; t0 , θ )


y(t; t0 , θ) = , r(t; t0 , θ ) = = ,
∂θ ∂t0 ∂t0 ∂θ
∂y(t; t0 , θ ) ∂r(t, t0 , θ )
ẏ(t; t0 , θ) = , ṙ(t; t0 , θ ) = ,
∂t ∂t
∂x(t; t0 , θ ) ∂x(t; t0 , θ )
ẋ(t; t0 , θ) = , V (t; t0 , θ ) = .
∂t ∂x0

Proposition 7.21. For t ≥ θ , the function r(t; t0 , θ ) is the solution to the IVP

ṙ = fx r − y ∗fxx V ẋ(t0 ), r|t=θ = [fx ]V (θ)ẋ(t0 ), (7.71)

where y ∗fxx V ẋ(t0 ) is the vector with elements (y ∗fxx V ẋ(t0 ))i = y ∗fixx V ẋ(t0 ), i = 1, . . . , n,
V (θ ) = V (θ; t0 , θ), and

[fx ] = fx (θ , x(θ; t0 , θ ), u2 ) − fx (θ , x(θ; t0 , θ ), u1 )

is the jump of the derivative fx (t, x(t; t0 , θ ), u(t; θ )) at the point θ . The derivatives fx
and fxx are taken along the trajectory (t, x(t; t0 , θ ), u(t; θ )), t ∈ [θ , tf ]. For t < θ we have
r(t; t0 , θ) = 0. Then the jump of the function r(t; t0 , θ ) at the point t = θ is given by [r] =
[fx ]V (θ )ẋ(t0 ).

Proof. By Proposition 7.13 we have y(t; t0 , θ ) = 0 for t < θ and hence r(t; t0 , θ ) = 0 for
t < θ. According to the same proposition, for t ≥ θ the function y(t; t0 , θ ) satisfies the
equation
ẏ(t; t0 , θ) = fx (t, x(t; t0 , θ ), u(t; θ))y(t; t0 , θ ).
Differentiating this equation with respect to t0 , we get

∂x
ṙ = fx r + y ∗fxx .
∂t0
According to Proposition 7.12,

∂x(t; t0 , θ )
= −V (t; t0 , θ )ẋ(t0 ),
∂t0
7.4. Quadratic Form for the Induced Optimization Problem 319

where ẋ(t0 ) = ẋ(t0 ; t0 , θ). This yields

ṙ = fx r − y ∗fxx V ẋ(t0 ).

By Proposition 7.13, the following initial condition holds at the point t = θ :

y(θ ; t0 , θ) = −(f (θ, x(θ ; t0 , θ ), u2 ) − f (θ , x(θ; t0 , θ ), u1 )).

Differentiating this condition with respect to t0 , we get


∂x
r|t=θ = −[fx ] |t=θ = [fx ]V (θ)ẋ(t0 ),
∂t0
where V (θ ) = V (θ ; t0 , θ).

It follows from Proposition 7.21 that for each k = 1, . . . , s,

∂ 2 x(tˆf ; tˆ0 , x̂0 , θ̂ )


gtk t0 (ζ̂ ) := = r k (tˆf ), (7.72)
∂tk ∂t0

where the function r k (t) is the solution to the system


˙ tˆ0 )
ṙ k (t) = fx (t, x̂(t), û(t))r k (t) − (y k (t))∗ fxx (t, x̂(t), û(t))V (t)x̂( (7.73)

and satisfies the initial condition at the point t = tˆk ,


˙ tˆ0 ).
r k (tˆk ) = [fx ]k V (tˆk )x̂( (7.74)

Here V (t) is the solution to the IVP (7.18) and y k (t) is the solution to the IVP (7.26). The
˙ tˆ0 ) has components
vector (y k )∗ fxx V x̂(
˙ tˆ0 ))j = (y k )∗ fj xx V x̂(
((y k )∗ fxx V x̂( ˙ tˆ0 ), j = 1, . . . , n.

7.4 Explicit Representation of the Quadratic Form for the


Induced Optimization Problem
Let the Lagrange multipliers μ = (α0 , α, β) ∈ 0 and λ = (α0 , α, β, ψ, ψ0 ) ∈  correspond
to each other, i.e., let π0 λ = μ hold; see Proposition 7.4. For any ζ̄ = (t¯0 , t¯f , x̄0 , θ̄ ) ∈ R2+n+s ,
let us find an explicit representation for the quadratic form Lζ ζ (μ, ζ̂ )ζ̄ , ζ̄ . By definition,


s 
s
Lζ ζ (μ, ζ̂ )ζ̄ , ζ̄ = Lx0 x0 x̄0 , x̄0 + 2 Lx0 tk x̄0 t¯k + Ltk tj t¯k t¯j
k=1 k,j =1

s
+ 2Lx0 tf x̄0 t¯f + 2 Ltk tf t¯k t¯f + Ltf tf t¯f2
k=1

s
+ 2Lx0 t0 x̄0 t¯0 + 2 Lt0 tk t¯0 t¯k + 2Lt0 tf t¯0 t¯f + Lt0 t0 t¯02 . (7.75)
k=1
320 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

All derivatives in formula (7.75) are taken at the point (μ, ζ̂ ). Now we shall calculate these
derivatives. Recall the definition (7.6) of the Lagrangian,

L(μ, ζ ) = L(μ, t0 , tf , x0 , θ ) = l(μ, t0 , x0 , tf , x(tf ; t0 , x0 , θ )). (7.76)

Note that all functions V , W k , y k , zkj , S, R k , q, w, r k , introduced in Sections 7.1.2 and 7.3,
depend now on t, t0 , x0 , and θ . For simplicity, we put V (t) = V (t; tˆ0 , x̂0 , θ̂ ), etc.

7.4.1 Derivative Lx0 x0


Using Proposition 7.11, we get
 

l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) x̄0 = lx0 (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))x̄0
∂x0
+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))V (tf ; t0 , x0 , θ )x̄0 . (7.77)

Let us find the derivative of this function with respect to x0 . We have


lx0 (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))x̄0 = x̄0∗ lx0 x0 (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))
∂x0
+ x̄0∗ lx0 xf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))V (tf ; t0 , x0 , θ ), (7.78)

and

lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))V (tf ; t0 , x0 , θ ))x̄0
∂x0

= x̄0∗ V ∗ (tf ; t0 , x0 , θ ) lxf x0 (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))

+ lxf xf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))V (tf ; t0 , x0 , θ )

+lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ )) V (tf ; t0 , x0 , θ )x̄0 . (7.79)
∂x0

From (7.77)–(7.79) and the transversality condition lxf (p̂) = ψ(tˆf ) it follows that at the
point ζ̂ , we have

Lx0 x0 x̄0 , x̄0 = x̄0∗ lx0 x0 (p̂)x̄0 + 2x̄0∗ lx0 xf (p̂)V (tˆf )x̄0 + x̄0∗ V ∗ (tˆf )lxf xf (p̂)V (tˆf )x̄0
 
∂ 
+ ψ(tf ) (V (tf ; t0 , x0 , θ )x̄0 )x̄0  . (7.80)
∂x 0 ζ =ζ̂

Let us calculate the last term in this formula.

Proposition 7.22. The following equality holds:


 
∂   

ψ(tf ) V (tf ; t0 , x0 , θ)x̄0 x̄0 = x̄0 k
ψk (tf )W (tf ; t0 , x0 , θ ) x̄0 . (7.81)
∂x0
k
7.4. Quadratic Form for the Induced Optimization Problem 321

Proof. For brevity, put ψ(tf ) = ψ, V (tf ; t0 , x0 , θ ) = V , W (tf ; t0 , x0 , θ ) = W . Then we have


   
∂ ∂ ∂x ∂  ∂x
ψ (V x̄0 ) x̄0 = ψ x̄0 x̄0 = ψ x̄0i x̄0
∂x0 ∂x0 ∂x0 ∂x0 ∂x0i
i
  ∂ 2x  ∂ 2 xk
= ψ x̄0i x̄0j = ψk x̄0i x̄0j
∂x0i ∂x0j ∂x0i ∂x0j
j i  k j i  
  ∂ 2 xk 

= ψk x̄0i x̄0j = x̄0 k
ψk (tf )W x̄0 .
∂x0i ∂x0j
i j k k

Proposition 7.23. For ζ = ζ̂ , the following equality holds:


 
d 
ψk W = V ∗ Hxx V ,
k
(7.82)
dt
k

where H = ψf (t, x, u), Hxx = Hxx (t, x̂(t), ψ(t), û(t)).

Proof. According to Proposition 7.15, we have



Ẇ k = V ∗fkxx V + fkxr W r , k = 1, . . . , n. (7.83)
r

Using these equations together with the adjoint equation −ψ̇ = ψfx , we obtain
 
d   
ψk W k
= ψ̇k W k + ψk Ẇ k
dt
k k k  
  

= − ψfxk W +
k
ψk V fkxx V + fkxr W r


k k 
r 
= − ψfxk W k + V ∗ (ψk fkxx ) V + ψk fkxr W r
k k   k  r 
   

= − ψfxr W + V
r
ψk fkxx V + ψk fkxr W r

r k  r k

= − ψfxr W + V (ψfxx )V +
r
ψfxr W r = V ∗ Hxx V .
r r

Now we can prove the following assertion.

Proposition 7.24. The following formula holds:


 
∂   
ψ(tf ) V (tf ; t0 , x0 , θ )x̄0 x̄0 
∂x0 ζ̂
 tˆf
= (V (t)x̄0 )∗ Hxx (t, x̂(t), û(t), ψ(t))V (t)x̄0 dt. (7.84)
tˆ0
322 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Proof. Using Propositions 7.22 and 7.23 and the initial conditions W k (tˆ0 ) = 0 for k =
1. . . . , n, we get
 
∂   
ψ(tf ) V (tf ; t0 , x0 , θ )x̄0 x̄0 
∂x0 ζ̂
    ˆ
  tf
= x̄0∗
ψk (tˆf )W (tˆf ) x̄0 = x̄0
k ∗
ψk (t)W (t) x̄0 
k

k k tˆ0
   
tˆf d  tˆf
= x̄0∗ ψk W k x̄0 dt = x̄0∗ V ∗ Hxx V x̄0 dt
tˆ0 dt tˆ0
k
 tˆf
= (V x̄0 )∗ Hxx (V x̄0 ) dt.
tˆ0

In view of formulas (7.80) and (7.84), we obtain


Lx0 x0 x̄0 , x̄0 = x̄0∗ lx0 x0 (p̂)x̄0
+2x̄0∗ lx0 xf (p̂)V (tˆf )x̄0 + (V (tˆf )x̄0 )∗ lxf xf (p̂)V (tˆf )x̄0
 tˆf
+ (V (t)x̄0 )∗ Hxx (t, x̂(t), ψ(t), û(t))V (t)x̄0 dt. (7.85)
tˆ0

7.4.2 Derivative Lx0 tk


Differentiating (7.77) with respect to tk and using Propositions 7.13 and 7.16, we get
∂2
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ ))x̄0
∂x0 ∂tk

= lx (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))x̄0
∂tk 0
∂  
+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))V (tf ; t0 , x0 , θ )x̄0
∂tk
∂x(tf ; t0 , x0 , θ )
= x̄0∗ lx0 xf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))
∂t
  k

+ lx (t0 , x0 , tf , x(tf ; t0 , x0 , θ )) V (tf ; t0 , x0 , θ )x̄0
∂tk f
∂V (tf ; t0 , x0 , θ )
+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ )) x̄0
∂tk
= x̄0∗ lx0 xf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))y k (tf ; t0 , x0 , θ )
+ (V (tf ; t0 , x0 , θ )x̄0 )∗ lxf xf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))y k (tf ; t0 , x0 , θ )
+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))R k (tf ; t0 , x0 , θ ))x̄0 . (7.86)

Hence at the point ζ = ζ̂ , we have


 ∗
Lx0 tk x̄0 t¯k = x̄0∗ lx0 xf (p̂)y k (tˆf )t¯k + V (tˆf )x̄0 lxf xf (p̂)y k (tˆf )t¯k + ψ(tˆf )R k (tˆf )x̄0 t¯k . (7.87)
Let us transform the last term.
7.4. Quadratic Form for the Induced Optimization Problem 323

Proposition 7.25. The following formula holds:


 tˆf
ψ(tˆf )R k (tˆf )x̄0 t¯k = −[Hx ]k V (tˆk )x̄0 t¯k + Hxx y k t¯k , V x̄0 dt. (7.88)
tˆk

Proof. Using equation (7.44) and the adjoint equation −ψ̇ = ψfx , we get for t ∈ [tˆk , tˆf ],
d
(ψR k ) = ψ̇R k + ψ Ṙ k = −ψfx R k + ψ ((y k )∗ fxx )V + fx R k
dt  
= ψj (y k )∗ fj xx V = (y k )∗ ψj fj xx V = (y k )∗ Hxx V ,
j j

where Hxx is taken along the trajectory (t, x̂(t), ψ(t), û(t)). Consequently,
 tˆf
ˆ k ˆ ˆ k ˆ
ψ(tf )R (tf ) = ψ(tk )R (tk ) + (y k )∗ Hxx V dt.
tˆk

Using the initial condition (7.44) for Rk at tˆk , we get


 tˆf
ψ(tˆf )R k (tˆf ) = −ψ(tˆk )[fx ]k V (tˆk ) + (y k )∗ Hxx V dt.
tˆk
Hence,
 tˆf
ψ(tˆf )R k (tˆf )x̄0 t¯k = −[Hx ]k V (tˆk )x̄0 t¯k + Hxx y k t¯k , V x̄0 dt.
tˆk

Formulas (7.87) and (7.88) and the condition y k (t) = 0 for t < tˆk imply the equality
Lx0 tk x̄0 t¯k = x̄0∗ lx0 xf (p̂)y k (tˆf )t¯k + (V (tˆf )x̄0 )∗ lxf xf (p̂)y k (tˆf )t¯k

− [Hx ]k V (tˆk )x̄0 t¯k + tˆf Hxx y k t¯k , V x̄0 dt.
0

7.4.3 Derivative Ltk tk


Using the notation ∂x
∂tk = y k from Proposition 7.13, we get

l(t0 , x0 , tf , x(tf ; t0 , x0 , θ)) = lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))y k (tf ; t0 , x0 , θ )). (7.89)
∂tk
∂y k
Now, using the notation ∂tk = zkk as in Proposition 7.17, we obtain

∂2
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ ))
∂tk2
 

= lx (t0 , x0 , tf , x(tf ; t0 , x0 , θ )) y k (tf ; t0 , x0 , θ )
∂tk f

+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ )) y k (tf ; t0 , x0 , θ )
∂tk
= lxf xf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))y k (tf ; t0 , x0 , θ ), y k (tf ; t0 , x0 , θ )
+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))zkk (tf ; t0 , x0 , θ ), (7.90)
324 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

and thus, 
∂2 
Ltk tk = l(t , x , t , x(t ; t , x , θ )) 
∂tk2
0 0 f f 0 0 
ζ =ζ̂
= lxf xf (p̂)y k (tˆf ), y k (tˆf ) + lxf (p̂)zkk (tˆf ).
Let us rewrite the last term in this formula. The transversality condition lxf = ψ(tˆf ) implies
 tˆf d
lxf (p̂)zkk (tˆf ) = ψ(tˆf )zkk (tˆf ) = (ψzkk ) dt + ψ(tˆk )zkk (tˆk ). (7.91)
tˆk dt

By formula (7.48), we have

żkk = fx zkk + (y k )∗ fxx y k , t ≥ tˆk .

Using this equation together with the adjoint equation −ψ̇ = ψfx , we get
d 
(ψzkk ) = ψ̇zkk + ψ żkk = −ψfx zkk + ψfx zkk + ψj ((y k )∗ fj xx y k ) = (y k )∗ Hxx y k ,
dt
j
(7.92)
and thus  tˆf
lxf (p̂)zkk (tˆf ) = (y k )∗ Hxx y k dt + ψ(tˆk )zkk (tˆk ). (7.93)
tˆk

We shall transform the last term in (7.93) using the relations

(k H )(t) = H (t, x̂(t), ûk+ , ψ(t)) − H (t, x̂(t), ûk− , ψ(t)),
d 
D k (H ) = − (k H )t=t + = −[Ht ]k − [Hx ]k x̂( ˙ tˆk +) − ψ̇(tˆk +)[Hψ ]k (7.94)
dt k

(see Section 5.2.2).

Proposition 7.26. The following equality holds:

ψ(tˆk )zkk (tˆk ) = D k (H ) − [Hx ]k [y k ]k . (7.95)

Proof. Multiplying the initial condition (7.49) for zkk at the point t = tˆk by ψ(tˆk ), we get

˙ tˆk +) + y k (tˆk ) .
ψ(tˆk )zkk (tˆk ) + ψ(tˆk )ẏ k (tˆk +) = −ψ(tˆk )[ft ]k − ψ(tˆk )[fx ]k x̂( (7.96)

Here, we obviously have the relations ψ(tˆk )[ft ]k = [Ht ]k , ψ(tˆk )[fx ]k = [Hx ]k , and y k (tˆk ) =
[y k ]k . Moreover, equation (7.26) for y k together with the adjoint equation −ψ̇ = ψfx
implies that ψ ẏ k = ψfx y k = −ψ̇y k . Hence, in view of the initial condition (7.26) for y k ,
we find

ψ(tˆk )ẏ k (tˆk +) = −ψ̇(tˆk +)y k (tˆk ) = ψ̇(tˆk +)[f ]k = ψ̇(tˆk +)[Hψ ]k .

Thus, (7.96) and (7.94) imply (7.95).


7.4. Quadratic Form for the Induced Optimization Problem 325

From the relations (7.91), (7.93), and (7.95) and the equality y k (t) = 0 for t < tˆk , it
follows that
 tˆf
Ltk tk t¯k2 = lxf xf (p̂)y k (tˆf )t¯k , y k (tˆf )t¯k + (y k t¯k )∗ Hxx y k t¯k dt
tˆ0
+D k
(H )t¯k2 − [Hx ]k [y k ]k t¯k2 , k = 1, . . . , s. (7.97)

7.4.4 Derivative Ltk tj


Note that Ltk tj = Ltj tk for all k, j . Therefore,

s 
s 
Ltk tj t¯k t¯j = Ltk tk t¯k2 + 2 Ltk tj t¯k t¯j . (7.98)
k,j =1 k=1 k<j

Let us calculate Ltk tj for k < j . Differentiating (7.89) with respect to tj , we get

∂2
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ ))
∂tk ∂tj
 

= lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ )) y k (tf ; t0 , x0 , θ )
∂tj
∂ k
+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ )) y (tf ; t0 , x0 , θ )
∂tj
= lxf xf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))y k (tf ; t0 , x0 , θ ), y j (tf ; t0 , x0 , θ )
+ lxf (t0 , x0 , tf , x(tf ; t0 , x0 , θ ))zkj (tf ; t0 , x0 , θ ). (7.99)
Thus,
∂2
Ltk tj = l(t0 , x0 , tf , x(tf ; t0 , x0 , θ ))|ζ =ζ̂
∂tk ∂tj
= lxf xf (p̂)y k (tˆf ), y j (tˆf ) + lxf (p̂)zkj (tˆf ). (7.100)
We can rewrite the last term in this formula as
 tˆf d
lxf (p̂)zkj (tˆf ) = ψ(tˆf )zkj (tˆf ) = (ψzkj ) dt + ψ(tˆj )zkj (tˆj ).
tˆj dt

By formula (7.53), żkk = fx zkk + (y k )∗ fxx y j for t ≥ tˆj . Similarly to (7.92), we get
k ∗
dt (ψz ) = (y ) Hxx y and thus obtain
d kj j

 tˆf
lxf (p̂)z (tˆf ) =
kj
(y k )∗ Hxx y j dt + ψ(tˆj )zkj (tˆj ). (7.101)
tˆj

Since y j (t) = 0 for t < tˆj , we have


 tˆf  tˆf
k ∗
(y ) Hxx y dt =
j
(y k )∗ Hxx y j dt. (7.102)
tˆj tˆ0
326 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Using the initial condition (7.54) for zkj at the point θ̂ j , we get
ψ(tˆj )zkj (tˆj ) = −ψ(tˆj )[fx ]j y k (tˆj ) = −[Hx ]j y k (tˆj ). (7.103)
Formulas (7.100)–(7.103) imply the following representation for all k < j :
Ltk tj t¯k t¯j = lxf xf (p̂)y k (tˆf )t¯k , y j (tˆf )t¯j
 tˆf
+ (y k t¯k )∗ Hxx y j t¯j dt − [Hx ]j y k (tˆj )t¯k t¯j . (7.104)
tˆ0

7.4.5 Derivative Lx0 tf


Using Proposition 7.11, we get
∂2 ∂
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = {lx + lxf V }|t=tf
∂x0 ∂tf ∂tf 0
∂lxf
= (lx0 tf + lx0 xf ẋ + V + lxf V̇ )|t=tf
∂tf
= (lx0 tf + lx0 xf ẋ + (lxf tf + lxf xf ẋ)V + lxf fx V )|t=tf .

Again, we transform the last term in this formula at the point ζ = ζ̂ . Using the adjoint equa-
tion −ψ̇ = ψfx and the transversality condition ψ(tˆf ) = lxf , we get

lxf fx V |t=tˆf = ψfx V |t=tˆf = −ψ̇(tˆf )V (tˆf ).

Consequently,
˙ tˆf )t¯f , x̄0 + lx t V (tˆf )x̄0 t¯f
Lx0 tf x̄0 t¯f = lx0 tf x̄0 t¯f + lx0 xf x̂( f f
˙ (7.105)
+ lxf xf x̂(tˆf )t¯f , V (tˆf )x̄0 − ψ̇(tˆf )V (tˆf )x̄0 t¯f .

7.4.6 Derivative Ltk tf


Using the notation ∂x
∂tk = y k and Proposition 7.13, we get

∂2 ∂
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = {lx y k }|t=tf
∂tk ∂tf ∂tf f
= {(lxf xf ẋ + lxf tf )y k + lxf ẏ k }|t=tf
= {lxf xf ẋy k + lxf tf y k + lxf fx y k }|t=tf .

We evaluate the last term in this formula at the point ζ = ζ̂ using the adjoint equation
−ψ̇ = ψfx and the transversality condition ψ(tˆf ) = lxf :

lxf fx y k |t=tˆf = ψfx y k |t=tˆf = −ψ̇(tˆf )y k (tˆf ).

Therefore,
˙ tˆf )t¯f , y k (tˆf )t¯k + lx t y k (tˆf )t¯k t¯f − ψ̇(tˆf )y k (tˆf )t¯k t¯f .
Ltk tf t¯k t¯f = lxf xf x̂( (7.106)
f f
7.4. Quadratic Form for the Induced Optimization Problem 327

7.4.7 Derivative Ltf tf


We have
∂2 ∂
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = {lt + lxf ẋ}|t=tf
∂tf2 ∂tf f


= {(ltf tf + ltf xf ẋ) + (lxf tf + lxf xf ẋ)ẋ + lxf ẍ}  ,
t=tf

which gives
˙ tˆf ) + lx x x̂(
Ltf tf = ltf tf + 2ltf xf x̂( ˙ tˆf ), x̂(
˙ tˆf ) + ψ(tˆf )x̂(
¨ tˆf ). (7.107)
f f

Let us transform the last term. Equation (6.11) in the definition of M0 is equivalent to the
relation ψ x̂˙ + ψ0 = 0. Differentiating this equation with respect to t, we get

ψ̇ x̂˙ + ψ x̂¨ + ψ̇0 = 0. (7.108)

Hence, formula (7.107) implies the equality

Ltf tf t¯f2 = ˙ tˆf )t¯2 + lx x x̂(


ltf tf t¯f2 + 2ltf xf x̂( ˙ tˆf )t¯f , x̂(
˙ tˆf )t¯f − (ψ̇(tˆf )x̂(
˙ tˆf ) + ψ̇0 (tˆf ))t¯2 .
f f f f

7.4.8 Derivative Lx0 t0


In view of the relation ∂x
∂x0 = V , we obtain

∂2 ∂
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = {lx + lxf V }|t=tf
∂x0 ∂t0 ∂t0 0
   
∂x ∂x ∂V 
= lx0 t0 + lx0 xf + lxf t0 + lxf xf V + lxf .
∂t0 ∂t0 ∂t0 t=tf

Now, using the transversality condition lxf = ψ(tˆf ), formula (7.24), and the notation ∂V
∂t0 = S,
we get
˙ tˆ0 ) + lx t V (tˆf ) − x̂(
Lx0 t0 = lx0 t0 − lx0 xf V (tˆf )x̂( ˙ tˆ0 )∗ V (tˆf )∗ lx x V (tˆf ) + ψ(tˆf )S(tˆf ).
f 0 f f

The transformation of the last term in this formula proceeds as follows. Using the adjoint
equation for ψ and the system (7.69) for S, we obtain the equation
d 
(ψS) = ˙ tˆ0 )∗ V ∗
ψ̇S + ψ Ṡ = −ψfx S + ψfx S − x̂( ψi fixx V
dt
i
= ˙ tˆ0 )∗ V ∗ Hxx V ,
−x̂(

which yields
 tˆf
ψ(tˆf )S(tˆf ) = − ˙ tˆ0 )∗ V ∗ Hxx V dt + ψ(tˆ0 )S(tˆ0 ).
x̂( (7.109)
tˆ0
328 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Using now the initial condition (7.70) for S at the point t = tˆ0 and the equation V̇ = fx V ,
we get
(ψS)|tˆ0 = −(ψ V̇ )|tˆ0 = −(ψfx V )|tˆ0 = (ψ̇V )|tˆ0 = ψ̇(tˆ0 ), (7.110)

since V (tˆ0 ) = E. Formulas (7.109) and (7.110) then imply the equality

˙ tˆ0 ) + lx t V (tˆf ) − x̂(


Lx0 t0 = lx0 t0 − lx0 xf V (tˆf )x̂( ˙ tˆ0 )∗ V (tˆf )∗ lx x V (tˆf )
f 0 f f
 tˆf
− ˙ tˆ0 )∗ V ∗ Hxx V dt + ψ̇(tˆ0 ).
x̂( (7.111)
tˆ0

Therefore,
˙ tˆ0 )t¯0 , x̄0 + lx t V (tˆf )x̄0 t¯0
Lx0 t0 x̄0 t¯0 = lx0 t0 x̄0 t¯0 − lx0 xf V (tˆf )x̂( f 0
˙
−lxf xf V (tˆf )x̄0 , V (tˆf )x̂(tˆ0 )t¯0 + ψ̇(tˆ0 )x̄0 t¯0
 tˆf
− ˙ tˆ0 )t¯0 dt.
Hxx V x̄0 , V x̂( (7.112)
tˆ0

7.4.9 Derivative Ltk t0


∂yk
Using the notation ∂x
∂t0 = w, ∂x
∂tk = y k , and ∂t0 = r k , we obtain

∂2 ∂
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = {lx y k }|t=tf
∂tk ∂t0 ∂t0 f
 
k ∗ ∂x ∂y k 
= lxf t0 y + (y ) lxf xf
k
+ lxf
∂t0 ∂t0 t=tf
= {lxf t0 y k + (y k )∗ lxf xf w + lxf r k }|t=tf .

˙ tˆ0 ). Using this condition together


According to condition (7.24) we have w|t=tˆf = −V (tˆf )x̂(
with the transversality condition lxf = ψ(tˆf ), we find

˙ tˆ0 ) + ψ(tˆf )r k (tˆf ).


Ltk t0 = lxf t0 y k (tˆf ) − (y k (tˆf ))∗ lxf xf V (tˆf )x̂( (7.113)

Let us transform the last term in this formula. Using the adjoint equation for ψ and the
system (7.73) for r k , we get for t ≥ tˆk :

d 
(ψr k ) = ψ̇r k + ψ ṙ k = −ψfx r k + ψfx r k − (yi )∗ ˙ tˆ0 )
ψj fj xx V x̂(
dt
j
∗ ˙ tˆ0 ).
= −(yk ) Hxx V x̂(

It follows that
 tˆf
ψ(tˆf )r k (tˆf ) = − ˙ tˆ0 ) dt + ψ(tˆk )r k (tˆk ).
(yk )∗ Hxx V x̂( (7.114)
tˆk
7.4. Quadratic Form for the Induced Optimization Problem 329

The initial condition (7.74) for r k at the point tˆk then yields
˙ tˆ0 ) = [Hx ]k V (tˆk )x̂(
ψ(tˆk )r k (tˆk ) = ψ(tˆk )[fx ]k V (tˆk )x̂( ˙ tˆ0 ). (7.115)
Formulas (7.113)–(7.115) and the condition y k (t) = 0 for t < tˆk then imply the equality
˙ tˆ0 )
Ltk t0 = lxf t0 y k (tˆf ) − (y k (tˆf ))∗ lxf xf V (tˆf )x̂(
 tˆf
+ [Hx ]k V (tˆk )x̂( ˙ tˆ0 ) − (yk )∗ Hxx V x̂( ˙ tˆ0 ) dt. (7.116)
tˆ0

Hence,
˙ tˆ0 )t¯0
Ltk t0 t¯k t¯0 = lxf t0 y k (tˆf )t¯k t¯0 − (y k (tˆf )t¯k )∗ lxf xf V (tˆf )x̂(
 tˆf
+ [Hx ]k V (tˆk )x̂( ˙ tˆ0 )t¯0 t¯k − ˙ tˆ0 )t¯0 dt.
(y k t¯k )∗ Hxx V x̂( (7.117)
tˆ0

7.4.10 Derivative Ltf t0


We have
∂2 ∂
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = {lt + lxf ẋ}|t=tf
∂tf ∂t0 ∂t0 f
   
∂x ∂ ∂ 
= ltf t0 + ltf xf + lx ẋ + lxf ẋ  .
∂t0 ∂t0 f ∂t0 t=tf

Using the equalities


∂ ∂x ∂x
lx = lxf t0 + lxf xf , = −V ẋ(t0 ),
∂t0 f ∂t0 ∂t0
we get
∂2
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = ltf t0 − ltf xf V ẋ(t0 ) + lxf t0 ẋ(tf )
∂tf ∂t0

− (V (tf )ẋ(t0 ))∗ lxf xf ẋ(tf ) + lxf ẋ|t=tf . (7.118)
∂t0
Let us calculate the last term. Differentiating the equation
ẋ(t; t0 , x0 , θ ) = f (t, x(t; t0 , x0 , θ ), u(t; θ ))
with respect to t0 , we get
∂ ∂
ẋ = fx x = −fx V ẋ(t0 ).
∂t0 ∂t0
Consequently, at the point ζ = ζ̂ , we obtain
  
∂  ∂ 
˙ tˆ0 )}| ˆ = ψ̇(tˆf )V (tˆf )x̂(
˙ tˆ0 ).
lxf ẋ = ψ ẋ  = {−ψfx V x̂(
∂t0 t=tˆf ∂t0 t=tˆf
t=tf
330 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Using this equality in (7.118), we get at the point ζ = ζ̂ ,

˙ tˆ0 ) + lx t x̂(
Ltf t0 = ltf t0 − ltf xf V (tˆf )x̂( ˙ tˆf )
f 0

− lx x V (tˆf )x̂(˙ tˆ0 ), x̂(


˙ tˆf ) + ψ̇(tˆf )V (tˆf )x̂(
˙ tˆ0 ), (7.119)
f f

which yields

Ltf t0 t¯f t¯0 = ltf t0 t¯f t¯0 − ltf xf (V (tˆf )x̂(˙ tˆ0 )t¯0 )t¯f + lx t (x̂(
˙ tˆf )t¯f )t¯0
f 0

− lx x V (tˆf )x̂( ˙ tˆ0 )t¯0 , x̂(


˙ tˆf )t¯f + ψ̇(tˆf )(V (tˆf )x̂(
˙ tˆ0 )t¯0 )t¯f . (7.120)
f f

7.4.11 Derivative Lt0 t0


We have
 
∂2 ∂ ∂x 
l(t0 , x0 , tf , x(tf ; t0 , x0 , θ )) = lt + lxf
∂t02 ∂t0 0 ∂t0 t=tf
   
∂x ∂x ∂x ∂ 2 x 
= lt0 t0 + lt0 xf + lxf t0 + lxf xf + lxf 2 
∂t0 ∂t0 ∂t0 ∂t0 t=tf
 ' ( 
∂x ∂x ∂x ∂ x 
2
= lt0 t0 + 2lt0 xf + lxf xf , + lxf 2 
∂t0 ∂t0 ∂t0 ∂t0 t=tf

= lt0 t0 + 2lt0 xf w + lxf xf w, w + lxf q |t=tf , (7.121)

where
∂x ∂w ∂ 2 x
w= , q= = 2.
∂t0 ∂t0 ∂t0
The transversality condition lxf = ψ(tˆf ) yields

Lt0 t0 = lt0 t0 + 2lt0 xf w(tˆf ) + lxf xf w(tˆf ), w(tˆf ) + ψ(tˆf )q(tˆf ). (7.122)

Let us transform the last term using the adjoint equation for ψ and the system (7.58) for q:

d 
(ψq) = ψ̇q + ψ q̇ = −ψfx q + ψfx q + ψj (w ∗fj xx w) = w ∗ Hxx w.
dt
j

˙ tˆ0 ), we obtain
Also, using the equality w = −V x̂(
 tˆf
ψ(tˆf )q(tˆf ) = ψ(tˆ0 )q(tˆ0 ) + w∗ Hxx w dt
tˆ0
 tˆf
= ψ(tˆ0 )q(tˆ0 ) + ˙ tˆ0 ), V x̂(
Hxx V x̂( ˙ tˆ0 ) dt. (7.123)
tˆ0
7.4. Quadratic Form for the Induced Optimization Problem 331

The initial condition (7.59) for q then implies

¨ tˆ0 ) − 2ψ(tˆ0 )ẇ(tˆ0 ).


ψ(tˆ0 )q(tˆ0 ) = −ψ(tˆ0 )x̂( (7.124)

From the equation ẇ = fx w (see Proposition 7.12), the adjoint equation −ψ̇ = ψfx , and
˙ tˆ0 ), it follows that
the formula w = −V x̂(

˙ tˆ0 ).
−ψ ẇ = −ψfx w = ψ̇w = −ψ̇V x̂(

Since V (tˆ0 ) = E, we obtain


˙ tˆ0 ).
ψ(tˆ0 )ẇ(tˆ0 ) = ψ̇(tˆ0 )x̂( (7.125)

Moreover, by formula (7.108) we have

−ψ x̂¨ = ψ̇ x̂˙ + ψ̇0 . (7.126)

Formulas (7.124)–(7.126) imply

˙ tˆ0 ).
ψ(tˆ0 )q(tˆ0 ) = ψ̇0 (tˆ0 ) − ψ̇(tˆ0 )x̂( (7.127)

Combining formulas (7.122), (7.123), and (7.127), we obtain

˙ tˆ0 ) + lx x V (tˆf )x̂(


Lt0 t0 = lt0 t0 − 2lt0 xf V (tˆf )x̂( ˙ tˆ0 ), V (tˆf )x̂(
˙ tˆ0 )
f f
 tˆf
˙ tˆ0 ) +
+ ψ̇0 (tˆ0 ) − ψ̇(tˆ0 )x̂( ˙ tˆ0 ), V (tˆf )x̂(
Hxx V (tˆf )x̂( ˙ tˆ0 ) dt. (7.128)
tˆ0

Thus we have found the representation

˙ tˆ0 )t¯2 + lx x V (tˆf )x̂(


Lt0 t0 t¯02 = lt0 t0 t¯02 − 2lt0 xf V (tˆf )x̂( ˙ tˆ0 )t¯0 , V (tˆf )x̂(
˙ tˆ0 )t¯0
0 f f
 tˆf
+ ψ̇0 (tˆ0 )t¯02 − ψ̇(tˆ0 )x̂(˙ tˆ0 )t¯2 + ˙ tˆ0 )t¯0 , V (tˆf )x̂(
Hxx V (tˆf )x̂( ˙ tˆ0 )t¯0 dt. (7.129)
0
tˆ0

7.4.12 Representation of the Quadratic Form Lζ ζ ζ̄ , ζ̄


Combining all results and formulas in the preceding sections, we have proved the following
theorem.

Theorem 7.27. Let the Lagrange multipliers μ = (α0 , α, β) ∈ 0 and λ = (α0 , α, β, ψ, ψ0 ) ∈


 correspond to each other, i.e., let π0 λ = μ hold; see Proposition 7.4. Then, for any
ζ̄ = (t¯0 , t¯f , x̄0 , θ̄) ∈ R2+n+s , formulas (7.75), (7.85), (7.89), (7.97), (7.98), (7.104)–(7.106),
(7.109), (7.112), (7.117), (7.120), and (7.129) hold, where the matrix V (t) is the solution to
the IVP (7.18) and the function y k is the solution to the IVP (7.26) for each k = 1, . . . , s.
332 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Thus we have obtained the following explicit and massive representation of the
quadratic form in the IOP:

Lζ ζ ζ̄ , ζ̄ = Lζ ζ (μ, ζ̂ )ζ̄ , ζ̄ (7.130)



s 
s 
s
= Lx0 x0 x̄0 , x̄0 + 2 Lx0 tk x̄0 t¯k + Ltk tk t¯k2 + 2 Ltk tj t¯k t¯j
k=1 k=1 k<j
s
+ 2Lx0 tf x̄0 t¯f + 2 Ltk tf t¯k t¯f + Ltf tf t¯f2
k=1

s
+ 2Lx0 t0 x̄0 t¯0 + 2 Lt0 tk t¯0 t¯k + 2Lt0 tf t¯0 t¯f + Lt0 t0 t¯02
k=1

= x̄0∗ lx0 x0 x̄0 + 2x̄0∗ lx0 xf V (tˆf )x̄0 + (V (tˆf )x̄0 )∗ lxf xf V (tˆf )x̄0
 tˆf
+ (V x̄0 )∗ Hxx V x̄0 dt
tˆ0
 s s
+ 2x̄0∗ lx0 xf y k (tˆf )t¯k + 2(V (tˆf )x̄0 )∗ lxf xf y k (tˆf )t¯k
k=1 k=1
 s s  tˆf
− 2[Hx ]k V (tˆk )x̄0 t¯k + 2Hxx y k t¯k , V x̄0 dt
tˆ0
k=1 k=1


s s 
 tˆf
+ lxf xf y k (tˆf )t¯k , y k (tˆf )t¯k + (y k t¯k )∗ Hxx y k t¯k dt
ˆ
k=1 k=1 t0

s 
s 
+ D k (H )t¯k2 − [Hx ]k [y k ]k t¯k2 + 2lxf xf y k (tˆf )t¯k , y j (tˆf )t¯j
k=1 k=1 k<j
 tˆf 
+ 2(y k t¯k )∗ Hxx y j t¯j dt − 2[Hx ]j y k (tˆj )t¯k t¯j
ˆ
k<j t0 k<j

˙ tˆf )t¯f , x̄0 + 2lx t V (tˆf )x̄0 t¯f


+ 2lx0 tf x̄0 t¯f + 2lx0 xf x̂( f f

˙ tˆf )t¯f , V (tˆf )x̄0 − 2ψ̇(tˆf )V (tˆf )x̄0 t¯f


+ 2lxf xf x̂(

s 
s
+ ˙ tˆf )t¯f , y k (tˆf )t¯k +
2lxf xf x̂( 2lxf tf y k (tˆf )t¯k t¯f
k=1 k=1

s
− 2ψ̇(tˆf )y k (tˆf )t¯k t¯f
k=1
˙ tˆf )t¯2 + lx x x̂(
+ ltf tf t¯f2 + 2ltf xf x̂( ˙ tˆf )t¯f , x̂(
˙ tˆf )t¯f
f f f

˙ tˆf ) + ψ̇0 (tˆf ))t¯2


− (ψ̇(tˆf )x̂( f
˙ tˆ0 )t¯0 , x̄0 + 2lx t V (tˆf )x̄0 t¯0
+ 2lx0 t0 x̄0 t¯0 − 2lx0 xf V (tˆf )x̂( f 0
7.5. Equivalence of the Quadratic Forms 333

˙ tˆ0 )t¯0 + 2ψ̇(tˆ0 )x̄0 t¯0


− 2lxf xf V (tˆf )x̄0 , V (tˆf )x̂(
 tˆf
− 2Hxx V x̄0 , V x̂( ˙ tˆ0 )t¯0 dt
tˆ0


s 
s
+ 2lxf t0 y k (tˆf )t¯k t¯0 − ˙ tˆ0 )t¯0
2(y k (tˆf )t¯k )∗ lxf xf V (tˆf )x̂(
k=1 k=1


s s 
 tˆf
+ ˙ tˆ0 )t¯0 t¯k −
2[Hx ] V (tˆk )x̂(
k ˙ tˆ0 )t¯0 dt
2(y k t¯k )∗ Hxx V x̂(
ˆ
k=1 k=1 t0
˙ tˆ0 )t¯0 )t¯f + 2lx t (x̂(
+ 2ltf t0 t¯f t¯0 − 2ltf xf (V (tˆf )x̂( ˙ tˆf )t¯f )t¯0
f 0

˙ tˆ0 )t¯0 , x̂(


− 2lxf xf V (tˆf )x̂( ˙ tˆf )t¯f + 2ψ̇(tˆf )(V (tˆf )x̂(
˙ tˆ0 )t¯0 )t¯f

˙ tˆ0 )t¯2 + lx x V (tˆf )x̂(


+ lt0 t0 t¯02 − 2lt0 xf V (tˆf )x̂( ˙ tˆ0 )t¯0 , V (tˆf )x̂(
˙ tˆ0 )t¯0
0 f f
 tˆf
˙ tˆ0 )t¯2 +
+ ψ̇0 (tˆ0 )t¯02 − ψ̇(tˆ0 )x̂( ˙ tˆ0 )t¯0 , V x̂(
Hxx V x̂( ˙ tˆ0 )t¯0 dt.
0
tˆ0

Again, we wish to emphasize that this explicit representation involves only first-order vari-
ations y k and V of the trajectories x(t; t0 , x0 , θ ).

7.5 Equivalence of the Quadratic Forms in the Basic and


Induced Optimization Problem
In this section, we shall prove Theorem 7.8 which is the main result of this chapter. Let
the Lagrange multipliers μ = (α0 , α, β) ∈ 0 and λ = (α0 , α, β, ψ, ψ0 ) ∈  correspond to
each other, and take any ζ̄ = (t¯0 , t¯f , x̄0 , θ̄ ) ∈ R2+n+s . Consider the representation (7.130) of
the quadratic form Lζ ζ ζ̄ , ζ̄ , which is far from revealing the equivalence of the quadratic
forms for the basic control problem and the IOP. However, we show now that by a careful
regrouping of the terms in (7.130) we shall arrive at the desired equivalence. The quadratic
form (7.130) contains terms of the following types.
Type (a): Positive terms with coefficients D k (H ) multiplied by the variation of the switching
time t¯k squared,
s
a := D k (H )t¯k2 . (7.131)
k=1

Type (b): Mixed terms with [Hx ]k connected with the variation t¯k ,


s 
s
b := − 2[Hx ]k V (tˆk )x̄0 t¯k − [Hx ]k [y k ]k t¯k2
k=1 k=1
 
s
− 2[Hx ]j y k (tˆj )t¯k t¯j + ˙ tˆ0 )t¯0 t¯k . (7.132)
2[Hx ]k V (tˆk )x̂(
k<j k=1
334 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Since
  
s 
k−1
[Hx ]j y k (tˆj )t¯k t¯j = [Hx ]k y j (tˆk )t¯k t¯j = [Hx ]k y j (tˆk )t¯k t¯j ,
k<j j <k k=1 j =1

we get from (7.132),


s  
k−1 
1 ˙ tˆ0 )t¯0 t¯k .
b=− 2[Hx ]k V (tˆk )x̄0 + [y k ]k t¯k + y j (tˆk )t¯j − V (tˆk )x̂( (7.133)
2
k=1 j =1

According to (7.17) put


s
x̄(t) = V (t)x̄0 + ˙ tˆ0 )t¯0 .
y k (t)t¯k − V (t)x̂( (7.134)
k=1

Then we have

k−1
x̄(tˆk −) = V (tˆk )x̄0 + ˙ tˆ0 )t¯0 ,
y j (tˆk )t¯j − V (tˆk )x̂(
j =1

since y j (tˆk −) = y j (tˆk ) = 0 for j > k and y k (tˆk −) = 0. Moreover, the jump of x̄(t) at the
point tˆk is equal to the jump of y k (t)t¯k at the same point, i.e., [x̄]k = [y k ]k t¯k . Therefore,

1  k−1
V (tˆk )x̄0 + [y k ]k t¯k + ˙ tˆ0 )t¯0
y j (tˆk )t¯j − V (tˆk )x̂(
2
j =1
1 1
= x̄(tˆk −) + [x̄]k = (x̄(tˆk −) + x̄(tˆk +)) = x̄av
k
.
2 2

Thus, we get

s
k ¯
b=− 2[Hx ]k x̄av tk . (7.135)
k=1

Type (c): Integral terms


 tˆf s 
 tˆf

c := (V x̄0 ) Hxx V x̄0 dt + 2Hxx y k t¯k , V x̄0 dt
tˆ0 ˆ
k=1 t0
s 
 tˆf  tˆf
+ (y k t¯k )∗ Hxx y k t¯k dt + 2(y k t¯k )∗ Hxx y j t¯j dt
ˆ ˆ
k=1 t0 k<j t0
 tˆf   tˆf
s
− ˙ tˆ0 )t¯0 dt −
2Hxx V x̄0 , V x̂( ˙ tˆ0 )t¯0 dt
2(y k t¯k )∗ Hxx V x̂(
tˆ0 ˆ
k=1 t0
 tˆf
+ ˙ tˆ0 )t¯0 , V x̂(
Hxx V x̂( ˙ tˆ0 )t¯0 dt. (7.136)
tˆ0
7.5. Equivalence of the Quadratic Forms 335

Obviously, this sum can be transformed into a perfect square:


 tˆf ' 
s 
s (
c= Hxx (V x̄0 + ˙ tˆ0 )t¯0 ), V x̄0 +
y k t¯k − V x̂( ˙ tˆ0 )t¯0 dt
y k t¯k − V x̂(
tˆ0 k=1 k=1
 tˆf
= Hxx x̄, x̄ dt. (7.137)
tˆ0
Type (d): Endpoints terms. We shall divide them into several groups.
Group (d1): This group contains the terms with second-order derivatives of the endpoint
Lagrangian l with respect to t0 , x0 , and tf :
d1 := x̄0∗ lx0 x0 x̄0 + 2lx0 tf x̄0 t¯f + ltf tf t¯f2 + 2lx0 t0 x̄0 t¯0 + 2ltf t0 t¯f t¯0 + lt0 t0 t¯02 . (7.138)
Group (d2): We collect the terms with lt0 xf :

s
d2 := 2lxf t0 V (tˆf )x̄0 t¯0 + 2lxf t0 y k (tˆf )t¯k t¯0
k=1
˙ tˆf )t¯f t¯0 − 2lt x V (tˆf )x̂(
+ 2lxf t0 x̂( ˙ tˆ0 )t¯2
0 f 0
 
s
= 2lxf t0 V (tˆf )x̄0 + ˙ tˆf )t¯f − V (tˆf )x̂(
y (tˆf )t¯k + x̂(
k ˙ tˆ0 )t¯0 t¯0 ,
k=1
= 2lxf t0 x̄¯f t¯0 , (7.139)
where, in view of (7.34),

s
x̄¯f := V (tˆf )x̄0 + ˙ tˆf )t¯f − V (tˆf )x̂(
y k (tˆf )t¯k + x̂( ˙ tˆ0 )t¯0 = x̄(tˆf ) + x̂(
˙ tˆf )t¯f . (7.140)
k=1
Group (d3): Consider the terms with lx0 xf :

s
d3 := 2x̄0∗ lx0 xf V (tˆf )x̄0 + 2x̄0∗ lx0 xf y k (tˆf )t¯k
k=1
˙ tˆf )t¯f , x̄0 − 2lx x V (tˆf )x̂(
+ 2lx0 xf x̂( ˙ tˆ0 )t¯0 , x̄0
0 f
'   s  (
= 2 lx0 xf V (tˆf )x̄0 + ˙ tˆf )t¯f − V (tˆf )x̂(
y k (tˆf )t¯k + x̂( ˙ tˆ0 )t¯0 , x̄0
k=1
= 2lx0 xf x̄¯f , x̄0 . (7.141)
Group (d4): This group contains all terms with ltf xf :

s
d4 := 2lxf tf V (tˆf )x̄0 t¯f + 2lxf tf y k (tˆf )t¯k t¯f
k=1
˙ tˆf )t¯2 − 2lt x (V (tˆf )x̂(
+ 2ltf xf x̂( ˙ tˆ0 )t¯0 )t¯f
f f f
  s 
= 2lxf tf V (tˆf )x̄0 + ˙ ˙
y (tˆf )t¯k + x̂(tˆf )t¯f − V (tˆf )x̂(tˆ0 )t¯0 t¯f
k

k=1
= 2lxf tf x̄¯f t¯f . (7.142)
336 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Group (d5): We collect all terms containing lxf xf :


s
d5 := (V (tˆf )x̄0 )∗ lxf xf V (tˆf )x̄0 + 2(V (tˆf )x̄0 )∗ lxf xf y k (tˆf )t¯k
k=1

s 
+ lxf xf y k (tˆf )t¯k , y k (tˆf )t¯k + 2lxf xf y k (tˆf )t¯k , y j (tˆf )t¯j
k=1 k<j

s
˙ tˆf )t¯f , V (tˆf )x̄0 +
+ 2lxf xf x̂( ˙ tˆf )t¯f , y k (tˆf )t¯k
2lxf xf x̂(
k=1
+ lxf xf x̂( ˙ tˆf )t¯f − 2lx x V (tˆf )x̄0 , V (tˆf )x̂(
˙ tˆf )t¯f , x̂( ˙ tˆ0 )t¯0
f f


s
− ˙ tˆ0 )t¯0 − 2lx x V (tˆf )x̂(
2(y k (tˆf )t¯k )∗ lxf xf V (tˆf )x̂( ˙ tˆ0 )t¯0 , x̂(
˙ tˆf )t¯f
f f
k=1
˙ tˆ0 )t¯0 , V (tˆf )x̂(
+ lxf xf V (tˆf )x̂( ˙ tˆ0 )t¯0 . (7.143)

One can easily check that this sum can be transformed into the perfect square
'  
s 
d5 := lxf xf V (tˆf )x̄0 + ˙ tˆf )t¯f − V (tˆf )x̂(
y k (tˆf )t¯k + x̂( ˙ tˆ0 )t¯0 ,
k=1

s (
V (tˆf )x̄0 + ˙ tˆf )t¯f − V (tˆf )x̂(
y (tˆf )t¯k + x̂(
k ˙ tˆ0 )t¯0
k=1
= lxf xf x̄¯f , x̄¯f . (7.144)

Group (d6): Terms with ψ̇(tˆ0 ) and ψ̇0 (tˆ0 ):


˙ tˆ0 ))t¯2 .
d6 := 2ψ̇(tˆ0 )x̄0 t¯0 + (ψ̇0 (tˆ0 ) − ψ̇(tˆ0 )x̂( (7.145)
0

Group (d7): Terms with ψ̇(tˆf ) and ψ̇0 (tˆf ):


s
d7 := −2ψ̇(tˆf )V (tˆf )x̄0 t¯f − 2ψ̇(tˆf )y k (tˆf )t¯k t¯f
k=1
˙ tˆf ) + ψ̇0 (tˆf ))t¯2 + 2ψ̇(tˆf )(V (tˆf )x̂(
− (ψ̇(tˆf )x̂( ˙ tˆ0 )t¯0 )t¯f
f
 
 s
= −2ψ̇(tˆf ) V (tˆf )x̄0 + ˙ tˆ0 )t¯0 ) t¯f
y k (tˆf )t¯k − V (tˆf )x̂(
k=1
˙ tˆf ) + ψ̇0 (tˆf ))t¯2
− (ψ̇(tˆf )x̂( f
˙ tˆf ) + ψ̇0 (tˆf ))t¯2 .
= −2ψ̇(tˆf )x̄(tˆf )t¯f − (ψ̇(tˆf )x̂( (7.146)
f

˙ tˆf )t¯f in (7.146), we obtain


Using the equality x̄¯f = x̄(tˆf ) + x̂(

˙ tˆf ))t¯2 .
d7 = −2ψ̇(tˆf )x̄¯f t¯f − (ψ̇0 (tˆf ) − ψ̇(tˆf )x̂( (7.147)
f
7.5. Equivalence of the Quadratic Forms 337

This completes the whole list of all terms in the quadratic form associated with the
IOP. Hence, we have


7
Lζ ζ ζ̄ , ζ̄ = a + b + c + d, d= di .
i=1

We thus have found the following representation of this quadratic form; see formulas
(7.131) for a, (7.135) for b, and (7.137) for c:


s 
s  tˆf
Lζ ζ ζ̄ , ζ̄ = D (H )t¯k2 −
k k ¯
2[Hx ]k x̄av tk + Hxx x̄, x̄ + d, (7.148)
k=1 k=1 tˆ0

where according to formulas (7.138), (7.139), (7.141), (7.142), (7.144), (7.145), (7.147) for
d1 , . . . , d7 , respectively,

d = lx0 x0 x̄0 , x̄0 + 2lx0 tf x̄0 t¯f + ltf tf t¯f2 + 2lx0 t0 x̄0 t¯0 + 2ltf t0 t¯f t¯0 + lt0 t0 t¯02 + 2lxf t0 x̄¯f t¯0
˙ tˆ0 ))t¯2
+ 2lx0 xf x̄¯f , x̄0 + 2lxf tf x̄¯f t¯f + lxf xf x̄¯f , x̄¯f + 2ψ̇(tˆ0 )x̄0 t¯0 + (ψ̇0 (tˆ0 ) − ψ̇(tˆ0 )x̂( 0
˙ tˆf ))t¯2 .
− 2ψ̇(tˆf )x̄¯f t¯f − (ψ̇0 (tˆf ) − ψ̇(tˆf )x̂( (7.149)
f

In (7.148) and (7.149) the function x̄(t) and the vector x̄¯f are defined by (7.134) and (7.140),
respectively. Note that in (7.149),

lx0 x0 x̄0 , x̄0 + 2lx0 tf x̄0 t¯f + ltf tf t¯f2 + 2lx0 t0 x̄0 t¯0 + 2ltf t0 t¯f t¯0 + lt0 t0 t¯02 + 2lxf t0 x̄¯f t¯0
+ 2lx0 xf x̄¯f , x̄0 + 2lxf tf x̄¯f t¯f + lxf xf x̄¯f , x̄¯f = lpp p̄,
¯ p̄ ,
¯ (7.150)

where, by definition,
p̄¯ = (t¯0 , x̄0 , t¯f , x̄¯f ). (7.151)
Finally, we get

¯ p̄
d = lpp p̄, ¯ + 2ψ̇(tˆ0 )x̄0 t¯0 + (ψ̇0 (tˆ0 ) − ψ̇(tˆ0 )x̂(˙ tˆ0 ))t¯2
0
¯ ˙
− 2ψ̇(tˆf )x̄1 t¯f − (ψ̇0 (tˆf ) − ψ̇(tˆf )x̂(tˆf ))t¯ .
2
(7.152)
f

Thus, we have proved the following result.

Theorem 7.28. Let the Lagrange multipliers

μ = (α0 , α, β) ∈ 0 and λ = (α0 , α, β, ψ, ψ0 ) ∈ 

correspond to each other, i.e., let π0 λ = μ hold. Then for any ζ̄ = (t¯0 , t¯f , x̄0 , θ̄ ) ∈ R2+n+s the
quadratic form Lζ ζ ζ̄ , ζ̄ has the representation (7.148)–(7.152), where the vector function
x̄(t) and the vector x̄¯f are defined by (7.134) and (7.140). The matrix-valued function V (t)
is the solution to the IVP (7.18), and, for each k = 1, . . . , s, the vector function y k is the
solution to the IVP (7.26).
338 Chapter 7. Bang-Bang Control Problem and Its Induced Optimization Problem

Finally, we have arrived at the main result of this section.

Theorem 7.29. Let λ = (α0 , α, β, ψ, ψ0 ) ∈  and ζ̄ = (t¯0 , t¯f , x̄0 , θ̄ ) ∈ R2+n+s . Put μ =
(α0 , α, β), i.e., let π0 λ = μ ∈ 0 hold; see Proposition 7.4. Define the function x̄(t) by
formula (7.134). Put ξ̄ = −θ̄ and z̄ = (t¯0 , t¯f , ξ̄ , x̄), which means π1 z̄ = ζ̄ ; see Propositions 7.6
and 7.7. Then the following equality holds:

Lζ ζ (μ, ζ̂ )ζ̄ , ζ̄ = (λ, z̄), (7.153)

where (λ, z̄) is defined by formulas (6.34) and (6.35).

Proof. By Theorem 7.28, the equalities (7.148)–(7.152) hold. In view of equation (6.21), put

s
˙ tˆ0 ) = V (tˆ0 )x̄0 +
x̄¯0 = x̄(tˆ0 ) + t¯0 x̂( ˙ tˆ0 )t¯0 + t¯0 x̂(
y k (tˆ0 )t¯k − V (tˆ0 )x̂( ˙ tˆ0 ).
k=1

Since y k (tˆ0 ) = 0 for k = 1, . . . , s and V (tˆ0 ) = E, it follows that x̄¯0 = x̄0 . Consequently, the
¯ which was defined by (6.21) as (t¯0 , x̄¯0 , t¯f , x̄¯f ), coincides with the vector p̄,
vector p̄, ¯ defined
in this section by (7.151). Hence, the endpoint quadratic form d in (7.152) and the endpoint
quadratic form Ap̄, ¯ p̄
¯ in (6.35) take equal values, d = Ap̄, ¯ p̄ .
¯ Moreover, the integral
tˆf
terms tˆ Hxx x̄, x̄ dt in the representation (7.148) of the form Lζ ζ ζ̄ , ζ̄ and those in the
0
representation (6.34) of the form  coincide, and
s
 s 
s
D k (H )ξ̄k2 + 2[Hx ]k x̄av
k
ξ̄k = D k (H )t¯k2 − k ¯
2[Hx ]k x̄av tk ,
k=1 k=1 k=1

because ξ̄k = −t¯k , k = 1, . . . , s. Thus, the representation (7.148) of the form Lζ ζ ζ̄ , ζ̄
implies the equality (7.153) of both forms.

Theorem 7.8, which is the main result of this chapter, then follows from Theorem 7.29.

Remark. Theorems 7.8 and 7.29 pave the way for the sensitivity analysis of parametric
bang-bang control problems. Since the IOP is finite-dimensional, the SSC imply that we
may take advantage of the well-known sensitivity results by Fiacco [32] to obtain solution
differentiability for the IOP. The strict bang-bang property then ensures solution differ-
entiability for the parametric bang-bang control problems; cf. Kim and Maurer [48] and
Felgenhauer [31].
Chapter 8

Numerical Methods for Solving


the Induced Optimization
Problem and Applications

8.1 The Arc-Parametrization Method


8.1.1 Brief Survey on Numerical Methods for Optimal Control
Problems
It has become customary to divide numerical methods for solving optimal control prob-
lems into two main classes: indirect methods and direct methods. Indirect methods take
into account the full set of necessary optimality conditions of the Minimum Principle which
gives rise to a multipoint boundary value problem (MPBVP) for state and adjoint variables.
Shooting methods provide a powerful approach to solving an MPBVP; cf. articles from the
German School [18, 19, 61, 64, 65, 81, 101, 102, 106] and French School [9, 10, 60]. E.g.,
the code BNDSCO by Oberle and Grimm [82] represents a very efficient implementation of
shooting methods. Indirect methods produce highly accurate solutions but suffer from the
drawback that the control structure, i.e., the sequence and number of bang-bang and singular
arcs or constrained arcs, must be known a priori. Moreover, shooting methods need rather
accurate initial estimates of the adjoint variables which in general are tedious to obtain.
In direct methods, the optimal control problem is transcribed into a nonlinear pro-
gramming problem (NLP) by suitable discretization techniques; cf. Betts [5], Bock and
Plitt [7], and Bueskens [13, 14]. Direct methods dispense with adjoint variables which are
computed a posteriori from Lagrange multipliers of the NLP. These methods have shown to
be very robust and are mostly capable of determining the correct control structure without
assuming an a priori knowledge of the structure.
In all examples and applications in this chapter, optimal bang-bang and singular con-
trols were computed in two steps. First, direct methods are applied to determine the correct
optimal structure and obtain good estimates for the switching times. In a second step, the
Induced Optimization Problem (IOP) in (7.4) of Section 7.1.1 is solved by using the arc-
parametrization method developed in [44, 45, 66]. To prove optimality of the computed
extremal, we then verify that the second-order sufficient conditions (SSC) in Theorem 7.10
hold.
The next section presents the arc-parametrization method by which the arc-lengths
of bang-bang arcs are optimized instead of the switching times. Section 8.1.3 describes

339
340 Chapter 8. Numerical Methods for Solving the Induced Optimization Problem

an extension of the arc-parametrization method to control problems, where the control is


piecewise defined by feedback functions. Numerical examples for bang-bang controls are
presented in Sections 8.2–8.6, while Section 8.7 exhibits a bang-singular control for a van
der Pol oscillator with a regulator functional.

8.1.2 The Arc-Parametrization Method for Solving the Induced


Optimization Problem
This section is based on the article of Maurer et al. [66] but uses a slightly different termi-
nology. We consider the optimal control problem (6.1)–(6.3). To simplify the exposition
we assume that the initial time t0 is fixed and that the mixed boundary conditions are given
as equality constraints. Hence, we study the following optimal control problem:
Minimize J (x(t0 ), tf , x(tf )) (8.1)
subject to the constraints
ẋ(t) = f (t, x(t), u(t)), u(t) ∈ U , t0 ≤ t ≤ tf , (8.2)
K(x(t0 ), tf , x(tf )) = 0. (8.3)
The control variable appears linearly in the system dynamics,
f (t, x, u) = a(t, x) + B(t, x)u. (8.4)
The control set U ⊂ Rd(u) is a convex polyhedron with V = ex U denoting the set of vertices.
ˆ = [t0 , tˆf ]
Let us recall the IOP in (7.4). Assume that û(t) is a bang-bang control in 
with switching points tˆk ∈ (t0 , tˆf ) and values u in the set V = ex U ,
k

û(t) = uk ∈ V for t ∈ (tˆk−1 , tˆk ), k = 1, . . . , s + 1,

where tˆ0 = t0 , tˆs+1 = tˆf . Thus,  ˆ = {tˆ1 , . . . , tˆs } is the set of switching points of the control
û(·) with tˆk < tˆk+1 for k = 0, 1, . . . , s. Put

x̂(t0 ) = x̂0 ∈ Rn , θ̂ = (tˆ1 , . . . , tˆs ) ∈ Rs , ζ̂ = (x̂0 , θ̂ , tˆf ) ∈ Rn × Rs+1 . (8.5)

For convenience, the sequence of components of the vector ζ̂ in definition (7.1) has been
modified. Take a small neighborhood V of the point ζ̂ and let
ζ = (x0 , θ, ts+1 ) ∈ V, θ = (t1 , . . . , ts ), ts+1 = tf ,
where the switching times satisfy t0 < t1 < t2 < · · · < ts < ts+1 = tf . Define the function
u(t; θ) by the condition
u(t; θ) = uk for t ∈ (tk−1 , tk ), k = 1, . . . , s + 1. (8.6)
The values u(tk ; θ ), k = 1, . . . , s, may be chosen in U arbitrarily. For definiteness, define
them by the condition of continuity of the control from the left, u(tk ; θ ) = u(tk −; θ ) for
k = 1, . . . , s, and let u(0; θ) = u1 . Denote by x(t; x0 , θ ) the absolutely continuous solution
of the Initial Value Problem (IVP)
ẋ = f (t, x, u(t; θ )), t ∈ [t0 , tf ], x(t0 ) = x0 . (8.7)
8.1. The Arc-Parametrization Method 341

For each ζ ∈ V this solution exists in a sufficiently small neighborhood V of the point ζ̂ .
Obviously, we have
x(t; x̂0 , θ̂) = x̂(t), ˆ
t ∈ , u(t; θ̂ ) = û(t), ˆ \ .
t ∈ ˆ
Consider now the following Induced Optimization Problem (IOP) in the space Rn × Rs+1
of variables ζ = (x0 , θ, ts+1 ):
F0 (ζ ) := J (x0 , ts+1 , x(ts+1 ; x0 , θ )) → min,
(8.8)
G(ζ ) := K(x0 , ts+1 , x(ts+1 ; x0 , θ )) = 0.
To get uniqueness of the Lagrange multiplier for the equality constraint we assume that the
following regularity (normality) condition holds:
rank ( Gζ (ζ̂ ) ) = d(K). (8.9)
The Lagrange function (7.6) in normalized form is given by
L(β, ζ ) = F0 (ζ ) + β G(ζ ), β ∈ Rd(K) . (8.10)
The critical cone K0 in (7.13) then reduces to the subspace
K0 = { ζ̄ ∈ Rn × Rs+1 | Gζ (ζ̂ )ζ̄ = 0 }. (8.11)
Assuming regularity (8.9), the SSC for the IOP reduce to the condition that there exist
β̂ ∈ Rd(K) with
Lζ (β̂, ζ̂ ) = (F0 )ζ (ζ̂ ) + β̂ Gζ (ζ̂ ) = 0, (8.12)
Lζ ζ (β̂, ζ̂ )ζ̄ , ζ̄ > 0 ∀ ζ̄ ∈ K0 \ {0}. (8.13)
From a numerical point of view it is not convenient to optimize the switching times tk (k =
1, . . . , s) and terminal time ts+1 = tf directly. Instead, as suggested in [44, 45, 66] one
computes the arc durations or arc lengths
τk := tk − tk−1 , k = 1, . . . , s, s + 1, (8.14)
of the bang-bang arcs. Hence, the final time tf can be expressed by the arc lengths as

s+1
tf = t0 + τk . (8.15)
k=1
Next, we replace the optimization variable ζ = (x0 , t1 , . . . , ts , ts+1 ) by the optimization
variable
z := (x0 , τ1 , . . . , τs , τs+1 ) ∈ Rn × Rs+1 , τk := tk − tk−1 . (8.16)
The variables ζ and z are related by the following linear transformation involving the regular
(n + s + 1) × (n + s + 1) matrix R:
   
In 0 In 0
z = Rζ, R = , ζ = R −1 z, R −1 = ,
0 S 0 S −1
⎛ ⎞ ⎛ ⎞
1 0 ... 0 1 0 ... 0
⎜ .. . ⎟ ⎜ . . ⎟ (8.17)
⎜ −1 1 . .. ⎟ ⎜ 1 1 . . .. ⎟
S=⎜ ⎜ ⎟ −1
S =⎜ . ⎜ ⎟
.. .. ⎟, ⎟.
⎝ . . 0 ⎠ ⎝ .. . . . . . . 0 ⎠
0 −1 1 1 ... 1 1
342 Chapter 8. Numerical Methods for Solving the Induced Optimization Problem

Denoting the solution to the equations (8.7) by x(t; z), the IOP (8.8) obviously is equivalent
to the following IOP with tf defined by (8.15):
F0 (z) := J (x0 , tf , x(tf ; x0 , τ1 , . . . , τs )) → min,
(8.18)
G(z) := K(x0 , tf , x(tf ; x0 , τ1 , . . . , τs )) = 0.
This approach is called the arc-parametrization method. The Lagrangian for this problem
is given in normal form by
L(ρ, z) = F0 (z) + ρ G(z). (8.19)
It is easy to see that the Lagrange multiplier ρ agrees with the multiplier β in the Lagrangian
(8.10). Furthermore, the SSC for the optimization problems (8.8), respectively, (8.18), are
equivalent. This immediately follows from the fact that the Jacobian and the Hessian for
both optimization problems are related through
Kζ = Kz R, Lζ = Lz R, Lζ ζ = R ∗ Lzz R.
Thus we can express the positive definiteness condition (8.13) evaluated for the variable z as
Lzz (β̂, ẑ)z̄, z̄ > 0 ∀ z̄ ∈ (Rn × Rs+1 ) \ {0}, Gẑ (ẑ)z̄ = 0. (8.20)
This condition is equivalent to the property that the so-called reduced Hessian is positive
definite. Let N be the (nz × (nz − d(K))) matrix, nz = n + s + 1, with full column rank nz −
d(K), whose columns span the kernel of Gz (ẑ). Then condition (8.20) is reformulated as
N ∗ Lzz (β̂, ẑ) N > 0 (positive definite). (8.21)
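Condition (8.21) is straightforward to test numerically. The following Python sketch (our illustration; the function name and tolerance are ours) computes an orthonormal basis N of the kernel of Gz(ẑ) from a singular value decomposition and checks the eigenvalues of the reduced Hessian.

import numpy as np

def reduced_hessian_positive(L_zz, G_z, tol=1e-10):
    """Check condition (8.21): N* L_zz N > 0 on the kernel of G_z.

    N is an orthonormal basis of ker(G_z), obtained from the SVD.
    """
    # rows of Vt beyond rank(G_z) span the kernel of G_z
    _, sv, Vt = np.linalg.svd(G_z)
    rank = np.sum(sv > tol * sv[0])
    N = Vt[rank:].T                    # n_z x (n_z - d(K)) basis matrix
    H_red = N.T @ L_zz @ N             # reduced Hessian
    return np.all(np.linalg.eigvalsh(H_red) > 0), H_red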
The computational method for determining the optimal vector ẑ ∈ R^{n+s+1} is based on a
multiprocess approach proposed in [44, 45, 66]. The time interval [t_{k−1}, t_k] is mapped to
the fixed interval I_k = [(k−1)/(s+1), k/(s+1)] by the linear transformation

t = a_k + b_k r,   a_k = t_k − kτ_k,   b_k = (s+1)τ_k,   r ∈ I_k = [ (k−1)/(s+1), k/(s+1) ],  (8.22)
where r denotes the running time. Identifying x(r) ≅ x(a_k + b_k r) = x(t) in the relevant
intervals, we obtain the transformed dynamic system

dx/dr = (dx/dt)·(dt/dr) = (s + 1) τ_k f(a_k + b_k r, x(r), u_k) for r ∈ I_k. (8.23)
By way of concatenation of the solutions on the intervals Ik , we obtain an absolutely
continuous solution x(r) = x(r; τ1 , . . . , τs ) for r ∈ [0, 1]. Thus we are confronted with the
task of solving the IOP

F0(z) := J(x0, tf, x(1; x0, τ1, . . . , τs)) → min,   tf = t0 + ∑_{k=1}^{s+1} τk,
G(z) := K(x0, tf, x(1; x0, τ1, . . . , τs)) = 0.  (8.24)
This approach can be conveniently implemented using the routine NUDOCCCS
developed by Büskens [13]. In this way, we can also take advantage of the fact that
NUDOCCCS provides the Jacobian of the equality constraints and the Hessian of the
Lagrangian, which are needed in the check of the second-order condition (8.13), respec-
tively, the positive definiteness of the reduced Hessian (8.21). Moreover, this code allows
for the computation of parametric sensitivity derivatives; cf. [48, 74, 23].
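As an illustration of the time transformation (8.22)–(8.23), the following Python sketch integrates the transformed system on the fixed interval [0, 1] for given arc lengths. It is only a minimal stand-in for what a code such as NUDOCCCS does internally, assuming a user-supplied right-hand side f and constant bang values u1, . . . , us+1; all names are ours.

import numpy as np
from scipy.integrate import solve_ivp

def shoot(z, f, u_values, t0=0.0):
    """Integrate the transformed system (8.23) on the fixed interval [0, 1].

    z = (x0, tau_1, ..., tau_{s+1}); u_values = (u_1, ..., u_{s+1}) are the
    constant bang values on the s + 1 arcs; f = f(t, x, u) is the dynamics.
    """
    m = len(u_values)                                        # m = s + 1 arcs
    x0, tau = np.asarray(z[:-m]), np.asarray(z[-m:])
    t_knots = t0 + np.concatenate(([0.0], np.cumsum(tau)))   # t_0, ..., t_{s+1}

    def rhs(r, x):
        k = min(int(r * m), m - 1)                 # arc index containing r
        a_k = t_knots[k + 1] - (k + 1) * tau[k]    # a_k = t_k - k*tau_k, cf. (8.22)
        b_k = m * tau[k]                           # b_k = (s + 1)*tau_k
        return b_k * f(a_k + b_k * r, x, u_values[k])

    sol = solve_ivp(rhs, (0.0, 1.0), x0, rtol=1e-10, atol=1e-12)
    return sol.y[:, -1], t_knots[-1]               # x(t_f) and t_f

The terminal state x(tf) and the final time tf returned by this routine are exactly the quantities needed to evaluate F0(z) and G(z) in (8.24); the resulting finite-dimensional problem can then be handed to any NLP solver.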
8.1.3 Extension of the Arc-Parametrization Method to Piecewise Feedback Control
The purpose of this section is to extend the IOPs (8.18) and (8.24) to the situation in which
the control is piecewise defined by feedback functions and not only by a constant vector
uk ∈ V = ex U . We consider the control problem (8.1)–(8.3) with an arbitrary dynamical
system
ẋ(t) = f (t, x(t), u(t)), t0 ≤ t ≤ tf , (8.25)
and fixed initial time t0. Instead of considering a bang-bang control u(t) with s switching
times t0 < t1 < · · · < ts < ts+1 = tf and constant values

u(t) = u_k for t ∈ (t_{k−1}, t_k),

we assume that there exist continuous functions u_k : D → R^{d(u)}, where D ⊂ R × R^n is
open, with

u(t) = u_k(t, x(t)) for t ∈ (t_{k−1}, t_k). (8.26)
Such functions are provided, e.g., by singular controls in feedback form or boundary con-
trols in the presence of state constraints, or may be simply viewed as suitable feedback
approximations of an optimal control.
The vector of switching times is denoted by θ = (t1, . . . , ts). Let x(t; x0, θ) be the
absolutely continuous solution of the piecewise defined equations

ẋ(t) = f(t, x(t), u_k(t, x(t))) for t ∈ (t_{k−1}, t_k)  (k = 1, . . . , s + 1) (8.27)

with given initial value x(t0; x0, θ) = x0. The IOP with the optimization variable

ζ = (x0, θ, tf) = (x0, t1, . . . , ts, ts+1) ∈ R^n × R^{s+1}

agrees with (8.8):

F0(ζ) := J(x0, ts+1, x(ts+1; x0, θ)) → min,
G(ζ) := K(x0, ts+1, x(ts+1; x0, θ)) = 0.  (8.28)
The arc-parametrization method consists of optimizing x0 and the arc lengths

τ_k := t_k − t_{k−1},  k = 1, . . . , s + 1.

Invoking the linear time transformation (8.22) for mapping the time interval [t_{k−1}, t_k] to the
fixed interval I_k = [(k−1)/(s+1), k/(s+1)],

t = a_k + b_k r,   a_k = t_k − kτ_k,   b_k = (s+1)τ_k,   r ∈ I_k,

the dynamic system is piecewise defined by

dx/dr = (s + 1)τ_k f(a_k + b_k r, x(r), u_k(a_k + b_k r, x(r))) for r ∈ I_k. (8.29)
Therefore, the implementation of the arc-parametrization method using the routine
NUDOCCCS [13] requires only a minor modification of the dynamic system. Applications
of this method to bang-singular controls may be found in [50, 51] and to state constrained
problems in [74]. An example for a bang-singular control will be given in Section 8.7.
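Schematically, only the right-hand side of the earlier integration sketch changes: the constant bang value on arc k is replaced by the feedback law u_k(t, x), as in the following fragment (a sketch under the same assumptions as before; f, tau, and t_knots are as in that sketch).

def rhs_feedback(r, x, f, tau, t_knots, u_funcs):
    """Right-hand side (8.29): on arc k the control is the feedback law
    u_k(t, x) instead of a constant bang value."""
    m = len(u_funcs)
    k = min(int(r * m), m - 1)
    t = (t_knots[k + 1] - (k + 1) * tau[k]) + m * tau[k] * r   # t = a_k + b_k r
    return m * tau[k] * f(t, x, u_funcs[k](t, x))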
8.2 Time-Optimal Control of the Rayleigh Equation Revisited
We revisit the problem of time-optimal control of the Rayleigh equation from Section 6.4
and show that the sufficient conditions in Theorem 7.10 can be easily verified numerically
on the basis of the IOP (8.8) or (8.18). The control problem is to minimize the final time tf
subject to
ẋ1 (t) = x2 (t), ẋ2 (t) = −x1 (t) + x2 (t)(1.4 − 0.14x2 (t)2 ) + u(t), (8.30)
x1 (0) = x2 (0) = −5, x1 (tf ) = x2 (tf ) = 0, (8.31)
| u(t) | ≤ 4 for t ∈ [0, tf ]. (8.32)
The Pontryagin function (Hamiltonian)
H (x, ψ, u) = ψ1 x2 + ψ2 (−x1 + x2 (1.4 − 0.14x22 ) + u) (8.33)
yields the adjoint equations
ψ̇1 = ψ2 , ψ̇2 = ψ1 + ψ2 (1.4 − 0.42x22 ). (8.34)
The transversality condition (6.9) gives the relation
ψ2 (tf ) u(tf ) + 1 = 0. (8.35)
The switching function
φ(t) = Hu (t) = ψ2 (t) (8.36)
determines the optimal control via the minimum condition as

u(t) = \begin{cases} 4 & \text{if } ψ2(t) < 0, \\ −4 & \text{if } ψ2(t) > 0. \end{cases}  (8.37)
In Section 6.4 it was found that the optimal control is composed of three bang-bang arcs,

u(t) = \begin{cases} 4 & \text{for } 0 ≤ t < t_1, \\ −4 & \text{for } t_1 ≤ t < t_2, \\ 4 & \text{for } t_2 ≤ t ≤ t_3 = t_f. \end{cases}  (8.38)
This implies the two switching conditions ψ2 (t1 ) = 0 and ψ2 (t2 ) = 0. Hence, the optimization
vector for the IOP (8.18) is given by
z = (τ1 , τ2 , τ3 ), τ1 = t1 , τ2 = t2 − t1 , τ3 = tf − t2 .
The code NUDOCCCS gives the following numerical results for the arc lengths, switching
times, and adjoint variables:
τ1 = t1 = 1.12051, τ2 = 2.18954, t2 = 3.31005,
τ3 = 0.35813, tf = 3.66817,
ψ1 (0) = −0.122341, ψ2 (0) = −0.082652, (8.39)
ψ1 (t1 ) = −0.215212, ψ1 (t2 ) = 0.891992,
ψ1 (tf ) = 0.842762, ψ2 (tf ) = −0.25, β = ψ(tf ).
Figure 8.1. Time-optimal control of the Rayleigh equation with boundary condi-
tions (8.31). (a) Bang-bang control and scaled switching function (×4). (b) State variables
x1 and x2.
The corresponding time-optimal bang-bang control with two switches and the state variables
are shown in Figure 8.1.
We have already shown in Section 6.4 that the control u in (8.38) enjoys the strict
bang-bang property and that the estimates Dᵏ(H) > 0, k = 1, 2, in (6.114) are satisfied. For
the terminal conditions (8.31), the Jacobian is the (2 × 3) matrix

Gz(ẑ) = \begin{pmatrix} −4.53176 & −3.44715 & 0.0 \\ −11.2768 & −7.62049 & 4.0 \end{pmatrix},

which is of rank 2. The Hessian of the Lagrangian is the (3 × 3) matrix

Lzz(β̂, ẑ) = \begin{pmatrix} −10.3713 & −8.35359 & −6.68969 \\ −8.35359 & −5.75137 & −4.61687 \\ −6.68969 & −4.61687 & 1.97104 \end{pmatrix}.
Note that this Hessian is not positive definite. However, the projected Hessian (8.21) is the
positive number

N∗Lzz(β̂, ẑ)N = 0.515518,
which shows that the second-order test (8.20) holds. Hence, the extremal characterized by
the data (8.39) provides a strict strong minimum.
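These numbers can be reproduced with a few lines of Python (our check, using the matrices printed above; with an orthonormal kernel basis N the projected Hessian indeed evaluates to approximately 0.5155).

import numpy as np

G_z = np.array([[ -4.53176, -3.44715, 0.0],
                [-11.2768,  -7.62049, 4.0]])
L_zz = np.array([[-10.3713,  -8.35359, -6.68969],
                 [ -8.35359, -5.75137, -4.61687],
                 [ -6.68969, -4.61687,  1.97104]])

# orthonormal basis of ker(G_z): rows of Vt beyond rank(G_z) = 2
_, _, Vt = np.linalg.svd(G_z)
N = Vt[2:].T                    # 3 x 1 basis matrix
print(float(N.T @ L_zz @ N))    # approx 0.5155 > 0, matching the text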
Now we consider a modified control problem, where the two terminal conditions
x1(tf) = x2(tf) = 0 are substituted by the scalar terminal condition

x1(tf)² + x2(tf)² = 0.25. (8.40)
The Hamiltonian (8.33) and the adjoint equations (8.34) remain the same. The transversality
condition (6.9) yields

ψi(tf) = 2β xi(tf)  (i = 1, 2),  β ∈ R. (8.41)
The transversality condition for the free final time is H(t) + 1 ≡ 0. It turns out that the
control is bang-bang with only one switching point t1, in contrast to the control structure
(8.38):

u(t) = \begin{cases} 4 & \text{for } 0 ≤ t < t_1, \\ −4 & \text{for } t_1 ≤ t ≤ t_f. \end{cases}  (8.42)
Figure 8.2. Time-optimal control of the Rayleigh equation with boundary condition
(8.40). (a) Bang-bang control u and scaled switching function φ (dashed line). (b) State
variables x1, x2.

Hence, the optimization vector is

z = (τ1, τ2),  τ1 = t1,  τ2 = tf − t1.
Using the code NUDOCCCS, we obtain the following numerical results for the arc lengths
and adjoint variables:
t1 = 1.27149, τ2 = 1.69227, tf = 2.96377,
ψ1 (0) = −0.117316, ψ2 (0) = −0.0813638,
ψ1 (t1 ) = −0.213831, ψ2 (t1 ) = 0.0, (8.43)
x1 (tf ) = 0.426176, x2 (tf ) = 0.261484,
ψ1 (tf ) = 0.448201, ψ2 (tf ) = 0.274997, β = 0.525839.
Figure 8.2 displays the time-optimal bang-bang control with only one switch and the two
state variables.
The relations φ̇(t1) = ψ̇2(t1) = ψ1(t1) and [u]¹ = 8 yield

D¹(H) = −8 · φ̇(t1) = 8 · 0.213831 > 0.
For the scalar terminal condition (8.40), the Jacobian is the nonzero row vector
Gz (ẑ) = (−1.90175, −1.90173),
while the Hessian of the Lagrangian is the positive definite (2 × 2) matrix

Lzz(β̂, ẑ) = \begin{pmatrix} 28.1299 & 19.0384 \\ 19.0384 & 14.0048 \end{pmatrix}.
Hence, the second-order conditions (8.20), respectively, the second-order conditions in
Theorem 7.10, hold, which shows that the extremal characterized by the data (8.43) furnishes
a strict strong minimum.

8.3 Time-Optimal Control of a Two-Link Robot
The control of two-link robots has been the subject of various articles; cf., e.g., [21, 35, 37,
81]. In these papers, optimal control policies are determined solely on the basis of first-order
Figure 8.3. Two-link robot [67]: upper arm OQ, lower arm QP, and angles q1 and q2.

necessary conditions, since sufficient conditions were not available. In this section, we show
that SSC hold for both types of robots considered in [21, 37, 81].
First, we study the robot model considered in Chernousko et al. [21]. Göllmann [37]
has shown that the optimal control candidate presented in [21] is not optimal, since the sign
conditions of the switching functions do not comply with the Minimum Principle. Figure
8.3 represents the two-link robot schematically. The state variables are the angles q1 and q2 .
The parameters I1 and I2 are the moments of inertia of the upper arm OQ and the lower
arm QP with respect to the points O and Q. Further, let m2 be the mass of the lower arm,
L1 = |OQ| the length of the upper arm, and L = |QC| the distance between the second
link Q and the center of gravity C of the lower arm. With the abbreviations
A = I1 + m2L1² + I2 + 2m2L1L cos q2,   B = I2 + m2L1L cos q2,
R1 = u1 + m2L1L(2q̇1 + q̇2)q̇2 sin q2,   R2 = u2 − m2L1L q̇1² sin q2,  (8.44)
D = I2,   Δ = AD − B²,
the dynamics of the two-link robot can be described by the ODE system

q̇1 = ω1,   ω̇1 = (1/Δ)(DR1 − BR2),
q̇2 = ω2,   ω̇2 = (1/Δ)(AR2 − BR1),  (8.45)
where ω1 and ω2 are the angular velocities. The torques u1 and u2 in the two links represent
the two control variables. The control problem consists of steering the robot from a given
initial position to a terminal position in minimal final time tf ,
q1 (0) = 0, q2 (0) = 0, ω1 (0) = 0, ω2 (0) = 0,
(8.46)
q1 (tf ) = −0.44, q2 (tf ) = 1.83, ω1 (tf ) = 0, ω2 (tf ) = 0.
Both control components are bounded by
|u1 (t)| ≤ 2, |u2 (t)| ≤ 1, t ∈ [0, tf ]. (8.47)
The Pontryagin function (Hamiltonian) is

H = ψ1ω1 + ψ2ω2 + (ψ3/Δ)(DR1(u1) − BR2(u2)) + (ψ4/Δ)(AR2(u2) − BR1(u1)). (8.48)
The adjoint equations are rather complicated and are not given here explicitly. The switching
functions are

φ1(t) = Hu1(t) = (ψ3/Δ)D − (ψ4/Δ)B,   φ2(t) = Hu2(t) = (ψ4/Δ)A − (ψ3/Δ)B. (8.49)
For the parameter values

L1 = 1,  L = 0.5,  m2 = 10,  I1 = I2 = 10/3,
Göllmann [37] has found the following control structure with four bang-bang arcs:

u(t) = (u1(t), u2(t)) = \begin{cases} (−2, 1), & 0 ≤ t < t_1, \\ (2, 1), & t_1 ≤ t < t_2, \\ (2, −1), & t_2 ≤ t < t_3, \\ (−2, −1), & t_3 ≤ t ≤ t_f, \end{cases}   0 < t_1 < t_2 < t_3 < t_f. (8.50)
This control structure differs substantially from that in Chernousko et al. [21] which vi-
olates the switching conditions. Obviously, the bang-bang control (8.50) satisfies the
assumption that only one control component switches at a time. Since the initial point
(q1 (0), q2 (0), ω1 (0), ω2 (0)) is specified, the optimization variable in the IOP (8.18) is
z = (τ1 , τ2 , τ3 , τ4 ), τ1 = t1 , τ2 = t2 − t1 , τ3 = t3 − t2 , τ4 = tf − t3 .
Using the code NUDOCCCS, we compute the following arc durations and switching times:
t1 = 0.7677893, τ2 = 0.3358820, t2 = 1.1036713,
τ3 = 1.2626739, t3 = 2.3663452, τ4 = 0.8307667, (8.51)
tf = 3.1971119.
Numerical values for the adjoint functions are also provided by the code NUDOCCCS,
e.g., the initial values
ψ1 (0) = −1.56972, ψ2 (0) = −0.917955,
(8.52)
ψ3 (0) = −2.90537, ψ4 (0) = −1.45440.
Figure 8.4 shows that the switching functions φ1 and φ2 comply with the minimum condi-
tion and that the strict bang-bang property (6.19) and the inequalities (6.14) are satisfied:

φ1(t) ≠ 0 for t ≠ t1, t3,   φ2(t) ≠ 0 for t ≠ t2,
φ̇1(t1) < 0,   φ̇1(t3) > 0,   φ̇2(t2) > 0.
For the terminal conditions (8.46) we compute the Jacobian

Gz(ẑ) = \begin{pmatrix} −0.751043 & 0.0351060 & 0.258904 & 0 \\ 3.76119 & 1.84929 & −0.204170 & 0 \\ −0.326347 & 0.0770047 & 0.212722 & −0.107819 \\ 1.26849 & 0.445447 & −0.487447 & −0.233634 \end{pmatrix}.
Figure 8.4. Control of the two-link robot (8.44)–(8.47). (a) Control u1 and scaled
switching function φ1 (dashed line). (b) Control u2 and scaled switching function φ2
(dashed line). (c) Angle q1 and velocity ω1 . (d) Angle q2 and velocity ω2 .

This square matrix has full-rank in view of


det Gz (ẑ) = 0.0766524  = 0,
which means that the positive definiteness condition (8.13) trivially holds. We have thus
verified first-order sufficient conditions showing that the extremal solution given by (8.50)–
(8.52) provides a strict strong minimum.
In the model treated above, some parameters such as the mass of the upper arm and
the mass of a load at the end of the lower arm appear implicitly in the system equations.
The mass m1 of the upper arm is included in the moment of inertia I1, and the mass M of a
load in the point P can be added to the mass m2 , where the point C and therefore the length
L have to be adjusted. The length L2 of the lower arm is incorporated in the parameter L.
The second robot model that we are going to discuss is taken from Geering et al. [35]
and Oberle [81]. Here, every physical parameter enters the system equation explicitly. The
dynamic system is as follows:

q̇1 = ω1,   ω̇1 = (1/Δ)(A I22 − B I12 cos q2),
q̇2 = ω2 − ω1,   ω̇2 = (1/Δ)(B I11 − A I12 cos q2),  (8.53)
where we have used the abbreviations

A = I12 ω2² sin q2 + u1 − u2,   B = −I12 ω1² sin q2 + u2,
Δ = I11 I22 − I12² cos² q2,   I11 = I1 + (m2 + M)L1²,  (8.54)
I12 = m2 L L1 + M L1 L2,   I22 = I2 + I3 + M L2².
I3 denotes the moment of inertia of the load with respect to the point P , and ω2 is now the
angular velocity of the angle q1 + q2 . For simplicity, we set I3 = 0. Again, the torques u1
and u2 in the two links are used as control variables by which the robot is steered from a
given initial position to a nonfixed end position in minimal final time tf,

q1(0) = 0,   √((x1(tf) − x1(0))² + (x2(tf) − x2(0))²) = r,
q2(0) = 0,   q2(tf) = 0,  (8.55)
ω1(0) = 0,   ω1(tf) = 0,
ω2(0) = 0,   ω2(tf) = 0,
where (x1 (t), x2 (t)) are the Cartesian coordinates of the point P ,
x1 (t) = L1 cos q1 (t) + L2 cos(q1 (t) + q2 (t)),
(8.56)
x2 (t) = L1 sin q1 (t) + L2 sin(q1 (t) + q2 (t)).
The initial point (x1 (0), x2 (0)) = (2, 0) is fixed. Both control components are bounded,
|u1 (t)| ≤ 1, |u2 (t)| ≤ 1, t ∈ [0, tf ]. (8.57)
The Hamilton–Pontryagin function is given by

H = ψ1ω1 + ψ2(ω2 − ω1) + (ψ3/Δ)(A(u1, u2)I22 − B(u2)I12 cos q2)
  + (ψ4/Δ)(B(u2)I11 − A(u1, u2)I12 cos q2). (8.58)
The switching functions are computed as

φ1 = Hu1 = (1/Δ)(ψ3 I22 − ψ4 I12 cos q2),
φ2 = Hu2 = (1/Δ)(ψ3(−I22 − I12 cos q2) + ψ4(I11 + I12 cos q2)). (8.59)
For the parameter values

L1 = L2 = 1,  L = 0.5,  I1 = I2 = 1/3,  m1 = m2 = M = 1,  r = 3,
we will show that the optimal control has the following structure with five bang-bang arcs:

u(t) = (u1(t), u2(t)) = \begin{cases} (−1, 1) & \text{for } 0 ≤ t < t_1, \\ (−1, −1) & \text{for } t_1 ≤ t < t_2, \\ (1, −1) & \text{for } t_2 ≤ t < t_3, \\ (1, 1) & \text{for } t_3 ≤ t < t_4, \\ (−1, 1) & \text{for } t_4 ≤ t ≤ t_f, \end{cases}  (8.60)
where 0 = t0 < t1 < t2 < t3 < t4 < t5 = tf . Since the initial point (q1 (0), q2 (0), ω1 (0), ω2 (0)) is
specified, the optimization variable in the optimization problem (8.8), respectively, (8.18), is
z = (τ1 , τ2 , τ3 , τ4 , τ5 ), τk = tk − tk−1 , k = 1, . . . , 5.
The code NUDOCCCS yields the arc durations and switching times
t1 = 0.546174, τ2 = 1.21351, t2 = 1.75968,
τ3 = 1.03867, t3 = 2.79835, τ4 = 0.906039, (8.61)
t4 = 3.70439, τ5 = 0.185023, tf = 3.889409,
Figure 8.5. Control of the two-link robot (8.53)–(8.57). (a) Control u1 . (b) Con-
trol u2 . (c) Angle q1 and velocity ω1 . (d) Angle q2 and velocity ω2 [17].

as well as the initial values of the adjoint variables,

ψ1 (0) = 0.184172, ψ2 (0) = −0.011125,


(8.62)
ψ3 (0) = 1.482636, ψ4 (0) = 0.997367.

The two bang-bang control components as well as the four state variables are shown in
Figure 8.5. The strict bang-bang property (6.19) and the inequalities (6.14) hold in view of

φ1 (t)  = 0 for t  = t2 , t4 , φ2 (t)  = 0 for t  = t1 , t3 ,


φ̇1 (t2 ) < 0, φ̇1 (t4 ) > 0, φ̇2 (t1 ) > 0, φ̇2 (t3 ) < 0.

For the terminal conditions in (8.55), the Jacobian in the optimization problem is computed
as the (4 × 5) matrix
⎛ ⎞
−10.8575 −12.7462 −5.88332 −1.14995 0
⎜ 0.199280 −2.71051 −1.45055 −1.91476 −4.83871 ⎟
Gz (ẑ) = ⎝
6.19355 ⎠
,
−0.622556 3.31422 2.31545 2.94349
9.36085 3.03934 0.484459 0.0405811 0

which has full rank. The Hessian of the Lagrangian is given by


⎛ ⎞
71.1424 90.7613 42.1301 8.49889 −0.0518216
⎜ 90.7613 112.544 51.3129 10.7691 0.149854 ⎟
⎜ ⎟
Lzz (β̂, ẑ) = ⎜ 42.1301 51.3129 23.9633 5.12403 0.138604 ⎟.
⎝ 8.49889 10.7691 5.12403 1.49988 0.170781 ⎠
−0.0518216 0.149854 0.138604 0.170781 0.297359
Figure 8.6. Control of the two-link robot (8.53)–(8.57): Second solution. (a) Con-
trol u1 . (b) Control u2 . (c) Angle q1 and velocity ω1 . (d) Angle q2 and velocity ω2 .

This yields the projected Hessian (8.21) as the positive number


N ∗ Lzz (β̂, ẑ)N = 0.326929.
Hence, all conditions in Theorem 7.10 are satisfied, and thus the extremal (8.60)–(8.62)
yields a strict strong minimum.
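Again, the second-order test can be replayed from the printed data. The following Python sketch (ours) checks that the reduced Hessian is positive on the one-dimensional kernel of Gz(ẑ); the exact number obtained depends on the normalization of the basis vector N, so only its sign should be compared with the value reported above.

import numpy as np

G_z = np.array([
    [-10.8575,   -12.7462,  -5.88332, -1.14995,   0.0],
    [  0.199280,  -2.71051, -1.45055, -1.91476,  -4.83871],
    [ -0.622556,   3.31422,  2.31545,  2.94349,   6.19355],
    [  9.36085,    3.03934,  0.484459, 0.0405811, 0.0]])
L_zz = np.array([
    [71.1424,    90.7613,  42.1301,  8.49889, -0.0518216],
    [90.7613,   112.544,   51.3129, 10.7691,   0.149854],
    [42.1301,    51.3129,  23.9633,  5.12403,  0.138604],
    [ 8.49889,   10.7691,   5.12403, 1.49988,  0.170781],
    [-0.0518216,  0.149854, 0.138604, 0.170781, 0.297359]])

_, _, Vt = np.linalg.svd(G_z)
N = Vt[4:].T                       # one-dimensional kernel of G_z
print(float(N.T @ L_zz @ N) > 0)   # True: reduced Hessian is positive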
It is interesting to note that there exists a second local minimum with the same terminal
time tf = 3.88941. Though the control also has five bang-bang arcs, the control structure is
substantially different from that in (8.60):

u(t) = (u1(t), u2(t)) = \begin{cases} (1, −1), & 0 ≤ t < t_1, \\ (−1, −1), & t_1 ≤ t < t_2, \\ (−1, 1), & t_2 ≤ t < t_3, \\ (1, 1), & t_3 ≤ t < t_4, \\ (1, −1), & t_4 ≤ t ≤ t_f, \end{cases}  (8.63)
where 0 < t1 < t2 < t3 < t4 < t5 = tf . Figure 8.6 displays the second time-optimal solution:
the bang-bang controls and state variables. The code NUDOCCCS determines the switching
times
t1 = 0.1850163,   t2 = 1.091075,   t3 = 2.129721,
t4 = 3.343237,   tf = 3.889409,  (8.64)

for which the strict bang-bang property (6.19) and the inequalities (6.14) hold, i.e.,
Dᵏ(H) > 0 for k = 1, 2, 3, 4. Moreover, computations show that rank(Gz(ẑ)) = 4 and
the projected Hessian of the Lagrangian (8.21) is the positive number
N ∗ Lzz (β̂, ẑ)N = 0.326929.
It is remarkable that this value is identical to the value of the projected Hessian for the first
local minimum. Therefore, also for the second solution we have verified that all conditions
in Theorem 7.10 hold, and thus the extremal (8.63), (8.64) is a strict strong minimum. The
phenomenon of multiple local solutions all with the same minimal time tf has also been
observed by Betts [5, Example 6.8 (Reorientation of a rigid body)].

8.4 Time-Optimal Control of a Single Mode Semiconductor Laser
In [46] we studied the optimal control for two classes of laser, a so-called class B laser
and a semiconductor laser. In this section, we present only the semiconductor laser, whose
dynamical model has been derived in [20, 29]. The primary goal is to control the transition
between the initial stationary state, the switch-off state, and the terminal stationary state, the
switch-on state. The control variable is the electric current injected into the laser by which
the laser output power is modified.
In response to an abrupt switch from a low value of the current (initial state) to a
high value (terminal state), the system responds with damped oscillations (cf. Figure 8.8)
which are a great nuisance in several laser applications. We will show that by injecting an
appropriate bang-bang current (control) the oscillations can be completely removed while
simultaneously shortening the transition time. Semiconductor lasers are a case where the
removal of the oscillations can be particularly beneficial, since in telecommunications one
would like to be able to obtain the fastest possible response to the driving, with the cleanest
and most direct transition, to maximize the data transmission rate and efficiency.
We consider the dynamics of a standard laser model [20], reduced to single-mode
form [29]. In this model, S(t) represents the normalized photon number and N(t) the
carrier density; I(t) is the injected current (control) that is used to steer the transition
between different laser power levels:

Ṡ = −S/tp + Γ G(N, S)S + βBN(N + P0),
Ṅ = I(t)/q − R(N) − G(N, S)S.  (8.65)
The process is considered in a time interval t ∈ [0, tf] with terminal time tf > 0. The gain
function G(N, S) and the recombination function R(N) are given by

G(N, S) = Gp(N − Ntr)(1 − εS),
R(N) = AN + BN(N + P0) + CN(N + P0)².  (8.66)

The parameters have the following meaning: tp , cavity lifetime of the photon; , cavity con-
finement factor; β, coefficient that weights the (average) amount of spontaneous emission
coupled into the lasing mode; B, incoherent band-to-band recombination coefficient; P0 ,
carrier number without injection; q, carrier charge; Gp , gain term; #, gain compression
factor; Ntr , number of carriers at transparency. All parameter values are given in Table 8.1.
The following bounds are imposed for the injected current:

Imin ≤ I (t) ≤ Imax ∀ t ∈ [0, tf ], (8.67)


354 Chapter 8. Numerical Methods for Solving the Induced Optimization Problem

Table 8.1. List of parameters from [29]. The time unit is a picosecond [ps] = [10⁻¹² s].

tp = 2.072 × 10⁻¹² s     Gp = 2.628 × 10⁴ s⁻¹       Γ = 0.3
ε = 9.6 × 10⁻⁸           Ntr = 7.8 × 10⁷             β = 1.735 × 10⁻⁴
P0 = 1.5 × 10⁷           q = 1.60219 × 10⁻¹⁹ C       A = 1 × 10⁸ s⁻¹
B = 2.788 s⁻¹            C = 7.3 × 10⁻⁹ s⁻¹          I0 = 20.5 mA
Imin = 2.0 mA            Imax = 67.5 mA              I∞ = 42.5 mA
where 0 ≤ Imin < Imax . To define appropriate initial and terminal values for S(t) and N(t),
we choose two values I0 and I∞ with Imin < I0 < I∞ < Imax . Then inserting the constant
control functions I (t) ≡ I0 and I (t) ≡ I∞ into the dynamics (8.65), one can show that
there exist two asymptotically stable stationary points (S0 , N0 ) and (Sf , Nf ) with Ṡ = Ṅ = 0
such that
(S(t), N(t)) → (S0 , N0 ) for t → ∞ and I (t) ≡ I0 ,
(S(t), N (t)) → (Sf , Nf ) for t → ∞ and I (t) ≡ I∞ .
Hence, we shall impose the following initial and terminal conditions for the control process
(8.65):
S(0) = S0 , N(0) = N0 and S(tf ) = Sf , N (tf ) = Nf . (8.68)
When controlling the process by the function I (t), one goal is to determine a control func-
tion by which the terminal stationary point is reached in a finite time tf > 0. But we can set
a higher goal by considering the following time-optimal control problem:
Minimize the final time tf (8.69)
subject to the dynamic constraints and boundary conditions (8.65)–(8.68). For computation,
we shall use the nominal parameters from [29] (see Table 8.1).
For these parameters, the stationary points, respectively, initial and terminal values,
in (8.68) are computed in normalized units as
S0 = 0.6119512914 × 105 , N0 = 1.3955581328 × 108 ,
(8.70)
Sf = 3.4063069073 × 105 , Nf = 1.4128116637 × 108 .
The Hamilton–Pontryagin function is given by

H(S, N, ψS, ψN, I) = ψS( −S/tp + Γ G(N, S)S + βBN(N + P0) )
  + ψN( I/q − R(N) − G(N, S)S ),  (8.71)
where the adjoint variables (ψS, ψN) satisfy the adjoint equations

ψ̇S = −HS = ψS( 1/tp − Γ G(N, S) + Γ Gp(N − Ntr)εS )
   + ψN( G(N, S) − Gp(N − Ntr)εS ),  (8.72)
ψ̇N = −HN = −ψS( Γ Gp(1 − εS)S + βB(2N + P0) ) + ψN( A + B(2N + P0)
   + C(3N² + 4NP0 + P0²) + Gp(1 − εS)S ).
The switching function becomes

φ(t) = HI(t) = ψN(t)/q.  (8.73)

The minimization of the Hamiltonian (8.71) yields the following characterization of bang-
bang controls:

I(t) = \begin{cases} I_{min} & \text{if } ψN(t) > 0, \\ I_{max} & \text{if } ψN(t) < 0. \end{cases}  (8.74)
In this problem, a singular control satisfying the condition ψN(t) ≡ 0 for all t ∈ [t1, t2] ⊂
[0, tf], t1 < t2, cannot be excluded a priori. However, the direct optimization approach yields
the following bang-bang control with only one switching time t1:

I(t) = \begin{cases} I_{max} & \text{if } 0 ≤ t < t_1, \\ I_{min} & \text{if } t_1 ≤ t ≤ t_f. \end{cases}  (8.75)

In view of the control law (8.74), we get the switching condition ψN(t1) = 0. Moreover,
since the final time tf is free and the control problem is autonomous, we obtain the addi-
tional boundary condition for a normal trajectory,

H(S(tf), N(tf), ψS(tf), ψN(tf), I(tf)) + 1 = 0.  (8.76)
The optimization variable in the IOP (8.18) is
z = (τ1 , τ2 ), τ1 = t1 , τ2 = tf − t1 . (8.77)
It is noteworthy that the IOP (8.18) reduces to solving an implicitly defined nonlinear
equation: determine two variables τ1 , τ2 such that the two boundary conditions S(tf ) = Sf
and N (tf ) = Nf in (8.70) are satisfied. Thus solving the IOP is equivalent to applying a
Newton-type method to the system of equations. We obtain the following switching time,
terminal time, and initial values of adjoint variables:
t1 = 29.52274, tf = 56.89444 ps,
ψS (0) = −21.6227, ψN (0) = −340.892, (8.78)
ψS (tf ) = −4.6956, ψN (tf ) = 395.60.
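Conceptually, the computation amounts to a two-dimensional shooting problem. The following Python fragment sketches this reformulation; the routine laser_state is hypothetical and stands for an integrator of (8.65) that applies I = Imax on [0, τ1] and I = Imin on [τ1, τ1 + τ2].

import numpy as np
from scipy.optimize import fsolve

S_f, N_f = 3.4063069073e5, 1.4128116637e8   # terminal values from (8.70)

def laser_state(tau1, tau2):
    """Hypothetical helper: integrate (8.65) with I = Imax on [0, tau1]
    and I = Imin on [tau1, tau1 + tau2]; return (S, N) at t_f."""
    raise NotImplementedError  # supply e.g. a solve_ivp-based integrator

def residual(z):
    tau1, tau2 = z
    S_tf, N_tf = laser_state(tau1, tau2)
    # normalize so both equations are O(1) for the root finder
    return [S_tf / S_f - 1.0, N_tf / N_f - 1.0]

# tau = fsolve(residual, x0=[30.0, 27.0])  # root near (29.52, 27.37) ps, cf. (8.78)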
The corresponding control and (normalized) state functions as well as adjoint variables are
shown in Figure 8.7. Note that the constant control I (t) ≡ I∞ has to be applied for t ≥ tf in
order to fix the system at the terminal stationary point (Sf , Nf ). Since the bang-bang control
I (t) has only one switch, Proposition 6.25 asserts that the computed extremal furnishes a
strict strong minimum. The computed trajectory is normal, because the adjoint variables
satisfy the necessary condition (6.7)–(6.11) with α0 = 1. Moreover, the graph of ψN (t) in
Figure 8.7 shows that the strict bang-bang property and D¹(H) > 0 in (6.14) hold in view of

φ(t) < 0  ∀ 0 ≤ t < t1,   φ(t) > 0  ∀ t1 < t ≤ tf,   φ̇(t1) > 0.
These conditions provide first-order sufficient conditions. Alternatively, we can use Theo-
rem 7.10 for proving optimality. The critical cone is the zero element, since the computed
2 × 2 Jacobian matrix

Gz(ẑ) = \begin{pmatrix} 0.199855 & −0.000155599 \\ 0.0 & −0.00252779 \end{pmatrix}
Figure 8.7. Time-optimal control of a semiconductor laser. (a) Normalized photon
number S(t) × 10⁻⁵. (b) Normalized carrier density N(t) × 10⁻⁸. (c) Electric current
(control) I(t) with I(t) = I0 = 20.5 for t < 0 and I(t) = I∞ = 42.5 for t > tf. (d) Adjoint
variables ψS(t), ψN(t).
Figure 8.8. Normalized photon number S(t) for I(t) ≡ 42.5 mA and optimal I(t) [46].

is regular. Comparing the optimal control approach with the topological phase-space tech-
nique proposed in [29], we recognize that the control structure (8.75) constitutes a translation
of the latter technique into rigorous mathematical terms.
The comparison in Figure 8.8 between the uncontrolled and optimally controlled
laser shows the strength of the optimal control approach: the damped oscillations have been
completely eliminated and a substantial shortening of the transient time has been achieved.
Indeed, it is surprising that such a dramatic improvement is caused by the simple control
strategy (8.75) adopted here.
Figure 8.9. Schematics of a batch reactor with two control variables (feed of B, cooling).
It is worth stressing the improvement that the optimal control approach has brought
to the problem. The disappearance of the damped oscillations allows the laser to attain
its final state in a finite time rather than asymptotically. In practical terms, one can set a
threshold value, δ, around the asymptotic level, S∞ , and consider the state attained once
S∞ − δ < S(t) < S∞ + δ holds. The parameter δ can be determined by the amount of noise
present in the system, which we do not consider in our analysis. Visually, this operation
corresponds to saying that the damped oscillation and asymptotic state are indistinguishable
below a certain level of detail, e.g., if t > 500 ps as in Figure 8.8.
With this convention, one can give a quantitative estimate of the amount of improve-
ment introduced by the control function: the asymptotic state is attained at t ≈ 50 ps, even
before the first crossing of the same level which occurs at t ≈ 70 ps (dashed line in Figure
8.8). The improvement is of the order of a factor 10!

8.5 Optimal Control of a Batch-Reactor
The following optimal control problem for a batch-reactor is taken from [17, 110, 66].
Consider a chemical reaction
A+B → C
and its side reaction
B +C → D
which are assumed to be strongly exothermic. Thus, direct mixing of the entire necessary
amounts of the reactants must be avoided.
The reactant A is charged in the reactor vessel, which is fitted with a cooling jacket
to remove the generated heat of the reaction, while the reactant B is added. These reactions
result in the product C and the undesired byproduct D.
The vector of state variables is denoted by

x = (MA, MB, MC, MD, H) ∈ R⁵,

where Mi(t) [mol] and Ci(t) [mol/m³] stand for the molar holdups and the molar concen-
trations of the components i = A, B, C, D, respectively. H(t) [MJ] denotes the total energy
holdup, TR(t) [K] the reactor temperature, and V(t) [m³] the volume of liquid in the system.
The two-dimensional control vector is given by

u = (FB, Q) ∈ R²,

where FB(t) [mol/s] controls the feed rate of the component B, while Q(t) [kW] controls
the cooling load. The objective is to determine a control u that maximizes the molar holdup
of the component C. Hence, the performance index is

J1(x(tf)) = −MC(tf),  (8.79)
which has to be minimized subject to the dynamical equations

ṀA = −V·r1,   ṀB = FB − V·(r1 + r2),
ṀC = V·(r1 − r2),   ṀD = V·r2,  (8.80)
Ḣ = FB·hf − Q − V·(r1·ΔH1 + r2·ΔH2).

Here, rj denote the reaction rates and kj the corresponding Arrhenius rate constants of both
reactions (j = 1, 2):

A + B → C :   r1 = k1·CA·CB  with  k1 = A1·e^{−E1/TR},
C + B → D :   r2 = k2·CB·CC  with  k2 = A2·e^{−E2/TR},  (8.81)

where functions are defined by


 
S = Mi · α i , W = Mi · βi ,

i=A,B,C,D
! 
i=A,B,C,D
1
TR = · −S + (W · Tref + S)2 + 2 · W · H , (8.82)
W
Mi  Mi
Ci = ( i = A, B, C, D ), V = .
V ρi
i=A,B,C,D

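For completeness, a direct Python transcription of the algebraic relations (8.82) might look as follows (a sketch; the function and argument names are ours).

import numpy as np

T_REF = 298.0   # reference temperature [K] for the enthalpy calculation

def algebraic_quantities(M, H, alpha, beta, rho):
    """Evaluate (8.82); M, alpha, beta, rho are length-4 arrays over A, B, C, D."""
    S = float(np.dot(M, alpha))
    W = float(np.dot(M, beta))
    T_R = (-S + np.sqrt((W * T_REF + S) ** 2 + 2.0 * W * H)) / W
    V = float(np.sum(M / rho))       # liquid volume [m^3]
    C = M / V                        # molar concentrations [mol/m^3]
    return T_R, V, C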
The reference temperature for the enthalpy calculations is Tref = 298 K and the specific
molar enthalpy of the reactor feed stream is hf = 20 kJ/mol. Initial values are given for all
state variables,

MA(0) = 9000,   Mi(0) = 0 (i = B, C, D),   H(0) = 152509.97,  (8.83)

while there is only one terminal constraint,

TR(tf) = 300,  (8.84)

with TR defined as in (8.82). The control vector u = (FB, Q) appears linearly in the control
system (8.80) and is bounded by

0 ≤ FB(t) ≤ 10  and  0 ≤ Q(t) ≤ 1000   ∀ t ∈ [0, tf].  (8.85)
The reaction and component data appearing in (8.80)–(8.82) are given in Table 8.2.
Table 8.2. Reaction and component data.

Notation            j = 1      j = 2      Meaning
Aj [m³/(mol·s)]     0.008      0.002      Preexponential Arrhenius constants
Ej [K]              3000       2400       Activation energies
ΔHj [kJ/mol]        −100       −75        Enthalpies

Notation            i = A      i = B      i = C      i = D      Meaning
ρi [mol/m³]         11250      16000      10400      10000      Molar density of pure component i
αi [kJ/(mol·K)]     0.1723     0.2        0.16       0.155      Coefficient of the linear (αi) and
βi [kJ/(mol·K²)]    0.000474   0.0005     0.00055    0.000323   quadratic (βi) term in the pure
                                                                component specific enthalpy expression
Calculations show that for increasing tf the switching structure gets more and more
complex. However, the total profit MC(tf) is nearly constant if tf is greater than a certain
value, tf ≈ 1600. For these values one obtains singular controls. We choose the final time
tf = 1450 and will show that the optimal control has the following bang-bang structure with
0 < t1 < t2 < tf:

u(t) = (FB(t), Q(t)) = \begin{cases} (10, 0) & \text{for } 0 ≤ t < t_1, \\ (10, 1000) & \text{for } t_1 ≤ t < t_2, \\ (0, 1000) & \text{for } t_2 ≤ t ≤ t_f. \end{cases}  (8.86)

Since the initial point x(0) is specified and the final time tf is fixed, the optimization variable
in the IOP (8.18) is given by

z = (τ1, τ2),  τ1 = t1,  τ2 = t2 − t1.
Then the arc length of the terminal bang-bang arc is τ3 = 1450 − τ1 − τ2. The code
NUDOCCCS yields the following arc lengths and switching times:

J1(x(tf)) = 3555.292,   t1 = 433.698,   τ2 = 333.575,
t2 = 767.273,   τ3 = 1450 − t2 = 682.727.  (8.87)

We note that for this control the state constraint TR(t) ≤ 520 imposed in [17] does not
become active. The adjoint equations are rather complicated and are not given here explic-
itly. The code NUDOCCCS also provides the adjoint functions, e.g., the initial values

ψMA(0) = −0.0299034,   ψMB(0) = 0.0433083,
ψMC(0) = −2.83475,   ψMD(0) = −0.10494943,  (8.88)
ψH(0) = 0.00192489.
Figure 8.10. Control of a batch reactor with functional (8.79). Top row: Control
u = (FB, Q) and scaled switching functions. Middle row: Molar concentrations MA and
MB. Bottom row: Molar concentrations (MC, MD) and energy holdup H.
Figure 8.10 (top row) clearly shows that the strict bang-bang property holds with

φ̇2(t1) < 0,   φ̇1(t2) > 0.

The Jacobian of the scalar terminal condition (8.84) is computed as

Gz(ẑ) = (0.764966, 0.396419),

while the Hessian of the Lagrangian is the positive definite matrix

Lzz(β̂, ẑ) = \begin{pmatrix} 0.00704858 & 0.00555929 \\ 0.00555929 & 0.00742375 \end{pmatrix}.

Thus, the second-order condition (8.20) is satisfied and Theorem 7.10 tells us that the
solution (8.86)–(8.88) provides a strict strong minimum.
Let us now change the cost functional (8.79) and maximize the average gain of the
component C in time, i.e.,

minimize  J2(tf, x(tf)) = −MC(tf)/tf,  (8.89)

where the final time tf is free. We will show that the bang-bang control

u(t) = (FB(t), Q(t)) = \begin{cases} (10, 1000) & \text{for } 0 ≤ t < t_1, \\ (0, 1000) & \text{for } t_1 ≤ t ≤ t_f \end{cases}  (8.90)
with only one switching point 0 < t1 < tf of the control u1(t) = FB(t) is optimal. Since
the initial point x(0) is specified, the optimization variable in the IOP (8.18) is

z = (τ1, τ2),  τ1 = t1,  τ2 = tf − t1.

Using the code NUDOCCCS, we obtain the switching and terminal times

J2(tf, x(tf)) = 3.877103,   t1 = 285.519,
τ2 = 171.399,   tf = 456.918,  (8.91)

and the initial values of the adjoint variables

ψMA(0) = −0.9932 × 10⁻⁴,   ψMB(0) = −0.61036 × 10⁻³,
ψMC(0) = −0.108578 × 10⁻²,   ψMD(0) = 0.9495 × 10⁻⁴,  (8.92)
ψH(0) = 0.298 × 10⁻⁵.
We may conclude from Figure 8.11 that φ̇1(t1) > 0 holds, while the switching function for
the control u2(t) = Q(t) = 1000 satisfies φ2(t) < 0 on [0, tf]. The Jacobian for the scalar
terminal condition (8.84) is

Gz(ẑ) = (0.054340, −0.31479),

while the Hessian of the Lagrangian is the positive definite matrix

Lzz(β̂, ẑ) = \begin{pmatrix} 0.37947 & −0.11213 \\ −0.11213 & 0.37748 \end{pmatrix} × 10⁻⁴.

Hence the SSC (8.20) hold and, again, Theorem 7.10 asserts that the solution (8.90)–(8.92)
is a strict strong minimum.

8.6 Optimal Production and Maintenance with L1-Functional
In Section 5.4, we studied the optimal control model presented in Cho, Abad, and Parlar
[22], where optimal production and maintenance policies are determined simultaneously.
There, the cost functional was quadratic with respect to the production control, which favors
a continuous control. In this section, we consider the case where the production control
enters the cost functional linearly. In this model, x = x1 denotes the inventory level, y = x2
Figure 8.11. Control of a batch reactor with functional (8.89). Top row: Control
u = (FB, Q) and scaled switching functions. Middle row: Molar concentrations MA, MB.
Bottom row: Molar concentrations (MC, MD) and energy holdup H.

the proportion of good units of end items produced, v is the scheduled production rate
(control), and m denotes the preventive maintenance rate (control). The parameters are
α = 2 (obsolescence rate), s = 4 (demand rate), and ρ = 0.1 (discount rate). The control bounds
and weights in the cost functional will be specified below. The dynamics of the process is
given by

ẋ(t) = y(t)v(t) − 4,   x(0) = 3,   x(tf) = 0,
ẏ(t) = −2y(t) + (1 − y(t))m(t),   y(0) = 1,  (8.93)

with the following bounds on the control variables:

0 ≤ v(t) ≤ 3,   0 ≤ m(t) ≤ 4  for 0 ≤ t ≤ tf.  (8.94)
Recall also that the terminal condition x(tf) = 0 implies the nonnegativity condition
x(t) ≥ 0 for all t ∈ [0, tf]. Now we choose the following L1-functional: Maximize the
total discounted profit and salvage value of y(tf),

J(x, y, m, v) = ∫₀^{tf} [8s − x(t) − 4v(t) − 2.5 m(t)] e^{−0.1·t} dt + 10 y(tf) e^{−0.1·tf}   (tf = 1),  (8.95)
under the constraints (8.93) and (8.94). Though the objective has to be maximized, we
discuss the necessary conditions on the basis of the Minimum Principle, which has been
used throughout the book. Again, we use the standard Hamilton–Pontryagin function instead
of the current value Hamiltonian:

H(t, x, y, ψ1, ψ2, m, v) = e^{−0.1·t}(−32 + x + 4v + 2.5m) + ψ1(yv − 4) + ψ2(−2y + (1 − y)m),  (8.96)
where ψ1, ψ2 denote the adjoint variables. The adjoint equations and transversality con-
ditions yield, in view of the terminal constraint x(tf) = 0 and the salvage term in the cost
functional,

ψ̇1 = −e^{−0.1·t},   ψ1(tf) = ν,
ψ̇2 = −ψ1 v + ψ2(2 + m),   ψ2(tf) = −10e^{−0.1},  (8.97)
where ν ∈ R is an unknown multiplier. The switching functions are given by

φm = Hm = 2.5 e^{−0.1·t} + ψ2(1 − y),   φv = Hv = 4 e^{−0.1·t} + ψ1 y,

which determine the controls according to

v(t) = \begin{cases} 3 & \text{if } φv(t) < 0, \\ 0 & \text{if } φv(t) > 0, \end{cases}    m(t) = \begin{cases} 4 & \text{if } φm(t) < 0, \\ 0 & \text{if } φm(t) > 0. \end{cases}  (8.98)
Singular controls, for which either φv(t) = 0 or φm(t) = 0 holds on an interval I ⊂ [0, tf],
cannot be excluded a priori. However, for the data and the final time tf = 1 chosen here, the
application of direct optimization methods [5, 13, 14] reveals the following control structure
with four bang-bang arcs:

(v(t), m(t)) = \begin{cases} (3, 0) & \text{for } 0 ≤ t < t_1, \\ (0, 0) & \text{for } t_1 ≤ t < t_2, \\ (0, 4) & \text{for } t_2 ≤ t < t_3, \\ (3, 4) & \text{for } t_3 ≤ t ≤ t_f = 1. \end{cases}  (8.99)
The optimization vector for the IOP (8.18) is

z = (τ1, τ2, τ3),  τ1 = t1,  τ2 = t2 − t1,  τ3 = t3 − t2.

Therefore, the terminal arc length is given by τ4 = tf − (τ1 + τ2 + τ3). The code NUDOCCCS
yields the following numerical results for the arc lengths and adjoint variables:

t1 = 0.346533,   τ2 = 0.380525,   t2 = 0.727058,
τ3 = 0.114494,   t3 = 0.841552,   tf = 1.0,  (8.100)
J = 25.7969,   ψ1(tf) = −0.833792,   ψ2(tf) = −0.904837.
Figure 8.12. Optimal production and maintenance with L1-functional (8.95).
(a) State variables x and y. (b) Control variables v and m. (c), (d) Control variables and
switching functions.
Figure 8.12 clearly indicates that the strict bang-bang property (6.19) holds, since in partic-
ular we have

φ̇v(t1) > 0,   φ̇v(t3) < 0,   φ̇m(t2) < 0.

For the scalar terminal condition x(tf) = 0, the Jacobian is the nonzero row vector

Gz(ẑ) = (−0.319377, −1.81950, −1.35638).

The Hessian of the Lagrangian is computed as the (3 × 3) matrix

Lzz(β̂, ẑ) = \begin{pmatrix} 41.6187 & 21.0442 & −3.43687 \\ 21.0442 & 21.0442 & −3.43687 \\ −3.43687 & −3.43687 & 34.4731 \end{pmatrix},

from which the reduced Hessian (8.20) is obtained as the positive definite (2 × 2) matrix

N∗Lzz(β̂, ẑ)N = \begin{pmatrix} 20.4789 & 6.61585 \\ 6.61585 & 49.6602 \end{pmatrix}.

Hence the second-order test (8.20) holds, which ensures that the control (8.99) with the
data (8.100) yields a strict strong minimum.
8.7 Van der Pol Oscillator with Bang-Singular Control
The following example with a bang-singular control is taken from Vossen [111, 112]. The
optimal control is a concatenation of two bang-bang arcs and one terminal singular arc. The
singular control is given by a feedback expression, which allows us to optimize the switching
times directly using the arc-parametrization method presented in Section 8.1.3; cf. also
[111, 112]. We consider again the dynamic model of a Van der Pol oscillator introduced in
Section 6.7.1. The aim is to minimize the regulator functional

J(x, u) = (1/2) ∫₀^{tf} (x1(t)² + x2(t)²) dt   (tf = 4)  (8.101)

subject to the dynamics, boundary conditions, and control constraints

ẋ1(t) = x2(t),   x1(0) = 0,
ẋ2(t) = −x1(t) + x2(t)(1 − x1(t)²) + u(t),   x2(0) = 1,  (8.102)
|u(t)| ≤ 1  for t ∈ [0, tf].
The Hamilton–Pontryagin function

H(x, u, ψ) = ½(x1² + x2²) + ψ1x2 + ψ2(−x1 + x2(1 − x1²) + u)  (8.103)

yields the adjoint equations and transversality conditions

ψ̇1 = −x1 + ψ2(1 + 2x1x2),   ψ1(tf) = 0,
ψ̇2 = −x2 − ψ1 − ψ2(1 − x1²),   ψ2(tf) = 0.  (8.104)
The switching function

φ(t) = Hu(t) = ψ2(t)  (8.105)

determines the bang-bang controls by

u(t) = \begin{cases} 1 & \text{if } ψ2(t) < 0, \\ −1 & \text{if } ψ2(t) > 0. \end{cases}  (8.106)
A singular control on an interval I ⊂ [0, tf] can be computed from the relations

φ = ψ2 = 0,   φ̇ = ψ̇2 = −x2 − ψ1 = 0,   φ̈ = 2x1 − x2(1 − x1²) − u = 0,

which give the feedback expression

u = u_sing(x) = 2x1 − x2(1 − x1²).  (8.107)
It follows that the order of a singular arc is q = 1; cf. the definition of the order in [49].
Moreover, the strict Generalized Legendre Condition holds:

(−1)^q (∂/∂u)(d²/dt²) Hu = −(∂/∂u) φ̈ = 1 > 0.
The application of direct optimization methods yields the following control structure with
two bang-bang arcs and a terminal singular arc:

u(t) = \begin{cases} −1 & \text{for } 0 ≤ t < t_1, \\ 1 & \text{for } t_1 ≤ t < t_2, \\ 2x_1(t) − x_2(t)(1 − x_1(t)²) & \text{for } t_2 ≤ t ≤ t_f = 4. \end{cases}  (8.108)

Hence, the feedback functions in (8.26) are given by

u1(x) = −1,   u2(x) = 1,   u3(x) = 2x1 − x2(1 − x1²),

which are inserted into the dynamic equations (8.102). Therefore, the optimization vector
for the IOP (8.28) is given by

z = (τ1, τ2),  τ1 = t1,  τ2 = t2 − t1.
This yields the terminal arc length τ3 = 4 − τ1 − τ2. The code NUDOCCCS provides the
following numerical results for the arc lengths, switching times, and adjoint variables:

t1 = 1.36674,   τ2 = 1.09404,   t2 = 2.46078,  (8.109)
J(x, u) = 0.757618,   ψ1(0) = −0.495815,   ψ2(0) = 2.58862.
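Since all three arcs of (8.108) are given in feedback form, the whole trajectory, and with it the cost (8.101), can be reproduced by straightforward integration. A minimal Python sketch (ours), using the switching times from (8.109):

import numpy as np
from scipy.integrate import solve_ivp

t1, t2, tf = 1.36674, 2.46078, 4.0      # switching times from (8.109)

def u(t, x):
    if t < t1:  return -1.0
    if t < t2:  return  1.0
    return 2.0 * x[0] - x[1] * (1.0 - x[0] ** 2)   # singular feedback (8.107)

def rhs(t, x):
    # state extended by the running cost: x3' = (x1^2 + x2^2)/2
    u_t = u(t, x)
    return [x[1], -x[0] + x[1] * (1.0 - x[0] ** 2) + u_t,
            0.5 * (x[0] ** 2 + x[1] ** 2)]

# a more careful code would integrate arc by arc to respect the switching points
sol = solve_ivp(rhs, (0.0, tf), [0.0, 1.0, 0.0], rtol=1e-10, atol=1e-12)
print(sol.y[2, -1])   # should be close to J = 0.757618 from (8.109)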
The corresponding control and state variables are shown in Figure 8.13. The code
NUDOCCCS furnishes the Hessian of the Lagrangian as the positive definite matrix

Lzz(ẑ) = \begin{pmatrix} 194.93 & −9.9707 \\ −9.9707 & 0.56653 \end{pmatrix}.
At this stage we have only found a strict local minimum of the IOP (8.28). Recently, SSC
for a class of bang-singular controls were obtained by Aronna et al. [2]. Combining these
new results with the Riccati approach in Vossen [111] for testing the positive definiteness of
quadratic forms, one can now verify the SSC for the extremal solutions (8.108) and (8.109).
Here, the following property of the switching function is important:

φ(t) > 0 for 0 ≤ t < t1,   φ̇(t1) < 0,   φ(t) < 0 for t1 < t < t2,
φ(t2) = φ̇(t2) = 0,   φ̈(t2⁻) < 0,   φ(t) = 0 for t2 < t < tf.
Bibliography

[1] A. A. Agrachev, G. Stefani, and P. Zezza, Strong optimality for a bang-bang trajectory, SIAM Journal on Control and Optimization, 41, pp. 991–1014 (2002). [5, 299, 300, 301, 304]

[2] M. S. Aronna, J. F. Bonnans, A. V. Dmitruk, and P. A. Lotito, Quadratic conditions for bang-singular extremals, to appear in Numerical Algebra, Control and Optimization, 2, pp. 511–546 (2012). [366]

[3] D. Augustin and H. Maurer, Sensitivity analysis and real-time control of a container crane under state constraints, In: Online Optimization of Large Scale Systems (M. Grötschel, S. O. Krumke, J. Rambau, eds.), pp. 69–82, Springer, Berlin, 2001. [219]

[4] A. Ben-Tal and J. Zowe, Second-order optimality conditions for the L1-minimization problem, Applied Mathematics and Optimization, 13, pp. 45–48 (1985). [301]

[5] J. T. Betts, Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, 2nd ed., Advances in Design and Control 19, SIAM, Philadelphia, 2010. [250, 339, 353, 363]

[6] G. A. Bliss, Lectures on the Calculus of Variations, University of Chicago Press, Chicago, 1946. [24]

[7] H. G. Bock and K. J. Plitt, A multiple shooting algorithm for direct solution of optimal control problems, In: Proceedings of the 9th IFAC World Congress, Budapest, pp. 243–247, 1984. [339]

[8] J. F. Bonnans and N. P. Osmolovskii, Characterization of a local quadratic growth of the Hamiltonian for control constrained optimal control problems, Dynamics of Continuous, Discrete and Impulsive Systems, Series B: Applications and Algorithms, 19, pp. 1–16 (2012). [177]

[9] B. Bonnard, J.-B. Caillau, and E. Trélat, Second order optimality conditions in the smooth case and applications in optimal control, ESAIM Control, Optimization and Calculus of Variations, 13, pp. 207–236 (2007). [339]

[10] B. Bonnard, J.-B. Caillau, and E. Trélat, Computation of conjugate times in smooth optimal control: The COTCOT algorithm, In: Proceedings of the 44th IEEE Conference on Decision and Control and European Control Conference 2005, pp. 929–933, Sevilla, December 2005. [339]
[11] A. Bressan, A high order test for optimality of bang-bang controls, SIAM Journal on Control and Optimization, 23, pp. 38–48 (1985).

[12] A. E. Bryson and Y. C. Ho, Applied Optimal Control, Revised Printing, Hemisphere Publishing Corporation, New York, 1975. [219]

[13] C. Büskens, Optimierungsmethoden und Sensitivitätsanalyse für optimale Steuerprozesse mit Steuer- und Zustands-Beschränkungen, Dissertation, Institut für Numerische Mathematik, Universität Münster, Germany (1998). [6, 219, 273, 287, 288, 291, 339, 342, 343, 363]

[14] C. Büskens and H. Maurer, SQP-methods for solving optimal control problems with control and state constraints: Adjoint variables, sensitivity analysis and real-time control, Journal of Computational and Applied Mathematics, 120, pp. 85–108 (2000). [6, 219, 250, 287, 288, 291, 339, 363]

[15] C. Büskens and H. Maurer, Sensitivity analysis and real-time optimization of parametric nonlinear programming problems, In: Online Optimization of Large Scale Systems (M. Grötschel, S. O. Krumke, J. Rambau, eds.), pp. 3–16, Springer-Verlag, Berlin, 2001. [219]

[16] C. Büskens and H. Maurer, Sensitivity analysis and real-time control of parametric optimal control problems using nonlinear programming methods, In: Online Optimization of Large Scale Systems (M. Grötschel, S. O. Krumke, J. Rambau, eds.), pp. 57–68, Springer-Verlag, Berlin, 2001. [219]

[17] C. Büskens, H. J. Pesch, and S. Winderl, Real-time solutions of bang-bang and singular optimal control problems, In: Online Optimization of Large Scale Systems (M. Grötschel et al., eds.), pp. 129–142, Springer-Verlag, Berlin, 2001. [xii, 351, 357, 359]

[18] R. Bulirsch, Die Mehrzielmethode zur numerischen Lösung von nichtlinearen Randwertproblemen und Aufgaben der optimalen Steuerung, Report of the Carl-Cranz-Gesellschaft, Oberpfaffenhofen, Germany, 1971. [339]

[19] R. Bulirsch, F. Montrone, and H. J. Pesch, Abort landing in the presence of windshear as a minimax optimal control problem. Part 2: Multiple shooting and homotopy, Journal of Optimization Theory and Applications, 70, pp. 223–254 (1991). [339]

[20] D. M. Byrne, Accurate simulation of multifrequency semiconductor laser dynamics under gigabit-per-second modulation, Journal of Lightwave Technology, 10, pp. 1086–1096 (1992). [353]

[21] F. L. Chernousko, L. D. Akulenko, and N. N. Bolotnik, Time-optimal control for robotic manipulators, Optimal Control Applications & Methods, 10, pp. 293–311 (1989). [346, 347, 348]
[22] D. I. Cho, P. L. Abad, and M. Parlar, Optimal production and maintenance decisions when a system experiences age-dependent deterioration, Optimal Control Applications & Methods, 14, pp. 153–167 (1993). [4, 248, 249, 250, 361]

[23] B. Christiansen, H. Maurer, and O. Zirn, Optimal control of servo actuators with flexible load and Coulombic friction, European Journal of Control, 17, pp. 1–11 (2011). [342]

[24] B. Christiansen, H. Maurer, and O. Zirn, Optimal control of machine tool manipulators, In: Recent Advances in Optimization and Its Applications in Engineering: The 14th Belgian-French-German Conference on Optimization, Leuven, September 2009 (M. Diehl, F. Glineur, E. Jarlebring, W. Michels, eds.), pp. 451–460, Springer, Berlin, 2010.

[25] A. V. Dmitruk, Euler–Jacobi equation in the calculus of variations, Mat. Zametki, 20, pp. 847–858 (1976). [4, 191, 197]

[26] A. V. Dmitruk, Jacobi-type conditions for the Bolza problem with inequalities, Mat. Zametki, 35, pp. 813–827 (1984). [4, 191, 193]

[27] A. V. Dmitruk, Quadratic conditions for the Pontryagin minimum in a control-linear optimal control problem. I. Decoding theorem, Izv. Akad. Nauk SSSR Ser. Mat., 50, pp. 284–312 (1986). [68]

[28] A. V. Dmitruk, A. A. Milyutin, and N. P. Osmolovskii, Lyusternik's theorem and extremum theory, Usp. Mat. Nauk, 35, pp. 11–46 (1980); English translation: Russian Math. Surveys, 35:6, pp. 11–51 (1980). [132]

[29] N. Dokhane and G. L. Lippi, Chirp reduction in semiconductor lasers through injection current patterning, Applied Physics Letters, 78, pp. 3938–3940 (2001). [353, 354, 356]

[30] A. Ya. Dubovitskii and A. A. Milyutin, Theory of the maximum principle, In: Methods of Extremal Problem Theory in Economics [in Russian], Nauka, Moscow, pp. 6–47 (1981). [166, 167]

[31] U. Felgenhauer, On stability of bang-bang type controls, SIAM Journal on Control and Optimization, 41, pp. 1843–1867 (2003). [259, 338]

[32] A. V. Fiacco, Introduction to Sensitivity and Stability Analysis in Nonlinear Programming, Mathematics in Science and Engineering 165, Academic Press, New York, 1983. [338]

[33] R. Fourer, D. M. Gay, and B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, Duxbury Press, Brooks-Cole Publishing Company, 1993. [250]

[34] R. Gabasov and F. Kirillova, The Qualitative Theory of Optimal Processes, Marcel Dekker Inc., New York, Basel, 1976.
[35] H. P. Geering, L. Guzella, S. A. R. Hepner, and Ch. Onder, Time-optimal motions of robots in assembly tasks, IEEE Transactions on Automatic Control, AC-31, pp. 512–518 (1986). [346, 349]

[36] E. G. Gilbert and D. S. Bernstein, Second-order necessary conditions of optimal control: Accessory-problem results without normality conditions, Journal of Optimization Theory and Applications, 41, pp. 75–106 (1983).

[37] L. Göllmann, Numerische Berechnung zeitoptimaler Trajektorien für zweigliedrige Roboterarme, Diploma thesis, Institut für Numerische Mathematik, Universität Münster, 1991. [346, 347, 348]

[38] B. S. Goh, Necessary conditions for singular extremals involving multiple control variables, SIAM Journal on Control, 4, pp. 716–731 (1966).

[39] R. Henrion, La Théorie de la Variation Seconde et ses Applications en Commande Optimale, Académie Royale de Belgique, Bruxelles: Palais des Académies (1979). [1, 32]

[40] M. Hestenes, Calculus of Variations and Optimal Control Theory, John Wiley, New York, 1966. [225]

[41] A. J. Hoffman, On approximate solutions of systems of linear inequalities, Journal of Research of the National Bureau of Standards, 49, pp. 263–265 (1952). [16]

[42] A. D. Ioffe and V. M. Tikhomirov, Theory of Extremal Problems [in Russian], Nauka, Moscow (1974). Also published in German as Theorie der Extremalaufgaben, VEB Deutscher Verlag der Wissenschaften, Berlin, 1979. [191]

[43] D. H. Jacobson and D. Q. Mayne, Differential Dynamic Programming, American Elsevier Publishing Company Inc., New York, 1970.

[44] Y. Kaya and J. L. Noakes, Computations and time-optimal controls, Optimal Control Applications and Methods, 17, pp. 171–185 (1996). [6, 339, 341, 342]

[45] C. Y. Kaya and J. L. Noakes, Computational method for time-optimal switching control, Journal of Optimization Theory and Applications, 117, pp. 69–92 (2003). [6, 339, 341, 342]

[46] J.-H. R. Kim, G. L. Lippi, and H. Maurer, Minimizing the transition time in lasers by optimal control methods. Single mode semiconductor lasers with homogeneous transverse profile, Physica D (Nonlinear Phenomena), 191, pp. 238–260 (2004). [xii, 353, 356]

[47] J.-H. R. Kim, H. Maurer, Yu. A. Astrov, M. Bode, and H.-G. Purwins, High-speed switch-on of a semiconductor gas discharge image converter using optimal control methods, Journal of Computational Physics, 170, pp. 395–414 (2001). [280]

[48] J.-H. R. Kim and H. Maurer, Sensitivity analysis of optimal control problems with bang-bang controls, In: Proceedings of the 42nd IEEE Conference on Decision and Control, Maui, Dec. 9–12, 2003, IEEE Control Society, Washington, DC, pp. 3281–3286, 2003. [338, 342]
[49] A. J. Krener, The high order maximum principle, and its application to singular extremals, SIAM Journal on Control and Optimization, 15, pp. 256–293 (1977). [365]

[50] U. Ledzewicz, H. Maurer, and H. Schättler, Optimal and suboptimal protocols for a mathematical model for tumor anti-angiogenesis in combination with chemotherapy, Mathematical Biosciences and Engineering, 8, pp. 307–328 (2011). [343]

[51] U. Ledzewicz, H. Maurer, and H. Schättler, On optimal delivery of combination therapy for tumors, Mathematical Biosciences, 22, pp. 13–26 (2009). [343]

[52] U. Ledzewicz, J. Marriott, H. Maurer, and H. Schättler, The scheduling of angiogenic inhibitors minimizing tumor volume, Journal of Medical Informatics and Technologies, 12, pp. 23–28 (2008).

[53] U. Ledzewicz and H. Schättler, Optimal bang-bang controls for a 2-compartment model in cancer chemotherapy, Journal of Optimization Theory and Applications, 114, pp. 609–637 (2002). [270]

[54] E. S. Levitin, A. A. Milyutin, and N. P. Osmolovskii, On local minimum conditions in problems with constraints, In: Mathematical Economics and Functional Analysis [in Russian], Nauka, Moscow, pp. 139–202 (1974). [3, 301]

[55] E. S. Levitin, A. A. Milyutin, and N. P. Osmolovskii, Higher-order local minimum conditions in problems with constraints, Uspehi Mat. Nauk, 33, pp. 85–148 (1978). [3, 21, 174, 301]

[56] E. S. Levitin, A. A. Milyutin, and N. P. Osmolovskii, Theory of higher-order conditions in smooth constrained extremal problems, In: Theoretical and Applied Optimal Control Problems [in Russian], Nauka, Novosibirsk, pp. 4–40, 246 (1985). [3]

[57] K. Malanowski, C. Büskens, and H. Maurer, Convergence of approximations to nonlinear optimal control problems, In: Mathematical Programming with Data Perturbations (A. V. Fiacco, ed.), Lecture Notes in Pure and Applied Mathematics, Vol. 195, pp. 253–284, Marcel Dekker, Inc., New York, 1997.

[58] K. Malanowski and H. Maurer, Sensitivity analysis for parametric control problems with control-state constraints, Computational Optimization and Applications, 5, pp. 253–283 (1996). [219]

[59] K. Malanowski and H. Maurer, Sensitivity analysis for state constrained optimal control problems, Discrete and Continuous Dynamical Systems, 4, pp. 241–272 (1998). [219]

[60] P. Martinon and J. Gergaud, Using switching detection and variational equations for the shooting method, Optimal Control Applications and Methods, 28, pp. 95–116 (2007). [339]

[61] H. Maurer, Numerical solution of singular control problems using multiple shooting techniques, Journal of Optimization Theory and Applications, 18, pp. 235–257 (1976). [339]

[62] H. Maurer, First and second order sufficient optimality conditions in mathemat-
ical programming and optimal control, Mathematical Programming Study, 14,
pp. 163–177 (1981). [25]

[63] H. Maurer, Differential stability in optimal control problems, Applied Mathemat-
ics and Optimization, 5, pp. 283–295 (1979). [298]

[64] H. Maurer and D. Augustin, Second order sufficient conditions and sensitivity
analysis for the controlled Rayleigh problem, In: Parametric Optimization and
Related Topics IV (J. Guddat et al., eds.) Lang, Frankfurt am Main, pp. 245–259
(1997). [219, 339]

[65] H. Maurer and D. Augustin, Sensitivity analysis and real-time control of paramet-
ric optimal control problems using boundary value methods, In: Online Optimization
of Large Scale Systems (M. Grötschel et al., eds.), pp. 17–55, Springer Verlag,
Berlin, 2001. [203, 219, 339]

[66] H. Maurer, C. Büskens, J.-H. R. Kim, and Y. Kaya, Optimization methods for the
verification of second-order sufficient conditions for bang-bang controls, Optimal
Control Applications and Methods, 26, pp. 129–156 (2005). [6, 339, 340, 341, 342,
357]

[67] H. Maurer, J.-H. R. Kim, and G. Vossen, On a state-constrained control problem
in optimal production and maintenance, In: Optimal Control and Dynamic Games,
Applications in Finance, Management Science and Economics (C. Deissenberg, R. F.
Hartl, eds.), pp. 289–308, Springer Verlag, 2005. [xii, 5, 223, 248, 249, 250, 347]

[68] H. Maurer and H. J. Oberle, Second order sufficient conditions for optimal con-
trol problems with free final time: The Riccati approach, SIAM Journal on Control
and Optimization, 41, pp. 380–403 (2002). [219]

[69] H. Maurer and N. P. Osmolovskii, Second order optimality conditions for bang-
bang control problems, Control and Cybernetics, 32, pp. 555–584 (2003). [267]

[70] H. Maurer and N. P. Osmolovskii, Second order sufficient conditions for
time-optimal bang-bang control, SIAM Journal on Control and Optimization, 42,
pp. 2239–2263 (2004). [274]

[71] H. Maurer and H. J. Pesch, Solution differentiability for nonlinear parametric
control problems, SIAM Journal on Control and Optimization, 32, pp. 1542–1554
(1994). [205, 219]

[72] H. Maurer and H. J. Pesch, Solution differentiability for nonlinear parametric
control problems with control-state constraints, Journal of Optimization Theory and
Applications, 86, pp. 285–309 (1995). [219]

[73] H. Maurer and S. Pickenhain, Second order sufficient conditions for optimal
control problems with mixed control-state constraints, Journal of Optimization
Theory and Applications, 86, pp. 649–667 (1995). [219, 267]

[74] H. Maurer and G. Vossen, Sufficient conditions and sensitivity analysis for
bang-bang control problems with state constraints, In: Proceedings of the 23rd IFIP
Conference on System Modeling and Optimization, Cracow, Poland (A. Korytowski,
M. Szymkat, eds.), pp. 82–99, Springer Verlag, Berlin, 2009. [342, 343]
[75] H. Maurer and J. Zowe, First and second-order necessary and sufficient condi-
tions for infinite-dimensional programming problems, Mathematical Programming,
16, pp. 98–110 (1979). [25]
[76] A. A. Milyutin, Maximum Principle in the General Optimal Control Problem [in
Russian], Fizmatlit, Moscow (2001). [179]
[77] A. A. Milyutin, A. E. Ilyutovich, N. P. Osmolovskii, and S. V. Chukanov,
Optimal Control in Linear Systems [in Russian], Nauka, Moscow (1993). [293]
[78] A. A. Milyutin and N. P. Osmolovskii, (1) Higher-order minimum conditions
on a set of sequences in the abstract problem with inequality-type constraints;
(2) Higher-order minimum conditions on a set of sequences in the abstract problem
with inequality- and equality-type constraints; (3) Higher-order minimum condi-
tions with respect to a subsystem in the abstract minimization problems on a set of
sequences, In: Optimality of Control Dynamical Systems [in Russian], No. 14, All-
Union Institute of System Studies, Moscow, pp. 68–95 (1990); English translation:
Computational Mathematics and Modeling, 4, pp. 393–400 (1993). [9, 16]
[79] A. A. Milyutin and N. P. Osmolovskii, Calculus of Variations and Optimal Con-
trol, Translations of Mathematical Monographs, Vol. 180, American Mathematical
Society, Providence, 1998. [1, 2, 3, 4, 5, 30, 32, 71, 102, 103, 104, 105, 106, 107,
108, 109, 115, 127, 128, 130, 131, 207, 225, 229, 232, 255, 293, 294, 296, 297, 298]
[80] J. Noble and H. Schättler, Sufficient conditions for relative minima of bro-
ken extremals in optimal control theory, Journal of Mathematical Analysis and
Applications, 269, pp. 98–128 (2002). [270]
[81] H. J. Oberle, Numerical computation of singular control functions for a two-link
robot arm, In: Optimal Control, Lecture Notes in Control and Information Sciences,
95, pp. 244–253 (1987). [339, 346, 347, 349]
[82] H. J. Oberle and W. Grimm, BNDSCO—A program for the numerical solution of
optimal control problems, Institute for Flight Systems Dynamics, DLR, Oberpfaffen-
hofen, Germany, Internal Report No. 515–89/22, 1989. [204, 273, 287, 288, 291, 339]
[83] H. J. Oberle and H. J. Pesch, Private communication, 2000. [272]
[84] N. P. Osmolovskii, Second-order weak local minimum conditions in an optimal
control problem (necessity, sufficiency), Dokl. Akad. Nauk SSSR, 225, pp. 259–262
(1975); English translation: Soviet Math. Dokl., 15, pp. 1480–1484 (1975). [25]
[85] N. P. Osmolovskii, High-order necessary and sufficient conditions for Pontryagin
and bounded strong minima in the optimal control problems, Dokl. Akad. Nauk
SSSR Ser. Cybernetics and Control Theory, 303, pp. 1052–1056 (1988); English
translation: Soviet Physics Doklady, 33, pp. 883–885 (1988). [1, 2]

[86] N. P. Osmolovskii, Theory of higher-order conditions in optimal control, Doctoral
Thesis, MISI (Moscow Civil Engineering Institute), Moscow (1988). [1, 68, 173,
175]
[87] N. P. Osmolovskii, Second-order conditions in a time-optimal control problem
for linear systems, In: System Modelling and Optimization, Notes in Control and
Information Sciences, 143, pp. 368–376 (1989). [293]
[88] N. P. Osmolovskii, Quadratic conditions for nonsingular extremals in optimal
control (Theory), Russian Journal of Mathematical Physics, 2, pp. 487–516 (1995).
[2]
[89] N. P. Osmolovskii, Quadratic conditions for nonsingular extremals in optimal
control (examples), Russian Journal of Mathematical Physics, 5, pp. 487–516
(1998). [279]
[90] N. P. Osmolovskii, Second-order conditions for broken extremals, In: Calculus of
Variations and Optimal Control, Boca Raton, FL, pp. 198–216 (2000). [2]
[91] N. P. Osmolovskii, Second-order sufficient conditions for an extremum in optimal
control, Control and Cybernetics, 31, pp. 803–831 (2002).
[92] N. P. Osmolovskii, Quadratic optimality conditions for broken extremals in the
general problem of calculus of variations, Journal of Mathematical Sciences, 123,
pp. 3987–4122 (2004). [27, 68, 127]
[93] N. P. Osmolovskii, Second order conditions in optimal control problems with mixed
equality-type constraints on a variable time interval, Control and Cybernetics, 38,
pp. 1535–1556 (2009). [150]
[94] N. P. Osmolovskii, Sufficient quadratic conditions of extremum for discontinuous
controls in optimal control problem with mixed constraints, Journal of Mathematical
Sciences, 173, pp. 1–106 (2011). [4, 175]
[95] N. P. Osmolovskii, Necessary quadratic conditions of extremum for discontinuous
controls in optimal control problem with mixed constraints, Journal of Mathematical
Sciences, 183, pp. 435–576 (2012). [4, 68, 173, 175]
[96] N. P. Osmolovskii and F. Lempio, Jacobi conditions and the Riccati equation
for a broken extremal, Journal of Mathematical Sciences, 100, pp. 2572–2592
(2000). [2, 4, 212]
[97] N. P. Osmolovskii and F. Lempio, Transformation of quadratic forms to perfect
squares for broken extremals, Journal of Set-Valued Analysis, 10, pp. 209–223
(2002). [2, 4]
[98] N. P. Osmolovskii and H. Maurer, Second order sufficient optimality conditions
for a control problem with continuous and bang-bang control components: Riccati
approach, In: Proceedings of the 23rd IFIP Conference on System Modeling and
Optimization, Cracow, Poland (A. Korytowski, M. Szymkat, eds.), pp. 411–429,
Springer Verlag, Berlin, 2009. [223, 247]

[99] N. P. Osmolovskii and H. Maurer, Equivalence of second order optimality
conditions for bang-bang control problems. Part 1: Main results, Control and
Cybernetics, 34, pp. 927–950 (2005). [299]

[100] N. P. Osmolovskii and H. Maurer, Equivalence of second order optimality
conditions for bang-bang control problems. Part 2: Proofs, variational derivatives
and representations, Control and Cybernetics, 36, pp. 5–45 (2007). [299]

[101] H. J. Pesch, Real-time computation of feedback controls for constrained optimal
control problems, Part 1: Neighbouring extremals, Optimal Control Applications
and Methods, 10, pp. 129–145 (1989). [219, 339]

[102] H. J. Pesch, Real-time computation of feedback controls for constrained optimal
control problems, Part 2: A correction method based on multiple shooting, Optimal
Control Applications and Methods, 10, pp. 147–171 (1989). [339]

[103] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko,
The Mathematical Theory of Optimal Processes [in Russian], Fizmatgiz, Moscow;
English translation: Pergamon Press, New York, 1964. [225]

[104] A. V. Sarychev, First- and second-order sufficient optimality conditions for bang-
bang controls, SIAM Journal on Control and Optimization, 35, pp. 315–340 (1997). [279]

[105] È. È. Shnol', On the degeneration in the simplest problem of the calculus of
variations, Mat. Zametki, 24 (1978).

[106] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, 2nd ed., Springer-
Verlag, Berlin, 1992. [206, 339]

[107] H. J. Sussmann, The structure of time-optimal trajectories for single-input systems
in the plane: The C∞ nonsingular case, SIAM Journal on Control and Optimization,
25, pp. 433–465 (1987). [1]

[108] H. J. Sussmann, The structure of time-optimal trajectories for single-input systems
in the plane: The general real analytic case, SIAM Journal on Control and
Optimization, 25, pp. 868–904 (1987).

[109] M. G. Tagiev, Necessary and sufficient strong extremum condition in a degenerate
problem of the calculus of variations, Uspehi Mat. Nauk, 34, pp. 211–212 (1979).

[110] V. S. Vassiliadis, R. W. Sargent, and C. C. Pantelides, Solution of a class
of multistage dynamic optimization problems. 2. Problems with path constraints,
Industrial & Engineering Chemistry Research, 33, pp. 2123–2133 (1994). [357]

[111] G. Vossen, Numerische Lösungsmethoden, hinreichende Optimalitätsbedingungen
und Sensitivitätsanalyse für optimale bang-bang und singuläre Steuerungen,
Dissertation, Institut für Numerische und Angewandte Mathematik, Universität
Münster, Münster, Germany, 2005. [6, 365, 366]

[112] G. Vossen, Switching time optimization for bang-bang and singular controls,
Journal of Optimization Theory and Applications, 144, pp. 409–429 (2010). [6, 365]

[113] G. Vossen and H. Maurer, On L1-minimization in optimal control and applications
to robotics, Optimal Control Applications and Methods, 27, pp. 301–321 (2006). [6]
[114] A. Wächter and L. T. Biegler, On the implementation of an interior-point filter
line-search algorithm for large-scale nonlinear programming, Mathematical Pro-
gramming, 106, pp. 25–57 (2006); cf. IPOPT home page (C. Laird and A. Wächter):
https://projects.coin-or.org/Ipopt. [250]
[115] J. Warga, A second-order condition that strengthens Pontryagin’s maximum
principle, Journal of Differential Equations, 28, pp. 284–307 (1979).
[116] V. Zeidan, Sufficiency criteria via focal points and via coupled points, SIAM
Journal on Control and Optimization, 30, pp. 82–98 (1992). [219]
[117] V. Zeidan, The Riccati equation for optimal control problems with mixed state-
control constraints: Necessity and sufficiency, SIAM Journal on Control and Opti-
mization, 32, pp. 1297–1321 (1994). [219]
Index

absorbing set of sequences, 9
abstract set of sequences, 9
adjacent conditions, 13
admissible function (t, u), 51
admissible pair, 28, 164, 184
admissible pair (variation) with respect to Qtu, 52
admissible process, 224, 238, 274
admissible trajectory, 256
almost global minimum, 293
arc lengths, 341
arc-parametrization method, 342, 343
augmented Hamiltonian, 166
augmented Pontryagin function, 150, 166, 178
auxiliary minimization problem, 186
auxiliary problem V, 138

basic bang-bang control problem, 255
basic constant, 3, 13, 38
basic problem, 255, 299
batch-reactor, 357
bounded strong γ-sufficiency, 120, 139
bounded strong γ1-sufficiency, 231
bounded strong minimum, 117, 154, 173, 180, 185, 226, 240
bounded strong sequence on Q, 173

C-growth condition for H, 173
canonical problem, 27
canonical representation of a sequence, 41, 42
closure [·]2, 73
condition A, 33, 115, 137, 153, 173, 180, 227, 242, 263, 278, 294
Aco, 74
Aτ, 156
A0, 304
B, 33, 99, 142, 149, 154, 155, 228, 242, 263, 265, 278, 297
B(), 94, 174, 181
Bco(), 94
Bτ, 156
BV, 149
B0, 266, 304
ℵγ, 21
cone Ct, 177
Ct(), 176
C, 68
conjugate point, 188, 206
constant CK, 74
Cγ(m, g), 11
Cγ(ω0, K), 25
Cγ(0, K), 22
Cγ(0, σ), 22
Cγ(0, σγ), 22
Cγ̄(C; S4), 89
Cγ̄(C; S5), 92
Cγ(C; o(√γ)), 76
Cγ(C; loc o(√γ)), 79
Cγ(M; o(√γ)), 101
Cγ(0, σγ), 13, 22, 38
Cγ(˜1C; S1), 84
Cγ(˜1C; loc o(√γ)), 83
Cγ1(2C; S2), 85
Cγ2(2C; S2), 87
Cγ2(3C; S3), 88
Cγ(σ, ), 12
convexification of control system, 230
critical cone, 22, 25, 31, 132, 153, 170, 180, 227, 241, 260, 301
K0, 261
K0 in the induced problem, 303
Kτ, 158

KV, 147
Kζ, 110
KZ, 68
critical subspace, 186, 216, 277

decoding of the basic constant, 3

element b, 58
b̄, 63
b̄̄, 65
endpoint Lagrange function, 24, 150, 166, 178, 225, 239, 257
equivalent normalization, 13
essential component, 116, 139, 154, 173, 180, 226, 240
Euler equation, 184
external quadratic form, 172
extremal, 184
extremal trajectory, 275

family Ord(w0), 174
fine linear approximation, 11
full-rank condition, 127
function (H), 185
δH[t, v], 175
δu∗, 41
δu∗k, 41
δuV, 77
δuV, 77
δv, 42
δvk+, 42
δvk−, 42
δvk, 42
δwV, 77
δwloc, 77
δ̄uf, 122
δ̄uHλ, 122
(kH), 240, 257, 276
(kHλ)(t), 32
(kH̄), 152, 168
g(ζ), 305
H, 30, 130
H, 130
H̄, 130
Hτ, 157
HV, 142
l, 30, 130
lτ, 157
lV, 142
, 22, 25
0, 22
ϕ(t, x), 310
(t), 307
q(t; t0), 315
Q, 295
Q0, 295
r(t; t0, θ), 318
ρ(t, v), 176
u(t; θ), 300, 340
w(t; t0), 315
x(t; t0, x0, θ), 300
x(t; x0, θ), 340
χi, 176
χV, 77
χ∗, 41
χk∗, 41
χk+∗, 40
χk−∗, 40
y(t; t0, θ), 318
yk(t; tk), 306
zkj(t; θ) (k < j), 314
zkk(t), 314
functional γ1, 231
γ, 52
γ̄, 33, 89, 142, 155, 174, 180, 187, 228, 242
γV, 139, 149
γ1, 58, 85
γ2, 59, 63, 86, 87
0, 32
C, 89
m, 10
m+, 10
, 12, 37
, 58
C, 76
0, 12, 38
1C, 82
1λ, 50
2C, 85
2λ, 58, 85
3C, 87
3λ, 63, 87
˜1C, 83
˜1λ, 82
σ, 10, 100

γ-conditions, 12
γ-minimum, 175
γ-necessity on , 11
γ-sufficiency on , 12
γ-sufficiency with respect to the rigid subsystem, 175
general problem linear in a part of controls, 237
general problem of the calculus of variations, 2, 27
generalized strengthened Legendre condition, 177

Hamiltonian, 166, 224, 239, 275
higher order, 3, 11, 21
Hoffman lemma, 16, 23

Induced Optimization Problem (IOP), 300, 341, 342
Initial Value Problem (IVP), 300, 340
integral minimum principle, 167
internal quadratic form, 172

Jacobi conditions, 188
jump condition, 190, 216, 246
jump of a function, 2, 184

L-point, 28, 165
Lagrange function, 12, 37
for the IOP, 302
laser, 353
Legendre condition, 186
strengthened, 155, 186
Legendre element, 170
in nonstrict sense, 33
Legendre form, 191
Legendre–Clebsch condition, 169
limit cycle, 203
linear independence assumption, 165
Lipschitz point, 28
Lipschitz-continuous function, 178
local minimum principle, 24, 166
local quadratic growth of the Hamiltonian, 176
local sequence, 39
Lyusternik condition, 11
Lyusternik normalization, 14

main problem, 223
matrix Riccati equation, 189, 246
matrix-valued function R(t; x0, θ), 311
S(t; t0, x0), 317
V(t), 304
Wk(t; x0), 310
minimal fuel consumption of a car, 272
minimum condition, 167
of strictness C, 173
minimum on a set of sequences, 9, 28, 127, 165
Minimum Principle, 151, 179
of strictness C, 72

needle-shaped variation, 54
neighborhood V of u0, 39
neighborhood Vk of u0k, 39
nondegenerate case, 276, 279
nonlinear programming, 339
normal case, 276, 279
normalization condition, 12

operator π1, 303
optimal production and maintenance policies, 248
optimality conditions, 9
order, 11
order γ, 174
order function, 51, 180

-continuous mapping, 10
0-extension, 9
passage of quadratic form through zero, 191
point of L-discontinuity, 28
Pontryagin
function, 24, 150, 166, 178, 224, 239, 257, 275
γ-sufficiency, 100
minimum, 29, 128, 151, 165, 184, 224, 238, 256, 275
minimum principle, 30, 167, 185, 225, 239, 257, 275
sequence on Q, 165

problem A, 243
Aτ, 244
co A, 243
co Z, 230
P, 156
Pτ, 156
V, 138
Z, 229
ZN, 104
problem on a variable time interval, 150, 178
problem with a local equality, 127
production and maintenance, 361
projection, 128
projector π0, 302

Q-transformation of  on K, 246, 267, 281
quadratic conditions, 1
quadratic form, 22, 32, 227, 278
Āp̄, p̄, 263
Lζζ(μ, ζ̂)ζ̄, ζ̄, 319
ω, 153, 188
ωe, 153, 241
ωλ, 25, 172
ωλ, 172
, 153, 186, 227, 241, 263
#, 212
λ, 136, 172
Uτ, 159
Uλ, 137
Vλ, 148
ζ, 113
quadratic form in the IOP, 303
quantity Dk(H), 226, 257, 276
Dk(Hλ), 32
Dk(H̄λ), 168
Dk(H̄), 135, 152, 179

Rayleigh problem, 203, 344
time-optimal, 344
time-optimal control of the, 290
reduced Hessian, 342
regularity condition, 341
resource allocation problem, 263
Riccati equation, 216

second variation of Lagrange functional, 50
set A, 71
A0, 68
AC, 68
co 0, 30
D(), 103
F, 104
Fζ, 104
G, 128
, 103, 302
(), 104
0, 12, 24, 30, 36, 166, 215, 301
0 for the IOP, 302
0, 31, 121, 167
ζ0, 106
MP0, 302
ζ, 103
Leg+(0), 96
Leg+(M0+), 33, 96, 141, 155, 177, 228, 242
Leg+(M), 170
M0, 30, 130, 151, 167, 179, 215, 225, 236, 239, 257, 275
M(C), 94, 173, 181
M0co, 30
Mco(C), 72
M(C; co 0), 71
M0+, 33, 96, 141, 177, 215, 228, 242
M0V, 142
M0τ, 157
Mζ0, 108
M∗, 41
Mk∗, 41
Mk+∗, 40
Mk−∗, 40
MV, 77
N0, 104
Qtu, 39
R, 294
, 28, 165
U, 229
U(t, x), 30, 151, 167, 179
u0, 39
u0(tk−1, tk), 39
u0k, 39
V∗, 39
Vk+∗, 39
Vk−∗, 39
Vk∗, 39
V0, 39
%, 296
set of active indices, 11, 24, 31, 165, 170, 227, 241, 301
set of indices I, 56
I in the IOP, 303
I∗, 34
set of sequences S, 117
, 29
, 54
∗, 41
∗uk+, 41
∗uk−, 40
∗uk, 41
∗u, 41
+, 10
+g, 10
0, 29, 34
S, 116, 173
loc o(√γ), 76
loc σ1γ, 57
loc, 39
loc u, 39
loc σγ, 54
0, 9
g, 10
σγ, 38
σ, 22
σγ, 13
o(√γ), 76
S2, 65
S3, 65
S4, 66
S1, 84
S2, 85
S3, 88
S4, 89
S5, 91
simplest problem of the calculus of variations, 1, 183
singular control, 365
smooth problem, 11
smooth system, 11
space L1(, Rd(u)), 29
L2(, Rd(u)), 31
L∞([t0, tf], Rr), 24
L∞(, Rd(u)), 28
PC1(, Rd(x)), 260
PC1(, Rn), 276
PW1,2([t0, tf], Rm), 185
PW1,2([t0, tf], Rd(x)), 170
PW1,1(, Rd(x)), 60
PW1,2(, Rd(x)), 30
Rn∗, 24
W, 24, 28, 164, 184, 224
W1,1([t0, tf], Rn), 24
W1,∞(, (Rd(x))∗), 30
W1,1(, Rd(x)), 28
Z(), 67, 89
Z2(), 30, 170, 186, 216, 226
Z(), 260
Z2(), 180, 241
standard normalization, 12, 13
strict bang-bang control, 228, 259, 276
strict bounded strong minimum, 139, 154, 173, 180, 226, 240
strict minimum on a set of sequences, 10, 29, 127, 165
strict minimum principle, 33, 154, 185
strict order, 11, 21
strict Pontryagin minimum, 128
strict strong minimum, 154, 259, 275
strictly -differentiable mapping, 10
strictly Legendre element, 33, 141, 228, 242
strong minimum, 116, 154, 173, 259, 275
subspace K#, 212
K(τ), 191
K0, 188
L0, 68
L̃2([τ, tf], Rm), 195
T, 92
support of a function, 40
switching function, 225, 257, 275
for the u-component, 239
system S, 10

-conjugate point, 4, 198

table, 197
time-optimal bang-bang control problem, 274
time-optimal control problem with a single switching point, 264
time-optimal control problems for linear system with constant entries, 293
trajectory Tτ, 244
transformation to perfect square, 209, 217, 246, 271, 283
tuple of Lagrange multipliers, 30, 225, 239, 257, 301
two-link robot, 346

unessential component, 116, 139, 154, 173, 180, 226, 240

value ρ, 264
Van der Pol oscillator, 365
time-optimal control of a, 286
variable ζ of the IOP, 300
variational problem, 205
vector p̄, 241
violation function, 10, 38, 100, 139, 174, 230

weak minimum, 29, 165
Weierstrass–Erdmann conditions, 184
