CONTROL
SYSTEMS
Electrical Engineering
Textbook Series
Richard C. Dorf, Series Editor
University of California, Davis
Forthcoming and Published Titles
Applied Vector Analysis
Matiur Rahman and Isaac Mulolani
Continuous Signals and Systems with MATLAB
Taan ElAli and Mohammad A. Karim
Discrete Signals and Systems with MATLAB
Taan ElAli
Electromagnetics
Edward J. Rothwell and Michael J. Cloud
Optimal Control Systems
Desineni Subbaram Naidu
OPTIMAL
CONTROL
SYSTEMS
Desineni Subbaram Naidu
Idaho State University
Pocatello, Idaho, USA
CRC PRESS
Boca Raton London New York Washington, D.C.
Cover photo: Terminal phase (using fuel-optimal control) of the lunar landing of the Apollo 11 mission.
Courtesy of NASA.
Library of Congress Cataloging-in-Publication Data
Naidu, D. S. (Desineni S.), 1940-
Optimal control systems / by Desineni Subbaram Naidu.
p. cm. -- (Electrical engineering textbook series)
Includes bibliographical references and index.
ISBN 0-8493-0892-5 (alk. paper)
1. Automatic control. 2. Control theory. 3. Mathematical optimization. I. Title II.
Series.
2002067415
This book contains information obtained from authentic and highly regarded sources. Reprinted material
is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable
efforts have been made to publish reliable data and information, but the author and the publisher cannot
assume responsibility for the validity of all materials or for the consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic
or mechanical, including photocopying, microfilming, and recording, or by any information storage or
retrieval system, without prior permission in writing from the publisher.
The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for
creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC
for such copying.
Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation, without intent to infringe.
Visit the CRC Press Web site at www.crcpress.com
© 2003 by CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 0-8493-0892-5
Library of Congress Card Number 2002067415
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
"Because the shape of the whole universe is most perfect
and, in fact, designed by the wisest Creator, nothing
in all of the world will occur in which no maximum or
minimum rule is somehow shining forth."
Leonhard Euler, 1744
Dedication
My deceased parents who shaped my life
Desineni Rama Naidu
Desineni Subbamma
and
My teacher who shaped my education
Buggapati Audi Chetty
Preface
Many systems, whether physical, chemical, or economic, can be modeled
by mathematical relations, such as deterministic and/or stochastic
differential and/or difference equations. These systems then change with
time or any other independent variable according to the dynamical
relations. It is possible to steer these systems from one state to another
state by the application of some type of external inputs or controls.
If this can be done at all, there may be different ways of doing the
same task, and among them there may be one "best" way. This best way can
be, for example, the minimum time to go from one state to another state,
or the maximum thrust developed by a rocket engine. The input given to the
system corresponding to this best situation is called "optimal" control.
The measure of the "best" way or performance is called the "performance
index" or "cost function." Thus, we have an "optimal control system" when a
system is controlled in an optimum way satisfying a given performance
index. The theory of optimal control systems enjoyed a flourishing
period for nearly two decades after the dawn of the so-called "modern"
control theory around the 1960s. Interest in the theoretical and
practical aspects of the subject has been sustained by its applications to
such diverse fields as electrical power, aerospace, chemical plants,
economics, medicine, biology, and ecology.
Aim and Scope
In this book we are concerned essentially with the control of physical
systems that are "dynamic" and hence described by ordinary differential
or difference equations, in contrast to "static" systems, which are
characterized by algebraic equations. Further, our focus is on
"deterministic" systems only.
The development of optimal control theory in the sixties revolved
around the "maximum principle" proposed by the Soviet mathematician
L. S. Pontryagin and his colleagues, whose work was published in
English in 1962. Further contributions are due to R. E. Kalman of the
United States. Since then, many excellent books on optimal control
theory of varying levels of sophistication have been published.
This book is written keeping the student in mind and is intended
to provide the student a simplified treatment of the subject, with an
appropriate dose of mathematics. Another feature of this book is that it
assembles all the topics that can be covered in a one-semester class.
A special feature of this book is the presentation of the procedures in
the form of a summary table designed in terms of a statement of the
problem and a step-by-step solution of the problem. Further, MATLAB©
and SIMULINK©¹, including the Control System and Symbolic Math
Toolboxes, have been incorporated into the book. The book is ideally
suited for a one-semester, second-level graduate course in control
systems and optimization.
Background and Audience
This is a second-level graduate textbook, and as such the background
material required for using this book is a first course on control
systems, state space analysis, or linear systems theory. It is suggested that
the student review the material in Appendices A and B given at the
end of the book. This book is aimed at graduate students in Electrical,
Mechanical, Chemical, and Aerospace Engineering and Applied
Mathematics. It can also be used by professional scientists and engineers
working in a variety of industries and research organizations.
Acknowledgments
This book has grown out of my lecture notes prepared over many years
of teaching at the Indian Institute of Technology (IIT), Kharagpur, and
Idaho State University (ISU), Pocatello, Idaho. As such, I am indebted
to many of my teachers and students. In recent years at ISU, there have
been many people whom I would like to thank for their encouragement and
cooperation. First of all, I would like to thank the late Dean Hary
Charyulu for his encouragement of graduate work and research, which
kept me "live" in the area of optimal control. Also, I would like to mention
a special person, Kevin Moore, whose encouragement and cooperation
made my stay at ISU a very pleasant and scholarly productive one for
many years during 1990-98. During the last few years, Dean Kunze
and Associate Dean Stuffie have been of great help in providing the
right atmosphere for teaching and research work.
¹ MATLAB and SIMULINK are registered trademarks of The MathWorks, Inc., Natick, MA, USA.
Next, my students over the years were my best critics in providing
many helpful suggestions. Among the many, special mention must be
made of Martin Murillo, Yoshiko Imura, and Keith Fisher, who
made several suggestions on my manuscript. In particular, Craig Rieger
(of the Idaho National Engineering and Environmental Laboratory
(INEEL)) deserves special mention for having infinite patience in
writing and testing programs in MATLAB© to obtain analytical solutions
to matrix Riccati differential and difference equations.
The camera-ready copy of this book was prepared by the author
using LaTeX of the PCTeX32² Version 4.0. The figures were drawn
using CorelDRAW³ and exported into the LaTeX document.
Several people at the publishing company CRC Press deserve mention.
Among them, special mention must be made of Nora Konopka,
Acquisition Editor, Electrical Engineering, for her interest,
understanding, and patience with me in seeing this book to completion.
Also, thanks are due to Michael Buso, Michelle Reyes, Helena Redshaw,
and Judith Simon Kamin. I would like to make a special mention of Sean Davey,
who helped me with many issues regarding LaTeX. Any corrections and
suggestions are welcome via email to naiduds@isu.edu.
Finally, it is my pleasant duty to thank my wife, Sita, and my
daughters, Radhika and Kiranmai, who have been a great source of
encouragement and cooperation throughout my academic life.
Desineni Subbaram Naidu
Pocatello, Idaho
June 2002
² PCTeX is a registered trademark of Personal TeX, Inc., Mill Valley, CA.
³ CorelDRAW is a registered trademark of Corel Corporation or Corel Corporation Limited.
ACKNOWLEDGMENTS
The permissions given by
1. Prentice Hall for D. E. Kirk, Optimal Control Theory: An
Introduction, Prentice Hall, Englewood Cliffs, NJ, 1970,
2. John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons,
Inc., New York, NY, 1986,
3. McGraw-Hill for M. Athans and P. L. Falb, Optimal Control:
An Introduction to the Theory and Its Applications, McGraw-Hill
Book Company, New York, NY, 1966, and
4. Springer-Verlag for H. H. Goldstine, A History of the Calculus of
Variations, Springer-Verlag, New York, NY, 1980,
are hereby acknowledged.
AUTHOR'S BIOGRAPHY
Desineni "Subbaram" Naidu received his B.E. degree in Electrical Engineering
from Sri Venkateswara University, Tirupati, India, and M.Tech. and Ph.D.
degrees in Control Systems Engineering from the Indian Institute of Technology
(IIT), Kharagpur, India. He held various positions with the Department of
Electrical Engineering at IIT. Dr. Naidu was a recipient of a Senior National
Research Council (NRC) Associateship of the National Academy of Sciences,
Washington, DC, tenable at NASA Langley Research Center, Hampton,
Virginia, during 1985-87 and at the U.S. Air Force Research Laboratory
(AFRL) at Wright-Patterson Air Force Base (WPAFB), Ohio, during 1998-99.
During 1987-90, he was an adjunct faculty member in the Department of
Electrical and Computer Engineering at Old Dominion University, Norfolk,
Virginia. Since August 1990, Dr. Naidu has been a professor at Idaho State
University. At present he is Director of the Measurement and Control
Engineering Research Center; Coordinator, Electrical Engineering program; and
Associate Dean of Graduate Studies in the College of Engineering, Idaho State
University, Pocatello, Idaho.
Dr. Naidu has over 150 publications including a research monograph,
Singular Perturbation Analysis of Discrete Control Systems, Lecture Notes in
Mathematics, 1985; a book, Singular Perturbation Methodology in Control
Systems, IEE Control Engineering Series, 1988; and a research monograph
entitled Aeroassisted Orbital Transfer: Guidance and Control Strategies,
Lecture Notes in Control and Information Sciences, 1994.
Dr. Naidu is (or has been) a member of the Editorial Boards of the IEEE
Transactions on Automatic Control (1993-99), the International Journal of
Robust and Nonlinear Control (1996-present), the International Journal of
Control-Theory and Advanced Technology (C-TAT) (1992-1996), and a member
of the Editorial Advisory Board of Mechatronics: The Science of Intelligent
Machines, an International Journal (1992-present).
Professor Naidu is an elected Fellow of The Institute of Electrical and
Electronics Engineers (IEEE), a Fellow of the World Innovation Foundation (WIF),
an Associate Fellow of the American Institute of Aeronautics and Astronautics
(AIAA), and a member of several other organizations such as SIAM and ASEE.
Dr. Naidu was a recipient of the Idaho State University Outstanding
Researcher Award for 1993-94 and 1994-95 and the Distinguished Researcher
Award for 1994-95. Professor Naidu's biography is listed (multiple years) in
Who's Who among America's Teachers, the Silver Anniversary 25th Edition
of Who's Who in the West, Who's Who in Technology, and The International
Directory of Distinguished Leadership.
Contents
1 Introduction
1.1 Classical and Modern Control
1.2 Optimization
1.3 Optimal Control
1.3.1 Plant
1.3.2 Performance Index
1.3.3 Constraints
1.3.4 Formal Statement of Optimal Control System
1.4 Historical Tour
1.4.1 Calculus of Variations
1.4.2 Optimal Control Theory
1.5 About This Book
1.6 Chapter Overview
1.7 Problems
2 Calculus of Variations and Optimal Control
2.1 Basic Concepts
2.1.1 Function and Functional
2.1.2 Increment
2.1.3 Differential and Variation
2.2 Optimum of a Function and a Functional
2.3 The Basic Variational Problem
2.3.1 Fixed-End Time and Fixed-End State System
2.3.2 Discussion on Euler-Lagrange Equation
2.3.3 Different Cases for Euler-Lagrange Equation
2.4 The Second Variation
2.5 Extrema of Functions with Conditions
2.5.1 Direct Method
2.5.2 Lagrange Multiplier Method
2.6 Extrema of Functionals with Conditions
2.7 Variational Approach to Optimal Control Systems
2.7.1 Terminal Cost Problem
2.7.2 Different Types of Systems
2.7.3 Sufficient Condition
2.7.4 Summary of Pontryagin Procedure
2.8 Summary of Variational Approach
2.8.1 Stage I: Optimization of a Functional
2.8.2 Stage II: Optimization of a Functional with Condition
2.8.3 Stage III: Optimal Control System with Lagrangian Formalism
2.8.4 Stage IV: Optimal Control System with Hamiltonian Formalism: Pontryagin Principle
2.8.5 Salient Features
2.9 Problems
3 Linear Quadratic Optimal Control Systems I
3.1 Problem Formulation
3.2 Finite-Time Linear Quadratic Regulator
3.2.1 Symmetric Property of the Riccati Coefficient Matrix
3.2.2 Optimal Control
3.2.3 Optimal Performance Index
3.2.4 Finite-Time Linear Quadratic Regulator: Time-Varying Case: Summary
3.2.5 Salient Features
3.2.6 LQR System for General Performance Index
3.3 Analytical Solution to the Matrix Differential Riccati Equation
3.3.1 MATLAB© Implementation of Analytical Solution to Matrix DRE
3.4 Infinite-Time LQR System I
3.4.1 Infinite-Time Linear Quadratic Regulator: Time-Varying Case: Summary
3.5 Infinite-Time LQR System II
3.5.1 Meaningful Interpretation of Riccati Coefficient
3.5.2 Analytical Solution of the Algebraic Riccati Equation
3.5.3 Infinite-Interval Regulator System: Time-Invariant Case: Summary
3.5.4 Stability Issues of Time-Invariant Regulator
3.5.5 Equivalence of Open-Loop and Closed-Loop Optimal Controls
3.6 Notes and Discussion
3.7 Problems
4 Linear Quadratic Optimal Control Systems II
4.1 Linear Quadratic Tracking System: Finite-Time Case
4.1.1 Linear Quadratic Tracking System: Summary
4.1.2 Salient Features of Tracking System
4.2 LQT System: Infinite-Time Case
4.3 Fixed-End-Point Regulator System
4.4 LQR with a Specified Degree of Stability
4.4.1 Regulator System with Prescribed Degree of Stability: Summary
4.5 Frequency-Domain Interpretation
4.5.1 Gain Margin and Phase Margin
4.6 Problems
5 Discrete-Time Optimal Control Systems
5.1 Variational Calculus for Discrete-Time Systems
5.1.1 Extremization of a Functional
5.1.2 Functional with Terminal Cost
5.2 Discrete-Time Optimal Control Systems
5.2.1 Fixed-Final State and Open-Loop Optimal Control
5.2.2 Free-Final State and Open-Loop Optimal Control
5.3 Discrete-Time Linear State Regulator System
5.3.1 Closed-Loop Optimal Control: Matrix Difference Riccati Equation
5.3.2 Optimal Cost Function
5.4 Steady-State Regulator System
5.4.1 Analytical Solution to the Riccati Equation
5.5 Discrete-Time Linear Quadratic Tracking System
5.6 Frequency-Domain Interpretation
5.7 Problems
6 Pontryagin Minimum Principle
6.1 Constrained System
6.2 Pontryagin Minimum Principle
6.2.1 Summary of Pontryagin Principle
6.2.2 Additional Necessary Conditions
6.3 Dynamic Programming
6.3.1 Principle of Optimality
6.3.2 Optimal Control Using Dynamic Programming
6.3.3 Optimal Control of Discrete-Time Systems
6.3.4 Optimal Control of Continuous-Time Systems
6.4 The Hamilton-Jacobi-Bellman Equation
6.5 LQR System Using HJB Equation
6.6 Notes and Discussion
7 Constrained Optimal Control Systems
7.1 Constrained Optimal Control
7.1.1 Time-Optimal Control of LTI System
7.1.2 Problem Formulation and Statement
7.1.3 Solution of the TOC System
7.1.4 Structure of Time-Optimal Control System
7.2 TOC of a Double Integral System
7.2.1 Problem Formulation and Statement
7.2.2 Problem Solution
7.2.3 Engineering Implementation of Control Law
7.2.4 SIMULINK© Implementation of Control Law
7.3 Fuel-Optimal Control Systems
7.3.1 Fuel-Optimal Control of a Double Integral System
7.3.2 Problem Formulation and Statement
7.3.3 Problem Solution
7.4 Minimum-Fuel System: LTI System
7.4.1 Problem Statement
7.4.2 Problem Solution
7.4.3 SIMULINK© Implementation of Control Law
7.5 Energy-Optimal Control Systems
7.5.1 Problem Formulation and Statement
7.5.2 Problem Solution
7.6 Optimal Control Systems with State Constraints
7.6.1 Penalty Function Method
7.6.2 Slack Variable Method
7.7 Problems
Appendix A: Vectors and Matrices
A.1 Vectors
A.2 Matrices
A.3 Quadratic Forms and Definiteness
Appendix B: State Space Analysis
B.1 State Space Form for Continuous-Time Systems
B.2 Linear Matrix Equations
B.3 State Space Form for Discrete-Time Systems
B.4 Controllability and Observability
B.5 Stabilizability, Reachability and Detectability
Appendix C: MATLAB Files
C.1 MATLAB© for Matrix Differential Riccati Equation
C.1.1 MATLAB File lqrnss.m
C.1.2 MATLAB File lqrnssf.m
C.2 MATLAB© for Continuous-Time Tracking System
C.2.1 MATLAB File for Example 4.1 (example4_1.m)
C.2.2 MATLAB File for Example 4.1 (example4_1p.m)
C.2.3 MATLAB File for Example 4.1 (example4_1g.m)
C.2.4 MATLAB File for Example 4.1 (example4_1x.m)
C.2.5 MATLAB File for Example 4.2 (example4_2.m)
C.2.6 MATLAB File for Example 4.2 (example4_2p.m)
C.2.7 MATLAB File for Example 4.2 (example4_2g.m)
C.2.8 MATLAB File for Example 4.2 (example4_2x.m)
C.3 MATLAB© for Matrix Difference Riccati Equation
C.3.1 MATLAB File lqrdnss.m
C.4 MATLAB© for Discrete-Time Tracking System
References
Index
List of Figures
1.1 Classical Control Configuration
1.2 Modern Control Configuration
1.3 Components of a Modern Control System
1.4 Overview of Optimization
1.5 Optimal Control Problem
2.1 Increment $\Delta f$, Differential $df$, and Derivative $\dot{f}$ of a Function $f(t)$
2.2 Increment $\Delta J$ and the First Variation $\delta J$ of the Functional $J$
2.3 (a) Minimum and (b) Maximum of a Function $f(t)$
2.4 Fixed-End Time and Fixed-End State System
2.5 A Nonzero $g(t)$ and an Arbitrary $\delta x(t)$
2.6 Arc Length
2.7 Free-Final Time and Free-Final State System
2.8 Final-Point Condition with a Moving Boundary B(t)
2.9 Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) Free-Final Time and Fixed-Final State System, (c) Fixed-Final Time and Free-Final State System, (d) Free-Final Time and Free-Final State System
2.10 Optimal Controller for Example 2.12
2.11 Optimal Control and States for Example 2.12
2.12 Optimal Control and States for Example 2.13
2.13 Optimal Control and States for Example 2.14
2.14 Optimal Control and States for Example 2.15
2.15 Open-Loop Optimal Control
2.16 Closed-Loop Optimal Control
3.1 State and Costate System
3.2 Closed-Loop Optimal Control Implementation
3.3 Riccati Coefficients for Example 3.1
3.4 Closed-Loop Optimal Control System for Example 3.1
3.5 Optimal States for Example 3.1
3.6 Optimal Control for Example 3.1
3.7 Interpretation of the Constant Matrix P
3.8 Implementation of the Closed-Loop Optimal Control: Infinite Final Time
3.9 Closed-Loop Optimal Control System
3.10 Optimal States for Example 3.2
3.11 Optimal Control for Example 3.2
3.12 (a) Open-Loop Optimal Controller (OLOC) and (b) Closed-Loop Optimal Controller (CLOC)
4.1 Implementation of the Optimal Tracking System
4.2 Riccati Coefficients for Example 4.1
4.3 Coefficients $g_1(t)$ and $g_2(t)$ for Example 4.1
4.4 Optimal States for Example 4.1
4.5 Optimal Control for Example 4.1
4.6 Riccati Coefficients for Example 4.2
4.7 Coefficients $g_1(t)$ and $g_2(t)$ for Example 4.2
4.8 Optimal Control and States for Example 4.2
4.9 Optimal Control and States for Example 4.2
4.10 Optimal Closed-Loop Control in Frequency Domain
4.11 Closed-Loop Optimal Control System with Unity Feedback
4.12 Nyquist Plot of $G_o(j\omega)$
4.13 Intersection of Unit Circles Centered at Origin and $-1 + j0$
5.1 State and Costate System
5.2 Closed-Loop Optimal Controller for Linear Discrete-Time Regulator
5.3 Riccati Coefficients for Example 5.3
5.4 Optimal Control and States for Example 5.3
5.5 Optimal Control and States for Example 5.3
5.6 Closed-Loop Optimal Control for Discrete-Time Steady-State Regulator System
5.7 Implementation of Optimal Control for Example 5.4
5.8 Implementation of Optimal Control for Example 5.4
5.9 Riccati Coefficients for Example 5.5
5.10 Optimal States for Example 5.5
5.11 Optimal Control for Example 5.5
5.12 Implementation of Discrete-Time Optimal Tracker
5.13 Riccati Coefficients for Example 5.6
5.14 Coefficients $g_1(t)$ and $g_2(t)$ for Example 5.6
5.15 Optimal States for Example 5.6
5.16 Optimal Control for Example 5.6
5.17 Closed-Loop Discrete-Time Optimal Control System
6.1 (a) An Optimal Control Function Constrained by a Boundary (b) A Control Variation for Which $\delta u(t)$ Is Not Admissible
6.2 Illustration of Constrained (Admissible) Controls
6.3 Optimal Path from A to B
6.4 A Multistage Decision Process
6.5 A Multistage Decision Process: Backward Solution
6.6 A Multistage Decision Process: Forward Solution
6.7 Dynamic Programming Framework of Optimal State Feedback Control
6.8 Optimal Path from A to B
7.1 Signum Function
7.2 Time-Optimal Control
7.3 Normal Time-Optimal Control System
7.4 Singular Time-Optimal Control System
7.5 Open-Loop Structure for Time-Optimal Control System
7.6 Closed-Loop Structure for Time-Optimal Control System
7.7 Possible Costates and the Corresponding Controls
7.8 Phase Plane Trajectories for $u = +1$ (dashed lines) and $u = -1$ (dotted lines)
7.9 Switch Curve for Double Integral Time-Optimal Control System
7.10 Various Trajectories Generated by Four Possible Control Sequences
7.11 Closed-Loop Implementation of Time-Optimal Control Law
7.12 SIMULINK© Implementation of Time-Optimal Control Law
7.13 Phase-Plane Trajectory for $\gamma_+$: Initial State (2,2) and Final State (0,0)
7.14 Phase-Plane Trajectory for $\gamma_-$: Initial State (2,2) and Final State (0,0)
7.15 Phase-Plane Trajectory for $R_+$: Initial State (1,1) and Final State (0,0)
7.16 Phase-Plane Trajectory for $R_-$: Initial State (1,1) and Final State (0,0)
7.17 Relations Between $\lambda_2(t)$ and $|u^*(t)| + u^*(t)\lambda_2(t)$
7.18 Dead-Zone Function
7.19 Fuel-Optimal Control
7.20 Switching Curve for a Double Integral Fuel-Optimal Control System
7.21 Phase-Plane Trajectories for $u(t) = 0$
7.22 Fuel-Optimal Control Sequences
7.23 $\epsilon$-Fuel-Optimal Control
7.24 Optimal Control as Dead-Zone Function
7.25 Normal Fuel-Optimal Control System
7.26 Singular Fuel-Optimal Control System
7.27 Open-Loop Implementation of Fuel-Optimal Control System
7.28 Closed-Loop Implementation of Fuel-Optimal Control System
7.29 SIMULINK© Implementation of Fuel-Optimal Control Law
7.30 Phase-Plane Trajectory for $\gamma_+$: Initial State (2,2) and Final State (0,0)
7.31 Phase-Plane Trajectory for $\gamma_-$: Initial State (2,2) and Final State (0,0)
7.32 Phase-Plane Trajectory for $R_1$: Initial State (1,1) and Final State (0,0)
7.33 Phase-Plane Trajectory for $R_3$: Initial State (1,1) and Final State (0,0)
7.34 Phase-Plane Trajectory for $R_2$: Initial State (1.5,1) and Final State (0,0)
7.35 Phase-Plane Trajectory for $R_4$: Initial State (1.5,1) and Final State (0,0)
7.36 Saturation Function
7.37 Energy-Optimal Control
7.38 Open-Loop Implementation of Energy-Optimal Control System
7.39 Closed-Loop Implementation of Energy-Optimal Control System
7.40 Relation between Optimal Control $u^*(t)$ vs (a) $q^*(t)$ and (b) $0.5\lambda^*(t)$
7.41 Possible Solutions of Optimal Costate $\lambda^*(t)$
7.42 Implementation of Energy-Optimal Control Law
7.43 Relation between Optimal Control $u^*(t)$ and Optimal Costate $\lambda_2(t)$
List of Tables
2.1 Procedure Summary of Pontryagin Principle for Bolza Problem
3.1 Procedure Summary of Finite-Time Linear Quadratic Regulator System: Time-Varying Case
3.2 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying Case
3.3 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant Case
4.1 Procedure Summary of Linear Quadratic Tracking System
4.2 Procedure Summary of Regulator System with Prescribed Degree of Stability
5.1 Procedure Summary of Discrete-Time Optimal Control System: Fixed-End Points Condition
5.2 Procedure Summary for Discrete-Time Optimal Control System: Free-Final Point Condition
5.3 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System
5.4 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System: Steady-State Condition
5.5 Procedure Summary of Discrete-Time Linear Quadratic Tracking System
6.1 Summary of Pontryagin Minimum Principle
6.2 Computation of Cost during the Last Stage k = 2
6.3 Computation of Cost during the Stage k = 1,0
6.4 Procedure Summary of Hamilton-Jacobi-Bellman (HJB) Approach
7.1 Procedure Summary of Optimal Control Systems with State Constraints
Chapter 1
Introduction
In this first chapter, we introduce the ideas behind optimization and
optimal control and provide a brief history of calculus of variations and
optimal control. Also, a brief summary of chapter contents is presented.
1.1 Classical and Modern Control
The classical (conventional) control theory, concerned with single-input, single-output (SISO) systems, is mainly based on Laplace transform theory and its use in system representation in block diagram form. From Figure 1.1, we see that

$$\frac{Y(s)}{R(s)} = \frac{G(s)}{1 + G(s)H(s)} \tag{1.1.1}$$

where $s$ is the Laplace variable and we used

$$G(s) = G_c(s)G_p(s). \tag{1.1.2}$$

[Figure 1.1 Classical Control Configuration: the reference input R(s) and the signal fed back through H(s) form the error signal E(s), which drives the compensator Gc(s); the compensator output is the control input U(s) to the plant Gp(s), whose output is Y(s).]
Note that
1. the input u(t) to the plant is determined by the error e(t) and
the compensator, and
2. not all the variables are readily available for feedback. In most
cases, only one output variable is available for feedback.
The modern control theory, concerned with multiple-input, multiple-output (MIMO) systems, is based on state variable representation in terms of a set of first-order differential (or difference) equations. Here, the system (plant) is characterized by state variables, say, in linear, time-invariant form as

$$\dot{x}(t) = Ax(t) + Bu(t) \tag{1.1.3}$$
$$y(t) = Cx(t) + Du(t) \tag{1.1.4}$$

where the dot denotes differentiation with respect to (w.r.t.) $t$; $x(t)$, $u(t)$, and $y(t)$ are the $n$-, $r$-, and $m$-dimensional state, control, and output vectors, respectively; and $A$ is the $n \times n$ state, $B$ the $n \times r$ input, $C$ the $m \times n$ output, and $D$ the $m \times r$ transfer matrix. Similarly, a nonlinear system is characterized by

$$\dot{x}(t) = f(x(t), u(t), t) \tag{1.1.5}$$
$$y(t) = g(x(t), u(t), t). \tag{1.1.6}$$
The modern theory dictates that all the state variables should be fed back after suitable weighting. We see from Figure 1.2 that in the modern control configuration,
1. the input u(t) is determined by the controller (consisting of an error detector and a compensator) driven by the system states x(t) and the reference signal r(t),
2. all or most of the state variables are available for control, and
3. it depends on well-established matrix theory, which is amenable to large-scale computer simulation.
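The remark about computer simulation can be made concrete with a minimal sketch, assuming a hypothetical double-integrator plant (the matrices, the unit-step input, and the forward-Euler step count below are illustrative choices, not taken from the text). It integrates the state equation (1.1.3) and evaluates the output equation (1.1.4) at the final time.

```python
import numpy as np

def simulate_lti(A, B, C, D, u_func, x0, t_final, n_steps):
    """Forward-Euler simulation of x' = A x + B u with output y = C x + D u.

    u_func maps time t to the control (scalar or 1-D array).
    Returns the final state and the output at the final time.
    """
    dt = t_final / n_steps
    x = np.asarray(x0, dtype=float).copy()
    for k in range(n_steps):
        u = np.atleast_1d(u_func(k * dt))
        x = x + dt * (A @ x + B @ u)       # one Euler step of (1.1.3)
    u_end = np.atleast_1d(u_func(t_final))
    y = C @ x + D @ u_end                   # output equation (1.1.4)
    return x, y

# Hypothetical plant: double integrator (position, velocity), n = 2, r = 1, m = 1.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# Unit-step input applied from rest for one time unit.
x_final, y_final = simulate_lti(A, B, C, D, lambda t: 1.0, [0.0, 0.0], 1.0, 1000)
```

Forward Euler is used only because it is the shortest integrator to write down; a higher-order scheme would normally be preferred for accuracy.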
[Figure 1.2 Modern Control Configuration: the controller C, driven by the reference input r(t) and the state x(t), produces the control input u(t) for the plant P, whose output is y(t).]
A state variable representation uniquely specifies the transfer function, whereas a given transfer function admits many state variable representations. In this sense, the state variable representation is a more complete description of a system.
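The non-uniqueness of the state variable representation can be checked numerically. The sketch below assumes a hypothetical second-order system and an arbitrary nonsingular matrix T (both illustrative); it uses the standard relation G(s) = C(sI - A)^{-1}B + D and shows that the transformed representation (TAT^{-1}, TB, CT^{-1}, D) yields the same transfer function value.

```python
import numpy as np

def transfer_function(A, B, C, D, s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D at one complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

# Hypothetical system with G(s) = 1 / (s^2 + 3s + 2).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# Any nonsingular T defines new state variables z = T x.
T = np.array([[1.0, 1.0], [0.0, 2.0]])
T_inv = np.linalg.inv(T)
A2, B2, C2, D2 = T @ A @ T_inv, T @ B, C @ T_inv, D

s = 1.0 + 2.0j
G1 = transfer_function(A, B, C, D, s)
G2 = transfer_function(A2, B2, C2, D2, s)   # same value as G1
```

The two state-space quadruples differ, yet they describe the same input-output behavior.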
Figure 1.3 shows the three components of a modern control system and their important contributors. The first stage of any control system theory is to obtain or formulate the dynamics, or model, in terms of dynamical equations such as differential or difference equations. The system dynamics is largely based on the Lagrangian function. Next, the system is analyzed for its performance, mainly to establish its stability; the contributions of Lyapunov to stability theory are well known. Finally, if the system performance does not meet our specifications, we resort to design [25, 109]. In optimal control theory, the design is usually with respect to a performance index. We notice that although concepts such as the Lagrange function [85] and the V function of Lyapunov [94] are old, the techniques using those concepts are modern. Again, as the word modern usually refers to time, and what is modern today becomes ancient after a few years, it is more appropriate to label these topics as optimal control, nonlinear control, adaptive control, robust control, and so on.
[Figure 1.3 Components of a Modern Control System: System Dynamics (Modeling), State Function of Lagrange (1788); System Analysis (Performance), V Function of Lyapunov (1892); System Synthesis (Design), H Function of Pontryagin (1956).]
1.2 Optimization
Optimization is a very desirable feature in day-to-day life. We like to work and use our time in an optimum manner, use resources optimally, and so on. The subject of optimization is quite general in the sense that it can be viewed in different ways depending on the approach (algebraic or geometric), the interest (single or multiple), the nature of the signals (deterministic or stochastic), and the stage (single or multiple) used in optimization. This is shown in Figure 1.4. We notice that the calculus of variations is one small area of the big picture of the optimization field, and it forms the basis for our study of optimal control systems. Further, optimization can be classified as static optimization and dynamic optimization.
1. Static Optimization is concerned with controlling a plant under steady-state conditions, i.e., the system variables are not changing with respect to time. The plant is then described by algebraic equations. Techniques used are ordinary calculus, Lagrange multipliers, and linear and nonlinear programming.
2. Dynamic Optimization is concerned with the optimal control of plants under dynamic conditions, i.e., the system variables are changing with respect to time, and thus time is involved in the system description. The plant is then described by differential (or difference) equations. Techniques used are search techniques, dynamic programming, variational calculus (or calculus of variations), and the Pontryagin principle.

[Figure 1.4 Overview of Optimization. Labels recoverable from the chart: Algebraic Approach; Calculus and Lagrange Multipliers; Linear and Nonlinear Programming; Functional Analysis; Multiple Interest; Game Theory; Stochastic; Multiple Stage; Dynamic Programming.]
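As a small illustration of the static case, the sketch below applies the Lagrange multiplier technique to a hypothetical problem that is not from the text: minimize x1^2 + x2^2 subject to x1 + x2 = 1. Setting the partial derivatives of the Lagrangian to zero gives a set of linear algebraic equations, which is solved directly.

```python
import numpy as np

# Hypothetical static problem: minimize x1^2 + x2^2 subject to x1 + x2 = 1.
# Lagrangian: L = x1^2 + x2^2 + lam * (x1 + x2 - 1).
# Stationarity and the constraint give three linear equations:
#   2*x1 + lam = 0,   2*x2 + lam = 0,   x1 + x2 = 1.
K = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, 1.0])

x1, x2, lam = np.linalg.solve(K, rhs)
cost = x1**2 + x2**2   # value of the objective at the stationary point
```

No time variable appears anywhere in this problem, which is what distinguishes it from the dynamic case, where the unknown is a function of time rather than a point.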
1.3 Optimal Control
The main objective of optimal control is to determine control signals that will cause a process (plant) to satisfy some physical constraints and at the same time extremize (maximize or minimize) a chosen performance criterion (performance index or cost function). Referring to Figure 1.2, we are interested in finding the optimal control u*(t) (* indicates optimal condition) that will drive the plant P from the initial state to the final state with some constraints on controls and states, while at the same time extremizing the given performance index J.
The formulation of an optimal control problem requires
1. a mathematical description (or model) of the process to be controlled (generally in state variable form),
2. a specification of the performance index, and
3. a statement of boundary conditions and the physical constraints on the states and/or controls.
1.3.1 Plant
For the purpose of optimization, we describe a physical plant by a set of linear or nonlinear differential or difference equations. For example, a linear time-invariant system is described by the state and output relations (1.1.3) and (1.1.4), and a nonlinear system by (1.1.5) and (1.1.6).
1.3.2 Performance Index
Classical control design techniques have been successfully applied to linear, time-invariant, single-input, single-output (SISO) systems. Typical performance criteria are the system time response to a step or ramp input, characterized by rise time, settling time, peak overshoot, and steady-state accuracy; and the frequency response of the system, characterized by gain and phase margins and bandwidth.
In modern control theory, the optimal control problem is to find a control which causes the dynamical system to reach a target or follow a state variable (or trajectory) and at the same time extremize a performance index, which may take several forms as described below.
1. Performance Index for Time-Optimal Control System: We try to transfer a system from an arbitrary initial state $x(t_0)$ to a specified final state $x(t_f)$ in minimum time. The corresponding performance index (PI) is

$$J = \int_{t_0}^{t_f} dt = t_f - t_0 = t^*. \tag{1.3.1}$$
2. Performance Index for Fuel-Optimal Control System: Consider a spacecraft problem. Let $u(t)$ be the thrust of a rocket engine and assume that the magnitude $|u(t)|$ of the thrust is proportional to the rate of fuel consumption. In order to minimize the total expenditure of fuel, we may formulate the performance index as

$$J = \int_{t_0}^{t_f} |u(t)|\,dt. \tag{1.3.2}$$

For several controls, we may write it as

$$J = \int_{t_0}^{t_f} \sum_{i=1}^{r} R_i\,|u_i(t)|\,dt \tag{1.3.3}$$

where $R_i$ is a weighting factor.
3. Performance Index for Minimum-Energy Control System: Consider $u_i(t)$ as the current in the $i$th loop of an electric network. Then $\sum_i u_i^2(t)\,r_i$ (where $r_i$ is the resistance of the $i$th loop) is the total power, or the total rate of energy expenditure, of the network. Then, for minimization of the total expended energy, we have a performance criterion as

$$J = \int_{t_0}^{t_f} \sum_i u_i^2(t)\,r_i\,dt \tag{1.3.4}$$

or in general,

$$J = \int_{t_0}^{t_f} u'(t)Ru(t)\,dt \tag{1.3.5}$$

where $R$ is a positive definite matrix and the prime (') denotes transpose here and throughout this book (see Appendix A for more details on definite matrices).
Similarly, we can think of minimization of the integral of the squared error of a tracking system. We then have

$$J = \int_{t_0}^{t_f} x'(t)Qx(t)\,dt \tag{1.3.6}$$

where $x_d(t)$ is the desired value, $x_a(t)$ is the actual value, and $x(t) = x_a(t) - x_d(t)$ is the error. Here, $Q$ is a weighting matrix, which can be positive semidefinite.
4. Performance Index for Terminal Control System: In a terminal target problem, we are interested in minimizing the error between the desired target position $x_d(t_f)$ and the actual target position $x_a(t_f)$ at the end of the maneuver, i.e., at the final time $t_f$. The terminal (final) error is $x(t_f) = x_a(t_f) - x_d(t_f)$. Taking care of positive and negative values of the error and of weighting factors, we structure the cost function as

$$J = x'(t_f)Fx(t_f) \tag{1.3.7}$$

which is also called the terminal cost function. Here, $F$ is a positive semidefinite matrix.
5. Performance Index for General Optimal Control System: Combining the above formulations, we have a performance index in general form as

$$J = x'(t_f)Fx(t_f) + \int_{t_0}^{t_f} [x'(t)Qx(t) + u'(t)Ru(t)]\,dt \tag{1.3.8}$$

or,

$$J = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{1.3.9}$$

where $R$ is a positive definite matrix, and $Q$ and $F$ are positive semidefinite matrices. Note that the matrices $Q$ and $R$ may be time varying. The particular form of performance index (1.3.8) is called the quadratic form (in terms of the states and controls).
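To make the general quadratic index concrete, here is a minimal numerical sketch, assuming the state and control histories are available as sampled arrays. The function name, the trapezoidal quadrature, and the scalar test trajectory are all illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

def quadratic_cost(t, x, u, Q, R, F):
    """Approximate J = x'(tf) F x(tf) + integral of [x'Qx + u'Ru] dt.

    t: (N,) sample times; x: (N, n) states; u: (N, r) controls.
    Q, R, F are constant weighting matrices; the integral is
    approximated with the trapezoidal rule.
    """
    integrand = (np.einsum('ki,ij,kj->k', x, Q, x)
                 + np.einsum('ki,ij,kj->k', u, R, u))
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    terminal = x[-1] @ F @ x[-1]
    return terminal + integral

# Hypothetical scalar trajectory: x(t) = exp(-t), u(t) = -exp(-t) on [0, 1].
t = np.linspace(0.0, 1.0, 2001)
x = np.exp(-t)[:, None]
u = -x
Q = np.array([[1.0]])
R = np.array([[1.0]])
F = np.array([[2.0]])

J = quadratic_cost(t, x, u, Q, R, F)
# Analytically: 2 exp(-2) + integral of 2 exp(-2t) dt = 1 + exp(-2).
```

Setting F to zero leaves only the integral term, and setting Q and R to zero leaves only the terminal term, which mirrors how the special cases above combine into the general form.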
The problems arising in optimal control are classified based on the structure of the performance index J [67]. If the PI (1.3.9) contains only the terminal cost function $S(x(t_f), t_f)$, it is called the Mayer problem; if the PI (1.3.9) has only the integral cost term, it is called the Lagrange problem; and the problem is of the Bolza type if the PI contains both the terminal cost term and the integral cost term, as in (1.3.9). There are many other forms of cost functions depending on our performance specifications. However, the above-mentioned performance indices (with quadratic forms) lead to some very elegant results in optimal control systems.
1.3.3 Constraints
The control u(t) and state x(t) vectors are either unconstrained or constrained, depending upon the physical situation. The unconstrained problem is less involved and gives rise to some elegant results. From physical considerations, we often have the controls and states, such as currents and voltages in an electrical circuit, the speed of a motor, or the thrust of a rocket, constrained as

$$U_- \le u(t) \le U_+, \qquad X_- \le x(t) \le X_+ \tag{1.3.10}$$

where the subscripts + and - indicate the maximum and minimum values the variables can attain.
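A magnitude constraint of this kind is easy to express in code. The sketch below assumes hypothetical bounds named u_min and u_max and shows two helper functions (both illustrative): one tests whether a sampled control is admissible, the other limits it to the admissible range. Note that clipping an unconstrained optimal control does not in general produce the constrained optimum; the helpers only describe the admissible set.

```python
import numpy as np

def is_admissible(u, u_min, u_max):
    """Return True if every sample of u lies within [u_min, u_max]."""
    u = np.asarray(u, dtype=float)
    return bool(np.all((u >= u_min) & (u <= u_max)))

def saturate(u, u_min, u_max):
    """Limit each sample of u to the range [u_min, u_max]."""
    return np.clip(np.asarray(u, dtype=float), u_min, u_max)

# Hypothetical control samples and bounds.
u_samples = np.array([-3.0, 0.5, 4.0])
u_limited = saturate(u_samples, -1.0, 2.0)
```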
1.3.4 Formal Statement of Optimal Control System
Let us now state the optimal control problem formally, even at the risk of repeating some of the previous equations. The optimal control problem is to find the optimal control $u^*(t)$ (* indicates extremal or optimal value) which causes the linear time-invariant plant (system)

$$\dot{x}(t) = Ax(t) + Bu(t) \tag{1.3.11}$$

to give the trajectory $x^*(t)$ that optimizes or extremizes (minimizes or maximizes) a performance index

$$J = x'(t_f)Fx(t_f) + \int_{t_0}^{t_f} [x'(t)Qx(t) + u'(t)Ru(t)]\,dt \tag{1.3.12}$$

or which causes the nonlinear system

$$\dot{x}(t) = f(x(t), u(t), t) \tag{1.3.13}$$

to give the state $x^*(t)$ that optimizes the general performance index

$$J = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{1.3.14}$$

with some constraints on the control variables $u(t)$ and/or the state variables $x(t)$ given by (1.3.10). The final time $t_f$ may be fixed or free, and the final (target) state may be fully or partially fixed or free. The entire problem statement is also shown pictorially in Figure 1.5. Thus, we are basically interested in finding the control $u^*(t)$ which, when applied to the plant described by (1.3.11) or (1.3.13), gives an optimal performance index $J^*$ described by (1.3.12) or (1.3.14).

[Figure 1.5 Optimal Control Problem: the optimal control system comprises the plant, the cost function, and the constraints; two sketches of J versus u(t) mark the optimum J* at u*(t) for (a) Minimum and (b) Maximum.]
The optimal control systems are studied in three stages.
1. In the first stage, we just consider the performance index of the form (1.3.14) and use the well-known theory of calculus of variations to obtain optimal functions.
2. In the second stage, we bring in the plant (1.3.11) and try to address the problem of finding the optimal control u*(t) which will drive the plant and at the same time optimize the performance index (1.3.12). Next, the above topics are presented in the discrete-time domain.
3. Finally, the topic of constraints on the controls and states (1.3.10) is considered along with the plant and performance index to obtain the optimal control.
1.4 Historical Tour
We basically consider two stages of the tour: first, the development of calculus of variations, and second, optimal control theory [134, 58, 99, 28].¹
1.4.1 Calculus of Variations
According to a legend [88], the Tyrian princess Dido used a rope made of cowhide in the form of a circular arc to maximize the area to be occupied to found Carthage. Although the story of the founding of Carthage is fictitious, it probably inspired a new mathematical discipline, the calculus of variations, and its extensions such as optimal control theory.
The calculus of variations is that branch of mathematics that deals with finding a function that extremizes (maximizes or minimizes) a functional. A functional is loosely defined as a function of a function. The theory of finding maxima and minima of functions is quite old and can be traced back to the isoperimetric problems considered by Greek mathematicians such as Zenodorus (495-435 B.C.) and by Pappus (c. 300 A.D.). But we will start with the works of Bernoulli. In 1696 Johannes Bernoulli (1667-1748) posed the brachistochrone problem: the problem of finding the path of quickest descent between two points not in the same horizontal or vertical line. This problem, which was first posed by Galileo (1564-1642) in 1638, was solved by John, by his brother Jacob (1654-1705), by Gottfried Leibniz (1646-1716), and anonymously by Isaac Newton (1642-1727). Leonhard Euler (1707-1783) joined John Bernoulli and made some remarkable contributions, which influenced Joseph-Louis Lagrange (1736-1813), who finally gave an elegant way of solving these types of problems by using the method of (first) variations. This led Euler to coin the phrase calculus of variations. Later this necessary condition for extrema of a functional was called the Euler-Lagrange equation. Lagrange went on to treat variable end-point problems, introducing the multiplier method, which later became one of the most powerful tools in optimization: the Lagrange (or Euler-Lagrange) multiplier method.
¹ The permission given by Springer-Verlag for H. H. Goldstine, A History of the Calculus of Variations, Springer-Verlag, New York, NY, 1980, is hereby acknowledged.
The sufficient conditions for finding the extrema of functionals in calculus of variations were given by Adrien-Marie Legendre (1752-1833) in 1786 by considering additionally the second variation. Carl Gustav Jacob Jacobi (1804-1851) in 1836 came up with a more rigorous analysis of the sufficient conditions. This sufficient condition was later termed the Legendre-Jacobi condition. At about the same time, Sir William Rowan Hamilton (1805-1865) did some remarkable work on mechanics by showing that the motion of a particle in space, acted upon by various external forces, could be represented by a single function which satisfies two first-order partial differential equations. In 1838 Jacobi had some objections to this work and showed the need for only one partial differential equation. This equation, called the Hamilton-Jacobi equation, later had a profound influence on the calculus of variations, dynamic programming, and optimal control, as well as on mechanics.
The distinction between strong and weak extrema was addressed by Karl Weierstrass (1815-1897), who came up with the idea of the field of extremals and gave the Weierstrass condition and sufficient conditions for weak and strong extrema. Rudolph Clebsch (1833-1872) and Adolph Mayer proceeded with establishing conditions for the more general class of problems. Clebsch formulated a problem in the calculus of variations by adjoining the constraint conditions in the form of differential equations and provided a condition based on the second variation. In 1868 Mayer reconsidered Clebsch's work and gave some elegant results for the general problem in the calculus of variations. Later Mayer described in detail the problem of Lagrange in 1878 and the problem of Mayer in 1895.
In 1898, Adolf Kneser gave a new approach to the calculus of variations by using the result of Karl Gauss (1777-1855) on geodesics. For variable end-point problems, he established the transversality condition, which includes orthogonality as a special case. He, along with Oskar Bolza (1857-1942), gave sufficiency proofs for these problems. In 1900, David Hilbert (1862-1943) showed the second variation as a quadratic functional with eigenvalues and eigenfunctions. Between 1908 and 1910, Gilbert Bliss (1876-1951) [23] and Max Mason looked in depth at the results of Kneser. In 1913, Bolza formulated the problem of Bolza as a generalization of the problems of Lagrange and Mayer. Bliss showed that these three problems are equivalent. Other notable contributions to calculus of variations were made by E. J. McShane (1904-1989) [98], M. R. Hestenes [65], H. H. Goldstine, and others.
There have been a large number of books on the subject of calculus
of variations: Bliss (1946) [23], Cicala (1957) [37], Akhiezer (1962) [1],
Elsgolts (1962) [47], Gelfand and Fomin (1963) [55], Dreyfus (1966)
[45], Forray (1968) [50], Balakrishnan (1969) [8], Young (1969) [146],
Elsgolts (1970) [46], Bolza (1973) [26], Smith (1974) [126], Weinstock
(1974) [143], Krasnov et al. (1975) [81], Leitmann (1981) [88], Ewing
(1985) [48], Kamien and Schwartz (1991) [78], Gregory and Lin
(1992) [61], Sagan (1992) [118], Pinch (1993) [108], Wan (1994) [141],
Giaquinta and Hildebrandt (1995) [56, 57], Troutman (1996) [136], and
Milyutin and Osmolovskii (1998) [103].
1.4.2 Optimal Control Theory
The linear quadratic control problem has its origins in the celebrated work of N. Wiener on mean-square filtering for weapon fire control during World War II (1940-45) [144, 145]. Wiener solved the problem of designing filters that minimize a mean-square-error criterion (performance measure) of the form

$$J = E\{e^2(t)\} \tag{1.4.1}$$

where $e(t)$ is the error, and $E\{x\}$ represents the expected value of the random variable $x$. For a deterministic case, the above error criterion is generalized as an integral quadratic term as

$$J = \int_0^\infty e'(t)Qe(t)\,dt \tag{1.4.2}$$

where $Q$ is some positive definite matrix. R. Bellman in 1957 [12] introduced the technique of dynamic programming to solve discrete-time optimal control problems. But the most important contribution to optimal control systems was made in 1956 [25] by L. S. Pontryagin (of the former Union of Soviet Socialist Republics (USSR)) and his associates, in the development of his celebrated maximum principle, described in detail in their book [109]. Also, see a very interesting article on the "discovery of the Maximum Principle" by R. V. Gamkrelidze [52], one of the authors of the original book [109]. At this time in the United States, R. E. Kalman in 1960 [70] provided linear quadratic regulator (LQR) and linear quadratic Gaussian (LQG) theory to design optimal feedback controls. He went on to present optimal filtering and estimation theory, leading to his famous discrete Kalman filter [71] and the continuous Kalman filter with Bucy [76]. Kalman had a profound effect on optimal control theory, and the Kalman filter is one of the most widely used techniques in applications of control theory to real-world problems in a variety of fields.
At this point we have to mention the matrix Riccati equation that
appears in all the Kalman filtering techniques and many other fields.
C. J. Riccati [114, 22] published his result in 1724 on the solution for
some types of nonlinear differential equations, without ever knowing
that the Riccati equation would become so famous after more than two
centuries!
Thus, optimal control, having its roots in the calculus of variations developed during the 16th and 17th centuries, was really born over 300 years ago [132]. For additional details about the historical perspectives on calculus of variations and optimal control, the reader is referred to some excellent publications [58, 99, 28, 21, 132].
In the so-called linear quadratic control, the term "linear" refers to the plant being linear, and the term "quadratic" refers to the performance index that involves the square, or quadratic, of an error and/or control. Originally, this problem was called the mean-square control problem, and the term "linear quadratic" did not appear in the literature until the late 1950s.
Basically, classical control theory, using the frequency domain, deals with single-input, single-output (SISO) systems, whereas modern control theory works in the time domain for SISO and multi-input, multi-output (MIMO) systems. Although modern control, and hence optimal control, appeared to be very attractive, it lacked the very important feature of robustness. That is, controllers designed based on LQR theory failed to be robust to measurement noise, external disturbances, and unmodeled dynamics. On the other hand, frequency domain techniques using the ideas of gain margin and phase margin offer robustness in a natural way. Thus, some researchers [115, 95], especially in the United Kingdom, continued to work on developing frequency domain approaches to MIMO systems.
One important and relevant field that has been developed around the 1980s is the $H_\infty$-optimal control theory. In this framework, the work developed in the 1960s and 1970s is labeled as $H_2$-optimal control theory. The seeds for $H_\infty$-optimal control theory were laid by G. Zames [148], who formulated the optimal $H_\infty$-sensitivity design problem for SISO systems and solved it using optimal Nevanlinna-Pick interpolation theory. An important publication in this field came from a group of four active researchers, Doyle, Glover, Khargonekar, and Francis [44], who won the 1991 W. R. G. Baker Award for the best IEEE Transactions paper. There are many other works in the field of $H_\infty$ control [51, 96, 43, 128, 7, 60, 131, 150, 39, 34].
1.5 About This Book
This book, on the subject of optimal control systems, is based on the
author's lecture notes used for teaching a graduate level course on this
subject. In particular, this author was most influenced by Athans and
Falb [6], Schultz and Melsa [121], Sage [119], Kirk [79], Sage and White
[120], Anderson and Moore [3] and Lewis and Syrmos [91], and one
finds the footprints of these works in the present book.
There were a good number of books on optimal control published during the era of the "glory of modern control" (Leitmann (1964) [87], Tou (1964) [135], Athans and Falb (1966) [6], Dreyfus (1966) [45], Lee and Markus (1967) [86], Petrov (1968) [106], Sage (1968) [119], Citron (1969) [38], Luenberger (1969) [93], Pierre (1969) [107], Pun (1969) [110], Young (1969) [146], Kirk (1970) [79], Boltyanskii [24], Kwakernaak and Sivan (1972) [84], Warga (1972) [142], Berkovitz (1974) [17], Bryson and Ho (1975) [30], Sage and White (1977) [120], Leitmann (1981) [88], Ryan (1982) [116]). There has been renewed interest with the second wave of books published during the last few years (Lewis (1986) [89], Stengel (1986) [127], Christensen et al. (1987) [36], Anderson and Moore (1990) [3], Hocking (1991) [66], Teo et al. (1991) [133], Gregory and Lin (1992) [61], Lewis (1992) [90], Pinch (1993) [108], Dorato et al. (1995) [42], Lewis and Syrmos (1995) [91], Saberi et al. (1995) [117], Sima (1996) [124], Siouris [125], Troutman (1996) [136], Bardi and Dolcetta (1997) [9], Vincent and Grantham (1997) [139], Milyutin and Osmolovskii (1998) [103], Bryson (1999) [29], Burl [32], Kolosov (1999) [80], Pytlak (1999) [111], Vinter (2000) [140], Zelikin (2000) [149], Betts (2001) [20], and Locatelli (2001) [92]).
The optimal control theory continues to have a wide variety of applications, ranging from traditional electrical power [36] to economics and management [16, 122, 78, 123].
1.6 Chapter Overview
This book is composed of seven chapters. Chapter 2 presents optimal control via calculus of variations. In this chapter, we start with some basic definitions and a simple variational problem of extremizing a functional. We then bring in the plant as a conditional optimization problem and discuss various types of problems based on the boundary conditions. We briefly mention both Lagrangian and Hamiltonian formalisms for optimization. Next, Chapter 3 addresses the linear quadratic regulator (LQR) system. Here we discuss the closed-loop optimal control system, introducing the matrix Riccati differential and algebraic equations. We look at the analytical solution to the Riccati equations and the development of a MATLAB© routine for the analytical solution. Tracking and other problems of linear quadratic optimal control are discussed in Chapter 4. We also discuss the gain and phase margins of the LQR system.
Up to this point, the optimal control of continuous-time systems is described. Next, the optimal control of discrete-time systems is presented in Chapter 5. Here, we start with the basic calculus of variations and then touch upon all the topics discussed above with respect to continuous-time systems. The Pontryagin Principle and the associated topics of dynamic programming and Hamilton-Jacobi-Bellman results are briefly covered in Chapter 6. The optimal control of systems with control and state constraints is described in Chapter 7. Here, we cover topics of control constraints leading to time-optimal, fuel-optimal, and energy-optimal control systems and briefly discuss the state constraints problem.
Finally, Appendices A and B provide a summary of results on matrices, vectors, matrix algebra, and state space, and Appendix C lists some of the MATLAB© files used in the book.
1.7 Problems
Problem 1.1 A D.C. motor speed control system is described by the second-order state equations

$$\dot{x}_1(t) = 25x_2(t)$$
$$\dot{x}_2(t) = 400x_1(t) - 200x_2(t) + 400u(t),$$

where $x_1(t)$ = the speed of the motor, $x_2(t)$ = the current in the armature circuit, and the control input $u(t)$ = the voltage input to an amplifier supplying the motor. Formulate a performance index and an optimal control problem to keep the speed constant at a particular value.
Problem 1.2 [83] In a liquid-level control system for a storage tank, the valves connecting a reservoir and the tank are controlled by a gear train driven by a D.C. motor and an electronic amplifier. The dynamics is described by the third-order system

$$\dot{x}_1(t) = 2x_1(t)$$
$$\dot{x}_2(t) = x_3(t)$$
$$\dot{x}_3(t) = 10x_3(t) + 9000u(t)$$

where $x_1(t)$ is the height in the tank, $x_2(t)$ is the angular position of the electric motor driving the valves controlling the liquid from reservoir to tank, $x_3(t)$ is the angular velocity of the motor, and $u(t)$ is the input to the electronic amplifier connected to the input of the motor. Formulate an optimal control problem to keep the liquid level constant at a reference value, with the system acting only if there is a change in the liquid level.
Problem 1.3 [35] In an inverted pendulum system, it is required to maintain the upright position of the pendulum on a cart. The linearized state equations are

$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = x_3(t) + 0.2u(t)$$
$$\dot{x}_3(t) = x_4(t)$$
$$\dot{x}_4(t) = 10x_3(t) - 0.2u(t)$$

where $x_1(t)$ is the horizontal linear displacement of the cart, $x_2(t)$ is the linear velocity of the cart, $x_3(t)$ is the angular position of the pendulum from the vertical line, $x_4(t)$ is the angular velocity, and $u(t)$ is the horizontal force applied to the cart. Formulate a performance index to keep the pendulum in the vertical position with as little energy as possible.
Problem 1.4 [101] Consider a mechanical system consisting of two masses and two springs, one spring connecting the two masses and the other spring connecting one of the masses to a fixed point. An input is applied to the mass not connected to the fixed point. The displacements ($x_1(t)$ and $x_2(t)$) and the corresponding velocities ($x_3(t)$ and $x_4(t)$) of the two masses provide a fourth-order system described by

$$\dot{x}_1(t) = x_3(t)$$
$$\dot{x}_2(t) = x_4(t)$$
$$\dot{x}_3(t) = 4x_1(t) + 2x_2(t)$$
$$\dot{x}_4(t) = x_1(t) - x_2(t) + u(t)$$

Formulate a performance index to minimize the errors in displacements and velocities and to minimize the control effort.
Problem 1.5 A simplified model of an automobile suspension system is described by

$$m\ddot{x}(t) + kx(t) = bu(t)$$

where $x(t)$ is the position, $u(t)$ is the input to the suspension system (in the form of an upward force), $m$ is the mass of the suspension system, and $k$ is the spring constant. Formulate the optimal control problem for minimum control energy and passenger comfort. Assume suitable values for all the constants.
Problem 1.6 [112] Consider a continuous stirred tank chemical reactor described by

$$\dot{x}_1(t) = 0.1x_1(t) - 0.12x_2(t)$$
$$\dot{x}_2(t) = 0.3x_1(t) - 0.012x_2(t) - 0.07u(t)$$

where the normalized deviation state variables of the linearized model are $x_1(t)$ = reaction variable and $x_2(t)$ = temperature, and the control variable is $u(t)$ = effective cooling rate coefficient. Formulate a suitable performance measure to minimize the deviation errors and to minimize the control effort.
Chapter 2
Calculus of Variations
and Optimal Control
Calculus of variations (CoV), or variational calculus, deals with finding the optimum (maximum or minimum) value of a functional. Variational calculus, which originated around 1696, became an independent mathematical discipline after the fundamental discoveries of L. Euler (1707-1783), whom we can claim with good reason as the founder of calculus of variations.
In this chapter, we start with some basic definitions and a simple variational problem of extremizing a functional. We then incorporate the plant as a conditional optimization problem and discuss various types of problems based on the boundary conditions. We briefly mention both the Lagrangian and Hamiltonian formalisms for optimization. It is suggested that the student review the material in Appendices A and B given at the end of the book. This chapter is motivated by [47, 79, 46, 143, 81, 48].¹
2.1 Basic Concepts
2.1.1 Function and Functional
We discuss some fundamental concepts associated with functionals alongside
those of functions.
(a) Function: A variable x is a function of a variable quantity t (written
as x(t) = f(t)) if to every value of t over a certain range of t there
corresponds a value x; i.e., we have a correspondence: to a number t
there corresponds a number x. Note that here t need not always be
time; it may be any independent variable.
¹ The permission given by Prentice Hall for D. E. Kirk, Optimal Control Theory: An Introduction, Prentice Hall, Englewood Cliffs, NJ, 1970, is hereby acknowledged.
Example 2.1
Consider
$$x(t) = 2t^2 + 1. \tag{2.1.1}$$
For t = 1, x = 3; for t = 2, x = 9; and so on. Other functions are
$$x(t) = 2t; \qquad x(t_1, t_2) = t_1^2 + t_2^2.$$
Next we consider the definition of a functional based on that of a
function.
(b) Functional: A variable quantity J is a functional dependent on a
function f(x), written as J = J(f(x)), if to each function f(x) there
corresponds a value J; i.e., we have a correspondence: to the function
f(x) there corresponds a number J. A functional may also depend on several
functions.
Example 2.2
Let $x(t) = 2t^2 + 1$. Then
$$J(x(t)) = \int_0^1 x(t)\,dt = \int_0^1 (2t^2 + 1)\,dt = \frac{2}{3} + 1 = \frac{5}{3} \tag{2.1.2}$$
is the area under the curve x(t). If v(t) is the velocity of a vehicle,
then
$$J(v(t)) = \int_{t_0}^{t_f} v(t)\,dt \tag{2.1.3}$$
is the path traversed by the vehicle. Thus, here x(t) and v(t) are
functions of t, and J is a functional of x(t) or v(t).
Loosely speaking, a functional can be thought of as a "function of a
function."
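As a small numerical illustration of this distinction, the sketch below treats a functional as a routine that accepts a whole function and returns a single number. It evaluates (2.1.2) by composite Simpson's rule; the helper name `functional_J` and the choice of quadrature are assumptions made only for this illustration.

```python
def functional_J(x, t0=0.0, t1=1.0, n=1000):
    """Approximate J(x) = integral of x(t) dt over [t0, t1] by composite Simpson's rule."""
    if n % 2:               # Simpson's rule needs an even number of subintervals
        n += 1
    h = (t1 - t0) / n
    total = x(t0) + x(t1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * x(t0 + i * h)
    return total * h / 3.0


def x(t):
    """The function of Example 2.1: a number t is mapped to a number x."""
    return 2.0 * t**2 + 1.0


J_value = functional_J(x)   # the whole function x is mapped to a single number
print(J_value)              # close to 5/3, the value in (2.1.2)
```

Passing a different function to `functional_J` gives a different number, which is the sense in which J depends on the function rather than on a value of t.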
2.1.2 Increment
We consider here increment of a function and a functional.
(a) Increment of a Function: In order to consider optimal values
of a function, we need the definition of an increment [47, 46, 79].
DEFINITION 2.1 The increment of the function f, denoted by $\Delta f$, is
defined as
$$\Delta f \triangleq f(t + \Delta t) - f(t). \tag{2.1.4}$$
It is easy to see from the definition that $\Delta f$ depends on both the
independent variable t and the increment of the independent variable
$\Delta t$, and hence, strictly speaking, we need to write the increment of a
function as $\Delta f(t, \Delta t)$.
Example 2.3
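If
$$f(t) = (t_1 + t_2)^2, \tag{2.1.5}$$
find the increment of the function f(t).
Solution: The increment becomes
$$
\begin{aligned}
\Delta f &\triangleq f(t + \Delta t) - f(t) \\
&= (t_1 + \Delta t_1 + t_2 + \Delta t_2)^2 - (t_1 + t_2)^2 \\
&= (t_1 + \Delta t_1)^2 + (t_2 + \Delta t_2)^2 + 2(t_1 + \Delta t_1)(t_2 + \Delta t_2) - (t_1^2 + t_2^2 + 2t_1 t_2) \\
&= 2(t_1 + t_2)\Delta t_1 + 2(t_1 + t_2)\Delta t_2 + (\Delta t_1)^2 + (\Delta t_2)^2 + 2\Delta t_1 \Delta t_2.
\end{aligned} \tag{2.1.6}
$$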
(b) Increment of a Functional: Now we are ready to define the
increment of a functional.
DEFINITION 2.2 The increment of the functional J, denoted by $\Delta J$, is
defined as
$$\Delta J \triangleq J(x(t) + \delta x(t)) - J(x(t)). \tag{2.1.7}$$
Here $\delta x(t)$ is called the variation of the function x(t). Since the
increment of a functional depends upon the function x(t) and its
variation $\delta x(t)$, strictly speaking, we need to write the increment as
$\Delta J(x(t), \delta x(t))$.
Example 2.4
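Find the increment of the functional
$$J = \int_{t_0}^{t_f} \left[2x^2(t) + 1\right] dt. \tag{2.1.8}$$
Solution: The increment of J is given by
$$
\begin{aligned}
\Delta J &\triangleq J(x(t) + \delta x(t)) - J(x(t)) \\
&= \int_{t_0}^{t_f} \left[2(x(t) + \delta x(t))^2 + 1\right] dt - \int_{t_0}^{t_f} \left[2x^2(t) + 1\right] dt \\
&= \int_{t_0}^{t_f} \left[4x(t)\delta x(t) + 2(\delta x(t))^2\right] dt.
\end{aligned} \tag{2.1.9}
$$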
2.1.3 Differential and Variation
Here, we consider the differential of a function and the variation of a
functional.
(a) Differential of a Function: Let us define at a point t* the
increment of the function f as
$$\Delta f \triangleq f(t^* + \Delta t) - f(t^*). \tag{2.1.10}$$
By expanding $f(t^* + \Delta t)$ in a Taylor series about t*, we get
$$\Delta f = f(t^*) + \left(\frac{df}{dt}\right)_* \Delta t + \frac{1}{2!}\left(\frac{d^2 f}{dt^2}\right)_* (\Delta t)^2 + \cdots - f(t^*). \tag{2.1.11}$$
Neglecting the higher order terms in $\Delta t$,
$$\Delta f \approx \left(\frac{df}{dt}\right)_* \Delta t = \dot{f}(t^*)\,\Delta t = df. \tag{2.1.12}$$
Here, df is called the differential of f at the point t*, and $\dot{f}(t^*)$ is the
derivative or slope of f at t*. In other words, the differential df is
the first order approximation to the increment $\Delta f$. Figure 2.1 shows the
relation between increment, differential and derivative.
[Figure 2.1: Increment $\Delta f$, Differential $df$, and Derivative $\dot{f}$ of a Function $f(t)$]
Example 2.5
Let $f(t) = t^2 + 2t$. Find the increment and the derivative of the
function f(t).
Solution: By definition, the increment $\Delta f$ is
$$
\begin{aligned}
\Delta f &\triangleq f(t + \Delta t) - f(t) \\
&= (t + \Delta t)^2 + 2(t + \Delta t) - (t^2 + 2t) \\
&= 2t\,\Delta t + 2\Delta t + \text{higher order terms} \\
&\approx 2(t + 1)\Delta t \\
&= \dot{f}(t)\,\Delta t.
\end{aligned} \tag{2.1.13}
$$
Here, $\dot{f}(t) = 2(t + 1)$.
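The gap between the increment and the differential in this example can be checked numerically. The sketch below is only an illustration: the evaluation point t = 1 and the step sizes are arbitrary choices, and the function names are hypothetical. For this quadratic f the neglected higher order part is exactly $(\Delta t)^2$.

```python
def f(t):
    return t**2 + 2.0 * t


def f_dot(t):
    return 2.0 * (t + 1.0)


def increment(t, dt):
    """Exact increment: f(t + dt) - f(t)."""
    return f(t + dt) - f(t)


def differential(t, dt):
    """First order part of the increment: f_dot(t) * dt."""
    return f_dot(t) * dt


t = 1.0                      # arbitrary evaluation point
steps = (0.1, 0.01, 0.001)   # arbitrary step sizes
gaps = []
for dt in steps:
    gap = increment(t, dt) - differential(t, dt)
    gaps.append(gap)
    print(dt, increment(t, dt), differential(t, dt), gap)
```

Each time the step is reduced by a factor of 10, the gap shrinks by a factor of about 100, which is what "first order approximation" means here.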
(b) Variation of a Functional: Consider the increment of a functional
$$\Delta J \triangleq J(x(t) + \delta x(t)) - J(x(t)). \tag{2.1.14}$$
Expanding $J(x(t) + \delta x(t))$ in a Taylor series, we get
$$
\begin{aligned}
\Delta J &= J(x(t)) + \frac{\partial J}{\partial x}\,\delta x(t) + \frac{1}{2!}\frac{\partial^2 J}{\partial x^2}(\delta x(t))^2 + \cdots - J(x(t)) \\
&= \frac{\partial J}{\partial x}\,\delta x(t) + \frac{1}{2!}\frac{\partial^2 J}{\partial x^2}(\delta x(t))^2 + \cdots \\
&= \delta J + \delta^2 J + \cdots,
\end{aligned} \tag{2.1.15}
$$
where
$$\delta J = \frac{\partial J}{\partial x}\,\delta x(t) \quad \text{and} \quad \delta^2 J = \frac{1}{2!}\frac{\partial^2 J}{\partial x^2}(\delta x(t))^2 \tag{2.1.16}$$
are called the first variation (or simply the variation) and the second
variation of the functional J, respectively. The variation $\delta J$ of a functional
J is the linear (or first order approximate) part (in $\delta x(t)$) of the
increment $\Delta J$. Figure 2.2 shows the relation between increment and
the first variation of a functional.
[Figure 2.2: Increment $\Delta J$ and the First Variation $\delta J$ of the Functional J]
Example 2.6
Given the functional
$$J(x(t)) = \int_{t_0}^{t_f} \left[2x^2(t) + 3x(t) + 4\right] dt, \tag{2.1.17}$$
evaluate the variation of the functional.
Solution: First, we form the increment and then extract the variation
as the first order approximation. Thus
$$
\begin{aligned}
\Delta J &\triangleq J(x(t) + \delta x(t)) - J(x(t)) \\
&= \int_{t_0}^{t_f} \left[2(x(t) + \delta x(t))^2 + 3(x(t) + \delta x(t)) + 4 - \left(2x^2(t) + 3x(t) + 4\right)\right] dt \\
&= \int_{t_0}^{t_f} \left[4x(t)\delta x(t) + 2(\delta x(t))^2 + 3\delta x(t)\right] dt.
\end{aligned} \tag{2.1.18}
$$
Considering only the first order terms, we get the (first) variation
as
$$\delta J(x(t), \delta x(t)) = \int_{t_0}^{t_f} (4x(t) + 3)\,\delta x(t)\,dt. \tag{2.1.19}$$
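The relation between the increment (2.1.18) and the first variation (2.1.19) can be checked numerically. The sketch below makes several assumptions that are not in the text: the interval is taken as [0, 1], the nominal function is x(t) = t, and the variation is $\delta x(t) = \epsilon \sin(\pi t)$; the integrals are approximated by Simpson's rule. With these choices the difference $\Delta J - \delta J$ equals $2\epsilon^2\int_0^1 \sin^2(\pi t)\,dt = \epsilon^2$.

```python
import math


def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def J(x):
    """The functional (2.1.17) on the assumed interval [0, 1]."""
    return simpson(lambda t: 2.0 * x(t) ** 2 + 3.0 * x(t) + 4.0, 0.0, 1.0)


def first_variation(x, dx):
    """The first variation (2.1.19) on the assumed interval [0, 1]."""
    return simpson(lambda t: (4.0 * x(t) + 3.0) * dx(t), 0.0, 1.0)


def x_nominal(t):
    return t  # assumed nominal function


results = []
for eps in (0.1, 0.01):
    dx = lambda t, e=eps: e * math.sin(math.pi * t)   # assumed variation
    x_varied = lambda t, d=dx: x_nominal(t) + d(t)
    delta_J = J(x_varied) - J(x_nominal)              # increment
    var_J = first_variation(x_nominal, dx)            # linear part only
    results.append((eps, delta_J, var_J))
    print(eps, delta_J, var_J, delta_J - var_J)
```

The printed difference falls with the square of $\epsilon$, confirming that the variation is the part of the increment that is linear in $\delta x(t)$.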
2.2 Optimum of a Function and a Functional
We give some definitions for optimum or extremum (maximum or minimum)
of a function and a functional [47, 46, 79]. The variation plays
the same role in determining the optimal value of a functional as the
differential does in finding the extremal or optimal value of a function.
DEFINITION 2.3 Optimum of a Function: A function f(t) is said
to have a relative optimum at the point t* if there is a positive parameter $\epsilon$
such that for all points t in a domain $\mathcal{D}$ that satisfy $|t - t^*| < \epsilon$, the increment
of f(t) has the same sign (positive or negative).
In other words, if
$$\Delta f = f(t) - f(t^*) \geq 0, \tag{2.2.1}$$
then f(t*) is a relative local minimum. On the other hand, if
$$\Delta f = f(t) - f(t^*) \leq 0, \tag{2.2.2}$$
then f(t*) is a relative local maximum. If the previous relations are
valid for arbitrarily large $\epsilon$, then f(t*) is said to be a global absolute
optimum. Figure 2.3 illustrates the (a) minimum and (b) maximum of
a function.
[Figure 2.3: (a) Minimum and (b) Maximum of a Function f(t)]
It is well known that the necessary condition for optimum of a function
is that the (first) differential vanishes, i.e., df = 0. The sufficient
condition
1. for minimum is that the second differential is positive,
i.e., $d^2 f > 0$, and
2. for maximum is that the second differential is negative,
i.e., $d^2 f < 0$.
If $d^2 f = 0$, it corresponds to a stationary (or inflection) point.
DEFINITION 2.4 Optimum of a Functional: A functional J is
said to have a relative optimum at x* if there is a positive $\epsilon$ such that for all
functions x in a domain $\Omega$ which satisfy $|x - x^*| < \epsilon$, the increment of J has
the same sign.
In other words, if
$$\Delta J = J(x) - J(x^*) \geq 0, \tag{2.2.3}$$
then J(x*) is a relative minimum. On the other hand, if
$$\Delta J = J(x) - J(x^*) \leq 0, \tag{2.2.4}$$
then J(x*) is a relative maximum. If the above relations are satisfied
for arbitrarily large $\epsilon$, then J(x*) is a global absolute optimum.
Analogous to finding extremum or optimal values for functions, in
variational problems concerning functionals, the result is that the variation
must be zero on an optimal curve. Let us now state the result in
the form of a theorem, known as the fundamental theorem of the calculus
of variations, the proof of which can be found in any book on calculus
of variations [47, 46, 79].
THEOREM 2.1
For x*(t) to be a candidate for an optimum, the (first) variation of J must
be zero on x*(t), i.e., $\delta J(x^*(t), \delta x(t)) = 0$ for all admissible values of $\delta x(t)$.
This is a necessary condition. As a sufficient condition for minimum, the
second variation $\delta^2 J > 0$, and for maximum $\delta^2 J < 0$.
2.3 The Basic Variational Problem
2.3.1 Fixed-End Time and Fixed-End State System
We address a fixed-end time and fixed-end state problem, where both
the initial time and state and the final time and state are fixed or given
a priori. Let x(t) be a scalar function with continuous first derivatives;
the vector case can be dealt with similarly. The problem is to find
the optimal function x*(t) for which the functional
$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt \tag{2.3.1}$$
has a relative optimum. It is assumed that the integrand V has continuous
first and second partial derivatives w.r.t. all its arguments; $t_0$
and $t_f$ are fixed (or given a priori) and the end points are fixed, i.e.,
$$x(t = t_0) = x_0; \qquad x(t = t_f) = x_f. \tag{2.3.2}$$
We already know from Theorem 2.1 that the necessary condition for
an optimum is that the variation of a functional vanishes. Hence, in
our attempt to find the optimum of x(t), we first define the increment
for J, obtain its variation and finally apply the fundamental theorem
of the calculus of variations (Theorem 2.1).
Thus, the various steps involved in finding the optimal solution to
the fixed-end time and fixed-end state system are first listed and then
discussed in detail.
• Step 1: Assumption of an Optimum
• Step 2: Variations and Increment
• Step 3: First Variation
• Step 4: Fundamental Theorem
• Step 5: Fundamental Lemma
• Step 6: Euler-Lagrange Equation
• Step 1: Assumption of an Optimum: Let us assume that x*(t) is
the optimum attained for the function x(t). Take some admissible
function $x_a(t) = x^*(t) + \delta x(t)$ close to x*(t), where $\delta x(t)$ is the
variation of x*(t) as shown in Figure 2.4. The function $x_a(t)$
should also satisfy the boundary conditions (2.3.2) and hence it
is necessary that
$$\delta x(t_0) = \delta x(t_f) = 0. \tag{2.3.3}$$
[Figure 2.4: Fixed-End Time and Fixed-End State System]
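• Step 2: Variations and Increment: Let us first define the increment
as
$$
\begin{aligned}
\Delta J(x^*(t), \delta x(t)) &\triangleq J(x^*(t) + \delta x(t), \dot{x}^*(t) + \delta\dot{x}(t), t) - J(x^*(t), \dot{x}^*(t), t) \\
&= \int_{t_0}^{t_f} V(x^*(t) + \delta x(t), \dot{x}^*(t) + \delta\dot{x}(t), t)\,dt - \int_{t_0}^{t_f} V(x^*(t), \dot{x}^*(t), t)\,dt,
\end{aligned} \tag{2.3.4}
$$
which by combining the integrals can be written as
$$\Delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f} \left[V(x^*(t) + \delta x(t), \dot{x}^*(t) + \delta\dot{x}(t), t) - V(x^*(t), \dot{x}^*(t), t)\right] dt, \tag{2.3.5}$$
where
$$\dot{x}(t) = \frac{dx(t)}{dt} \quad \text{and} \quad \delta\dot{x}(t) = \frac{d}{dt}\{\delta x(t)\}. \tag{2.3.6}$$
Expanding V in the increment (2.3.5) in a Taylor series about
the point $x^*(t)$ and $\dot{x}^*(t)$, the increment $\Delta J$ becomes (note the
cancellation of $V(x^*(t), \dot{x}^*(t), t)$)
$$
\begin{aligned}
\Delta J &= \Delta J(x^*(t), \delta x(t)) \\
&= \int_{t_0}^{t_f} \Bigg[ \frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial x}\,\delta x(t) + \frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial \dot{x}}\,\delta\dot{x}(t) \\
&\qquad + \frac{1}{2!}\left\{ \frac{\partial^2 V(\cdots)}{\partial x^2}(\delta x(t))^2 + \frac{\partial^2 V(\cdots)}{\partial \dot{x}^2}(\delta\dot{x}(t))^2 + 2\frac{\partial^2 V(\cdots)}{\partial x\,\partial \dot{x}}\,\delta x(t)\,\delta\dot{x}(t) \right\} + \cdots \Bigg] dt.
\end{aligned} \tag{2.3.7}
$$
Here, the partial derivatives are w.r.t. x(t) and $\dot{x}(t)$ at the optimal
condition (*), and * is omitted for simplicity.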
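• Step 3: First Variation: Now, we obtain the variation by retaining
the terms that are linear in $\delta x(t)$ and $\delta\dot{x}(t)$ as
$$\delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f} \left[\frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial x}\,\delta x(t) + \frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial \dot{x}}\,\delta\dot{x}(t)\right] dt. \tag{2.3.8}$$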
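To express the relation for the first variation (2.3.8) entirely in
terms containing $\delta x(t)$ (since $\delta\dot{x}(t)$ is dependent on $\delta x(t)$), we
integrate by parts the term involving $\delta\dot{x}(t)$ as (omitting the arguments
in V for simplicity)
$$
\begin{aligned}
\int_{t_0}^{t_f} \left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta\dot{x}(t)\,dt &= \int_{t_0}^{t_f} \left(\frac{\partial V}{\partial \dot{x}}\right)_* \frac{d}{dt}(\delta x(t))\,dt \\
&= \int_{t_0}^{t_f} \left(\frac{\partial V}{\partial \dot{x}}\right)_* d(\delta x(t)) \\
&= \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f} \delta x(t)\,\frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* dt.
\end{aligned} \tag{2.3.9}
$$
In the above, we used the well-known integration formula $\int u\,dv = uv - \int v\,du$, where $u = \partial V/\partial \dot{x}$ and $v = \delta x(t)$. Using (2.3.9), the
relation (2.3.8) for first variation becomes
$$
\begin{aligned}
\delta J(x^*(t), \delta x(t)) &= \int_{t_0}^{t_f} \left(\frac{\partial V}{\partial x}\right)_* \delta x(t)\,dt + \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f} \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\,dt \\
&= \int_{t_0}^{t_f} \left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right] \delta x(t)\,dt + \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\right]_{t_0}^{t_f}.
\end{aligned} \tag{2.3.10}
$$
Using the relation (2.3.3) for boundary variations in (2.3.10), we
get
$$\delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f} \left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right] \delta x(t)\,dt. \tag{2.3.11}$$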
• Step 4: Fundamental Theorem: We now apply the fundamental
theorem of the calculus of variations (Theorem 2.1), i.e., the variation
of J must vanish for an optimum. That is, for the optimum
x*(t) to exist, $\delta J(x^*(t), \delta x(t)) = 0$. Thus the relation (2.3.11)
becomes
$$\int_{t_0}^{t_f} \left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right] \delta x(t)\,dt = 0. \tag{2.3.12}$$
Note that the function $\delta x(t)$ must be zero at $t_0$ and $t_f$, but is
otherwise completely arbitrary.
• Step 5: Fundamental Lemma: To simplify the condition obtained
in the equation (2.3.12), let us take advantage of the following
lemma, called the fundamental lemma of the calculus of
variations [47, 46, 79].
LEMMA 2.1
If, for a continuous function g(t),
$$\int_{t_0}^{t_f} g(t)\,\delta x(t)\,dt = 0 \tag{2.3.13}$$
for every function $\delta x(t)$ that is continuous in the interval $[t_0, t_f]$, then the
function g(t) must be zero everywhere throughout the interval $[t_0, t_f]$
(see Figure 2.5).
Proof: We prove this by contradiction. Let us assume that g(t) is
nonzero (positive or negative) during a short interval $[t_a, t_b]$. Next, let
us select $\delta x(t)$, which is arbitrary, to be positive (or negative) throughout
the interval where g(t) has a nonzero value and zero elsewhere. By this selection
of $\delta x(t)$, the value of the integral in (2.3.13) will be nonzero. This
contradicts the hypothesis that the integral in (2.3.13) vanishes.
Thus g(t) must be identically zero everywhere during the entire interval
$[t_0, t_f]$ in (2.3.13). Hence the lemma.
[Figure 2.5: A Nonzero g(t) and an Arbitrary $\delta x(t)$]
• Step 6: Euler-Lagrange Equation: Applying the previous lemma
to (2.3.12), a necessary condition for x*(t) to be an optimum of
the functional J given by (2.3.1) is
$$\left(\frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial \dot{x}}\right)_* = 0 \tag{2.3.14}$$
or, in simplified notation omitting the arguments in V,
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \tag{2.3.15}$$
for all $t \in [t_0, t_f]$. This equation is called the Euler equation, first
published in 1741 [126].
A historical note is worthy of mention.
Euler obtained the equation (2.3.14) in 1741 using an elaborate
and cumbersome procedure. Lagrange studied Euler's
results and wrote a letter to Euler in 1755 in which he obtained
the previous equation by a more elegant method of
"variations" as described above. Euler recognized the simplicity
and generality of the method of Lagrange and introduced
the name calculus of variations. The all-important
fundamental equation (2.3.14) is now generally known as the
Euler-Lagrange (E-L) equation after these two great mathematicians
of the 18th century. Lagrange worked further
on optimization and came up with the well-known Lagrange
multiplier rule or method.
2.3.2 Discussion on Euler-Lagrange Equation
We provide some comments on the Euler-Lagrange equation [47, 46].
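1. The Euler-Lagrange equation (2.3.14) can be written in many
different forms. Thus (2.3.14) becomes
$$V_x - \frac{d}{dt}\left(V_{\dot{x}}\right) = 0 \tag{2.3.16}$$
where
$$V_x = \frac{\partial V}{\partial x} = V_x(x^*(t), \dot{x}^*(t), t); \qquad V_{\dot{x}} = \frac{\partial V}{\partial \dot{x}} = V_{\dot{x}}(x^*(t), \dot{x}^*(t), t). \tag{2.3.17}$$
Since V is a function of three arguments $x^*(t)$, $\dot{x}^*(t)$, and t, and
since $x^*(t)$ and $\dot{x}^*(t)$ are in turn functions of t, we get
$$\frac{d}{dt}\left(V_{\dot{x}}\right) = V_{x\dot{x}}\,\dot{x}^*(t) + V_{\dot{x}\dot{x}}\,\ddot{x}^*(t) + V_{t\dot{x}}. \tag{2.3.18}$$
Combining (2.3.16) and (2.3.18), we get an alternate form for
the E-L equation as
$$V_x - V_{t\dot{x}} - V_{x\dot{x}}\,\dot{x}^*(t) - V_{\dot{x}\dot{x}}\,\ddot{x}^*(t) = 0. \tag{2.3.19}$$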
2. The presence of $\frac{d}{dt}$ and/or $\dot{x}^*(t)$ in the E-L equation (2.3.14) means
that it is a differential equation.
3. In the E-L equation (2.3.14), the term $\partial V(x^*(t), \dot{x}^*(t), t)/\partial \dot{x}$ is in general
a function of $x^*(t)$, $\dot{x}^*(t)$, and t. Thus when this function is
differentiated w.r.t. t, $\ddot{x}^*(t)$ may be present. This means that the
differential equation (2.3.14) is in general of second order. This is
also evident from the alternate form (2.3.19) for the E-L equation.
4. There may also be terms involving products or powers of $x^*(t)$,
$\dot{x}^*(t)$, and $\ddot{x}^*(t)$, in which case the differential equation becomes
nonlinear.
5. The explicit presence of t in the arguments indicates that the
coefficients may be time-varying.
6. The conditions at the initial point $t = t_0$ and final point $t = t_f$ lead
to a boundary value problem.
7. Thus, the Euler-Lagrange equation (2.3.14) is, in general, a nonlinear,
time-varying, two-point boundary value, second order, ordinary
differential equation. Thus, we often have a nonlinear
two-point boundary value problem (TPBVP). The solution of the
nonlinear TPBVP is quite a formidable task and is often done using
numerical techniques. This is the price we pay for demanding
optimal performance!
8. Compliance with the Euler-Lagrange equation is only a necessary
condition for the optimum. A solution of the equation may sometimes yield
neither a maximum nor a minimum, just as at inflection points where
the derivative vanishes in differential calculus. However, if the
Euler-Lagrange equation is not satisfied by any function, this
indicates that the optimum does not exist for that functional.
2.3.3 Different Cases for Euler-Lagrange Equation
We now discuss various cases of the E-L equation.
Case 1: V depends on $\dot{x}(t)$ and t. That is, $V = V(\dot{x}(t), t)$. Then
$V_x = 0$. The Euler-Lagrange equation (2.3.16) becomes
$$\frac{d}{dt}\left(V_{\dot{x}}\right) = 0. \tag{2.3.20}$$
This leads us to
$$V_{\dot{x}} = \frac{\partial V(\dot{x}^*(t), t)}{\partial \dot{x}} = C \tag{2.3.21}$$
where C is a constant of integration.
Case 2: V depends on $\dot{x}(t)$ only. That is, $V = V(\dot{x}(t))$. Then
$V_x = 0$. The Euler-Lagrange equation (2.3.16) becomes
$$\frac{d}{dt}\left(V_{\dot{x}}\right) = 0 \longrightarrow V_{\dot{x}} = C. \tag{2.3.22}$$
Since $V_{\dot{x}}$ in (2.3.22) depends on $\dot{x}^*(t)$ alone, $\dot{x}^*(t)$ is in general a constant, and the solution becomes
$$x^*(t) = C_1 t + C_2. \tag{2.3.23}$$
This is simply the equation of a straight line.
Case 3: V depends on x(t) and $\dot{x}(t)$. That is, $V = V(x(t), \dot{x}(t))$.
Then $V_{t\dot{x}} = 0$. Using the other form of the Euler-Lagrange equation
(2.3.19), we get
$$V_x - V_{x\dot{x}}\,\dot{x}^*(t) - V_{\dot{x}\dot{x}}\,\ddot{x}^*(t) = 0. \tag{2.3.24}$$
Multiplying the previous equation by $\dot{x}^*(t)$, we have
$$\dot{x}^*(t)\left[V_x - V_{x\dot{x}}\,\dot{x}^*(t) - V_{\dot{x}\dot{x}}\,\ddot{x}^*(t)\right] = 0. \tag{2.3.25}$$
This can be rewritten as
$$\frac{d}{dt}\left(V - \dot{x}^*(t)\,V_{\dot{x}}\right) = 0 \longrightarrow V - \dot{x}^*(t)\,V_{\dot{x}} = C. \tag{2.3.26}$$
The previous equation can be solved using any of the standard techniques, such
as separation of variables.
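The step from (2.3.25) to (2.3.26) can be checked by differentiating $V - \dot{x}^* V_{\dot{x}}$ directly. The short derivation below is a sketch that uses only the chain rule and the Case 3 assumption that V has no explicit dependence on t:

```latex
\begin{aligned}
\frac{d}{dt}\left(V - \dot{x}^* V_{\dot{x}}\right)
 &= V_x\,\dot{x}^* + V_{\dot{x}}\,\ddot{x}^*
    - \ddot{x}^*\,V_{\dot{x}}
    - \dot{x}^*\left(V_{x\dot{x}}\,\dot{x}^* + V_{\dot{x}\dot{x}}\,\ddot{x}^*\right) \\
 &= \dot{x}^*\left[V_x - V_{x\dot{x}}\,\dot{x}^* - V_{\dot{x}\dot{x}}\,\ddot{x}^*\right] = 0 .
\end{aligned}
```

The bracket is the left side of (2.3.24), so the derivative vanishes along an extremal and $V - \dot{x}^* V_{\dot{x}}$ is constant there.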
Case 4: V depends on x(t) and t, i.e., $V = V(x(t), t)$. Then
$V_{\dot{x}} = 0$ and the Euler-Lagrange equation (2.3.16) becomes
$$\frac{\partial V(x^*(t), t)}{\partial x} = 0. \tag{2.3.27}$$
The solution of this equation does not contain any arbitrary constants
and therefore, generally speaking, does not satisfy the boundary conditions
$x(t_0)$ and $x(t_f)$. Hence, in general, no solution exists for this
variational problem. Only in rare cases, when the function x(t) satisfies
the given boundary conditions $x(t_0)$ and $x(t_f)$, does it become an optimal
function.
Let us now illustrate the application of the EL equation with a very
simple classic example of finding the shortest distance between two
points. Often, we omit the * (which indicates an optimal or extremal
value) during the working of a problem and attach the same to the final
solution.
Example 2.7
Find the minimum length between any two points.
Solution: It is well known that the solution to this problem is a
straight line. However, we like to illustrate the application of the
Euler-Lagrange equation for this simple case. Consider the arc between
two points A and B as shown in Figure 2.6. Let ds be the small arc
length, and dx and dt the small rectangular coordinate values.
Note that t is the independent variable representing distance and
not time.
[Figure 2.6: Arc Length]
Then,
$$(ds)^2 = (dx)^2 + (dt)^2. \tag{2.3.28}$$
Rewriting,
$$ds = \sqrt{1 + \dot{x}^2(t)}\,dt, \quad \text{where } \dot{x}(t) = \frac{dx}{dt}. \tag{2.3.29}$$
Now the total arc length S between two points $x(t = t_0)$ and $x(t = t_f)$
is the performance index J to be minimized. Thus,
$$S = J = \int ds = \int_{t_0}^{t_f} \sqrt{1 + \dot{x}^2(t)}\,dt = \int_{t_0}^{t_f} V(\dot{x}(t))\,dt \tag{2.3.30}$$
where $V(\dot{x}(t)) = \sqrt{1 + \dot{x}^2(t)}$. Note that V is a function of $\dot{x}(t)$
only. Applying the Euler-Lagrange equation (2.3.22) to the performance
index (2.3.30), we get
$$\frac{\dot{x}^*(t)}{\sqrt{1 + \dot{x}^{*2}(t)}} = C. \tag{2.3.31}$$
Solving this equation, we get the optimal solution as
$$x^*(t) = C_1 t + C_2. \tag{2.3.32}$$
This is evidently an equation for a straight line, and the constants
$C_1$ and $C_2$ are evaluated from the given boundary conditions. For
example, if x(0) = 1 and x(2) = 5, then $C_1 = 2$ and $C_2 = 1$, and the straight
line is x*(t) = 2t + 1.
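The claim that the straight line is the shortest admissible curve can be probed numerically. The sketch below compares the arc length (2.3.30) of x*(t) = 2t + 1 on [0, 2] with that of a few perturbed curves that keep the same end points. The perturbation family $\epsilon \sin(\pi t/2)$ and the use of Simpson's rule are assumptions chosen only for illustration.

```python
import math


def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def arc_length(x_dot, t0=0.0, tf=2.0):
    """Arc length: integral of sqrt(1 + x_dot(t)^2) dt, as in (2.3.30)."""
    return simpson(lambda t: math.sqrt(1.0 + x_dot(t) ** 2), t0, tf)


def perturbed_slope(eps):
    """Slope of x(t) = 2t + 1 + eps*sin(pi*t/2); the added term is zero at t = 0 and t = 2."""
    return lambda t: 2.0 + eps * (math.pi / 2.0) * math.cos(math.pi * t / 2.0)


straight = arc_length(perturbed_slope(0.0))
lengths = {eps: arc_length(perturbed_slope(eps)) for eps in (-0.3, 0.1, 0.5)}
print(straight, math.sqrt(20.0))   # the line joins (0, 1) and (2, 5)
print(lengths)
```

Every perturbed curve in this family comes out longer than the line. This is a spot check on one family of variations, not a proof.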
Although the previous example is a simple one,
1. it illustrates the formulation of a performance index from a given
simple specification or a statement, and
2. the solution is well known a priori so that we can easily verify
the application of the EulerLagrange equation.
In the previous example, we notice that the integrand V in the functional
(2.3.30) is a function of $\dot{x}(t)$ only. Next, we take an example
where V is a function of x(t), $\dot{x}(t)$ and t.
Example 2.8
Find the optimum of
$$J = \int_0^2 \left[\dot{x}^2(t) + 2t\,x(t)\right] dt \tag{2.3.33}$$
that satisfies the boundary (initial and final) conditions
$$x(0) = 1 \quad \text{and} \quad x(2) = 5. \tag{2.3.34}$$
Solution: We first identify that $V = \dot{x}^2(t) + 2t\,x(t)$. Then applying the E-L equation (2.3.15) to the
performance index (2.3.33) we get
$$\frac{\partial V}{\partial x} - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right) = 0 \longrightarrow 2t - \frac{d}{dt}\left(2\dot{x}(t)\right) = 0 \longrightarrow \ddot{x}(t) = t. \tag{2.3.35}$$
Solving the previous simple differential equation, we have
$$x^*(t) = \frac{t^3}{6} + C_1 t + C_2 \tag{2.3.36}$$
where $C_1$ and $C_2$ are constants of integration. Using the given
boundary conditions (2.3.34) in (2.3.36), we have
$$x(0) = 1 \longrightarrow C_2 = 1, \qquad x(2) = 5 \longrightarrow C_1 = \frac{4}{3}. \tag{2.3.37}$$
With these values for the constants, we finally have the optimal
function as
$$x^*(t) = \frac{t^3}{6} + \frac{4}{3}t + 1. \tag{2.3.38}$$
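The result (2.3.38) can be checked numerically. The sketch below assumes the integrand is $\dot{x}^2(t) + 2t\,x(t)$, which is the sign implied by the step $\ddot{x}(t) = t$ in (2.3.35), and compares J on x*(t) with J on perturbed curves $x^*(t) + \epsilon \sin(\pi t/2)$ that respect the boundary conditions. The perturbation family and the Simpson quadrature are illustrative assumptions. Because the first variation vanishes on x*(t) and the functional is quadratic, the increment should equal $\int_0^2 (\delta\dot{x}(t))^2\,dt = \epsilon^2\pi^2/4$.

```python
import math


def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def x_star(t):
    return t**3 / 6.0 + 4.0 * t / 3.0 + 1.0


def x_star_dot(t):
    return t**2 / 2.0 + 4.0 / 3.0


def J(x, x_dot):
    """Performance index with the assumed integrand x_dot^2 + 2*t*x on [0, 2]."""
    return simpson(lambda t: x_dot(t) ** 2 + 2.0 * t * x(t), 0.0, 2.0)


def perturbed(eps):
    """Admissible neighbor: the added term vanishes at t = 0 and t = 2."""
    x = lambda t: x_star(t) + eps * math.sin(math.pi * t / 2.0)
    x_dot = lambda t: x_star_dot(t) + eps * (math.pi / 2.0) * math.cos(math.pi * t / 2.0)
    return x, x_dot


J_star = J(x_star, x_star_dot)
increments = {eps: J(*perturbed(eps)) - J_star for eps in (-0.5, 0.1, 0.5)}
print(x_star(0.0), x_star(2.0))   # boundary values
print(increments)
```

All increments are positive for this family, which is consistent with x*(t) giving a minimum of the assumed functional.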
Another classical example in the calculus of variations is the brachistochrone
(from the Greek brachistos, the shortest, and chronos, time) problem, and
this problem is dealt with in almost all books on calculus of variations
[126].
Further, note that we have considered here only the so-called fixed-end
point problem where both (initial and final) ends are fixed or given
in advance. Other types of problems such as free-end point problems
are not presented here but can be found in most of the books on the
calculus of variations [79, 46, 81, 48]. However, these free-end point
problems are better considered later in this chapter when we discuss
the optimal control problem consisting of a performance index and a
physical plant.
2.4 The Second Variation
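In the study of extrema of functionals, we have so far considered only
the necessary condition for a functional to have a relative or weak extremum,
i.e., the condition that the first variation vanish, leading to
the classic Euler-Lagrange equation. To establish the nature of the optimum
(maximum or minimum), it is required to examine the second
variation. In the relation (2.3.7) for the increment, consider the terms
corresponding to the second variation [120],
$$\delta^2 J = \int_{t_0}^{t_f} \frac{1}{2!}\left[\left(\frac{\partial^2 V}{\partial x^2}\right)_* (\delta x(t))^2 + \left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* (\delta\dot{x}(t))^2 + 2\left(\frac{\partial^2 V}{\partial x\,\partial \dot{x}}\right)_* \delta x(t)\,\delta\dot{x}(t)\right] dt. \tag{2.4.1}$$
Consider the last term in the previous equation and rewrite it in terms
of $\delta x(t)$ only using integration by parts ($\int u\,dv = uv - \int v\,du$ where
$u = \frac{\partial^2 V}{\partial x\,\partial \dot{x}}\,\delta x(t)$ and $v = \delta x(t)$). Then using $\delta x(t_0) = \delta x(t_f) = 0$ for
fixed-end conditions, we get
$$\delta^2 J = \frac{1}{2}\int_{t_0}^{t_f} \left[\left\{\left(\frac{\partial^2 V}{\partial x^2}\right)_* - \frac{d}{dt}\left(\frac{\partial^2 V}{\partial x\,\partial \dot{x}}\right)_*\right\}(\delta x(t))^2 + \left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* (\delta\dot{x}(t))^2\right] dt. \tag{2.4.2}$$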
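According to Theorem 2.1, the fundamental theorem of the calculus of
variations, the sufficient condition for a minimum is $\delta^2 J > 0$. This, for
arbitrary values of $\delta x(t)$ and $\delta\dot{x}(t)$, means that
$$\left(\frac{\partial^2 V}{\partial x^2}\right)_* - \frac{d}{dt}\left(\frac{\partial^2 V}{\partial x\,\partial \dot{x}}\right)_* > 0, \tag{2.4.3}$$
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* > 0. \tag{2.4.4}$$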
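For maximum, the signs of the previous conditions are reversed. Alternatively,
we can rewrite the second variation (2.4.1) in matrix form
as
$$
\begin{aligned}
\delta^2 J &= \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} \delta x(t) & \delta\dot{x}(t) \end{bmatrix} \begin{bmatrix} \dfrac{\partial^2 V}{\partial x^2} & \dfrac{\partial^2 V}{\partial x\,\partial \dot{x}} \\ \dfrac{\partial^2 V}{\partial \dot{x}\,\partial x} & \dfrac{\partial^2 V}{\partial \dot{x}^2} \end{bmatrix}_* \begin{bmatrix} \delta x(t) \\ \delta\dot{x}(t) \end{bmatrix} dt \\
&= \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} \delta x(t) & \delta\dot{x}(t) \end{bmatrix} \Pi \begin{bmatrix} \delta x(t) \\ \delta\dot{x}(t) \end{bmatrix} dt
\end{aligned} \tag{2.4.5}
$$
where
$$\Pi = \begin{bmatrix} \dfrac{\partial^2 V}{\partial x^2} & \dfrac{\partial^2 V}{\partial x\,\partial \dot{x}} \\ \dfrac{\partial^2 V}{\partial \dot{x}\,\partial x} & \dfrac{\partial^2 V}{\partial \dot{x}^2} \end{bmatrix}_*. \tag{2.4.6}$$
If the matrix $\Pi$ in the previous equation is positive (negative) definite,
we establish a minimum (maximum). In many cases, since $\delta x(t)$ is
arbitrary, the coefficient of $(\delta\dot{x}(t))^2$, i.e., $\partial^2 V/\partial \dot{x}^2$, determines the sign
of $\delta^2 J$. That is, the sign of the second variation agrees with the sign of
$\partial^2 V/\partial \dot{x}^2$. Thus, for minimization we require
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* > 0. \tag{2.4.7}$$
For maximization, the sign of the previous condition reverses. In the
literature, this condition is called the Legendre condition [126].
In 1786, Legendre obtained this result of deciding whether a
given optimum is maximum or minimum by examining the
second variation. The second variation technique was further
generalized by Jacobi in 1836 and hence this condition
is usually called the Legendre-Jacobi condition.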
Example 2.9
Verify that the straight line represents the minimum distance between
two points.
Solution: This is an obvious solution; however, we illustrate the
second variation. Earlier in Example 2.7, we formulated a
functional for the distance between two points as
$$J = \int_{t_0}^{t_f} \sqrt{1 + \dot{x}^2(t)}\,dt \tag{2.4.8}$$
and found that the optimum is a straight line $x^*(t) = C_1 t + C_2$. To
satisfy the sufficiency condition (2.4.7), we find
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* = \frac{1}{\left[1 + \dot{x}^{*2}(t)\right]^{3/2}}. \tag{2.4.9}$$
Since $\dot{x}^*(t)$ is a constant (+ve or -ve), the previous equation satisfies
the condition (2.4.7). Hence, the distance between two points as
given by x*(t) (straight line) is minimum.
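The expression (2.4.9) can be cross-checked against a finite-difference second derivative of $V(\dot{x}) = \sqrt{1 + \dot{x}^2}$. In the sketch below, the sample slopes and the step size h are arbitrary choices, and the function names are hypothetical.

```python
import math


def V(x_dot):
    """Integrand of the arc-length functional (2.4.8)."""
    return math.sqrt(1.0 + x_dot**2)


def legendre_term(x_dot):
    """Closed form of the second partial of V w.r.t. x_dot, as in (2.4.9)."""
    return 1.0 / (1.0 + x_dot**2) ** 1.5


def second_derivative_fd(func, p, h=1e-4):
    """Central finite-difference estimate of the second derivative of func at p."""
    return (func(p + h) - 2.0 * func(p) + func(p - h)) / h**2


slopes = (-3.0, 0.0, 2.0)   # sample constant slopes, positive and negative
checks = [(p, legendre_term(p), second_derivative_fd(V, p)) for p in slopes]
for p, exact, approx in checks:
    print(p, exact, approx)
```

The closed form is positive for every slope, so the Legendre condition (2.4.7) holds whatever the sign of the constant $\dot{x}^*(t)$.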
Next, we begin the second stage of optimal control. We consider optimization
(or extremization) of a functional with a plant, which is
considered as a constraint or a condition along with the functional. In
other words, we address the extremization of a functional with some
condition, which is in the form of a plant equation. The plant takes
the form of a state equation, leading us to optimal control of dynamic
systems. This section is motivated by [6, 79, 120, 108].
2.5 Extrema of Functions with Conditions
We begin with an example of finding the extrema of a function under
a condition (or constraint). We solve this example with two methods,
first by direct method and then by Lagrange multiplier method. Let us
note that we consider this simple example only to illustrate some basic
concepts associated with conditional extremization [120].
Example 2.10
A manufacturer wants to maximize the volume of the material
stored in a circular tank subject to the condition that the material
used for the tank is limited (constant). Thus, for a constant
thickness of the material, the manufacturer wants to minimize the
volume of the material used and hence part of the cost for the tank.
Solution: If a fixed metal thickness is assumed, this condition implies
that the cross-sectional area of the tank material is constant.
Let d and h be the diameter and the height of the circular tank.
Then the volume contained by the tank is
$$V(d, h) = \pi d^2 h/4 \tag{2.5.1}$$
and the cross-sectional surface area (upper, lower and side) of the
tank is
$$A(d, h) = 2\pi d^2/4 + \pi d h = A_0. \tag{2.5.2}$$
Our intent is to maximize V(d, h) keeping $A(d, h) = A_0$, where $A_0$
is a given constant. We discuss two methods: the first is the
Direct Method, which uses simple calculus, and the second is the
Lagrange Multiplier Method.
1 Direct Method: In solving for the optimum value directly, we
eliminate one of the variables, say h, from the volume relation
(2.5.1) using the area relation (2.5.2). By doing so, the condition is
embedded in the original function to be extremized. From (2.5.2),
$$h = \frac{A_0 - \pi d^2/2}{\pi d}. \tag{2.5.3}$$
Using the relation (2.5.3) for height in the relation (2.5.1) for volume,
$$V(d) = \frac{A_0 d}{4} - \frac{\pi d^3}{8}. \tag{2.5.4}$$
Now, to find the extrema of this simple calculus problem, we differentiate
(2.5.4) w.r.t. d and set it to zero to get
$$\frac{A_0}{4} - \frac{3}{8}\pi d^2 = 0. \tag{2.5.5}$$
Solving, we get the optimal value of d as
$$d^* = \sqrt{\frac{2A_0}{3\pi}}. \tag{2.5.6}$$
By demanding that, as per Definition 2.3 for optimum of a
function, the second derivative of V w.r.t. d in (2.5.4) be negative
for maximum, we can easily see that the positive value of
the square root function corresponds to the maximum value of the
function. Substituting the optimal value of the diameter (2.5.6) in
the original cross-sectional area given by (2.5.2), and solving for
the optimum h*, we get
$$h^* = \sqrt{\frac{2A_0}{3\pi}}. \tag{2.5.7}$$
Thus, we see from (2.5.6) and (2.5.7) that the volume stored by a
tank is maximized if the height of the tank is made equal to its
diameter.
2 Lagrange Multiplier Method: Now we solve the above problem
by applying the Lagrange multiplier method. We form a new function
to be extremized by adjoining the given condition to the original
function. The new adjoined function is extremized in the normal
way by taking the partial derivatives w.r.t. all its variables, making
them equal to zero, and solving for these variables, which are extremals.
Let the original volume relation (2.5.1) to be extremized
be rewritten as
$$f(d, h) = \pi d^2 h/4 \tag{2.5.8}$$
and the condition (2.5.2) to be satisfied as
$$g(d, h) = 2\pi d^2/4 + \pi d h - A_0 = 0. \tag{2.5.9}$$
Then a new adjoined function $\mathcal{L}$ (called the Lagrangian) is formed as
$$
\begin{aligned}
\mathcal{L}(d, h, \lambda) &= f(d, h) + \lambda g(d, h) \\
&= \pi d^2 h/4 + \lambda\left(2\pi d^2/4 + \pi d h - A_0\right)
\end{aligned} \tag{2.5.10}
$$
where $\lambda$, a parameter yet to be determined, is called the Lagrange
multiplier. Now, since the Lagrangian $\mathcal{L}$ is a function of three
optimal variables d, h, and $\lambda$, we take the partial derivatives of
$\mathcal{L}(d, h, \lambda)$ w.r.t. each of the variables d, h and $\lambda$ and set them to
zero. Thus,
$$\frac{\partial \mathcal{L}}{\partial d} = \pi d h/2 + \lambda(\pi d + \pi h) = 0 \tag{2.5.11}$$
$$\frac{\partial \mathcal{L}}{\partial h} = \pi d^2/4 + \lambda(\pi d) = 0 \tag{2.5.12}$$
$$\frac{\partial \mathcal{L}}{\partial \lambda} = 2\pi d^2/4 + \pi d h - A_0 = 0. \tag{2.5.13}$$
Now, solving the previous three relations (2.5.11) to (2.5.13) for
the three variables d*, h*, and $\lambda^*$, we get
$$d^* = \sqrt{\frac{2A_0}{3\pi}}; \qquad h^* = \sqrt{\frac{2A_0}{3\pi}}; \qquad \lambda^* = -\sqrt{\frac{A_0}{24\pi}}. \tag{2.5.14}$$
Once again, to maximize the volume of a cylindrical tank, we need
to have the height (h*) equal to the diameter (d*) of the tank. Note
that we need to take the negative value of the square root function
for $\lambda$ in (2.5.14) in order to satisfy the physical requirement that
the diameter d obtained from (2.5.12) as
$$d = -4\lambda \tag{2.5.15}$$
is a positive value.
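The stationarity conditions (2.5.11) to (2.5.13) can be verified numerically for a sample tank. In the sketch below, the value $A_0 = 6\pi$ is a hypothetical choice (it gives d* = h* = 2 and $\lambda^*$ = -0.5), and the function names are invented for this illustration. The last helper walks along the constraint using (2.5.3) to compare volumes near the optimum.

```python
import math

A0 = 6.0 * math.pi   # hypothetical surface area, chosen so that d* = h* = 2

d_opt = math.sqrt(2.0 * A0 / (3.0 * math.pi))
h_opt = math.sqrt(2.0 * A0 / (3.0 * math.pi))
lam_opt = -math.sqrt(A0 / (24.0 * math.pi))   # negative root, so that d = -4*lambda > 0


def grad_lagrangian(d, h, lam):
    """Partial derivatives of the Lagrangian (2.5.10) w.r.t. d, h and lambda."""
    dL_dd = math.pi * d * h / 2.0 + lam * (math.pi * d + math.pi * h)
    dL_dh = math.pi * d**2 / 4.0 + lam * math.pi * d
    dL_dlam = 2.0 * math.pi * d**2 / 4.0 + math.pi * d * h - A0
    return dL_dd, dL_dh, dL_dlam


def volume_on_constraint(d):
    """Volume (2.5.1) with h eliminated through the area condition (2.5.3)."""
    h = (A0 - math.pi * d**2 / 2.0) / (math.pi * d)
    return math.pi * d**2 * h / 4.0


residuals = grad_lagrangian(d_opt, h_opt, lam_opt)
print(d_opt, h_opt, lam_opt)
print(residuals)   # all three should be numerically zero
print([volume_on_constraint(d) for d in (1.5, 1.9, 2.0, 2.1, 2.5)])
```

The volumes at neighboring diameters on the constraint are all below the value at d = 2, in agreement with the direct method.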
Now, we generalize the previous two methods.
2.5.1 Direct Method
Now we generalize the preceding method of elimination using differential
calculus. Consider the extrema of a function $f(x_1, x_2)$ with two
interdependent variables $x_1$ and $x_2$, subject to the condition
$$g(x_1, x_2) = 0. \tag{2.5.16}$$
As a necessary condition for extrema, we have
$$df = \frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 = 0. \tag{2.5.17}$$
However, since $dx_1$ and $dx_2$ are not arbitrary, but related by the condition
$$dg = \frac{\partial g}{\partial x_1}dx_1 + \frac{\partial g}{\partial x_2}dx_2 = 0, \tag{2.5.18}$$
it is not possible to conclude, as in the case of extremization of functions
without conditions, that
$$\frac{\partial f}{\partial x_1} = 0 \quad \text{and} \quad \frac{\partial f}{\partial x_2} = 0 \tag{2.5.19}$$
in the necessary condition (2.5.17). This is easily seen, since if the
set of extrema conditions (2.5.19) is solved for optimal values $x_1^*$ and
$x_2^*$, there is no guarantee that these optimal values would, in general,
satisfy the given condition (2.5.16).
In order to find optimal values that satisfy both the condition (2.5.16)
and that of the extrema conditions (2.5.17), we arbitrarily choose one
of the variables, say Xl, as the independent variable. Then X2 becomes
a dependent variable as per the condition (2.5.16). Now, assuming that
8g/8x2 i 0, (2.5.18) becomes
{
8
g
/
8X
I}
dX2 =  8g/8
x
2 dXI
(2.5.20)
and using (2.5.20) in the necessary condition (2.5.17), we have
(2.5.21)
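As we have chosen $dx_1$ to be the independent differential, we can now
consider it to be arbitrary, and conclude that in order to satisfy
(2.5.21), the coefficient of $dx_1$ must be zero. That is
$$\left(\frac{\partial f}{\partial x_1}\right)\left(\frac{\partial g}{\partial x_2}\right) - \left(\frac{\partial f}{\partial x_2}\right)\left(\frac{\partial g}{\partial x_1}\right) = 0. \tag{2.5.22}$$
Now, the relation (2.5.22) and the condition (2.5.16) are solved
simultaneously for the optimal solutions $x_1^*$ and $x_2^*$. Equation (2.5.22)
can be rewritten as
$$\begin{vmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} \\[2mm] \dfrac{\partial g}{\partial x_1} & \dfrac{\partial g}{\partial x_2} \end{vmatrix} = 0. \tag{2.5.23}$$
This is also, as we know, the Jacobian of $f$ and $g$ w.r.t. $x_1$ and $x_2$.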
This method of elimination of the dependent variables is quite tedious
for higher order problems.
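The direct method can be sketched numerically. The following minimal Python example assumes a hypothetical problem that does not appear in the text, $f = x_1^2 + x_2^2$ subject to $g = x_1 + 2x_2 - 5 = 0$; the function names and the bisection bracket are illustrative choices only. It eliminates $x_2$ through the constraint and then finds the root of the Jacobian condition (2.5.22).

```python
# Direct (elimination) method for a hypothetical problem, not from the text:
#   extremize f(x1, x2) = x1**2 + x2**2  subject to  g(x1, x2) = x1 + 2*x2 - 5 = 0.

def df_dx1(x1, x2):
    return 2.0 * x1

def df_dx2(x1, x2):
    return 2.0 * x2

def dg_dx1(x1, x2):
    return 1.0

def dg_dx2(x1, x2):
    return 2.0

def x2_from_constraint(x1):
    """Dependent variable x2 obtained by solving g(x1, x2) = 0 for x2."""
    return (5.0 - x1) / 2.0

def jacobian_condition(x1):
    """Left-hand side of (2.5.22), evaluated on the constraint curve."""
    x2 = x2_from_constraint(x1)
    return df_dx1(x1, x2) * dg_dx2(x1, x2) - df_dx2(x1, x2) * dg_dx1(x1, x2)

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

x1_opt = bisect(jacobian_condition, -10.0, 10.0)
x2_opt = x2_from_constraint(x1_opt)
print(x1_opt, x2_opt)
```

For this particular choice the constraint can be solved explicitly for $x_2$; when that is not possible, the elimination step itself becomes the difficulty, which is the tedium referred to above.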
2.5.2 Lagrange Multiplier Method
We now generalize the second method of solving the same problem of
extrema of functions with conditions. Consider again the extrema of
the function $f(x_1, x_2)$ subject to the condition
$$g(x_1, x_2) = 0. \tag{2.5.24}$$
In this method, we form an augmented Lagrangian function
$$\mathcal{L}(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda\, g(x_1, x_2) \tag{2.5.25}$$
where $\lambda$, a parameter (multiplier) yet to be determined, is the
Lagrange multiplier. Let us note that using the given condition (2.5.24)
in the Lagrangian (2.5.25), we have
$$\mathcal{L}(x_1, x_2, \lambda) = f(x_1, x_2) \tag{2.5.26}$$
and therefore a necessary condition for extrema is that
$$df = d\mathcal{L} = 0. \tag{2.5.27}$$
Accepting the idea that the Lagrangian (2.5.25) is a better
representation of the entire problem than the equation (2.5.26) in finding
the extrema, we have from the Lagrangian relation (2.5.25)
$$d\mathcal{L} = df + \lambda\, dg = 0. \tag{2.5.28}$$
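Using (2.5.17) and (2.5.18) in (2.5.28), and rearranging
$$\left[\frac{\partial f}{\partial x_1} + \lambda \frac{\partial g}{\partial x_1}\right] dx_1 + \left[\frac{\partial f}{\partial x_2} + \lambda \frac{\partial g}{\partial x_2}\right] dx_2 = 0. \tag{2.5.29}$$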
Now $dx_1$ and $dx_2$ are not both independent, and hence we cannot
immediately conclude that each of the coefficients of $dx_1$ and $dx_2$ in
(2.5.29) must be zero. Let us choose $dx_1$ to be the independent
differential; then $dx_2$ becomes a dependent differential as per (2.5.18).
Further, let us choose the multiplier $\lambda$, which has been introduced by
us and is at our disposal, to make one of the coefficients of $dx_1$ or
$dx_2$ in (2.5.29) zero. For example, let $\lambda$ take on the value
$\lambda^*$ that makes the coefficient of the dependent differential $dx_2$
equal to zero, that is
$$\frac{\partial f}{\partial x_2} + \lambda^* \frac{\partial g}{\partial x_2} = 0. \tag{2.5.30}$$
With (2.5.30), the equation (2.5.29) reduces to
$$\left[\frac{\partial f}{\partial x_1} + \lambda^* \frac{\partial g}{\partial x_1}\right] dx_1 = 0. \tag{2.5.31}$$
Since $dx_1$ is the independent differential, it can be varied arbitrarily.
Hence, for (2.5.31) to be satisfied for all $dx_1$, the coefficient of $dx_1$
must be zero. That is
$$\frac{\partial f}{\partial x_1} + \lambda^* \frac{\partial g}{\partial x_1} = 0. \tag{2.5.32}$$
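Now from (2.5.25), note that
$$\frac{\partial \mathcal{L}}{\partial \lambda} = g(x_1, x_2) = 0 \tag{2.5.33}$$
yields the constraint relation (2.5.16). Combining the results from
(2.5.32), (2.5.30), and (2.5.33), we have
$$\frac{\partial \mathcal{L}}{\partial x_1} = \frac{\partial f}{\partial x_1} + \lambda^* \frac{\partial g}{\partial x_1} = 0 \tag{2.5.34}$$
$$\frac{\partial \mathcal{L}}{\partial x_2} = \frac{\partial f}{\partial x_2} + \lambda^* \frac{\partial g}{\partial x_2} = 0 \tag{2.5.35}$$
$$\frac{\partial \mathcal{L}}{\partial \lambda} = g(x_1^*, x_2^*) = 0. \tag{2.5.36}$$
The preceding three equations are to be solved simultaneously to obtain
$x_1^*$, $x_2^*$, and $\lambda^*$. Eliminating $\lambda^*$ between (2.5.34) and
(2.5.35) gives
$$\left(\frac{\partial f}{\partial x_1}\right)\left(\frac{\partial g}{\partial x_2}\right) - \left(\frac{\partial f}{\partial x_2}\right)\left(\frac{\partial g}{\partial x_1}\right) = 0 \tag{2.5.37}$$
which is the same condition as (2.5.22) obtained by the direct method,
thus indicating that we have the same result by the Lagrange multiplier
method.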
Let us note that the necessary conditions (2.5.34) and (2.5.35) are
just the same equations which would have been obtained from considering
the differentials $dx_1$ and $dx_2$ as though they were independent
in (2.5.29). Introduction of the multiplier $\lambda$ has allowed us to treat
all the variables in the augmented function $\mathcal{L}(x_1, x_2, \lambda)$ as
though each variable is independent. Thus, the multiplier $\lambda$ has acted
like a catalyst, appearing in the intermediate stage only.
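As a small numerical sketch of (2.5.34) to (2.5.36), consider the hypothetical problem $f = x_1^2 + x_2^2$, $g = x_1 + 2x_2 - 5 = 0$ (an illustrative choice, not taken from the text). Because $f$ is quadratic and $g$ is linear, the three stationarity equations are linear in $x_1$, $x_2$, and $\lambda$ and can be solved by Cramer's rule; the helper names `det3` and `solve3` are assumptions of this sketch.

```python
# Lagrange multiplier method for a hypothetical problem, not from the text:
#   f = x1**2 + x2**2,  g = x1 + 2*x2 - 5 = 0,  L = f + lam*g.
# Stationarity of L gives the linear system
#   dL/dx1  : 2*x1          + 1*lam = 0     (2.5.34)
#   dL/dx2  :        2*x2   + 2*lam = 0     (2.5.35)
#   dL/dlam : 1*x1 + 2*x2           = 5     (2.5.36)

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution

A = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 2.0],
     [1.0, 2.0, 0.0]]
b = [0.0, 0.0, 5.0]

x1_opt, x2_opt, lam_opt = solve3(A, b)

# Eliminating lam between the first two equations reproduces (2.5.37).
jacobian_residual = (2.0 * x1_opt) * 2.0 - (2.0 * x2_opt) * 1.0
print(x1_opt, x2_opt, lam_opt, jacobian_residual)
```

All three unknowns are treated on the same footing here, which is the practical benefit of the multiplier: no variable has to be singled out and eliminated by hand.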
Summarizing, the extrema of a function $f(x_1, x_2)$ subject to the
condition (or constraint) $g(x_1, x_2) = 0$ is equivalent to extrema of a
single augmented function $\mathcal{L}(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda g(x_1, x_2)$
as though $x_1$, $x_2$ and $\lambda$ are independent. We now generalize this result.
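THEOREM 2.2
Consider the extrema of a continuous, real-valued function
$f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)$ subject to the conditions
$$\begin{aligned} g_1(\mathbf{x}) &= g_1(x_1, x_2, \ldots, x_n) = 0 \\ g_2(\mathbf{x}) &= g_2(x_1, x_2, \ldots, x_n) = 0 \\ &\;\;\vdots \\ g_m(\mathbf{x}) &= g_m(x_1, x_2, \ldots, x_n) = 0 \end{aligned} \tag{2.5.38}$$
where $f$ and $\mathbf{g}$ have continuous partial derivatives, and $m < n$. Let
$\lambda_1, \lambda_2, \ldots, \lambda_m$ be the Lagrange multipliers corresponding to
the $m$ conditions, such that the augmented Lagrangian function is formed as
$$\mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \boldsymbol{\lambda}'\mathbf{g}(\mathbf{x}), \tag{2.5.39}$$
where $\boldsymbol{\lambda}'$ is the transpose of $\boldsymbol{\lambda}$. Then the optimal
values $\mathbf{x}^*$ and $\boldsymbol{\lambda}^*$ are the solutions of the following
$n + m$ equations
$$\frac{\partial \mathcal{L}}{\partial \mathbf{x}} = \frac{\partial f}{\partial \mathbf{x}} + \boldsymbol{\lambda}'\frac{\partial \mathbf{g}}{\partial \mathbf{x}} = 0 \tag{2.5.40}$$
$$\frac{\partial \mathcal{L}}{\partial \boldsymbol{\lambda}} = \mathbf{g}(\mathbf{x}) = 0. \tag{2.5.41}$$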
Features of Lagrange Multiplier: The Lagrange multiplier method
is a powerful one in finding the extrema of functions subject to
conditions. It has the following attractive features:
1. The importance of the Lagrange multiplier technique lies in the
fact that the problem of determining the extrema of the function
$f(\mathbf{x})$ subject to the conditions $\mathbf{g}(\mathbf{x}) = 0$ is embedded
within the simpler problem of determining the extrema of the
augmented function $\mathcal{L}(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \boldsymbol{\lambda}'\mathbf{g}(\mathbf{x})$.
2. Introduction of the Lagrange multiplier allows us to treat all the
variables $\mathbf{x}$ and $\boldsymbol{\lambda}$ in the augmented function
$\mathcal{L}(\mathbf{x}, \boldsymbol{\lambda})$ as though each were independent.
3. The multiplier $\boldsymbol{\lambda}$ acts like a catalyst in the sense that it
is introduced to perform a certain duty as given by item 2.
4. The increased dimensionality ($n + m$), which is characteristic of
the Lagrange multiplier method, is generally more than compensated
by the relative simplicity and systematic procedure of the
technique.
The multiplier method was given by Lagrange in 1788.
2.6 Extrema of Functionals with Conditions
In this section, we extend our ideas to functionals based on those
developed in the last section for functions. First, we consider a
functional with two variables, use the results of the previous section on
the CoV, derive the necessary conditions, and then extend the same to a
general nth order vector case. Consider the extremization of the
performance index in the form of a functional
$$J(x_1(t), x_2(t), t) = \int_{t_0}^{t_f} V(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t), t)\,dt \tag{2.6.1}$$
subject to the condition (plant or system equation)
$$g(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t)) = 0 \tag{2.6.2}$$
with fixed-end-point conditions
$$x_1(t_0) = x_{10}; \quad x_2(t_0) = x_{20}; \quad x_1(t_f) = x_{1f}; \quad x_2(t_f) = x_{2f}. \tag{2.6.3}$$
Now we address this problem under the following steps.
• Step 1: Lagrangian
• Step 2: Variations and Increment
• Step 3: First Variation
• Step 4: Fundamental Theorem
• Step 5: Fundamental Lemma
• Step 6: Euler-Lagrange Equation
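• Step 1: Lagrangian: We form an augmented functional
$$J_a = \int_{t_0}^{t_f} \mathcal{L}(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t), \lambda(t), t)\,dt \tag{2.6.4}$$
where $\lambda(t)$ is the Lagrange multiplier, and the Lagrangian $\mathcal{L}$ is
defined as
$$\begin{aligned} \mathcal{L} &= \mathcal{L}(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t), \lambda(t), t) \\ &= V(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t), t) + \lambda(t)\, g(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t)). \end{aligned} \tag{2.6.5}$$
Note from the performance index (2.6.1) and the augmented
performance index (2.6.4) that $J_a = J$ if the condition (2.6.2) is
satisfied for any $\lambda(t)$.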
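• Step 2: Variations and Increment: Next, assume optimal values
and then consider the variations and increment as
$$\begin{aligned} x_i(t) &= x_i^*(t) + \delta x_i(t), \qquad \dot{x}_i(t) = \dot{x}_i^*(t) + \delta \dot{x}_i(t), \\ \Delta J_a &= J_a(x_i^*(t) + \delta x_i(t), \dot{x}_i^*(t) + \delta \dot{x}_i(t), t) - J_a(x_i^*(t), \dot{x}_i^*(t), t), \end{aligned} \tag{2.6.6}$$
for $i = 1, 2$.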
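• Step 3: First Variation: Then using the Taylor series expansion
and retaining linear terms only, the first variation of the
functional $J_a$ becomes
$$\delta J_a = \int_{t_0}^{t_f} \left[ \left(\frac{\partial \mathcal{L}}{\partial x_1}\right)_* \delta x_1(t) + \left(\frac{\partial \mathcal{L}}{\partial x_2}\right)_* \delta x_2(t) + \left(\frac{\partial \mathcal{L}}{\partial \dot{x}_1}\right)_* \delta \dot{x}_1(t) + \left(\frac{\partial \mathcal{L}}{\partial \dot{x}_2}\right)_* \delta \dot{x}_2(t) \right] dt. \tag{2.6.7}$$
As before in the section on CoV, we rewrite the terms containing
$\delta \dot{x}_1(t)$ and $\delta \dot{x}_2(t)$ in terms of those containing
$\delta x_1(t)$ and $\delta x_2(t)$ only (using integration by parts,
$\int u\,dv = uv - \int v\,du$). Thus, for $i = 1, 2$,
$$\int_{t_0}^{t_f} \left(\frac{\partial \mathcal{L}}{\partial \dot{x}_i}\right)_* \delta \dot{x}_i(t)\,dt = \left[\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_i}\right)_* \delta x_i(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f} \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_i}\right)_* \delta x_i(t)\,dt. \tag{2.6.8}$$
Using the above, we have the first variation (2.6.7) as
$$\begin{aligned} \delta J_a = {} & \int_{t_0}^{t_f} \left[\left(\frac{\partial \mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_1}\right)_*\right] \delta x_1(t)\,dt + \int_{t_0}^{t_f} \left[\left(\frac{\partial \mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_2}\right)_*\right] \delta x_2(t)\,dt \\ & + \left[\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_1}\right)_* \delta x_1(t)\right]_{t_0}^{t_f} + \left[\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_2}\right)_* \delta x_2(t)\right]_{t_0}^{t_f}. \end{aligned} \tag{2.6.9}$$
Since this is a fixed-final time and fixed-final state problem as
given by (2.6.3), no variations are allowed at the end points. This
means
$$\delta x_1(t_0) = \delta x_2(t_0) = \delta x_1(t_f) = \delta x_2(t_f) = 0. \tag{2.6.10}$$
Using the boundary variations (2.6.10) in the augmented first
variation (2.6.9), we have
$$\delta J_a = \int_{t_0}^{t_f} \left[\left(\frac{\partial \mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_1}\right)_*\right] \delta x_1(t)\,dt + \int_{t_0}^{t_f} \left[\left(\frac{\partial \mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_2}\right)_*\right] \delta x_2(t)\,dt. \tag{2.6.11}$$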
• Step 4: Fundamental Theorem: Now, we proceed as follows.
1. We invoke the fundamental theorem of the calculus of
variations (Theorem 2.1) and make the first variation (2.6.11)
equal to zero.
2. Remembering that $\delta x_1(t)$ and $\delta x_2(t)$ are not both
independent, because $x_1(t)$ and $x_2(t)$ are related by the condition
(2.6.2), we choose $\delta x_2(t)$ as the independent variation and
$\delta x_1(t)$ as the dependent variation.
3. Let us choose the multiplier $\lambda^*(t)$, which is arbitrarily
introduced and is at our disposal, in such a way that the
coefficient of the dependent variation $\delta x_1(t)$ in (2.6.11)
vanishes. That is
$$\left(\frac{\partial \mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_1}\right)_* = 0. \tag{2.6.12}$$
With these choices, the first variation (2.6.11) becomes
$$\int_{t_0}^{t_f} \left[\left(\frac{\partial \mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_2}\right)_*\right] \delta x_2(t)\,dt = 0. \tag{2.6.13}$$
• Step 5: Fundamental Lemma: Using the fundamental lemma of
CoV (Lemma 2.1) and noting that $\delta x_2(t)$ has been chosen
to be the independent variation and hence is arbitrary, the only way
(2.6.13) can be satisfied, in general, is that the coefficient of
$\delta x_2(t)$ also vanishes. That is
$$\left(\frac{\partial \mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_2}\right)_* = 0. \tag{2.6.14}$$
Also, from the Lagrangian (2.6.5) note that
$$\left(\frac{\partial \mathcal{L}}{\partial \lambda}\right)_* = 0 \tag{2.6.15}$$
yields the constraint relation (2.6.2).
• Step 6: Euler-Lagrange Equation: Combining the various
relations (2.6.12), (2.6.14), and (2.6.15), the necessary conditions for
extremization of the functional (2.6.1) subject to the condition
(2.6.2) (according to the Euler-Lagrange equation) are
$$\left(\frac{\partial \mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_1}\right)_* = 0 \tag{2.6.16}$$
$$\left(\frac{\partial \mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}_2}\right)_* = 0 \tag{2.6.17}$$
$$\left(\frac{\partial \mathcal{L}}{\partial \lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\lambda}}\right)_* = 0. \tag{2.6.18}$$
Let us note that these conditions are just the ones that would have
been obtained from the Lagrangian (2.6.5), as if both $\delta x_1(t)$ and
$\delta x_2(t)$ had been independent. Also, in (2.6.18), the Lagrangian
$\mathcal{L}$ is independent of $\dot{\lambda}(t)$ and hence the condition (2.6.18)
is really the given plant equation (2.6.2).
Thus, the introduction of the Lagrange multiplier $\lambda(t)$ has
enabled us to treat the variables $x_1(t)$ and $x_2(t)$ as though they
were independent, in spite of the fact that they are related by the
condition (2.6.2). The solution of the two second-order differential
equations (2.6.16) and (2.6.17) and the condition relation
(2.6.2) or (2.6.18), along with the boundary conditions (2.6.3), gives
the optimal solutions $x_1^*(t)$, $x_2^*(t)$, and $\lambda^*(t)$.
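Now, we generalize the preceding procedure for an nth order system.
Consider the extremization of a functional
$$J = \int_{t_0}^{t_f} V(\mathbf{x}(t), \dot{\mathbf{x}}(t), t)\,dt \tag{2.6.19}$$
where $\mathbf{x}(t)$ is an nth order state vector, subject to the plant
equation (condition)
$$g_i(\mathbf{x}(t), \dot{\mathbf{x}}(t), t) = 0; \quad i = 1, 2, \ldots, m \tag{2.6.20}$$
and boundary conditions $\mathbf{x}(t_0)$ and $\mathbf{x}(t_f)$. We form an
augmented functional
$$J_a = \int_{t_0}^{t_f} \mathcal{L}(\mathbf{x}(t), \dot{\mathbf{x}}(t), \boldsymbol{\lambda}(t), t)\,dt \tag{2.6.21}$$
where the Lagrangian $\mathcal{L}$ is given by
$$\mathcal{L}(\mathbf{x}(t), \dot{\mathbf{x}}(t), \boldsymbol{\lambda}(t), t) = V(\mathbf{x}(t), \dot{\mathbf{x}}(t), t) + \boldsymbol{\lambda}'(t)\,\mathbf{g}(\mathbf{x}(t), \dot{\mathbf{x}}(t), t) \tag{2.6.22}$$
and the Lagrange multiplier $\boldsymbol{\lambda}(t) = [\lambda_1(t), \lambda_2(t), \ldots, \lambda_m(t)]'$.
We now apply the Euler-Lagrange equation on $J_a$ to yield
$$\left(\frac{\partial \mathcal{L}}{\partial \mathbf{x}}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_* = 0, \tag{2.6.23}$$
$$\left(\frac{\partial \mathcal{L}}{\partial \boldsymbol{\lambda}}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\boldsymbol{\lambda}}}\right)_* = 0 \;\longrightarrow\; g_i(\mathbf{x}(t), \dot{\mathbf{x}}(t), t) = 0. \tag{2.6.24}$$
Note that from (2.6.22), the Lagrangian $\mathcal{L}$ is independent of
$\dot{\boldsymbol{\lambda}}(t)$ and hence the Euler-Lagrange equation (2.6.24) is
nothing but the given relation regarding the plant or the system (2.6.20).
Thus, we solve the Euler-Lagrange equation (2.6.23) along with the given
boundary conditions. Let us now illustrate the preceding method by a
simple example.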
Example 2.11
Minimize the performance index
$$J = \int_0^1 \left[x^2(t) + u^2(t)\right] dt \tag{2.6.25}$$
with boundary conditions
$$x(0) = 1; \quad x(1) = 0 \tag{2.6.26}$$
subject to the condition (plant equation)
$$\dot{x}(t) = -x(t) + u(t). \tag{2.6.27}$$
Solution: Let us solve this problem by the two methods, i.e., the
direct method and the Lagrange multiplier method.
1 Direct Method: Here, we eliminate $u(t)$ between the performance
index (2.6.25) and the plant (2.6.27) to get the functional as
$$\begin{aligned} J &= \int_0^1 \left[x^2(t) + (\dot{x}(t) + x(t))^2\right] dt \\ &= \int_0^1 \left[2x^2(t) + \dot{x}^2(t) + 2x(t)\dot{x}(t)\right] dt. \end{aligned} \tag{2.6.28}$$
Now, we notice that the functional (2.6.28) has absorbed the condition
(2.6.27) within itself, and we need to consider it as a straightforward
extremization of a functional as given earlier. Thus, applying
the Euler-Lagrange equation
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \tag{2.6.29}$$
to the functional (2.6.28), where
$$V = 2x^2(t) + \dot{x}^2(t) + 2x(t)\dot{x}(t), \tag{2.6.30}$$
we get
$$4x^*(t) + 2\dot{x}^*(t) - \frac{d}{dt}\left(2\dot{x}^*(t) + 2x^*(t)\right) = 0. \tag{2.6.31}$$
Simplifying the above
$$\ddot{x}^*(t) - 2x^*(t) = 0 \tag{2.6.32}$$
the solution (see later for use of MATLAB©) of which gives the
optimal state as
$$x^*(t) = C_1 e^{-\sqrt{2}t} + C_2 e^{\sqrt{2}t} \tag{2.6.33}$$
where the constants $C_1$ and $C_2$, evaluated using the given boundary
conditions (2.6.26), are found to be
$$C_1 = \frac{1}{1 - e^{-2\sqrt{2}}}; \qquad C_2 = \frac{1}{1 - e^{2\sqrt{2}}}. \tag{2.6.34}$$
Finally, knowing the optimal $x^*(t)$, the optimal control $u^*(t)$ is
found from the plant (2.6.27) to be
$$\begin{aligned} u^*(t) &= \dot{x}^*(t) + x^*(t) \\ &= C_1(1 - \sqrt{2})e^{-\sqrt{2}t} + C_2(1 + \sqrt{2})e^{\sqrt{2}t}. \end{aligned} \tag{2.6.35}$$
Although the method appears to be simple, let us note that it is
not always possible to eliminate $u(t)$ from (2.6.25) and (2.6.27),
especially for higher-order systems.
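The direct-method result can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original solution: it takes the plant as $\dot{x} = -x + u$ and the cost integrand as $x^2 + u^2$ on $[0, 1]$, uses the constants that follow from $x(0) = 1$, $x(1) = 0$, and compares the cost of the extremal with that of a perturbed trajectory $x^*(t) + \epsilon \sin(\pi t)$, which still meets both boundary conditions. The perturbation shape, the value of `EPS`, and the Simpson-rule helper are choices made only for this sketch.

```python
import math

SQRT2 = math.sqrt(2.0)

# Constants obtained from x(0) = 1 and x(1) = 0 applied to
# x*(t) = C1*exp(-sqrt(2) t) + C2*exp(sqrt(2) t).
C1 = 1.0 / (1.0 - math.exp(-2.0 * SQRT2))
C2 = 1.0 / (1.0 - math.exp(2.0 * SQRT2))

def x_opt(t):
    return C1 * math.exp(-SQRT2 * t) + C2 * math.exp(SQRT2 * t)

def xdot_opt(t):
    return SQRT2 * (-C1 * math.exp(-SQRT2 * t) + C2 * math.exp(SQRT2 * t))

def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def cost(eps):
    """J for the trajectory x*(t) + eps*sin(pi t), with u = xdot + x."""
    def integrand(t):
        x = x_opt(t) + eps * math.sin(math.pi * t)
        xdot = xdot_opt(t) + eps * math.pi * math.cos(math.pi * t)
        return x * x + (xdot + x) ** 2
    return simpson(integrand, 0.0, 1.0)

EPS = 0.1  # hypothetical perturbation size
J_opt = cost(0.0)
J_perturbed = cost(EPS)

print(x_opt(0.0), x_opt(1.0))   # boundary values of the extremal
print(J_opt, J_perturbed)       # the perturbed cost should be larger
```

Because the integrand is quadratic and the first variation vanishes on the extremal, the cost increase equals the second variation, $\epsilon^2 (1 + \pi^2/2)$, for this particular perturbation.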
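2 Lagrange Multiplier Method: Here, we use the ideas developed
in the previous section on the extremization of functions with
conditions. Consider the optimization of the functional (2.6.25)
with the boundary conditions (2.6.26) under the condition describing
the plant (2.6.27). First we rewrite the condition (2.6.27) as
$$g(x(t), \dot{x}(t), u(t)) = \dot{x}(t) + x(t) - u(t) = 0. \tag{2.6.36}$$
Now, we form an augmented functional as
$$\begin{aligned} J &= \int_0^1 \left[x^2(t) + u^2(t) + \lambda(t)\{\dot{x}(t) + x(t) - u(t)\}\right] dt \\ &= \int_0^1 \mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t))\,dt \end{aligned} \tag{2.6.37}$$
where $\lambda(t)$ is the Lagrange multiplier, and
$$\mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t)) = x^2(t) + u^2(t) + \lambda(t)\{\dot{x}(t) + x(t) - u(t)\} \tag{2.6.38}$$
is the Lagrangian. Now, we apply the Euler-Lagrange equation to
the previous Lagrangian to get
$$\left(\frac{\partial \mathcal{L}}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)_* = 0 \;\longrightarrow\; 2x^*(t) + \lambda^*(t) - \dot{\lambda}^*(t) = 0 \tag{2.6.39}$$
$$\left(\frac{\partial \mathcal{L}}{\partial u}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{u}}\right)_* = 0 \;\longrightarrow\; 2u^*(t) - \lambda^*(t) = 0 \tag{2.6.40}$$
$$\left(\frac{\partial \mathcal{L}}{\partial \lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\lambda}}\right)_* = 0 \;\longrightarrow\; \dot{x}^*(t) + x^*(t) - u^*(t) = 0 \tag{2.6.41}$$
and solve for optimal $x^*(t)$, $u^*(t)$, and $\lambda^*(t)$. We get first from
(2.6.40) and (2.6.41)
$$\lambda^*(t) = 2u^*(t) = 2(\dot{x}^*(t) + x^*(t)). \tag{2.6.42}$$
Using the equation (2.6.42) in (2.6.39)
$$2x^*(t) + 2(\dot{x}^*(t) + x^*(t)) - 2(\ddot{x}^*(t) + \dot{x}^*(t)) = 0. \tag{2.6.43}$$
Solving the previous equation, we get
$$\ddot{x}^*(t) - 2x^*(t) = 0 \;\longrightarrow\; x^*(t) = C_1 e^{-\sqrt{2}t} + C_2 e^{\sqrt{2}t}. \tag{2.6.44}$$
Once we know $x^*(t)$, we get $\lambda^*(t)$ and hence $u^*(t)$ from (2.6.42) as
$$\begin{aligned} u^*(t) &= \dot{x}^*(t) + x^*(t) \\ &= C_1(1 - \sqrt{2})e^{-\sqrt{2}t} + C_2(1 + \sqrt{2})e^{\sqrt{2}t}. \end{aligned} \tag{2.6.45}$$
Thus, we get the same results as in the direct method. The constants
$C_1$ and $C_2$, evaluated using the boundary conditions (2.6.26), are
the same as given in (2.6.34).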
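The multiplier itself can also be checked. The following sketch (an illustration, assuming the plant $\dot{x} = -x + u$ and the constants implied by $x(0) = 1$, $x(1) = 0$) evaluates $\lambda^*(t) = 2u^*(t)$ and confirms on a grid of time points that the $x$-equation $2x^* + \lambda^* - \dot{\lambda}^* = 0$ and the plant equation are both satisfied. The grid and the function names are choices made only for this sketch.

```python
import math

SQRT2 = math.sqrt(2.0)
C1 = 1.0 / (1.0 - math.exp(-2.0 * SQRT2))
C2 = 1.0 / (1.0 - math.exp(2.0 * SQRT2))

def x_opt(t):
    return C1 * math.exp(-SQRT2 * t) + C2 * math.exp(SQRT2 * t)

def xdot_opt(t):
    return SQRT2 * (-C1 * math.exp(-SQRT2 * t) + C2 * math.exp(SQRT2 * t))

def u_opt(t):
    """Optimal control u* = xdot* + x*."""
    return (C1 * (1.0 - SQRT2) * math.exp(-SQRT2 * t)
            + C2 * (1.0 + SQRT2) * math.exp(SQRT2 * t))

def udot_opt(t):
    """Analytic time derivative of u*."""
    return SQRT2 * (-C1 * (1.0 - SQRT2) * math.exp(-SQRT2 * t)
                    + C2 * (1.0 + SQRT2) * math.exp(SQRT2 * t))

def lam_opt(t):
    """Multiplier from the u-equation: lambda* = 2 u*."""
    return 2.0 * u_opt(t)

def lamdot_opt(t):
    return 2.0 * udot_opt(t)

grid = [k / 20.0 for k in range(21)]

# Residual of the x-equation: 2 x* + lambda* - lambda_dot* = 0.
costate_residual = max(abs(2.0 * x_opt(t) + lam_opt(t) - lamdot_opt(t)) for t in grid)

# Residual of the plant written as g = xdot + x - u = 0.
plant_residual = max(abs(xdot_opt(t) + x_opt(t) - u_opt(t)) for t in grid)

print(costate_residual, plant_residual)
```

Both residuals should vanish to rounding error, showing that the three Euler-Lagrange equations are mutually consistent on the extremal.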
The solution for the differential equation (2.6.32) with
the boundary conditions (2.6.26) for Example 2.11 using Symbolic
Toolbox of MATLAB©, Version 6, is shown below.
********************************************************
x=dsolve('D2x-2*x=0','x(0)=1,x(1)=0')
x =
-(exp(2^(1/2))^2+1)/(exp(2^(1/2))^2-1)*sinh(2^(1/2)*t)+
cosh(2^(1/2)*t)

          1/2 2             1/2
    (exp(2   )  + 1) sinh(2    t)           1/2
  - ------------------------------ + cosh(2    t)
                1/2 2
           exp(2   )  - 1

u =
-(exp(2^(1/2))^2+1)/(exp(2^(1/2))^2-1)*cosh(2^(1/2)*t)*2^(1/2)+
sinh(2^(1/2)*t)*2^(1/2)-(exp(2^(1/2))^2+1)/(exp(2^(1/2))^2-
1)*sinh(2^(1/2)*t)+cosh(2^(1/2)*t)

          1/2 2             1/2     1/2
    (exp(2   )  + 1) cosh(2    t) 2                 1/2     1/2
  - ------------------------------------ + sinh(2    t) 2
                1/2 2
           exp(2   )  - 1

          1/2 2             1/2
    (exp(2   )  + 1) sinh(2    t)           1/2
  - ------------------------------ + cosh(2    t)
                1/2 2
           exp(2   )  - 1
*********************************************************
It is easy to see that the previous solution for optimal x*(t) is the
same as given in (2.6.33) and (2.6.34).
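The equivalence can be confirmed numerically. The sketch below compares the exponential form $C_1 e^{-\sqrt{2}t} + C_2 e^{\sqrt{2}t}$ with the hyperbolic form returned by the symbolic solver, written here as $\cosh(\sqrt{2}t) - \coth(\sqrt{2})\sinh(\sqrt{2}t)$, where the leading minus sign is the one required by $x(1) = 0$. The function names and the sampling grid are assumptions of this sketch.

```python
import math

SQRT2 = math.sqrt(2.0)

def x_exponential(t):
    """Exponential form with constants fixed by x(0) = 1, x(1) = 0."""
    c1 = 1.0 / (1.0 - math.exp(-2.0 * SQRT2))
    c2 = 1.0 / (1.0 - math.exp(2.0 * SQRT2))
    return c1 * math.exp(-SQRT2 * t) + c2 * math.exp(SQRT2 * t)

def x_hyperbolic(t):
    """sinh/cosh form: cosh(sqrt2 t) - coth(sqrt2) sinh(sqrt2 t)."""
    e2 = math.exp(SQRT2) ** 2
    coth = (e2 + 1.0) / (e2 - 1.0)
    return -coth * math.sinh(SQRT2 * t) + math.cosh(SQRT2 * t)

# Largest difference between the two forms on a grid over [0, 1].
max_gap = max(abs(x_exponential(k / 100.0) - x_hyperbolic(k / 100.0))
              for k in range(101))
print(max_gap)
```

The agreement follows from $(1 - \coth\sqrt{2})/2 = 1/(1 - e^{2\sqrt{2}})$ and $(1 + \coth\sqrt{2})/2 = 1/(1 - e^{-2\sqrt{2}})$, which are the coefficients of $e^{\sqrt{2}t}$ and $e^{-\sqrt{2}t}$, respectively.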
Let us note once again that the Lagrange multiplier $\lambda(t)$ helped us to
treat the augmented functional (2.6.37) as if it contained independent
variables $x(t)$ and $u(t)$, although they are dependent as per the plant
equation (2.6.36).
2.7 Variational Approach to Optimal Control
Systems
In this section, we approach the optimal control system by variational
techniques, and in the process introduce the Hamiltonian function,
which was used by Pontryagin and his associates to develop the
famous Minimum Principle [109].
2.7.1 Terminal Cost Problem
Here we consider the optimal control system where the performance
index is of general form containing a final (terminal) cost function in
addition to the integral cost function. Such an optimal control problem
is called the Bolza problem. Consider the plant as
$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t), \tag{2.7.1}$$
the performance index as
$$J(\mathbf{u}(t)) = S(\mathbf{x}(t_f), t_f) + \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\,dt \tag{2.7.2}$$
and given boundary conditions as
$$\mathbf{x}(t_0) = \mathbf{x}_0; \quad \mathbf{x}(t_f) \text{ is free and } t_f \text{ is free} \tag{2.7.3}$$
where $\mathbf{x}(t)$ and $\mathbf{u}(t)$ are n- and r-dimensional state and control
vectors, respectively. This problem of Bolza is the one with the most
general form of the performance index.
The Lagrange problem was first discussed in 1762, Mayer
considered his problem in 1878, and the problem of Bolza
was formulated in 1913.
Before we begin illustrating the Pontryagin procedure for this problem,
let us note that
$$\int_{t_0}^{t_f} \frac{dS(\mathbf{x}(t), t)}{dt}\,dt = S(\mathbf{x}(t), t)\Big|_{t_0}^{t_f} = S(\mathbf{x}(t_f), t_f) - S(\mathbf{x}(t_0), t_0). \tag{2.7.4}$$
Using the equation (2.7.4) in the original performance index (2.7.2), we
get
$$\begin{aligned} J_2(\mathbf{u}(t)) &= \int_{t_0}^{t_f} \left[V(\mathbf{x}(t), \mathbf{u}(t), t) + \frac{dS}{dt}\right] dt \\ &= \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\,dt + S(\mathbf{x}(t_f), t_f) - S(\mathbf{x}(t_0), t_0). \end{aligned} \tag{2.7.5}$$
Since $S(\mathbf{x}(t_0), t_0)$ is a fixed quantity, the optimization of the
original performance index $J$ in (2.7.2) is equivalent to that of the
performance index $J_2$ in (2.7.5). However, the optimal cost given by
(2.7.2) is different from the optimal cost (2.7.5). Here, we are
interested in finding the optimal control only. Once the optimal control
is determined, the optimal cost is found using the original performance
index $J$ in (2.7.2) and not $J_2$ in (2.7.5). Also note that
$$\frac{d[S(\mathbf{x}(t), t)]}{dt} = \left(\frac{\partial S}{\partial \mathbf{x}}\right)' \dot{\mathbf{x}}(t) + \frac{\partial S}{\partial t}. \tag{2.7.6}$$
We now illustrate the procedure in the following steps. Also, we first
introduce the Lagrangian and then, a little later, introduce the Hamil
tonian. Let us first list the various steps and then describe the same in
detail.
• Step 1: Assumption of Optimal Conditions
• Step 2: Variations of Control and State Vectors
• Step 3: Lagrange Multiplier
• Step 4: Lagrangian
• Step 5: First Variation
• Step 6: Condition for Extrema
• Step 7 : Hamiltonian
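• Step 1: Assumptions of Optimal Conditions: We assume optimum
values $\mathbf{x}^*(t)$ and $\mathbf{u}^*(t)$ for state and control, respectively.
Then
$$\begin{aligned} J(\mathbf{u}^*(t)) &= \int_{t_0}^{t_f} \left[V(\mathbf{x}^*(t), \mathbf{u}^*(t), t) + \frac{dS(\mathbf{x}^*(t), t)}{dt}\right] dt \\ \dot{\mathbf{x}}^*(t) &= \mathbf{f}(\mathbf{x}^*(t), \mathbf{u}^*(t), t). \end{aligned} \tag{2.7.7}$$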
• Step 2: Variations of Controls and States: We consider the
variations (perturbations) in control and state vectors as (see
Figure 2.7)
$$\mathbf{x}(t) = \mathbf{x}^*(t) + \delta \mathbf{x}(t); \quad \mathbf{u}(t) = \mathbf{u}^*(t) + \delta \mathbf{u}(t). \tag{2.7.8}$$
Figure 2.7 Free-Final Time and Free-Final State System
Then, the state equation (2.7.1) and the performance index (2.7.5)
become
$$\begin{aligned} \dot{\mathbf{x}}^*(t) + \delta \dot{\mathbf{x}}(t) &= \mathbf{f}(\mathbf{x}^*(t) + \delta \mathbf{x}(t), \mathbf{u}^*(t) + \delta \mathbf{u}(t), t) \\ J(\mathbf{u}(t)) &= \int_{t_0}^{t_f + \delta t_f} \left[V(\mathbf{x}^*(t) + \delta \mathbf{x}(t), \mathbf{u}^*(t) + \delta \mathbf{u}(t), t) + \frac{dS}{dt}\right] dt. \end{aligned} \tag{2.7.9}$$
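• Step 3: Lagrange Multiplier: Introducing the Lagrange multiplier
vector $\boldsymbol{\lambda}(t)$ (also called costate vector) and using (2.7.6),
we introduce the augmented performance index at the optimal
condition as
$$J_a(\mathbf{u}^*(t)) = \int_{t_0}^{t_f} \left[V(\mathbf{x}^*(t), \mathbf{u}^*(t), t) + \left(\frac{\partial S}{\partial \mathbf{x}}\right)_*' \dot{\mathbf{x}}^*(t) + \left(\frac{\partial S}{\partial t}\right)_* + \boldsymbol{\lambda}'(t)\left\{\mathbf{f}(\mathbf{x}^*(t), \mathbf{u}^*(t), t) - \dot{\mathbf{x}}^*(t)\right\}\right] dt \tag{2.7.10}$$
and at any other (perturbed) condition as
$$\begin{aligned} J_a(\mathbf{u}(t)) = \int_{t_0}^{t_f + \delta t_f} \Big[ & V(\mathbf{x}^*(t) + \delta \mathbf{x}(t), \mathbf{u}^*(t) + \delta \mathbf{u}(t), t) + \left(\frac{\partial S}{\partial \mathbf{x}}\right)_*' \left[\dot{\mathbf{x}}^*(t) + \delta \dot{\mathbf{x}}(t)\right] + \left(\frac{\partial S}{\partial t}\right)_* \\ & + \boldsymbol{\lambda}'(t)\left[\mathbf{f}(\mathbf{x}^*(t) + \delta \mathbf{x}(t), \mathbf{u}^*(t) + \delta \mathbf{u}(t), t) - \left\{\dot{\mathbf{x}}^*(t) + \delta \dot{\mathbf{x}}(t)\right\}\right] \Big] dt. \end{aligned} \tag{2.7.11}$$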
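• Step 4: Lagrangian: Let us define the Lagrangian function at the
optimal condition as
$$\begin{aligned} \mathcal{L} &= \mathcal{L}(\mathbf{x}^*(t), \dot{\mathbf{x}}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}(t), t) \\ &= V(\mathbf{x}^*(t), \mathbf{u}^*(t), t) + \left(\frac{\partial S}{\partial \mathbf{x}}\right)_*' \dot{\mathbf{x}}^*(t) + \left(\frac{\partial S}{\partial t}\right)_* + \boldsymbol{\lambda}'(t)\left\{\mathbf{f}(\mathbf{x}^*(t), \mathbf{u}^*(t), t) - \dot{\mathbf{x}}^*(t)\right\} \end{aligned} \tag{2.7.12}$$
and at any other condition as
$$\begin{aligned} \mathcal{L}^\delta &= \mathcal{L}^\delta(\mathbf{x}^*(t) + \delta \mathbf{x}(t), \dot{\mathbf{x}}^*(t) + \delta \dot{\mathbf{x}}(t), \mathbf{u}^*(t) + \delta \mathbf{u}(t), \boldsymbol{\lambda}(t), t) \\ &= V(\mathbf{x}^*(t) + \delta \mathbf{x}(t), \mathbf{u}^*(t) + \delta \mathbf{u}(t), t) + \left(\frac{\partial S}{\partial \mathbf{x}}\right)_*' \left[\dot{\mathbf{x}}^*(t) + \delta \dot{\mathbf{x}}(t)\right] + \left(\frac{\partial S}{\partial t}\right)_* \\ &\quad + \boldsymbol{\lambda}'(t)\left[\mathbf{f}(\mathbf{x}^*(t) + \delta \mathbf{x}(t), \mathbf{u}^*(t) + \delta \mathbf{u}(t), t) - \left\{\dot{\mathbf{x}}^*(t) + \delta \dot{\mathbf{x}}(t)\right\}\right]. \end{aligned} \tag{2.7.13}$$
With these, the augmented performance index at the optimal and
any other condition becomes
$$J_a(\mathbf{u}^*(t)) = \int_{t_0}^{t_f} \mathcal{L}\,dt, \qquad J_a(\mathbf{u}(t)) = \int_{t_0}^{t_f + \delta t_f} \mathcal{L}^\delta\,dt = \int_{t_0}^{t_f} \mathcal{L}^\delta\,dt + \int_{t_f}^{t_f + \delta t_f} \mathcal{L}^\delta\,dt. \tag{2.7.14}$$
Using the mean-value theorem and Taylor series, and retaining the
linear terms only, we have
$$\int_{t_f}^{t_f + \delta t_f} \mathcal{L}^\delta\,dt \approx \mathcal{L}\big|_{t_f}\,\delta t_f. \tag{2.7.15}$$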
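• Step 5: First Variation: Defining the increment $\Delta J$, using Taylor
series expansion, and extracting the first variation $\delta J$ by retaining
only the first order terms, we get the first variation as
$$\begin{aligned} \Delta J &= J_a(\mathbf{u}(t)) - J_a(\mathbf{u}^*(t)) \\ &= \int_{t_0}^{t_f} (\mathcal{L}^\delta - \mathcal{L})\,dt + \mathcal{L}\big|_{t_f}\,\delta t_f \\ \delta J &= \int_{t_0}^{t_f} \left\{ \left(\frac{\partial \mathcal{L}}{\partial \mathbf{x}}\right)_*' \delta \mathbf{x}(t) + \left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*' \delta \dot{\mathbf{x}}(t) + \left(\frac{\partial \mathcal{L}}{\partial \mathbf{u}}\right)_*' \delta \mathbf{u}(t) \right\} dt + \mathcal{L}\big|_{t_f}\,\delta t_f. \end{aligned} \tag{2.7.16}$$
Considering the $\delta \dot{\mathbf{x}}(t)$ term in the first variation (2.7.16) and
integrating by parts (using $\int u\,dv = uv - \int v\,du$),
$$\begin{aligned} \int_{t_0}^{t_f} \left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*' \delta \dot{\mathbf{x}}(t)\,dt &= \int_{t_0}^{t_f} \left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*' \frac{d}{dt}\left(\delta \mathbf{x}(t)\right) dt \\ &= \left[\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*' \delta \mathbf{x}(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f} \left[\frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*\right]' \delta \mathbf{x}(t)\,dt. \end{aligned} \tag{2.7.17}$$
Also note that since $\mathbf{x}(t_0)$ is specified, $\delta \mathbf{x}(t_0) = 0$. Thus, using
(2.7.17) the first variation $\delta J$ in (2.7.16) becomes
$$\begin{aligned} \delta J = {} & \int_{t_0}^{t_f} \left[\left(\frac{\partial \mathcal{L}}{\partial \mathbf{x}}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*\right]' \delta \mathbf{x}(t)\,dt + \int_{t_0}^{t_f} \left(\frac{\partial \mathcal{L}}{\partial \mathbf{u}}\right)_*' \delta \mathbf{u}(t)\,dt \\ & + \mathcal{L}\big|_{t_f}\,\delta t_f + \left[\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*' \delta \mathbf{x}(t)\right]_{t_f}. \end{aligned} \tag{2.7.18}$$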
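• Step 6: Condition for Extrema: For extrema of the functional $J$,
the first variation $\delta J$ should vanish according to the fundamental
theorem (Theorem 2.1) of the CoV. Also, in a typical control
system such as (2.7.1), we note that $\delta \mathbf{u}(t)$ is the independent
control variation and $\delta \mathbf{x}(t)$ is the dependent state variation. First,
we choose $\boldsymbol{\lambda}(t) = \boldsymbol{\lambda}^*(t)$, which is at our disposal, and
hence $\mathcal{L}^*$, such that the coefficient of the dependent variation
$\delta \mathbf{x}(t)$ in (2.7.18) is zero. Then, we have the Euler-Lagrange
equation
$$\left(\frac{\partial \mathcal{L}}{\partial \mathbf{x}}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_* = 0 \tag{2.7.19}$$
where the partials are evaluated at the optimal (*) condition.
Next, since the independent control variation $\delta \mathbf{u}(t)$ is arbitrary,
the coefficient of the control variation $\delta \mathbf{u}(t)$ in (2.7.18) should be
set to zero. That is
$$\left(\frac{\partial \mathcal{L}}{\partial \mathbf{u}}\right)_* = 0. \tag{2.7.20}$$
Finally, the first variation (2.7.18) reduces to
$$\mathcal{L}^*\big|_{t_f}\,\delta t_f + \left[\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*' \delta \mathbf{x}(t)\right]_{t_f} = 0. \tag{2.7.21}$$
Let us note that the condition (or plant) equation (2.7.1) can be
written in terms of the Lagrangian (2.7.12) as
$$\left(\frac{\partial \mathcal{L}}{\partial \boldsymbol{\lambda}}\right)_* = 0. \tag{2.7.22}$$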
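Figure 2.8 Final-Point Condition with a Moving Boundary $\theta(t)$
In order to convert the expression containing $\delta \mathbf{x}(t)$ in (2.7.21)
into an expression containing $\delta \mathbf{x}_f$ (see Figure 2.8), we note that
the slope of $\mathbf{x}^*(t) + \delta \mathbf{x}(t)$ at $t_f$ is approximated as
$$\dot{\mathbf{x}}^*(t_f) + \delta \dot{\mathbf{x}}(t_f) \approx \frac{\delta \mathbf{x}_f - \delta \mathbf{x}(t_f)}{\delta t_f} \tag{2.7.23}$$
which is rewritten as
$$\delta \mathbf{x}_f = \delta \mathbf{x}(t_f) + \left\{\dot{\mathbf{x}}^*(t_f) + \delta \dot{\mathbf{x}}(t_f)\right\} \delta t_f \tag{2.7.24}$$
and retaining only the linear (in $\delta$) terms in the relation (2.7.24),
we have
$$\delta \mathbf{x}(t_f) = \delta \mathbf{x}_f - \dot{\mathbf{x}}^*(t_f)\,\delta t_f. \tag{2.7.25}$$
Using (2.7.25) in the boundary condition (2.7.21), we have the
general boundary condition in terms of the Lagrangian as
$$\left[\mathcal{L}^* - \left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*' \dot{\mathbf{x}}^*(t)\right]_{t_f} \delta t_f + \left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_*'\bigg|_{t_f} \delta \mathbf{x}_f = 0. \tag{2.7.26}$$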
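• Step 7: Hamiltonian: We define the Hamiltonian $\mathcal{H}^*$ (also called
the Pontryagin $\mathcal{H}$ function) at the optimal condition as
$$\mathcal{H}^* = V(\mathbf{x}^*(t), \mathbf{u}^*(t), t) + \boldsymbol{\lambda}^{*\prime}(t)\,\mathbf{f}(\mathbf{x}^*(t), \mathbf{u}^*(t), t), \tag{2.7.27}$$
where
$$\mathcal{H}^* = \mathcal{H}^*(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t).$$
Then from (2.7.12) the Lagrangian $\mathcal{L}^*$ in terms of the Hamiltonian
$\mathcal{H}^*$ becomes
$$\begin{aligned} \mathcal{L}^* &= \mathcal{L}^*(\mathbf{x}^*(t), \dot{\mathbf{x}}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t) \\ &= \mathcal{H}^*(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t) + \left(\frac{\partial S}{\partial \mathbf{x}}\right)_*' \dot{\mathbf{x}}^*(t) + \left(\frac{\partial S}{\partial t}\right)_* - \boldsymbol{\lambda}^{*\prime}(t)\,\dot{\mathbf{x}}^*(t). \end{aligned} \tag{2.7.28}$$
Using (2.7.28) in (2.7.20), (2.7.19), and (2.7.22) and noting that
the terminal cost function $S = S(\mathbf{x}(t), t)$, we have the control,
state and costate equations, respectively, expressed in terms of
the Hamiltonian. Thus, for the optimal control $\mathbf{u}^*(t)$, the relation
(2.7.20) becomes
$$\left(\frac{\partial \mathcal{L}}{\partial \mathbf{u}}\right)_* = 0 \;\longrightarrow\; \left(\frac{\partial \mathcal{H}}{\partial \mathbf{u}}\right)_* = 0, \tag{2.7.29}$$
for the optimal state $\mathbf{x}^*(t)$, the relation (2.7.19) becomes
$$\begin{aligned} & \left(\frac{\partial \mathcal{L}}{\partial \mathbf{x}}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{x}}}\right)_* = 0 \;\longrightarrow \\ & \left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_* + \left(\frac{\partial^2 S}{\partial \mathbf{x}^2}\right)_*' \dot{\mathbf{x}}^*(t) + \left(\frac{\partial^2 S}{\partial \mathbf{x}\,\partial t}\right)_* - \frac{d}{dt}\left\{\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right\} = 0 \;\longrightarrow \\ & \left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_* + \left(\frac{\partial^2 S}{\partial \mathbf{x}^2}\right)_*' \dot{\mathbf{x}}^*(t) + \left(\frac{\partial^2 S}{\partial \mathbf{x}\,\partial t}\right)_* - \left[\left(\frac{\partial^2 S}{\partial \mathbf{x}^2}\right)_*' \dot{\mathbf{x}}^*(t) + \left(\frac{\partial^2 S}{\partial \mathbf{x}\,\partial t}\right)_* - \dot{\boldsymbol{\lambda}}^*(t)\right] = 0 \end{aligned}$$
leading to
$$\left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_* = -\dot{\boldsymbol{\lambda}}^*(t) \tag{2.7.30}$$
and for the costate $\boldsymbol{\lambda}^*(t)$,
$$\left(\frac{\partial \mathcal{L}}{\partial \boldsymbol{\lambda}}\right)_* = 0 \;\longrightarrow\; \left(\frac{\partial \mathcal{H}}{\partial \boldsymbol{\lambda}}\right)_* = \dot{\mathbf{x}}^*(t). \tag{2.7.31}$$
Looking at the similar structure of the relation (2.7.30) for the
optimal costate $\boldsymbol{\lambda}^*(t)$ and (2.7.31) for the optimal state
$\mathbf{x}^*(t)$, it is clear why $\boldsymbol{\lambda}(t)$ is called the costate vector.
Finally, using the relation (2.7.28), the boundary condition (2.7.26)
at the optimal condition reduces to
$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f + \left[\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right]_{t_f}' \delta \mathbf{x}_f = 0. \tag{2.7.32}$$
This is the general boundary condition for a free-end point system
in terms of the Hamiltonian.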
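As a brief worked illustration (a sketch, not part of the original derivation), the control, state, and costate relations in terms of the Hamiltonian can be applied to the scalar plant $\dot{x} = -x + u$ with integrand $V = x^2 + u^2$ and no terminal cost ($S = 0$), as in Example 2.11. The costate is written $\lambda_H$, a label introduced only for this sketch, because the plant enters the Hamiltonian as $f$ rather than through $g = \dot{x} + x - u$, so it has the opposite sign to the multiplier used in Example 2.11.

```latex
\begin{aligned}
\mathcal{H} &= x^2 + u^2 + \lambda_H\,(-x + u) \\
\left(\frac{\partial \mathcal{H}}{\partial u}\right)_* &= 2u^* + \lambda_H^* = 0
  \;\Rightarrow\; u^* = -\tfrac{1}{2}\lambda_H^* \\
\dot{x}^* &= \left(\frac{\partial \mathcal{H}}{\partial \lambda_H}\right)_* = -x^* + u^* \\
\dot{\lambda}_H^* &= -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* = -2x^* + \lambda_H^* \\
\lambda_H^* = -2(\dot{x}^* + x^*) &\;\Rightarrow\;
  -2(\ddot{x}^* + \dot{x}^*) = -2x^* - 2(\dot{x}^* + x^*)
  \;\Rightarrow\; \ddot{x}^* - 2x^* = 0
\end{aligned}
```

The state equation obtained this way is the same second-order equation found by the direct and Lagrange multiplier methods in Example 2.11.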
2.7.2 Different Types of Systems
We now obtain different cases depending on the statement of the
problem regarding the final time $t_f$ and the final state $\mathbf{x}(t_f)$ (see
Figure 2.9).
• Type (a): Fixed-Final Time and Fixed-Final State System: Here,
since $t_f$ and $\mathbf{x}(t_f)$ are fixed or specified (Figure 2.9(a)), both
$\delta t_f$ and $\delta \mathbf{x}_f$ are zero in the general boundary condition (2.7.32),
and there is no extra boundary condition to be used other than those
given in the problem formulation.
• Type (b): Free-Final Time and Fixed-Final State System: Since
$t_f$ is free or not specified in advance, $\delta t_f$ is arbitrary, and since
$\mathbf{x}(t_f)$ is fixed or specified, $\delta \mathbf{x}_f$ is zero as shown in
Figure 2.9(b). Then, the coefficient of the arbitrary $\delta t_f$ in the general
boundary condition (2.7.32) is zero, resulting in
$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f} = 0. \tag{2.7.33}$$
• Type (c): Fixed-Final Time and Free-Final State System: Here
$t_f$ is specified and $\mathbf{x}(t_f)$ is free (see Figure 2.9(c)). Then $\delta t_f$
is zero and $\delta \mathbf{x}_f$ is arbitrary, which in turn means that the
coefficient of $\delta \mathbf{x}_f$ in the general boundary condition (2.7.32) is
zero. That is
$$\left(\frac{\partial S}{\partial \mathbf{x}} - \boldsymbol{\lambda}^*(t)\right)_{*t_f} = 0 \;\longrightarrow\; \boldsymbol{\lambda}^*(t_f) = \left(\frac{\partial S}{\partial \mathbf{x}}\right)_{*t_f}. \tag{2.7.34}$$
Figure 2.9 Different Types of Systems: (a) Fixed-Final Time and
Fixed-Final State System, (b) Free-Final Time and Fixed-Final State
System, (c) Fixed-Final Time and Free-Final State System, (d)
Free-Final Time and Free-Final State System
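• Type (d): Free-Final Time and Dependent Free-Final State
System: If $t_f$ and $\mathbf{x}(t_f)$ are related such that $\mathbf{x}(t_f)$ lies on
a moving curve $\boldsymbol{\theta}(t)$ as shown in Figure 2.8, then
$$\delta \mathbf{x}_f \approx \dot{\boldsymbol{\theta}}(t_f)\,\delta t_f. \tag{2.7.35}$$
Using (2.7.35), the boundary condition (2.7.32) for the optimal
condition becomes
$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t} + \left\{\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right\}' \dot{\boldsymbol{\theta}}(t)\right]_{t_f} \delta t_f = 0. \tag{2.7.36}$$
Since $t_f$ is free, $\delta t_f$ is arbitrary and hence the coefficient of
$\delta t_f$ in (2.7.36) is zero. That is
$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t} + \left\{\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right\}' \dot{\boldsymbol{\theta}}(t)\right]_{t_f} = 0. \tag{2.7.37}$$
• Type (e): Free-Final Time and Independent Free-Final State:
If $t_f$ and $\mathbf{x}(t_f)$ are not related, then $\delta t_f$ and $\delta \mathbf{x}_f$ are
unrelated, and the boundary condition (2.7.32) at the optimal condition
becomes
$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f} = 0 \tag{2.7.38}$$
$$\left[\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right]_{t_f} = 0. \tag{2.7.39}$$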
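2.7.3 Sufficient Condition
In order to determine the nature of optimization, i.e., whether it is a
minimum or a maximum, we need to consider the second variation and
examine its sign. In other words, we have to find a sufficient condition
for extremum. Using (2.7.14), (2.7.28) and (2.7.37), we have the second
variation in (2.7.16) as
$$\delta^2 J = \int_{t_0}^{t_f} \frac{1}{2} \begin{bmatrix} \delta \mathbf{x}'(t) & \delta \mathbf{u}'(t) \end{bmatrix} \boldsymbol{\Pi} \begin{bmatrix} \delta \mathbf{x}(t) \\ \delta \mathbf{u}(t) \end{bmatrix} dt. \tag{2.7.40}$$
For the minimum, the second variation $\delta^2 J$ must be positive. This
means that the matrix $\boldsymbol{\Pi}$ in (2.7.40)
$$\boldsymbol{\Pi} = \begin{bmatrix} \dfrac{\partial^2 \mathcal{H}}{\partial \mathbf{x}^2} & \dfrac{\partial^2 \mathcal{H}}{\partial \mathbf{x}\,\partial \mathbf{u}} \\[2mm] \dfrac{\partial^2 \mathcal{H}}{\partial \mathbf{u}\,\partial \mathbf{x}} & \dfrac{\partial^2 \mathcal{H}}{\partial \mathbf{u}^2} \end{bmatrix}_* \tag{2.7.41}$$
must be positive definite. But the important condition is that the
second partial derivative of $\mathcal{H}^*$ w.r.t. $\mathbf{u}(t)$ must be positive.
That is
$$\left(\frac{\partial^2 \mathcal{H}}{\partial \mathbf{u}^2}\right)_* > 0 \tag{2.7.42}$$
and for the maximum, the sign of (2.7.42) is reversed.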
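As a short worked check of the sufficient condition (an illustration under assumed Hamiltonians, not part of the original text), take the scalar Hamiltonian $\mathcal{H} = x^2 + u^2 + \lambda(-x + u)$ corresponding to the plant $\dot{x} = -x + u$ with integrand $x^2 + u^2$, and, for contrast, a Hamiltonian with no state penalty such as $\mathcal{H} = \tfrac{1}{2}u^2 + \lambda_1 x_2 + \lambda_2 u$.

```latex
\begin{aligned}
\mathcal{H} = x^2 + u^2 + \lambda(-x + u): \quad
\boldsymbol{\Pi} &=
\begin{bmatrix}
  \dfrac{\partial^2 \mathcal{H}}{\partial x^2} & \dfrac{\partial^2 \mathcal{H}}{\partial x\,\partial u} \\[2mm]
  \dfrac{\partial^2 \mathcal{H}}{\partial u\,\partial x} & \dfrac{\partial^2 \mathcal{H}}{\partial u^2}
\end{bmatrix}
= \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} > 0,
\qquad \frac{\partial^2 \mathcal{H}}{\partial u^2} = 2 > 0 \\[2mm]
\mathcal{H} = \tfrac{1}{2}u^2 + \lambda_1 x_2 + \lambda_2 u: \quad
\boldsymbol{\Pi} &= \mathrm{diag}(0,\, 0,\, 1) \ \text{(only positive semidefinite)},
\qquad \frac{\partial^2 \mathcal{H}}{\partial u^2} = 1 > 0
\end{aligned}
```

In the first case $\boldsymbol{\Pi}$ is positive definite, so the extremum is a minimum. The second case shows why the condition on $\partial^2 \mathcal{H}/\partial u^2$ is singled out: it still holds even though the full matrix is not positive definite, because the multipliers enter $\mathcal{H}$ linearly and contribute nothing to the second derivatives.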
2.7.4 Summary of Pontryagin Procedure
Consider a free-final time and free-final state problem with general cost
function (Bolza problem), where we want to minimize the performance
index
$$J = S(\mathbf{x}(t_f), t_f) + \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\,dt \tag{2.7.43}$$
for the plant described by
$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t) \tag{2.7.44}$$
with the boundary conditions as
$$\mathbf{x}(t = t_0) = \mathbf{x}_0; \quad t = t_f \text{ is free and } \mathbf{x}(t_f) \text{ is free.} \tag{2.7.45}$$
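Table 2.1 Procedure Summary of Pontryagin Principle for Bolza
Problem

A. Statement of the Problem
Given the plant as
    $\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t)$,
the performance index as
    $J = S(\mathbf{x}(t_f), t_f) + \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\,dt$,
and the boundary conditions as
    $\mathbf{x}(t_0) = \mathbf{x}_0$, and $t_f$ and $\mathbf{x}(t_f) = \mathbf{x}_f$ are free,
find the optimal control.

B. Solution of the Problem
Step 1  Form the Pontryagin $\mathcal{H}$ function
        $\mathcal{H}(\mathbf{x}(t), \mathbf{u}(t), \boldsymbol{\lambda}(t), t) = V(\mathbf{x}(t), \mathbf{u}(t), t) + \boldsymbol{\lambda}'(t)\,\mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t)$.
Step 2  Minimize $\mathcal{H}$ w.r.t. $\mathbf{u}(t)$:
        $\left(\frac{\partial \mathcal{H}}{\partial \mathbf{u}}\right)_* = 0$ and obtain $\mathbf{u}^*(t) = \mathbf{h}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), t)$.
Step 3  Using the results of Step 2 in Step 1, find the optimal $\mathcal{H}^*$
        $\mathcal{H}^*(\mathbf{x}^*(t), \mathbf{h}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), t), \boldsymbol{\lambda}^*(t), t) = \mathcal{H}^*(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), t)$.
Step 4  Solve the set of 2n differential equations
        $\dot{\mathbf{x}}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \boldsymbol{\lambda}}\right)_*$ and $\dot{\boldsymbol{\lambda}}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_*$
        with initial conditions $\mathbf{x}_0$ and the final conditions
        $\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f + \left[\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right]_{t_f}' \delta \mathbf{x}_f = 0$.
Step 5  Substitute the solutions of $\mathbf{x}^*(t)$, $\boldsymbol{\lambda}^*(t)$ from Step 4
        into the expression for the optimal control $\mathbf{u}^*(t)$ of Step 2.

C. Types of Systems
(a) Fixed-final time and fixed-final state system, Fig. 2.9(a)
(b) Free-final time and fixed-final state system, Fig. 2.9(b)
(c) Fixed-final time and free-final state system, Fig. 2.9(c)
(d) Free-final time and dependent free-final state system, Fig. 2.9(d)
(e) Free-final time and independent free-final state system

Type   Substitutions                                      Boundary Conditions
(a)    $\delta t_f = 0$, $\delta \mathbf{x}_f = 0$         $\mathbf{x}(t_0) = \mathbf{x}_0$, $\mathbf{x}(t_f) = \mathbf{x}_f$
(b)    $\delta t_f \neq 0$, $\delta \mathbf{x}_f = 0$      $\mathbf{x}(t_0) = \mathbf{x}_0$, $\mathbf{x}(t_f) = \mathbf{x}_f$, $\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f} = 0$
(c)    $\delta t_f = 0$, $\delta \mathbf{x}_f \neq 0$      $\mathbf{x}(t_0) = \mathbf{x}_0$, $\boldsymbol{\lambda}^*(t_f) = \left(\frac{\partial S}{\partial \mathbf{x}}\right)_{*t_f}$
(d)    $\delta \mathbf{x}_f = \dot{\boldsymbol{\theta}}(t_f)\,\delta t_f$   $\mathbf{x}(t_0) = \mathbf{x}_0$, $\mathbf{x}(t_f) = \boldsymbol{\theta}(t_f)$, $\left[\mathcal{H}^* + \frac{\partial S}{\partial t} + \left\{\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right\}' \dot{\boldsymbol{\theta}}(t)\right]_{t_f} = 0$
(e)    $\delta t_f \neq 0$, $\delta \mathbf{x}_f \neq 0$   $\mathbf{x}(t_0) = \mathbf{x}_0$, $\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f} = 0$, $\left[\left(\frac{\partial S}{\partial \mathbf{x}}\right)_* - \boldsymbol{\lambda}^*(t)\right]_{t_f} = 0$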
Here, $\mathbf{x}(t)$ and $\mathbf{u}(t)$ are n- and r-dimensional state and control
vectors, respectively. Let us note that $\mathbf{u}(t)$ is unconstrained. The
entire procedure (called Pontryagin Principle) is now summarized in
Table 2.1.
Note: From Table 2.1 we note that the only difference in the
procedure between the free-final point system without the final cost
function (Lagrange problem) and the free-final point system with final
cost function (Bolza problem) is in the application of the general
boundary condition.
To illustrate the Pontryagin method described previously, consider
the following simple examples describing a second order system.
Specifically, we selected a double integral plant whose analytical
solutions for the optimal condition can be obtained and the same verified
by using MATLAB©.
First we consider the fixed-final time and fixed-final state problem
(Figure 2.9(a), Table 2.1, Type (a)).
Example 2.12
Given a second order (double integrator) system as
$$\begin{aligned} \dot{x}_1(t) &= x_2(t) \\ \dot{x}_2(t) &= u(t) \end{aligned} \tag{2.7.46}$$
and the performance index as
$$J = \frac{1}{2} \int_0^2 u^2(t)\,dt, \tag{2.7.47}$$
find the optimal control and optimal state, given the boundary
(initial and final) conditions as
$$\mathbf{x}(0) = [1 \;\; 2]'; \quad \mathbf{x}(2) = [1 \;\; 0]'. \tag{2.7.48}$$
Assume that the control and state are unconstrained.
Solution: We follow the step-by-step procedure given in Table 2.1.
First, by comparing the present plant (2.7.46) and the PI (2.7.47)
with the general formulation of the plant (2.7.1) and the PI (2.7.2),
we identify
$$\begin{aligned} V(\mathbf{x}(t), u(t), t) &= V(u(t)) = \frac{1}{2} u^2(t) \\ \mathbf{f}(\mathbf{x}(t), u(t), t) &= [f_1, f_2]', \end{aligned} \tag{2.7.49}$$
where $f_1 = x_2(t)$, $f_2 = u(t)$.
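• Step 1: Form the Hamiltonian function as
$$\begin{aligned} \mathcal{H} &= \mathcal{H}(x_1(t), x_2(t), u(t), \lambda_1(t), \lambda_2(t)) \\ &= V(u(t)) + \boldsymbol{\lambda}'(t)\,\mathbf{f}(\mathbf{x}(t), u(t)) \\ &= \frac{1}{2} u^2(t) + \lambda_1(t) x_2(t) + \lambda_2(t) u(t). \end{aligned} \tag{2.7.50}$$
• Step 2: Find $u^*(t)$ from
$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; u^*(t) + \lambda_2^*(t) = 0 \;\longrightarrow\; u^*(t) = -\lambda_2^*(t). \tag{2.7.51}$$
• Step 3: Using the results of Step 2 in Step 1, find the optimal
$\mathcal{H}^*$ as
$$\begin{aligned} \mathcal{H}^*(x_1^*(t), x_2^*(t), \lambda_1^*(t), \lambda_2^*(t)) &= \frac{1}{2} \lambda_2^{*2}(t) + \lambda_1^*(t) x_2^*(t) - \lambda_2^{*2}(t) \\ &= \lambda_1^*(t) x_2^*(t) - \frac{1}{2} \lambda_2^{*2}(t). \end{aligned} \tag{2.7.52}$$
• Step 4: Obtain the state and costate equations from
$$\begin{aligned} \dot{x}_1^*(t) &= +\left(\frac{\partial \mathcal{H}}{\partial \lambda_1}\right)_* = x_2^*(t) \\ \dot{x}_2^*(t) &= +\left(\frac{\partial \mathcal{H}}{\partial \lambda_2}\right)_* = -\lambda_2^*(t) \\ \dot{\lambda}_1^*(t) &= -\left(\frac{\partial \mathcal{H}}{\partial x_1}\right)_* = 0 \\ \dot{\lambda}_2^*(t) &= -\left(\frac{\partial \mathcal{H}}{\partial x_2}\right)_* = -\lambda_1^*(t). \end{aligned} \tag{2.7.53}$$
Solving the previous equations, we have the optimal state and
costate as
$$\begin{aligned} x_1^*(t) &= \frac{C_3}{6} t^3 - \frac{C_4}{2} t^2 + C_2 t + C_1 \\ x_2^*(t) &= \frac{C_3}{2} t^2 - C_4 t + C_2 \\ \lambda_1^*(t) &= C_3 \\ \lambda_2^*(t) &= -C_3 t + C_4. \end{aligned} \tag{2.7.54}$$
• Step 5: Obtain the optimal control from
$$u^*(t) = -\lambda_2^*(t) = C_3 t - C_4 \tag{2.7.55}$$
where $C_1$, $C_2$, $C_3$, and $C_4$ are constants evaluated using the given
boundary conditions (2.7.48). These are found to be
$$C_1 = 1, \quad C_2 = 2, \quad C_3 = 3, \quad \text{and} \quad C_4 = 4. \tag{2.7.56}$$
Finally, we have the optimal states, costates and control as
$$\begin{aligned} x_1^*(t) &= 0.5t^3 - 2t^2 + 2t + 1, \\ x_2^*(t) &= 1.5t^2 - 4t + 2, \\ \lambda_1^*(t) &= 3, \\ \lambda_2^*(t) &= -3t + 4, \\ u^*(t) &= 3t - 4. \end{aligned} \tag{2.7.57}$$
The system with the optimal controller is shown in Figure 2.10.
Figure 2.10 Optimal Controller for Example 2.12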
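The analytical result can be cross-checked by integrating the state and costate equations forward in time. The sketch below is an illustration, not the author's procedure: it assumes the canonical system $\dot{x}_1 = x_2$, $\dot{x}_2 = -\lambda_2$, $\dot{\lambda}_1 = 0$, $\dot{\lambda}_2 = -\lambda_1$, starts from $x(0) = [1 \; 2]'$ with the initial costates $\lambda_1(0) = C_3 = 3$ and $\lambda_2(0) = C_4 = 4$, and uses a hand-written classical Runge-Kutta (RK4) integrator whose name and step count are choices made here.

```python
def rhs(state):
    """State and costate equations of the double integrator example."""
    x1, x2, lam1, lam2 = state
    return [x2, -lam2, 0.0, -lam1]

def rk4(state, t_final, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_final."""
    h = t_final / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = rhs([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

# x1(0), x2(0) from the boundary conditions; lam1(0) = C3, lam2(0) = C4.
initial = [1.0, 2.0, 3.0, 4.0]
x1_final, x2_final, lam1_final, lam2_final = rk4(initial, 2.0, 200)

u_final = -lam2_final  # optimal control u* = -lambda_2* at t = 2
print(x1_final, x2_final, lam1_final, lam2_final, u_final)
```

With these initial costates the integration should arrive at the prescribed final state $x(2) = [1 \; 0]'$, which is exactly the two-point boundary condition that fixes $C_3$ and $C_4$.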
The solution for the set of differential equations (2.7.53) with
the boundary conditions C2.7.48} for Example 2.12 using Symbolic
Toolbox of the M A T L A ~ , Version 6, is shown below.
**************************************************************
%% Solution Using Symbolic Toolbox (STB) in
%% MATLAB Version 6.0
%%
S=dsolve('Dx1=x2,Dx2=-lambda2,Dlambda1=0,Dlambda2=-lambda1', ...
'x1(0)=1,x2(0)=2,x1(2)=1,x2(2)=0')
S =
    lambda1: [1x1 sym]
    lambda2: [1x1 sym]
         x1: [1x1 sym]
         x2: [1x1 sym]
S.x1
ans =
1+2*t-2*t^2+1/2*t^3
S.x2
ans =
2-4*t+3/2*t^2
S.lambda1
ans =
3
S.lambda2
ans =
4-3*t
%% Plot command is used for which we need to
%% convert the symbolic values to numerical values.
j=1;
for tp=0:.02:2
t=sym(tp);
x1p(j)=double(subs(S.x1));
%% subs substitutes the current value of t into S.x1
x2p(j)=double(subs(S.x2));
%% double converts symbolic to numeric
up(j)=-double(subs(S.lambda2));
%% optimal control u = -lambda_2
t1(j)=tp;
j=j+1;
end
plot(t1,x1p,'k',t1,x2p,'k',t1,up,'k:')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
gtext('u(t)')
*********************************************************
It is easy to see that the previous solutions for $x_1^*(t)$, $x_2^*(t)$, $\lambda_1^*(t)$, $\lambda_2^*(t)$, and $u^*(t) = -\lambda_2^*(t)$ obtained by using MATLAB© are the same as those given by the analytical solutions (2.7.57). The optimal control and states are plotted (using MATLAB©) in Figure 2.11.
Figure 2.11 Optimal Control and States for Example 2.12
Next, we consider the fixed-final time and free-final state case (Figure 2.9(b), Table 2.1, Type (c)) of the same system.
Example 2.13
Consider the same Example 2.12 with changed boundary conditions
as
$$x(0) = [1 \;\; 2]'; \quad x_1(2) = 0; \quad x_2(2) \text{ is free}. \tag{2.7.58}$$
Find the optimal control and optimal states.
Solution: Following the procedure illustrated in Table 2.1 (Type (c)), we get the same optimal states, costates, and control as given in (2.7.54) and (2.7.55), which are repeated here for convenience.
$$x_1^*(t) = \frac{C_3}{6}t^3 - \frac{C_4}{2}t^2 + C_2 t + C_1,$$
$$x_2^*(t) = \frac{C_3}{2}t^2 - C_4 t + C_2,$$
$$\lambda_1^*(t) = C_3,$$
$$\lambda_2^*(t) = -C_3 t + C_4,$$
$$u^*(t) = -\lambda_2^*(t) = C_3 t - C_4. \tag{2.7.59}$$
The only difference is in solving for the constants $C_1$ to $C_4$. First of all, note that the performance index (2.7.47) does not contain the terminal cost function $S$. From the given boundary conditions (2.7.58), we have $t_f$ specified to be 2 and hence $\delta t_f$ is zero in the general boundary condition (2.7.32). Also, since $x_2(2)$ is free, $\delta x_{2f}$ is arbitrary and hence the corresponding final condition on the costate becomes
$$\lambda_2(t_f) = \left(\frac{\partial S}{\partial x_2}\right)_{t_f} \;\rightarrow\; \lambda_2(2) = 0 \tag{2.7.60}$$
(since $S = 0$). Thus we have the four boundary conditions as
$$x_1(0) = 1; \quad x_2(0) = 2; \quad x_1(2) = 0; \quad \lambda_2(2) = 0. \tag{2.7.61}$$
With these boundary conditions substituted in (2.7.59), the constants are found to be
$$C_1 = 1; \quad C_2 = 2; \quad C_3 = 15/8; \quad C_4 = 15/4. \tag{2.7.62}$$
Finally, the optimal states, costates and control are given from (2.7.59) and (2.7.62) as
$$x_1^*(t) = \frac{5}{16}t^3 - \frac{15}{8}t^2 + 2t + 1,$$
$$x_2^*(t) = \frac{15}{16}t^2 - \frac{15}{4}t + 2,$$
$$\lambda_1^*(t) = \frac{15}{8},$$
$$\lambda_2^*(t) = -\frac{15}{8}t + \frac{15}{4},$$
$$u^*(t) = \frac{15}{8}t - \frac{15}{4}. \tag{2.7.63}$$
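As a quick numerical cross-check of (2.7.63), one can integrate the state and costate equations (2.7.53) forward from $t = 0$ and confirm that the terminal conditions $x_1(2) = 0$ and $\lambda_2(2) = 0$ are met. The sketch below is an illustration in Python, not the book's MATLAB procedure; it uses a hand-written classical Runge-Kutta step, and the function names `rhs` and `rk4` are assumptions made for this example.

```python
def rhs(y):
    """State/costate equations (2.7.53) with y = [x1, x2, lam1, lam2]."""
    x1, x2, lam1, lam2 = y
    return [x2, -lam2, 0.0, -lam1]

def rk4(y, h, steps):
    """Classical fourth-order Runge-Kutta for the autonomous system rhs."""
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y

# Initial values: x(0) = [1 2]' from (2.7.58); lam1(0) = 15/8 and
# lam2(0) = 15/4 are read off (2.7.63) at t = 0.
y0 = [1.0, 2.0, 15.0 / 8.0, 15.0 / 4.0]
x1_f, x2_f, lam1_f, lam2_f = rk4(y0, h=0.01, steps=200)
print(x1_f, x2_f, lam1_f, lam2_f)
```

The free terminal state $x_2(2)$ comes out as a by-product of the integration; it is not imposed anywhere in the code.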
The solution for the set of differential equations (2.7.53) with
the boundary conditions (2.7.58) for Example 2.13 using Symbolic
Toolbox of the MATLAB©, Version 6, is shown below.
***************************************************************
%% Solution Using Symbolic Toolbox (STB) in
%% MATLAB Version 6.0
%%
S=dsolve('Dx1=x2,Dx2=-lambda2,Dlambda1=0,Dlambda2=-lambda1', ...
'x1(0)=1,x2(0)=2,x1(2)=0,lambda2(2)=0')
S =
    lambda1: [1x1 sym]
    lambda2: [1x1 sym]
         x1: [1x1 sym]
         x2: [1x1 sym]
S.x1
ans =
1+2*t-15/8*t^2+5/16*t^3
S.x2
ans =
2-15/4*t+15/16*t^2
S.lambda1
ans =
15/8
S.lambda2
ans =
-15/8*t+15/4
%% Plot command is used for which we need to
%% convert the symbolic values to numerical values.
j=1;
for tp=0:.02:2
t=sym(tp);
x1p(j)=double(subs(S.x1));
%% subs substitutes the current value of t into S.x1
x2p(j)=double(subs(S.x2));
%% double converts symbolic to numeric
up(j)=-double(subs(S.lambda2));
%% optimal control u = -lambda_2
t1(j)=tp;
j=j+1;
end
plot(t1,x1p,'k',t1,x2p,'k',t1,up,'k:')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
gtext('u(t)')
*******************************************************
It is easy to see that the previous solutions for $x_1^*(t)$, $x_2^*(t)$, $\lambda_1^*(t)$, $\lambda_2^*(t)$, and $u^*(t) = -\lambda_2^*(t)$ obtained by using MATLAB© are the same as those given by (2.7.63) obtained analytically. The optimal control and states for Example 2.13 are plotted in Figure 2.12.
Figure 2.12 Optimal Control and States for Example 2.13
Next, we consider the free-final time and independent free-final state case (Figure 2.9(e), Table 2.1, Type (e)) of the same system.
Example 2.14
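Consider the same Example 2.12 with changed boundary conditions as
$$x(0) = [1 \;\; 2]'; \quad x_1(t_f) = 3; \quad x_2(t_f) \text{ is free}; \quad t_f \text{ is free}. \tag{2.7.64}$$
Find the optimal control and optimal state.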
Solution: Following the procedure illustrated in Table 2.1 (Type (e)), we get the same optimal control, states and costates as given in (2.7.54) and (2.7.55), which are repeated here for convenience.
$$x_1^*(t) = \frac{C_3}{6}t^3 - \frac{C_4}{2}t^2 + C_2 t + C_1,$$
$$x_2^*(t) = \frac{C_3}{2}t^2 - C_4 t + C_2,$$
$$\lambda_1^*(t) = C_3,$$
$$\lambda_2^*(t) = -C_3 t + C_4,$$
$$u^*(t) = -\lambda_2^*(t) = C_3 t - C_4. \tag{2.7.65}$$
The only difference is in solving for the constants $C_1$ to $C_4$ and the unknown $t_f$. First of all, note that the performance index (2.7.47) does not contain the terminal cost function $S$, that is, $S = 0$. From the given boundary conditions (2.7.64), we have $t_f$ unspecified and hence $\delta t_f$ is arbitrary in the general boundary condition (2.7.32), leading to the specific final condition
$$\left(\mathcal{H} + \frac{\partial S}{\partial t}\right)_{t_f} = 0 \;\rightarrow\; \lambda_1(t_f)x_2(t_f) - 0.5\lambda_2^2(t_f) = 0 \tag{2.7.66}$$
where $\mathcal{H}$ is given by (2.7.52). Also, since $x_2(t_f)$ is free, $\delta x_{2f}$ is arbitrary and hence the general boundary condition (2.7.32) becomes
$$\lambda_2(t_f) = \left(\frac{\partial S}{\partial x_2}\right)_{t_f} = 0. \tag{2.7.67}$$
Combining (2.7.64), (2.7.66) and (2.7.67), we have the following 5 boundary conditions for the 5 unknowns (4 constants of integration $C_1$ to $C_4$ and 1 unknown $t_f$) as
$$x_1(0) = 1; \quad x_2(0) = 2; \quad x_1(t_f) = 3; \quad \lambda_2(t_f) = 0; \quad \lambda_1(t_f)x_2(t_f) - 0.5\lambda_2^2(t_f) = 0. \tag{2.7.68}$$
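Using these boundary conditions along with (2.7.65), the constants are found to be
$$C_1 = 1; \quad C_2 = 2; \quad C_3 = 4/9; \quad C_4 = 4/3; \quad t_f = 3. \tag{2.7.69}$$
Finally, the optimal states, costates, and control are given from (2.7.65) and (2.7.69) as
$$x_1^*(t) = \frac{2}{27}t^3 - \frac{2}{3}t^2 + 2t + 1,$$
$$x_2^*(t) = \frac{2}{9}t^2 - \frac{4}{3}t + 2,$$
$$\lambda_1^*(t) = \frac{4}{9},$$
$$\lambda_2^*(t) = -\frac{4}{9}t + \frac{4}{3},$$
$$u^*(t) = \frac{4}{9}t - \frac{4}{3}. \tag{2.7.70}$$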
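The five conditions (2.7.68) are nonlinear in $t_f$, so it is worth checking a candidate solution by direct substitution. The following Python sketch (an illustration, not the book's method) evaluates the residuals of (2.7.68) using the general solution (2.7.65). It assumes that $C_1 = 1$ and $C_2 = 2$ have already been fixed by the initial conditions and that $C_4 = C_3 t_f$ has been obtained from $\lambda_2(t_f) = 0$; the function name `residuals` is hypothetical.

```python
from fractions import Fraction as Fr

def residuals(tf, C3):
    """Residuals of the conditions (2.7.68), using the solution (2.7.65).

    C1 = 1 and C2 = 2 follow from x1(0) = 1 and x2(0) = 2, and
    lam2(tf) = 0 gives C4 = C3 * tf, so only tf and C3 remain free.
    """
    tf, C3 = Fr(tf), Fr(C3)
    C1, C2, C4 = Fr(1), Fr(2), C3 * tf
    x1 = C3 * tf**3 / 6 - C4 * tf**2 / 2 + C2 * tf + C1
    x2 = C3 * tf**2 / 2 - C4 * tf + C2
    lam1 = C3
    lam2 = -C3 * tf + C4
    # x1(tf) - 3, lam2(tf), and the Hamiltonian condition at tf
    return [x1 - 3, lam2, lam1 * x2 - lam2**2 / 2]

book = residuals(3, Fr(4, 9))   # the solution (2.7.69)
other = residuals(1, 0)         # the root with lam1(tf) = 0
print(book, other)
```

Both candidates give zero residuals, which agrees with the remark in the MATLAB listing for this example that two values of $t_f$ (1 and 3) are obtained unless $\lambda_1(t_f) \neq 0$ is imposed.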
The solution for the set of differential equations (2.7.53) with
the boundary conditions (2.7.68) for Example 2.14 using Symbolic
Toolbox of the MATLAB©, Version 6, is shown below.
********************************************************
%% Solution Using Symbolic Toolbox (STB) in
%% MATLAB Version 6
%%
clear all
S=dsolve('Dx1=x2,Dx2=-lam2,Dlam1=0,Dlam2=-lam1', ...
'x1(0)=1,x2(0)=2,x1(tf)=3,lam2(tf)=0')
t='tf';
eq1=subs(S.x1)-'x1tf';
eq2=subs(S.x2)-'x2tf';
eq3=S.lam1-'lam1tf';
eq4=subs(S.lam2)-'lam2tf';
eq5='lam1tf*x2tf-0.5*lam2tf^2';
S2=solve(eq1,eq2,eq3,eq4,eq5,'tf,x1tf,x2tf,lam1tf,lam2tf', ...
'lam1tf<>0')
%% lam1tf<>0 means lam1tf is not equal to 0;
%% This is a condition derived from eq5.
%% Otherwise, without this condition in the above
%% SOLVE routine, we get two values for tf (1 and 3 in this case)
%%
tf=S2.tf
x1tf=S2.x1tf;
x2tf=S2.x2tf;
clear t
x1=subs(S.x1)
x2=subs(S.x2)
lam1=subs(S.lam1)
lam2=subs(S.lam2)
%% Convert the symbolic values to
%% numerical values as shown below.
j=1;
tf=double(subs(S2.tf))
%% converts tf from symbolic to numerical
for tp=0:0.05:tf
t=sym(tp);
%% converts tp from numerical to symbolic
x1p(j)=double(subs(S.x1));
%% subs substitutes the current value of t into S.x1
x2p(j)=double(subs(S.x2));
%% double converts symbolic to numeric
up(j)=-double(subs(S.lam2));
%% optimal control u = -lambda_2
t1(j)=tp;
j=j+1;
end
plot(t1,x1p,'k',t1,x2p,'k',t1,up,'k:')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
gtext('u(t)')
*******************************************************
The optimal control and states for Example 2.14 are plotted in Figure 2.13.
Figure 2.13 Optimal Control and States for Example 2.14
Finally, we consider the fixed-final time and free-final state system with a terminal cost (Figure 2.9(b), Table 2.1, Type (b)).
Example 2.15
We consider the same Example 2.12 with changed performance index
$$J = \frac{1}{2}[x_1(2) - 4]^2 + \frac{1}{2}[x_2(2) - 2]^2 + \frac{1}{2}\int_0^2 u^2(t)\,dt \tag{2.7.71}$$
and boundary conditions as
$$x(0) = [1 \;\; 2]'; \quad x(2) \text{ is free}. \tag{2.7.72}$$
Following the procedure illustrated in Table 2.1 (Type (b)), we get the same optimal control, states and costates as given in (2.7.54) and (2.7.55), which are reproduced here for ready reference. Thus we have
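$$x_1^*(t) = \frac{C_3}{6}t^3 - \frac{C_4}{2}t^2 + C_2 t + C_1,$$
$$x_2^*(t) = \frac{C_3}{2}t^2 - C_4 t + C_2,$$
$$\lambda_1^*(t) = C_3,$$
$$\lambda_2^*(t) = -C_3 t + C_4,$$
$$u^*(t) = -\lambda_2^*(t) = C_3 t - C_4. \tag{2.7.73}$$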
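The only difference is in solving for the constants $C_1$ to $C_4$ using the given and obtained boundary conditions. Since $t_f$ is specified as 2, $\delta t_f$ is zero, and since $x(2)$ is unspecified, $\delta x_f$ is free in the boundary condition (2.7.32), which now reduces to
$$\lambda^*(t_f) = \left(\frac{\partial S}{\partial x}\right)_{*t_f} \tag{2.7.74}$$
where,
$$S = \frac{1}{2}[x_1(2) - 4]^2 + \frac{1}{2}[x_2(2) - 2]^2. \tag{2.7.75}$$
Thus, (2.7.74) becomes
$$\lambda_1^*(t_f) = \left(\frac{\partial S}{\partial x_1}\right)_{t_f} \;\rightarrow\; \lambda_1^*(2) = x_1(2) - 4$$
$$\lambda_2^*(t_f) = \left(\frac{\partial S}{\partial x_2}\right)_{t_f} \;\rightarrow\; \lambda_2^*(2) = x_2(2) - 2. \tag{2.7.76}$$
Now, we have two initial conditions from (2.7.72) and two final conditions from (2.7.76) to solve for the four constants as
$$C_1 = 1; \quad C_2 = 2; \quad C_3 = 3/7; \quad C_4 = 4/7. \tag{2.7.77}$$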
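Finally, we have the optimal states, costates and control given as
$$x_1^*(t) = \frac{1}{14}t^3 - \frac{2}{7}t^2 + 2t + 1,$$
$$x_2^*(t) = \frac{3}{14}t^2 - \frac{4}{7}t + 2,$$
$$\lambda_1^*(t) = \frac{3}{7},$$
$$\lambda_2^*(t) = -\frac{3}{7}t + \frac{4}{7},$$
$$u^*(t) = \frac{3}{7}t - \frac{4}{7}. \tag{2.7.78}$$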
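Because the terminal cost couples the final costate to the final state, the constants $C_3$ and $C_4$ satisfy a pair of linear equations. The short Python sketch below (an illustration with assumed variable names, not the book's procedure) writes those two equations out from (2.7.73) and (2.7.76), taking $C_1 = 1$ and $C_2 = 2$ from the initial conditions, and solves them exactly by Cramer's rule.

```python
from fractions import Fraction as Fr

# With C1 = 1 and C2 = 2 fixed by x(0) = [1 2]', the final conditions (2.7.76)
# applied to (2.7.73) at t = 2 reduce to two linear equations in C3 and C4:
#   lam1(2) = x1(2) - 4   ->   -(1/3) C3 + 2 C4 = 1
#   lam2(2) = x2(2) - 2   ->   -4 C3 + 3 C4 = 0
a11, a12, b1 = Fr(-1, 3), Fr(2), Fr(1)
a21, a22, b2 = Fr(-4), Fr(3), Fr(0)

det = a11 * a22 - a12 * a21          # Cramer's rule for a 2x2 system
C3 = (b1 * a22 - a12 * b2) / det
C4 = (a11 * b2 - b1 * a21) / det

# Substitute back into (2.7.73) at t = 2 to confirm the terminal conditions
x1_2 = C3 * 8 / 6 - C4 * 4 / 2 + 2 * 2 + 1
x2_2 = C3 * 4 / 2 - C4 * 2 + 2
lam1_2 = C3
lam2_2 = -C3 * 2 + C4
print(C3, C4, x1_2, x2_2)
```

The substitution step at the end confirms that $\lambda_1^*(2) = x_1(2) - 4$ and $\lambda_2^*(2) = x_2(2) - 2$ hold for the computed constants.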
The previous results can also be obtained using the Symbolic Math
Toolbox of the MATLAB©, Version 6, as shown below.
***************************************************************
%% Solution Using Symbolic Math Toolbox (STB) in
%% MATLAB Version 6
%%
S=dsolve('Dx1=x2,Dx2=-lambda2,Dlambda1=0,Dlambda2=-lambda1', ...
'x1(0)=1,x2(0)=2,lambda1(2)=x12-4,lambda2(2)=x22-2')
S =
    lambda1: [1x1 sym]
    lambda2: [1x1 sym]
         x1: [1x1 sym]
         x2: [1x1 sym]
t='2';
S2=solve(subs(S.x1)-'x12',subs(S.x2)-'x22','x12,x22');
%% solves for x1(t=2) and x2(t=2)
x12=S2.x12;
x22=S2.x22;
clear t
x1=subs(S.x1)
x1 =
1+2*t-2/7*t^2+1/14*t^3
x2=subs(S.x2)
x2 =
2-4/7*t+3/14*t^2
lambda1=subs(S.lambda1)
lambda1 =
3/7
lambda2=subs(S.lambda2)
lambda2 =
4/7-3/7*t
%% Plot command is used for which we need to
%% convert the symbolic values to numerical values.
j=1;
for tp=0:.02:2
t=sym(tp);
x1p(j)=double(subs(S.x1));
%% subs substitutes the current value of t into S.x1
x2p(j)=double(subs(S.x2));
%% double converts symbolic to numeric
up(j)=-double(subs(S.lambda2));
%% optimal control u = -lambda_2
t1(j)=tp;
j=j+1;
end
plot(t1,x1p,'k',t1,x2p,'k',t1,up,'k:')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
gtext('u(t)')
***************************************************************
It is easy to see that the previous solutions for $x_1^*(t)$, $x_2^*(t)$, $\lambda_1^*(t)$, $\lambda_2^*(t)$, and $u^*(t) = -\lambda_2^*(t)$ obtained by using MATLAB© are the same as those given by (2.7.78) obtained analytically.
The optimal control and states for Example 2.15 are plotted in Figure 2.14.
Figure 2.14 Optimal Control and States for Example 2.15
2.8 Summary of Variational Approach
In this section, we summarize the development of the topics covered so
far in obtaining optimal conditions using the calculus of variations. The
development is carried out in different stages as follows. Also shown is
the systematic link between various stages of development.
2.8.1 Stage I: Optimization of a Functional
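Consider the optimization of
$$J = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt \tag{2.8.1}$$
with the given boundary conditions
$$x(t_0) \text{ fixed and } x(t_f) \text{ free}. \tag{2.8.2}$$
Then, the optimal function $x^*(t)$ should satisfy the Euler-Lagrange equation
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0. \tag{2.8.3}$$
The general boundary condition to be satisfied at the free-final point is given by [79]
$$\left[V - \dot{x}'(t)\left(\frac{\partial V}{\partial \dot{x}}\right)\right]_{*t_f}\delta t_f + \left(\frac{\partial V}{\partial \dot{x}}\right)'_{*t_f}\delta x_f = 0. \tag{2.8.4}$$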
This boundary condition is to be modified depending on the nature of the given $t_f$ and $x(t_f)$. Although the previous general boundary condition is not derived in this book, it can be easily seen to be similar to the general boundary condition (2.7.26) in terms of the Lagrangian, which embeds a performance index and a dynamical plant into a single augmented performance index with integrand $\mathcal{L}$.
The sufficient condition for optimum is the Legendre condition given by
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* > 0 \quad \text{for minimum} \tag{2.8.5}$$
and
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* < 0 \quad \text{for maximum}. \tag{2.8.6}$$
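As a small worked illustration of Stage I, consider the scalar integrand $V = \tfrac{1}{2}\dot{x}^2(t)$. This integrand is chosen here only as an example and does not appear in the text above. Applying (2.8.3) and (2.8.5) gives:

```latex
\begin{align*}
V &= \tfrac{1}{2}\dot{x}^2(t)
  && \text{(assumed example integrand)} \\
\frac{\partial V}{\partial x} - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)
  &= 0 - \frac{d}{dt}\,\dot{x}(t) = -\ddot{x}(t) = 0
  && \text{(Euler-Lagrange equation (2.8.3))} \\
x^*(t) &= c_1 t + c_2
  && \text{(extremal; $c_1$, $c_2$ from the boundary conditions)} \\
\frac{\partial^2 V}{\partial \dot{x}^2} &= 1 > 0
  && \text{(Legendre condition (2.8.5): a minimum)}
\end{align*}
```

The extremal is a straight line, and the positive second derivative with respect to $\dot{x}$ indicates that it minimizes the functional.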
2.8.2 Stage II: Optimization of a Functional with
Condition
Consider the optimization of a functional
$$J = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt \tag{2.8.7}$$
with given boundary values as
$$x(t_0) \text{ fixed and } x(t_f) \text{ free}, \tag{2.8.8}$$
and the condition relation
$$g(x(t), \dot{x}(t), t) = 0. \tag{2.8.9}$$
Here, the condition (2.8.9) is absorbed by forming the augmented functional
$$J_a = \int_{t_0}^{t_f} \mathcal{L}(x(t), \dot{x}(t), \lambda(t), t)\,dt \tag{2.8.10}$$
where, $\lambda(t)$ is the Lagrange multiplier (also called the costate function), and $\mathcal{L}$ is the Lagrangian given by
$$\mathcal{L}(x(t), \dot{x}(t), \lambda(t), t) = V(x(t), \dot{x}(t), t) + \lambda'(t)g(x(t), \dot{x}(t), t). \tag{2.8.11}$$
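Now, we just use the results of the previous Stage I for the augmented functional (2.8.10), except that its integrand is $\mathcal{L}$ instead of $V$. For the optimal condition, we have the Euler-Lagrange equation (2.8.3) for the augmented functional (2.8.10) given in terms of $x(t)$ and $\lambda(t)$ as
$$\left(\frac{\partial \mathcal{L}}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)_* = 0 \quad \text{state equation and} \tag{2.8.12}$$
$$\left(\frac{\partial \mathcal{L}}{\partial \lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\lambda}}\right)_* = 0 \quad \text{costate equation}. \tag{2.8.13}$$
Let us note from (2.8.11) that the Lagrangian $\mathcal{L}$ is independent of $\dot{\lambda}^*(t)$ and that the Euler-Lagrange equation (2.8.13) for the costate $\lambda(t)$ is nothing but the constraint relation (2.8.9). The general boundary condition (2.8.4) to be satisfied at the free-final point (in terms of $\mathcal{L}$) is given by
$$\left[\mathcal{L} - \dot{x}'(t)\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)\right]_{*t_f}\delta t_f + \left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)'_{*t_f}\delta x_f = 0. \tag{2.8.14}$$
This boundary condition is to be modified depending on the nature of the given $t_f$ and $x(t_f)$.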
2.8.3 Stage III: Optimal Control System with
Lagrangian Formalism
Here, we consider the standard control system with a plant described by [56]
$$\dot{x}(t) = f(x(t), u(t), t), \tag{2.8.15}$$
the given boundary conditions as
$$x(t_0) \text{ is fixed and } x(t_f) \text{ is free}, \tag{2.8.16}$$
and the performance index as
$$J(u(t)) = \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt. \tag{2.8.17}$$
Now, we rewrite the plant equation (2.8.15) in the form of the condition relation (2.8.9) as
$$g(x(t), \dot{x}(t), u(t), t) = f(x(t), u(t), t) - \dot{x}(t) = 0. \tag{2.8.18}$$
Then we form the augmented functional out of the performance index (2.8.17) and the condition relation (2.8.18) as
$$J_a(u(t)) = \int_{t_0}^{t_f} \mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t), t)\,dt \tag{2.8.19}$$
where, the Lagrangian $\mathcal{L}$ is given as
$$\mathcal{L} = \mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t)\{f(x(t), u(t), t) - \dot{x}(t)\}. \tag{2.8.20}$$
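Now we just use the previous results of Stage II. For the optimal condition, we have the set of Euler-Lagrange equations (2.8.12) and (2.8.13) for the augmented functional (2.8.19) given in terms of $x(t)$, $\lambda(t)$, and $u(t)$ as
$$\left(\frac{\partial \mathcal{L}}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)_* = 0 \quad \text{state equation}, \tag{2.8.21}$$
$$\left(\frac{\partial \mathcal{L}}{\partial \lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\lambda}}\right)_* = 0 \quad \text{costate equation, and} \tag{2.8.22}$$
$$\left(\frac{\partial \mathcal{L}}{\partial u}\right)_* - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{u}}\right)_* = 0 \quad \text{control equation}. \tag{2.8.23}$$
Note from (2.8.20) that the Lagrangian $\mathcal{L}$ is independent of $\dot{\lambda}^*(t)$ and $\dot{u}^*(t)$ and that the Euler-Lagrange equation (2.8.22) is the same as the constraint relation (2.8.18). The general boundary condition (2.8.14) to be satisfied at the free-final point becomes
$$\left[\mathcal{L} - \dot{x}'(t)\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)\right]_{*t_f}\delta t_f + \left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)'_{*t_f}\delta x_f = 0. \tag{2.8.24}$$
This boundary condition is to be modified depending on the nature of the given $t_f$ and $x(t_f)$.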
2.8.4 Stage IV: Optimal Control System with
Hamiltonian Formalism: Pontryagin Principle
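Here, we just transform the previous Lagrangian formalism to Hamiltonian formalism by defining the Hamiltonian as [57]
$$\mathcal{H}(x(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t)f(x(t), u(t), t) \tag{2.8.25}$$
which in terms of the Lagrangian (2.8.20) becomes
$$\mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t), t) = \mathcal{H}(x(t), u(t), \lambda(t), t) - \lambda'(t)\dot{x}(t). \tag{2.8.26}$$
Now using (2.8.26), the set of Euler-Lagrange equations (2.8.21) to (2.8.23), which are in terms of the Lagrangian, are rewritten in terms of the Hamiltonian as
$$\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* - \frac{d}{dt}\left(-\lambda^*(t)\right) = 0 \tag{2.8.27}$$
$$\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_* - \dot{x}^*(t) - \frac{d}{dt}(0) = 0 \tag{2.8.28}$$
$$\left(\frac{\partial \mathcal{H}}{\partial u}\right)_* - \frac{d}{dt}(0) = 0 \tag{2.8.29}$$
which in turn are rewritten as
$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_* \quad \text{state equation}, \tag{2.8.30}$$
$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* \quad \text{costate equation, and} \tag{2.8.31}$$
$$0 = +\left(\frac{\partial \mathcal{H}}{\partial u}\right)_* \quad \text{control equation}. \tag{2.8.32}$$
Similarly using (2.8.26), the boundary condition (2.8.24) is rewritten in terms of the Hamiltonian as
$$\left[\mathcal{H} - \lambda'(t)\dot{x}(t) - \dot{x}'(t)(-\lambda(t))\right]_{*t_f}\delta t_f + \left[-\lambda'(t)\right]_{*t_f}\delta x_f = 0 \tag{2.8.33}$$
which becomes
$$\left[\mathcal{H}\right]_{*t_f}\delta t_f - \lambda^{*\prime}(t_f)\,\delta x_f = 0. \tag{2.8.34}$$
The sufficient condition is
$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_* > 0 \quad \text{for minimum and} \tag{2.8.35}$$
$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_* < 0 \quad \text{for maximum}. \tag{2.8.36}$$
The state, costate, and control equations (2.8.30) to (2.8.32) are solved along with the given initial condition (2.8.16) and the final condition (2.8.34), leading us to a two-point boundary value problem (TPBVP).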
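To make the roles of the three partial derivatives of the Hamiltonian concrete, the sketch below evaluates them numerically for the double-integrator plant of Example 2.12 ($V = \tfrac{1}{2}u^2$, $f = [x_2, u]'$). It is written in Python purely as an illustration; the central-difference helper `partial` and the step size are assumptions of this sketch, and the sample point is the optimal solution (2.7.57) evaluated at $t = 0$.

```python
def H(x, u, lam):
    """Hamiltonian (2.8.25) for V = u^2/2 and f = [x2, u]'."""
    return 0.5 * u**2 + lam[0] * x[1] + lam[1] * u

def partial(fun, point, i, h=1e-3):
    """Central-difference estimate of d fun / d point[i]."""
    up = list(point)
    up[i] += h
    dn = list(point)
    dn[i] -= h
    return (fun(up) - fun(dn)) / (2 * h)

# Values of (2.7.57) at t = 0: x = [1, 2], u = -4, lambda = [3, 4]
x, u, lam = [1.0, 2.0], -4.0, [3.0, 4.0]

state_rhs = [partial(lambda l: H(x, u, l), lam, i) for i in range(2)]     # +dH/dlambda
costate_rhs = [-partial(lambda s: H(s, u, lam), x, i) for i in range(2)]  # -dH/dx
dH_du = partial(lambda v: H(x, v[0], lam), [u], 0)                        # dH/du
print(state_rhs, costate_rhs, dH_du)
```

The state right-hand side reproduces $f = [x_2, u]'$, the costate right-hand side reproduces $[0, -\lambda_1]'$ as in (2.7.53), and $\partial\mathcal{H}/\partial u$ vanishes because the sample point lies on the optimal trajectory.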
Free-Final Point System with Final Cost Function
This problem is an extension of the problem in Stage IV, with the addition of a final cost function. We summarize the result at the risk of repeating some of the equations. Let the plant be described as
$$\dot{x}(t) = f(x(t), u(t), t) \tag{2.8.37}$$
and the performance index be
$$J(u(t)) = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{2.8.38}$$
along with the boundary conditions
$$x(t_0) \text{ is fixed and } x(t_f) \text{ is free}. \tag{2.8.39}$$
Now, if we rewrite the performance index (2.8.38) to absorb the final cost function $S$ within the integrand, then the results of Stage III can be used to get the optimal conditions. Thus we rewrite (2.8.38) as
$$J_1(u(t)) = \int_{t_0}^{t_f}\left[V(x(t), u(t), t) + \left(\frac{\partial S}{\partial x}\right)'\dot{x}(t) + \frac{\partial S}{\partial t}\right]dt. \tag{2.8.40}$$
Now we repeat the results of Stage III except for the modification of the final condition equation (2.8.34). Thus the state, costate and control equations are
$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_* \quad \text{state equation} \tag{2.8.41}$$
$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* \quad \text{costate equation} \tag{2.8.42}$$
$$0 = +\left(\frac{\partial \mathcal{H}}{\partial u}\right)_* \quad \text{control equation} \tag{2.8.43}$$
and the final boundary condition is
$$\left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{*t_f}\delta t_f + \left[\left(\frac{\partial S}{\partial x}\right)_* - \lambda^*(t)\right]'_{t_f}\delta x_f = 0. \tag{2.8.44}$$
The sufficient condition for optimum is
$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_* > 0 \quad \text{for minimum and} \tag{2.8.45}$$
$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_* < 0 \quad \text{for maximum}. \tag{2.8.46}$$
The state, costate, and control equations (2.8.41) to (2.8.43) are solved along with the given initial condition (2.8.39) and the final condition (2.8.44); thus this formulation leads us to a TPBVP.
2.8.5 Salient Features
We now discuss the various features of the methodology used so far in obtaining the optimal conditions through the use of the calculus of variations [6, 79, 120, 108]. Also, we need to consider the problems discussed above under the various stages of development. So we refer to the appropriate relations of, say, Stage III or Stage IV during our discussion.
1. Significance of Lagrange Multiplier: The Lagrange multiplier $\lambda(t)$ is also called the costate (or adjoint) function.
(a) The Lagrange multiplier $\lambda(t)$ is introduced to "take care of" the constraint relation imposed by the plant equation (2.8.15).
(b) The costate variable $\lambda(t)$ enables us to use the Euler-Lagrange equation for each of the variables $x(t)$ and $u(t)$ separately, as if they were independent of each other, although they are dependent on each other as per the plant equation.
2. Lagrangian and Hamiltonian: We defined the Lagrangian and Hamiltonian as
$$\mathcal{L} = \mathcal{L}(x(t), \dot{x}(t), \lambda(t), u(t), t) = V(x(t), u(t), t) + \lambda'(t)\{f(x(t), u(t), t) - \dot{x}(t)\} \tag{2.8.47}$$
$$\mathcal{H} = \mathcal{H}(x(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t)f(x(t), u(t), t). \tag{2.8.48}$$
In defining the Lagrangian and Hamiltonian we use the vector notation extensively; still, it should be noted that $\mathcal{L}$ and $\mathcal{H}$ are scalar functions only.
3. Optimization of Hamiltonian
(a) The control equation (2.8.32) indicates the optimization of the Hamiltonian w.r.t. the control $u(t)$. That is, the optimization of the original performance index (2.8.17), which is a functional subject to the plant equation (2.8.15), is equivalent to the optimization of the Hamiltonian function w.r.t. $u(t)$. Thus, we "reduced" our original functional optimization problem to an ordinary function optimization problem.
(b) We note that we assumed unconstrained or unbounded control $u(t)$ and obtained the control relation $\partial\mathcal{H}/\partial u = 0$.
(c) If $u(t)$ is constrained or bounded as being a member of the set $U$, i.e., $u(t) \in U$, we can no longer take $\partial\mathcal{H}/\partial u = 0$, since $u(t)$, so calculated, may lie outside the range of the permissible region $U$. In practice, the control $u(t)$ is always limited by such things as saturation of amplifiers, speed of a motor, or thrust of a rocket. The constrained optimal control systems are discussed in Chapter 7.
(d) Regardless of any constraints on $u(t)$, Pontryagin had shown that $u(t)$ must still be chosen to minimize the Hamiltonian. A rigorous proof of the fact that $u(t)$ must be chosen to optimize the $\mathcal{H}$ function is Pontryagin's most notable contribution to optimal control theory. For this reason, the approach is often called the Pontryagin Principle. So in the case of constrained control, it is shown that
$$\min_{u \in U} \mathcal{H}(x^*(t), \lambda^*(t), u(t), t) = \mathcal{H}(x^*(t), \lambda^*(t), u^*(t), t) \tag{2.8.49}$$
or equivalently
$$\mathcal{H}(x^*(t), \lambda^*(t), u^*(t), t) \le \mathcal{H}(x^*(t), \lambda^*(t), u(t), t). \tag{2.8.50}$$
4. Pontryagin Maximum Principle: Originally, Pontryagin used a slightly different performance index which is maximized rather than minimized, and hence it is called the Pontryagin Maximum Principle. For this reason, the Hamiltonian is also sometimes called the Pontryagin H-function. Let us note that minimization of the performance index $J$ is equivalent to the maximization of $-J$. Then, if we define the Hamiltonian as
$$\mathcal{H}(x(t), u(t), \bar{\lambda}(t), t) = -V(x(t), u(t), t) + \bar{\lambda}'(t)f(x(t), u(t), t) \tag{2.8.51}$$
we have the Maximum Principle. Further discussion on the Pontryagin Principle is given in Chapter 6.
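5. Hamiltonian at the Optimal Condition: At the optimal condition the Hamiltonian can be written as
$$\mathcal{H}^* = \mathcal{H}^*(x^*(t), u^*(t), \lambda^*(t), t)$$
$$\frac{d\mathcal{H}^*}{dt} = \left(\frac{\partial \mathcal{H}}{\partial x}\right)'_* \dot{x}^*(t) + \left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)'_* \dot{\lambda}^*(t) + \left(\frac{\partial \mathcal{H}}{\partial u}\right)'_* \dot{u}^*(t) + \left(\frac{\partial \mathcal{H}}{\partial t}\right)_*. \tag{2.8.52}$$
Using the state, costate and control equations (2.8.30) to (2.8.32) in the previous equation, we get
$$\frac{d\mathcal{H}^*}{dt} = \left(\frac{\partial \mathcal{H}}{\partial t}\right)_*. \tag{2.8.53}$$
We observe that along the optimal trajectory, the total derivative of $\mathcal{H}$ w.r.t. time is the same as the partial derivative of $\mathcal{H}$ w.r.t. time. If $\mathcal{H}$ does not depend on $t$ explicitly, then
$$\left.\frac{d\mathcal{H}}{dt}\right|_* = 0 \tag{2.8.54}$$
and $\mathcal{H}$ is constant w.r.t. the time $t$ along the optimal trajectory.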
6. Two-Point Boundary Value Problem (TPBVP): As seen earlier, the optimal control problem of a dynamical system leads to a TPBVP.
(a) The state and costate equations (2.8.30) and (2.8.31) are solved using the initial and final conditions. In general, these are nonlinear and time varying, and we may have to resort to numerical methods for their solutions.
(b) We note that the state and costate equations are the same for any kind of boundary conditions.
(c) For the optimal control system, although obtaining the state and costate equations is very easy, the computation of their solutions is quite tedious. This is the unfortunate feature of optimal control theory. It is the price one must pay for demanding the best performance from a system. One has to weigh the optimization of the system against the computational burden.
Figure 2.15 Open-Loop Optimal Control
7. Open-Loop Optimal Control: In solving the TPBVP arising due to the state and costate equations, and then substituting in the control equation, we get only the open-loop optimal control as shown in Figure 2.15. Here, one has to construct or realize an open-loop optimal controller (OLOC), and in many cases this is very tedious. Also, changes in plant parameters are not taken into account by the OLOC. This prompts us to think in terms of a closed-loop optimal control (CLOC), i.e., to obtain the optimal control $u^*(t)$ in terms of the state $x^*(t)$ as shown in Figure 2.16. This CLOC will have many advantages, such as responsiveness to plant parameter variations and simplified construction of the controller. The closed-loop optimal control systems are discussed in Chapter 7.
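The point made in item 3 about constrained control can be illustrated numerically. For the Hamiltonian of Example 2.12, the part that depends on $u$ is $\tfrac{1}{2}u^2 + \lambda_2 u$. The Python sketch below assumes, purely for illustration, an admissible set $U = \{u : |u| \le 1\}$ (this bound is hypothetical and does not appear in the text) and compares the clipped value of $-\lambda_2$ with a brute-force minimization of the Hamiltonian over $U$, in the sense of (2.8.49).

```python
def hamiltonian_u_part(u, lam2):
    """u-dependent part of H for V = u^2/2 and f2 = u: u^2/2 + lam2*u."""
    return 0.5 * u * u + lam2 * u

def u_pointwise_min(lam2, u_max=1.0):
    """Minimizer over |u| <= u_max: the unconstrained value -lam2, clipped to U."""
    return max(-u_max, min(u_max, -lam2))

def u_grid_min(lam2, u_max=1.0, n=2001):
    """Brute-force minimizer over an evenly spaced grid on [-u_max, u_max]."""
    grid = [-u_max + 2.0 * u_max * k / (n - 1) for k in range(n)]
    return min(grid, key=lambda u: hamiltonian_u_part(u, lam2))

results = []
for lam2 in (-3.0, -0.4, 0.0, 0.25, 4.0):
    results.append((lam2, u_pointwise_min(lam2), u_grid_min(lam2)))
    print(results[-1])
```

When $|\lambda_2| \le 1$ the minimizer coincides with the unconstrained relation $u = -\lambda_2$ from $\partial\mathcal{H}/\partial u = 0$; otherwise the minimum of the Hamiltonian lies on the boundary of $U$, where $\partial\mathcal{H}/\partial u \neq 0$.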
Figure 2.16 Closed-Loop Optimal Control
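The constancy of the Hamiltonian along the optimal trajectory, stated in (2.8.54), can be checked on Example 2.12, whose Hamiltonian does not depend on $t$ explicitly. The following Python sketch (an illustration only; the function name is an assumption) evaluates $\mathcal{H}^*$ from (2.7.52) along the optimal solution (2.7.57) at several instants using exact rational arithmetic.

```python
from fractions import Fraction as Fr

def hamiltonian_on_optimum(t):
    """H* = lam1*x2 - lam2^2/2 from (2.7.52), on the trajectory (2.7.57)."""
    t = Fr(t)
    x2 = Fr(3, 2) * t**2 - 4 * t + 2
    lam1 = Fr(3)
    lam2 = -3 * t + 4
    return lam1 * x2 - lam2**2 / 2

# Sample t = 0, 0.25, ..., 2
values = [hamiltonian_on_optimum(Fr(k, 4)) for k in range(9)]
print(values)
```

Every sample gives the same value, $\mathcal{H}^* = -2$, as expected from (2.8.54).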
2.9 Problems
1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and
plot all the optimal controls and states for all problems. Provide
the relevant MATLAB© m files.
Problem 2.1 Find the extremal of the following functional
with the initial condition as x(O) = 0 and the final condition as x(2) =
5.
Problem 2.2 Find the extremal of the functional
to satisfy the boundary conditions x( 2) = 3, and x(O) = O.
Problem 2.3 Find the extremal for the following functional
with x(l) = 1 and x(2) = 10.
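Problem 2.4 Consider the extremization of a functional which is dependent on derivatives higher than the first derivative $\dot{x}(t)$, such as
$$J(x(t), t) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), \ddot{x}(t), t)\,dt$$
with fixed-end point conditions. Show that the corresponding Euler-Lagrange equation is given by
$$\frac{\partial V}{\partial x} - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right) + \frac{d^2}{dt^2}\left(\frac{\partial V}{\partial \ddot{x}}\right) = 0.$$
Similarly, show that, in general, for extremization of
$$J = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), \ddot{x}(t), \ldots, x^{(r)}(t), t)\,dt$$
with fixed-end point conditions, the Euler-Lagrange equation becomes
$$\sum_{i=0}^{r} (-1)^i \frac{d^i}{dt^i}\left(\frac{\partial V}{\partial x^{(i)}}\right) = 0.$$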
Problem 2.5 A first order system is given by
$$\dot{x}(t) = ax(t) + bu(t)$$
and the performance index is
$$J = \frac{1}{2}\int_0^{t_f} (qx^2(t) + ru^2(t))\,dt$$
where $x(t_0) = x_0$, $x(t_f)$ is free, and $t_f$ is fixed. Show that the optimal state $x^*(t)$ is given by
$$x^*(t) = x_0\,\frac{\sinh\beta(t_f - t)}{\sinh\beta t_f},$$
Problem 2.6 A mechanical system is described by
$$\ddot{x}(t) = u(t).$$
Find the optimal control and the states by minimizing
$$J = \frac{1}{2}\int_0^5 u^2(t)\,dt$$
such that the boundary conditions are
$$x(t = 0) = 2; \quad x(t = 5) = 0; \quad \dot{x}(t = 0) = 2; \quad \dot{x}(t = 5) = 0.$$
Problem 2.7 For the first order system
$$\frac{dx}{dt} = x(t) + u(t)$$
find the optimal control $u^*(t)$ to minimize
$$J = \int_0^{t_f} [x^2(t) + u^2(t)]\,dt$$
where $t_f$ is unspecified, $x(0) = 5$, and $x(t_f) = 0$. Also find $t_f$.
Problem 2.8 Find the optimal control u* (t) of the plant
Xl(t)=X2(t); Xl(O) =3, xl(2)=0
X2(t) = 2Xl(t) + 5u(t); X2(0) = 5, x2(2) = 0
which minimizes the performance index
J = ~ l [xM + u
2
(t)] dt.
Problem 2.9 A second order plant is described by
Xl(t) =X2(t)
X2(t) = 2Xl(t)  3X2(t) + 5u(t)
and the cost function is
J = f" [xi(t) + u
2
(t)Jdt.
Find the optimal control, when Xl(O) = 3 and X2(0) = 2.
Problem 2.10 For a second order system
$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = 2x_1(t) + 3u(t)$$
with performance index
$$J = 0.5x_1^2(\pi/2) + \int_0^{\pi/2} 0.5u^2(t)\,dt$$
and boundary conditions $x(0) = [0 \;\; 1]'$ and $x(t_f)$ is free, find the optimal control.
Problem 2.11 Find the optimal control for the plant
$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = 2x_1(t) + 3u(t)$$
with performance criterion
$$J = \frac{1}{2}F_{11}[x_1(t_f) - 4]^2 + \frac{1}{2}F_{22}[x_2(t_f) - 2]^2 + \frac{1}{2}\int_0^{t_f}[x_1^2(t) + 2x_2^2(t) + 4u^2(t)]\,dt$$
and initial conditions as $x(0) = [1 \;\; 2]'$. The additional conditions are given below.
1. Fixed-final conditions: $F_{11} = 0$, $F_{22} = 0$, $t_f = 2$, $x(2) = [4 \;\; 6]'$.
2. Free-final time conditions: $F_{11} = 3$, $F_{22} = 5$, $x(t_f) = [4 \;\; 6]'$ and $t_f$ is free.
3. Free-final state conditions: $F_{11} = 0$, $x_1(2)$ is free and $x_2(2) = 6$.
4. Free-final time and free-final state conditions: $F_{11} = 3$, $F_{22} = 5$, and the final state to have $x_1(t_f) = 4$ and $x_2(t_f)$ to lie on $\theta(t) = 5t + 15$.
Problem 2.12 For the D.C. motor speed control system described in Problem 1.1, find the open-loop optimal control to keep the speed constant at a particular value and the system to respond for any disturbances from the regulated value.
Problem 2.13 For the liquid-level control system described in Problem 1.2, find the open-loop optimal control to keep the liquid level constant at a reference value and the system to act only if there is a change in the liquid level.
Problem 2.14 For the inverted pendulum control system described in Problem 1.3, find the open-loop optimal control to keep the pendulum in a vertical position.
Problem 2.15 For the mechanical control system described in Problem 1.4, find the open-loop optimal control to keep the system at equilibrium condition and act only if there is a disturbance.
Problem 2.16 For the automobile suspension control system described in Problem 1.5, find the open-loop optimal control to provide minimum control energy and passenger comfort.
Problem 2.17 For the chemical control system described in Problem 1.6, find the open-loop optimal control to keep the system at equilibrium condition and act only if there is a disturbance.
Chapter 3
Linear Quadratic Optimal
Control Systems I
In this chapter, we present the closed-loop optimal control of linear plants or systems with quadratic performance index or measure. This leads to the linear quadratic regulator (LQR) system dealing with state regulation, output regulation, and tracking. Broadly speaking, we are interested in the design of optimal linear systems with quadratic performance indices. It is suggested that the student review the material in Appendices A and B given at the end of the book. This chapter is inspired by [6, 3, 89].¹
3.1 Problem Formulation
We discuss the plant and the quadratic performance index with particular reference to physical significance. This helps us to obtain some elegant mathematical conditions on the choice of various matrices in the quadratic cost functional. Thus, we will be dealing with an optimization problem from the engineering perspective.
Consider a linear, time-varying (LTV) system
$$\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{3.1.1}$$
$$y(t) = C(t)x(t) \tag{3.1.2}$$
with a cost functional (CF) or performance index (PI)
$$J(u(t)) = J(x(t_0), u(t), t_0) = \frac{1}{2}[z(t_f) - y(t_f)]'F(t_f)[z(t_f) - y(t_f)] + \frac{1}{2}\int_{t_0}^{t_f}\Big[[z(t) - y(t)]'Q(t)[z(t) - y(t)] + u'(t)R(t)u(t)\Big]dt \tag{3.1.3}$$
¹ The permissions given by John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons, Inc., New York, NY, 1986, and McGraw-Hill for M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and Its Applications, McGraw-Hill Book Company, New York, NY, 1966, are hereby acknowledged.
where $x(t)$ is the $n$-dimensional state vector, $y(t)$ is the $m$-dimensional output vector, $z(t)$ is the $m$-dimensional reference or desired output vector (or the $n$-dimensional desired state vector, if the state $x(t)$ is available), $u(t)$ is the $r$-dimensional control vector, and $e(t) = z(t) - y(t)$ (or $e(t) = z(t) - x(t)$, if the state $x(t)$ is directly available) is the $m$-dimensional error vector. $A(t)$ is the $n \times n$ state matrix, $B(t)$ is the $n \times r$ control matrix, and $C(t)$ is the $m \times n$ output matrix. We assume that the control $u(t)$ is unconstrained, $0 < m \le r \le n$, and all the states and/or outputs are completely measurable. The preceding cost functional (3.1.3) contains quadratic terms in the error $e(t)$ and the control $u(t)$ and is hence called the quadratic cost functional.² We also make certain assumptions, to be described below, on the various matrices in the quadratic cost functional (3.1.3). Under these assumptions, we will find that the optimal control is closed-loop in nature, that is, the optimal control $u(t)$ is a function of the state $x(t)$ or the output $y(t)$. Also, depending on the final time $t_f$ being finite (infinite), the system is called a finite (infinite) time horizon system. Further, we have the following categories of systems.
1. If our objective is to keep the state x(t) near zero (i.e., z(t)
o and C = I), then we call it state regulator system. In other
words, the objective is to obtain a control u(t) which takes the
plant described by (3.1.1) and (3.1.2) from a nonzero state to
zero state. This situation may arise when a plant is subjected
to unwanted disturbances that perturb the state (for example,
sudden load changes in an electrical voltage regulator system,
sudden wind gust in a radar antenna positional control system).
2. If our interest is to keep the output y(t) near zero (i.e., z(t) = 0),
then it is termed the output regulator system.
2 See Appendix A for more details on quadratic forms, definiteness, and other
related topics.
3. If we try to keep the output or state near a desired state or output,
then we are dealing with a tracking system. We see that in both
state and output regulator systems, the desired or reference state
is zero and in tracking system the error is to be made zero. For
example, consider again the antenna control system to track an
aircraft.
Let us consider the various matrices in the cost functional (3.1.3) and
their implications.
1. The Error Weighted Matrix Q(t): In order to keep the error e(t)
small and the error squared nonnegative, the integral of the expression
(1/2) e'(t)Q(t)e(t) should be nonnegative and small. Thus, the
matrix Q(t) must be positive semidefinite. Due to the quadratic
nature of the weighting, large errors are penalized more heavily
than small errors.
2. The Control Weighted Matrix R(t): The quadratic nature of the
control cost expression (1/2) u'(t)R(t)u(t) indicates that one has to
pay a higher cost for a larger control effort. Since the cost of the
control has to be a positive quantity, the matrix R(t) should be
positive definite.
3. The Control Signal u(t): The assumption that there are no
constraints on the control u(t) is very important in obtaining the
closed-loop optimal configuration.
Combining all the previous assumptions, we would like, on the one
hand, to keep the error small, but on the other hand, we must
not pay too high a cost for large controls.
4. The Terminal Cost Weighted Matrix F(tf): The main purpose of
this term is to ensure that the error e(t) at the final time tf is
as small as possible. To guarantee this, the corresponding matrix
F(tf) should be positive semidefinite.
Further, without loss of generality, we assume that the weighted
matrices Q(t), R(t), and F(tf) are symmetric. The quadratic cost
functional described previously has some attractive features:
(a) It provides an elegant procedure for the design of a closed-loop
optimal controller.
(b) It results in an optimal feedback control that is a linear
function of the state.
That is why we often say that the "quadratic performance index
fits like a glove" [6].
5. Infinite Final Time: When the final time tf is infinity, the
terminal cost term involving F(tf) does not provide any realistic sense
since we are always interested in the solutions over finite time.
Hence, F(tf) must be zero.
3.2 Finite-Time Linear Quadratic Regulator
Now we proceed with the linear quadratic regulator (LQR) system,
that is, to keep the state near zero during the interval of interest. For
the sake of completeness we shall repeat the plant and performance
index equations described in the earlier section. Consider a linear,
time-varying plant described by
$$
\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{3.2.1}
$$
with a cost functional
$$
\begin{aligned}
J(u) &= J(x(t_0), u(t), t_0) \\
&= \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \big[ x'(t)Q(t)x(t) + u'(t)R(t)u(t) \big]\,dt \\
&= \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}
\begin{bmatrix} x'(t) & u'(t) \end{bmatrix}
\begin{bmatrix} Q(t) & 0 \\ 0 & R(t) \end{bmatrix}
\begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt
\end{aligned}
\tag{3.2.2}
$$
where the various vectors and matrices are defined in the last section.
Let us note that here the reference or desired state z(t) = 0 and hence
the error e(t) = 0 - x(t) is (the negative of) the state itself, thereby
implying a state regulator system. We summarize again the various
assumptions as follows.
1. The control u(t) is unconstrained. However, in many physical
situations, there are limitations on the control and state, and the
case of constrained control is discussed in a later chapter.
2. The initial condition x(t = t0) = x0 is given. The terminal time
tf is specified, and the final state x(tf) is not specified.
3. The terminal cost matrix F(tf) and the error weighted matrix
Q(t) are nxn positive semidefinite matrices, and the
control weighted matrix R(t) is an rxr positive definite matrix.
4. Finally, the fraction 1/2 in the cost functional (3.2.2) is included
mainly to cancel a 2 that would otherwise have been carried
throughout the result, as seen later.
We follow the Pontryagin procedure described in Chapter 2 (Table 2.1)
to obtain the optimal solution and then propose the closed-loop
configuration. First, let us list the various steps under which we present the
method.
• Step 1: Hamiltonian
• Step 2: Optimal Control
• Step 3: State and Costate System
• Step 4: Closed-Loop Optimal Control
• Step 5: Matrix Differential Riccati Equation
Now let us discuss the preceding steps in detail.
• Step 1: Hamiltonian: Using the definition of the Hamiltonian
given by (2.7.27) in Chapter 2 along with the performance index
(3.2.2), formulate the Hamiltonian as
$$
\mathcal{H}(x(t), u(t), \lambda(t)) = \frac{1}{2}x'(t)Q(t)x(t) + \frac{1}{2}u'(t)R(t)u(t)
+ \lambda'(t)\,[A(t)x(t) + B(t)u(t)] \tag{3.2.3}
$$
where λ(t) is the costate vector of nth order.
• Step 2: Optimal Control: Obtain the optimal control u*(t) using
the control relation (2.7.29) as
$$
\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; R(t)u^*(t) + B'(t)\lambda^*(t) = 0 \tag{3.2.4}
$$
leading to
$$
u^*(t) = -R^{-1}(t)B'(t)\lambda^*(t) \tag{3.2.5}
$$
where we used
$$
\frac{\partial}{\partial u}\left\{ \frac{1}{2}u'(t)R(t)u(t) \right\} = R(t)u(t)
\quad\text{and}\quad
\frac{\partial}{\partial u}\left\{ \lambda'(t)B(t)u(t) \right\} = B'(t)\lambda(t).
$$
Similar expressions are used throughout the rest of the book.
Further details on such relations are found in Appendix A. We
immediately notice from (3.2.5) the need for R(t) to be positive
definite and not positive semidefinite so that the inverse R^{-1}(t)
exists.
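• Step 3: State and Costate System: Obtain the state and costate
equations according to (2.7.30) and (2.7.31) as
$$
\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_* \;\longrightarrow\;
\dot{x}^*(t) = A(t)x^*(t) + B(t)u^*(t) \tag{3.2.6}
$$
$$
\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* \;\longrightarrow\;
\dot{\lambda}^*(t) = -Q(t)x^*(t) - A'(t)\lambda^*(t). \tag{3.2.7}
$$
Substitute the control relation (3.2.5) in the state equation (3.2.6)
to obtain the (state and costate) canonical system (also called
Hamiltonian system) of equations
$$
\begin{bmatrix} \dot{x}^*(t) \\ \dot{\lambda}^*(t) \end{bmatrix} =
\begin{bmatrix} A(t) & -E(t) \\ -Q(t) & -A'(t) \end{bmatrix}
\begin{bmatrix} x^*(t) \\ \lambda^*(t) \end{bmatrix} \tag{3.2.8}
$$
where E(t) = B(t)R^{-1}(t)B'(t). The general boundary condition
given by the relation (2.7.32) is reproduced here as
$$
\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f}\delta t_f
+ \left[\left(\frac{\partial S}{\partial x}\right)_* - \lambda^*(t)\right]'_{t_f}\delta x_f = 0 \tag{3.2.9}
$$
where S equals the entire terminal cost term in the cost
functional (3.2.2). Here, for our present system, tf is specified, which
makes δtf equal to zero in (3.2.9), and x(tf) is not specified, which
makes δxf arbitrary in (3.2.9). Hence, the coefficient of δxf in
(3.2.9) becomes zero, that is,
$$
\lambda^*(t_f) = \left(\frac{\partial S}{\partial x(t_f)}\right)_*
= \frac{\partial\left[\frac{1}{2}x'(t_f)F(t_f)x(t_f)\right]}{\partial x(t_f)}\Bigg|_*
= F(t_f)x^*(t_f). \tag{3.2.10}
$$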
This final condition on the costate λ*(tf), together with the initial
condition on the state x0 and the canonical system of equations
(3.2.8), forms a two-point boundary value problem (TPBVP). The
state-space representation of the set of relations for the state and
costate system (3.2.8) and the control (3.2.5) is shown in
Figure 3.1.
Figure 3.1 State and Costate System (block diagram with State and Costate blocks; labeled signals u*(t), x*(t), λ*(t); labeled gains A(t), B'(t), Q(t), R^{-1}(t))
• Step 4: Closed-Loop Optimal Control: The state-space
representation shown in Figure 3.1 prompts us to think that we can
obtain the optimal control u*(t) as a function (negative feedback)
of the optimal state x*(t). Now, to formulate a closed-loop
optimal control, that is, to obtain the optimal control u*(t), which is a
function of the costate λ*(t) as seen from (3.2.5), as a function of
the state x*(t), let us examine the final condition on λ*(t) given
by (3.2.10). This in fact relates the costate to the state
at the final time tf. Similarly, we may like to connect the costate
with the state for the complete interval of time [t0, tf]. Thus, let
us assume a transformation [113, 102]
$$
\lambda^*(t) = P(t)x^*(t) \tag{3.2.11}
$$
where P(t) is yet to be determined. Then, we can easily see that
with (3.2.11), the optimal control (3.2.5) becomes
$$
u^*(t) = -R^{-1}(t)B'(t)P(t)x^*(t) \tag{3.2.12}
$$
which is now a negative feedback of the state x*(t). Note that this
negative feedback resulted from our "theoretical development" or
"mathematics" of the optimal control procedure and was not introduced
intentionally [6].
Differentiating (3.2.11) w.r.t. time t, we get
$$
\dot{\lambda}^*(t) = \dot{P}(t)x^*(t) + P(t)\dot{x}^*(t). \tag{3.2.13}
$$
Using the transformation (3.2.11) in the control, state and costate
system of equations (3.2.5), (3.2.6) and (3.2.7), respectively, we
get
$$
\dot{x}^*(t) = A(t)x^*(t) - B(t)R^{-1}(t)B'(t)P(t)x^*(t), \tag{3.2.14}
$$
$$
\dot{\lambda}^*(t) = -Q(t)x^*(t) - A'(t)P(t)x^*(t). \tag{3.2.15}
$$
Now, substituting the state and costate relations (3.2.14) and (3.2.15)
in (3.2.13), we have
$$
\begin{aligned}
& -Q(t)x^*(t) - A'(t)P(t)x^*(t) = \dot{P}(t)x^*(t) + P(t)\left[A(t)x^*(t) - B(t)R^{-1}(t)B'(t)P(t)x^*(t)\right] \;\longrightarrow \\
& \left[\dot{P}(t) + P(t)A(t) + A'(t)P(t) + Q(t) - P(t)B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) = 0
\end{aligned}
\tag{3.2.16}
$$
Essentially, we eliminated the costate function λ*(t) from the
control (3.2.5), the state (3.2.6) and the costate (3.2.7) equations
by introducing the transformation (3.2.11).
• Step 5: Matrix Differential Riccati Equation: Now the relation
(3.2.16) should be satisfied for all t in [t0, tf] and for any choice of
the initial state x*(t0). Also, P(t) is not dependent on the initial
state. It follows that the equation (3.2.16) should hold for
any value of x*(t). This clearly means that the function P(t)
should satisfy the matrix differential equation
$$
\dot{P}(t) + P(t)A(t) + A'(t)P(t) + Q(t) - P(t)B(t)R^{-1}(t)B'(t)P(t) = 0. \tag{3.2.17}
$$
This is a matrix differential equation of the Riccati type, and is
often called the matrix differential Riccati equation (DRE). Also,
the transformation (3.2.11) is called the Riccati transformation,
P(t) is called the Riccati coefficient matrix or simply the Riccati
matrix or Riccati coefficient, and (3.2.12) is the optimal control
(feedback) law. The matrix DRE (3.2.17) can also be written in a
compact form as
$$
\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + P(t)E(t)P(t) \tag{3.2.18}
$$
where E(t) = B(t)R^{-1}(t)B'(t).
Comparing the boundary condition (3.2.10) and the Riccati
transformation (3.2.11), we have the final condition on P(t) as
$$
\lambda^*(t_f) = P(t_f)x^*(t_f) = F(t_f)x^*(t_f) \;\longrightarrow\; P(t_f) = F(t_f). \tag{3.2.19}
$$
Thus, the matrix DRE (3.2.17) or (3.2.18) is to be solved backward
in time using the final condition (3.2.19) to obtain the solution
P(t) for the entire interval [t0, tf].
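To make the structure of the DRE concrete, consider a scalar plant. The symbols a, b, q, r, f and p(t) below are illustrative assumptions (scalar stand-ins for A, B, Q, R, F and P) and are not part of the original development. Under these assumptions the DRE (3.2.18) and its final condition (3.2.19) reduce to a single nonlinear ordinary differential equation:

```latex
% Scalar plant \dot{x}(t) = a x(t) + b u(t) with weights q \ge 0, r > 0, f \ge 0 (assumed notation)
\dot{p}(t) = -2a\,p(t) - q + \frac{b^{2}}{r}\,p^{2}(t), \qquad p(t_f) = f,
\qquad\text{and}\qquad
u^{*}(t) = -\frac{b}{r}\,p(t)\,x^{*}(t).
```

The quadratic term in p(t) is what makes the equation nonlinear, and the condition being given at tf rather than t0 is what forces the backward-in-time solution.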
3.2.1 Symmetric Property of the Riccati Coefficient Matrix
Here, we first show an important property of the Riccati matrix P(t).
The fact that the nxn matrix P(t) is symmetric for all t in [t0, tf], i.e.,
P(t) = P'(t), can be easily shown as follows. First of all, let us note that
from the formulation of the problem itself, the matrices F(tf), Q(t),
and R(t) are symmetric and therefore the matrix B(t)R^{-1}(t)B'(t)
is also symmetric. Now, transposing both sides of the matrix DRE
(3.2.18), we notice that both P(t) and P'(t) are solutions of the same
differential equation and that both satisfy the same final condition
(3.2.19). Since the solution of this differential equation with a given
final condition is unique, it follows that P(t) = P'(t).
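3.2.2 Optimal Control
Is the optimal control u*(t) a minimum? This can be answered by
considering the second partials of the Hamiltonian (3.2.3). Let us recall
from Chapter 2 that this is done by examining the second variation of
the cost functional. Thus, the condition (2.7.41) (reproduced here for
convenience) for examining the nature of the optimal control is that the
matrix
$$
\Pi = \begin{bmatrix}
\dfrac{\partial^2 \mathcal{H}}{\partial x^2} & \dfrac{\partial^2 \mathcal{H}}{\partial x\,\partial u} \\[2mm]
\dfrac{\partial^2 \mathcal{H}}{\partial u\,\partial x} & \dfrac{\partial^2 \mathcal{H}}{\partial u^2}
\end{bmatrix}_* \tag{3.2.20}
$$
must be positive definite (negative definite) for minimum (maximum).
In most of the cases this reduces to the condition that
$$
\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_* \tag{3.2.21}
$$
must be positive definite (negative definite) for minimum (maximum).
Now using the Hamiltonian (3.2.3) and calculating the various partials,
$$
\frac{\partial^2 \mathcal{H}}{\partial x^2} = Q(t), \quad
\frac{\partial^2 \mathcal{H}}{\partial x\,\partial u} = 0, \quad
\frac{\partial^2 \mathcal{H}}{\partial u\,\partial x} = 0, \quad
\frac{\partial^2 \mathcal{H}}{\partial u^2} = R(t). \tag{3.2.22}
$$
Substituting the previous partials in the condition (3.2.20), we have
$$
\Pi = \begin{bmatrix} Q(t) & 0 \\ 0 & R(t) \end{bmatrix}. \tag{3.2.23}
$$
Since R(t) is positive definite and Q(t) is positive semidefinite, it
follows that the preceding matrix (3.2.23) is only positive semidefinite.
However, the condition that the second partial of H w.r.t. u*(t), which
is R(t), is positive definite is enough to guarantee that the control
u*(t) is a minimum.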
3.2.3 Optimal Performance Index
Here, we show how to obtain an expression for the optimal value of the
performance index.
THEOREM 3.1
The optimal value of the PI (3.2.2) is given by
$$
J^*(x^*(t), t) = \frac{1}{2}x^{*'}(t)P(t)x^*(t). \tag{3.2.24}
$$
Proof: First let us note that
$$
\int_{t_0}^{t_f} \frac{d}{dt}\left(\frac{1}{2}x^{*'}(t)P(t)x^*(t)\right)dt
= -\frac{1}{2}x^{*'}(t_0)P(t_0)x^*(t_0) + \frac{1}{2}x^{*'}(t_f)P(t_f)x^*(t_f). \tag{3.2.25}
$$
Substituting for (1/2)x*'(tf)P(tf)x*(tf) from (3.2.25) into the PI (3.2.2), and
noting that P(tf) = F(tf) from (3.2.19), we get
$$
\begin{aligned}
J^*(x^*(t_0), t_0) &= \frac{1}{2}x^{*'}(t_0)P(t_0)x^*(t_0) \\
&\quad + \frac{1}{2}\int_{t_0}^{t_f}\Big[x^{*'}(t)Q(t)x^*(t) + u^{*'}(t)R(t)u^*(t)
+ \frac{d}{dt}\big(x^{*'}(t)P(t)x^*(t)\big)\Big]dt \\
&= \frac{1}{2}x^{*'}(t_0)P(t_0)x^*(t_0) \\
&\quad + \frac{1}{2}\int_{t_0}^{t_f}\Big[x^{*'}(t)Q(t)x^*(t) + u^{*'}(t)R(t)u^*(t)
+ \dot{x}^{*'}(t)P(t)x^*(t) \\
&\qquad\qquad + x^{*'}(t)\dot{P}(t)x^*(t) + x^{*'}(t)P(t)\dot{x}^*(t)\Big]dt.
\end{aligned}
\tag{3.2.26}
$$
Now, using the optimal control (3.2.12) for u*(t) and the state equation
(3.2.14) for the derivative of x*(t), we get
$$
\begin{aligned}
J^*(x^*(t_0), t_0) &= \frac{1}{2}x^{*'}(t_0)P(t_0)x^*(t_0) \\
&\quad + \frac{1}{2}\int_{t_0}^{t_f} x^{*'}(t)\Big[Q(t) + A'(t)P(t) + P(t)A(t)
- P(t)B(t)R^{-1}(t)B'(t)P(t) + \dot{P}(t)\Big]x^*(t)\,dt.
\end{aligned}
\tag{3.2.27}
$$
Finally, using the matrix DRE (3.2.18) in the previous relation, the integral
part becomes zero. Thus,
$$
J^*(x(t_0), t_0) = \frac{1}{2}x^{*'}(t_0)P(t_0)x^*(t_0). \tag{3.2.28}
$$
Now, the previous relation is also valid for any x*(t). Thus,
$$
J^*(x^*(t), t) = \frac{1}{2}x^{*'}(t)P(t)x^*(t). \tag{3.2.29}
$$
In terms of the final time tf, the previous optimal cost becomes
$$
J^*(x^*(t_f), t_f) = \frac{1}{2}x^{*'}(t_f)F(t_f)x^*(t_f). \tag{3.2.30}
$$
Since we are normally given the initial state x(t0) and the Riccati coefficient
P(t) is solved for all time t, it is more convenient to use the relation (3.2.28).
3.2.4 Finite-Time Linear Quadratic Regulator:
Time-Varying Case: Summary
Given a linear, time-varying plant
$$
\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{3.2.31}
$$
and a quadratic performance index
$$
J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\big[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\big]dt \tag{3.2.32}
$$
where u(t) is not constrained, tf is specified, and x(tf) is not specified,
F(tf) and Q(t) are nxn symmetric, positive semidefinite matrices, and
R(t) is an rxr symmetric, positive definite matrix, the optimal control is
given by
$$
u^*(t) = -R^{-1}(t)B'(t)P(t)x^*(t) = -K(t)x^*(t) \tag{3.2.33}
$$
where K(t) = R^{-1}(t)B'(t)P(t) is called the Kalman gain and P(t), the nxn
symmetric, positive definite matrix (for all t in [t0, tf]), is the solution
of the matrix differential Riccati equation (DRE)
$$
\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + P(t)B(t)R^{-1}(t)B'(t)P(t) \tag{3.2.34}
$$
satisfying the final condition
$$
P(t = t_f) = F(t_f), \tag{3.2.35}
$$
the optimal state is the solution of
$$
\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) \tag{3.2.36}
$$
and the optimal cost is
$$
J^* = \frac{1}{2}x^{*'}(t)P(t)x^*(t). \tag{3.2.37}
$$
The optimal control u*(t), given by (3.2.33), is linear in the optimal
state x*(t). The entire procedure is now summarized in Table 3.1.

Table 3.1 Procedure Summary of Finite-Time Linear Quadratic Regulator
System: Time-Varying Case

A. Statement of the Problem
Given the plant as
    $\dot{x}(t) = A(t)x(t) + B(t)u(t)$,
the performance index as
    $J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt$,
and the boundary conditions as
    x(t0) = x0, tf is fixed, and x(tf) is free,
find the optimal control, state and performance index.

B. Solution of the Problem
Step 1  Solve the matrix differential Riccati equation
    $\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + P(t)B(t)R^{-1}(t)B'(t)P(t)$
    with final condition P(t = tf) = F(tf).
Step 2  Solve the optimal state x*(t) from
    $\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t)$
    with initial condition x(t0) = x0.
Step 3  Obtain the optimal control u*(t) as
    $u^*(t) = -K(t)x^*(t)$, where $K(t) = R^{-1}(t)B'(t)P(t)$.
Step 4  Obtain the optimal performance index from
    $J^* = \frac{1}{2}x^{*'}(t)P(t)x^*(t)$.
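The four solution steps of Table 3.1 can be sketched numerically for a scalar plant. The following Python sketch is only an illustration under stated assumptions: the plant and weight values (a, b, q, r, f), the horizon, the initial state, the function names, and the choice of a fixed-step classical Runge-Kutta integrator are all hypothetical and are not taken from the text or from the book's MATLAB files. It solves the DRE backward, runs the closed-loop system forward with the stored Riccati coefficient, and compares the cost accumulated along the trajectory with the predicted value (1/2) x0' P(t0) x0.

```python
# Illustrative sketch of Table 3.1 for a scalar plant x' = a x + b u.
# All names and numbers here are assumptions chosen for the demo.

def riccati_rhs(p, a, b, q, r):
    """Scalar form of the DRE (3.2.34): p' = -2 a p - q + (b^2 / r) p^2."""
    return -2.0 * a * p - q + (b * b / r) * p * p


def solve_dre_backward(a, b, q, r, f, t0, tf, n):
    """Step 1: integrate the DRE backward from p(tf) = f with classical RK4.

    Returns p on the uniform grid t0 + k * (tf - t0) / n, k = 0..n.
    """
    h = (tf - t0) / n
    p = [0.0] * (n + 1)
    p[n] = f
    for k in range(n, 0, -1):
        k1 = riccati_rhs(p[k], a, b, q, r)
        k2 = riccati_rhs(p[k] - 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(p[k] - 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(p[k] - h * k3, a, b, q, r)
        p[k - 1] = p[k] - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p


def simulate_closed_loop(a, b, q, r, f, p, x0, t0, tf):
    """Steps 2-3: run x' = (a - b^2 p / r) x forward with u = -(b p / r) x.

    p must hold 2*m + 1 samples so that RK4 midpoints fall on stored values.
    Returns the final state and the total cost along the trajectory.
    """
    m = (len(p) - 1) // 2
    h = (tf - t0) / m

    def deriv(x, pk):
        u = -(b * pk / r) * x                 # optimal control (3.2.33)
        return a * x + b * u, 0.5 * (q * x * x + r * u * u)

    x, cost = x0, 0.0
    for k in range(m):
        p0, pm, p1 = p[2 * k], p[2 * k + 1], p[2 * k + 2]
        dx1, dc1 = deriv(x, p0)
        dx2, dc2 = deriv(x + 0.5 * h * dx1, pm)
        dx3, dc3 = deriv(x + 0.5 * h * dx2, pm)
        dx4, dc4 = deriv(x + h * dx3, p1)
        x += (h / 6.0) * (dx1 + 2.0 * dx2 + 2.0 * dx3 + dx4)
        cost += (h / 6.0) * (dc1 + 2.0 * dc2 + 2.0 * dc3 + dc4)
    return x, cost + 0.5 * f * x * x          # add the terminal cost


a, b, q, r, f = 1.0, 1.0, 1.0, 1.0, 2.0       # assumed plant and weights
t0, tf, x0 = 0.0, 2.0, 1.0
p = solve_dre_backward(a, b, q, r, f, t0, tf, n=4000)
x_final, j_simulated = simulate_closed_loop(a, b, q, r, f, p, x0, t0, tf)
j_predicted = 0.5 * p[0] * x0 * x0            # Step 4: J* = (1/2) x0' P(t0) x0
print(p[0], j_simulated, j_predicted)
```

The Riccati coefficient is computed once, stored, and only then used by the forward simulation, which mirrors the off-line/on-line split of the procedure. The two printed cost values should agree to within the integration error, which is a useful sanity check on any implementation of the table.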
Note: It is simple to see that one can absorb the 1/2 that is associated
with J by redefining a performance measure as
$$
J_2 = 2J = x'(t_f)F(t_f)x(t_f) + \int_{t_0}^{t_f}\big[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\big]dt, \tag{3.2.38}
$$
and get the corresponding matrix differential Riccati equation for J2 as
$$
\frac{\dot{P}_2(t)}{2} = -\frac{P_2(t)}{2}A(t) - A'(t)\frac{P_2(t)}{2} - Q(t)
+ \frac{P_2(t)}{2}B(t)R^{-1}(t)B'(t)\frac{P_2(t)}{2} \tag{3.2.39}
$$
with final condition
$$
\frac{P_2(t_f)}{2} = F(t_f). \tag{3.2.40}
$$
Comparing the previous DRE for J2 with the corresponding DRE
(3.2.34) for J, we can easily see that P2(t) = 2P(t) and hence the
optimal control becomes
$$
\begin{aligned}
u^*(t) &= -R^{-1}(t)B'(t)\frac{P_2(t)}{2}x^*(t) = -\frac{K_2(t)}{2}x^*(t) \\
&= -R^{-1}(t)B'(t)P(t)x^*(t) = -K(t)x^*(t).
\end{aligned}
\tag{3.2.41}
$$
Thus, using J2 without the 1/2 in the performance index, we get the
same optimal control (3.2.41) for the original plant (3.2.31), the
only difference being that the Riccati coefficient matrix P2(t) is twice
that of P(t) and J2 is twice that of J (for example, see [3, 42]).
However, we will retain the 1/2 in J throughout the book due to the
obvious simplifications in obtaining the optimal control, state and costate
equations (3.2.4), (3.2.6) and (3.2.7), respectively. Precisely, the factor
1/2 in the PI (3.2.2) and hence in the Hamiltonian (3.2.3) gets eliminated
while taking partial derivatives of the Hamiltonian w.r.t. the control,
state and costate functions.
3.2.5 Salient Features
We next discuss the various salient features of the state regulator
system and the matrix differential Riccati equation.
1. Riccati Coefficient: The Riccati coefficient matrix P(t) is a
time-varying matrix which depends upon the system matrices A(t)
and B(t), the performance index (design) matrices Q(t), R(t)
and F(tf), and the terminal time tf, but P(t) does not depend
upon the initial state x(t0) of the system.
2. P(t) is symmetric and hence it follows that the nxn order matrix
DRE (3.2.18) represents a system of n(n + 1)/2 first-order,
nonlinear, time-varying, ordinary differential equations.
3. Optimal Control: From (3.2.21), we see that the optimal control
u*(t) is minimum (maximum) if the control weighted matrix R(t)
is positive definite (negative definite).
4. Optimal State: Using the optimal control (3.2.12) in the state
equation (3.2.1), we have
$$
\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) = G(t)x^*(t) \tag{3.2.42}
$$
where
$$
G(t) = A(t) - B(t)R^{-1}(t)B'(t)P(t). \tag{3.2.43}
$$
The solution of this state differential equation along with the
initial condition x(t0) gives the optimal state x*(t). Let us note
that there is no condition on the closed-loop matrix G(t) regarding
stability as long as we are considering the finite final time (tf)
system.
5. Optimal Cost: It is shown in (3.2.29) that the minimum cost J*
is given by
$$
J^* = \frac{1}{2}x^{*'}(t)P(t)x^*(t) \quad \text{for all } t \in [t_0, t_f] \tag{3.2.44}
$$
where P(t) is the solution of the matrix DRE (3.2.18), and x*(t)
is the solution of the closed-loop optimal system (3.2.42).
6. Definiteness of the Matrix P(t): Since F(tf) is positive
semidefinite, and P(tf) = F(tf), we can easily say that P(tf) is positive
semidefinite. We can argue that P(t) is positive definite for all
t in [t0, tf). Suppose that P(t) is not positive definite for some
t = ts < tf; then there exists a corresponding state x*(ts) such
that the cost function (1/2) x*'(ts)P(ts)x*(ts) <= 0, which clearly
violates the fact that the minimum cost has to be a positive quantity.
Hence, P(t) is positive definite for all t in [t0, tf). Since we
already know that P(t) is symmetric, we now have that P(t) is a
positive definite, symmetric matrix.
7. Computation of Matrix DRE: Under some conditions we can get an
analytical solution for the nonlinear matrix DRE as shown later.
But in general, we may try to solve the matrix DRE (3.2.18) by
integrating backward from its known final condition (3.2.19).
8. Independence of the Riccati Coefficient Matrix P(t): The matrix
P(t) is independent of the optimal state x*(t), so that once the
system and the cost are specified, that is, once we are given the
system/plant matrices A(t) and B(t), and the performance index
matrices F(tf), Q(t), and R(t), we can independently compute
the matrix P(t) before the optimal system operates in the forward
direction from its initial condition. Typically, we compute
(off-line) the matrix P(t) backward over the interval from tf to t0,
store its values separately, and feed these stored values when the
system is operating in the forward direction in the interval [t0, tf].
9. Implementation of the Optimal Control: The block diagram
implementing the closed-loop optimal controller (CLOC) is shown
in Figure 3.2. The figure shows clearly that the CLOC gets its
values of P(t) externally, after solving the matrix DRE backward
in time from t = tf to t = t0, and hence there is no way that
we can implement the closed-loop optimal control configuration
on-line.
It is to be noted that the optimal control u*(t) can be solved and
implemented in open-loop configuration by using the Pontryagin
procedure given in Chapter 2. In that case, the open-loop optimal
controller (OLOC) is quite cumbersome compared to the equivalent
closed-loop optimal controller, as will be illustrated later in
this chapter.
10. Linear Optimal Control: The optimal feedback control u*(t) given
by (3.2.12) is written as
$$
u^*(t) = -K(t)x^*(t) \tag{3.2.45}
$$
where the Kalman gain K(t) = R^{-1}(t)B'(t)P(t). Or alternatively,
we can write
$$
u^*(t) = -K_a'(t)x^*(t) \tag{3.2.46}
$$
where Ka(t) = P(t)B(t)R^{-1}(t). The previous optimal control is
linear in the state x*(t). This is one of the nice features of the optimal
control of linear systems with quadratic cost functionals. Also,
note that the negative feedback in the optimal control relation
(3.2.46) emerged from the theory of optimal control and was not
introduced intentionally in our development.
Figure 3.2 Closed-Loop Optimal Control Implementation (block diagram: the Plant with input gain B(t), integrator, and state x*(t); the Closed-Loop Optimal Controller forming u*(t) from R^{-1}(t)B'(t) and P(t)x*(t); and the Off-Line Simulation of P(t))
11. Controllability: Do we need the controllability condition on the
system for implementing the optimal feedback control? No, as
long as we are dealing with a finite-time (tf) system, because
the contribution of those uncontrollable states (which are also
unstable) to the cost function is still only a finite quantity.
However, if we consider an infinite time interval, we certainly need
the controllability condition, as we will see in the next section.
A historical note is very appropriate on the Riccati equation [22, 132].
The matrix Riccati equation has its origin in the scalar
version of the equation
$$
\dot{x}(t) = a x^2(t) + b x(t) + c \tag{3.2.47}
$$
with time-varying coefficients, proposed by Jacopo Francesco
Riccati around 1715. Riccati (1676-1754) gave the methods
of solution to the Riccati equation. However, the original
paper by Riccati was not published immediately because he
had the "suspicion" that the work was already known to
people such as the Bernoullis.
The importance of the Riccati equation, which has been
studied in the last two centuries by an extensive number
of scientists and engineers, need not be overstressed. The
matrix Riccati equation, which is a generalization in matrix
form of the original scalar equation, plays a very important
role in a range of control and systems theory areas such as
linear quadratic optimal control, stability, stochastic filtering
and control, synthesis of passive networks, differential
games and, more recently, H-infinity control and robust
stabilization and control.
Did Riccati ever imagine that his equation, proposed more
than a quarter millennium ago, would play such an important
and ubiquitous role in modern control engineering and
other related fields?
3.2.6 LQR System for General Performance Index
In this subsection, we address the state regulator system with a more
general performance index than given by (3.2.2). Consider a linear,
time-varying plant described by
$$
\dot{x}(t) = A(t)x(t) + B(t)u(t), \tag{3.2.48}
$$
with a cost functional
$$
\begin{aligned}
J(u) &= \frac{1}{2}x'(t_f)F(t_f)x(t_f)
+ \frac{1}{2}\int_{t_0}^{t_f}\big[x'(t)Q(t)x(t) + 2x'(t)S(t)u(t) + u'(t)R(t)u(t)\big]dt \\
&= \frac{1}{2}x'(t_f)F(t_f)x(t_f)
+ \frac{1}{2}\int_{t_0}^{t_f}
\begin{bmatrix} x'(t) & u'(t) \end{bmatrix}
\begin{bmatrix} Q(t) & S(t) \\ S'(t) & R(t) \end{bmatrix}
\begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt,
\end{aligned}
\tag{3.2.49}
$$
where the various vectors and matrices are defined in earlier sections
and S(t) is an nxr cross-weighting matrix, chosen such that the composite
weighting matrix in (3.2.49) is positive semidefinite.
Using the identical procedure as for the LQR system, we get the
matrix differential Riccati equation as
$$
\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t)
+ \left[P(t)B(t) + S(t)\right]R^{-1}(t)\left[B'(t)P(t) + S'(t)\right] \tag{3.2.50}
$$
with the final condition on P(t) as
$$
P(t_f) = F(t_f). \tag{3.2.51}
$$
The optimal control is then given by
$$
u(t) = -R^{-1}(t)\left[S'(t) + B'(t)P(t)\right]x(t). \tag{3.2.52}
$$
Obviously, when S(t) is made zero in the previous analysis, we get the
previous results shown in Table 3.1.
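The "identical procedure" can be written out in a few lines. The following is a sketch of the intermediate steps, obtained by repeating Steps 1 to 5 of Section 3.2 with the cross term included; the Hamiltonian below is an assumption in the sense that it is not displayed in the text, but it is the direct analogue of (3.2.3) for the cost functional (3.2.49).

```latex
\begin{aligned}
\mathcal{H} &= \tfrac{1}{2}x'Qx + x'Su + \tfrac{1}{2}u'Ru + \lambda'\,[Ax + Bu], \\
\frac{\partial \mathcal{H}}{\partial u} &= S'x + Ru + B'\lambda = 0
  \;\longrightarrow\; u = -R^{-1}\,[S'x + B'\lambda], \\
\dot{\lambda} &= -\frac{\partial \mathcal{H}}{\partial x} = -Qx - Su - A'\lambda, \\
\lambda = Px \;&\Longrightarrow\; u = -R^{-1}\,[S' + B'P]\,x, \qquad
\dot{P}x + P\,[Ax + Bu] = -Qx - Su - A'Px, \\
&\Longrightarrow\; \dot{P} = -PA - A'P - Q + [PB + S]\,R^{-1}\,[B'P + S'] .
\end{aligned}
```

The time argument is dropped for brevity. Setting S = 0 recovers (3.2.5), (3.2.7) and the DRE (3.2.18) term by term.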
3.3 Analytical Solution to the Matrix
Differential Riccati Equation
In this section, we explore an analytical solution for the matrix
differential Riccati equation (DRE). This material is based on [138, 89].
Let us rewrite the Hamiltonian system (3.2.8) of the state and costate
equations for the time-invariant case as (omitting * for the sake of
simplicity)
$$
\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} =
\begin{bmatrix} A & -E \\ -Q & -A' \end{bmatrix}
\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} \tag{3.3.1}
$$
where E = BR^{-1}B'. Let
$$
\Delta = \begin{bmatrix} A & -E \\ -Q & -A' \end{bmatrix}. \tag{3.3.2}
$$
Let us also recall that by the transformation λ(t) = P(t)x(t), we get
the differential matrix Riccati equation (3.2.18), rewritten (for
time-invariant matrices A, B, Q and R) as
$$
\dot{P}(t) = -P(t)A - A'P(t) - Q + P(t)BR^{-1}B'P(t), \tag{3.3.3}
$$
with the final condition
$$
P(t_f) = F. \tag{3.3.4}
$$
The solution P(t) can be obtained analytically (in contrast to
numerical integration) in terms of the eigenvalues and eigenvectors of the
Hamiltonian matrix Δ. In order to find an analytical solution to the
differential Riccati equation (3.3.3), it is necessary to show that if μ is an
eigenvalue of the Hamiltonian matrix Δ in (3.3.2), then -μ is also an
eigenvalue of Δ [89, 3]. For this, let us define
$$
\Gamma = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix} \tag{3.3.5}
$$
so that Γ^{-1} = -Γ. Then by a simple pre- and post-multiplication with
Γ we get
$$
\Delta = \Gamma\Delta'\Gamma = -\Gamma\Delta'\Gamma^{-1}. \tag{3.3.6}
$$
Now, if μ is an eigenvalue of Δ with corresponding eigenvector v,
$$
\Delta v = \mu v \tag{3.3.7}
$$
then
$$
\Gamma\Delta'\Gamma v = \mu v, \qquad \Delta'\Gamma v = -\mu\Gamma v \tag{3.3.8}
$$
where we used Γ^{-1} = -Γ. Rearranging,
$$
(\Gamma v)'\Delta = -\mu(\Gamma v)', \tag{3.3.9}
$$
so that -μ is an eigenvalue of Δ with left eigenvector (Γv)'.
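The eigenvalue pairing is easy to see in the scalar case. As an illustration only (the scalars a, e and q are assumed stand-ins for A, E and Q, and μ is the eigenvalue variable), the characteristic equation of the 2x2 Hamiltonian matrix is

```latex
\Delta = \begin{bmatrix} a & -e \\ -q & -a \end{bmatrix}, \qquad
\det(\mu I - \Delta) = (\mu - a)(\mu + a) - eq = \mu^{2} - (a^{2} + eq) = 0
\;\Longrightarrow\; \mu = \pm\sqrt{a^{2} + eq}.
```

The trace of Δ is zero, so the two eigenvalues must sum to zero, which is the scalar instance of the result just shown. With e > 0 and q > 0 they are real and nonzero, one in each half plane.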
Next, arranging the eigenvalues of Δ as
$$
D = \begin{bmatrix} -M & 0 \\ 0 & M \end{bmatrix} \tag{3.3.10}
$$
where M (-M) is a diagonal matrix with right-half-plane
(left-half-plane) eigenvalues. Let W, the modal matrix of eigenvectors
corresponding to D, be defined as
$$
W = \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix}, \tag{3.3.11}
$$
where [W11' W21']' are the n eigenvectors of the left-half-plane (stable)
eigenvalues of Δ. Also,
$$
W^{-1}\Delta W = D. \tag{3.3.12}
$$
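Let us now define a state transformation
$$
\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}
= W \begin{bmatrix} w(t) \\ z(t) \end{bmatrix}
= \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix}
\begin{bmatrix} w(t) \\ z(t) \end{bmatrix}. \tag{3.3.13}
$$
Then, using (3.3.12) and (3.3.13), the Hamiltonian system (3.3.1)
becomes
$$
\begin{bmatrix} \dot{w}(t) \\ \dot{z}(t) \end{bmatrix}
= W^{-1}\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix}
= W^{-1}\Delta\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}
= W^{-1}\Delta W\begin{bmatrix} w(t) \\ z(t) \end{bmatrix}
= D\begin{bmatrix} w(t) \\ z(t) \end{bmatrix}. \tag{3.3.14}
$$
Solving (3.3.14) in terms of the known final conditions, we have
$$
\begin{bmatrix} w(t) \\ z(t) \end{bmatrix}
= \begin{bmatrix} e^{-M(t - t_f)} & 0 \\ 0 & e^{M(t - t_f)} \end{bmatrix}
\begin{bmatrix} w(t_f) \\ z(t_f) \end{bmatrix}. \tag{3.3.15}
$$
Rewriting (3.3.15),
$$
\begin{bmatrix} w(t_f) \\ z(t) \end{bmatrix}
= \begin{bmatrix} e^{-M(t_f - t)} & 0 \\ 0 & e^{-M(t_f - t)} \end{bmatrix}
\begin{bmatrix} w(t) \\ z(t_f) \end{bmatrix}. \tag{3.3.16}
$$
Next, from (3.3.13) and using the final condition (3.3.4),
$$
\begin{aligned}
\lambda(t_f) &= W_{21}w(t_f) + W_{22}z(t_f) \\
&= Fx(t_f) \\
&= F\left[W_{11}w(t_f) + W_{12}z(t_f)\right].
\end{aligned}
\tag{3.3.17}
$$
Solving the previous relation for z(tf) in terms of w(tf),
$$
z(t_f) = T(t_f)w(t_f), \quad\text{where}\quad
T(t_f) = -\left[W_{22} - FW_{12}\right]^{-1}\left[W_{21} - FW_{11}\right]. \tag{3.3.18}
$$
Again, from (3.3.16),
$$
\begin{aligned}
z(t) &= e^{-M(t_f - t)}z(t_f) \\
&= e^{-M(t_f - t)}T(t_f)w(t_f) \\
&= e^{-M(t_f - t)}T(t_f)e^{-M(t_f - t)}w(t).
\end{aligned}
\tag{3.3.19}
$$
Rewriting the previous relation as
$$
z(t) = T(t)w(t), \quad\text{where}\quad
T(t) = e^{-M(t_f - t)}T(t_f)e^{-M(t_f - t)}. \tag{3.3.20}
$$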
Finally, to relate P(t) in (3.3.3) to the relation (3.3.20) for T(t), let us
use (3.3.13) to write
$$
\begin{aligned}
\lambda(t) &= W_{21}w(t) + W_{22}z(t) \\
&= P(t)x(t) \\
&= P(t)\left[W_{11}w(t) + W_{12}z(t)\right]
\end{aligned}
\tag{3.3.21}
$$
and by (3.3.20), the previous relation can be written as
$$
\left[W_{21} + W_{22}T(t)\right]w(t) = P(t)\left[W_{11} + W_{12}T(t)\right]w(t). \tag{3.3.22}
$$
Since the previous relation should hold for all x(t0) and hence for
all states w(t), it implies that the analytical expression for the solution
P(t) is given by
$$
P(t) = \left[W_{21} + W_{22}T(t)\right]\left[W_{11} + W_{12}T(t)\right]^{-1}. \tag{3.3.23}
$$
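The chain (3.3.10) to (3.3.23) can be checked on a scalar problem, where every block of W is a number and M is the positive eigenvalue. The Python sketch below is an illustration under assumptions, not the book's MATLAB implementation: the function names, the scalar parameters (a, b, q, r, f, tf) and the particular eigenvector scaling are all hypothetical choices, and the sketch assumes b is nonzero and q is positive so that the eigenvalues are real and distinct. It evaluates the analytical expression and compares it with a Runge-Kutta integration of (3.3.3) from the final condition (3.3.4).

```python
import math


def riccati_analytic(t, a, b, q, r, f, tf):
    """Scalar instance of (3.3.23) with E = b^2 / r."""
    e = b * b / r
    mu = math.sqrt(a * a + e * q)     # Hamiltonian eigenvalues are -mu and +mu
    # Columns of W: eigenvectors [e, a + mu]' (for -mu) and [e, a - mu]' (for +mu).
    w11, w12 = e, e
    w21, w22 = a + mu, a - mu
    t_final = -(w21 - f * w11) / (w22 - f * w12)          # (3.3.18)
    t_now = math.exp(-2.0 * mu * (tf - t)) * t_final      # (3.3.20) with M = mu
    return (w21 + w22 * t_now) / (w11 + w12 * t_now)      # (3.3.23)


def riccati_numeric(t, a, b, q, r, f, tf, n=2000):
    """Reference value: RK4 integration of (3.3.3) backward from p(tf) = f."""
    def rhs(p):
        return -2.0 * a * p - q + (b * b / r) * p * p

    h = (tf - t) / n
    p = f
    for _ in range(n):
        k1 = rhs(p)
        k2 = rhs(p - 0.5 * h * k1)
        k3 = rhs(p - 0.5 * h * k2)
        k4 = rhs(p - h * k3)
        p -= (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p


params = dict(a=1.0, b=1.0, q=1.0, r=1.0, f=2.0, tf=2.0)   # assumed values
for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(t, riccati_analytic(t, **params), riccati_numeric(t, **params))
```

At t = tf the expression returns f, and far from tf the factor T(t) decays so that P(t) approaches (a + mu)/e, the constant solution of the scalar equation. Any nonzero scaling of the eigenvectors gives the same P(t), since the scale factors cancel in (3.3.23).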
3.3.1 MATLAB© Implementation of Analytical
Solution to Matrix DRE
The solution of the matrix DRE (3.2.34) is not readily available with
MATLAB and hence a MATLAB-based program was developed for
solving the matrix DRE based on the analytical solution of the matrix
DRE [138] described earlier. The MATLAB solution is illustrated by
the following example.
Example 3.1
Let us illustrate the previous procedure with a simple second-order
example. Given a second-order system
$$
\begin{aligned}
\dot{x}_1(t) &= x_2(t), & x_1(0) &= 2 \\
\dot{x}_2(t) &= -2x_1(t) + x_2(t) + u(t), & x_2(0) &= -3,
\end{aligned}
\tag{3.3.24}
$$
and the performance index (PI)
$$
\begin{aligned}
J &= \frac{1}{2}\left[x_1^2(5) + x_1(5)x_2(5) + 2x_2^2(5)\right] \\
&\quad + \frac{1}{2}\int_0^5 \left[2x_1^2(t) + 6x_1(t)x_2(t) + 5x_2^2(t) + 0.25u^2(t)\right]dt,
\end{aligned}
\tag{3.3.25}
$$
obtain the feedback control law.
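Solution: Comparing the present plant (3.3.24) and the PI (3.3.25)
of the problem with the corresponding general formulations of the
plant (3.2.31) and the PI (3.2.32), respectively, let us first identify
the various quantities as
$$
A(t) = \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}; \quad
B(t) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad
F(t_f) = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 2 \end{bmatrix}; \quad
Q(t) = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}; \quad
R(t) = r(t) = \frac{1}{4}; \quad t_0 = 0; \quad t_f = 5.
$$
It is easy to check that the system (3.3.24) is unstable. Let P(t)
be the 2x2 symmetric matrix
$$
P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}. \tag{3.3.26}
$$
Then, the optimal control (3.2.33) is given by
$$
\begin{aligned}
u^*(t) &= -4\begin{bmatrix} 0 & 1 \end{bmatrix}
\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}
\begin{bmatrix} x_1^*(t) \\ x_2^*(t) \end{bmatrix} \\
&= -4\left[p_{12}(t)x_1^*(t) + p_{22}(t)x_2^*(t)\right]
\end{aligned}
\tag{3.3.27}
$$
where P(t), the 2x2 symmetric, positive definite matrix, is the
solution of the matrix DRE (3.2.34)
$$
\begin{aligned}
\begin{bmatrix} \dot{p}_{11}(t) & \dot{p}_{12}(t) \\ \dot{p}_{12}(t) & \dot{p}_{22}(t) \end{bmatrix}
&= -\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}
- \begin{bmatrix} 0 & -2 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \\
&\quad + \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix} 4 \begin{bmatrix} 0 & 1 \end{bmatrix}
\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}
- \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}
\end{aligned}
\tag{3.3.28}
$$
satisfying the final condition (3.2.35)
$$
\begin{bmatrix} p_{11}(5) & p_{12}(5) \\ p_{12}(5) & p_{22}(5) \end{bmatrix}
= \begin{bmatrix} 1 & 0.5 \\ 0.5 & 2 \end{bmatrix}. \tag{3.3.29}
$$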
Simplifying the matrix DRE (3.3.28), we get
$$
\begin{aligned}
\dot{p}_{11}(t) &= 4p_{12}^2(t) + 4p_{12}(t) - 2, & p_{11}(5) &= 1, \\
\dot{p}_{12}(t) &= -p_{11}(t) - p_{12}(t) + 2p_{22}(t) + 4p_{12}(t)p_{22}(t) - 3, & p_{12}(5) &= 0.5, \\
\dot{p}_{22}(t) &= -2p_{12}(t) - 2p_{22}(t) + 4p_{22}^2(t) - 5, & p_{22}(5) &= 2.
\end{aligned}
\tag{3.3.30}
$$
Solving the previous set of nonlinear differential equations
backward in time with the given final conditions, one can obtain the
numerical solutions for the Riccati coefficient matrix P(t).
However, here the solutions are obtained using the analytical solution
as given earlier in this section. The solutions for the Riccati
coefficients are plotted in Figure 3.3. Using these Riccati coefficients,
the closed-loop optimal control system is shown in Figure 3.4.
Using the optimal control u*(t) given by (3.3.27), the plant equations
(3.3.24) are solved forward in time to obtain the optimal states
x1*(t) and x2*(t) as shown in Figure 3.5 for the initial conditions
[2 -3]'. Finally, the optimal control u*(t) is shown in Figure 3.6.
The previous results are obtained using the Control System Toolbox
of MATLAB©, Version 6, as shown below.
The following MATLAB© m-file for Example 3.1 requires two
additional MATLAB© files: lqrnss.m, which itself requires lqrnssf.m,
both given in Appendix C.
****************************************************
%% Solution using Control System Toolbox of
%% the MATLAB, Version 6
%% The following file example.m requires
%% two other files lqrnss.m and lqrnssf.m
%% which are given in Appendix C
clear all
A=[0.,1.;-2.,1.];       % state matrix of the plant (3.3.24)
B=[0.;1.];              % control matrix
Q=[2.,3.;3.,5.];        % error weighted matrix
F=[1.,0.5;0.5,2.];      % terminal cost weighted matrix
R=[.25];                % control weighted matrix
tspan=[0 5];            % time interval [t0 tf]
x0=[2.,-3.];            % initial state
[x,u,K]=lqrnss(A,B,F,Q,R,x0,tspan);
******************************************************
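As a cross-check that does not depend on lqrnss.m, the three scalar equations (3.3.30) can also be integrated backward in time directly. The Python sketch below is an assumption-laden illustration rather than the book's method: the function names, the fixed-step classical Runge-Kutta scheme and the step count are hypothetical choices, and the first line of the right-hand side is the p11 equation obtained by expanding (3.3.28).

```python
def dre_rhs(p11, p12, p22):
    """Right-hand sides of the simplified DRE (3.3.30) for Example 3.1."""
    d11 = 4.0 * p12 * p12 + 4.0 * p12 - 2.0
    d12 = -p11 - p12 + 2.0 * p22 + 4.0 * p12 * p22 - 3.0
    d22 = -2.0 * p12 - 2.0 * p22 + 4.0 * p22 * p22 - 5.0
    return d11, d12, d22


def integrate_backward(p_final, t0, tf, n):
    """Classical RK4 from t = tf down to t = t0.

    Returns a list of (t, p11, p12, p22) tuples in increasing time.
    """
    h = (tf - t0) / n
    p = tuple(p_final)
    history = [(tf,) + p]
    for k in range(1, n + 1):
        k1 = dre_rhs(*p)
        k2 = dre_rhs(*(pi - 0.5 * h * ki for pi, ki in zip(p, k1)))
        k3 = dre_rhs(*(pi - 0.5 * h * ki for pi, ki in zip(p, k2)))
        k4 = dre_rhs(*(pi - h * ki for pi, ki in zip(p, k3)))
        p = tuple(pi - (h / 6.0) * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
                  for pi, s1, s2, s3, s4 in zip(p, k1, k2, k3, k4))
        history.append((tf - k * h,) + p)
    history.reverse()            # store in increasing time for forward use
    return history


# Final condition (3.3.29): p11(5) = 1, p12(5) = 0.5, p22(5) = 2.
history = integrate_backward((1.0, 0.5, 2.0), 0.0, 5.0, 5000)
t_start, p11, p12, p22 = history[0]
gain = (4.0 * p12, 4.0 * p22)    # K(t) = R^{-1} B' P(t) evaluated at t = 0
print(t_start, p11, p12, p22, gain)
```

The stored history plays the role of the off-line table of P(t) values that the closed-loop controller reads while the plant runs forward. Away from the final time the coefficients settle toward nearly constant values, consistent with the flat portions of the curves described for Figure 3.3.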
Figure 3.3 Riccati Coefficients for Example 3.1 (plot titled "Plot of Riccati Coefficients"; vertical axis: Riccati coefficients; horizontal axis running from 0 to 5)
3.4 InfiniteTime LQR System I
5
In this section, we let the terminal (final) time $t_f$ be infinite in the previous linear, time-varying, quadratic regulator system. This is then called the infinite-time (or infinite-horizon) linear quadratic regulator system [6, 3].
Consider a linear, time-varying plant
$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \tag{3.4.1}$$
and a quadratic performance index
$$J = \frac{1}{2}\int_{t_0}^{\infty} \left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right] dt, \tag{3.4.2}$$
where $u(t)$ is not constrained. Also, $Q(t)$ is an $n \times n$ symmetric, positive semidefinite matrix, and $R(t)$ is an $r \times r$ symmetric, positive definite matrix. Note that it makes no engineering sense to have a terminal cost term when the terminal time is infinite.
This problem cannot always be solved without some special conditions. For example, if any one of the states is uncontrollable and/or unstable, the corresponding performance measure J will become infinite and makes no physical sense. On the other hand, with the finite-time system, the performance measure is always finite. Thus, we need to impose the condition that the system (3.4.1) is completely controllable.
Figure 3.4 Closed-Loop Optimal Control System for Example 3.1
Using results similar to the previous case of finite final time $t_f$ (see Table 3.1), the optimal control for the infinite-horizon linear regulator system is obtained as
$$u^*(t) = -R^{-1}(t)B'(t)\hat{P}(t)x^*(t), \tag{3.4.3}$$
where
$$\hat{P}(t) = \lim_{t_f \to \infty} \{P(t)\}, \tag{3.4.4}$$
the $n \times n$ symmetric, positive definite matrix (for all $t \in [t_0, t_f]$), is the solution of the matrix differential Riccati equation (DRE)
$$\dot{\hat{P}}(t) = -\hat{P}(t)A(t) - A'(t)\hat{P}(t) - Q(t) + \hat{P}(t)B(t)R^{-1}(t)B'(t)\hat{P}(t), \tag{3.4.5}$$
satisfying the final condition
$$\hat{P}(t = t_f \to \infty) = 0. \tag{3.4.6}$$
[Plot of Optimal States]
Figure 3.5 Optimal States for Example 3.1
[Plot of Optimal Control]
Figure 3.6 Optimal Control for Example 3.1
The optimal cost is given by
$$J^* = \frac{1}{2}x^{*\prime}(t)\hat{P}(t)x^*(t). \tag{3.4.7}$$
The proofs of the previous results are found in optimal control texts specializing in quadratic methods [3]. Example 3.1 can be easily solved for $t_f \to \infty$ and $F = 0$.
3.4.1 Infinite-Time Linear Quadratic Regulator: Time-Varying Case: Summary
Consider a linear, time-varying plant
$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \tag{3.4.8}$$
and a quadratic performance index
$$J = \frac{1}{2}\int_{t_0}^{\infty} \left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right] dt, \tag{3.4.9}$$
where $u(t)$ is not constrained and $x(t_f)$, $t_f \to \infty$, is not specified. Also, $Q(t)$ is an $n \times n$ symmetric, positive semidefinite matrix, and $R(t)$ is an $r \times r$ symmetric, positive definite matrix. Then, the optimal control is given by
$$u^*(t) = -R^{-1}(t)B'(t)\hat{P}(t)x^*(t) \tag{3.4.10}$$
where $\hat{P}(t)$, the $n \times n$ symmetric, positive definite matrix (for all $t \in [t_0, t_f]$), is the solution of the matrix differential Riccati equation (DRE)
$$\dot{\hat{P}}(t) = -\hat{P}(t)A(t) - A'(t)\hat{P}(t) - Q(t) + \hat{P}(t)B(t)R^{-1}(t)B'(t)\hat{P}(t) \tag{3.4.11}$$
satisfying the final condition
$$\hat{P}(t = t_f \to \infty) = 0. \tag{3.4.12}$$
Table 3.2 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying Case
A. Statement of the Problem
Given the plant as
$$\dot{x}(t) = A(t)x(t) + B(t)u(t),$$
the performance index as
$$J = \frac{1}{2}\int_{t_0}^{\infty} \left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right] dt,$$
and the boundary conditions as
$$x(t_0) = x_0; \quad x(\infty) \text{ is free},$$
find the optimal control, state and performance index.
B. Solution of the Problem
Step 1 Solve the matrix differential Riccati equation (DRE)
$$\dot{\hat{P}}(t) = -\hat{P}(t)A(t) - A'(t)\hat{P}(t) - Q(t) + \hat{P}(t)B(t)R^{-1}(t)B'(t)\hat{P}(t)$$
with final condition $\hat{P}(t = t_f \to \infty) = 0$.
Step 2 Solve the optimal state $x^*(t)$ from
$$\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)\hat{P}(t)\right] x^*(t)$$
with initial condition $x(t_0) = x_0$.
Step 3 Obtain the optimal control $u^*(t)$ from
$$u^*(t) = -R^{-1}(t)B'(t)\hat{P}(t)x^*(t).$$
Step 4 Obtain the optimal performance index from
$$J^* = \frac{1}{2}x^{*\prime}(t)\hat{P}(t)x^*(t).$$
The optimal state is the solution of
$$\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)\hat{P}(t)\right] x^*(t) \tag{3.4.13}$$
and the optimal cost is
$$J^* = \frac{1}{2}x^{*\prime}(t)\hat{P}(t)x^*(t). \tag{3.4.14}$$
The optimal control $u^*(t)$ given by (3.4.10) is linear in the optimal state $x^*(t)$. The entire procedure is now summarized in Table 3.2.
3.5 Infinite-Time LQR System II
In this section, we examine the state regulator system with infinite time interval for a linear time-invariant (LTI) system. Let us consider the plant as
$$\dot{x}(t) = Ax(t) + Bu(t) \tag{3.5.1}$$
and the cost functional as
$$J = \frac{1}{2}\int_{0}^{\infty} \left[x'(t)Qx(t) + u'(t)Ru(t)\right] dt \tag{3.5.2}$$
where $x(t)$ is the $n$th-order state vector; $u(t)$ is the $r$th-order control vector; $A$ is the $n \times n$ state matrix; $B$ is the $n \times r$ control matrix; $Q$ is an $n \times n$ symmetric, positive semidefinite matrix; and $R$ is an $r \times r$ symmetric, positive definite matrix. First of all, let us discuss some of the implications of the time invariance and the infinite final time.
1. The infinite time interval case is considered for the following reasons:
(a) We wish to make sure that the state regulator stays near the zero state after the initial transient.
(b) We want to include any special case of large final time.
2. With an infinite final-time interval, including a final cost function does not make any practical sense. Hence, the final cost term involving $F(t_f)$ does not exist in the cost functional (3.5.2).
3. With an infinite final-time interval, the system (3.5.1) has to be completely controllable. Let us recall that this controllability condition of the plant (3.5.1) requires that the controllability matrix (see Appendix B)
$$\left[B \quad AB \quad A^2B \quad \cdots \quad A^{n-1}B\right] \tag{3.5.3}$$
must be nonsingular or contain $n$ linearly independent column vectors. The controllability requirement guarantees that the optimal cost is finite. On the other hand, if the system is not controllable and some or all of the uncontrollable states are unstable, then the cost functional would be infinite since the control interval is infinite. In such situations, we cannot distinguish the optimal control from the other controls. Alternatively, we can assume that the system (3.5.1) is completely stabilizable.
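The controllability test in item 3 is easy to carry out by hand for a second-order, single-input plant. The sketch below is an illustration only, written in plain Python: it uses the A and B matrices of Example 3.1, and the helper names and the tolerance are assumptions rather than anything defined in the text.

```python
def controllability_matrix_2x1(A, B):
    """Return [B  AB] for a 2-state, single-input pair (A, B)."""
    AB = [
        A[0][0] * B[0] + A[0][1] * B[1],
        A[1][0] * B[0] + A[1][1] * B[1],
    ]
    return [[B[0], AB[0]], [B[1], AB[1]]]


def det2(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# Plant matrices of Example 3.1 (assumed here for illustration).
A = [[0.0, 1.0], [-2.0, 1.0]]
B = [0.0, 1.0]

Qc = controllability_matrix_2x1(A, B)
# A nonzero determinant means the two columns are linearly independent.
is_controllable = abs(det2(Qc)) > 1e-12
print("[B AB] =", Qc, " controllable:", is_controllable)
```

For higher-order or multi-input plants the same idea applies, but the rank of the full matrix in (3.5.3) has to be checked instead of a single 2x2 determinant.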
As before in the case of a finite final-time interval, we can proceed and obtain the closed-loop optimal control and the associated Riccati equation. Still, $P(t)$ must be the solution of the matrix differential Riccati equation (3.2.34) with boundary condition $P(t_f) = 0$. It was shown [70] that, under the assumption of controllability and with this zero boundary condition,
$$\lim_{t_f \to \infty} \{P(t)\} = \bar{P}, \tag{3.5.4}$$
where $\bar{P}$ is the $n \times n$ positive definite, symmetric, constant matrix. If $\bar{P}$ is constant, then $\bar{P}$ is the solution of the nonlinear, matrix, algebraic Riccati equation (ARE),
$$\frac{d\bar{P}}{dt} = 0 = -\bar{P}A - A'\bar{P} + \bar{P}BR^{-1}B'\bar{P} - Q. \tag{3.5.5}$$
Alternatively, we can write (3.5.5) as
$$\bar{P}A + A'\bar{P} + Q - \bar{P}BR^{-1}B'\bar{P} = 0. \tag{3.5.6}$$
Note that for a time-varying system with a finite-time interval, we have the differential Riccati equation (3.2.34), whereas for a linear time-invariant system with an infinite-time horizon, we have the algebraic Riccati equation (3.5.6).
A historical note on R.E. Kalman is appropriate (from SIAM News, 6/94, article about R.E. Kalman).
Rudolf E. Kalman is best known for the linear filtering technique that he and Richard Bucy [31] developed in 1960-1961 to strip unwanted noise out of a stream of data [71, 74, 76]. The Kalman filter, which is based on the use of state-space techniques and recursive algorithms, revolutionized the field of estimation. The Kalman filter is widely used in navigational and guidance systems, radar tracking, sonar ranging, and satellite orbit determination (for the Ranger, Apollo, and Mariner missions, for instance), as well as in fields as diverse as seismic data processing, nuclear power plant instrumentation, and econometrics. Among Kalman's many outstanding contributions were the formulation and study of most fundamental state-space notions [72, 73, 77], including controllability, observability, minimality, realizability from input and output data, matrix Riccati equations, linear-quadratic control [70, 75, 75], and the separation principle, that are today ubiquitous in control. While some of these concepts were also encountered in other contexts, such as optimal control theory, it was Kalman who recognized the central role that they play in systems analysis.
Born in Hungary, Kalman received BS and MS degrees from the Massachusetts Institute of Technology (MIT) and a DSci in engineering from Columbia University in 1957. In the early years of his career he held research positions at International Business Machines (IBM) and at the Research Institute for Advanced Studies (RIAS) in Baltimore. From 1962 to 1971, he was at Stanford University. In 1971, he became a graduate research professor and director of the Center for Mathematical System Theory at the University of Florida, Gainesville, USA, and later retired with emeritus status. Kalman's contributions to control theory and to applied mathematics and engineering in general have been widely recognized with several honors and awards.
3.5.1 Meaningful Interpretation of Riccati Coefficient
Consider the matrix differential Riccati equation (3.2.34) with final condition $P(t_f) = 0$. Now consider a simple time transformation $\tau = t_f - t$. Then, in the $\tau$ scale we can think of the final time $t_f$ as the "starting time," $P(t_f)$ as the "initial condition," and $\bar{P}$ as the "steady-state solution" of the matrix DRE. As $t_f \to \infty$, the "transient solution" is pushed to near $t_f$, which is at infinity. Then, for most of the practical time interval, the matrix $P(t)$ becomes a steady state, i.e., a constant matrix $\bar{P}$, as shown in Figure 3.7 [6].
Figure 3.7 Interpretation of the Constant Matrix $\bar{P}$
Then the optimal control is given by
$$u^*(t) = -R^{-1}B'\bar{P}x^*(t) = -\bar{K}x^*(t), \tag{3.5.7}$$
where $\bar{K} = R^{-1}B'\bar{P}$ is called the Kalman gain. Alternatively, we can write
$$u^*(t) = -\bar{K}_a' x^*(t), \tag{3.5.8}$$
where $\bar{K}_a = \bar{P}BR^{-1}$. The optimal state is the solution of the system obtained by using the control (3.5.8) in the plant (3.5.1),
$$\dot{x}^*(t) = \left[A - BR^{-1}B'\bar{P}\right] x^*(t) = Gx^*(t), \tag{3.5.9}$$
where the matrix $G = A - BR^{-1}B'\bar{P}$ must have stable eigenvalues so that the closed-loop optimal system (3.5.9) is stable. This is required since any unstable states with an infinite time interval would lead to an infinite cost functional $J^*$. Let us note that we have no constraint on the stability of the original system (3.5.1). This means that although the original system may be unstable, the optimal system must definitely be stable.
Finally, the minimum cost (3.2.29) is given by
$$J^* = \frac{1}{2}x^{*\prime}(t)\bar{P}x^*(t). \tag{3.5.10}$$
3.5.2 Analytical Solution of the Algebraic Riccati Equation
The next step is to find the analytical expression for the steady-state (limiting) solution of the differential Riccati equation (3.3.3). Thus, we are interested in finding the analytical solution to the algebraic Riccati equation (3.5.5). Obviously, one can let the terminal time $t_f$ tend to $\infty$ in the solution (3.3.23) for $P(t)$. As $t_f \to \infty$, $e^{-M(t_f - t)}$ goes to zero, which in turn makes $T(t)$ tend to zero. Thus, under the usual conditions of $(A, B)$ being stabilizable and $(A, \sqrt{Q})$ being reachable, and $T = 0$, we have from (3.3.23)
$$\lim_{t_f \to \infty} P(t, t_f) = \bar{P} = W_{21}W_{11}^{-1}. \tag{3.5.11}$$
Thus, the solution to the ARE is constructed by using the stable eigenvectors of the Hamiltonian matrix. For further treatment of this topic, consult [3] and the references therein.
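For a first-order plant the construction in (3.5.11) can be carried out in closed form, which makes the role of the stable eigenvector easy to see. The sketch below is illustrative only. It assumes the scalar Hamiltonian matrix has the form [[a, -b^2/r], [-q, -a]], acting on the state and costate, and it borrows the data a = -3, b = 1, q = 2, r = 2 from Example 3.3 later in this chapter; the function and variable names are hypothetical.

```python
import math


def scalar_are_from_hamiltonian(a, b, q, r):
    """Steady-state Riccati coefficient from the stable eigenvector.

    Assumed Hamiltonian matrix: [[a, -e], [-q, -a]] with e = b*b/r.
    Its eigenvalues are +mu and -mu with mu = sqrt(a*a + e*q).
    """
    e = b * b / r
    mu = math.sqrt(a * a + e * q)
    # Eigenvector for the stable eigenvalue -mu: (a + mu) * w11 - e * w21 = 0.
    w11 = e
    w21 = a + mu
    return w21 / w11, mu


p_bar, mu = scalar_are_from_hamiltonian(a=-3.0, b=1.0, q=2.0, r=2.0)
# Residual of the scalar form of the ARE (3.5.6): 2*a*p + q - b^2 * p^2 / r.
residual = 2.0 * (-3.0) * p_bar + 2.0 - (1.0 * 1.0) * p_bar * p_bar / 2.0
print(f"p_bar = {p_bar:.6f}, stable eigenvalue = {-mu:.6f}, residual = {residual:.2e}")
```

The ratio w21/w11 is the scalar counterpart of the product in (3.5.11). For matrix problems the same steps require a full eigendecomposition of the 2n x 2n Hamiltonian matrix.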
3.5.3 Infinite-Interval Regulator System: Time-Invariant Case: Summary
For a controllable, linear, time-invariant plant
$$\dot{x}(t) = Ax(t) + Bu(t), \tag{3.5.12}$$
and the infinite-interval cost functional
$$J = \frac{1}{2}\int_{0}^{\infty} \left[x'(t)Qx(t) + u'(t)Ru(t)\right] dt, \tag{3.5.13}$$
the optimal control is given by
$$u^*(t) = -R^{-1}B'\bar{P}x^*(t) \tag{3.5.14}$$
where $\bar{P}$, the $n \times n$ constant, positive definite, symmetric matrix, is the solution of the nonlinear, matrix algebraic Riccati equation (ARE)
$$-\bar{P}A - A'\bar{P} + \bar{P}BR^{-1}B'\bar{P} - Q = 0, \tag{3.5.15}$$
the optimal trajectory is the solution of
$$\dot{x}^*(t) = \left[A - BR^{-1}B'\bar{P}\right] x^*(t), \tag{3.5.16}$$
and the optimal cost is given by
$$J^* = \frac{1}{2}x^{*\prime}(t)\bar{P}x^*(t). \tag{3.5.17}$$
The entire procedure is now summarized in Table 3.3 and the implementation of the closed-loop optimal control (CLOC) is shown in Figure 3.8.
Figure 3.8 Implementation of the Closed-Loop Optimal Control: Infinite Final Time

Table 3.3 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant Case
A. Statement of the Problem
Given the plant as
$$\dot{x}(t) = Ax(t) + Bu(t),$$
the performance index as
$$J = \frac{1}{2}\int_{0}^{\infty} \left[x'(t)Qx(t) + u'(t)Ru(t)\right] dt,$$
and the boundary conditions as
$$x(t_0) = x_0; \quad x(\infty) = 0,$$
find the optimal control, state and index.
B. Solution of the Problem
Step 1 Solve the matrix algebraic Riccati equation (ARE)
$$-\bar{P}A - A'\bar{P} - Q + \bar{P}BR^{-1}B'\bar{P} = 0.$$
Step 2 Solve the optimal state $x^*(t)$ from
$$\dot{x}^*(t) = \left[A - BR^{-1}B'\bar{P}\right] x^*(t)$$
with initial condition $x(t_0) = x_0$.
Step 3 Obtain the optimal control $u^*(t)$ from
$$u^*(t) = -R^{-1}B'\bar{P}x^*(t).$$
Step 4 Obtain the optimal performance index from
$$J^* = \frac{1}{2}x^{*\prime}(t)\bar{P}x^*(t).$$

Next, an example is given to illustrate the infinite-interval regulator system and the associated matrix algebraic Riccati equation. Let us reconsider the same Example 3.1 with final time $t_f \to \infty$ and $F = 0$.
Example 3.2
Given a second-order plant
$$
\begin{aligned}
\dot{x}_1(t) &= x_2(t), & x_1(0) &= 2 \\
\dot{x}_2(t) &= -2x_1(t) + x_2(t) + u(t), & x_2(0) &= -3
\end{aligned} \tag{3.5.18}
$$
and the performance index
$$J = \frac{1}{2}\int_{0}^{\infty} \left[2x_1^2(t) + 6x_1(t)x_2(t) + 5x_2^2(t) + 0.25u^2(t)\right] dt, \tag{3.5.19}$$
obtain the feedback optimal control law.
Solution: Comparing the plant (3.5.18) and PI (3.5.19) of the present system with the corresponding general formulation of plant (3.5.12) and PI (3.5.13), respectively, let us first identify the various matrices as
$$A = \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad Q = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}; \quad R = r = \frac{1}{4}; \quad t_0 = 0; \quad t_f \to \infty. \tag{3.5.20}$$
Let $\bar{P}$ be the $2 \times 2$ symmetric matrix
$$\bar{P} = \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix}. \tag{3.5.21}$$
Then, the optimal control (3.5.14) is given by
$$u^*(t) = -4 \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix} \begin{bmatrix} x_1^*(t) \\ x_2^*(t) \end{bmatrix}, \tag{3.5.22}$$
$$\phantom{u^*(t)} = -4\left[\bar{p}_{12}x_1^*(t) + \bar{p}_{22}x_2^*(t)\right], \tag{3.5.23}$$
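where $\bar{P}$, the $2 \times 2$ symmetric, positive definite matrix, is the solution of the matrix algebraic Riccati equation (3.5.15)
$$
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} =
-\begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}
-\begin{bmatrix} 0 & -2 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix}
+\begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix} 4 \begin{bmatrix} 0 & 1 \end{bmatrix}
\begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix}
-\begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}. \tag{3.5.24}
$$
Simplifying the equation (3.5.24), we get
$$
\begin{aligned}
4\bar{p}_{12}^2 + 4\bar{p}_{12} - 2 &= 0 \\
-\bar{p}_{11} - \bar{p}_{12} + 2\bar{p}_{22} + 4\bar{p}_{12}\bar{p}_{22} - 3 &= 0 \\
-2\bar{p}_{12} - 2\bar{p}_{22} + 4\bar{p}_{22}^2 - 5 &= 0.
\end{aligned} \tag{3.5.25}
$$
Solving the previous equations for a positive definite $\bar{P}$ is easy in this particular case. Solve the first equation in (3.5.25) for $\bar{p}_{12}$; using this value of $\bar{p}_{12}$ in the third equation, solve for $\bar{p}_{22}$; and finally, using the values of $\bar{p}_{12}$ and $\bar{p}_{22}$ in the second equation, solve for $\bar{p}_{11}$. In general, we have to solve the nonlinear algebraic equations and pick the positive definite solution for $\bar{P}$. Hence, we get
$$\bar{P} = \begin{bmatrix} 1.7363 & 0.3660 \\ 0.3660 & 1.4729 \end{bmatrix}. \tag{3.5.26}$$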
Using these Riccati coefficients (gains), the closed-loop optimal control (3.5.23) is given by
$$
\begin{aligned}
u^*(t) &= -4\left[0.366x_1^*(t) + 1.4729x_2^*(t)\right] \\
&= -\left[1.464x_1^*(t) + 5.8916x_2^*(t)\right].
\end{aligned} \tag{3.5.27}
$$
Using the closed-loop optimal control $u^*(t)$ from (3.5.27) in the original open-loop system (3.5.18), the closed-loop optimal system becomes
$$
\begin{aligned}
\dot{x}_1^*(t) &= x_2^*(t) \\
\dot{x}_2^*(t) &= -2x_1^*(t) + x_2^*(t) - 4\left[0.366x_1^*(t) + 1.4729x_2^*(t)\right]
\end{aligned} \tag{3.5.28}
$$
and the implementation of the closed-loop optimal control is shown in Figure 3.9.
Using the initial conditions and the Riccati coefficient matrix (3.5.26), the optimal cost (3.5.17) is obtained as
$$
J^* = \frac{1}{2}x'(0)\bar{P}x(0) = \frac{1}{2}\begin{bmatrix} 2 & -3 \end{bmatrix}
\begin{bmatrix} 1.7363 & 0.3660 \\ 0.3660 & 1.4729 \end{bmatrix}
\begin{bmatrix} 2 \\ -3 \end{bmatrix} = 7.9047. \tag{3.5.29}
$$
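The hand calculation of Example 3.2 can be reproduced with a few lines of plain Python. The sketch below follows the order described in the solution (first, third, then second equation of (3.5.25)), tries every real root, and keeps only the positive definite candidate. The helper names are assumptions made for illustration, and the closed-loop eigenvalues are taken from the characteristic polynomial of A - BK for this particular plant.

```python
import math


def real_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (empty list if there are none)."""
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(-b + s) / (2.0 * a), (-b - s) / (2.0 * a)]


def solve_example_are():
    """Positive definite solutions (p11, p12, p22) of the set (3.5.25)."""
    solutions = []
    for p12 in real_roots(4.0, 4.0, -2.0):                   # first equation
        for p22 in real_roots(4.0, -2.0, -(2.0 * p12 + 5.0)):  # third equation
            p11 = -p12 + 2.0 * p22 + 4.0 * p12 * p22 - 3.0  # second equation
            # Sylvester test for a symmetric 2x2 matrix.
            if p11 > 0.0 and p11 * p22 - p12 * p12 > 0.0:
                solutions.append((p11, p12, p22))
    return solutions


solutions = solve_example_are()
p11, p12, p22 = solutions[0]

# Feedback gain K = R^{-1} B' P with R = 1/4 and B = [0 1]'.
K = (4.0 * p12, 4.0 * p22)

# A - B*K = [[0, 1], [-2 - K1, 1 - K2]], so its characteristic polynomial is
# s^2 + (K2 - 1)*s + (2 + K1).
closed_loop_eigs = real_roots(1.0, K[1] - 1.0, 2.0 + K[0])

# Optimal cost J* = x0' P x0 / 2 with x0 = [2 -3]'.
x1_0, x2_0 = 2.0, -3.0
J_star = 0.5 * (p11 * x1_0**2 + 2.0 * p12 * x1_0 * x2_0 + p22 * x2_0**2)

print("P =", [[round(p11, 4), round(p12, 4)], [round(p12, 4), round(p22, 4)]])
print("K =", [round(k, 4) for k in K])
print("closed-loop eigenvalues =", [round(s, 4) for s in closed_loop_eigs])
print("J* =", round(J_star, 4))
```

Enumerating all root combinations mirrors the remark that, in general, the nonlinear equations have several solutions and only the positive definite one is of interest.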
Figure 3.9 Closed-Loop Optimal Control System
The previous results can also be easily obtained using the Control System Toolbox of MATLAB©, Version 6, as shown below.
*********************************************
%% Solution using Control System Toolbox in
%% The MATLAB, Version 6
%% For Example 3.2
%%
x10=2; %% initial condition on state x1
x20=-3; %% initial condition on state x2
X0=[x10;x20];
A=[0 1;-2 1]; %% system matrix A
B=[0;1]; %% system matrix B
Q=[2 3;3 5]; %% performance index weighted matrix
R=[0.25]; %% performance index weighted matrix
[K,P,EV]=lqr(A,B,Q,R) %% K = feedback matrix;
%% P = Riccati matrix;
%% EV = eigenvalues of closed loop system A - B*K
K =
    1.4641    5.8916
P =
    1.7363    0.3660
    0.3660    1.4729
EV =
   -4.0326
   -0.8590
BIN=[0;0]; % dummy BIN for "initial" command
C=[1 1];
D=[1];
tfinal=10;
t=0:0.05:10;
[Y,X,t]=initial(A-B*K,BIN,C,D,X0,tfinal);
x1t=[1 0]*X'; %% extracting x1 from vector X
x2t=[0 1]*X'; %% extracting x2 from vector X
ut=-K*X';
plot(t,x1t,'k',t,x2t,'k')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
plot(t,ut,'k')
xlabel('t')
gtext('u(t)')
**************************************************
Using the optimal control u*(t) given by (3.5.23), the plant equations (3.5.18) are solved using MATLAB© to obtain the optimal states x1*(t) and x2*(t) and the optimal control u*(t), as shown in Figure 3.10 and Figure 3.11. Note that
1. the values of $\bar{P}$ obtained in the example are exactly the steady-state values of Example 3.1, and
2. the original plant (3.5.18) is unstable (the eigenvalues of A are 0.5 ± j1.3229), whereas the optimal closed-loop system (3.5.28) is stable (eigenvalues at -4.0326, -0.8590).
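One way to check the simulation results independently is to integrate the closed-loop system (3.5.28) together with the running cost and compare the accumulated value with the optimal cost (3.5.29). The sketch below does this in plain Python with a fixed-step Runge-Kutta scheme. The step size, the horizon of 10 time units (the tfinal of the listing above), and the function names are assumptions made for illustration, and the gains are the rounded values printed by the listing.

```python
def closed_loop_rhs(state, K=(1.4641, 5.8916)):
    """Closed-loop plant (3.5.28) augmented with the running cost."""
    x1, x2, _cost = state
    u = -(K[0] * x1 + K[1] * x2)
    # Integrand of (3.5.19): (x'Qx + R*u^2)/2 with Q = [2 3; 3 5], R = 1/4.
    running = 0.5 * (2.0 * x1 * x1 + 6.0 * x1 * x2 + 5.0 * x2 * x2 + 0.25 * u * u)
    return (x2, -2.0 * x1 + x2 + u, running)


def rk4(f, state, h, steps):
    """Classical fourth-order Runge-Kutta integration with fixed step h."""
    for _ in range(steps):
        k1 = f(state)
        k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


# Initial state [2 -3]' and zero accumulated cost; integrate over [0, 10].
x1_end, x2_end, cost = rk4(closed_loop_rhs, (2.0, -3.0, 0.0), h=0.001, steps=10000)
print(f"x1(10) = {x1_end:.2e}, x2(10) = {x2_end:.2e}, accumulated cost = {cost:.4f}")
```

Since both closed-loop eigenvalues are negative, the states decay toward zero and the cost accumulated over this horizon should be close to the infinite-horizon value in (3.5.29).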
Figure 3.10 Optimal States for Example 3.2
3.5.4 Stability Issues of Time-Invariant Regulator
Let us consider the previous result for a linear time-invariant system with infinite-time horizon from relations (3.5.12) to (3.5.17) and Table 3.3. We briefly make some remarks on the stability of the infinite-time regulator system [3, 89].
1. The closed-loop optimal system (3.5.16) is not always stable, especially when the original plant is unstable and these unstable states are not weighted in the PI (3.5.13). To prevent such a situation, we need the assumption that the pair [A, C] is detectable, where C is any matrix such that C'C = Q, which guarantees the stability of the closed-loop optimal system. This assumption essentially ensures that all the potentially unstable states will show up in the x'(t)Qx(t) part of the performance measure.
2. The Riccati coefficient matrix $\bar{P}$ is positive definite if and only if [A, C] is completely observable.
3. The detectability condition is necessary for stability of the closed-loop optimal system.
4. Thus both detectability and stabilizability conditions are necessary for the existence of a stable closed-loop system.
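A scalar case makes the first remark concrete. The worked example below is not taken from the text: it assumes a first-order unstable plant with a = 1, b = 1, a control weight r > 0, and no state weighting (q = 0, so C = 0 and the pair [A, C] is not detectable), and then contrasts it with the weighted case q > 0.

```latex
\begin{aligned}
&\text{Plant and PI: } \dot{x}(t) = x(t) + u(t), \qquad
  J = \tfrac{1}{2}\int_0^\infty r\,u^2(t)\,dt . \\
&\text{DRE: } \dot{p}(t) = -2p(t) + \tfrac{1}{r}\,p^2(t), \quad p(t_f) = 0
  \;\Longrightarrow\; p(t) \equiv 0 \;\Longrightarrow\; \bar{p} = 0 . \\
&\text{Control and closed loop: } u^*(t) = -r^{-1}\bar{p}\,x^*(t) = 0, \qquad
  \dot{x}^*(t) = x^*(t) \quad \text{(pole at } +1\text{, unstable)} . \\
&\text{With } q \text{ positive: } \tfrac{1}{r}\bar{p}^2 - 2\bar{p} - q = 0
  \;\Longrightarrow\; \bar{p} = r\left(1 + \sqrt{1 + q/r}\right), \qquad
  a - \tfrac{1}{r}\bar{p} = -\sqrt{1 + q/r} \quad \text{(stable)} .
\end{aligned}
```

In the unweighted case the cost is minimized by doing nothing, so the unstable state never shows up in the performance measure. Any positive weight q restores detectability and yields a stable closed loop.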
Figure 3.11 Optimal Control for Example 3.2
3.5.5 Equivalence of Open-Loop and Closed-Loop Optimal Controls
Next, we present a simple example to show an interesting property: an optimal control system can be solved and implemented as an open-loop optimal control (OLOC) configuration or a closed-loop optimal control (CLOC) configuration. We will also demonstrate the simplicity of the CLOC.
Example 3.3
Consider a simple first-order system
$$\dot{x}(t) = -3x(t) + u(t) \tag{3.5.30}$$
and the cost function (CF) as
$$J = \int_{0}^{\infty} \left[x^2(t) + u^2(t)\right] dt \tag{3.5.31}$$
where $x(0) = 1$ and the final state $x(\infty) = 0$. Find the open-loop and closed-loop optimal controllers.
Solution: (a) Open-Loop Optimal Control: We use the Pontryagin procedure given in Chapter 2 (see Table 2.1). First of all, comparing the given plant (3.5.30) and the CF (3.5.31) with the general formulations (see Table 2.1), identify that
$$
\begin{aligned}
V(x(t), u(t)) &= x^2(t) + u^2(t), \\
f(x(t), u(t)) &= -3x(t) + u(t).
\end{aligned} \tag{3.5.32}
$$
Now, we use the step-by-step procedure given in Table 2.1.
• Step 1: Formulate the Hamiltonian as
$$
\begin{aligned}
\mathcal{H} &= V(x(t), u(t)) + \lambda(t)f(x(t), u(t)) \\
&= x^2(t) + u^2(t) + \lambda(t)\left[-3x(t) + u(t)\right].
\end{aligned} \tag{3.5.33}
$$
• Step 2: The optimal control $u^*(t)$ is obtained by minimizing the previous Hamiltonian w.r.t. $u$ as
$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; 2u^*(t) + \lambda^*(t) = 0 \;\longrightarrow\; u^*(t) = -\frac{1}{2}\lambda^*(t). \tag{3.5.34}$$
• Step 3: Using the optimal control (3.5.34) in the Hamiltonian function (3.5.33), find the optimal Hamiltonian function as
$$\mathcal{H}^* = x^{*2}(t) - \frac{1}{4}\lambda^{*2}(t) - 3\lambda^*(t)x^*(t). \tag{3.5.35}$$
• Step 4: Using the previous optimal $\mathcal{H}^*$, obtain the set of state and costate equations
$$\dot{x}^*(t) = \frac{\partial \mathcal{H}^*}{\partial \lambda} \;\longrightarrow\; \dot{x}^*(t) = -\frac{1}{2}\lambda^*(t) - 3x^*(t), \tag{3.5.36}$$
$$\dot{\lambda}^*(t) = -\frac{\partial \mathcal{H}^*}{\partial x} \;\longrightarrow\; \dot{\lambda}^*(t) = -2x^*(t) + 3\lambda^*(t), \tag{3.5.37}$$
yielding
$$\ddot{x}^*(t) - 10x^*(t) = 0, \tag{3.5.38}$$
the solution of which becomes
$$x^*(t) = C_1e^{\sqrt{10}t} + C_2e^{-\sqrt{10}t}. \tag{3.5.39}$$
Using the optimal state (3.5.39) in (3.5.36), we have the costate as
$$
\begin{aligned}
\lambda^*(t) &= 2\left[-\dot{x}^*(t) - 3x^*(t)\right] \\
&= -2C_1(\sqrt{10} + 3)e^{\sqrt{10}t} + 2C_2(\sqrt{10} - 3)e^{-\sqrt{10}t}.
\end{aligned} \tag{3.5.40}
$$
Using the initial condition $x(0) = 1$ in the optimal state (3.5.39), and the final condition (for $\delta x_f$ being free) $\lambda(t_f = \infty) = 0$ in the optimal costate (3.5.40), we get
$$
\begin{aligned}
x(0) = 1 &\;\longrightarrow\; C_1 + C_2 = 1 \\
\lambda(\infty) = 0 &\;\longrightarrow\; C_1 = 0.
\end{aligned} \tag{3.5.41}
$$
Then, the previous optimal state and costate are given as
$$x^*(t) = e^{-\sqrt{10}t}; \qquad \lambda^*(t) = 2(\sqrt{10} - 3)e^{-\sqrt{10}t}. \tag{3.5.42}$$
• Step 5: Using the previous costate solution in (3.5.34) of Step 2, we get the open-loop optimal control as
$$u^*(t) = -(\sqrt{10} - 3)e^{-\sqrt{10}t}. \tag{3.5.43}$$
(b) Closed-Loop Optimal Control: Here, we use the matrix algebraic Riccati equation (ARE) to find the closed-loop optimal control, as summarized in Table 3.3. First of all, comparing the present plant (3.5.30) and the PI (3.5.31) with the general formulation of the plant (3.5.12) and the PI (3.5.13), respectively, we identify the various coefficients and matrices as
$$A = a = -3; \quad B = b = 1; \quad Q = q = 2; \quad R = r = 2; \quad \bar{P} = \bar{p}. \tag{3.5.44}$$
Note that the PI (3.5.31) does not contain the factor 1/2 as in the general PI (3.5.13) and accordingly, we have $Q = q = 2$ and $R = r = 2$.
• Step 1: With the previous values, the ARE (3.5.15) becomes
$$\bar{p}(-3) + (-3)\bar{p} - \bar{p}(1)\left(\tfrac{1}{2}\right)(1)\bar{p} + 2 = 0 \;\longrightarrow\; \bar{p}^2 + 12\bar{p} - 4 = 0, \tag{3.5.45}$$
the solution of which is
$$\bar{p} = -6 \pm 2\sqrt{10}. \tag{3.5.46}$$
• Step 2: Using the positive value of the Riccati coefficient (3.5.46), the closed-loop optimal control (3.5.14) becomes
$$
\begin{aligned}
u^*(t) &= -r^{-1}b\bar{p}x^*(t) = -\frac{1}{2}(-6 + 2\sqrt{10})x^*(t) \\
&= -(-3 + \sqrt{10})x^*(t).
\end{aligned} \tag{3.5.47}
$$
• Step 3: Using the optimal control (3.5.47), the optimal state is solved from (3.5.16) as
$$\dot{x}^*(t) = -3x^*(t) - (-3 + \sqrt{10})x^*(t) = -\sqrt{10}x^*(t). \tag{3.5.48}$$
Solving the previous equation along with the initial condition $x(0) = 1$, we get the optimal state as
$$x^*(t) = e^{-\sqrt{10}t} \tag{3.5.49}$$
with which the optimal control (3.5.47) becomes
$$u^*(t) = -(\sqrt{10} - 3)e^{-\sqrt{10}t}. \tag{3.5.50}$$
Thus, we note that the optimal control (3.5.50) and optimal state (3.5.49) obtained using the closed-loop optimal control are identical to those of (3.5.43) and (3.5.42), respectively. We can easily extend this analysis to the general case. Intuitively, this equivalence should exist because the optimal control, being unique, should be the same by any method.
The implementation of this open-loop optimal controller (OLOC) is shown in Figure 3.12(a), and that of the closed-loop optimal controller (CLOC) is shown in Figure 3.12(b).
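The equivalence shown in Example 3.3 can also be checked numerically. The sketch below, in plain Python, simulates the plant (3.5.30) under the constant feedback gain of (3.5.47) and compares the resulting control with the open-loop time function (3.5.43) at a few instants. The sample times, the step count, and the function names are assumptions made for illustration.

```python
import math

GAIN = math.sqrt(10.0) - 3.0  # closed-loop gain from (3.5.47)


def closed_loop_state(t_end, steps=2000):
    """RK4 solution of xdot = -3x + u with u = -GAIN*x and x(0) = 1."""
    def rhs(x):
        return -3.0 * x - GAIN * x

    h = t_end / steps
    x = 1.0
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x


def open_loop_control(t):
    """Open-loop optimal control (3.5.43) as an explicit function of time."""
    return -GAIN * math.exp(-math.sqrt(10.0) * t)


sample_times = [0.0, 0.5, 1.0, 2.0]
# Closed-loop control is the gain times the simulated state at each instant.
max_gap = max(
    abs(-GAIN * closed_loop_state(t) - open_loop_control(t)) for t in sample_times
)
print(f"largest difference between CLOC and OLOC controls: {max_gap:.2e}")
```

The two controls agree only as long as the plant follows the nominal trajectory. The feedback form keeps working if the state is perturbed, whereas the precomputed time function does not adapt.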
From the previous example, it is clear that
1. from the implementation point of view, the closed-loop optimal controller $(\sqrt{10} - 3)$ is much simpler than the open-loop optimal controller $((\sqrt{10} - 3)e^{-\sqrt{10}t})$, which is an exponential time function, and
2. with a closed-loop configuration, all the advantages of conventional feedback are incorporated.
3.6 Notes and Discussion
We know that linear quadratic optimal control is concerned with linear plants, performance measures quadratic in controls and states, and regulation and tracking errors. In particular, the resulting optimal controller is closed-loop and linear in the state. Note that linear quadratic optimal control is a special class of general optimal control, which includes nonlinear systems and nonlinear performance measures. Linear quadratic optimal control systems have many useful advantages and attractive features, which are enumerated below [3].
Figure 3.12 (a) Open-Loop Optimal Controller (OLOC) and (b) Closed-Loop Optimal Controller (CLOC)
1. Many engineering and physical systems operate in the linear range during normal operations.
2. There is a wealth of theoretical results available for linear systems which can be useful for linear quadratic methods.
3. The resulting optimal controller is linear in the state and thus easy and simple to implement in real applications of the LQ results.
4. Many (nonlinear) optimal control systems do not have solutions which can be easily computed. On the other hand, LQ optimal control systems have easily computable solutions.
5. As is well known, nonlinear systems can be examined for small variations from their normal operating conditions. For example, assume that after a heavy computational effort, we obtained an optimal solution for a nonlinear plant and there is a small change from an operating point. Then, one can easily use, to a first approximation, a linear model and obtain a linear optimal control to drive the original nonlinear system to its operating point.
6. Many of the concepts, techniques and computational procedures that are developed for linear optimal control systems may in many cases be carried over to nonlinear optimal control systems.
7. Linear optimal control designs for plants whose states are measurable possess a number of desirable robustness properties (such as good gain and phase margins and a good tolerance to nonlinearities) of classical control designs.
3.7 Problems
1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and
plot all the optimal controls and states for all problems. Provide
the relevant MATLAB© m files.
Problem 3.1 A first order system is given by
±(t) = x(t) + u(t).
(a) Find the unconstrained optimal control law which minimizes the
performance index
such that the final time $t_f$ is fixed and the final state $x(t_f)$ is free.
(b) Find the optimal control law as $t_f \to \infty$.
Problem 3.2 A system is described by
x(t) + x(t) = u(t)
with initial conditions x(O) = 0 and ±(O) = 1 and the performance
index
Find the closed-loop optimal control in terms of $x$ and $\dot{x}$ and the optimal cost function.
Problem 3.3 A first order system is described by
±(t) = ax(t) + bu(t)
with performance index
and with a fixed initial state $x(0)$ and final state $x(t_f) = 0$, where $t_f$ is fixed. Show that the solution of the Riccati equation is given by
$$p(t) = \frac{r}{b^2}\left[a - \beta\coth\{\beta(t - t_f)\}\right]$$
and the solution of the optimal state $x^*(t)$ is given by
$$x^*(t) = x(0)\,\frac{\sinh\{\beta(t_f - t)\}}{\sinh \beta t_f}$$
where $\beta = \sqrt{a^2 + b^2q/r}$.
Problem 3.4 Find the optimal feedback control for the plant
XI(t) = X2(t)
X2(t) = 2XI(t) + 4X2(t) + 5u(t)
with performance criterion
$$J = \frac{1}{2}f_{11}\left[x_1(t_f) - 4\right]^2 + \frac{1}{2}f_{22}\left[x_2(t_f) - 2\right]^2 + \frac{1}{2}\int_{0}^{t_f} \left[5x_1^2(t) + 2x_2^2(t) + 4u^2(t)\right] dt$$
and initial conditions as x(O) = [1 2]' and the final state x(tf) is free,
where t f is specified.
Problem 3.5 Find the closed-loop, unconstrained, optimal control for the system
Xl (t) = X2(t)
X2(t) = 2XI (t)  3X2 (t) + u(t)
and the performance index
$$J = \int_{0}^{\infty} \left[x_1^2(t) + x_2^2(t) + u^2(t)\right] dt.$$
Problem 3.6 Find the optimal feedback control law for the plant
;.h(t) = X2(t) + u(t)
X2(t) = XI(t)  X2(t) + u(t)
and the cost function
$$J = \int_{0}^{\infty} \left[2x_1^2(t) + 4x_2^2(t) + 0.5u^2(t)\right] dt.$$
Problem 3.7 Consider a second order system
x(t) + bx(t) + cx(t) = u(t)
and the performance index to be minimized as
Determine the closed-loop optimal control in terms of the state $x(t)$ and its derivative $\dot{x}(t)$.
Problem 3.8 Given a third order plant,
Xl(t) = X2(t)
X2(t) = X3(t)
X3(t) = 5Xl(t) + 7X2(t) 10X3(t) + 4u(t)
and the performance index
for
1. ql1 = q22 = q33 = 1, r = 1,
2. ql1 = 10, q22 = 1, q33 = 1, r = 1, and
3. ql1 = q22 = q33 = 1, r = 10,
find the positive definite solution for the Riccati coefficient matrix $\bar{P}$, the optimal feedback gain matrix K and the eigenvalues of the closed-loop system matrix A - BK.
Problem 3.9 Determine the optimal feedback coefficients and the optimal control law for the multi-input, multi-output (MIMO) system
. [01] [11]
x(t) = 11 x(t) + 0 1 u(t)
and the cost function
J = foCXJ [2xr(t) + + 0.5ur(t) +
Problem 3.10 For the D.C. motor speed control system described in Problem 1.1, find the closed-loop optimal control to keep the speed constant at a particular value and to make the system respond to any disturbances from the regulated value.
Problem 3.11 For the liquid-level control system described in Problem 1.2, find the closed-loop optimal control to keep the liquid level constant at a reference value and to make the system act only if there is a change in the liquid level.
Problem 3.12 [35] For the inverted pendulum control system described in Problem 1.3, find the closed-loop optimal control to keep the pendulum in a vertical position.
Problem 3.13 For the mechanical control system described in Problem 1.4, find the closed-loop optimal control to keep the system at the equilibrium condition and act only if there is a disturbance.
Problem 3.14 For the automobile suspension system described in Problem 1.5, find the closed-loop optimal control to keep the system at the equilibrium condition and act only if there is a disturbance.
Problem 3.15 For the chemical control system described in Problem 1.6, find the closed-loop optimal control to keep the system at the equilibrium condition and act only if there is a disturbance.
Chapter 4
Linear Quadratic Optimal
Control Systems II
In the previous chapter, we addressed the linear quadratic regulator
system, where the aim was to obtain the optimal control to regulate
(or keep) the state around zero. In this chapter, we discuss linear
quadratic tracking (LQT) systems, and some related topics in linear
quadratic regulator theory. It is suggested that the student reviews the
material in Appendices A and B given at the end of the book. This
chapter is based on [6, 89, 3] 1.
Trajectory Following Systems
In tracking (trajectory following) systems, we require that the output
of a system track or follow a desired trajectory in some optimal sense.
Thus, we see that this is a generalization of the regulator system in the
sense that the desired trajectory for the regulator is simply the zero
state.
¹ The permissions given by John Wiley for F. L. Lewis, Optimal Control, John Wiley &
Sons, Inc., New York, NY, 1986, and by McGraw-Hill for M. Athans and P. L. Falb, Optimal
Control: An Introduction to the Theory and Its Applications, McGraw-Hill Book Company,
New York, NY, 1966, are hereby acknowledged.
152 Chapter 4: Linear Quadratic Optimal Control Systems II
4.1 Linear Quadratic Tracking System: Finite-Time Case
In this section, we discuss the linear quadratic tracking (LQT) system
to maintain the output as close as possible to the desired output with
minimum control energy [6]. We are given a linear, observable system

$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t) \tag{4.1.1}$$

where x(t) is the nth order state vector, u(t) is the rth order control
vector, and y(t) is the mth order output vector. Let z(t) be the mth
order desired output and the various matrices A(t), B(t) and C(t) be
of appropriate dimensionality. Our objective is to control the system
(4.1.1) in such a way that the output y(t) tracks the desired output
z(t) as close as possible during the interval $[t_0, t_f]$ with minimum
expenditure of control effort. For this, let us define the error vector as

$$e(t) = z(t) - y(t) \tag{4.1.2}$$

and choose the performance index as

$$J = \frac{1}{2} e'(t_f)F(t_f)e(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \left[ e'(t)Q(t)e(t) + u'(t)R(t)u(t) \right] dt \tag{4.1.3}$$

with $t_f$ specified and $x(t_f)$ not specified. In this way we are dealing
with a free-final-state system. Also, we assume that $F(t_f)$ and Q(t) are
m×m symmetric, positive semidefinite matrices, and R(t) is an r×r symmetric,
positive definite matrix. We now use the Pontryagin Minimum
Principle in the following order.
• Step 1: Hamiltonian
• Step 2: Open-Loop Optimal Control
• Step 3: State and Costate System
• Step 4: Riccati and Vector Equations
• Step 5: Closed-Loop Optimal Control
• Step 6: Optimal State
• Step 7: Optimal Cost
Now we discuss these steps in detail. Also, note that we heavily draw
upon the results of the previous Chapters 2 and 3. First of all, let us
note from (4.1.1) and (4.1.2) that the error e(t) can be expressed as a
function of z(t) and x(t) as

$$e(t) = z(t) - C(t)x(t). \tag{4.1.4}$$
• Step 1: Hamiltonian: Formulate the Hamiltonian as (see Table 3.1)

  $$\mathcal{H}(x(t), u(t), \lambda(t)) = \frac{1}{2}\left[z(t) - C(t)x(t)\right]' Q(t) \left[z(t) - C(t)x(t)\right] + \frac{1}{2}u'(t)R(t)u(t) + \lambda'(t)\left[A(t)x(t) + B(t)u(t)\right]. \tag{4.1.5}$$
• Step 2: Open-Loop Optimal Control: Using the Hamiltonian
  (4.1.5), obtain the control equation from

  $$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; R(t)u(t) + B'(t)\lambda(t) = 0 \tag{4.1.6}$$

  from which we have the optimal control as

  $$u^*(t) = -R^{-1}(t)B'(t)\lambda^*(t). \tag{4.1.7}$$

  Since the second partial of $\mathcal{H}$ in (4.1.5) w.r.t. u*(t) is just R(t),
  and we chose R(t) to be positive definite, we are dealing with a
  control which minimizes the cost functional (4.1.3).
• Step 3: State and Costate System: The state is given in terms of
  the Hamiltonian (4.1.5) as

  $$\dot{x}(t) = \frac{\partial \mathcal{H}}{\partial \lambda} = A(t)x(t) + B(t)u(t) \tag{4.1.8}$$

  and with the optimal control (4.1.7), the optimal state equation
  (4.1.8) becomes

  $$\dot{x}^*(t) = A(t)x^*(t) - B(t)R^{-1}(t)B'(t)\lambda^*(t). \tag{4.1.9}$$

  Using the Hamiltonian (4.1.5), the optimal costate equation becomes

  $$\dot{\lambda}^*(t) = -\frac{\partial \mathcal{H}}{\partial x} = -C'(t)Q(t)C(t)x^*(t) - A'(t)\lambda^*(t) + C'(t)Q(t)z(t). \tag{4.1.10}$$

  For the sake of simplicity, let us define

  $$E(t) = B(t)R^{-1}(t)B'(t), \qquad V(t) = C'(t)Q(t)C(t), \qquad W(t) = C'(t)Q(t). \tag{4.1.11}$$

  Using the relation (4.1.11) and combining the state (4.1.8) and
  costate (4.1.10) equations, we obtain the Hamiltonian canonical
  system as

  $$\begin{bmatrix} \dot{x}^*(t) \\ \dot{\lambda}^*(t) \end{bmatrix} = \begin{bmatrix} A(t) & -E(t) \\ -V(t) & -A'(t) \end{bmatrix} \begin{bmatrix} x^*(t) \\ \lambda^*(t) \end{bmatrix} + \begin{bmatrix} 0 \\ W(t) \end{bmatrix} z(t). \tag{4.1.12}$$

  This canonical system of 2n differential equations is linear,
  time-varying, but nonhomogeneous with W(t)z(t) as forcing function.
  The boundary conditions for these state and costate equations are
  given by the initial condition on the state as

  $$x(t = t_0) = x(t_0) \tag{4.1.13}$$

  and the final condition on the costate (for the final time $t_f$
  specified and state $x(t_f)$ being free) given by (3.2.10), which along
  with (4.1.4) becomes

  $$\begin{aligned}
  \lambda(t_f) &= \frac{\partial}{\partial x(t_f)}\left[\frac{1}{2}e'(t_f)F(t_f)e(t_f)\right] \\
  &= \frac{\partial}{\partial x(t_f)}\left[\frac{1}{2}\left[z(t_f) - C(t_f)x(t_f)\right]'F(t_f)\left[z(t_f) - C(t_f)x(t_f)\right]\right] \\
  &= C'(t_f)F(t_f)C(t_f)x(t_f) - C'(t_f)F(t_f)z(t_f).
  \end{aligned} \tag{4.1.14}$$
• Step 4: Riccati and Vector Equations: The boundary condition
  (4.1.14) and the solution of the system (4.1.12) indicate that the
  state and costate are linearly related as

  $$\lambda^*(t) = P(t)x^*(t) - g(t) \tag{4.1.15}$$

  where the n×n matrix P(t) and the n-vector g(t) are yet to be
  determined so as to satisfy the canonical system (4.1.12). This is
  done by substituting the linear (Riccati) transformation (4.1.15)
  in the Hamiltonian system (4.1.12) and eliminating the costate
  function $\lambda^*(t)$. Thus, we first differentiate (4.1.15) to get

  $$\dot{\lambda}^*(t) = \dot{P}(t)x^*(t) + P(t)\dot{x}^*(t) - \dot{g}(t). \tag{4.1.16}$$

  Now, substituting for $\dot{x}^*(t)$ and $\dot{\lambda}^*(t)$ from (4.1.12) and
  eliminating $\lambda^*(t)$ with (4.1.15), we get

  $$-V(t)x^*(t) - A'(t)\left[P(t)x^*(t) - g(t)\right] + W(t)z(t) = \dot{P}(t)x^*(t) + P(t)\left[A(t)x^*(t) - E(t)\left\{P(t)x^*(t) - g(t)\right\}\right] - \dot{g}(t). \tag{4.1.17}$$

  Rearranging the above, we get

  $$\left[\dot{P}(t) + P(t)A(t) + A'(t)P(t) - P(t)E(t)P(t) + V(t)\right]x^*(t) - \left[\dot{g}(t) + A'(t)g(t) - P(t)E(t)g(t) + W(t)z(t)\right] = 0. \tag{4.1.18}$$

  Now, this relation (4.1.18) must hold for all x*(t), z(t) and
  t, which leads us to the n×n matrix P(t) satisfying the matrix
  differential Riccati equation (DRE)

  $$\dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)E(t)P(t) - V(t) \tag{4.1.19}$$

  and the n-vector g(t) satisfying the vector differential equation

  $$\dot{g}(t) = \left[P(t)E(t) - A'(t)\right]g(t) - W(t)z(t). \tag{4.1.20}$$

  Since P(t) is an n×n symmetric matrix and g(t) is an n-vector,
  the equations (4.1.19) and (4.1.20) are a set of n(n + 1)/2 + n
  first-order differential equations. The boundary conditions are
  obtained from (4.1.15) as

  $$\lambda^*(t_f) = P(t_f)x^*(t_f) - g(t_f) \tag{4.1.21}$$

  which compared with the boundary condition (4.1.14) gives us,
  for all $x(t_f)$ and $z(t_f)$,

  $$P(t_f) = C'(t_f)F(t_f)C(t_f), \tag{4.1.22}$$

  $$g(t_f) = C'(t_f)F(t_f)z(t_f). \tag{4.1.23}$$

  Thus, the matrix DRE (4.1.19) and the vector equation (4.1.20)
  are to be solved backward using the boundary conditions (4.1.22)
  and (4.1.23).
• Step 5: Closed-Loop Optimal Control: The optimal control (4.1.7)
  is now given in terms of the state using the linear transformation
  (4.1.15)

  $$u^*(t) = -R^{-1}(t)B'(t)\left[P(t)x^*(t) - g(t)\right] = -K(t)x^*(t) + R^{-1}(t)B'(t)g(t) \tag{4.1.24}$$

  where $K(t) = R^{-1}(t)B'(t)P(t)$ is the Kalman gain.
• Step 6: Optimal State: Using this optimal control u*(t) from
  (4.1.24) in the original plant (4.1.1), we have the optimal state
  obtained from

  $$\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) + B(t)R^{-1}(t)B'(t)g(t) = \left[A(t) - E(t)P(t)\right]x^*(t) + E(t)g(t). \tag{4.1.25}$$
• Step 7: Optimal Cost: The optimal cost J*(t) for any time t can
  be obtained as (see [6])

  $$J^*(t) = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t) - x^{*\prime}(t)g(t) + h(t) \tag{4.1.26}$$

  where the new function h(t) satisfies [3, 6]

  $$\dot{h}(t) = \frac{1}{2}g'(t)B(t)R^{-1}(t)B'(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t) = \frac{1}{2}g'(t)E(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t) \tag{4.1.27}$$
with final condition
(4.1.28)
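The final condition on h(t) can be cross-checked with a short calculation. What follows is our own sketch, not a result quoted from the text: at $t = t_f$ only the terminal term of (4.1.3) remains, so the optimal cost (4.1.26) evaluated at $t_f$ should agree with that term when the error is written out with (4.1.4).

```latex
\begin{aligned}
J^*(t_f) &= \tfrac{1}{2}\left[z(t_f) - C(t_f)x^*(t_f)\right]' F(t_f) \left[z(t_f) - C(t_f)x^*(t_f)\right] \\
         &= \tfrac{1}{2}\,x^{*\prime}(t_f)\,C'(t_f)F(t_f)C(t_f)\,x^*(t_f)
            \;-\; x^{*\prime}(t_f)\,C'(t_f)F(t_f)z(t_f)
            \;+\; \tfrac{1}{2}\,z'(t_f)F(t_f)z(t_f).
\end{aligned}
% Comparing with (4.1.26) term by term:
%   quadratic term  ->  P(t_f) = C'(t_f) F(t_f) C(t_f)      (agrees with (4.1.22))
%   linear term     ->  g(t_f) = C'(t_f) F(t_f) z(t_f)      (agrees with (4.1.23))
%   constant term   ->  h(t_f) = (1/2) z'(t_f) F(t_f) z(t_f) (under this reading)
```

The first two matches reproduce (4.1.22) and (4.1.23), which gives some confidence in the third; for the form used by the cited references, see [3, 6].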
For further details on this, see [3, 6, 89, 90]. We now summarize the
tracking system.
[Figure 4.1: block diagram with blocks labeled Plant, Closed-Loop Optimal Controller, and Off-Line Simulation of P(t) and g(t); signals x*(t), y*(t), P(t)x*(t), g(t), and the desired z(t).]
Figure 4.1 Implementation of the Optimal Tracking System
4.1.1 Linear Quadratic Tracking System: Summary
Given the linear, observable system (see Figure 4.1)

$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t), \tag{4.1.29}$$

the desired output z(t), the error e(t) = z(t) - y(t), and the
performance index

$$J = \frac{1}{2}e'(t_f)F(t_f)e(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[e'(t)Q(t)e(t) + u'(t)R(t)u(t)\right]dt, \tag{4.1.30}$$

then the optimal control u*(t) is given by

$$u^*(t) = -R^{-1}(t)B'(t)\left[P(t)x^*(t) - g(t)\right] = -K(t)x^*(t) + R^{-1}(t)B'(t)g(t) \tag{4.1.31}$$

where the n×n symmetric, positive definite matrix P(t) is the solution
of the nonlinear, matrix differential Riccati equation (DRE)

$$\dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)E(t)P(t) - V(t) \tag{4.1.32}$$

with final condition

$$P(t_f) = C'(t_f)F(t_f)C(t_f), \tag{4.1.33}$$

and the nth order g(t) is the solution of the linear, nonhomogeneous
vector differential equation

$$\dot{g}(t) = -\left[A(t) - E(t)P(t)\right]'g(t) - W(t)z(t) \tag{4.1.34}$$

with final condition

$$g(t_f) = C'(t_f)F(t_f)z(t_f), \tag{4.1.35}$$

where E(t), V(t) and W(t) are defined in (4.1.11). The optimal state
(trajectory) is the solution of the linear state equation

$$\dot{x}^*(t) = \left[A(t) - E(t)P(t)\right]x^*(t) + E(t)g(t), \tag{4.1.36}$$

and the optimal cost J* is

$$J^*(t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) - x^{*\prime}(t_0)g(t_0) + h(t_0). \tag{4.1.37}$$

The implementation of the tracking system is shown in Figure 4.1.
The entire procedure is now summarized in Table 4.1.
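The summary can be exercised numerically on a scalar plant. The Python fragment below is a minimal sketch of our own (it is not one of the MATLAB routines of Appendix C): it integrates the scalar forms of (4.1.32), (4.1.34) and (4.1.27) backward with a classical Runge-Kutta step, evaluates the cost formula (4.1.37), and compares it with the cost accumulated by simulating the closed loop (4.1.31). The function names, the step count `n`, and the terminal value h(tf) = f z(tf)^2 / 2 are assumptions made for this illustration.

```python
def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y); h may be negative."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
            for yi, d1, d2, d3, d4 in zip(y, k1, k2, k3, k4)]


def scalar_lqt_costs(a, b, c, q, r, f, z, x0, tf, n=2000):
    """Return (cost from (4.1.37), cost from simulating the closed loop)."""
    E, V, W = b * b / r, c * c * q, c * q             # scalar form of (4.1.11)
    h = tf / (2 * n)                                   # backward grid spacing

    def backward_rates(t, y):                          # y = [p, g, h_fn]
        p, g, _ = y
        return [-2 * a * p + E * p * p - V,            # (4.1.32)
                (p * E - a) * g - W * z(t),            # (4.1.34)
                0.5 * E * g * g - 0.5 * q * z(t) ** 2]  # (4.1.27)

    # Terminal values from (4.1.33) and (4.1.35); the h(tf) value is an assumption.
    y = [c * c * f, c * f * z(tf), 0.5 * f * z(tf) ** 2]
    table = [y]
    for k in range(2 * n):
        y = rk4_step(backward_rates, tf - k * h, y, -h)
        table.append(y)
    table.reverse()                                    # table[i] holds t = i * h

    p0, g0, h0 = table[0]
    j_formula = 0.5 * p0 * x0 * x0 - x0 * g0 + h0      # (4.1.37)

    def forward_rates(i, s):                           # s = [x, running cost]
        p, g, _ = table[i]
        u = -(b / r) * (p * s[0] - g)                  # (4.1.31)
        e = z(i * h) - c * s[0]
        return [a * s[0] + b * u, 0.5 * (q * e * e + r * u * u)]

    big = 2 * h                                        # forward step; midpoints lie on the grid
    s = [x0, 0.0]
    for k in range(n):
        i = 2 * k
        k1 = forward_rates(i, s)
        k2 = forward_rates(i + 1, [si + h * d for si, d in zip(s, k1)])
        k3 = forward_rates(i + 1, [si + h * d for si, d in zip(s, k2)])
        k4 = forward_rates(i + 2, [si + big * d for si, d in zip(s, k3)])
        s = [si + big / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
             for si, d1, d2, d3, d4 in zip(s, k1, k2, k3, k4)]
    e_f = z(tf) - c * s[0]
    return j_formula, s[1] + 0.5 * f * e_f * e_f       # add the terminal cost
```

For instance, `scalar_lqt_costs(-1.0, 1.0, 1.0, 2.0, 0.5, 1.0, lambda t: 1.0 + 0.5 * t, 0.5, 5.0)` returns two numbers that should agree closely if the backward equations and the feedback law are mutually consistent. Storing the backward solution on a grid twice as fine as the forward step is only a convenience so that the forward Runge-Kutta stages land on stored values.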
4.1.2 Salient Features of Tracking System
1. Riccati Coefficient Matrix P(t): We note that the desired output
z(t) has no influence on the matrix differential Riccati equation
(4.1.32) and its boundary condition (4.1.33). This means that
once the problem is specified in terms of the final time $t_f$, the
plant matrices A(t), B(t), and C(t), and the cost functional
matrices $F(t_f)$, Q(t), and R(t), the matrix function P(t) is
completely determined.
Table 4.1 Procedure Summary of Linear Quadratic Tracking System

A. Statement of the Problem
Given the plant as
  $$\dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t), \qquad e(t) = z(t) - y(t),$$
the performance index as
  $$J = \frac{1}{2}e'(t_f)F(t_f)e(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[e'(t)Q(t)e(t) + u'(t)R(t)u(t)\right]dt,$$
and the boundary conditions as
  $$x(t_0) = x_0, \qquad x(t_f) \text{ is free},$$
find the optimal control, state and performance index.

B. Solution of the Problem
Step 1. Solve the matrix differential Riccati equation
  $$\dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)E(t)P(t) - V(t),$$
  with final condition $P(t_f) = C'(t_f)F(t_f)C(t_f)$,
  and the nonhomogeneous vector differential equation
  $$\dot{g}(t) = -\left[A(t) - E(t)P(t)\right]'g(t) - W(t)z(t),$$
  with final condition $g(t_f) = C'(t_f)F(t_f)z(t_f)$, where
  $$E(t) = B(t)R^{-1}(t)B'(t), \qquad V(t) = C'(t)Q(t)C(t), \qquad W(t) = C'(t)Q(t).$$
Step 2. Solve the optimal state x*(t) from
  $$\dot{x}^*(t) = \left[A(t) - E(t)P(t)\right]x^*(t) + E(t)g(t)$$
  with initial condition $x(t_0) = x_0$.
Step 3. Obtain the optimal control u*(t) from
  $$u^*(t) = -K(t)x^*(t) + R^{-1}(t)B'(t)g(t),$$
  where $K(t) = R^{-1}(t)B'(t)P(t)$.
Step 4. The optimal cost $J^*(t_0)$ is
  $$J^*(t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) - x^{*\prime}(t_0)g(t_0) + h(t_0)$$
  where h(t) is the solution of
  $$\dot{h}(t) = \frac{1}{2}g'(t)E(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t)$$
  with final condition $h(t_f) = z'(t_f)P(t_f)z(t_f)$.
2. Closed-Loop Eigenvalues: From the optimal state equation (4.1.36), we
see that the closed-loop system matrix $[A(t) - B(t)R^{-1}(t)B'(t)P(t)]$
is again independent of the desired output z(t). This means the
eigenvalues of the closed-loop, optimal tracking system are
independent of the desired output z(t).
3. Tracking and Regulator Systems: The main difference between
the optimal output tracking system and the optimal state regulator
system is in the vector g(t). As shown in Figure 4.1, one can
think of the desired output z(t) as the forcing function of the
closed-loop optimal system which generates the signal g(t).
4. Also, note that if we make C(t) = I, then in (4.1.11), V(t) =
Q(t). Thus, the matrix DRE (4.1.19) becomes the same matrix
DRE (3.2.34) that was obtained in the LQR system in Chapter 3.
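Items 3 and 4 can be checked against each other with a one-line calculation. The following is our own illustration, under the assumption that C = I and that the desired output is identically zero, which is exactly the regulator situation.

```latex
z(t) \equiv 0,\; C = I
\;\Longrightarrow\;
\dot{g}(t) = \left[P(t)E(t) - A'(t)\right]g(t), \quad g(t_f) = 0
\;\Longrightarrow\;
g(t) \equiv 0,
\qquad\text{so that}\qquad
u^*(t) = -R^{-1}(t)B'(t)P(t)x^*(t) = -K(t)x^*(t).
```

The vector equation (4.1.20) is then homogeneous with a zero final condition (4.1.23), so g(t) vanishes and the tracking law (4.1.24) collapses to pure state feedback.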
Let us consider a second order example to illustrate the linear quadratic
tracking system.
Example 4.1
A second order plant

$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = -2x_1(t) - 3x_2(t) + u(t), \qquad y(t) = x(t) \tag{4.1.38}$$

is to be controlled to minimize the performance index

$$J = \left[1 - x_1(t_f)\right]^2 + \int_{0}^{t_f}\left\{\left[1 - x_1(t)\right]^2 + 0.002u^2(t)\right\}dt. \tag{4.1.39}$$

The final time $t_f$ is specified at 20, the final state $x(t_f)$ is free and
the admissible controls and states are unbounded. It is required
to keep the state $x_1(t)$ close to 1. Obtain the feedback control
law and plot all the time histories of Riccati coefficients, g vector
components, optimal states and control.
Solution: The performance index indicates that the state $x_1(t)$ is
to be kept close to the reference input $z_1(t) = 1$ and, since there is
no condition on state $x_2(t)$, one can choose arbitrarily $z_2(t) = 0$.
Now, in our present case, with e(t) = z(t) - Cx(t), we have $e_1(t) =
z_1(t) - x_1(t)$, $e_2(t) = z_2(t) - x_2(t)$ and C = I.
Next, let us identify the various matrices in the present tracking
system. Comparing the state (4.1.38) and the PI (4.1.39) of the
present system (note the absence of the factor 1/2 in the PI) with the
corresponding state (4.1.29) and the PI (4.1.30), respectively, of
the general formulation of the problem, we get

$$A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad C = I; \quad z(t) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}; \quad Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} = F(t_f); \quad R = r = 0.004. \tag{4.1.40}$$
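Let P(t) be the 2×2 symmetric matrix and g(t) be the 2×1 vector
as

$$P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}; \qquad g(t) = \begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix}. \tag{4.1.41}$$

Then, the optimal control given by (4.1.31) becomes

$$u^*(t) = -250\left[p_{12}(t)x_1^*(t) + p_{22}(t)x_2^*(t) - g_2(t)\right] \tag{4.1.42}$$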
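where P(t), the positive definite matrix, is the solution of the
matrix differential Riccati equation (4.1.32)

$$\begin{bmatrix} \dot{p}_{11}(t) & \dot{p}_{12}(t) \\ \dot{p}_{12}(t) & \dot{p}_{22}(t) \end{bmatrix} = -\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} + \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.004}\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.43}$$

and g(t) is the solution of the nonhomogeneous vector differential
equation obtained from (4.1.34) as

$$\begin{bmatrix} \dot{g}_1(t) \\ \dot{g}_2(t) \end{bmatrix} = -\left\{\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.004}\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}\right\}'\begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix}. \tag{4.1.44}$$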
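Simplifying the equations (4.1.43) and (4.1.44), we get

$$\begin{aligned}
\dot{p}_{11}(t) &= 250p_{12}^2(t) + 4p_{12}(t) - 2 \\
\dot{p}_{12}(t) &= 250p_{12}(t)p_{22}(t) - p_{11}(t) + 3p_{12}(t) + 2p_{22}(t) \\
\dot{p}_{22}(t) &= 250p_{22}^2(t) - 2p_{12}(t) + 6p_{22}(t)
\end{aligned} \tag{4.1.45}$$

with final condition (4.1.33) as

$$\begin{bmatrix} p_{11}(t_f) & p_{12}(t_f) \\ p_{12}(t_f) & p_{22}(t_f) \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.46}$$

and

$$\begin{aligned}
\dot{g}_1(t) &= \left[250p_{12}(t) + 2\right]g_2(t) - 2 \\
\dot{g}_2(t) &= -g_1(t) + \left[3 + 250p_{22}(t)\right]g_2(t)
\end{aligned} \tag{4.1.47}$$

with final condition

$$\begin{bmatrix} g_1(t_f) \\ g_2(t_f) \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}. \tag{4.1.48}$$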
Note: One has to try various values of the matrix R in order to
get a better tracking of the states. Solutions for the functions
$p_{11}(t)$, $p_{12}(t)$, and $p_{22}(t)$ (Figure 4.2), functions $g_1(t)$ and $g_2(t)$
(Figure 4.3), optimal states (Figure 4.4) and control (Figure 4.5) for
initial conditions $x(0) = [-0.5 \;\; 0]'$ and the final time $t_f = 20$ are
obtained using MATLAB© routines given in Appendix C under
continuous-time tracking system.
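The computation behind Figures 4.2 to 4.5 can be sketched without any toolbox. The Python fragment below is our own illustration and not the Appendix C routine: it integrates the five scalar equations (4.1.45) and (4.1.47) backward from $t_f = 20$ with a classical Runge-Kutta step, then runs the plant (4.1.38) forward under the control (4.1.42). The terminal values P(tf) = diag(2, 0) and g(tf) = [2, 0]' (from F = Q and z = [1, 0]'), the initial state [-0.5, 0]', the step count, and all function names are assumptions of this sketch.

```python
R_INV = 250.0   # 1/r with r = 0.004, as identified in (4.1.40)
TF = 20.0       # specified final time


def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y); h may be negative."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
            for yi, d1, d2, d3, d4 in zip(y, k1, k2, k3, k4)]


def coefficient_rates(t, y):
    """Right-hand sides of (4.1.45) and (4.1.47); y = [p11, p12, p22, g1, g2]."""
    p11, p12, p22, g1, g2 = y
    return [R_INV * p12 * p12 + 4.0 * p12 - 2.0,
            R_INV * p12 * p22 - p11 + 3.0 * p12 + 2.0 * p22,
            R_INV * p22 * p22 - 2.0 * p12 + 6.0 * p22,
            (R_INV * p12 + 2.0) * g2 - 2.0,
            -g1 + (3.0 + R_INV * p22) * g2]


def solve_example_4_1(n=2000, x0=(-0.5, 0.0)):
    """Return (times, coefficients, states) on a uniform grid of n steps."""
    h = TF / (2 * n)
    y = [2.0, 0.0, 0.0, 2.0, 0.0]          # assumed P(tf) and g(tf), see lead-in
    table = [y]
    for k in range(2 * n):                  # backward sweep from tf to 0
        y = rk4_step(coefficient_rates, TF - k * h, y, -h)
        table.append(y)
    table.reverse()                         # table[i] holds t = i * h

    def state_rates(i, x):
        _, p12, p22, _, g2 = table[i]
        u = -R_INV * (p12 * x[0] + p22 * x[1] - g2)    # control law (4.1.42)
        return [x[1], -2.0 * x[0] - 3.0 * x[1] + u]    # plant (4.1.38)

    big = 2 * h                             # forward step; midpoints lie on the grid
    x = list(x0)
    states = [x]
    for k in range(n):
        i = 2 * k
        k1 = state_rates(i, x)
        k2 = state_rates(i + 1, [xi + h * d for xi, d in zip(x, k1)])
        k3 = state_rates(i + 1, [xi + h * d for xi, d in zip(x, k2)])
        k4 = state_rates(i + 2, [xi + big * d for xi, d in zip(x, k3)])
        x = [xi + big / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
             for xi, d1, d2, d3, d4 in zip(x, k1, k2, k3, k4)]
        states.append(x)
    times = [k * big for k in range(n + 1)]
    return times, table[::2], states
```

Calling `times, coeffs, states = solve_example_4_1()` gives lists that can be plotted against `times`. Away from the two ends of the interval the coefficients should be nearly constant, and `states[k][0]` should settle slightly below the reference value 1, which is the expected effect of a nonzero control weight.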
Example 4.2
Consider the same Example 4.1 with a different PI as

$$J = \int_{0}^{t_f}\left\{\left[2t - x_1(t)\right]^2 + 0.02u^2(t)\right\}dt \tag{4.1.49}$$

where $t_f$ is specified and $x(t_f)$ is free. Find the optimal control
in order that the state $x_1(t)$ track a ramp function $z_1(t) = 2t$
without much expenditure of control energy. Plot all the variables
(Riccati coefficients, optimal states and control) for initial
conditions $x(0) = [1 \;\; 0]'$.
Solution: The performance index (4.1.49) indicates that the state
$x_1(t)$ is to be kept close to the reference input $z_1(t) = 2t$ and, since
there is no condition on state $x_2(t)$, one can choose arbitrarily
$z_2(t) = 0$. Now, in our present case, with e(t) = z(t) - Cx(t),
we have $e_1(t) = z_1(t) - x_1(t)$, $e_2(t) = z_2(t) - x_2(t)$ and C = I.
Figure 4.2 Riccati Coefficients for Example 4.1 [plot of p11(t), p12(t), and p22(t)]
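Next, let us identify the various matrices in the present tracking
system. Comparing the state (4.1.38) and the PI (4.1.49) of the
present problem (note the absence of the factor 1/2 in the PI) with
the corresponding state (4.1.29) and the PI (4.1.30), respectively,
of the general formulation of the problem, we get

$$A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad C = I; \quad z(t) = \begin{bmatrix} 2t \\ 0 \end{bmatrix}; \quad Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}; \quad R = r = 0.04. \tag{4.1.50}$$

Let P(t) be the 2×2 symmetric matrix and g(t) be the 2×1 vector
as

$$P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}; \qquad g(t) = \begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix}. \tag{4.1.51}$$

Then, the optimal control given by (4.1.31) becomes

$$u^*(t) = -25\left[p_{12}(t)x_1^*(t) + p_{22}(t)x_2^*(t) - g_2(t)\right] \tag{4.1.52}$$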
where P(t), the positive definite matrix, is the solution of the
matrix differential Riccati equation (4.1.32)

$$\begin{bmatrix} \dot{p}_{11}(t) & \dot{p}_{12}(t) \\ \dot{p}_{12}(t) & \dot{p}_{22}(t) \end{bmatrix} = -\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & -3 \end{bmatrix}\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} + \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.04}\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.53}$$

and g(t) is the solution of the nonhomogeneous vector differential
equation obtained from (4.1.34) as

$$\begin{bmatrix} \dot{g}_1(t) \\ \dot{g}_2(t) \end{bmatrix} = -\left\{\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.04}\begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}\right\}'\begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 2t \\ 0 \end{bmatrix}. \tag{4.1.54}$$

Figure 4.3 Coefficients g1(t) and g2(t) for Example 4.1 [plot against time, 0 to 20]
Figure 4.4 Optimal States for Example 4.1 [plot against time, 0 to 20]
Figure 4.5 Optimal Control for Example 4.1 [plot against time, 0 to 20]
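Simplifying the equations (4.1.53) and (4.1.54), we get

$$\begin{aligned}
\dot{p}_{11}(t) &= 25p_{12}^2(t) + 4p_{12}(t) - 2 \\
\dot{p}_{12}(t) &= 25p_{12}(t)p_{22}(t) - p_{11}(t) + 3p_{12}(t) + 2p_{22}(t) \\
\dot{p}_{22}(t) &= 25p_{22}^2(t) - 2p_{12}(t) + 6p_{22}(t)
\end{aligned} \tag{4.1.55}$$

with final condition (4.1.33) as

$$\begin{bmatrix} p_{11}(t_f) & p_{12}(t_f) \\ p_{12}(t_f) & p_{22}(t_f) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.56}$$

and

$$\begin{aligned}
\dot{g}_1(t) &= \left[25p_{12}(t) + 2\right]g_2(t) - 4t \\
\dot{g}_2(t) &= -g_1(t) + \left[3 + 25p_{22}(t)\right]g_2(t)
\end{aligned} \tag{4.1.57}$$

with final condition

$$\begin{bmatrix} g_1(t_f) \\ g_2(t_f) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{4.1.58}$$

See the plots of the Riccati coefficients $p_{11}(t)$, $p_{12}(t)$ and $p_{22}(t)$ in
Figure 4.6 and coefficients $g_1(t)$ and $g_2(t)$ in Figure 4.7. Also see
the plots of the optimal control u*(t) in Figure 4.9 and optimal
states $x_1^*(t)$ and $x_2^*(t)$ in Figure 4.8.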
4.2 LQT System: Infinite-Time Case
In Chapter 3, in the case of the linear quadratic regulator system, we
extended the results of the finite-time case to the infinite-time (limiting or
steady-state) case. Similarly, we now extend the results of the finite-time case of
the linear quadratic tracking system to the case of infinite time [3].
Thus, we restrict our treatment to time-invariant matrices in the plant
and the performance index. Consider a linear time-invariant plant as

$$\dot{x}(t) = Ax(t) + Bu(t) \tag{4.2.1}$$

$$y(t) = Cx(t). \tag{4.2.2}$$

The error is

$$e(t) = z(t) - y(t), \tag{4.2.3}$$

and choose the performance index as

$$\lim_{t_f \to \infty} J = \lim_{t_f \to \infty} \frac{1}{2}\int_{0}^{\infty}\left[e'(t)Qe(t) + u'(t)Ru(t)\right]dt \tag{4.2.4}$$

to track the desired signal z(t). Also, we assume that Q is an m×m
symmetric, positive semidefinite matrix, and R is an r×r symmetric,
positive definite matrix. Note that there is no terminal cost function
in the PI (4.2.4) and hence F = 0.

Figure 4.6 Riccati Coefficients for Example 4.2 [plot of p11(t), p12(t), and p22(t)]
An obvious way of getting results for the infinite-time (steady-state) case
is to write down the results of the finite-time case and then simply let
$t_f \to \infty$. Thus, as $t_f \to \infty$, the matrix function P(t) in (4.1.19) tends
to the steady-state value P as the solution of

$$-PA - A'P + PBR^{-1}B'P - C'QC = 0. \tag{4.2.5}$$

Also, the vector function g(t) in (4.1.20) tends to a finite function g(t)
as the solution of

$$\dot{g}(t) = \left[PE - A'\right]g(t) - Wz(t) \tag{4.2.6}$$

where $E = BR^{-1}B'$ and $W = C'Q$. The optimal control (4.1.31)
becomes

$$u(t) = -R^{-1}B'\left[Px(t) - g(t)\right]. \tag{4.2.7}$$

Further details on this are available in Anderson and Moore [3].
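As a worked illustration of (4.2.5), consider the data of Example 4.1 (A and B as in (4.1.40), C = I, Q = diag(2, 0), R = 0.004). The arithmetic below is our own and the decimals are rounded; it simply writes (4.2.5) element by element and takes the roots that make P positive definite.

```latex
\begin{aligned}
250\,p_{12}^2 + 4\,p_{12} - 2 &= 0
  &&\Longrightarrow\quad p_{12} = \frac{-4 + \sqrt{2016}}{500} \approx 0.0818, \\
250\,p_{22}^2 + 6\,p_{22} - 2\,p_{12} &= 0
  &&\Longrightarrow\quad p_{22} = \frac{-6 + \sqrt{36 + 2000\,p_{12}}}{500} \approx 0.0163, \\
p_{11} &= 250\,p_{12}\,p_{22} + 3\,p_{12} + 2\,p_{22}
  &&\approx 0.610 .
\end{aligned}
```

With these values $p_{11}p_{22} - p_{12}^2 > 0$ and $p_{11} > 0$, so the steady-state P is positive definite, and the closed-loop matrix A - EP has characteristic polynomial $s^2 + (3 + 250p_{22})s + (2 + 250p_{12})$, whose roots lie in the left half plane.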
Figure 4.7 Coefficients g1(t) and g2(t) for Example 4.2 [plot against time, 0 to 20]
Figure 4.8 Optimal Control and States for Example 4.2 [plot against time, 0 to 20]
Figure 4.9 Optimal Control and States for Example 4.2 [plot against time, 0 to 20]
4.3 Fixed-End-Point Regulator System
In this section, we discuss the fixed-end-point state regulator system,
where the final state $x(t_f)$ is zero and the final time $t_f$ is fixed [5]. This
is different from the conventional free-end-point state regulator system
with the final time $t_f$ being free, leading to the matrix Riccati differential
equation that was discussed in Chapter 3. Following the procedure
similar to the free-end-point system, we will arrive at the same matrix
differential Riccati equation (3.2.18). But, if we use the earlier
transformation (3.2.11) to find the corresponding boundary condition for
(3.2.18), we see that

$$\lambda^*(t_f) = P(t_f)x^*(t_f) \tag{4.3.1}$$

and for the fixed final condition $x(t_f) = 0$, and for arbitrary $\lambda(t_f)$, we
have

$$P(t_f) = \infty. \tag{4.3.2}$$

This means that for the fixed-end-point regulator system, we solve the
matrix DRE (3.2.18) using the final condition (4.3.2). In practice, we
may start with a very large value of $P(t_f)$ instead of $\infty$.
Alternatively, we present a different procedure to find the closed-loop
optimal control for the fixed-end-point system [5]. In fact, we will use
what is known as the inverse Riccati transformation between the state and
costate variables and arrive at the matrix inverse Riccati equation.
As before, consider a linear, time-varying system

$$\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{4.3.3}$$

with a cost functional

$$J(u) = \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt \tag{4.3.4}$$

where x(t) is the nth order state vector, u(t) is the rth order control vector, A(t) is the n×n
state matrix, and B(t) is the n×r control matrix. We assume that the
control is unconstrained. The boundary conditions are given as

$$x(t = t_0) = x_0; \qquad x(t = t_f) = 0 \tag{4.3.5}$$

where $t_f$ is fixed or given a priori. Here, we can easily see that for a fixed
final condition, there is no meaning in having a terminal cost term in
the cost function (4.3.4).
We develop the procedure for the fixed-end-point regulator system under
the following steps (see Table 2.1).
• Step 1: Hamiltonian
• Step 2: Optimal Control
• Step 3: State and Costate System
• Step 4: Closed-Loop Optimal Control
• Step 5: Boundary Conditions
Now, we address these steps in detail.
• Step 1: Hamiltonian: Formulate the Hamiltonian as

  $$\mathcal{H}(x(t), u(t), \lambda(t)) = \frac{1}{2}x'(t)Q(t)x(t) + \frac{1}{2}u'(t)R(t)u(t) + \lambda'(t)\left[A(t)x(t) + B(t)u(t)\right]. \tag{4.3.6}$$

• Step 2: Optimal Control: Taking the partial of $\mathcal{H}$ w.r.t. u, we
  have

  $$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; R(t)u(t) + B'(t)\lambda(t) = 0 \tag{4.3.7}$$

  which gives the optimal control u*(t) as

  $$u^*(t) = -R^{-1}(t)B'(t)\lambda^*(t). \tag{4.3.8}$$
• Step 3: State and Costate System: Obtain the state and costate
  equations as

  $$\dot{x}^*(t) = +\frac{\partial \mathcal{H}}{\partial \lambda} \;\longrightarrow\; \dot{x}^*(t) = A(t)x^*(t) + B(t)u^*(t), \tag{4.3.9}$$

  $$\dot{\lambda}^*(t) = -\frac{\partial \mathcal{H}}{\partial x} \;\longrightarrow\; \dot{\lambda}^*(t) = -Q(t)x^*(t) - A'(t)\lambda^*(t). \tag{4.3.10}$$

  Eliminating the control u*(t) from (4.3.8) and (4.3.9), we obtain the
  canonical system of equations

  $$\begin{bmatrix} \dot{x}^*(t) \\ \dot{\lambda}^*(t) \end{bmatrix} = \begin{bmatrix} A(t) & -E(t) \\ -Q(t) & -A'(t) \end{bmatrix}\begin{bmatrix} x^*(t) \\ \lambda^*(t) \end{bmatrix} \tag{4.3.11}$$

  where $E(t) = B(t)R^{-1}(t)B'(t)$. This state and costate system,
  along with the given boundary conditions (4.3.5), constitutes a
  two-point boundary value problem (TPBVP), which when solved
  gives the optimal state x*(t) and costate $\lambda^*(t)$ functions. This
  optimal costate function $\lambda^*(t)$ substituted in (4.3.8) gives the optimal
  control u*(t). This leads us to open-loop optimal control as
  discussed in Chapter 2. But our interest here is to obtain closed-loop
  optimal control for the fixed-end-point regulator system.
• Step 4: Closed-Loop Optimal Control: Now if this were a
  free-end-point system ($x(t_f)$ free), using transversality conditions, we
  would be able to obtain a final condition on the costate $\lambda(t_f)$, which
  lets us assume a Riccati transformation between the state and
  costate function as

  $$\lambda^*(t) = P(t)x^*(t). \tag{4.3.12}$$

  In the absence of any knowledge on the final condition of the
  costate function $\lambda^*(t)$, we are led to assume a kind of inverse
  Riccati transformation as [104, 113]

  $$x^*(t) = M(t)\lambda^*(t) \tag{4.3.13}$$

  where the n×n matrix M(t) is yet to be determined. Note the
  difference between the transformations (4.3.12) and (4.3.13). Now
  as before in the case of the free-end-point system, by simple
  manipulation of the state and costate system (4.3.11) and (4.3.13) (that
  is, eliminating x*(t)), we obtain

  $$\dot{x}^*(t) = \dot{M}(t)\lambda^*(t) + M(t)\dot{\lambda}^*(t) \;\longrightarrow\; \left[\dot{M}(t) - A(t)M(t) - M(t)A'(t) - M(t)Q(t)M(t) + B(t)R^{-1}(t)B'(t)\right]\lambda^*(t) = 0. \tag{4.3.14}$$

  Now, if the previous equation should be valid for all $t \in [t_0, t_f]$
  and for any arbitrary $\lambda^*(t)$, we then have

  $$\dot{M}(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^{-1}(t)B'(t). \tag{4.3.15}$$

  Let us call this the inverse matrix differential Riccati equation
  (DRE) just to distinguish it from the normal DRE (3.2.34).
• Step 5: Boundary Conditions: Now the boundary condition for
  (4.3.15) is obtained as follows. Here, we have different cases to
  be discussed.

  1. $x(t_f) = 0$ and $x(t_0) \neq 0$: We know from the given boundary
     conditions (4.3.5) that $x(t_f) = 0$ and using this in (4.3.13),
     we get

     $$x(t_f) = 0 = M(t_f)\lambda(t_f). \tag{4.3.16}$$

     For arbitrary $\lambda(t_f)$, (4.3.16) becomes

     $$M(t_f) = 0. \tag{4.3.17}$$

  2. $x(t_f) \neq 0$ and $x(t_0) = 0$: Here, at $t = t_0$, (4.3.13) becomes

     $$x(t_0) = 0 = M(t_0)\lambda(t_0) \tag{4.3.18}$$

     and for arbitrary $\lambda(t_0)$, (4.3.18) becomes

     $$M(t_0) = 0. \tag{4.3.19}$$

     Thus, we solve the inverse matrix DRE (4.3.15) backward
     using the final condition (4.3.17) or forward using the initial
     condition (4.3.19).
     The optimal control (4.3.8) with the transformation (4.3.13)
     becomes

     $$u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)x^*(t). \tag{4.3.20}$$

  3. General Boundary Conditions: $x(t_0) \neq 0$ and $x(t_f) \neq 0$.
     Here, both the given boundary conditions are not zero, and
     we assume a transformation as

     $$x^*(t) = M(t)\lambda^*(t) + v(t). \tag{4.3.21}$$

     As before, we substitute the transformation (4.3.21) in the
     state and costate system (4.3.11) and eliminate x*(t) to get

     $$\dot{x}^*(t) = \dot{M}(t)\lambda^*(t) + M(t)\dot{\lambda}^*(t) + \dot{v}(t) \tag{4.3.22}$$

     leading to

     $$A(t)\left[M(t)\lambda^*(t) + v(t)\right] - B(t)R^{-1}(t)B'(t)\lambda^*(t) = \dot{M}(t)\lambda^*(t) + M(t)\left[-Q(t)\left[M(t)\lambda^*(t) + v(t)\right] - A'(t)\lambda^*(t)\right] + \dot{v}(t) \tag{4.3.23}$$

     further leading to

     $$\left[\dot{M}(t) - A(t)M(t) - M(t)A'(t) - M(t)Q(t)M(t) + B(t)R^{-1}(t)B'(t)\right]\lambda^*(t) + \left[\dot{v}(t) - M(t)Q(t)v(t) - A(t)v(t)\right] = 0. \tag{4.3.24}$$

     This should be valid for any arbitrary value of $\lambda^*(t)$, which
     leads us to a set of equations

     $$\dot{M}(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^{-1}(t)B'(t) \tag{4.3.25}$$

     $$\dot{v}(t) = M(t)Q(t)v(t) + A(t)v(t). \tag{4.3.26}$$

     At $t = t_0$, (4.3.21) becomes

     $$x^*(t_0) = M(t_0)\lambda^*(t_0) + v(t_0). \tag{4.3.27}$$

     Since $\lambda^*(t_0)$ is arbitrary, (4.3.27) gives us

     $$M(t_0) = 0; \qquad v(t_0) = x(t_0). \tag{4.3.28}$$

     At $t = t_f$, (4.3.21) becomes

     $$x^*(t_f) = M(t_f)\lambda^*(t_f) + v(t_f). \tag{4.3.29}$$

     Again, since $\lambda^*(t_f)$ is arbitrary, (4.3.29) becomes

     $$M(t_f) = 0; \qquad v(t_f) = x(t_f). \tag{4.3.30}$$

     Thus, the set of the equations (4.3.25) and (4.3.26) are solved
     using either the initial conditions (4.3.28) or final conditions
     (4.3.30).

  Finally, using the transformation (4.3.21) in the optimal control
  (4.3.8), the closed-loop optimal control is given by

  $$u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)\left[x^*(t) - v(t)\right] \tag{4.3.31}$$

  where it is assumed that M(t) is invertible.
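The name "inverse" DRE can be motivated by a short calculation. The following is our own check, under the assumption that the ordinary Riccati matrix P(t) is invertible on the interval of interest and satisfies the regulator DRE in the form $\dot{P} = -PA - A'P + PEP - Q$ (the C = I case of (4.1.19)); time arguments are dropped for brevity.

```latex
\begin{aligned}
M &= P^{-1}
\;\Longrightarrow\;
\dot{M} = -P^{-1}\,\dot{P}\,P^{-1}
        = -M\left[-PA - A'P + PEP - Q\right]M \\
  &= (MP)\,AM + MA'\,(PM) - (MP)\,E\,(PM) + MQM \\
  &= AM + MA' + MQM - BR^{-1}B' ,
\end{aligned}
```

which is (4.3.15), using MP = PM = I and $E = BR^{-1}B'$. Under this reading, the final condition M(tf) = 0 in (4.3.17) is the counterpart of the unbounded final value in (4.3.2).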
Now to illustrate the previous method and to be able to get analytical
solutions, we present a first order example.
Example 4.3
Given the plant as

$$\dot{x}(t) = ax(t) + bu(t), \tag{4.3.32}$$

and the performance index as

$$J = \frac{1}{2}\int_{t_0}^{t_f}\left[qx^2(t) + ru^2(t)\right]dt, \tag{4.3.33}$$

and the boundary conditions as

$$x(t = 0) = x_0; \qquad x(t = t_f) = 0, \tag{4.3.34}$$

find the closed-loop optimal control.
Solution: Follow the procedure of the inverse matrix DRE
described in the last section. We see that with the boundary
conditions (4.3.34), we need to use the scalar version of the inverse
matrix DRE (4.3.15) having the boundary condition (4.3.17). The
optimal control (4.3.20) is given by

$$u^*(t) = -r^{-1}b\,m^{-1}(t)x^*(t) \tag{4.3.35}$$

where m(t) is the solution of the scalar DRE (4.3.15)

$$\dot{m}(t) = 2am(t) + m^2(t)q - \frac{b^2}{r} \tag{4.3.36}$$

with the boundary condition (4.3.17) as $m(t_f) = 0$. Solving (4.3.36)
with this boundary condition, we get
(4.3.37)
where $\beta = \sqrt{a^2 + q\,b^2/r}$. Then the optimal control (4.3.35) becomes
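A numerical sketch may help in checking a candidate closed-form solution of (4.3.36). The Python fragment below is our own illustration: the expression in `m_closed_form`, written with hyperbolic functions of the time-to-go tf - t, is our derivation and may differ in form from the solution intended in (4.3.37); `m_numeric` integrates (4.3.36) backward from m(tf) = 0 with a classical Runge-Kutta step so that the two can be compared. All function names and the step count are hypothetical.

```python
import math


def m_closed_form(t, a, b, q, r, tf):
    """Candidate solution of (4.3.36) with m(tf) = 0 (our own derivation)."""
    beta = math.sqrt(a * a + q * b * b / r)
    tau = tf - t                                   # time-to-go
    return (b * b / r) * math.sinh(beta * tau) / (
        beta * math.cosh(beta * tau) + a * math.sinh(beta * tau))


def m_numeric(t, a, b, q, r, tf, n=2000):
    """Integrate (4.3.36) backward from m(tf) = 0 to time t with RK4."""
    def rate(m):
        return 2.0 * a * m + q * m * m - b * b / r  # right-hand side of (4.3.36)

    h = (t - tf) / n                               # negative step: backward in time
    m = 0.0                                        # boundary condition (4.3.17)
    for _ in range(n):
        k1 = rate(m)
        k2 = rate(m + h / 2 * k1)
        k3 = rate(m + h / 2 * k2)
        k4 = rate(m + h * k3)
        m += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return m


def feedback_gain(t, a, b, q, r, tf):
    """Gain k(t) in u*(t) = -k(t) x*(t), from (4.3.35); valid only for t < tf."""
    return b / (r * m_closed_form(t, a, b, q, r, tf))
```

For example, with a = 1, b = 1, q = 2, r = 0.5 and tf = 3, `m_closed_form(1.0, ...)` and `m_numeric(1.0, ...)` should agree to many digits. Because m(t) tends to zero as t approaches tf, the gain returned by `feedback_gain` grows without bound near the final time, which is how the feedback law forces the state to the prescribed end point.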
4.4 LQR with a Specified Degree of Stability
In this section, we examine the state regulator system with infinite-time interval and with a specified degree of stability for a time-invariant system [3, 2]. Let us consider a linear time-invariant plant as
$$\dot{x}(t) = Ax(t) + Bu(t); \quad x(t = t_0) = x(0), \tag{4.4.1}$$
and the cost functional as
$$J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}[x'(t)Qx(t) + u'(t)Ru(t)]\,dt \tag{4.4.2}$$
where $\alpha$ is a positive parameter. Here, we first assume that the pair $[A + \alpha I, B]$ is completely stabilizable and $R$ and $Q$ are constant, symmetric, positive definite and positive semidefinite matrices, respectively. The problem is to find the optimal control which minimizes the performance index (4.4.2) under the dynamical constraint (4.4.1).
This can be solved by modifying the previous system to fit into the standard infinite-time regulator system discussed earlier in Chapter 3. Thus, we make the following transformations
$$\hat{x}(t) = e^{\alpha t}x(t); \quad \hat{u}(t) = e^{\alpha t}u(t). \tag{4.4.3}$$
Then, using the transformations (4.4.3), it is easy to see that the modified system becomes
$$\dot{\hat{x}}(t) = \frac{d}{dt}\{e^{\alpha t}x(t)\} = \alpha e^{\alpha t}x(t) + e^{\alpha t}\dot{x}(t) = \alpha\hat{x}(t) + e^{\alpha t}[Ax(t) + Bu(t)]$$
$$\dot{\hat{x}}(t) = (A + \alpha I)\hat{x}(t) + B\hat{u}(t). \tag{4.4.4}$$
We note that the initial conditions for the original system (4.4.1) and the modified system (4.4.4) are simply related as $\hat{x}(t_0) = e^{\alpha t_0}x(t_0)$ and, in particular, if $t_0 = 0$ the initial conditions are the same for the original and modified systems. Also, using the transformations (4.4.3), the original performance measure (4.4.2) can be modified to
$$\hat{J} = \frac{1}{2}\int_{t_0}^{\infty}[\hat{x}'(t)Q\hat{x}(t) + \hat{u}'(t)R\hat{u}(t)]\,dt. \tag{4.4.5}$$
Considering the minimization of the modified system defined by (4.4.4) and (4.4.5), we see that the optimal control is given by (see Chapter 3, Table 3.3)
$$\hat{u}^*(t) = -R^{-1}B'P\hat{x}^*(t) = -K\hat{x}^*(t) \tag{4.4.6}$$
where $K = R^{-1}B'P$ and the matrix $P$ is the positive definite, symmetric solution of the algebraic Riccati equation
$$P(A + \alpha I) + (A' + \alpha I)P - PBR^{-1}B'P + Q = 0. \tag{4.4.7}$$
Using the optimal control (4.4.6) in the modified system (4.4.4), we get the optimal closed-loop system as
$$\dot{\hat{x}}^*(t) = (A + \alpha I - BR^{-1}B'P)\hat{x}^*(t). \tag{4.4.8}$$
Now, we can simply apply these results to the original system. Thus, using the transformations (4.4.3) in (4.4.6), the optimal control of the original system (4.4.1) with the associated performance measure (4.4.2) is given by
$$u^*(t) = e^{-\alpha t}\hat{u}^*(t) = -e^{-\alpha t}R^{-1}B'Pe^{\alpha t}x^*(t) = -R^{-1}B'Px^*(t) = -Kx^*(t). \tag{4.4.9}$$
Interestingly, this desired (original) optimal control (4.4.9) has the same structure as the optimal control (4.4.6) of the modified system. The optimal performance index for the original system or the modified system is the same and equals
$$\hat{J}^* = \frac{1}{2}\hat{x}^{*\prime}(t_0)P\hat{x}^*(t_0) = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)Px^*(t_0) = J^*. \tag{4.4.10}$$
Since the modified closed-loop system (4.4.8) is asymptotically stable, the eigenvalues of $A + \alpha I - BR^{-1}B'P$ have negative real parts, and hence the eigenvalues of the original closed-loop matrix $A - BR^{-1}B'P$ have real parts less than $-\alpha$. In other words, the state $x^*(t)$ approaches zero at least as fast as $e^{-\alpha t}$. Then, we say that the closed-loop optimal system has a degree of stability of at least $\alpha$.
4.4.1 Regulator System with Prescribed Degree of Stability: Summary
For a controllable, linear, time-invariant plant
$$\dot{x}(t) = Ax(t) + Bu(t), \tag{4.4.11}$$
and the infinite-interval cost functional
$$J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}[x'(t)Qx(t) + u'(t)Ru(t)]\,dt, \tag{4.4.12}$$
the optimal control is given by
$$u^*(t) = -R^{-1}B'Px^*(t) = -Kx^*(t) \tag{4.4.13}$$
where $K = R^{-1}B'P$ and $P$, the $n \times n$ constant, positive definite, symmetric matrix, is the solution of the nonlinear, matrix algebraic Riccati equation (ARE)
$$P(A + \alpha I) + (A' + \alpha I)P - PBR^{-1}B'P + Q = 0, \tag{4.4.14}$$
the optimal trajectory is the solution of
$$\dot{x}^*(t) = (A - BR^{-1}B'P)x^*(t), \tag{4.4.15}$$
and the optimal cost is given by
$$J^* = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)Px^*(t_0). \tag{4.4.16}$$
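The cost formula (4.4.16) can be checked numerically in the scalar case. The sketch below is a minimal illustration under assumed values ($a = -1$, $b = q = r = 1$, $\alpha = 2$, $t_0 = 0$, $x_0 = 1$): it takes the positive root of the scalar form of (4.4.14), simulates the closed-loop state analytically, and integrates the weighted cost (4.4.12) with the trapezoidal rule. The function names, the finite horizon standing in for infinity, and the step count are assumptions made for the illustration.

```python
import math


def scalar_alpha_lqr(a, b, q, r, alpha):
    """Positive root p of 2*p*(a + alpha) - (b**2 / r)*p**2 + q = 0, and gain k = b*p/r."""
    shifted = a + alpha
    p = (r / b**2) * (shifted + math.sqrt(shifted**2 + q * b**2 / r))
    return p, b * p / r


def cost_by_quadrature(a, b, q, r, alpha, x0, horizon=40.0, steps=200000):
    """Trapezoidal estimate of (1/2) * integral of e^{2 alpha t} (q x^2 + r u^2), t0 = 0."""
    _, k = scalar_alpha_lqr(a, b, q, r, alpha)
    closed_loop = a - b * k  # x*(t) = x0 * exp(closed_loop * t)

    def integrand(t):
        x = x0 * math.exp(closed_loop * t)
        u = -k * x
        return 0.5 * math.exp(2.0 * alpha * t) * (q * x * x + r * u * u)

    h = horizon / steps
    total = 0.5 * (integrand(0.0) + integrand(horizon))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


a, b, q, r, alpha, x0 = -1.0, 1.0, 1.0, 1.0, 2.0, 1.0
p, k = scalar_alpha_lqr(a, b, q, r, alpha)
print("J* from (4.4.16):", 0.5 * p * x0**2)
print("J* by quadrature:", cost_by_quadrature(a, b, q, r, alpha, x0))
```

The integrand stays bounded despite the growing weight $e^{2\alpha t}$ because the closed-loop state decays faster than $e^{-\alpha t}$, which is the degree-of-stability property itself.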
The entire procedure is now summarized in Table 4.2. Consider a first-order system example to illustrate the previous method.
Example 4.4
Consider a first-order system
$$\dot{x}(t) = -x(t) + u(t), \quad x(0) = 1 \tag{4.4.17}$$
and a performance measure
$$J = \frac{1}{2}\int_{0}^{\infty} e^{2\alpha t}[x^2(t) + u^2(t)]\,dt. \tag{4.4.18}$$
Find the optimal control law and show that the closed-loop optimal system has a degree of stability of at least $\alpha$.
Table 4.2 Procedure Summary of Regulator System with Prescribed Degree of Stability

A. Statement of the Problem
Given the plant as
$$\dot{x}(t) = Ax(t) + Bu(t),$$
the performance index as
$$J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}[x'(t)Qx(t) + u'(t)Ru(t)]\,dt,$$
and the boundary conditions as
$$x(t_0) = x_0; \quad x(\infty) = 0,$$
find the optimal control, state and index.

B. Solution of the Problem
Step 1 Solve the matrix algebraic Riccati equation
$$P(A + \alpha I) + (A' + \alpha I)P + Q - PBR^{-1}B'P = 0.$$
Step 2 Solve the optimal state $x^*(t)$ from
$$\dot{x}^*(t) = (A - BR^{-1}B'P)x^*(t)$$
with initial condition $x(t_0) = x_0$.
Step 3 Obtain the optimal control $u^*(t)$ from
$$u^*(t) = -R^{-1}B'Px^*(t).$$
Step 4 Obtain the optimal performance index from
$$J^* = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)Px^*(t_0).$$
Solution: Essentially, we show that the eigenvalue of this closed-loop optimal system is less than $-\alpha$. First of all, in the above, we note that $A = a = -1$, $B = b = 1$, $Q = q = 1$ and $R = r = 1$. Then, the algebraic Riccati equation (4.4.14) becomes
$$2p(\alpha - 1) - p^2 + 1 = 0 \;\longrightarrow\; p^2 - 2p(\alpha - 1) - 1 = 0. \tag{4.4.19}$$
Solving the previous equation for the positive value of $p$, we have
$$p = -1 + \alpha + \sqrt{(\alpha - 1)^2 + 1}. \tag{4.4.20}$$
The optimal control (4.4.13) becomes
$$u^*(t) = -px^*(t). \tag{4.4.21}$$
The optimal system (4.4.15) becomes
$$\dot{x}^*(t) = \left(-\alpha - \sqrt{(\alpha - 1)^2 + 1}\right)x^*(t). \tag{4.4.22}$$
It is easy to see that the eigenvalue for the system (4.4.22) is related as
$$-\alpha - \sqrt{(\alpha - 1)^2 + 1} < -\alpha. \tag{4.4.23}$$
This shows the desired result that the optimal system has the eigenvalue less than $-\alpha$.
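The result of Example 4.4 can be spot-checked for several values of the stability parameter. The sketch below assumes the plant $\dot{x} = -x + u$ (that is, $a = -1$), which is the reading implied by the Riccati equation (4.4.19); the function name and the sampled values of $\alpha$ are illustrative choices.

```python
import math


def example_4_4(alpha):
    """Riccati root, ARE residual, and closed-loop eigenvalue for a = -1, b = q = r = 1."""
    a, b, q, r = -1.0, 1.0, 1.0, 1.0
    p = (alpha - 1.0) + math.sqrt((alpha - 1.0) ** 2 + 1.0)   # positive root of (4.4.19)
    residual = 2.0 * p * (a + alpha) - (b * b / r) * p * p + q  # scalar form of (4.4.14)
    eigenvalue = a - (b * b / r) * p                            # closed-loop pole a - b*k
    return p, residual, eigenvalue


for alpha in (0.0, 0.5, 2.0, 10.0):
    p, residual, eigenvalue = example_4_4(alpha)
    print(f"alpha={alpha:5.1f}  p={p:.4f}  residual={residual:.1e}  eigenvalue={eigenvalue:.4f}")
```

For every sampled $\alpha$ the closed-loop eigenvalue lies strictly to the left of $-\alpha$, so a larger $\alpha$ buys faster decay at the price of a larger feedback gain $p$.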
4.5 Frequency-Domain Interpretation
In this section, we use the frequency domain to derive some results from the classical control point of view for a linear, time-invariant, continuous-time, optimal control system with infinite-time horizon. For this, we know that the closed-loop optimal control involves the solution of the matrix algebraic Riccati equation [89, 3]. For ready reference, we repeat here some of the results of Chapter 3.
Consider a controllable, linear, time-invariant plant
$$\dot{x}(t) = Ax(t) + Bu(t), \tag{4.5.1}$$
and the infinite-time interval cost functional
$$J = \frac{1}{2}\int_{0}^{\infty}[x'(t)Qx(t) + u'(t)Ru(t)]\,dt. \tag{4.5.2}$$
The optimal control is given by
$$u^*(t) = -R^{-1}B'Px^*(t) = -Kx^*(t), \tag{4.5.3}$$
where $K = R^{-1}B'P$, and $P$, the $n \times n$ constant, positive definite, symmetric matrix, is the solution of the nonlinear, matrix ARE
$$-PA - A'P + PBR^{-1}B'P - Q = 0. \tag{4.5.4}$$
The optimal trajectory (state) is the solution of
$$\dot{x}^*(t) = (A - BR^{-1}B'P)x^*(t) = (A - BK)x^*(t), \tag{4.5.5}$$
which is asymptotically stable. Here, we assume that $[A, B]$ is stabilizable and $[A, \sqrt{Q}]$ is observable. Then, the open-loop characteristic polynomial of the system is [89]
$$\Delta_o(s) = |sI - A|, \tag{4.5.6}$$
where $s$ is the Laplace variable, and the optimal closed-loop characteristic polynomial is
$$\Delta_c(s) = |sI - A + BK| = |I + BK[sI - A]^{-1}| \cdot |sI - A| = |I + K[sI - A]^{-1}B|\,\Delta_o(s). \tag{4.5.7}$$
This is a relation between the open-loop $\Delta_o(s)$ and closed-loop $\Delta_c(s)$ characteristic polynomials. From Figure 4.10, we note that
1. $-K[sI - A]^{-1}B$ is called the loop gain matrix, and
2. $I + K[sI - A]^{-1}B$ is termed the return difference matrix.
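The determinant relation (4.5.7) holds for any gain, not only the optimal one, and can be confirmed numerically. The sketch below does so for a single-input, second-order system at one complex test frequency; the helper names and the sample matrices $A$, $b$, $k$ and test point $s$ are assumptions chosen for illustration.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def characteristic_values(A, b, k, s):
    """Return (open-loop, closed-loop, return difference) evaluated at the complex point s."""
    sI_A = [[s - A[0][0], -A[0][1]], [-A[1][0], s - A[1][1]]]
    open_loop = det2(sI_A)                                   # |sI - A|
    closed_loop = det2([[sI_A[i][j] + b[i] * k[j] for j in range(2)]
                        for i in range(2)])                  # |sI - A + bk|
    phi = inv2(sI_A)                                         # [sI - A]^{-1}
    phi_b = [phi[i][0] * b[0] + phi[i][1] * b[1] for i in range(2)]
    return_difference = 1 + k[0] * phi_b[0] + k[1] * phi_b[1]
    return open_loop, closed_loop, return_difference


A = [[0.0, 1.0], [-2.0, -3.0]]   # sample plant (assumed)
b = [0.0, 1.0]
k = [1.5, 0.7]                   # arbitrary gain (assumed)
s = complex(0.3, 1.1)            # test point away from the open-loop poles
open_loop, closed_loop, return_difference = characteristic_values(A, b, k, s)
print(closed_loop, return_difference * open_loop)
```

In the single-input case the return difference is a scalar, so the closed-loop polynomial is simply the open-loop polynomial scaled by it.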
Figure 4.10 Optimal Closed-Loop Control in Frequency Domain
To derive the desired factorization result, we use the matrix ARE (4.5.4). Let us rewrite the ARE as
$$-PA - A'P + PBR^{-1}B'P = Q. \tag{4.5.8}$$
First, adding and subtracting $sP$, $s = j\omega$, in the previous ARE, we get
$$P[sI - A] + [-sI - A']P + K'RK = Q. \tag{4.5.9}$$
Next, premultiplying by $B'\Phi'(-s)$ and postmultiplying by $\Phi(s)B$, the previous equation becomes
$$B'\Phi'(-s)PB + B'P\Phi(s)B + B'\Phi'(-s)K'RK\Phi(s)B = B'\Phi'(-s)Q\Phi(s)B \tag{4.5.10}$$
where we used
$$\Phi(s) = [sI - A]^{-1}; \quad \Phi'(-s) = [-sI - A']^{-1}. \tag{4.5.11}$$
Finally, using $K = R^{-1}B'P \rightarrow K' = PBR^{-1} \rightarrow PB = K'R$ and adding $R$ to both sides of (4.5.10), we have the desired factorization result as
$$B'\Phi'(-s)Q\Phi(s)B + R = [I + K\Phi(-s)B]'\,R\,[I + K\Phi(s)B] \tag{4.5.12}$$
or equivalently,
$$B'[-sI - A']^{-1}Q[sI - A]^{-1}B + R = [I + K[-sI - A]^{-1}B]'\,R\,[I + K[sI - A]^{-1}B]. \tag{4.5.13}$$
The previous relation is also called the Kalman equation in frequency domain.
4.5.1 Gain Margin and Phase Margin
We know that in classical control theory, gain and phase margins are important in evaluating the system performance with respect to robustness to plant parameter variations and uncertainties. The engineering specifications often place lower bounds on the phase and gain margins. Here, we interpret some of the classical control features such as gain margin and phase margin for the closed-loop optimal control system [3]. For ready reference, let us rewrite the return difference result (4.5.13) with $s = j\omega$ here as
$$B'[-j\omega I - A']^{-1}Q[j\omega I - A]^{-1}B + R = [I + K[-j\omega I - A]^{-1}B]'\,R\,[I + K[j\omega I - A]^{-1}B]. \tag{4.5.14}$$
The previous result can be viewed as
$$M(j\omega) = W'(-j\omega)W(j\omega) \tag{4.5.15}$$
where
$$W(j\omega) = R^{1/2}[I + K[j\omega I - A]^{-1}B]$$
$$M(j\omega) = R + B'[-j\omega I - A']^{-1}Q[j\omega I - A]^{-1}B. \tag{4.5.16}$$
Note that $M(j\omega) \geq R > 0$. Using $Q = CC'$, $R = DD' = I$ and the notation
$$W'(-j\omega)W(j\omega) = \|W(j\omega)\|^2, \tag{4.5.17}$$
the factorization result (4.5.14) can be written in neat form as
$$\|I + K[j\omega I - A]^{-1}B\|^2 = I + \|C'[j\omega I - A]^{-1}B\|^2. \tag{4.5.18}$$
This result can be used to find the optimal feedback matrix $K$ given the other quantities $A$, $B$, $Q$, $R = I$. Note that in (4.5.18), we need not solve for the Riccati coefficient matrix $P$; instead, we directly obtain the feedback matrix $K$.
In the single-input case, the various matrices become scalars or vectors as $B = b$, $R = r$, $K = k$. Then, the factorization result (4.5.14) boils down to
$$r + b'[-j\omega I - A']^{-1}Q[j\omega I - A]^{-1}b = r\,|1 + k[j\omega I - A]^{-1}b|^2. \tag{4.5.19}$$
In case $Q = cc'$, we can write (4.5.19) as
$$r + |c'[j\omega I - A]^{-1}b|^2 = r\,|1 + k[j\omega I - A]^{-1}b|^2. \tag{4.5.20}$$
The previous result may be called another version of the Kalman equation in frequency domain. The previous relation (also from (4.5.18) for a scalar case) implies that
$$|1 + k[j\omega I - A]^{-1}b| \geq 1. \tag{4.5.21}$$
Thus, the return difference is lower bounded by 1 for all $\omega$.
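The single-input Kalman equation (4.5.19) and the bound (4.5.21) can be sampled on a frequency grid. The sketch below uses, as an assumption, the double-integrator plant with $Q = I$, $r = 1$ and the gain $k = [1 \;\; \sqrt{3}]$ that is worked out in Example 4.5; the function name and the frequency samples are illustrative.

```python
import math


def kalman_sides(w, k11=1.0, k12=math.sqrt(3.0), r=1.0):
    """Both sides of (4.5.19) and the return-difference magnitude at frequency w."""
    s = complex(0.0, w)
    # For A = [[0, 1], [0, 0]] and b = [0, 1]': [sI - A]^{-1} b = [1/s^2, 1/s]'.
    phi_b = (1.0 / s**2, 1.0 / s)
    return_difference = 1.0 + k11 * phi_b[0] + k12 * phi_b[1]
    # With Q = I and real A, b'[-jwI - A']^{-1}[jwI - A]^{-1}b is the squared norm of phi_b.
    lhs = r + abs(phi_b[0]) ** 2 + abs(phi_b[1]) ** 2
    rhs = r * abs(return_difference) ** 2
    return lhs, rhs, abs(return_difference)


for w in (0.1, 1.0, 5.0, 50.0):
    lhs, rhs, magnitude = kalman_sides(w)
    print(f"w={w:6.1f}  lhs={lhs:.6f}  rhs={rhs:.6f}  |1 + Go(jw)|={magnitude:.6f}")
```

The return-difference magnitude tends to 1 from above as $\omega$ grows, so the bound (4.5.21) is tight only in the high-frequency limit for this plant.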
Example 4.5
Consider a simple example where we can verify the analytical solutions by another known method. Find the optimal feedback coefficients for the system
$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = u(t) \tag{4.5.22}$$
and the performance measure
$$J = \frac{1}{2}\int_{0}^{\infty}[x_1^2(t) + x_2^2(t) + u^2(t)]\,dt. \tag{4.5.23}$$
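Solution: First it is easy to identify the various matrices as
$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}; \quad B = b = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \quad R = r = 1. \tag{4.5.24}$$
Also, since $Q = I$ and $R = r = 1$, we have $C = I$ and $D = 1$. Thus, the Kalman equation (4.5.18) with $j\omega = s$ becomes
$$[1 + k[-sI - A]^{-1}b]\,[1 + k[sI - A]^{-1}b] = 1 + b'[-sI - A']^{-1}[sI - A]^{-1}b. \tag{4.5.25}$$
Further, we have
$$[sI - A]^{-1} = \begin{bmatrix} \dfrac{1}{s} & \dfrac{1}{s^2} \\ 0 & \dfrac{1}{s} \end{bmatrix}. \tag{4.5.26}$$
Then, the Kalman equation (4.5.25) becomes
$$\left[1 + \begin{bmatrix} k_{11} & k_{12} \end{bmatrix}\begin{bmatrix} -\dfrac{1}{s} & \dfrac{1}{s^2} \\ 0 & -\dfrac{1}{s} \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right]\left[1 + \begin{bmatrix} k_{11} & k_{12} \end{bmatrix}\begin{bmatrix} \dfrac{1}{s} & \dfrac{1}{s^2} \\ 0 & \dfrac{1}{s} \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right] = 1 + \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} -\dfrac{1}{s} & 0 \\ \dfrac{1}{s^2} & -\dfrac{1}{s} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} \dfrac{1}{s} & \dfrac{1}{s^2} \\ 0 & \dfrac{1}{s} \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{4.5.27}$$
By simple matrix multiplication and equating the coefficients of like powers of $s$ on either side, we get a set of algebraic equations in general, and in particular in this example we have a single scalar equation as
$$1 + \frac{2k_{11} - k_{12}^2}{s^2} + \frac{k_{11}^2}{s^4} = 1 - \frac{1}{s^2} + \frac{1}{s^4} \tag{4.5.28}$$
giving us
$$k_{11} = 1, \quad k_{12} = \sqrt{3} \tag{4.5.29}$$
and the optimal feedback control as
$$u^*(t) = -Kx^*(t) = -\begin{bmatrix} 1 & \sqrt{3} \end{bmatrix}x^*(t). \tag{4.5.30}$$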
Note: This example can be easily verified by using the algebraic Riccati equation (3.5.15) of Chapter 3,
$$PA + A'P - PBR^{-1}B'P + Q = 0. \tag{4.5.31}$$
Using the previous relation, we get
$$P = \begin{bmatrix} \sqrt{3} & 1 \\ 1 & \sqrt{3} \end{bmatrix} \tag{4.5.32}$$
and the optimal control (3.5.14) as
$$u^*(t) = -R^{-1}B'Px^*(t) = -\begin{bmatrix} 1 & \sqrt{3} \end{bmatrix}x^*(t) \tag{4.5.33}$$
which is the same as (4.5.30).
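The Riccati route in the note can be verified by direct substitution. The sketch below plugs a candidate solution $P$, worked out by hand for the double integrator with $Q = I$ and $R = 1$, into the ARE (4.5.31) and recovers the gain $K = R^{-1}B'P$. The helper names and the candidate matrix are assumptions of this illustration rather than values quoted from the text.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R_inv = 1.0                                   # R = r = 1
root3 = math.sqrt(3.0)
P = [[root3, 1.0], [1.0, root3]]              # candidate solution (hand-derived assumption)

PA = matmul(P, A)
AtP = matmul(transpose(A), P)
PBBtP = matmul(matmul(P, B), matmul(transpose(B), P))
residual = [[PA[i][j] + AtP[i][j] - R_inv * PBBtP[i][j] + Q[i][j] for j in range(2)]
            for i in range(2)]
K = [R_inv * v for v in matmul(transpose(B), P)[0]]

print("ARE residual:", residual)
print("K =", K)
```

Because $B = [0 \;\; 1]'$, the gain is just the second row of $P$, which is why only $p_{12}$ and $p_{22}$ matter for the feedback law.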
Let us redraw the closed-loop optimal control system in Figure 4.10 as the unity feedback system shown in Figure 4.11.
Figure 4.11 Closed-Loop Optimal Control System with Unity Feedback
Here, we can easily recognize that for a single-input, single-output case, the optimal feedback control system is exactly like a classical feedback control system with unity negative feedback and transfer function $G_o(s) = k[sI - A]^{-1}b$. Thus, the frequency-domain interpretation in terms of gain margin and phase margin can be easily done using Nyquist, Bode, or some other plot of the transfer function $G_o(s)$.
Gain Margin
We recall that the gain margin of a feedback control system is the amount by which the loop gain (usually in decibels) can be changed before the closed-loop system becomes unstable. Let us now apply the well-known Nyquist criterion to the unity feedback, optimal control system depicted in Figure 4.11. Here, we assume that the Nyquist path is clockwise (CW) and the corresponding Nyquist plot makes counter-clockwise (CCW) encirclements around the critical point $-1 + j0$. According to the Nyquist stability criterion, for closed-loop stability, the Nyquist plot (or diagram) makes CCW encirclements as many times as there are poles of the transfer function $G_o(s)$ lying in the right half of the $s$-plane.
From Figure 4.11 and the return difference relation (4.5.21), we note that
$$|1 + G_o(j\omega)| \geq 1 \tag{4.5.34}$$
implies that the distance between the critical point $-1 + j0$ and any point on the Nyquist plot is at least 1, and the resulting Nyquist plot is shown in Figure 4.12 for all positive values of $\omega$ (i.e., 0 to $\infty$).
Figure 4.12 Nyquist Plot of $G_o(j\omega)$
This means that the Nyquist plot of $G_o(j\omega)$ is constrained to avoid all the points inside the unit circle (centered at $-1 + j0$). Thus, it is clear that the closed-loop optimal system has infinite gain margin. Let us proceed further to see if there is a lower limit on the gain factor.
Now, if we multiply the open-loop gain by some constant factor $\beta$, the closed-loop system will be asymptotically stable if the Nyquist plot of $\beta G_o(j\omega)$ encircles $-1 + j0$ in the CCW direction as many times as there are poles of $\beta G_o(s)$ in the right-half plane. This means that the closed-loop system will be stable if the Nyquist diagram of $G_o(j\omega)$ encircles the critical point $-(1/\beta) + j0$ the same number of times as there are open-loop poles in the right-half plane. But the set of points $-(1/\beta)$ for all real $\beta > \frac{1}{2}$ lies inside the critical unit circle and thus is encircled CCW the same number of times as the original point $-1 + j0$. Consequently, there is a lower limit as $\beta > \frac{1}{2}$. In other words, for values of $\beta < \frac{1}{2}$, the set of points $-1/\beta + j0$ lies outside the unit circle, and the Nyquist criterion no longer guarantees stability of the closed-loop optimal control system. Thus, we have an infinite gain margin on the upper side and a lower gain margin of $\beta = 1/2$.
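The gain-margin statement can be illustrated on a concrete loop. The sketch below assumes the double-integrator loop transfer function $G_o(s) = (k_{12}s + k_{11})/s^2$ with the gains $k_{11} = 1$, $k_{12} = \sqrt{3}$ from Example 4.5, scales the loop gain by a factor $\beta$, and computes the closed-loop poles from the resulting quadratic. The function name and the sampled values of $\beta$ are illustrative.

```python
import cmath
import math


def closed_loop_poles(beta, k11=1.0, k12=math.sqrt(3.0)):
    """Roots of 1 + beta*Go(s) = 0, i.e. s**2 + beta*k12*s + beta*k11 = 0."""
    disc = cmath.sqrt((beta * k12) ** 2 - 4.0 * beta * k11)
    return (-beta * k12 + disc) / 2.0, (-beta * k12 - disc) / 2.0


for beta in (0.5, 1.0, 10.0, 1000.0):
    p1, p2 = closed_loop_poles(beta)
    print(f"beta={beta:7.1f}  poles: {p1:.4f}, {p2:.4f}")
```

For this particular plant the poles stay in the left-half plane for every $\beta > 0$; the bound $\beta > 1/2$ is the range guaranteed for any LQR loop, not a claim that instability must occur below it.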
Phase Margin
Let us first recall that the phase margin is the amount of phase shift in the CW direction (without affecting the gain) through which the Nyquist plot can be rotated about the origin so that the gain crossover (unit distance from the origin) passes through the $-1 + j0$ point. Simply, it is the amount by which the Nyquist plot can be rotated CW to make the system unstable. Consider a point $P$ at unit distance from the origin on the Nyquist plot (see Figure 4.12). Since we know that the Nyquist plot of an optimal regulator must avoid the unit circle centered at $-1 + j0$, the set of points which are at unit distance from the origin and lying on the Nyquist diagram of an optimal regulator are constrained to lie on the portion marked X on the circumference of the circle with unit radius and centered at the origin as shown in Figure 4.13.
Figure 4.13 Intersection of Unit Circles Centered at Origin and $-1 + j0$
Here, we notice that the smallest angle through which one of the admissible points A on the Nyquist plot (on the circumference of the circle centered at the origin) could be shifted in a CW direction to reach the $-1 + j0$ point is 60 degrees. Thus, the closed-loop optimal system or LQR system has a phase margin of at least 60 degrees.
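The 60-degree bound can be compared with the actual phase margin of a specific LQR loop. The sketch below assumes the double-integrator loop $G_o(j\omega) = k_{11}/(j\omega)^2 + k_{12}/(j\omega)$ with $k_{11} = 1$, $k_{12} = \sqrt{3}$ from Example 4.5, finds the gain crossover by bisection on a logarithmic frequency scale, and reads off the phase there. The function names and search bracket are assumptions for the illustration.

```python
import cmath
import math


def loop_tf(w, k11=1.0, k12=math.sqrt(3.0)):
    """Go(jw) = k [jwI - A]^{-1} b for the double integrator."""
    s = complex(0.0, w)
    return k11 / s**2 + k12 / s


def gain_crossover(lo=1e-3, hi=1e3, iters=200):
    """Frequency where |Go(jw)| = 1; |Go| falls monotonically with w for this loop."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)      # bisection in log-frequency
        if abs(loop_tf(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


wc = gain_crossover()
phase_margin = 180.0 + math.degrees(cmath.phase(loop_tf(wc)))
print(f"crossover frequency = {wc:.4f} rad/s, phase margin = {phase_margin:.2f} degrees")
```

The margin found here exceeds 60 degrees, consistent with the bound being a guaranteed minimum rather than the typical value.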
4.6 Problems
1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and plot all the optimal controls and states for all problems. Provide the relevant MATLAB© m-files.
Problem 4.1 A second-order plant
$$\dot{x}_1(t) = x_2(t),$$
$$\dot{x}_2(t) = -2x_1(t) - 3x_2(t) + u(t)$$
$$y(t) = x(t)$$
is to be controlled to minimize a performance index and to keep the state $x_1(t)$ close to a ramp function $2t$. The final time $t_f$ is specified, the final state $x(t_f)$ is free and the admissible controls and states are unbounded. Formulate the performance index, obtain the feedback control law and plot all the time histories of Riccati coefficients, optimal states and control.
Problem 4.2 A second-order plant
$$\dot{x}_1(t) = x_2(t),$$
$$\dot{x}_2(t) = -2x_1(t) - 4x_2(t) + 0.5u(t)$$
$$y(t) = x(t)$$
is to be controlled to minimize the performance index
The final time $t_f$ is specified, the final state $x(t_f)$ is fixed and the admissible controls and states are unbounded. Obtain the feedback control law and plot all the time histories of inverse Riccati coefficients, optimal states and control.
Problem 4.3 For a linear, time-varying system (3.2.48) given as
$$\dot{x}(t) = A(t)x(t) + B(t)u(t),$$
$$y(t) = C(t)x(t)$$
with a general cost functional (3.2.49) as
$$J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\begin{bmatrix} x'(t) & u'(t) \end{bmatrix}\begin{bmatrix} Q(t) & S(t) \\ S'(t) & R(t) \end{bmatrix}\begin{bmatrix} x(t) \\ u(t) \end{bmatrix}dt,$$
where the various vectors and matrices are defined in Chapter 3, formulate a tracking problem and obtain results similar to those obtained in Chapter 4.
Problem 4.4 Using the frequency-domain results, determine the optimal feedback coefficients and the closed-loop optimal control for the multi-input, multi-output system
. [0 1] [10]
x(t) = 2 3 x(t) + ° 1 u(t)
and the cost function
J = 10"" [4xi(t) + + o.5ui(t) +
Problem 4.5 For the D.C. motor speed control system described in Problem 1.1, find the closed-loop optimal control to track the speed at a particular value.
Problem 4.6 For the liquid-level control system described in Problem 1.2, find the closed-loop optimal control to track the liquid level along a ramp function $0.25t$.
Problem 4.7 For the mechanical control system described in Problem 1.4, find the closed-loop optimal control to track the system along (i) a constant value and (ii) a ramp function.
Problem 4.8 For the chemical control system described in Problem 1.6, find the closed-loop optimal control to track the system along (i) a constant value and (ii) a ramp function.
Chapter 5
Discrete-Time Optimal Control Systems
In previous chapters, the optimal control of continuous-time systems has been presented. In this chapter, the optimal control of discrete-time systems is presented. We start with the basic calculus of variations and then touch upon all the topics discussed in the previous chapters with respect to the continuous-time systems such as open-loop optimal control, linear quadratic regulator system, tracking system, etc. It is suggested that the student review the material in Appendices A and B given at the end of the book. This chapter is inspired by [84, 89, 120]$^1$.
5.1 Variational Calculus for Discrete-Time Systems
In earlier chapters, we discussed the optimal control of continuous-time systems described by differential equations. There, we minimized cost functionals which are essentially integrals of scalar functions. Now, we know that discrete-time systems are characterized by difference equations, and we focus on minimizing the cost functionals which are summations of some scalar functions.
$^1$The permission given by John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons, Inc., New York, NY, 1986, is hereby acknowledged.
5.1.1 Extremization of a Functional
In this section, we obtain the necessary conditions for optimization of cost functionals which are summations such as
$$J(x(k_0), k_0) = J = \sum_{k=k_0}^{k_f-1} V(x(k), x(k+1), k) \tag{5.1.1}$$
where the discrete instant $k = k_0, k_1, \ldots, k_f - 1$. Note the following points.
1. For a given interval $k \in [k_0, k_f]$ and a given function $V(x(k), x(k+1), k)$, the summation interval in (5.1.1) needs to be $[k_0, k_f - 1]$.
2. We consider first a scalar case for simplicity and then we generalize for the vector case.
3. We are also given the initial condition $x(k = k_0) = x(k_0)$.
4. Consider the case of a free-final point system, such that $k_f$ is fixed or specified and $x(k_f)$ is free or unspecified.
5. Also, if $T$ is the sampling period, then $x(k) = x(kT)$.
6. Let us note that if we are directly considering the discrete-time version of the continuous-time cost functionals (such as (2.3.1) addressed in Chapter 2), we have the sampling period $T$ multiplying the cost functional (5.1.1).
For extremization (maximization or minimization) of functionals, analogous to the case of continuous-time systems addressed in Chapter 2, we use the fundamental theorem of the calculus of variations (CoV) which states that the first variation must be equal to zero. The methodology for this simple case of optimization of a functional is carried out briefly under the following steps.
• Step 1: Variations
• Step 2: Increment
• Step 3: First Variation
• Step 4: Euler-Lagrange Equation
• Step 5: Boundary Conditions
Now consider these items in detail.
• Step 1: Variations: We first let $x(k)$ and $x(k+1)$ take on variations $\delta x(k)$ and $\delta x(k+1)$ from their optimal values $x^*(k)$ and $x^*(k+1)$, respectively, such that
$$x(k) = x^*(k) + \delta x(k); \quad x(k+1) = x^*(k+1) + \delta x(k+1). \tag{5.1.2}$$
Now with these variations, the performance index (5.1.1) becomes
$$J^* = J(x^*(k_0), k_0) = \sum_{k=k_0}^{k_f-1} V(x^*(k), x^*(k+1), k) \tag{5.1.3}$$
$$J = J(x(k_0), k_0) = \sum_{k=k_0}^{k_f-1} V(x^*(k) + \delta x(k), x^*(k+1) + \delta x(k+1), k). \tag{5.1.4}$$
• Step 2: Increment: The increment of the functionals defined by (5.1.3) and (5.1.4) is defined as
$$\Delta J = J - J^*. \tag{5.1.5}$$
• Step 3: First Variation: The first variation $\delta J$ is the first-order approximation of the increment $\Delta J$. Thus, using the Taylor series expansion of (5.1.4) along with (5.1.3), we have
$$\delta J = \sum_{k=k_0}^{k_f-1}\left[\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)}\delta x(k) + \frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k+1)}\delta x(k+1)\right]. \tag{5.1.6}$$
Now in order to express the coefficient of $\delta x(k+1)$ also in terms of $\delta x(k)$, consider the second expression in (5.1.6).
$$\begin{aligned}
\sum_{k=k_0}^{k_f-1}\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k+1)}\delta x(k+1)
&= \frac{\partial V(x^*(k_0), x^*(k_0+1), k_0)}{\partial x^*(k_0+1)}\delta x(k_0+1) \\
&\quad + \frac{\partial V(x^*(k_0+1), x^*(k_0+2), k_0+1)}{\partial x^*(k_0+2)}\delta x(k_0+2) \\
&\quad + \cdots \\
&\quad + \frac{\partial V(x^*(k_f-2), x^*(k_f-1), k_f-2)}{\partial x^*(k_f-1)}\delta x(k_f-1) \\
&\quad + \frac{\partial V(x^*(k_f-1), x^*(k_f), k_f-1)}{\partial x^*(k_f)}\delta x(k_f) \\
&\quad + \frac{\partial V(x^*(k_0-1), x^*(k_0), k_0-1)}{\partial x^*(k_0)}\delta x(k_0) \\
&\quad - \frac{\partial V(x^*(k_0-1), x^*(k_0), k_0-1)}{\partial x^*(k_0)}\delta x(k_0)
\end{aligned} \tag{5.1.7}$$
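where the last two terms in (5.1.7) are added without affecting the rest of the equation. The entire equation (5.1.7), except the last term and the third-from-last term, is collected into a single summation and rewritten as
$$\begin{aligned}
\sum_{k=k_0}^{k_f-1}\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k+1)}\delta x(k+1)
&= \sum_{k=k_0}^{k_f-1}\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\delta x(k) \\
&\quad + \frac{\partial V(x^*(k_f-1), x^*(k_f), k_f-1)}{\partial x^*(k_f)}\delta x(k_f) \\
&\quad - \frac{\partial V(x^*(k_0-1), x^*(k_0), k_0-1)}{\partial x^*(k_0)}\delta x(k_0) \\
&= \sum_{k=k_0}^{k_f-1}\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\delta x(k) \\
&\quad + \left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\delta x(k)\right]\Bigg|_{k=k_0}^{k=k_f}.
\end{aligned} \tag{5.1.8}$$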
Substituting (5.1.8) in (5.1.6) and noting that the first variation should be zero, we have
$$\sum_{k=k_0}^{k_f-1}\left[\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)} + \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\right]\delta x(k) + \left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\delta x(k)\right]\Bigg|_{k=k_0}^{k=k_f} = 0. \tag{5.1.9}$$
• Step 4: Euler-Lagrange Equation: For (5.1.9) to be satisfied for arbitrary variations $\delta x(k)$, we have the condition that the coefficient of $\delta x(k)$ in the first term in (5.1.9) be zero. That is
$$\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)} + \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)} = 0. \tag{5.1.10}$$
This may very well be called the discrete-time version of the Euler-Lagrange (EL) equation.
• Step 5: Boundary Conditions: The boundary or transversality condition is obtained by setting the second term in (5.1.9) equal to zero. That is
$$\left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\delta x(k)\right]\Bigg|_{k=k_0}^{k=k_f} = 0. \tag{5.1.11}$$
Now, we discuss two important cases:
1. For a fixed-end point system, we have the boundary conditions $x(k_0)$ and $x(k_f)$ fixed and hence $\delta x(k_0) = \delta x(k_f) = 0$. The additional (or derived) boundary condition (5.1.11) does not exist.
2. For a free-final point system, we are given the initial condition $x(k_0)$ and hence $\delta x(k_0) = 0$ in (5.1.11). Next, at the final point, $k_f$ is specified, and $x(k_f)$ is not specified or is free, and hence $\delta x(k_f)$ is arbitrary. Thus, the coefficient of $\delta x(k)$ at $k = k_f$ is zero in the condition (5.1.11), which reduces to
$$\left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\right]\Bigg|_{k=k_f} = 0. \tag{5.1.12}$$
Let us note that the necessary condition (5.1.10) and the associated boundary or transversality condition (5.1.12) are derived for the scalar function $x(k)$ only.
The previous analysis can be easily extended to the vector function $x(k)$ of $n$th order. Thus, consider a functional which is the vector version of the scalar functional (5.1.1) as
$$J(x(k_0), k_0) = J = \sum_{k=k_0}^{k_f-1} V(x(k), x(k+1), k). \tag{5.1.13}$$
We will only give the corresponding final Euler-Lagrange equation and the transversality condition, respectively, as
$$\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)} + \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)} = 0 \tag{5.1.14}$$
and
$$\left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\right]\Bigg|_{k=k_f} = 0. \tag{5.1.15}$$
Note in the Euler-Lagrange equation (5.1.10) or (5.1.14),
1. the first term involves taking the partial derivative of the given function $V(x^*(k), x^*(k+1), k)$ w.r.t. $x(k)$ and
2. the second term considers taking the partial derivative of $V(x^*(k-1), x^*(k), k-1)$ (one step behind) w.r.t. the same function $x(k)$.
The second function $V(x^*(k-1), x^*(k), k-1)$ can be easily obtained from the given function $V(x^*(k), x^*(k+1), k)$ just by replacing $k$ by $k-1$. Also, compare the previous results with the corresponding results for continuous-time systems in Chapter 2.
5.1.2 Functional with Terminal Cost
Let us formulate the cost functional with terminal cost (in addition to summation cost) as
$$J = J(x(k_0), k_0) = S(x(k_f), k_f) + \sum_{k=k_0}^{k_f-1} V(x(k), x(k+1), k) \tag{5.1.16}$$
given the initial condition $x(k_0)$ and the final time $k_f$ as fixed, and the final state $x(k_f)$ as free. First, assume optimal (*) condition and then consider the variations as
$$x(k) = x^*(k) + \delta x(k)$$
$$x(k+1) = x^*(k+1) + \delta x(k+1). \tag{5.1.17}$$
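Then, the corresponding functionals $J$ and $J^*$ become
$$J^* = S(x^*(k_f), k_f) + \sum_{k=k_0}^{k_f-1} V(x^*(k), x^*(k+1), k), \tag{5.1.18}$$
$$J = S(x^*(k_f) + \delta x(k_f), k_f) + \sum_{k=k_0}^{k_f-1} V(x^*(k) + \delta x(k), x^*(k+1) + \delta x(k+1), k). \tag{5.1.19}$$
Following the same procedure as given previously for a functional without terminal cost, we get the first variation as
$$\begin{aligned}
\delta J &= \sum_{k=k_0}^{k_f-1}\left[\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)} + \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\right]'\delta x(k) \\
&\quad + \left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\right]'\delta x(k)\Bigg|_{k=k_0}^{k=k_f} + \left[\frac{\partial S(x^*(k_f), k_f)}{\partial x^*(k_f)}\right]'\delta x(k_f).
\end{aligned} \tag{5.1.20}$$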
For extremization, the first variation $\delta J$ must be zero. Hence, from (5.1.20) the Euler-Lagrange equation becomes
$$\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)} + \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)} = 0 \tag{5.1.21}$$
and the transversality condition for the free-final point becomes
$$\left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)} + \frac{\partial S(x^*(k_f), k_f)}{\partial x^*(k_f)}\right]\Bigg|_{k=k_f} = 0. \tag{5.1.22}$$
Let us now illustrate the application of the Euler-Lagrange equation for discrete-time functionals.
Example 5.1
Consider the minimization of a functional
$$J(x(k_0), k_0) = J = \sum_{k=k_0}^{k_f-1}[x(k)x(k+1) + x^2(k)] \tag{5.1.23}$$
subject to the boundary conditions $x(0) = 2$ and $x(10) = 5$.
Solution: Let us identify in (5.1.23) that
$$V(x(k), x(k+1)) = x(k)x(k+1) + x^2(k) \tag{5.1.24}$$
and hence
$$V(x(k-1), x(k)) = x(k-1)x(k) + x^2(k-1). \tag{5.1.25}$$
Then using the Euler-Lagrange equation (5.1.10), which is the same as the scalar version of (5.1.21), we get
$$x(k+1) + 2x(k) + x(k-1) = 0 \tag{5.1.26}$$
or
$$x(k+2) + 2x(k+1) + x(k) = 0 \tag{5.1.27}$$
which, upon solving with the given boundary conditions $x(0) = 2$ and $x(10) = 5$, becomes
$$x(k) = 2(-1)^k + 0.3k(-1)^k. \tag{5.1.28}$$
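The solution (5.1.28) can be checked directly. The sketch below evaluates the sequence on $k = 0, \ldots, 10$, confirms the boundary values and the Euler-Lagrange recursion (5.1.26) at the interior points, and estimates the gradient of the cost (5.1.23) with respect to each interior sample by central differences. The function names and the step size `h` are illustrative assumptions.

```python
def x_opt(k):
    """Closed-form solution (5.1.28): x(k) = (2 + 0.3 k) (-1)^k."""
    return (2.0 + 0.3 * k) * (-1) ** k


def cost(x):
    """Functional (5.1.23) summed over k = 0, ..., len(x) - 2."""
    return sum(x[k] * x[k + 1] + x[k] ** 2 for k in range(len(x) - 1))


def gradient(x, k, h=1e-4):
    """Central-difference estimate of dJ/dx(k)."""
    up, down = list(x), list(x)
    up[k] += h
    down[k] -= h
    return (cost(up) - cost(down)) / (2.0 * h)


xs = [x_opt(k) for k in range(11)]
el_residuals = [xs[k + 1] + 2.0 * xs[k] + xs[k - 1] for k in range(1, 10)]
gradients = [gradient(xs, k) for k in range(1, 10)]

print("x(0) =", xs[0], " x(10) =", xs[10])
print("largest EL residual:", max(abs(v) for v in el_residuals))
print("largest interior gradient:", max(abs(v) for v in gradients))
```

Only the interior samples $x(1), \ldots, x(9)$ are varied, since both end points are fixed by the boundary conditions and therefore $\delta x(k_0) = \delta x(k_f) = 0$.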
5.2 Discrete-Time Optimal Control Systems
We develop the Minimum Principle for discrete-time control systems analogous to that for continuous-time control systems addressed in previous Chapters 2, 3, and 4. Instead of repeating all the topics of the continuous-time systems for the discrete-time systems, we focus on the linear quadratic optimal control problem. We essentially approach the problem using the Lagrangian and Hamiltonian (or Pontryagin) functions.
Consider a linear, time-varying, discrete-time control system described by
$$x(k+1) = A(k)x(k) + B(k)u(k) \tag{5.2.1}$$
where $k = k_0, k_1, \ldots, k_f - 1$, $x(k)$ is the $n$th order state vector, $u(k)$ is the $r$th order control vector, and $A(k)$ and $B(k)$ are matrices of $n \times n$ and $n \times r$ dimensions, respectively. Note that we used $A$ and $B$ for the state space representation for the discrete-time case as well as for the continuous-time case as shown in the previous chapters. One can alternatively use, say, $G$ and $E$ for the discrete-time case so that the discretization of a continuous-time system with $A$ and $B$ will result in $G$ and $E$ in the discrete-time representation. However, the present notation should not cause any confusion once we redefine the matrices in the discrete-time case. We are given the initial condition as
$$x(k = k_0) = x(k_0). \tag{5.2.2}$$
We will discuss later the final state condition and the resulting relations. We are also given a general performance index (PI) with terminal cost as
$$J = J(x(k_0), u(k_0), k_0) = \frac{1}{2}x'(k_f)F(k_f)x(k_f) + \frac{1}{2}\sum_{k=k_0}^{k_f-1}[x'(k)Q(k)x(k) + u'(k)R(k)u(k)] \tag{5.2.3}$$
where $F(k_f)$ and $Q(k)$ are each $n \times n$ order symmetric, positive semidefinite matrices, and $R(k)$ is an $r \times r$ symmetric, positive definite matrix. The methodology for the linear quadratic optimal control problem is carried out under the following steps.
• Step 1: Augmented Performance Index
• Step 2: Lagrangian
• Step 3: Euler-Lagrange Equation
• Step 4: Hamiltonian
• Step 5: Open-Loop Optimal Control
• Step 6: State and Costate System
Now these steps are described in detail.
• Step 1: Augmented Performance Index: First, we formulate an augmented cost functional by adjoining the original cost functional (5.2.3) with the condition or plant relation (5.2.1) using the Lagrange multiplier (later to be called the costate function) $\lambda(k+1)$ as
$$J_a = \frac{1}{2}x'(k_f)F(k_f)x(k_f) + \sum_{k=k_0}^{k_f-1}\left\{\frac{1}{2}[x'(k)Q(k)x(k) + u'(k)R(k)u(k)] + \lambda'(k+1)[A(k)x(k) + B(k)u(k) - x(k+1)]\right\}. \tag{5.2.4}$$
Minimization of the augmented cost functional (5.2.4) is the same as that of the original cost functional (5.2.3), since $J = J_a$ whenever the plant relation (5.2.1) is satisfied. The reason for associating the stage $(k+1)$ with the Lagrange multiplier $\lambda(k+1)$ is mainly the simplicity of the final result, as will be apparent later.
• Step 2: Lagrangian: Let us now define a new function called the Lagrangian as
$$\mathcal{L}(x(k), u(k), x(k+1), \lambda(k+1)) = \frac{1}{2}x'(k)Q(k)x(k) + \frac{1}{2}u'(k)R(k)u(k) + \lambda'(k+1)[A(k)x(k) + B(k)u(k) - x(k+1)]. \tag{5.2.5}$$
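• Step 3: Euler-Lagrange Equations: We now apply the Euler-Lagrange (EL) equation (5.1.21) to this new function $\mathcal{L}$ with respect to the variables $x(k)$, $u(k)$, and $\lambda(k+1)$. Thus, we get
$$\frac{\partial \mathcal{L}(x^*(k), x^*(k+1), u^*(k), \lambda^*(k+1))}{\partial x^*(k)} + \frac{\partial \mathcal{L}(x^*(k-1), x^*(k), u^*(k-1), \lambda^*(k))}{\partial x^*(k)} = 0 \tag{5.2.6}$$
$$\frac{\partial \mathcal{L}(x^*(k), x^*(k+1), u^*(k), \lambda^*(k+1))}{\partial u^*(k)} + \frac{\partial \mathcal{L}(x^*(k-1), x^*(k), u^*(k-1), \lambda^*(k))}{\partial u^*(k)} = 0 \tag{5.2.7}$$
$$\frac{\partial \mathcal{L}(x^*(k), x^*(k+1), u^*(k), \lambda^*(k+1))}{\partial \lambda^*(k)} + \frac{\partial \mathcal{L}(x^*(k-1), x^*(k), u^*(k-1), \lambda^*(k))}{\partial \lambda^*(k)} = 0 \tag{5.2.8}$$
and the boundary (final) condition (5.1.22) becomes
$$\left[\frac{\partial \mathcal{L}(x(k-1), x(k), u(k-1), \lambda(k))}{\partial x(k)} + \frac{\partial S(x(k), k)}{\partial x(k)}\right]'_{*}\delta x(k)\Bigg|_{k=k_0}^{k=k_f} = 0 \tag{5.2.9}$$
where, from (5.2.3),
$$S(x(k_f), k_f) = \frac{1}{2}x'(k_f)F(k_f)x(k_f). \tag{5.2.10}$$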
• Step 4: Hamiltonian: Although relations (5.2.6) to (5.2.10) give the required conditions for optimum, we proceed to get the results in a more elegant manner in terms of the Hamiltonian, which is defined as
$$\mathcal{H}(x^*(k), u^*(k), \lambda^*(k+1)) = \frac{1}{2}x^{*\prime}(k)Q(k)x^*(k) + \frac{1}{2}u^{*\prime}(k)R(k)u^*(k) + \lambda^{*\prime}(k+1)[A(k)x^*(k) + B(k)u^*(k)]. \tag{5.2.11}$$
Thus, the Lagrangian (5.2.5) and the Hamiltonian (5.2.11) are related as
$$\mathcal{L}(x^*(k), x^*(k+1), u^*(k), \lambda^*(k+1)) = \mathcal{H}(x^*(k), u^*(k), \lambda^*(k+1)) - \lambda^{*\prime}(k+1)x^*(k+1). \tag{5.2.12}$$
Now, using the relation (5.2.12) in the set of Euler-Lagrange equations (5.2.6) to (5.2.8), we get the required conditions for extremum in terms of the Hamiltonian as
$$\lambda^*(k) = \frac{\partial \mathcal{H}(x^*(k), u^*(k), \lambda^*(k+1))}{\partial x^*(k)}, \tag{5.2.13}$$
$$0 = \frac{\partial \mathcal{H}(x^*(k), u^*(k), \lambda^*(k+1))}{\partial u^*(k)}, \tag{5.2.14}$$
$$x^*(k) = \frac{\partial \mathcal{H}(x^*(k-1), u^*(k-1), \lambda^*(k))}{\partial \lambda^*(k)}. \tag{5.2.15}$$
Note that the relation (5.2.15) can also be written in a more appropriate way by considering the whole relation at the next stage as
$$x^*(k+1) = \frac{\partial \mathcal{H}(x^*(k), u^*(k), \lambda^*(k+1))}{\partial \lambda^*(k+1)}. \tag{5.2.16}$$
For the present system described by the plant (5.2.1) and the
performance index (5.2.3), the relations (5.2.16), (5.2.13), and
(5.2.14) for the state, costate, and control become, respectively,
\[
x^*(k+1) = A(k)x^*(k) + B(k)u^*(k) \tag{5.2.17}
\]
\[
\lambda^*(k) = Q(k)x^*(k) + A'(k)\lambda^*(k+1) \tag{5.2.18}
\]
\[
0 = R(k)u^*(k) + B'(k)\lambda^*(k+1). \tag{5.2.19}
\]
• Step 5: Open-Loop Optimal Control: The optimal control is then
given by (5.2.19) as
\[
u^*(k) = -R^{-1}(k)B'(k)\lambda^*(k+1) \tag{5.2.20}
\]
where the positive definiteness of R(k) ensures its invertibility.
Using the optimal control (5.2.20) in the state equation (5.2.17),
we get
\[
x^*(k+1) = A(k)x^*(k) - B(k)R^{-1}(k)B'(k)\lambda^*(k+1) = A(k)x^*(k) - E(k)\lambda^*(k+1) \tag{5.2.21}
\]
where E(k) = B(k)R^{-1}(k)B'(k).
• Step 6: State and Costate System: The canonical (state and
costate) system of (5.2.21) and (5.2.18) becomes
\[
\begin{bmatrix} x^*(k+1) \\ \lambda^*(k) \end{bmatrix} = \begin{bmatrix} A(k) & -E(k) \\ Q(k) & A'(k) \end{bmatrix} \begin{bmatrix} x^*(k) \\ \lambda^*(k+1) \end{bmatrix}. \tag{5.2.22}
\]
The state and costate (or Hamiltonian) system (5.2.22) is shown
in Figure 5.1. Note that the preceding Hamiltonian system (5.2.22)
is not symmetrical, in the sense that x*(k + 1) and λ*(k) are
related in terms of x*(k) and λ*(k + 1).
5.2.1 Fixed-Final State and Open-Loop Optimal
Control
Let us now discuss the boundary condition and the associated control
configurations. For the given or fixed-initial condition (5.2.2) and the
fixed-final state as
\[
x(k = k_f) = x(k_f), \tag{5.2.23}
\]
the terminal cost term in the performance index (5.2.3) is a constant
that plays no part in the minimization, and hence we can set
F(k_f) = 0. Also, in view of the fixed-final state condition (5.2.23),
the variation δx(k_f) = 0 and hence the boundary condition (5.2.9)
imposes no further requirement in this case. Thus, the state and
costate system (5.2.22) along with the initial condition (5.2.2) and
the fixed-final condition (5.2.23) constitute a two-point boundary value
problem (TPBVP). The solution of this TPBVP gives x*(k) and λ*(k)
or λ*(k + 1), which along with the control relation (5.2.20) leads us to
the so-called open-loop optimal control. The entire procedure is now
summarized in Table 5.1.
We now illustrate the previous procedure by considering a simple
system.
Table 5.1 Procedure Summary of Discrete-Time Optimal
Control System: Fixed-End Points Condition
A. Statement of the Problem
Given the plant as
  x(k + 1) = A(k)x(k) + B(k)u(k),
the performance index as
  J(k_0) = \frac{1}{2} \sum_{k=k_0}^{k_f-1} [x'(k)Q(k)x(k) + u'(k)R(k)u(k)],
and the boundary conditions as
  x(k = k_0) = x(k_0); x(k = k_f) = x(k_f),
find the optimal control.
B. Solution of the Problem
Step 1 Form the Pontryagin H function
  \mathcal{H} = \frac{1}{2} x'(k)Q(k)x(k) + \frac{1}{2} u'(k)R(k)u(k) + \lambda'(k+1) [A(k)x(k) + B(k)u(k)].
Step 2 Minimize H w.r.t. u(k):
  (\partial \mathcal{H} / \partial u(k))_* = 0, and obtain
  u^*(k) = -R^{-1}(k)B'(k)\lambda^*(k+1).
Step 3 Using the result of Step 2, find the optimal H* function as
  \mathcal{H}^*(x^*(k), \lambda^*(k+1)).
Step 4 Solve the set of 2n difference equations
  x^*(k+1) = \partial \mathcal{H}^* / \partial \lambda^*(k+1) = A(k)x^*(k) - E(k)\lambda^*(k+1),
  \lambda^*(k) = \partial \mathcal{H}^* / \partial x^*(k) = Q(k)x^*(k) + A'(k)\lambda^*(k+1),
  with the given boundary conditions x(k_0) and x(k_f),
  where E(k) = B(k)R^{-1}(k)B'(k).
Step 5 Substitute the solution of λ*(k) from Step 4 into
  the expression for u*(k) of Step 2 to obtain the
  optimal control.
Figure 5.1 State and Costate System (block diagram: the state path with initial condition x(k_0), unit delay z^{-1}, and A(k); the costate path with unit advance, A'(k), and Q(k) acting on x*(k))
Example 5.2
Consider the minimization of the performance index (PI) [120]
\[
J(k_0) = \frac{1}{2} \sum_{k=k_0}^{k_f-1} u^2(k), \tag{5.2.24}
\]
subject to the boundary conditions
\[
x(k_0 = 0) = 1, \quad x(k_f = 10) = 0 \tag{5.2.25}
\]
for a simple scalar system
\[
x(k+1) = x(k) + u(k). \tag{5.2.26}
\]
Solution: Let us first identify the various matrices by comparing
the present state (5.2.26) and the PI (5.2.24) with the corresponding
general formulation of the state (5.2.1) and the PI (5.2.3),
respectively, to get
\[
A(k) = 1; \quad B(k) = 1; \quad F(k_f) = 0; \quad Q(k) = 0; \quad R(k) = 1. \tag{5.2.27}
\]
Now let us use the procedure given in Table 5.1.
• Step 1: Form the Pontryagin H function as
\[
\mathcal{H}(x(k), u(k), \lambda(k+1)) = \frac{1}{2} u^2(k) + \lambda(k+1) [x(k) + u(k)]. \tag{5.2.28}
\]
• Step 2: Minimizing H of (5.2.28) w.r.t. u(k),
\[
\frac{\partial \mathcal{H}}{\partial u(k)} = 0 \;\rightarrow\; u^*(k) + \lambda^*(k+1) = 0 \;\rightarrow\; u^*(k) = -\lambda^*(k+1). \tag{5.2.29}
\]
• Step 3: Using the control relation (5.2.29) and the Hamiltonian
(5.2.28), form the optimal H* function
\[
\mathcal{H}^*(x^*(k), \lambda^*(k+1)) = x^*(k)\lambda^*(k+1) - \frac{1}{2} \lambda^{*2}(k+1). \tag{5.2.30}
\]
• Step 4: Obtain the set of 2 state and costate difference equations
\[
x^*(k+1) = \frac{\partial \mathcal{H}^*}{\partial \lambda^*(k+1)} \;\rightarrow\; x^*(k+1) = x^*(k) - \lambda^*(k+1) \tag{5.2.31}
\]
and
\[
\lambda^*(k) = \frac{\partial \mathcal{H}^*}{\partial x^*(k)} \;\rightarrow\; \lambda^*(k) = \lambda^*(k+1). \tag{5.2.32}
\]
Solving these 2 equations (5.2.31) and (5.2.32) (by first eliminating
λ(k) and solving for x(k)) along with the boundary conditions
(5.2.25), we get the optimal solutions as
\[
x^*(k) = 1 - 0.1k; \quad \lambda^*(k+1) = 0.1. \tag{5.2.33}
\]
• Step 5: Using the previous state and costate solutions, the optimal
control u*(k) is obtained from (5.2.29) as
\[
u^*(k) = -0.1. \tag{5.2.34}
\]
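The result of Example 5.2 can be checked numerically. The following Python sketch is not part of the original MATLAB material; the function name and its arguments are illustrative assumptions. It uses the fact that, by (5.2.32), the costate is a constant c, so that (5.2.31) gives x(k_f) = x(k_0) - (k_f - k_0)c, and the fixed-final condition (5.2.25) then determines c.

```python
def solve_example_5_2(x0=1.0, xf=0.0, k0=0, kf=10):
    """Open-loop solution of x(k+1) = x(k) + u(k) with cost 0.5*sum(u^2)."""
    steps = kf - k0
    # (5.2.32): lambda(k) = lambda(k+1), so the costate is a constant c.
    # (5.2.31): x(k+1) = x(k) - c, hence x(kf) = x0 - steps*c.
    c = (x0 - xf) / steps
    u = [-c] * steps                        # (5.2.29): u*(k) = -lambda*(k+1)
    x = [x0]
    for uk in u:                            # plant (5.2.26)
        x.append(x[-1] + uk)
    cost = 0.5 * sum(uk * uk for uk in u)   # performance index (5.2.24)
    return x, u, c, cost


x, u, c, cost = solve_example_5_2()
print("costate:", c, "control:", u[0], "final state:", x[-1], "cost:", cost)
```

Up to rounding, the sketch reproduces the costate of (5.2.33) and the control of (5.2.34), and the simulated state reaches the prescribed final value.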
5.2.2 Free-Final State and Open-Loop Optimal Control
Let us, first of all, note that for a free-final state system, it is usual to
obtain the closed-loop optimal control configuration. However, we reserve
this for the next section. Let us consider the free-final state condition
as
\[
x(k_f) \text{ is free, and } k_f \text{ is fixed.} \tag{5.2.35}
\]
Then, the final condition (5.2.9) along with the Lagrangian (5.2.5)
becomes
\[
\left[ -\lambda(k_f) + \frac{\partial S(x(k_f), k_f)}{\partial x(k_f)} \right]'_* \delta x(k_f) = 0. \tag{5.2.36}
\]
Now, for this free-final point system with k_f fixed and x(k_f) free,
δx(k_f) becomes arbitrary and its coefficient in (5.2.36) should be zero.
Thus, the boundary condition (5.2.36) along with the performance
index (5.2.3) becomes
\[
\lambda(k_f) = \frac{\partial}{\partial x(k_f)} \left[ \frac{1}{2} x'(k_f) F(k_f) x(k_f) \right] \tag{5.2.37}
\]
which gives
\[
\lambda(k_f) = F(k_f) x(k_f). \tag{5.2.38}
\]
The state and costate system (5.2.22) along with the initial condition
(5.2.2) and the final condition (5.2.38) constitute a TPBVP. The
solution of this TPBVP is difficult because of the coupled nature
of the solutions (i.e., the state x*(k) has to be solved forward starting
from its initial condition x(k_0) and the costate λ*(k) has to be solved
backward starting from its final condition λ(k_f)); it leads us to the
open-loop optimal control. The entire procedure is now summarized in
Table 5.2.
Table 5.2 Procedure Summary for Discrete-Time Optimal
Control System: Free-Final Point Condition
A. Statement of the Problem
Given the plant as
  x(k + 1) = A(k)x(k) + B(k)u(k),
the performance index as
  J(k_0) = \frac{1}{2} x'(k_f)F(k_f)x(k_f) + \frac{1}{2} \sum_{k=k_0}^{k_f-1} [x'(k)Q(k)x(k) + u'(k)R(k)u(k)],
and the boundary conditions as
  x(k = k_0) = x(k_0); x(k_f) is free, and k_f is fixed,
find the optimal control.
B. Solution of the Problem
Step 1 Form the Pontryagin H function
  \mathcal{H} = \frac{1}{2} x'(k)Q(k)x(k) + \frac{1}{2} u'(k)R(k)u(k) + \lambda'(k+1) [A(k)x(k) + B(k)u(k)].
Step 2 Minimize H w.r.t. u(k) as
  (\partial \mathcal{H} / \partial u(k))_* = 0, and obtain
  u^*(k) = -R^{-1}(k)B'(k)\lambda^*(k+1).
Step 3 Using the result of Step 2 in Step 1, find the optimal H* as
  \mathcal{H}^*(x^*(k), \lambda^*(k+1)).
Step 4 Solve the set of 2n difference equations
  x^*(k+1) = \partial \mathcal{H}^* / \partial \lambda^*(k+1) = A(k)x^*(k) - E(k)\lambda^*(k+1),
  \lambda^*(k) = \partial \mathcal{H}^* / \partial x^*(k) = Q(k)x^*(k) + A'(k)\lambda^*(k+1),
  with the given initial condition and
  the final condition \lambda(k_f) = F(k_f)x(k_f),
  where E(k) = B(k)R^{-1}(k)B'(k).
Step 5 Substitute the solution of λ*(k) from Step 4 into
  the expression for u*(k) of Step 2 to obtain the
  optimal control.

5.3 Discrete-Time Linear State Regulator
System
In this section, we discuss the state regulator system and obtain the
closed-loop optimal control configuration for discrete-time systems. This
leads us to the matrix difference Riccati equation (DRE). Now, we restate
the problem of the linear state regulator and summarize the results
derived in Section 5.2.
Consider the linear, time-varying discrete-time control system
described by the plant (5.2.1) and the performance index (5.2.3). We are
given the initial and final conditions as
\[
x(k = k_0) = x(k_0); \quad x(k_f) \text{ is free, and } k_f \text{ is fixed.} \tag{5.3.1}
\]
Then the optimal control (5.2.20) and the state and costate equations
(5.2.22) are reproduced here, respectively, for convenience as
\[
u^*(k) = -R^{-1}(k)B'(k)\lambda^*(k+1) \tag{5.3.2}
\]
and
\[
x^*(k+1) = A(k)x^*(k) - E(k)\lambda^*(k+1), \tag{5.3.3}
\]
\[
\lambda^*(k) = Q(k)x^*(k) + A'(k)\lambda^*(k+1), \tag{5.3.4}
\]
where E(k) = B(k)R^{-1}(k)B'(k), and the final costate relation (5.2.38)
is given by
\[
\lambda(k_f) = F(k_f)x(k_f). \tag{5.3.5}
\]
5.3.1 Closed-Loop Optimal Control: Matrix Difference
Riccati Equation
In order to obtain the closed-loop optimal configuration, we need to
express the costate function λ*(k + 1) in the optimal control (5.3.2) in
terms of the state function x*(k). The final condition (5.3.5) prompts
us to express
\[
\lambda^*(k) = P(k)x^*(k) \tag{5.3.6}
\]
where P(k) is yet to be determined. This linear transformation is
called the Riccati transformation, and is of fundamental importance
in the solution of the problem. Using the transformation (5.3.6) in the
state and costate equations (5.3.3) and (5.3.4), we have
\[
P(k)x^*(k) = Q(k)x^*(k) + A'(k)P(k+1)x^*(k+1) \tag{5.3.7}
\]
and
\[
x^*(k+1) = A(k)x^*(k) - E(k)P(k+1)x^*(k+1). \tag{5.3.8}
\]
Solving for x*(k + 1) from (5.3.8),
\[
x^*(k+1) = [I + E(k)P(k+1)]^{-1} A(k)x^*(k). \tag{5.3.9}
\]
Substituting (5.3.9) in (5.3.7) yields
\[
P(k)x^*(k) = Q(k)x^*(k) + A'(k)P(k+1) [I + E(k)P(k+1)]^{-1} A(k)x^*(k). \tag{5.3.10}
\]
Since this relation (5.3.10) must hold for all values of x*(k), we have
\[
P(k) = A'(k)P(k+1) [I + E(k)P(k+1)]^{-1} A(k) + Q(k). \tag{5.3.11}
\]
This relation (5.3.11) is called the matrix difference Riccati equation
(DRE). Alternatively, we can express (5.3.11) as
\[
P(k) = A'(k) [P^{-1}(k+1) + E(k)]^{-1} A(k) + Q(k) \tag{5.3.12}
\]
where we assume that the inverse of P(k) exists for all k ≠ k_f.
The final condition for solving the matrix DRE (5.3.11) or (5.3.12) is
obtained from (5.3.5) and (5.3.6) as
\[
\lambda(k_f) = P(k_f)x(k_f) = F(k_f)x(k_f) \tag{5.3.13}
\]
which gives
\[
P(k_f) = F(k_f). \tag{5.3.14}
\]
In the matrix DRE (5.3.11), the term P(k) is on the left-hand side
and P(k + 1) is on the right-hand side, and hence it needs to be solved
backwards starting from the final condition (5.3.14). Since Q(k) and
F(k_f) are assumed to be positive semidefinite, we can show that the
Riccati matrix P(k) is positive definite. Now, to obtain the closed-loop
optimal control, we eliminate λ*(k + 1) from the control relation (5.3.2)
and the costate relation (5.3.4) and use the transformation (5.3.6). Thus,
we get the relation for the closed-loop optimal control as
\[
u^*(k) = -R^{-1}(k)B'(k)A^{-T}(k) [P(k) - Q(k)] x^*(k). \tag{5.3.15}
\]
Here, A^{-T} is the inverse of A' and we assume that the inverse of A(k)
exists. This relation (5.3.15) is the desired version for the closed-loop
optimal control in terms of the state. We may write the closed-loop
optimal control relation (5.3.15) in a simplified form as
\[
u^*(k) = -L(k)x^*(k) \tag{5.3.16}
\]
where
\[
L(k) = R^{-1}(k)B'(k)A^{-T}(k) [P(k) - Q(k)]. \tag{5.3.17}
\]
This is the required relation for the optimal feedback control law and
the feedback gain L(k) is called the "Kalman gain." The optimal state
x*(k) is obtained by substituting the optimal control u*(k) given by
(5.3.16) in the original state equation (5.2.1) as
\[
x^*(k+1) = (A(k) - B(k)L(k)) x^*(k). \tag{5.3.18}
\]
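The backward sweep of the DRE and the forward simulation of the closed-loop system can be sketched for a scalar plant. The following Python fragment is only an illustration under stated assumptions: the numerical data (a, b, q, r, f, the horizon, and x0) and the function names are hypothetical and are not taken from the text. It solves (5.3.11) backwards from (5.3.14), forms the gain (5.3.17), simulates (5.3.18), and then checks that λ(k) = P(k)x(k) indeed satisfies the costate equation (5.3.4) and the open-loop control law (5.3.2).

```python
def riccati_sweep(a, b, q, r, f, steps):
    """Backward sweep of the scalar DRE (5.3.11) with P(kf) = f (5.3.14)."""
    e = b * b / r                                  # E = B R^{-1} B'
    p = [0.0] * (steps + 1)
    p[steps] = f
    for k in range(steps - 1, -1, -1):
        p[k] = a * p[k + 1] * a / (1.0 + e * p[k + 1]) + q
    # Gain (5.3.17); it needs a != 0 because it uses the inverse of A'.
    gain = [b * (p[k] - q) / (r * a) for k in range(steps)]
    return p, gain


def closed_loop(a, b, gain, x0):
    """Forward simulation of (5.3.18) with u*(k) = -L(k) x*(k) (5.3.16)."""
    x, u = [x0], []
    for l_k in gain:
        u.append(-l_k * x[-1])
        x.append((a - b * l_k) * x[-1])
    return x, u


# Hypothetical scalar data, chosen only for illustration.
a, b, q, r, f, steps, x0 = 0.9, 0.5, 1.0, 2.0, 3.0, 5, 4.0
p, gain = riccati_sweep(a, b, q, r, f, steps)
x, u = closed_loop(a, b, gain, x0)
lam = [p[k] * x[k] for k in range(steps + 1)]      # Riccati transformation (5.3.6)

# Residuals of the costate equation (5.3.4) and of the control law (5.3.2).
costate_residual = max(abs(lam[k] - (q * x[k] + a * lam[k + 1])) for k in range(steps))
control_residual = max(abs(u[k] + b * lam[k + 1] / r) for k in range(steps))
print(costate_residual, control_residual)
```

Both residuals vanish up to rounding, which is the scalar counterpart of the statement that the closed-loop law (5.3.16) reproduces the open-loop optimal control along the optimal trajectory.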
Alternate Forms for the DRE
Alternate forms for the matrix DRE (5.3.11) and the optimal control
(5.3.16), which do not require the inversion of the matrix A(k), are
obtained as follows. Using the well-known matrix inversion lemma
\[
[A_1^{-1} + A_2 A_4 A_3]^{-1} = A_1 - A_1 A_2 [A_3 A_1 A_2 + A_4^{-1}]^{-1} A_3 A_1 \tag{5.3.19}
\]
in (5.3.12), and manipulating, we have the matrix DRE as
\[
P(k) = A'(k) \Big\{ P(k+1) - P(k+1)B(k) [B'(k)P(k+1)B(k) + R(k)]^{-1} B'(k)P(k+1) \Big\} A(k) + Q(k). \tag{5.3.20}
\]
Next, consider the optimal control (5.3.2) and the transformation (5.3.6),
to get
\[
u^*(k) = -R^{-1}(k)B'(k)P(k+1)x^*(k+1) \tag{5.3.21}
\]
which upon using the state equation (5.2.1) becomes
\[
u^*(k) = -R^{-1}(k)B'(k)P(k+1) [A(k)x^*(k) + B(k)u^*(k)]. \tag{5.3.22}
\]
Rearranging, we have
\[
[I + R^{-1}(k)B'(k)P(k+1)B(k)] u^*(k) = -R^{-1}(k)B'(k)P(k+1)A(k)x^*(k). \tag{5.3.23}
\]
Premultiplying by R(k) and solving for u*(k),
\[
u^*(k) = -L_a(k)x^*(k) \tag{5.3.24}
\]
where L_a(k), called the Kalman gain matrix, is
\[
L_a(k) = [B'(k)P(k+1)B(k) + R(k)]^{-1} B'(k)P(k+1)A(k). \tag{5.3.25}
\]
Let us note from the optimal feedback control law (5.3.24) that the
Kalman gains are dependent on the solution of the matrix DRE (5.3.20)
involving the system matrices and performance index matrices. Finally,
the closed-loop optimal control (5.3.24) with the state (5.2.1) gives us
the optimal system
\[
x^*(k+1) = [A(k) - B(k)L_a(k)] x^*(k). \tag{5.3.26}
\]
Using the gain relation (5.3.25), an alternate form for the matrix DRE
(5.3.20) becomes
\[
P(k) = A'(k)P(k+1) [A(k) - B(k)L_a(k)] + Q(k). \tag{5.3.27}
\]
Let us now make some notes:
1. There is essentially more than one form of the matrix DRE, given
by (5.3.11) or (5.3.12), (5.3.20), and (5.3.27).
2. However, the Kalman feedback gain matrix has only two forms:
the first form (5.3.17), which goes with the DRE (5.3.11)
or (5.3.12), and the second form (5.3.25), which corresponds to the
DRE (5.3.20) or (5.3.27).
3. It is a simple matter to see that the matrix DRE (5.3.11) and
the associated Kalman feedback gain matrix (5.3.17) involve the
inversion of the matrix I + E(k)P(k + 1) once only, whereas the
matrix DRE (5.3.20) and the associated Kalman feedback gain
matrix (5.3.25) together involve two matrix inversions. The number
of matrix inversions directly affects the overall computation
time, especially if one is looking for on-line implementation of the
closed-loop optimal control strategy.
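That the two families of formulas are equivalent can be checked numerically. The following Python sketch, which is not part of the original text, uses the matrices of Example 5.3 (A, B, Q, R, and F as identified in (5.3.33)) and a few small helper functions that are assumptions of this sketch. At every backward step it compares the DRE (5.3.11) with gain (5.3.17) against the gain (5.3.25) with DRE (5.3.27).

```python
def mat_mul(x, y):
    """Product of two matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def mat_add(x, y, sign=1.0):
    """x + sign*y for matrices of equal size."""
    return [[x[i][j] + sign * y[i][j] for j in range(len(x[0]))] for i in range(len(x))]


def transpose(x):
    return [list(row) for row in zip(*x)]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


A = [[0.8, 1.0], [0.0, 0.6]]
B = [[1.0], [0.5]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = 1.0                                   # scalar control weighting
I2 = [[1.0, 0.0], [0.0, 1.0]]
At, Bt = transpose(A), transpose(B)
E = [[v / R for v in row] for row in mat_mul(B, Bt)]      # E = B R^{-1} B'


def step_form_1(P_next):
    """DRE (5.3.11) followed by the gain (5.3.17)."""
    M = inv2(mat_add(I2, mat_mul(E, P_next)))
    P = mat_add(mat_mul(mat_mul(mat_mul(At, P_next), M), A), Q)
    L = [[v / R for v in row]
         for row in mat_mul(mat_mul(Bt, inv2(At)), mat_add(P, Q, -1.0))]
    return P, L


def step_form_2(P_next):
    """Gain (5.3.25) followed by the DRE (5.3.27)."""
    s = mat_mul(mat_mul(Bt, P_next), B)[0][0] + R         # B'P(k+1)B + R, scalar here
    La = [[v / s for v in row] for row in mat_mul(mat_mul(Bt, P_next), A)]
    P = mat_add(mat_mul(mat_mul(At, P_next), mat_add(A, mat_mul(B, La), -1.0)), Q)
    return P, La


P1 = P2 = [[2.0, 0.0], [0.0, 4.0]]        # P(kf) = F(kf)
max_diff = 0.0
for _ in range(10):                        # k = kf-1 down to k0
    P1, L = step_form_1(P1)
    P2, La = step_form_2(P2)
    diffs = [abs(P1[i][j] - P2[i][j]) for i in range(2) for j in range(2)]
    diffs += [abs(L[0][j] - La[0][j]) for j in range(2)]
    max_diff = max(max_diff, max(diffs))
print(max_diff)
```

The printed discrepancy is at the level of rounding error, so for this example the choice between the two forms is a matter of computational convenience rather than of the result.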
5.3.2 Optimal Cost Function
For finding the optimal cost function J*(k_0), we can follow the same
procedure as the one used for the continuous-time systems in Chapter 3
to get
\[
J^* = \frac{1}{2} x^{*\prime}(k_0) P(k_0) x^*(k_0). \tag{5.3.28}
\]
Let us note that the Riccati function P(k) is generated off-line before
we obtain the optimal control u*(k) to be applied to the system. Thus,
in general, for any initial stage k, we have the optimal cost as
\[
J^*(k) = \frac{1}{2} x^{*\prime}(k) P(k) x^*(k). \tag{5.3.29}
\]
The entire procedure is now summarized in Table 5.3. The actual
implementation of this control law is shown in Figure 5.2. We now
illustrate the previous procedure by considering a second-order system
with a general cost function.
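The cost relation (5.3.28) can be verified by accumulating the performance index along the closed-loop trajectory. The Python sketch below does this for a scalar plant; the numerical data and the function name are hypothetical and serve only as an illustration. It uses the scalar DRE (5.3.11), the gain (5.3.25), and the performance index (5.2.3).

```python
def optimal_cost_check(a, b, q, r, f, steps, x0):
    """Return (cost summed along the optimal trajectory, 0.5*P(k0)*x0^2)."""
    e = b * b / r
    p = [0.0] * (steps + 1)
    p[steps] = f                                        # P(kf) = F(kf), (5.3.14)
    for k in range(steps - 1, -1, -1):                  # scalar DRE (5.3.11)
        p[k] = a * a * p[k + 1] / (1.0 + e * p[k + 1]) + q
    x, cost = x0, 0.0
    for k in range(steps):
        l_a = b * p[k + 1] * a / (b * p[k + 1] * b + r)  # gain (5.3.25)
        u = -l_a * x                                     # control (5.3.24)
        cost += 0.5 * (q * x * x + r * u * u)            # running cost in (5.2.3)
        x = a * x + b * u                                # plant (5.2.1)
    cost += 0.5 * f * x * x                              # terminal cost in (5.2.3)
    return cost, 0.5 * p[0] * x0 * x0                    # right side is (5.3.28)


# Hypothetical scalar data, chosen only for illustration.
summed, predicted = optimal_cost_check(0.9, 0.5, 1.0, 2.0, 3.0, 5, 4.0)
print(summed, predicted)
```

The two numbers agree up to rounding, so the off-line Riccati solution P(k_0) already gives the optimal cost before any trajectory is simulated.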
Example 5.3
Consider the minimization of a functional [33]
\[
J = \big[ x_1^2(k_f) + 2x_2^2(k_f) \big] + \sum_{k=k_0}^{k_f-1} \big[ 0.5x_1^2(k) + 0.5x_2^2(k) + 0.5u^2(k) \big] \tag{5.3.30}
\]
for the second-order system
\[
x_1(k+1) = 0.8x_1(k) + x_2(k) + u(k)
\]
\[
x_2(k+1) = 0.6x_2(k) + 0.5u(k) \tag{5.3.31}
\]
subject to the initial conditions
\[
x_1(k_0 = 0) = 5, \quad x_2(k_0 = 0) = 3; \quad k_f = 10, \text{ and } x(k_f) \text{ is free.} \tag{5.3.32}
\]
Table 5.3 Procedure Summary of Discrete-Time, Linear
Quadratic Regulator System
A. Statement of the Problem
Given the plant as
  x(k + 1) = A(k)x(k) + B(k)u(k),
the performance index as
  J(k_0) = \frac{1}{2} x'(k_f)F(k_f)x(k_f) + \frac{1}{2} \sum_{k=k_0}^{k_f-1} [x'(k)Q(k)x(k) + u'(k)R(k)u(k)],
and the boundary conditions as
  x(k = k_0) = x(k_0); x(k_f) is free, and k_f is fixed,
find the closed-loop optimal control, state and performance index.
B. Solution of the Problem
Step 1 Solve the matrix difference Riccati equation (DRE)
  P(k) = A'(k)P(k+1) [I + E(k)P(k+1)]^{-1} A(k) + Q(k)
  with final condition P(k = k_f) = F(k_f), where
  E(k) = B(k)R^{-1}(k)B'(k).
Step 2 Solve the optimal state x*(k) from
  x^*(k+1) = [A(k) - B(k)L(k)] x^*(k)
  with initial condition x(k_0) = x_0, where
  L(k) = R^{-1}(k)B'(k)A^{-T}(k) [P(k) - Q(k)].
Step 3 Obtain the optimal control u*(k) from
  u^*(k) = -L(k)x^*(k), where L(k) is the Kalman gain.
Step 4 Obtain the optimal performance index from
  J^* = \frac{1}{2} x^{*\prime}(k)P(k)x^*(k).
Figure 5.2 Closed-Loop Optimal Controller for Linear
Discrete-Time Regulator (block diagram: the plant with B(k), unit delay z^{-1}, and A(k); the closed-loop optimal controller forming the feedback L(k)x*(k); off-line simulation of P(k) and evaluation of L(k))
Solution: Let us first identify the various matrices by comparing
the system (5.3.31) and the PI (5.3.30) of the system with the
system (5.2.1) and the PI (5.2.3) of the general formulation as
\[
A(k) = \begin{bmatrix} 0.8 & 1.0 \\ 0.0 & 0.6 \end{bmatrix}; \quad B(k) = \begin{bmatrix} 1.0 \\ 0.5 \end{bmatrix}; \quad F(k_f) = \begin{bmatrix} 2 & 0 \\ 0 & 4 \end{bmatrix}; \quad Q(k) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \quad R = 1. \tag{5.3.33}
\]
Now let us use the procedure given in Table 5.3.
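• Step 1: Solve the matrix difference Riccati equation (5.3.11)
\[
\begin{bmatrix} p_{11}(k) & p_{12}(k) \\ p_{12}(k) & p_{22}(k) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0.8 & 0.0 \\ 1.0 & 0.6 \end{bmatrix} \begin{bmatrix} p_{11}(k+1) & p_{12}(k+1) \\ p_{12}(k+1) & p_{22}(k+1) \end{bmatrix} \times \left[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 1.0 \\ 0.5 \end{bmatrix} [1]^{-1} \begin{bmatrix} 1.0 & 0.5 \end{bmatrix} \begin{bmatrix} p_{11}(k+1) & p_{12}(k+1) \\ p_{12}(k+1) & p_{22}(k+1) \end{bmatrix} \right]^{-1} \begin{bmatrix} 0.8 & 1.0 \\ 0.0 & 0.6 \end{bmatrix} \tag{5.3.34}
\]
backwards in time starting with the final condition (5.3.14) as
\[
\begin{bmatrix} p_{11}(10) & p_{12}(10) \\ p_{12}(10) & p_{22}(10) \end{bmatrix} = F(k_f) = \begin{bmatrix} 2.0 & 0 \\ 0 & 4.0 \end{bmatrix}. \tag{5.3.35}
\]
• Step 2: The optimal control u*(k) is obtained from (5.3.16) as
\[
u^*(k) = -\begin{bmatrix} l_1(k) & l_2(k) \end{bmatrix} \begin{bmatrix} x_1^*(k) \\ x_2^*(k) \end{bmatrix} \tag{5.3.36}
\]
where L(k) = [l_1(k), l_2(k)] is given by (5.3.17).
• Step 3: Using the optimal control (5.3.36), the optimal states are
computed by solving the state equation (5.3.18) forward in time.
The Riccati equation (5.3.34), by contrast, is solved iteratively in
the backward direction. Evaluation of these solutions requires the
use of standard software such as MATLAB©, as shown below.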
*******************************************************
% Solution Using Control System Toolbox (STB)
% MATLAB Version 6
%
A=[0.8 1;0,0.6]; %% system matrix A
B=[1;0.5]; %% system matrix B
Q=[1 0;0 1]; %% performance index state weighting matrix Q
R=[1]; %% performance index control weighting matrix R
F=[2,0;0,4]; %% performance index weighting matrix F
%
x1(1)=5; %% initial condition on state x1
x2(1)=3; %% initial condition on state x2
xk=[x1(1);x2(1)];
% note that if kf = 10 then k = [k0,kf] = [0 1 2, ... ,10],
% then we have 11 points and an array x1 should have subscript
% x1(N) with N=1 to 11. This is because x(0) is illegal in array
% definition in MATLAB. Let us use N = kf+1
k0=0; % the initial instant k_0
kf=10; % the final instant k_f
N=kf+1; %
[n,n]=size(A); % fixing the order of the system matrix A
I=eye(n); % identity matrix I
E=B*inv(R)*B'; % the matrix E = BR^{-1}B'
%
% solve matrix difference Riccati equation backwards
% starting from kf to k0
% use the form P(k) = A'P(k+1)[I + EP(k+1)]^{-1}A + Q
% first fix the final condition P(k_f) = F
% note that P, Q, R are all symmetric ij = ji
Pkplus1=F;
p11(N)=F(1);
p12(N)=F(2);
p21(N)=F(3);
p22(N)=F(4);
%
for k=N-1:-1:1,
    Pk = A'*Pkplus1*inv(I+E*Pkplus1)*A+Q;
    p11(k) = Pk(1);
    p12(k) = Pk(2);
    p21(k) = Pk(3);
    p22(k) = Pk(4);
    Pkplus1 = Pk;
end
%
% calculate the feedback coefficient L
% L = R^{-1}B'A^{-T}[P(k) - Q]
%
for k = N:-1:1,
    Pk=[p11(k),p12(k);p21(k),p22(k)];
    Lk = inv(R)*B'*inv(A')*(Pk-Q);
    l1(k) = Lk(1);
    l2(k) = Lk(2);
end
%
% solve the optimal states
% x(k+1) = [A-B*L]x(k) given x(0)
%
for k=1:N-1,
    Lk = [l1(k),l2(k)];
    xk = [x1(k);x2(k)];
    xkplus1 = (A-B*Lk)*xk;
    x1(k+1) = xkplus1(1);
    x2(k+1) = xkplus1(2);
end
%
% solve for optimal control u(k) = - L(k)x(k)
%
for k=1:N,
    Lk = [l1(k),l2(k)];
    xk = [x1(k);x2(k)];
    u(k) = - Lk*xk;
end
%
% plot various values: P(k), x(k), u(k)
% let us first reorder the values of k = 0 to 10
k=0:10;
figure(1)
plot(k,p11,'k:o',k,p12,'k:+',k,p22,'k:*')
xlabel('k')
ylabel('Riccati Coefficients')
gtext('p_{11}(k)')
gtext('p_{12}(k)=p_{21}(k)')
gtext('p_{22}(k)')
%
figure(2)
plot(k,x1,'k:o',k,x2,'k:+')
xlabel('k')
ylabel('Optimal States')
gtext('x_1(k)')
gtext('x_2(k)')
%
figure(3)
plot(k,u,'k:*')
xlabel('k')
ylabel('Optimal Control')
gtext('u(k)')
% end of the program
%
*********************************************************
The Riccati coefficients of the matrix P(k) obtained using MATLAB©
are shown in Figure 5.3. The optimal states are plotted in Figure 5.4
and the optimal control is shown in Figure 5.5.
Figure 5.3 Riccati Coefficients for Example 5.3 (p_{11}(k), p_{12}(k) = p_{21}(k), and p_{22}(k) plotted against k = 0, ..., 10)
Figure 5.4 Optimal Control and States for Example 5.3 (optimal states x_1(k) and x_2(k) plotted against k = 0, ..., 10)
Figure 5.5 Optimal Control and States for Example 5.3 (optimal control u(k) plotted against k = 0, ..., 10)

5.4 Steady-State Regulator System
Here, we let k_f tend to ∞ and this necessitates that we assume the
time-invariant case. Thus, the linear time-invariant plant becomes
\[
x(k+1) = Ax(k) + Bu(k) \tag{5.4.1}
\]
and the performance index becomes
\[
J = \frac{1}{2} \sum_{k=k_0}^{\infty} \big[ x^{*\prime}(k)Qx^*(k) + u^{*\prime}(k)Ru^*(k) \big]. \tag{5.4.2}
\]
As the final time k_f tends to ∞, we have the Riccati matrix P(k)
attaining a steady-state value \bar{P} in (5.3.11). That is,
\[
P(k) = P(k+1) = \bar{P} \tag{5.4.3}
\]
resulting in the algebraic Riccati equation (ARE) as
\[
\bar{P} = A'\bar{P} \big[ I + BR^{-1}B'\bar{P} \big]^{-1} A + Q \tag{5.4.4}
\]
or (5.3.12) as
\[
\bar{P} = A' \big[ \bar{P}^{-1} + E \big]^{-1} A + Q
\]
where E = BR^{-1}B'. The feedback optimal control (5.3.15) becomes
\[
u^*(k) = -R^{-1}B'A^{-T} [\bar{P} - Q] x^*(k) = -\bar{L}x^*(k) \tag{5.4.5}
\]
where the Kalman gain (5.3.17) becomes
\[
\bar{L} = R^{-1}B'A^{-T} [\bar{P} - Q] \tag{5.4.6}
\]
and A^{-T} is the inverse of A'.
Alternate Form: An alternate form for the ARE (5.4.4) is obtained
by considering the steady-state form of the DRE (5.3.20) as
\[
\bar{P} = A' \big\{ \bar{P} - \bar{P}B [B'\bar{P}B + R]^{-1} B'\bar{P} \big\} A + Q. \tag{5.4.7}
\]
The optimal feedback control (5.3.24) becomes
\[
u^*(k) = -\bar{L}_a x^*(k) \tag{5.4.8}
\]
where the alternate optimal feedback gain matrix (5.3.25) becomes
\[
\bar{L}_a = [B'\bar{P}B + R]^{-1} B'\bar{P}A. \tag{5.4.9}
\]
The optimal control (5.4.8) with the state (5.4.1) gives us the optimal
system
\[
x^*(k+1) = [A - B\bar{L}_a] x^*(k). \tag{5.4.10}
\]
The optimal cost function (5.3.29) becomes
\[
J^*(k) = \frac{1}{2} x^{*\prime}(k) \bar{P} x^*(k). \tag{5.4.11}
\]
The entire procedure is now summarized in Table 5.4. The implementation
of this closed-loop optimal control for the steady-state (k_f → ∞)
case is shown in Figure 5.6. We now illustrate the previous procedure
by considering the same system of Example 5.3.

Table 5.4 Procedure Summary of Discrete-Time, Linear
Quadratic Regulator System: Steady-State Condition
A. Statement of the Problem
Given the plant as
  x(k + 1) = Ax(k) + Bu(k),
the performance index as
  J(k_0) = \frac{1}{2} \sum_{k=k_0}^{\infty} [x'(k)Qx(k) + u'(k)Ru(k)],
where k_f = ∞,
find the optimal control, state and the performance index.
B. Solution of the Problem
Step 1 Solve the matrix algebraic Riccati equation
  \bar{P} = A'\bar{P} [I + BR^{-1}B'\bar{P}]^{-1} A + Q, or
  \bar{P} = A' \{ \bar{P} - \bar{P}B [B'\bar{P}B + R]^{-1} B'\bar{P} \} A + Q.
Step 2 Solve the optimal state x*(k) from
  x^*(k+1) = [A - B\bar{L}] x^*(k), or
  x^*(k+1) = [A - B\bar{L}_a] x^*(k),
  with initial condition x(k_0) = x_0, where
  \bar{L} = R^{-1}B'A^{-T} [\bar{P} - Q] and
  \bar{L}_a = [B'\bar{P}B + R]^{-1} B'\bar{P}A.
Step 3 Obtain the optimal control u*(k) from
  u^*(k) = -\bar{L}x^*(k), or
  u^*(k) = -\bar{L}_a x^*(k).
Step 4 Obtain the optimal performance index from
  J^*(k_0) = \frac{1}{2} x^{*\prime}(k_0) \bar{P} x^*(k_0).

Figure 5.6 Closed-Loop Optimal Control for Discrete-Time
Steady-State Regulator System (block diagram: the plant with B, A, input u*(k), and state x*(k); the closed-loop optimal controller with constant gain L)
Example 5.4
Consider the minimization of the performance index (5.3.30) without
the terminal cost function and the plant (5.3.31) with k_f = ∞.
Let us find the closed-loop optimal control for this system.
Solution: We have already identified the various matrices of the
state (5.2.1) and the performance index (5.2.3) as given by (5.3.33),
except that we now have F(k_f) = 0.
Then, the solution of the algebraic Riccati equation (5.4.7), the
closed-loop optimal control (5.4.8) and the optimal states (5.4.10)
are best obtained using MATLAB©, as shown below.
*******************************************************
% Solution Using Control System Toolbox (STB) in
% the MATLAB, Version 6
%
A=[0.8 1;0,0.6]; %% system matrix A
B=[1;0.5]; %% system matrix B
Q=[1 0;0 1]; %% performance index state weighting matrix Q
R=[1]; %% performance index control weighting matrix R
%
x1(1)=5; %% initial condition on state x1
x2(1)=3; %% initial condition on state x2
xk=[x1(1);x2(1)];
% note that if kf = 10 then k = [k0,kf] = [0 1 2, ... ,10],
% then we have 11 points and an array x1 should have subscript
% x1(N) with N=1 to 11. This is because x(0) is illegal in array
% definition in MATLAB. Let us use N = kf+1
k0=0; % the initial instant k_0
kf=10; % the final instant k_f
N=kf+1; %
[n,n]=size(A); % fixing the order of the system matrix A
I=eye(n); % identity matrix I
E=B*inv(R)*B'; % the matrix E = BR^{-1}B'
%
% solve matrix algebraic Riccati equation
% use the form P = A'{P - PB[B'PB + R]^{-1}B'P}A + Q
% note that P, Q, R are all symmetric ij = ji
% calculate the feedback coefficient L
% L = [B'PB + R]^{-1}B'PA
%
[P,EIGVAL,L,RR] = dare(A,B,Q,R)
%
% P =
%     1.3944    0.3738
%     0.3738    1.7803
% EIGVAL =
%     0.3211 + 0.2151i
%     0.3211 - 0.2151i
% L =
%     0.3937    0.7281
%
% solve the optimal states
% x(k+1) = (A-B*L)x(k) given x(0)
%
for k=1:N-1,
    xk = [x1(k);x2(k)];
    xkplus1 = (A-B*L)*xk;
    x1(k+1) = xkplus1(1);
    x2(k+1) = xkplus1(2);
end
%
% solve for optimal control u(k) = - Lx(k)
%
for k=1:N,
    xk = [x1(k);x2(k)];
    u(k) = - L*xk;
end
%
k=0:10;
figure(1)
plot(k,x1,'k:o',k,x2,'k:+')
xlabel('k')
ylabel('Optimal States')
gtext('x_1(k)')
gtext('x_2(k)')
figure(2)
plot(k,u,'k:*')
xlabel('k')
ylabel('Optimal Control')
gtext('u(k)')
%
% end of the program
%
******************************************************
Note that the value obtained for P previously is the same as the
steady-state value for Example 5.3 (see Figure 5.3). The optimal
states are shown in Figure 5.7 and the optimal control is shown
in Figure 5.8.
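The agreement between the steady-state value and the backward recursion can also be seen without the toolbox. The Python sketch below is an illustration only, not the method used in the text: it simply iterates the DRE in the form (5.3.27), with the gain (5.3.25), from P = 0 until the iterates stop changing, using the matrices of Example 5.4. The helper function and variable names are assumptions of this sketch.

```python
def mat_mul(x, y):
    """Product of two matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


A = [[0.8, 1.0], [0.0, 0.6]]
B = [[1.0], [0.5]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = 1.0
At = [list(row) for row in zip(*A)]
Bt = [list(row) for row in zip(*B)]


def dre_step(P):
    """One backward step: gain (5.3.25), then DRE (5.3.27)."""
    BtP = mat_mul(Bt, P)
    s = mat_mul(BtP, B)[0][0] + R                      # B'PB + R, a scalar here
    La = [[v / s for v in mat_mul(BtP, A)[0]]]
    Acl = [[A[i][j] - B[i][0] * La[0][j] for j in range(2)] for i in range(2)]
    AtPAcl = mat_mul(mat_mul(At, P), Acl)
    return [[AtPAcl[i][j] + Q[i][j] for j in range(2)] for i in range(2)], La


P = [[0.0, 0.0], [0.0, 0.0]]                           # F(kf) = 0 as in Example 5.4
for iteration in range(500):
    P_next, La = dre_step(P)
    change = max(abs(P_next[i][j] - P[i][j]) for i in range(2) for j in range(2))
    P = P_next
    if change < 1e-12:                                 # iterates have settled
        break
print(P, La)
```

To the four decimal places shown in the MATLAB output, the converged P and gain should coincide with the values returned by dare.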
Figure 5.7 Implementation of Optimal Control for Example 5.4 (optimal states plotted against k = 0, ..., 10)
Figure 5.8 Implementation of Optimal Control for Example 5.4 (optimal control u(k) plotted against k = 0, ..., 10)

5.4.1 Analytical Solution to the Riccati Equation
This subsection is based on [89, 138]. The solution of the matrix
difference Riccati equation is critical for the linear quadratic regulator
system. Traditionally, the solution to the DRE (5.3.20) is obtained by
iteration in a recursive manner using the final condition (5.3.14).
Alternatively, one can obtain an analytical solution to the DRE.
Let us rewrite the Hamiltonian system (5.2.22) arising in the
time-invariant LQR system as (omitting the optimal notation (*) for clarity)
\[
\begin{bmatrix} x(k) \\ \lambda(k) \end{bmatrix} = H \begin{bmatrix} x(k+1) \\ \lambda(k+1) \end{bmatrix}. \tag{5.4.12}
\]
Here, the Hamiltonian matrix H is
\[
H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix} = \begin{bmatrix} A^{-1} & A^{-1}E \\ QA^{-1} & A' + QA^{-1}E \end{bmatrix} \tag{5.4.13}
\]
where E = BR^{-1}B'.
Let us note that
(5.4.14)
The boundary conditions for (5.4.12) are reproduced here from (5.3.1)
and (5.3.5) as
\[
x(k = k_0) = x(k_0); \quad \lambda(k_f) = F(k_f)x(k_f). \tag{5.4.15}
\]
Also, in trying to obtain the difference Riccati equation, we assumed a
transformation (5.3.6) between the state and costate as
\[
\lambda(k) = P(k)x(k); \quad \forall\, k \le k_f. \tag{5.4.16}
\]
Now, we will show that the solution P(k) to the Riccati equation
(5.3.20) can be obtained in terms of the eigenvalues and eigenvectors
of the Hamiltonian matrix H.
Let us first define
\[
J = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix} \quad \text{such that} \quad J^{-1} = -J. \tag{5.4.17}
\]
Now, using (5.4.13), (5.4.14) and (5.4.17), it is a simple matter to show
that
\[
H'JH = J. \tag{5.4.18}
\]
By premultiplying and postmultiplying (5.4.18) with J^{-1} and H^{-1},
respectively, we have
\[
H^{-1} = J^{-1}H'J \tag{5.4.19}
\]
and using (5.4.17), we get
\[
H^{-1} = -JH'J. \tag{5.4.20}
\]
Now, substituting for the quantities on the right-hand side of the
previous equation, we have
\[
H^{-1} = \begin{bmatrix} A + EA^{-T}Q & -EA^{-T} \\ -A^{-T}Q & A^{-T} \end{bmatrix} \tag{5.4.21}
\]
where A^{-T} is the transpose of A^{-1}. Now, let us show that if μ is an
eigenvalue of H, then 1/μ is also an eigenvalue of H. First, if μ is an
eigenvalue with a corresponding eigenvector [f, g]', then
\[
\begin{bmatrix} A^{-1} & A^{-1}E \\ QA^{-1} & A' + QA^{-1}E \end{bmatrix} \begin{bmatrix} f \\ g \end{bmatrix} = \mu \begin{bmatrix} f \\ g \end{bmatrix} \tag{5.4.22}
\]
and rearranging, we have
\[
\begin{bmatrix} A' + QA^{-1}E & -QA^{-1} \\ -A^{-1}E & A^{-1} \end{bmatrix} \begin{bmatrix} g \\ -f \end{bmatrix} = \mu \begin{bmatrix} g \\ -f \end{bmatrix}. \tag{5.4.23}
\]
Using (5.4.21), the coefficient matrix in (5.4.23) is recognized as the
transpose of H^{-1}. This shows that μ is also an eigenvalue of H^{-T},
and hence of H^{-1}. We know from elementary matrix algebra that if α is
an eigenvalue of a matrix A, then 1/α is an eigenvalue of the matrix
A^{-1}. Therefore, 1/μ is an eigenvalue of H, and hence the result. This
means that the eigenvalues of H can be arranged as
\[
D = \begin{bmatrix} M & 0 \\ 0 & M^{-1} \end{bmatrix} \tag{5.4.24}
\]
where M is a diagonal (Jordan) matrix containing n eigenvalues outside
the unit circle and M^{-1} is a diagonal matrix containing n eigenvalues
inside the unit circle, which means that M^{-1} is stable. Now, we
note that D can be written in terms of a nonsingular matrix W whose
columns are the eigenvectors of H as
\[
W^{-1}HW = D. \tag{5.4.25}
\]
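Let us now introduce a transformation as
\[
\begin{bmatrix} x(k) \\ \lambda(k) \end{bmatrix} = W \begin{bmatrix} v(k) \\ z(k) \end{bmatrix} = \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix} \begin{bmatrix} v(k) \\ z(k) \end{bmatrix}. \tag{5.4.26}
\]
Then, using (5.4.26), (5.4.24) and the Hamiltonian system (5.4.12), we
get
\[
\begin{bmatrix} v(k) \\ z(k) \end{bmatrix} = W^{-1} \begin{bmatrix} x^*(k) \\ \lambda^*(k) \end{bmatrix} = W^{-1}H \begin{bmatrix} x^*(k+1) \\ \lambda^*(k+1) \end{bmatrix} = W^{-1}HW \begin{bmatrix} v(k+1) \\ z(k+1) \end{bmatrix} = D \begin{bmatrix} v(k+1) \\ z(k+1) \end{bmatrix} = \begin{bmatrix} M & 0 \\ 0 & M^{-1} \end{bmatrix} \begin{bmatrix} v(k+1) \\ z(k+1) \end{bmatrix}. \tag{5.4.27}
\]
The solution of (5.4.27) in terms of the final conditions is found by first
writing it as
\[
\begin{bmatrix} v(k+1) \\ z(k+1) \end{bmatrix} = D^{-1} \begin{bmatrix} v(k) \\ z(k) \end{bmatrix}.
\]
Then, solving this for the given final conditions, we get
\[
\begin{bmatrix} v(k) \\ z(k) \end{bmatrix} = \begin{bmatrix} M^{(k_f-k)} & 0 \\ 0 & M^{-(k_f-k)} \end{bmatrix} \begin{bmatrix} v(k_f) \\ z(k_f) \end{bmatrix}. \tag{5.4.28}
\]
Here, since M is unstable (i.e., M^{(k_f-k)} does not go to zero as
(k_f - k) → ∞), we rewrite (5.4.28) as
\[
\begin{bmatrix} v(k_f) \\ z(k) \end{bmatrix} = \begin{bmatrix} M^{-(k_f-k)} & 0 \\ 0 & M^{-(k_f-k)} \end{bmatrix} \begin{bmatrix} v(k) \\ z(k_f) \end{bmatrix}. \tag{5.4.29}
\]
Next, at the final time k_f, using (5.4.16) and (5.4.26) we have
\[
\lambda(k_f) = W_{21}v(k_f) + W_{22}z(k_f) = P(k_f)x(k_f) = F(k_f)x(k_f) = F(k_f) [W_{11}v(k_f) + W_{12}z(k_f)] \tag{5.4.30}
\]
the solution of which becomes
\[
z(k_f) = T(k_f)v(k_f) \tag{5.4.31}
\]
where
\[
T(k_f) = -[W_{22} - F(k_f)W_{12}]^{-1} [W_{21} - F(k_f)W_{11}]. \tag{5.4.32}
\]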
Now, using (5.4.29) and (5.4.31),
\[
z(k) = M^{-(k_f-k)}z(k_f) = M^{-(k_f-k)}T(k_f)v(k_f) = M^{-(k_f-k)}T(k_f)M^{-(k_f-k)}v(k). \tag{5.4.33}
\]
This means that at each value of k,
\[
z(k) = T(k)v(k) \tag{5.4.34}
\]
where
\[
T(k) = M^{-(k_f-k)}T(k_f)M^{-(k_f-k)}. \tag{5.4.35}
\]
Now we relate P(k) in (5.4.16) with T(k) in (5.4.34) by first using
(5.4.26) and (5.4.16) to get
\[
\lambda^*(k) = W_{21}v(k) + W_{22}z(k) = P(k)x^*(k) = P(k) [W_{11}v(k) + W_{12}z(k)] \tag{5.4.36}
\]
and then using (5.4.34)
\[
[W_{21} + W_{22}T(k)] v(k) = P(k) [W_{11} + W_{12}T(k)] v(k). \tag{5.4.37}
\]
Since this must hold for all x(k_0), and hence for all v(k), we have
\[
P(k) = [W_{21} + W_{22}T(k)] [W_{11} + W_{12}T(k)]^{-1}. \tag{5.4.38}
\]
Finally, we have the nonrecursive analytical solution (5.4.38) to the
matrix difference Riccati equation (5.3.20). Let us note that the solution
(5.4.38) requires the relations (5.4.35) and (5.4.32), the eigenvalues
(5.4.24), the eigenvectors (5.4.25), and the given terminal cost matrix
F(k_f).
Steady-State Condition
As the terminal time $k_f \to \infty$, the matrix difference Riccati equation (5.3.20) becomes the matrix algebraic Riccati equation (5.4.7). Now, let us find the analytical expression for the solution $P$ of this ARE. As $(k_f - k) \to \infty$, $M^{-(k_f-k)} \to 0$, since $M^{-1}$ is stable. This means that in (5.4.35) $T(k) \to 0$. Then the steady-state solution of (5.4.38) gives
$$P = W_{21}W_{11}^{-1}. \tag{5.4.39}$$
Let us note that the previous steady-state solution (5.4.39) requires the unstable eigenvalues (5.4.24) and the corresponding eigenvectors (5.4.25). Thus, we have the analytical solution (5.4.39) of the ARE (5.4.7).
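The relations (5.4.32), (5.4.35), (5.4.38), and (5.4.39) can be cross-checked numerically. The following MATLAB sketch is not the Appendix C program; it is a minimal illustration under stated assumptions. The plant and weighting matrices are copied from the listing of Example 5.5, the Hamiltonian matrix H is assumed to have the form implied by the backward-in-time system (5.4.12) with E = BR^{-1}B', and all variable names (Pana, Prec, Pss, and so on) are illustrative.

```matlab
% Sketch: analytical solution of the matrix DRE via eigenvectors of H.
% Data copied from the Example 5.5 listing; all names are illustrative.
A = [0.8 1; 0 0.6];  B = [1; 0.5];
F = [2 0; 0 4];      Q = eye(2);   R = 1;
kf = 10;             n = size(A, 1);
E  = B*(R\B');
Ai = inv(A);                          % A must be nonsingular

% Assumed form of the Hamiltonian matrix of the backward system
% [x(k); lambda(k)] = H*[x(k+1); lambda(k+1)].
H = [Ai, Ai*E; Q*Ai, A' + Q*Ai*E];

% Order the eigenvectors so that D = diag(M, inv(M)) with M unstable.
[Wall, Dall] = eig(H);
[~, order]   = sort(abs(diag(Dall)), 'descend');
W   = Wall(:, order);
mu  = diag(Dall);  mu = mu(order(1:n));      % eigenvalues collected in M
W11 = W(1:n, 1:n);       W12 = W(1:n, n+1:end);
W21 = W(n+1:end, 1:n);   W22 = W(n+1:end, n+1:end);

% Terminal relation z(kf) = T(kf) v(kf), see (5.4.32).
Tf = -(W22 - F*W12) \ (W21 - F*W11);

Pana = zeros(n, n, kf+1);   % analytical P(k), stored at index k+1
Prec = zeros(n, n, kf+1);   % recursive  P(k), same indexing
Prec(:, :, kf+1) = F;
for k = kf:-1:0
    Mp = diag(mu.^(-(kf - k)));              % M^{-(kf-k)}
    T  = Mp*Tf*Mp;                           % (5.4.35)
    Pana(:, :, k+1) = real((W21 + W22*T) / (W11 + W12*T));   % (5.4.38)
    if k < kf
        Pn = Prec(:, :, k+2);                % backward DRE recursion
        Prec(:, :, k+1) = A'*Pn*((eye(n) + E*Pn)\A) + Q;
    end
end
errDRE = max(abs(Pana(:) - Prec(:)));

% Steady state: T(k) -> 0, so P = W21*inv(W11), see (5.4.39).
Pss = real(W21 / W11);
Pit = Q;
for i = 1:500                                % iterate the DRE to convergence
    Pit = A'*Pit*((eye(n) + E*Pit)\A) + Q;
end
errARE = max(abs(Pss(:) - Pit(:)));
```

The real part is taken because the eigenvectors of H are in general complex, while P(k) itself is real up to rounding. Both errDRE and errARE should be at the level of rounding error if the assumed form of H matches (5.4.12).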
Example 5.5
Consider again Example 5.3, now using the analytical solution of the matrix Riccati difference equation based on [138]. The results are obtained using the Control System Toolbox of MATLAB©, Version 6, as shown below. The solution of the matrix DRE is not readily available with MATLAB©, and hence a program was developed based on the analytical solution of the matrix DRE [138]. The following MATLAB© m file for Example 5.5 requires two additional MATLAB© files, lqrdnss.m and lqrdnssf.m, given in Appendix C. The solutions are shown in Figure 5.9. Using these Riccati gains, the optimal states $x_1^*(k)$ and $x_2^*(k)$ are shown in Figure 5.10 and the optimal control $u^*(k)$ is shown in Figure 5.11.
Figure 5.9 Riccati Coefficients for Example 5.5
[Plot of p11(k), p12(k), and p22(k) versus k, for k = 0 to 10]

****************************************************
%% Solution using Control System Toolbox and
%% the MATLAB. Version 6
%% The following file example.m requires
%% two other files lqrdnss.m and lqrdnssf.m
%% which are given in Appendix C
clear all
A=[.8,1;0,.6];
B=[1;.5];
F=[2,0;0,4];
Q=[1,0;0,1];
R=1;
kspan=[0 10];
x0(:,1)=[5.;3.];
[x,u]=lqrdnss_dsn(A,B,F,Q,R,x0,kspan);
******************************************************

Figure 5.10 Optimal States for Example 5.5
[Plot of the optimal states versus k, for k = 0 to 10]
Figure 5.11 Optimal Control for Example 5.5
[Plot of the optimal control versus k]

5.5 Discrete-Time Linear Quadratic Tracking System
In this section, we address the linear quadratic tracking (LQT) problem for a discrete-time system and are interested in obtaining a closed-loop control scheme that enables a given system to track (or follow) a desired trajectory over the given interval of time. We essentially deal with linear, time-invariant systems in order to get some elegant results, although the method can as well be applied to the nonlinear, time-varying case [89].
Let us consider a linear, time-invariant system described by the state equation
$$x(k+1) = Ax(k) + Bu(k) \tag{5.5.1}$$
and the output relation
$$y(k) = Cx(k). \tag{5.5.2}$$
The performance index to be minimized is
$$J = \frac{1}{2}\left[Cx(k_f) - z(k_f)\right]'F\left[Cx(k_f) - z(k_f)\right] + \frac{1}{2}\sum_{k=k_0}^{k_f-1}\left\{\left[Cx(k) - z(k)\right]'Q\left[Cx(k) - z(k)\right] + u'(k)Ru(k)\right\} \tag{5.5.3}$$
where $x(k)$, $u(k)$, and $y(k)$ are $n$, $r$, and $n$ order state, control, and output vectors, respectively. Also, we assume that $F$ and $Q$ are each $n \times n$ dimensional positive semidefinite symmetric matrices, and $R$ is an $r \times r$ positive definite symmetric matrix. The initial condition is given as $x(k_0)$ and the final condition $x(k_f)$ is free with $k_f$ fixed. We want the error $e(k) = y(k) - z(k)$ to be as small as possible with minimum control effort, where $z(k)$ is an $n$ dimensional reference vector. The methodology to obtain the solution for the optimal tracking system is carried out using the following steps.
• Step 1: Hamiltonian
• Step 2: State and Costate System
• Step 3: Open-Loop Optimal Control
• Step 4: Riccati and Vector Equations
• Step 5: Closed-Loop Optimal Control
Now the details follow.
• Step 1: Hamiltonian: We first formulate the Hamiltonian as
$$H(x(k), u(k), \lambda(k+1)) = \frac{1}{2}\left\{\left[Cx(k) - z(k)\right]'Q\left[Cx(k) - z(k)\right] + u'(k)Ru(k)\right\} + \lambda'(k+1)\left[Ax(k) + Bu(k)\right] \tag{5.5.4}$$
and follow the identical approach of the state regulator system described in the previous section. For the sake of simplicity, let us define
$$E = BR^{-1}B', \quad V = C'QC \quad \text{and} \quad W = C'Q. \tag{5.5.5}$$
• Step 2: State and Costate System: Using (5.2.15), (5.2.13) and (5.2.14) for the state, costate, and control, respectively, we obtain the state equation as
$$\frac{\partial H}{\partial \lambda^*(k+1)} = x^*(k+1) \;\rightarrow\; x^*(k+1) = Ax^*(k) + Bu^*(k), \tag{5.5.6}$$
the costate equation as
$$\frac{\partial H}{\partial x^*(k)} = \lambda^*(k) \;\rightarrow\; \lambda^*(k) = A'\lambda^*(k+1) + Vx^*(k) - Wz(k), \tag{5.5.7}$$
and the control equation as
$$\frac{\partial H}{\partial u^*(k)} = 0 \;\rightarrow\; 0 = B'\lambda^*(k+1) + Ru^*(k). \tag{5.5.8}$$
The final condition (5.2.37) becomes
$$\lambda(k_f) = C'FCx(k_f) - C'Fz(k_f). \tag{5.5.9}$$
• Step 3: Open-Loop Optimal Control: The relation (5.5.8) yields the open-loop optimal control as
$$u^*(k) = -R^{-1}B'\lambda^*(k+1) \tag{5.5.10}$$
and using this in the state (5.5.6) and costate (5.5.7) system, we have the Hamiltonian (canonical) system as
$$\begin{bmatrix} x^*(k+1) \\ \lambda^*(k) \end{bmatrix} = \begin{bmatrix} A & -E \\ V & A' \end{bmatrix}\begin{bmatrix} x^*(k) \\ \lambda^*(k+1) \end{bmatrix} + \begin{bmatrix} 0 \\ -W \end{bmatrix}z(k). \tag{5.5.11}$$
Thus, we see that the Hamiltonian system is similar to that obtained for the state regulator system in the previous section, except for the nonhomogeneous nature due to the forcing term $z(k)$.
• Step 4: Riccati and Vector Equations: Now, to obtain the closed-loop configuration for the optimal control (5.5.10), we may assume from the nature of the boundary condition (5.5.9) a transformation
$$\lambda^*(k) = P(k)x^*(k) - g(k) \tag{5.5.12}$$
where the matrix $P(k)$ and the vector $g(k)$ are yet to be determined. In order to do so, we essentially eliminate the costate $\lambda^*(k)$ from the canonical system (5.5.11) using the transformation (5.5.12). Thus,
$$x^*(k+1) = Ax^*(k) - EP(k+1)x^*(k+1) + Eg(k+1) \tag{5.5.13}$$
which is solved for $x^*(k+1)$ to yield
$$x^*(k+1) = \left[I + EP(k+1)\right]^{-1}\left[Ax^*(k) + Eg(k+1)\right]. \tag{5.5.14}$$
Now using (5.5.14) and (5.5.12) in the costate relation in (5.5.11), we have
$$\left[-P(k) + A'P(k+1)\left[I + EP(k+1)\right]^{-1}A + V\right]x^*(k) + \left[g(k) + A'P(k+1)\left[I + EP(k+1)\right]^{-1}Eg(k+1) - A'g(k+1) - Wz(k)\right] = 0. \tag{5.5.15}$$
This equation must hold for all values of the state $x^*(k)$, which in turn leads to the fact that the coefficient of $x^*(k)$ and the rest of the terms in (5.5.15) must individually vanish. That is,
$$P(k) = A'P(k+1)\left[I + EP(k+1)\right]^{-1}A + V \quad \text{or} \quad P(k) = A'\left[P^{-1}(k+1) + E\right]^{-1}A + V \tag{5.5.16}$$
and
$$g(k) = A'\left\{I - \left[P^{-1}(k+1) + E\right]^{-1}E\right\}g(k+1) + Wz(k) \quad \text{or} \quad g(k) = \left\{A' - A'P(k+1)\left[I + EP(k+1)\right]^{-1}E\right\}g(k+1) + Wz(k). \tag{5.5.17}$$
To obtain the boundary conditions for (5.5.16) and (5.5.17), let us compare (5.5.9) and (5.5.12) to yield
$$P(k_f) = C'FC, \tag{5.5.18}$$
$$g(k_f) = C'Fz(k_f). \tag{5.5.19}$$
Let us note that (5.5.16) is the nonlinear, matrix difference Riccati equation (DRE) to be solved backwards using the final condition (5.5.18), and the linear, vector difference equation (5.5.17) is solved backwards using the final condition (5.5.19).
• Step 5: Closed-Loop Optimal Control: Once we obtain these solutions off-line, we are ready to use the transformation (5.5.12) in the control relation (5.5.10) to get the closed-loop optimal control as
$$u^*(k) = -R^{-1}B'\left[P(k+1)x^*(k+1) - g(k+1)\right] \tag{5.5.20}$$
and substituting for the state from (5.5.6) in (5.5.20),
$$u^*(k) = -R^{-1}B'P(k+1)\left[Ax^*(k) + Bu^*(k)\right] + R^{-1}B'g(k+1). \tag{5.5.21}$$
Now premultiplying by $R$ and solving for the optimal control $u^*(k)$, we have
$$u^*(k) = -L(k)x^*(k) + L_g(k)g(k+1) \tag{5.5.22}$$
where the feedback gain $L(k)$ and the feedforward gain $L_g(k)$ are given by
$$L(k) = \left[R + B'P(k+1)B\right]^{-1}B'P(k+1)A, \tag{5.5.23}$$
$$L_g(k) = \left[R + B'P(k+1)B\right]^{-1}B'. \tag{5.5.24}$$
The optimal state trajectory is now given from (5.5.6) and (5.5.22) as
$$x^*(k+1) = \left[A - BL(k)\right]x^*(k) + BL_g(k)g(k+1). \tag{5.5.25}$$
The implementation of the discrete-time optimal tracker is shown in Figure 5.12. The complete procedure for the linear quadratic tracking system is summarized in Table 5.5.
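The backward sweep of (5.5.16) and (5.5.17), followed by the forward closed-loop simulation of (5.5.22) and (5.5.25), can be sketched in MATLAB as follows. This is a minimal illustration and not the Appendix C program. The matrices A, B, C and R = 0.01 follow Example 5.6, but the weights F = Q = diag(1, 0), the initial state x0 = [5; 3], and the horizon kf = 10 are assumptions made only for this sketch, as are all variable names.

```matlab
% Sketch: discrete-time LQ tracker, backward sweep and forward simulation.
A = [0.8 1.0; 0.0 0.6];  B = [1.0; 0.5];  C = eye(2);
F = diag([1 0]);  Q = diag([1 0]);   % ASSUMED weights: penalize x1 error only
R = 0.01;  kf = 10;  x0 = [5; 3];    % ASSUMED horizon and initial state
z = repmat([2; 0], 1, kf+1);         % reference z1(k) = 2; z2 is unweighted
n = size(A, 1);
E = B*(R\B');  V = C'*Q*C;  W = C'*Q;            % (5.5.5)

% Backward sweep; column/page j stores the value at time k = j-1.
P = zeros(n, n, kf+1);  g = zeros(n, kf+1);
P(:, :, kf+1) = C'*F*C;                          % (5.5.18)
g(:, kf+1)    = C'*F*z(:, kf+1);                 % (5.5.19)
for j = kf:-1:1
    Pn = P(:, :, j+1);
    S  = eye(n) + E*Pn;
    P(:, :, j) = A'*Pn*(S\A) + V;                          % (5.5.16)
    g(:, j)    = (A' - A'*Pn*(S\E))*g(:, j+1) + W*z(:, j); % (5.5.17)
end

% Forward simulation with feedback and feedforward gains.
x = zeros(n, kf+1);  u = zeros(1, kf);  x(:, 1) = x0;
for j = 1:kf
    Pn = P(:, :, j+1);
    G  = R + B'*Pn*B;
    L  = G \ (B'*Pn*A);                          % (5.5.23)
    Lg = G \ B';                                 % (5.5.24)
    u(j)      = -L*x(:, j) + Lg*g(:, j+1);       % (5.5.22)
    x(:, j+1) = A*x(:, j) + B*u(j);              % (5.5.6)
end
```

A simple consistency check is that the closed-loop control agrees with the open-loop form (5.5.10) when the costate is rebuilt from (5.5.12), for example u(1) against -R\(B'*(P(:,:,2)*x(:,2) - g(:,2))).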
Example 5.6
We now illustrate the previous procedure by considering the same system as in Example 5.3. Let us say that we are interested in tracking $x_1(k)$ with respect to the desired trajectory $z_1(k) = 2$ and we do not have any condition on the second state $x_2(k)$. Then the various matrices are
$$A(k) = \begin{bmatrix} 0.8 & 1.0 \\ 0.0 & 0.6 \end{bmatrix}; \quad B(k) = \begin{bmatrix} 1.0 \\ 0.5 \end{bmatrix}; \quad C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \quad F(k_f) = [\ldots]; \quad Q(k) = [\ldots]; \quad R = 0.01. \tag{5.5.26}$$
Now let us use the procedure given in Table 5.5. Note that one has to try various values of the matrix $R$ in order to get a better tracking of the states.
The various solutions are obtained using MATLAB© Version 6. Figure 5.13 shows the Riccati functions $p_{11}(k)$, $p_{12}(k)$, and $p_{22}(k)$; Figure 5.14 shows the vector coefficients $g_1(k)$ and $g_2(k)$; Figure 5.15 gives the optimal states and Figure 5.16 gives the optimal control.
The MATLAB© program used is given in Appendix C.
Table 5.5 Procedure Summary of Discrete-Time Linear Quadratic Tracking System

A. Statement of the Problem
Given the plant as
  $x(k+1) = Ax(k) + Bu(k)$,
the output relation as
  $y(k) = Cx(k)$,
the performance index as
  $J(k_0) = \frac{1}{2}[Cx(k_f) - z(k_f)]'F[Cx(k_f) - z(k_f)] + \frac{1}{2}\sum_{k=k_0}^{k_f-1}\{[Cx(k) - z(k)]'Q[Cx(k) - z(k)] + u'(k)Ru(k)\}$
and the boundary conditions as
  $x(k_0) = x_0$, $x(k_f)$ is free, and $k_f$ is fixed,
find the optimal control and state.

B. Solution of the Problem
Step 1  Solve the matrix difference Riccati equation
  $P(k) = A'P(k+1)[I + EP(k+1)]^{-1}A + V$
  with $P(k_f) = C'FC$, where $V = C'QC$ and $E = BR^{-1}B'$.
Step 2  Solve the vector difference equation
  $g(k) = A'\{I - [P^{-1}(k+1) + E]^{-1}E\}g(k+1) + Wz(k)$
  with $g(k_f) = C'Fz(k_f)$, where $W = C'Q$.
Step 3  Solve for the optimal state $x^*(k)$ as
  $x^*(k+1) = [A - BL(k)]x^*(k) + BL_g(k)g(k+1)$
  where $L(k) = [R + B'P(k+1)B]^{-1}B'P(k+1)A$ and $L_g(k) = [R + B'P(k+1)B]^{-1}B'$.
Step 4  Obtain the optimal control as
  $u^*(k) = -L(k)x^*(k) + L_g(k)g(k+1)$.
Figure 5.12 Implementation of Discrete-Time Optimal Tracker
[Block diagram: plant with delay $z^{-1}$ and matrix $A$ producing $x^*(k)$; closed-loop optimal controller forming $L(k)x^*(k)$; off-line simulation of $P(k)$ and $g(k)$ driven by the desired $z(k)$]

Figure 5.13 Riccati Coefficients for Example 5.6
[Plot of p11(k), p12(k), and p22(k) versus k, for k = 0 to 10]

Figure 5.14 Coefficients g1(k) and g2(k) for Example 5.6
[Plot of g1(k) and g2(k) versus k, for k = 0 to 10]

Figure 5.15 Optimal States for Example 5.6
[Plot of x1(k) and x2(k) versus k, for k = 0 to 10]

Figure 5.16 Optimal Control for Example 5.6
[Plot of u(k) versus k]

5.6 Frequency-Domain Interpretation
This section is based on [89]. In this section, we use the frequency domain to derive some results from the classical control point of view for a linear, time-invariant, discrete-time optimal control system in the infinite-time case described earlier in Section 5.4. For this, we know that the optimal control involves the solution of the matrix algebraic Riccati equation. For ready reference, we reproduce here the results of the time-invariant case described earlier in this chapter. For the plant
$$x(k+1) = Ax(k) + Bu(k), \tag{5.6.1}$$
the optimal feedback control is
$$u^*(k) = -Lx^*(k) \tag{5.6.2}$$
where
$$L = \left[B'PB + R\right]^{-1}B'PA. \tag{5.6.3}$$
Rewriting the above by premultiplying by $[B'PB + R]$,
$$\left[B'PB + R\right]L = B'PA. \tag{5.6.4}$$
Here, $P$ is the solution of the ARE
$$P = A'\left[P - PB\left[B'PB + R\right]^{-1}B'P\right]A + Q \tag{5.6.5}$$
where we assume that $[A, B]$ is stabilizable and $[A, \sqrt{Q}]$ is observable. With this optimal control (5.6.2), the optimal system becomes
$$x^*(k+1) = \left[A - BL\right]x^*(k) \tag{5.6.6}$$
and is asymptotically stable. Here, the open-loop characteristic polynomial of the system is
$$\Delta_o(z) = |zI - A| \tag{5.6.7}$$
whereas the optimal closed-loop characteristic polynomial is [89]
$$\Delta_c(z) = |zI - A + BL| = \left|I + BL[zI - A]^{-1}\right| \cdot |zI - A| = \left|I + L[zI - A]^{-1}B\right| \cdot \Delta_o(z). \tag{5.6.8}$$
From Figure 5.17, we note that
1. $L[zI - A]^{-1}B$ is called the loop gain matrix, and
2. $I + L[zI - A]^{-1}B$ is termed the return difference matrix.
First of all, let us note that
$$P - A'PA = [z^{-1}I - A]'P[zI - A] + [z^{-1}I - A]'PA + A'P[zI - A]. \tag{5.6.9}$$
Using the ARE (5.6.5) to replace the left-hand side of the previous equation, we have
$$[z^{-1}I - A]'P[zI - A] + [z^{-1}I - A]'PA + A'P[zI - A] + A'PB\left[B'PB + R\right]^{-1}B'PA = Q. \tag{5.6.10}$$
Figure 5.17 Closed-Loop Discrete-Time Optimal Control System
[Block diagram: plant with input $U(z)$ and output $X(z)$, in feedback with the closed-loop optimal controller $L$]
Premultiplying by $B'[z^{-1}I - A]^{-T}$ (where, for example, we define $M^{-T}$ as the transpose of $M^{-1}$) and postmultiplying by $[zI - A]^{-1}B$, the previous relation becomes
$$B'PB + B'PA[zI - A]^{-1}B + B'[z^{-1}I - A]^{-T}A'PB + B'[z^{-1}I - A]^{-T}A'PB\left[B'PB + R\right]^{-1}B'PA[zI - A]^{-1}B = B'[z^{-1}I - A]^{-T}Q[zI - A]^{-1}B. \tag{5.6.11}$$
Using (5.6.4) in the above, we get
$$B'PB + \left[B'PB + R\right]L[zI - A]^{-1}B + B'[z^{-1}I - A]^{-T}L'\left[B'PB + R\right] + B'[z^{-1}I - A]^{-T}L'\left[B'PB + R\right]L[zI - A]^{-1}B = B'[z^{-1}I - A]^{-T}Q[zI - A]^{-1}B. \tag{5.6.12}$$
Now, adding the positive definite matrix $R$ to both sides of the previous equation and factoring, we get
$$B'[z^{-1}I - A']^{-1}Q[zI - A]^{-1}B + R = \left[I + L[z^{-1}I - A]^{-1}B\right]'\left[B'PB + R\right]\left[I + L[zI - A]^{-1}B\right]. \tag{5.6.13}$$
With the fictitious output equation as
$$y(k) = Cx(k) + Du(k) \tag{5.6.14}$$
where $Q = C'C$ and $R = D'D$, and using (5.6.1), the transfer function relating the output $y(k)$ and the control $u^*(k)$ becomes
$$\begin{bmatrix} C[zI - A]^{-1}B \\ D \end{bmatrix}. \tag{5.6.15}$$
Also, we see that
$$\begin{bmatrix} C[z^{-1}I - A]^{-1}B \\ D \end{bmatrix}'\begin{bmatrix} C[zI - A]^{-1}B \\ D \end{bmatrix} = B'[z^{-1}I - A]^{-T}Q[zI - A]^{-1}B + R. \tag{5.6.16}$$
Then, we can easily see from (5.6.13) that the previous transfer function product can be expressed in terms of the return difference matrix. Hence, the relation (5.6.15) implies that Figure 5.17 shows a return difference product.
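The return difference identity (5.6.13) and the determinant relation for the closed-loop characteristic polynomial can be checked numerically at any test point z. The MATLAB sketch below is only an illustration under assumptions: the plant and weights are borrowed from the Example 5.5 listing, the ARE (5.6.5) is solved by simple fixed-point iteration rather than by a toolbox routine, the test point z = exp(j0.7) is arbitrary, and all names are illustrative. Note the use of the nonconjugate transpose (.') for complex quantities.

```matlab
% Sketch: numerical check of the return difference identity (5.6.13).
A = [0.8 1; 0 0.6];  B = [1; 0.5];  Q = eye(2);  R = 1;   % borrowed data
n = size(A, 1);  r = size(B, 2);

% Solve the ARE (5.6.5) by fixed-point iteration (assumed to converge).
P = Q;
for i = 1:500
    G = B'*P*B + R;
    P = A'*(P - P*B*(G\(B'*P)))*A + Q;
end
G = B'*P*B + R;
L = G \ (B'*P*A);                       % (5.6.3)

z   = exp(1i*0.7);                      % arbitrary point on the unit circle
Phi = @(s) inv(s*eye(n) - A);           % resolvent [sI - A]^{-1}

% Left and right sides of (5.6.13); .' is the plain transpose.
lhs = B.'*Phi(1/z).'*Q*Phi(z)*B + R;
rhs = (eye(r) + L*Phi(1/z)*B).' * G * (eye(r) + L*Phi(z)*B);
errRD = max(abs(lhs(:) - rhs(:)));

% Closed-loop polynomial = |return difference| * open-loop polynomial.
detC   = det(z*eye(n) - A + B*L);
detO   = det(z*eye(n) - A);
errDet = abs(detC - det(eye(r) + L*Phi(z)*B)*detO);

rho = max(abs(eig(A - B*L)));           % closed-loop spectral radius
```

With a converged P, both errRD and errDet should be at rounding level, and rho should be less than one, in line with the asymptotic stability of (5.6.6).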
5.7 Problems
1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and
plot all the optimal controls and states for all problems. Provide
the relevant MATLAB© m files.
Problem 5.1 Show that the coefficient matrix $P(k)$ in the matrix difference Riccati equation (5.3.11)
$$P(k) = A'(k)P(k+1)\left[I + E(k)P(k+1)\right]^{-1}A(k) + Q(k)$$
is positive definite.
Problem 5.2 A second-order discrete-time system is given by
$x_1(k+1) = 2x_1(k) + 0.2x_2(k)$,  $x_1(0) = 2$
$x_2(k+1) = 2x_2(k) + 0.2u(k)$,  $x_2(0) = 0$.
The performance index to be minimized is
kfl
J = 2
1
0 L [xI(k)2 + u(k)2]
k=ko
where, ko = 0, and kf = 10. In order to drive the above system to the
final states xI(10) = x2(10) = 0,
( a) find the openloop optimal control, and
(b) find the closedloop optimal control.
Problem 5.3 Find the open-loop optimal control sequence u(0), u(1), u(2)
for the second-order discrete-time system
and the performance index
kfl
J = L [xi(k) + u
2
(k)]
k=ko
where $k_f = 5$, $x_1(5)$ is unspecified, and $x_2(5) = 0$.
Problem 5.4 Derive the relation for the optimal cost function given by (5.3.28) as
$$J^* = \frac{1}{2}x^{*\prime}(k_0)P(k_0)x^*(k_0).$$
Problem 5.5 Given the plant as
$$x(k+1) = A(k)x(k) + B(k)u(k),$$
the performance index as
$$J(k_0) = \frac{1}{2}x'(k_f)F(k_f)x(k_f) + \frac{1}{2}\sum_{k=k_0}^{k_f-1}\left[x'(k)Q(k)x(k) + u'(k)R(k)u(k)\right]$$
and the fixed-end boundary conditions as $x(k = k_0) = x(k_0)$, $x(k = k_f) = x(k_f)$ with $k_f$ fixed, obtain the closed-loop optimal control based on similar results for the continuous-time optimal control system described in Chapter 4.
Problem 5.6 For Problem 5.3, find the closedloop optimal control
for both (a) kf = 3 and (b) kf = 00.
Problem 5.7 For Problem 5.3, add a capability to track a ramp reference signal $z_1(k) = 0.5k$, $z_2(k) = 0$.
Problem 5.8 For Problem 5.3, add a capability to track a low-frequency sinusoidal function.
Problem 5.9 Obtain the Hamilton-Jacobi-Bellman equation for the discrete-time optimal control problem.
Problem 5.10 For the D.C. motor speed control system described in
Problem 1.1, obtain
1. the discretetime model based on zeroorder hold,
2. the closedloop optimal control to keep the speed constant at a
particular value, and
3. the closedloop optimal control to track the speed along a ramp
function 0.5t.
Problem 5.11 For the liquidlevel control system described in Prob
lem 1.2, obtain
1. the discretetime model based on zeroorder hold,
2. the closedloop optimal control to keep the liquid level constant
at a particular value, and
3. the closedloop optimal control to track the liquid level along a
ramp function 0.25t.
Problem 5.12 For the mechanical control system described in Prob
lem 1.4, find the discretetime model based on zeroorder hold and
then find the closedloop optimal control to track the system along (i)
a constant value and (ii) a ramp function.
Problem 5.13 [105] The discretized model using zero-order hold of the longitudinal motion of an aircraft is given by
$$x(k+1) = Ax(k) + Bu(k)$$
where,
0.9237 0.3081 0.0 0.0530 0.0904
0.0397 0.9955 0.0 0.1075 0.5889
A = 0.0871 1.900 1.0 0.6353 0.3940
0.0356 0.0101 0.0 0.0078 0.1374
0.0696 0.0127 0.0 0.0971 0.2874
0.0428 0.0004 0.1540
0.4846 0.5154 0.0023
B = 0.1615 0.0675 0.0053
0.2020 0.2893 0.0051
0.8068 0.8522 0.0064
$x_1(k)$ = velocity, ft/sec
$x_2(k)$ = pitch angle, deg
$x_3(k)$ = altitude, ft
$x_4(k)$ = angle of attack, deg
$x_5(k)$ = pitch angular velocity, deg/sec
$u_1(k)$ = elevator deflection, deg
$u_2(k)$ = flap deflection, deg
$u_3(k)$ = throttle position, deg
Formulate a performance index to minimize the errors in states and to
minimize the control effort. Obtain optimal controls and states for the
system.
Chapter 6
Pontryagin Minimum
Principle
In previous chapters, we introduced the Pontryagin principle in obtaining the optimal conditions for a plant or system along with a performance index and boundary conditions. In this chapter, we present the Pontryagin Minimum Principle (PMP) and the related topics of dynamic programming and the Hamilton-Jacobi-Bellman (HJB) equation. This chapter also bridges the previous chapters dealing with unconstrained optimal control with the next chapter focusing on constrained optimal control. This chapter is motivated by [79, 89].1

1 The permissions given by John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons, Inc., New York, NY, 1986, and Prentice Hall for D. E. Kirk, Optimal Control Theory: An Introduction, Prentice Hall, Englewood Cliffs, NJ, 1970, are hereby acknowledged.

6.1 Constrained System
In this section, we consider a practical limitation on controls and states [79]. Let us reconsider the optimal control system (see Chapter 2, Table 2.1)
$$\dot{x}(t) = f(x(t), u(t), t) \tag{6.1.1}$$
where $x(t)$ and $u(t)$ are the $n$ and $r$ dimensional state and control unconstrained variables, respectively, and the performance index
$$J = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{6.1.2}$$
with given boundary conditions
$$x(t = 0) = x_0, \quad x(t = t_f) = x_f \text{ is free}, \quad t_f \text{ is free}. \tag{6.1.3}$$
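The important stages in obtaining the optimal control for the previous system are
1. the formulation of the Hamiltonian
$$\mathcal{H}(x(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t)f(x(t), u(t), t), \tag{6.1.4}$$
where $\lambda(t)$ is the costate variable, and
2. the three relations for control, state and costate as
$$0 = +\left(\frac{\partial \mathcal{H}}{\partial u}\right)_* \quad \text{control relation,} \tag{6.1.5}$$
$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_* \quad \text{state relation, and} \tag{6.1.6}$$
$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* \quad \text{costate relation} \tag{6.1.7}$$
to solve for the optimal values $u^*(t)$, $x^*(t)$, and $\lambda^*(t)$, respectively, along with the general boundary condition
$$\left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{*t_f}\delta t_f + \left[\frac{\partial S}{\partial x} - \lambda(t)\right]'_{*t_f}\delta x_f = 0. \tag{6.1.8}$$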
In the previous problem formulation, we assumed that the control $u(t)$ and the state $x(t)$ are unconstrained, that is, there are no limitations (restrictions or bounds) on the magnitudes of the control and state variables. But, in reality, the physical systems to be controlled in an optimum manner have some constraints on their inputs (controls), internal variables (states) and/or outputs due to considerations mainly regarding safety, cost and other inherent limitations. For example, consider the following cases.
1. In a D.C. motor used in a typical positional control system, the input voltage to the field or armature circuit is limited to certain standard values, say, 110 or 220 volts. Also, the magnetic flux in the field circuit saturates after a certain value of field current.
2. Thrust of a rocket engine used in a space shuttle launch control system cannot exceed a certain designed value.
3. Speed of an electric motor used in a typical speed control system cannot exceed a certain value without damaging some of the mechanical components such as bearings and shaft.
This optimal control problem for a system with constraints was addressed by Pontryagin et al., and the results were enunciated in the celebrated Pontryagin Minimum Principle [109]. In their original works, the previous optimization problem was addressed to maximize the Hamiltonian
$$H(x(t), u(t), \lambda(t), t) = -V(x(t), u(t), t) + \lambda'(t)f(x(t), u(t), t) \tag{6.1.9}$$
which is equivalent to minimization of the Hamiltonian defined in (6.1.4). However, here and throughout this book, we define the Hamiltonian as in (6.1.4) and use the Pontryagin Minimum Principle (PMP) or simply the Pontryagin Principle. Before we consider the optimal control system with control constraints, a historical perspective is in order regarding the Pontryagin Minimum Principle [52, 53].
Lev Semyonovich Pontryagin (born September 3, 1908, Moscow, Russia; died May 3, 1988, Moscow, Russia) lost his eyesight when he was about 14 years old due to an explosion. His mother, although she did not know mathematics herself, became his tutor by just reading and describing the various mathematical symbols as they appeared to her. L. S. Pontryagin (also known as L. S. by his associates) entered Moscow State University in 1925, and during the 1930s and 1940s he made significant contributions to topology, leading to the publication of Topological Groups, which was translated into several languages. Later, as head of the Steklov Mathematical Institute, he devoted himself to engineering problems of mathematics and soon focused on two major problems: the general theory of singularly perturbed systems of ordinary differential equations and the maximum principle in optimal control theory. In particular, in 1955, he formulated a general time-optimal control problem for a fifth-order dynamical system describing optimal maneuvers of an aircraft with bounded control functions. In trying to "invent a new calculus of variations," he spent "three consecutive sleepless nights," and came up with the idea of the Hamiltonian formulation for the problem and the adjoint differential equations, "thanks to his wonderful geometric insight."
Pontryagin suggested to his former students and close associates V. Boltyanski and R. V. Gamkrelidze to join him in continuing investigations into optimal control systems. Pontryagin and Boltyanski focused on controllability, and Gamkrelidze investigated the second variation of the optimal control system, which in fact led to the maximum principle. It took nearly a year to complete the full proof of the maximum principle, and it was published as a short note in [25]. E. F. Mishchenko closely collaborated with L. S. during his later years and contributed to the development of the theory of singularly perturbed ordinary differential equations and differential games. The maximum principle was first presented at the International Congress of Mathematicians held in Edinburgh, UK, in 1958. The summary of the works on optimal control was published in English translation in 1962 by this group of mathematicians [109]. Pontryagin and his associates, for their works, were awarded the Lenin Prize in 1961.
6.2 Pontryagin Minimum Principle
In Chapter 2, for finding the optimal control $u^*(t)$ for the problem described by the plant (6.1.1), performance index (6.1.2), and boundary conditions (6.1.3), we used arbitrary variations in control $u(t) = u^*(t) + \delta u(t)$ to define the increment $\Delta J$ and the (first) variation $\delta J$ in $J$ as
$$\Delta J(u^*(t), \delta u(t)) = J(u(t)) - J(u^*(t)) \ge 0 \text{ for minimum}$$
$$= \delta J(u^*(t), \delta u(t)) + \text{higher-order terms} \tag{6.2.1}$$
where the first variation is
$$\delta J = \frac{\partial J}{\partial u}\delta u(t). \tag{6.2.2}$$
Also, in Chapter 2, in order to obtain the optimal control of unconstrained systems, we applied the fundamental theorem of calculus of variations (Theorem 2.1), i.e., the necessary condition of minimization is that the first variation $\delta J$ must be zero for an arbitrary variation $\delta u(t)$. But now we place restrictions on the control $u(t)$ such as
$$\|u(t)\| \le U \tag{6.2.3}$$
or component wise,
$$U_j^- \le u_j(t) \le U_j^+ \tag{6.2.4}$$
where $U_j^-$ and $U_j^+$ are the lower and upper bounds or limits on the control function $u_j(t)$. Then, we can no longer assume that the control variation $\delta u(t)$ is arbitrary for all $t \in [t_0, t_f]$. In other words, the variation $\delta u(t)$ is not arbitrary if the extremal control $u^*(t)$ lies on the boundary or reaches a limit. If, for example, an extremal control $u^*(t)$ lies on the boundary during some interval $[t_a, t_b]$ of the entire interval $[t_0, t_f]$, as shown in Figure 6.1(a), then the negative admissible control variation $-\delta u(t)$ is not allowable, as shown in Figure 6.1(b) [79]. The reason for taking $+\delta u(t)$ and $-\delta u(t)$ the way it is shown will be apparent later. Then, assuming that all the admissible variations $\|\delta u(t)\|$ are small enough that the sign of the increment $\Delta J$ is determined by the sign of the variation $\delta J$, the necessary condition for $u^*(t)$ to minimize $J$ is that the first variation
$$\delta J(u^*(t), \delta u(t)) \ge 0. \tag{6.2.5}$$
Summarizing, the relation for the first variation (6.2.5) is valid if $u^*(t)$ lies on the boundary (or has a constraint) during any portion of the time interval $[t_0, t_f]$, and the first variation $\delta J = 0$ if $u^*(t)$ lies within the boundary (or has no constraint) during the entire time interval $[t_0, t_f]$.
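Next, let us see how the constraint affects the necessary conditions (6.1.5) to (6.1.6), which were derived by using the assumption that the admissible control values $u(t)$ are unconstrained. Using the results of Chapter 2, we have the first variation as
$$\delta J(u^*(t), \delta u(t)) = \int_{t_0}^{t_f}\left\{\left[\frac{\partial \mathcal{H}}{\partial x} + \dot{\lambda}(t)\right]'_*\delta x(t) + \left[\frac{\partial \mathcal{H}}{\partial u}\right]'_*\delta u(t) + \left[\frac{\partial \mathcal{H}}{\partial \lambda} - \dot{x}(t)\right]'_*\delta\lambda(t)\right\}dt + \left[\frac{\partial S}{\partial x} - \lambda(t)\right]'_{*t_f}\delta x_f + \left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{*t_f}\delta t_f. \tag{6.2.6}$$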
Figure 6.1 (a) An Optimal Control Function Constrained by a Boundary (b) A Control Variation for Which $-\delta u(t)$ Is Not Admissible [79]
[Two plots of u(t) versus t: (a) u*(t) touching the constraint boundary of the admissible region; (b) the varied control u*(t) + δu(t) relative to the constraint boundary]
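In the above,
1. if the optimal state $\mathbf{x}^*(t)$ equations are satisfied, it
results in the state relation (6.1.6),
2. if the costate $\boldsymbol{\lambda}^*(t)$ is selected so that the
coefficient of the dependent variation $\delta\mathbf{x}(t)$ in the
integrand is identically zero, it results in the costate condition
(6.1.7), and
3. the boundary condition is selected such that it results in the
auxiliary boundary condition (6.1.8).
When the previous items are satisfied, the first variation (6.2.6)
becomes
$$\delta J(\mathbf{u}^*(t), \delta\mathbf{u}(t)) = \int_{t_0}^{t_f} \left[\frac{\partial \mathcal{H}}{\partial \mathbf{u}}\right]'_* \delta\mathbf{u}(t)\, dt. \tag{6.2.7}$$
The integrand in the previous relation is the first-order approximation
to the change in the Hamiltonian $\mathcal{H}$ due to a change in
$\mathbf{u}(t)$ alone. This means that, by definition,
$$\left[\frac{\partial \mathcal{H}}{\partial \mathbf{u}}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t)\right]' \delta\mathbf{u}(t) \doteq \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t) + \delta\mathbf{u}(t), \boldsymbol{\lambda}^*(t), t) - \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t). \tag{6.2.8}$$
Then, using (6.2.8) in the first variation (6.2.7), we have
$$\delta J(\mathbf{u}^*(t), \delta\mathbf{u}(t)) = \int_{t_0}^{t_f} \left[\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t) + \delta\mathbf{u}(t), \boldsymbol{\lambda}^*(t), t) - \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t)\right] dt. \tag{6.2.9}$$
Now, using the above, the necessary condition (6.2.5) becomes
$$\int_{t_0}^{t_f} \left[\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t) + \delta\mathbf{u}(t), \boldsymbol{\lambda}^*(t), t) - \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t)\right] dt \ge 0 \tag{6.2.10}$$
for all admissible $\delta\mathbf{u}(t)$ whose norm is less than a small
value. Because the inequality must hold for every such variation,
including variations that are nonzero only on an arbitrarily short
subinterval, the integrand itself must be nonnegative, and the relation
(6.2.10) becomes
$$\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t) + \delta\mathbf{u}(t), \boldsymbol{\lambda}^*(t), t) \ge \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t). \tag{6.2.11}$$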
Replacing $\mathbf{u}^*(t) + \delta\mathbf{u}(t)$ by $\mathbf{u}(t)$, the
necessary condition (6.2.10) becomes
$$\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t) \le \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}(t), \boldsymbol{\lambda}^*(t), t) \tag{6.2.12}$$
or, in other words,
$$\min_{|\mathbf{u}(t)| \le \mathbf{U}} \left\{\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}(t), \boldsymbol{\lambda}^*(t), t)\right\} = \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t). \tag{6.2.13}$$
The previous relation states that the necessary condition for the
constrained optimal control system is that the optimal control should
minimize the Hamiltonian. This is the main contribution of the
Pontryagin Minimum Principle. We note that this is only a necessary
condition and is not, in general, sufficient for optimality.
6.2.1 Summary of Pontryagin Principle
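The Pontryagin Principle is now summarized below. Given the plant
as
$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t), \tag{6.2.14}$$
the performance index as
$$J = S(\mathbf{x}(t_f), t_f) + \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\, dt, \tag{6.2.15}$$
and the boundary conditions as
$$\mathbf{x}(t_0) = \mathbf{x}_0 \text{ and } t_f,\ \mathbf{x}(t_f) = \mathbf{x}_f \text{ are free}, \tag{6.2.16}$$
to find the optimal control, form the Pontryagin $\mathcal{H}$ function
$$\mathcal{H}(\mathbf{x}(t), \mathbf{u}(t), \boldsymbol{\lambda}(t), t) = V(\mathbf{x}(t), \mathbf{u}(t), t) + \boldsymbol{\lambda}'(t)\mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t), \tag{6.2.17}$$
minimize $\mathcal{H}$ w.r.t. $\mathbf{u}(t)$ ($\le \mathbf{U}$) as
$$\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t) \le \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}(t), \boldsymbol{\lambda}^*(t), t), \tag{6.2.18}$$
and solve the set of 2n state and costate equations
$$\dot{\mathbf{x}}^*(t) = \left(\frac{\partial \mathcal{H}}{\partial \boldsymbol{\lambda}}\right)_* \quad \text{and} \quad \dot{\boldsymbol{\lambda}}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_* \tag{6.2.19}$$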
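Table 6.1 Summary of Pontryagin Minimum Principle

A. Statement of the Problem
Given the plant as
    $\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t)$,
the performance index as
    $J = S(\mathbf{x}(t_f), t_f) + \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\, dt$,
and the boundary conditions as
    $\mathbf{x}(t_0) = \mathbf{x}_0$ and $t_f$ and $\mathbf{x}(t_f) = \mathbf{x}_f$ are free,
find the optimal control.

B. Solution of the Problem
Step 1  Form the Pontryagin $\mathcal{H}$ function
    $\mathcal{H}(\mathbf{x}(t), \mathbf{u}(t), \boldsymbol{\lambda}(t), t) = V(\mathbf{x}(t), \mathbf{u}(t), t) + \boldsymbol{\lambda}'(t)\mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t)$
Step 2  Minimize $\mathcal{H}$ w.r.t. $\mathbf{u}(t)$ ($\le \mathbf{U}$)
    $\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t) \le \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}(t), \boldsymbol{\lambda}^*(t), t)$
Step 3  Solve the set of 2n state and costate equations
    $\dot{\mathbf{x}}^*(t) = \left(\frac{\partial \mathcal{H}}{\partial \boldsymbol{\lambda}}\right)_*$ and $\dot{\boldsymbol{\lambda}}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_*$
    with boundary conditions $\mathbf{x}_0$ and
    $\left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{*t_f} \delta t_f + \left[\frac{\partial S}{\partial \mathbf{x}} - \boldsymbol{\lambda}\right]'_{*t_f} \delta\mathbf{x}_f = 0$.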
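with the boundary conditions $\mathbf{x}_0$ and
$$\left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{*t_f} \delta t_f + \left[\frac{\partial S}{\partial \mathbf{x}} - \boldsymbol{\lambda}\right]'_{*t_f} \delta\mathbf{x}_f = 0. \tag{6.2.20}$$
The entire procedure is now summarized in Table 6.1. Note that in
Figure 6.1, the variations $+\delta\mathbf{u}(t)$ and
$-\delta\mathbf{u}(t)$ are taken in such a way that the negative variation
$-\delta\mathbf{u}(t)$ is not admissible and thus we get the condition
(6.2.10). On the other hand, by taking the variations
$+\delta\mathbf{u}(t)$ and $-\delta\mathbf{u}(t)$ in such a way that the
positive variation $+\delta\mathbf{u}(t)$ is not admissible, we get the
corresponding condition as
$$\int_{t_0}^{t_f} \left[\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t) - \delta\mathbf{u}(t), \boldsymbol{\lambda}^*(t), t) - \mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t)\right] dt \ge 0 \tag{6.2.21}$$
which can again be written as (6.2.11) or (6.2.12). It should be noted
that
1. the optimality condition (6.2.12) is valid for both constrained
and unconstrained control systems, whereas the control relation
(6.1.5) is valid for unconstrained systems only,
2. the results given in Table 6.1 provide the necessary conditions
only, and
3. the sufficient condition for unconstrained control systems is that
the second derivative of the Hamiltonian
$$\frac{\partial^2 \mathcal{H}}{\partial \mathbf{u}^2}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t), t) = \left(\frac{\partial^2 \mathcal{H}}{\partial \mathbf{u}^2}\right)_* \tag{6.2.22}$$
must be positive definite.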
Let us illustrate the previous principle by a simple example in static
optimization, which is described by algebraic equations, unlike the
dynamic optimization described by differential equations.
Example 6.1
We are interested in minimizing a scalar function
$$H = u^2 - 6u + 7 \tag{6.2.23}$$
subject to the constraint relation
$$|u| \le 2 \rightarrow -2 \le u \le +2. \tag{6.2.24}$$
Solution: First let us use a relation similar to (6.1.5) for
unconstrained control as
$$\frac{\partial H}{\partial u} = 0 \rightarrow 2u^* - 6 = 0 \rightarrow u^* = 3 \tag{6.2.25}$$
and the corresponding optimal $H^*$ from (6.2.23) becomes
$$H^* = 3^2 - 6 \times 3 + 7 = -2. \tag{6.2.26}$$
This value of $u^* = 3$ is certainly outside the constraint (admissible)
region specified by (6.2.24). But, using the relation (6.2.18) for the
constrained control, we have
$$H(u^*) \le H(u), \quad \text{i.e.,} \quad u^{*2} - 6u^* + 7 \le u^2 - 6u + 7. \tag{6.2.27}$$
The complete situation is depicted in Figure 6.2, which shows that
the admissible optimal value is $u^* = +2$ and the corresponding
optimal $H^*$ is
$$H^* = 2^2 - 6 \times 2 + 7 = -1. \tag{6.2.28}$$
However, let us note that if our constraint relation (6.2.24) had been
$$|u| \le 3 \rightarrow -3 \le u \le +3 \tag{6.2.29}$$
then the unconstrained minimizer happens to lie inside the admissible
region, so we could use the relation similar to (6.1.5) or (6.2.25)
and obtain the optimal value as $u^* = 3$. In general, however, the
unconstrained relation does not give the constrained optimum.
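The comparison made in Example 6.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the grid resolution are illustrative assumptions, and a brute-force search over the admissible interval stands in for the analytical argument.

```python
def hamiltonian(u):
    """Scalar function H(u) = u^2 - 6u + 7 from (6.2.23)."""
    return u * u - 6.0 * u + 7.0


def constrained_minimizer(bound, samples=4001):
    """Search |u| <= bound on a uniform grid and return (u*, H(u*)).

    The grid search is an illustrative stand-in for condition (6.2.18):
    H(u*) <= H(u) for every admissible u.
    """
    best_u, best_h = -bound, hamiltonian(-bound)
    for i in range(samples):
        u = -bound + 2.0 * bound * i / (samples - 1)
        h = hamiltonian(u)
        if h < best_h:
            best_u, best_h = u, h
    return best_u, best_h


# Constraint (6.2.24): the minimum sits on the boundary u = +2.
u_star_2, h_star_2 = constrained_minimizer(2.0)
# Constraint (6.2.29): the unconstrained minimizer u = 3 is admissible.
u_star_3, h_star_3 = constrained_minimizer(3.0)
print(u_star_2, h_star_2)
print(u_star_3, h_star_3)
```

With the bound of 2 the search stops at the boundary, whereas with the bound of 3 it reaches the stationary point of (6.2.25), which is the distinction the example draws.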
6.2.2 Additional Necessary Conditions
In their celebrated works [109], Pontryagin and his coworkers also
obtained additional necessary conditions for constrained optimal control
systems. These are stated below without proof [109].
1. If the final time $t_f$ is fixed and the Hamiltonian $\mathcal{H}$ does
not depend on time $t$ explicitly, then the Hamiltonian $\mathcal{H}$ must
be constant when evaluated along the optimal trajectory; that is,
$$\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t)) = \text{constant} = C_1 \quad \forall t \in [t_0, t_f]. \tag{6.2.30}$$
2. If the final time $t_f$ is free or not specified a priori and the
Hamiltonian does not depend explicitly on time $t$, then the Hamiltonian
must be identically zero when evaluated along the optimal trajectory;
that is,
$$\mathcal{H}(\mathbf{x}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t)) = 0 \quad \forall t \in [t_0, t_f]. \tag{6.2.31}$$
Figure 6.2 Illustration of Constrained (Admissible) Controls
Further treatment of constrained optimal control systems is carried
out in Chapter 7. According to Gregory and Lin [61], the credit for
formulating the optimal control problem for the first time in 1950 is
given to M. R. Hestenes [64]. The detailed proof of the problem was
given by a group of Russian mathematicians led by Pontryagin, and the
result is hence called the Pontryagin Minimum Principle (PMP) [109]. The
PMP is the heart of optimal control theory. However, the original
proof given by Pontryagin et al. is highly rigorous and lengthy. Several
books devote lengthy proofs to the PMP, such as Athans and Falb [6],
Lee and Markus [86], and Macki and Strauss [97]. Also see recent books
(Pinch [108] and Hocking [66]) for a simplified treatment of the proof.
6.3 Dynamic Programming
Given a dynamical process or plant and the corresponding performance
index, there are basically two ways of solving for the optimal control of
the problem: one is the Pontryagin maximum principle [109] and the
other is Bellman's dynamic programming [12, 14, 15]. Here we concentrate
on the latter, dynamic programming (DP). The technique is called
dynamic programming because it is based on computer "programming" and is
suitable for "dynamic" systems. The basic idea of DP is a discrete,
multistage optimization problem in the sense that at each of a finite
set of times, a decision is chosen from a finite number of decisions
based on some optimization criterion. The central theme of DP is based
on a simple intuitive concept called the principle of optimality.
6.3.1 Principle of Optimality
Consider a simple multistage decision optimization process shown in
Figure 6.3. Here, let the optimizing cost function for the segment AC
be $J_{AC}$ and for the segment CB be $J_{CB}$. Then the optimizing cost
for the entire segment AB is
$$J_{AB} = J_{AC} + J_{CB}. \tag{6.3.1}$$
Figure 6.3 Optimal Path from A to B
That is, if $J_{AC}$ is the optimal cost of the segment AC of the entire
optimal path AB, then $J_{CB}$ is the optimal cost of the remaining
segment CB. In other words, one can break the total optimal path into
smaller segments which are themselves optimal. Conversely, if one finds
the optimal values for these smaller segments, then one can obtain the
optimal value for the entire path. This obvious-looking property is
called the principle of optimality (PO) and is stated as follows [79]:
An optimal policy has the property that whatever the previous
state and decision (i.e., control), the remaining decisions
must constitute an optimal policy with regard to the
state resulting from the previous decision.
Backward Solution
It looks natural to start working backward from the final stage or point,
although one can also work forward from the initial stage or point. To
illustrate the principle of optimality, let us consider a multistage
decision process as shown in Figure 6.4. This may represent an aircraft
routing network or a simple message (telephone) network system.
Figure 6.4 A Multistage Decision Process
In an aircraft routing system, the initial point A and the final point B
represent the two cities to be connected and the other nodes
C, D, E, F, G, H, I represent the intermediate cities. The numbers
(called units) over each segment indicate the cost (or performance
index) of flying between the two cities. Now we are interested in
finding the most economical route to fly from city A to city B. We have
5 stages starting from k = 0 to k = N = 4. Also, we can associate the
current state with the junction or the node. The decision is made at
each state. Let the decision or control be $u = \pm 1$, where $u = +1$
indicates move up or left and $u = -1$ indicates move down or right,
looking from each junction towards the right.
Now, our working of the dynamic programming algorithm is shown
in Figure 6.5.
Figure 6.5 A Multistage Decision Process: Backward Solution
Stage 5: k = k_f = N = 4
This is just the starting point; there is only one city, B, and hence
there is no cost involved.
Stage 4: k = 3
There are two cities, H and I, at this stage and we need to find the
most economical route from this stage to stage 5. Working backward,
we begin with B, which can be reached from H or I. It takes 2 units to
fly from H to B by using control or decision $u = -1$ (downward or
right) and hence let us place the number 2 within parentheses under
H. Similarly, it takes 3 units to fly from I to B by using control or
decision $u = +1$ (upward or left) and hence we place the number 3 just
next to I. Let us also place an arrowhead on the corresponding paths
or routes. Note there is no other way of flying from H to B and I to
B except as shown by the arrows.
Stage 3: k = 2
Here, there are three cities, E, F, G, and from these nodes we can fly
to H and I. Consider first E. The total cost to fly from E to B
will be 2 + 4 = 6 by using control or decision $u = -1$ (downward or
right) and let us place 6 units in parentheses at the node E. Secondly,
from F, we can take two routes, F, H, B and F, I, B, by using decisions
$u = +1$ (upward or left) and $u = -1$ (downward or right), and the
corresponding costs are 2 + 3 = 5 and 3 + 5 = 8, respectively. Note
that we placed 5 instead of 8 at the node F and an arrowhead on
the segment FH to indicate the optimal cost to fly the route F, H, B
instead of the costlier route F, I, B. Finally, consider G. There is only
one route, G, I, B, to go to B starting from G. The cost is the
cost to fly from G to I plus the cost to fly from I to B.
Stage 2: k = 1
By the same procedure as explained above, we see that the node C
has minimum cost 9 and the node D has minimum cost 7.
Stage 1: k = 0
Here, note that from A, the two segments AC and AD have the same
minimum cost, indicating that either route is economical.
Optimal Solution
This is easy to find: we just follow the route of the arrowheads from
A to B. Note that there are two routes to go from stage 0 to stage
1. Thus, the most economical (optimal) route is either A, C, F, H, B or
A, D, F, H, B. The total minimum cost is 11 units.
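The backward sweep described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the network of Figure 6.4 itself: the segment costs H-B, I-B, E-H, F-H, and F-I are the ones quoted in the text; the costs A-C, A-D, C-F, and D-F are inferred from the quoted node costs (9 at C, 7 at D, 11 in total); and the costs C-E, D-G, and G-I are hypothetical placeholders chosen so that the quoted node costs are preserved.

```python
# Segment costs. Values marked "hypothetical" are NOT taken from the text.
EDGES = {
    "A": {"C": 2, "D": 4},   # inferred from total cost 11 and node costs 9, 7
    "C": {"E": 4, "F": 4},   # C-E hypothetical; C-F inferred
    "D": {"F": 2, "G": 3},   # D-F inferred; D-G hypothetical
    "E": {"H": 4},           # quoted: 2 + 4 = 6
    "F": {"H": 3, "I": 5},   # quoted: 2 + 3 = 5 and 3 + 5 = 8
    "G": {"I": 2},           # hypothetical
    "H": {"B": 2},           # quoted
    "I": {"B": 3},           # quoted
}
# Nodes grouped by stage, listed from the final stage back to the first.
STAGES_BACKWARD = [["B"], ["H", "I"], ["E", "F", "G"], ["C", "D"], ["A"]]


def backward_dp(edges, stages_backward):
    """Return the minimum cost-to-go of every node and its best successors."""
    cost = {stages_backward[0][0]: 0}
    best = {}
    for stage in stages_backward[1:]:
        for node in stage:
            options = {nxt: w + cost[nxt] for nxt, w in edges[node].items()}
            cost[node] = min(options.values())
            best[node] = [n for n, c in options.items() if c == cost[node]]
    return cost, best


def optimal_routes(node, best, goal="B"):
    """Follow the stored arrowheads and list every optimal route to goal."""
    if node == goal:
        return [[goal]]
    return [[node] + rest
            for nxt in best[node]
            for rest in optimal_routes(nxt, best, goal)]


cost_to_go, best_next = backward_dp(EDGES, STAGES_BACKWARD)
print(cost_to_go["A"], optimal_routes("A", best_next))
```

Storing every tied successor, rather than a single one, is what lets the sketch report both economical routes out of A.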
Forward Solution
One can solve the previous system using the forward solution, starting
from A at stage 0 and working forward through stages 1, 2, 3 and finally
to the last stage to reach B. We get the same result as in the backward
solution, as shown in Figure 6.6.
Figure 6.6 A Multistage Decision Process: Forward Solution
Thus, as shown in both the previous cases, we
1. divide the entire route into several stages,
2. find the optimal (economical) route for each stage, and
3. finally, using the principle of optimality, combine the different
optimal segments into one single optimal route (or trajectory).
In the previous cases, we have fixed both the initial and final points
and thus we have a fixed-end-point system. We can similarly address
the variable-end-point system.
Next, we explore how the principle of optimality in dynamic
programming can be applied to optimal control systems. We notice that
the dynamic programming approach is naturally suited to discrete-time
systems. Also, it can be easily applied to either linear or nonlinear
systems, whereas the optimal control of a nonlinear system using the
Pontryagin principle leads to a nonlinear two-point boundary value
problem (TPBVP), which is usually very difficult to solve.
6.3.2 Optimal Control Using Dynamic Programming
Let us first consider the optimal control of a discrete-time system.
Even if the system is continuous-time, one can easily discretize it to
obtain a discrete-time system by using one of several approaches
[82]. Let the plant be described by
$$\mathbf{x}(k+1) = \mathbf{f}(\mathbf{x}(k), \mathbf{u}(k), k) \tag{6.3.2}$$
and the cost function be
$$J_i(\mathbf{x}(k_i)) = J = S(\mathbf{x}(k_f), k_f) + \sum_{k=i}^{k_f - 1} V(\mathbf{x}(k), \mathbf{u}(k)) \tag{6.3.3}$$
where $\mathbf{x}(k)$ and $\mathbf{u}(k)$ are the n- and r-dimensional
state and control vectors, respectively. Note that we showed the
dependence of $J$ on the initial time ($k$) and state ($\mathbf{x}(k)$).
We are interested in using the principle of optimality to find the
optimal control $\mathbf{u}^*(k)$ which, applied to the plant (6.3.2),
gives the optimal state $\mathbf{x}^*(k)$. Let us assume that we have
evaluated the optimal control, state, and cost for all values starting
from $k + 1$ to $k_f$. Then, at any time or stage $k$, we use the
principle of optimality to write
$$J_k^*(\mathbf{x}(k)) = \min_{\mathbf{u}(k)} \left[ V[\mathbf{x}(k), \mathbf{u}(k)] + J_{k+1}^*(\mathbf{x}^*(k+1)) \right]. \tag{6.3.4}$$
The previous relation is the mathematical form of the principle of
optimality as applied to optimal control systems. It is also called the
functional equation of dynamic programming. It means that if one has
found the optimal control, state, and cost from any stage $k + 1$ to the
final stage $k_f$, then one can find the optimal values for a single
stage from $k$ to $k + 1$.
Example 6.2
Consider a simple scalar example to illustrate the procedure
underlying the dynamic programming method [79, 89]. Let the plant be
$$x(k+1) = x(k) + u(k) \tag{6.3.5}$$
and the performance criterion to be optimized be
$$J = \frac{1}{2}x^2(k_f) + \frac{1}{2}\sum_{k=0}^{k_f - 1}\left[u^2(k) + x^2(k)\right] \tag{6.3.6}$$
where, for simplicity of calculations, we take $k_f = 2$. Let the
constraints and the quantization values on the control be
$$-1.0 \le u(k) \le +1.0, \quad k = 0, 1 \quad \text{or} \quad u(k) = -1.0, -0.5, 0, +0.5, +1.0 \tag{6.3.7}$$
and on the state be
$$0 \le x(k) \le +2.0, \quad k = 0, 1, 2 \quad \text{or} \quad x(k) = 0, 0.5, 1.0, 1.5, 2.0. \tag{6.3.8}$$
Find the optimal control sequence $u^*(k)$ and the state $x^*(k)$ which
minimize the performance criterion (6.3.6).
Solution: To use the principle of optimality to solve the previous
system, we first set up a grid between $x(k)$ and $k$, omitting all the
arrows, arrowheads, etc. We divide the stages into two sets: one
for k = 2 and the other for k = 1, 0. We start with k = 2 and
first find the optimal values, and then work backward for k = 1, 0 using
the state equation (6.3.5), the cost function (6.3.6), and the
admissible control values (6.3.7).
Stage: k = 2
First calculate the state $x(2)$ using the state relation (6.3.5) for
all admissible values of $x(k)$ and $u(k)$ given by (6.3.7) and (6.3.8).
Thus, for example, for the admissible value of $x(1) = 2.0$ and
$u(1) = -1, -0.5, 0, 0.5, 1$, we have
$$\begin{aligned}
x(2) &= x(1) + u(1) \\
x(2) &= 2.0 + (-1) = 1.0 \\
x(2) &= 2.0 + (-0.5) = 1.5 \\
x(2) &= 2.0 + 0 = 2.0 \\
x(2) &= 2.0 + 0.5 = 2.5 \quad \text{(not admissible)} \\
x(2) &= 2.0 + 1 = 3.0 \quad \text{(not admissible)}.
\end{aligned} \tag{6.3.9}$$
Note that the values 2.5 and 3.0 (struck out in the original) of state
$x(2)$ are not allowed because they exceed the state constraint (6.3.8).
Also, corresponding to the functional equation (6.3.4), we have
for this example,
$$J_k^*(x(k)) = \min_{u(k)} \left[\frac{1}{2}u^2(k) + \frac{1}{2}x^2(k) + J_{k+1}^*\right] \tag{6.3.10}$$
from which we have for the optimal cost at k = 2
$$J_{k_f}^* = \frac{1}{2}x^2(2) \tag{6.3.11}$$
which is evaluated for all admissible values of $x(2)$ as
$$\begin{aligned}
J_{k_f}^* &= 2.000 \text{ for } x(2) = 2.0 \\
&= 1.125 \text{ for } x(2) = 1.5 \\
&= 0.500 \text{ for } x(2) = 1.0 \\
&= 0.125 \text{ for } x(2) = 0.5 \\
&= 0.000 \text{ for } x(2) = 0.
\end{aligned} \tag{6.3.12}$$
The entire computations are shown in Table 6.2 for k = 2 and in
Table 6.3 for k = 1, 0.
The data from Tables 6.2 and 6.3 corresponding to optimal conditions
is represented in the dynamic programming context in Figure 6.7. In
this figure, $u_0^* = u^*(x(0), 0)$ and $u_1^* = u^*(x(1), 1)$, and the
quantities within parentheses are the optimal cost values
at that stage and state. For example, at stage k = 1 and state
$x(k) = 1.0$, the value $J_{12}^* = 0.75$ indicates that the cost of
transferring the state from $x(1)$ to $x(2)$ is 0.75. Thus, in
Figure 6.7, to find the optimal trajectory for any initial state, we
simply follow the arrows. For example, to transfer the state
$x(0) = 1$ to state $x(2) = 0$, we need to apply first $u_0^* = -1$ to
transfer it to $x(1) = 0$ and then $u_1^* = 0$.
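The backward recursion of this example is short enough to be run directly. The sketch below is an illustration rather than the author's program: it applies the recursion (6.3.10) with the plant $x(k+1) = x(k) + u(k)$, the stage cost $0.5u^2(k) + 0.5x^2(k)$, the terminal cost $0.5x^2(2)$, and the five-point quantizations of control and state; the names `backward_dp`, `cost`, and `policy` are assumptions of the sketch. Its results can be set beside Tables 6.2 and 6.3 as an independent check of the tabulated arithmetic.

```python
CONTROLS = [-1.0, -0.5, 0.0, 0.5, 1.0]   # quantized u(k), see (6.3.7)
STATES = [0.0, 0.5, 1.0, 1.5, 2.0]       # quantized x(k), see (6.3.8)
K_FINAL = 2


def backward_dp():
    """Return (cost, policy): cost[k][x] = J*_k(x), policy[k][x] = u*(x, k)."""
    # Terminal cost J*_2(x) = 0.5 x^2, as in (6.3.11).
    cost = {K_FINAL: {x: 0.5 * x * x for x in STATES}}
    policy = {}
    for k in range(K_FINAL - 1, -1, -1):
        cost[k], policy[k] = {}, {}
        for x in STATES:
            candidates = []
            for u in CONTROLS:
                x_next = x + u                    # state equation (6.3.5)
                if x_next in cost[k + 1]:         # drop inadmissible next states
                    stage_cost = 0.5 * u * u + 0.5 * x * x
                    candidates.append((stage_cost + cost[k + 1][x_next], u))
            # Smallest total cost wins; ties go to the smaller control value.
            cost[k][x], policy[k][x] = min(candidates)
    return cost, policy


cost, policy = backward_dp()
for k in (1, 0):
    for x in reversed(STATES):
        print(k, x, cost[k][x], policy[k][x])
```

All grid values are multiples of 0.5, so the membership test on the next state is exact in floating point; a finer or irregular grid would need a tolerance or the interpolation discussed in the note that follows.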
Note: In the previous example, it so happened that for the given
control and state quantization and constraint values (6.3.7) and
(6.3.8), respectively, the values calculated using
$x(k+1) = x(k) + u(k)$ either exactly coincide with the quantized values
or fall outside the range. In some cases, it may happen that for the
given control and state quantization and constraint values, the
corresponding values of the states do not exactly coincide with the
quantized values, in which case we need to perform some kind of
interpolation on the values. For example, let us say the state
constraint and quantization is
$$-1 \le x(k) \le +2, \quad k = 0, 1 \quad \text{or} \quad x(k) = -1.0, 0, 0.5, 1.0, 2.0. \tag{6.3.13}$$
Table 6.2 Computation of Cost during the Last Stage k = 2

Current  Current  Next     Cost    Optimal Cost        Optimal Control
State    Control  State
x(1)     u(1)     x(2)     J_12    J*_12(x(1))         u*(x(1),1)
----------------------------------------------------------------------
2.0      -1.0     1.0      3.0
         -0.5     1.5      2.25    J*_12(2.0) = 2.25   u*(2.0,1) = -0.5
          0       2.0      4.0
         +0.5     [2.5]
         +1.0     [3.0]
1.5      -1.0     0.5      1.75    J*_12(1.5) = 1.75   u*(1.5,1) = -1.0
         -0.5     1.0      1.75    J*_12(1.5) = 1.75   u*(1.5,1) = -0.5
          0       1.5      2.25
         +0.5     2.0      3.25
         +1.0     [2.5]
1.0      -1.0     0        1.0
         -0.5     0.5      0.75    J*_12(1.0) = 0.75   u*(1.0,1) = -0.5
          0       1.0      1.0
         +0.5     1.5      1.75
         +1.0     2.0      3.0
0.5      -1.0     [-0.5]
         -0.5     0        0.25    J*_12(0.5) = 0.25   u*(0.5,1) = -0.5
          0       0.5      0.25    J*_12(0.5) = 0.25   u*(0.5,1) = 0
         +0.5     1.0      0.75
         +1.0     1.5      1.75
0        -1.0     [-1.0]
         -0.5     [-0.5]
          0       0        0       J*_12(0) = 0        u*(0,1) = 0
         +0.5     0.5      0.25
         +1.0     1.0      1.0

Use these to calculate the above: x(2) = x(1) + u(1);
J_12 = 0.5x^2(2) + 0.5u^2(1) + 0.5x^2(1).
A bracketed next state (struck out in the original) is not admissible.
Table 6.3 Computation of Cost during the Stage k = 1, 0

Current  Current  Next     Cost    Optimal Cost         Optimal Control
State    Control  State
x(0)     u(0)     x(1)     J_02    J*_02(x(0))          u*(x(0),0)
-----------------------------------------------------------------------
2.0      -1.0     1.0      3.25    J*_02(2.0) = 3.25    u*(2.0,0) = -1.0
         -0.5     1.5      3.875
          0       2.0      4.25
         +0.5     [2.5]
         +1.0     [3.0]
1.5      -1.0     0.5      1.875   J*_02(1.5) = 1.875   u*(1.5,0) = -1.0
         -0.5     1.0      2.0
          0       1.5      2.875
         +0.5     2.0      3.25
         +1.0     [2.5]
1.0      -1.0     0        1.0     J*_02(1.0) = 1.0     u*(1.0,0) = -1.0
         -0.5     0.5      0.875
          0       1.0      1.25
         +0.5     1.5      2.375
         +1.0     2.0      3.0
0.5      -1.0     [-0.5]
         -0.5     0        0.25    J*_02(0.5) = 0.25    u*(0.5,0) = -0.5
          0       0.5      0.375
         +0.5     1.0      1.0
         +1.0     1.5      2.375
0        -1.0     [-1.0]
         -0.5     [-0.5]
          0       0        0       J*_02(0) = 0         u*(0,0) = 0
         +0.5     0.5      0.375
         +1.0     1.0      1.25

Use these to calculate the above: x(1) = x(0) + u(0);
J_02 = 0.5u^2(0) + 0.5x^2(0) + J*_12(x(1)).
A bracketed next state (struck out in the original) is not admissible.
Figure 6.7 Programming Framework of Optimal State
Feedback Control
[Grid of x(k) versus k; the quantities in parentheses are the optimal
costs J*_02, J*_12, and J*_2 at each stage and state.]
Then, for $x(1) = 2.0$ and $u(1) = -0.5$, when we try to use the state
equation (6.3.5) to find $x(2) = x(1) + u(1)$, we get $x(2) = 1.5$ which,
although not an allowable quantized value, is within the constraint
(limit). Hence, we cannot simply calculate the quantity
$J_2 = 0.5x^2(2)$ as $J_2 = 0.5(1.5)^2 = 1.125$; instead, using linear
interpolation between the neighboring quantized values $x(2) = 1$ and
$x(2) = 2$, we calculate it as
$$\begin{aligned}
J_2 = 0.5[x(2) = 1.5]^2 &\approx 0.5[x(2) = 1]^2 + \frac{0.5[x(2) = 2]^2 - 0.5[x(2) = 1]^2}{2} \\
&= 0.5 + \frac{2 - 0.5}{2} = 1.25.
\end{aligned} \tag{6.3.14}$$
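The interpolation step can be written as a small helper. This is a minimal sketch under the assumptions of the note: the grid is the quantization quoted in (6.3.13), the interpolation is linear between the two bracketing grid points, and the names `terminal_cost` and `interpolated_cost` are illustrative only.

```python
GRID = [-1.0, 0.0, 0.5, 1.0, 2.0]   # quantized state values, as in (6.3.13)


def terminal_cost(x2):
    """J_2 = 0.5 x^2(2), evaluated only at quantized grid points."""
    return 0.5 * x2 * x2


def interpolated_cost(x, grid=GRID):
    """Linearly interpolate the tabulated cost between bracketing grid points."""
    for lo, hi in zip(grid, grid[1:]):
        if lo <= x <= hi:
            weight = (x - lo) / (hi - lo)
            return (1.0 - weight) * terminal_cost(lo) + weight * terminal_cost(hi)
    raise ValueError("state lies outside the constraint range")


# x(2) = 1.5 lies midway between the grid points 1.0 and 2.0.
j2_at_1p5 = interpolated_cost(1.5)
print(j2_at_1p5)
```

The interpolated value differs from the exact $0.5(1.5)^2$ because only the grid-point costs are assumed to be stored, which is the situation the note describes.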
We notice that the dynamic programming technique is computationally
intensive, especially as the order of the system and the number of
stages increase. Bellman [15] branded this increased computational
burden inherent in dynamic programming the "curse of dimensionality."
However, with the tremendous advances in high-speed computational tools
since then, the "curse" may be a "boon" due to the special advantages of
dynamic programming in treating both linear and nonlinear systems with
ease and in handling constraints on states and/or controls.
6.3.3 Optimal Control of Discrete-Time Systems
Here, we derive the optimal feedback control of a discrete-time
system using the principle of optimality of dynamic programming [79,
89]. Consider a linear, time-invariant, discrete-time plant,
$$\mathbf{x}(k+1) = \mathbf{A}\mathbf{x}(k) + \mathbf{B}\mathbf{u}(k) \tag{6.3.15}$$
and the associated performance index
$$J_i = \frac{1}{2}\mathbf{x}'(k_f)\mathbf{F}\mathbf{x}(k_f) + \frac{1}{2}\sum_{k=i}^{k_f - 1}\left[\mathbf{x}'(k)\mathbf{Q}\mathbf{x}(k) + \mathbf{u}'(k)\mathbf{R}\mathbf{u}(k)\right] \tag{6.3.16}$$
where $\mathbf{x}(k)$ and $\mathbf{u}(k)$ are n- and r-dimensional state
and control vectors, and $\mathbf{A}$ and $\mathbf{B}$ are matrices of
n x n and n x r dimensions, respectively. Further, $\mathbf{F}$ and
$\mathbf{Q}$ are each n x n symmetric, positive semidefinite matrices,
and $\mathbf{R}$ is an r x r symmetric, positive definite matrix.
For our present discussion, let us assume that there are no constraints
on the state or control.
The problem is to find the optimal control $\mathbf{u}^*(k)$ for
$i \le k \le k_f$ that minimizes the performance index $J_i$ using the
principle of optimality. Let us assume further that the initial state
$\mathbf{x}(k_0)$ is fixed and the final state $\mathbf{x}(k_f)$ is free.
In using dynamic programming, we start with the final stage
$\mathbf{x}(k_f)$ and work backwards. At each stage, we find the optimal
control and state. Let us start with the last stage $k = k_f$.
Last Stage: k = k_f
Let us first note that at $i = k_f$,
$$J_{k_f} = \frac{1}{2}\mathbf{x}'(k_f)\mathbf{F}\mathbf{x}(k_f). \tag{6.3.17}$$
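Previous to Last Stage: k = k_f - 1
At $i = k_f - 1$, the cost function (6.3.16) becomes
$$J_{k_f-1} = \frac{1}{2}\mathbf{x}'(k_f-1)\mathbf{Q}\mathbf{x}(k_f-1) + \frac{1}{2}\mathbf{u}'(k_f-1)\mathbf{R}\mathbf{u}(k_f-1) + \frac{1}{2}\mathbf{x}'(k_f)\mathbf{F}\mathbf{x}(k_f). \tag{6.3.18}$$
According to the functional equation of the principle of optimality
(6.3.4), we need to find the optimal control $\mathbf{u}^*(k_f-1)$ to
minimize the cost function (6.3.18). Before that, let us rewrite the
relation (6.3.18) so that all of its terms belong to stage $k_f - 1$.
For this, using (6.3.15) in (6.3.18), we have
$$J_{k_f-1} = \frac{1}{2}\mathbf{x}'(k_f-1)\mathbf{Q}\mathbf{x}(k_f-1) + \frac{1}{2}\mathbf{u}'(k_f-1)\mathbf{R}\mathbf{u}(k_f-1) + \frac{1}{2}\left[\mathbf{A}\mathbf{x}(k_f-1) + \mathbf{B}\mathbf{u}(k_f-1)\right]'\mathbf{F}\left[\mathbf{A}\mathbf{x}(k_f-1) + \mathbf{B}\mathbf{u}(k_f-1)\right]. \tag{6.3.19}$$
Since there are no constraints on states or controls, we can easily find
the minimum value of (6.3.19) w.r.t. $\mathbf{u}(k_f-1)$ by simply making
$$\frac{\partial J_{k_f-1}}{\partial \mathbf{u}(k_f-1)} = \mathbf{R}\mathbf{u}^*(k_f-1) + \mathbf{B}'\mathbf{F}\left[\mathbf{A}\mathbf{x}(k_f-1) + \mathbf{B}\mathbf{u}^*(k_f-1)\right] = 0. \tag{6.3.20}$$
Solving for $\mathbf{u}^*(k_f-1)$, we have
$$\mathbf{u}^*(k_f-1) = -\left[\mathbf{R} + \mathbf{B}'\mathbf{F}\mathbf{B}\right]^{-1}\mathbf{B}'\mathbf{F}\mathbf{A}\mathbf{x}(k_f-1) = -\mathbf{L}(k_f-1)\mathbf{x}(k_f-1) \tag{6.3.21}$$
where
$$\mathbf{L}(k_f-1) = \left[\mathbf{R} + \mathbf{B}'\mathbf{F}\mathbf{B}\right]^{-1}\mathbf{B}'\mathbf{F}\mathbf{A} \tag{6.3.22}$$
is also called the Kalman gain. Now the optimal cost $J^*_{k_f-1}$ for
this stage $k_f - 1$ is found by substituting the optimal control
$\mathbf{u}^*(k_f-1)$ from (6.3.21) into the cost function (6.3.19) to get
$$\begin{aligned}
J^*_{k_f-1} &= \frac{1}{2}\mathbf{x}'(k_f-1)\left[\{\mathbf{A} - \mathbf{B}\mathbf{L}(k_f-1)\}'\mathbf{F}\{\mathbf{A} - \mathbf{B}\mathbf{L}(k_f-1)\} + \mathbf{L}'(k_f-1)\mathbf{R}\mathbf{L}(k_f-1) + \mathbf{Q}\right]\mathbf{x}(k_f-1) \\
&= \frac{1}{2}\mathbf{x}'(k_f-1)\mathbf{P}(k_f-1)\mathbf{x}(k_f-1)
\end{aligned} \tag{6.3.23}$$
where
$$\mathbf{P}(k_f-1) = \{\mathbf{A} - \mathbf{B}\mathbf{L}(k_f-1)\}'\mathbf{F}\{\mathbf{A} - \mathbf{B}\mathbf{L}(k_f-1)\} + \mathbf{L}'(k_f-1)\mathbf{R}\mathbf{L}(k_f-1) + \mathbf{Q}. \tag{6.3.24}$$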
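Stage: k_f - 2
Using $i = k_f - 2$ in the cost function (6.3.16), we have
$$J_{k_f-2} = \frac{1}{2}\mathbf{x}'(k_f)\mathbf{F}\mathbf{x}(k_f) + \frac{1}{2}\mathbf{x}'(k_f-2)\mathbf{Q}\mathbf{x}(k_f-2) + \frac{1}{2}\mathbf{u}'(k_f-2)\mathbf{R}\mathbf{u}(k_f-2) + \frac{1}{2}\mathbf{x}'(k_f-1)\mathbf{Q}\mathbf{x}(k_f-1) + \frac{1}{2}\mathbf{u}'(k_f-1)\mathbf{R}\mathbf{u}(k_f-1). \tag{6.3.25}$$
Now, using (6.3.15) to replace $\mathbf{x}(k_f)$, (6.3.21) to replace
$\mathbf{u}(k_f-1)$, and (6.3.24) in (6.3.25), we get
$$J_{k_f-2} = \frac{1}{2}\mathbf{x}'(k_f-2)\mathbf{Q}\mathbf{x}(k_f-2) + \frac{1}{2}\mathbf{u}'(k_f-2)\mathbf{R}\mathbf{u}(k_f-2) + \frac{1}{2}\mathbf{x}'(k_f-1)\mathbf{P}(k_f-1)\mathbf{x}(k_f-1) \tag{6.3.26}$$
where $\mathbf{P}(k_f-1)$ is given by (6.3.24). At this stage, we need to
express all functions at stage $k_f - 2$. Then, once again, for this
stage, to determine $\mathbf{u}^*(k_f-2)$ according to the optimality
principle (6.3.4), we minimize $J_{k_f-2}$ in (6.3.26) w.r.t.
$\mathbf{u}(k_f-2)$ and get relations similar to (6.3.21), (6.3.22),
(6.3.23), and (6.3.24). For example, the optimal cost function becomes
$$J^*_{k_f-2} = \frac{1}{2}\mathbf{x}'(k_f-2)\mathbf{P}(k_f-2)\mathbf{x}(k_f-2) \tag{6.3.27}$$
where $\mathbf{P}(k_f-2)$ is obtained similarly to (6.3.24), except that
we replace $k_f - 1$ by $k_f - 2$. We continue this procedure for all
other stages $k_f - 3, k_f - 4, \ldots, k_0$.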
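Any Stage k
Now we are in a position to generalize the previous set of relations
for any $k$. Thus, the optimal control is given by
$$\mathbf{u}^*(k) = -\mathbf{L}(k)\mathbf{x}^*(k), \tag{6.3.28}$$
where the Kalman gain $\mathbf{L}(k)$ is given by
$$\mathbf{L}(k) = \left[\mathbf{R} + \mathbf{B}'\mathbf{P}(k+1)\mathbf{B}\right]^{-1}\mathbf{B}'\mathbf{P}(k+1)\mathbf{A}, \tag{6.3.29}$$
the matrix $\mathbf{P}(k)$, also called the Riccati matrix, is the
backward solution of
$$\mathbf{P}(k) = \left[\mathbf{A} - \mathbf{B}\mathbf{L}(k)\right]'\mathbf{P}(k+1)\left[\mathbf{A} - \mathbf{B}\mathbf{L}(k)\right] + \mathbf{L}'(k)\mathbf{R}\mathbf{L}(k) + \mathbf{Q} \tag{6.3.30}$$
with the final condition $\mathbf{P}(k_f) = \mathbf{F}$, and the optimal
cost function is
$$J_k^* = \frac{1}{2}\mathbf{x}^{*\prime}(k)\mathbf{P}(k)\mathbf{x}^*(k). \tag{6.3.31}$$
We notice that these are the same set of relations we obtained in
Chapter 5 by using the Pontryagin principle.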
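The backward recursion (6.3.29) and (6.3.30) can be illustrated with a scalar plant, for which the matrix inverse reduces to a division. The sketch below is an assumption-laden illustration, not the text's own program: the function names are hypothetical, and the numerical values a = b = q = r = f = 1 with k_f = 2 are chosen only because they mirror the plant and cost of Example 6.2 without its constraints and quantization.

```python
def riccati_backward(a, b, q, r, f, k_final):
    """Scalar form of (6.3.29)-(6.3.30): return lists P[0..kf] and L[0..kf-1]."""
    P = [0.0] * (k_final + 1)
    L = [0.0] * k_final
    P[k_final] = f                                    # final condition P(kf) = F
    for k in range(k_final - 1, -1, -1):
        L[k] = b * P[k + 1] * a / (r + b * P[k + 1] * b)       # gain (6.3.29)
        closed = a - b * L[k]
        P[k] = closed * P[k + 1] * closed + L[k] * r * L[k] + q  # (6.3.30)
    return P, L


def simulate_cost(a, b, q, r, f, L, x0):
    """Apply u(k) = -L(k) x(k) from x0 and accumulate the cost (6.3.16)."""
    x, total = x0, 0.0
    for gain in L:
        u = -gain * x                                 # feedback law (6.3.28)
        total += 0.5 * (q * x * x + r * u * u)
        x = a * x + b * u                             # plant (6.3.15)
    return total + 0.5 * f * x * x


P, L = riccati_backward(a=1.0, b=1.0, q=1.0, r=1.0, f=1.0, k_final=2)
x0 = 1.0
simulated = simulate_cost(1.0, 1.0, 1.0, 1.0, 1.0, L, x0)
predicted = 0.5 * P[0] * x0 * x0                      # optimal cost (6.3.31)
print(P, L, simulated, predicted)
```

Comparing the simulated closed-loop cost with $0.5P(0)x^2(0)$ is a quick consistency check on the recursion; for a matrix plant the division would be replaced by a linear solve.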
6.3.4 Optimal Control of Continuous-Time Systems
Here, we describe the dynamic programming (DP) technique as applied
to finding the optimal control of continuous-time systems. First of all,
we note that although in the previous sections DP is explained w.r.t.
the discrete-time situation, DP can also be applied to continuous-time
systems. One can either
1. discretize the continuous-time system in one way or another and
use DP as applicable to discrete-time systems, as explained
in previous sections, or
2. apply DP directly to continuous-time systems, leading to the
celebrated Hamilton-Jacobi-Bellman (HJB) equation, as presented
in the next section.
In discretizing continuous-time processes, we can employ either
1. the Euler method, or
2. the sampler and zero-order hold method.
Let us now briefly discuss these two approaches.
1. Euler Method: Let us first take up the Euler approximation of a
linear time-invariant (LTI) system (although it can be used for
nonlinear systems as well) for which the plant is
$$\dot{\mathbf{x}}(t) = \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t) \tag{6.3.32}$$
and the cost function is
$$J(0) = \frac{1}{2}\mathbf{x}'(t_f)\mathbf{F}(t_f)\mathbf{x}(t_f) + \frac{1}{2}\int_0^{t_f}\left[\mathbf{x}'(t)\mathbf{Q}\mathbf{x}(t) + \mathbf{u}'(t)\mathbf{R}\mathbf{u}(t)\right] dt \tag{6.3.33}$$
where the state vector $\mathbf{x}(t)$, the control vector
$\mathbf{u}(t)$, and the various system and weighting matrices are
defined in the usual manner. Assume some typical boundary conditions for
finding the optimal control $\mathbf{u}^*(t)$.
Using the Euler approximation of the derivative in (6.3.32) as
$$\dot{\mathbf{x}}(t) \approx \frac{\mathbf{x}(k+1) - \mathbf{x}(k)}{T} \tag{6.3.34}$$
where $T$ is the discretization (sampling) interval and
$\mathbf{x}(k) = \mathbf{x}(kT)$, the discretized version of the state
model (6.3.32) becomes
$$\mathbf{x}(k+1) = \left[\mathbf{I} + T\mathbf{A}\right]\mathbf{x}(k) + T\mathbf{B}\mathbf{u}(k). \tag{6.3.35}$$
Also, replacing the integration process in the continuous-time cost
function (6.3.33) by the summation process, we get
$$J(0) = \frac{1}{2}\mathbf{x}'(k_f)\mathbf{F}\mathbf{x}(k_f) + \frac{1}{2}\sum_{k=k_0}^{k_f - 1}\left[\mathbf{x}'(k)\mathbf{Q}_d\mathbf{x}(k) + \mathbf{u}'(k)\mathbf{R}_d\mathbf{u}(k)\right] \tag{6.3.36}$$
where $\mathbf{Q}_d = T\mathbf{Q}$ and $\mathbf{R}_d = T\mathbf{R}$.
2. Zero-Order Hold: Alternatively, using a sampler and zero-order
hold [83], the continuous-time state model (6.3.32) becomes
$$\mathbf{x}(k+1) = \mathbf{A}_d\mathbf{x}(k) + \mathbf{B}_d\mathbf{u}(k), \qquad \mathbf{A}_d = e^{\mathbf{A}T}, \quad \mathbf{B}_d = \int_0^T e^{\mathbf{A}\tau}\mathbf{B}\, d\tau. \tag{6.3.37}$$
Thus, we have the discrete-time state model (6.3.35) or (6.3.37)
and the corresponding discrete-time cost function (6.3.36), to
which we can now apply the DP method explained in the previous
sections.
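The two discretizations can be compared on a scalar plant, where the matrix exponential and the integral in (6.3.37) have closed forms. This is a minimal sketch under assumed values: the scalar plant with a = -2 and b = 1 and the sampling interval T = 0.1 are illustrative choices, and the function names are hypothetical.

```python
import math


def euler_discretize(a, b, T):
    """Scalar form of (6.3.35): x(k+1) = (1 + T a) x(k) + T b u(k)."""
    return 1.0 + T * a, T * b


def zoh_discretize(a, b, T):
    """Scalar form of (6.3.37): Ad = exp(a T), Bd = integral of exp(a tau) b."""
    a_d = math.exp(a * T)
    # For a != 0 the integral evaluates to b (exp(a T) - 1) / a.
    b_d = b * (a_d - 1.0) / a if a != 0.0 else b * T
    return a_d, b_d


a, b, T = -2.0, 1.0, 0.1          # assumed illustrative values
ad_euler, bd_euler = euler_discretize(a, b, T)
ad_zoh, bd_zoh = zoh_discretize(a, b, T)
print(ad_euler, bd_euler)
print(ad_zoh, bd_zoh)
```

The Euler coefficients are the first-order truncation of the zero-order-hold ones, so the two models approach each other as T shrinks.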
6.4 The Hamilton-Jacobi-Bellman Equation
In this section, we present an alternate method of obtaining the
closed-loop optimal control, using the principle of optimality and the
Hamilton-Jacobi-Bellman (HJB) equation. First we need to state Bellman's
principle of optimality [12]. It simply states that any portion of the
optimal trajectory is optimal. Alternatively, the optimal policy
(control) has the property that no matter what the previous decisions
(i.e., controls) have been, the remaining decisions must constitute an
optimal policy.
In Chapter 2, we considered the plant as
$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t) \tag{6.4.1}$$
and the performance index (PI) as
$$J(\mathbf{x}(t_0), t_0) = \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\, dt. \tag{6.4.2}$$
Now, we provide the alternative approach, called the
Hamilton-Jacobi-Bellman approach, and obtain a control law as a function
of the state variables, leading to closed-loop optimal control. This is
important from the practical point of view in implementing the optimal
control.
Let us define a scalar function $J^*(\mathbf{x}^*(t), t)$ as the minimum
value of the performance index $J$ for an initial state
$\mathbf{x}^*(t)$ at time $t$, i.e.,
$$J^*(\mathbf{x}^*(t), t) = \int_t^{t_f} V(\mathbf{x}^*(\tau), \mathbf{u}^*(\tau), \tau)\, d\tau. \tag{6.4.3}$$
In other words, $J^*(\mathbf{x}^*(t), t)$ is the value of the performance
index when evaluated along the optimal trajectory starting at
$\mathbf{x}(t)$. Here, we used the principle of optimality in saying that
the trajectory from $t$ to $t_f$ is optimal. However, we are not
interested in finding the optimal control for a specific initial state
$\mathbf{x}(t)$, but for any unspecified initial conditions. Thus, our
interest is in $J(\mathbf{x}(t_0), t_0)$ as a function of
$\mathbf{x}(t_0)$ and $t_0$. Now consider
$$\begin{aligned}
\frac{dJ^*(\mathbf{x}^*(t), t)}{dt} &= \left(\frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}\right)'\dot{\mathbf{x}}^*(t) + \frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial t} \\
&= \left(\frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}\right)'\mathbf{f}(\mathbf{x}^*(t), \mathbf{u}^*(t), t) + \frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial t}.
\end{aligned} \tag{6.4.4}$$
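From (6.4.3), we have
$$\frac{dJ^*(\mathbf{x}^*(t), t)}{dt} = -V(\mathbf{x}^*(t), \mathbf{u}^*(t), t). \tag{6.4.5}$$
Using (6.4.4) and (6.4.5), we get
$$\frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial t} + V(\mathbf{x}^*(t), \mathbf{u}^*(t), t) + \left(\frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}\right)'\mathbf{f}(\mathbf{x}^*(t), \mathbf{u}^*(t), t) = 0. \tag{6.4.6}$$
Let us introduce the Hamiltonian as
$$\mathcal{H} = V(\mathbf{x}(t), \mathbf{u}(t), t) + \left(\frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}\right)'\mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t). \tag{6.4.7}$$
Using (6.4.7) in (6.4.6), we have
$$\frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial t} + \mathcal{H}\left(\mathbf{x}^*(t), \frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}, \mathbf{u}^*(t), t\right) = 0, \quad \forall t \in [t_0, t_f) \tag{6.4.8}$$
with boundary condition from (6.4.3) as
$$J^*(\mathbf{x}^*(t_f), t_f) = 0 \tag{6.4.9}$$
or
$$J^*(\mathbf{x}^*(t_f), t_f) = S(\mathbf{x}^*(t_f), t_f) \tag{6.4.10}$$
if the original PI (6.4.2) contains a terminal cost function. The
equation (6.4.8) is called the Hamilton-Jacobi equation. Since this
equation is the continuous-time analog of Bellman's recurrence equations
in dynamic programming [15], it is also called the
Hamilton-Jacobi-Bellman (HJB) equation. Comparing the Hamiltonian
function (6.4.7) with that given in earlier chapters, we see that the
costate function $\boldsymbol{\lambda}^*(t)$ is given by
$$\boldsymbol{\lambda}^*(t) = \frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}. \tag{6.4.11}$$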
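Also, we know from Chapter 2 that the state and costate are related
by
$$\dot{\boldsymbol{\lambda}}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_* \tag{6.4.12}$$
and the optimal control $\mathbf{u}^*(t)$ is obtained from
$$\left(\frac{\partial \mathcal{H}}{\partial \mathbf{u}}\right)_* = 0 \rightarrow \mathbf{u}^*(t) = \mathbf{h}(\mathbf{x}^*(t), J_\mathbf{x}^*, t). \tag{6.4.13}$$
Here, comparing (6.4.11) and (6.4.12), we get
$$\frac{d}{dt}\left(\frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}\right) = \frac{d}{dt}\left[\boldsymbol{\lambda}^*(t)\right] = -\frac{\partial \mathcal{H}\left(\mathbf{x}^*(t), \frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*}, \mathbf{u}^*(t), t\right)}{\partial \mathbf{x}^*}. \tag{6.4.14}$$
Using
$$J_t^* = \frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial t}; \qquad J_\mathbf{x}^* = \frac{\partial J^*(\mathbf{x}^*(t), t)}{\partial \mathbf{x}^*} \tag{6.4.15}$$
the HJB equation (6.4.8) becomes
$$J_t^* + \mathcal{H}(\mathbf{x}^*(t), J_\mathbf{x}^*, \mathbf{u}^*(t), t) = 0. \tag{6.4.16}$$
This equation is, in general, a nonlinear partial differential equation
in $J^*$, which can be solved for $J^*$. Once $J^*$ is known, its gradient
$J_\mathbf{x}^*$ can be calculated and the optimal control
$\mathbf{u}^*(t)$ is obtained from (6.4.13). Often, the solution of the
HJB equation is very difficult. The entire procedure is summarized in
Table 6.4. Let us now illustrate the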
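Table 6.4 Procedure Summary of Hamilton-Jacobi-Bellman
(HJB) Approach

A. Statement of the Problem
Given the plant as
    $\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t)$,
the performance index as
    $J = S(\mathbf{x}(t_f), t_f) + \int_{t_0}^{t_f} V(\mathbf{x}(t), \mathbf{u}(t), t)\, dt$,
and the boundary conditions as
    $\mathbf{x}(t_0) = \mathbf{x}_0$; $\mathbf{x}(t_f)$ is free,
find the optimal control.

B. Solution of the Problem
Step 1  Form the Pontryagin $\mathcal{H}$ function
    $\mathcal{H}(\mathbf{x}(t), \mathbf{u}(t), J_\mathbf{x}^*, t) = V(\mathbf{x}(t), \mathbf{u}(t), t) + J_\mathbf{x}^{*\prime}\mathbf{f}(\mathbf{x}(t), \mathbf{u}(t), t)$.
Step 2  Minimize $\mathcal{H}$ w.r.t. $\mathbf{u}(t)$ as
    $\left(\frac{\partial \mathcal{H}}{\partial \mathbf{u}}\right)_* = 0$ and obtain $\mathbf{u}^*(t) = \mathbf{h}(\mathbf{x}^*(t), J_\mathbf{x}^*, t)$.
Step 3  Using the result of Step 2, find the optimal $\mathcal{H}^*$ function
    $\mathcal{H}^*(\mathbf{x}^*(t), \mathbf{h}(\mathbf{x}^*(t), J_\mathbf{x}^*, t), J_\mathbf{x}^*, t) = \mathcal{H}^*(\mathbf{x}^*(t), J_\mathbf{x}^*, t)$
    and obtain the HJB equation.
Step 4  Solve the HJB equation
    $J_t^* + \mathcal{H}^*(\mathbf{x}^*(t), J_\mathbf{x}^*, t) = 0$
    with boundary condition $J^*(\mathbf{x}^*(t_f), t_f) = S(\mathbf{x}(t_f), t_f)$.
Step 5  Use the solution $J^*$ from Step 4 to evaluate $J_\mathbf{x}^*$ and
    substitute into the expression for $\mathbf{u}^*(t)$ of Step 2 to
    obtain the optimal control.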
6.4 The HamiltonJacobiBellman Equation
HJB procedure using a simple firstorder system.
Example 6.3
Given a first-order system
$$\dot{x}(t) = -2x(t) + u(t) \tag{6.4.17}$$
and the performance index (PI)
$$J = \frac{1}{2}x^2(t_f) + \frac{1}{2}\int_0^{t_f}\left[x^2(t) + u^2(t)\right]dt, \tag{6.4.18}$$
find the optimal control.
Solution: Comparing the present plant (6.4.17) and the PI (6.4.18) with the general formulation of the plant (6.4.1) and the PI (6.4.2), respectively, we see that
$$V(x(t), u(t), t) = \frac{1}{2}u^2(t) + \frac{1}{2}x^2(t); \quad S(x(t_f), t_f) = \frac{1}{2}x^2(t_f); \quad f(x(t), u(t), t) = -2x(t) + u(t). \tag{6.4.19}$$
Now we use the procedure summarized in Table 6.4.
• Step 1: The Hamiltonian (6.4.7) is
$$\mathcal{H}[x(t), J_x, u(t), t] = V(x(t), u(t), t) + J_x f(x(t), u(t), t) = \frac{1}{2}u^2(t) + \frac{1}{2}x^2(t) + J_x\left(-2x(t) + u(t)\right). \tag{6.4.20}$$
• Step 2: For an unconstrained control, a necessary condition for optimization is
$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; u(t) + J_x = 0, \tag{6.4.21}$$
and solving gives
$$u^*(t) = -J_x. \tag{6.4.22}$$
• Step 3: Using the optimal control (6.4.22) in (6.4.20), form the optimal $\mathcal{H}$ function as
$$\mathcal{H}^* = \frac{1}{2}(-J_x)^2 + \frac{1}{2}x^2(t) + J_x\left(-2x(t) - J_x\right) = -\frac{1}{2}J_x^2 + \frac{1}{2}x^2(t) - 2x(t)J_x. \tag{6.4.23}$$
Now, using the previous relations, the HJB equation (6.4.16) becomes
$$J_t - \frac{1}{2}J_x^2 + \frac{1}{2}x^2(t) - 2x(t)J_x = 0 \tag{6.4.24}$$
with boundary condition (6.4.10) as
$$J^*(x(t_f), t_f) = \frac{1}{2}x^2(t_f). \tag{6.4.25}$$
• Step 4: One way to solve the HJB equation (6.4.24) with the boundary condition (6.4.25) is to assume a solution and check whether it satisfies the equation. In this simple case, since we want the optimal control (6.4.22) in terms of the state, and the PI is a quadratic function of the state and control, we can guess the solution as
$$J(x(t)) = \frac{1}{2}p(t)x^2(t), \tag{6.4.26}$$
where $p(t)$, the unknown function to be determined, has the boundary condition
$$J(x(t_f)) = \frac{1}{2}p(t_f)x^2(t_f) = \frac{1}{2}x^2(t_f), \tag{6.4.27}$$
which gives us
$$p(t_f) = 1. \tag{6.4.28}$$
Then using (6.4.26), we get
$$J_x = p(t)x(t); \qquad J_t = \frac{1}{2}\dot{p}(t)x^2(t), \tag{6.4.29}$$
leading to the closed-loop optimal control (6.4.22) as
$$u^*(t) = -p(t)x^*(t). \tag{6.4.30}$$
Using (6.4.29) in the HJB equation (6.4.24), we have
$$\left(\frac{1}{2}\dot{p}(t) - \frac{1}{2}p^2(t) - 2p(t) + \frac{1}{2}\right)x^{*2}(t) = 0. \tag{6.4.31}$$
Since this must hold for any $x^*(t)$, the previous relation becomes
$$\frac{1}{2}\dot{p}(t) - \frac{1}{2}p^2(t) - 2p(t) + \frac{1}{2} = 0, \tag{6.4.32}$$
which, upon solving with the boundary condition (6.4.28), becomes
$$p(t) = \frac{(\sqrt{5} - 2) + (\sqrt{5} + 2)\left[\dfrac{3 - \sqrt{5}}{3 + \sqrt{5}}\right]e^{2\sqrt{5}(t - t_f)}}{1 - \left[\dfrac{3 - \sqrt{5}}{3 + \sqrt{5}}\right]e^{2\sqrt{5}(t - t_f)}}. \tag{6.4.33}$$
Note that the relation (6.4.32) is the scalar version of the matrix DRE (3.2.34) for the finite-time LQR system.
• Step 5: Using the relation (6.4.33), we have the closed-loop optimal control (6.4.30).
Note: As $t_f \to \infty$, $p(t)$ in (6.4.33) becomes $p(\infty) = \bar{p} = \sqrt{5} - 2$, and the optimal control (6.4.30) is
$$u^*(t) = -(\sqrt{5} - 2)x(t). \tag{6.4.34}$$
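As a numerical cross-check of Example 6.3, the following minimal Python sketch compares the closed-form $p(t)$ of (6.4.33) with a backward integration of the scalar Riccati equation (6.4.32), rewritten as $\dot{p} = p^2 + 4p - 1$ with $p(t_f) = 1$. The function names, the fixed-step Runge-Kutta scheme, and the step count are illustrative assumptions, not part of the text.

```python
import math

SQRT5 = math.sqrt(5.0)

def p_closed_form(t, tf):
    """Closed-form p(t) from (6.4.33); satisfies p(tf) = 1."""
    c = (3.0 - SQRT5) / (3.0 + SQRT5)
    e = c * math.exp(2.0 * SQRT5 * (t - tf))
    return ((SQRT5 - 2.0) + (SQRT5 + 2.0) * e) / (1.0 - e)

def p_numeric(t, tf, steps=4000):
    """Integrate p' = p^2 + 4p - 1 backward from p(tf) = 1 (classical RK4)."""
    def rhs(p):
        return p * p + 4.0 * p - 1.0

    h = (t - tf) / steps  # negative step size: marching backward in time
    p = 1.0
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * h * k1)
        k3 = rhs(p + 0.5 * h * k2)
        k4 = rhs(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p

if __name__ == "__main__":
    tf = 2.0
    for t in (0.0, 1.0, 1.5, 2.0):
        print(t, p_closed_form(t, tf), p_numeric(t, tf))
```

For a long horizon the value at $t = 0$ should approach the steady-state gain $\sqrt{5} - 2$ used in (6.4.34).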
6.5 LQR System Using HJB Equation
We employ the HJB equation to obtain the closed-loop optimal control of the linear quadratic regulator system. Consider the plant described by
$$\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{6.5.1}$$
where $x(t)$ and $u(t)$ are $n$- and $r$-dimensional state and control vectors, respectively, and the performance index to be minimized is
$$J = \frac{1}{2}x'(t_f)Fx(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt, \tag{6.5.2}$$
where, as defined earlier, $F$ and $Q(t)$ are real, symmetric, positive semidefinite matrices, and $R(t)$ is a real, symmetric, positive definite matrix. We will use the procedure given in Table 6.4.
• Step 1: As a first step in optimization, let us form the Hamiltonian as
$$\mathcal{H}(x(t), u(t), J_x^*, t) = \frac{1}{2}x'(t)Q(t)x(t) + \frac{1}{2}u'(t)R(t)u(t) + J_x^{*\prime}(x(t), t)\left[A(t)x(t) + B(t)u(t)\right]. \tag{6.5.3}$$
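• Step 2: A necessary condition for optimization of $\mathcal{H}$ w.r.t. $u(t)$ is that
$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; R(t)u(t) + B'(t)J_x^*(x(t), t) = 0, \tag{6.5.4}$$
which leads to
$$u^*(t) = -R^{-1}(t)B'(t)J_x^*(x(t), t). \tag{6.5.5}$$
Let us note that, for the control to be minimizing, the sufficient condition that
$$\frac{\partial^2 \mathcal{H}}{\partial u^2} = R(t) \tag{6.5.6}$$
is positive definite is satisfied due to our assumption that $R(t)$ is symmetric positive definite.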
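• Step 3: Using the optimal control (6.5.5) in the Hamiltonian (6.5.3), we have
$$\mathcal{H}(x(t), J_x^*, t) = \frac{1}{2}x'(t)Q(t)x(t) + \frac{1}{2}J_x^{*\prime}B(t)R^{-1}(t)B'(t)J_x^* + J_x^{*\prime}A(t)x(t) - J_x^{*\prime}B(t)R^{-1}(t)B'(t)J_x^*$$
$$= \frac{1}{2}x'(t)Q(t)x(t) - \frac{1}{2}J_x^{*\prime}B(t)R^{-1}(t)B'(t)J_x^* + J_x^{*\prime}A(t)x(t). \tag{6.5.7}$$
The HJB equation is
$$J_t^* + \mathcal{H}(x^*(t), J_x^*, u^*(t), t) = 0. \tag{6.5.8}$$
With (6.5.7), the HJB equation (6.5.8) becomes
$$J_t^* + \frac{1}{2}x^{*\prime}(t)Q(t)x^*(t) - \frac{1}{2}J_x^{*\prime}B(t)R^{-1}(t)B'(t)J_x^* + J_x^{*\prime}A(t)x^*(t) = 0, \tag{6.5.9}$$
with boundary condition
$$J^*(x^*(t_f), t_f) = \frac{1}{2}x^{*\prime}(t_f)F(t_f)x^*(t_f). \tag{6.5.10}$$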
• Step 4: Since the performance index $J$ is a quadratic function of the state, it seems reasonable to assume a solution as
$$J^*(x(t), t) = \frac{1}{2}x'(t)P(t)x(t) \tag{6.5.11}$$
where $P(t)$ is a real, symmetric, positive-definite matrix to be determined (for convenience, $*$ is omitted for $x(t)$). With
$$\frac{\partial J^*}{\partial t} = J_t = \frac{1}{2}x'(t)\dot{P}(t)x(t), \qquad \frac{\partial J^*}{\partial x} = J_x = P(t)x(t) \tag{6.5.12}$$
and using the assumed solution (6.5.11) in the HJB equation (6.5.9), we get
$$\frac{1}{2}x'(t)\dot{P}(t)x(t) + \frac{1}{2}x'(t)Q(t)x(t) - \frac{1}{2}x'(t)P(t)B(t)R^{-1}(t)B'(t)P(t)x(t) + x'(t)P(t)A(t)x(t) = 0. \tag{6.5.13}$$
We express $P(t)A(t)$ as
$$P(t)A(t) = \frac{1}{2}\left[P(t)A(t) + \{P(t)A(t)\}'\right] + \frac{1}{2}\left[P(t)A(t) - \{P(t)A(t)\}'\right], \tag{6.5.14}$$
where the first term on the right-hand side is symmetric and the second term is skew-symmetric. Since the quadratic form of a skew-symmetric matrix is zero, only the symmetric part contributes to $x'(t)P(t)A(t)x(t)$; all the other terms in (6.5.13) are already symmetric quadratic forms. Using (6.5.14) in (6.5.13), we have
$$\frac{1}{2}x'(t)\dot{P}(t)x(t) + \frac{1}{2}x'(t)Q(t)x(t) - \frac{1}{2}x'(t)P(t)B(t)R^{-1}(t)B'(t)P(t)x(t) + \frac{1}{2}x'(t)P(t)A(t)x(t) + \frac{1}{2}x'(t)A'(t)P(t)x(t) = 0. \tag{6.5.15}$$
This equation should be valid for any $x(t)$, which then reduces to
$$\dot{P}(t) + Q(t) - P(t)B(t)R^{-1}(t)B'(t)P(t) + P(t)A(t) + A'(t)P(t) = 0. \tag{6.5.16}$$
Rewriting the above, we have the matrix differential Riccati equation (DRE) as
$$\boxed{\dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)B(t)R^{-1}(t)B'(t)P(t) - Q(t).} \tag{6.5.17}$$
Using (6.5.10) and (6.5.11),
$$\frac{1}{2}x'(t_f)P(t_f)x(t_f) = \frac{1}{2}x'(t_f)F(t_f)x(t_f), \tag{6.5.18}$$
we have the final condition for $P(t)$ as
$$\boxed{P(t_f) = F(t_f).} \tag{6.5.19}$$
• Step 5: Using (6.5.5) and (6.5.12), we have the closed-loop optimal control as
$$u^*(t) = -R^{-1}(t)B'(t)P(t)x^*(t). \tag{6.5.20}$$
Some noteworthy features of this result follow.
1. The HJB partial differential equation (6.5.8) reduces to a nonlinear, matrix, differential equation (6.5.17).
2. The matrix $P(t)$ is determined by numerically integrating backward from $t_f$ to $t_0$. We also note that since the $n \times n$ matrix $P(t)$ is symmetric, one needs to solve only $n(n+1)/2$ equations instead of $n \times n$.
3. The reason for assuming a solution of the form (6.5.11) is that we are able to obtain a closed-loop optimal control, which is linear and time-varying w.r.t. the state.
4. A necessary condition: The result that has been obtained is only a necessary condition for optimality, in the sense that the minimum cost function $J^*(x(t), t)$ must satisfy the HJB equation.
5. A sufficient condition: If there exists a cost function $J^s(x(t), t)$ which satisfies the HJB equation, then $J^s(x(t), t)$ is the minimum cost function, i.e.,
$$J^s(x(t), t) = J^*(x(t), t). \tag{6.5.21}$$
6. Solution of the nonlinear HJB equation: For the linear, time-varying plant with a quadratic performance index, we are able to guess the solution to the nonlinear HJB equation. In general, we may not be able to find the solution easily, and the nonlinear HJB equation needs to be solved by numerical techniques.
7. Applications of the HJB equation: The HJB equation is useful in optimal control systems. It also provides a bridge between the dynamic programming approach and optimal control.
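The backward integration mentioned in feature 2 can be sketched as follows. This is a minimal illustration, assuming constant matrices $A$, $B$, $Q$, $R$ (the text allows them to be time-varying) and a fixed-step Runge-Kutta scheme; the function names `dre_rhs`, `solve_dre_backward`, and `feedback_gain` are hypothetical and do not come from the text.

```python
import numpy as np

def dre_rhs(P, A, B, Q, R_inv):
    """Right-hand side of the matrix DRE (6.5.17)."""
    return -P @ A - A.T @ P + P @ B @ R_inv @ B.T @ P - Q

def solve_dre_backward(A, B, Q, R, F, t0, tf, steps=4000):
    """Return P(t0) by integrating the DRE backward from P(tf) = F (RK4).

    Assumption: A, B, Q, R are constant over [t0, tf].
    """
    A, B, Q, R, F = (np.atleast_2d(np.asarray(M, dtype=float))
                     for M in (A, B, Q, R, F))
    R_inv = np.linalg.inv(R)
    h = (t0 - tf) / steps  # negative step size: marching backward in time
    P = F.copy()
    for _ in range(steps):
        k1 = dre_rhs(P, A, B, Q, R_inv)
        k2 = dre_rhs(P + 0.5 * h * k1, A, B, Q, R_inv)
        k3 = dre_rhs(P + 0.5 * h * k2, A, B, Q, R_inv)
        k4 = dre_rhs(P + h * k3, A, B, Q, R_inv)
        P = P + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        P = 0.5 * (P + P.T)  # keep P symmetric against round-off
    return P

def feedback_gain(P, B, R):
    """Gain K such that u*(t) = -K x*(t), following (6.5.20)."""
    B = np.atleast_2d(np.asarray(B, dtype=float))
    R = np.atleast_2d(np.asarray(R, dtype=float))
    return np.linalg.solve(R, B.T @ P)

if __name__ == "__main__":
    # Scalar data of Example 6.3: A = -2, B = 1, Q = R = F = 1, tf = 2.
    P0 = solve_dre_backward([[-2.0]], [[1.0]], [[1.0]], [[1.0]], [[1.0]], 0.0, 2.0)
    print(P0, feedback_gain(P0, [[1.0]], [[1.0]]))
```

With the scalar data of Example 6.3 the result should agree with the closed-form $p(0)$ from (6.4.33), which gives a simple consistency check between the two sections.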
We provide another example, with an infinite-time interval, of the application of the HJB approach.
Example 6.4
Find the closed-loop optimal control for the first-order system
$$\dot{x}(t) = -2x(t) + u(t) \tag{6.5.22}$$
with the performance index
$$J = \int_0^{\infty}\left[x^2(t) + u^2(t)\right]dt. \tag{6.5.23}$$
Hint: Assume that $J^* = fx^2(t)$.
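Solution: First of all, let us identify the various functions as
$$V(x(t), u(t)) = x^2(t) + u^2(t), \qquad f(x(t), u(t)) = -2x(t) + u(t). \tag{6.5.24}$$
We now follow the step-by-step procedure given in Table 6.4. (In what follows, the scalar constant $f$ of the hint is written without arguments to distinguish it from the plant function $f(x(t), u(t))$.)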
• Step 1: Form the $\mathcal{H}$ function as
$$\mathcal{H}(x(t), u(t), J_x^*) = V(x(t), u(t)) + J_x^* f(x(t), u(t))$$
$$= x^2(t) + u^2(t) + 2fx(t)\left[-2x(t) + u(t)\right]$$
$$= x^2(t) + u^2(t) - 4fx^2(t) + 2fx(t)u(t) \tag{6.5.25}$$
where we used $J^* = fx^2(t)$ and $J_x^* = 2fx(t)$. Here, we use a slightly different approach by using the value of $J_x^*$ at the beginning itself.
• Step 2: Minimize $\mathcal{H}$ w.r.t. $u$ to obtain the optimal control $u^*(t)$ as
$$\frac{\partial \mathcal{H}}{\partial u} = 2u^*(t) + 2fx^*(t) = 0 \;\longrightarrow\; u^*(t) = -fx^*(t). \tag{6.5.26}$$
• Step 3: Using the result of Step 2 in Step 1, find the optimal $\mathcal{H}$ as
$$\mathcal{H}^*(x^*(t), J_x^*, t) = x^{*2}(t) - 4fx^{*2}(t) - f^2x^{*2}(t). \tag{6.5.27}$$
• Step 4: Solve the HJB equation
$$\mathcal{H}^*(x^*(t), J_x^*) + J_t^* = 0 \;\longrightarrow\; x^{*2}(t) - 4fx^{*2}(t) - f^2x^{*2}(t) = 0. \tag{6.5.28}$$
Note that $J_t = 0$ in the previous HJB equation. For any $x^*(t)$, the previous equation becomes
$$f^2 + 4f - 1 = 0 \;\longrightarrow\; f = -2 \pm \sqrt{5}. \tag{6.5.29}$$
Taking the positive value of $f$ in (6.5.29), we get
$$J^* = fx^{*2}(t) = (-2 + \sqrt{5})x^{*2}(t). \tag{6.5.30}$$
Note that (6.5.29) is the scalar version of the matrix ARE (3.5.15) for the infinite-time interval regulator system.
• Step 5: Using the value of $f$ from Step 4 in Step 2, we get the optimal control as
$$u^*(t) = -fx^*(t) = -(\sqrt{5} - 2)x^*(t). \tag{6.5.31}$$
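The result of Example 6.4 can be checked numerically. The sketch below, with hypothetical function names, evaluates the cost (6.5.23) for a linear feedback $u = -kx$ by trapezoidal quadrature over a long but finite horizon (an assumption standing in for the infinite upper limit), so that the cost with $k = f = \sqrt{5} - 2$ can be compared with $J^* = fx^2(0)$ and with the cost of other gains.

```python
import math

def hjb_root():
    """Positive root of f^2 + 4f - 1 = 0, the constant in J* = f x^2."""
    return -2.0 + math.sqrt(5.0)

def closed_loop_cost(k, x0=1.0, t_end=10.0, steps=20000):
    """Approximate the cost (6.5.23) under u = -k x by the trapezoidal rule.

    The closed loop is x' = -(2 + k) x, so x(t) = x0 * exp(-(2 + k) t).
    The horizon t_end is an assumed truncation of the infinite interval.
    """
    dt = t_end / steps

    def integrand(t):
        x = x0 * math.exp(-(2.0 + k) * t)
        u = -k * x
        return x * x + u * u

    total = 0.5 * (integrand(0.0) + integrand(t_end))
    for i in range(1, steps):
        total += integrand(i * dt)
    return total * dt

if __name__ == "__main__":
    f = hjb_root()
    print("f =", f)
    print("cost with optimal gain:", closed_loop_cost(f))
    print("cost with u = 0:       ", closed_loop_cost(0.0))
```

Any other constant gain, including $k = 0$ (no control), should give a cost no smaller than the one obtained with $k = f$.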
6.6 Notes and Discussion
In this chapter, we discussed two topics: dynamic programming and the HJB equation. Dynamic programming was developed by Bellman during the 1950s as an optimization tool suited to the digital computers that were then emerging. An excellent account of dynamic programming and optimal control is given by Bertsekas [18, 19], whose two-volume textbook develops in depth dynamic programming, a central algorithmic method for optimal control, sequential decision making under uncertainty, and combinatorial optimization.
Problems
1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and
plot all the optimal controls and states for all problems. Provide
the relevant MATLAB© m files.
Problem 6.1 Prove the Pontryagin Minimum Principle based on the works of Athans and Falb [6], Lee and Markus [86], Macki and Strauss [97], and some of the more recent works of Pinch [108] and Hocking [66].
Problem 6.2 For the general case of the Example 6.2, develop a
MATLAB© based program.
Problem 6.3 For a traveling salesperson, find out the cheapest route
from city L to city N if the total costs between the intermediate cities
are shown in Figure 6.8.
Problem 6.4 Consider a scalar example
$$x(k + 1) = x(k) + u(k) \tag{6.6.1}$$
and the performance criterion to be optimized as
$$J = \frac{1}{2}x^2(k_f) + \frac{1}{2}\sum_{k=k_0}^{k_f - 1}u^2(k) = \frac{1}{2}x^2(k_f) + \frac{1}{2}u^2(0) + \frac{1}{2}u^2(1)$$
where, for simplicity of calculations, we take $k_f = 2$. Let the constraints on the control be
$$-1.0 \le u(k) \le +1.0, \quad k = 0, 1, 2 \quad \text{or} \quad u(k) = -1.0, -0.5, 0, +0.5, +1.0$$
and on the state be
$$0.0 \le x(k) \le +1.0, \quad k = 0, 1 \quad \text{or} \quad x(k) = 0.0, 0.5, 1.0, 1.5.$$
Find the optimal control sequence $u^*(k)$ and the state $x^*(k)$ which minimize the performance criterion.
Figure 6.8 Optimal Path from A to B
Problem 6.5 Find the Hamilton-Jacobi-Bellman equation for the system
$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = -2x_2(t) - 3x_1^2(t) + u(t)$$
with the performance index as
$$J = \frac{1}{2}\int_0^{t_f}\left(x_1^2(t) + u^2(t)\right)dt.$$
Problem 6.6 Solve Example 5.3 using the dynamic programming approach.
Problem 6.7 For the D.C. motor speed control system described in Problem 1.1, find the HJB equation and hence the closed-loop optimal control to keep the speed at a constant value.
Problem 6.8 For the liquid-level control system described in Problem 1.2, find the HJB equation and hence the closed-loop optimal control to keep the liquid level constant at a particular value.
Problem 6.9 For the mechanical control system described in Problem 1.4, find the HJB equation and hence the closed-loop optimal control to keep the states at a constant value.
Problem 6.10 For the automobile suspension system described in Problem 1.5, find the HJB equation and hence the closed-loop control.
Chapter 7
Constrained Optimal
Control Systems
In the previous chapters, we considered optimization of systems without any constraints on controls or state variables. In this chapter, we present an entirely different class of systems where we impose some constraints on controls and/or states. First, we address the time-optimal control (TOC) system, where the performance measure is the minimization of the transition time from an initial state to any target or desired state. Our treatment is focused on linear, time-invariant (LTI) systems. These are also called brachistochrone problems. Next, we address the fuel-optimal control (FOC) system, where the performance measure is the minimization of a quantity proportional to the fuel consumed by the process or plant. Next, we briefly consider the energy-optimal control (EOC) system. Finally, we consider a plant with some constraints on its states. It is suggested that the student review the material in Appendices A and B given at the end of the book. This chapter is based on [6, 79]¹.
¹ The permission given by McGraw-Hill for M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and Its Applications, McGraw-Hill Book Company, New York, NY, 1966, is hereby acknowledged.
7.1 Constrained Optimal Control
From Chapter 6 (Table 6.1), the Pontryagin Principle is summarized below for a linear, time-invariant system with a quadratic performance index. Given the system as
$$\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{7.1.1}$$
with the control constraint as
$$U^- \le u(t) \le U^+ \;\longrightarrow\; |u(t)| \le U, \tag{7.1.2}$$
the performance index as
$$J(x(t_0), u(t), t_0) = J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt \tag{7.1.3}$$
and the boundary conditions as
$$x(t_0) = x_0 \text{ fixed}, \quad x(t_f) = x_f \text{ is free and } t_f \text{ is free}, \tag{7.1.4}$$
to find the optimal control, form the Pontryagin $\mathcal{H}$ function
$$\mathcal{H}(x(t), u(t), \lambda(t), t) = \frac{1}{2}x'(t)Q(t)x(t) + \frac{1}{2}u'(t)R(t)u(t) + \lambda'(t)\left[A(t)x(t) + B(t)u(t)\right], \tag{7.1.5}$$
minimize $\mathcal{H}$ w.r.t. $u(t)$ (with $|u(t)| \le U$) as
$$\mathcal{H}(x^*(t), u^*(t), \lambda^*(t), t) \le \mathcal{H}(x^*(t), u(t), \lambda^*(t), t), \tag{7.1.6}$$
and solve the set of $2n$ state and costate differential equations
$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_*, \qquad \dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* \tag{7.1.7}$$
with the boundary conditions $x_0$ and
(7.1.8)
where,
Note: Here we address the optimal control system with a constraint
on the control u(t) given by (7.1.2). Thus, we cannot in general use
the condition
$$\left(\frac{\partial \mathcal{H}}{\partial u}\right)_* = 0 \tag{7.1.9}$$
that we used earlier in Chapters 2 to 4 for continuoustime systems,
where we had no constraints on the control u(t), because there is no
guarantee that in general the optimal control u*(t) obtained by using
the condition (7.1.9) will satisfy the constraint on the control given by
(7.1.2).
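A scalar illustration may help here. The sketch below assumes a hypothetical control-dependent part of the Hamiltonian, $\mathcal{H}(u) = \frac{1}{2}u^2 + qu$ (as in (7.1.5) with $R = 1$ and $q$ standing in for $B'\lambda$), and the constraint $|u| \le 1$. The stationary point from (7.1.9) is $u = -q$, which is inadmissible whenever $|q| > 1$; the constrained minimizer is then found on the boundary. All names are illustrative assumptions.

```python
def hamiltonian(u, q):
    """Hypothetical u-dependent part of the Hamiltonian: 0.5*u^2 + q*u."""
    return 0.5 * u * u + q * u

def stationary_control(q):
    """Control obtained from dH/du = 0, ignoring the constraint."""
    return -q

def constrained_control(q, u_max=1.0):
    """Minimizer of H over |u| <= u_max.

    For this convex H it is the stationary point clipped to the
    admissible interval.
    """
    return max(-u_max, min(u_max, -q))

if __name__ == "__main__":
    for q in (0.4, 3.0, -2.5):
        print(q, stationary_control(q), constrained_control(q))
```

For $q = 3$ the stationary control $-3$ violates the bound, while the admissible minimizer sits at $u = -1$; for $q = 0.4$ the two coincide.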
7.1.1 Time-Optimal Control of LTI System
In this section, we address the problem of minimizing the time taken for a linear, time-invariant (LTI) system to go from an initial state to the desired final state. The desired final state can be conveniently taken as the origin of the state space; in this way we will be dealing with a time-optimal regulator system.
7.1.2 Problem Formulation and Statement
Let us now present a typical time-optimal control (TOC) system. Consider a linear, time-invariant dynamical system
$$\dot{x}(t) = Ax(t) + Bu(t) \tag{7.1.10}$$
where $x(t)$ is the $n$th-order state vector, $u(t)$ is the $r$th-order control vector, and $A$ and $B$ are constant matrices of dimensions $n \times n$ and $n \times r$, respectively. We are also given that
1. the system (7.1.10) is completely controllable, that is, the matrix
$$G = \left[B : AB : A^2B : \cdots : A^{n-1}B\right] \tag{7.1.11}$$
is of rank $n$ or the matrix $G$ is nonsingular, and
2. the magnitude of the control $u(t)$ is constrained as
$$U^- \le u(t) \le U^+ \;\longrightarrow\; |u(t)| \le U \tag{7.1.12}$$
or component wise
$$|u_j(t)| \le U_j, \quad j = 1, 2, \ldots, r. \tag{7.1.13}$$
Here, $U^+$ and $U^-$ are the upper and lower bounds on the control. The constraint relation (7.1.12) can also be written more conveniently (by absorbing the magnitude $U$ into the matrix $B$) as
$$-1 \le u(t) \le +1 \;\longrightarrow\; |u(t)| \le 1 \tag{7.1.14}$$
or component wise,
$$|u_j(t)| \le 1, \quad j = 1, 2, \ldots, r. \tag{7.1.15}$$
3. the initial state is $x(t_0)$ and the final (target) state is $0$.
The problem statement is: Find the (optimal) control $u^*(t)$ which satisfies the constraint (7.1.15) and drives the system (7.1.10) from the initial state $x(t_0)$ to the origin $0$ in minimum time.
7.1.3 Solution of the TOC System
We develop the solution to the time-optimal control (TOC) system stated previously in the following steps. First let us list all the steps and then discuss them in detail.
• Step 1: Performance Index
• Step 2: Hamiltonian
• Step 3: State and Costate Equations
• Step 4: Optimal Condition
• Step 5: Optimal Control
• Step 6: Types of Time-Optimal Controls
• Step 7: Bang-Bang Control Law
• Step 8: Conditions for Normal Time-Optimal Control System
• Step 9: Uniqueness of Optimal Control
• Step 10: Number of Switchings
• Step 1: Performance Index: For the minimum-time system formulation specified by (7.1.10) and by the control constraint relation (7.1.14), the performance index (PI) becomes
$$J(u(t)) = \int_{t_0}^{t_f} V[x(t), u(t), t]\,dt = \int_{t_0}^{t_f} 1\,dt = t_f - t_0 \tag{7.1.16}$$
where $t_0$ is fixed and $t_f$ is free. If the final time $t_f$ were fixed, trying to minimize a fixed quantity would make no sense.
• Step 2: Hamiltonian: We form the Hamiltonian $\mathcal{H}$ for the problem described by the system (7.1.10) and the PI (7.1.16) as
$$\mathcal{H}(x(t), \lambda(t), u(t)) = 1 + \lambda'(t)\left[Ax(t) + Bu(t)\right] = 1 + [Ax(t)]'\lambda(t) + u'(t)B'\lambda(t) \tag{7.1.17}$$
where $\lambda(t)$ is the costate variable.
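• Step 3: State and Costate Equations: Let us assume the optimal values $u^*(t)$, $x^*(t)$, and $\lambda^*(t)$. Then, the state $x^*(t)$ and the costate $\lambda^*(t)$ are given by
$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_* = Ax^*(t) + Bu^*(t), \tag{7.1.18}$$
$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_* = -A'\lambda^*(t) \tag{7.1.19}$$
with the boundary conditions
$$x^*(t_0) = x(t_0); \qquad x^*(t_f) = 0 \tag{7.1.20}$$
where we again note that $t_f$ is free.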
• Step 4: Optimal Condition: Now using the Pontryagin Principle, we invoke the condition (7.1.6) for optimal control in terms of the Hamiltonian. Using (7.1.17) in (7.1.6), we have
$$1 + [Ax^*(t)]'\lambda^*(t) + u^{*\prime}(t)B'\lambda^*(t) \le 1 + [Ax^*(t)]'\lambda^*(t) + u'(t)B'\lambda^*(t) \tag{7.1.21}$$
which can be simplified to
$$u^{*\prime}(t)B'\lambda^*(t) \le u'(t)B'\lambda^*(t),$$
$$u^{*\prime}(t)q^*(t) \le u'(t)q^*(t),$$
$$u^{*\prime}(t)q^*(t) = \min_{|u(t)| \le 1}\left\{u'(t)q^*(t)\right\} \tag{7.1.22}$$
where $q^*(t) = B'\lambda^*(t)$; $q^*(t)$ is not to be confused with a vector version of the weighting matrix $Q$ used in quadratic performance measures.
• Step 5: Optimal Control: We now derive the optimal sequence for $u^*(t)$. From the optimal condition (7.1.21),
1. if $q^*(t)$ is positive, the optimal control $u^*(t)$ must be the smallest admissible control value $-1$ so that
$$\min_{|u(t)| \le 1}\left\{u'(t)q^*(t)\right\} = -q^*(t) = -|q^*(t)|, \tag{7.1.23}$$
2. and on the other hand, if $q^*(t)$ is negative, the optimal control $u^*(t)$ must be the largest admissible value $+1$ so that
$$\min_{|u(t)| \le 1}\left\{u'(t)q^*(t)\right\} = +q^*(t) = -|q^*(t)|. \tag{7.1.24}$$
In other words, the previous two relations can be written in a compact form (whether $q^*(t)$ is positive or negative) as
$$\min_{|u(t)| \le 1}\left\{u'(t)q^*(t)\right\} = -|q^*(t)|. \tag{7.1.25}$$
Also, the combination of (7.1.23) and (7.1.24) means that
$$u^*(t) = \begin{cases} +1 & \text{if } q^*(t) < 0, \\ -1 & \text{if } q^*(t) > 0, \\ \text{indeterminate} & \text{if } q^*(t) = 0. \end{cases} \tag{7.1.26}$$
Now, we use the signum function (see Figure 7.1) defined between input $f_i$ and output $f_o$, written as $f_o = \mathrm{sgn}\{f_i\}$, as
$$f_o = \begin{cases} +1 & \text{if } f_i > 0, \\ -1 & \text{if } f_i < 0, \\ \text{indeterminate} & \text{if } f_i = 0. \end{cases}$$
The engineering realization of the signum function is an ideal relay. Then, we can write the control algorithm (7.1.26) in a compact form as
$$\boxed{u^*(t) = -\mathrm{SGN}\{q^*(t)\}} \tag{7.1.27}$$
where the relation between the time-optimal control $u^*(t)$ and the function $q^*(t)$ is shown in Figure 7.2.
Figure 7.1 Signum Function
Figure 7.2 Time-Optimal Control (relay with input $q^*(t)$ and output $u^*(t)$)
Component wise,
$$u_j^*(t) = -\mathrm{sgn}\{q_j^*(t)\} = -\mathrm{sgn}\{b_j'\lambda^*(t)\} \tag{7.1.28}$$
where $b_j$, $j = 1, 2, \ldots, r$, denote the column vectors of the input matrix $B$. From the time-optimal control relation (7.1.27), note that the optimal control $u^*(t)$ depends on the costate function $\lambda^*(t)$.
• Step 6: Types of Time-Optimal Controls: We now have two types of time-optimal controls, depending upon the nature of the function $q^*(t)$.
1. Normal Time-Optimal Control (NTOC) System: Suppose that during the interval $[t_0, t_f]$, there exists a set of times $t_{1j}, t_{2j}, \ldots, t_{\gamma j} \in [t_0, t_f]$, $\gamma = 1, 2, 3, \ldots$, $j = 1, 2, \ldots, r$, such that
$$q_j^*(t) = b_j'\lambda^*(t) = \begin{cases} 0 & \text{if and only if } t = t_{\gamma j}, \\ \text{nonzero} & \text{otherwise}, \end{cases} \tag{7.1.29}$$
then we have a normal time-optimal control (NTOC) system. The situation is depicted in Figure 7.3.
Figure 7.3 Normal Time-Optimal Control System (plot of $q_j^*(t)$ and $u_j^*(t) = -\mathrm{sgn}\{q_j^*(t)\}$ versus $t$)
Here, the function $q_j^*(t)$ is zero only at four instants of time, and the time-optimal control is a piecewise constant function with simple switchings at $t_1$, $t_2$, $t_3$, and $t_4$. Thus, the optimal control $u_j^*(t)$ switches four times, or the number of switchings is four.
2. Singular Time-Optimal Control (STOC) System: Suppose that during the interval $[t_0, t_f]$, there is one (or more) subinterval $[T_1, T_2]$ such that
$$q_j^*(t) = b_j'\lambda^*(t) = 0 \quad \forall\, t \in [T_1, T_2], \tag{7.1.30}$$
then we have a singular time-optimal control (STOC) system, and the interval $[T_1, T_2]$ is called a singularity interval. The situation is shown in Figure 7.4. During a singularity interval, the time-optimal control is not defined.
Figure 7.4 Singular Time-Optimal Control System
• Step 7: Bang-Bang Control Law: For a normal time-optimal system, the optimal control, given by (7.1.27),
$$\boxed{u^*(t) = -\mathrm{SGN}\{q^*(t)\} = -\mathrm{SGN}\{B'\lambda^*(t)\}} \tag{7.1.31}$$
for all $t \in [t_0, t_f]$, is a piecewise constant function of time (i.e., bang-bang).
• Step 8: Conditions for NTOC System: Here, we derive the conditions necessary for the system to be not singular, thereby obtaining the conditions for the system to be normal. First of all, the solution of the costate equation (7.1.19) is
$$\lambda^*(t) = e^{-A't}\lambda^*(0) \tag{7.1.32}$$
where we assume that the costate initial condition $\lambda^*(0)$ is a nonzero vector. With this solution for $\lambda^*(t)$, the control law (7.1.31) becomes
$$u^*(t) = -\mathrm{SGN}\left\{B'e^{-A't}\lambda^*(0)\right\} \tag{7.1.33}$$
or component wise,
$$u_j^*(t) = -\mathrm{sgn}\{q_j^*(t)\} = -\mathrm{sgn}\left\{b_j'e^{-A't}\lambda^*(0)\right\}. \tag{7.1.34}$$
Let us suppose that there is an interval of time $[T_1, T_2]$ during which the function $q_j^*(t)$ is zero. Then, it follows that during the time interval $[T_1, T_2]$ all the derivatives of $q_j^*(t)$ must be zero. That is,
$$q_j^*(t) = b_j'e^{-A't}\lambda^*(0) = 0$$
$$\dot{q}_j^*(t) = -b_j'A'e^{-A't}\lambda^*(0) = 0$$
$$\ddot{q}_j^*(t) = b_j'A'^{2}e^{-A't}\lambda^*(0) = 0$$
$$\vdots \tag{7.1.35}$$
which in turn can be written in a compact form as
$$G_j'e^{-A't}\lambda^*(0) = 0 \tag{7.1.36}$$
where
$$G_j = \left[b_j : Ab_j : A^2b_j : \cdots : A^{n-1}b_j\right], \qquad G = \left[B : AB : A^2B : \cdots : A^{n-1}B\right]. \tag{7.1.37}$$
In the condition (7.1.36), we know that $e^{-A't}$ is nonsingular and $\lambda^*(0) \ne 0$, and hence the matrix $G_j$ must be singular. Hence, for the STOC system, $G_j$ must be singular; for the NTOC system, $G_j$ must be nonsingular. We know that the matrix $G_j$ is nonsingular if and only if the original system (7.1.10) is completely controllable. This leads us to say that the time-optimal control system is normal if the matrix $G_j$ is nonsingular or if the system is completely controllable. These results are stated as follows (the proofs are found in books such as [6]).
THEOREM 7.1
The necessary and sufficient condition for the time-optimal control system to be normal is that the matrix $G_j$, $j = 1, 2, \ldots, r$, is nonsingular or that the system is completely controllable.
THEOREM 7.2
The necessary and sufficient condition for the time-optimal control system to be singular is that the matrix $G_j$, $j = 1, 2, \ldots, r$, is singular or that the system is uncontrollable.
Thus, for a singular interval to exist, it is necessary that the system is uncontrollable; conversely, if the system is completely controllable, a singular interval cannot exist.
• Step 9: Uniqueness of Optimal Control: If the time-optimal system is normal, then the time-optimal control is unique.
• Step 10: Number of Switchings: The result is again stated in the form of a theorem.
THEOREM 7.3
If the original system (7.1.10) is normal, and if all the $n$ eigenvalues of the system are real, then the optimal control $u^*(t)$ can switch (from $+1$ to $-1$ or from $-1$ to $+1$) at most $(n - 1)$ times.
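Two of the steps above translate directly into short computations: the bang-bang law (7.1.27) and the normality test based on the matrices $G_j$ of (7.1.37). The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, NumPy's `sign` returns $0$ where the text calls the control indeterminate, and the rank test uses NumPy's default numerical tolerance.

```python
import numpy as np

def bang_bang(B, lam):
    """Componentwise control u* = -SGN{B' lam}, following (7.1.27).

    Where q_j = 0 the text calls the control indeterminate; this sketch
    returns 0 there, which is an arbitrary choice.
    """
    q = B.T @ lam
    return -np.sign(q)

def is_normal(A, B):
    """Check that every G_j = [b_j : A b_j : ... : A^(n-1) b_j] has rank n."""
    n = A.shape[0]
    for j in range(B.shape[1]):
        b = B[:, [j]]  # j-th column of B, kept as an n x 1 matrix
        G_j = np.hstack([np.linalg.matrix_power(A, k) @ b for k in range(n)])
        if np.linalg.matrix_rank(G_j) < n:
            return False
    return True

if __name__ == "__main__":
    # Double integral plant used later in Section 7.2.
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    print(is_normal(A, B))
    print(bang_bang(B, np.array([1.0, -2.0])))
```

For the double integral plant, $G_1 = [b : Ab]$ has rank 2, so the system is normal, and a costate with $\lambda_2 < 0$ yields $u^* = +1$.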
7.1.4 Structure of Time-Optimal Control System
We examine two natural structures, i.e., open-loop and closed-loop structures, for implementation of the time-optimal control system.
1. Open-Loop Structure: We repeat here the time-optimal control system and summarize the result. For the normal time-optimal control system, where the system is described by
$$\dot{x}(t) = Ax(t) + Bu(t) \tag{7.1.38}$$
with the constraint on the control as
$$|u_j(t)| \le 1, \quad j = 1, 2, \ldots, r, \tag{7.1.39}$$
the time-optimal control problem is to find the control which drives the system (7.1.38) from any initial condition $x(0)$ to the target condition $0$ in minimum time under the constraint (7.1.39). From the previous discussion, we know that the optimal control is given by
$$u_j^*(t) = -\mathrm{sgn}\{b_j'\lambda^*(t)\} \tag{7.1.40}$$
where the costate function $\lambda^*(t)$ is
$$\lambda^*(t) = e^{-A't}\lambda^*(0). \tag{7.1.41}$$
Let us note that the initial condition $\lambda^*(0)$ is not specified and hence arbitrary, so we have to adopt an iterative procedure. The steps involved in obtaining the optimal control are as follows.
(a) Assume a value for the initial condition $\lambda^*(0)$.
(b) Using the initial value in (7.1.41), compute the costate $\lambda^*(t)$.
(c) Using the costate $\lambda^*(t)$, evaluate the control (7.1.40).
(d) Using the control $u^*(t)$, solve the system relation (7.1.38).
(e) Monitor the solution $x^*(t)$ and find whether there is a time $t_f$ such that the system goes to zero, i.e., $x(t_f) = 0$. If so, the corresponding control computed previously is the time-optimal control. If not, change the initial value of $\lambda^*(0)$ and repeat the previous steps until $x(t_f) = 0$.
A schematic diagram showing the open-loop, time-optimal control structure is shown in Figure 7.5. The relay shown in the figure is an engineering realization of the signum function. It gives the required control sequences $+1$ or $-1$ depending on its input.
Figure 7.5 Open-Loop Structure for Time-Optimal Control System (blocks: open-loop time-optimal controller, containing the costate equation with initial condition $\lambda(0)$, the gain $B'$, the relay, and the iteration logic that changes $\lambda(0)$; plant $\dot{x}(t) = Ax(t) + Bu(t)$)
However, we note the following:
(a) The adjoint system (7.1.19) has unstable modes for a stable system (7.1.10). This makes the already iterative procedure much more tedious.
(b) The open-loop implementation of a control system has well-known disadvantages, such as the lack of correction for disturbances and modeling errors.
One should try for closed-loop implementation of the time-optimal control system, which is discussed next.
2. Closed-Loop Structure: Intuitively, we can sense the relation between the control $u^*(t)$ and the state $x^*(t)$ by recalling the results of Chapters 3 and 4, where we used the Riccati transformation $\lambda^*(t) = P(t)x^*(t)$ to express the optimal control $u^*(t)$, which was a function of the costate $\lambda^*(t)$, as a function of the state $x^*(t)$. Thus, we assume that at any time there is a time-optimal control $u^*(t)$ as a function of the state $x^*(t)$. That is, there is a switching function $h(x^*(t))$ such that
$$\boxed{u^*(t) = -\mathrm{SGN}\{h(x^*(t))\}} \tag{7.1.42}$$
where an analytical and/or computational algorithm
$$h(x^*(t)) = B'\lambda^*(x^*(t)) \tag{7.1.43}$$
needs to be developed, as shown in the example to follow. Then, the optimal control law (7.1.42) is implemented as shown in Figure 7.6. The relay implements the optimal control depending on its input, which in turn is decided by the feedback of the states. The determination of the switching function $h[x^*(t)]$ is the important aspect of the implementation of the control law. In the next section, we demonstrate how to obtain the closed-loop structure for the time-optimal control system of a second-order (double integral) system.
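The iterative open-loop procedure, steps (a) to (e) of the open-loop structure, can be sketched for the double integral plant $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ treated in the next section. The sketch below is an illustration under several assumptions that are not in the text: the search over $\lambda(0)$ is a plain grid over its direction (only the direction matters for the sign of $b'\lambda(t)$), "reaching the origin" is judged by the smallest distance achieved on a fixed time grid, and the function name and all numerical parameters are hypothetical.

```python
import numpy as np

def open_loop_search(x0, n_angles=360, t_max=4.0, dt=1e-3):
    """Grid search over the direction of lambda(0) for the double integrator.

    For this plant the costate solution is lambda_1(t) = lambda_1(0) and
    lambda_2(t) = lambda_2(0) - lambda_1(0) t, and u = -sgn(lambda_2(t)).
    Returns (angle of the best lambda(0), time of closest approach to the
    origin, smallest distance reached).
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    lam1 = np.cos(angles)    # step (a): candidate lambda_1(0), constant in time
    lam2_0 = np.sin(angles)  # step (a): candidate lambda_2(0)
    x1 = np.full(n_angles, float(x0[0]))
    x2 = np.full(n_angles, float(x0[1]))
    best_dist = np.hypot(x1, x2)
    best_time = np.zeros(n_angles)
    for k in range(int(round(t_max / dt))):
        t = k * dt
        lam2 = lam2_0 - lam1 * t       # step (b): costate at time t
        u = -np.sign(lam2)             # step (c): relay (bang-bang) control
        x1 = x1 + x2 * dt + 0.5 * u * dt ** 2  # step (d): exact update for
        x2 = x2 + u * dt                       # constant u over one step
        dist = np.hypot(x1, x2)        # step (e): monitor distance to origin
        improved = dist < best_dist
        best_dist = np.where(improved, dist, best_dist)
        best_time = np.where(improved, t + dt, best_time)
    i = int(np.argmin(best_dist))
    return float(angles[i]), float(best_time[i]), float(best_dist[i])

if __name__ == "__main__":
    print(open_loop_search((1.0, 0.0)))
```

For the initial state $(1, 0)$ the search should select a costate direction near $45^\circ$, corresponding to $u = -1$ followed by $u = +1$ with a single switching, and a transfer time close to 2. The need to sweep many candidate values of $\lambda(0)$ illustrates why the closed-loop structure is preferred.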
7.2 TOC of a Double Integral System
Here we examine the time-optimal control (TOC) of a classical double integral system. This simple example demonstrates some of the important features of the TOC system [6].
Figure 7.6 Closed-Loop Structure for Time-Optimal Control System (blocks: closed-loop time-optimal controller, containing the algorithm $h[x^*(t)]$ and the relay; plant $\dot{x}(t) = Ax(t) + Bu(t)$)
7.2.1 Problem Formulation and Statement
Consider a simple motion of an inertial load in a frictionless environment. The motion is described by
$$m\ddot{y}(t) = f(t) \tag{7.2.1}$$
where $m$ is the mass of a body (system or plant), $y(t)$, $\dot{y}(t)$, and $\ddot{y}(t)$ are the position, velocity, and acceleration, respectively, and $f(t)$ is the external force applied to the system. Defining a set of state variables as
$$x_1(t) = y(t); \qquad x_2(t) = \dot{y}(t) \tag{7.2.2}$$
we have the double integral system described as
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = u(t) \tag{7.2.3}$$
where $u(t) = f(t)/m$. Let us assume that the control (input) $u(t)$ to the system is constrained as
$$|u(t)| \le 1 \quad \forall\, t \in [t_0, t_f]. \tag{7.2.4}$$
This constraint on the control is due to physical limitations such as current in a circuit or thrust of an engine.
Problem Statement: Given the double integral system (7.2.3) and the constraint on the control (7.2.4), find the admissible control that forces the system from any initial state $[x_1(0), x_2(0)]$ to the origin in minimum time.
Let us assume that we are dealing with a normal system and that no singular controls are allowed. Now, we attempt to solve the system following the procedure described in the previous section.
7.2.2 Problem Solution
Our problem solution consists of the list of the following steps with the
details following.
• Step 1: Performance Index
• Step 2: Hamiltonian
• Step 3: Minimization of Hamiltonian
• Step 4: Costate Solutions
• Step 5: Time-Optimal Control Sequences
• Step 6: State Trajectories
• Step 7: Switch Curve
• Step 8: Phase Plane Regions
• Step 9: Control Law
• Step 10: Minimum Time
• Step 1: Performance Index: For the minimum-time system, the performance index (7.1.16) is easily seen to be
$$J = \int_{t_0}^{t_f} 1\,dt = t_f - t_0 \tag{7.2.5}$$
where $t_0$ is fixed and $t_f$ is free.
• Step 2: Hamiltonian: From the system (7.2.3) and the PI (7.2.5), form the Hamiltonian (7.1.17) as
$$\mathcal{H}(x(t), \lambda(t), u(t)) = 1 + \lambda_1(t)x_2(t) + \lambda_2(t)u(t). \tag{7.2.6}$$
• Step 3: Minimization of Hamiltonian: According to the Pontryagin Principle, we need to minimize the Hamiltonian as
$$\mathcal{H}(x^*(t), \lambda^*(t), u^*(t)) = \min_{|u| \le 1}\mathcal{H}(x^*(t), \lambda^*(t), u(t)) \le \mathcal{H}(x^*(t), \lambda^*(t), u(t)). \tag{7.2.7}$$
Using the Hamiltonian (7.2.6) in the condition (7.2.7), we have
$$1 + \lambda_1^*(t)x_2^*(t) + \lambda_2^*(t)u^*(t) \le 1 + \lambda_1^*(t)x_2^*(t) + \lambda_2^*(t)u(t) \tag{7.2.8}$$
which leads to
$$\lambda_2^*(t)u^*(t) \le \lambda_2^*(t)u(t). \tag{7.2.9}$$
Using the result of the previous section, we have the optimal control (7.1.27) given in terms of the signum function as
$$u^*(t) = -\mathrm{sgn}\{\lambda_2^*(t)\}. \tag{7.2.10}$$
Now, to know the nature of the optimal control, we need to solve for the costate function $\lambda_2^*(t)$.
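• Step 4: Costate Solutions: The costate equations (7.1.19) along with the Hamiltonian (7.2.6) are
$$\dot{\lambda}_1^*(t) = -\frac{\partial \mathcal{H}}{\partial x_1^*} = 0, \qquad \dot{\lambda}_2^*(t) = -\frac{\partial \mathcal{H}}{\partial x_2^*} = -\lambda_1^*(t). \tag{7.2.11}$$
Solving the previous equations, we get the costates as
$$\lambda_1^*(t) = \lambda_1^*(0), \qquad \lambda_2^*(t) = \lambda_2^*(0) - \lambda_1^*(0)t. \tag{7.2.12}$$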
• Step 5: Time-Optimal Control Sequences: From the solutions
of the costates (7.2.12), we see that $\lambda_2^*(t)$ is a straight line in $t$, and
that, assuming the initial conditions $\lambda_1^*(0)$ and $\lambda_2^*(0)$ to be nonzero,
there are four possible solutions as shown in Figure 7.7. Also shown
are the four possible optimal control sequences
$$\{+1\}, \quad \{-1\}, \quad \{+1,-1\}, \quad \{-1,+1\} \tag{7.2.13}$$
that satisfy the optimal control relation (7.2.10). Let us reiterate
that the admissible optimal control sequences are the ones
given by (7.2.13). That is, a control sequence like $\{+1,-1,+1\}$
is not an optimal control sequence: it requires two switchings, which
violates the earlier result (Theorem 7.3) that an $n$th-order system has
at most $n - 1$ switchings, that is, at most one switching for a second-order
system. From Figure 7.7, we see that
the time-optimal control for the second-order (double integral)
system is piecewise constant and can switch at most once. In
order to arrive at a closed-loop realization of the optimal control,
we need to find the trajectories in the $(x_1(t), x_2(t))$ phase (state) plane.

Figure 7.7 Possible Costates and the Corresponding Controls
(panels: (a) $\lambda_1(0) > 0$, $\lambda_2(0) < 0$; (b) $\lambda_1(0) < 0$, $\lambda_2(0) > 0$; (c) $\lambda_1(0) < 0$, $\lambda_2(0) < 0$; (d) $\lambda_1(0) > 0$, $\lambda_2(0) > 0$)
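The link between the costate line (7.2.12) and the control sequences (7.2.13) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `control_sequence` and the horizon argument `t_final` are hypothetical, and both initial costates are assumed nonzero, as in the text.

```python
def control_sequence(lam1_0, lam2_0, t_final):
    """List the values taken by u*(t) = -sgn(lambda2(t)) on [0, t_final].

    lambda2(t) = lam2_0 - lam1_0 * t is the straight line (7.2.12).
    Assumes lam1_0 and lam2_0 are nonzero (hypothetical helper).
    """
    def u_at(t):
        lam2 = lam2_0 - lam1_0 * t
        return -1 if lam2 > 0 else +1

    sequence = [u_at(0.0)]
    # A straight line crosses zero at most once, at t = lam2_0 / lam1_0.
    t_switch = lam2_0 / lam1_0
    if 0.0 < t_switch < t_final:
        sequence.append(u_at(t_final))
    return sequence
```

Because a straight line has at most one zero crossing, the function can never return more than two entries, which is the "at most one switching" property stated above.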
• Step 6: State Trajectories: Solving the state equations (7.2.3),
we have
$$x_1^*(t) = x_1^*(0) + x_2^*(0)t + \tfrac{1}{2}Ut^2, \qquad x_2^*(t) = x_2^*(0) + Ut, \tag{7.2.14}$$
where $U = u^*(t) = \pm 1$. For phase plane plots, we need to eliminate
$t$ from the solutions (7.2.14) for the states. Thus (for simplicity,
we omit * since we are now dealing with optimal functions
only, and write $x_1(0) = x_{10}$, $x_2(0) = x_{20}$)
$$t = (x_2(t) - x_{20})/U, \qquad x_1(t) = x_{10} - \tfrac{1}{2}Ux_{20}^2 + \tfrac{1}{2}Ux_2^2(t), \tag{7.2.15}$$
where we used $U = \pm 1 = 1/U$. If $u = U = +1$,
$$t = x_2(t) - x_{20}, \qquad x_1(t) = x_{10} - \tfrac{1}{2}x_{20}^2 + \tfrac{1}{2}x_2^2(t) = C_1 + \tfrac{1}{2}x_2^2(t) \tag{7.2.16}$$
and if $u = U = -1$,
$$t = x_{20} - x_2(t), \qquad x_1(t) = x_{10} + \tfrac{1}{2}x_{20}^2 - \tfrac{1}{2}x_2^2(t) = C_2 - \tfrac{1}{2}x_2^2(t) \tag{7.2.17}$$
where $C_1 = x_{10} - \tfrac{1}{2}x_{20}^2$ and $C_2 = x_{10} + \tfrac{1}{2}x_{20}^2$ are constants. Now,
we can easily see that the relations (7.2.16) and (7.2.17) represent
a family of parabolas in the $(x_1, x_2)$ plane (or phase plane) as shown
in Figure 7.8. The arrows indicate the direction of motion for
increasing (positive) time.

Figure 7.8 Phase Plane Trajectories for u = +1 (dashed lines) and
u = -1 (dotted lines)

Our aim is to drive the system from
any initial state $(x_1(0), x_2(0))$ to the origin (0,0) in minimum time.
Then, from (7.2.15), we find that at $t = t_f$
$$x_1(t = t_f) = 0; \qquad x_2(t = t_f) = 0. \tag{7.2.18}$$
With this, (7.2.15) becomes
$$0 = x_{10} - \tfrac{1}{2}Ux_{20}^2 + 0 \;\longrightarrow\; x_{10} = \tfrac{1}{2}Ux_{20}^2. \tag{7.2.19}$$
Now, rewriting this for any initial state $x_1 = x_{10}$, $x_2 = x_{20}$, we
have
$$x_1 = \tfrac{1}{2}Ux_2^2. \tag{7.2.20}$$
Note that $x_1$ and $x_2$ in (7.2.20) are any initial states and not to
be confused with $x_1(t)$ and $x_2(t)$ in (7.2.15), which are the states
at any time $t$.
Now we can restate our problem as follows: find the time-optimal control
sequence to drive the system from any initial state $(x_1, x_2)$
to the origin (0,0) in minimum time.
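As a quick numerical check of the elimination of $t$ above, the following Python sketch evaluates the closed-form solution (7.2.14) and the quantity $x_1 - \tfrac{1}{2}Ux_2^2$, which (7.2.15) says stays constant along a trajectory with $U = \pm 1$. The function names are illustrative and not part of the original text.

```python
def state_at(x10, x20, U, t):
    """Closed-form solution (7.2.14) of the double integrator for constant u = U."""
    x1 = x10 + x20 * t + 0.5 * U * t * t
    x2 = x20 + U * t
    return x1, x2


def parabola_constant(x1, x2, U):
    """x1 - (1/2) U x2^2: equals C1 for U = +1 and C2 for U = -1."""
    return x1 - 0.5 * U * x2 * x2


if __name__ == "__main__":
    # Start on the parabola through the origin for U = +1, at (2, -2).
    for t in (0.0, 0.5, 1.0, 2.0):
        x1, x2 = state_at(2.0, -2.0, +1, t)
        print(t, x1, x2, parabola_constant(x1, x2, +1))
```

For the initial state (2, -2) and $U = +1$ the constant is zero, so this trajectory is the parabola (7.2.20) that passes through the origin.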
• Step 7: Switch Curve: From Figure 7.8, we see that there are
two curves labeled $\gamma_+$ and $\gamma_-$ which transfer any initial state
$(x_1, x_2)$ lying on them to the origin (0,0).
1. The $\gamma_+$ curve is the locus of all (initial) points $(x_1, x_2)$ which
can be transferred to the final point (0,0) by the control $u = +1$. That is
$$\gamma_+ = \{(x_1, x_2): x_1 = \tfrac{1}{2}x_2^2, \; x_2 \le 0\}. \tag{7.2.21}$$
2. The $\gamma_-$ curve is the locus of all (initial) points $(x_1, x_2)$ which
can be transferred to the final point (0,0) by the control $u = -1$. That is
$$\gamma_- = \{(x_1, x_2): x_1 = -\tfrac{1}{2}x_2^2, \; x_2 \ge 0\}. \tag{7.2.22}$$
3. The complete switch curve, i.e., the $\gamma$ curve, is defined as
the union (either or) of the partial switch curves $\gamma_+$ and $\gamma_-$.
That is
$$\gamma = \{(x_1, x_2): x_1 = -\tfrac{1}{2}x_2|x_2|\} = \gamma_+ \cup \gamma_- \tag{7.2.23}$$
where $\cup$ means the union operation.
The switch curve $\gamma$ is shown in Figure 7.9.

Figure 7.9 Switch Curve for Double Integral Time-Optimal
Control System (region $R_+$ with $u^* = +1$ and region $R_-$ with $u^* = -1$)
• Step 8: Phase Plane Regions: Let us now define the regions in
which we need to apply the control $u = +1$ or $u = -1$.
1. Let $R_+$ be the region of the points such that
$$R_+ = \{(x_1, x_2): x_1 < -\tfrac{1}{2}x_2|x_2|\}. \tag{7.2.24}$$
That is, $R_+$ consists of the points to the left of
the switch curve $\gamma$.
2. Let $R_-$ be the region of the points such that
$$R_- = \{(x_1, x_2): x_1 > -\tfrac{1}{2}x_2|x_2|\}. \tag{7.2.25}$$
That is, $R_-$ consists of the points to the right
of the switch curve $\gamma$.

Figure 7.10 Various Trajectories Generated by Four Possible
Control Sequences

Figure 7.10 shows the four possible control sequences (7.2.13) which
drive the system from any initial condition to the origin.
1. If the system is initially anywhere (say a) on the $\gamma_+$ curve,
the optimal control is $u = +1$ to drive the system to the origin
in minimum time $t_f$.
2. If the system is initially anywhere (say b) on the $\gamma_-$ curve,
the optimal control is $u = -1$ to drive the system to the origin
in minimum time $t_f$.
3. If the system is initially anywhere (say c) in the $R_+$ region,
the optimal control sequence is $u = \{+1, -1\}$ to drive the
system to the origin in minimum time $t_f$.
4. If the system is initially anywhere (say d) in the $R_-$ region,
the optimal control sequence is $u = \{-1, +1\}$ to drive the
system to the origin in minimum time $t_f$.
If we start at d, first apply the control $u = +1$, and then use the
optimal control sequence $u = \{-1, +1\}$, we certainly drive
the system to the origin, but
(a) we then have a control sequence $\{+1, -1, +1\}$ which is
not a member of the optimal control sequences (7.2.13),
and
(b) the time $t_f$ taken for the system using the control sequence
$\{+1, -1, +1\}$ is higher than the corresponding
time $t_f$ taken for the system with the control sequence
$\{-1, +1\}$.
• Step 9: Control Law: We now reintroduce * to indicate the
optimal values. The time-optimal control $u^*$ as a function of the
state $[x_1, x_2]$ is given by
$$u^* = u^*(x_1, x_2) = +1 \quad \text{for all } (x_1, x_2) \in \gamma_+ \cup R_+$$
$$u^* = u^*(x_1, x_2) = -1 \quad \text{for all } (x_1, x_2) \in \gamma_- \cup R_-. \tag{7.2.26}$$
Alternatively, if we define $z = x_1 + \tfrac{1}{2}x_2|x_2|$, then
$$z > 0 \;\Rightarrow\; u^* = -1, \qquad z < 0 \;\Rightarrow\; u^* = +1. \tag{7.2.27}$$
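The feedback law (7.2.26) and (7.2.27) translates directly into a small function. The sketch below is a minimal Python version, not the author's implementation; the function name and the choice of returning 0 at the origin (where the target has already been reached) are assumptions made for illustration.

```python
def time_optimal_control(x1, x2):
    """Relay law (7.2.26)/(7.2.27) for the double integrator with |u| <= 1."""
    z = x1 + 0.5 * x2 * abs(x2)
    if z > 0:          # region R_-: right of the switch curve
        return -1
    if z < 0:          # region R_+: left of the switch curve
        return +1
    # z == 0: the state is exactly on the switch curve gamma
    if x2 < 0:         # gamma_+ branch
        return +1
    if x2 > 0:         # gamma_- branch
        return -1
    return 0           # already at the origin (assumed convention)
```

In floating-point arithmetic the case `z == 0` is rarely hit exactly, so a practical controller would typically treat a small band around the switch curve as "on the curve".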
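• Step 10: Minimum Time: We can easily calculate the time taken
for the system starting at any position in state space and ending
at the origin. We use the set of equations (7.2.15) for each portion
of the trajectory. It can be shown that the minimum time $t_f^*$ for
the system starting from $(x_1, x_2)$ and arriving at (0,0) is given
by [6]
$$t_f^* = \begin{cases}
x_2 + \sqrt{4x_1 + 2x_2^2} & \text{if } (x_1, x_2) \in R_- \text{ or } x_1 > -\tfrac{1}{2}x_2|x_2| \\
-x_2 + \sqrt{-4x_1 + 2x_2^2} & \text{if } (x_1, x_2) \in R_+ \text{ or } x_1 < -\tfrac{1}{2}x_2|x_2| \\
|x_2| & \text{if } (x_1, x_2) \in \gamma \text{ or } x_1 = -\tfrac{1}{2}x_2|x_2|
\end{cases} \tag{7.2.28}$$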
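The minimum time can be evaluated with a short function. The Python sketch below assumes the standard closed form for the double integrator with $|u| \le 1$, obtained by adding the durations of the two parabolic arcs from (7.2.15); for a start in $R_-$, for instance, the first arc ($u = -1$) ends on $\gamma_+$ at $x_2 = -\sqrt{x_1 + \tfrac{1}{2}x_2^2}$. The function name is illustrative.

```python
import math


def minimum_time(x1, x2):
    """Minimum time to the origin for x1' = x2, x2' = u, |u| <= 1."""
    z = x1 + 0.5 * x2 * abs(x2)
    if z > 0:    # R_-: sequence {-1, +1}
        return x2 + math.sqrt(4.0 * x1 + 2.0 * x2 * x2)
    if z < 0:    # R_+: sequence {+1, -1}
        return -x2 + math.sqrt(-4.0 * x1 + 2.0 * x2 * x2)
    return abs(x2)   # on the switch curve: a single arc


if __name__ == "__main__":
    print(minimum_time(1.0, 0.0))   # start at rest, one unit from the origin
    print(minimum_time(2.0, -2.0))  # start on gamma_+
```

Starting at rest at (1, 0), for example, the control $u = -1$ is applied for one time unit to reach (0.5, -1) on $\gamma_+$, followed by $u = +1$ for one more time unit, for a total of two time units.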
7.2.3 Engineering Implementation of Control Law
Figure 7.11 shows the implementation of the optimal control law (7.2.26).
1. If the system is initially at $(x_1, x_2) \in R_-$, then $x_1 > -\tfrac{1}{2}x_2|x_2|$,
which means $z > 0$ and hence the output of the relay is $u^* = -1$.
2. On the other hand, if the system is initially at $(x_1, x_2) \in R_+$,
then $x_1 < -\tfrac{1}{2}x_2|x_2|$, which means $z < 0$ and hence the output of
the relay is $u^* = +1$.

Figure 7.11 Closed-Loop Implementation of Time-Optimal Control
Law (blocks: plant with two integrators; closed-loop time-optimal controller with function generator and relay)

Let us note that the closed-loop (feedback) optimal controller is nonlinear
(the control $u^*$ is a nonlinear function of $x_1^*$ and $x_2^*$) although the
system is linear. On the other hand, we found in Chapters 3 and 4 that, for
unconstrained control, the optimal control $u^*$ is a linear function of the
state $x^*$.
7.2.4 SIMULINK© Implementation of Control Law
The SIMULINK© implementation of the time-optimal control law is very
easy and convenient. The controller is easily obtained by using abs and
signum function blocks as shown in Figure 7.12. Using different initial
conditions, one can get the phase-plane ($x_1$ and $x_2$ plane) trajectories
belonging to $\gamma_+$, $\gamma_-$, $R_+$ and $R_-$ shown in Figures 7.13, 7.14, 7.15, and
7.16, respectively.
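For readers without SIMULINK, the same closed loop can be approximated in a few lines of Python. The sketch below is not the book's block diagram: it is a sampled-data simulation with a hypothetical step size `dt`, a hypothetical stopping tolerance `tol`, and a relay implemented with abs and sign logic as in (7.2.27). Within each step the control is held constant, so the exact update (7.2.14) is used.

```python
def relay(x1, x2):
    """Relay output from z = x1 + (1/2) x2 |x2|, as in (7.2.27)."""
    z = x1 + 0.5 * x2 * abs(x2)
    if z > 0:
        return -1.0
    if z < 0:
        return 1.0
    # exactly on the switch curve: pick the branch from the sign of x2
    return -1.0 if x2 > 0 else (1.0 if x2 < 0 else 0.0)


def simulate(x1, x2, dt=1e-3, tol=2e-2, t_max=20.0):
    """Simulate the closed loop until the state is within tol of the origin."""
    t = 0.0
    path = [(x1, x2)]
    while abs(x1) + abs(x2) > tol and t < t_max:
        u = relay(x1, x2)
        x1 += x2 * dt + 0.5 * u * dt * dt   # exact update for constant u
        x2 += u * dt
        t += dt
        path.append((x1, x2))
    return t, path


if __name__ == "__main__":
    t_end, path = simulate(1.0, 1.0)        # initial state in R_-
    print("approximate arrival time:", t_end)
```

Plotting the `path` list in the $(x_1, x_2)$ plane gives phase-plane trajectories of the kind described for Figures 7.13 to 7.16. Because of sampling, the switch happens up to one step late, so the arrival time only approximates the minimum time.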
7.3 Fuel-Optimal Control Systems
Fuel-optimal control systems arise often in aerospace systems where
the vehicles are controlled by thrusts and torques. Inputs such as
thrusts are due to the burning of fuel or expulsion of mass. Hence, the
natural question is whether we can control the vehicle so as to minimize the
fuel consumption. Another source of fuel-optimal control systems is
nuclear reactor control systems, where fuel remains within the system
and is not expelled from the system as in aerospace systems.

Figure 7.12 SIMULINK© Implementation of Time-Optimal
Control Law

An interesting historical account is found in [59] regarding fuel-optimal
control as applicable to the terminal phase of the lunar landing problem [100]
of the Apollo 11 mission, in which astronauts Neil Armstrong and
Edwin Aldrin soft-landed the Lunar Excursion Module (LEM) "Eagle"
on the lunar surface on July 20, 1969, while astronaut Michael Collins
was in orbit with the Apollo Command Module "Columbia".
7.3.1 Fuel-Optimal Control of a Double Integral System
In this section, we formulate the fuel-optimal control system and obtain
a solution to the system.

Figure 7.13 Phase-Plane Trajectory for $\gamma_+$: Initial State (2,-2) and
Final State (0,0)
Figure 7.14 Phase-Plane Trajectory for $\gamma_-$: Initial State (-2,2) and
Final State (0,0)
Figure 7.15 Phase-Plane Trajectory for $R_+$: Initial State (-1,-1)
and Final State (0,0)
Figure 7.16 Phase-Plane Trajectory for $R_-$: Initial State (1,1) and
Final State (0,0)
7.3.2 Problem Formulation and Statement
Consider a body with a unit mass undergoing translational motion
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = u(t), \qquad |u(t)| \le 1 \tag{7.3.1}$$
where $x_1(t)$ is the position, $x_2(t)$ is the velocity, and $u(t)$ is the thrust
force. Let us assume that the thrust (i.e., the control) is proportional
to $\phi(t)$, the rate of fuel consumption. Then, the total fuel consumed
becomes
$$J = \int_{t_0}^{t_f} \phi(t)\,dt. \tag{7.3.2}$$
Let us further assume that
1. the mass of fuel consumed is small compared with the total mass
of the body,
2. the rate of fuel consumption, $\phi(t)$, is proportional to the magnitude of the
thrust, $|u(t)|$, and
3. the final time $t_f$ is free or fixed.
Then from (7.3.2), the performance index can be formulated as
$$J(u) = \int_{t_0}^{t_f} |u(t)|\,dt. \tag{7.3.3}$$
The fuel-optimal control problem may be stated as follows: Find the
control $u(t)$ which forces the system (7.3.1) from any initial state
$(x_1(0), x_2(0)) = (x_{10}, x_{20})$ to the origin in a certain unspecified final time $t_f$ while
minimizing the fuel consumption (7.3.3).
Note that in case the final time $t_f$ is fixed, then that final time $t_f$ must
be greater than the minimum time $t_f^*$ required to drive the system from
$(x_{10}, x_{20})$ to the origin.
7.3.3 Problem Solution
The solution to the fuel-optimal system is provided first under the
following list of steps and then explained in detail.
• Step 1: Hamiltonian
• Step 2: Optimal Condition
• Step 3: Optimal Control
• Step 4: Costate Solutions
• Step 5: State Trajectories
• Step 6: Minimum Fuel
• Step 7: Switching Sequences
• Step 8: Control Law
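• Step 1: Hamiltonian: Let us formulate the Hamiltonian as
$$\mathcal{H}(\mathbf{x}(t), \boldsymbol{\lambda}(t), u(t)) = |u(t)| + \lambda_1(t)x_2(t) + \lambda_2(t)u(t). \tag{7.3.4}$$
• Step 2: Optimal Condition: According to the Minimum Principle,
the optimal condition is
$$\mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), u^*(t)) \le \mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), u(t)) = \min_{|u| \le 1} \{\mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), u(t))\}. \tag{7.3.5}$$
Using (7.3.4) in (7.3.5), we have
$$|u^*(t)| + \lambda_1^*(t)x_2^*(t) + \lambda_2^*(t)u^*(t) \le |u(t)| + \lambda_1^*(t)x_2^*(t) + \lambda_2^*(t)u(t) \tag{7.3.6}$$
which reduces to
$$|u^*(t)| + u^*(t)\lambda_2^*(t) \le |u(t)| + u(t)\lambda_2^*(t). \tag{7.3.7}$$
• Step 3: Optimal Control: Let us note at this point that
$$\min_{|u| \le 1} \{|u(t)| + u(t)\lambda_2^*(t)\} = |u^*(t)| + u^*(t)\lambda_2^*(t) \tag{7.3.8}$$
and
$$|u(t)| = \begin{cases} +u(t) & \text{if } u(t) \ge 0, \\ -u(t) & \text{if } u(t) \le 0. \end{cases} \tag{7.3.9}$$
Hence, we have
$$\min_{|u(t)| \le 1} \{|u(t)| + u(t)\lambda_2^*(t)\} = \begin{cases} \min_{|u(t)| \le 1} \{[+1 + \lambda_2^*(t)]\,u(t)\} & \text{if } u(t) \ge 0 \\ \min_{|u(t)| \le 1} \{[-1 + \lambda_2^*(t)]\,u(t)\} & \text{if } u(t) \le 0. \end{cases} \tag{7.3.10}$$
Let us now explore all the possible values of $\lambda_2^*(t)$ and the corresponding
optimal values of $u^*(t)$. Thus, we have the following
table.

Possible values of        Resulting values of       Minimum value of
$\lambda_2^*(t)$          $u^*(t)$                  $\{|u(t)| + u(t)\lambda_2^*(t)\}$
$\lambda_2^*(t) > +1$     $u^*(t) = -1$             $1 - \lambda_2^*(t)$
$\lambda_2^*(t) < -1$     $u^*(t) = +1$             $1 + \lambda_2^*(t)$
$\lambda_2^*(t) = +1$     $-1 \le u^*(t) \le 0$     $0$
$\lambda_2^*(t) = -1$     $0 \le u^*(t) \le +1$     $0$
$-1 < \lambda_2^*(t) < 1$ $u^*(t) = 0$              $0$

Possible values of        Resulting values of       Maximum value of
$\lambda_2^*(t)$          $u^*(t)$                  $\{|u(t)| + u(t)\lambda_2^*(t)\}$
$\lambda_2^*(t) = 0$      $u^*(t) = +1$ or $-1$     $+1$
$\lambda_2^*(t) > 0$      $u^*(t) = +1$             $1 + \lambda_2^*(t)$
$\lambda_2^*(t) < 0$      $u^*(t) = -1$             $1 - \lambda_2^*(t)$

These relations are also exhibited in Figure 7.17.
The previous tabular relations are also written as
$$u^*(t) = \begin{cases} 0 & \text{if } -1 < \lambda_2^*(t) < +1 \\ +1 & \text{if } \lambda_2^*(t) < -1 \\ -1 & \text{if } \lambda_2^*(t) > +1 \end{cases} \qquad \begin{aligned} 0 \le u^*(t) \le +1 \quad &\text{if } \lambda_2^*(t) = -1 \\ -1 \le u^*(t) \le 0 \quad &\text{if } \lambda_2^*(t) = +1. \end{aligned} \tag{7.3.11}$$
The previous relation is further rewritten as
$$u^*(t) = \begin{cases} 0 & \text{if } |\lambda_2^*(t)| < 1 \\ -\mathrm{sgn}\{\lambda_2^*(t)\} & \text{if } |\lambda_2^*(t)| > 1 \\ \text{undetermined} & \text{if } |\lambda_2^*(t)| = 1 \end{cases} \tag{7.3.12}$$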
where sgn is already defined in the previous section on time-optimal
control systems. In order to write the relation (7.3.12) in
a more compact form, let us define a dead-zone function between
input function $f_i$ and output function $f_o$, denoted by dez{ }, as follows:
$f_o = \mathrm{dez}\{f_i\}$ means that
$$f_o = \begin{cases} 0 & \text{if } |f_i| < 1 \\ \mathrm{sgn}\{f_i\} & \text{if } |f_i| > 1 \end{cases} \qquad \begin{aligned} 0 \le f_o \le 1 \quad &\text{if } f_i = +1 \\ -1 \le f_o \le 0 \quad &\text{if } f_i = -1. \end{aligned} \tag{7.3.13}$$
The dead-zone function is illustrated in Figure 7.18.
Using the definition of the dez function (7.3.13), we write the
control strategy (7.3.12) as
$$u^*(t) = -\mathrm{dez}\{\lambda_2^*(t)\}. \tag{7.3.14}$$
Using the previous definition of the dead-zone function (7.3.13), the
optimal control (7.3.14) is illustrated by Figure 7.19.

Figure 7.17 Relations Between $\lambda_2^*(t)$ and $|u^*(t)| + u^*(t)\lambda_2^*(t)$
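The dead-zone relation (7.3.13) and the control strategy (7.3.14) can be sketched as follows. This is an illustrative Python version: returning `None` when $|f_i| = 1$ is an assumed convention for signalling that the value is undetermined, since the definition only bounds the output there.

```python
def dez(f_in):
    """Dead-zone function (7.3.13); None marks the undetermined case |f_in| = 1."""
    if abs(f_in) < 1:
        return 0
    if abs(f_in) > 1:
        return 1 if f_in > 0 else -1
    return None  # |f_in| == 1: only bounds on the output are known


def fuel_optimal_u(lam2):
    """Fuel-optimal control u* = -dez{lambda2} of (7.3.14)."""
    value = dez(lam2)
    return None if value is None else -value


if __name__ == "__main__":
    for lam2 in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(lam2, fuel_optimal_u(lam2))
```

The control is zero whenever the costate lies strictly inside the dead zone, which is where the "off" part of the later bang-off-bang terminology comes from.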
• Step 4: Costate Solutions: Using the Hamiltonian (7.3.4), the
costates are described by
$$\dot{\lambda}_1^*(t) = -\frac{\partial \mathcal{H}}{\partial x_1^*} = 0, \qquad \dot{\lambda}_2^*(t) = -\frac{\partial \mathcal{H}}{\partial x_2^*} = -\lambda_1^*(t), \tag{7.3.15}$$
the solutions of which become
$$\lambda_1^*(t) = \lambda_1^*(0), \qquad \lambda_2^*(t) = \lambda_2^*(0) - \lambda_1^*(0)t. \tag{7.3.16}$$

Figure 7.18 Dead-Zone Function
Figure 7.19 Fuel-Optimal Control

From Figure 7.19, depending upon the values of $\lambda_1(0) \ne 0$ and
$\lambda_2(0)$, there are 9 admissible fuel-optimal control sequences:
$$\{0\}, \{+1\}, \{-1\}, \{-1,0\}, \{0,+1\}, \{+1,0\}, \{0,-1\}, \{-1,0,+1\}, \{+1,0,-1\}. \tag{7.3.17}$$
• Step 5: State Trajectories: The solutions of the state equations
(7.3.1), already obtained in (7.2.15) under the time-optimal control
system, are (omitting * for simplicity)
$$x_1(t) = x_{10} - \tfrac{1}{2}Ux_{20}^2 + \tfrac{1}{2}Ux_2^2(t), \qquad t = [x_2(t) - x_{20}]/U \tag{7.3.18}$$
for the control sequence $u(t) = U = \pm 1$. The switching curve
is the same as shown in Figure 7.9 (for time-optimal control
of a double integral system), which is repeated here in Figure 7.20.

Figure 7.20 Switching Curve for a Double Integral Fuel-Optimal
Control System

For the control sequence $u(t) = U = 0$, we have from (7.3.1)
$$x_1(t) = x_{10} + x_{20}t, \qquad x_2(t) = x_{20}, \qquad t = (x_1(t) - x_{10})/x_{20}. \tag{7.3.19}$$
These trajectories for $u(t) = 0$ are shown in Figure 7.21. Here,
we cannot drive the system from any initial state to the origin by
means of the zero control. For example, if the system is on the $x_1$
axis at (1,0), it stays there forever. Or if the system
is at (0,1) or (0,-1), it travels along the trajectory with constant
$x_{20}$ towards the right or left, respectively.

Figure 7.21 Phase-Plane Trajectories for u(t) = 0
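• Step 6: Minimum Fuel: If there is a control $u(t)$ which drives the
system from any initial condition $(x_{10}, x_{20})$ to the origin (0,0),
then the fuel consumed satisfies the relation
$$J = \int_0^{t_f} |u(t)|\,dt \ge |x_{20}| \tag{7.3.20}$$
and hence the minimum fuel is
$$J^* = |x_{20}|. \tag{7.3.21}$$
Proof: Solving the state equation (7.3.1), we have
$$x_2(t) = x_{20} + \int_0^t u(\tau)\,d\tau. \tag{7.3.22}$$
Since we must reach the origin at $t_f$, it follows from the previous
equation that
$$0 = x_2(t_f) = x_{20} + \int_0^{t_f} u(t)\,dt \tag{7.3.23}$$
which yields
$$x_{20} = -\int_0^{t_f} u(t)\,dt. \tag{7.3.24}$$
Or using the well-known inequality,
$$|x_{20}| = \left|\int_0^{t_f} u(t)\,dt\right| \le \int_0^{t_f} |u(t)|\,dt = J. \tag{7.3.25}$$
Hence, $|x_{20}| = J^*$. Note that if the initial state is $(x_{10}, 0)$, the minimum
fuel would be $J = 0$, which implies that $u(t) = 0$ for all $t \in [0, t_f]$.
But the zero control leaves the system at rest at $(x_{10}, 0)$ and never brings it
to the origin. In other words, a minimum-fuel solution does not exist for the
initial state $(x_{10}, 0)$.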
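The bound $J \ge |x_{20}|$ can be checked numerically for piecewise-constant controls. The Python sketch below propagates the double integrator exactly over each segment and accumulates the fuel $\int |u|\,dt$; the helper name `run_sequence` and the segment format `(u, duration)` are assumptions made for this illustration. The example starts at (1.5, -1) and compares a coast-then-thrust sequence $\{0, +1\}$ with the time-optimal sequence $\{-1, +1\}$.

```python
import math


def run_sequence(x1, x2, segments):
    """Apply piecewise-constant controls; segments is a list of (u, duration).

    Returns the final state and the fuel consumed, i.e. the integral of |u|.
    """
    fuel = 0.0
    for u, duration in segments:
        x1 += x2 * duration + 0.5 * u * duration * duration
        x2 += u * duration
        fuel += abs(u) * duration
    return x1, x2, fuel


if __name__ == "__main__":
    # Coast (u = 0) for 1 s to reach (0.5, -1) on gamma_+, then u = +1 for 1 s.
    print(run_sequence(1.5, -1.0, [(0.0, 1.0), (1.0, 1.0)]))
    # Time-optimal alternative: u = -1 until gamma_+, then u = +1.
    t1 = math.sqrt(2.0) - 1.0
    print(run_sequence(1.5, -1.0, [(-1.0, t1), (1.0, math.sqrt(2.0))]))
```

Both sequences reach the origin, but only the first attains the lower bound $|x_{20}| = 1$; the time-optimal sequence arrives sooner at the cost of more fuel.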
• Step 7: Switching Sequences: Now let us define the various regions
in state space. (See Figure 7.22.)

Figure 7.22 Fuel-Optimal Control Sequences

1. $R_1$ ($R_3$) is the region to the right (left) of the $\gamma$ curve
for positive (negative) values of $x_2$.
2. $R_2$ ($R_4$) is the region to the left (right) of the $\gamma$ curve
for positive (negative) values of $x_2$.
Now, depending upon the initial position of the system, we have
a particular optimal control sequence (see Figure 7.22).
1. $\gamma_+$ and $\gamma_-$ Curves: If the initial condition $(x_{10}, x_{20}) \in \gamma_+$ ($\gamma_-$),
then the control is $u(t) = +1$ ($u(t) = -1$).
2. $R_2$ and $R_4$ Regions: If the initial condition is $(x_{10}, x_{20}) \in R_4$,
then the control sequence $\{0, +1\}$ forces the system
to (0,0), through $A_4$ to $B$, and then to $O$, and hence is
fuel-optimal. Although the control sequence $\{-1, 0, +1\}$ also
drives the system to the origin through $A_4CDO$, it is not optimal.
Similarly, in the region $R_2$, the optimal control sequence is
$\{0, -1\}$.
3. $R_1$ and $R_3$ Regions: Let us position the system at $A_1(x_{10}, x_{20})$
in the region $R_1$, as shown in Figure 7.23. As seen, staying
in region $R_1$ there is no way one can drive the system at
$A_1$ to the origin, as the control sequence $u^*(t) = 0$ drives the
system towards the right (or away from the origin).

Figure 7.23 ε-Fuel-Optimal Control

Thus, there is no fuel-optimal solution for the system for
region $R_1$. However, given any $\epsilon > 0$, there is a control sequence
$\{-1, 0, +1\}$ which forces the system to the origin. Then,
the fuel consumed is
$$J_\epsilon = |x_{20}| + \left|\frac{\epsilon}{2}\right| + \left|\frac{\epsilon}{2}\right| = |x_{20}| + \epsilon = J^* + \epsilon \ge J^*. \tag{7.3.26}$$
We call such a control ε-fuel-optimal. Similarly, for the region
$R_3$, we have the ε-fuel-optimal control sequence given
as $\{+1, 0, -1\}$. Note that the control sequence $\{-1, +1\}$
through $A_1BCEO$ is not an allowable optimal control sequence
(7.3.17) and also consumes more fuel than the ε-fuel-optimal
sequence through $A_1BCDO$. Also, we would like to make $\epsilon$ as small
as possible and apply the control $\{0\}$ as soon as the trajectory
enters the region $R_4$.
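The trade-off in the ε-fuel-optimal sequence $\{-1, 0, +1\}$ can be made concrete with a short calculation. The Python sketch below is an illustration under stated assumptions: the initial state lies in $R_1$ with $x_{20} > 0$, the thrust $u = -1$ is kept until the velocity reaches $-\epsilon/2$, the system then coasts until it meets $\gamma_+$, and $u = +1$ finishes the transfer. The function names are hypothetical.

```python
def epsilon_fuel_sequence(x10, x20, eps):
    """Segments (u, duration), fuel, and total time for {-1, 0, +1} from R1.

    Assumes x20 > 0, (x10, x20) in R1, and eps small enough that the coast
    segment has positive duration.
    """
    v = eps / 2.0
    t1 = x20 + v                                 # u = -1: x2 goes from x20 to -v
    x1_after_thrust = x10 + x20 * t1 - 0.5 * t1 * t1
    x1_on_curve = 0.5 * v * v                    # point of gamma_+ with x2 = -v
    t2 = (x1_after_thrust - x1_on_curve) / v     # coast at constant velocity -v
    t3 = v                                       # u = +1: x2 goes from -v to 0
    segments = [(-1.0, t1), (0.0, t2), (1.0, t3)]
    return segments, t1 + t3, t1 + t2 + t3


def final_state(x1, x2, segments):
    """Propagate the double integrator exactly through the segments."""
    for u, duration in segments:
        x1 += x2 * duration + 0.5 * u * duration * duration
        x2 += u * duration
    return x1, x2


if __name__ == "__main__":
    for eps in (0.4, 0.2, 0.1):
        segments, fuel, total_time = epsilon_fuel_sequence(1.0, 1.0, eps)
        print(eps, fuel, total_time, final_state(1.0, 1.0, segments))
```

The fuel equals $|x_{20}| + \epsilon$ as in (7.3.26), while the coasting time grows roughly like $1/\epsilon$, which is why the infimum $J^* = |x_{20}|$ is approached but never attained in $R_1$.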
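• Step 8: Control Law: The fuel-optimal control law for driving
the system from any initial state $(x_1, x_2)$ to the origin can be
stated as follows:
$$u^* = u^*(x_1, x_2) = \begin{cases} +1 & \text{for all } (x_1, x_2) \in \gamma_+ \\ -1 & \text{for all } (x_1, x_2) \in \gamma_- \\ 0 & \text{for all } (x_1, x_2) \in R_2 \cup R_4. \end{cases} \tag{7.3.27}$$
If $(x_1, x_2) \in R_1 \cup R_3$, there is no fuel-optimal control. However,
there is an ε-fuel-optimal control as described above.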
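The region logic of Steps 7 and 8 can be summarized in a small classification function. The Python sketch below is illustrative only: the function name is hypothetical, `None` is an assumed way of reporting that no fuel-optimal control exists ($R_1 \cup R_3$, or a nonzero state at rest on the $x_1$ axis), and the value 0 returned at the origin is a convention for "already there".

```python
def fuel_optimal_control(x1, x2):
    """Fuel-optimal feedback for the double integrator, or None if none exists."""
    z = x1 + 0.5 * x2 * abs(x2)      # z = 0 on the switch curve gamma
    if z == 0:
        if x2 < 0:
            return +1                # on gamma_+
        if x2 > 0:
            return -1                # on gamma_-
        return 0                     # at the origin
    if x2 == 0:
        return None                  # at rest away from the origin
    right_of_curve = z > 0
    if right_of_curve and x2 < 0:
        return 0                     # R4: coast until gamma_+ is reached
    if (not right_of_curve) and x2 > 0:
        return 0                     # R2: coast until gamma_- is reached
    return None                      # R1 or R3: only epsilon-fuel-optimal control
```

A practical controller would replace the exact test `z == 0` by a small tolerance band, and in $R_1 \cup R_3$ would fall back to the ε-fuel-optimal sequence.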
7.4 Minimum-Fuel System: LTI System
7.4.1 Problem Statement
Let us consider a linear, time-invariant system
$$\dot{\mathbf{x}}(t) = \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t) \tag{7.4.1}$$
where $\mathbf{x}(t)$ and $\mathbf{u}(t)$ are $n$- and $r$-dimensional state and control
vectors, respectively. Let us assume that the control $\mathbf{u}(t)$ is constrained
as
$$-1 \le \mathbf{u}(t) \le +1 \quad \text{or} \quad |\mathbf{u}(t)| \le 1 \tag{7.4.2}$$
or component wise,
$$|u_j(t)| \le 1, \qquad j = 1, 2, \ldots, r. \tag{7.4.3}$$
Our problem is to find the optimal control $\mathbf{u}^*(t)$ which transfers the
system (7.4.1) from any initial condition $\mathbf{x}(0)$ to a given final state
(usually the origin) and minimizes the performance measure
$$J(\mathbf{u}) = \int_0^{t_f} \sum_{j=1}^{r} |u_j(t)|\,dt. \tag{7.4.4}$$
7.4.2 Problem Solution
We present the solution to this fuel-optimal system under the following
steps, which we first list.
• Step 1: Hamiltonian
• Step 2: Optimal Condition
• Step 3: Costate Functions
• Step 4: Normal Fuel-Optimal Control System
• Step 5: Bang-off-Bang Control Law
• Step 6: Implementation
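• Step 1: Hamiltonian: Let us formulate the Hamiltonian for the
system (7.4.1) and the performance measure (7.4.4) as
$$\mathcal{H}(\mathbf{x}(t), \mathbf{u}(t), \boldsymbol{\lambda}(t)) = \sum_{j=1}^{r} |u_j(t)| + \boldsymbol{\lambda}'(t)\mathbf{A}\mathbf{x}(t) + \boldsymbol{\lambda}'(t)\mathbf{B}\mathbf{u}(t). \tag{7.4.5}$$
• Step 2: Optimal Condition: According to the Pontryagin Principle,
the optimal condition is given by
$$\mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), \mathbf{u}^*(t)) \le \mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), \mathbf{u}(t)) = \min_{|\mathbf{u}(t)| \le 1} \{\mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), \mathbf{u}(t))\}. \tag{7.4.6}$$
Using (7.4.5) in (7.4.6), we have
$$\sum_{j=1}^{r} |u_j^*(t)| + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{A}\mathbf{x}^*(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}\mathbf{u}^*(t) \le \sum_{j=1}^{r} |u_j(t)| + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{A}\mathbf{x}^*(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}\mathbf{u}(t) \tag{7.4.7}$$
which in turn yields
$$\sum_{j=1}^{r} |u_j^*(t)| + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}\mathbf{u}^*(t) \le \sum_{j=1}^{r} |u_j(t)| + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}\mathbf{u}(t)$$
or transposing
$$\sum_{j=1}^{r} |u_j^*(t)| + \mathbf{u}^{*\prime}(t)\mathbf{B}'\boldsymbol{\lambda}^*(t) \le \sum_{j=1}^{r} |u_j(t)| + \mathbf{u}'(t)\mathbf{B}'\boldsymbol{\lambda}^*(t). \tag{7.4.8}$$
Considering the various possibilities as before for the double integral
system, we define
$$\mathbf{q}^*(t) = \mathbf{B}'\boldsymbol{\lambda}^*(t). \tag{7.4.9}$$
Using the earlier relations (7.3.11) and (7.3.12) for the dead-zone
function, we can write the condition (7.4.6) as
$$\mathbf{u}^*(t) = -\mathrm{DEZ}\{\mathbf{q}^*(t)\} = -\mathrm{DEZ}\{\mathbf{B}'\boldsymbol{\lambda}^*(t)\} \tag{7.4.10}$$
or component wise,
$$u_j^*(t) = -\mathrm{dez}\{q_j^*(t)\} = -\mathrm{dez}\{\mathbf{b}_j'\boldsymbol{\lambda}^*(t)\} \tag{7.4.11}$$
where $j = 1, 2, \ldots, r$. The optimal control (7.4.10) in terms of
the dead-zone (dez) function is shown in Figure 7.24.

Figure 7.24 Optimal Control as Dead-Zone Function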
• Step 3: Costate Functions: The costate functions $\boldsymbol{\lambda}^*(t)$ are given
in terms of the Hamiltonian as
$$\dot{\boldsymbol{\lambda}}^*(t) = -\frac{\partial \mathcal{H}}{\partial \mathbf{x}} = -\mathbf{A}'\boldsymbol{\lambda}^*(t) \tag{7.4.12}$$
the solution of which is
$$\boldsymbol{\lambda}^*(t) = e^{-\mathbf{A}'t}\boldsymbol{\lambda}(0). \tag{7.4.13}$$
Depending upon the nature of the function $\mathbf{q}^*(t)$, we can classify
the system as a normal fuel-optimal control (NFOC) system, if $|\mathbf{q}^*(t)| = 1$
only at switch times as shown in Figure 7.25, or as a singular
fuel-optimal control (SFOC) system, if $|\mathbf{q}^*(t)| = 1$ for some
$t \in [T_1, T_2]$, as shown in Figure 7.26.

Figure 7.25 Normal Fuel-Optimal Control System
• Step 4: Normal Fuel-Optimal Control System: We first derive
the necessary conditions for the fuel-optimal system to be singular
and then translate these into sufficient conditions for the system
to be normal, that is, the negation of the conditions for singular
is taken as that for normal.
For the fuel-optimal system to be singular, it is necessary that in
the system interval $[0, t_f]$, there is at least one subinterval $[T_1, T_2]$
for which
$$|\mathbf{q}^*(t)| = 1. \tag{7.4.14}$$
Using (7.4.9), the previous condition becomes
$$|\mathbf{q}^*(t)| = |\mathbf{B}'\boldsymbol{\lambda}^*(t)| = 1. \tag{7.4.15}$$

Figure 7.26 Singular Fuel-Optimal Control System (showing $u^*(t) = -\mathrm{dez}\{q^*(t)\}$ and the singular intervals)

This means that the function $\mathbf{q}^*(t)$ is constant on the subinterval and hence all
its time derivatives must vanish. By repeated differentiation of
(7.4.15) and using (7.4.12), we have
$$(\mathbf{A}\mathbf{b}_j)'\boldsymbol{\lambda}^*(t) = 0, \quad (\mathbf{A}^2\mathbf{b}_j)'\boldsymbol{\lambda}^*(t) = 0, \quad \ldots, \quad (\mathbf{A}^{n-1}\mathbf{b}_j)'\boldsymbol{\lambda}^*(t) = 0, \quad (\mathbf{A}^{n}\mathbf{b}_j)'\boldsymbol{\lambda}^*(t) = 0, \tag{7.4.16}$$
for all $t \in [T_1, T_2]$, where $j = 1, 2, \ldots, r$. We can rewrite the
previous set of equations as
$$(\mathbf{A}\mathbf{G}_j)'\boldsymbol{\lambda}^*(t) = 0 \tag{7.4.17}$$
where
$$\mathbf{G}_j = [\mathbf{b}_j, \mathbf{A}\mathbf{b}_j, \ldots, \mathbf{A}^{n-1}\mathbf{b}_j]. \tag{7.4.18}$$
The condition (7.4.17) can further be rewritten as
$$\mathbf{G}_j'\mathbf{A}'\boldsymbol{\lambda}^*(t) = 0. \tag{7.4.19}$$
But the condition (7.4.15) implies that $\boldsymbol{\lambda}^*(t) \ne 0$. Then, for
(7.4.19) to hold, it is necessary that the matrix $\mathbf{G}_j'\mathbf{A}'$ must be
singular. This means that
$$\det\{\mathbf{G}_j'\mathbf{A}'\} = \det \mathbf{A} \, \det \mathbf{G}_j = 0. \tag{7.4.20}$$
Thus, the sufficient condition for the system to be normal is that
$$\det\{\mathbf{G}_j'\mathbf{A}'\} \ne 0 \quad \forall \; j = 1, 2, \ldots, r. \tag{7.4.21}$$
Thus, if the system (7.4.1) is normal (that is, also controllable),
and if the matrix $\mathbf{A}$ is nonsingular, then the fuel-optimal system
is normal.
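The sufficient condition (7.4.21) is easy to test numerically for a given pair $(\mathbf{A}, \mathbf{B})$. The sketch below uses plain Python lists so that it needs no external library; the helper names, the determinant routine, and the tolerance `1e-12` are illustrative choices rather than anything prescribed by the text.

```python
def mat_vec(A, v):
    """Matrix-vector product for A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [[float(x) for x in row] for row in M]
    n, d = len(M), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        if abs(M[p][i]) < 1e-12:
            return 0.0
        if p != i:
            M[i], M[p] = M[p], M[i]
            d = -d
        d *= M[i][i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= f * M[i][c]
    return d


def fuel_normality_test(A, B):
    """True if det(A) * det(G_j) != 0 for every column b_j of B, as in (7.4.21)."""
    n, r = len(A), len(B[0])
    det_a = det(A)
    for j in range(r):
        col = [B[i][j] for i in range(n)]
        cols = []
        for _ in range(n):                       # b_j, A b_j, ..., A^(n-1) b_j
            cols.append(col)
            col = mat_vec(A, col)
        G = [[cols[k][i] for k in range(n)] for i in range(n)]
        if abs(det_a * det(G)) <= 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(fuel_normality_test([[0, 1], [0, 0]], [[0], [1]]))    # double integrator
    print(fuel_normality_test([[0, 1], [-1, 0]], [[0], [1]]))   # harmonic oscillator
```

For the double integrator the matrix $\mathbf{A}$ is singular, so the sufficient condition is not met, which is consistent with the undetermined cases found in Section 7.3; for the harmonic oscillator both determinants are nonzero. A `False` result only means that this sufficient test is inconclusive.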
• Step 5: Bang-off-Bang Control Law: If the linear, time-invariant
system (7.4.1) is normal and $\mathbf{x}^*(t)$ and $\boldsymbol{\lambda}^*(t)$ are the state and
costate trajectories, then the optimal control law $\mathbf{u}^*(t)$ given by
(7.4.10) is repeated here as
$$\mathbf{u}^*(t) = -\mathrm{DEZ}\{\mathbf{B}'\boldsymbol{\lambda}^*(t)\} \tag{7.4.22}$$
for all $t \in [t_0, t_f]$. In other words, if the fuel-optimal system is
normal, the components of the fuel-optimal control are piecewise
constant functions of time. The fuel-optimal control can switch
between +1, 0 and -1 and hence is called the bang-off-bang control
(or principle).
• Step 6: Implementation: As before in the time-optimal control system,
the fuel-optimal control law can be implemented either in an
open-loop configuration, as shown in Figure 7.27, or in a closed-loop
configuration. In the open-loop case, an iterative procedure on the initial
costate $\boldsymbol{\lambda}(0)$ is to be used to finally drive the state to the origin.

Figure 7.27 Open-Loop Implementation of Fuel-Optimal Control
System (open-loop fuel-optimal controller and plant; iteration: change $\boldsymbol{\lambda}(0)$ until the state reaches the origin)

On the other hand, we can realize the closed-loop configuration as
shown in Figure 7.28, where the current initial state is used to
realize the fuel-optimal control law (7.4.22).
Figure 7.28 Closed-Loop Implementation of Fuel-Optimal Control
System (closed-loop fuel-optimal controller with function $h[\mathbf{x}^*(t)]$ and dead-zone element; plant $\dot{\mathbf{x}}(t) = \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t)$)
7.4.3 SIMULINK© Implementation of Control Law
The SIMULINK© implementation of the fuel-optimal control law for the
double integral system described in the previous section is very convenient.
The controller is obtained by using abs, signum, and dead-zone
function blocks as shown in Figure 7.29. Further note that since the
relay with dead-zone function block required for fuel-optimal control,
as shown in Figure 7.19, is not readily available in the SIMULINK© library,
the function block is realized by combining dead-zone and sign
function blocks [6]. Using different initial conditions, one can get the
phase-plane ($x_1$ and $x_2$ plane) trajectories belonging to $\gamma_+$, $\gamma_-$, $R_1$,
$R_3$, $R_2$ and $R_4$ shown in Figures 7.30, 7.31, 7.32, 7.33, 7.34, and
7.35, respectively. In particular, note the trajectories belonging to the $R_1$
and $R_3$ regions showing the ε-fuel-optimal condition.
7.5 Energy-Optimal Control Systems
In minimum-energy (energy-optimal) systems with constraints, we often
formulate the performance measure as the energy of an electrical
(or mechanical) system. For example, if $u(t)$ is the voltage input to a
field circuit in a typical constant armature-current, field-controlled
positional control system, with negligible field inductance and a unit field
resistance, the total energy to the field circuit is (power is $u^2(t)/R_f$,
where $R_f = 1$ is the field resistance)
$$J = \int_{t_0}^{t_f} u^2(t)\,dt \tag{7.5.1}$$
and the field voltage $u(t)$ is constrained by $|u(t)| \le 110$. This section
is based on [6, 89].
Figure 7.29 SIMULINK© Implementation of Fuel-Optimal
Control Law (plant, controller, and phase-plane display)
7.5.1 Problem Formulation and Statement
Let us now formulate the energy-optimal control (EOC) system with
magnitude-constrained control. Consider a linear, time-varying, fully
controllable system
$$\dot{\mathbf{x}}(t) = \mathbf{A}(t)\mathbf{x}(t) + \mathbf{B}(t)\mathbf{u}(t) \tag{7.5.2}$$
where $\mathbf{x}(t)$ and $\mathbf{u}(t)$ are $n$- and $r$-dimensional state and control vectors,
respectively, and the energy cost functional
$$J = \frac{1}{2}\int_{t_0}^{t_f} \mathbf{u}'(t)\mathbf{R}(t)\mathbf{u}(t)\,dt. \tag{7.5.3}$$
Let us assume that the control $\mathbf{u}(t)$ is constrained as
$$-1 \le \mathbf{u}(t) \le +1 \quad \text{or} \quad |\mathbf{u}(t)| \le 1 \tag{7.5.4}$$
or component wise,
$$|u_j(t)| \le 1, \qquad j = 1, 2, \ldots, r. \tag{7.5.5}$$
Figure 7.30 Phase-Plane Trajectory for $\gamma_+$: Initial State (2,-2) and
Final State (0,0)
Figure 7.31 Phase-Plane Trajectory for $\gamma_-$: Initial State (-2,2) and
Final State (0,0)
Figure 7.32 Phase-Plane Trajectory for $R_1$: Initial State (1,1) and
Final State (0,0)
Figure 7.33 Phase-Plane Trajectory for $R_3$: Initial State (-1,-1)
and Final State (0,0)
Figure 7.34 Phase-Plane Trajectory for $R_2$: Initial State (-1.5,1)
and Final State (0,0)
Figure 7.35 Phase-Plane Trajectory for $R_4$: Initial State (1.5,-1)
and Final State (0,0)
Problem Statement
The energy-optimal control system is to transfer the system (7.5.2)
from any initial state $\mathbf{x}(t = t_0) = \mathbf{x}(t_0) \ne 0$ to the origin in time $t_f$ and
at the same time minimize the energy cost functional (7.5.3) with the
constraint relation (7.5.4).
7.5.2 Problem Solution
We present the solution to this energy-optimal system under the following
steps. But first let us list the various steps involved.
• Step 1: Hamiltonian
• Step 2: State and Costate Equations
• Step 3: Optimal Condition
• Step 4: Optimal Control
• Step 5: Implementation
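• Step 1: Hamiltonian: Let us formulate the Hamiltonian for the
system (7.5.2) and the PI (7.5.3) as
$$\mathcal{H}(\mathbf{x}(t), \mathbf{u}(t), \boldsymbol{\lambda}(t)) = \tfrac{1}{2}\mathbf{u}'(t)\mathbf{R}(t)\mathbf{u}(t) + \boldsymbol{\lambda}'(t)\mathbf{A}(t)\mathbf{x}(t) + \boldsymbol{\lambda}'(t)\mathbf{B}(t)\mathbf{u}(t) \tag{7.5.6}$$
where $\boldsymbol{\lambda}(t)$ is the costate variable.
• Step 2: State and Costate Equations: Let us assume optimal values
$\mathbf{u}^*(t)$, $\mathbf{x}^*(t)$, and $\boldsymbol{\lambda}^*(t)$. Then, the state $\mathbf{x}^*(t)$ and the costate
$\boldsymbol{\lambda}^*(t)$ optimal values are given in terms of the Hamiltonian as
$$\dot{\mathbf{x}}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \boldsymbol{\lambda}}\right)_* = \mathbf{A}(t)\mathbf{x}^*(t) + \mathbf{B}(t)\mathbf{u}^*(t)$$
$$\dot{\boldsymbol{\lambda}}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial \mathbf{x}}\right)_* = -\mathbf{A}'(t)\boldsymbol{\lambda}^*(t) \tag{7.5.7}$$
with the boundary conditions
$$\mathbf{x}^*(t_0) = \mathbf{x}(t_0); \qquad \mathbf{x}^*(t_f) = 0 \tag{7.5.8}$$
where we again note that $t_f$ is either fixed or free.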
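• Step 3: Optimal Condition: Now using the Pontryagin Principle, we
invoke the condition for optimal control in terms of the Hamiltonian,
that is,
$$\mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), \mathbf{u}^*(t)) \le \mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), \mathbf{u}(t)) = \min_{|\mathbf{u}(t)| \le 1} \mathcal{H}(\mathbf{x}^*(t), \boldsymbol{\lambda}^*(t), \mathbf{u}(t)). \tag{7.5.9}$$
Using (7.5.6) in (7.5.9), we have
$$\tfrac{1}{2}\mathbf{u}^{*\prime}(t)\mathbf{R}(t)\mathbf{u}^*(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{A}(t)\mathbf{x}^*(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}(t)\mathbf{u}^*(t) \le \tfrac{1}{2}\mathbf{u}'(t)\mathbf{R}(t)\mathbf{u}(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{A}(t)\mathbf{x}^*(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}(t)\mathbf{u}(t) \tag{7.5.10}$$
which becomes
$$\tfrac{1}{2}\mathbf{u}^{*\prime}(t)\mathbf{R}(t)\mathbf{u}^*(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}(t)\mathbf{u}^*(t) \le \tfrac{1}{2}\mathbf{u}'(t)\mathbf{R}(t)\mathbf{u}(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}(t)\mathbf{u}(t) = \min_{|\mathbf{u}(t)| \le 1} \left\{\tfrac{1}{2}\mathbf{u}'(t)\mathbf{R}(t)\mathbf{u}(t) + \boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}(t)\mathbf{u}(t)\right\}. \tag{7.5.11}$$
• Step 4: Optimal Control: Let us denote
$$\mathbf{q}^*(t) = \mathbf{R}^{-1}(t)\mathbf{B}'(t)\boldsymbol{\lambda}^*(t) \tag{7.5.12}$$
and write
$$\boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}(t)\mathbf{u}^*(t) = \mathbf{u}^{*\prime}(t)\mathbf{B}'(t)\boldsymbol{\lambda}^*(t) = \mathbf{u}^{*\prime}(t)\mathbf{R}(t)\mathbf{q}^*(t). \tag{7.5.13}$$
Using (7.5.12) and (7.5.13) in (7.5.11), we get
$$\tfrac{1}{2}\mathbf{u}^{*\prime}(t)\mathbf{R}(t)\mathbf{u}^*(t) + \mathbf{u}^{*\prime}(t)\mathbf{R}(t)\mathbf{q}^*(t) \le \tfrac{1}{2}\mathbf{u}'(t)\mathbf{R}(t)\mathbf{u}(t) + \mathbf{u}'(t)\mathbf{R}(t)\mathbf{q}^*(t). \tag{7.5.14}$$
Now, adding
$$\tfrac{1}{2}\mathbf{q}^{*\prime}(t)\mathbf{R}(t)\mathbf{q}^*(t) = \tfrac{1}{2}\boldsymbol{\lambda}^{*\prime}(t)\mathbf{B}(t)\mathbf{R}^{-1}(t)\mathbf{B}'(t)\boldsymbol{\lambda}^*(t) \tag{7.5.15}$$
to both sides of (7.5.14) and cancelling the common factor 1/2, we get
$$[\mathbf{u}^*(t) + \mathbf{q}^*(t)]'\mathbf{R}(t)[\mathbf{u}^*(t) + \mathbf{q}^*(t)] \le [\mathbf{u}(t) + \mathbf{q}^*(t)]'\mathbf{R}(t)[\mathbf{u}(t) + \mathbf{q}^*(t)]. \tag{7.5.16}$$
That is
$$\mathbf{w}^{*\prime}(t)\mathbf{R}(t)\mathbf{w}^*(t) \le \mathbf{w}'(t)\mathbf{R}(t)\mathbf{w}(t) = \min_{|\mathbf{u}(t)| \le 1} \{\mathbf{w}'(t)\mathbf{R}(t)\mathbf{w}(t)\}, \tag{7.5.17}$$
where
$$\mathbf{w}(t) = \mathbf{u}(t) + \mathbf{q}^*(t) = \mathbf{u}(t) + \mathbf{R}^{-1}(t)\mathbf{B}'(t)\boldsymbol{\lambda}^*(t)$$
$$\mathbf{w}^*(t) = \mathbf{u}^*(t) + \mathbf{q}^*(t) = \mathbf{u}^*(t) + \mathbf{R}^{-1}(t)\mathbf{B}'(t)\boldsymbol{\lambda}^*(t). \tag{7.5.18}$$
The relation (7.5.17) implies that $\mathbf{w}'(t)\mathbf{R}(t)\mathbf{w}(t)$ attains its minimum value
at $\mathbf{w}^*(t)$.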
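Now we know that
1. if $\mathbf{R}(t)$ is positive definite for all $t$,
its eigenvalues $d_1(t), d_2(t), \ldots, d_r(t)$ are positive,
2. if $\mathbf{D}(t)$ is the diagonal matrix of
the eigenvalues $d_1(t), d_2(t), \ldots, d_r(t)$ of $\mathbf{R}(t)$, then
3. there is an orthogonal matrix $\mathbf{M}$ such that
$$\mathbf{M}'\mathbf{M} = \mathbf{I} \;\longrightarrow\; \mathbf{M}' = \mathbf{M}^{-1} \tag{7.5.19}$$
and
$$\mathbf{D} = \mathbf{M}'\mathbf{R}\mathbf{M} \;\longrightarrow\; \mathbf{M}\mathbf{D}\mathbf{M}' = \mathbf{R}. \tag{7.5.20}$$
Now, using (7.5.20) along with (7.5.17), we have
$$\mathbf{w}'(t)\mathbf{R}\mathbf{w}(t) = \mathbf{w}'(t)\mathbf{M}\mathbf{D}\mathbf{M}'\mathbf{w}(t) = \mathbf{v}'(t)\mathbf{D}\mathbf{v}(t) = \sum_{j=1}^{r} d_j(t)v_j^2(t) \tag{7.5.21}$$
where $\mathbf{v}(t) = \mathbf{M}'\mathbf{w}(t)$ and note that $d_j > 0$. Since both $\mathbf{M}'$ and
$\mathbf{M}$ are orthogonal, we know that
$$\mathbf{v}'(t)\mathbf{v}(t) = \mathbf{w}'(t)\mathbf{M}\mathbf{M}'\mathbf{w}(t) = \mathbf{w}'(t)\mathbf{w}(t) \tag{7.5.22}$$
where we used $\mathbf{M}\mathbf{M}' = \mathbf{I}$, which follows from (7.5.19). We can equivalently write (7.5.22)
component wise as
$$\sum_{j=1}^{r} v_j^2(t) = \sum_{j=1}^{r} w_j^2(t). \tag{7.5.23}$$
Now, (7.5.17) implies that (using (7.5.21))
$$\min_{|\mathbf{u}(t)| \le 1} \{\mathbf{w}'(t)\mathbf{R}(t)\mathbf{w}(t)\} = \min_{|\mathbf{u}(t)| \le 1} \left\{\sum_{j=1}^{r} d_j(t)v_j^2(t)\right\} = \sum_{j=1}^{r} \min_{v_j(t)} \{d_j(t)v_j^2(t)\}. \tag{7.5.24}$$
This implies that if $\mathbf{w}^*(t)$ minimizes $\mathbf{w}'(t)\mathbf{R}(t)\mathbf{w}(t)$, then the components
$v_j(t)$ also minimize $\mathbf{v}'(t)\mathbf{v}(t)$. This fact is also evident
from (7.5.22). In other words, we have established that
$$\text{if } \mathbf{w}^{*\prime}(t)\mathbf{R}(t)\mathbf{w}^*(t) \le \mathbf{w}'(t)\mathbf{R}(t)\mathbf{w}(t) \text{ then } \mathbf{w}^{*\prime}(t)\mathbf{w}^*(t) \le \mathbf{w}'(t)\mathbf{w}(t) \tag{7.5.25}$$
and the converse is also true. That is, the effect of $\mathbf{R}(t)$ is nullified in
the minimization process. Thus,
$$\min_{|\mathbf{u}(t)| \le 1} \{\mathbf{w}'(t)\mathbf{R}(t)\mathbf{w}(t)\} = \min_{|\mathbf{u}(t)| \le 1} \{\mathbf{w}'(t)\mathbf{w}(t)\} = \sum_{j=1}^{r} \min_{w_j(t)} \{w_j^2(t)\} = \sum_{j=1}^{r} \min_{|u_j(t)| \le 1} \{[u_j(t) + q_j^*(t)]^2\}. \tag{7.5.26}$$
A careful examination of (7.5.26) reveals that to minimize the
positive quantity $[u_j(t) + q_j^*(t)]^2$, we must select
$$u_j^*(t) = \begin{cases} -q_j^*(t) & \text{if } |q_j^*(t)| \le 1, \\ +1 & \text{if } q_j^*(t) < -1, \\ -1 & \text{if } q_j^*(t) > +1. \end{cases} \tag{7.5.27}$$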
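First, let us define sat{ } as the saturation function between the input $f_i$ and the output $f_o$ (see Figure 7.36), so that $f_o = \mathrm{sat}\{f_i\}$ means that
$$f_o = \begin{cases} f_i & \text{if } |f_i| \le 1, \\ \mathrm{sgn}\{f_i\} & \text{if } |f_i| > 1. \end{cases} \tag{7.5.28}$$
Figure 7.36 Saturation Function
The sgn function is already defined in Section 7.1.1. Then the relation (7.5.27) can be conveniently written as
$$u_j^*(t) = \begin{cases} -q_j^*(t) & \text{if } |q_j^*(t)| \le 1, \\ -\mathrm{sgn}\{q_j^*(t)\} & \text{if } |q_j^*(t)| > 1, \end{cases} \tag{7.5.29}$$
or more compactly componentwise as
$$u_j^*(t) = -\mathrm{sat}\{q_j^*(t)\}, \tag{7.5.30}$$
or in vector form as
$$u^*(t) = -\mathrm{SAT}\{q^*(t)\} = -\mathrm{SAT}\{R^{-1}(t) B'(t) \lambda^*(t)\} \tag{7.5.31}$$
shown in Figure 7.37.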
Figure 7.37 Energy-Optimal Control
The following notes are in order.
1. The constrained minimum-energy control law (7.5.31) is valid only if $R(t)$ is positive definite.
2. The energy-optimal control law (7.5.31) is described by the saturation (SAT) function, which is different from the signum (SGN) function for time-optimal control and the dead-zone (DEZ) function for fuel-optimal control. The saturation function is a well-defined (determinate) function; hence, the minimum-energy system has no option to be singular.
3. In view of the above, it also follows that the optimal control $u^*(t)$ is a continuous function of time, which again is different from the piecewise-constant functions of time for the time-optimal and fuel-optimal control systems discussed earlier in this chapter.
4. If the minimum-energy system described by the system (7.5.2) and the PI (7.5.3) has no constraint (7.5.4) on the control, then by the results of Chapter 3, we obtain the optimal control $u_u^*(t)$ by using the Hamiltonian (7.5.6) and the condition
$$\frac{\partial H}{\partial u} = 0 \;\rightarrow\; R(t) u_u^*(t) + B'(t) \lambda^*(t) = 0 \;\rightarrow\; u_u^*(t) = -R^{-1}(t) B'(t) \lambda^*(t) = -q^*(t), \tag{7.5.32}$$
where $u_u^*(t)$ refers to the unconstrained control. Comparing the relation (7.5.32) with (7.5.29), we see that
$$u_u^*(t) = -q^*(t) = u^*(t) \quad \text{if } |q^*(t)| \le 1 \tag{7.5.33}$$
where $u^*(t)$ refers to the constrained control. Thus, if $|q^*(t)| \le 1$, the constrained optimal control $u^*(t)$ and the unconstrained optimal control $u_u^*(t)$ are the same.
5. For the constrained energy-optimal control system, using the optimal control (7.5.31), the state and costate system (7.5.7) becomes
$$\dot{x}^*(t) = A x^*(t) - B\,\mathrm{SAT}\{R^{-1} B' \lambda^*(t)\}, \qquad \dot{\lambda}^*(t) = -A' \lambda^*(t). \tag{7.5.34}$$
We notice that this is a set of $2n$ nonlinear differential equations, which in general must be solved by numerical simulation.
• Step 5: Implementation: The energy-optimal control law (7.5.31) can be implemented in an open-loop or a closed-loop configuration. In the open-loop case (Figure 7.38), one must iterate over trial values of the initial costate $\lambda(0)$ until the final condition of driving the state to the origin is satisfied. The closed-loop case shown in Figure 7.39 is therefore more attractive.
Figure 7.38 Open-Loop Implementation of Energy-Optimal Control System
A more general constrained minimum-energy control system is one where the performance measure (7.5.3) contains the additional weighting terms $x'(t)Q(t)x(t)$ and $2x'(t)S(t)u(t)$ [6].
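As a rough numerical illustration of the open-loop iteration described in Step 5, the following sketch integrates a scalar state/costate pair of the form (7.5.34) for one trial value of $\lambda(0)$. It takes the control as the negative saturation of $q^* = r^{-1} b \lambda^*$, which is what minimizing $[u + q^*]^2$ subject to $|u| \le 1$ requires. The function names, the forward-Euler scheme, the step size and the numerical values are all assumptions made for illustration; they are not taken from the text.

```python
import math


def sat(f):
    """Scalar saturation: returns f if |f| <= 1, otherwise sgn(f)."""
    return max(-1.0, min(1.0, f))


def simulate_open_loop(a, b, r, x0, lam0, t_end, dt=1e-3):
    """Forward-Euler integration of x' = a x + b u, lam' = -a lam,
    with the saturated control u = -sat(b lam / r).

    Returns a list of (t, x, lam, u) samples. This is a sketch for one
    trial value lam0; a shooting loop over lam0 is left to the reader.
    """
    x, lam = x0, lam0
    history = [(0.0, x, lam, -sat(b * lam / r))]
    n_steps = int(round(t_end / dt))
    for k in range(n_steps):
        u = -sat(b * lam / r)          # control from the current costate
        x += dt * (a * x + b * u)      # state equation
        lam += dt * (-a * lam)         # costate equation
        history.append(((k + 1) * dt, x, lam, -sat(b * lam / r)))
    return history


if __name__ == "__main__":
    hist = simulate_open_loop(a=-1.0, b=1.0, r=2.0, x0=1.0, lam0=0.5, t_end=3.0)
    t, x, lam, u = hist[-1]
    exact_lam = 0.5 * math.exp(3.0)    # closed-form costate for comparison
    print(f"t={t:.2f}  x={x:.4f}  lam={lam:.4f} (exact {exact_lam:.4f})  u={u:.4f}")
```

Repeating the run for different values of `lam0` and inspecting the final state is one simple way to carry out the trial-and-error search for $\lambda(0)$.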
Figure 7.39 Closed-Loop Implementation of Energy-Optimal Control System
Example 7.1
Consider a simple scalar system
$$\dot{x}(t) = a x(t) + u(t), \quad a < 0 \tag{7.5.35}$$
to be transferred from an arbitrary initial state $x(t = 0) = x_0$ to the origin while minimizing the performance index
$$J = \int_0^{t_f} u^2(t)\,dt \tag{7.5.36}$$
where the final time $t_f$ is free and the control $u(t)$ is constrained as
$$|u(t)| \le 1. \tag{7.5.37}$$
Discuss the resulting optimal control system.
Solution: Comparing the system (7.5.35) and the performance measure (7.5.36) with the general formulations of the corresponding system (7.5.2) and the performance index (7.5.3), we easily see that $A(t) = a$, $B(t) = b = 1$, $R(t) = r = 2$. Then, using the step-by-step procedure in the last section, we get the following.
• Step 1: Form the Hamiltonian (7.5.6) as
$$H(x(t), \lambda(t), u(t)) = \tfrac{1}{2}\, 2u^2(t) + \lambda(t) a x(t) + \lambda(t) u(t). \tag{7.5.38}$$
• Step 2: The state and costate relations (7.5.7) are
$$\dot{x}^*(t) = +\left(\frac{\partial H}{\partial \lambda}\right)_* = a x^*(t) + u^*(t), \qquad \dot{\lambda}^*(t) = -\left(\frac{\partial H}{\partial x}\right)_* = -a \lambda^*(t). \tag{7.5.39}$$
The solution of the costate function $\lambda^*(t)$ is easily seen to be
$$\lambda^*(t) = \lambda(0) e^{-at}. \tag{7.5.40}$$
• Step 3: The optimal control (7.5.30) becomes
$$u^*(t) = -\mathrm{sat}\{q^*(t)\} = -\mathrm{sat}\{r^{-1} b \lambda^*(t)\} = -\mathrm{sat}\{0.5\lambda^*(t)\}. \tag{7.5.41}$$
In other words,
$$u^*(t) = \begin{cases} +1.0 & \text{if } 0.5\lambda^*(t) \le -1 \text{ or } \lambda^*(t) \le -2, \\ -1.0 & \text{if } 0.5\lambda^*(t) \ge +1 \text{ or } \lambda^*(t) \ge +2, \\ -0.5\lambda^*(t) & \text{if } |0.5\lambda^*(t)| \le 1 \text{ or } |\lambda^*(t)| \le +2. \end{cases} \tag{7.5.42}$$
The previous relationship between the optimal control $u^*(t)$ and the optimal costate $\lambda^*(t)$ is shown in Figure 7.40.
We note from (7.5.42) that the condition $u^*(t) = -\tfrac{1}{2}\lambda^*(t)$ is also obtained from the results of unconstrained control using the Hamiltonian (7.5.38) and the condition
$$\frac{\partial H}{\partial u} = 0 \;\rightarrow\; 2u^*(t) + \lambda^*(t) = 0 \;\rightarrow\; u^*(t) = -\tfrac{1}{2}\lambda^*(t). \tag{7.5.43}$$
Consider the costate function $\lambda^*(t)$ in (7.5.40). The condition $\lambda^*(0) = 0$ is not admissible because then, according to (7.5.41), $u^*(t) = 0$ for $t \in [0, t_f]$, and the state $x^*(t) = x(0) e^{at}$ in (7.5.39) will never reach the origin in time $t_f$ for an arbitrarily given initial state $x(0)$.
Figure 7.40 Relation between Optimal Control $u^*(t)$ vs (a) $q^*(t)$ and (b) $0.5\lambda^*(t)$
Then, the costate $\lambda^*(t) = \lambda(0) e^{-at}$ has four possible solutions depending upon the initial values ($0 < \lambda(0) < 2$, $\lambda(0) > 2$, $-2 < \lambda(0) < 0$, $\lambda(0) < -2$) as shown in Figure 7.41.
1. $0 < \lambda(0) < 2$: For this case, Figure 7.41, curve (a),
$$u^*(t) = \{-\tfrac{1}{2}\lambda^*(t)\} \text{ or } \{-\tfrac{1}{2}\lambda^*(t),\, -1\} \tag{7.5.44}$$
depending upon whether the system reaches the origin before or after the time $t_a$ at which the function $\lambda^*(t)$ reaches the value of $+2$.
2. $\lambda(0) > 2$: In this case, Figure 7.41, curve (b), since $\lambda^*(t) > +2$, the optimal control is $u^*(t) = \{-1\}$.
3. $-2 < \lambda(0) < 0$: Depending on whether the state reaches the origin before or after the time $t_c$ at which the function $\lambda^*(t)$ reaches the value $-2$, the optimal control is (Figure 7.41, curve (c))
$$u^*(t) = \{-\tfrac{1}{2}\lambda^*(t)\} \text{ or } \{-\tfrac{1}{2}\lambda^*(t),\, +1\}. \tag{7.5.45}$$
4. $\lambda(0) < -2$: Here, Figure 7.41, curve (d), since $\lambda^*(t) < -2$, the optimal control is $u^*(t) = \{+1\}$.
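The four cases above can be checked numerically. The sketch below assumes the costate solution $\lambda^*(t) = \lambda(0)e^{-at}$ with $a < 0$ and the control $u^* = -\mathrm{sat}\{0.5\lambda^*\}$; setting $|\lambda(0)|e^{-at} = 2$ then gives the saturation instant $t = \ln(2/|\lambda(0)|)/(-a)$, which plays the role of $t_a$ or $t_c$. The function names are hypothetical.

```python
import math


def costate(lam0, a, t):
    """Costate lam(t) = lam0 * exp(-a t) of the scalar example."""
    return lam0 * math.exp(-a * t)


def control_from_costate(lam):
    """Control u = -sat(0.5 * lam), with sat clipping to [-1, 1]."""
    return -max(-1.0, min(1.0, 0.5 * lam))


def saturation_time(lam0, a):
    """Time at which |lam(t)| first reaches 2, assuming a < 0.

    Returns 0.0 if the costate starts saturated (|lam0| >= 2) and
    None if lam0 == 0 (the costate then stays at zero).
    """
    if lam0 == 0.0:
        return None
    if abs(lam0) >= 2.0:
        return 0.0
    return math.log(2.0 / abs(lam0)) / (-a)


if __name__ == "__main__":
    a = -1.0
    for lam0 in (1.0, 3.0, -1.0, -3.0):   # cases (a), (b), (c), (d)
        ts = saturation_time(lam0, a)
        u_after = control_from_costate(costate(lam0, a, ts + 1.0))
        print(f"lam0={lam0:+.1f}  saturation time={ts:.4f}  u afterwards={u_after:+.1f}")
```

Whether the control actually saturates in cases (a) and (c) depends on whether the state reaches the origin before this instant, as noted in the text.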
The previous discussion refers to the open-loop implementation, in the sense that the control depends upon the values of the costate variable $\lambda^*(t)$. However, in this scalar case, it may be possible to obtain a closed-loop implementation.
• Step 4: Closed-Loop Implementation: In this scalar case, it may be easy to get a closed-loop optimal control. First, let us note that if the final time $t_f$ is free and the Hamiltonian (7.5.38) does not contain time $t$ explicitly, then we know that
$$H(x^*(t), \lambda^*(t), u^*(t)) = 0 \quad \forall\, t \in [0, t_f] \tag{7.5.46}$$
which means that
$$u^{*2}(t) + \lambda^*(t)[a x^*(t) + u^*(t)] = 0. \tag{7.5.47}$$
Solving for the optimal state,
$$x^*(t) = -\frac{u^*(t)}{a}\left[\frac{u^*(t)}{\lambda^*(t)} + 1\right]. \tag{7.5.48}$$
Figure 7.41 Possible Solutions of Optimal Costate $\lambda^*(t)$
Let us now discuss two situations.
1. Saturated Region: (i) At time $t = t_a$ (Figure 7.41(a)), $\lambda^*(t_a) = 2$ and $u^*(t_a) = -1$, so the optimal state (7.5.48) becomes
$$x^*(t_a) = \frac{1}{2a}, \text{ and since } a < 0, \; x^*(t_a) < 0. \tag{7.5.49}$$
Next, for time $t \in [t_a, t_f]$, $u^*(t) = -1$ and $\lambda^*(t) > 2$, and the relation (7.5.48) reveals that $x^*(t) < x^*(t_a)$. Combining this with (7.5.49), we have
$$x^*(t) < x^*(t_a) < 0. \tag{7.5.50}$$
(ii) At time $t = t_c$ (Figure 7.41(c)), $\lambda^*(t_c) = -2$ and $u^*(t_c) = +1$, so the optimal state (7.5.48) becomes
$$x^*(t_c) = -\frac{1}{2a}, \text{ and since } a < 0, \; x^*(t_c) > 0. \tag{7.5.51}$$
Next, for time $t \in [t_c, t_f]$, $u^*(t) = +1$ and $\lambda^*(t) < -2$, and the relation (7.5.48) reveals that $x^*(t) > x^*(t_c)$. Combining this with (7.5.51), we have
$$x^*(t) > x^*(t_c) > 0. \tag{7.5.52}$$
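2. Unsaturated Region: During the unsaturated region,
$$|0.5\lambda^*(t)| \le 1 \quad \text{and} \quad u^*(t) = -\tfrac{1}{2}\lambda^*(t), \tag{7.5.53}$$
and using this, the Hamiltonian condition (7.5.47) becomes
$$u^{*2}(t) + \lambda^*(t)[a x^*(t) + u^*(t)] = 0$$
$$\frac{\lambda^{*2}(t)}{4} + a \lambda^*(t) x^*(t) - \frac{\lambda^{*2}(t)}{2} = 0$$
$$\lambda^*(t)\left[\frac{\lambda^*(t)}{4} - a x^*(t)\right] = 0 \tag{7.5.54}$$
the solution of which becomes
$$\lambda^*(t) = 0 \quad \text{or} \quad \lambda^*(t) = 4a x^*(t). \tag{7.5.55}$$
Here, $\lambda^*(t) = 0$ is not admissible because then the optimal control (7.5.44) becomes zero. For $\lambda^*(t) = 4a x^*(t)$, the optimal control (7.5.44) becomes
$$u^*(t) = -2a x^*(t), \quad a < 0. \tag{7.5.56}$$
Since $a < 0$, the previous relation also means that
$$\text{if } x^*(t) > 0, \text{ then } u^*(t) > 0; \quad \text{if } x^*(t) < 0, \text{ then } u^*(t) < 0; \quad \text{if } x^*(t) = 0, \text{ then } u^*(t) = 0. \tag{7.5.57}$$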
Control Law: Combining the previous relations for the unsaturated region and for the saturated region, we finally get the control law for the entire region as
$$u^*(t) = \begin{cases} -1, & \text{if } x^*(t) < \frac{1}{2a} < 0, \\ +1, & \text{if } x^*(t) > -\frac{1}{2a} > 0, \\ -2a x^*(t), & \text{if } 0 < x^*(t) \le -\frac{1}{2a}, \\ -2a x^*(t), & \text{if } \frac{1}{2a} \le x^*(t) < 0, \\ 0, & \text{if } x^*(t) = 0 \end{cases} \tag{7.5.58}$$
and the implementation of the energy-optimal control law is shown in Figure 7.42.
Further, for a combination of time-optimal and fuel-optimal control systems and other related problems with control constraints, see the excellent texts [6, 116].
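The state-feedback form of the scalar example can be written compactly as a clipped linear law. The sketch below assumes the unsaturated control $u^* = -2ax^*$ with $a < 0$, clipped to $\pm 1$ beyond $|x^*| = -1/(2a)$, and checks that the Hamiltonian $u^2 + \lambda(ax + u)$ vanishes on the unsaturated segment when $\lambda = 4ax$. The function names are hypothetical, and the check covers only the unsaturated region and its two end points.

```python
def closed_loop_control(x, a):
    """State-feedback law for the scalar example (a < 0):
    u = -2 a x inside the linear band, clipped to +/-1 outside it."""
    if a >= 0:
        raise ValueError("the example assumes a < 0")
    return max(-1.0, min(1.0, -2.0 * a * x))


def hamiltonian(x, lam, u, a):
    """Hamiltonian of the example: H = u^2 + lam * (a x + u)."""
    return u * u + lam * (a * x + u)


if __name__ == "__main__":
    a = -1.0
    for x in (-0.5, -0.2, 0.0, 0.1, 0.5):
        u = closed_loop_control(x, a)
        lam = 4.0 * a * x            # costate on the unsaturated segment
        print(f"x={x:+.2f}  u={u:+.2f}  H={hamiltonian(x, lam, u, a):+.2e}")
```

With `a = -1.0` the linear band is $|x| \le 0.5$, so the printed Hamiltonian values should be zero up to rounding.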
Figure 7.42 Implementation of Energy-Optimal Control Law
7.6 Optimal Control Systems with State
Constraints
In the previous sections, we discussed optimal control systems with control constraints. In this section, we address optimal control systems with state constraints [79, 120].
Optimal control systems with state constraints (constrained optimal control) have been of great interest to engineers. Some examples of state-constrained problems are the solution of the minimum time-to-climb problem for an aircraft that is required to stay within a specified flight envelope, the determination of the best control policy for an industrial mechanical robot subject to path constraints, and the speed of an electric motor which cannot exceed a certain value without damaging some of the mechanical components such as bearings and shaft. Several methods have been proposed to handle state variable inequality constraints. In general, there are three methods for handling these systems [49]:
1. slack variables,
2. penalty functions, and
3. interior-point constraints.
Let us first consider the penalty function method.
7.6.1 Penalty Function Method
Let us consider the system as
$$\dot{x}(t) = f(x(t), u(t), t) \tag{7.6.1}$$
and the performance index as
$$J = \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{7.6.2}$$
where $x(t)$ and $u(t)$ are $n$- and $r$-dimensional state and control vectors, respectively. Let the inequality constraints on the states be expressed as
$$g(x(t), t) \ge 0 \tag{7.6.3}$$
or
$$g_1(x_1(t), x_2(t), \ldots, x_n(t), t) \ge 0$$
$$g_2(x_1(t), x_2(t), \ldots, x_n(t), t) \ge 0$$
$$\vdots$$
$$g_p(x_1(t), x_2(t), \ldots, x_n(t), t) \ge 0 \tag{7.6.4}$$
where $g$ is a $p \le n$ vector function of the states, assumed to have continuous first and second partial derivatives with respect to the state $x(t)$. There are several methods of solving this system in which the inequality constraints (7.6.3) are converted to equality constraints. One such methodology is described below. Let us define a new variable $x_{n+1}(t)$ by
$$\dot{x}_{n+1}(t) \triangleq f_{n+1}(x(t), t) = [g_1(x(t), t)]^2 H(g_1) + [g_2(x(t), t)]^2 H(g_2) + \cdots + [g_p(x(t), t)]^2 H(g_p), \tag{7.6.5}$$
where $H(g_i)$ is a unit Heaviside step function defined by
$$H(g_i) = \begin{cases} 0, & \text{if } g_i(x(t), t) \ge 0, \\ 1, & \text{if } g_i(x(t), t) < 0, \end{cases} \tag{7.6.6}$$
for $i = 1, 2, \ldots, p$. The relations (7.6.6) and (7.6.5) mean that $\dot{x}_{n+1}(t) = 0$ for all $t$ when the constraint relation (7.6.3) is satisfied, and $\dot{x}_{n+1}(t) \ge 0$ for all $t$ due to the square terms in (7.6.5). Further, let us require that the new variable $x_{n+1}(t)$ has the boundary conditions
$$x_{n+1}(t_0) = 0, \qquad x_{n+1}(t_f) = 0 \tag{7.6.7}$$
such that
$$x_{n+1}(t) = \int_{t_0}^{t} \dot{x}_{n+1}(t)\,dt = \int_{t_0}^{t} \left\{ [g_1(x(t), t)]^2 H(g_1) + [g_2(x(t), t)]^2 H(g_2) + \cdots + [g_p(x(t), t)]^2 H(g_p) \right\} dt. \tag{7.6.8}$$
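To make the construction of the extra state concrete, the sketch below evaluates the right-hand side $f_{n+1}(x, t)$ of (7.6.5) for a list of constraint functions $g_i(x, t) \ge 0$, using the step function of (7.6.6). The function names and the sample bound $|x_2| \le 3$ are assumptions chosen for illustration.

```python
def heaviside_violation(g):
    """Step function H(g) of (7.6.6): 1 when the constraint value g is
    negative (constraint violated), 0 otherwise."""
    return 1.0 if g < 0.0 else 0.0


def penalty_rate(x, t, constraints):
    """Right-hand side f_{n+1}(x, t) of (7.6.5): the sum of squared
    constraint values over the violated constraints only."""
    total = 0.0
    for g in constraints:
        gi = g(x, t)
        total += gi * gi * heaviside_violation(gi)
    return total


# Hypothetical example: the bound |x2| <= 3 written as two constraints g_i >= 0.
constraints = [
    lambda x, t: x[1] + 3.0,   # g1 = x2 + 3 >= 0
    lambda x, t: 3.0 - x[1],   # g2 = 3 - x2 >= 0
]

if __name__ == "__main__":
    for x2 in (2.0, 4.0, -5.0):
        print(x2, penalty_rate([0.0, x2], 0.0, constraints))
```

Because the rate is never negative, requiring $x_{n+1}(t_0) = x_{n+1}(t_f) = 0$ forces it to be zero along the whole trajectory, which is how the boundary conditions enforce the constraint.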
Now we use the Hamiltonian approach to minimize the PI (7.6.2) subject to the system equation (7.6.1) and the state inequality constraint (7.6.3). Let us define the Hamiltonian as
$$\mathcal{H}(x(t), u(t), \lambda(t), \lambda_{n+1}(t), t) = V(x(t), u(t), t) + \lambda'(t) f(x(t), u(t), t) + \lambda_{n+1}(t) \left\{ [g_1(x(t), t)]^2 H(g_1) + [g_2(x(t), t)]^2 H(g_2) + \cdots + [g_p(x(t), t)]^2 H(g_p) \right\}$$
$$= V(x(t), u(t), t) + \lambda'(t) f(x(t), u(t), t) + \lambda_{n+1}(t) f_{n+1}(x(t), t). \tag{7.6.9}$$
Thus, the previous Hamiltonian is formed with $n + 1$ costates and $n + 1$ states. Note that the Hamiltonian (7.6.9) does not explicitly contain the new state variable $x_{n+1}(t)$. Now, we apply the necessary optimality conditions for the state as
$$\dot{x}^*(t) = \frac{\partial \mathcal{H}}{\partial \lambda} = f(x^*(t), u^*(t), t), \qquad \dot{x}_{n+1}^*(t) = \frac{\partial \mathcal{H}}{\partial \lambda_{n+1}} = f_{n+1}(x^*(t), t), \tag{7.6.10}$$
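for the costate as
$$\dot{\lambda}^*(t) = -\frac{\partial \mathcal{H}}{\partial x}, \qquad \dot{\lambda}_{n+1}^*(t) = -\frac{\partial \mathcal{H}}{\partial x_{n+1}} = 0, \tag{7.6.11}$$
and for the control as
$$\mathcal{H}(x^*(t), u^*(t), \lambda^*(t), \lambda_{n+1}^*(t), t) \le \mathcal{H}(x^*(t), u(t), \lambda^*(t), \lambda_{n+1}^*(t), t) \tag{7.6.12}$$
or
$$\min_{|u(t)| \le U} \left\{ \mathcal{H}(x^*(t), u(t), \lambda^*(t), \lambda_{n+1}^*(t), t) \right\} = \mathcal{H}(x^*(t), u^*(t), \lambda^*(t), \lambda_{n+1}^*(t), t). \tag{7.6.13}$$
Note that in the above, $\dot{\lambda}_{n+1}^*(t) = 0$ because the Hamiltonian (7.6.9) does not contain $x_{n+1}(t)$ explicitly (see Table 7.1).
Let us now illustrate the previous method by an example.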
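Example 7.2
Consider a second order system
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = u(t), \tag{7.6.14}$$
and the performance index
$$J = \frac{1}{2} \int_{t_0}^{t_f} \left[ x_1^2(t) + u^2(t) \right] dt \tag{7.6.15}$$
where the final time $t_f$ is free and the final state $x(t_f)$ is free. The control $u(t)$ is constrained as
$$-1 \le u(t) \le +1 \quad \text{or} \quad |u(t)| \le +1 \quad \text{for } t \in [t_0, t_f], \tag{7.6.16}$$
and the state $x_2(t)$ is constrained as
$$-3 \le x_2(t) \le +3 \quad \text{or} \quad |x_2(t)| \le +3 \quad \text{for } t \in [t_0, t_f]. \tag{7.6.17}$$
Find the optimal control.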
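Table 7.1 Procedure Summary of Optimal Control Systems with State Constraints

A. Statement of the Problem
Given the system as
$$\dot{x}(t) = f(x(t), u(t), t),$$
the performance index as
$$J = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt,$$
the state constraints as
$$g(x(t), t) \ge 0,$$
and the boundary conditions as $x(t_0) = x_0$, with $t_f$ and $x(t_f) = x_f$ free, find the optimal control.

B. Solution of the Problem
Step 1: Form the Pontryagin $\mathcal{H}$ function
$$\mathcal{H}(x(t), u(t), \lambda(t), \lambda_{n+1}(t), t) = V(x(t), u(t), t) + \lambda'(t) f(x(t), u(t), t) + \lambda_{n+1}(t) f_{n+1}(x(t), t).$$
Step 2: Solve the set of $2n + 2$ differential equations
$$\dot{x}^*(t) = \left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_*, \qquad \dot{x}_{n+1}^*(t) = \left(\frac{\partial \mathcal{H}}{\partial \lambda_{n+1}}\right)_*$$
and
$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_*, \qquad \dot{\lambda}_{n+1}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x_{n+1}}\right)_* = 0$$
with boundary conditions $x(t_0) = x_0$, $x_{n+1}(t_0) = 0$, $x_{n+1}(t_f) = 0$ and
$$\left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{*t_f} \delta t_f + \left[\frac{\partial S}{\partial x} - \lambda\right]'_{*t_f} \delta x_f = 0.$$
Step 3: Minimize $\mathcal{H}$ w.r.t. $u(t)$ $(\le U)$
$$\mathcal{H}(x^*(t), u^*(t), \lambda^*(t), \lambda_{n+1}^*(t), t) \le \mathcal{H}(x^*(t), u(t), \lambda^*(t), \lambda_{n+1}^*(t), t).$$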
Solution: To express the state constraint (7.6.17) as the state inequality constraint (7.6.3), let us first note
$$[x_2(t) + 3] \ge 0, \quad \text{and} \quad [3 - x_2(t)] \ge 0 \tag{7.6.18}$$
and then
$$g_1(x(t)) = [x_2(t) + 3] \ge 0, \tag{7.6.19}$$
$$g_2(x(t)) = [3 - x_2(t)] \ge 0. \tag{7.6.20}$$
• Step 1: First formulate the Hamiltonian as
$$\mathcal{H}(x(t), u(t), \lambda(t), \lambda_3(t)) = \tfrac{1}{2} x_1^2(t) + \tfrac{1}{2} u^2(t) + \lambda_1(t) x_2(t) + \lambda_2(t) u(t) + \lambda_3(t) \left\{ [x_2(t) + 3]^2 H(x_2(t) + 3) + [3 - x_2(t)]^2 H(3 - x_2(t)) \right\}. \tag{7.6.21}$$
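• Step 2: The necessary condition for the state (7.6.10) becomes
$$\dot{x}_1^*(t) = x_2^*(t),$$
$$\dot{x}_2^*(t) = u^*(t),$$
$$\dot{x}_3^*(t) = [x_2^*(t) + 3]^2 H(x_2^*(t) + 3) + [3 - x_2^*(t)]^2 H(3 - x_2^*(t)), \tag{7.6.22}$$
and for the costate (7.6.11)
$$\dot{\lambda}_1^*(t) = -\frac{\partial \mathcal{H}}{\partial x_1} = -x_1^*(t),$$
$$\dot{\lambda}_2^*(t) = -\frac{\partial \mathcal{H}}{\partial x_2} = -\lambda_1^*(t) - 2\lambda_3^*(t)[x_2^*(t) + 3] H(x_2^*(t) + 3) + 2\lambda_3^*(t)[3 - x_2^*(t)] H(3 - x_2^*(t)),$$
$$\dot{\lambda}_3^*(t) = -\frac{\partial \mathcal{H}}{\partial x_3} = 0 \;\rightarrow\; \lambda_3^*(t) = \text{constant}. \tag{7.6.23}$$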
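• Step 3: Minimize $\mathcal{H}$ w.r.t. the control (7.6.13)
$$\mathcal{H}(x^*(t), u^*(t), \lambda^*(t), \lambda_3^*(t)) \le \mathcal{H}(x^*(t), u(t), \lambda^*(t), \lambda_3^*(t)). \tag{7.6.24}$$
Using (7.6.21) in the condition (7.6.24) and taking out the terms not containing the control $u(t)$ explicitly, we get
$$\tfrac{1}{2} u^{*2}(t) + \lambda_2^*(t) u^*(t) \le \tfrac{1}{2} u^2(t) + \lambda_2^*(t) u(t), \quad \text{that is,} \quad \tfrac{1}{2} u^{*2}(t) + \lambda_2^*(t) u^*(t) = \min_{|u(t)| \le 1} \left\{ \tfrac{1}{2} u^2(t) + \lambda_2^*(t) u(t) \right\}. \tag{7.6.25}$$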
By simple calculus, we see that the expression $\tfrac{1}{2} u^2(t) + \lambda_2^*(t) u(t)$ will attain the optimum value for
$$u^*(t) = -\lambda_2^*(t) \tag{7.6.26}$$
when the control $u^*(t)$ is unconstrained. This can also be seen alternatively by using the relation
$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\rightarrow\; u^*(t) + \lambda_2^*(t) = 0 \;\rightarrow\; u^*(t) = -\lambda_2^*(t). \tag{7.6.27}$$
But, for the present constrained control situation (7.6.16), we see from (7.6.25) or (7.6.26) that
$$u^*(t) = \begin{cases} -1, & \text{if } \lambda_2^*(t) > +1, \\ +1, & \text{if } \lambda_2^*(t) < -1. \end{cases} \tag{7.6.28}$$
Combining the unsaturated or unconstrained control (7.6.26) with the saturated or constrained control (7.6.28), we have
$$u^*(t) = \begin{cases} +1, & \text{if } \lambda_2^*(t) < -1, \\ -1, & \text{if } \lambda_2^*(t) > +1, \\ -\lambda_2^*(t), & \text{if } -1 \le \lambda_2^*(t) \le +1. \end{cases} \tag{7.6.29}$$
Using the definition of the saturation function (7.5.28), the previous optimal control strategy can be written as
$$u^*(t) = -\mathrm{sat}\{\lambda_2^*(t)\}. \tag{7.6.30}$$
The situation is shown in Figure 7.43. Thus, one has to solve for the costate function $\lambda_2^*(t)$ completely to find the optimal control $u^*(t)$ from (7.6.29) and so obtain an open-loop optimal control implementation.
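The pointwise minimization in Step 3 can be checked by brute force. The sketch below compares a grid search for the minimizer of $\tfrac{1}{2}u^2 + \lambda_2 u$ over $|u| \le 1$ with the closed-form candidate $u = -\mathrm{sat}\{\lambda_2\}$. The function names and the grid resolution are assumptions for illustration only.

```python
def sat(f):
    """Scalar saturation: returns f if |f| <= 1, otherwise sgn(f)."""
    return max(-1.0, min(1.0, f))


def pointwise_cost(u, lam2):
    """Control-dependent part of the Hamiltonian: 0.5 u^2 + lam2 u."""
    return 0.5 * u * u + lam2 * u


def brute_force_minimizer(lam2, n=2001):
    """Grid search for the minimizer of pointwise_cost over -1 <= u <= 1."""
    grid = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
    return min(grid, key=lambda u: pointwise_cost(u, lam2))


if __name__ == "__main__":
    for lam2 in (-3.0, -1.0, -0.4, 0.0, 0.4, 1.0, 3.0):
        print(f"lam2={lam2:+.1f}  grid={brute_force_minimizer(lam2):+.3f}  "
              f"-sat={-sat(lam2):+.3f}")
```

The two columns should agree to within the grid spacing, which illustrates why the constrained minimizer is the clipped version of the unconstrained one for this particular quadratic cost.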
Note: In obtaining the optimal control strategy in general, one cannot obtain the unconstrained or unsaturated control first and then just extend the same for the constrained or saturated region. Instead, one has to really use the Hamiltonian relation (7.6.13) to obtain the optimal control. Although, in this chapter, we considered control constraints and state constraints separately, we can combine both of them and have a situation with constraints as
$$g(x(t), u(t), t) \le 0. \tag{7.6.31}$$
For further details, see the recent book [61] and the survey article [62].
Figure 7.43 Relation between Optimal Control $u^*(t)$ and Optimal Costate $\lambda_2^*(t)$
7.6.2 Slack Variable Method
The slack variable approach [68, 137], often known as Valentine's method, transforms the given inequality state (path) constraint into an equality state (path) constraint by introducing a slack variable. For the sake of completeness, let us restate the state constraint problem.
Consider the optimal control system
$$\dot{x}(t) = f(x(t), u(t), t), \quad x(t = t_0) = x_0 \tag{7.6.32}$$
which minimizes the performance index
$$J = F(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{7.6.33}$$
subject to the state-variable inequality constraint
$$S(x(t), t) \le 0. \tag{7.6.34}$$
Here, $x(t)$ is an $n$-dimensional state vector, $u(t)$ is an $r$-dimensional control vector, and the constraint $S$ is of $p$th order in the sense that the $p$th derivative of $S$ contains the control $u(t)$ explicitly.
The state-constrained optimal control problem is solved by converting the given inequality-constrained problem into an equality-constrained one by introducing a "slack variable" $\alpha(t)$ as [68, 137]
$$S(x(t), t) + \tfrac{1}{2} \alpha^2(t) = 0. \tag{7.6.35}$$
Differentiating (7.6.35) $p$ times with respect to time $t$, we obtain
$$S_1(x(t), t) + \alpha \alpha_1 = 0$$
$$S_2(x(t), t) + \alpha_1^2 + \alpha \alpha_2 = 0$$
$$\vdots$$
$$S_p(x(t), u(t), t) + \{\text{terms involving } \alpha(t), \alpha_1(t), \ldots, \alpha_p(t)\} = 0 \tag{7.6.36}$$
where the subscripts on $S$ and $\alpha$ denote the time derivatives, that is,
$$S_1 = \frac{dS}{dt} = \left(\frac{\partial S}{\partial x}\right) \left(\frac{dx}{dt}\right) + \frac{\partial S}{\partial t} \quad \text{and} \quad \alpha_1 = \frac{d\alpha}{dt}. \tag{7.6.37}$$
Since the control $u(t)$ is explicitly present in the $p$th derivative equation, we can solve for the control to obtain
$$u(t) = g(x(t), \alpha(t), \alpha_1(t), \ldots, \alpha_p(t), t). \tag{7.6.38}$$
Substituting the control (7.6.38) in the plant (7.6.32) and treating the various $\alpha, \ldots, \alpha_{p-1}$ as additional state variables, the new unconstrained control becomes $\alpha_p$. Thus,
$$\dot{x}(t) = f(x(t), g(x(t), \alpha(t), \alpha_1(t), \ldots, \alpha_p(t), t), t), \quad x(t = t_0) = x_0$$
$$\dot{\alpha} = \alpha_1, \quad \alpha(t = t_0) = \alpha(t_0)$$
$$\dot{\alpha}_1 = \alpha_2, \quad \alpha_1(t = t_0) = \alpha_1(t_0)$$
$$\vdots$$
$$\dot{\alpha}_{p-1} = \alpha_p, \quad \alpha_{p-1}(t = t_0) = \alpha_{p-1}(t_0). \tag{7.6.39}$$
i
t!
J = F(x(t f), t f) + V(x(t), g(x(t), a(t), a) (t), , ... , ap(t), t), t)dt
to
(7.6.40)
The new initial conditions aCto), ... , a
p
l(to) are required to satisfy
(7.6.35) and (7.6.36), so after some algebraic manipulations, we get
aCto) = ±J 28(x(to), to)
al(to) = 8
1
(x(to), to)/a(to)
a2(tO) = [8
2
(x(to), to) + at(to)]/a(to)
(7.6.41)
360 Chapter 7: Constrained Optimal Control Systems
With this choice of boundary conditions, the original relations w.r.t. the constraints (7.6.35) and (7.6.36) are satisfied for all $t$ for any control function $\alpha_p(\cdot)$. In other words, any function $\alpha_p(\cdot)$ will produce an admissible trajectory. Thus, the original constrained problem is transformed into an unconstrained problem.
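The claim that any slack-derivative function yields an admissible trajectory can be illustrated on a deliberately simple, hypothetical case: a scalar integrator $\dot{x}_2 = u$ with the first-order constraint $S = x_2 - 3 \le 0$. With the slack relation $S + \tfrac{1}{2}\alpha^2 = 0$, one differentiation gives $u = -\alpha\alpha_1$, and $\alpha_1$ becomes the free control. The plant, the bound, the function names and the test signal below are assumptions made for this sketch and do not come from the text.

```python
import math

X2_MAX = 3.0  # hypothetical upper bound on the state


def S(x2):
    """Constraint function S(x) = x2 - X2_MAX, required to satisfy S <= 0."""
    return x2 - X2_MAX


def initial_slack(x2_0):
    """alpha(t0) = +sqrt(-2 S(x(t0))), the positive root of S + alpha^2/2 = 0.
    Requires an admissible initial state (x2_0 <= X2_MAX)."""
    return math.sqrt(-2.0 * S(x2_0))


def simulate(x2_0, alpha1_of_t, t_end, dt=1e-2):
    """Propagate (x2, alpha) with alpha1 held constant over each step.

    alpha' = alpha1 and x2' = u = -alpha * alpha1, so over one step
    x2 changes by exactly -(alpha_next^2 - alpha^2) / 2.
    """
    alpha = initial_slack(x2_0)
    x2 = x2_0
    traj = [(0.0, x2, alpha)]
    n_steps = int(round(t_end / dt))
    for k in range(n_steps):
        a1 = alpha1_of_t(k * dt)                 # free (unconstrained) control
        alpha_next = alpha + dt * a1
        x2 += -0.5 * (alpha_next ** 2 - alpha ** 2)
        alpha = alpha_next
        traj.append(((k + 1) * dt, x2, alpha))
    return traj


if __name__ == "__main__":
    traj = simulate(0.0, lambda t: 5.0 * math.cos(3.0 * t), t_end=10.0)
    print("largest x2 reached:", max(x2 for (_, x2, _) in traj))
```

Whatever signal is supplied for `alpha1_of_t`, the state stays at or below the bound, because $x_2 = 3 - \tfrac{1}{2}\alpha^2$ holds along the whole trajectory.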
Now we apply the Pontryagin Principle to this unconstrained problem [68, 27, 40, 62]. In general terms, we define a new $(n + p)$-dimensional state vector
$$Z(t) = [x(t), \alpha(t), \ldots, \alpha_{p-1}(t)]'; \tag{7.6.42}$$
then the new plant (7.6.39) becomes
$$\dot{Z}(t) = F(Z(t), \alpha_p(t), t) \tag{7.6.43}$$
where the $(n + p)$-dimensional vector function $F$ represents the right-hand side of (7.6.39). Next, we define the Hamiltonian as
$$\mathcal{H} = V + \lambda' F \tag{7.6.44}$$
where $\lambda$ is an $(n + p)$-dimensional Lagrange multiplier. Then, for the state
$$\dot{Z}(t) = \mathcal{H}_\lambda, \quad Z(t_0), \tag{7.6.45}$$
for the costate
(7.6.46)
and for the control
(7.6.47)
where the subscripts in $\mathcal{H}_\lambda$, $\mathcal{H}_Z$, and $F_x$ denote the partial derivative with respect to the subscripted variable.
The previous set of equations for the states (7.6.45) and costates (7.6.46), together with their initial and final conditions, constitutes a two-point boundary value problem (TPBVP). Depending on its difficulty, such a problem may admit a closed-form solution; highly nonlinear problems must be solved with specialized software [68, 129, 130, 63].
7.7 Problems
1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and
plot all the optimal controls and states for all problems. Provide
the relevant MATLAB© m files.
Problem 7.1 Derive the expressions for minimum time given by (7.2.28)
for a double integral system.
Problem 7.2 A second order system, described by
Xl(t) = X2(t)
X2(t) = 2X2(t) + Xl(t) + u(t)
where, the initial and final states are specified, is to minimize the per
formance index
J = ~ l [2xM + xM + u
2
(t)] dt.
Find the optimal control u* (t) for
(a) u(t) unconstrained, and
(b) u(t) constrained as lu(t)1 :::; 1.
Problem 7.3 Find the optimal control law for transferring the second
order linear system
Xl(t) = X2(t)
X2(t) = u(t)
where (a) the control $u(t)$ is unconstrained and (b) the control is constrained as $|u(t)| \le 1$, from any arbitrary initial state to the final state [2,2] in minimum time.
Problem 7.4 For the second order, linear system
Xl(t) = Xl(t)  u(t)
X2(t) = 3X2(t)  2u(t)
to be transferred from any arbitrary initial state to the origin in minimum time, find the optimal control law if the control $u(t)$ is (a) unconstrained and (b) constrained as $|u(t)| \le 1$.
Problem 7.5 Given a second order linear system
Xl(t) = Xl(t)  u(t)
X2(t) = 3X2(t)  2u(t), lu(t)1 :S 1
find the expression for minimum time to transfer the above system from
any initial state to the origin.
Problem 7.6 For a first order system
x(t) = u(t), lu(t)1 :S 1
find the optimal control law to minimize the performance index
$$J = \int_0^{t_f} |u(t)|\,dt, \quad t_f \text{ is free},$$
so that the system is driven from $x(0) = x_0$ to the origin.
Problem 7.7 Formulate and solve Problem 7.4 as fueloptimal control
problem.
Problem 7.8 A second order system
Xl(t) = X2(t)
X2(t) = ax2(t) + u(t), a > 0
with control constraint $|u(t)| \le 1$, discuss the optimal control strategy to transfer the system to the origin and at the same time minimize the performance index
$$J = \int_0^{t_f} [\beta + |u(t)|]\,dt$$
where the final time $t_f$ is free and $\beta > 0$.
Problem 7.9 For a double integral plant
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = u(t)$$
with control constraint $|u(t)| \le 1$, find the optimal control which transfers the plant from the initial condition $x_1(0) = 1$, $x_2(0) = 1$ to the final condition $x_1(t_f) = x_2(t_f) = 0$ in such a way so as to minimize the performance measure
and calculate the minimum value $J^*$.
Problem 7.10 For a secondorder system
XI(t) = X2(t)
X2(t) = 2X2(t) + 3X2(t) + 5u(t)
with control constraint $|u(t)| \le 1$, find the optimal control which transfers the plant from the initial condition $x_1(0) = 1$, $x_2(0) = 1$ to the final condition $x_1(t_f) = x_2(t_f) = 0$ in such a way so as to minimize the performance measure
and calculate the minimum value $J^*$.
Problem 7.11 The double integral plant
XI(t) = X2(t)
X2(t) = u(t)
is to be transferred from any state to the origin in minimum time with the state and control constraints $|u(t)| \le 1$ and $|x_2(t)| \le 2$. Determine the optimal control law.
Problem 7.12 For the liquidlevel control system described in Prob
lem 1.2, formulate the timeoptimal control problem and find the opti
mal control law.
Problem 7.13 For the D.C. motor speed control system described
in Problem 1.1, formulate the minimumenergy problem and find the
optimal control law if the control input is constrained as u( t) :::; 1.
Problem 7.14 For the mechanical control system described in Prob
lem 1.4, formulate the minimumenergy problem and find the optimal
control.
Problem 7.15 For the automobile suspension system described in
Problem 1.5, formulate the minimumenergy problem and find the op
timal control.
Problem 7.16 For the chemical control system described in Prob
lem 1.6, formulate the minimumenergy problem and find the optimal
control.
Appendix A
Vectors and Matrices
The main purpose of this appendix is to provide a brief summary of
the results on matrices, vectors and matrix algebra to serve as a review
of these topics rather than any in depth treatment of the topics. For
more details on this subject, the reader is referred to [54, 10, 13].
A.1 Vectors
Vector
A vector $x$, generally considered as a column vector, is an arrangement of $n$ elements in a column as
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}. \tag{A.1.1}$$
The number $n$ is also referred to as the order, size or dimension of the vector. We can also write the vector $x$ as
$$x = [x_1 \; x_2 \; \cdots \; x_n]' \tag{A.1.2}$$
where $'$ denotes the transpose as defined below.
366 Appendix A: Vectors and Matrices
Transpose of a Vector
The transpose of a vector $x$ is the interchange of the column vector into a row vector. Thus
$$x' = [x_1 \; x_2 \; \cdots \; x_n]. \tag{A.1.3}$$
Norm of a Vector
The norm of a vector $x$, written as $\|x\|$, is a measure of the size or length of the vector. Further,
1. $\|x\| > 0$ for all $x \ne 0$, and $\|x\| = 0$ only if $x = 0$.
2. $\|\alpha x\| = |\alpha| \, \|x\|$ for any scalar $\alpha$ and for all $x$.
3. $\|x + y\| \le \|x\| + \|y\|$ for all $x$ and $y$, called the triangle inequality.
The norm is calculated in any of the following ways:
1. $$\|x\|^2 = \langle x, x \rangle = x'x, \quad \text{or} \quad \|x\| = \Big[ \sum_{i=1}^{n} x_i^2 \Big]^{1/2}, \text{ called the Euclidean norm} \tag{A.1.4}$$
2. (A.1.5)
3. $$\|x\| = \sum_{i=1}^{n} |x_i| \tag{A.1.6}$$
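A short sketch of the Euclidean norm (A.1.4) and the absolute-sum norm (A.1.6) follows, together with a numerical spot check of properties 2 and 3 above on sample vectors. The function names and the sample data are assumptions for illustration.

```python
import math


def euclidean_norm(x):
    """Euclidean norm: square root of the inner product x'x."""
    return math.sqrt(sum(xi * xi for xi in x))


def absolute_sum_norm(x):
    """Norm computed as the sum of the absolute values of the elements."""
    return sum(abs(xi) for xi in x)


if __name__ == "__main__":
    x = [3.0, -4.0]
    y = [1.0, 2.0]
    s = [xi + yi for xi, yi in zip(x, y)]
    print("Euclidean norm of x:", euclidean_norm(x))
    print("absolute-sum norm of x:", absolute_sum_norm(x))
    # Property 3: the norm of a sum never exceeds the sum of the norms.
    print(euclidean_norm(s) <= euclidean_norm(x) + euclidean_norm(y))
    # Property 2: scaling by -2 scales the norm by |-2| = 2.
    print(euclidean_norm([-2.0 * xi for xi in x]) == 2.0 * euclidean_norm(x))
```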
Multiplication of Vectors
The multiplication of two vectors is done by transposing one of the vectors and then multiplying this vector with the other vector. Thus,
$$x'y = \langle x, y \rangle = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n. \tag{A.1.7}$$
This product $\langle x, y \rangle$, which is a scalar, is often called the inner product of these two vectors.
On the other hand, the outer product $x \rangle\langle y$ of two vectors is defined as
$$x \rangle\langle y = xy' = \begin{bmatrix} x_1 y_1 & x_1 y_2 & \cdots & x_1 y_n \\ x_2 y_1 & x_2 y_2 & \cdots & x_2 y_n \\ \vdots & \vdots & & \vdots \\ x_n y_1 & x_n y_2 & \cdots & x_n y_n \end{bmatrix}, \tag{A.1.8}$$
which is a matrix defined next.
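A.2 Matrices
Matrix
An $n \times m$ matrix $A$ is an arrangement of $nm$ elements $a_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$) into $n$ rows and $m$ columns as
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}. \tag{A.2.1}$$
The $n \times m$ of the matrix $A$ is also referred to as the order, size or dimension of the matrix.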
Square Matrix
If the number of rows and columns is the same, that is, if m = n in the
matrix A of (A.2.1), then it is called a square matrix.
Unity Matrix
A unity matrix $I$ is defined as the matrix having the value 1 for all diagonal elements and zero for the rest of the elements, as
$$I = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}. \tag{A.2.2}$$
Addition/Subtraction of Matrices
The addition (or subtraction) of two matrices A and B is simply the
addition of the corresponding elements in a particular row and column,
and hence, obviously these two matrices should be of the same size or
order. Thus we get a new matrix C as
C=A+B (A.2.3)
where $c_{ij} = a_{ij} + b_{ij}$. The addition of two matrices is commutative as
$$A + B = B + A. \tag{A.2.4}$$
Multiplication of a Matrix by a Scalar
The scalar multiplication of two matrices of the same order, and the addition or subtraction, is easily seen to be
(A.2.5)
where $\alpha_1$ and $\alpha_2$ are scalars, and
(A.2.6)
Multiplication of Matrices
The product of an $n \times p$ matrix $A$ and a $p \times m$ matrix $B$ is defined as
$$C = AB, \quad \text{where} \quad c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}. \tag{A.2.7}$$
Note that the element $c_{ij}$ is formed by summing the products of the elements in the $i$th row of the matrix $A$ with the corresponding elements in the $j$th column of the matrix $B$. Obviously, the number of columns of $A$ should equal the number of rows of $B$, so that the resultant matrix $C$ has the same number of rows as $A$ and the same number of columns as $B$.
The product of two or more matrices is defined as
$$D = ABC = (AB)C = A(BC). \tag{A.2.8}$$
However, note that in general
$$AB \ne BA. \tag{A.2.9}$$
Thus, the multiplication process is associative, but not generally commutative. However, with the unity matrix $I$, we have
$$AI = A = IA. \tag{A.2.10}$$
A transpose of a matrix $A$, denoted as $A'$, is obtained by interchanging the rows and columns. Thus, the transpose $B$ of the matrix $A$ is written as
$$B = A' \text{ such that } b_{ij} = a_{ji}. \tag{A.2.11}$$
Also, it can be easily seen that the transpose of the sum of two matrices is
$$(A + B)' = A' + B'. \tag{A.2.12}$$
The transpose of the product of two or more matrices is given by
$$(AB)' = B'A', \qquad (ABC)' = C'B'A'. \tag{A.2.13}$$
Symmetric Matrix
A symmetric matrix is one whose row elements are the same as the corresponding column elements. Thus, $a_{ij} = a_{ji}$. In other words, if $A = A'$, then the matrix $A$ is symmetric.
Norm of a Matrix
For matrices, the various norms satisfy
1. $\|Ax\| \le \|A\| \cdot \|x\|$, called the Schwartz inequality,
2. $\|A + B\| \le \|A\| + \|B\|$,
3. $\|AB\| \le \|A\| \cdot \|B\|$
where $\cdot$ denotes multiplication.
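Determinant
The determinant $|A|$ of an $n \times n$ matrix $A$ can be evaluated in many ways. One of the ways, for a $3 \times 3$ matrix $A$, is as follows.
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix};$$
$$|A| = a_{11}(-1)^{1+1} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + a_{12}(-1)^{1+2} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}(-1)^{1+3} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}. \tag{A.2.14}$$
Note that in (A.2.14), the subdeterminant associated with $a_{11}$ is formed by deleting the row and column containing $a_{11}$. Thus, the $3 \times 3$ determinant $|A|$ is expressed in terms of the $2 \times 2$ subdeterminants. Once again, this $2 \times 2$ subdeterminant can be written, for example, as
$$\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} = a_{22} a_{33} - a_{23} a_{32}. \tag{A.2.15}$$
Some useful results on determinants are
$$|A| = |A'|, \qquad |AB| = |A| \cdot |B|, \qquad |I + AB| = |I + BA|. \tag{A.2.16}$$
Cofactor of a Matrix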
A cofactor of an element in the $i$th row and $j$th column of a matrix $A$ is $(-1)^{i+j}$ times the determinant of the matrix formed by deleting the $i$th row and $j$th column. Thus, the determinant given by (A.2.14) can be written in terms of cofactors as
$$|A| = a_{11}\,[\text{cofactor of } a_{11}] + a_{12}\,[\text{cofactor of } a_{12}] + a_{13}\,[\text{cofactor of } a_{13}]. \tag{A.2.17}$$
Adjoint of a Matrix
The adjoint of a matrix $A$, denoted as adj$A$, is obtained by replacing each element by its cofactor and transposing.
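The cofactor and adjoint definitions above translate directly into a small recursive routine. The sketch below computes the determinant by cofactor expansion along the first row, builds the adjoint as the transposed cofactor matrix, and forms an inverse through the standard identity $A^{-1} = \mathrm{adj}A / |A|$. It uses exact rational arithmetic and plain nested lists; the function names are hypothetical, and the approach is meant for small matrices only, since cofactor expansion grows factorially with the order.

```python
from fractions import Fraction


def minor(A, i, j):
    """Matrix A with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:                 # determinant of the empty (0 x 0) matrix
        return 1
    return sum(A[0][j] * cofactor(A, 0, j) for j in range(len(A)))


def cofactor(A, i, j):
    """(-1)^(i+j) times the determinant of the (i, j) minor."""
    return (-1) ** (i + j) * det(minor(A, i, j))


def adjoint(A):
    """Transpose of the matrix of cofactors."""
    n = len(A)
    return [[cofactor(A, j, i) for j in range(n)] for i in range(n)]


def inverse(A):
    """Inverse via adj(A) / det(A), in exact rational arithmetic."""
    d = det(A)
    if d == 0:
        raise ZeroDivisionError("matrix is singular")
    return [[Fraction(c) / d for c in row] for row in adjoint(A)]


if __name__ == "__main__":
    A = [[2, 0, 1], [1, 3, 2], [1, 1, 2]]
    print("det:", det(A))
    for row in inverse(A):
        print([str(v) for v in row])
```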
Singular Matrix
A matrix $A$ is called singular if its determinant is zero, that is, if $|A| = 0$. $A$ is said to be nonsingular if $|A| \ne 0$.
Rank of a Matrix
The rank or full rank of a matrix A of order nxn is defined as
1. the number of linearly independent columns or rows of A, or
2. the greatest order of nonzero determinant of submatrices of A.
If the rank of A is n, it means that the matrix A is nonsingular.
Inverse of a Matrix
If we have a relation
$$PA = I, \text{ where } I \text{ is an identity matrix}, \tag{A.2.18}$$
then $P$ is called the inverse of the matrix $A$, denoted as $A^{-1}$. The inverse of a matrix can be calculated in several ways. Thus,
$$A^{-1} = \frac{\mathrm{adj} A}{|A|}. \tag{A.2.19}$$
It can be easily seen that
$$(A^{-1})' = (A')^{-1}, \qquad (AB)^{-1} = B^{-1} A^{-1}. \tag{A.2.20}$$
Further, the inverse of a sum of matrices is given as
$$[A + BCD]^{-1} = A^{-1} - A^{-1} B \left[ D A^{-1} B + C^{-1} \right]^{-1} D A^{-1} \tag{A.2.21}$$
where $A$ and $C$ are nonsingular matrices, the matrix $[A + BCD]$ can be formed and is nonsingular, and the matrix $[DA^{-1}B + C^{-1}]$ is nonsingular. As a special case
$$\left[ I - F[sI - A]^{-1} B \right]^{-1} = I + F[sI - A - BF]^{-1} B. \tag{A.2.22}$$
If a matrix $A$ consists of submatrices as
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \tag{A.2.23}$$
then
$$|A| = |A_{11}| \cdot |A_{22} - A_{21} A_{11}^{-1} A_{12}| = |A_{22}| \cdot |A_{11} - A_{12} A_{22}^{-1} A_{21}| \tag{A.2.24}$$
where the inverses of $A_{11}$ and $A_{22}$ exist and
Powers of a Matrix
The $m$th power of a square matrix $A$, denoted as $A^m$, is defined as
$$A^m = AA \cdots A \quad \text{up to } m \text{ terms}. \tag{A.2.26}$$
Exponential of a Matrix
The exponential of a square matrix $A$ can be expressed as
$$\exp(A) = e^A = I + A + \frac{1}{2!} A^2 + \frac{1}{3!} A^3 + \cdots \tag{A.2.27}$$
Differentiation and Integration
Differentiation of a Scalar w.r.t. a Vector
If a scalar J is a function of a (column) vector x, then the derivative
of J w.r.t. the x becomes
[
af 1
aXl
dJ =VxJ= it .
dx ...
~
aXn
(A.2.28)
This is also called a gradient of the function J w.r. t. the vector x.
The second derivative (also called Hessian) of J w.r.t. the vector x
is
(A.2.29)
Differentiation of a Vector w.r.t. a Scalar
If a vector x of dimension n is a function of scalar t, then the derivative
of x w.r.t. t is
$$\frac{dx}{dt} = \begin{bmatrix} \dfrac{dx_1}{dt} \\ \dfrac{dx_2}{dt} \\ \vdots \\ \dfrac{dx_n}{dt} \end{bmatrix}. \tag{A.2.30}$$
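Differentiation of a Vector w.r.t. a Vector
The derivative of an mth-order vector function f w.r.t. an nth-order vector
x is written as
$$G = \frac{df'}{dx} = \frac{\partial f'}{\partial x} \tag{A.2.31}$$
where G is a matrix of order $n \times m$. Note that
$$G' = \left[\frac{\partial f'}{\partial x}\right]', \quad \text{which is written as } \frac{\partial f}{\partial x}. \tag{A.2.32}$$
This is also called the Jacobian matrix, denoted as
$$J_x(f(x)) = \frac{df}{dx} = \frac{\partial f}{\partial x} = \left[\frac{\partial f_i}{\partial x_j}\right] = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix}. \tag{A.2.33}$$
Thus, the total differential of f is
$$df = \frac{\partial f}{\partial x}\,dx. \tag{A.2.34}$$
Differentiation of a Scalar w.r.t. Several Vectors
For a scalar f as a function of two vectors x and y, we have
$$f = f(x, y), \qquad df = \left[\frac{\partial f}{\partial x}\right]' dx + \left[\frac{\partial f}{\partial y}\right]' dy \tag{A.2.35}$$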
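where df is the total differential. For a scalar function
$$f = f(x(t), y(t), t), \qquad y(t) = y(x(t), t),$$
$$\frac{df}{dx} = \left[\frac{\partial y'}{\partial x}\right]\frac{\partial f}{\partial y} + \frac{\partial f}{\partial x},$$
$$\frac{df}{dt} = \left[\left\{\frac{\partial y'}{\partial x}\right\}\frac{\partial f}{\partial y} + \frac{\partial f}{\partial x}\right]'\frac{dx}{dt} + \left[\frac{\partial f}{\partial y}\right]'\frac{\partial y}{\partial t} + \frac{\partial f}{\partial t}. \tag{A.2.36}$$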
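Differentiation of a Vector w.r.t. Several Vectors
Similarly, for a vector function f, we have
$$f = f(y, x, t), \qquad y = y(x, t), \qquad x = x(t),$$
$$\frac{df}{dx} = \left[\frac{\partial f'}{\partial y}\right]'\left[\frac{\partial y'}{\partial x}\right]' + \frac{\partial f}{\partial x} = \left[\frac{\partial f}{\partial y}\right]\left[\frac{\partial y}{\partial x}\right] + \frac{\partial f}{\partial x},$$
$$\frac{df}{dt} = \left[\frac{\partial f'}{\partial y}\right]'\left[\left\{\frac{\partial y'}{\partial x}\right\}'\frac{dx}{dt} + \frac{\partial y}{\partial t}\right] + \left[\frac{\partial f'}{\partial x}\right]'\frac{dx}{dt} + \frac{\partial f}{\partial t}$$
$$\phantom{\frac{df}{dt}} = \left[\frac{\partial f}{\partial y}\right]\left[\left\{\frac{\partial y}{\partial x}\right\}\frac{dx}{dt} + \frac{\partial y}{\partial t}\right] + \left[\frac{\partial f}{\partial x}\right]\frac{dx}{dt} + \frac{\partial f}{\partial t}. \tag{A.2.37}$$
Differentiation of a Matrix w.r.t. a Scalar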
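If each element of a matrix is a function of a scalar variable t, then the
matrix A(t) is said to be a function of t. Then the derivative of the
matrix A(t) is defined as
$$\frac{dA(t)}{dt} = \begin{bmatrix} \dfrac{da_{11}}{dt} & \cdots & \dfrac{da_{1m}}{dt} \\ \vdots & & \vdots \\ \dfrac{da_{n1}}{dt} & \cdots & \dfrac{da_{nm}}{dt} \end{bmatrix}. \tag{A.2.38}$$
It follows (from the product rule) that
$$\frac{d}{dt}[A(t)B(t)] = \frac{dA(t)}{dt}B(t) + A(t)\frac{dB(t)}{dt}. \tag{A.2.39}$$
It can also be shown that
$$\frac{d}{dt}\left[e^{At}\right] = Ae^{At} = e^{At}A, \qquad \frac{d}{dt}\left[A^{-1}(t)\right] = -A^{-1}(t)\frac{dA(t)}{dt}A^{-1}(t). \tag{A.2.40}$$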
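Differentiation of a Scalar w.r.t. a Matrix
Suppose a scalar J is a function of a matrix A, then
$$\frac{dJ}{dA} = \begin{bmatrix} \dfrac{dJ}{da_{11}} & \dfrac{dJ}{da_{12}} & \cdots & \dfrac{dJ}{da_{1m}} \\ \dfrac{dJ}{da_{21}} & \dfrac{dJ}{da_{22}} & \cdots & \dfrac{dJ}{da_{2m}} \\ \vdots & \vdots & & \vdots \\ \dfrac{dJ}{da_{n1}} & \dfrac{dJ}{da_{n2}} & \cdots & \dfrac{dJ}{da_{nm}} \end{bmatrix}. \tag{A.2.41}$$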
The integration process for matrices and vectors is similarly defined
for all the previous cases. For example,
(A.2.42)
Taylor Series Expansion
It is well known that the Taylor series expansion of a function J w.r.t.
x about $x_0$ is
$$J(x) = J(x_0) + \left[\frac{\partial J}{\partial x}\right]'\Bigg|_{x_0}(x - x_0) + \frac{1}{2!}(x - x_0)'\left[\frac{\partial^2 J}{\partial x^2}\right]\Bigg|_{x_0}(x - x_0) + O(3) \tag{A.2.43}$$
where O(3) indicates terms of order 3 and higher.
Trace of a Matrix
For a square matrix A of dimension n, the trace of A is defined as
$$\mathrm{tr}[A] = \sum_{i=1}^{n} a_{ii}. \tag{A.2.44}$$
Thus, the trace is the sum of the diagonal elements of a matrix. Also,
$$\mathrm{tr}[A + B] = \mathrm{tr}[A] + \mathrm{tr}[B], \qquad \mathrm{tr}[AB] = \mathrm{tr}[A'B'] = \mathrm{tr}[B'A'] = \mathrm{tr}[BA]. \tag{A.2.45}$$
Eigenvalues and Eigenvectors of a Square Matrix
For a square matrix A of order n, the roots (or zeros) of the characteristic
polynomial equation in $\lambda$
$$|\lambda I - A| = 0 \tag{A.2.46}$$
are called eigenvalues of the matrix A. If there is a nonzero vector x
satisfying the equation
$$(\lambda_i I - A)\,x = 0 \tag{A.2.47}$$
for a particular eigenvalue $\lambda_i$, then the vector x is called the eigenvector
of the matrix A corresponding to the particular eigenvalue $\lambda_i$. Also,
note that the trace of a matrix is related to its eigenvalues as
$$\mathrm{tr}[A] = \sum_{i=1}^{n} \lambda_i. \tag{A.2.48}$$
Singular Values
Let A be an $n \times m$ matrix, then the singular values $\sigma$ of the matrix
A are defined as the square roots of the eigenvalues ($\lambda$) of the
matrix $A'A$, that is
$$\sigma = \sqrt{\lambda(A'A)}. \tag{A.2.49}$$
The singular values are usually arranged in descending order of
magnitude.
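The definition (A.2.49) can be compared with MATLAB's built-in svd. This is a small sketch; the matrix A is an illustrative assumption.

```matlab
% Singular values from the eigenvalues of A'*A versus svd(A).
A = [3 0; 4 5; 0 0];                          % illustrative 3x2 matrix (assumption)

sigma_eig = sort(sqrt(eig(A'*A)), 'descend'); % definition (A.2.49), descending order
sigma_svd = svd(A);                           % built-in, already in descending order

err_sigma = norm(sigma_eig - sigma_svd);
disp([sigma_eig, sigma_svd])
```

For this A, the matrix A'A has eigenvalues 45 and 5, so the singular values are sqrt(45) and sqrt(5).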
A.3 Quadratic Forms and Definiteness
Quadratic Forms
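Consider the inner product of a vector x with Px, where P is a real
symmetric matrix, or equivalently the squared norm of the vector x
w.r.t. the real symmetric matrix P:
$$\langle x, Px \rangle = x'Px = \|x\|_P^2 = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{12} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & & \vdots \\ p_{1n} & p_{2n} & \cdots & p_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \sum_{i,j=1}^{n} p_{ij}x_ix_j. \tag{A.3.1}$$
The scalar quantity $x'Px$ is called a quadratic form since it contains
quadratic terms such as $x_1^2p_{11}$, $x_1x_2p_{12}$, ....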
Definiteness
Let P be a real and symmetric matrix and x be a nonzero real vector,
then
1. P is positive definite if the scalar quantity $x'Px > 0$, that is, positive.
2. P is positive semidefinite if the scalar quantity $x'Px \geq 0$, that is,
nonnegative.
3. P is negative definite if the scalar quantity $x'Px < 0$, that is, negative.
4. P is negative semidefinite if the scalar quantity $x'Px \leq 0$, that is,
nonpositive.
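A test for a real symmetric matrix P to be positive definite is that all its
leading principal minors must be positive, that is,
$$p_{11} > 0, \qquad \begin{vmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{vmatrix} > 0, \qquad \begin{vmatrix} p_{11} & p_{12} & p_{13} \\ p_{12} & p_{22} & p_{23} \\ p_{13} & p_{23} & p_{33} \end{vmatrix} > 0 \tag{A.3.2}$$
for a $3 \times 3$ matrix P. P is negative definite if $-P$ is positive definite,
that is, if the leading principal minors alternate in sign starting with
$p_{11} < 0$. For positive semidefiniteness, all principal minors (not only
the leading ones) must be nonnegative ($\geq 0$), and P is negative
semidefinite if $-P$ is positive semidefinite. Another simple test for
definiteness uses eigenvalues (all
eigenvalues positive for positive definiteness, etc.).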
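Both tests for positive definiteness can be sketched in a few lines of base MATLAB. The matrix P below is an illustrative assumption; its leading principal minors work out to 2, 3, and 4.

```matlab
% Positive-definiteness tests: leading principal minors and eigenvalues.
P = [2 -1 0; -1 2 -1; 0 -1 2];   % illustrative real symmetric matrix (assumption)
n = size(P,1);

minors = zeros(n,1);
for k = 1:n
    minors(k) = det(P(1:k,1:k)); % k-th leading principal minor
end

is_pd_minors = all(minors > 0);  % test (A.3.2)
is_pd_eig    = all(eig(P) > 0);  % eigenvalue test
disp(minors')
```

For large or badly scaled matrices, the eigenvalue test (or an attempted Cholesky factorization) is usually more reliable than computing determinants.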
Also, note that
$$[x'Px]' = x'P'x = x'Px, \qquad P = \sqrt{P}\,\sqrt{P}' = \sqrt{P}'\,\sqrt{P}. \tag{A.3.3}$$
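Derivative of Quadratic Forms
Some useful results in obtaining the derivatives of quadratic forms and
related expressions are given below.
$$\begin{aligned}
\frac{\partial}{\partial x}(Ax) &= A \\
\frac{\partial}{\partial y}(x'y) &= \frac{\partial}{\partial y}(y'x) = x \\
\frac{\partial}{\partial y}(x'Ay) &= \frac{\partial}{\partial y}(y'A'x) = A'x \\
\frac{\partial}{\partial x}(x'Ax) &= Ax + A'x \\
\frac{\partial^2}{\partial x^2}(x'Ax) &= A + A'.
\end{aligned} \tag{A.3.4}$$
If there is a symmetric matrix P, then
$$\begin{aligned}
\frac{\partial}{\partial x}(x'Px) &= 2Px \\
\frac{\partial^2}{\partial x^2}(x'Px) &= 2P.
\end{aligned} \tag{A.3.5}$$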
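The gradient formula for the quadratic form x'Ax in (A.3.4) can be checked against a central finite difference. This is only a sketch; the matrix A, the point x, and the step h are assumptions chosen for illustration.

```matlab
% Finite-difference check of d(x'Ax)/dx = Ax + A'x from (A.3.4).
A = [1 2; 3 4];                  % illustrative (nonsymmetric) matrix
x = [1; -2];                     % point at which the gradient is evaluated
f = @(v) v'*A*v;                 % the quadratic form
h = 1e-6;                        % finite-difference step (assumption)
n = numel(x);

g_num = zeros(n,1);
for i = 1:n
    e = zeros(n,1); e(i) = 1;    % i-th unit vector
    g_num(i) = (f(x + h*e) - f(x - h*e))/(2*h);
end

g_formula = A*x + A'*x;          % analytical gradient
err_grad = norm(g_num - g_formula);
disp([g_num, g_formula])
```

Here (A + A')x = [-8; -11], and the central difference reproduces it up to round-off because the form is exactly quadratic.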
Appendix B
State Space Analysis
The main purpose of this appendix is to provide a brief summary of
the results on state space analysis. It serves as a review of these topics
rather than an in-depth treatment. For more details on
this subject, the reader is referred to [69, 147, 4, 41, 11, 35].
B.1 State Space Form for Continuous-Time Systems
A linear time-invariant (LTI), continuous-time, dynamical system is
described by
$$\begin{aligned}
\dot{x}(t) &= Ax(t) + Bu(t), && \text{state equation} \\
y(t) &= Cx(t) + Du(t), && \text{output equation}
\end{aligned} \tag{B.1.1}$$
with initial conditions $x(t = t_0) = x(t_0)$. Here, x(t) is an n-dimensional
state vector, u(t) is an r-dimensional control vector, and y(t) is a
p-dimensional output vector, and the various matrices A, B, ..., are of
appropriate dimensionality. The Laplace transform (in terms of the
Laplace variable s) of the preceding set of equations (B.1.1) is
$$\begin{aligned}
sX(s) - x(t_0) &= AX(s) + BU(s) \\
Y(s) &= CX(s) + DU(s)
\end{aligned} \tag{B.1.2}$$
which becomes
$$\begin{aligned}
X(s) &= [sI - A]^{-1}[x(t_0) + BU(s)] \\
Y(s) &= C[sI - A]^{-1}[x(t_0) + BU(s)] + DU(s)
\end{aligned} \tag{B.1.3}$$
where X(s) is the Laplace transform of x(t), etc. In terms of the transfer
function G(s) with zero initial conditions $x(t_0) = 0$, we have
$$G(s) = \frac{Y(s)}{U(s)} = C[sI - A]^{-1}B + D. \tag{B.1.4}$$
A linear time-varying (LTV), continuous-time, dynamical system is
described by
$$\begin{aligned}
\dot{x}(t) &= A(t)x(t) + B(t)u(t), && \text{state equation} \\
y(t) &= C(t)x(t) + D(t)u(t), && \text{output equation}
\end{aligned} \tag{B.1.5}$$
with initial conditions $x(t = t_0) = x(t_0)$. The solution of the
continuous-time LTI system (B.1.1) is given by
$$\begin{aligned}
x(t) &= \Phi(t, t_0)x(t_0) + \int_{t_0}^{t}\Phi(t, \tau)Bu(\tau)\,d\tau \\
y(t) &= C\Phi(t, t_0)x(t_0) + C\int_{t_0}^{t}\Phi(t, \tau)Bu(\tau)\,d\tau + Du(t)
\end{aligned} \tag{B.1.6}$$
where $\Phi(t, t_0)$, called the state transition matrix of the system (B.1.1),
is given by
$$\Phi(t, t_0) = e^{A(t - t_0)} \tag{B.1.7}$$
having the properties
(B.l.8)
Similarly, the solution of the continuous-time LTV system (B.1.5) is
given by
$$\begin{aligned}
x(t) &= \Phi(t, t_0)x(t_0) + \int_{t_0}^{t}\Phi(t, \tau)B(\tau)u(\tau)\,d\tau \\
y(t) &= C(t)\Phi(t, t_0)x(t_0) + C(t)\int_{t_0}^{t}\Phi(t, \tau)B(\tau)u(\tau)\,d\tau + D(t)u(t)
\end{aligned} \tag{B.1.9}$$
where $\Phi(t, t_0)$, still called the state transition matrix of the system
(B.1.5), cannot be easily computed analytically, but does satisfy the
properties (B.1.8). However, in terms of a fundamental matrix X(t)
satisfying
$$\dot{X}(t) = A(t)X(t) \tag{B.1.10}$$
it can be written as [35]
$$\Phi(t, t_0) = X(t)X^{-1}(t_0). \tag{B.1.11}$$
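For the LTI case, the solution (B.1.6) with the state transition matrix (B.1.7) can be evaluated directly. The sketch below assumes an illustrative second-order system, a unit-step input, and zero initial time; none of these values come from the text. It also checks the standard composition property of the state transition matrix, and compares the forced response with the closed-form result that holds for a constant input and nonsingular A.

```matlab
% LTI solution x(t) = Phi(t,t0)*x0 + integral of Phi(t,tau)*B*u(tau) (B.1.6).
A  = [0 1; -2 -3];                       % illustrative system matrix (assumption)
B  = [0; 1];                             % illustrative input matrix (assumption)
x0 = [1; 0];                             % initial state at t0 = 0 (assumption)
t0 = 0; t1 = 0.7; t2 = 1.5;

Phi = @(t, s) expm(A*(t - s));           % state transition matrix (B.1.7)

% Composition property: Phi(t2,t1)*Phi(t1,t0) = Phi(t2,t0)
err_comp = norm(Phi(t2,t1)*Phi(t1,t0) - Phi(t2,t0));

% Forced response to a unit step u(t) = 1, by numerical quadrature
forced = integral(@(tau) Phi(t2,tau)*B, t0, t2, 'ArrayValued', true);
x_t2   = Phi(t2,t0)*x0 + forced;

% Closed form for a unit step when A is nonsingular
x_ref    = Phi(t2,t0)*x0 + A\((expm(A*(t2 - t0)) - eye(2))*B);
err_step = norm(x_t2 - x_ref);
disp(x_t2')
```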
B.2 Linear Matrix Equations
A set of linear simultaneous equations for an unknown matrix P in
terms of known matrices A and Q is written as
$$PA + A'P + Q = 0. \tag{B.2.1}$$
In particular, if Q is positive definite, then there exists a unique positive
definite P satisfying the previous linear matrix equation if and only
if A is asymptotically stable, that is, the real part (Re) of $\lambda\{A\} < 0$. Then
(B.2.1) is called the Lyapunov equation, the solution of which is given
by
$$P = \int_0^{\infty} e^{A't}Qe^{At}\,dt. \tag{B.2.2}$$
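Because (B.2.1) is linear in P, it can be solved as an ordinary linear system by stacking the columns of P. The following sketch uses the Kronecker product in base MATLAB (the Control System Toolbox function lyap is deliberately avoided). The matrices A and Q are illustrative assumptions, with A asymptotically stable.

```matlab
% Solve PA + A'P + Q = 0 (B.2.1) via vectorization:
% (kron(A',I) + kron(I,A')) * vec(P) = -vec(Q).
A = [0 1; -2 -3];                 % stable: eigenvalues -1 and -2 (assumption)
Q = eye(2);                       % positive definite weighting (assumption)
n = size(A,1);

Kmat = kron(A', eye(n)) + kron(eye(n), A');
P = reshape(Kmat \ (-Q(:)), n, n);

residual = norm(P*A + A'*P + Q);  % should be at round-off level
is_pd    = all(eig(P) > 0);       % P should be positive definite
disp(P)
```

For this example the solution is P = [1.25 0.25; 0.25 0.25], which is positive definite as the text predicts. The Kronecker approach forms an n^2 by n^2 system, so it is suitable only for small n.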
B.3 State Space Form for Discrete-Time Systems
A linear time-invariant (LTI), discrete-time, dynamical system is
described by
$$\begin{aligned}
x(k + 1) &= Ax(k) + Bu(k), && \text{state equation} \\
y(k) &= Cx(k) + Du(k), && \text{output equation}
\end{aligned} \tag{B.3.1}$$
with initial conditions $x(k = k_0) = x(k_0)$. Here, x(k) is an n-dimensional
state vector, u(k) is an r-dimensional control vector, and y(k) is a
p-dimensional output vector, and the various matrices A, B, ..., are
of appropriate dimensionality. The Z-transform (in terms of the
complex variable z) is
$$\begin{aligned}
zX(z) - zx(k_0) &= AX(z) + BU(z) \\
Y(z) &= CX(z) + DU(z)
\end{aligned} \tag{B.3.2}$$
which becomes
$$\begin{aligned}
X(z) &= [zI - A]^{-1}[zx(k_0) + BU(z)] \\
Y(z) &= C[zI - A]^{-1}[zx(k_0) + BU(z)] + DU(z).
\end{aligned} \tag{B.3.3}$$
In terms of the transfer function G(z) with zero initial conditions
$x(k_0) = 0$, we have
$$G(z) = \frac{Y(z)}{U(z)} = C[zI - A]^{-1}B + D. \tag{B.3.4}$$
An LTV, discrete-time, dynamical system is described by
$$\begin{aligned}
x(k + 1) &= A(k)x(k) + B(k)u(k), && \text{state equation} \\
y(k) &= C(k)x(k) + D(k)u(k), && \text{output equation}
\end{aligned} \tag{B.3.5}$$
with initial conditions $x(k = k_0) = x(k_0)$. The solution of the LTI
discrete-time system (B.3.1) is given by
$$\begin{aligned}
x(k) &= \Phi(k, k_0)x(k_0) + \sum_{m=k_0}^{k-1}\Phi(k, m + 1)Bu(m) \\
y(k) &= C\Phi(k, k_0)x(k_0) + C\sum_{m=k_0}^{k-1}\Phi(k, m + 1)Bu(m) + Du(k)
\end{aligned} \tag{B.3.6}$$
where $\Phi(k, k_0)$, called the state transition matrix of the discrete-time
system (B.3.1), is given by
$$\Phi(k, k_0) = A^{k - k_0} \tag{B.3.7}$$
having the properties
(B.3.8)
Similarly, the solution of the LTV, discrete-time system (B.3.5) is
given by
$$\begin{aligned}
x(k) &= \Phi(k, k_0)x(k_0) + \sum_{m=k_0}^{k-1}\Phi(k, m + 1)B(m)u(m) \\
y(k) &= C(k)\Phi(k, k_0)x(k_0) + C(k)\sum_{m=k_0}^{k-1}\Phi(k, m + 1)B(m)u(m) + D(k)u(k)
\end{aligned} \tag{B.3.9}$$
where
$$\Phi(k, k_0) = A(k - 1)A(k - 2)\cdots A(k_0) \quad (k - k_0 \text{ factors}), \tag{B.3.10}$$
is called the state transition matrix of the discrete-time system (B.3.5)
satisfying the properties (B.3.8).
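The closed-form solution (B.3.6) for the LTI case can be compared with a direct step-by-step recursion of (B.3.1). In this sketch the system matrices, initial state, and input sequence u(k) = 0.1k are illustrative assumptions.

```matlab
% Discrete-time LTI solution: recursion versus closed form (B.3.6).
A  = [0.8 1; 0 0.5];             % illustrative system matrix (assumption)
B  = [1; 0.5];                   % illustrative input matrix (assumption)
x0 = [5; 3];                     % initial state x(k0) (assumption)
k0 = 0; kf = 5;
u  = @(k) 0.1*k;                 % hypothetical input sequence

% Step-by-step recursion x(k+1) = A x(k) + B u(k)
x_rec = x0;
for k = k0:kf-1
    x_rec = A*x_rec + B*u(k);
end

% Closed form with Phi(k,k0) = A^(k-k0)
x_cf = A^(kf - k0)*x0;
for m = k0:kf-1
    x_cf = x_cf + A^(kf - (m + 1))*B*u(m);
end

err_disc = norm(x_rec - x_cf);
disp([x_rec, x_cf])
```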
B.4 Controllability and Observability
Let us first consider the LTI, continuous-time system (B.1.1). Similar
results are available for discrete-time systems [35]. The system (B.1.1)
with the pair (A : $n \times n$, B : $n \times r$) is called completely state controllable
if any of the following conditions is satisfied:
1. the rank of the controllability matrix
$$Q_c = \begin{bmatrix} B & AB & A^2B & \cdots & A^{n-1}B \end{bmatrix} \tag{B.4.1}$$
is n (full row rank), or
2. the controllability Grammian
$$W_c(t) = \int_0^t e^{A\tau}BB'e^{A'\tau}\,d\tau = \int_0^t e^{A(t - \tau)}BB'e^{A'(t - \tau)}\,d\tau \tag{B.4.2}$$
is nonsingular for any $t > 0$.
The system (B.1.1) with the pair (A : $n \times n$, C : $p \times n$) is completely
observable if any of the following conditions is satisfied:
1. the observability matrix
$$Q_o = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix} \tag{B.4.3}$$
has rank n (full column rank), or
2. the observability Grammian
$$W_o(t) = \int_0^t e^{A'\tau}C'Ce^{A\tau}\,d\tau \tag{B.4.4}$$
is nonsingular for any $t > 0$.
Other conditions also exist for controllability and observability [35].
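The rank conditions (B.4.1) and (B.4.3) and the Grammian condition (B.4.2) can be sketched without the Control System Toolbox. The matrices A, B, C and the Grammian horizon t = 1 are assumptions made for this example only.

```matlab
% Controllability and observability tests from (B.4.1)-(B.4.3).
A = [0 1; -2 -3];                % illustrative system (assumption)
B = [0; 1];
C = [1 0];
n = size(A,1);

Qc = B;                          % controllability matrix [B AB ... A^(n-1)B]
Qo = C;                          % observability matrix [C; CA; ...; CA^(n-1)]
for k = 1:n-1
    Qc = [Qc, A^k*B];
    Qo = [Qo; C*A^k];
end
is_controllable = (rank(Qc) == n);
is_observable   = (rank(Qo) == n);

% Controllability Grammian over [0, 1] by numerical quadrature
Wc = integral(@(tau) expm(A*tau)*(B*B')*expm(A'*tau), 0, 1, ...
              'ArrayValued', true);
gram_nonsingular = (rank(Wc) == n);
disp([is_controllable, is_observable, gram_nonsingular])
```

For higher-order systems the rank of Qc or Qo can be numerically fragile, so in practice the Grammian or an eigenvalue-based test is often preferred.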
B.5 Stabilizability, Reachability and Detectability
Stabilizability
A system is stabilizable if its uncontrollable states or modes, if any, are
stable. Its controllable states or modes may be stable or unstable. Thus,
the pair (A, B) is stabilizable if $(A - BF)$ can be made asymptotically
stable for some matrix F.
Reachability
A system is said to be reachable if the system can be transferred from
the initial state to any other specified final state. Thus, a continuous-time
system is reachable if and only if the system is controllable, and hence
reachability is equivalent to controllability.
Detectability
A system is detectable if its unobservable states, if any, are stable. Its
observable states may be stable or unstable. Thus, the pair (A, C) is
detectable if there is a matrix L such that $(A - LC)$ can be made
asymptotically stable. This is equivalent to the observability of the
unstable modes of A.
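One way to test these two properties numerically is the Popov-Belevitch-Hautus (PBH) eigenvalue test, a standard condition that is not stated in the text above and is used here as a swapped-in technique: (A, B) is stabilizable if [λI - A, B] has rank n for every eigenvalue λ of A with nonnegative real part, and (A, C) is detectable if [λI - A; C] has rank n for the same eigenvalues. The matrices below are illustrative assumptions, chosen so that the system is neither controllable nor observable, yet is stabilizable and detectable.

```matlab
% PBH-type test for stabilizability and detectability (continuous time).
A = [1 0; 0 -2];                 % one unstable mode (+1), one stable mode (-2)
B = [1; 0];                      % input reaches only the unstable mode
C = [1 0];                       % output sees only the unstable mode
n = size(A,1);

lam = eig(A);
stabilizable = true;
detectable   = true;
for i = 1:numel(lam)
    if real(lam(i)) >= 0         % only the unstable modes matter
        if rank([lam(i)*eye(n) - A, B]) < n
            stabilizable = false;
        end
        if rank([lam(i)*eye(n) - A; C]) < n
            detectable = false;
        end
    end
end

is_controllable = (rank([B, A*B]) == n);   % fails for this example
is_observable   = (rank([C; C*A]) == n);   % fails for this example
disp([stabilizable, detectable, is_controllable, is_observable])
```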
Appendix C
MATLAB Files
This appendix contains MATLAB© files required to run programs
used in solving some of the problems discussed in the book. One needs
to have the following files in one's working directory before running
the programs in MATLAB©.
C.1 MATLAB© for Matrix Differential Riccati Equation
The following is the typical MATLAB© file containing the various
given matrices for a problem, such as Example 3.1, using the analytical
solution of the matrix Riccati differential equation given in Chapter 3. This
file, say example.m, requires the other two files lqrnss.m and lqrnssf.m
given below. The electronic version of all these files can also be obtained
by sending an email to naiduds@isu.edu.
%%%%%%%%%%%%
clear all
A=[0.,1.;2.,1.];
B=[0.;1.];
Q=[2.,3.;3.,5.];
F=[1.,0.5;0.5,2.];
R=[.25];
tspan=[0 5];
x0=[2.,3.];
[x,u,K]=lqrnss(A,B,F,Q,R,x0,tspan);
%%%%%%%%%%%%%
C.1.1 MATLAB File lqrnss.m
This MATLAB© file lqrnss.m is required along with the other files
example.m and lqrnssf.m to solve the matrix Riccati equation using its
analytical solution.
%%%%%%%%%%%%%
%% The following is lqrnss.m
function [x,u,K]=lqrnss(As,Bs,Fs,Qs,Rs,x0,tspan)
%Revision Date 11/14/01
%%
% This m-file calculates and plots the outputs for a
% Linear Quadratic Regulator (LQR) system based on given
% state space matrices A and B and performance index
% matrices F, Q and R. This function takes these inputs,
% and using the analytical solution to the
%% matrix Riccati equation,
% and then computing optimal states and controls.
%
%
% SYNTAX: [x,u,K]=lqrnss(A,B,F,Q,R,x0,tspan)
%
% INPUTS (All numeric):
%   A,B     Matrices from xdot=Ax+Bu
%   F,Q,R   Performance Index Parameters;
%   x0      State variable initial condition
%   tspan   Vector containing time span [t0 tf]
% OUTPUTS:
%   x       is the state variable vector
%   u       is the input vector
%   K       is the steady-state matrix inv(R)*B'*P
% The system plots Riccati coefficients, x vector,
% and u vector
%
%Define variables to use in external functions
%
global A E F Md tf W11 W12 W21 W22 n,
%
%Check for correct number of inputs
%
if nargin<7
   error('Incorrect number of inputs specified')
   return
end
%Convert Variables to normal symbology to prevent
% problems with global statement
%
A=As;
B=Bs;
F=Fs;
Q=Qs;
R=Rs;
plotflag=0; %set plotflag to 1 to avoid plotting of
            % data on figures
%
%Define secondary variables for global passing to
% ode-related functions and determine matrix size
%
[n,m]=size(A);     %Find dimensions of A
[nb,mb]=size(B);   %Find dimensions of B
[nq,mq]=size(Q);   %Find dimensions of Q
[nr,mr]=size(R);   %Find Dimensions of R
[nf,mf]=size(F);   %Find Dimensions of F
if n~=m            %Verify A is square
   error('A must be square')
else
   [n,n]=size(A);
end
%
%Data Checks for proper setup
if length(A)>rank(ctrb(A,B))
   %Check for controllability
   error('System Not Controllable')
   return
end
if (n ~= nq) | (n ~= mq)
   %Check that A and Q are the same size
   error('A and Q must be the same size');
   return
end
if ~(mf==1&nf==1)
   if (nq ~= nf) | (mq ~= mf)
      %Check that Q and F are the same size
      error('Q and F must be the same size');
      return
   end
end
if ~(mr==1&nr==1)
   if (mr ~= nr) | (mb ~= nr)
      error('R must be consistent with B');
      return
   end
end
mq = norm(Q,1);
% Check if Q is positive semidefinite and symmetric
if any(eig(Q) < -eps*mq) | (norm(Q'-Q,1)/mq > eps)
   disp('Warning: Q is not symmetric and positive semidefinite');
end
mr = norm(R,1);
% Check if R is positive definite and symmetric
if any(eig(R) <= -eps*mr) | (norm(R'-R,1)/mr > eps)
   disp('Warning: R is not symmetric and positive definite');
end
%
%Define Initial Conditions for
%numerical solution of x states
%
t0=tspan(1);
tf=tspan(2);
tspan=[tf t0];
%
%Define Calculated Matrices and Vectors
%
E=B*inv(R)*B'; %E Matrix E=B*(1/R)*B'
%
%Find Hamiltonian matrix needed to use
% analytical solution to
% matrix Riccati differential equation
%
Z=[A,-E;-Q,-A'];
%
%Find Eigenvectors
%
[W,D]=eig(Z);
%
%Find the diagonals from D and pick the
% negative diagonals to create
% a new matrix M
%
j=n;
[m1,index1]=sort(real(diag(D)));
for i=1:1:n
   m2(i)=m1(j);
   index2(i)=index1(j);
   index2(i+n)=index1(i+n);
   j=j-1;
end
Md=diag(m2);
%
%Rearrange W so that it corresponds to the sort
% of the eigenvalues
%
for i=1:2*n
   w2(:,i)=W(:,index2(i));
end
W=w2;
%
%Define the Modal Matrix for D and Split it into Parts
%
W11=zeros(n);
W12=zeros(n);
W21=zeros(n);
W22=zeros(n);
j=1;
for i=1:2*n:(2*n*n-2*n+1)
   W11(j:j+n-1)=W(i:i+n-1);
   W21(j:j+n-1)=W(i+n:i+2*n-1);
   W12(j:j+n-1)=W(2*n*n+i:2*n*n+i+n-1);
   W22(j:j+n-1)=W(2*n*n+i+n:2*n*n+i+2*n-1);
   j=j+n;
end
%
%Define other initial conditions for
% calculation of P, g, x and u
%
t1=0.;
tx=0.;   %time array for x
tu=0.;   %time array for u
x=0.;    %state vector
%
%Calculation of optimized x
%
[tx,x]=ode45('lqrnssf',fliplr(tspan),x0, ...
   odeset('refine',2,'RelTol',1e-4,'AbsTol',1e-6));
%
%Find u vector
%
j=1;
us=0.; %Initialize computational variable
for i=1:1:mb
   for tua=t0:.1:tf
      Tt=-inv(W22-F*W12)*(W21-F*W11);
      P=(W21+W22*expm(Md*(tf-tua))*Tt* ...
         expm(Md*(tf-tua)))*inv(W11+W12*expm(Md*(tf-tua)) ...
         *Tt*expm(Md*(tf-tua)));
      K=inv(R)*B'*P;
      xs=interp1(tx,x,tua);
      us1=real(-K*xs');   %optimal control u = -K*x
      us(j)=us1(i);
      tu(j)=tua;
      j=j+1;
   end
   u(:,i)=us';
   us=0;
   j=1;
end
%
%Provide final steady-state K
%
P=W21/W11;
K=real(inv(R)*B'*P);
%
%Plotting Section, if desired
%
if plotflag~=1
%
%Plot diagonal Riccati coefficients using a
% flag variable to hold and change colors
%
fig=1;     %Figure number
cflag=1;   %Variable used to change plot color
j=1;
Ps=0.;     %Initialize P matrix plot variable
for i=1:1:n*n
   for t1a=t0:.1:tf
      Tt=-inv(W22-F*W12)*(W21-F*W11);
      P=real((W21+W22*expm(Md*(tf-t1a))*Tt*expm(Md* ...
         (tf-t1a)))*inv(W11+W12*expm(Md*(tf-t1a))*Tt ...
         *expm(Md*(tf-t1a))));
      Ps(j)=P(i);
      t1(j)=t1a;
      j=j+1;
   end
   if cflag==1;
      figure(fig)
      plot(t1,Ps,'b')
      title('Plot of Riccati Coefficients')
      xlabel('t')
      ylabel('P Matrix')
      hold
      cflag=2;
   elseif cflag==2
      plot(t1,Ps,'m:')
      cflag=3;
   elseif cflag==3
      plot(t1,Ps,'g-.')
      cflag=4;
   elseif cflag==4
      plot(t1,Ps,'r--')
      cflag=1;
      fig=fig+1;
   end
   Ps=0.;
   j=1;
end
if cflag==2|cflag==3|cflag==4
   hold
   fig=fig+1;
end
%
%Plot Optimized x
%
if n>2
   for i=1:3:(3*fix((n-3)/3)+1)
      figure(fig);
      plot(tx,real(x(:,i)),'b',tx,real(x(:,i+1)),'m:',tx, ...
         real(x(:,i+2)),'g-.')
      title('Plot of Optimized x')
      xlabel('t')
      ylabel('x vectors')
      fig=fig+1;
   end
end
if (n-3*fix(n/3))==1
   figure(fig);
   plot(tx,real(x(:,n)),'b')
elseif (n-3*fix(n/3))==2
   figure(fig);
   plot(tx,real(x(:,n-1)),'b',tx,real(x(:,n)),'m:')
end
title('Plot of Optimized x')
xlabel('t')
ylabel('x vectors')
fig=fig+1;
%
%Plot Optimized u
%
if mb>2
   for i=1:3:(3*fix((mb-3)/3)+1)
      figure(fig);
      plot(tu,real(u(:,i)),'b',tu,real(u(:,i+1)),'m:', ...
         tu,real(u(:,i+2)),'g-.')
      title('Plot of Optimized u')
      xlabel('t')
      ylabel('u vectors')
      fig=fig+1;
   end
end
if (mb-3*fix(mb/3))==1
   figure(fig);
   plot(tu,real(u(:,mb)),'b')
elseif (mb-3*fix(mb/3))==2
   figure(fig);
   plot(tu,real(u(:,mb-1)),'b',tu,real(u(:,mb)),'m:')
end
title('Plot of Optimized u')
xlabel('t')
ylabel('u vectors')
%
end
%%
%%%%%%%%%%%%%
C.1.2 MATLAB File lqrnssf.m
This file lqrnssf.m is used along with the other two files example.m and
lqrnss.m given above.
%%%%%%%%%%%%%%%%
%% The following is lqrnssf.m
%%
function dx = lqrnssf(t,x)
% Function for x
%
global A E F Md tf W11 W12 W21 W22 n
%Calculation of P, Riccati Analytical Solution
Tt=-inv(W22-F*W12)*(W21-F*W11);
P=(W21+W22*expm(Md*(tf-t))*Tt*expm(Md*(tf-t)))* ...
   inv(W11+W12*expm(Md*(tf-t))*Tt*expm(Md*(tf-t)));
%
xa=[A-E*P];
%
%Definition of differential equations
%
dx=[xa*x];
%%%%%%%%%
C.2 MATLAB© for Continuous-Time Tracking System
The following MATLAB© files are used to solve Example 4.1.
The main file is Example 4.1 (example4_1.m), which requires the files
Example 4.1 (example4_1p.m), Example 4.1 (example4_1g.m), and
Example 4.1 (example4_1x.m). The file Example 4.1 (example4_1p.m) is for
solving the set of first order Riccati differential equations; Example 4.1
(example4_1g.m) is the set of first order g vector differential equations;
and Example 4.1 (example4_1x.m) is the set of state differential
equations.
C.2.1 MATLAB File for Example 4.1 (example4_1.m)
clear all
%
%Define variables to use in external functions
global tp p tg g
%
%Define Initial Conditions for numerical solution
% of g and x states
%
tf=20;
tspan=[tf 0];
tp=0.;
tg=0.;
tx=0.;
pf=[2.,0.,0.];
gf=[2.,0.];
x0=[0.5,0.];
%
%Calculation of P
%
[tp,p]=ode45('example4_1p',tspan,pf,odeset('refine',2, ...
   'RelTol',1e-4,'AbsTol',1e-6));
%
%Calculation of g
%
[tg,g]=ode45('example4_1g',tp,gf,odeset('refine',2, ...
   'RelTol',1e-4,'AbsTol',1e-6));
%
%Calculation of optimized x
%
[tx,x]=ode45('example4_1x',flipud(tg),x0, ...
   odeset('refine',2,'RelTol',1e-4,'AbsTol',1e-6));
%
%Plot Riccati coefficients
%
fig=1; %Figure number
figure(fig)
plot(tp,real(p(:,1)),'k',tp,real(p(:,2)),'k',tp, ...
   real(p(:,3)),'k')
grid on
xlabel('t')
ylabel('Riccati Coefficients')
hold
%
fig=fig+1;
%
%Plot g values
%
figure(fig);
plot(tg,real(g(:,1)),'k',tg,real(g(:,2)),'k')
grid on
xlabel('t')
ylabel('g vector')
%%
%
fig=fig+1;
%
%Plot Optimal States x
%
figure(fig);
plot(tx,real(x(:,1)),'k',tx,real(x(:,2)),'k')
grid on
xlabel('t')
ylabel('Optimal States')
%
fig=fig+1;
%
%Plot Optimal Control u
%
[n,m]=size(p);
%p and g were integrated backward in time (tf to 0), so
% flip them to match the forward-time ordering of x
p12=flipud(p(:,2));
p22=flipud(p(:,3));
x1=x(:,1);
x2=x(:,2);
g2=flipud(g(:,2));
for i=1:1:n
   u(i) = -250*(p12(i)*x1(i) + p22(i)*x2(i) - g2(i));
end
figure(fig);
plot(tx,real(u),'k')
grid on
xlabel('t')
ylabel('Optimal Control')
C.2.2 MATLAB File for Example 4.1 (example4_1p.m)
function dp = example4_1p(t,p)
% Function for P
%
%Define variables to use in external functions
%
%Definition of differential equations
%
%
250*p(2)*p(3)-p(1)+3*p(2)+2*p(3)
C.2.3 MATLAB File for Example 4.1 (example4_1g.m)
function dg = example4_1g(t,g)
% Function for g
%
%Define variables to use in external functions
%
global tp p
%
%Definition of differential equations
%
dg=[(250*interp1(tp,p(:,2),t)+2)*g(2)-2
    -g(1)+(250*interp1(tp,p(:,3),t)+3)*g(2)];
C.2.4 MATLAB File for Example 4.1 (example4_1x.m)
function dx = example4_1x(t,x)
% Function for x
%
%Define variables to use in external functions
global tp p tg g
%
%Definition of differential equations
%
dx=[x(2)
    -2*x(1)-3*x(2)-250*(interp1(tp,p(:,2),t)*x(1)+ ...
    interp1(tp,p(:,3),t)*x(2)-interp1(tg,g(:,2),t))];
%
C.2.5 MATLAB File for Example 4.2 (example4_2.m)
clear all
%
%Define variables to use in external functions
global tp p tg g
%
%Define Initial Conditions for numerical solution of
% g and x states
%
tf=20;
tspan=[tf 0];
tp=0.;
tg=0.;
tx=0.;
pf=[0.,0.,0.];
gf=[0.,0.];
x0=[1.,0.];
%
%Calculation of P
[tp,p]=ode45('example4_2p',tspan,pf, ...
   odeset('refine',2,'RelTol',1e-4,'AbsTol',1e-6));
%
%Calculation of g
%
[tg,g]=ode45('example4_2g',tp,gf, ...
   odeset('refine',2,'RelTol',1e-4,'AbsTol',1e-6));
%
%Calculation of optimized x
%
[tx,x]=ode45('example4_2x',flipud(tg),x0, ...
   odeset('refine',2,'RelTol',1e-4,'AbsTol',1e-6));
%
fig=1; %Figure number
figure(fig)
plot(tp,real(p(:,1)),'k',tp,real(p(:,2)),'k',tp, ...
   real(p(:,3)),'k')
grid on
title('Plot of P')
xlabel('t')
ylabel('Riccati Coefficients')
hold
%
fig=fig+1;
%
%Plot g values
%
figure(fig);
plot(tg,real(g(:,1)),'k',tg,real(g(:,2)),'k')
grid on
title('Plot of g Vector')
xlabel('t')
ylabel('g vector')
%
fig=fig+1;
%
%Plot Optimized x
%
figure(fig);
plot(tx,real(x(:,1)),'k',tx,real(x(:,2)),'k')
grid on
title('Plot of Optimal States')
xlabel('t')
ylabel('Optimal States')
%
fig=fig+1;
%
%Calculate and Plot Optimized u
%
[n,m]=size(p);
p12=flipud(p(:,2));
p22=flipud(p(:,3));
x1=x(:,1);
x2=x(:,2);
g2=flipud(g(:,2));
for i=1:1:n
   %same feedback law as used in example4_2x.m
   u(i) = -25*(p12(i)*x1(i) + p22(i)*x2(i) - g2(i));
end
figure(fig);
plot(tx,real(u),'b')
grid on
title('Plot of Optimal Control')
xlabel('t')
ylabel('Optimal Control')
%%%%%%%%%%%%%%%%%%%
C.2.6 MATLAB File for Example 4.2 (example4_2p.m)
function dp = example4_2p(t,p)
% Function for P
%
%Define variables to use in external functions
%
%Definition of differential equations
%
%%
25*p(2)*p(3)-p(1)+3*p(2)+2*p(3)
;
C.2.7 MATLAB File for Example 4.2 (example4_2g.m)
function dg = example4_2g(t,g)
% Function for g
%
%Define variables to use in external functions
%
global tp p
%
%Definition of differential equations
%
dg=[(25*interp1(tp,p(:,2),t)+2)*g(2)-4*t
    -g(1)+(25*interp1(tp,p(:,3),t)+3)*g(2)];
%%
C.2.8 MATLAB File for Example 4.2 (example4_2x.m)
function dx = example4_2x(t,x)
% Function for x
%
%Define variables to use in external functions
global tp p tg g
%
%Definition of differential equations
%
dx=[x(2)
    -2*x(1)-3*x(2)-25*(interp1(tp,p(:,2),t)*x(1)+ ...
    interp1(tp,p(:,3),t)*x(2)-interp1(tg,g(:,2),t))];
%%
C.3 MATLAB© for Matrix Difference Riccati Equation
The following is the typical MATLAB© file containing the various
given matrices for a problem, such as Example 5.5, using the analytical
solution of the matrix Riccati difference equation given in Chapter 5. This
file, say example.m, requires the other file lqrdnss.m given below.
%%%%%%%%%%%%%%%%%
clear all
A=[.8,1;0,.5];
B=[1;.5];
F=[2,0;0,4];
Q=[1,0;0,1];
R=1;
kspan=[0 10];
x0(:,1)=[5.;3.];
[x,u]=lqrdnss(A,B,F,Q,R,x0,kspan);
%%%%%%%%%%%%%%%%%%%%%%%
C.3.1 MATLAB File lqrdnss.m
This MATLAB© file lqrdnss.m is required along with the other file
example.m to solve the matrix Riccati difference equation using its
analytical solution.
%%%%%%%%%%%%%
function [x,u]=lqrdnss(As,Bs,Fs,Qs,Rs,x0,kspan)
%
%This m-file calculates and plots the outputs for a
% discrete Linear Quadratic Regulator system
%Based on provided linear state space matrices
% for A and B and Performance Index matrices
% for F, Q and R.
%This function takes these inputs, and using the
% analytical solution to the matrix Riccati equation,
% formulates the optimal states and inputs.
%
%
% SYNTAX: [x,u]=lqrdnss(A,B,F,Q,R,x0,kspan)
%
% INPUTS (All numeric):
%   A,B     Matrices from x(k+1)=Ax(k)+Bu(k)
%   F,Q,R   Performance Index Parameters; terminal cost,
%           error and control weighting
%   x0      State variable initial condition. Must be a
%           column vector [x10;x20;x30 ... ]
%   kspan   Vector containing sample span [k0 kf]
%
% OUTPUTS:
%   x       is the state variable vector
%   u       is the input vector
% The system plots the Riccati coefficients in
% combinations of 4,
% and the x vector, and u vector in
% combinations of 3.
%
%Check for correct number of inputs
if nargin<7
   error('Incorrect number of inputs specified')
   return
end
%Convert Variables to normal symbology to
% prevent problems with global statement
A=As;
B=Bs;
F=Fs;
Q=Qs;
R=Rs;
plotflag=0; %set plotflag to a 1 to avoid plotting
            % of data on figures
%Define secondary variables for global passing to
% ode-related functions and determine matrix size
[n,m]=size(A);     %Find dimensions of A
[nb,mb]=size(B);   %Find dimensions of B
[nq,mq]=size(Q);   %Find dimensions of Q
[nr,mr]=size(R);   %Find Dimensions of R
[nf,mf]=size(F);   %Find Dimensions of F
if n~=m            %Verify A is square
   error('A must be square')
else
   [n,n]=size(A);
end
%Data Checks for proper setup
%Check for controllability
if length(A)>rank(ctrb(A,B))
   error('System Not Controllable')
   return
end
if (n ~= nq) | (n ~= mq)
   %Check that A and Q are the same size
   error('A and Q must be the same size');
   return
end
if ~(mf==1&nf==1)
   if (nq ~= nf) | (mq ~= mf)
      %Check that Q and F are the same size
      error('Q and F must be the same size');
      return
   end
end
if ~(mr==1&nr==1)
   if (mr ~= nr) | (mb ~= nr)
      error('R must be consistent with B');
      return
   end
end
mq = norm(Q,1);
% Check if Q is positive semidefinite and symmetric
if any(eig(Q) < -eps*mq) | (norm(Q'-Q,1)/mq > eps)
   disp('Warning: Q is not symmetric and positive semidefinite');
end
mr = norm(R,1);
% Check if R is positive definite and symmetric
if any(eig(R) <= -eps*mr) | (norm(R'-R,1)/mr > eps)
   disp('Warning: R is not symmetric and positive definite');
end
%Define Calculated Matrix
%Find matrix needed to calculate Analytical Solution
% to Riccati Equation
%Find Eigenvectors
[W,D]=eig(H);
%Find the diagonals from D and pick the negative
% diagonals to create a new matrix M
j=n;
[m1,index1]=sort(real(diag(D)));
for i=1:1:n
   m2(i)=m1(j);
   index2(i)=index1(j);
   index2(i+n)=index1(i+n);
   j=j-1;
end
Md=diag(m2);
%Rearrange W so that it corresponds to the
% sort of the eigenvalues
for i=1:2*n
   w2(:,i)=W(:,index2(i));
end
W=w2;
%Define the Modal Matrix for D and split it into parts
W11=zeros(n);
W12=zeros(n);
W21=zeros(n);
W22=zeros(n);
j=1;
for i=1:2*n:(2*n*n-2*n+1)
   W11(j:j+n-1)=W(i:i+n-1);
   W21(j:j+n-1)=W(i+n:i+2*n-1);
   W12(j:j+n-1)=W(2*n*n+i:2*n*n+i+n-1);
   W22(j:j+n-1)=W(2*n*n+i+n:2*n*n+i+2*n-1);
   j=j+n;
end
%Find M
M=zeros(n);
j=1;
for i=1:2*n:(2*n*n-2*n+1)
   M(j:j+n-1)=D(i:i+n-1);
   j=j+n;
end
%Zero Vectors
x=zeros(n,1);
%Define Loop Variables (l=lambda)
k0=kspan(1);
kf=kspan(2);
%x and P Conditions
x(:,1)=x0(:,1);
Tt=-inv(W22-F*W12)*(W21-F*W11);
P=real((W21+W22*((Md^(kf-0))*Tt*(Md^(kf-0)))) ...
   *inv(W11+W12*((Md^(kf-0))*Tt*(Md^(kf-0)))));
L=inv(R)*B'*(inv(A))'*(P-Q);
u(:,1)=-L*x0(:,1);
k1(1)=0;
for k=(k0+1):1:(kf)
   Tt=-inv(W22-F*W12)*(W21-F*W11);
   P=real((W21+W22*((Md^(kf-k))*Tt*(Md^(kf-k)))) ...
      *inv(W11+W12*((Md^(kf-k))*Tt*(Md^(kf-k)))));
   L=inv(R)*B'*(inv(A))'*(P-Q);
   x(:,k+1)=(A-B*L)*x(:,k);
   u(:,k+1)=-L*x(:,k+1);
   k1(k+1)=k;
end
%Plotting Section, if desired
if plotflag~=1
%Plot Riccati coefficients using flag variables
% to hold and change colors
%Variables are plotted one at a time and the plot held
fig=1;     %Figure number
cflag=1;   %Variable used to change plot color
j=1;
Ps=0.;     %Initialize P Matrix plot variable
for i=1:1:n*n
   for k=(k0):1:(kf)
      Tt=-inv(W22-F*W12)*(W21-F*W11);
      ...
      Ps(j)=P(i);
      k2(j)=k;
      j=j+1;
   end
   if cflag==1;
      figure(fig);
      plot(k2,Ps,'b')
      title('Plot of Riccati Coefficients')
      grid on
      xlabel('k')
      ylabel('P Matrix')
      hold
      cflag=2;
   elseif cflag==2
      plot(k2,Ps,'b')
      cflag=3;
   elseif cflag==3
      plot(k2,Ps,'b')
      cflag=4;
   elseif cflag==4
      plot(k2,Ps,'b')
      cflag=1;
      fig=fig+1;
   end
   Ps=0.;
   j=1;
end
if cflag==2|cflag==3|cflag==4
   hold
   fig=fig+1;
end
%Plot Optimized x
x=x';
if n>2
   for i=1:3:(3*fix((n-3)/3)+1)
      figure(fig);
      plot(k1,real(x(:,i)),'b',k1,real(x(:,i+1)),'b',k1, ...
         real(x(:,i+2)),'b')
      grid on
      title('Plot of Optimal States')
      xlabel('k')
      ylabel('Optimal States')
      fig=fig+1;
   end
end
if (n-3*fix(n/3))==1
   figure(fig);
   plot(k1,real(x(:,n)),'b')
elseif (n-3*fix(n/3))==2
   figure(fig);
   plot(k1,real(x(:,n-1)),'b',k1,real(x(:,n)),'b')
end
grid on
title('Plot of Optimal States')
xlabel('k')
ylabel('Optimal States')
fig=fig+1;
%
%Plot Optimized u
%
u=u';
if mb>2
   for i=1:3:(3*fix((mb-3)/3)+1)
      figure(fig);
      plot(k1,real(u(:,i)),'b',k1,real(u(:,i+1)), ...
         'm:',k1,real(u(:,i+2)),'g-.')
      grid on
      title('Plot of Optimal Control')
      xlabel('k')
      ylabel('Optimal Control')
      fig=fig+1;
   end
end
if (mb-3*fix(mb/3))==1
   figure(fig);
   plot(k1,real(u(:,mb)),'b')
elseif (mb-3*fix(mb/3))==2
   figure(fig);
   plot(k1,real(u(:,mb-1)),'b',k1,real(u(:,mb)),'m:')
end
grid on
title('Plot of Optimal Control')
xlabel('k')
ylabel('Optimal Control')
gtext('u')
end
%%%%%%%%
409
C.4 MATLAB© for DiscreteTime Tracking
System
This MATLAB© file for tracking Example 5.6 is given below.
% Solution Using Control System Toolbox (STB) in
% MATLAB Version 6
clear
A=[0.8 1;0,0.6]; %% system matrix A
B=[1;0.5]; %% system matrix B
C=[1 0;0 1]; %% system matrix C
Q=[1 0;0 0]; %% performance index
%% state weighting matrix Q
R=[0.01]; %% performance index control
%% weighting matrix R
F=[1,0;0,0]; %% performance index weighting matrix F
%
x1(1)=5; %% initial condition on state x1
x2(1)=3; %% initial condition on state x2
xk=[x1(1);x2(1)];
zk=[2;0];
zkf=[2;0];
% note that if kf = 10 then k = [k0,kf] = [0 1 2,...,10],
% so we have 11 points and the array x1 must be indexed as
% x1(N) with N=1 to 11, because x(0) is illegal in array
% definition in MATLAB. Let us use N = kf+1
k0=0; % the initial instant k_0
kf=10; % the final instant k_f
N=kf+1; %
[n,n]=size(A); % fixing the order of the system matrix A
I=eye(n); % identity matrix I
E=B*inv(R)*B'; % the matrix E = BR^{-1}B'
V=C'*Q*C;
W=C'*Q;
%
% solve matrix difference Riccati equation
% backwards starting from kf to k0
% use the form P(k) = A'P(k+1)[I + EP(k+1)]^{-1}A + V
% first fix the final conditions P(k_f) = C'FC (= F here, since C = I);
% g(k_f) = C'Fz(k_f)
% note that P, Q, R, F are all symmetric ij = ji
Pkplus1=C'*F*C;
gkplus1=C'*F*zkf;
p11(N)=F(1);
p12(N)=F(2);
p21(N)=F(3);
p22(N)=F(4);
%
g1(N)=gkplus1(1);
g2(N)=gkplus1(2);
%
Pk=0;
gk=0;
for k=N-1:-1:1,
   Pk = A'*Pkplus1*inv(I+E*Pkplus1)*A+V;
   Lk = inv(R+B'*Pkplus1*B)*B'*Pkplus1*A;
   gk=(A-B*Lk)'*gkplus1+W*zk;
   p11(k) = Pk(1,1);
   p12(k) = Pk(1,2);
   p21(k) = Pk(2,1);
   p22(k) = Pk(2,2);
   Pkplus1 = Pk;
   g1(k) = gk(1);
   g2(k) = gk(2);
   gkplus1 = gk;
end
%
% calculate the feedback coefficients L and Lg(k)
% L(k) = (R+B'P(k+1)B)^{-1}B'P(k+1)A
% Lg(k) = [R + B'P(k+1)B]^{-1}B'
%
for k = N:-1:1,
   Pk=[p11(k),p12(k);p21(k),p22(k)];
   gk=[g1(k);g2(k)];
   % Pkplus1 is not updated in this loop; it keeps the value
   % left by the Riccati loop above
   Lk = inv(R+B'*Pkplus1*B)*B'*Pkplus1*A;
   Lgk= inv(R+B'*Pkplus1*B)*B';
   l1(k) = Lk(1);
   l2(k) = Lk(2);
   lg1(k) = Lgk(1);
   lg2(k) = Lgk(2);
end
%
% solve the optimal states
% x(k+1) = [A-B*L]x(k) + BLg(k+1)g(k+1) given x(0)
%
xk=0.0;
for k=1:N-1,
   Lk = [l1(k),l2(k)];
   Lgk = [lg1(k),lg2(k)];
   Lgkplus1=[lg1(k+1),lg2(k+1)];
   xk = [x1(k);x2(k)];
   % gk is not reassigned in this loop; it keeps the value
   % from the last pass of the preceding loop
   xkplus1 = (A-B*Lk)*xk + B*Lgkplus1*gk;
   x1(k+1) = xkplus1(1);
   x2(k+1) = xkplus1(2);
end
%
% solve for optimal control
% u(k) = - L(k)x(k) + Lg(k)g(k+1)
%
xk=0.0;
% for k=1:N,
for k=1:N-1,
   Lk = [l1(k),l2(k)];
   Lgk = [lg1(k),lg2(k)];
   gkplus1=[g1(k+1);g2(k+1)];
   xk = [x1(k);x2(k)];
   u(k) = - Lk*xk + Lgk*gkplus1;
end
%
% plot various values: P(k), g(k), x(k), u(k)
% let us first reorder the values of k = 0 to kf
%
% first plot P(k)
%
k = 0:1:kf;
figure(1)
plot(k,p11,'k:o',k,p12,'k:+',k,p22,'k:*')
grid
xlabel('k')
ylabel('Riccati coefficients')
gtext('p_{11}(k)')
gtext('p_{12}(k)')
gtext('p_{22}(k)')
%
% Plot g(k)
%
k = 0:1:kf;
figure(2)
plot(k,g1,'k:o',k,g2,'k:+')
grid
xlabel('k')
ylabel('Vector coefficients')
gtext('g_{1}(k)')
gtext('g_{2}(k)')
%
k=0:1:kf;
figure(3)
plot(k,x1,'k:o',k,x2,'k:+')
grid
xlabel('k')
ylabel('Optimal States')
gtext('x_1(k)')
gtext('x_2(k)')
%
figure(4)
k=0:1:kf-1;
plot(k,u,'k:*')
grid
xlabel('k')
ylabel('Optimal Control')
gtext('u(k)')
%
% end of the program
%
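For reference, the recursions that the tracking program above steps through can be collected in one place. The summary below is a sketch transcribed from the program's own comments and assignments rather than a restatement of the derivation in the main text. The symbol L_g for the feedforward gain is taken from the variable name Lgk, and a prime denotes the transpose, as in MATLAB.

```latex
\begin{align*}
E &= B R^{-1} B', \qquad V = C' Q C, \qquad W = C' Q, \\
P(k) &= A' P(k+1)\,[I + E P(k+1)]^{-1} A + V, & P(k_f) &= C' F C, \\
g(k) &= [A - B L(k)]'\, g(k+1) + W z(k), & g(k_f) &= C' F z(k_f), \\
L(k) &= [R + B' P(k+1) B]^{-1} B' P(k+1) A, \\
L_g(k) &= [R + B' P(k+1) B]^{-1} B', \\
u(k) &= -L(k)\, x(k) + L_g(k)\, g(k+1).
\end{align*}
```

With the data of this program (B = [1; 0.5], R = 0.01, C = I), the constant matrices evaluate to E = [100 50; 50 25] and V = W = Q, which gives a quick check on the first few lines of the listing.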
413
References
[1] N. I. Akhiezer. The Calculus of Variations. Blaisdell Publishing
Company, Boston, MA, 1962.
[2] B. D. O. Anderson and J. B. Moore. Linear system optimiza
tion with prescribed degree of stability. Proceedings of the lEE,
116(12):20832087, 1969.
[3] B. D. O. Anderson and J. B. Moore. Optimal Control: Linear
Quadratic Methods. Prentice Hall, Englewood Cliffs, NJ, 1990.
[4] P. J. Antsaklis and A. N. Michel. Linear Systems. The McGraw
Hill Companies, Inc., New York, NY, 1997.
[5] K. Asatani. Suboptimal control of fixedendpoint minimum en
ergy problem via singular perturbation theory. Journal of Math
ematical Analysis and Applications, 45:684697, 1974.
[6] M. Athans and P. Falb. Optima Control: An Introduction to The
Theory and Its Applications. McGraw Hill Book Company, New
York, NY, 1966.
[7] T. B8.§ar and P. Bernhard. HOCOptimal Control and Related
Minimax Design Problems: Second Edition. Birkhauser, Boston,
MA,1995.
[8] A. V. Balakrishnan. Control Theory and the Calculus of Varia
tions. Academic Press, New York, NY, 1969.
[9] M. Bardi and 1. C.Dolcetta. Optimal Control and Viscosity
Solutions of HamiltonJacobiBellman Equations. Birkhauser,
Boston, MA, 1997.
[10] S. Barnett. Matrices in Control Theory. Van Nostrand Reinhold,
London, UK, 1971.
[11] J. S. Bay. Fundamentals of Linear State Space Systems. The
McGraw Hill Companies, Inc., New York, NY, 1999.
[12] R. E. Bellman. Dynamic Programming. Princeton University
Press, Princeton, N J, 1957.
[13] R. E. Bellman. Introduction to Matrix Analysis. McGraw Hill
Book Company, New York, NY, second edition, 1971.
415
416 References
[14] R. E. Bellman and S. E. Dreyfus. Applied Dynamic Programming.
Princeton University Press, Princeton, NJ, 1962.
[15] R. E. Bellman and R. E. Kalaba. Dynamic Programming and
Modem Control Theory. Academic Press, New York, NY, 1965.
[16] A. Bensoussen, E. Hurst, and B. Naslund. Management Appli
cations of Modem Control Theory. NorthHolland Publishing
Company, New York, NY, 1974.
[17] L. D. Berkovitz. Optimal Control Theory. SpringerVerlag, New
York, NY, 1974.
[18] D. P. Bertsekas. Dynamic Programming and Optimal Control:
Volume I. Athena Scientific, Belmont, MA, 1995.
[19] D. P. Bertsekas. Dynamic Programming and Optimal Control:
Volume II. Athena Scientific, Belmont, MA, 1995.
[20] J. T. Betts. Practical Methods for Optimal Control Using Non
linear Programming. SIAM, Philadelphia, PA, 200l.
[21] S. Bittanti. History and prehistory of the Riccati equation. In
Proceedings of the 35th Conference on Decision and Control, page
15991604, Kobe, Japan, December 1996.
[22] S. Bittanti, A. J. Laub, and J. C. Willems, editors. The Riccati
Equation. SpringerVerlag, New York, NY, 1991.
[23] G. A. Bliss. Lectures on the Calculus of Variations. University
of Chicago Press, Chicago, IL, 1946.
[24] V. G. Boltyanskii. Mathematical Methods of Optimal Control.
Rinehart and Winston, New York, NY, 1971.
[25] V. G. Boltyanskii, R. V. Gamkrelidze, and L. S. Pontryagin. On
the theory of optimal processes. Dokl. Akad. Nauk SSSR, 110:7
10, 1956. (in Russian).
[26] O. Bolza. Lectures on the Calculus of Variations. Chelsea Pub
lishing Company, New York, NY, Third edition, 1973.
[27] A. E. Bryson, Jr. , W. F. Denham, and S. E. Dreyfus. Optimal
programming problems with inequality constraints, I: Necessary
conditions for extremal solutions. AIAA Journal, 1(3):25442550,
1963.
[28] A. E. Bryson, Jr. Optimal control1950 to 1985. IEEE Control
Systems, 16(3):2633, June 1996.
[29] A. E. Bryson, Jr. Dynamic Optimization. Addison Wesley Long
man, Inc., Menlo Park, CA, 1999.
[30] A. E. Bryson, Jr. and Y. C. Ro. Applied Optimal Control: Opti
mization, Estimation and Control. Hemisphere Publishing Com
References
417
pany, New York, NY, 1975. Revised Printing.
[31] R. S. Bucy. Lectures on Discrete Time Filtering. SpringerVerlag,
New York, NY, 1994.
[32] J. B. Burl. Linear Optimal Control: H2 and Hoc Methods.
AddisonWesley Longman Inc., Menlo Park, CA, 1999.
[33] J. A. Cadzow and H. R. Martens. Discrete Time and Computer
Control Systems. Prentice Hall, Englewood Cliffs, NJ, 1970.
[34] B. M. Chen. Robust and Hoc Control. SpringerVerlag, London,
UK, 2000.
[35] C. T. Chen. Linear System Theory and Design. Oxford Univer
sity Press, New York, NY, Third edition, 1999.
[36] G. S. Christensen, M. E. ElHawary, and S. A. Soliman. Op
timal Control Applications in Electric Power Systems. Plenum
Publishing Company, New York, NY, 1987.
[37] P. Cicala. An Engineering Approach to the Calculus of Varia
tions. Levrotto and Bella, Torino, 1957.
[38] S. J. Citron. Elements of Optimal Control. Rinehart and Win
ston, New York, NY, 1969.
[39] P. Colaneri, J. C. Geromel, and A. Locatelli. Control Theory and
Design: An RH2 and RHoc Viewpoint. Academic Press, San
Diego, CA, 1997.
[40] W. F. Denham and A. E. Bryson, Jr. Optimal programming
problems with inequality constraints II: Solution by steepest as
cent. AIAA Journal, 2(1):2534, 1964.
[41] P. M. DeRusso, R. J. Roy, C. M. Close, and A. A. Desrochers.
State Variables for Engineers. John Wiley & Sons, New York,
NY, Second edition, 1998.
[42] P. Dorato, C. Abdallah, and V. Cerone. LinearQuadratic Con
trol: An Introduction. Prentice Hall, Englewood Cliffs, NY, 1995.
[43] J. C. Doyle, B. A. Francis, and A. R. Tannenbaum. Feedback
Control Theory. Macmillan Publishing, New York, NY, 1992.
[44] J. C. Doyle, K. Glover, P. P. Khargonekar, and B. A. Francis.
Statespace solutions to standard H2 and Hoc control problems.
IEEE Transactions on Automatic Control, 34:831847, 1989.
[45] S. E. Dreyfus. Dynamic Programming and the Calculus of Vari
ations. Academic Press, New York, NY, 1966.
[46] L. Elsgolts. Differential Equations and Calculus of Variations.
Mir Publishers, Moscow, Russia, 1970.
[47] L. E. Elsgolts. Calculus of Variations. AddisonWesley, Rading,
418 References
MA,1962.
[48] G. M. Ewing. Calculus of Variations with Applications. Dover
Publications, Inc., New York, NY, 1985.
[49] W.F. Feehery and P.I. Barton. Dynamic optimization with state
variable path constraints. Computers in Chemical Engineering,
22:12411256, 1998.
[50] M. J. Forray. Variational Calculus in Science and Engineering.
McGrawHill Book Company, New York, NY, 1968.
[51] B. A. Francis. A Course in Hoc Optimal Control Theory, vol
ume 88 of Lecture Notes in Control and Information Sciences.
SpringerVerlag, Berlin, 1987.
[52] R. V. Gamkrelidze. Discovery of the Maximum Principle. Journal
of Dynamical and Control Systems, 5(4):437451, 1999.
[53] R. V. Gamkrelidze. Mathematical Works of L. S. Pontryagin.
(personal communication), August 200l.
[54] F. R. Gantmacher. The Theory of Matrices, Vols. 1 and 2.
Chelsea Publishing, New York, NY, 1959.
[55] I. M. Gelfand and S. V. Fomin. Calculus of Variations. John
Wiley & Sons, New York, NY, 1988.
[56] M. Giaquinta and S. Hildebrandt. Calculus of Variations: Vol
ume I: The Lagrangian Formalism. SpringerVerlag, New York,
NY,1996.
[57] M. Giaquinta and S. Hildebrandt. Calculus of Variations: Vol
ume II: The Hamiltonian Formalism. SpringerVerlag, New York,
NY, 1996.
[58] H. H. Goldstine. A History of the Calculus of Variations: From
the 17th through the 19th Century. SpringerVerlag, New York,
NY, 1980.
[59] R. J. Gran. Fly me to the Moon  then and now. MATLAB News
£3 Notes, Summer: 49 , 1999.
[60] M. Green and D. Limebeer. Linear Robust Control. Prentice
Hall, Englewood Cliffs, NJ, 1995.
[61] J. Gregory and C. Lin. Constrained Optimization in the Cal
culus of Variations and Optimal Control Theory. Van Nostrand
Reinhold, New York, NY, 1992.
[62] R. F. Hartl, S. P. Sethi, and R. G. Vickson. A survey of the
maximum principles for optimal control problems with state con
straints. SIAM Review, 37(2):181218, June 1995.
[63] A. Heim and O. V. Stryk. Documentation of PAREST  A Mul
References
419
tiple Shooting Code for Optimization Problems in Differential 
Algebraic Equations, November 1996.
[64] M. R. Hestenes. A general problem in the calculus of variations
with applications to paths of least time. Technical Report RM
100, RAND Corporation, 1950.
[65] M. R. Hestenes. Calculus of Variations and Optimal Control.
John Wiley & Sons, New York, NY, 1966.
[66] L. M. Hocking. Optimal Control: An Introduction to the Theory
and Applications. Oxford University Press, New York, NY, 1991.
[67] J. C. Hsu and A. U. Meyer. Modem Control Principles and
Applications. McGraw Hill, New York, NY, 1968.
[68] D. H. Jacobson and M. M. Lee. A transformation technique for
optimal control problems with state inequality constraints. IEEE
Transactions on Automatic Control, AC14:457464, 1969.
[69] T. Kailath. Linear Systems. Prentice Hall, Englewood Cliffs, NJ,
1980.
[70] R. E. Kalman. Contribution to the theory of optimal control.
Bol. Soc. Matem. Mex., 5:102119, 1960.
[71] R. E. Kalman. A new approach to linear filtering in prediction
problems. ASME Journal of Basic Engineering, 82:3445, March
1960.
[72] R. E. Kalman. Canonical structure of linear dynamical systems.
Proc. Natl. A cad. Sci., 148(4):596600, April 1962.
[73] R. E. Kalman. Mathematical description of linear dynamical
systems. J. Soc. for Industrial and Applied Mathematics, 1:152
192, 1963.
[74] R. E. Kalman. New methods in wiener filtering theory. In Pro
ceedings of the Symposium on Engineering Applications of Ran
dom Function Theory and Probability, New York, 1963. Wiley.
[75] R. E. Kalman. The theory of optimal control and the calculus
of variations. In R. Bellman, editor, Mathematical Optimization
Techniques, chapter 16. University of California Press, 1963.
[76] R. E. Kalman and R. S. Bucy. New results in linear filtering and
prediction theory. Transactions ASME J. Basic Eng., 83:95107,
1961.
[77] R. E. Kalman, Y. Ho, and K. S. Narendra. Controllability of lin
ear dynamical systems. Contributions to Differential Equations,
1(2):189213, 1963.
[78] M. I. Kamien and N. L. Schwartz. Dynamic Optimization: The
420 References
Calculus of Variations and Optimal Control in Economics and
Management, Second Edition. Elsvier Science Publishing Com
pany, New York, NY, 1991.
[79] D. E. Kirk. Optimal Control Theory. Prentice Hall, Englewood
Cliffs, NJ, 1970.
[80] G. E. Kolosov. Optimal Design of Control Systems: Stochastic
and Deterministic Problems. Marcel Dekker, Inc., New York,
NY, 1999.
[81] M. L. Krasnov, G. I. Makarenko, and A. I. Iselev. Problems and
Exercises in Calculus of Variations. Mir Publishers, Moscow,
Russia, 1975.
[82] B. C. Kuo. Digital Control Systems, Second Edition. Holt, Rine
hart, and Winston, New York, NY, 1980.
[83] B. C. Kuo. Automatic Control Systems, Seventh Edition. Pren
tice Hall, Englewood Cliffs, N J, 1995.
[84] H. Kwakernaak and R. Sivan. Linear Optimal Control Systems.
WileyInterscience, New York, NY, 1972.
[85] J. L. Lagrange. Mechanique Analytique, 2 Volumes. Paris,
France, 1788.
[86] E. B. Lee and L. Markus. Foundations of Optimal Control The
ory. John Wiley & Sons, New York, NY, 1967.
[87] G. Leitmann. An Introduction to Optimal Control. McGraw Hill,
New York, NY, 1964.
[88] G. Leitmann. The Calculus of Variations and Optimal Control:
An Introduction. Plenum Publishing Co., New York, NY, 1981.
[89] F. L. Lewis. Optimal Control. John Wiley & Sons, New York,
NY, 1986.
[90] F. L. Lewis. Applied Optimal Control and Estimation: Digital
Design and Implementation. Prentice Hall, Englewood Cliffs,
NJ,1992.
[91] F. L. Lewis and V.L. Syrmos. Optimal Control, Second Edition.
John Wiley & Sons, New York, NY, 1995.
[92] A. Locatelli. Optimal Control: An Introduction. Birkhauser,
Boston, MA, 200l.
[93] D. G. Luenberger. Optimization by Vector Space Methods. John
Wiley and Sons, New York, NY, 1969.
[94] M. A. Lyapunov. The general problem of motion stability. Comm.
Soc. Math. Kharkov, 1892. Original paper in Russian. Trans
lated in French, Ann. Fac. Sci. Toulouse, 9, pp. 203474, (1907),
References
421
Reprinted in Ann. Math. Study, No. 17, Princeton University
Press, Princeton, NJ, (1949).
[95] A. G. J. MacFarlane and I. Postlethwaite. The generalized
Nyquist stability criterion and multivariable root loci. Interna
tional Journal Control, 25:81127, 1977.
[96] J. M. Maciejowski. Multivariable Feedback Design. Addison
Wesley Publishing Company, Reading, MA, 1989.
[97] J. Macki and A. Strauss. Introduction to Optimal Control Theory.
SpringerVerlag, New York, NY, 1982.
[98] E. J. McShane. On multipliers for Lagrange problems. American
Journal of Mathematics, LXI:809818, 1939.
[99] E. J. McShane. The calculus of variations from the beginning
through optimal control theory. SIAM Journal of Control and
Optimization, 27:916939, September 1989.
[100] J. S. Meditch. On the problem of optimal thrust programming for
a lunar soft landing. IEEE Transactions on Automatic Control,
AC9:477484, 1864.
[101] L. Meirovitch. Dynamcis and Control of Structures. John Wiley
& Sons, New York, NY, 1990.
[102] G. H. Meyer. Initial Value Methods for Boundary Value Prob
lems. Academic Press, New York, NY, 1973.
[103] A. A. Milyutin and N. P. Osmolovskii. Calculus of Variations and
Optimal Control. American Mathematical Society, Providence,
RI, 1997. Translations of Mathematical Monographs.
[104] I. H. Mufti, C. K. Chow, and F. T. Stock. Solution of ill
conditioned lienar twopoint, boundary value problem by Riccati
transformation. SIAM Review, 11:616619, 1969.
[105) D. S. Naidu and D. B. Price. Singular perturbations and time
scales in the design of digital flight control systems. Technical
Paper 2844, NASA Langley Research Center, Hampton, VA, De
cember 1988.
[106] I. P. Petrov. Variational Methods in Optimal Control Theory.
Academic Press, New York, NY, 1968.
[107] D. A. Pierre. Optimization Theory with Applications. John Wiley
& Sons, New York, NY, 1969.
[108] E. R. Pinch. Optimal Control and Calculus of Variations. Oxford
University Press, New York, NY, 1993.
[109] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F.
Mishchenko. The Mathematical Theory of Optimal Processes.
422 References
WileyInterscience, New York, NY, 1962. (Translated from
Russian).
[110] L. Pun. Introduction to Optimization Practice. John Wiley, New
York, NY, 1969.
[111] R. Pytlak. Numerical Methods for Optimal Control Problems with
State Constraints, volume 1707 of Lecture Notes in Mathematics.
SpringerVerlag, Berlin, Germany, 1999.
[112] W. F. Ramirez. Process Control and Identification. Academic
Press, San Diego, CA, 1994.
[113] W. T. Reid. Riccati Differential Equations. Academic Press, New
York, NY, 1972.
[114] C. J. Riccati. Animadversiones in aequations differentiales se
cundi gradus. Acta Eruditorum Lipsiae, 8:6723, 1724.
[115] H. H. Rosenbrock. ComputerAided Control System Design. Aca
demic Press, New York, 1974.
[116] E. P. Ryan. Optimal Relay and Saturating Control System Syn
thesis. Peter Peregrinus Ltd., Stevenage, UK, 1982.
[117] A. Saberi, P. Sannuti, and B. M. Chen. H2 Optimal Control.
Prentice Hall International (UK) Limited, London, UK, 1995.
[118] A. Sagan. Introduction to the Calculus of Variations. Dover
Publishers, Mineola, NY, 1992.
[119] A. P. Sage. Optimum Systems Control. Prentice Hall, Englewood
Cliffs, NJ, 1968.
[120] A. P. Sage and C. C. White III. Optimum Systems Control,
Second Edition. Prentice Hall, Englewood Cliffs, NJ, 1977.
[121] D. G. Schultz and J. L. Melsa. State Functions and Linear Con
trol Systems. McGraw Hill, New York, NY, 1967.
[122] A. Seierstad and K. Sydsaeter. Optimal Control Theory with
Economic Applications. Elsevier Science Publishing Co., New
York, NY, 1987.
[123] S. P. Sethi and G. L. Thompson. Optimal Control Theory: Appli
cations to Management Science and Economics: Second Edition.
Kluwer Academic Publishers, Hingham, MA, 2000.
[124] V. Sima. Algorithms for LinearQuadratic Optimization. Marcel
Dekker, Inc., New York, NY, 1996.
[125] G.M. Siouris. An Engineering Approach to Optimal Control and
Estimation Theory. John Wiley & Sons, New York, NY, 1996.
[126] D. R. Smith. Variational Methods in Optimization. Prentice Hall,
Englewood Cliffs, NJ, 1974.
References
423
[127] R. F. Stengel. Stochastic Optimal Control: Theory and Applica
tion. WileyInterscience, New York, NY, 1986.
(128] A. Stoorvogel. The Hoo Control Problem: A State Space Ap
proach. Prentice Hall, Englewood Cliffs, NJ, 1992.
[129] O. V. Stryk. Numerical solution of optimal control problems by
direct collocation. International Series of Numerical Mathemat
ics, 111:129143, 1993.
[130] O. V. Stryk. User's Guide for DIRCOL, A Direct Collocation
Method for the Numerical Solution of Optimal Control Problems,
1999.
[131] M. B. Subrahmanyam. Finite Horizon Hoo and Related Control
Problems. Birkhauser, Boston, MA, 1995.
[132] H. J. Sussmann and J. C. Willems. 300 years of optimal con
trol: from the brachistochrone to the maximum principle. IEEE
Control Systems Magazine, 17:3244, June 1997.
[133] K. L. Teo, C. J. Goh, and K. H. Wong. A Unified Computational
Approach to Optimal Control Problems. Longman Scientific and
Technical, Harlow, UK, 1991.
[134] I. Todhunter. A History of Progress of the Calculus of Variations
in the Nineteenth Century. Chelsea Publishing Company, New
York, NY, 1962.
[135] J. Tou. Modem Control Theory. McGraw Hill, New York, NY,
1964.
[136] J. L. 'Troutman. Variational Calculus and Optimal Control, Sec
ond Edition. SpringerVerlag, New York, NY, 1996.
[137] F. A. Valentine. The problem of Lagrange with differential in
equalities as added side conditions. In Contributions to the Cal
culus of Variations, pages 407408. Chicago University Press,
Chicago, IL, 1937.
[138] D. R. Vaughan. A nonrecursive algebraic solution for the discrete
Riccati equation. IEEE Transactions Automatic Control, AC
15:597599, October 1970.
[139] T. L. Vincent and W. J. Grantham. Nonlinear and Optimal
Control Systems. John Wiley & Sons, New York, NY, 1997.
[140] R. Vinter. Optimal Control. Birkhauser, Boston, MA, 2000.
[141] F. Y. M. Wan. Introduction to the Calculus of Variations and its
Applications. Chapman and Hall, London, 1994.
[142] J. Warga. Optimal Control of Differential and Functional Equa
tions. Academic Press, New York, NY, 1972.
424 References
[143] R. Weinstock. Calculus of Variations with Applications to
Physics and Engineering. Dover Publishing, Inc., New York, NY,
1974.
[144] N. Wiener. Cybernetics. Wiley, New York, NY, 1948.
[145J N. Wiener. Extrapolation, Interpolation, and Smoothing of Sta
tionary Time Series. Technology Press, Cambridge, MA, 1949.
[146] L. C. Young. Lectures on the Calculus of Variations and Optimal
Control Theory. W. B. Saunders Company, Philadelphia, PA,
1969.
[147] L. A. Zadeh and C. A. Desoer. Linear System Theory. McGraw
Hill Book Company, New York, 1963.
[148J G. Zames. Feedback and optimal sensitivity: model reference
transformation, multiplicative semi norms and approximate in
verses. IEEE Transactions Automatic Control, 26:301320, 1981.
[149] M. I. Zelikin. Control Theory and Optimization I: Homogeneous
Spaces and the Riccati Equation in the Calculus of Variations.
SpringerVerlag, Berlin, Germany, 2000.
[150] K. Zhou, J. C. Doyle, and K. Glover. Robust and Optimal Con
trol. Prentice Hall, Upper Saddle River, NJ, 1996.
Index
E fueloptimal, 328
Adjoint of a matrix, 370
Analytical solution
ARE:discretetime, 230
D RE continuoustime
MATLAB, 122
DRE: continuoustime , 119
DRE: discretetime , 225
ARE
continuoustime, 131, 134
analytical solution, 133
procedure summary table,
136
ARE:Algebraic Riccati equation
continuoustime, 131
ARE:continuoustime, 177
ARE: discretetime, 220, 240
alternate forms, 221
analytical solution, 230
Bangbang control, 301
Bangoffbang control, 333
Bellman, R., 279
Bernoulli, Johannes, 11
Bliss, Gilbert, 13
Bolza problem, 9
continuoustime, 57
procedure summary table,
69
Brachistochrone, 38
Brachistochrone problem, 11
Calculus of variations, 4, 11, 12
continuoustime, 19
fundamental theorem, 27
discretetime, 191
fundamental lemma
continuoustime, 31
Carthage, 11
Classical control, 1
Clebsch, Rudolph, 12
Closedloop optimal control
continuoustime, 107
implementation, 116
Closedloop optimal controller
continuoustime, 94, 116, 141,
143
Cofactor of a matrix, 370
Constrained controls, 249
Constrained problem, 9
Constrained states, 249
Constraints
controls, 6
states, 6
Control weighted matrix, 103
Controllability, 383
continuoustime, 117
Controllability Grammian, 383
Controllabili ty:continuoustime,
131
Cost function, 6
425
426
Deadzone function
fueloptimal control, 322
Definiteness, 377
Derivative of quadratic form, 378
Detectability, 383
Determinant of a matrix, 370
Deterministic, 4
DEZ function, 330
Difference Riccati equation (D RE),
209
Differential, 22
Differential Riccati equation
LQT:continuoustime, 155
Differential Riccati equation (DRE),
112
DRE:continuoustime, 109
Differentiation of a matrix, 372
Discretetime
freefinal state, 207
openloop optimal control,
235
optimal control, 191
Discretetime LQR, 222
Discretetime optimal control
freefinal time, 208
procedure summary table, 204,
208
DRE
continuoustime, 119, 131
HJB equation, 286
DRE:continuoustime
analytical solution, 119
computation, 115
DRE:Continuoustime
LQR, 109
D RE:continuoustime
LQT, 158
MATLAB, 122
steadystate solution, 132
transient solution, 132
Index
DRE:discretetime, 209, 210, 275
alternate forms, 211
analytical solution, 225
LQT, 236
Dynamic optimization, 4
Dynamic programming, 16, 249,
261
backward solution, 262
forward solution, 265
optimal control, 275
discretetime, 266, 272
Eigenvalues of a matrix, 375
Eigenvectors of a matrix, 375
Energyoptimal control, 334
implementation
closedloop, 345
openloop, 345
problem formulation, 335
Error weighted matrix, 103
Euler method, 276
Euler, Leonard, 11
EulerLagrange equation, 12,33
continuoustime, 19, 32, 33
different cases, 35
discussion, 33
discretetime, 195197, 201
EulerLagrange equation:discrete
time, 198
Final time
fixed, 10
free, 10
Fixedend point, 39
continuoustime, 27
discretetime, 195
Fixedend points
LQR: continuoustime , 169
Fixedendpoint, 48
Fixedendpoint problem
Index
continuoustime, 27
Freeend point, 57, 65
Freefinal state
discretetime, 207
Frequency domain
continuoustime
Kalman equation, 181
LQR: continuoustime , 179
LQR:discretetime, 239
Fueloptimal control, 315
bangoffbang control, 333
control law, 328
deadzone function, 322, 330
implementation
closedloop, 333
openloop, 333
LTI system, 328
minimum fuel, 325
Pontryagin Principle, 320, 329,
340
problem formulation, 319
problem solution, 319
SAT function, 343
saturation function, 343
SIMULINK, 334
state trajectories, 324
switching sequences, 326
Function, 19
optimum, 25
Functional, 11, 19
discretetime, 192
optimum, 27
Fundamental lemma
calculus of variations
continuoustime, 31
Fundamental theorem
calculus of variations
continuoustime, 27
Gain margin
427
LQR: continuoustime , 181,
184
Galileo, 11
Gamkrelidze, R. V., 14
Gauss, Karl, 12
General performance index
continuoustime, 118
H2optimal control, 15
Hoooptimal control, 15
Hamilton, William Rowan, 12
HamiltonJacobiBellman, 16
HamiltonJacobiBellman (HJB)
equation, see HJB equa
tion
Hamiltonian
continuoustime, 89, 91, 92
discretetime, 201
LQR:continuoustime, 105
fixedend points, 170
LQT:continuoustime, 153
openloop optimal control
continuoustime, 63
timeoptimal control, 297
Hamiltonian canonical system
continuoustime, 106
LQR: continuoustime , 106
Hamiltonian formalism
continuoustime, 88
Hamiltonian system
discretetime
LQT, 235
Hestenes, M. R., 13
Hilbert, David, 12
Historical note, 11
calculus of variations, 11
EulerLagrange equation, 33
Kalman, R. E., 131
Legendre condition, 40
optimal control, 13
428
Pontryagin Principle, 251.
Riccati equation, 117
Historical tour, 11
HJB equation, 277, 279, 286
DRE, 286
LQR, 283
summary table, 279
Increment, 20
Integral of squared error, 8
Integration of a matrix, 375
Inverse of a matrix, 371
Inverse Riccati equation
continuoustime, 172
Inverse Riccati transformation
continuoustime, 171
Jacobi, Carl Gustav, 12
Kalman equation
continuoustime
frequency domain, 181
Kalman filter, 14
Kalman gain
continuoustime, 112
discretetime, 211, 212, 275
Kalman, R.E., 14
Kneser, Adolf, 12
Lagrange
EulerLagrange equation, 33
Lagrange multiplier, 45
features, 47
Lagrange multiplier method, 42,
45, 48
Lagrange problem, 9, 12
Lagrange, JosephLouis, 11
Lagrangian
continuoustime, 42, 52, 60,
86, 91, 92
discretetime, 200
Lagrangian formalism
continuoustime, 87
Legendre condition, 40
Index
continuoustime, 85
Legendre, Andrien Marie, 12
Linear matrix equations, 381
Linear quadratic Gaussian, 14
Linear quadratic regulator (LQR) ,
see LQR
Linear quadratic tracking (LQT),
see LQT:continuoustime
Linear time invariant
continuoustime, 379
discretetime, 381
Linear time invariant system, 9
Linear time varying
continuoustime, 380
Loop gain matrix
discretetime, 242
LQR:continuoustime, 180
LQR
HJB equation, 283
LQR: continuoustime , 101, 104,
112
degree of stability, 175, 177
fixedend points, 169
frequency domain, 179
gain margin, 181, 184
general performance index,
118
Hamiltonian canonical sys
tern, 106
infinitetime, 125
loop gain matrix, 180
optimal control, 105
phase margin, 181, 186
procedure summary table, 113,
129
return difference, 180
Index
stability, 139
state and costate system, 106
LQR:discretetime, 207, 214
ARE, 220
frequency domain, 239
steadystate, 219, 222
LQT:continuoustime, 152
differential Riccati equation,
155
Hamiltonian, 153
infinitetime, 166
MATLAB, 162
optimal control, 153, 156
optimal cost, 156
procedure summary table, 159
Riccati transformation, 155
state and costate system, 153,
154
vector differential equation,
155, 158
LQT:discretetime, 232, 238
Lyapunov, A. M., 3
MATLAB
difference Riccati equation,
401
analytical solution
DRE:continuoustime, 122
ARE:continuoustime, 138
Differential Riccati Equation,
385
discretetime
tracking problem, 409
DRE:continuoustime, 122,
385
D RE:discretetime
analytical solution, 231
EulerLagrange equation
continuoustime, 55
LQR:discretetime, 216
429
LQT:continuoustime, 162
LQT:discretetime, 237
openloop optimal control
continuoustime, 72, 75, 79,
82
Matrix, 367
Matrix, addition, 368
Matrix, adjoint, 370
Matrix, cofactor, 370
Matrix, differentiation, 372
Matrix, eigenvalues, 375
Matrix, eigenvectors, 375
Matrix, integration, 375
Matrix, inverse, 371
Matrix, multiplication, 368
Matrix, norm, 369
Matrix, powers, 372
Matrix, rank, 371
Matrix, singular, 371
Matrix, subtraction, 368
Matrix, trace, 375
Matrix, transpose, 369
Mayer problem, 9, 12
Mayer, Adolph, 12
McShane, E. J., 13
MIMO, 2
Modern control, 2
Negative feedback optimal con
trol, 108
Newton, Isaac, 11
Norm of a matrix, 369
Norm of a vector, 366
Normal fueloptimal control, 331
Normal timeoptimal control, 299,
301
NTOC
normal timeoptimal control,
299
Nyquist criterion
430
continuoustime, 184
Nyquist path
continuoustime, 184
Observability, 383
Observability Grammian, 383
Openloop optimal control
discretetime, 202
Openloop optimal control
discretetime, 203, 207, 235
Openloop optimal controller
continuoustime, 94, 141, 142
Optimal control, 6
automobile suspension, 99
discretetime, 191, 199
dynamic programming, 266,
272
discretetime:variational cal
culus, 191
fuel optimal, 7
LQR, 108
LQR:continuoustime, 105,
108
fixedend points, 171
LQT:continuoustime, 153,
156, 158
minimum energy, 7
openloop
continuoustime, 62, 64, 94
discretetime, 203
state constraints, 351, 358
time optimal, 7
Optimal control problem, 9
Optimal control system
automobile, 150, 364
chemical, 364
inverted pendulum, 17, 99,
150
liquid level, 17, 99, 150, 189,
247, 290, 363
Index
mechanical, 99,364
mechanical, 18, 150, 189,
247
speed control, 17, 99, 150,
189, 246, 290, 364
Optimal control systems
chemical, 18, 99, 150, 189
Optimal cost
continuoustime, 113
LQR:infinitetime, 134
discretetime, 213, 222
LQR:finitetime, 113
LQT:continuoustime, 156
Optimal cost : continuoustime
LQR:finitetime, 128
Optimal feedback control, 104
Optimal performance index
continuoustime, 110, 115, 133
Output regulator system
continuousitem, 102
Penalty function method, 352
Performance index, 3, 6
discrete-time, 199
fuel optimal, 7
minimum energy, 7
minimum fuel, 7
minimum time, 7
time optimal, 7
time-optimal control, 297
Phase margin
LQR: continuous-time, 181, 186
Plant, 6
PMP, see Pontryagin Minimum Principle
Pontryagin Maximum Principle, 13
continuous-time, 92
Pontryagin Minimum Principle, 251
PMP, 251
Pontryagin Principle, 249, 252, 360
continuous-time, 70, 92
fuel-optimal control, 320, 329, 340
summary, 258
time-optimal control, 297
Pontryagin principle: time-optimal control, 307
Pontryagin, L. S., 13, 251
Poppus, 11
Positive definite matrix, 8
Positive semidefinite matrix, 8
Power of a matrix, 372
Principle of optimality, 261
Procedure summary table
ARE: continuous-time, 136
Bolza problem
continuous-time, 69
discrete-time
fixed-end points, 204
LQR: continuous-time, 113, 129
LQR: discrete-time, 214
steady-state, 222
LQT: continuous-time, 159
LQT: discrete-time, 238
prescribed degree of stability
continuous-time, 178
Quadratic form, 8, 376
Quadratic form, derivative, 378
Rank of a matrix, 371
Reachability, 383
Relay, 298
Return difference matrix
continuous-time, 180
discrete-time, 242
Riccati Coefficient
continuous-time, 109
Riccati coefficient, 132
continuous-time, 114
Riccati transformation
LQR, 109
LQT: continuous-time, 155
Riccati transformation: continuous-time
inverse, 171
Riccati, C.J., 14
Sampler, 276
Sat function, 343
Saturation function, 343
Second variation
continuous-time, 39
Signum function, 298
fuel-optimal control, 321
time-optimal control, 298
SIMULINK
fuel-optimal control, 334
time-optimal control, 315
Singular fuel-optimal control, 331
Singular matrix, 371
Singular time-optimal control, 300
Singular values, 376
SISO, 1
Slack variable method, 358
Square matrix, 367
Stabilizability, 383
State and costate system
continuous-time, 106
discrete-time, 203
LQT: continuous-time, 153, 154
LQT: discrete-time, 235
time-optimal control, 297
State and costates system
LQR: continuous-time
fixed-end points, 171
State constraints
optimal control, 358
penalty function method, 352
slack variable method, 358
Valentine's method, 358
State regulator system
continuous-time, 102
State space, 379
State transition matrix
continuous-time, 380
discrete-time, 382
Static Optimization, 4
Steady-state LQR: discrete-time, 219
STOC
singular time-optimal control, 300
Stochastic, 4
Switch curve, 311
Symmetric matrix, 369
Taylor series, 22, 24, 49, 61, 375
Taylor series: discrete-time, 193
Terminal control problem, 8
Terminal cost
discrete-time, 197, 199
Terminal cost function, 8
Terminal cost problem
continuous-time, 57
Terminal cost weighted matrix, 103
Time optimal, 7
Time optimal control
signum function, 298
Time-optimal control
closed-loop structure, 305
control law, 314
double integral system, 305
Hamiltonian, 297
implementation, 314
LTI system, 295
minimum time, 314
open-loop structure, 303
performance index, 297
phase plane, 310
Pontryagin Principle, 297
problem formulation, 295
signum function, 298
SIMULINK, 315
state and costate system, 297
state trajectories, 309
switch curve, 311
Time-optimal control: problem solution, 296
TPBVP
continuous-time, 107
state constraints, 360
TPBVP: continuous-time, 89, 91, 93
LQR: continuous-time
fixed-end points, 171
variational calculus, 34
TPBVP: discrete-time, 203, 207
Trace of a matrix, 375
Tracking system, 103
Transpose, 366
Transpose of a matrix, 369
Two-point boundary value problem, see TPBVP: continuous-time
Tyrian princess Dido, 11
Unconstrained problem, 9
Unity matrix, 367
Valentine's method, 358
Variation, 22
Variational calculus
discrete-time, 191
variational calculus
continuous-time, 19
Variational problem
continuous-time, 27
Vector, 365
Vector algebra, 365
Vector difference equation, 236
Vector differential equation
LQT: continuous-time, 155, 158
Vector norm, 366
Weierstrass, Karl, 12
Weighted matrix
control, 103
error, 103
terminal cost, 103
Wiener filter, 13
Wiener, N., 13
Zames, G., 15
Zenodorus, 11
Zero-order hold, 276
Visit the CRC Press Web site at www.crcpress.com
© 2003 by CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 0-8493-0892-5
Library of Congress Card Number 2002067415
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
"Because the shape of the whole universe is most perfect and, in fact, designed by the wisest Creator, nothing in all of the world will occur in which no maximum or minimum rule is somehow shining forth."
Leonhard Euler, 1744
Dedication
My deceased parents who shaped my life
Desineni Rama Naidu
Desineni Subbamma
and
My teacher who shaped my education
Buggapati Audi Chetty
Preface
Many systems, physical, chemical, and economical, can be modeled by mathematical relations, such as deterministic and/or stochastic differential and/or difference equations. These systems then change with time or any other independent variable according to the dynamical relations. It is possible to steer these systems from one state to another state by the application of some type of external inputs or controls. If this can be done at all, there may be different ways of doing the same task, and among them there may be one "best" way. This best way can be minimum time to go from one state to another state, or maximum thrust developed by a rocket engine. The input given to the system corresponding to this best situation is called "optimal" control. The measure of "best" way or performance is called "performance index" or "cost function." Thus, we have an "optimal control system" when a system is controlled in an optimum way satisfying a given performance index.
The theory of optimal control systems has enjoyed a flourishing period for nearly two decades after the dawn of the so-called "modern" control theory around the 1960s. The interest in theoretical and practical aspects of the subject has been sustained due to its applications to such diverse fields as electrical power, aerospace, chemical plants, economics, medicine, biology, and ecology.
Aim and Scope
In this book we are concerned essentially with the control of physical systems which are "dynamic" and hence described by ordinary differential or difference equations, in contrast to "static" systems, which are characterized by algebraic equations. Further, our focus is on "deterministic" systems only. The development of optimal control theory in the sixties revolved around the "maximum principle" proposed by the Soviet mathematician L. S. Pontryagin and his colleagues, whose work was published in English in 1962. Further contributions are due to R. E. Kalman of the United States. Since then, many excellent books on optimal control theory of varying levels of sophistication have been published. This book is written keeping the "student in mind" and is intended to provide the student a simplified treatment of the subject, with an appropriate dose of mathematics. Another feature of this book is to assemble all the topics which can be covered in a one-semester class. A special feature of this book is the presentation of the procedures in the form of a summary table designed in terms of statement of the problem and a step-by-step solution of the problem. Further, MATLAB© and SIMULINK©¹, including Control System and Symbolic Math Toolboxes, have been incorporated into the book. The book is ideally suited for a one-semester, second level, graduate course in control systems and optimization.
Background and Audience
This is a second level graduate textbook, and as such the background material required for using this book is a first course on control systems, state space analysis, or linear systems theory. It is suggested that the student review the material in Appendices A and B given at the end of the book. This book is aimed at graduate students in Electrical, Mechanical, Chemical, and Aerospace Engineering and Applied Mathematics. It can also be used by professional scientists and engineers working in a variety of industries and research organizations.
Acknowledgments
This book has grown out of my lecture notes prepared over many years of teaching at the Indian Institute of Technology (IIT), Kharagpur, and Idaho State University (ISU), Pocatello, Idaho. As such, I am indebted to many of my teachers and students. In recent years at ISU, there are many people whom I would like to thank for their encouragement and cooperation. First of all, I would like to thank the late Dean Hary Charyulu for his encouragement of graduate work and research, which kept me "live" in the area of optimal control. Also, I would like to mention a special person, Kevin Moore, whose encouragement and cooperation made my stay at ISU a very pleasant and scholarly productive one for many years during 1990-98. During the last few years, Dean Kunze and Associate Dean Stuffie have been of great help in providing the right atmosphere for teaching and research work.
¹ MATLAB and SIMULINK are registered trademarks of The Mathworks, Inc., Natick, MA, USA.
Next, my students over the years were my best critics in providing many helpful suggestions. Among the many, special mention must be made of Martin Murillo, Yoshiko Imura, and Keith Fisher, who made several suggestions on my manuscript. In particular, Craig Rieger (of Idaho National Engineering and Environmental Laboratory (INEEL)) deserves special mention for having infinite patience in writing and testing programs in MATLAB© to obtain analytical solutions to matrix Riccati differential and difference equations. The camera-ready copy of this book was prepared by the author using LaTeX of the PCTeX32² Version 4.0. The figures were drawn using CorelDRAW³ and exported into the LaTeX document. Several people at the publishing company CRC Press deserve mention. Among them, special mention must be made of Nora Konopka, Acquisition Editor, Electrical Engineering, for her interest, understanding, and patience with me to see this book to completion. Also, thanks are due to Michael Buso, Michelle Reyes, Helena Redshaw, and Judith Simon Kamin. I would like to make a special mention of Sean Davey, who helped me in many issues regarding LaTeX. Any corrections and suggestions are welcome via email to naiduds@isu.edu. Finally, it is my pleasant duty to thank my wife, Sita, and my daughters, Radhika and Kiranmai, who have been a great source of encouragement and cooperation throughout my academic life.
Desineni Subbaram Naidu
Pocatello, Idaho
June 2002
² PCTeX is a registered trademark of Personal TeX, Inc., Mill Valley, CA.
³ CorelDRAW is a registered trademark of Corel Corporation or Corel Corporation Limited.
ACKNOWLEDGMENTS
The permissions given by
1. Prentice Hall for D. E. Kirk, Optimal Control Theory: An Introduction, Prentice Hall, Englewood Cliffs, NJ, 1970,
2. John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons, Inc., New York, NY, 1986,
3. McGraw-Hill for M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and Its Applications, McGraw-Hill Book Company, New York, NY, 1966, and
4. Springer-Verlag for H. H. Goldstine, A History of the Calculus of Variations, Springer-Verlag, New York, NY, 1980,
are hereby acknowledged.
AUTHOR'S BIOGRAPHY
Desineni "Subbaram" Naidu received his B.E. degree in Electrical Engineering from Sri Venkateswara University, Tirupati, India, and M.Tech. and Ph.D. degrees in Control Systems Engineering from the Indian Institute of Technology (IIT), Kharagpur, India. He held various positions with the Department of Electrical Engineering at IIT. Dr. Naidu was a recipient of a Senior National Research Council (NRC) Associateship of the National Academy of Sciences, Washington, DC, tenable at NASA Langley Research Center, Hampton, Virginia, during 1985-87 and at the U.S. Air Force Research Laboratory (AFRL) at Wright-Patterson Air Force Base (WPAFB), Ohio, during 1998-99. During 1987-90, he was an adjunct faculty member in the Department of Electrical and Computer Engineering at Old Dominion University, Norfolk, Virginia. Since August 1990, Dr. Naidu has been a professor at Idaho State University. At present he is Director of the Measurement and Control Engineering Research Center; Coordinator, Electrical Engineering program; and Associate Dean of Graduate Studies in the College of Engineering, Idaho State University, Pocatello, Idaho.
Dr. Naidu has over 150 publications including a research monograph, Singular Perturbation Analysis of Discrete Control Systems, Lecture Notes in Mathematics, 1985; a book, Singular Perturbation Methodology in Control Systems, IEE Control Engineering Series, 1988; and a research monograph entitled, Aeroassisted Orbital Transfer: Guidance and Control Strategies, Lecture Notes in Control and Information Sciences, 1994.
Dr. Naidu is (or has been) a member of the Editorial Boards of the IEEE Transaction on Automatic Control (1993-99), the International Journal of Robust and Nonlinear Control (1996-present), the International Journal of Control-Theory and Advanced Technology (C-TAT) (1992-1996), and a member of the Editorial Advisory Board of Mechatronics: The Science of Intelligent Machines, an International Journal (1992-present).
Professor Naidu is an elected Fellow of The Institute of Electrical and Electronics Engineers (IEEE), a Fellow of World Innovation Foundation (WIF), an Associate Fellow of the American Institute of Aeronautics and Astronautics (AIAA), and a member of several other organizations such as SIAM, ASEE, etc. Dr. Naidu was a recipient of the Idaho State University Outstanding Researcher Award for 1993-94 and 1994-95 and the Distinguished Researcher Award for 1994-95. Professor Naidu's biography is listed (multiple years) in Who's Who among America's Teachers, the Silver Anniversary 25th Edition of Who's Who in the West, Who's Who in Technology, and The International Directory of Distinguished Leadership.
Contents
1 Introduction
  1.1 Classical and Modern Control
  1.2 Optimization
  1.3 Optimal Control
    1.3.1 Plant
    1.3.2 Performance Index
    1.3.3 Constraints
    1.3.4 Formal Statement of Optimal Control System
  1.4 Historical Tour
    1.4.1 Calculus of Variations
    1.4.2 Optimal Control Theory
  1.5 About This Book
  1.6 Chapter Overview
  1.7 Problems
2 Calculus of Variations and Optimal Control
  2.1 Basic Concepts
    2.1.1 Function and Functional
    2.1.2 Increment
    2.1.3 Differential and Variation
  2.2 Optimum of a Function and a Functional
  2.3 The Basic Variational Problem
    2.3.1 Fixed-End Time and Fixed-End State System
    2.3.2 Discussion on Euler-Lagrange Equation
    2.3.3 Different Cases for Euler-Lagrange Equation
  2.4 The Second Variation
  2.5 Extrema of Functions with Conditions
    2.5.1 Direct Method
    2.5.2 Lagrange Multiplier Method
  2.6 Extrema of Functionals with Conditions
  2.7 Variational Approach to Optimal Control Systems
    2.7.1 Terminal Cost Problem
    2.7.2 Different Types of Systems
    2.7.3 Sufficient Condition
    2.7.4 Summary of Pontryagin Procedure
  2.8 Summary of Variational Approach
    2.8.1 Stage I: Optimization of a Functional
    2.8.2 Stage II: Optimization of a Functional with Condition
    2.8.3 Stage III: Optimal Control System with Lagrangian Formalism
    2.8.4 Stage IV: Optimal Control System with Hamiltonian Formalism: Pontryagin Principle
    2.8.5 Salient Features
  2.9 Problems
3 Linear Quadratic Optimal Control Systems I
  3.1 Problem Formulation
  3.2 Finite-Time Linear Quadratic Regulator
    3.2.1 Symmetric Property of the Riccati Coefficient Matrix
    3.2.2 Optimal Control
    3.2.3 Optimal Performance Index
    3.2.4 Finite-Time Linear Quadratic Regulator: Time-Varying Case: Summary
    3.2.5 Salient Features
    3.2.6 LQR System for General Performance Index
  3.3 Analytical Solution to the Matrix Differential Riccati Equation
    3.3.1 MATLAB© Implementation of Analytical Solution to Matrix DRE
  3.4 Infinite-Time LQR System I
    3.4.1 Infinite-Time Linear Quadratic Regulator: Time-Varying Case: Summary
  3.5 Infinite-Time LQR System II
    3.5.1 Meaningful Interpretation of Riccati Coefficient
    3.5.2 Analytical Solution of the Algebraic Riccati Equation
    3.5.3 Infinite-Interval Regulator System: Time-Invariant Case: Summary
    3.5.4 Stability Issues of Time-Invariant Regulator
    3.5.5 Equivalence of Open-Loop and Closed-Loop Optimal Controls
  3.6 Notes and Discussion
  3.7 Problems
4 Linear Quadratic Optimal Control Systems II
  4.1 Linear Quadratic Tracking System: Finite-Time Case
    4.1.1 Linear Quadratic Tracking System: Summary
    4.1.2 Salient Features of Tracking System
  4.2 LQT System: Infinite-Time Case
  4.3 Fixed-End-Point Regulator System
  4.4 LQR with a Specified Degree of Stability
    4.4.1 Regulator System with Prescribed Degree of Stability: Summary
  4.5 Frequency-Domain Interpretation
    4.5.1 Gain Margin and Phase Margin
  4.6 Problems
5 Discrete-Time Optimal Control Systems
  5.1 Variational Calculus for Discrete-Time Systems
    5.1.1 Extremization of a Functional
    5.1.2 Functional with Terminal Cost
  5.2 Discrete-Time Optimal Control Systems
    5.2.1 Fixed-Final State and Open-Loop Optimal Control
    5.2.2 Free-Final State and Open-Loop Optimal Control
  5.3 Discrete-Time Linear State Regulator System
    5.3.1 Closed-Loop Optimal Control: Matrix Difference Riccati Equation
    5.3.2 Optimal Cost Function
  5.4 Steady-State Regulator System
    5.4.1 Analytical Solution to the Riccati Equation
  5.5 Discrete-Time Linear Quadratic Tracking System
  5.6 Frequency-Domain Interpretation
  5.7 Problems
6 Pontryagin Minimum Principle
  6.1 Constrained System
  6.2 Pontryagin Minimum Principle
    6.2.1 Summary of Pontryagin Principle
    6.2.2 Additional Necessary Conditions
  6.3 Dynamic Programming
    6.3.1 Principle of Optimality
    6.3.2 Optimal Control Using Dynamic Programming
    6.3.3 Optimal Control of Discrete-Time Systems
    6.3.4 Optimal Control of Continuous-Time Systems
  6.4 The Hamilton-Jacobi-Bellman Equation
  6.5 LQR System Using HJB Equation
  6.6 Notes and Discussion
7 Constrained Optimal Control Systems
  7.1 Constrained Optimal Control
    7.1.1 Time-Optimal Control of LTI System
    7.1.2 Problem Formulation and Statement
    7.1.3 Solution of the TOC System
    7.1.4 Structure of Time-Optimal Control System
  7.2 TOC of a Double Integral System
    7.2.1 Problem Formulation and Statement
    7.2.2 Problem Solution
    7.2.3 Engineering Implementation of Control Law
    7.2.4 SIMULINK© Implementation of Control Law
  7.3 Fuel-Optimal Control Systems
    7.3.1 Fuel-Optimal Control of a Double Integral System
    7.3.2 Problem Formulation and Statement
    7.3.3 Problem Solution
  7.4 Minimum-Fuel System: LTI System
    7.4.1 Problem Statement
    7.4.2 Problem Solution
    7.4.3 SIMULINK© Implementation of Control Law
  7.5 Energy-Optimal Control Systems
    7.5.1 Problem Formulation and Statement
    7.5.2 Problem Solution
  7.6 Optimal Control Systems with State Constraints
    7.6.1 Penalty Function Method
    7.6.2 Slack Variable Method
  7.7 Problems
Appendix A: Vectors and Matrices
  A.1 Vectors
  A.2 Matrices
  A.3 Quadratic Forms and Definiteness
Appendix B: State Space Analysis
  B.1 State Space Form for Continuous-Time Systems
  B.2 Linear Matrix Equations
  B.3 State Space Form for Discrete-Time Systems
  B.4 Controllability and Observability
  B.5 Stabilizability, Reachability and Detectability
Appendix C: MATLAB Files
  C.1 MATLAB© for Matrix Differential Riccati Equation
    C.1.1 MATLAB File lqrnss.m
    C.1.2 MATLAB File lqrnssf.m
  C.2 MATLAB© for Continuous-Time Tracking System
    C.2.1 MATLAB File for Example 4.1 (example4_1.m)
    C.2.2 MATLAB File for Example 4.1 (example4_1p.m)
    C.2.3 MATLAB File for Example 4.1 (example4_1g.m)
    C.2.4 MATLAB File for Example 4.1 (example4_1x.m)
    C.2.5 MATLAB File for Example 4.2 (example4_1.m)
    C.2.6 MATLAB File for Example 4.2 (example4_2p.m)
    C.2.7 MATLAB File for Example 4.2 (example4_2g.m)
    C.2.8 MATLAB File for Example 4.2 (example4_2x.m)
  C.3 MATLAB© for Matrix Difference Riccati Equation
    C.3.1 MATLAB File lqrdnss.m
  C.4 MATLAB© for Discrete-Time Tracking System
References
Index
List of Figures
1.1 Classical Control Configuration
1.2 Modern Control Configuration
1.3 Components of a Modern Control System
1.4 Overview of Optimization
1.5 Optimal Control Problem
2.1 Increment Δf, Differential df, and Derivative ḟ of a Function f(t)
2.2 Increment ΔJ and the First Variation δJ of the Functional J
2.3 (a) Minimum and (b) Maximum of a Function f(t)
2.4 Fixed-End Time and Fixed-End State System
2.5 A Nonzero g(t) and an Arbitrary δx(t)
2.6 Arc Length
2.7 Free-Final Time and Free-Final State System
2.8 Final-Point Condition with a Moving Boundary B(t)
2.9 Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) Free-Final Time and Fixed-Final State System, (c) Fixed-Final Time and Free-Final State System, (d) Free-Final Time and Free-Final State System
2.10 Optimal Controller for Example 2.12
2.11 Optimal Control and States for Example 2.12
2.12 Optimal Control and States for Example 2.13
2.13 Optimal Control and States for Example 2.14
2.14 Optimal Control and States for Example 2.15
2.15 Open-Loop Optimal Control
2.16 Closed-Loop Optimal Control
3.1 State and Costate System
3.2 Closed-Loop Optimal Control Implementation
3.3 Riccati Coefficients for Example 3.1
3.4 Closed-Loop Optimal Control System for Example 3.1
3.5 Optimal States for Example 3.1
3.6 Optimal Control for Example 3.1
3.7 Interpretation of the Constant Matrix P
3.8 Implementation of the Closed-Loop Optimal Control: Infinite Final Time
3.9 Closed-Loop Optimal Control System
3.10 Optimal States for Example 3.2
3.11 Optimal Control for Example 3.2
3.12 (a) Open-Loop Optimal Controller (OLOC) and (b) Closed-Loop Optimal Controller (CLOC)
4.1 Implementation of the Optimal Tracking System
4.2 Riccati Coefficients for Example 4.1
4.3 Coefficients g1(t) and g2(t) for Example 4.1
4.4 Optimal States for Example 4.1
4.5 Optimal Control for Example 4.1
4.6 Riccati Coefficients for Example 4.2
4.7 Coefficients g1(t) and g2(t) for Example 4.2
4.8 Optimal Control and States for Example 4.2
4.9 Optimal Control and States for Example 4.2
4.10 Optimal Closed-Loop Control in Frequency Domain
4.11 Closed-Loop Optimal Control System with Unity Feedback
4.12 Nyquist Plot of Go(jω)
4.13 Intersection of Unit Circles Centered at Origin and -1 + j0
5.1 State and Costate System
5.2 Closed-Loop Optimal Controller for Linear Discrete-Time Regulator
5.3 Riccati Coefficients for Example 5.3
5.4 Optimal Control and States for Example 5.3
5.5 Optimal Control and States for Example 5.3
5.6 Closed-Loop Optimal Control for Discrete-Time Steady-State Regulator System
5.7 Implementation of Optimal Control for Example 5.4
5.8 Implementation of Optimal Control for Example 5.4
5.9 Riccati Coefficients for Example 5.5
5.10 Optimal States for Example 5.5
5.11 Optimal Control for Example 5.5
5.12 Implementation of Discrete-Time Optimal Tracker
5.13 Riccati Coefficients for Example 5.6
5.14 Coefficients g1(t) and g2(t) for Example 5.6
5.15 Optimal States for Example 5.6
5.16 Optimal Control for Example 5.6
5.17 Closed-Loop Discrete-Time Optimal Control System
6.1 (a) An Optimal Control Function Constrained by a Boundary (b) A Control Variation for Which δu(t) Is Not Admissible
6.2 Illustration of Constrained (Admissible) Controls
6.3 Optimal Path from A to B
6.4 A Multistage Decision Process
6.5 A Multistage Decision Process: Backward Solution
6.6 A Multistage Decision Process: Forward Solution
6.7 Dynamic Programming Framework of Optimal State Feedback Control
6.8 Optimal Path from A to B
7.1 Signum Function
7.2 Time-Optimal Control
7.3 Normal Time-Optimal Control System
7.4 Singular Time-Optimal Control System
7.5 Open-Loop Structure for Time-Optimal Control System
7.6 Closed-Loop Structure for Time-Optimal Control System
7.7 Possible Costates and the Corresponding Controls
7.8 Phase Plane Trajectories for u = +1 (dashed lines) and u = -1 (dotted lines)
7.9 Switch Curve for Double Integral Time-Optimal Control System
7.10 Various Trajectories Generated by Four Possible Control Sequences
7.11 Closed-Loop Implementation of Time-Optimal Control Law
7.12 SIMULINK© Implementation of Time-Optimal Control Law
7.13 Phase-Plane Trajectory for γ+: Initial State (2,-2) and Final State (0,0)
. . . . . .. . . .2) and Final State (0... . .. . .28 ClosedLoop Implementation of FuelOptimal Control System . . . . . . . . .25 Normal FuelOptimal Control System .. . . .. . . . .31 PhasePlane Trajectory for "Y: Initial State (2. 7. . . . .38 OpenLoop Implementation of EnergyOptimal Control System .. . . . . . 7. . 7. . . . . . . 7. .34 PhasePlane Trajectory for R 2 : Initial State (1. . . . . . . . . .. 7.37 EnergyOptimal Control . .16 PhasePlane Trajectory for R_: Initial State (1. . . .. . . . .1) and Final State (0. 7. 7.. . .. . . . . . . . . . 7. . . . . . . . . . . .. . . . . .17 Relations Between A2(t) and lu*(t)1 + u*(t)A2(t) .. 7. . ... . . . . . . . . . . . . . . .0) . . . . . . . . . . .xxii 7. .0) .1) and Final State (0. . . . . . . . .. . .20 Switching Curve for a Double Integral FuelOptimal Control System . . .. . . .. .. . . . . . .. . .1) and Final State (0. . . . .. . . . . . . .32 PhasePlane Trajectory for R 1 : Initial State (1. . . . . . .15 PhasePlane Trajectory for R+: Initial State (1... . . . . . . . .0) . . . . . . . . .33 PhasePlane Trajectory for R3: Initial State (1. .2) and Final State (0. . 7. . . . .35 PhasePlane Trajectory for R4: Initial State (1. . . . . . .. . . . . . . . . . . .. . . . .. . . . . . . .. . . .19 FuelOptimal Control . . . . . . . .. . . ... 7. .22 FuelOptimal Control Sequences . . . . .. . . . . . .0) . . . . .. 7.. . .0) . . . ... . .. . . . . 7.5. . .0) .27 OpenLoop Implementation of FuelOptimal Control System . . . . . . . . . .. . . . . . . . . . . . .. .. .. . . . . .1) and Final State (0. .0) . . . . . . . . 7. . . . 7. .29 SIMULINK@ Implementation of FuelOptimal Control Law . .26 Singular FuelOptimal Control System . . . . . . .. 7. . . . .36 Saturation Function . . . 7.. . . . . . . . . .. . 7. . . . . . . ... . . . . . . 7. .23 EFuelOptimal Control. . 7. . . . . . . . . .0) . . . . . . . . . . . . ..18 DeadZone Function. . . . . . . . .1) and Final State (0. . . . . . .. . . .. . . . . .. . . . . .. . . 
. . . . . . . .2) and Final State (0..0) . . . 7.1) and Final State (0. . . . . .. . .. . . . . . .. . . . .. . .24 Optimal Control as DeadZone Function . . 7. . . . . . . . . .. .. . . . . .. . 7. . . . .. . . .14 PhasePlane Trajectory for 7: Initial State (2.. . . . . . . . . . . . . . . . . . . . . 317 318 318 322 323 323 324 325 326 327 330 331 332 333 334 334 336 336 337 337 338 338 343 344 345 . . . . . . . . . . 7. .. . . . . . . . . .. . . .. . .21 PhasePlane Trajectories for u(t) = 0 . . . . .. .. .. . . . .. . . . .5. . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . .. . . . . . . .30 PhasePlane Trajectory for "Y+: Initial State (2.
.. ..42 Implementation of EnergyOptimal Control Law .. . ..XXlll 7. . .. . . . 346 348 349 351 358 . .... . .. . .. . . 7.... . . .. . . .. ... .43 Relation between Optimal Control u*(t) and Optimal Costate A2 (t) . ..... ... ..5A*(t) . . . .. . ... . 7... . .. .. .40 Relation between Optimal Control u*(t) vs (a) q*(t) and (b) 0. . ... .... . . . ... 7. . .. .. ..39 ClosedLoop Implementation of EnergyOptimal Control System .. . ..41 Possible Solutions of Optimal Costate A*(t) . . . ... .. . 7. .
.
List of Tables

2.1 Procedure Summary of Pontryagin Principle for Bolza Problem
3.1 Procedure Summary of Finite-Time Linear Quadratic Regulator System: Time-Varying Case
3.2 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying Case
3.3 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant Case
4.1 Procedure Summary of Linear Quadratic Tracking System
4.2 Procedure Summary of Regulator System with Prescribed Degree of Stability
5.1 Procedure Summary of Discrete-Time Optimal Control System: Fixed-End Points Condition
5.2 Procedure Summary for Discrete-Time Optimal Control System: Free-Final Point Condition
5.3 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System
5.4 Procedure Summary of Discrete-Time, Linear Quadratic Regulator System: Steady-State Condition
5.5 Procedure Summary of Discrete-Time Linear Quadratic Tracking System
6.1 Summary of Pontryagin Minimum Principle
6.2 Computation of Cost during the Last Stage k = 2
6.3 Computation of Cost during the Stage k = 1, 0
6.4 Procedure Summary of Hamilton-Jacobi-Bellman (HJB) Approach
7.1 Procedure Summary of Optimal Control Systems with State Constraints
Chapter 1

Introduction

In this first chapter, we introduce the ideas behind optimization and optimal control and provide a brief history of calculus of variations and optimal control. Also, a brief summary of chapter contents is presented.

1.1 Classical and Modern Control

The classical (conventional) control theory, concerned with single-input, single-output (SISO) systems, is mainly based on Laplace transform theory and its use in system representation in block diagram form.

[Figure 1.1 Classical Control Configuration. Blocks: compensator Gc(s), plant Gp(s), feedback H(s). Signals: reference input R(s), error signal E(s), control input U(s), output Y(s).]

From Figure 1.1, we see that

$$\frac{Y(s)}{R(s)} = \frac{G(s)}{1 + G(s)H(s)} \tag{1.1.1}$$
where s is the Laplace variable and we used

$$G(s) = G_c(s)G_p(s). \tag{1.1.2}$$

Note that

1. the input u(t) to the plant is determined by the error e(t) and the compensator, and
2. not all the variables are readily available for feedback. In most cases only one output variable is available for feedback.

The modern control theory, concerned with multiple inputs and multiple outputs (MIMO), is based on state variable representation in terms of a set of first order differential (or difference) equations. Here, the system (plant) is characterized by state variables, say, in linear, time-invariant form as

$$\dot{x}(t) = Ax(t) + Bu(t) \tag{1.1.3}$$
$$y(t) = Cx(t) + Du(t) \tag{1.1.4}$$

where the dot denotes differentiation with respect to (w.r.t.) t; x(t), u(t), and y(t) are n, r, and m dimensional state, control, and output vectors, respectively; and A is the n x n state matrix, B is the n x r input matrix, C is the m x n output matrix, and D is the m x r transfer matrix. Similarly, a nonlinear system is characterized by

$$\dot{x}(t) = f(x(t), u(t), t) \tag{1.1.5}$$
$$y(t) = g(x(t), u(t), t). \tag{1.1.6}$$

The modern theory dictates that all the state variables should be fed back after suitable weighting. We see from Figure 1.2 that in the modern control configuration,

1. the input u(t) is determined by the controller (consisting of error detector and compensator) driven by system states x(t) and reference signal r(t),
2. all or most of the state variables are available for control, and
3. it depends on well-established matrix theory, which is amenable to large scale computer simulation.
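The linear, time-invariant model (1.1.3) and (1.1.4) can be stepped forward in time on a computer. The following is only a minimal sketch, assuming a forward-Euler integration with a fixed step; the function name `simulate_lti`, the step size, and the double-integrator example are illustrative choices, not taken from the text.

```python
def simulate_lti(A, B, C, D, x0, u, t_final, dt=1e-3):
    """Integrate x' = A x + B u(t) with forward Euler and return (x, y) at t_final.

    A, B, C, D are nested lists (n x n, n x r, m x n, m x r); u is a function
    of time returning a list of r control values. All names are hypothetical.
    """
    n = len(x0)
    x = list(x0)
    t = 0.0
    for _ in range(int(round(t_final / dt))):
        ut = u(t)
        # State derivative from (1.1.3): A x + B u
        dx = [
            sum(A[i][j] * x[j] for j in range(n))
            + sum(B[i][k] * ut[k] for k in range(len(ut)))
            for i in range(n)
        ]
        x = [x[i] + dt * dx[i] for i in range(n)]
        t += dt
    ut = u(t)
    # Output from (1.1.4): C x + D u
    y = [
        sum(C[i][j] * x[j] for j in range(n))
        + sum(D[i][k] * ut[k] for k in range(len(ut)))
        for i in range(len(C))
    ]
    return x, y


if __name__ == "__main__":
    # Assumed example: double integrator driven by a unit step input.
    A = [[0.0, 1.0], [0.0, 0.0]]
    B = [[0.0], [1.0]]
    C = [[1.0, 0.0]]
    D = [[0.0]]
    print(simulate_lti(A, B, C, D, [0.0, 0.0], lambda t: [1.0], 1.0))
```

Forward Euler is chosen here only for brevity; a smaller step or a higher-order integrator would be preferable for stiff or lightly damped plants.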
[Figure 1.2 Modern Control Configuration. Blocks: controller C, plant P. Signals: reference input r(t), control input u(t), state x(t), output y(t).]

A state variable representation uniquely specifies the transfer function, whereas a given transfer function admits a number of state variable representations. This shows that the state variable representation is a more complete description of a system.

Figure 1.3 shows the three components of a modern control system and their important contributors. The first stage of any control system theory is to obtain or formulate the dynamics or modeling in terms of dynamical equations such as differential or difference equations. The system dynamics is largely based on the Lagrangian function. Next, the system is analyzed for its performance, mainly to establish the stability of the system, and the contributions of Lyapunov to stability theory are well known. Finally, if the system performance is not according to our specifications, we resort to design [25, 109]. In optimal control theory, the design is usually with respect to a performance index. We notice that although concepts such as the Lagrange function [85] and the V function of Lyapunov [94] are old, the techniques using those concepts are modern. Again, as the phrase modern usually refers to time, and what is modern today becomes ancient after a few years, it is more appropriate to label these areas as optimal control, nonlinear control, adaptive control, robust control, and so on.
[Figure 1.3 Components of a Modern Control System. System Dynamics (Modeling): State Function of Lagrange (1788). System Analysis (Performance): V Function of Lyapunov (1892). System Synthesis (Design): H Function of Pontryagin (1956).]

1.2 Optimization

Optimization is a very desirable feature in day-to-day life. We like to work and use our time in an optimum manner, use resources optimally, and so on. The subject of optimization is quite general in the sense that it can be viewed in different ways depending on the approach (algebraic or geometric), the interest (single or multiple), the nature of the signals (deterministic or stochastic), and the stage (single or multiple) used in optimization. This is shown in Figure 1.4. We notice that the calculus of variations is one small area of the big picture of the optimization field, and it forms the basis for our study of optimal control systems. Further, optimization can be classified as static optimization and dynamic optimization.

1. Static Optimization is concerned with controlling a plant under steady state conditions, i.e., the system variables are not changing with respect to time. The plant is then described by algebraic equations. Techniques used are ordinary calculus, Lagrange multipliers, and linear and nonlinear programming.
2. Dynamic Optimization is concerned with the optimal control of plants under dynamic conditions, i.e., the system variables are changing with respect to time and thus time is involved in the system description. Then the plant is described by differential (or difference) equations. Techniques used are search techniques, dynamic programming, variational calculus (or calculus of variations), and the Pontryagin principle.

[Figure 1.4 Overview of Optimization. Recoverable labels: algebraic approach, multiple interest, game theory, stochastic, multiple stage, dynamic programming, calculus and Lagrange multipliers, linear and nonlinear programming, functional analysis.]
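As a small illustration of static optimization with a Lagrange multiplier, consider minimizing a weighted sum of squares subject to one linear equality constraint. This is only a sketch under assumed data: the cost, the constraint, and the function name `constrained_min` are hypothetical and not from the text. Setting the partial derivatives of the Lagrangian sum(q_i x_i^2) + λ(sum(a_i x_i) - c) to zero gives x_i = -λ a_i / (2 q_i), and the constraint then fixes λ.

```python
def constrained_min(q, a, c):
    """Minimize sum(q[i] * x[i]**2) subject to sum(a[i] * x[i]) == c.

    Assumes every q[i] > 0. Returns the minimizing x and the multiplier lam
    obtained from the stationarity conditions of the Lagrangian.
    """
    # Substitute x_i = -lam * a_i / (2 q_i) into the constraint and solve for lam.
    lam = -c / sum(ai * ai / (2.0 * qi) for ai, qi in zip(a, q))
    x = [-lam * ai / (2.0 * qi) for ai, qi in zip(a, q)]
    return x, lam


if __name__ == "__main__":
    # Assumed example: minimize x1^2 + 2 x2^2 subject to x1 + x2 = 3.
    print(constrained_min([1.0, 2.0], [1.0, 1.0], 3.0))
```

Here nothing changes with time, so the problem reduces to algebraic equations, in contrast to the dynamic problems treated in the rest of the book.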
1.3 Optimal Control

The main objective of optimal control is to determine control signals that will cause a process (plant) to satisfy some physical constraints and at the same time extremize (maximize or minimize) a chosen performance criterion (performance index or cost function). Referring to Figure 1.2, we are interested in finding the optimal control u*(t) (* indicates optimal condition) that will drive the plant P from initial state to final state with some constraints on controls and states and at the same time extremize the given performance index J.

The formulation of an optimal control problem requires

1. a mathematical description (or model) of the process to be controlled (generally in state variable form),
2. a specification of the performance index, and
3. a statement of boundary conditions and the physical constraints on the states and/or controls.

1.3.1 Plant

For the purpose of optimization, we describe a physical plant by a set of linear or nonlinear differential or difference equations. For example, a linear time-invariant system is described by the state and output relations (1.1.3) and (1.1.4) and a nonlinear system by (1.1.5) and (1.1.6).

1.3.2 Performance Index

Classical control design techniques have been successfully applied to linear, time-invariant, single-input, single-output (SISO) systems. Typical performance criteria are the system time response to a step or ramp input, characterized by rise time, settling time, peak overshoot, and steady state accuracy; and the frequency response of the system, characterized by gain and phase margins, and bandwidth.

In modern control theory, the optimal control problem is to find a control which causes the dynamical system to reach a target or follow a state variable (or trajectory) and at the same time extremize a performance index, which may take several forms as described below.

1. Performance Index for Time-Optimal Control System: We try to transfer a system from an arbitrary initial state x(t0) to a specified final state x(tf) in minimum time. The corresponding performance index (PI) is

$$J = \int_{t_0}^{t_f} dt = t_f - t_0 = t^*. \tag{1.3.1}$$

2. Performance Index for Fuel-Optimal Control System: Consider a spacecraft problem. Let u(t) be the thrust of a rocket engine and assume that the magnitude |u(t)| of the thrust is proportional to the rate of fuel consumption. In order to minimize the total expenditure of fuel, we may formulate the performance index as

$$J = \int_{t_0}^{t_f} |u(t)|\,dt. \tag{1.3.2}$$

For several controls, we may write it as

$$J = \int_{t_0}^{t_f} \sum_{i} R_i |u_i(t)|\,dt \tag{1.3.3}$$

where R is a weighting factor.

3. Performance Index for Minimum-Energy Control System: Consider u_i(t) as the current in the ith loop of an electric network. Then $\sum_i u_i^2(t) r_i$ (where r_i is the resistance of the ith loop) is the total power or the total rate of energy expenditure of the network. Then, for minimization of the total expended energy, we have a performance criterion as

$$J = \int_{t_0}^{t_f} \sum_{i} u_i^2(t) r_i\,dt \tag{1.3.4}$$

or in general,

$$J = \int_{t_0}^{t_f} u'(t)Ru(t)\,dt \tag{1.3.5}$$
where R is a positive definite matrix and the prime (') denotes transpose here and throughout this book (see Appendix A for more details on definite matrices). Similarly, we can think of minimization of the integral of the squared error of a tracking system. We then have

$$J = \int_{t_0}^{t_f} x'(t)Qx(t)\,dt \tag{1.3.6}$$

where x_d(t) is the desired value, x_a(t) is the actual value, and x(t) = x_a(t) - x_d(t) is the error. Here, Q is a weighting matrix, which can be positive semidefinite.

4. Performance Index for Terminal Control System: In a terminal target problem, we are interested in minimizing the error between the desired target position x_d(tf) and the actual target position x_a(tf) at the end of the maneuver or at the final time tf. The terminal (final) error is x(tf) = x_a(tf) - x_d(tf). Taking care of positive and negative values of error and weighting factors, we structure the cost function as

$$J = x'(t_f)Fx(t_f) \tag{1.3.7}$$

which is also called the terminal cost function. Here, F is a positive semidefinite matrix.

5. Performance Index for General Optimal Control System: Combining the above formulations, we have a performance index in general form as

$$J = x'(t_f)Fx(t_f) + \int_{t_0}^{t_f} \left[x'(t)Qx(t) + u'(t)Ru(t)\right]dt \tag{1.3.8}$$

or,

$$J = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{1.3.9}$$

where R is a positive definite matrix, and Q and F are positive semidefinite matrices. Note that the matrices Q and R may be time varying. The particular form of performance index (1.3.8) is called the quadratic (in terms of the states and controls) form.
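The performance indices (1.3.1), (1.3.2), (1.3.5), and (1.3.8) can be evaluated numerically once the state and control histories are available as samples. The sketch below assumes scalar signals sampled at a fixed interval, so that the weighting matrices R, Q, and F reduce to scalars r, q, and f, and it uses the trapezoidal rule for the integrals. All function names are hypothetical.

```python
def trapezoid(samples, dt):
    """Trapezoidal-rule integral of equally spaced samples."""
    return sum(0.5 * (a + b) * dt for a, b in zip(samples, samples[1:]))


def time_index(t0, tf):
    """Minimum-time index (1.3.1): the elapsed time itself."""
    return tf - t0


def fuel_index(u, dt):
    """Fuel index (1.3.2): integral of |u(t)|."""
    return trapezoid([abs(v) for v in u], dt)


def energy_index(u, r, dt):
    """Energy index (1.3.5) for a scalar control: integral of r * u(t)^2."""
    return trapezoid([r * v * v for v in u], dt)


def quadratic_index(x, u, q, r, f, dt):
    """General quadratic index (1.3.8) for scalar state and control."""
    running = trapezoid([q * xi * xi + r * ui * ui for xi, ui in zip(x, u)], dt)
    return f * x[-1] ** 2 + running


if __name__ == "__main__":
    # Assumed sample data: constant control and constant state error over 2 s.
    u = [1.0] * 5
    x = [2.0] * 5
    print(fuel_index(u, 0.5), energy_index(u, 3.0, 0.5),
          quadratic_index(x, u, 1.0, 1.0, 0.5, 0.5))
```

For vector signals the same structure applies with the products replaced by the quadratic forms x'Qx and u'Ru.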
The problems arising in optimal control are classified based on the structure of the performance index J [67]. If the PI (1.3.9) contains the terminal cost function S(x(tf), tf) only, it is called the Mayer problem; if the PI (1.3.9) has only the integral cost term, it is called the Lagrange problem; and the problem is of the Bolza type if the PI contains both the terminal cost term and the integral cost term as in (1.3.9). There are many other forms of cost functions depending on our performance specifications. However, the above mentioned performance indices (with quadratic forms) lead to some very elegant results in optimal control systems.

1.3.3 Constraints

The control u(t) and state x(t) vectors are either unconstrained or constrained depending upon the physical situation. The unconstrained problem is less involved and gives rise to some elegant results. From physical considerations, often we have the controls and states, such as currents and voltages in an electrical circuit, speed of a motor, or thrust of a rocket, constrained as

$$U_- \le u(t) \le U_+, \qquad X_- \le x(t) \le X_+ \tag{1.3.10}$$

where + and - indicate the maximum and minimum values the variables can attain.

1.3.4 Formal Statement of Optimal Control System

Let us now state formally the optimal control problem, even risking repetition of some of the previous equations. The optimal control problem is to find the optimal control u*(t) (* indicates extremal or optimal value) which causes the linear time-invariant plant (system)

$$\dot{x}(t) = Ax(t) + Bu(t) \tag{1.3.11}$$

to give the trajectory x*(t) that optimizes or extremizes (minimizes or maximizes) a performance index

$$J = x'(t_f)Fx(t_f) + \int_{t_0}^{t_f} \left[x'(t)Qx(t) + u'(t)Ru(t)\right]dt \tag{1.3.12}$$

or which causes the nonlinear system

$$\dot{x}(t) = f(x(t), u(t), t) \tag{1.3.13}$$

to give the state x*(t) that optimizes the general performance index

$$J = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{1.3.14}$$

with some constraints on the control variables u(t) and/or the state variables x(t) given by (1.3.10). The final time tf may be fixed or free, and the final (target) state may be fully or partially fixed or free. The entire problem statement is also shown pictorially in Figure 1.5. Thus, we are basically interested in finding the control u*(t) which, when applied to the plant described by (1.3.11) or (1.3.13), gives an optimal performance index J* described by (1.3.12) or (1.3.14).

[Figure 1.5 Optimal Control Problem. An optimal control system comprising plant, constraints, and cost function; panels show J versus u(t) with the optimum J* at u*(t): (a) Minimum, (b) Maximum.]

The optimal control systems are studied in three stages.

1. In the first stage, we just consider the performance index of the form (1.3.14) and use the well-known theory of calculus of variations to obtain optimal functions.
2. In the second stage, we bring in the plant (1.3.11) and try to address the problem of finding the optimal control u*(t) which will drive the plant and at the same time optimize the performance index (1.3.12). Next, the above topics are presented in the discrete-time domain.
3. Finally, the topic of constraints on the controls and states (1.3.10) is considered along with the plant and performance index to obtain optimal control.
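To make the formal statement concrete, the following sketch evaluates the quadratic index (1.3.12) for a scalar version of the plant (1.3.11) under a family of candidate controls and picks the best one from a grid. It rests on several assumptions that are not part of the statement above: the plant is scalar, the control is restricted to the feedback form u(t) = -k x(t), the numerical values are arbitrary, and the integration uses forward Euler with a rectangle rule. It is a brute-force search for illustration, not the method developed in later chapters.

```python
def closed_loop_cost(k, a=-1.0, b=1.0, q=1.0, r=1.0, f=0.0,
                     x0=1.0, t_final=10.0, dt=1e-3):
    """Cost (1.3.12) for the scalar plant x' = a x + b u with u = -k x.

    The feedback structure and all default values are assumptions made
    for this illustration only.
    """
    x = x0
    cost = 0.0
    for _ in range(int(round(t_final / dt))):
        u = -k * x
        cost += (q * x * x + r * u * u) * dt  # running cost, rectangle rule
        x += (a * x + b * u) * dt             # forward-Euler plant update
    return cost + f * x * x                   # add terminal cost


def best_gain(gains, **plant):
    """Return the candidate gain with the smallest simulated cost."""
    return min(gains, key=lambda k: closed_loop_cost(k, **plant))


if __name__ == "__main__":
    candidates = [i * 0.05 for i in range(41)]  # gains from 0 to 2
    k_best = best_gain(candidates)
    print(k_best, closed_loop_cost(k_best))
```

The search shows the trade-off built into (1.3.12): a small gain leaves a large state error for a long time, while a large gain spends too much control effort, so the cost is smallest at an intermediate gain.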
1.4 Historical Tour

We basically consider two stages of the tour: first the development of calculus of variations, and secondly, optimal control theory [134, 58, 99, 28].¹

¹ The permission given by Springer-Verlag for H. H. Goldstine, A History of the Calculus of Variations, Springer-Verlag, New York, NY, 1980, is hereby acknowledged.

1.4.1 Calculus of Variations

According to a legend [88], Tyrian princess Dido used a rope made of cowhide in the form of a circular arc to maximize the area to be occupied to found Carthage. Although the story of the founding of Carthage is fictitious, it probably inspired a new mathematical discipline, the calculus of variations and its extensions such as optimal control theory. The calculus of variations is that branch of mathematics that deals with finding a function which is an extremum (maximum or minimum) of a functional. A functional is loosely defined as a function of a function. The theory of finding maxima and minima of functions is quite old and can be traced back to the isoperimetric problems considered by Greek mathematicians such as Zenodorus (c. 200-140 B.C.) and by Pappus (c. 300 A.D.). But we will start with the works of Bernoulli. In 1696 Johannes Bernoulli (1667-1748) posed the brachistochrone problem: the problem of finding the path of quickest descent between two points not in the same horizontal or vertical line. This problem, which was first posed by Galileo (1564-1642) in 1638, was solved by John, by his brother Jacob (1654-1705), by Gottfried Leibniz (1646-1716), and anonymously by Isaac Newton (1642-1727). Leonhard Euler (1707-1783) joined John Bernoulli and made some remarkable contributions, which influenced Joseph-Louis Lagrange (1736-1813), who finally gave an elegant way of solving these types of problems by using the method of (first) variations. This led Euler to coin the phrase calculus of variations. Later this necessary condition for extrema of a functional was called the Euler-Lagrange equation. Lagrange went on to treat variable end-point problems, introducing the multiplier method, which later became one of the most powerful tools in optimization: the Lagrange (or Euler-Lagrange) multiplier method.

The sufficient conditions for finding the extrema of functionals in calculus of variations were given by Adrien-Marie Legendre (1752-1833) in 1786 by considering additionally the second variation. Carl Gustav Jacob Jacobi (1804-1851) in 1836 came up with a more rigorous analysis of the sufficient conditions. This sufficient condition was later termed the Legendre-Jacobi condition. At about the same time Sir William Rowan Hamilton (1788-1856) did some remarkable work on mechanics, by showing that the motion of a particle in space, acted upon by various external forces, could be represented by a single function which satisfies two first-order partial differential equations. In 1838 Jacobi had some objections to this work and showed the need for only one partial differential equation. This equation, called the Hamilton-Jacobi equation, later had a profound influence on the calculus of variations, dynamic programming, and optimal control, as well as on mechanics.

The distinction between strong and weak extrema was addressed by Karl Weierstrass (1815-1897), who came up with the idea of the field of extremals and gave the Weierstrass condition, and sufficient conditions for weak and strong extrema. Rudolph Clebsch (1833-1872) and Adolph Mayer proceeded with establishing conditions for the more general class of problems. Clebsch formulated a problem in the calculus of variations by adjoining the constraint conditions in the form of differential equations and provided a condition based on the second variation. In 1868 Mayer reconsidered Clebsch's work and gave some elegant results for the general problem in the calculus of variations. Later Mayer described in detail the problems: the problem of Lagrange in 1878, and the problem of Mayer in 1895.

In 1898, Adolf Kneser gave a new approach to the calculus of variations by using the result of Karl Gauss (1777-1855) on geodesics.
For variable end-point problems, he established the transversality condition, which includes orthogonality as a special case. He along with Oskar Bolza (1857-1942) gave sufficiency proofs for these problems. In 1900, David Hilbert (1862-1943) showed the second variation as a quadratic functional with eigenvalues and eigenfunctions. Between 1908 and 1910, Gilbert Bliss (1876-1951) [23] and Max Mason looked in depth at the results of Kneser. In 1913, Bolza formulated the problem of Bolza as a generalization of the problems of Lagrange and Mayer. Bliss showed that these three problems are equivalent. Other notable contributions to calculus of variations were made by E. J. McShane (1904-1989) [98], M. R. Hestenes [65], H. H. Goldstine, and others.

There have been a large number of books on the subject of calculus of variations: Bliss (1946) [23], Cicala (1957) [37], Akhiezer (1962) [1], Elsgolts (1962) [47], Gelfand and Fomin (1963) [55], Dreyfus (1966) [45], Forray (1968) [50], Balakrishnan (1969) [8], Young (1969) [146], Elsgolts (1970) [46], Bolza (1973) [26], Smith (1974) [126], Weinstock (1974) [143], Krasnov et al. (1975) [81], Leitmann (1981) [88], Ewing (1985) [48], Kamien and Schwartz (1991) [78], Gregory and Lin (1992) [61], Sagan (1992) [118], Pinch (1993) [108], Wan (1994) [141], Giaquinta and Hildebrandt (1995) [56, 57], Troutman (1996) [136], and Milyutin and Osmolovskii (1998) [103].

1.4.2 Optimal Control Theory

The linear quadratic control problem has its origins in the celebrated work of N. Wiener on mean-square filtering for weapon fire control during World War II (1940-45) [144, 145]. Wiener solved the problem of designing filters that minimize a mean-square-error criterion (performance measure) of the form

$$J = E\{e^2(t)\} \tag{1.4.1}$$

where e(t) is the error, and E{x} represents the expected value of the random variable x. For a deterministic case, the above error criterion is generalized as an integral quadratic term as

$$J = \int_0^{\infty} e'(t)Qe(t)\,dt \tag{1.4.2}$$

where Q is some positive definite matrix. R. Bellman in 1957 [12] introduced the technique of dynamic programming to solve discrete-time optimal control problems. But the most important contribution to optimal control systems was made in 1956 [25] by L. S. Pontryagin (formerly of the Union of Soviet Socialist Republics (USSR)) and his associates, in the development of his celebrated maximum principle, described in detail in their book [109].
Also, see a very interesting article on the "discovery of the Maximum Principle" by R. V. Gamkrelidze [52], one of the authors of the original book [109]. At this time in the United States, R. E. Kalman in 1960 [70] provided linear quadratic regulator (LQR) and linear quadratic Gaussian (LQG) theory to design optimal feedback controls. He went on to present optimal filtering and estimation theory leading to his famous discrete Kalman filter [71] and the continuous Kalman filter with Bucy [76]. Kalman had a profound effect on optimal control theory, and the Kalman filter is one of the most widely used techniques in applications of control theory to real world problems in a variety of fields.

At this point we have to mention the matrix Riccati equation that appears in all the Kalman filtering techniques and many other fields. C. J. Riccati [114, 22] published his result in 1724 on the solution for some types of nonlinear differential equations, without ever knowing that the Riccati equation would become so famous after more than two centuries!

Thus, optimal control, having its roots in calculus of variations developed during the 16th and 17th centuries, was really born over 300 years ago [132]. For additional details about the historical perspectives on calculus of variations and optimal control, the reader is referred to some excellent publications [58, 99, 28, 21, 132].

In the so-called linear quadratic control, the term "linear" refers to the plant being linear and the term "quadratic" refers to the performance index that involves the square or quadratic of an error and/or control. Originally, this problem was called the mean-square control problem, and the term "linear quadratic" did not appear in the literature until the late 1950s.

Basically the classical control theory using frequency domain deals with single-input, single-output (SISO) systems, whereas modern control theory works with time domain for SISO and multi-input, multi-output (MIMO) systems. Although modern control and hence optimal control appeared to be very attractive, it lacked a very important feature of robustness. That is, controllers designed based on LQR theory failed to be robust to measurement noise, external disturbances, and unmodeled dynamics. On the other hand, frequency domain techniques using the ideas of gain margin and phase margin offer robustness in a natural way. Thus, some researchers [115, 95], especially in the United Kingdom, continued to work on developing frequency domain approaches to MIMO systems.
One important and relevant field that has been developed around the 1980s is the H∞-optimal control theory. In this framework, the work developed in the 1960s and 1970s is labeled as H2-optimal control theory. The seeds for H∞-optimal control theory were laid by G. Zames [148], who formulated the optimal H∞-sensitivity design problem for SISO systems and solved it using optimal Nevanlinna-Pick interpolation theory. An important publication in this field came from a group of four active researchers, Doyle, Glover, Khargonekar, and Francis [44], who won the 1991 W. R. G. Baker Award for the best IEEE Transactions paper. There are many other works in the field of H∞ control [7, 34, 39, 43, 51, 60, 96, 128, 131, 150].

1.5 About This Book

This book, on the subject of optimal control systems, is based on the author's lecture notes used for teaching a graduate level course on this subject. In particular, this author was most influenced by Athans and Falb [6], Schultz and Melsa [121], Sage [119], Kirk [79], Sage and White [120], Anderson and Moore [3], and Lewis and Syrmos [91], and one finds the footprints of these works in the present book.

There were a good number of books on optimal control published during the era of the "glory of modern control" (Leitmann (1964) [87], Tou (1964) [135], Athans and Falb (1966) [6], Dreyfus (1966) [45], Lee and Markus (1967) [86], Petrov (1968) [106], Sage (1968) [119], Citron (1969) [38], Luenberger (1969) [93], Pierre (1969) [107], Pun (1969) [110], Young (1969) [146], Kirk (1970) [79], Boltyanskii [24], Kwakernaak and Sivan (1972) [84], Warga (1972) [142], Berkovitz (1974) [17], Bryson and Ho (1975) [30], Sage and White (1977) [120], Leitmann (1981) [88], Ryan (1982) [116]). There has been renewed interest with the second wave of books published during the last few years (Lewis (1986) [89], Stengel (1986) [127], Christensen et al. (1987) [36], Anderson and Moore (1990) [3], Hocking (1991) [66], Teo et al. (1991) [133], Gregory and Lin (1992) [61], Lewis (1992) [90], Pinch (1993) [108], Dorato et al. (1995) [42], Lewis and Syrmos (1995) [91], Saberi et al. (1995) [117], Sima (1996) [124], Siouris [125], Troutman (1996) [136], Bardi and Dolcetta (1997) [9], Vincent and Grantham (1997) [139], Milyutin and Osmolovskii (1998) [103], Bryson (1999) [29], Burl [32], Kolosov (1999) [80], Pytlak (1999) [111], Vinter (2000) [140], Zelikin (2000) [149], Betts (2001) [20], and Locatelli (2001) [92]).
The optimal control theory continues to have a wide variety of applications, starting from the traditional electrical power [36] to economics and management [16, 78, 122, 123].

1.6 Chapter Overview

This book is composed of seven chapters. Chapter 2 presents optimal control via calculus of variations. In this chapter, we start with some basic definitions and a simple variational problem of extremizing a functional. We then bring in the plant as a conditional optimization problem and discuss various types of problems based on the boundary conditions. We briefly mention both Lagrangian and Hamiltonian formalisms for optimization. Next, Chapter 3 addresses basically the linear quadratic regulator (LQR) system. Here we discuss the closed-loop optimal control system, introducing matrix Riccati differential and algebraic equations. We look at the analytical solution to the Riccati equations and the development of a MATLAB routine for the analytical solution. Tracking and other problems of linear quadratic optimal control are discussed in Chapter 4. We also discuss the gain and phase margins of the LQR system.

So far the optimal control of continuous-time systems is described. Next, the optimal control of discrete-time systems is presented in Chapter 5. Here, we start with the basic calculus of variations and then touch upon all the topics discussed above with respect to the continuous-time systems.

The Pontryagin Principle and associated topics of dynamic programming and Hamilton-Jacobi-Bellman results are briefly covered in Chapter 6. The optimal control of systems with control and state constraints is described in Chapter 7. Here, we cover topics of control constraints leading to time-optimal, fuel-optimal, and energy-optimal control systems and briefly discuss the state constraints problem. Finally, Appendices A and B provide a summary of results on matrices, vectors, matrix algebra, and state space, and Appendix C lists some of the MATLAB files used in the book.
and u(t) = is the input to electronic amplifier connected to the input of the motor. X3(t) = the angular velocity of the motor. Formulate a performance index and optimal control problem to keep the speed constant at a particular value. motor speed control system is described by a second order state equation. Problem 1.200X2(t) + 400u(t) . C.2u(t) X4(t) 10x3(t) . where. the valves connecting a reservoir and the tank are controlled by gear train driven by a D. it is required to maintain the upright position of the pendulum on a cart.3 [35] In an inverted pendulum system. Problem 1. Formulate optimal control problem to keep the liquid level constant at a reference value and the system to act only if there is a change in the liquid level.1 A D.7 Problems 17 1.O.C. The linearized state equations are Xl(t) X2(t) X3(t) X4(t) = = = = X2(t) X3(t) + O. 7 Problems Problem 1. and X2(t) = the current in the armature circuit and the control input u( t) = the voltage input to an amplifier supplying the motor. The dynamics is described by a third order system Xl(t) = 2Xl(t) X2(t) = X3(t) X3(t) = 10X3(t) + 9000u(t) where. :h (t) = 25x2(t) X2(t) = 400Xl(t) . X2(t) = is the angular position of the electric motor driving the valves controlling the liquid from reservoir to tank. Xl(t) = is the height in the tank. Xl(t) = the speed of the motor.2u(t) .2 [83] In a liquidlevel control system for a storage tank.1. motor and an electronic amplifier.
lXI(t) . one spring connecting the two masses and the other spring connecting one of the masses to a fixed point. An input is applied to the mass not connected to the fixed point. and u(t) = is the horizontal force applied to the cart. X2 (t) = temperature and the control variable u(t) = effective cooling rate coefficient.3XI(t) . and k is the spring constant. Problem 1.4 [101J A mechanical system consisting of two masses and two springs.0. the normalized deviation state variables of the linearized model are Xl (t) = reaction variable.0. x(t) is the position. X2(t) = is linear velocity of the cart. Formulate a performance index to keep the pendulum in the vertical position with as little energy as possible. u(t) is the input to the suspension system (in the form of an upward force). X4(t) = is angular velocity. Formulate a suitable performance measure to minimize the deviation errors and to minimize the control effort. . The displacements (XI(t) and X2 (t)) and the corresponding velocities (X3 (t) and X4 (t)) of the two masses provide a fourthorder system described by XI(t) X2(t) X3(t) X4(t) = X3(t) = X4(t) = 4XI(t) + 2X2(t) = XI(t) . Problem 1.07u(t) where.6 [112J Consider a continuous stirred tank chemical reactor described by XI(t) = O. Assume suitable values for all the constants.012x2(t) . Xl (t) = is horizontal linear displacement of the cart.18 Chapter 1: Introduction where. X3(t) = is angular position of the pendulum from vertical line. Problem 1.0.12x2(t) X2(t) = 0. Formulate the optimal control problem for minimum control energy and passenger comfort.5 A simplified model of an automobile suspension system is described by mx(t) + kx(t) = bu(t) where.X2(t) + u(t) Formulate a performance index to minimize the errors in displacements and velocities and to minimize the control effort. m is the mass of the suspension system.
Chapter 2

Calculus of Variations and Optimal Control

Calculus of variations (CoV) or variational calculus deals with finding the optimum (maximum or minimum) value of a functional. Variational calculus, which originated around 1696, became an independent mathematical discipline after the fundamental discoveries of L. Euler (1709-1783), whom we can claim with good reason as the founder of calculus of variations.

In this chapter, we start with some basic definitions and a simple variational problem of extremizing a functional. We then incorporate the plant as a conditional optimization problem and discuss various types of problems based on the boundary conditions. We briefly mention both the Lagrangian and Hamiltonian formalisms for optimization. It is suggested that the student review the material in Appendices A and B given at the end of the book. This chapter is motivated by [47, 79, 46, 143, 81, 48]^1.

^1 The permission given by Prentice Hall for D. E. Kirk, Optimal Control Theory: An Introduction, Prentice Hall, Englewood Cliffs, NJ, 1970, is hereby acknowledged.

2.1 Basic Concepts

2.1.1 Function and Functional

We discuss some fundamental concepts associated with functionals alongside those of functions.

(a) Function: A variable $x$ is a function of a variable quantity $t$ (written as $x(t) = f(t)$) if to every value of $t$ over a certain range of $t$ there corresponds a value $x$; i.e., we have a correspondence: to a number $t$ there corresponds a number $x$. Note that here $t$ need not always be time but may be any independent variable.

Example 2.1
Consider
$$x(t) = 2t^2 + 1. \tag{2.1.1}$$
For $t = 1$, $x = 3$; for $t = 2$, $x = 9$; and so on. Other functions are $x(t) = 2t$ and $x(t_1, t_2) = t_1^2 + t_2^2$.

Next we consider the definition of a functional based on that of a function.

(b) Functional: A variable quantity $J$ is a functional dependent on a function $f(x)$, written as $J = J(f(x))$, if to each function $f(x)$ there corresponds a value $J$; i.e., we have a correspondence: to the function $f(x)$ there corresponds a number $J$. A functional may depend on several functions.

Example 2.2
Let $x(t) = 2t^2 + 1$. Then
$$J(x(t)) = \int_0^1 x(t)\,dt = \int_0^1 (2t^2 + 1)\,dt = \frac{2}{3} + 1 = \frac{5}{3} \tag{2.1.2}$$
is the area under the curve $x(t)$. If $v(t)$ is the velocity of a vehicle, then
$$J(v(t)) = \int_{t_0}^{t_f} v(t)\,dt \tag{2.1.3}$$
is the path traversed by the vehicle. Thus, here $x(t)$ and $v(t)$ are functions of $t$, and $J$ is a functional of $x(t)$ or $v(t)$. Loosely speaking, a functional can be thought of as a "function of a function."

2.1.2 Increment

We consider here the increment of a function and of a functional.
(a) Increment of a Function: In order to consider optimal values of a function, we need the definition of an increment [47, 46, 79].

DEFINITION 2.1 The increment of the function $f$, denoted by $\Delta f$, is defined as
$$\Delta f \triangleq f(t + \Delta t) - f(t). \tag{2.1.4}$$
It is easy to see from the definition that $\Delta f$ depends on both the independent variable $t$ and the increment of the independent variable $\Delta t$, and hence, strictly speaking, we need to write the increment of a function as $\Delta f(t, \Delta t)$.

Example 2.3
If
$$f(t) = (t_1 + t_2)^2, \tag{2.1.5}$$
find the increment of the function $f(t)$.
Solution: The increment $\Delta f$ becomes
$$\begin{aligned}
\Delta f &\triangleq f(t + \Delta t) - f(t) \\
&= (t_1 + \Delta t_1 + t_2 + \Delta t_2)^2 - (t_1 + t_2)^2 \\
&= (t_1 + \Delta t_1)^2 + (t_2 + \Delta t_2)^2 + 2(t_1 + \Delta t_1)(t_2 + \Delta t_2) - (t_1^2 + t_2^2 + 2t_1 t_2) \\
&= 2(t_1 + t_2)\Delta t_1 + 2(t_1 + t_2)\Delta t_2 + (\Delta t_1)^2 + (\Delta t_2)^2 + 2\Delta t_1 \Delta t_2.
\end{aligned} \tag{2.1.6}$$

(b) Increment of a Functional: Now we are ready to define the increment of a functional.

DEFINITION 2.2 The increment of the functional $J$, denoted by $\Delta J$, is defined as
$$\Delta J \triangleq J(x(t) + \delta x(t)) - J(x(t)). \tag{2.1.7}$$
Here $\delta x(t)$ is called the variation of the function $x(t)$. Since the increment of a functional depends on the function $x(t)$ and its variation $\delta x(t)$, strictly speaking, we need to write the increment as $\Delta J(x(t), \delta x(t))$.

Example 2.4
Find the increment of the functional
$$J = \int_{t_0}^{t_f} \left[2x^2(t) + 1\right] dt. \tag{2.1.8}$$
Solution: The increment of $J$ is given by
$$\begin{aligned}
\Delta J &\triangleq J(x(t) + \delta x(t)) - J(x(t)) \\
&= \int_{t_0}^{t_f} \left[2(x(t) + \delta x(t))^2 + 1\right] dt - \int_{t_0}^{t_f} \left[2x^2(t) + 1\right] dt \\
&= \int_{t_0}^{t_f} \left[4x(t)\delta x(t) + 2(\delta x(t))^2\right] dt.
\end{aligned} \tag{2.1.9}$$

2.1.3 Differential and Variation

Here, we consider the differential of a function and the variation of a functional.

(a) Differential of a Function: Let us define at a point $t^*$ the increment of the function $f$ as
$$\Delta f \triangleq f(t^* + \Delta t) - f(t^*). \tag{2.1.10}$$
By expanding $f(t^* + \Delta t)$ in a Taylor series about $t^*$, we get
$$\Delta f = f(t^*) + \left(\frac{df}{dt}\right)_* \Delta t + \frac{1}{2!}\left(\frac{d^2 f}{dt^2}\right)_* (\Delta t)^2 + \cdots - f(t^*). \tag{2.1.11}$$
Neglecting the higher order terms in $\Delta t$,
$$\Delta f = \left(\frac{df}{dt}\right)_* \Delta t = \dot{f}(t^*)\Delta t = df. \tag{2.1.12}$$
Here, $df$ is called the differential of $f$ at the point $t^*$, and $\dot{f}(t^*)$ is the derivative or slope of $f$ at $t^*$. In other words, the differential $df$ is the first order approximation to the increment $\Delta f$. Figure 2.1 shows the relation between increment, differential and derivative.

Figure 2.1 Increment $\Delta f$, Differential $df$, and Derivative $\dot{f}$ of a Function $f(t)$

Example 2.5
Let $f(t) = t^2 + 2t$. Find the increment and the derivative of the function $f(t)$.
Solution: By definition, the increment $\Delta f$ is
$$\begin{aligned}
\Delta f &\triangleq f(t + \Delta t) - f(t) \\
&= (t + \Delta t)^2 + 2(t + \Delta t) - (t^2 + 2t) \\
&= 2t\Delta t + 2\Delta t + (\Delta t)^2 \\
&\approx 2(t + 1)\Delta t = \dot{f}(t)\Delta t,
\end{aligned} \tag{2.1.13}$$
where the higher order term $(\Delta t)^2$ is neglected in the last step. Here, $\dot{f}(t) = 2(t + 1)$.

(b) Variation of a Functional: Consider the increment of a functional
$$\Delta J \triangleq J(x(t) + \delta x(t)) - J(x(t)). \tag{2.1.14}$$
Expanding $J(x(t) + \delta x(t))$ in a Taylor series, we get
$$\begin{aligned}
\Delta J &= J(x(t)) + \frac{\partial J}{\partial x}\delta x(t) + \frac{1}{2!}\frac{\partial^2 J}{\partial x^2}(\delta x(t))^2 + \cdots - J(x(t)) \\
&= \frac{\partial J}{\partial x}\delta x(t) + \frac{1}{2!}\frac{\partial^2 J}{\partial x^2}(\delta x(t))^2 + \cdots \\
&= \delta J + \delta^2 J + \cdots,
\end{aligned} \tag{2.1.15}$$
where
$$\delta J = \frac{\partial J}{\partial x}\delta x(t) \quad \text{and} \quad \delta^2 J = \frac{1}{2!}\frac{\partial^2 J}{\partial x^2}(\delta x(t))^2 \tag{2.1.16}$$
are called the first variation (or simply the variation) and the second variation of the functional $J$, respectively. The variation $\delta J$ of a functional $J$ is the linear (or first order approximate) part (in $\delta x(t)$) of the increment $\Delta J$. Figure 2.2 shows the relation between increment and the first variation of a functional.

Figure 2.2 Increment $\Delta J$ and the First Variation $\delta J$ of the Functional $J$
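The definitions of a functional, its increment, and its first variation can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the area functional of Example 2.2 and then compares the increment of the functional of Example 2.4 with its linear part $\int 4x(t)\delta x(t)\,dt$. The interval $[0, 1]$, the Simpson integrator, and the variation shape $\delta x(t) = \varepsilon\, t(1 - t)$ are assumptions chosen only for illustration.

```python
def integrate(g, a, b, n=2000):
    """Composite Simpson's rule for the integral of g over [a, b] (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def J(x, t0=0.0, tf=1.0):
    """Functional of Example 2.4 on an assumed interval: integral of 2*x(t)**2 + 1."""
    return integrate(lambda t: 2 * x(t) ** 2 + 1, t0, tf)


x = lambda t: 2 * t ** 2 + 1      # nominal function x(t) from Example 2.2
eta = lambda t: t * (1 - t)       # hypothetical shape of the variation


def compare(eps):
    """Return (increment, first variation) for delta_x(t) = eps * eta(t)."""
    dx = lambda t: eps * eta(t)
    increment = J(lambda t: x(t) + dx(t)) - J(x)
    first_variation = integrate(lambda t: 4 * x(t) * dx(t), 0.0, 1.0)
    return increment, first_variation


area = integrate(x, 0.0, 1.0)     # area functional of Example 2.2
print("area:", area)
for eps in (1e-1, 1e-2, 1e-3):
    inc, var = compare(eps)
    # the gap inc - var is the second order part and shrinks like eps**2
    print(eps, inc, var, inc - var)
```

As the variation is made smaller, the gap between the increment and the first variation shrinks quadratically, which is what "first order approximation" means here.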
Example 2.6
Given the functional
$$J(x(t)) = \int_{t_0}^{t_f} \left[2x^2(t) + 3x(t) + 4\right] dt, \tag{2.1.17}$$
evaluate the variation of the functional.
Solution: First, we form the increment and then extract the variation as the first order approximation. Thus
$$\begin{aligned}
\Delta J &\triangleq J(x(t) + \delta x(t)) - J(x(t)) \\
&= \int_{t_0}^{t_f} \left[2(x(t) + \delta x(t))^2 + 3(x(t) + \delta x(t)) + 4 - (2x^2(t) + 3x(t) + 4)\right] dt \\
&= \int_{t_0}^{t_f} \left[4x(t)\delta x(t) + 2(\delta x(t))^2 + 3\delta x(t)\right] dt.
\end{aligned} \tag{2.1.18}$$
Considering only the first order terms, we get the (first) variation as
$$\delta J(x(t), \delta x(t)) = \int_{t_0}^{t_f} (4x(t) + 3)\,\delta x(t)\,dt. \tag{2.1.19}$$

2.2 Optimum of a Function and a Functional

We give some definitions for optimum or extremum (maximum or minimum) of a function and a functional [47, 46, 79]. The variation plays the same role in determining the optimal value of a functional as the differential does in finding the extremal or optimal value of a function.

DEFINITION 2.3 Optimum of a Function: A function $f(t)$ is said to have a relative optimum at the point $t^*$ if there is a positive parameter $\epsilon$ such that for all points $t$ in a domain $\mathcal{D}$ that satisfy $|t - t^*| < \epsilon$, the increment of $f(t)$ has the same sign (positive or negative). In other words, if
$$\Delta f = f(t) - f(t^*) \geq 0, \tag{2.2.1}$$
then $f(t^*)$ is a relative local minimum. On the other hand, if
$$\Delta f = f(t) - f(t^*) \leq 0, \tag{2.2.2}$$
then $f(t^*)$ is a relative local maximum. If the previous relations are valid for arbitrarily large $\epsilon$, then $f(t^*)$ is said to have a global absolute optimum. Figure 2.3 illustrates the (a) minimum and (b) maximum of a function.

Figure 2.3 (a) Minimum and (b) Maximum of a Function $f(t)$

It is well known that the necessary condition for optimum of a function is that the (first) differential vanishes, i.e., $df = 0$.
The sufficient condition

1. for minimum is that the second differential is positive, i.e., $d^2 f > 0$, and
2. for maximum is that the second differential is negative, i.e., $d^2 f < 0$.

If $d^2 f = 0$, it corresponds to a stationary (or inflection) point.

DEFINITION 2.4 Optimum of a Functional: A functional $J$ is said to have a relative optimum at $x^*$ if there is a positive $\epsilon$ such that for all functions $x$ in a domain $\Omega$ which satisfy $|x - x^*| < \epsilon$, the increment of $J$ has the same sign. In other words, if
$$\Delta J = J(x) - J(x^*) \geq 0, \tag{2.2.3}$$
then $J(x^*)$ is a relative minimum. On the other hand, if
$$\Delta J = J(x) - J(x^*) \leq 0, \tag{2.2.4}$$
then $J(x^*)$ is a relative maximum. If the above relations are satisfied for arbitrarily large $\epsilon$, then $J(x^*)$ is a global absolute optimum.

Analogous to finding extremum or optimal values for functions, in variational problems concerning functionals, the result is that the variation must be zero on an optimal curve. Let us now state the result in the form of a theorem, known as the fundamental theorem of the calculus of variations, the proof of which can be found in any book on calculus of variations [47, 46, 79].

THEOREM 2.1 For $x^*(t)$ to be a candidate for an optimum, the (first) variation of $J$ must be zero on $x^*(t)$, i.e., $\delta J(x^*(t), \delta x(t)) = 0$ for all admissible values of $\delta x(t)$. This is a necessary condition. As a sufficient condition for minimum, the second variation $\delta^2 J > 0$, and for maximum $\delta^2 J < 0$.

2.3 The Basic Variational Problem

2.3.1 Fixed-End Time and Fixed-End State System

We address a fixed-end time and fixed-end state problem, where both the initial time and state and the final time and state are fixed or given a priori. Let $x(t)$ be a scalar function with continuous first derivatives; the vector case can be similarly dealt with. The problem is to find the optimal function $x^*(t)$ for which the functional
$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt \tag{2.3.1}$$
has a relative optimum. It is assumed that the integrand $V$ has continuous first and second partial derivatives w.r.t. all its arguments; $t_0$ and $t_f$ are fixed (or given a priori) and the end points are fixed, i.e.,
$$x(t = t_0) = x_0; \quad x(t = t_f) = x_f. \tag{2.3.2}$$
We already know from Theorem 2.1 that the necessary condition for an optimum is that the variation of a functional vanishes. Hence, in our attempt to find the optimum of $x(t)$, we first define the increment for $J$, obtain its variation and finally apply the fundamental theorem of the calculus of variations (Theorem 2.1). Thus, the various steps involved in finding the optimal solution to the fixed-end time and fixed-end state system are first listed and then discussed in detail.

• Step 1: Assumption of an Optimum
• Step 2: Variations and Increment
• Step 3: First Variation
• Step 4: Fundamental Theorem
• Step 5: Fundamental Lemma
• Step 6: Euler-Lagrange Equation

• Step 1: Assumption of an Optimum: Let us assume that $x^*(t)$ is the optimum attained for the function $x(t)$. Take some admissible function $x_a(t) = x^*(t) + \delta x(t)$ close to $x^*(t)$, where $\delta x(t)$ is the variation of $x^*(t)$ as shown in Figure 2.4. The function $x_a(t)$ should also satisfy the boundary conditions (2.3.2) and hence it is necessary that
$$\delta x(t_0) = \delta x(t_f) = 0. \tag{2.3.3}$$

Figure 2.4 Fixed-End Time and Fixed-End State System

• Step 2: Variations and Increment: Let us first define the increment as
$$\begin{aligned}
\Delta J(x^*(t), \delta x(t)) &\triangleq J(x^*(t) + \delta x(t), \dot{x}^*(t) + \delta\dot{x}(t), t) - J(x^*(t), \dot{x}^*(t), t) \\
&= \int_{t_0}^{t_f} V(x^*(t) + \delta x(t), \dot{x}^*(t) + \delta\dot{x}(t), t)\,dt - \int_{t_0}^{t_f} V(x^*(t), \dot{x}^*(t), t)\,dt,
\end{aligned} \tag{2.3.4}$$
which by combining the integrals can be written as
$$\Delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f} \left[V(x^*(t) + \delta x(t), \dot{x}^*(t) + \delta\dot{x}(t), t) - V(x^*(t), \dot{x}^*(t), t)\right] dt, \tag{2.3.5}$$
where
$$\dot{x}(t) = \frac{dx(t)}{dt} \quad \text{and} \quad \delta\dot{x}(t) = \frac{d}{dt}\{\delta x(t)\}. \tag{2.3.6}$$
Expanding $V$ in the increment (2.3.5) in a Taylor series about the point $x^*(t)$ and $\dot{x}^*(t)$, the increment $\Delta J$ becomes (note the cancelation of $V(x^*(t), \dot{x}^*(t), t)$)
$$\Delta J = \int_{t_0}^{t_f} \left[\frac{\partial V}{\partial x}\delta x(t) + \frac{\partial V}{\partial \dot{x}}\delta\dot{x}(t) + \frac{1}{2!}\left\{\frac{\partial^2 V}{\partial x^2}(\delta x(t))^2 + \frac{\partial^2 V}{\partial \dot{x}^2}(\delta\dot{x}(t))^2 + 2\frac{\partial^2 V}{\partial x \partial \dot{x}}\delta x(t)\delta\dot{x}(t)\right\} + \cdots\right] dt. \tag{2.3.7}$$
Here, the partial derivatives are w.r.t. $x(t)$ and $\dot{x}(t)$ at the optimal condition (*), and * is omitted for simplicity.

• Step 3: First Variation: Now, we obtain the variation by retaining the terms that are linear in $\delta x(t)$ and $\delta\dot{x}(t)$ as
$$\delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f} \left[\frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial x}\delta x(t) + \frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial \dot{x}}\delta\dot{x}(t)\right] dt. \tag{2.3.8}$$
To express the relation for the first variation (2.3.8) entirely in terms containing $\delta x(t)$ (since $\delta\dot{x}(t)$ is dependent on $\delta x(t)$), we integrate by parts the term involving $\delta\dot{x}(t)$ as (omitting the arguments in $V$ for simplicity)
$$\begin{aligned}
\int_{t_0}^{t_f} \left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta\dot{x}(t)\,dt &= \int_{t_0}^{t_f} \left(\frac{\partial V}{\partial \dot{x}}\right)_* \frac{d}{dt}(\delta x(t))\,dt = \int_{t_0}^{t_f} \left(\frac{\partial V}{\partial \dot{x}}\right)_* d(\delta x(t)) \\
&= \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f} \delta x(t)\,\frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* dt.
\end{aligned} \tag{2.3.9}$$
In the above, we used the well-known integration formula $\int u\,dv = uv - \int v\,du$, where $u = \partial V/\partial \dot{x}$ and $v = \delta x(t)$.
Using (2.3.9), the relation (2.3.8) for first variation becomes
$$\begin{aligned}
\delta J(x^*(t), \delta x(t)) &= \int_{t_0}^{t_f} \left(\frac{\partial V}{\partial x}\right)_* \delta x(t)\,dt + \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f} \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\,dt \\
&= \int_{t_0}^{t_f} \left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right] \delta x(t)\,dt + \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x(t)\right]_{t_0}^{t_f}.
\end{aligned} \tag{2.3.10}$$
Using the relation (2.3.3) for boundary variations in (2.3.10), we get
$$\delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f} \left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right] \delta x(t)\,dt. \tag{2.3.11}$$

• Step 4: Fundamental Theorem: We now apply the fundamental theorem of the calculus of variations (Theorem 2.1), i.e., the variation of $J$ must vanish for an optimum. That is, for the optimum $x^*(t)$ to exist, $\delta J(x^*(t), \delta x(t)) = 0$. Thus the relation (2.3.11) becomes
$$\int_{t_0}^{t_f} \left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right] \delta x(t)\,dt = 0. \tag{2.3.12}$$
Note that the function $\delta x(t)$ must be zero at $t_0$ and $t_f$, but apart from this, it is completely arbitrary.

• Step 5: Fundamental Lemma: To simplify the condition obtained in the equation (2.3.12), let us take advantage of the following lemma called the fundamental lemma of the calculus of variations [47, 46, 79].

LEMMA 2.1 If a continuous function $g(t)$ satisfies
$$\int_{t_0}^{t_f} g(t)\,\delta x(t)\,dt = 0 \tag{2.3.13}$$
for every function $\delta x(t)$ that is continuous in the interval $[t_0, t_f]$, then the function $g(t)$ must be zero everywhere throughout the interval $[t_0, t_f]$ (see Figure 2.5).

Proof: We prove this by contradiction. Let us assume that $g(t)$ is nonzero (positive or negative) during a short interval $[t_a, t_b]$. Next, let us select $\delta x(t)$, which is arbitrary, to be positive (or negative) throughout the interval where $g(t)$ has a nonzero value and zero elsewhere. By this selection of $\delta x(t)$, the value of the integral in (2.3.13) will be nonzero, which contradicts (2.3.13). Hence the assumption that $g(t)$ is nonzero during the interval cannot hold, and $g(t)$ must be identically zero everywhere during the entire interval $[t_0, t_f]$ in (2.3.13). Hence the lemma.

Figure 2.5 A Nonzero $g(t)$ and an Arbitrary $\delta x(t)$

• Step 6: Euler-Lagrange Equation: Applying the previous lemma to (2.3.12), a necessary condition for $x^*(t)$ to be an optimum of the functional $J$ given by (2.3.1) is
$$\left(\frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V(x^*(t), \dot{x}^*(t), t)}{\partial \dot{x}}\right)_* = 0 \tag{2.3.14}$$
or, in simplified notation omitting the arguments in $V$,
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \tag{2.3.15}$$
for all $t \in [t_0, t_f]$. This equation is called the Euler equation, first published in 1741 [126].

A historical note is worthy of mention. Euler obtained the equation (2.3.14) in 1741 using an elaborate and cumbersome procedure. Lagrange studied Euler's results and wrote a letter to Euler in 1755 in which he obtained the previous equation by a more elegant method of "variations" as described above. Euler recognized the simplicity and generality of the method of Lagrange and introduced the name calculus of variations. The all important fundamental equation (2.3.14) is now generally known as the Euler-Lagrange (E-L) equation after these two great mathematicians of the 18th century. Lagrange worked further on optimization and came up with the well-known Lagrange multiplier rule or method.

2.3.2 Discussion on Euler-Lagrange Equation

We provide some comments on the Euler-Lagrange equation [47, 46].

1. The Euler-Lagrange equation (2.3.14) can be written in many different forms. Thus (2.3.14) becomes
$$V_x - \frac{d}{dt}\left(V_{\dot{x}}\right) = 0 \tag{2.3.16}$$
where
$$V_x = \frac{\partial V}{\partial x} = V_x(x^*(t), \dot{x}^*(t), t); \quad V_{\dot{x}} = \frac{\partial V}{\partial \dot{x}} = V_{\dot{x}}(x^*(t), \dot{x}^*(t), t). \tag{2.3.17}$$
Since $V$ is a function of three arguments $x^*(t)$, $\dot{x}^*(t)$, and $t$, and since $x^*(t)$ and $\dot{x}^*(t)$ are in turn functions of $t$, we get by the chain rule
$$\frac{d}{dt}\left(V_{\dot{x}}\right) = V_{x\dot{x}}\,\dot{x}^*(t) + V_{\dot{x}\dot{x}}\,\ddot{x}^*(t) + V_{t\dot{x}}. \tag{2.3.18}$$
Combining (2.3.16) and (2.3.18), we get an alternate form for the EL equation as
$$V_x - V_{t\dot{x}} - V_{x\dot{x}}\,\dot{x}^*(t) - V_{\dot{x}\dot{x}}\,\ddot{x}^*(t) = 0. \tag{2.3.19}$$

2. The presence of $\frac{d}{dt}$ and/or $\dot{x}^*(t)$ in the EL equation (2.3.14) means that it is a differential equation.

3. In the EL equation (2.3.14), the term $\partial V(x^*(t), \dot{x}^*(t), t)/\partial \dot{x}$ is in general a function of $x^*(t)$, $\dot{x}^*(t)$, and $t$. Thus when this function is differentiated w.r.t. $t$, $\ddot{x}^*(t)$ may be present. This means that the differential equation (2.3.14) is in general of second order. This is also evident from the alternate form (2.3.19) for the EL equation.

4. There may also be terms involving products or powers of $\ddot{x}^*(t)$, $\dot{x}^*(t)$, and $x^*(t)$, in which case the differential equation becomes nonlinear.

5. The explicit presence of $t$ in the arguments indicates that the coefficients may be time-varying.

6. The conditions at the initial point $t = t_0$ and the final point $t = t_f$ lead to a boundary value problem.

7. Thus, the Euler-Lagrange equation (2.3.14) is, in general, a nonlinear, time-varying, two-point boundary value, second order, ordinary differential equation. Thus, we often have a nonlinear two-point boundary value problem (TPBVP). The solution of the nonlinear TPBVP is quite a formidable task and is often done using numerical techniques. This is the price we pay for demanding optimal performance!
8. Compliance with the Euler-Lagrange equation is only a necessary condition for the optimum. A solution of the equation may sometimes yield neither a maximum nor a minimum, just as inflection points, where the derivative vanishes, do in differential calculus. However, if the Euler-Lagrange equation is not satisfied for any function, this indicates that the optimum does not exist for that functional.

2.3.3 Different Cases for Euler-Lagrange Equation

We now discuss various cases of the EL equation.

Case 1: $V$ is dependent on $\dot{x}(t)$ and $t$. That is, $V = V(\dot{x}(t), t)$. Then $V_x = 0$. The Euler-Lagrange equation (2.3.16) becomes
$$\frac{d}{dt}\left(V_{\dot{x}}\right) = 0. \tag{2.3.20}$$
This leads us to
$$V_{\dot{x}} = \frac{\partial V(\dot{x}^*(t), t)}{\partial \dot{x}} = C \tag{2.3.21}$$
where $C$ is a constant of integration.

Case 2: $V$ is dependent on $\dot{x}(t)$ only. That is, $V = V(\dot{x}(t))$. Then $V_x = 0$. The Euler-Lagrange equation (2.3.16) becomes
$$\frac{d}{dt}\left(V_{\dot{x}}\right) = 0 \longrightarrow V_{\dot{x}} = C. \tag{2.3.22}$$
Here $V_{\dot{x}}$ depends on $\dot{x}^*(t)$ alone, so (2.3.22) requires $\dot{x}^*(t)$ to be constant, and the solution becomes
$$x^*(t) = C_1 t + C_2. \tag{2.3.23}$$
This is simply an equation of a straight line. (In Case 1, the explicit dependence of (2.3.21) on $t$ generally prevents this simplification.)

Case 3: $V$ is dependent on $x(t)$ and $\dot{x}(t)$. That is, $V = V(x(t), \dot{x}(t))$. Then $V_{t\dot{x}} = 0$. Using the other form of the Euler-Lagrange equation (2.3.19), we get
$$V_x - V_{x\dot{x}}\,\dot{x}^*(t) - V_{\dot{x}\dot{x}}\,\ddot{x}^*(t) = 0. \tag{2.3.24}$$
Multiplying the previous equation by $\dot{x}^*(t)$, we have
$$\dot{x}^*(t)\left[V_x - V_{x\dot{x}}\,\dot{x}^*(t) - V_{\dot{x}\dot{x}}\,\ddot{x}^*(t)\right] = 0. \tag{2.3.25}$$
This can be rewritten as
$$\frac{d}{dt}\left(V - \dot{x}^*(t)V_{\dot{x}}\right) = 0 \longrightarrow V - \dot{x}^*(t)V_{\dot{x}} = C. \tag{2.3.26}$$
The previous equation can be solved using any of the standard techniques, such as separation of variables.

Case 4: $V$ is dependent on $x(t)$ and $t$, i.e., $V = V(x(t), t)$. Then $V_{\dot{x}} = 0$ and the Euler-Lagrange equation (2.3.16) becomes
$$\frac{\partial V(x^*(t), t)}{\partial x} = 0. \tag{2.3.27}$$
The solution of this equation does not contain any arbitrary constants and therefore, generally speaking, does not satisfy the boundary conditions $x(t_0)$ and $x(t_f)$. Hence, in general, no solution exists for this variational problem. Only in rare cases, when the function $x(t)$ satisfies the given boundary conditions $x(t_0)$ and $x(t_f)$, does it become an optimal function.

Let us now illustrate the application of the EL equation with a very simple classic example of finding the shortest distance between two points. Often, we omit the * (which indicates an optimal or extremal value) during the working of a problem and attach the same to the final solution.

Example 2.7
Find the minimum length between any two points.
Solution: It is well known that the solution to this problem is a straight line. However, we like to illustrate the application of the Euler-Lagrange equation for this simple case. Consider the arc between two points A and B as shown in Figure 2.6. Let $ds$ be the small arc length, and $dx$ and $dt$ the small rectangular coordinate values. Note that $t$ is the independent variable representing distance and not time. Then,
$$(ds)^2 = (dx)^2 + (dt)^2. \tag{2.3.28}$$
Rewriting,
$$ds = \sqrt{1 + \dot{x}^2(t)}\,dt, \quad \text{where } \dot{x}(t) = \frac{dx}{dt}. \tag{2.3.29}$$
Now the total arc length $S$ between two points $x(t = t_0)$ and $x(t = t_f)$ is the performance index $J$ to be minimized. Thus,
$$S = J = \int ds = \int_{t_0}^{t_f} \sqrt{1 + \dot{x}^2(t)}\,dt = \int_{t_0}^{t_f} V(\dot{x}(t))\,dt \tag{2.3.30}$$
where $V(\dot{x}(t)) = \sqrt{1 + \dot{x}^2(t)}$. Note that $V$ is a function of $\dot{x}(t)$ only.

Figure 2.6 Arc Length

Applying the Euler-Lagrange equation (2.3.22) to the performance index (2.3.30), we get
$$\frac{\dot{x}^*(t)}{\sqrt{1 + \dot{x}^{*2}(t)}} = C. \tag{2.3.31}$$
Solving this equation, we get the optimal solution as
$$x^*(t) = C_1 t + C_2. \tag{2.3.32}$$
This is evidently an equation for a straight line, and the constants $C_1$ and $C_2$ are evaluated from the given boundary conditions. For example, if $x(0) = 1$ and $x(2) = 5$, then $C_1 = 2$ and $C_2 = 1$, and the straight line is $x^*(t) = 2t + 1$.

Although the previous example is a simple one,

1. it illustrates the formulation of a performance index from a given simple specification or a statement, and
2. the solution is well known a priori so that we can easily verify the application of the Euler-Lagrange equation.

In the previous example, we notice that the integrand $V$ in the functional (2.3.30) is a function of $\dot{x}(t)$ only. Next, we take an example where $V$ is a function of $x(t)$, $\dot{x}(t)$ and $t$.
Example 2.8
Find the optimum of
$$J = \int_0^2 \left[\dot{x}^2(t) + 2t\,x(t)\right] dt \tag{2.3.33}$$
that satisfies the boundary (initial and final) conditions
$$x(0) = 1 \quad \text{and} \quad x(2) = 5. \tag{2.3.34}$$
Solution: We first identify that $V = \dot{x}^2(t) + 2t\,x(t)$. Then applying the EL equation (2.3.15) to the performance index (2.3.33), we get
$$\frac{\partial V}{\partial x} - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right) = 0 \longrightarrow 2t - \frac{d}{dt}\left(2\dot{x}(t)\right) = 0 \longrightarrow \ddot{x}(t) = t. \tag{2.3.35}$$
Solving the previous simple differential equation, we have
$$x^*(t) = \frac{t^3}{6} + C_1 t + C_2 \tag{2.3.36}$$
where $C_1$ and $C_2$ are constants of integration. Using the given boundary conditions (2.3.34) in (2.3.36), we have
$$x(0) = 1 \longrightarrow C_2 = 1, \quad x(2) = 5 \longrightarrow C_1 = \frac{4}{3}. \tag{2.3.37}$$
With these values for the constants, we finally have the optimal function as
$$x^*(t) = \frac{t^3}{6} + \frac{4}{3}t + 1. \tag{2.3.38}$$

Another classical example in the calculus of variations is the brachistochrone (from brachisto, the shortest, and chrones, time) problem, and this problem is dealt with in almost all books on calculus of variations [126]. Further, note that we have considered here only the so-called fixed-end point problem, where both (initial and final) ends are fixed or given in advance. Other types of problems, such as free-end point problems, are not presented here but can be found in most of the books on the calculus of variations [79, 46, 81, 48]. However, these free-end point problems are better considered later in this chapter when we discuss the optimal control problem consisting of a performance index and a physical plant.
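The result of Example 2.8 can be cross-checked numerically. The sketch below is not from the original text: it evaluates the cost $J = \int_0^2 [\dot{x}^2 + 2tx]\,dt$ by a midpoint rule, compares the extremal $x^*(t) = t^3/6 + 4t/3 + 1$ with neighbouring curves that share the same end points, and estimates the residual of $\ddot{x}(t) = t$ by finite differences. The perturbation shape $t(2 - t)$ and the step sizes are assumptions chosen for illustration.

```python
def x_star(t):
    """Extremal of Example 2.8: x*(t) = t^3/6 + (4/3) t + 1."""
    return t ** 3 / 6 + 4 * t / 3 + 1


def cost(x, t0=0.0, tf=2.0, n=2000):
    """Midpoint-rule estimate of J = integral of (xdot^2 + 2 t x) over [t0, tf];
    xdot is estimated by a central difference across each subinterval."""
    h = (tf - t0) / n
    total = 0.0
    for i in range(n):
        tm = t0 + (i + 0.5) * h
        xdot = (x(tm + h / 2) - x(tm - h / 2)) / h
        total += (xdot ** 2 + 2 * tm * x(tm)) * h
    return total


def perturbed(eps):
    """x*(t) plus a hypothetical variation eps*t*(2 - t) that vanishes at t = 0 and t = 2."""
    return lambda t: x_star(t) + eps * t * (2 - t)


def el_residual(t, h=1e-3):
    """Finite-difference estimate of xddot(t) - t along x*(t)."""
    xddot = (x_star(t + h) - 2 * x_star(t) + x_star(t - h)) / h ** 2
    return xddot - t


J_star = cost(x_star)
for eps in (-0.2, -0.1, 0.1, 0.2):
    # every admissible neighbour tried here costs more than the extremal
    print(eps, cost(perturbed(eps)) - J_star)
print("E-L residual at t = 1:", el_residual(1.0))
```

Because this functional is quadratic in $x$ and $\dot{x}$, the cost difference for these perturbations is exactly the second-order term, which is positive for either sign of the perturbation; this is consistent with the extremal being a minimum.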
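2.4 The Second Variation

In the study of extrema of functionals, we have so far considered only the necessary condition for a functional to have a relative or weak extremum, i.e., the condition that the first variation vanish, leading to the classic Euler-Lagrange equation. To establish the nature of the optimum (maximum or minimum), it is required to examine the second variation. In the relation (2.3.7) for the increment, consider the terms corresponding to the second variation [120],
$$\delta^2 J = \int_{t_0}^{t_f} \frac{1}{2!}\left[\left(\frac{\partial^2 V}{\partial x^2}\right)_* (\delta x(t))^2 + \left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* (\delta\dot{x}(t))^2 + 2\left(\frac{\partial^2 V}{\partial x \partial \dot{x}}\right)_* \delta x(t)\delta\dot{x}(t)\right] dt. \tag{2.4.1}$$
Consider the last term in the previous equation and rewrite it in terms of $\delta x(t)$ only using integration by parts ($\int u\,dv = uv - \int v\,du$, where $u = \left(\frac{\partial^2 V}{\partial x \partial \dot{x}}\right)_* \delta x(t)$ and $v = \delta x(t)$). Then using $\delta x(t_0) = \delta x(t_f) = 0$ for fixed-end conditions, we get
$$\delta^2 J = \frac{1}{2}\int_{t_0}^{t_f} \left[\left\{\left(\frac{\partial^2 V}{\partial x^2}\right)_* - \frac{d}{dt}\left(\frac{\partial^2 V}{\partial x \partial \dot{x}}\right)_*\right\}(\delta x(t))^2 + \left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* (\delta\dot{x}(t))^2\right] dt. \tag{2.4.2}$$
According to Theorem 2.1, the fundamental theorem of the calculus of variations, the sufficient condition for a minimum is $\delta^2 J > 0$. This, for arbitrary values of $\delta x(t)$ and $\delta\dot{x}(t)$, means that
$$\left(\frac{\partial^2 V}{\partial x^2}\right)_* - \frac{d}{dt}\left(\frac{\partial^2 V}{\partial x \partial \dot{x}}\right)_* > 0, \tag{2.4.3}$$
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* > 0. \tag{2.4.4}$$
For maximum, the signs of the previous conditions are reversed. Alternatively, we can rewrite the second variation (2.4.1) in matrix form as
$$\delta^2 J = \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix}\delta x(t) & \delta\dot{x}(t)\end{bmatrix} \begin{bmatrix}\dfrac{\partial^2 V}{\partial x^2} & \dfrac{\partial^2 V}{\partial x \partial \dot{x}} \\ \dfrac{\partial^2 V}{\partial x \partial \dot{x}} & \dfrac{\partial^2 V}{\partial \dot{x}^2}\end{bmatrix}_* \begin{bmatrix}\delta x(t) \\ \delta\dot{x}(t)\end{bmatrix} dt = \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix}\delta x(t) & \delta\dot{x}(t)\end{bmatrix} \Pi \begin{bmatrix}\delta x(t) \\ \delta\dot{x}(t)\end{bmatrix} dt \tag{2.4.5}$$
where
$$\Pi = \begin{bmatrix}\dfrac{\partial^2 V}{\partial x^2} & \dfrac{\partial^2 V}{\partial x \partial \dot{x}} \\ \dfrac{\partial^2 V}{\partial x \partial \dot{x}} & \dfrac{\partial^2 V}{\partial \dot{x}^2}\end{bmatrix}_*. \tag{2.4.6}$$
If the matrix $\Pi$ in the previous equation is positive (negative) definite, we establish a minimum (maximum). In many cases, since $\delta x(t)$ is arbitrary, the coefficient of $(\delta\dot{x}(t))^2$, i.e., $\partial^2 V/\partial \dot{x}^2$, determines the sign of $\delta^2 J$. That is, the sign of the second variation agrees with the sign of $\partial^2 V/\partial \dot{x}^2$. Thus, for minimization we require
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* > 0. \tag{2.4.7}$$
For maximization, the sign of the previous condition reverses. In the literature, this condition is called the Legendre condition [126].

In 1786, Legendre obtained this result of deciding whether a given optimum is a maximum or a minimum by examining the second variation. The second variation technique was further generalized by Jacobi in 1836, and hence this condition is usually called the Legendre-Jacobi condition.

Example 2.9
Verify that the straight line represents the minimum distance between two points.
Solution: This is an obvious solution; however, we illustrate the second variation. Earlier, in Example 2.7, we formulated a functional for the distance between two points as
$$J = \int_{t_0}^{t_f} \sqrt{1 + \dot{x}^2(t)}\,dt \tag{2.4.8}$$
and found that the optimum is a straight line $x^*(t) = C_1 t + C_2$. To check the sufficiency condition (2.4.7), we find
$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* = \frac{1}{\left[1 + \dot{x}^{*2}(t)\right]^{3/2}}. \tag{2.4.9}$$
Since $\dot{x}^*(t)$ is a constant (+ve or -ve), the previous expression is positive and satisfies the condition (2.4.7). Hence, the distance between two points as given by $x^*(t)$ (straight line) is minimum.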
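The Legendre condition for the arc-length problem can also be examined numerically. The following sketch, which is an illustration rather than part of the original text, estimates $\partial^2 V/\partial \dot{x}^2$ for $V = \sqrt{1 + \dot{x}^2}$ by a central difference and compares the length of the straight line $x^*(t) = 2t + 1$ (the boundary values used in Example 2.7) with that of a bent path through the same end points. The sinusoidal bend and its amplitude are hypothetical choices.

```python
import math


def V(xdot):
    """Integrand of the arc-length functional: sqrt(1 + xdot^2)."""
    return math.sqrt(1.0 + xdot ** 2)


def d2V(xdot, h=1e-4):
    """Central-difference estimate of the second derivative of V w.r.t. xdot."""
    return (V(xdot + h) - 2.0 * V(xdot) + V(xdot - h)) / h ** 2


def arc_length(x, t0=0.0, tf=2.0, n=4000):
    """Length of the polyline through n + 1 equally spaced samples of x(t)."""
    h = (tf - t0) / n
    return sum(math.hypot(h, x(t0 + (i + 1) * h) - x(t0 + i * h)) for i in range(n))


line = lambda t: 2.0 * t + 1.0                                  # x*(t) with x(0) = 1, x(2) = 5
bent = lambda t: line(t) + 0.3 * math.sin(math.pi * t / 2.0)    # hypothetical path, same end points

for slope in (-3.0, 0.0, 2.0):
    # the estimate stays positive for every slope, as (2.4.9) predicts
    print(slope, d2V(slope), 1.0 / (1.0 + slope ** 2) ** 1.5)
print("straight:", arc_length(line), "bent:", arc_length(bent))
```

A positive second derivative for every slope, together with the longer length of the bent path, agrees with the conclusion that the straight line gives a minimum.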
which is in the form of a plant equation.5. This section is motivated by [6. h = Ao . from the volume relation (2. 2. first by direct method and then by Lagrange multiplier method.2). we eliminate one of the variables. Example 2. the manufacturer wants to minimize the volume of the material used and hence part of the cost for the tank. Let d and h be the diameter and the height of the circular tank.5. Our intent is to maximize V(d.5. We discuss two methods: first one is called the Direct Method using simple calculus and the second one is called Lagrange Multiplier Method using the Lagrange multiplier method.1) using the area relation (2. we begin the second stage of optimal control. In other words.5. (2.5.10 A manufacturer wants to maximize the volume of the material stored in a circular tank subject to the condition that the material used for the tank is limited (constant). lower and side) of the tank is (2. for a constant thickness of the material.wd /2 2 7rd . Then the volume contained by the tank is V(d. say h.5 Extrema of Functions with Conditions 41 Next. h) keeping A(d. h) = A o. this condition implies that the crosssectional area of the tank material is constant. The plant takes the form of state equation leading us to optimal control of dynamic systems. Let us note that we consider this simple example only to illustrate some basic concepts associated with conditional extremization [120]. where Ao is a given constant. From (2. 120.5 Extrema of Functions with Conditions We begin with an example of finding the extrema of a function under a condition (or constraint). 1 Direct Method: In solving for the optimum value directly.2). We consider optimization (or extremization) of a functional with a plant. h) = wd2 h/4 (2.2) A(d. Solution: If a fixed metal thickness is assumed. the condition is embedded in the original function to be extremized. we address the extremization of a functional with some condition. 79. h) = 2wd 2 /4 + wdh = Ao.3) .2. Thus. 
We solve this example with two methods.1) and the crosssectional surface area (upper. By doing so. 108]. which is considered as a constraint or a condition along with the functional.5.
Using the relation (2.5.3) for height in the relation (2.5.1) for volume,

$$V(d) = \frac{A_0 d}{4} - \frac{\pi d^3}{8}. \tag{2.5.4}$$

Now, to find the extrema of this simple calculus problem, we differentiate (2.5.4) w.r.t. $d$ and set it to zero to get

$$\frac{A_0}{4} - \frac{3}{8}\pi d^2 = 0. \tag{2.5.5}$$

Solving, we get the optimal value of $d$ as

$$d^* = \sqrt{\frac{2A_0}{3\pi}}. \tag{2.5.6}$$

By demanding that, as per the Definition 2.3 for optimum of a function, the second derivative of $V$ w.r.t. $d$ in (2.5.4) be negative for maximum, we can easily see that the positive value of the square root function corresponds to the maximum value of the function. Substituting the optimal value of the diameter (2.5.6) in the original cross-sectional area given by (2.5.2), and solving for the optimum $h^*$, we get

$$h^* = \sqrt{\frac{2A_0}{3\pi}}. \tag{2.5.7}$$

Thus, we see from (2.5.6) and (2.5.7) that the volume stored by a tank is maximized if the height of the tank is made equal to its diameter.

2 Lagrange Multiplier Method: Now we solve the above problem by applying the Lagrange multiplier method. We form a new function to be extremized by adjoining a given condition to the original function. The new adjoined function is extremized in the normal way by taking the partial derivatives w.r.t. all its variables, making them equal to zero, and solving for these variables which are extremals. Let the original volume relation (2.5.1) to be extremized be rewritten as

$$f(d,h) = \pi d^2 h/4 \tag{2.5.8}$$

and the condition (2.5.2) to be satisfied as

$$g(d,h) = 2\pi d^2/4 + \pi d h - A_0 = 0. \tag{2.5.9}$$

Then a new adjoint function $\mathcal{L}$ (called Lagrangian) is formed as

$$\mathcal{L}(d,h,\lambda) = f(d,h) + \lambda g(d,h) = \pi d^2 h/4 + \lambda\left(2\pi d^2/4 + \pi d h - A_0\right) \tag{2.5.10}$$
where $\lambda$, a parameter yet to be determined, is called the Lagrange multiplier. Now, since the Lagrangian $\mathcal{L}$ is a function of three optimal variables $d$, $h$, and $\lambda$, we take the partial derivatives of $\mathcal{L}(d,h,\lambda)$ w.r.t. each of the variables $d$, $h$ and $\lambda$ and set them to zero. Thus,

$$\frac{\partial\mathcal{L}}{\partial d} = \pi d h/2 + \lambda(\pi d + \pi h) = 0 \tag{2.5.11}$$

$$\frac{\partial\mathcal{L}}{\partial h} = \pi d^2/4 + \lambda(\pi d) = 0 \tag{2.5.12}$$

$$\frac{\partial\mathcal{L}}{\partial\lambda} = 2\pi d^2/4 + \pi d h - A_0 = 0. \tag{2.5.13}$$

Now, solving the previous three relations (2.5.11) to (2.5.13) for the three variables $d^*$, $h^*$, and $\lambda^*$, we get

$$d^* = \sqrt{\frac{2A_0}{3\pi}}; \quad h^* = \sqrt{\frac{2A_0}{3\pi}}; \quad \lambda^* = -\sqrt{\frac{A_0}{24\pi}}. \tag{2.5.14}$$

Once again, to maximize the volume of a cylindrical tank, we need to have the height ($h^*$) equal to the diameter ($d^*$) of the tank. Note that we need to take the negative value of the square root function for $\lambda$ in (2.5.14) in order to satisfy the physical requirement that the diameter $d$ obtained from (2.5.12) as

$$d = -4\lambda \tag{2.5.15}$$

is a positive value.

2.5.1 Direct Method

Now we generalize the preceding method of elimination using differential calculus. Consider the extrema of a function $f(x_1,x_2)$ with two interdependent variables $x_1$ and $x_2$, subject to the condition

$$g(x_1,x_2) = 0. \tag{2.5.16}$$

As a necessary condition for extrema, we have

$$df = \frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 = 0. \tag{2.5.17}$$

However, since $dx_1$ and $dx_2$ are not arbitrary, but related by the condition

$$dg = \frac{\partial g}{\partial x_1}dx_1 + \frac{\partial g}{\partial x_2}dx_2 = 0, \tag{2.5.18}$$
it is not possible to conclude, as in the case of extremization of functions without conditions, that

$$\frac{\partial f}{\partial x_1} = 0 \quad\text{and}\quad \frac{\partial f}{\partial x_2} = 0. \tag{2.5.19}$$

This is easily seen, since if the set of extrema conditions (2.5.19) is solved for optimal values $x_1^*$ and $x_2^*$, there is no guarantee that these optimal values would, in general, satisfy the given condition (2.5.16).

In order to find optimal values that satisfy both the condition (2.5.16) and the extrema condition (2.5.17), we arbitrarily choose one of the variables, say $x_1$, as the independent variable. Then $x_2$ becomes a dependent variable as per the condition (2.5.16). Now, assuming that $\partial g/\partial x_2 \neq 0$, the relation (2.5.18) becomes

$$dx_2 = -\left\{\frac{\partial g/\partial x_1}{\partial g/\partial x_2}\right\}dx_1 \tag{2.5.20}$$

and using (2.5.20) in the necessary condition (2.5.17), we have

$$\left[\frac{\partial f}{\partial x_1} - \frac{\partial f}{\partial x_2}\left\{\frac{\partial g/\partial x_1}{\partial g/\partial x_2}\right\}\right]dx_1 = 0. \tag{2.5.21}$$

As we have chosen $dx_1$ to be the independent differential, we now can consider it to be arbitrary, and conclude that in order to satisfy (2.5.21), we have the coefficient of $dx_1$ to be zero. That is

$$\left(\frac{\partial f}{\partial x_1}\right)\left(\frac{\partial g}{\partial x_2}\right) - \left(\frac{\partial f}{\partial x_2}\right)\left(\frac{\partial g}{\partial x_1}\right) = 0. \tag{2.5.22}$$

Now, the relation (2.5.22) and the condition (2.5.16) are solved simultaneously for the optimal solutions $x_1^*$ and $x_2^*$. Equation (2.5.22) can be rewritten as

$$\begin{vmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} \\[2mm] \dfrac{\partial g}{\partial x_1} & \dfrac{\partial g}{\partial x_2} \end{vmatrix} = 0. \tag{2.5.23}$$

This is also, as we know, the Jacobian of $f$ and $g$ w.r.t. $x_1$ and $x_2$. This method of elimination of the dependent variables is quite tedious for higher order problems.
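As a numerical illustration of the direct method, the tank problem of Example 2.10 can be checked against the condition (2.5.22). The following is only a minimal sketch: the area value `A0 = 6.0` and the function names `tank_volume` and `jacobian_det` are assumptions introduced here for illustration and do not come from the text.

```python
import math

A0 = 6.0  # hypothetical surface area, chosen only for illustration


def tank_volume(d, area):
    """Volume V(d) of (2.5.4): h is eliminated with the area relation (2.5.3)."""
    h = (area - math.pi * d**2 / 2) / (math.pi * d)
    return math.pi * d**2 * h / 4


def jacobian_det(d, h):
    """Left side of (2.5.22) for f = pi d^2 h / 4 and g = pi d^2 / 2 + pi d h - A0."""
    f_d = math.pi * d * h / 2        # df/dd
    f_h = math.pi * d**2 / 4         # df/dh
    g_d = math.pi * d + math.pi * h  # dg/dd
    g_h = math.pi * d                # dg/dh
    return f_d * g_h - f_h * g_d


# Closed-form optimum (2.5.6), and the height recovered from (2.5.3).
d_star = math.sqrt(2 * A0 / (3 * math.pi))
h_star = (A0 - math.pi * d_star**2 / 2) / (math.pi * d_star)

print("d* =", d_star, " h* =", h_star)
print("Jacobian determinant at the optimum:", jacobian_det(d_star, h_star))
print("V(0.9 d*), V(d*), V(1.1 d*):",
      tank_volume(0.9 * d_star, A0),
      tank_volume(d_star, A0),
      tank_volume(1.1 * d_star, A0))
```

For this particular pair $f$, $g$ the determinant factors as $(\pi^2 d^2/4)(h - d)$, so it vanishes exactly when the height equals the diameter, in agreement with (2.5.6) and (2.5.7).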
2.5.2 Lagrange Multiplier Method

We now generalize the second method of solving the same problem of extrema of functions with conditions. Consider again the extrema of the function $f(x_1,x_2)$ subject to the condition

$$g(x_1,x_2) = 0. \tag{2.5.24}$$

In this method, we form an augmented Lagrangian function

$$\mathcal{L}(x_1,x_2,\lambda) = f(x_1,x_2) + \lambda g(x_1,x_2) \tag{2.5.25}$$

where $\lambda$, a parameter (multiplier) yet to be determined, is the Lagrange multiplier. Let us note that using the given condition (2.5.24) in the Lagrangian (2.5.25), we have

$$\mathcal{L}(x_1,x_2,\lambda) = f(x_1,x_2) \tag{2.5.26}$$

and therefore a necessary condition for extrema is that

$$df = d\mathcal{L} = 0. \tag{2.5.27}$$

Accepting the idea that the Lagrangian (2.5.25) is a better representation of the entire problem than the equation (2.5.26) in finding the extrema, we have from the Lagrangian relation (2.5.25)

$$d\mathcal{L} = df + \lambda\,dg = 0. \tag{2.5.28}$$

Using (2.5.17) and (2.5.18) in (2.5.28), and rearranging,

$$\left[\frac{\partial f}{\partial x_1} + \lambda\frac{\partial g}{\partial x_1}\right]dx_1 + \left[\frac{\partial f}{\partial x_2} + \lambda\frac{\partial g}{\partial x_2}\right]dx_2 = 0. \tag{2.5.29}$$

Now $dx_1$ and $dx_2$ are not both independent, and hence we cannot immediately conclude that each of the coefficients of $dx_1$ and $dx_2$ in (2.5.29) must be zero. Let us choose $dx_1$ to be the independent differential; then $dx_2$ becomes a dependent differential as per (2.5.18). Further, let us choose the multiplier $\lambda$, which has been introduced by us and is at our disposal, to make one of the coefficients of $dx_1$ or $dx_2$ in (2.5.29) zero. For example, let $\lambda$ take on the value $\lambda^*$ that makes the coefficient of the dependent differential $dx_2$ equal zero, that is

$$\frac{\partial f}{\partial x_2} + \lambda^*\frac{\partial g}{\partial x_2} = 0. \tag{2.5.30}$$
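With (2.5.30), the equation (2.5.29) reduces to

$$\left[\frac{\partial f}{\partial x_1} + \lambda^*\frac{\partial g}{\partial x_1}\right]dx_1 = 0. \tag{2.5.31}$$

Since $dx_1$ is the independent differential, it can be varied arbitrarily. Hence, for (2.5.31) to be satisfied for all $dx_1$, the coefficient of $dx_1$ must be zero. That is

$$\frac{\partial f}{\partial x_1} + \lambda^*\frac{\partial g}{\partial x_1} = 0. \tag{2.5.32}$$

Now from (2.5.25), note that

$$\frac{\partial\mathcal{L}}{\partial\lambda} = g(x_1,x_2) = 0 \tag{2.5.33}$$

yields the constraint relation (2.5.16). Combining the results from (2.5.32), (2.5.30), and (2.5.33), we have

$$\frac{\partial\mathcal{L}}{\partial x_1} = \frac{\partial f}{\partial x_1} + \lambda^*\frac{\partial g}{\partial x_1} = 0 \tag{2.5.34}$$

$$\frac{\partial\mathcal{L}}{\partial x_2} = \frac{\partial f}{\partial x_2} + \lambda^*\frac{\partial g}{\partial x_2} = 0 \tag{2.5.35}$$

$$\frac{\partial\mathcal{L}}{\partial\lambda} = g(x_1^*,x_2^*) = 0. \tag{2.5.36}$$

The preceding three equations are to be solved simultaneously to obtain $x_1^*$, $x_2^*$, and $\lambda^*$. Let us note that the necessary conditions (2.5.34) and (2.5.35) are just the same equations which would have been obtained from considering the differentials $dx_1$ and $dx_2$ as though they were independent in (2.5.29). Introduction of the multiplier $\lambda$ has allowed us to treat all the variables in the augmented function $\mathcal{L}(x_1,x_2,\lambda)$ as though each variable is independent. Thus, the multiplier $\lambda$ has acted like a catalyst, appearing in the intermediate stage only. By eliminating $\lambda^*$ between (2.5.34) and (2.5.35), we get

$$\left(\frac{\partial f}{\partial x_1}\right)\left(\frac{\partial g}{\partial x_2}\right) - \left(\frac{\partial f}{\partial x_2}\right)\left(\frac{\partial g}{\partial x_1}\right) = 0 \tag{2.5.37}$$

which is the same condition as (2.5.22) obtained by the direct method, thus indicating that we have the same result by the Lagrange multiplier method.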
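The idea of treating $d$, $h$ and $\lambda$ as though they were independent can be checked numerically for the tank problem: the gradient of the Lagrangian (2.5.10) with respect to all three variables should vanish at the solution (2.5.14). The sketch below uses central differences; the value `A0 = 6.0`, the step size, and the helper names `lagrangian` and `gradient` are assumptions made here for illustration only.

```python
import math

A0 = 6.0  # hypothetical surface area, chosen only for illustration


def lagrangian(d, h, lam, area):
    """Augmented function (2.5.10): L = f + lam * g."""
    f = math.pi * d**2 * h / 4
    g = math.pi * d**2 / 2 + math.pi * d * h - area
    return f + lam * g


def gradient(func, point, step=1e-6):
    """Central-difference gradient of func at point (a list of floats)."""
    grads = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += step
        down[i] -= step
        grads.append((func(*up) - func(*down)) / (2 * step))
    return grads


# Closed-form solution (2.5.14); note the negative root for the multiplier.
d_star = math.sqrt(2 * A0 / (3 * math.pi))
h_star = d_star
lam_star = -math.sqrt(A0 / (24 * math.pi))

grad = gradient(lambda d, h, lam: lagrangian(d, h, lam, A0),
                [d_star, h_star, lam_star])
print("dL/dd, dL/dh, dL/dlambda at the optimum:", grad)
```

All three components correspond to (2.5.11), (2.5.12) and (2.5.13) respectively; the last one is simply the constraint $g = 0$.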
Summarizing, the extrema of a function $f(x_1,x_2)$ subject to the condition (or constraint) $g(x_1,x_2) = 0$ is equivalent to extrema of a single augmented function $\mathcal{L}(x_1,x_2,\lambda) = f(x_1,x_2) + \lambda g(x_1,x_2)$ as though $x_1$, $x_2$ and $\lambda$ are independent. We now generalize this result.

THEOREM 2.2

Consider the extrema of a continuous, real-valued function $f(x) = f(x_1,x_2,\ldots,x_n)$ subject to the conditions

$$\begin{aligned} g_1(x) &= g_1(x_1,x_2,\ldots,x_n) = 0 \\ g_2(x) &= g_2(x_1,x_2,\ldots,x_n) = 0 \\ &\;\;\vdots \\ g_m(x) &= g_m(x_1,x_2,\ldots,x_n) = 0 \end{aligned} \tag{2.5.38}$$

where $f$ and $g$ have continuous partial derivatives, and $m < n$. Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the Lagrange multipliers corresponding to the $m$ conditions, such that the augmented Lagrangian function is formed as

$$\mathcal{L}(x,\lambda) = f(x) + \lambda' g(x), \tag{2.5.39}$$

where $\lambda'$ is the transpose of $\lambda$. Then, the optimal values $x^*$ and $\lambda^*$ are the solutions of the following $n + m$ equations

$$\frac{\partial\mathcal{L}}{\partial x} = \frac{\partial f}{\partial x} + \lambda'\frac{\partial g}{\partial x} = 0 \tag{2.5.40}$$

$$\frac{\partial\mathcal{L}}{\partial\lambda} = g(x) = 0. \tag{2.5.41}$$

Features of Lagrange Multiplier: The Lagrange multiplier method is a powerful one in finding the extrema of functions subject to conditions. It has the following attractive features:

1. The importance of the Lagrange multiplier technique lies in the fact that the problem of determining the extrema of the function $f(x)$ subject to the conditions $g(x) = 0$ is embedded within the simple problem of determining the extrema of the simple augmented function $\mathcal{L}(x,\lambda) = f(x) + \lambda' g(x)$.

2. Introduction of the Lagrange multiplier allows us to treat all the variables $x$ and $\lambda$ in the augmented function $\mathcal{L}(x,\lambda)$ as though each were independent.
3. The multiplier $\lambda$ acts like a catalyst in the sense that it is introduced to perform a certain duty as given by item 2.

4. The increased dimensionality ($n + m$), which is characteristic of the Lagrange multiplier method, is generally more than compensated by the relative simplicity and systematic procedure of the technique.

The multiplier method was given by Lagrange in 1788.

2.6 Extrema of Functionals with Conditions

In this section, we extend our ideas to functionals based on those developed in the last section for functions. First, we consider a functional with two variables, use the results of the previous section on the CoV, derive the necessary conditions and then extend the same for a general $n$th order vector case. Consider the extremization of the performance index in the form of a functional

$$J = \int_{t_0}^{t_f} V(x_1(t),x_2(t),\dot{x}_1(t),\dot{x}_2(t),t)\,dt \tag{2.6.1}$$

subject to the condition (plant or system equation)

$$g(x_1(t),x_2(t),\dot{x}_1(t),\dot{x}_2(t)) = 0 \tag{2.6.2}$$

with fixed-end-point conditions

$$x_1(t_0) = x_{10}; \quad x_2(t_0) = x_{20}; \quad x_1(t_f) = x_{1f}; \quad x_2(t_f) = x_{2f}. \tag{2.6.3}$$

Now we address this problem under the following steps.

• Step 1: Lagrangian
• Step 2: Variations and Increment
• Step 3: First Variation
• Step 4: Fundamental Theorem
• Step 5: Fundamental Lemma
• Step 6: Euler-Lagrange Equation
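• Step 1: Lagrangian: We form an augmented functional

$$J_a = \int_{t_0}^{t_f} \mathcal{L}(x_1(t),x_2(t),\dot{x}_1(t),\dot{x}_2(t),\lambda(t),t)\,dt \tag{2.6.4}$$

where $\lambda(t)$ is the Lagrange multiplier, and the Lagrangian $\mathcal{L}$ is defined as

$$\mathcal{L} = \mathcal{L}(x_1(t),x_2(t),\dot{x}_1(t),\dot{x}_2(t),\lambda(t),t) = V(x_1(t),x_2(t),\dot{x}_1(t),\dot{x}_2(t),t) + \lambda(t)\,g(x_1(t),x_2(t),\dot{x}_1(t),\dot{x}_2(t)). \tag{2.6.5}$$

Note from the performance index (2.6.1) and the augmented performance index (2.6.4) that $J_a = J$ if the condition (2.6.2) is satisfied for any $\lambda(t)$.

• Step 2: Variations and Increment: Next, assume optimal values and then consider the variations and increment as

$$\begin{aligned} x_i(t) &= x_i^*(t) + \delta x_i(t), \quad \dot{x}_i(t) = \dot{x}_i^*(t) + \delta\dot{x}_i(t), \\ \Delta J_a &= J_a(x_i^*(t) + \delta x_i(t), \dot{x}_i^*(t) + \delta\dot{x}_i(t), t) - J_a(x_i^*(t), \dot{x}_i^*(t), t), \end{aligned} \tag{2.6.6}$$

for $i = 1, 2$.

• Step 3: First Variation: Then using the Taylor series expansion and retaining linear terms only, the first variation of the functional $J_a$ becomes

$$\delta J_a = \int_{t_0}^{t_f}\left[\left(\frac{\partial\mathcal{L}}{\partial x_1}\right)_*\delta x_1(t) + \left(\frac{\partial\mathcal{L}}{\partial x_2}\right)_*\delta x_2(t) + \left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_*\delta\dot{x}_1(t) + \left(\frac{\partial\mathcal{L}}{\partial\dot{x}_2}\right)_*\delta\dot{x}_2(t)\right]dt. \tag{2.6.7}$$

As before in the section on CoV, we rewrite the terms containing $\delta\dot{x}_1(t)$ and $\delta\dot{x}_2(t)$ in terms of those containing $\delta x_1(t)$ and $\delta x_2(t)$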
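only (using integration by parts, $\int u\,dv = uv - \int v\,du$). Thus

$$\int_{t_0}^{t_f}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_*\delta\dot{x}_1(t)\,dt = \left[\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_*\delta x_1(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f}\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_*\delta x_1(t)\,dt, \tag{2.6.8}$$

and similarly for the term containing $\delta\dot{x}_2(t)$. Using the above, we have the first variation (2.6.7) as

$$\begin{aligned} \delta J_a = {} & \int_{t_0}^{t_f}\left[\left(\frac{\partial\mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_*\right]\delta x_1(t)\,dt + \int_{t_0}^{t_f}\left[\left(\frac{\partial\mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_2}\right)_*\right]\delta x_2(t)\,dt \\ & + \left[\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_*\delta x_1(t)\right]_{t_0}^{t_f} + \left[\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_2}\right)_*\delta x_2(t)\right]_{t_0}^{t_f}. \end{aligned} \tag{2.6.9}$$

Since this is a fixed-final time and fixed-final state problem as given by (2.6.3), no variations are allowed at the end points. This means

$$\delta x_1(t_0) = \delta x_2(t_0) = \delta x_1(t_f) = \delta x_2(t_f) = 0. \tag{2.6.10}$$

Using the boundary variations (2.6.10) in the augmented first variation (2.6.9), we have

$$\delta J_a = \int_{t_0}^{t_f}\left[\left(\frac{\partial\mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_*\right]\delta x_1(t)\,dt + \int_{t_0}^{t_f}\left[\left(\frac{\partial\mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_2}\right)_*\right]\delta x_2(t)\,dt. \tag{2.6.11}$$

• Step 4: Fundamental Theorem: Now, we proceed as follows.

1. We invoke the fundamental theorem of the calculus of variations (Theorem 2.1) and make the first variation (2.6.11) equal to zero.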
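2. Remembering that $\delta x_1(t)$ and $\delta x_2(t)$ are not both independent, because $x_1(t)$ and $x_2(t)$ are related by the condition (2.6.2), we choose $\delta x_2(t)$ as the independent variation and $\delta x_1(t)$ as the dependent variation.

3. Let us choose the multiplier $\lambda^*(t)$, which is arbitrarily introduced and is at our disposal, in such a way that the coefficient of the dependent variation $\delta x_1(t)$ in (2.6.11) vanishes. That is

$$\left(\frac{\partial\mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_* = 0. \tag{2.6.12}$$

With these choices, the first variation (2.6.11) becomes

$$\int_{t_0}^{t_f}\left[\left(\frac{\partial\mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_2}\right)_*\right]\delta x_2(t)\,dt = 0. \tag{2.6.13}$$

• Step 5: Fundamental Lemma: Using the fundamental lemma of CoV (Lemma 2.1) and noting that $\delta x_2(t)$ has been chosen to be the independent variation and is hence arbitrary, the only way (2.6.13) can be satisfied, in general, is that the coefficient of $\delta x_2(t)$ also vanishes. That is

$$\left(\frac{\partial\mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_2}\right)_* = 0. \tag{2.6.14}$$

Also, from the Lagrangian (2.6.5) note that

$$\left(\frac{\partial\mathcal{L}}{\partial\lambda}\right)_* = 0 \tag{2.6.15}$$

yields the constraint relation (2.6.2).

• Step 6: Euler-Lagrange Equation: Combining the various relations (2.6.12), (2.6.14), and (2.6.15), the necessary conditions for extremization of the functional (2.6.1) subject to the condition (2.6.2) (according to the Euler-Lagrange equation) are

$$\left(\frac{\partial\mathcal{L}}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_1}\right)_* = 0 \tag{2.6.16}$$

$$\left(\frac{\partial\mathcal{L}}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}_2}\right)_* = 0 \tag{2.6.17}$$

$$\left(\frac{\partial\mathcal{L}}{\partial\lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{\lambda}}\right)_* = 0. \tag{2.6.18}$$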
Let us note that these conditions are just the ones that would have been obtained from the Lagrangian (2.6.5), as if both $\delta x_1(t)$ and $\delta x_2(t)$ had been independent. Also, in (2.6.18), the Lagrangian $\mathcal{L}$ is independent of $\dot{\lambda}(t)$ and hence the condition (2.6.18) is really the given plant equation (2.6.2). Thus, the introduction of the Lagrange multiplier $\lambda(t)$ has enabled us to treat the variables $x_1(t)$ and $x_2(t)$ as though they were independent, in spite of the fact that they are related by the condition (2.6.2). The solution of the two second-order differential equations (2.6.16) and (2.6.17) and the condition relation (2.6.2) or (2.6.18), along with the boundary conditions (2.6.3), gives the optimal solutions $x_1^*(t)$, $x_2^*(t)$, and $\lambda^*(t)$.

Now, we generalize the preceding procedure for an $n$th order system. Consider the extremization of a functional

$$J = \int_{t_0}^{t_f} V(x(t),\dot{x}(t),t)\,dt \tag{2.6.19}$$

where $x(t)$ is an $n$th order state vector, subject to the plant equation (condition)

$$g_i(x(t),\dot{x}(t),t) = 0; \quad i = 1, 2, \ldots, m \tag{2.6.20}$$

and boundary conditions, $x(0)$ and $x(t_f)$. We form an augmented functional

$$J_a = \int_{t_0}^{t_f} \mathcal{L}(x(t),\dot{x}(t),\lambda(t),t)\,dt \tag{2.6.21}$$

where the Lagrangian $\mathcal{L}$ is given by

$$\mathcal{L}(x(t),\dot{x}(t),\lambda(t),t) = V(x(t),\dot{x}(t),t) + \lambda'(t)\,g(x(t),\dot{x}(t),t) \tag{2.6.22}$$

and the Lagrange multiplier $\lambda(t) = [\lambda_1(t), \lambda_2(t), \ldots, \lambda_m(t)]'$. We now apply the Euler-Lagrange equation on $J_a$ to yield

$$\left(\frac{\partial\mathcal{L}}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_* = 0, \tag{2.6.23}$$

$$\left(\frac{\partial\mathcal{L}}{\partial\lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{\lambda}}\right)_* = 0 \;\longrightarrow\; g_i(x(t),\dot{x}(t),t) = 0. \tag{2.6.24}$$
Note that from (2.6.22), the Lagrangian $\mathcal{L}$ is independent of $\dot{\lambda}(t)$ and hence the Euler-Lagrange equation (2.6.24) is nothing but the given relation regarding the plant or the system (2.6.20). Thus, we solve the Euler-Lagrange equation (2.6.23) along with the given boundary conditions. Let us now illustrate the preceding method by a simple example.

Example 2.11

Minimize the performance index

$$J = \int_0^1 \left[x^2(t) + u^2(t)\right]dt \tag{2.6.25}$$

with boundary conditions

$$x(0) = 1; \quad x(1) = 0 \tag{2.6.26}$$

subject to the condition (plant equation)

$$\dot{x}(t) = -x(t) + u(t). \tag{2.6.27}$$

Solution: Let us solve this problem by the two methods, i.e., the direct method and the Lagrange multiplier method.

1 Direct Method: Here, we eliminate $u(t)$ between the performance index (2.6.25) and the plant (2.6.27) to get the functional as

$$J = \int_0^1 \left[x^2(t) + (\dot{x}(t) + x(t))^2\right]dt = \int_0^1 \left[2x^2(t) + \dot{x}^2(t) + 2x(t)\dot{x}(t)\right]dt. \tag{2.6.28}$$

Now, we notice that the functional (2.6.28) has absorbed the condition (2.6.27) within itself, and we need to consider it as a straightforward extremization of a functional as given earlier. Thus, applying the Euler-Lagrange equation

$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial\dot{x}}\right)_* = 0 \tag{2.6.29}$$

to the functional (2.6.28), where

$$V = 2x^2(t) + \dot{x}^2(t) + 2x(t)\dot{x}(t), \tag{2.6.30}$$
we get

$$4x^*(t) + 2\dot{x}^*(t) - \frac{d}{dt}\left(2\dot{x}^*(t) + 2x^*(t)\right) = 0. \tag{2.6.31}$$

Simplifying the above,

$$\ddot{x}^*(t) - 2x^*(t) = 0, \tag{2.6.32}$$

the solution (see later for use of MATLAB©) of which gives the optimal state as

$$x^*(t) = C_1 e^{-\sqrt{2}t} + C_2 e^{\sqrt{2}t} \tag{2.6.33}$$

where the constants $C_1$ and $C_2$, evaluated using the given boundary conditions (2.6.26), are found to be

$$C_1 = \frac{1}{1 - e^{-2\sqrt{2}}}; \quad C_2 = \frac{1}{1 - e^{2\sqrt{2}}}. \tag{2.6.34}$$

Finally, knowing the optimal $x^*(t)$, the optimal control $u^*(t)$ is found from the plant (2.6.27) to be

$$u^*(t) = \dot{x}^*(t) + x^*(t) = C_1(1 - \sqrt{2})e^{-\sqrt{2}t} + C_2(1 + \sqrt{2})e^{\sqrt{2}t}. \tag{2.6.35}$$

Although the method appears to be simple, let us note that it is not always possible to eliminate $u(t)$ from (2.6.25) and (2.6.27), especially for higher-order systems.

2 Lagrange Multiplier Method: Here, we use the ideas developed in the previous section on the extremization of functions with conditions. Consider the optimization of the functional (2.6.25) with the boundary conditions (2.6.26) under the condition describing the plant (2.6.27). First we rewrite the condition (2.6.27) as

$$g(x(t),\dot{x}(t),u(t)) = \dot{x}(t) + x(t) - u(t) = 0. \tag{2.6.36}$$

Now, we form an augmented functional as

$$J = \int_0^1 \left[x^2(t) + u^2(t) + \lambda(t)\{\dot{x}(t) + x(t) - u(t)\}\right]dt = \int_0^1 \mathcal{L}(x(t),\dot{x}(t),u(t),\lambda(t))\,dt \tag{2.6.37}$$

where $\lambda(t)$ is the Lagrange multiplier, and

$$\mathcal{L}(x(t),\dot{x}(t),u(t),\lambda(t)) = x^2(t) + u^2(t) + \lambda(t)\{\dot{x}(t) + x(t) - u(t)\} \tag{2.6.38}$$
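is the Lagrangian. Now, we apply the Euler-Lagrange equation to the previous Lagrangian to get

$$\left(\frac{\partial\mathcal{L}}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_* = 0 \;\longrightarrow\; 2x^*(t) + \lambda^*(t) - \dot{\lambda}^*(t) = 0 \tag{2.6.39}$$

$$\left(\frac{\partial\mathcal{L}}{\partial u}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{u}}\right)_* = 0 \;\longrightarrow\; 2u^*(t) - \lambda^*(t) = 0 \tag{2.6.40}$$

$$\left(\frac{\partial\mathcal{L}}{\partial\lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{\lambda}}\right)_* = 0 \;\longrightarrow\; \dot{x}^*(t) + x^*(t) - u^*(t) = 0 \tag{2.6.41}$$

and solve for optimal $x^*(t)$, $u^*(t)$, and $\lambda^*(t)$. We get first from (2.6.40) and (2.6.41)

$$\lambda^*(t) = 2u^*(t) = 2(\dot{x}^*(t) + x^*(t)). \tag{2.6.42}$$

Using the equation (2.6.42) in (2.6.39),

$$2x^*(t) + 2(\dot{x}^*(t) + x^*(t)) - 2(\ddot{x}^*(t) + \dot{x}^*(t)) = 0. \tag{2.6.43}$$

Solving the previous equation, we get

$$\ddot{x}^*(t) - 2x^*(t) = 0 \;\longrightarrow\; x^*(t) = C_1 e^{-\sqrt{2}t} + C_2 e^{\sqrt{2}t}. \tag{2.6.44}$$

Once we know $x^*(t)$, we get $\lambda^*(t)$ and hence $u^*(t)$ from (2.6.42) as

$$u^*(t) = \dot{x}^*(t) + x^*(t) = C_1(1 - \sqrt{2})e^{-\sqrt{2}t} + C_2(1 + \sqrt{2})e^{\sqrt{2}t}. \tag{2.6.45}$$

Thus, we get the same results as in the direct method. The constants $C_1$ and $C_2$, evaluated using the boundary conditions (2.6.26), are the same as given in (2.6.34).

The solution for the set of differential equations (2.6.32) with the boundary conditions (2.6.26) for Example 2.11, using the Symbolic Toolbox of MATLAB©, Version 6, is shown below.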
'x(O)=1.+ cosh(2 1/2 2 exp(2 ) .+ cosh(2 t) 1/2 2 exp(2 ) .56 Chapter 2: Calculus of Variations and Optimal Control ******************************************************** x=dsolve('D2x2*x=O'.6.6.1 ********************************************************* It is easy to see that the previous solution for optimal x*(t) is the same as given in (2.\(t) helped us to treat the augmented functional (2.33) and (2.1 (exp(2 1/2 t) 2 1/2 1/2 2 1/2 (exp(2 ) + 1) sinh(2 t) 1/2 t) . Let us note once again that the Lagrange multiplier ..34).6.38) as if it contained independent variables x(t) and u(t).1 u = (exp(2A(1/2))A2+1)/(exp(2A(1/2))A21)*cosh(2A(1/2)*t)*2 A(1/2)+ sinh(2 A(1/2)*t)*2 A(1/2)(exp(2 A(1/2))A2+1)/(exp(2 A(1/2))A2_ 1)*sinh(2A(1/2)*t)+cosh(2A(1/2)*t) 1/2 2 1/2 1/2 ) + 1) cosh(2 t) 2 + sinh(2 exp(2 1/2 2 ) .36). ... although they are dependent as per the plant equation (2.x(1)=O') x = (exp(2 A(1/2))A2+1)/(exp(2 A(1/2))A21)*sinh(2 A(1/2)*t)+ cosh(2A(1/2)*t) 1/2 2 1/2 (exp(2 ) + 1) sinh(2 t) 1/2 .6.
2.7 Variational Approach to Optimal Control Systems

In this section, we approach the optimal control system by variational techniques, and in the process introduce the Hamiltonian function, which was used by Pontryagin and his associates to develop the famous Minimum Principle [109].

2.7.1 Terminal Cost Problem

Here we consider the optimal control system where the performance index is of general form containing a final (terminal) cost function in addition to the integral cost function. Such an optimal control problem is called the Bolza problem. Consider the plant as

$$\dot{x}(t) = f(x(t),u(t),t), \tag{2.7.1}$$

the performance index as

$$J(u(t)) = S(x(t_f),t_f) + \int_{t_0}^{t_f} V(x(t),u(t),t)\,dt \tag{2.7.2}$$

and given boundary conditions as

$$x(t_0) = x_0; \quad x(t_f) \text{ is free and } t_f \text{ is free} \tag{2.7.3}$$

where $x(t)$ and $u(t)$ are $n$- and $r$-dimensional state and control vectors, respectively. This problem of Bolza is the one with the most general form of the performance index. The Lagrange problem was first discussed in 1762, Mayer considered his problem in 1878, and the problem of Bolza was formulated in 1913. Before we begin illustrating the Pontryagin procedure for this problem, let us note that

$$\int_{t_0}^{t_f}\frac{dS(x(t),t)}{dt}\,dt = S(x(t),t)\Big|_{t_0}^{t_f} = S(x(t_f),t_f) - S(x(t_0),t_0). \tag{2.7.4}$$

Using the equation (2.7.4) in the original performance index (2.7.2), we get

$$J_2(u(t)) = \int_{t_0}^{t_f}\left[V(x(t),u(t),t) + \frac{dS}{dt}\right]dt = \int_{t_0}^{t_f} V(x(t),u(t),t)\,dt + S(x(t_f),t_f) - S(x(t_0),t_0). \tag{2.7.5}$$
Since $S(x(t_0),t_0)$ is a fixed quantity, the optimization of the original performance index $J$ in (2.7.2) is equivalent to that of the performance index $J_2$ in (2.7.5). However, the optimal cost given by (2.7.2) is different from the optimal cost (2.7.5). Here, we are interested in finding the optimal control only. Once the optimal control is determined, the optimal cost is found using the original performance index $J$ in (2.7.2) and not $J_2$ in (2.7.5). Also note that

$$\frac{d[S(x(t),t)]}{dt} = \left(\frac{\partial S}{\partial x}\right)'\dot{x}(t) + \frac{\partial S}{\partial t}. \tag{2.7.6}$$

We now illustrate the procedure in the following steps. Also, we first introduce the Lagrangian and then, a little later, introduce the Hamiltonian. Let us first list the various steps and then describe the same in detail.

• Step 1: Assumption of Optimal Conditions
• Step 2: Variations of Control and State Vectors
• Step 3: Lagrange Multiplier
• Step 4: Lagrangian
• Step 5: First Variation
• Step 6: Condition for Extrema
• Step 7: Hamiltonian

• Step 1: Assumptions of Optimal Conditions: We assume optimum values $x^*(t)$ and $u^*(t)$ for state and control, respectively. Then

$$\begin{aligned} J(u^*(t)) &= \int_{t_0}^{t_f}\left[V(x^*(t),u^*(t),t) + \frac{dS(x^*(t),t)}{dt}\right]dt \\ \dot{x}^*(t) &= f(x^*(t),u^*(t),t). \end{aligned} \tag{2.7.7}$$

• Step 2: Variations of Controls and States: We consider the variations (perturbations) in control and state vectors as (see Figure 2.7)

$$x(t) = x^*(t) + \delta x(t); \quad u(t) = u^*(t) + \delta u(t). \tag{2.7.8}$$
t Figure 2. t) + dt dt (2.7. t) + 8x * x*(t) + +. . T .1) and the performance index (2.9) dS] • Step 3: Lagrange Multiplier: Introducing the Lagrange multiplier vector . . t) ...10) . . u*(t) + 6u(t).. u*(t) + 6u(t). .7.5) become ': x*(t) + 6x(t) J(u(t)) = 6t f(x*(t) + 6x(t). ~ .. . .2. . u*(t)..x*(t)}]dt [4 (8S)' (8S) 8t * (2...6)..7.x'(t) {f(x*(t).. . : . u*(t).: .7.7 Variational Approach to Optimal Control Systems x(t) 59 x*(t)+ ox(t) ~__.x(t) (also called costate vector) and using (2.7.7 FreeFinal Time and FreeFinal State System Then. we introduce the augmented performance index at the optimal condition as Ja(u*(t)) = lto [V(x*(t). the state equation (2. .. ... t) f [ [tf+ = ito V(x*(t) + 6x(t).
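and at any other (perturbed) condition as

$$\begin{aligned} J_a(u(t)) = \int_{t_0}^{t_f + \delta t_f}\Big[ & V(x^*(t) + \delta x(t), u^*(t) + \delta u(t), t) + \left(\frac{\partial S}{\partial x}\right)_*'[\dot{x}^*(t) + \delta\dot{x}(t)] + \left(\frac{\partial S}{\partial t}\right)_* \\ & + \lambda'(t)\left[f(x^*(t) + \delta x(t), u^*(t) + \delta u(t), t) - \{\dot{x}^*(t) + \delta\dot{x}(t)\}\right]\Big]dt. \end{aligned} \tag{2.7.11}$$

• Step 4: Lagrangian: Let us define the Lagrangian function at the optimal condition as

$$\begin{aligned} \mathcal{L} = \mathcal{L}(x^*(t),\dot{x}^*(t),u^*(t),\lambda(t),t) = {} & V(x^*(t),u^*(t),t) + \left(\frac{\partial S}{\partial x}\right)_*'\dot{x}^*(t) + \left(\frac{\partial S}{\partial t}\right)_* \\ & + \lambda'(t)\{f(x^*(t),u^*(t),t) - \dot{x}^*(t)\} \end{aligned} \tag{2.7.12}$$

and at any other condition as

$$\begin{aligned} \mathcal{L}^\delta = {} & \mathcal{L}^\delta(x^*(t) + \delta x(t), \dot{x}^*(t) + \delta\dot{x}(t), u^*(t) + \delta u(t), \lambda(t), t) \\ = {} & V(x^*(t) + \delta x(t), u^*(t) + \delta u(t), t) + \left(\frac{\partial S}{\partial x}\right)_*'[\dot{x}^*(t) + \delta\dot{x}(t)] + \left(\frac{\partial S}{\partial t}\right)_* \\ & + \lambda'(t)\left[f(x^*(t) + \delta x(t), u^*(t) + \delta u(t), t) - \{\dot{x}^*(t) + \delta\dot{x}(t)\}\right]. \end{aligned} \tag{2.7.13}$$

With these, the augmented performance index at the optimal and any other condition becomes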
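$$J_a(u^*(t)) = \int_{t_0}^{t_f}\mathcal{L}\,dt; \quad J_a(u(t)) = \int_{t_0}^{t_f + \delta t_f}\mathcal{L}^\delta\,dt = \int_{t_0}^{t_f}\mathcal{L}^\delta\,dt + \int_{t_f}^{t_f + \delta t_f}\mathcal{L}^\delta\,dt. \tag{2.7.14}$$

Using the mean-value theorem and Taylor series, and retaining the linear terms only, we have

$$\int_{t_f}^{t_f + \delta t_f}\mathcal{L}^\delta\,dt \approx \mathcal{L}\big|_{t_f}\,\delta t_f. \tag{2.7.15}$$

• Step 5: First Variation: Defining the increment $\Delta J$, using Taylor series expansion, and extracting the first variation $\delta J$ by retaining only the first order terms, we get the first variation as

$$\begin{aligned} \Delta J &= J_a(u(t)) - J_a(u^*(t)) = \int_{t_0}^{t_f}(\mathcal{L}^\delta - \mathcal{L})\,dt + \mathcal{L}\big|_{t_f}\,\delta t_f \\ \delta J &= \int_{t_0}^{t_f}\left\{\left(\frac{\partial\mathcal{L}}{\partial x}\right)_*'\delta x(t) + \left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_*'\delta\dot{x}(t) + \left(\frac{\partial\mathcal{L}}{\partial u}\right)_*'\delta u(t)\right\}dt + \mathcal{L}\big|_{t_f}\,\delta t_f. \end{aligned} \tag{2.7.16}$$

Considering the $\delta\dot{x}(t)$ term in the first variation (2.7.16) and integrating by parts (using $\int u\,dv = uv - \int v\,du$), we have

$$\int_{t_0}^{t_f}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_*'\delta\dot{x}(t)\,dt = \int_{t_0}^{t_f}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_*'\frac{d}{dt}(\delta x(t))\,dt = \left[\left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_*'\delta x(t)\right]_{t_0}^{t_f} - \int_{t_0}^{t_f}\left[\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_*\right]'\delta x(t)\,dt. \tag{2.7.17}$$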
7. Also. the coefficient of the control variation 8u(t) in (2. Thus.7.19) where the partials are evaluated at the optimal (*) condition. . the first variation (2. using (2.7.:t (~) J + ox(t)dt f (~)~ ou(t)dt + Cit! ot.7.7.21) Let us note that the condition (or plant) equation (2.62 Chapter 2: Calculus of Variations and Optimal Control Also note that since x(to) is specified.12) as (8£) * O.17) the first variation 8J in (2. since the independent control variation 8u(t) is arbitrary. Then.18) be zero.18) reduces to (2. we have the EulerLagrange equation (8£) * _!i (8£) * = 0 8x dt 8x (2.7. + [(~»X(t)] L (2. Next. = (2.7.7.18) should be set to zero. we choose A(t) = A*(t) which is at our disposal and hence £* such that the coefficient of the dependent variation 8x(t) in (2. That is (8£) * = O. First. 8u Finally.7.1). 8x(to) = O. the first variation 8 J should vanish according to the fundamental theorem (Theorem 2.7. we note that 8u(t) is the independent control variation and 8x(t) is the dependent state variation.7.20) (2.22) 8A . in a typical control system such as (2.1) of the CoV.18) • Step 6: Condition for Extrema: For extrema of the functional J.7.7.16) becomes oj = 1:' [(~).1) can be written in terms of the Lagrangian (2.
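[Figure 2.8 Final-Point Condition with a Moving Boundary $\theta(t)$: $x^*(t)$ and $x^*(t) + \delta x(t)$ plotted against $t$, showing $\delta x(t_f)$, $\delta x_f$, $\delta t_f$ and the tangent at the final point $P$]

In order to convert the expression containing $\delta x(t)$ in (2.7.21) into an expression containing $\delta x_f$ (see Figure 2.8), we note that the slope of $x^*(t) + \delta x(t)$ at $t_f$ is approximated as

$$\dot{x}^*(t) + \delta\dot{x}(t) \approx \frac{\delta x_f - \delta x(t_f)}{\delta t_f} \tag{2.7.23}$$

which is rewritten as

$$\delta x_f = \delta x(t_f) + \{\dot{x}^*(t) + \delta\dot{x}(t)\}\,\delta t_f \tag{2.7.24}$$

and retaining only the linear (in $\delta$) terms in the relation (2.7.24), we have

$$\delta x(t_f) = \delta x_f - \dot{x}^*(t_f)\,\delta t_f. \tag{2.7.25}$$

Using (2.7.25) in the boundary condition (2.7.21), we have the general boundary condition in terms of the Lagrangian as

$$\left[\mathcal{L}^* - \left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_*'\dot{x}^*(t)\right]_{t_f}\delta t_f + \left(\frac{\partial\mathcal{L}}{\partial\dot{x}}\right)_*'\bigg|_{t_f}\delta x_f = 0. \tag{2.7.26}$$

• Step 7: Hamiltonian: We define the Hamiltonian $\mathcal{H}^*$ (also called the Pontryagin $\mathcal{H}$ function) at the optimal condition as

$$\mathcal{H}^* = V(x^*(t),u^*(t),t) + \lambda^{*\prime}(t)\,f(x^*(t),u^*(t),t), \tag{2.7.27}$$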
the relation (2.7. t).7. (t) (81t) *+ (8 8)'* + ( 8x8t *.{(88)' A*(t) } d (81t) *+ (828)' x*(t)+ (8 8) dt 8x * 8x &2 * 8x8t 2 2 [(82 * 2 8)' x*(t) 88) .7.7.31) Looking at the similar structure of the relation (2. u*(t). and (2.A * 1= 0 8x 8x2 x*(t) 88) = =O~ leading to ( 8'H) ax * = '\*(t) (2.7. (8£) * = 0 8A + (8'H) = X*(t).7.19)..12) the Lagrangian £* in terms of the Hamiltonian 'H* becomes £* = £*(x*(t). the relation (2.7.*' (t)x*(t).22) and noting that the terminal cost function S = S (x( t).7. 'H* = 'H* (x* (t). u*(t).20) becomes (2.28) in (2. x*(t). A* (t). u* (t). t) + (~:) ~ x*(t) + (~~) *  >.28) Using (2. for the optimal control u*(t).7.7. respectively expressed in terms of the Hamiltonian. state and costate equations.8x2 * + (8x8t *.20).30) and for the costate A* (t). t). Thus. t) = 'H*(x*(t). A*(t). A*(t).7.7.30) for the optimal costate A*(t) and (2.19) becomes (8£) _~ (8£*) 0~ 8x* dt 8x* 2 . we have the control.7. 8A * (2. (2. (2. Then from (2.29) for the optimal state x* (t).31) for the optimal state x*(t) it .64 Chapter 2: Calculus of Variations and Optimal Control where.
is clear why $\lambda(t)$ is called the costate vector. Finally, using the relation (2.7.28), the boundary condition (2.7.26) at the optimal condition reduces to

$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f}\delta t_f + \left[\left(\frac{\partial S}{\partial x}\right)_* - \lambda^*(t)\right]_{t_f}'\delta x_f = 0. \tag{2.7.32}$$

This is the general boundary condition for a free-end point system in terms of the Hamiltonian.

2.7.2 Different Types of Systems

We now obtain different cases depending on the statement of the problem regarding the final time $t_f$ and the final state $x(t_f)$ (see Figure 2.9).

• Type (a): Fixed-Final Time and Fixed-Final State System: Here, since $t_f$ and $x(t_f)$ are fixed or specified (Figure 2.9(a)), both $\delta t_f$ and $\delta x_f$ are zero in the general boundary condition (2.7.32), and there is no extra boundary condition to be used other than those given in the problem formulation.

• Type (b): Free-Final Time and Fixed-Final State System: Since $t_f$ is free or not specified in advance, $\delta t_f$ is arbitrary, and since $x(t_f)$ is fixed or specified, $\delta x_f$ is zero as shown in Figure 2.9(b). Then, the coefficient of the arbitrary $\delta t_f$ in the general boundary condition (2.7.32) is zero, resulting in

$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f} = 0. \tag{2.7.33}$$

• Type (c): Fixed-Final Time and Free-Final State System: Here $t_f$ is specified and $x(t_f)$ is free (see Figure 2.9(c)). Then $\delta t_f$ is zero and $\delta x_f$ is arbitrary, which in turn means that the coefficient of $\delta x_f$ in the general boundary condition (2.7.32) is zero. That is

$$\left[\left(\frac{\partial S}{\partial x}\right)_* - \lambda^*(t)\right]_{t_f} = 0 \;\longrightarrow\; \lambda^*(t_f) = \left(\frac{\partial S}{\partial x}\right)_{*t_f}. \tag{2.7.34}$$
[Figure 2.9 Different Types of Systems: (a) Fixed-Final Time and Fixed-Final State System, (b) Free-Final Time and Fixed-Final State System, (c) Fixed-Final Time and Free-Final State System, (d) Free-Final Time and Free-Final State System. Each panel plots $x(t)$ and $x^*(t) + \delta x(t)$ against $t$, starting from $x_0$ at $t_0$.]
7.39) 2.7.2. Using (2.35).28) and (2.35) Using (2.37).7.14). we have to find a sufficient condition for extremum. we need to consider the second variation and examine its sign. That is • Type (e): FreeFinal Time and Independent FreeFinal State: If t f and x( t f) are not related.7.e.7.7.. then 8tf and 8xf are unrelated. the boundary condition (2.38) (2.36) is zero.7 Variational Approach to Optimal Control Systems 67 • Type (d): FreeFinal Time and Dependent FreeFinal State System: If t f and x( t f) are related such that x( t f) lies on a moving curve 8(t) as shown in Figure 2.32) for the optimal condition becomes Since t f is free. i. whether it is minimum or maximum.32) at the optimal condition becomes (2.7.7.7. In other words.8. we have the second . and the boundary condition (2.3 Sufficient Condition In order to determine the nature of optimization. 8tf is arbitrary and hence the coefficient of 8tf in (2.7. then (2.7. (2.
7. (2. u(t). t) (2. tf) + i t! V(x(t).7.45) .7. But the important condition is that the second partial derivative of 1t* w. u(t) must be positive. where we want to minimize the performance index J = S(x(tf).40) (2.43) to for the plant described by x(t) = f(x(t). t = tf is free and x(tf) is free. This means that the matrix II in (2.44) with the boundary conditions as x(t = to) = Xo.t.7. u(t). the second variation 82 J must be positive.41) must be positive definite. That is (2. we get For the minimum.42) is reversed.16) and using the relation (2.7. t)dt (2.28).7.42) and for the maximum.7.7.4 Summary of Pontryagin Procedure Consider a freefinal time and freefinal state problem with general cost function (Bolza problem). the sign of (2.68 Chapter 2: Calculus of Variations and Optimal Control variation in (2.r.7.7. 2.
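Table 2.1 Procedure Summary of Pontryagin Principle for Bolza Problem

A. Statement of the Problem

Given the plant as $\dot{x}(t) = f(x(t),u(t),t)$, the performance index as $J = S(x(t_f),t_f) + \int_{t_0}^{t_f} V(x(t),u(t),t)\,dt$, and the boundary conditions as $x(t_0) = x_0$ with $t_f$ and $x(t_f) = x_f$ free, find the optimal control.

B. Solution of the Problem

Step 1: Form the Pontryagin $\mathcal{H}$ function $\mathcal{H}(x(t),u(t),\lambda(t),t) = V(x(t),u(t),t) + \lambda'(t)f(x(t),u(t),t)$.

Step 2: Minimize $\mathcal{H}$ w.r.t. $u(t)$, that is, set $\left(\dfrac{\partial\mathcal{H}}{\partial u}\right)_* = 0$, and obtain $u^*(t) = h(x^*(t),\lambda^*(t),t)$.

Step 3: Using the results of Step 2 in Step 1, find the optimal $\mathcal{H}^*$: $\mathcal{H}^*(x^*(t),h(x^*(t),\lambda^*(t),t),\lambda^*(t),t) = \mathcal{H}^*(x^*(t),\lambda^*(t),t)$.

Step 4: Solve the set of $2n$ differential equations $\dot{x}^*(t) = +\left(\dfrac{\partial\mathcal{H}}{\partial\lambda}\right)_*$ and $\dot{\lambda}^*(t) = -\left(\dfrac{\partial\mathcal{H}}{\partial x}\right)_*$ with initial conditions $x_0$ and the final conditions $\left[\mathcal{H}^* + \dfrac{\partial S}{\partial t}\right]_{t_f}\delta t_f + \left[\left(\dfrac{\partial S}{\partial x}\right)_* - \lambda^*(t)\right]_{t_f}'\delta x_f = 0$.

Step 5: Substitute the solutions of $x^*(t)$, $\lambda^*(t)$ from Step 4 into the expression for the optimal control $u^*(t)$ of Step 2.

C. Types of Systems

Type (a): Fixed-final time and fixed-final state system, Fig. 2.9(a). Substitutions: $\delta t_f = 0$, $\delta x_f = 0$. Boundary conditions: $x(t_0) = x_0$, $x(t_f) = x_f$.

Type (b): Free-final time and fixed-final state system, Fig. 2.9(b). Substitutions: $\delta t_f \neq 0$, $\delta x_f = 0$. Boundary conditions: $x(t_0) = x_0$, $x(t_f) = x_f$, $\left[\mathcal{H}^* + \dfrac{\partial S}{\partial t}\right]_{t_f} = 0$.

Type (c): Fixed-final time and free-final state system, Fig. 2.9(c). Substitutions: $\delta t_f = 0$, $\delta x_f \neq 0$. Boundary conditions: $x(t_0) = x_0$, $\lambda^*(t_f) = \left(\dfrac{\partial S}{\partial x}\right)_{*t_f}$.

Type (d): Free-final time and dependent free-final state system, Fig. 2.9(d). Substitutions: $\delta x_f = \dot{\theta}(t_f)\,\delta t_f$. Boundary conditions: $x(t_0) = x_0$, $x(t_f) = \theta(t_f)$, $\left[\mathcal{H}^* + \dfrac{\partial S}{\partial t} + \left\{\left(\dfrac{\partial S}{\partial x}\right)_* - \lambda^*(t)\right\}'\dot{\theta}(t)\right]_{t_f} = 0$.

Type (e): Free-final time and independent free-final state system. Substitutions: $\delta t_f \neq 0$, $\delta x_f \neq 0$. Boundary conditions: $x(t_0) = x_0$, $\left[\mathcal{H}^* + \dfrac{\partial S}{\partial t}\right]_{t_f} = 0$, $\left[\left(\dfrac{\partial S}{\partial x}\right)_* - \lambda^*(t)\right]_{t_f} = 0$.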
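The steps of Table 2.1 can be sketched numerically for a Type (a) problem. The sketch below revisits Example 2.11 (plant $\dot{x} = -x + u$, cost $\int_0^1 (x^2 + u^2)\,dt$, $x(0) = 1$, $x(1) = 0$) in Hamiltonian form; this reuse of Example 2.11, the fourth-order Runge-Kutta integrator, the shooting on the unknown initial costate, and all function names are assumptions made here for illustration. With $\mathcal{H} = x^2 + u^2 + \lambda(-x + u)$, Step 2 gives $u^* = -\lambda^*/2$, and Step 4 gives $\dot{x}^* = -x^* - \lambda^*/2$, $\dot{\lambda}^* = -2x^* + \lambda^*$. Note that this $\lambda$ multiplies $f - \dot{x}$ as in (2.7.10), so its sign is opposite to that of the multiplier in (2.6.37).

```python
import math


def rhs(x, lam):
    """State and costate equations of Step 4 with u* = -lam / 2 from Step 2."""
    return (-x - lam / 2.0, -2.0 * x + lam)


def integrate(lam0, steps=1000, t_final=1.0, x0=1.0):
    """Classical RK4 integration of (x, lam) from t = 0 to t_final."""
    h = t_final / steps
    x, lam = x0, lam0
    path = [(0.0, x, lam)]
    for k in range(steps):
        k1 = rhs(x, lam)
        k2 = rhs(x + h / 2 * k1[0], lam + h / 2 * k1[1])
        k3 = rhs(x + h / 2 * k2[0], lam + h / 2 * k2[1])
        k4 = rhs(x + h * k3[0], lam + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        lam += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        path.append(((k + 1) * h, x, lam))
    return path


# Shooting: the equations are linear, so x(1) is an affine function of the
# unknown initial costate lam(0); two trial integrations determine it.
end_a = integrate(0.0)[-1][1]
end_b = integrate(1.0)[-1][1]
lam0 = -end_a / (end_b - end_a)

path = integrate(lam0)          # trajectory that meets x(1) = 0
u0 = -lam0 / 2.0                # Step 5: optimal control at t = 0
x_mid = path[500][1]            # state at t = 0.5

# Closed-form values from (2.6.33) to (2.6.35) for comparison.
s2 = math.sqrt(2.0)
x_mid_exact = math.sinh(s2 * 0.5) / math.sinh(s2)
u0_exact = 1.0 - s2 * math.cosh(s2) / math.sinh(s2)

print("x(1) after shooting:", path[-1][1])
print("x(0.5): numeric", x_mid, " closed form", x_mid_exact)
print("u(0):   numeric", u0, " closed form", u0_exact)
```

Since both $t_f$ and $x(t_f)$ are fixed here, no extra boundary condition from (2.7.32) is used; the two given end conditions on $x$ determine the two constants of integration, with the initial costate playing the role of the unknown.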
Here, $x(t)$ and $u(t)$ are $n$- and $r$-dimensional state and control vectors respectively. Let us note that $u(t)$ is unconstrained. The entire procedure (called Pontryagin Principle) is now summarized in Table 2.1.

Note: From Table 2.1 we note that the only difference in the procedure between the free-final point system without the final cost function (Lagrange problem) and free-final point system with final cost function (Bolza problem) is in the application of the general boundary condition.

To illustrate the Pontryagin method described previously, consider the following simple examples describing a second order system. Specifically, we selected a double integrator plant whose analytical solutions for the optimal condition can be obtained and then verified by using MATLAB©. First we consider the fixed-final time and fixed-final state problem (Figure 2.9(a), Table 2.1, Type (a)).

Example 2.12

Given a second order (double integrator) system as

$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = u(t) \tag{2.7.46}$$

and the performance index as

$$J = \frac{1}{2}\int_0^2 u^2(t)\,dt \tag{2.7.47}$$

find the optimal control and optimal state, given the boundary (initial and final) conditions as

$$x(0) = [1 \;\; 2]'; \qquad x(2) = [1 \;\; 0]'. \tag{2.7.48}$$

Assume that the control and state are unconstrained.

Solution: We follow the step-by-step procedure given in Table 2.1. First, by comparing the present plant (2.7.46) and the PI (2.7.47) with the general formulation of the plant (2.7.1) and the PI (2.7.2), we identify

$$V(x(t), u(t), t) = V(u(t)) = \frac{1}{2}u^2(t), \qquad f(x(t), u(t), t) = [f_1, f_2]', \tag{2.7.49}$$

where $f_1 = x_2(t)$, $f_2 = u(t)$.
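• Step 1: Form the Hamiltonian function as

$$\mathcal{H} = \mathcal{H}(x_1(t), x_2(t), u(t), \lambda_1(t), \lambda_2(t)) = V(u(t)) + \lambda'(t) f(x(t), u(t)) = \frac{1}{2}u^2(t) + \lambda_1(t)x_2(t) + \lambda_2(t)u(t). \tag{2.7.50}$$

• Step 2: Find $u^*(t)$ from

$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\rightarrow\; u^*(t) + \lambda_2^*(t) = 0 \;\rightarrow\; u^*(t) = -\lambda_2^*(t). \tag{2.7.51}$$

• Step 3: Using the results of Step 2 in Step 1, find the optimal $\mathcal{H}^*$ as

$$\mathcal{H}^*(x_1^*(t), x_2^*(t), \lambda_1^*(t), \lambda_2^*(t)) = \frac{1}{2}\lambda_2^{*2}(t) + \lambda_1^*(t)x_2^*(t) - \lambda_2^{*2}(t) = \lambda_1^*(t)x_2^*(t) - \frac{1}{2}\lambda_2^{*2}(t). \tag{2.7.52}$$

• Step 4: Obtain the state and costate equations from

$$\dot{x}_1^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda_1}\right)_{*} = x_2^*(t), \qquad \dot{x}_2^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda_2}\right)_{*} = -\lambda_2^*(t),$$
$$\dot{\lambda}_1^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x_1}\right)_{*} = 0, \qquad \dot{\lambda}_2^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x_2}\right)_{*} = -\lambda_1^*(t). \tag{2.7.53}$$

Solving the previous equations, we have the optimal state and costate as

$$x_1^*(t) = \frac{C_3}{6}t^3 - \frac{C_4}{2}t^2 + C_2 t + C_1, \qquad x_2^*(t) = \frac{C_3}{2}t^2 - C_4 t + C_2,$$
$$\lambda_1^*(t) = C_3, \qquad \lambda_2^*(t) = -C_3 t + C_4. \tag{2.7.54}$$

• Step 5: Obtain the optimal control from

$$u^*(t) = -\lambda_2^*(t) = C_3 t - C_4 \tag{2.7.55}$$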
where $C_1$, $C_2$, $C_3$, and $C_4$ are constants evaluated using the given boundary conditions (2.7.48). These are found to be

$$C_1 = 1, \quad C_2 = 2, \quad C_3 = 3, \quad C_4 = 4. \tag{2.7.56}$$

Finally, we have the optimal states, costates and control as

$$x_1^*(t) = 0.5t^3 - 2t^2 + 2t + 1, \qquad x_2^*(t) = 1.5t^2 - 4t + 2,$$
$$\lambda_1^*(t) = 3, \qquad \lambda_2^*(t) = -3t + 4, \qquad u^*(t) = 3t - 4. \tag{2.7.57}$$

The system with the optimal controller is shown in Figure 2.10.

[Figure 2.10 Optimal Controller for Example 2.12]

The solution for the set of differential equations (2.7.53) with the boundary conditions (2.7.48) for Example 2.12 using Symbolic Toolbox of the MATLAB©, Version 6, is shown below.

**************************************************************
%% Solution Using Symbolic Toolbox (STB) in
%% MATLAB Version 6.0
%%
S=dsolve('Dx1=x2,Dx2=-lambda2,Dlambda1=0,Dlambda2=-lambda1,x1(0)=1,x2(0)=2,x1(2)=1,x2(2)=0')
S.x1
S.x2
S.lambda1
S.lambda2

S =
    lambda1: [1x1 sym]
    lambda2: [1x1 sym]
         x1: [1x1 sym]
         x2: [1x1 sym]

S.x1
ans =
1+2*t-2*t^2+1/2*t^3
S.x2
ans =
2-4*t+3/2*t^2
S.lambda1
ans =
3
S.lambda2
ans =
4-3*t

%% Plot command is used for which we need to
%% convert the symbolic values to numerical values.
j=1;
for tp=0:.02:2
    t=sym(tp);
    x1p(j)=double(subs(S.x1));        %% subs substitutes S.x1 to x1p
    x2p(j)=double(subs(S.x2));        %% double converts symbolic to numeric
    up(j)=-double(subs(S.lambda2));   %% optimal control u = -lambda_2
    t1(j)=tp;
    j=j+1;
end
plot(t1,x1p,'k',t1,x2p,'k',t1,up,'k:')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
gtext('u(t)')
**************************************************************

It is easy to see that the previous solutions for $x_1^*(t)$, $x_2^*(t)$, $\lambda_1^*(t)$, $\lambda_2^*(t)$, and $u^*(t) = -\lambda_2^*(t)$ obtained by using MATLAB© are the same as those given by the analytical solutions (2.7.57). The optimal control and state are plotted (using MATLAB©) in Figure 2.11. Next, we consider the fixed-final time and free-final state case (Figure
2.9(c), Table 2.1, Type (c)) of the same system.

[Figure 2.11 Optimal Control and States for Example 2.12: $x_1(t)$, $x_2(t)$, and $u(t)$ versus $t$]

Example 2.13

Consider the same Example 2.12 with changed boundary conditions as

$$x(0) = [1 \;\; 2]'; \qquad x_1(2) = 0; \qquad x_2(2) \text{ is free}. \tag{2.7.58}$$

Find the optimal control and optimal states.

Solution: Following the procedure illustrated in Table 2.1 (Type (c)), we get the same optimal states, costates, and control as given in (2.7.54) and (2.7.55) which are repeated here for convenience.

$$x_1^*(t) = \frac{C_3}{6}t^3 - \frac{C_4}{2}t^2 + C_2 t + C_1, \qquad x_2^*(t) = \frac{C_3}{2}t^2 - C_4 t + C_2,$$
$$\lambda_1^*(t) = C_3, \qquad \lambda_2^*(t) = -C_3 t + C_4, \qquad u^*(t) = -\lambda_2^*(t) = C_3 t - C_4. \tag{2.7.59}$$

The only difference is in solving for the constants $C_1$ to $C_4$. First of all, note that the performance index (2.7.47) does not contain the terminal cost function $S$. From the given boundary conditions
(2.7.58), we have $t_f$ specified to be 2 and hence $\delta t_f$ is zero in the general boundary condition (2.7.32). Also, since $x_2(2)$ is free, $\delta x_{2f}$ is arbitrary and hence the corresponding final condition on the costate becomes

$$\lambda_2(t_f) = \left(\frac{\partial S}{\partial x_2}\right)_{t_f} = 0 \tag{2.7.60}$$

(since $S = 0$). Thus we have the four boundary conditions as

$$x_1(0) = 1; \quad x_2(0) = 2; \quad x_1(2) = 0; \quad \lambda_2(2) = 0. \tag{2.7.61}$$

With these boundary conditions substituted in (2.7.59), the constants are found to be

$$C_1 = 1, \quad C_2 = 2, \quad C_3 = 15/8, \quad C_4 = 15/4. \tag{2.7.62}$$

Finally the optimal states, costates and control are given from (2.7.59) and (2.7.62) as

$$x_1^*(t) = \frac{5}{16}t^3 - \frac{15}{8}t^2 + 2t + 1, \qquad x_2^*(t) = \frac{15}{16}t^2 - \frac{15}{4}t + 2,$$
$$\lambda_1^*(t) = \frac{15}{8}, \qquad \lambda_2^*(t) = -\frac{15}{8}t + \frac{15}{4}, \qquad u^*(t) = \frac{15}{8}t - \frac{15}{4}. \tag{2.7.63}$$

The solution for the set of differential equations (2.7.53) with the boundary conditions (2.7.58) for Example 2.13 using Symbolic Toolbox of the MATLAB©, Version 6, is shown below.

***************************************************************
%% Solution Using Symbolic Toolbox (STB) in
%% MATLAB Version 6.0
%%
S=dsolve('Dx1=x2,Dx2=-lambda2,Dlambda1=0,Dlambda2=-lambda1,x1(0)=1,x2(0)=2,x1(2)=0,lambda2(2)=0')

S =
    lambda1: [1x1 sym]
    lambda2: [1x1 sym]
         x1: [1x1 sym]
         x2: [1x1 sym]

S.x1
ans =
1+2*t-15/8*t^2+5/16*t^3
S.x2
ans =
2-15/4*t+15/16*t^2
S.lambda1
ans =
15/8
S.lambda2
ans =
-15/8*t+15/4

%% Plot command is used for which we need to
%% convert the symbolic values to numerical values.
j=1;
for tp=0:.02:2
    t=sym(tp);
    x1p(j)=double(subs(S.x1));        %% subs substitutes S.x1 to x1p
    x2p(j)=double(subs(S.x2));        %% double converts symbolic to numeric
    up(j)=-double(subs(S.lambda2));   %% optimal control u = -lambda_2
    t1(j)=tp;
    j=j+1;
end
plot(t1,x1p,'k',t1,x2p,'k',t1,up,'k:')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
gtext('u(t)')
*******************************************************

It is easy to see that the previous solutions for $x_1^*(t)$, $x_2^*(t)$, $\lambda_1^*(t)$, $\lambda_2^*(t)$, and $u^*(t) = -\lambda_2^*(t)$ obtained by using MATLAB© are the same as those given by (2.7.63) obtained analytically. The optimal control and states for Example 2.13 are plotted in Figure 2.12.

[Figure 2.12 Optimal Control and States for Example 2.13]

Next, we consider the free-final time and independent free-final state case (Figure 2.9(e), Table 2.1, Type (e)) of the same system.

Example 2.14

Consider the same Example 2.12 with changed boundary conditions as

$$x(0) = [1 \;\; 2]'; \qquad x_1(t_f) = 3; \qquad x_2(t_f) \text{ is free}. \tag{2.7.64}$$

Find the optimal control and optimal state.

Solution: Following the procedure illustrated in Table 2.1 (Type (e)), we get the same optimal control, states and costates as given
in (2.7.54) and (2.7.55) which are repeated here for convenience.

$$x_1^*(t) = \frac{C_3}{6}t^3 - \frac{C_4}{2}t^2 + C_2 t + C_1, \qquad x_2^*(t) = \frac{C_3}{2}t^2 - C_4 t + C_2,$$
$$\lambda_1^*(t) = C_3, \qquad \lambda_2^*(t) = -C_3 t + C_4, \qquad u^*(t) = -\lambda_2^*(t) = C_3 t - C_4. \tag{2.7.65}$$

The only difference is in solving for the constants $C_1$ to $C_4$ and the unknown $t_f$. First of all, note that the performance index (2.7.47) does not contain the terminal cost function $S$, that is, $S = 0$. From the given boundary conditions (2.7.64), we have $t_f$ unspecified and hence $\delta t_f$ is free in the general boundary condition (2.7.32) leading to the specific final condition

$$\left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{t_f} = 0 \;\rightarrow\; \lambda_1(t_f)x_2(t_f) - 0.5\lambda_2^2(t_f) = 0 \tag{2.7.66}$$

where $\mathcal{H}$ is given by (2.7.52). Also, since $x_2(t_f)$ is free, $\delta x_{2f}$ is arbitrary and hence the general boundary condition (2.7.32) becomes

$$\lambda_2(t_f) = \left(\frac{\partial S}{\partial x_2}\right)_{t_f} = 0. \tag{2.7.67}$$

Combining (2.7.64), (2.7.66) and (2.7.67), we have the following 5 boundary conditions for the 5 unknowns (4 constants of integration $C_1$ to $C_4$ and 1 unknown $t_f$) as

$$x_1(0) = 1; \quad x_2(0) = 2; \quad x_1(t_f) = 3; \quad \lambda_2(t_f) = 0; \quad \lambda_1(t_f)x_2(t_f) - 0.5\lambda_2^2(t_f) = 0. \tag{2.7.68}$$

Using these boundary conditions along with (2.7.65), the constants are found to be

$$C_1 = 1, \quad C_2 = 2, \quad C_3 = 4/9, \quad C_4 = 4/3, \quad t_f = 3. \tag{2.7.69}$$

Finally, the optimal states, costates, and control are given from
(2.7.65) and (2.7.69) as

$$x_1^*(t) = \frac{2}{27}t^3 - \frac{2}{3}t^2 + 2t + 1, \qquad x_2^*(t) = \frac{2}{9}t^2 - \frac{4}{3}t + 2,$$
$$\lambda_1^*(t) = \frac{4}{9}, \qquad \lambda_2^*(t) = -\frac{4}{9}t + \frac{4}{3}, \qquad u^*(t) = \frac{4}{9}t - \frac{4}{3}. \tag{2.7.70}$$

The solution for the set of differential equations (2.7.53) with the boundary conditions (2.7.68) for Example 2.14 using Symbolic Toolbox of the MATLAB©, Version 6, is shown below.

********************************************************
%% Solution Using Symbolic Toolbox (STB) in
%% of MATLAB Version 6
%%
clear all
S=dsolve('Dx1=x2,Dx2=-lam2,Dlam1=0,Dlam2=-lam1,x1(0)=1,x2(0)=2,x1(tf)=3,lam2(tf)=0')
t='tf';
eq1=subs(S.x1)-'x1tf';
eq2=subs(S.x2)-'x2tf';
eq3=S.lam1-'lam1tf';
eq4=subs(S.lam2)-'lam2tf';
eq5='lam1tf*x2tf-0.5*lam2tf^2';
S2=solve(eq1,eq2,eq3,eq4,eq5,'tf,x1tf,x2tf,lam1tf,lam2tf','lam1tf<>0')
%% lam1tf<>0 means lam1tf is not equal to 0;
%% This is a condition derived from eq5.
%% Otherwise, without this condition in the above
%% SOLVE routine, we get two values for tf (1 and 3 in this case)
%%
tf=S2.tf
x1tf=S2.x1tf;
x2tf=S2.x2tf;
clear t
x1=subs(S.x1)
x2=subs(S.x2)
lam1=subs(S.lam1)
x2p.7. Example 2.72) Following the procedure illustrated in Table 2. Finally.1 (Type (b)). end plot(tl. j=l. Table 2.1.05:tf t=sym(tp).tl. %% optimal control u = lambda_2 tl(j)=tp. states and costates as given in (2. x_2 ( t) .12 with changed performance index 1 1 J = [xl(2) .80 Chapter 2: Calculus of Variations and Optimal Control lam2=subs(S. 'k' . tf=double(subs(S2.71) and boundary conditions as x(O) = [1 2].tf)) %% coverts tf from symbolic to numerical for tp=O:O.lam2)). %% subs substitutes S. %% coverts tp from numerical to symbolic xlp(j)=double(subs(S. (2.xlp. Thus .xl to xlp x2p(j)=double(subs(S.lam2) %% Convert the symbolic values to %% numerical values as shown below. %% double converts symbolic to numeric up(j)=double(subs(S.54) and (2. which are reproduced here for ready reference.2] 2 + 2 u2 dt 2 2 0 1102 (2.55). we get the same optimal control.15 We consider the same Example 2.14 are plotted in Figure 2.7.7.4] 2 + [x2(2) .7.13.up. j=j+l . x(2) = is free. ) gtext ('u(t) ') ******************************************************* The optimal control and states for Example 2.xl)). Type (b)).tl.x2)). we consider the fixedfinal time and freefinal state system with a terminal cost (Figure 2. 'k' .9 (b). 'k:') xlabel('t' ) gtext (' x_l (t) ') gtext ( .
we have

$$x_1^*(t) = \frac{C_3}{6}t^3 - \frac{C_4}{2}t^2 + C_2 t + C_1, \qquad x_2^*(t) = \frac{C_3}{2}t^2 - C_4 t + C_2,$$
$$\lambda_1^*(t) = C_3, \qquad \lambda_2^*(t) = -C_3 t + C_4, \qquad u^*(t) = -\lambda_2^*(t) = C_3 t - C_4. \tag{2.7.73}$$

[Figure 2.13 Optimal Control and States for Example 2.14]

The only difference is in solving for the constants $C_1$ to $C_4$ using the given and obtained boundary conditions. Since $t_f$ is specified as 2, $\delta t_f$ is zero, and since $x(2)$ is unspecified, $\delta x_f$ is free in the boundary condition (2.7.32), which now reduces to

$$\lambda^*(t_f) = \left(\frac{\partial S}{\partial x}\right)_{*t_f} \tag{2.7.74}$$

where

$$S = \frac{1}{2}[x_1(2) - 4]^2 + \frac{1}{2}[x_2(2) - 2]^2. \tag{2.7.75}$$
Thus, (2.7.74) becomes

$$\lambda_1^*(t_f) = \left(\frac{\partial S}{\partial x_1}\right)_{t_f} \;\rightarrow\; \lambda_1^*(2) = x_1(2) - 4, \qquad \lambda_2^*(t_f) = \left(\frac{\partial S}{\partial x_2}\right)_{t_f} \;\rightarrow\; \lambda_2^*(2) = x_2(2) - 2. \tag{2.7.76}$$

Now, we have two initial conditions from (2.7.72) and two final conditions from (2.7.76) to solve for the four constants as

$$C_1 = 1, \quad C_2 = 2, \quad C_3 = 3/7, \quad C_4 = 4/7. \tag{2.7.77}$$

Finally, we have the optimal states, costates and control given as

$$x_1^*(t) = \frac{1}{14}t^3 - \frac{2}{7}t^2 + 2t + 1, \qquad x_2^*(t) = \frac{3}{14}t^2 - \frac{4}{7}t + 2,$$
$$\lambda_1^*(t) = \frac{3}{7}, \qquad \lambda_2^*(t) = -\frac{3}{7}t + \frac{4}{7}, \qquad u^*(t) = \frac{3}{7}t - \frac{4}{7}. \tag{2.7.78}$$

The previous results can also be obtained using Symbolic Math Toolbox of the MATLAB©, Version 6, as shown below.

***************************************************************
%% Solution Using Symbolic Math Toolbox (STB) in
%% MATLAB Version 6
%%
S=dsolve('Dx1=x2,Dx2=-lambda2,Dlambda1=0,Dlambda2=-lambda1,x1(0)=1,x2(0)=2,lambda1(2)=x12-4,lambda2(2)=x22-2')
t='2';
S2=solve(subs(S.x1)-'x12',subs(S.x2)-'x22','x12,x22');
%% solves for x1(t=2) and x2(t=2)
x12=S2.x12;
x22=S2.x22;
clear t

S =
    lambda1: [1x1 sym]
    lambda2: [1x1 sym]
         x1: [1x1 sym]
         x2: [1x1 sym]

x1=subs(S.x1)
x1 =
1+2*t-2/7*t^2+1/14*t^3
x2=subs(S.x2)
x2 =
2-4/7*t+3/14*t^2
lambda1=subs(S.lambda1)
lambda1 =
3/7
lambda2=subs(S.lambda2)
lambda2 =
4/7-3/7*t

%% Plot command is used for which we need to
%% convert the symbolic values to numerical values.
j=1;
for tp=0:.02:2
    t=sym(tp);
    x1p(j)=double(subs(S.x1));        %% subs substitutes S.x1 to x1p
    x2p(j)=double(subs(S.x2));        %% double converts symbolic to numeric
    up(j)=-double(subs(S.lambda2));   %% optimal control u = -lambda_2
    t1(j)=tp;
    j=j+1;
end
plot(t1,x1p,'k',t1,x2p,'k',t1,up,'k:')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
gtext('u(t)')
***************************************************************

It is easy to see that the previous solutions for $x_1^*(t)$, $x_2^*(t)$, $\lambda_1^*(t)$, $\lambda_2^*(t)$, and $u^*(t) = -\lambda_2^*(t)$ obtained by using MATLAB© are the same as those given by (2.7.78) obtained analytically. The optimal control and states for Example 2.15 are plotted in Figure 2.14.

[Figure 2.14 Optimal Control and States for Example 2.15]

2.8 Summary of Variational Approach

In this section, we summarize the development of the topics covered so far in obtaining optimal conditions using the calculus of variations. The development is carried out in different stages as follows. Also shown is the systematic link between various stages of development.
2.8.1 Stage I: Optimization of a Functional

Consider the optimization of

$$J = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt \tag{2.8.1}$$

with the given boundary conditions

$$x(t_0) \text{ fixed and } x(t_f) \text{ free}. \tag{2.8.2}$$

Then, the optimal function $x^*(t)$ should satisfy the Euler-Lagrange equation

$$\left(\frac{\partial V}{\partial x}\right)_{*} - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_{*} = 0. \tag{2.8.3}$$

The general boundary condition to be satisfied at the free-final point is given by [79]

$$\left[V - \dot{x}'(t)\left(\frac{\partial V}{\partial \dot{x}}\right)\right]_{*t_f} \delta t_f + \left(\frac{\partial V}{\partial \dot{x}}\right)'_{*t_f} \delta x_f = 0. \tag{2.8.4}$$

This boundary condition is to be modified depending on the nature of the given $t_f$ and $x(t_f)$. Although the previous general boundary condition is not derived in this book, it can be easily seen to be similar to the general boundary condition (2.7.26) in terms of the Lagrangian which embeds a performance index and a dynamical plant into a single augmented performance index with integrand $\mathcal{L}$. The sufficient condition for optimum is the Legendre condition given by

$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_{*} > 0 \quad \text{for minimum} \tag{2.8.5}$$

and

$$\left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_{*} < 0 \quad \text{for maximum}. \tag{2.8.6}$$
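As a quick illustration of (2.8.3) and (2.8.5), take the integrand $V = \frac{1}{2}\dot{x}^2(t) + \frac{1}{2}x^2(t)$. This choice is made here only for illustration and is not taken from the text.

```latex
\begin{aligned}
\frac{\partial V}{\partial x} &= x(t), \qquad
\frac{\partial V}{\partial \dot{x}} = \dot{x}(t)
&&\Longrightarrow\quad x^*(t) - \ddot{x}^*(t) = 0
&&\text{(Euler-Lagrange equation (2.8.3))} \\
x^*(t) &= c_1 e^{t} + c_2 e^{-t}
&&&&\text{(general extremal)} \\
\frac{\partial^2 V}{\partial \dot{x}^2} &= 1 > 0
&&&&\text{(Legendre condition (2.8.5): minimum)}
\end{aligned}
```

The constants $c_1$ and $c_2$ would follow from the fixed initial point and from the boundary condition (2.8.4) specialized to the given $t_f$ and $x(t_f)$.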
2.8.2 Stage II: Optimization of a Functional with Condition

Consider the optimization of a functional

$$J = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt \tag{2.8.7}$$

with given boundary values as

$$x(t_0) \text{ fixed and } x(t_f) \text{ free}, \tag{2.8.8}$$

and the condition relation

$$g(x(t), \dot{x}(t), t) = 0. \tag{2.8.9}$$

Here, the condition (2.8.9) is absorbed by forming the augmented functional

$$J_a = \int_{t_0}^{t_f} \mathcal{L}(x(t), \dot{x}(t), \lambda(t), t)\,dt \tag{2.8.10}$$

where $\lambda(t)$ is the Lagrange multiplier (also called the costate function), and $\mathcal{L}$ is the Lagrangian given by

$$\mathcal{L}(x(t), \dot{x}(t), \lambda(t), t) = V(x(t), \dot{x}(t), t) + \lambda'(t)\,g(x(t), \dot{x}(t), t). \tag{2.8.11}$$

Now, we just use the results of the previous Stage I for the augmented functional (2.8.10) except its integrand is $\mathcal{L}$ instead of $V$. For optimal condition, we have the Euler-Lagrange equation (2.8.3) for the augmented functional (2.8.10) given in terms of $x(t)$ and $\lambda(t)$ as

$$\left(\frac{\partial \mathcal{L}}{\partial x}\right)_{*} - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)_{*} = 0 \quad \text{state equation and} \tag{2.8.12}$$

$$\left(\frac{\partial \mathcal{L}}{\partial \lambda}\right)_{*} - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\lambda}}\right)_{*} = 0 \quad \text{costate equation}. \tag{2.8.13}$$

Let us note from (2.8.11) that the Lagrangian $\mathcal{L}$ is independent of $\dot{\lambda}^*(t)$ and that the Euler-Lagrange equation (2.8.13) for the costate $\lambda(t)$ is nothing but the constraint relation (2.8.9). The general boundary
condition (2.8.4) to be satisfied at the free-final point (in terms of $\mathcal{L}$) is given by

$$\left[\mathcal{L} - \dot{x}'(t)\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)\right]_{*t_f} \delta t_f + \left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)'_{*t_f} \delta x_f = 0. \tag{2.8.14}$$

This boundary condition is to be modified depending on the nature of the given $t_f$ and $x(t_f)$.

2.8.3 Stage III: Optimal Control System with Lagrangian Formalism

Here, we consider the standard control system with a plant described by [56]

$$\dot{x}(t) = f(x(t), u(t), t), \tag{2.8.15}$$

the given boundary conditions as

$$x(t_0) \text{ is fixed and } x(t_f) \text{ is free}, \tag{2.8.16}$$

and the performance index as

$$J(u(t)) = \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt. \tag{2.8.17}$$

Now, we rewrite the plant equation (2.8.15) as the condition relation (2.8.9) as

$$g(x(t), \dot{x}(t), u(t), t) = f(x(t), u(t), t) - \dot{x}(t) = 0. \tag{2.8.18}$$

Then we form the augmented functional out of the performance index (2.8.17) and the condition relation (2.8.18) as

$$J_a(u(t)) = \int_{t_0}^{t_f} \mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t), t)\,dt \tag{2.8.19}$$

where the Lagrangian $\mathcal{L}$ is given as

$$\mathcal{L} = \mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t)\left\{f(x(t), u(t), t) - \dot{x}(t)\right\}. \tag{2.8.20}$$
Now we just use the previous results of Stage II. For optimal condition, we have the set of Euler-Lagrange equations (2.8.12) and (2.8.13) for the augmented functional (2.8.19) given in terms of $x(t)$, $\lambda(t)$, and $u(t)$ as

$$\left(\frac{\partial \mathcal{L}}{\partial x}\right)_{*} - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)_{*} = 0 \quad \text{state equation}, \tag{2.8.21}$$

$$\left(\frac{\partial \mathcal{L}}{\partial \lambda}\right)_{*} - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{\lambda}}\right)_{*} = 0 \quad \text{costate equation, and} \tag{2.8.22}$$

$$\left(\frac{\partial \mathcal{L}}{\partial u}\right)_{*} - \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{u}}\right)_{*} = 0 \quad \text{control equation}. \tag{2.8.23}$$

Note from (2.8.20) that the Lagrangian $\mathcal{L}$ is independent of $\dot{\lambda}^*(t)$ and $\dot{u}^*(t)$ and that the Euler-Lagrange equation (2.8.22) is the same as the constraint relation (2.8.18). The general boundary condition (2.8.14) to be satisfied at the free-final point becomes

$$\left[\mathcal{L} - \dot{x}'(t)\left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)\right]_{*t_f} \delta t_f + \left(\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)'_{*t_f} \delta x_f = 0. \tag{2.8.24}$$

This boundary condition is to be modified depending on the nature of the given $t_f$ and $x(t_f)$.

2.8.4 Stage IV: Optimal Control System with Hamiltonian Formalism: Pontryagin Principle

Here, we just transform the previous Lagrangian formalism to Hamiltonian formalism by defining the Hamiltonian as [57]

$$\mathcal{H}(x(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t) f(x(t), u(t), t) \tag{2.8.25}$$

which in terms of the Lagrangian (2.8.20) becomes

$$\mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t), t) = \mathcal{H}(x(t), u(t), \lambda(t), t) - \lambda'(t)\dot{x}(t). \tag{2.8.26}$$

Now using (2.8.26), the set of Euler-Lagrange equations (2.8.21) to (2.8.23), which are in terms of the Lagrangian, are rewritten in terms
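of the Hamiltonian as

$$\left(\frac{\partial \mathcal{H}}{\partial x}\right)_{*} - \frac{d}{dt}\left(-\lambda^*\right) = 0 \tag{2.8.27}$$

$$\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_{*} - \dot{x}^*(t) - \frac{d}{dt}(0) = 0 \tag{2.8.28}$$

$$\left(\frac{\partial \mathcal{H}}{\partial u}\right)_{*} - \frac{d}{dt}(0) = 0 \tag{2.8.29}$$

which in turn are rewritten as

$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_{*} \quad \text{state equation}, \tag{2.8.30}$$

$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_{*} \quad \text{costate equation, and} \tag{2.8.31}$$

$$0 = +\left(\frac{\partial \mathcal{H}}{\partial u}\right)_{*} \quad \text{control equation}. \tag{2.8.32}$$

Similarly using (2.8.26), the boundary condition (2.8.24) is rewritten in terms of the Hamiltonian as

$$\left[\mathcal{H} - \lambda'(t)\dot{x}(t) - \dot{x}'(t)(-\lambda(t))\right]_{*t_f} \delta t_f + \left[-\lambda'(t)\right]_{*t_f} \delta x_f = 0 \tag{2.8.33}$$

which becomes

$$\left[\mathcal{H}\right]_{*t_f} \delta t_f - \left[\lambda'(t)\right]_{*t_f} \delta x_f = 0. \tag{2.8.34}$$

The sufficient condition is

$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_{*} > 0 \quad \text{for minimum} \tag{2.8.35}$$

and

$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_{*} < 0 \quad \text{for maximum}. \tag{2.8.36}$$

The state, costate, and control equations (2.8.30) to (2.8.32) are solved along with the given initial condition (2.8.16) and the final condition (2.8.34) leading us to a two-point boundary value problem (TPBVP).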
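When closed-form integration is not available, a TPBVP of this kind has to be solved numerically. As a hedged sketch, the block below applies MATLAB©'s `bvp4c` solver to the double integrator of Example 2.12, for which (2.8.30) to (2.8.32) give $\dot{x}_1 = x_2$, $\dot{x}_2 = -\lambda_2$, $\dot{\lambda}_1 = 0$, $\dot{\lambda}_2 = -\lambda_1$. The ordering of the unknowns, the mesh, and the initial guess are assumptions made here for illustration.

```matlab
% Numerical solution of the TPBVP of Example 2.12 with bvp4c (a sketch).
% Assumed ordering of unknowns: y = [x1; x2; lam1; lam2].

% State and costate equations with u* = -lam2 substituted:
%   x1' = x2,  x2' = -lam2,  lam1' = 0,  lam2' = -lam1
odefun = @(t, y) [y(2); -y(4); 0; -y(3)];

% Boundary conditions (2.7.48): x(0) = [1 2]', x(2) = [1 0]'
bcfun = @(ya, yb) [ya(1)-1; ya(2)-2; yb(1)-1; yb(2)];

% Crude constant initial guess on an 11-point mesh over [0, 2]
solinit = bvpinit(linspace(0, 2, 11), [1; 2; 0; 0]);
sol     = bvp4c(odefun, bcfun, solinit);

% Evaluate the solution on a fine grid and recover the control
t = linspace(0, 2, 101);
y = deval(sol, t);
u = -y(4, :);                 % optimal control u*(t) = -lam2*(t)

% Compare with the closed-form control u*(t) = 3t - 4 of (2.7.57)
maxErr = max(abs(u - (3*t - 4)));
fprintf('max |u_num - u_analytic| = %.2e\n', maxErr);

plot(t, y(1,:), 'k', t, y(2,:), 'k--', t, u, 'k:')
xlabel('t')
legend('x_1(t)', 'x_2(t)', 'u(t)')
```

For other boundary types in Table 2.1, only `bcfun` would change. For instance, the free $x_2(2)$ of Example 2.13 would replace the last entry by the costate condition `yb(4)`.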
Free-Final Point System with Final Cost Function

This problem is an extension of the problem in Stage IV, with the addition of final cost function. We summarize the result risking the repetition of some of the equations. Let the plant be described as

$$\dot{x}(t) = f(x(t), u(t), t) \tag{2.8.37}$$

and the performance index be

$$J(u(t)) = S(x(t_f), t_f) + \int_{t_0}^{t_f} V(x(t), u(t), t)\,dt \tag{2.8.38}$$

along with the boundary conditions

$$x(t_0) \text{ is fixed and } x(t_f) \text{ is free}. \tag{2.8.39}$$

Now, if we rewrite the performance index (2.8.38) to absorb the final cost function $S$ within the integrand, then the results of Stage III can be used to get the optimal conditions. Thus we rewrite (2.8.38) as

$$J_1(u(t)) = \int_{t_0}^{t_f} \left[V(x(t), u(t), t) + \left(\frac{\partial S}{\partial x}\right)' \dot{x}(t) + \frac{\partial S}{\partial t}\right] dt. \tag{2.8.40}$$

Now we repeat the results of Stage III except for the modification of the final condition equation (2.8.34). Thus the state, costate and control equations are

$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)_{*} \quad \text{state equation} \tag{2.8.41}$$

$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)_{*} \quad \text{costate equation} \tag{2.8.42}$$

$$0 = +\left(\frac{\partial \mathcal{H}}{\partial u}\right)_{*} \quad \text{control equation} \tag{2.8.43}$$

and the final boundary condition is

$$\left[\mathcal{H} + \frac{\partial S}{\partial t}\right]_{*t_f} \delta t_f + \left[\frac{\partial S}{\partial x} - \lambda(t)\right]'_{*t_f} \delta x_f = 0. \tag{2.8.44}$$
The sufficient condition for optimum is

$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_{*} > 0 \quad \text{for minimum} \tag{2.8.45}$$

and

$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)_{*} < 0 \quad \text{for maximum}. \tag{2.8.46}$$

The state, costate, and control equations (2.8.41) to (2.8.43) are solved along with the given initial condition (2.8.39) and the final condition (2.8.44); thus this formulation leads us to a TPBVP.

2.8.5 Salient Features

We now discuss the various features of the methodology used so far in obtaining the optimal conditions through the use of the calculus of variations [6, 79, 120, 108]. Also, we need to consider the problems discussed above under the various stages of development. So we refer to the appropriate relations of, say, Stage III or Stage IV during our discussion.

1. Significance of Lagrange Multiplier: The Lagrange multiplier $\lambda(t)$ is also called the costate (or adjoint) function.

(a) The Lagrange multiplier $\lambda(t)$ is introduced to "take care of" the constraint relation imposed by the plant equation (2.8.15).

(b) The costate variable $\lambda(t)$ enables us to use the Euler-Lagrange equation for each of the variables $x(t)$ and $u(t)$ separately as if they were independent of each other, although they are dependent on each other as per the plant equation.

2. Lagrangian and Hamiltonian: We defined the Lagrangian and Hamiltonian as

$$\mathcal{L} = \mathcal{L}(x(t), \dot{x}(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t)\left\{f(x(t), u(t), t) - \dot{x}(t)\right\} \tag{2.8.47}$$

$$\mathcal{H} = \mathcal{H}(x(t), u(t), \lambda(t), t) = V(x(t), u(t), t) + \lambda'(t) f(x(t), u(t), t). \tag{2.8.48}$$
In defining the Lagrangian and Hamiltonian we use extensively the vector notation; still it should be noted that these $\mathcal{L}$ and $\mathcal{H}$ are scalar functions only.

3. Optimization of Hamiltonian

(a) The control equation (2.8.32) indicates the optimization of the Hamiltonian w.r.t. the control $u(t)$. That is, the optimization of the original performance index (2.8.17), which is a functional subject to the plant equation (2.8.15), is equivalent to the optimization of the Hamiltonian function w.r.t. $u(t)$. Thus, we "reduced" our original functional optimization problem to an ordinary function optimization problem.

(b) We note that we assumed unconstrained or unbounded control $u(t)$ and obtained the control relation $\partial \mathcal{H}/\partial u = 0$.

(c) If $u(t)$ is constrained or bounded as being a member of the set $U$, i.e., $u(t) \in U$, we can no longer take $\partial \mathcal{H}/\partial u = 0$, since $u(t)$, so calculated, may lie outside the range of the permissible region $U$. In practice, the control $u(t)$ is always limited by such things as saturation of amplifiers, speed of a motor, or thrust of a rocket. The constrained optimal control systems are discussed in Chapter 7.

(d) Regardless of any constraints on $u(t)$, Pontryagin had shown that $u(t)$ must still be chosen to minimize the Hamiltonian. A rigorous proof of the fact that $u(t)$ must be chosen to optimize the $\mathcal{H}$ function is Pontryagin's most notable contribution to optimal control theory. For this reason, the approach is often called Pontryagin Principle. So in the case of constrained control, it is shown that

$$\min_{u \in U} \mathcal{H}(x^*(t), \lambda^*(t), u(t), t) = \mathcal{H}(x^*(t), \lambda^*(t), u^*(t), t) \tag{2.8.49}$$

or equivalently

$$\mathcal{H}(x^*(t), \lambda^*(t), u^*(t), t) \leq \mathcal{H}(x^*(t), \lambda^*(t), u(t), t). \tag{2.8.50}$$

4. Pontryagin Maximum Principle: Originally, Pontryagin used a slightly different performance index which is maximized rather
than minimized and hence it is called Pontryagin Maximum Principle. For this reason, the Hamiltonian is also sometimes called Pontryagin $\mathcal{H}$-function. Let us note that minimization of the performance index $J$ is equivalent to the maximization of $-J$. Then, if we define the Hamiltonian as

$$\mathcal{H}(x(t), u(t), \lambda(t), t) = -V(x(t), u(t), t) + \lambda'(t) f(x(t), u(t), t) \tag{2.8.51}$$

we have Maximum Principle. Further discussion on Pontryagin Principle is given in Chapter 6.

5. Hamiltonian at the Optimal Condition: At the optimal condition the Hamiltonian can be written as

$$\mathcal{H}^* = \mathcal{H}^*(x^*(t), u^*(t), \lambda^*(t), t)$$

$$\frac{d\mathcal{H}^*}{dt} = \left(\frac{\partial \mathcal{H}}{\partial x}\right)'_{*} \dot{x}^*(t) + \left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)'_{*} \dot{\lambda}^*(t) + \left(\frac{\partial \mathcal{H}}{\partial u}\right)'_{*} \dot{u}^*(t) + \left(\frac{\partial \mathcal{H}}{\partial t}\right)_{*}. \tag{2.8.52}$$

Using the state, costate and control equations (2.8.30) to (2.8.32) in the previous equation, we get

$$\frac{d\mathcal{H}^*}{dt} = \left(\frac{\partial \mathcal{H}}{\partial t}\right)_{*}. \tag{2.8.53}$$

We observe that along the optimal trajectory, the total derivative of $\mathcal{H}$ w.r.t. time is the same as the partial derivative of $\mathcal{H}$ w.r.t. time. If $\mathcal{H}$ does not depend on $t$ explicitly, then

$$\left.\frac{d\mathcal{H}}{dt}\right|_{*} = 0 \tag{2.8.54}$$

and $\mathcal{H}$ is constant w.r.t. the time $t$ along the optimal trajectory.

6. Two-Point Boundary Value Problem (TPBVP): As seen earlier, the optimal control problem of a dynamical system leads to a TPBVP.
(a) The state and costate equations (2.8.30) and (2.8.31) are solved using the initial and final conditions. In general, these are nonlinear, time varying and we may have to resort to numerical methods for their solutions.

(b) We note that the state and costate equations are the same for any kind of boundary conditions.

(c) For the optimal control system, although obtaining the state and costate equations is very easy, the computation of their solutions is quite tedious. This is the unfortunate feature of optimal control theory. It is the price one must pay for demanding the best performance from a system. One has to weigh the optimization of the system against the computational burden.

7. Open-Loop Optimal Control: In solving the TPBVP arising due to the state and costate equations, and then substituting in the control equation, we get only the open-loop optimal control as shown in Figure 2.15. Here, one has to construct or realize an open-loop optimal controller (OLOC) and in many cases it is very tedious. Also, changes in plant parameters are not taken into account by the OLOC. This prompts us to think in terms of a closed-loop optimal control (CLOC), i.e., to obtain optimal control $u^*(t)$ in terms of the state $x^*(t)$ as shown in Figure 2.16. This CLOC will have many advantages such as responsiveness to plant parameter variations and simplified construction of the controller. The closed-loop optimal control systems are discussed in Chapter 7.

[Figure 2.15 Open-Loop Optimal Control: the open-loop optimal controller produces $u^*(t)$, which drives the plant to give $x^*(t)$]
[Figure 2.16 Closed-Loop Optimal Control: the plant state $x^*(t)$ is fed back through the closed-loop optimal controller to produce $u^*(t)$]
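Feature 5 above (the Hamiltonian is constant along the optimal trajectory when it does not depend on $t$ explicitly) can be checked against the closed-form solution (2.7.57) of Example 2.12. The block below is a minimal numerical sketch. The grid and variable names are choices made here, and the check only evaluates formulas already given in the text.

```matlab
% Check that H is constant along the optimal trajectory of Example 2.12.
% Uses the closed-form results (2.7.57); grid and names are illustrative.
t    = linspace(0, 2, 201);

x2   = 1.5*t.^2 - 4*t + 2;      % optimal state x2*(t)
lam1 = 3*ones(size(t));         % optimal costate lambda1*(t)
lam2 = -3*t + 4;                % optimal costate lambda2*(t)
u    = -lam2;                   % optimal control u*(t) = 3t - 4

% Hamiltonian (2.7.50): H = u^2/2 + lam1*x2 + lam2*u
H = 0.5*u.^2 + lam1.*x2 + lam2.*u;

% The spread should be at round-off level if H is constant
fprintf('H: min = %g, max = %g, spread = %.2e\n', ...
        min(H), max(H), max(H) - min(H));
```

Substituting (2.7.57) by hand gives $\mathcal{H}^* = 3(1.5t^2 - 4t + 2) - \frac{1}{2}(4 - 3t)^2 = -2$ for all $t$, so the printed spread should be at round-off level. The same substitution with the constants (2.7.69) of Example 2.14 gives $\mathcal{H}^* = 0$, consistent with the free-final time condition (2.7.66).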
2.9 Problems

1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and plot all the optimal controls and states for all problems. Provide the relevant MATLAB© m files.

Problem 2.1 Find the extremal of the following functional with the initial condition as $x(0) = 0$ and the final condition as $x(2) = 5$.

Problem 2.2 Find the extremal of the functional to satisfy the boundary conditions $x(2) = 3$, and $x(0) = 0$.

Problem 2.3 Find the extremal for the following functional with $x(1) = 1$ and $x(2) = 10$.

Problem 2.4 Consider the extremization of a functional which is dependent on derivatives higher than the first derivative $\dot{x}(t)$ such as

$$J(x(t), t) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), \ddot{x}(t), t)\,dt$$

with fixed-end point conditions. Show that the corresponding Euler-Lagrange equation is given by

$$\frac{\partial V}{\partial x} - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right) + \frac{d^2}{dt^2}\left(\frac{\partial V}{\partial \ddot{x}}\right) = 0.$$
Similarly, show that, in general, for extremization of

$$J = \int_{t_0}^{t_f} V\left(x(t), \dot{x}(t), \ddot{x}(t), \ldots, x^{(r)}(t), t\right) dt$$

with fixed-end point conditions, the Euler-Lagrange equation becomes

$$\sum_{i=0}^{r} (-1)^i \frac{d^i}{dt^i}\left(\frac{\partial V}{\partial x^{(i)}}\right) = 0.$$

Problem 2.5 A first order system is given by

$$\dot{x}(t) = ax(t) + bu(t)$$

and the performance index is

$$J = \frac{1}{2}\int_0^{t_f} \left(qx^2(t) + ru^2(t)\right) dt$$

where $x(t_0) = x_0$ and $x(t_f)$ is free and $t_f$ being fixed. Show that the optimal state $x^*(t)$ is given by

$$x^*(t) = x_0\,\frac{\sinh \beta(t_f - t)}{\sinh \beta t_f}.$$

Problem 2.6 A mechanical system is described by

$$\ddot{x}(t) = u(t);$$

find the optimal control and the states by minimizing

$$J = \frac{1}{2}\int_0^5 u^2(t)\,dt$$

such that the boundary conditions are

$$x(t = 0) = 2; \quad x(t = 5) = 0; \quad \dot{x}(t = 0) = 2; \quad \dot{x}(t = 5) = 0.$$

Problem 2.7 For the first order system

$$\frac{dx}{dt} = x(t) + u(t)$$

find the optimal control $u^*(t)$ to minimize

$$J = \int_0^{t_f} \left[x^2(t) + u^2(t)\right] dt$$

where $t_f$ is unspecified, and $x(0) = 5$ and $x(t_f) = 0$. Also find $t_f$.
Problem 2.8 Find the optimal control $u^*(t)$ of the plant

$$\dot{x}_1(t) = x_2(t), \qquad x_1(0) = 3, \quad x_1(2) = 0$$
$$\dot{x}_2(t) = 2x_1(t) + 5u(t), \qquad x_2(0) = 5, \quad x_2(2) = 0$$

which minimizes the performance index

$$J = \frac{1}{2}\int_0^2 \left[x_1^2(t) + u^2(t)\right] dt.$$

Problem 2.9 A second order plant is described by

$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = 2x_1(t) - 3x_2(t) + 5u(t)$$

and the cost function is

$$J = \int_0^{\infty} \left[x_1^2(t) + u^2(t)\right] dt.$$

Find the optimal control, when $x_1(0) = 3$ and $x_2(0) = 2$.

Problem 2.10 For a second order system

$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = 2x_1(t) + 3u(t)$$

with performance index

$$J = 0.5x_1^2(\pi/2) + \int_0^{\pi/2} 0.5u^2(t)\,dt$$

and boundary conditions $x(0) = [0 \;\; 1]'$ and $x(t_f)$ is free, find the optimal control.

Problem 2.11 Find the optimal control for the plant

$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = 2x_1(t) + 3u(t)$$

with performance criterion

$$J = \frac{1}{2}F_{11}\left[x_1(t_f) - 4\right]^2 + \frac{1}{2}F_{22}\left[x_2(t_f) - 2\right]^2 + \frac{1}{2}\int_0^{t_f} \left[x_1^2(t) + 2x_2^2(t) + 4u^2(t)\right] dt$$

and initial conditions as $x(0) = [1 \;\; 2]'$. The additional conditions are given below.
1. Fixed-final conditions: $F_{11} = 0$, $F_{22} = 0$, $t_f = 2$, $x(2) = [4 \;\; 6]'$.
2. Free-final time conditions: $F_{11} = 3$, $F_{22} = 5$, $x(t_f) = [4 \;\; 6]'$ and $t_f$ is free.
3. Free-final state conditions: $F_{11} = 3$, $F_{22} = 5$, $x_1(2)$ is free and $x_2(2) = 6$.
4. Free-final time and free-final state conditions: $F_{11} = 3$, $F_{22} = 5$ and the final state to have $x_1(t_f) = 4$ and $x_2(t_f)$ to lie on $\theta(t) = 5t + 15$.

Problem 2.12 For the D.C. motor speed control system described in Problem 1.1, find the open-loop optimal control to keep the speed constant at a particular value and the system to respond for any disturbances from the regulated value.

Problem 2.13 For the liquid-level control system described in Problem 1.2, find the open-loop optimal control to keep the liquid level constant at a reference value and the system to act only if there is a change in the liquid level.

Problem 2.14 For the inverted pendulum control system described in Problem 1.3, find the open-loop optimal control to keep the pendulum in a vertical position.

Problem 2.15 For the mechanical control system described in Problem 1.4, find the open-loop optimal control to keep the system at equilibrium condition and act only if there is a disturbance.

Problem 2.16 For the automobile suspension control system described in Problem 1.5, find the open-loop optimal control to provide minimum control energy and passenger comfort.

Problem 2.17 For the chemical control system described in Problem 1.6, find the open-loop optimal control to keep the system at equilibrium condition and act only if there is a disturbance.
.
Chapter 3

Linear Quadratic Optimal Control Systems I

In this chapter, we present the closed-loop optimal control of linear plants or systems with quadratic performance index or measure. This leads to the linear quadratic regulator (LQR) system dealing with state regulation, output regulation, and tracking. Broadly speaking, we are interested in the design of optimal linear systems with quadratic performance indices. It is suggested that the student review the material in Appendices A and B given at the end of the book. This chapter is inspired by [6, 89]¹.

3.1 Problem Formulation

We discuss the plant and the quadratic performance index with particular reference to physical significance. This helps us to obtain some elegant mathematical conditions on the choice of various matrices in the quadratic cost functional. Thus, we will be dealing with an optimization problem from the engineering perspective. Consider a linear, time-varying (LTV) system

$$\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{3.1.1}$$
$$y(t) = C(t)x(t) \tag{3.1.2}$$

¹The permissions given by John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons, Inc., New York, NY, 1986, and McGraw-Hill for M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and Its Applications, McGraw-Hill Book Company, New York, NY, 1966, are hereby acknowledged.
with a cost functional (CF) or performance index (PI)

$$J(u(t)) = J(x(t_0), u(t), t_0) = \frac{1}{2}\left[z(t_f) - y(t_f)\right]' F(t_f)\left[z(t_f) - y(t_f)\right] + \frac{1}{2}\int_{t_0}^{t_f}\Big[\left[z(t) - y(t)\right]' Q(t)\left[z(t) - y(t)\right] + u'(t)R(t)u(t)\Big]\,dt \tag{3.1.3}$$

where $x(t)$ is the $n$th order state vector, $y(t)$ is the $m$th order output vector, $z(t)$ is the $m$th order reference or desired output vector (or $n$th order desired state vector, if the state $x(t)$ is available), $u(t)$ is the $r$th order control vector, and $e(t) = z(t) - y(t)$ (or $e(t) = z(t) - x(t)$, if the state $x(t)$ is directly available) is the $m$th order error vector. $A(t)$ is the $n \times n$ state matrix, $B(t)$ is the $n \times r$ control matrix, and $C(t)$ is the $m \times n$ output matrix. We assume that the control $u(t)$ is unconstrained, $0 < m \le r \le n$, and all the states and/or outputs are completely measurable.

The preceding cost functional (3.1.3) contains quadratic terms in the error $e(t)$ and the control $u(t)$ and is hence called the quadratic cost functional². We also make certain assumptions, to be described below, on the various matrices in the quadratic cost functional (3.1.3). Under these assumptions, we will find that the optimal control is closed-loop in nature, that is, the optimal control $u(t)$ is a function of the state $x(t)$ or the output $y(t)$. Also, depending on the final time $t_f$ being finite (infinite), the system is called a finite- (infinite-) time horizon system. Further, we have the following categories of systems.

1. If our objective is to keep the state $x(t)$ near zero (i.e., $z(t) = 0$ and $C = I$), then we call it a state regulator system. In other words, the objective is to obtain a control $u(t)$ which takes the plant described by (3.1.1) and (3.1.2) from a nonzero state to the zero state. This situation may arise when a plant is subjected to unwanted disturbances that perturb the state (for example, sudden load changes in an electrical voltage regulator system, or a sudden wind gust in a radar antenna positional control system).

2. If our interest is to keep the output $y(t)$ near zero (i.e., $z(t) = 0$), then it is termed the output regulator system.

²See Appendix A for more details on Quadratic Forms and Definiteness and other related topics.
3. If we try to keep the output or state near a desired state or output, then we are dealing with a tracking system. We see that in both state and output regulator systems, the desired or reference state is zero, and in a tracking system the error is to be made zero. For example, consider again the antenna control system to track an aircraft.

Let us consider the various matrices in the cost functional (3.1.3) and their implications.

1. The Error Weighted Matrix $Q(t)$: In order to keep the error $e(t)$ small and the error squared nonnegative, the integral of the expression $\frac{1}{2}e'(t)Q(t)e(t)$ should be nonnegative and small. Thus, the matrix $Q(t)$ must be positive semidefinite. Due to the quadratic nature of the weightage, we have to pay more attention to large errors than small errors.

2. The Control Weighted Matrix $R(t)$: The quadratic nature of the control cost expression $\frac{1}{2}u'(t)R(t)u(t)$ indicates that one has to pay higher cost for larger control effort. Since the cost of the control has to be a positive quantity, the matrix $R(t)$ should be positive definite.

3. The Control Signal $u(t)$: The assumption that there are no constraints on the control $u(t)$ is very important in obtaining the closed-loop optimal configuration.

Combining all the previous assumptions, we would like on one hand to keep the error small, but on the other hand, we must not pay higher cost to large controls.

4. The Terminal Cost Weighted Matrix $F(t_f)$: The main purpose of this term is to ensure that the error $e(t)$ at the final time $t_f$ is as small as possible. To guarantee this, the corresponding matrix $F(t_f)$ should be positive semidefinite.

Further, without loss of generality, we assume that the weighted matrices $Q(t)$, $R(t)$, and $F(t)$ are symmetric. The quadratic cost functional described previously has some attractive features:

(a) It provides an elegant procedure for the design of a closed-loop optimal controller.

(b) It results in an optimal feedback control that is a linear function of the state. That is why we often say that the "quadratic performance index fits like a glove" [6].

5. Infinite Final Time: When the final time $t_f$ is infinity, the terminal cost term involving $F(t_f)$ does not provide any realistic sense since we are always interested in the solutions over finite time. Hence, $F(t_f)$ must be zero.

3.2 Finite-Time Linear Quadratic Regulator

Now we proceed with the linear quadratic regulator (LQR) system, that is, to keep the state near zero during the interval of interest. For the sake of completeness we shall repeat the plant and performance index equations described in the earlier section. Consider a linear, time-varying plant described by

$$\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{3.2.1}$$

with a cost functional

$$J(u) = J(x(t_0), u(t), t_0) = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right] dt$$
$$= \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\begin{bmatrix} x'(t) & u'(t) \end{bmatrix}\begin{bmatrix} Q(t) & 0 \\ 0 & R(t) \end{bmatrix}\begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt \tag{3.2.2}$$

where the various vectors and matrices are defined in the last section. Let us note that here, the reference or desired state $z(t) = 0$ and hence the error $e(t) = 0 - x(t)$ is (apart from sign) the state itself, thereby implying a state regulator system. We summarize again the various assumptions as follows.

1. The control $u(t)$ is unconstrained. However, in many physical situations, there are limitations on the control and state, and the case of constrained control is discussed in a later chapter.
2. The initial condition $x(t = t_0) = x_0$ is given. The terminal time $t_f$ is specified, and the final state $x(t_f)$ is not specified.

3. The terminal cost matrix $F(t_f)$ and the error weighted matrix $Q(t)$ are $n \times n$ positive semidefinite matrices, and the control weighted matrix $R(t)$ is an $r \times r$ positive definite matrix.

4. Finally, the fraction $\frac{1}{2}$ in the cost functional (3.2.2) is there mainly to cancel a 2 that would otherwise have been carried throughout the result, as seen later.

We follow the Pontryagin procedure described in Chapter 2 (Table 2.1) to obtain the optimal solution and then propose the closed-loop configuration. First, let us list the various steps under which we present the method.

• Step 1: Hamiltonian
• Step 2: Optimal Control
• Step 3: State and Costate System
• Step 4: Closed-Loop Optimal Control
• Step 5: Matrix Differential Riccati Equation

Now let us discuss the preceding steps in detail.

• Step 1: Hamiltonian: Using the definition of the Hamiltonian given by (2.7.27) in Chapter 2 along with the performance index (3.2.2), formulate the Hamiltonian as

$$\mathcal{H}(x(t), u(t), \lambda(t)) = \frac{1}{2}x'(t)Q(t)x(t) + \frac{1}{2}u'(t)R(t)u(t) + \lambda'(t)\left[A(t)x(t) + B(t)u(t)\right] \tag{3.2.3}$$

where $\lambda(t)$ is the costate vector of $n$th order.

• Step 2: Optimal Control: Obtain the optimal control $u^*(t)$ using the control relation (2.7.29) as

$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; R(t)u^*(t) + B'(t)\lambda^*(t) = 0 \tag{3.2.4}$$

leading to

$$u^*(t) = -R^{-1}(t)B'(t)\lambda^*(t) \tag{3.2.5}$$

where we used

$$\frac{\partial}{\partial u}\left\{\frac{1}{2}u'(t)R(t)u(t)\right\} = R(t)u(t) \quad \text{and} \quad \frac{\partial}{\partial u}\left\{\lambda'(t)B(t)u(t)\right\} = B'(t)\lambda(t).$$

Similar expressions are used throughout the rest of the book. Further details on such relations are found in Appendix A. We immediately notice from (3.2.5) the need for $R(t)$ to be positive definite, and not merely positive semidefinite, so that the inverse $R^{-1}(t)$ exists.

• Step 3: State and Costate System: Obtain the state and costate equations according to (2.7.30) and (2.7.31) as

$$\dot{x}^*(t) = +\left(\frac{\partial \mathcal{H}}{\partial \lambda}\right)^* \;\longrightarrow\; \dot{x}^*(t) = A(t)x^*(t) + B(t)u^*(t) \tag{3.2.6}$$
$$\dot{\lambda}^*(t) = -\left(\frac{\partial \mathcal{H}}{\partial x}\right)^* \;\longrightarrow\; \dot{\lambda}^*(t) = -Q(t)x^*(t) - A'(t)\lambda^*(t). \tag{3.2.7}$$

Substitute the control relation (3.2.5) in the state equation (3.2.6) to obtain the (state and costate) canonical system (also called Hamiltonian system) of equations

$$\begin{bmatrix} \dot{x}^*(t) \\ \dot{\lambda}^*(t) \end{bmatrix} = \begin{bmatrix} A(t) & -E(t) \\ -Q(t) & -A'(t) \end{bmatrix}\begin{bmatrix} x^*(t) \\ \lambda^*(t) \end{bmatrix} \tag{3.2.8}$$

where $E(t) = B(t)R^{-1}(t)B'(t)$. The general boundary condition given by the relation (2.7.32) is reproduced here as

$$\left[\mathcal{H}^* + \frac{\partial S}{\partial t}\right]_{t_f}\delta t_f + \left[\left(\frac{\partial S}{\partial x}\right)^* - \lambda^*(t)\right]'_{t_f}\delta x_f = 0 \tag{3.2.9}$$

where $S$ equals the entire terminal cost term in the cost functional (3.2.2). Here, for our present system $t_f$ is specified, which makes $\delta t_f$ equal to zero in (3.2.9), and $x(t_f)$ is not specified, which makes $\delta x_f$ arbitrary in (3.2.9). Hence, the coefficient of $\delta x_f$ in (3.2.9) becomes zero, that is,

$$\lambda^*(t_f) = \left(\frac{\partial S}{\partial x(t_f)}\right)^* = \frac{\partial\left[\frac{1}{2}x'(t_f)F(t_f)x(t_f)\right]}{\partial x(t_f)} = F(t_f)x^*(t_f). \tag{3.2.10}$$
This final condition on the costate $\lambda^*(t_f)$, together with the initial condition on the state $x_0$ and the canonical system of equations (3.2.8), forms a two-point boundary value problem (TPBVP). The state-space representation of the set of relations for the state and costate system (3.2.8) and the control (3.2.5) is shown in Figure 3.1.

[Figure 3.1 State and Costate System]

• Step 4: Closed-Loop Optimal Control: The state-space representation shown in Figure 3.1 prompts us to think that we can obtain the optimal control $u^*(t)$ as a function (negative feedback) of the optimal state $x^*(t)$. Now to formulate a closed-loop optimal control, that is, to obtain the optimal control $u^*(t)$, which is a function of the costate $\lambda^*(t)$ as seen from (3.2.5), as a function of the state $x^*(t)$, let us examine the final condition on $\lambda^*(t)$ given by (3.2.10). This in fact relates the costate to the state at the final time $t_f$. Similarly, we may like to connect the costate with the state for the complete interval of time $[t_0, t_f]$. Thus, let us assume a transformation [113, 102]

$$\lambda^*(t) = P(t)x^*(t) \tag{3.2.11}$$

where $P(t)$ is yet to be determined. Then, we can easily see that with (3.2.11), the optimal control (3.2.5) becomes

$$u^*(t) = -R^{-1}(t)B'(t)P(t)x^*(t) \tag{3.2.12}$$

which is now a negative feedback of the state $x^*(t)$. Note that this negative feedback resulted from our "theoretical development" or "mathematics" of the optimal control procedure and was not introduced intentionally [6]. Differentiating (3.2.11) w.r.t. time $t$, we get

$$\dot{\lambda}^*(t) = \dot{P}(t)x^*(t) + P(t)\dot{x}^*(t). \tag{3.2.13}$$

Using the transformation (3.2.11) in the control, state and costate system of equations (3.2.5), (3.2.6) and (3.2.7), respectively, we get

$$\dot{x}^*(t) = A(t)x^*(t) - B(t)R^{-1}(t)B'(t)P(t)x^*(t), \tag{3.2.14}$$
$$\dot{\lambda}^*(t) = -Q(t)x^*(t) - A'(t)P(t)x^*(t). \tag{3.2.15}$$

Now, substituting the state and costate relations (3.2.14) and (3.2.15) in (3.2.13), we have

$$-Q(t)x^*(t) - A'(t)P(t)x^*(t) = \dot{P}(t)x^*(t) + P(t)\left[A(t)x^*(t) - B(t)R^{-1}(t)B'(t)P(t)x^*(t)\right] \;\longrightarrow$$
$$\left[\dot{P}(t) + P(t)A(t) + A'(t)P(t) + Q(t) - P(t)B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) = 0. \tag{3.2.16}$$

Essentially, we eliminated the costate function $\lambda^*(t)$ from the control (3.2.5), the state (3.2.6) and the costate (3.2.7) equations by introducing the transformation (3.2.11).

• Step 5: Matrix Differential Riccati Equation: Now the relation (3.2.16) should be satisfied for all $t \in [t_0, t_f]$ and for any choice of the initial state $x^*(t_0)$. Also, $P(t)$ is not dependent on the initial state. It follows that the equation (3.2.16) should hold good for any value of $x^*(t)$. This clearly means that the function $P(t)$ should satisfy the matrix differential equation

$$\dot{P}(t) + P(t)A(t) + A'(t)P(t) + Q(t) - P(t)B(t)R^{-1}(t)B'(t)P(t) = 0. \tag{3.2.17}$$

This is the matrix differential equation of the Riccati type, and is often called the matrix differential Riccati equation (DRE). Also, the transformation (3.2.11) is called the Riccati transformation, $P(t)$ is called the Riccati coefficient matrix or simply the Riccati matrix or Riccati coefficient, and (3.2.12) is the optimal control (feedback) law. The matrix DRE (3.2.17) can also be written in a compact form as

$$\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + P(t)E(t)P(t) \tag{3.2.18}$$

where $E(t) = B(t)R^{-1}(t)B'(t)$. Comparing the boundary condition (3.2.10) and the Riccati transformation (3.2.11), we have the final condition on $P(t)$ as

$$\lambda^*(t_f) = P(t_f)x^*(t_f) = F(t_f)x^*(t_f) \;\longrightarrow\; P(t_f) = F(t_f). \tag{3.2.19}$$

Thus, the matrix DRE (3.2.17) or (3.2.18) is to be solved backward in time using the final condition (3.2.19) to obtain the solution $P(t)$ for the entire interval $[t_0, t_f]$.

3.2.1 Symmetric Property of the Riccati Coefficient Matrix

Here, we first show an important property of the Riccati matrix $P(t)$. The fact that the $n \times n$ matrix $P(t)$ is symmetric for all $t \in [t_0, t_f]$, i.e., $P(t) = P'(t)$, can be easily shown as follows. First of all, let us note that from the formulation of the problem itself, the matrices $F(t_f)$, $Q(t)$, and $R(t)$ are symmetric and therefore, the matrix $B(t)R^{-1}(t)B'(t)$ is also symmetric. Now transposing both sides of the matrix DRE (3.2.18), we notice that both $P(t)$ and $P'(t)$ are solutions of the same differential equation and that both satisfy the same final condition (3.2.19). Since the solution with a given final condition is unique, $P(t) = P'(t)$.
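The backward solution of the DRE (3.2.18) from the final condition (3.2.19) can be sketched for a scalar plant. The following is a minimal illustration in Python, not the book's MATLAB program: the function names `riccati_rhs` and `solve_scalar_dre`, the fixed-step fourth-order Runge-Kutta scheme, and the numbers $a = 0$, $b = q = r = 1$, $f = 0$, $t_f = 2$ are all assumptions chosen because the exact solution is then $p(t) = \tanh(t_f - t)$.

```python
import math


def riccati_rhs(p, a, b, q, r):
    """Right-hand side of the scalar form of the DRE (3.2.18)."""
    e = b * b / r                      # scalar version of E = B R^-1 B'
    return -2.0 * a * p - q + e * p * p


def solve_scalar_dre(a, b, q, r, f, t0, tf, steps=2000):
    """Integrate p(t) backward from p(tf) = f with classical RK4.

    Returns lists (ts, ps) ordered from t0 to tf.
    """
    h = (tf - t0) / steps
    t, p = tf, f
    ts, ps = [t], [p]
    for _ in range(steps):
        # A step of size -h moves the solution backward in time.
        k1 = riccati_rhs(p, a, b, q, r)
        k2 = riccati_rhs(p - 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(p - 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(p - h * k3, a, b, q, r)
        p -= h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t -= h
        ts.append(t)
        ps.append(p)
    ts.reverse()
    ps.reverse()
    return ts, ps


# Illustrative numbers (an assumption, not taken from the text):
# a = 0, b = q = r = 1, f = 0, for which p(t) = tanh(tf - t).
ts, ps = solve_scalar_dre(a=0.0, b=1.0, q=1.0, r=1.0, f=0.0, t0=0.0, tf=2.0)
max_err = max(abs(p - math.tanh(2.0 - t)) for t, p in zip(ts, ps))
gain_at_t0 = ps[0]                     # R^-1 B' P(t0) with b = r = 1
```

The stored samples `ps` play the role of the precomputed $P(t)$ that the feedback law (3.2.12) consumes when the plant later runs forward in time.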
3.2.2 Optimal Control

Is the optimal control $u^*(t)$ a minimum? This can be answered by considering the second partials of the Hamiltonian (3.2.3). Let us recall from Chapter 2 that this is done by examining the second variation of the cost functional. Thus, the condition (2.7.41) (reproduced here for convenience) for examining the nature of optimal control is that the matrix

$$\Pi = \begin{bmatrix} \dfrac{\partial^2 \mathcal{H}}{\partial x^2} & \dfrac{\partial^2 \mathcal{H}}{\partial x\,\partial u} \\[2ex] \dfrac{\partial^2 \mathcal{H}}{\partial u\,\partial x} & \dfrac{\partial^2 \mathcal{H}}{\partial u^2} \end{bmatrix}^* \tag{3.2.20}$$

must be positive definite (negative definite) for minimum (maximum). In most of the cases this reduces to the condition that

$$\left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)^* \tag{3.2.21}$$

must be positive definite (negative definite) for minimum (maximum). Now using the Hamiltonian (3.2.3) and calculating the various partials,

$$\left(\frac{\partial^2 \mathcal{H}}{\partial x^2}\right)^* = Q(t), \quad \left(\frac{\partial^2 \mathcal{H}}{\partial x\,\partial u}\right)^* = 0, \quad \left(\frac{\partial^2 \mathcal{H}}{\partial u^2}\right)^* = R(t). \tag{3.2.22}$$

Substituting the previous partials in the condition (3.2.20), we have

$$\Pi = \begin{bmatrix} Q(t) & 0 \\ 0 & R(t) \end{bmatrix}. \tag{3.2.23}$$

Since $R(t)$ is positive definite, and $Q(t)$ is positive semidefinite, it follows that the preceding matrix (3.2.23) is only positive semidefinite. However, the condition that the second partial of $\mathcal{H}$ w.r.t. $u^*(t)$, which is $R(t)$, is positive definite is enough to guarantee that the control $u^*(t)$ is a minimum.

3.2.3 Optimal Performance Index

Here, we show how to obtain an expression for the optimal value of the performance index.
THEOREM 3.1
The optimal value of the PI (3.2.2) is given by

$$J^*(x^*(t), t) = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t). \tag{3.2.24}$$

Proof: First let us note that

$$\int_{t_0}^{t_f}\frac{d}{dt}\left(\frac{1}{2}x^{*\prime}(t)P(t)x^*(t)\right)dt = -\frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) + \frac{1}{2}x^{*\prime}(t_f)P(t_f)x^*(t_f). \tag{3.2.25}$$

Substituting for $\frac{1}{2}x^{*\prime}(t_f)P(t_f)x^*(t_f)$ from (3.2.25) into the PI (3.2.2), and noting that $P(t_f) = F(t_f)$ from (3.2.19), we get

$$J^*(x^*(t_0), t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) + \frac{1}{2}\int_{t_0}^{t_f}\left[x^{*\prime}(t)Q(t)x^*(t) + u^{*\prime}(t)R(t)u^*(t) + \frac{d}{dt}\left(x^{*\prime}(t)P(t)x^*(t)\right)\right]dt$$
$$= \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) + \frac{1}{2}\int_{t_0}^{t_f}\left[x^{*\prime}(t)Q(t)x^*(t) + u^{*\prime}(t)R(t)u^*(t) + \dot{x}^{*\prime}(t)P(t)x^*(t) + x^{*\prime}(t)\dot{P}(t)x^*(t) + x^{*\prime}(t)P(t)\dot{x}^*(t)\right]dt. \tag{3.2.26}$$

Now, using the state equation (3.2.14) for $\dot{x}^*(t)$ and the control (3.2.12) for $u^*(t)$, we get

$$J^*(x^*(t_0), t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) + \frac{1}{2}\int_{t_0}^{t_f}x^{*\prime}(t)\left[Q(t) + A'(t)P(t) + P(t)A(t) - P(t)B(t)R^{-1}(t)B'(t)P(t) + \dot{P}(t)\right]x^*(t)\,dt. \tag{3.2.27}$$

Finally, using the matrix DRE (3.2.18) in the previous relation, the integral part becomes zero. Thus,

$$J^*(x^*(t_0), t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0). \tag{3.2.28}$$
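The result (3.2.28) can be checked numerically on a scalar case. The sketch below is an illustration only and makes several assumptions that are not in the text: the plant is $\dot{x} = u$ with $q = r = 1$, $F = 0$, $t_f = 2$ and $x_0 = 3$, for which the Riccati coefficient is $p(t) = \tanh(t_f - t)$; the names `p_of_t`, `closed_loop_rhs` and `simulate_cost` are hypothetical. It simulates the closed loop forward in time, accumulates the integrand of (3.2.2), and compares the total with $\frac{1}{2}x_0^2\,p(t_0)$.

```python
import math


def p_of_t(t, tf):
    """Riccati coefficient for x' = u, q = r = 1, F = 0: p(t) = tanh(tf - t)."""
    return math.tanh(tf - t)


def closed_loop_rhs(t, x, tf):
    """Time derivatives of the state and of the running cost."""
    u = -p_of_t(t, tf) * x             # u* = -R^-1 B' P x with b = r = 1
    dx = u                             # plant: x' = u
    dc = 0.5 * (x * x + u * u)         # integrand of (3.2.2) with q = r = 1
    return dx, dc


def simulate_cost(x0, t0, tf, steps=2000):
    """RK4 simulation of the closed loop; returns x(tf) and the accumulated cost."""
    h = (tf - t0) / steps
    t, x, c = t0, x0, 0.0
    for _ in range(steps):
        dx1, dc1 = closed_loop_rhs(t, x, tf)
        dx2, dc2 = closed_loop_rhs(t + 0.5 * h, x + 0.5 * h * dx1, tf)
        dx3, dc3 = closed_loop_rhs(t + 0.5 * h, x + 0.5 * h * dx2, tf)
        dx4, dc4 = closed_loop_rhs(t + h, x + h * dx3, tf)
        x += h * (dx1 + 2.0 * dx2 + 2.0 * dx3 + dx4) / 6.0
        c += h * (dc1 + 2.0 * dc2 + 2.0 * dc3 + dc4) / 6.0
        t += h
    return x, c


x0, t0, tf = 3.0, 0.0, 2.0
x_final, j_simulated = simulate_cost(x0, t0, tf)
j_predicted = 0.5 * x0 * x0 * p_of_t(t0, tf)   # (3.2.28) for a scalar state
```

Because the terminal weight is zero in this assumed case, the accumulated integral alone is the whole cost, so `j_simulated` and `j_predicted` should agree to within the integration error.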
Now, the previous relation is also valid for any $x^*(t)$. Thus,

$$J^*(x^*(t), t) = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t). \tag{3.2.29}$$

In terms of the final time $t_f$, the previous optimal cost becomes

$$J^*(x^*(t_f), t_f) = \frac{1}{2}x^{*\prime}(t_f)F(t_f)x^*(t_f). \tag{3.2.30}$$

Since we are normally given the initial state $x(t_0)$ and the Riccati coefficient $P(t)$ is solved for all time $t$, it is more convenient to use the relation (3.2.28).

3.2.4 Finite-Time Linear Quadratic Regulator: Time-Varying Case: Summary

Given a linear, time-varying plant

$$\dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{3.2.31}$$

and a quadratic performance index

$$J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt \tag{3.2.32}$$

where $u(t)$ is not constrained, $t_f$ is specified, and $x(t_f)$ is not specified, $F(t_f)$ and $Q(t)$ are $n \times n$ symmetric, positive semidefinite matrices, and $R(t)$ is an $r \times r$ symmetric, positive definite matrix, the optimal control is given by

$$u^*(t) = -R^{-1}(t)B'(t)P(t)x^*(t) = -K(t)x^*(t) \tag{3.2.33}$$

where $K(t) = R^{-1}(t)B'(t)P(t)$ is called the Kalman gain and $P(t)$, the $n \times n$ symmetric, positive definite matrix (for all $t \in [t_0, t_f)$), is the solution of the matrix differential Riccati equation (DRE)

$$\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + P(t)B(t)R^{-1}(t)B'(t)P(t) \tag{3.2.34}$$

satisfying the final condition

$$P(t = t_f) = F(t_f). \tag{3.2.35}$$
The optimal state is the solution of

$$\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) \tag{3.2.36}$$

and the optimal cost is

$$J^* = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t). \tag{3.2.37}$$

The optimal control $u^*(t)$, given by (3.2.33), is linear in the optimal state $x^*(t)$. The entire procedure is now summarized in Table 3.1.

Table 3.1 Procedure Summary of Finite-Time Linear Quadratic Regulator System: Time-Varying Case

A. Statement of the Problem
Given the plant as $\dot{x}(t) = A(t)x(t) + B(t)u(t)$, the performance index as $J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt$, and the boundary conditions as $x(t_0) = x_0$, $t_f$ is fixed, and $x(t_f)$ is free, find the optimal control, state and performance index.

B. Solution of the Problem
Step 1 Solve the matrix differential Riccati equation $\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + P(t)B(t)R^{-1}(t)B'(t)P(t)$ with final condition $P(t = t_f) = F(t_f)$.
Step 2 Solve the optimal state $x^*(t)$ from $\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t)$ with initial condition $x(t_0) = x_0$.
Step 3 Obtain the optimal control $u^*(t)$ as $u^*(t) = -K(t)x^*(t)$ where $K(t) = R^{-1}(t)B'(t)P(t)$.
Step 4 Obtain the optimal performance index from $J^* = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t)$.

Note: It is simple to see that one can absorb the $\frac{1}{2}$ that is associated with $J$ by redefining a performance measure as

$$J_2 = 2J = x'(t_f)F(t_f)x(t_f) + \int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt, \tag{3.2.38}$$

to get the corresponding matrix differential Riccati equation for $J_2$ as

$$\frac{\dot{P}_2(t)}{2} = -\frac{P_2(t)}{2}A(t) - A'(t)\frac{P_2(t)}{2} - Q(t) + \frac{P_2(t)}{2}B(t)R^{-1}(t)B'(t)\frac{P_2(t)}{2} \tag{3.2.39}$$

with final condition

$$\frac{P_2(t_f)}{2} = F(t_f). \tag{3.2.40}$$

Comparing the previous DRE for $J_2$ with the corresponding DRE (3.2.34) for $J$, we can easily see that $P_2(t) = 2P(t)$ and hence the optimal control becomes

$$u^*(t) = -R^{-1}(t)B'(t)\frac{P_2(t)}{2}x^*(t) = -\frac{K_2(t)}{2}x^*(t) = -R^{-1}(t)B'(t)P(t)x^*(t) = -K(t)x^*(t). \tag{3.2.41}$$

Thus, using $J_2$ without the $\frac{1}{2}$ in the performance index, we get the same optimal control (3.2.41) for the original plant (3.2.31), the only difference being that the Riccati coefficient matrix $P_2(t)$ is twice $P(t)$ and $J_2$ is twice $J$ (for example, see [3, 42]). However, we will retain the $\frac{1}{2}$ in $J$ throughout the book due to the obvious simplifications in obtaining the optimal control, state and costate equations (3.2.4), (3.2.6) and (3.2.7), respectively. Precisely, the factor $\frac{1}{2}$ in the PI (3.2.2), and hence in the Hamiltonian (3.2.3), gets eliminated while taking partial derivatives of the Hamiltonian w.r.t. the control, state and costate functions.

3.2.5 Salient Features

We next discuss the various salient features of the state regulator system and the matrix differential Riccati equation.

1. Riccati Coefficient: The Riccati coefficient matrix $P(t)$ is a time-varying matrix which depends upon the system matrices $A(t)$ and $B(t)$, the performance index (design) matrices $Q(t)$, $R(t)$ and $F(t_f)$, and the terminal time $t_f$, but $P(t)$ does not depend upon the initial state $x(t_0)$ of the system.

2. $P(t)$ is symmetric and hence it follows that the $n \times n$ order matrix DRE (3.2.18) represents a system of $n(n + 1)/2$ first order, nonlinear, time-varying, ordinary differential equations.
3. Optimal Control: From (3.2.21), we see that the optimal control $u^*(t)$ is a minimum (maximum) if the control weighted matrix $R(t)$ is positive definite (negative definite).

4. Optimal State: Using the optimal control (3.2.12) in the state equation (3.2.1), we have

$$\dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) = G(t)x^*(t) \tag{3.2.42}$$

where

$$G(t) = A(t) - B(t)R^{-1}(t)B'(t)P(t). \tag{3.2.43}$$

The solution of this state differential equation along with the initial condition $x(t_0)$ gives the optimal state $x^*(t)$. Let us note that there is no condition on the closed-loop matrix $G(t)$ regarding stability as long as we are considering the finite final time ($t_f$) system.

5. Optimal Cost: It is shown in (3.2.29) that the minimum cost $J^*$ is given by

$$J^* = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t) \quad \text{for all } t \in [t_0, t_f] \tag{3.2.44}$$

where $P(t)$ is the solution of the matrix DRE (3.2.18), and $x^*(t)$ is the solution of the closed-loop optimal system (3.2.42).

6. Definiteness of the Matrix $P(t)$: Since $F(t_f)$ is positive semidefinite, and $P(t_f) = F(t_f)$, we can easily say that $P(t_f)$ is positive semidefinite. We can argue that $P(t)$ is positive definite for all $t \in [t_0, t_f)$. Suppose that $P(t)$ is not positive definite for some $t = t_s < t_f$; then there exists a corresponding state $x^*(t_s)$ such that the cost function $\frac{1}{2}x^{*\prime}(t_s)P(t_s)x^*(t_s) \le 0$, which clearly violates the fact that the minimum cost has to be a positive quantity. Hence, $P(t)$ is positive definite for all $t \in [t_0, t_f)$. Since we already know that $P(t)$ is symmetric, we now have that $P(t)$ is a positive definite, symmetric matrix.

7. Computation of Matrix DRE: Under some conditions we can get an analytical solution for the nonlinear matrix DRE as shown later. But in general, we may try to solve the matrix DRE (3.2.18) by integrating backwards from its known final condition (3.2.19).
8. Independence of the Riccati Coefficient Matrix $P(t)$: The matrix $P(t)$ is independent of the optimal state $x^*(t)$, so that once the system and the cost are specified, that is, once we are given the system/plant matrices $A(t)$ and $B(t)$, and the performance index matrices $F(t_f)$, $Q(t)$, and $R(t)$, we can independently compute the matrix $P(t)$ before the optimal system operates in the forward direction from its initial condition. Typically, we compute (off-line) the matrix $P(t)$ backward in the interval $t \in [t_f, t_0]$, store the values separately, and feed these stored values when the system is operating in the forward direction in the interval $t \in [t_0, t_f]$.

9. Implementation of the Optimal Control: The block diagram implementing the closed-loop optimal controller (CLOC) is shown in Figure 3.2. The figure shows clearly that the CLOC gets its values of $P(t)$ externally, after solving the matrix DRE backward in time from $t = t_f$ to $t = t_0$, and hence there is no way that we can implement the closed-loop optimal control configuration on-line. It is to be noted that the optimal control $u^*(t)$ can be solved and implemented in open-loop configuration by using the Pontryagin procedure given in Chapter 2. In that case, the open-loop optimal controller (OLOC) is quite cumbersome compared to the equivalent closed-loop optimal controller, as will be illustrated later in this chapter.

10. Linear Optimal Control: The optimal feedback control $u^*(t)$ given by (3.2.12) is written as

$$u^*(t) = -K(t)x^*(t) \tag{3.2.45}$$

where the Kalman gain $K(t) = R^{-1}(t)B'(t)P(t)$. Or alternatively, we can write

$$u^*(t) = -K_a'(t)x^*(t) \tag{3.2.46}$$

where $K_a(t) = P(t)B(t)R^{-1}(t)$. The previous optimal control is linear in the state $x^*(t)$. This is one of the nice features of the optimal control of linear systems with quadratic cost functionals. Also, note that the negative feedback in the optimal control relation (3.2.46) emerged from the theory of optimal control and was not introduced intentionally in our development.
[Figure 3.2 Closed-Loop Optimal Control Implementation: the plant with state $x^*(t)$, the closed-loop optimal controller $-R^{-1}(t)B'(t)P(t)x^*(t)$, and the off-line simulation of $P(t)$]

11. Controllability: Do we need the controllability condition on the system for implementing the optimal feedback control? No, as long as we are dealing with a finite time ($t_f$) system, because the contribution of those uncontrollable states (which are also unstable) to the cost function is still a finite quantity only. However, if we consider an infinite time interval, we certainly need the controllability condition, as we will see in the next section.

A historical note is very appropriate on the Riccati equation [22, 132]. The matrix Riccati equation has its origin in the scalar version of the equation

$$\dot{x}(t) = ax^2(t) + bx(t) + c \tag{3.2.47}$$

with time-varying coefficients, proposed by Jacopo Francesco Riccati around 1715. Riccati (1676-1754) gave the methods of solutions to the Riccati equation. However, the original paper by Riccati was not published immediately because he had the "suspicion" that the work was already known to people such as the Bernoullis. The importance of the Riccati equation, which has been studied in the last two centuries by an extensive number of scientists and engineers, need not be overstressed. The matrix Riccati equation, which is a generalization in matrix form of the original scalar equation, plays a very important role in a range of control and systems theory areas such as linear quadratic optimal control, stability, stochastic filtering and control, synthesis of passive networks, differential games and, more recently, $H_\infty$-control and robust stabilization and control. Did Riccati ever imagine that his equation, proposed more than a quarter millennium ago, would play such an important and ubiquitous role in modern control engineering and other related fields?

3.2.6 LQR System for General Performance Index

In this subsection, we address the state regulator system with a more general performance index than given by (3.2.2). Consider a linear, time-varying plant described by

$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \tag{3.2.48}$$

with a cost functional

$$J(u) = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + 2x'(t)S(t)u(t) + u'(t)R(t)u(t)\right]dt$$
$$= \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\begin{bmatrix} x'(t) & u'(t) \end{bmatrix}\begin{bmatrix} Q(t) & S(t) \\ S'(t) & R(t) \end{bmatrix}\begin{bmatrix} x(t) \\ u(t) \end{bmatrix}dt, \tag{3.2.49}$$

where the various vectors and matrices are defined in earlier sections and $S(t)$ is an $n \times r$ cross-weighting matrix. Since $S(t)$ is in general not square, definiteness is required of the composite weighting matrix in (3.2.49) rather than of $S(t)$ itself: with $R(t)$ positive definite, the composite matrix should be positive semidefinite.
Using the identical procedure as for the LQR system, we get the matrix differential Riccati equation as

$$\dot{P}(t) = -P(t)A(t) - A'(t)P(t) - Q(t) + \left[P(t)B(t) + S(t)\right]R^{-1}(t)\left[B'(t)P(t) + S'(t)\right] \tag{3.2.50}$$

with the final condition on $P(t)$ as

$$P(t_f) = F(t_f). \tag{3.2.51}$$

The optimal control is then given by

$$u(t) = -R^{-1}(t)\left[S'(t) + B'(t)P(t)\right]x(t). \tag{3.2.52}$$

Obviously, when $S(t)$ is made zero in the previous analysis, we get the previous results shown in Table 3.1.

3.3 Analytical Solution to the Matrix Differential Riccati Equation

In this section, we explore an analytical solution for the matrix differential Riccati equation (DRE). This material is based on [138, 89]. Let us rewrite the Hamiltonian system (3.2.8) of the state and costate equations for the time-invariant case as (omitting * for the sake of simplicity)

$$\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} = \begin{bmatrix} A & -E \\ -Q & -A' \end{bmatrix}\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} \tag{3.3.1}$$

where $E = BR^{-1}B'$. Let

$$\Delta = \begin{bmatrix} A & -E \\ -Q & -A' \end{bmatrix}. \tag{3.3.2}$$

Let us also recall that by the transformation $\lambda(t) = P(t)x(t)$, we get the differential matrix Riccati equation (3.2.18), rewritten (for time-invariant matrices $A$, $B$, $Q$ and $R$) as

$$\dot{P}(t) = -P(t)A - A'P(t) - Q + P(t)BR^{-1}B'P(t), \tag{3.3.3}$$

with the final condition

$$P(t_f) = F. \tag{3.3.4}$$
The solution $P(t)$ can be obtained analytically (in contrast to numerical integration) in terms of the eigenvalues and eigenvectors of the Hamiltonian matrix $\Delta$. In order to find the analytical solution to the differential Riccati equation (3.3.3), it is necessary to show that if $\mu$ is an eigenvalue of the Hamiltonian matrix $\Delta$ in (3.3.2), then $-\mu$ is also an eigenvalue of $\Delta$ [89, 3]. For this, let us define

$$\Gamma = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix} \tag{3.3.5}$$

so that $\Gamma^{-1} = -\Gamma$. Then by a simple pre- and post-multiplication with $\Gamma$ we get

$$\Delta = \Gamma\Delta'\Gamma = -\Gamma\Delta'\Gamma^{-1}. \tag{3.3.6}$$

Now, if $\mu$ is an eigenvalue of $\Delta$ with corresponding eigenvector $v$,

$$\Delta v = \mu v \tag{3.3.7}$$

then

$$\Gamma\Delta'\Gamma v = \mu v, \qquad \Delta'\Gamma v = -\mu\Gamma v \tag{3.3.8}$$

where we used $\Gamma^{-1} = -\Gamma$. Rearranging (3.3.8) as

$$(\Gamma v)'\Delta = -\mu(\Gamma v)' \tag{3.3.9}$$

shows that $-\mu$ is an eigenvalue of $\Delta$, with $\Gamma v$ as the corresponding left eigenvector. Next, rearranging the eigenvalues of $\Delta$ as

$$D = \begin{bmatrix} -M & 0 \\ 0 & M \end{bmatrix} \tag{3.3.10}$$

where $M$ ($-M$) is a diagonal matrix with right-half-plane (left-half-plane) eigenvalues. Let $W$, the modal matrix of eigenvectors corresponding to $D$, be defined as

$$W = \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix}, \tag{3.3.11}$$

where $[W_{11} \;\; W_{21}]'$ are the $n$ eigenvectors of the left-half-plane (stable) eigenvalues of $\Delta$. Also,

$$W^{-1}\Delta W = D. \tag{3.3.12}$$
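The pairing of eigenvalues $\pm\mu$ can be illustrated without computing any eigenvalue: if every eigenvalue $\mu$ is accompanied by $-\mu$, the characteristic polynomial of $\Delta$ contains only even powers of $s$. The sketch below is an illustration under stated assumptions, not part of the book's development. It builds $\Delta$ from the data of Example 3.1 of this chapter ($A = [0\;1;\,-2\;1]$, $B = [0\;1]'$, $R = 1/4$, $Q = [2\;3;\,3\;5]$), checks the identity (3.3.6), and computes the characteristic polynomial with the Faddeev-LeVerrier recursion, which is a technique swapped in here for convenience. The helper names `matmul`, `transpose` and `char_poly` are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def char_poly(m):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(sI - m), Faddeev-LeVerrier."""
    n = len(m)
    coeffs = [1.0]
    aux = [[0.0] * n for _ in range(n)]          # M_0 = 0
    for k in range(1, n + 1):
        # M_k = m M_{k-1} + c_{n-k+1} I
        aux = matmul(m, aux)
        for i in range(n):
            aux[i][i] += coeffs[-1]
        prod = matmul(m, aux)
        coeffs.append(-sum(prod[i][i] for i in range(n)) / k)
    return coeffs


# Data of Example 3.1: A, E = B R^-1 B' with B = [0 1]' and R = 1/4, and Q.
A = [[0.0, 1.0], [-2.0, 1.0]]
E = [[0.0, 0.0], [0.0, 4.0]]
Q = [[2.0, 3.0], [3.0, 5.0]]
At = transpose(A)

# Hamiltonian matrix (3.3.2): Delta = [A -E; -Q -A'].
top = [A[i] + [-e for e in E[i]] for i in range(2)]
bottom = [[-q for q in Q[i]] + [-a for a in At[i]] for i in range(2)]
delta = top + bottom

# Gamma of (3.3.5) and the identity Delta = Gamma Delta' Gamma of (3.3.6).
gamma = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0],
         [-1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0]]
identity_holds = matmul(gamma, matmul(transpose(delta), gamma)) == delta

# Odd-power coefficients vanish when the eigenvalues occur in pairs (mu, -mu).
coeffs = char_poly(delta)
```

For larger problems one would normally call a numerical eigenvalue routine instead; the recursion is used here only to keep the sketch free of external libraries.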
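Let us now define a state transformation

$$\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = W\begin{bmatrix} w(t) \\ z(t) \end{bmatrix} = \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix}\begin{bmatrix} w(t) \\ z(t) \end{bmatrix}. \tag{3.3.13}$$

Then, using (3.3.12) and (3.3.13), the Hamiltonian system (3.3.1) becomes

$$\begin{bmatrix} \dot{w}(t) \\ \dot{z}(t) \end{bmatrix} = W^{-1}\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix} = W^{-1}\Delta\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = W^{-1}\Delta W\begin{bmatrix} w(t) \\ z(t) \end{bmatrix} = D\begin{bmatrix} w(t) \\ z(t) \end{bmatrix}. \tag{3.3.14}$$

Solving (3.3.14) in terms of the known final conditions, we have

$$\begin{bmatrix} w(t) \\ z(t) \end{bmatrix} = \begin{bmatrix} e^{-M(t - t_f)} & 0 \\ 0 & e^{M(t - t_f)} \end{bmatrix}\begin{bmatrix} w(t_f) \\ z(t_f) \end{bmatrix}. \tag{3.3.15}$$

Rewriting (3.3.15),

$$\begin{bmatrix} w(t_f) \\ z(t) \end{bmatrix} = \begin{bmatrix} e^{-M(t_f - t)} & 0 \\ 0 & e^{-M(t_f - t)} \end{bmatrix}\begin{bmatrix} w(t) \\ z(t_f) \end{bmatrix}. \tag{3.3.16}$$

Next, from (3.3.13) and using the final condition (3.3.4),

$$\lambda(t_f) = W_{21}w(t_f) + W_{22}z(t_f) = Fx(t_f) = F\left[W_{11}w(t_f) + W_{12}z(t_f)\right]. \tag{3.3.17}$$

Solving the previous relation for $z(t_f)$ in terms of $w(t_f)$,

$$z(t_f) = T(t_f)w(t_f), \quad \text{where} \quad T(t_f) = -\left[W_{22} - FW_{12}\right]^{-1}\left[W_{21} - FW_{11}\right]. \tag{3.3.18}$$

Again, from (3.3.16),

$$z(t) = e^{-M(t_f - t)}z(t_f) = e^{-M(t_f - t)}T(t_f)w(t_f) = e^{-M(t_f - t)}T(t_f)e^{-M(t_f - t)}w(t). \tag{3.3.19}$$

Rewriting the previous relation as

$$z(t) = T(t)w(t), \quad \text{where} \quad T(t) = e^{-M(t_f - t)}T(t_f)e^{-M(t_f - t)}. \tag{3.3.20}$$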
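For a scalar plant every block in (3.3.10) to (3.3.20) is a number, which makes the construction easy to follow. The sketch below is an illustration under assumptions, not the book's program: it takes the scalar Hamiltonian matrix $[a \;\, -e;\; -q \;\, -a]$ with $e = b^2/r \ne 0$, uses the eigenvectors $(e,\ a - \lambda)$ for the eigenvalues $\lambda = \mp\mu$, forms $T(t_f)$ and $T(t)$ as in (3.3.18) and (3.3.20), and then recovers $P(t)$ by combining $\lambda = Px$ with the transformation (3.3.13), that is, $P = (W_{21} + W_{22}T)/(W_{11} + W_{12}T)$. The function name `analytic_p` and the sample numbers are hypothetical.

```python
import math


def analytic_p(t, a, b, q, r, f, tf):
    """Scalar instance of the eigenvector construction for P(t)."""
    e = b * b / r                         # E = B R^-1 B' (assumed nonzero)
    mu = math.sqrt(a * a + e * q)         # Hamiltonian eigenvalues are -mu, +mu
    # Eigenvectors of [[a, -e], [-q, -a]]: (e, a - lam) for eigenvalue lam.
    w11, w21 = e, a + mu                  # column for the stable eigenvalue -mu
    w12, w22 = e, a - mu                  # column for the unstable eigenvalue +mu
    t_f = -(w21 - f * w11) / (w22 - f * w12)      # T(tf), see (3.3.18)
    decay = math.exp(-mu * (tf - t))
    t_t = decay * t_f * decay                     # T(t), see (3.3.20)
    return (w21 + w22 * t_t) / (w11 + w12 * t_t)


# Assumed test case: a = 0, b = q = r = 1, f = 0, where P(t) = tanh(tf - t).
p_mid = analytic_p(1.0, a=0.0, b=1.0, q=1.0, r=1.0, f=0.0, tf=2.0)
```

Far from the final time the factor `decay` goes to zero, so the expression tends to $W_{21}/W_{11} = (a + \mu)/e$, the positive root of the scalar algebraic equation $e\,p^2 - 2ap - q = 0$.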
Finally, to relate $P(t)$ in (3.3.3) to the relation (3.3.20) for $T(t)$, let us use (3.3.13) to write

$$\lambda(t) = W_{21}w(t) + W_{22}z(t) = P(t)x(t) = P(t)\left[W_{11}w(t) + W_{12}z(t)\right] \tag{3.3.21}$$

and by (3.3.20), the previous relation can be written as

$$\left[W_{21} + W_{22}T(t)\right]w(t) = P(t)\left[W_{11} + W_{12}T(t)\right]w(t). \tag{3.3.22}$$

Since the previous relation should hold good for all $x(t_0)$ and hence for all states $w(t)$, it implies that the analytical expression for the solution $P(t)$ is given by

$$P(t) = \left[W_{21} + W_{22}T(t)\right]\left[W_{11} + W_{12}T(t)\right]^{-1}. \tag{3.3.23}$$

3.3.1 MATLAB© Implementation of Analytical Solution to Matrix DRE

The solution of the matrix DRE (3.2.34) is not readily available with MATLAB, and hence a MATLAB-based program was developed for solving the matrix DRE based on the analytical solution of the matrix DRE [138] described earlier. The MATLAB solution is illustrated by the following example.

Example 3.1

Let us illustrate the previous procedure with a simple second order example. Given a second-order system

$$\dot{x}_1(t) = x_2(t), \quad x_1(0) = 2$$
$$\dot{x}_2(t) = -2x_1(t) + x_2(t) + u(t), \quad x_2(0) = -3, \tag{3.3.24}$$

and the performance index (PI)

$$J = \frac{1}{2}\left[x_1^2(5) + x_1(5)x_2(5) + 2x_2^2(5)\right] + \frac{1}{2}\int_0^5\left[2x_1^2(t) + 6x_1(t)x_2(t) + 5x_2^2(t) + 0.25u^2(t)\right]dt, \tag{3.3.25}$$

obtain the feedback control law.
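Solution: Comparing the present plant (3.3.24) and the PI (3.3.25) of the problem with the corresponding general formulations of the plant (3.2.31) and the PI (3.2.32), respectively, let us first identify the various quantities as

$$A(t) = \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}; \quad B(t) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad F(t_f) = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 2 \end{bmatrix};$$
$$Q(t) = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}; \quad R(t) = r(t) = \frac{1}{4}; \quad t_0 = 0; \quad t_f = 5.$$

It is easy to check that the system (3.3.24) is unstable. Let $P(t)$ be the 2x2 symmetric matrix

$$P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}. \tag{3.3.26}$$

Then, the optimal control (3.2.33) is given by

$$u^*(t) = -4 \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} x_1^*(t) \\ x_2^*(t) \end{bmatrix} = -4 \left[ p_{12}(t) x_1^*(t) + p_{22}(t) x_2^*(t) \right] \tag{3.3.27}$$

where, $P(t)$, the 2x2 symmetric, positive definite matrix, is the solution of the matrix DRE (3.2.34)

$$\begin{bmatrix} \dot{p}_{11}(t) & \dot{p}_{12}(t) \\ \dot{p}_{12}(t) & \dot{p}_{22}(t) \end{bmatrix} = -\begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} + \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} 4 \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} - \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix} \tag{3.3.28}$$

satisfying the final condition (3.2.35)

$$\begin{bmatrix} p_{11}(5) & p_{12}(5) \\ p_{12}(5) & p_{22}(5) \end{bmatrix} = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 2 \end{bmatrix}. \tag{3.3.29}$$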
Simplifying the matrix DRE (3.3.28), we get

$$\dot{p}_{11}(t) = 4p_{12}^2(t) + 4p_{12}(t) - 2, \quad p_{11}(5) = 1,$$
$$\dot{p}_{12}(t) = -p_{11}(t) - p_{12}(t) + 2p_{22}(t) + 4p_{12}(t) p_{22}(t) - 3, \quad p_{12}(5) = 0.5,$$
$$\dot{p}_{22}(t) = -2p_{12}(t) - 2p_{22}(t) + 4p_{22}^2(t) - 5, \quad p_{22}(5) = 2. \tag{3.3.30}$$

Solving the previous set of nonlinear differential equations backward in time with the given final conditions, one can obtain the numerical solutions for the Riccati coefficient matrix $P(t)$. However, here the solutions are obtained using the analytical solution as given earlier in this section. The solutions for the Riccati coefficients are plotted in Figure 3.3. Using these Riccati coefficients, the closed-loop optimal control system is shown in Figure 3.4. Using the optimal control $u^*(t)$ given by (3.3.27), the plant equations (3.3.24) are solved forward in time to obtain the optimal states $x_1^*(t)$ and $x_2^*(t)$ as shown in Figure 3.5 for the initial conditions $[2 \;\; -3]'$. Finally, the optimal control $u^*(t)$ is shown in Figure 3.6. The previous results are obtained using Control System Toolbox of the MATLAB©, Version 6 as shown below.

The following MATLAB© m file for Example 3.1 requires two additional MATLAB© files: lqrnss.m, which itself requires lqrnssf.m, both given in Appendix C.

****************************************************
%% Solution using Control System Toolbox of
%% the MATLAB, Version 6
%% The following file example.m requires
%% two other files lqrnss.m and lqrnssf.m
%% which are given in Appendix
clear all
A=[0.,1.;-2.,1.];
B=[0.;1.];
Q=[2.,3.;3.,5.];
F=[1.,0.5;0.5,2.];
R=[.25];
tspan=[0 5];
x0=[2.,-3.]';
[x,u,K]=lqrnss(A,B,F,Q,R,x0,tspan);
****************************************************
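The backward integration of the three scalar Riccati equations (3.3.30) can be sketched in a few lines. The Python fragment below is an illustration only and is not the method used for the book's results, which rely on the analytical solution and the files lqrnss.m and lqrnssf.m. The function names `riccati_rhs` and `integrate_backward`, the fixed-step fourth-order Runge-Kutta scheme, and the step count are all assumptions made here.

```python
def riccati_rhs(p):
    """Right-hand sides of the three scalar Riccati equations (3.3.30)."""
    p11, p12, p22 = p
    return (
        4.0 * p12 * p12 + 4.0 * p12 - 2.0,
        -p11 - p12 + 2.0 * p22 + 4.0 * p12 * p22 - 3.0,
        -2.0 * p12 - 2.0 * p22 + 4.0 * p22 * p22 - 5.0,
    )


def integrate_backward(p_final, t_final, t_start, steps):
    """Integrate (3.3.30) from t_final back to t_start with classical RK4."""
    h = (t_start - t_final) / steps  # negative step: time runs backward
    p = tuple(p_final)
    for _ in range(steps):
        k1 = riccati_rhs(p)
        k2 = riccati_rhs(tuple(x + 0.5 * h * k for x, k in zip(p, k1)))
        k3 = riccati_rhs(tuple(x + 0.5 * h * k for x, k in zip(p, k2)))
        k4 = riccati_rhs(tuple(x + h * k for x, k in zip(p, k3)))
        p = tuple(
            x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for x, a, b, c, d in zip(p, k1, k2, k3, k4)
        )
    return p


# Final conditions p11(5) = 1, p12(5) = 0.5, p22(5) = 2 from (3.3.29).
p_at_0 = integrate_backward((1.0, 0.5, 2.0), t_final=5.0, t_start=0.0, steps=5000)
print("p11(0), p12(0), p22(0) =", p_at_0)
```

Under these assumptions the coefficients at t = 0 should already be close to constant values, because the transient of the Riccati solution is concentrated near the final time.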
Figure 3.3 Riccati Coefficients for Example 3.1 (plot of the Riccati coefficients against time over the interval from 0 to 5)

3.4 Infinite-Time LQR System I

In this section, let us make the terminal (final) time $t_f$ infinite in the previous linear, time-varying, quadratic regulator system. This is then called the infinite-time (or infinite-horizon) linear quadratic regulator system [6, 3]. Consider a linear, time-varying plant

$$\dot{x}(t) = A(t) x(t) + B(t) u(t), \tag{3.4.1}$$

and a quadratic performance index

$$J = \frac{1}{2} \int_{t_0}^{\infty} \left[ x'(t) Q(t) x(t) + u'(t) R(t) u(t) \right] dt, \tag{3.4.2}$$

where, $u(t)$ is not constrained. Also, $Q(t)$ is an nxn symmetric, positive semidefinite matrix, and $R(t)$ is an rxr symmetric, positive definite matrix. Note, it makes no engineering sense to have a terminal cost term with terminal time being infinite. This problem cannot always be solved without some special conditions. For example, if any one of the states is uncontrollable and/or unstable, the corresponding performance measure $J$ will become infinite
and makes no physical sense. On the other hand, with the finite-time system, the performance measure is always finite. Thus, we need to impose the condition that the system (3.4.1) is completely controllable.

Using results similar to the previous case of finite final time $t_f$ (see Table 3.1), the optimal control for the infinite-horizon linear regulator system is obtained as

$$u^*(t) = -R^{-1}(t) B'(t) \hat{P}(t) x^*(t), \tag{3.4.3}$$

where,

$$\hat{P}(t) = \lim_{t_f \to \infty} \{ P(t) \}, \tag{3.4.4}$$

the nxn symmetric, positive definite matrix (for all $t \in [t_0, t_f]$), is the solution of the matrix differential Riccati equation (DRE)

$$\dot{\hat{P}}(t) = -\hat{P}(t) A(t) - A'(t) \hat{P}(t) - Q(t) + \hat{P}(t) B(t) R^{-1}(t) B'(t) \hat{P}(t). \tag{3.4.5}$$

Figure 3.4 Closed-Loop Optimal Control System for Example 3.1 (block diagram of the plant with the closed-loop optimal controller)
Figure 3.5 Optimal States for Example 3.1 (plot of the optimal states against time over the interval from 0 to 5)

Figure 3.6 Optimal Control for Example 3.1 (plot of the optimal control against time over the interval from 0 to 5)
The solution $\hat{P}(t)$ of (3.4.5) satisfies the final condition

$$\hat{P}(t = t_f \to \infty) = 0. \tag{3.4.6}$$

The optimal cost is given by

$$J^* = \frac{1}{2} x^{*\prime}(t) \hat{P}(t) x^*(t). \tag{3.4.7}$$

The proofs for the previous results are found in optimal control texts specializing in quadratic methods [3]. Example 3.1 can be easily solved for $t_f \to \infty$ and $F = 0$.

3.4.1 Infinite-Time Linear Quadratic Regulator: Time-Varying Case: Summary

Consider a linear, time-varying plant

$$\dot{x}(t) = A(t) x(t) + B(t) u(t), \tag{3.4.8}$$

and a quadratic performance index

$$J = \frac{1}{2} \int_{t_0}^{\infty} \left[ x'(t) Q(t) x(t) + u'(t) R(t) u(t) \right] dt, \tag{3.4.9}$$

where, $u(t)$ is not constrained and $x(t_f)$, $t_f \to \infty$, is not specified. Also, $Q(t)$ is an nxn symmetric, positive semidefinite matrix, and $R(t)$ is an rxr symmetric, positive definite matrix. Then, the optimal control is given by

$$u^*(t) = -R^{-1}(t) B'(t) \hat{P}(t) x^*(t) \tag{3.4.10}$$

where, $\hat{P}(t)$, the nxn symmetric, positive definite matrix (for all $t \in [t_0, t_f]$), is the solution of the matrix differential Riccati equation (DRE)

$$\dot{\hat{P}}(t) = -\hat{P}(t) A(t) - A'(t) \hat{P}(t) - Q(t) + \hat{P}(t) B(t) R^{-1}(t) B'(t) \hat{P}(t) \tag{3.4.11}$$

satisfying the final condition

$$\hat{P}(t = t_f \to \infty) = 0. \tag{3.4.12}$$
The optimal state is the solution of

$$\dot{x}^*(t) = \left[ A(t) - B(t) R^{-1}(t) B'(t) \hat{P}(t) \right] x^*(t) \tag{3.4.13}$$

and the optimal cost is

$$J^* = \frac{1}{2} x^{*\prime}(t) \hat{P}(t) x^*(t). \tag{3.4.14}$$

The optimal control $u^*(t)$ given by (3.4.10) is linear in the optimal state $x^*(t)$. The entire procedure is now summarized in Table 3.2.

Table 3.2 Procedure Summary of Infinite-Time Linear Quadratic Regulator System: Time-Varying Case

A. Statement of the Problem
Given the plant as $\dot{x}(t) = A(t) x(t) + B(t) u(t)$, the performance index as $J = \frac{1}{2} \int_{t_0}^{\infty} \left[ x'(t) Q(t) x(t) + u'(t) R(t) u(t) \right] dt$, and the boundary conditions as $x(t_0) = x_0$, $x(\infty)$ is free, find the optimal control, state and performance index.

B. Solution of the Problem
Step 1 Solve the matrix differential Riccati equation (DRE) $\dot{\hat{P}}(t) = -\hat{P}(t) A(t) - A'(t) \hat{P}(t) - Q(t) + \hat{P}(t) B(t) R^{-1}(t) B'(t) \hat{P}(t)$ with final condition $\hat{P}(t = t_f \to \infty) = 0$.
Step 2 Solve the optimal state $x^*(t)$ from $\dot{x}^*(t) = \left[ A(t) - B(t) R^{-1}(t) B'(t) \hat{P}(t) \right] x^*(t)$ with initial condition $x(t_0) = x_0$.
Step 3 Obtain the optimal control $u^*(t)$ from $u^*(t) = -R^{-1}(t) B'(t) \hat{P}(t) x^*(t)$.
Step 4 Obtain the optimal performance index from $J^* = \frac{1}{2} x^{*\prime}(t) \hat{P}(t) x^*(t)$.

3.5 Infinite-Time LQR System II

In this section, we examine the state regulator system with infinite time interval for a linear time-invariant (LTI) system. Let us consider the
plant as

$$\dot{x}(t) = A x(t) + B u(t) \tag{3.5.1}$$

and the cost functional as

$$J = \frac{1}{2} \int_0^{\infty} \left[ x'(t) Q x(t) + u'(t) R u(t) \right] dt \tag{3.5.2}$$

where, $x(t)$ is the nth order state vector; $u(t)$ is the rth order control vector; $A$ is the nxn-order state matrix; $B$ is the nxr-order control matrix; $Q$ is an nxn-order, symmetric, positive semidefinite matrix; $R$ is an rxr-order, symmetric, positive definite matrix. First of all, let us discuss some of the implications of the time-invariance and the infinite final-time.

1. The infinite time interval case is considered for the following reasons:
(a) We wish to make sure that the state-regulator stays near zero state after the initial transient.
(b) We want to include any special case of large final time.

2. With infinite final-time interval, to include the final cost function does not make any practical sense. Hence, the final cost term involving $F(t_f)$ does not exist in the cost functional (3.5.2).

3. With infinite final-time interval, the system (3.5.1) has to be completely controllable. Let us recall that this controllability condition of the plant (3.5.1) requires that the controllability matrix (see Appendix B)

$$\begin{bmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{bmatrix} \tag{3.5.3}$$

must be nonsingular or contain n linearly independent column vectors. The controllability requirement guarantees that the optimal cost is finite. On the other hand, if the system is not controllable and some or all of those uncontrollable states are unstable, then the cost functional would be infinite since the control interval is infinite. In such situations, we cannot distinguish optimal control from the other controls. Alternatively, we can assume that the system (3.5.1) is completely stabilizable.
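The controllability condition on the matrix (3.5.3) can be checked numerically. The Python sketch below is only an illustration: the helper names `controllability_matrix` and `rank` are hypothetical, the rank is found by plain Gaussian elimination with an assumed tolerance, and the matrices A and B are those of the second-order plant of Example 3.1.

```python
def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] as a list of rows (A is n x n, B is n x r)."""
    n = len(A)
    block = [row[:] for row in B]   # current block A^k B
    rows = [row[:] for row in B]    # accumulated matrix, row by row
    for _ in range(n - 1):
        block = [
            [sum(A[i][k] * block[k][j] for k in range(n)) for j in range(len(block[0]))]
            for i in range(n)
        ]
        for i in range(n):
            rows[i].extend(block[i])
    return rows


def rank(M, tol=1e-12):
    """Rank of M by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        pivot = max(range(r, n_rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) <= tol:
            continue  # no usable pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            for j in range(c, n_cols):
                M[i][j] -= factor * M[r][j]
        r += 1
    return r


A = [[0.0, 1.0], [-2.0, 1.0]]
B = [[0.0], [1.0]]
C_matrix = controllability_matrix(A, B)
print("controllability matrix:", C_matrix)
print("completely controllable:", rank(C_matrix) == len(A))
```

For this plant the matrix has the columns B = [0, 1]' and AB = [1, 1]', which are linearly independent, so the sketch reports a completely controllable pair.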
As before in the case of finite final-time interval, we can proceed and obtain the closed-loop optimal control and the associated Riccati equation. Still, $P(t)$ must be the solution of the matrix differential Riccati equation (3.2.34) with boundary condition $P(t_f) = 0$. It was shown that the assumptions of [70], controllability among them, imply that

$$\lim_{t_f \to \infty} P(t) = \bar{P} \tag{3.5.4}$$

where, $\bar{P}$ is the nxn positive definite, symmetric, constant matrix. If $\bar{P}$ is constant, then $\bar{P}$ is the solution of the nonlinear, matrix, algebraic Riccati equation (ARE),

$$\frac{d\bar{P}}{dt} = 0 = -\bar{P} A - A' \bar{P} + \bar{P} B R^{-1} B' \bar{P} - Q. \tag{3.5.5}$$

Alternatively, we can write (3.5.5) as

$$\bar{P} A + A' \bar{P} + Q - \bar{P} B R^{-1} B' \bar{P} = 0. \tag{3.5.6}$$

Note, for a time-varying system with finite-time interval, we have the differential Riccati equation (3.2.34), whereas for a linear time-invariant system with infinite-time horizon, we have the algebraic Riccati equation (3.5.6).

A historical note on R.E. Kalman is appropriate (from SIAM News, 6/94, article about R.E. Kalman).

Rudolph E. Kalman is best known for the linear filtering technique that he and Richard Bucy [31] developed in 1960-1961 to strip unwanted noise out of a stream of data [71, 74, 76]. The Kalman filter, which is based on the use of state-space techniques and recursive algorithms, revolutionized the field of estimation. The Kalman filter is widely used in navigational and guidance systems, radar tracking, sonar ranging, and satellite orbit determination (for the Ranger, Apollo, and Mariner missions, for instance), as well as in fields as diverse as seismic data processing, nuclear power
plant instrumentation, and econometrics. Among Kalman's many outstanding contributions were the formulation and study of most fundamental state-space notions [72, 73, 77] including controllability, observability, minimality, realizability from input and output data, matrix Riccati equations, linear-quadratic control [70, 75], and the separation principle that are today ubiquitous in control. While some of these concepts were also encountered in other contexts, such as optimal control theory, it was Kalman who recognized the central role that they play in systems analysis.

Born in Hungary, Kalman received BS and MS degrees from the Massachusetts Institute of Technology (MIT) and a DSci in engineering from Columbia University in 1957. In the early years of his career he held research positions at International Business Machines (IBM) and at the Research Institute for Advanced Studies (RIAS) in Baltimore. From 1962 to 1971, he was at Stanford University. In 1971, he became a graduate research professor and director of the Center for Mathematical System Theory at the University of Florida, Gainesville, USA, and later retired with emeritus status. Kalman's contributions to control theory and to applied mathematics and engineering in general have been widely recognized with several honors and awards.

3.5.1 Meaningful Interpretation of Riccati Coefficient

Consider the matrix differential Riccati equation (3.2.34) with final condition $P(t_f) = 0$. Now consider a simple time transformation $\tau = t_f - t$. Then, in the $\tau$ scale we can think of the final time $t_f$ as the "starting time," $P(t_f)$ as the "initial condition," and $\bar{P}$ as the "steady-state solution" of the matrix DRE. As the time $t_f \to \infty$, the "transient solution" is pushed to near $t_f$, which is at infinity. Then for most of the practical time interval the matrix $P(t)$ becomes a steady state, i.e., a constant matrix $\bar{P}$, as shown in Figure 3.7 [6]. Then the optimal control is given by

$$u^*(t) = -R^{-1} B' \bar{P} x^*(t) = -\bar{K} x^*(t), \tag{3.5.7}$$
where, $\bar{K} = R^{-1} B' \bar{P}$ is called the Kalman gain.

Figure 3.7 Interpretation of the Constant Matrix $\bar{P}$ (sketch of $P(t)$ against time: a steady-state interval on which $P(t) = \bar{P}$, followed by a transient interval near $t_f$)

Alternatively, we can write

$$u^*(t) = -\bar{K}_a' x^*(t) \tag{3.5.8}$$

where, $\bar{K}_a = \bar{P} B R^{-1}$. The optimal state is the solution of the system obtained by using the control (3.5.8) in the plant (3.5.1)

$$\dot{x}^*(t) = \left[ A - B R^{-1} B' \bar{P} \right] x^*(t) = G x^*(t), \tag{3.5.9}$$

where, the matrix $G = A - B R^{-1} B' \bar{P}$ must have stable eigenvalues so that the closed-loop optimal system (3.5.9) is stable. This is required since any unstable states with infinite time interval would lead to an infinite cost functional $J^*$. Let us note that we have no constraint on the stability of the original system (3.5.1). This means that although the original system may be unstable, the optimal system must be definitely stable. Finally, the minimum cost (3.2.29) is given by

$$J^* = \frac{1}{2} x^{*\prime}(t) \bar{P} x^*(t). \tag{3.5.10}$$

3.5.2 Analytical Solution of the Algebraic Riccati Equation

The next step is to find the analytical expression for the steady-state (limiting) solution of the differential Riccati equation (3.2.34). Thus, we are interested in finding the analytical solution to the algebraic Riccati equation (3.5.5). Obviously, one can let the terminal time $t_f$ tend to
$\infty$ in the solution (3.3.23) for $P(t)$. As $t_f \to \infty$, $e^{-M(t_f - t)}$ goes to zero, which in turn makes $T(t)$ tend to zero. Thus, under the usual conditions of $(A, B)$ being stabilizable and $(A, \sqrt{Q})$ being reachable, and $T = 0$, we have from (3.3.23)

$$\lim_{t_f \to \infty} P(t, t_f) = \bar{P} = W_{21} W_{11}^{-1}. \tag{3.5.11}$$

Thus, the solution to the ARE is constructed by using the stable eigenvectors of the Hamiltonian matrix. For further treatment on this topic, consult [3] and the references therein.

3.5.3 Infinite-Interval Regulator System: Time-Invariant Case: Summary

For a controllable, linear, time-invariant plant

$$\dot{x}(t) = A x(t) + B u(t), \tag{3.5.12}$$

and the infinite interval cost functional

$$J = \frac{1}{2} \int_0^{\infty} \left[ x'(t) Q x(t) + u'(t) R u(t) \right] dt, \tag{3.5.13}$$

the optimal control is given by

$$u^*(t) = -R^{-1} B' \bar{P} x^*(t) \tag{3.5.14}$$

where, $\bar{P}$, the nxn constant, positive definite, symmetric matrix, is the solution of the nonlinear, matrix algebraic Riccati equation (ARE)

$$-\bar{P} A - A' \bar{P} + \bar{P} B R^{-1} B' \bar{P} - Q = 0, \tag{3.5.15}$$

the optimal trajectory is the solution of

$$\dot{x}^*(t) = \left[ A - B R^{-1} B' \bar{P} \right] x^*(t) \tag{3.5.16}$$

and the optimal cost is given by

$$J^* = \frac{1}{2} x^{*\prime}(t) \bar{P} x^*(t). \tag{3.5.17}$$

The entire procedure is now summarized in Table 3.3 and the implementation of the closed-loop optimal control (CLOC) is shown in Figure 3.8.
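For a scalar plant the eigenvector construction behind (3.3.23) and (3.5.11) can be written out by hand, which makes the limiting argument easy to check. The Python sketch below is an illustration under stated assumptions: the plant and weights a = -3, b = 1, q = 2, r = 2 are chosen here only as sample values, the function names are hypothetical, and the 2x2 Hamiltonian matrix is taken as [[a, -b^2/r], [-q, -a]] with the stable eigenvector placed in the first column of W.

```python
import math


def modal_blocks(a, b, q, r):
    """Scalar blocks W11, W12, W21, W22 and M for the 2 x 2 Hamiltonian matrix."""
    e = b * b / r
    m = math.sqrt(a * a + e * q)   # eigenvalues are -m and +m
    # An eigenvector belonging to the eigenvalue mu is (e, a - mu).
    w11, w21 = e, a + m            # mu = -m, the stable eigenvector
    w12, w22 = e, a - m            # mu = +m, the unstable eigenvector
    return w11, w12, w21, w22, m


def riccati_analytic(t, tf, a, b, q, r, f=0.0):
    """Scalar version of the analytical solution (3.3.23) with P(tf) = f."""
    w11, w12, w21, w22, m = modal_blocks(a, b, q, r)
    big_t_final = -(w21 - f * w11) / (w22 - f * w12)       # (3.3.18)
    decay = math.exp(-m * (tf - t))
    big_t = decay * big_t_final * decay                     # (3.3.20)
    return (w21 + w22 * big_t) / (w11 + w12 * big_t)        # (3.3.23)


def riccati_steady(a, b, q, r):
    """Scalar version of the limit (3.5.11)."""
    w11, _, w21, _, _ = modal_blocks(a, b, q, r)
    return w21 / w11


a, b, q, r = -3.0, 1.0, 2.0, 2.0
for tf in (1.0, 5.0, 20.0):
    print("tf =", tf, " P(0) =", riccati_analytic(0.0, tf, a, b, q, r))
print("steady state:", riccati_steady(a, b, q, r))
```

As the final time grows, the value at t = 0 approaches the steady-state value, which for these sample numbers equals -6 + 2*sqrt(10), the positive root of the corresponding scalar ARE.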
Figure 3.8 Implementation of the Closed-Loop Optimal Control: Infinite Final Time (block diagram of the plant and the closed-loop optimal controller)

Next, an example is given to illustrate the infinite interval regulator system and the associated matrix algebraic Riccati equation. Let us reconsider the same Example 3.1 with final time $t_f \to \infty$ and $F = 0$.

Example 3.2

Given a second order plant

$$\dot{x}_1(t) = x_2(t), \quad x_1(0) = 2$$
$$\dot{x}_2(t) = -2x_1(t) + x_2(t) + u(t), \quad x_2(0) = -3 \tag{3.5.18}$$

and the performance index

$$J = \frac{1}{2} \int_0^{\infty} \left[ 2x_1^2(t) + 6x_1(t) x_2(t) + 5x_2^2(t) + 0.25u^2(t) \right] dt, \tag{3.5.19}$$

obtain the feedback optimal control law.

Solution: Comparing the plant (3.5.18) and PI (3.5.19) of the present system with the corresponding general formulation of plant (3.5.12) and PI (3.5.13), respectively, let us first identify the various
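matrices as

$$A = \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix}; \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}; \quad Q = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}; \tag{3.5.20}$$
$$R = r = \frac{1}{4}; \quad t_0 = 0. \tag{3.5.21}$$

Table 3.3 Procedure Summary of Infinite-Interval Linear Quadratic Regulator System: Time-Invariant Case

A. Statement of the Problem
Given the plant as $\dot{x}(t) = A x(t) + B u(t)$, the performance index as $J = \frac{1}{2} \int_0^{\infty} \left[ x'(t) Q x(t) + u'(t) R u(t) \right] dt$, and the boundary conditions as $x(t_0) = x_0$, $x(\infty) = 0$, find the optimal control, state and index.

B. Solution of the Problem
Step 1 Solve the matrix algebraic Riccati equation (ARE) $-\bar{P} A - A' \bar{P} - Q + \bar{P} B R^{-1} B' \bar{P} = 0$.
Step 2 Solve the optimal state $x^*(t)$ from $\dot{x}^*(t) = \left[ A - B R^{-1} B' \bar{P} \right] x^*(t)$ with initial condition $x(t_0) = x_0$.
Step 3 Obtain the optimal control $u^*(t)$ from $u^*(t) = -R^{-1} B' \bar{P} x^*(t)$.
Step 4 Obtain the optimal performance index from $J^* = \frac{1}{2} x^{*\prime}(t) \bar{P} x^*(t)$.

Let $\bar{P}$ be the 2x2 symmetric matrix

$$\bar{P} = \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix}. \tag{3.5.22}$$

Then, the optimal control (3.5.14) is given by

$$u^*(t) = -4 \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix} \begin{bmatrix} x_1^*(t) \\ x_2^*(t) \end{bmatrix} = -4 \left[ \bar{p}_{12} x_1^*(t) + \bar{p}_{22} x_2^*(t) \right], \tag{3.5.23}$$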
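where, $\bar{P}$, the 2x2 symmetric, positive definite matrix, is the solution of the matrix algebraic Riccati equation (3.5.15)

$$\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = -\begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -2 & 1 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix} + \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} 4 \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} \bar{p}_{11} & \bar{p}_{12} \\ \bar{p}_{12} & \bar{p}_{22} \end{bmatrix} - \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}. \tag{3.5.24}$$

Simplifying the equation (3.5.24), we get

$$4\bar{p}_{12}^2 + 4\bar{p}_{12} - 2 = 0$$
$$-\bar{p}_{11} - \bar{p}_{12} + 2\bar{p}_{22} + 4\bar{p}_{12} \bar{p}_{22} - 3 = 0$$
$$-2\bar{p}_{12} - 2\bar{p}_{22} + 4\bar{p}_{22}^2 - 5 = 0. \tag{3.5.25}$$

Solving the previous equations for positive definiteness of $\bar{P}$ is easy in this particular case. Thus, solve the first equation in (3.5.25) for $\bar{p}_{12}$; using this value of $\bar{p}_{12}$ in the third equation, solve for $\bar{p}_{22}$; and finally, using the values of $\bar{p}_{12}$ and $\bar{p}_{22}$ in the second equation, solve for $\bar{p}_{11}$. In general, we have to solve the nonlinear algebraic equations and pick up the positive definite values for $\bar{P}$. Hence, we get

$$\bar{P} = \begin{bmatrix} 1.7363 & 0.3660 \\ 0.3660 & 1.4729 \end{bmatrix}. \tag{3.5.26}$$

Using these Riccati coefficients (gains), the closed-loop optimal control (3.5.23) is given by

$$u^*(t) = -4 \left[ 0.366 x_1^*(t) + 1.4729 x_2^*(t) \right] = -\left[ 1.464 x_1^*(t) + 5.8916 x_2^*(t) \right]. \tag{3.5.27}$$

Using the closed-loop optimal control $u^*(t)$ from (3.5.27) in the original open-loop system (3.5.18), the closed-loop optimal system becomes

$$\dot{x}_1^*(t) = x_2^*(t)$$
$$\dot{x}_2^*(t) = -2x_1^*(t) + x_2^*(t) - 4 \left[ 0.366 x_1^*(t) + 1.4729 x_2^*(t) \right] \tag{3.5.28}$$

and the implementation of the closed-loop optimal control is shown in Figure 3.9. Using the initial conditions and the Riccati coefficient matrix (3.5.26), the optimal cost (3.5.17) is obtained as

$$J^* = \frac{1}{2} x'(0) \bar{P} x(0) = \frac{1}{2} \begin{bmatrix} 2 & -3 \end{bmatrix} \begin{bmatrix} 1.7363 & 0.3660 \\ 0.3660 & 1.4729 \end{bmatrix} \begin{bmatrix} 2 \\ -3 \end{bmatrix} = 7.9047. \tag{3.5.29}$$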
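The three scalar equations of (3.5.25) can be solved in the order described above with nothing more than the quadratic formula. The Python sketch below illustrates that sequence and the resulting cost and closed-loop eigenvalues; the function names are hypothetical, and the eigenvalue helper assumes the closed-loop eigenvalues are real, which holds for this example.

```python
import math


def solve_are_sequentially():
    """Solve (3.5.25): first equation, then the third, then the second."""
    # First equation: 4 p12^2 + 4 p12 - 2 = 0, positive root.
    p12 = (-4.0 + math.sqrt(16.0 + 32.0)) / 8.0
    # Third equation: 4 p22^2 - 2 p22 - (2 p12 + 5) = 0, positive root.
    p22 = (2.0 + math.sqrt(4.0 + 16.0 * (2.0 * p12 + 5.0))) / 8.0
    # Second equation, solved for p11.
    p11 = 2.0 * p22 - p12 - 3.0 + 4.0 * p12 * p22
    return p11, p12, p22


def is_positive_definite(p11, p12, p22):
    """Sylvester test for a symmetric 2 x 2 matrix."""
    return p11 > 0.0 and p11 * p22 - p12 * p12 > 0.0


def optimal_cost(p, x0):
    """J* = (1/2) x0' P x0, as in (3.5.17)."""
    p11, p12, p22 = p
    x1, x2 = x0
    return 0.5 * (p11 * x1 * x1 + 2.0 * p12 * x1 * x2 + p22 * x2 * x2)


def closed_loop_eigenvalues(p12, p22):
    """Eigenvalues of A - B R^-1 B' P for A = [0 1; -2 1], B = [0; 1], R = 1/4."""
    trace = 1.0 - 4.0 * p22
    det = 2.0 + 4.0 * p12
    root = math.sqrt(trace * trace - 4.0 * det)   # assumes real eigenvalues
    return (trace - root) / 2.0, (trace + root) / 2.0


p11, p12, p22 = solve_are_sequentially()
print("P =", [[p11, p12], [p12, p22]])
print("positive definite:", is_positive_definite(p11, p12, p22))
print("gains:", 4.0 * p12, 4.0 * p22)
print("J* =", optimal_cost((p11, p12, p22), (2.0, -3.0)))
print("closed-loop eigenvalues:", closed_loop_eigenvalues(p12, p22))
```

Choosing the negative root for p12 in the first equation would lead to a negative p11, so only the positive root gives a positive definite matrix here.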
Figure 3.9 Closed-Loop Optimal Control System (block diagram of the plant with the closed-loop optimal controller using the feedback gains 0.366 and 1.4729 and the factor 4)

The previous results can also be easily obtained using Control System Toolbox of the MATLAB©, Version 6 as shown below.

*********************************************
%% Solution using Control System Toolbox in
%% The MATLAB, Version 6
%% For Example 3.2
%%
x10=2; %% initial condition on state x1
x20=-3; %% initial condition on state x2
X0=[x10;x20];
A=[0 1;-2 1]; %% system matrix A
B=[0;1]; %% system matrix B
Q=[2 3;3 5]; %% performance index weighted matrix
R=[0.25]; %% performance index weighted matrix
[K,P,EV]=lqr(A,B,Q,R) %% K = feedback matrix;
%% P = Riccati matrix;
%% EV = eigenvalues of closed loop system A - B*K
K =
    1.4641    5.8916
P =
    1.7363    0.3660
    0.3660    1.4729
EV =
   -4.0326
   -0.8590
BIN=[0;0]; % dummy BIN for "initial" command
C=[1 1];
D=[1];
tfinal=10;
t=0:0.05:10;
[Y,X,t]=initial(A-B*K,BIN,C,D,X0,tfinal);
x1t=[1 0]*X'; %% extracting x1 from vector X
x2t=[0 1]*X'; %% extracting x2 from vector X
ut=-K*X';
plot(t,x1t,'k',t,x2t,'k')
xlabel('t')
gtext('x_1(t)')
gtext('x_2(t)')
plot(t,ut,'k')
xlabel('t')
gtext('u(t)')
**************************************************

Using the optimal control $u^*(t)$ given by (3.5.23), the plant equations (3.5.18) are solved using MATLAB© to obtain the optimal states $x_1^*(t)$ and $x_2^*(t)$ and the optimal control $u^*(t)$ as shown in Figure 3.10 and Figure 3.11. Note that

1. the values of $\bar{P}$ obtained in the example are exactly the steady-state values of Example 3.1, and
2. the original plant (3.5.18) is unstable (eigenvalues at $0.5 \pm j1.3229$) whereas the optimal closed-loop system (3.5.28) is stable (eigenvalues at $-4.0326$, $-0.8590$).

3.5.4 Stability Issues of Time-Invariant Regulator

Let us consider the previous result for a linear time-invariant system with infinite-time horizon from relations (3.5.12) to (3.5.17) and Table 3.3. We address briefly some stability remarks of the infinite-time regulator system [3, 89].
Figure 3.10 Optimal States for Example 3.2 (plot of the optimal states against time over the interval from 0 to 10)

1. The closed-loop optimal system (3.5.16) is not always stable, especially when the original plant is unstable and these unstable states are not weighted in the PI (3.5.13). In order to prevent such a situation, we need the assumption that the pair $[A, C]$ is detectable, where $C$ is any matrix such that $C'C = Q$, which guarantees the stability of the closed-loop optimal system. This assumption essentially ensures that all the potentially unstable states will show up in the $x'(t) Q x(t)$ part of the performance measure.

2. The Riccati coefficient matrix $\bar{P}$ is positive definite if and only if $[A, C]$ is completely observable.

3. The detectability condition is necessary for stability of the closed-loop optimal system.

4. Thus both detectability and stabilizability conditions are necessary for the existence of a stable closed-loop system.
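The first remark can be illustrated with a scalar plant. The Python sketch below is an assumption-laden illustration and not an example from the text: it uses the hypothetical unstable plant with a = 1, b = 1, r = 1, compares a performance index that does not weight the state (q = 0) with one that does (q = 1), and takes as the regulator's Riccati coefficient the smallest nonnegative root of the scalar ARE, which is the value the Riccati solution started from zero at the final time settles to.

```python
import math


def are_roots(a, b, q, r):
    """Both roots of the scalar ARE 2 a p + q - (b^2 / r) p^2 = 0."""
    e = b * b / r
    root = math.sqrt(a * a + e * q)
    return (a - root) / e, (a + root) / e


def regulator_coefficient(a, b, q, r, tol=1e-12):
    """Smallest nonnegative root (assumed limit of the DRE with P(tf) = 0)."""
    return min(p for p in are_roots(a, b, q, r) if p >= -tol)


def closed_loop_pole(a, b, r, p):
    """Pole of the closed-loop system a - (b^2 / r) p."""
    return a - (b * b / r) * p


for q in (0.0, 1.0):
    p_bar = regulator_coefficient(1.0, 1.0, q, 1.0)
    pole = closed_loop_pole(1.0, 1.0, 1.0, p_bar)
    print("q =", q, " p =", p_bar, " closed-loop pole =", pole,
          " stable:", pole < 0.0)
```

With q = 0 the unstable state never appears in the cost, the coefficient is zero, no control is applied, and the closed-loop pole stays at +1; with q = 1 the state is weighted and the closed-loop pole moves to -sqrt(2).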
Figure 3.11 Optimal Control for Example 3.2 (plot of the optimal control against time over the interval from 0 to 10)

3.5.5 Equivalence of Open-Loop and Closed-Loop Optimal Controls

Next, we present a simple example to show an interesting property that an optimal control system can be solved and implemented as an open-loop optimal control (OLOC) configuration or a closed-loop optimal control (CLOC) configuration. We will also demonstrate the simplicity of the CLOC.

Example 3.3

Consider a simple first order system

$$\dot{x}(t) = -3x(t) + u(t) \tag{3.5.30}$$

and the cost function (CF) as

$$J = \int_0^{\infty} \left[ x^2(t) + u^2(t) \right] dt \tag{3.5.31}$$

where, $x(0) = 1$ and the final state $x(\infty) = 0$. Find the open-loop and closed-loop optimal controllers.
Solution: (a) Open-Loop Optimal Control: We use the Pontryagin procedure given in Chapter 2 (see Table 2.1). First of all, comparing the given plant (3.5.30) and the CF (3.5.31) with the general formulations (see Table 2.1), identify that

$$V(x(t), u(t)) = x^2(t) + u^2(t), \quad f(x(t), u(t)) = -3x(t) + u(t). \tag{3.5.32}$$

Now, we use the step-by-step procedure given in Table 2.1.

• Step 1: Formulate the Hamiltonian as

$$\mathcal{H} = V(x(t), u(t)) + \lambda(t) f(x(t), u(t)) = x^2(t) + u^2(t) + \lambda(t) \left[ -3x(t) + u(t) \right]. \tag{3.5.33}$$

• Step 2: The optimal control $u^*(t)$ is obtained by minimizing the previous Hamiltonian w.r.t. $u$ as

$$\frac{\partial \mathcal{H}}{\partial u} = 0 \;\rightarrow\; 2u^*(t) + \lambda^*(t) = 0 \;\rightarrow\; u^*(t) = -\frac{1}{2} \lambda^*(t). \tag{3.5.34}$$

• Step 3: Using the optimal control (3.5.34) in the Hamiltonian function (3.5.33), find the optimal Hamiltonian function as

$$\mathcal{H}^* = x^{*2}(t) - \frac{1}{4} \lambda^{*2}(t) - 3\lambda^*(t) x^*(t). \tag{3.5.35}$$

• Step 4: Using the previous optimal $\mathcal{H}^*$, obtain the set of state and costate equations

$$\dot{x}^*(t) = \frac{\partial \mathcal{H}^*}{\partial \lambda} \;\rightarrow\; \dot{x}^*(t) = -\frac{1}{2} \lambda^*(t) - 3x^*(t), \tag{3.5.36}$$
$$\dot{\lambda}^*(t) = -\frac{\partial \mathcal{H}^*}{\partial x} \;\rightarrow\; \dot{\lambda}^*(t) = -2x^*(t) + 3\lambda^*(t), \tag{3.5.37}$$

yielding

$$\ddot{x}^*(t) - 10x^*(t) = 0, \tag{3.5.38}$$

the solution of which becomes

$$x^*(t) = C_1 e^{\sqrt{10}\,t} + C_2 e^{-\sqrt{10}\,t}. \tag{3.5.39}$$

Using the optimal state (3.5.39) in (3.5.36), we have the costate as

$$\lambda^*(t) = 2 \left[ -\dot{x}^*(t) - 3x^*(t) \right] = -2C_1 (\sqrt{10} + 3) e^{\sqrt{10}\,t} + 2C_2 (\sqrt{10} - 3) e^{-\sqrt{10}\,t}. \tag{3.5.40}$$
Using the initial condition $x(0) = 1$ in the optimal state (3.5.39), and the final condition (for $\delta x_f$ being free) $\lambda(t_f = \infty) = 0$ in the optimal costate (3.5.40), we get

$$x(0) = 1 \;\rightarrow\; C_1 + C_2 = 1, \qquad \lambda(\infty) = 0 \;\rightarrow\; C_1 = 0. \tag{3.5.41}$$

Then, the previous optimal state and costate are given as

$$x^*(t) = e^{-\sqrt{10}\,t}; \qquad \lambda^*(t) = 2(\sqrt{10} - 3) e^{-\sqrt{10}\,t}. \tag{3.5.42}$$

• Step 5: Using the previous costate solution (3.5.42) in the optimal control (3.5.34) of Step 2, we get the open-loop optimal control as

$$u^*(t) = -(\sqrt{10} - 3) e^{-\sqrt{10}\,t}. \tag{3.5.43}$$

(b) Closed-Loop Optimal Control: Here, we use the matrix algebraic Riccati equation (ARE) to find the closed-loop optimal control, as summarized in Table 3.3. First of all, comparing the present plant (3.5.30) and the PI (3.5.31) with the general formulation of the plant (3.5.12) and the PI (3.5.13), respectively, we identify the various coefficients and matrices as

$$A = a = -3; \quad B = b = 1; \quad Q = q = 2; \quad R = r = 2; \quad \bar{P} = \bar{p}. \tag{3.5.44}$$

Note the PI (3.5.31) does not contain the factor 1/2 as in the general PI (3.5.13) and accordingly, we have $Q = q = 2$ and $R = r = 2$.

• Step 1: With the previous values, the ARE (3.5.15) becomes

$$\bar{p}(-3) + (-3)\bar{p} - \bar{p}(1)\left(\frac{1}{2}\right)(1)\bar{p} + 2 = 0 \;\rightarrow\; \bar{p}^2 + 12\bar{p} - 4 = 0, \tag{3.5.45}$$

the solution of which is

$$\bar{p} = -6 \pm 2\sqrt{10}. \tag{3.5.46}$$

• Step 2: Using the positive value of the Riccati coefficient (3.5.46), the closed-loop optimal control (3.5.14) becomes

$$u^*(t) = -r^{-1} b \bar{p} x^*(t) = -\frac{1}{2} (-6 + 2\sqrt{10}) x^*(t) = -(-3 + \sqrt{10}) x^*(t). \tag{3.5.47}$$
• Step 3: Using the optimal control (3.5.47), the optimal state is solved from (3.5.16) as

$$\dot{x}^*(t) = -3x^*(t) - (-3 + \sqrt{10}) x^*(t) = -\sqrt{10}\, x^*(t). \tag{3.5.48}$$

Solving the previous equation along with the initial condition $x(0) = 1$, we get the optimal state as

$$x^*(t) = e^{-\sqrt{10}\,t} \tag{3.5.49}$$

with which the optimal control (3.5.47) becomes

$$u^*(t) = -(\sqrt{10} - 3) e^{-\sqrt{10}\,t}. \tag{3.5.50}$$

Thus, we note that the optimal control (3.5.50) and optimal state (3.5.49) obtained from using the closed-loop optimal control are identical to those of (3.5.43) and (3.5.42), respectively. We can easily extend this analysis for the general case. Intuitively, this equivalence should exist as the optimal control, being unique, should be the same by any method.

The implementation of this open-loop optimal controller (OLOC) is shown in Figure 3.12(a), and that of the closed-loop optimal controller (CLOC) is shown in Figure 3.12(b). From the previous example, it is clear that

1. from the implementation point of view, the closed-loop optimal controller $(\sqrt{10} - 3)$ is much simpler than the open-loop optimal controller $((\sqrt{10} - 3) e^{-\sqrt{10}\,t})$, which is an exponential time function, and
2. with a closed-loop configuration, all the advantages of conventional feedback are incorporated.

3.6 Notes and Discussion

We know that linear quadratic optimal control is concerned with linear plants, performance measures quadratic in controls and states, and regulation and tracking errors. In particular, the resulting optimal controller is closed-loop and linear in state. Note that linear quadratic optimal control is a special class of the general optimal control which includes nonlinear systems and nonlinear performance measures. There are many useful advantages and attractive features of linear quadratic optimal control systems which are enumerated below [3].
Figure 3.12 (a) Open-Loop Optimal Controller (OLOC) and (b) Closed-Loop Optimal Controller (CLOC)

1. Many engineering and physical systems operate in linear range during normal operations.

2. There is a wealth of theoretical results available for linear systems which can be useful for linear quadratic methods.

3. The resulting optimal controller is linear in state and thus easy and simple for implementation purposes in real application of the LQ results.

4. Many (nonlinear) optimal control systems do not have solutions which can be easily computed. On the other hand, LQ optimal control systems have easily computable solutions.

5. As is well known, nonlinear systems can be examined for small variations from their normal operating conditions. For example,
assume that after a heavy computational effort, we obtained an optimal solution for a nonlinear plant and there is a small change from an operating point. Then, one can easily use, to a first approximation, a linear model and obtain linear optimal control to drive the original nonlinear system to its operating point.

6. Many of the concepts, techniques and computational procedures that are developed for linear optimal control systems in many cases may be carried on to nonlinear optimal control systems.

7. Linear optimal control designs for plants whose states are measurable possess a number of desirable robustness properties (such as good gain and phase margins and a good tolerance to nonlinearities) of classical control designs.
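As a numerical companion to Example 3.3 and to the remark that the LQ controller is linear in the state, the equivalence of the open-loop and closed-loop forms can be checked by simulation. The Python sketch below is an illustration under assumptions made here: a fixed-step fourth-order Runge-Kutta integrator, a horizon of 2 time units, and hypothetical function names. It drives the plant (3.5.30) once with the open-loop control (3.5.43) and once with the state feedback (3.5.47).

```python
import math

SQRT10 = math.sqrt(10.0)
GAIN = SQRT10 - 3.0   # feedback gain magnitude in (3.5.47)


def u_open_loop(t):
    """Open-loop control (3.5.43): a pre-computed function of time."""
    return -GAIN * math.exp(-SQRT10 * t)


def u_closed_loop(x):
    """Closed-loop control (3.5.47): a constant gain acting on the state."""
    return -GAIN * x


def rk4(f, x0, t_end, steps):
    """Integrate dx/dt = f(t, x) from t = 0 with classical RK4; return all x."""
    h = t_end / steps
    t, x = 0.0, x0
    xs = [x0]
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + 0.5 * h, x + 0.5 * h * k1)
        k3 = f(t + 0.5 * h, x + 0.5 * h * k2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
        xs.append(x)
    return xs


x_ol = rk4(lambda t, x: -3.0 * x + u_open_loop(t), 1.0, 2.0, 2000)
x_cl = rk4(lambda t, x: -3.0 * x + u_closed_loop(x), 1.0, 2.0, 2000)
largest_gap = max(abs(p - q) for p, q in zip(x_ol, x_cl))
print("largest gap between the two state trajectories:", largest_gap)
print("x(2) closed loop:", x_cl[-1], " exact:", math.exp(-2.0 * SQRT10))
```

Both runs should reproduce the state (3.5.49) up to integration error when started from x(0) = 1; only the closed-loop form needs nothing beyond the constant gain.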
3.7 Problems

1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and plot all the optimal controls and states for all problems. Provide the relevant MATLAB© m files.

Problem 3.1 A first order system is given by

$$\dot{x}(t) = x(t) + u(t).$$

(a) Find the unconstrained optimal control law which minimizes the performance index such that the final time $t_f$ is fixed and the final state $x(t_f)$ is free.
(b) Find the optimal control law as $t_f \to \infty$.

Problem 3.2 A system is described by

$$\ddot{x}(t) + x(t) = u(t)$$

with initial conditions $x(0) = 0$ and $\dot{x}(0) = 1$ and the performance index

Find the closed-loop optimal control in terms of $x$ and $\dot{x}$ and the optimal cost function.

Problem 3.3 A first order system is described by

$$\dot{x}(t) = a x(t) + b u(t)$$

with performance index
and with a fixed initial state $x(0)$ and final state $x(t_f) = 0$, where $t_f$ is fixed. Show that the solution of the Riccati equation is given by

$$p(t) = \frac{r}{b^2} \left[ a - \beta \coth\{ \beta (t - t_f) \} \right]$$

and the solution of the optimal state $x^*(t)$ is given by

$$x^*(t) = x(0) \frac{\sinh\{ \beta (t_f - t) \}}{\sinh\{ \beta t_f \}}$$

where, $\beta = \sqrt{a^2 + b^2 q / r}$.

Problem 3.4 Find the optimal feedback control for the plant

$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = 2x_1(t) + 4x_2(t) + 5u(t)$$

with performance criterion

$$J = \frac{1}{2} f_{11} \left[ x_1(t_f) - 4 \right]^2 + \frac{1}{2} f_{22} \left[ x_2(t_f) - 2 \right]^2 + \frac{1}{2} \int_0^{t_f} \left[ 5x_1^2(t) + 2x_2^2(t) + 4u^2(t) \right] dt$$

and initial conditions as $x(0) = [1 \;\; 2]'$ and the final state $x(t_f)$ is free, where $t_f$ is specified.

Problem 3.5 Find the closed-loop, unconstrained, optimal control for the system

$$\dot{x}_1(t) = x_2(t)$$
$$\dot{x}_2(t) = 2x_1(t) - 3x_2(t) + u(t)$$

and the performance index

$$J = \int_0^{\infty} \left[ x_1^2(t) + x_2^2(t) + u^2(t) \right] dt.$$

Problem 3.6 Find the optimal feedback control law for the plant

$$\dot{x}_1(t) = x_2(t) + u(t)$$
$$\dot{x}_2(t) = x_1(t) - x_2(t) + u(t)$$

and the cost function

$$J = \int_0^{\infty} \left[ 2x_1^2(t) + 4x_2^2(t) + 0.5u^2(t) \right] dt.$$
Problem 3.7 Consider a second order system
\[ \ddot{x}(t) + b\dot{x}(t) + cx(t) = u(t) \]
and the performance index to be minimized as
Determine the closed-loop optimal control in terms of the state $x(t)$ and its derivative $\dot{x}(t)$.

Problem 3.8 Given a third order plant,
\[ \dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = x_3(t), \qquad \dot{x}_3(t) = 5x_1(t) + 7x_2(t) - 10x_3(t) + 4u(t) \]
and the performance index for
1. $q_{11} = q_{22} = q_{33} = 1$, $r = 1$,
2. $q_{11} = 10$, $q_{22} = 1$, $q_{33} = 1$, $r = 1$, and
3. $q_{11} = q_{22} = q_{33} = 1$, $r = 10$,
find the positive definite solution for Riccati coefficient matrix $\bar{P}$, optimal feedback gain matrix $K$ and the eigenvalues of the closed-loop system matrix $A - BK$.

Problem 3.9 Determine the optimal feedback coefficients and the optimal control law for the multi-input, multi-output (MIMO) system
\[ \dot{x}(t) = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} x(t) + \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} u(t) \]
and the cost function
\[ J = \int_0^{\infty}\left[2x_1^2(t) + 4x_2^2(t) + 0.5u_1^2(t) + 0.25u_2^2(t)\right]dt. \]
Problem 3.10 For the D.C. motor speed control system described in Problem 1.1, find the closed-loop optimal control to keep the speed constant at a particular value and the system to respond for any disturbances from the regulated value.

Problem 3.11 For the liquid-level control system described in Problem 1.2, find the closed-loop optimal control to keep the liquid level constant at a reference value and the system to act only if there is a change in the liquid level.

Problem 3.12 [35] For the inverted pendulum control system described in Problem 1.3, find the closed-loop optimal control to keep the pendulum in a vertical position.

Problem 3.13 For the mechanical control system described in Problem 1.4, find the closed-loop optimal control to keep the system at equilibrium condition and act only if there is a disturbance.

Problem 3.14 For the automobile suspension system described in Problem 1.5, find the closed-loop optimal control to keep the system at equilibrium condition and act only if there is a disturbance.

Problem 3.15 For the chemical control system described in Problem 1.6, find the closed-loop optimal control to keep the system at equilibrium condition and act only if there is a disturbance.
Chapter 4

Linear Quadratic Optimal Control Systems II

In the previous chapter, we addressed the linear quadratic regulator system, where the aim was to obtain the optimal control to regulate (or keep) the state around zero. In this chapter, we discuss linear quadratic tracking (LQT) systems, and some related topics in linear quadratic regulator theory. It is suggested that the student reviews the material in Appendices A and B given at the end of the book. This chapter is based on [6, 89, 3] (1).

Trajectory Following Systems
In tracking (trajectory following) systems, we require that the output of a system track or follow a desired trajectory in some optimal sense. Thus, we see that this is a generalization of regulator system in the sense that the desired trajectory for the regulator is simply the zero state.

(1) The permissions given by John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons, Inc., New York, NY, 1986 and McGraw-Hill for M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and Its Applications, McGraw-Hill Book Company, New York, NY, 1966, are hereby acknowledged.
4.1 Linear Quadratic Tracking System: Finite-Time Case

In this section, we discuss the linear quadratic tracking (LQT) system to maintain the output as close as possible to the desired output with minimum control energy [6]. We are given a linear, observable system
\[ \dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t) \tag{4.1.1} \]
where $x(t)$ is the $n$th order state vector, $u(t)$ is the $r$th order control vector, and $y(t)$ is the $m$th order output vector. Let $z(t)$ be the $m$th order desired output and the various matrices $A(t)$, $B(t)$ and $C(t)$ be of appropriate dimensionality. Our objective is to control the system (4.1.1) in such a way that the output $y(t)$ tracks the desired output $z(t)$ as close as possible during the interval $[t_0, t_f]$ with minimum expenditure of control effort. For this, let us define the error vector as
\[ e(t) = z(t) - y(t) \tag{4.1.2} \]
and choose the performance index as
\[ J = \frac{1}{2}e'(t_f)F(t_f)e(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[e'(t)Q(t)e(t) + u'(t)R(t)u(t)\right]dt \tag{4.1.3} \]
with $t_f$ specified and $x(t_f)$ not specified. In this way we are dealing with a free-final state system. Also, we assume that $F(t_f)$ and $Q(t)$ are $m \times m$ symmetric, positive semidefinite matrices, and $R(t)$ is an $r \times r$ symmetric, positive definite matrix. We now use the Pontryagin Minimum Principle in the following order.

• Step 1: Hamiltonian
• Step 2: Open-Loop Optimal Control
• Step 3: State and Costate System
• Step 4: Riccati and Vector Equations
• Step 5: Closed-Loop Optimal Control
• Step 6: Optimal State
• Step 7: Optimal Cost

Now we discuss these steps in detail. Also, note that we heavily draw upon the results of the previous Chapters 2 and 3. First of all, let us note from (4.1.1) and (4.1.2) that the error $e(t)$ can be expressed as a function of $z(t)$ and $x(t)$ as
\[ e(t) = z(t) - C(t)x(t). \tag{4.1.4} \]

• Step 1: Hamiltonian: Formulate the Hamiltonian as (see Table 3.1)
\[ \mathcal{H}(x(t), u(t), \lambda(t)) = \frac{1}{2}\left[z(t) - C(t)x(t)\right]'Q(t)\left[z(t) - C(t)x(t)\right] + \frac{1}{2}u'(t)R(t)u(t) + \lambda'(t)\left[A(t)x(t) + B(t)u(t)\right]. \tag{4.1.5} \]

• Step 2: Open-Loop Optimal Control: Using the Hamiltonian (4.1.5), obtain the control equation from
\[ \frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; R(t)u(t) + B'(t)\lambda(t) = 0 \tag{4.1.6} \]
from which we have the optimal control as
\[ u^*(t) = -R^{-1}(t)B'(t)\lambda^*(t). \tag{4.1.7} \]
Since the second partial of $\mathcal{H}$ in (4.1.5) w.r.t. $u^*(t)$ is just $R(t)$, and we chose $R(t)$ to be positive definite, we are dealing with a control which minimizes the cost functional (4.1.3).

• Step 3: State and Costate System: The state is given in terms of the Hamiltonian (4.1.5) as
\[ \dot{x}(t) = \frac{\partial \mathcal{H}}{\partial \lambda} = A(t)x(t) + B(t)u(t) \tag{4.1.8} \]
and with the optimal control (4.1.7), the optimal state equation (4.1.8) becomes
\[ \dot{x}^*(t) = A(t)x^*(t) - B(t)R^{-1}(t)B'(t)\lambda^*(t). \tag{4.1.9} \]
Using the Hamiltonian (4.1.5), the optimal costate equation becomes
\[ \dot{\lambda}^*(t) = -\frac{\partial \mathcal{H}}{\partial x} = -C'(t)Q(t)C(t)x^*(t) - A'(t)\lambda^*(t) + C'(t)Q(t)z(t). \tag{4.1.10} \]
For the sake of simplicity, let us define
\[ E(t) = B(t)R^{-1}(t)B'(t), \qquad V(t) = C'(t)Q(t)C(t), \qquad W(t) = C'(t)Q(t). \tag{4.1.11} \]
Using the relation (4.1.11) and combining the state (4.1.8) and costate (4.1.10) equations, we obtain the Hamiltonian canonical system as
\[ \begin{bmatrix} \dot{x}^*(t) \\ \dot{\lambda}^*(t) \end{bmatrix} = \begin{bmatrix} A(t) & -E(t) \\ -V(t) & -A'(t) \end{bmatrix} \begin{bmatrix} x^*(t) \\ \lambda^*(t) \end{bmatrix} + \begin{bmatrix} 0 \\ W(t) \end{bmatrix} z(t). \tag{4.1.12} \]
This canonical system of $2n$ differential equations is linear, time-varying, but nonhomogeneous with $W(t)z(t)$ as forcing function. The boundary conditions for this state and costate equations are given by the initial condition on the state as
\[ x(t = t_0) = x(t_0) \tag{4.1.13} \]
and the final condition on the costate (for the final time $t_f$ specified and state $x(t_f)$ being free) given by (3.2.10), which along with (4.1.4) become
\[ \lambda(t_f) = \frac{\partial}{\partial x(t_f)}\left[\frac{1}{2}e'(t_f)F(t_f)e(t_f)\right] = \frac{\partial}{\partial x(t_f)}\left[\frac{1}{2}\left[z(t_f) - C(t_f)x(t_f)\right]'F(t_f)\left[z(t_f) - C(t_f)x(t_f)\right]\right] = C'(t_f)F(t_f)C(t_f)x(t_f) - C'(t_f)F(t_f)z(t_f). \tag{4.1.14} \]

• Step 4: Riccati and Vector Equations: The boundary condition (4.1.14) and the solution of the system (4.1.12) indicate that the state and costate are linearly related as
\[ \lambda^*(t) = P(t)x^*(t) - g(t) \tag{4.1.15} \]
where the $n \times n$ matrix $P(t)$ and $n$ vector $g(t)$ are yet to be determined so as to satisfy the canonical system (4.1.12). This is done by substituting the linear (Riccati) transformation (4.1.15) in the Hamiltonian system (4.1.12) and eliminating the costate function $\lambda^*(t)$. Thus, we first differentiate (4.1.15) to get
\[ \dot{\lambda}^*(t) = \dot{P}(t)x^*(t) + P(t)\dot{x}^*(t) - \dot{g}(t). \tag{4.1.16} \]
Now, substituting for $\dot{x}^*(t)$ and $\dot{\lambda}^*(t)$ from (4.1.12) and eliminating $\lambda^*(t)$ with (4.1.15), we get
\[ -V(t)x^*(t) - A'(t)\left[P(t)x^*(t) - g(t)\right] + W(t)z(t) = \dot{P}(t)x^*(t) + P(t)\left[A(t)x^*(t) - E(t)\{P(t)x^*(t) - g(t)\}\right] - \dot{g}(t). \tag{4.1.17} \]
Rearranging the above, we get
\[ \left[\dot{P}(t) + P(t)A(t) + A'(t)P(t) - P(t)E(t)P(t) + V(t)\right]x^*(t) - \left[\dot{g}(t) + A'(t)g(t) - P(t)E(t)g(t) + W(t)z(t)\right] = 0. \tag{4.1.18} \]
Now, this relation (4.1.18) must be satisfied for all $x^*(t)$, $z(t)$ and $t$, which leads us to the $n \times n$ matrix $P(t)$ to satisfy the matrix differential Riccati equation (DRE)
\[ \dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)E(t)P(t) - V(t) \tag{4.1.19} \]
and the $n$ vector $g(t)$ to satisfy the vector differential equation
\[ \dot{g}(t) = \left[P(t)E(t) - A'(t)\right]g(t) - W(t)z(t). \tag{4.1.20} \]
Since $P(t)$ is an $n \times n$ symmetric matrix, and $g(t)$ is an $n$ vector, the equations (4.1.19) and (4.1.20) are a set of $n(n+1)/2 + n$ first-order differential equations. The boundary conditions are obtained from (4.1.15) as
\[ \lambda(t_f) = P(t_f)x(t_f) - g(t_f) \tag{4.1.21} \]
which compared with the boundary condition (4.1.14) gives us for all $x(t_f)$ and $z(t_f)$,
\[ P(t_f) = C'(t_f)F(t_f)C(t_f), \tag{4.1.22} \]
\[ g(t_f) = C'(t_f)F(t_f)z(t_f). \tag{4.1.23} \]
Thus, the matrix DRE (4.1.19) and the vector equation (4.1.20) are to be solved backward using the boundary conditions (4.1.22) and (4.1.23).

• Step 5: Closed-Loop Optimal Control: The optimal control (4.1.7) is now given in terms of the state using the linear transformation (4.1.15)
\[ u^*(t) = -R^{-1}(t)B'(t)\left[P(t)x^*(t) - g(t)\right] = -K(t)x^*(t) + R^{-1}(t)B'(t)g(t) \tag{4.1.24} \]
where $K(t) = R^{-1}(t)B'(t)P(t)$ is the Kalman gain.

• Step 6: Optimal State: Using this optimal control $u^*(t)$ from (4.1.24) in the original plant (4.1.1), we have the optimal state obtained from
\[ \dot{x}^*(t) = \left[A(t) - B(t)R^{-1}(t)B'(t)P(t)\right]x^*(t) + B(t)R^{-1}(t)B'(t)g(t) = \left[A(t) - E(t)P(t)\right]x^*(t) + E(t)g(t). \tag{4.1.25} \]

• Step 7: Optimal Cost: The optimal cost $J^*(t)$ for any time $t$ can be obtained as (see [6])
\[ J^*(t) = \frac{1}{2}x^{*\prime}(t)P(t)x^*(t) - x^{*\prime}(t)g(t) + h(t) \tag{4.1.26} \]
where the new function $h(t)$ satisfies [3, 6]
\[ \dot{h}(t) = \frac{1}{2}g'(t)B(t)R^{-1}(t)B'(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t) = \frac{1}{2}g'(t)E(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t) \tag{4.1.27} \]
with final condition
\[ h(t_f) = \frac{1}{2}z'(t_f)F(t_f)z(t_f). \tag{4.1.28} \]
The sign of the $g'Eg$ term and the final condition follow from collecting the terms free of $x^*(t)$ when (4.1.26) is matched against the performance index (4.1.3). For further details on this, see [3, 6, 89, 90]. We now summarize the tracking system.
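To make Steps 4 to 6 concrete, it may help to write them out for a scalar plant. The specialization below is only an illustrative sketch: the constants $a$, $b$, $c$, $q$, $r$ and $f$ are assumed scalar, time-invariant counterparts of $A$, $B$, $C$, $Q$, $R$ and $F$, introduced here for illustration and not part of the original development.

```latex
\begin{align*}
\dot{x}(t) &= ax(t) + bu(t), \qquad y(t) = cx(t), \qquad
  E = \frac{b^2}{r}, \quad V = c^2 q, \quad W = cq,\\
\dot{p}(t) &= -2ap(t) + \frac{b^2}{r}p^2(t) - c^2 q, & p(t_f) &= c^2 f,\\
\dot{g}(t) &= \left[\frac{b^2}{r}p(t) - a\right]g(t) - cq\,z(t), & g(t_f) &= cf\,z(t_f),\\
u^*(t) &= -\frac{b}{r}\left[p(t)x^*(t) - g(t)\right],\\
\dot{x}^*(t) &= \left[a - \frac{b^2}{r}p(t)\right]x^*(t) + \frac{b^2}{r}g(t).
\end{align*}
```

Each line is the corresponding matrix relation, (4.1.19), (4.1.20), (4.1.24) and (4.1.25), with every matrix product replaced by a scalar product; note that $p(t)$ does not depend on $z(t)$, while $g(t)$ is driven by it.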
Figure 4.1 Implementation of the Optimal Tracking System (block diagram: the plant with $A(t)$, $C(t)$, state $x^*(t)$ and output $y^*(t)$; the closed-loop optimal controller forming $-P(t)x^*(t) + g(t)$; and the off-line simulation of $P(t)$ and $g(t)$ driven by the desired $z(t)$)

4.1.1 Linear Quadratic Tracking System: Summary

Given the linear, observable system (see Figure 4.1)
\[ \dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t), \tag{4.1.29} \]
the desired output $z(t)$, the error $e(t) = z(t) - y(t)$, and the performance index
\[ J = \frac{1}{2}e'(t_f)F(t_f)e(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[e'(t)Q(t)e(t) + u'(t)R(t)u(t)\right]dt \tag{4.1.30} \]
then the optimal control $u^*(t)$ is given by
\[ u^*(t) = -R^{-1}(t)B'(t)\left[P(t)x^*(t) - g(t)\right] = -K(t)x^*(t) + R^{-1}(t)B'(t)g(t) \tag{4.1.31} \]
where the $n \times n$ symmetric, positive definite matrix $P(t)$ is the solution of the nonlinear, matrix differential Riccati equation (DRE)
\[ \dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)E(t)P(t) - V(t) \tag{4.1.32} \]
with final condition
\[ P(t_f) = C'(t_f)F(t_f)C(t_f) \tag{4.1.33} \]
and the $n$th order $g(t)$ is the solution of the linear, nonhomogeneous vector differential equation
\[ \dot{g}(t) = -\left[A(t) - E(t)P(t)\right]'g(t) - W(t)z(t) \tag{4.1.34} \]
with final condition
\[ g(t_f) = C'(t_f)F(t_f)z(t_f) \tag{4.1.35} \]
where $E(t)$, $V(t)$ and $W(t)$ are defined in (4.1.11), the optimal state (trajectory) is the solution of the linear state equation
\[ \dot{x}^*(t) = \left[A(t) - E(t)P(t)\right]x^*(t) + E(t)g(t), \tag{4.1.36} \]
and the optimal cost $J^*$ is
\[ J^*(t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) - x^{*\prime}(t_0)g(t_0) + h(t_0). \tag{4.1.37} \]
The implementation of the tracking system is shown in Figure 4.1. The entire procedure is now summarized in Table 4.1.

4.1.2 Salient Features of Tracking System

1. Riccati Coefficient Matrix P(t): We note that the desired output $z(t)$ has no influence on the matrix differential Riccati equation (4.1.32) and its boundary condition (4.1.33). This means that once the problem is specified in terms of the final time $t_f$, the plant matrices $A(t)$, $B(t)$, and $C(t)$, and the cost functional matrices $F(t_f)$, $Q(t)$, and $R(t)$, the matrix function $P(t)$ is completely determined.
Table 4.1 Procedure Summary of Linear Quadratic Tracking System

A. Statement of the Problem
Given the plant as
\[ \dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t), \qquad e(t) = z(t) - y(t), \]
the performance index as
\[ J = \frac{1}{2}e'(t_f)F(t_f)e(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[e'(t)Q(t)e(t) + u'(t)R(t)u(t)\right]dt, \]
and the boundary conditions as $x(t_0) = x_0$, $x(t_f)$ is free, find the optimal control, state and performance index.

B. Solution of the Problem
Step 1 Solve the matrix differential Riccati equation
\[ \dot{P}(t) = -P(t)A(t) - A'(t)P(t) + P(t)E(t)P(t) - V(t), \]
with final condition $P(t_f) = C'(t_f)F(t_f)C(t_f)$, and the nonhomogeneous vector differential equation
\[ \dot{g}(t) = -\left[A(t) - E(t)P(t)\right]'g(t) - W(t)z(t), \]
with final condition $g(t_f) = C'(t_f)F(t_f)z(t_f)$, where $E(t) = B(t)R^{-1}(t)B'(t)$, $V(t) = C'(t)Q(t)C(t)$, $W(t) = C'(t)Q(t)$.
Step 2 Solve the optimal state $x^*(t)$ from
\[ \dot{x}^*(t) = \left[A(t) - E(t)P(t)\right]x^*(t) + E(t)g(t) \]
with initial condition $x(t_0) = x_0$.
Step 3 Obtain optimal control $u^*(t)$ from
\[ u^*(t) = -K(t)x^*(t) + R^{-1}(t)B'(t)g(t), \]
where $K(t) = R^{-1}(t)B'(t)P(t)$.
Step 4 The optimal cost $J^*(t_0)$ is
\[ J^*(t_0) = \frac{1}{2}x^{*\prime}(t_0)P(t_0)x^*(t_0) - x^{*\prime}(t_0)g(t_0) + h(t_0) \]
where $h(t)$ is the solution of
\[ \dot{h}(t) = \frac{1}{2}g'(t)E(t)g(t) - \frac{1}{2}z'(t)Q(t)z(t) \]
with final condition $h(t_f) = \frac{1}{2}z'(t_f)F(t_f)z(t_f)$.
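The procedure of Table 4.1 can be sketched numerically for a scalar plant. The following Python sketch is illustrative only and is not the book's MATLAB code: the function name `solve_scalar_lqt`, the constant scalar parameters `a, b, c, q, r, f`, the fixed-step classical Runge-Kutta integration for the backward sweep and the forward Euler integration for the state are all assumptions made here for the sake of a short, dependency-free example.

```python
import math


def solve_scalar_lqt(a, b, c, q, r, f, z, x0, tf, n=5000):
    """Scalar LQT: backward sweep for p(t), g(t); forward sweep for x*(t), u*(t).

    z is a function of time (the desired output); the other arguments are
    constant scalars. Returns lists t, p, g, x, u on a grid of n + 1 points.
    """
    h = tf / n
    e, v, w = b * b / r, c * c * q, c * q   # scalar E, V, W of (4.1.11)

    def rhs(t, p, g):
        dp = -2.0 * a * p + e * p * p - v   # scalar Riccati equation
        dg = (e * p - a) * g - w * z(t)     # scalar vector equation
        return dp, dg

    t = [i * h for i in range(n + 1)]
    p = [0.0] * (n + 1)
    g = [0.0] * (n + 1)
    p[n], g[n] = c * c * f, c * f * z(tf)   # final conditions
    # Step 1: integrate backward from tf with classical RK4 (step -h).
    for i in range(n, 0, -1):
        ti, pi, gi = t[i], p[i], g[i]
        k1 = rhs(ti, pi, gi)
        k2 = rhs(ti - h / 2, pi - h / 2 * k1[0], gi - h / 2 * k1[1])
        k3 = rhs(ti - h / 2, pi - h / 2 * k2[0], gi - h / 2 * k2[1])
        k4 = rhs(ti - h, pi - h * k3[0], gi - h * k3[1])
        p[i - 1] = pi - h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        g[i - 1] = gi - h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    # Steps 2 and 3: forward Euler for the state, feedback law for the control.
    x = [x0] * (n + 1)
    u = [0.0] * (n + 1)
    for i in range(n + 1):
        u[i] = -(b / r) * (p[i] * x[i] - g[i])
        if i < n:
            x[i + 1] = x[i] + h * (a * x[i] + b * u[i])
    return t, p, g, x, u


if __name__ == "__main__":
    t, p, g, x, u = solve_scalar_lqt(a=-1.0, b=1.0, c=1.0, q=10.0, r=0.1,
                                     f=0.0, z=lambda s: 1.0, x0=0.0, tf=5.0)
    beta = math.sqrt(1.0 + 10.0 / 0.1)
    print("p(0) =", p[0], " steady-state formula:", 0.1 * (-1.0 + beta))
    print("x at mid-horizon =", x[len(x) // 2])
```

With the illustrative data above, the state settles slightly below the reference $z = 1$ in the middle of the horizon, which reflects the trade-off between tracking error and control effort in the performance index. A smaller `r` brings the state closer to the reference at the price of a larger control.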
2. Closed Loop Eigenvalues: From the optimal state relation (4.1.36), we see the closed-loop system matrix $[A(t) - B(t)R^{-1}(t)B'(t)P(t)]$ is again independent of the desired output $z(t)$. This means the eigenvalues of the closed-loop, optimal tracking system are independent of the desired output $z(t)$.

3. Tracking and Regulator Systems: The main difference between the optimal output tracking system and the optimal state regulator system is in the vector $g(t)$. As shown in Figure 4.1, one can think of the desired output $z(t)$ as the forcing function of the closed-loop optimal system which generates the signal $g(t)$.

4. Also, note that if we make $C(t) = I(t)$, then in (4.1.11), $V(t) = Q(t)$. Thus, the matrix DRE (4.1.19) becomes the same matrix DRE (3.2.34) that was obtained in LQR system in Chapter 3.

Let us consider a second order example to illustrate the linear quadratic tracking system.

Example 4.1
A second order plant
\[ \dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = -2x_1(t) - 3x_2(t) + u(t), \qquad y(t) = x(t) \tag{4.1.38} \]
is to be controlled to minimize the performance index
\[ J = \left[1 - x_1(t_f)\right]^2 + \int_{t_0}^{t_f}\left\{\left[1 - x_1(t)\right]^2 + 0.002u^2(t)\right\}dt. \tag{4.1.39} \]
The final time $t_f$ is specified at 20, the final state $x(t_f)$ is free and the admissible controls and states are unbounded. It is required to keep the state $x_1(t)$ close to 1. Obtain the feedback control law and plot all the time histories of Riccati coefficients, $g$ vector components, optimal states and control.

Solution: The performance index indicates that the state $x_1(t)$ is to be kept close to the reference input $z_1(t) = 1$ and since there is no condition on state $x_2(t)$, one can choose arbitrarily as $z_2(t) = 0$. Now, in our present case, with $e(t) = z(t) - Cx(t)$, we have $e_1(t) = z_1(t) - x_1(t)$, $e_2(t) = z_2(t) - x_2(t)$ and $C = I$.
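Next, let us identify the various matrices in the present tracking system by comparing the state (4.1.38) and the PI (4.1.39) of the present system (note the absence of the factor 1/2 in PI) with the corresponding state (4.1.29) and the PI (4.1.30), respectively, of the general formulation of the problem. We get
\[ A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = I, \quad z(t) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} = F(t_f), \quad R = r = 0.004. \tag{4.1.40} \]
Let $P(t)$ be the $2 \times 2$ symmetric matrix and $g(t)$ be the $2 \times 1$ vector as
\[ P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}, \qquad g(t) = \begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix}. \tag{4.1.41} \]
Then, the optimal control given by (4.1.31) becomes
\[ u^*(t) = -250\left[p_{12}(t)x_1^*(t) + p_{22}(t)x_2^*(t) - g_2(t)\right] \tag{4.1.42} \]
where $P(t)$, the positive definite matrix, is the solution of the matrix differential Riccati equation (4.1.32)
\[ \dot{P}(t) = -P(t)\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & -3 \end{bmatrix}P(t) + P(t)\begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.004}\begin{bmatrix} 0 & 1 \end{bmatrix}P(t) - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.43} \]
and $g(t)$ is the solution of the nonhomogeneous vector differential equation obtained from (4.1.34) as
\[ \begin{bmatrix} \dot{g}_1(t) \\ \dot{g}_2(t) \end{bmatrix} = -\left\{\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.004}\begin{bmatrix} 0 & 1 \end{bmatrix}P(t)\right\}'\begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix}. \tag{4.1.44} \]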
Simplifying the equations (4.1.43) and (4.1.44), we get
\[ \begin{aligned} \dot{p}_{11}(t) &= 250p_{12}^2(t) + 4p_{12}(t) - 2 \\ \dot{p}_{12}(t) &= 250p_{12}(t)p_{22}(t) - p_{11}(t) + 3p_{12}(t) + 2p_{22}(t) \\ \dot{p}_{22}(t) &= 250p_{22}^2(t) - 2p_{12}(t) + 6p_{22}(t) \end{aligned} \tag{4.1.45} \]
with final condition (4.1.33) as
\[ \begin{bmatrix} p_{11}(t_f) & p_{12}(t_f) \\ p_{12}(t_f) & p_{22}(t_f) \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.46} \]
and
\[ \begin{aligned} \dot{g}_1(t) &= \left[250p_{12}(t) + 2\right]g_2(t) - 2 \\ \dot{g}_2(t) &= -g_1(t) + \left[3 + 250p_{22}(t)\right]g_2(t) \end{aligned} \tag{4.1.47} \]
with final condition (4.1.35) as
\[ \begin{bmatrix} g_1(t_f) \\ g_2(t_f) \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}. \tag{4.1.48} \]
Note: One has to try various values of the matrix $R$ in order to get a better tracking of the states. Solutions for the functions $p_{11}(t)$, $p_{12}(t)$, and $p_{22}(t)$ (Figure 4.2), functions $g_1(t)$ and $g_2(t)$ (Figure 4.3), optimal states (Figure 4.4) and control (Figure 4.5) for initial conditions $x(0) = [-0.5 \;\; 0]'$ and the final time $t_f = 20$ are obtained using MATLAB© routines given in Appendix C under continuous-time tracking system.

Example 4.2
Consider the same Example 4.1 with a different PI as
\[ J = \int_{t_0}^{t_f}\left\{\left[2t - x_1(t)\right]^2 + 0.02u^2(t)\right\}dt \tag{4.1.49} \]
where $t_f$ is specified and $x(t_f)$ is free. Find the optimal control in order that the state $x_1(t)$ track a ramp function $z_1(t) = 2t$ and without much expenditure of control energy. Plot all the variables (Riccati coefficients, optimal states and control) for initial conditions $x(0) = [1 \;\; 0]'$.

Solution: The performance index (4.1.49) indicates that the state $x_1(t)$ is to be kept close to the reference input $z_1(t) = 2t$ and since there is no condition on state $x_2(t)$, one can choose arbitrarily as $z_2(t) = 0$. Now, in our present case, with $e(t) = z(t) - Cx(t)$, we have $e_1(t) = z_1(t) - x_1(t)$, $e_2(t) = z_2(t) - x_2(t)$ and $C = I$.
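The MATLAB© routines of Appendix C are not reproduced here. As a rough stand-in for Example 4.1, the Python sketch below integrates the simplified equations (4.1.45) and (4.1.47) backward from the final conditions (4.1.46) and (4.1.48), and then runs the plant (4.1.38) forward under the control (4.1.42). The function names, the fixed step size, and the choice of classical Runge-Kutta (backward) and forward Euler (state) are assumptions made for this illustration, not the book's method.

```python
import math

RINV = 1.0 / 0.004   # 1/r = 250 for Example 4.1


def rhs(s):
    """Right-hand sides of (4.1.45) and (4.1.47) for s = (p11, p12, p22, g1, g2)."""
    p11, p12, p22, g1, g2 = s
    return (
        RINV * p12 * p12 + 4.0 * p12 - 2.0,
        RINV * p12 * p22 - p11 + 3.0 * p12 + 2.0 * p22,
        RINV * p22 * p22 - 2.0 * p12 + 6.0 * p22,
        (RINV * p12 + 2.0) * g2 - 2.0,
        -g1 + (3.0 + RINV * p22) * g2,
    )


def axpy(s, k, h):
    """Return s + h * k, element by element."""
    return tuple(si + h * ki for si, ki in zip(s, k))


def rk4_back(s, h):
    """One classical RK4 step backward in time, from t to t - h."""
    k1 = rhs(s)
    k2 = rhs(axpy(s, k1, -h / 2))
    k3 = rhs(axpy(s, k2, -h / 2))
    k4 = rhs(axpy(s, k3, -h))
    return tuple(si - h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def solve_example_4_1(tf=20.0, n=4000, x0=(-0.5, 0.0)):
    """Backward sweep for P(t), g(t); forward Euler sweep for x*(t), u*(t)."""
    h = tf / n
    traj = [None] * (n + 1)
    traj[n] = (2.0, 0.0, 0.0, 2.0, 0.0)   # P(tf) from (4.1.46), g(tf) from (4.1.48)
    for i in range(n, 0, -1):
        traj[i - 1] = rk4_back(traj[i], h)
    x1, x2 = x0
    xs, us = [], []
    for i in range(n + 1):
        _, p12, p22, _, g2 = traj[i]
        u = -RINV * (p12 * x1 + p22 * x2 - g2)   # control law (4.1.42)
        xs.append((x1, x2))
        us.append(u)
        x1, x2 = x1 + h * x2, x2 + h * (-2.0 * x1 - 3.0 * x2 + u)
    return traj, xs, us


if __name__ == "__main__":
    traj, xs, us = solve_example_4_1()
    print("P(0) entries:", traj[0][:3])
    print("g(0) entries:", traj[0][3:])
    print("x1 at t = 10:", xs[len(xs) // 2][0])
```

Away from the final time, the Riccati coefficients computed this way settle to constant values, so the right-hand sides of (4.1.45) evaluated at `traj[0]` are close to zero. Setting `RINV = 25.0`, changing the forcing term `- 2.0` in the fourth component to `- 4.0 * t` (which requires passing the time to `rhs`), and using zero final conditions would adapt the sketch to Example 4.2.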
Figure 4.2 Riccati Coefficients for Example 4.1 (plot of $p_{11}(t)$, $p_{12}(t)$ and $p_{22}(t)$)

Next, let us identify the various matrices in the present tracking system by comparing the state (4.1.38) and the PI (4.1.49) of the present problem (note the absence of the factor 1/2 in PI) with the corresponding state (4.1.29) and the PI (4.1.30), respectively, of the general formulation of the problem. We get
\[ A = \begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = I, \quad z(t) = \begin{bmatrix} 2t \\ 0 \end{bmatrix}, \quad Q = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}, \quad R = r = 0.04. \tag{4.1.50} \]
Let $P(t)$ be the $2 \times 2$ symmetric matrix and $g(t)$ be the $2 \times 1$ vector as
\[ P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}, \qquad g(t) = \begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix}. \tag{4.1.51} \]
Then, the optimal control given by (4.1.31) becomes
\[ u^*(t) = -25\left[p_{12}(t)x_1^*(t) + p_{22}(t)x_2^*(t) - g_2(t)\right] \tag{4.1.52} \]
where $P(t)$, the positive definite matrix, is the solution of the
Figure 4.3 Coefficients $g_1(t)$ and $g_2(t)$ for Example 4.1

Figure 4.4 Optimal States for Example 4.1
Figure 4.5 Optimal Control for Example 4.1

matrix differential Riccati equation (4.1.32)
\[ \dot{P}(t) = -P(t)\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & -3 \end{bmatrix}P(t) + P(t)\begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.04}\begin{bmatrix} 0 & 1 \end{bmatrix}P(t) - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.53} \]
and $g(t)$ is the solution of the nonhomogeneous vector differential equation obtained from (4.1.34) as
\[ \begin{bmatrix} \dot{g}_1(t) \\ \dot{g}_2(t) \end{bmatrix} = -\left\{\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix}\frac{1}{0.04}\begin{bmatrix} 0 & 1 \end{bmatrix}P(t)\right\}'\begin{bmatrix} g_1(t) \\ g_2(t) \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 2t \\ 0 \end{bmatrix}. \tag{4.1.54} \]
Simplifying the equations (4.1.53) and (4.1.54), we get
\[ \begin{aligned} \dot{p}_{11}(t) &= 25p_{12}^2(t) + 4p_{12}(t) - 2 \\ \dot{p}_{12}(t) &= 25p_{12}(t)p_{22}(t) - p_{11}(t) + 3p_{12}(t) + 2p_{22}(t) \\ \dot{p}_{22}(t) &= 25p_{22}^2(t) - 2p_{12}(t) + 6p_{22}(t) \end{aligned} \tag{4.1.55} \]
with final condition (4.1.33) as
\[ \begin{bmatrix} p_{11}(t_f) & p_{12}(t_f) \\ p_{12}(t_f) & p_{22}(t_f) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \tag{4.1.56} \]
and
\[ \begin{aligned} \dot{g}_1(t) &= \left[25p_{12}(t) + 2\right]g_2(t) - 4t \\ \dot{g}_2(t) &= -g_1(t) + \left[3 + 25p_{22}(t)\right]g_2(t) \end{aligned} \tag{4.1.57} \]
with final condition
\[ \begin{bmatrix} g_1(t_f) \\ g_2(t_f) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{4.1.58} \]
See the plots of the Riccati coefficients $p_{11}(t)$, $p_{12}(t)$ and $p_{22}(t)$ in Figure 4.6 and coefficients $g_1(t)$ and $g_2(t)$ in Figure 4.7. Also see the plots of the optimal control $u^*(t)$ in Figure 4.9 and optimal states $x_1^*(t)$ and $x_2^*(t)$ in Figure 4.8.

4.2 LQT System: Infinite-Time Case

In Chapter 3, in the case of linear quadratic regulator system, we extended the results of finite-time case to infinite-time (limiting or steady-state) case. Similarly, we now extend the results of finite-time case of the linear quadratic tracking system to the case of infinite time [3]. Thus, we restrict our treatment to time-invariant matrices in the plant and the performance index. Consider a linear time-invariant plant as
\[ \dot{x}(t) = Ax(t) + Bu(t) \tag{4.2.1} \]
\[ y(t) = Cx(t). \tag{4.2.2} \]
The error is
\[ e(t) = z(t) - y(t), \tag{4.2.3} \]
and choose the performance index as
\[ \lim_{t_f \to \infty} J = \lim_{t_f \to \infty} \frac{1}{2}\int_0^{t_f}\left[e'(t)Qe(t) + u'(t)Ru(t)\right]dt \tag{4.2.4} \]
Figure 4.6 Riccati Coefficients for Example 4.2 (plot of $p_{11}(t)$, $p_{12}(t)$ and $p_{22}(t)$)

to track the desired signal $z(t)$. Also, we assume that $Q$ is an $m \times m$ symmetric, positive semidefinite matrix, and $R$ is an $r \times r$ symmetric, positive definite matrix. Note that there is no terminal cost function in the PI (4.2.4) and hence $F = 0$. An obvious way of getting results for infinite-time (steady-state) case is to write down the results of finite-time case and then simply let $t_f \to \infty$. Thus, as $t_f \to \infty$, the matrix function $P(t)$ in (4.1.19) tends to the steady-state value $\bar{P}$ as the solution of
\[ -\bar{P}A - A'\bar{P} + \bar{P}BR^{-1}B'\bar{P} - C'QC = 0. \tag{4.2.5} \]
Also, the vector function $g(t)$ in (4.1.20) tends to a finite function $\bar{g}(t)$ as the solution of
\[ \dot{\bar{g}}(t) = \left[\bar{P}E - A'\right]\bar{g}(t) - Wz(t) \tag{4.2.6} \]
where $E = BR^{-1}B'$ and $W = C'Q$. The optimal control (4.1.31) becomes
\[ u(t) = -R^{-1}B'\left[\bar{P}x(t) - \bar{g}(t)\right]. \tag{4.2.7} \]
Further details on this are available in Anderson and Moore [3].
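For a feel of what the steady-state relations (4.2.5) to (4.2.7) give, they can be worked out in closed form for a scalar plant with a constant desired output. This is an illustrative sketch only; the scalars $a$, $b$, $c$, $q$, $r$, the shorthand $\beta$, and the assumption of a constant $z$ are introduced here and are not part of the original text.

```latex
\begin{align*}
-2a\bar{p} + \frac{b^2}{r}\bar{p}^2 - c^2 q = 0
  \;&\Longrightarrow\; \bar{p} = \frac{r}{b^2}\left[a + \beta\right],
  \qquad \beta = \sqrt{a^2 + b^2 c^2 q / r},\\
\dot{\bar{g}}(t) = \beta\,\bar{g}(t) - cq\,z
  \;&\Longrightarrow\; \bar{g} = \frac{cq}{\beta}\,z
  \quad \text{(the only bounded solution for constant } z\text{)},\\
u(t) = -\frac{b}{r}\left[\bar{p}\,x(t) - \bar{g}\right],
  \qquad &\dot{x}(t) = -\beta\,x(t) + \frac{b^2}{r}\,\bar{g}
  \;\Longrightarrow\; y(\infty) = c\,x(\infty) = \left[1 - \frac{a^2}{\beta^2}\right]z.
\end{align*}
```

The positive root is taken for $\bar{p}$ so that the closed-loop pole $a - (b^2/r)\bar{p} = -\beta$ is stable. The last relation shows that, under these assumptions, the output settles close to but not exactly at $z$ unless $a = 0$ or the ratio $q/r$ is made large.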
Figure 4.7 Coefficients $g_1(t)$ and $g_2(t)$ for Example 4.2

Figure 4.8 Optimal Control and States for Example 4.2
Figure 4.9 Optimal Control and States for Example 4.2

4.3 Fixed-End-Point Regulator System

In this section, we discuss the fixed-end-point state regulator system, where the final state $x(t_f)$ is zero and the final time $t_f$ is fixed [5]. This is different from the conventional free-end-point state regulator system with the final state $x(t_f)$ being free, leading to the matrix Riccati differential equation that was discussed in Chapter 3. Following the procedure similar to the free-end-point system, we will arrive at the same matrix differential Riccati equation (3.2.18). But, if we use the earlier transformation (3.2.11) to find the corresponding boundary condition for (3.2.18), we see that
\[ \lambda(t_f) = P(t_f)x(t_f) \tag{4.3.1} \]
and for the fixed final condition $x(t_f) = 0$, and for arbitrary $\lambda(t_f)$, we have
\[ P(t_f) = \infty. \tag{4.3.2} \]
This means that for the fixed-end-point regulator system, we solve the matrix DRE (3.2.18) using the final condition (4.3.2). In practice, we may start with a very large value of $P(t_f)$ instead of $\infty$. Alternatively, we present a different procedure to find closed-loop optimal control for the fixed-end-point system [5]. In fact, we will use
what is known as inverse Riccati transformation between the state and costate variables and arrive at matrix inverse Riccati equation. As before, consider a linear, time-varying system
\[ \dot{x}(t) = A(t)x(t) + B(t)u(t) \tag{4.3.3} \]
with a cost functional
\[ J(u) = \frac{1}{2}\int_{t_0}^{t_f}\left[x'(t)Q(t)x(t) + u'(t)R(t)u(t)\right]dt \tag{4.3.4} \]
where $x(t)$ is the $n$th state vector, $u(t)$ is the $r$th control vector, $A(t)$ is the $n \times n$ state matrix, and $B(t)$ is the $n \times r$ control matrix. We assume that the control is unconstrained. The boundary conditions are given as
\[ x(t = t_0) = x_0, \qquad x(t = t_f) = 0 \tag{4.3.5} \]
where $t_f$ is fixed or given a priori. Here, we can easily see that for fixed final condition, there is no meaning in having a terminal cost term in the cost function (4.3.4). We develop the procedure for the fixed-end-point regulator system under the following steps (see Table 2.1).

• Step 1: Hamiltonian
• Step 2: Optimal Control
• Step 3: State and Costate System
• Step 4: Closed-Loop Optimal Control
• Step 5: Boundary Conditions

Now, we address these steps in detail.

• Step 1: Hamiltonian: Formulate the Hamiltonian as
\[ \mathcal{H}(x(t), u(t), \lambda(t)) = \frac{1}{2}x'(t)Q(t)x(t) + \frac{1}{2}u'(t)R(t)u(t) + \lambda'(t)\left[A(t)x(t) + B(t)u(t)\right]. \tag{4.3.6} \]

• Step 2: Optimal Control: Taking the partial of $\mathcal{H}$ w.r.t. $u$, we have
\[ \frac{\partial \mathcal{H}}{\partial u} = 0 \;\longrightarrow\; R(t)u(t) + B'(t)\lambda(t) = 0 \tag{4.3.7} \]
which gives optimal control $u^*(t)$ as
\[ u^*(t) = -R^{-1}(t)B'(t)\lambda^*(t). \tag{4.3.8} \]
• Step 3: State and Costate System: Obtain the state and costate equations as
\[ \dot{x}^*(t) = +\frac{\partial \mathcal{H}}{\partial \lambda} \;\longrightarrow\; \dot{x}^*(t) = A(t)x^*(t) + B(t)u^*(t), \tag{4.3.9} \]
\[ \dot{\lambda}^*(t) = -\frac{\partial \mathcal{H}}{\partial x} \;\longrightarrow\; \dot{\lambda}^*(t) = -Q(t)x^*(t) - A'(t)\lambda^*(t). \tag{4.3.10} \]
Eliminating control $u^*(t)$ from (4.3.8) and (4.3.9), we obtain the canonical system of equations
\[ \begin{bmatrix} \dot{x}^*(t) \\ \dot{\lambda}^*(t) \end{bmatrix} = \begin{bmatrix} A(t) & -E(t) \\ -Q(t) & -A'(t) \end{bmatrix} \begin{bmatrix} x^*(t) \\ \lambda^*(t) \end{bmatrix} \tag{4.3.11} \]
where $E(t) = B(t)R^{-1}(t)B'(t)$. This state and costate system, along with the given boundary conditions (4.3.5), constitutes a two-point boundary value problem (TPBVP), which when solved gives optimal state $x^*(t)$ and costate $\lambda^*(t)$ functions. This optimal costate function $\lambda^*(t)$ substituted in (4.3.8) gives optimal control $u^*(t)$. This leads us to open-loop optimal control as discussed in Chapter 2. But our interest here is to obtain closed-loop optimal control for the fixed-end-point regulator system.

• Step 4: Closed-Loop Optimal Control: Now if this were a free-end-point system ($x(t_f)$ free), using transversality conditions, we would be able to obtain a final condition on the costate $\lambda(t_f)$, which lets us assume a Riccati transformation between the state and costate function as
\[ \lambda^*(t) = P(t)x^*(t). \tag{4.3.12} \]
In the absence of any knowledge on the final condition of the costate function $\lambda^*(t)$, we are led to assume a kind of inverse Riccati transformation as [104, 113]
\[ x^*(t) = M(t)\lambda^*(t) \tag{4.3.13} \]
where the $n \times n$ matrix $M(t)$ is yet to be determined. Note the difference between the transformations (4.3.12) and (4.3.13). Now as before in the case of free-end-point system, by simple manipulation of the state and costate system (4.3.11) and (4.3.13) (that
is, eliminating $x^*(t)$), we obtain
\[ \dot{x}^*(t) = \dot{M}(t)\lambda^*(t) + M(t)\dot{\lambda}^*(t) \;\longrightarrow\; \left[\dot{M}(t) - A(t)M(t) - M(t)A'(t) - M(t)Q(t)M(t) + B(t)R^{-1}(t)B'(t)\right]\lambda^*(t) = 0. \tag{4.3.14} \]
Now, if the previous equation should be valid for all $t \in [t_0, t_f]$, and for any arbitrary $\lambda^*(t)$, we then have
\[ \dot{M}(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^{-1}(t)B'(t). \tag{4.3.15} \]
Let us call this the inverse matrix differential Riccati equation (DRE) just to distinguish from the normal DRE (3.2.34).

• Step 5: Boundary Conditions: Now the boundary condition for (4.3.15) is obtained as follows. Here, we have different cases to be discussed.

1. $x(t_f) = 0$ and $x(t_0) \neq 0$: We know from the given boundary conditions (4.3.5) that $x(t_f) = 0$ and using this in (4.3.13), we get
\[ x(t_f) = 0 = M(t_f)\lambda(t_f). \tag{4.3.16} \]
For arbitrary $\lambda(t_f)$, (4.3.16) becomes
\[ M(t_f) = 0. \tag{4.3.17} \]

2. $x(t_f) \neq 0$ and $x(t_0) = 0$: Here, at $t = t_0$, (4.3.13) becomes
\[ x(t_0) = 0 = M(t_0)\lambda(t_0) \tag{4.3.18} \]
and for arbitrary $\lambda(t_0)$, (4.3.18) becomes
\[ M(t_0) = 0. \tag{4.3.19} \]
Thus, we solve the inverse matrix DRE (4.3.15) backward using the final condition (4.3.17) or forward using the initial condition (4.3.19). The optimal control (4.3.8) with the transformation (4.3.13) becomes
\[ u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)x^*(t). \tag{4.3.20} \]
3. General Boundary Conditions: $x(t_0) \neq 0$ and $x(t_f) \neq 0$. Here, both the given boundary conditions are not zero, and we assume a transformation as

$$x^*(t) = M(t)\lambda^*(t) + v(t). \tag{4.3.21}$$

As before, we substitute the transformation (4.3.21) in the state and costate system (4.3.11) and eliminate $x^*(t)$ to get

$$\dot{x}^*(t) = \dot{M}(t)\lambda^*(t) + M(t)\dot{\lambda}^*(t) + \dot{v}(t) \tag{4.3.22}$$

leading to

$$A(t)\left[M(t)\lambda^*(t) + v(t)\right] - B(t)R^{-1}(t)B'(t)\lambda^*(t) = \dot{M}(t)\lambda^*(t) + M(t)\left[-Q(t)\left[M(t)\lambda^*(t) + v(t)\right] - A'(t)\lambda^*(t)\right] + \dot{v}(t) \tag{4.3.23}$$

further leading to

$$\left[\dot{M}(t) - A(t)M(t) - M(t)A'(t) - M(t)Q(t)M(t) + B(t)R^{-1}(t)B'(t)\right]\lambda^*(t) + \left[\dot{v}(t) - M(t)Q(t)v(t) - A(t)v(t)\right] = 0. \tag{4.3.24}$$

This should be valid for any arbitrary value of $\lambda^*(t)$, which leads us to a set of equations

$$\dot{M}(t) = A(t)M(t) + M(t)A'(t) + M(t)Q(t)M(t) - B(t)R^{-1}(t)B'(t), \tag{4.3.25}$$

$$\dot{v}(t) = M(t)Q(t)v(t) + A(t)v(t). \tag{4.3.26}$$

At $t = t_0$, (4.3.21) becomes

$$x^*(t_0) = M(t_0)\lambda^*(t_0) + v(t_0). \tag{4.3.27}$$

Since $\lambda^*(t_0)$ is arbitrary, (4.3.27) gives us

$$M(t_0) = 0; \quad v(t_0) = x(t_0). \tag{4.3.28}$$

At $t = t_f$, (4.3.21) becomes

$$x^*(t_f) = M(t_f)\lambda^*(t_f) + v(t_f). \tag{4.3.29}$$
Again, since $\lambda^*(t_f)$ is arbitrary, (4.3.29) becomes

$$M(t_f) = 0; \quad v(t_f) = x(t_f). \tag{4.3.30}$$

Thus, the set of equations (4.3.25) and (4.3.26) is solved using either the initial conditions (4.3.28) or the final conditions (4.3.30). Finally, using the transformation (4.3.21) in the optimal control (4.3.8), the closed-loop optimal control is given by

$$u^*(t) = -R^{-1}(t)B'(t)M^{-1}(t)\left[x^*(t) - v(t)\right] \tag{4.3.31}$$

where it is assumed that $M(t)$ is invertible.

Now, to illustrate the previous method and to be able to get analytical solutions, we present a first-order example.

Example 4.3 Given the plant as

$$\dot{x}(t) = ax(t) + bu(t), \tag{4.3.32}$$

the performance index as

$$J = \frac{1}{2}\int_{t_0}^{t_f} \left[qx^2(t) + ru^2(t)\right] dt, \tag{4.3.33}$$

and the boundary conditions as

$$x(t = 0) = x_0; \quad x(t = t_f) = 0, \tag{4.3.34}$$

find the closed-loop optimal control.

Solution: Follow the procedure of the inverse matrix DRE described in the last section. We see that with the boundary conditions (4.3.34), we need to use the scalar version of the inverse matrix DRE (4.3.15) having the boundary condition (4.3.17). The optimal control (4.3.20) is given by

$$u^*(t) = -r^{-1}b\,m^{-1}(t)x^*(t) \tag{4.3.35}$$

where $m(t)$ is the solution of the scalar DRE (4.3.15)

$$\dot{m}(t) = 2am(t) + m^2(t)q - \frac{b^2}{r} \tag{4.3.36}$$
with the boundary condition (4.3.17) as $m(t_f) = 0$. Solving (4.3.36) with this boundary condition, we get

$$m(t) = \frac{b^2}{r}\,\frac{e^{-\beta(t - t_f)} - e^{\beta(t - t_f)}}{(a + \beta)e^{-\beta(t - t_f)} - (a - \beta)e^{\beta(t - t_f)}} \tag{4.3.37}$$

where $\beta = \sqrt{a^2 + q\,b^2/r}$. Then the optimal control (4.3.35) becomes

$$u^*(t) = -\frac{1}{b}\,\frac{(a + \beta)e^{-\beta(t - t_f)} - (a - \beta)e^{\beta(t - t_f)}}{e^{-\beta(t - t_f)} - e^{\beta(t - t_f)}}\,x^*(t). \tag{4.3.38}$$

4.4 LQR with a Specified Degree of Stability

In this section, we examine the state regulator system with infinite time interval and with a specified degree of stability for a time-invariant system [3, 2]. Let us consider a linear time-invariant plant as

$$\dot{x}(t) = Ax(t) + Bu(t); \quad x(t = t_0) = x(0), \tag{4.4.1}$$

and the cost functional as

$$J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}\left[x'(t)Qx(t) + u'(t)Ru(t)\right] dt \tag{4.4.2}$$

where $\alpha$ is a positive parameter. Here, we first assume that the pair $[A + \alpha I, B]$ is completely stabilizable and that $R$ and $Q$ are constant, symmetric, positive definite and positive semidefinite matrices, respectively. The problem is to find the optimal control which minimizes the performance index (4.4.2) under the dynamical constraint (4.4.1). This can be solved by modifying the previous system to fit into the standard infinite-time regulator system discussed earlier in Chapter 3. Thus, we make the following transformations

$$\hat{x}(t) = e^{\alpha t}x(t); \quad \hat{u}(t) = e^{\alpha t}u(t). \tag{4.4.3}$$

Then, using the transformations (4.4.3), it is easy to see that the modified system becomes

$$\dot{\hat{x}}(t) = \frac{d}{dt}\left\{e^{\alpha t}x(t)\right\} = \alpha e^{\alpha t}x(t) + e^{\alpha t}\dot{x}(t) = \alpha\hat{x}(t) + e^{\alpha t}\left[Ax(t) + Bu(t)\right] = (A + \alpha I)\hat{x}(t) + B\hat{u}(t). \tag{4.4.4}$$
We note that the initial conditions for the original system (4.4.1) and the modified system (4.4.4) are simply related as $\hat{x}(t_0) = e^{\alpha t_0}x(t_0)$ and, in particular, if $t_0 = 0$ the initial conditions are the same for the original and modified systems. Also, using the transformations (4.4.3), the original performance measure (4.4.2) can be modified to

$$\hat{J} = \frac{1}{2}\int_{t_0}^{\infty} \left[\hat{x}'(t)Q\hat{x}(t) + \hat{u}'(t)R\hat{u}(t)\right] dt. \tag{4.4.5}$$

Considering the minimization of the modified system defined by (4.4.4) and (4.4.5), we see that the optimal control is given by (see Chapter 3, Table 3.3)

$$\hat{u}^*(t) = -R^{-1}B'P\hat{x}^*(t) = -K\hat{x}^*(t) \tag{4.4.6}$$

where $K = R^{-1}B'P$ and the matrix $P$ is the positive definite, symmetric solution of the algebraic Riccati equation

$$P(A + \alpha I) + (A' + \alpha I)P - PBR^{-1}B'P + Q = 0. \tag{4.4.7}$$

Using the optimal control (4.4.6) in the modified system (4.4.4), we get the optimal closed-loop system as

$$\dot{\hat{x}}^*(t) = (A + \alpha I - BR^{-1}B'P)\hat{x}^*(t). \tag{4.4.8}$$

Now, we can simply apply these results to the original system. Thus, using the transformations (4.4.3) in (4.4.6), the optimal control of the original system (4.4.1) and the associated performance measure (4.4.2) is given by

$$u^*(t) = e^{-\alpha t}\hat{u}^*(t) = -e^{-\alpha t}R^{-1}B'Pe^{\alpha t}x^*(t) = -R^{-1}B'Px^*(t) = -Kx^*(t). \tag{4.4.9}$$

Interestingly, this desired (original) optimal control (4.4.9) has the same structure as the optimal control (4.4.6) of the modified system. The optimal performance index for the original system or the modified system is the same and equals

$$\hat{J}^* = J^* = \frac{1}{2}\hat{x}^{*\prime}(t_0)P\hat{x}^*(t_0) = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)Px^*(t_0). \tag{4.4.10}$$

Since the modified closed-loop system (4.4.8) is asymptotically stable, the eigenvalues of $A + \alpha I - BK$ have negative real parts, so the original closed-loop matrix $A - BK$ has eigenvalues with real parts less than $-\alpha$. In other words, the state $x^*(t)$ approaches zero at least as fast as $e^{-\alpha t}$. Then, we say that the closed-loop optimal system has a degree of stability of at least $\alpha$.
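The argument above can be checked numerically in the scalar case. The following is a minimal sketch, assuming a scalar plant so that the ARE (4.4.7) reduces to a quadratic in $p$ that can be solved in closed form; the function name `scalar_lqr_with_stability` and the numerical values (an unstable plant with $a = 1$) are illustrative choices and do not come from the text.

```python
import math


def scalar_lqr_with_stability(a, b, q, r, alpha):
    """Scalar form of the ARE (4.4.7): 2*p*(a + alpha) - (b**2 / r)*p**2 + q = 0.

    Returns (p, k, pole), where pole is the closed-loop pole a - b*k of the
    ORIGINAL plant x' = a*x + b*u under the feedback u = -k*x.
    """
    a_shift = a + alpha          # modified plant (4.4.4): A + alpha*I
    s = b * b / r                # scalar B R^{-1} B'
    # Positive root of s*p**2 - 2*a_shift*p - q = 0.
    p = (a_shift + math.sqrt(a_shift ** 2 + s * q)) / s
    k = b * p / r                # K = R^{-1} B' P
    pole = a - b * k             # eigenvalue of A - B K
    return p, k, pole


# Illustrative values only: an unstable scalar plant with unit weights.
for alpha in (0.0, 0.5, 2.0):
    p, k, pole = scalar_lqr_with_stability(a=1.0, b=1.0, q=1.0, r=1.0, alpha=alpha)
    print(f"alpha={alpha}: p={p:.4f}, k={k:.4f}, pole={pole:.4f}, "
          f"pole < -alpha: {pole < -alpha}")
```

In this scalar setting the pole works out to $-\alpha - \sqrt{(a + \alpha)^2 + q\,b^2/r}$, which is strictly to the left of $-\alpha$ whenever $q > 0$ and $b \neq 0$.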
4.4.1 Regulator System with Prescribed Degree of Stability: Summary

For a controllable, linear, time-invariant plant

$$\dot{x}(t) = Ax(t) + Bu(t), \tag{4.4.11}$$

and the infinite interval cost functional

$$J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}\left[x'(t)Qx(t) + u'(t)Ru(t)\right] dt, \tag{4.4.12}$$

the optimal control is given by

$$u^*(t) = -R^{-1}B'Px^*(t) = -Kx^*(t) \tag{4.4.13}$$

where $K = R^{-1}B'P$ and $P$, the $n \times n$ constant, positive definite, symmetric matrix, is the solution of the nonlinear, matrix algebraic Riccati equation (ARE)

$$P(A + \alpha I) + (A' + \alpha I)P - PBR^{-1}B'P + Q = 0, \tag{4.4.14}$$

the optimal trajectory is the solution of

$$\dot{x}^*(t) = \left(A - BR^{-1}B'P\right)x^*(t), \tag{4.4.15}$$

and the optimal cost is given by

$$J^* = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)Px^*(t_0). \tag{4.4.16}$$

The entire procedure is now summarized in Table 4.2. Consider a first-order system example to illustrate the previous method.

Example 4.4 Consider a first-order system

$$\dot{x}(t) = -x(t) + u(t), \quad x(0) = 1 \tag{4.4.17}$$

and a performance measure

$$J = \frac{1}{2}\int_{0}^{\infty} e^{2\alpha t}\left[x^2(t) + u^2(t)\right] dt. \tag{4.4.18}$$

Find the optimal control law and show that the closed-loop optimal system has a degree of stability of at least $\alpha$.
Table 4.2 Procedure Summary of Regulator System with Prescribed Degree of Stability

A. Statement of the Problem
Given the plant as $\dot{x}(t) = Ax(t) + Bu(t)$, the performance index as $J = \frac{1}{2}\int_{t_0}^{\infty} e^{2\alpha t}\left[x'(t)Qx(t) + u'(t)Ru(t)\right] dt$, and the boundary conditions as $x(t_0) = x_0$; $x(\infty) = 0$, find the optimal control, state and index.

B. Solution of the Problem
Step 1 Solve the matrix algebraic Riccati equation $P(A + \alpha I) + (A' + \alpha I)P + Q - PBR^{-1}B'P = 0$.
Step 2 Solve the optimal state $x^*(t)$ from $\dot{x}^*(t) = \left(A - BR^{-1}B'P\right)x^*(t)$ with initial condition $x(t_0) = x_0$.
Step 3 Obtain the optimal control $u^*(t)$ from $u^*(t) = -R^{-1}B'Px^*(t)$.
Step 4 Obtain the optimal performance index from $J^* = \frac{1}{2}e^{2\alpha t_0}x^{*\prime}(t_0)Px^*(t_0)$.

Solution: Essentially, we show that the eigenvalue of this closed-loop optimal system is less than or equal to $-\alpha$. First of all, in the above, we note that $A = a = -1$, $B = b = 1$, $Q = q = 1$ and $R = r = 1$. Then, the algebraic Riccati equation (4.4.14) becomes

$$2p(\alpha - 1) - p^2 + 1 = 0 \;\longrightarrow\; p^2 - 2p(\alpha - 1) - 1 = 0. \tag{4.4.19}$$

Solving the previous equation for the positive value of $p$, we have

$$p = -1 + \alpha + \sqrt{(\alpha - 1)^2 + 1}. \tag{4.4.20}$$

The optimal control (4.4.13) becomes

$$u^*(t) = -px^*(t). \tag{4.4.21}$$

The optimal system (4.4.15) becomes

$$\dot{x}^*(t) = \left(-\alpha - \sqrt{(\alpha - 1)^2 + 1}\right)x^*(t). \tag{4.4.22}$$
It is easy to see that the eigenvalue for the system (4.4.22) is related as

$$-\alpha - \sqrt{(\alpha - 1)^2 + 1} < -\alpha. \tag{4.4.23}$$

This shows the desired result that the optimal system has the eigenvalue less than $-\alpha$.

4.5 Frequency-Domain Interpretation

In this section, we use the frequency domain to derive some results from the classical control point of view for a linear, time-invariant, continuous-time, optimal control system with infinite-time horizon. For this, we know that the closed-loop optimal control involves the solution of the matrix algebraic Riccati equation [89, 3]. For ready reference, we repeat here some of the results of Chapter 3. Consider a controllable, linear, time-invariant plant

$$\dot{x}(t) = Ax(t) + Bu(t), \tag{4.5.1}$$

and the infinite-time interval cost functional

$$J = \frac{1}{2}\int_{0}^{\infty} \left[x'(t)Qx(t) + u'(t)Ru(t)\right] dt. \tag{4.5.2}$$

The optimal control is given by

$$u^*(t) = -R^{-1}B'Px^*(t) = -Kx^*(t), \tag{4.5.3}$$

where $K = R^{-1}B'P$, and $P$, the $n \times n$ constant, positive definite, symmetric matrix, is the solution of the nonlinear, matrix ARE

$$-PA - A'P + PBR^{-1}B'P - Q = 0. \tag{4.5.4}$$

The optimal trajectory (state) is the solution of

$$\dot{x}^*(t) = \left(A - BR^{-1}B'P\right)x^*(t) = (A - BK)x^*(t), \tag{4.5.5}$$

which is asymptotically stable. Here, we assume that $[A, B]$ is stabilizable and $[A, \sqrt{Q}]$ is observable. Then, the open-loop characteristic polynomial of the system is [89]

$$\Delta_o(s) = |sI - A|, \tag{4.5.6}$$
where $s$ is the Laplace variable, and the optimal closed-loop characteristic polynomial is

$$\Delta_c(s) = |sI - A + BK| = \left|I + BK[sI - A]^{-1}\right| \cdot |sI - A| = \left|I + K[sI - A]^{-1}B\right|\Delta_o(s). \tag{4.5.7}$$

This is a relation between the open-loop $\Delta_o(s)$ and closed-loop $\Delta_c(s)$ characteristic polynomials. From Figure 4.10, we note that

1. $-K[sI - A]^{-1}B$ is called the loop gain matrix, and
2. $I + K[sI - A]^{-1}B$ is termed the return difference matrix.

[Figure 4.10: Optimal Closed-Loop Control in Frequency Domain. Block diagram: the plant $[sI - A]^{-1}B$ from $U(s)$ to $X(s)$, with the closed-loop optimal controller $K = R^{-1}B'P$ in the feedback path.]

To derive the desired factorization result, we use the matrix ARE (4.5.4). Let us rewrite the ARE as

$$-PA - A'P + PBR^{-1}B'P = Q. \tag{4.5.8}$$

First, adding and subtracting $sP$, $s = j\omega$, to the previous ARE, we get

$$P[sI - A] + [-sI - A']P + K'RK = Q. \tag{4.5.9}$$

Next, premultiplying by $B'\Phi'(-s)$ and postmultiplying by $\Phi(s)B$, the previous equation becomes

$$B'\Phi'(-s)PB + B'P\Phi(s)B + B'\Phi'(-s)K'RK\Phi(s)B = B'\Phi'(-s)Q\Phi(s)B \tag{4.5.10}$$
5.]I.1 Gain Margin and Phase Margin We know that in classical control theory.A. using K = R 1 B'P + K' = PBR 1 + PB = K'R and adding R to both sides of (4.4.5.A]IB] M(jw) = R + B'[jwl .5. we interpret some of the classical control features such as gain margin and phase margin for the closedloop optimal control system [3].13)) with s = jw here as B'[jwl . the features of gain and phase margins are important in evaluating the system performance with respect to robustness to plant parameter variations and uncertainties.14) The previous result can be viewed as M(jw) = W'( jw)W(jw) where.A]IB]' R[I + K[jwl .11) Finally. 4. The engineering specifications often place lower bounds on the phase and gain margins.]IQ[jwl .A]IB].15) W(jw) = R 1/ 2 [I + K[jwl .12) B'[sl . we have the desired factorization result as IB'~/( s)Q~(s)B + R or equivalently.10).A]I. For ready reference let us rewrite the returndifference result (4.13) The previous relation is also called the Kalman equation in frequency domain.5. Here.16) .5.A]IB] . (4.A]IB + R = [I + K[jwl .A]lB. (4. (4.5.5. (4.A']IQ[sl .A]IB + R = [I + K[sl . R [I + K[sl . (4.5 FrequencyDomain Interpretation 181 where.A.5. = [I + K~( s)BJ' R [I + K~(s)B] I (4.A. ~'(s) = [sl . we used ~(s) = [sl .5.]IQ[jwl .A]IB].
Note that $M(j\omega) \geq R > 0$. Using $Q = CC'$, $R = DD' = I$ and the notation

$$W'(-j\omega)W(j\omega) = \|W(j\omega)\|^2, \tag{4.5.17}$$

the factorization result (4.5.14) can be written in neat form as

$$I + \left\|C'[j\omega I - A]^{-1}B\right\|^2 = \left\|I + K[j\omega I - A]^{-1}B\right\|^2. \tag{4.5.18}$$

This result can be used to find the optimal feedback matrix $K$ given the other quantities $A$, $B$, $Q$, $R$. Note that in (4.5.18), we need not solve for the Riccati coefficient matrix $P$; instead we directly obtain the feedback matrix $K$. In the single-input case, the various matrices become scalars or vectors as $B = b$, $R = r$, $K = k$. Then, the factorization result (4.5.14) boils down to

$$r + b'[-j\omega I - A']^{-1}Q[j\omega I - A]^{-1}b = r\left|1 + k[j\omega I - A]^{-1}b\right|^2. \tag{4.5.19}$$

In case $Q = cc'$, we can write (4.5.19) as

$$r + \left|c'[j\omega I - A]^{-1}b\right|^2 = r\left|1 + k[j\omega I - A]^{-1}b\right|^2. \tag{4.5.20}$$

The previous result may be called another version of the Kalman equation in frequency domain. The previous relation (also from (4.5.18) for a scalar case) implies that

$$\left|1 + k[j\omega I - A]^{-1}b\right| \geq 1. \tag{4.5.21}$$

Thus, the return difference is lower bounded by 1 for all $\omega$.

Example 4.5 Consider a simple example where we can verify the analytical solutions by another known method. Find the optimal feedback coefficients for the system

$$\dot{x}_1(t) = x_2(t), \quad \dot{x}_2(t) = u(t) \tag{4.5.22}$$

and the performance measure

$$J = \frac{1}{2}\int_{0}^{\infty} \left[x_1^2(t) + x_2^2(t) + u^2(t)\right] dt. \tag{4.5.23}$$
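Solution: First it is easy to identify the various matrices as

$$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad B = b = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad R = r = 1. \tag{4.5.24}$$

Also, note that since $Q = R = I$, we have $C = D = I$ and the Kalman equation (4.5.18) with $j\omega = s$ becomes

$$1 + B'[-sI - A']^{-1}[sI - A]^{-1}B = \left[1 + K[-sI - A]^{-1}B\right]\left[1 + K[sI - A]^{-1}B\right]. \tag{4.5.25}$$

Further, we have

$$[sI - A]^{-1} = \begin{bmatrix} 1/s & 1/s^2 \\ 0 & 1/s \end{bmatrix}. \tag{4.5.26}$$

Then, the Kalman equation (4.5.25) becomes

$$\left[1 + \begin{bmatrix} k_{11} & k_{12} \end{bmatrix}\begin{bmatrix} -1/s & 1/s^2 \\ 0 & -1/s \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right]\left[1 + \begin{bmatrix} k_{11} & k_{12} \end{bmatrix}\begin{bmatrix} 1/s & 1/s^2 \\ 0 & 1/s \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right] = 1 + \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} -1/s & 0 \\ 1/s^2 & -1/s \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1/s & 1/s^2 \\ 0 & 1/s \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{4.5.27}$$

By simple matrix multiplication and equating the coefficients of like powers of $s$ on either side, we get a set of algebraic equations in general, and in particular in this example we have a single scalar equation as

$$\left[1 - \frac{k_{12}}{s} + \frac{k_{11}}{s^2}\right]\left[1 + \frac{k_{12}}{s} + \frac{k_{11}}{s^2}\right] = 1 - \frac{1}{s^2} + \frac{1}{s^4} \tag{4.5.28}$$

giving us

$$k_{11} = 1, \quad k_{12} = \sqrt{3} \tag{4.5.29}$$

and the optimal feedback control as

$$u^*(t) = -Kx^*(t) = -\begin{bmatrix} 1 & \sqrt{3} \end{bmatrix}x^*(t). \tag{4.5.30}$$

Note: This example can be easily verified by using the algebraic Riccati equation (3.5.15) (of Chapter 3)

$$PA + A'P - PBR^{-1}B'P + Q = 0 \tag{4.5.31}$$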
Thus. we get (4.5. we assume that the Nyquist path is clockwise (CW) and the corresponding Nyquist plot makes counterclockwise (CCW) encirclements around the critical point 1 + jO. ~ + U(s) • K[sI Ar B  1 Xes) . singleoutput case.184 Chapter 4: Linear Quadratic Optimal Control Systems II discussed in Chapter 3. According to Nyquist stability criterion.5. . Let us now apply the wellknown Nyquist criterion to the unity feedback.5. Bode. Here.. Using the previous relation.30).11 ClosedLoop Optimal Control System with Unity Feedback Here. the frequency domain interpretation in terms of gain margin. or some other plot .of the transfer function Go (s ). we can easily recognize that for a singleinput.. the Nyquist plot (or diagram) makes CCW encirclements as many times as there are poles of the transfer function Go (s) lying in the right half of the splane.14) as u*(t) = R1B'Px*(t) =  [1 vI3] x*(t) (4.33) which is the same as (4.5. Let us redraw the closedloop optimal control system in Figure 4.10 as unity feedback system shown in Figure 4.32) and the optimal control (3. I Figure 4. for closedloop stability.11.11. phase margin can be easily done using Nyquist. the optimal feedback control system is exactly like a classical feedback control system with unity negative feedback and transfer function as Go(s) = k[sIA]lb. Gain Margin We recall that the gain margin of a feedback control system is the amount of loop gain (usually in decibels) that can be changed before the closedloop system becomes unstable. optimal control system depicted in Figure 4.
From Figure 4.11 and the return difference relation (4.5.21), we note that

$$\left|1 + G_o(j\omega)\right| \geq 1 \tag{4.5.34}$$

implies that the distance between the critical point $-1 + j0$ and any point on the Nyquist plot is at least 1, and the resulting Nyquist plot is shown in Figure 4.12 for all positive values of $\omega$ (i.e., 0 to $\infty$). This means that the Nyquist plot of $G_o(j\omega)$ is constrained to avoid all the points inside the unit circle (centered at $-1 + j0$). Thus, it is clear that the closed-loop optimal system has infinite gain margin.

[Figure 4.12: Nyquist Plot of $G_o(j\omega)$. Complex plane (axes Re and $j\omega$) showing $G_o(j\omega)$ and the return difference $1 + G_o(j\omega)$ relative to the unit circle centered at $-1 + j0$.]

Let us proceed further to see if there is a lower limit on the gain factor. Now, if we multiply the open-loop gain with some constant factor $\beta$, the closed-loop system will be asymptotically stable if the Nyquist plot of $\beta G_o(j\omega)$ encircles $-1 + j0$ in the CCW direction as many times as there are poles of $\beta G_o(s)$ in the right-half plane. This means that the closed-loop system will be stable if the Nyquist diagram of $G_o(j\omega)$ encircles the critical point $-(1/\beta) + j0$ the same number of times as there are open-loop poles in the right-half plane. But the set of points $-(1/\beta)$ for all real $\beta > \frac{1}{2}$ lies inside the critical unit circle and thus these points are encircled CCW the same number of times as the original point $-1 + j0$. Consequently, there is a lower limit as $\beta > \frac{1}{2}$. In other words, for values of $\beta < \frac{1}{2}$, the set of points $-1/\beta + j0$ lies outside the unit circle and contradicts the Nyquist criterion for stability of closed-loop
optimal control system. Thus, we have an infinite gain margin on the upper side and a lower gain margin of $\beta = 1/2$.
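The gain-margin statement can be spot-checked on Example 4.5. The sketch below assumes the gains (4.5.29), for which $G_o(s) = k[sI - A]^{-1}b = 1/s^2 + \sqrt{3}/s$; the frequency grid, the list of gain factors and the helper names `loop_gain` and `closed_loop_poles` are illustrative assumptions, and a grid check is only a sanity test, not a proof.

```python
import cmath
import math

K1, K2 = 1.0, math.sqrt(3.0)   # feedback gains (4.5.29) of Example 4.5


def loop_gain(s):
    """G_o(s) = k [sI - A]^{-1} b for A = [[0, 1], [0, 0]] and b = [0, 1]'."""
    return K1 / s ** 2 + K2 / s


def closed_loop_poles(beta):
    """Roots of s^2 + beta*K2*s + beta*K1 = 0, i.e. the zeros of 1 + beta*G_o(s)."""
    disc = cmath.sqrt((beta * K2) ** 2 - 4.0 * beta * K1)
    return (-beta * K2 + disc) / 2.0, (-beta * K2 - disc) / 2.0


# Return-difference check (4.5.34) on a log-spaced grid from 1e-3 to 1e3 rad/s.
freqs = [10 ** (e / 20.0) for e in range(-60, 61)]
min_return_diff = min(abs(1.0 + loop_gain(1j * w)) for w in freqs)
print("smallest |1 + G_o(jw)| on the grid:", min_return_diff)

# Scale the loop gain by beta and inspect the closed-loop poles.
for beta in (0.5, 0.6, 1.0, 10.0, 1000.0):
    poles = closed_loop_poles(beta)
    print(f"beta={beta}: real parts = {[round(p.real, 4) for p in poles]}")
```

For this particular plant the closed loop happens to remain stable for every positive $\beta$; the bound $\beta > 1/2$ is the margin guaranteed for any LQR design, not the exact stability limit of a given example.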
Phase Margin
Let us first recall that the phase margin is the amount of phase shift in the CW direction (without affecting the gain) through which the Nyquist plot can be rotated about the origin so that the gain crossover (unit distance from the origin) passes through the $-1 + j0$ point. Simply, it is the amount by which the Nyquist plot can be rotated CW to make the system unstable. Consider a point P at unit distance from the origin on the Nyquist plot (see Figure 4.12). Since we know that the Nyquist plot of an optimal regulator must avoid the unit circle centered at $-1 + j0$, the set of points which are at unit distance from the origin and lying on the Nyquist diagram of an optimal regulator are constrained to lie on the portion marked X on the circumference of the circle with unit radius and centered at the origin as shown in Figure 4.13. Here, we notice that the smallest angle through which one of the admissible points A on the Nyquist plot (on the circumference of the circle centered at the origin) could be shifted in a CW direction to reach the $-1 + j0$ point is 60 degrees. Thus, the closed-loop optimal system or LQR system has a phase margin of at least 60 degrees.

[Figure 4.13: Intersection of Unit Circles Centered at Origin and $-1 + j0$]
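As a rough numerical illustration of the 60-degree bound, the following sketch computes the phase margin of the loop in Example 4.5, assuming the gains (4.5.29) so that $G_o(j\omega) = -1/\omega^2 - j\sqrt{3}/\omega$. The bisection bracket and iteration count are arbitrary choices made here, and `loop_gain` is a hypothetical helper name.

```python
import cmath
import math

K1, K2 = 1.0, math.sqrt(3.0)   # feedback gains (4.5.29) of Example 4.5


def loop_gain(w):
    """G_o(jw) = k [jwI - A]^{-1} b for the double integrator of Example 4.5."""
    s = 1j * w
    return K1 / s ** 2 + K2 / s


# |G_o(jw)| decreases monotonically with w here, so bisection finds the
# gain crossover frequency where |G_o(jw)| = 1.
lo, hi = 0.1, 100.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if abs(loop_gain(mid)) > 1.0:
        lo = mid
    else:
        hi = mid
w_c = 0.5 * (lo + hi)

# Phase margin: angular distance of G_o(j w_c) from the -180 degree direction.
phase_deg = math.degrees(cmath.phase(loop_gain(w_c)))
phase_margin_deg = 180.0 + phase_deg
print(f"gain crossover w_c = {w_c:.4f} rad/s, phase margin = {phase_margin_deg:.2f} deg")
```

The crossover frequency satisfies $\omega^4 - 3\omega^2 - 1 = 0$, and the resulting margin is above the guaranteed 60 degrees, as the text predicts.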
4.6 Problems
1. Make reasonable assumptions wherever necessary.
2. Use MATLAB© wherever possible to solve the problems and plot all the optimal controls and states for all problems. Provide the relevant MATLAB© m files.
Problem 4.1 A second order plant

$$\dot{x}_1(t) = x_2(t), \quad \dot{x}_2(t) = -2x_1(t) - 3x_2(t) + u(t), \quad y(t) = x(t)$$

is to be controlled to minimize a performance index and to keep the state $x_1(t)$ close to a ramp function $2t$. The final time $t_f$ is specified, the final state $x(t_f)$ is free and the admissible controls and states are unbounded. Formulate the performance index, obtain the feedback control law and plot all the time histories of Riccati coefficients, optimal states and control.

Problem 4.2 A second order plant

$$\dot{x}_1(t) = x_2(t), \quad \dot{x}_2(t) = -2x_1(t) - 4x_2(t) + 0.5u(t), \quad y(t) = x(t)$$

is to be controlled to minimize the performance index

The final time $t_f$ is specified, the final state $x(t_f)$ is fixed and the admissible controls and states are unbounded. Obtain the feedback control law and plot all the time histories of inverse Riccati coefficients, optimal states and control.
Problem 4.3 For a linear, time-varying system (3.2.48) given as

$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \quad y(t) = C(t)x(t)$$

with a general cost functional (3.2.49) as

$$J = \frac{1}{2}x'(t_f)F(t_f)x(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \begin{bmatrix} x'(t) & u'(t) \end{bmatrix}\begin{bmatrix} Q(t) & S(t) \\ S'(t) & R(t) \end{bmatrix}\begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt,$$

where the various vectors and matrices are defined in Chapter 3, formulate a tracking problem and obtain the results similar to those obtained in Chapter 4.
Problem 4.4 Using the frequency-domain results, determine the optimal feedback coefficients and the closed-loop optimal control for the multi-input, multi-output system

$$\dot{x}(t) = [\,02\ \ 31\,]\,x(t) + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}u(t)$$

and the cost function

$$J = \int_{0}^{\infty} \left[4x_1^2(t) + 4x_2^2(t) + 0.5u_1^2(t) + u_2^2(t)\right] dt.$$
Problem 4.5 For the D.C. motor speed control system described in Problem 1.1, find the closed-loop optimal control to track the speed at a particular value.

Problem 4.6 For the liquid-level control system described in Problem 1.2, find the closed-loop optimal control to track the liquid level along a ramp function $0.25t$.

Problem 4.7 For the mechanical control system described in Problem 1.4, find the closed-loop optimal control to track the system along (i) a constant value and (ii) a ramp function.

Problem 4.8 For the chemical control system described in Problem 1.6, find the closed-loop optimal control to track the system along (i) a constant value and (ii) a ramp function.
Chapter 5
Discrete Time Optimal Control Systems
In previous chapters, the optimal control of continuous-time systems has been presented. In this chapter, the optimal control of discrete-time systems is presented. We start with the basic calculus of variations and then touch upon all the topics discussed in the previous chapters with respect to the continuous-time systems such as open-loop optimal control, linear quadratic regulator system, tracking system, etc. It is suggested that the student review the material in Appendices A and B given at the end of the book. This chapter is inspired by [84, 89, 120]¹.
5.1 Variational Calculus for Discrete-Time Systems

In earlier chapters, we discussed the optimal control of continuous-time systems described by differential equations. There, we minimized cost functionals which are essentially integrals of scalar functions. Now, we know that discrete-time systems are characterized by difference equations, and we focus on minimizing the cost functionals which are summations of some scalar functions.
¹The permission given by John Wiley for F. L. Lewis, Optimal Control, John Wiley & Sons, Inc., New York, NY, 1986, is hereby acknowledged.
5.1.1 Extremization of a Functional
In this section, we obtain the necessary conditions for optimization of cost functionals which are summations such as

$$J(x(k_0), k_0) = J = \sum_{k=k_0}^{k_f-1} V(x(k), x(k+1), k) \tag{5.1.1}$$

where the discrete instant $k = k_0, k_1, \ldots, k_f - 1$. Note the following points.

1. For a given interval $k \in [k_0, k_f]$ and a given function $V(x(k), x(k+1), k)$, the summation interval in (5.1.1) needs to be $[k_0, k_f - 1]$.
2. We consider first a scalar case for simplicity and then we generalize for the vector case.
3. We are also given the initial condition $x(k = k_0) = x(k_0)$.
4. Consider the case of a free-final point system, such that $k_f$ is fixed or specified and $x(k_f)$ is free or unspecified.
5. Also, if $T$ is the sampling period, then $x(k) = x(kT)$.
6. Let us note that if we are directly considering the discrete-time version of the continuous-time cost functionals (such as (2.3.1) addressed in Chapter 2), we have the sampling period $T$ multiplying the cost functional (5.1.1).

For extremization (maximization or minimization) of functionals, analogous to the case of continuous-time systems addressed in Chapter 2, we use the fundamental theorem of the calculus of variations (CoV) which states that the first variation must be equal to zero. The methodology for this simple case of optimization of a functional is carried out briefly under the following steps.

• Step 1: Variations
• Step 2: Increment
• Step 3: First Variation
• Step 4: Euler-Lagrange Equation
• Step 5: Boundary Conditions
Now consider these items in detail.
• Step 1: Variations: We first let $x(k)$ and $x(k+1)$ take on variations $\delta x(k)$ and $\delta x(k+1)$ from their optimal values $x^*(k)$ and $x^*(k+1)$, respectively, such that

$$x(k) = x^*(k) + \delta x(k); \quad x(k+1) = x^*(k+1) + \delta x(k+1). \tag{5.1.2}$$

Now with these variations, the performance index (5.1.1) becomes

$$J^* = J(x^*(k_0), k_0) = \sum_{k=k_0}^{k_f-1} V(x^*(k), x^*(k+1), k), \tag{5.1.3}$$

$$J = J(x(k_0), k_0) = \sum_{k=k_0}^{k_f-1} V\left(x^*(k) + \delta x(k),\; x^*(k+1) + \delta x(k+1),\; k\right). \tag{5.1.4}$$
• Step 2: Increment: The increment of the functionals defined by (5.1.3) and (5.1.4) is defined as

$$\Delta J = J - J^*. \tag{5.1.5}$$
• Step 3: First Variation: The first variation $\delta J$ is the first order approximation of the increment $\Delta J$. Thus, using the Taylor series expansion of (5.1.4) along with (5.1.3), we have

$$\delta J = \sum_{k=k_0}^{k_f-1} \left[\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)}\,\delta x(k) + \frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k+1)}\,\delta x(k+1)\right]. \tag{5.1.6}$$
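Now, in order to express the coefficient of $\delta x(k+1)$ also in terms of $\delta x(k)$, consider the second expression in (5.1.6):

$$\begin{aligned}
\sum_{k=k_0}^{k_f-1} \frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k+1)}\,\delta x(k+1)
={}& \frac{\partial V(x^*(k_0), x^*(k_0+1), k_0)}{\partial x^*(k_0+1)}\,\delta x(k_0+1) \\
&+ \frac{\partial V(x^*(k_0+1), x^*(k_0+2), k_0+1)}{\partial x^*(k_0+2)}\,\delta x(k_0+2) \\
&+ \cdots \\
&+ \frac{\partial V(x^*(k_f-2), x^*(k_f-1), k_f-2)}{\partial x^*(k_f-1)}\,\delta x(k_f-1) \\
&+ \frac{\partial V(x^*(k_f-1), x^*(k_f), k_f-1)}{\partial x^*(k_f)}\,\delta x(k_f) \\
&+ \frac{\partial V(x^*(k_0-1), x^*(k_0), k_0-1)}{\partial x^*(k_0)}\,\delta x(k_0) \\
&- \frac{\partial V(x^*(k_0-1), x^*(k_0), k_0-1)}{\partial x^*(k_0)}\,\delta x(k_0)
\end{aligned} \tag{5.1.7}$$

where the last two terms in (5.1.7) are added without affecting the rest of the equation. The entire equation (5.1.7) (except the last term and the third-from-last term) is rewritten as

$$\begin{aligned}
\sum_{k=k_0}^{k_f-1} \frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k+1)}\,\delta x(k+1)
={}& \sum_{k=k_0}^{k_f-1} \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\,\delta x(k) \\
&+ \frac{\partial V(x^*(k_f-1), x^*(k_f), k_f-1)}{\partial x^*(k_f)}\,\delta x(k_f) - \frac{\partial V(x^*(k_0-1), x^*(k_0), k_0-1)}{\partial x^*(k_0)}\,\delta x(k_0) \\
={}& \sum_{k=k_0}^{k_f-1} \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\,\delta x(k) + \left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\,\delta x(k)\right]_{k=k_0}^{k=k_f}.
\end{aligned} \tag{5.1.8}$$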
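The index shift that takes (5.1.7) to (5.1.8) is a purely algebraic identity and can be checked on arbitrary numbers. In the sketch below, `g[k]` stands for the derivative $\partial V(x^*(k), x^*(k+1), k)/\partial x^*(k+1)$ and `dx[k]` for the variation $\delta x(k)$; both are filled with random values, and the function name and the interval $k_0 = 2$, $k_f = 9$ are assumptions made only for this illustration.

```python
import random


def shifted_sum_identity(g, dx, k0, kf):
    """Return (lhs, rhs) of the index shift used to obtain (5.1.8).

    g[k]  : stands for dV(x*(k), x*(k+1), k) / dx*(k+1), needed for k0-1 <= k <= kf-1
    dx[k] : stands for the variation delta x(k), needed for k0 <= k <= kf
    h(k) = g[k-1] is the same derivative taken one step behind.
    """
    lhs = sum(g[k] * dx[k + 1] for k in range(k0, kf))

    def h(k):
        return g[k - 1]

    rhs = (sum(h(k) * dx[k] for k in range(k0, kf))
           + h(kf) * dx[kf]      # boundary term at k = kf
           - h(k0) * dx[k0])     # boundary term at k = k0
    return lhs, rhs


random.seed(0)
k0, kf = 2, 9
g = {k: random.uniform(-1.0, 1.0) for k in range(k0 - 1, kf)}
dx = {k: random.uniform(-1.0, 1.0) for k in range(k0, kf + 1)}
lhs, rhs = shifted_sum_identity(g, dx, k0, kf)
print("difference between the two sides:", abs(lhs - rhs))
```

The two sides agree up to floating-point rounding for any choice of the sequences, which is why the two boundary terms can be separated from the sum without assuming anything about $V$.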
Substituting (5.1.8) in (5.1.6) and noting that the first variation should be zero, we have

$$\sum_{k=k_0}^{k_f-1} \left[\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)} + \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\right]\delta x(k) + \left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\,\delta x(k)\right]_{k=k_0}^{k=k_f} = 0. \tag{5.1.9}$$

• Step 4: Euler-Lagrange Equation: For (5.1.9) to be satisfied for arbitrary variations $\delta x(k)$, we have the condition that the coefficient of $\delta x(k)$ in the first term in (5.1.9) be zero. That is

$$\frac{\partial V(x^*(k), x^*(k+1), k)}{\partial x^*(k)} + \frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)} = 0. \tag{5.1.10}$$

This may very well be called the discrete-time version of the Euler-Lagrange (EL) equation.

• Step 5: Boundary Conditions: The boundary or transversality condition is obtained by setting the second term in (5.1.9) equal to zero. That is

$$\left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\,\delta x(k)\right]_{k=k_0}^{k=k_f} = 0. \tag{5.1.11}$$

Now, we discuss two important cases:

1. For a fixed-end point system, we have the boundary conditions $x(k_0)$ and $x(k_f)$ fixed and hence $\delta x(k_0) = \delta x(k_f) = 0$. The additional (or derived) boundary condition (5.1.11) does not exist.
2. For a free-final point system, we are given the initial condition $x(k_0)$ and hence $\delta x(k_0) = 0$ in (5.1.11). Next, at the final point, $k_f$ is specified, and $x(k_f)$ is not specified or is free.
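Thus, the coefficient of $\delta x(k)$ at $k = k_f$ is zero in the condition (5.1.11), which reduces to

$$\left[\frac{\partial V(x^*(k-1), x^*(k), k-1)}{\partial x^*(k)}\right]_{k=k_f} = 0. \tag{5.1.12}$$

Let us note that the necessary condition (5.1.10) and the associated boundary or transversality condition (5.1.12) are derived for the scalar function $x(k)$ only. Thus, consider a functional which is the vector version of the scalar functional (5.1.1) as

$$J(\mathbf{x}(k_0), k_0) = J = \sum_{k=k_0}^{k_f-1} V(\mathbf{x}(k), \mathbf{x}(k+1), k). \tag{5.1.13}$$

The corresponding Euler-Lagrange equation and transversality condition are, respectively, as

$$\frac{\partial V(\mathbf{x}^*(k), \mathbf{x}^*(k+1), k)}{\partial \mathbf{x}^*(k)} + \frac{\partial V(\mathbf{x}^*(k-1), \mathbf{x}^*(k), k-1)}{\partial \mathbf{x}^*(k)} = 0 \tag{5.1.14}$$

and

$$\left[\frac{\partial V(\mathbf{x}^*(k-1), \mathbf{x}^*(k), k-1)}{\partial \mathbf{x}^*(k)}\right]_{k=k_f} = 0. \tag{5.1.15}$$

Note in the Euler-Lagrange equation (5.1.10) or (5.1.14),

1. the first term involves taking the partial derivative of the given function $V(x^*(k), x^*(k+1), k)$ w.r.t. $x(k)$, and
2. the second term considers taking the partial derivative of $V(x^*(k-1), x^*(k), k-1)$ (one step behind) w.r.t. the same $x(k)$.

The second function $V(x^*(k-1), x^*(k), k-1)$ is obtained from the given function $V(x^*(k), x^*(k+1), k)$ by replacing $k$ by $k-1$. Also, compare the previous results with the corresponding results for continuous-time systems in Chapter 2.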