
Springer Texts in Electrical Engineering
Consulting Editor: John B. Thomas

Springer Texts in Electrical Engineering

Multivariable Feedback Systems
F.M. Callier/C.A. Desoer

Linear Programming
M. Sakarovitch

Introduction to Random Processes
E. Wong

Stochastic Processes in Engineering Systems
E. Wong/B. Hajek

Introduction to Probability
J.B. Thomas

Elements of Detection and Signal Design
C.L. Weber

An Introduction to Communication Theory and Systems
J.B. Thomas

Signal Detection in Non-Gaussian Noise
S.A. Kassam

An Introduction to Signal Detection and Estimation
H.V. Poor

Introduction to Shannon Sampling and Interpolation Theory
R.J. Marks II

Random Point Processes in Time and Space
D.L. Snyder/M.I. Miller

Linear System Theory
F.M. Callier/C.A. Desoer
Frank M. Callier    Charles A. Desoer

Linear System Theory

With 54 Illustrations

Springer Science+Business Media, LLC


Frank M. Callier
Department of Mathematics
Facultés Universitaires Notre-Dame de la Paix
Rempart de la Vierge, 8
B-5000 Namur, Belgium

Charles A. Desoer
Department of Electrical Engineering and Computer Sciences
University of California
Berkeley, CA 94720
USA

Library of Congress Cataloging-in-Publication Data.


Callier, Frank M.
Linear system theory / Frank M. Callier, Charles A. Desoer.
p. cm. - (Springer texts in electrical engineering)
Includes bibliographical references and index.
ISBN 978-1-4612-6961-8 ISBN 978-1-4612-0957-7 (eBook)
DOI 10.1007/978-1-4612-0957-7
1. Control theory. 2. System analysis. I. Desoer, Charles A.
II. Title. III. Series.
QA402.3.C325 1991
629.8'312-dc20 91-20992

Printed on acid-free paper.

© 1991 Springer Science+Business Media New York


Originally published by Springer-Verlag New York, Inc. in 1991
Softcover reprint of the hardcover 1st edition 1991
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher, Springer Science+Business Media, LLC,
except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if
the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by
anyone.

Camera-ready copy prepared by the authors.

9 8 7 6 5 4 3 2 (Corrected second printing 1994)

ISBN 978-1-4612-6961-8
PREFACE

This book is the result of our teaching over the years an undergraduate course on
Linear Optimal Systems to applied mathematicians and a first-year graduate course on
Linear Systems to engineers. The contents of the book bear the strong influence of the
great advances in the field and of its enormous literature. However, we made no
attempt at complete coverage.
Our motivation was to write a book on linear systems that covers finite-dimensional
linear systems, always keeping in mind the main purpose of engineering and applied
science, which is to analyze, design, and improve the performance of physical systems.
Hence we discuss the effect of small nonlinearities and of perturbations on the data;
we face robustness issues and discuss the properties of feedback. It is our hope that
the book will be a useful reference for a first-year graduate student.
We assume that a typical reader with an engineering background will have gone
through the conventional undergraduate single-input single-output linear systems
course; an elementary course in control is not indispensable but may be useful for
motivation. For readers from a mathematical curriculum we require only familiarity
with techniques of linear algebra and of ordinary differential equations.
The purpose of this book is to provide a systematic and rigorous access to a) the
main topics of linear state-space system theory in both the continuous-time case and
the discrete-time case, and b) the I/O description of linear systems. The main thrusts
of the book are: analysis of system descriptions and derivation of their properties,
LQ-optimal control, state feedback and state estimation, and a short study of MIMO
unity-feedback systems. We cover both continuous-time and discrete-time systems; in
most cases, the discrete-time case is covered in an isomorphic chapter, e.g. Chapter 2d
is the discrete-time coverage of Chapter 2.
The contents of the book can be described roughly as follows (for a topic-by-topic
description, see the table of contents). As an introduction, Chapter 1 discusses the
relation between physical systems, their models, and their mathematical representations.
It also raises the issues of sensitivity and robustness. Chapters 2 through 4 describe
and structure the trajectories of a generic class of linear system representations in
both the time-varying and time-invariant cases. A treatment of finite-horizon
LQ-optimization is included. Based on the results of these chapters, Chapter 5
discusses general system concepts and the four-way classification of systems: linear
versus nonlinear, time-invariant versus time-varying. Chapter 6 derives the relation
between a continuous-time system and its related discrete-time (approximate)
equivalent obtained as a result of the A/D and D/A conversions. Chapter 7 covers
stability: the three main types of stability for linear systems, namely, I/O stability and
the two state-related stability concepts, i.e., asymptotic stability and, above all,
exponential stability. Chapter 8 focuses on the coupling of the input to the state, i.e.
controllability, and that of the state to the output, i.e. observability. Related concepts
such as stabilizability and detectability are also covered. Chapter 9 covers (time-invariant)
minimal realizations and the McMillan degree; the (A,B) controllable canonical form is
used to reduce multi-input controllability to single-input controllability. Chapter 10
covers the main results of state feedback and state estimation, such as spectral
assignability and stabilization by constant state feedback or output injection; it ends
with a treatment of infinite-horizon LQ-optimal state feedback. Chapter 11 uses the
results of previous chapters to analyze and derive the main properties of MIMO
time-invariant unity-feedback systems: closed-loop stability (MIMO Nyquist criterion),
robustness (modeling errors and exogenous perturbations, uncertainty-bandwidth
limitations, structured perturbations), and set-point regulation. Four appendices provide
mathematical background material on linear algebra, maps, norms, differential equations,
and Laplace and z-transforms. Many results derived in these appendices are specifically
referred to in the body of the book.
The following features may be worth noting:
• the stress on precision and rigor, balanced by the knowledge that the engineering
world is not perfectly known (modeling errors, perturbations of various kinds, ...);
• nonlinear perturbations are discussed from time to time;
• computational considerations are raised at appropriate points, e.g. a) the
computational well-posedness of certain results, and b) the directional sensitivity of
some solutions;
• an introduction (in Chapter 2) to the general methodology of dynamic optimization
(computation of directional derivatives through adjoint equations) leading to a
rigorous handling of LQ optimization in Chapters 2 and 10;
• the systematic approach to duality by a pairing lemma (in Chapter 2), both for
optimization and for coupling control and observation issues;
• the care devoted to certain delicate aspects of discrete-time systems (e.g. duality,
reachability versus controllability to zero);
• the efficient derivation of basic results on unity-feedback MIMO systems (in
Chapter 11);
• in the time-invariant case, the geometric aspects of many results are emphasized
for their intuitive appeal.
Acknowledgments.
It is with gratitude that we acknowledge the continued support of the National
Science Foundation, the National Aeronautics and Space Administration, the Belgian
Fonds National de la Recherche Scientifique, the Department of Mathematics of the
Facultés Universitaires de Namur, and the Department of EECS of the University of
California, Berkeley. We owe a great debt to many colleagues and students; we
learned a lot from them. They asked questions, pointed to problems, and forced us to
improve our work. To list all these people is a hopeless task; we can only thank them
collectively. We do so because we believe that advances in science and engineering
are very much a collective endeavor.

Special thanks are due to our wives Nicole and Jackie for their strong support
over the years. We thank Oswaldo Garcia for drawing the figures. Finally, Bettye
Fuller deserves special thanks for handling a difficult manuscript.

Summer 1990

Frank M. Callier
Namur, Belgium

Charles A. Desoer
Berkeley, California
NOTE TO THE READER

Reference numbers are used to number items such as results, definitions, statements,
and formulas. These reference numbers restart at the beginning of each section, listed
with two symbols, e.g. Section 2.1 (Fundamental Properties of R(·)) or Appendix A.3
(Linear Spaces). When referring to an item inside a section we use bare reference
numbers, e.g. (55). When referring to an item outside a section, we use three symbols,
e.g. (2.2.44) indicates item (44) of Section 2.2.
The index at the end of the book is preceded by a) a list of Mathematical Symbols
(each followed by a brief definition), and b) a list of often-used Abbreviations.
CONTENTS

PREFACE .......... v

NOTE TO THE READER .......... viii

CHAPTER 1 INTRODUCTION .......... 1
1.1 Science and Engineering .......... 1
1.2 Physical Systems, Models, and Representations .......... 2
1.3 Robustness .......... 3

CHAPTER 2 THE SYSTEM REPRESENTATION R(·) = [A(·),B(·),C(·),D(·)] .......... 5
2.1 Fundamental Properties of R(·) .......... 5
2.1.1 Definitions .......... 5
2.1.2 Structure of R(·) .......... 6
2.1.3 State Transition Matrix .......... 10
2.1.4 State Transition Map and Response Map .......... 17
2.1.5 Impulse Response Matrix .......... 22
2.1.6 Adjoint Equations .......... 25
2.1.7 Linear-Quadratic Optimization .......... 29
2.2 Applications .......... 40
2.2.1 Variational Equation .......... 40
2.2.2 Control Correction Example .......... 44
2.2.3 Optimization Example .......... 48
2.2.4 Periodically Varying Differential Equations .......... 51

CHAPTER 2d THE DISCRETE-TIME SYSTEM REPRESENTATION Rd(·) = [A(·),B(·),C(·),D(·)] .......... 55
2d.1 Fundamental Properties of Rd(·) .......... 58
2d.2 Application: Periodically Varying Recursion Equations .......... 66

CHAPTER 3 THE SYSTEM REPRESENTATION R = [A,B,C,D], Part I .......... 68
3.1 Preliminaries .......... 68
3.2 General Properties of R = [A,B,C,D] .......... 70
3.2.1 Definition .......... 70
3.2.2 State Transition Matrix .......... 70
3.2.3 The State Transition and Response Map of R .......... 76
3.3 Properties of R when A has a Basis of Eigenvectors .......... 79

CHAPTER 3d THE DISCRETE-TIME SYSTEM REPRESENTATION Rd = [A,B,C,D] .......... 95
3d.1 Preliminaries .......... 95
3d.2 General Properties of Rd .......... 95
3d.3 Properties of Rd when A has a Basis of Eigenvectors .......... 100

CHAPTER 4 THE SYSTEM REPRESENTATION R = [A,B,C,D], Part II .......... 103
4.1 Preliminaries .......... 103
4.2 Minimal Polynomial .......... 107
4.3 Decomposition Theorem .......... 110
4.4 The Decomposition of a Linear Map .......... 117
4.5 Jordan Form .......... 122
4.6 Function of a Matrix .......... 127
4.7 Spectral Mapping Theorem .......... 135
4.8 The Linear Map X ↦ AX + XB .......... 138

CHAPTER 5 GENERAL SYSTEM CONCEPTS .......... 140
5.1 Dynamical Systems .......... 140
5.2 Time-Invariant Dynamical Systems .......... 150
5.3 Linear Dynamical Systems .......... 151
5.4 Equivalence .......... 152

CHAPTER 6 SAMPLED DATA SYSTEMS .......... 160
6.1 Relation Between L- and z-Transforms .......... 160
6.2 D/A Converter .......... 166
6.3 A/D Converter .......... 167
6.4 Sampled-Data System .......... 168
6.5 Example .......... 171

CHAPTER 7 STABILITY .......... 173
7.1 I/O Stability .......... 173
7.2 State Related Stability Concepts and Applications .......... 180
7.2.1 Stability of ẋ = A(t)x .......... 180
7.2.2 Bounded Trajectories and Regulation .......... 190
7.2.3 Response to T-Periodic Inputs .......... 193
7.2.4 Periodically Varying System with Periodic Input .......... 196
7.2.5 Slightly Nonlinear Systems .......... 197

CHAPTER 7d STABILITY: THE DISCRETE-TIME CASE .......... 204
7d.1 I/O Stability .......... 204
7d.2 State Related Stability Concepts .......... 211
7d.2.1 Stability of x(k+1) = A(k)x(k) .......... 211
7d.2.2 Bounded Trajectories and Regulation .......... 217
7d.2.3 Response to q-Periodic Inputs .......... 220

CHAPTER 8 CONTROLLABILITY AND OBSERVABILITY .......... 222
Introduction .......... 222
8.1 Controllability and Observability of Dynamical Systems .......... 222
8.2 Controllability of the Pair (A(·),B(·)) .......... 226
8.2.1 Controllability of the Pair (A(·),B(·)) .......... 226
8.2.2 The Cost of Control .......... 229
8.2.3 Stabilization by Linear State Feedback .......... 231
8.3 Observability of the Pair (C(·),A(·)) .......... 233
8.4 Duality .......... 235
8.5 Linear Time-Invariant Systems .......... 239
8.5.1 Observability Properties of the Pair (C,A) .......... 240
8.5.2 Controllability of the Pair (A,B) .......... 243
8.6 Kalman Decomposition Theorem .......... 247
8.7 Hidden Modes, Stabilizability, and Detectability .......... 252
8.8 Balanced Representations .......... 260
8.9 Robustness of Controllability .......... 262

CHAPTER 8d CONTROLLABILITY AND OBSERVABILITY: THE DISCRETE-TIME CASE .......... 265
8d.1 Controllability and Observability of Dynamical Systems .......... 265
8d.2 Reachability and Controllability of the Pair (A(·),B(·)) .......... 265
8d.2.1 Controllability of the Pair (A(·),B(·)) .......... 265
8d.2.2 The Cost of Control .......... 270
8d.3 Observability of the Pair (C(·),A(·)) .......... 271
8d.4 Duality .......... 275
8d.5 Linear Time-Invariant Systems .......... 279
8d.5.1 Observability of the Pair (C,A) .......... 281
8d.5.2 Reachability and Controllability of the Pair (A,B) .......... 283
8d.6 Kalman Decomposition Theorem .......... 292
8d.7 Stabilizability and Detectability .......... 292

CHAPTER 9 REALIZATION THEORY .......... 295
9.1 Minimal Realizations .......... 295
9.2 Controllable Canonical Form .......... 306

CHAPTER 10 LINEAR STATE FEEDBACK AND ESTIMATION .......... 315
10.1 Linear State Feedback .......... 315
10.2 Linear Output Injection and State Estimation .......... 323
10.3 State Feedback of the Estimated State .......... 328
10.4 Infinite Horizon Linear Quadratic Optimization .......... 330
10d.4 Infinite Horizon Linear Quadratic Optimization: The Discrete-Time Case .......... 346

CHAPTER 11 UNITY FEEDBACK SYSTEMS .......... 356
11.1 The Feedback System Σc .......... 357
11.1.1 State Space Analysis .......... 357
11.1.2 Special Case: R1 and R2 have no Unstable Hidden Modes .......... 364
11.1.3 The Discrete-Time Case .......... 367
11.2 Nyquist Criterion .......... 368
11.2.1 The Nyquist Criterion .......... 368
11.2.2 Remarks on the Nyquist Criterion .......... 370
11.2.3 Proof of Nyquist Criterion .......... 372
11.2.4 The Discrete-Time Case .......... 374
11.3 Robustness .......... 374
11.3.1 Robustness With Respect to Plant Perturbations .......... 375
11.3.2 Robustness With Respect to Exogenous Disturbances .......... 376
11.3.3 Robust Regulation .......... 377
11.3.4 Bandwidth-Robustness Tradeoff .......... 379
11.3.5 The Discrete-Time Case .......... 383
11.4 Kharitonov's Theorem .......... 383
11.4.1 Hurwitz Polynomials .......... 384
11.4.2 Kharitonov's Theorem .......... 384
11.5 Robust Stability Under Structured Perturbations .......... 388
11.5.1 General Robustness Theorem .......... 389
11.5.2 Special Case: Affine Maps and Convexity .......... 391
11.5.3 The Discrete-Time Case .......... 392
11.6 Stability Under Arbitrary Additive Plant Perturbations .......... 393
11.7 Transmission Zeros .......... 396
11.7.1 Single-Input Single-Output Case .......... 396
11.7.2 Multi-Input Multi-Output Case: Assumptions and Definitions .......... 397
11.7.3 Characterization of the Zeros .......... 399
11.7.4 Application to Unity Feedback Systems .......... 401

APPENDIX A LINEAR MAPS AND MATRIX ANALYSIS .......... 403
A.1 Preliminary Notions .......... 403
A.2 Rings and Fields .......... 405
A.3 Linear Spaces .......... 409
A.4 Linear Maps .......... 415
A.5 Matrix Representation .......... 419
A.5.1 The Concept of Matrix Representation .......... 419
A.5.2 Matrix Representation and Change of Basis .......... 423
A.5.3 Range and Null Space: Rank and Nullity .......... 426
A.5.4 Echelon Forms of a Matrix .......... 429
A.6 Normed Linear Spaces .......... 434
A.6.1 Norms .......... 434
A.6.2 Convergence .......... 437
A.6.3 Equivalent Norms .......... 438
A.6.4 The Lebesgue Spaces lᵖ and Lᵖ [Tay.1] .......... 440
A.6.5 Continuous Linear Transformations .......... 441
A.7 The Adjoint of a Linear Map .......... 447
A.7.1 Inner Products .......... 448
A.7.2 Adjoints of Continuous Linear Maps .......... 452
A.7.3 Properties of the Adjoint .......... 456
A.7.4 The Finite Rank Operator Fundamental Lemma .......... 457
A.7.5 Singular Value Decomposition (SVD) .......... 459

APPENDIX B DIFFERENTIAL EQUATIONS .......... 469
B.1 Existence and Uniqueness of Solutions .......... 469
B.1.1 Assumptions .......... 469
B.1.2 Fundamental Theorem .......... 470
B.1.3 Construction of a Solution by Iteration .......... 471
B.1.4 The Bellman-Gronwall Inequality .......... 475
B.1.5 Uniqueness .......... 476
B.2 Initial Conditions and Parameter Perturbations .......... 477
B.3 Geometric Interpretation and Numerical Calculations .......... 480

APPENDIX C LAPLACE TRANSFORMS .......... 482
C.1 Definition of the Laplace Transform .......... 482
C.2 Properties of Laplace Transforms .......... 484

APPENDIX D THE z-TRANSFORM .......... 488
D.1 Definition of the z-Transform .......... 488
D.2 Properties of the z-Transform .......... 489

REFERENCES .......... 492

ABBREVIATIONS .......... 498

MATHEMATICAL SYMBOLS .......... 499

SUBJECT INDEX .......... 504


CHAPTER 1

INTRODUCTION

In this very brief introduction we emphasize some aspects of the difference


between the hard sciences and engineering; also we discuss heuristically the relation
between physical systems, their models and their mathematical representations: the
main message is summarized by Fig. 1.1.

1.1. Science and Engineering


It is a common experience to see Science and Engineering being confused by
both students and the media. For our purposes we consider exclusively the hard
experimental sciences,† e.g. physics, chemistry, biochemistry, and biophysics. These
hard sciences are characterized by the fact that their laws and their predictions can be
tested by experiments performed over wide ranges of parameter variations. The purpose
of the hard sciences is to explore nature, make new discoveries (e.g. new elements,
new stars, new phenomena such as superconductivity or the Josephson effect, new
molecules such as DNA, new reactions, etc.) and predict the result of new experiments.
It is the astounding predictive power of the hard sciences which makes them so useful
to humanity.
The goal of engineering is to create new devices and systems to serve human
purposes; engineering does so by making maximum use of the hard sciences. But
engineering, although based on the hard sciences, is constantly nurtured and constantly
driven by inventions. To support this view we list below four fields of engineering
that are firmly grounded on the hard sciences and whose triumphs are mostly the
results of inventions.
1. Computers: the programmed computer, programming languages; the very large
integrated circuit, the disc memory, the laser printer, the ink-jet printer, etc.
2. Communications: the telephone and its world-wide network; television and its
many broadcasting systems; telecommunication satellites; computer communication
networks with their fiber-optic links; etc.
3. Control systems: the autopilots of airliners, automatic landing systems, the
guidance systems of satellites, robots and flexible manufacturing cells, controllers that
govern chemical processes, etc.
4. Electric power systems: the main components of such systems, nuclear reactors,
boilers, dams, turbines, generators, transformers, transmission lines, etc., are
human inventions.

† Sociology is a science but not a hard science because in sociology it is impossible
not only to perform experiments but also to vary at will the several parameters
involved in the observations. Clearly, this remark also holds for economics, political
science, etc.


Thus engineers invent new devices, invent ways to manufacture them cheaply and
reliably, and interconnect them into useful systems. It is for this last step that it is
important to study the properties of engineering systems, and delineate their capabilities
and their limitations. As a result of these studies, engineers can 1) specify the
features the system must have in order to accomplish the required tasks, 2) use the
analysis methods to design the system.
Let us now turn to the following question: on the one hand, engineers design,
build, and operate physical systems and, on the other hand, the main tool of system
theory is mathematics. The relation between these two activities is the subject of the
next section.

1.2. Physical Systems, Models, and Representations


In engineering, the object of interest is the physical system under consideration.
It may be a simple device (e.g. a transistor, op amp, or motor, ...) or a complicated
interconnection of devices (e.g. a computer, a communication system, an electric
power system, an airplane, a satellite, a chemical plant, ...). For the purpose of
analysis and design, a physical system is replaced by a drastically simpler model. The
model ignores many of the attributes of the physical system but retains only those
attributes that are deemed crucial to the problem under study. Thus, in the study of a
given physical system, one may use several different models depending on the problem
being studied. (See Fig. 1.1.)
For example, in order to calculate the thrust and fuel expenditure required to
launch a satellite, one may model it as a particle (specified by its mass m, position r,
and velocity v) subjected to the thrust of its motors and the aerodynamic forces due to
its motion through the atmosphere. Once the satellite is in orbit, in studying the
pointing of its antenna towards a fixed point on earth one might model the satellite as a
rigid body moving about its center of mass. Also, if the satellite is large and if the
maneuvers must be carried out quickly one might have to model it as a flexible
structure and take into account a number of its elastic vibration modes. Note that there is
absolutely no requirement that the model resemble in some sense the physical
system†: the only requirement is that, for the problem at hand, the model delivers useful
predictions at reasonable cost.
In conclusion, for a given physical system, we may use different models; the
choice of model is dictated by the problem at hand and the model is chosen as a result
of both theoretical considerations and experiments, as well as cost and precision
requirements.
Given a model, there are several ways of writing the equations of the model: we
say that the model has several mathematical representations. For example, for a given
mechanical system we may use Newton's equations or Lagrange's equations or
Hamilton's equations; we may also choose different systems of coordinates. Similarly,
for electrical circuits we may use node equations or modified node equations or tableau
equations.

† For example, a very large flexible satellite may for certain problems be usefully
modeled as a particle. Or, an integrated circuit, part of a digital signal processor, may
be modeled as a set of binary operations performed on streams of "zeros" and "ones."
Having picked a mathematical representation for the chosen model, the next step
is to analyze the model, that is, find its properties, its capabilities, and its limitations;
these three goals are crucial to engineering design. The study of these questions is the
task of system theory.
System theory is part of engineering in the same way that theoretical physics is
part of physics. In both cases, system theory and theoretical physics use mathematical
tools to study the main models of engineering and physics, respectively. These
mathematical tools give predictive power to engineering; thus the merits of the various
design alternatives may be sorted out before investing labor and materials in building
any one of them.
Of course, in the study of specific engineering systems one uses computer simulation.
The test of the validity of the whole procedure is the agreement between the
simulation results and the measurements performed on the physical system. This
whole process is summarized in Fig. 1.1.


Fig. 1.1. A physical system and its relation to its models, their mathematical
representations, and the measurements.

So the purpose of these notes is to study a number of representations of linear
models of physical systems that have been chosen for their proven wide applicability.
Our goal is to study their properties, their capabilities, and their limitations.

1.3. Robustness
Robustness is an important concern in system design; we briefly discuss two
aspects of the robustness problem.
The first one arises because every physical system is designed on the basis of a
necessarily approximate model. Then the question arises: will the physical system
have a performance that is sufficiently close to the performance predicted by the
(idealized) model? For example, if the linear model is exponentially stable, or
controllable, etc., will the physical system, which, say, is in fact a small (possibly
nonlinear) perturbation of the linear model, still be exponentially stable or controllable,
etc.?
The second aspect of robustness appears in the following manner: the design is
based on the (idealized) model and it specifies a number of nominal properties of the
system to be built, for example, physical dimensions, geometric configuration,
composition of materials, etc. Once produced and operating in the field, the physical
system differs from the nominal design mainly because of environmental conditions --
temperature, humidity, wear, radiation, etc. -- and manufacturing deviations -- the
manufactured physical components are only approximations of the nominal in terms of
physical dimensions, configuration, composition, etc. So the question is: what is the
effect of all such deviations from nominal on the performance of the physical system
in the field? Such questions can be studied by calculating the effect on system
performance due to changes in design parameters and exogenous disturbances. It is a fact
that some designs, which perform nominally perfectly, are totally inadequate in the
field because they are too sensitive to small perturbations in some of their parameters.
In the course of these notes, we'll discuss from time to time these questions of
robustness.
CHAPTER 2

THE SYSTEM REPRESENTATION R(·) = [A(·), B(·), C(·), D(·)]

We present here the basic properties of a standard linear differential system
representation that is either given or the result of a linearization. The latter, as well as
some optimization and periodic systems, is also discussed.
This chapter is devoted to general properties. The more specialized techniques
and results that pertain to the time-invariant case are covered in the next two chapters.
We use here some mathematical tools of normed linear spaces and differential
equations that are available in the appendices; see especially Sections A.3 to A.7 of
Appendix A, and Appendix B.

2.1. Fundamental Properties of R(·)


We start by giving definitions and the structural properties of the representation.
We will then consider adjoint equations.

2.1.1. Definitions

We study here dynamical systems represented by the following equations

1  ẋ(t) = A(t)x(t) + B(t)u(t)

2  y(t) = C(t)x(t) + D(t)u(t)

where

3  (as is usual) the state x(t) ∈ ℝⁿ, the input u(t) ∈ ℝ^{n_i}, and the output y(t) ∈ ℝ^{n_o};

4  A(·), B(·), C(·), D(·) are matrix-valued functions on ℝ₊ of class PC; more precisely,
they are, respectively, n×n, n×n_i, n_o×n, and n_o×n_i matrices whose elements
are known piecewise continuous, real-valued functions defined on ℝ₊;

5  the input function u(·) ∈ U, where

U := {u(·) | u(·): ℝ₊ → ℝ^{n_i}, u(·) is piecewise continuous} =: PC(ℝ₊,ℝ^{n_i}).

These assumptions will hold throughout this chapter.

For short, we refer to (1)-(5) as the system representation
R(·) = [A(·),B(·),C(·),D(·)]. Since a given physical system may generate many
models and representations, one must be careful to distinguish the properties of a
model and the properties of some of its representations.

6 Remarks. α) In (1)-(5) above, the time interval of observation τ is chosen to
be τ = ℝ₊ = [0,∞); without loss of generality τ may also read τ = ℝ = (−∞,∞),
τ = [t₀,∞), or τ = [t₀,t₁] (a bounded interval).

β) The theory below is also valid for complex-valued
x(·), y(·), u(·), A(·), B(·), C(·), D(·). Therefore we shall write x*(·), y*(·), etc., instead of
x′(·), y′(·), etc. (Hermitian transpose instead of transpose).

γ) Since A(·), B(·), C(·), D(·) are functions of class PC, they are bounded on
bounded intervals.

δ) System representation R(·), described by (1)-(5), is often called a linear
differential system; since the state x(t) ∈ ℝⁿ, R(·) is said to be of dimension n.

ε) Equation (1) is called the state differential equation (d.e.) and equation (2) is
called the read-out equation.
called the read-out equation.
Later on we shall see that at any time t the state x(t) and output y(t) of
representation R(·) are functions of t (actual time), t₀ (initial time), x₀ (initial state), and the
input u(·) ∈ U.† Hence for all t ∈ ℝ₊ we shall have the functional relations

7  x(t) = s(t,t₀,x₀,u)

8  y(t) = p(t,t₀,x₀,u)

which are called the state transition map and the response (map) of R(·), respectively.
Their structural properties will be studied next.

† Later we shall see that the dependence on u(·) (for t ≥ t₀) is actually on u[t₀,t]
(the values of u(·) during the time interval [t₀,t] ⊂ ℝ₊).

2.1.2. Structure of R(·)

We start by studying the state transition map s(t,t₀,x₀,u), (7), as the unique
solution of the state d.e. (1)

ẋ(t) = A(t)x(t) + B(t)u(t)

s.t.

11  for some (t₀,x₀) ∈ ℝ₊ × ℝⁿ, x(t₀) = x₀, and u(·) ∈ U,

and where, by the assumptions above, A(·), B(·) and u(·) are piecewise continuous
functions. Notice that under these conditions (1) and (11) reduce to a d.e.

12  ẋ(t) = p(x(t),t)   t ∈ ℝ₊

where the RHS is a given function p(·,·):
ℝⁿ × ℝ₊ → ℝⁿ : (x,t) ↦ p(x,t) = A(t)x + B(t)u(t), which satisfies the conditions (B.1.3)

and (B.1.4) of the existence and uniqueness theorem (B.1.6). Indeed if D ⊂ ℝ₊ denotes
the union of the sets of discontinuity points of A(·), B(·) and u(·), then D has at most a
finite number of points per unit interval. Moreover:

a) For all fixed x ∈ ℝⁿ, the function t ∈ ℝ₊\D ↦ p(x,t) ∈ ℝⁿ is continuous, and for
every τ ∈ D the left-hand and right-hand limits p(x,τ−) and p(x,τ+), resp., are finite
vectors in ℝⁿ.

b) There is a piecewise continuous function k(·) = ‖A(·)‖ s.t.

13  ‖p(x,t) − p(x′,t)‖ = ‖A(t)(x − x′)‖ ≤ k(t)‖x − x′‖   ∀ t ∈ ℝ₊, ∀ x, x′ ∈ ℝⁿ

(global Lipschitz condition in x).


Therefore by Theorem (B.1.6), differential equation (12) has a unique continuous
solution x(·): ℝ₊ → ℝⁿ, which is clearly defined by the parameters (11), i.e.
(t₀,x₀,u) ∈ ℝ₊ × ℝⁿ × U. Hence, recalling the state transition map (7), we have

14 Theorem [Existence of the state transition map]. Under the assumptions and
notation above, for every triple (t₀,x₀,u) ∈ ℝ₊ × ℝⁿ × U, the state transition map

s(·,t₀,x₀,u): ℝ₊ → ℝⁿ : t ↦ x(t) = s(t,t₀,x₀,u)

is a continuous map, well defined as the unique solution of the state differential
equation (1); more precisely,

∀ t ∈ ℝ₊\D   ẋ(t) = A(t)x(t) + B(t)u(t)

where x(t₀) = x₀.

16 Remark. It follows that the state trajectory x(·) = s(·,t₀,x₀,u) is differentiable at
every t ∈ ℝ₊\D, i.e., except possibly at a finite number of discontinuity points per unit
interval due to discontinuities of A(·) or B(·) or u(·).
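As a minimal numerical sketch of Theorem (14) — assuming Python with NumPy and SciPy; the particular A(·), B(·), u(·), and x₀ below are illustrative choices, not data from the text — one may integrate the state d.e. with any standard ODE solver:

```python
# Sketch of the state transition map s(t, t0, x0, u) of Theorem (14):
# the unique solution of xdot = A(t)x + B(t)u(t); all data illustrative.
import numpy as np
from scipy.integrate import solve_ivp

A = lambda t: np.array([[0.0, 1.0], [-2.0, -3.0 - 0.5*np.sin(t)]])  # A(.) of class PC
B = lambda t: np.array([[0.0], [1.0]])                               # B(.)
u = lambda t: np.array([np.cos(t)])                                  # piecewise continuous input

def rhs(t, x):
    # p(x, t) = A(t) x + B(t) u(t): globally Lipschitz in x, cf. (13)
    return A(t) @ x + B(t) @ u(t)

t0, x0 = 0.0, np.array([1.0, 0.0])
sol = solve_ivp(rhs, (t0, 10.0), x0, dense_output=True, rtol=1e-9, atol=1e-12)
print(sol.sol(5.0))  # x(5) = s(5, t0, x0, u)
```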

17 Exercise [Matrix d.e.'s]. Consider the matrix d.e.

18  Ẋ(t) = A₁(t)X(t) + X(t)A₂(t)* + F(t)

where X(t) ∈ ℝ^{n×n} and A₁(·), A₂(·) and F(·) are of class PC(ℝ₊,ℝ^{n×n}), (A.3.6).
Show that, for all (t₀,X₀) ∈ ℝ₊ × ℝ^{n×n}, (18) has a unique solution X(·) ∈ C(ℝ₊,ℝ^{n×n}),
(A.3.7), s.t. X(t₀) = X₀.
(Hint: by "stacking columns" (18) is convertible into a vector d.e.

19  ẋ(t) = M(t)x(t) + f(t)   t ∈ ℝ₊

where x(t) ∈ ℝ^{n²}, M(·) ∈ PC(ℝ₊,ℝ^{n²×n²}), and f(·) ∈ PC(ℝ₊,ℝ^{n²}).)
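As a concrete illustration of the "stacking columns" hint (real case, so A₂* = A₂ᵀ), one can check numerically the standard vectorization identity vec(A₁X + XA₂ᵀ) = (I ⊗ A₁ + A₂ ⊗ I)·vec(X), with column-stacked vec; the sketch below, with illustrative random data, is ours and not part of the text:

```python
# "Stacking columns" of Exercise (17), real case: with x = vec(X)
# (columns stacked), vec(A1 X) = (I kron A1) x and vec(X A2^T) = (A2 kron I) x,
# so (18) becomes the vector d.e. (19) with M = I kron A1 + A2 kron I.
import numpy as np

n = 3
rng = np.random.default_rng(0)
A1, A2, X = rng.standard_normal((3, n, n))

lhs = A1 @ X + X @ A2.T                                   # RHS of (18), with F = 0
M = np.kron(np.eye(n), A1) + np.kron(A2, np.eye(n))
rhs = (M @ X.flatten(order="F")).reshape((n, n), order="F")
print(np.allclose(lhs, rhs))                              # True: both forms agree
```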


We continue our study of R(·) by substituting our state transition map (7) into the
read-out equation (2). This shall define the response function (8) of R(·). Moreover
(1) and (2) have RHSs linear in (x,u), so linearity properties can be expected.

22 Theorem [The structure of R(·)]. Let C(ℝ₊,ℝⁿ) and PC(ℝ₊,ℝ^{n_o}) denote the
linear spaces of continuous functions from ℝ₊ into ℝⁿ and, resp., piecewise continuous
functions from ℝ₊ into ℝ^{n_o}. Consider any representation R(·) described by (1)-(5),
where for every input u(·) ∈ U, D ⊂ ℝ₊ denotes the union of the sets of discontinuity
points of A(·), B(·), and u(·).
U.t.c. the state transition map s, given by (7), and the response map p, given by (8),
have the following structure:

a) For every triple (t₀,x₀,u) ∈ ℝ₊ × ℝⁿ × U

23  s(·,t₀,x₀,u) ∈ C(ℝ₊,ℝⁿ), with x(·) differentiable at all t ∈ ℝ₊\D, and

24  p(·,t₀,x₀,u) ∈ PC(ℝ₊,ℝ^{n_o});

b) [Linearity in (x₀,u)]: for every pair (t,t₀) ∈ ℝ₊ × ℝ₊, for all scalars α₁, α₂ ∈ ℝ
(resp. ℂ), and for all pairs (x₀₁,u₁), (x₀₂,u₂) ∈ ℝⁿ × U,

25  s(t,t₀, α₁x₀₁+α₂x₀₂, α₁u₁+α₂u₂) = α₁s(t,t₀,x₀₁,u₁) + α₂s(t,t₀,x₀₂,u₂),

26  p(t,t₀, α₁x₀₁+α₂x₀₂, α₁u₁+α₂u₂) = α₁p(t,t₀,x₀₁,u₁) + α₂p(t,t₀,x₀₂,u₂);

c) [Additive property]: for every quadruple (t,t₀,x₀,u) ∈ ℝ₊ × ℝ₊ × ℝⁿ × U

27  s(t,t₀,x₀,u) = s(t,t₀,x₀,θᵤ) + s(t,t₀,θ,u),

28  p(t,t₀,x₀,u) = p(t,t₀,x₀,θᵤ) + p(t,t₀,θ,u),

where θᵤ is the zero element of the input space U given by (5) and θ is the zero state.

29 Comments. α) Property a) follows easily: (23) follows immediately by
Theorem (14) and Remark (16); for obtaining (24), substitute x(·) = s(·,t₀,x₀,u) into the
read-out equation (2): hence

30  y(·) = C(·)s(·,t₀,x₀,u) + D(·)u(·),

where all functions on the RHS are at least piecewise continuous on ℝ₊; the latter
clearly defines the response map p(·,t₀,x₀,u) ∈ PC(ℝ₊,ℝ^{n_o}).

β) Property c) follows from property b) by setting α₁ = α₂ = 1, (x₀₁,u₁) = (x₀,θᵤ),
and (x₀₂,u₂) = (θ,u).

γ) The partial maps s(t,t₀,x₀,θᵤ) and p(t,t₀,x₀,θᵤ) are called the zero-input (z-i) state
transition map, resp. the z-i response; because of property b) they have the property
that, for fixed (t,t₀) ∈ ℝ₊ × ℝ₊, the maps

x₀ ↦ s(t,t₀,x₀,θᵤ)   and   x₀ ↦ p(t,t₀,x₀,θᵤ)

are linear; hence by matrix representation Theorem (A.5.3) they are representable by
matrices. Therefore there exists a matrix Φ(t,t₀) ∈ ℝ^{n×n} s.t.

31  s(t,t₀,x₀,θᵤ) = Φ(t,t₀)x₀

and, by read-out equation (2), we obtain the matrix representation

32  p(t,t₀,x₀,θᵤ) = C(t)Φ(t,t₀)x₀;

the matrix function (t,t₀) ∈ ℝ₊ × ℝ₊ ↦ Φ(t,t₀) ∈ ℝ^{n×n} is called the state transition
matrix (map) and will be studied below.

δ) The partial maps s(t,t₀,θ,u) and p(t,t₀,θ,u) are called the zero-state (z-s) state
transition map, resp. the z-s response. Because of property b) they are linear in
u(·) ∈ U; the benefits of this will be discussed later.

ε) The additive property c) means that both the state transition map and the response
can be calculated by adding their z-i and z-s contributions.

33 Proof of Theorem (22). By Comments (29.α) and (29.β), property a) holds and
property c) follows from property b). So we have to show property b), i.e. linearity
relations (25) and (26).
(25): Call the LHS of (25) x(t) and the RHS α₁x₁(t) + α₂x₂(t), with xᵢ(t) := s(t,t₀,x₀ᵢ,uᵢ),
i = 1,2. Observe that by Theorem (14), x(·) and the xᵢ(·) for i = 1,2 are the unique
solutions of the d.e. (1) for the given triples (t₀, α₁x₀₁+α₂x₀₂, α₁u₁+α₂u₂) and
(t₀,x₀ᵢ,uᵢ) (i = 1,2), resp. Therefore the latter xᵢ(·) can also be combined s.t.
α₁x₁(·)+α₂x₂(·) is a solution of (1) for the former given triple. Since the solution is
unique, (25) follows.
(26) is obtained by combining (30) and (25). (Note that the read-out equation (2) is
linear in (x,u).) ∎
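The linearity relation (25) can also be checked numerically: integrate the d.e. once with the combined data and compare with the combination of the separate solutions. The sketch below uses illustrative data (A, B, u₁, u₂, x₀₁, x₀₂ are our choices, not from the text):

```python
# Numerical check of the superposition relation (25):
# s(t,t0, a1*x01+a2*x02, a1*u1+a2*u2) = a1*s(t,t0,x01,u1) + a2*s(t,t0,x02,u2).
import numpy as np
from scipy.integrate import solve_ivp

A = lambda t: np.array([[0.0, 1.0], [-1.0, -0.2*np.cos(t)]])
B = lambda t: np.array([[0.0], [1.0]])

def s(t, t0, x0, u):
    f = lambda tau, x: A(tau) @ x + B(tau) @ u(tau)
    return solve_ivp(f, (t0, t), x0, rtol=1e-10, atol=1e-12).y[:, -1]

u1 = lambda t: np.array([np.sin(t)])
u2 = lambda t: np.array([1.0])
x01, x02, a1, a2 = np.array([1.0, 0.0]), np.array([0.0, 2.0]), 3.0, -0.5

lhs = s(5.0, 0.0, a1*x01 + a2*x02, lambda t: a1*u1(t) + a2*u2(t))
rhs = a1*s(5.0, 0.0, x01, u1) + a2*s(5.0, 0.0, x02, u2)
print(np.allclose(lhs, rhs, atol=1e-8))  # True up to integration tolerance
```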

34 Exercise [Linear matrix d.e.'s, cf. Exercise (17)]. Denote by X(·,t₀,X₀,F) the
unique continuous solution of matrix d.e. (18) for a given triple
(t₀,X₀,F) ∈ ℝ₊ × ℝ^{n×n} × PC(ℝ₊,ℝ^{n×n}) s.t. X(t₀) = X₀.

a) [Linearity]. Show that ∀ (t,t₀) ∈ ℝ₊ × ℝ₊, ∀ X₀₁, X₀₂ ∈ ℝ^{n×n},
∀ F₁(·), F₂(·) ∈ PC(ℝ₊,ℝ^{n×n}), ∀ α₁, α₂ ∈ ℝ,

X(t,t₀, α₁X₀₁+α₂X₀₂, α₁F₁+α₂F₂) = α₁X(t,t₀,X₀₁,F₁) + α₂X(t,t₀,X₀₂,F₂).

b) [Additivity]. Show that

X(t,t₀,X₀,F) = X(t,t₀,X₀,0) + X(t,t₀,0,F),

where 0 denotes the zero matrix.

2.1.3. State Transition Matrix

Let us consider again the system representation R(·), described by (1)-(5). As
noted in Comment (29.γ) the z-i state transition map has the matrix representation:

∀ t,t₀ ∈ ℝ₊  ∀ x₀ ∈ ℝⁿ   s(t,t₀,x₀,θᵤ) = Φ(t,t₀)x₀,

which defines the matrix function

Φ(·,·): ℝ₊ × ℝ₊ → ℝ^{n×n} : (t,t₀) ↦ Φ(t,t₀).

Φ(·,·) is called the state transition matrix. Therefore by setting x₀ = eᵢ (the ith
standard unit vector), we get (using Theorem (14)) that, for all t₀ ∈ ℝ₊ and for all
i = 1,…,n, x(·) = Φ(·,t₀)eᵢ is, for x(t₀) = eᵢ, the unique continuous solution of the
homogeneous linear d.e.

36  ẋ(t) = A(t)x(t)   a.a. t ∈ ℝ₊.

Hence, since Φ(·,t₀)eᵢ is the ith column of the matrix Φ(·,t₀), we have

37 Fact. ∀ t₀ ∈ ℝ₊, Φ(·,t₀): ℝ₊ → ℝ^{n×n} is uniquely defined as the unique
continuous solution X(·): ℝ₊ → ℝ^{n×n} of the homogeneous linear matrix d.e.†

38  Ẋ(t) = A(t)X(t)   a.a. t ∈ ℝ₊

s.t.

39  X(t₀) = I,

or equivalently,

40  (∂/∂t)Φ(t,t₀) = A(t)Φ(t,t₀)   a.a. t ∈ ℝ₊,

s.t.

41  Φ(t₀,t₀) = I.

† a.a. t means "almost all t." This is to remind ourselves that the LHS of (36) is
not defined for those t where A(·) is discontinuous.

42 Comment. If X(·) is a solution of (38) s.t. det X(t) ≠ 0 ∀ t ∈ ℝ₊, then it is
called a fundamental matrix of (36). The following shows that ∀ t₀ ∈ ℝ₊, Φ(·,t₀) is
a fundamental matrix of (36).

43 Property. Consider the d.e. (38). If there exists a t₀ s.t. det X(t₀) ≠ 0, then
det X(t) ≠ 0 ∀ t ∈ ℝ₊.

Proof. By contradiction. Suppose there exists a τ ≠ t₀ s.t. det X(τ) = 0, or
equivalently X(τ) ∈ ℝ^{n×n} is singular. Then there exists a nonzero vector k ∈ ℝⁿ s.t.
X(τ)k = θ. Let x(t) := X(t)k. Then, by (38), ẋ(t) = A(t)x(t); moreover x(τ) = θ.
Therefore, since (36) has a unique solution s.t. x(τ) = θ, x(t) = θ ∀ t. In particular
x(t₀) = X(t₀)k = θ, whence det X(t₀) = 0: a contradiction. ∎

44 Comment [Geometric consequences]. Formula (31) shows that under free
motion (u(t) = θ ∀ t), x₀ is mapped into x(t) = Φ(t,t₀)x₀. Since the mapping
x₀ ↦ Φ(t,t₀)x₀ is linear, any convex set S of initial states at t₀--say, a ball, an n-cube,
or an ellipsoid--is mapped into a convex set. In particular if S is an n-dimensional
polytope (an n-dimensional geometric figure bounded by hyperplanes), then Φ(t,t₀)S is
also a polytope. This property does not hold for nonlinear systems ẋ(t) = f(x,t). ∎

45 Exercise. Calculate the state transition matrix for

(1) A(t) = [ −1   2 ]
           [ −3   0 ] ;

(2) A(t) = [ −1   3 ]
           [  ·   · ] ;

(3) A(t) = [ −2t   0 ]
           [  1   −1 ] ;

(4) A(t) = [ ·  · ]
           [ ·  · ] ;

(5) A(t) = [  0      ω(t) ]
           [ −ω(t)   0    ]

(Hint: verify that

Φ(t,0) = [  cos θ(t)   sin θ(t) ]
         [ −sin θ(t)   cos θ(t) ] ,

where θ(t) = ∫₀ᵗ ω(τ)dτ);

(6) A(t) = [ ·  · ]
           [ ·  · ] ,  where a ∈ ℝ;

(7) A(t) = [ −1 + a cos²t        1 − a cos t sin t ]
           [ −1 − a sin t cos t   −1 + a sin²t     ] .

Verify that

Φ(t,0) = [  e^{(a−1)t} cos t   e^{−t} sin t ]
         [ −e^{(a−1)t} sin t   e^{−t} cos t ] .

(Example due to Markus and Yamabe, 1960.)
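The hint in item (7) can be checked numerically by integrating Ẋ = A(t)X, X(0) = I, and comparing with the claimed Φ(t,0); the sketch below (ours, with one illustrative value of a) does exactly that:

```python
# Numerical check of Exercise 45(7) (Markus-Yamabe example): integrate
# Xdot = A(t) X with X(0) = I and compare with the claimed Phi(t,0).
import numpy as np
from scipy.integrate import solve_ivp

a = 1.5  # illustrative value of the parameter

def A(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[-1 + a*c*c,  1 - a*c*s],
                     [-1 - a*s*c, -1 + a*s*s]])

def phi_claimed(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[ np.exp((a-1)*t)*c, np.exp(-t)*s],
                     [-np.exp((a-1)*t)*s, np.exp(-t)*c]])

rhs = lambda t, x: (A(t) @ x.reshape(2, 2)).flatten()
sol = solve_ivp(rhs, (0.0, 4.0), np.eye(2).flatten(), rtol=1e-11, atol=1e-13)
print(np.allclose(sol.y[:, -1].reshape(2, 2), phi_claimed(4.0)))  # True
```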

46 Exercise. Let M(·), N(·), P(·) be C¹-functions, (A.3.8), from ℝ₊ into ℝ^{n×n}.
Using standard calculus and linear algebra (note carefully the order of the factors),
show that

(d/dt)[M(t)N(t)] = Ṁ(t)N(t) + M(t)Ṅ(t)

(d/dt)[M(t)N(t)P(t)] = Ṁ(t)N(t)P(t) + M(t)Ṅ(t)P(t) + M(t)N(t)Ṗ(t).

Assuming that det M(t) ≠ 0 ∀ t and noting that M(t)·M(t)⁻¹ = I, verify that

(d/dt)[M(t)⁻¹] = −M(t)⁻¹·Ṁ(t)·M(t)⁻¹.

47 Exercise. Let t ↦ X(t) ∈ ℝ^{n×n} be any fundamental matrix of (36). Show that

(d/dt)[X(t)⁻¹] = −X(t)⁻¹·A(t).

Verify also

(∂/∂t₀)Φ(t,t₀) = −Φ(t,t₀)·A(t₀).

48 Exercise. Note that (40)-(41) are equivalent to the integral equation

Φ(t,t₀) = I + ∫_{t₀}^{t} A(τ)Φ(τ,t₀)dτ.

Therefore, using the Picard iterates, cf. Section (B.1.3), viz.

X₀(t) = I

X_{m+1}(t) = I + ∫_{t₀}^{t} A(τ)X_m(τ)dτ   m = 0,1,2,…,

prove the Peano-Baker formula:

Φ(t,t₀) = I + ∫_{t₀}^{t} A(τ₁)dτ₁ + ∫_{t₀}^{t} A(τ₁)[∫_{t₀}^{τ₁} A(τ₂)dτ₂]dτ₁
        + ∫_{t₀}^{t} A(τ₁)[∫_{t₀}^{τ₁} A(τ₂)[∫_{t₀}^{τ₂} A(τ₃)dτ₃]dτ₂]dτ₁ + ⋯
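A crude numerical sketch of these Picard iterates (ours, with an illustrative A(·) and a naive Riemann-sum quadrature, so accuracy is limited by the grid) shows the iteration converging to Φ(t₁,t₀):

```python
# Picard iterates X_{m+1}(t) = I + int_{t0}^t A(tau) X_m(tau) dtau,
# evaluated on a grid; X[-1] converges to Phi(t1, t0).
import numpy as np

A = lambda t: np.array([[0.0, 1.0], [-1.0, -t]])   # illustrative PC matrix function
t0, t1, N = 0.0, 1.0, 2000
ts = np.linspace(t0, t1, N)
dt = ts[1] - ts[0]
As = np.stack([A(t) for t in ts])

X = np.stack([np.eye(2)] * N)          # X_0(t) = I
for m in range(8):
    integrand = As @ X                  # pointwise A(t) X_m(t), batched matmul
    X = np.eye(2) + np.cumsum(integrand, axis=0) * dt

print(X[-1])  # approximates Phi(t1, t0)
```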

49 Exercise. Let A(·) be piecewise continuous on ℝ₊. Let A(t) and ∫_{t₀}^{t} A(τ)dτ
commute for all t ∈ ℝ₊. Show that:

a) For all k = 0,1,2,…

(d/dt)[∫_{t₀}^{t} A(τ)dτ]^{k+1} = (k+1)·[∫_{t₀}^{t} A(τ)dτ]^{k}·A(t) = (k+1)·A(t)·[∫_{t₀}^{t} A(τ)dτ]^{k};

b) Φ(t,t₀) = exp[∫_{t₀}^{t} A(τ)dτ]

(Hint: use a) in the Peano-Baker formula; moreover for any M ∈ ℝ^{n×n},

exp M = Σ_{k=0}^{∞} M^k/k!   where M⁰ := I);

c) (d/dt) exp[∫_{t₀}^{t} A(τ)dτ] = A(t)·exp[∫_{t₀}^{t} A(τ)dτ] = exp[∫_{t₀}^{t} A(τ)dτ]·A(t).

50 Exercise. Show that for all t ∈ ℝ₊, A(t) and ∫_{t₀}^{t} A(τ)dτ commute if one of the
following conditions holds:

(a) A(·) is constant (then Φ(t,t₀) = exp[A(t−t₀)]; see the numerical sketch below).

(b) A(t) = a(t)M, where a(·): ℝ₊ → ℝ and M is a constant matrix.

(c) A(t) = Σ_{i=1}^{k} aᵢ(t)Mᵢ, where aᵢ(·): ℝ₊ → ℝ and the Mᵢ's are constant matrices s.t.
MᵢMⱼ = MⱼMᵢ ∀ i,j.

(d) A(t) has a time-invariant basis of eigenvectors† spanning ℝⁿ.

† "Eigenvectors" may be replaced by "generalized eigenvectors" (cf. Jordan form).
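For case (a), the identity Φ(t,t₀) = exp[A(t−t₀)] can be checked against direct integration of (38); the following sketch uses SciPy's matrix exponential and illustrative data:

```python
# Case (a) of Exercise 50: for constant A, Phi(t,t0) = exp[A(t-t0)].
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
t0, t = 0.0, 1.7

Phi = expm(A * (t - t0))                           # closed form
rhs = lambda s, x: (A @ x.reshape(2, 2)).flatten() # Xdot = A X, X(t0) = I
num = solve_ivp(rhs, (t0, t), np.eye(2).flatten(),
                rtol=1e-11, atol=1e-13).y[:, -1].reshape(2, 2)
print(np.allclose(Phi, num))                        # True
```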

51 Exercise. Let A₁(·), A₂(·) be of class PC(ℝ₊,ℝ^{n×n}). Denote by Φᵢ the transition
matrix of ẋ = Aᵢ(t)x, (i = 1,2). Show that the solution of the matrix d.e.

Ẋ(t) = A₁(t)·X(t) + X(t)·A₂(t)*   X(t₀) = X₀

is

X(t) = Φ₁(t,t₀)·X₀·Φ₂(t,t₀)*.

[X(t) is the zero-input solution X(t,t₀,X₀,0) of Exercise (34).]

52 Exercise. Let X(·) and Y(·) be fundamental matrices of (36). Show that there
exists a nonsingular constant matrix C such that X(t) = Y(t)·C for all t.

55 Properties of the State Transition Matrix.

1. Φ(t,t₀) is uniquely defined for all t,t₀ ∈ ℝ₊.

2. The solution of ẋ(t) = A(t)x(t), x(t₀) = x₀, is given by x(t) = Φ(t,t₀)x₀.

3. [Composition]. For all t, t₀, t₁ in ℝ₊

56  Φ(t,t₀) = Φ(t,t₁)·Φ(t₁,t₀).

4. [Inverse]. Φ(t,t₀) is a nonsingular matrix for all t,t₀ ∈ ℝ₊ and

57  Φ(t,t₀)⁻¹ = Φ(t₀,t).

5. [Splitting]. If X(t) is any fundamental matrix of ẋ(t) = A(t)x(t), then for all
t,t₀ ∈ ℝ₊,

58  Φ(t,t₀) = X(t)·X(t₀)⁻¹,

hence

59  (t,t₀) ↦ Φ(t,t₀) is continuous on ℝ₊ × ℝ₊.

6. [Determinant].

60  det[Φ(t,t₀)] = exp ∫_{t₀}^{t} tr A(τ)dτ.

Proof.

1. Follows from Fact (37).

2. Has already been discussed and can be checked by direct substitution into
ẋ(t) = A(t)x(t), x(t₀) = x₀.

3. Call R(t) and L(t) the RHS and LHS, resp., of (56). Then

(d/dt)L(t) = A(t)L(t) with L(t₁) = Φ(t₁,t₀),

and

(d/dt)R(t) = A(t)R(t) with R(t₁) = Φ(t₁,t₀).

Hence, in (56), L(·) and R(·) satisfy the same d.e. and the same initial condition;
hence L(t) = R(t) ∀ t ∈ ℝ₊.

4. By Comment (42), for all t₀, Φ(·,t₀) is a fundamental matrix, and therefore for all
t,t₀, Φ(t,t₀) is nonsingular. Now set t = t₀ in (56):

I = Φ(t₀,t₀) = Φ(t₀,t₁)·Φ(t₁,t₀).

Hence (57) follows.

5. For (58) note that both sides satisfy the same d.e. and the same initial condition.
For (59) note that in (58) both factors are continuous. (Exercise:
X(τ)⁻¹ − X(τ₀)⁻¹ = X(τ)⁻¹·[X(τ₀) − X(τ)]·X(τ₀)⁻¹, ….)

6. Let Δ(t) := det Φ(t,t₀) and write for simplicity Φ(t) for Φ(t,t₀). Observe that

Φ(t+h) = Φ(t) + A(t)Φ(t)h + o(h)

where o(h)/h → 0 as h → 0 (cf. Taylor expansion). Therefore

Φ(t+h) = [I + hA(t)]Φ(t) + o(h),

whence

Δ(t+h) = det[I + hA(t)]·det Φ(t) + o(h)

or

Δ(t+h) − Δ(t) = (det[I + hA(t)] − 1)·Δ(t) + o(h).

Now, by calculation,

det[I + hA(t)] = 1 + h tr A(t) + o(h).

Hence, by combining,

Δ̇(t) = lim_{h→0} (1/h)[Δ(t+h) − Δ(t)] = tr A(t)·Δ(t),

where Δ(t₀) = 1. Therefore we obtain

Δ(t) = det Φ(t,t₀) = exp ∫_{t₀}^{t} tr A(τ)dτ. ∎
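The determinant property (60) lends itself to a quick numerical check: compute Φ(t,t₀) by integrating (38) and compare det Φ(t,t₀) with exp ∫ tr A. The data below are illustrative choices of ours:

```python
# Numerical illustration of (60): det Phi(t,t0) = exp( int_{t0}^t tr A(tau) dtau ).
import numpy as np
from scipy.integrate import solve_ivp, quad

A = lambda t: np.array([[np.sin(t), 1.0], [0.5, -2.0 + np.cos(t)]])
t0, t = 0.0, 3.0

rhs = lambda s, x: (A(s) @ x.reshape(2, 2)).flatten()
Phi = solve_ivp(rhs, (t0, t), np.eye(2).flatten(),
                rtol=1e-11, atol=1e-13).y[:, -1].reshape(2, 2)

lhs = np.linalg.det(Phi)
rhs_val = np.exp(quad(lambda s: np.trace(A(s)), t0, t)[0])
print(lhs, rhs_val)   # the two numbers agree to integration accuracy
```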

63 Exercise. Suppose that you have established (58). Show that (58) implies (56)
and (57).

64 Exercise. Suppose that every solution of ẋ(t) = A(t)x(t) for t₀ = 0 is bounded
on ℝ₊, or equivalently that t ↦ Φ(t,0) is bounded on ℝ₊. Show that Φ(t,0)⁻¹ is bounded
on ℝ₊ if and only if t ↦ ∫₀ᵗ tr[A(τ)]dτ is bounded from below on ℝ₊, or equivalently
there exists a constant k ∈ ℝ s.t.

∫₀ᵗ tr[A(τ)]dτ ≥ k   ∀ t ∈ ℝ₊.

(Hint: use (60) and Adj[Φ(t,0)] = [Φ(t,0)⁻¹]·det Φ(t,0).)

65 Exercise. Consider

(a)  ẋ(t) = [A(t) + B(t)]x(t),

where A(·) and B(·) are of class PC(ℝ₊,ℝ^{n×n}). Let Φ_A(t,t₀) be the transition matrix
of ẋ(t) = A(t)x(t). Let

M(t) := Φ_A(0,t)·B(t)·Φ_A(t,0).

Show that the transition matrix of (a) is of the form

Φ(t,t₀) = Φ_A(t,0)·Φ_M(t,t₀)·Φ_A(0,t₀),

where Φ_M denotes the transition matrix of ẋ(t) = M(t)x(t).
(Hint: use the substitution x(t) = Φ_A(t,0)y(t).)

66 Exercise. Let the constant matrix M ∈ ℝ^{n×n} commute with A(t), ∀ t ∈ ℝ₊;
then M commutes with Φ(t,t₀), ∀ t,t₀ ∈ ℝ₊.
(Hint: consider [MΦ(t,t₀) − Φ(t,t₀)M].)

2.1.4. State Transition Map and Response Map

We give immediately the main result.

70 Theorem [State transition and response maps]. Consider system representation
R(·) described by (1)-(5). Let Φ(t,t₀) be its state transition matrix (37). Let s and p
denote the state transition map and response map (specified by (7) and (8)).
U.t.c.
∀ (t,t₀) ∈ ℝ₊ × ℝ₊, ∀ x₀ ∈ ℝⁿ, ∀ u(·) ∈ U:

71  x(t) = s(t,t₀,x₀,u) = Φ(t,t₀)x₀ + ∫_{t₀}^{t} Φ(t,τ)B(τ)u(τ)dτ

         = s(t,t₀,x₀,θᵤ) + s(t,t₀,θ,u)

         = z-i transition + z-s transition ;

72  y(t) = p(t,t₀,x₀,u) = C(t)Φ(t,t₀)x₀ + C(t)∫_{t₀}^{t} Φ(t,τ)B(τ)u(τ)dτ + D(t)u(t)

         = p(t,t₀,x₀,θᵤ) + p(t,t₀,θ,u)

         = z-i response + z-s response .



73 Comments a) One recognizes the additive structure of sand p, predicted by
Theorem (22), see especially (27)-(28), where the z-i and z-s contributions are linear in
XOE Rn and u(·) E U, resp ..

For fixed (t,tO)' S.t. t to, sand p depend on U[to.l) t rather than u(·) (for t :s; to
upon u[I,Io)'

y) (72) is a consequence of the substitution of (71) in the read-out map, see (30);
moreover in (71) the first term on the RHS is known, see (31) where q, is defined by
Fact (37). Hence we are reduced to the

76 Proof of the expression of the z-s transition. We need to show that

x(t) = s(t,to,e,u» = f q,(t,'t)B('t)u('t)d't .


10

By Theorem (14) it is sufficient to prove that the RHS above denoted by z(t) satisfies
d.e. (1) S.t. x(to) =9, or equivalently z(') must satisfy the integral equation
t t
x(t) = f A('t)x('t)d't + f B('t)u('t)d't V' t E R+.
10 10
Now by Fact (37)
t

f
q,(t,'t)=I + A(cr)q,(cr,'t)da,
1

whence
19

77 z(t) := f
10

=ft B(t)u(t)dt + ft{tf A(a)<I>(a,t)dt }B(t)u(t)dt


10 10 't

= f B(t)u(t)dt + f A(t) {'ft <I>(t,a)B(a)u(a)da }dt


t t

10 10 10

t t

= f B(t)u(t)dt + f A(t)z(t)dt .
10 10
Q.E.D.

Note that the conversion of the double integral has been obtained by changing the
order of integration and subsequently replacing a by t and t by a. •

Comments. a) The proof above uses the modern integral equation point of view
with an application of Fubini's theorem ([Rud.l,p.l50]) for changing the order of
integration. (Its application is justified here by the fact that the integrand of the double
integral is PC on a bounded triangle of JR2; hence it is bounded there and the double
integral converges absolutely.) Notice that nowhere derivatives have been used: a pol-
icy that is important in distributed systems (semi-group systems [Cur.1] and stochastic
d.e.'s driven by white noise [Kwa.l]).

/3) A more classical proof uses differentiation t under the integral sign, e.g.
[Rud.3,pp.236-237] viz.
d af
f f(t,t)dt=f(t,t) + Iof :;-
t t
-d (t,t)dt.
tlo Qt

Applied to z(t), (77), one has:

t More precisely one should read "at almost all t e R+" (except at points of discon-
tinuity of (t,t».
20

d
f10 -:;-
t
z(t) = B(t)u(t) + <1>(t,t)B('t)uCt)dt
ut

= B(t)u(t) + A(t) f <1>(t,t)B(t)u(t)dt


10

= B(t)(t)u(t) + A(t)z(t) .

Since z(to)=e, obviously z(') satisfies d.e. (1) for x(tn)=e. Hence by the uniqueness
of the solution s(t,to,e,u) = z(t) for all t E 1R+. •

80 Heuristic derivation of the z-s transition. For simplicity let AO and BO be


continuous and suppose that the input uO on [to,t] is zero except for a small interval
[t,Hdt], (Fig. 2.1). The search for a solution of d.e. (I) with x(to)=e is then as fol-
lows. Since u(t)=e on [to, t) (Fig. 2.1),

81 x(t)=<1>(t,to)x(to) = 9.

During the interval [t, t+dt) we obtain from the d.e. (1) and (81)

82 x(Hdt) ::::< x(t) + [A(t)x(t) + B(t)u(t)]dt ,

::::< B(t)u(t)dt,

where the approximately equal sign " ::::< " reminds us that higher order terms in dt
have been neglected. Since u(t) = e on [1: + d1:,t] we finally obtain using (82)

x(t) = <1>(t, t+dt)x( t)

:::: <1>(t,Hdt)B(t)u(t)dt.

Since <1>(t;) is differentiable (Exercise (47», replacing <1>(t,Hdt) by <1>(t,t) will


u(e)
r--'1
I I

____ ________________ __
I I
I I

T T+dT t
Fig. 2.1. An elementary input.
21

introduce an error of second order in the RHS above. Hence

x(t) :::. cl>(t,t)B(t)u(t)dt.

Since for fixed (t,to), x(t) is linear in u(·), we are allowed to sum over all elementary
inputs of the previous form when u(·) is a step junction, creating the (integral) sum

x(t) ==. L q,(t,t)B(t)u(t)dt.


't

This is indeed the correct expression for


I

I
x(t) = s(t,to,e,u)= cl>(t,t)B(t)u(t)dt.
10

Notice that each elementary contribution B(t)u(t)dt at time t is mapped by the transi-
tion matrix q,(t,t) into an elementary (state) contribution q,(t,t)B(t)u(t)dt at time t.

83 Exercise. [Variation of constants approach for solving (1)]. Consider the d.e. (1)
for x(to)=xo where AO, BO, and uO are assumed to be continuous. Write
where IRn is an unknown function. Substitute in (1) and
obtain that Hence rederive the solution given by (71).

84 Exercise. [Linear matrix d.e.'s cf. (17),(34),(51)]. Consider the matrix d.e.

18 .
X(t)=A 1(t)X(t)+X(t)A2(t) +F(t)* tE R+

of Exercise (17), where


X(to)=XOE IRnxn.

a) Show that its solution reads


I

85 X(t) = q,l (t,to)X Ocl>2(t,to)* + I q,1(t,t)F(t)q,2(t,t)*d't,


to

where q,j is the transition matrix of x(t)= Aj(t)x(t), (i=I,2).


(Hint: use the variation of constants approach X(t)=q,1(t,to)M(t)q,2(t,to)*, (cf. (51»,
where M(') : R+ Rnxn.)

b) Verify the additive property (35).



We conclude this section with the following:
22

86 Fact. [State composition property). Let s denote the state transition map (7), of
any system representation R (-) given by (1)-(5).
V.l.c.
For every fixed u(·) E U,

88 Comments. a) For every fixed u(·) E U, the state x(t2) can also be obtained
by proceeding from any other state x(tl) := s(tl'to,xO,u) on the trajectory through
x(lo)=xo·

For fixed u(·) E U and x(lo) = xo, s(t,lo,x(lo),u) defines a two-parameter family of
maps

89 : x(to) x(t) = s(t.to.x(to),u).

With "0" denoting composition of maps [viz. (fog)(x)=f(g(x» for all x. see (Al.7)].
(87) reads

i.e.

So we have a property of composition of maps.

y) Of course, if u(-) = Su. then (87) is the composition property of state transition
matrices (56).

91 Proof. Set t2=t. xI := s(tl.tO.xO,u) and call1(t) (r(t»; the LHS (RHS resp.) of
the second equality of (87).
Now 10 satisfies d.e. (1) S.t. x(t\)=xl and so does r(·). Hence, by the uniqueness of
the solution of (1) for a fixed uO (see (14», we have I(t)=r(t) for all tE IR+. •

2.1.5. Impulse Response Matrix


We calculate the response of a system representation R (0), (1)-(5), to a Dirac unit
impulse at one of the inputs.
More precisely, let Ej E IR ni be the jth standard unit vector, let 0(' -1) denote the
Dirac delta function applied at time t E R+. Assume now that x(t-) = e and apply the
input u(t) = E/5(t-t) for t 2'.. t. i.e. just prior to the application time t the state is zero,
and at t the jth component of u(') is hit by a unit impulse while the others are kept
23

zero. U.l.c. by (71) the state t will display a jump at t such that x(t+) *- x(t-)=e
and Vt t

x(t) = s(t,t-,e,Ej' 8(· -t»


t

94 = J <l>(t,a)B(a)Ej' a(a-'t)da
't-

=<l>(t,t)B(t)Ej'

Moreover, by the read-out equation (2), the output shall be

95 yet) = p(t,t-,e,Er 8(· -t»

= [C(t)<l>(t,t)B(t) + D(t)8(t-'t)]Ej'

Hence, if j=I,2, ... ,ni (i.e. "the unit impulse is successively applied at all inputs"), then
we fonn the ni columns of an no x ni real valued matrix H(t,t). The matrix function

defined by

V t t
96 H(t,t) := { 0 V t< t

is called the impulse response matrix. Therefore by (95) it reads

C(t)<l>(I,t)B(t)+D(t)a(t-t) V t t
97 H(t,t) = { 0 'ltt<t.

In the single-input single-output case (no = nj = 1) it is called the impulse response


and denoted and defined by

p(t,t-,B,a( . -t»
98 h(t,t) := { 0 Vt<t.

t Taken to be right-continuous.
24

From the linearity in uO E U of the z-s response of R (.), (72), it follows by (97) that
the z-s response at time t due to the input uO applied at to reads

99 y(t) = p(t,lo,e,u)= fH(t,t)u(t)dt


to

100 Comment [I/O point of view). In certain applications for all initial times to
under consideration x(to) = e and R (-) becomes an I/O map described by the super-
position law (99). Note that it reads approximately

y(t) .::::: L H(t,t)u(t)dt,


't

i.e. each elementary input (Fig. 2.1), seen as a vector of impulses [u/t)dt 0(' -t)
at time t, is mapped by the impulse response matrix into an output contribution
H(t,'t)u(t)dt at time t t (Fig. 2.2). The output at t is the sum of the latter.

101 Exercise. For the time-invariant case R = [A,8,C,DJ. i.e. A,8,C,D are con-
stant matrices and <1>(t,lo) =exp[A(t-to)] (50).
a) Show that the impulse response matrix depends only on the elapsed time t-t. (It is
then customary to write H(t-t) instead of H(t,t); also w.l.g. t=O because H(t-t)
H(t,t) = H(t-t,O); therefore specification can be done by H(t) = H(t,O).)

b) The Laplace transform t of t H(t), t E R+, is given by

102 H(s) = C(sI-A)-1 B + D


where
CeAt8 + Do(t)
103 H(t) = { 0 Vt < 0 .

u(TldT

to T
Fig. 2.2. z-s response: contribution of u(t)dt to y(t).

t See Appendix C.
25

c) The z-s response, (99), where w.l.g. to =0 (see later) is the convolution of HO by
u('), i.e.

104 y(t) = p(t,O,a,u) = f H(t-'t)u(t)dt =: (H*u)(t) ,


o

or in terms of the Laplace transform.


A A A

105 y(s) =H(s)' u(s).

2.1.6. Adjoint Equations


In this section we use some standard notions of inner product spaces and adjoint
linear maps; for more information, see sections (A.7.1) and (A.7.2) of Appendix A.

108 Adjoint linear homogeneous d.e. 'so To the linear homogeneous d.e ..

109 x(t) = A(t)x(t) tE R.

where x(t) e R n and A(') e PC (R.,Rnxn), we associate the adjoint differential equa-
tion

110 Yc(t) = -A(t)* . x(t)

where A * is the Hermitian transpose of A (for A real this is simply the transpose of
A) and x(t) e lRn. Note that (Rn,(','» is a Hilbert space with inner product

{x,y)=x y *
Hence for any matrix A e R nxn

(y,Ax)=(A y,x) * 'v'x,ye lRn

is the defining relatio!! for A* e R nxn , the adjoint of A as a linear map (A.7.32).
Now, let «I>(t,to) and cl>(t,to) be transition matrices of (109) and (110) resp.; therefore
for all t,to e R.

111 x(t) = «I>(t.to)x(to) ,

112
-
x(to)=cI{to,t)x(t).
26

Note that (112) maps x(t) into x(to).

113 Fact. Under the preceding assumptions and notations,

114 (x(t),x(t» = (x(to),x(to» .

Proof. At a.a.t E 1R+, the derivative of the LHS of (114) is zero. To wit, by (109)-
(110),

-.! (x(t),x(t» = -(A(t)* . x(t),x(t» + (x(t),A(t) . x(t» = 0 .


dt

Now (x(·),x(·» is continuous and so must be constant.



115 Comment. Set t=t 1 and consider the linear maps

lJ' : IRfi 4 Rfi : x(to) 4 x(tl)=<1>(tl,to)X(to),

0/' : Rfi 4 Rfi : x(tl) 4 x(to) =

Note that (114) reads:

v x(to),x(tl) E IRfi
(x(tl),lJ'x(to» = (o/*X(tl)'X(to».

Hence is the adjoint of lJ'=<1>(tl,to) (see (A.7.32». Note that 0/' "sends
the future x(tl) into the past x(to) ."

117 Fact. Under the assumptions and notations above, V t,toE lR+

118

Proof. From (114), V x(t),x(to) E IRfi, we have by (Ill),

(x(to),x(to» = (x(t),x(t» = (x(t),<1>(t,to)x(to» = (<1>* (t,to)x(t),x(to» .


Hence
(x (to) - <1>* (t,to)x(t),x(to» = 0 V x(to) ERn

and since x(to) E IR fi is arbitrary, we have


27

x(to) = cp* (t,to)x(t)

For the last equality we used (112) and observe that at the outset x(t) was arbitrary.
Hence

* -
cP (t,to) = CP(to,t) .

So by interchanging to and t, (118) holds.



120 Dual-system representation. Consider any system representation
R 0= [A('),B('),C('),O(')] described (1)-(5). We call dual-system representation,
the system representation il(.)= [-A(') ,-c"'(·),B(·)* ,0(')*], described by

121 = A(t)*x(t) + C(t)*u(t)

122 yet) = B(t)*x(t) + O(t)*u(t)

where

123 x(t) ERn, U(t) E 1R1lo, yet) ERn; ,

124 the input function ii(') E ii := PC (R+,RIlo).


125 Comment. The dual input u(·) has dimension no (Le. of the output y(.»; simi-
larly the dual.9utput y has dimension nj (Le. of the input u(·». The dual system
repre§.entation R(·) is related to R (.) by the Pairing Lemma below: it reflects the fact
that RO is the adjoint of R (.) over any interval, (see Comment (129) below). Note
also that the dual of the dual system representation is the original system representa-
tion modulo a change of sign for the state.

126 Pairing Lemma.


-
Let R('), (121)-(124), be the dual system representation of
R 0, (1)-(5).
U.t.c.
'It to,tE R+, 'It (x(to),u('» E IRn x U , 'It (x(t),ii('» E IRn xU,
t

127 (X. (t),x(t» + J(ii ('t),y('t»d't = (x.(to),x(to» + J(YC't),u('t» d't

128 Proof. for simplicity we suppress in our notation all the time dependences.
Now from (121)-(122) we obtain, using (1)-(2),
28

O == x+A*-X+L - B*-x+ D*-u,U )


....11<-u,x ) + (-y+

== + (x,Ax+Bu) + (u,Cx+Du) - (y,u)

==

== :t (x,x)+(u,Y)-(Y,u).

Therefore, by integrating between to and t, (127) results.


...

129 Comment [Adjointsj. Consider an arbitrary bounded interval I of R+. Jo fix
ideas pick w.l.g. 1= [to,t11. Now note that the_input spaces of R(') and R(') on
[to,t 1], namely U[ro,ttl =PC ([ro,t}] , R"') and U[ro,ttl=PC([ro,td, R Ilo) are (by
(A.7.12» dense, (see (A.6.29», in L2([to,t}], ]Rn.) and L2([to,td, R Ilo), resp.; as usual,
L 2 denotes the Hilbert space of square integrable functions on [to,t11 with inner pro-
duct

(v,wh :== f (v('t),w('t» d't 'v' U,V E L2,


10

where CO) denotes the usual vector inner product (v('t),w('t» = v ('t)* w('t), (A.7.1O); for
simplicity we shall also write L2 [t o ,t 1j instead of L 2([Io,ttl, JRIlo) or L2([Io,td, JR"').
Now by the density above and continuous extension, (see e.g. Comment we
may w.l.g. replace the input- and output-spaces (of class PC) of R (-) and R(-) by
L 2[Io,td. Hence, on [1o,tIl, R (.) induces a well-defined linear map

sending an (initial slate, input) pair into a (final state, output) pair.
Similarly on [lo,ttl RO induces a well-defined map

p* : R"xL2[Io,tIl JR"xL2[Io,td : (x(t\),u('» (x(to),Y(-»

...
sending a (final state, input) pair into an (initial state, output) pair, (" P sends the
future into the past").
...
Note here that the domain and codomain of P and Pare JR n xL2[Io,td: this is a
product Hilbert space having the inner product (·,)lRn+ C·h. Hence from (127) (with
t=t 1) we have:
'v' (x(Io),iiO) E JRn xL2[Io,td = Domain of P ,

'v' (x(t]),U('»E JR" x L2[Io.td = Domain of p*,


29

or, with «.,'» the inner product of JR.n x L [to,td, 2

As a F * is the adjoint of F (see (A.7.3l): domain-image pairing): in


that sense R(') is the adjoint of R (-) over any interval. •

We now conclude this section with a simplified pairing important in dynamic


optimization.

134 Corollary [Pairing for y=x]. Under the conditions of Lemma (126) let C(')=I
and DO = O. Then one has

V to,t E R+, V (x(to),u('» E JR.n xU, V (x(t),ii(t» E R n xU

t t
135 (x(t),x(t» + f (ii('t),x('t» d't= (x(to),x(to» + f (B('ttx('t),u('t» d't
where
x(t) = A(t)x(t) + B(t)u(t)

-i(t) = A(ttx(t) + u(t)



Proof: Exercise.

2.1.7. Linear Quadratic Optimization


Consider the standard linear quadratic optimal control problem (standard LQ-
problem) described as follows.

136 We are given: a) a system representation R 0 = [AO, B('), 1,0], (1)-(5), on


[to,tIl, thus
xCt) = A(t)x(t) + B(t)u(t) t E [to,td c R+ ,

where the initial state Xo E R n is given, i.e.


30

b) a quadratic cost functional

138 1(uO) ,= {! [ II ulO II' + [I q')xl') 1I']d. + ,(I .)'S,(.,) }

where

139 C(.) e PC (1R+,1RIIo) and S e 1Rnxn s.t. S = s* o,t

and 11·11 denotes the Euclidean nonn, (A.6.1). It follows by (71) that for fixed Xo the
cost J is a functional of uO, more precisely it reads
J(·):PC([to,tl],1Rnl) R:u(·) J(uC·».

Standard LQ.problem: minimize J(.) over all possible inputs u(·) of class pC,tt
thus solve

140 min (J(u(·» : u(·) e PC } .



141 Exercise. Let x e Rn and show that
II x+/)x 112 = II x 112 + 2(x,/)x) + ,,/)x ,,2.
142 Comments. a) In (136) the output of R (.) is the state; moreover A(·), B(·)
are defined on 1R+ and so is C(·) in (139); hence to and tl can be moved if necessary.

(3) In the expression of the cost J(.), (138), C(·)x(·) is the weighted state-trajectory
to be penalized and S is the final-state penalty-matrix; ( II C(·)xO 112 reads also
IICOx(·)1I 2=(x(·),QOx(·» with QO=C(·)*C(·) CO and S depend on the
problem at hand.

y) In many cases one uses a weighting matrix RO to penalize the input uO in the

cost J(.), namely, a tenn J(u(t),R(t)u(t) )dt is present (instead of JII u(t) " 2dt), where
10 10
R(·) = R(.)* > 0 is a matrix-valued function of class PC (which is positive definite
'It). However by introducing U(·)=R I12 (·)u(·) and (where R II2 (.)
is any square root of class PC of R(·» we have

t S is Hennitian positive semidefinite.


ttPor brevity PC := PCC[to,td,Rn,) in this section.
31

JII f(t) 11 2dt = J(u(t),R(t)u(t)} dt


10 10
and
x(t) = A(t)x(t) + ii(t)f(t) tE 1Rt.

So we recover the standard LQ-problem by minimizing over f(.).

145 Analysis. For simplicity we omit mostly all time-dependences in the notations:
e.g. x for x(t), C for C(t), etc.

Step 1. Necessary and sufficient condition for u E PC to be a global minimizer of


the cost J(u).
Let u E PC be any input and ()u E PC be any perturbation. This generates a line
{ E IR } in the space PC and the expansion of J(u+eou) about e = 0 is

146 J(u+eou) = J(u) + eOJ(ou) + o(e) eE IR

where o(e) --+ 0 as I e I --+ 0 and oJ(8u) is the directional derivative (a functional
e
linear in ou). Equation (146) shows that for u to be a minimizer of J, (Le.
V Ou E PC, VeE R, J(u+eou) ?!. J(u», it is necessary that

147 PC.

Indeed from (146) for any ()u E PC with u a minimizer:

VeE IR e [OJ(OU) + ] =J(u+eou)-J(u) :> O.

Hence if e approaches zero from above then in the limit BJ(ou) ?!.O; similarly if the
approach is from below then 15J(ou) O. Hence condition (147) is necessary.
Now condition (147) is also sufficient. Indeed, by the linearity in (xO'u) of the
state transition map (71) of R 0, any perturbed input u+eou (with Xo fixed) generates a
perturbed state trajectory x+eox (where u and ou map into x and ox, resp., and
oxo=9). Hence, by calculation, using (138) and Exercise (141), VUEPC,
V OUE PC VeE R

148 J(u+eou) = J(u) +eoJ(ou) +e2J(ou)


32

where (a) BJ(Bu) is specified in (149)-(151) below and (b) by (138)-(139). J(Bu)
for allliu(')e PC. t Hence if ue PC is S.t. &(Bu)=O Vliue PC. then by (148)
with £= 1.

'v' aue PC J(u) J(u+Bu)

i.e. u is a global minimizer of J. confinning the sufficiency of condition (147).


In summary. u is a global minimizer of J(u) if and only if condition (147) holds.

Step 2. Determination of the optimal input u.


The calculation mentioned above gives
t1 11

149 f
SJ(Su)= (u,Su)dt + f (c"'Cx.Sx)dt + (SxI.SxI)
10 10

where ()x(·) is related to ()u(·) by

150 Sx=ASx + B()u

151 Sxo=9.

Note that SJ(Su) is a linear functional in Su which we want to make explicit in Su for
expressing (147).
Now for (150)-(151) the pairing corollary (134) gives

Viie PC

152 (xI,axI) + f (ii,Sx) dt= (xo,Sxo) + f (B*x,Su)dt with ()xo=9


10 10

where

153 + ii.

Hence comparing (152) with (149) suggests that we choose

154 ii=c"'Cx and xI=SxI'

t When calculating J(u). Eq. (1) is integrated with x(to)=xo. whereas J(Su) is ob-
tained by integrating ()x = ASx + BSu, with Sx(to) = 9.
33

With this choice, (152) converts the directional derivative (149) into the explicit func-
tional
II

155 oJ(ou) = f (u+B*x,Ou) dt .


10

Recall from Step 1 that for u e PC to be optimal it is necessary and sufficient that
oJ(ou) = 0 for all OU e PC: choosing OU = u+B*x shows that this will happen if and
only if

156 u(t) = -B(t)*x(t)

The substitution of (156) and the choices (154) in (136)-(137) and (153) resp., finally
result in the following optimality characterization.

157 Theorem [Solution of the standard LQ-problem]. The: solution of the standard
LQ-problem (136)-(140) is given by the optimal input

u(t) =-B(t)*x(t)

where xO E C ([to,til, 1Rn) is defined as the partial state trajectory of the 2n-
dimensional two-point boundary value problem or [to,t 1] given by:

158 . *
x(t) = A(t)x(t) - B(t)B(t) x(t) ,

159 . =-C(t)*C(t)x(t) - A(t)*x (t),


x(t)
where

160

161 Comments. a) The homogeneous 2n-dimensional linear d.e. (158)-(159) is
usually called a Hamiltonian system and

A(t) -B(t)B(t)* ]
162 [
H(t):= -C(t)*C(t) -A(t)*

is called the Hamiltonian matrix.

13) In (136)-(138) A(·),B(·),C(·) are defined on R+ and so the Hamiltonian system

t is obtained at all points of continuity; for points of discontinuity assume


that B(·) is right-continuous and apply continuous extension.
34

(162) can be studied on R+ for any pair lo,t1 E R+.

y) The optimal cost 10 (of the LQ-problem) is given by 10= 1


(X(Io),xo) as shown
in (164) below.

163 Optimal cost. We shall work on R+ with an arbitrary pair lo,t 1 E 1R+ s.t.
10 :s; tl·
Under these conditions, let us recall our substitutions, i.e.

156 u(t) = -B* (t)x(t)

Then (158)-(160) becomes, on R+ with 10 :s; 11 arbitrary on R+

X(I) = A(I)x(l) + B(I)u(l)

+ li(t)
such that

Now using this interpretation of (158)-(160) in the pairing corollary (134) (with Xo
arbitrary), we obtain successively:

tl tl

(x(to),xo)=(x(tl),X(II) + f (li,x)dt - f (B*x,u)dt


to to

164 = (Sx(tl),x(t 1»+ fllCxl12dt+ fllull2dt


to to

= 2Jo(uC·»,

where (a) for the second equality we used (154) and (156) and (b) the last RHS IS
twice the optimal cost 10 of the standard LQ-problem on [lo,t1l starting at x(to)=xo
(indeed the pairing is perfonned, with to :s; tl' under the conditions of Theorem
(157». Hence we have obtained an optimal cost equation. •
35

The optimal cost equation (164) is useful for converting the solution of the
standard LQ-problems into a linear feedback law dictated by either the Hamiltonian
system or the matrix Riccati d.e. This result is given in the next two theorems, which
we prove immediately.

167 Theorem [Optimal LQ state feedback by the Hamiltonian system]. Consider


the standard LQ-problem, (136)-(140), where the Iwrizon tl is fixed and toe [O,tl) is
arbitrary.
V.l.c. a) On any interval [to,tl] the LQ-problem is solved by the fixed linear
state feedback law

168

-
°
where XC-> and XC-> are nxn real matrix-valued functions defined on [O,td, with
det X(t) "# for all t E [O,tl]' as the unique solution of the backwards Hamiltonian
linear matrix d.e.

169 :t
[ X(t) ]
X(t) =
[A(t) -B(t)B(t)*
-C(t)*C(t) -A(t)*
1 [X(t)]
X(t)

= H(t) [ X(t) ]
X(t) , te [O,td

with
-
170 X(tl)=I and X(tl)=S,

b) On any interval [to,til cR+, the LQ-problem has the optimal cost 10 given by the
quadratic form
1 -
171 10="2 (X(to)X(to)-lxO,xo)

and generates the optimal closed-loop system dynamics described by the linear homo-
geneous d.e.
172 x(t) = [A(t)-B(t)B(t)*X(t)X(t)-1 ]x(t) t e [to,td

173 x(to)=xo,

(by the substitution of u('), (168), in x=Ax+Bu).



t We mean an expression u(t) = -F(t)x(t), where F(t) E lRn1xn •
36

174 Comments. a) The solution (X(·),X(·» of (169)-(170) depends only on


A('),B('),CO, and S; it does not depend on the initial state xo.

Let
175 F(t) ;= B(t)*X(t)X(tr l E R nixn

then (i) F(') is independent of xo; (ii) (172) can be interpreted as describing the
dynamics (136) modified by a linear time-varying state feedback law u(t) =-F(t)x(t),
(see Fig. 2.3 below).

176 Proof of Theorem (167). The proof is done in two steps. The first step shows
that the partial solution X(t)E Rnxn of (169)-(170) is nonsingular for all tE lO,td: it is
based on the optimal cost equation (164) and may be skipped on a first reading. The
second step shows that the theorem holds.

Step 1. det X(t) '* 0 'litE [O,ttl.


We use contradiction. Since X(t 1) = I, we assume that there is a time 'tE [0,t1)
S.t. det X('t) = O. Hence there exists a_ nonzero vector k S.t. X('t)k = 9; moreover, by
(169)-(170), x(t) := X(t)k and x(t) := X(t)k define on ['t,t 1] a solution of the Hamil-
tonian system (158)-(159) s.t. x('t)=9 and x(t\)=Sx(t\). Hence on ['t,t 1], by Theorem
(157) and the optimal cost equation (164), u(t) := -B(t)*x(t) solves an LQ-problem on
['t,t l ] with zero optimal cost: indeed, in (164) with to='t and x('t)=9 the LHS is zero
and so is the RHS. Since all contributions to the cost are nonnegative, this simpJies
I,

that f II u(t) 11 2dt=0 and therefore u(t)=O for all tE [to,ttl. Therefore, since (158)
't
reads x=Ax+Bu (with u=-B*x), there results on ['t,t l ] x=Ax with x('t) = 9. Hence
x(t)=X(t)k=9 'litE ['t,td. In particular at t1 we have (using also (170»:
9=x(tt)=X(tt)k=k::l- -H-. e:
Step 2. The theorem holds.
Let (X(·),X(·» be the backwards solution of the matrix d.e. (169)-(170). By Step
1 we know that detX(t) 0, '*
'litE [O,ttl. Therefore for the given initial condition
x(to) = xo, (137), there exists a unique vector k E lRn s.,t. at to
177 xo=X(to)k.

Hence by (171 )-(172) it follows that


178 x(t) := XCt)k xCt) = X(t)k

is the unique solution of the Hamiltonian system (158)-(160) on [to,t 1]. Note here
that, since detX(t) 0 '*
'litE [O,ttl, by (177) and (178)
37

Hence by Theorem (157) for any [to,td c the solution of the LQ-problem will be
given by:
u(t) =-B(t)* x(t) =-B(t)*-X(t)k=-B(t)*-X(t)X(t)-lx(t),

where we used successively (158), (178) and (179). Note especially that the feedback
matrix function multiplying x(t) is a fixed function on [0,t 1] which does not depend
on the specified initial state Xo E 1Rn and on to E s.t. to :::; II. Hence conclusion (a)
holds.
Finally for conclusion (b) we note that by (164), (178) and (179) we have for the
optimal cost
2Jo= (x(to),xo) = (X(to)k,xo)

= (X(to)X(to)-lxO'xO) .

This proves (171), while (172)-(173) is self explaining.



An equivalent statement of Theorem (167) is the following

180 Theorem [Optimal LQ state feedback by the Riccati d.e.]. Consider the stan-
dard LQ-problem, (136)-(141), where the horizon tl is fixed and toE [O,tl) is arbi-
trary.
U.t.c. a) On any interval [to,td c R+ the LQ-problem is solved by the fixed linear
state feedback law
181 u(t) = -B* (t)P(t)x(t) t E [O,tl]'

where PO=P(.)* °
is the nxn real matrix-valued function defined on [0,t 1] as the
unique solution of the backwards matrix d.e.

182 . * * *
-P(t) = A(t) P(t)+ P(t)A(t) - P(t)B(t)B(t) P(t) + C(t) C(t) t E [O,td
with
183

«182) is called the matrix Riccati d.e.).

b) On any interval [to,tt1 c R+ the LQ-problem has the optimal cost JO given by the
quadratic form

184
38

and generates the optimal closed-loop system dynamics described by the linear homo-
geneous d.e.
185 x(t) = [A(t)-B(t)B(t)*P(t))x(t) t E [to,td

186 x(to)=xo,

(by the substitution of u('), (181), in x=Ax+Bu).

190 Note that, for any given Hermitian pet), the RHS of (182) gives a Her-
mitian -pet). Therefore, since (by (183» P(t1) is Hermitian, P(·) will be Hermitian.
Moreover, since the cost is nonnegative and cost formula (184) will be shown to hold
for all Xo E R n and all to E [O,td, PO will be positive semidefinite. Therefore on com-
paring the statements of Theorems (167) and (180) we are done if the solution of the
d.e. (182)-(183) reads
191 pet) = X(t)· X(t)-l for t E [O,td,

where (X(·),X(·» is the backwards solution of the Hamiltonian matrix d.e. (169)-(170).
Now note that (191) and (169)-(170) imply

-P=-XX- 1 + XX-I XX-I

=-[-CCX-A*XJX-l + P[AX-BB*XJX-l

=C'C + A*P + PA - PBB*P,


with

Hence (191) defines a solution of the d.e. (182)-(183). Conversely if PC') solves
(182)-(183), then the transition at t J of the closed loop_d.e. (185) is well
defined. Call the latter XO and set XO := P(·)X(·). Then (X(·),X(·» is the unique
solution of (169)-(170) where (191) holds, (exercise). Hence (191) defines a bijection
between the solutions of (182)-(183) and (169)-(170) resp., and we are done.

e x

! }L..___
- 8·....----....
U_ _ "_'"_' " ; - - - - '

Fig. 2.3. State-feedback realization of standard LQ-problem: F(t) is the optimal state
feedback matrix.
39


192 Comments. ex) Theorems (167) and (180) show that, for a fixed horizon t l ,
the standard LQ problem on any interval [to,td c is solved by computing the fixed
state feedback matrix
193 F(t)=B(t)*P(t) Vte [O,td .

The latter, by closing the loop as in Fig. 2.3, will generate the optimal input
194 u(t)=-F(t)x(t)

which steers the system on its optimal state trajectory through x(to) = xo.

The nxn matrix P(·) in the feedback gain (193) can either be computed from the
linear Hamiltonian matrix d.e. (169)-(170) through

191 P(t) = )(t) . X(t)-l t e [O,td

(Compare with (175» or by solving the backwards Riccati d.e. (182)-(183). •

196 Concluding remarks. Five features of our solution of the LQ-problem are
worth emphasizing.
1. A local optimality analysis, (146), leading to the annihilation of the directional
cost derivative along any direction, (147). A generalization of this technique
involving a Lagrangian cost derivative leads to the maximum principle of optimal
control e.g. [Var.I,Fle.I,Alex.l]. The two-point boundary value problem of
Theorem (157) is typical for expressing that necessary principle of optimality.
2. A global optimality analysis, (148) et seq., leading to the sufficiency of condition
(147) for global optimality. This combined with point 1 results in the necessary
and sufficient condition of Theorem (157); the interested reader will easily derive
from (148) that
Vue PC Voue PC Vee R

J(u) + oJ(eou) J(u+eou);

this shows the convexity of the quadratic cost in u (that is, the tangent space is
always below the cost, e.g. [Var.I]): in optimal control, convexity is the driving
force for global optimality.
3. A systematic study of the optimal cost, (164), to obtain the solution in feedback
form: this borrows ideas from dynamic programming, e.g., [Var.I]. That theory
derives the Riccati d.e. by the Hamilton-Bellman-Jacobi equation (" backwards
40

optimal cost tracking") and results in our Theorem (180).


4. The use of the backwards Hamiltonian matrix d.e. (169)-(170) for the backwards
tracking of the optimal cost, (164), in Theorem (167). (This lines up the "max-
imum principle," (Theorem (157), to "dynamic programming," (Theorem (180»,
by the bijection (191) between the backwards Hamiltonian matrix d.e., (169)-
(170), and the Riccati d.e., (182)-(183».
5. A systematic use of duality through the pairing corollary (134) to simplify com-
putations.

2.2. Applications
This section contains various applications using a system representation
R (.) = [AO,B('),C('),D(')]: the variational equation encountered in linearization, exam-
ples of nonlinear control, dynamic optimization and periodically varying differential
equations.

2.2.1. Variational Equation


In engineering, systems having a representation R 0 = [AO,B('),CO,DO] occur
either because the model used is an interconnection of components described by linear
d.e's or because the model is basically nonlinear but one considers only small pertur-
bations about a specific trajectory. In circuit theory the representation is called the
small signal equation; in optimization and control it called the variational equation;
this process of linearization is also used throughout physics. We discuss this question
in a heuristic manner.
Suppose that the model is represented by the d.e.

x(t) = f(x(t),u(t),t)
where x(t)e R n and u(t)e R n; and f:RnxRn;xR+ Rn.
Suppose that for a given to' Xo and a given input u(-) we have calculated the
corresponding state trajectory, say xO. What happens if for given to'x o ' we now have
Uc')+ouO as input, where ou(t) is small for all times of interest? The new input will
give a new (perturbed) state trajectory xO+axO and we would expect that oxO will
be small, say of the same "order" as ou(')' This is usually the case; for precise condi-
tions see e_g. lDes.2J, [In. 1].
Proceeding formally, we obtain

2 *(t)+ox(t) = f(x(t)+ox(t),u(t)+ou(t),t) .

Since t x(t) is known, this is a differential equation in ax('); and ox(to)=9 since the
perturbed trajectory starts from Xo at to'
For each fixed t, let us expand f in a Taylor series about the point (x(t),u(t),t):

f(x(t)+ox(t),u(t)+ou(t),t) = f(X(t),U<:t),t)
41

+ higher-order tenns .

Note that f is an n x n matrix of partial derivatives: its (ij) element is


x
Tdr.· evaluated
at (x(t),u(t),t). The matrix f is often called the state Jacobian matrix along the tra-
x
jectory. Similarly, f is an nxnj matrix of partial derivatives: its (i,k) element is
u
evaluated at (x(t),u(t),t). The matrix fu is sometimes called the control Jacobian
matrix along the trajectory. Substituting (3) into (2) and remembering that

4 X(t) = f(X(t),u(t),t) ,

we get by dropping the higher-order tenns:

5 8x(t) = A(t)8x(t) + B(t)8u(t)

8x(to)=9

where A(') and B(') are the state and control Jacobian matrices defined in (3). Equa-
tion (5) is called the variational equation about the trajectory X(.) generated by
(to,Xo.l(». Note that since we dropped higher-order tenns, (5) gives an approximation
to the difference between the perturbed trajectory and the reference trajectory X(.).
This approximation is, however, extremely useful in science and engineering (e.g.
design by optimization).

*6 Exercise. Study Appendix (B2) on the influence of changing initial conditions


and parameter perturbations to d.e. 'so

*7 Exercise. Assume that in (2) auo reads au(t)=ev(t) where v(') is a specified
function bounded on R+ and e is a real parameter S.t. ee (-£o,eo) for some small
eo> O. Assume also that xo ' the initial condition at to' has been changed into xO+8xO
(thus in (2), 8x(to) = axo 9). Note that under these conditions, (a) Equation (1)
reads
8 x(t) = f(x(t),t,e) := f(x(t),u(t)+ev(t),t),

indeed u(·) and v(·) are specified, hence the RHS of (8) depends on t, x(t) and e, and
(b) if 8xa..= e and e=O, then X(t)=x(t) where X(.) satisfies (4) and X(to)=xo. Assume
now that f in (7) satisfies the condition of Theorem (8.2.6), whence it can be applied.
Show that at any fixed time t> to 8x(t):= x(t)-x(t) satisfies
42

+E J
10

10 + J
10
where

a) <1>(',') is the transition matrix of the state Jacobian matrix A(t)=fx(X(t),u(t),t) of (3)
and B(t) = fu(X(t),\(t),t) is the control Jacobian matrix of (3).

b) [ / II II ] 0 as (9,0).

(Hint: note that x(t)='I'(t,to,xo+8xo,e) and X(t)='I'(t,to,xo,O), so by Taylor's expansion


at (t,to'xO'O)

Apply now (B.2.16) and (B.2.17) to evaluate aa",Xo I and


E
I.)

11 Comment. The solution of (5) with gives (10) without the error
term o(Sxo,e): (10) asserts, in addition to (5), that at any fixed time t > to the error is

*12 Exercise.
Let Eo> 0 be small. Let v E IRn, be a fixed control perturbation
value. Let t be a fixed time S.t. t > to' For all £ E [0'£0] define in (2) (where
Sx( to) = 9) as

'r;f t E [t,t+e)
13 elsewhere on IRt .

Hence in (2) 'r;f e E [0'£0] we have the perturbed control

v 'r;f t E [t,t+e)
14
\(t) elsewhere

[" short pulse" perturbation]. Assume that 'r;f e E [O,Eo] (1) has a continuous solu-
tion due to (to,xo,u E('», satisfying

15 x(t) = f(x(t),t,£) := f(x(t),u£(t),t)


43

16

Call this solution x£(-) and note that (a) for e=O xO(t) = X(t) (the solution due to
and (b) for any fixed e e [O,eo] the composition property holds, i.e.

Now assume that the RHS of 0), namely, f(x,u,t) and its Jacobian fx(x,u,t) are con-
tinuous on R n x R n, Then a careful analysis, [Alex. 1,pp.334 et seq.], gives:

a) on any bounded interval [to,T]

18 lim x£(t)=X(t) uniformly on [to,T];


1..... 0

where <!l(',') is the transition matrix of A(t) = fx(X(t),U(t),t), thus

ata <!l(t,t l ) = A(t)<!l(t,tl) <!l(tl ,t l ) = I.

Exercise: show that:


a) With {(x,t,e) defined in (15), for any fixed xe lRn and for any fixed te ['t,He),
(x,t,e) does not exist at e=O (i.e. "blows up").

(Comment: Since { depends a parameter e in a one-sided neighborhood [O,eO) of


af
the reference value e=O and "ae blows up at e=O" one cannot apply Theorem (B.2.6).)

b) With t any fixed time such that t to, xe(t) satisfies:


for t e
20 x£(t)=X(t),
and for t e ('t,oo)
21 x£(t) = X(t) + e<!l(t,'t)[ f(X('t),v ,'t) - f(X( 't),u('t),'t)] + o(e)

where (o(e)/e) -7 0 as e -7 O.

[Hints: Take into account the explicit relation on lR+: for t using (17),
44

Hence if tl = to' (thus x£(t\) = xo), and t E [to,t], then (20) follows.
Now at t = t+E, by (14), (15), (18), and the continuity of xO,

X/t+E) - X(t) =X£(t+E) - xE(t)

= Ef(X(t),v,t) +O(E)
and
X(t+E) - X(t) = Ef(x(t),u(t),t) + O(e) .
Hence
23 xE(t+e)-X(t+e)=e[f(x(t),v,t)-f(X(t),U(t),t)] +O(e) .

Finally, for t>'t+e (for e sufficiently small, this is any time t>t), observe that on
['t+e,t] uE(·) = U('), hence (see (22»
xE(t) = 'I'(t,'t+E,XE('t+e),O) .
Moreover,
x(t) = 'I'(t,t+e,X(t+e),O) .

xE(t) = v
X(t) + dx (t,t+E,X(t+E),O)' ["t;(t+E)-X(t+E)]
v v

+ o(lIx£(t+e)-X(t+e)II) .

Use now finally (19) (with tl = 't+e) and (23) to obtain (21).]

*24 Comment. The control perturbations ouO in Exercises (7) and (12) are small
LOO-perturbations and small L I_perturbations resp. Their effect on the trajectory is
different. Compare (10) and (21) where Ox(t) = xE(t)-X(t). In (10) a uniform effect is
observed. In (21), at time t, the "short pulse" causes a sudden change in velocity, and
for t>t, its effects are propagated along the trajectory by <l>(t,t).

2.2.2. Control Correction Example

Problem. The movement of a satellite is described by the evolution of its state (posi-
tion, velocity, ... ) on a fixed time-interval (to,td c R+ and described by the nonlinear

t s is the nonlinear state transition map of d.e. (1).


45

state d.e. given by


1 x(t) = f(x(t),u(t),t)

31 x(to)=xo,

where x(t) e R nt, Xo e R n is a specified initial state and f: R n x R nj x R.t Rn. The
function fin (1) is obtained by applying the laws of Mechanics to the chosen model of
the satellite.
We are also given a fixed final state Xl e R n at tl and we want to find a control
u(')e PC S.t. we reach the state Xl at tl' i.e.
find u(')e PC S.t.
32 x(t1) = s(1,t1'to,u) = Xl ,

where s is the nonlinear state transition map.



In many cases a closed-form solution of this nonlinear problem is very difficult
and we use an iterative solution. A clever policy is then to use the philosophy of the
Newton-Raphson method for solving nonlinear equations: for a given good guess
improve upon it by locally linearizing the problem. Hence we have the following.

Approximate problem. Suppose that we know a control U(.) e PC S.t. its


corresponding state trajectory

produces a small state error XI-X(t l ) at time t i . The corresponding variational equa-
tion (5)-(6), i.e.
5 i)x(t) = A(t)i)x(t) + B(t)i)u(t)

6 Sx(to)=O

produces for small Su(·) a small error, i.e.

Neglecting the error and noting that control corrections i)U[Io.I,] e PC produce state
corrections ax(tl) given by

tHere n=6, provided the satellite is modeled as a panicle.


46

I,

33 ox(t,) = J<l>(t;t)B(t)ou('t)dt ,
to

34 the approximate problem is:


find a control correction ou(-) on [to,t 1] such that at time tl the state correction ox(t j ),
(33), equals the desired state correction OXd(t,) given by

Analysis. Note that (33) represents the linear map


36 L,:PC([to,t1J,lR n ;) -+ IR n :8u[,o./,j 1-+ 8x(ttl.

This map is called the reachability map with properties described by the following
nxn real-valued symmetric p.s.d. matrix
I,

37 Wr(lo,t,) := J<l>(tl,t)B(t)B(t)*<l>(t1,t)*dt;
to

the latter matrix is called the reachability grammian. Now, by theorem (8.2.12), Lr
and W r(Io,II) are shown to have the same range, i.e.,

Hence

39 Lr is sUljective, or equiv. R (Lr) = R n


iff
40 -=- det WrClo,t 1) "# 0.+

Note that (33), (35) and (36) suggest to refonnulate problem (34) as:
Solve for ou(-) the linear equation

hence it is appropriate to assume that surjectivity condition (40) holds.


Now, from (36)-(38), we see that (41) can be made to read
I,

OXd(tl)= Wr(l(j,tl)xl = J <l>(tl,t)B(t)B(t)*<l>(tl,t)*x,dt


to

+This is not automatic, e.g. for A = diag[ -1,-2] and b * = [1,0], del W /1 0 ,1 I) = o.
47

I,

42 = J<1>(tl ,'t)B('t)ou('t)d't ,
10

for some xI E lRn : note especially that the first and second equality follow from (41),
(38) and (37), while the last equality follows from (41) and (36) defined by (33). Now
(42) indicates a solution to problem (34), viz.

43a ou(t) := B(t)* <1>(tl ,t)*XI


where
43b xI = W,(Io,tJrlxd(tl) = W,(Io,tl)-I(xl-x(t J» .

Indeed this follows by back substitution. Hence a solution to problem (34) is given by
the following

44 Algorithm for computing Ou(·).


1. Compute W/tO,t 1) from (37), or compute the solution at t\ of the linear matrix
d.e.

:t W(to,t) = A(t)W(to,t) + W(to,t)A(t)* + B(t)B(t)* t E [to, til

W(to,to)=0

(Exercise: set tl =t in (37) and differentiate)


2. Solve for xI the linear equation
W,(to,ttlxl =oxd(t,)=x,-x(t,)

3. Compute X[Io,ltl the solution of the adjoint linear d.e.

il=-A(t)*x(t) x(tl)=xI

(Note: x(t) = <1>(tl>t)*XI' (2.1.118).)


4. ou(t) := B(t)*x(t) t E [to,tl]'
*45 Exercise. Using the methods of Sec. 2.1.7, show that the control ouO given by
(43) is the solution of the problem:

where
48

i)x(t) = A(t)i)x(t) + B(t)i)(t)

i)x(to) =9 i)x(tl) = xd(tl) .

Comment. i)u(') is the "minimum energy" or "least squares" control correction.

46 Exercise: Show that /)uO given by (43) satisfies

11

a) /I /)u(') /I i := JII /)u(t) 11 2dt=(Xd(tl).Wr (to.tlr lxd(tl»


10

Comments a) No control correction is needed if and only if the guess U(.) is exact.
i.e. xl =S(tI.to.XO,lJ).

b) It can be argued from this that. under certain conditions. a convergent iterative
scheme based on successive control corrections (43). i.e. ui+'(·)=ui(·)+/)uj (·) j=I.2....
will converge to a control u(') which solves the original problem (32). i.e.
xl = s(t,.to.xo.u). (cf. convergence of Newton-Raphson's iteration scheme.)

2.2.3. Optimization Example


Suppose we are given (a) a dynamical system represented by the state d.e.

x(t) = f(x(t).u(t).t) t E [Io.t,] c

31 x(Io)=Xo

b) a cost functional J(u('» given by

where
51 <I> : R n -+ R : x -+ $(x)
and

with s the nonlinear state transition map of (1).

Problem. Find a control U[Io.I.1 E PC that solves


49

Comment. The cost (50) depends on the final state, which itself is detennined by
the control you use: e.g. the state may be the (position, velocity) vector of the center
of a mass of a rocket, -cp(x) may be the height and u(') the thrust you want to apply
on [to,t 1] to maximize height.
Now, usually this problem has no closed-fonn solution. However, from a practi-
cal point of view, we may start from a reasonable guess and try to improve upon it.
Thus we assume that we have a guess, i.e. a control and its corresponding state
trajectory i.e. the function defined by

4 = f(X(t) ,u(t) ,t) t e [to,td

X<to)=xo·

Suppose that in our search for a "better" control we change u into UrBu; the trajectory
changes fonn x to x+Bx where, after dropping higher-order tenns,

5 Bx(t) = A(t)Bx(t) + B(t)Bu(t)

6 Bx(to)=9 .

The first equation is the variational equation about the trajectory X('), (see Sec. 2.2.1).
Let cl>(t;t) be the state transition matrix corresponding to (5); then, by (6),
11

54 J
Bx(tl)= cl>(tl,'t)B('t)Bu('t)d't.
to

Moreover, standard analysis reveals that the directional derivative BJ(Bu) at U('),
defined by
55 J(UreBu)=J(u)+eBJ(Bu)+o(e) ee IR
is given by
56 BJ(Bu) = ,Bx(tl)}'

where «lid := t!>x I [aacp aacp ... at!>] I )


'" XI x2 aX n ",I,'

i.e. <!lxl is the row gradient of cp at Xctl)' Using (54) in (56), we obtain successively
50

oJ(ou) = (<I>x*1, f" <1>(t\,t)B(t)ou(t)dt)


10

t,

= f ($x*\ , <1>(t} ,t)B(t)Ou(t» dt


to

51

Now, by (2.1.118), observe that


x(t) = <D(t 1,1)*<1>:1

is the solution of the adjoint d.e.


ii:(t) = -A(t)*x(t) t E (to.td

Therefore xO can be computed on [to,t l ] and the same applies to

58 YCt) := B(t)*x(t)= .

Therefore, by (51), the directional derivative oJ(ou) (a linear functional in ou) has the
explicit form

59 oJ(ou) = J(y(t),ou(t» dt .
to

Remember that we want to minimize our cost J(u) having at our disposal J(u). Note
also that, by (55) where (0(10)/10) -) 0 as E -) 0, there exists an EO>O such that for all
10 E (0,100]
J(UtEOU) < J(u)

if ou(·) satisfies
60 oJ(ou) < 0,

i.e. ouO is a direction of descent. Indeed by (55) and (60) with E > 0 small

J(UtEOU) - J(u) = 10 [ oJ(Ou) + 1< 0


SI

since aJ(au) dominates (o(£)/e). Hence the biggest decrease in the cost from Umay be
expected in a direction of steepest descent au('), which is the solution of the problem

61 min{&(&u}:! lloo(t} II 'dt-ron,tant-a' > 0 } .

Note that since we are interested in a direction, length must be kept constant; we

r= I II
I,

choose as length the L2-norm, (A.7.1O), given by lIauOII Bu(t)11 2dt. Note,
10
also that the constant a is arbitrary and we may set
62 a=IIY(')112
I,

where y(.) is given by (58). Now using the L 2 -inner-product (f,gh= (f(t),g(t»dt, it f
10
follows immediately that the solution of (61)-(62), i.e a direction of steepest descent,
reads
63 Bu(t) =-y(t) =-B(t)*<1>(tl,t)* cjl,tl t e [to,td.

Indeed by Schwarz's inequality, (A.7.4), and by (59) and (62) we get:


't Bu(') S.t. II au(') 112 = II y(.) 112

BJ(Bu)=(Y('),8u('»2 -lly(-)lIi=-(Y(-);Y('»'

Finally, having our direction of steepest descent cSu(·) our search for an improved
guess reduces to try to find an 00 such that, at the new control u+-ecSu, J(U+-£cSu) is
significantly smaller than J(u). The determination of this £>0 is called a line search,
e.g. [Oro.1], and belongs to the art of computing.

2.2.4. Periodically Varying Differential Equations


An importaflt special case is the d.e.
65 x(t)= A(t)x(t). te

where the matrix A(')e PC [R+,Rnxn] is a periodic function with period T (T-periodic,
for short), i.e.
A(t+T) = A(t) 'tte

From the periodicity of A('), direct substitution into (65) shows that if <1>(t,to) is the
transition matrix of (65) then the map t <1>(t+T,to) is a fundamental matrix of (65).
52

Consequently, by Exercise (2.1.52), there exists a constant nonsineular matrix C


s.t.
67 <I>(t+T,to) = <I>(t,1o)C,

whence for t=O and tO=O

68 <I>(T,O)= C,

a nonsingular constant matrix.


Now for any nonsillguiar matrix ME rr nxn , the (natural) logarithm of M,
denoted by log Me rr nxn , is well defined t : in Chapter 4 we shall see that log M is
the matrix evaluation of an interpolation polynomial, therefore for all k E Z, Mk and
log M commute; moreover exp(log M) = M and log AE tr is an eigenvalue of log M
iff A is an eigenvalue of M.
Therefore, since <I>(T,O) is nonsingular, the n x n matrix tt

69 B := 1. log <I>(T,O) E (!;nxn


T

is well defined. Hence equation (68) can be rewritten

70 exp(BT) = <I>(T ,0) = C .

We can now state our main result.

71 Theorem [Floquet]. Let A(-) be piecewise continuous and T-periodic. Consider


the matrix-valued function PO, defined on lR+ by

72 pet) := <I>(t,O) exp[-Btl,

where B is the constant matrix defined in (69).


V.t.c.,

a) PO is T-periodic,

b) pet) is nOllsinguiar for all t E lR+ with P(O)==I,

c) the transition matrix of (65) reads

tWe choose the principal branch of the logarithm.


tt The matrix B will be complex if <I>(T,O) has negative real eigenvalues.
53

73 <1>(t,to) =P(t) exp[B(t-to)]p(to)-l 'it.to E R+,

d) by changing to a periodically varying system of coordinates


74 x(t) = t E R+,

equation (65) becomes

75 tE

where B is the constant matrix defined in (69).

Proof. a) From (72) and then using (67), (70) and the properties of the exponential.
we see that for all t E
P(t+T)=<1>(t+T,O) exp[-B(t+T)]

= <1>(t,O) exp(BT) exp(-BT) exp(-Bt)

=<1>(t.O) exp(-Bt)

=P(t) ,

where the last step follows by (72). Hence P(·) is T-periodic.

b) (72) shows that pet) is nonsingular for all t E with P(O)=I.

c) (73) is true because, by (2.1.56), (2.1.57) and (72),

<1>(t,lo) = <1>(t,O)<1>(O,Io) = <1>(t,O)[<1>(lomr 1

=P(t) exp(Bt) exp(-Bto)P(to)-l

=P(t) exp[B(t-to)]p(to)-l.

d) If x(t) is any solution of (1) then, for some IoE R+ and XOE ern, x(t) = «1>(t,tO)xO'
whence by the state transformation (74), with defined by

= p(tr1x(t)

= p(t)-l<1>(t.to)xo =

where the last step follows by (73). Hence f9r every to E for every corresponding
state := p(tor1xo. defined by (74), satisfies the d.e. (75). •
54

76 Exercise. Consider the periodically varying d.e. of Exercise (2.1.45) item (7)
and choose the period T = 2x. Compute the matrix B, (69), and the transition matrix
cI>(t,tO)' [Hint: use (73).]

77 Exercise. Consider the periodically varying d.e. (65) with BE fCnxn given by
(69). Show that,

a) the transition matrix of (65) may be written


78 cI>(t,to) = P(t,to) exp[B(t-to)] V t,to E Rr

where (1) for all to the matrix-valued function t P(t,to) is T-periodic and (2) P(t,tO)
is nonsingular for all t,to;

b) by changing to a parametrized periodically varying system of coordinates x(t) =


the d.e. (65) becomes

79 tE R+,

where BE fCnxn is given by (69).


[Hint: study P(t,to) := cI>(t,to) exp[-B(t-to)].]

80 Remark. By (78) the transition matrix of a periodically varying d.e. is the pro-
duct of a periodically modulated amplitude P(t,tO) by an exponential exp[B(t-tO)]'

81 Remark. The matrix-valued functions pet) and P(t,tO) defined by (72) and (78),
resp. are related by

82 pet) = P(t,O) .

CHAPTER 2d

THE DISCRETE-TIME SYSTEM REPRESENTATION Rd(-)=[A(-),B(-),q-),D(-)]

Introduction
This chapter starts by discussing how to obtain a discrete-time linear system
representation R d(') from a continuous-time system. The state and output trajectories
of Rd(') are then derived and structured. The dual-system representation Rd(') is next
defined and related to R d(') via a Pairing lemma. We then handle finite horizon linear
quadratic optimization and end with coverage of periodically varying recursion equa-
tions.

Physical Setting
For most engineering problems, the basic laws of Physics are those of Mechanics
and Electromagnetism: Newton's laws, Lagrange's equations, the Navier-Stokes equa-
tions, Maxwell's equations, Kirchhoff laws, etc. Each of these laws describe
continuous-time phenomena. At present it is cost-effective to manipulate signals in
digital fonn. For this purpose, the continuous-time signals are periodically sampled by
an AID converter (analog-to-digital converter) and transformed to digital fonn: the out-
put of the AID converter can be thought as a sequence of numbers. This sequence of
numbers may be manipulated by a digital computer (controller) and the resulting
sequence of numbers must be restored to analog fonn by a D/A converter; indeed, the
analog fonn is required to actuate the physical devices.
In control problems (robots, measuring instruments, airplanes, satellites, process
control problems, etc.), sensors measure physical variables (e.g. position, velocity,
acceleration, temperature, pressure, voltage, etc.) and the AID converter transfonns it
to digital form. After treatment by a digital computer (controller), the digital signal is
restored to analog fonn in order to operate the actuators (motors, valves, reaction
wheels, ailerons, ... ).
In communications systems, the signal from a microphone or picture tube is sam-
pled and transmitted in digital fonn. At the receiving end, the signals are restored to
analog fonn to actuate loudspeakers or TV tubes.
Throughout this chapter we neglect the quantization error: that is the error occur-
ring in the process of transforming a sample of an analog signal into a finite sequence
of binary digits. We assume that this error is a smalI noise that will barely affect the
performance of the system.
Throughout this chapter, we assume that the sampling period h>O is given;
roughly speaking, the sampling frequency 1/h must be definitely larger than twice the
highest frequency present in the signals being sampled.

The Mathematical Representation R d(-)


We are given a physical system S represented by its linear time-varying
representation R c(') (subscript c to emphasize continuous-time):

F. M. Callier et al., Linear System Theory


© Springer-Verlag New York, Inc. 1991
56

where ueO and YeO are the continuous-time input and output, resp., A(') and BO are
piecewise continuous and, for simplicity, we assume CO and DO continuous, with
A(·) and B(') continuous at the sampling points: A(kh+) = A(kh-),B(kh+) = B(kh-),
'ike N.
Suppose we want to drive S by a digital input specified by the sequence
ud('):= (Ud(kh) J;: we apply this sequence to the D/A converter which, by assump-
tion, produces a piecewise-constant continuous-time input uc(') to S. (The D/A con-
verter is assumed to behave as a zero-order hold circuit.) Thus

3 uc(t)=ud(kh) for tE (kh,(k+l)h).

The input ucO of S produces the continuous-time output YeO, which is fed into an
AID converter to product the output sequence Yd(')::::: (Yd(kh) J;. (See Fig. 2d.1.)

We assume that the D/A and AID converters are synchronized.


The problem is: given R cO specified by (1) and (2), the initial state of S, say at
kOh, and the input sequence udO, find the output sequence ydO.
Since S is driven by a piecewise-constant signal equal to ud(kh), for all
te (kh,k(h+I», we have
(k+l)h
4 x«k+ l)h) = <1>«k+ 1)h,kh)x(kh) + J <1>«k+ I )h,'t)B('t)d't . ud(kh)
kh

5 y(kh) = C(kh)x(kh) + D(kh)ud(kh)

where <1>(''') is the state-transition matrix of ReO.


To get simple expressions, define

Fig.2d.l The system S with the D/A and AID converters.


57

(k+l)h
6 Ad(k) :=: Cl>«k+l)h,kh); Bik ):= J Cl>«k+l)h,t)B('t)dt
kh

7 ; Dd(k):= D(kh)

where x(kh) is the state of S when it is driven by ud ' as shown in Fig. 2d.1. Finally,
let us write ud(k) for ud(kh), then

ke N.

The linear time-varying equations (10) and (11) relate the output sequence (Yd(k)];

and the state sequence (Xd(k) ]; to the input sequence (Ud(k) ];.

These equations specify the discrete·time representation


Rd(·)=[Ad(·),Bd(·),Cd(·),Dd(·)] where the sequences
12 Ad : N --t IRnxn , Bd: N --t IRn><n;

are known (by equations (6) and (7».

Remark. For the physical set up of Fig. 2d.l -- a continuous-time system preceded
by a D/A and followed by a synchronized AID converter -- the matrix Ad is always
nonsingular by (6). The same holds for series and/or parallel connections of such sys-
tems. However this is not necessarily the case for feedback interconnections, (e.g.
deadbeat controllers).

Comments. IX) Since in the system of Fig. 2d.l, S is driven by the piecewise-
constant input ucO, S has a state trajectory xO : R+ --t R n which, at time kh, coin-
cides with xd(k), the state of the discrete-time representation R d(') defined by (10) and
(11). In fact, x(·) is discontinuous at the sampling points as well as at the points of
discontinuity of A(·) and B(·).

Exercise. For the system of Fig. 2d.I, calculate x(kh+)-x(kh-).


58

2d.l Fundamental properties of Rd(')


Since in the following sections we are considering exclusively discrete-time sys-
tems, we will drop the subscript d. Thus we consider the discrete-time representation
R d(') specified by the sequences

1 A(k),B(k),C(k),D(k) ke N

and the linear time-varying equations


2 x(k+ 1) = A(k)x(k) + B(k)u(k)
keN
3 y(k) = C(k)x(k) + D(k)u(k)

Remarks I. It is understood that the recursion equation (2) is run forward, calculat-
ing the later state x(k+l) in terms of the input u(k) and the previous state x(k) is
always possible.
II. In most cases, A(k) e Rnxn is nonsingular for all k (see eq. (0.6) above).
However, some engineering systems are "deadbeat," and A(k) is singular: in that case,
x(k+l) and u(k) in (2) do not define x(k) uniquely, and we cannot solve (2) backward
in time.
III. To emphasize the analogy between the continuous-time representation R (-)
treated in Chapter 2 and the discrete-time representation R dO we number the main
properties of R dO using the same numbers as those labeling those of R ('). For brev-
ity. we do not state obvious definitions. (Most of the time. the proofs are by inspec-
tion. If not. a hint is given.)

14 State transition function.


'v'xoe JRn, 'v'koe N, 'v' (U(k) )k:, (2) and (3) have a unique solution
x(k) = s(k,ko,xo,u('»
y(k) = p(k,ko,xo,u('» } for k=ko,ko+l,···

22 The linear structure of R d: 'v' (k.ko) fixed.

(xo. (U(k)]k; ] s(k.ko,xo.u(·» }


the functions { ( r ]] are linear.
xo. Lu(k) k; p(k.ko,xo.u(·»

31 The state transition matrix. t


59

<1>(';): (NxN)+ -4 R nxn is defined by: V'koE N

40 Cl>(k+ I ,leo) = A(k)<1>(k,leo)


k=ko,ko+ l ,...

Remark. It is only when A(k) is nonsingular V kEN that (40) can be solved for
<1>(k,ko) in tenns of <1>(k+l,ko). In that case, <1>(',') is defined by (40) and (41) on all of
NxN.

48 Formula for <1>(k,Ieo).


Let k>ko, then <1>(k,Ieo)=A(k-I)A(k-2)'" A(ko).
(Proof: Check it for k = kO+ I; then use induction.)

56 Composition property for <1>(.,.).


V ko,k1,k in N with ko $ kl $ k, we have
<1>(k,leo) = <1>(k,k 1)<1>(k 1,leo) .

71 State-transition and response maps.


V XOE Rn, VkoE N V [U(k»)k;; and Vk>ko
k-l
x(k) = <1>(k,ko)xo + L <1>(k.k'+ 1)B(k')u(k')
k'=k o

k-l
y(k) = C(k)<1>(k,ko)xo + C(k) L <1>(k.k'+1)B(k')u(k') + D(k)u(k) .
k'=ko

96 Impulse response.
The zero-state response (Y(k»)k;; due to the input (U(k) )k;; is
k
y(k) = L H(k,k')u(k')
k'=k o

where the impulse response is the matrix-valued sequence (k,k') -4 H(k,k') given by

C(k)<1>(k.k'+I)B(k') for ko $; k' <: k


H(k,k') = { D(k) for k' = k
e for k' > k .

110 Adjoint recursive equation.


60

To the recursive equation x(k+ 1) = A(k)x(k) we associate the adjoint equation


x(k) = A(k)*x(k+ I) kEN.

114 Consider the equation together with x(k+ 1) = A(k)x(k); for all solutions,
k (x(k),x(k» is constant.
The state transition matrix of the adjoint equation is given by

118 '¥ (k,ko) = <l>(ko,k)* k::; leo.

120 Dual-system representation: RdO .


-
Given the discrete-time representation R dO specified by (I )-(3) there are several
ways of defining a dual representation. The following seems the most natural:
- * *
Rd(') = [A(k) ,C(k) ,B(k)*,D(k)*] where

121 x(k) = A(k)*x(k+ I) + C(k)*u(k+l)

122 y(k+I) = B(krx(k+ 1) + D(k>*u(k+I).

Note that since A(k) is not necessarily invertible, the dual system runs naturally back-
wards in time.

126 Pairing lemma.


Given R dO and RdO, I;f ko,k with ko < k,
l;f(x(k),u('»E RnxU d
k-l k-l
127 (x(ko),x(ko» + L (y(l+I),u(l)=(x(k),x(k» + L (U(l+l),y(l».
/=ko /=ko

Proof. Compute: use first (121) and (122), then (2) and (3):
(x(l),x(l» + (Y(l+l),u(l)

= (A(l )*x(l + 1)+C(1 )*u(l + I),x(l» + (C(l )*x(l + l)+D(l )*1I(l + 1),u(l»

=(x(l + 1),A(l )x(l)+B(l )u(l» + (li(f + 1),C(l )x(l )+D(l )u(/»

=(x(l+I),x(l+I» + (u(/+l),y(l».

Finally, Eq. (127) is obtained by summing over 1 from kO to k-1.

*129 Comment: adjoints. It rums out that it is easier to consider pointwise maps:
Fd: (IR n xIR n,) (IRn x 1R""), (IRnxlRllo) (IRnxlR ll ,) where
61

F d : (x(l),u(/» (x(l+l),y(l)

F d* : (x(l+l),ii(l+l) (x(l),Y(i+l»

the maps being, respectively, specified by (2) and (3), and (121), (122)0

134 Pairing Corollaryo Consider Rd(o)=[A(o),B(o),I,O), then V (k,ko)e (NxN)+,


V (x(ko),u(o» e R n xU d' V (x(k),ii(o» e R n x U d
k-l
(x(k),x(k)} + L (ii(l + 1),x(l)}
leo

= (x(ko),x(ko)} +
k-l
L *
(B(l) x(1 + l),u(/)} 0

leo

Linear quadratic optimization.

136 We are given


a) Rd(o)=[A(o),B(o),I,O] with A(k) nonsingular V'ke N;
b) A(k), B(k), C(k) given "Ike N; S=S*e lRn.xn, S 2::..0;
137 c) (ko,k l ) e (N x N)+ 0

Standard LQ-Problem: Minimize

138 J(u)= [IIC(k)X(k)112+ lIu(k)ll2 ] + X(kltSx(kl)'


leo

over all sequences [U(k) ,where is the sequence due to Xo at kO

and the input [U(k) 0

Analysis: To simplify notations we will often omit the explicit time-dependence,


eogo we write x for x(k), C for C(k), 000 0

148 As in the continuous-time case, it is easy to check that the control u(-) is a glo-
bal minimizer of J if and only if
BJ05u)=O V Bu
where

149 BJ(Bu)= kr,l [(u,Bu) + (C'Cx,Bx) ] + (SXl,Bxl),


leo

where Bx is related to by
62

150 Bx(k+ 1) = A(k)Bx(k) + B(k)Bu(k) 'It kEN

By the pairing corollary (134), if we choose

154 ii(k+I)=C(ktC(k)x(k) and xI =SxI

we obtain
k,-I
Bl(Bu) = L (u(k) + B(ktx(k+l),Bu(k)}.
leo

From this expression we easily obtain the

157 Theorem. The solution of the standard LQ-problem, (136)-138), is given by

156 u(k)=-B(ktx(k+l) for kE [ko,k1-l]

where is given by the backward Hamiltonian recursion equation


158 x(k)=A(krlx(k+l) + A(k)-IB(k)B(k)*x(k+l)

159 x(k) = C(ktC(k)A(k)-lx(k+l) + (A(kt +C(ktC(k)A(k)-IB(k)B(kt ] x(k+l)


with

161 Comments. ex) Naturally, the recursion equation (Le.) of the system goes
forward and the r.e. of the adjoint goes backward. In (158), we used the nonsingular-
ity of A(k) to make the system r.e. go backward.
\3) Let us denote the Hamiltonian by H(k)
A(kr l A(k)-I B(k)B(k)* ]
162 H(k):= [
C(k)* C(k)A(k)-l A(d +C(ktC(k)A(k)-1 B(k)B(k)* .

y) The row operation P2 f- P2 - C(k)*C(k)Pl applied to (162) gives

detH(k) =det[A(k)-I] det[A(k)*] = 1 'It k.

More detailed analysis shows also that AE a(H(k» X'-l E a(H(k».


63

Optimal cost.
Given Rd(') and the cost J(u), (136)-(138), by substituting the optimal control
(156) and using the pairing corollary (134),t we get \i (ko,kl) E (N X N)+,
\ixOE R n ,

(x,(ko),xo> = (x,(kl),x(k l » + t l [(U(k+ 1),x(k» - (B(k)*x,(k+ 1),u(k» ]


ko

164 =(SXI,XI) + kf:,1 [1IC(k)X(k)11 2+IIU(k)ll2 ]


ko

= 2Jo(u)

Eq. (164) gives the optimal cost formula: 2Jo(u)= (x,(ko),xo).

167 Theorem [Optimal state-feedback by the Hamiltonian system]. For the stan-
dard LQ-problem (136)-(138), let the horizon kl be fixed :and koe [O,kl-l] be arbi-
trary.
U.th.c., on any [kO,k 1],
a) there exists a unique linear state-Jeedback law
168 u(k) =-B(k)*5qk+ I)X(k)-lx(k) for k E [0,k1-1]

where X(·) and X (-) are n x n real matrix--::alued sequences defined on [0,k 1], with
detX(k) "" 0, \ikE [O,kd, and XO and X(·) are the unique solution of the back-
wards Hamiltonian matrix r.e.

169
170

b) the optimal cost is


171 2Jo =(X(ko)X(kor!xo,xo).

c) the control law (168) generates the closed-loop dynamical system

172 x(k+l)= [A(k)-B(k)B(k)*5«k+l)X(k)-1 ] x(k), ke [ko,k l-l]

173 x(ko)=xo .

t Where the substitutions (154) are applicable.
64

174 Comment. The control law (168) is of the fonn u(k)=-F(k)x(k) where F(k)
does not depend on xO'

Proof (Outline). The proof by contradiction of the continuous-time case extends


easily to prove that det X(k) *" 0, V k E [O,kd. Using (160) in (169)-(170) gives

]= [X x(k l )

hence x(k+ 1) = X (k+ I)X(k)-lx(k)


and, with (156), we obtain (168). Finally (171), (172) and (173) are easily checked. •

180 Theorem [Optimal state-feedback by the Riccati r.e.]. For the standard LQ
problem (136)-(138), let the horizon kl be fixed and ko E [O,k l ] be arbitrary. U.l.c.,
on any interval [kO,k}],
a) the standard LQ problem is solved by the linear state-feedback law

181 u(k) = -B(k)*P(k+ 1)[I+B(k)B(k)*P(k+ In-I A(k)x(k) k E [ko,kl-l]

where PO=P(·)* is the nxn real matrix-valued sequence defined on [O,k l ] as the
unique solution of the matrix Riccati r.e.
182 P(k-l) = C(k)*C(k) + A(k)*P(k)[I+B(k)B(k)*P(k)r1 A(k)
with
183

b) the optimal cost is

c) the state-feedback law gives the closed-loop system dynamics

185 x(k+l) = [I+B(k)B(k)*P(k+ l)r l A(k)x(k)

186 x(!co)=xo.

187 Comments. a.) In (181) and (185), the inversion is legitimate; indeed, as we
shall see P(k) = P(k)* 0 and
det[I+B(k)B(k*)P(k+ 1)] =det[I+B(k*)P(k+ I)B(k)] > 0 Vk

indeed I is positive definite and B*PB is positive semidefinite.


/3) In the continuous-time case, the optimal linear state-feedback changes A(t) to
65

A(t)-B(t)B(t)*P(t). In the discrete-time case, A(k) becomes

Ac(k) = [I+B(k)B(ktp(k+ 1Wi A(k) .

y) If we put h:k) := [I+B(k)B(ktp(kW l A(k), then simple calculations show that the
Riccati equation (182) becomes

189 r
P(k-l) = N:k)*[P(k)+P(k)B(k)B(k)*P(k) 1N:k) + C(k)* C(k)

Since P(k 1) = S = S* 0, (189) shows that P(k) = P(k)* 0 for all k < k 1.

S) Using standard matrix manipulations, the matrix Riccati Le. (182) can also be
written as (we drop the dependence on k in the RHS),
190 P(k-l)= C"C+ A*PA- A*PB(I+B*PB)-lB*PA

Note that the RHS of (190), say M, satisfies M * = M provided P = P *.


Proof. 1. Let us establish the matrix Riccati recursion equation (182). Let
Q(k) := C(ktC(k) and, using the notations of (169)-(170), let

192 P(k) := X (k)X(k)-1 .

We are going to prove that if (192) holds in the RHS of the Riccati equation (182),
then P(k-l)=X(k-l)X(k-l)-I. Consider Eq. (182), a) multiply it on the right by
A(k)-I[I+B(k)B(ktp(k)] and b) multiply the result by X(k) OIl Ihe right to get

194 P(k-l)A(kr l [X(k)+B(k)B(k)*P(k)X(k)]

= Q(k)A(k)-IX(k) + [Q(k)A(k)-1 B(k)B(kt+A(ktlP(k)X(k) .

Using the Hamiltonian equations (169)-(170) and (192), equation (194) becomes

P(k-l)X(k-l)=X (k-l)
equivalently
195 P(k-l)=X (k-l)X(k-l)-I.

Thus we have shown that if P(k) is given by (192) and if (X(,),X (.» are solutions of
(169)-(170) then P(k-l), given by (195), satisfies the Riccati r.c.
The calculations above can be performed in reverse order: assume (192) and
(195), write (194) and obtain from it the Riccati r.e. (182).
66

We conclude that, given the relation (192) between P(k) and X (k) and X(k),
(X('),X('» is a solution of (169)-(170) if and only if P(·) is a solution of the Riccati
r.e. (182).
II. The cost formula (184) follows immediately form (171) and (192).
III. The optimal control (181) follows from (168) indeed from (168)

196 u(k) = -B(k)*X (k+ I)X(kr 1x(k) =-B(k>*P(k+ I)' X(k+ I)X(k)-lx(k) .

Now by the HamiItonifffi equation (169), we obtain


197 1= A(kr1X(k+I)X(k)-1 + A(k)-IB(k)B(k)*P(k+1)' X(k+1)X(k)-1

equivalently,
198 X(k+ l)X(k)-l = (I+B(k)B(k)*P(k+l))-l A(k)

and (181) follows from (196) and (198).


IV. The closed-loop equations (185)-(186) follow by substitution of (181) in (172). •

2d.2 Application: Periodically Varying Recursion Equations


Consider the recursion equation

65 x(k+l) = A(k)x(k) ke N,

where k A(k) e R nxn is p-periodic, i.e.


A(k+p) = A(k) V' keN for some peN,
and
66 det [A(k)] ¢ 0 V'ke N.

From the periodicity of A('), direct substitution into (65) shows that if ct>(k,ko) is the
transition matrix of (65), then k is a fundamental matrix of (65), hence
there exists a constant nonsingular matrix C s.1.
67 cl>(kf-p,ko) = <'l>(k,ko)C ,

whence for k =0 and kO=O

68 ct>(p,O) = c.
Now, for any nonsingular matrix Me a: nxn, the pth root of M, denoted by M1/P is
67

well defined t the evaluation of an interpolation polynomial (see Chapter 4):


'v'ke Z, M and M p commute; [MlIPJP=M and A,lIp is an eigenvalue of Mi/p iff
A. is an eigenvalue of M. Therefore since <Il(p,O) is nonsingular, the n x n matrix
69 B := [<Il(p,O)] lIP e o:nxn

is well defined. Hence equation (68) can be rewritten

We have then

71 Theorem. Let k A(k) be p-periodic, where (66) holds. Consider the matrix-
valued function PO, defined on N by
72 P(k) := <Il(k,O) . B-k ,

where B is the constant matrix given by (69).


U.t.c.
a) P(·) is p-periodic,
b) P(k) is nonsingular for kEN with P(O) = I,
c) the transition matrix of (65) reads

d) by changing to a periodically varying system of coordinates

74 x(k) = P(k)/;(k) kEN,

equation (65) becomes

75 /;(k+l)=B/;(k) ke N,

where B is the constant matrix defined in (69).



76 Exercise. Prove Theorem (71).
[Hint: see the proof of Theorem (2.2.71).]

tWe choose the principal branch of the pth root.


CHAPTER 3

THE SYSTEM REPRESENTATION R =[A,B,C,D], PART I


This chapter develops the general properties of the time-invariant representation
R = [A,B,C,D] (i.e. where A,B,C,D are constant matrices) and then sorts out those pro-
perties of the representations R that have a state space basis of eigenvectors. Chapter
4 will treat the case when there is no basis of eigenvectors. We start with some prel-
iminaries.

3.1. Preliminaries
1 Notations [Rings and fields]. R[s], ( ((: [s)), denotes the ring of polynomials in s
with real, (complex, resp.), coefficients; 8[p(s)] denotes the degree of the polynomial
p(s). lR(s), ( ((:(s», denotes the field of rational functions in s with real, (complex
resp.), coefficients; thus f(·)e ((:(s) iff f(s) = n(s)d(sr 1 where nO and d(·)e ((:[s].
Rp(s), «((:p(s», denotes the ring of proper rational functions with real, (complex
resp.), coefficients; thus f(·) e ((: p(s) iff fO e ((: (s) and f(oo) exists as a finite complex
number. Rp,o(s), ( ((:p,o(s», is the ring of strictly proper rational functions with real,
(complex resp.), coefficients; thus f(')e ((:p,o(s) iff f(')e ((:(s) and f(oo)=O. Mat [ ... ]
denotes the set of matrices having entries in the ring between the brackets, e.g. Mat
[ ((: [s]] is the set of matrices having complex polynomial entries (also called complex
polynomial matrices). If the size of these matrices remains constant then this is indi-
cated by superscripts: e.g. Rp,o(s)n><Jn denotes the linear space of nxm real strictly
proper rational matrices, For more about these rings, fields, and matrices we refer to
section A2 of the Appendix A. Note that most polynomials and rational functions
have real coefficients; however, complex coefficients may creep in when computing
roots or partial fraction expansions. Hence we prefer to work with complex
coefficients.

2 Square matrices. Let A e ((:nxn. The characteristic polynomial of A is


denoted XA and defined by XA(s) := det (sI-A). Clearly, XA is a monic polynomial of
degree n.

3 Fact. Let Ae ((:nxn and let be a complex number. The following statements
are equivalent:
i)
4 ii) There exists a nonzero vector e e ((:n S.t. Ae=Ae;
5 iii) There exists a nonzero vector '11 e ((:n S.t. *
'11 *;
iv) XA*("X)=O .t

t denotes the complex conjugate of

F. M. Callier et al., Linear System Theory


© Springer-Verlag New York, Inc. 1991
69

Proof. Exercise: apply Theorem (AA.13) to the matrix A - AI and its Hermitian
*-
transpose A -A.I. I

We can now make the following definitions. Let Ae cr nxn.

6 The complex number A is said to be an eigenvalue of A iff XA (A) = 0, or


equivalently t one of the equivalent conditions of Fact (3) hold"

7 Any nonzero vector eE crn S.t. (4) holds is called a (right) eigenvector of A
associated with the eigenvalue A.

8 Any nonzero vector 11 E crn S.t. (5) holds is called a left eigenvector of A associ-
ated with the eigenvalue A.
The eigenvalue-eigenvector structure is important to explain the action of a
matrix. The following definitions single out interesting cases.

9 Let A E cr nxn . We say that A is simple if A has n pairwise distinct eigenvalues


AiE cr.

10 We say ([Kat.l,pA3]) that A is sef'isimple or diagonable if A has n eigenvec-


tors that form a basis for c: n (i.e. a l.i.t family of n eigenvec:tors spanning c: n ).
Definitions (9) and (10) suggest that in certain cases we have to make a distinc-
tion between spectrum (point set) and spectral list (ordered set). Hence we have the
following definitions.

11 Let A E cr nxn . We call spectrum of A the point set of the (necessarily pairwise
distinct) eigenvalues of A we denote it by o[A].

12 We call spectral list of A any n-tuple (Ai),n of eigenvalues that is complete as


a list of roots of the characteristic polynomial XA; such list is denoted by 01 [A].
In Theorem (3.3.23) below we will show that if A e (r nxn is simple then A is
semi-simple, and in Theorem (3.3.34) below it will follow that an eigenvector basis of
a semisimple matrix is associated with a spectral list. Moreover, it will be important
to know that (cf. (A6.84a» the sets M d , M'd' and M i of semisimple, simple, and non-
singular matrices are dense in (fnxn.

13 Exercise [Real matrices]. Let A E Rnxn. Show that eigenvalues and eigenvec-
tors have the property of complex conjugate symmetry, i.e. if I.E c: is an eigenvalue
of A and (e,11) is a corresponding pair of right- and left-eigenvectors, then is an
eigenvalue and (e,Tf) is a corresponding pair of right- and left-eigenvectors. Hence

tEigenvalues of Ae c: mxn are defined by property (4).


tt l.i. means "linearly independent."
70

eigenvalue-eigenvector triples (A,e,T]) are either real or occur in complex conjugate


pairs.

3.2. General Properties of R = [A,B,C,D]


We give here the definition and the structural properties of the representation R =
[A,B,C,D]

3.2.1. Definition
In this chapter we consider system representations of the form
1 x(t) = Ax(t) + Bu(t)
tE R
l yet) = Cx(t) + Du(t)

wheret
3 the state x(t) E G: n, the input u(t) E G: n, and the output yet) E G: n" ;

4 A,B,C,D are constant complex matrices of dimension n x n, nxni, noxn, noxni' resp. ;

5 the input function u(·) E U where U ;= PC (JR, G: n,) .

For short we refer to (1)-(5), as the system representation R = [A,B,C,D].

6 Comment. The system representation R = [A,B,C,D] is a special case of the


representation R(-)=[A(·),B(·),C(·),D(·)]. (2.1.1), where AO, BO, CO, DO are con-
stant functions. For this reason R = [A,B,C,D] is called a time-invariant representa-
tion: for more precision on this notion, see Comment (47) below. Note that our
val of observation T is IR since our parameters A,B,C,D are constant.

3.2.2. State Transition Matrix


The state transition matrix will first be calculated for 10= O. By definition, <1>(t,O)
satisfies

7 ata <1>(t,O) = A<1>(t,O) tE JR

8 <1>(0,0) = I .

Let us solve the matrix differential equation by using Picard iterates, (cf. section
(B. 1.3)), and call <1>0,<1>( ,<1>2' . .. the successive iterates; we obtain for all t E IR

tWe prefer complex arithmetic because e.g. eigenvalues may be complex, or we


consider the input u(t) = u exp(jrot).
71

<1>m+1 (t) = 1 + f A<1>m(t)dt m=O,l,2, ...


o
thus
<1>1(t) = 1 + At

<1>2(t) = 1 + At +

By the proof of the fundamental Theorem (B.1.6), we know that the sequence of
iterates converges on any bounded interval to the state transition matrix <1>(t,O). There-
fore it is given by the convergent Taylor series expansion

9 <1>(t,O)= L
00

1=0
I
-It [At]1 =1

1
+ ,[At] +
1.
+ -h [Alii +

By analogy with the freshman-calculus power-series definition of the exponential, we


write
10 <1>(t,O) = exp[At].

11 Theorem. For any t,lo E 1R the state transition matrix of (1) is given by

12 <1>(t,lo) = exp[A(t-to)] .

13 Exercise. Prove the theorem in detail.

14 Exercise. Use the power series (9) to show that if

A= then exp[At]= 1;

15 Exercise. Let A be an n x n constant matrix given by


72

A. 1 0 0 0
0 A. 0 0

A=

0 0 0 I.. 1
0 0 0 0 I..

show that

eAt teA! t n - 1 .\t


(n_l)!e

0 eAt

eAt =

0 eAt ie'\!

0 0 eAt

(Hints: A=A.I+N, where AI and N commute; exp (At) = exp (At)exp[Nt]; compute
exp[Nt] using (9): for k Nk=O)

16 Property. For any t l ,t2 E R

17 exp[A(t l+t2)] = exp[Atd . exp[At2] .

Proof. By the composition property, (2.1.56), <1>(tl+t2,0) = <1>(tl+t2,t2)' <1>(t2'0), and by


(12) equation (17) follows. •

Computing exp[At] by the Laplace transform


Taking the Laplace transform of (7) above, using initial condition (8) and denot-
ing by <1>(s) the Laplace transfonn of <1>(t,O) on t 0, we obtain t

18 (sI-A)<1>(s)=I.
Hence

19 <l>(s)=L[exp(At)]=(sI-Ar l .

The elements of the matrix (sl-A) are polynomials with coefficients in tr; therefore
they are elements of the field of rational functions with coefficients in tr;

t L [... ] denotes "Laplace transform of."


73

consequently. Cramer's rule applies. Hence element (ij) of <1>(s) reads

20

where XA(s)=det(sl-A) is the characteristic polynomial of Ae cr nxn and bi/s) is the


polynomial obtained (i) by replacing column i of (sl-A) by the jth unit vector Ej and
(ii) by computing the determinant: Note especially that o[bi/s)] < n. while
o[XA(s)]=n. Hence. for all i.je n. <t\jCs)e ([P.o(s) (i.e. a strictly proper rational
function). As a consequence by collecting the coefficients of the polynomials b .. (s) in
. .
coe ffi Clent matrices we h ave D

21

where B(s) E (f [s]nxn S.t.

21a B(s)=Bosn-l+BlSn-2+ ... +B n- 2s+B n- 1

and
n .
21b XA(s)=det(sl-A)=sn + L XiSn-l
i=1

with BiE (fnxn for all i = O.I ....... n-l. and XiE cr for all iE n.

22 Theorem. Assume that the polynomial XA(S) is known. then the coefficient
matrices Bi can be successively calculated by the formulas:

Bo=1

Proof. Multiply both sides of (21) on the right by (sl-A)' XA(S): we get

[BOs n-1 + B 1sn- 2 + ... + B n- 2s + B n- I )<SI-A)

= [sn-I+XIsn-I+X2sn-2+ ... +Xn-IS+Xn)·1.

Equating the matrix coefficients of the same powers of s we obtain (23).



Suppose that in the polynomial XA(S) we replace each power of s. say sk. by Ak
and multiply the zeroth-power term by I; then we denote the resulting matrix by
XA(A).
74

24 Theorem [Cayley-Hamilton]. For any square matrix A with elements in a field


F, XA(A)=O, where 0 denotes the zero matrix.

Proof. If we successively eliminate B n_1, B n_2 ,... in the equations (23) we obtain

0=Bn_2A2+Xn_lA+XnI

0= Bn_3A3 + Xn-2A 2 + Xn-IA + XnI

The Cayley-Hamilton theorem implies that for any n x n matrix with elements in a
field F, An is a linear combination of I,A,A2,... ,An- 1: i.e. in c: nxn, the n+l "vectors"
I,A, ... ,A n are linearly dependent.

2S Exercise. Show that for any integer p 0, A n+p is a linear combination of


2 n-l
I,A,A ,... ,A .
Let P[ .. ] denote the set of poles of the rational function between the brackets.
Recall that alA] denotes the spectrum of the matrix A. Then we have

26 Theorem [Poles of <1>(s)]. Let A E c: nxn• Then

pE c: is S.t. pEP [(sI-Ar 1]

XA(P)=O or equivalently pE alA].

Proof. We use contraposition for each implication.


=:>: Assume p rI. alA]. i.e. XA(P) t:. O. Then by (20) (pI-A)-1 E c: nxn . Therefore the
function s (Sl-A)-l is bounded at p: so p rI. P [(sl-A)-l].
<:=: Assume p rI. P [(sl-A)-l]. Therefore (sl-Ar 1 is bounded at p and so is
det[(sI-A)-I] = [det(sl-AW 1 = [:XA(S)-l]. Hence p rI. P r:X:A(Sr 1] implies XA(P) '" O. •

Let A E c: nxn, then we call

27 PA := max { IAI : AE cr[A] }

the spectral radius of A (note that PAis the maximum absolute value of the eigen-
values of A); PAis important for the following result.

30 Theorem [Expansion at infinity]. Let A E c: nxn . Then


75

32 Comments. a) Fonnula (31) shows that (sl-Af 1 is strictly proper and ana-
lytic in BC(O;PA):= ( s E (C I I s I > PA ). t
13) In [Des.3,p.27] one shows that for any matrix A E (Cnxn and for any £>0, there
exists an induced matrix nonn 11'11 such that IIAII-£ ::;; PA ::;; IIAII (i.e. the spectral
radius can be approximated arbitrarily close from above by IIAII for some induced
nonn 11·11). Hence if I s I > PA' then w.l.g. I s I > IIAII for some induced nonn IHI.

33 Proof of Theorem 30. Because of comment 13) w.l.g. we may assume that
I s I > IIAII. Now call S(s) the RHS of (31), then S(s) converges; indeed
00
IIS(s)1I Is I-I L (IIAIII S I-I )n= (I s I -IIAID- I < 00. Furthennore, (exercise),
n=O
S(s)(sI-A) == I . •

The following is an important result.

36 Fact. Consider the square matrices A and B in (Cnxn. Let XAB and XBA denote
the characteristic polynomials of AB and BA in (Cnxn. Then

37 XAB(S)= XBA(s)

or equivalently
37a det(sI-AB) = det(sI-BA) .

38 Comment. The proof below displays the advantage of a density argument based
on theorem (A.6.84) (i.e. the continuous extension of identities).

Proof. Observe that, if A is nonsingular, then (37) holds: indeed then


A-l(sI-AB)A = (sl-BA) ,

and so (37a) follows by taking the detenninant of the equation above. Now let
n .
XAB(s)=sn + L (Xi' sn-1,
i=1

n .
XBA(S) = sn + L Pi' sn-t,
i=l
hence:

t B(O;PA) denotes the closed ball ( s E (C : IsiS PA ) , hence BC is a notation


for its complement.
76

39 (37) holds ¢;> <Xj = for all i En.

Observe also that for fixed BE (Cnxn, the coefficients <Xi and depend continuously
on the entries of A, Le. for all i E n the maps

are continuous, in fact they are polynomials in the a.k's. Hence the equality of (37)
when A is nonsingular is equivalent to J
VAEM j :== (AE (Cnxn:detA ¢. O)

<Xj(A) = for all i En.

Now by Exercise (A.6.84a) M j is a set that is dense in ((:nxn. Hence by Theorem


(A.6.84) (using continuous extension) V A E (Cnxn, <Xj(A) = for all i E n.
Hence by (39) we are done. •

40 Comment. The proof above shows that for discovering a matrix property
involving equalities it pays to investigate first a simple case (here det A '¢ 0). The
same philosophy could also be used to prove the Cayley-Hamilton Theorem. Indeed
in this case it follows easily that XA(A)=O for a semi-simple matrix A (cfr. section 3
below).

41 Exercise. Let A E ((:nxn. Show that the map A XA(A) is continuous.


n .
(Hint: XA(A) = An +L Xj(A)'An- 1 ••• )
j=1

42 Comment. Since XA(A) =0 for all A E M d = (A E ((:nxn: A is semisimple)


and Md is dense in (Cnxn, (A.6.84a), XA(A)=O for all AE (Cnxn using continuous
extension.

3.2.3. The State Transition and Response Map of R


We consider any time-invariant system representation R = [A,B,C,D], (1)-(5).
Note that cl>(t,lo) is the exponential given by (3.1.12). Hence using this in (2.1.71) and
(2.1.72) we have the following state transition- and response-map:
Vt,toE JR, V'XOE (Cn, V'U(')E U

45 x(1) = s(t,to,xo,u) == e
A(t-Io)
Xo + J
10
77

z-i transition + z-s transition

J
t

46 + C + Du(t) .
to

+ p(t,to,e,u)

z-i response + z-s response.

47 Comment. Let T't be the shift operator, i.e. for every function f(·) defined on
R, T'tf is the function f(·) delayed by 't seconds according to

48 [Tl](t) = f(t-'t) '<tte lR.

It follows then from (46) that


V t,to,'t E R, VUE U

i.e. by delaying the application of Xo and of the input by 't seconds, the output is
delayed 't seconds. Note also that the same applies to the state transition map (45),
viz.

Hence under the shifting conditions above, the behavior of R is independent of the
initial time to: so w.l.g. to = O. For this reason the system representation R is called
time-invariant.

51 Exercise. Prove (49) and (50) in detail.

52 Exercise. Consider the nth order scalar linear d.e. with constant coefficients
Z(n)+ClIZ(n-I)+Cl2Z(n-2)+ ... + Cln_lz(l) + Clnz=u(t) .

Show that such an equation can always be put into the form (1) by setting
z =: Xl, z(l) =: X2, z(2) =: X3, ... ,z(n-l) =: xn and obtaining:
78

o 1 o 0 xl
0
o o X2 0

+ u(t)
1 0
0 0
-(X2 -(Xl Xn

or for short x=Ax+Enu where matrix A is said to be in row companion form


(observe the bottom row of coefficients and the superdiagonal of l's).

55 Impulse response and convolution. Consider again the time-invariant system


representation R = [A,B,C,D], (1)-(5). As mentioned in Comment (47) w.l.g. to = 0
and we may restrict ourselves to normalized state transition and response maps given
by
Vt VXOE IRn VUE U
I

56 x(t) = s(t,O,xo,u) = eAlx o + f eA(I-'t)Bu('t)dt,


o
z-i transition + z-s transition
and
I

57 y(t)=p(t,O,xo,u)= CeAlxo + JCeA{I-t)Bu('t)dHDu(t),


o
z-i response + z-s response.

Now by Exercise (2.1.101) we have introduced a normalized impulse response H(') of


R given by (2.1.108) i.e.

ceA1B + D/)(t) Vt L 0
{
58 H(t) := H(t,O) = 0 'V t < 0

where we remember that t represents here the elapsed time t-t since the application
of impulses is at 't, i.e. H(t-'t) = H(t,t) = H(t-t,O). In a similar fashion we introduce a
normalized state impulse response matrix

eAIB 'V t > 0


59 K(t) := K(t,O)=
{ 0 V O. t;
Hence by (54) and (55) and the use of convolution as in (2.1.104)
79

60

or in tenns of the Laplace transfonn on t O. with

62 K(s) := L [K(t)] = (sI-A)-1 BE €p,o(s)nxn;.

63 H(s)=L [H(t)] =C(sI-A)-IB+D E €p(S)II.,l<1\

64 xes) := L [s(t.O.xo.u] = (sI-A)-lxo + (sI-Ar l •

z-i transition + z-s transition


and
y (s) := L [p(t.O.xo.u)]

65 = C(sI-A)-lxo +

=z-i response + z-s response.

66 Exercise. Check equations (60)-(65).

67 Comments. a. In (63) the Laplace transfonn H(s) of the impulse response


H(·) of R is called the transfer function matrix of R
Most inputs used in engineering such as Dirac impulses and sums of exponentials
II
a Laplace u(')e €p(s);. In such cases. by (64) and (65).
x(') e €p,O(S)1I and yO e (Cp(S)IIo.

68 Exercise [Expansion at 00]. Consider the transfer function R(s) of R given by


(63). Show that with PA the spectral radius of A. (27).

R(s)=D+f(CAiB)'s-(i+l) V'se (C S.t. Isl>PA'


i=O

(Hint: use (63t and (31); the matrices CAiB for i=O.I,2.... are called the Markov
parameters of H(s».

3.3. Properties of R when A has a Basis of Eigenvectors


We study a system representation R = [A.B.C.D]. (3.2.1)-(3.2.5). where the com-
plex n x n matrix A is semisimple. (3.1.10). or equivalently A has a basis of n eigen-
vectors. i.e. there exists a I.i. family of n nonzero complex vectors e i such that

Aei = A.iei for i = 1.2.....n

where the A.i are eigenvalues of A.


80

2 Comments a) If ei satisfies (I) then so does !lej for any ae (!:: hence w.l.g.
we shall assume that

3 the eigenvectors ei have been normalized to constant length: e.g.


lIeill} = (ei,ei) = 1 for all i e n.
If A=I then any basis of (Cn is an eigenvector basis of (Cn: hence the eigenvec-
tors ei in (1) are not necessarily unique up to multiplication by a nonzero scalar. •

The following characterization is important.

4 Theorem. A square complex n x n matrix is semisimple if and only if there exists


a nonsingular complex n x n matrix T- 1 and a diagonal complex n x n matrix A for
which

or equivalently,

The columns ej e cr n of T- 1 organized as

and the diagonal entries A; e fr of A organized as

8 A := diagP.. 1,A,z, ... , "-nl e (Cnxn

may be taken as n eigenvectors associated according to (1) with n eigenvalues Ai of A


that form a spectral list (3.1.12). •

9 Comments. a) In other words, A e (Cnxn is semisimple if and only if A is


diagonable by a similarity transformation (i.e. a change of basis applied to (Cn; see
section A.5.2 of appendix A).
If A E fr nxn is semisimple then the eigenvalues listed in (1) are necessarily a
complete list of the n roots of XA counting multiplicities, i.e. a spectral list 0'/ [AJ,
(3.1.12).
81

y) It is standard practice in engineering that the eigenvectors of A are the columns of


T- 1.
0) If the eigenvalues Ai of A are not pairwise distinct then the corresponding eigen-
vectors in (1) are not necessarily unique up to multiplication by a nonzero scalar: see
Comment (2.f})

12 Proof of Theorem (4). Necessity. By assumption A has a l.i. family of n


eigenvectors associated with the eigenvalues A; according to (1). Hence the matrix
11, defined by (7) is nonsingular and from the equations (1) and (8) we obtain

13

Since T-! is nonsingular, (13) yields (5) and (6). Moreover by (6) and (8)
det(sI-A)=det(sI-A)='XA(s); hence (Ai J.n is a list of n roots of 'XA counting multi-
1=1
plicities.

Sufficiency. By assumption (6) holds where T-! and A are given by (7) and (8).
Therefore multiplication of (6) on the right by T-! gives (13). Finally equation (13)
yields columnwise the eigenvalue-eigenvector relations (1). So A is semisimple. •

An important result is the following.

16 Fact [Distinct eigenvalues]. Let A e (Cnxn and {Ai 1.1 c (C be any point set of
L-1
(necessarily pairwise distinct) eigenvalues. Then any fami!; of eigenvectors (e i ].11=1
such that

is linearly independent.

18 Comment. In other words, eigenvectors associated with pairwise distinct eigen-


values constitute a l.i. family.
82

).1
19 Proof of Fact (16). By contradiction. Assume that the family (ei
1-' is l.d .. t
Hence there exists an I-tuple (ai ).1 of scalars not all zero, say a, ¢ 0, that
1='
1
20 L aiei=9.
i='

Multiply now (20) from the left by (A-A2I)(A-A3I) ... (A-All). Then, with (17), we
get

where the factor multiplying e l is nonzero because a, ¢ 0 and the eigenvalues are by
assumption distinct. Hence e 1 is a zero eigenvector: --+ Therefore the family
( ei ).1 is l.i. . I
1='
An important special case of semisimplicity is the following.

23 Theorem [A simple]. Let Ae a: nxn


have n pairwise distinct eigenvalues. then it
has a basis of eigenvectors. one associated with each eigenvalue. Moreover any eigen-
vector of A is a nonzero scalar multiple of one of these n eigenvectors.

24 Comment. In other words. if A is simple, (3.1.9), then A is semisimple and


eigenvectors are unique up to multiplication by a nonzero scalar.

25 Proof of Theorem (23). The first assertion follows from Fact (16) with I=n:
indeed, any family rei ).n1='
of eigenvectors associated with n pairwise distinct eigen-
values A. according to (1) is l.i.. Now let e be any eigenvector associated. say with
AI' We1show that e=a,e, for some a, e a:. Indeed since there exists an eigenvec-
tor basis (ei J.n there exists an-tuple (ai J.n of scalars not all zero such that
1=1 1=1
n
e= L aiei. i.e.
i='
n
26 (-e+ale,) + L aiej=9.
i=2

Now assume that (-e+a,e,)e a: n is nonzero. Then, (exercise). (-e+alel) is an eigen-


vector of A associated with AI' Hence (-e+a,el.e2, ... ,en) is a family of n eigenvectors
associated with the distinct Ai for i::::l .... ,n: it is l.i. because this holds for any such

t l.d. means "linearly dependent."


83

family, Therefore in (26) ' .. =<Xn=O and (-e+ael)=9: -H-. Hence


e=alel' Q.E.D. •

The remainder of this section is devoted to a reinterpretation of key Theorem (4)


giving rise to important structural and dynamical facts by appropriate dyadic expan-
sions.

30 Fact [Dyadic Expansion]. Let A.B and C be respectively mxn nxp and mxp
matrices over a commutative ring. For k=I ..... n. let a'k and denote the successive
columns of A and rows of B. resp. Then C := AB has the expansion

31


32 Comment. Each term in the sum of (31) is a matrix product of a column by a
row: such matrix is called a dyad and (31) is called a dyadic expansion.
n
33 Proof of Fact (30). Observe that the (ij)th element of C reads Cij= L !ljkbkj'
k=1
Hence with i.j free we get
n n
C=[Cij]i.j= L L a·kbk ..
k=1 k=1

We have now the main result of this section. It uses the concepts of right- and
left eigenvectors and spectral list (see section 3.1). In (42) below

34 8ij denotes the Kronecker symbol i.e. 8ij= {


for
for
i.
I=J
* ,j}.
40 Theorem [Eigenvector dy'adic expansions]. Let A e (Cnxn be an n x n complex
semisimple matrix. Let (Ai ).n1=1
= 01 [A] be any spectral list of A. (Recall that.
(3.3.3). we normalize eigenvectors: lIeilli= 1.)
U.th.c.
i) A has a corresponding basis of right eigenvectors (ei ]:1 such that

41 Aej= Ajej for all ie n

ii) A has a corresponding basis of left eigenvectors (Tlj )j:l such that

41a

iii) The bases (ej );1 and (Tli )i:1 of right- and left-eigenvectors (of ern) are
84

mutually orthonormal, i.e.


42 (TJ;,e) := TJ; ej=Bij*
where Bij is the Kornecker symbol, (34).
iv) The unit matrix IE c: nXn has the eigenvector dyadic expansion

43 1= L e;TJ;
n *
;=1

and every x E c: n reads w.r.1. to the eigenvector basis

44 X
n
=L
*
e;TJ; x =
n
L (TJ;,x) e; =: L
n
xiei,
;=1 i=1 i=1

i.e. Xi := (TJi'x) is the ith component of x w.r.1. the basis ( e;)n .


1=1
v) The matrix A E (Cnxn has the eigenvector dyadic expansion

45 A=Ln A;e;TJ;*
;=1

and, for every x E ern, y=Ax reads w.r.1. to the eigenvector basis (e.] n
1 i=1

n * n n
46 y=Ax=L AieiTJ; x=L A; (TJi,x)e; =: L A;xiei
bl bl bl

i.e. for all i each ith component of x has been multiplied by the eigenvalue Ai' •

47 Comments a. In (44) the left eigenvectors determine the coordinates of x


W.r.t. to the right eigenvector basis (e;)n .
1=1

By (4Ia) the eigenvector dyads eiTJt satisfy (e;TJt)(ejTJt)=Bij(e;TJt) for all i,j:
they are projection matrices e.g. [Kat.l,p.21]; their sum associated with one distinct
eigenvalue is called an eigen projection, [Kat.l,p.42].
'Y. Dyadic expansions are paramount to display a basis for the solution space of the
d.e. x=Ax and to characterize the poles of a transfer function matrix: see below.
8. Note that the spectral list of eigenvalues (Ai ).n1=1
= 01 [A] and the eigenvector
bases (e; ).n1=1 and (TJ; ).n1=1 define matrices
85

49

111*

112*
50 N* := E trnxn,

11n*

where E and N* are nonsingular. Therefore equations (40)-(43) and (45) are
equivalent to respectively
40a AE=EA

41a N*A=AN* ,

42a N*E=I

43a EN*=I

45a A=EAN* ,

(for (43a) and (45a) use Fact (30». So, if we observe that equations (44) and (46) are
straightforward in view of (43) and (45) resp., then five fonnulas condense the infor-
mation of the theorem namely (48)-(50) and (45a) with

51

54 Exercise. Let the n x n complex semisimple matrix A have the eigenvector
dyadic expansion (45). Show that:
i) for A nonsinguiar,

55 A-I = t
i=!
(1-)-1 e{Tlt ,

ii) for all k ;:: 0

56 A
k
= Ln k *
(I"i) eir\j ,
i=1
86

iii) for all t E JR

57

= Ln exp(l.,it)eiTli
*
i=1

iv) L(e At ) is given by the partial fraction expansion:

58 (sI-A)-1 = ±
i=1
(s-A)-l ei11 t

(Hints: use (48)-(50), (45a) and (51); get A-1=EA-1N*, Ak=EAkN*,

exp(At)= fo k\. (At]k=Eexp(At)N*, (sI-A)-1 =E(sI-A)-IN*; then use dyadic expan-


sion).

61 Short proof of Theorem (40). Since A is by Theorem (4), for any


spectral list [Ai J.n
there exists a basis of eigenvectors ej ).n associated with the A.
1
according to (I) such that (5)-f8) holds. now that (8) is matrix A given by
(48). Moreover by setting T- =E and T=N we obtain the nonsingular matrices E,
(49), and N*, (50), such that equations (40a)-(43a) and (15a) hold by using (5) and
Tr l =r1T=1. Note now that the nonsingular matrix N in (4Ia) defines a basis of
left eigenvectors (11i ).n.
The latter and the equivalence mentioned in Comment
1=1
allow to derive all results of the statement of Theorem (40). •

65 Basis of the solution space of x=Ax. We consider the linear d.e.

66 X(I) = Ax(t) x(O) = Xo

where IE JR, x(l) E c: n , and A E {f nxn is semisimple. By (3 .. 2.56) the state trajectory
xC') is given by
67 X(I) = s(t,0,xo,8 u)= exp[At]xo

as a linear function of Xo. Hence the set {x(·): XOE c: n } is a linear space. Also,
using the eigenvector basis [ek ) n of Theorem (40), we get successively from (67)
k=l
by using (44) and linearity in Xo
87

n
x(t) = s( t.O. L (llk.Xo) ek.au)
k=l
n
= L (11k.xo) s(t,O,ek,au)
k=1

n
68 = L <11k.xo}exp[At]ek
k=l
n
=L (11k,xo) expO"kt)ek ,
k=l

where for the last equality we used (57) and the mutual orthonorrnality condition (42).
Hence, by equation (68). if we think of the initial state as a linear combination of
eigenvectors. then the resulting motion of the state is a linear combination of very sim-
ple motions s(t,O.ek,au) = exp(Akt)ek. These motions are called modes and form a
basis of the linear space of solutions of it = Ax. Observe that in (68). i)
(11k'xo) =: Xok E .r is the kth coordinate of the initial state w.r.1. to the eigenvector
basis and measures the excitation of the kth-mode by the initial state Xo and ii) if
xO=ek then x(t)=exp(Akt)ek for all t meaning that. for all t. the state remains on
Sp(ek) = (cxek : cx E (C). i.e. the axis supported by ek .

*70 Real algebra. Consider now the case where A E R nxn is semisimpie and
n
XE lR is a real vector. By exercise (3.1.13) eigenvalues and corresponding eigenvec-
tors have the property of complex conjugate symmetry. 111erefore in Theorem (40)
we may assume that the eigenvalue-eigenvector triples (Ak,ek,11k) i) occur in complex
conjugate pairs (Ak,ek.llk).(Ak.ek,11k) for k=I,. ..• m and ii) are real for k=2m+ 1,...• n. As
a consequence for any x E lRn, for any k=I, ... ,m, complfx conjugate contributions in
(44) and (46) generate real contributions by the equations

71 (11k.x)ek + (11k.x) ek = 2 Re«11k.x )ek)

while. for k=2m+ 1,...• n. the contributions (11k.x}ek and A.k (11k.x}ek resp. are real.
Therefore for k=l •...• m expressions (71)-(72) will have real parameters if we introduce
real and imaginary parts according to:
A.k = CXk + jOlJ< Ak = Uk - jOlk

F1
t rn z= (zl.q •...• Zn) E .r n• Re(z) denotes the vector in R n with components
( Rezi .
i=l
88

73

So by (71)-(73), (44) and (46) read:


i) For any x E lRn
m n
74 x= L [xkrekr + xkieki] + L xkek
k-=I k=2m+1

where, for k=l, ... ,m, xkr=2{l1kr,x) and Xki=2(l1ki,X) are real and, for k=2m+l, ... ,n,
xk = (l1k'X) is real;
ii) For any x E IRn
m n
75 y = Ax = L [Ykrekr + Ykieki] + L Ykek
k=1 k=2m+1

where. for k=l •...• m

and for k=2m+ I ,... ,n

Note especially that for k=l •...• m the l.i. pairs of nonreal eigenvectors (ek,ek) and
(l1k,l1k) have been replaced by the l.i. real pairs (ekr,eki) and (l1kr.11ki) resp., where, by
considering real spans of the form ckek + c;;- ek and dk11 k + d k11k for complex
coefficients c k and d k ,

= Sp(ekr,eki)'
and

Recall now that for ken the ek and the 11k are l.i. Therefore IRn has now two bases
of n real vectors namely

76 ( ekr,eki J:1' (ek J:2m+1


and

which. by (42), (i) satisfy, for identical subscripts k,


89

and (ii) are mutually orthogonal for distinct subscripts. Hence (74) and (75) are the
unique representations of XE R n and y=AXE IRn w.r.1. to the real basis (76); more-
over for k=l,oo.,m with Ak=uk + jrok for k m, we see that
for x = ekr we have y = Aekr = Ukekr - rokeki '

and

we have

So to any complex conjugate eigenvalue pair (Ak,Ak) there exists a subspace neces-
sarily oj dimension two, viz. Sp(ekr,eki) such that its image under A is contained in it;
moreover the same will happen to its image under AI for any I and under exp[At]
for any t E R. Hence, for Xo E IRn and k E ill, the explicit writing of the complex con-
jugate modes of (68), viz.

77 exp(Akt) (llk,xo)ek + exp(Akt) (rlk,xo) ek

leads, by using (71)-(73), to the following unique representation of any solution of


x= Ax W.r.t. to the real basis (76): for any Xo E IRn

x(t) = s(t,O,xo,e u ) = exp[At]xo


m n
78 =L [xkr(t)ekr + xki(t)eki] + L xk(t)ek,
k;\ k;2m+\

where, for k=l,oo.,m

and, for k=2m+l,oo.,n

with
90

Moreover, to any nonreal eigenvalue pair O"k'X;;) there exists a two-dimensional sub-
space Sp(ekr,eki) such that if the initial state Xo lies in this subspace, then so does the
z-i state trajectory x(·): this trajectory is an exponential spiral; see Fig. 3.1 for the
case <Xk<O.

*79 Special case: XOE Sp[ek,ek)' As before let AE R nxn , be semisimple and let the
notations (73) hold. As above let XOE 1Rn but now let Xo E Sp[ek,ek)'
Since XOE Rnc ten. xo=xo. and since in addition XOE Sp[ek.ek) we have

80

for some complex number ck . The solution x(t) becomes successively

x(t) = exp(At) . Xo

Consider the special case where Xo = ekr. the real part of the eigenvector ek ; then (80)
shows that ck is real and. in fact, ck =0.5. Then (81) reduces to
x(t) = eUkl cos (Ilk! ekr"'" eUkl eki .

By inspection, x(O)=eki=xO as it should.

83 Poles of the transfer function. The transfer function matrix R(s) of any system

Fig. 3.1. An exponential spiral generated by complex conjugate modes.


91

representation R = [A,B,C,D), (3.2.1)-(3.2.5) is given by (3.2.63): it is the proper


rational matrix

with poles necessarily in the spectrum of A (see (3.2.26». Assume now that A is
semi simple with spectral list cr/[A)= (Ai J.n1=1 .
Then, by Theorem (40) and (58), (sI-

Af 1 has the eigenvector dyadic expansion (sI-A)-1 = ±


H
(S-Ai)-lej11*j' giving

85

H(s) = Ln *
(S-Ajr 1 Cej'11 jB + D.
j=l

Note that Cej . 11*jB is a dyad with column Cei and row 11*jB. Hence a pole contribu-
tion at A; disappears iff the corresponding dyad is zero or equivalently Cej = e or
* . .
11*jB = e . Hence we have with P [H(s)]. the set of poles of H(s), the theorem.
. .
86 Theorem [poles of H(s)]. Let H(s) be the transfer function matrix of a system
representation R == [A,B,C,D] with AE (tnxn semisimple and spectral list
( I..; J.n1=1 =cr/[A].
V.t.c.
pE P[H(s)]

if and only if
i) there exists an eigenvalue Aj E 0/ [A] S.t. p== \ '
ii) Cej *' e,
...
111
) *B
11j e*
*' ,
where e i and 11i are right- and left-eigenvectors of A corresponding to \ in the dyadic
expansion (85). I

89 Exercise. [Response map]. Consider any system representation R = [A,B,C,D]


with A semisimple having an eigenvector dyadic expansion (45). Show that, for any
XOE (Cn and any U(')E U, the response for t is given by

yet) = p(t,O,xo,u)

90

(Hint: use (3.2.57) and (57».

91 Comment. Clearly in (85) and (90) the nonzero vectors Ce i and 11*jB represent
the strength of the coupling of the ith-mode with the output and the input resp. Note
that Ce i depends on the location and the sensitivity of the sensors while l,*jB depends
92

on the location and the strength of the actuators.

92 Exercise [State transition]. Given R where A is semisimple with dyadic expan-


sion (45), (a) show that
x(t) = s(t,O,xo,u) =

93a ±
j=1
ej exp(l"jt)· [<Tli,X O) + J e -A,'t <B* TJj,u(t» d't
0
1
(b) Let xo=8n at t=O- and, for some P=(Pl,P2, ... Pn i)E CI: n" let u(t) = p6(t) (i.e. for
k E llj, the kth-scalar input is an impulse of area Pk applied at t=O): show that

93b

Remarks. a.) In (93b), (B*TJj,p) measures the coupling between the impulsive
vector input p6(t) and the ith-mode; in particular, if (B*TJj,p)=O, then the ith-mode is
not excited by that particular input.
*
If B TJj = 8 ni , then, by Eq. (93a), we see that no input can excite the ith-
mode, i.e. the actuators are not coupled to the ith-mode.
r) If Cej = 8110 , then Eq. (90) shows that the jth-mode does not contribute to the
output, i.e. the sensors are not coupled to the ith-mode.

95 Numerical considerations and fast modes. We condense our ideas in an exercise.

96 Exercise. Let A E JR2x2 be simple S.t. A = 11 AT =: E A N* with eigenvalues


Al = -1 00 and "-2 = -1 and a matrix of normalized eigenvectors

cos Ijl]
sin Ijl .

See Fig. 3.2 where we observe that lied I = IIe21i = 1.

Note that

[o -cotg Ijl ]
(sin <1»-1
93

II
e1
Fig. 3.2 The angle between the normalized eigenvectors el and e2 is III.

(a) Show that the t volume of the parallelepiped based on e l and e2 , is given by
97 OS; Idet.11 S; I,

where for c!> variable the following minimum and maximum occur

Idel.11 =0 (ei J: is an l.d. family

Idet. 1 I = 1 (e;]: is an orthonormal family

(b) Show that 'v'i

with the following minimum and maximum

d=O (ej ); is an orthonormal family,

d=oo (e; is an l.d. family.

99 Comment: The results in (a) and (b) generalize in (CD. In fact if det. 1 ::: I,
then T- 1 is almost unitary, the e· are almost orthonormal and Vi IIllill::: I; if
det. 1 ::: 0, then T- 1 is almost the ei are almost linearly dependent and :3
i s.t. Illlili is very large. In order to calculate precisely, in truncated arithmetic, the
similarity transformation A = TA.1, it is required that T- 1 be almost unitary; if not,

t Of course for n=2 volume should read "area" and parallelepiped "parallelogram,"
resp.
94

many significant figures may be lost, (see e.g. [Gol.l ,p. 197-198]). A similar waming
applies also to the solution of the equation Ax=b.
(c) In order to illustrate Comment (99) above, consider the matrix A defined above
and let cp=1t-£, with £ small; more precisely £2 is negligible.
i) Calculate Iderrll and the condition number of A; sketch the vectors el' e 2,
11 1> 112'
ii) Consider the equation Ax=b and a perturbation in b: hence the resulting
perturbation in x is = A-I
Let Bb=/}·th/ll11dl, with /}>O small. Show that Bx is very large, in fact

(Hint: use (55».

100 Comment. Though 118bll = /} > 0 is small, 118xll is huge because of the almost
linear dependence of e 1 and e 2 and the disparity of the eigenvalues: A is not well con-
ditioned.

(d) For the same A as in pan (c) let xo=TtI/IlTtdl.


i) Obtain an exact expression for

x(t) = exp[At] . Xo.

ii) Show that for t 0.05

x(t) = -£I . exp(-t)· e2'


(Hint: use (57».

101 Comment. The fast mode exp(-lOOt) disappears almost immediately and for t
not large, IIx(t)ll > > Ilxoll = I; so here IIx(t)ll increases first: this shows that even for
real negative eigenvalues IIx(t)II may increase considerably above IIxoil before even-
tually decreasing exponentially to zero.
CHAPTER 3d

THE DISCRETE.TIME SYSTEM REPRESENTATION Rd=[A,B,C,D]

This chapter develops concisely the general properties of the time-invariant


representation R d = [A,B,C,D] and then indicates some specific results for the case that
Rd has a state space basis of eigenvectors; if the latter does not apply specific formulas
can be found in Chapter 4.
As done earlier the analogy between the continuous-time- and the discrete-time
case will be stressed by using identical reference number for the main properties of
Rd and R.

3d.1 Preliminaries
In the sequel the notations of Section 3.1 apply with the exception that the
transform variable s has to be replaced by z; e.g. R[z], ( tr [z]), denotes the ring of
polynomials in z with real, (complex, resp.), coefficients, etc.

3d.2 General Properties of Rd


Definition of Rd' We consider system representations of the form
1 x(k+ 1) = Ax(k) + B u(k)
kE N
2 y(k) = Cx(k) + Du(k)
where
the state x(k) E cr n, the input u(k) E cr ni , and the output y(k) E cr 110;
A,B,C,D are constant complex matrices of dimension n x n, nxni' noxn, nOxni' resp.
For short we refer to (1)-(2) as the system representation Rd=[A,B,C,D].

Remarks. I. It is understood that the recursion equation (l) is run forward: since
A may be singular (e.g. "deadbeat" case), x(k+ 1) and u(k) do not always uniquely
define x(k). However, if A is nonsingular then (1) can also be run backwards and it is
better to define Rd on all integers i.e. on the set Z, (instead of the "half-line" of
integers N).
II. The system representation R d = [A,B,C,D] is a special case of the representation
R d(') = [A('),B('),C('),D(-)] where A('),B(-),C(-),D(-) are constant sequences. For this
reason R d = [A,B,C,D] is called a time-invariant representation: see (47) below.

The State Transition Matrix.


By (2d.1.40) with A(k) == A

10 <1> (k,ko) = Ak-ko = <1> (k-ko,Q)

where V k,l e N

F. M. Callier et al., Linear System Theory


© Springer-Verlag New York, Inc. 1991
96

In contrast to the continuous-time case one has easily the following theorem.

12 Theorem. Consider the recursion equation x(k+l) = Ax(k) and let No=N[A Illa]
be the algebraic eigenspace of A e cr nxn associated with its eigenvalue A=O, [see
(4.3.3) below]. Then for all x(ko)e No we have for kl := ko+mo where mo :s; n

x(k t )=6

13 Comment. If AO= 0 is an eigenvalue then there is a subspace of the state space


S.t. reaches the zero equilibrium solution in a finite amount of time.
In the continuous-time case, i.e. x(t) = exp[A(t-to)), this never happens!

15 Exercise. Show that if

At 0
A= [ 0
1, then Ak=
[Ar
0
0
AJ< 1
;

k!
where Cmk := --..;;;;.;--
m!(k-m)! .

A k by the z-transform
_ Taking the z-transform of <I> (k+ 1,0) = A<I> (k,O) with <I> (0,0) = I and denoting
<l>(z) := Z[<I> (k,O)] we have by (D2.S)

IS
- -
z [<I>(z)-I] = A<I>(z) .

Hence
19 q)(z) = Z [A k] = z(zl - Af! E G::p(z)" X" •

Comment. The only yifference between (19) and (3.2.19) is the matrix factor zl,
which multiplies (zI-Af . Hence using similar results of Section 3.2, we have the fol-
lowing.

20
-
Exercise [Computing <l>(z)J:
-
Show that <l>(z) given by (19) reads
97

where XA(Z):= det(zI-A) is the characteristic polynomial of A and

rrbij(Z)]. .
I.JE D
=: B(z) E (Cnxn[z] is an n x n polynomial matrix reading

s.t. with
n .
2Ib XA(Z)=Zn + L XjZn-l,
j;l

the matrix coefficients Bi satisfy

Bo=I

24 Remark. If the only eigenvalue of A is A=O (deadbeat case), then XA(z)=zn


and Alc=O 'r;J k n as was to be expected.
-
26 Exercise [Poles of cl>(z) at z "# 0]. Show that, with A E (Cnxn,
pE ([ with P "# 0 is S.t. pEP [z(zI-Arl]

<=> XA(P)=O, or equivalently, pE alA] .



For the case p = 0 we propose the following.

27 Exercise. Let q,(z) = z(zI-A)-1 and A "# O. Show that

_ [Z(Z-A)-I 0
27a if then cl>(z) = 0 l;

27b
OIl -
if A= [ 0 0 ' then cl>(z)=
[1 1
0
z-I
1 .

Exercise (27) indicates the following.

28 Fact [Poles of q,(z) at z=O]. Let No=N[AmO] be the algebraic eigenspace of


AE ([nxn at its eigenvalue 1..=0, [see (4.3.3) below]. Then
98

<=> XA(O)=0 and No=N[Arno] with 1110> 1 .

29 Remark [see Chapter 4]. Note in particular that rnO is the size of the largest Jor-
dan block associated with the eigenvalue 1..=0. lienee, if A is semisimple (i.e.
diagonable), then 1110 ::;; I and 0 is not a pole of <I>(z), [compare (pa) and (27b»).
Moreover if A is nonsingular then mo=O and 0 is not a pole, indeed <1>(0)=0.

30 Exercise [Expansion at infinity]. Let AE (Coxo with spectral radius


PA := max ( 11..1: AE a[A] ). Then
31 q,(z)=z(zI-A)-I=I+Az- 1+A2z-2 + ... +Akz-k + ... for all ze (C S.t. Izl >PA

32 Remark. Expanding at infinity is the same as taking the inverse z-transform:


Z-l[q,(z)]= (Ak ); as was to be expected.

The State Transition and Response Map of Rd


From (2d.1.71) and (10) with A,B,C,D constant we have

\ixOE (Co, \iuO= (U(k»)O', \ik,koE N with k>ko

k-l
45 x(k) = s(k,ko,xo,u) = Ak-koxO + L Ak-k'-IBu(k')
k';ko
and
k-ko k-l ,
46 y(k)=p(k,ko,xo,u)=CA xo+ L CAk-k-1Bu(k')+Du(k).
k'=ko

47 Comment. Let lEN and define the delay t operator T/ S.t. for every sequence
fO:= )0', T/ f is the sequence f(-) delayed by I sampling periods, more precisely

f(k-l) for k I
{
48 [T/f](k)= e for 0::;; k < I .

Hence from (45) and (46),

t By defining R d, (1 )-(2), upon Z , it is possible to use a shift operator with


IE Z.
99

49 s(k+/ ,ko+/ ,xo,T/ u) = s(k,ko,xo,u)

50 p(k+/ ,ko+/ ,xo,T/ u) = p(k,ko,xo,u) ,

i.e. by delaying the application of Xo and the input I sampling periods, the state and
the output of R d are delayed by the same amount.
Therefore, under the shift operations above the behavior of R d is independent of
°
ko' so w.l.g. k = 0 . For this reason the system representation R d is called time-
invariant and the state transition- and response-maps are normalized to ko=O.

51 Exercise. Prove (49) and (50) in detail.

55 Impulse response and convolution.


From (2d.1.96) and (10) the normalized impulse response of Rd is given by

CAk_1B for k 0
58 H(k) := H(k,O) = { DO for k=0
for k < 0

Hence by (46) the normalized z-s response reads:


'It u(·)= (U(k»): and 'It k
k
59 y(k) = p(k,O,e,u) = H(k-k')u(k') = (H*u)(k)
k'=O

where the last expression means the convolution of by rUCk) Therefore,


with the transfer function matrix of R d defined by

60
-
H(z) := Z[H(k)] ,

and the z-s response has a z-transform:


-
62 y(z) = Z[p(k,O,e,u)] = H(z)u(z) .

63 Exercise [State- and response z-transform]. Show that the normalized state
transition- and response maps of R d have a z-transform:
'ltXOE (En, 'ltu(-)= 'ltk
100

64 x(z) = Z[s(k,D,xo,u)]

= z(zI-A)-1 Xo + (zI-A)-1 Bii(z)


65 yez) =Z[p(k,D,xo,u)]

= Cz(zI-A)-lxO+ I{z)u(z)
= Cz(zI-A)-lxo + [C(zI-ArlB+D]ii(z)
-
where H(z) is the transfer function of Rd'

3d.3 Properties of Rd when A has a Basis of Eigenvectors
As in section 3.3 we assume that A E er nxn has a basis of eigenvectors, i.e. is
semisimple. Hence as in Theorem (3.3.40) A has a dyadic expansion

45

where rAi),n = adA] is a spectral list of eigenvalues of A and rei).n


t 1=1 r 1=1
and [Tli).n
1=1
are correspondinf bases of right- and left-eigenvectors that are mutually orthornonnal
(Le. (Tli,ej):= Tli ej=Bij for all ij En). Hence the following specific results for
Rd=[A,B,C,D], the proof of which is left to the reader.

65 Basis of the solution space of x(k+l) =Ax(k).


'It XOEern, 'It k
n
68 x(k) =<1> (k,O)xo= Akxo= L (TIl ,xo> A/kel .
1=1

Hence the solution of x(k+l)=Ax(k) on N has a basis [A/ke/l: 1 (right eigenvectors)


in which every solution is represented by a unique coefficient vector [<Tl/,xo ») n
1=1
(left
eigenvectors).

70 Real algebra. We assume that A E IRnxn is semisimple and consider

71 x(k+ 1) = Ax(k) kE N,

where x(k) E R n for a real initial state x E IRn at k=O in the span of two (nonreal) com-
plex conjugate eigenvectors e/ and e;
of A corresponding to a pair of (nonreal) com-
plex conjugate eigenvalue-eigenvector triples (A/,e/,'Il/) and (A/,e;,Tl/)' Introducing for
AI E fr a polar representation
101

72

and for e, and 111 real and imaginary parts

73 111 =11lr+ j111i e R n + jlRn.

we have for the initial state

74 x E Sp(e,.e,) = Sp(e/r.e/i) E IRn


through
X= 2 Re( (11I.x»

Hence for the corresponding real solution x(k) of (71) at time k

x(k) = 2 Re (11/'x)'A/x)

where the pair of real coefficients xr(k) and xi(k) are given by

[
Xr(k) ] [ cos kcll, sin kcll, ] [2(rll r'xO) ]
xj(k) = P/k -sin kcll, cos kcll, 2(11/i.xo)
with

[ Pp' pPI sin ]k = P/ k [ sin ].


- 1 sm 'f'1 1 cos 'f'1 -sm 'f'1 cos 'f'1

Note that V k. x(k) e Sp(e/pe/i). a two-dimensional real subspace in which its tra-
jectory is an exponential spiral.

83 Poles of the transfer function. Note that. with H(z) the transfer function of R d
-
given by (61). then by the dyadic expansion (45)

85 H(z) = C(zI-A)-IB+D = L
- n
(z-'A.j)-ICej '11j B+D *
i-l

Hence. as in Theorem (3.3.86). we have the following.

86 Theorem [Poles of H(z)].


-
Let H(z) be the transfer function matrix of system
102

representation R d= [A,B,C,D] given by (61). Let A be semisimple with spectral list


[ "-i ].n1=1 = (II [A].
V.t.c.
pE P[H(z)]
if and only if
i) there exists an eigenvalue "-i E (II [A] S.t. p= "-i
ii)
iii)
where e i and llj are the right- and the left-eigenvectors of A corresponding to "-i. •

89 Exercise [Response Map]. Consider any system representation R d= [A,B,C,D]


with A semisimple havi?g dyadic expansion (45). Show that, for any
Xo E (l: n and any u(-) = LuCk) Jo ' the response for k > 0 is given by

y(k) = p(k,O,xo,u)
n n k-l
90 =1: Cej(TJj,xo))"jk + 1: Cej' 1: Ajk-k'-I(B*TJj,u(k'»+Du(k).
j=1 j=l](=O

J:,
92 Exercise [State transition]. Given R d = [A,B,C,D] where A is semisimple with
dyadic expansion (45). Show that for any XOE ffn and any u(·)= (U(k) the state
for k > 0 is given by

x(k) = s(k,O,xo,u)

93


CHAPTER 4

THE SYSTEM REPRESENTATION R;;;. [A,B,C,D], Part II

This chapter develops the main properties of the linear time-invariant representa-
tion R = [A,B,C,D] when the matrix A is general.
The main topic is the representation of a linear map A and its consequences. Our
approach will be mainly geometric and certain details will be omitted. A key tool here
is matrix representation theory and especially the effects of a change of basis: see sec-
tion A.5. of Appendix A. Some key references are [Kat.l,I.3 and 1.5] for theory,
[Nob.l,chs.8 and 10] for algebraic formulations and [001.1], [001.2] for numerical
aspects. The chapter concludes by discussing the function of a matrix, the spectral
mapping theorem and a discussion of the linear map X AX + XB. The discrete-time
results are implicitly present throughout the chapter.
We start by giving some preliminaries.

4.1. Preliminaries
We introduce the notions of A-invariant subspace, direct sum of subspaces and
nilpotent matrix.

1 Invariant subspaces. Let (V,F) be a linear space over the field F s.t. dim V = n. Let M be a subspace of (V,F) and let A : (V,F) → (V,F) be a linear map. The subspace M is said to be A-invariant iff

2  x ∈ M ⇒ Ax ∈ M.

3 Comments. α) Let A[M] denote the image of M under A. Equivalent definitions are: M is A-invariant iff

A[M] ⊂ M,

or equivalently, restricting the domain of A to M,

A : M → M.

β) The definition does not depend on the choice of basis [b_i]_{i=1}^n of V. Any such basis generates a unique vector representation ξ ∈ Fⁿ for x by

4  x = Σ_{i=1}^n ξ_i b_i,

and a matrix representation A = (a_ij) ∈ F^{n×n} for A by


5  A b_j = Σ_{i=1}^n a_ij b_i  for all j ∈ n̲,

(see Section A.5.1). The vectors x in M are then described in terms of their representations ξ ∈ Fⁿ, i.e. by the subspace M̃ := {ξ ∈ Fⁿ : x ∈ M}, and M is A-invariant if and only if

6  ξ ∈ M̃ ⇒ Aξ ∈ M̃.

It is then customary to say that M̃ is A-invariant.

γ) A_M, i.e. the restriction A : M → M, has a square matrix representation A_M whose size does not exceed that of A (see Theorem (19) below). A_M is called the part of A in M, [Kat.1,p.22].

7 Example. (V,F) is represented by ℂ² by the basis [ε_i]_{i=1}^2. Let the linear map A be represented by an upper triangular matrix, e.g.

A = [ 1  1 ]
    [ 0  2 ];

then the subspace M := Sp[ε₁] is A-invariant.

8 Exercise. Let A : (V,F) → (V,F) be a linear map.

i) Show that N(A) and R(A) are A-invariant subspaces.
ii) Let λ_i be an eigenvalue of A, i.e. ∃ a nonzero x ∈ V s.t. Ax = λ_i x; show that N(A−λ_iI) is A-invariant.
iii) Let p(·) be any polynomial with coefficients in F, say, p(s) = s^k + α₁s^{k−1} + ... + α_{k−1}s + α_k. Define p(A) by

p(A) = A^k + α₁A^{k−1} + ... + α_{k−1}A + α_k I.

Show that N(p(A)) is A-invariant.

9 Exercise. Let the subspaces M₁ and M₂ be A-invariant. Let M₁+M₂ := {x ∈ V : x = x₁+x₂, x_i ∈ M_i for i = 1,2}. Show that the subspaces M₁∩M₂ and M₁+M₂ are A-invariant.

10 Remark. M₁ ∪ M₂ is usually not a subspace. Note that M₁+M₂ is the smallest subspace containing M₁ ∪ M₂.

12 Direct sum of subspaces. Let (V,F) be a linear space. Let M₁,M₂,...,M_k be subspaces of (V,F). We say that V is the direct sum of M₁,M₂,...,M_k (and we write V = M₁ ⊕ M₂ ⊕ ... ⊕ M_k =: ⊕_{i=1}^k M_i) iff, for all x ∈ V, there exists a unique representation of x as

13  x = x₁ + x₂ + ... + x_k = Σ_{i=1}^k x_i,

where x_i ∈ M_i for i = 1,2,...,k. (See Fig. 4.1.)

14 Example. Let (V,ℝ) be the linear space of all periodic functions from ℝ into ℝ of the form

f(t) = Σ_{k=−N}^N a_k e^{jkt},

where a_{−k} = ā_k, so that f(t) ∈ ℝ. Let M_e (M_o) be the subspace of all even (odd) periodic functions in V. Then V = M_e ⊕ M_o. Indeed, for any f ∈ V, we have the unique decomposition f = f_e + f_o, where f_e(t) := ½(f(t)+f(−t)); f_o(t) := ½(f(t)−f(−t)).

15 Exercise. Let A ∈ ℂ^{n×n}. Let A be simple, i.e. A has n distinct eigenvalues. Show that

ℂⁿ = ⊕_{i=1}^n N(A−λ_iI).

16 Exercise. Let M₁ and M₂ be subspaces of (V,F). Show that V = M₁ ⊕ M₂ if and only if V = M₁+M₂ and M₁ ∩ M₂ = {θ}.

We now use the concepts of direct sum and A-invariance to prove an extremely useful theorem.

Fig. 4.1. These illustrations show that the direct sum decomposition of V = ℝ² is not unique.
19 Theorem [Second Representation Theorem]. Let (V,F) be a linear space with dim V = n. Let V = M₁ ⊕ M₂ with dim M₁ = k (hence dim M₂ = n−k). Let A : (V,F) → (V,F) be a linear map.
U.t.c.
i) If M₁ is A-invariant, then V has a basis w.r.t. which the map A has a matrix representation A ∈ F^{n×n} of the form

          k    n−k
20  A = [ A₁₁  A₁₂ ]  k
        [ 0    A₂₂ ]  n−k

ii) If both M₁ and M₂ are A-invariant, then V has a basis such that A has a matrix representation A ∈ F^{n×n} of the form

          k    n−k
21  A = [ A₁₁  0   ]  k
        [ 0    A₂₂ ]  n−k

22 Comments. α) The first representation theorem is Matrix Representation Theorem (A.5.3) of Appendix A.
β) In (20) block A₂₁ is zero because M₁ is A-invariant.
γ) If M₁ is A-invariant, then there does not always exist an A-invariant subspace M₂ s.t. V = M₁ ⊕ M₂; e.g. take

A = [ 0  1 ]
    [ 0  0 ]

and M₁ = Sp[e₁]: the only one-dimensional A-invariant subspace is Sp[e₁] itself.

Proof of Theorem (19). i) Let [b_j]_{j=1}^k and [b_j]_{j=k+1}^n be bases for M₁ and M₂, respectively. Now by Exercise (16), since V = M₁ + M₂ and M₁ ∩ M₂ = {θ}, we have V = M₁ ⊕ M₂; hence [b_j]_{j=1}^n is a basis for V and any x in V has the unique representation (4); moreover, A has a matrix representation A = (a_ij) dictated by (5). Now ∀ j = 1,2,...,k, b_j ∈ M₁, which is A-invariant; hence A b_j ∈ M₁ with basis [b_i]_{i=1}^k. Therefore, by (5), ∀ j ∈ k̲

A b_j = Σ_{i=1}^k a_ij b_i,

i.e. a_ij = 0 ∀ i = k+1,...,n, ∀ j = 1,2,...,k. Hence one has the matrix representation (20).
ii) follows by applying the argument of i) to M₁ and to M₂. •

22a Remarks. α) In (20), A₁₁ is the matrix representation of the part of A in M₁. In (21), A₁₁ and A₂₂ are matrix representations of the parts of A in M₁ and in M₂, resp.
β) The representation (21) will be blockdiagonal irrespective of what bases are chosen in M₁ and M₂.
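A quick numerical illustration of Theorem (19) (ours; the matrices are hypothetical): with a basis whose first k vectors span an A-invariant subspace M₁, the representation is block upper triangular, i.e. A₂₁ = 0 as in (20):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])            # Sp(e1, e2) is A-invariant, so k = 2
Q = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])            # columns: new basis; first two span M1
Abar = np.linalg.inv(Q) @ A @ Q            # representation w.r.t. the new basis
assert np.allclose(Abar[2:, :2], 0.0)      # block A21 vanishes, as in (20)
```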

23 Nilpotent Matrices. Let M ∈ ℂ^{n×n} be a square matrix. We say that M is nilpotent of index μ iff there exists a positive integer m s.t. M^m = 0 and

μ = min{m ∈ ℕ : m ≥ 1, M^m = 0}.

24 Comment. It will turn out that any square matrix A ∈ ℂ^{n×n} is uniquely decomposable as the sum of a semisimple matrix and a nilpotent one. (See Theorem (4.4.14) below.)

25 Exercise. Let M ∈ ℂ^{n×n} be nilpotent. Show that all eigenvalues of M are zero. (Hint: first prove that if λ is an eigenvalue of M, then λ^m is an eigenvalue of M^m.)

26 Exercise. Show that the matrices M₁ and M₂ in ℂ^{3×3} given by

M₁ = [ 0  1  0 ]         M₂ = [ 0  1  1 ]
     [ 0  0  1 ]   and        [ 0  0  1 ]
     [ 0  0  0 ]              [ 0  0  0 ]

are nilpotent of index 3.

(Note that, if the matrix M₁ is n×n, then its index is n.)
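A short sketch (ours) of Definition (23): the index is found by taking successive powers; for the n×n single-superdiagonal shift matrix the index is n.

```python
import numpy as np

def nilpotency_index(M, tol=1e-12):
    """Return min m >= 1 with M^m = 0, or None if M is not nilpotent."""
    P = np.eye(M.shape[0])
    for m in range(1, M.shape[0] + 1):     # a nilpotent n x n matrix has index <= n
        P = P @ M
        if np.max(np.abs(P)) < tol:
            return m
    return None

M1 = np.diag(np.ones(2), k=1)              # the 3 x 3 shift matrix of Exercise (26)
assert nilpotency_index(M1) == 3
```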

4.2. Minimal Polynomial

1 χ_A(s) = det(sI−A) is, by definition, the characteristic polynomial of the matrix A ∈ ℂ^{n×n}. By the Cayley-Hamilton Theorem (3.2.24), χ_A(A) = 0. Calling any polynomial p(·) ∈ ℂ[s] s.t. p(A) = 0 an annihilating polynomial, we see that χ_A is an annihilating polynomial. Now there may exist annihilating polynomials p of lesser degree than χ_A. To see this, take

i)  A₁ = [ λ₁  0   0  ]
         [ 0   λ₁  0  ],   χ_A(s) = (s−λ₁)²(s−λ₂);
         [ 0   0   λ₂ ]

here the polynomial (s−λ₁)(s−λ₂) of degree 2 already annihilates A₁.

2 Definition. Given any matrix A ∈ ℂ^{n×n}, we call minimal polynomial of A the annihilating polynomial ψ_A of least degree (normalized so that it is monic, i.e. the leading coefficient is 1).

3 Notation. Let A ∈ ℂ^{n×n}. As a consequence, the characteristic polynomial χ_A(s) will have, counting multiplicities, n zeros in ℂ. The distinct eigenvalues of A will be denoted by λ₁,λ₂,...,λ_σ; hence λ_i ≠ λ_j, ∀ i ≠ j, i,j ∈ σ̲. Thus χ_A(s) can be written as

4  χ_A(s) = Π_{k=1}^σ (s−λ_k)^{d_k},

where

5  d_k ∈ ℕ s.t. d_k ≥ 1 for k = 1,2,...,σ

and

6  d₁ + d₂ + ... + d_σ = n,

i.e. the positive integer d_k is the multiplicity of the zero λ_k as a zero of the characteristic polynomial χ_A(·): if A has repeated eigenvalues, then for some k, d_k > 1 and σ < n. If, on the other hand, d_k = 1 for all k, then σ = n and A is simple.
Theorem (14) below shows that the minimal polynomial ψ_A(s) has the factored form

7  ψ_A(s) = Π_{k=1}^σ (s−λ_k)^{m_k},

where the multiplicities m_k of the zeros λ_k satisfy

8  m_k ∈ ℕ s.t. 1 ≤ m_k ≤ d_k for k = 1,2,...,σ

and

9  δ[ψ_A] = m₁ + m₂ + ... + m_σ ≤ n.

We denote by ψ_k(s) the polynomial

10  ψ_k(s) := ψ_A(s)/(s−λ_k)^{m_k} = Π_{l=1, l≠k}^σ (s−λ_l)^{m_l},

and by N_k the subspace

11  N_k := N([A−λ_kI]^{m_k}).

According to Exercise (4.1.8), N_k is an A-invariant subspace of ℂⁿ.

The notations d_k, m_k, χ_A, ψ_A, ψ_k, and N_k will be used without comment in the following developments.

14 Theorem. Let A ∈ ℂ^{n×n} with characteristic polynomial χ_A(·) given by (4)-(6). Let ψ_A(·) be the minimal polynomial of A.
U.t.c.
i) χ_A(·) is a multiple of ψ_A(·);
ii) the relations (7)-(9) hold.

Proof. 1) Let λ_k be any eigenvalue of A. We claim that λ_k is a zero of ψ_A(·); therefore its multiplicity m_k is s.t. m_k ≥ 1.
Indeed there exists a nonzero eigenvector e_k ∈ ℂⁿ s.t. Ae_k = λ_k e_k. Hence, with δ the degree of ψ_A(·),

ψ_A(A)e_k = (A^δ + Σ_{i=1}^δ c_i A^{δ−i}) e_k = (λ_k^δ + Σ_{i=1}^δ c_i λ_k^{δ−i}) e_k = ψ_A(λ_k) e_k = θ,

where the c_i are the complex coefficients of ψ_A(·). Therefore, for all k, ψ_A(λ_k) = 0.
2) We claim that for all k the multiplicity m_k of the zero λ_k satisfies m_k ≤ d_k; i.e. the polynomial ψ_A is a divisor of the polynomial χ_A.
Indeed the minimal polynomial ψ_A has necessarily a degree δ[ψ_A] which does not exceed the degree δ[χ_A] of χ_A (an annihilating polynomial). Hence, using Euclidean division, there exist polynomials q and r in ℂ[s] such that

15  χ_A(s) = q(s)ψ_A(s) + r(s),

where, if r(s) ≡ 0, χ_A is a multiple of ψ_A and m_k ≤ d_k for all k, and, if r(s) ≢ 0, then its degree satisfies δ[r] < δ[ψ_A]. The latter alternative leads to a contradiction. Indeed, by (15), with s replaced by A, and the Cayley-Hamilton Theorem (3.2.24),

θ = χ_A(A) = q(A)ψ_A(A) + r(A),

where ψ_A(A) = θ (annihilating polynomial). Hence if r(s) ≢ 0, r(A) = θ with δ[r] < δ[ψ_A]: a contradiction of the minimality of ψ_A. Therefore r(s) ≡ 0 and χ_A is a multiple of ψ_A.
3) Properties (8) and (9) follow immediately from that fact. •

18 Remark. Part 1) of the proof shows that any annihilating polynomial has zeros at λ_k for k = 1,2,...,σ. Setting p in place of χ_A in part 2), we conclude that any annihilating polynomial, having zeros solely at the λ_k, has the form

19  p(s) = Π_{k=1}^σ (s−λ_k)^{p_k},

where

20  p_k ∈ ℕ s.t. p_k ≥ m_k for k = 1,2,...,σ.

22 Exercise [Ascent of a matrix]. Consider any matrix T ∈ ℂ^{n×n}. Consider the consecutive powers T^μ for μ = 1,2,....
Show that

23 i) N[T^μ] ⊂ N[T^{μ+1}] for μ = 0,1,2,....

24 ii) There exists a unique exponent ν ≥ 1 s.t. in (23) above, for μ = 0,1,...,ν−1 the inclusion is a strict inclusion of subspaces (i.e. the subspaces are not equal), and, for all μ ≥ ν, the inclusion is an equality of subspaces.

iii) Establish the corresponding property for the successive ranges and ranks of T^μ.
[Hint: N[T^{μ+1}] = {x ∈ ℂⁿ : Tx ∈ N[T^μ]}.]

25 Remark. The unique exponent ν, mentioned in (ii) above, is called the ascent of T, [Tay.1,p.290]. It will turn out that the multiplicity m_k of the zero λ_k of the minimal polynomial ψ_A(·) in (7) above is the ascent of A−λ_kI.
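The ascent is easy to compute numerically: dim N[T^μ] = n − rank(T^μ) grows strictly until μ reaches the ascent and is constant afterwards. A sketch (ours, with a hypothetical Jordan-type example):

```python
import numpy as np

def ascent(T):
    n = T.shape[0]
    r = [n]                                           # rank(T^0) = n
    P = np.eye(n)
    for _ in range(n + 1):
        P = P @ T
        r.append(np.linalg.matrix_rank(P))
    for mu in range(1, n + 2):
        if r[mu] == r[mu - 1]:                        # N[T^(mu-1)] = N[T^mu]
            return mu - 1

A = np.diag([2.0, 2.0, 2.0]) + np.diag([1.0, 0.0], k=1)   # Jordan blocks of sizes 2, 1
assert ascent(A - 2.0 * np.eye(3)) == 2                    # m_k = 2 at lambda_k = 2
```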

26 Exercise. Let ψ_A(s) and χ_A(s) denote the minimal- and characteristic polynomials, resp., of A ∈ ℂ^{n×n}, given by (7) and (4). For any polynomial p(s) ∈ ℂ[s] denote by p̄(s) the polynomial obtained from p(s) by replacing its coefficients by their complex conjugates. Consider A* ∈ ℂ^{n×n} (the Hermitian transpose of A). Show that its minimal- and characteristic polynomial read, resp.,

27  ψ_{A*}(s) = ψ̄_A(s)

and

28  χ_{A*}(s) = χ̄_A(s).

[Hints: for (27) show that p(·) is an annihilating polynomial of A iff p̄(·) is an annihilating polynomial of A*; for (28) consider χ_{A*}(s) = det[sI−A*].]

4.3. Decomposition Theorem

The main topic of this section is the direct sum decomposition of ℂⁿ dictated by a linear map represented by a matrix A ∈ ℂ^{n×n}.

1 Theorem [Decomposition into algebraic eigenspaces]. Let A ∈ ℂ^{n×n}, with characteristic- and minimal polynomials χ_A(·) and ψ_A(·), resp. given by (4.2.4)-(4.2.6) and (4.2.7)-(4.2.9). Consider the A-invariant subspaces N_k as given by (4.2.11).
U.t.c.

2  ℂⁿ = ⊕_{k=1}^σ N_k = N₁ ⊕ N₂ ⊕ ... ⊕ N_σ.
k='
3 Comments. α) N_k := N[(A−λ_kI)^{m_k}] and N[A−λ_kI] are called, resp., the algebraic- and geometric eigenspace of A ∈ ℂ^{n×n} at its eigenvalue λ_k,

4  while δ_k := dim N_k

5  and r_k := dim N[A−λ_kI]

are called, resp., the algebraic- and geometric multiplicity of the eigenvalue λ_k, [Kat.1,pp.42,37]. Note that N[A−λ_kI] is the subspace of eigenvectors of A at λ_k; moreover, in Corollary (25) below it will turn out that if m_k > 1 then N[A−λ_kI] is a proper subspace of N_k: in this case N_k cannot be spanned by eigenvectors.
β) Note that for k ∈ σ̲, N_k = N[(A−λ_kI)^{m_k}] is A-invariant. Hence by Theorem (4.1.19), the decomposition (2) implies that if we pick as basis for ℂⁿ the union of σ arbitrary bases of, respectively, N₁,N₂,...,N_σ, we obtain for A a blockdiagonal representation of the form

6  Ā = blockdiag[A₁,A₂,...,A_σ] = [ A₁  0   ...  0  ]
                                  [ 0   A₂  ...  0  ]
                                  [ ...             ]
                                  [ 0   0   ...  A_σ ].

In Corollary (28) below it will turn out that δ_k := dim N_k = dim A_k = d_k, i.e. the algebraic multiplicity of the eigenvalue λ_k is the exponent of (s−λ_k) in the characteristic polynomial χ_A(·), (see (3.2.7)); moreover, det(sI−A_k) = (s−λ_k)^{d_k}.

7 Proof of Theorem (1). Consider the partial fraction expansion of 1/ψ_A(s) and collect terms:

8  1/ψ_A(s) = Σ_{k=1}^σ n_k(s)/(s−λ_k)^{m_k},

where n_k ∈ ℂ[s], a polynomial. Multiplying (8) by ψ_A(s) and using the definition of ψ_k, (4.2.10), we obtain†

9  1 = Σ_{k=1}^σ n_k(s)ψ_k(s)  ∀ s ∈ ℂ.

The RHS of (9) is a polynomial in s such that the coefficients of all positive powers of s are zero. Therefore, if we replace s by A in the RHS of (9), all coefficients of positive powers of A will be zero and

10  I = Σ_{k=1}^σ n_k(A)ψ_k(A).

Consider now any x ∈ ℂⁿ. From (10)

11  x = Σ_{k=1}^σ x_k,

with

12  x_k := n_k(A)ψ_k(A)x.

We claim that

13  x_k ∈ N_k for k = 1,2,...,σ.

Indeed, using (4.2.10) and ψ_A(A) = 0, we obtain successively

(A−λ_kI)^{m_k} x_k = [(s−λ_k)^{m_k} n_k(s)ψ_k(s)]|_{s=A} x = [n_k(s)ψ_A(s)]|_{s=A} x = n_k(A)ψ_A(A)x = θ.

Hence (11)-(13) gives a representation of any x ∈ ℂⁿ as a sum of vectors x_k in N_k.

We have to show that the decomposition of x given by (11)-(13) is unique. Suppose it were not; then, by subtracting two supposedly different decompositions of x, we would have

14  θ = x₁ + x₂ + ... + x_σ,

with x_k ∈ N_k for k = 1,2,...,σ. W.l.g. we may assume that x₁ ≠ θ (otherwise reorder the eigenvalues such that to λ₁ corresponds x₁ ≠ θ). From (14)

15  x₁ = −(x₂ + ... + x_σ).

The definition of N₁ with x₁ ∈ N₁ gives

16  (A−λ₁I)^{m₁} x₁ = θ;

now, by the definitions of ψ₁(s) (see (4.2.10)) and the N_k for k ≥ 2, premultiplication of (15) by ψ₁(A) gives

17  ψ₁(A)x₁ = θ,

because ψ₁(A)x_k = θ, ∀ k ≥ 2. Now the polynomials (s−λ₁)^{m₁} and ψ₁(s) are coprime, hence by the Bezout condition we can find polynomials h₁(s) and h₂(s) s.t.

h₁(s)(s−λ₁)^{m₁} + h₂(s)ψ₁(s) = 1  ∀ s ∈ ℂ.

Therefore

18  h₁(A)(A−λ₁I)^{m₁}x₁ + h₂(A)ψ₁(A)x₁ = x₁,

where, in view of (16) and (17), each term on the LHS of (18) is θ. Hence x₁ = θ, which contradicts x₁ ≠ θ. Therefore the decomposition (11)-(13) is a unique decomposition of x as a sum of vectors in the N_k's. This means that

ℂⁿ = ⊕_{k=1}^σ N_k,

i.e. (2) holds. •

† Identity (9) expresses the fact that the polynomials ψ_k(·) are coprime. It is called the Bezout condition.



19 Corollary. Let the assumptions of Theorem (1) hold and let V be any A-invariant subspace of ℂⁿ. Then

V = ⊕_{k=1}^σ (V ∩ N_k).

Comment. The corollary asserts that "every A-invariant subspace is spanned by generalized eigenvectors."

Proof. Let x ∈ V. Then, using the unique decomposition (11)-(12) of the proof of Theorem (1), x = x₁ + x₂ + ... + x_σ, where, for all k = 1,...,σ, x_k = n_k(A)ψ_k(A)x belongs to N_k. Now, since x ∈ V and V is A-invariant, for each k, x_k belongs also to V. Hence

V ⊂ ⊕_{k=1}^σ (V ∩ N_k) ⊂ V. •

20 Exercise. Let M be an arbitrary subspace of ℂⁿ. Show that in general

M ≠ ⊕_{k=1}^σ (M ∩ N_k).

[Hints: pick σ = 2 with each N_k spanned by one real eigenvector. Let M be any line through the origin not parallel with the N_k's....]

*21 Remark. The proof of Theorem (1) uses only the fact that ψ_A(s) is an annihilating polynomial of A with zeros solely at the λ_k. Therefore, recalling the general expression (4.2.19)-(4.2.20) of such polynomials, there results also

22  ℂⁿ = ⊕_{k=1}^σ N[(A−λ_kI)^{p_k}]  for all p_k ≥ m_k, k = 1,...,σ.

In particular, if p_k > m_k and p_l = m_l for all l ≠ k, then, on comparing (2) and (22),

23  N_k := N[(A−λ_kI)^{m_k}] = N[(A−λ_kI)^{p_k}]  for all p_k ≥ m_k.

On the other hand, if N[(A−λ_kI)^{m_k−1}] were to be equal to N_k, then by (2) ψ_A(s)/(s−λ_k) would be an annihilating polynomial of A; this is impossible and so

24  N[(A−λ_kI)^{m_k−1}] is a proper subspace of N_k

(i.e. strict inclusion). Hence by (23)-(24) we have established the following.

25 Corollary. Let A ∈ ℂ^{n×n}. Then the exponent m_k of (s−λ_k) in the minimal polynomial ψ_A(s), (4.2.7), is the ascent of A−λ_kI, i.e.

26  m_k = min{μ ∈ ℕ : N[(A−λ_kI)^μ] = N[(A−λ_kI)^{μ+1}]}.

Moreover, the restriction of A−λ_kI to N_k, i.e.

27  A−λ_kI : N_k → N_k, is nilpotent of index m_k. •

We have also the following.

28 Corollary. Let A ∈ ℂ^{n×n} where ℂⁿ is decomposed according to (2). Consider A_{N_k}, i.e. the part of A in N_k, the map defined by

A_{N_k} : N_k → N_k : x ↦ Ax,

and let d_k be the exponent of (s−λ_k) in the characteristic polynomial χ_A(s), (4.2.4).
U.t.c.

29  δ_k := dim N_k = d_k

and

30  χ_{A_{N_k}}(s) = (s−λ_k)^{d_k}.

33 Proof. By the blockdiagonal representation (6),

34  χ_A(s) = Π_{k=1}^σ χ_{A_k}(s).

We claim that A_{N_k} has only one eigenvalue: λ_k. Indeed, suppose μ were another eigenvalue. Then there exists a nonzero vector z ∈ N_k ⊂ ℂⁿ s.t.

35  Az = μz

and

36  [A−λ_kI]^{m_k} z = θ_n.

Hence, by (35)-(36), (μ−λ_k)^{m_k} z = θ_n, where m_k ≥ 1 and z ≠ θ_n. Therefore μ = λ_k, proving the claim. Therefore, with δ_k := dim N_k,

37  χ_{A_{N_k}}(s) = (s−λ_k)^{δ_k}.

Hence, using (34), (37) and (4.2.4),

Π_{k=1}^σ (s−λ_k)^{δ_k} = Π_{k=1}^σ (s−λ_k)^{d_k}.

Thus, by the uniqueness of the factorization of polynomials in ℂ[s], δ_k = d_k and the Corollary follows. •

38 Exercise. Let A ∈ ℂ^{n×n} and let x be a nonzero vector of ℂⁿ. Let ψ_A be the minimal polynomial of A. Let p denote a polynomial; call δ[p] its degree. Show that
i) there exists an eigenvalue λ_k of A and a polynomial p(s), with 0 ≤ δ[p] < δ[ψ_A], s.t. p(A)x is a (nonzero) eigenvector of A at λ_k;
ii) if x ∈ N_k then, in i), λ_k is the eigenvalue corresponding to N_k and p is a divisor of (s−λ_k)^{m_k} with 0 ≤ δ[p] < m_k.
[Hint: there exists a polynomial ψ of minimum degree such that ψ(A)x = θ.]

39 Exercise. Consider A ∈ ℂ^{n×n} with spectrum σ[A] = {λ_k}_{k=1}^σ. At every eigenvalue λ_k let A have an algebraic eigenspace N_k[A] := N[(A−λ_kI)^{m_k}] of dimension d_k. Consider A* ∈ ℂ^{n×n} (the Hermitian transpose of A) with spectrum σ[A*] = {λ̄_k}_{k=1}^σ. Define at every eigenvalue λ̄_k of A*

40  N_k[A*] := N[(A*−λ̄_kI)^{m_k}].

Show that
a) N_k[A*] is A*-invariant;
b) N_k[A*] is the algebraic eigenspace of A* at λ̄_k, of dimension d_k;
c) ℂⁿ is decomposed into algebraic eigenspaces of A* according to

41  ℂⁿ = ⊕_{k=1}^σ N_k[A*].

[Hints: (a) use Exercise (4.1.8); (b) use (4.2.27) and (4.2.28); (c) in Theorem (4.3.1) replace A by A*.]

42 Comment. The spaces N_k[A] and N_k[A*], defined above, are called, resp., the right- and left algebraic eigenspace of A ∈ ℂ^{n×n} at its eigenvalue λ_k. (Note that N_k[A] contains the right eigenvectors of A at λ_k, and N_k[A*] contains the left eigenvectors η of A at λ_k (i.e. η*A = λ_kη*).)

4.4. The Decomposition of a Linear Map

As in Comment (4.3.3) we pick as basis for ℂⁿ the union of arbitrary bases for, respectively, N₁,N₂,...,N_σ; hence A ∈ ℂ^{n×n} has a blockdiagonal representation (4.3.6). Therefore, in view of (4.3.27) and (4.3.29), calling E_k ∈ ℂ^{n×d_k} the matrix whose columns are the basis vectors of N_k, we have for k = 1,2,...,σ

1  R[E_k] = N_k,

and by the matrix representation rule (4.1.5) we have

2  AE_k = E_kA_k = E_k[λ_kI_{d_k} + R_k],

where A_k ∈ ℂ^{d_k×d_k} is the kth diagonal block of (4.3.6) and the d_k×d_k matrices λ_kI_{d_k} and R_k := A_k−λ_kI_{d_k} satisfy

3  λ_kI_{d_k}R_k = R_kλ_kI_{d_k},

4  R_k is nilpotent of index m_k.

Hence, defining

5  T⁻¹ := [E₁ : E₂ : ... : E_σ] ∈ ℂ^{n×n},

6  Λ := blockdiag[λ₁I_{d₁}, λ₂I_{d₂}, ..., λ_σI_{d_σ}] ∈ ℂ^{n×n},

7  R := blockdiag[R₁, R₂, ..., R_σ] ∈ ℂ^{n×n},

8  m := max{m_k : k = 1,2,...,σ},

then T⁻¹ is nonsingular by (4.3.2); furthermore, by (4.3.6) and (1)-(4),

10  Ā = blockdiag[A₁,A₂,...,A_σ] = TAT⁻¹ = Λ + R,

where

11  ΛR = RΛ

and

12  R is nilpotent of index m.

Therefore, defining

13  D := T⁻¹ΛT,  M := T⁻¹RT,

we have found the following.

14 Theorem [Decomposition of A ∈ ℂ^{n×n}]. Let A ∈ ℂ^{n×n}. In terms of the definitions (1)-(13) above, there exist unique† matrices D and M in ℂ^{n×n} s.t.

15  A = D + M,

where D is semisimple and M is nilpotent of index m, defined by (8); moreover, D and M commute, and the eigenvalues of A and D are identical.

† See Remark (28) below.

Our analysis shows even more.
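A small computational sketch of Theorem (14) (ours), using SymPy's jordan_form to obtain the change of basis; D is the semisimple part and M = A − D the nilpotent part:

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])
P, J = A.jordan_form()                     # A = P * J * P**-1
Lam = sp.diag(*[J[i, i] for i in range(J.rows)])
D = P * Lam * P.inv()                      # semisimple part
M = A - D                                  # nilpotent part (here index m = 2)
assert D * M == M * D and M**2 == sp.zeros(3, 3)
assert A.eigenvals() == D.eigenvals()      # eigenvalues of A and D coincide
```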

18 Theorem [Partial fraction expansion]. Let A ∈ ℂ^{n×n} and let ℂⁿ be decomposed into algebraic eigenspaces N_k according to (4.3.2). Pick any basis of ℂⁿ as the union of σ bases of, respectively, N₁,N₂,...,N_σ. Hence (1)-(15) apply.
U.t.c.

19a  (sI−A)⁻¹ = (sI−D)⁻¹ + Σ_{l=1}^{m−1} (sI−D)^{−l−1} M^l

19b        = T⁻¹ · blockdiag[(s−λ_k)⁻¹I_{d_k} + Σ_{l=1}^{m_k−1} (s−λ_k)^{−l−1} R_k^l]_{k=1}^σ · T

19c        = Σ_{k=1}^σ Σ_{l=0}^{m_k−1} (s−λ_k)^{−l−1} T⁻¹ · blockdiag[0,...,0,R_k^l,0,...,0] · T.

20 Comments. α) (19b) shows that (sI−A)⁻¹ has a pole at each eigenvalue of A; moreover, the pole at λ_k is of order m_k, the ascent of A−λ_kI. Note that if A is semisimple then, by Theorem (14) and (13), R = 0, and all poles of (sI−A)⁻¹ are simple.
β) Equations (19) give the partial fraction expansion of (sI−A)⁻¹; (19a) is the "coordinate-free" form of that expansion.

23 Proof of Theorem (18). In view of (1)-(13) we may restrict ourselves to proving (19a). Now, in view of Theorem (14), we have

24  (sI−A)⁻¹ = (sI−D−M)⁻¹ = (sI−D)⁻¹[I−M(sI−D)⁻¹]⁻¹,

where M and (sI−D)⁻¹ commute and M is nilpotent of index m. Therefore

25  [I−M(sI−D)⁻¹]⁻¹ = Σ_{l=0}^∞ [M(sI−D)⁻¹]^l = Σ_{l=0}^∞ (sI−D)^{−l} M^l = Σ_{l=0}^{m−1} (sI−D)^{−l} M^l;

in (25), equality holds for |s| sufficiently large and, by analytic continuation, for all s ≠ λ_k, k = 1,2,...,σ. (19a) follows by combining (24) and (25). •

28 Remark (Uniqueness of M and D). Formula (19c) is the partial fraction expansion of (sI−A)⁻¹ at its poles λ_k. Hence the coefficient matrices of (s−λ_k)⁻¹ and (s−λ_k)⁻², namely

29  T⁻¹ · blockdiag[0,...,0,I_{d_k},0,...,0] · T

and

30  T⁻¹ · blockdiag[0,...,0,R_k,0,...,0] · T,

resp., are unique, i.e. independent of the choice of basis (i.e. the one chosen in (5)). Now, by summing (30) over k = 1,2,...,σ, we get T⁻¹RT, see (7); hence in Theorem (14), M := T⁻¹RT, (13), is unique. Similarly, by multiplying (29) by λ_k and summing over k = 1,2,...,σ, we get T⁻¹ΛT, see (6); hence in Theorem (14), D := T⁻¹ΛT, (13), is unique.

32 Remark. Let T⁻¹ ∈ ℂ^{n×n} be the nonsingular matrix given by (5). Define now

33a  T =: N*,

where N ∈ ℂ^{n×n}, and define blocks N_k ∈ ℂ^{n×d_k} s.t.

33b  N = [N₁ : N₂ : ... : N_σ].

Then T⁻¹T = TT⁻¹ = I, and (15) and (19c) can be made to read

34  I = Σ_{k=1}^σ E_kN_k^*,

35  N_k^*E_l = δ_{kl}I_{d_k}  (k,l ∈ σ̲),

36  A = Σ_{k=1}^σ E_k[λ_kI_{d_k} + R_k]N_k^*.

Moreover, using the Laplace transform,

38  exp(At) = Σ_{k=1}^σ E_k exp(λ_kt)[Σ_{l=0}^{m_k−1} (t^l/l!)R_k^l]N_k^*  ∀t.

Formulas (34)-(38) generalize the dyadic expansion formulas (3.3.43), (3.3.42), (3.3.45), (3.3.58) and (3.3.57). Indeed, if A is semisimple then, ∀k, R_k = 0 and the matrices E_k and N_k are made up, resp., of d_k l.i. right- and left-eigenvectors at λ_k (that are mutually orthonormal). If, in addition, A is simple then σ = n and d_k = 1, ∀k. •
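Formula (38) can be checked numerically; the sketch below (ours) takes a single Jordan block, for which E_k = N_k = I, and compares against scipy.linalg.expm:

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

lam_k, d_k, t = -1.0, 3, 0.7
A = lam_k * np.eye(d_k) + np.diag(np.ones(d_k - 1), k=1)   # one block, m_k = d_k = 3
R_k = A - lam_k * np.eye(d_k)                              # nilpotent part
S = sum((t**l / factorial(l)) * np.linalg.matrix_power(R_k, l) for l in range(d_k))
assert np.allclose(expm(A * t), np.exp(lam_k * t) * S)     # formula (38)
```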

40 Exercise. Let A ∈ ℂ^{n×n} be decomposed as in Theorem (14) with D and M the semisimple- and nilpotent part, resp., of A. Let λ_k be an eigenvalue of A. Show that

41 i)  N_k := N[(A−λ_kI)^{m_k}] = N[D−λ_kI];

42 ii)  N[A−λ_kI] = N[D−λ_kI] ∩ N[M],

i.e. an eigenvector of A is an eigenvector of D which is in the null-space of M;

iii) if A is nonsingular, then

43  A⁻¹ = D⁻¹ + Σ_{l=1}^{m−1} (−1)^l D^{−l−1}M^l;

iv) for any ν ∈ ℕ

44  A^ν = (D+M)^ν = Σ_{l=0}^{min(ν,m−1)} (ν choose l) D^{ν−l}M^l.
46 Exercise [Orthogonality of right- and left algebraic eigenspaces of A]. Consider A ∈ ℂ^{n×n} with spectrum σ[A] = {λ_k}_{k=1}^σ and algebraic eigenspaces

47  N_k[A] := N[(A−λ_kI)^{m_k}]

of dimension d_k, for k ∈ σ̲ (see Section 4.3). Consider A* ∈ ℂ^{n×n} (the Hermitian transpose of A) with spectrum σ[A*] = {λ̄_k}_{k=1}^σ and algebraic eigenspaces

48  N_k[A*] := N[(A*−λ̄_kI)^{m_k}]

of dimension d_k, for k ∈ σ̲ (see Exercise (4.3.39)). Let

49  K := {k ∈ σ̲ : Re λ_k < 0},  K^c := {k ∈ σ̲ : Re λ_k ≥ 0},

and

50  N₋[A] := ⊕_{k∈K} N_k[A],  N₊[A*] := ⊕_{k∈K^c} N_k[A*].

Show that
a) for every k ∈ σ̲, N_k[A] and N_k[A*] have bases that are mutually orthonormal, i.e. there exist n×d_k matrices E_k and N_k, both of rank d_k, s.t.

51  R[E_k] = N_k[A],  R[N_k] = N_k[A*],

and

52  N_k^*E_k = I_{d_k};

b) for every k ∈ σ̲

53  N_k[A]^⊥ = ⊕_{l=1, l≠k}^σ N_l[A*]

(where N_k[A]^⊥ denotes the orthogonal complement of N_k[A]);

c)

54  N₋[A]^⊥ = N₊[A*].

55 Note on terminology. Consider the spaces N₋[A] and N₊[A*] defined in (49)-(50). In Theorem (7.2.33) below we prove that the solutions of the d.e. ẋ = Ax (viz. x(t) = exp[At]x₀) are exponentially decaying iff x₀ ∈ N₋[A] (i.e. the direct sum of the algebraic eigenspaces of A associated with its eigenvalues with negative real parts). For this reason we call N₋[A] the (right) stable subspace of A. Similarly, N₊[A*] is called the (right) unstable subspace of A* (or the left unstable subspace of A). Of course the same terminology applies to N₋[A*] and N₊[A] (obtained by exchanging A and A* in (50)). Recall now that the algebraic eigenspace N_k[A*], (48), is called the left algebraic eigenspace of A at its eigenvalue λ_k, (see (4.3.42)).

56 Comments. By the terminology mentioned above the results (51)-(54) read:
α) (51)-(52): at every eigenvalue λ_k the right- and left algebraic eigenspaces of A at λ_k have bases that are mutually orthonormal.
β) (53): at every eigenvalue λ_k the orthogonal complement of the right algebraic eigenspace of A at λ_k is the direct sum of the left algebraic eigenspaces of A not at λ_k.
γ) (54): the orthogonal complement of the right stable subspace of A is the left unstable subspace of A.
Moreover we have:
δ) by (53), right- and left algebraic eigenspaces of A corresponding to distinct eigenvalues are mutually orthogonal;
ε) by (49)-(50), N₋[A] is an A-invariant subspace and N₊[A*] is an A*-invariant subspace.
[Hints: consider as basis of ℂⁿ a union of bases of algebraic eigenspaces of A as arranged in the matrix T⁻¹ described by (1)-(5). Hence A ∈ ℂ^{n×n} gets the blockdiagonal representation Ā, (4.3.6), via AT⁻¹ = T⁻¹Ā. Note that A*N = NĀ*, where the matrix N = T* has the structure (33)-(35).
α) (51)-(52): use (1), (4) and (35); note in particular that

N_k^*[A−λ_kI]^{m_k} = [A_k−λ_kI_{d_k}]^{m_k}N_k^* = 0,

with dim R[N_k] = d_k = dim N_k[A*].
β) (53): use (35) and (51).
γ) (54): by (53) and Exercise (A.7.30b)

N₋[A]^⊥ = [Σ_{k∈K} N_k[A]]^⊥ = ∩_{k∈K} [⊕_{l=1, l≠k}^σ N_l[A*]] = ⊕_{l∈K^c} N_l[A*] = N₊[A*].]

4.5. Jordan Form

Let A ∈ ℂ^{n×n} and let ℂⁿ be decomposed into algebraic eigenspaces N_k according to (4.3.2). Having picked as basis of ℂⁿ the union of the chosen bases for the N_k as in (4.3.6), there remains one question: is there a way to pick a basis in each N_k such that, in (4.3.6), A_k (the matrix representation of the part of A in N_k) is as simple as possible? Note that, by (4.4.1)-(4.4.4), for any basis in N_k, A_k = λ_kI_{d_k} + R_k, where R_k is nilpotent of index m_k. The question hence is to find a basis of N_k such that R_k, the nilpotent part of A_k, is as simple as possible. The answer to this question is yes and is documented by Fact (13) below.
An important notion is the following.

1 Let A ∈ ℂ^{n×n} have the eigenvalue λ_k with algebraic eigenspace N_k. We call Jordan chain of length μ at the eigenvalue λ_k of A any l.i. family of vectors [e_k^j]_{j=1}^μ in N_k s.t.

2  [A−λ_kI]e_k^j = e_k^{j−1} for j = 1,2,...,μ,

where e_k^0 := θ.

3 Comment. The nonzero vectors of N_k are called generalized eigenvectors; hence (2) defines a chain of generalized eigenvectors. Note that any chain of generalized eigenvectors contains one eigenvector, namely e_k^1 ∈ N[A−λ_kI].
Jordan chains are useful because of the following.

4 Exercise. Let [e_k^j]_{j=1}^μ be a Jordan chain of length μ at the eigenvalue λ_k of A defined by (2). Let E_k^μ ∈ ℂ^{n×μ} be s.t.†

5  E_k^μ := [e_k^1 : e_k^2 : ... : e_k^μ],

i.e. R[E_k^μ] = Span[e_k^j]_{j=1}^μ. Show that, w.r.t. R[E_k^μ], A has the matrix representation dictated by

6  AE_k^μ = E_k^μ J_k^μ,

where

7  J_k^μ = λ_kI_μ + R^μ

with

8  R^μ = [ 0 | I_{μ−1} ]
         [ 0 |   0     ]

(ones on the superdiagonal, zeros elsewhere); so, e.g. for μ = 3,

9  J_k^3 = [ λ_k  1    0   ]
           [ 0    λ_k  1   ]
           [ 0    0    λ_k ].

(Hint: use (2) for e.g. μ = 3.)

† In the symbols J_k^μ and R^μ, μ is a superscript and not an exponent; I_μ is the μ×μ unit matrix as usual.

10 Comments. J_k^μ, defined by (7)-(8), is called a Jordan block at λ_k of size μ; R^μ is the nilpotent part of J_k^μ, of index μ (see (4.4.15)); a Jordan chain of length μ generates a Jordan block of size μ by (5)-(6).
The comments suggest the following affirmative answer to the question posed at the start of this section.

13 Fact [Jordan basis of N_k]. Let A ∈ ℂ^{n×n} have the eigenvalue λ_k with algebraic eigenspace N_k. Let d_k := dim N_k and r_k := dim N[A−λ_kI] be, resp., the algebraic- and geometric multiplicity of λ_k, and let m_k be the ascent of A−λ_kI.
U.t.c.
N_k has as basis a union of r_k Jordan chains of length μ_{kτ} for τ = 1,2,...,r_k such that

14 i) 1 ≤ μ_{kτ} ≤ m_k for τ = 1,2,...,r_k;

15 ii) there exists at least one τ ∈ r̲_k such that μ_{kτ} = m_k,

i.e. the basis contains at least one chain of length m_k;

16 iii) Σ_{τ=1}^{r_k} μ_{kτ} = d_k,

i.e. the sizes of the Jordan blocks add up to d_k = dim N_k.

As a consequence, in (4.3.6) the part of A in N_k has a matrix representation

17  A_k = blockdiag[J_k^{μ_{kτ}}]_{τ=1}^{r_k},

where, for τ = 1,...,r_k, J_k^{μ_{kτ}} is a Jordan block of size μ_{kτ} (see (6)-(9) with μ = μ_{kτ}). •

20 Comments. α) The hard part of the proof of the Fact above is the proof of existence of the Jordan basis of r_k chains; see [Kat.1,p.22] and [Nob.1,Th.10.3]. The other parts of Fact (13) are reasonably straightforward; in particular, for (15) the interested reader is referred to Exercise (30) below.
β) Note that, by (15) and Exercise (4), A_k in (17) always contains a Jordan block of maximal size m_k. Hence it makes sense to order the blocks in (17) according to decreasing size, i.e.

21  μ_{k1} ≥ μ_{k2} ≥ ... ≥ μ_{kr_k}.

This convention and the knowledge of d_k and m_k enable us, by (14)-(16), to evaluate possible values of the sizes μ_{kτ} and of r_k. For example, if d_k = 5 and m_k = 3, then there are two possibilities, viz.
(i) μ_{k1} = 3, μ_{k2} = 2, and r_k = 2, thus

A_k = [ λ_k  1    0    0    0   ]
      [ 0    λ_k  1    0    0   ]
      [ 0    0    λ_k  0    0   ]
      [ 0    0    0    λ_k  1   ]
      [ 0    0    0    0    λ_k ];

(ii) μ_{k1} = 3, μ_{k2} = 1, μ_{k3} = 1, and r_k = 3, thus

A_k = [ λ_k  1    0    0    0   ]
      [ 0    λ_k  1    0    0   ]
      [ 0    0    λ_k  0    0   ]
      [ 0    0    0    λ_k  0   ]
      [ 0    0    0    0    λ_k ].

However, if d_k = 5 and m_k = 4, then there is only one possibility, namely μ_{k1} = 4, μ_{k2} = 1 and r_k = 2. So sometimes we are lucky enough to predict the form of A_k, which is unique: see below. However, it is also known that a Jordan basis for N_k is not unique.
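The sizes μ_{kτ} can in fact be read off from ranks: the number of Jordan blocks at λ_k of size ≥ j equals rank((A−λ_kI)^{j−1}) − rank((A−λ_kI)^j). A sketch (ours, with a hypothetical example realizing case (i) above):

```python
import numpy as np

def jordan_block_sizes(A, lam, tol=1e-9):
    n = A.shape[0]
    T = A - lam * np.eye(n)
    r = [n] + [np.linalg.matrix_rank(np.linalg.matrix_power(T, j), tol=tol)
               for j in range(1, n + 2)]
    sizes = []
    for j in range(1, n + 1):
        exactly_j = (r[j - 1] - r[j]) - (r[j] - r[j + 1])   # blocks of size exactly j
        sizes += [j] * exactly_j
    return sorted(sizes, reverse=True)

A = np.diag([5.0] * 5) + np.diag([1.0, 1.0, 0.0, 1.0], k=1)  # d_k = 5, m_k = 3
assert jordan_block_sizes(A, 5.0) == [3, 2]                  # case (i): sizes (3, 2)
```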
We have now the following notion.

23 Let A ∈ ℂ^{n×n}. Then the n×n matrix Â = blockdiag[A_k]_{k=1}^σ, where each d_k×d_k matrix A_k has the form (17), is called the Jordan form of A (with the understanding that two Jordan forms are considered identical if they contain the same Jordan blocks, i.e. they differ only by the ordering of the blocks).
The Jordan form has the following property, [Kat.1].

24 Fact. Let A and Ã be two complex n×n matrices. Then

25  A and Ã are similar, or equivalently, ∃T ∈ ℂ^{n×n} with det T ≠ 0 s.t. Ã = TAT⁻¹,

if and only if

26  A and Ã have the same Jordan form.

28 Comments. α) [cf. (A.5.22)]. In other words, since (25) reflects a change of basis, any linear map A : (V,ℂ) → (V,ℂ), where the linear space (V,ℂ) has dimension n, has a unique matrix representation in Jordan form.
β) The numerical computation of the Jordan form is a formidable task and should be avoided except in simple cases: see [Gol.2] for an extensive review of the numerical difficulties and [Kag.1] for an algorithm; important numerical considerations are (i) the difference in magnitude of the eigenvalues, (ii) whether or not the algebraic eigenspaces N_k are almost orthogonal, etc.
γ) The structural information of the Jordan form, however, is useful in theoretical considerations: for instance, in (4.4.15) its semisimple part D equals Λ, where Λ is the diagonal matrix of eigenvalues (4.4.6), and its nilpotent part M has zero entries everywhere except for some entries of the superdiagonal, which are 1. For structural information on the Jordan form of A* see [Kat.1,I.5.5] and [Nob.1,Th.10.4].
We conclude this section by two exercises; they may be skipped.

**30 Exercise. Let A ∈ ℂ^{n×n} have the eigenvalue λ_k with algebraic eigenspace N_k. Let d_k = dim N_k and let m_k be the ascent of A−λ_kI with 1 ≤ m_k ≤ d_k. Recall definition (1). Show that A has a Jordan chain [e_k^j]_{j=1}^μ in N_k of length μ = m_k.
More precisely, set T = A−λ_kI; hence (2) reads, with μ = m_k,

T e_k^j = e_k^{j−1} for j = 1,2,...,μ, with e_k^0 := θ.

Show that
i) since μ is the ascent of T, there exists a nonzero vector e_k^μ in N_k = N[T^μ] s.t. e_k^1 := T^{μ−1}e_k^μ ≠ θ.
ii) For all j = 1,...,μ, e_k^j := T^{μ−j}e_k^μ belongs to

N[T^j] ⊂ N[T^μ] = N_k.

iii) With l ∈ ℕ s.t. 1 ≤ l ≤ μ−1,

T^l e_k^j = θ          for j = 1,2,...,l,
T^l e_k^j = e_k^{j−l}  for j = l+1,...,μ.

iv) By (i) and (iii) the family [e_k^j]_{j=1}^μ, with e_k^μ given by (i), is l.i. In particular, if for complex coefficients α_j

Σ_{j=1}^μ α_j e_k^j = θ,

then, with l as given in (iii),

Σ_{j=l+1}^μ α_j e_k^{j−l} = θ.

**31 Exercise. Let A ∈ ℂ^{n×n} and let N_k be its algebraic eigenspace at the eigenvalue λ_k. Consider ẋ = Ax, x(0) = x₀. Show that ∀x₀ ≠ θ_n, x₀ ∈ N_k, the solution contains a term ξ exp(λ_kt), where ξ is a nonzero vector of ℂⁿ.
(Hint: use the Jordan form.)

4.6. Function of a Matrix

In many problems we need to understand how certain expressions depend on a given matrix and on some other parameters; for example, how exp(At) depends on the characteristics of A and on t. We can view exp(At) as the analytic function f(λ) = exp(λt) in which λ has been replaced by A; this "substitution" will require clarification. However, in the case of polynomials, the procedure is almost obvious. Let

p(λ) = λ³ + 3λ² + 5λ + 2;

then, by definition,

p(A) = A³ + 3A² + 5A + 2I.

The purpose of this section is to develop a systematic method for defining the function of a matrix. The most important cases are

A^k, k ∈ ℕ;  (sI−A)⁻¹;  exp(At).

Throughout this section we use the following notations. Let A ∈ ℂ^{n×n}; let {λ₁,λ₂,...,λ_σ} be the point set of its (distinct) eigenvalues, also called the spectrum of A and denoted by σ(A). We write its minimal polynomial as in (4.2.7):

1  ψ_A(s) = Π_{k=1}^σ (s−λ_k)^{m_k}.

The following lemma will greatly simplify the later derivations.

2 Lemma. Given two polynomials p₁, p₂ ∈ ℂ[s], let us divide them by ψ_A; call q_i and r_i the respective quotients and remainders. Thus we have

3  p₁(s) = q₁(s)ψ_A(s) + r₁(s),

4  p₂(s) = q₂(s)ψ_A(s) + r₂(s);

then the following statements are equivalent:

5  p₁(A) = p₂(A);

6  r₁(λ) = r₂(λ)  ∀λ ∈ ℂ;

7  p₁^{(l)}(λ_k) = p₂^{(l)}(λ_k) for k = 1,2,...,σ and l = 0,1,...,m_k−1.

8 Remarks. α) If we evaluate Eq. (3) at A and recall that ψ_A(A) = θ, we get p₁(A) = r₁(A). So whenever we want to calculate any polynomial p₁(A), we never need to calculate powers of A larger than δ[ψ_A]−1.
β) Conditions (7) are called the spectral interpolation conditions.

9 Proof of Lemma (2).
(5) ⇒ (6). By (5), θ = p₁(A)−p₂(A) = (p₁−p₂)(A). Therefore the polynomial p₁−p₂ is annihilating, hence a multiple of ψ_A; consequently, by (3) and (4), r₁ = r₂.
(6) ⇒ (7). Recall (3) and (4); by (6), r₁ = r₂; hence p₁−p₂ = (q₁−q₂)ψ_A. Thus, in view of (1), p₁−p₂ has a zero at λ_k of order at least m_k. Hence (7) follows.
(7) ⇒ (5). By (7), for k = 1,2,...,σ, p₁−p₂ has a zero of order at least m_k at λ_k. Hence, for some polynomial μ(s),

(p₁−p₂)(s) = μ(s) Π_{k=1}^σ (s−λ_k)^{m_k} = μ(s)ψ_A(s).

Hence (p₁−p₂)(A) = μ(A)ψ_A(A) = θ. •


One can define a function of a matrix A, say f(A), in terms of the power series of f. However, in view of Lemma (2), it turns out that it is much more efficient and much more insightful to proceed in the following way.

10 Definition. Let f be a complex-valued function that is analytic on a domain† Δ of the complex plane. We assume that Δ contains all the eigenvalues λ_k of A. Let p be any polynomial that satisfies the following spectral interpolation conditions (compare with (7) above!):

11  p^{(l)}(λ_k) = f^{(l)}(λ_k) for k = 1,2,...,σ; l = 0,1,...,m_k−1.

By definition,

12  f(A) := p(A).

13 Comments. α) The interpolation conditions (11) involve Σ_{k=1}^σ m_k = δ[ψ_A] equalities; hence they can be satisfied by an interpolating polynomial p of degree δ[ψ_A]−1.
β) By Lemma (2), the matrix f(A) is independent of the choice of the polynomial p.
γ) For any matrix A ∈ ℂ^{n×n}, the functions A^m (m integer), (sI−A)⁻¹ for s ∉ σ(A), and exp(At), ∀t ∈ ℝ, are well defined because the corresponding f's are, respectively, λ → λ^m, λ → (s−λ)⁻¹, λ → exp(λt).
δ) The functions λ → √λ and λ → ln λ have a branch point at λ = 0; hence, if A is singular, A^{1/2} and ln A are not defined; they are well defined for any nonsingular A.
The following exercises are straightforward consequences of the definition (10).

† In analytic function theory, a domain is an open connected set of the complex plane.

15 Exercise. Let f satisfy the conditions above. Show that (a) Af(A) = f(A)A; (b) f(Aᵀ) = f(A)ᵀ; (c) let T ∈ ℂ^{n×n} be nonsingular, then f(TAT⁻¹) = Tf(A)T⁻¹.

16 Exercise. Let f satisfy the conditions above and, as is usual in applications, let f(z̄) = f̄(z), ∀z ∈ Δ. Show that the polynomial p can be chosen with real coefficients and hence

f(A*) = f(A)*.

17 Exercise. Using the definition, show that ∀A ∈ ℂ^{n×n}:
a) ∀t₁,t₂ ∈ ℝ,

exp(t₁A)·exp(t₂A) = exp((t₁+t₂)A),

sin(2A) = 2 sin A · cos A;
b) if f and g are analytic functions on Δ, then f(A)g(A) = g(A)f(A); that is, any two functions of the same matrix commute.

18 Exercise. Show that a) if e_k is an eigenvector of A, then e_k is an eigenvector of f(A); b) however, if e_k is an eigenvector of f(A), then e_k is not always an eigenvector of A.

[Hint: consider, e.g., A = [ 0  1 ; 0  0 ] with f(λ) = λ². The point is that this A is not semisimple.]
Expression (12), which defines f(A), seems at first somewhat arbitrary: let us verify that if g and h are, respectively, the sum and the product of the analytic functions f₁, f₂, i.e.

g(λ) = f₁(λ) + f₂(λ),  h(λ) = f₁(λ)·f₂(λ)  ∀λ ∈ Δ,

then we do have

g(A) = f₁(A) + f₂(A)

and

h(A) = f₁(A)·f₂(A).

More precisely we state the

19 Theorem [Properties of f(A)]. Given A ∈ ℂ^{n×n}, let Δ be a domain that contains all the eigenvalues of A. If f₁, f₂ and f are analytic functions mapping Δ into ℂ, then

20  (α₁f₁ + α₂f₂)(A) = α₁f₁(A) + α₂f₂(A)  ∀α₁,α₂ ∈ ℂ;

21  (f₁·f₂)(A) = f₁(A)·f₂(A), with f₁·f₂ denoting the product of f₁ and f₂;

22  f(A) = θ ⟺ f^{(l)}(λ_k) = 0 for k = 1,2,...,σ; l = 0,1,...,m_k−1.

23 Proof. Let, for i = 1,2, p_i denote an interpolating polynomial of f_i (i.e. p_i satisfies the conditions (11)); then, by definition (12), f_i(A) = p_i(A).
Now, by (11), α₁p₁+α₂p₂ is an interpolating polynomial of α₁f₁+α₂f₂; hence (20) follows.
Consider now (21). By definition of the p_i's,

f₁(A)·f₂(A) = p₁(A)·p₂(A) = (p₁·p₂)(A),

where the last step follows by direct calculation. We are going to show that the polynomial s → p₁(s)p₂(s) is an interpolating polynomial for f₁(s)f₂(s). Indeed, ∀λ_k ∈ σ(A),

26  (p₁·p₂)(λ_k) = p₁(λ_k)·p₂(λ_k) = f₁(λ_k)·f₂(λ_k).

Suppose that m_k = 2; then we must check the second interpolation condition:

(p₁·p₂)'(λ_k) = p₁'(λ_k)p₂(λ_k) + p₁(λ_k)p₂'(λ_k) = f₁'(λ_k)f₂(λ_k) + f₁(λ_k)f₂'(λ_k) = (f₁·f₂)'(λ_k),

where the second step follows from the definition of p₁ and p₂ and the last one from Leibniz's rule. If m_k > 2, the same type of calculation establishes the required equalities. So (21) is established.
It remains to establish the equivalence (22).
(⇐) Since the right-hand side of (22) holds, we may choose p(λ) = 0, ∀λ, as an interpolating polynomial; hence p(A) := f(A) = θ.
(⇒) By contradiction: assume that at least one interpolating condition in (22) is not satisfied, say, f^{(l)}(λ_k) ≠ 0 for some k ≤ σ and l ≤ m_k−1. Since l ≤ m_k−1, any such interpolating polynomial, say p, cannot have (s−λ_k)^{m_k} as a factor; hence p is not a multiple of ψ_A; hence, by definition of the minimal polynomial, p(A) := f(A) ≠ θ, which contradicts the left-hand side of (22). •

30 Theorem [General formula for f(A)]. Let A ∈ ℂ^{n×n} have a minimal polynomial ψ_A given by (1). Let the domain Δ contain σ(A); then for any analytic function f : Δ → ℂ we have

31  f(A) = Σ_{k=1}^σ Σ_{l=0}^{m_k−1} f^{(l)}(λ_k) p_{kl}(A),

where the p_{kl}'s are polynomials independent of f.

32 Comments. The importance of formula (31) lies in its form: a) the polynomials p_{kl} are independent of f and depend exclusively on A; so the matrices p_{kl}(A) are independent of f.
b) The weighting factors of the p_{kl}(A)'s, namely f^{(l)}(λ_k), depend only on the values of f (and some of its derivatives) on the spectrum of A: σ(A) = {λ₁,λ₂,...,λ_σ}.

33 Exercise. Using the notation of Theorem (30), show that

a) ∀t ∈ ℝ, exp(tA) = Σ_{k=1}^σ Σ_{l=0}^{m_k−1} t^l exp(λ_kt)·p_{kl}(A);

b) ∀ν ∈ ℕ, A^ν = Σ_{k=1}^σ Σ_{l=0}^{m_k−1} ν(ν−1)···(ν−l+1) λ_k^{ν−l}·p_{kl}(A);

c) for A nonsingular, ln A = Σ_{k=1}^σ Σ_{l=0}^{m_k−1} [(d^l/dλ^l) ln λ]|_{λ=λ_k}·p_{kl}(A);

d) ∀s ∉ σ(A), (sI−A)⁻¹ = Σ_{k=1}^σ Σ_{l=0}^{m_k−1} l!(s−λ_k)^{−l−1}·p_{kl}(A).

Note that (33d) shows that (sI−A)⁻¹ has a pole of order m_k at s = λ_k. (Indeed p_{k(m_k−1)}(A) ≠ θ: if not, we could drop one of the interpolating conditions (22); but in Theorem (19) we showed that all of those conditions were necessary.) The fact that (sI−A)⁻¹ has a pole of order m_k at λ_k can also be seen by taking the Laplace transform of (33a).

34 Two Stability Conditions. I. Consider the differential equation ẋ = Ax, x(0) = x₀, where A ∈ ℂ^{n×n} and x₀ ∈ ℂⁿ are given. From (33a) we see that

35  { exp(At) → θ as t → ∞ } ⟺ { ∀λ_k ∈ σ(A), Re(λ_k) < 0 }

and

36  { t → exp(At) is bounded on ℝ₊ } ⟺ { ∀λ_k ∈ σ(A), Re(λ_k) ≤ 0, and m_k = 1 whenever Re(λ_k) = 0 }.

II. Consider the discrete-time system x(k+1) = Ax(k), k ∈ ℕ, with x(0) = x₀, where A ∈ ℂ^{n×n}, x₀ ∈ ℂⁿ are given; then, for k ∈ ℕ, x(k) = A^k x₀. From (33b) we see that

37  { A^k → θ as k → ∞ } ⟺ { ∀λ_i ∈ σ(A), |λ_i| < 1 }

and

38  { k → A^k is bounded on ℕ } ⟺ { ∀λ_i ∈ σ(A), |λ_i| ≤ 1, and m_i = 1 whenever |λ_i| = 1 }.

39 Calculating the p_{kl}(A)'s. Before proving Theorem (30), let us show how to calculate the matrices p_{kl}(A) without obtaining an interpolating polynomial. The trick is to use Comment (32a).
Suppose that, for the given A, σ = 2, m₁ = 2, m₂ = 1; thus ψ_A(λ) = (λ−λ₁)²(λ−λ₂). Then formula (31) gives, for any analytic f : Δ → ℂ,

40  f(A) = f(λ₁)p₁₀(A) + f'(λ₁)p₁₁(A) + f(λ₂)p₂₀(A).

Let us apply formula (40) to several conveniently chosen polynomials f:

  The chosen f's              The equations for the p_{kl}(A)
  f₁(λ) = 1, ∀λ               p₁₀(A) + p₂₀(A) = I
  f₂(λ) = λ−λ₁, ∀λ            p₁₁(A) + (λ₂−λ₁)p₂₀(A) = A−λ₁I
  f₃(λ) = (λ−λ₁)², ∀λ         (λ₂−λ₁)²p₂₀(A) = (A−λ₁I)²

Solving by back substitution, we obtain

p₂₀(A) = (A−λ₁I)²/(λ₂−λ₁)²,

p₁₁(A) = (A−λ₁I) − (A−λ₁I)²/(λ₂−λ₁),

p₁₀(A) = I − (A−λ₁I)²/(λ₂−λ₁)²,

and the expression for f(A) follows immediately by (40).
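The back substitution above is easy to check numerically. The sketch below (ours, with a hypothetical A for which ψ_A(λ) = (λ−λ₁)²(λ−λ₂)) verifies (40) for f = exp against scipy.linalg.expm:

```python
import numpy as np
from scipy.linalg import expm

l1, l2 = 2.0, 5.0
A = np.array([[l1, 1.0, 0.0],
              [0.0, l1, 0.0],
              [0.0, 0.0, l2]])                    # psi_A(s) = (s - l1)^2 (s - l2)
I = np.eye(3)
B1 = A - l1 * I
p20 = B1 @ B1 / (l2 - l1)**2
p11 = B1 - B1 @ B1 / (l2 - l1)
p10 = I - p20
f_of_A = np.exp(l1) * p10 + np.exp(l1) * p11 + np.exp(l2) * p20   # f = f' = exp
assert np.allclose(f_of_A, expm(A))               # formula (40)
```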

41 Exercise. Let J be a Jordan block of size 3 with eigenvalue λ₁. Then, for any f analytic in a neighborhood of λ₁,

J = [ λ₁  1   0  ]               [ f(λ₁)  f'(λ₁)  f''(λ₁)/2! ]
    [ 0   λ₁  1  ]   and f(J) =  [ 0      f(λ₁)   f'(λ₁)     ]
    [ 0   0   λ₁ ]               [ 0      0       f(λ₁)      ];

generalize for the case of a μ×μ Jordan block.

42 Proof of Theorem (30). To alleviate the notations, let us consider a special case where

43  ψ_A(λ) = (λ−λ₁)(λ−λ₂)(λ−λ₃)²,

where λ₁,λ₂,λ₃ are of course distinct. Since m₁ = 1, m₂ = 1, and m₃ = 2, δ[ψ_A] = 4; hence we consider a third degree polynomial

44  p(λ) = λ³a₀ + λ²a₁ + λa₂ + a₃,

where the coefficients a₀,...,a₃ must satisfy the interpolating requirements (11):

45  λ₁³a₀ + λ₁²a₁ + λ₁a₂ + a₃ = f(λ₁)
    λ₂³a₀ + λ₂²a₁ + λ₂a₂ + a₃ = f(λ₂)
    λ₃³a₀ + λ₃²a₁ + λ₃a₂ + a₃ = f(λ₃)
    3λ₃²a₀ + 2λ₃a₁ + a₂ = f'(λ₃).

This is a system of four linear algebraic equations in the coefficients a₀,a₁,a₂,a₃. If ψ_A had four simple zeros, the matrix of the coefficients would be the classical nonsingular Vandermonde matrix. Here λ₃ is a double zero of ψ_A, hence the matrix is a modified Vandermonde matrix [Ait.1,p.119]; it is known that it is nonsingular. Consequently, the system (45) has a unique solution and by Cramer's theorem each coefficient a_k of p(λ) is a linear combination of f(λ₁), f(λ₂), f(λ₃) and f'(λ₃).
Therefore, in general we have

46  f(A) = p(A) = A³a₀ + A²a₁ + Aa₂ + a₃I,

where each coefficient is a linear combination of the f^{(l)}(λ_k), k = 1,2,...,σ and l = 0,1,...,m_k−1. Therefore, if we rearrange the terms of the sum in (46), we obtain an expression of the form

47  f(A) = p(A) = Σ_{k=1}^σ Σ_{l=0}^{m_k−1} f^{(l)}(λ_k) p_{kl}(A),

where the matrices p_{kl}(A) are the result of the rearrangement of the terms. This completes the proof of formula (31). •

4.7. Spectral Mapping Theorem

The spectral mapping theorem is easy to state, easy to prove, and has many useful applications.

1 Theorem [Spectral Mapping]. Let A ∈ ℂ^{n×n} with spectrum σ(A) = {λ₁,λ₂,...,λ_σ}. Let Δ be a domain which contains σ(A). Let f : Δ → ℂ be analytic on Δ. (Hence f(A) is well defined, (4.6.12).) Then

2  σ[f(A)] = {f(λ₁),f(λ₂),...,f(λ_σ)}.

3 Comments. a) By assumption A has σ distinct eigenvalues, σ ≤ n. There is no guarantee that f(A) has σ distinct eigenvalues. (Take f(λ) = 1, ∀λ ∈ ℂ.) So in (2), {f(λ₁),f(λ₂),...,f(λ_σ)} denotes the point set consisting of the listed elements.
b) By (2) the spectra of exp(At), log A and A^ν are {exp(λ_it)}₁^σ, {log λ_i}₁^σ and {λ_i^ν}₁^σ, respectively.

4 Proof of Theorem (1). Let p be an interpolating polynomial, (4.6.11), for f; hence, by (4.6.12), f(A) := p(A). Let J be the Jordan form of A; hence σ(A) = σ(J) and σ(p(A)) = σ(p(J)) (use Exercise (15) in Section 4.6). Thus

5  σ[f(A)] = σ[p(A)] = σ[p(J)].

By computation, since J is upper triangular, so is p(J), whatever p may be. Furthermore, to the diagonal element λ_i of J corresponds the diagonal element p(λ_i) of p(J); hence

6  σ[p(J)] = {p(λ₁),...,p(λ_σ)} = {f(λ₁),...,f(λ_σ)},

where the last step follows from the interpolation condition of the definition of f(A), see (4.6.11). •

8 Application to the numerical integration of ẋ = Ax. Given ẋ = Ax, x(0) = x₀, A ∈ ℂ^{n×n}, x ∈ ℂⁿ, call t → x(t) the exact solution x(t) = exp(At)·x₀; note that t → x(t) is analytic in t. Call [ξ_m]₀^∞ the sequence of computed values.

9 Forward Euler method. For h > 0 and small, we have, for any t_k ∈ ℝ₊,

10  x(t_k+h) = x(t_k) + h·ẋ(t_k) + O(h²) = x(t_k) + hAx(t_k) + O(h²).

In other words, we have approximately

x(t_k+h) ≈ (I+hA)x(t_k).

So if we perform repeatedly this step, starting at t₀ = 0, we have the computed sequence [ξ_m]₀^∞ given by

11  ξ_{m+1} = (I+hA)ξ_m,  ξ₀ = x(0),  m = 0,1,2,....

From the spectral mapping theorem (1) and (11), we have the following.

12 Fact. Suppose σ(A) ⊂ ℂ₋° (equivalently, the origin is exponentially stable). Let h₀ be the largest positive h such that

max_{λ_i ∈ σ(A)} |1+hλ_i| = 1.

U.t.c. (a) ξ_m → θ exponentially for all ξ₀ iff h ∈ (0,h₀);
(b) if h > h₀, then, for almost all x₀, the sequence of computed values [ξ_m]₀^∞ is such that |ξ_m| grows exponentially!

Conclusion. Thus even if σ(A) ⊂ ℂ₋° (and hence the exact solution x(t) → θ_n exponentially), for h > h₀, for almost all x₀, the sequence of computed vectors [ξ_m]₀^∞ blows up. It is for this reason that engineers prefer the backward Euler method.

13 Exercise. a) Prove Fact (12) in detail.
b) Assume you are given the eigenvalues λ_i as points in the complex plane ℂ; devise a graphical test for obtaining the h₀ defined in Fact (12). [Hint: consider a unit circle centered on (−1,0).]
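For the record, h₀ is also computable directly from the spectrum: |1+hλ|² = 1 gives h = −2 Re(λ)/|λ|² for each eigenvalue, and h₀ is the minimum over σ(A). A sketch (ours, with a hypothetical stable spectrum):

```python
import numpy as np

lam = np.array([-1.0 + 2.0j, -1.0 - 2.0j, -4.0])   # hypothetical sigma(A), all Re < 0
h0 = np.min(-2.0 * lam.real / np.abs(lam)**2)      # largest h with max|1 + h*lam| = 1
print("h0 =", h0)                                  # here h0 = 0.4

for h in (0.9 * h0, 1.1 * h0):
    growth = np.max(np.abs(1.0 + h * lam))         # spectral radius of I + hA
    print(h, "decays" if growth < 1.0 else "blows up")
```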

15 Backward Euler Method. For h > 0 and small, we have, for any t_k ∈ ℝ₊,

x(t_k) = x(t_k+h) − h·ẋ(t_k+h) + O(h²).

Thus, we have approximately

x(t_k+h) ≈ (I−hA)⁻¹x(t_k).

So if we perform repeatedly this step, starting from t₀ = 0, we get the computed sequence [ξ_m]₀^∞ given by

16  ξ_{m+1} = (I−hA)⁻¹ξ_m,  ξ₀ = x(0),  m = 0,1,2,....

Now the spectrum of (I−hA)⁻¹ is {(1−hλ_i)⁻¹}_{i=1}^σ. Hence, by (6),

ξ_m → θ_n as m → ∞ ⟺ ∀λ_i ∈ σ(A), |(1−hλ_i)⁻¹| < 1 ⟺ ∀λ_i ∈ σ(A), |1−hλ_i| > 1.

Note that if Re λ_i < 0, then |1−hλ_i| > 1, since h > 0. Thus we have proven the following.

17 Fact. If σ(A) ⊂ ℂ₋°, then, for all h > 0, ∀x₀ ∈ ℂⁿ, the computed sequence [ξ_m]₀^∞ obtained by the backward Euler formula (16) goes to θ_n exponentially. •

This is very important in practice, because if h is unfortunately chosen too large, the computed sequence may lose accuracy, but at least it will never blow up!

18 Exercise. A conservative physical system is modeled by ẋ = Ax, A ∈ ℂ^{n×n}, and it is normalized so that along any trajectory t → ‖x(t)‖₂ is constant, i.e. ‖x(t)‖₂² is the energy. In order to integrate numerically, a student considers three methods: with h > 0 and h ≪ 1/ρ(A),
a) forward Euler method: ξ_{k+1} = (I+hA)ξ_k, ξ₀ = x(0);
b) backward Euler method: ξ_{k+1} = (I−hA)⁻¹ξ_k, ξ₀ = x(0);
c) ξ_{k+1} = (I−(h/2)A)⁻¹(I+(h/2)A)ξ_k, ξ₀ = x(0).
(The idea behind method (c) is to do one step of backward Euler with step size h/2 followed by one step of forward Euler with step size h/2.)
Select the method that is most appropriate for this problem and justify your choice. (Hint: what can you say about A and σ(A)? Consider ‖ξ_k‖₂ as k → ∞.)

4.8. The Linear Map X → AX + XB

1 Theorem. Let A and B ∈ ℂ^{n×n}. Let L : ℂ^{n×n} → ℂ^{n×n} be the linear map defined by

2  L : X → AX + XB.

Let [λ_i]_{i=1}^n and [μ_j]_{j=1}^n be the lists of eigenvalues of A and B, respectively.
U.t.c.

3 a) the eigenvalues of L are [λ_i + μ_j]_{i,j=1}^n;

4 b) if the matrices A and B are semisimple, then the linear map L is semisimple: indeed it has n² eigenvectors [v_iw_j^*]_{i,j=1}^n ∈ ℂ^{n×n} that are l.i., where Av_i = λ_iv_i and w_j^*B = μ_jw_j^*.

5 Proof. I. Assume A and B semisimple. Let us determine the eigenvalues of L. By assumption, A has eigenvectors [v_i]₁^n and B has left eigenvectors [w_j^*]₁^n which form bases for ℂⁿ. Now, ∀i,j ∈ n̲, by (2), with Av_i = λ_iv_i and w_j^*B = μ_jw_j^*,

6  L[v_iw_j^*] := Av_iw_j^* + v_iw_j^*B = (λ_i+μ_j)v_iw_j^*.

This proves that ∀i,j ∈ n̲, (λ_i+μ_j) is an eigenvalue of L with v_iw_j^* as corresponding eigenvector.
To show that there are no other eigenvalues of L, we show that the n² eigenvectors [v_iw_j^*] are linearly independent; hence L is semisimple and (3) and (4) hold.
Suppose not: then there are n² scalars a_ij ∈ ℂ, not all zero, such that

7  P := Σ_{i,j=1}^n a_ij v_iw_j^* = θ.

W.l.g. we may assume a₁₁ ≠ 0. Let v ∈ ℂⁿ be a nonzero vector orthogonal to v₂,v₃,...,v_n; then v*v₁ ≠ 0, for otherwise v would be orthogonal to the n basis vectors v₁,v₂,...,v_n and be nonzero, which is impossible. Similarly, let w ∈ ℂⁿ be nonzero and orthogonal to w₂,w₃,...,w_n. Compute now

8  v*Pw = a₁₁(v*v₁)(w₁*w) = 0

by (7). Furthermore, since v*v₁ ≠ 0, w₁*w ≠ 0, we have a₁₁ = 0, which is a contradiction. Hence the n² eigenvectors v_iw_j^* are linearly independent and (3) and (4) hold. Note especially that (3) and (4) hold for A and B simple, (3.3.23).
II. Consider the case where either A or B or both are not simple. Let us reorder the elements of X ∈ ℂ^{n×n} in lexicographic order, say column by column, so that we obtain an n²-vector ξ ∈ ℂ^{n²}. Then the eigenvalue equation L(X) = λX is rewritten as

Mξ = λξ,

where ξ ∈ ℂ^{n²}, as defined above, and M is an n²×n² matrix whose elements are the a_ij's, b_ij's and zeros; M depends continuously on the a_ij's and b_ij's. The eigenvalue equation

9  det(λI−M) = 0

shows that the eigenvalues λ of L are the zeros of a polynomial of degree n² whose coefficients are continuous functions of the a_ij's and b_ij's. Hence these eigenvalues are continuous functions of the a_ij's and b_ij's. Let φ_A (φ_B) denote the discriminant† of the polynomial det(λI−A) (det(λI−B), resp.); these are polynomials in the a_ij's and b_ij's, resp. Now for all a_ij's and b_ij's such that φ_A ≠ 0 and φ_B ≠ 0, we have shown above that λ = λ_i+μ_j for some (i,j); more precisely, we have shown a) that both sides of this equation are continuous functions of the a_ij's and b_ij's, and b) that the equation holds ∀A ∈ ℂ^{n×n} and ∀B ∈ ℂ^{n×n} such that φ_A ≠ 0 and φ_B ≠ 0, i.e. the equation holds on an open dense set of ℂ^{n×n} × ℂ^{n×n}, (A.6.84a). Hence, by continuity, it holds for all A ∈ ℂ^{n×n} and B ∈ ℂ^{n×n}. (See Thm. (A.6.84).) •
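Part II is easy to illustrate numerically: with column-stacking, the matrix of L is M = I⊗A + Bᵀ⊗I, and its spectrum is the set of sums λ_i+μ_j. A sketch (ours, with random hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

M = np.kron(np.eye(3), A) + np.kron(B.T, np.eye(3))    # vec(AX + XB) = M vec(X)
eig_L = np.sort_complex(np.linalg.eigvals(M))
sums = np.sort_complex((np.linalg.eigvals(A)[:, None]
                        + np.linalg.eigvals(B)[None, :]).ravel())
assert np.allclose(eig_L, sums)                        # spectrum of L = {lambda_i + mu_j}
```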

10 Exercise. Let A, B ∈ ℂ^{n×n}. Let L : ℂ^{n×n} → ℂ^{n×n} be such that

L : X → X − AXB.

Show that the eigenvalues of L are [1−λ_iμ_j]_{i,j=1}^n, where [λ_i]₁^n and [μ_j]₁^n are the lists of eigenvalues of A and B, resp.

† det(λI−A) has a multiple zero λ̂ iff det(λ̂I−A) = 0 and (d/dλ)det(λI−A)|_{λ=λ̂} = 0. The discriminant φ_A is a polynomial in the entries of A which vanishes exactly when det(λI−A) has a multiple zero.
CHAPTER 5

GENERAL SYSTEM CONCEPTS

In this chapter we define the concept of dynamical system, whose representation D generalizes the standard linear differential or recursion system representations R(·) = [A(·),B(·),C(·),D(·)] or R_d(·) = [A(·),B(·),C(·),D(·)] (Chapters 2 and 2d) as well as their time-invariant counterparts R = [A,B,C,D] or R_d = [A,B,C,D] (Chapters 3 and 3d). We discuss important properties of dynamical systems such as time-invariance, linearity and equivalent representations.
For a discussion of the relation between physical systems, models and their mathematical representations the reader is referred to Chapter 1, see esp. Fig. 1.1. We start by discussing examples of models and their representations.

Note on Terminology. Since in System Theory we deal mostly with models and their mathematical representations, there will be no confusion if we follow common usage and use (1) the word system to refer to the model under consideration and (2) the expression system representation to refer to any of its mathematical representations. At a later stage we also use the word system for system representation, as is common in mathematics.†

† Except for the fact that in mathematics systems have usually no inputs.

5.1. Dynamical Systems

1 Models and their representations: examples. Let the physical system of interest be an electrical circuit.
Model I. Suppose we model the passive elements as linear time-invariant elements as shown in Fig. 5.1. A mathematical representation of this model may be obtained by using the fluxes φ₁, φ₂ and the capacitor charge q as state variables, noting that the voltage of the voltage source is u(t), and writing the state equations dictated by Kirchhoff's current- and voltage-laws:

2  q̇ = −L₁⁻¹φ₁ − L₂⁻¹φ₂
   φ̇₁ = C⁻¹q − R₁L₁⁻¹φ₁ − u(t)
   φ̇₂ = C⁻¹q − R₂L₂⁻¹φ₂.

The output is given by

3  y(t) = R₂L₂⁻¹φ₂(t).

From the fundamental Theorem of differential equations (B.1.6), we know that, given

Fig. 5.1. This linear time-invariant circuit is described by equations (2) and (3).

any t₀ ∈ ℝ, a set of initial conditions q(t₀), φ₁(t₀) and φ₂(t₀), and any piecewise continuous input u(·) defined from t₀ on, the differential equation (2) defines q(t), φ₁(t) and φ₂(t) for all t ≥ t₀, and hence by (3) a unique output y(t).
Note that (2)-(3) is a standard linear time-invariant differential system representation R = [A,B,C,D], (3.2.1)-(3.2.5), with state x = (q,φ₁,φ₂) ∈ ℝ³ and A,B,C,D constant matrices (check this).

Model II. Suppose that in the previous circuit we take into account the nonlinearities of the RLC elements (see Fig. 5.2), which are specified by the element characteristics

4  i₁ = f_{L₁}(φ₁),  i₂ = f_{L₂}(φ₂),  v_C = f_C(q),  v_{R₁} = f_{R₁}(i_{R₁},t).

Then a mathematical representation of Model II is obtained by writing the Kirchhoff current and voltage laws:

5  q̇ = −f_{L₁}(φ₁) − f_{L₂}(φ₂)
   φ̇₁ = f_C(q) − f_{R₁}[f_{L₁}(φ₁),t] − u(t)
   φ̇₂ = f_C(q) − R₂f_{L₂}(φ₂)

and

6  y(t) = R₂f_{L₂}(φ₂(t)).
Fig. 5.2. If the capacitor charge q and the inductor fluxes φ₁ and φ₂ are used as state variables, this nonlinear time-varying circuit is described by the equations (5) and (6).

Now, if f_{R₁} is piecewise continuous in t and if all element characteristics satisfy a Lipschitz condition (B.1.4), then for any t₀ ∈ ℝ₊, for any set of initial conditions q(t₀), φ₁(t₀) and φ₂(t₀), and for any piecewise continuous input u(·), by Theorem (B.1.6) the differential equation (5) defines a unique continuous solution [q(t),φ₁(t),φ₂(t)] for all t ≥ t₀ and hence by (6) a unique continuous output y(t) (prove this). Hence Model II has a well defined mathematical representation (5)-(6) of a form more general than R(·) = [A(·),B(·),C(·),D(·)].

7 Remark. Model II is a special case of very useful mathematical representations of systems. Namely, let x(t) ∈ ℝⁿ, the input u(t) ∈ ℝ^{n_i}, and the output y(t) ∈ ℝ^{n_o}; then†

8  ẋ(t) = f(x(t),u(t),t)

9  y(t) = r(x(t),u(t),t)

where f : ℝⁿ × ℝ^{n_i} × ℝ₊ → ℝⁿ and r : ℝⁿ × ℝ^{n_i} × ℝ₊ → ℝ^{n_o}. Moreover, for every u(·) ∈ PC(ℝ₊,ℝ^{n_i}), let the differential equation (8) have, for any t₀ ∈ ℝ₊ and for any x(t₀) ∈ ℝⁿ, a unique continuous solution x(·) on t ≥ t₀ such that a uniquely defined piecewise-continuous output y(·) results from (9). System representations (8)-(9) with this property are called

† The time set ℝ₊ may also read ℝ or [T,∞).


10  differential system representations.

Note that (8)-(9) generalize the linear differential system representation R(·) = [A(·),B(·),C(·),D(·)]. Indeed, for this case

11  f(x,u,t) = A(t)x + B(t)u,

12  r(x,u,t) = C(t)x + D(t)u.

In the discrete-time case the linear recursion system representation R_d(·) = [A(·),B(·),C(·),D(·)] is generalized by the following. Let x(k) ∈ ℝⁿ, the input u(k) ∈ ℝ^{n_i}, and the output y(k) ∈ ℝ^{n_o}, be such that††

13  x(k+1) = f(x(k),u(k),k),  k ∈ ℕ,

14  y(k) = r(x(k),u(k),k),

where f : ℝⁿ × ℝ^{n_i} × ℕ → ℝⁿ and r : ℝⁿ × ℝ^{n_i} × ℕ → ℝ^{n_o}. Then (13)-(14) are called a

15  recursion system representation.

†† The time set ℕ may also read ℤ or {k ∈ ℤ : k ≥ k₀}.
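A toy recursion system representation (ours; f and r are hypothetical) showing the state transition function s built from (13); the composition property required of dynamical systems below holds by construction, since s iterates f one step at a time:

```python
import numpy as np

def f(x, u, k):                          # one-step state recursion (13)
    return np.tanh(x) + u

def r(x, u, k):                          # read-out function (14)
    return x**2 + 0.1 * u

def s(k1, k0, x0, u):                    # state transition: x(k1) from x(k0) = x0
    x = x0
    for k in range(k0, k1):
        x = f(x, u[k], k)
    return x

u = np.sin(0.5 * np.arange(10))          # an input sequence
x0 = 0.3
assert np.isclose(s(6, 0, x0, u), s(6, 3, s(3, 0, x0, u), u))   # composition
y6 = r(s(6, 0, x0, u), u[6], 6)          # output at k = 6
```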

16 Exercise [Model II, (5)-(6)]. α) With x = (q,φ₁,φ₂) ∈ ℝ³, u(t) ∈ ℝ, and y(t) ∈ ℝ, obtain from (5)-(6) explicit expressions for the functions f and r in (8)-(9).
β) From the assumptions on the element characteristics (4), show that, for any given u(·) ∈ PC(ℝ₊,ℝ) and with x ∈ ℝ³, the function p(x,t) := f(x,u(t),t) satisfies conditions (B.1.3) and (B.1.4) of the fundamental theorem of differential equations (B.1.6).
γ) Show that Model II has a differential system representation (8)-(9).

17 Exercise. Model a satellite S as a rigid body with center of mass 0_b. Let (0_b,b₁,b₂,b₃) be a dextral body frame centered at 0_b with the b_i's along the principal axes of S. Suppose that gas jets apply to S a resultant force F and a torque τ (measured with respect to 0_b) with components (τ₁,τ₂,τ₃) in the body frame. Choose an inertial frame and write the equations of motion of 0_b and of S about 0_b.

21 Dynamical Systems [State representations]. We now introduce a very important class of models, which is used to analyze and design physical systems, namely dynamical systems.
Roughly speaking, to a dynamical system we can associate an input space U (i.e. the set of all inputs of interest), an output space Y (i.e. the set of all outputs) and a state space Σ (i.e. the set of all states). The inputs and outputs are functions of time defined, typically, on (−∞,∞) or [0,∞) in the continuous-time case or on {nT : n ∈ ℤ} or {nT : n ∈ ℕ} in the discrete-time case. To cover all possibilities, the time domain of the inputs and outputs will be denoted by T, where T is a subset of ℝ. The fundamental property of a dynamical system is that, given any "initial" time t₀ ∈ T, any "initial" state x₀ ∈ Σ and any input u(·) ∈ U, both x(t) and y(t) (the resulting state and output at some later time t) are uniquely defined. Also, given x₀ and t₀, the state x(t) and the output y(t) depend only on the values of the input u(·) in the elapsed interval [t₀,t). We are ready for a formal definition.

22 Definition. We say that the model of a physical system is a dynamical system iff we can associate with it a quintuple D = (U,Σ,Y,s,r) (i.e. a list of five objects) satisfying two axioms, where:

U is a set of functions: its elements map T → U (typically U is ℝ or ℝ^{n_i}). An element u(·) ∈ U is called an input, and for any t ∈ T, u(t) is called the value of the input at time t.
Σ is a set. Any element x of Σ is called a state. (Typically Σ is ℝⁿ.) For any t ∈ T, x(t) is called the state at time t. A map x(·) : T → Σ is called a state trajectory.
Y is a set of functions: its elements map T → Y.† (Typically Y is ℝ or ℝ^{n_o}.) An element y(·) of Y is called an output, and, for any t ∈ T, y(t) is called the value of the output at time t.
s is called the state transition function: the function s is defined for all†† time pairs (t₁,t₀) ∈ (T×T)₊, for all initial states x₀ ∈ Σ and all inputs u ∈ U, such that

23  x(t₁) := s(t₁,t₀,x₀,u),

i.e. it delivers the state at time t₁ reached from the state x₀ at an earlier time t₀ as a result of the input u.
r is called the read-out function: the function r is defined for all times t ∈ T, for all states at time t, namely x(t) ∈ Σ, and for all inputs u(t) ∈ U at time t, such that

24  y(t) := r(x(t),u(t),t),

i.e. it gives the output at time t given the time t, the state at time t and the input at time t. Thus the read-out function r is "memoryless" (all arguments are evaluated at the same time t).

† Since Y is meant to be the range (codomain) of a function, the elements of Y are, strictly speaking, output candidates.
†† (T×T)₊ := {(t₁,t₀) : t₁ ∈ T, t₀ ∈ T, t₁ ≥ t₀}.


In addition the state transition function s is required to satisfy two axioms.

25 State transition axiom. For all pairs (tl.to)E (TxT)+. for all XOE L.
if u.u E U such that u(t) = u(t) for all t E [to.ttl n T. then

x(tl) = S(tl.to.xO.u) = S(tl.to.xO.u) .

In other words. given (tl.to)E (TxT)+ and XOE L.


the state x(tl) at time tl
depends only on the values of UE U in the "elapsed interval" [to.tt) n T.
This property is often summarized by writing

26 x(t l ) = s(tl.to.xO.u[lo,td)'

where u[to.tll denotes the restriction of the input u to the "interval" [to.ttl n T. i.e.

27 u[to.t,) := {U(t): t E [to.ttl n T} .


28 State composition axiom. For all to :'> t\ :'> t2 E T. for all Xo E L. for all u E U

i.e. to obtain the state x(t2) at time t2 we may first calculate x(tl)=s(tl.to.xO,u[lo,td) at
any intermediate time tt and then use this result to calculate the state at time t2 from


30 Any quintuple D = (U 'L'y ,S,T) associated with a dynamical system having the
properties given above is called a dynamical system representation. (Brevity will often
force us to speak about the dynamical system D). •

33 Comment [Dependence of the present state upon the input). Consider the state
transition axiom (26).
a) Given Xo and to, the present state x(tl) depends only on the values of the input u
during the elapsed past [to.ttl n T; the state Xo at time to summarizes the effect on D
of the inputs prior to time to.
Given Xo and to, x(t\) does not depend on the "future" values of the input after
the time tl; by (24) the same applies to the output y(tl) at time t l . This property is
referred to by saying that a dynamical system is
146

34 nonanticipative (or causal) .

3S Comment. The concept of dynamical system is extremely useful and general:


with appropriate selection of the state, it includes the basic models of physics (e.g.
mechanics and electromagnetism), of circuit and control theory, and of computer sci-
ence (e.g. sequential machines).

37 Example I [Differential systems). The differential system representation (8)-(9),


is a dynamical system representation D = (U ,L'Y ,s,r) where T = R+,
U =PC(R+,Rn,), L=Rn, and, for any u(')e PC (R+,Rn,), t x(t)=s(t,l{j,xO,ulfo,t]) is
the unique continuous solution of the d.e. (8) due to the initial state Xo at to and the
input u('); the resulting output given by the read-out function (9) is piecewise continu-
ous, i.e. Y = PC (IR+,lR"").

38 Exercise. Prove that the statement of Example I is correct. Check in particular


that the state axioms hold. {Hint: given u(')e PC (lR+,lRn,), then by Definition (10),
with f(x,u(t),t) =: p(x,t), the d.e. x(t) = p(x(t),t) has a unique continuous solution
t x(t) = cjJ(t,to,xo) =: s(t,!o,xo,u) for any (xo,to) e R n x R+; therefore
x(t)= s(t,to,xo,ulfo,tj) and cjJ(t2,to,xO)=cjJ(t2,t 1,x(t)) for any t) E [to,t2) ... J

39 Example II [Recursion systems). The recursion system representation (13)-(14)


is a dynamical system representation D = (U'L,Y,s,r) where t T=N, U=F(N,R n ,),
Y =F(N,lR""), k x(k)= s(k,ko,xo,u) the unique state trajectory produced by the
recursion equation (13) from the initial state Xo at ko due to the input u(-) E U, and
r(',',') given by (14).
Until now in Examples I and II the state space was lRn, i.e. finite-dimensional.
However this is not always the case.

40 Example III [Delay systems]. Consider the system of Fig. 5.3. Since the output
at time t is yet), the input to the subsystem with transfer function (s+ 1)-1 is y(t)+y(t);

1 y
u
5+1

Fig. 5.3. Feedback system with infinite-dimensional state.

t F (T ,U) = {f: f: T
U ); thus F (N,lRn,) is the space of sequences [U(k»)O'
where, V' kEN, u(k) eRn,.
147

also the output of the delay line is y(t-I) since the delay is I se(:ond. Thus

y(t)=-y(t}-y(t-l}+u(t) .

How can we choose a state for this system at to? Reasoning intuitively, we see that

y(to) =-y(to) - y(to-1)+u(to)

and, for h > 0 very small

Repeating the process h seconds later

y(to+h) =-y(to+h}-y(to+h-l}+u(to+h)
so
y(to+2h) :::; y(to+h)+h[-y(to+h}-y(to+h-I}+u(to+h)],

and so on. Therefore to calculate the slope of y for any t E (to.to+ I) we need to know
Y[Io-I,101 as well as u[Io,ll' Hence our choice for the state at time to is

x(to) = Y[Io-1.101 .

The state at time to is an element of a function space (more precisely. the space of C l
real-valued functions defined on a unit interval). The system shown is thus a simple
example of a dynamical system whose state space is infinite-dimensional. For more on
this see e.g. [Cur.1] [HaI.2].

43 Dynamical Systems: Response Function and I/O· Representation. Let


D = (U .L.Y .s.r) be a dynamical system representation. Substituting

23 x(t) = s(t.to, xo.u)

into
24 yet) = r(x(t).u(t).t)

we obtain
44 yet) = r(s(t.to.xo.u).u(t).t) .

Thus the output yet) is uniquely defined in tenns of t. to.xo. and u. Let us give the
name of p to the function in the RHS of (44). The function p is defined for all
(tl.to) E (TxT)+. for all Xo E L. for all u E U, and the output is given by
148

We call p the response function: it produces the output at time t in tenns of the initial
time to, the initial state Xo and the input U[Io,I)'
Observe now by (45) that with to and Xo fixed to standard values, (agreed upon in
advance; typically to=O and xo=9), then the output yet) is only a function of t and
U[Io.I)' i.e. for a given input u(·) e U the output yet) depends on t: this results in D hav-
ing an liD representation

46 u(')e U y(-)=F[u](')e Y .

Note in particular that u F [u] maps an input (a function of time) into an output
(another function of time): F is therefore called the I/O map of D.

50 Example I. For the standard linear differential system R (.) = [A('),B('),CO,D('»),


with xo=9, the l/O-map reads by (2.1.99)
I

51 yet) = F [u](t) = f H(t,'t)u(t)dt


10

where H(t,'t) is the impulse response (2.1.97). Moreover if RO=R=[A,B,C,D],


to=O, and according to (3.2.61)
I

52 f
y(t)=F [u](t)= H(t-1:)u('t)dt= (H*u)(t) Vt
o

with H(t) = H(t,O) the nonnalized impulse response (3.2.58). Note that in the linear
time-invariant case F is the convolution of the impulse response H (-) by the input u(·).
The discrete-time analogs for Rd(')=[A('),B('),C(')D('») and Rd=[A.B,C,D). are,
respectively, given by
k
53 y(k)=F[u](k)= L H(k,k,)u(k')
k'=ko

(see (2d.1.96», and


k
54 y(k) =F [u](k) = L H(k-k')u(k') = (H*u)(k) Vk
k'=O

(see (3d.2.59) and (3d.2.58».

55 Example n. A classical example of a nonlinear I/O-map is


149

56 y(t)=F [u](t) = f h(t-'t)<1>(u(,t»d't


o

where <1>: R R is a nonlinear C 1 map s.t. <1>(0) =0, h(t) is the impulse response of
a linear time-invariant SISO system and u(') e F (lRt,lR).

57 Exercise. Consider the nonlinear feedback system of Fig. 5.4, where <)I: 1R 1R
is a nonlinear C 1 map S.t. <)1(0)=0, h(t) is the impulse response of a linear time-
invariant SISO system. Let the inputs u('), the errors e(') and the outputs yO belong to
an appropriate identical space L = E = U = Y of R-valued functions on 1R+.
Assume that the open-loop I/O map

58 e(')e L yO=G[e]O=[h*<1>(e)]Oe L

is well defined. Assume moreover that the return-difference map


59 I+G:L L

is invertible, (I denotes the identity map), more precisely (I+Gf i maps L into L.

Show that closed-loop I/O map is given by

u y

J
Fig. 5.4 Nonlinear feedback system.

u(')e L Y(')=F[u](')

60 = G[CI+G)-I[u]]C') e L .

61 Exercise. Show that the I/O map of the system of Fig. 5.3, where Y[O,-IJ - 0,
reads

F [u](t) = (h*u)(t)
where
150

h(s):= L[h(t)] = (S+l+e-s)-1

_ _1__ e- s _1_ + e-2s _1_ _


s+ 1 (s+ 1)2 (s+ 1)3

Explain physically this series expansion. Sketch out h(t) for t 2:. O.

5.2. Time-Invariant Dynamical Systems


Let D = (U ,L.Y .s,r) be a dynamical system representation with response function
p(t.lQ,xo.u) given by (5.1.45). In order to define time-invariance it is convenient to
require that the input space U be closed under translations. Thus we require T to be
either R or Z. Let t E T: for every u E F (T .U) define the function Ttu by

1 (Ttu)(t) = u(t-'t) V''tET.

Clearly Ttu is the result of shifting u(·) to the right by 't seconds, i.e .• for't > 0, Ttu
is the input u delayed by 't seconds. U c F (T .U) is said to be closed under transla-
tions iff. for every 't E T, u E U ::::> TtU E U.

2 Definition. A dynamical system is said to be time-invariant iff we can associate


with it a representation D = (U 'L,Y ,s,r) such that
(a) U is closed under translations,
(b) for alllQ.tl.tE T with tl 2:.lQ. for all XOE L and for all UE U

3 p(tl+t.lQ+t.xo.Tt u) = p(t 1.lQ.xo.u) .

Observe that the value of the new input Ttu at the new initial time lQ+t is just u(to).
which is the value of the old input u(·) at the old initial time to. Thus (3) expresses
the fact that by delaying the application of Xo and the input u by t seconds. the output
y is delayed by the same amount. •

4 A dynamical is said to be time-varying iff it is not time-invariant.



5 Remark. Condition (3) agrees with (3.2.49). used for calling the system represen-
tation R = [A,B.C.D] time-invariant; similarly for the discrete-time case
R d = [A.B,C.D] see (3d.2.50).

6 Exercise. Show that if a differential system has a representation of the form


(5.1.8)-(5.1.9) where

7 x(t) = f(x(t).u(t»
tE R
8 yet) = r(x(t).u(t»
151

(Le. f and r are not explicitly dependent on t), then it is time-invariant.

9 Exercise. Let x(t),y(t),u(t) E R. Show that the representation

x(t) = x(t) + exp(2t)u(t)

yet) = exp(-2t)x(t) + u(t)

corresponds to a time-invariant dynamical system, (hence the converse of Exercise (6)


does not hold).

10 Exercise. Given a time-invariant representation R = [A,B,C,D] with state x(t),


and a C1 matrix-valued function t T(t) with det T(t) *" 0 V t, obtain R, the
representation of R resulting from changing the coordinates of L according to
=T(t)x(t) (here represents the state in the new coordinates). Does R represent
a time-varying dynamical system?

5.3. Linear Dynamical Systems

1 Definition. A dynamical system is said to be a linear dynamical system iff we


can associate with it a dynamical system representation D = (U ,L,Y ,s,r) such that
a) U ,L,Y are linear spaces over the same field P,
b) for any fixed pair (t,to) E (T x T)+, the response function is a linear map from
LXU into Y thus Vx"x2E L, VUIO,u2(')E U, P,

2 p(t,to,a,x, + +

3 A dynamical system is said to be nonlinear iff it is not linear.

3a Remark. We purposefully state that to check whether a dynamical system is
linear we have to exhibit a dynamical system representation D that has linearity pro-
perties (a) and (b) above. Some representations may hide the fact that the system is in
fact linear. Consider the following simple scalar example:
x=-x+u

y=x.

Change coordinates in the state space IR by the bijection x = sinh to obtain

+ (sech u
L y= sinh
In the the system appears to be nonlinear!
152

4 By definition (1) the following consequences are immediate.

5 Decomposition property. If D is a linear dynamical system representation. then


its response is the sum of its zero-input response and its zero-state response. i.e.
V(t,to)E (TxT)+, VXOEL' VUE U

= +
6
response z-i response z-s response .


7 Linearity of the zero-input response. If D is a linear dynamical system
representation, then
V (t,to)E (T xT)+, VXI,x2E L' V(Xl,(X2E F,

Equation (6) explains why the zero-input dynamics of R(·)=[A(·),B(·).CO.DO]


is so simple; since the state space is n-dimensional, the z-i response to the initial state
xo is completely specified by xo and the n zero-input responses specified by the stan-
dard basis vectors £j as initial states.

9 Linearity of the zero-state response [Superposition property). If D is a linear


dynamical system representation. then V (t.to) E (T x T )+. V u1.u2 E U,
V(X!.!XzE F.

10 p(t,tO,(lL'(X1 1I1+!Xzu2) = (XIP(t,to,(lL,ul) + (X2P(t,to,(lL,U2) .

A little thought shows that the linearity property (10) is the basis for the superpo-
sition integral (5.1.51).

5.4. Equivalence

1 Equivalent States. Let D and jj be two (not necessarily distinct) dynamical sys-
tem with the same input and output spaces U and Y. States Xo D
and Xo of D are said to be equivalent at time to iff D in the state Xo at to and D in
state Xo at to. when driven by any input u"o.ro) E U. yield the same output for all t to;
thus, with p and p the response functions of D and jj resp., VUE U, Vt to,

2

153

3 Exercise. Consider the linear time-varying system of Fig, 5.5, where the switch is
closed prior to t=1 and open for t 1. Pick the state x= E R2. Consider the
states xo=(O,I) and "0=(1,1) at times to=2 and to=O. Show that these states are
equivalent at to=2 but not at to=O.

4 Equivalence. Two dynamical system representations D and i5 are equivalent iff


(a) they have the same input space U, and _(b) for any to E T, for any state Xo of D
there corresponds at least one state Xo of D that is equivalent to Xo at to, and con-
versely, for any toE T, for any 3tate "0 of i5 there corresponds at least one state Xo of
D which is equivalent to "0 of D at to. •

5 Exercise. Show that if two dynamical system repres·entations D and D are


equivalent then 'v'to ET

i.e. they have the same set of I/O-pairs, where it is understood that Y[to. oo ) is the output
of D produced by [Xo,u[lo.oo)] and Y[to. oo) is the output of jj produced by ["o,u[to.OO) J.
(note the same input u).

7 Example. Let D be a differential system representation of the form (5.1.8)-


(5.1. 9), i.e.
8 x(t) = f(x(t),u(t),t)

9 yet) =r(x(t),u(t),t)

Let
10 ",:]R.n R n:

be a bijection S.t. '" and its inverse ",-1 are Cion ]R.n. Note that, for any ERn, the

Fig. 5.5
154

Jacobian matrix E R nxn is nonsingular, (to prove this, differentiate


(",. V ERn.... )

Consider the state transformation (i.e. change of variables)

Hence, along a trajectory. by the chain rule

12 x= .

and by (8). (9). (11) and (12). there results a new differential system representation D
of the form

13 = . =:

14 yet) =

(prove this: note that by (11). t -+ x(t) is C l iff t -+ is Cl). Observe moreover. by
(11), that
V to E R+, the state Xo of D is equivalent to the state of jj at to. and
conversely,
V to E R+, the state of jj is equivalent to the state Xo := of D at to.
Hence D and fj are equivalent, having V to E T and V u[to. oo)' the same output


The nonlinear time-invariant state transformation (11) can be extended to the
time-varying case. For simplicity, we suppress the time dependence of x(t) and u(t)
where there is no ambiguity.

15 Exercise. Let D be a differential system representation (5.1.8)-(5.1.9) of the


form (8)-(9) above. Let

",(.,.) : R n x -+ lRn : -+

have the following properties:


1) V t to ",(·.t) is a bijection from R n onto JRn• whence it has an inverse w.r.1.
denoted by ",-l(-.t) s.t. if then thus 'It VXE JRn•
x = ",(",-l(x.t,).t).
2) Both ",(".) and ",-1(.,.) are C 1 on
3) ::In> Os.1. inf
I> 0
Show that by the state transformation
155

x(t) = '1'-1 (X(t),t) 'It t E R+ ,

the given differential system representation D is converted into an equivalent


differential system representation D of the form:

= [D 1 =: tE Rt.
=:
[Hints: x= Dl . + t x(t) is C 1 iff t is C1; exhibit equivalent
states.]

20 Zero-State Equivalence. Two linear dynamical systems D and D are said to be


zero-state equivalent iff
a) they have the same input and output spaces,
[thus D =(U ,1:,Y ,p) and D=(U ,tY ,M],
b) for all (t,to) E (T X T)+ and for all u E U

21 p(t,to,9;E'u) = p(t,lQ,9 t ,u) ,

[thus, for alllQE T, the zero states are equivalent].



22 Exercise. Show that two linear dynamical systems are zero-state equivalent if
and only if they have the same impulse response. [Hint: use (2.1.96).]

23 Remark. The definition of zero-state equivalence does not imply that the state
space 1: and thave the same dimension: see e.g. the Kalman decomposition Theorem
(8.6.16).

27 Algebraic Equivalence. As in Example (7) and Exercise (15) we consider a


change of coordinates to transform the state x E 1: of linear system representation
R (.) = [A(·),B(·),C(·),D(·)].

28 Exercise. Given RO=[A(·),B(·),CO,D(·)], define x(t)=T(t)x(t) where T(t) is


nonsingu/ar for all t and T(') has a piecewise continuous derivative. Show that

29 = [T(t)A(t)+T(t)]T(t)-lx(t) + T(t)B(t)u(t)

30 yet) =C(t)T(t)-lx(t) + D(t)u(t)


31 Definition. Consider systems whose representations
R 0= [A(·),B(·),CO,DO] and RO=[A(·),B(·),CO,DO] have (a) the same input- and
and (b) state spaces of the same dimension, say n. The representations
R (.) and R(·) are said to be algebraically equivalent iff there exists an n x n matrix T(I)
s.t.:
a) T(t) is nonsingular for all t e and T(I) has a piecewise-continuous derivative
for each t e R+,
156

b) T(t) relates the representations by

32 A(t)= [T(t)A(t)+T(t)]T(tr l B(t) = T(t)B(t)

33 C(t)=C(t)T(t)-1 D(t)=D(t)

for all t E 1R+.



We denote this relation by

From Exercise (28), " is related to x by the state transformation


35 x(t) = T(t)x(t) .

Furthermore u('), x(·), yO satisfy R 0 if and only if u('), T(')x('), y(.) satisfy RO.
36 Comments. a) In the study of stability, (see Chapter 7), it is important that
T(') as well as TO-I are bounded on x
whence the exponential stability of = A(t)x
becomes equivalent to that of i = A(t)", [Bro.1,p.188].
Algebraic equivalence is a special case of the dynamical system equivalence of
Exercise (15).

37 Exercise. Show that

1) if RO TO RO then R(') TO-I RO

T (.) T20
2) if R -I-RIO and RIO --RIO, then

R 0 T2(')Tt<") R 2('),

40 Lemma. If R(o)=[A('),BO,Co,DO] and R(·)=[A(·),B(·),C(·),DO] are algebrai-


cally equivalent, then:

41 a) ¢(t,"C)=T(t)¢(t,"C)T("C)-1 Vt,"CE ,

b) their impulse responses are equal, i.e.

42 H(t,"C) = H(t,"C) Vt,"CE R+.


157

43 Proof. a) Note that the function t T(t)<l>(t,'t)T(t)-1 jis equal to I for t = 't and
satisfies the d.e. X= A(t)X fort a.a.t E 1R+ as can be easily checked using (32).
Hence, by the fundamental theorem of d.e., (41) follows.
b) Equation (42) follows immediately when (41), (32) and (33) are substituted in

H(t,'t)= C(t)Cl>(t,'t)B('t)+ D('t)/)(t- 't) .



44 Theorem. If the representation R(-)=[A('),B(')SO,DO] and
RO=[A('),B('),C('),DO) are algebraically equivalent, then R(·) and R(') represent two
equivalent dynamical systems.

45 Comment. Thus algebraic equivalence implies equivalence; however, the con-


verse is not true.

46 Proof. The response of R(') is given by


I

P(t,to,xo,u) = C(t)<I>(t,to)xo + JH(t,'t)u('t)d't .


t"

By algebraic equivalence, there is an everywhere nonsingular matrix T(t) such that


RO TO RO and x(t)=T(t)x(t). Then, using (32), (33), (41) and (42) we obtain suc-
cessively
I

P(t,to,xo,u) = C(t)T(t)-IT(t)<l>(t,to)T(torIT(to)xo + JH(t,'t)u('t)d't


t"

= p(t,to,xo,u).

Hence 'V to E R+, the states Xo and Xo are equivalent at to iff Xo = T(to)xo. Therefore
R (-) and RO are equivalent. •

For discrete-time system representations RdO=[A('),B('),C('),D('») we have

50 Exercise. Given Ri')=[A('),B('),C(-),D('»), define x(k)=T(k)x(k) where T(k) is


nonsingular for all k. Show that 'V kEN

51 x(k+ 1) = T(k+ 1)A(k)T(k)-lx(k) + T(k+ l)B (k)u(k)

t a.a.1. '" "almost all (,"


158

52 y(k) =C(k)T(k)-lx(k) + D(k)u(k).

53 Definition. Consider two linear recursion systems RdO=[A('),B('),C('),D(-)] and


Rd(') = [A(·),B(·),CO,D(·)] having (a) the same input- and and (b) state
spaces of the same dimension, say n. The representations R dO and RdO are said to
be algebraically equivalent iff there exists an n x n matrix T(k) S.t.:
a) T(k) is nonsingular for each kEN,
b) T(k) relates the representations by

54 A(k) = T(k+l)A(k)T(k)-1 B(k) = T(k+ 1)B(k)


for all kEN . •
55 C(k) = C(k)T(kf 1 D(k)=D(k)

We denote this relation

56

From Exercise (50) x is related to x by

57 x(k) = T(k)x(k) .

Furthermore u('), x('i, yO satisfy R d(') if and only if uC'), T(·)x(·), y(.) satisfy RdO.
Therefore R dO and RdO are equivalent system representations with

58 <D(k,k') = T(k)<I>(k,k')T(k'r 1 . '1k,k'E N.


59 H(k,k') = H(k,k')

60 Exercise. Prove (58) and (59).

61 Exercise. Let R = [A,B,C,D] and R= [A,B,C,D] have (1) the same input- and
output-spaces and (2) state spaces of the same dimension, say n.
Let TE R nxn be any nonsinguiar matrix. Show that

62 R .I..R [i.e. Rand R are algebraically equivalent]

iff
63 A=TAr 1, B=TB, c = c r 1, D=D.

(The states are related by it. = Tx).



159

This concludes our brief overview of general system concepts; the point is that
these concepts are far more general than the special case of linear finite-dimensional
differential systems.
CHAPTER 6

SAMPLED DATA SYSTEMS

Consider the basic laws of Physics that describe the behavior of physical objects:
Newton's laws, Lagrange's equations, the Navier-Stokes equations, Maxwell's equa-
tions, Kirchhoff's laws etc. Each of these laws describe continuous-time phenomena.
At present it is cost effective to manipulate signals in digital form: for this purpose,
signals are sampled by an AID (analog-to-digital) converter, the resulting sequence of
numbers is operated on by a digital computer (a controller) and the resulting sequence
of numbers must be restored to analog form (Le. to a continuous-time form) by a D/A
(digital-to-analog) converter. Indeed, the analog form is required to actuate the physi-
cal devices that are to achieve the engineering goals.
In communication systems, at the sending end the signals from a microphone or a
picture tube are sampled and transmitted in digital form. At the receiving end the sig-
nals are restored to analog form to actuate microphones and/or picture tubes.
In control applications, the continuous-time plant-output yc(t) is sampled by the
AID converter to produce the digital output Yo' The digital inputs Uo and Yo are fed
into the digital controller to produce the digital signal vo, which is then transformed to
a continuous-time signal vc' The latter is then used to drive the plant.
The aim of this chapter is to develop tools to study such problems, namely, linear
systems with digital and analog signals.
All the derivations are carried out in the SISO case, however they apply to the
MIMO case, as explained in Remark (6.4.4) below.

6.1. Relation Between L- and z-Transforms


Figure 6.2 outlines the tasks of this section.
The problem
We are given the continuous-time function f: t f(t). We assume that

1 f has a Laplace transform with abscissa of absolute convergence O'f. Thus £(s) is

DIGITAL
,--- CONTROLLER
I
I

Fig. 6.1 Sampled-data system.

F. M. Callier et al., Linear System Theory


© Springer-Verlag New York, Inc. 1991
161

analytic for Re s > O'r and, for all 0' > O'r, f(s) -) 0 uniformly as I s I -) 00 in c:: (H' by
the Riemann-Lebesgue lemma (C.2.3).
As shown on the figure by the operator S, f: 1R+ -) IR is periodically sampled
(with period T) by an AID converter. Its output is the sequence of samples

foh··· = [fn ];.

Now it often happens that f is identically zero for t < 0, but f(t) "# 0 for t > 0
near zero: thus f(O+) "# O. By convention fo is taken to be f(O+); this suggests that
the sampling process commanded at time t=O had a minute delay in the execution of
the sampling. For these reasons, we assume that

f(t)=O for t < 0,


2 {
f is continuous at t = T,2T, ... , nT, ....

Thus the z-transform of the sampled fO is given by

3 f(z) = f(O+) + I. f(kT)z-k.


"'=1

Note that fo = f(O+) and, for n > 0t


fn = f(nT).
We shall assume that f is such that

4 f(z) has a finite radius of convergence Pr.

consequently f(z) is analytic for I z I > Pro


The figure shows four transformations: S, Z, Land T.
S takes the continuous-time function f into its sequence of samples

S : f(-) -) [fn ] ; ; note that fo = f(O+) .

L takes f into its Laplace transform f.


Z takes the sequence into its z-transform f.
T takes f into £.
5 Exercise. Prove that these four transformations are linear maps.

6 L is an injection. f is defined uniquely by the defining integral (C.l.1):


f=L [f]. If we identify any pair of functions f,g: 1R+ -) IR whenever

t If, for some M and some c < 00 and Vt 0, I f(t) I ::;; M exp(ct) then
Pr ::;; exp(cT).
162

CONTINUOUS-TIME
FUNCTIONS SAMPLED-
FUNCTIONS
f: IR+... IR S
f(O+), flT),f(2Tl, ...
O"f < IX)

tL
LAPLACE TRANSFORMS
JZ
Z-TRANSFORMS
1\
f:C ... C
.....
T f(·)
O"f

Fig. 6.2 Illustration of the relations between the four transformations S, L, Z, T.

00

llf(t)-g(t)le-tltdt=O for 0' > max {O'£,O'g}

then it can be shown that L is an injective map defined on the resulting equivalence
class. Now the inversion integral (C.2.11) recovers f, a representative of the
equivalence class.

7 Z is an injection. Z is injective as can be seen from (3) and the sequence


(fn ); is uniquely recovered from f by the inversion theorem (D.2.!1).

8 Even with assumption (2), S is not injective. Indeed, the function f1(t):= 1,
f2 (t) := 1 + sin and f3(t) = 1 + e sin
t have the same sampled sequence. Note that
if we doubled the sampling rate, the sampled sequences would be different in the three
cases: hence it is important to sample fast enough. These examples illustrate the
well-known phenomenon of aliasing [Kwa.2] There is one case where S is injective.

9 Sampling Theorem [K wa.2]. If f: IR IR is band limited, more precisely if it


has a Fourier transform f (jro) and if fOro) = 0 for all I ro I > ;, then (f(nT»)O'
defines uniquely t f(t):

00 sin [; (t-nT) ]
f(t) = L f(nT) --"----"-
-00 (t-nT)
T
163

Since by (8) and the examples above, two different time-functions f(') and g(')
may have the same sequence of samples, it follows that different f and gmay produce
f and g that are equal: in other words
10 T is not injective.

Representation of the map T : i'(.) --+ r(·)


Several useful representations of T are given by the following theorem.

11 Theorem. Assume that f: --+ IR satisfies assumptions (1) and (2); let f be
defined as usual and let f be defined by (3); choose c E R and constrain s E € and
ze € by
12 Of < c < Re[s] and z=esT ,

then, for exp(orT) < expecT) < I z I, we have


c+jm
138 (i) f(z)=.!.f(O+)+_l_. f(p) J dp,
2 27t] c-jm l-z- l exp(pT)
or
I c+jm fcp)epT
13b f(z)=f(O+)+ 27tj elm z-exp (pT) dp ,

14 (ii) -f(z)=f(e
- ST Il
)=- f(O+)+- +00.f 21tk 1;
fs+j-
2 T T
k=-oo

• n rk
15 (iii) if, in addition, f(s) = - - ; then
k=1 S-Pk

- n
fez) =
Z
16


rk -----'.:.--
k=1 z-eXP(PkT)

Before proving the theorem we must make a number of remarks.

Comment. Expression (13a) above is standard: it is very convenient if one wants to


close the integration path by a right half semi-circle. Expression (13b) is a slight
modification of (13a): it is very convenient if one wants to close the integration path
by a left half semi-circle.
164

The map s -+ exp sT. Equation (3) shows that for sampled data systems, the
variables z and s are related by z = esT: this map takes the point s = O+jOl into the
point z = eoT(cos coT +j sin coT).

20 Exercise. (The map z=e sT)


(Drawing pictures is highly recommended.)
Show that (a) the s-plane strip - ; ::;; Im[s] ::;; ; is mapped by s -+ z=esT onto
the whole complex plane.
(b) Consider, in the z-plane, the haIf-lines z = pei 9, 0 P 00, for
9=0°,45°,90°, ... ,335°; obtain their preimages in the s-plane.
(c) Let 00 be given, consider the s-plane rectangle - 00 < 0 < 00 < 0,
Ico I < ;; obtain its image in the z-plane.

(d) Add to the above s-,Iane rectangle the triangle specified by the vertices
[oo,j ; ]. (0,0) and [0 -j; ;obtain the z-plane image of the resulting polygon.
0,

Write the equation of the boundaries.

21 Poles of (s) and poles of f(z).


Assume that f is a rational function and goes to zero uniformly in arg s, as
Is I -+ 00 in Re[s] S; c; use (l3b) and the Jordan lemma to prove that

22 PkE p[n exp(PkT)e P[t].

For example, left half-plane poles of ( map into poles of f that are in D(O; 1), the
unit disc centered on the origin.

23 Zeros of '(s) and f(z). Equation (14) shows that there is no simple relation
between the zeros of f(z) and those of (s). There is, however, an interesting limiting
case:

24 Asymptotic property of the zeros. t::t (s) be rational and let f(s) = O( 1/s2) for
I s I -+ 00; call ... ,Zm the zeros of f(s), then, as T -+ 0, (i.e. as the sampling
rate increases), lim f(exp(zjT» = O.

Proof. Hint: use (14).

Proof of Theorem (11)

Proof of (i). By definition (3) of f(z) and by (2), we obtain

- 1 1
f(z) = "2 f(O+) + - [f(O-)+f(O+)] + L00
27 f(kT)z-k.
2 k=1
165

Using the inversion integral (D.2.11). we have

28

For each k and for fixed z. the integral converges absolutely since of < c; furthennore
since I z I > exp cT. in the integrand I z-k exp(pkT) I < 1 along the integration path;
therefore we may interchange the order of summation and integration and we may sum
the geometric series. and thus obtain
c+jm
I
A

138 f(z)=l. f(O+) + f(P) dp.


2 21tJ c-jm l-z- l exp(pT)

To obtain (13b) we write

f(z)=f(O+) + r.
k=1
f(kT)z-k

and proceed as above.

Proof of (ii). We first examine some properties of the integrand of (13a).


Consider a fixed value of z. say Zo. where. of course, IZo I > expecT). For that
Zo, the integrand of (13a) has poles at

29 'VUE Z.

These poles lie to the right of the vertical integration path of (13a).
Note that if f(p) itself has poles. say, PI.P2' .... then for all such poles.

30 Re Pk Of < C

and they to the left of the integration path. Furthennore. f(p) is analytic in
Re s > Of: f is analytic on the integration path and to its right.
Note that the two observations above also hold for the integrand of (l3b). By
assumption (1) and the Lebesgue lemma (C.2.3), for all 00 > Of. as
Ipl 00. with Rep> 00. f(p) 0 uniformly. In order to calculate (13a). let us
use the residue theorem: so we first close the integration path to the right by a half
circle of radius R and then let R -+ 00. Since on the half circle p = R exp j6.

Iexp pT I = exp(R cos 6 T) -+ 00 as R -+ 00 with

Since f(p) 0 uniformly on the right half circle. by the Jordan lemma (C.2.26) the
166

contribution of the half circle goes to zero as R -+ 00. Hence, in the limit of R -+ 00,
all the integrand poles to the right of the vertical path are included and, by calculating
the residues, we obtain (14).

Proof of (iii). Follows by direct calculation from


n
f(t)= l(t) l: rk exp(Pkt).
k=1

32 Exercise. Obtain (16) by calculating (13b) (close the integration path to the
left!).

" rl
33 Exercise. Let f(s)= (1-e-sT) - - . a) Show that we cannot close the
S-PI
integration path on the left to calculate fez) by (13a). b) Use (l3b) to calculate fez).
c) Use time-domain calculations to get
- z-1
f(z)=rl (T) .
z-exp PI

6.2. DIA Converter

01 A It-_uc__
Fig.6.3 D/A converter.
The D/A converter shown in Fig. 6.3 maps input sequences into piecewise con-
stant functions; more precisely if the input is the sequence Uo = [uo(nT»); then the
output is the piecewise-constant function uc: t -+ uc(t) given by, 'V t 0,

1 uc<t)= l:
n=O
[1(t-nT)-I(t-(n+l)T)] uo(nT) .

See Fig. 6.4.


In tenns of transfonns, the sequence Uo is specified by its z-transform uo(z) and
the output ud') is specified by its L-transform lids). From (1) we obtain
1_e-sT
l:
"00
2 uc<s) = uo(nT)e- snT -- •
n--() s

Hence recalling the definition of u(z),


167

T I -e-sT
3 uC<s)=uo(e S
A
) --- •
s

Equation (3) gives s -+ lic<s), the L-transform of the piecewise-constant


continuous-time output t -+ ue(t) in terms of the z-transform z -+ uo(z) of the input
sequence uo.

o T 2T 3T 4T 5T 6T

Fig. 6.4. Typical output of a D/A converter.


6.3. AID Converter
The AID converter shown in Fig. 6.5 maps continuous-time functions into
sequences of samples; more precisely, if the input function YeO is continuous, (no
jumps!), then, V n EN,

1 yo(nT) := yC<nT)

the nth sample is simply the value of Ye at time nT.


In terms of transforms, referring to Fig. 6.5, we get
2 Yo(z) = T[Ye(-)](z) .

Now if YeO is identically zero for t < 0 and y(Ot-) = 0, then

3 yo(esT ) = n!loo)le [S+j 2;n ] .

4 Special case. By (6.2.16), if 9c<s) = 1:


k=1 S-Pk
then

Note that (5) is even valid when Ys(s) = O( lis) as I s I -+ 00 - equivalently when
'* 0 - '*
m
Lrk in that case, yc<Ot-) yC<G-) = 0 and for (5) to be valid the first sample
1
must be yC<O+).
168

Yc YO
AID 1-----
--'
Fig. 6.5 NO converter: Ye is the continuous input.

6.4. Sampled.Data System


Consider the system shown in Fig. 6.6: we assume that the D/A and the NO
converters are synchronized. The input sequence Uo is fed into the D/A converter to
produce the piecewise constant output uc. The latter is fed into the linear time-
invariant continuous-time system specified by its transfer function g(s). We assume
throughout this section that this linear system is in its zero-state at t = 0-. The output
of the linear system is the continuous-time function Ye. The latter is sampled by the
NO converter to produce the output sequence yo. Thus we have an overall system
whose input is the sequence Uo and whose output is the sequence YD. and both
sequences are synchronized. We assume that

1 a) the D/A and the AID converters have the same period T and are synchron-
ized.

2 b) g(s) is a strictly-proper rational junction, i.e. g(s) -+ 0 as I s I -+ 00.

Let Pl.P2.....Pm denote the poles of gthen the abscissa of absolute convergence of
g is Og := m!lX I
(RePi)'

By assumption (1). the overall system is time-invariant. More precisely the


discrete-time system that takes the sequence Uo into the sequence Yo is linear and
time-invariant: hence it is described by its pulse transfer function geq(') or its weight-
ing sequence (impulse response) n -+ geq(n). Thus we have

3 Yo(z) = geq(z)uo(z)
and

I
,,-----4 CLOCK 1-----------7
I

_ u_
D__
11DIA
I I Uc
..
LINEAR SYSTEM
9(8)
Yc
EJ-'
AID ---
Yo

Fig. 6.6. Overall system under study: geq(z) is the pulse transfer function
from the input sequence Uo to the output sequence YD'
169

n
4 yo(n) = 1: geq(n-k)uo(k) .
k=O

To calculate geq(') we proceed as follows: apply the input sequence


uo=(1.1.1 •....). hence uo(z)=z/(z-1)=O-z-1)-1. Consequently
1 ( 1
5 uC<s)=- and Yc s)=- g(s).
s s

Note that yc<s) is a rational function of s that is at most 0(1/s2) as 1s 1 -+ 00. Since
1.s g(s) is the input to the AID converter. its output is specified by (see (6.3.2»:

6 yo(z)=T [! g(S)]'
Finally

7 geq(z)=(l-z-I)T [+ g(S)]
and by (6.1.13a)
c+joo
8 ge I glat(p) dp
q 21tJ c-joo p[1-z-1exp pT]

where max {Og.O} < c and expecT) < 1z I.


Now. if we use (6.1.l3b) to evaluate (7) we obtain

I.
c+joo
10 - (z)=O-z-l) _1_ exp(pT) iiPl dp
geq 21tj z-exp(pT) p
C-jOO

where max {Og'O} < c and expecT) < 1z I.


Since by (2). g(s)/s is at most 00/S2) as 1s 1 -+ 00, by closing the integration
path by a left semi circle. we see that geq(z) is a strictly proper rational function of z.
To see that it is strictly proper note that for the input uo=O.O,O •...•O•... ). yc<Ot) =0.
Thus we have proven the following theorem.

11 Theorem. Let assumptions (1) and (2) hold, (i.e. synchronization and gO
rational and strictly proper), then the pulse transfer function geq(') relating the input
sequence uD to the output sequence Yo (see Fig. 6.6) is given by
170

where geqO is a strictly proper rational function in z, which is given by (8) and (10).

12 Exercise. Let assumptions (1) and (2) hold. Call Pl,P2, ... , Pm the poles of
g(s)/s, (they may have order 1).
a) Use (1) to show that

13 geq(z) = (l-z-l) f
i=l
Residue [ exp(pT) .
Pi z-exp(pT)
ili ] .
p

b) If, in addition all the poles of g(s)/s are simple, show that

14 geq(z) = (l-z-I)
m
E exp(PiT [g(S) ]
Residue.
i=1 z-eXP(PiT) Pi s

m' r
c) If, in addition, g(s)= E _k_, (1tkE 0: is the kth pole of gO), then
k=1 s-1tk
m' rk exp(1tkT)-l
15 g (z)-1: -
eq - k=1 1tk z-exp(1tkT)

d) Same assumption as in (c); use the time-domain: set uo=(1,O,,,.,O,,,.) calcu-


late yd') and obtain 8eq(z) as the z-transform of the sampled YeO.
e) If g(O) 0, show that g(O)=geq(1). (Hint: use (13». Give a physical
interpretation of this result.
f) Show that geq(') is analytic at z=O. (Use (13); note that the residues are
rational functions of z, analytic near z=O,,,. )
g) Show that

Pk E P [g ] <::> exp(PkT) E P [8eq(Z) ] .

16 Remark on Zeros. Examples show that even if gO is minimum phase, i.e. has
no €+-zeros, then geqO may have zeros outside the unit disk. [Ast.l]. For example,

g(s)= 1/s3

the numerator has a zero at-3.732.


(For further information on the zeros of g(s) and geq(z) see [Ast.l)).
171

18 Remark on Notations. It is very important to avoid confusion t: Here g(s)


denotes the transfer function of the continuous-time device, consequently, its impulse
response is get). If we were to sample this impulse response, its z-transform would be
denoted by g(z): in other words,

19u g(z)=T[g(s)] .

Now geq(z) is the pulse transfer function of the system shown in Fig. 6.5: its use is to
calculate Yo from lIO by (3): from Fig. 6.5 we see that

20 geq(z) = T [-7- g
1 -sT ]
(s) .

It turns out that it is more convenient to calculate geq(z) by (7), namely,

7 geq(z)=(l-z-I)T ].
g
Equation (7) is convenient because (s)/s is at most 0(1/s2) as I s I 00, hence we

may close the integration path of (8) and (10) to the left without changing the value of
the integral.

Fig. 6.7. Control system with a digital controller.

6.5. Example
Consider the control system shown in Fig. 6.7: the digital input is uo and the
continuous-time output is called Ye. The digital controller is specified by its pulse
transfer function c(z): hence vo(z)=c(z)co(z). The continuous-time plant is specified
by its transfer function g(s) which is assumed to be strictly proper and rational. The
NO and D/A converters are assumed to be synchronized. Hence geq(z) is the pulse
transfer function relating \to to Yo:

t More than one author uses the same notation for two different functions!
172

1 YD(Z)=Seq(Z)VD(Z) .

Writing the summing node equation, we obtain successively

Solving for eD(z), we obtain the Laplace transfonn of the output

Note that y(s) is the product of five factors: the last three are periodic in s with period
. 2x
JT'-

4 Remark on the MIMO Case


Careful examination of all derivations in this chapter show that the results apply
to the MIMO case. For example if f: R+ JRn , then, in (6.1.13a), f(p) and f(z) are
vectors in G: n. In (6.4.3), UD and YD are vector-valued and Seq is a matrix of suitable
size: the point is that (6.4.8) and (6.4.10) still hold. Equation (3) above still holds,
because the order of all transfer functions has been carefully preserved through all cal-
culations.
CHAFfER 7

STABILITY

This chapter describes the stability of a linear system from different points of
view: bounded-input bounded-output stability (I/O representation, external stability)
and the stability of the zero solution of x= A(t)x (state representation, internal stabil-
ity): for the latter case the notions of asymptotic- and exponential-stability are
developed. More specific topics conclude the chapter: bounded trajectories and regula-
tion, response of a linear stable system to T-periodic inputs, equilibrium solution of a
driven T-periodic differential system and slightly nonlinear systems.

7.1. I/O Stability


Consider a linear dynamical system having as I/O map a superposition integral on
R, i.e.
t
1 y(t)=F[u](t)= f H(t;t)u(t)dt 'Vte IR
-00

where
H(','):RxR RDoXn; and 'Vte R, t H(t,t) is piecewise continuous,
u(')e PC (R,Rn,) S.t. the integral in (I) makes sense, (e.g. u(t)=9 'Vt < 0),
y(-):R Rllo.

2 Comments. a) It is assumed that the state is zero at to=-oo, (e.g. as in com-


munication theory).
For a linear differential system representation R (.)= [A('),B('),C(-),O] with
t

u(t) = 9 for t < 0, the RHS of (l) reads f H(t,t)u(t)dt.


o

3 Preliminaries. a) Throughout this section vectors will be normed by their sup-


norm and matrices according to the corresponding induced matrix norm, i.e. for all
xe R n and for all Ae IRmxn

4 IIxli := max I xi I
i

5 IIAII := '!lax l: Illjj I [max row-sum]


lem jen

For more on this see section A.6.l.


b) Vector functions will be measured by their L""-norm, thus for functions u(') and
y(.) in (1),

F. M. Callier et al., Linear System Theory


© Springer-Verlag New York, Inc. 1991
174

6 lIull
-
:= sup lIu(t)1I = sup {max Uj(t)
JElli
I I},
7 /lyll ::;;: sup IIy(t)ll = sup {'!lax I Yi(t) I } .
00 IE. lEn.

For a discussion of linear nonned Loo-spaces, see (A.6.S4) et seq.. Remember that a
function belongs to L 00 iff its L--nonn is finite or equivalently the function is
bounded. t
y) An attempt will be made to view the I/O map F : u y of (I) as a linear
transfonnation from L;.' into

8 I/O stability. The concept of input-output stability, (I/O stability), roughly asserts
that "any bounded input uO produces a bounded output yO using a finite gain." More
precisely:
We say that the system described by (1) is I/O stable iff

9 :3 k < 00 S.t. for all bounded uO IIyll 00 S; kllull 00 .

10 Comment. See section A.6.S. on continuous linear transfonnations. Note that in


(9) not only does a bounded input u(') produce a bounded output y(.), but also its
Loo-nonn, i.e. lIylloo' is smaller than or equal to a fixed number k times Ilulloo' In
other words the I/O map (1), considered as a linear transfonnation

11 F : L;' ;u F [ul =; y ,

is a continuous, (i.e. bounded, (A.6.82», linear transfonnation. Indeed F has an


induced nonn defined by:

IIFII := sup {"YI Ullull_ : IIull_ E (O,oo)}

12

Hence by (9), the system described by (I) is I/O stable if and only if the I/O map F
maps L.;:' into such that there exists a k < 00 for which IIF II < k < 00.

t Mathematical purists should read "essentially bounded" and replace the sup in
(6)-(7) by the essential supremum.
175

13 Exercise. Show that:


the system described by (1) is not I/O stable if there exists a sequence of inputs
(u l (-) ) ; cPC (R,IR"') s.t., with / := F [u l J. for all lEN

II/II 00 > I .

14 Ilulll 00 = 1 and

17 Theorem [I/O stability]. The linear system described by (1) is I/O stable iff

18 {L IIH(t. ,)lld, } =, k < =

where the norm of H(t,'t) is the induced matrix norm (5).

19 Comments. (1.) assertion (18) is equivalent to: the map t -7 f IIH(t;t)lIdt is

bounded on R or, equivalently:

20 'It (i,j) E n" x lli the map t -7 f I hilt,t) I dt is bounded on IR.

The proof below shows that k, defined by (18), is an upperbound for the induced
norm of the I/O map F given by (12); hence it may be used as the fixed constant k of
I/O stability definition (9).
y) In (18) above, any matrix norm may be used since matrix norms are equivalent,
(A.6.45).

23 Exercise. Prove equivalence (20).


110 nj
[Hint: using the norm (5) we have: I hij(t,'t) I IIH(t,t)1I L L I hi/t;t) I ]
i=1 j=1

24 Proof of Theorem (17). 1) Sufficiency. We prove that (18) implies (9). There-
fore let uO be any input that is bounded on 1R, i.e. whose norm lIuli oo given by (6) is
finite. Then from (1), taking vector norms, we have, successively, 'It t E IR

Ily(t)1 I= II J H(t,t)u(t)dtll ::::; J IIH(t,t)u(t)lldt


176

s; f IIH(t,t)lIllu(t)lldt
-00

(we used successively (A.6.20), (A.6.17), (6) and (18». Hence, by (7), taking the
supremum w.r.t. t we obtain (9) with k as defined in (18).
2) Necessity. We use contraposition, i.e. the negation of (18) must imply the nega-
tion of (9) (i.e. system (1) is not UD stable). Therefore, let us negate (18); then,
negating equivalence (20), we obtain: there exists E !10 x OJ S.t.

2S

Now w.l.g. Ct= 1 and I, whence


V lEN there exists tl E R S.t.

26 f IhJl(tl,t)ldt > I.
-00

(Indeed, if (26) is false, then so is (25». Now define a sequence of inputs


[u l (-) ); cPC (R,IR";) as follows: V lEN

27a u l (t) := (ut(t),O,O, ... ,O) V t E IR ,

with
sign[h 11 (tl,t)1 V t E (-co,td
{
27b ut(t):= 0 elsewhere

(sign(Ct) := 1 for Ct > 0, := 0 for Ct= 0, := -I for Ct < 0).


Therefore, a) by (27) and (6)

28 VI E N

(note that by (26) hJl(tl,t) is not identically zero),


b) by (1) (for t=tl) and (27), the output /0 at the tl corresponding to ulO has a
first component
177

I,

29 y/(tl)= f Ihll(tl,'t)ld't,

c) by (7), (29) and (26)

30 '<lIeN lIyllloo lyl(tl)1 > I.

Hence by (28) and (30) the system described by (1) is not I/O stable by Exercise (13) .•

33 System representations R C·) = [AC·),BC·),C(-),DC·)]. Consider the I/O map (1). If


HCt,'t) is the impulse response (2.1.97) of a linear system representation
R C·)= [AC'),B('),CO,DC')] with u(t) = e for t < 0 and in the zero state at time - 00,
(hence in the zero state at to = 0), then the I/O map reads

yet) = F [ul(t) = JHCt,'t)u('t)d't


o
I

34 = JCCt)<ll(t,'t)B('t)u('t)dH D(t)u(t)
o
where uC·) e PC and y(.) e PC ClR+,lRl1o ).

35 We say that Cthe system described by) R (.) is I/O stable iff condition (9) holds
for the I/O map (34).
Note in particular that only the first tenn of the last expression of (34) is a superposi-
tion integral. However if DO is bounded on lR+ then R 0 = [AC'),BC'),CC'),DC')] is I/O
stable iff RO=[A(-),Bo,CC'),O] is I/O stable, (prove this). Therefore by Theorem
(17) and Comment (19. y), we have the following.

36 Corollary. Consider a linear system representation R (-) = [A(' ),B(· ),C(· ),D(-)l
where DO is bounded Then R (-) is I/O stable iff

37 {11lc(')<I>(',<)8«)lI d<} < 00,

(where the matrix nonn used is arbitrary).



40 Linear time-invariant systems. If the system described by (1) is time-invariant,

then H(t,'t) depends only on the elapsed time t-'t. Thus J IIH(t,'t)lld't= JIIH('t)lIdt,
o
178

i.e. the first expression is independent of t e IR. Hence, by Theorem (17) we have.

41 Corollary. Let the system described by (1) be time-invariant. Then the system
described by (1) is I/O stable iff
00

42 f IIH(t)lldt < 00,


o

i.e. the impulse response is absolutely integrable. I

Finally combining Corollaries (36) and (41) we have the following.

43 Corollary. Consider a linear time-invariant representation R = [A,B,C,D] with


transfer function R(s) e (I: p(S)n.xni given by (3.2.63). Then

R is 110 stable

00

44

A 0
45 P [H(s)] c (1:_ .

46 Comment. (45) means that B(s) has no poles in (1:+, or equivalently is analytic
on (I: +: a well known characterization of "external stability."

47 Proof of (44) <::> (45). Set G(t) = C eAtB 'v'te R+, hence B(s)=G(s)+D.
Observe now that (44) <::> (45) is equivalent to 'v' (i,j) e lloXfij
00

J I gjj(t) Idt < 00


o

o
P [gij(s)] C c:_ .

We prove this equivalence.


00

::;:. Indeed by (C.2A) sup I gij(s) I s


SE it.
f0 I gij(t) I dt < 00.

.;: gij(t) is of the form


179

I
48 gij(O = I: 1tk(t)exp[Akt] \;f t E R+ ,
k=1

o
where \;f k, Ak is a pole in Ci:_ and 1tk(t) is a polynomial in t.
Observe now that as t 00, any polynomial in t will grow at a slower rate than any

growing exponential exp[et] with e > 0, whence for any given polynomial 1tk(t) ,
\;f e > 0, \;f k E L
3 mk(e) E Rt s.t.
49 11tk(t) I mk(e) exp[et] \;f t E R+ '

(to check this, take logarithms).


I
Therefore picking J.1 := -max {ReAk} > 0, eE (0,1l), and m(e) := I: mk(e), we
k k=1
have by (48) and (49)

50 I gij(t) I mexp(-(Il-e)t) 'v' tE R+

00

f I gij(t) I dt

with Il-e > O. Hence m(ll-e)-1 < 00.
o

51 Comment. The proof above shows that for any proper transfer matrix
H(s) E Ci:p(S)n.><n1 with 0(8) := H(s)-H(oo) E Ci:p.O(S)n.><nI,
A 0
52 P [R(s)]!:: Ci:_
¢>

with Il:= -max{ReA:AE P[H(s)l=P[O(s)]} > 0:


53 \;f e E (0,1l) 3 m(e) E R+ s.t.

110(011 S m(e) exp[-(Il-e)t) \;f t E R+ .

In other words the impulse response (corresponding to the strictly proper part of H(s»
is exponentially decaying with a positive decay rate arbitrarily close to J.1 > O. Hence,
the following result,

54 Corollary. Let in (1), a) H(t,'t)=H(t--'t) represent the impulse response of a


proper transfer function matrix H(S)E Ci:p(s)n..xn; and b) u(·) and yO be complex
valued vectors, then the system described by (1) is I/O stable
¢>
180

o
P [R(s)]!:;; (L.

7.2. State Related Stability Concepts and Applications
In this section we obtain some results that are partial answers to the following
questions. Given a linear differential system R (-) represented by

1 x(t) = A(t)x(t) + B(t)u(t)

2 yet) = C(t)x(t) + D(t)u(t)

Under what conditions on A('), BO, C('), DO does a bounded input produce for all
XOE IRn and all toE R+ bounded state- and output-trajectories on [to,oo). In the first
subsection we consider only the state trajectory with u=elj, (i.e. the zero-input case)
and define concepts that guarantee that the state tends to zero as t 00. In the next

subsection we consider R (.), i.e. (I) and (2), and insure bounded trajectories for both
the state and the output; in addition we guarantee regulation, i.e. when the input tends
to zero as t 00, then so does the state and the output. The section concludes with
some specific applications.

7.2.1. Stability of i = A(t)x


We study the stability of the zero equilibrium solution of the linear d.e.

3 x=A(t)x Vte

where x(t)e R n and A(')e PC(R+,Rnxn). We now take the differential equations
point of view, e.g. [Mil.1]. For obvious physical reasons, we restrict ourselves to the
forward motion of the state, by (2.1.31) the solution of (3) reads:

4 x(t) = cl>(t,to)xo

where cl>(''') is the transition matrix (2.1.32).

Remarks. In the nonlinear case, equation (3) reads

5 x=f(x,t)

where f('''): Rn x R R n is sufficiently smooth S.t. V (xo,to) E IRn x IR+ (5) has a
unique solution x(') = $(' ,to,xo) that is continuous on IR+. On comparing (5) and (3)-
(4), we observe that, in the linear case with L=lRn , in order to obtain all state trajec-
lSI

tories starting from to we need only know the solutions of x=A(t)x fort XO=Ei for all
i E n; moreover generically there is only one equilibrium point. viz. x ==
e. This is.
in general. not true in the nonlinear case: equation (5) may have more than one equili-
brium point (some stable or unstable); furthermore. one may encounter so-called
periodic closed orbits. some of which are stable and some unstable. For example,
consider the motion of the point (r,e) in the plane described by

r.=-sin r
{
e= r.

There are infinitely many solutions corresponding to periodic closed orbits, to wit: for
kEN, the equations

rk(t)= klt
{
Ok(t)=km

define state trajectories that are circles, say Ck , of radius klt, where for k=O Ck is a
stable equilibrium point, for k=2,4 .... , Ck is a stable closed orbit. and for
k= 1.3.5 ..... Ck is an unstable closed orbit. Indeed. consider the following initial con-
ditions:
1) if 0 < ro < 1t and 00 is arbitrary. the state trajectory spirals into the origin (i.e.
Co);
2) if 1t < ro < 21t and 80 is arbitrary, the state trajectory spirals outwards and
asymptotically approaches C 2;
3) if 211: < ro < 411: .... etc. (The reader should draw a sketch of these orbits and of a
few state trajectories.)
Thus. depending on how far away from the origin the initial state (ro,Oo) is. the nature
of the solution is qualitatively very different.
For more on this, see e.g. [Mil.lJ. In this section we will take full advantage of the
linearity of the d.e. (3).

Asymptotic stability. This property roughly asserts that every solution of x= A(t)x
tends to zero as t 00. More precisely we have the following.

6 Definition. We say that the zero solution x(t) == 8 of x=A(t)x (on t ;?.O) is
asymptotically stable (asy. stable) iff. for all Xo E IRn, for all to E 1R+
a) t x(t) = <l>(t,to)xo is bounded on t ;?. to,
b) t x(t) = <l>(t,to)xo tends to zero as t 00. •

7 Comment. Observe that any solution xO is continuous and so b) a). Hence

t Ei = the standard ith unit vector of IRn.


182

by the linearity of the solution in Xo we expect the following.

8 Theorem [Asymptotic Stability). The zero solution of x=A(t)x on t is asy.


stable if and only if

9 0 as t 00 •

10 Comments. ex) Since t <1>(t,O) is continuous on R+, condition (9) IS

equivalent to

<1>(t,O) is bounded on IRt '


11 {
0 as t 00 •

This interpretation is important for the definition of uniform asy. stability in Comments
(19) below.
\3) t can be calculated by solving successively x = A(t)x from Xo = Ej at
10=0 for all ie n.

12 Proof of Theorem (8). Sufficiency_ (9) => asymptotic stability.


x
Considering any solution (4) of = A(t)x, we have, by taking vector norms,

IIx(t)1I 11<1>(t,to)llllxoll = 11<1>(t,O)<1>(O,to)lIllxoll

11<1>(t,O)llII<1>(O,to)llllxoll

(where we used e.g. the sup vector norm, (A.6.17), (2.1.56), and (A. 6.1 8». Now
is a constant independent of t and equivalence (11) holds_ Hence by
(9) we satisfy the conditions of definition (6).
Necessity. By contraposition. So assume that t <1>(t,O) does not tend to zero as
t 00. This must hold for at least one entry of <1>(·,0), say w.l.g. <l>1l(·'0). Now
t <l>ll(t,O) is the first component of the state trajectory x(·) due to XO=EI (the first
standard unit vector) at to = O. Thus, this solution of x = A(t)x does not tend to zero
as t 00. Hence the zero solution of x=A(t)x is not asy. stable. •

Exponential stability. This property roughly asserts that every solution of x= A(t)x
is bounded by a decaying exponential depending on the elapsed time !-to- This
technically most desirable property is not always guaranteed by asymptotic stability.

16 Example. The linear time-varying circuit of Fig. 7.1 is described by the state d.e.

q=-q/(l+t)

(q is the capacitor coulombs). Whence by standard methods

q(t) = <1>(t,to)qo= 1+1 qo 'Ii t to· Hence <1>(t,O)=(1+t)-1 0 as t 00


183

Fig. 7.1. The linear time-varying circuit of Example (16).

and the zero solution is asy. stable. However for every Cl > 0 and for every solution
q(.), Iq(t)lexp[Cl(t-to)) 00 as t-to tends as 00. Hence Iq(t)1 cannot be bounded by

a decaying exponential.
We are now ready for a formal definition.

17 Definition. We say that the zero solution of x=A(t)x on t is exponentially


stable, (exp. stable), iff
:3 Cl>O and m>O s.t. for all toE 1R+
18 11<1>(t,to) 11 m exp[-Cl(Ho)]

(where the matrix norm used is arbitrary).

19 Comments. Cl) The positive constants Cl and m are fixed, i.e. independent of
toE R+; in particular Cl is afixed decay rate.
Using an induced matrix norm i.e.
11<1>(t,to)11 := sup (11<1>(t,to)xoll/llxoll : Ilxoll :t 0) we see that. with x(t) = <1>(t,to)xo.
exponential stability is equivalent to:

:3 Cl > 0 and m > 0 s.t. 'Ii (xo.to) E IRn x 1R+

IIx(t)ll mllxoll exp[-a(t-to)] 'Ii t to .

x
In other words every solution of = A(t)x is bounded by a decaying exponential in
t-to. whose defining constants mllxoll and Cl are such that m and Cl are fixed.
y) By abuse of language, the expression "the zero solution of x A(t)x is expo =
stable" is often replaced by "x = A(t)x is expo stable" or "the equilibrium x = e is expo
stable" or "A(·) is expo stable."
B) [Mil.l,p.188). A stability concept equivalent to exponential stability is the fol-
lowing:
x
The zero solution of = A(t)x on t 0 is said to be

20 unitormly asymptotically stable if and only if

a) 1 ---) <1>(I,to) is bounded on 1 to uniformly in to E 1R+, i.e.


184

21 :3 k < 00 S.t. V to E IR+ 1I<l>(t,tom:5 k Vt to ,

b) t <l>(t,to) tends to zero as t 00 uniformly in toE 1Rt, i.e.


22 '\7'£ > 0 :3T(e) > 0 S.t. VtoE R+ 1I<l>(t,to)ll:5 £ Vt

[It is crucial to note that the constants k in (21) and T(e) in (22) are independent of
toE compare with the conditions (11) for asymptotic stability.]
e) In the time-invariant case x= Ax, (with A constant), asymptotic stability is
equivalent to exponential stability. Indeed <l>(t,to) = exp[A(t-to)] depends only on the
x
elapsed time t-to; hence the zero solution of = A(t)x is asy. stable, (i.e. (11) holds),
iff the zero solution is uniformly asy. stable, iff the zero solution is exponentially
stable.

*25 Comment. A proof of

<=:
=>: Observe that given any T > 0,
°
A(') is uniformly asy. stable ¢:> A(-) is expo stable, is as follows:
(21) and (22) follow from (18) with k=m and T(e) > S.t. exp[-aT] :5 em-I.

Vt ::I!nE Nand ::I!SE [O,T) S.t.

26 t-to=nT+s.

Let toE IRt be arbitrary but fixed and pick in (22) some T(e) > for e= 112. Then, °
by (22), 11<l>(s+to+T,to)11 :5 112 V s O. Hence, using composition and induced
matrix norms as in e.g.

<I>(s+to+2T,to) = <l>(x+to+2T,to+T)<l>(to+T,to) ,

1I<I>(s+to+2T,to)ll :5 1I<l>(s+to+2T,to+T)II'II<l>(to+T,to)ll :5 2-2 ,

it follows by induction that

27 Vs 0 V n = 1,2,... 1I<I>(s+to+nT,to)ll:5 Tn.

Pick now a > 0 S.t. exp(aT) = 2. Hence

28 V s E [O,T) 1:5 2 exp(-as) ,

whence V s E [O,T) 2-n :5 2 exp[-a(s+nT)]. Combining this with (27) we have

29 V s E [O,T) V n= 1,2,... 1I<l>(s+to+nT,to}ll:5 2exp[-aCs+nT)] .

Now, by (21) and (28)


185

30 'liSE [O,T) 1I<1>(S+Io,Io)ll :5 k :5 2k exp(-as).

Hence by (26), (29) and (30)

'lit 11<1>(t,Io)ll:5 2kexp[-a(t-to)] ,

where the positive constants 2k and a are independent of to.



We are now ready to discuss characterizations of exponential stability for some
special cases.

33 Theorem [Time-invariant case; A is a constant matrix]. The zero solution of ẋ = Ax is exp. stable iff

34  σ(A) ⊂ ℂ₋°

(i.e. every eigenvalue of A has a negative real part).

35 Analysis. Φ(t,t₀) = exp[A(t−t₀)]. Now, by (4.4.38),

36  exp[At] = Σ_{k=1}^{σ} π_k(t) exp[λ_k t],  where {λ_k}₁^σ = σ(A)

and, ∀ k, π_k(t) is a matrix polynomial in t. Hence, by taking matrix norms, ∀ t ≥ 0,

  ‖exp[At]‖ ≤ Σ_{k=1}^{σ} ‖π_k(t)‖ exp[(Re λ_k)t]

37   ≤ Σ_{k=1}^{σ} p_k(t) exp[(Re λ_k)t] ≤ p(t) exp[−μt],

where the p_k(t) are polynomials s.t. p_k(t) ≥ ‖π_k(t)‖, p(t) := Σ_{k=1}^{σ} p_k(t) ≥ 0 and

38  μ := −max{Re λ : λ ∈ σ(A)}.

Since a polynomial grows slower than any growing exponential, we have

39  ∀ ε > 0 ∃ m(ε) > 0 s.t. 0 ≤ |p(t)| ≤ m exp[εt]  ∀ t ≥ 0.

Hence, combining (37) and (39),

40  ∀ ε > 0 ∃ m(ε) > 0 s.t. ‖exp[At]‖ ≤ m exp[−(μ−ε)t]  ∀ t ≥ 0.

41 Proof of Theorem (33). If σ(A) ⊂ ℂ₋°, then, by (38), μ > 0. Hence, picking ε ∈ (0,μ), we have that (18) holds with α = μ−ε > 0. Therefore sufficiency holds.
On the other hand, if σ(A) is not included in ℂ₋°, then by (36) exp(At) does not tend to the zero matrix as t → ∞ and the zero solution is not exp. stable. ∎

42 Comment. If A is exp. stable then, by (40), exp[At] is bounded by a decaying exponential with positive decay rate approaching μ > 0, (38), arbitrarily closely from below. If, in addition, A is semi-simple, then ε in (40) may be made zero, i.e. the decay rate μ > 0 is guaranteed; indeed, under these conditions each matrix polynomial π_k(t) in (36) is a constant matrix.
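
The eigenvalue test (34) and the bound (40) are immediate to check numerically. A minimal sketch (assuming NumPy and SciPy; the matrix A, the value of ε and the sampling grid are arbitrary illustrations):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])          # illustrative matrix, eigenvalues -1 and -2

    mu = -max(np.linalg.eigvals(A).real)  # decay rate mu of (38)
    print("exp. stable:", mu > 0)         # Theorem (33)

    # spot-check the bound (40): ||exp(At)|| <= m exp[-(mu-eps) t] on a grid
    eps = 0.1
    ts = np.linspace(0.0, 20.0, 201)
    m = max(np.linalg.norm(expm(A * t), 2) * np.exp((mu - eps) * t) for t in ts)
    print("mu =", mu, ", admissible m(eps) on the grid =", m)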

43 Important remark. Beginners often suggest the following intuitively appealing but false statement: if, for each t ≥ 0, the eigenvalues of A(t) have negative real parts, then the zero solution of ẋ = A(t)x is exp. stable. To show that this statement is false, use Exercise (2.1.45) item (7). For 1 < a < 2, the eigenvalues of A(t) are independent of t and Re λᵢ = −(2−a)/2 < 0; but for x₀ = e₁, ‖Φ(t,0)x₀‖₂ = exp[(a−1)t], i.e. it diverges exponentially! On the other hand, it can be shown that if A(·) is bounded on ℝ₊ and, for all t ≥ 0, the eigenvalues of A(t) are in the left half plane Re λ ≤ ε < 0, where ε is fixed, then the zero solution of ẋ = A(t)x is exp. stable provided that the rate of change of A(t) is sufficiently small, [Bro.1,p.206].
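
A minimal numerical sketch of this phenomenon (assuming SciPy). The matrix below is a commonly cited example with the properties quoted above, used here as a stand-in for the matrix of Exercise (2.1.45); that choice is an assumption of the sketch:

    import numpy as np
    from scipy.integrate import solve_ivp

    a = 1.5   # 1 < a < 2

    def A(t):
        # assumed stand-in matrix: frozen-time eigenvalues have real part
        # (a-2)/2 < 0 for every t, yet ||Phi(t,0)e1||_2 = exp[(a-1)t] grows
        c, s = np.cos(t), np.sin(t)
        return np.array([[-1.0 + a * c * c,  1.0 - a * s * c],
                         [-1.0 - a * s * c, -1.0 + a * s * s]])

    sol = solve_ivp(lambda t, x: A(t) @ x, (0.0, 10.0),
                    np.array([1.0, 0.0]), rtol=1e-9, atol=1e-12)
    print(np.linalg.norm(sol.y[:, -1]))   # grows like exp(0.5 * 10) ~ 148
    print(np.exp((a - 1.0) * sol.t[-1]))  # predicted exponential growth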

Lyapunov equation. We study here the equation

46  A*P + PA = −Q

where A ∈ ℂⁿˣⁿ and Q ∈ ℂⁿˣⁿ s.t. Q = Q* > 0 are given†, and a unique solution P = P* > 0 is to be found. Equation (46) is called the Lyapunov equation. Its solvability relates directly to the exp. stability of ẋ = Ax.

47 Lemma. Assume that, given Q = Q* > 0, equation (46) has a unique solution P = P* > 0; then the zero solution of ẋ = Ax is exp. stable, i.e. σ(A) ⊂ ℂ₋°.

Proof. Consider the quadratic form

48  v(x) : ℂⁿ → ℝ₊ : x ↦ v(x) = x*Px.

Since P = P* > 0, there exist constants p_u ≥ p_l > 0 s.t.

49  p_l ‖x‖² ≤ v(x) ≤ p_u ‖x‖²  ∀ x ∈ ℂⁿ.

Indeed, P is Hermitian positive definite and p_l, p_u can be chosen to be its least and largest eigenvalues, resp. Taking the derivative w.r.t. t of v(·) along any trajectory of ẋ = Ax, we have, by (48) and (46),

  v̇(x) = x*[A*P + PA]x = −x*Qx.

Since Q = Q* > 0, there exists a constant γ > 0 s.t.

  γ‖x‖² ≤ x*Qx  ∀ x ∈ ℂⁿ.

Hence along any trajectory, ∀ x₀ ∈ ℂⁿ with x₀ ≠ θ, ∀ t₀ ∈ ℝ₊,

  v̇/v ≤ −γ/p_u =: −2α < 0  for all t ≥ t₀.

Hence by integration we get v(x(t)) ≤ v(x₀) exp[−2α(t−t₀)], ∀ t ≥ t₀. Therefore, by (49), with m² := p_u/p_l,

  ‖x(t)‖² ≤ m² exp[−2α(t−t₀)] ‖x₀‖²  ∀ t ≥ t₀.

So, with x(t) = Φ(t,t₀)x₀, using induced matrix norms,

  ∀ t₀ ∈ ℝ₊  ‖Φ(t,t₀)‖ ≤ m exp[−α(t−t₀)]  ∀ t ≥ t₀

(where the constants m and α are independent of t₀). ∎

† Q = Q* > 0 means that Q is Hermitian positive definite.



51 Exercise. Give an alternative proof of Lemma (47). (Hint: Let λᵢ, eᵢ be any eigenvalue-eigenvector pair of A; multiply (46) on the left by eᵢ* and on the right by eᵢ; use Aeᵢ = λᵢeᵢ, ....)

Consider now the converse of Lemma (47). As a first step consider the following.

52 Exercise. With A ∈ ℂⁿˣⁿ, let σ(A) ⊂ ℂ₋°. Consider the map

53  F : ℂⁿˣⁿ → ℂⁿˣⁿ : X ↦ A*X + XA

(see Theorem (4.8.1)). Show that
a) F is bijective,
b) if A*X + XA is Hermitian, then so is X.
[Hint: note that F injective ⇒ F surjective.]

54 Lemma. Consider the Lyapunov equation (46) and let σ(A) ⊂ ℂ₋°. Then ∀ Q = Q* > 0 equation (46) has a unique solution P = P* > 0 given by

55  P = ∫₀^∞ e^{A*t} Q e^{At} dt.

Proof. Let Q = Q* > 0 and consider the linear matrix d.e.

  Ẋ = A*X + XA + Q,  X(0) = 0,  t ≥ 0.

1) By (2.1.86), the solution is given by

  X(t) = ∫₀^t e^{A*τ} Q e^{Aτ} dτ.

Now, since A is exp. stable, it follows that the integral converges, as t → ∞, to the limit

  P := X(∞) = ∫₀^∞ e^{A*τ} Q e^{Aτ} dτ,

which solves (46), since Ẋ(t) converges necessarily to zero.
2) The solution defined by (55) is unique since, by Exercise (52), the operator F, given by (53), is injective.
3) Obviously P = P*; moreover P > 0: indeed, with R any nonsingular matrix s.t. Q = R*R, if x is s.t. x*Px = 0, then ∫₀^∞ ‖R e^{At}x‖² dt = 0, i.e. R e^{At}x ≡ θ, whence x = θ because R e^{At} is nonsingular ∀ t ∈ ℝ, for any A ∈ ℂⁿˣⁿ and any nonsingular R. ∎

From Lemmas (47) and (54) we have now our main result.

56 Theorem. Consider the Lyapunov equation (46). Then the following statements are equivalent:
a) ∀ Q = Q* > 0 the Lyapunov equation (46) has a unique solution P = P* > 0 given by (55);
b) σ(A) ⊂ ℂ₋°;
c) the zero solution of ẋ = Ax is exp. stable.

57 Exercise. With σ(A) ⊂ ℂ₋° and P given by (55), show that

  A*P + PA = ∫₀^∞ (d/dt)[e^{A*t} Q e^{At}] dt = −Q.
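
In practice (55) is rarely evaluated by quadrature: standard libraries solve (46) directly. A minimal sketch (assuming SciPy; scipy.linalg.solve_continuous_lyapunov solves aX + Xa* = q, so the arguments are arranged below to match the convention of (46)):

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])   # exp. stable: eigenvalues -1 and -2
    Q = np.eye(2)                  # Q = Q* > 0

    # A*P + PA = -Q  becomes  aX + Xa* = q  with a = A*, q = -Q:
    P = solve_continuous_lyapunov(A.conj().T, -Q)

    print(np.allclose(A.conj().T @ P + P @ A, -Q))  # residual check of (46)
    print(np.all(np.linalg.eigvalsh(P) > 0))        # P = P* > 0, per Theorem (56)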

Periodically varying d.e.'s. We study the exponential stability of the d.e.

60  ẋ = A(t)x,  t ∈ ℝ₊,

where the matrix function A(·) ∈ PC[ℝ₊,ℝⁿˣⁿ] is T-periodic, i.e.

61  A(t+T) = A(t)  ∀ t ∈ ℝ₊.

In Theorem (2.2.71) we established that, a) with

62  B := (1/T) log Φ(T,0) ∈ ℂⁿˣⁿ and P(t) := Φ(t,0) exp[−Bt] for t ∈ ℝ₊,

63  Φ(t,t₀) = P(t) exp[B(t−t₀)] P(t₀)⁻¹,

where P(t) ∈ ℂⁿˣⁿ is nonsingular for all t ∈ ℝ₊ and t ↦ P(t) is T-periodic, and b) under the coordinate change x(t) = P(t)ξ(t), (60) becomes a time-invariant d.e.

64  ξ̇ = Bξ

(where B is the constant matrix (62)). Our objective is to show that the zero solution of (60) is exp. stable iff the zero solution of (64) is exp. stable. Note that, by Theorem (33), the latter holds iff σ(B) ⊂ ℂ₋°.

65 Exercise. Consider equation (62) and let D(0;1) denote the open unit disc, i.e. D(0;1) := {λ ∈ ℂ : |λ| < 1}. Show that

  σ(B) ⊂ ℂ₋°

if and only if

  σ[Φ(T,0)] ⊂ D(0;1).

[Hints: by (62), exp(BT) = Φ(T,0); apply Theorem (4.7.1) with f(λ) = exp(λT).]


We obtain now our main result.

66 Theorem [Exp. stability]. Consider the T-periodically varying d.e. (60). Then the zero solution of ẋ = A(t)x is exp. stable if and only if

67  σ[Φ(T,0)] ⊂ D(0;1).

68 Short proof (Exercise). Consider equation (63). Since P(t) is nonsingular for all t ∈ ℝ₊ and P(·) is T-periodic and continuous, P(·) as well as P(·)⁻¹ are bounded on ℝ₊; equivalently, there exist positive constants M and N such that, for all t ∈ ℝ₊, ‖P(t)‖ ≤ M and ‖P(t)⁻¹‖ ≤ N. Therefore, by (63), for all t₀ ∈ ℝ₊, for all t ≥ t₀,

  (MN)⁻¹ ‖exp[B(t−t₀)]‖ ≤ ‖Φ(t,t₀)‖ ≤ (MN) ‖exp[B(t−t₀)]‖.

Hence, by (18), the zero solution of ẋ = A(t)x is exp. stable if and only if the zero solution of ξ̇ = Bξ is exp. stable. The latter condition is equivalent to condition (67) by Exercise (65). ∎

69 Comments. α) Clearly the periodically varying d.e. ẋ = A(t)x is exp. stable iff the time-invariant d.e. ξ̇ = Bξ is exp. stable.
β) Condition (67) can be tested easily: compute Φ(T,0) by integrating ẋ = A(t)x over [0,T] starting from x₀ = eᵢ (the i-th standard basis vector) for all i ∈ n; compute the eigenvalues λᵢ of Φ(T,0) and check whether |λᵢ| < 1 for all i ∈ n. A numerical sketch of this test is given below.
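
A minimal sketch of the test in Comment (69.β) (assuming SciPy; the T-periodic matrix is an arbitrary illustration, chosen lower-triangular so that the multipliers can also be verified by hand):

    import numpy as np
    from scipy.integrate import solve_ivp

    T = 2.0 * np.pi
    def A(t):
        # T-periodic, lower-triangular: the eigenvalues of Phi(T,0) are
        # exp(-2 pi) and exp(-4 pi), so the test below must report True
        return np.array([[-1.0 + 0.5 * np.cos(t), 0.0],
                         [1.0, -2.0]])

    n = 2
    Phi_T = np.zeros((n, n))
    for i in range(n):               # integrate from x0 = e_i, column by column
        sol = solve_ivp(lambda t, x: A(t) @ x, (0.0, T), np.eye(n)[:, i],
                        rtol=1e-10, atol=1e-12)
        Phi_T[:, i] = sol.y[:, -1]

    lam = np.linalg.eigvals(Phi_T)   # characteristic (Floquet) multipliers
    print("exp. stable:", np.all(np.abs(lam) < 1.0))   # condition (67)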

70 Exercise. Consider the periodically varying d.e. (60). Assume that σ[Φ(T,0)] is not included in the disc D(0;1) and let e ∈ ℂⁿ be an eigenvector of Φ(T,0) corresponding to an eigenvalue λ ∈ ℂ with |λ| ≥ 1. Show that the solution x(t) = Φ(t,0)e does not tend to zero as t → ∞.
[Hint: use Φ(t+T, t₀+T) = Φ(t,t₀).]

7.2.2. Bounded Trajectories and Regulation

In this section we consider a linear system R(·) = [A(·),B(·),C(·),D(·)] and discuss conditions on A(·),B(·),C(·),D(·) such that, for every t₀ ∈ ℝ₊, for every x₀ ∈ ℝⁿ, and for every input bounded on [t₀,∞), there results a state trajectory x(·) = s(·,t₀,x₀,u) and a response y(·) = ρ(·,t₀,x₀,u) that are both bounded on [t₀,∞); moreover, under the same conditions, we guarantee that if the input u(t) tends to zero as t → ∞ then both the state and the output also tend to zero. We refer to this property as regulation. As before, our linear system R(·) is represented by

71  ẋ(t) = A(t)x(t) + B(t)u(t)

72  y(t) = C(t)x(t) + D(t)u(t)

where A(·),B(·),C(·),D(·) are piecewise continuous functions. Throughout this subsection the L∞-norm of the restriction to [t₀,∞) of a vector function f(·) : ℝ₊ → ℝⁿ will be denoted by ‖f‖_{∞,t₀}; more precisely,

73  ‖f‖_{∞,t₀} = ‖f|[t₀,∞)‖_∞ = sup_{t≥t₀} ‖f(t)‖

where any vector norm may be used for ‖f(t)‖. Of course, ‖f‖_{∞,0} = ‖f‖_∞, i.e. the usual L∞-norm. For matrix functions M(·) : ℝ₊ → ℝⁿˣᵐ we denote by ‖M‖_∞ the L∞-norm, i.e.

74  ‖M‖_∞ = sup_{t≥0} ‖M(t)‖,

where the matrix norm ‖M(t)‖ is the one induced by the chosen vector norm.

75 Theorem [Bounded trajectories and regulation]. Consider a system representation R(·) = [A(·),B(·),C(·),D(·)] such that
a) ẋ = A(t)x is exp. stable with positive constants m and α as in (18),
b) the matrix functions B(·),C(·),D(·) are bounded on ℝ₊, i.e. ‖B‖_∞, ‖C‖_∞, ‖D‖_∞ are finite constants.
U.th.c.,
1) for every (x₀,t₀) ∈ ℝⁿ × ℝ₊, for every bounded input u|[t₀,∞), the state t ↦ x(t) = s(t,t₀,x₀,u) and the output t ↦ y(t) = ρ(t,t₀,x₀,u) are bounded on t ≥ t₀; more precisely, with the constants given above,

76  ‖x‖_{∞,t₀} ≤ m‖x₀‖ + [(m/α)‖B‖_∞] ‖u‖_{∞,t₀}

77  ‖y‖_{∞,t₀} ≤ [m‖C‖_∞] ‖x₀‖ + [(m/α)‖C‖_∞‖B‖_∞ + ‖D‖_∞] ‖u‖_{∞,t₀};

2) under the same conditions as in 1), if in addition u(t) → θ as t → ∞, then x(t) → θ and y(t) → θ as t → ∞.
78 Comments. α) The constants multiplying ‖x₀‖ and ‖u‖_{∞,t₀} on the RHS of (76)-(77) are independent of t₀ and x₀.
β) Note that in (76) and (77) the norms of x(·), y(·) and u(·) are taken over [t₀,∞) and not over [0,∞), (see (73)). For matrix functions see (74).

79 Proof of Theorem (75). 1) is straightforward and left as an exercise.
[Hint: ∫_{t₀}^t exp[−α(t−τ)]dτ ≤ ∫₀^∞ exp[−ατ]dτ = α⁻¹.]
2) By (72),

  ‖y(t)‖ ≤ ‖C‖_∞‖x(t)‖ + ‖D‖_∞‖u(t)‖  ∀ t ≥ t₀.

Hence, if both u(t) and x(t) tend to zero as t → ∞, then so does y(t). Hence we are reduced to showing that

  x(t) = Φ(t,t₀)x₀ + ∫_{t₀}^t Φ(t,τ)B(τ)u(τ)dτ =: x₁(t) + x₂(t)

tends to zero as t → ∞. This is immediate for x₁(t), since ẋ = A(t)x is exp. stable. So we are left to prove that lim_{t→∞} x₂(t) = θ. By taking vector norms and using (18),

  0 ≤ ‖x₂(t)‖ ≤ m‖B‖_∞ · ∫_{t₀}^t exp[−α(t−τ)] ‖u(τ)‖ dτ

and we are done if the integral on the RHS converges to zero as t → ∞. For this purpose, first set u(t) = θ for t < t₀ and then observe that

  ∫_{t₀}^t exp[−α(t−τ)] ‖u(τ)‖ dτ = ∫₀^∞ exp[−ατ] ‖u(t−τ)‖ dτ

[in the last expression ‖u(t−τ)‖ = 0 for τ > t−t₀]. Take now any sequence (t_k)₁^∞ ⊂ [t₀,∞) s.t. t_k → ∞ as k → ∞, and set, ∀ k, f_k(τ) := exp[−ατ] ‖u(t_k−τ)‖ for τ ∈ ℝ₊. Thus we are done if we prove that

  lim_{k→∞} ∫₀^∞ f_k(τ)dτ = 0.

Now this follows from Lebesgue's dominated convergence theorem, [Rud.1,p.27], which allows us to permute the operations of limit taking and integration, [note that since u(t) tends to zero as t → ∞, ∀ τ ∈ ℝ₊, lim_{k→∞} f_k(τ) = 0]. Indeed the conditions of Lebesgue's theorem are satisfied: with g(τ) := exp[−ατ] ‖u‖_{∞,t₀} we have i) for all k, |f_k(τ)| ≤ g(τ) for τ ∈ ℝ₊, and ii) ∫₀^∞ |g(τ)|dτ = α⁻¹ ‖u‖_{∞,t₀} < ∞, i.e. each function of the sequence (f_k(·))₁^∞ is dominated by the fixed absolutely integrable function g(·). ∎

The exercises below add an important robustness perspective to the statement of Theorem (75).

83 Exercise [Robustness]. Consider the time-invariant system representation R = [A,B,C,D], thus w.l.g. t₀ = 0. Let σ(A) ⊂ ℂ₋°. Show that for any sufficiently small perturbation in A, and for any perturbation in B,C,D, the conclusions of Theorem (75) remain true. Hence "A is exp. stable" is a robust condition for bounded trajectories and regulation.

84 Exercise [Bounded-input bounded-state stability; time-invariant case]. Consider a time-invariant system representation R = [A,B,C,D], thus w.l.g. t₀ = 0. Let N₋ and N₀ denote the algebraic eigenspaces, (4.3.3), of A ∈ ℂⁿˣⁿ corresponding, resp., to its eigenvalues with negative and zero real parts.
Show that, with t₀ = 0, for all x₀ ∈ ℂⁿ, for all u(·) bounded on ℝ₊, the state trajectory is bounded on ℝ₊, if and only if
a) σ(A) ⊂ ℂ₋ = {λ ∈ ℂ : Re λ ≤ 0} and all jω-axis eigenvalues are simple roots of the minimal polynomial of A;
b) in the decomposition ℂⁿ = N₋ ⊕ N₀ the representation of B is [B₋; 0], i.e. R(B) ∩ N₀ = {θ}, (or equivalently, the undamped modes must not be coupled to the input).
[Hint: use (4.6.36) and require R = [A,B,I,0] to be I/O stable by Corollary (43).]

86 Exercise. Consider Exercise (84). Show that if A has some jω-axis eigenvalues, then the bounded-input bounded-state stability of R is not robust w.r.t. small perturbations in B.

87 Important comment. From Exercises (83), (84) and (86) we see that (in the time-invariant case), for reasons of robustness, it is desirable that A be exp. stable (i.e. σ(A) ⊂ ℂ₋°) for obtaining bounded state and output trajectories and regulation. Now in many cases it will be known that R is I/O stable. So an important question is: can we decide that A is exp. stable from the I/O stability of R, (i.e. P[Ĥ(s)] ⊂ ℂ₋°, (43))? The answer is affirmative iff A has no unstable hidden modes: see Corollary (9.1.80) below. Similar facts can be developed for the time-varying case, but they are quite technical, e.g. [Eng.1]. ∎

88 Exercise. Consider a time-invariant system representation R = [A,B,C,D] with σ(A) ⊂ ℂ₋°. a) Show that, ∀ (x₀,t₀) ∈ ℂⁿ × ℝ₊, for all bounded inputs u|[t₀,∞) that tend to a constant u_∞ as t → ∞, x(t) and y(t) tend to constant vectors x_∞ and y_∞ as t → ∞, where y_∞ = G(0)u_∞.
b) Show by example that the conclusion above does not hold in general for time-varying systems. (Consider A, B, D constant and C(·) time-varying.)

7.2.3. Response to T-Periodic Inputs

In this section we study the response of a time-invariant system representation R = [A,B,C,D], where ẋ = Ax is exp. stable, to a T-periodic input u_p(·) ∈ PC[ℝ₊, ℂ^{n_i}], whence

90  u_p(t+T) = u_p(t)  ∀ t ∈ ℝ₊.

91 We shall denote by t ↦ v_p(t) the T-periodic extension of t ↦ u_p(t) to all of ℝ, i.e. v_p(·) ∈ PC[ℝ, ℂ^{n_i}] is the unique T-periodic function on ℝ s.t. v_p(t) = u_p(t) for all t ∈ ℝ₊.

92 Analysis. a) Consider t ↦ H(t), the impulse response (3.2.58) of R = [A,B,C,D], and define y_p(·) : ℝ₊ → ℂ^{n_o} by

93  y_p(t) := ∫_{−∞}^t H(t−τ) v_p(τ) dτ  for all t ∈ ℝ₊.

94 Claim. y_p(·) ∈ PC[ℝ, ℂ^{n_o}] is well defined and T-periodic.

Indeed, 1) the first assertion follows because ẋ = Ax is exp. stable and v_p is bounded on ℝ, (using (18) and norms we get: ‖y_p(t)‖ ≤ [(m/α)‖C‖‖B‖ + ‖D‖] ‖v_p‖_∞, where ‖v_p‖_∞ is the L∞-norm of v_p(·) on ℝ), and 2) T-periodicity follows by checking that y_p(t+T) = y_p(t) for t ∈ ℝ₊, (using (93) and the T-periodicity of v_p(·)).
b) Consider now the response of R = [A,B,C,D] due to any state x₀ ∈ ℂⁿ and the T-periodic input u_p(·), i.e. by (3.2.61),

  y(t) = ρ(t,0,x₀,u_p)

95   = C exp[At]x₀ + ∫₀^t H(t−τ) u_p(τ) dτ

   = C exp[At]x₀ + ∫₀^t H(t−τ) v_p(τ) dτ

(since the extension v_p(·) agrees with u_p(·) on ℝ₊). Hence, by (93), (95) and (3.2.58),

96  ∀ t ∈ ℝ₊  y(t) − y_p(t) = C e^{At} [x₀ − ∫_{−∞}^0 exp[−Aτ] B v_p(τ) dτ],

where the last integral converges because A is exp. stable, (indeed, using norms, (18) and the notation I for the integral, ‖I‖ ≤ (m/α)‖B‖‖v_p‖_∞ < ∞). Therefore in the RHS of (96) the expression in the brackets is a constant vector. Therefore, since e^{At} → 0 as t → ∞, ∀ x₀ ∈ ℂⁿ, lim_{t→∞} [y−y_p](t) = 0. From this and Claim (94) we have the following theorem.

97 Theorem [Response to T-periodic inputs]. Consider a time-invariant system representation R = [A,B,C,D], where t₀ = 0 and ẋ = Ax is exp. stable, and which is driven by a T-periodic input u_p(·) ∈ PC[ℝ₊, ℂ^{n_i}]. Consider the T-periodic output y_p(·) ∈ PC[ℝ₊, ℂ^{n_o}] described by (91) and (93); then, as t → ∞, the output y(t) of the representation R tends to y_p(t); more precisely,

98  lim_{t→∞} [y−y_p](t) = 0.


99 Comments. α) For t₀ ≠ 0 the same result is obtained by straightforward modifications (exercise).
β) The T-periodic output y_p(·) is often called the steady-state response to the T-periodic input u_p(·).

102 Exercise. Using (18), show that, by (96), for all t ∈ ℝ₊,

  ‖y(t) − y_p(t)‖ ≤ m‖C‖ exp[−αt] ‖x₀ − ∫_{−∞}^0 exp[−Aτ] B v_p(τ) dτ‖,

i.e. y(·) is exponentially attracted to y_p(·).

103 Exercise. Consider Theorem (97). Let the T-periodic input be of the form

104  u_p(t) := Σ_{k=−m}^{m} u_k exp[j(kω₀t)],

where ω₀ := 2π/T and, for all |k| ∈ m, u_k ∈ ℂ^{n_i}. Show that

105  y_p(t) = Σ_{k=−m}^{m} Ĥ(jkω₀) u_k exp[j(kω₀t)]  ∀ t ∈ ℝ₊.

Note. If in (104) m = ∞, with Σ_{k=−∞}^{∞} ‖u_k‖ < ∞, then (105) holds with m = ∞, provided

  Σ_{k=−∞}^{∞} ‖Ĥ(jkω₀)u_k‖ < ∞.
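
A minimal sketch of (105) (assuming NumPy; Ĥ(s) = C(sI−A)⁻¹B + D, and the realization, period and Fourier coefficients u_k below are arbitrary illustrations):

    import numpy as np

    A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # exp. stable
    B = np.eye(2); C = np.eye(2); D = np.zeros((2, 2))
    T = 2.0 * np.pi
    w0 = 2.0 * np.pi / T
    u = {-1: np.array([0.5, 0.0]),             # Fourier coefficients u_k
          0: np.array([1.0, 1.0]),
          1: np.array([0.5, 0.0])}

    def H(s):
        # transfer function H(s) = C (sI - A)^{-1} B + D
        n = A.shape[0]
        return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

    def yp(t):
        # steady-state output (105): sum of H(jk w0) u_k exp(jk w0 t)
        return sum(H(1j * k * w0) @ uk * np.exp(1j * k * w0 * t)
                   for k, uk in u.items())

    print(yp(0.3))   # complex in general; real-valued iff u_{-k} = conj(u_k)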

106 Exercise. Show that, under the conditions of Theorem (97), as t → ∞, the state x(t) tends to a T-periodic steady-state trajectory x_p(t) given by

  x_p(t) = ∫_{−∞}^t K(t−τ) v_p(τ) dτ  for t ∈ ℝ₊,

where 1) v_p(·) is the T-periodic extension to ℝ of u_p(·) given by (91), and 2) K(t) is the state impulse response (3.1.59). ∎

7.2.4. Periodically Varying System with Periodic Input


We study in this subsection a driven linear d.e.

110  ẋ(t) = A(t)x(t) + u(t)  for t ∈ ℝ₊,

where a) the state x(t) ∈ ℝⁿ, the input u(t) ∈ ℝⁿ and A(t) ∈ ℝⁿˣⁿ; b) A(·) and u(·) are given piecewise continuous T-periodic functions on ℝ₊, whence A(t+T) = A(t) and u(t+T) = u(t) for all t ∈ ℝ₊; and c) the d.e. ẋ = A(t)x is exp. stable, or equivalently, condition (18) holds for fixed positive constants m and α, where α is the exponential decay rate.
Intuitively, based on our experience with the time-invariant case, we expect that for all x₀ ∈ ℝⁿ, for all t₀ ≥ 0, the solution of (110) tends, as t → ∞, to a unique T-periodic solution x_p(·).
To obtain this periodic solution x_p(·), the first idea is to, say, start from the zero state at time 0 and integrate the d.e. (110) until a periodic solution is reached. Unfortunately this may be very expensive due to a slow transition to the steady state (... the exponent α in (18) may be very small ...).
A better idea is as follows: if we knew the vector x_p(0), then the T-periodic solution x_p(·) would be defined for all t ∈ ℝ₊ by integrating the d.e. (110) over [0,T]. Now by T-periodicity

111  x_p(0) = x_p(T) = Φ(T,0)x_p(0) + ∫₀^T Φ(T,τ)u(τ)dτ,

or equivalently,

112  [I − Φ(T,0)] x_p(0) = ∫₀^T Φ(T,τ)u(τ)dτ.

Thus x_p(0) is on a T-periodic solution if and only if (112) holds. Furthermore, (110) will have a unique T-periodic solution for any T-periodic input u(·) if and only if the linear equation (112) has a unique solution.
Now the assumption that ẋ(t) = A(t)x(t) is exp. stable is (by Theorem (2.2.71); see also (60)-(61) above) equivalent to the condition that all eigenvalues of Φ(T,0) satisfy |λᵢ| < 1. Hence all eigenvalues of I − Φ(T,0) are different from zero: hence I − Φ(T,0) is nonsingular. Consequently equation (112) has a unique solution for any T-periodic input u(·).
For computation we proceed as follows (a numerical sketch is given after step 4):
1) The numerical matrix Φ(T,0) is obtained by integrating ẋ(t) = A(t)x(t) as in Comment (69.β).
2) The RHS of (112) is the z-s response of (110) at time T: so it is obtained by integrating the differential equation (110) over [0,T] starting from (θ,0).
3) x_p(0) is the unique solution of the system of linear algebraic equations (112).
4) Integrating (110) over [0,T] starting from (x_p(0),0) gives x_p(t) for all t ∈ [0,T] and, by periodicity, for all t ≥ 0.
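
A minimal sketch of steps 1)-4) (assuming SciPy; the same T-periodic A(t) as in the sketch after Comment (69) is reused, and the forcing u(t) is an arbitrary illustration):

    import numpy as np
    from scipy.integrate import solve_ivp

    T = 2.0 * np.pi
    def A(t):
        return np.array([[-1.0 + 0.5 * np.cos(t), 0.0],
                         [1.0, -2.0]])        # T-periodic, exp. stable
    def u(t):
        return np.array([np.sin(t), 1.0])     # T-periodic input

    n = 2
    # step 1: Phi(T,0), column by column, as in Comment (69.beta)
    Phi_T = np.column_stack([
        solve_ivp(lambda t, x: A(t) @ x, (0.0, T), np.eye(n)[:, i],
                  rtol=1e-10, atol=1e-12).y[:, -1] for i in range(n)])

    # step 2: zero-state response at time T = RHS of (112)
    zs = solve_ivp(lambda t, x: A(t) @ x + u(t), (0.0, T), np.zeros(n),
                   rtol=1e-10, atol=1e-12).y[:, -1]

    # step 3: solve [I - Phi(T,0)] xp0 = zs; I - Phi(T,0) is nonsingular here
    xp0 = np.linalg.solve(np.eye(n) - Phi_T, zs)

    # step 4: integrate once more from (xp0, 0); periodicity check xp(T) = xp(0)
    xpT = solve_ivp(lambda t, x: A(t) @ x + u(t), (0.0, T), xp0,
                    rtol=1e-10, atol=1e-12).y[:, -1]
    print(np.linalg.norm(xpT - xp0))          # ~ 0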

113 Exercise. Show that the solution of (110) starting at x(t₀) = x₀ is given by

  x(t) = x_p(t) + Φ(t,t₀)[x₀ − x_p(t₀)]  ∀ t ≥ t₀.

We summarize this analysis by stating the following theorem.


115 Theorem [Steady-state periodic solution]. Consider the system described by (110), where u : ℝ₊ → ℝⁿ and A : ℝ₊ → ℝⁿˣⁿ are T-periodic, and ẋ(t) = A(t)x(t) is exp. stable. Then, for all x₀ ∈ ℝⁿ, for all t₀ ≥ 0, the solution of (110) tends, as t → ∞, to a unique T-periodic solution x_p(·), with x_p(0) defined by (112).

116 Comment. In the stability literature, the T-periodic solution x_p(·) is said to be "globally exponentially stable in the large."

7.2.5. Slightly Nonlinear Systems


In this subsection we consider a nonlinear zero-input state differential equation ẋ = f(x,t) (on ℝ₊), which arises as a perturbation of the linear state differential equation ẋ = A(t)x. We develop below conditions under which the exponential stability of ẋ = A(t)x remains true for ẋ = f(x,t). This does not always happen. Indeed, consider the following.

119 Exercise. Consider the scalar d.e. ẋ = −x + εx² for ε > 0. Show that,
a) for all x₀ > ε⁻¹ and all t₀, |x(t)| → ∞ as (t−t₀) → ln(εx₀/(εx₀−1)),
b) for all x₀ < ε⁻¹ and all t₀, x(t) → 0 as t → ∞,
c) Sketch the trajectories in the t-x plane. ∎

Clearly, in the exercise above, if we neglect the nonlinear term (i.e. εx²), the zero solution of ẋ = −x is exp. stable; this is no longer true for the nonlinear equation if |x₀| is large.

Small nonlinearities, robustness. Consider the d.e.

120  ẋ = f(x,t) = A(t)x + h(x,t)  for t ∈ ℝ₊,

where A(·) ∈ PC[ℝ₊,ℝⁿˣⁿ] and h(·,·) : ℝⁿ × ℝ₊ → ℝⁿ are continuous and h is Lipschitzian in x, (see (B.1.4)). Hence, by the fundamental theorem of d.e.'s (B.1.6), equation (120) has, for all (x₀,t₀) ∈ ℝⁿ × ℝ₊, a unique continuous solution on [t₀,∞).

121 Theorem [Small nonlinearities]. Consider the d.e. (120) and assume:
a) the zero solution of the linear equation ẋ = A(t)x is exp. stable, or equivalently (18) holds with fixed positive constants m and α;
b) the nonlinear term h(x,t) is s.t.
  ∃ β > 0 for which

122  ‖h(x,t)‖ ≤ β‖x‖  ∀ (x,t) ∈ ℝⁿ × ℝ₊

and

123  β < (α/m).

U.th.c.,
for all (x₀,t₀) ∈ ℝⁿ × ℝ₊, every solution x(·) of (120) satisfies

124  ‖x(t)‖ ≤ m‖x₀‖ exp[−γ(t−t₀)]  ∀ t ≥ t₀,

where γ := α − βm > 0 and m > 0 are fixed constants independent of x₀ and t₀, i.e. the zero solution of (120) is exp. stable.

125 Comments. α) The constant β in (122) must be independent of x and t; (123) requires this constant to be sufficiently small.
β) If the function h(·,·) in (120) is of the form

  h(x,t) = δA(t)·x,

where δA(·) ∈ PC[ℝ₊,ℝⁿˣⁿ] is a perturbation of A(·), then (122)-(123) is equivalent to

  ‖δA‖_∞ := sup_{t≥0} ‖δA(t)‖ < α/m.

Thus a sufficiently small L∞-perturbation of A(·) will not upset its exp. stability. ∎

127 Exercise. Let A ∈ ℝⁿˣⁿ s.t. σ[A] ⊂ ℂ₋°. Consider the time-varying d.e. ẋ = A(t)x (on ℝ₊), where A(·) ∈ PC[ℝ₊,ℝⁿˣⁿ] has the property that A(t) → A as t → ∞. Show that the zero solution of ẋ = A(t)x is exp. stable on [T,∞) for some sufficiently large T > 0.

128 Exercise [Robustness]. Consider the statement of Theorem (75) on bounded trajectories and regulation. Show that its conclusions remain valid for sufficiently small L∞-perturbations of A(·), and for arbitrary L∞-perturbations of B(·),C(·),D(·); (of course the constants on the RHS of (76)-(77) must be modified appropriately). Thus the conditions of Theorem (75) are robust conditions for bounded trajectories and regulation.

129 Remark. Exercise (128) generalizes Exercise (83) to the time-varying case.

130 Exercise. Consider the scalar d.e. ẋ = −x + εx², where ε > 0. Show that condition (122) is not satisfied.

131 Proof of Theorem (121). Conclusion (124) follows at once from the Bellman-Gronwall inequality (B.1.15). From (120) we have, ∀ (x₀,t₀) ∈ ℝⁿ × ℝ₊, ∀ t ≥ t₀,

  ‖x(t)‖ = ‖Φ(t,t₀)x₀ + ∫_{t₀}^t Φ(t,τ) h(x(τ),τ) dτ‖

   ≤ m‖x₀‖ exp[−α(t−t₀)] + ∫_{t₀}^t mβ exp[−α(t−τ)] ‖x(τ)‖ dτ,

where we used the triangle inequality, (18) and (122). Let

132  w(t) := ‖x(t)‖ exp[αt].

Then, multiplying the inequality above by exp[αt], we obtain

  w(t) ≤ m w(t₀) + ∫_{t₀}^t mβ w(τ) dτ.

Hence by the Bellman-Gronwall inequality

  w(t) ≤ m w(t₀) exp[mβ(t−t₀)].

Hence, with γ := α − βm > 0 (by (123)), by (132) we obtain

  ‖x(t)‖ ≤ m‖x₀‖ exp[−γ(t−t₀)],

where the constants m and γ are independent of x₀ and t₀. ∎



133 Exercise. Let the zero solution of ẋ = A(t)x be exp. stable, i.e. (18) holds with positive constants m and α. Let δA(·) ∈ PC[ℝ₊,ℝⁿˣⁿ] s.t. ∫₀^∞ ‖δA(t)‖dt < ∞. Show that the zero solution of the perturbed d.e. ẋ = [A(t) + δA(t)]x is exp. stable, (more precisely, (18) holds with positive constants m·exp[m∫₀^∞‖δA(t)‖dt] and α).
[Hint: use the same technique as in the proof of Theorem (121).]

Linearization. In (120) we considered a perturbation of the d.e. ẋ = A(t)x: from the global assumption on the perturbation we showed that the zero solution remained exponentially stable. We consider now a local result. We start from the nonlinear differential equation

135  ẋ = f(x,t)  with f(θ,t) = θ,

where f is smooth enough so that, ∀ (x₀,t₀), equation (135) has a unique solution on ℝ₊. The idea is to expand, for each fixed t, the function x ↦ f(x,t) in a Taylor series about θ:

136  ẋ(t) = A(t)x(t) + h(x,t),

where A(t) is D₁f evaluated at (θ,t), i.e. the Jacobian matrix of f evaluated at (θ,t), and h(x,t) represents the remainder: intuitively, as ‖x‖ → 0, ‖h(x,t)‖ → 0 faster than ‖x‖. The linear differential equation

137  ẋ(t) = A(t)x(t)

is called the linearized equation: more precisely, (137) is the linearization of (135) about its equilibrium point θ. Intuitively, if (137) is exponentially stable, we expect that, for initial states x₀ small enough, any solution of (135) starting from (x₀,t₀), with t₀ arbitrary, would decay to θ exponentially. The fact that this expectation is true is very important: it allows us to predict the local exponential stability of the zero solution of the nonlinear d.e. (135) once we know that its linearized equation ẋ(t) = A(t)x(t) has an exponentially stable zero solution. For brevity's sake, when that is the case, we say that "equation (137) is exp. stable" or "the function t ↦ A(t) is exponentially stable." We now state the result precisely.

140 Theorem [Linearization and exp. stability]. (We use l₂ vector norms throughout.)
Given the nonlinear differential equation (135), or equivalently (136), let:

141 a) t ↦ A(t) be piecewise continuous and bounded on ℝ₊; thus, for some a > 0, ∀ t ≥ 0,

142  ‖A(t)‖ ≤ a,

143 b)  lim_{‖x‖→0} sup_t ‖h(x,t)‖/‖x‖ = 0.

Under these conditions, if the linearized equation ẋ(t) = A(t)x(t) is exp. stable, then there is an ε > 0 s.t. every solution of the nonlinear equation starting from any (x₀,t₀) ∈ B(θ;ε) × ℝ₊ goes to θ exponentially. ∎

145 Comments. α) Assumption (143) is the technical way of expressing the idea that h(x,t) → θ faster than x → θ, uniformly in t.
β) In the proof of the theorem we use the Lyapunov function technique, which is extremely useful, especially for nonlinear problems.
We start with a lemma.

146 Lemma [Properties of P(t)]. Let A(·) satisfy (141), let ẋ = A(t)x be exp. stable, and define

147  P(t) := ∫_t^∞ Φ(t′,t)* Φ(t′,t) dt′  ∀ t ≥ 0;

then P(t) is well defined on ℝ₊ and ∃ p_u ≥ p_l > 0 s.t. ∀ z ∈ ℂⁿ, ∀ t ≥ 0,

148  p_l z*z ≤ v(z,t) := z*P(t)z ≤ p_u z*z.

Proof. By exp. stability, ∃ m, α > 0 s.t.

149  ‖Φ(t,t₀)‖ ≤ m exp[−α(t−t₀)].

Using this inequality in (147) yields p_u ≤ m²/2a... more precisely m²/(2α). Now, ∀ (z,t) ∈ ℂⁿ × ℝ₊,

  v(z,t) := z*P(t)z = ∫_t^∞ z* Φ(t′,t)* Φ(t′,t) z dt′

   = ∫_t^∞ ‖s(t′,t,z,0)‖² dt′

   ≥ (1/a) ∫_t^∞ ‖A(t′)‖ · ‖s(t′,t,z,0)‖² dt′    (by (142))

   ≥ (1/a) ∫_t^∞ | s(t′,t,z,0)* A(t′) s(t′,t,z,0) | dt′

   ≥ (1/2a) ∫_t^∞ | (d/dt′)[ s(t′,t,z,0)* s(t′,t,z,0) ] | dt′

and, replacing in the RHS the integral of the absolute value by the absolute value of the integral, we finally get

  v(z,t) ≥ (1/2a) z*z = ‖z‖²/(2a).

To obtain the last line we used the exp. stability of (137), so that s(t′,t,z,0) → θ as t′ → ∞. Thus p_l can be taken to be 1/(2a). ∎

Thus the positive definite matrix P(t) is such that λ_min(P(t)) ≥ p_l and λ_max(P(t)) ≤ p_u, ∀ t ≥ 0.

153 Exercise. Show that a) P(t), defined in (147), satisfies

154  Ṗ(t) = −A(t)*P(t) − P(t)A(t) − I

and
  P(t)* = P(t);

b) with Q(t) := P(t)⁻¹,

155  Q̇(t) = Q(t)A(t)* + A(t)Q(t) + Q(t)Q(t).

Proof of Theorem (140). Consider the solution of the nonlinear d.e. (135) starting from (x₀,t₀); for brevity call it x(t). Consider the composite function t ↦ v(x(t),t), which maps ℝ₊ → ℝ₊, where v(z,t) is given by (147)-(148). Its derivative is called the derivative of v along the solution x(·) of (135). By the chain rule we have

156  (d/dt) v(x(t),t) = D₁v(x(t),t) · f(x(t),t) + (∂/∂t) v(x(t),t),

where the first term is the product of the gradient of the scalar function v, evaluated at (x(t),t), with f(x(t),t). For brevity write x for x(t) and v̇(x,t) for the LHS of (156); use (136) and (147) to obtain from (156), dropping temporarily the dependence on x and t,

  v̇ = (Ax + h)*Px + x*Ṗx + x*P(Ax + h);

hence, by (154),

157  v̇(x,t) = −x*x + x*P(t)h(x,t) + h(x,t)*P(t)x.

View (157) as an equality between functions of x and t. By (143), given (4p_u)⁻¹, ∃ ε > 0 s.t., ∀ ‖x‖ < ε,

158  sup_t ‖h(x,t)‖/‖x‖ < (4p_u)⁻¹.

Hence, by (148), ∀ (x,t) ∈ B(θ;ε) × ℝ₊,

159  |x*P(t)h(x,t)| < x*x/4.

So by (157), (159) and (148), we see that ∀ (x,t) ∈ B(θ;ε) × ℝ₊,

160  v̇(x,t)/v(x,t) ≤ −(2p_u)⁻¹.

So, going back to the solution x(t) := s(t,t₀,x₀,θ) of (135), we have: ∀ (x₀,t₀) ∈ B(θ;ε) × ℝ₊,

  v(x(t),t) ≤ v(x₀,t₀) exp[−(t−t₀)/(2p_u)]  ∀ t ≥ t₀.

Hence, by (148),

  ‖x(t)‖ ≤ (p_u/p_l)^{1/2} ‖x₀‖ exp[−(t−t₀)/(4p_u)]  ∀ t ≥ t₀.

Thus, for any t₀ ≥ 0, any solution of (135) that starts in the ball B(θ;ε) tends to θ exponentially. ∎

We emphasize again: Theorem (140) is a local result: it is only the solutions that start sufficiently close to θ that are guaranteed to tend to θ exponentially as t → ∞.
CHAPTER 7d

STABILITY: THE DISCRETE-TIME CASE

This chapter develops concisely the most important discrete-time analogs of Chapter 7 on continuous-time stability: I/O stability, state-related stability concepts and responses to q-periodic inputs. Certain specific details, such as the partial fraction expansion (7d.1.56), are added for clarity.

7d.1. I/O Stability

Consider a linear discrete-time dynamical system having as I/O map a superposition series on ℤ, i.e.

1  y(k) = F[u](k) = Σ_{k′=−∞}^{k} H(k,k′) u(k′)  ∀ k ∈ ℤ,

where
  H(·,·) : ℤ × ℤ → ℝ^{n_o×n_i},
  u(·) : ℤ → ℝ^{n_i} s.t. the series (1) is well defined, (e.g. u(k′) = θ for k′ < 0),
  y(·) : ℤ → ℝ^{n_o}.

2 Comments. α) It is assumed that the state is zero at k₀ = −∞.
β) For a linear recursion system representation R_d(·) = [A(·),B(·),C(·),D(·)] with u(k) = θ for k < 0, the sum in the RHS of (1) starts at k′ = 0.

3 Preliminaries. α) Throughout this section vectors will be normed by their sup-norm and matrices according to the corresponding induced matrix norm, i.e. for all x ∈ ℝⁿ and for all A ∈ ℝᵐˣⁿ

4  ‖x‖ := max_{i∈n} |x_i|,

5  ‖A‖ := max_{i∈m} Σ_{j∈n} |a_ij|   ["max row-sum"].

β) Vector sequences will be normed by their ℓ∞-norm; thus, for the sequences u(·) and y(·) in (1),

6  ‖u‖_∞ := sup_{k∈ℤ} ‖u(k)‖ = sup_{k∈ℤ} {max_{j∈n_i} |u_j(k)|},

7  ‖y‖_∞ := sup_{k∈ℤ} ‖y(k)‖ = sup_{k∈ℤ} {max_{i∈n_o} |y_i(k)|}.


γ) We view the I/O map F : u ↦ y of (1) as a linear transformation from ℓ_∞^{n_i} into ℓ_∞^{n_o}, (A.6.49).

8 I/O Stability. The concept of I/O stability roughly asserts that "any bounded input u(·) produces a bounded output y(·) using a finite gain." More precisely: we say that the discrete-time system described by (1) is I/O stable iff

9  ∃ γ < ∞ s.t., for all bounded u(·), ‖y‖_∞ ≤ γ‖u‖_∞.

10 Comment. The discrete-time system described by (1) is I/O stable iff the I/O map (1), considered as a linear transformation

11  F : ℓ_∞^{n_i} → ℓ_∞^{n_o} : u ↦ F[u] =: y,

is continuous, (i.e. bounded (A.6.82)). Hence condition (9) is equivalent to

12  ‖F‖ = sup{‖y‖_∞ : ‖u‖_∞ = 1} < ∞.

13 Exercise. Show that the discrete-time system described by (1) is not I/O stable if there exists a sequence of inputs (u^p(·))₁^∞ s.t. for all p ∈ ℕ

14  ‖u^p‖_∞ = 1 yet ‖y^p‖_∞ > p.

17 Theorem [I/O stability]. The linear discrete-time system described by (1) is I/O stable iff

18  sup_{k∈ℤ} {Σ_{k′=−∞}^{k} ‖H(k,k′)‖} =: γ < ∞,

where the norm of H(k,k′) is the induced matrix norm (5).

19 Comments. α) Assertion (18) is equivalent to:

  the map k ↦ Σ_{k′=−∞}^{k} ‖H(k,k′)‖ is bounded on ℤ,

or equivalently,

20  ∀ (i,j) ∈ n_o × n_i, the map k ↦ Σ_{k′=−∞}^{k} |h_ij(k,k′)| is bounded on ℤ.

β) In (18) above, any matrix norm may be used, since all matrix norms are equivalent, (A.6.45).

24 Proof of Theorem (17). 1) Sufficiency. We prove that (18) implies (9). Therefore let u(·) be any bounded input sequence, i.e. whose norm ‖u‖_∞, given by (6), is a finite number. Then from (1), taking vector norms, we have successively, ∀ k ∈ ℤ,

  ‖y(k)‖ = ‖Σ_{k′=−∞}^{k} H(k,k′)u(k′)‖ ≤ Σ_{k′=−∞}^{k} ‖H(k,k′)u(k′)‖

   ≤ Σ_{k′=−∞}^{k} ‖H(k,k′)‖ ‖u(k′)‖ ≤ {Σ_{k′=−∞}^{k} ‖H(k,k′)‖} · ‖u‖_∞

   ≤ γ ‖u‖_∞

(we used successively an inductive application of the triangle inequality, (A.6.17), (6) and (18)). Hence, using (7), taking the supremum w.r.t. k, we obtain (9) with γ as defined in (18). ∎

2) Necessity. We use contraposition, i.e. the negation of (18) must imply the negation of (9), (i.e. system (1) is not I/O stable). Therefore let (18) not be true; then, negating equivalence (20), there exists (α,β) ∈ n_o × n_i s.t.

25  sup_{k∈ℤ} {Σ_{k′=−∞}^{k} |h_αβ(k,k′)|} = ∞.

Now w.l.g. α = 1 and β = 1, whence

  ∀ p ∈ ℕ there exists k_p ∈ ℤ s.t.

26  Σ_{k′=−∞}^{k_p} |h₁₁(k_p,k′)| > p.

Now define a sequence of input sequences s.t. ∀ p ∈ ℕ

27a  u^p(k) := (u₁^p(k), 0, 0, ..., 0)  ∀ k ∈ ℤ

with

27b  u₁^p(k) := sign[h₁₁(k_p,k)] for k ∈ (−∞,k_p]†,  u₁^p(k) := 0 elsewhere

† k ∈ (−∞,k_p] iff the integer k ≤ k_p.

(sign(a) := 1 for a > 0, := 0 for a = 0, := −1 for a < 0). Therefore, a) by (27) and (6),

28  ∀ p ∈ ℕ  ‖u^p‖_∞ = 1

(note that, by (26), u₁^p(·) is not identically zero),

b) by (1) (for k = k_p) and (27), the output sequence y^p(·) corresponding to u^p(·) has as first component

29  y₁^p(k_p) = Σ_{k′=−∞}^{k_p} |h₁₁(k_p,k′)|,

c) by (7), (29) and (26),

30  ∀ p ∈ ℕ  ‖y^p‖_∞ ≥ ‖y^p(k_p)‖ ≥ |y₁^p(k_p)| > p.

Hence, by (28) and (30), the system described by (1) is not I/O stable, by Exercise (13). ∎

33 System Representations R_d(·) = [A(·),B(·),C(·),D(·)]. Consider the I/O map of R_d(·). If H(k,k′) is the impulse response (2d.1.96) of a linear discrete-time system representation R_d(·) = [A(·),B(·),C(·),D(·)] with u(k) = θ for k < 0 and in the zero state at time −∞, (hence in the zero state at k₀ = 0), then the I/O map reads

34  y(k) = F[u](k) = Σ_{k′=0}^{k} H(k,k′)u(k′)
        = Σ_{k′=0}^{k−1} C(k)Φ(k,k′+1)B(k′)u(k′) + D(k)u(k)   ∀ k ≥ 0,

where u(·) : ℕ → ℝ^{n_i} and y(·) : ℕ → ℝ^{n_o}.

35 We say that the system R_d(·) is I/O stable iff condition (9) holds for the I/O map (34).
Similarly as in the continuous-time case, we have the following corollary.

36 Corollary. Consider a linear discrete-time system representation R_d(·) = [A(·),B(·),C(·),D(·)] where D(·) is bounded on ℕ. Then R_d(·) is I/O stable iff

37  sup_{k≥0} {Σ_{k′=0}^{k−1} ‖C(k)Φ(k,k′+1)B(k′)‖} < ∞

(where the matrix norm used is arbitrary).



38 Exercise. Prove Corollary (36).

40 Linear Time-Invariant Systems. If the system described by (1) is time-invariant, then H(k,k′) depends only on the elapsed time k−k′. Thus we have H(k−k′) := H(k−k′,0) = H(k,k′). Hence Σ_{k′=−∞}^{k} ‖H(k,k′)‖ = Σ_{k′=0}^{∞} ‖H(k′)‖, i.e. the first expression is independent of k ∈ ℤ. Hence, by Theorem (17), we have

41 Corollary. Let the discrete-time system described by (1) be time-invariant. Then the system described by (1) is I/O stable iff

  Σ_{k=0}^{∞} ‖H(k)‖ < ∞,

i.e. the impulse response is absolutely summable.
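
For a time-invariant state-space realization, the condition of Corollary (41) can be checked by truncating the sum. A minimal sketch (assuming NumPy and the convention H(0) = D, H(k) = CA^{k−1}B for k ≥ 1, cf. the sequence G(·) in the proof below; the truncation horizon is an assumption of the sketch):

    import numpy as np

    A = np.array([[0.5, 1.0],
                  [0.0, -0.4]])   # spectral radius < 1
    B = np.eye(2); C = np.eye(2); D = np.zeros((2, 2))

    norm = lambda M: np.abs(M).sum(axis=1).max()   # max row-sum, as in (5)

    total, Ak = norm(D), np.eye(2)
    for k in range(1, 200):                        # truncated sum of ||H(k)||
        total += norm(C @ Ak @ B)
        Ak = Ak @ A
    print("truncated sum of ||H(k)||:", total)
    print("rho(A) < 1:", max(abs(np.linalg.eigvals(A))) < 1.0)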



Finally combining Corollaries (41) and (36) we have

43 Corollary. Consider a discrete-time linear time-invariant system representation R_d = [A,B,C,D] with transfer function Ĥ(z) ∈ ℂ_p(z)^{n_o×n_i} given by (3d.2.61). Then

44  R_d is I/O stable

⟺

45  P[Ĥ(z)] ⊂ D(0,1).

46 Comment. (45) means that Ĥ(z) has no poles outside the open unit disc, or equivalently, Ĥ(z) is analytic in |z| ≥ 1: a well-known characterization for "external stability."

Proof of (44) ⟺ (45). Define the sequence G(·) on ℕ s.t. G(0) = 0_{n_o×n_i} and G(k+1) = CAᵏB for k ≥ 0. Hence Ĥ(z) = Ĝ(z) + D. Observe now that (44) ⟺ (45) is equivalent to: ∀ (i,j) ∈ n_o × n_i,

  Σ_{k=0}^{∞} |g_ij(k)| < ∞  ⟺  P[ĝ_ij(z)] ⊂ D(0,1).

We prove this equivalence.

⇒: With ĝ_ij(z) := Σ_{k=0}^{∞} g_ij(k) z^{−k}, where g_ij(0) = 0, we obtain, on |z| ≥ 1,

  |ĝ_ij(z)| ≤ Σ_{k=0}^{∞} |g_ij(k)| < ∞.

⇐: ĝ_ij(z) is a strictly proper rational function; hence, by Exercise (51) below (partial fraction expansion) and A ∈ ℂⁿˣⁿ (Comment (60)),

48  g_ij(k) = Σ_l π_l(k)(λ_l)ᵏ  ∀ k ≥ n+1,

where, ∀ l, λ_l is a pole in D(0,1)\{0} and π_l(k) is a polynomial in k. Beware that (48) is only valid for k ≥ n+1; for smaller values of k the (possible) pole at zero would have a contribution for some k.
Observe now that, as k → ∞, polynomials in k grow slower than any growing exponential (1+ε)ᵏ with ε > 0; more precisely, ∀ ε > 0, ∀ l, ∃ m_l(ε) ∈ ℝ₊ s.t.

49  |π_l(k)| ≤ m_l(ε)(1+ε)ᵏ  ∀ k ≥ 0.

Therefore, picking ρ := max_l |λ_l| < 1, ε ∈ (0, ρ⁻¹−1), and m(ε) := Σ_l m_l(ε), we have, by (48) and (49),

  |g_ij(k)| ≤ m(ε)[(1+ε)ρ]ᵏ  ∀ k ≥ n+1,

where (1+ε)ρ < 1. Hence Σ_{k=0}^{∞} |g_ij(k)| < ∞.  QED.

51 Exercise. Let λ ∈ ℂ. Show by induction that, ∀ r ≥ 1, we have the following inverse z-transforms:

52  Z⁻¹((z−λ)^{−r})(k) = C(k−1, r−1) λ^{k−r} 1(k−r)  ∀ k ≥ 0,

where

  C(k−1, r−1) = ((k−1)(k−2)···(k−r+1))/(r−1)!

and

  1(k) = 0 for k < 0,  1(k) = 1 for k ≥ 0.

Note that the next-to-last expression is polynomial in k.
[Hint: see (D.2.11).]

53 Comment. For λ = 0, formula (52) reduces to

  Z⁻¹(z^{−r})(k) = δ(k−r),

where

  δ(k) = 0 for k ≠ 0,  δ(k) = 1 for k = 0.

56 Exercise [Partial fraction expansion]. Let ĝ(z) = n(z)/d(z) ∈ ℂ_{p,0}(z) (strictly proper rational), where n(z) and d(z) are coprime. Let d(z) have zeros at λ_l of multiplicity m_l for l = 1,2,...,σ. Show that the partial fraction expansion

57  ĝ(z) = Σ_{l=1}^{σ} Σ_{r=1}^{m_l} a_{lr} (z−λ_l)^{−r}

leads to

58  g(k) = Σ_{l=1}^{σ} Σ_{r=1}^{m_l} a_{lr} C(k−1, r−1) λ_l^{k−r} 1(k−r)  for k ≥ 0,

where the contribution due to a pole λ_l = 0 reduces to

59  Σ_{r=1}^{m_l} a_{lr} δ(k−r)  for k ≥ 0.

60 Comment. Let Ĝ(z) = C(zI−A)⁻¹B ∈ ℂ_{p,0}(z)^{n_o×n_i} be the transfer function of a time-invariant system R_d = [A,B,C,0]. Then, for all (i,j) ∈ n_o × n_i, ĝ_ij(z) = n_ij(z)/det(zI−A) ∈ ℂ_{p,0}(z). After canceling common factors, partial fraction expansion can be applied to each ĝ_ij(z), whence g_ij(k) reads as (58)-(59). Note especially that, with n = degree(det(zI−A)), for k ≥ n+1 there is no contribution by a pole λ = 0 and all other pole terms are active. Hence, using (58)-(59) with λ_σ = 0,

61  g_ij(k) = Σ_{l=1}^{σ−1} Σ_{r=1}^{m_l} a_{lr} C(k−1, r−1) λ_l^{k−r}  ∀ k ≥ n+1

    =: Σ_{l=1}^{σ−1} λ_lᵏ π_l(k),

where π_l(k) is a polynomial of degree m_l − 1. Compare with (48).

7d.2 State-Related Stability Concepts


We are given a linear recursion system R_d(·) represented by

1  x(k+1) = A(k)x(k) + B(k)u(k)
2  y(k) = C(k)x(k) + D(k)u(k)     ∀ k ∈ ℕ

and treat the same problems as in the continuous-time case: asymptotic and exponential stability of the zero solution of x(k+1) = A(k)x(k); conditions for obtaining bounded state and output trajectories as well as regulation; and the response to periodic inputs.

7d.2.1. Stability of x(k+1) = A(k)x(k)

We study the stability of the zero equilibrium solution of the linear r.e.

3  x(k+1) = A(k)x(k),  k ∈ ℕ,

where x(k) ∈ ℝⁿ and A(·) : ℕ → ℝⁿˣⁿ.

Asymptotic stability. This property roughly asserts that every solution of x(k+1) = A(k)x(k) tends to zero as k → ∞. More precisely, with Φ(·,·) the transition matrix (2d.1.31):

6 Definition. We say that the zero solution x(k) ≡ θ of x(k+1) = A(k)x(k) (on k ≥ 0) is asymptotically stable (asy. stable) iff, for all x₀ ∈ ℝⁿ, for all k₀ ∈ ℕ,
a) k ↦ x(k) = Φ(k,k₀)x₀ is bounded on k ≥ k₀,
b) k ↦ x(k) = Φ(k,k₀)x₀ tends to zero as k → ∞.

7 Comment. Observe that any solution x(·), on any [k₀,k], takes only finitely many values, hence is bounded there; moreover, by the linearity of the solution in x₀, we expect the following analog of Theorem (7.2.8).

8 Theorem [Asymptotic stability]. Let det A(k) ≠ 0 ∀ k ∈ ℕ. The zero solution of x(k+1) = A(k)x(k) on k ≥ 0 is asy. stable if and only if

9  Φ(k,0) → 0 as k → ∞.

10 Exercise. Prove Theorem (8). [Hint: see the proof of Theorem (7.2.8); observe that, because det A(k) ≠ 0 ∀ k ∈ ℕ, ∀ k ≥ k₀, Φ(k,k₀) = Φ(k,0)Φ(0,k₀)].

Exponential stability. This property roughly asserts that every solution of


x(k+ 1) = A(k)x(k) is bounded by a decaying exponential depending on the elapsed
time k - ko. More precisely, we define it as follows.

17 Definition. We say that the zero solution of x(k+1) = A(k)x(k) on k ≥ 0 is exponentially stable (exp. stable) iff ∃ ρ ∈ [0,1) and m > 0 s.t. for all k₀ ∈ ℕ

18  ‖Φ(k,k₀)‖ ≤ m ρ^{k−k₀}  ∀ k ≥ k₀

(where the matrix norm used is arbitrary).

19 Comments. α) The constants ρ ∈ [0,1) and m > 0 are fixed, i.e. independent of k₀ ∈ ℕ; the constant a ≥ 0 s.t. ρ = exp(−a) is the exponential decay rate.
β) Using an induced matrix norm for ‖Φ(k,k₀)‖, we see that the zero solution of x(k+1) = A(k)x(k) on k ≥ 0 is exp. stable if and only if ∃ ρ ∈ [0,1) and m > 0 s.t. ∀ (x₀,k₀) ∈ ℝⁿ × ℕ

  ‖x(k)‖ ≤ m‖x₀‖ ρ^{k−k₀}  ∀ k ≥ k₀;

in other words, every solution of x(k+1) = A(k)x(k) is bounded by a decaying exponential, where the constants ρ and m are independent of k₀ ∈ ℕ.
γ) By abuse of language, the expression "the zero solution of x(k+1) = A(k)x(k) is exp. stable" is often replaced by "x(k+1) = A(k)x(k) is exp. stable" or "the equilibrium solution x = θ is exp. stable" or "A(·) is exp. stable."
δ) Following the method of Comment (7.2.25), it can be shown that a stability concept equivalent to exponential stability is the following:
The zero solution of x(k+1) = A(k)x(k) on k ≥ 0 is said to

20 be uniformly asymptotically stable iff

a) k ↦ Φ(k,k₀) is bounded on k ≥ k₀ uniformly in k₀ ∈ ℕ, i.e.

21  ∃ l < ∞ s.t. ∀ k₀ ∈ ℕ  ‖Φ(k,k₀)‖ ≤ l  ∀ k ≥ k₀,

b) k ↦ Φ(k,k₀) tends to zero as k → ∞ uniformly in k₀ ∈ ℕ, i.e.

22  ∀ ε > 0 ∃ an integer K(ε) > 0 s.t. ∀ k₀ ∈ ℕ  ‖Φ(k,k₀)‖ ≤ ε  ∀ k ≥ k₀ + K(ε).

[It is crucial to note that the constants l in (21) and K(ε) in (22) are independent of k₀ ∈ ℕ; compare with the conditions (9) of asymptotic stability.]
ε) In the time-invariant case x(k+1) = Ax(k), (with A constant), asymptotic stability is equivalent to exponential stability. Indeed Φ(k,k₀) = A^{k−k₀} depends only on the elapsed time k−k₀; hence the zero solution of x(k+1) = Ax(k) is asy. stable, (i.e. (9) holds), iff the zero solution is uniformly asy. stable.

*25 Exercise. Show that

  A(·) is uniformly asy. stable ⇔ A(·) is exp. stable.

[Hint: adopt the method of proof of Comment (7.2.25); in the necessity part use (21) and (22) with ε = 1/2, and pick ρ ∈ [0,1) s.t. 2ρ^K > 1; hence ‖Φ(k,k₀)‖ ≤ 2l ρ^{k−k₀} for all k ≥ k₀.]

33 Theorem [Time-invariant case; A is a constant matrix]. The zero solution of x(k+1) = Ax(k) is exp. stable iff

  σ(A) ⊂ D(0,1)

(i.e. every eigenvalue of A has magnitude strictly less than one).

35 Analysis. Φ(k,k₀) = A^{k−k₀}. Now, using (4.4.36), (see Exercise (42) below),

36  Aᵏ = Σ_{l=1}^{σ} Π_l(k)(λ_l)ᵏ  ∀ k ≥ n,

where {λ_l}₁^σ = σ(A)\{0} and, ∀ l, Π_l(k) is a matrix polynomial in k. Hence, by taking matrix norms, ∀ k ≥ n,

  ‖Aᵏ‖ ≤ Σ_{l=1}^{σ} ‖Π_l(k)‖ · |λ_l|ᵏ

37   ≤ Σ_{l=1}^{σ} π_l(k) · |λ_l|ᵏ ≤ π(k) ρᵏ,

where the π_l(k) are polynomials s.t. π_l(k) ≥ ‖Π_l(k)‖, π(k) := Σ_{l=1}^{σ} π_l(k) ≥ 0 and

38  ρ := max{|λ| : λ ∈ σ(A)}.

Since a polynomial grows slower than any growing exponential, we have

39  ∀ ε > 0 ∃ m(ε) > 0 s.t. 0 ≤ π(k) ≤ m(1+ε)ᵏ  ∀ k ≥ 0.

Hence, combining (37) and (39),

40  ∀ ε > 0 ∃ m(ε) > 0 s.t. ‖Aᵏ‖ ≤ m[(1+ε)ρ]ᵏ  ∀ k ≥ n.

41 Proof of Theorem (33). If σ(A) ⊂ D(0,1), then, by (38), ρ ∈ [0,1). Hence, picking ε > 0 s.t. (1+ε)ρ < 1, we have that (18) holds. On the other hand, if σ(A) ⊄ D(0,1), then by (36) Aᵏ does not tend to 0 as k → ∞ and the zero solution is not exp. stable. ∎
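
A numerical sketch of the test of Theorem (33) (assuming NumPy; the matrix is an arbitrary illustration):

    import numpy as np

    A = np.array([[0.5, 1.0],
                  [0.0, -0.4]])                  # example matrix

    rho = max(abs(np.linalg.eigvals(A)))         # spectral radius, (38)
    print("exp. stable:", rho < 1.0)

    # simulate x(k+1) = A x(k); the decay matches rho asymptotically
    x = np.array([1.0, 1.0])
    for k in range(30):
        x = A @ x
    print(np.linalg.norm(x), rho ** 30)          # comparable orders of magnitude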

42 Exercise. Let A ∈ ℂⁿˣⁿ and let {λ_l}₁^σ = σ(A)\{0}. Using the notations of Section 4.4, esp. (4.4.36), show that

  Aᵏ = Σ_{l=1}^{σ} Π_l(k)(λ_l)ᵏ  ∀ k ≥ n,

where Π_l(k) is a matrix polynomial in k of degree m_l − 1.

Comment. The contribution due to an eigenvalue λ_l = 0 reads R_lᵏ for k ≥ 1, where R_l is nilpotent of index m_l ≤ n. Hence it disappears ∀ k ≥ n.

Lyapunov equation. Consider the equation

46  P = A*PA + Q,

where A ∈ ℂⁿˣⁿ and Q ∈ ℂⁿˣⁿ s.t. Q = Q* > 0 are given, and a unique solution P = P* > 0 is to be found. Equation (46) is called the (discrete-time) Lyapunov equation. Its solvability relates directly to the exp. stability of x(k+1) = Ax(k).

47 Lemma. Assume that, given Q = Q* > 0, Eq. (46) has a unique solution P = P* > 0; then the zero solution of x(k+1) = Ax(k) is exp. stable, i.e. σ(A) ⊂ D(0,1).

Proof. Consider the quadratic form

48  v(x) : ℂⁿ → ℝ₊ : x ↦ x*Px.

Since P = P* > 0, there exist constants p_u ≥ p_l > 0 s.t.

49  p_l ‖x‖² ≤ v(x) ≤ p_u ‖x‖²  ∀ x ∈ ℂⁿ.

Taking differences of v(x) along any trajectory of x(k+1) = Ax(k), we have, by (48) and (46),

  v(x(k+1)) − v(x(k)) = x(k)*[A*PA − P]x(k) = −x(k)*Qx(k).

Since Q = Q* > 0, there exists a constant γ > 0 s.t.

  γ‖x‖² ≤ x*Qx  ∀ x ∈ ℂⁿ.

Hence along any trajectory, ∀ x₀ ∈ ℂⁿ with x₀ ≠ θ, ∀ k₀ ∈ ℕ,

  v(x(k+1)) ≤ [1 − γ/p_u] v(x(k))  ∀ k ≥ k₀,

with ρ² := 1 − γ/p_u s.t. ρ ∈ [0,1). Hence, by composition, we get v(x(k)) ≤ v(x₀) ρ^{2(k−k₀)} for k ≥ k₀. Therefore, by (49), with m := (p_u/p_l)^{1/2},

  ‖x(k)‖ ≤ m‖x₀‖ ρ^{k−k₀}  ∀ k ≥ k₀.

So, with x(k) = Φ(k,k₀)x₀, using induced norms,

  ‖Φ(k,k₀)‖ ≤ m ρ^{k−k₀}  ∀ k ≥ k₀

(where the constants m > 0 and ρ ∈ [0,1) are independent of k₀). ∎

Consider now the converse of Lemma (47). As a first step consider

52 Exercise. With A ∈ ℂⁿˣⁿ, consider the map

53  A : ℂⁿˣⁿ → ℂⁿˣⁿ : X ↦ X − A*XA

(see Exercise (4.8.10)). Show that, if σ(A) ⊂ D(0,1), then
a) A is bijective,
b) if X − A*XA is Hermitian, then so is X.
[Hint: note that A injective ⇒ A surjective].

54 Lemma. Consider the Lyapunov equation (46) and let σ(A) ⊂ D(0,1). Then, ∀ Q = Q* > 0, Eq. (46) has a unique solution P = P* > 0 given by

55  P = Σ_{k=0}^{∞} (A*)ᵏ Q Aᵏ.

Proof. Let Q = Q* > 0 and consider the matrix r.e.

55a  X(k+1) = A*X(k)A + Q,  X(0) = 0,  k ≥ 0.

1) By an elementary induction,

55b  X(l+1) = Σ_{k=0}^{l} (A*)ᵏ Q Aᵏ.

Now, since A is exp. stable, it follows that, as l → ∞, the series converges to the limit

  P := X(∞) = Σ_{k=0}^{∞} (A*)ᵏ Q Aᵏ,

which solves (46). (To see this, let k → ∞ in (55a) and use (55b).)
2) The solution defined by (55) is unique since, by Exercise (52), the operator A, given by (53), is injective.
3) Obviously P = P*; moreover P > 0: indeed, a) the first term of (55b) is positive definite by assumption and b) all the remaining terms are positive semi-definite by inspection. ∎

From Lemmas (47) and (54) we now have our main result.

56 Theorem. Consider the Lyapunov equation (46). Then the following statements are equivalent:
a) ∀ Q = Q* > 0 the Lyapunov equation (46) has a unique solution P = P* > 0 given by (55);
b) σ(A) ⊂ D(0,1);
c) the zero solution of x(k+1) = Ax(k) is exp. stable.
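
As in the continuous-time case, the series (55) need not be summed term by term: a library solver handles (46) directly. A minimal sketch (assuming SciPy; scipy.linalg.solve_discrete_lyapunov solves X = aXa* + q, so a = A* matches the convention of (46)):

    import numpy as np
    from scipy.linalg import solve_discrete_lyapunov

    A = np.array([[0.5, 1.0],
                  [0.0, -0.4]])    # sigma(A) in D(0,1)
    Q = np.eye(2)

    P = solve_discrete_lyapunov(A.conj().T, Q)     # solves P = A*PA + Q

    print(np.allclose(A.conj().T @ P @ A + Q, P))  # residual check of (46)
    print(np.all(np.linalg.eigvalsh(P) > 0))       # P = P* > 0, per Theorem (56)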

Periodically varying r.e.'s. We study the exp. stability of the r.e.

60  x(k+1) = A(k)x(k),  k ∈ ℕ,

where the matrix sequence A(·) : ℕ → ℝⁿˣⁿ is p-periodic and nonsingular, i.e.

61  A(k+p) = A(k)  ∀ k ∈ ℕ,

and

61a  det A(k) ≠ 0  ∀ k ∈ [0,p−1]

(whence det A(k) ≠ 0 ∀ k ≥ 0 and det Φ(k,l) ≠ 0 for all k ≥ l ≥ 0).
In Theorem (2d.2.71) we established that, a) with

62  B := [Φ(p,0)]^{1/p} ∈ ℂⁿˣⁿ and P(k) := Φ(k,0)B^{−k} for k ∈ ℕ,

63  Φ(k,k₀) = P(k) B^{k−k₀} P(k₀)⁻¹,

where P(k) ∈ ℂⁿˣⁿ is nonsingular for all k ∈ ℕ and P(·) is p-periodic, and b) under the coordinate change x(k) = P(k)ξ(k), (60) becomes a time-invariant r.e.

64  ξ(k+1) = Bξ(k)

(where B is the constant nonsingular matrix given by (62)). Our objective is to show that the zero solution of (60) is exp. stable iff the zero solution of (64) is exp. stable. Note that, by Theorem (33) or (56), the latter holds iff σ(B) ⊂ D(0,1).

65 Exercise. Consider Eq. (62). Show that

  σ(B) ⊂ D(0,1)

if and only if

  σ[Φ(p,0)] ⊂ D(0,1).

[Hints: by (62), Bᵖ = Φ(p,0); apply the spectral mapping Theorem (4.7.1) with f(λ) = λᵖ.]
We obtain now our main result.

66 Theorem [Exp. stability]. Consider the periodically varying r.e. (60), where det A(k) ≠ 0 ∀ k ∈ ℕ. Then the zero solution of x(k+1) = A(k)x(k) is exp. stable if and only if

67  σ[Φ(p,0)] ⊂ D(0,1).

68 Short proof (exercise). Consider Eq. (63). Since P(k) is nonsingular for all k ∈ ℕ and P(·) is p-periodic, the sequences P(·) as well as P(·)⁻¹ are bounded on ℕ; equivalently, there exist positive constants M and N s.t., for all k ∈ ℕ, ‖P(k)‖ ≤ M and ‖P(k)⁻¹‖ ≤ N. Therefore, by (63), for all k₀ ∈ ℕ and for all k ≥ k₀,

  (MN)⁻¹ ‖B^{k−k₀}‖ ≤ ‖Φ(k,k₀)‖ ≤ (MN) ‖B^{k−k₀}‖.

Hence, using (18), the zero solution of x(k+1) = A(k)x(k) is exp. stable if and only if the zero solution of ξ(k+1) = Bξ(k) is exp. stable. The latter condition is equivalent to condition (67) by Exercise (65). ∎
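
Since Φ(p,0) = A(p−1)···A(1)A(0), the test (67) reduces to a finite product followed by one eigenvalue computation. A minimal sketch (assuming NumPy, with arbitrary nonsingular illustrative data):

    import numpy as np

    p = 3
    A_seq = [np.array([[0.0, 1.0], [-0.3, 0.4]]),
             np.array([[0.5, 0.0], [0.2, 0.5]]),
             np.array([[0.9, 0.1], [0.0, 0.8]])]   # A(0), A(1), A(2), all nonsingular

    Phi_p = np.eye(2)
    for k in range(p):                 # Phi(p,0) = A(p-1) ... A(1) A(0)
        Phi_p = A_seq[k] @ Phi_p

    lam = np.linalg.eigvals(Phi_p)
    print("exp. stable:", np.all(np.abs(lam) < 1.0))   # condition (67)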

7d.2.2. Bounded Trajectories and Regulation


We study a discrete-time system R_d(·) = [A(·),B(·),C(·),D(·)] represented by

71  x(k+1) = A(k)x(k) + B(k)u(k)
72  y(k) = C(k)x(k) + D(k)u(k)     ∀ k ∈ ℕ

where A(·),B(·),C(·),D(·) are matrix sequences on ℕ. The ℓ∞-norm of the restriction to k ≥ k₀ of a vector sequence f(·) : ℕ → ℝⁿ will be denoted by ‖f‖_{∞,k₀}; more precisely,

73  ‖f‖_{∞,k₀} = ‖f|[k₀,∞)‖_∞ = sup_{k≥k₀} ‖f(k)‖

where any norm may be used for ‖f(k)‖. Of course ‖f‖_{∞,0} = ‖f‖_∞, i.e. the usual ℓ∞-norm of f(·). For matrix sequences M(·) : ℕ → ℝⁿˣᵐ we denote by ‖M‖_∞ the ℓ∞-norm, i.e.

74  ‖M‖_∞ = sup_{k≥0} ‖M(k)‖,

where the matrix norm ‖M(k)‖ is the one induced by the chosen vector norm.

75 Theorem [Bounded trajectories and regulation]. Consider a discrete-time system representation R_d(·) = [A(·),B(·),C(·),D(·)] such that
a) x(k+1) = A(k)x(k) is exp. stable with constants m > 0 and ρ ∈ [0,1) as in (18),
b) the matrix sequences B(·),C(·),D(·) are bounded on ℕ, i.e. ‖B‖_∞, ‖C‖_∞, ‖D‖_∞ are finite constants.
U.th.c.,
1) for every (x₀,k₀) ∈ ℝⁿ × ℕ, for every bounded input u|[k₀,∞), the state k ↦ x(k) = s(k,k₀,x₀,u) and the output k ↦ y(k) = ρ(k,k₀,x₀,u) are bounded on k ≥ k₀; more precisely, with the constants given above,

76  ‖x‖_{∞,k₀} ≤ m‖x₀‖ + [(m/(1−ρ))‖B‖_∞] ‖u‖_{∞,k₀},

77  ‖y‖_{∞,k₀} ≤ [m‖C‖_∞] ‖x₀‖ + [(m/(1−ρ))‖C‖_∞‖B‖_∞ + ‖D‖_∞] ‖u‖_{∞,k₀};

2) under the same conditions as in 1), if in addition u(k) → θ as k → ∞, then x(k) → θ and y(k) → θ as k → ∞.

79 Proof of Theorem (75). 1) is straightforward and left as an exercise.
[Hints: use (2d.1.71) and (18); moreover note that Σ_{k′=k₀}^{k−1} ρ^{k−k′−1} ≤ Σ_{i=0}^{∞} ρⁱ = (1−ρ)⁻¹.]
2) By (72), ‖y(k)‖ ≤ ‖C‖_∞‖x(k)‖ + ‖D‖_∞‖u(k)‖  ∀ k ≥ k₀.
Hence, if both u(k) and x(k) tend to zero as k → ∞, then so does y(k). Hence we are reduced to showing that

  x(k) = Φ(k,k₀)x₀ + Σ_{k′=k₀}^{k−1} Φ(k,k′+1)B(k′)u(k′) =: x₁(k) + x₂(k)

tends to zero as k → ∞. Now this is immediate for x₁(k), since x(k+1) = A(k)x(k) is exp. stable. So we are left to prove that lim_{k→∞} x₂(k) = θ. Now, by taking vector norms and using (18),

  0 ≤ ‖x₂(k)‖ ≤ m‖B‖_∞ Σ_{k′=k₀}^{k−1} ρ^{k−k′−1} ‖u(k′)‖  ∀ k > k₀,

and we are done if the series on the RHS converges to zero as k → ∞. For this purpose, first set u(k) = θ for k < k₀ and then observe that

  Σ_{k′=k₀}^{k−1} ρ^{k−k′−1} ‖u(k′)‖ = Σ_{k′=0}^{∞} ρ^{k′} ‖u(k−k′−1)‖

[in the last expression ‖u(k−k′−1)‖ = 0 for k′ > k−1]. Take now any sequence (k_l)₁^∞ s.t. k_l > k₀ and k_l → ∞ as l → ∞, and set, ∀ l, f_l(k′) := ρ^{k′} ‖u(k_l−k′−1)‖ for k′ ∈ ℕ. Thus we are done if we prove that

79a  lim_{l→∞} Σ_{k′=0}^{∞} f_l(k′) = 0.

Now this follows from Lebesgue's dominated convergence theorem, [Rud.1,p.27], applied to the sequence of series in (79a) above, (these series are integrals w.r.t. the counting measure on ℕ). This theorem allows us to permute the operations of limit taking and summation because the sequence (f_l(·))₁^∞ is dominated by the fixed absolutely summable series g(·) on ℕ, where g(k′) := ρ^{k′} ‖u‖_{∞,k₀}: indeed, i) ∀ l and k′ ∈ ℕ, |f_l(k′)| ≤ g(k′), and ii) Σ_{k′=0}^{∞} g(k′) = (1−ρ)⁻¹ ‖u‖_{∞,k₀} < ∞. Hence, by the theorem, with u(k) → θ as k → ∞ and k_l → ∞,

  lim_{l→∞} Σ_{k′=0}^{∞} f_l(k′) = Σ_{k′=0}^{∞} lim_{l→∞} f_l(k′) = Σ_{k′=0}^{∞} lim_{l→∞} ρ^{k′} ‖u(k_l−k′−1)‖ = 0,

and we are done. ∎


83 Exercise [Robustness]. Consider a discrete-time time-invariant system representation R_d = [A,B,C,D], thus w.l.g. k₀ = 0. Let σ(A) ⊂ D(0,1). Show that for any sufficiently small perturbation in A, and for any perturbation in B,C,D, the conclusions of Theorem (75) remain true.

88 Exercise. Consider a time-invariant system representation R_d = [A,B,C,D] with σ(A) ⊂ D(0,1). Show that, ∀ (x₀,k₀) ∈ ℂⁿ × ℕ, for all bounded inputs u|[k₀,∞) that tend to a constant u_∞ as k → ∞, x(k) and y(k) tend to constant vectors x_∞ and y_∞ as k → ∞. Compute x_∞ and y_∞.

7d.2.3 Response to q-Periodic Inputs


We study the response of a discrete-time time-invariant system representation R_d = [A,B,C,D], where x(k+1) = Ax(k) is exp. stable, to a q-periodic input sequence u_p(·) : ℕ → ℂ^{n_i}, whence

90  u_p(k+q) = u_p(k)  ∀ k ∈ ℕ.

91 We shall denote by k ↦ v_p(k) the q-periodic extension of k ↦ u_p(k) to all of ℤ, i.e. v_p(·) is the unique q-periodic sequence on ℤ s.t. v_p(k) = u_p(k) ∀ k ∈ ℕ.

92 Analysis. a) Consider k ↦ H(k), the impulse response (3d.2.59) of R_d = [A,B,C,D], and define the sequence y_p(·) : ℕ → ℂ^{n_o} by

93  y_p(k) = Σ_{k′=−∞}^{k} H(k−k′) v_p(k′)  for all k ∈ ℕ.

94 Claim: the sequence y_p(·) is well defined and q-periodic.

Indeed, 1) the first assertion follows because x(k+1) = Ax(k) is exp. stable and v_p(·) is bounded, (using (18) and norms we get ‖y_p(k)‖ ≤ [(m/(1−ρ))‖C‖‖B‖ + ‖D‖] ‖v_p‖_∞, where ‖v_p‖_∞ is the ℓ∞-norm of v_p(·) on ℤ), and 2) q-periodicity follows by checking that y_p(k+q) = y_p(k), (using (93) and the q-periodicity of v_p(·)).
b) Consider now the response of R_d = [A,B,C,D] due to any state x₀ ∈ ℂⁿ and the q-periodic input u_p(·), i.e. using (3d.2.59),

95  y(k) = ρ(k,0,x₀,u_p) = CAᵏx₀ + Σ_{k′=0}^{k} H(k−k′) u_p(k′)

    = CAᵏx₀ + Σ_{k′=0}^{k} H(k−k′) v_p(k′)     ∀ k ∈ ℕ

(since the extension v_p(·) agrees with u_p(·) on ℕ). Hence, by (93), (95) and (3d.2.58),

96  ∀ k ∈ ℕ  y(k) − y_p(k) = CAᵏ [x₀ − Σ_{k′=−∞}^{−1} A^{−k′−1} B v_p(k′)],

where the last series converges because A is exp. stable (indeed, using norms and (18) and the notation I for the series, ‖I‖ ≤ (m/(1−ρ))‖B‖‖v_p‖_∞ < ∞). Therefore in the RHS of (96) the expression between the brackets is a constant vector. Therefore, since Aᵏ → 0 as k → ∞, ∀ x₀ ∈ ℂⁿ, lim_{k→∞} [y−y_p](k) = 0. From this and Claim (94) we have

97 Theorem [Response to q-periodic inputs]. Consider a discrete-time time-invariant system representation R_d = [A,B,C,D], where k₀ = 0 and x(k+1) = Ax(k) is exp. stable, and which is driven by a q-periodic input u_p(·) : ℕ → ℂ^{n_i}. Consider the q-periodic output y_p(·) : ℕ → ℂ^{n_o} as described by (91) and (93); then, as k → ∞, for any initial state x₀ ∈ ℂⁿ, the output y(k) of R_d tends exponentially to y_p(k), as shown by (96). ∎

103 Exercise. Consider the representation R_d defined in Theorem (97). Let the q-periodic input be of the form

104  u_p(k) := Σ_{l=1}^{m} u_l exp[j(lθ₀)k],

where θ₀ = 2π/q and, for all l ∈ m, u_l ∈ ℂ^{n_i}. Show that

105  y_p(k) = Σ_{l=1}^{m} Ĥ(exp(jlθ₀)) u_l exp[j(lθ₀)k]  ∀ k ∈ ℕ,

where Ĥ(z) is the transfer function (3d.2.1) of R_d.


CHAPTER 8

CONTROLLABILITY AND OBSERVABILITY

Introduction
This chapter treats the coupling of the input to the state, i.e. controllability, and that of the state to the output, i.e. observability. This is done for general dynamical systems, which are then specialized to the linear system representation R(·) = [A(·),B(·),C(·),D(·)]: first in the time-varying case and then in the time-invariant case. For the latter systems this leads to the Kalman decomposition and a discussion of the absence of unstable hidden modes, viz. stabilizability and detectability. This chapter ends with a brief study of 1) balanced representations (based upon normalized controllability and observability grammians) and 2) the robustness of controllability (for perturbed nonlinear systems).

8.1. Controllability and Observability of Dynamical Systems

The concepts of controllability and observability are quite general, so it is natural to introduce them in the framework of general dynamical systems D = (U, Σ, Y, s, r), without assuming either linearity or time-invariance. We prove a couple of properties of memoryless feedback: memoryless state-feedback does not affect controllability, and memoryless output-feedback affects neither controllability nor observability.
We consider a given nonlinear dynamical system representation D = (U, Σ, Y, s, r). Its response function ρ is also specified by the appropriate composition of the state-transition function s and the readout function r.
Let t₀ < t₁; the input u[t₀,t₁] is said to steer the state x₀ at t₀ to the state x₁ at t₁ iff

  s(t₁,t₀,x₀,u[t₀,t₁]) = x₁.

We also say that the input u[t₀,t₁] steers the phase (x₀,t₀) of D to the phase (x₁,t₁).

2 The dynamical system representation D is called controllable on [t₀,t₁] iff

  ∀ x₀, x₁ ∈ Σ, ∃ u[t₀,t₁] ∈ U that steers the phase (x₀,t₀) to the phase (x₁,t₁).

In some cases one does not want to prespecify the time t₁; in such a case we say D is called controllable at t₀ iff ∀ x₀, x₁ ∈ Σ, there exists some t₁ > t₀ such that some u[t₀,t₁] ∈ U steers (x₀,t₀) to (x₁,t₁).

3 Remarks. I. D is controllable on [t₀,t₁] ⟺

  ∀ x₀ ∈ Σ, the map s(t₁,t₀,x₀,·) : u[t₀,t₁] ↦ s(t₁,t₀,x₀,u[t₀,t₁])

is surjective, that is, it maps PC([t₀,t₁]) onto Σ.


In most applications, if some input u[t₀,t₁] steers (x₀,t₀) to (x₁,t₁), there are infinitely many other controls that do it also.
II. Controllability depends only on U and the state transition function s. (r and ρ have nothing to do with controllability).
For convenience, we will follow common usage and say "the dynamical system D" rather than the more exact "the dynamical system representation."

4 Memoryless feedback and controllability. Consider a given dynamical system D, a map F_s : Σ → U and a map F_o : Y → U. Let us use F_s and F_o to apply memoryless state feedback and memoryless output feedback on D; the resulting systems are called D_s and D_o, resp., and are shown in Figs. 8.1 and 8.2, resp. Thus, ∀ t, we have

  u(t) = v(t) − F_s(x(t))

and

  u(t) = v(t) − F_o(y(t)).

5 Uniqueness assumption. For D_s and D_o we assume that, ∀ phases (x₀,t₀), ∀ exogenous inputs v(·), there is one and only one state response x(·) and one output response y(·).
6 Theorem. For the dynamical systems D_o and D_s satisfying (5), we have

7 D is controllable on [t₀,t₁]

8 ⇔ D_s is controllable on [t₀,t₁]

9 ⇔ D_o is controllable on [t₀,t₁] .

Fig. 8.1 System D_s with memoryless state feedback.

Fig. 8.2 System D_o with memoryless output feedback.

10 Comments. a) Roughly speaking: nonlinear memoryless state-feedback and output-feedback do not affect controllability.
b) The memoryless assumption is crucial: it allows us to use the same state space for D, D_s and D_o.

Proof of the 1st equivalence. (⇒) By assumption D is controllable on [t₀,t₁]. Consider D_s and two arbitrary phases (x₀,t₀), (x₁,t₁). Since D is controllable on [t₀,t₁], ∃ ū_{[t₀,t₁]} that steers the phase (x₀,t₀) of D to the phase (x₁,t₁) of D. Now a) the phases of D are identical with those of D_s since F_s is memoryless; b) apply to D_s the exogenous input

14 v(t) := ū(t) + F_s(s(t,t₀,x₀,ū_{[t₀,t]})), ∀ t ∈ [t₀,t₁] .

By the uniqueness assumption (5), this exogenous input v(·) will precisely produce the input ū that will steer (x₀,t₀) to (x₁,t₁).
(⇐) By controllability of D_s, ∀ x₀,x₁ ∈ Σ, ∃ v_{[t₀,t₁]} that steers the phase (x₀,t₀) of D_s to the phase (x₁,t₁) of D_s. Since D_s and D have the same phases and, by (5), v will produce a unique input u, that input u of D will produce the required transfer of (x₀,t₀) to (x₁,t₁).
The second equivalence is proved in a similar way.

16 Definition of observability. The dynamical system D is called observable on [t₀,t₁] iff, given D, ∀ inputs u_{[t₀,t₁]} and ∀ corresponding outputs y_{[t₀,t₁]} ∈ Y_{[t₀,t₁]}, the state x₀ at time t₀ is uniquely determined.
Of course, once x₀ is calculated, from u_{[t₀,t₁]} and the state transition map s we can calculate the state trajectory x(·): indeed

x(t) = s(t,t₀,x₀,u_{[t₀,t]}), ∀ t ∈ [t₀,t₁] .

We say that D is observable at t₀ iff, given D, ∀ inputs u_{[t₀,∞)} and ∀ corresponding outputs y_{[t₀,∞)}, the initial state x₀ is uniquely determined.

Remarks. 1. D is observable on [t₀,t₁]
⇔ for each fixed u_{[t₀,t₁]}, the partial response map

x₀ ↦ ρ(·,t₀,x₀,u_{[t₀,t₁]})

is injective, that is, the partial response map is a one-to-one map from Σ to Y_{[t₀,t₁]}.

20 Memoryless feedback and feedforward, and observability. We consider a given dynamical system D, a map F_o: Y → U and a map F_f: U → Y. We use F_o to apply to D a memoryless output feedback and F_f to apply to D a memoryless feedforward. Call the resulting systems D_o and D_f, resp. (see Figs. 8.3 and 8.4). For D_f,

21 ỹ(t) = y(t) + F_f(u(t)) .

For D_o we make the uniqueness assumption (5) above.

22 Theorem. For the system D_f and the system D_o satisfying (5), we have

23 D is observable on [t₀,t₁]

24 ⇔ D_o is observable on [t₀,t₁]

25 ⇔ D_f is observable on [t₀,t₁] .

Proof. Exercise.

Remark. Memoryless state feedback may affect observability. For example, for a linear time-invariant system representation R = [A,B,C,D], there may exist a linear state feedback F_s such that, for some states x₀ and for some inputs u(·), the state trajectory remains, for all t, in the nullspace of C.

Fig. 8.3 System D_o with memoryless output feedback.

Fig. 8.4 System D_f with memoryless feedforward.

Exercise. Let b = [0 1]ᵀ, c = [0 1].
Verify that for F_s = [−1, 0], the system D_s has some nonzero states that produce an output which is identically zero.

8.2. Controllability of the Pair (A(·),B(·))

From the present section to section 4 we restrict ourselves to linear time-varying representations R(·) = [A(·),B(·),C(·),D(·)], where A(·),B(·),C(·),D(·) are piecewise continuous on ℝ₊ and the state space is ℂⁿ. We adopt the following convention: if a piecewise continuous function is zero at all its points of continuity, then we take it to be identically zero. (Note that, without this convention, that piecewise continuous function could only be nonzero at a finite number of points in any bounded interval.)

8.2.1. Controllability of the Pair (A(·),B(·))

Since controllability involves only the relation between the input u(·) and the state trajectory x(·), we will use the expression "controllability of (A(·),B(·))."

1 More precisely, we say that the pair (A(·),B(·)) is controllable on [t₀,t₁] iff ∀ (x₀,t₀) and ∀ (x₁,t₁), ∃ u_{[t₀,t₁]} that steers the phase (x₀,t₀) to the phase (x₁,t₁).
Given the pair (A(·),B(·)), both piecewise continuous, we know that u_{[t₀,t₁]} transfers the phase (x₀,t₀) to the phase (x₁,t₁) if and only if

2 x₁ = s(t₁,t₀,x₀,u) = Φ(t₁,t₀)x₀ + ∫_{t₀}^{t₁} Φ(t₁,τ)B(τ)u(τ)dτ .

3 Equation (2) shows that there will be an input u_{[t₀,t₁]} that will transfer an arbitrary phase (x₀,t₀) to an arbitrary phase (x₁,t₁) if and only if the linear map

4 L_r : u_{[t₀,t₁]} ↦ ∫_{t₀}^{t₁} Φ(t₁,τ)B(τ)u(τ)dτ : PC([t₀,t₁]) → ℂⁿ

is surjective. L_r is called the reachability map on [t₀,t₁].


Sometimes we do not want to specify t₁: then we say that the pair (A(·),B(·)) is controllable at t₀ iff for some t₁ > t₀ the pair is controllable on [t₀,t₁].
Recalling that the state transition matrix Φ(t,t₀) is nonsingular ∀ (t,t₀), it is easy to establish Theorem (5).

5 Reduction Theorem.
The pair (A(·),B(·)) is controllable on [t₀,t₁]
⇔ ∀ x₀ ∈ ℂⁿ, ∃ u_{[t₀,t₁]} that steers (x₀,t₀) to (θₙ,t₁)
⇔ ∀ x₁ ∈ ℂⁿ, ∃ u_{[t₀,t₁]} that steers (θₙ,t₀) to (x₁,t₁).

Proof. Exercise. (Use (2) and the properties of Φ(·,·).)


These equivalences suggest the following natural definitions:

8 Given the representation R(·), or the pair (A(·),B(·)), we say that the state x₀ is controllable to zero on [t₀,t₁] iff ∃ u_{[t₀,t₁]} that steers (x₀,t₀) to (θₙ,t₁); we say that the state x₁ is reachable on [t₀,t₁] iff ∃ u_{[t₀,t₁]} that steers (θₙ,t₀) to (x₁,t₁).
The essence of the Reduction Theorem (5) is that, for linear system representations, controllability on [t₀,t₁], controllability to zero on [t₀,t₁] of all states, and reachability on [t₀,t₁] of all states are equivalent. The reader should construct a one-dimensional nonlinear example to show that this is not so for nonlinear systems.
We state now two closely related theorems that characterize the controllability of the pair (A(·),B(·)).

12 Theorem (Controllability in terms of reachability). Let A(·),B(·) be given and be piecewise continuous. Then

13 (i) The pair (A(·),B(·)) is controllable on [t₀,t₁]
14 ⇔ R(L_r) = ℂⁿ
15 ⇔ R(L_rL_r*) = ℂⁿ
16 ⇔ det W_r(t₀,t₁) ≠ 0, where the grammian W_r is defined by

17 W_r(t₀,t₁) := ∫_{t₀}^{t₁} Φ(t₁,τ)B(τ)B(τ)*Φ(t₁,τ)*dτ .

(ii) The set of reachable states on [t₀,t₁] is the subspace R(L_r), which is equal to R[W_r(t₀,t₁)].

Fig. 8.5 The reachability map L_r and its adjoint L_r*.


18 Comments. α) Equation (17) shows that W_r(t₀,t₁) is the integral of a positive semi-definite Hermitian matrix; hence ∀ z ∈ ℂⁿ, z*W_r(t₀,t₁)z ≥ 0. Furthermore, for fixed final time t₁, all the singular values of W_r(t₀,t₁) decrease as t₀ increases towards t₁. In particular, if t₀ < t₀′, then

19 W_r(t₀,t₁) ≥ W_r(t₀′,t₁) .

β) The function t₁ ↦ W_r(t₀,t₁) is the solution of the linear matrix d.e.

20 Ẋ(t) = A(t)X(t) + X(t)A(t)* + B(t)B(t)*

with X(t₀) = 0.

γ) From (4) or Fig. 8.5, we see that L_rL_r*: ℂⁿ → ℂⁿ; hence it has a matrix representation. L_r* is defined by: ∀ z ∈ ℂⁿ, ∀ u ∈ U_{[t₀,t₁]},

⟨z, L_ru⟩ₙ = ⟨L_r*z, u⟩

where ⟨·,·⟩ₙ denotes the inner product in ℂⁿ and ⟨·,·⟩ denotes the inner product in L₂^{n_i}. Using matrix notations we have

⟨z, L_ru⟩ₙ = ∫_{t₀}^{t₁} z*Φ(t₁,τ)B(τ)u(τ)dτ = ∫_{t₀}^{t₁} [B(τ)*Φ(t₁,τ)*z]*u(τ)dτ ,

where we noted that z is independent of t and, for matrices, (MNP)* = P*N*M*. Thus

21 (L_r*z)(t) = B(t)*Φ(t₁,t)*z, t ∈ [t₀,t₁] .

Using this result and the definition (4) of L_r, we conclude that the linear map L_rL_r* has the matrix representation W_r(t₀,t₁) defined in (17).

Proof of Theorem (12)

(13) ⇔ (14). Statement (13) is equivalent, as noted above, to L_r being surjective, equivalently, R(L_r) = ℂⁿ.
(14) ⇔ (15). This is simply (A.7.59) applied to A = L_r viewed as a map from L₂^{n_i}([t₀,t₁]) into ℂⁿ.
(15) ⇔ (16). Since L_rL_r*: ℂⁿ → ℂⁿ, L_rL_r* is surjective iff it is a bijection, so, using its matrix representation (17), we see that (15) is equivalent to (16).
Statement (ii) follows immediately from (2) with x₀ = θₙ. •
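Computational note. The determinant test (16) is easy to explore numerically. The sketch below is a minimal Python illustration (ours, not part of the original text): it assumes a constant pair (A,B), chosen arbitrarily, integrates the matrix d.e. of Comment (18β) to obtain t₁ ↦ W_r(t₀,t₁), and then applies (16).

    import numpy as np
    from scipy.integrate import solve_ivp

    A = np.array([[0., 1.], [-2., -3.]])   # an arbitrary constant pair (A,B)
    B = np.array([[0.], [1.]])
    n = A.shape[0]
    t0, t1 = 0.0, 1.0

    def rhs(t, w):
        # d.e. (20): X'(t) = A X + X A* + B B*, with X(t0) = 0
        X = w.reshape(n, n)
        return (A @ X + X @ A.T + B @ B.T).ravel()

    sol = solve_ivp(rhs, (t0, t1), np.zeros(n * n), rtol=1e-10, atol=1e-12)
    Wr = sol.y[:, -1].reshape(n, n)
    # test (16): det W_r(t0,t1) != 0  <=>  (A,B) controllable on [t0,t1]
    print("det W_r(t0,t1) =", np.linalg.det(Wr))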

Let us now consider controllability in terms of controllability to zero. Let

25 L_c : u_{[t₀,t₁]} ↦ ∫_{t₀}^{t₁} Φ(t₀,τ)B(τ)u(τ)dτ : PC([t₀,t₁]) → ℂⁿ .

26 Theorem (Controllability in terms of controllability to zero). Let (A(·),B(·)) be given and be piecewise continuous. Then

27 (i) The pair (A(·),B(·)) is controllable on [t₀,t₁]
28 ⇔ R(L_c) = ℂⁿ
29 ⇔ R(L_cL_c*) = ℂⁿ
30 ⇔ det W_c(t₀,t₁) ≠ 0, where the grammian W_c is defined by

31 W_c(t₀,t₁) := ∫_{t₀}^{t₁} Φ(t₀,τ)B(τ)B(τ)*Φ(t₀,τ)*dτ .

32 (ii) The set of all states controllable to zero on [t₀,t₁] is the subspace R(L_c), which is equal to R[W_c(t₀,t₁)].

33 Exercise. Show that, given t₁ > t₀ and definitions (4) and (25),

L_r = Φ(t₁,t₀)L_c, whence W_r(t₀,t₁) = Φ(t₁,t₀)W_c(t₀,t₁)Φ(t₁,t₀)* .

34 Exercise. Obtain a matrix differential equation that will have t ↦ W_c(t,t₁) as its solution.

Proof of Theorem (26): Exercise.

8.2.2. The Cost of Control

Given the pair (A(·),B(·)), let us consider the cost of reaching (x₁,t₁) from (θₙ,t₀). In order to fit the framework of the theory of the adjoint (Appendix A, Sec. 7.4), let us assume that the cost of control is given by the squared L₂-norm of u(·), namely,

35 ⟨u,u⟩ = ∫_{t₀}^{t₁} u(t)*u(t)dt = ‖u‖₂² .

36 Theorem (Minimum cost control).

i) If the pair (A(·),B(·)) is controllable on [t₀,t₁], then ∀ x₀,x₁ ∈ ℂⁿ the input ũ: [t₀,t₁] → ℂ^{n_i} defined by

37 ũ(t) := B(t)*Φ(t₁,t)*W_r(t₀,t₁)⁻¹[x₁ − Φ(t₁,t₀)x₀]

steers the phase (x₀,t₀) to (x₁,t₁).

ii) If the cost of control is given by (35), then the minimal cost of reaching (x₁,t₁) from (θₙ,t₀) is given by

38 ‖ũ‖₂² = x₁*W_r(t₀,t₁)⁻¹x₁ ,

where ũ(·) is defined in (37). •



40 Comments. a) Note that (37) gives a control u O. which transfers (xo.to) to
(xl.t l ). In general there are infinitely many others: indeed the transfer requirement (2)
imposes n linearly independent constraints on an infinite dimensional space: hence the
set of all such controls forms a linear variety (affine subspace) of codimension n: any
u=u +v with VE N(L,) will accomplish the transfer.
b) Geometrically u is the least-cost L2 -solution iff II u Ih is the minimum distance
between the origin and the solution variety u +N(L,) of a). Hence u is the least-cost
L 2_ solution iff u is orthogonal to the variety u + N (L,) or equivalent u is orthogonal
to the subspace N (L¥. Now by (A.7.57) R (Lr*) =N (L,).l... Thus u is the least-cost
solution iffu E R(L,). equivalent for some (Cn.
c) The conclusions (37) and (38) are easily verified by calculation. The control u(to.t,)
transfers (xo.to) to (xl.tl) if and only if

41 Xl -¢(tl.to)xO=L,u .

Now by b) above the least-L2 solution u is of the form Substitution in (41)


together with (17) gives

42

Hence the minimum cost u is

which is precisely (37).


d) W_r(t₀,t₁) is a positive semi-definite Hermitian matrix by (17): call λᵢ, vᵢ its eigenvalues and associated orthonormal eigenvectors; then

W_r(t₀,t₁) = Σ_{i=1}^{n} λᵢ vᵢvᵢ* .

From (38) it follows that, for a unit cost (as specified by (35)), we can reach any of the points vᵢ√λᵢ, i = 1,2,…,n. In fact, for a unit cost, starting from the phase (θₙ,t₀) we can reach any point on the ellipsoid whose semi-axes are √λᵢ·vᵢ, i = 1,2,…,n. If we order the eigenvalues as λ₁ ≥ λ₂ ≥ … ≥ λₙ, then the direction vₙ is the most expensive to reach and the direction v₁ is the cheapest. Roughly speaking, the eigenvalues of W_r(t₀,t₁) measure the effectiveness of the actuators in the task of reaching a specified state.
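Computational note. Formula (37) can be checked by direct simulation. The following Python sketch (ours, for a constant pair (A,B), so that Φ(t₁,t) = exp[A(t₁−t)]) builds W_r(t₀,t₁) by quadrature of (17), forms ũ, and integrates ẋ = Ax + Bũ to confirm the transfer of (x₀,t₀) to (x₁,t₁).

    import numpy as np
    from scipy.linalg import expm
    from scipy.integrate import solve_ivp

    A = np.array([[0., 1.], [-2., -3.]])   # arbitrary constant data
    B = np.array([[0.], [1.]])
    t0, t1 = 0.0, 1.0
    x0, x1 = np.array([1., 0.]), np.array([0., 1.])

    # W_r(t0,t1) by midpoint quadrature of (17)
    taus = np.linspace(t0, t1, 2001)
    Wr = np.zeros((2, 2))
    for ta, tb in zip(taus[:-1], taus[1:]):
        M = expm(A * (t1 - 0.5 * (ta + tb))) @ B
        Wr += (tb - ta) * (M @ M.T)

    # (37): u(t) = B* Phi(t1,t)* W_r^{-1} [x1 - Phi(t1,t0) x0]
    xi = np.linalg.solve(Wr, x1 - expm(A * (t1 - t0)) @ x0)
    u = lambda t: B.T @ expm(A.T * (t1 - t)) @ xi

    sol = solve_ivp(lambda t, x: A @ x + B @ u(t), (t0, t1), x0,
                    rtol=1e-10, atol=1e-12)
    print("x(t1) =", sol.y[:, -1], "; target x1 =", x1)
    print("cost  =", float(xi @ Wr @ xi))   # reduces to (38) when x0 = 0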

8.2.3. Stabilization by Linear State Feedback

We are given the representation

51 ẋ(t) = A(t)x(t) + B(t)u(t) .

Suppose that A(·) is not stable and that we wish to stabilize the system by state feedback.
For α ≥ 0, t₀ and t₁ ∈ ℝ₊, let us define

52 H_α(t₀,t₁) := ∫_{t₀}^{t₁} Φ(t₀,t)B(t)B(t)*Φ(t₀,t)* exp[−4α(t−t₀)]dt .

53 Theorem. If A(·),B(·) are piecewise continuous and if

54 ∃ δ > 0, ∃ h_M ≥ h_m > 0 s.t. ∀ t ∈ ℝ₊ h_mI ≤ H₀(t,t+δ) ≤ h_MI

(here H₀ is (52) with α = 0; this condition is usually referred to as "(A(·),B(·)) is strongly uniformly controllable"), then, for any α > 0, the linear state feedback

55 u(t) = −F(t)x(t) = −B(t)*H_α(t,t+δ)⁻¹x(t)

will result in a closed loop system ẋ = (A(t)−B(t)F(t))x(t) =: Ã(t)x(t) such that, ∀ x₀, ∀ t₀,

56 ‖x(t)e^{αt}‖ → 0 exponentially as t → ∞

(i.e. all z-i state trajectories of the feedback system go to zero at an exponential rate, which is faster than exp(−αt)). •

57 Comment. Thus if (A(·),B(·)) satisfies the tight controllability condition (54), then, for any α > 0, there is a linear state feedback that exponentially stabilizes the system and, as t → ∞, ‖x(t)‖ = o(e^{−αt}). Note that, as α increases, the norm of the gain F(·) increases, hence the feedback becomes tighter.

Proof. To establish the claim we will show that the equilibrium point ξ = θₙ of

60 ξ̇(t) = (A(t) − B(t)B(t)*H_α(t,t+δ)⁻¹)ξ(t)

is exponentially stable. Choose as a Lyapunov function

61 V(t,ξ) := ξ*H_α(t,t+δ)⁻¹ξ .

By (54), we have

62 h_M⁻¹‖ξ‖² ≤ V(t,ξ) ≤ e^{4αδ}h_m⁻¹‖ξ‖² .

Thus (t,ξ) ↦ V(t,ξ) is bounded below and above by positive definite quadratic forms on all of ℝ₊ × ℂⁿ. We now have to calculate V̇|₍₆₀₎, the time derivative of V along solutions of (60): in abbreviated notation we have

63 V̇|₍₆₀₎ = ξ̇*H_α⁻¹ξ + ξ*(d/dt H_α⁻¹)ξ + ξ*H_α⁻¹ξ̇, where d/dt H_α⁻¹ = −H_α⁻¹Ḣ_αH_α⁻¹ .

In more detail, dropping the dependence of H_α(t,t+δ) on its arguments and noting that

Ḣ_α = A(t)H_α + H_αA(t)* + 4αH_α − B(t)B(t)* + e^{−4αδ}Φ(t,t+δ)B(t+δ)B(t+δ)*Φ(t,t+δ)* ,

we have

64 V̇|₍₆₀₎ = ξ*[A(t)*H_α⁻¹ + H_α⁻¹A(t) − 2H_α⁻¹B(t)B(t)*H_α⁻¹ − H_α⁻¹Ḣ_αH_α⁻¹]ξ .

After simplifications we obtain

V̇|₍₆₀₎ ≤ −2α ξ(t)*H_α(t,t+δ)⁻¹ξ(t)

65 ≤ −2α h_M⁻¹‖ξ(t)‖² .

From (61) and (65), we conclude that, along any trajectory of (60),

66 V̇|₍₆₀₎/V ≤ −2α h_M⁻¹/(e^{4αδ}h_m⁻¹) = −2α(h_m/h_M)exp(−4αδ) .

Hence, along all solutions of (60), t ↦ V(t,ξ(t)) decays exponentially; in view of (61) and (62), so does t ↦ ‖ξ(t)‖. •

67 Exercise. Show that, for all t ≥ t₀,

e^{−4αδ}H₀(t,t+δ) ≤ H_α(t,t+δ) ≤ H₀(t,t+δ) ,

and hence that (54) yields the bounds (62).

69 Exercise. Show that if γ(·): ℝ₊ → [1/2,∞) is any piecewise continuous function, the feedback law

70 u(t) = −F(t)x(t) = −γ(t)B(t)*H_α(t,t+δ)⁻¹x(t)

has the property stated in Theorem (53).
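Computational note. For a constant pair (A,B), H_α(t,t+δ) in (52) does not depend on t, so the feedback (55) is a constant gain. A minimal Python sketch (ours; A, B, α, δ are arbitrary choices) computes H_α by quadrature and verifies that the closed-loop spectrum of (60) lies in the open left half plane:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0., 1.], [1., 0.]])    # unstable: eigenvalues +1 and -1
    B = np.array([[0.], [1.]])
    alpha, delta = 0.5, 1.0

    # constant (A,B): H_alpha(t,t+delta) = int_0^delta e^{-As} BB* e^{-A*s} e^{-4 alpha s} ds
    ss = np.linspace(0.0, delta, 2001)
    H = np.zeros((2, 2))
    for sa, sb in zip(ss[:-1], ss[1:]):
        sm = 0.5 * (sa + sb)
        M = expm(-A * sm) @ B
        H += (sb - sa) * np.exp(-4 * alpha * sm) * (M @ M.T)

    F = B.T @ np.linalg.inv(H)            # gain of the feedback (55)
    print("sigma(A - BF):", np.linalg.eigvals(A - B @ F))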

8.3. Observability of the Pair (C(·),A(·))

1 We are given the linear time-varying representation R(·) = [A(·),B(·),C(·),D(·)], where A(·),B(·),C(·),D(·) are piecewise continuous; recall the convention that any piecewise continuous function that is zero at all its continuity points is taken to be identically zero.
Given R(·), any input u_{[t₀,t₁]} ∈ U_{[t₀,t₁]} and any initial state x₀ at t₀, the corresponding output is

2 y(t) = C(t)Φ(t,t₀)x₀ + C(t)∫_{t₀}^{t} Φ(t,τ)B(τ)u(τ)dτ + D(t)u(t) .

Suppose that in addition to R(·) we know u(·); then we can calculate the last two terms; thus, w.l.o.g., for the study of the relation between the state x₀ and the output y_{[t₀,t₁]}, we need only consider the first term of (2):

3 y(t) = C(t)Φ(t,t₀)x₀, for t ∈ [t₀,t₁] .

This suggests to define the linear map L_o as follows:

4 L_o: ℂⁿ → PC([t₀,t₁]): x₀ ↦ C(·)Φ(·,t₀)x₀|_{[t₀,t₁]} .

5 We'll say that the state x₀ is unobservable on [t₀,t₁] iff its z-i response is zero on [t₀,t₁].
In view of definition (5), we have

6 x₀ is unobservable on [t₀,t₁] ⇔ x₀ ∈ N(L_o);

equivalently, the set of all states of R(·), or of the pair (C(·),A(·)), that are unobservable on [t₀,t₁] is the linear subspace N(L_o).
Recalling the definition (8.1.16) of a dynamical system D observable on [t₀,t₁], and specializing to the special case of R(·), we say that

7 the pair (C(·),A(·)) is observable on [t₀,t₁] iff, given R(·), ∀ inputs u_{[t₀,t₁]} and ∀ corresponding outputs y_{[t₀,t₁]}, the state x₀ at time t₀ is uniquely determined.


Since L_o is a linear map,

8 (C(·),A(·)) is observable on [t₀,t₁] ⇔ L_o is injective .

As a consequence, the pair (C(·),A(·)) is not observable on [t₀,t₁] iff there is some nonzero state that is unobservable on [t₀,t₁], or equivalently, iff it has a nontrivial subspace (namely, N(L_o)) of unobservable states on [t₀,t₁].

12 Theorem (Characterization of observability). Given (C(·),A(·)), with A(·) and C(·) piecewise continuous (see definition (1)), we have the following equivalences:

13 the pair (C(·),A(·)) is observable on [t₀,t₁]

14 ⇔ N(L_o) = {θₙ}

15 ⇔ N(L_o*L_o) = {θₙ}

16 ⇔ det M_o(t₀,t₁) ≠ 0, where M_o(t₀,t₁) := ∫_{t₀}^{t₁} Φ(t,t₀)*C(t)*C(t)Φ(t,t₀)dt .

Fig. 8.6 Definition of the map L_o and its adjoint L_o*.

Proof. (13) ⇔ (14) is immediate from the definitions (4)-(5).
(14) ⇔ (15) ⇔ (16) follows from the theory of the adjoint in section A.7.4. •

17 Corollary. Let (C(·),A(·)) be observable on [t₀,t₁]; then
a) with y := L_ox₀ the z-i response on [t₀,t₁] due to x₀,

18 ⟨y,y⟩ = x₀*M_o(t₀,t₁)x₀ ;

b) given y_{[t₀,t₁]}, x₀ is retrieved by

19 x₀ = (L_o*L_o)⁻¹L_o*y = M_o(t₀,t₁)⁻¹∫_{t₀}^{t₁} Φ(t,t₀)*C(t)*y(t)dt .

20 Comment. Let λₙ > 0 be the smallest eigenvalue of the positive definite Hermitian matrix M_o(t₀,t₁) and eₙ its corresponding normalized eigenvector: then, for x₀ = eₙ, ‖x₀‖₂ = 1 and its z-i response is s.t. ⟨y,y⟩ = λₙ. So if λₙ << 1, some states are barely observable in case of noisy observations.

Proof. (18) is immediate by computation.
(19) is obtained as follows: by (4), we have

22 L_ox₀ = y .

By observability on [t₀,t₁], L_o*L_o: ℂⁿ → ℂⁿ is a bijection; hence, operating on both sides of (22) by L_o*, we have

(L_o*L_o)x₀ = L_o*y ,

from which (19) follows. •

23 Corollary. Given the pair (C(·),A(·)), the set of all unobservable states on [t₀,t₁] is a subspace of ℂⁿ and, in particular, it is N(L_o) = N(M_o(t₀,t₁)).

Proof. Exercise.

24 Exercise. Using the notations of Theorem (12), show that t₀ ↦ M_o(t₀,t₁) is the solution of the linear matrix differential equation

Ẋ(t) = −A(t)*X(t) − X(t)A(t) − C(t)*C(t), X(t₁) = 0 .
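Computational note. Formula (19) is a practical recipe. The Python sketch below (ours; a constant pair (C,A) and a known x₀ are assumed, so that the "measured" z-i response is generated by (3)) recovers x₀ from the output by quadrature of M_o(t₀,t₁) and of the integral in (19):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0., 1.], [-2., -3.]])
    C = np.array([[1., 0.]])
    t0, t1 = 0.0, 2.0
    x0_true = np.array([1., -1.])         # to be recovered from y alone

    ts = np.linspace(t0, t1, 4001)
    Mo, g = np.zeros((2, 2)), np.zeros(2)
    for ta, tb in zip(ts[:-1], ts[1:]):
        Phi = expm(A * (0.5 * (ta + tb) - t0))
        y = C @ Phi @ x0_true             # z-i response (3)
        Mo += (tb - ta) * (Phi.T @ C.T @ C @ Phi)   # grammian of (16)
        g  += (tb - ta) * (Phi.T @ C.T @ y)         # integral in (19)

    print("recovered x0:", np.linalg.solve(Mo, g))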

8.4. Duality
If we compare Theorems (8.2.26) and (8.3.12), it becomes clear that they are closely related. In fact, they are related by duality.
Consider a linear time-varying representation R(·) = [A(·),B(·),C(·),D(·)], i.e.

1 ẋ(t) = A(t)x(t) + B(t)u(t)

2 y(t) = C(t)x(t) + D(t)u(t)

where, as usual, x(t) ∈ ℂⁿ, u(t) ∈ ℂ^{n_i}, y(t) ∈ ℂ^{n_o}, and the matrix functions A(·),B(·),C(·),D(·) are piecewise continuous as specified in definition (8.3.1). As before, (t,t₀) ↦ Φ(t,t₀) denotes the state transition matrix of A(·).
The dual representation R̃(·) of R(·) is closely related to R(·) in such a way that, roughly speaking, controllability properties of R(·) are closely related to observability properties of R̃(·), and vice-versa. The formal definition of R̃(·) is as follows.

3 The dual of the representation R(·) is the representation R̃(·) defined by

4 R̃(·) = [−A(·)*, −C(·)*, B(·)*, D(·)*] .

Equivalently, R̃(·) is described by

5 −(d/dt)x̃(t) = A(t)*x̃(t) + C(t)*ũ(t)

6 ỹ(t) = B(t)*x̃(t) + D(t)*ũ(t)

where x̃(t) ∈ ℂⁿ, ũ(t) ∈ ℂ^{n_o}, and ỹ(t) ∈ ℂ^{n_i}. The state transition matrix of R̃(·) is (see 2.1.118)

7 Ψ(t,τ) = Φ(τ,t)* .

Note the minus sign in the d.e. (5): we could get rid of it by introducing a reverse time t′ = −t (see [Kal.1]). However, this is not convenient for physical interpretation. Moreover, denoting by R̃̃(·) the dual of the dual-system representation R̃(·), (4), we obtain

8 R̃̃(·) = [A(·), −B(·), −C(·), D(·)] ,

i.e. R̃̃(·) equals R(·) modulo a change of sign for the state. Hence their reachability, controllability-to-zero and observability maps, (8.2.4), (8.2.25) and (8.3.4) resp., are related by

9 L̃̃_r = −L_r, L̃̃_c = −L_c, L̃̃_o = −L_o .

It turns out that controllability to zero on [t₀,t₁] is the dual of unobservability on [t₀,t₁] and vice versa.

10 Theorem [Duality: controllability to zero versus unobservability]. The subspace of all states of R(·) that are controllable to zero (unobservable) on [t₀,t₁] is the orthogonal complement of the subspace of all states of its dual R̃(·) that are unobservable (controllable to zero, resp.) on [t₀,t₁]. In terms of the notations (8.2.25) and (8.3.4), that is

11 [R(L_c)]^⊥ = N(L̃_o) and [R(L̃_c)]^⊥ = N(L_o) ,

or equivalently, by (8.2.32) and (8.3.23),

12 R[W_c(t₀,t₁)]^⊥ = N[M̃_o(t₀,t₁)] and R[W̃_c(t₀,t₁)]^⊥ = N[M_o(t₀,t₁)] .

From (12), we immediately obtain the following corollary.

13 Corollary. The representation R(·) is controllable (observable) on [t₀,t₁]
⇔ its dual R̃(·) is observable (controllable, resp.) on [t₀,t₁].

14 Proof of Theorem (10). a) Calculations show that the state x₀ of R(·) is controllable to zero on [t₀,t₁] iff ∃ u_{[t₀,t₁]} s.t.

15 x₀ = −∫_{t₀}^{t₁} Φ(t₀,τ)B(τ)u(τ)dτ ,

equivalently, iff x₀ ∈ R(L_c), where (see (8.2.25))

16 L_c: u_{[t₀,t₁]} ↦ ∫_{t₀}^{t₁} Φ(t₀,τ)B(τ)u(τ)dτ .

Now, by (8.3.6), the state x̃₀ of R̃(·) is unobservable on [t₀,t₁] iff L̃_ox̃₀ = θ, more precisely iff

17 B(t)*Ψ(t,t₀)x̃₀ = B(t)*Φ(t₀,t)*x̃₀ = θ_{n_i} ∀ t ∈ [t₀,t₁] .

But, by (8.2.26), (17) is equivalent to: x̃₀ ∈ N(W_c(t₀,t₁)) = N(L_c*). Now, by (A.7.57), R(L_c) = N(L_c*)^⊥. Thus we have established (since L̃_o = L_c*) R(L_c) = N(L̃_o)^⊥. Hence the first equality in (11) and in (12) hold.
b) To obtain the others, repeat the reasoning of a) applied to the dual R̃(·), (4), and the dual of the dual R̃̃(·), (8), for which (9) holds. Hence, by the first equality (11),

[R(L̃_c)]^⊥ = N(L̃̃_o) = N(L_o) ,

i.e. the second equality (11) holds. The second equality (12) follows similarly. •

Thus the notion of "x₀ is controllable to zero on [t₀,t₁]" is related by duality with "x₀ is unobservable on [t₀,t₁]." Also "R(·) is controllable on [t₀,t₁]" is the dual of "R̃(·) is observable on [t₀,t₁]."
Now what is the notion that is related by duality with "x₁ is reachable (from zero) on [t₀,t₁]"? It turns out that the required notion is "x₁ is unreconstructible on [t₀,t₁]."

20 We say that the state x₁ is unreconstructible on [t₀,t₁] iff the z-i response on [t₀,t₁] which corresponds to the final phase (x₁,t₁) is zero, equiv.

21 C(t)Φ(t,t₁)x₁ = θ_{n_o} ∀ t ∈ [t₀,t₁] .

So if we let

22 L_rec: x₁ ↦ C(·)Φ(·,t₁)x₁|_{[t₀,t₁]} ,

we have

23 x₁ is unreconstructible on [t₀,t₁] ⇔ L_rec(x₁) = θ ⇔ x₁ ∈ N[M_rec(t₀,t₁)]

where

24 M_rec(t₀,t₁) := ∫_{t₀}^{t₁} Φ(t,t₁)*C(t)*C(t)Φ(t,t₁)dt .

25 Theorem. (R(L_r) = N(L̃_rec)^⊥.)
The subspace of all states of R(·) that are reachable (from zero) on [t₀,t₁] is the orthogonal complement of the subspace of all states of R̃(·) that are unreconstructible on [t₀,t₁].
In terms of the notations of (8.2.4) and (22), that is

26 [R(L_r)]^⊥ = N(L̃_rec) ,

or equivalently,

27 R[W_r(t₀,t₁)]^⊥ = N[M̃_rec(t₀,t₁)] .

Proof. The state x₁ is reachable on [t₀,t₁] ⇔ x₁ ∈ R(L_r)

⇔ x₁ = ∫_{t₀}^{t₁} Φ(t₁,τ)B(τ)u(τ)dτ, for some u_{[t₀,t₁]} .

The state x̃₁ of R̃(·), the dual of R(·), is unreconstructible on [t₀,t₁]

⇔ B(t)*Φ(t₁,t)*x̃₁ = θ_{n_i} ∀ t ∈ [t₀,t₁] .

By the theory of the adjoint (see section A.7.4), R(L_r) = N(L_r*)^⊥ = N(L̃_rec)^⊥. The remainder of the theorem follows easily. •

8.5. Linear Time-Invariant Systems

We consider now linear time-invariant systems represented by R = [A,B,C,D], where A,B,C, and D are constant matrices with elements in ℝ or ℂ.
Because of the time-invariance, it will turn out that a state of R is controllable to zero (unobservable) on some interval [t₀,t₁] ⇔ that state is controllable to zero (unobservable, resp.) on any nonzero interval of ℝ. (See the proof of Theorem (9), part (i).) As a result we will drop the phrase "on [t₀,t₁]" everywhere.
Two matrices play a crucial role in the developments:

1 the controllability matrix: C := [B : AB : A²B : … : A^{n−1}B] ∈ ℂ^{n×nn_i} ;

2 the observability matrix: O := [C ; CA ; CA² ; … ; CA^{n−1}] ∈ ℂ^{nn_o×n}

(the row blocks C, CA, …, CA^{n−1} are stacked on top of one another).

3 Lemma. The subspaces R(C) and N(O) are A-invariant (i.e. x ∈ R(C) ⇒ Ax ∈ R(C), and x ∈ N(O) ⇒ Ax ∈ N(O)).

Proof. In both cases the proof is based on the Cayley-Hamilton Theorem (3.2.24). The A-invariance of N(O) is straightforward. Let us prove it for R(C). By (1), we have

4 x ∈ R(C) ⇔ ∃ (v_k)₀^{n−1}, v_k ∈ ℂ^{n_i}, s.t. x = Σ_{k=0}^{n−1} A^kBv_k .

We must show that Ax ∈ R(C). Multiply the equation in (4) by A on the left. Now, by Cayley-Hamilton, Aⁿ is a linear combination of (A^k)₀^{n−1}; hence, after some rearrangements, we see that, for some w_k ∈ ℂ^{n_i}, Ax = Σ_{k=0}^{n−1} A^kBw_k, i.e. Ax ∈ R(C). •
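Computational note. The matrices (1) and (2) are easily formed numerically. The following Python sketch (ours; the data are arbitrary) builds them and computes their ranks, which are used in the tests of Theorems (9) and (36) below:

    import numpy as np

    def ctrb(A, B):
        # controllability matrix (1): [B, AB, ..., A^{n-1} B]
        n = A.shape[0]
        blocks, M = [], B
        for _ in range(n):
            blocks.append(M)
            M = A @ M
        return np.hstack(blocks)

    def obsv(C, A):
        # observability matrix (2): [C; CA; ...; C A^{n-1}]
        n = A.shape[0]
        blocks, M = [], C
        for _ in range(n):
            blocks.append(M)
            M = M @ A
        return np.vstack(blocks)

    A = np.array([[0., 1.], [-2., -3.]])
    B = np.array([[0.], [1.]])
    C = np.array([[1., 0.]])
    print("rk C =", np.linalg.matrix_rank(ctrb(A, B)))   # = n iff (A,B) controllable
    print("rk O =", np.linalg.matrix_rank(obsv(C, A)))   # = n iff (C,A) observable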

8.5.1. Observability Properties of the Pair (C,A)

We now state a theorem that asserts four properties related to observability. Some properties will be proved in later sections.

9 Theorem (Observability properties of the pair (C,A)). Given the pair (C,A),

10 i) the set of all unobservable states is the A-invariant subspace N(O) ⊂ ℂⁿ;

11 ii) the pair (C,A) is observable ⇔ rk O = n

12 ⇔ ∀ λ ∈ σ(A), rk [ (λI−A) ; C ] = n (λI−A stacked on top of C);

iii) for C and A real, for any monic† real polynomial π of degree n, there exists L ∈ ℝ^{n×n_o} s.t.

13 χ_{A+LC} = π

if and only if the pair (C,A) is observable;

iv) let, in addition, σ(A) ⊂ ℂ̊₋ (the open left half plane); the pair (C,A) is observable if and only if M_o, the unique solution of

14 A*M + MA + C*C = 0 ,

is positive definite.

Before proving Theorem (9), let us consider some aspects which illustrate the dynamic consequences of the assertions of the theorem.

20 Remark on the extraction of the unobservable part. Let us assume that (i) holds and that dim N(O) = r < n: the unobservable subspace is r-dimensional. Choose a basis for N(O) and precede these r basis-vectors with n−r vectors from ℂⁿ so that we have a new basis for ℂⁿ. Suppose, for simplicity, that D = 0. Since a) N(O) is A-invariant and since, by (2), N(C) ⊃ N(O), i.e. the last r basis vectors are in the nullspace of C, in the new basis the system is represented by

21 ẋ¹(t) = A₁₁x¹(t) + B₁u(t), ẋ²(t) = A₂₁x¹(t) + A₂₂x²(t) + B₂u(t)

22 y(t) = C₁x¹(t)

† A polynomial is monic iff the coefficient of its highest degree term is 1.

where x¹ ∈ ℂ^{n−r} and x² ∈ ℂʳ. Equations (21) and (22) imply the connections indicated on the block diagram of Fig. 8.7.

Fig. 8.7 Extracting the unobservable part.

As the figure shows, we have extracted an r-dimensional subsystem from R, and the state x² of that subsystem is unobservable. Since, by (i), the set of all unobservable states is of dimension r, the pair (C₁,A₁₁) is observable.

23 Comments. a) The rank test (11) is not numerically appealing: suppose that the ratio |λ_max/λ_min| =: η for A is, say, 10; then, by the spectral mapping theorem (4.7.1), for A^{n−1} the corresponding ratio becomes η^{n−1}. Thus, when we calculate A^{n−1}, we are going to add and subtract numbers that differ by many orders of magnitude. Consequently we should expect to lose a lot of significant information in the round-off process.
b) The rank test (12) is more appealing because the QR algorithm will give reliable eigenvalues and by singular value decomposition the rank can be realistically evaluated [Gol.1].
c) The interpretation of (13) is the following: suppose that, in the original system ẋ = Ax + Bu, we add Ly = LCx to the input of the integrators (this is called "constant output injection"); we then obtain ẋ = (A+LC)x + Bu, thus the resulting spectrum is now σ(A+LC), i.e. the set of roots of χ_{A+LC}. Note that, if R is not observable, it is obvious from (11), (12) and Fig. 8.7 that no amount of output injection of y = C₁x¹ will change the spectrum of the unobservable part! In fact, partitioning the matrix L into two submatrices L₁, L₂, we find that A+LC has the form

24 A + LC = [ A₁₁+L₁C₁  0 ; A₂₁+L₂C₁  A₂₂ ] .

Thus, for all L, the elements of σ(A₂₂) are elements of σ(A+LC), the set of closed-loop eigenvalues. Thus we see that constant output injection does not affect the eigenvalues of the unobservable part.
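Computational note. The output-injection interpretation of (13) can be exercised numerically with scipy's eigenvalue-assignment routine (a sketch, ours; place_poles assigns σ(A−BK), so we apply it to the transposed pair, using σ(A+LC) = σ(A*+C*L*)*):

    import numpy as np
    from scipy.signal import place_poles

    A = np.array([[0., 1.], [-2., -3.]])
    C = np.array([[1., 0.]])

    res = place_poles(A.T, C.T, [-5.0, -6.0])   # assigns sigma(A.T - C.T K)
    L = -res.gain_matrix.T                      # then sigma(A + LC) = {-5, -6}
    print("sigma(A + LC):", np.linalg.eigvals(A + L @ C))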

Proof of Theorem (9).

(i) Let t₀ and t₁ be arbitrary in ℝ with t₁ > t₀. By definition (8.3.5), the state x₀ is unobservable on [t₀,t₁] iff its z-i response is zero on [t₀,t₁], that is, in the present case, iff

25 C exp[A(t−t₀)]x₀ = θ_{n_o} ∀ t ∈ [t₀,t₁] .

Expanding t ↦ C exp[A(t−t₀)]x₀ in a Taylor series about t₀, we obtain a Taylor series that is identically zero on [t₀,t₁] with t₁ > t₀. This is equivalent to having all the coefficients of the Taylor series equal to zero, or equivalently,

26 CA^kx₀ = θ_{n_o} for k = 0,1,2,… .

By the Cayley-Hamilton theorem, this is equivalent to

27 CA^kx₀ = θ_{n_o} for k = 0,1,…,n−1 .

These n conditions are themselves equivalent to x₀ ∈ N(O). So we have shown that any unobservable state x₀ is in N(O). Conversely, if x₀ ∈ N(O), then (27) holds and, by the reasoning above, x₀ satisfies (25); hence x₀ is not observable. •

ii) (C,A) is observable ⇔ N(O) = {θₙ}, by (8.3.6), (8.3.7) and i) above,
⇔ the n columns of O are linearly independent
⇔ rk(O) = n (O ∈ ℂ^{nn_o×n}) .

We prove the equivalence (12) by contradiction.
(⇒) Suppose ∃ λⱼ ∈ σ(A) s.t. the rank is smaller than n; then there exists an eigenvector eⱼ ≠ θₙ s.t. Ceⱼ = θ_{n_o} and Aeⱼ = λⱼeⱼ. Thus t ↦ C exp(λⱼt)eⱼ, the z-i response of (C,A) from eⱼ at 0, is identically zero; equivalently, eⱼ ≠ θₙ is unobservable. And the contradiction is established.
(⇐) Use contradiction again: assume that the pair (C,A) is not observable. Separate the unobservable part as in Remark (20) above: then (21) and (22) show that, for any λ ∈ σ(A₂₂) ⊂ σ(A), there is an eigenvector e_λ of A₂₂ which induces an unobservable state (θ, e_λ) of (C,A).

(iii) This follows by Theorem (10.2.7) below.

(iv) Since σ(A) ⊂ ℂ̊₋, by (4.8.1) the spectrum of the linear map X ↦ A*X + XA is in the open left half plane; hence this map is bijective and the solution of (14) is unique. Furthermore, it is given by

M_o = ∫₀^∞ exp(A*t)·C*C·exp(At)dt .

Now M_o is recognized to be the observability grammian M_o(0,∞) in (8.3.16). By inspection M_o is positive semi-definite. By Theorem (8.3.12) applied to the pair (C,A), (C,A) is observable if and only if det M_o ≠ 0, equivalently, iff M_o = M_o* > 0. •
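Computational note. Part (iv) yields a numerically attractive observability test: solve the linear (Lyapunov) equation (14) and check positive definiteness. A short sketch in Python (ours; the data are arbitrary, with σ(A) in the open left half plane as required):

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    A = np.array([[-1., 1.], [0., -2.]])      # sigma(A) = {-1, -2}
    C = np.array([[1., 0.]])

    # (14): A* M + M A + C* C = 0
    Mo = solve_continuous_lyapunov(A.T, -C.T @ C)
    print("eig(Mo):", np.linalg.eigvalsh(Mo))  # all > 0 iff (C,A) observable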

8.5.2. Controllability of the Pair (A,B)

As in the case of the set of unobservable states, for any [t₀,t₁] with t₁ > t₀, x₀ is controllable to zero on [t₀,t₁] if and only if x₀ is controllable to zero on any nonzero interval. Furthermore, because of A-invariance, it will turn out that

34 the subspace of all states controllable to zero is identical to the subspace of all states reachable from zero. Hence, for the time-invariant pair (A,B), we say "the subspace of controllable states," or, for short,

35 the controllable subspace of the pair (A,B).

36 Theorem (Controllability properties). Given R = [A,B,C,D],

37 i) the controllable subspace of the pair (A,B) is the A-invariant subspace R(C);

ii) the pair (A,B) is controllable

38 ⇔ rk[C] = n

39 ⇔ ∀ λ ∈ σ(A), rk[λI−A : B] = n ;

iii) for A,B real, for any monic real polynomial π of degree n, there exists F ∈ ℝ^{n_i×n} such that

40 χ_{A+BF} = π

if and only if the pair (A,B) is controllable;

iv) let σ(A) ⊂ ℂ̊₋; the pair (A,B) is controllable ⇔ W_c, the unique solution of

41 AW + WA* + BB* = 0 ,

is positive definite.

44 Remark on the extraction of the controllable part. Assume that i) of Theorem (36) holds. Let dim R(C) = γ < n: the controllable subspace is γ-dimensional. Choose a basis for R(C) and complete it with n−γ vectors from ℂⁿ so as to obtain a basis for ℂⁿ. Now R(C) is A-invariant by (3), and, by (1), R(C) ⊃ R(B); hence R(B) is in the subspace generated by the first γ basis vectors. Consequently, in this new basis, the system representation is of the form

45 ẋ¹(t) = A₁₁x¹(t) + A₁₂x²(t) + B₁u(t), ẋ²(t) = A₂₂x²(t)

46 y(t) = C₁x¹(t) + C₂x²(t) .

Fig. 8.8 Extracting the uncontrollable part.

As Fig. 8.8 shows, we have extracted an (n−γ)-dimensional subsystem that is totally unaffected by the input u, hence is uncontrollable. Since, by (i), the set of controllable states is γ-dimensional, x¹ ∈ ℂ^γ and all the states of the second (n−γ)-dimensional subsystem are unaffected by u, the pair (A₁₁,B₁) is controllable.
In Remark (20) and Remark (44) above, we used changes of coordinates to exhibit the fact that we could extract the unobservable part and the controllable part of the system. These questions can also be analyzed algebraically.

48 Exercise. We are given R = [A,B,C,D] with state x, and we consider a transformation of coordinates. Let x̄ = (x̄¹,x̄²) be the list of the components of the state vector with respect to the new coordinates. Let x̄ = Tx; of course, det T ≠ 0 and x = T⁻¹x̄; hence the ith column of T⁻¹ consists of the component list of the ith new basis vector with respect to the old basis. Then A ↦ TAT⁻¹ =: Ā, B ↦ TB =: B̄, C ↦ CT⁻¹ =: C̄, and D̄ = D.
a) Show that

C̄ = TC and Ō = OT⁻¹ .

(Hence rk C̄ = rk C, rk Ō = rk O, and ŌC̄ = OC is invariant under changes of coordinates.)
b) Refer to (45) and (46); call C₁ the controllability matrix of (A₁₁,B₁). Show that the assumptions in (44) imply that rk C₁ = γ, i.e. the pair (A₁₁,B₁) is controllable.
c) Refer to (21) and (22); call O₁ the observability matrix of the pair (C₁,A₁₁). Show that the assumptions in Remark (20) imply that rk O₁ = n−r, i.e. the pair (C₁,A₁₁) is observable.

49 Comments. a) The rank test (38) is not numerically appealing for the reasons given in Comment (23a) above.
b) For the same reasons as in Comment (23b) above, the rank test (39) is much more appealing.
c) The interpretation of (iii) is through the idea of "constant state feedback." Suppose that the state variables are available (from, say, measurements), that we calculate Fx for a given F ∈ ℝ^{n_i×n}, and that we feed back Fx to the input: the resulting feedback system is, with exogenous input set to zero,

ẋ = (A+BF)x .

Thus (iii) asserts that the pair (A,B) is controllable ⇔ we can always choose F so that the closed-loop characteristic polynomial χ_{A+BF} has as roots a list of n preassigned points in ℂ; of course, these n points must be located symmetrically with respect to the real axis because the polynomial χ_{A+BF} has real coefficients. In particular, given any unstable A with (A,B) controllable, we can always stabilize it by constant state feedback.
d) (iii) says that for any polynomial π, i.e. for any configuration Γ of its n roots, we may find an F such that σ(A+BF) = Γ. A word of caution is required. Suppose we have a system with ĥ(s) = (s+1)⁻¹, i.e. R = [−1,1,1,0] and y = x. Suppose we wish to broaden the bandwidth by a factor of 10³; then we set up the system shown in Fig. 8.9.

Fig 8.9 Limitations of linear state feedback.

The closed-loop transfer function is ĥ_c(s) = 999/(s+10³); note that ĥ_c(0) ≈ 1 and the 3 db bandwidth is 10³ rad/s. However, suppose we start from zero initial conditions and u(t) = 1(t), a unit step; then obviously, at t = 0+, e₂(0+) = 999. Such a large e₂ probably completely saturates the system and the linear model is no longer valid. The moral is: "state feedback can move the closed-loop eigenvalues, but large motions (here from −1 to −10³) may lead to saturation, sensitivity problems, etc." So engineers watch out!
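Computational note. Assertion (iii) and the caution of Comment d) can both be seen numerically. The sketch below (ours; scipy's place_poles computes a gain with σ(A−BK) prescribed, applied here to an arbitrary double-integrator pair) shows that moving the closed-loop eigenvalues far to the left produces very large gains, the phenomenon behind Fig. 8.9:

    import numpy as np
    from scipy.signal import place_poles

    A = np.array([[0., 1.], [0., 0.]])        # double integrator
    B = np.array([[0.], [1.]])

    for poles in ([-2.0, -3.0], [-200.0, -300.0]):
        res = place_poles(A, B, poles)        # sigma(A - B K) = poles
        F = -res.gain_matrix                  # u = Fx gives sigma(A + BF) = poles
        print(poles, " gain F =", F)          # fast poles  =>  huge gains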

Proof of Theorem (36). (i) By Theorem (8.4.10) of duality theory, we know that, for any [t₀,t₁] with t₁ > t₀,

51 R(L_c) = { subspace of states of R controllable to zero on [t₀,t₁] }
         = { subspace of states of R̃ unobservable on [t₀,t₁] }^⊥
         = N(Õ)^⊥ by (10) .

Thus R(L_c) is a fixed (independent of t₀,t₁) A-invariant subspace of ℂⁿ, since N(Õ) is a fixed A*-invariant subspace of ℂⁿ.
By the theory of the adjoint,

52 N(Õ)^⊥ = R(Õ*) = R(C)

where the last equality follows by calculation (Õ is the observability matrix of the dual, built on the pair (B*,−A*)). Now, from (8.2.33) and w.l.g. with t₀ = 0,

53 R(L_r) = e^{At₁}R(L_c) .

We claim that e^{At₁}R(L_c) = R(L_c) because R(L_c) is A-invariant: indeed, a) let x ∈ R(L_c); then e^{At₁}x = p(A)x ∈ R(L_c), where p(·) is the suitable polynomial needed to evaluate e^{At₁}. This shows that e^{At₁}R(L_c) ⊂ R(L_c).
b) Rewrite the claim as R(L_c) = e^{−At₁}R(L_c). Use the same argument to obtain e^{−At₁}R(L_c) ⊂ R(L_c), equivalently R(L_c) ⊂ e^{At₁}R(L_c). Finally we have, by (53),

R(L_r) = R(L_c) = R(C) .

In summary, we have shown that, for any time-invariant R,
1) the subspace of states controllable to zero is independent of t₀ and t₁;
2) the subspace of reachable states is independent of t₀ and t₁; and
3) these two fixed subspaces are equal, and equal to R(C).
ii), iii) and iv) follow directly from ii), iii) and iv) of Theorem (9) by duality: indeed, by Corollary (8.4.13), R = [A,B,C,D] is controllable if and only if R̃ = [−A*,−C*,B*,D*] is observable, etc. •

54 Exercise (Stabilization by constant state feedback). Given ẋ = Ax + Bu, show that (A,B) is controllable ⇔ there exists a constant state feedback that stabilizes R.
(Hint: Define, for any α > 0 and τ > 0,

H_α := ∫₀^τ e^{−4αt}e^{−At}BB*e^{−A*t}dt = ∫₀^τ m(t)dt ;

H_α is positive definite. Choose the Lyapunov function V(x) = x*H_α⁻¹x. To evaluate V̇, calculate ∫₀^τ (d/dt)m(t)dt in two ways. Choose the state feedback F = −B*H_α⁻¹.)

55 Remark. In several engineering applications, it is possible to establish the controllability of a nonlinear system by using linear theory: the trick is to come up with a memoryless state feedback that linearizes the system and then use Theorem (36) and Theorem (8.1.6).

Example. Consider the standard approximate model of an interconnected power system

M_kθ̈_k + D_kθ̇_k = u_k − Σ_{i≠k} B_{ki} sin(θ_k − θ_i), k = 1,…,n

where M_k = moment of inertia of the kth generator,
D_k = damping constant of the kth generator,
u_k = applied torque (from the turbine driving the generator),
B_{ki} = normalized susceptance of the transmission line connecting generator k to generator i,
θ_k = rotor angle of the kth generator with respect to a synchronously rotating reference frame.
Clearly the state is of dimension 2n: (θ₁,θ̇₁,…,θₙ,θ̇ₙ). Assuming that we know the state, we leave it as an exercise (see the sketch below) to propose a (possibly nonlinear) memoryless state feedback to linearize the resulting feedback system and to show that it is controllable.
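Computational note. One possible linearizing feedback (our sketch, not necessarily the one the exercise intends) is u_k = Σ_{i≠k} B_{ki} sin(θ_k − θ_i) + v_k, which reduces each generator to M_kθ̈_k + D_kθ̇_k = v_k, a linear time-invariant system. For two machines (illustrative constants), controllability of the linearized model is then checked as in Theorem (36):

    import numpy as np

    # after u_k = sum_i B_ki sin(th_k - th_i) + v_k, each machine obeys
    # M_k th_k'' + D_k th_k' = v_k; state: (th_1, th_1', th_2, th_2')
    M1, D1, M2, D2 = 1.0, 0.1, 2.0, 0.3       # illustrative constants
    A = np.array([[0., 1., 0., 0.],
                  [0., -D1 / M1, 0., 0.],
                  [0., 0., 0., 1.],
                  [0., 0., 0., -D2 / M2]])
    B = np.array([[0., 0.],
                  [1. / M1, 0.],
                  [0., 0.],
                  [0., 1. / M2]])
    Ct = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(4)])
    print("rk C =", np.linalg.matrix_rank(Ct))    # 4 = 2n: controllable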

8.6. Kalman Decomposition Theorem

The Kalman decomposition theorem is an important conceptual tool because it clarifies a number of concepts and problems.
We consider a linear time-invariant representation R = [A,B,C,0], where for simplicity we take D = 0. (If D ≠ 0, it will become clear that its contribution can be added at the end.)
The intuition is the following: from Remark (8.5.20) we know how to separate the unobservable part; from Remark (8.5.44) we know how to separate the controllable part. Suppose we do both at the same time: we expect four interacting subsystems (see Fig. 8.10), which from now on will be labeled, using obvious notations, c.o., c.ō., c̄.o. and c̄.ō. How these subsystems will interact among themselves is not intuitively obvious.

Analysis. The analysis rests on two pillars: 1) R(C) is the subspace of all controllable states and N(O) is the subspace of all unobservable states; 2) the second representation theorem: R(C) and N(O) are A-invariant.
The idea is to create four subspaces, say L_co, L_cō, L_c̄o and L_c̄ō, such that

1 L_co ⊕ L_cō ⊕ L_c̄o ⊕ L_c̄ō = ℂⁿ

where R(C) = L_co ⊕ L_cō and N(O) = L_cō ⊕ L_c̄ō.

Fig. 8.10 Block diagram of the Kalman decomposition.

We proceed in 4 steps.

2 Step I. L_cō := R(C) ∩ N(O).

This subspace is uniquely defined.

Step II. Choose L_co such that

3 L_co ⊕ L_cō = R(C) .

Note that L_co is not uniquely determined.

Step III. Choose L_c̄ō such that

4 L_cō ⊕ L_c̄ō = N(O) .

Again L_c̄ō is not uniquely determined.

Step IV. Finally choose L_c̄o such that (1) holds. Again L_c̄o is not uniquely determined.
Let us pick a basis in each of these four subspaces and use the union of these bases as a basis for ℂⁿ; then the components of the state vector will be written x̄ = (x̄₁ᵀ,x̄₂ᵀ,x̄₃ᵀ,x̄₄ᵀ)ᵀ; similarly Ā is split into 16 submatrices, and B̄ and C̄ into four submatrices B̄ᵢ and C̄ᵢ, i = 1,2,3,4. In the new coordinate system, the controllability (observability) matrix is denoted by C̄ (Ō).
By (3), we see that the A-invariant subspace R(C) is spanned by the basis vectors of L_co and L_cō, the first two subspaces of the direct sum (1); hence the bottom left block of Ā is zero:

8 [ Ā₃₁ Ā₃₂ ; Ā₄₁ Ā₄₂ ] = [ 0 0 ; 0 0 ]

and, since R(B̄) ⊂ R(C̄), the two bottom submatrices of B̄ are zero:

9 B̄₃ = B̄₄ = 0 .

By (4), the A-invariant subspace N(Ō) is spanned by the basis vectors of L_cō and L_c̄ō, the second and the fourth subspaces in the direct sum (1). Hence Ā₂₂, Ā₂₄, Ā₄₂ and Ā₄₄ are in general nonzero and

10 [ Ā₁₂ Ā₁₄ ; Ā₃₂ Ā₃₄ ] = [ 0 0 ; 0 0 ] .

Since N(Ō) ⊂ N(C̄), all the basis vectors in N(Ō) are in N(C̄); hence

11 C̄ = [ C̄₁ : 0 : C̄₃ : 0 ] .

Thus, in the new basis, the given representation R becomes a new representation R̄ = [Ā,B̄,C̄,0] where

12 Ā := [ Ā₁₁ 0 Ā₁₃ 0 ; Ā₂₁ Ā₂₂ Ā₂₃ Ā₂₄ ; 0 0 Ā₃₃ 0 ; 0 0 Ā₄₃ Ā₄₄ ], B̄ := [ B̄₁ ; B̄₂ ; 0 ; 0 ]

13 C̄ := [ C̄₁ : 0 : C̄₃ : 0 ] .

These equations lead to the block diagram shown in Fig. 8.11: note that the input affects only blocks 1 and 2, the output is only affected by blocks 1 and 3; all arrows going from any block to any other block are from right to left.

16 Theorem. Given any linear time-invariant representation R = [A,B,C,0] of dimension n, there is a basis of the state space with respect to which it becomes R̄ = [Ā,B̄,C̄,0], where Ā, B̄ and C̄ are given by (12) and (13). Let nᵢ be the number of columns in the ith column block of Ā. Then

Fig. 8.11 Kalman decomposition.

17 i) R₁ := [Ā₁₁,B̄₁,C̄₁,0] of dimension n₁ is controllable, observable and zero-state equivalent to R; in particular,

18 C(sI−A)⁻¹B = C̄₁(sI−Ā₁₁)⁻¹B̄₁ ;

19 ii) R₁₃ := [ [Ā₁₁ Ā₁₃ ; 0 Ā₃₃], [B̄₁ ; 0], [C̄₁ : C̄₃], 0 ], of dimension n₁+n₃, is observable and zero-state equivalent to R;

20 iii) R₁₂ := [ [Ā₁₁ 0 ; Ā₂₁ Ā₂₂], [B̄₁ ; B̄₂], [C̄₁ : 0], 0 ], of dimension n₁+n₂, is controllable;

21 iv) R₃₄ := [ [Ā₃₃ 0 ; Ā₄₃ Ā₄₄], [0], [C̄₃ : 0], 0 ] is completely uncontrollable, in the sense that none of its nonzero states is reachable.

Proof. For k = 0,1,2,…, (12) gives, in column blocks,

24 Ā^kB̄ = ( Ā₁₁^kB̄₁ , X , 0 , 0 )ᵀ

where X is a complicated expression involving k and the Āᵢⱼ's.


i) To see the zero-state equivalence of Rand R I , consider (12): if x(0)=6 n, then
Vu(O.oo)' x3(t)=6 and x4(t)=6, 'It x2(t) is in general nonzero but both xI(t)
and y(t) are unaffected. Hence, V uO, R 1 and R have the same z-s responses. Hence
(18) follows immediately.
By (3) t, rk C (A ,8) = nl + n2 and, by (9), its bottom n3 + n4 rows are identically
zero; hence its first nl +n2 rows are linearly independent. By calculation, using (24),
the first nl rows of C (A ,8) consists precisely in the first nl rows of C(A 11,8 1): hence
rkC(A 11,8 1)=n1' equiv. Rl is controIlable.
A similar reasoning based on (4) and examining the columns of 0 (C ,A) shows
that R 1 is observable.
Statements (ii), (iii) and (iv) follow easily. •
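Computational note. Steps I-IV can be carried out numerically with orthonormal bases. The Python sketch below is ours (the completions in Steps II-IV are made orthogonally for convenience, which is permitted since they are not unique). For the diagonal example chosen, each of the four subsystems is one-dimensional and the printed Ā, B̄, C̄ exhibit the zero patterns of (12)-(13):

    import numpy as np
    from scipy.linalg import orth, null_space

    A = np.diag([-1., -2., -3., -4.])
    B = np.array([[1.], [1.], [0.], [0.]])     # controllable part: e1, e2
    C = np.array([[1., 0., 1., 0.]])           # observable part:   e1, e3
    n = 4

    Ct = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    Ob = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
    Rc, No = orth(Ct), null_space(Ob)          # bases of R(C) and N(O)

    def meet(U, V):
        # orthonormal basis of R(U) ∩ R(V): solve U a = V b
        ns = null_space(np.hstack([U, -V]))
        return orth(U @ ns[:U.shape[1], :]) if ns.size else np.zeros((n, 0))

    def complement_in(U, W):
        # orthogonal complement of R(W) inside R(U)
        P = U - W @ (W.T @ U) if W.size else U
        return orth(P) if np.linalg.norm(P) > 1e-10 else np.zeros((n, 0))

    L_cobar    = meet(Rc, No)                  # Step I  : R(C) ∩ N(O)
    L_co       = complement_in(Rc, L_cobar)    # Step II : completes R(C)
    L_cbarobar = complement_in(No, L_cobar)    # Step III: completes N(O)
    L_cbaro    = null_space(np.hstack([L_co, L_cobar, L_cbarobar]).T)  # Step IV

    Tinv = np.hstack([L_co, L_cobar, L_cbaro, L_cbarobar])  # basis, in the order of (1)
    T = np.linalg.inv(Tinv)
    print(np.round(T @ A @ Tinv, 8))           # block pattern of (12)
    print(np.round(T @ B, 8), np.round(C @ Tinv, 8))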

27 Exercise. Consider the linear time-invariant circuit shown in Fig. 8.12. Let x₁, x₂, x₃, x₄ denote the voltages across the capacitors and the currents in the inductors as shown in Fig. 8.12. Use the Kirchhoff laws and v_L = L(di_L/dt), i_C = C(dv_C/dt) to obtain the state equations. The current source delivers u amperes to the circuit. The output voltage is labeled y.
a) Determine the subspace of controllable states and that of unobservable states.
b) Obtain the Kalman decomposition and identify physically the unobservable modes.

28 Exercise on constant state feedback. Given R̄ = [Ā,B̄,C̄,0] with Ā,B̄,C̄ given by (12) and (13), consider a constant state-feedback matrix F_s = [F₁ : F₂ : F₃ : F₄].
a) Draw the block diagram with the state feedback u = v + F_sx̄ applied to the system of Fig. 8.11. Based on the diagram, show that F_s can only affect the eigenvalues associated with x̄₁ and x̄₂.
b) Write the state equations for the feedback system drawn in a) and show that its characteristic polynomial χ_c is given by

29 χ_c = χ_M · χ_{Ā₃₃} · χ_{Ā₄₄}, where M := [ Ā₁₁+B̄₁F₁  B̄₁F₂ ; Ā₂₁+B̄₂F₁  Ā₂₂+B̄₂F₂ ] .

Fig. 8.12 Circuit example.

† C(A,B) denotes the controllability matrix of the pair (A,B).

30 Exercise on constant output feedback. As in the previous exercise, consider R̄; apply the constant output feedback u = v + F_oy.
a) Using the block diagram, show that F_o can only affect the eigenvalues of x̄₁.
b) Show that the characteristic polynomial χ_o is given by

31 χ_o = χ_{Ā₁₁+B̄₁F_oC̄₁} · χ_{Ā₂₂} · χ_{Ā₃₃} · χ_{Ā₄₄} .
32 Comments. (We use the notations of Exercises (28) and (30).)
a) The conclusion a) of each exercise still holds if one replaces the constant feedback matrices F_s and F_o by dynamical feedbacks R_s and R_o. State space techniques are not convenient to establish this fact.
b) Thus we see that state feedback can only affect the dynamics of the controllable part of the system, and that output feedback can only affect the dynamics of the part that is controllable and observable. This teaches us that the choice and location of actuators (which determine B) are crucial in the effectiveness of state feedback in modifying the dynamics of a given system. Similarly, the choice and location of both actuators and sensors (which determine B and C) are crucial in the effectiveness of output feedback in modifying the dynamics of a given system.

8.7. Hidden Modes, Stabilizability, and Detectability

Consider the time-invariant system representation R = [A,B,C,D]. Let σ[A] =: {λ_k} be the spectrum of A ∈ ℂ^{n×n} with corresponding algebraic eigenspaces

1 N_k := N[(A−λ_kI)^{m_k}]

(see (4.3.3)). Any nonzero vector of N_k is called

2 a generalized eigenvector of A at λ_k, (4.5.3), and any solution of ẋ = Ax (i.e. exp[At]x₀) generated by a nonzero vector x₀ ∈ N_k is called

3 a mode at λ_k. Let C and O, resp., be the controllability and observability matrices of R.

4 We say that there is an uncontrollable hidden mode at λ_k iff there exists a generalized eigenvector at λ_k which is not controllable, or equivalently,

5 N_k is not a subspace of R[C] .

6 We say that there is an unobservable hidden mode at λ_k iff there exists a generalized eigenvector at λ_k that is unobservable, or equivalently,

7 N_k ∩ N[O] ≠ {θₙ} .

8 We say that there is a hidden mode at λ_k iff there is an uncontrollable or an unobservable hidden mode, or both.

9 Comments. α) The negation of (5) reads: every mode generated by a generalized eigenvector at the eigenvalue λ_k can be reached by an appropriate control; more precisely, using the state transition function of R,
∀ nonzero x ∈ N_k, there exists a control u(·) that is zero outside some bounded interval [0,T] s.t. ∀ t ≥ T, x(t) = e^{A(t−T)}x = s(t,0,θ,u). Hence an uncontrollable hidden mode at λ_k is a mode that cannot be displayed by the zero-state transition function after appropriate control action.
β) Condition (7) reads: there is (a mode generated by) a generalized eigenvector at λ_k that is unobservable (at the output); more precisely, using the state transition and response functions of R,

∃ a nonzero x ∈ N_k such that

x(t) = s(t,0,x,θ_u) = e^{At}x ∀ t ≥ 0 ,

yet

y(t) = ρ(t,0,x,θ_u) = Ce^{At}x = θ ∀ t ≥ 0 .

Hence an unobservable hidden mode at λ_k is a mode that is unobservable at the output.

12 Exercise. Using the notations of (8.6.12) and (8.6.13), show that if there is a hidden mode at λ_k, then λ_k ∈ σ(Ā₂₂) ∪ σ(Ā₃₃) ∪ σ(Ā₄₄).
[Hint: use contradiction.]

13 Comment. In terms of hidden modes, Comment (8.6.32) can be restated as follows:
If R has an unstable hidden mode at λ_k (i.e. Re λ_k ≥ 0), then R cannot be made exponentially stable by dynamic output feedback.
It is for this reason that, in the feedback theory developed in terms of transfer functions (i.e. which bookkeeps only R₁ = [Ā₁₁,B̄₁,C̄₁,0], see (8.6.18)), we have to assume that R has no unstable hidden modes.
We proceed now to characterize the existence of hidden modes.

16 Theorem [Hidden modes]. Consider a time-invariant representation R = [A,B,C,D], where σ[A] = {λ_k} is the spectrum of A ∈ ℂ^{n×n}.
U.th.c.
a) There is a hidden unobservable mode at λ_k ∈ σ[A] if and only if

17 rk [ (λ_kI−A) ; C ] < n .

b) There is a hidden uncontrollable mode at λ_k ∈ σ[A] if and only if

18 rk [ λ_kI−A : B ] < n .

c) There is a hidden mode at λ_k ∈ σ[A] iff either (17) or (18), or both (17) and (18), hold.

20 Comments. a) (17) and (18) are eigenvector characterizations. In the first instance there is a right eigenvector of A that is unobservable: indeed, there exists a nonzero vector e s.t. Ae = λ_ke and Ce = θ, whence CA^ie = θ ∀ i = 0,1,2,… (prove it), and Ce^{At}e = θ. In the second instance, there exists a left eigenvector of A that is orthogonal to the controllable subspace: indeed, there is a nonzero vector η s.t. η*A = λ_kη* and η*B = θ*, whence η*A^iB = θ* for all i = 0,1,2,…; hence, for all u(·), for all t,

∫₀^t η*e^{A(t−τ)}Bu(τ)dτ = 0 .

If A is simple, then, using a basis of eigenvectors (e_k)₁ⁿ and the notations (3.3.48)-(3.3.50), R becomes R̄ = [Λ, E⁻¹B, CE, D], where E := [e₁ … eₙ] and the rows of E⁻¹ are the left eigenvectors η_k*. In this case, if there is a hidden mode at λ_k, i.e., according to (17) or (18), if there exists an eigenvector e_k at λ_k s.t. Ce_k = θ or a left eigenvector η_k at λ_k s.t. η_k*B = θ*, then, in the partial fraction expansion of the transfer function Ĥ(s), the kth term

(Ce_k)(η_k*B)/(s−λ_k)

drops out.

22 Proof of Theorem (16). We only prove claims a) and b). a) We prove (7) ⇔ (17).
(⇐) As already observed in Comment (20a), (17) implies the existence of an eigenvector in N_k that is unobservable. Hence (7) holds.
(⇒) If (7) holds, then there exists a nonzero vector x in N_k that is unobservable. By Exercise (4.3.38) there exists a polynomial φ(s) s.t. φ(A)x is a (nonzero) eigenvector of A at λ_k. Moreover, since CA^ix = θ for all i = 0,1,2,…, Cφ(A)x = θ. Hence (17) holds.
b) We prove (5) ⇔ (18), or equivalently, by negation,

23 N_k ⊂ R[C]
⇔
24 rk [ λ_kI−A : B ] = n .

For this purpose use orthogonal complementation and duality, whereby (using expanded notations)

25 (N_k[A])^⊥ = ⊕_{l≠k} N_l[A*] ,

26 (R[C])^⊥ = (R[C(A,B)])^⊥ = N[O(B*,A*)]

(for (25) see Exercise (4.4.46); for (26) see (8.5.52)). Moreover we shall need

27 N_l[A*] ∩ N_k[A*] = {θ} for all l ≠ k

and

28 ℂⁿ = ⊕_{l=1}^{σ} N_l[A*]

(see (4.3.41); note also that any A*-invariant subspace is spanned by generalized eigenvectors, see (4.3.19)). Therefore

N_k ⊂ R[C] ⇔ N[O(B*,A*)] ⊂ ⊕_{l≠k} N_l[A*] (by (25) and (26))

29 ⇔ N[O(B*,A*)] ∩ N_k[A*] = {θ} (by (27) and (28))

⇔ no eigenvector of A* at λ̄_k is unobservable for the pair (B*,A*) (by claim a))

⇔ rk [ λ_kI−A : B ] = n .

Hence (23) ⇔ (24), or equivalently (5) ⇔ (18). •



31 Exercise. Consider a time-invariant system representation R = [A,B,C,D] and its dual R̃ = [−A*,−C*,B*,D*]. Then a) R has an uncontrollable hidden mode at λ_k ∈ σ[A] iff R̃ has an unobservable hidden mode at −λ̄_k ∈ σ[−A*]; b) R has an unobservable hidden mode at λ_k ∈ σ[A] iff R̃ has an uncontrollable hidden mode at −λ̄_k ∈ σ[−A*].
[Hints: a) use (23) ⇔ (29); b) note that the dual-system representation of the dual is the original system representation modulo a change of sign for the state.]

32 No unstable hidden modes. Consider the time-invariant system representation R = [A,B,C,D]. Let σ[A] =: {λ_k}₁^σ be the spectrum of A ∈ ℂ^{n×n} with corresponding algebraic eigenspaces N_k, (1). Let N₋ and N₊ be, resp., the stable and unstable subspaces of A; more precisely,

33 N₋ := ⊕_{k∈K} N_k with K := {k : Re λ_k < 0} ,

34 N₊ := ⊕_{k∉K} N_k

(see (4.4.55)). These A-invariant subspaces are important below.

35 We say that there are no unstable (uncontrollable, unobservable) hidden modes iff there are no (uncontrollable, unobservable, resp.) hidden modes at every λ_k ∈ σ[A] s.t. Re λ_k ≥ 0.
We stress that by "no unstable hidden modes" we mean neither unstable uncontrollable nor unstable unobservable hidden modes. Moreover, we have

36 Corollary [No unstable hidden modes]. Consider the time-invariant representation R = [A,B,C,D]. Let N₊, R[C], N[O], resp., be the unstable subspace of A, the controllable subspace of (A,B), and the unobservable subspace of (C,A).
U.th.c.
a) There are no unstable uncontrollable hidden modes

37 ⇔ N₊ ⊂ R[C]

38 ⇔ ∀ λ ∈ σ[A] ∩ ℂ̄₊, rk[λI−A : B] = n (ℂ̄₊ denoting the closed right half plane).

b) There are no unstable unobservable hidden modes

39 ⇔ N₊ ∩ N[O] = {θ}

40 ⇔ ∀ λ ∈ σ[A] ∩ ℂ̄₊, rk[(λI−A) ; C] = n .

c) There are no unstable hidden modes
⇔ (37) and (39) hold

42 ⇔ ∀ λ ∈ σ[A] ∩ ℂ̄₊, rk[λI−A : B] = n and rk[(λI−A) ; C] = n .

45 Exercise. Prove Corollary (36).
[Hints: use the definitions of hidden modes, (34)-(35), and Theorem (16).]

46 Exercise. Show that

37 N₊ ⊂ R[C]
⇔
47 N₋ + R[C] = ℂⁿ
⇔
48 N₊[A*] ∩ N[O(B*,A*)] = {θ} .

[Hints: ℂⁿ = N₋ ⊕ N₊; R[C(A,B)]^⊥ = N[O(B*,A*)]; and, by (4.4.54), N₋[A]^⊥ = N₊[A*].]

50 Exercise. Consider a time-invariant representation R = [A,B,C,D] and its dual representation R̃ = [−A*,−C*,B*,D*]. Assume that R has no hidden modes on the imaginary axis. Show that R has no unstable hidden modes iff R̃ has no stable hidden modes.

52 Stabilizability and detectability. Consider a time-invariant system representation R = [A,B,C,D]. The A-invariant subspaces on the LHS of (47) and (39), viz.

53 S(A,B) := N₋ + R[C]

and

54 N_D(C,A) := N₊ ∩ N[O] ,

are called, resp., the stabilizable subspace of the pair (A,B) and the undetectable subspace of the pair (C,A).

55 Moreover, we say that the pair (A,B) is stabilizable iff S(A,B) = ℂⁿ, and we say that the pair (C,A) is detectable iff N_D(C,A) = {θ}.

The exercises below show that stabilizability must be viewed as controllability to zero at infinity and detectability must be seen as observability at infinity.

57 Exercise. Consider the stabilizable subspace (53) of (A,B). Let s(·,·,·,·) denote the state transition function of the time-invariant representation R = [A,B,C,D]. Show that

x₀ ∈ S(A,B)
iff
there exists a control u(·) such that, with x(t) = s(t,0,x₀,u),

lim_{t→∞} x(t) = θ .

[Hints: (⇒) follows from the definitions; (⇐): pick a basis according to ℂⁿ = S(A,B) ⊕ T, where T is a complementary subspace. (Note that S(A,B) is A-invariant and R[B] ⊂ S(A,B).) Hence, in this basis, ẋ = Ax + Bu reads ẋ¹ = A₁x¹ + A₁₂x² + B₁u and ẋ² = A₂x², where A₁ represents A|S(A,B) and σ(A₂) ⊂ ℂ̄₊. Therefore, if x₀ is not a member of S(A,B), then x₀² ≠ θ and, for every u(·), x(t) does not go to θ as t → ∞.]

58 Comments. α) A state is stabilizable iff it is controllable to zero at infinity.
β) Since S(A,B) = N₋ ⊕ (N₊ ∩ R[C]), the stabilizable subspace is the smallest A-invariant subspace that contains the stable states and the unstable states that are controllable to zero.
γ) By Corollary (36) and Exercise (46), (A,B) is stabilizable iff there are no unstable uncontrollable hidden modes.

59 Exercise. Consider the undetectable subspace (54) of (C,A). Let s(·,·,·,·) and ρ(·,·,·,·), resp., denote the state transition and response functions of the time-invariant representation R = [A,B,C,D]. Show that

a nonzero state x₀ ∈ N_D(C,A)
iff,
with x(t) = exp[At]x₀ = s(t,0,x₀,θ_u) and y(t) = C exp[At]x₀ = ρ(t,0,x₀,θ_u),

x(t) does not tend to θ as t → ∞, yet lim_{t→∞} y(t) = θ .

60 Comments. α) A nonzero state is undetectable iff it is unobservable at infinity.
β) N_D(C,A) = N₊ ∩ N[O] is the largest A-invariant subspace of states that are both unstable and unobservable.
γ) By Corollary (36), (C,A) is detectable iff there are no unstable unobservable hidden modes.
[Hints: (⇒) follows from the definitions; (⇐): w.l.g. x₀ ∈ N_k = N([A−λ_kI]^{m_k}) with Re λ_k ≥ 0. By Exercise (4.3.38) there exists a chain of generalized eigenvectors (eⁱ) with e¹ an eigenvector and [A−λ_kI]eⁱ = e^{i−1}, where e⁰ := θ and e^m = x₀. Hence

C exp[At]x₀ = C[e¹ t^{m−1}/(m−1)! + e² t^{m−2}/(m−2)! + … + e^m]exp(λ_kt) .

Show that Ceⁱ = θ for all i = 1,2,…,m.]
We conclude this section with some theorems synthesizing the results of this section. Certain results on linear constant state feedback and output injection are added; they are proved in Chapter 10. Another added result, on the relation between I/O stability and exp. stability, is proved in Chapter 9.

62 Theorem [Stabilizability properties]. Given R = [A,B,C,D], where the stabilizable subspace of the pair (A,B) is the A-invariant subspace S(A,B) = N₋ + R[C]:
i) The pair (A,B) is stabilizable
⇔ S(A,B) = ℂⁿ
⇔ N₊ ⊂ R[C]
⇔ there are no unstable uncontrollable hidden modes
⇔ ∀ λ ∈ σ[A] ∩ ℂ̄₊, rk[λI−A : B] = n.
ii) For A,B real, there exists F ∈ ℝ^{n_i×n} such that

σ[A+BF] ⊂ ℂ̊₋

if and only if the pair (A,B) is stabilizable. [Stabilizability by linear state feedback.] •

65 Theorem [Detectability properties]. Given R = [A,B,C,D], where the undetectable subspace of the pair (C,A) is the A-invariant subspace N_D(C,A) = N₊ ∩ N[O]:
i) The pair (C,A) is detectable
⇔ N_D(C,A) = {θ}
⇔ N₊ ∩ N[O] = {θ}
⇔ there are no unstable unobservable hidden modes
⇔ ∀ λ ∈ σ[A] ∩ ℂ̄₊, rk[(λI−A) ; C] = n.
ii) For C,A real, there exists L ∈ ℝ^{n×n_o} such that

σ[A+LC] ⊂ ℂ̊₋

if and only if the pair (C,A) is detectable. [Stabilization by linear output injection.] •

68 Theorem [No unstable hidden modes]. Given R = [A,B,C,D] with transfer function Ĥ(s):
i) there are no unstable hidden modes
⇔ N₊ ⊂ R[C] and N₊ ∩ N[O] = {θ}
⇔ ∀ λ ∈ σ[A] ∩ ℂ̄₊,

rk[λI−A : B] = n and rk[(λI−A) ; C] = n

⇔ the pair (A,B) is stabilizable and the pair (C,A) is detectable.
ii) Assume that the poles of Ĥ(s) satisfy P[Ĥ(s)] ⊂ ℂ̊₋; then
σ[A] ⊂ ℂ̊₋
⇔ there are no unstable hidden modes. [Exp. stability by I/O stability.] •

70 Theorem [Duality]. Given R = [A,B,C,D], where the stabilizable subspace of the pair (A,B) is the A-invariant subspace S(A,B) = N₋ + R(C) and the undetectable subspace of the pair (C,A) is the A-invariant subspace N_D(C,A) = N₊ ∩ N(O).
U.th.c.
i) S(A,B)^⊥ = N_D(B*,A*),
ii) N_D(C,A)^⊥ = S(A*,C*),
iii) the pair (A,B) is stabilizable
⇔ the pair (B*,A*) is detectable,
iv) the pair (C,A) is detectable
⇔ the pair (A*,C*) is stabilizable. •

71 Remark. Theorem (70) follows using Exercise (46).
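Computational note. The rank characterizations in Theorems (62), (65) and (68) translate directly into code. The sketch below (ours) tests only the eigenvalues in the closed right half plane, and obtains the detectability test from Theorem (70) iv):

    import numpy as np

    def stabilizable(A, B, tol=1e-9):
        n = A.shape[0]
        for lam in np.linalg.eigvals(A):
            if lam.real >= -tol:                       # unstable eigenvalue
                M = np.hstack([lam * np.eye(n) - A, B])
                if np.linalg.matrix_rank(M, tol) < n:
                    return False
        return True

    def detectable(C, A, tol=1e-9):
        # Theorem (70): (C,A) detectable  <=>  (A*,C*) stabilizable
        return stabilizable(A.conj().T, C.conj().T, tol)

    A = np.array([[1., 0.], [0., -2.]])
    B = np.array([[1.], [0.]])
    C = np.array([[0., 1.]])
    print(stabilizable(A, B))   # True : unstable mode at +1 is controllable
    print(detectable(C, A))     # False: unstable mode at +1 is unobservable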

8.8. Balanced Representation

1 Consider a given R = [A,B,C,0] that is controllable and observable and with σ(A) ⊂ ℂ̊₋ (equiv. ∀ λᵢ ∈ σ(A), Re λᵢ < 0).
Consider its controllability grammian given by (8.5.41), i.e.

2 W_c(0,∞) = ∫₀^∞ e^{At}BB*e^{A*t}dt =: W_c ,

and its observability grammian given by (8.5.14), i.e.

3 M_o(0,∞) = ∫₀^∞ e^{A*t}C*Ce^{At}dt =: M_o .

The positive definite matrix W_c is related to the minimal cost to reach x on (−∞,0]:

4 min_u ∫_{−∞}^{0} u(t)*u(t)dt = x*W_c⁻¹x .

The positive definite matrix M_o is related to the energy in the z-i response starting from x at time 0:

5 ∫₀^∞ ‖ρ(t,0,x,θ_u)‖²dt = x*M_ox .

We also know that W_c and M_o are the unique solutions of

6 AW_c + W_cA* + BB* = 0

7 A*M_o + M_oA + C*C = 0 ,

where the uniqueness follows from the fact that X ↦ AX + XA* and Y ↦ A*Y + YA are injective because σ(A) ⊂ ℂ̊₋.
It turns out that there is a basis in state space such that, for an R satisfying (1), the representation with respect to the new basis has the property that the controllability grammian and the observability grammian are equal and diagonal: thus the new representation is "balanced."

10 Theorem. Given R controllable, observable and exp. stable, there exists a coordinate transformation that gives a new representation R̄ = [Ā,B̄,C̄,0] that is "balanced," that is, in the new coordinate system

11 M̄_o = W̄_c = diag(σ₁,σ₂,…,σₙ) =: Σ

where σᵢ > 0, ∀ i = 1,…,n.

12 Exercise. Use (4) and (5) to give an interpretation of (11) in terms of energies in the input and the output.

Proof. Let M_o = R*R (e.g. let M_o = Σ_k λ_ke_ke_k*, λ_k > 0 ∀ k; then take R := Σ_k λ_k^{1/2}e_ke_k*). Perform an SVD on RW_cR*: noting that RW_cR* is Hermitian positive definite, we obtain (Σ is defined in (11))

16 RW_cR* = UΣ²U*, U unitary and σ_k > 0 ∀ k ,

and, as usual, σ₁ ≥ σ₂ ≥ … ≥ σₙ. Choose

17 T := Σ^{−1/2}U*R .

Note that A ↦ TAT⁻¹, B ↦ TB, C ↦ CT⁻¹, W_c ↦ TW_cT*, and M_o ↦ (T*)⁻¹M_oT⁻¹. (Use (6) and (7) to check the last two.) Then straightforward calculations, using the facts that U is unitary, R is nonsingular, …, give

18 M̄_o = W̄_c = Σ . •

19 Discussion. The SVD delivers the σᵢ² in (16), with σ₁ ≥ σ₂ ≥ ⋯ ≥ σ_n. Suppose that 0 < σ_n ≪ 1: then, by (4), to reach the nth unit vector e_n in the new coordinates costs at least 1/σ_n; hence to reach e_n is expensive. Furthermore, by (5), ∫₀^∞ ‖ρ(t,0,e_n,θ_u)‖² dt = σ_n; hence the z-i response due to e_n is "small." In other words, in terms of input-output properties the nth coordinate does not contribute much. The following result is interesting for system reduction [Glo.1].
Using the notations above, assume that for some k ∈ {1,…,n}, σ_k > σ_{k+1}; hence ∀i ≤ k, ∀j > k, σᵢ > σⱼ. Partition Ā,B̄,C̄ conformally so that Ā₁₁ is k×k. Then

σ(Ā₁₁) ⊂ ℂ°₋.

In other words, if one truncates in this manner an exp. stable balanced representation one obtains an exp. stable representation of reduced order; and if σ_{k+1} is small compared to σ₁, σ₂, …, σ_k, then the transfer function of the reduced system is close to that of the given one.
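To make the construction in the proof of Theorem (10) concrete, here is a minimal numerical sketch of balancing and truncation (our code, not the authors'; real data and SciPy are assumed, and the function names are ours).

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov, cholesky

    def balance(A, B, C):
        """Balancing transformation following the proof of Theorem (10)."""
        Wc = solve_continuous_lyapunov(A, -B @ B.T)    # (6)
        Mo = solve_continuous_lyapunov(A.T, -C.T @ C)  # (7)
        R = cholesky(Mo)                               # Mo = R* R
        # (16): R Wc R* = U Sigma^2 U*
        U, s, _ = np.linalg.svd(R @ Wc @ R.T, hermitian=True)
        Sigma = np.sqrt(s)
        T = np.diag(Sigma ** -0.5) @ U.T @ R           # (17)
        Tinv = np.linalg.inv(T)
        return T @ A @ Tinv, T @ B, C @ Tinv, Sigma

    def truncate(Ab, Bb, Cb, k):
        """Keep the k dominant balanced coordinates (sigma_1,...,sigma_k)."""
        return Ab[:k, :k], Bb[:k], Cb[:, :k]

The truncated triple is again exp. stable by the result quoted above, and its transfer function is close to the original one when σ_{k+1} is small.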

8.9. Robustness of Controllability


In section 2 and section 5, we considered exclusively the controllability of linear systems. For obvious engineering reasons, it is desirable that if a linear system represented by the pair (A(·),B(·)) is controllable then it remains controllable if its dynamics are perturbed so that it becomes, say, nonlinear. Our purpose here is to indicate the nature of the results; the proofs are based on fixed point theorems and are somewhat tricky.
We consider the nominal system

1 ẋ(t) = A(t)x(t) + B(t)u(t)

where A(·),B(·) are piecewise continuous and bounded on ℝ₊. For stating the results, we need two definitions.

2 We say that the pair (A(·),B(·)) is uniformly controllable over T seconds iff ∀t₀ ∈ ℝ₊ the reachability grammian W_r(t₀,t₀+T) > 0; equivalently, by (8.2.12), ∀t₀ ∈ ℝ₊, ∀x₀,x₁ ∈ ℝⁿ there is an input u_{[t₀,t₀+T]} that steers the phase (x₀,t₀) to the phase (x₁,t₀+T).
We also need a stronger condition.

3 We say that the pair (A(·),B(·)) is strongly uniformly controllable over T seconds iff ∃λ_s > 0 such that ∀t₀ ∈ ℝ₊, W_r(t₀,t₀+T) ≥ λ_s I. (The point here is that the same λ_s > 0 works for all t₀ ∈ ℝ₊.)
Consider now a nonlinear system
Consider now a nonlinear system

4 ẋ(t) = g(x(t),u(t),t)

where ∀t₀ ∈ ℝ₊, ∀x₀ ∈ ℝⁿ, ∀u_{[0,∞)} piecewise continuous, Eq. (4) has a unique solution on [0,∞). We say that (4) is uniformly controllable over T seconds iff ∀t₀ ≥ 0, ∀x₀,x₁ ∈ ℝⁿ, ∃u_{[t₀,t₀+T]} that steers the phase (x₀,t₀) to (x₁,t₀+T).
The nature of the robustness results is displayed by the following statement.
The nature of the robustness results is displayed by the following statement.

5 Theorem. Let the pair (A(·),B(·)) of (1) be strongly uniformly controllable over T seconds; then
i) the perturbed nonlinear system

ẋ(t) = A(t)x(t) + B(t)u(t) + h(x(t),u(t),t),

where

sup ‖h(x,v,t)‖ =: k₀ < ∞

(where the supremum is taken over all x ∈ ℝⁿ, v ∈ ℝ^{n_i}, t ∈ ℝ₊), is uniformly controllable over T seconds;
ii) the perturbed nonlinear system

ẋ(t) = A(t)x(t) + B(t)u(t) + f(u(t),t),

where for some γ(f) ∈ ℝ₊, β(f) ∈ ℝ₊, ∀v ∈ ℝ^{n_i}, ∀t ≥ 0,

‖f(v,t)‖ ≤ γ(f)‖v‖ + β(f),

is also uniformly controllable over T seconds provided γ(f) is small enough;
iii) the perturbed nonlinear system

ẋ(t) = A(t)x(t) + B(t)u(t) + ψ(x(t),t),

where for some γ(ψ) ∈ ℝ₊ and β(ψ) ∈ ℝ₊, ∀x ∈ ℝⁿ and ∀t ≥ 0,

‖ψ(x,t)‖ ≤ γ(ψ)‖x‖ + β(ψ),

is also uniformly controllable over T seconds provided γ(ψ) is small enough. ∎

In fact, specific bounds on γ(f) and γ(ψ) are available in the literature (see, e.g. [Sas.1]); the important point is that under reasonable conditions the controllability of the linear model is not destroyed by nonlinear perturbations provided these perturbations are "small."
As expected, it can also be shown that under reasonable conditions, the zero-input observability of the linear model is not destroyed by nonlinear perturbations provided these perturbations are "small" (see, e.g. [Sas.1] and the references therein).
CHAPTER 8d

CONTROLLABILITY AND OBSERVABILITY. THE DISCRETE-TIME CASE

Introduction
In this chapter we study the most important discrete-time analogs of the continuous-time case. The fundamental difference is that controllability to zero does not necessarily imply reachability from zero.

8d.1. Controllability and Observability of Dynamical Systems


The definitions and results are as in section 8.1 except that the times t,t₀,t₁ are now integers k,k₀,k₁, and the functions are sequences.

8d.2. Reachability and Controllability of the Pair (A(·),B(·))


From the present section to section 4 we restrict ourselves to discrete-time linear time-varying system representations R_d(·) = [A(·),B(·),C(·),D(·)] with state space ℂⁿ, where A(·),B(·),C(·),D(·) are matrix-sequences defined on ℕ. We denote by U_d(k₀,k₁−1) the linear space of vector input sequences u_{[k₀,k₁−1]} = (u(k₀), u(k₀+1), …, u(k₁−1)).

8d.2.1. Controllability of the Pair (A(·),B(·))

Since controllability involves the relation between the input u(·) and the state trajectory x(·), we will use the expression "controllability of (A(·),B(·))."

1 More precisely, we say that the pair (A(·),B(·)) is controllable on [k₀,k₁] iff for all (x₀,k₀) and for all (x₁,k₁) there exists a control sequence u_{[k₀,k₁−1]} that transfers the phase (x₀,k₀) to the phase (x₁,k₁).
Given the matrix-sequence pair (A(·),B(·)) we know that u_{[k₀,k₁−1]} transfers the phase (x₀,k₀) to the phase (x₁,k₁) if and only if

2 x₁ = s(k₁,k₀,x₀,u) = Φ(k₁,k₀)x₀ + Σ_{k'=k₀}^{k₁−1} Φ(k₁,k'+1)B(k')u(k').

Equation (2) shows that there will be an input u_{[k₀,k₁−1]} that will transfer an arbitrary phase (x₀,k₀) to an arbitrary phase (x₁,k₁) if and only if the linear map

4 L_r(k₀,k₁): U_d(k₀,k₁−1) → ℂⁿ : u_{[k₀,k₁−1]} ↦ Σ_{k'=k₀}^{k₁−1} Φ(k₁,k'+1)B(k')u(k')

is surjective. L_r(k₀,k₁) is called the reachability map on [k₀,k₁] and is represented by the reachability matrix L_r on [k₀,k₁] given by

4a L_r(k₀,k₁) := [B(k₁−1) : Φ(k₁,k₁−1)B(k₁−2) : ⋯ : Φ(k₁,k₀+1)B(k₀)].


Sometimes we do not want to specify k₁; then we say that the pair (A(·),B(·)) is controllable at k₀ iff for some k₁ > k₀ the pair is controllable on [k₀,k₁].
Unlike in the continuous-time case, the transition matrix Φ(k₁,k₀) has an inverse Φ(k₀,k₁) = [Φ(k₁,k₀)]⁻¹ iff det A(k) ≠ 0 ∀k ∈ [k₀,k₁−1]. Hence

5 Reduction Theorem. The pair (A(·),B(·)) is controllable on [k₀,k₁]

6 ⇔ ∀x₁ ∈ ℂⁿ, ∃u_{[k₀,k₁−1]} that steers (θₙ,k₀) to (x₁,k₁),

7 ⇒ ∀x₀ ∈ ℂⁿ, ∃u_{[k₀,k₁−1]} that steers (x₀,k₀) to (θₙ,k₁),

and the last implication is an equivalence if det A(k) ≠ 0 for all k ∈ [k₀,k₁−1].

Proof. Exercise. (Use (2) and the properties of Φ(·,·).)

8 Exercise. Show that the constant pair (A,b), where

A = ⎡0 1 0⎤
    ⎢0 0 1⎥ ∈ ℝ³ˣ³,  b = ε₁ = (1,0,0)*,
    ⎣0 0 0⎦

is not controllable on [0,3], yet every state x₀ at 0 is driven to zero at k = 3 by using a zero control. [Hint: let x(0) = ε₃; then the zero control steers (ε₃,0) to (θ,3), since A³ = 0.]

9 Exercise. With A ∈ ℝ³ˣ³ as in Exercise (8) and b = ε₃, show that the pair (A,b) is controllable and satisfies the conditions (6) and (7) on [0,3]. [Hint: show that L_r(0,3), given by (4), is surjective.]

10 Comment. This exercise shows that controllability on [k₀,k₁] and Condition (7) on [k₀,k₁] may hold in the case that A(k) is singular.
Motivated by the reduction above we must make a distinction between the following.

11 Given the representation R_d(·) or the pair (A(·),B(·)), we say that the state x₀ is controllable to zero on [k₀,k₁] iff ∃u_{[k₀,k₁−1]} that steers (x₀,k₀) to (θₙ,k₁); we say that the state x₁ is reachable on [k₀,k₁] iff ∃u_{[k₀,k₁−1]} that steers (θₙ,k₀) to (x₁,k₁). Moreover we say that the pair (A(·),B(·)) or R_d(·) is controllable to zero (reachable) on [k₀,k₁] iff every state x₀ (x₁ resp.) is controllable to zero (reachable resp.) on [k₀,k₁].

As a consequence Theorem (5) reads as follows.

11a Corollary. The pair (A(·),B(·)) is controllable on [k₀,k₁]

⇔ the pair (A(·),B(·)) is reachable on [k₀,k₁],
⇒ the pair (A(·),B(·)) is controllable to zero on [k₀,k₁],
and the last statement is an equivalence if det A(k) ≠ 0 for all k ∈ [k₀,k₁−1].

11b Comment. On any [k₀,k₁] controllability and reachability are equivalent. However this does not hold for controllability and controllability to zero (in Exercise (8) the pair (A,b) is controllable to zero on [0,3] but is not reachable on [0,3]).
We have now, by the definitions and Theorem (5), the following theorem.
We have now by the definitions and Theorem (5) the following theorem.

12 Theorem [Controllability in terms of reachability from zero]. Let A(·),B(·) be given compatible matrix-sequences, with the reachability map and reachability matrix given by (4) and (4a), resp. Then

13 i) the pair (A(·),B(·)) is controllable on [k₀,k₁]

14 ⇔ the pair (A(·),B(·)) is reachable on [k₀,k₁]

15 ⇔ the reachability map L_r(k₀,k₁) is surjective

16 ⇔ rk[L_r(k₀,k₁)] = n

17 ⇔ det W_r(k₀,k₁) ≠ 0, where the reachability grammian is defined by

18 W_r(k₀,k₁) := Σ_{k'=k₀}^{k₁−1} Φ(k₁,k'+1)B(k')B(k')*Φ(k₁,k'+1)*.

ii) The set of reachable states on [k₀,k₁] is the subspace R[L_r(k₀,k₁)] = R[W_r(k₀,k₁)].

19 Exercise. Prove Theorem (12). [Hint: for (17) note that

20 W_r(k₀,k₁) = L_r(k₀,k₁)L_r(k₀,k₁)*.]

21 Comments. α) Equation (18) shows that W_r(k₀,k₁) is the sum of Hermitian positive semi-definite matrices and is therefore itself Hermitian positive semi-definite, whence ∀z ∈ ℂⁿ, z*W_r(k₀,k₁)z ≥ 0. Furthermore, if k₁ < k₁′,

W_r(k₀,k₁) ≤ W_r(k₀,k₁′), whence R[W_r(k₀,k₁)] ⊂ R[W_r(k₀,k₁′)],

i.e. more states are possibly reached as the horizon k₁ increases. The same happens as k₀ decreases for a fixed k₁.
β) The function k₁ ↦ W_r(k₀,k₁) is the solution at k = k₁ of the matrix recursion equation (r.e.)

23 X(k+1) = A(k)X(k)A(k)* + B(k)B(k)*, k ≥ k₀,

with X(k₀) = 0.
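For computation the recursion (23) is preferable to the defining sum (18). A short sketch (ours; the sequences are hypothetical arguments listing A(k₀),…,A(k₁−1) and B(k₀),…,B(k₁−1)):

    import numpy as np

    def reachability_grammian(A_seq, B_seq):
        """Run (23) forward from X(k0) = 0; returns W_r(k0,k1)."""
        n = A_seq[0].shape[0]
        X = np.zeros((n, n))
        for A_k, B_k in zip(A_seq, B_seq):
            X = A_k @ X @ A_k.conj().T + B_k @ B_k.conj().T
        return X

By (17), the pair (A(·),B(·)) is controllable on [k₀,k₁] iff the returned matrix is nonsingular.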
We have also by the definitions and Theorem (12) the following theorem.

24 Theorem [Controllability to zero]. Let A(·),B(·) be given compatible matrix-sequences, with the reachability map and reachability matrix given by (4) and (4a), resp. Then
i) the pair (A(·),B(·)) is controllable to zero on [k₀,k₁]

25 ⇔ R[Φ(k₁,k₀)] ⊂ R[L_r(k₀,k₁)];

ii) the set of states x₀ ∈ ℂⁿ that are controllable to zero on [k₀,k₁] is the subspace

26 C_o(k₀,k₁) := {x₀ ∈ ℂⁿ : Φ(k₁,k₀)x₀ ∈ R[L_r(k₀,k₁)]}.

In other words, x₀ ∈ ℂⁿ is controllable to zero on [k₀,k₁] if and only if Φ(k₁,k₀)x₀ is reachable on [k₀,k₁].

27 Exercise. Prove Theorem (24). [Hint: for (i) use Eq. (2) and Theorem (12, (ii)).]

28 Comments. α) If row rk[L_r(k₀,k₁)] = n then condition (25) holds; in other words, if (A(·),B(·)) is reachable on [k₀,k₁] then (A(·),B(·)) is controllable to zero on [k₀,k₁], cf. Corollary (11a).
β) In the time-invariant case, on [0,n], the subspace of states controllable to zero contains the subspace of states reachable from zero, see (8d.5.76) below.
γ) C_o(k₀,k₁), (26), is the inverse image of the reachability subspace R[L_r(k₀,k₁)] under the map Φ(k₁,k₀).

31 Controllability to zero when A(k) is nonsingular for all k. Consider a system R_d(·) or a pair (A(·),B(·)) where

32 det A(k) ≠ 0 for all k.

Hence by (4b) the transition matrix Φ(k,l) is nonsingular for all k and l, with Φ(k,l)⁻¹ = Φ(l,k). Hence, using (2) and definition (11), a state x₀ ∈ ℂⁿ is controllable to zero on [k₀,k₁] iff for some u_{[k₀,k₁−1]}

33 x₀ = − Σ_{k'=k₀}^{k₁−1} Φ(k₀,k'+1)B(k')u(k').

This suggests to define the controllability to zero map on [k₀,k₁]

34 L_c(k₀,k₁) : U_d(k₀,k₁−1) → ℂⁿ : u_{[k₀,k₁−1]} ↦ Σ_{k'=k₀}^{k₁−1} Φ(k₀,k'+1)B(k')u(k'),

represented by the controllability to zero matrix on [k₀,k₁]

34a L_c(k₀,k₁) := [Φ(k₀,k₁)B(k₁−1) : Φ(k₀,k₁−1)B(k₁−2) : ⋯ : Φ(k₀,k₀+1)B(k₀)].

35 Exercise. Let A(k) be nonsingular for all k and let L_r(k₀,k₁) be the reachability matrix (4a). Show that

L_c(k₀,k₁) = Φ(k₀,k₁)L_r(k₀,k₁),

where Φ(k₀,k₁) = Φ(k₁,k₀)⁻¹ is nonsingular.

35b Comment. If A(k) is nonsingular for all k, then a) on any interval [k₀,k₁] the controllability to zero and reachability matrices are equivalent, and b) (A(·),B(·)) is controllable on [k₀,k₁] iff (A(·),B(·)) is reachable on [k₀,k₁] iff (A(·),B(·)) is controllable to zero on [k₀,k₁].

36 Theorem [Controllability in terms of controllability to zero]. Let A(·) and B(·) be given compatible matrix-sequences with A(k) nonsingular for all k. Consider the controllability to zero map and controllability to zero matrix given by (34) and (34a), resp. Then,

37 i) the pair (A(·),B(·)) is controllable on [k₀,k₁]

38 ⇔ the pair (A(·),B(·)) is controllable to zero on [k₀,k₁]

39 ⇔ the controllability to zero map L_c(k₀,k₁) is surjective

40 ⇔ rk[L_c(k₀,k₁)] = n

41 ⇔ det[W_c(k₀,k₁)] ≠ 0, where the controllability to zero grammian is defined by

42 W_c(k₀,k₁) = Σ_{k'=k₀}^{k₁−1} Φ(k₀,k'+1)B(k')B(k')*Φ(k₀,k'+1)*.

ii) The set of states that are controllable to zero on [k₀,k₁] is the subspace R[L_c(k₀,k₁)] = R[W_c(k₀,k₁)].

43 Exercise. Prove Theorem (36). [Hint: for (41) note that

43a W_c(k₀,k₁) = L_c(k₀,k₁)L_c(k₀,k₁)*.]

44 Exercise. Show that the sequence k₀ ↦ W_c(k₀,k₁) is the solution at k = k₀ of the backward matrix r.e.

45 X(k) = A(k)⁻¹[X(k+1) + B(k)B(k)*](A(k)⁻¹)*, k ≤ k₁−1,

with X(k₁) = 0.

46 Exercise. Show that under the conditions of Theorem (36), on any [k₀,k₁] the subspace of reachable states and that of states controllable to zero are related by the nonsingular transformation

R[L_c(k₀,k₁)] = Φ(k₀,k₁)·R[L_r(k₀,k₁)];

hence their dimension is the same.

8d.2.2. The Cost of Control


Given the pair (A(·),B(·)), let us consider the cost of reaching (x₁,k₁) from (θₙ,k₀). In order to fit the framework of the theory of the adjoint (Sec. A.7.4), let us assume that the cost of control is given by the l²-norm of u_{[k₀,k₁−1]}, namely

55 ⟨u,u⟩ = Σ_{k'=k₀}^{k₁−1} u(k')*u(k') = ‖u‖₂².

56 Theorem [Minimum cost control].
i) If the pair (A(·),B(·)) is controllable on [k₀,k₁], then ∀x₀,x₁ ∈ ℂⁿ the input û: [k₀,k₁−1] → ℂ^{n_i} defined by

57 û(k) := B(k)*Φ(k₁,k+1)*W_r(k₀,k₁)⁻¹[x₁ − Φ(k₁,k₀)x₀]

steers the phase (x₀,k₀) to (x₁,k₁).
ii) If the cost of control is given by (55), then the minimal cost of reaching (x₁,k₁) from (θₙ,k₀) is given by

58 ‖û‖₂² = x₁*W_r(k₀,k₁)⁻¹x₁,

where û is given by (57) with x₀ = θₙ.



Short proof. The control u_{[k₀,k₁−1]} transfers (x₀,k₀) to (x₁,k₁) if and only if

59 x₁ − Φ(k₁,k₀)x₀ = L_r(k₀,k₁)u,

where L_r(k₀,k₁) is given by (4a). By (A.7.57) (Theory of the Adjoint) the least l²-norm solution of (59) is of the form L_r(k₀,k₁)*ξ for some ξ ∈ ℂⁿ. Substitution into (59) together with (20) gives the minimum cost û_{[k₀,k₁−1]} as

û = L_r(k₀,k₁)*W_r(k₀,k₁)⁻¹[x₁ − Φ(k₁,k₀)x₀],

which is precisely (57). ∎
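Formula (57) is directly computable; the sketch below (ours; k₀ = 0 is assumed and the function name is hypothetical) builds the matrices Φ(k₁,k+1), assembles W_r(k₀,k₁) by (18), and returns the minimum-energy input sequence.

    import numpy as np

    def min_energy_control(A_seq, B_seq, x0, x1):
        """u(k) = B(k)* Phi(k1,k+1)* W_r^{-1} [x1 - Phi(k1,0) x0], cf. (57)."""
        n = x0.shape[0]
        Phi = [np.eye(n)]                    # Phi(k1,k1)
        for A_k in reversed(A_seq[1:]):      # A(k1-1), ..., A(1)
            Phi.append(Phi[-1] @ A_k)        # Phi(k1,k+1) = Phi(k1,k+2) A(k+1)
        Phi = Phi[::-1]                      # Phi[k] = Phi(k1,k+1), k = 0..k1-1
        Wr = sum(P @ B @ B.conj().T @ P.conj().T
                 for P, B in zip(Phi, B_seq))               # (18)
        xi = np.linalg.solve(Wr, x1 - Phi[0] @ A_seq[0] @ x0)
        return [B.conj().T @ P.conj().T @ xi for P, B in zip(Phi, B_seq)]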



8d.3. Observability of the Pair (C(·),A(·))
We are given the discrete-time linear time-varying system representation R_d(·) = [A(·),B(·),C(·),D(·)], where A(·),B(·),C(·),D(·) are given matrix-sequences. We denote by U_d(k₀,k₁) the linear space of input sequences u_{[k₀,k₁]} = (u(k₀),u(k₀+1),…,u(k₁)), and by Y_d(k₀,k₁) the linear space of output sequences y_{[k₀,k₁]} = (y(k₀),y(k₀+1),…,y(k₁)).

1 Given R_d(·), for any initial state x₀ at k₀ and any input u_{[k₀,k₁]} ∈ U_d(k₀,k₁), the corresponding output y_{[k₀,k₁]} is given by

2 y(k) = C(k)Φ(k,k₀)x₀ + C(k) Σ_{k'=k₀}^{k−1} Φ(k,k'+1)B(k')u(k') + D(k)u(k).

Suppose that in addition to R_d(·) we know u_{[k₀,k₁]}; then in (2) we can calculate the last two terms; thus w.l.g., for the study of the relation between the state x₀ and the output y_{[k₀,k₁]}, we need only consider the first term, i.e. the z-i response

3 y(k) = C(k)Φ(k,k₀)x₀ ∀k ∈ [k₀,k₁],

which is known if the pair of matrix-sequences C(·) and A(·) is known.


Equation (3) suggests to define the observability map on [k₀,k₁] by

4 L_o(k₀,k₁): ℂⁿ → Y_d(k₀,k₁): x₀ ↦ (C(k)Φ(k,k₀)x₀)_{k∈[k₀,k₁]}.

This map L_o is represented by the observability matrix L_o(k₀,k₁) on [k₀,k₁] given by

4a L_o(k₀,k₁) := [C(k₀); C(k₀+1)Φ(k₀+1,k₀); ⋯ ; C(k₁)Φ(k₁,k₀)] (rows stacked).

5 We say that the state x₀ of R_d(·) or (C(·),A(·)) is unobservable on [k₀,k₁] iff the z-i response (3) is zero on [k₀,k₁].
In view of definition (5), we have

6 x₀ is unobservable on [k₀,k₁] ⇔ L_o(k₀,k₁)x₀ = θ;

equivalently, the set of all states of R_d(·) or (C(·),A(·)) that are unobservable on [k₀,k₁] is the linear subspace N(L_o(k₀,k₁)).
Recalling the definition (8.1.16) of a dynamical system observable on [t₀,t₁] and specializing to the special case of R_d(·), we say that

7 the pair (C(·),A(·)) is observable on [k₀,k₁] iff, given R_d(·), ∀ inputs u_{[k₀,k₁−1]} and ∀ corresponding outputs y_{[k₀,k₁]}, the state x₀ at k₀ is uniquely determined.
Since L_o(k₀,k₁) is a linear map represented by the matrix L_o(k₀,k₁), we have

8 (C(·),A(·)) is observable on [k₀,k₁] ⇔ L_o(k₀,k₁) is injective.

As a consequence the pair (C(·),A(·)) is observable iff the zero state is the only state which is unobservable on [k₀,k₁]; equivalently, the subspace of unobservable states is the trivial subspace {θ} of ℂⁿ.

12 Theorem [Characterization of observability on [k₀,k₁]]. Let C(·) and A(·) be compatible matrix-sequences, with the observability map and observability matrix specified by (4) and (4a), resp. Then we have the following equivalences:

13 the pair (C(·),A(·)) is observable on [k₀,k₁]

14 ⇔ the observability map L_o(k₀,k₁) is injective

15 ⇔ rk[L_o(k₀,k₁)] = n

16 ⇔ det M_o(k₀,k₁) ≠ 0, where the observability grammian is defined by

17a M_o(k₀,k₁) := Σ_{k'=k₀}^{k₁} Φ(k',k₀)*C(k')*C(k')Φ(k',k₀).

17 Corollary. Let (C(·),A(·)) be observable on [k₀,k₁]; then
a) with y(·) := (C(k)Φ(k,k₀)x₀)_{k∈[k₀,k₁]} the z-i response on [k₀,k₁] due to x₀,

18 ⟨y,y⟩ = Σ_{k'=k₀}^{k₁} y(k')*y(k') = x₀*M_o(k₀,k₁)x₀;

b) given y_{[k₀,k₁]}, x₀ is retrieved by

19 x₀ = [L_o(k₀,k₁)*L_o(k₀,k₁)]⁻¹L_o(k₀,k₁)*y = M_o(k₀,k₁)⁻¹ Σ_{k'=k₀}^{k₁} Φ(k',k₀)*C(k')*y(k').

Proof. Exercise. ∎

23 Corollary. Given the pair (C(·),A(·)), the set of all unobservable states on [k₀,k₁] is a subspace of ℂⁿ; in particular, it is

N(L_o(k₀,k₁)) = N[M_o(k₀,k₁)].

Proof. Exercise. ∎
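Formula (19) above is a linear least-squares problem. A small sketch (ours; hypothetical sequence arguments, with the z-i output samples y(k₀),…,y(k₁) given):

    import numpy as np

    def retrieve_x0(C_seq, A_seq, y_seq):
        """Stack the blocks C(k) Phi(k,k0) of (4a), then solve (19)."""
        n = A_seq[0].shape[0]
        Phi = np.eye(n)                       # Phi(k0,k0)
        blocks = [C_seq[0] @ Phi]
        for C_k, A_k in zip(C_seq[1:], A_seq):
            Phi = A_k @ Phi                   # Phi(k+1,k0) = A(k) Phi(k,k0)
            blocks.append(C_k @ Phi)
        Lo = np.vstack(blocks)                # observability matrix L_o(k0,k1)
        x0, *_ = np.linalg.lstsq(Lo, np.concatenate(y_seq), rcond=None)
        return x0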

24 Exercise. Using the notations of Theorem (12), show that the sequence k₀ ↦ M_o(k₀,k₁) is the solution at k = k₀ of the backward matrix r.e.

25 X(k) = A(k)*X(k+1)A(k) + C(k)*C(k) for k ≤ k₁,

with X(k₁+1) = 0.

30 Reconstructibility on [k₀,k₁]. Consider the discrete-time system R_d(·) = [A(·),B(·),C(·),D(·)] where det A(k) ≠ 0 for all k. Consider the backward evolution of system R_d(·), i.e. consider the solution map

(x(k+1),u(k)) ↦ (x(k),y(k))

described by

31 x(k) = A(k)⁻¹x(k+1) − A(k)⁻¹B(k)u(k),

32 y(k) = C(k)A(k)⁻¹x(k+1) + [D(k) − C(k)A(k)⁻¹B(k)]u(k).

Note that given R_d(·), for any state x₁ at k₁+1 and any input u_{[k₀,k₁]} there corresponds a unique output y_{[k₀,k₁]}.

33 Thus the backward observability problem, i.e. the reconstructibility problem on [k₀,k₁], is: given the control-sequence u_{[k₀,k₁]} and the output-sequence y_{[k₀,k₁]}, find the state x₁ = x(k₁+1).
Since R_d(·) is linear, the contribution due to u_{[k₀,k₁]} in y_{[k₀,k₁]} is known and additive. Hence by subtracting this contribution, the reconstructibility problem is reduced to: given the z-i response y_{[k₀,k₁]} (defined by (31)-(32)) due to x₁ = x(k₁+1), i.e.

34 y(k) = C(k)Φ(k,k₁+1)x₁ for k ∈ [k₀,k₁],

find x₁ = x(k₁+1). Equation (34) suggests to define the reconstructibility map on [k₀,k₁] by

35 L_rec(k₀,k₁): ℂⁿ → Y_d(k₀,k₁): x₁ ↦ (C(k)Φ(k,k₁+1)x₁)_{k∈[k₀,k₁]}.

36 Moreover, we say that the state x₁ ∈ ℂⁿ of R_d(·) or the pair (C(·),A(·)) is unreconstructible on [k₀,k₁] iff the z-i response (34) is zero on [k₀,k₁], and we say that R_d(·) or the pair (C(·),A(·)) is reconstructible on [k₀,k₁] iff the zero state is the only state that is unreconstructible on [k₀,k₁].
By the map L_rec(k₀,k₁) and definition (36) we have: let x₁ be any state of R_d(·) with det A(k) ≠ 0 for all k; then

37 x₁ is unreconstructible on [k₀,k₁] ⇔ x₁ ∈ N[L_rec(k₀,k₁)] = N[W_rec(k₀,k₁)], where

38 W_rec(k₀,k₁) = Σ_{k'=k₀}^{k₁} Φ(k',k₁+1)*C(k')*C(k')Φ(k',k₁+1).

8d.4. Duality
If we compare Theorems (8d.2.36) and (8d.3.12) it becomes clear that they are closely related. In fact, they are related by duality. Throughout this section we assume det A(k) ≠ 0 ∀k for reasons of simplicity.
Consider a discrete-time linear time-varying representation R_d(·) = [A(·),B(·),C(·),D(·)], i.e.

1 x(k+1) = A(k)x(k) + B(k)u(k)

2 y(k) = C(k)x(k) + D(k)u(k)

where as usual x(k) ∈ ℂⁿ, u(k) ∈ ℂ^{n_i}, y(k) ∈ ℂ^{n_o}, and A(·),B(·),C(·),D(·) are compatible matrix-sequences. As before, (k,k₀) ↦ Φ(k,k₀) is the transition matrix.
As in (2d.120), the reverse-time dual-system representation

4 R̃_d(·) = [A*(·), C*(·), B*(·), D*(·)]

is described by

5 x̃(k) = A(k)*x̃(k+1) + C(k)*ũ(k+1)

6 ỹ(k+1) = B(k)*x̃(k+1) + D(k)*ũ(k+1).

The main purpose of this section is to relate in a convenient way controllability to zero and observability. As a consequence, we must be able to run R̃_d(·) forward. Hence, as already announced, we assume that

7 det A(k) ≠ 0 ∀k,

whence [Φ(k,k')]⁻¹ = Φ(k',k) exists ∀k' and k. Moreover, we shall denote by Ũ_d the appropriate space of input-sequences. Calculations based on (5) and (6) show that for any state x̃(k₀) ∈ ℂⁿ at time k₀, for any k > k₀, and for any ũ(·) ∈ Ũ_d, the state and output of R̃_d(·) are given by

8 x̃(k) = Φ(k₀,k)*x̃(k₀) − Σ_{k'=k₀}^{k−1} Φ(k',k)*C(k')*ũ(k'+1),

9 ỹ(k+1) = B(k)*Φ(k₀,k+1)*x̃(k₀) − B(k)* Σ_{k'=k₀}^{k} Φ(k',k+1)*C(k')*ũ(k'+1) + D(k)*ũ(k+1).

Note that by the pairing Lemma (2d.1.126), given R_d(·) and R̃_d(·), ∀k₀,k with k₀ < k, ∀(x(k₀),u(·)) ∈ ℂⁿ×U_d, and ∀(x̃(k),ũ(·)) ∈ ℂⁿ×Ũ_d,

10 ⟨x̃(k₀),x(k₀)⟩ + Σ_{k'=k₀}^{k−1} ⟨ỹ(k'+1),u(k')⟩ = ⟨x̃(k),x(k)⟩ + Σ_{k'=k₀}^{k−1} ⟨ũ(k'+1),y(k')⟩.

11 The pairing (10) shows that the input, state and output of R̃_d(·) are the sequences k ↦ ũ(k+1), k ↦ x̃(k) and k ↦ ỹ(k+1), resp. This motivates the choice of formulas (8) and (9) for describing the state and output trajectories of R̃_d(·). Moreover, using the convention (11) and Eqs. (8) and (9), on any [k₀,k₁], the controllability to zero map (8d.2.34) and the observability map (8d.3.4) of R̃_d(·) are given by, resp.,

12 −L̃_c(k₀,k₁)ũ = Σ_{k'=k₀}^{k₁−1} Φ(k',k₀)*C(k')*ũ(k'+1)

and

13 L̃_o(k₀,k₁−1)x̃₀ = (B(k')*Φ(k₀,k'+1)*x̃₀)_{k'∈[k₀,k₁−1]}.

By the definition of the adjoint (Section A.7.2), we have therefore

14 L̃_c(k₀,k₁) = −L_o(k₀,k₁−1)*

and

15 L̃_o(k₀,k₁−1) = L_c(k₀,k₁)*,

where L_o and L_c are the observability and controllability to zero maps, (8d.3.4) and (8d.2.34), of the system R_d(·), (1)-(2).
Hence Theorem (8.4.10) reads in the discrete-time case as follows.

20 Theorem [Duality: controllability to zero versus unobservability]. Assume that (7) holds.
i) The subspace of all states of R_d(·) that are controllable to zero on [k₀,k₁] is the orthogonal complement of the subspace of all states of its dual R̃_d(·) that are unobservable on [k₀,k₁−1]. In terms of the maps (8d.2.34) and (13), that is,

21a R(L_c(k₀,k₁)) = [N(L̃_o(k₀,k₁−1))]⊥.

ii) The subspace of all states of R̃_d(·) that are unobservable on [k₀,k₁−1] is the orthogonal complement of the subspace of all states of its dual R_d(·) that are controllable to zero on [k₀,k₁]. In terms of the maps (13) and (8d.2.34), that is,

21b N(L̃_o(k₀,k₁−1)) = [R(L_c(k₀,k₁))]⊥.

iii) In terms of grammians, (21a) and (21b) translate into

22a R[W_c(k₀,k₁)] = [N(M̃_o(k₀,k₁−1))]⊥

and

22b N[M̃_o(k₀,k₁−1)] = [R(W_c(k₀,k₁))]⊥,

where

23a W_c(k₀,k₁) = M̃_o(k₀,k₁−1) = Σ_{k'=k₀}^{k₁−1} Φ(k₀,k'+1)B(k')B(k')*Φ(k₀,k'+1)*,

23b M_o(k₀,k₁−1) = W̃_c(k₀,k₁) = Σ_{k'=k₀}^{k₁−1} Φ(k',k₀)*C(k')*C(k')Φ(k',k₀)

(W_c, M_o and W̃_c, M̃_o are the controllability to zero and observability grammians of R_d(·) and R̃_d(·), resp.).

25 Corollary. Assume that (7) holds.

a) The representation R_d(·) is controllable to zero on [k₀,k₁]

⇔ the representation R̃_d(·) is observable on [k₀,k₁−1].

b) The representation R_d(·) is observable on [k₀,k₁−1]

⇔ the representation R̃_d(·) is controllable (to zero) on [k₀,k₁].

25a Comment. In other words, "observability on [k₀,k₁−1]" and "controllability on [k₀,k₁]" (and vice-versa) are related by duality. The mismatch k₁−1 versus k₁ is due to the nature of discrete-time systems, which produce at time k the output and the next state: compare (2) with (1).

26 Exercise. Prove Theorem (20). [Hints: use (14) and (15); note that W_c(k₀,k₁) is a matrix representation of L_c L_c*, etc.]

26a Exercise. Let R̃_d(·) be the dual system representation (5)-(6), where det A(k) ≠ 0 for all k, with state x̃(k), input ũ(k+1) and output ỹ(k+1). a) Show that the forward representation of R̃_d(·) is given by

27 x̃(k+1) = A(k)*⁻¹x̃(k) − A(k)*⁻¹C(k)*ũ(k+1)

28 ỹ(k+1) = B(k)*A(k)*⁻¹x̃(k) + [D(k)* − B(k)*A(k)*⁻¹C(k)*]ũ(k+1).

b) Show that under the same conventions as in a) the dual of the dual representation R̃_d(·), denoted by R̃̃_d(·), is given by

29 x(k+1) = A(k)x(k) − B(k)u(k+2)

30 y(k+2) = −C(k)x(k) + D(k)u(k+2).

31 Comment. Under the conventions (11) the forward versions of R̃_d(·) and R̃̃_d(·) read as follows:

32 R̃_d(·) = [A(·)*⁻¹, −A(·)*⁻¹C(·)*, B(·)*A(·)*⁻¹, D(·)* − B(·)*A(·)*⁻¹C(·)*],

33 R̃̃_d(·) = [A(·), −B(·), −C(·), D(·)].

Hence a) the dual of the dual equals R_d(·) modulo a change of sign of the state, and b) Corollary (25) can be made to read

34 Corollary. Assume that (7) holds:

a) the pair (A(·),B(·)) is controllable (to zero) on [k₀,k₁]

⇔ the pair (B(·)*A(·)*⁻¹, A(·)*⁻¹) is observable on [k₀,k₁−1];

b) the pair (C(·),A(·)) is observable on [k₀,k₁−1]

⇔ the pair (A(·)*⁻¹, −A(·)*⁻¹C(·)*) is controllable (to zero) on [k₀,k₁]. ∎

35 Comment. Under the convention (11) and assumption (7):
a) the state x̃₀ of R̃_d(·) is controllable to zero on [k₀,k₁]

⇔ the state x̃₀ of the (forward) pair (A(·)*⁻¹, −A(·)*⁻¹C(·)*) is controllable to zero on [k₀,k₁];

b) the state x̃₀ of R̃_d(·) is unobservable on [k₀,k₁−1]

⇔ the state x̃₀ of the (forward) pair (B(·)*A(·)*⁻¹, A(·)*⁻¹) is unobservable on [k₀,k₁−1].

Reachability Versus Reconstructibility

40 Reconstructibility of R̃_d. Observe that Eqs. (5) and (6) are the backward equations of R̃_d(·) reflecting the evolution map (x̃(k+1),ũ(k+1)) ↦ (x̃(k),ỹ(k+1)), where as in (11) ũ(k+1) and ỹ(k+1) are the control and output, resp., at time k. Hence, by definition (8d.3.36),

41 the state x̃₁ = x̃(k₁) of R̃_d(·) is unreconstructible on [k₀,k₁−1] iff B(k)*Φ(k₁,k+1)*x̃₁ = θ for all k ∈ [k₀,k₁−1].

Analogous to Theorem (8.4.25) we have the following theorem.

45 Theorem [Reachability versus reconstructibility]. Let assumption (7) hold. Then,
a) the subspace of all states x₁ of R_d(·) that are reachable on [k₀,k₁] is the orthogonal complement of the subspace of all states x̃₁ of R̃_d(·) that are unreconstructible on [k₀,k₁−1]; more precisely,

R[L_r(k₀,k₁)] = [N(L̃_rec(k₀,k₁−1))]⊥,

where L_r(k₀,k₁) is the reachability map (8d.2.4) and L̃_rec(k₀,k₁−1) is the reconstructibility map of R̃_d(·) on [k₀,k₁−1] given by

L̃_rec(k₀,k₁−1): ℂⁿ → Ỹ_d(k₀,k₁−1): x̃₁ ↦ (B(k)*Φ(k₁,k+1)*x̃₁)_{k∈[k₀,k₁−1]};

b) R_d(·) or the (forward) pair (A(·),B(·)) is reachable on [k₀,k₁]

⇔ R̃_d(·) or the (forward) pair (B(·)*A(·)*⁻¹, A(·)*⁻¹) is reconstructible on [k₀,k₁−1];

c) R̃_d(·) or the (forward) pair (A(·)*⁻¹, −A(·)*⁻¹C(·)*) is reachable on [k₀,k₁]

⇔ R_d(·) or the (forward) pair (C(·),A(·)) is reconstructible on [k₀,k₁−1]. ∎

46 Exercise. Prove Theorem (45).

8d.5. Linear Time-Invariant Systems

We consider now linear time-invariant systems R_d = [A,B,C,D] where A,B,C, and D are constant matrices with elements in ℝ or ℂ.
Because of the time-invariance the initial time k₀ is set equal to zero, and by the Cayley-Hamilton Theorem and A-invariance the properties of reachability and controllability to zero can be studied w.l.g. on the integer time-interval [0,n] (where n is the dimension of the state space ℂⁿ). Similarly observability can be studied on [0,n−1], as will be done later.
As expected, two matrices play a crucial role:

1 the reachability matrix C := [B : AB : ⋯ : A^{n−1}B],

2 the observability matrix O := [C; CA; ⋯; CA^{n−1}] (rows stacked).

3 Comments. α) The matrices (1) and (2) are obtained from the reachability matrix (8d.2.4a) (for k₀ = 0 and k₁ = n) and from the observability matrix (8d.3.4a), resp. (for k₀ = 0 and k₁ = n−1). For matrix (1) we use the notation C because controllability is equivalent to reachability on [0,n] by Corollary (8d.2.11a).
β) The subspaces R[C] and N[O] are A-invariant by Lemma (8.5.3).
In this section we shall use the following tool.

4 Lemma. Let A ∈ ℂⁿ×ⁿ be nonsingular. Let V be an A-invariant subspace of ℂⁿ. Then
i) V is A⁻¹-invariant;
ii) if x ∈ V then ∀k ∈ ℤ, A^k x ∈ V.

Proof. i) Since by assumption det A ≠ 0, by Theorem (4.5.30) there exists a polynomial p(·) s.t. A⁻¹ = p(A). Since V is A-invariant, ∀x ∈ V, ∀k ∈ ℕ, A^k x ∈ V. Since V is a linear subspace, ∀x ∈ V, A⁻¹x = p(A)x ∈ V.
ii) Since V is A-invariant, ∀x ∈ V and ∀k ∈ ℕ, A^k x ∈ V. Since V is A⁻¹-invariant, ∀x ∈ V and ∀(−k) ∈ ℕ, A^k x ∈ V. ∎

A first application of Lemma (4) shows the following.

5 Corollary. Consider a time-invariant representation R_d = [A,B,C,D], where A is nonsingular. Consider the [0,n] controllability to zero matrix C₀ obtained from (8d.2.34a) and the [0,n−1] reconstructibility matrix O₁ obtained from (8d.3.35), viz.

6 C₀ = [A⁻ⁿB : A⁻ⁿ⁺¹B : ⋯ : A⁻¹B],

and

7 O₁ = [CA⁻ⁿ; CA⁻ⁿ⁺¹; ⋯; CA⁻¹] (rows stacked).

Then

8 (i) R[C] = R[C₀],

9 (ii) N[O] = N[O₁].

10 Forward comments. i) will mean that for a pair (A,B) with A nonsingular, the subspace of reachable states is identical to the subspace of states that are controllable to zero; moreover, rk[C] = n will mean that such a pair (A,B) is simultaneously reachable and controllable to zero. See Comments (80γ) and (85β) below.
ii) can be shown to mean that for a pair (C,A) with A nonsingular, the subspace of states that are unobservable is identical to the subspace of states that are unreconstructible; moreover, rk[O] = n will mean that such a pair (C,A) is simultaneously observable and reconstructible. Compare with the conclusions of Theorem (14) below.

Proof of Corollary (5). (i) By (1) and (6), C₀ = A⁻ⁿC. Now since A is nonsingular, by Lemma (4) the A-invariant subspace R[C] is also A⁻¹-invariant. Hence R[C₀] ⊂ R[C]. Moreover, with A nonsingular, R[C₀] is both A⁻¹- and A-invariant. Hence R[C₀] ⊃ R[C]. Thus R[C] = R[C₀]. (ii) follows similarly. Note that O₁ = OA⁻ⁿ and that, with A nonsingular, both N[O] and N[O₁] are A- and A⁻¹-invariant. ∎

The proof of the results below emphasizes only important steps.

8d.5.1. Observability of the pair (C,A)

Because of the Cayley-Hamilton Theorem (3.2.24), a state x₀ of R_d is unobservable on [0,k] for all k ∈ ℕ iff the state x₀ is unobservable on [0,n−1]. Hence

11 we define a state x₀ of R_d or the pair (C,A) to be unobservable iff x₀ is unobservable on [0,n−1], and we say

12 that R_d or the pair (C,A) is observable iff the zero state is the only state of R_d or of (C,A) that is unobservable.
Thus

13 x₀ ∈ ℂⁿ is unobservable ⇔ Ox₀ = θ ⇔ CA^kx₀ = θ, ∀k = 0,1,2,…,n−1.

Hence we have the analog of Theorem (8.5.9).

14 Theorem [Observability properties of the pair (C,A)]. Given the matrix pair (C,A),

15 i) the set of all unobservable states is the A-invariant subspace N(O) ⊂ ℂⁿ;

16 ii) the pair (C,A) is observable ⇔ rk O = n

17 ⇔ ∀λ ∈ σ(A), rk [λI−A; C] = n (the matrix obtained by stacking λI−A on top of C).

iii) For C and A real, for any monic real polynomial π of degree n, there exists L ∈ ℝ^{n×n_o} s.t.

18 χ_{A+LC} = π.

iv) Let, in addition, σ(A) ⊂ D(0,1); the pair (C,A) is observable if and only if M_o, the unique solution of

19 W = A*WA + C*C,

is positive-definite.

Proof of Theorem (14), part (iv). By Exercise (7d.2.52), the map W ↦ W − A*WA is bijective because σ(A) ⊂ D(0,1). Hence (as in the proof of Lemma (7d.2.54)) Eq. (19) has a unique solution

20 M_o = Σ_{k=0}^∞ (A*)^k C*C A^k

that is Hermitian positive semi-definite. Note, for any x₀ ∈ ℂⁿ, with O the observability matrix (2),

x₀*M_o x₀ = 0 ⇔ CA^kx₀ = θ ∀k ∈ ℕ ⇔ Ox₀ = θ

(where the last equivalence follows by Cayley-Hamilton). Hence M_o is positive-definite iff (C,A) is observable. ∎
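Numerically, part (iv) gives an observability test via one discrete Lyapunov solve; a sketch (ours, assuming SciPy, with hypothetical data):

    import numpy as np
    from scipy.linalg import solve_discrete_lyapunov

    # Hypothetical data with sigma(A) inside the unit disc D(0,1).
    A = np.array([[0.5, 1.0], [0.0, 0.3]])
    C = np.array([[1.0, 0.0]])

    # (19): W = A* W A + C* C, solved as X = A^H X A + Q with Q = C^H C.
    Mo = solve_discrete_lyapunov(A.conj().T, C.conj().T @ C)
    assert np.allclose(Mo, A.conj().T @ Mo @ A + C.conj().T @ C)

    # (C,A) is observable iff Mo is positive-definite:
    observable = bool(np.all(np.linalg.eigvalsh(Mo) > 1e-12))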

8d.5.2. Reachability and controllability of the pair (A,B)

Consider a discrete-time time-invariant system R_d = [A,B,C,D] where A,B,C and D are constant matrices, and let C be the reachability matrix (1). The most important results of this section are Theorems (66) and (75), using normalized definitions of reachability and controllability (to zero) on a standard interval [0,n] with A possibly singular: see Definitions (31) and (60) below. The effort for this normalization is nontrivial and reflected in the three normalization Lemmas (30), (55) and (58). Since A may be singular, Lemmas (55) and (58) require the decomposition (44) of the state space ℂⁿ into reversible states and deadbeat states, which generates useful decompositions of the r.e. x(k+1) = Ax(k) and of the reachable subspace R[C]; these are indispensable for understanding the normalization and the statements and proofs of the main Theorems (66) and (75) below. We start by normalizing reachability.

30 Lemma [Normalization of reachability]. Given R_d = [A,B,C,D]. Then

the state x₁ ∈ ℂⁿ is reachable on the interval [0,k] for some k > 0
⇔ x₁ is reachable on [0,n].

Proof. ⇒: If k < n, the proof is immediate; hence consider only k ≥ n. Let x₁ be reachable on [0,k] with k ≥ n, i.e. there exists a control u_{[0,k−1]} s.t.

x₁ = Σ_{k'=0}^{k−1} A^{k−k'−1}Bu(k').

By the Cayley-Hamilton theorem (3.2.24) we know that, for all k' ≥ n, A^{k'} is a linear combination of A^i, i = 0,1,…,n−1. Hence there exists a control v_{[0,n−1]} s.t. x₁ = Σ_{k'=0}^{n−1} A^{n−k'−1}Bv(k'). Hence x₁ is reachable on [0,n]. ⇐: obvious. ∎

Motivated by Lemma (30) we have the following normalized definitions.

31 Given R_d = [A,B,C,D], we say that a state x₁ ∈ ℂⁿ of R_d or the pair (A,B) is reachable iff x₁ is reachable on [0,n] (i.e. in at most n steps); we say that the pair (A,B) is reachable iff every state x₁ ∈ ℂⁿ is reachable.
Thus, given R_d = [A,B,C,D] or the pair (A,B), we have by (8d.2.11):

33 a state x₁ is reachable

⇔ x₁ = Σ_{k'=0}^{n−1} A^{n−k'−1}Bu(k') for some control u_{[0,n−1]}

⇔ x₁ ∈ R[C]

(where C is the reachability matrix given by (1)). Thus

34 R[C] = {subspace of all (A,B)-reachable states}.

For the analysis of controllability to zero we need a new decomposition.

40 Decomposition: reversible and deadbeat states. Consider any constant matrix pair (A,B), where A ∈ ℂⁿ×ⁿ and B ∈ ℂ^{n×n_i}, and the corresponding state equation

41 x(k+1) = Ax(k) + Bu(k).

By (4.3.1) the state space can be decomposed into A-invariant algebraic eigenspaces, i.e.

42 ℂⁿ = N₁ ⊕ N₂ ⊕ ⋯ ⊕ N_σ, where N_k := N[(A−λ_kI)^{m_k}].

43 Let N_r be the direct sum of the N_k's corresponding to all the nonzero eigenvalues and let N_d be the algebraic eigenspace corresponding to the (possible) eigenvalue at 0. Thus by (42) ℂⁿ is uniquely decomposed into

44 ℂⁿ = N_r ⊕ N_d,

where the fixed A-invariant subspaces N_r and N_d are called the reversible subspace and the deadbeat subspace, resp., of ℂⁿ. States that belong to N_r and N_d are called reversible and deadbeat states, resp., and denoted by x_r and x_d, resp. Thus, by (44), every state x ∈ ℂⁿ is uniquely decomposed into

45 x = x_r + x_d,

where x_r ∈ N_r is the reversible projection of x and x_d ∈ N_d is the deadbeat projection of x.
The facts below condense useful analysis tools. We shall freely denote by the same symbol a map and its matrix representation.
46 Facts [Decomposition of the r.e. x(k+1) = Ax(k)]. Let A ∈ ℂⁿ×ⁿ and denote by A_r and A_d the restriction of the map A to N_r and N_d, resp.; more precisely,

47 A_r : N_r → N_r : x_r ↦ A_r x_r := Ax_r

and

48 A_d : N_d → N_d : x_d ↦ A_d x_d := Ax_d.

U.t.c.
a) The map A_r, (47), is invertible.
(Since every eigenvalue of A_r is nonzero by (43).)
b) The r.e. x_r(k+1) = A_r x_r(k) on N_r is time-reversible, i.e. ∀x_r(0) ∈ N_r the r.e. has a unique solution x_r(k) = A_r^k x_r(0) ∈ N_r for k ≥ 0 and for k ≤ 0.
(This justifies the name reversible state for x_r(0); note that because A_r: N_r → N_r is invertible, x_r(k) ∈ N_r for all k ∈ ℤ.)
c) The map A_d, (48), is nilpotent with index m_d ≤ n (i.e. the smallest nonnegative integer k s.t. A_d^k = 0 is m_d).
(This follows because, by (43), σ(A_d) = {0}.)
d) The r.e. x_d(k+1) = A_d x_d(k) on N_d has a unique solution x_d(k) = A_d^k x_d(0) on k ≥ 0 such that for every x_d(0) ∈ N_d, x_d(n) = x_d(m_d) = θ.
(This justifies the name deadbeat state for x_d(0).)
e) For every x(0) ∈ ℂⁿ, decomposition (45) applied to x(0) induces a unique decomposition of the solution of the r.e. x(k+1) = Ax(k) on k ≥ 0, viz.

x(k) = A^k x(0) = A_r^k x_r(0) + A_d^k x_d(0) for all k ≥ 0;

in particular x(k) = A_r^k x_r(0) for all k ≥ m_d. ∎
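A toy illustration of Facts (46c)-(46e) (our example, not the text's): with A block-diagonal, the deadbeat component of any initial state vanishes after m_d steps while the reversible component persists.

    import numpy as np

    Ar = np.array([[0.5]])                    # reversible block: 0 not an eigenvalue
    Ad = np.array([[0.0, 1.0], [0.0, 0.0]])   # deadbeat block: Ad^2 = 0, so m_d = 2
    A = np.block([[Ar, np.zeros((1, 2))],
                  [np.zeros((2, 1)), Ad]])

    x = np.array([1.0, 1.0, 1.0])             # x = x_r + x_d as in (45)
    for _ in range(2):                        # iterate the r.e. x(k+1) = A x(k)
        x = A @ x
    assert np.allclose(x[1:], 0.0)            # deadbeat part gone at k = m_d = 2
    assert not np.isclose(x[0], 0.0)          # reversible part never dies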



Decomposition (44) decomposes also the reachable subspace R[C] (see (34)).

50 Facts [Decomposition of R[C]]. Let R[C] be the reachable subspace of any constant matrix pair (A,B). Then

51 R[C] = (R[C] ∩ N_r) ⊕ (R[C] ∩ N_d),

where
a) R[C] ∩ N_r is both an A_r-invariant and an A_r⁻¹-invariant subspace of N_r (R[C] ∩ N_r is the subspace of all reversible states that are reachable),
b) R[C] ∩ N_d is an A_d-invariant subspace of N_d (R[C] ∩ N_d is the subspace of all deadbeat states that are reachable). ∎

[Hints: (51) holds because the A-invariant subspace R[C] is spanned by generalized eigenvectors, (4.3.19); for a) and b) use (47) and (48); for a) use also Fact (46a) and Lemma (4).]

A more familiar formula for R[C] ∩ N_r is as follows.

53 Exercise. Consider a constant pair (A,B) and decomposition (44), where the dimension of N_r is n_r. Consider A_r as in (47). Use decomposition (45) to decompose each column of B ∈ ℂ^{n×n_i}, to obtain

B = B_r + B_d

such that R(B_r) ⊂ N_r and R(B_d) ⊂ N_d. Show that

R[C] ∩ N_r = R[B_r : A_rB_r : ⋯ : A_r^{n_r−1}B_r].

[Hint: note that the projection of R[C] onto N_r along N_d is precisely R[C] ∩ N_r.]

54 Comment. One could call (A_r,B_r) the reversible part of the pair (A,B).
For controllability to zero we now have, by definition (8d.2.11) and decomposition (44), the following.

55 Lemma [Normalization of controllability to zero]. Given R_d = [A,B,C,D]. Then

the state x₀ ∈ ℂⁿ is controllable to zero on [0,k] for some k > 0

⇔

x₀ is controllable to zero on [0,n].

Proof. Let k < n: bring x₀ to zero in k steps; with zero inputs from k on, the state will forever stay at zero. Let k ≥ n. Let C be the reachability matrix (1) and decompose x₀ according to (45) into x₀ = x_{0r} + x_{0d}. Observe that by Fact (46e) A^k x₀ = A_r^k x_{0r} for all k ≥ m_d. Hence we have

x₀ is controllable to zero on [0,k] for some k ≥ n

⇒ −A^k x₀ = Σ_{k'=0}^{k−1} A^{k−k'−1}Bu(k') for some control u_{[0,k−1]} (by definition)

⇒ A^k x₀ ∈ R[C] (by Cayley-Hamilton)

⇒ A_r^k x_{0r} ∈ R[C] ∩ N_r (by Fact (46e), since k ≥ m_d)

⇒ A_r^n x_{0r} ∈ R[C] ∩ N_r (by Fact (50a): R[C] ∩ N_r is A_r⁻¹-invariant)

⇒ Aⁿx₀ ∈ R[C] (by Fact (46e))

⇒ x₀ is controllable to zero on [0,n] (by definition). ∎

We have also the following, due to Lemmas (30) and (55).

58 Lemma [Normalization of controllability]. Given R_d = [A,B,C,D]. Then

the pair (A,B) is controllable on [0,k] for some k > 0
⇔ the pair (A,B) is controllable on [0,n].

Proof. ⇒: By Corollary (8d.2.11a), on a given integer time-set [0,k] with k > 0, controllability of (A,B) implies reachability and controllability to zero. By Lemmas (30) and (55) the latter imply reachability on [0,n] and controllability to zero on [0,n]. Hence, by definition, for any state x₁ at n and any state x₀ at 0 there exist controls u_{[0,n−1]} and v_{[0,n−1]} such that

x₁ = Σ_{k'=0}^{n−1} A^{n−k'−1}Bu(k')

and

−Aⁿx₀ = Σ_{k'=0}^{n−1} A^{n−k'−1}Bv(k').

Therefore, by addition,

x₁ − Aⁿx₀ = Σ_{k'=0}^{n−1} A^{n−k'−1}B(u(k')+v(k')).

In other words, given any state x₁ at n and any state x₀ at 0 there exists a control (u+v)_{[0,n−1]} that steers the phase (x₀,0) to the phase (x₁,n). Thus by definition (8d.2.1) the pair (A,B) is controllable on [0,n]. ∎

Lemmas (55) and (58) justify the following normalized definitions.

60 Given a time-invariant system representation R_d = [A,B,C,D], we say that the state x₀ of R_d or (A,B) is controllable to zero iff that state is controllable to zero on [0,n]; we say that R_d or the pair (A,B) is controllable to zero iff every state x₀ is controllable to zero; we say that R_d or the pair (A,B) is controllable iff (A,B) is controllable on [0,n].
Thus, with C the reachability matrix (1),

61 x₀ is controllable to zero ⇔ Aⁿx₀ ∈ R[C].

Moreover by Corollary (8d.2.11a) and the definitions

62 (A,B) is controllable

⇔ (A,B) is reachable ⇔ rk[C] = n

⇒ (A,B) is controllable to zero.

Now that we have normalized and are using the definitions (31) and (60), we have, in analogy with (8.5.36), the following.

66 Theorem [Controllability and reachability properties of a pair (A,B)]. Given R_d = [A,B,C,D] with reachability matrix C given in (1), and given the definitions (31) and (60),

i) the pair (A,B) is controllable

⇔ the pair (A,B) is reachable.

67 ii) The reachable subspace of the pair (A,B) is the A-invariant subspace R[C].

iii) The pair (A,B) is reachable

68 ⇔ rk[C] = n

69 ⇔ ∀λ ∈ σ(A), rk[λI−A : B] = n.

iv) For A and B real, for any monic real polynomial π of degree n, there exists F ∈ ℝ^{n_i×n} such that

70 χ_{A+BF} = π

if and only if the pair (A,B) is reachable.

(v) Let σ(A) ⊂ D(0,1); the pair (A,B) is reachable if and only if W_r, the unique solution of the equation

71 W = AWA* + BB*,

is positive-definite.

72 Comments. α) Deadbeat closed-loop systems. If in iv) π(λ) = λⁿ, then the resulting closed-loop r.e. x(k+1) = (A+BF)x(k) is s.t. σ(A+BF) = {0}, i.e. every closed-loop state x(0) ∈ ℂⁿ is a deadbeat state by decomposition (40): every closed-loop state-trajectory k ↦ x(k) is zero for k ≥ n.
β) Controllability to zero: it turns out that condition (69) holds for every λ ∈ σ(A)\{0} if and only if the pair (A,B) is controllable to zero (see Theorem (75) below).
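For a reachable single-input pair, a deadbeat F as in α) can be computed by Ackermann's formula with π(λ) = λⁿ, i.e. F = −eₙᵀC⁻¹Aⁿ. The sketch below is ours (the formula is standard but is not developed in the text).

    import numpy as np

    def deadbeat_gain(A, b):
        """F with sigma(A + bF) = {0}; assumes (A,b) single-input and reachable."""
        n = A.shape[0]
        Ctrb = np.hstack([np.linalg.matrix_power(A, k) @ b for k in range(n)])
        en = np.zeros((1, n)); en[0, -1] = 1.0
        return -en @ np.linalg.solve(Ctrb, np.linalg.matrix_power(A, n))

    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    b = np.array([[0.0], [1.0]])
    F = deadbeat_gain(A, b)
    # Every closed-loop trajectory is zero after n = 2 steps:
    assert np.allclose(np.linalg.matrix_power(A + b @ F, 2), 0.0)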

Proof of Theorem (66), part (v). As in the proof of Theorem (14), part (iv), the unique solution W_r of (71) reads

W_r = Σ_{k=0}^∞ A^k BB*(A*)^k.

Hence, by Cayley-Hamilton,

W_r > 0 ⇔ rk[C] = n ⇔ (A,B) is reachable. ∎

75 Theorem [Controllability to zero]. Consider a time-invariant system representation R_d = [A,B,C,D] where the state space ℂⁿ is additively decomposed into its reversible subspace N_r and its deadbeat subspace N_d (see (40)). Let C be the reachability matrix (1).
U.t.c.
i) The set of all states of the pair (A,B) that are controllable to zero is the A-invariant subspace

76 (R[C] ∩ N_r) ⊕ N_d.

ii)

77 R[C] ∩ N_r = {subspace of reversible states that are controllable to zero}
= {subspace of reversible states that are reachable}.

iii) The pair (A,B) is controllable to zero

78 ⇔ N_r ⊂ R[C]

79 ⇔ ∀λ ∈ σ(A)\{0}, rk[λI−A : B] = n.

80 Comments. α) (78) means that (A,B) is controllable to zero iff every reversible state is reachable.
β) (77) means that if x ∈ N_r (i.e. x is reversible), then x is controllable to zero iff x is reachable.
γ) If A is nonsingular then N_r = ℂⁿ and (77) reduces to the equality of the subspace of reachable states and the subspace of states that are controllable to zero; moreover, (78) reduces to rk[C] = n. Compare with Eq. (8) and Comment (10).
Proof of Theorem (75). (i) We have the following equivalences:

x₀ is controllable to zero

⇔ Aⁿx₀ ∈ R[C] (by (61))

⇔ A_rⁿx_{0r} ∈ R[C] ∩ N_r (decompose x₀ using (45) and use Fact (46e): Aⁿx₀ = A_rⁿx_{0r} since n ≥ m_d)

⇔ x_{0r} ∈ R[C] ∩ N_r (by Fact (50a): R[C] ∩ N_r is A_r- and A_r⁻¹-invariant)

⇔ x₀ ∈ (R[C] ∩ N_r) ⊕ N_d (by the unique decomposition (45)).

(ii) The equalities (77) follow by comparing the direct sum decompositions in (51) and (76).
(iii) Equivalence (78) follows by (76) and ℂⁿ = N_r ⊕ N_d. In view of (43), equivalence (79) follows from condition (78), which is equivalent to

∀λ_k ∈ σ(A)\{0}, N_k ⊂ R[C]

(recall also the equivalence (8.7.23) ⇔ (8.7.24)). ∎

81 Exercise [Controllability (= reachability) versus controllability to zero]. Consider a time-invariant system representation R_d = [A,B,C,D] where the state space ℂⁿ is additively decomposed into its reversible subspace N_r and its deadbeat subspace N_d (see (40)). Let C be the reachability matrix (1). Show that the condition

82 N_d ⊂ R[C], equivalently rk[A : B] = n,

is necessary and sufficient for
a) the equality of the subspace of states that are controllable to zero and the subspace of states that are reachable; more precisely,

83 (R[C] ∩ N_r) ⊕ N_d = R[C],

and b) the equivalence of the following:

the pair (A,B) is controllable to zero

84 ⇔ the pair (A,B) is reachable

⇔ the pair (A,B) is controllable.



85 Comments. α) (82) means that every deadbeat state must be reachable.
β) If A is nonsingular then N_d = {θ} and (82) holds; hence conclusions a) and b) hold. Compare with Comment (10).
γ) Condition (82) allows "B to help A" (see for example Exercise (8d.2.9)).
The following note places some familiar results in their appropriate context.

90 Duality of reversible systems (i.e. det A ≠ 0). Consider a discrete-time time-invariant system representation R_d = [A,B,C,D] where A is nonsingular. Consider the reachability matrix C and the observability matrix O given by (1) and (2).
By Corollary (5) and Exercise (81), the reachable subspace of (A,B) and the subspace of states of (A,B) that are controllable to zero are both equal to R[C]. Hence we are justified

91 to call R[C] the controllable subspace of (A,B) (without further specification).

Similarly, by Corollary (5) the unobservable subspace N(O) of (C,A) defines both the unobservable and the unreconstructible subspace of (C,A).
Note that, when det A ≠ 0, the duality relations (8d.4.45) and (8d.4.20) (with k₀ = 0 and k₁ = n) result in a) and b) below, viz.:
a) (R[C])⊥ = N[O₁],
where O₁ is the reconstructibility matrix (7) of the (forward) pair (B*A*⁻¹, A*⁻¹) of the dual system R̃_d. Moreover,

92 N[O₁] = N[C*]

is easily recognized as the unobservable subspace of the pair (B*,A*). Hence we have by a) and (92)

93 (controllable subspace of (A,B))⊥ = (unobservable subspace of (B*,A*)).

b) (N[O])⊥ = R[C₀],
where C₀ is the controllability to zero matrix (6) of the (forward) pair (A*⁻¹, −A*⁻¹C*) of the dual system R̃_d. Moreover,

94 R[C₀] = R[O*]

is easily recognized as the controllable subspace of the pair (A*,C*). Hence we have by b) and (94)

95 (unobservable subspace of the pair (C,A))⊥ = (controllable subspace of the pair (A*,C*)).

96 Comment. In (92) we have exchanged the forward pair (B*A*⁻¹, A*⁻¹) of R̃_d for its backward pair (B*,A*). Similarly in (94) we exchanged the forward pair (A*⁻¹, −A*⁻¹C*) for its backward pair (A*,C*). Compare the forward representation of R̃_d, (8d.4.27)-(8d.4.28), with its backward representation, (8d.4.5)-(8d.4.6). Another benefit of Corollary (5) is the ability to exchange forward pairs and backward pairs!

8d.6. Kalman Decomposition Theorem

Theorem (8.6.10) applies to any discrete-time time-invariant system representation R_d = [A,B,C,D] by replacing "controllable" or "controllability" by "reachable" or "reachability", resp.

8d.7. Stabilizability and Detectability

Consider a discrete-time time-invariant system representation R_d = [A,B,C,D]. Let C be its reachability matrix given by (8d.5.1) and O be its observability matrix given by (8d.5.2). Consider the matrix A ∈ ℂⁿ×ⁿ with spectrum σ(A) = {λ_k}₁^σ and algebraic eigenspaces N_k, (4.2.11), given by

1 N_k = N[(A−λ_kI)^{m_k}].

2 Call the states of N_k modes at the eigenvalue λ_k.

Consider the complement of the open unit disc, denoted by D(0,1)ᶜ and given by

3 D(0,1)ᶜ = {λ ∈ ℂ : |λ| ≥ 1}.

Modes at an eigenvalue λ_k ∈ σ(A) ∩ D(0,1)ᶜ are called

4 unstable modes and add up to the A-invariant unstable subspace

5 N₊ := ⊕{N_k : λ_k ∈ σ(A) ∩ D(0,1)ᶜ}.

Observe that, for every λ_k ∈ σ(A) ∩ D(0,1)ᶜ, R[C] ∩ N_k represents the A-invariant subspace of (unstable) modes at λ_k that are both controllable to zero and reachable (prove this: observe that unstable modes are reversible states and use Theorem (8d.5.75)); therefore we call simply R[C] ∩ N_k the controllable subspace at such λ_k.
Hence there are

6 no uncontrollable unstable hidden modes at λ_k ∈ σ(A) ∩ D(0,1)ᶜ iff N_k ⊂ R[C].

7 Similarly, there are no (nonzero) unobservable unstable hidden modes at λ_k ∈ σ(A) ∩ D(0,1)ᶜ iff N_k ∩ N(O) = {θ}.

8 We say that the pair (A,B) is stabilizable iff R_d or (A,B) has no uncontrollable unstable hidden modes, or equivalently, N₊ ⊂ R[C].

9 We say that the pair (C,A) is detectable iff R_d or (C,A) has no (nonzero) unobservable unstable hidden modes, or equivalently, N₊ ∩ N(O) = {θ}.
The following results parallel Theorems (8.7.62) and (8.7.65), and are proved similarly.

62 Theorem [Stabilizability]. Consider a discrete-time time-invariant system representation R_d = [A,B,C,D]. Let C be the reachability matrix given by (8d.5.1) and let N₊ be the unstable subspace given by (5). Then
i) the pair (A,B) is stabilizable

⇔ N₊ ⊂ R[C]

⇔ ∀λ ∈ σ(A) ∩ D(0,1)ᶜ, rk[λI−A : B] = n;

ii) if A and B are real, there exists F ∈ ℝ^{n_i×n} s.t.

σ(A+BF) ⊂ D(0,1)

if and only if the pair (A,B) is stabilizable. ∎

65 Theorem [Detectability]. Consider a discrete-time time-invariant system representation R_d = [A,B,C,D]. Let O be the observability matrix given by (8d.5.2) and let N₊ be the unstable subspace given by (5). Then
i) the pair (C,A) is detectable

⇔ N₊ ∩ N(O) = {θ}

⇔ ∀λ ∈ σ(A) ∩ D(0,1)ᶜ, rk [λI−A; C] = n;

ii) if C and A are real, there exists L ∈ ℝ^{n×n_o} s.t.

σ(A+LC) ⊂ D(0,1)

if and only if the pair (C,A) is detectable. ∎

CHAPTER 9

REALIZATION THEORY

Introduction
In this chapter we study the main properties of the realizations of a given proper transfer function matrix Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i}. We show that the McMillan degree of Ĥ(s) is the minimal dimension of the state space of any of its realizations. We prove that two minimal realizations are algebraically equivalent. We show that the eigenvalues of A of any minimal realization are dictated by the poles of Ĥ(s). We conclude this chapter with a description of a controllable canonical realization of Ĥ(s) as in [Cha.1].

9.1. Minimal Realizations


In this section Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} is a given proper transfer function matrix. As a first step we proceed to define its McMillan degree and minimal realizations.

1 We say that a linear time-invariant system representation R = [A,B,C,D] is a realization of Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} iff the transfer function of R is Ĥ(s), or equivalently, according to (3.2.65),

2 Ĥ(s) = C(sI−A)⁻¹B + D.

3 We call dimension of any representation R = [A,B,C,D] the dimension n of its state space.

4 We say that a realization R = [A,B,C,D] of Ĥ(s) is minimal iff its dimension is minimal among all realizations of Ĥ(s).
Realizations are related by the following fact.

5 Fact. Two time-invariant system representations R = [A,B,C,D] and R̄ = [Ā,B̄,C̄,D̄] are realizations of the same transfer function Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} if and only if one of the following equivalent statements holds:
a) R and R̄ have the same impulse response, i.e.

6 H(t) := C exp[At]B + Dδ(t) = H̄(t) := C̄ exp[Āt]B̄ + D̄δ(t);

b) R and R̄ are zero-state equivalent;
c)

7 CA^iB = C̄Ā^iB̄ for i = 0,1,2,… and D = D̄.
Proof. The equivalence a) ⇔ b) is obvious by Definition (1) and Exercise (5.4.22) (note that 𝓛[H(t)] = Ĥ(s)). Concerning the equivalence a) ⇔ c), observe that 1) the impulsive parts of H(t) and H̄(t) match iff D = D̄, and 2) t ↦ C exp[At]B as well as t ↦ C̄ exp[Āt]B̄ are analytic on ℝ and have a Taylor expansion, e.g.

8 C exp[At]B = Σ_{i=0}^∞ CA^iB tⁱ/i! for all t ∈ ℝ.

Therefore c) ⇒ a). Moreover a) ⇒ c), since, with

C exp[At]B = C̄ exp[Āt]B̄ for all t ∈ ℝ,

we get

CA^iB = C̄Ā^iB̄ for i = 0,1,2,…

by successively taking derivatives at t = 0. ∎


9 Analysis. Consider any transfer function Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} and any realization R = [A,B,C,D] of Ĥ(s) of any dimension n.
a) Ĥ(s) has an expansion at ∞

10 Ĥ(s) = Ĥ(∞) + Σ_{i=0}^∞ H_i s^{−(i+1)} for |s| > max{|λ| : λ ∈ P[Ĥ(s)]},

where the so-called Markov parameters H_i ∈ ℂ^{n_o×n_i} satisfy

11 H_i = CA^iB for i = 0,1,2,…

for any given realization R = [A,B,C,D].

Indeed this follows by taking the Laplace transform of H(t) − Dδ(t) = C exp[At]B on both sides of (8), where 1) by analytic extension, the expansion at ∞, initially valid for |s| > ρ_A := max{|λ| : λ ∈ σ[A]}, is valid for any s as specified on the RHS of (10), and 2) (11) holds by (7). Hence (10)-(11) hold.
b) Ĥ(s) specifies Hankel matrices H_l of order l = 0,1,2,… defined by

12 H_l := [H_{i+j}]_{i,j=0}^{l} ∈ ℂ^{(l+1)n_o×(l+1)n_i},

s.t. ∀l

13 H_l = [CA^{i+j}B]_{i,j=0}^{l} = O_l·C_l,

where

14 O_l := [C; CA; ⋯; CA^l] ∈ ℂ^{(l+1)n_o×n} (rows stacked) and C_l := [B : AB : ⋯ : A^lB] ∈ ℂ^{n×(l+1)n_i}

(for an arbitrary realization R = [A,B,C,D]). Hence, in particular, with n the dimension of a realization R,

15 H_{n−1} = [H_{i+j}]_{i,j=0}^{n−1} = O·C ∈ ℂ^{n·n_o×n·n_i},

where O = O_{n−1} and C = C_{n−1} are the observability and controllability matrices of R, given by (8.5.2) and (8.5.1). Moreover, we have the following fact.

16 Fact. For any realization R = [A,B,C,D] of Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i},

17 rk[H_l] = rk[H_{n−1}] ∀l ≥ n−1,

and

18 rk[H_{n−1}] is independent of the given realization.

Proof. a) (17) is obtained as follows: let l ≥ n−1 with H_l given by (13)-(14). Let Õ_{n−1}, C̃_{n−1} and H̃_{n−1} be matrices (of dimensions (l+1)n_o×n, n×(l+1)n_i and (l+1)n_o×(l+1)n_i, resp.) obtained by placing O_{n−1}, C_{n−1} and H_{n−1}, resp., in the upper left-hand corner and bordering them with zero entries. Now (exercise), by (14) and the Cayley-Hamilton theorem (3.2.24), there exist nonsingular matrices L and R (obtained by row and column operations, resp.) such that

L·O_l = Õ_{n−1} and C_l·R = C̃_{n−1}.

Hence, using (13),

L·H_l·R = Õ_{n−1}C̃_{n−1} = H̃_{n−1},

and therefore, taking the rank,

rk[H_l] = rk[H̃_{n−1}] = rk[H_{n−1}]. QED.

b) (18) is proved as follows. Let R̄ = [Ā,B̄,C̄,D̄] be another realization of Ĥ(s) of dimension n̄. Hence, we are done if we prove that

19 rk[H_{n−1}] = rk[H_{n̄−1}].

For this purpose let l ≥ max{n−1, n̄−1} and observe that, by (13) and (7),

H_l = [CA^{i+j}B]_{i,j=0}^{l} = [C̄Ā^{i+j}B̄]_{i,j=0}^{l} = H̄_l.

Therefore, applying (17) to both R and R̄,

rk[H_{n−1}] = rk[H_l] = rk[H̄_l] = rk[H_{n̄−1}].

Hence (19) holds. ∎



By Fact (16) it is now easy to assert the following.

22 Theorem [Rank of Hankel matrices]. Let Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} be a given proper transfer function matrix; let its expansion at ∞ be described by (10)-(11). Consider its Hankel matrices H_l of all orders, described by (12)-(13).
U.th.c.
a)

23 ∀l ∈ ℕ, rk[H_l] ≤ rk[H_{l+1}];

b)

24 max{rk[H_l] : l ∈ ℕ} = rk[H_{n−1}] = rk{[CA^{i+j}B]_{i,j=0}^{n−1}},

where n and A,B,C are, resp., the dimension and parameter matrices of any realization R = [A,B,C,D] of Ĥ(s).

Proof. a) (23) is obvious since H_l is a submatrix of H_{l+1}.

b) (24) follows from Fact (16), (23) and (13). ∎

25 Comments. α) It is crucial to view the maximum on the LHS of (24) as a property of the transfer function Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} (note that H_l is specified by Ĥ(s), see (12)), and to view the right-hand sides of (24) as formulas for calculating this maximum using any realization; (note that by the spectral mapping theorem (4.7.1) this computation may be ill-conditioned).
β) In realization theory, given any proper transfer function Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i},

26 δ_M := max{rk[H_l] : l ∈ ℕ}

is called the McMillan degree of Ĥ(s). (The matrices H_l are the Hankel matrices of Ĥ(s) as defined by (10) and (12).)
γ) The McMillan degree δ_M is the dimension of any minimal realization of Ĥ(s), as shown next.
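Given any realization, the right-hand side of (24) is directly computable. A sketch (ours) returning δ_M; per Comment (25α) the rank decision can be ill-conditioned, so the tolerance below is an arbitrary choice of ours.

    import numpy as np

    def mcmillan_degree(A, B, C, tol=1e-9):
        """delta_M = rk[H_{n-1}] = rk(O C), cf. (24) and (15)."""
        n = A.shape[0]
        Ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
        Obsv = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
        return np.linalg.matrix_rank(Obsv @ Ctrb, tol=tol)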

30 Theorem [Minimal realizations]. Let Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} be a proper transfer function with McMillan degree δ_M, (26). Let R = [A,B,C,D] be an arbitrary realization of Ĥ(s) of dimension n.
U.th.c.
a)

31 n ≥ δ_M;

b)

32 n = δ_M

⇔

33 the pair (A,B) is controllable and the pair (C,A) is observable. ∎

34 Comment. In other words, the McMillan degree is the minimal dimension of any realization, and a realization is minimal iff it is controllable and observable. Hence the following makes sense.

35 Definition. A time-invariant system representation R = [A,B,C,D] is said to be minimal iff the pair (A,B) is controllable and the pair (C,A) is observable. ("Minimal" means "of minimal dimension as a realization of its transfer function.")

37 Proof of Theorem (30). Observe that for any realization R = [A,B,C,D], by (26), (24) and (15),

38 δ_M = rk[O·C],

where O and C are the observability and controllability matrices of R, resp. (see (8.5.1)-(8.5.2)). Hence by Sylvester's inequality (A.5.37), since O and C have, resp., n columns and n rows,

39 rk[O] + rk[C] − n ≤ δ_M ≤ min(rk[O], rk[C]) ≤ n,

where, by (8.5.11) and (8.5.38),

40 (C,A) is observable ⇔ rk[O] = n

and

41 (A,B) is controllable ⇔ rk[C] = n.

Therefore, (31) follows from the RHS of (39). Moreover, by (39)-(41), (32) is equivalent to (33). ∎

42 Comments. α) Equation (38) has a simple interpretation. By (8.5.37), R(C) is the subspace of all controllable states (or equivalently reachable, by (8.5.34)). By (8.5.10), N(O) is the subspace of unobservable states. Hence R(OC) is the subspace of all controllable states that are not unobservable; indeed any unobservable state x ∈ R(C) is such that Ox = θ (i.e. does not contribute to dim R(OC)). Hence δ_M = rk[OC] = dim R[OC] is the dimension of the subspace of all states that are both controllable and not unobservable (see section 8.6 (Kalman decomposition)).
β) Equation (15) and Theorem (30) show that the Hankel matrix H_{n−1} = OC is

43 invariant for all minimal realizations of a given transfer function; it has a constant size δ_M n_o × δ_M n_i and consists of the realization-independent Markov parameters H_i, (11).
γ) The McMillan degree δ_M is the minimum number of integrators needed to simulate the given transfer function Ĥ(s).
The next result establishes that minimal realizations of the same transfer function are algebraically equivalent.

44 Theorem [Minimality and algebraic equivalence]. Let R = [A,B,C,D] be a minimal realization of a proper transfer function matrix Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i}. Let R̄ = [Ā,B̄,C̄,D̄] be another realization of Ĥ(s).
U.th.c.

45 R̄ is a minimal realization

if and only if

46 R and R̄ are algebraically equivalent;

more precisely, there exists a nonsingular matrix T ∈ ℂⁿ×ⁿ s.t. R̄ = T·R [i.e. x̄ = Tx].

Furthermore,

47a T = (Ō*Ō)⁻¹Ō*O,

47b T⁻¹ = C·C̄*(C̄C̄*)⁻¹,

where O, Ō and C, C̄, resp., are the observability and controllability matrices of R and R̄.

48 Comment. In (47a), (Ō*Ō)⁻¹Ō* is a left inverse of the full-column-rank matrix Ō (by Theorem (30), (C̄,Ā) is observable). In (47b), C̄*(C̄C̄*)⁻¹ is a right inverse of the full-row-rank matrix C̄ (by Theorem (30), (Ā,B̄) is controllable).

50 Proof of Theorem (44). We first prove (45) ⇔ (46).

⇐: By assumption R̄ = T·R and R is a minimal realization of dimension n. Hence R̄ has the same dimension and is a minimal realization.
⇒: By assumption R and R̄ are minimal realizations of the same transfer function Ĥ(s) and are of the same dimension n. Hence, using (7) and (8.5.1)-(8.5.2),

51 D = D̄,

52 OC = ŌC̄,

53 OAC = ŌĀC̄,

where, by Theorem (30),

54 rk O = rk Ō = rk C = rk C̄ = n.

By (54) the Hermitian positive semi-definite matrices Ō*Ō and C̄C̄* are positive-definite, hence nonsingular. Therefore the n×n square matrices

55 T₁ := (Ō*Ō)⁻¹Ō*O and T₂ := CC̄*(C̄C̄*)⁻¹

are well defined. By (52), T₁T₂ = I; hence T₁ and T₂ define a nonsingular matrix T ∈ ℂⁿ×ⁿ s.t.

56 T := T₁ and T⁻¹ = T₂.

From (55)-(56) it follows, on multiplying (52) separately, first on the left by (Ō*Ō)⁻¹Ō* and second on the right by C̄*(C̄C̄*)⁻¹, that

57 C̄ = TC and Ō = OT⁻¹ (controllability and observability matrices).

Hence by (8.5.1)-(8.5.2)

58 B̄ = TB and C̄ = CT⁻¹ (input and output matrices).

Moreover, by (55)-(56) it follows, on multiplying (53) simultaneously on the left by (Ō*Ō)⁻¹Ō* and on the right by C̄*(C̄C̄*)⁻¹, that

59 Ā = TAT⁻¹.

Hence there exists a nonsingular matrix T defined by (55)-(56) s.t. (59), (58) and (51) hold, i.e., by Exercise (5.4.61), R̄ = T·R. QED.

Equivalence (45) ⇔ (46) is now established. Finally (47a) and (47b) follow from (55)-(56). ∎
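Formulas (47a)-(47b) are directly computable from the controllability and observability matrices of the two realizations. A sketch (ours; O1, C1 belong to R and O2, C2 to R̄, both realizations minimal):

    import numpy as np

    def similarity_transform(O1, C1, O2, C2):
        """T with x_bar = T x, per (47a)-(47b)."""
        T = np.linalg.solve(O2.conj().T @ O2, O2.conj().T @ O1)     # (47a)
        Tinv = C1 @ C2.conj().T @ np.linalg.inv(C2 @ C2.conj().T)   # (47b)
        assert np.allclose(T @ Tinv, np.eye(T.shape[0]))            # T1 T2 = I, by (52)
        return T, Tinv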

The last main result of this section shows that the eigenvalues of A of any minimal realization of a given transfer function Ĥ(s) are precisely the poles of Ĥ(s).

62 Theorem [Minimality and poles]. Let R = [A,B,C,D] be any minimal realization of a transfer function Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i}.
U.th.c.

63 λ ∈ ℂ is an eigenvalue of A ∈ ℂⁿ×ⁿ with multiplicity m as a zero of its minimal polynomial (4.2.7)

⇔

64 λ ∈ ℂ is a pole of Ĥ(s) of order m,

or equivalently,

λ ∈ P[Ĥ(s)] and lim_{s→λ} (s−λ)^l Ĥ(s) is ≠ 0 for l = m and = 0 for l > m.

65 Exercise. a) Let R = [A,B,C,D] be any minimal realization of Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i}. Let λ_k be any eigenvalue of A. Consider the algebraic multiplicity of λ_k defined by

66 d_k := dim N[(A−λ_kI)^{m_k}]

(see (4.3.4); d_k is also the multiplicity of λ_k as a zero of the characteristic polynomial of A, see (4.3.29)). Show that d_k is invariant over all minimal realizations of Ĥ(s).

67 (d_k is called the McMillan degree of the pole λ_k of Ĥ(s); hence, by the direct sum decomposition (4.3.2) of the state space into algebraic eigenspaces,

68 the McMillan degree of Ĥ(s) is the sum of the McMillan degrees of its poles.)

b) Let R = [A,B,C,D] be a minimal realization of Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} where the basis of the state space ℂⁿ is chosen to be a union of bases of the algebraic eigenspaces of A (see decomposition (4.3.2)). Use notations induced by (4.3.6) and (4.4.1) et seq.

Show that (with appropriate matrices) the partial fraction expansion of Ĥ(s) reads

69 Ĥ(s) − Ĥ(∞) = Σ_{k=1}^σ Σ_{l=1}^{m_k} C_k[A_k − λ_kI_{d_k}]^{l−1}B_k (s−λ_k)^{−l},

where

70 {λ_k}₁^σ = σ[A] = P[Ĥ(s)]

and, ∀k,

m_k is the order of the pole λ_k of Ĥ(s),

71 d_k = dim N[(A−λ_kI)^{m_k}] is the McMillan degree of the pole λ_k of Ĥ(s), (66)-(67),
[A_k − λ_kI_{d_k}] is nilpotent of index m_k, (4.1.23).

[Hints: for a) use Theorem (62) and Theorem (44); for b) use (4.4.19c), (4.4.2), Theorem (62) and a).]

72 Comment. Theorem (62) shows that by examining the entries of Ĥ(s) one can determine σ[A] and the multiplicities of the eigenvalues (63) for any minimal realization. The minimal polynomial of such an A is the least common denominator (l.c.d.) of all elements of Ĥ(s). By examining the partial fraction expansion of Ĥ(s) (see (87) below), one can obtain the dimension n of any minimal realization and the dimensions d_k of the algebraic eigenspaces of its matrix A. It can be shown that the characteristic polynomial of such an A is the l.c.d. of all minors of Ĥ(s), see e.g. [Che.1].

75 Abbreviated proof of Theorem (62). We prove that (63) ⇔ (64).
⇒: By Fact (4.5.13) the multiplicity m of λ as a zero of the minimal polynomial is the size of the largest Jordan block associated with the eigenvalue λ of A. Therefore there exists a Jordan chain of length m, (4.5.1), i.e. a l.i. family of generalized eigenvectors [e_i]_{i=1}^{m} ⊂ ℂⁿ such that

76 (A − λI)e_j = e_{j−1} ∀ j ∈ m, with e_0 := θ.

Note that e_1 is an eigenvector of A.
Since R is a minimal realization, (A,B) is controllable and (C,A) is observable (by Theorem (30)).
By controllability there exists, for any T > 0, an input u_{[0,T)} that steers the state x(0) = θ to the state x(T) := e_m. Setting u(t) = θ, ∀ t > T, the output reads, using (76),

77 ∀ t > T: y(t) = C exp[A(t−T)]e_m = C[ e_1 (t−T)^{m−1}/(m−1)! + e_2 (t−T)^{m−2}/(m−2)! + ⋯ + e_m ] exp[λ(t−T)],

where, by observability, Ce_1 ≠ θ.
Observe now that the piecewise continuous functions y_{[0,T)} and u_{[0,T)} (set equal to zero on (T,∞)) have Laplace transforms that are analytic in the finite plane (i.e. they are entire). Moreover, by (2), since m is the size of the largest Jordan block associated with λ ∈ σ[A], Ĥ(s), when it has a pole at λ, has a pole of order at most m. Thus, using (77),

78 ŷ(s) = ŷ_{[0,T)}(s) + e^{−sT} C[ e_1(s−λ)^{−m} + e_2(s−λ)^{−m+1} + ⋯ + e_m(s−λ)^{−1} ]

and ŷ(s) has a pole at λ of order m (indeed ŷ_{[0,T)} has no pole at λ and Ce_1 ≠ θ).
Moreover, since

79 ŷ(s) = Ĥ(s)·û_{[0,T)}(s)

(where û_{[0,T)} has no pole at λ), Ĥ(s) must have a pole at λ of order m. Indeed the negation of the last assertion leads to a contradiction:
1) if Ĥ(s) has no pole at λ then, by (79), ŷ(s) has no pole at λ: contradiction.
2) if Ĥ(s) has a pole at λ not of order m then, since its order is at most m, the order must be strictly smaller than m; hence, in view of (79), ŷ(s) cannot have a pole of order m: contradiction.
Hence we have established that Ĥ(s) has a pole at λ of order m.
⇐: We assume that Ĥ(s) has a pole at λ of order m and use contradiction:
1) if λ is not an eigenvalue of A then, by (2), Ĥ(s) is bounded in a neighborhood of λ; hence λ is not a pole of Ĥ(s): contradiction.
2) if λ is an eigenvalue of A with multiplicity μ (as a zero of its minimal polynomial) such that μ ≠ m, then, by the proof above, Ĥ(s) has a pole at λ of order μ ≠ m: contradiction. •

80 Corollary [Exp. stability by I/O stability]. Consider a time-invariant system representation R = [A,B,C,D] with transfer function Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i}. Assume that R is I/O stable, i.e. (7.1.42),

81 P[Ĥ(s)] ⊂ ℂ°₋.

Then

ẋ = Ax is exp. stable (or equivalently, σ(A) ⊂ ℂ°₋, (7.2.33))

if and only if

R has no unstable hidden modes (or equivalently, (A,B) is stabilizable and (C,A) is detectable, (8.7.68)).

Short Proof.
⇐: With N₋ and N₊ resp. the stable and unstable A-invariant subspaces of ℂⁿ (given by (8.7.33)–(8.7.34)), pick a basis according to ℂⁿ = N₋ ⊕ N₊. In that basis R reads ẋ₁ = A₁x₁ + B₁u, ẋ₂ = A₂x₂ + B₂u, y = C₁x₁ + C₂x₂ + Du, where σ(A₁) = σ(A|N₋) = σ(A) ∩ ℂ°₋ and σ(A₂) = σ(A|N₊) = σ(A) ∩ ℂ₊. Let Ĥ₂(s) := C₂(sI−A₂)^{−1}B₂ and let Ĥ₁(s) = Ĥ(s) − Ĥ₂(s). By assumption (81), P[Ĥ₂(s)] = ∅; furthermore [A₂,B₂,C₂,0] is minimal (by Theorem (8.7.68), N₊ ⊂ R[C] and N₊ ∩ N[O] = {θ}). Hence, by Theorem (62), σ(A) ∩ ℂ₊ = σ(A₂) = P[Ĥ₂(s)] = ∅. Thus σ(A) ⊂ ℂ°₋.
⇒: Assume that R has an unstable hidden mode; then there exists an unstable mode of ẋ = Ax. Thus σ(A) ⊄ ℂ°₋. •

The exercises below show how to compute the McMillan degree from the partial fraction expansion of a given transfer function.

82 Exercise. Let Ĥ₁(s) be any transfer function of the form

83 Ĥ₁(s) = K₁s^{−1} + K₂s^{−2},

where K₁ and K₂ belong to ℂ^{n_o×n_i} with K₂ nonzero. Show that the McMillan degree of Ĥ₁(s) is given by

84 δ_M = rk [ K₁ K₂ ; K₂ 0 ].

[Hints: use Exercise (65). a) Let [A,B,C,0] be any minimal realization of Ĥ₁(s). Then σ[A] = {0}, Ĥ₁(s) = CBs^{−1} + CABs^{−2}, A^l = 0 for l ≥ 2, and n ≥ 2.
b) K := [ K₁ K₂ ; K₂ 0 ] = [ C ; CA ]·[ B : AB ],
such that OC is obtained from K by bordering K with zero entries.
c) Use (38), i.e. δ_M = rk[OC].]
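A quick numerical check of (84) (numpy assumed; the residue matrices K₁, K₂ below are hypothetical illustration data):

    import numpy as np

    K1 = np.array([[1.0, 0.0], [0.0, 0.0]])
    K2 = np.array([[0.0, 0.0], [0.0, 1.0]])      # K2 nonzero
    Z = np.zeros_like(K1)

    # (84): delta_M = rank of the block matrix [[K1, K2], [K2, 0]]
    print(np.linalg.matrix_rank(np.block([[K1, K2], [K2, Z]])))
    # prints 3: one integrator for the 1/s entry, two for the 1/s^2 entry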

85 Exercise. Let Ĥ₂(s) be a transfer function of the form

86 Ĥ₂(s) = K₁(s−λ)^{−1} + K₂(s−λ)^{−2},

where K₁ and K₂ belong to ℂ^{n_o×n_i}, with K₂ nonzero. Show that (84) still holds.
[Hints: set σ = s−λ; then Ĥ₂(s) = Ĥ₁(σ), see (83), …; show that rk[O(C,A)C(A,B)] = rk[O(C,A−λI)C(A−λI,B)].]

87 Exercise. Let Ĥ(s) ∈ ℂ_p(s)^{n_o×n_i} have a partial fraction expansion

88 Ĥ(s) − Ĥ(∞) = Σ_{k=1}^{σ} Σ_{l=1}^{m_k} K_{kl}(s−λ_k)^{−l},

where {λ_k}_{k=1}^{σ} = P[Ĥ(s)] and, for all k = 1,…,σ, K_{k m_k} ≠ 0. Show that the McMillan degree of Ĥ(s) is given by

89 δ_M = Σ_{k=1}^{σ} d_k,

where, for all k = 1,…,σ,

90 d_k := rk [ K_{k1} K_{k2} K_{k3} ⋯ K_{k m_k} ; K_{k2} K_{k3} ⋯ K_{k m_k} 0 ; K_{k3} ⋯ K_{k m_k} 0 0 ; ⋮ ; K_{k m_k} 0 ⋯ 0 ].

[Hints: a) Use Exercise (65), in particular (66)–(71); b) decompose accordingly the controllability and observability matrices C(A,B) and O(C,A); c) for all k = 1,…,σ use the last hint of Exercise (85).]
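A sketch of the computation in (90) (numpy assumed; the residue matrices are hypothetical): the McMillan degree of each pole is the rank of the block-Hankel matrix built from its partial-fraction coefficients, and δ_M is their sum over the poles.

    import numpy as np

    def pole_degree(K_list):
        # d_k of (90): rank of the block-Hankel matrix of K_{k1}, ..., K_{k m_k}
        m = len(K_list)
        Z = np.zeros_like(K_list[0])
        blocks = [[K_list[i + j] if i + j < m else Z for j in range(m)]
                  for i in range(m)]
        return np.linalg.matrix_rank(np.block(blocks))

    # a double pole with hypothetical residue matrices K1, K2
    print(pole_degree([np.array([[1.0, 1.0]]), np.array([[0.0, 1.0]])]))  # 2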

9.2. Controllable Canonical Form

Let R = [A,B,C,D] be a linear time-invariant system representation, where (A,B) is controllable. In this section we construct a basis of the state space ℂⁿ s.t., under the change of coordinates x̄ = Tx, R is algebraically equivalent to a representation R̄ = [Ā,B̄,C̄,D̄] such that the pair (Ā,B̄) has a special form, called the controllable canonical form of (A,B) (or R, according to the context). The main advantage of this form is its simplicity for the display of certain control properties of the pair (A,B).
Our main tool is the change of coordinates formula x̄ = Tx, where T is nonsingular (consequently the new basis vectors are represented by the columns of T^{−1}); hence one has

1 Ā = TAT^{−1}, B̄ = TB, C̄ = CT^{−1}, D̄ = D

(see (5.4.61) and Sections A.5.1 and A.5.2 on matrix representation theory). Note that, given A and B, the choice of T^{−1} determines Ā and B̄.

Standing assumptions. A and B are complex matrices of dimensions n×n and n×n_i, resp., such that

2 (A,B) is controllable

and

3 B has full column rank n_i.

4 Comment. Assumption (3) can be made without loss of generality: if (3) does not hold, introduce a change of coordinates ū = Lu of the input space: e.g. choose a basis L^{−1} (representing elementary column operations upon B) such that B̄ := BL^{−1} is in column echelon form (A.5.66), whence, for all u ∈ ℂ^{n_i}, with ū = Lu,

Bu = BL^{−1}Lu = B̄ū = [B̄₁ : 0]ū = B̄₁ū₁,

where B̄₁ has full column rank. Hence assumption (3) holds by reducing the input to ū₁.

5 Selection of a basis. Let b_i denote the ith column of B ∈ ℂ^{n×n_i}. By assumption (2) the controllability matrix of (A,B), defined by

6 C := [B : AB : ⋯ : A^{n−1}B] ∈ ℂ^{n×n·n_i},

has rank n. Hence it has n l.i. columns of the form A^j b_i which, if n_i > 1, may be selected in many ways. We select these columns by reading the columns of C from left to right, rejecting each column that is† l.d. on previously selected columns. Note that, by assumption (3), for all i ∈ n_i, each column b_i is selected. Moreover, if a column, say A^k b_i, is l.d. on previously selected columns, then so are all columns A^m b_i for all m ≥ k (check this). Hence, by this selection process, for all i ∈ n_i we find a least integer†† k_i ∈ n such that, for this i,

7 A^{k_i} b_i is l.d. on previously selected columns,

8 the family [A^j b_i]_{j=0}^{k_i−1} is selected;

moreover

9 Σ_{i=1}^{n_i} k_i = n.

Let us express the n_i linear dependence relations (7): for convenience, replace i by l, and write:

10a A^{k_l} b_l = −Σ_{j=0}^{k_l} Σ_{i=1}^{n_i} γ^l_{ji} A^j b_i,

where γ^l_{ji} = 0 whenever the column A^j b_i does not precede A^{k_l} b_l in the scan (these γ's correspond to columns of A^j B that do not precede A^{k_l} b_l), and, for all i ∈ n_i such that k_i ≤ k_l,

10b γ^l_{k_i i} = ⋯ = γ^l_{k_l i} = 0

(because, for all m ≥ k_i, A^m b_i is l.d. on previously selected columns). Note that the n_i families (8) constitute a basis for ℂⁿ.

11 Exercise. An alternate way of viewing the selection process is as follows: form an n×n_i array of vectors where the ith column consists of b_i, Ab_i, …, A^{n−1}b_i. By (3), the n_i vectors of the first row are selected since they are l.i. The selection process goes row by row; each time a vector is l.d. on previously selected vectors it is deleted from the array. Show that the k_i's selected by (7)–(9) are such that A^{k_i}b_i is the first vector deleted from the ith column. Give a pictorial interpretation of the equations (10a) and (10b). A mechanized version of this scan is sketched below.

† l.d. means linearly dependent.
†† k_i is called the ith controllability index; if n_i = 1 then k₁ = n (all columns of C are selected).
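The following minimal sketch (numpy assumed; A, B hypothetical) mechanizes the row-by-row scan of Exercise (11): a column A^j b_i is kept iff it increases the rank of the columns kept so far, and k_i counts the kept columns in the ith position.

    import numpy as np

    def controllability_indices(A, B):
        n, ni = B.shape
        kept = np.zeros((n, 0))
        k = [0] * ni
        for power in range(n):                        # scan row by row
            ApB = np.linalg.matrix_power(A, power) @ B
            for i in range(ni):                       # within a row: left to right
                cand = np.hstack([kept, ApB[:, i:i + 1]])
                if np.linalg.matrix_rank(cand) > kept.shape[1]:
                    kept, k[i] = cand, k[i] + 1
        return k                                      # sum(k) = n, cf. (9)

    A = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)
    B = np.array([[0, 1], [0, 0], [1, 0]], dtype=float)
    print(controllability_indices(A, B))              # [2, 1] here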

Replace in (8) i by l and j by q, and add to each selected vector A^q b_l a suitable linear combination of previously selected vectors A^j b_i, defining new basis vectors by:

12 e^l_{q+1} := A^q b_l + (a linear combination of previously selected vectors A^j b_i), for all l ∈ n_i, for all q = 0,1,…,k_l−1.

We note that, by (10),

13 e^l_{k_l+1} = θ for all l ∈ n_i.

Moreover, by (10a)–(10b), only previously selected vectors are present in the RHS of (12) and, by (9), (12) accounts for n vectors e^l_q. Therefore (12) defines a basis of ℂⁿ and the n×n matrix T^{−1} defined by

14 T^{−1} := [e^1_{k_1} : e^1_{k_1−1} : ⋯ : e^1_1 : e^2_{k_2} : ⋯ : e^2_1 : ⋯ : e^{n_i}_{k_{n_i}} : ⋯ : e^{n_i}_1]

is nonsingular. The basis (12), ordered as in (14), is the basis required for the controllable canonical form (Ā,B̄) of (A,B) (by using (1), i.e. (20)–(21) below).

15 Exercise [Dynamical interpretation]. Consider equations (12)–(13). Show that, for all l ∈ n_i, for all q = 0,1,…,k_l−1, e^l_{q+1} is the state at time q+1 produced by the recursion equation x(j+1) = Ax(j) + Bu(j) due to the zero state at time 0 and the input sequence [u_l(j)]_{j≥0}, where, for all j = 0,1,…,k_l, the vector u_l(j) is built from the coefficients γ^l_{ji} of (10), with γ^l_{k_l l} replaced by 1.

16 Controllable canonical form. By Exercise (15) we have, for all l ∈ n_i,

17 e^l_1 = Bu_l(0) = b_l + (a linear combination of b_1, …, b_{l−1}),

and, for all l ∈ n_i and q = 0,1,2,…,k_l−1,

18 e^l_{q+2} = Ae^l_{q+1} + Bu_l(q+1), with e^l_{k_l+1} = θ (by (13)).

Moreover the n_i×n_i matrix L^{−1} defined by

19 L^{−1} := [u_1(0) : u_2(0) : ⋯ : u_{n_i}(0)]

is an upper triangular nonsingular matrix with diagonal elements equal to 1 (the lth entry of u_l(0) equals 1, and its entries i = l+1,…,n_i vanish (by (10a))).
Hence, by (17)–(19),

20 B = [e^1_1 : e^2_1 : ⋯ : e^{n_i}_1] L,

and, for all l ∈ n_i and q = 0,1,…,k_l−1,

21 Ae^l_{q+1} = e^l_{q+2} − [e^1_1 : e^2_1 : ⋯ : e^{n_i}_1] L u_l(q+1),

where e^l_{k_l+1} = θ and L is an upper triangular matrix with diagonal elements equal to 1.
Since L is nonsingular, (20) shows that R[B] = Sp[e^1_1, e^2_1, …, e^{n_i}_1]. Let now L =: [β_{il}]; then β_{ll} = 1 and β_{il} = 0 for i > l; with these notations (20) becomes

22 b_l = Σ_{i=1}^{l−1} β_{il} e^i_1 + e^l_1, ∀ l ∈ n_i.

Note the position of the vectors e^l_1 in the ordered basis (see (14)). Equation (22) expresses the lth column of B, b_l, in terms of the new basis (12), ordered as in (14). From (22) the general form of B̄ follows (for a special case, see (28) below).
Equation (21) exhibits, for all l ∈ n_i, the image of e^l_{q+1} under the map A. Let now Lu_l(q+1) =: [α^{l1}_{q+1}, …, α^{l n_i}_{q+1}]*; then, from (21), for q = 0,1,…,k_l−1,

23 Ae^l_{q+1} = e^l_{q+2} − Σ_{j=1}^{n_i} α^{lj}_{q+1} e^j_1.

Equation (23) expresses the image of the basis vector e^l_{q+1} under the map A as a linear combination of the new basis vectors. Hence the representation of A in the new basis, namely Ā, follows immediately from the first representation Theorem (A.5.3). For a special case, see (28) below.

24 Special case. For simplicity we consider a specific case, namely,

25 n = 5, n_i = 2, k₁ = 2, k₂ = 3.
26 Exercise. For these parameters: a) write down equations (10)–(10b); b) obtain the basis vectors (12) in the order e^1_2, e^1_1, e^2_3, e^2_2, e^2_1; c) write down equations (20)–(21) using the α's and β's defined above; and d) show that, for the parameters (25), (Ā,B̄), the controllable canonical form of (A,B), reads:

28 Ā = [    0         1         0         0         0
        −α^{11}_1 −α^{11}_2 −α^{12}_1 −α^{12}_2 −α^{12}_3
            0         0         0         1         0
            0         0         0         0         1
        −α^{21}_1 −α^{21}_2 −α^{22}_1 −α^{22}_2 −α^{22}_3 ],

   B̄ = [ 0   0
          1   β₁₂
          0   0
          0   0
          0   1  ],

where Ā is a (k₁+k₂)×(k₁+k₂) = 5×5 matrix and B̄ is a (k₁+k₂)×2 matrix. Here the α^{lj}_q and β₁₂ are computed from the γ^l_{ji}, defined by (10), by

29, 30 [the explicit expressions of the α^{lj}_q and of β₁₂ in terms of the γ^l_{ji} of (10)].

31 Comments. Let δ₁ := k₁ and δ₂ := k₁+k₂. α) In (Ā,B̄) above, only the entries −α^{lj}_q and β₁₂, resp., are determined by the basis-dependent parameters γ^l_{ji}; all other entries are fixed. The entries −α^{lj}_q and β₁₂ are located on fixed rows, viz. row δ₁ and row δ₂.
β) Ā has n_i diagonal blocks of size k_l×k_l for l ∈ n_i. These blocks are in row-companion form (as shown in (28)).
γ) If n_i = 1, then k₁ = n and (Ā,B̄) reduces to one row-companion block for Ā and the vector ε_n for b̄.
δ) Equations (29)–(30) follow from the definitions of the α's and β's above.

34 Exercise. Consider the single-input case, n_i = 1, whence (28) reads, e.g. for n = 3,

35 Ā = [ 0 1 0 ; 0 0 1 ; −α₁ −α₂ −α₃ ],  b̄ = [ 0 ; 0 ; 1 ].

a) Show that the characteristic polynomial of Ā reads

36 χ_Ā(s) = det(sI−Ā) = sⁿ + (α_n s^{n−1} + ⋯ + α₂s + α₁).

b) Show that

37 (sI−Ā)^{−1} b̄ = χ_Ā(s)^{−1} [1 s s² ⋯ s^{n−1}]ᵀ.

c) With c̄ := [γ₁ γ₂ γ₃] and (Ā,b̄) given by (35), give a block diagram realization of ĥ(s) = c̄(sI−Ā)^{−1}b̄.
[Hints: a) consider (sI−Ā) and the e.c.o.'s col₂ ← col₂ + s·col₃ and col₁ ← col₁ + s·col₂; b) calculate (sI−Ā)[1 s s² ⋯ s^{n−1}]ᵀ; c) consider a chain of n integrators.]
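A quick numerical check of (36) (numpy assumed; the αᵢ are hypothetical): for Ā in row-companion form, the last row carries the negated coefficients of the characteristic polynomial.

    import numpy as np

    a1, a2, a3 = 6.0, 11.0, 6.0              # hypothetical alpha coefficients
    A = np.array([[ 0.0,  1.0,  0.0],
                  [ 0.0,  0.0,  1.0],
                  [ -a1,  -a2,  -a3]])       # row-companion form (35)

    print(np.poly(A))                        # [1, 6, 11, 6] = coeffs of det(sI - A)
    print(np.linalg.eigvals(A))              # roots -1, -2, -3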

40 Observable canonical form. The constant matrix pair (C,A) is said to be in observable canonical form iff (A*,C*) is in controllable canonical form. Hence if (C,A) is observable (i.e. (A*,C*) is controllable), then we can find a change of coordinates x̄ = Tx such that R = [A,B,C,D] ∼ R̄ = [Ā,B̄,C̄,D̄], where (C̄,Ā) is in observable canonical form. Indeed, note that

Ā = TAT^{−1} and C̄ = CT^{−1} ⟺ Ā* = T^{−*}A*T* and C̄* = T^{−*}C*.

Therefore apply procedure (5) to the pair (A*,C*) and let T* be defined by the resulting RHS of (14). Of course Hermitian transposition converts this procedure into the selection of rows of the observability matrix and the definition of the rows of T (exercise).
In the single-output case (i.e. n_o = 1), (c_o,A_o) in observable canonical form, for n = 3, reads

41a c_o = [0 0 1],

41b A_o = [ 0 0 −α₁ ; 1 0 −α₂ ; 0 1 −α₃ ]

(A_o is now in column-companion form).

42 Exercise. Consider equations (41a)–(41b). a) Show that

43 χ_{A_o}(s) = det(sI−A_o) = sⁿ + (α_n s^{n−1} + ⋯ + α₂s + α₁).

b) Show that

44 c_o(sI−A_o)^{−1} = χ_{A_o}(s)^{−1} [1 s ⋯ s^{n−1}].

c) With b := [γ₁ γ₂ γ₃]* and (c_o,A_o) given by (41), give a block diagram realization of ĥ(s) = c_o(sI−A_o)^{−1}b.
d) Obtain the configuration of c) by taking the dual of the one obtained in c) of Exercise (34).
[Hint: c) consider a summing node in front of each integrator.]

The last result of this section shows that multi-input controllability can be reduced to single-input controllability by using constant state feedback. This result is a consequence of Exercise (15) (dynamical interpretation of the selected basis (12)), using time-invariance and the superposition of appropriately shifted inputs. A first exercise shows that the basis (12) can be considered as a discrete-time state trajectory of n l.i. states.

45 Exercise [l.i. state trajectory]. Consider equations (12)–(13). Let, for all l ∈ n_i, [u_l(j)]_{j≥0} denote the control sequences of Exercise (15). Set σ₀ := 0 and σ_l := k₁ + k₂ + ⋯ + k_l for all l ∈ n_i. Consider n_i shifted control sequences [v_l(j)]_{j=0}^{∞} s.t., for all l ∈ n_i,

v_l(j) := { u_l(j−σ_{l−1}) for j = σ_{l−1}, σ_{l−1}+1, …, σ_{l−1}+k_l ; θ elsewhere. }

Consider their sum, i.e. the control sequence [u(j)]_{j≥0} s.t.

u(j) := Σ_{l=1}^{n_i} v_l(j) for j = 0,1,2,…,n.

Consider the zero-state response of the recursion equation

46 x(j+1) = Ax(j) + Bu(j) for j = 0,1,…,n−1.

Show that, for all l ∈ n_i and for all q = 0,1,…,k_l−1,

47 x(σ_{l−1}+q+1) = e^l_{q+1}.

In other words, the basis (12) consists of n states reached by the recursion equation above starting from x(0) = θ and using the control sequence [u(j)]_{j=0}^{n−1}.
[Hints: by Exercise (15) each control sequence [u_l(j)]_j generates the zero-state state trajectory [e^l_{q+1}]_q, with e^l_{k_l+1} = θ (by (13)); then use time-invariance and superposition.]

50 Analysis. Consider recursion equation (46) under the conditions of Exercise (45). Note that there exists a unique matrix F ∈ ℂ^{n_i×n} such that

51 u(j) = Fx(j) for all j = 1,2,…,n.

Indeed (stacking the vectors u(j) and x(j) as columns of the matrices denoted below by square brackets), F is given by

F := [u(1) : u(2) : ⋯ : u(n)]·[x(1) : x(2) : ⋯ : x(n)]^{−1},

where the last matrix is nonsingular because, by Exercise (45), the state trajectory [x(j)]_{j=1}^{n} is a l.i. family.
Observe now that by (47), for l = 1 and q = 0, x(1) = e^1_1 = b₁ (by (12)). Hence, by (46) and (51), for all j = 0,1,…,n−1,

52 x(j+1) = (A+BF)^j b₁.

Therefore (since the family [x(j)]_{j=1}^{n} is l.i.) the controllability matrix of (A+BF, b₁), given by

53 C(A+BF, b₁) = [b₁ : (A+BF)b₁ : ⋯ : (A+BF)^{n−1}b₁],

is nonsingular. Recall now the standing assumptions (2)–(3), where the last assumption can be made without loss of generality (see Comment (4)). Hence we have:

54 Theorem [Single-input controllability by state feedback]. Let A ∈ ℂ^{n×n} and B ∈ ℂ^{n×n_i}, with (A,B) controllable and B of full column rank. Let b₁ denote the first column of B. Then there exists a matrix F ∈ ℂ^{n_i×n} s.t. the single-input pair (A+BF, b₁) is controllable.

55 Comment. In other words, if the pair (A,B) is controllable and B has full column rank, then there exists a linear state-feedback law u = Fx + v (where v(t) ∈ ℂ^{n_i} is the new external input) s.t. the state d.e.

ẋ = (A+BF)x + Bv

is controllable by using only the single input v₁.

Using similar methods, one can in fact prove a stronger result, [Won.1, Lemma 2.2], namely:

Theorem. Let the constant pair (A,B) be controllable; let b be any nonzero vector in R[B]. Then there is a constant state-feedback matrix F such that the single-input pair (A+BF, b) is controllable.
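Theorem (54) is easy to observe numerically; in fact the set of feedback matrices F that make (A+BF, b₁) controllable is generic, so a randomly chosen F almost always works. A minimal sketch (numpy assumed; the pair (A,B) below is hypothetical and chosen so that (A, b₁) alone is not controllable):

    import numpy as np

    def ctrb(A, B):
        n = A.shape[0]
        return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

    A = np.diag([1.0, 2.0])                   # (A, B) controllable, but ...
    B = np.eye(2)
    b1 = B[:, :1]
    print(np.linalg.matrix_rank(ctrb(A, b1)))         # 1: (A, b1) NOT controllable

    F = np.random.default_rng(0).standard_normal((2, 2))   # a generic feedback
    print(np.linalg.matrix_rank(ctrb(A + B @ F, b1)))      # 2: (A+BF, b1) controllable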
CHAPTER 10

LINEAR STATE FEEDBACK AND ESTIMATION

In this chapter we shall transform a given time-invariant system representation R = [A,B,C,0] for various purposes, such as i) improving the dynamics and ii) constructing a state estimator. Note that the direct transmission D ∈ R^{n_o×n_i} of the system is assumed to be zero for reasons of simplicity. The latter assumption is often satisfied, since most time-invariant plants have a strictly proper transfer function matrix.
We study first how the dynamics of R = [A,B,C,0] may be changed by constant linear state-feedback. Next we construct, for a given R = [A,B,C,0], a linear state estimator (i.e. an "observer"), using constant linear output injection. These investigations are then combined to transform a given system R = [A,B,C,0] by linear feedback of the estimated state. The implementation of these schemes mostly involves a choice of appropriate feedback gain matrices. One way to choose the gain matrix is to use optimization, thus enabling the designer to generate, e.g., an optimal linear state-feedback law. This was done in Chapter 2 in the time-varying case by solving the finite horizon linear-quadratic (LQ) optimal control problem. For the time-invariant case a constant optimal linear state-feedback law is produced by solving the LQ time-invariant infinite horizon optimal control problem. We study this problem at the end of this chapter.

10.1. Linear State Feedback

We are given the real time-invariant system representation R = [A,B,C,0], i.e.

1 ẋ(t) = Ax(t) + Bu(t)

2 y(t) = Cx(t)

where A ∈ R^{n×n}, B ∈ R^{n×n_i} and C ∈ R^{n_o×n}, whence the characteristic polynomial is

3 χ_A(s) := det(sI−A) = sⁿ + (α_n s^{n−1} + ⋯ + α₂s + α₁)

with real coefficients αᵢ.
Consider the constant linear state-feedback law

4 u(t) = Fx(t) + v(t),

where F ∈ R^{n_i×n} is called a state-feedback matrix and v(t) ∈ R^{n_i} is the new exogenous control, as shown in Fig. 10.1.


Fig. 10.1. Implementation of linear state-feedback; F is the state-feedback matrix.

Equations (1)–(4) generate a closed-loop system representation R_f = [A_f,B,C,0], described by

6 ẋ(t) = A_f x(t) + Bv(t)

7 y(t) = Cx(t)

where B ∈ R^{n×n_i}, C ∈ R^{n_o×n} and

8 A_f := A+BF ∈ R^{n×n}

generates the

9 closed-loop characteristic polynomial χ_{A_f}(s) ∈ R[s].

In this section we show that, given any monic polynomial π(s) ∈ R[s] of degree n, viz.

10 π(s) = sⁿ + (π_n s^{n−1} + ⋯ + π₁)

(where, for all i ∈ n, πᵢ ∈ R), there exists a state-feedback matrix F ∈ R^{n_i×n} s.t.

11 χ_{A_f} = π

if and only if the pair (A,B) is controllable.
This means that, modulo this condition, any list of n closed-loop eigenvalues (having the property of conjugate symmetry) can be assigned to R_f by an appropriate state-feedback law (4); this property is usually called "arbitrary spectral assignment by state-feedback."
We start our study by considering the pair (A,B) in the single-input case (i.e. n_i = 1) where A is in row-companion form.

15 Exercise. Let A ∈ R^{n×n} be similar to Ā ∈ R^{n×n} in row-companion form, e.g. for n = 3,

16 Ā = [ 0 1 0 ; 0 0 1 ; −α₁ −α₂ −α₃ ].

Note that by Exercise (9.2.34)

17 χ_A(s) = χ_Ā(s) = sⁿ + (α_n s^{n−1} + ⋯ + α₁).

Let εᵢ denote the ith standard unit vector of Rⁿ and consider any monic real polynomial π of degree n given by (10).
Show that:

18 For all i ∈ n, ε₁*Ā^{i−1} = εᵢ*;

19 π(Ā) = Σ_{i=1}^{n} (πᵢ−αᵢ)Ā^{i−1};

20 ε₁*π(Ā) = [π₁−α₁  π₂−α₂  ⋯  π_n−α_n].

[Hints: for (19), use (18) and observe that χ_Ā(Ā) = 0, (3.2.24); for (20), use (18) and (19).]

23 Exercise. Let A ∈ R^{n×n} and b ∈ Rⁿ be s.t. the pair (A,b) is controllable. Let (Ā,b̄) be its controllable canonical form, where Ā is given by (16)–(17) and b̄ = ε_n. See Exercise (9.2.34), where the transformation matrix T ∈ R^{n×n} s.t.

24 Ā = TAT^{−1} and b̄ = Tb

is given by (9.2.12)–(9.2.14) as

25 T^{−1} = [e₃ : e₂ : e₁],

with, e.g. for n = 3, the column vectors e₁, e₂ and e₃ given by

e₁ = b,  e₂ = Ab + α₃b,  e₃ = A²b + α₃Ab + α₂b.

Let C ∈ R^{n×n} be the controllability matrix of (A,b), given by (8.5.1), and let q ∈ Rⁿ be defined by

26 q* = ε_n*C^{−1}.

Show that

27 [TC]^{−1} = [ α₂ α₃ ⋯ α_n 1 ; α₃ ⋯ α_n 1 0 ; ⋮ ; α_n 1 0 ⋯ 0 ; 1 0 ⋯ 0 0 ],

28 q* = ε₁*T.

[Hint: for (27), use (25).]
We are now ready for our first main result.

31 Theorem [Spectral assignability: single-input case]. Let A ∈ R^{n×n} and b ∈ Rⁿ be s.t. the pair (A,b) is controllable. Let C ∈ R^{n×n} denote the controllability matrix, (8.5.1), of (A,b) and let εⱼ denote the jth standard unit vector of Rⁿ. For any row vector f ∈ R^{1×n} let

32 A_f := A + bf ∈ R^{n×n}

and let χ_{A_f}(s) ∈ R[s] be its characteristic polynomial. Let π(s) ∈ R[s] be any real monic polynomial of degree n with coefficients π_n,…,π₁ as in (10).
U.t.c.

33 f ∈ R^{1×n} is s.t. χ_{A_f} = π

if and only if

34 f = −q*π(A),

where q ∈ Rⁿ is defined by

26 q* := ε_n*C^{−1}.

35 Comment. α) In other words, in the controllable single-input case the eigenvalue assignment problem (10)–(11) is uniquely solved by the feedback vector (34). Formula (34) appeared in [Ack.1].
β) In the proof below we show that if (A,b) is in controllable canonical form, then f̄ = [f̄ᵢ]ᵢ₌₁ⁿ in (34) is given by f̄ᵢ = αᵢ−πᵢ for all i ∈ n, where the αᵢ and πᵢ are the coefficients of χ_A, (17), and π, (10): we say that "f̄ is a coefficient shift vector."
γ) From an engineering point of view, we must keep in mind that if we try to change too drastically the location of the eigenvalues we will run into trouble with saturation. To make the point, consider the simple case where n = 1, A = −1, b = c = 1. The input-output transfer function is ĥ(s) = (s+1)^{−1}. Suppose we apply a state feedback f = −99; the transfer function becomes ĥ₁(s) = (s+100)^{−1}. The response is 100 times faster, but the I/O gain at zero frequency is 1/100! To remedy this problem, let us place the gain 99 between the summing node and the plant: then ĥ₂(s) = 99/(s+100), and the I/O gain at ω = 0 is 0.99 ≈ 1, i.e., it is essentially restored to its previous value and the new system is 100 times faster. Now consider the original system starting from the zero-state and with an input 1(t) (where the input 1(t) is about as large as the system can tolerate without saturating): the maximum value of the plant input is 1. Now repeat the experiment with the last closed-loop system considered above: an easy calculation shows that the input to the plant takes the value 99 at time 0+. Clearly, this unit step input will saturate the plant. The conclusion is that, in the real world, the amount by which one may change the spectrum of a linear system is limited by physical considerations such as saturation, noise sensitivity, etc.
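Formula (34) is Ackermann's formula, and it is directly implementable. A minimal sketch (numpy assumed; the plant and the target spectrum are hypothetical):

    import numpy as np

    def ackermann(A, b, poles):
        # (34): f = -q* pi(A), with q* = eps_n* C^{-1}, cf. (26)
        n = A.shape[0]
        Ctrb = np.hstack([np.linalg.matrix_power(A, i) @ b for i in range(n)])
        q = np.linalg.solve(Ctrb.T, np.eye(n)[:, -1])   # solves C^T q = eps_n
        piA = np.zeros_like(A)
        for c in np.poly(poles):                        # evaluate pi(A) (Horner)
            piA = piA @ A + c * np.eye(n)
        return -(q @ piA).reshape(1, -1)                # row vector f

    A = np.array([[0.0, 1.0], [3.0, 2.0]])              # an unstable plant
    b = np.array([[0.0], [1.0]])
    f = ackermann(A, b, poles=[-1.0, -2.0])
    print(np.linalg.eigvals(A + b @ f))                 # approx. -1 and -2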

37 Proof of Theorem (31). Since the pair (A,b) is controllable, it can be brought into controllable canonical form (Ā,b̄) by a state transformation x̄ = Tx (see Exercise (23)). This transformation relates A, b, and f ∈ R^{1×n} to Ā, b̄, and f̄, resp., by

38 Ā = TAT^{−1}, b̄ = Tb, f = f̄T,

where Ā ∈ R^{n×n} has the row-companion form (16)–(17), b̄ = ε_n, and the matrix T^{−1} ∈ R^{n×n} is described by (24)–(25). Hence, by defining

39 f̄ := [f̄₁ f̄₂ ⋯ f̄_n] ∈ R^{1×n}

and

40 Ā_f := Ā + b̄f̄,

we get, using (32) and (38)–(40),

A_f = A + bf = T^{−1}[Ā + b̄f̄]T = T^{−1}Ā_f T,

where, w.l.g. for n = 3, Ā_f reads

Ā_f = [ 0 1 0 ; 0 0 1 ; −(α₁−f̄₁) −(α₂−f̄₂) −(α₃−f̄₃) ].

Hence, for any f ∈ R^{1×n}, A_f is similar to Ā_f in row-companion form, and therefore

41 χ_{A_f}(s) = χ_{Ā_f}(s) = sⁿ + [(α_n−f̄_n)s^{n−1} + ⋯ + (α₁−f̄₁)].

Note that the coefficients αᵢ of χ_A(s) have been shifted by the components f̄ᵢ of the feedback vector f̄, (39). Hence, upon comparing (41) and (10), we get, successively,

33 f is s.t. χ_{A_f} = π

if and only if

42 f̄ᵢ = αᵢ − πᵢ for all i ∈ n,

or equivalently, with q ∈ Rⁿ given by (26),

34 f = −q*π(A).

The last equivalence follows because, by the third equation of (38), (42), (20), the first equation of (38) and (28),

f = f̄T = −ε₁*π(Ā)T = −ε₁*Tπ(A) = −q*π(A). •

*44 Exercise. Let A ∈ R^{n×n} and b ∈ Rⁿ be s.t. the pair (A,b) is controllable, and let q ∈ Rⁿ be given by (26), where C is the controllability matrix of (A,b). Let f ∈ R^{1×n} generate A_f := A+bf. Show that:
i) For all f ∈ R^{1×n}: q*(sI−A_f)^{−1}b = χ_{A_f}(s)^{−1};
ii) For all f ∈ R^{1×n}: the pair (q*,A_f) is observable;
iii) For any monic real polynomial π given by (10) and for all f ∈ R^{1×n}:

q*π(A_f) = q*π(A) + f.

[Hints: consider the controllable canonical form of (A,b) using (38)–(40) and Exercises (15) and (23).]
We are now ready to tackle the general eigenvalue assignment problem (10)–(11).

46 Theorem [Spectral assignability: multi-input case]. Let A ∈ R^{n×n} and B ∈ R^{n×n_i}. Let F ∈ R^{n_i×n} be any state-feedback matrix that generates a closed-loop system matrix

8 A_f := A+BF ∈ R^{n×n}

with characteristic polynomial χ_{A_f}(s). Let π(s) ∈ R[s] denote any real monic polynomial of degree n, as given by (10).
U.t.c.

47 ∃ F ∈ R^{n_i×n} s.t. χ_{A_f} = π

if and only if

48 the pair (A,B) is controllable.

49 Comment. In (47) it is not claimed that the appropriate feedback matrix F is unique. By Theorem (31) we know that F is unique in the single-input case. That this is not so in the multi-input case is revealed in the sufficiency part of the proof below, which uses Theorem (9.2.54) to reduce multi-input controllability to single-input controllability by applying a nonunique state-feedback.

50 Proof of Theorem (46). If. The pair (A,B) is assumed to be controllable. W.l.g. we assume that B ∈ R^{n×n_i} has full column rank (otherwise simplify B and the control u s.t. all columns of B are l.i.). Let b₁ denote the first column of B. Then by Theorem (9.2.54) there exists a matrix F₁ ∈ R^{n_i×n} s.t. the single-input pair (A+BF₁, b₁) is controllable. Hence by Theorem (31), for any π(s) ∈ R[s] of the form (10), there exists a feedback vector f₂ ∈ R^{1×n} s.t.

χ_{A+BF₁+b₁f₂} = π.

Define now a matrix F₂ ∈ R^{n_i×n} s.t. all rows of F₂ are zero except for the first one, which is f₂; consequently, BF₂ = b₁f₂. Hence, setting F := F₁+F₂ and A_f := A+BF = A+BF₁+b₁f₂, we get χ_{A_f} = π. Hence (47) holds.
Only if. We use contraposition. So suppose (48) is not true, i.e. (A,B) is not controllable. Then by Theorems (8.5.36) and (8.7.16) the pair (A,B) has an uncontrollable hidden mode, i.e. there exist an eigenvalue λ of A and a nonzero vector η ∈ ℂⁿ s.t.

η*A = λη* and η*B = θ*.

Hence, for this nonzero vector η ∈ ℂⁿ, for all F ∈ R^{n_i×n},

η*(A+BF) = λη*,

i.e. A_f := A+BF has the fixed eigenvalue λ. Thus, ∀ F ∈ R^{n_i×n}, λ is a zero of χ_{A_f}, and for any polynomial π of the form (10) with no zero at λ, π(s) = χ_{A_f}(s) is not possible. Hence (47) does not hold. •
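In the multi-input case the freedom in F can be exploited for numerical robustness; scipy's place_poles is one such implementation (a sketch with hypothetical data; note that scipy computes K for A − BK, so F = −K in the sign convention of (8)):

    import numpy as np
    from scipy.signal import place_poles

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 2))       # (A, B) is controllable almost surely

    res = place_poles(A, B, [-1.0, -2.0, -3.0, -4.0])
    F = -res.gain_matrix                  # A_f = A + BF
    print(np.sort(np.linalg.eigvals(A + B @ F).real))   # approx. -4, -3, -2, -1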

We conclude this section by justifying the claim of Theorem (8.7.62) that stabilizability (no unstable uncontrollable hidden modes) is equivalent to stabilizability by linear state-feedback (i.e. such that the closed-loop system (6)–(9) is exponentially stable).

53 Theorem [Stabilizability]. Let A ∈ R^{n×n} and B ∈ R^{n×n_i}.
U.t.c.

54 ∃ F ∈ R^{n_i×n} s.t. σ(A+BF)† ⊂ ℂ°₋

if and only if

55 the pair (A,B) is stabilizable.

† σ(A) denotes the spectrum of A ∈ R^{n×n}.

56 Comment. Claim (54) means that all eigenvalues of the closed-loop system matrix A+BF can be placed in the open left half-plane by appropriate state-feedback. Of course the eigenvalues of A associated with uncontrollable modes will not be affected by state-feedback.

58 Proof of Theorem (53). With N₋ and N₊ resp. the stable and unstable A-invariant subspaces of Rⁿ (given by (8.7.33)–(8.7.34)), pick a basis of Rⁿ according to Rⁿ = N₋ ⊕ N₊. In that basis the state d.e. ẋ = Ax+Bu reads ẋ₁ = A₁x₁+B₁u and ẋ₂ = A₂x₂+B₂u, where σ(A₁) = σ[A|N₋] = σ(A) ∩ ℂ°₋ and σ(A₂) = σ[A|N₊] = σ(A) ∩ ℂ₊. Hence by Theorems (8.7.36) and (46) it follows easily that (A,B) is stabilizable iff (A₂,B₂) is controllable.
If. By assumption (A,B) is stabilizable, whence (A₂,B₂) is controllable. Hence by Theorem (46) there exists a state-feedback matrix F₂ s.t. σ(A₂+B₂F₂) ⊂ ℂ°₋. Hence, using appropriate partitions, it follows that the overall feedback matrix F given by

F := [0 : F₂]

results in

A+BF = [ A₁  B₁F₂ ; 0  A₂+B₂F₂ ].

Hence

σ(A+BF) = σ(A₁) ∪ σ(A₂+B₂F₂) ⊂ ℂ°₋.

Only if. We use contraposition. So suppose that (A,B) is not stabilizable. Then (A,B) has an unstable uncontrollable hidden mode, i.e. by Theorem (8.7.62) there exist an eigenvalue λ of A in ℂ₊ and a nonzero vector η ∈ ℂⁿ s.t.

η*A = λη* and η*B = θ*.

Hence for this nonzero vector η ∈ ℂⁿ we have, for all F ∈ R^{n_i×n},

η*(A+BF) = λη*.

Thus, for all F ∈ R^{n_i×n}, A+BF has the unstable eigenvalue λ ∈ ℂ₊, whence

∀ F ∈ R^{n_i×n}: σ(A+BF) ⊄ ℂ°₋. •

10.2. Linear Output Injection and State Estimation

Linear Output Injection

We are given a real time-invariant system representation R = [A,B,C,0], i.e.

1 ẋ(t) = Ax(t) + Bu(t)

2 y(t) = Cx(t)

where A ∈ R^{n×n}, B ∈ R^{n×n_i} and C ∈ R^{n_o×n}. Consider Fig. 10.2, where the system R has been transformed by adding a linear function of the output y (expressed by Ly = LCx for some L ∈ R^{n×n_o}) to the input of the integrators. Note that Fig. 10.2 is precisely the dual of Fig. 10.1.
This transformation is called constant linear output-injection, and L ∈ R^{n×n_o} is called the output-injection matrix. This generates a system representation R_i = [A+LC,B,C,0], i.e.

3 ẋ(t) = (A+LC)x(t) + Bu(t)

4 y(t) = Cx(t)

where A ∈ R^{n×n}, B ∈ R^{n×n_i}, C ∈ R^{n_o×n} and L ∈ R^{n×n_o}. Note that the resulting spectrum is σ(A+LC). As already observed in Comment (8.5.23c), the eigenvalues corresponding to unobservable hidden modes of the pair (C,A) are always in σ(A+LC) for every L ∈ R^{n×n_o}. Hence, in order to have arbitrary spectral assignability by linear output-injection, it is necessary that the pair (C,A) be observable. This is also sufficient. Indeed we have the following theorem.

7 Theorem [Spectral assignability by output-injection]. Let A ∈ R^{n×n} and C ∈ R^{n_o×n}. Let π(s) ∈ R[s] denote any monic real polynomial of degree n, i.e.

8 π(s) = sⁿ + (π_n s^{n−1} + ⋯ + π₁),

where, for all i ∈ n, πᵢ ∈ R.
U.t.c.

9 ∀ π ∃ L ∈ R^{n×n_o} s.t. χ_{A+LC} = π

if and only if

10 the pair (C,A) is observable.

Fig. 10.2. Implementation of linear output-injection; L is the output-injection matrix.

11 Comments. α) The proof below shows that Theorem (7) and Theorem (10.1.46) are related by the duality, (8.4.13), of observability and controllability.
β) In the single-output case the output-injection matrix L in (9) is unique (see Exercise (14) below).

13 Proof of Theorem (7). By duality, (8.4.13), the pair (C,A) is observable iff the pair (A*,C*) is controllable. Note that all matrices are real, and hence, for any L ∈ R^{n×n_o}, A* + C*L* = (A+LC)ᵀ, whence χ_{A*+C*L*} = χ_{A+LC}. Thus, using Theorem (10.1.46),

∀ π ∃ L ∈ R^{n×n_o} s.t. χ_{A+LC} = π

iff

∀ π ∃ L* ∈ R^{n_o×n} s.t. χ_{A*+C*L*} = π

iff

the pair (A*,C*) is controllable

iff

the pair (C,A) is observable. •

14 Exercise [Single-output case]. Let A ∈ R^{n×n} and c* ∈ Rⁿ be such that the pair (c,A) is observable. Let O denote the observability matrix, (8.5.2), of (c,A) and let εᵢ denote the ith standard unit vector of Rⁿ. Let π(s) ∈ R[s] denote any monic real polynomial of degree n given by (8).
Show that:

l ∈ Rⁿ is s.t. χ_{A+lc} = π

if and only if

l = −π(A)q,

where q ∈ Rⁿ is given by

q = O^{−1}ε_n.

[Hint: use Theorem (10.1.31) and duality.]

By Theorem (8.7.70) the constant pair (C,A) is detectable iff the pair (A*,C*) is stabilizable. Hence by Theorem (10.1.53) we have the following theorem.

15 Theorem [Stabilization by output-injection]. Let A ∈ R^{n×n} and C ∈ R^{n_o×n}. Let L ∈ R^{n×n_o} denote any output-injection matrix.
U.t.c.

∃ L ∈ R^{n×n_o} s.t. σ(A+LC) ⊂ ℂ°₋

if and only if

the pair (C,A) is detectable.

16 Exercise. Prove Theorem (15) in detail.

Linear State Estimation

A direct application of constant linear output-injection is the construction of a full-order state-estimator (also called an observer); see Fig. 10.3, where x̂ is the estimated state and L ∈ R^{n×n_o} is the output-error injection matrix.
We are given a real time-invariant system representation R = [A,B,C,0], i.e.

20 ẋ(t) = Ax(t) + Bu(t)

21 y(t) = Cx(t)

where A ∈ R^{n×n}, B ∈ R^{n×n_i} and C ∈ R^{n_o×n}.

Fig. 10.3. A full-order state-estimator: (u,y) ↦ x̂.

The estimator (shown in Fig. 10.3 surrounded by broken lines) is constructed by taking a copy of R and applying constant output-error injection using an output-error injection matrix L ∈ R^{n×n_o}. Hence, with x̂(t) ∈ Rⁿ denoting the estimated state, the estimator is described by the d.e.

(d/dt)x̂(t) = Ax̂(t) + Bu(t) + L(ŷ(t)−y(t)), t ∈ R₊,

where ŷ(t) = Cx̂(t), or equivalently

22 (d/dt)x̂(t) = (A+LC)x̂(t) + Bu(t) − Ly(t).

Thus the estimator becomes a dynamical map

[u(·),y(·)] ↦ x̂(·).

We wish to find conditions such that the estimated state x̂(·) "follows" the state x(·) of the given system R, i.e. we would like the state estimation error e(·), given by

23 e(t) = x̂(t)−x(t),

to tend to zero exponentially as t → ∞.

Analysis. By (20)–(23) one has

24 ė(t) = (A+LC)e(t),

i.e. the estimation error satisfies

e(t) = exp[(A+LC)t]e(0),

where

e(0) = x̂(0)−x(0)

is the initial estimation error.
Hence it follows that the dynamical behavior of e(·) depends upon σ(A+LC), i.e. the eigenvalues of A+LC. Therefore, by Theorems (7) and (15), these dynamics can (by an appropriate choice of L ∈ R^{n×n_o}) be chosen to be exp. stable iff the pair (C,A) is detectable, and these dynamics have an arbitrarily assignable spectrum iff the pair (C,A) is observable. •
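By the duality used in the proof of Theorem (7), an output-error injection matrix L can be computed with the same pole-placement machinery applied to the pair (A*,C*). A sketch (numpy/scipy assumed; the plant data are hypothetical):

    import numpy as np
    from scipy.signal import place_poles

    A = np.array([[0.0, 1.0], [-2.0, -0.5]])
    C = np.array([[1.0, 0.0]])

    # duality: chi_{A+LC} = chi_{A* + C* L*}, so place poles for (A.T, C.T)
    res = place_poles(A.T, C.T, [-5.0, -6.0])
    L = -res.gain_matrix.T
    print(np.linalg.eigvals(A + L @ C))      # approx. -5 and -6
    # by (24), the estimation error then obeys e_dot = (A+LC)e -> 0 exp. fast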
The full-order state estimator described by (22) assumes perfect knowledge of A, B, C. Suppose now that we have only imperfect knowledge of the given system R = [A,B,C,0]: more precisely, suppose that the estimator is given by (22) but the system under consideration is described by (20) and (21), in which A (B, C, resp.) is replaced by A+δA (B+δB, C+δC, resp.) with ‖δA‖ ≪ ‖A‖, ‖δB‖ ≪ ‖B‖ and ‖δC‖ ≪ ‖C‖.

25 Exercise. Treat each of these perturbations independently and treat δA, δB and δC as first-order parameters. Investigate the difference between the perturbed and the unperturbed state-estimation errors.

10.3. State Feedback of the Estimated State

In many engineering situations some components of the state are not available for measurement. One way to perform state feedback approximately is to use the estimated state for feedback, as shown in Fig. 10.4. We'll show that this idea works much better than expected.
We are given 1) a real time-invariant system representation R = [A,B,C,0], i.e.

1 ẋ(t) = Ax(t) + Bu(t)

2 y(t) = Cx(t)      t ∈ R₊

where A ∈ R^{n×n}, B ∈ R^{n×n_i} and C ∈ R^{n_o×n};

2) a (full-order) state estimator as in Sec. 10.2, i.e.

3 (d/dt)x̂(t) = (A+LC)x̂(t) + Bu(t) − Ly(t), t ∈ R₊,

where x̂(t) ∈ Rⁿ is the estimated state and L ∈ R^{n×n_o};

3) a constant linear feedback law of the estimated state, i.e.

4 u(t) = Fx̂(t) + v(t),

where F ∈ R^{n_i×n} and v(·) is the new exogenous control.

Fig. 10.4. State feedback of the estimated state.

Analysis. Equations (1)–(4) generate a feedback system R_f : v(·) ↦ y(·). With

5 e(t) := x̂(t)−x(t)

denoting the state-estimation error, R_f has the state representation

6 [ ẋ(t) ; ė(t) ] = [ A+BF  BF ; 0  A+LC ][ x(t) ; e(t) ] + [ B ; 0 ]v(t)

7 y(t) = [C : 0][ x(t) ; e(t) ].

Using (6) and (7) we find that the I/O transfer function from v to y is given by

8 Ĥ(s) = C(sI−A−BF)^{−1}B.

Let A_f denote the A matrix in (6).
This analysis shows that:
i) For all inputs v(·) and all initial conditions, the estimation error e(·) satisfies ė = (A+LC)e; its dynamics are dictated by the output-error injection matrix L ∈ R^{n×n_o}; the error e(·) is driven neither by the input nor by the output. The states of the form (θ,e(t)) are uncontrollable, and hence the estimator modes do not appear in Ĥ(s).
ii) The I/O behavior (reflected by Ĥ(s)) is dictated by the state-feedback matrix F ∈ R^{n_i×n}, and is independent of L.
iii) The dynamics of the state [x(t),e(t)]ᵀ (defined by A_f) has the separation property, viz.

9 σ(A_f) = σ(A+BF) ∪ σ(A+LC).

Hence these overall dynamics can be independently manipulated by state-feedback (A+BF: control law) and by output-injection (A+LC: estimation error). By Theorems (10.1.46), (10.1.53), (10.2.7) and (10.2.15), the spectrum of A_f
a) can be made exp. stable iff (A,B) is stabilizable and (C,A) is detectable, and
b) is arbitrarily assignable iff (A,B) is controllable and (C,A) is observable.
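The separation property (9) is easy to observe numerically by assembling A_f from (6). A sketch (numpy assumed; the plant and the gains F, L are hypothetical):

    import numpy as np

    A = np.array([[0.0, 1.0], [1.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    F = np.array([[-2.0, -3.0]])             # hypothetical state-feedback gain
    L = np.array([[-8.0], [-17.0]])          # hypothetical output-injection gain

    Af = np.block([[A + B @ F, B @ F],       # block triangular, cf. (6)
                   [np.zeros((2, 2)), A + L @ C]])

    # (9): sigma(Af) = sigma(A+BF) U sigma(A+LC)
    print(np.sort(np.linalg.eigvals(Af)))
    print(np.sort(np.concatenate([np.linalg.eigvals(A + B @ F),
                                  np.linalg.eigvals(A + L @ C)])))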

10 Remark. In practice the choice of the gain matrices F and L is subject to many constraints; typically optimization may be used to choose them. For example, for the choice of F one may solve a time-invariant infinite horizon LQ optimal control problem, as is done in Sec. 10.4 below. For the choice of F and L in a more general stochastic LQ-context, see [Kwa.1].

11 Exercise. For the system of Fig. 10.4, repeat Exercise (10.2.25). In addition, investigate the I/O transfer function of the perturbed system.

10.4. Infinite Horizon Linear-Quadratic Optimization

The purpose of this section is to present the solution of the infinite horizon linear-quadratic (LQ) optimization problem for the time-invariant case. The final result is stated completely in Theorem (91) below. The optimal linear feedback control requires the computation of P₊, the (unique) positive semi-definite (stabilizing) solution of the algebraic Riccati equation; a method for this computation is given in (85).
To understand the proper context of the infinite horizon problem we give, for the time-invariant case, a preliminary discussion of the solution of the fixed horizon standard LQ problem of Sec. 2.1.7.

Preliminary Discussion. Consider the finite horizon standard LQ problem (2.1.136)–(2.1.140), where A(·), B(·) and C(·) are constant matrices, viz. A, B and C. To take full advantage of the time-invariance we shall consider [t₀,t₁] as an interval of R (instead of R₊). Moreover, to grasp the impact of various parameters, the main quantities describing the solution will be written in more detail.
Thus we shall denote the cost due to an initial state x(t₀) = x₀ at t₀ and a control u(·) during [t₀,t₁] as J(t₁,t₀,x₀,u(·)) (instead of J(u(·))). Similarly, J⁰(t₁,t₀,x₀) will denote the optimal cost on [t₀,t₁] due to an initial state x(t₀) = x₀ at t₀ (instead of J⁰), and P(t,t₁,S) will denote the solution of the matrix Riccati d.e. at time t due to a terminal condition P(t₁) = S at time t₁.
With these conventions and notations we consider the (extended) time-invariant standard LQ problem.

Data. We are given a)

1 a horizon t₁ ∈ R and an arbitrary t₀ ∈ (−∞,t₁);

b) a real time-invariant system representation R = [A,B,I,0], (3.2.1)–(3.2.5), where the input u(·) is applied during the interval [t₀,t₁] ⊂ R, thus

2 ẋ(t) = Ax(t) + Bu(t), t ∈ [t₀,t₁] ⊂ R,

where the initial state is given by

3 x(t₀) = x₀ ∈ Rⁿ;

c) a quadratic cost functional

4 2J(t₁,t₀,x₀,u(·)) = ∫_{t₀}^{t₁} [‖u(t)‖² + ‖Cx(t)‖²] dt + x(t₁)*Sx(t₁),

where

5 C ∈ R^{n_o×n} and S = S* ∈ R^{n×n} s.t. S ≥ 0,

and ‖·‖ denotes the Euclidean norm. (As usual, it is understood that x(t) in Eq. (4) is an abbreviation for s(t,t₀,x₀,u(·)).)

6 Problem. Minimize the cost J(t₁,t₀,x₀,·) over all possible inputs u(·) of class PC of piecewise continuous functions from [t₀,t₁] to R^{n_i}.

A quick review of the proofs of Theorems (2.1.67), p. 35, and (2.1.80), p. 37, shows that we established in Chapter 2 the following result.

10 Theorem [Time-invariant standard LQ-problem]. Consider the time-invariant standard LQ-problem (1)–(6), where the horizon t₁ is fixed and t₀ ∈ (−∞,t₁) is arbitrary.
U.t.c. a) On any interval [t₀,t₁] the LQ-problem is solved by the linear state-feedback law (independent of t₀)

11 u(t) = −B*P(t,t₁,S)x(t), t ∈ (−∞,t₁],

where P(·,t₁,S) = P(·,t₁,S)* ≥ 0 is the n×n real matrix-valued function defined upon (−∞,t₁] as the unique solution of the backwards matrix d.e.

12 −Ṗ(t) = A*P(t) + P(t)A − P(t)BB*P(t) + C*C, t ∈ (−∞,t₁],

with

13 P(t₁) = S

((12) is called the Matrix Riccati d.e., abbreviated RDE).
b) On any interval [t₀,t₁] ⊂ R, the LQ-problem has the optimal cost J⁰(t₁,t₀,x₀) given by the quadratic form

14 2J⁰(t₁,t₀,x₀) = ⟨P(t₀,t₁,S)x₀,x₀⟩

and generates the optimal closed-loop system dynamics described by the linear homogeneous d.e.

15 ẋ(t) = [A−BB*P(t,t₁,S)]x(t), t ∈ [t₀,t₁],

16 x(t₀) = x₀

(by the substitution of u(·), (11), in ẋ = Ax+Bu). •



20 Remarks. a) The time-invariant standard LQ-problem has constant matrix parameters A ∈ R^{n×n}, B ∈ R^{n×n_i}, C ∈ R^{n_o×n} and S ∈ R^{n×n}. Hence the RHS's of the d.e.'s (2) and (12) do not depend explicitly on t, and therefore (by Exercise (5.2.6)), with T_τ the time shift operator (5.2.1), the state and the solution of the Riccati d.e. are shift-invariant; in particular, ∀ τ ∈ R,

22 P(t,t₁,S) = P(t+τ,t₁+τ,S).

Therefore we see (by (4) and (14)) that the cost and the optimal cost are shift-invariant, i.e. ∀ τ ∈ R they are unchanged when (t₀,t₁) is replaced by (t₀+τ,t₁+τ) (the input being shifted accordingly). Hence there is no loss of generality if we set t₀ = 0. This will be done below.
b) The unique solution of the RDE is necessarily symmetric: first, P(t₁) = S and S is symmetric; second, visualizing Picard iterations to solve the RDE, every iterate is symmetric, hence so is the limit, namely the unique solution of the RDE.
c) The final state penalty term x(t₁)*Sx(t₁) in the cost (4) is usually there to avoid that the final state x(t₁) deviate too much from zero. As the horizon t₁ → ∞, this becomes clearly a pseudo-stability consideration. However, in that case it can be shown that, under generic conditions, [Kwa.1], an exp. stable optimal state trajectory is obtained for any final state penalty matrix S = S* ≥ 0. So under these conditions x(∞) = θ and the final state penalty becomes zero, i.e. S = 0 is acceptable. This will be done below. Note, however, that if one has duality in mind and one wants to design a stochastic LQ-optimal estimator, [Kwa.1], then S becomes the initial variance of the optimal state estimation error, and so for certain applications it might be imperative that S be nonzero.
d) The infinite horizon problem below must be seen as the limiting case for t₁ → ∞ of the time-invariant standard LQ problem. This implies the following:
1) In view of (4), where t₀ = 0 and S = 0, the limit cost given by

25 2J(∞,0,x₀,u(·)) = lim_{t₁→∞} 2J(t₁,0,x₀,u(·)) = ∫₀^∞ [‖u(t)‖² + ‖Cx(t)‖²] dt

exists only in R̄₊ := R₊ ∪ {∞} (the cost may blow up in the limit).
2) In view of Theorem (10), the solution of the infinite horizon problem depends critically upon the solution in the limit of the Riccati d.e. (12). It becomes therefore important to study symmetric solutions of the equation obtained from (12) by setting Ṗ equal to zero, i.e.

26 0 = A*P + PA − PBB*P + C*C

for P = P* ∈ R^{n×n}.
(Eq. (26) is called the algebraic Riccati equation, abbreviated ARE.)
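Numerically, the ARE (26) (here normalized, with unit input weight and Q = C*C) can be solved directly with scipy; a minimal sketch with a hypothetical plant:

    import numpy as np
    from scipy.linalg import solve_continuous_are

    A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (hypothetical)
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])

    # (26): 0 = A*P + PA - P B B* P + C*C
    P = solve_continuous_are(A, B, C.T @ C, np.eye(1))
    print(np.linalg.eigvals(A - B @ B.T @ P))   # stable: the P found is P+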
We now state the infinite horizon problem; for convenience we call it the standard LQ∞ problem.

30 Standard LQ∞-problem.
Data. We are given
a) a real time-invariant system representation R = [A,B,I,0] described by

31 ẋ(t) = Ax(t) + Bu(t), t ∈ R₊,

where the initial state is given by

32 x(0) = x₀ ∈ Rⁿ;

b) a quadratic cost functional

33 2J(∞,0,x₀,u(·)) = ∫₀^∞ [‖u(t)‖² + ‖Cx(t)‖²] dt,

where

34 C ∈ R^{n_o×n},

‖·‖ denotes the Euclidean norm and x(·) denotes the solution of (31) due to the control u(·) and the initial condition x₀.

35 Problem. Minimize the cost J(∞,0,x₀,·) over all possible inputs u(·) of class PC.

The solution of problem LQ∞ is usually done in two steps. First one establishes the existence of a unique positive semi-definite (p.s.d.) solution P₊ of the ARE such that A − BB*P₊ is exp. stable, and then problem LQ∞ is solved by an optimal constant linear state-feedback law u = −B*P₊x.

38 Theorem [The ARE has a p.s.d. solution that is unique and stabilizing]. Consider the standard LQ∞ problem, (30)–(35). Assume that the pair (A,B) is stabilizable and that the pair (C,A) is detectable.
U.t.c.
i) the ARE (26) has a positive semi-definite solution P₊;
ii) this solution P₊ is unique and stabilizing; more precisely, it is unique and the state-feedback control u(t) = −B*P₊x(t) results in a system matrix A₊ ∈ R^{n×n} given by

39 A₊ := A − BB*P₊

s.t. the closed-loop system dynamics

40 ẋ = A₊x is exponentially stable,

equivalently

41 σ(A₊) ⊂ ℂ°₋.

42 Comments. α) The matrix P₊ will be shown to be a limit solution of the RDE; more precisely, for any fixed t ∈ R,

43 P₊ = lim_{t₁→∞} P(t,t₁,0) = lim_{t₁→∞} P(0,t₁,0) = lim_{t→−∞} P(t,0,0).

The two last equalities in (43) follow by the time-invariance relation (22), which implies

44 P(t,t₁,0) = P(0,t₁−t,0) = P(t−t₁,0,0).

Therefore the unique p.s.d. stabilizing solution P₊ can be obtained by integrating the RDE (12) backwards from P(0) = 0. (This method is not recommended for computation.)
β) The matrix A₊, given by (39), is obtained by substituting the constant linear state-feedback law u = −B*P₊x in ẋ = Ax + Bu. The latter feedback law will be shown to be optimal for problem LQ∞ in Theorem (91) below. Therefore the exp. stability, (40), of ẋ = A₊x implies that the optimal closed-loop dynamics will always be exp. stable. Note that the spectrum of A₊ contains every eigenvalue corresponding to the stable hidden modes of the system [A,B,C,0]. This is natural for the (A,B)-uncontrollable modes and follows by Corollary (75) below for the (C,A)-unobservable modes, because the latter are in N(P₊).

48 Proof of Theorem (38). The proof is done in three steps. In step 1 we establish the existence of a p.s.d. symmetric solution P₊ of the ARE (26) as a limit solution (43) of the RDE (12). In step 2 we prove that such a solution is stabilizing, i.e. (39)–(41) hold. Finally, in step 3 we show that a symmetric stabilizing solution of the ARE is unique.

Step 1. Existence: the ARE (26) has a p.s.d. solution P₊ = P₊* such that (43) holds.
Observe first that, because of (44), Eq. (43) reduces to

49 P₊ = lim_{t₁→∞} P(0,t₁,0),

i.e., when the horizon recedes to infinity, the solution at time 0 of the RDE (12) with P(t₁) = 0 converges to a p.s.d. symmetric solution P₊ of the ARE (26). We start to prove this.
For this purpose we first claim that, for any x₀ ∈ Rⁿ,

50 the function t₁ ↦ x₀*P(0,t₁,0)x₀ : R → R₊ is increasing.

Indeed, by Theorem (10) with S = 0, the finite-horizon optimal cost satisfies

51 0 ≤ 2J⁰(t₁,0,x₀) = x₀*P(0,t₁,0)x₀ = min {2J(t₁,0,x₀,u(·)) ; u(·) ∈ PC},

where, with S = 0,

52 2J(t₁,0,x₀,u(·)) = ∫₀^{t₁} [‖u(t)‖² + ‖Cx(t)‖²] dt.

Now, for any fixed control u(·) of class PC on R₊, the cost J(t₁,0,x₀,u(·)) is increasing as t₁ increases. Hence claim (50) follows by (51) and (52).
Next we claim that

∃ K < ∞ s.t., for any x₀ ∈ Rⁿ, for any t₁ ∈ R,

53 x₀*P(0,t₁,0)x₀ ≤ K‖x₀‖²,

i.e.

54 the function t₁ ↦ x₀*P(0,t₁,0)x₀ is bounded on R₊ by K‖x₀‖², where K is independent of t₁ and x₀.

Indeed, since the pair (A,B) is stabilizable, it follows by Theorem (10.1.53) that there exists a state-feedback matrix F ∈ R^{n_i×n} s.t. the linear state-feedback control

55 u(t) = Fx(t)

transforms (31) into an exp. stable d.e.

56 ẋ(t) = (A+BF)x(t),

where by (7.2.17) and (7.2.33) there exist an m < ∞ and an α > 0 s.t., for any x₀ ∈ Rⁿ and ∀ t ≥ 0,

57 ‖exp[(A+BF)t]x₀‖ ≤ m exp(−αt)‖x₀‖.

Hence, for the specific control (55)–(56), the cost (52) satisfies, for any x₀ ∈ Rⁿ, for any t₁ ∈ R,

58 2J(t₁,0,x₀,u(·)) ≤ 2J(∞,0,x₀,u(·)) = ∫₀^∞ ⟨exp[(A+BF)t]x₀, (F*F+C*C)exp[(A+BF)t]x₀⟩ dt ≤ K‖x₀‖²,

with

59 K := ‖F*F+C*C‖ m²/(2α).

(For the last inequality of (58) we used Schwarz's inequality and (57); the matrix norm in (59) is the norm induced by the Euclidean norm on Rⁿ.)
Claim (53) follows now by (51) and (58)–(59).
Now by claims (50) and (53) it follows that for every x₀ there exists a p(x₀) ∈ R₊ such that

60 p(x₀) = lim_{t₁→∞} x₀*P(0,t₁,0)x₀ ≥ 0;

indeed, as t₁ → ∞, the increasing function t₁ ↦ x₀*P(0,t₁,0)x₀ is bounded, hence it converges to a finite nonnegative number p(x₀).
Now denote by P_{ij}(0,t₁,0) the ijth element of the symmetric n×n matrix P(0,t₁,0) and set x₀ in (60) successively equal to the standard unit vectors εᵢ, εⱼ of Rⁿ as well as their sum εᵢ+εⱼ. It follows by (60) that there exist finite nonnegative numbers p(εᵢ), p(εⱼ) and p(εᵢ+εⱼ) such that

for all i ∈ n: p(εᵢ) = lim_{t₁→∞} P_{ii}(0,t₁,0)

and

for all i,j ∈ n×n: [p(εᵢ+εⱼ) − p(εᵢ) − p(εⱼ)] = 2 lim_{t₁→∞} P_{ij}(0,t₁,0).

Hence by (60) every element of the p.s.d. symmetric matrix P(0,t₁,0) converges as t₁ → ∞, i.e. there is a real p.s.d. symmetric matrix P₊ = P₊* s.t.

61 P₊ = lim_{t₁→∞} P(0,t₁,0) ≥ 0.

We show now that P₊ is a solution of the ARE (26). Indeed, by (61), as t₁ → ∞, the RHS of the RDE (12) converges to a constant matrix, and hence the LHS, −Ṗ, converges to a constant matrix. The only possible limit is zero (indeed, if lim Ṗ were a nonzero constant, then, for t₁ large, P(0,t₁,0) = P(−t₁,0,0) would not converge). Hence P₊ is a solution of the ARE, and step 1 is proved.

Step 2. The matrix P₊, as a p.s.d. solution of the ARE, is a stabilizing solution, i.e. (39)–(41) hold.
We prove (41). Consider any eigenvalue λ of A₊ with corresponding nonzero eigenvector e ∈ ℂⁿ, thus

62 A₊e = λe, e ≠ θ.

We are done if Re λ < 0. Now since P₊ is a solution of the ARE (26) and A₊ := A−BB*P₊, we have

63 0 = A₊*P₊ + P₊A₊ + P₊BB*P₊ + C*C.

Pre- and postmultiply (63) by e* and e, resp.; then use (62):

64 2(Re λ)(e*P₊e) + ‖B*P₊e‖² + ‖Ce‖² = 0,

where e*P₊e ≥ 0 (P₊ is p.s.d. by step 1). Two cases occur: either e*P₊e > 0 or e*P₊e = 0.

Case 1: e*P₊e > 0. By (64), Re λ ≤ 0. We note that Re λ = 0 implies B*P₊e = θ and Ce = θ. Recalling

39 A₊ := A − BB*P₊,

we obtain

Ae = A₊e = λe with Re λ = 0 and Ce = θ.

Note that the eigenvector e is a nonzero vector. Hence it is an unstable unobservable hidden mode of the pair (C,A); this is a contradiction, since (C,A) is assumed to be detectable. Therefore we have

65 Re λ < 0.

Case 2: e*P₊e = 0, equivalently, e ∈ N(P₊). Then by (64) and (39), for the eigenvector e defined in (62), we have Ae = A₊e = λe with Ce = θ. Hence e is an unobservable hidden mode of (C,A) at λ. By assumption, (C,A) is detectable (no unstable unobservable hidden modes); hence Re λ < 0. Therefore

66 Re λ < 0.

By (65) and (66), Re λ < 0 in all cases, and every eigenvalue of A₊ has a strictly negative real part. So (41) is established.

Step 3. P₊, as a symmetric stabilizing solution of the ARE, is unique.
To prove this, let P₊ and P̃₊ be two such solutions. Therefore

67 A₊ := A−BB*P₊ and Ã₊ := A−BB*P̃₊

satisfy

68 σ(A₊) ⊂ ℂ°₋ and σ(Ã₊) ⊂ ℂ°₋,

while

69 0 = A*P₊ + P₊A − P₊BB*P₊ + C*C,

70 0 = A*P̃₊ + P̃₊A − P̃₊BB*P̃₊ + C*C.

Define now

71 δP₊ := P₊ − P̃₊

and note that we are done if δP₊ = 0.
Now (67) and (69)–(71) deliver, upon subtracting (70) from (69),

72 Ã₊*δP₊ + δP₊A₊ = 0.

Now equation (72) is a linear homogeneous equation in δP₊ of the form 𝒜x = θ, where i) the vector x ∈ R^{n²} is obtained by stacking the columns of δP₊ consecutively, and ii) 𝒜 ∈ R^{n²×n²} is a matrix representation of the operator

X ∈ R^{n×n} ↦ Ã₊*X + XA₊ ∈ R^{n×n}.

By Theorem (4.8.1), 𝒜 has eigenvalues of the form λᵢ + μⱼ for some eigenvalue λᵢ of Ã₊ and μⱼ of A₊. Since by (68) the latter matrices have their spectra in ℂ°₋, it follows that any eigenvalue λ of 𝒜 is s.t. Re λ < 0. Hence all eigenvalues of 𝒜 are nonzero and therefore 𝒜 is nonsingular. Hence 𝒜x = θ results in x = θ, i.e. δP₊ = 0, and we are done. •

The following corollary and theorem i) reveal the structure of the null space of P₊ and ii) indicate an algebraic way to compute P₊.

75 Corollary [N(P₊)]. Let (A,B) be stabilizable and (C,A) detectable. Consider the symmetric p.s.d. solution P₊ of the ARE (26) (hence P₊ is unique and stabilizing). Let N₋(A) denote the stable subspace of A, (8.7.33), and let O(C,A) denote the observability matrix of the pair (C,A), (8.5.2).
U.t.c.

76 N(P₊) = N₋(A) ∩ N(O(C,A)).

Hence

77 P₊ > 0 ⇔ (C,A) is observable.

78 Comments. α) Equation (76) means that the null space of P₊ is spanned by the stable unobservable modes of the pair (C,A).
β) Since (C,A) is assumed to be detectable (no unstable unobservable hidden modes), (77) is a direct consequence of (76).
γ) The RHS of (76) is the largest stable A-invariant subspace in N(C).

79 Short proof. Note that x ∈ N₋(A) iff lim_{t→∞} e^{At}x = θ, and x ∈ N(O(C,A)) iff CAⁱx = θ ∀ i = 0,1,2,…. We prove (76) by asserting a two-way inclusion of sets.
⊂: P₊ satisfies, with A₊ = A−BB*P₊,

63 0 = A₊*P₊ + P₊A₊ + P₊BB*P₊ + C*C.

Therefore x ∈ N(P₊) implies A₊x = Ax and

80 B*P₊x = θ, Cx = θ, P₊A₊x = P₊Ax = θ.

Hence N(P₊) is both A-invariant and A₊-invariant. Therefore

81 x ∈ N(P₊) ⇒ ∀ i ∈ N: Aⁱx and A₊ⁱx ∈ N(P₊).

Moreover, x ∈ N(P₊) ⇒ Ax = A₊x and, by an easy induction,

∀ i ∈ N: Aⁱx = A₊ⁱx,

hence

e^{At}x = e^{A₊t}x ∀ t ≥ 0,

where, by (41), σ(A₊) ⊂ ℂ°₋. Hence

N(P₊) ⊂ N₋(A).

Moreover, by (80), N(P₊) is an A-invariant subspace in N(C), so

N(P₊) ⊂ N(O(C,A)).

⊃: P₊ satisfies the ARE, i.e.

82 0 = A*P₊ + P₊A − P₊BB*P₊ + C*C.

Let x ∈ N₋(A) ∩ N(O(C,A)). Premultiply (82) by x*e^{A*t} and postmultiply it by e^{At}x. Note that Ce^{At}x ≡ θ. Hence, by the ARE,

(d/dt)⟨e^{At}x, P₊e^{At}x⟩ = ‖B*P₊e^{At}x‖²,

which may be integrated on R₊ since x ∈ N₋(A). Therefore

−⟨x,P₊x⟩ = ∫₀^∞ ‖B*P₊e^{At}x‖² dt ≥ 0;

since P₊ ≥ 0, it follows that ⟨x,P₊x⟩ = 0, whence P₊x = θ, i.e. x ∈ N(P₊). •


85 Theorem [Properties of the Hamiltonian] [Kuc.1, Lau.1]. Assume (A,B) stabilizable and (C,A) detectable. Define

86 H := [ A  −BB* ; −C*C  −A* ] ∈ R^{2n×2n}.

U.t.c.
a) H has no eigenvalues on the imaginary axis;
b) λ ∈ σ(H) ⇔ −λ̄ ∈ σ(H)
[i.e. the eigenvalues of H have quadrantal symmetry: they are symmetric w.r.t. the real axis (because H is real) and symmetric w.r.t. the imaginary axis (by b)); hence N₋(H), the stable subspace of H, has dimension n (by a) and b))];
c) let Z ∈ R^{2n×n} have columns forming a basis of N₋(H), and set

87 Z = [ X ; X̂ ] ∈ R^{2n×n};

then X is nonsingular and

88 P₊ = X̂X^{−1}.

Comment. [Lau.1] gives a numerically attractive method for obtaining a basis of Schur vectors of N₋(H) and hence a method for computing P₊.

Proof. Let P₊ = P₊* ≥ 0 be the unique stabilizing p.s.d. solution of the ARE (26); as before let A₊ = A−BB*P₊. Consider the change of coordinates z̄ = T^{−1}z ∈ R^{2n}, where

89 T := [ I  0 ; P₊  I ] ∈ R^{2n×2n}.

An easy calculation shows that

90 H̄ := T^{−1}HT = [ A₊  −BB* ; 0  −A₊* ] ∈ R^{2n×2n}.

Hence σ(H) = σ(H̄) = σ(A₊) ∪ σ(−A₊*). By Theorem (38), σ(A₊) ⊂ ℂ°₋; hence N₋(H) is n-dimensional and conclusions a) and b) follow.
Let z = (xᵀ,x̃ᵀ)ᵀ ∈ R^{2n} be a vector in N₋(H); then, by (89), z̄ := T^{−1}z = (xᵀ,(x̃−P₊x)ᵀ)ᵀ, and z̄ ∈ N₋(H̄). Inspection of (90) gives x̃−P₊x = θ and z̄ = (xᵀ,θᵀ)ᵀ, where x ∈ N₋(A₊) = Rⁿ. Now let Z ∈ R^{2n×n} be any matrix as in (87) whose columns constitute a basis of N₋(H); then Z̄ := T^{−1}Z = [Xᵀ : (X̂−P₊X)ᵀ]ᵀ = [Xᵀ : 0ᵀ]ᵀ has rank n; thus X is nonsingular and X̂ = P₊X. We conclude that (88) holds for any such Z. •
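The Schur-vector computation suggested by Theorem (85) and [Lau.1] takes a few lines with scipy (a sketch, using the same hypothetical plant as above): an ordered real Schur decomposition of H puts a basis of N₋(H) in the first n columns of the orthogonal factor.

    import numpy as np
    from scipy.linalg import schur

    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    n = A.shape[0]

    H = np.block([[A, -B @ B.T],            # Hamiltonian (86)
                  [-C.T @ C, -A.T]])

    T, Z, sdim = schur(H, sort='lhp')       # stable eigenvalues ordered first
    assert sdim == n                        # dim N_-(H) = n, by a) and b)
    X, Xhat = Z[:n, :n], Z[n:, :n]          # cf. (87)
    P_plus = Xhat @ np.linalg.inv(X)        # (88)
    print(np.linalg.eigvals(A - B @ B.T @ P_plus))   # stable closed loop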

We are now ready to solve the standard LQ∞-problem by the sole knowledge of P₊.

91 Theorem [Optimal control and optimal cost of the standard LQ∞-problem]. Consider the standard LQ∞ problem, (30)–(35). Assume that the pair (A,B) is stabilizable and the pair (C,A) is detectable.
U.t.c. a) The standard LQ∞ problem is solved by the constant linear state-feedback law

92 u(t) = −B*P₊x(t),

where P₊ = P₊* ≥ 0 is the (unique) p.s.d. solution of the ARE (26); furthermore, A₊ = A−BB*P₊ is s.t. σ(A₊) ⊂ ℂ°₋.
b) The LQ∞-problem has the optimal cost J⁰(∞,0,x₀) given by the quadratic form

93 2J⁰(∞,0,x₀) = ⟨P₊x₀,x₀⟩

and generates the optimal closed-loop system dynamics described by the time-invariant exp. stable linear homogeneous d.e.

94 ẋ(t) = [A−BB*P₊]x(t),

95 x(0) = x₀

(by the substitution of u(·), (92), in ẋ = Ax+Bu). •

96 Comment. Theorem (91) clearly shows that the solution of the standard LQ∞-problem reduces to finding P₊, the p.s.d. solution of the ARE (26).

100 Proof of Theorem (91). Note that, since (A,B) is stabilizable and (C,A) detectable, by Theorem (38) the p.s.d. symmetric solution of the ARE (26) is unique and stabilizing. Let J(∞,0,x₀,u(·)) be the cost of LQ∞, (33), generated by any u(·) of class PC. Then by (25), for any t₁ ∈ R₊,

2J(∞,0,x₀,u(·)) ≥ 2J(t₁,0,x₀,u(·)) = ∫₀^{t₁} [‖u(t)‖² + ‖Cx(t)‖²] dt,

where x(·) satisfies (31)–(32). Hence, by Theorem (10), using the optimal cost (14) with S = 0 on [0,t₁],

2J(∞,0,x₀,u(·)) ≥ 2J⁰(t₁,0,x₀) = x₀*P(0,t₁,0)x₀.

Therefore, by the convergence (43) of P(0,t₁,0) as t₁ → ∞, for any u(·) of class PC on R₊,

101 2J(∞,0,x₀,u(·)) ≥ x₀*P₊x₀,

where P₊ is the unique symmetric p.s.d. stabilizing solution of the ARE (26). Thus the RHS of (101) is a lower bound for any cost of problem LQ∞. It remains to be shown that, for u(·) given by the state-feedback law (92), we have

2J(∞,0,x₀,u(·)) = x₀*P₊x₀.

Now, with the linear feedback control given by (92), ẋ = Ax+Bu becomes ẋ = A₊x (where A₊ = A−BB*P₊), which by (41) is exp. stable; whence, for all x₀,

102 lim_{t₁→∞} x(t₁) = θ.

Hence, using (92), (63) and ẋ = A₊x, we obtain successively

2J(t₁,0,x₀,u(·)) = ∫₀^{t₁} [‖u(t)‖² + ‖Cx(t)‖²] dt
= ∫₀^{t₁} ⟨x(t),[P₊BB*P₊ + C*C]x(t)⟩ dt
= −∫₀^{t₁} ⟨x(t),[A₊*P₊ + P₊A₊]x(t)⟩ dt
= −∫₀^{t₁} (d/dt)⟨x(t),P₊x(t)⟩ dt
= x₀*P₊x₀ − x(t₁)*P₊x(t₁).

Hence, by (102), for u(·) given by (92),

2J(∞,0,x₀,u(·)) = lim_{t₁→∞} 2J(t₁,0,x₀,u(·)) = x₀*P₊x₀,

i.e. the lower bound in (101) is attained for the constant state-feedback law (92). •

Denormalization
Let us start by emphasizing a well-known fact.

104 Fact. Let Re R nxn satisfy R=R*>O. Then from the expansion
n
R=I; "'jejej* (where the eigenvalues Aj>O, Vi and the eigenvectors ej are chosen
j=l
to form an orthonormal basis), we obtain

where both matrices are real, symmetric and positive definite.


Let us write the basic equations of our normalized formulation, to avoid ambigui-
ties we write uN, BN and uON to emphasize that these symbols correspond to the nor-
malized formulation:

00

33N J= f [IIUN(t)112+IICx(t)ll2]dt
o

26N

92N
345

94N

In the denonnalized fonnulation, the cost functional is


00

105 J= f [IIR 1i2u(t)ll2 + IICx(t)1I 2] dt


o

where R=R*>O is given.


Note that IIR 112u112 = uTRu, IICxll 2 = xTC" ex and the latter is usually written xTQx.
To go from the nonnalized fonnulation to the denonnalized fonnulation requires res-
caling u and B: the d.e. of the system must read

106 x(t) = Ax(t) + Bu(t)

Comparison of (33N) with (lOS) and of (31N) with (106) suggests

101

Introducing these changes in the ARE (26N), the optimal control (92N) and the
closed-loop d.e. (94N), we obtain

26D O=A*P+PA-PBR-1BP+C"C

92D

94D •
x(t) = [ A - BR- 1B*P+] x(t) .

These three equations are the basic equations of the denonnalized fonnulation.
Equations (26D), (92D) and (94D) show that once Q and R are chosen and the
assumptions of Theorem (38) hold, the closed-loop d.e. is completely detennined.
An elementary way of choosing Q and R is the following: 1) choose Q and
R > 0 diagonal; 2) since the state variables represent physical variables (positions, velo-
cities, currents, voltages, ... ), large diagonal values are assigned to those state-variables
whose deviation from zero (Le. nominal) are undesirable and smaller ones are assigned
to the other state-variables; 3) the inputs also represent physical quantities (voltages.
flows, aileron angles, ... ) so large diagonal values are assigned to those inputs that are
costly and smaller ones to the other ones. The lesson of these considerations is that
the LQoo-solution is essentially a tool, once Q and R are chosen, the optimal closed-
loop dynamics is fixed but the LQoo-theory does not provide the designer with a
methodology for choosing Q and R. That choice is a matter of art and experience.
For an interesting contribution relating the LQ-design and closed-loop perfonnance see
[Fuj.l].
346

Final comment. Every aspect of the LQoo-problem has not been studied in this sec-
tion. For more infonnation see, e.g. [Wil.lJ, [Mol.lJ, [Cop. I], [Rod.1J, [Kuc.IJ.
[Lau.1J. [Cal.2J.

IOd.4 Infinite Horizon Linear Quadratic Optimization: The Discrete-Time Case


As in the continuous-time case we consider first the (extended) time-invariant
standard LQ-problem.

Data. We are given a)

1 a horizon kl e Z and an arbitrary integer time ko<kl;


b) a real time-invariant discrete-time system representation R d = [A,B,I.OJ. (3d.2.1)-
(3d.2.2), with detA 'I: 0, where the input sequence u(·) is applied during
[ko.k1-l] c Z, thus

2 x(k+ 1) = Ax(k) + Bu(k)

where the initial state is given by

c) a quadratic cost functional

4 kf [IIU(k)ll2+ IICX(k)ll2] + x(kd'Sx(kl)


k=ko

where

5 Ce Rn.xn and S=S* e R nxn S.t. S

and 11'11 denotes the Euclidean norm.


(As usual. it is understood that x(k) in Eq. (4) is an abbreviation for s(k.ko.xo.u(·».)

6 Problem. Minimize the cost J(k1.ko.xo") over all possible input sequences
uO : [ko.k1-1]-+lRn, . •

A quick review of the proofs of Theorems (2.1.167) and (2d.1.180) shows that in
Chapter 2d we established the following result

10 Theorem [Time-invariant standard LQ-problem]. Consider the time-invariant


discrete-time standard LQ-problem (1}-(6). where the horizon kl is fixed and ko < kl is
arbitrary.
347

U.t.c. a) on any integer time-interval [ko.kl] the LQ-problem is solved by the linear
state1eedback law (independent of ko)

11 u(k)=-B*P(k+l.k l .S)[I+BB*P(k+l.k l .SW l Ax(k) for k<kl

where P(-.k1,S)=P(·.k1.S)* is the nxn real matrix-valued function defined upon


k kl as the unique solution of the backwards matrix r.e.

12 P(k-l) = c! c + A*P(k) [I+BB*P(k)r 1A


with

13

«(12) is called the matrix Riccati r.e., abbreviated RRE).


b) On any integer time-interval [ko.kd c Z the LQ-problem has the optimal cost
JO(k1.ko.xo) given by the quadratic form
14 2Jo(kj.ko.xo) =( P(ko.k1.S)xo.xo)

and generates the optimal closed-loop system dynamics described by the linear homo-
geneous r.e.

15 x(k+l) = [I + BB*P(k+ l.k 1o SW 1Ax(k)

16 x(I<o)=xo

(by the substitution of u(·). (11). in x(k+l)=Ax(k)+Bu(k».



20 Remark. As in the continuous-time case. the dynamics and cost of the LQ-
problem (1)-(6) are time-invariant, and the solution of the RRE (12) is shift-invariant,
i.e.
"if k'e Z

Thus w.l.g. we set 1<0=0. Moreover. we would like to know what happens as the hor-
izon kl for the case S=O (i.e. with zero final state penalty term in (4».

30 Standard LQoo-problem.
Data. We are given
a) a real time-invariant discrete-time system representation R d = [A,B,I,O], with
det A ':/: 0, described by
348

31 x(k+ 1) = Ax(k) + Bu(k) kENc Z

where the initial state is given by


32 x(O)=XOE Rn;

b) a quadratic cost functional

33 2J(oo,O,xo,u('» := [llu(k)ll2+ IICx(k)11 2 ]


k=O

where

34 CE RfloXJl ,

11'11 denotes the Euclidean norm and x(') denotes the solution of (31) to the control
u(') and the initial state xo.

35 Problem. Minimize the cost J(oo,O,xo,) over all possible input sequences
u('):N -+Rnl. •

The solution of the LQoo-problem, (30)-(35), involves the study of the associated
discrete-time algebraic Riccati equation (abbreviated ARE), which is obtained from
the RRE (12) by setting P(k-l)=P(k)=P, i.e.

26 P=c*C+A*P[I+BB*pr'A.

This nonlinear equation is to be solved for p=p* Note that p=p* implies
that det(I+BB*P)=det(I+B*PB) '" O.

36 Exercise. [Equivalent forms of the ARE]. Prove that equations (26a), (26b). and
(26c) below are equivalent to (26).

26a P=CC+A*[I+PBB*r'PA,

and
26c P=CC+A* [I+PBB* r'(p+ PBB*P)[ 1+ BB*pr'A .

38 Theorem [The ARE has a p.s.d. solution that is unique and stabilizing). Con-
sider the standard LQoo-problem, (30)-(35), Assume that the pair (A,B) is stabilizable
and that the pair (C,A) is detectable.
349

V.t.c.
i) the ARE (26) has a positive semi-definite solution P+,
ii) this solution is unique and stabilizing; more precisely it is unique and the state-
feedback control
u(k)=-B*P+[I+BB*P+]-IAx(k) ke N

in x(k+l) = Ax(k) + Bu(k) results in a system matrix A+E R nxn given by

39 A+=[I+BB*p+r1A

s.t. the closed-loop system dynamics


40 is exponentially stable ,

equivalently

41

42 Comments. a) It will tum out that the matrix P+ is a limit solution of the RRE
(12), more precisely, for any fixed kE Z

43 P+= lim P(k,k1,0)= lim P(0,k1,Q)= lim P(-k1,0,0)

(the equalities on the RHS follow by the shift-invariance of the solution of the RRE
(12), see (22». Hence the unique p.s.d. stabilizing solution P+ of the ARE can be
obtained in the limit by letting the RRE (12) run backwards from P(O)=O.
The matrix A+, given by (39), is obtained by substituting the constant linear state-
feedback law u(k)=-B*P+[I+BB*p+r1Ax(k) in x(k+l)= Ax(k)+ Bu(k). The latter
feedback will be shown to be optimal for problem LQoo in Theorem (91) below.
Therefore the expo stability, (40), of x(k+l)=A+x(k) implies that the optimal closed-
loop dynamics will always be expo stable.

48 Proof of Theorem (38). The proof is done in three steps as in the continuous-
time case. Details of the proof are presented only when the algebraic manipulations
are specific to the discrete-time case.

Step 1. The ARE (26) has a p.s.d. solution P+=P+* More precisely, as the
horizon kl recedes to 00, P(O,kl'O) i.e. the solution at 0 of the RRE (12) with P(k 1) = 0
converges to a p.s.d. symmetric solution P+ of the ARE (26). In other words, there
exists a solution P+= P+* 0 s.t.
350

49 P+= lim P(O,k 100) .


k1 ..... oo

A reasoning similar to the one used in the proof of Theorem (10.4.38) shows that for
every XOE R n the optimal finite horizon cost (4) on [O,k!] with S=O satisfies (by (14»

*
2Jo(k 1,0,xo)=xo P(0,k1,0)xo '

is increasing for kl increasing and is bounded for kl E N (because the pair (A,B) is
stabilizable). Hence for every Xo E R n there exists a nonnegative number p(xo) s.t.

60 p(xo) = lim xo*P(O,k! ,O)xo O.


k1-->oo

Therefore by appropriate choice of Xo in (60) we can show there exists a symmetric


p.s.d. real matrix P+ s.t.

61 P+= lim P(O,k"O)= lim P(-k"O,O),


k1 ..... oo k1-->oo

(where the last equality follows by the shift-invariance of the solution of the RRE, see
(22». Finally P+ in (61) is a solution of the ARE: this follows from (61) by letting k
tend to - 00 in the RRE (12) subject to P(O) = O.

Step 2. The matrix P+, as a p.s.d. symmetric solution of the ARE (26) is a stabiliz-
ing solution, i.e. with A+ given by (39), conclusion (40), equivalent to (41), holds.
We prove (41). Consider any eigenvalue A of A+ and a corresponding (nonzero)
eigenvector e E (Cn; thus

We are done if I A I < 1 (which is equivalent to (41». Now since P+ is a p.s.d. solution
of the ARE (26), by (39) and (26c) we obtain
63 AI<
P+=c C+A+* [P++P+BB*P+]A+ .

Now pre- and post-multiply (63) by e* and e; then, by (62)


64 (1-11.. I 2)e*P+e = I AI 2 I1B*P+eI1 2+ IICell 2

where e*P+e 0, (because P+ is p.s.d. by step 1).


Two cases occur: either e*P+e> 0 or e*P+e=O.

*
Case 1. e P+e > O. By (64) I A I 1. We claim that I AI = 1 leads to a contradic-
tion. Indeed by (64) 11..1 = 1 implies B*P+e=Ce=O. Since A+=[I+BB*p+r!A we
351

have
64a

So we obtain (using also (62»

A e = A+e with I AI = 1 and C e = e.


Hence the eigenvector e of A (which is nonzero) is an unstable unobservable hidden
mode of the pair (C,A); this is a contradiction since the pair (C,A) is assumed to be
detectable. Therefore

65

Case 2. e*P+e=O, equivalently ee N(P+). Then by (64) and (39) (:::> (64a», for the
eigenvector defined in (62) we have Ae=Ae with Ce=e. Hence e is an unobservable
hidden mode of the pair (C,A) at A. Since the pair (C,A) is detectable by assumption,
it follows that I AI < 1. Therefore

66

Thus, by (65) and (66), I AI < 1 in all cases, and every eigenvalue of A+ has magni-
tude strictly less than one. So (41) holds.

Step 3. P+ as a symmetric stabilizing solution of the ARE is unique.


To prove this let P+ and i,\ be two such solutions. Therefore
67 A+ =[I+BB*P+ ]-IA and A'-
+ [I+BB*P+ ]-IA

satisfy

68 o(A+) c D(O,I) and o(A+) c D(O,I) ,

and (using (26a) <:::> (26»

69 P+=C"C+A*[I+p+BB*r1p+A

Define now

and note that we are done if we show that BP+=O. Now by subtracting (70) from
352

(69), we have (using (67»

SP+=A* {[ I + P+BB* rIp+- P+[ 1+ BB* P+ rl }A

=A+* {P+[I+ BB* P+J -[ I +P+BB* J P+ }A+

S.t. (by (71»

72

Now Eq. (72) is a linear homogeneous equation in SP+ of the fonn A x = e where i)
the vector x e IRnz is obtained by stacking the columns of SP+ consecutively and ii)
A e RnZxnl is a matrix representation of the operator

Xe R nxn -+ X-A+*XA+.

By Exercise (4.8.10) A has of the fonn I-X"iJlj fo! some eigenvalue Ai of


A+ and for some eigenvalue Jlj of A+. Now by (68) A+ and A+ have their spectra in
0(0,1). Thus it follows that all eigenvalues of A are nonzero and therefore A is non-
singular. Hence A x = 9 implies x = 9, i.e. SP+ = 0, and we are done. •

77 Exercise. Assume the conditions of Theorem (38). Show that

p+ > 0 (C,A) is observable .



The following theorem is inspired by (2d.1.162) and Theorem (10.4.85). It reveals an
algebraic way to compute the p.s.d. solution P+ of the ARE (26).

85 Theorem [Properties of the backwards Hamiltonian]. Assume that the pair


(A,B) is stabilizable and that the pair (C,A) is detectable, Mth detA ;to O. Define

86
A-lBB* 1 E JR2nx2n •
A* +C'CKIBB*

V.t.e.
a) Hb has no eigenvalues on the unit circle,
b) AE a(Hb) ==- X"-l E a(Hb)
[i.e. the eigenvalues of Hb have double symmetry, viz. 1) w.r.t. to the real axis
(because H is real), and 2) w.r.t. to unit circle, (by b»; hence N +(H b), the unstable
353

subspace of Hb , has dimension n, (by a) and b».]


c) Let Z e 1R2nxn have columns fonning a basis of N +(Hb). Set

then
88

Proof. Exercise. Hints: copy the proof of Theorem (10.4.85). With P+ the p.s.d. sta-
bilizing solution fo the ARE (26) we have with

89 ,1_ [ I
0

1 e
R2nx2n
-p+

A;I I
K ••' ] .
90 Hb-,IHbT- [
0 A*
+

where A+ given by (39) is s.t. cr(A+) cD(O,1), (by (41».


z=(xT,x T? e N+(Hb), then, z:= rlz=(xT,(x_p+x)T)T e N+(Hb)' Thus
where xe N+(A.;I)=N_(A+)=IRR.

We are now ready to solve the standard LQoo-problem by the sole knowledge of p+.

91 Theorem [Optimal control and optimal cost of the standard LQoo-problem).


Consider the standard LQoo-problem, (30)-(35). Assume that the pair (A,B) is stabiliz-
able and that the pair (C,A) is detectable.
U.t.c. a) The standard LQoo-problem is solved by the constant linear statejeedback
law
92 keN

where p+=p+* is the (unique) p.s.d. solution of the ARE (26); furthennore
A+=[I+BB*p+rIA is S.t. cr(A+)cD(O,l).
b) The LQoo-problem has the optimal cost Jo(oo,O,xo) given by the quadratic fonn
354

and generates the optimal closed-loop system dynamics described by the time-invariant
expo stable linear homogeneous r.e.
94 x(k+ I) = [I + BB*P+rl Ax(k) keN,

95 x(O)=xo

(by the substitution of u(·), (92), in x(k+I)=Ax(k)+Bu(k».



96 Comments. a) Theorem (91) shows that the solution of the standard LQoo prob-
lem reduces to finding P+ the p.s.d. solution of the ARE (26).
The expo stable closed-loop system matrix

39 A+= [1+ BB*P+ riA

reduces the control law (92) and the closed-loop r.e. (95) to
97 u(k)=-B*
and ke N

98 x(k+ 1) = A+x(k) .

Note that compared to the continuous-time case formula (97) includes the additional
factor A+.

100 Proof of Theorem (91). Note that, since (A,B) is stabilizable and (C,A) detect-
able, by Theorem (38) the ARE (26) has a unique stabilizing p.s.d. solution P+.
Let J(oo,O,xo,u(·» be the cost of LQoo generated by any input sequence u(·) (see (33».
Then, on replacing the infinite horizon by a finite horizon kl e N and on using the
optimal cost on [O,ktl dictated by Theorem (10) with S=(),

Note that P(O,kl,O) converges for kl to P+ by (43). Hence, for any input
sequence u(·) and for any Xo
355

101

Thus XO*P+xo is a lower bound for any cost of problem LQoo. Hence we are done if
we show that, for u(') given by the state-feedback law (92), (Le. (97) by (39))

2J(oo,0,xo,u('))=xo P+xo . *
Observe for this purpose that for such u(·), x(k+ 1) = Ax(k) + Bu(k) has the form
(98), which by (40) is expo stable. Thus for all Xo

102 lim x(k 1)=6.

Recall that, by (26c), the p.s.d. matrix P+ satisfies the equivalent ARE

63 *
...11< C+A+ (P++P+BB P+)A+.
P+=c *

Hence using (92) o¢> (97), (63) and (98) we obtain successively

2J(k 1,O,xo,u('»

k,-l
= 1: [lIu(k)11 2 + IICx(k) 112]
k=O

= 1: * *
[x(k) P+x(k)-x(k+l) P+x(k+l)]
k=0

Hence by (102), for uO given by (92) as kl

2J(oo,O,xo,u('))=xo P+xo, *
i.e. the lower bound in (101) is attained for the constant state-feedback law (92). •
CHAPTER 11

UNITY FEEDBACK SYSTEMS

Introduction
This chapter covers a number of the main techniques and results in MIMO linear
time-invariant feedback systems. There are three main reasons for this choice of sub-
ject: first, MIMO feedback systems are ubiquitous in modem industry (autopilots, con-
trol of auto and airplane engines, automated manufacturing systems, process control,
... ); second, the statement of and derivation of these main results constitute an excel-
lent demonstration of the power of the concepts and the techniques developed in the
previous chapters; third, a number of these results are basic to computer-aided design
procedures. In fact a good number of these concepts and techniques were invented to
understand and solve feedback problems.
For simplicity we restrict ourselves to the unity feedback configuration: that is the
given dynamical system (the plant) is driven by the output of the compensator; the
plant output Y2 is fedback and compared to the input ul to obtain the error el = ul-Y2;
hence the tenn unity feedback. For more complicated configurations see [Des.6, Net.1,
Vid.l, Des.7, Gun.l].
In section 1 we will develop the state-space representation of the MIMO unity-
feedback system (11.1.14, and 11.1.15), calculate its characteristic polynomial
(11.1.26), and establish that, in the absence of unstable hidden modes, the expo stability
of the system can be guaranteed by testing for stability four transfer functions
(11.1.40). Important special cases are treated in detail; note, in particular, the case
where R 2 is expo stable (11.1.43) and the Q-parametrization of all I/O maps HY1U1
(11.1.48).
The main result of Section 2 is the Nyquist theorem for MIMO feedback systems
(11.1.25). Its importance lies in that 1) it relates the characteristic polynomial to the
return difference, 2) it is the basis of many arguments used in robustness theory and of
many computer-aided design algorithms. Important aspects of the Nyquist theorem are
discussed in subsection 2.2.
Robustness is the subject of Section 3, the main properties of a well-designed
MIMO feedback system are derived. Provided the loop-gain PC is large (in all direc-
tions) we show that the feedback systems I/O map is relatively immune to changes in
the plant (11.3.9) and to exogeneous disturbances (11.3.21). We establish the key
features of set-point regulators and demonstrate their robustness (11.3.27). The final
theorem (11.3.44) establishes the trade-off between the achievable bandwidth of the
feedback system and the plant uncertainty.
Section 4 uses the Nyquist theorem to obtain a simple proof of the celebrated
theorem of Kharitonoy (11.4.9), it uses a well-known characterization of Hurwitz poly-
nomials (11.4.6).

F. M. Callier et al., Linear System Theory


© Springer-Verlag New York, Inc. 1991
357

From an engineering point of view, structured perturbations are the most realistic
ones; in Section 5 we obtain necessary and sufficient conditions for stability of a class
of structurally perturbed systems (11.5.23) and develop, for a speeial class, a procedure
for testing the stability of each member of the class; this procedure is well suited for
computer-aided design (11.5.31).
In Section 6, we derive necessary and sufficient conditions under which a unity
feedback system remains stable under arbitrary additive plant perturbations (11.6.6),
the additive perturbations are only required to be proper, i.e. unstable proper additive
perturbations of the plant are covered by the theorem.
It is possible to extend the results of Sees. 1,2,3 and 6 to more general classes of
linear time-invariant systems, e.g. systems involving delays. This requires the use of
advanced algebraic techniques (8-algebra, see, e.g. [Cal.3-6], [Vid.l], [Che.2]). It is
important to note that the theorems remain valid in this more general context.
In Section 7, we define the concept of zeros of transmission, we characterize
them and we illustrate their importance in feedback design.

11.1. The Feedback System r.c

11.1.1. State Space Analysis


Assumptions and preliminary considerations. We consider the system Le formed by
the interconnection of two given systems Ll and L2 according to the figure 11.1. For
i=I,2, the system I:i is represented by its linear time-invariant representation

with state Xi E ]Rn,. Note that its transfer function is


2

is a proper rational matrix. We do not assume that either

Fig. 11.1. The feedback system Le'


representation R i is minimal, nor do we require them to be expo stable.
The system L, called the closed-loop system, is driven by the initial conditions
x1(0),x2(0) and by two exogeneous inputs U1(-) and u20 as shown on Fig. 11.1. Note
that any additive disturbances that may be modeled as added to the subsystem outputs
Yl and Y2 can be incorporated in the exogenous inputs u1 and u2'
358

3
u:= [ul,uI r and for its output, y:= r.
We choose for the state of L,the vector x:= [xl,xI r; for its input,

As justified by the exercises (5) and (6) below, we impose a "well-posedness"


condition

From an engineering point of view, if assumption (4) does not hold, there is a funda-
mental defect in the modeling of L.

5 Exercise. Show that

det[I+D)D:zJ =det[l+ D2Dd =det[l+0)(00)02(00)J .

6 Exercise. Let resp.), be the closed-loop transfer function of L from


u1 tO A
Uj
A to Yj. resp.), using the summing node equations
A

(e) = u)-Y2' = u2+Y). show that

Show!hat if d:t[I+D 1D 2] =0, then


i) :tIUl a pole at s=oo;
ii) f:\,IUl • f:\,2Ul • and f:\,2u2 have a pole at s = 00 .
[Hint: 02(S) is bounded as I s I -+ 00; verify that 92' 0IU+G20d- 1 = 1- [I+G 20d-1
and study its behavior as I s I -+ 00 to conclude that f:\,2u l has a pole at s = 00 • • • • J

In the analysis of Lc it is important to keep in mind the following facts:


o
7 R j is expo stable ¢:;> a(Aj)=Z[det(sI-Aj)] C (C_.

8 P [OJ] C a(Aj). Furthermore. if Ak e a(Aj) is not a pole of OJ, then Ak is the


eigenvalue associated with an uncontrollable andlor unobservable mode of Lj.
equivalently. 1:; has a hidden mode at Ak'

9 If R j is minimal, then P [Gjl = a(Aj) .

10 Exercise. Show by example that the converse of fact (9) is not true.
359

System Description of :Ec


The subsystems Ll and L2 are specified by R 1 and R 2; the interconnection is
specified by the summing-node equations (see Fig. 11.1):

The result is given by the following theorem.

13 Theorem (State-space representation of L). Consider the feedback


configuration Lc given in Fig. 11.1; assume that det[I+D 1D7J O. If we choose the
representation R c for L, (see (3) above), we obtain

14

- [c] [::l + [D] [::l


where

15a

-BI(I+D2DlrID2]
ISh
B2(I+D 1D 2 rl

-Dl(I+D2Dl)-IC2]
15c (I+D 2D 1)-IC 2

15d
360

r; r;
Notation: it will be convenient to define

15e x:= Gr.xl y= &r.y!

r; r.
and

15f e:= [er.e! u:= [u!.ul

for the state. the output, the error and the exogeneous input of Lc .

16 Exercise. Let M and N be matrices of suitable dimensions with elements in €.


Show that

det 7] = det[I+MN] = det[I+NM]

MN(I+MNrl = I-(I+MN)-l.

Proof of Theorem (13). Consider the closed-loop system 1:c shown in Fig. 11.1. the
subsystems 1:1 and 1:2 are described by: for i=1.2

17a

and there are two summing-node equations


17b

Eliminating el and e2 in the read-out maps of 1:1 and 1:2' we obtain

Using this expression. the notation defined in (15e) and (15f). and exercise (16) in Eq.
(18) we immediately obtain
361

20 y=Cx+Du,

where C and D are defined by (15c) and (15d) of Theorem (13); note that C and D are
simply the RHS of (19) multiplied on the right by and diag[D I,D2],
respectively.
Now use (20) and (17b) in the differential equations in (l7a) to obtain, after some
calculations,

218 xI = [AI-BI(I+D2DI)-ID2CI ] Xl -

These two equations verify (15a) and (l5b).



22 Exercise. Assume DI =D 2=0. Use Fig. 11.1 and the Eqs. (17a) and (l7b) to
obtain by inspection the following equations:

238

23b = [::] .
25 Theorem (Characteristic polynomial of L) [Hsu.1]. Consider the feedback
configuration L of Fig. 11.1 where det [1+D 1D2] ":F- 0, and the R {s are specified by
(1) and (2). The characteristic polynomial of Rc= [A,B,C,D], namely,
XA(S) =det(sl-A), with A defined in (14) and (15a), is given by
det[I-tG I(s)G 2(s)]
26 det(sl-A) =det(sl-A I)' det(sl-A2) • A A

det[l+G 1(00)02(00)]

27 Remark. Equation (26) is very important because it relates the characteristic


polynomial of Le with the return difference of Le, namely, det[I-tG I(s)G 2(s)].

28 Comment. In the SISO case, we know that the "loop gain" gl(S)g2(S) and the
return difference 1 + g I(S)g2(s) play a crucial role in the stability analysis of Le'
362

we expect, in the MIMO case, that we should consider


I+ 1+ G 2G 1 ,(wpich are usually different in the MIMO case), or at least
det[I+G 1G 2]=det[I+G2Gt]. Equation (26) states precisely what that role is in the
stability analysis.

29 Corollary. (Necessary condition for stability). If R c is expo stable, then

equivalently, if det[I-tG 1(s)G 2(s)] has one or more zeros in <T+, then R c is not expo
stable.

Proof. XA(s) =det(sl-A), the LHS of (26), is a In the RHS,


det(s!-A 1) and det(sI-A 2) are also polynomials, hence if det[I+G 1(S)G2(s)] has one or
more 4I:+-zeros, then det(sI-A), the characteristic polynomial of L. has one or more
(C +-zeros; hence R c is not expo stable. •

30 Remark. To gain insight, rewrite Eq. (26) as follows: call Aci the closed-loop
eigenvalues, (i.e. the eigenvalues of A), and Aki the eigenvalues of R k; then

31

<\
Equation (31) shows that any zero of det[I + (S)02(S)] is a zero of
XA(S) =det(sI-A) = n(s-Aci) (as we have already seen in Corollary (29». It also
shows that some closed-loop eigenvalues, say AC I' may not be a zero of
det[I+G 1(s)G 2(s)]; indeed the factor (S-Ac1) may be canceled by one of the denomina-
tor polynomials. This exhibits the fallacy that lies in trying to detennine the stability
of Lcby examining only the rational function det[I+G 1(S)G 2(s)]. From Eq. (31). we
obtain the following corollary.

32 Corollary (R 1 and R 2 expo stable). Let R 1 and R 2 be expo stable; then R c is


expo stable if and only if


33 Z [det[I+G 1(s)02(S)]] c

Proof of Eq. (26). To simplify notations we use the x. y. e and u (defined in (l5e)
and (15£). Let
363

with Bo, Co and Do defined similarly. Finally, introduce the (square) symplectic
matrix

Equation (17), which describes L, can be rewritten as

178 y= Cox + Doe

17b e=u-Jy.

Note that A o, Bo, Co and Do are block diagonal matrices. From the last two equations
we obtain

(I+Do1)y = Cox + Dou


and

Note that we have obtained (I+Do1r 1 in Eq. (19) above.


Since we want to compute the characteristic polynomial of Le' namely,
det(sI-A), we set u = 0, so e = -Jy. So the differential Eq. (17a) becomes

x= [A o-Bo1(I+Do1)-ICo] x.
Hence
sI-A = sI-Ao+ Bo1(I+Do1r 1Co

= [I+Bo1(I+Do1r 1Co(sI-A or 1] . (sI-Ao)·

Now use det(PQ) = det P det Q and det(I+MN) = det (I+NM) to write

det(sI-A) =det(sI-A o) det[I+(I+Do1)-IC o(sI-A o)-IBo11

=det(sI-A o)' [det(I+DoJ)]-1 det[I+(Co(sI-Aor-lBo+Do)J].

Referring to the definitions of Ao, Bo, Co and Do, the last determinant is easily seen to
be

Hence, finally
364

det(sI-A) = det(sI-A I)' det(sI-A 2) . det[I + G I / det[l+D ID 2] .

This last equation is easily seen to be equivalent to (26).



11.1.2. Special Case: Rl and R2 Have No Unstable Hidden Modes
For R I and R 2 to have no unstable hidden modes means that the representations
R I and R 2 have all their unstable eigenvalues associated with controllable and observ-
able modes. An equivalent way of stating the condition is: R I and R 2 are stabilizable
and detectable (8.7.68). Under these conditions, if the exogeneous inputs ulO and
u20 are zero and if YI (t) and Y2(t) e, then the states of R I and of R 2, namely, XI (t)
and x2(t), both tend to zero; as a consequence, we can establish the stability of the
feedback system R c by observing its inputs, ul and u2, and its outputs YI and Y2- In
particular, we can establish the expo stability of Rc by transfer function techniques.
(As an exercise, show that if R I has an unstable hidden mode, then there is no R 2 that
will give a feedback system R c that is expo stable.)

36 Assumption. R I and R 2 have no unstable hidden modes, equivalently, they are


both stabilizable and detectable.
Given assumption (36), let us use transfer function techniques: using the notations
of (l5e-f) and (35), we note that e=u-Jy; hence

e(s) = il,u(s)ii (s) = ii (s) - J I\.u(s)ii (s) .


Hence

=
A A

37 I\u(s) 1- J f\.u(s)

and since ]T] =I

that we say that the transfer function H(s) E lR(sr xn is expo stable if and
A

only if a) H(s) is proper and. b) H(s) has no poles in CJ:+.


Equations (37) and (38) establish immediately the following fact.

39 Fact. For the configuration Lc shown in Fig. 11.1,

Ii:u(s) is expo stable I\.u(s) is expo stable.

Explicit expressions for li,u have been obtained in Exercise (6). It is important to keep
in mind that Fact (39) holds for the unity-feedback configuration L; for more compli-
cated configurations, equivalence (39) need not be true! ([Des.6]; see also Fig. 11.5 in
Section 6 below.)
365

As a summary of the study of stability so far, we state the following extremely


important theorem.

40 Theorem. (Stability of R c: state space and transfer function point of view).


Consider the feedback configuration L shown in Fig. 11.1 and defined in terms
of Rl and R2 by (I), (2) and (4). If Rl and R2 have no unstable hidden modes.
(Assumption (36», then the following statements are equivalent
o
i) the representation R c is expo stable (equivalently Z[det(sI-A)] c (L, see Eq. (21)
and (26));
ii) the transfer function I,\u(s) is expo stable;
iii) the transfer function I-\-u(s) is expo stable.

41 Remarks. <X) It is important to note the algebraic fact:


Rc expo stable => I\u(s)=C(sI-A)-lB+D is expo stable.

P) Assumption (36) allows us to infer the stability of R c from that of the closed-loop
transfer functions either I\u or ltu. Obviously the presence of unstable hidden modes
in either R 1 and/or in R 2 would destroy that inference. Theorem 40 has a very useful
special case:

43 Theorem. (Exp. stability of R c when G2(S) is expo stable).


Let R 1 and R2 be described by (1). (2) and (4). Assume that there are no
unstable hidden modes in R 1 and in R 2' Let 02(S) be expo stable, then

44 R c is expo stable

45 Remark. In case O2 is expo stable, (which is the case in most applications), we


can establish the expo stability of ltu,: hence !hat of L itself, by testing that of I\u,
only. It is important to note that if G 1 and G2 are not known to be expo stable, one
must test all four submatrices of ltu [Des.4].

Proof. (=» By remark (41), R c expo stable => I-\-u is expo stable => f{,u is expo
stable => f{,lU, is expo stable.

( ¢ ) a) Since O2 is expo stable and is expo stable, so is


366

A A

b) Since O2 and He2u , are both expo stable, so is


A A A A A -1 A A A -1 A A

He2u , '02=01(1+0201) '0 2=(1+° 1° 2) ° 1° 2

hence
'"
(I +
" " '"
=
1
is Ii"u
expo stable. From this we obtain:
(I + = -I-\"u2 is expo stable. Thus we have established that all four sub-
A

of I-\,u are expo stable once He2u , is expo stable. Hence by Theorem (40)
R c is expo stable. •

46 Design Implications of Theorem (43).


Suppose is a given "plant" to be controlled, more precisely, we want to choose
a 4, such that the feedback system L of Fig. 11.1 has an I/O transfer
function H y2U , with desirable properties. We assume that Assumption (36) holds.
A A

47 Assumption. is given, expo stable and strictly proper (i.e.


Now consider any proper transfer function for which L is expo stable; then
!\2Ulis also expo stable.Furthermore, since we consider we set 112 = 8; hence we !\2U,
have and consequently

I-\"Ul by Q.
A A

where, following tradition [Zam.1] we label Note that, by (6),

49
A A

He2u, = Q= 0 1(I +
A

°°
A

2
A

1)
-1
.

A A A

50 Exercise. Solve Eq. (49) for 0 1 in terms of O2 and Q and get


A A A A_I

51 0 1=QJ-02Q .

Now we reach the useful conclusion.

52 Theorem (Design for G 2 expo stable and strictly proper). Consider the
configuration L defined by (1) and (2). Assume that both R I and R 2 have no
unstable hidden modes. Suppose that O2 is expo s!able and strictly proper. Under
these conditions, a) for all proper transfer functions 01> which result in an expo stable
R C' the I/O map is given by

for some expo stable Q;


b) conversely, for any expo stable transfer function Q, the "controller" 01 given by
367

(51) is proper. achieves the I/O map (48) and the resulting feedback configuration Rc
is expo stable.

53 Comments. a) Theorem (52) can be summarized as "for any given expo stable
and strictly proper plant O2 • the set of all I/O maps achievable by an expo stable R c is
parametrized by the expo stable Q's (see Eq. (48». The required controllers 0 1 are
given by (51) and are proper." The use of (48) and (51) in design is usually referred
to as the Q-parametrization method [Zam.1.Des.5.Mor.1].

Note that the 01 given by (51) is proper; there is no guarantee. however. that it
will be expo stable; of course :E, is expo stable.

'I) Consider Eq. (48): 62 is given. Q is free subject to being expo stable. Hence any
O H
tr+-zero of 2 will be a tr+-zero of y2u (because, roughly speaking. Q may not
Gv.
AI
have tr+-poles to cancel the tr+-zeros of Thus. some dynamical characteristics
of the "plant" 62 necessarily impose limitations on the achievable JlO maps of :E,.
A A

8) Since Q is proper, the JlO map Hy2uI has a behavior as Is I -+ 00, which is con-
strained by that of 62, (see (48».

Proof of Theorem (52).


a) By assumption, R c is expo stable, hence. by Theorem (40), the transfer function
HC2UI = Qis expo stable. •

b) By assumption, Qis expo stable, hence proper; consequently, letting Is I -+ 00 in


(51) and recalling that 62 is strictly proper we see that 61(00)=Q:00); i.e. 61 is
proper. Now since both 62 and are expo stable, Rc is expo stable by theorem
(43). •

11.1.3. The Discrete-Time Case


Stp'pose now that the systems i are linear time-invariant but discrete-time,
Rdi= LAdi,Bdi,Cdi,Ddi]. i = 1,2. We use here a subscript "d" to label the discrete-time
representation matrices. Let us write the Laplace-transform equations for R t and the
z-transform equations for R dt (we use hat and tilde to distinguish the Laplace
transforms and Z-transforms):

x=Alx+Blu sx(s)=Atx(s)+Blu(s)+x(D-)
55 { {
y=Ctx+Dtu yes) = Ctx(s)+D 1ii (s)
368

x(k+ 1) = Adl x(k)+B dl u(k)


56 {
y(k+ 1) = Cdl x(k)+Bdl u(k)

The Eqs. (55) and (56) are identical in fonn except for the z factor multiplying
x(O) in (56).
Consequently, the results of Theorems (13), (25), (40), (43) and (52) apply
o
without change except that the open-left half-s-plane CL is replaced by the open-unit
disc D(O,I).
For a very detailed and elementary treatment of the discrete-time case see Ogata's
book. and the references therein. [Oga.l]

11.2. Nyquist Criterion


The Nyquist criterion is an extremely useful stability test as well as an important
design tool. For the MIMO case, the criterion is far from obvious, therefore it
deserves a careful discussion and proof.

11.2.1. The Nyquist Criterion


We consider again the MIMO feedback system Lc shown in Fig. 11.1 where, for
k = 1,2, Rk=[Ak,Bk,Ck.Dkl is the linear time-invariant representation of the subsystem
Lk' We do not assume that either Rk is minimal nor do we assume that either Rk is
stable. Their transfer functions are given by

Gk(s)= Ck(sI-AkrIBk+Dk

Note that this power series converges absolutely for I s I > p(A k), the spectral radius of
Ak. We assume that L is well posed. Le. that

3 det[I+G 1(oo)G2(oo)]=det[I+D 1D2] '# O.

For the closed-loop system L we choose the representation Regiven by


(11.1.14) and (11.1.15); its state is (xl,x2). its input is (UI>U2) and its output (Yl.Y2)'
Recall that €+ denotes the closed right-half of the complex plane (i.e. the imaginary
axis is included). Let

4 PO+:;:;; number of €+-zeros of the polynomial det(sI-A 1)' det(sI-A2)'


counting multiplicities, (Le. if it has a triple zero at
sl E €+. then the zero sl contributes 3 to the count).

In the complex s-plane. (s = o+jro). consider the curve D defined as follows: let R be
369

a sufficiently large positive number so that 'f/ I s I det[l+<J 1(S)G2(s)]"* 0,


by (2) and (3) this is always possible; D consists of the vertical segment [-jRjR] of
the jco-axis and of the right halfpcircte of radius R centered on (0,0); in case the
rational function f(s):= det[I+G 1(s)G2(s») has one or more poles on the segment
[-jRjR), D is indented to the LEFf by an arbitrarily small half-circle centered at the
pole. Following tradition, we orient D clockwise (see Fig. 11.2): D is a simple,
closed, oriented curve.
Call f(D) the closed oriented curve obtained by mapping D by the rational func-
tion f; f(D) is obtained by plotting f(s) as s traverses the curve D clockwise. Note
that f(D) is oriented; the resulting oriented curve is usually called the Nyquist plot of
f(s). Note also that whereas D is a simple curve (i.e. it does not intersect itself), f(D)
is not necessarily a simple curve. We now state the criterion.

5 Theorem (Nyquist criterion). Consider the linear time-invariant feedback system


:I:c defined above in terms of its representation R c' (11.1.14). Using the notations just
specified, if (3) holds, we have

6 R c is expo stable

the Nyquist plot f(D) encircles the orgin Po+ times


{
7 <::> counterclockwise, and does not go through the origin,

the net change in phase of f(s) as s traverses D


is gi,Yen by
8 l\( arg det[I+G 1(s)G2(s») ) = 21t . Po+


and arg f(s) is well defined for all SED

'"
.-pl.n. "

• 11
D'
..

Fig. 11.2 The curve D and its image f(D).


370

9 Exercise. Show that

det[I-tO t (s)G 2(s)] =det[I+D t D2 + D t yB 2/s + C t B t D2/s+ 0(1/s 2)]

11.2.2. Remarks on the Nyquist Criterion

1. If R 1 and R 2 are expo stable, (equivalently, P<H- = 0), then, by Theorem (5), we
need only to check that f(D) does not encircle the origin. This special case is very
important; because (i) the values of <\ (jro) and for ro e 1R+ may be measured
easily and accurately (by a sinusoidal steady-state experiment) and hence modeling
errors avoided; (ii) more generally, the theorem sti11 applies to systems where <\(s)
and/or represent expo stable distributed systems (e.g. delays, distributed dynamic
elements, etc ... ): in that case <\ (s) and/or are transcendental functions, but they
are analytic in tr +.

2. In view of Eq. (2), as I s I 00 ,

Hence for R sufficiently large, the image of the right half-circle of D is an arc of a
small circle centered on (doo'O) and of radius O(1IR), (see (9) above). Since d oo i= 0,
(see (3) and (10», the number of encirclements of the origin is determined by the
image under f of the jro-axis segment [-jR,+jRJ, (duly indented if required), where R is
taken sufficiently large.

3. Left indentations. In the SISO case, if jrol is a pole of l+gl(S)g2(S), then jrol is a
zero of (l+gl(S)g2(S»-I; in particular, (l+gt(S)g2(S»-1 cannot have a pole at jrol'
Thus in the SISO case, (assuming no tr+ pole-zero cancellation in the product
g 1(s)g 2(S», if So e tr+ is a pole of 1+g 1(s)g2(S), then the closed-loop transfer func-
tions (l+gl(S)g2(S»-I, gl(S)(l+gt(S)g2(S)r l , g2(S)(l+gl(S)g2(S»-1 and
gl(S)g2(S)(1+g 1(s)g2(S»-1 do not have a pole at so' i.e. these four transfer functions
are analytic at so. Consequently, So need not be encircled by the contour D, (in order
to guarantee the analyticity of the closed-loop transfer functions at so), and D may be
indented either to the left or to the right.
In the MlMO case, if So is a pole of f(s)=det[I-tG 1(s)G 2(s)], it may be a pole of
[I-t0 1(s)G 2(SW I . Consider the following example
371

s+1
A A [ ..
s+12 S2 s+2
-s(s+2)
--
1+0 1(s)02(s) = 0 s+1 , [It<'i.(,lG,('Jr' [ s
0
s s+1

Here f(s), [I+<3 I (s)02(S)] and [I+<3 I (s)02(S)r1 have a pole at s=O, a point on the jro-
axis. It is for this reason that in the MIMO case, jco-aKis poles of f(s) must be
included inside the contour D , i.e. the indentation must be taken on the left.
A A

4. The calculation of f(s)=det[I+G I (s)02(s)] is not onerous: indeed, if, in


some applications, Al and A2 are largeA the size of square matrix 01(s)02(s)
is small. The determinant f(s) = det[I+G I (s)02(s)] is easily computed by LV decompo-
sition (say, using the UNPACK program, [Don. I]).

5. The number of encirclements can be checked by plotting ro <t fUro) for ro


from 0 to infinity or by plotting f(D). Since I fUro) I can vary over several decades it
is often convenient to generate a polar plot, e.g.
(r(ro)= a + log I fUro) I , 9(ro) = <t fUro» where a is a suitable positive constant.
6. Suppose that LI and L2 are specified by the rational matrices 0 1(s) and 02(S).
How do we evaluate Po+, assuming that we use minimal OnAe way is to
consider the contributions to theA partial fmction expansions of 0 1(s) and 02(s) due to
poles in (C +. For example, if 0 1(s) has a pole PI E ACl\ of order 3, this pole contri-
butes three terms to the partial fraction expansion of 0l(s), namely,
RI Rz R3
--+---+---
s-PI (S-PI)2 (S-PI)3

the Rk's are in (Cnixn", where R3 is not the zero matrix. That pole PI' as a pole
of 01(S), has a McMillan degree (9.1.90) given by

RI Rz R3
rk [ R2 R3 0

R3 0 0

Note that in the Nyquist criterion (11.1.5), we are concerned with Po+, defined in (4),
the number of (C+-zeros of det(s!-A I ) det(sI-Az}. In the context of present dis-
cussion, Po+ is the sum of the McMillan degrees of all the 4r+-poies of O\(s) and that
of all the (C+-poles of 02(S).
7. Suppose we have the list of all jco-axis zeros of det(s!-A 1) and of det(s!-A 2); it is
easy to see that no harm is done if left-indentations are used for each such zero, i.e.
there is no need to check whether at each such point the matrix 01(S)02(S) has a pole.
372

11.2.3. Proof of Nyquist Criterion

Fig. 11.3. The complex plane and the contour D \.

Let us recall the argument principle as it applies to a rational function such as


f(s). We are given a simple closed oriented curve D \ as shown in Fig. 11.3
r:I (s-Zj)
13 f(s)= __I - - -
n (S-Pk)
k

where the z;'s and the Pk's are the zeros and the poles of f with Zj Pk, * V i,k.
Assume that f(s) has neither poles nor zeros on D l' Consider s traversing clockwise
the closed curve D \. Since, by (13),

14 argf(s)=L arg(s-zi) - L arg(s-Pk)


k

referring to Fig. 11.3, we see that if zl is outside the curve D l' the net change in
arg (s-z\) as s traverses D \ once is zero; on the other hand, if Zz is inside D l' then the
net change in arg (s-Zz) is 21t radians in the clockwise sense, that is -21t radians in the
counterclockwise sense. Similarly, poles outside D \ contribute nothing to the net
change of phase, but if P2 is inside D \ the net change in arg (s-P2) is 21t radians in the
counterclockwise sense. Hence by (14) the net change in phase in the counterclock-
wise sense is (p+-n+) 21t = 21t x (the number of poles inside D 1 - number of zeros
inside D 1)' counting multiplicities. Thus we have:

The Argument principle. Let f(s) be a proper rational function with f(s) finite and
non-zero for all SED \; as s traverses D \ clockwise, f(s) traverses feD \) and encircles
the origin p+-n+ times counterclockwise.

Proof of Theorem (5). The equivalence of (7) and (8) is immediate.


To establish (6) (7), refer to Eq. (11.1.26) which we rewrite as follows
373

Now XA' XA, and XA, are polynomials, whereas f(s) := det[I+O I (S)02(S)] is a proper
rational function, hence every pole of f(s) is completely canceled by some zeros of XA,
and/or of XA" i.e. if f(s) has a pole of order 5 at PI then the product XA, XA, has a zero
at PI of order at least 5.
With this in mind together with Eq. (11.1.26), we can write
o
16 R c is expo stable ¢:> Z[XA(s)] = Z[det(sI-A)] c (L

a) everyone of the PG+- ([+-zeros of XA, XA" is canceled


exactly by a pole of f(s), (more precisely, if PI is a fifth-order
17
([ +-zero of XA, XA, then f(s) has a fifth-·order pole at PI) ,
b) f(s) has no ([+-zeros.

a) f(s) has exactly PCH- ([ +-poles, counting multiplicities,


18 {
¢:> b) f(s) has no ([+ -zeros.

f(D) encircles the origin PCH- times counterclockwise


{
19 ¢:> and f(D) does not go through the origin .

The last step follows from the argument principle. I

20 Exercise. Give a detailed proof of the equivalences (l6} to (19).

21 Remark. From the proof and the use of the argument principle, it is clear that
the shape of D can be changed, the encirclements simply che:ck that the characteristic
polynominal of R C' namely, XA(s), has no zeros inside or on D. For design purposes,
we might want all the zeros of XA(S) to lie to the left of the vertical line Re(s)=-a,
for some given (;(>0, then we simply shift the left boundary of D to Re(s)=-a.
Similarly if we want the zeros of XA(s) to lie to the left of some simple curve C in
o
([ _, (the simple curve C is assumed to partition the ([ -plane in two parts; one to the
left of C and one to the right of C), we choose this curve C as the left boundary of
the closed curve D; as before D is closed by an arc of a circle of sufficiently large
radius R centered at the origin. Because of these possibilities, the Nyquist criterion is
a very flexible tool in the hands of imaginative engineers.
374

22 Exercise. Give a heuristic explanation for the following recipe for counting the
counterclockwise encirclements of the origin by the closed oriented curve f(D). By
assumption f(D) does not go through the origin. i) Draw a half line L from the origin
to infinity (in any direction); ii) choose L so that it intersects the oriented curve f(D)
at a number of points say PI> ...• Pn • furthermore L is chosen so that P I.P2 ..... Pn are
simple points of the curve f(D). i.e. points where f(D) does not cross itself; iii) for k ==
1..... n. if. at Pk • f(D) intersects L in a counterclockwise. (clockwise). fashion. the count
is 1. (-1. resp); iv) add all the counts for PI to Pn : this sum is the number of counter-
clockwise encirclements of the origin by f(D). (For a rigorous proof see Vid.2].)

11.2.4. The Discrete-Time Case


As pointed out in sec. 11.1.3, the algebraic facts pertinent to the discrete-time
case are almost identical with those of the continuous-time case discussed above. As
expected, the Nyquist criterion (5) applies; however. since exponential stability
requires that all the eigenvalues of Ad lie in the open unit disk D(O.I), the closed curve
D is the unit circle with. if required. left indentations at the poles of
f(z) = det[I+G1 (z)G 2(z)] on the unit circle.
Note again that one may choose to restrict the eigenvalues to for some
a< I or to any subset of D(O,I) suggested by the design requirements.

11.3. Robustness
The main purpose of feedback is to obtain a closed-loop system that is insensitive
to plant perturbations and that attenuates effectively external disturbances. These
insensitivity properties are rather obvious in the SISO case by simple Nyquist diagram
considerations, [Hor.l]. The same holds for the trade-off between uncertainty and
achievable performance. In the MIMO case, the correct formulation of the correspond-
ing properties took some time to be discovered but. fortunately for us. they are now
easily derived if we use some of the basic tools developed previously.
For simplicity. we consider the simple configuration S(P.C) shown in Fig. 11.4
below: S(P.C) is the same configuration as L of Fig. 11.1 except that we use now the
control terminology: P designates the "plant," i.e. the dynamical system to be con-
trolled and C designates the "controller." (For a study of more involved
configurations, see [Des.6,7], [NeLl], [Vid.I,Gun.I].)

91

Fig. 11.4. The control system S(P,C).


375

We assume throughout Section 3 that


P(s) E lRpo(S)n.xn; C(s) E 1Rv(st'XIlo

i.e. P is strictly proper (P(s) -+ 9 as I s I -+ 00);

2 P and C are specified by representations R p and R c


which are stabilizable and detectable;

(In other words, P and C have no unstable hidden modes.)

3 S(p,C) is expo stable.

11.3.1. Robustness with Respect to Plant Perturbations


. Since for S(P,C),
4 Hyzu , =PC(I+PC)-I =I-(I+PC)-I

as far as Hyzu , is concerned, S(P,C) is equivalent to an open-loop system consisting of


a "precompensator" := C(I+PC)-I followed by the plant P.
The open-loop system Cp followed by P has the same I/O map HY1U1 as S(P,C),
see (4). Consider now a change in the plant, say P becomes P=P+M>, for the
equivalent open-loop system, the change in the I/O map is (we assume that Cp is
unchanged)
M> Cp =M' C(I+PC)-l yzu,

(the superscript "0" is used to remind us that it refers to the equivalent open-loop sys-
tem). Now consider the I/O map variation due to the change from P to
is given by
m yzu, = (l+pC)-lpC-PC(I+PC)-1

= (I+pC)-1 [PC(I+PCHI+PC)PC](I+PC)-I

5 = (I+pcrlM> C(I+pC)-I.
Hence

i.e. the I/O map variation of the closed-loop system is the I/O map variation of the
equivalent open-loop system premultiplied by (1+ P C)-I.
376

Now for fixed w, using the nonn induced by the Euclidean vector nonn,

7 I1(I+PC(jW»-11 I= cr[(l+PC(jW»-I]
= 1 / Q[I+PC(jW)]

where a and Q denote the maximum and the minimum singular values of the matrix in
the brackets (A,7.81). Now since

8 :::>

we conclude with the following theorem.

9 Theorem. Let P and C satisfy assumptions (1 )-(3). Let P = be strictly


proper and such that S(P+M,C) is expo stable.
U .t.C., if, over the frequency band n, > > 1, then, over the frequency band n,
H y1U !' the I/O map of S(P,C), is much less sensitive to the plant perturbations than
the I/O map of the equivalent open-loop system. The exact relation is given by (6).

10 Comment. Where do the M come from? First from modeling, models are
simplified representations of the physical reality: a) in electrical circuits one neglects
some propagation effects (lumped circuits), some small inductances, small capacitances
and small resistances; b) in mechanical systems, one assumes that a bar is infinitely
rigid or that a beam does not bend, etc .... ; c) the mass and the moments of inertia of
a crane change when it picks up a load, same for a communication satellite when it
consumes fuel, or when it reorients its solar panels, etc.... Second, in mass produc-
tion, M comes from manufacturing deviations. Third, in operations, the equipment is
exposed to temperature changes that may significantly change their physical properties,
etc .....

11.3.2. Robustness with Respect to Exogenous Disturbances


The autopilot of an airplane must maintain its direction and orientation in spite of
the wind and/or air turbulence; similarly, a radio-telescope must maintain its orienta-
tion; an oven must maintain its temperature, etc .... In Fig. 11.4, we model these exo-
genous disturbances by theexogeneous input do; to be precise do represents the distur-
bances referred to the output. For brevity, we'll call do the disturbances. Let assump-
tions (1), (2), (3) hold. A simple calculation gives

18 Hy:At=(I+PC)-I.

Note that from (4) we have

This relation is very important for it shows the constraint imposed by the configuration
377

S(P,C):

In other words, if, over the frequency band n, the system S(P,C) achieves excellent
disturbance rejection then the I/O map H y1U, is close to I and there is a considerable
decoupling between, say the first component of U1 and the second, third, ... com-
ponents of Y2; and conversely.
From a design point of view, we see that if V coe n

21 .Q(PC(jco}»>1 then IIHy1<lo(jco)II«1 Vcoen.

23 Remark. Suppose that for some (bad) design, for some COo e n,
.Q(PC(jcoo» =: 0'110 < < 1; then, using the singular value decomposition of PC(j roo) ,
namely,

we see that if PC(jcoo) is perturbed to

PC(jcoo) - O'nollnovno* .

then, for this perturbed system, any disturbance do in the direction of Vno is not
attenuated by the feedback! In other words, at that frequency, for that perturbation,
the feedback is useless!

24 Exercise. Prove that statement.

11.3.3. Robust Regulation


In many practical applications S(P,C) is required to obey the asymptotic (set
point) regulator property: to be precise, for any initial conditions in S(P,C), for any
input of the fonn uol(t), where uoe R no is arbitrary, it is required that

26 y(t) -+ Uo as t -+ 00 •

Physically, this means that after some transients, (whose duration depends on the
spectrum of the closed-loop system), the output Y2(t) becomes a constant precisely
equal to the input. The theorem below shows that such regulators have remarkable
robustness properties.
378

27 Theorem. Consider the system S(P,C) where assumptions (1)-(2) hold. Assume

28 C(s)=Co(s)/s

29 det[P(O)Co(O)] 'F 0

30 S(P,C) is expo stable,

then, for all initial conditions, for all inputs of the form uol(t), the output yet) Uo as
t 00.

Furthermore, if Co and P are penurbed to Co and P subject to the only conditions that
S(P,Co(s)/s) is expo stable and (29) holds for the penurbed system, then the asymptotic
regulator propeny still holds.

31 Remarks. a) It can be shown that if condition (30) holds, if nj = no and if pes)


and sC(s)=Co(s) have no poles at s=O, then conditions (28) and (29) are necessary
for the asymptotic regulator propeny [Cal.1],
Since (29) implies that rk Co(O) is no, the minimum number of integrators required
is no, one per output to be regulated.
y) Condition (29) implies that nj no; indeed if nj < no, then rk Co(s) nj < no; hence
rk [P(O)Co(O)] nj < no. Hence P(O)Co(O) is singular, contradicting (29).

32 Proof. By linearity, the output Y2(t) consists of two terms, one due to the initial
conditions and one due to the input uol(t). The first term goes to zero in view of (30).
Use Laplace transforms to compute the second term

= [P(s)Co(s)/s]' [I+P(s)Co(s)/sr l . uols

The partial fraction expansion of this expression involves only poles in the open left-
half-plane - since Hy2u1 is expo stable - and one pole at s=O. The residue at s=O is
given by

lim P(s)Co(s)[sI+P(s)CO(sW1uo = uo,


5->0

where we used assumption (29). Hence, under the conditions stated above, we have
Y2(t) Uo as t 00.
The second statement is proved similarly.

33 Exercise. For the system described by Theorem (27), calculate Hcju1 ' Evaluate
it at s=O and derive from it the tracking propeny (26).
379

11.3.4. Bandwidth-Robustness Trade Off


In the ideal world of perfectly linear, perfectly noiseless and perfectly known sys-
tems, it can be shown that, given any stable plant pes) e RpO(s)n.,x,\ there is a compen-
sator such that the closed-loop I/O map HY2U1 has arbitrarily large bandwidth, [Zam.
I]. (By Eq. (11.1.48), this requires Q(s) very large for large lsi). In this section we
use a class of model uncertainty to demonstrate that the plant uncertainty imposes limi-
tations on the achievable bandwidth of the I/O map of S(P,C); in other words, the
plant uncertainty and the stability requirement impose limitations on the achievable
performance of the closed-loop system.
In general, the plant model is very accurate at low frequencies but it becomes
more and more inaccurate as the frequency increases. Indeed in the process of model-
ing one assumes some shafts and some beams to be rigid, one neglects some small
inductances, some small capacitances and some propagation effects. These neglected
phenomena have very little effect at low frequencies, however, generally, they contri-
bute resonances at higher frequencies, hopefully at frequencies much higher than the
desired bandwidth of the feedback system. As shown in Section 3.1 above, as long as
.a(PC(jro» > > I, these approximations will have little effect on the I/O map HY2U1 of
S(P,C). We will show below that such uncertainties associated with the plant impose
definite limitations on achievable performance.
Let us model the uncertainty, we are given the nominal plant

3S Po(s) e RpO(s)n.,xn" analytic on the jco-axis,


and having 'Yp (C+-poles (counted per McMillan degree)

and the class A(Po;w) of plants defined by

36 PeA(Po;w) ¢:> IIP(jro)-Po(jro)lh< Iw(jro) I 'ifroeR

where

37 both P and Po have 'Yp ¢+-poles and no jro-axis poles,

38 w(s)e Rp(s) is expo stable, and Iw(jro) I > 0, 'if roe R.

39 Comments. a) Iw(jro) I bounds the uncertainty of the plant transfer function at


the frequency roo For the reasons given above, I w(jro) I is small for small ro and
increases rapidly as I IPo(jro) I I becomes small.
Conditions (36-38) do not require that pes) and Po(s) have the same ¢+-poles:
(37) requires the same number of ¢+-poles.
y) Roughly speaking, we may think of Pe A(Po;w) as P belonging to a ball centered
on Po with radius specified by W. Such perturbations of Po are called unstructured
perturbations. (Structured perturbations will be studied in Section 5.)
380

40 Preliminary Calculations.
Consider any P and C satisfying (1):

41 I+PC = I+PoC+ (P-Po)C

= [1+(P-Po)C(I+PoC)-l]. (I+PoC).

As in (11.1.49), we set

42 Q := C(I+poCr 1

so

I+PC = [I+(P-Po)Q] . (I+PoC)

and

43 det(I+PC) = det[I+(P-Po)Q] . det(I+PoC) .

44 Theorem (Stability Robustness of A(Po;w». Let Po satisfy (35) and let the
compensator C be S.t. S(Po,C) is expo stable; then

45 C stabilizes all members of A(Po;w) (as specified by (36-38»


¢:>

46 II Q(j co) II Iw(jco) I 1 'd COE R.

This theorem was first stated by Doyle and Stein, [Doy.l], then proved in a more
general context in [Che.2]. The proof below uses a technique in [Vid.lj.

47 Comment. The equivalence (45)-(46) represents a trade-off between uncertainty


and achievable closed-loop bandwidth. The closed-loop I/O map is Hy,u, =PQ, see (4)
and (42). So if COc represents roughly the frequency at which IIP(jco)ll starts decreasing
at 20 or 40 db/decade, then in order to achieve a larger bandwidth we need
IIQ(jcoll> > 1 for co>coc. This will be impossible by (46) if Iw(jco) I is of the order of
1 or larger at those frequencies. In other words, the larger Iw(jco) I, (i.e. the larger the
uncertainty in the plant), the smaller the achievable closed-loop bandwidth.

Proof.
'*'. by assumption, condition (46) holds.
For simplicity assume that C(s) is analytic on the jco-axis.

Step 1. Since C E lRv(s)n,xn. stabilizes Po(s) E RpO(s)no><,1o, considering minimal realiza-


tions of Po(s) and C(s) and using obvious notations we obtain from (11.1.26)
381

48 det(sI-A) = det(sI-Ap>det(sI-Ac)det[I+Po(s)C(s)] .

Let 'Yp and 'Yc denote the number of (i:+-eigenvalues of Ap and Ac. resp. (counting
multiplicities). Since. by assumption. S(Po.C) is expo stable. (48) shows that
det[I+Po(s)C(s)] has 'Yp+'Yc (i:+-poles and no (i:+-zeros; hence the Nyquist plot of
co -+ det[l+PoCUco)] encircles the origin 'Yp+'Yc times counterclockwise (Theorem
11.2.5).

Step 2. Since '1/ P e A(Po;w). P has 'Yp (t +-poles. it remains to show that for all
such P·s. the Nyquist plot co -+ det[I+PCUco)] encircles the origin 'Yp+'Yc times counter-
clockwise.
Now by (36). (38), and assumption (46) we have

49 II(P-Po)Uco)II'IIQUco)1I < 1 '1/ co E R.

Hence, 'V 1:E [0,1], 'V COE R+, 'V PEA(Po;w)

50 det[IH(P-Po)Q(jro)] '#: O.

Call f(tjro,P) the LHS of (50): given the assumptions on P, (35-37), and those on C,
'1/ P e A(Po;w) the function

(t,ro) -+ f(tjco,P)

is continuous. Furthennore, 'V 1: E [0,1] and 'V P E A(Po;w) the function


ro -+ f(,t,jro,P) defines a Nyquist plot. In particular, when 1:=0, f(O,jco,P) = 1
'Vro, 'VP; i.e. the Nyquist plot reduces to one point. By (50), 'V 1:E [0,1],
'V PE A(Po;w) none of the Nyquist plots touches the origin. Consequently, f(l,jco,P)
does not touch and does not encircle the origin.
Now by (43), we have '1/ roe R. V PE A(Po;w)

51 det[I+PC] = f(ljro,P) det[l+PoC] .

Now, the Nyquist plot of f(I,jro,P) does not encircle the origin; also, by step I, the
Nyquist plot of det[I+PoC] encircles the origin 'Yp+'Yc times; hence, by (51), that of
det[I+PC] encircles the origin 'Yp+'Yc times. Finally by Theorem (11.2.5),
'1/ PE A(Po;w), S(P,C) is expo stable.
=>. We use contraposition. So we assume (45) and we negate (46); i.e. we assume
(45) and that for some roo

52 IIQUcoo)II' I wUcoo) I > 1 .

If roo=O, (coo = 00, resp.), by continuity of Q(.) and w(') we replace roo in (52) by
some roo>O arbitrarily close to 0 (arbitrarily close to 00, resp.). Hence w.l.o.g. we
382

have O<roo < 00.

Step 1. Calculate the singular value decomposition of Q(jroo):

53 QUroo)=ULV*=L OjUjvt,

hence U and V are constant unitary matrices, the only nonzero elements of L are on
its diagonal, and, as usual, 01 02 ,.... For later use, note that (52) and (53) imply
that

We are going to use (54) to construct a perturbed P specified by P=Po+L'1 such that
P E A(Po;w) and such that

In view of (43), (55) implies that, for that P, S(P,C) is not expo stable: indeed the
closed loop characteristic polynomial has a zero at s= jroo.

Step 2. The requirements on L'1.


Let OE [O,21t) be defined by o=arg[w(jroo)]. Call ul and VI the first column of U and
V, resp.: hence Ilutll=llvtll=l. Choose L'1(s) as follows:
w(s) T
56 L'1(s) = - I (j ) I v(s) u(s)
w roo 01
where

57 the vectors u(s) and Yes) are expo stable; Ilu(jro) II = 1


and IlvUro)lI=l, VroE R;

58

Inequality (54), condition (57) and assumption (38) imply that, V ro E R,


1IL'1(jroll < Iw(jro) I and w(s) is expo stable; hence P E A(Po;w). We claim that, for the
L'1(s) specified by (56-58), Eq. (55) holds; indeed use (53) to calculate the matrix in
(55), and obtain M(jroo)vl = (I-vlvt)Vl =9; hence M(jroo) is singular, i.e. Eq. (55) is
satisfied by the L'1(s) specified in (56).
Thus we have shown that if (52) holds, (equivalently (54) holds), any L'1(s) given
by (56) and satisfying (57) and (58) destabilizes the system and P := Po+L'1 E A(Po;w).

Step 3. Construction of L'1.


Write the ith component of Ul as Uli exp(j<Pi) where Uli is real and its sign is S.t.
<PiE (-1t,Ol. Define the vector u(s) by its ith component i=1,2, ...
383

59 (U(S»)-=UI' -
1 1

where > 0 is chosen so that = -<IIi' (To see that this is


always possible plot the zero IIj and the pole -1Ij in the complex plane and consider the
angles.)
Write the ith component of VI as Vii exp(j'lli) with Vii real and 'IIi S.t. 'IIi-a e [O.n).
Choose v(s) by its ith component. i=l.2 •...
s-bi
60 (v(S»i = Vii -b-
S+ j

where bj > 0 is chosen S.t. arg(jOlo-bj)-arg(jOlo+bj) = 'l'i-a .


Equations (59) and (60) and assumption (38) show that u(s) and v(s) satisfy
requirements (57) and (58). Hence by step 2, P := Po+t. destabilizes the system. This
is a contradiction with (45). Hence (46) is established. •

11.3.S. The Discrete-Time Case


All the techniques used in the previous subsections are frequency domain tech-
o
niques. Hence the transfer to the discrete-time case is straightforward: CI:_ becomes
D(O.l); for the set point regulator. the integrator 1/s is replaced by 1/(l-z). etc.

11.4. Kharitonov's Theorem


Consider the set A of monic real polynomials of degree n of the form

1 p(s.a)= 8o+als + a2s2+ ... + Bn_ISo- l + sn

where for some given ili.a; e R. the coefficients are constrained by


2 i=O.I •... ,n-l .

To each vector a := (ao.al' ...• 80-1) e R n corresponds a polynomial of the set A.


and conversely; thus. thinking in coefficient space. we think of A as a parallelipiped in
R n whose edges are parallel to the coordinate axes of R n,

3 We say that the polynomial p(s.a) is Hurwitz iff p(s.a) ¢ 0 V s e CI:+. (i.e.
V s in Res

4 We say that the set A is Hurwitz iff p(s.A) ¢ O. V s e CI: +. equivalently. iff
every polynomial in A is Hurwitz.
384

11.4.1. Hurwitz Polynomials


The following well-known elementary lemma is the basis of our proof.

6 Lemma. a) If the real monic polynomial p is Hurwitz, then all its coefficients
are positive and arg(p(jro)) is a strictly increasing function of roo
b) The real polynomial p of degree n is Hurwitz

"II ro E JR, arg(pUro)) is a well-defined continuous function of ro;


7
lim arg(pUro)) - arg (p(O)) = nlt/2.
w.... oo

Proof. a) Since pes) is monic and Hurwitz, it can be factored as

where the -(X/s are the real zeros of p. (hence (Xi> 0). and the (-13k ±jYk)'s are the
complex conjugate pairs of zeros of p. hence 13k> O. Equation (8) shows that all the
coefficients of p are positive. The fact that arg pUro) is strictly increasing is immediate
by calculation using (8). or geometrically obvious by drawing a diagram showing the
o
constellation of zeros in <r_.
b) =:> i) is immediate since there are no zeros on the jro-axis. ii) As ro increases from
o to 00, argUro'Hx j ) increases by and that of [U<o+l3k)2+Y(1 increases by It. Since
o
there are n zeros in <r _. the net increase is nlt/2.
¢ (Sketch of proof). Suppose that p had one real zero zl with zl > 0; then
argUo>-zl) would decrease by as ro increases from 0 to 00. Hence, ii) would be
violated since p has degree n, and the net increase in argument of the LHS of ii)
would be strictly less than nlt/2. •

In assumption (2) above. we required that "II i, O<l!j: in view of lemma (6)
and since each polynomial p(s,a), aE A, is monic, the requirement O<.!!,j entails no loss
of generality.

11.4.2. Kharitonov's Theorem

9 Theorem. 'The set A of monic polynomials (defined in (1) and (2» is Hurwitz if
and only if the following four polynomials are Hurwitz:

10 kll(S)=l!O+.l!ls+1lzs2+a3s3+ ...
385

14 Comment. Kharitonov discovered this amazing fact that the Hurwitzness of four
polynomials are necessary and sufficient to guarantee the Hurwitzness of the whole
class A. [Kha.l,2J. Step I of the proof and Fig. U.5 will give an intuitively clear rea-
son why these four polynomials are crucial to the stability of the set A of polynomials.
We will refer to the four polynomials as the K-polynomials. The proof below follows
[Min. I].

Proof.
Step 1. Claim: \;;/ ro E JR, p(jro,A) is a rectangle in € whose sides are parallel to
the coordinate axes.
Note that \;;/ aE A, \;;/ roe R,

15 p(joo,a) = no - a2oo2 + a4oo4 - a6oo6 +

Consider now a fixed 00 e then V a e A we have


16 (.aO-a2oo2+.a4oo4- ... ) :s; Re[p(jOO,a)] :s; (aO-.a2oo2+ a4oo4- ... )

With obvious definitions of gj(jOO) and hj(joo), (i=I,2), we rewrite (16) and (17) as:
\;;/ aeA and V ooe

18 gl (joo) S Re[p(jro,a)] :s; g2(jOO)

19 hl(jro) S 1m [p(jro,a)] :s; h2(joo).


J J

Hence, V 00 e p(joo,A) is the rectangle in «r, with vertices

Note that, \;;/ 00 e the sides of the rectangle p(joo,A) are parallel to the axes of
«r. By inspection, we note that the four polynomials defined in (10-13) are given by
386

21 i,j= 1,2.

For convenience, call the rectangles p(jO),A), defined by (16) and (17) "r/ 0) E Rt, the
K-rectangles. Note that they are bounded away from the origin.

Step 2. $ . By assumption, klI,k12,k21,k22 are Hurwitz.


For 0)=0, the K-rectangle reduces to the line segment [.ilo.'iiQ1 of the positive real
axis, because l!o and 'iiQ are positive, As 0) increases from 0 to 00, each of the vertices
of the K-rectangle moves around the origin, with its argument strictly increasing and
as 0) -+ 00 its argument tends to n1t/2; furthermore, its sides remain at all times paral-
lel to the coordinate axes. Now, "r/ a E A, the polynomial p(s,a) is, for each 0),
represented by a point in the K-rectangle; hence its argument increases by n1t/2 and,
consequently, it is Hurwitz by Lemma (6).

=> By assumption the whole class A is Hurwitz, hence, in particular, the four polyno-
mials kij are also Hurwitz. •

24 Corollary. Consider the set A defined above,


i) if n=3, the set A is Hurwitz if and only if k21 (s) is Hurwitz;
ii) if n=4, the set A is Hurwitz if and only if k21 (s) and k22(s) are Hurwitz;
iii) if n=5, the set A is Hurwitz if and only if k21 (s), k 22(s) and k 12(s) are Hurwitz;
iv) for n 6 the four polynomials must be tested.

Proof of 0=3. => is immediate since k21 is a polynomial in A.

Fig. 11.5. p(jco,A) is a rectangle with edges parallel to the axes; its
vertices are defined by the K-polynomials.
387

.;: For ro=O. the K-rectangle reduces to [.Ilo.liQ] with .Ilo>O. By assumption k21 is
Hurwitz. hence arg[k21(jro)] is stricdy increasing and increases by 3x/2. Also k21 (jco)
is the lower right-hand corner of the K-rectangle which. V roe has its sides
parallel to the coordinate axes. Finally. for ro large. not only k21 (jro) but the whole
K-rectangle enters and remains in the third quadrant; indeed. for n=3. since all polyno-
mials in A are monic. for ro large. any peA is such that

since k21 is Hurwitz and .Il2> O. for ro large. both the real part and the imaginary pan
of any such p(jco.a) are negative with Im[p(jro.a)] :: ro Re[p(jco.a)]; hence. V peA.
the net change in argument is 3x/2. Hence all polynomials in A are Hurwitz. •

The proof for n=4 is similar except that to guarantee that V peA the net
change in argument be 2x we need both k21 and k22 to be Hurwitz.

25 Exercise (Kharitonov theorem for complex polynomials). Develop the Karitonov


theorem for complex polynomials: A is now the set of monic complex polynomials of
degree n of the form

26 ...

where the (Xi's and are real and for given 1l.i.ai.fti.Pi

27 !l. i (Xi ai and ft i Pi V i.

(Hint: now a E c:
n and we lose the property that p(jro.a)* = p(-jro.a); hence. we must
consider the change of argument from co=-oo to ro=oo. Consider ro :::?. 0 and
ro 0 separately; each case gives you a set of four K-polynomials. The necessary
and sufficient conditions will involve eight K-polynomials. (See. e.g. [Min. I].) Use
these results and the transformation z = soia to find conditions so that the set A has its
zeros in the sector +e arg(s) S; - - e. where e is a positive angle <1t/2.)
28 Exercise. Let us now consider the set Aof real polynomials of degree n of the
form

where for some given .!I.j and 3i. in JR.

O<.Ili ai iii for i=O.l ..... n

and .Il i > O. V i. Show that the set A is Hurwitz if and only if four special
388

polynomials are Hurwitz.

29 The discrete-time case. The formulation of the Kharitonov type theorem for the
discrete-time case is quite involved (see [Man.1]).

11.5. Robust Stability Under Structured Perturbations


From an applications point of view. Kharitonov's theorem can be criticized
because it allows each of the n coefficients of p(s.a). defined in (11.4.1). to vary
independently within its prescribed bounds. In general. for physical systems. the
coefficients of the characteristic polynomial depend on some physical parameters
(dimension. mass. temperature. voltage .... ) and the same physical parameter may
appear in the expression of several coefficients. hence the coefficients do not vary
independently. So instead of a parallelepiped in JR n+! we have a differentiable mani-
fold in Rn+!. For example. such facts are amply illustrated by the many examples in
circuit and control theory textbooks.

1 Example. For our purposes. we consider a very simple example consisting of a


linear electric motor (modeled by its armature resistance R and inductance L. and its
torque constant K). This motor drives a shaft modeled by its inertia J and its friction
coefficient b; and to the shaft is applied an exogenous load: a torque 'to Thus if we
drive the armature by a voltage Vo we have

2 vo=Ri+Li+KO)

3 't=-Ki+Jro+bO).

Equations (2) and (3) define our plant with Vo as the control input and the velocity 0)
is the output. The torque 't is set to zero because we view it as a disturbance. Using
Laplace transforms and abusing notations viz. we write i(s) for the Laplace transform
of i(t) .... we obtain from (2) and (3)
n (s)
4 O)(s) = P(s)vo(s) = -p- vo(s)

where

6 = LJ s2 + (JR+Lb)s + Rb+K 2 .

Suppose we want to maintain 0) constant in spite of step disturbances. this suggests a


compensator of the form (see Sec. 3.3 of this chapter)
389

2
7 C(s)=k s +as+a
S(S+P)

where a and P are positive numbers chosen so that S(P,C) is expo stable. The charac-
teristic polynomial of S(P,C) is given by

8 pes) = d1A + npnc

where

ao=Kka

a1 =

10 a2=

Note that except for the terms and K2 in a1 and a2, resp., every coefficient of the
characteristic polynomial pes) is affine in each of the physical parameters (R, L, J, b
and K) and in the design parameters (k, a and 13).

11 Exercise. In a design study associated with the example above, we consider


only J and b as subjects to perturbation

I J J

where 1. J, n, b are known constants: thus we have two variable parameters ql =J,
q2 = b and the parameter vector q := (ql,q2) is restricted to belong to the rectangle
QcR2 with vertices (I,b), (l,b), (1.12), (J,b). Let us label the polynomial p of
(8)-( 10) as p(s,q). Show that p(s,Q) is a parallelogram in lr. (Hint: for a fixed s,
consider successively only J varying, then only b varying: you obtain two straight-line
segments that are, in general, not parallel to the axes nor orthogonal to each other.... )

11.5.1. General Robustness Theorem


We are now going to generalize the notion of stability and we are going to allow
the coefficients to be continuous functions of some parameter vector q E JRm • The
derivation follows [Ana. I].

13 Let U be a closed subset of lr which a) includes (C+ and b) is symmetric with


respect to the real axis. The boundary of the closed set U is denoted by au; here
o
au=U\u.
For example,
390

14 U= { sE <r I Re s 2:.. -a for some fixed a> 0 }

or
U= {S E (t I either Re s 2:.. -a for some fixed a> O,or
15
I arg s I "2n; + 9 for some fixed 9 E (0,1t/2) )

16 A polynomial is called U-stable iff pes) 0, '<;j S E U. Similarly, a transfer


function will be called U-stable iff it is proper and has no poles in U.

17 Assumption. The characteristic polynomial is of degree n and has real


coefficients:

where

i) for i=O,l, ... ,n, a;.: Q -) IR is continuous


19 {
ii) > 0, '<;j q E Q,

20 iii) Q is a compact arcwise connected subset of Rffi.

Conditions (17)-(20) imply the following fact.

21 Fact. a) '<;j q E Q, the zeros of p(s,q) are in some fixed compact disc contained
in, say, D(O,r);
b) '<;j s E (t, p(s,Q) is compact and arcwise connected, (as the continuous image of Q
compact and arcwise connected, [Die. 1, ChapJ]).

22 Exercise. Let Il:= and Ilj:= maxa;.(q), i=O,l, ... ,n-l, where the
minimum and the maxima are taken over all q E Q. Show that > and that r can be Il 0
taken to be any number larger than

max [1, Il/fl] .


1=1

n-l
(Hint: if s is a zero of p(s,q), I s I L I I . I s I k+l-n).
o
Let Qp denote the set of polynomials defined in (17-20); we say that the set Qp is
U-stable iff '<;j q E Q and '<;j s E u, p(s,q) '# O.

23 Robust Stability Theorem. Let conditions (17-20) hold.


U.l.c. the set of polynomials Qp is U-stable
i) for some qo E Q, p(s,qo) is U-stable,
ii) '<;j SE au, 0 p(s,Q).
391

24 Comments. a) This theorem is very general in that a) the concept of u-


stability is chosen by the engineer for the application that he has in mind and b) the
dependence of the coefficients on the parameters q is required only to be continuous.
Since the major test is to find out whether 0 e ap(s.Q) for some s e au. there are
possibilities for great computational efficiency if p(s.Q) has convenient properties.

Proof. =:>. i) is immediate as a special case.


ii) is obvious by contraposition.

¢. By assumption. p(s.qo) is U-stable. i.e. all its zeros are in the open set
Uc = tr\U. For a proof by contraposition. assume that. for some qe Q. p(s.q) is not
V-stable; hence p(s.q) has at least one zero in V. say. z.
By (20). there is an arc in Q
that joins qo to q. Consider a point q moving continuously from qo to q along this arc;
during this motion. by (18) and (19). the coefficients of p(s.q) vary continuously;
hence the zeros of p(s.q) move continuously in fr. (Indeed the zeros of a polynomial
are continuous functions of its coefficients; note (19) and [Die.!. Thm 9.17.4].) In
particular one zero of p(s.q) starts from a point in VC (when q = qo). moves along a
z
continuous curve C to end at in V (when q=q). If ze av.
then p(z.q)=O. which
z
contradicts ii). If is in the interior of V. then the curve C starts (for q =qo) at a
point in the open set UC and ends (for q=<}) at z in the interior of U; consequently.
the curve C intersects the boundary of V. au. at some Zj. which corresponds to some
q) e Q along the arc joining qo to q. Hence, p(z),ql)=O with zi e au and q) e Q,
which is a contradiction. I

11.5.2. Special Case: Affine Maps and Convexity


Let us start with an observation. Let QcRm be a convex polytope whose
extreme points (vertices) are el. e2.... ,er , [Roc. I]. In other words, the polytope Q
may be viewed as the convex hull of el,e2, ... ,er : that is.

25 Q=co { e\.e2.....er }
r r
= { q e Rm I q= L '-;ej. Aj 0 It i, L Aj = I }.
) )

Consider now an affine map from IRm to IRn, then

26 f: q -t b+Aq

where beRn and A E Rnxm.

27 Fact. If Q is a convex polytope and f is affine. then f(Q) is a convex polytope.

r
Proof. Consider an arbitrary q e Q. with q = L Aiei' (where L Ai = 1, Ai 0 It i).
) \
392

then

28 f(q) = b+Aq = b+A (.± Aiei ]


,=1

r r
= L Ie; (b+Ae;)= L A;f(e;).
i=1 i=1

Consequently,
r r
29 f(Q)= ( ZE lRn I z=L le;f(ej); L Aj= 1; Aj 0" Vi},
t i=1

i.e. f(Q) is the convex hull of the f(e)'s.



Let us go back to the characteristic polynomial p(s,q) given by (18). Assume that
Q is a convex polytope as in fact (27). Assume that each is affine in q. Then,
V s E (C, q p(s,q) is an affine map of Q into (C and consequently p(s,Q) is a
convex polygon in (C. Since there exists an efficient algorithm to compute the nearest
point (from the origin) in a polytope, [WoU], we have a cost-effective robustness
test.

31 Test. For a sufficient number of points s E av, calculate n(p(s,Q», the nearest
point from the origin to p(s,Q). If, for all such s, n(p(s,Q» > 0, then the set of polyno·
mials is V-stable, else it is not V-stable.

32 Exercise (Barmish). Consider the problem above where Qc R. m is a convex


polytope with vertices el ,e2, ... ,er ; the characteristic polynomial p(s,q) is such that,
V s, q p(s,q) is affine. Hence p(s,Q) is a convex polygon in (C. Assume that for,
some qoE Q, p(s,qo) is V-stable. Show that the family of polynomials p(',q), q E Q, is
V-stable.

33 V SE av,
where
= min arg(p(s,ej»
j

a(s)=max arg(p(s,ei» .
,
11.5.3. The Discrete Time Case
The methods used in Sections 5.1 and 5.2 above are based on the continuity of
the zeros of a polynomial, on a connectedness argument and on properties of affine
maps. Clearly these methods apply to the discrete-time case; note, however, that V is
now chosen in the z-plane, is symmetric with respect to the real axis and must include
393

the exterior of the open unit disk, D(O,I).

11.6. Stability Under Arbitrary Additive Plant Perturbations


Since this section considers transfer functions exclusively, to alleviate notations,
we suppress the symbol .. A ".

1. Suppose that we have designed aU-stable (11.5.16) control system S(P,C) (see
Fig. 11.4) which has the following properties: a) P and C are proper rational matrices;
b) det[I+PC](oo) :# 0; c) the representations of P and of C have no U-unstable hidden
modes; and d) the closed-loop transfer functions

are U-stable, equivalently, they are proper and have no poles in the closed set U. By
(11.1.39), Heu is U-stable ¢:> Hyu is U-stable.
Now in the process of modeling, certain dynamical aspects of the physical plant
are usually neglected for simplicity; furthermore in the course of the operations some
parameters of the plant may vary; we model these changes by replacing P(s) by
P(s)+.-lP(s). In certain applications some exogenous inputs are coupled to the unmo-
deled dynamics and not to the original model. To capture this aspect we introduce a
third input u3 in the new configuration S(P,M,C) shown in Fig. 11.6. The system
S(P,M,C) has three inputs UI,U2,U3 and three outputs YioY2,Y3; hence we consider
Hyu: (uI,u2,u3) (YI,Y2,Y3) and Heu: (uI,u2,u3) (e"e2,e3)'

2 We assume that
3 M(s) E Mat(Rp(s»

+.. 6P
b +
P +
Y2
u
2
+

Fig. 11.6. The perturbed control system S(P,6P,C).


394

that the representation of M has no U-unstable hidden modes and that


det[I+(P+M)C](oo) '# O.
Note that M may contain any number of U-unstable poles; note the contrast with
the specific assumptions made in (1l.3.35)-(1l.3.37) above.

4 Given that there are no U-unstable hidden modes in P, C and tJ>, we say that
S(P,tJ>,C) is V-stable iff the transfer functions Hyu and Heu are V-stable.

5 It is easy to see that Hyu is V-stable Heu is V-stable, but the converse is not
true: Hence to establish V-stability we need only show that Hyu is V-stable.
We are ready to state the necessary and sufficient conditions for V-stability of
S(P,M,C) [Bha.l].

6 Theorem (Stability of S(P,llP,C». We are given a U-stable system S(P,C) as


specified by (1); the perturbation llP satisfies (2); then

7 i) S(P,tJ>,C} is U-stable <=:> tJ>(I+QtJ»-1 is U-stable;

ii) if, in addition, llP is U -stable, then

8 S(P,llP,C) is U-stable <=:> det[I+QM](s) '# 0, V S E U.

9 Comments. 0.) Since S(P,C} is U-stable, Hyu is U-stable; hence, in particular,

10 Q := C(I+pC)-J, PQ=PC(I+pC}-I, QP=CP(I+CP)-l, P(I-QP)=p(I+Cpr l

are U-stable. We'll use this fact in the proof.


Since Q is U-stable, the last expression in (7) may be replaced by "S(Q,t1P) is U-
stable," (see theorem 11.1.43). This makes sense intuitively: cut the diagram in Fig.
11.6 at a and at b, and observe that the gain from a to b is Q; thus llP forms a new
feedback loop with -Q.
y) In some problems, llP=R/(s-p), where pE (J: may be in U and R is a complex
matrix, then

11 M(I+Qt1pr l = R[(s-p)I+QRr l .

Since the matrix expression in the brackets is analytic in V, (see (10) above), by (7)
we have
12 S(P,RJ(s-p),C) is V-stable det[(s-p)I+Q(s)R] '# 0, VSE U .

In case R is a dyad, say cbT , the second expression in (12) reduces to the scalar con-
dition, (using det(I+MN) = det(I+NM»
395

13 (s-p)+bTQ(s)c ¢ 0, \tSE U.

This equation is very useful: it shows how the closed-loop control (through Q(s»
affects the open-loop pole p.
8) Consider equivalence (8): the condition on AI> in (8) says in particular, that AI>
will not destabilize S(P,C) if, \t s e U, Q(s) is small whenever P(s) is large: more
precisely, if

IIQ(s) AI>(s)ll < 1 \t s e U .

Proof of Theorem (6).


Statement (7): <=. By assumption AI>(I+QAI»-l is U-stable. Let us use el,e2,e3 as
unknowns and write the summing node equations for S(P,AP,C), (see Fig. 11.6),

15

In (15)-(17), P,C and LlP are matrices of proper rational functions. Let us perform the
following elementary operations, first, (16) (16)-P'(15) and, second,
(17) (17) + Q . (16). If we write the result in matrix form we have

I -C
18
[ o
o
I+PC
o

Solve (18) by back substitution:

19 Y3 =LlPC) =.1P(I+QLlP)-I[U3+QUI-QPU21.

In view of (to), namely, Q and QP are U-stable, and of the assumption that
LlP(I+Q.1P)-1 is U-stable, Eq. (19) shows that the transfer function (ul,u2,u3) Y3 is
U stable. Next

Y1 = Ce1 = C(I+pC)-1[_Y3+U1-PU21,
hence

Thus (20) shows that the transfer function (u1,u2,u3) -+ Y1 is U-stable. Finally,
396

shows that because of (10) the transfer function (Ul,U2,U3) Y2 is V-stable. Hence we
have shown that Hyu is V-stable, hence S(P,IlP,C) is V-stable.

==-. Hyu is V-stable by assumption.


By (19), for ul =u2=8,

22 Y3=Hy,u,U3=L1P(I+QllPr1u3'

Hence we have L1P(I+QL1pr i is V-stable since Hy,u, is a suhmatrix of the V-stable


Statement (8): we prove that with L1P V-stable,

23 1lP(I+QL1pr l is V-stable ¢> det(I+QL1P)(s) "# 0, 'riSE V.

¢:. Follows immediately from Cramer's theorem and the fact that both Q and L1P are
V-stable.
==-. By the assumption in (23) and the V-stability of Q, we have
24 I-QL1P(I+QL1P)-1 = (I+QL1P)-1

is also V-stable. Since both Q and IlP are V-stable, the last expression (24) is V-
stable iff det(l+QL1P)(s) "# 0, 'rI s E V. So necessity is established. •

The Discrete-Time Case


Since the methods used to derive Theorem (6) are purely algebraic, the applica-
tion of the theorem to the discrete-time case is straightforward.

11.7. Transmission Zeros


In this section we develop the notion of transmISSIOn zero for linear time-
invariant system representations. We briefly discuss its importance in engineering
design.

11.7.1. Single-Input Single-Output Case


In the SISO case, the notion of transmission zero is straightforward: we are given
R = [A,b,c,d] with (A,b) controllable and (A,c) observable (i.e. R is minimal); we note
that the transfer function

Ii (s)= c(sI - A)-lb+ d= n(s)/d(s)

where by minimality the polynomials nO and dO are coprime. We say that z is a


397

transmission zero of Riff


2 n(z)=O,

equivalently, iff Ii (z)=O .

3 Exercise. Given the assumptions above and n(z)=O,


a) Show that

4
SI-A -b
n(s)=det [ -c -d
1,d(s)=det(sI-A) .
o
b) Let a(A) c (L and Re z 0; show that

p(t;O,xo,exp(zt» 0 as t 00.

c) Show that there is a unique Xo E ern such that


5 p(t; 0, xo,exp(zt» =0 'Vt

(Hint: use the time-domain description of R.)



6 Comments.
ex) Conclusions b) and c) of the exercise show that the transmission of any
(scalar) input proportional to exp(zt) is blocked by the system, hence the name
"transmission zero."
Note also that since n(-) and dO are coprime, if z is a zero of R (or of Ii (s»
then z is not a pole of Ii (s), and vice versa; this is not necessariily true in the MIMO
case.
'Y) The generalization of these facts to the MIMO case is not obvious: now there
are nj scalar inputs and no scalar outputs. Do we consider a fixed Uo or an arbitrary
Uo E er ni with a corresponding input Uo exp(zt) and require that the output
y(t) = p(t; O,xo,uoexp(zt» be identically zero or be restricted to some hyperplane in
([:1Io?

1l.7.2. Multi-Input Multi-Output Case: Assumptions and Definitions


Consider now the MIMO case with nj not necessarily equal to no; we are given a
linear time-invariant representation R = [A,B,C,D] where A,B,C,D E Mat( er); we
assume that

7 (A,B) is controllable and (A,C) is observable,


398

8 S(A):= [ ;\.I-A
-c
-B
-D
1 is ful/-normal rank

i.e. the polynoinal matrix S(A) is full rank except at a finite number of points in C: . •

Note that S(A) is an (n+no) x(n+ni) matrix. From (7) and (8), we conclude

9 if n + ni ::; n + no' then is full-column rank;

10 if n + ni n + no, then [C I D] is full-row-rank.

Exercise (3) above suggests the following.

11 Definitions [Transmission zeros]. Let R be given and satisfy (7) and (8):
a) if n + ni ::; n + no. we say that z is a transmission zero of Riff :3uo e ni
and a corresponding unique Xo en such that

12 yet) := pet; O,xo,uoexp(zt» = Sno Vt

b) if n + ni n + no and if z If. cr(A), we say that z is a transmission zero


of Riff :311 e no S.t. V UOE c: ni , :3xoE c: n S.t.

13 *
11 pet; O,xo,uo exp(zt» =0 Vt 0.

14 Comments. a) For n+ni ::; n+n o' (12) says that the input space c: ni
has a
special direction Uo such that the input uo exp(zt) and a uniquely chosen initial state xo
produce an output that is identical to zero. Thus, R blocks completely the transmission
of the input uoexp(zt).
For n + nj n + no, (13) says that the ouput space € n" has a special direction 11
such that for all inputs Uo exp(zt), (with Uo arbitrary), and a corresponding initial state
xo, the output yet) remains in the subspace orthogonal to 11. Thus, R blocks the
transmission of any input uoexp(zt) in the sense that the corresponding output is con-
strained to remain orthogonal to 11.
Thus, in the MIMO case, the notion of transmission zero consists of not only a
point z in the complex plane but also of either a direction in the input space or a direc-
tion in the output space. If, in addition, ni = no' then, for the special uo, the output yet)
is completely blocked out, and for any input uo exp(zt), the output y( t) is orthogonal to
11·
A We could have formulated the problem starting with the transfer function
H(s), considering any of its minimal realizations and require (8): then we would say
399

"transmission zeros of H(s)." Clearly the two approaches are equivalent.

11.7.3. Characterization of the Zeros

15 Theorem [Characterization of transmission zeros].


Let R satisfy assumptions (7) and (8). Under these conditions, z is a transmission
zero of R

16 rk [S(z) 1 < normal rk [S1 .


I

This theorem is the extension to the MIMO case of the definition (2) of the SISO
case. (See, in particular, Eq. (4) of Exercise (3).)

Proof. Case I n+nj 5; n+no


-==. Assume that rk S(z) < n + nj .
Equi.valently, we assume that the n + nj columns of S(z) are linearly dependent. Hence
there is a nonzero vector (xo,uo) E ([ n+nj such that

17 [ZI-A -B]
-C -D
[Xo]
Uo
[88 ].
Note that neither Uo nor Xo can be zero; this follows from (7) and (9). Furthermore, by
(7), to any such uo, there is a unique Xo that satisfies (17). For the Uo and Xo in (17),
consider the input Uo exp(zt) and the state trajectory x(t) = Xo exp(zt). Referring to the
differential equation x= Ax + Bu, we see that

18 set; O,xo,uoexp(zt» = xoexp(zt) .

Now, by (17) again, Vt 0,

19 pet; O,xo, Uo exp(zt» = C Xo exp(zt) + Duo exp(zt) = 8 .

Hence, by definition (12), z is a zero of transmission of R.

=:> Assume that z is a zero of transmission of R.



Consider the Uo and the Xo occurring in the definition of z (see (11) and (12». Let yet)
be the response of R due to the input Uo exp(zt) and starting from Xo at time D, then

20 (sI-A)x(s)-Buo/(s-z)=xo

21 y(s)= Cx (s)+ D uol (s-z)=8

where the last equation follows from (12).


400

From (20), (21), and assumption (7), xes) is uniquely defined and has only one
x x
pole at s=z. Call k the residue of (s) at z, so (s) = k/(s-z). But x(O) = Xo by (12),
hence k = xo. Substituting this result in (20) and (21) and letting s 4 z we obtain

(zI-A)xo+ B uo=8
{
22 C Xo + Duo=8'

i.e. the nonzero vector (xo,uo) is in the nullspace of S(z); hence, rk [S(z») < n + ni; i.e.
SeA) drops rank at s=z. •

Case Il n+ni <::.. n+no


¢. Assume that rk S (z) < n + no'
Equivalently, the n + no rows of S(z) are linearly dependent; hence there is a nonzero
vector E tt n+n., such that

23 I T]* ) [ = [8* I 8* ) .

Again, from (7) and (8), neither nor T] can be zero vectors.
Now since, by assumption, z If. a(A), zI-A is nonsingular; hence V UOE (Cn; there
is a unique Xo such that

Consequently, V t <::.. 0

set; o,xo,uo exp(zt» = Xo exp(zt)


and

25 * *
T] pet; o,xo,uo exp(zt» =T] (Cxo exp(zt) + Duo exp(zt» = °
where the last equality follows from (23) and (24). By definition (13), Eq. (25) estab-
lishes that z is a transmission zero of R. •

=:>. Assume that z is a transmission zero of R (i.e. that (13) holds).


So V Uo E (C\ with xo calculated by (24), Eq. (25) holds. Since, V t, exp(zt) "# 0,
from (25) and (24) we obtain successively

*
T] (Cxo+Duo)=O

and

26

Hence
401

where" ¢ e.
Now since (zI-A) is nonsingular and since the two matrices

28
ZI-A
S(z)= [ -c
-B
-D
1 and

are equivalent (i.e. either one can be derived from the other by elementary (block) row
and column operations), the two matrices in (28) have the same rank. Now (27)
implies that the rank of the second is smaller than n + no; hence

rkS(z) < n+no'



29 Remarks.
1. In Definition (11), for n+nj we specifically assumed that z ¢ a(A).
This assumption used twice in the proof of Theorem (15). In the MIMO case,
poles and zeros of H(s) can coincide. Consider

H(s)=diag
A [S-I
-, -S+IJ
s+1 s-1
;

it has a pole at s= 1 and at s=-I; it also has a zero at s= 1 and at s=-1.


A A

2. Let z be a zero of of H(s) and let H(s) be analytic at s = z. Then


if n+nj nfno' the columns of H(z) are linearly dependent; if n+nj then
the rows of H(z) are linearly dependent. For the latter case, see (27) above.

11.7.4. Application to Unity-Feedback Systems


Consider S(P,C), the unity-feedback system shown in Fig. 1104, where
pes) E IRpo(s)n.xni and C(s) E IRp(s)n ix n., and where P and C are specified by their
representations R p and R c that are stabilizable and detectable (i.e. S(P,C) has no
unstable hidden modes). Assume that nj
We assume that a) S(P,C) is expo stable and b) that PC(s) has full-normal rank
equal to no (i.e. det PC(s) ¢ 0 except for a finite number of points in cr, namely, the
poles and the transmission zeros of PC).

30 Theorem. Under these conditions, if P has no pole at z and has a transmission


zero at ZE cr+, then the I/O map from Ut to Y2, namely,

has also a transmission zero at Z.


402

32 Comments. a) In fact the theorem is still true if P has a pole at z. Unfor-


tunately, the proof is quite involved.
If P has a transmission zero at z E then, for all compensators C such that
S(P,C) is expo stable, the I/O map Hy2U ,(s) has a transmission zero at z, this is a fun-
damentallimitation on the achievable I/O map of S(P,C).
As a consequence of this zero z E +, for all initial conditions, for any input (to
S(P'C» ul (t) of the fonn

the corresponding output Y2(t) -t el10 as t -t 00. In particular, if ZEjIR, (i.e. z=O or
z=jroo for some nonzero roDE R), S(P'C) will not respond asymptotically to some
constant inputs (in case z = 0) or some sinusoidal inputs (in case, z = jwo), and this
will happen irrespective of the choice of the stabilizing C.

Proof. S(P,C} is expo stable and ZE (1:+ imply that C(I+PC}-I=Hy,u, is analytic at Z.
Hence, p. C(l+PC}-1 is the product of two matrices analytic at Z.
Since nj and P has a transmission zero at z, the no rows of P(z) are linearly
dependent; hence rkP(z) < no and det[PC(z)] =0. Now, by (31),

33 det H y2u ,(z) = det[PC(z)] / det[I+PC(z)] .

Now He,u, = (I+pcr l is analytic at z because S(P,C) is expo stable and z E <r +; con-
sequently,

34 det(I+PC(z» oF- O.

From (33) and (34) we conclude that detH y2u ,(z)=0, equivalently, H y2u , E IRp(S)I1o X I1o
has a transmission zero at Z. I

This completes our brief exposition of some properties of MIMO unity-feedback


systems. These seven sections illustrate how many of the concepts and techniques
developed in the first ten chapters are crucial to make sense out of the often subtle
MIMO results.
APPENDIX A

LINEAR MAPS AND MATRIX ANALYSIS

We describe here the main concepts and results useful in linear system theory.
We start with basic algebraic concepts, functions, rings, fields, linear spaces and linear
maps. Next we tackle the question of representation of linear maps, a crucial concept
in many applications. Normed linear spaces are next, with the key concept of conver-
gence. The final section covers adjoints. It leads naturally to the singular value
decomposition.

AI. Preliminary Notions

We assume that the reader is familiar with the notion of sets, intersections of sets
and unions of sets; and with the symbols e",c .¢,U ,andrl; and with the logic sym-
bols 'fI, 3, 3!,=>,<=, and <=>. (See, e.g. [Loo.1], [Rud.1].)

I Some sets are given standard labels: for example Z, N, R,R+, €, (J: + denote the
sets of integers, nonnegative integers, real numbers, nonnegative real numbers, com-
plex numbers and complex numbers with nonnegative real part, resp.

2 Given two sets X and Y, by their cartesian product we mean the set of all ordered
pairs (x,y) where x e X and ye Y. The cartesian product of X and Y is denoted by
X x Y. Consequently the set of all ordered n-tuples of real (complex) numbers is
denoted by R n ( n).

3 Next the notion of function. Given two sets X and Y, by f: X -4 Y, we mean that
to every xe X, the function f assigns one and only one element f(x)e Y called value
of fat x. X and Y are resp. called the domain and codomain of f and we say that f
maps X into Y. f(X):= {f(x) I xe X} is called the range of f. The words "map,"
"operator," and "transformation" have the same meaning as function; a more complete
specification for a function is f: X -+ Y : x -+ f(x) where x -+ f(x) means that f sends
xe X into f(x)e Y. The latter can also specify f when the domain and codomain are
known; for example t -4 cost defines the function cosine: R -4 R.

4 A function f: X -4 Y : x -+ f(x) is said to be injective, or called an injection,


or (one-one), iff f(xl)=f(x2) => XI =x2, (or equivalently XI x2 => f(xl) f(x2».

5 A function f: X -4 Y is said to be surjective, or called a surjection, (or onto n,


iff VYE Y, 3XE X s.t. Y= f(x), (or equivalent Y = f{X».

6 A function f: X -4 Y is said to be bijective, or called a bijection iff it is injective


and surjective, (or equivalently 'fI ye Y 3! xe X s.t. y = f (x». (Recall that 3!
means "there exists a unique.")

7 Composition. Consider functions g: X -+ Y : x -+ g(x) and f: Y -+ Z : y -+ f(y). We


404

call the function


h:=fog : X Z: x h(x):=f(g(x»

the composition of g and f (in that order); h is also said to be a composite function.

8 Composition may be visualized by a diagram: see Fig. A 1. To go from X to Z by


fog is the same as going through Y in two steps. first by g. then by f. We say that the
diagram commutes.

9 Composition is associative: with f and g as above and


h :W X then fo(goh) = (fog)oh.
Composition leads also to the definition of inverses:

10 Consider maps f: X Y with Ix resp. 1y the identity maps on X and Y. Let


gL.gR and g be maps of Y into X.
a) gL is called a left inverse of f iff gLof= Ix.
gR is called a right inverse of f iff fogR = 1y •
'Y) g is called a two-sided inverse or an inverse of f denoted by I I iff
gof= Ix and fog = I y .

For the latter case we say that f is invertible or I I exists.

11 Fact [MacL.l p.8]. Let f: X Y be invertible. Then a two-sided inverse g of f is


unique; furthermore any two inverses (left. right. two-sided) of f are equal.

12 Facts [MacL.l pp.7-8]: Let f: X Y.


a) f has a left inverse iff f is injective;
f has a right inverse iff f is surjective;
'Y) f is invertible iff f is bijective.
[Hint: a) Since f is bijective onto f(X) c Y. one can pick gL: Y X s.t .• on f(X).
gL(f(x»=x; hence gLof= Ix.
13) Because f is onto Y it is possible to choose gR: Y X S.t. fog R = 1y . . . . J

X fog z

\1 y
Fig. AI. A commutative diagram.
405

A2. Rings and Fields

1 In engineering we frequently encounter the following


(a) Fields: R, (I: ,Q ; lR(s), (I: (s) , where in particular

2 Q := the field of rational numbers


3 lR(s), ( (I: (s» := the field of rational functions in s with coefficients in
R, ( cr, resp.).
(b) Commutative rings:
Z ,R[s], cr [s], R(O), R 0(0), Ru ; diagonal manices with elements in a
field (e.g., R, cr, or R(s» or in a commutative ring (e.g., R[s], cr [s]);
where in particular

4 R[s], ( (I: [s]) := the ring of polynomials in s with coefficients in R,


( cr, resp.).
:= the ring of propert (strictly proper) tt rational functions with
coefficients in R.
6 R(O),(R 0(0» := the subring of elements of Rp(s), (Rp.o(s», that are analytic in
(1:+ (Le. with no poles in (1:+).

7 RU := the subring of elements of Rp(s) that are analytic in U: a


closed subset of (I: symmetric w.r.t. the real axis and
which includes (I: +.
The elements of R(O) and RU are also called exponentially stable, (abbreviated expo
stable) and U-stable transfer functions, resp.
(c) Noncommutative rings:
R nxn , (I:nxn; R[s]nxn, (I: [s]nxn; R(s)nxn, (I: (s)nxn;

which denote n x n manices with elements in R, (1:, etc.



8 Exercise. For each of the rings and fields above, verify that you know the opera-
tions of addition and multiplication. Identify precisely the identity under addition
denoted by 0 and under multiplication denoted by I, (denoted by I in the matrix case).
In order to emphasize the similarity and differences between rings and fields we
define them jointly: thus the left column is assigned to axioms of rings and the right
column to those of fields; the axioms that are common to both are stated only once.
9 Definitions. We call ring, (field, resp.), the object consisting of a set of elements,

t Bounded at infinity.
tt (Zero at infinity).
406

two binary operations viz. addition + and multiplication • , an identity element


under addition denoted by 0, and an identity element under multiplication denoted by
1 obeying the following axioms:

Ring: (R, +, 0; 0, 1) Field: (F, +, 0; 0, 1)

(a) Addition is
Associative: (a+p}+y=a+(l3+y) Va,p,y
Commutative: a+p = l3+a V a,p
:3 identity 0:
°
cH.()=a Va
:3 inverse: Va, :3 element (-a) S.t. a+(-a.) =
(b) Multiplication is
Associative : (a 'P) .y= a . (P .y) V a,p,y;

Not necessarily commutative Commutative


a'
:3 identity l:a·l=l·a=a Va.
Va E R, a # ° 4- a-I exists VaEF,a#O
=>3 inverse a-I S.t.
a'(a-I)=(a-I)'a= 1
(c) Distributive laws: Va,p,y

Note that we require our rings to have an identity element 1: some algebraists do not

require this and call our rings, "unitary rings," e.g. [Sig.l]. There is a standard pro-
cedure to add a 1 to any ring (Jac.l].
From the axioms above four important facts follow.

10 Fact. In any ring and in any field, the identities 0 and 1 are unique.
(This is easily shown by contradiction.)

11 Fact. In a ring, the cancellation law does not necessarily hold; more precisely, in
a ring

and a"# 0 do not necessarily imply p=y.

Example. Consider the noncommutative ring R2x2 :


407

but clearly 13 '" y, even though a '" O.

12 Remark. The cancellation law holds in any field F because a '" 0


=> 3! a-I e F and a-I (all) = a-I (ay)
=> (a-1a)p = (a-1a)y (associativity)
=> /3=y (a-Ia= 1).

13 Remark. We know some rings for which the cancellation law holds: e.g.
Z, R[s], cr [s], lRv(s), Rv.o(s), R(O), R 0(0), RU' Such rings are called integral
domains, or better yet, entire rings.

14 Fact. 'iae R, a'O=O'a=O


Proof. a+O=a => a'(a+O)=a'a => a·a+a·O=a·a. Adding -(a'a) to both
sides gives a' 0= O. Repeat the proof but multiply by a on the right:
O+a=a => (o+a)' a=a' a ,etc., gives O' a=O. •

15 Fact. 'ia,l3eR, (-a)I3=-(a·/3)=a·(-I3).


Proof.
0= 0,/3= [a+(--a))·/3= a'!3+(-a)'/3 => -(a'/3) =


16 Exercise. Show, from the axioms, that in any ring R,

(± (I:
i=1
ai ].
k=!
/3k ] = i: I: a i/3k .
i=! k=1

17 The ring K is called a commutative ring iff, in addition to the standard axioms
(9) we have

18 pq=qp 'ip,qe K.

19 Example. The commutative ring K might be


(1) any field: R, R(s), ... ;
(2) R[s], Rp(s), Rp,o(s), R (0), R 0(0), ... ;
(3) scalar convolution operators: p*q = q*p.
408

20 Addition and Multiplication of matrices with elements in K are defined as follows


(n denotes the sequence 1,2, ... ,n):
If Pe Krnxn and QeKrnxn then
(P+Q)ij := Pij + qij V ie ill, V je n

defines the matrix P+Q that is, in fact, in Krnxn. If Pe K rnxn and Qe KTlXp then
n
(PQ)ik := L Pij qjk V ie 01, V ke I!
j=l

defines the matrix PQ, which is an element of KffiXP. I

21 Exercise. Show that for n > I, K nxn is a noncom mutative ring. [Hint: check that
the axioms are satisfied].

22 Fact. For matrices with elements in K, the definition and properties of deter-
minants hold as in the case of elements in a field as long as one does not take
inverses! For example, if P,Q e Knxn , then det(PQ) e K and det(PQ) =det(P)·det(Q).

23 Fact [Cramer's rule]. Let PeKnxn , hence detPe K. Let Adj(P) denote as usual
the "classical adjoint", [Sigl. p.282], of P. By direct calculation we have:

24 (a) Adj(P) P=P Adj(P)=(det P)In

(b) PeKnxn has an inverse in Knxn

25 <;:;> det P has an inverse in K.

In that case,

26 p-l = Adj (P)[det(PW 1 e Knxn.

27 Comments. From (24) and (26) it follows that P has a right inverse iff it has a
left inverse; the common right and left inverse of P is called the inverse of P, (cf.
(AU 1).

Proof of (23): Outline:


(a) (24) is equivalent to, [Sig.l,p.287],
n n
L CkiPkj= L Pikcjk=BijldetPj Vi,jell
k=l k=l

where (I) Vi=j.


409

(2) Cij is the cofactor of element Pij of P, i.e., cij=(-I)i+jmij with mij
denoting the determinant of the matrix obtained by crossing out row i
and column j of P.
(b) If P has an inverse p- i e Knxn, then by the axioms of K, det(p- i ) e K; now
pp-I = implies [det(P)]. [det(p-I)] = 1; hence (25) holds. Conversely if (25) holds,
then the RHS of (26) e Knxn and is the inverse of P according to (24). •

28 Note. A matrix Pe Knxn is said to be nonsingular iff detP *" 0, where 0 is the
additive identity of K. Hence if K is a field then condition (25) is equivalent to
det P *" O. Therefore, we have the following coroJlary.

29 Corollary: Let Pe pnxn, then


Pe pnxn has an inverse in pnxn

P is nonsingular.

30 Comment: If the ring K is not a field then there may exist nonsingular matrices
P e Knxn having no inverses in K n xn : however corollary (29) still holds for inverses
in F nxn where the field F has K as a subring."

31 Example: Let K = R[s] be the ring of polynomials; R[s] is a subring of the field
F= R(s) of rational functions.
Hence, according to Corollary (29), P e R[s]nxn c R(s)nxn has an inverse in
R(s)nxn iff Pis nonsingular.

To wit: let Pt(s) = ; ], PI(s)-t= [_:2 -; ].

However, according to Fact (23), Pe R[s]nxn has an inverse in R[s]nxn iff det Pis S.t.
(detp)-I e R[s], i.e. such that det P is a nonzero constant (i.e. a polynomial of order
zero): such polynomial matrices are called unimodular, (equiv. invertible in
R[s]n xn ).

To wit: let P2(s) = P2(s)-1 = -;].

It follows that unimodular polynomial matrices are nonsingular but the converse is not
true.
To wit: PI(s) e R[s]2x2 and P2(s) e R[s]2x2 are both nonsingular but only P2(s) is uni-
modular: det P2(s) = I, det PI (s) = I-s3 (not a nonzero constant).

A3. Linear Spaces


Every engineer has encountered the linear spaces Rn, tr n .. '. Linear spaces
are also called "vector spaces" or "linear vector spaces." Roughly speaking, a linear
space is a set of vectors say V, to which we add, a field of scalars, say F, with which
410

we multiply vectors. So we shall denote a linear space by (V,F) or by V for brevity.


Sometimes to emphasize the field F, we say the F-linear space V.

1 Definition. We call linear space (V,F) the object consisting of a set (of vectors)
Y, a field (of scalars) F and two binary operations viz. addition of vectors + and
multiplication of vectors' by scalars ., which obey the following axioms:

(a) Addition is given by

+:VxV -+ V:(x,y) -+ x+y;

Addition is
Assqciative: (x+y)+z = x+(y+z) V x,y,ze V
Commutative: x+y = y+x V x,ye V
3! identity e, (called the zero vector), S.t.
x+e=e+x=x VxeV
3! inverse: V xe Y, 3!(-x)eY S.t. x+(-x)=9;

(b) Multiplication by scalars is given by


.: FxY -+ V:(a,x) -+ ax
where V xe V V F
(ap)x = a(px)
lx=x Ox=9;

(c) Addition and multiplication by scalars are related by distributive laws viz.
V xe V, V F =

Vx.yeV, VaeF a(x+y) = ax+ay .



There are two extremely important examples of linear spaces: for this reason we call
them canonical examples.

2 Canonical Example I. The linear space (P.F): the linear space of n-tuples in F
over the field F, with elements x = (Si );. Y= (lli); where each Si:'lie F for ie n·
Addition and scalar multiplication are defined by
x+Y:=(Si+lli)P and aX:=(uSi)P V ae F .

The most common examples are ( ern, er), (Rn,R), (R(s)n,JR(s» or ern,Rn,R(s)n for
short.

3 Exercise. Show that (P,F) is a linear space. [Hint: use the axioms of the field
411

to check the axioms in Definition (1).]

4 Canonical Example II. The function space F(D,V):


Let (V,F) be a linear space. Let D be a set. Let M be the class of all functions:
D V. On M define addition and scalar multiplication by

(f+g) (d) = f(d)+g(d) 'V f,ge M 'V de D

(aO(d)= af(d) 'VaeF,'VfeM 'VdeD

Then M with these operations is a linear space over F; it is denoted by F(D,V), or


F, when D and V are understood.

5 Exercise. Using the definitions of a function and of a linear space, show that
F(D,V) is a linear space. Describe precisely what the zero vector is in this case.
(Denote it by 9F , and that in V by 9 v ).

Comment. D stands for domain and is an arbitrary set, e.g. N, R, R n or a function


space. Note also that V is an arbitrary linear space.

6 Example III. The function space PC ([ro,td,Rn): it is the set of all functions map-
ping the bounded interval [to,ttl into R n which are piecewise continuous, i.e. they are
continuous everywhere, except that they may have a finite number of discontinuity
points 'tk where the one-sided limits f('tk+) and f('tk-) are well defined and finite. An
Rn-valued function defined on an infinite interval is said to be piecewise continuous
iff it is piecewise continuous on every bounded subinterval.
The prototype of a function in PC ([O,oo),R) is the square wave.

7 Example IV. The space of continuous functions mapping [to,ttl R n denoted


by C([to,t)],Rn).

8 Example V. The space of k times continuously differentiable functions mapping


[to,t)] R n denoted by Ck([to,td,Rn) or C k for short.

9 Example VI. The space of functions f: [to,td R n that are k times


differentiable S.t. the kth derivative is piecewise continuous (whence necessarily each
function and its derivatives up to the (k-l)th are continuous).

10 Exercise. Show that examples III-VI are linear spaces.

11 Example VII. Let F= R or C:. The space of 21t-periodic functions:


[O,21t] F such that
412

00

f(t)= L Ckeikl where


k=-oo

Note that if F=1R then Ck=C_k=Ckr+jcki and


00

f(t) = co+ 2 L (Cia cos (kt)-cki sin (kt»,


k=l

then "each vector" of this space is specified by the sequence (c k);.

We shall next describe the concept of subspace and product space.

14 Definition. Let (V,F) be a linear space and W a subset of V. Then ( W,F) is


called a (linear) subspace of V iff ( W,F) is itself a linear space.
From this definition it follows that

15 Examples. The set of all vectors in R n whose first component is zero. The set
of all functions fe F (D,V) that are equal to 8 y (the zero vector in V) at some fixed
point doe D or on some fixed subset Dl of D. The set of all functions f: R+ --+ R,
integrable over whose Laplace transform is zero at some fixed point Zo with
Re(zo) > O.

16 Exercises. Let I be an index set and (Wi)iEI be a family of subspaces of a linear


space (V,F). Show that n Wi is a subspace of V. Give an example to show that
ieI
WI uW2 is not necessarily a subspace. Show that
W I+W2 := {Wl+w2:wjeWj ie2,} is a subspace.
["Subspaces get smaller by intersecting and bigger by adding."]

17 Definition. Let (V,F) be a linear space. We call the subspace generated by a


subset S c V the intersection of all subspaces containing S, or equivalently the smallest
subspace containing S.

18 Fact [Sig.1. p.196]. Let (V,F) be a linear space. Then the smallest subspace
generated by a subset S cV is the span of S denoted by Sp(S) viz. the set of finite
linear combinations

(lie F, sjeS Vien.

19 Definition. Let (V,F) and (W,F) be linear spaces over the same field F. The
linear space (VxW,F) is called product space of V and W: it consists of vectors
413

(v.w)e VxW with addition and multiplication by scalars given by

and
a' (v.w) := (av.aw) 'v' ae F. 'v' ve V. 'v'weW.

Its zero vector is (9 y .9w). One usually abbreviates (VxW.F) by VxW.

20 Example. (:R.n.R) is the n-fold product of <R.1R) by itself.


We next touch the notions of linear independence and basis.

21 Definitions. Let (V.F) be a linear space. The family of vectors (Vi)r. where
each ViE V. is said to be linearly independent iff any relation of the form

implies

The family of vectors (vi)f. where each ViE V is said to be linearly dependent iff there
exist scalars ...• Un. not all zero S.t.

22 Exercise. a) Give three examples of linear spaces. In each case exhibit a


linearly independent and linearly dependent family of vectors. b) Let F= JR. For k
=<>.l ..... n. let fk:[-1.11 R:t tk. Show that the family (fk)8 is linearly indepen-
dent (in F([-1.11.JR». c) Let F= (C. For k=O.1.2 ..... n. let
fk : [-x,x] (C: t exp(jkt). Show that the family (fk)8 is linearly independent.

23 Exercise. Consider the matrix Me prnxn where m < n:

1 x x x
o 1 x
o0
M .-
·-

000 1 x x

where the x's denote arbitrary real numbers. Note that the leading nonzero element in
each row is 1. Show that the family of row vectors. (ordered from top to bottom). is
linearly independent.
414

24 Definitions. Let (V,F) be a linear space. The family of vectors [b i ); c V is


said to be a basis iff (a) (b i ); cV is linearly independent family and (b) V is the
subspace generated by [b i );. The elements of [b i ); are called basis vectors of V.

2S Note. In view of Fact (18) condition (b) reads also Sp ( (b i );] = V. Hence, in
view of the definitions, if XE V and (b i ); is an (ordered) basis, then there exists a

unique n-tuple of scalars = S.t.


n

;=1

The vector is called the component vector of x or the representation of x w.r.t. to


the basis (bi );. Note that XE V. biE V for iE n. SE pn and Si E F for iE n.

26 Exercise. Show that the Si'S are uniquely defined in terms of x and (b i ];.

The following is an important fact.

27 Fact [Sig.1,p.215]. Let (bit be a basis of (V,F). Then any other basis of
(V,F) has also n elements. Thus the number of elements of a (finite) basis of a linear
space is independent of the basis considered. Hence the following definitions.

28 Definitions. If a linear space (V,F) has a basis of n elements then (V,F) is said
to be finite dimensional or of dimension n. (We write dim V=n). Otherwise is
said to be infinite-dimensional.

29 Examples. a) (Rn,R) and (Rnxn,R), i.e. the space of n x n real matrices, are
finite dimensional of dimension, n and n2, resp., with corresponding bases [ei and t
(Eij)i,jeD where a) e i is the n-vector with every element 0 except for the ith which is 1
and b) Eij denotes the n x n matrix with every element 0 except for the (i J)th which is
1. PC ([O,I],R) is an infinite-dimensional linear space: indeed subspace

Sp [ (t t" ];] is infinite-dimensional.

30 Modules. Careful examination of Definitions (1), (14), (17), (19), (21) and (24) shows that one may replace the field F by a commutative ring K: if this substitution is carried out according to Table 1 below, then we have generalizations of the notions in the column on the left to those in the column on the right.
In system theory, we shall have many uses of the module (R[s]ⁿ, R[s]), i.e. the module of polynomial n-vectors over the ring of polynomials.

Table 1

  linear space over F,               module over K,
  subspace,                          submodule,
  subspace generated by a subset,    submodule generated by a subset,
  product space,                     product module,
  linear independence,               linear independence (over K),
  linear dependence,                 linear dependence (over K),
  basis,                             basis.

31 Exercises. 1. Give examples of submodules and product modules. 2. Show that a) (eⱼ)₁ⁿ is a basis for the module (R[s]ⁿ, R[s]); b) (1,0) and (0,s) do not form a basis for (R[s]², R[s]).

32 Note. Not all modules have a basis; the modules that we shall use have a basis. Modules (V,K) that have a basis are free modules [Sig.1, p.200].

In the sections below we shall be involved with linear maps over linear spaces and their matrix representations over a field. Note that, for example, for polynomial matrices one must introduce linear maps over modules and their matrix representations over a ring; see, e.g., [Sig.1], [Cal.1].

A4 Linear Maps

1 Let (V,F) and (W,F) be linear spaces over the same field F. Let A be a map from V to W. We say that A is a linear map (or linear operator) iff

2  A(α₁v₁ + α₂v₂) = α₁Av₁ + α₂Av₂  ∀ α₁,α₂ ∈ F, ∀ v₁,v₂ ∈ V.

Note the abuse of notation: the value of the function A at v is denoted by Av rather than A(v).

3 Example. Let A : R³ → R³ : v ↦ w := Av, where A ∈ R³ˣ³ is a fixed matrix. That condition (2) holds here is immediate from the rules of matrix addition and multiplication by a scalar.

4 Example. Let V = PC([0,1],R) and W = R; let A : V → R be defined by

Av = ∫₀¹ v(t) dt.

That (2) holds is immediate from the properties of the integral.

Note. In Example 4 the domain V and codomain W are different; hence their zero vectors θ_V and θ_W differ: θ_V is the zero function and θ_W is the number zero.

5 Example. Let V = W = PC([0,∞),R) and let A : V → W be defined by

w(t) := (Av)(t) = ∫₀ᵗ exp[−(t−τ)] v(τ) dτ.

6 Exercise. Show that A(θ_V) = θ_W. (Hint: A(θ_V) = A(0·v) = 0·Av, and 0·x = θ in any linear space.)

7 Let A : V → W be a linear map. We call the set

8  N(A) := { v ∈ V : Av = θ_W } ⊂ V

the null space of A, and the set

9  R(A) := { Av : v ∈ V } ⊂ W

the range of A (also called the image of A).
N(A) and R(A) are also called the kernel of A and the image of A, resp.
It is important to note that N(A) and R(A) are linear subspaces (Prove it!).

10 Exercise. Suppose A : Fⁿ → Fᵐ is a linear map defined by an m×n matrix A s.t., with y = Ax, we have componentwise

yᵢ = Σ_{j=1}^n aᵢⱼxⱼ,  i ∈ m.

Show that if aⱼ ∈ Fᵐ is the jth column vector of A, then R(A) = Sp((aⱼ)₁ⁿ).

Recall that A is injective iff v₁ ≠ v₂ ⇒ Av₁ ≠ Av₂.

11 Theorem. Let A : V → W be a linear map. Then

A is injective ⇔ N(A) = {θ_V}.

Proof: We prove it by the following equivalences: A is not injective
⇔ ∃ v₁ ≠ v₂ s.t. Av₁ = Av₂
⇔ ∃ v ≠ θ, namely v = v₁−v₂, s.t. Av = A(v₁−v₂) = Av₁ − Av₂ = θ_W.

12 Exercise. Let A : U → V be a linear map.
1) Show that if (Auᵢ)₁ⁿ is a linearly independent family, then so is (uᵢ)₁ⁿ.
2) Show that the converse holds iff A is injective.

13 Theorem. Let A : U → V be a linear map with dim U = dim V = n.
U.t.c.

A⁻¹ exists (i.e. A is bijective)
⇔ A is injective (i.e. N(A) = {θ_U})
⇔ A is surjective (i.e. R(A) = V).

Hints: Any basis of U is a family of n linearly independent vectors (uᵢ)₁ⁿ, and similarly for V. If A is injective, then (Auᵢ)₁ⁿ is a basis of V. If A is surjective, then for any basis (vᵢ)₁ⁿ of V, ∃ uᵢ ∈ U s.t. Auᵢ = vᵢ; this defines a basis (uᵢ)₁ⁿ of U …
=
14 The Equation Ax = b. Let (U,F) and (V,F) be two linear spaces (not necessarily finite dimensional). Let A : U → V be a linear map and let b ∈ V. The following theorem specifies conditions for the existence and uniqueness of solutions; furthermore it specifies all solutions of Ax = b.

15 Theorem. In terms of the notations just defined, we have

a) Ax = b has at least one solution ⇔ b ∈ R(A).
b) Assume, in addition, that b ∈ R(A); then
i) Ax = b has a unique solution ⇔ N(A) = {θ_U};
ii) let x₀ be any particular solution, i.e. Ax₀ = b; then
x is a solution of Ax = b
⇔ x = x₀ + z with z ∈ N(A).

16 Comments. a) In case dim N(A) ≥ 1, (ii) shows that Ax = b has an infinite number of solutions. In particular, if (bᵢ)₁ᵏ is a basis for N(A), then

x = x₀ + Σ_{i=1}^k cᵢbᵢ

is a solution for any choice of scalars cᵢ ∈ F.
b) In general, the set of solutions is an affine space denoted by x₀ + N(A) (note that if every vector of that space is shifted by −x₀, then the affine space becomes a linear space).

Proof of Theorem (15). a) follows from the definition of the range of A.
b) Observe that for any two solutions, say x and x₀, we obtain by subtraction

Ax − Ax₀ = A(x−x₀) = θ_V,

or equivalently,

z := x − x₀ ∈ N(A).

Hence (i) and (ii) follow. •



17 Three ways to represent a subspace. Let (U,F) and (V,F) be two linear spaces where U and V are not necessarily finite dimensional.
A (linear) subspace M of V may be represented in several ways:
i) [Span of a set.] M is specified by

M := Sp(S)

where S is a subset of V. Note that if M has a basis (bᵢ)₁ᵏ, then M = Sp[(bᵢ)₁ᵏ].
ii) [Range of a linear map.] Let A : U → V be a linear map; then M is specified by

M := { v ∈ V : v = Au, u ∈ U },

equivalently, M := R(A).
iii) [Null space of a linear map.] Let B : V → U be a linear map; then M is specified by

M := { v ∈ V : Bv = θ_U },

equivalently, M := N(B).

18 Note. If the linear space V above is of finite dimension n, then the subspace M of V is also finite dimensional, and the integer n − dim(M) is called the codimension of M w.r.t. V.

A5. Matrix Representation

In this section we discuss various issues concerning the matrix representation of a linear map with finite-dimensional domain and codomain.

A5.1. The Concept of Matrix Representation


Let 𝒜 : (U,F) → (V,F) be a linear map where dim U = n and dim V = m. Let (uⱼ)₁ⁿ be a basis of U. Then for any x ∈ U there exists a unique ξ = (ξⱼ)₁ⁿ ∈ Fⁿ s.t.

x = Σ_{j=1}^n ξⱼuⱼ,

where ξ is called the component vector of x w.r.t. the basis (uⱼ)₁ⁿ. Now by linearity

𝒜x = Σ_{j=1}^n ξⱼ𝒜uⱼ.

Let (vᵢ)₁ᵐ be a basis of V. Since 𝒜 : U → V, each 𝒜uⱼ ∈ V, whence each 𝒜uⱼ has a unique representation in terms of the basis (vᵢ)₁ᵐ:

1  𝒜uⱼ = Σ_{i=1}^m aᵢⱼvᵢ  ∀ j ∈ n;

note that (aᵢⱼ)ᵢ₌₁ᵐ is the component vector of 𝒜uⱼ. Hence, by stacking the column vectors (aᵢⱼ)ᵢ₌₁ᵐ, j ∈ n, we obtain an m×n matrix

2  A = [aᵢⱼ]  (m rows, n columns),

where A ∈ F^{m×n}.

Let us calculate the component vector η = (ηᵢ)₁ᵐ of y := 𝒜x ∈ V w.r.t. (vᵢ)₁ᵐ:

y = 𝒜x = Σ_{j=1}^n ξⱼ𝒜uⱼ = Σ_{j=1}^n Σ_{i=1}^m aᵢⱼξⱼvᵢ = Σ_{i=1}^m ηᵢvᵢ.

Hence, by the uniqueness of the representation,

ηᵢ = Σ_{j=1}^n aᵢⱼξⱼ  ∀ i ∈ m,

or equivalently, in vector notation, we obtain the component equation

η = Aξ,

i.e. the component vector η of the vector y = 𝒜x is obtained by multiplying the component vector ξ of x by the matrix A. Therefore we have proven the following useful theorem.

3 Theorem. Let (U,F) have a basis (uⱼ)₁ⁿ and let (V,F) have a basis (vᵢ)₁ᵐ. Let 𝒜 : U → V be a linear map. Then, w.r.t. these bases, the linear map 𝒜 is represented by the m×n matrix

A = [aᵢⱼ]_{i∈m, j∈n} ∈ F^{m×n},

where the jth column of A is the component vector of 𝒜uⱼ w.r.t. the basis (vᵢ)₁ᵐ.

4 Note. In most applications we replace vectors and linear maps by their representations, viz. component vectors and matrices; e.g. we write N(A) instead of N(𝒜).

5 Exercise. Let 𝒜 be a linear map of (U,F) into itself, where dim U = n. i) Suppose that 𝒜ⁿ = −α₁𝒜ⁿ⁻¹ − ⋯ − αₙ₋₁𝒜 − αₙI, where I is the identity map: U → U and the αᵢ's are elements of F. Let b be a vector in U. Suppose that

6 b is an 𝒜-generator of U, i.e. (b, 𝒜b, …, 𝒜ⁿ⁻¹b) is a basis of U. Show that w.r.t. this basis the vector b and the linear map 𝒜 are represented by

                  ⎡ 0 0 ⋯ 0  −αₙ   ⎤
                  ⎢ 1 0 ⋯ 0  −αₙ₋₁ ⎥
e₁ = (1,0,…,0)ᵀ,  A = ⎢ 0 1 ⋯ 0  −αₙ₋₂ ⎥
                  ⎢ ⋮      ⋱   ⋮    ⎥
                  ⎣ 0 0 ⋯ 1  −α₁   ⎦

The matrix A above is said to be in column-companion form.
ii) Suppose that λ ∈ F and (bᵢ)₁ⁿ is a basis of U such that

𝒜b₁ = λb₁

and

𝒜bₖ = λbₖ + bₖ₋₁ for k = 2,…,n.

Obtain the matrix representation of 𝒜 w.r.t. that basis.

7 Exercise. Let 𝒜 be a linear map from R³ into R³. Consider the usual mutually orthogonal axes 0x₁, 0x₂ and 0x₃ (Fig. A2). The map 𝒜 first rotates any vector by an angle α₁ about the axis 0x₁ and then rotates the resulting vector by α₃ about 0x₃. Obtain the matrix representation of the linear map 𝒜 w.r.t. the given axes.

Fig. A2. The map 𝒜 is the composition of the two rotations shown.

8 Exercise [Matrix operations].

a. Consider the linear maps

𝒜 : (V,F) → (W,F) : y ↦ z := 𝒜y

and

ℬ : (U,F) → (V,F) : x ↦ y := ℬx,

where dim U = m, dim V = n and dim W = p. Let U, V, and W have bases (uₖ)₁ᵐ, (vⱼ)₁ⁿ, (wᵢ)₁ᵖ, with corresponding component vectors ξ = (ξₖ)₁ᵐ, η = (ηⱼ)₁ⁿ and ζ = (ζᵢ)₁ᵖ.
Let 𝒜 and ℬ be represented by matrices

A = [aᵢⱼ]_{i∈p, j∈n}  and  B = [bⱼₖ]_{j∈n, k∈m},

generating the component equations

ζ = Aη  and  η = Bξ.

Consider the composed linear map

𝒞 = 𝒜∘ℬ : U → W : x ↦ z = 𝒞x = 𝒜(ℬx).

Show that 𝒞 is represented by the matrix

C = [cᵢₖ]_{i∈p, k∈m},

generating the component equation ζ = Cξ, where ∀ i,k

cᵢₖ = Σ_{j=1}^n aᵢⱼbⱼₖ,

i.e.

C = AB,

or equivalently, composition of linear maps corresponds to matrix multiplication.


b. Consider the linear operator

𝒜 : (U,F) → (U,F) : x ↦ y := 𝒜x,

where dim U = n and 𝒜 is invertible, i.e. 𝒜 has an inverse 𝒜⁻¹ s.t.

𝒜⁻¹𝒜 = 𝒜𝒜⁻¹ = I,

where I is the identity map on U. Show that if 𝒜 and 𝒜⁻¹ have matrix representations A and B, then

BA = AB = Iₙ,

where Iₙ is the unit matrix of F^{n×n}. Hence

B = A⁻¹,

i.e. inversion of a linear map corresponds to matrix inversion.

9 Final note. Of course, summing linear operators according to (𝒜+ℬ)x = 𝒜x + ℬx ∀ x corresponds to summing matrix representations, i.e. A+B.

A5.2. Matrix Representation and Change of Basis

In this section we study the relation between two matrix representations of the same linear map. We start with a detailed exercise.

11 Exercise [Linear spaces and bases]. Let (V,F) be a linear space s.t. dim V = m. Let V have bases (vᵢ)₁ᵐ and (v̄ᵢ)₁ᵐ. For y ∈ V let the corresponding component vectors be η = (ηᵢ)₁ᵐ and η̄ = (η̄ᵢ)₁ᵐ, both in Fᵐ; thus y = Σ_{i=1}^m ηᵢvᵢ = Σ_{i=1}^m η̄ᵢv̄ᵢ.
Show that
a. Each choice of basis (vᵢ)₁ᵐ defines a bijective linear map Fᵐ → V : η ↦ y.
b. Each change of basis vᵢ ↦ v̄ᵢ, i ∈ m, defines a bijective linear map η ↦ η̄ = Qη given by the nonsingular matrix Q ∈ F^{m×m} whose jth column is the component vector of vⱼ w.r.t. (v̄ᵢ)₁ᵐ.
c. The relations between V and the component vector spaces Fᵐ are reflected in the commutative diagram of Fig. A3.
Fig. A3. Change of basis in a linear space. (Note that a double arrow represents a bijection.)

We study now a linear map under change of bases in the domain and codomain.

12 Let 𝒜 : (U,F) → (V,F) be a linear map with dim U = n and dim V = m.
Let U, with elements x, have bases (uⱼ)₁ⁿ and (ūⱼ)₁ⁿ generating component vectors ξ and ξ̄, resp., in Fⁿ. According to Exercise (11), ξ̄ and ξ are related by

13  ξ̄ = Pξ,

where the matrix P ∈ F^{n×n} is nonsingular, (A2.29). Similarly, let V, with elements y, have bases (vᵢ)₁ᵐ and (v̄ᵢ)₁ᵐ generating component vectors η and η̄ in Fᵐ related by

14  η̄ = Qη,

where the matrix Q ∈ F^{m×m} is nonsingular. Finally, let, according to Theorem (A5.3) and (A5.2), 𝒜 have matrix representations A and Ā in F^{m×n} corresponding to the component equations

15  η = Aξ,

16  η̄ = Āξ̄.

Note that the process generating Eq. (15) is depicted in the commutative diagram of Fig. A4, where the double arrows indicate bijections generated by the choice of bases in domain and codomain. Of course, a similar diagram can be drawn when generating Eq. (16).
Fig. A4. Commutative diagram of a matrix representation of 𝒜 : U → V.

It follows now immediately from Eqs. (13)-(15) that, using matrix multiplication,

η̄ = Qη = QAξ = QAP⁻¹ξ̄.

Whence, since Ā : ξ̄ ↦ η̄ is the unique matrix representation of 𝒜 w.r.t. the new bases,

17  Ā = QAP⁻¹,

or, since the matrices Q, P in (13)-(14) are nonsingular,

18  A = Q⁻¹ĀP.

Equations (17)-(18) specify the relations between the matrix representations A and Ā of the map 𝒜 under the change of bases described above.

19 Two matrices A and Ā s.t. (17)-(18) hold are said to be equivalent [Sig.1, p.256].
Thus we see that, under change of basis, two matrices A and Ā, both in F^{m×n}, represent the same linear map 𝒜 : U → V, with dim U = n and dim V = m, if and only if A and Ā are equivalent.
The whole process is visualized by the commutative diagram of Fig. A5. Recall the diagrams of Fig. A3 and Fig. A4. In Fig. A5 the right part of the diagram depicts the change of basis in the codomain V, the left part does so for the change of basis in the domain U. The lower part and the top part show the matrix representations A : ξ ↦ η and Ā : ξ̄ ↦ η̄. Reading off the outer part of the commutative diagram a) clockwise, we obtain A = Q⁻¹ĀP, and b) counterclockwise, we obtain Ā = QAP⁻¹. Thus the commutative diagram of Fig. A5 summarizes Eqs. (13) to (18).
Fig. A5. The diagram of change of bases for 𝒜 : U → V.

Special Case. Suppose 𝒜 : U → U. Then we may use the same basis for domain and codomain. The diagram then simplifies and becomes that of Fig. A6.

Fig. A6. Change of basis when 𝒜 : U → U.

Using the commutative properties of its left-most part, two matrix representations A and Ā of 𝒜 are related by

20  ĀP = PA,

21  Ā = PAP⁻¹.

Note that A and Ā are square matrices in F^{n×n} and that P ∈ F^{n×n} is nonsingular.

22 Two square matrices A and Ā s.t. Eqs. (20)-(21) hold are said to be similar [Sig.1, p.257]. Hence, when the same basis is used in domain and codomain, two matrices A and Ā ∈ F^{n×n} represent the same linear map 𝒜 : U → U, with dim U = n, if and only if A and Ā are similar.
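A short numerical check of (20)-(21), assuming NumPy; the randomly chosen P is generically nonsingular (an assumption of this sketch), and the matrices are illustrative.

```python
import numpy as np

# Two similar matrices represent the same map: xi_bar = P xi and
# A_bar = P A P^{-1} describe y = Ax in the second basis.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))          # representation in the first basis
P = rng.standard_normal((3, 3))          # change-of-basis matrix
A_bar = P @ A @ np.linalg.inv(P)         # representation in the second basis

xi = rng.standard_normal(3)              # components of some x, first basis
eta = A @ xi                             # components of y = Ax, first basis
eta_bar = A_bar @ (P @ xi)               # components of y, second basis
print(np.allclose(P @ eta, eta_bar))     # True: both describe the same y
```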

A5.3. Range and Null Space: Rank and Nullity

Let 𝒜 be a linear map from (U,F) into (V,F) with dim U = n and dim V = m. N(𝒜) and R(𝒜) are subspaces of U and V, resp. Hence they have well-defined dimensions, denoted dim N(𝒜) and dim R(𝒜). It is a fact that

24  dim R(𝒜) + dim N(𝒜) = n = dim Domain(𝒜).

25 Remark. One might expect that N(𝒜) and R(𝒜) be disjoint. This is not so: various possibilities may occur. For example, let U = V = R².

If A is nonsingular (e.g. A = I), then R(A) ∩ N(A) = {θ}.

If A = [0 1; 0 0], then R(A) = N(A) = Sp(e₁), so R(A) ∩ N(A) = Sp(e₁).

Let U = V = R³ and A = [0 0 0; 0 0 1; 0 0 0]. Then

R(A) ∩ N(A) = Sp(e₂).

Proof of (24). Consider Eq. (24). Let (uᵢ)₁ᵏ be a basis of N(𝒜). Complete the basis s.t. (uᵢ)₁ⁿ is a basis of U. For any x ∈ U there is a unique representation x = Σ_{j=1}^n ξⱼuⱼ, and by the linearity of 𝒜

26  𝒜x = Σ_{j=1}^n ξⱼ𝒜uⱼ = Σ_{j=k+1}^n ξⱼ𝒜uⱼ.

Now 𝒜uⱼ = θ_V for j = 1,2,…,k because those uⱼ ∈ N(𝒜). Clearly, since x in (26) is arbitrary, (𝒜uⱼ)_{k+1}^n spans R(𝒜), and we claim that this family is a linearly independent family. Suppose it is not: then it would be linearly dependent, i.e. for some scalars α_{k+1}, …, αₙ (not all zero)

θ_V = Σ_{j=k+1}^n αⱼ𝒜uⱼ = 𝒜[ Σ_{j=k+1}^n αⱼuⱼ ],

i.e. the vector Σ_{j=k+1}^n αⱼuⱼ is in N(𝒜). However, this vector is nonzero since (uᵢ)_{k+1}^n is a linearly independent family; expressing it in the basis (uᵢ)₁ᵏ of N(𝒜) would make a nontrivial linear combination of the basis (uᵢ)₁ⁿ of U equal to zero. This is clearly impossible, and hence (a) (𝒜uⱼ)_{k+1}^n is a basis for R(𝒜); (b) dim R(𝒜) = n − k; and (c) Eq. (24) follows.

27 By rank of the matrix A ∈ F^{m×n} (denoted by rk A) we mean dim R(A), and by nullity of the matrix A ∈ F^{m×n} (denoted by nl A) we mean dim N(A). Hence

28  n = rk A + nl A.
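Equation (28) is easy to check numerically; the sketch below (Python with NumPy/SciPy, an illustration with a matrix of our own choosing — its second row is twice the first, so rk A = 2) uses `numpy.linalg.matrix_rank` for rk A and the dimension of `scipy.linalg.null_space` for nl A.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [0., 1., 1., 0.]])
rk = np.linalg.matrix_rank(A)         # dim R(A)
nl = null_space(A).shape[1]           # dim N(A)
print(rk, nl, rk + nl == A.shape[1])  # 2 2 True
```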

29 Fact [Sig.1, pp.262,377]. Let A ∈ F^{m×n} be a matrix. U.t.c.,

30  0 ≤ rk A ≤ min(m,n);

31  rk A is equal to the

(a) maximum number of linearly independent column vectors of A,
(b) maximum number of linearly independent row vectors of A, and
(c) largest integer r such that at least one minor of order r is nonzero. •

32 Exercise. Let A ∈ F^{n×n} be a square matrix. Show that A has an inverse if and only if A is nonsingular, or equivalently rk[A] = n.

[Hint: use Corollary (A2.29) and (31.c).]
In the following, when we consider a linear map 𝒜 represented by a matrix A, we shall write R(A) and N(A) when we mean R(𝒜), resp. N(𝒜). In other contexts we shall also say that A is injective when we mean 𝒜 is injective.
Now in many applications we also need:

33 Let A ∈ F^{m×n} be a matrix. By row rank and column rank of A (denoted by row rk A, col rk A) we mean its maximum number of linearly independent row vectors, resp. column vectors.
Note that by (31)

34  rk A = row rk A = col rk A.

35 The matrix A ∈ F^{m×n} is said to be of full row rank or of full column rank iff rk A = m, resp. rk A = n.

36 Exercise. Let A ∈ F^{m×n} and let Iₘ be the m×m identity matrix. Show that
(a) A has a right inverse (equivalently, A is surjective) iff rk A = rk[A ⋮ Iₘ]; equivalently, A has full row rank.
(b) A has a left inverse (equivalently, A is injective) iff N(A) = {θ}; equivalently, A has full column rank. •

Sylvester's inequality: Let A ∈ F^{m×n} and B ∈ F^{n×p} be two matrices; then AB ∈ F^{m×p} and

37  rk A + rk B − n ≤ rk AB ≤ min{rk A, rk B}.

Hints: Let W be the codomain of A and let A|R(B) : R(B) → W be the restriction of A to R(B). Note that

R(AB) = R(A|R(B)) ⊂ R(A),

N(A|R(B)) ⊂ N(A),

R(B) = Domain(A|R(B)),

and apply (24) to A|R(B):

dim R(B) = dim R(A|R(B)) + dim N(A|R(B)).

Sylvester's inequality and (24) are the main tools for the following result, which is left as an exercise.

38 Theorem [Rank and nullity invariance under equivalence]. Let A ∈ F^{m×n} be a matrix and let P ∈ F^{n×n} and Q ∈ F^{m×m} be nonsingular matrices. U.t.c.

39  rk A = rk AP = rk QA = rk QAP,

40  nl A = nl AP = nl QA = nl QAP.

41 Comment. This theorem is an algebraic consequence of the obvious geometric fact that the range and null space of a linear map 𝒜 do not change under a change of basis in its domain or codomain or both.

A5.4. Echelon Forms of a Matrix

It is important to note that F denotes a field. Our objective is to reduce a matrix A ∈ F^{m×n} to row or column echelon form. These forms are obtained by a change of basis in the codomain or domain of the map 𝒜. They are well suited for discussing the construction of a basis for R(A), N(A) and the solution of the equation Ax = b.

42 Let A ∈ F^{m×n}. Elementary row operations (e.r.o.'s) on A are of three kinds†:

a) Interchange two rows: ρᵢ ↔ ρⱼ.
b) Multiply row i by a nonzero scalar c ∈ F: ρᵢ ← cρᵢ.
c) For j ≠ i, add to row i another row j multiplied by r ∈ F: ρᵢ ← ρᵢ + rρⱼ.

43 Note that e.r.o.'s are equivalent to premultiplying A by left elementary matrices L: these are obtained from the unit matrix by performing the desired e.r.o. upon it.

44 Exercises. a) For each e.r.o. compute its corresponding left elementary matrix L and check that the transformed matrix Ā is related to A by Ā = LA.
b) Show that each e.r.o. is invertible and, for each one, obtain L⁻¹.
(Hint: ρᵢ ↔ ρⱼ; ρᵢ ← c⁻¹ρᵢ, etc.)

45 Let A ∈ F^{m×n}. Elementary column operations (e.c.o.'s) on A are similarly defined: replace "row" by "column" (ρᵢ ↦ γᵢ; ρⱼ ↦ γⱼ).††

46 Note that e.c.o.'s are equivalent to postmultiplying A by right elementary matrices: these are obtained from the unit matrix by performing the desired e.c.o. upon it.

† ρᵢ stands for row i; with r ∈ F, rρᵢ stands for the product of row i by r.
†† ρⱼ denotes row j; γⱼ denotes column j.

47 Exercise. For each e.c.o. compute its corresponding right elementary matrix R and check that the transformed matrix Ā is related to A by Ā = AR.

48 An operation on a matrix A ∈ F^{m×n} is said to be an elementary operation (e.o.) iff it is an e.r.o. or an e.c.o. A square matrix with elements in F is said to be an elementary matrix iff it is a left elementary matrix or a right elementary matrix.

49 Exercise. a) Show that each elementary operation is invertible; equivalently, each elementary matrix is nonsingular.
b) Show that each elementary operation does not change the rank or nullity of A.
(Hint: use (a) and Theorem (A5.38).)

50 Theorem [Row echelon form]. Let A ∈ F^{m×n}. Then there exists a nonsingular matrix Q ∈ F^{m×m} (obtained by e.r.o.'s) s.t.

51  QA = Ā, schematically

         ⎡ 0 ⋯ 0 ▪ x ⋯ x x x ⋯ x ⎤   row 1: leading entry in column k₁
         ⎢ 0 ⋯ ⋯ 0 ▪ x ⋯ ⋯ ⋯ x ⎥   row 2: leading entry in column k₂
    Ā =  ⎢           ⋱             ⎥
         ⎢ 0 ⋯ ⋯ ⋯ 0 ▪ x ⋯ ⋯ x ⎥   row r: leading entry in column kᵣ
         ⎢ 0 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 0 ⎥
         ⎣ 0 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 0 ⎦

(▪ denotes a nonzero leading entry, x an arbitrary entry), where Ā is said to be in row echelon form and has the following properties:

52 a) Let r denote the number of nonzero rows of Ā; then

r = rk Ā = rk A.

53 b) ∀ i ∈ r, ρ̄ᵢ is nonzero with a nonzero leading entry in column kᵢ, s.t. k₁ < k₂ < ⋯ < kᵣ.

54 c) ∀ γ̄ⱼ s.t. j < k₁, γ̄ⱼ is zero;
∀ γ̄ⱼ s.t. kᵢ ≤ j < kᵢ₊₁, with i ∈ r−1, the last m−i entries of γ̄ⱼ are zero;
∀ γ̄ⱼ s.t. j ≥ kᵣ, the last m−r entries of γ̄ⱼ are zero. •

To reduce a given matrix A ∈ F^{m×n} to row echelon form we use the following algorithm. Here the field F is R or C; the algorithm is a special case of a more general algorithm that is valid when F is any field (see below).

55 Algorithm [Reduction to row echelon form with partial pivoting].

Data: A ∈ F^{m×n} with F = R or C.

Step 1. Search for γ_{k₁}, the first column from the left that is nonzero. If no such column is found, stop.

Step 2. Choose the entry which is the largest in absolute value among the entries of γ_{k₁} and, by row permutation, bring it into position (1,k₁).

Step 3. For t = 2,3,…,m, from ρₜ subtract ρ₁ multiplied by entry (t,k₁) and divided by entry (1,k₁).
(Hence there is only one nonzero entry left in γ_{k₁}, viz. entry (1,k₁).)

For i = 2,3,…

Step 4. Search for γ_{kᵢ}, the first column from the left that is nonzero below ρᵢ₋₁. If no such column is found, or if i = m+1, stop.

Step 5. Choose the element that is the largest in absolute value among the nonzero entries of γ_{kᵢ} below ρᵢ₋₁; by row permutation bring that entry into position (i,kᵢ).

Step 6. For t = i+1,…,m, from ρₜ subtract ρᵢ multiplied by entry (t,kᵢ) and divided by entry (i,kᵢ).
(Hence there is only one nonzero element left in γ_{kᵢ} below ρᵢ₋₁, viz. entry (i,kᵢ).) •

56 Remarks. a) In the case of a general field F, the notion of absolute value may not be available (e.g. R(s)); then in Steps 2 and 5 we pick any appropriate entry of γ_{kᵢ} that is not the zero element of that field.
b) However, if F is R or C, then careful analysis of the propagation of round-off errors has shown that it is preferable in Steps 2 and 5 to choose a nonzero entry of γ_{kᵢ} that is largest in absolute value. This is referred to as partial pivoting [Gol.1, p.110]. This is done in Algorithm (55).
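A compact Python (NumPy) sketch of Algorithm (55) for F = R follows; the function name `row_echelon` and the numerical tolerance `tol` (needed because floating-point "zeros" are only approximate) are our own choices. It returns the echelon form together with the pivot columns k₁ < k₂ < ⋯ < kᵣ used in Applications 2 and 3 below.

```python
import numpy as np

def row_echelon(A, tol=1e-12):
    A = A.astype(float).copy()
    m, n = A.shape
    pivots, i = [], 0
    for k in range(n):                       # scan columns left to right
        p = i + np.argmax(np.abs(A[i:, k]))  # Steps 2/5: largest entry below
        if abs(A[p, k]) <= tol:
            continue                         # column k has no pivot
        A[[i, p]] = A[[p, i]]                # row permutation
        for t in range(i + 1, m):            # Steps 3/6: eliminate below pivot
            A[t] -= (A[t, k] / A[i, k]) * A[i]
        pivots.append(k)
        i += 1
        if i == m:
            break
    return A, pivots
```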

57 Exercise. Use Algorithm (55) on a 4×4 matrix A of rank 2 to obtain its row echelon form Ā; Application 2 below works with

          ⎡ 6  12   6  −6 ⎤
58   Ā =  ⎢ 0   0  −1  −1 ⎥
          ⎢ 0   0   0   0 ⎥
          ⎣ 0   0   0   0 ⎦

In the proof of Theorem (50) below we will apply Algorithm (55) under the conditions of Remark (56.a). We consider any field F.

Proof of Theorem (50).

Steps 3 and 6 of Algorithm (55) reduce to zero all entries of γ_{kᵢ} in rows i+1, i+2, …, m. Since A has m·n entries, the algorithm will stop with Ā in row echelon form with properties (52)-(54); note especially that by Exercise (49b)
r = rank Ā = rank A. •

59 Remark. The matrix Ā in row echelon form, as well as the transformation matrix Q of Theorem (50), can be obtained simultaneously by performing the e.r.o.'s of Algorithm (55) on the compound matrix [A ⋮ Iₘ]; (here Iₘ denotes the m×m unit matrix).

60 Application 1. Solving Ax = b.

Let A ∈ F^{n×n}, b ∈ Fⁿ and det A ≠ 0. Hence rk A = n, and A in row echelon form is upper triangular; call it U. For i = 1,2,…,n−1, Algorithm (55) performs a permutation Pᵢ and then a sequence of e.r.o.'s that is described by a product of left elementary matrices, say Lᵢ; it is easy to see that Lᵢ is lower triangular with 1's on the diagonal. So

Lₙ₋₁Pₙ₋₁Lₙ₋₂Pₙ₋₂ ⋯ L₁P₁A = U,

where U is upper triangular.

Let P := Pₙ₋₁Pₙ₋₂ ⋯ P₁ and L := P(Lₙ₋₁Pₙ₋₁ ⋯ L₁P₁)⁻¹. Careful analysis shows that L is lower triangular (with 1's on the diagonal) and that

PA = LU.

Since det P ≠ 0, x₀ is a solution of Ax = b if and only if x₀ is a solution of LUx = Pb.

Thus to obtain x₀ we solve

LUx = Pb.

We do this in two steps: first we solve

Ly = Pb

by forward substitution; then, using y, we solve

Ux = y

by backward substitution.
If all the elements of A and b are nonzero, the solution requires O(n³/3) operations. This estimate is not valid if A is sparse (i.e. made up of mostly zeros): if sparse-matrix software is used, the solution typically requires O(n^{1.5}) operations.
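In library form the two-step solve reads as follows (a sketch assuming SciPy; the matrix and right-hand side are illustrative). `lu_factor` computes the factorization PA = LU once, in compact storage; `lu_solve` then performs one forward and one backward substitution.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
b = np.array([1., 2., 3.])

lu, piv = lu_factor(A)          # O(n^3/3) operations, done once
x = lu_solve((lu, piv), b)      # O(n^2) per right-hand side
print(np.allclose(A @ x, b))    # True
```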

62 Application 2. Basis for N(A).

Let A ∈ F^{m×n} have a row echelon form Ā with index set K = {k₁, k₂, …, kᵣ}, where the kᵢ are the column indices of the leading entries. Let Kᶜ := n \ K. A basis for N(A) is obtained as follows: solve

63  Āx = θ

n−r times by setting successively all components of x with indices in Kᶜ equal to zero except for one component, which is set equal to one. The resulting family (xᵢ)₁^{n−r} of solutions of (63) is a basis for N(A). •

For example, if Ā is given by (58), then K = {1,3}, Kᶜ = {2,4} and a basis for N(A) is given by (x₁,x₂), where† x₁ = (−2,1,0,0)ᵀ, x₂ = (2,0,−1,1)ᵀ.

Note that the vectors of such a basis (xᵢ)₁^{n−r} are linearly independent because they form the columns of an n×(n−r) matrix containing I_{n−r} (in the rows indexed by Kᶜ).

† The superscript T denotes the transpose.
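A sketch of this recipe in Python, reusing the `row_echelon` function from the sketch after Remark (56); the back-substitution order (pivot rows processed from bottom to top) is the only subtlety.

```python
import numpy as np

def null_basis(A, tol=1e-12):
    E, K = row_echelon(A, tol)                  # echelon form, pivot columns
    n = A.shape[1]
    Kc = [j for j in range(n) if j not in K]    # free-variable columns
    basis = []
    for j in Kc:
        x = np.zeros(n)
        x[j] = 1.0
        for row in range(len(K) - 1, -1, -1):   # pivot rows, bottom-up
            k = K[row]
            x[k] = -(E[row] @ x) / E[row, k]    # solve row equation for x_k
        basis.append(x)
    return basis
```

Applied to Ā of (58), this returns x₁ = (−2,1,0,0)ᵀ and x₂ = (2,0,−1,1)ᵀ.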
64 Application 3. Basis for R(A).

Suppose we have Ā, a row echelon form for A ∈ F^{m×n}. Let us use the notation of Theorem (50). Then the columns γ_{k₁}, γ_{k₂}, …, γ_{kᵣ} of A constitute a basis for R(A).

Proof. Exercise. [Hint: Ā = QA, R(Ā) = Sp((γ̄_{kᵢ})₁ʳ) …]


65 Let A = (aᵢⱼ) ∈ F^{m×n} be a matrix and let Aᵀ = (aⱼᵢ) ∈ F^{n×m} denote the transpose of A. We say that A ∈ F^{m×n} is in column echelon form iff Aᵀ ∈ F^{n×m} is in row echelon form.

66 Transposition, Theorem (50), Algorithm (55) and Remark (56a) show that for any matrix A ∈ F^{m×n} there exists a nonsingular matrix P ∈ F^{n×n} (obtained by e.c.o.'s) s.t. Ā = AP is in column echelon form.

67 Exercise. State precisely the properties of a matrix Ā in column echelon form (cf. (52)-(54)).

68 Exercise. State precisely an algorithm that will reduce any matrix A ∈ F^{m×n} to column echelon form using partial pivoting (cf. Algorithm (55)).

69 Exercise ["Diagonalization under equivalence"]. Let Ir denote the rxr unit


matrix. Prove the following:

t The superscript T denotes the transpose.


434

Let Ae pnxn be a matrix, then there exist nonsingular matrices Qe FIlOOn and
Pepxn (obtained by e.r.o.'s and by e.c.o.'s, resp.) S.t.:

70 QAP=D

with

r
71
m-r

and

72 r=rkA.

73 Comment. Any matrix can be "diagonalized" according to (70)-(72) by a change of basis in the codomain (e.r.o.'s) and the domain (e.c.o.'s).

74 Exercise. Let A ∈ F^{m×n}. Let L (R) denote a product of left elementary matrices (right elementary matrices, resp.).
Show that:
(i) N(LA) = N(A),
(ii) dim R(LA) = dim R(A),
(iii) R(A) = R(AR),
(iv) dim N(A) = dim N(AR).

A6. Normed Linear Spaces

In this section we review a number of extremely useful concepts and results of analysis on normed linear spaces: norms, convergence, completeness, equivalent norms, the Lebesgue spaces lᵖ and Lᵖ, and continuous linear transformations.

A6.1. Norms

Intuitively the norm of a vector is a measure of its "length." Of course, since elements of a linear space can be vectors in Rⁿ, matrices, functions, etc., the concept of norm must be defined in general terms.

1 Let the field F be either R or C; consequently, ∀ α ∈ F the absolute value of α, denoted by |α|, is well defined. A linear space (V,F) is said to be a normed linear space iff there is a function, denoted by ‖·‖, mapping V → R₊ and satisfying the three conditions

(a) ‖v₁+v₂‖ ≤ ‖v₁‖ + ‖v₂‖  ∀ v₁,v₂ ∈ V (triangle inequality),
(b) ‖αv‖ = |α|·‖v‖  ∀ α ∈ F, ∀ v ∈ V,
(c) ‖v‖ = 0 ⇔ v = θ_V.
The expression ‖v‖ is read "the norm of v," and the function ‖·‖ is said to be the norm on V. It is often convenient to label a normed space by (V,F,‖·‖).
The following examples show that, on a given linear space (V,F), many norms can be defined.

2 Example. (Fⁿ,F) with elements x = (xᵢ)₁ⁿ ∈ Fⁿ.

3  ‖x‖₁ := Σ_{i=1}^n |xᵢ|,

4  ‖x‖₂ := ( Σ_{i=1}^n |xᵢ|² )^{1/2},

5  ‖x‖_∞ := maxᵢ |xᵢ|

are called sum norm, Euclidean norm, and sup-norm, resp. See [Nob.1, p.131].

6 Example [Matrices] [Nob.1, Sec.5.3]. (F^{m×n},F) is the linear space of matrices with elements A = (aᵢⱼ)_{i∈m,j∈n} ∈ F^{m×n}. The following are norms obtained by identifying matrices with vectors in F^{mn}:

7  ‖A‖ₐ := Σ_{j=1}^n Σ_{i=1}^m |aᵢⱼ|,

8  ‖A‖_F := ( Σᵢ Σⱼ |aᵢⱼ|² )^{1/2}  (Frobenius norm),

9  ‖A‖_b := max{ |aᵢⱼ| : i ∈ m, j ∈ n }.

Note the correspondence between (3)-(5) and (7)-(9). The following are so-called induced norms and will be discussed later, (see (73) below):

10  ‖A‖₁ := max_{j∈n} Σ_{i=1}^m |aᵢⱼ|  (column sums),

11  ‖A‖₂ := max{ [λⱼ(A*A)]^{1/2} : j ∈ n }  (singular values),

where A* ∈ F^{n×m} is the Hermitian or complex-conjugate transpose of A and λⱼ(M) denotes the jth eigenvalue of the matrix M,

12  ‖A‖_∞ := max_{i∈m} Σ_{j=1}^n |aᵢⱼ|  (row sums).

13 Example. C([t₀,t₁],Fⁿ). On this function space, having as elements continuous functions f : [t₀,t₁] ⊂ R → Fⁿ, we can define a norm

14  ‖f‖_∞ := max{ ‖f(t)‖ : t ∈ [t₀,t₁] },

where ‖f(t)‖ stands for any norm in (Fⁿ,F), (see equivalent norms, especially (47)).

15 Exercise. Show that the norms defined in (5), (9), (12), and (14) (with ‖f(t)‖ given by (5)) are norms, i.e. satisfy the axioms (a), (b), and (c) in (1).

16 Exercise. Let x ∈ Fⁿ, A ∈ F^{m×n}, and B ∈ F^{n×p}. Using the norms defined by (5) and (12), show that

17  ‖Ax‖ ≤ ‖A‖·‖x‖,

18  ‖AB‖ ≤ ‖A‖·‖B‖.
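These inequalities are easy to observe numerically; a small NumPy check (with randomly generated data of our own choosing) using the sup-norm (5) on vectors and the row-sum norm (12) on matrices — `np.linalg.norm` with `ord=inf` computes exactly these:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
x = rng.standard_normal(4)

ninf = lambda M: np.linalg.norm(M, np.inf)
print(ninf(A @ x) <= ninf(A) * ninf(x))   # True, as in (17)
print(ninf(A @ B) <= ninf(A) * ninf(B))   # True, as in (18)
```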
19 Exercise. Let f : [t₀,t₁] → F² : t ↦ f(t) = (f₁(t),f₂(t)), (f₁ and f₂ are the components of f). Let f be piecewise continuous. Consider

v := ∫_{t₀}^{t₁} f(t) dt := ( ∫_{t₀}^{t₁} f₁(t) dt, ∫_{t₀}^{t₁} f₂(t) dt ).

Clearly, v ∈ F². Show that, for the norm defined in (5),

20  ‖v‖ = ‖ ∫_{t₀}^{t₁} f(t) dt ‖ ≤ ∫_{t₀}^{t₁} ‖f(t)‖ dt.

(Hint: use approximating Riemann sums, e.g. Σᵢ f(tᵢ)(tᵢ₊₁−tᵢ).)


In a normed linear space (V,F,‖·‖), the set

21  B(a;r) := { u ∈ V : ‖u−a‖ < r },

where a ∈ V and r > 0, is called the (open) ball of radius r with center a. B(θ;1) is called the unit ball (with center θ).

22 Comment. The definitions (21) are special cases of similar definitions in metric spaces [Die.1].

23 Exercise. In (R²,R) draw the unit balls for the norms ‖·‖₁, ‖·‖₂, ‖·‖_∞ defined by (3)-(5).

24 Exercise. Let B be the unit ball in the normed linear space (V,F,‖·‖). Show that
(a) B is convex (i.e. v₁,v₂ ∈ B ⇒ λv₁+(1−λ)v₂ ∈ B ∀ λ ∈ [0,1]),
(b) B is balanced (i.e. v ∈ B ⇒ −v ∈ B),
(c) ∀ v ∈ V there is a finite r > 0 s.t. v ∈ B(θ;r).
(d) Give an example of a set C that satisfies (a) and (b) but not (c).

A6.2. Convergence

A typical engineering use of norms is in deciding whether or not an iterative technique converges. Again, since we use norms, the field F is either R or C.

26 Given a normed linear space (V,F,‖·‖) and a sequence of vectors (vᵢ)₁^∞ ⊂ V, we say that the sequence converges to the vector v ∈ V (denoted by vᵢ → v as i → ∞, or lim_{i→∞} vᵢ = v) iff the sequence of nonnegative real numbers ‖vᵢ−v‖ tends to zero as i → ∞.
Thus, thanks to the notion of norm, the concept of convergence of vectors is reduced to that of convergence of nonnegative real numbers.
In iterative techniques the "limit vector" v is not known (otherwise, why iterate?). So we need to be able to decide the question of convergence without knowing v. As with real numbers, we need the concept of Cauchy sequence.

27 A sequence (vᵢ) in (V,F,‖·‖) is said to be a Cauchy sequence in V iff for any ε > 0 there exists an integer N, depending on ε, such that ‖v_{N+p} − v_N‖ < ε ∀ p ∈ N; geometrically, this condition means ∀ p ∈ N, v_{N+p} ∈ B(v_N;ε).

Reference to the definitions shows that every sequence (vᵢ) ⊂ V that converges to some v ∈ V is a Cauchy sequence. For the converse we need a new concept.

28 A normed linear space (V,F,‖·‖) is said to be complete, or a Banach space (or the linear space (V,F) is said to be complete in the norm ‖·‖), iff every Cauchy sequence in V converges to an element v ∈ V.
Finally, for purposes of approximation we need the notion of dense set.

29 A subset X of a normed linear space (V,F,‖·‖) is dense in V iff for every element v ∈ V there exists a sequence (xᵢ)₁^∞ in X that converges to v ∈ V in the given norm, i.e. ‖xᵢ−v‖ → 0.
The following facts are known, e.g. [Tay.1], and useful.

30 Fact (F = R or C). Let (V,F) be any finite-dimensional linear space. Let ‖·‖ denote any norm on V. U.t.c. (V,F,‖·‖) is a complete normed linear space, or equivalently, a Banach space.

31 Examples. (Fⁿ,F,‖·‖) and (F^{m×n},F,‖·‖), where we use any norm defined in Examples (2) and (6), resp., are Banach spaces.

32 Fact. Let (Rⁿ,R,‖·‖) be the Banach space of n-tuples of real numbers. Let Qⁿ be the subset of n-tuples of rational numbers. U.t.c. Qⁿ is a dense subset of Rⁿ.

33 Fact. Let F = R or C. (a) The normed linear space (C([t₀,t₁],Fⁿ),F,‖·‖_∞) of Example (13) is a Banach space.
(b) Its subset P([t₀,t₁],Fⁿ) of n-tuples of polynomials in t ∈ [t₀,t₁] ⊂ R with coefficients in F is dense in C([t₀,t₁],Fⁿ) (in the norm ‖·‖_∞).

A6.3. Equivalent Norms

It turns out that on a given linear space (V,F) different norms may have the same consequences, provided one considers:
(i) the convergence and Cauchy nature of sequences,
(ii) the question of completeness in the norm, and
(iii) the question of the density of a subset in V.

41 Let (V,F) be a linear space. Let ‖·‖ₐ and ‖·‖_b be two norms defined on V. We say that the norms ‖·‖ₐ and ‖·‖_b are equivalent iff there exist two positive numbers mₗ and mᵤ s.t.

42  mₗ‖v‖ₐ ≤ ‖v‖_b ≤ mᵤ‖v‖ₐ  ∀ v ∈ V.

It is crucial to note that the same mₗ and mᵤ must work for all v ∈ V. Note also that equivalence of norms is an equivalence relation (reflexive, symmetric, and transitive). (Prove it.)

43 Exercise. Let (V,F) be a linear space. Let ‖·‖ₐ and ‖·‖_b be two equivalent norms. Let (vᵢ)₁^∞ ⊂ V be a sequence, let X be a subset of V and let v ∈ V. U.t.c.

(a) Convergence is equivalent, i.e.

vᵢ → v in ‖·‖ₐ ⇔ vᵢ → v in ‖·‖_b.

(b) The Cauchy nature of sequences is equivalent, i.e.

(vᵢ) is a Cauchy sequence in ‖·‖ₐ
⇔
(vᵢ) is a Cauchy sequence in ‖·‖_b.

(c) Completeness in the norm is equivalent, i.e.

(V,F) is complete in ‖·‖ₐ ⇔ (V,F) is complete in ‖·‖_b.

(d) Density of a subset is equivalent, i.e.

X is a dense subset of V in ‖·‖ₐ
⇔
X is a dense subset of V in ‖·‖_b.

44 Comment. As far as convergence questions are concerned, ‖·‖ₐ and ‖·‖_b lead to the same answer. In practice one prefers the norm which is most convenient for the problem at hand.

45 Fact [Tay.1, pp.55,62]. Let (V,F) be a finite-dimensional linear space; then any two norms on V are equivalent. Hence for F = R or C, any two norms on the linear spaces (Fⁿ,F) and (F^{m×n},F) are equivalent. •

46 Exercise. Consider the linear space (Fⁿ,F) of Example (2) with norms ‖·‖₁, ‖·‖₂, resp. ‖·‖_∞, given by (3)-(5). Show that

‖x‖_∞ ≤ ‖x‖₁ ≤ n‖x‖_∞  ∀ x ∈ Fⁿ,

‖x‖_∞ ≤ ‖x‖₂ ≤ √n·‖x‖_∞  ∀ x ∈ Fⁿ,

whence ‖·‖₁, ‖·‖₂ and ‖·‖_∞ are equivalent.

(Hint: use Schwarz's inequality Σ_{i=1}^n |xᵢ||yᵢ| ≤ ‖x‖₂‖y‖₂ …)

47 Exercise. Consider the function space C([t₀,t₁],Fⁿ) of Example (13), upon which, by choosing two norms ‖·‖ₐ and ‖·‖_b in Fⁿ, we define two norms:

‖f‖_{∞,a} := max{ ‖f(t)‖ₐ : t ∈ [t₀,t₁] }

and

‖f‖_{∞,b} := max{ ‖f(t)‖_b : t ∈ [t₀,t₁] }.

Show that the norms ‖·‖_{∞,a} and ‖·‖_{∞,b} are equivalent.
(Hint: observe first that, because of Fact (45), the norms ‖·‖ₐ and ‖·‖_b are equivalent.) (This exercise justifies why, in the definition of ‖f‖_∞ in (14), ‖f(t)‖ may be chosen to be any Fⁿ-norm.)

A6.4. The Lebesgue Spaces lᵖ and Lᵖ [Tay.1]

Again F = R or C and, for reasons of equivalence, ‖·‖ denotes any Fⁿ-norm.

49 lᵖ spaces. Consider an Fⁿ-valued sequence x = (xᵢ)₁^∞ with xᵢ ∈ Fⁿ ∀ i ∈ N.

Define for such sequences the following norms and sets:
a) for p ∈ [1,∞),

50  ‖x‖_p := ( Σ_{i=1}^∞ ‖xᵢ‖ᵖ )^{1/p};

b) for p = ∞,

51  ‖x‖_∞ := supᵢ { ‖xᵢ‖ };

c) for p ∈ [1,∞],

52  lᵖₙ := { x = (xᵢ)₁^∞ : ∀ i ∈ N, xᵢ ∈ Fⁿ and ‖x‖_p < ∞ }

is called the space of pth-power-summable sequences (1 ≤ p < ∞), resp. the space of bounded sequences (p = ∞). For n = 1 these spaces are denoted by lᵖ. •

53 Fact. For p ∈ [1,∞], (lᵖₙ,F,‖·‖_p) is a Banach space, i.e. a complete normed linear space.

54 Lᵖ spaces. Consider an Fⁿ-valued function f : I → Fⁿ : t ↦ f(t) that is measurable, where I = R, or R₊, or [t₀,t₁] ⊂ R. Define for such functions the following norms and sets:
a) for p ∈ [1,∞),

55  ‖f‖_p := ( ∫_I ‖f(t)‖ᵖ dt )^{1/p};

b) for p = ∞,

56  ‖f‖_∞ := ess sup_{t∈I} ( ‖f(t)‖ ) := inf{ K > 0 : μ[{ t ∈ I : ‖f(t)‖ > K }] = 0 },

where μ[Λ] and ess sup denote the measure of the set Λ ⊂ R and the essential supremum, resp. (the essential supremum is the least upper bound K > 0 s.t. ‖f(t)‖ ≤ K except on a set of measure zero). In the text we shall write "sup" for "ess sup."

57 Lᵖ(I,Fⁿ) := { f : f : I → Fⁿ is measurable and ‖f‖_p < ∞ } is called the space of pth-power-integrable functions on I, (1 ≤ p < ∞), resp. the space of essentially bounded† functions on I, (p = ∞). Lᵖ(I,Fⁿ) is also denoted Lᵖₙ(I) or simply Lᵖₙ. For n = 1 the latter are denoted Lᵖ(I) or Lᵖ.

† In the text we shall write "bounded" for "essentially bounded."

58 Fact. For p ∈ [1,∞], (Lᵖ(I,Fⁿ),F,‖·‖_p) is a Banach space, i.e. a complete normed linear space, if we identify two functions f and g that are equal almost everywhere, i.e. equal except on a set of measure zero.

59 Fact. The sets C([t₀,t₁],Fⁿ) and PC([t₀,t₁],Fⁿ) of continuous, resp. piecewise continuous, Fⁿ-valued functions on [t₀,t₁] are dense in the Banach spaces Lᵖ([t₀,t₁],Fⁿ) for p ∈ [1,∞).

60 Comment. In most applications, the integral in (55) may be taken to be a


Riemann integral.

A6.5. Continuous Linear Transformations

Continuous maps have the desirable property that a small perturbation in x results in a small perturbation in F(x). They are paramount in the study of robustness and the effect of perturbations of data.

66 Let F = R or C and consider two normed linear spaces (U,F,‖·‖_U) and (V,F,‖·‖_V). Let F be a map (or operator) s.t. F : U → V.
a) [Local continuity]. We say that F is continuous at the point u ∈ U iff, for every ε > 0, there exists a δ > 0, possibly depending on ε and u, s.t., considering points u′ ∈ U,

‖u′−u‖_U < δ ⇒ ‖F(u′)−F(u)‖_V < ε.

b) [Global continuity]. We say that F is continuous on U iff it is continuous at every u ∈ U.
c) [Induced (operator) norm]. The induced (operator) norm of F is defined to be

67  ‖F‖ := sup_{u≠θ_U} ( ‖F(u)‖_V / ‖u‖_U ).

The following is a useful criterion for local continuity using sequences.

68 Fact [Local sequential continuity]. Let U and V be as defined above and let F : U → V be a map. U.t.c. F is continuous at u ∈ U iff, for every sequence (uᵢ)₁^∞ ⊂ U s.t. uᵢ → u,
F(u) = lim_{i→∞} F(uᵢ).

69 Comment. Definition (66a) of local continuity is a special case of the usual ε-δ definition when U and V are metric spaces. Fact (68) remains valid for metric spaces.

Proof of Fact (68): Outline.

If: Let (uᵢ)₁^∞ ⊂ U be any sequence s.t. uᵢ → u. By the ε-δ definition of continuity, for every ε > 0 there exist δ > 0 and an integer I s.t. ∀ i ≥ I, ‖uᵢ−u‖ < δ, whence ‖F(uᵢ)−F(u)‖ < ε, i.e. F(uᵢ) → F(u).
Only if: If F is not continuous at u ∈ U, then ∃ ε > 0 and a sequence (uᵢ) s.t. ‖uᵢ−u‖ < 1/i, yet ‖F(uᵢ)−F(u)‖ > ε … •

For linear maps we have the following important facts.

70 Fact. Let F = R or C; let (U,F,‖·‖_U) and (V,F,‖·‖_V) be given normed spaces.
Let A : U → V be a linear map. U.t.c. the induced (operator) norm of A is given by

71  ‖A‖ := sup_{u≠θ} { ‖Au‖_V / ‖u‖_U } = sup{ ‖Au‖_V : ‖u‖_U = 1 }.

(Hint: Note that, since A is linear, the map u ↦ ‖Au‖_V/‖u‖_U is constant along each ray {αu : α ≠ 0}; hence "for each direction" the ratio has one value.)

72 Comment. Note that the induced norm ‖A‖ is the "maximum gain of the map A over all directions"; moreover, ‖A‖ depends on the choice of the norms ‖·‖_U and ‖·‖_V in the domain and the codomain, resp.

73 Example [Nob.1, p.153] [Induced matrix norms]. Consider the linear space (F^{m×n},F) of matrices of Example (6). Consider now a matrix A = (aᵢⱼ) ∈ F^{m×n} as a linear map 𝒜 : Fⁿ → Fᵐ : x ↦ Ax. For p = 1,2,∞ let us choose the norm ‖·‖_p given by (3)-(5), both in the domain Fⁿ and in the codomain Fᵐ, and let us consider the induced norm of the map 𝒜, viz.

74  ‖A‖_p := sup_{x≠θ} ( ‖Ax‖_p / ‖x‖_p )  for p = 1,2,∞. Then

10  ‖A‖₁ = max_{j∈n} Σ_{i=1}^m |aᵢⱼ|  (column sum),

11  ‖A‖₂ = max_{j∈n} [λⱼ(A*A)]^{1/2}  (singular values),

12  ‖A‖_∞ = max_{i∈m} Σ_{j=1}^n |aᵢⱼ|  (row sum).

This justifies the use of the name "induced norms" in Example (6).

75 Exercise. Show that ‖A‖_∞ defined by (74) is given by (12).
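The closed forms (10)-(12) can be compared against the defining supremum (74) by sampling; in the NumPy sketch below (data randomly generated for illustration), random sampling estimates the supremum from below, so the sampled ratio never exceeds the closed form, which `np.linalg.norm` computes via the column-sum, singular-value and row-sum formulas for p = 1, 2, ∞.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

for p in (1, 2, np.inf):
    closed = np.linalg.norm(A, p)            # formulas (10)-(12)
    xs = rng.standard_normal((3, 100000))    # many random directions x
    ratios = (np.linalg.norm(A @ xs, p, axis=0) /
              np.linalg.norm(xs, p, axis=0))
    print(p, closed, ratios.max())           # ratios.max() <= closed
```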


Induced nonns of linear maps satisfy some key properties which easily follow
from (71).

76 Theorem [Nob.1,p. 164]. Let (U,F,II·llu), (V,F,II·lIv), (W,F,II·lIw) be


nonned linear spaces and let A : V -+ W, resp. !i:
V -+ W and B : U -+ V be linear
maps. Note that 'V ae F, aA: V -+ W, resp. A +A: V -+ Wand AB : U -+ W are also
linear maps.
V.t.c. using induced nonns we have:

77 a) IIAvllw:S; IIA II IIvllv,

78 b) lloA II = lalliA II 'Vae F,

79 c) II A +.4 II :s; II A II + 11.4 II ,


80 d) IIA II =0 ¢:> A is the zero map,

81 e) IIAB II :s; IIA II liB II· I

We delay our comments until after the following useful fact.

82 Theorem [Rud.1, p.102]. Let (U,F,‖·‖_U) and (V,F,‖·‖_V) be two normed linear spaces and let A : U → V be a linear map. Then, by the linearity of A, the following three statements are equivalent:
a) A is continuous on U;
b) A is continuous at one point u ∈ U (for example, u = θ_U);
c) A has a finite induced norm, i.e. ‖A‖ < ∞. •

83 Comments. α) Note that (77) and (81) generalize the results of Exercise (16).
β) With U and V as given above, consider the set L(U,V) of continuous linear maps A : U → V, upon which we define addition and scalar multiplication by (A+B)v := Av + Bv ∀ v ∈ U and (αA)v := αAv ∀ α ∈ F, ∀ v ∈ U, resp. From Theorems (76) and (82) it follows at once that, under the induced norm (71), (L(U,V),F,‖·‖) is a normed linear space. Moreover, it is known [Tay.1, p.189] that if V is complete then L(U,V) is complete; i.e. Cauchy sequences of continuous linear maps converge to such a map in the induced norm.
γ) From Example (73) and Fact (45) we see that matrices have finite induced norms. Hence, by Theorem (82), matrices represent continuous linear maps.

84 Theorem [Extension of equalities]. Let (U,F) and (V,F) be normed spaces; let f and g map U into V.
If f and g are continuous (equivalently, for all convergent sequences (xᵢ) in U, lim f(xᵢ) = f(lim xᵢ)), and if f(x) = g(x) ∀ x in a dense subset D ⊂ U, then f(x) = g(x) ∀ x ∈ U.

Proof. Let x ∈ U∖D. Since D is dense, there is a sequence (xᵢ) ⊂ D such that xᵢ → x. By assumption, since each xᵢ ∈ D,

f(xᵢ) = g(xᵢ)  ∀ i = 1,2,…,

and by continuity

lim_{i→∞} f(xᵢ) = f( lim_{i→∞} xᵢ ) = f(x) = lim_{i→∞} g(xᵢ) = g( lim_{i→∞} xᵢ ) = g(x). •

84a Exercise. Let F = R or C. Show that

a) the set M_d := { A ∈ F^{n×n} : ∃ T s.t. TAT⁻¹ is diagonal } is dense in F^{n×n},
b) the set M′_d := { A ∈ F^{n×n} : A has n distinct eigenvalues } is dense in F^{n×n},
c) the set M_i := { A ∈ F^{n×n} : A is nonsingular } is dense in F^{n×n}.
We conclude this section with a remark on the solution of the equation Ax = b and on computer arithmetic.

85 Remark [Equation Ax = b] [Gol.1];[Sto.1].

In (A4.15) we considered the general linear equation Ax = b where the linear map A : (U,F) → (V,F). We consider now the case of n linear algebraic equations in n unknowns. More precisely, let F = R or C; let A ∈ F^{n×n}, b ∈ Fⁿ and x ∈ Fⁿ. Consider the equation

86  Ax = b.

We assume that det A ≠ 0 and b ≠ θₙ; hence (86) has a unique nonzero solution x₀ = A⁻¹b. Suppose that, as a result of noisy measurements, round-off errors, etc., we only have an approximate value A+δA for A and b+δb for b: thus we have a perturbed system of equations and, calling its solution x₀+δx, we write

87  (A+δA)(x₀+δx) = b+δb.

We wish to relate the size of δx to that of δA and δb.

First, we choose some norm ‖·‖ in Fⁿ, and we denote by ‖A‖ the corresponding induced norm of A.
Second, we assume that

88  ‖δA‖ ≪ ‖A‖ and ‖δb‖ ≪ ‖b‖

and wish to calculate an upper bound on ‖δx‖/‖x₀‖.


Neglecting the second-order term δA·δx, we obtain from (86) and (87) the approximate equation

δA·x₀ + A·δx = δb.

Calculating δx = A⁻¹(δb − δA·x₀), taking norms of both sides, and using the properties of the norm and of the induced norm, we obtain

‖δx‖ ≤ ‖A⁻¹‖ ( ‖δb‖ + ‖δA‖·‖x₀‖ ).

Dividing by ‖x₀‖ > 0 and noting that ‖b‖ ≤ ‖A‖·‖x₀‖, we obtain

89  ‖δx‖/‖x₀‖ ≤ ‖A‖·‖A⁻¹‖ [ ‖δb‖/‖b‖ + ‖δA‖/‖A‖ ].

The positive number

90  κ(A) := ‖A‖·‖A⁻¹‖

is called the condition number of A. Note that it depends on the norm chosen. However, regardless of the norm, κ(A) ≥ 1. In fact, κ(A) = 1 if A is unitary and if ‖·‖₂ is used.
Matrices for which κ(A) is large are called ill-conditioned; for such matrices, the solution is very sensitive to some small changes in A and in b. Since all norms on Fⁿ are equivalent, it can be shown that if a matrix is very ill-conditioned in some norm, it will also be ill-conditioned in any other norm.
Note that (89) gives an upper bound on ‖δx‖/‖x₀‖. In some cases the right-hand side gives a gross overestimate of the error.
The following exercise shows that small errors in A and b can lead to large errors in x when κ(A) is large.

92 Exercise. Let A = diag(1, 2·10⁻²) and δA = diag(0, −10⁻²). So, for p = 1,2,∞, ‖δA‖_p / ‖A‖_p = 10⁻². Let b = (1,2)ᵀ and δb = (0, 2·10⁻³)ᵀ; hence ‖δb‖_p / ‖b‖_p ≈ 10⁻³. Check that x₀ = (1, 10²)ᵀ and x₀ + δx = (1, 2.002·10²)ᵀ. Evaluate both sides of (89).
Note that even though the relative errors on A and b are small, the perturbation doubles the size of ‖x₀‖.
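The numbers of the exercise can be reproduced directly; a Python/NumPy sketch (`np.linalg.cond` computes ‖A‖·‖A⁻¹‖ for the chosen norm — here the 2-norm, for which κ₂(A) = 50 with this A):

```python
import numpy as np

A  = np.diag([1., 2e-2]);  dA = np.diag([0., -1e-2])
b  = np.array([1., 2.]);   db = np.array([0., 2e-3])

x0 = np.linalg.solve(A, b)             # (1, 100)
x1 = np.linalg.solve(A + dA, b + db)   # (1, 200.2): ||x|| roughly doubles
kappa = np.linalg.cond(A, 2)           # 50.0
print(x0, x1, kappa)
```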

93 Remark on computer arithmetic [Gol.1];[Sto.1].

Most computers represent numbers in the following floating-point binary format:

94  x = ±.d₁d₂⋯d_t × 2ᵉ,

first a sign bit, then t binary digits forming the mantissa, and the integer e is the exponent. Typically, e may be any integer in [L,U] with L of the order of −10³ and U of the order of 10³. Thus there is a largest possible positive number and a smallest positive number that the computer can handle: they are

95  +.11…1 × 2^U and +.00…01 × 2^L, resp.

If, in the course of a computation, the result is larger than the first we have overflow, and if it is smaller than the second we have underflow; in both cases calculations stop.
In the following we assume that neither overflow nor underflow occurs. In the process of representing a real number x (say π, e, 1/3, log 2, …) by a sequence of t binary digits as in (94) above, an error usually occurs; clearly, the best choice is for the computer to choose the number of the form (94) that is closest to x. This number is denoted by fl(x). Clearly,

96  fl(x) = x(1+ε), where |ε| ≤ 2⁻ᵗ =: εₘ.

The number εₘ is called the machine ε. For scientific computers, t is typically 48; then 2⁻ᵗ ≈ 3.55×10⁻¹⁵. Note that replacing x by fl(x) causes a very small relative error.

Multiplication. Given two real numbers x and y, we obtain first their binary representations, then multiply these representations and round off the result: fl(x×y) := fl(fl(x)×fl(y)), i.e.

fl(x×y) = [x(1+ε₁)·y(1+ε₂)](1+ε₃), with |εᵢ| ≤ εₘ.

Neglecting the terms of the form εⱼεₖ, we obtain

fl(x×y) = x·y·(1+ε₁+ε₂+ε₃).

So the relative error is at most 3εₘ. Hence computer multiplication is subject to very small relative errors. It is easy to check that the same conclusion holds for division.

Addition. Reasoning as before, we have fl(x+y) := fl[fl(x)+fl(y)], so

fl(x+y) = [x(1+ε₁)+y(1+ε₂)](1+ε₃).

Again neglecting terms in εⱼεₖ, we obtain

fl(x+y) = (x+y) [ 1 + (xε₁+yε₂)/(x+y) + ε₃ ].

Here the relative error may be considerable: consider x > 0, y = −x+d with 0 < d ≪ x and ε₁ = −ε₂ = ε₃ = εₘ; then the relative error is, neglecting higher-order terms, approximately (2x/d)εₘ.
Note that in this case the round-off errors in x and y, namely ε₁ and ε₂, are amplified by the ratio x/d, which is very large. When this happens we say that there is a catastrophic cancellation. The conclusion is: computer addition may involve very large relative errors.
Finally, note that computer addition is not necessarily associative and that, in the course of the computation, valuable data may be rounded off. For example, let a = 1, b = −1, c = 0.12345×10⁻⁵. Carrying five decimal digits in the computation, we see that fl((a+b)+c) = 0.12345×10⁻⁵ but fl(a+(b+c)) = 0. It is for this reason that it is important to scale problems and to use normalized data so that, say, in evaluating polynomials or scalar products, one does not add and subtract numbers whose magnitudes are very different.
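Both phenomena are easy to reproduce in IEEE-754 double precision (Python floats); the specific numbers below are illustrative choices.

```python
# Catastrophic cancellation: the "+ 1.0" is lost when y is rounded.
x, y = 1e16, -1e16 + 1.0
print(x + y)                  # 0.0, although the exact answer is 1.0

# Non-associativity of computer addition.
a, b, c = 1.0, -1.0, 1.2345e-6
print((a + b) + c)            # 1.2345e-06
print(a + (b + c))            # slightly wrong in the last digits:
                              # b + c already rounded away part of c
```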

A7. The Adjoint of a Linear Map


The purpose of this section is to discuss inner product spaces, adjoint linear maps
and their properties.

A7.1. Inner Products

Norms add the notion of length to a linear space; inner products add the notion of angle.

Inner Product Spaces and Hilbert Spaces

1 Let F be R or C. Let (H,F) be a linear space. The function ⟨·,·⟩ : H×H → F : (x,y) ↦ ⟨x,y⟩ is called an inner product iff
(a) ⟨x,y+z⟩ = ⟨x,y⟩ + ⟨x,z⟩  ∀ x,y,z ∈ H,
(b) ⟨x,αy⟩ = α⟨x,y⟩  ∀ α ∈ F, ∀ x,y ∈ H,

2 (c) ‖x‖² := ⟨x,x⟩ > 0  ∀ x ∈ H s.t. x ≠ θ_H,

(d) ⟨x,y⟩ = ⟨y,x⟩‾,
where the overbar denotes the complex conjugate of ⟨y,x⟩.

3 A linear space (H,F) equipped with an inner product, viz. the triple (H,F,⟨·,·⟩), is said to be an inner product space. The norm ‖·‖ defined by (2) is said to be the norm defined by the inner product. It is a norm because of Schwarz's inequality.

4 Schwarz's Inequality. Let (H,F,⟨·,·⟩) be an inner product space. U.t.c.

5  |⟨x,y⟩| ≤ ‖x‖·‖y‖  ∀ x,y ∈ H.

Comments. α) This inequality implies the triangle inequality for the norm defined by (2); (prove it).
β) For F = R the angle between two vectors x,y ∈ H is defined by

6  cos θ := ⟨x,y⟩ / (‖x‖·‖y‖)  (0 ≤ θ ≤ π).

γ) For F = R,

⟨x+y,x+y⟩ − ⟨x−y,x−y⟩ = 4⟨x,y⟩.

Proof of (5). Choose α ∈ F, with |α| = 1, s.t. α⟨x,y⟩ = |⟨x,y⟩|. Then, for all λ ∈ R, one obtains, after some calculations,

0 ≤ ⟨λᾱx+y, λᾱx+y⟩ = λ²‖x‖² + 2λ|⟨x,y⟩| + ‖y‖².

Since this last polynomial in λ is nonnegative, its discriminant is nonpositive, and (5) follows.

7 An inner product space (H,F,⟨·,·⟩) that is complete in the norm defined by the inner product is called a Hilbert space.

8 Example. (Fⁿ,F,⟨·,·⟩) is a Hilbert space under the inner product

9  ⟨x,y⟩ := Σ_{i=1}^n x̄ᵢyᵢ =: x*y,

where x = (xᵢ)₁ⁿ, y = (yᵢ)₁ⁿ, and x* is the Hermitian transpose of x. The inner product norm is the Euclidean norm.

10 Example. Let L²([t₀,t₁],Fⁿ) be the Banach space of square-integrable Fⁿ-valued functions on [t₀,t₁], (A6.57). Let f and g be two such functions and define the L²-inner product by

11  ⟨f,g⟩₂ := ∫_{t₀}^{t₁} f(t)*g(t) dt,

where f(t) and g(t) belong to Fⁿ.

(L²([t₀,t₁],Fⁿ),F,⟨·,·⟩₂) is a Hilbert space where i) the inner product norm is the L² norm, (A6.55), and ii) two functions f and g are identified if f and g are equal almost everywhere, (see Fact (A6.58)).

12 Example. Let C([t₀,t₁],Fⁿ) and PC([t₀,t₁],Fⁿ) be the linear spaces over F of Fⁿ-valued continuous, resp. piecewise continuous, functions on [t₀,t₁] ⊂ R. On both spaces define the L²-inner product given by (11). The linear spaces (C([t₀,t₁],Fⁿ),F,⟨·,·⟩₂) and (PC([t₀,t₁],Fⁿ),F,⟨·,·⟩₂) are inner product spaces that are dense in the Hilbert space (L²([t₀,t₁],Fⁿ),F,⟨·,·⟩₂) in the L²-norm, (see Fact (A6.59)). •

13 Comment. For all practical purposes inner products over (piecewise) continuous functions on [t₀,t₁] may be replaced by inner products over square-integrable functions; indeed the latter can be approximated by the former because of the following fact.

14 Fact [Tay.1, p.75] [Continuity of the inner product]. Let (H,F,⟨·,·⟩) be an inner product space and let ‖·‖ denote the inner product norm.
U.t.c. the inner product (x,y) ↦ ⟨x,y⟩ ∈ F is a continuous function, i.e. for all sequences (xᵢ)₁^∞ ⊂ H and (yⱼ)₁^∞ ⊂ H s.t. ‖xᵢ−x‖ → 0 as i → ∞ and ‖yⱼ−y‖ → 0 as j → ∞, then ⟨xᵢ,yⱼ⟩ → ⟨x,y⟩. •

Hint: by Schwarz's inequality, and adding and subtracting terms, one has

|⟨xᵢ,yⱼ⟩ − ⟨x,y⟩| ≤ ‖xᵢ−x‖·‖yⱼ‖ + ‖x‖·‖yⱼ−y‖ …

Orthogonality, Annihilation and Orthogonal Complements

15 Let (H,F,⟨·,·⟩) be an inner product space. Two vectors x,y ∈ H are said to be orthogonal iff ⟨x,y⟩ = 0. In that case

16  ‖x+y‖² = ‖x‖² + ‖y‖²  (Pythagoras' Theorem).

17 Exercise [Annihilation property]. Let (H,F,⟨·,·⟩) be an inner product space and let Y be a dense subset of H under the inner product norm. Then ∀ x ∈ H

x = θ_H ⇔ ⟨x,y⟩ = 0  ∀ y ∈ H

18  ⇔ ⟨x,y⟩ = 0  ∀ y ∈ Y ⊂ H.

(Hint: for the last equivalence use Fact (14).)

19 Let (V,F,‖·‖) be a normed linear space. A subset F ⊂ V is closed iff every v ∈ V, for which there is a sequence in F that converges to v, belongs to F. A subset G ⊂ V is open iff it is the complement of a closed set, equiv. Gᶜ := V∖G is closed.

20 Fact. A finite-dimensional subspace of a normed linear space is closed; (prove it).

21 Let M be a subspace of an inner product space (H,F,⟨·,·⟩). The subset

22  M⊥ := { y ∈ H : ⟨x,y⟩ = 0 ∀ x ∈ M }

is called the orthogonal complement of M.

23 Fact. M⊥ is a closed linear subspace of H s.t. M ∩ M⊥ = {θ}.
(Hints: linearity is obvious; let (vᵢ) ⊂ M⊥ s.t. vᵢ → v ∈ H; by the continuity of ⟨·,·⟩, (14), ∀ x ∈ M, ⟨x,v⟩ = 0 …; ⟨v,v⟩ = 0 ⇒ v = θ.)

23a Gram-Schmidt Orthogonalization.

Let the finite-dimensional subspace M have a basis (aᵢ)₁ᵐ; then it has an orthonormal basis (bᵢ)₁ᵐ, i.e.

⟨bᵢ,bₖ⟩ = δᵢₖ  ∀ i,k ∈ m.

Such an orthonormal basis can be obtained from (aᵢ)₁ᵐ as follows: start with

b₁ := a₁/‖a₁‖.

The idea is to proceed step by step. Suppose we have obtained b₁,b₂,…,b_{k−1} using a₁,a₂,…,a_{k−1}; in other words, (bᵢ)₁^{k−1} are orthonormal and span Sp[(aᵢ)₁^{k−1}]. To obtain bₖ we first compute from aₖ a vector uₖ orthogonal to the bᵢ's for i ∈ k−1:

uₖ := aₖ − Σ_{i=1}^{k−1} ⟨bᵢ,aₖ⟩bᵢ.

By calculation, uₖ is orthogonal to bᵢ for all i ∈ k−1. The last step is to normalize uₖ:

bₖ := uₖ/‖uₖ‖.

(Note that ‖uₖ‖ > 0, for otherwise aₖ would be a linear combination of (bᵢ)₁^{k−1}, or equivalently of (aᵢ)₁^{k−1}, which is impossible since (aᵢ)₁ᵏ is a linearly independent family.) Thus the procedure leads to an orthonormal basis (bᵢ)₁ᵐ of the subspace M.
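A direct transcription of the procedure into Python (NumPy) for (Cⁿ, ⟨x,y⟩ = x*y); the columns of the input matrix are assumed linearly independent, as in (23a), and the small test matrix is an illustrative choice.

```python
import numpy as np

def gram_schmidt(A):
    B = []
    for a in A.T:                                    # the a_k's, in order
        u = a - sum((b.conj() @ a) * b for b in B)   # subtract projections on b_i
        B.append(u / np.linalg.norm(u))              # normalize (u != 0)
    return np.column_stack(B)                        # columns are the b_k's

A = np.array([[1., 1.], [1., 0.], [0., 1.]])
B = gram_schmidt(A)
print(np.allclose(B.conj().T @ B, np.eye(2)))        # True: orthonormal
```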

Direct Sums and Orthogonal Projections

24 Let M and N be two subspaces of the linear space (V,F). The sum of the two subspaces M and N, viz. M+N = { u+v : u ∈ M, v ∈ N }, is called a direct sum iff M ∩ N = {θ}. The direct sum is denoted by M ⊕ N. •

25 Fact. V = M ⊕ N if and only if ∀ x ∈ V there is a unique decomposition x = u+v s.t. u ∈ M and v ∈ N; (prove it).

26 Theorem [Orthogonal projection] [Rud.1, pp.83-84]. Let M be a closed subspace of a Hilbert space (H,F,⟨·,·⟩). Then we have the direct sum decomposition

27  H = M ⊕ M⊥.

28 Equivalently, ∀ x ∈ H ∃! y ∈ M, called the orthogonal projection of x onto the subspace M, s.t.

x − y ∈ M⊥

(equiv., in the inner product norm, ‖x−y‖ = inf{ ‖x−u‖ : u ∈ M }).

29 Comments. α) It is crucial that M be closed. For example, take H = L²([0,1],R) and M = C([0,1],R), its subspace of continuous functions, which is dense in L²([0,1],R) in the L²-norm, (12). Now M is not a closed subspace (otherwise C = L²),

and with x ∈ L² given by the step function

x(t) := 0 for t ∈ [0,½),  x(t) := 1 for t ∈ [½,1],

we have inf{ ‖x−u‖ : u ∈ M } = 0 (M is dense!), but there is no y ∈ M (continuous) s.t. ‖x−y‖ = 0, i.e. x = y: x has no orthogonal projection on M.
β) In most applications M⊥ equals another previously labeled subspace, say N. In that case (27) is denoted

30  H = M ⊕ N

and we say that H has an orthogonal and direct sum decomposition.

30a Exercise. Let (H,C,⟨·,·⟩) be a Hilbert space and M be a (closed) subspace specified by an orthonormal basis (uᵢ)₁ᵏ. For any x ∈ H, let x_p be the orthogonal projection of x on M. By direct calculation show that:

(a) x_p = Σ_{i=1}^k ⟨uᵢ,x⟩uᵢ,
(b) ‖x−x_p‖ < ‖x−y‖, ∀ y ∈ M with y ≠ x_p
(Hint: write y = Σ_{i=1}^k αᵢuᵢ and minimize ‖x−y‖² with respect to the αᵢ's …),
(c) if H = Cⁿ and the basis (uᵢ)₁ᵏ is not orthonormal, then x_p = U(U*U)⁻¹U*x, where U is the n×k matrix whose columns are the uᵢ's, i = 1,2,…,k.
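Part (c) in NumPy form (a sketch with randomly generated data of our own choosing); `np.linalg.solve` applies (U*U)⁻¹ without forming the inverse explicitly.

```python
import numpy as np

rng = np.random.default_rng(3)
U = rng.standard_normal((5, 2))    # columns span M (linearly independent)
x = rng.standard_normal(5)

xp = U @ np.linalg.solve(U.T @ U, U.T @ x)   # x_p = U (U*U)^{-1} U* x
print(np.allclose(U.T @ (x - xp), 0))        # True: x - x_p is orthogonal to M
```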

30b Exercise. Let F = R or C. Consider the Hilbert space (Fⁿ,F,⟨·,·⟩) defined by (8). Let S and T be arbitrary linear subspaces of Fⁿ. (Since S and T are finite dimensional, they are necessarily closed.) Show that:
a) (S⊥)⊥ = S,
b) S ⊂ T ⇔ T⊥ ⊂ S⊥,
c) (S+T)⊥ = S⊥ ∩ T⊥,
d) (S∩T)⊥ = S⊥ + T⊥.

A7.2. Adjoints of Continuous Linear Maps

Adjoints are instrumental in understanding controllability, observability and duality.

31 Let F be R or C. Let (U,F,⟨·,·⟩_U) and (V,F,⟨·,·⟩_V) be Hilbert spaces, i.e. complete inner product spaces. Let A : U → V be a continuous linear map. Then the adjoint of A, denoted by A*, is the linear map A* : V → U s.t.

⟨v,Au⟩_V = ⟨A*v,u⟩_U  ∀ u ∈ U, ∀ v ∈ V.

32 Example. U = Fⁿ, V = Fᵐ; 𝒜 is given by the matrix A = (aᵢⱼ) ∈ F^{m×n}:

⟨y,Ax⟩_{Fᵐ} = y*Ax = Σ_{i=1}^m ȳᵢ Σ_{j=1}^n aᵢⱼxⱼ = (A*y)*x = ⟨A*y,x⟩_{Fⁿ}.

Hence A* = (āⱼᵢ) ∈ F^{n×m}, the Hermitian transpose of A, is the adjoint of 𝒜.

33 Example. U = (L²([t₀,t₁],Fᵐ),F,⟨·,·⟩₂) and V = Fⁿ; A : U → V is the linear map defined by

35  Au := ∫_{t₀}^{t₁} G(t)u(t) dt,

where

36  G(·) : [t₀,t₁] → F^{n×m} with ∫_{t₀}^{t₁} ‖G(t)‖² dt < ∞.

It follows easily, by using Schwarz's inequality, that A has a finite induced norm, so A is continuous, (A6.82). The adjoint A* : Fⁿ → L²([t₀,t₁],Fᵐ) is given by

37  (A*v)(t) = G*(t)v  for t ∈ [t₀,t₁],

where G*(t) ∈ F^{m×n} is the Hermitian transpose of G(t). Indeed, for any v ∈ Fⁿ,

⟨v,Au⟩_{Fⁿ} = ∫_{t₀}^{t₁} v*G(t)u(t) dt = ∫_{t₀}^{t₁} (G(t)*v)*u(t) dt = ⟨A*v,u⟩₂.

38 Comment. In the example above, A is the unique continuous extension of the continuous linear map

39  A : PC([t₀,t₁],Fᵐ) → Fⁿ : u ↦ Au = ∫_{t₀}^{t₁} G(t)u(t) dt,

where G(·) satisfies (36): indeed, since PC([t₀,t₁],Fᵐ) is dense in L²([t₀,t₁],Fᵐ) in the L²-norm, (see Example (12)), for every u ∈ L²([t₀,t₁],Fᵐ) there exists a sequence (uᵢ) ⊂ PC([t₀,t₁],Fᵐ) s.t. uᵢ → u and Au = lim Auᵢ. We shall therefore also call A* the adjoint of A.

39a Exercise. Let (U,F) and (V,F) be two normed linear spaces. Show that if (U,F) is finite dimensional, then any linear map A : U → V is continuous.
(Hint: if (bᵢ)₁ⁿ ⊂ U is a basis of U, then u ∈ U iff u = Σ_{i=1}^n αᵢbᵢ, and Au = Σ_{i=1}^n αᵢAbᵢ … Compare with (37).)

40 Comment. If U is not finite-dimensional then not all linear maps A : U → V are continuous.
To wit, let (U,ℝ) be the linear space of polynomials with coefficients in ℝ, for which we choose the norm ||u||_∞ = max { |u(t)| : t ∈ [0,1] }, and let V = ℝ. Consider the linear evaluation map A : U → ℝ defined by

Au = u(4).

Consider then the sequence (u_k) where u_k(t) := (t/2)ᵏ. Clearly, in terms of the norm defined above, as k → ∞, u_k → θ_U, the zero polynomial. However, Au_k = u_k(4) = 2ᵏ → ∞. Hence A is not continuous. ∎
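A short numerical illustration of this comment (a sketch; the grid size is arbitrary):

import numpy as np

# u_k(t) = (t/2)**k: its sup-norm on [0,1] is (1/2)**k -> 0,
# yet A u_k = u_k(4) = 2**k -> infinity.
t = np.linspace(0.0, 1.0, 1001)
for k in (1, 5, 10, 20):
    sup_norm = np.max(np.abs((t / 2.0) ** k))
    print(k, sup_norm, 2.0 ** k)   # norm -> 0 while A u_k blows up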

Self-Adjoint Maps

41 Let (H,F,⟨·,·⟩_H) be a Hilbert space and let A : H → H be a continuous linear map with adjoint A* : H → H. We say that the map A is self-adjoint iff A = A*, or equivalently,

⟨x, Ay⟩_H = ⟨Ax, y⟩_H  ∀x,y ∈ H.

42 Example [Hermitian matrices]. Let H = Fⁿ and let A be represented by a matrix A = (aᵢⱼ)ᵢ,ⱼ∈n : Fⁿ → Fⁿ. Then A is self-adjoint iff the matrix A is Hermitian, or equivalently, A = A*, viz. aᵢⱼ = āⱼᵢ ∀i,j ∈ n.

(Hint: the latter condition is equivalent to: ∀x,y ∈ Fⁿ,

⟨x, Ay⟩_{Fⁿ} = ⟨Ax, y⟩_{Fⁿ}.)
43 Example [Integral operator]. Let H = L²(Δ,F) where Δ = [t₀,t₁] ⊂ ℝ. Let k(·,·) : Δ×Δ → F : (t,τ) ↦ k(t,τ) be s.t.

K := ∫∫_{Δ×Δ} |k(t,τ)|² dt dτ < ∞

and

k(t,τ) = k̄(τ,t)  ∀(t,τ) ∈ Δ×Δ.

Then the linear map A : H → H s.t.

(Au)(t) = ∫_Δ k(t,τ)u(τ)dτ  ∀t ∈ Δ

is well defined, continuous and self-adjoint.

(Hint: ||Au||² ≤ K||u||² and ⟨v,Au⟩ = ⟨Av,u⟩ ∀u,v ∈ L²(Δ,F).)

44 Properties of self-adjoint maps. Let A : H → H be a continuous linear map and let A be self-adjoint; then
(a) all eigenvalues† of A are real;
(b) if λᵢ and λₖ are distinct eigenvalues of A, then any eigenvector vᵢ associated with λᵢ is orthogonal to any eigenvector vₖ associated with λₖ.

Proof. (a) From Avᵢ = λᵢvᵢ and A* = A, we obtain successively

⟨vᵢ, Avᵢ⟩ = λᵢ||vᵢ||² = ⟨Avᵢ, vᵢ⟩ = ⟨vᵢ, Avᵢ⟩‾

(where in the last RHS, the overbar indicates the complex conjugate). It follows that λᵢ is real.
(b) From Avⱼ = λⱼvⱼ and Avₖ = λₖvₖ we obtain

λⱼ⟨vₖ, vⱼ⟩ = ⟨vₖ, Avⱼ⟩ = ⟨Avₖ, vⱼ⟩ = λₖ⟨vₖ, vⱼ⟩.

Subtracting the last two equations we get (λⱼ−λₖ)⟨vₖ,vⱼ⟩ = 0; since λⱼ ≠ λₖ, it follows that ⟨vₖ,vⱼ⟩ = 0. ∎

† A complex number λ ∈ ℂ is called an eigenvalue of A iff there exists a nonzero element v ∈ H such that Av = λv; v is called an eigenvector associated with λ.


A7.3. Properties of the Adjoint

45 Fact. Let A be a linear continuous map from U into V, where U and V are Hilbert spaces. Then

46 A* is linear and continuous,

with induced norm

47 ||A*|| = ||A||.

Proof. By definition A* is linear s.t.

⟨v, Au⟩_V = ⟨A*v, u⟩_U  ∀u ∈ U, ∀v ∈ V.

Hence by Schwarz's inequality

|⟨A*v, u⟩_U| ≤ ||A|| ||u|| ||v||.

Hence, choosing ||v|| = 1 and u = A*v, one obtains

||A*|| := sup { ||A*v|| : ||v|| = 1 } ≤ ||A|| < ∞.

Hence A* is continuous. In similar fashion one obtains ||A|| ≤ ||A*||, whence ||A*|| = ||A||. ∎

48 Fact. Let U, V, W be Hilbert spaces over F = ℝ or ℂ. Let A : U → V, Ā : U → V, and B : V → W be linear continuous maps. Then
(a) (A+Ā)* = A* + Ā*;
(b) (αA)* = ᾱA*  ∀α ∈ F;
(c) (BA)* = A*B*;
(d) A** = A.

Proof. (a)-(b) are direct consequences of the definition of the adjoint and the properties of the inner product.
(c) Using the definition of the adjoint we have, ∀u ∈ U and ∀w ∈ W,

⟨(BA)*w, u⟩_U = ⟨w, (BA)u⟩_W = ⟨w, B(Au)⟩_W = ⟨B*w, Au⟩_V = ⟨A*B*w, u⟩_U.

Hence ⟨(BA)*w − A*B*w, u⟩_U = 0, ∀u ∈ U, and by the Annihilation property, (18),

(BA)*w = A*B*w  ∀w ∈ W.

(d) In similar fashion, for all u ∈ U and v ∈ V,

⟨v, Au⟩_V = ⟨A*v, u⟩_U = ⟨v, A**u⟩_V.

So ⟨v, Au − A**u⟩_V = 0 for all v ∈ V. Hence

Au = A**u  for all u ∈ U. ∎

A7.4. The Finite Rank Operator Fundamental Lemma

The following lemma is crucial for the study of controllability and leads to the singular value decomposition of a matrix.

55 Lemma [Finite rank operators]. Let F = ℝ or ℂ. Let (H,F,⟨·,·⟩_H) be a Hilbert space and consider Fᵐ as the Hilbert space (Fᵐ,F,⟨·,·⟩_{Fᵐ}). Let A : H → Fᵐ be a continuous linear map with adjoint A* : Fᵐ → H. U.t.c.

56 AA* : Fᵐ → Fᵐ and A*A : H → H

are continuous linear maps with AA* and A*A self-adjoint. Furthermore (see Fig. A7):

57 a) H = R(A*) ⊕ N(A),  Fᵐ = R(A) ⊕ N(A*);

58 b) the restriction A|_{R(A*)} is a bijection of R(A*) onto R(A), and

59 N(AA*) = N(A*),  R(AA*) = R(A);

60 c) the restriction A*|_{R(A)} is a bijection of R(A) onto R(A*), and

61 N(A*A) = N(A),  R(A*A) = R(A*).

Fig. A7. The orthogonal decomposition of the domain and codomain of a finite rank operator A : H → Fᵐ and its associated bijections.

Comments. α) Conclusions (a) and (b) of the lemma above display the following: let P : H → R(A*) be the orthogonal projection of H onto R(A*) and J : R(A) → Fᵐ be the natural injection of R(A) into Fᵐ. U.t.c. the map A is depicted by the commutative diagram of Fig. A8, where A|_{R(A*)} is a bijection of R(A*) onto R(A). Note that, modulo restriction in the domain and codomain, A becomes bijective. A similar diagram can be drawn for the adjoint A* (do it).

β) It is crucial to notice that AA* : Fᵐ → Fᵐ and hence, by (59), the study of the range of A and the null space of A* is equivalent to the study of the range and null space, resp., of any (Hermitian) matrix representation M of AA*; (cf. controllability ...).

γ) [Tay.1,p.244]. If A : H → V, where V is a Hilbert space, then the lemma statement applies with all range spaces replaced by their closures.

Proof of Lemma (55). A* is continuous by Fact (45). AA* and A*A are obviously self-adjoint and they are continuous as compositions of continuous maps. Hence (56) holds. Moreover:
a) R(A*) ⊂ H and R(A) ⊂ Fᵐ are finite-dimensional subspaces (for R(A*) note that dim R(A*) ≤ dim Domain(A*) = m). Hence, by Fact (20), both R(A*) and R(A) are closed subspaces of H and Fᵐ, resp. Therefore, by the orthogonal projection Theorem (26):

H = R(A*) ⊕ R(A*)⊥ and Fᵐ = R(A) ⊕ R(A)⊥.

Fig. A8. The commutative diagram of a finite rank operator A : H → Fᵐ.


We claim that R(A)⊥ = N(A*). Indeed,

x ∈ R(A)⊥ ⟺ 0 = ⟨x, Ay⟩_{Fᵐ} = ⟨A*x, y⟩_H ∀y ∈ H ⟺ x ∈ N(A*).

Now R(A*)⊥ = N(A) follows similarly. Hence (57) follows.


b) A|_{R(A*)} is obviously onto R(A) by the first part of (57). We claim that A|_{R(A*)} is one-one on R(A*), equiv., N(A|_{R(A*)}) = {θ}. To see this, let y ∈ N(A|_{R(A*)}); then Ay = θ and, for some x ∈ Fᵐ, y = A*x; thus y ∈ N(A) ∩ R(A*) = {θ} by the first part of (57). Hence A|_{R(A*)} is one-one onto R(A) and (58) follows.
To establish the first part of (59), note that
1) ⟨x, AA*x⟩ = ||A*x||² = 0 ⟹ A*x = θ, hence N(AA*) ⊂ N(A*), and
2) A*x = θ ⟹ AA*x = θ, hence N(AA*) ⊃ N(A*). So the first part of (59) is established.
For the second part of (59), let, for any set S, A[S] denote the image of S under A. Then R(A) = A[H] = A[R(A*)] = R(AA*) (in the second equality we used the first part of (57)). Hence the second part of (59) follows.
c) (60) and (61) are established similarly using the second part of (57). ∎
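The conclusions (57), (59) and (61) can be spot-checked numerically when H = Fⁿ; a minimal sketch (Python/numpy, random rank-r data, not part of the proof):

import numpy as np

rng = np.random.default_rng(2)
m, n, r = 4, 6, 2
# A random rank-r matrix A : C^n -> C^m (here H = C^n).
A = (rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))) \
    @ (rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n)))
As = A.conj().T

# (59) and (61): N(AA*) = N(A*), R(A*A) = R(A*) imply equal ranks.
assert np.linalg.matrix_rank(A) == r
assert np.linalg.matrix_rank(A @ As) == r
assert np.linalg.matrix_rank(As @ A) == r

# (57): N(A) is the orthogonal complement of R(A*); an orthonormal basis
# of N(A) is given by the last n - r right singular vectors of A.
_, _, Vh = np.linalg.svd(A)
N_A = Vh[r:, :].conj().T                   # columns span N(A)
assert np.allclose(A @ N_A, 0)             # indeed in the null space
assert np.allclose(N_A.conj().T @ As, 0)   # and orthogonal to R(A*)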

A7.5. Singular Value Decomposition (SVD)

Let F = ℝ or ℂ.

65 A family (uᵢ)₁ⁿ of vectors in a Hilbert space (H,F,⟨·,·⟩) is said to be orthonormal iff ⟨uᵢ,uⱼ⟩ = δᵢⱼ ∀i,j ∈ n, where δᵢⱼ equals 0 if i ≠ j and 1 if i = j (equiv., the vectors are mutually orthogonal and their norm is 1); such a finite family is called complete iff it is a basis of H (in that case H is finite-dimensional).

From now on we consider matrices.

66 A matrix U ∈ F^{n×n} is said to be unitary iff U*U = UU* = Iₙ (equiv., the n columns and the n rows of U form orthonormal bases of Fⁿ). If F = ℝ such a matrix is called orthogonal.

67 Unitary matrices preserve the inner product and hence length; more precisely, if U ∈ F^{n×n} is unitary, then ∀x,y ∈ Fⁿ, ⟨Ux,Uy⟩_{Fⁿ} = ⟨x,y⟩ and hence ||Ux||₂ = ||x||₂, (prove it).
(prove it).
Orthogonal matrices represent rotations in ℝⁿ, [Nob.1,p.283].

70 Recall that a matrix A ∈ F^{n×n} is called Hermitian iff A = A*; moreover, such a matrix A is said to be positive semidefinite iff ⟨x,Ax⟩_{Fⁿ} ≥ 0 ∀x ∈ Fⁿ. Hermitian positive semidefinite matrices are associated with nonnegative quadratic forms [Nob.1,p.425].

71 Exercise. Show that ⟨x,Ax⟩ is real for all x ∈ ℂⁿ if and only if A is Hermitian.

72 Fact [Eigenvalues and eigenvectors of Hermitian matrices]. [Nob.1,pp.306,314,426]. Let A ∈ F^{n×n} be a Hermitian matrix.
U.t.c.
a) A has n real eigenvalues (not necessarily distinct);
b) Fⁿ has an orthonormal basis of eigenvectors of A, equiv., there exists a complete orthonormal family of eigenvectors of A, say (uᵢ)₁ⁿ. As a consequence, if P ∈ F^{n×n} is the unitary matrix whose columns are those eigenvectors, then

P*AP = Λ and A = PΛP*,

where Λ is a real diagonal matrix whose diagonal elements are the corresponding eigenvalues λᵢ of A; in dyadic form the last equation reads

73 A = Σᵢ₌₁ⁿ λᵢuᵢuᵢ*;

c) A is positive definite (equiv., ⟨x,Ax⟩ > 0, ∀x ≠ θ in ℂⁿ) ⟺ λᵢ > 0, ∀i ∈ n;

A is positive semidefinite ⟺ λᵢ ≥ 0, ∀i ∈ n;

d) let the (not necessarily distinct) eigenvalues be ordered as λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ; then

max_{||x||=1} ⟨x,Ax⟩ = λ₁,  min_{||x||=1} ⟨x,Ax⟩ = λₙ.

Proof of (a) and (b). From (44), we know that A, being Hermitian, has only real eigenvalues and that if λᵢ ≠ λₖ, then any eigenvector of λᵢ is orthogonal to any eigenvector of λₖ.
We establish the existence of a basis of orthonormal eigenvectors by using induction.

Step I. Let λ₁ be any eigenvalue of A (i.e. a zero of det(λI−A)) and let u₁ be an eigenvector associated with λ₁, normalized by ||u₁|| = 1. Call M₁ the (n−1)-dimensional subspace of all vectors of Fⁿ orthogonal to u₁:

M₁ := { z ∈ Fⁿ : ⟨z,u₁⟩ = 0 }.

We claim that M₁ is a subspace invariant under A. Indeed, let z ∈ M₁; hence ⟨z,u₁⟩ = 0. Consequently,

0 = λ₁⟨z,u₁⟩ = ⟨z,λ₁u₁⟩ = ⟨z,Au₁⟩ = ⟨Az,u₁⟩,

where in the last step we used A* = A. So z ∈ M₁ ⟹ Az ∈ M₁.

Step II. Pick an orthonormal basis (bᵢ)₂ⁿ for M₁; then (u₁,b₂,...,bₙ) is a basis for Fⁿ and u₁ ⊥ M₁ with M₁ A-invariant and M₁ = Sp[b₂,b₃,...,bₙ]. By the second matrix representation theorem of Chapter 4, with respect to this new basis, A has the form

[ λ₁  0 ⋯ 0 ]
[ 0         ]
[ ⋮    M    ]

where M ∈ F^{(n−1)×(n−1)} and, as is easily verified, M* = M.

Step III. Repeat Steps I and II successively (n−1) times; the result is an orthonormal basis of eigenvectors (uᵢ)₁ⁿ and the representation of A with respect to that basis is diag(λ₁,λ₂,...,λₙ). Clearly P is the matrix whose ith column is uᵢ ∈ Fⁿ, i ∈ n. Equation (73) follows immediately from the orthonormality of the uᵢ's.
The proofs of (c) and (d) are left as an exercise. ∎

We consider now Fᵐ and Fⁿ as Hilbert spaces and the matrix A ∈ F^{m×n} as a linear continuous map A : Fⁿ → Fᵐ with as adjoint A* the Hermitian transpose of A. Hence the conclusions of Lemma (55) apply with H and A replaced by Fⁿ and A, resp. In particular, AA* ∈ F^{m×m} and A*A ∈ F^{n×n} are Hermitian positive semidefinite matrices. For these we have the following.

75 Exercise [Common rank and nonzero eigenvalues of AA* and A*A]. Let A ∈ F^{m×n} have rank r.
Show that the Hermitian positive semidefinite matrices AA* and A*A obey:

76 a) r = rk AA* = rk A = rk A* = rk A*A;

77 b) AA* and A*A have exactly r identical positive eigenvalues σᵢ² > 0, i ∈ r.

(Hints: a) use the second part of (59) and (61) and (A5.31); b) show that for all σ ≠ 0, AA*u = σ²u with u ≠ θ implies A*A(A*u) = σ²(A*u), and conversely.)
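A numerical check of (76)-(77) (a sketch with made-up real data, so that A* = Aᵀ):

import numpy as np

rng = np.random.default_rng(3)
m, n, r = 5, 3, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank r

ev_AAs = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]   # m eigenvalues, descending
ev_AsA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # n eigenvalues, descending

# a) common rank r; b) the r positive eigenvalues coincide, the rest vanish.
assert np.linalg.matrix_rank(A) == r
assert np.allclose(ev_AAs[:r], ev_AsA[:r])
assert np.allclose(ev_AAs[r:], 0) and np.allclose(ev_AsA[r:], 0)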

79 Comment. The singular value decomposition (SVD) of A ∈ F^{m×n} will display the square roots σᵢ > 0, i ∈ r, of the common positive eigenvalues of AA* and A*A: they are called positive singular values of A. This is done by using appropriate orthonormal bases in the domain Fⁿ and codomain Fᵐ, decomposed according to

80 Fⁿ = R(A*) ⊕ N(A) and Fᵐ = R(A) ⊕ N(A*) (cf. (57)).

The SVD Theorem

81 Theorem [Singular value decomposition]. Let F = ℝ or ℂ. Let A ∈ F^{m×n} be a matrix of rank r. Then there exist matrices U ∈ F^{m×m}, V ∈ F^{n×n}, and Σ₁ ∈ ℝ^{r×r} s.t.

82 a) V = [V₁ ⋮ V₂] ∈ F^{n×n}, V₁ ∈ F^{n×r}, satisfies:

V is unitary, equiv., V*V = Iₙ;
R(V₁) = R(A*);
the columns of V₁ form an orthonormal basis of R(A*);
R(V₂) = N(A);
the columns of V₂ form an orthonormal basis of N(A);
the columns of V form a complete orthonormal basis of eigenvectors of A*A.

83 b) U = [U₁ ⋮ U₂] ∈ F^{m×m}, with U₁ ∈ F^{m×r}, satisfies:

U is unitary, equiv., U*U = Iₘ;
R(U₁) = R(A);
the columns of U₁ form an orthonormal basis of R(A);
R(U₂) = N(A*);
the columns of U₂ form an orthonormal basis for N(A*);
the columns of U form a complete orthonormal basis of eigenvectors of AA*.

c) Let x¹ ∈ R(A*) and y¹ ∈ R(A) be represented by the component vectors ξ¹ ∈ Fʳ and η¹ ∈ Fʳ, resp., according to the following scheme:

x¹ = V₁ξ¹,  y¹ = U₁η¹;

then the bijections induced by the orthogonal decomposition, viz. A|_{R(A*)} : R(A*) → R(A) and A*|_{R(A)} : R(A) → R(A*), have the representations

η¹ = Σ₁ξ¹ and ξ¹ = Σ₁η¹, resp.,

with

87 Σ₁ = diag(σ₁,σ₂,...,σᵣ) s.t. σ₁ ≥ σ₂ ≥ ⋯ ≥ σᵣ > 0,

where the σᵢ, for i ∈ r, are the square roots of the common positive eigenvalues of A*A and AA*; the σᵢ's, for i ∈ r, are called positive singular values of A.

d) A has a dyadic expansion

88 A = U₁Σ₁V₁*, or equivalently, A = Σᵢ₌₁ʳ σᵢuᵢvᵢ*,

where the uᵢ, vᵢ, for i ∈ r, are the columns of U₁ and V₁, resp.

e) A ∈ F^{m×n} has a singular value decomposition (SVD)

89 A = UΣV*,

where

90 Σ = [ Σ₁ 0 ; 0 0 ] ∈ ℝ^{m×n},

i.e. the upper-left r×r block of Σ is Σ₁ and the remaining blocks (the last n−r columns and m−r rows) are zero.

91 Comments. α) If A ∈ ℝ^{m×n} then V ∈ ℝ^{n×n}, U ∈ ℝ^{m×m} are orthogonal matrices (VᵀV = Iₙ, etc.); they define rotations in domain and codomain.
β) The SVD has the following geometric interpretation: from (82), (83) and (89), A[V₁ ⋮ V₂] = [U₁Σ₁ ⋮ 0], where for the active part AV₁ = U₁Σ₁, i.e. Avᵢ = uᵢσᵢ for i ∈ r; thus any linear map rotates (vᵢ ↦ uᵢ, i ∈ r) and scales (uᵢ ↦ uᵢσᵢ, i ∈ r), (the uᵢ's and vᵢ's are columns of U₁ and V₁, resp.).
γ) Appropriate references for the theory are [Nob.1,p.327], [Gol.1,p.16], [Hor.2,p.411]; for programs and routines see [Gar.1].
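The dyadic expansion (88) and the eigenvector properties in (82)-(83) can be verified with any of the routines mentioned in (γ); the sketch below uses numpy's svd on made-up random data:

import numpy as np

rng = np.random.default_rng(4)
m, n = 4, 6
A = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))   # rank 2

U, sig, Vh = np.linalg.svd(A)          # A = U Sigma V*, sig in decreasing order
r = int(np.sum(sig > 1e-10 * sig[0]))  # numerical rank

# Dyadic expansion (88): A = sum_{i=1}^r sig_i u_i v_i*.
A88 = sum(sig[i] * np.outer(U[:, i], Vh[i, :]) for i in range(r))
assert np.allclose(A, A88)

# (82)-(83): columns of U (resp. V) are eigenvectors of AA* (resp. A*A).
assert np.allclose(A @ A.T @ U, U @ np.diag(sig**2))
s2 = np.concatenate([sig**2, np.zeros(n - sig.size)])
assert np.allclose(A.T @ A @ Vh.T, Vh.T @ np.diag(s2))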

Proof of Theorem (81). Since A ∈ F^{m×n} is a continuous linear map A : Fⁿ → Fᵐ, Lemma (55) applies with H and A replaced by Fⁿ and A, resp. Note that the decompositions (57) are now given by (80).

a) A ∈ F^{m×n} has rank r, whence by (76) the Hermitian positive semidefinite matrix A*A has rank r with n nonnegative eigenvalues σᵢ², i ∈ n, ordered as

92 σ₁² ≥ σ₂² ≥ ⋯ ≥ σᵣ² > 0 = σ²ᵣ₊₁ = ⋯ = σₙ²,

to which corresponds a complete orthonormal eigenvector basis (vᵢ)₁ⁿ of A*A (see Fact (72)). This family of Fⁿ-vectors forms the columns of a unitary n×n matrix, say, V. From (92), the first part of (80), R(A*A) = R(A*) and N(A*A) = N(A) (see (61)), the properties of V listed in (82) follow.

b) The m×m matrix U is obtained as follows. Define

93 Σ₁ := diag(σ₁,σ₂,...,σᵣ) ∈ ℝ^{r×r},

where the σᵢ, i ∈ r, are those given in (92). From properties (82), especially that the columns of V₁ form a set of orthonormal eigenvectors associated with the nonzero eigenvalues of A*A, A*AV₁ = V₁Σ₁², whence (AV₁Σ₁⁻¹)*(AV₁Σ₁⁻¹) = Iᵣ. This defines an m×r matrix

94 U₁ := AV₁Σ₁⁻¹ with U₁*U₁ = Iᵣ.

Let uᵢ, i ∈ r, be the columns of U₁; then uᵢ = σᵢ⁻¹Avᵢ, i ∈ r. From the properties above we obtain by calculation:

⟨uᵢ,uₖ⟩ = δᵢₖ  ∀i,k ∈ r,

AA*U₁ = U₁Σ₁², consequently AA*uᵢ = σᵢ²uᵢ ∀i ∈ r.

Since by (77) A*A and AA* have exactly r identical nonzero eigenvalues, it follows that the columns of U₁ form an orthonormal basis for R(AA*) = R(A) (see (59)); thus the properties of U₁ listed in (83) follow. Define now an m×(m−r) matrix U₂ with orthonormal columns s.t. U₂*U₁ = 0. Then

U = [U₁ ⋮ U₂] ∈ F^{m×m}

is a unitary matrix; moreover, Fᵐ = R(U₁) ⊕ R(U₂). Now by the second part of (80), Fᵐ = R(A) ⊕ N(A*), where we know that R(U₁) = R(A); hence R(U₂) = N(A*), with the columns of U₂ forming an orthonormal basis of N(A*). Note also that by the first part of (59), N(A*) = N(AA*); hence R(U₂) = N(AA*). Therefore, since in addition U = [U₁ ⋮ U₂] is unitary with AA*U₁ = U₁Σ₁², the columns of U form a complete orthonormal basis of eigenvectors of AA*. Hence all properties of U listed in (83) are established.

c) The proof of assertion (c) uses the relations

95 AV₁ = U₁Σ₁ and A*U₁ = V₁Σ₁,

which follow from (94); it uses also the fact that the nonzero eigenvalues of A*A in (92)-(93) are those of AA*.
d) The dyadic expansion (88) follows from the first part of (95).
e) The singular value decomposition (89)-(90) follows because

A[V₁ ⋮ V₂] = [U₁Σ₁ ⋮ 0] = [U₁ ⋮ U₂]Σ = UΣ. ∎

96 Definition. Let A ∈ F^{m×n} have rank r. Then the n nonnegative square roots σᵢ of the eigenvalues σᵢ² of A*A are called the singular values of A. When ordered according to

σ₁ ≥ σ₂ ≥ ⋯ ≥ σᵣ > 0 = σᵣ₊₁ = ⋯ = σₙ,

the first r singular values are called the positive singular values of A.

97 Geometric interpretation of the action of A. For simplicity we take A ∈ F^{n×n} to be nonsingular (equiv., det A ≠ 0 or r = m = n). Apply the unitary transformations x = Vξ and y = Uη in the domain and codomain of A; (V and U are given by (82) and (83), resp., giving Fⁿ = R(V) = R(A*) = R(U) = R(A)). Let Sₙ be the unit sphere in Fⁿ, i.e.

Sₙ := { x ∈ Fⁿ : ||x||₂ = 1 }.

Since V is unitary, it follows by (67) that

Sₙ = { x = Vξ ∈ Fⁿ : ||ξ||₂² = Σᵢ₌₁ⁿ |ξᵢ|² = 1 }.

Let A[Sₙ] denote the image of Sₙ under A; more precisely,

A[Sₙ] := { y ∈ Fⁿ : y = Ax, ||x||₂ = 1 }.

Hence, since U is unitary and by the SVD AV = UΣ,

A[Sₙ] = { y = Uη : Σᵢ₌₁ⁿ |ηᵢ/σᵢ|² = 1 }.

Note that columnwise

Avᵢ = σᵢuᵢ  ∀i ∈ n.

Thus A[Sₙ], the image of the unit sphere Sₙ under A, is an ellipsoid whose principal axes are along the uᵢ's and are of length σᵢ, viz. the positive singular values of A. This is shown in Fig. A9 for the case that F = ℝ and n = 2. The action of A consists of, first, a "rotation" (vᵢ ↦ uᵢ), then a scaling (uᵢ ↦ uᵢσᵢ), i.e. the size of the action of A is dictated by its singular values.
Fig. A9. A ∈ ℝ²ˣ² is nonsingular; the unit sphere Sₙ is on the left; its image under A, the ellipsoid A[Sₙ], is on the right.

98 Exercise. Let A ∈ F^{n×n} have a SVD A = UΣV*. Show that

99 A = H₁W₁ = W₂H₂,

where H₁, H₂ ∈ F^{n×n} are Hermitian positive semidefinite and W₁, W₂ are unitary. ((99) is called the polar decomposition of A, in analogy with a+jb = ρe^{jφ} for a complex number.)
(Hint: Consider A = UΣU*UV* = ... .)
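One possible numerical realization of the hint (a sketch; the factors H₁ := UΣU*, W₁ = W₂ := UV*, H₂ := VΣV* are the ones the hint suggests):

import numpy as np

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

U, sig, Vh = np.linalg.svd(A)
W = U @ Vh                              # unitary: W1 = W2 = U V*
H1 = U @ np.diag(sig) @ U.conj().T      # Hermitian positive semidefinite
H2 = Vh.conj().T @ np.diag(sig) @ Vh    # Hermitian positive semidefinite

assert np.allclose(A, H1 @ W)           # A = H1 W1
assert np.allclose(A, W @ H2)           # A = W2 H2
assert np.allclose(W @ W.conj().T, np.eye(n))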

100 Sensitivity analysis of Ax = b. As above, let F = ℝ or ℂ. Let now A ∈ F^{n×n}, det A ≠ 0, A = UΣV*, σ₁ ≥ σ₂ ≥ ⋯ ≥ σₙ > 0.

I. From (97) we have

σ₁ = maxᵢ σᵢ = max { ||Ax||₂ : ||x||₂ = 1 } = ||A||₂

and

σₙ = minᵢ σᵢ = min { ||Ax||₂ : ||x||₂ = 1 },

whence

||A⁻¹||₂ = 1/σₙ,

because, in the present case, A⁻¹ = VΣ⁻¹U*. Thus, using 2-norms, the condition number of A is

κ₂(A) = ||A||₂ ||A⁻¹||₂ = σ₁/σₙ.

Thus, from (A6.89), we see that if σ₁ ≫ σₙ > 0, then for some perturbations δb the resulting ||δx||₂/||x₀||₂ may be very large.

II. The smallest singular value σₙ of A is a measure of how far the nonsingular matrix A is from being singular. First, note that uₙvₙ* is an n×n matrix of rank 1 whose range is Sp[uₙ] and

||uₙvₙ*||₂ = 1.

Furthermore, since

101 A = Σᵢ₌₁ⁿ σᵢuᵢvᵢ*,

with δA defined by

102 δA := −σₙuₙvₙ*,

we see that

det(A+δA) = 0.

Furthermore, that particular δA is the δA with least norm ||δA||₂ such that A+δA is singular, (prove it). In conclusion, σₙ measures how far A is from being singular.

III. The solution x₀ of Ax = b is given by

103 x₀ = A⁻¹b = VΣ⁻¹U*b = Σᵢ₌₁ⁿ σᵢ⁻¹vᵢ(uᵢ*b).

Let now b be perturbed into b+δb; the resulting δx is given by

104 δx = A⁻¹δb = Σᵢ₌₁ⁿ σᵢ⁻¹vᵢ(uᵢ*δb).

It is clear from (104) that a δb of length ε along the vector uₙ will cause the largest change in x: uₙ gives the direction of maximum sensitivity of x to changes in b; and a perturbation ||δb||₂ = ε in that direction causes a change δx in x of length ||δx||₂ = ε/σₙ and in the direction vₙ. In conclusion, the SVD a) shows how close A is to being singular; b) determines the least-norm perturbation in A that makes A singular (see (102)); c) gives a geometric interpretation of the effect on x of a perturbation δb (see (104)); and d) identifies the "worst" δb.
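Conclusions a)-d) are easy to reproduce numerically; a sketch (Python/numpy, random made-up data):

import numpy as np

rng = np.random.default_rng(6)
n = 3
A = rng.standard_normal((n, n))
U, sig, Vh = np.linalg.svd(A)     # sig[0] = sigma_1 >= ... >= sig[-1] = sigma_n

assert np.isclose(np.linalg.norm(A, 2), sig[0])            # sigma_1 = ||A||_2
assert np.isclose(np.linalg.cond(A, 2), sig[0] / sig[-1])  # kappa_2(A)

# b): the least-norm dA of (102) makes A + dA singular.
dA = -sig[-1] * np.outer(U[:, -1], Vh[-1, :])
assert abs(np.linalg.det(A + dA)) < 1e-10

# c)-d): a db of length eps along u_n produces the largest dx, along v_n.
eps = 1e-6
dx = np.linalg.solve(A, eps * U[:, -1])
assert np.isclose(np.linalg.norm(dx), eps / sig[-1])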
APPENDIX B

DIFFERENTIAL EQUATIONS

Since the opening and closing of switches and square waves are common occurrences in engineering, we shall allow discontinuous functions of time in our differential equations. Section B1 discusses the existence and uniqueness of solutions; Section B2, the dependence on initial conditions and parameter perturbations. Finally, in Section B3, we discuss briefly the concept of flow and numerical calculations.
Essential references are [Cod.1], [Hal.1], [Mil.1], [Die.1,Ch.X].

B1. Existence and Uniqueness of Solutions

B1.1. Assumptions

Our basic differential equation (d.e.) under consideration is written as follows:

1 ẋ = p(x,t),

where x(t) ∈ ℝⁿ for t ≥ 0 and p(·,·) : ℝⁿ × ℝ₊ → ℝⁿ. We are given "initial conditions" (t₀,x₀) and require that

2 x(t₀) = x₀.

The function p(·,·) must satisfy two assumptions:

a) Let D be a set in ℝ₊ which contains at most a finite number of points per unit interval. D is the set of possible discontinuity points; it may be empty. Furthermore,

3 for each fixed x ∈ ℝⁿ, the function t ∈ ℝ₊\D ↦ p(x,t) ∈ ℝⁿ is continuous, and for any τ ∈ D the left-hand and right-hand limits p(x,τ−) and p(x,τ+), resp., are finite vectors in ℝⁿ.

b) There is a piecewise continuous function k(·) : ℝ₊ → ℝ₊ s.t.

4 ||p(ξ,t) − p(ξ′,t)|| ≤ k(t) ||ξ − ξ′||  ∀ξ,ξ′ ∈ ℝⁿ, ∀t ∈ ℝ₊.

This is called a global Lipschitz condition because it must hold for all ξ and ξ′ in ℝⁿ.

Comments. a) Let ψ : ℝ₊ → ℝⁿ be a continuous function. Then by adding and subtracting p(ψ(τ),t), by the triangle inequality and (4) we obtain

||p(ψ(t),t) − p(ψ(τ),τ)|| ≤ k(t)||ψ(t) − ψ(τ)|| + ||p(ψ(τ),t) − p(ψ(τ),τ)||,

where k(·), being piecewise continuous, is bounded on any compact interval [t₀,t₁] ⊂ ℝ₊. Therefore, by (3) and the continuity of ψ(·), it follows that for all τ ∈ ℝ₊\D, lim_{t→τ} p(ψ(t),t) = p(ψ(τ),τ). Hence the function t ↦ p(ψ(t),t) is continuous at such τ.
Now if τ ∈ D, then the inequality above still applies with p(ψ(τ),τ) replaced by p(ψ(τ),τ−), respectively p(ψ(τ),τ+), which are well defined by (3). A similar reasoning shows that both one-sided limits exist, i.e. ∀τ ∈ D, with t increasing to τ, lim_{t↑τ} p(ψ(t),t) = p(ψ(τ),τ−), and with t decreasing to τ, lim_{t↓τ} p(ψ(t),t) = p(ψ(τ),τ+). It follows therefore that for any continuous function ψ : ℝ₊ → ℝⁿ the function t ↦ p(ψ(t),t) is piecewise continuous on ℝ₊ with discontinuity points in D.
Therefore, for any such ψ, we can integrate p(ψ(t),t) versus time. Also the function

t ↦ ∫₀ᵗ p(ψ(τ),τ)dτ

is continuous. Furthermore, by the fundamental theorem of calculus, its derivative is equal to p(ψ(t),t) for all t ∈ ℝ₊\D.

b) In many engineering problems, the RHS of the d.e. (1) does not obey a global Lipschitz condition such as (4): more precisely, for inequality (4) to hold, we must constrain the size of ξ and ξ′, e.g. ξ and ξ′ must belong to some ball, say, B(θₙ;r) ⊂ ℝⁿ. If this is the case, the construction below still applies, with the proviso that, at each stage of the iteration, we must check that the iterate xₘ(t) remains within the ball B(θₙ;r) for all times t in the interval under consideration.

c) When (4) does not hold, it may happen that the solution cannot be continued beyond a certain time. For example, the scalar equation ξ̇ = ξ², ξ(0) = 1/c, c ≠ 0, has the solution ξ(t) = 1/(c−t) defined on (−∞,c). As t → c, |ξ(t)| blows up; we say that we have a finite escape time at time c.
By imposing the global Lipschitz condition (4), (i) we can construct the solution on ℝ₊ and (ii) we greatly simplify the description of the iterations without losing the key features of the reasoning.

5 Exercise. Show that, given R > 0, if there is a piecewise continuous k(·) such that

||D₁p(ξ,t)|| ≤ k(t)  ∀ξ ∈ B(θₙ;R), ∀t ∈ ℝ₊

(D₁ denoting the derivative with respect to the first argument), then the inequality (4) holds ∀ξ,ξ′ ∈ B(θₙ;R), ∀t ∈ ℝ₊.

B1.2. Fundamental Theorem

6 Theorem [Existence and uniqueness of the solution of a d.e.]. Consider the d.e. (1) under the initial condition (2). Let p satisfy conditions (3) and (4).
Then i) for each (t₀,x₀) ∈ ℝ₊ × ℝⁿ there exists a continuous function φ : ℝ₊ → ℝⁿ s.t.

7 φ(t₀) = x₀

and

8 φ̇(t) = p(φ(t),t)  ∀t ∈ ℝ₊\D;

ii) this function is unique. The function φ is called the solution through (t₀,x₀) of the d.e. (1). ∎

In other words, given any (t₀,x₀) ∈ ℝ₊ × ℝⁿ, the d.e. ẋ = p(x,t) defines a unique solution φ(t) that is defined for all t in ℝ₊. (This solution is often written as φ(t,t₀,x₀).) The solution is continuous on ℝ₊ and, in addition, φ̇ is continuous at all t ∈ ℝ₊\D, because t ↦ p(φ(t),t) is continuous at such t. Now consider some τ ∈ D; then t ↦ φ(t) is continuous, but the function t ↦ φ̇(t) jumps at τ from φ̇(τ−) = p(φ(τ),τ−) to φ̇(τ+) = p(φ(τ),τ+), if p(φ(τ),τ−) ≠ p(φ(τ),τ+).
The proof of the theorem is in two steps; first, a solution is constructed by iteration, and second, uniqueness is established.

B1.3. Construction of a Solution by Iteration

On the interval ℝ₊ we construct a sequence of continuous functions as follows: for m = 0,1,2,...

9 x_{m+1}(t) := x₀ + ∫_{t₀}ᵗ p(xₘ(τ),τ)dτ  for t ∈ ℝ₊,

with

x₀(t) :≡ x₀.

Let [t₁,t₂] be any closed interval of ℝ₊ containing t₀. We shall show that, on any such [t₁,t₂], the sequence of continuous functions (xₘ(·))₀^∞ is a Cauchy sequence of the Banach space (C([t₁,t₂],ℝⁿ), ℝ, ||·||_∞), where

10 ||f(·)||_∞ = max { ||f(t)|| : t ∈ [t₁,t₂] }

and ||·|| is any norm on ℝⁿ, see Fact (A6.33); (the norm on ℝⁿ is arbitrary by equivalence of norms, see (A6.47)). By completeness of the Banach space, there is a continuous function φ : ℝ₊ → ℝⁿ to which the sequence (xₘ(·))₀^∞ converges in ||·||_∞; thus xₘ(t) → φ(t) uniformly on [t₁,t₂]. This function φ will be shown to be a solution of the d.e. (1).
We start by studying estimates, where ||·|| is any norm on ℝⁿ and we use (A6.20), which is valid for any such norm. To wit: for m = 1,2,... and t ∈ ℝ₊
||x_{m+1}(t) − xₘ(t)|| = || ∫_{t₀}ᵗ [p(xₘ(τ),τ) − p(x_{m−1}(τ),τ)]dτ ||

≤ | ∫_{t₀}ᵗ ||p(xₘ(τ),τ) − p(x_{m−1}(τ),τ)|| dτ |  (by (A6.20))

≤ | ∫_{t₀}ᵗ k(τ)||xₘ(τ) − x_{m−1}(τ)|| dτ |  (by (4)).

Let k̄ be the supremum of k(t) over [t₁,t₂]; then for m = 1,2,... and for all t ∈ [t₁,t₂]

11 ||x_{m+1}(t) − xₘ(t)|| ≤ k̄ | ∫_{t₀}ᵗ ||xₘ(τ) − x_{m−1}(τ)|| dτ |.

From the first step in (9), for all t ∈ [t₁,t₂],

||x₁(t) − x₀|| ≤ | ∫_{t₀}ᵗ ||p(x₀,τ)|| dτ | ≤ ∫_{t₁}^{t₂} ||p(x₀,τ)|| dτ =: M,

where M is known since x₀ is specified. Hence by (11), ∀t ∈ [t₁,t₂], we obtain successively

||x₂(t) − x₁(t)|| ≤ Mk̄|t−t₀|,  ||x₃(t) − x₂(t)|| ≤ M [k̄|t−t₀|]²/2!, ...,
||x_{m+1}(t) − xₘ(t)|| ≤ M [k̄|t−t₀|]ᵐ/m!.

Let us take ||·||_∞ on [t₁,t₂] as in (10) and define T := t₂ − t₁; then

12 ||x_{m+1}(·) − xₘ(·)||_∞ ≤ M [k̄T]ᵐ/m!.

To see that the sequence (xₘ(·))₀^∞ is a Cauchy sequence in C([t₁,t₂],ℝⁿ) we have

13 ||x_{m+p}(·) − xₘ(·)||_∞ ≤ Σ_{k=0}^{p−1} ||x_{m+k+1}(·) − x_{m+k}(·)||_∞  (triangle inequality)

≤ M Σ_{k=0}^{p−1} [k̄T]^{m+k}/(m+k)!  (by (12))

≤ M ([k̄T]ᵐ/m!) Σ_{k=0}^{p−1} [k̄T]ᵏ/k!  (since (m+k)! ≥ m!·k!)

≤ M ([k̄T]ᵐ/m!) e^{k̄T}  (since e^{k̄T} = Σ_{k=0}^∞ [k̄T]ᵏ/k!).

Thus for all m,p = 0,1,2,...

||x_{m+p}(·) − xₘ(·)||_∞ ≤ M ([k̄T]ᵐ/m!) e^{k̄T},

where the RHS is independent of p. Thus, however large p is, as m → ∞, ||x_{m+p}(·) − xₘ(·)||_∞ tends to zero. Hence (xₘ(·))₀^∞ is a Cauchy sequence of the Banach space (C([t₁,t₂],ℝⁿ), ℝ, ||·||_∞), whence (xₘ(·))₀^∞ converges to a continuous function φ on [t₁,t₂] in ||·||_∞.
Recall that [t₁,t₂] is an arbitrary closed interval of ℝ₊ containing t₀, and note that convergence in ||·||_∞ implies pointwise convergence at every t ∈ [t₁,t₂]; indeed, for all such t's, ||xₘ(t) − φ(t)|| ≤ ||xₘ(·) − φ(·)||_∞ → 0. Therefore if φ and ψ are continuous limit functions on [t₁,t₂] and [τ₁,τ₂], resp., then

φ(t) = ψ(t)  ∀t ∈ [t₁,t₂] ∩ [τ₁,τ₂].

To see this, note that for all such t's, as m → ∞, we have

||φ(t) − ψ(t)|| ≤ ||φ(t) − xₘ(t)|| + ||xₘ(t) − ψ(t)|| → 0.

Therefore there is one continuous function φ defined on ℝ₊ to which the sequence (xₘ(·))₀^∞ will converge in ||·||_∞ on any [t₁,t₂] containing t₀.
We now show that φ is a solution of the d.e. (1) on any such interval [t₁,t₂] and hence on ℝ₊. Consider therefore the iteration formula (9), namely

9 x_{m+1}(t) = x₀ + ∫_{t₀}ᵗ p(xₘ(τ),τ)dτ.

We are going to show that

14 ∫_{t₀}ᵗ p(xₘ(τ),τ)dτ → ∫_{t₀}ᵗ p(φ(τ),τ)dτ as m → ∞.

Indeed, from the Lipschitz condition, for t ∈ [t₁,t₂],

|| ∫_{t₀}ᵗ [p(xₘ(τ),τ) − p(φ(τ),τ)]dτ || ≤ | ∫_{t₀}ᵗ ||p(xₘ(τ),τ) − p(φ(τ),τ)|| dτ |

≤ | ∫_{t₀}ᵗ k(τ)||xₘ(τ) − φ(τ)|| dτ | ≤ k̄T ||xₘ(·) − φ(·)||_∞ ≤ k̄T M ([k̄T]ᵐ/m!) e^{k̄T}

(by letting p → ∞ in (13)). Thus (14) follows as m → ∞.
Hence, going over to the limit as m → ∞ in (9), we have

φ(t) = x₀ + ∫_{t₀}ᵗ p(φ(τ),τ)dτ  for all t ∈ [t₁,t₂].

Hence (by the fundamental theorem of calculus), at every point t ∈ [t₁,t₂]\D (where p(φ(t),t) is continuous),

φ̇(t) = p(φ(t),t).

Since the interval [t₁,t₂] is an arbitrary closed interval of ℝ₊ containing t₀, we conclude that the proposed iterative scheme converges to a solution φ defined on ℝ₊.

Remark. We have constructed a solution on ℝ₊. Conceivably a different construction might lead to another solution. In other words, we have to verify that φ is the unique solution.
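The iteration (9) is effective; the sketch below (Python/numpy; the scalar d.e. ẋ = −x + sin t and the trapezoidal quadrature are choices made up for illustration only) exhibits the uniform convergence of the iterates on a compact interval:

import numpy as np

def picard(p, x0, t):
    """Iterates (9): x_{m+1}(t) = x0 + int_{t0}^t p(x_m(s), s) ds on the grid
    t (with t[0] = t0), integrals by the cumulative trapezoidal rule."""
    x = np.full_like(t, x0)                   # x_0(t) := x0
    for _ in range(30):
        f = p(x, t)
        x = x0 + np.concatenate(([0.0],
            np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t))))
    return x

t = np.linspace(0.0, 5.0, 501)
x = picard(lambda x, t: -x + np.sin(t), 1.0, t)

# Exact solution of xdot = -x + sin t, x(0) = 1, for comparison:
exact = 1.5 * np.exp(-t) + 0.5 * (np.sin(t) - np.cos(t))
print(np.max(np.abs(x - exact)))    # small: the iterates converge uniformly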
B1.4. The Bellman-Gronwall Inequality

15 Bellman-Gronwall inequality. Let u(·) and k(·) be real-valued piecewise continuous functions on ℝ₊. Let u(·) and k(·) be ≥ 0 on ℝ₊. Let c₁ be a nonnegative constant and t₀ ∈ ℝ₊. U.t.c. if

16 u(t) ≤ c₁ + | ∫_{t₀}ᵗ k(τ)u(τ)dτ |  ∀t ∈ ℝ₊,

then

17 u(t) ≤ c₁ exp | ∫_{t₀}ᵗ k(τ)dτ |  ∀t ∈ ℝ₊.

Proof. For reasons of symmetry we may restrict ourselves to the case t ≥ t₀. Call U(t) the right-hand side of (16). Hence (16) reads u(t) ≤ U(t) ∀t ≥ t₀. Multiply both sides by the nonnegative function

k(t) exp [ −∫_{t₀}ᵗ k(τ)dτ ].

Since at every point of continuity of k(·)u(·), U̇(t) = k(t)u(t), the resulting inequality reads

(d/dt) { U(t) exp [ −∫_{t₀}ᵗ k(τ)dτ ] } ≤ 0.

Integrating between t₀ and t, noting that U(t₀) = c₁ and that at t = t₀ the exponential is one, we obtain for all t ≥ t₀

U(t) exp [ −∫_{t₀}ᵗ k(τ)dτ ] ≤ c₁,

whence u(t) ≤ U(t) ≤ c₁ exp [ ∫_{t₀}ᵗ k(τ)dτ ], i.e. (17) holds. ∎
18 Exercise. Let u(·), φ(·), k(·) be real-valued piecewise continuous functions on ℝ₊. Let u(·), φ(·) and k(·) be ≥ 0 on ℝ₊. If u(·) satisfies

u(t) ≤ φ(t) + | ∫_{t₀}ᵗ k(τ)u(τ)dτ |  ∀t ∈ ℝ₊,

then

u(t) ≤ φ(t) + | ∫_{t₀}ᵗ φ(τ)k(τ) exp [ | ∫_τᵗ k(σ)dσ | ] dτ |  ∀t ∈ ℝ₊.

B1.5. Uniqueness

Let us return to the d.e. ẋ = p(x,t), where p(·,·) satisfies assumptions a) and b) of B1.1. Suppose there are two solutions φ and ψ satisfying ẋ = p(x,t) and φ(t₀) = ψ(t₀) = x₀. By integrating the d.e., we obtain

φ(t) − ψ(t) = ∫_{t₀}ᵗ [p(φ(τ),τ) − p(ψ(τ),τ)]dτ.

As before, restricting our attention to any closed interval [t₁,t₂] of ℝ₊ containing t₀ and using the Lipschitz condition, we obtain

||φ(t) − ψ(t)|| ≤ k̄ | ∫_{t₀}ᵗ ||φ(τ) − ψ(τ)|| dτ |.

A fortiori, for any c₁ > 0,

||φ(t) − ψ(t)|| ≤ c₁ + k̄ | ∫_{t₀}ᵗ ||φ(τ) − ψ(τ)|| dτ |.

Hence by the Bellman-Gronwall inequality, taking u(t) = ||φ(t) − ψ(t)||, we obtain

||φ(t) − ψ(t)|| ≤ c₁ exp [ k̄|t−t₀| ]  ∀t ∈ [t₁,t₂].

Since this holds for any c₁ > 0, by letting c₁ ↓ 0 we see that ||φ(t) − ψ(t)|| = 0, ∀t ∈ [t₁,t₂]. Hence for each t ∈ [t₁,t₂], φ(t) = ψ(t). Hence the solution is unique on [t₁,t₂]. Since [t₁,t₂] is an arbitrary closed interval of ℝ₊ containing t₀, the domain of uniqueness can be extended to cover all of ℝ₊. Hence φ(t) = ψ(t), ∀t ∈ ℝ₊. ∎

This concludes the proof of the fundamental theorem (6).

B2. Initial Conditions and Parameter Perturbations

Heuristic Introduction

Suppose we have a system described by a d.e. that depends on m real parameters:

1 ẋ(t) = f(x,t,z₀) with x(t₀) = x₀,

where z₀ ∈ ℝᵐ is the list of the nominal values of the parameters. Call t ↦ ψ(t;t₀,x₀,z₀) the solution of (1); we view it as a time-function parametrized by t₀, x₀ and z₀. For convenience, we denote that solution by t ↦ ψ₀(t); we call it the nominal solution.
Suppose now that the parameters change from z₀ to z₀+δz, where ||δz|| is very small; then the perturbed system is described by

2 ẋ(t) = f(x,t,z₀+δz), x(t₀) = x₀.

Call its solution ψ₀(·)+δψ(·). Using (2) and a Taylor expansion we obtain successively

3 ψ̇₀(t) + δψ̇(t) = f(ψ₀+δψ, t, z₀+δz)

= f(ψ₀,t,z₀) + D₁f(ψ₀(t),t,z₀)·δψ(t) + D₃f(ψ₀(t),t,z₀)·δz + h.o.t.,

where, for i = 1,2,3, Dᵢf denotes the derivative of f with respect to its ith argument and "h.o.t." denotes the higher-order terms in δz and δψ. Dropping the h.o.t. and using (1), we have approximately

4 δψ̇ = D₁f(ψ₀(t),t,z₀)·δψ + D₃f(ψ₀(t),t,z₀)·δz.

Using obvious notations, this is of the form

δψ̇ = A(t,z₀)δψ + B(t,z₀)δz,

where the matrices A(t,z₀) and B(t,z₀) are known once the nominal solution ψ₀(t) is known. So with Φ(t,τ) denoting the state transition matrix associated with A(t,z₀), we have

5 δψ(t) ≅ ∫_{t₀}ᵗ Φ(t,τ)B(τ,z₀)dτ · δz,

where we used the "approximately equal" symbol ≅ to remind ourselves that Eq. (5) is an approximation caused by our dropping the h.o.t. in (3).
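A scalar illustration of approximation (5) (a sketch; the d.e. ẋ = −zx with nominal z₀ is made up, and all quantities in (5) are then available in closed form):

import numpy as np

# d.e.: xdot = -z x, x(0) = 1; nominal z0, perturbation dz (made-up example).
z0, dz = 2.0, 1e-3
t = np.linspace(0.0, 3.0, 7)

psi0 = np.exp(-z0 * t)                          # nominal solution
dpsi_true = np.exp(-(z0 + dz) * t) - psi0       # exact perturbation

# In (5): A(t) = -z0, B(t) = -psi0(t), Phi(t,tau) = exp(-z0 (t - tau)), so
# int_0^t Phi(t,tau) B(tau) dtau . dz = -t exp(-z0 t) dz.
dpsi_approx = -t * np.exp(-z0 * t) * dz

print(np.max(np.abs(dpsi_true - dpsi_approx)))  # O(dz^2): only h.o.t. remain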

Perturbation Theorem

The heuristic derivation above gives us a feel for the nature of the result. Now, in order not to clutter the statement of the theorem, we will assume that the Lipschitz conditions hold globally, etc. The theorem below specifies the properties of the solution, ((10) and (11) below), and gives expressions for the derivatives of the solution with respect to t, t₀, x₀ and z₀, (see (12), (15) and (17) below).

6 Theorem. Let B be an open ball in ℝᵐ with center z₀; let f(·,·,·) : ℝⁿ × ℝ₊ × B → ℝⁿ. We consider the system

7 ẋ(t) = f(x(t),t,z₀),

where f(·,·,·) satisfies the following conditions:

a) ∀(x,z) ∈ ℝⁿ × B, t ↦ f(x,t,z) is piecewise continuous, the set D of discontinuity points is independent of (x,z), and D ∩ [a,b] is a finite set for all bounded intervals [a,b];

b) let k be an integer s.t. k ≥ 1. For any T > 0, let D ∩ [0,T] have points

8 τ₀ < τ₁ < ⋯ < τₙ,

where, without loss of generality, τ₀ = 0 and τₙ = T. For all T > 0, f(·,·,·) is of class Cᵏ on ℝⁿ × [τᵢ₋₁,τᵢ] × B for i ∈ n, where at the boundary points τᵢ the function value and the values of the derivatives are defined to be the appropriate one-sided limits;

c) f(·,·,·) is globally Lipschitz in x; more precisely, there is a piecewise continuous function k(·) : ℝ₊ → ℝ₊ such that, ∀ξ,ξ′ ∈ ℝⁿ, ∀t ∈ ℝ₊, ∀z ∈ B,

9 ||f(ξ,t,z) − f(ξ′,t,z)|| ≤ k(t) ||ξ − ξ′||.

U.t.c.

10 i) ∀(x₀,t₀,z₀) ∈ ℝⁿ × ℝ₊ × B, Eq. (7) has a unique continuous solution defined on ℝ₊, say ψ(t,t₀,x₀,z₀);

11 ii) t ↦ ψ(t,t₀,x₀,z₀) is C^{k+1} on ℝ₊\D, with well-defined one-sided limits at any t ∈ D for the function and its derivatives; moreover, ψ(·,·,·,·) is Cᵏ on (ℝ₊\D) × ℝ₊ × ℝⁿ × B, with well-defined values for the function and its derivatives at (t−,t₀,x₀,z₀) and (t+,t₀,x₀,z₀) for (t,t₀,x₀,z₀) ∈ D × ℝ₊ × ℝⁿ × B;

iii) ∀(t,t₀,x₀,z₀) ∈ (ℝ₊\D) × ℝ₊ × ℝⁿ × B,

12 ∂ψ/∂t (t,t₀,x₀,z₀) = f(ψ(t,t₀,x₀,z₀),t,z₀).

In the following, we abbreviate the solution specified in (i) by ψ₀(t). Consider the variational equation; let U : ℝ₊ × B → ℝ^{n×m} : (t,z₀) ↦ U(t,z₀) be the (matrix) solution of

13 U̇(t,z₀) = D₁f(ψ₀(t),t,z₀)·U(t,z₀) + D₃f(ψ₀(t),t,z₀),  U(t₀,z₀) = 0,

where the derivatives of f are both evaluated at (ψ₀(t),t,z₀). Note that the RHS of (13) is of the form A(t,z₀)U(t,z₀) + B(t,z₀), where A(·,z₀) and B(·,z₀) are known functions that are piecewise C^{k−1} with discontinuity points in D; let t ↦ Φ(t,τ;z₀) be the corresponding state transition matrix, namely, the solution of

14 ∂/∂t [Φ(t,τ;z₀)] = A(t,z₀)Φ(t,τ;z₀),  Φ(τ,τ;z₀) = I.

With these notations in mind we have (all partial derivatives below are evaluated at (t,t₀,x₀,z₀)):

15 iv) ∂ψ/∂x₀ = Φ(t,t₀;z₀);

16 v) ∂ψ/∂t₀ = −Φ(t,t₀;z₀)·f(x₀,t₀,z₀);

17 vi) ∂ψ/∂z₀ = U(t,z₀).

18 Comment. Conclusion (ii) follows from the Cᵏ version of the implicit function theorem, [Die.1,Thm 10.2.3], applied to

ψ(t,t₀,x₀,z₀) = x₀ + ∫_{t₀}ᵗ f(ψ(t′,t₀,x₀,z₀),t′,z₀)dt′.

B3. Geometric Interpretation and Numerical Calculations

In this section, for simplicity, we make two assumptions:

1 (x,t) ↦ p(x,t) is continuous on ℝⁿ × ℝ₊;

2 ∀t ≥ 0, p(·,t) is globally Lipschitz.

The Concept of Flow

Consequently, ∀x₀ ∈ ℝⁿ, ∀t₀ ∈ ℝ₊, the solution starting from x₀ at t₀, namely t ↦ φ(t;t₀,x₀), is C¹, and the map x₀ ↦ φ(t;t₀,x₀) from ℝⁿ into ℝⁿ is continuous. Consider any set S₀ ⊂ ℝⁿ of initial conditions at t₀; under the motion ẋ(t) = p(x(t),t), this set becomes, at time t₁, the set

S₁ := { φ(t₁;t₀,x₀) : x₀ ∈ S₀ }.

If starting from S₁ we integrate backward in time, we get, by uniqueness,

S₀ = { φ(t₀;t₁,x₁) : x₁ ∈ S₁ }.

Thus, a continuous bijection maps S₀ onto S₁. We visualize each point x₀ ∈ S₀ as connected to its corresponding point x₁ ∈ S₁ by the C¹-curve t ↦ φ(t;t₀,x₀); in other words, there is a flow connecting S₀ to S₁, (see Fig. B.1).
Since S₁ is the image of S₀ under a continuous map, if S₀ is arcwise connected, then S₁ is also arcwise connected. Note, however, that in the nonlinear case, if S₀ is convex, then S₁ is not necessarily convex.

3 Exercise. Let ẋ = p(x,t) be linear; show that S₀ is convex if and only if S₁ is convex.

Numerical Solutions

Under assumptions (1) and (2), the most obvious way of calculating an approximate solution of the differential equation is to use the forward Euler method. The idea is that

x(t₀+h) ≅ x₀ + h·p(x₀,t₀),

where h is "small" and positive; essentially, we assume that the velocity of the state is constant on [t₀,t₀+h).
Fig. B.1. The solutions generate a flow from, say, S₀ to S₁.

More precisely, the algorithm for computing the solution on the compact interval [t₀,t₀+T] is as follows.

Step 1. Choose an integer m ≫ 1; set hₘ = T/m.

Step 2. Set ξ₀ = x₀ and τ₀ = t₀;
for i = 0,1,2,...,m−1:
ξ_{i+1} = ξᵢ + hₘ p(ξᵢ,τᵢ),
τ_{i+1} = τᵢ + hₘ.

The output is a sequence of m+1 ℝⁿ-vectors (ξᵢ)₀ᵐ; the interpretation is that the calculated solution is the polygonal line joining (τᵢ,ξᵢ) to (τ_{i+1},ξ_{i+1}) for i = 0,1,2,...,m−1.
It can be shown that (1) and (2) guarantee that, as m → ∞,

max { ||ξᵢ − φ(τᵢ;t₀,x₀)|| : i = 0,1,...,m } → 0.

The subject of the numerical solution of ordinary differential equations is a deep one, where considerations of round-off error, truncation error and cost are paramount, (e.g., see [Sto.1]).
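A direct transcription of Steps 1-2 (a sketch in Python/numpy; the harmonic oscillator used as a test d.e. is a made-up example with known solution):

import numpy as np

def forward_euler(p, x0, t0, T, m):
    """Steps 1-2 above: m forward Euler steps of size h_m = T/m on [t0, t0+T]."""
    h = T / m
    t = t0 + h * np.arange(m + 1)
    x = np.empty((m + 1, np.size(x0)))
    x[0] = x0
    for i in range(m):
        x[i + 1] = x[i] + h * p(x[i], t[i])
    return t, x

# Test d.e.: xdot1 = x2, xdot2 = -x1, x(0) = (1,0); exact solution x1 = cos t.
p = lambda x, t: np.array([x[1], -x[0]])
for m in (100, 1000, 10000):
    t, x = forward_euler(p, np.array([1.0, 0.0]), 0.0, 10.0, m)
    print(m, np.max(np.abs(x[:, 0] - np.cos(t))))   # error -> 0 as m -> infinity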
APPENDIX C

LAPLACE TRANSFORMS

This appendix is a concise survey of the Laplace transform; we consider only those properties that we found useful in the material associated with this book. The principal references are [Doe.1], [Doe.2], [But.1], [Gold.1].

C1. Definition of the Laplace Transform

1 Definition. A function f : ℝ₊ → ℝ or f : ℝ₊ → ℂ is said to be locally integrable† iff for all a,b ∈ ℝ₊ with a < b

2 ∫ₐᵇ |f(t)| dt < ∞.

3 Definition of the Laplace transform. Let f : ℝ₊ → ℝ or f : ℝ₊ → ℂ be locally integrable and assume that for some σ ∈ ℝ

4 ∫₀^∞ |f(t)| e^{−σt} dt < ∞.

Call σ_f the infimum of all σ's satisfying inequality (4); σ_f is called the abscissa of absolute convergence of f(·).
For some functions σ_f is finite, e.g. for e^{at}, σ_f = a; for others, σ_f = ∞, e.g. e^{t²}: such functions are not Laplace transformable.
The Laplace transform of f, [denoted by L[f] or f̂(s)], is defined for all s in the open half-plane Re[s] > σ_f by

5 f̂(s) := ∫_{0−}^∞ f(t) e^{−st} dt.

This integral is to be understood to mean lim ∫_{0−}^R f(t)e^{−st} dt as R → ∞. When s is complex, we write s = σ + jω.

6 Analyticity. The integral in (5) is a well-defined finite complex number for all Re[s] > σ_f. Furthermore, since, for each fixed t, the integrand is analytic in s, f̂ is analytic in s for Re[s] > σ_f, [Die.2,Thm 13.8.6].

7 In the RHS of (5), we choose 0− to be the lower limit of the integral; this is because in engineering we often consider generalized functions such as

† All integrals are Lebesgue integrals. In most practical cases it does not matter; the standard Riemann integral suffices.
8 f(t) = f_c(t) + Σᵢ₌₀ⁿ cᵢ δ(t−tᵢ),

where f_c is locally integrable with σ_{f_c} < ∞, δ(·) denotes the Dirac delta function, 0 = t₀ < t₁ < ⋯ < tₙ, and the cᵢ's are in ℝ; then, since we integrate from 0−, we obtain from (5)

9 f̂(s) = f̂_c(s) + Σᵢ₌₀ⁿ cᵢ e^{−stᵢ}.

10 Examples. a) For f(t) = 1(t)e^{at}, f̂(s) = (s−a)⁻¹, σ_f = a.

b) For k = 1,...,n, let πₖ(t) be a polynomial in t of degree mₖ−1 and let λₖ ∈ ℂ; finally, let

L [ Σₖ₌₁ⁿ πₖ(t)e^{λₖt} ] =: f̂(s).

Then f̂(s) is a strictly proper rational function in s with a pole of order mₖ at λₖ for k = 1,...,n. It is easy to see that, in this case, the defining integral (5) defines f̂(s) only for Re[s] > maxₖ Re[λₖ]. It is well known that f̂(s) may be continued analytically so that the rational function f̂(s) is defined for all s in ℂ except at its poles λ₁,...,λₙ.

11 Remark. Example b) leads to a common error. The example suggests that if f̂(s) is analytic for Re[s] > −a for some a > 0, then f(t) → 0 exponentially as t → ∞. This implication is false. If f̂(s) is, in addition, a rational function, then by example b) it is true. The following exercise gives you a counterexample.

12 Exercise. Let f(t) = 1(t)eᵗ sin(eᵗ), i.e. f(t) oscillates with an exponentially increasing amplitude and with an exponentially increasing frequency.
a) Use the defining integral to show that σ_f = 1.
b) Set τ = eᵗ and use successive integration by parts to establish that f̂(s) is analytic in the whole plane! (It has, in fact, an essential singularity at infinity.)
Thus this example shows that f̂ may be analytic in the whole plane and yet have f(t) unbounded as t → ∞.

13 Remark. It is not always possible to extend, by analytic continuation, the domain of definition of f̂(s). For example, let J₁ be the first Bessel function, let (rₖ)₀^∞ be an enumeration of the rational numbers in ℝ, and let f : ℝ₊ → ℝ be defined by

14 f(t) := Σₖ₌₀^∞ 2⁻ᵏ J₁(rₖt).

It can be shown that σ_f = 0 and that

15 f̂(s) = Σₖ₌₀^∞ 2⁻ᵏ L[J₁(rₖ·)](s),

where each term has branch points at s = ±jrₖ. As is expected from σ_f = 0, f̂(s) is analytic in Re[s] > 0, but f̂ has branch points dense on the jω-axis, so that it is impossible to extend the definition of f̂ into the left half-plane.

C2. Properties of Laplace Transforms

1 Definition of L¹. The set of all functions f : ℝ₊ → ℂ such that

∫₀^∞ |f(t)| dt < ∞

forms a linear space. It is convenient to identify two functions f and g whenever

∫₀^∞ |f(t) − g(t)| dt = 0,

i.e. whenever f and g differ on a set† of measure zero (equivalently, whenever f = g almost everywhere). The resulting set of equivalence classes also forms a linear space; furthermore, it has a norm

2 ||f||₁ := ∫₀^∞ |f(t)| dt.

The normed space thus defined is denoted by L¹(ℝ₊), or L¹. It can be shown that L¹ is complete, i.e. L¹ is a Banach space; indeed, any Cauchy sequence of L¹ functions has, as limit, an L¹ function.

3 Properties of L[f] for f ∈ L¹. If f : ℝ₊ → ℝ or f : ℝ₊ → ℂ is in L¹, then, with ℂ₊ := { s ∈ ℂ : Re[s] ≥ 0 }:

(i) f̂(s) is analytic in Re[s] > 0;

(ii)

4 sup_{s∈ℂ₊} |f̂(s)| ≤ ||f||₁;

(iii) [Riemann-Lebesgue lemma], [But.1,p.189]:

5 f̂(jω) → 0 as |ω| → ∞,

or equivalently,

6 |f̂(s)| → 0 as |s| → ∞ in ℂ₊;

(iv) ω ↦ f̂(jω) is uniformly continuous on the jω-axis.

† This set may be empty.

7 Remark. If f is Laplace transformable with abscissa of absolute convergence σ_f < ∞, then, for all σ > σ_f, t ↦ e^{−σt}f(t) is in L¹ and

L[e^{−σt}f] = f̂(σ+s) for s ∈ ℂ₊.

8 Linearity. Let α₁,α₂ ∈ ℂ, f₁ : ℝ₊ → ℂ, f₂ : ℝ₊ → ℂ with σ_{f₁} < ∞ and σ_{f₂} < ∞; then, for Re[s] > max ( σ_{f₁}, σ_{f₂} ),

L[α₁f₁ + α₂f₂] = α₁ f̂₁(s) + α₂ f̂₂(s).

9 Differentiation. Let f : ℝ₊ → ℂ and let ḟ denote its derivative taken in the distribution sense. (In particular, if f(·) has a finite "jump" at t₀ from f(t₀−) to f(t₀+), then ḟ(·) includes the term [f(t₀+) − f(t₀−)]·δ(t−t₀).) If f is Laplace transformable with abscissa of absolute convergence σ_f, then, for Re[s] > σ_f,

L[ḟ] = s f̂(s) − f(0−).

10 Convolution. Let f∗g denote the convolution of f and g; more precisely,

(f∗g)(t) := ∫_{0−}^{t+} f(t−τ)g(τ)dτ;

then, for Re[s] > max ( σ_f, σ_g ),

L[f∗g] = f̂(s)·ĝ(s).

11 Inversion integral. Let f̂(s) be analytic in Re[s] > σ_f. If, in the neighborhood of t, f is of bounded variation, then†, for σ > σ_f,

12 2⁻¹·[f(t+) + f(t−)] = (2πj)⁻¹ ∫_{σ−j∞}^{σ+j∞} f̂(s)e^{st} ds.

† More precisely, if t ↦ e^{−σt}f(t) is in L¹, then the integral in (12) is to be understood as lim_{R→∞} ∫_{σ−jR}^{σ+jR} f̂(s)e^{st} ds.

In applications, it is important to note that the inversion integral gives the average of f(t+) and f(t−): for example, since L[1(t)] = s⁻¹, for t = 0 and σ > 0,

13 2⁻¹ = (2πj)⁻¹ ∫_{σ−j∞}^{σ+j∞} s⁻¹ ds, so f(0) = 2⁻¹.

14 Jordan lemma [Doe.1,Vol.1,p.224]. Let Γ₁ denote the left half-plane semicircle of radius R centered on the origin:

Γ₁ := { s ∈ ℂ : s = Re^{jθ}, θ ∈ [π/2, 3π/2] }.

If, as R → ∞, f̂(s) → 0 uniformly on Γ₁, then, for all t > 0,

lim_{R→∞} ∫_{Γ₁} f̂(s)e^{st} ds = 0.

15 Application. For f̂(s) = s⁻¹, for t > 0 and σ > 0, the evaluation of (12) using the Jordan lemma is done as follows: a) close the vertical integration path from σ−jR to σ+jR by the left half-plane semicircle of radius R and centered at the origin; b) now use (14): as R → ∞, this closed contour integration tends to the integral required by (12); c) note that the integrand is analytic in ℂ except for the pole at s = 0; d) use Cauchy's theorem, note that the residue of the integrand at s = 0 is 1, hence obtain

L⁻¹[s⁻¹] = 1 for all t > 0.

16 Initial value theorem. If, as t decreases to 0, f(t) has a finite limit f(0+), then

17 f(0+) = lim_{s→∞} s f̂(s).

Examples show that the RHS limit may exist but that f(t) tends to no limit as t decreases to 0, [Doe.1,p.476].

18 Final value theorem. If, as t → ∞, f(t) tends to a finite limit f(∞), then

19 f(∞) = lim_{s→0} s f̂(s).

20 Remark. The example f(t) := 1(t)e^{at}cos t, with a > 0, and f̂(s) = (s−a)·[(s−a)²+1]⁻¹ gives lim_{s→0} s f̂(s) = 0, but as t → ∞, f(t) oscillates with exponentially increasing amplitude, i.e. has no limit! Therefore, when applying theorem (18), it is essential to check that f(t) has a finite limit as t → ∞.

21 Exercise. Consider what happens if you apply (19) to f̂(s) = (s−1)⁻¹.

22 Power series expansion. Let f̂(s) ∈ ℝₚ(s); in particular, let

23 f̂(s) = n(s)/d(s),

where the polynomials n and d are coprime. For |s| sufficiently large, we may expand f̂(s) in a power series in s⁻¹: let, for k = 1,...,n, pₖ denote the kth pole of f̂(·); then, for any ρ > maxₖ |pₖ|, this power series converges absolutely and uniformly for all |s| > ρ. (Note that the power series is easily obtained by long division.) Let

24 f̂(s) = b₀ + α₀s⁻¹ + α₁s⁻² + α₂s⁻³ + ⋯ + αₖs^{−(k+1)} + ⋯.

Using the inversion integral with σ ≥ ρ, and integrating term by term, we obtain

25 f(t) = b₀δ(t) + 1(t) [ α₀ + α₁t + α₂ t²/2! + ⋯ + αₖ tᵏ/k! + ⋯ ].

Since for t > 0, f is a sum of polynomials in t times exponentials, the power series in (25) converges absolutely for each t > 0.

26 Theorem. Let f̂(·) be a proper rational function whose power series in s⁻¹ is given by (24); then f(t) is given by the series (25), and vice versa.

27 Exercise. Obtain (24) from (25).
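For the simplest case f̂(s) = (s−a)⁻¹, long division gives b₀ = 0 and αₖ = aᵏ in (24), so (25) predicts f(t) = Σₖ aᵏtᵏ/k! = e^{at}; a numerical sketch of this correspondence (the values of a and t are arbitrary):

import numpy as np
from math import factorial

# f_hat(s) = 1/(s - a): coefficients of (24) are alpha_k = a**k, b0 = 0,
# so the time series (25) is sum_k a**k t**k / k! = exp(a t).
a, K = 0.7, 30
t = np.linspace(0.0, 2.0, 5)
f25 = sum(a**k * t**k / factorial(k) for k in range(K))
print(np.max(np.abs(f25 - np.exp(a * t))))   # ~ machine precision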


APPENDIX D

THE z-TRANSFORM

The z-transform is a very useful tool to study linear time-invariant discrete-time systems and sampled-data systems. In Section D1 the z-transform is defined and illustrated by simple examples. Section D2 states a number of useful properties of the z-transform.
Standard references are [Jur.1], [Zad.1, Appendix B.4].

D1. Definition of the z-Transform

In many engineering applications signals are sampled periodically, say, with sampling period T. Thus the continuous-time signal is replaced by a sequence of real numbers: f₀, f₁, ..., fₙ, .... We denote this sequence by (fₙ)₀^∞.
It is convenient to view the sequence (fₙ)₀^∞ as a function f : n ↦ fₙ mapping ℕ (the nonnegative integers) into ℝ or ℂ.
The z-transform of the sequence f = (fₙ)₀^∞ is given by

1 f̂(z) := f₀ + f₁z⁻¹ + f₂z⁻² + ⋯ + fₙz⁻ⁿ + ⋯,

i.e. f̂(z) is specified as a power series in z⁻¹. In order for the expression (1) to make sense, the power series must have a finite radius of absolute convergence†, i.e. ρ_f < ∞, where ρ_f is the least nonnegative number such that the power series in (1) converges absolutely for all |z| > ρ_f.
The z-transform maps sequences f = (fₙ)₀^∞, for which the series (1) converges absolutely for some finite z, into functions of the complex variable z. We write

2 f̂ = Z[f] = Z[(fₙ)₀^∞].

3 Examples. The purpose of these simple examples is to illustrate the definition (1).

a) For (fₙ)₀^∞ = (1,0,0,...,0,...), f̂(z) = 1 and ρ_f = 0.

† More precisely, ρ_f := inf { ρ ∈ [0,∞] : Σ₀^∞ |fₙ| ρ⁻ⁿ < ∞ }.
b) For (fₙ)₀^∞ = (0,0,...,0,1,0,...), with the 1 in the kth place, f̂(z) = z⁻ᵏ and ρ_f = 0.

c) For (fₙ)₀^∞ = (aⁿ)₀^∞ = (1,a,a²,...,aⁿ,...), f̂(z) = z(z−a)⁻¹ and ρ_f = |a|.

d) For (fₙ)₀^∞ = (naⁿ)₀^∞ = (0, a, 2a², 3a³, ..., naⁿ, ...), f̂(z) = az(z−a)⁻² and ρ_f = |a|.

(Hint: the present sequence is that of (c) operated on by a·(d/da).)

e) For (fₙ)₀^∞ = (1, e, e⁴, e⁹, ..., e^{n²}, ...), ρ_f = ∞ and the sequence has no z-transform.

f) For k = 1,2,...,m, let n ↦ πₖ(n) be a polynomial in n of degree mₖ−1, let λₖ ∈ ℂ; then the sequence whose nth term is

fₙ = Σₖ₌₁ᵐ πₖ(n)λₖⁿ

has a z-transform f̂(z) which is a proper rational function with poles λₖ of order mₖ in the closed disk D(0,ρ_f) := { z ∈ ℂ : |z| ≤ ρ_f }, where ρ_f := maxₖ |λₖ|.
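Examples c) and d) can be checked by summing the series (1) directly at a point |z| > ρ_f (a sketch; the values of a, z and the truncation length are arbitrary):

import numpy as np

# Partial sums of the series (1) at a point |z| > rho_f = |a|.
a, z, N = 0.5, 2.0 + 1.0j, 200
n = np.arange(N)

fc = np.sum(a**n * z**(-n))        # example c): sequence (a^n)
fd = np.sum(n * a**n * z**(-n))    # example d): sequence (n a^n)

assert np.isclose(fc, z / (z - a))
assert np.isclose(fd, a * z / (z - a)**2)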
k

D2. Properties of the z-Transform

1 In the following, we consider exclusively sequences for which the power series (D1.1) converges absolutely for a finite z: f̂(z) = Σ₀^∞ fₙz⁻ⁿ.

The classical theory of power series [Hil.1, Ch. 5], [Con.1, p. 31] yields the following facts:

2 The radius of absolute convergence ρ_f satisfies

ρ_f = lim supₙ |fₙ|^{1/n}.

3 If |z| > ρ_f, then the power series converges absolutely and the sum is an analytic function of z defined for all |z| > ρ_f.

4 For |z| > ρ_f, derivatives of f̂(z) of any order may be obtained by differentiating the series term by term and summing the results.

5 If |z| < ρ_f, then the terms of the series become unbounded as n → ∞; hence the series diverges.

6 For any ρ > ρ_f, the series (1) converges uniformly in |z| ≥ ρ.

The following properties are very useful in applications; they follow easily from the definition (1).

7 Linearity. For all α,β ∈ ℂ and for all sequences (fₙ), (gₙ) with finite radii of convergence ρ_f and ρ_g, resp.,

Z[α(fₙ) + β(gₙ)] = α f̂(z) + β ĝ(z).

This equality is valid for |z| > max { ρ_f, ρ_g }.

8 Advance by k steps. Let k ∈ ℕ be fixed; let (f_{n+k})₀^∞ denote the sequence (fₖ, f_{k+1}, ...), i.e. the sequence advanced by k steps; then

Z[(f_{n+k})₀^∞] = zᵏ [ f̂(z) − Σⱼ₌₀^{k−1} fⱼz⁻ʲ ].

9 Delay by k steps. Let us extend the sequence (fₙ)₀^∞ to the left by setting f₋ₙ = 0 for all integers n > 0. Let k ∈ ℕ; let (f_{n−k})₀^∞ be the sequence (fₙ) delayed by k steps, i.e. the sequence (0,0,...,0,f₀,f₁,...); then

Z[(f_{n−k})₀^∞] = z⁻ᵏ f̂(z).

10 Convolution. By definition, ((fₙ) ∗ (gₙ))(n) = Σₖ₌₀ⁿ f_{n−k}gₖ; then

Z[(fₙ) ∗ (gₙ)] = f̂(z)·ĝ(z).

This equality is valid for |z| > max ( ρ_f, ρ_g ).

11 Application. If (fₙ)₀^∞ = (aⁿ)₀^∞ with a ∈ ℂ, then, for n ∈ ℕ,

gₙ := [(fₙ) ∗ (fₙ) ∗ ⋯ ∗ (fₙ)](n)  (k times) = ((n+k−1)!/((k−1)!·n!)) aⁿ

and

ĝ(z) = (1 − az⁻¹)⁻ᵏ for |z| > |a|.

Note that, concerning gₙ, aⁿ is multiplied by a polynomial in n of degree k−1.

12 Inversion theorem. For all n ∈ ℕ,

fₙ = (2πj)⁻¹ ∮_Γ f̂(z) z^{n−1} dz,

where Γ is any closed rectifiable curve in |z| > ρ_f that encircles the origin once in the counterclockwise sense (equivalently, n(Γ,0) = 1, i.e. the index of Γ with respect to 0 is equal to one, [Con.1,p.81]).

13 Initial value theorem.

f₀ = lim_{z→∞} f̂(z).

14 Comment. (13) shows that all z-transforms tend to a constant as |z| → ∞; f̂(z) is always analytic at infinity. This is in sharp contrast to the Laplace transform, which may have essential singularities at infinity; e.g., f̂(s) = e^{−sT}.

15 Final value theorem. If limₙ→∞ fₙ =: f_∞ exists, then ρ_f ≤ 1 and

f_∞ = lim_{z→1} (z−1) f̂(z),

where z decreases to the limit 1.

REFERENCES

Ack. 1 J. Ackermann, "Der Entwurf Linearer Regelungssysteme im Zustandsraum,"


Regelungstechnik, Vol. 7, pp. 297-300, 1972.
Ait. 1 A.C. Aitken, "Determinants and Matrices," Interscience, New York, 1951.
Alex. 1 V. Alexeev, V. Tikhomirov, and S. Fomine, "Commande Optimale," Mir,
Moscow, 1982, (French).
And. 1 B.D.O. Anderson, "Internal and External Stability of Linear Time-Varying
Systems," SIAM Jour. Control and Optimization, Vol. 20, No.3, pp. 408-
413, 1982.
Ast. 1 K.J. Astrom, P. Hagander, and J. Sternby, "Zeros of Sampled Systems,"
Automatica, Vol. 20, No. I, pp. 31-38, 1984.
Bha. 1 A. Bhaya and C.A. Desoer, "Robust Stability Under Additive Perturbations,"
IEEE Trans. Auto. Control, Vol. AC-30, No. 12, pp. 1233-1234, December
1985.
Bro. 1 R.W. Brockett, "Finite Dimensional Linear Systems," Wiley, New York,
1970.
But. 1 P.L. Butzer and R.J. Nessel, "Fourier Analysis and Approximation," 2 Vols.,
Academic Press, New York, 1971.
Cal. 1 F.M. Callier and C.A. Desoer, "Multivariable Feedback Systems," Springer-
Verlag, New York, 1982.
Cal. 2 F.M. Callier and J.L. Willems, "Criterion for the Convergence of the Solution of the Riccati
Differential Equation," IEEE Trans. Automatic Control, Vol. AC-26, pp.
1232-1242, 1981.
Cal. 3 F.M. Callier and C.A. Desoer, "An Algebra of Transfer Functions for Distri-
buted Linear Time-Invariant Systems," IEEE Trans. Circuits and Systems,
Vol. CAS-25, pp. 651-662, 1978; (correction ibidem, Vol. CAS-26, p. 360,
1979).
Cal. 4 F.M. Callier and C.A. Desoer, "Simplifications and Clarifications on the
Paper 'An Algebra of Transfer Functions for Distributed Linear Time-
Invariant Systems'," IEEE Trans. Circuits and Systems, Vol. CAS-27, pp.
320-323, 1980.
Cal. 5 F.M. Callier and C.A. Desoer, "Stabilization Tracking and Disturbance
Rejection in Multivariable Convolution Systems," Annales de la Societe
Scientifique de Bruxelles, T. 94, pp. 7-51, 1980.
Cal. 6 F.M. Callier and J. Winkin, "Distributed System Transfer Functions of
Exponential Order," International Jour. Control, Vol. 43, pp. 1353-1373,
1986.
Cha. 1 W.S. Chan and Y.T. Wang, "A Basis for the Controllable Canonical Form of
Linear Time-Invariant Multiinput Systems," IEEE Trans. Auto. Control, Vol.
AC-23, pp. 742-745, 1978.

Che. 1 C.T. Chen, "Linear System Theory and Design," Holt, Rinehart and Winston,
New York, 1984.
Che. 2 M.J. Chen and C.A. Desoer, "Necessary and Sufficient Conditions for Robust
Stability of Linear Distributed Feedback Systems," Int. J. Contr., Vol. 35, pp.
255-267, 1982.
Cod. 1 E.A. Coddington and N. Levinson, "Theory of Ordinary Differential Equa-
tions," McGraw-Hill, New York, 1955.
Con. 1 J.B. Conway, "Functions of One Complex Variable," 2nd ed., Springer-
Verlag, New York, 1978.
Cop. 1 W.A. Coppel, "Matrix Quadratic Equations," Bull. Austral. Math. Soc., Vol.
10, pp. 377-401, 1974.
Cur. 1 R.F. Curtain and A.J. Pritchard, "Infinite Dimensional Linear Systems
Theory," Springer-Verlag, Berlin, 1978.
Del. 1 D.F. Delchamps, "State Space and Input-Output Linear Systems," Springer-
Verlag, New York, 1988.
Des. 1 C.A. Desoer, "Notes for a Second Course on Linear Systems," Van Nostrand
Reinhold, New York, 1970, (paperback).
Des. 2 C.A. Desoer and K.K. Wong, "Small-Signal Behavior of Nonlinear Lumped
Networks," Proc. of the IEEE, Vol. 56, pp. 14-22, 1968.
Des. 3 C.A. Desoer and M. Vidyasagar, "Feedback Systems: Input-Output Proper-
ties," Academic Press, New York, 1975.
Des. 4 C.A. Desoer and W.S. Chan, "The Feedback Interconnection of Linear
Time-Invariant Systems," J. Franklin Inst., Vol. 300, pp. 335-351, April
1975.
Des. 5 C.A. Desoer and M.J. Chen, "Design of Multivariable Feedback Systems
with Stable Plants," IEEE Trans. Auto. Control, Vol. AC-26, pp. 408-415,
1981.
Des. 6 C.A. Desoer and C.A. Lin, "A Comparative Study of Linear and Nonlinear
MIMO Feedback Configurations," Int. Jour. Systems Sci., Vol. 16, No.7, pp.
789-813, 1985.
Des. 7 C.A. Desoer and A.N. Gundes, "Algebraic Theory of Linear Time-Invariant
Feedback Systems with Two-Input Two-Output Plant and Compensator," Int.
Jour. of Control, Vol. 47, No. I, pp. 33-53, January 1988.
Die. 1 J. Dieudonné, "Foundations of Modern Analysis," Academic Press, New
York, 1969.
Die. 2 J. Dieudonné, "Treatise on Analysis," Vol. II, Academic Press, New York,
1970.
Doe. 1 G. Doetsch, "Handbuch der Laplace-Transformation," 3 Vols., Birkhäuser,
Basel, (German), 1971.
Doe. 2 G. Doetsch, "Introduction to the Theory and Application of the Laplace
Transformation," Springer-Verlag, Berlin, 1974.

Don. 1 J.J. Dongarra, J.R. Bunch, C.B. Moler, and G.W. Stewart, "LINPACK Users'
Guide," SIAM Publications, Philadelphia, 1978.
Doy. 1 J.C. Doyle and G. Stein, "Multivariable Feedback Design: Concepts for a
Classical/Modern Synthesis," IEEE Trans. Auto. Control, Vol. AC-26, pp.
4-16, February 1981.
Eng. 1 J.C. Engwerda, "Stabilizability and Detectability of Discrete-Time Time-
Varying Systems," IEEE Trans. Auto Control, Vol. AC-35, pp. 425-429,
1990.
Fle. 1 W.H. Fleming and R.W. Rishel, "Deterministic and Stochastic Optimal Con-
trol," Springer-Verlag, New York, 1975.
Fuj. 1 T. Fujii, "A New Approach to the LQ Design from the Viewpoint of the
Inverse Regulator Problem," IEEE Trans. Auto. Contr., Vol. AC-32, No. 11,
pp. 995-1004, November 1987.
Gar. 1 B.S. Garbow, J.M. Boyle, J.J. Dongarra and C.B. Moler, "Matrix Eigensys-
tem Routines: EISPACK Guide Extension," Springer-Verlag, New York,
1972.
Glo. 1 K. Glover, "All Optimal Hankel-Norm Approximations of Linear Multivari-
able Systems and Their Bounds," Int. Jour. Control, Vol. 39, pp.
1115-1193, 1984.
Gol. 1 G. H. Golub and C.F. Van Loan, "Matrix Computations," Johns Hopkins
University Press, Baltimore, 1983.
Gol. 2 G.H. Golub and J.H. Wilkinson, "Ill-Conditioned Eigensystems and the Com-
putation of the Jordan Canonical Form," SIAM Review, Vol. 18, pp. 578-
619, 1976.
Gold. 1 R.R. Goldberg, "Fourier Transforms," Cambridge University Press, Cam-
bridge, U.K., 1961.
Gru. 1 W.A. Gruver and E. Sachs, "Algorithmic Methods in Optimal Control," Pit-
man, Boston, 1981.
Gun. 1 A.N. Gündes and C.A. Desoer, "Algebraic Theory of Linear Feedback Sys-
tems with Full and Decentralized Controllers," Springer-Verlag, New York,
1990.
Hal. 1 J.K. Hale, "Ordinary Differential Equations," Wiley, New York, 1969.
Hal. 2 J.K. Hale, "Functional Differential Equations," Springer-Verlag, New York,
1975.
Hil. 1 E. Hille, "Analytic Function Theory," Vol. I, Blaisdell, Waltham, Mass.,
1959.
Hor. 1 I.M. Horowitz, "Synthesis of Feedback Systems," Academic Press, New
York, 1963.
Hor. 2 R.A. Horn and C.A. Johnson, "Matrix Analysis," Cambridge University
Press, Cambridge, U.K., 1985.

Hsu. 1 C.H. Hsu and C.T. Chen, "A Proof of the Stability of Multi-Variable Feed-
back Systems," Proc. IEEE, Vol. 56, No. 11, pp. 2061-2062, 1968.
In. 1 M.K. Inan, "On the Perturbational Sensitivity of Solutions of Nonlinear
Differential Equations," Memo. No. ERL-M270, Electronics Research
Laboratory, University of California, Berkeley, 1970.
Jac. 1 N. Jacobson, "Basic Algebra I," Freeman, San Francisco, 1974.
Jur. 1 E.I. Jury, "Theory and Application of the z-Transform Method," Wiley, New
York, 1964.
Kal. 1 R.E. Kalman, "Mathematical Description of Linear Dynamical Systems,"
SIAM Jour. Control, Vol. I, pp. 152-192, 1963.
Kag. 1 B. Kagstrom and A. Ruhe, "An Algorithm for Numerical Computation of the
Jordan Normal Form of a Complex Matrix," ACM Trans. Math. Software,
Vol. 6, pp. 398-419, 1980.
Kai. 1 T. Kailath, "Linear Systems," Prentice Hall, Englewood Cliffs, NJ, 1980.
Kat. 1 T. Kato, "A Short Introduction to Perturbation Theory for Linear Operators,"
Springer-Verlag, New York, 1982.
Kha. 1 V.L. Kharitonov, "On a Generalization of a Stability Criterion," Izvestia Aka-
demii Nauk Kazakhskoi SSR, Seria Fiziko-Matematicheskaia, Vol. I, pp.
53-57, 1978 (in Russian).
Kha. 2 V.L. Kharitonov, "Asymptotic Stability of an Equilibrium Position of a Fam-
ily of Systems of Linear Differential Equations," Differential Equations, Vol.
14, pp. 1483-1485, 1979.
Kuc. 1 V. Kucera, "A Contribution to Matrix Quadratic Equations," IEEE Trans. on
Automatic Control, Vol. AC-17, pp. 344-347, 1972.
Kwa. 1 H. Kwakernaak and R. Sivan, "Linear Optimal Control Systems," Wiley,
New York, 1972.
Kwa. 2 H. Kwakernaak and R. Sivan, "Modem Signals and Systems," Prentice Hall,
Englewood Cliffs, 1990.
Lau. 1 A. Laub, "A Schur Method for Solving Algebraic Riccati Equations," IEEE
Trans. on Automatic Control, Vol. AC-24, pp. 913-921, December 1979.
Loo. 2 L.H. Loomis and S Sternberg, "Advanced Calculus," Addision-Wesley, Read-
ing, Mass., 1968.
MacL. 1 S. MacLane and O. Birkhoff, "Algebra," 2nd Ed., McMillan, New York,
1959.
Man. 1 M. Mansour and F. Kraus, "Strong Kharitonov Theorem for Discrete Sys-
terns," Proc. 27th IEEE Conference on Decision and Control, pp. 106-111,
December 1988.
Mil. 1 R.K. Miller and A.N. Michel, "Ordinary Differential Equations," Academic
Press, New York, 1982.
Min. 1 R.J. Minnichelli, J.J. Anagnost, and C.A. Desoer, "An Elementrary Proof of
Kharatinov's Stability Theorem with Extensions," IEEE Trans. on Automatic
496

Control, Vol. AC-34, pp. 995-998, September 1989.


Mol. 1 B.P. Molinari, "The Time-Invariant Linear-Quadratic Optimal Control Problem," Automatica, Vol. 13, pp. 347-357, 1977.
Mor. 1 M. Morari and E. Zafiriou, "Robust Process Control," Prentice Hall, Englewood Cliffs, NJ, 1989.
Net. 1 C.N. Nett, "Algebraic Aspects of Linear Control System Stability," IEEE Trans. Automatic Control, Vol. AC-31, pp. 941-949, 1986.
Nob. 1 B. Noble and J.W. Daniel, "Applied Linear Algebra," Prentice Hall, Englewood Cliffs, NJ, 1977.
Oga. 1 K. Ogata, "Discrete-Time Control Systems," Prentice Hall, Englewood Cliffs, NJ, 1988.
Ort. 1 J.M. Ortega and W.C. Rheinboldt, "Iterative Solution of Nonlinear Equations in Several Variables," Academic Press, New York, 1970.
Roc. 1 R.T. Rockafellar, "Convex Analysis," Princeton Univ. Press, Princeton, NJ, 1970.
Rod. 1 J. Rodriguez-Canabal, "The Geometry of the Riccati Equation," Stochastics, Vol. 1, pp. 129-149, 1973.
Rud. 1 W. Rudin, "Real and Complex Analysis," 2nd Ed., McGraw-Hill, New York, 1974.
Rud. 2 W. Rudin, "Functional Analysis," McGraw-Hill, New York, 1973.
Rud. 3 W. Rudin, "Principles of Mathematical Analysis," 3rd Ed., McGraw-Hill, New York, 1976.
Roy. 1 H.L. Royden, "Real Analysis," Macmillan, New York, 1968.
Sas. 1 S.S. Sastry and C.A. Desoer, "The Robustness of Controllability and Observability of Linear Time-Varying Systems," IEEE Trans. Automatic Control, Vol. AC-27, pp. 933-939, 1982.
Sig. 1 L.E. Sigler, "Algebra," Springer-Verlag, New York, 1976.
Smi. 1 B.T. Smith, J.M. Boyle, Y. Ikebe, V.C. Klema, and C.B. Moler, "Matrix Eigensystem Routines: EISPACK Guide," 2nd Ed., Springer-Verlag, New York, 1976.
Sto. 1 J. Stoer and R. Bulirsch, "Introduction to Numerical Analysis," Springer-Verlag, New York, 1980.
Tay. 1 A.E. Taylor and D.C. Lay, "Introduction to Functional Analysis," Wiley, New York, 1980.
Var. 1 P.V. Varaiya, "Notes on Optimization," Van Nostrand Reinhold, New York, 1972.
Vid. 1 M. Vidyasagar, "Control System Synthesis: A Factorization Approach," MIT Press, Cambridge, MA, 1985.
Vid. 2 M. Vidyasagar, R.K. Bertschmann, and C.S. Sallaberger, "Some Simplifications of the Graphical Nyquist Criterion," IEEE Trans. Automatic Control, Vol. AC-33, No. 3, pp. 301-305, March 1988.
Wil. 1 J.C. Willems, "Least-Squares Stationary Optimal Control and the Algebraic Riccati Equation," IEEE Trans. Automatic Control, Vol. AC-16, pp. 621-634, 1971.
Wol. 1 P. Wolfe, "Finding the Nearest Point in a Polytope," Mathematical Programming, Vol. 11, pp. 128-149, 1976.
Won. 1 W.M. Wonham, "Linear Multivariable Control: A Geometric Approach," 2nd Ed., Springer-Verlag, New York, 1979.
Zad. 1 L.A. Zadeh and C.A. Desoer, "Linear System Theory: The State Space Approach," McGraw-Hill, New York, 1963.
Zam. 1 G. Zames, "Feedback and Optimal Sensitivity: Model Reference Transformations, Multiplicative Seminorms and Approximate Inverses," IEEE Trans. Automatic Control, Vol. AC-26, pp. 301-320, April 1981.
Abbreviations

asy. stable asymptotically stable


char. poly characteristic polynomial
d.e. differential equation
e.g. for example
e.o. elementary operation
e.c.o. elementary column operation
e.r.o. elementary row operation
equ. equation
equiv. equivalently
exp. stable exponentially stable
i.e. that is
p.s.d. positive semi-definite
r.e. recursion equation
resp. respectively
soln. solution
u.t.c. under these conditions
w.l.g. without loss of generality
w.r.t. with respect to
z-i zero-input
z-s zero-state
ARE Algebraic Riccati Equation
I/O input-output
LHS left-hand side
LQ linear-quadratic
MIMO multi-input multi-output
RHS right-hand side
RDE Riccati differential equation
RRE Riccati recursion equation
SISO single-input single-output
SVD singular value decomposition
Mathematical Symbols

Frequently used mathematical symbols are defined briefly in the five listings below.

1. Set theory
2. Sets
3. Algebra and linear spaces
4. Analysis
5. System theory

1. Set Theory

∈   a ∈ A   a is an element of A; a belongs to A.
⊂   A ⊂ B   set A is contained in set B; A is a subset of B.
∪   A ∪ B   union of set A with set B.
∩   A ∩ B   intersection of set A and set B.
⇒   p ⇒ q   p implies q; equivalently, "not q" implies "not p."
⇐   p ⇐ q   q implies p.
⇔   p ⇔ q   p if and only if q; equivalently, p implies q and q implies p.
M̊   interior of the set M.
M̄   closure of the set M (Note: for z ∈ ℂ, z̄ denotes the complex conjugate of z).
Mᶜ   complement of the set M.
:=   A := B   the set A is by definition the set B.
=:   A =: B   the set B is by definition the set A.
\   A\B   the set difference of set A minus set B; equiv. A ∩ Bᶜ.
×   A × B   the Cartesian product of set A times set B.
∅   the empty set.
∃   ∃a ∈ B   there exists an element a of set B.
∃!   ∃!a ∈ B   there exists a unique element a of set B.
∀   ∀a ∈ B   for every element a of set B.
2. Sets

B(0,ρ)   the open ball of the complex plane centered at 0 of radius ρ [B(θ,ρ) denotes an open ball of a linear space].
ℂ   field of complex numbers.
ℂ₊   := { s ∈ ℂ : Re s ≥ 0 }; equiv. the closed right half of the complex plane.
D   := { z ∈ ℂ : |z| < 1 }; equiv. the open unit disk of the complex plane.
   the set of integers {1, 2, ..., k}.
ℕ   set of nonnegative integers, namely {0, 1, 2, ...}.
ℚ   field of rational numbers.
ℝ   field of real numbers.
ℝ₊   := { x ∈ ℝ : x ≥ 0 }; equiv. the set of nonnegative numbers.
U   an undesirable subset of ℂ that is symmetric w.r.t. the real axis and contains ℂ₊.
ℤ   ring of integers; equiv. {..., -2, -1, 0, 1, 2, ...}.
[a,b], (a,b)   a ≤ x ≤ b, resp. a < x < b.
[a,b), (a,b]   a ≤ x < b, resp. a < x ≤ b.

3. Algebra and Linear Spaces

R   a ring (e.g. R = ℤ or ℝ[s]).
K   a commutative ring.
F   a field (e.g. F = ℝ or ℂ).
F[s]   ring of polynomials in one variable with coefficients in the field F (e.g. F[s] = ℝ[s] or ℂ[s]).
∂[p]   the degree of the polynomial p.
F(s)   field of rational functions in one variable with coefficients in the field F (e.g. F(s) = ℝ(s) or ℂ(s)).
ℝp(s), ℝp,o(s)   ring of proper (equiv. bounded at infinity), resp. strictly proper (equiv. zero at infinity), rational functions in s with coefficients in ℝ.
   McMillan degree of a (proper) rational matrix.
   the subring of elements of ℝp(s), resp. ℝp,o(s), that are analytic in ℂ₊ (equiv. with no poles in ℂ₊).
   the subring of elements of ℝp(s) that are analytic in U (U contains ℂ₊).
   family of elements; K := the index set; the family is a map K → X.
A^n   set of n-tuples of elements belonging to the set A (e.g. ℝ^n, ℝ[s]^n, ℝ(s)^n, ...).
e_i   the ith standard unit vector of ℂ^n (i.e. every entry is zero except for the ith, which is one).
A^{p×q}   set of p×q arrays of elements belonging to the set A; equiv. the set of p×q matrices with elements in the set A (e.g. ℝ^{p×q}, ℝ[s]^{p×q}, ℝ(s)^{p×q}, ...).
A ∈ Mat[B]   matrix A has entries in the set B (e.g. H ∈ Mat[ℝp(s)]).
ρ_i   the ith row of a matrix.
γ_j   the jth column of a matrix.
a^T, A^T   the transpose of the vector a, resp. the matrix A.
a*, A*   the complex conjugate transpose of the complex vector a, resp. matrix A.
det(A)   the determinant of the square matrix A.
χ_A   the characteristic polynomial of A ∈ ℂ^{n×n}; equiv. χ_A(s) = det(sI - A).
λ   eigenvalue of the complex matrix A (Ae = λe).
e   right eigenvector of the complex matrix A (Ae = λe).
η   left eigenvector of the complex matrix A (η*A = λη*).
σ(A)   the spectrum of the complex matrix A (equiv. the set of eigenvalues of the matrix A).
   := { x ∈ ℂ^n : (A - λ_k I)^{m_k} x = θ }; equiv. the algebraic eigenspace of A ∈ ℂ^{n×n} at its eigenvalue λ_k.
ρ_A   the spectral radius of the matrix A, i.e. the largest absolute value of its eigenvalues.
R(A)   the range or image of the matrix A (or the linear map A).
rk(A)   the rank of the matrix A.
col rk(A)   the column rank of the matrix A.
row rk(A)   the row rank of the matrix A.
N(A)   the null space of the matrix A (or the linear map A).
nl(A)   the nullity of the matrix A (the dimension of its null space).
σ_max[A], σ̄[A]   the largest singular value of the matrix A.
σ_min[A], σ̲[A]   the smallest singular value of the matrix A.
V   the linear space V (also called vector space).
{θ}   the trivial subspace (contains only the zero element of a linear space).
V⊥   the orthogonal complement of the vector space V.
+   the sum of two vector spaces.
∩   the intersection of two vector spaces.
⊕   the direct sum of two vector spaces.
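
Several of the matrix symbols above have direct numerical counterparts. As an illustration (not part of the original listing; it assumes NumPy is available), the following minimal sketch computes χ_A, σ(A), ρ_A, rk(A), and the extreme singular values of a small example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# chi_A(s) = det(sI - A): np.poly returns [1, -5, 6], i.e. s^2 - 5s + 6
chi_A = np.poly(A)

# sigma(A): the spectrum, i.e. the set of eigenvalues of A
spectrum = np.linalg.eigvals(A)             # array([2., 3.])

# rho_A: spectral radius = largest absolute value of an eigenvalue
rho_A = max(abs(spectrum))                  # 3.0

# rk(A) and the extreme singular values sigma_max[A], sigma_min[A]
rank_A = np.linalg.matrix_rank(A)           # 2
svals = np.linalg.svd(A, compute_uv=False)  # sorted in decreasing order
sigma_max, sigma_min = svals[0], svals[-1]
```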
4. Analysis

f : A → B   f is a function (or a map or an operator) mapping a domain A into a codomain B; also denoted x → f(x), i.e. f is a function that associates with x ∈ A its image f(x) ∈ B.
Dom[f]   the domain of f.
Codom[f]   the codomain of f.
R[f]   the range or image of the map f.
N[f]   the null space of the map f.
|k|   the absolute value of the scalar k.
‖f‖   the norm of f.
‖A‖   the norm of the matrix (or operator) A.
C   the space of continuous functions.
C^k   the space of k times continuously differentiable functions.
PC   the space of piecewise continuous functions.
L^p   for p ∈ [1,∞), the space of functions that are pth-power (absolutely) integrable; for p = ∞, the space of functions that are (essentially) bounded.
(x_k)_{k∈K}   the family of elements x_k with index set K; the family is a map K → X; special case (sequences): (x_k)_{k∈ℕ} is often written (x(k)) or simply (x_k).
l^p   for p ∈ [1,∞), the space of sequences that are pth-power (absolutely) summable; for p = ∞, the space of sequences that are bounded.
Z[f], Z[H]   the set of zeros of the (vector) function f, resp. the matrix function H.
P[f], P[H]   the set of poles of the (vector) function f, resp. the matrix function H.
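
As an aside (not from the original listing), the norm symbols are easy to check numerically; a minimal sketch assuming NumPy, with ‖A‖ taken to be the induced (operator) 2-norm:

```python
import numpy as np

x = np.array([3.0, 4.0])
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])

norm_x = np.linalg.norm(x)          # ||x||: Euclidean norm, here 5.0
norm_A = np.linalg.norm(A, ord=2)   # ||A||: induced 2-norm = sigma_max[A]

# The induced norm is consistent: ||Ax|| <= ||A|| ||x||.
assert np.linalg.norm(A @ x) <= norm_A * norm_x + 1e-12
```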

5. System Theory

T time interval of observation.


(k₀, k₁)   the set of integer times k = k₀, k₀+1, ..., k₁.
(T×T)₊   := { (t, t₀) : t ∈ T, t₀ ∈ T, t ≥ t₀ }.
T_τ f   the function f shifted by τ seconds [(T_τ f)(t) = f(t - τ)].
s(t,t₀,x₀,u)   the state transition map producing the state at time t due to the state x₀ at time t₀ and the input u (denoted by s(k,k₀,x₀,u) in the discrete-time case).
ρ(t,t₀,x₀,u)   the response map producing the output at time t due to the state x₀ at time t₀ and the input u (denoted by ρ(k,k₀,x₀,u) in the discrete-time case).
Σ   the state space (usually Σ = ℝ^n or ℂ^n).
U   the space of input functions (usually piecewise continuous).
Y   the space of output functions (usually piecewise continuous).
   the space of input sequences.
   the space of output sequences.
Φ(t₁,t₀)   the state transition matrix from t₀ to t₁ (denoted by Φ(k₁,k₀) in the discrete-time case).
R(·) = [A(·),B(·),C(·),D(·)]   the (continuous-time) system representation ẋ(t) = A(t)x(t) + B(t)u(t), y(t) = C(t)x(t) + D(t)u(t) (denoted by R = [A,B,C,D] in the time-invariant case).
R_d(·) = [A(·),B(·),C(·),D(·)]   the (discrete-time) system representation x(k+1) = A(k)x(k) + B(k)u(k), y(k) = C(k)x(k) + D(k)u(k) (denoted by R_d = [A,B,C,D] in the time-invariant case).
f ∗ g   the convolution of f with g.
f̂, Ĥ   the Laplace transform of the vector function f, resp. the matrix function H (also denoted by L[f], resp. L[H]).
   the inverse Laplace transform of the vector function f, resp. the matrix function H.
f̃, H̃   the z-transform of the vector function f, resp. the matrix function H (also denoted by Z[f], resp. Z[H]).
   the inverse z-transform of the vector function f, resp. the matrix function H.
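
The discrete-time representation R_d = [A,B,C,D] defined above is easy to animate in code. The sketch below is not from the text (the function name and the example data are chosen for illustration only); it tabulates samples of the state transition map s(k,0,x₀,u) and of the response map ρ(k,0,x₀,u) by iterating the state recursion and read-out equations:

```python
import numpy as np

def simulate(A, B, C, D, x0, u):
    """Iterate x(k+1) = A x(k) + B u(k),  y(k) = C x(k) + D u(k).

    Returns the state trajectory (samples of the state transition map
    s(k, 0, x0, u)) and the output trajectory (samples of the response
    map rho(k, 0, x0, u))."""
    x = np.asarray(x0, dtype=float)
    xs, ys = [x], []
    for uk in u:
        ys.append(C @ x + D @ uk)  # read-out equation
        x = A @ x + B @ uk         # state recursion equation
        xs.append(x)
    return xs, ys

# Example: a discrete-time double integrator driven by a unit step.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
xs, ys = simulate(A, B, C, D, x0=[0.0, 0.0], u=[np.array([1.0])] * 5)
```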
SUBJECT INDEX
Adjoint
  of a linear map, 451 et seq.
Adjoint differential equation, 25
  discrete-time case, 60
Affine maps, 390
Algebraic equivalence of linear systems, 155, 157, 158
Algebraic Riccati equation (ARE), 333
  (see also Riccati recursion equation)
  properties of, 334-339
  properties of N(μ), 339
  properties of Hamiltonian, 341
Argument principle, 372
Asymptotic stability, 181
Balanced representation, 260
  main theorem, 261
Bandwidth-robustness trade-off, 379
  theorem, 380
Basis
  of a linear space, 414
  orthonormal, of a Hilbert space, 459
Cauchy sequence, 437
Cayley-Hamilton theorem, 74
Closed-loop eigenvalues, 362
Computer arithmetic, 446
Continuous linear transformation (map), 441
  adjoint linear map, 452
  characterization theorem, 443
  finite-rank operators, 457
  induced (operator) norm, 442
  properties of an adjoint linear map, 456
  self-adjoint linear map, 454
Control correction example, 44
Controllability
  of a dynamical system, 222
    memoryless feedback, 223
  of the pair (A(·),B(·)), 226
    minimum cost control, 229
    reachability grammian, 226
    reachability map, 226
    reduction theorem, 226
    robust perturbation theorem, 263
    stabilization by linear feedback, 231
    strong uniform controllability, 231
  of the pair (A,B)
    extraction of controllable part, 243
    matrix, 239
    subspace, 243
  discrete time
    of the pair (A,B), 283
    of the pair (A(·),B(·)), 265
Controllability to zero
  grammian, 229
  map, 223
  versus observability, 236
Controllable canonical form, 307-312
Convex set, 12
Convolution, 78
Delay system, 246
Design
  of Σ when G₂ is exp. stable, 366
Detectability of (C,A)
  definition, 257
  detectability properties, 259
  relation to stabilizability, 260
  in terms of observability at infinity, 258
Diagonable matrix, 69, 80
  (see matrix, semisimple)
Differential equations, 469
  Bellman-Gronwall inequality, 475
  existence and uniqueness of solutions, 470
  geometric interpretation and numerical calculations, 480
  initial conditions and parameter perturbations, 477
  Lipschitz condition, 469
Differential system
  linear, 5
  nonlinear, 142
  (see also Appendix B)
Differential system representation, 143, 146
Dimension
  of a linear differential system, 6
  of a linear space, 414
Direction of descent, 50
Directional derivative, 31
Discrete-time
  adjoint equation, 60
  asymptotic stability, 211
    uniform asymptotic stability, 212
  controllability of (A(·),B(·)), 265
    controllability to zero, 268
    controllability to zero grammian, 269
    controllability to zero map, 269
    controllability using controllability to zero, 269
    controllability using reachability, 267
    duality, 275
    minimum cost control, 265
    reachability grammian, 267
    reachability map, 265
    reduction theorem, 266
  controllability and reachability of (A,B), 283
    controllability to zero, 287, 289
    controllability versus controllability to zero, 290
    controllability and reachability properties, 288
    duality, 291
    reachability, 283
  dual system representation, 60
    pairing lemma, 60
  exponential stability, 211
    bounded trajectories and regulation, 214
    Lyapunov method, 216
    periodically varying, 217
    time-invariant, 212
    time-invariant with periodic inputs, 221
  impulse response, 59
    linear time-varying case, 59
    linear time-invariant case, 99
  input-output stability, 205
    linear time-varying case, 207
    linear time-invariant case, 208
  linear quadratic optimization, 61
    finite-horizon problem, 61
      optimal state feedback using the Hamiltonian, 63
      optimal state feedback using the Riccati r.e., 64
      solution using the Hamiltonian, 62
    infinite-horizon problem, 347
      existence of solutions of the algebraic Riccati equation, 348
      solution of the algebraic Riccati equation using the Hamiltonian, 352
      solution by linear state feedback (and exp. stability of the closed loop), 354
    time-invariant LQ problem, 346
  observability of (C(·),A(·)), 271, 272
    characterization, 272
    observability grammian, 273
    observability map, 271
    duality, 275
  observability of (C,A), 281
    duality, 292
    observability properties, 282
    unobservable state, 281
  reconstructibility of (C(·),A(·)), 273, 274
    reconstructibility grammian, 274
    reconstructibility map, 274
  response map: linear time-varying case, 58, 59
    linear time-invariant case, 100, 102
  stability, 204
  stabilizability and detectability, 292
    stabilizability theorem, 293
    detectability theorem, 293
  state transition matrix, 58
    linear time-varying case, 68
    linear time-invariant case, 95
    using the z-transform, 96
  state transition map, 58, 59
    linear time-varying case, 59
    linear time-invariant case, 100, 102
  system representation R_d(·) = [A(·),B(·),C(·),D(·)], 55, 58
  system representation R_d = [A,B,C,D], 95
    with a basis of eigenvectors, 100
Duality, 27, 235 et seq.
  dual system (continuous time), 27
    pairing lemma, 27
  dual system (discrete time), 60
  controllability to zero versus unobservability, 236
    discrete-time case, 276
  reachability and reconstructibility, 238
    discrete-time case, 279
  stabilizability and detectability, 260
  of reversible systems (linear discrete time), 291
Dyadic expansion, 83
  of a matrix, 84
  algebraic eigenspace of a matrix expansion, 120
  of exp At, 86
  of (sI-A)⁻¹, 86
  of solution x(t), 87
Dynamical systems, 143
Dynamical system representations, 145
  causal, 146
  controllable, 222
  nonanticipative, 146
  observable, 224
Eigenvalue, 69
Eigenvector, 69
  generalized, 123
Eigenvalue placement
  by linear state feedback, 317, 324
    multi-input case, 321
    single-input case, 317, 318
  by linear output injection, 324
Ellipsoid A[S], 465
Equivalence, 152
Equivalent states, 152
Equivalent dynamical system representations, 153
Exponential stability (see also Stability, Nyquist criterion, Hurwitz polynomials)
  bounded trajectories and regulation, 191
  by I/O stability, 259-260
  definition, 183
    time-invariant case, 185
  Lyapunov method, 188
  linearization and exp. stability, 200
  periodically varying case, 189
  robustness of, 197, 198
  time-invariant systems with periodic inputs, 194
  T-periodic systems with periodic inputs, 197
Exp. stability and I/O stability, 178, 179, 259, 260, 304-305
Extension of identities, 444
Feedback (see S(P,C), linear state feedback, linear-quadratic optimization)
Feedback system Σ₀, 357
  characteristic polynomial, 361
  discrete-time case, 367
  input-output maps, 358
  state representation, 359
  special cases, 362, 364
Field, 405
Finite-rank operator, fundamental lemma, 457
Function (map, operator), 403
  bijective, bijection, 403
  codomain, 403
  composition of, 404
  domain, 403
  injective, injection, one-one, 403
  inversion of, 405
  range (image), 403
Function of a matrix, 129
  general formula, 131
  properties of, 130
  spectral interpolation conditions, 128
  spectral interpolation polynomial, 128
Fundamental matrix, 11
Generalized eigenvalue, 123
Grammian
  controllability, 227, 229
  controllability to zero, 269
  observability, 234, 243, 273
  reachability, 267
Hamiltonian matrix, 33
  discrete-time, 62
  properties of, 341
Hamiltonian recursion equation, 63
Hamiltonian system, 33
Hankel matrices, 296
  relation with O and C, 297
  theorem on rank, 298
Hermitian matrix (see Matrix, Hermitian)
Hidden modes, 252
  uncontrollable, 252
  unobservable, 252
  characterization theorem, 253
  no unstable hidden modes, 256
    characterization theorem, 256
    main properties, 259
Hilbert space, 448
  adjoint of continuous linear map, 452
  continuity of inner product, 449
  finite-rank operator, 457
  Gram-Schmidt orthogonalization, 450
  orthogonal complement, 450
  orthogonal family, 459
  orthogonal projection, 451
  orthogonality, 450
  Schwarz inequality, 448
Hurwitz polynomial, 383
  characterization, 384
Impulse response, 23, 59, 78, 99
Inner product space, 448
Input, 5, 144
Input-output map, 148
Impulse response matrix
  time-varying case, 23, 59
  time-invariant case, 78, 99
Integral domain, 406
  (see ring)
Jordan form (see matrix)
Jordan lemma, 486
Kalman decomposition theorem, 249
Kharitonov's theorem, 383 et seq.
  for complex polynomials, 385
  special cases, 386-387
L^p space, 440
l^p space, 440
Laplace transform, 482
  abscissa of convergence, 482
  initial value theorem, 486
  inversion of, 485
  final value theorem, 486
  properties of, 484
  relation to z-transform, 160
Linear dynamical systems, 5, 8, 151
  linearity of response map, 8, 152
  linearity of state transition map, 8, 152
Linear equation Ax = b, 417
  solution, 417, 432
  numerical aspects of, 466
Linear map, 415
  (linear operator, linear transformation)
  nullity, 427
  null space (kernel), 416
  range (image), 416
  rank, 427
Linear map X → AX + XB, 138
Linear map X → X - AXB, 138
Linear output injection, 324
  spectral assignability (pole placement), 324
  stabilization by, 326
Linear-quadratic optimization for (A(·),B(·)) (continuous time), 29 et seq.
  concluding remarks, 39
  state feedback solution, 33
    using Hamiltonian, 35
    using Riccati d.e., 37
Linear-quadratic optimization for (A(·),B(·)) (discrete time), 61 et seq.
  state feedback solution, 62
    using Hamiltonian, 63
    using Riccati r.e., 64
Linear-quadratic optimization for (A,B) (continuous time case), 329-344
  time-invariant standard LQ problem on (0,∞), 333
    key theorem, 342
    denormalization, 344
  time-invariant standard LQ problem on [t₀,t₁], 330-331
Linear-quadratic optimization for (A,B) (discrete-time case), 346-353
  finite horizon case, 346
  infinite horizon case, 347
    properties of the RRE, 348
    properties of the optimal solution, 353
Linear space, 410
  basis, 414
  dimension, 414
  direct sums, 451
  linear independence, 413
  product space, 412
  span, 412
  subspace, 412
Linear state estimation, 326
  full-order state estimator, 326
Linear state feedback, 231, 315
  achieves single-input controllability, 314
  limitations of, 245
  separation property, 329
  spectral assignability
    single-input case, 317, 318
    multiple-input case, 321
  stabilization: time-invariant case, 246
  stabilization: time-varying case, 231
Linear system representation, 151-152
  (see also controllability, observability, realization)
  for R = [A(·),B(·),C(·),D(·)], 5 et seq.
  for R = [A,B,C,D], 70 et seq.
Lyapunov equation, 186-189
  condition for exponential stability, 188
  discrete-time case, 216
Markov parameters, 296
McMillan degree, 298
  and minimal realizations, 299
  computation of, 305
  of a pole λ of H(s), 303
Matrix
  action of a matrix, 465
  A-invariant subspace, 103
  algebraic eigenspace, 111
  annihilating polynomial, 107
  Cayley-Hamilton theorem, 74
  change of basis, 423
  characteristic polynomial, 68, 108, 109
  column companion form, 421
  column (row) operations, 421
  complex conjugate symmetry, 69
  condition number, 446
  decomposition into algebraic eigenspaces, 110
  decomposition into semisimple and nilpotent part, 118
  echelon forms (row, column), 430, 433
    applications, 432
  eigenprojection, 84
    simple, 69, 82
    semisimple, 69, 80
  eigenvalue, 69
    algebraic and geometric multiplicity, 111
  eigenvector, 69
  elementary operations (row, column), 429
  elementary matrix, 430
  eigenvalue, eigenvector, 69
  equivalence, 425
  first representation theorem, 420
  function of a matrix, 127
    formula for, 131-132
    computation of, 133
  geometric eigenspace, 111
  generalized eigenvectors, 123
  Hankel matrices, 296-300
  Hermitian matrices, 454
    eigenvectors and eigenvalues, 460
    Hermitian positive definite, 460
  left algebraic eigenspace, 120
  Jordan block, 124
  Jordan chain, 123
  Jordan form, 125
  minimal polynomial, 108, 109
  nilpotent, 107
  nonsingular, 409
  norms, 435
  nullity, 427
  orthogonal, 459
  orthogonality of right and left eigenspaces, 121
  positive singular values, 462
  rank, 427
  reduction to row echelon form, 431
  representation theorem, first, 427
    second, 106
  restricted to an invariant eigenspace, 103
  right and left eigenspaces, 116
  singular value, 464
  singular value decomposition, SVD, 462
  spectral, 69
  spectral mapping theorem, 135
    applications of, 136
  spectral radius, 74
  spectrum of, 69
  stable subspace of, 122
  Sylvester's inequality, 428
  unstable subspace of, 122
  unitary, 459
Minimal realizations, 295
  and McMillan degree, 299
  algebraic equivalence of, 300
  and poles of H(s), 302
  theorem, 300
Minimality and algebraic equivalence, 300
Mode, 252
  (see also hidden modes)
  real eigenvalues, 87
  complex eigenvalue pair, 90
Models, 2
Modules, 415
Nonlinear dynamical systems, 151
Nonlinear perturbation of d.e., 197-199
Normed linear spaces, 434
  Cauchy sequences, 437
  closed subset in, 450
  complete (Banach space), 437
  convergence, 437
  dense subset in, 437
  inner product spaces, 448
  Hilbert spaces, 448
  norms, 435
  open subset in, 450
Numerical considerations
  (see also computer arithmetic)
  backward Euler method, 137
  concerning the matrix spectral mapping theorem, 136
  forward Euler method, 136
  for solving Ax = b, 445
  of the matrix exponential, 92
Nyquist criterion, 368-374
  counting encirclements, 374
  criterion, 368 et seq.
  remarks on, 370-372
  proof of, 372
  discrete-time case, 374
  plot, 369
  theorem, 369
Observability (see also Controllability)
  of a dynamical system, 224
    memoryless feedback and feedforward, 225
  of the pair (C(·),A(·)) (time-varying case), 233
    observability map, 233
    observability grammian, 234
    characterization theorem, 234
    duality: controllability to zero versus observability, 236
    initial state reconstruction, 235
Observability of the pair (C,A) (time-invariant case), 240
  extraction of unobservable part, 240
  observability matrix, 239
  unobservable subspace, 240
Observable canonical form, 312-314
Observable state (see state, unobservable)
Observer, 326
  (see state estimator)
Optimal LQ problem (see linear quadratic optimization)
Optimization example, 48
Output, 5, 144
Plant, 374
Pairing lemma, 27
Parametrization (see Q-parametrization)
Periodically varying differential equations, 51
  Floquet theorem, 52
Periodically varying recursion equations, 66
  Floquet theorem, 67
Perturbation (see Robustness)
  plant perturbation, 375, 393
Piecewise continuous functions, 411
Pole (see also Nyquist criterion)
  of the transfer function, 90
  and minimal realization, 302
  of (sI-A)⁻¹, 132
Polynomials, 68, 405
Polynomial matrices, 409
  unimodular, 409
Q-parametrization (of Σ₀), 366
Rational functions, 68, 405
  proper, 68, 405
  strictly proper, 68, 405
Reachable state, 227
Reachability
  controllability in terms of reachability, 227
  duality: relation to reconstructibility, 238
  reachability map, 226
  reachability grammian, 44, 227
Read-out equation, 6, 144
Realization, 295
  minimal, 295
    and McMillan degree, 299
    algebraic equivalence of, 300
  zero-state equivalence of, 295
Reconstructibility, 238
  duality: relation to reachability, 238
  reconstructibility grammian, 238
  reconstructibility map, 238
  unreconstructible state, 238
Recursion system representation, 143, 146
Regulator property, 377
Response map, 6, 148
  linear time-varying case, 17
  linear time-invariant case, 77, 91
  linear case: decomposition property, 8, 152
Return difference, 362
Riccati differential equation, 37
Riccati recursion equation (RRE), 64, 347
  properties of, 348
  properties of the backwards Hamiltonian, 352
Ring, 405
  entire ring (integral domain), 407
  commutative ring, 407
Robust regulation, 378
  conditions for, 378
Robust stability, 388-393
  under structured perturbations, 388
  theorem, 390
Robustness, 373 et seq., 393, 466, 477
Robustness of S(P,C), 374
  exogenous disturbances, 376
  plant perturbations, 375
  under additive plant perturbations, 393
    theorem, 394
S(P,C), unity feedback system (see also feedback system Σ₀)
  definition, 374
Sampled-data system, 160 et seq.
  A/D converter, 167
  control system with digital controller, 171
  D/A converter, 166
  pulse transfer function, 168
  sampling theorem, 162
  zeros of pulse transfer function, 170
Semisimple (diagonable) matrix, 69, 80, 83
Sensitivity, 466
Separation property, 328
Simple matrix, 69, 82
Solution space
  of the time-invariant z-i state transition map, 86
  discrete-time case, 100
Solving Ax = b, 417
  numerical considerations, 445
  sensitivity analysis of Ax = b, 466
  using row echelon form, 432
Stability (see also Nyquist criterion, Kharitonov's theorem, Hurwitz polynomial, Exponential stability)
  asymptotic stability, 181
  input-output stability, 175
    linear time-varying case, 177
    linear time-invariant case, 178
  of Σ₀, 365
    when H₂(s) is exp. stable, 365, 366
  uniform asymptotic stability, 183
Stabilizability (see also linear state feedback)
  by linear state feedback, 322
  by linear output injection, 326
Stabilizability of (A,B)
  by linear state feedback, 322
  definition, 257
  in terms of controllability to zero at infinity, 257
  relation to detectability, 260
  stabilizability properties, 259
Stabilizability of (A(·),B(·)), 231
Stability conditions (see also Exponential stability, Nyquist criterion, Hurwitz polynomials)
  for the matrix exponential, 132
  for the matrix power sequence, 133
State, 5, 144
  composition property, 22
  composition axiom, 145
  controllable (see dynamical systems)
  controllable to zero, 227
    discrete-time case, 266, 287
  deadbeat, 284
  detectable (see undetectable)
  differential equation, 6, 142
  reachable, 227
    discrete-time case, 266, 283
  reversible (discrete-time case), 284
  stabilizable, 257, 258
  undetectable, 257, 258
  unobservable, 233
    discrete-time case, 271, 281
  unreconstructible, 238
State estimation, 326, 330
  and state feedback, 328
State feedback (see linear state feedback, linear quadratic optimization)
State recursion equation, 58, 143
State space
  decomposition into algebraic eigenspaces, 110
  Kalman decomposition, 247
State transition axiom, 145
State transition map, 6, 144
  linear time-varying case, 17
  linear time-invariant case, 77, 92
State transition matrix, 9, 10
  properties of, 14
  time-invariant case, 70
    using the Laplace transform, 72
Subspaces, 402
  algebraic and geometric eigenspaces, 110
  codimension of, 419
  controllable subspace of (A,B), 243
  deadbeat, 284
  direct sum of, 104, 451
  invariant subspace, 103
  representation of, 418
  reversible subspace, 284
  stable subspace of a matrix, 122
  stabilizable subspace of (A,B), 257
  undetectable subspace of (C,A), 257
  unobservable subspace of (C,A), 240
Superposition law, 9, 152
System representation, 145
  for R = [A(·),B(·),C(·),D(·)], 5
  for R = [A,B,C,D], 70
    with a basis of eigenvectors, 79
    general case, 103
Time-invariance, 150
  for R = [A,B,C,D], 77
Time-invariant dynamical systems, 150
Time-varying dynamical systems, 150
Trade-off
  bandwidth versus robustness, 378
Transfer function of R = [A,B,C,D], 79, 99
  expansion at s = ∞, 295
  Markov parameters, 79, 296
  poles of, 91, 101
  poles and minimality, 302
Transmission zeros, 396
  characterization of, 399
  multi-input multi-output case, 398
  single-input single-output case, 397
  theorem (zero of P = zero of H_yu), 402
Unitary (see Matrix, unitary)
Variational equation, 40
Vector
  definition, 410
  representation, 414
z-transform, 488
  final value theorem, 491
  initial value theorem, 491
  inversion, 491
  properties of, 489
  radius of convergence, 489
  relation to Laplace transform, 160
Zero-input response, 9, 152
Zero-input state transition map, 9
Zero-state equivalent linear systems, 155
Zero-state response, 9, 152
Zero-state state transition map, 9
Zeros of transmission, 397
  multi-input multi-output case, 398
  application to unity feedback systems, 401
  characterization, 399
  definition, 398
  single-input single-output case, 397
